
Pocket Archive remote submission guide

Audience: system administrators, developers

Pocket Archive supports submission of local contents via its command-line interface, and of remote contents that have been uploaded to a local folder.

Pocket Archive does not have an administrative user interface. Instead, it relies on a "hot folder" method to support remote submissions. The pkar_watch utility is run on the machine that hosts Pocket Archive to watch a particular system folder for laundry lists. Any time a file named pkar_submission*.csv is added to that folder, it is processed for submission.
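To make the trigger rule concrete, the sketch below checks file names against the pkar_submission*.csv pattern using Python's standard fnmatch module. It only illustrates the naming convention described above; it is not the matching code used by pkar_watch, whose details (case sensitivity, for example) may differ. The helper name is_laundry_list is hypothetical.

```python
from fnmatch import fnmatchcase

# Naming pattern for laundry lists, as stated in this guide.
LAUNDRY_LIST_PATTERN = "pkar_submission*.csv"


def is_laundry_list(filename: str) -> bool:
    """Return True if a bare file name matches the laundry list pattern.

    Hypothetical helper for illustration only; pkar_watch may match
    file names differently.
    """
    return fnmatchcase(filename, LAUNDRY_LIST_PATTERN)


if __name__ == "__main__":
    # Example file names are made up for the demonstration.
    for name in ("pkar_submission.csv", "pkar_submission_batch1.csv", "images.csv"):
        print(name, is_laundry_list(name))
```

A depositor-side script could use a check like this to avoid accidentally naming a payload file in a way that would trigger processing.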

For this approach to work, the laundry list must be uploaded last, after all the other submission files have been successfully uploaded. Otherwise, processing may start before the files that the laundry list refers to are in place.
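The ordering requirement above can be sketched on the depositor side as follows. This is a minimal illustration, not part of Pocket Archive: upload_submission and its send argument are hypothetical, and send stands for whatever transfer function the client uses (an SFTP put, for instance), assumed to raise an exception on failure.

```python
from typing import Callable, Iterable


def upload_submission(
    payload_paths: Iterable[str],
    laundry_list_path: str,
    send: Callable[[str], None],
) -> None:
    """Upload all payload files first, then the laundry list.

    `send` is a hypothetical callable that transfers one file and raises
    an exception if the transfer fails.
    """
    for path in payload_paths:
        # An exception here propagates to the caller, so the laundry list
        # below is never sent for an incomplete submission.
        send(path)
    # Only reached when every payload file was transferred successfully.
    send(laundry_list_path)


if __name__ == "__main__":
    # Stand-in transfer function that just reports what it would send.
    upload_submission(
        ["img_001.tif", "img_002.tif"],
        "pkar_submission.csv",
        lambda p: print("sending", p),
    )
```

Because the laundry list is sent only after the loop completes, a failed transfer leaves the watched folder without a trigger file, and the upload can simply be retried.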

The watched folder is local to the machine that hosts Pocket Archive. However, the same folder can be exposed through an FTP server, which provides remote access and permission management.

Note: currently, pkar_watch relies on inotify, a Linux-specific file system notification facility. Until a POSIX-compatible solution is implemented, the program can only be run on Linux.

Transfer protocols

Pocket Archive itself does not provide a network service that allows remote users to upload their contents. However, setting up an SFTP, FTPS, WebDAV, or other transfer service that maps directly to a local file system is a standard and straightforward way to expose a deposit endpoint and manage its permissions.

Note on S3

S3 is not a good choice for this setup because it complicates things significantly, at least in the MinIO implementation of S3 that was tested. MinIO remaps the file and folder structure on disk in a way that does not match what is seen on the S3 end, or what was uploaded by the depositor. As a consequence, files uploaded via S3 cannot easily be used straight from the underlying storage (and probably should not be). A separate process to monitor S3 events and a second S3 transfer would be required to fetch the SIP, which is not in line with the minimalistic and low-bandwidth philosophy of Pocket Archive.

Watchdog process

Remote submissions are enabled by running the pkar_watch service in the background:

pkar_watch [options] path

path points to the folder to watch. It must be a local folder or a locally mounted network folder [WIP note: the latter is not yet tested].

Other options include:

    -l, --loglevel <number> (default: 3)
        Log level: 1 = error, 2 = warning, 3 = info, 4 = debug.

    -g, --gen-site
        (Re-)generate the website after each submission.

    -c, --cleanup
        Remove laundry list and SIP after successful submission.

See pkar_watch --help for up-to-date information.
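As an illustration of how the options above combine, the following sketch assembles a pkar_watch command line as an argument list. It uses only the flags documented in this guide and assumes that --loglevel takes its value as a separate argument. The function name build_watch_command and the folder /srv/pkar/incoming are hypothetical.

```python
from typing import List, Optional


def build_watch_command(
    path: str,
    loglevel: Optional[int] = None,
    gen_site: bool = False,
    cleanup: bool = False,
) -> List[str]:
    """Assemble an argument list for pkar_watch (hypothetical helper).

    Only the options listed in this guide are used. Passing the log level
    as a separate argument after --loglevel is an assumption.
    """
    cmd = ["pkar_watch"]
    if loglevel is not None:
        if loglevel not in (1, 2, 3, 4):
            raise ValueError("loglevel must be 1 (error) to 4 (debug)")
        cmd += ["--loglevel", str(loglevel)]
    if gen_site:
        cmd.append("--gen-site")  # (re-)generate the website after each submission
    if cleanup:
        cmd.append("--cleanup")  # remove laundry list and SIP after success
    cmd.append(path)  # the folder to watch goes last
    return cmd


if __name__ == "__main__":
    # /srv/pkar/incoming is a made-up example folder.
    print(" ".join(build_watch_command("/srv/pkar/incoming", loglevel=4, cleanup=True)))
```

The resulting list can be handed to a process launcher such as subprocess.run, or written into a service definition, on a host where pkar_watch is installed.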