Restructure user guide.

scossu 1 day ago
parent
commit
cb8a2cf861

+ 1 - 0
.gitignore

@@ -46,3 +46,4 @@ data/ores/*
 data/dres/*
 out/*
 !.keep
+doc/user_guide/site

+ 8 - 302
README.md

@@ -24,308 +24,14 @@ Pocket Archive fulfills the following functions:
 In spite of its design simplicity, Pocket Archive strives to be highly
 flexible. It is based on [Volksdata
 ](https://git.knowledgetx.com/scossu/volksdata), a very compact Linked Data
-store written in C. There is no restriction to the types and schema of metadata
-allowed, or the file types supported. A file-based configuration allows to set
-up content types and validation rules, or to have (almost) no rules at all.
+store and manipulation library. There is no restriction on the types and schema
+of metadata allowed, or on the file types supported. A file-based configuration
+lets you set up content types and validation rules, or have (almost) no
+rules at all.
 
-## Why
+## Documentation
 
-Several years ago, the author of this project believed that he should work in
-larger and larger institutions, with larger and larger data sets. One day, he
-came across a [project](https://zenodo.org/records/8111569) that changed his
-perspective.
+The full user documentation sources are in the [user
+guide](./doc/user_guide/docs) folder and published at https://pkar-doc.bmll.cc
 
-"From a standpoint of preserving human cultural heritage at large, does it make
-more sense to design very large repositories for very rich institutions, with a
-lot of layers of safety but also a lot of bureaucracy and redundancy, or rather
-contribute to many decentralized projects that are highly efficient, small,
-representing periferal cultures, and most importantly, that are at much higher
-risk of loss than large institutions'"?
-
-The answer was: both. This software has been conceived with the experience of
-large-scale repositories as the background to decide what works and what
-doesn't, what is necessary and what is superfluous, and what catalogers and
-archivists need to do their job.
-
-It is not inconceivable that if many Pocket Archives were to sprout all over
-the place one day, they could be periodically harvested, linked together, and
-presented in one large, central archive (it's Linked Data, after all), without
-any detriment to the indepencence of the individual archives.
-
-## Quickstart
-
-This has been tested on Linux only. It's not guaranteed to work on other
-systems at the moment.
-
-### System prerequisites
-
-- A build environment (at least Git, libc, a C compiler, and Make)
-- UUID library (`uuid/uuid.h` - util-linux or linux-headers in most distros)
-- xxhash development package
-- lmdb development package
-- libvips development package
-- Lua 5.4 development package (lua-dev in some distros)
-- Luarocks 5.4
-
-If using Arch Linux:
-
-```
-pacman -Syu
-pacman -S base-devel util-linux-libs git xxhash lmdb libvips lua luarocks
-```
-
-### Install Volksdata & Pocket Archive
-
-Pocket Archive and Volksdata are still alpha and not in the Luarocks artifact
-repo yet, so the rocks must be installed manually for the time being.
-
-Installing locally or in a dedicated container is strongly recommended at this
-stage.
-
-When Pocket Archive will get into beta status and be published on Luarocks,
-the commands below will be replaced by a one-line command. Until then...
-
-```
-# Note: tested on Archlinux. Other distros (especially Alpine) may need tweaks.
-eval $(luarocks path)
-luarocks install --local debugger  # Not in dependencies file but temporarily required
-git clone --recurse-submodules https://git.knowledgetx.com/scossu/volksdata_lua.git
-cd volksdata_lua
-luarocks build --local
-lua test.lua  # optional
-cd ../
-git clone --recurse-submodules https://git.knowledgetx.com/scossu/pocket_archive.git
-cd pocket_archive
-luarocks build --local
-# Add Luarocks paths to your login script
-luarocks path >> ~/.bashrc
-# Penlight.clonetree is not working properly. A pull request is in progress:
-# https://github.com/lunarmodules/Penlight/pull/496
-# In the meantime, clone the fork and install from the local repo:
-cd ../
-git clone https://github.com/scossu/Penlight.git
-cd Penlight
-git checkout clonetree
-luarocks build --local
-cd ../pocket_archive
-```
-
-### Run demo submission
-
-Initialize the archive first:
-
-```
-pkar init
-```
-
-This, after user confirmation, will create the required folders and database
-file in the archive root (a temporary folder by default). Then:
-
-```
-pkar submission test/sample_submission/pkar_submission.csv
-```
-
-### Generate static site from archive
-
-```
-pkar gen-site
-```
-
-Will generate the static site in `out/http/`. Note that this is static HTML but
-it needs a web server to resolve links (completely server-less version is in
-the works).
-
-If you don't have a configured web server yet, the provided `darkhttpd` will
-work in a pinch:
-
-```
-cd ext/darkhttpd/  # Archlinux: sudo pacman -S darkhttpd
-make
-cd -
-```
-
-Serve the site:
-
-```
-./ext/darkhttpd/darkhttpd out/html
-```
-
-(see more options with `darkhttpd --help`)
-
-Point your browser to `localhost:8080` and enjoy.
-
-## Basic concepts
-
-Until some proper reference is written, this should serve as a high-level
-documentation to help evaluate the functionality and to help me to stay on
-track. Many of these ideas have been ripped right off my day job, so there is
-a good chance they work.
-
-### General philosophy
-
-The functional goals of Pocket Archive are simplicity and flexibility, from
-both a user's and a maintainer's perspectives. These two properties are usually
-seen as conflicting, but within reason, they can coexist.
-
-Pocket Archive is built upon a minimalistic framework: C and Lua, with very few
-dependencies. Similarly to these foundational elements, Pocket Archive strives
-to offer few tools that can be combined in a multitude of ways to achieve many
-goals, rather than many tools each doing a specific thing.
-
-### Resource
-
-The Linked Data adage goes, "everything is a Resource". Without confusing users
-too much by taking the concept to the Linked Data extremes, the term *resource*
-is used in this project to describe individual, self-contained units of
-information such as:
-
-- Digital files;
-- Intellectual or physical artifacts (artworks, documents, books, etc.);
-- Structural elements inside or around an entity, such as the order of pages in
-  a book, the two sides of a postcard, a collection of oher resources, etc.
-
-Files are called *opaque resources*. They are viewed by Pocket Archive as
-"opaque" in that the system doesn't care about their contents. It only ensures
-that files are stored as they were submitted, and keeps checksums to guard
-against data corruption.
-
-All other entities are called *descriptive resources*. These are effectively
-Linked Data, which can be queried and searched for. Each file also has its own
-descriptive resource, so that it can be classified, discovered, and described.
-
-### Submission
-
-A Pocket Archive repository is populated via *submissions*. A submission is
-performed by telling the archive to pick up some files from a folder it can
-access, push them into storage, add metadata to them, and index them so that
-they can be found later.
-
-A submission is directed by a *laundry list*, which is a spreadsheet listing
-all the resources (both opaque and descriptive) to be created, and the metadata
-assigned to them. The laundry list, formatted as a CSV (comma-separated values)
-file, can be edited by several free and open source applications, such as
-LibreOffice or Google Sheets. For repetitive, high-volume submissions,
-templates can be set to facilitate filling in metadata fields. An [example
-submission ](test/sample_submission/postcard-bag/data/), which includes a
-laundry list, is available.
-
-Using spreadsheets is for most users much faster and intuitive than clicking
-around an alien user interface filled with icons and terms that one has never
-seen before.
-
-Detailed instructions on how to write a laundry list are under the
-[submission documentation](doc/submission.md).
-
-### Metadata & content model
-
-Metadata are (yes, it's a *plural* noun) controlled by a *content model*, which
-in this project is intended as the entirety of definitions of content types
-recognized by the system, and how they relate to one another. Each individual
-Pocket Archive installation can use the baseline one provided by default, or
-extend it via additional configuration.
-
-See the [content model configuration manual](doc/content_model.md) for details
-on how to set up a custom content model.
-
-### Site generation
-
-Pocket Archive can generate HTML pages and all the related assets to
-run a complete static website. The advantages of a static website over a
-dynamic one are that it's much simpler and economical to set up and run, and
-it's impervious to malicious attacks.
-
-The entire site must be generated every time resources are created or updated.
-This is usually very fast, but on large archives it can take a while. This is
-the downside of static website: they are static.
-
-## Functionality
-
-### CLI
-
-Pocket Archive can be managed via a command line interface (CLI) when
-installed locally (e.g. via Luarocks).
-
-The `pkar` script contains several useful commands, e.g.
-
-```
-pkar init
-```
-
-Initialize the Pocket Archive store and database at the location indicated by
-the `$PKAR_DRES` and `$PKAR_ORES` environment variables. This operation deletes
-all preexisting data found in those directories.
-
-```
-pkar deposit <path>
-```
-
-Deposit resources in the `<path>` folder. This folder must contain a
-laundry list named `pkar_submission.csv` with file paths relative to that
-folder.
-
-```
-pkar gen-site
-```
-
-Generate static site at `$PKAR_ROOT/out/html`. This includes all HTML pages,
-derivative media, thumbnails, ancillary assets, and RDF representations of all
-resources.  The `html` folder can be pointed to by a static HTTP server for
-local testing, or copied to a remote HTTP server. For local testing, I have
-been using `darkhttpd` which does the job in only 55Kb. *TODO serverless
-deployment option*
-
-```
-pkar gen-rdf <id> [-f, --format <format>]
-```
-
-Generate the RDF representation of one resource. Useful for debugging and
-inspecting.
-
-More detailed information can be obtain with `pkar --help`.
-
-### Remote submissions
-
-Obviously, the CLI only works if one has command line access (e.g. via SSH) to
-the machine hosting Pocket Archive. It is often the case that a contributor has
-neither shell access nor expertise to run the Pocket Archive CLI. Remote
-submissions provide a more user-friendly way to submit contents to any Pocket
-Archive instance on the WWW.
-
-Read the [remote submission guide](doc/remote_submission.md) for more details.
-
-### Environment variables
-
-The following environment variables are available to modify the application
-behavior:
-
-- `PKAR_ROOT`: Root of Pocket Archive data. It defaults to `.`.
-- `PKAR_ORES`: Directory of opaque resources (content files). It defaults to
-  `${PKAR_ROOT}/data/ores`.
-- `PKAR_DRES`: Directory of descriptive resources (metadata). It defaults to
-  `${PKAR_ROOT}/data/dres`.
-- `PKAR_CONFIG_DIR`: configuration directory. This should be a directory
-  containing the `model` directory with the content mode configuration and
-  `app.lua` with general application configuration. It defaults to the `config`
-  directory installed by Luarocks.
-
-## Other documentation
-
-See the [doc](doc) folder for complete documentation (currently under
-construction) in different sections aimed at archivists, system administrators,
-and/or developers.
-
-## Compatibility
-
-Linux only. The submission watchdog relies on `inotify` which is not portable.
-Adopters not using the watchdog or willing to re-implement it may have success
-with other POSIX environments, but these have not been tested.
-
-## Status
-
-**ALPHA**. Pocket Archive is a very recent project, in fast development. Its
-foundational library, Volksdata, has been developed as a spare-time project for
-6 years and it just entered in beta status.
-
-### Road map
-
-See [Road map doc](doc/roadmap.md).
+API documentation is in progress.

+ 0 - 108
doc/roadmap.md

@@ -1,108 +0,0 @@
-# Pocket Archive Road Map
-
-The first goal is to build a working prototype, with all the basic functional
-components, even if not entirely developed or only usable in a specific
-development environment, to demonstrate the overall workflows and
-functionality.
-
-The second step is to produce a basic application, which is fully
-functional and available for use by the intended audience.
-
-Beyond the basic application, new features and bug fixes will be driven by
-usage and opportunities for expanding adoption in relevant areas.
-
-❏ = pending; ⚒ = in progress; ⎊ = blocked; ✓ = complete; ✖︎ = not implemented.
-
-## ✓ Prototype (design -> alpha)
-
-- Configuration + config parser
-  - Application
-  - Content model
-    - Validation rules
-- Submission module
-  - SIP building
-  - Metadata from LL
-  - Brick structures
-  - Structure inference
-- HTML generator
-  - Index
-  - Resource
-  - Static assets
-- Non-HTML generators
-  - RDF (turtle)
-  - Transformers
-  - JS search engine index
-- CLI
-  - Init archive
-  - Deposit
-  - Generate site
-  - Generate LL (single resource)
-  - Generate RDF (single resource)
-- Front end
-  - JS search engine
-  - Add collections to index page
-  - Basic styling
-      - Default type icons
-- QA
-  - ~50 resource data set
-
-## ⚒ Basic application (alpha -> beta -> v1.0)
-
-- ✖︎ Management UI & API
-  - ✖︎ Deposit via single tar or zip file submission
-- ✓ submission
-  - ✓ Watch local folder and trigger submission
-    - ✓ Option to regenerate site after submission
-    - ✓ Option to clean up sources & LL on success
-  - ✓ Submission report
-  - ✓ Deleting resources
-- ✓ Proper collection handling
-  - ✓ Dedicated template
-  - ✓ Link to markdown doc for presentation page
-  - ✓ Handle artifacts as members
-- ✓ Content model
-  - ✓ Generate content model documentation (HTML)
-  - ✓ Content model dump (CLI)
-  - ✓ Local overrides
-- ⚒ Documentation
-  - ✓ Break main sections off README
-  - ✓ Watchdog process guide
-  - ✓ Submission guide
-  - ✓ Content modeling primer (archivist)
-  - ⚒ Content modeling manual (sysadmin)
-  - ✓ Glossary
-  - ❏ Site generation guide
-  - ⚒ API documentation (ldoc)
-- ❏ Testing
-    - ❏ Unit tests (Busted)
-    - ⚒ Roundtrip submission, download LL, update, resubmission
-    - ⚒ >100 resource data set
-- ⚒ Presentation
-  - ⎊ Generate site for one collection only
-  - ✓ Generate LL for submission
-  - ✓ htmlgen option for local file or webserver URL generation
-  - ✖︎ Generate RDF (multi) [addressed by dump archive RDF]
-  - ✓ Mobile-friendly layout
-  - ❏ Enhanced styling and access
-  - ❏ Category browsing
-  - ❏ Improve search indexing
-- ⚒ Preservation
-  - ✓ Dump archive RDF
-  - ❏ Backup full environment (including config)
-  - ❏ Restore whole archive from RDF & data folder
-
-## ❏ Post-release wishlist
-
-(will be turned into separate release plans)
-
-- Multilingual support
-- ✖︎ Deposit via remote hot folder
-  - FTP [Addressed by separate FTP server]
-  - S3 [Not a good choice - See remote deposit guide]
-- Schema definition validator
-- Incremental build
-- Rebuild only site assets
-- Custom templating
-- Auto relatioships inference
-- Markdown support for property values
-

+ 0 - 0
doc/bricks01.graphml → doc/user_guide/docs/assets/bricks01.graphml


+ 0 - 0
doc/bricks01.png → doc/user_guide/docs/assets/bricks01.png


+ 0 - 0
doc/bricks02.graphml → doc/user_guide/docs/assets/bricks02.graphml


+ 0 - 0
doc/bricks02.png → doc/user_guide/docs/assets/bricks02.png


+ 0 - 0
doc/bricks03.graphml → doc/user_guide/docs/assets/bricks03.graphml


+ 0 - 0
doc/bricks03.png → doc/user_guide/docs/assets/bricks03.png


+ 0 - 0
doc/bricks_coll.graphml → doc/user_guide/docs/assets/bricks_coll.graphml


+ 0 - 0
doc/pkar_res_lifecycle.graphml → doc/user_guide/docs/assets/pkar_res_lifecycle.graphml


+ 0 - 0
doc/pkar_res_lifecycle.png → doc/user_guide/docs/assets/pkar_res_lifecycle.png


+ 0 - 0
doc/pkar_screenshot_classification.png → doc/user_guide/docs/assets/pkar_screenshot_classification.png


+ 3 - 0
doc/user_guide/docs/config.md

@@ -0,0 +1,3 @@
+# Configuration
+
+TODO

+ 9 - 10
doc/content_model_primer.md → doc/user_guide/docs/content_model_intro.md

@@ -1,9 +1,7 @@
-# Content modeling primer
+# Content modeling introduction
 
 *Audience: Archivists, system administrators*
 
-**WORK IN PROGRESS**
-
 Terms showing in **bold** are referenced in the [glossary](./glossary.md).
 
 This document is a general-purpose introduction to content modeling concepts in
@@ -41,7 +39,7 @@ specific categories.
 
 ![Screenshot of a portion of a Pocket Archive presentation page, with the type
 hierarchy visible in the "Classification"
-section.](./pkar_screenshot_classification.png)
+section.](./assets/pkar_screenshot_classification.png)
 
 This hierarchy is visible in the **presentation** page of a **resource**, e.g.
 an **Artifact** of type "Still Image". The page has a "Classification" section
@@ -131,7 +129,7 @@ reason, bricks are used to provide ordered stand-ins.
 Bricks can be used for this purpose in a variety of ways. A very simple use
 case is illustrated below:
 
-![Ordering of pages in a book using Bricks.](bricks01.png)
+![Ordering of pages in a book using Bricks.](./assets/bricks01.png)
 
 In this example, a book resource (an Artifact) contains some ordered pages
 represented by Files. Note that there is a direct relationship between the
@@ -144,7 +142,7 @@ This example is contrived, as we could have just as easily pointed the `first`
 relationship from the book to the first file, and the `next` one from one
 file to the next. But, what if we have multiple files per page:
 
-![Ordering of multi-file pages in a book using Bricks.](bricks02.png)
+![Ordering of multi-file pages in a book using Bricks.](./assets/bricks02.png)
 
 In this case, closer to a real archival scenario, we have an **archival
 master** file, a **production master** file, and a transcript text file for
@@ -158,7 +156,7 @@ A more complex example, less common because it is more laborious to set up,
 but entirely possible, involves an Artifact resource that has multiple
 structures, such a full book and its excerpt:
 
-![Complex ordering example using multiple structures.](bricks03.png)
+![Complex ordering example using multiple structures.](./assets/bricks03.png)
 
 In this case, in addition to the Page resources of the previous example, we
 have an additional layer of bricks only to keep ordering. The pages keep their
@@ -171,9 +169,10 @@ This use case is seldom used for books, especially in large collections where
 setting up multiple orderings for individual artifacts is not practical, but it
 may be very useful in building collections by "borrowing" already submitted
 resources that belong to another collection. The `coll2` row in the [example
-laundry list](../test/sample_submission/pkar_submission-demo.csv) does exactly
-that within a couple of lines. Pocket Archive takes care of creating the
-appropriate bricks.
+laundry
+list](https://git.knowledgetx.com/scossu/pocket_archive/src/master/test/sample_submission/pkar_submission-demo.csv)
+does exactly that within a couple of lines. Pocket Archive takes care of
+creating the appropriate bricks.
 
 ## Assigning content types
 

+ 1 - 1
doc/content_model_manual.md → doc/user_guide/docs/content_model_manual.md

@@ -3,7 +3,7 @@
 *Audience: system administrators, developers*
 
 For a generic introduction to content modeling in Pocket Archive, see
-the [content modeling primer](./content_model_primer.md)
+the [content modeling introduction](./content_model_intro.md).
 
 ## Core schema and predefined content types
 

+ 4 - 1
doc/glossary.md → doc/user_guide/docs/glossary.md

@@ -218,7 +218,8 @@ its static site generation process.
 A **metadata** element that can be attributed to a **resource**. Properties are
 more or less strictly defined in the **content model** by the archive
 administrator and they may have a data type, a cardinality, and a range. See
-the [content model primer](./content_model_primer.md) for more information.
+the [content model introduction](./content_model_intro.md) for more
+information.
 
 ## \*Production master
 
@@ -249,6 +250,8 @@ existence, Pocket Archive guarantees the consistency of relationship links.
 
 ## Resource
 
+A self-standing unit of digital data that can be identified with a **URI**.
+
 In **RDF** parlance, "everything is a resource", which means, every unit of
 information can be represented by a globally unique document on the Web.
 

+ 78 - 0
doc/user_guide/docs/index.md

@@ -0,0 +1,78 @@
+# Pocket Archive
+
+## The idea
+
+Stick it in your pocket and carry it around. Install it on a cloud server.
+Install it on a Raspberry Pi. Browse it offline. Browse it online. Duplicate
+it, share it, harvest it and aggregate it. Feed it non-GMO spreadsheets
+regularly and it will thrive.
+
+## A more sensical description
+
+Pocket Archive is a digital archival system and static site generator for
+small- to medium-(?) sized archives. It is designed to function in environments
+with unreliable connectivity and requires very low technical and human
+resources to set up, run, and use.
+
+Pocket Archive fulfills the following functions:
+
+- Storage and management of files and metadata-only resources
+- Management of descriptive, administrative, and technical metadata
+- Dynamic relationships between resources
+- Static site generation (discovery interface)
+
+In spite of its design simplicity, Pocket Archive strives to be highly
+flexible. It is based on [Volksdata
+](https://git.knowledgetx.com/scossu/volksdata), a very compact Linked Data
+store written in C. There is no restriction on the types and schema of metadata
+allowed, or on the file types supported. A file-based configuration lets you
+set up content types and validation rules, or have (almost) no rules at all.
+
+## Why
+
+Several years ago, the author of this project believed that he should work in
+larger and larger institutions, with larger and larger data sets. One day, he
+came across a [project](https://zenodo.org/records/8111569) that changed his
+perspective.
+
+"From a standpoint of preserving human cultural heritage at large, does it make
+more sense to design very large repositories for very rich institutions, with a
+lot of layers of safety but also a lot of bureaucracy and redundancy, or rather
+contribute to many decentralized projects that are highly efficient, small,
+representing peripheral cultures, and most importantly, that are at much higher
+risk of loss than large institutions?"
+
+The answer was: both. This software has been conceived with the experience of
+large-scale repositories as the background to decide what works and what
+doesn't, what is necessary and what is superfluous, and what catalogers and
+archivists need to do their job.
+
+It is not inconceivable that if many Pocket Archives were to sprout all over
+the place one day, they could be periodically harvested, linked together, and
+presented in one large, central archive (it's Linked Data, after all), without
+any detriment to the independence of the individual archives.
+
+## Further documentation
+
+Please explore the [installation guide](./install.md) for setup instructions
+or, if you have an instance already set up for you, the [Content model
+introduction](content_model_intro.md) to learn about Pocket Archive's content
+modeling system and the [submission guide](./submission.md) to get started with
+archiving contents. The [glossary](./glossary.md) is a useful reference to some
+specialized terms used in Pocket Archive and in the digital preservation field.
+
+## Compatibility
+
+Linux only at the moment. The submission watchdog relies on `inotify`, which is
+not portable. Adopters not using the watchdog or willing to re-implement it
+may have success with other POSIX environments, but these have not been tested.
+
+## Status
+
+**ALPHA**. Pocket Archive is a very recent project, in fast development. Its
+foundational library, Volksdata, has been developed as a spare-time project for
+6 years and has just entered beta status.
+
+### Road map
+
+See [Road map doc](./roadmap.md).

+ 128 - 0
doc/user_guide/docs/install.md

@@ -0,0 +1,128 @@
+# Installation and setup
+
+This has been tested on Linux only. It's not guaranteed to work on other
+systems at the moment.
+
+## System prerequisites
+
+- A build environment (at least Git, libc, a C compiler, and Make)
+- UUID library (`uuid/uuid.h` - util-linux or linux-headers in most distros)
+- xxhash development package
+- lmdb development package
+- libvips development package
+- Lua 5.4 development package (lua-dev in some distros)
+- Luarocks 5.4
+
+If using Arch Linux:
+
+```bash
+pacman -Syu
+pacman -S base-devel util-linux-libs git xxhash lmdb libvips lua luarocks
+```
+
+## Install Volksdata & Pocket Archive
+
+Pocket Archive and Volksdata are still alpha and not in the Luarocks artifact
+repo yet, so the rocks must be installed manually for the time being.
+
+Installing locally or in a dedicated container is strongly recommended at this
+stage.
+
+When Pocket Archive reaches beta status and is published on Luarocks,
+the commands below will be replaced by a one-line command. Until then...
+
+```bash
+# Note: tested on Archlinux. Other distros (especially Alpine) may need tweaks.
+eval $(luarocks path)
+luarocks install --local debugger  # Not in dependencies file but temporarily required
+git clone --recurse-submodules https://git.knowledgetx.com/scossu/volksdata_lua.git
+cd volksdata_lua
+luarocks build --local
+lua test.lua  # optional
+cd ../
+git clone --recurse-submodules https://git.knowledgetx.com/scossu/pocket_archive.git
+cd pocket_archive
+luarocks build --local
+# Add Luarocks paths to your login script
+luarocks path >> ~/.bashrc
+# Penlight.clonetree is not working properly. A pull request is in progress:
+# https://github.com/lunarmodules/Penlight/pull/496
+# In the meantime, clone the fork and install from the local repo:
+cd ../
+git clone https://github.com/scossu/Penlight.git
+cd Penlight
+git checkout clonetree
+luarocks build --local
+cd ../pocket_archive
+```
+
+## Run demo submission
+
+Initialize the archive first:
+
+```bash
+pkar init
+```
+
+This, after user confirmation, will create the required folders and database
+file in the archive root (a temporary folder by default). Then:
+
+```bash
+pkar submission test/sample_submission/pkar_submission.csv
+```
+
+## Generate static site from archive
+
+```bash
+pkar gen-site
+```
+
+Will generate the static site in `out/html/`. These files can be viewed on a
+local machine with only a browser (point to `index.html`), packaged and sent to
+someone else to browse on their own machine, or served remotely with a very
+simple static HTTP server. `darkhttpd` is provided here for convenience:
+
+```bash
+cd ext/darkhttpd/
+make
+cd -
+```
+
+Serve the site:
+
+```bash
+./ext/darkhttpd/darkhttpd out/html
+```
+
+(see more options with `darkhttpd --help`)
+
+Point your browser to `localhost:8080` and enjoy.
+
+## Setup
+
+### Configuration
+
+The configuration directory provided by the Luarocks package and in the Git
+repo (`config` folder) should be copied to a user-defined location and pointed
+to when running Pocket Archive (see "Environment variables" below). The
+configuration can thus be modified without being overwritten by a Pocket
+Archive update.
+
+The [Configuration guide](./config.md) describes the configuration details.
+
+### Environment variables
+
+The following environment variables are available to modify the application
+behavior:
+
+- `PKAR_ROOT`: Root of Pocket Archive data. It defaults to `.`.
+- `PKAR_ORES`: Directory of opaque resources (content files). It defaults to
+  `${PKAR_ROOT}/data/ores`.
+- `PKAR_DRES`: Directory of descriptive resources (metadata). It defaults to
+  `${PKAR_ROOT}/data/dres`.
+- `PKAR_CONFIG_DIR`: configuration directory. This should be a directory
+  containing the `model` directory with the content model configuration and
+  `app.lua` with general application configuration. It defaults to the `config`
+  directory installed by Luarocks.
+
+
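
The environment variables listed at the end of `install.md` could be set along these lines before running `pkar`. This is only a sketch: it reproduces the documented defaults with standard shell parameter expansion and does not call Pocket Archive itself. The `./my_pkar_config` path is a hypothetical location for a private copy of the configuration directory, not something shipped with the project.

```bash
#!/usr/bin/env bash
# Sketch only: mirrors the documented defaults; it does not invoke pkar.

# Root of Pocket Archive data; the documented default is the current directory.
PKAR_ROOT="${PKAR_ROOT:-.}"

# Opaque resources (content files) and descriptive resources (metadata).
# Both fall back to sub-folders of PKAR_ROOT, as stated in the guide.
PKAR_ORES="${PKAR_ORES:-${PKAR_ROOT}/data/ores}"
PKAR_DRES="${PKAR_DRES:-${PKAR_ROOT}/data/dres}"
export PKAR_ROOT PKAR_ORES PKAR_DRES

# Hypothetical private copy of the `config` folder. PKAR_CONFIG_DIR is only
# exported if that copy exists; otherwise the installed default applies.
my_config="./my_pkar_config"
if [ -d "${my_config}" ]; then
    export PKAR_CONFIG_DIR="${my_config}"
fi

# Show the effective locations for a quick sanity check.
printf 'PKAR_ROOT=%s\nPKAR_ORES=%s\nPKAR_DRES=%s\n' \
    "${PKAR_ROOT}" "${PKAR_ORES}" "${PKAR_DRES}"
```

Setting only `PKAR_ROOT` and letting the other two derive from it keeps an archive relocatable as a single folder; overriding `PKAR_ORES` or `PKAR_DRES` individually is useful mainly when content files and metadata live on different volumes.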

+ 0 - 0
doc/remote_submission.md → doc/user_guide/docs/remote_submission.md


+ 113 - 0
doc/user_guide/docs/roadmap.md

@@ -0,0 +1,113 @@
+# Pocket Archive Road Map
+
+The first goal is to build a working prototype, with all the basic functional
+components, even if not entirely developed or only usable in a specific
+development environment, to demonstrate the overall workflows and
+functionality.
+
+The second step is to produce a basic application, which is fully
+functional and available for use by the intended audience.
+
+Beyond the basic application, new features and bug fixes will be driven by
+usage and opportunities for expanding adoption in relevant areas.
+
+❏ = pending  
+⚒ = in progress  
+⎊ = blocked or on hold  
+✓ = complete  
+✖︎ = not implemented  
+
+## ✓ Prototype (design -> alpha)
+
+- Configuration + config parser
+    - Application
+    - Content model
+        - Validation rules
+- Submission module
+    - SIP building
+    - Metadata from LL
+    - Brick structures
+    - Structure inference
+- HTML generator
+    - Index
+    - Resource
+    - Static assets
+- Non-HTML generators
+    - RDF (turtle)
+    - Transformers
+    - JS search engine index
+- CLI
+    - Init archive
+    - Deposit
+    - Generate site
+    - Generate LL (single resource)
+    - Generate RDF (single resource)
+- Front end
+    - JS search engine
+    - Add collections to index page
+    - Basic styling
+        - Default type icons
+- QA
+    - ~50 resource data set
+
+## ⚒ Basic application (alpha -> beta -> v1.0)
+
+- ✖︎ Management UI & API
+    - ✖︎ Deposit via single tar or zip file submission
+- ✓ submission
+    - ✓ Watch local folder and trigger submission
+        - ✓ Option to regenerate site after submission
+        - ✓ Option to clean up sources & LL on success
+    - ✓ Submission report
+    - ✓ Deleting resources
+- ✓ Proper collection handling
+    - ✓ Dedicated template
+    - ✓ Link to markdown doc for presentation page
+    - ✓ Handle artifacts as members
+- ✓ Content model
+    - ✓ Generate content model documentation (HTML)
+    - ✓ Content model dump (CLI)
+    - ✓ Local overrides
+- ⚒ Documentation
+    - ✓ Break main sections off README
+    - ✓ Watchdog process guide
+    - ✓ Submission guide
+    - ✓ Content modeling introduction (archivist)
+    - ⚒ Content model setup manual (sysadmin)
+    - ✓ Glossary
+    - ❏ Site generation guide
+    - ⚒ Migrate doc platform (Mkdocs?) & publish separately
+    - ⚒ API documentation (ldoc)
+- ⚒ Testing
+    - ❏ Unit tests (Busted)
+    - ⚒ Roundtrip submission, download LL, update, resubmission
+    - ⚒ >100 resource data set
+- ⚒ Presentation
+    - ⎊ Generate site for one collection only
+    - ✓ Generate LL for submission
+    - ✓ htmlgen option for local file or webserver URL generation
+    - ✖︎ Generate RDF (multi) [addressed by dump archive RDF]
+    - ✓ Mobile-friendly layout
+    - ❏ Enhanced styling and access
+    - ❏ Category browsing
+    - ❏ Improve search indexing
+- ⚒ Preservation
+    - ✓ Dump archive RDF
+    - ❏ Backup full environment (including config)
+    - ❏ Restore whole archive from RDF & data folder
+
+## ❏ Post-release wishlist
+
+(will be turned into separate release plans)
+
+- Multilingual support
+- ✖︎ Deposit via remote hot folder
+    - FTP [Addressed by separate FTP server]
+    - S3 [Not a good choice - See remote deposit guide]
+- Schema definition validator
+- Incremental build
+- Rebuild only site assets
+- Custom templating
+- Auto relationship inference
+- Markdown support for property values
+

+ 40 - 31
doc/submission.md → doc/user_guide/docs/submission.md

@@ -1,9 +1,7 @@
-# Pocket Archive submission guide
+# Submission guide
 
 *Audience: archivists, system administrators, developers*
 
-**WORK IN PROGRESS**
-
 Terms appearing in **bold** are referenced in the [glossary](./glossary.md).
 
 ## Archival process overview
@@ -15,12 +13,12 @@ submission may include multiple resources, which can be related but do not
 necessarily have to.
 
 ![The full cycle of operations for a given resource in Pocket
-Archive.](./pkar_res_lifecycle.png)
+Archive.](./assets/pkar_res_lifecycle.png)
 
-1. Archivist selects and lays out resources to be archived in his or her own
-   workstation.
+1. Archivist selects and lays out digital **resources** to be archived on his
+   or her own workstation.
 2. Archivist creates a **laundry list** that includes an inventory of the
-   resources and their metadata. This, together with the files and folders
+   resources and their **metadata**. This, together with the files and folders
    previously prepared, constitutes the **SIP**.
 3. Archivist transfers the SIP to the **Drop box**: first the files and
    folders, then the laundry list.
@@ -30,10 +28,10 @@ Archive.](./pkar_res_lifecycle.png)
    of whether it was successful or failed).
 6. Depending on setup, Pocket Archive may delete the SIP from the Drop box if
    the submission succeeded.
-7. Depending on setup, Pocket Archive may (re-)generate the static site.
+7. Depending on setup, Pocket Archive may (re-)generate the **static site**.
 8. If the archivist wants to update the archived resources, they can either
-   request a full copy of the SIP, or to only update metadata, only the laundry
-   list, and re-submit it.
+   request a full copy of the SIP (or, to update metadata only, just the
+   laundry list), edit it and/or replace files, and re-submit the new SIP.
 9. The archivist can remove a resource and, optionally, all its members at any
    time.
 
@@ -52,9 +50,13 @@ curator-defined folder hierarchy, and metadata, the latter gathered in a single
 file called a laundry list; and sending them both to Pocket Archive for
 processing.
 
-A [working SIP example](../test/sample_submission) including files and a
-laundry list, used for testing, is available as a quick reference. Other
-examples are illustrated further down in this document.
+A [working SIP
+example](https://git.knowledgetx.com/scossu/pocket_archive/src/master/test/sample_submission)
+including files and a laundry list, used for testing, is available as a quick
+reference (note: the CSV file is currently displayed as a raw file. To view it
+as a spreadsheet, download it and open it with LibreOffice or another
+spreadsheet editor). Other examples are illustrated further down in this
+document.
 
 As the above life cycle chart shows, the SIP is a disposable artifact. Once it
 is successfully archived, it can be deleted. The full SIP can be regenerated by
@@ -137,7 +139,7 @@ exporting each sheet individually as a CSV.
 #### Laundry list format
 
 The first row of a laundry list is reserved for the header, which indicates the
-field names.  These can be in any order, but following a specific order is
+**field** names.  These can be in any order, but following a specific order is
 recommended. The order used in this document and in all laundry lists
 automatically generated by Pocket Archive is: `content_type`, `id`,
 `source_path`, and then all ordinary fields in alphabetical order.
@@ -146,9 +148,9 @@ Each subsequent row represents a resource (except in a multi-value case,
 described below). The `content_type` field is mandatory for each resource.
 
 The `source_path` field is only mandatory for files. All other fields are
-optional for the submission, however, some type definitions may have
+optional for the submission; however, some **schema** definitions may have
 constraints in this regard and may be at least strongly recommended. This
-depends on the content model used.
+depends on the **content model** used.
 
 #### Fields with a special meaning
 
@@ -175,6 +177,11 @@ depends on the content model used.
   are also deleted, along with their own members, recursively. See the
   "Deleting resources" section below.
 
+Note: when a field is defined as "mandatory" above, this is intended
+per-resource.  If the resource spans multiple rows, as when it has multi-valued
+fields, a mandatory field is only required to have a value on the first row of
+the resource.
+
 Example of a table representing an artifact with two files:
 
 <table>
@@ -272,11 +279,12 @@ and/or `source_path` is considered an error.
 
 The ordering of rows in a laundry list determines the ordering of the resources
 in their container. The system automatically assigns an order to the resources,
-using their source path and their position in the laundry list.  Resources at
-the top are not assigned an order, as they are considered self-standing. If an
-order is needed for those, the `pas:next` **property** can be set to the
-desired resource (see point below about relationships), or they can be put in
-an enclosing folder that acts as a collection.
+using their source path and their position in the laundry list. Resources at
+the top level, i.e. directly under the SIP folder, are not assigned an order, as
+they are considered self-standing. If an order is needed for those, the
+`pas:next` **property** can be set to the desired resource (see point below
+about relationships), or they can be put in an enclosing folder that acts as a
+collection.
 
 **Relationships** can be established between resources. These are stored as
 persistent links and appear as hyperlinks in the discovery interface. A
@@ -350,11 +358,11 @@ re-submission.
 
 This chapter is a very concise introduction to content modeling in Pocket
 Archive, which is treated in detail in the [Content modeling
-primer](./content_model_primer.md). It is strongly recommended to read that
-guide before archiving resources in earnest.
+introduction](./content_model_intro.md). It is strongly recommended to read
+that guide before archiving resources in earnest.
 
-The three main resource types found in a submission are: Artifact, File, and
-Brick. See the [content modeling primer](./content_model_primer.md) for more
+The three main resource types found in a submission are: **Artifact**,
+**File**, and **Brick**. See the Content modeling introduction for more
 information about these.
 
 These three key content types are seldom used as-is. They usually have
@@ -368,10 +376,10 @@ that object, but also the capture of a `text` artifact if it is the scan of a
 book page.
 
 See the provided [sample laundry
-list](../test/sample_submission/pkar_submission-demo.csv) for examples of
-artifacts, files, and bricks making up a two-sided postcard. (Note: you may
-need to download the file and open it with a spreadsheet editor. The current
-platform shows the raw file.)
+list](https://git.knowledgetx.com/scossu/pocket_archive/src/master/test/sample_submission/pkar_submission-demo.csv)
+for examples of artifacts, files, and bricks making up a two-sided postcard.
+(Note: you may need to download the file and open it with a spreadsheet editor.
+The current platform shows the raw file.)
 
 ### Submission ID and submission name
 
@@ -418,8 +426,9 @@ laundry list creation.
 Fortunately, such repetitive and error-prone tasks can be easily automated
 with tools provided by most spreadsheet applications. A macro (a mini-program
 that runs in an application) for LibreOffice Calc is [provided
-here](../src/util/libreoffice_idgen.bas) to automatically generate 16-character
-IDs for all the cells selected in a table.
+here](https://git.knowledgetx.com/scossu/pocket_archive/src/master/src/util/libreoffice_idgen.bas)
+to automatically generate 16-character IDs for all the cells selected in a
+table.
 
 ## Deleting resources
 

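A side note on the ID macro referenced in the last hunk of `submission.md`: the same idea can be sketched outside LibreOffice. The Python snippet below is only an illustration, not the macro shipped in `src/util/libreoffice_idgen.bas`. The alphabet (ASCII letters and digits) and the function names `generate_id` and `generate_ids` are assumptions; only the 16-character length comes from the guide.

```python
import secrets
import string

# Assumption: IDs are drawn from ASCII letters and digits. The alphabet used
# by the actual LibreOffice macro is not stated in the guide.
ID_ALPHABET = string.ascii_letters + string.digits
ID_LENGTH = 16  # The guide mentions 16-character IDs.


def generate_id(length: int = ID_LENGTH) -> str:
    """Return one random ID of the given length (hypothetical helper)."""
    return "".join(secrets.choice(ID_ALPHABET) for _ in range(length))


def generate_ids(count: int) -> list[str]:
    """Return `count` random IDs, e.g. one per selected spreadsheet cell."""
    return [generate_id() for _ in range(count)]


if __name__ == "__main__":
    # Print three IDs, one per line, ready to paste into an `id` column.
    for new_id in generate_ids(3):
        print(new_id)
```

The output can be pasted into the `id` column of a laundry list. Whether Pocket Archive places further constraints on ID characters is not covered by this diff, so check the guide before using such IDs in a real submission.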
+ 22 - 0
doc/user_guide/mkdocs.yml

@@ -0,0 +1,22 @@
+---
+site_name: Pocket Archive
+
+theme: readthedocs
+
+nav:
+  - Introduction: "index.md"
+  - Setup:
+      - Installation: "install.md"
+      - Configuration: "config.md"
+  - "Content modeling":
+      - "Introduction (for content managers)": "content_model_intro.md"
+      - "Set up guide (for system admins)": "content_model_manual.md"
+  - Submission:
+      - "Submission guide": "submission.md"
+      - "Remote submission": "remote_submission.md"
+  - Glossary: "glossary.md"
+  - Roadmap: "roadmap.md"
+
+markdown_extensions:
+  - toc:
+      toc_depth: 4