
Split off verbose sections of README.

scossu, 1 week ago
Parent
Current commit
7a8a914221
4 files changed, 149 insertions and 128 deletions
  1. 28 128
      README.md
  2. 50 0
      doc/content_model.md
  3. 70 0
      doc/submission.md
  4. 1 0
      src/core.lua

+ 28 - 128
README.md

@@ -37,21 +37,26 @@ perspective.
 
 "From a standpoint of preserving human cultural heritage at large, does it make
 more sense to design very large repositories for very rich institutions, with a
-lot of layers of safety but also a lot of bureaucracy and redundancy, or
+lot of layers of safety but also a lot of bureaucracy and redundancy, or rather
 contribute to many decentralized projects that are highly efficient, small,
 representing peripheral cultures, and most importantly, that are at much higher
-risk of loss than the large institutions"?
+risk of loss than large institutions?"
 
 Both: this software has been conceived with the experience of large-scale
 repositories as the background to decide what works and what doesn't, what is
 necessary and what is superfluous, and what catalogers and archivists need to
 do their job.
 
+It is not inconceivable that if many Pocket Archives were to sprout all over
+the place one day, they could be periodically harvested, linked together, and
+presented in one large, central archive (it's Linked Data, after all), without
+any detriment to the independence of the individual archives.
+
 ## Basic concepts
 
 Until some proper reference is written, this should serve as a high-level
-documentation to help evaluate the functionality and for the author to stay on
-track. Some of these ideas have been ripped right off my day job, so there is
+documentation to help evaluate the functionality and to help me to stay on
+track. Many of these ideas have been ripped right off my day job, so there is
 a good chance they work.
 
 ### General philosophy
@@ -61,9 +66,9 @@ both a user's and a maintainer's perspectives. These two properties are usually
 seen as conflicting, but within reason, they can coexist.
 
 Pocket Archive is built upon a minimalistic framework: C and Lua, with very few
-dependencies. As with these foundational elements, it strives to offer few
-tools that can be combined in a multitude of ways to achieve many goals, rather
-than many tools each doing a specific thing.
+dependencies. Similarly to these foundational elements, Pocket Archive strives
+to offer few tools that can be combined in a multitude of ways to achieve many
+goals, rather than many tools each doing a specific thing.
 
 ### Resource
 
@@ -72,10 +77,10 @@ too much by taking the concept to the Linked Data extremes, the term *resource*
 is used in this project to describe individual, self-contained units of
 information such as:
 
-- Files;
+- Digital files;
 - Intellectual or physical artifacts (artworks, documents, books, etc.);
 - Structural elements inside or around an entity, such as the order of pages in
-  a book, the two sides of a postcard, a collection of artifacts, etc.
+  a book, the two sides of a postcard, a collection of other resources, etc.
 
 Files are called *opaque resources*. They are viewed by Pocket Archive as
 "opaque" in that the system doesn't care about their contents. It only ensures
@@ -97,133 +102,28 @@ A submission is directed by a *laundry list*, which is a spreadsheet listing
 all the resources (both opaque and descriptive) to be created, and the metadata
 assigned to them. The laundry list, formatted as a CSV (comma-separated value)
 file, can be edited by several free and open source applications, such as
-LibreOffice. For repetitive, high- volume submissions, templates can be set to
+LibreOffice. For repetitive, high-volume submissions, templates can be set to
 facilitate filling in metadata fields. An [example submission
 ](test/sample_submission/postcard-bag/data/), which includes a laundry list, is
 available.
 
-Detailed instructions on how to write a laundry list shall be added later. For
-now, the following are the basic guidelines to build a submission package:
-
-- Resources are arranged in files and folders on a local filesystem that Pocket
-  Archive can access.
-- File and folder arrangement is important. A folder represents a descriptive
-  resource, and can have metadata attached to it. A file of folder under a
-  parent folder is automatically added as a *child* of the parent resource.
-  This relationship is intended to present the parent as a container of other
-  sub-resources (descriptive and/or opaque). With this method, hierarchies of
-  any complexity can be built.
-- File and folder order in the submission folder is *not* important. No need to
-  rename files and folders to force a specific ordering. This is specified via
-  laundry list instead. See below.
-- The laundry list file is placed under the submission package folder and must
-  be named `pkar_submission.csv`.
-
-A laundry list is thus formatted:
-
-- The first row is reserved for the headers, which indicate the field names.
-- Each subsequent row represents a resource (except in a multi-value case,
-  described below). The `pas:sourcePath` and `pas:contentType` fields are
-  mandatory for each resource. All other fields are optional for the
-  submission, however, some type definitions may have constraints in this
-  regard.
-- All field names, except for `id`, have a namespace prefix among the ones
-  defined in the configuration. See dedicated section for details about
-  namespaces.
-- Fields with a special meaning:
-    - `id`: optional and single-valued. If provided, it becomes the primary
-      identifier for the resource, which is used anywhere information about the
-      resource is retrieved. The depositor is responmsible for ensuring that
-      the provided ID is unique across the system. If left blank, the system
-      generates an identifier that is guaranteed to be unique.
-    - `pas:sourcePath`: mandatory and single-valued. It refers to the file or
-      folder path relative to the package.
-    - `pas:contentType`: mandatory and single-valued. It defines the content
-      type assigned to the resource. For files, it should be `pas:File` or a
-      sub-type thereof. For folders it must not be a `pas:File` or sub-type.
-- To provide multiple values for one or more fields, additional values are
-  added to rows below the previous. For these additional rows, the `sourcePath`
-  field **must not** be filled, and additional values for single-valued fields
-  are ignored.
-- The ordering of the rows determines the ordering of the resources in their
-  container. The system automatically assigns an order to the resources, using
-  their source path and their position in the laundry list. Resources at the
-  top are not assigned an order, as they are considered self-standing. If an
-  order is needed for those, the `pas:next` field can be set to the desired
-  resource (see point below about relationships), or they can be put in an
-  enclosing folder that acts as a collection.
-- Relationships can be established between resources. These are stored as
-  persistent links and appear as hyperlinks in the discovery interface. A
-  relationship can only be set for a field that is configured as "resource"
-  type. To set a relationship with a resource in the
-  same laundry list that doesn't have an explicit ID set, insert the source
-  path of the resource. For a resource that has already an ID, either by being
-  assigned one manually or by being already deposited, insert the full ID
-  including the `par:` namespace (e.g. for ID `12345`, insert `par:12345`).
-
-### Update
-
-A submission is also used to update existing resources. Each resource update is
-a full replacement of all the resource's metadata, so a submission must include
-a full representation of each of the resources updated.
-
-To facilitate this task while avoiding the need to hold on to all of the
-archive's laundry lists, Pocket Archive can generate a laundry list for one or
-more selected resources. This list, which represents the current state of the
-resources requested, can be edited and submitted for an update. This method is
-much faster and intuitive than clicking around an alien user interface filled
-with icons and terms that one has never seen before.
+Using spreadsheets is, for most users, much faster and more intuitive than
+clicking around an alien user interface filled with icons and terms that one
+has never seen before.
 
-### Metadata & content model
+Detailed instructions on how to write a laundry list are in the
+[submission documentation](doc/submission.md).
 
-**Note:** The scope of this functional area is currently under review. Things
-may change.
+### Metadata & content model
 
 Metadata are (yes, it's a *plural* noun) controlled by a *content model*, which
 in this project is intended as the entirety of definitions of content types
-recognized by the system, and how they relate to one another. Each *type
-definition* is encoded in a configuration file defining a single content
-category type. This configuration is specific to each individual Pocket Archive
-installation, which can use the baseline one provided by default, or extend it
-via additional configurations. Please look at the [default model
-configuration](config/model/typedef) files that come with Pocket Archive.
-
-One doesn't have to define all possible types in detail. Pocket Archive
-provides some basic types, e.g.: `Anything` (the super-class of them all),
-`Artifact`, `File`, `Part`, which can be used in a very basic installation and
-should not be radically altered, because some basic functionality of the system
-relies on them. To add more specific definitions, *subtypes* can be defined. A
-subtype inherits all the property definitions of its broader model, and adds
-more specific behavior. An example classification could be: Anything -> File ->
-Image File -> Scientific Image. Each of the sub-types would only define the
-special properties of that definition, which add to, or replace, the properties
-of its broader definitions.
-
-All resources in Pocket Archive must be assigned a content type. If someone has
-to deal with a resource that doesn't fit in any of the predefined content
-models, they can asign it the most specific type that they can. At worst, they
-can put it under Anything. Of course, if one starts dealing with many
-unclassifiable resources that look similar, it's probably best to define a
-model for them; but that is not mandatory.
-
-Each metadata field can be specified by constraints. These constraints can be
-on:
-
-- Type: the data type for the field, e.g. string, number, resource
-  (relationship), etc.
-- Cardinality: how many values can be set for a field, for each resource. These
-  values can be adjusted to set mandatory fields, single-valued fields, etc.
-- Range: the range of values allowed. How this is interpreted depends on the
-  data type: for a number can be a min/max range, for a string a regular
-  expression pattern, for a resource the type(s) of the resources pointed to,
-  etc.
-
-All of these constraints are optionals. Fields that are not defined may accept
-any number of values, and are optional. So it's up to the repository manager
-to decide how specific or how free-form their archive should be.
-
-Note that fields that are not defined at least by a label, may be hard to
-understand by users browsing the discovery interface.
+recognized by the system, and how they relate to one another. Each individual
+Pocket Archive installation can use the baseline one provided by default, or
+extend it via additional configuration.
+
+See the [content model configuration manual](doc/content_model.md) for details
+on how to set up a custom content model.
 
 ### Site generation
 
@@ -251,7 +151,7 @@ Simple road map for a rough prototype:
   - ⚒ Content model
     - ⎊ Validation rules
     - ⎊ Relationship inference rules
-  - Local overrides
+  - Local overrides
 - ⚒ Submission module
   - ✓ SIP building
   - ✓ Metadata from LL

+ 50 - 0
doc/content_model.md

@@ -0,0 +1,50 @@
+# Content model configuration
+
+**Note:** The scope of this functional area is currently under review. Things
+may change.
+
+Pocket Archive ships with some predefined content types. For some very simple
+archives, this may be enough to get started with little or no customization.
+For a setup which needs to define more numerous or complex content types in a
+more articulated way, additional types can be defined. Please look at the
+[default model configuration](../config/model/typedef) files that come with
+Pocket Archive. 
+
+Each *type definition* is encoded in a configuration file defining a single
+content category type. One doesn't have to define all possible types in detail.
+Pocket Archive provides some basic types, e.g.: `Anything` (the super-class of
+them all), `Artifact`, `File`, `Part`, which should not be radically altered,
+because some basic functionality of the system relies on them. To add more
+specific definitions, *subtypes* can be defined. A subtype inherits all the
+property definitions of its broader model, and adds more specific behavior. An
+example classification could be: Anything -> File -> Image File -> Scientific
+Image.  Each of the sub-types would only define the special properties of that
+definition, which add to, or replace, the properties of its broader
+definitions.
+
+All resources in Pocket Archive must be assigned a content type. If someone has
+to deal with a resource that doesn't fit in any of the predefined content
+models, they can assign it the most specific type that they can. At worst, they
+can put it under Anything. Of course, if one starts dealing with many
+unclassifiable resources that look similar, it's probably best to define a
+model for them; but that is not mandatory.
+
+Each metadata field can be specified by constraints. These constraints can be
+on:
+
+- Type: the data type for the field, e.g. string, number, resource
+  (relationship), etc.
+- Cardinality: how many values can be set for a field, for each resource. These
+  values can be adjusted to set mandatory fields, single-valued fields, etc.
+- Range: the range of values allowed. How this is interpreted depends on the
+  data type: for a number it can be a min/max range, for a string a regular
+  expression pattern, for a resource the type(s) of the resources pointed to,
+  etc.
+
+All of these constraints are optional. Fields that are not defined may accept
+any number of values, and are optional. So it's up to the repository manager
+to decide how specific or how free-form their archive should be.
+
+Note that fields that are not defined at least by a label may be hard to
+understand by users browsing the discovery interface.
+
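
The type inheritance and field constraints described in `doc/content_model.md` above can be sketched as follows. This is an illustrative Python sketch, not code from the Pocket Archive repository (which is written in C and Lua). The dictionary layout, the `ImageFile` type name, the `min`/`max`/`range` keys, and the helpers `resolve_fields`, `check_field` and `validate` are all hypothetical; the real type definitions live in `config/model/typedef` in a format not shown here. Checking the range of a "resource" field (the types of the resources pointed to) is left out.

```python
import re

# Hypothetical type definitions, for illustration only.
TYPEDEFS = {
    "Anything": {
        "parent": None,
        "fields": {"label": {"type": "string"}},
    },
    "File": {
        "parent": "Anything",
        "fields": {"size": {"type": "number", "range": (0, float("inf"))}},
    },
    "ImageFile": {
        "parent": "File",
        # Replaces the broader "label" definition: mandatory, single-valued.
        "fields": {"label": {"type": "string", "min": 1, "max": 1}},
    },
}


def resolve_fields(type_name, typedefs=TYPEDEFS):
    """Merge field definitions from the broadest type down to `type_name`."""
    chain = []
    while type_name is not None:
        if type_name in chain:
            raise ValueError(f"circular inheritance at {type_name!r}")
        chain.append(type_name)
        type_name = typedefs[type_name]["parent"]
    fields = {}
    for name in reversed(chain):
        # Narrower types add to, or replace, the broader definitions.
        fields.update(typedefs[name]["fields"])
    return fields


def check_field(values, constraint):
    """Return error messages for the values of one field."""
    errors = []
    # Cardinality: a missing bound means "no restriction".
    if len(values) < constraint.get("min", 0):
        errors.append(f"expected at least {constraint['min']} value(s)")
    if "max" in constraint and len(values) > constraint["max"]:
        errors.append(f"expected at most {constraint['max']} value(s)")
    ftype, frange = constraint.get("type"), constraint.get("range")
    for value in values:
        if ftype == "number":
            # Type: the value must parse as a number.
            try:
                number = float(value)
            except ValueError:
                errors.append(f"{value!r} is not a number")
                continue
            # Range: a (min, max) pair for numbers.
            if frange is not None and not frange[0] <= number <= frange[1]:
                errors.append(f"{value!r} is out of range")
        elif ftype == "string" and frange is not None:
            # Range: a regular expression pattern for strings.
            if re.fullmatch(frange, value) is None:
                errors.append(f"{value!r} does not match {frange!r}")
    return errors


def validate(resource, type_name):
    """Check a resource (field name -> list of values) against its type."""
    errors = {}
    # Fields without a definition are skipped: optional and unconstrained.
    for name, constraint in resolve_fields(type_name).items():
        found = check_field(resource.get(name, []), constraint)
        if found:
            errors[name] = found
    return errors
```

For example, `validate({"label": ["Scan"], "size": ["2048"]}, "ImageFile")` returns an empty dictionary, while omitting `label` reports a cardinality error for it. Whole-definition replacement (rather than a key-by-key merge) is a simplifying choice of this sketch.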

+ 70 - 0
doc/submission.md

@@ -0,0 +1,70 @@
+# Submission process
+
+The following are basic guidelines to build a submission package:
+
+- Resources are arranged in files and folders on a local filesystem that Pocket
+  Archive can access.
+- File and folder arrangement is important. A folder represents a descriptive
+  resource, and can have metadata attached to it. A file or folder under a
+  parent folder is automatically added as a *child* of the parent resource.
+  This relationship is intended to present the parent as a container of other
+  sub-resources (descriptive and/or opaque). With this method, hierarchies of
+  any complexity can be built.
+- File and folder order in the submission folder is *not* important. No need to
+  rename files and folders to force a specific ordering. This is specified via
+  laundry list instead. See below.
+- The laundry list file is placed under the submission package folder and must
+  be named `pkar_submission.csv`.
+
+A laundry list is thus formatted:
+
+- The first row is reserved for the headers, which indicate the field names.
+- Each subsequent row represents a resource (except in a multi-value case,
+  described below). The `pas:sourcePath` and `pas:contentType` fields are
+  mandatory for each resource. All other fields are optional for the
+  submission; however, some type definitions may have constraints in this
+  regard.
+- All field names, except for `id`, have a namespace prefix among the ones
+  defined in the configuration. See dedicated section for details about
+  namespaces.
+- Fields with a special meaning:
+    - `id`: optional and single-valued. If provided, it becomes the primary
+      identifier for the resource, which is used anywhere information about the
+      resource is retrieved. The depositor is responsible for ensuring that
+      the provided ID is unique across the system. If left blank, the system
+      generates an identifier that is guaranteed to be unique.
+    - `pas:sourcePath`: mandatory and single-valued. It refers to the file or
+      folder path relative to the package.
+    - `pas:contentType`: mandatory and single-valued. It defines the content
+      type assigned to the resource. For files, it should be `pas:File` or a
+      sub-type thereof. For folders it must not be a `pas:File` or sub-type.
+- To provide multiple values for one or more fields, additional values are
+  added in the rows below the resource's first row. For these additional rows,
+  the `pas:sourcePath` field **must not** be filled, and additional values for
+  single-valued fields are ignored.
+- The ordering of the rows determines the ordering of the resources in their
+  container. The system automatically assigns an order to the resources, using
+  their source path and their position in the laundry list. Resources at the
+  top are not assigned an order, as they are considered self-standing. If an
+  order is needed for those, the `pas:next` field can be set to the desired
+  resource (see point below about relationships), or they can be put in an
+  enclosing folder that acts as a collection.
+- Relationships can be established between resources. These are stored as
+  persistent links and appear as hyperlinks in the discovery interface. A
+  relationship can only be set for a field that is configured as "resource"
+  type. To set a relationship with a resource in the same laundry list that
+  doesn't have an explicit ID set, insert the source path of the resource. For
+  a resource that already has an ID, either by being assigned one manually or
+  by being already deposited, insert the full ID including the `par:`
+  namespace (e.g. for ID `12345`, insert `par:12345`).
+
+## Update
+
+A submission is also used to update existing resources. Each resource update is
+a full replacement of all the resource's metadata, so a submission must include
+a full representation of each of the resources updated.
+
+To facilitate this task while avoiding the need to hold on to all of the
+archive's laundry lists, Pocket Archive can generate a laundry list for one or
+more selected resources. This list, which represents the current state of the
+resources requested, can be edited and submitted for an update. 
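
The laundry list rules in `doc/submission.md` above (multi-value rows and relationship references) can be sketched as follows. This is an illustrative Python sketch, not code from the Pocket Archive repository. The sample rows, the `dc:` field names, the `pas:Artifact` type spelling, and the helpers `parse_laundry_list` and `resolve_reference` are hypothetical. The sketch also assumes that a resource referenced by source path resolves to `par:` plus its assigned or generated ID.

```python
import csv
import io

# Fields that the documentation describes as single-valued.
SINGLE_VALUED = {"id", "pas:sourcePath", "pas:contentType"}

# Hypothetical laundry list; the `dc:` columns are made up for illustration.
SAMPLE = """\
id,pas:sourcePath,pas:contentType,dc:title,dc:subject,dc:relation
,postcard,pas:Artifact,Greetings from Lisbon,postcards,par:12345
,,,,Lisbon,
front01,postcard/front.jpg,pas:File,Front side,,postcard
"""


def parse_laundry_list(text):
    """Group laundry list rows into resources, keeping the row order."""
    resources = []
    for row in csv.DictReader(io.StringIO(text)):
        if row.get("pas:sourcePath"):
            # A filled source path starts a new resource.
            current = {}
            resources.append(current)
        elif not resources:
            raise ValueError("first data row has no pas:sourcePath")
        else:
            # An empty source path adds values to the previous resource.
            current = resources[-1]
        for field, value in row.items():
            if field is None or not value:
                continue
            if field in SINGLE_VALUED:
                # Additional values for single-valued fields are ignored.
                current.setdefault(field, value)
            else:
                current.setdefault(field, []).append(value)
    return resources


def resolve_reference(value, ids_by_path):
    """Turn a relationship cell into a full resource ID.

    `ids_by_path` maps source paths in the current laundry list to the IDs
    assigned to, or generated for, those resources.
    """
    if value.startswith("par:"):
        # Already a full ID: manually assigned or previously deposited.
        return value
    try:
        return "par:" + ids_by_path[value]
    except KeyError:
        raise ValueError(f"unknown source path: {value!r}") from None
```

Parsing `SAMPLE` yields two resources: the `postcard` folder, with two `dc:subject` values and no explicit `id`, and the `postcard/front.jpg` file, with the ID `front01`.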

+ 1 - 0
src/core.lua

@@ -11,6 +11,7 @@ local config_path = os.getenv("PA_CONFIG_DIR") or (root_path .. "/config")
 
 
 local M = {
+    -- Project root path.
     root = root_path,
     config = dofile(config_path .. "/app.lua"),