|
@@ -20,6 +20,13 @@ of the US Federal Agency Digital Guidelines Initiative (FADGI).
|
|
|
|
|
|
## \*Artifact
|
|
|
|
|
|
|
|
|
|
+A human-made object with a cultural value. It can be a physical object, such as
|
|
|
|
+a book, a sculpture, or a document, or digital data (e.g., a born-digital
|
|
|
|
+photograph or video clip, a software application, etc.). It roughly corresponds
|
|
|
|
+to the Intellectual Entity concept in the
|
|
|
|
+[PREMIS](https://www.loc.gov/standards/premis/v3/premis-3-0-final.pdf) data
|
|
|
|
+dictionary for preservation metadata.
|
|
|
|
+
|
|
## Atomic, atomicity
|
|
|
|
An atomic operation is an operation on data that either succeeds or fails
|
|
|
|
completely. Complex data structures can be handled via atomic operations, if
|
|
|
|
@@ -28,6 +35,20 @@ parts of the data structure are intact.
|
|
|
|
|
|
## Checksum
|
|
|
|
|
|
|
|
|
|
+A sequence of bytes, usually visualized as an alphanumeric sequence (e.g.,
|
|
|
|
+`blake2:e974d0e881f151ee293519e[…]`), that represents the "fingerprint" of a
|
|
|
|
+digital file. Many algorithms are available for generating a checksum for a
|
|
|
|
+file, but for each algorithm, a file has only one checksum. If even one bit
|
|
|
|
+changes in the file, the checksum changes completely. It is a fundamental tool
|
|
|
|
+for digital preservation, as it can easily indicate if a file has changed on
|
|
|
|
+the storage medium (due to storage corruption) or in transit (due to network
|
|
|
|
+glitches), or if it may have been forged.
|
|
|
|
+
|
|
|
|
+Pocket Archive calculates and stores checksums in the
|
|
|
|
+[BLAKE2b](https://www.blake2.net/) format, which is a less popular but very
|
|
|
|
+fast and secure algorithm. In future releases, it may support multiple
|
|
|
|
+algorithms.
|
|
|
|
+
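As a rough illustration of the checksum idea above, the following sketch computes a BLAKE2b checksum with Python's standard `hashlib` module and compares it with a stored value. It is not Pocket Archive's own code: the function names are hypothetical, the `blake2:` prefix simply mirrors the example string above, and the default 64-byte digest size is an assumption.

```python
import hashlib


def blake2b_checksum(path, chunk_size=65536):
    """Return a 'blake2:<hex digest>' string for the file at `path`."""
    digest = hashlib.blake2b()  # default 64-byte digest; an assumption here
    with open(path, "rb") as fh:
        # Read in chunks so that large files need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return "blake2:" + digest.hexdigest()


def is_intact(path, expected_checksum):
    """Return True if the file still matches a previously stored checksum."""
    return blake2b_checksum(path) == expected_checksum
```

Changing a single byte of the file yields a different checksum, so `is_intact` would then return `False`.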
|
|
## \*Brick
|
|
|
|
|
|
|
|
## \*Collection
|
|
|
|
@@ -46,11 +67,23 @@ indicate implicit memberships (e.g. a file inside a folder), which are added
|
|
automatically by the submission process, and explicit ones, which are defined
|
|
|
|
via the `has_member` **property**.
|
|
|
|
|
|
|
|
|
|
+## \*Codename
|
|
|
|
+
|
|
|
|
+The name used to reference a **field** in a **laundry list** by content
|
|
|
|
+managers. It is by convention made of lowercase letters, numbers, and
|
|
|
|
+underscores, e.g. `path_name`, `submission_date`, `creator`…
|
|
|
|
+
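The naming convention above can be checked mechanically. This is a minimal sketch, assuming the convention is exactly "one or more lowercase ASCII letters, digits, or underscores"; the pattern and the function name are illustrative and not part of Pocket Archive.

```python
import re

# Lowercase ASCII letters, digits, and underscores only (assumed convention).
CODENAME_RE = re.compile(r"[a-z0-9_]+")


def is_conventional_codename(name):
    """Return True if `name` follows the codename convention sketched above."""
    return CODENAME_RE.fullmatch(name) is not None
```

For example, `is_conventional_codename("submission_date")` is true, while `"Submission Date"` fails because of the uppercase letters and the space.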
|
|
## \*Content model
|
|
|
|
|
|
|
|
|
|
+In Pocket Archive, a content model is the complete set of definitions of
|
|
|
|
+all the **content types** in a Pocket Archive instance, the **properties** that
|
|
|
|
+define them, and how they interact with one another via **relationships**.
|
|
|
|
+Otherwise known as an *ontology*.
|
|
|
|
+
|
|
## \*Content type
|
|
|
|
|
|
|
|
-## \*Content type definition
|
|
|
|
|
|
+Classification of a resource according to a **content model**. Each resource
|
|
|
|
+in Pocket Archive is assigned one and only one content type.
|
|
|
|
|
|
## CSV
|
|
|
|
|
|
|
|
@@ -68,30 +101,125 @@ application file (usually employed by the "Save" command).
|
|
|
|
|
|
## \*Descriptive resource
|
|
|
|
|
|
|
|
|
|
+In Pocket Archive, this is a **resource** that the system may parse and
|
|
|
|
+understand as meaningful data, i.e., as **RDF** data. The **Artifact** and
|
|
|
|
+**Brick** content types are descriptive resources. Files, which are **opaque
|
|
|
|
+resources**, are paired with an implicit descriptive resource that presents the
|
|
|
|
+file's metadata so that the file can be described and found in searches.
|
|
|
|
+
|
|
## \*Drop box
|
|
|
|
|
|
|
|
|
|
+A folder, on a local or remote filesystem, that is being watched by a running
|
|
|
|
+Pocket Archive (`pkar_watch`) instance. Any **laundry list** that is put into
|
|
|
|
+this folder will trigger a **submission** process.
|
|
|
|
+
|
|
## \*Field
|
|
|
|
|
|
|
|
|
|
+The description of a **resource** **property** as a column in a
|
|
|
|
+**laundry list** CSV. A field has a name (in the laundry list, the
|
|
|
|
+**codename**) and one or more values.
|
|
|
|
+
|
|
## Fixity
|
|
|
|
|
|
|
|
|
|
+The assurance that a digital file is intact and bit-by-bit identical to how
|
|
|
|
+it was submitted. Fixity is checked by verifying a **checksum**.
|
|
|
|
+
|
|
## \*Laundry list
|
|
|
|
|
|
|
|
|
|
+A CSV file with tabular data listing all the resources included in a
|
|
|
|
+**submission** and their metadata. The laundry list is produced by the
|
|
|
|
+depositor of a **SIP** and triggers an automatic submission process.
|
|
|
|
+
|
|
## Linked Data, Linked Open Data
|
|
|
|
|
|
|
|
|
|
+Data (in the case of Linked Open Data, published and freely accessible on the
|
|
|
|
+Web) in the **RDF** format. Linked (Open) Data is a popular publishing format
|
|
|
|
+among cultural heritage, humanitarian, and scientific institutions, and other
|
|
|
|
+organizations that value interoperability and the free exchange of data sets.
|
|
|
|
+Linked Data facilitates the aggregation and reconciliation of heterogeneous
|
|
|
|
+data sets produced by different sources, by relying on controlled vocabularies
|
|
|
|
+and unambiguous, globally unique identifiers.
|
|
|
|
+
|
|
## Markdown
|
|
|
|
|
|
|
|
|
|
+Plain-text [writing format](https://daringfireball.net/projects/markdown/) that
|
|
|
|
+can be converted to HTML or other formatted text by using conventional marks
|
|
|
|
+and embedded HTML. Markdown is very popular among technical documentation
|
|
|
|
+writers because it doesn't need a specialized application to write. This
|
|
|
|
+glossary and the other Pocket Archive documentation are written in Markdown.
|
|
|
|
+
|
|
|
|
+Pocket Archive supports writing Markdown documents for its "long description"
|
|
|
|
+**property** that can be used to create content-rich introduction pages for
|
|
|
|
+**Collections**.
|
|
|
|
+
|
|
## Metadata
|
|
|
|
|
|
|
|
|
|
+Literally, data about data. Metadata are administrative and technical
|
|
|
|
+information about a physical or digital object that do not constitute the
|
|
|
|
+object itself, but are helpful to classify, inventory, find, and relate it.
|
|
|
|
+
|
|
|
|
+## Namespace
|
|
|
|
+
|
|
|
|
+The prefix of a group of **UIDs** or **URIs** that is constant for a whole
|
|
|
|
+organization or business unit. It is a convention used to separate identifiers
|
|
|
|
+into broad categories for administrative purposes. Namespaces are used
|
|
|
|
+extensively in **RDF** and in Pocket Archive; however, they are a more
|
|
|
|
+technical aspect of archiving that is not easily visible to occasional users.
|
|
|
|
+
|
|
|
|
+Namespaces in RDF can be shortened within a contained system, as they can be
|
|
|
|
+lengthy, and the mapping between the short prefix and the full-length namespace
|
|
|
|
+is maintained by that system. URIs published on the Web must be either in their
|
|
|
|
+fully-qualified form, or accompanied by the namespace mapping in the same
|
|
|
|
+document.
|
|
|
|
+
|
|
|
|
+E.g.: the URI `http://purl.org/dc/terms/contributor` can be represented
|
|
|
|
+internally in Pocket Archive as `dc:contributor`, as long as the relation
|
|
|
|
+between `dc:` and `http://purl.org/dc/terms/` is registered.
|
|
|
|
+
|
|
|
|
+Pocket Archive supports user-defined namespaces and mappings that can be
|
|
|
|
+configured by the archive administrator.
|
|
|
|
+
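The prefix mapping described in this entry can be pictured as a simple lookup table. The sketch below is only illustrative: the `NAMESPACES` dictionary and the `expand` and `shorten` helpers are hypothetical names, and the only mapping taken from the text is `dc:` to `http://purl.org/dc/terms/`.

```python
# Registered prefix -> full namespace (only the `dc` entry comes from the text).
NAMESPACES = {"dc": "http://purl.org/dc/terms/"}


def expand(short_name, namespaces=NAMESPACES):
    """Turn a prefixed name such as 'dc:contributor' into a full URI."""
    prefix, sep, local_name = short_name.partition(":")
    if not sep or prefix not in namespaces:
        raise KeyError(f"No registered namespace for {short_name!r}")
    return namespaces[prefix] + local_name


def shorten(uri, namespaces=NAMESPACES):
    """Turn a full URI into its prefixed form, if a mapping is registered."""
    for prefix, namespace in namespaces.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return uri  # No mapping registered: keep the fully-qualified form.
```

With this table, `expand("dc:contributor")` yields `http://purl.org/dc/terms/contributor`, and `shorten` reverses it. A name whose prefix is not registered cannot be expanded, which is why published documents must carry either full URIs or the mapping itself.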
|
|
## Ontology
|
|
|
|
|
|
|
|
See **Content model**.
|
|
|
|
|
|
|
|
## \*Opaque resource
|
|
|
|
|
|
|
|
|
|
+In Pocket Archive, a digital file preserved in the archive. It is "opaque" in
|
|
|
|
+the sense that Pocket Archive is only aware of its presence and **fixity**, but
|
|
|
|
+it doesn't know about its contents. Each opaque resource is accompanied by a
|
|
|
|
+**descriptive resource** that contains its **metadata** and points to it.
|
|
|
|
+
|
|
|
|
+## \*Presentation
|
|
|
|
+
|
|
|
|
+In Pocket Archive, this is the whole package of Web pages, **presentation
|
|
|
|
+files**, and ancillary digital assets that make up a **static site**
|
|
|
|
+generated by Pocket Archive.
|
|
|
|
+
|
|
|
|
+Presentation data are disposable and can be regenerated on demand. Pocket
|
|
|
|
+Archive does not decide whether or how a presentation should be published, or
|
|
|
|
+who has access to it. That is a decision left to the archive owners and system
|
|
|
|
+administrators.
|
|
|
|
+
|
|
## \*Presentation file
|
|
|
|
|
|
|
|
|
|
+Also called *Derivative file*, among other names, by
|
|
|
|
+[FADGI](https://www.digitizationguidelines.gov/term.php?term=derivativefile).
|
|
|
|
+This is a file derived from a **production master** file that is fit for
|
|
|
|
+**presentation**. It often has a lower quality and lossy compression than its
|
|
|
|
+source, and it does not need to be preserved, as it can be regenerated without
|
|
|
|
+manual intervention.
|
|
|
|
+
|
|
|
|
+Pocket Archive automatically generates presentation files and thumbnails during
|
|
|
|
+its static site generation process.
|
|
|
|
+
|
|
## \*Property
|
|
|
|
|
|
|
|
|
|
+A **metadata** element that can be attributed to a **resource**. Properties are
|
|
|
|
+more or less strictly defined in the **content model** by the archive
|
|
|
|
+administrator and they may have a data type, a cardinality, and a range. See
|
|
|
|
+the [content model primer](./content_model_primer.md) for more information.
|
|
|
|
+
|
|
## \*Production master
|
|
|
|
|
|
|
|
A file fit for generating **presentation files**. This is usually a file
|
|
|
|
@@ -119,7 +247,14 @@ to a **resource** managed by Pocket Archive itself. Unlike hyperlinks in the
|
|
WWW, which do not always own the resource pointed to and do not guarantee its
|
|
|
|
existence, Pocket Archive guarantees the consistency of relationship links.
|
|
|
|
|
|
|
|
-## \*Resource
|
|
|
|
|
|
+## Resource
|
|
|
|
+
|
|
|
|
+In **RDF** parlance, "everything is a resource", which means that every unit of
|
|
|
|
+information can be represented by a globally unique document on the Web.
|
|
|
|
+
|
|
|
|
+In Pocket Archive, the definition of resource is more specific, and it
|
|
|
|
+refers to any record individually retrievable in the archive. Every resource is
|
|
|
|
+assigned a **content type**.
|
|
|
|
|
|
## RDF
|
|
|
|
|
|
|
|
@@ -179,8 +314,39 @@ Pocket Archive generates static sites for presentation. It also has the option
|
|
user's local computer with a web browser, without any web server or even any
|
|
|
|
Internet connection.
|
|
|
|
|
|
|
|
-## Submission
|
|
|
|
|
|
+## \*Submission
|
|
|
|
+
|
|
|
|
+The act of assembling and sending a curated data set to Pocket Archive for
|
|
|
|
+archival.
|
|
|
|
+
|
|
|
|
+A submission is made up of files, often arranged in folder hierarchies (the
|
|
|
|
+data), and an accompanying inventory, or **laundry list**, that contains the
|
|
|
|
+**metadata**. A submission has a unique identifier that gets assigned to all
|
|
|
|
+the **resources** included in it.
|
|
|
|
|
|
## UID
|
|
|
|
|
|
|
|
|
|
+Unique Identifier. Usually, this identifier is intended to be unique only in
|
|
|
|
+the system it is working in. By default, Pocket Archive resources are assigned
|
|
|
|
+16-character random strings, prefixed by a namespace to denote a resource. This
|
|
|
|
+is sufficient to keep millions of records in the archive without collision
|
|
|
|
+(i.e. duplicate IDs).
|
|
|
|
+
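To give a feel for why 16 random characters are enough, here is a hedged sketch that generates such an identifier and estimates the collision risk with the usual birthday-bound approximation. The 62-symbol alphabet, the `ex:` prefix, and both function names are assumptions made for illustration; the text above does not specify which characters or prefix Pocket Archive uses.

```python
import secrets
import string

# Assumed alphabet: ASCII letters and digits (62 symbols).
ALPHABET = string.ascii_letters + string.digits


def new_uid(namespace="ex:", length=16):
    """Return a namespace-prefixed random identifier (hypothetical format)."""
    random_part = "".join(secrets.choice(ALPHABET) for _ in range(length))
    return namespace + random_part


def collision_probability(n_records, alphabet_size=len(ALPHABET), length=16):
    """Birthday-bound approximation: n * (n - 1) / (2 * N)."""
    space = alphabet_size ** length  # N, the number of possible identifiers
    return n_records * (n_records - 1) / (2 * space)
```

Under these assumptions, `collision_probability(10_000_000)` comes out on the order of 10^-15, which is consistent with the claim that millions of records can coexist without duplicate IDs.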
|
|
|
|
+## URI
|
|
|
|
+
|
|
|
|
+Uniform Resource Identifier. It is a globally unique identifier that is able
|
|
|
|
+to pinpoint a specific **resource** on the WWW. A URI may or may not resolve
|
|
|
|
+to an actual location on the Web. URIs are a key component of **RDF**.
|
|
|
|
+
|
|
|
|
+Pocket Archive uses URIs to identify individual resources, metadata properties,
|
|
|
|
+content types, and other entities. These are usually hidden from the end user
|
|
|
|
+but viewable in the resources' raw data representation.
|
|
|
|
+
|
|
## UUID
|
|
|
|
|
|
+
|
|
|
|
+Universally Unique Identifier. Similar to a **UID**, but with a reasonable
|
|
|
|
+guarantee of uniqueness in the global space (WWW). Uniqueness is usually
|
|
|
|
+guaranteed by a **namespace** prefix that is a Web domain name owned by the
|
|
|
|
+UUID publisher, and/or by a long string of random characters that make the
|
|
|
|
+chance of collision (overlap) small enough to be negligible, or by a
|
|
|
|
+progressive sequence controlled by the publishing system.
|