
Pocket Archive glossary

Audience: all users

WORK IN PROGRESS

This document is a glossary of terms specific to Pocket Archive and to mainstream archival, cataloging, and IT practices. The former are prefixed by an asterisk: *. Terms used in descriptions that also have a definition in this document are written in bold.

Archival master file

A file fit for conservation purposes. This is usually the version of a digital capture that contains the most information, and which can be used to generate other derivatives, most notably a production master file.

This is a standard definition of the US Federal Agencies Digital Guidelines Initiative (FADGI).

*Artifact

Atomic, atomicity

An atomic operation is an operation on data that either succeeds completely or fails completely, never leaving the data in a partially changed state. Complex data structures can be handled via atomic operations, if the system supports them, to ensure that at each step of a transfer process, all parts of the data structure are intact.
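As an illustration of the all-or-nothing idea, here is a minimal sketch of an atomic file write in Python. It is not Pocket Archive's implementation; the function name `atomic_write` is hypothetical. The data is first written to a temporary file, which then replaces the target in a single step, so a reader sees either the old content or the new content, never a half-written file.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace the file at `path` with `data` in an all-or-nothing fashion."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the target directory so that the final
    # rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # The swap happens in a single step: old content or new content.
        os.replace(tmp_path, path)
    except BaseException:
        # On failure, discard the partial temporary file; the target
        # file is left untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the process is interrupted before the final replace, the original file is still intact, which is the property the definition above describes.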

Checksum

*Brick

*Collection

In Pocket Archive, a Collection is a group of artifacts that are logically related in one way or another. There is no well-defined rule for how collections should be arranged; that is a curatorial decision based on the materials at hand and the target audience.

Collections can contain other collections, other structural Bricks, or artifacts. They are presented in a special way: a collection can have a long description, stored in a separate file, that provides a rich presentation for the collection's landing page.

Any resource can belong to multiple collections. Laundry lists have means to indicate implicit memberships (e.g. a file inside a folder), which are added automatically by the submission process, and explicit ones, which are defined via the has_member property.

*Content model

*Content type

*Content type definition

CSV

Stands for Comma-Separated Values. It is a file format for tabular data, which can be read and edited by several software packages, such as LibreOffice or Google Sheets. Opened in these applications, a CSV looks like a spreadsheet, and can be edited as such. Text formatting, column or row sizes, borders, and similar style features are not supported. CSV is pure data, which is all we are interested in.

Pocket Archive's laundry lists are formatted in CSV. Spreadsheet applications can be used to compile laundry lists, with the caveat that the file must be exported as a .csv file rather than saved in the application's native spreadsheet format (which is what the "Save" command usually produces).
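To show what "pure data" means in practice, here is a small sketch that reads CSV text with Python's standard `csv` module. The sample rows and the column names (`path`, `content_type`, `label`) are hypothetical placeholders, not the actual laundry list layout.

```python
import csv
import io

# Hypothetical sample: these column names are placeholders, not the
# real laundry list format.
SAMPLE = '''path,content_type,label
photos/001.tif,still_image,Front view
photos/002.tif,still_image,"Back view, detail"
'''


def read_rows(text: str) -> list:
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


rows = read_rows(SAMPLE)
for row in rows:
    print(row["path"], "|", row["label"])
```

Note that a value containing a comma is wrapped in double quotes in the file; spreadsheet applications take care of this automatically when exporting.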

*Descriptive resource

*Drop box

*Field

Fixity

*Laundry list

Linked Data, Linked Open Data

Markdown

Metadata

Ontology

See Content model.

*Opaque resource

*Presentation file

*Property

*Production master

A file fit for generating presentation files. This is usually a file generated from the archival master file and manually adjusted for presentation, with elements removed that are important for preservation but not for public display (e.g. color bars or working layers in a still image). Because it is manually adjusted, it should be preserved along with the archival version.

If necessary, the same file may serve both archival and production master roles. However, this is not a recommended practice and is only acceptable when a proper archival master is not (or no longer) available.

Derivatives of this file are usually lower-quality copies that are automatically generated and not preservation-worthy.

This is a standard definition of the US Federal Agencies Digital Guidelines Initiative (FADGI).

*Relationship

A relationship, in Pocket Archive, is a special type of hyperlink that points to a resource managed by Pocket Archive itself. Unlike hyperlinks in the WWW, which do not always own the resource pointed to and do not guarantee its existence, Pocket Archive guarantees the consistency of relationship links.

*Resource

RDF

Acronym for Resource Description Framework. It is the data format used for Linked Data.

Pocket Archive uses RDF internally and is able to export RDF for interoperability with external systems. End users and content managers need not be concerned with the internals of RDF, but it is good to have an awareness of the underlying support for this format.

RDF is a standard of the World Wide Web Consortium (W3C), the organization founded by Tim Berners-Lee, the inventor of the World Wide Web, and it is a format expressly made for the WWW. In RDF, everything is a resource, represented by a Web document, that can be identified globally by a URI. This format is particularly fit for aggregating and sharing data sets from heterogeneous sources that may have been cataloged according to different standards, using different tools.

Pocket Archive uses RDF to maintain a flexible method to relate resources together and to facilitate sharing its data in the wild.
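For readers curious about what RDF data looks like, here is a minimal sketch that builds two statements (called triples: subject, property, value) and prints them in the N-Triples text format. It uses plain Python rather than an RDF library, and it does not reflect Pocket Archive's internal model: the `example.org` URIs and the helper names are hypothetical, while the two property URIs come from the Dublin Core terms vocabulary.

```python
# Hypothetical identifiers: example.org URIs stand in for real ones.
BASE = "http://example.org/archive/"
DCTERMS = "http://purl.org/dc/terms/"


def uri(value: str) -> str:
    """Format a URI the way N-Triples expects it."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Format a plain string literal (simplified escaping only)."""
    escaped = (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"'


def to_ntriples(triples) -> str:
    """Serialize (subject, predicate, object) tuples, one per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


triples = [
    # A resource with a textual title.
    (uri(BASE + "photo-001"), uri(DCTERMS + "title"), literal("Front view")),
    # The same resource linked to another resource by its URI.
    (uri(BASE + "photo-001"), uri(DCTERMS + "isPartOf"), uri(BASE + "collection-a")),
]

print(to_ntriples(triples))
```

The second triple shows how resources are related to one another: the value of a property can itself be the URI of another resource.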

Schema (pl. schemata)

The complete set of rules governing a given content type. A schema defines all the properties applicable to a specific type and their constraints. It is written out as a set of files that include the content type in question and all its super-types.
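The way a schema gathers properties from a content type and its super-types can be sketched as follows. This is only an illustration under stated assumptions: the type names, the property names, the dictionary layout, and the rule that a more specific type overrides its super-types are all hypothetical and not taken from Pocket Archive's actual schema files.

```python
# Hypothetical content types: each one names its super-type (or None)
# and the properties it defines itself.
CONTENT_TYPES = {
    "resource": {"super": None, "properties": {"label": "string"}},
    "artifact": {"super": "resource", "properties": {"date_created": "date"}},
    "still_image": {"super": "artifact", "properties": {"width": "integer"}},
}


def resolve_schema(type_name: str, types: dict = CONTENT_TYPES) -> dict:
    """Collect the properties of a type together with those of its super-types."""
    # Walk up the chain from the given type to the most general one.
    chain = []
    current = type_name
    while current is not None:
        chain.append(current)
        current = types[current]["super"]

    # Apply the most general type first, so that (by assumption) a more
    # specific type can override a property of its super-types.
    schema = {}
    for name in reversed(chain):
        schema.update(types[name]["properties"])
    return schema


print(resolve_schema("still_image"))
```

In this sketch, the schema for `still_image` contains its own `width` property as well as the properties inherited from `artifact` and `resource`.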

SIP

Submission Information Package: a set of files, folders, and metadata that together constitute a complete submission. A SIP is normally prepared by archivists, either by hand or with the aid of automated tools, and is the first step of the actual archival process.

This is a term from the OAIS (Open Archival Information System) standard, which defines guidelines for digital archival practices.

Static site

A web site that is made up entirely of static files. This means that all contents of the website are pre-generated and consist of actual files living on a filesystem, in contrast with most modern dynamic sites whose contents are mostly generated on demand by a continuously running process.

While much less flexible than dynamic sites, static sites are still widely used today. Dynamic sites rely on complex, often resource-intensive applications and infrastructure that can be subject to exploits and attacks of all sorts, and on further applications and infrastructure to prevent those attacks. Sites of small to medium size with predefined content can take advantage of a simple and economical static site, which needs only a simple web server to run.

Pocket Archive generates static sites for presentation. It also has the option [WORK IN PROGRESS] to generate contents that can be viewed directly on the user's local computer with a web browser, without any web server or even any Internet connection.

Submission

UID

UUID