Legacy Lakesuperior code.

Stefano Cossu b968f5a8ee Add stub CLI methods; update documentation. 7 years ago
data dc50b0a51d Make extract_imr compatible with bdb back end; add RDF types for resource graphs. 7 years ago
doc b968f5a8ee Add stub CLI methods; update documentation. 7 years ago
etc.skeleton 6498205eb5 Move stuff for Python API; lots of cleanup here and there. 7 years ago
lakesuperior 212786dac9 Misc additions: some cosmetics to HTML pages, random docs, stub 7 years ago
static 4090e51570 SPARQL query UI and API. 7 years ago
tests e9f2e4fd85 Include leading slash in UIDs (á la filesystem path). 7 years ago
util ad9f67b4bf Move bootstrap to admin CLI; add other method stubs. 7 years ago
.gitignore 2fdc1b902e Initial commit: some boilerplate borrowed from Combine, basic folder structure and documentation. 7 years ago
LICENSE 2fdc1b902e Initial commit: some boilerplate borrowed from Combine, basic folder structure and documentation. 7 years ago
README.md b968f5a8ee Add stub CLI methods; update documentation. 7 years ago
conftest.py 6980366c72 Separate environments between inside and outside app context. 7 years ago
fcrepo 46f5e63e42 Various startup scripts. 7 years ago
fcrepo_mt 46f5e63e42 Various startup scripts. 7 years ago
lsup-admin b968f5a8ee Add stub CLI methods; update documentation. 7 years ago
profiler.py 8554f845a3 Adapt profiler script to multi-modal access. 7 years ago
requirements.txt d2a4d67889 Update requirements.txt. 7 years ago
server.py ad9f67b4bf Move bootstrap to admin CLI; add other method stubs. 7 years ago

README.md

LAKEsuperior

LAKEsuperior is an experimental Fedora Repository implementation.

Guiding Principles

LAKEsuperior aims at being an uncomplicated, efficient Fedora 4 implementation.

Its main goals are:

  • Reliability: Based on solid technologies with stability in mind.
  • Efficiency: Small memory and CPU footprint, high scalability.
  • Ease of management: Tools to perform monitoring and maintenance included.
  • Simplicity of design: Straightforward architecture, robustness over features.

Key features

  • Drop-in replacement for Fedora 4 (with some caveats); currently being tested with Hyrax 2
  • Very stable persistence layer based on LMDB and filesystem. Fully ACID-compliant writes guarantee consistency of data.
  • Term-based search (planned) and SPARQL Query API + UI
  • No performance penalty for storing many resources under the same container; no kudzu pairtree segmentation 1
  • Extensible provenance metadata tracking
  • Multi-modal access: HTTP (REST), command line interface and native Python API.
  • Fits in a pocket: you can carry 50M triples on an 8 GB memory stick.
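
To illustrate the HTTP side of the multi-modal access listed above, here is a minimal sketch that prepares a Fedora 4 style request creating an RDF resource from a Turtle body. The base URL (`http://localhost:8000/ldp`), the helper name `build_create_request` and the sample UID are assumptions made for illustration, not values taken from this README; check your own configuration and the Delta document for the actual endpoint and behavior.

```python
from urllib.request import Request

# Assumed endpoint; adjust to match your own server configuration.
BASE_URL = "http://localhost:8000/ldp"


def build_create_request(uid, turtle, base_url=BASE_URL):
    """Build (but do not send) a PUT request creating an RDF resource.

    `uid` is expected to start with a slash, like a filesystem path.
    """
    if not uid.startswith("/"):
        raise ValueError("UID must start with a slash")
    return Request(
        url=base_url.rstrip("/") + uid,
        data=turtle.encode("utf-8"),
        headers={"Content-Type": "text/turtle"},
        method="PUT",
    )


# Hypothetical resource path and a one-triple Turtle payload.
req = build_create_request(
    "/my_collection/item1",
    '<> <http://purl.org/dc/terms/title> "Hello" .',
)
print(req.get_method(), req.full_url)

# To send it against a running server:
#   from urllib.request import urlopen
#   with urlopen(req) as resp:
#       print(resp.status)
```

The request is only built here, not sent, so the sketch can be inspected without a running server.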

Implementation of the official Fedora API specs (Fedora 5.x and beyond) is not foreseen in the short term; however, it would be a natural evolution of this project if it gains support.

Please make sure you read the Delta document for divergences from the official Fedora 4 implementation.

Target Audience

LAKEsuperior is for anybody who cares about preserving data in the long term.

Less vaguely, LAKEsuperior is targeted at those who need to store large quantities of highly linked metadata and documents.

Its Python/C environment and API make it particularly well suited to academic and scientific environments, which can embed it in a Python application as a library or extend it via plug-ins.

In its current state, LAKEsuperior is aimed at developers and hands-on managers who are able to run a Python environment and are interested in evaluating this project.

Installation

Dependencies

  1. Python 3.5 or greater.
  2. The LMDB database library. It should be included in most Linux distributions' standard package repositories.
  3. A message broker supporting the STOMP protocol. For testing and evaluation purposes, CoilMQ is included with the dependencies and should be automatically installed.
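
The three dependencies above can be checked before installing anything else. The following is a rough preflight sketch, not part of the project: the function name `check_dependencies` is hypothetical, and it assumes that LMDB is reached through a Python module named `lmdb` and that the broker is CoilMQ with a `coilmq` executable on the `PATH`.

```python
import importlib.util
import shutil
import sys


def check_dependencies():
    """Return a dict mapping each dependency to True if it looks available."""
    return {
        # Dependency 1: Python 3.5 or greater.
        "python>=3.5": sys.version_info >= (3, 5),
        # Dependency 2: assumes LMDB is used via a module named `lmdb`.
        "lmdb binding": importlib.util.find_spec("lmdb") is not None,
        # Dependency 3: assumes CoilMQ is the broker and is on the PATH.
        "coilmq executable": shutil.which("coilmq") is not None,
    }


for name, ok in check_dependencies().items():
    print("{:<20} {}".format(name, "ok" if ok else "MISSING"))
```

The last two checks are expected to report MISSING until `pip install -r requirements.txt` has been run inside the virtualenv.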

Installation steps

  1. Install dependencies as indicated above
  2. Create a virtualenv in a project folder: virtualenv -p <python 3.5+ exec path> <virtualenv folder>
  3. Activate the virtualenv: source <path_to_virtualenv>/bin/activate
  4. Clone this repo
  5. cd into repo folder
  6. Install dependencies: pip install -r requirements.txt
  7. Copy the etc.skeleton folder to a separate location
  8. Set the configuration folder location in the environment: export FCREPO_CONFIG_DIR=<your config dir location> (alternatively you can add this line to your virtualenv activate script)
  9. Configure the application
  10. Start your STOMP broker, e.g.: coilmq &
  11. Run ./lsup-admin bootstrap to initialize the binary and graph stores
  12. Run ./fcrepo for a single-threaded server (Bjoern) or ./fcrepo_mt for a multi-threaded server (Gunicorn).
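
Steps 7 and 8 above lend themselves to scripting. The sketch below is one possible way to do it in Python, under stated assumptions: the helper name `prepare_config_dir` is hypothetical, the paths in the usage note are placeholders, and the admin script name is taken from the repository file listing.

```python
import os
import shutil
from pathlib import Path


def prepare_config_dir(repo_dir, config_dir):
    """Copy etc.skeleton to `config_dir` (step 7) and return a copy of the
    current environment with FCREPO_CONFIG_DIR pointing at it (step 8)."""
    src = Path(repo_dir) / "etc.skeleton"
    dest = Path(config_dir)
    # Never clobber an existing configuration folder.
    if dest.exists():
        raise FileExistsError("refusing to overwrite {}".format(dest))
    shutil.copytree(str(src), str(dest))
    env = dict(os.environ)
    env["FCREPO_CONFIG_DIR"] = str(dest.resolve())
    return env


# Usage (paths are placeholders):
#   import subprocess
#   env = prepare_config_dir(".", "/path/to/your/config")
#   subprocess.run(["./lsup-admin", "bootstrap"], env=env, check=True)
```

Returning an environment dict rather than modifying `os.environ` keeps the setting scoped to the processes you launch with it; exporting the variable in the shell, as in step 8, remains the simpler route for interactive use.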

Production deployment

If you like fried repositories for lunch, deploy before 11AM.

Status and development

LAKEsuperior is in alpha status. Please see the TODO list for a rudimentary road map and status.

Technical documentation

Architecture Overview

Content Model

Command-Line Reference

Storage Implementation

Performance Benchmarks

TODO list


1 However, if your client splits pairtrees upstream, as Hyrax does, that obviously needs to change to get rid of the path segments.