
Add stub CLI methods; update documentation.

Stefano Cossu 6 years ago
commit b968f5a8ee
4 changed files with 141 additions and 39 deletions
  1. README.md (+32 -23)
  2. doc/notes/TODO (+1 -1)
  3. doc/notes/cli.md (+33 -0)
  4. lsup-admin (+75 -15)

+ 32 - 23
README.md

@@ -9,34 +9,26 @@ LAKEsuperior aims at being an uncomplicated, efficient Fedora 4 implementation.
 
 Its main goals are:
 
-- *Simplicity of design:* LAKEsuperior relies on [LMDB](https://symas.com/lmdb/),
-an embedded, high-performance key-value store, for storing metadata and on
-the filesystem to store binaries.
-- *Efficiency:* while raw speed is important, LAKEsuperior also aims at being
-conservative with resources. Its memory and CPU footprint are small. Python C
-extensions are used where possible to improve performance.
-- *Reliability:* fully ACID-compliant writes guarantee consistency of data.
-- *Ease of management:* Contents can be queried directly via term search or
-SPARQL without the aid of external indices. Scripts and interfaces for
-repository administration and monitoring are shipped with the standard release.
-- *Portability:* aims at maintaining a minimal set of dependencies.
+- **Reliability:** Based on solid technologies with stability in mind.
+- **Efficiency:** Small memory and CPU footprint, high scalability.
+- **Ease of management:** Tools to perform monitoring and maintenance included.
+- **Simplicity of design:** Straightforward architecture, robustness over
+  features.
 
 ## Key features
 
-- Drop-in replacement for Fedora4 (with some caveats: see
-  [Delta document](doc/notes/fcrepo4_deltas.md))—currently being tested with
-  Hyrax 2
+- Drop-in replacement for Fedora4 (with some
+  [caveats](doc/notes/fcrepo4_deltas.md)); currently being tested with Hyrax 2
+- Very stable persistence layer based on [LMDB](https://symas.com/lmdb/) and
+  filesystem. Fully ACID-compliant writes guarantee consistency of data.
 - Term-based search (*planned*) and SPARQL Query API + UI
 - No performance penalty for storing many resources under the same container; no
   [kudzu](https://www.nature.org/ourinitiatives/urgentissues/land-conservation/forests/kudzu.xml)
   pairtree segmentation <sup id="a1">[1](#f1)</sup>
-- Constant performance writing to a resource with
-  many children or members; option to omit children in retrieval
-- Migration tools (*planned*)
-- Python API (*planned*): Authors of Python clients can use LAKEsuperior as an
-  embedded repository with no HTTP traffic or interim RDF serialization &
-  de-serialization involved.
-- Fits in a pocket: you can carry over 50M triples in an 8Gb memory stick.
+- Extensible [provenance metadata](doc/notes/model.md) tracking
+- [Multi-modal access](doc/notes/architecture.md): HTTP (REST), command line
+  interface and native Python API.
+- Fits in a pocket: you can carry 50M triples in an 8 GB memory stick.
 
 Implementation of the official [Fedora API specs](https://fedora.info/spec/)
 (Fedora 5.x and beyond) is not
@@ -46,6 +38,21 @@ project if it gains support.
 Please make sure you read the [Delta document](doc/notes/fcrepo4_deltas.md) for
 divergences with the official Fedora4 implementation.
 
+## Target Audience
+
+LAKEsuperior is for anybody who cares about preserving data in the long term.
+
+Less vaguely, LAKEsuperior is targeted at those who need to store large
+quantities of highly linked metadata and documents.
+
+Its Python/C environment and API make it particularly well suited for academic
+and scientific environments, which can embed it in a Python application as a
+library or extend it via plug-ins.
+
+In its current state, LAKEsuperior is aimed at developers and
+hands-on managers who are able to run a Python environment and are
+interested in evaluating this project.
+
 ## Installation
 
 ### Dependencies
@@ -72,9 +79,9 @@ dependencies and should be automatically installed.
    add this line to your virtualenv `activate` script)
 1. Configure the application
 1. Start your STOMP broker, e.g.: `coilmq &`
-1. Run `util/bootstrap.py` to initialize the binary and graph stores
+1. Run `./lsup-admin bootstrap` to initialize the binary and graph stores
 1. Run `./fcrepo` for a single-threaded server (Bjoern) or `./fcrepo-mt` for a
-   multi-threaded development server (GUnicorn).
+   multi-threaded server (GUnicorn).
 
 ### Production deployment
 
@@ -91,6 +98,8 @@ for a rudimentary road map and status.
 
 [Content Model](doc/notes/model.md)
 
+[Command-Line Reference](doc/notes/cli.md)
+
 [Storage Implementation](doc/notes/storage.md)
 
 [Performance Benchmarks](doc/notes/performance.md)

+ 1 - 1
doc/notes/TODO

@@ -92,7 +92,7 @@
   - [D] Query
 - [D] Align logger variable
 - [D] UIDs start with a slash
-- [ ] CLI prototype
+- [D] CLI prototype
 - [W] Update documentation
 
 # Alpha 8

+ 33 - 0
doc/notes/cli.md

@@ -0,0 +1,33 @@
+# LAKEsuperior Command Line Reference
+
+The LAKEsuperior command line tool is used for maintenance and administration
+purposes.
+
+The script is invoked from the main install directory. The tool is
+self-documented, so this is just a redundant overview:
+
+```
+$ ./lsup-admin
+Usage: lsup-admin [OPTIONS] COMMAND [ARGS]...
+
+Options:
+  --help  Show this message and exit.
+
+Commands:
+  bootstrap     Bootstrap binary and graph stores.
+  check_fixity  [STUB] Check fixity of a resource.
+  check_refint  [STUB] Check referential integrity.
+  cleanup       [STUB] Clean up orphan database items.
+  copy          [STUB] Copy (backup) repository data.
+  dump          [STUB] Dump repository to disk.
+  load          [STUB] Load serialized repository data.
+  stats         Print repository statistics.
+
+```
+
+All entries marked `[STUB]` are not yet implemented; however, the
+`lsup-admin <command> --help` command will print a description of what the
+command is meant to do. Please see the [TODO](TODO) document for a rough road
+map.
+
+All of the above commands are also available via, and based upon, the native
+Python API.
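
The `stats` command in this commit prints JSON by default and leaves the `--human` rendering unimplemented. A minimal sketch of what such a renderer could look like follows. It assumes the statistics arrive as a plain, possibly nested dict; the real shape of the `admin_api.stats()` return value is not shown in this commit, and both `format_stats` and the sample keys are hypothetical.

```python
import json


def format_stats(stat_data, human=False, indent=0):
    """Render a statistics mapping as JSON or as indented `key: value` lines.

    Hypothetical helper, not part of LAKEsuperior. It assumes `stat_data`
    is a plain dict whose values are scalars or further dicts.
    """
    if not human:
        # Default mirrors the CLI behavior: one JSON document.
        return json.dumps(stat_data)

    lines = []
    for key, value in stat_data.items():
        if isinstance(value, dict):
            # Nested section: print its name, then its content indented.
            lines.append('{}{}:'.format(' ' * indent, key))
            lines.append(format_stats(value, human=True, indent=indent + 2))
        else:
            lines.append('{}{}: {}'.format(' ' * indent, key, value))
    return '\n'.join(lines)


# Made-up sample data, for illustration only.
sample = {'store': {'triples': 1200, 'resources': 35}, 'version': 'alpha'}
print(format_stats(sample))
print(format_stats(sample, human=True))
```

Keeping the formatting in a pure function, separate from the `click` command, would let the same code serve the CLI and any other caller of the Python API.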

+ 75 - 15
lsup-admin

@@ -1,10 +1,12 @@
 #!/usr/bin/env python
 import click
+import json
 import os
 import sys
 
 import lakesuperior.env_setup
 
+from lakesuperior.api import admin as admin_api
 from lakesuperior.config_parser import config
 from lakesuperior.globals import AppGlobals
 from lakesuperior.env import env
@@ -30,12 +32,14 @@ def bootstrap():
     Additional scaffolding files may be parsed to create initial contents.
     '''
     click.echo(
-            'This operation will WIPE ALL YOUR DATA. Are you sure? '
-            '(Please type `yes` to continue) > ')
+            click.style(
+                'WARNING: This operation will WIPE ALL YOUR DATA.\n',
+                bold=True, fg='red')
+            + 'Are you sure? (Please type `yes` to continue) > ', nl=False)
     choice = input().lower()
     if choice != 'yes':
         click.echo('Aborting.')
-        sys.exit()
+        sys.exit(1)
 
     click.echo('Initializing graph store at {}'.format(rdfly.store.path))
     with TxnManager(env.app_globals.rdf_store, write=True) as txn:
@@ -50,9 +54,29 @@ def bootstrap():
 
 
 @click.command()
-def cleanup():
+@click.option(
+    '--human', '-h', is_flag=True, flag_value=True,
+    help='Print a human-readable string. By default, JSON is printed.')
+def stats(human=False):
+    '''
+    Print repository statistics.
+
+    @param human (bool) Whether to output the data in human-readable
+    format.
+    '''
+    stat_data = admin_api.stats()
+    if human:
+        click.echo(
+            'This option is not supported yet. Sorry.\nUse the `/admin/stats`'
+            ' endpoint in the web UI for a pretty printout.')
+    else:
+        click.echo(json.dumps(stat_data))
+
+
+@click.command()
+@click.argument('uid')
+def check_fixity(uid):
     '''
-    [STUB] Clean up orphaned database items.
+    [STUB] Check fixity of a resource.
     '''
     pass
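
`check_fixity` is still a stub here. In general, a fixity check recomputes a file's digest and compares it with a previously recorded one. The sketch below shows that idea with the standard library only. It assumes SHA-1 and a digest supplied by the caller; how LAKEsuperior records checksums is not shown in this commit, and `file_matches_digest` is a hypothetical name.

```python
import hashlib
import os
import tempfile


def file_matches_digest(path, expected_hex, algorithm='sha1',
                        chunk_size=65536):
    """Return True if the file at `path` hashes to `expected_hex`.

    Hypothetical helper, not LAKEsuperior code. The file is read in chunks
    so that large binaries do not have to fit in memory.
    """
    digest = hashlib.new(algorithm)
    with open(path, 'rb') as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()


# Demo on a throwaway file with made-up content.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'hello fixity')
    demo_path = tmp.name

recorded = hashlib.sha1(b'hello fixity').hexdigest()
ok_result = file_matches_digest(demo_path, recorded)
bad_result = file_matches_digest(demo_path, '0' * 40)
os.remove(demo_path)
print(ok_result, bad_result)
```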
 
@@ -61,40 +85,76 @@ def cleanup():
 def check_refint():
     '''
     [STUB] Check referential integrity.
+
+    This command scans the graph store to verify that all references to
+    resources within the repository actually point to existing resources.
+    For repositories set up with the `referencial_integrity` option
+    (the default), this is a pre-condition for a consistent data set.
+    '''
+    pass
+
+
+@click.command()
+def cleanup():
+    '''
+    [STUB] Clean up orphan database items.
     '''
     pass
 
 
 @click.command()
-def copy_repo():
+def copy():
     '''
-    [STUB] Copy (backup) repository.
+    [STUB] Copy (backup) repository data.
+
+    This is a low-level copy, which backs up the data directories containing
+    graph and binary data. It may not even be a necessary command, since
+    backing up the repository only requires copying the binary and metadata
+    folders.
     '''
     pass
 
 
 @click.command()
-def export_repo():
+@click.argument('src')
+@click.argument('dest')
+@click.option(
+    '--binaries', '-b', default='include', show_default=True,
+    help='If set to `include`, full binaries are included in the dump. If '
+    'set to `truncate`, binaries are created as zero-byte files in the proper '
+    'folder structure. If set to `skip`, binaries are not exported. Data '
+    'folders are not created.')
+def dump(src, dest, binaries='include'):
     '''
-    [STUB] High-level repository export.
+    [STUB] Dump repository to disk.
+
+    Dump a Fedora 4 repository to disk. The Fedora repo can be
+    LAKEsuperior or another compatible implementation.
     '''
     pass
 
 
 @click.command()
-def import_repo():
+@click.argument('src')
+@click.argument('dest')
+def load(src, dest):
     '''
-    [STUB] High-level repository import.
+    [STUB] Load serialized repository data.
+
+    Load serialized data from a filesystem location into a Fedora repository.
+    The Fedora repo can be LAKEsuperior or another compatible implementation.
     '''
     pass
 
 
 admin.add_command(bootstrap)
-admin.add_command(cleanup)
+admin.add_command(check_fixity)
 admin.add_command(check_refint)
-admin.add_command(copy_repo)
-admin.add_command(export_repo)
-admin.add_command(import_repo)
+admin.add_command(cleanup)
+admin.add_command(copy)
+admin.add_command(dump)
+admin.add_command(load)
+admin.add_command(stats)
 
 if __name__ == '__main__':
     admin()
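
The `check_refint` docstring above describes scanning the graph for references to repository resources that do not exist. A minimal sketch of that scan over in-memory data follows. It makes two assumptions that are not taken from LAKEsuperior: triples are plain `(subject, predicate, object)` string tuples, and any object starting with a given prefix is a reference to a repository resource. The function name, prefix and URIs are all hypothetical.

```python
def find_dangling_references(triples, internal_prefix):
    """Return triples whose object is an internal URI with no matching subject.

    Hypothetical sketch: a resource counts as existing if it appears as
    the subject of at least one triple.
    """
    existing = {s for s, _, _ in triples}
    return {
        (s, p, o) for s, p, o in triples
        if o.startswith(internal_prefix) and o not in existing
    }


# Made-up data: resource `b` is referenced but never described.
PREFIX = 'urn:example:res/'
sample_triples = [
    (PREFIX + 'a', 'urn:example:hasPart', PREFIX + 'b'),
    (PREFIX + 'a', 'urn:example:title', 'Resource A'),
    (PREFIX + 'c', 'urn:example:partOf', PREFIX + 'a'),
]
dangling = find_dangling_references(sample_triples, PREFIX)
print(dangling)
```

A real implementation would iterate over the graph store inside a read transaction rather than over a list, and would need a reliable way to tell URIs from literals.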