Migration, Backup & Restore
===========================

All Lakesuperior data is by default fully contained in a folder. This
means that only the data, configuration and code folders are needed for
it to run. No external services such as Postgres or Redis are required.
Data and configuration folders can be moved around as needed.

Migration Tool
--------------

Migration is the process of importing and converting data from a
different Fedora or LDP implementation into a new Lakesuperior instance.
This process uses the HTTP/LDP API of the original repository. A
command-line utility is available as part of the ``lsup-admin`` suite to
assist in such operation.

A repository can be migrated with a one-line command such as::

    lsup-admin migrate http://source-repo.edu/rest /local/dest/folder

For more options, enter::

    lsup-admin migrate --help

The script crawls through the resources and follows the outbound
links within them. In order to do this, resources are added as raw
triples, i.e. no consistency checks are made.

The script creates a full dataset in the specified destination
folder, complete with a default configuration that allows the
Lakesuperior server to be started immediately after the migration is
complete.

Two approaches to migration are possible:

1. By providing a starting point on the source repository. E.g. if the
   repository you want to migrate is at ``http://repo.edu/rest/prod``
   you can add the ``-s /prod`` option to the script to avoid migrating
   irrelevant branches. Note that the script will still reach outside of
   the starting point if resources are referencing other resources
   outside of it.
2. By providing a file containing a list of resources to migrate. This
   is useful if a source repository cannot produce a full list (e.g. the
   root node has more children than the server can handle) but a list of
   individual resources is available via an external index (Solr,
   triplestore, etc.). The resources can be indicated by their fully
   qualified URIs or paths relative to the repository root. (*TODO
   latter option needs testing*)

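
As a rough sketch of preparing such a list, the snippet below turns a
mixed export of relative paths and full URIs into a deduplicated list of
fully qualified URIs. The ``normalize_list`` helper, the
one-resource-per-line layout and the file names are assumptions made for
illustration and are not part of ``lsup-admin``; check
``lsup-admin migrate --help`` for the option that accepts the list file
and the format it expects::

    # Hypothetical helper, not part of lsup-admin.
    # Usage: normalize_list REPO_ROOT INPUT_FILE > OUTPUT_FILE
    normalize_list() {
        repo_root="${1%/}"   # strip one trailing slash from the root URI
        input_file="$2"
        awk -v root="$repo_root" '
            /^[[:space:]]*$/ { next }               # skip blank lines
            /^https?:\/\//   { print; next }        # already fully qualified
            { sub(/^\/+/, ""); print root "/" $0 }  # relative path: prefix root
        ' "$input_file" | sort -u                   # sort and deduplicate
    }

    # Example with hypothetical file names:
    # normalize_list http://repo.edu/rest solr_export.txt > resources.txt

Normalizing to one form before deduplicating avoids listing the same
resource twice, once as a path and once as a URI.
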
A consistency check can (and should) be run after the migration::

    lsup-admin check_refint

This is critical to ensure that all resources in the repository only
reference other repository resources that actually exist.

This feature was added in alpha9.

*TODO: The output of ``check_refint`` is somewhat crude. Improvements can be
made to output integrity violations to a machine-readable log and integrate
with the migration tool.*

Backup And Restore
------------------

A backup of a Lakesuperior repository consists of copying the RDF and
non-RDF data folders. These folders are indicated in the application
configuration. The default commands provided by your OS (``cp``,
``rsync``, ``tar`` etc. for Unix) are all that is needed.
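
A minimal sketch of such a backup with ``tar`` follows. The
``backup_repo`` and ``restore_repo`` helpers and the paths in the usage
comment are hypothetical; substitute the RDF and non-RDF data folders
named in your application configuration. The sketch also assumes that
nothing is writing to the folders while they are being archived::

    # Hypothetical helper: archive the RDF and non-RDF data folders together.
    # Pass absolute paths; the two folders must have different base names.
    # Usage: backup_repo ARCHIVE RDF_DIR NONRDF_DIR
    backup_repo() {
        archive="$1"; rdf_dir="$2"; nonrdf_dir="$3"
        # -C changes directory first, so the archive stores only the
        # folder names rather than the absolute paths they came from.
        tar -czf "$archive" \
            -C "$(dirname "$rdf_dir")" "$(basename "$rdf_dir")" \
            -C "$(dirname "$nonrdf_dir")" "$(basename "$nonrdf_dir")"
    }

    # Hypothetical helper: unpack both folders under a target directory.
    # Usage: restore_repo ARCHIVE TARGET_DIR
    restore_repo() {
        mkdir -p "$2" && tar -xzf "$1" -C "$2"
    }

    # Example with hypothetical paths:
    # backup_repo /backups/lsup.tar.gz /data/lsup/rdf /data/lsup/nonrdf
    # restore_repo /backups/lsup.tar.gz /data/lsup_restored

After a restore, point the application configuration at the restored
folders before starting the server.
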