Divergences between Lakesuperior and FCREPO4
============================================

This is a (vastly incomplete) list of discrepancies between the current
FCREPO4 implementation and Lakesuperior. More will be added as more
clients use it.

Not yet implemented (but in the plans)
--------------------------------------

- Handling of various headers (partial)
- AuthN and WebAC-based authZ
- Fixity check
- Blank nodes (at least partly working, but untested)
- Multiple byte ranges for the ``Range`` request header

Potentially breaking changes
----------------------------

The following divergences may lead to incompatibilities with some
clients.

ETags
~~~~~

"Weak" ETags for LDP-RSs (i.e. RDF graphs) are not implemented. Given the
many possible interpretations of how any kind of checksum for an LDP resource
should be calculated (see `discussion
<https://groups.google.com/d/topic/fedora-tech/8pemDHNvbvc/discussion>`__), and
also given the relatively high computation cost necessary to determine whether
to send a ``304 Not Modified`` vs. a ``200 OK`` for an LDP-RS request, this
feature has been considered impractical to implement with the limited resources
available at the moment.

As a consequence, LDP-RS requests will never return a ``304`` and will never
include an ``ETag`` header. Clients should not rely on that header for
non-binary resources.

That said, calculating RDF checksums is still an academically interesting topic
and may be valuable for practical purposes such as metadata preservation.

Atomicity
~~~~~~~~~

FCREPO4 supports batch atomic operations, whereby a transaction can be
opened and a number of operations (i.e. multiple R/W requests to the
repository) can be performed. The operations are persisted in the
repository only if and when the transaction is committed.

Lakesuperior only supports atomicity for a single HTTP request. I.e. a
single HTTP request that should result in multiple write operations to
the storage layer is only persisted if no exception is thrown.
Otherwise, the operation is rolled back in order to prevent resources
from being left in an inconsistent state.

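
The all-or-nothing behavior of a single request can be sketched as follows.
This is only an illustration of the principle, not the Lakesuperior
implementation: a plain in-memory ``dict`` stands in for the storage layer, and
``single_request_txn`` is a hypothetical helper name.

.. code-block:: python

    import copy
    from contextlib import contextmanager


    @contextmanager
    def single_request_txn(store):
        """Restore the store to its initial state if the request raises."""
        snapshot = copy.deepcopy(store)
        try:
            yield store
        except Exception:
            # Undo every write performed during this request.
            store.clear()
            store.update(snapshot)
            raise


    store = {"/a": "v1"}
    try:
        with single_request_txn(store) as txn:
            txn["/a"] = "v2"   # first write operation
            txn["/b"] = "new"  # second write operation
            raise RuntimeError("a later write operation failed")
    except RuntimeError:
        pass  # the request fails as a whole; no partial state is kept

After the failed request the store holds only the original ``/a`` entry, i.e.
neither of the two writes has been persisted.
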
Tombstone methods
~~~~~~~~~~~~~~~~~

If a client requests a tombstone resource in FCREPO4 with a method other
than DELETE, the server will return ``405 Method Not Allowed``
regardless of whether the tombstone exists or not.

Lakesuperior will return ``405`` only if the tombstone actually exists,
``404`` otherwise.

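
For client authors who branch on these status codes, the difference can be
summarized in a small lookup. The function name and the ``server`` labels are
made up for this sketch; only the status codes come from the description above.

.. code-block:: python

    def non_delete_tombstone_status(server, tombstone_exists):
        """Expected status for a non-DELETE request to a tombstone URI."""
        if server == "fcrepo4":
            # FCREPO4 answers 405 whether or not the tombstone exists.
            return 405
        if server == "lakesuperior":
            # Lakesuperior answers 405 only for an actual tombstone.
            return 405 if tombstone_exists else 404
        raise ValueError("unknown server: {}".format(server))
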
``Limit`` Header
~~~~~~~~~~~~~~~~

Lakesuperior does not support the ``Limit`` header, which in FCREPO can be used
to limit the number of "child" resources displayed for a container graph. This
header seems to have a mostly cosmetic function in FCREPO, compensating for
performance limitations (displaying a page with many thousands of children in
the UI can take minutes). Since Lakesuperior already offers options in the
``Prefer`` header to not return any children, this option is not implemented.

Web UI
~~~~~~

FCREPO4 includes a web UI for simple CRUD operations.

Such a UI is not in the immediate Lakesuperior development plans.
However, a basic UI is available for read-only interaction: LDP resource
browsing, SPARQL query and other search facilities, and administrative
tools. Some of the latter *may* involve write operations, such as
clean-up tasks.

Automatic path segment generation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A ``POST`` request without a slug in FCREPO4 results in a pairtree
consisting of several intermediate nodes leading to the automatically
minted identifier. E.g.

::

    POST /rest

results in ``/rest/8c/9a/07/4e/8c9a074e-dda3-5256-ea30-eec2dd4fcf61``
being created.

The same request in Lakesuperior would create
``/rest/8c9a074e-dda3-5256-ea30-eec2dd4fcf61`` (obviously the
identifiers will be different).

This seemed to break Hyrax at some point, but might have been fixed. This
needs to be verified further.

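
The two path layouts can be reproduced with a few lines of Python. The function
names are hypothetical, and the segment count and width are inferred from the
single example above rather than from the FCREPO4 source.

.. code-block:: python

    def fcrepo4_style_path(root, uid, segments=4, width=2):
        """Prefix the identifier with pairtree segments cut from its start."""
        prefix = [uid[i * width:(i + 1) * width] for i in range(segments)]
        return "/".join([root.rstrip("/")] + prefix + [uid])


    def lakesuperior_style_path(root, uid):
        """Place the identifier directly under the root."""
        return "{}/{}".format(root.rstrip("/"), uid)


    uid = "8c9a074e-dda3-5256-ea30-eec2dd4fcf61"
    print(fcrepo4_style_path("/rest", uid))
    print(lakesuperior_style_path("/rest", uid))

A client that derives resource locations from identifiers, instead of reading
the ``Location`` response header, would need the second form when talking to
Lakesuperior.
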
Allow PUT requests with empty body on existing resources
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

FCREPO4 returns a ``409 Conflict`` if a PUT request with no payload is sent
to an existing resource.

Lakesuperior allows this operation, which results in deleting all the
user-provided properties in that resource.

If the original resource is an LDP-NR, however, the operation will raise a
``415 Unsupported Media Type`` because the resource will be treated as an empty
LDP-RS, which cannot replace an existing LDP-NR.

Non-standard client breaking changes
------------------------------------

The following changes may be incompatible with clients relying on some
FCREPO4 behavior not endorsed by LDP or other specifications.

Pairtrees
~~~~~~~~~

FCREPO4 generates “pairtree” resources if a resource is created in a
path whose segments are missing. E.g. when creating ``/a/b/c/d``, if
``/a/b`` and ``/a/b/c`` do not exist, FCREPO4 will create two Pairtree
resources. POSTing and PUTting into Pairtrees is not allowed. Also, a
containment triple is established between the closest LDPC and the
created resource, e.g. if ``/a`` exists, a
``</a> ldp:contains </a/b/c/d>`` triple is created.

Lakesuperior does not employ Pairtrees. In the example above
Lakesuperior would create a fully qualified LDPC for each missing
segment, which can be POSTed and PUT to. Containment triples are created
between each link in the path, i.e. ``</a> ldp:contains </a/b>``,
``</a/b> ldp:contains </a/b/c>`` etc. This may potentially break clients
relying on the direct containment model.

The rationale behind this change is that Pairtrees are the byproduct of
a limitation imposed by Modeshape and introduce complexity in the
software stack and confusion for the client. Lakesuperior aligns with
the more intuitive UNIX filesystem model, where each segment of a path
is a “folder” or container (except for the leaf nodes that can be either
folders or files). In any case, clients are discouraged from generating
deep paths in Lakesuperior without a specific purpose because these
resources create unnecessary data.

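
The two containment models can be compared with a small sketch. It is a
simplified model written for this document, not code from either repository:
resources are plain path strings, triples are tuples, containment by the
repository root is ignored, and all function names are hypothetical.

.. code-block:: python

    def path_chain(path):
        """Return every ancestor of a path plus the path itself, top down."""
        parts = path.strip("/").split("/")
        return ["/" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)]


    def fcrepo4_containment(path, existing):
        """One triple from the closest existing container to the new resource."""
        for ancestor in reversed(path_chain(path)[:-1]):
            if ancestor in existing:
                return [(ancestor, "ldp:contains", path)]
        return []


    def lakesuperior_containment(path, existing):
        """One triple per newly created link in the path."""
        chain = path_chain(path)
        return [
            (parent, "ldp:contains", child)
            for parent, child in zip(chain, chain[1:])
            if child not in existing
        ]


    print(fcrepo4_containment("/a/b/c/d", {"/a"}))
    print(lakesuperior_containment("/a/b/c/d", {"/a"}))

With ``/a`` as the only existing resource, the first call yields the single
``</a> ldp:contains </a/b/c/d>`` triple, while the second yields one triple for
each of the three links below ``/a``.
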
Non-mandatory, non-authoritative slug in version POST
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

FCREPO4 requires a ``Slug`` header to POST to ``fcr:versions`` to create
a new version.

Lakesuperior adheres to the more general FCREPO POST rule: if no slug
is provided, an automatic ID is generated instead. The ID is a UUID4.

Note that internally this ID is not called “label” but “uid” since it is
treated as a fully qualified identifier. The ``fcrepo:hasVersionLabel``
predicate, however ambiguous in this context, will be kept until the
adoption of Memento, which will change the retrieval mechanisms.

Another notable difference is that if a POST is issued to the same resource's
``fcr:versions`` location using a version ID that already exists, Lakesuperior
will just mint a random identifier rather than returning an error.

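
The resulting rule for choosing a version ID can be sketched as below.
``mint_version_uid`` is a hypothetical name, and the set of existing IDs is
passed in explicitly to keep the example self-contained.

.. code-block:: python

    import uuid


    def mint_version_uid(slug, existing_uids):
        """Use the slug if it is given and still free, otherwise mint a UUID4."""
        if slug and slug not in existing_uids:
            return slug
        return str(uuid.uuid4())


    print(mint_version_uid("v1", set()))    # slug is free: used as is
    print(mint_version_uid("v1", {"v1"}))   # slug is taken: random UUID4
    print(mint_version_uid(None, set()))    # no slug: random UUID4

Clients that need to know the final version ID should therefore not assume that
the slug they sent was honored.
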
Deprecation track
-----------------

Lakesuperior offers some “legacy” options to replicate the FCREPO4
behavior, but encourages new development to use a different approach
for some types of interaction.

Endpoints
~~~~~~~~~

The FCREPO root endpoint is ``/rest``. The Lakesuperior root endpoint is
``/ldp``.

This should not pose a problem if a client does not have ``rest``
hard-coded in its code, but in any event, the ``/rest`` endpoint is
provided for backwards compatibility.

Future implementations of the Fedora API specs may employ a "versioned"
endpoint scheme that allows multiple Fedora API versions to be available to the
client, e.g. ``/ldp/fc4`` for the current LDP API version, ``/ldp/fc5`` for
Fedora version 5.x, etc.

Automatic LDP class assignment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since Lakesuperior rejects client-provided server-managed triples, and
since the LDP types are among them, the LDP container type is inferred
from the provided properties: if the ``ldp:hasMemberRelation`` and
``ldp:membershipResource`` properties are provided, the resource is a
Direct Container. If in addition to these the
``ldp:insertedContentRelation`` property is present, the resource is an
Indirect Container. If either of the first two is missing, the resource is
a Container.

Clients are encouraged to omit LDP types in PUT, POST and PATCH
requests.

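
The inference rule above amounts to a short decision function. This is a sketch
of the rule as worded in this section, with predicates given as prefixed
strings; it is not taken from the Lakesuperior code base and the function name
is hypothetical.

.. code-block:: python

    DIRECT_PREDICATES = {"ldp:hasMemberRelation", "ldp:membershipResource"}
    INDIRECT_PREDICATE = "ldp:insertedContentRelation"


    def infer_container_type(predicates):
        """Infer the container type from the predicates found in the payload."""
        provided = set(predicates)
        if DIRECT_PREDICATES <= provided:
            if INDIRECT_PREDICATE in provided:
                return "Indirect Container"
            return "Direct Container"
        # Either of the two membership predicates is missing.
        return "Container"

Note that ``ldp:insertedContentRelation`` alone does not change the outcome:
without both membership predicates the resource remains a Container.
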
Lenient handling
~~~~~~~~~~~~~~~~

FCREPO4 requires server-managed triples to be expressly indicated in a
PUT request, unless the ``Prefer`` header is set to
``handling=lenient; received="minimal"``, in which case the RDF payload
must not have any server-managed triples.

Lakesuperior works under the assumption that clients should never provide
server-managed triples. It automatically handles PUT requests sent to
existing resources by returning a ``412`` if any server-managed triples are
included in the payload. This is the same as setting ``Prefer`` to
``handling=strict``, which is the default.

If ``Prefer`` is set to ``handling=lenient``, all server-managed triples
sent with the payload are ignored.

Clients using the ``Prefer`` header to control PUT behavior as
advertised by the specs should not notice any difference.

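
The strict and lenient modes can be sketched as a payload filter. The code
below is an illustration under stated assumptions: triples are plain tuples,
``SERVER_MANAGED`` lists only two predicates as a stand-in for the full set, and
the function and exception names are hypothetical.

.. code-block:: python

    # Illustrative subset only; the actual list of server-managed
    # predicates is longer.
    SERVER_MANAGED = {"fcrepo:created", "fcrepo:lastModified"}


    class ServerManagedTripleError(Exception):
        """Stands in for a 412 response."""
        status = 412


    def sanitize_payload(triples, handling="strict"):
        """Reject (strict) or drop (lenient) server-managed triples."""
        managed = [t for t in triples if t[1] in SERVER_MANAGED]
        if handling == "lenient":
            return [t for t in triples if t[1] not in SERVER_MANAGED]
        if managed:
            raise ServerManagedTripleError(
                "{} server-managed triple(s) in payload".format(len(managed))
            )
        return list(triples)

With ``handling="strict"`` a payload containing ``fcrepo:created`` raises the
error, while ``handling="lenient"`` silently returns the payload without that
triple.
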
Optional improvements
---------------------

The following are improvements in performance or usability that can only
be taken advantage of if client code is adjusted.

LDP-NR content and metadata
~~~~~~~~~~~~~~~~~~~~~~~~~~~

FCREPO4 relies on the ``/fcr:metadata`` identifier to retrieve RDF
metadata about an LDP-NR. Lakesuperior supports this as a legacy option,
but encourages the use of content negotiation to do the same while
offering explicit endpoints for RDF and non-RDF content retrieval.

Any request to an LDP-NR with an ``Accept`` header set to one of the
supported RDF serialization formats will yield the RDF metadata of the
resource instead of the binary contents.

The ``fcr:metadata`` URI returns the RDF metadata of an LDP-NR.

The ``fcr:content`` URI returns the non-RDF content.

The two options above return an HTTP error if requested for an LDP-RS.

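
The three ways of addressing an LDP-NR can be written as request builders using
only the Python standard library. The base URL and the resource name are
placeholders, and ``text/turtle`` is assumed to be among the supported RDF
serializations. The requests are only constructed here, not sent.

.. code-block:: python

    from urllib.request import Request

    BASE = "http://localhost:8000/ldp"  # placeholder host and port


    def metadata_request(uid, rdf_mimetype="text/turtle"):
        """RDF metadata via content negotiation on the LDP-NR URI itself."""
        return Request("{}/{}".format(BASE, uid), headers={"Accept": rdf_mimetype})


    def legacy_metadata_request(uid):
        """RDF metadata via the explicit fcr:metadata URI."""
        return Request("{}/{}/fcr:metadata".format(BASE, uid))


    def content_request(uid):
        """Binary content via the explicit fcr:content URI."""
        return Request("{}/{}/fcr:content".format(BASE, uid))


    req = metadata_request("my_binary")
    print(req.full_url, req.get_header("Accept"))

Passing any of these objects to ``urllib.request.urlopen`` would perform the
actual request against a running server.
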
“Include” and “Omit” options for children
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Lakesuperior offers an additional ``Prefer`` header option to exclude
all references to child resources (i.e. by removing all the
``ldp:contains`` triples) while leaving the other server-managed triples
when retrieving a resource:

::

    Prefer: return=representation; [include | omit]="http://fedora.info/definitions/v4/repository#Children"

The default behavior is to include all children URIs.

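
A client can build this header value with a small helper such as the one
below. The function name is hypothetical; the header syntax and the URI are
those shown above.

.. code-block:: python

    CHILDREN_URI = "http://fedora.info/definitions/v4/repository#Children"


    def children_prefer_header(include=True):
        """Build the Prefer header that includes or omits child references."""
        option = "include" if include else "omit"
        value = 'return=representation; {}="{}"'.format(option, CHILDREN_URI)
        return {"Prefer": value}


    print(children_prefer_header(include=False))

The returned dictionary can be merged into the headers of a ``GET`` request
issued with any HTTP client.
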
Soft-delete and purge
~~~~~~~~~~~~~~~~~~~~~

**NOTE**: The implementation of this section is incomplete and debated.

In FCREPO4 a deleted resource leaves a tombstone, deleting all traces of
the previous resource.

In Lakesuperior, a normal DELETE creates a new version snapshot of the
resource and puts a tombstone in its place. The resource versions are
still available in the ``fcr:versions`` location. The resource can be
“resurrected” by issuing a POST to its tombstone. This will result in a
``201``.

If a tombstone is deleted, the resource and its versions are completely
deleted (purged).

Moreover, setting the ``Prefer:no-tombstone`` header option on DELETE
allows deleting a resource and its versions directly without leaving a
tombstone.

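
The life cycle described above can be modeled as a small state machine. Since
this part of Lakesuperior is incomplete and debated, the sketch only mirrors the
behavior as worded here; the class and method names are hypothetical and the
stored "snapshot" is a placeholder string.

.. code-block:: python

    class SoftDeleteModel:
        """Toy model of live resources, tombstones and version snapshots."""

        def __init__(self):
            self.live = set()
            self.tombstones = set()
            self.versions = {}

        def create(self, uid):
            self.live.add(uid)
            self.versions.setdefault(uid, [])

        def delete(self, uid, no_tombstone=False):
            if uid in self.tombstones:
                # Deleting a tombstone purges the resource and its versions.
                self.tombstones.discard(uid)
                self.versions.pop(uid, None)
            elif uid in self.live:
                self.live.discard(uid)
                if no_tombstone:
                    # Prefer:no-tombstone purges directly.
                    self.versions.pop(uid, None)
                else:
                    # Normal DELETE: snapshot, then leave a tombstone.
                    self.versions[uid].append("snapshot")
                    self.tombstones.add(uid)
            else:
                raise KeyError(uid)

        def resurrect(self, uid):
            """POST to a tombstone: the resource becomes live again."""
            if uid not in self.tombstones:
                raise KeyError(uid)
            self.tombstones.discard(uid)
            self.live.add(uid)
            return 201


    repo = SoftDeleteModel()
    repo.create("/a")
    repo.delete("/a")            # soft delete: tombstone plus snapshot
    print(repo.resurrect("/a"))  # 201
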