Divergences between Lakesuperior and FCREPO4
============================================

This is a (vastly incomplete) list of discrepancies between the current
FCREPO4 implementation and Lakesuperior. More will be added as more
clients use it.

Not yet implemented (but in the plans)
--------------------------------------

- Various headers handling (partial)
- AuthN/Z
- Fixity check
- Blank nodes

Potentially breaking changes
----------------------------

The following divergences may lead to incompatibilities with some
clients.

ETags
~~~~~

"Weak" ETags for LDP-RSs (i.e. RDF graphs) are not implemented. Given the
many possible interpretations of how any kind of checksum for an LDP resource
should be calculated (see `discussion
<https://groups.google.com/d/topic/fedora-tech/8pemDHNvbvc/discussion>`__), and
also given the relatively high computation cost necessary to determine whether
to send a ``304 Not Modified`` vs. a ``200 OK`` for an LDP-RS request, this
feature has been considered impractical to implement with the limited resources
available at the moment.

As a consequence, LDP-RS requests will never return a ``304`` and will never
include an ``ETag`` header. Clients should not rely on that header for
non-binary resources.

That said, calculating RDF checksums is still an academically interesting topic
and may be valuable for practical purposes such as metadata preservation.
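
Since LDP-RS responses carry no ``ETag``, caching client code may want to treat
the header as optional. The following is a minimal sketch, assuming response
headers are available as a plain dictionary keyed by ``ETag``;
``conditional_headers`` is a hypothetical helper, not part of any Lakesuperior
client.

```python
def conditional_headers(previous_headers):
    """Build revalidation headers from a previously received response.

    Assumes `previous_headers` is a plain dict with exact-case keys.
    """
    etag = previous_headers.get("ETag")
    # Without an ETag no conditional request is possible: fetch unconditionally.
    return {"If-None-Match": etag} if etag else {}
```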

Atomicity
~~~~~~~~~

FCREPO4 supports batch atomic operations whereby a transaction can be
opened and a number of operations (i.e. multiple R/W requests to the
repository) can be performed. The operations are persisted in the
repository only if and when the transaction is committed.

Lakesuperior only supports atomicity for a single HTTP request. I.e. a
single HTTP request that should result in multiple write operations to
the storage layer is only persisted if no exception is thrown.
Otherwise, the operation is rolled back in order to prevent resources
from being left in an inconsistent state.
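
The per-request, all-or-nothing behavior can be illustrated with a toy
in-memory store. This is only a sketch of the idea, not Lakesuperior's storage
code: ``ToyStore``, ``request_txn`` and ``handle_request`` are hypothetical
names, and the ``201`` and ``500`` status codes are placeholders chosen for the
example.

```python
from contextlib import contextmanager


class ToyStore:
    """Hypothetical in-memory stand-in for a storage layer."""

    def __init__(self):
        self.data = {}

    @contextmanager
    def request_txn(self):
        # Keep a copy so every write of this request can be undone together.
        snapshot = dict(self.data)
        try:
            yield self
        except Exception:
            self.data = snapshot  # roll back all writes of the request
            raise


def handle_request(store, writes, fail=False):
    """Apply several writes as one unit; placeholder status codes."""
    try:
        with store.request_txn() as txn:
            for key, value in writes:
                txn.data[key] = value
            if fail:
                raise ValueError("simulated failure mid-request")
        return 201
    except ValueError:
        return 500
```

A failed request leaves the store exactly as it was before the request
started, while writes from separate requests are never grouped together.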

Tombstone methods
~~~~~~~~~~~~~~~~~

If a client requests a tombstone resource in FCREPO4 with a method other
than DELETE, the server will return ``405 Method Not Allowed``
regardless of whether the tombstone exists or not.

Lakesuperior will return ``405`` only if the tombstone actually exists,
``404`` otherwise.
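
The two behaviors can be summarized as a small decision function. This is a
sketch for comparison only; the function name and the ``server`` labels are
made up for the example.

```python
def non_delete_tombstone_status(tombstone_exists, server):
    """Status for a non-DELETE request to a tombstone URI."""
    if server == "fcrepo4":
        return 405  # regardless of whether the tombstone exists
    if server == "lakesuperior":
        return 405 if tombstone_exists else 404
    raise ValueError(f"unknown server: {server}")
```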

Web UI
~~~~~~

FCREPO4 includes a web UI for simple CRUD operations.

Such a UI is not in the immediate Lakesuperior development plans.
However, a basic UI is available for read-only interaction: LDP resource
browsing, SPARQL query and other search facilities, and administrative
tools. Some of the latter *may* involve write operations, such as
clean-up tasks.

Automatic path segment generation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A ``POST`` request without a slug in FCREPO4 results in a pairtree
consisting of several intermediate nodes leading to the automatically
minted identifier. E.g.

::

    POST /rest

results in ``/rest/8c/9a/07/4e/8c9a074e-dda3-5256-ea30-eec2dd4fcf61``
being created.

The same request in Lakesuperior would create
``/rest/8c9a074e-dda3-5256-ea30-eec2dd4fcf61`` (obviously the
identifiers will be different).

This seems to break Hyrax at some point, but might have been fixed. This
needs to be verified further.
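
The difference between the two path layouts can be reproduced with a few lines
of Python. This sketch only mirrors the shape of the example paths above (four
two-character segments taken from the start of the identifier); the function
names are hypothetical and neither server's minting code is shown.

```python
def fcrepo4_style_path(uid, root="/rest", pairs=4, width=2):
    """Prefix the identifier with pairtree segments cut from its start."""
    segments = [uid[i * width:(i + 1) * width] for i in range(pairs)]
    return "/".join([root, *segments, uid])


def lakesuperior_style_path(uid, root="/rest"):
    """Place the identifier directly under the root."""
    return f"{root}/{uid}"
```

For the identifier used in the example above:

```python
uid = "8c9a074e-dda3-5256-ea30-eec2dd4fcf61"
print(fcrepo4_style_path(uid))       # /rest/8c/9a/07/4e/8c9a074e-dda3-5256-ea30-eec2dd4fcf61
print(lakesuperior_style_path(uid))  # /rest/8c9a074e-dda3-5256-ea30-eec2dd4fcf61
```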

Allow PUT requests with empty body on existing resources
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

FCREPO4 returns a ``409 Conflict`` if a PUT request with no payload is sent
to an existing resource.

Lakesuperior allows this operation, which results in deleting all the
user-provided properties in that resource.

If the original resource is an LDP-NR, however, the operation will raise a
``415 Unsupported Media Type`` because the resource will be treated as an empty
LDP-RS, which cannot replace an existing LDP-NR.
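
The three outcomes described above can be written as a lookup. This is a
sketch for comparison; the function name and labels are hypothetical, and the
success case is returned as a description because the status code of a
successful empty PUT is not stated here.

```python
def empty_put_outcome(server, existing_type):
    """Outcome of a PUT with no payload on an existing resource.

    `existing_type` is "LDP-RS" or "LDP-NR".
    """
    if server == "fcrepo4":
        return "409 Conflict"
    if existing_type == "LDP-NR":
        # The empty body is treated as an empty LDP-RS, which cannot replace an LDP-NR.
        return "415 Unsupported Media Type"
    return "accepted: user-provided properties removed"
```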

Non-standard client breaking changes
------------------------------------

The following changes may be incompatible with clients relying on some
FCREPO4 behavior not endorsed by LDP or other specifications.

Pairtrees
~~~~~~~~~

FCREPO4 generates “pairtree” resources if a resource is created in a
path whose segments are missing. E.g. when creating ``/a/b/c/d``, if
``/a/b`` and ``/a/b/c`` do not exist, FCREPO4 will create two Pairtree
resources. POSTing and PUTting into Pairtrees is not allowed. Also, a
containment triple is established between the closest LDPC and the
created resource, e.g. if ``/a`` exists, a
``</a> ldp:contains </a/b/c/d>`` triple is created.

Lakesuperior does not employ Pairtrees. In the example above
Lakesuperior would create a fully qualified LDPC for each missing
segment, which can be POSTed and PUT to. Containment triples are created
between each link in the path, i.e. ``</a> ldp:contains </a/b>``,
``</a/b> ldp:contains </a/b/c>`` etc. This may potentially break clients
relying on the direct containment model.
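
The two containment models can be compared with a short sketch that lists the
``ldp:contains`` triples each one would produce for a newly created path. The
function names are hypothetical, triples are plain tuples, and treating the
root ``/`` as a container that always exists is an assumption made for the
example.

```python
def _ancestors(path):
    """Return every ancestor path of `path`, shallowest first."""
    parts = path.strip("/").split("/")
    return ["/" + "/".join(parts[:depth]) for depth in range(1, len(parts))]


def fcrepo4_containment(path, existing):
    """Single triple from the closest existing container to the new resource."""
    for ancestor in reversed(_ancestors(path)):
        if ancestor in existing:
            return [(ancestor, "ldp:contains", path)]
    return [("/", "ldp:contains", path)]  # assumption: root always exists


def lakesuperior_containment(path, existing):
    """One triple per newly created link in the path."""
    chain = _ancestors(path) + [path]
    return [
        (parent, "ldp:contains", child)
        for parent, child in zip(chain, chain[1:])
        if child not in existing
    ]
```

With ``existing = {"/a"}`` and ``path = "/a/b/c/d"``, the first function
yields one triple linking ``/a`` to ``/a/b/c/d``, while the second yields
three triples, one per level.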

The rationale behind this change is that Pairtrees are the byproduct of
a limitation imposed by Modeshape and introduce complexity in the
software stack and confusion for the client. Lakesuperior aligns with
the more intuitive UNIX filesystem model, where each segment of a path
is a “folder” or container (except for the leaf nodes that can be either
folders or files). In any case, clients are discouraged from generating
deep paths in Lakesuperior without a specific purpose because these
resources create unnecessary data.

Non-mandatory, non-authoritative slug in version POST
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

FCREPO4 requires a ``Slug`` header to POST to ``fcr:versions`` to create
a new version.

Lakesuperior adheres to the more general FCREPO POST rule: if no slug
is provided, an automatic ID is generated instead. The ID is a UUID4.

Note that internally this ID is not called “label” but “uid” since it is
treated as a fully qualified identifier. The ``fcrepo:hasVersionLabel``
predicate, however ambiguous in this context, will be kept until the
adoption of Memento, which will change the retrieval mechanisms.

Another notable difference is that if a POST is issued to the same resource's
``fcr:versions`` location using a version ID that already exists, Lakesuperior
will just mint a random identifier rather than returning an error.
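
The slug rule described above amounts to the following. This is a sketch under
the assumption that the existing version IDs of a resource are available as a
set; ``mint_version_uid`` is a hypothetical name, not the actual Lakesuperior
function.

```python
import uuid


def mint_version_uid(slug, existing_uids):
    """Use the slug if it is provided and unused, otherwise mint a UUID4."""
    if slug and slug not in existing_uids:
        return slug
    # Missing or already taken slug: fall back to a random identifier.
    return str(uuid.uuid4())
```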

Deprecation track
-----------------

Lakesuperior offers some “legacy” options to replicate the FCREPO4
behavior, but encourages new development to use a different approach
for some types of interaction.

Endpoints
~~~~~~~~~

The FCREPO root endpoint is ``/rest``. The Lakesuperior root endpoint is
``/ldp``.

This should not pose a problem if a client does not have ``rest``
hard-coded in its code, but in any event, the ``/rest`` endpoint is
provided for backwards compatibility.

Future implementations of the Fedora API specs may employ a "versioned"
endpoint scheme that allows multiple Fedora API versions to be available to the
client, e.g. ``/ldp/fc4`` for the current LDP API version, ``/ldp/fc5`` for
Fedora version 5.x, etc.

Automatic LDP class assignment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since Lakesuperior rejects client-provided server-managed triples, and
since the LDP types are among them, the LDP container type is inferred
from the provided properties: if the ``ldp:hasMemberRelation`` and
``ldp:membershipResource`` properties are provided, the resource is a
Direct Container. If in addition to these the
``ldp:insertedContentRelation`` property is present, the resource is an
Indirect Container. If either of the first two is missing, the resource is
a Container.

Clients are encouraged to omit LDP types in PUT, POST and PATCH
requests.
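
The inference rule can be sketched as a function over the set of predicates
found in the payload. The function name and the returned labels are
illustrative only; predicates are written as prefixed names for brevity.

```python
def infer_container_type(predicates):
    """Infer the container type from the predicates present in a payload."""
    direct = {"ldp:hasMemberRelation", "ldp:membershipResource"}
    if not direct <= set(predicates):
        return "Container"  # either of the two required properties is missing
    if "ldp:insertedContentRelation" in predicates:
        return "IndirectContainer"
    return "DirectContainer"
```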

Lenient handling
~~~~~~~~~~~~~~~~

FCREPO4 requires server-managed triples to be expressly indicated in a
PUT request, unless the ``Prefer`` header is set to
``handling=lenient; received="minimal"``, in which case the RDF payload
must not have any server-managed triples.

Lakesuperior works under the assumption that clients should never provide
server-managed triples. It automatically handles PUT requests sent to
existing resources by returning a ``412`` if any server-managed triples are
included in the payload. This is the same as setting ``Prefer`` to
``handling=strict``, which is the default.

If ``Prefer`` is set to ``handling=lenient``, all server-managed triples
sent with the payload are ignored.

Clients using the ``Prefer`` header to control PUT behavior as
advertised by the specs should not notice any difference.
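
The strict and lenient modes can be sketched as a payload filter. The
predicate set below is a placeholder standing in for the real list of
server-managed terms, and ``accept_payload`` and ``ServerManagedTripleError``
are hypothetical names; triples are plain ``(subject, predicate, object)``
tuples.

```python
# Assumption: placeholder predicates, not the actual server-managed set.
SERVER_MANAGED = {"fcrepo:created", "fcrepo:lastModified"}


class ServerManagedTripleError(Exception):
    """Raised in strict mode; maps to a 412 response."""
    status = 412


def accept_payload(triples, handling="strict"):
    """Return the triples to store, or raise if strict mode is violated."""
    managed = [t for t in triples if t[1] in SERVER_MANAGED]
    if managed and handling == "strict":
        raise ServerManagedTripleError(
            f"{len(managed)} server-managed triple(s) in payload")
    # Lenient mode (or a clean payload): drop server-managed triples silently.
    return [t for t in triples if t[1] not in SERVER_MANAGED]
```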

Optional improvements
---------------------

The following are improvements in performance or usability that can only
be taken advantage of if client code is adjusted.

LDP-NR content and metadata
~~~~~~~~~~~~~~~~~~~~~~~~~~~

FCREPO4 relies on the ``/fcr:metadata`` identifier to retrieve RDF
metadata about an LDP-NR. Lakesuperior supports this as a legacy option,
but encourages the use of content negotiation to do the same while
offering explicit endpoints for RDF and non-RDF content retrieval.

Any request to an LDP-NR with an ``Accept`` header set to one of the
supported RDF serialization formats will yield the RDF metadata of the
resource instead of the binary contents.

The ``fcr:metadata`` URI returns the RDF metadata of an LDP-NR.

The ``fcr:content`` URI returns the non-RDF content.

The two options above return an HTTP error if requested for an LDP-RS.
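
The selection logic described above can be sketched as follows. The list of
RDF media types is an illustrative subset chosen for the example, not
Lakesuperior's actual list; the function names and return labels are
hypothetical, and ``q`` values in the ``Accept`` header are ignored for
simplicity.

```python
# Assumption: illustrative subset of RDF serialization media types.
RDF_MIMETYPES = {
    "text/turtle", "application/n-triples",
    "application/ld+json", "application/rdf+xml",
}


def wants_rdf(accept):
    """True if any media type in the Accept header is an RDF serialization."""
    media_types = (part.split(";")[0].strip().lower()
                   for part in accept.split(","))
    return any(mt in RDF_MIMETYPES for mt in media_types)


def select_representation(resource_type, accept="", suffix=None):
    """Return "rdf", "binary" or "error" for a GET request."""
    if suffix in ("fcr:metadata", "fcr:content"):
        if resource_type != "LDP-NR":
            return "error"  # explicit endpoints are only valid for LDP-NRs
        return "rdf" if suffix == "fcr:metadata" else "binary"
    if resource_type == "LDP-NR":
        return "rdf" if wants_rdf(accept) else "binary"
    return "rdf"
```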

“Include” and “Omit” options for children
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Lakesuperior offers an additional ``Prefer`` header option to exclude
all references to child resources (i.e. by removing all the
``ldp:contains`` triples) while leaving the other server-managed triples
when retrieving a resource:

::

    Prefer: return=representation; [include | omit]="http://fedora.info/definitions/v4/repository#Children"

The default behavior is to include all children URIs.
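
A client could build this header with a small helper like the one below. The
function name is hypothetical; the resulting dictionary can be passed as
request headers to whichever HTTP client library is in use.

```python
CHILDREN_URI = "http://fedora.info/definitions/v4/repository#Children"


def children_prefer_header(include_children=True):
    """Build a Prefer header that includes or omits ldp:contains triples."""
    keyword = "include" if include_children else "omit"
    return {"Prefer": f'return=representation; {keyword}="{CHILDREN_URI}"'}
```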

Soft-delete and purge
~~~~~~~~~~~~~~~~~~~~~

**NOTE**: The implementation of this section is incomplete and debated.

In FCREPO4 a deleted resource leaves a tombstone, deleting all traces of
the previous resource.

In Lakesuperior, a normal DELETE creates a new version snapshot of the
resource and puts a tombstone in its place. The resource versions are
still available in the ``fcr:versions`` location. The resource can be
“resurrected” by issuing a POST to its tombstone. This will result in a
``201``.

If a tombstone is deleted, the resource and its versions are completely
deleted (purged).

Moreover, setting the ``Prefer:no-tombstone`` header option on DELETE
allows deleting a resource and its versions directly without leaving a
tombstone.
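
The life cycle described above can be modeled with a toy in-memory
repository. This is a sketch of the state transitions only: ``ToyRepo`` and
its methods are hypothetical, and restoring the resurrected resource from its
latest snapshot is an assumption made for the example.

```python
class ToyRepo:
    """Hypothetical model of soft-delete, resurrect and purge."""

    def __init__(self):
        self.live = {}           # uid -> current content
        self.versions = {}       # uid -> list of snapshots
        self.tombstones = set()  # uids replaced by a tombstone

    def delete(self, uid, no_tombstone=False):
        if uid in self.tombstones or (uid in self.live and no_tombstone):
            self._purge(uid)  # deleting a tombstone, or a direct purge
        elif uid in self.live:
            # Soft delete: snapshot the resource and leave a tombstone.
            self.versions.setdefault(uid, []).append(self.live.pop(uid))
            self.tombstones.add(uid)
        else:
            raise KeyError(uid)

    def _purge(self, uid):
        self.live.pop(uid, None)
        self.versions.pop(uid, None)
        self.tombstones.discard(uid)

    def resurrect(self, uid):
        """Model a POST to the tombstone; returns the 201 status."""
        if uid not in self.tombstones:
            raise KeyError(uid)
        self.live[uid] = self.versions[uid][-1]  # assumption: latest snapshot
        self.tombstones.discard(uid)
        return 201
```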