
LSUP_RDF

This project is a work in progress.

This C library was initially meant to replace the RDFLib dependency and the Cython code in Lakesuperior, in an effort to reduce code clutter and speed up RDF handling; it later became an independent RDF library project. Initial experiments with Redland as a more efficient replacement for RDFLib were not successful, due to the complexity of its API and the lack of an adequate persistent back end.

This library makes two graph back ends available: an in-memory one based on hash maps and a disk-based one based on LMDB, an extremely fast and compact embedded key-value store. Graphs can be created independently with either back end within the same program. Triples in the persistent back end are fully indexed and optimized for a balance of lookup speed, data compactness, and write performance (in order of importance).
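
As a rough illustration of how two back ends can coexist in one program, here is a minimal sketch of a store interface built on function pointers. None of these names (`KeyTriple`, `Store`, `StoreIface`, `store_new_mem`) come from LSUP_RDF; they are hypothetical. The toy memory back end uses a growable array without duplicate checks, not the hash maps mentioned above, and a disk-backed store would supply its own callbacks behind the same interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical triple of integer term keys; not LSUP_RDF's actual layout. */
typedef struct { size_t s, p, o; } KeyTriple;

typedef struct Store Store;

/* Hypothetical back end interface: each store type fills in these callbacks. */
typedef struct {
    const char *name;
    int    (*add)(Store *st, KeyTriple t);   /* 0 on success, -1 on failure */
    size_t (*size)(const Store *st);
    void   (*destroy)(Store *st);
} StoreIface;

struct Store {
    const StoreIface *iface;
    void *data;                              /* back end private state */
};

/* Toy in-memory back end: a growable array of triples. */
typedef struct { KeyTriple *items; size_t len, cap; } MemData;

static int mem_add(Store *st, KeyTriple t)
{
    assert(st != NULL);
    MemData *d = st->data;
    if (d->len == d->cap) {
        /* Double the capacity, starting from 8 slots. */
        size_t cap = d->cap ? d->cap * 2 : 8;
        KeyTriple *tmp = realloc(d->items, cap * sizeof *tmp);
        if (tmp == NULL) return -1;
        d->items = tmp;
        d->cap = cap;
    }
    d->items[d->len++] = t;
    return 0;
}

static size_t mem_size(const Store *st)
{
    const MemData *d = st->data;
    return d->len;
}

static void mem_destroy(Store *st)
{
    MemData *d = st->data;
    free(d->items);
    free(d);
    free(st);
}

static const StoreIface mem_iface = { "memory", mem_add, mem_size, mem_destroy };

/* Create an independent memory-backed store; returns NULL on allocation failure. */
Store *store_new_mem(void)
{
    Store *st = malloc(sizeof *st);
    MemData *d = calloc(1, sizeof *d);
    if (st == NULL || d == NULL) {
        free(st);
        free(d);
        return NULL;
    }
    st->iface = &mem_iface;
    st->data = d;
    return st;
}
```

Under these assumptions, calling code only ever goes through `st->iface`, so two stores created in the same program stay independent regardless of which back end sits behind each one.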

The API is not yet defined and at the moment this project is only intended for curious developers and researchers.

This is also my first stab at writing a C library (coming from Python) and an unpaid fun project, so don't be surprised if you find some gross stuff.

In Scope – Short Term

The short-term goal is to support usage in Lakesuperior and a workable set of features as a standalone library:

  • Handling of graphs, triples, terms
  • Memory- and disk-backed (persistent) graph storage
  • Contexts (disk-backed only)
  • Handling of blank nodes
  • Validation of literal and URI terms
  • Validation of RDF triples
  • Fast graph lookup using matching patterns
  • Graph boolean operations
  • Serialization and de-serialization to/from N-Triples and Turtle
  • Compile-time configuration of max graph size (efficiency vs. huge capacity)
  • Python bindings
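
To give an idea of what lookup by matching pattern means, the sketch below scans an array of triples and treats a `NULL` slot in the pattern as a wildcard. The `Triple` type and the `graph_lookup` function are hypothetical and not part of this library's API. A linear scan is used only for brevity; an indexed store would avoid visiting every triple.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical triple type: three NUL-terminated term strings. */
typedef struct {
    const char *s;
    const char *p;
    const char *o;
} Triple;

/* A term matches when the pattern slot is NULL (wildcard) or an equal string. */
static int term_matches(const char *pattern, const char *term)
{
    return pattern == NULL || strcmp(pattern, term) == 0;
}

/*
 * Count the triples matching the pattern (s, p, o).
 * If `out` is not NULL, pointers to the first `max_out` matches are stored in it.
 */
size_t graph_lookup(const Triple *triples, size_t n,
                    const char *s, const char *p, const char *o,
                    const Triple **out, size_t max_out)
{
    assert(triples != NULL || n == 0);
    size_t found = 0;
    for (size_t i = 0; i < n; i++) {
        if (term_matches(s, triples[i].s) &&
            term_matches(p, triples[i].p) &&
            term_matches(o, triples[i].o)) {
            if (out != NULL && found < max_out) {
                out[found] = &triples[i];
            }
            found++;
        }
    }
    return found;
}
```

With this sketch, a call such as `graph_lookup(g, n, "urn:a", NULL, NULL, NULL, 0)` counts every triple whose subject is `urn:a`, while passing `NULL` for all three slots counts the whole graph.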

Possibly In Scope – Long Term

  • SPARQL queries (We'll see... Will definitely need help)
  • Binary serialization and hashing of graphs
  • Lua bindings
  • C++ bindings (requires external contribution)

Likely Out of Scope

  • JSON-LD serialization