# `lsup_rdf`

**This project is a work in progress.**

Embedded RDF (and maybe later, generic graph) store and manipulation library.

## Purpose

The goal of this library is to provide extremely efficient and compact handling of RDF data. At least a C API and Python bindings are planned.

This library can be thought of as SQLite or BerkeleyDB for graphs. It can be embedded directly in a program and store persistent data without the need to run a server.

Two graph back ends are available: an in-memory one based on hash maps and a disk-based one based on [LMDB](https://symas.com/lmdb/), an extremely fast and compact embedded key-value store. Graphs can be created independently with either back end within the same program.

Triples in the persistent back end are fully indexed and optimized for a balance of lookup speed, data compactness, and write performance (in order of importance).

This library was initially meant to replace the RDFLib dependency and Cython code in [Lakesuperior](https://notabug.org/scossu/lakesuperior) in an effort to reduce code clutter and speed up RDF handling; it is now a project for an independent RDF library, but unless the contributor base expands, it will remain focused on serving Lakesuperior.

## Development Status

**Pre-alpha.** The API is not yet defined and may change radically. The code may not compile, or may throw a fit when run. At the moment this project is only intended for curious developers and researchers.

This is also my first stab at writing a C library (coming from Python) and an unpaid fun project, so don't be surprised if you find some gross stuff.

## Road Map

### In Scope – Short Term

The short-term goal is to support usage in Lakesuperior and a workable set of features as a standalone library:

- Handling of graphs, triples, terms
- Memory- and disk-backed (persistent) graph storage
- Contexts (disk-backed only)
- Handling of blank nodes
- Validation of literal and URI terms
- Validation of RDF triples
- Fast graph lookup using matching patterns
- Graph boolean operations
- Serialization and de-serialization to/from N-Triples and N-Quads
- Serialization and de-serialization to/from Turtle and TriG
- Compile-time configuration of max graph size (efficiency vs. capacity)
- Python bindings
- Basic command line utilities

### Possibly In Scope – Long Term

- Binary serialization and hashing of graphs
- Binary protocol for synchronizing remote replicas
- Lua bindings

### Likely Out of Scope

(Unless provided and maintained by external contributors)

- C++ bindings
- JSON-LD de/serialization
- SPARQL queries (We'll see... Will definitely need help)