C and Python RDF library. ALPHA

Stefano Cossu e08da1a836 Rearrange python module sources. 4 years ago
cpython e08da1a836 Rearrange python module sources. 4 years ago
ext 55e4185353 Pin xxhash submodule to release branch; update xxhash to v0.8. 4 years ago
include e08da1a836 Rearrange python module sources. 4 years ago
src e08da1a836 Rearrange python module sources. 4 years ago
test 533a4fe77d Resolve all mem leaks. 4 years ago
.gitignore d8d2ce73ff Kinda working Python bindings. 4 years ago
.gitmodules 55e4185353 Pin xxhash submodule to release branch; update xxhash to v0.8. 4 years ago
CODE_OF_CONDUCT 5e1c8e5fa6 Fix Makefile; add docs. 4 years ago
LICENSE 5e1c8e5fa6 Fix Makefile; add docs. 4 years ago
Makefile a0c107e295 Some fixes: 4 years ago
README.md 4b192334a2 Update README. 4 years ago
TODO.md 5e1c8e5fa6 Fix Makefile; add docs. 4 years ago
profile.c a0c107e295 Some fixes: 4 years ago
setup.py d8d2ce73ff Kinda working Python bindings. 4 years ago
test.c 6411db4085 Redesign buffer and term API. 4 years ago

README.md

lsup_rdf

This project is work in progress.

Embedded RDF (and maybe later, generic graph) store and manipulation library.

Purpose

The goal of this library is to provide extremely efficient and compact handling of RDF data. At least a C API and Python bindings are planned.

This library can be thought of as SQLite or BerkeleyDB for graphs. It can be embedded directly in a program and store persistent data without the need to run a server.

Two graph back ends are available: an in-memory one based on hash maps and a disk-based one based on LMDB, an extremely fast and compact embedded key-value store. Graphs can be created independently with either back end within the same program. Triples in the persistent back end are fully indexed and optimized for a balance of lookup speed, data compactness, and write performance (in order of importance).
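To make the idea of a hash-map-based, indexed triple store more concrete, here is a minimal Python sketch. It is an illustration only: `MemStore`, `add`, and `lookup` are hypothetical names, not part of the lsup_rdf API, and the actual back ends may lay out their indices quite differently.

```python
from collections import defaultdict


class MemStore:
    """Toy in-memory triple store (hypothetical; not the lsup_rdf API)."""

    def __init__(self):
        self.triples = set()
        # One hash-map index per term position.
        self.by_s = defaultdict(set)
        self.by_p = defaultdict(set)
        self.by_o = defaultdict(set)

    def add(self, s, p, o):
        """Store a triple and register it in all three indices."""
        t = (s, p, o)
        self.triples.add(t)
        self.by_s[s].add(t)
        self.by_p[p].add(t)
        self.by_o[o].add(t)

    def lookup(self, s=None, p=None, o=None):
        """Return the triples matching a pattern; None acts as a wildcard."""
        candidates = [
            idx.get(term, set())
            for idx, term in ((self.by_s, s), (self.by_p, p), (self.by_o, o))
            if term is not None
        ]
        if not candidates:
            # Fully unbound pattern: every triple matches.
            return set(self.triples)
        # Intersect the candidate sets, starting from the smallest one.
        candidates.sort(key=len)
        result = set(candidates[0])
        for other in candidates[1:]:
            result &= other
        return result
```

Keeping one index per term position trades extra memory and write work for lookups that never have to scan the whole graph, which is the kind of balance the paragraph above describes.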

This library was initially meant to replace the RDFLib dependency and Cython code in Lakesuperior in an effort to reduce code clutter and speed up RDF handling. It is now a project for an independent RDF library, but unless the contributor base expands, it will remain focused on serving Lakesuperior.

Development Status

Pre-alpha. The API is not yet defined and may change radically. The code may not compile, or may throw a fit when run. At the moment this project is only intended for curious developers and researchers.

This is also my first stab at writing a C library (coming from Python) and an unpaid fun project, so don't be surprised if you find some gross stuff.

Road Map

In Scope – Short Term

The short-term goal is to support usage in Lakesuperior and a workable set of features as a standalone library:

  • Handling of graphs, triples, terms
  • Memory- and disk-backed (persistent) graph storage
  • Contexts (disk-backed only)
  • Handling of blank nodes
  • Validation of literal and URI terms
  • Validation of RDF triples
  • Fast graph lookup using matching patterns
  • Graph boolean operations
  • Serialization and de-serialization to/from N-Triples and N-Quads
  • Serialization and de-serialization to/from Turtle and TriG
  • Compile-time configuration of max graph size (efficiency vs. capacity)
  • Python bindings
  • Basic command line utilities
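As a rough illustration of two of the planned features, graph boolean operations and N-Triples serialization, the following Python sketch models a graph as a plain set of `(s, p, o)` tuples. Everything here is an assumption made for the example: `Literal`, `nt_term`, and `to_ntriples` are hypothetical helpers, not the lsup_rdf API, and only plain string literals and IRIs are handled (no blank nodes, datatypes, or language tags).

```python
from typing import NamedTuple


class Literal(NamedTuple):
    """Plain string literal (hypothetical helper for this sketch)."""
    value: str


def nt_term(term):
    """Render an IRI (str) or a Literal as an N-Triples term."""
    if isinstance(term, Literal):
        escaped = (
            term.value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        return f'"{escaped}"'
    return f"<{term}>"


def to_ntriples(graph):
    """Serialize a set of (s, p, o) tuples, one sorted line per triple."""
    lines = (" ".join(nt_term(t) for t in triple) + " ." for triple in graph)
    return "".join(line + "\n" for line in sorted(lines))


g1 = {
    ("urn:s:1", "urn:p:label", Literal("first")),
    ("urn:s:1", "urn:p:next", "urn:s:2"),
}
g2 = {
    ("urn:s:1", "urn:p:next", "urn:s:2"),
    ("urn:s:2", "urn:p:label", Literal('say "hi"')),
}

union = g1 | g2          # triples in either graph
intersection = g1 & g2   # triples in both graphs
difference = g1 - g2     # triples only in g1
```

Because a graph without blank nodes is just a set of triples, the boolean operations map directly onto set operators. A real implementation would also need to deal with blank node identity, which this sketch leaves out.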

Possibly In Scope – Long Term

  • Binary serialization and hashing of graphs
  • Binary protocol for synchronizing remote replicas
  • Lua bindings

Likely Out of Scope

(Unless provided and maintained by external contributors)

  • C++ bindings
  • JSON-LD de/serialization
  • SPARQL queries (We'll see... Will definitely need help)