C and Python RDF library. ALPHA
Latest commit: Stefano Cossu, e30c194960, "Fix Python builder; bump version; wrap up release." (2 years ago)
This project is a work in progress.
Embedded RDF (and maybe later, generic graph) store and manipulation library.
The goal of this library is to provide efficient and compact handling of RDF data. At least a complete C API and Python bindings are planned.
This library can be thought of as SQLite or BerkeleyDB for graphs. It can be embedded directly in a program and store persistent data without the need to run a server. In addition, `lsup_rdf` can perform in-memory graph operations such as validation, de/serialization, boolean operations, lookup, etc.
Two graph back ends are available: an in-memory one based on hash maps and a disk-based one based on LMDB, an extremely fast and compact embedded key-value store. Graphs can be created independently with either back end within the same program. Triples in the persistent back end are fully indexed and optimized for a balance of lookup speed, data compactness, and write performance (in order of importance).
This library was initially meant to replace the RDFLib dependency and Cython code in Lakesuperior in an effort to reduce code clutter and speed up RDF handling. It is now a project for an independent RDF library, but unless the contributor base expands, it will remain focused on serving Lakesuperior.
Alpha. The API structure is not yet stable and may change radically. The code may not compile, or may throw a fit when run. Testing is minimal. At the moment this project is only intended for curious developers and researchers.
This is also my first stab at writing a C library (coming from Python) and an unpaid fun project, so don't be surprised if you find some gross stuff.
The short-term goal is to support usage in Lakesuperior and a workable set of features as a standalone library:
(Unless provided and maintained by external contributors)
gcc so far.
The default `make` command compiles the library. Enter `make help` to get an overview of the other available commands.
`make install` installs libraries and headers in the directories set by the environment variable `$PREFIX`. If this is unset, the default `/usr/local` prefix is used.
Options to compile with debug symbols are available.
`DEBUG`: Sets debug mode: the memory map is reduced in size, logging is forced to TRACE level, etc.
`LSUP_RDF_STREAM_CHUNK_SIZE`: Size of the RDF decoding buffer, i.e., the maximum size of a chunk of RDF data fed to the parser when decoding an RDF file into a graph. This should be larger than the maximum expected size of a single term in your RDF source. The default value is 8192, which is mildly conservative. If you experience parsing errors on decoding, and they happen to be on a term such as a very long string literal, try recompiling the library with a larger value.
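
The sizing rule above can be sketched in a few lines of C. This is only an illustration: it assumes the constant is protected by the usual `#ifndef` guard so that a `-D` compiler flag can override it, which may or may not be how the library's headers define it. The helpers `demo_term_fits` and `demo_suggest_chunk_size` are hypothetical and not part of `lsup_rdf`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a guard like this lets -DLSUP_RDF_STREAM_CHUNK_SIZE=<n>
 * override the documented default of 8192 at compile time. */
#ifndef LSUP_RDF_STREAM_CHUNK_SIZE
#define LSUP_RDF_STREAM_CHUNK_SIZE 8192
#endif

/* Hypothetical helper: true if a term of term_len bytes is strictly
 * smaller than one decoding chunk, as the text above recommends. */
static bool demo_term_fits(size_t term_len)
{
    return term_len < (size_t) LSUP_RDF_STREAM_CHUNK_SIZE;
}

/* Hypothetical helper: start from the current chunk size and keep
 * doubling until it is strictly larger than the longest expected term. */
static size_t demo_suggest_chunk_size(size_t longest_term)
{
    size_t size = (size_t) LSUP_RDF_STREAM_CHUNK_SIZE;
    while (size <= longest_term) {
        if (size > SIZE_MAX / 2) return SIZE_MAX; /* avoid overflow */
        size *= 2;
    }
    return size;
}
```

For instance, if the longest literal in a data set is around 20000 bytes, `demo_suggest_chunk_size(20000)` gives a candidate value to recompile with.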
The generated `liblsuprdf.so` and `liblsuprdf.a` libraries can be linked dynamically or statically to your code. Only the `lsup_rdf.h` header, which recursively includes other headers in the `include` directory, needs to be `#include`d in the embedding code.
Environment variables and/or compiler options might have to be set in order to find the dynamic libraries and headers in their install locations.
For compilation and linking examples, refer to `test`, `memtest`, `perftest` and other actions in the current Makefile.
`LSUP_MDB_STORE_PATH`: The file path for the persistent store back end. For production use it is strongly recommended to set this to a permanent location on the fastest storage volume available. If unset, the current directory will be used. The directory must exist.
`LSUP_LOGLEVEL`: A number between 0 and 5, corresponding to TRACE (0), DEBUG (1), INFO (2), WARN (3), ERROR (4), and FATAL (5). If unspecified, it is set to 3 (WARN).
`LSUP_MDB_MAPSIZE`: Virtual memory map size. It is recommended to leave this alone. By default, it is set to 1 TB for 64-bit systems and 4 GB for 32-bit systems. The map size by itself does not use up any extra resources.
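
As a rough illustration of the defaults described above, the following C sketch shows how an embedding program might resolve `LSUP_MDB_STORE_PATH` and `LSUP_LOGLEVEL` before starting up. All `demo_` names are hypothetical and not part of the `lsup_rdf` API. The sketch assumes that the six level names map to 0 through 5 in the order listed, and it falls back to the default on invalid input, which the library itself may handle differently. The directory check relies on POSIX `stat`.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>

#define DEMO_DEFAULT_LOGLEVEL 3 /* documented default */

/* Assumed mapping: names in the documented order, indexed 0 to 5. */
static const char *const demo_level_names[] = {
    "TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"
};

/* Parse a raw LSUP_LOGLEVEL value. Returns a level in [0, 5], or the
 * default when the value is unset, empty, non-numeric, or out of range
 * (the fallback on invalid input is this sketch's own choice). */
static int demo_parse_loglevel(const char *raw)
{
    if (raw == NULL || *raw == '\0') return DEMO_DEFAULT_LOGLEVEL;
    char *end = NULL;
    errno = 0;
    long v = strtol(raw, &end, 10);
    if (errno != 0 || *end != '\0' || v < 0 || v > 5)
        return DEMO_DEFAULT_LOGLEVEL;
    return (int) v;
}

/* Name for a level already validated by demo_parse_loglevel(). */
static const char *demo_level_name(int level)
{
    return demo_level_names[level];
}

/* Resolve a raw LSUP_MDB_STORE_PATH value: the given path, or the
 * current directory when the variable is unset or empty. */
static const char *demo_store_path(const char *raw)
{
    return (raw != NULL && *raw != '\0') ? raw : ".";
}

/* The store directory must already exist: returns 1 if path is an
 * existing directory, 0 otherwise. */
static int demo_dir_exists(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}
```

A caller would pass the environment values in directly, e.g. `demo_parse_loglevel(getenv("LSUP_LOGLEVEL"))` and `demo_dir_exists(demo_store_path(getenv("LSUP_MDB_STORE_PATH")))`, and report a clear error before opening a store if the directory is missing.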
Almost all header files are documented. Run `doxygen` (see Doxygen) to generate HTML documentation in `docs/html`.
TODO