- # Performance Benchmark Notes
- ## Environment
- ### Hardware
- - Dell Precision M3800 Laptop
- - 8x Intel(R) Core(TM) i7-4712HQ CPU @ 2.30GHz
- - 12 GB RAM
- - SSD
- ### Software
- - Arch Linux OS
- - glibc 2.26-11
- - python 3.5.4
- - lmdb 0.9.21-1
- - db (BerkeleyDB) 5.3.28-3
- ### Sample Data Set
- Modified Duchamp VIAF dataset (343 triples; changed all subjects to `<>`)
- ## Sleepycat Back End Test
- 10K PUTs to new resources under the same container:
- ~18' running time
- 0.108" per resource
- 3.4M triples total in repo at the end of the process
- Retrieval of parent resource (11400 triples), pipe to /dev/null: 3.6"
- Database size: 1.2 GB
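
The per-resource figure above follows from the total running time divided by the number of requests. A minimal sketch of that arithmetic, assuming the `'` and `"` marks denote minutes and seconds; the function name `seconds_per_resource` is illustrative and does not come from the benchmark code:

```python
def seconds_per_resource(total_minutes, n_requests):
    """Average wall-clock seconds spent per request."""
    return total_minutes * 60 / n_requests


# Sleepycat run: ~18 minutes for 10K PUTs.
avg = seconds_per_resource(18, 10000)
print("{:.3f} s per resource".format(avg))    # 0.108 s per resource
print("{:.2f} resources per second".format(1 / avg))  # 9.26 resources per second
```

Because the running times in these notes are rounded to whole minutes, recomputing the average this way can differ slightly from the per-resource values recorded for the other runs.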
- ## LMDB Back End Test
- ### Strategy #4
- 10K PUTs to new resources under the same container:
- ~29' running time
- 0.178" per resource
- 3.4M triples total in repo at the end of the process
- Some gaps occur every ~40-50 requests, probably caused by blocking transactions
- or disk flushes
- Database size: 633 MB
- Retrieval of parent resource (11400 triples), pipe to /dev/null: 3.48"
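
The notes do not record how the gaps were observed. One way to make them measurable is to time every request and flag those that take far longer than the median. The sketch below is an assumption, not the harness used for these runs: `send` stands for any callable that performs one PUT, and the `factor` threshold is arbitrary.

```python
import time
from statistics import median


def time_requests(send, n):
    """Call send(i) n times and return the latency of each call in seconds."""
    latencies = []
    for i in range(n):
        start = time.perf_counter()
        send(i)
        latencies.append(time.perf_counter() - start)
    return latencies


def find_gaps(latencies, factor=5.0):
    """Return indices of requests slower than `factor` times the median latency."""
    threshold = factor * median(latencies)
    return [i for i, t in enumerate(latencies) if t > threshold]


# Synthetic latencies: 0.1 s per request, with a 2 s stall at requests 44 and 89.
sample = [0.1] * 100
sample[44] = 2.0
sample[89] = 2.0
print(find_gaps(sample))  # [44, 89]
```

With real measurements, the spacing between flagged indices would show whether the stalls recur at a regular interval, which would be consistent with periodic flushes or transaction commits.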
- ### Strategy #5
- 10K PUTs to new resources under the same container:
- 29' running time
- 0.176" per resource
- 3.4M triples total in repo at the end of the process
- Fewer gaps than strategy #4, but overall timing is almost identical. The
- bottleneck seems to be somewhere else.
- Database size: 422 MB
- Retrieval of parent resource (11400 triples), pipe to /dev/null: 7.5"
- ### After using triple methods rather than SPARQL for extract_imr
- 25' running time
- 0.155" per resource
- Database size: 523 MB
- Retrieval of parent resource, pipe to /dev/null: 1.9"