# Performance Benchmark Notes

## Environment

### Hardware

- Dell Precision M3800 Laptop
- 8x Intel(R) Core(TM) i7-4712HQ CPU @ 2.30GHz
- 12 GB RAM
- SSD

### Software

- Arch Linux OS
- glibc 2.26-11
- python 3.5.4
- lmdb 0.9.21-1
- db (BerkeleyDB) 5.3.28-3

### Sample Data Set

Modified Duchamp VIAF dataset (343 triples; changed all subjects to `<>`)
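
The subject rewrite described above can be sketched as follows. This is only an illustration, not the script actually used to prepare the data set: it assumes the source data is serialized as N-Triples (one triple per line), and `rewrite_subjects` is a hypothetical helper name.

```python
import re

# Matches the subject term at the start of an N-Triples line:
# either an IRI in angle brackets or a blank node label.
# Assumption: the input is N-Triples, one triple per line.
SUBJECT_RE = re.compile(r'^\s*(<[^>]*>|_:\S+)')


def rewrite_subjects(ntriples, new_subject='<>'):
    """Replace the subject of every triple with ``new_subject``."""
    out = []
    for line in ntriples.splitlines():
        stripped = line.strip()
        # Leave blank lines and comments untouched.
        if not stripped or stripped.startswith('#'):
            out.append(line)
            continue
        # A function replacement avoids backslash-escape handling.
        out.append(SUBJECT_RE.sub(lambda m: new_subject, line, count=1))
    return '\n'.join(out)
```

The output uses the empty relative IRI `<>`, so it should be parsed as Turtle rather than N-Triples. The relative IRI then resolves against the base URI, which lets the same payload be reused for every new resource.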

## Sleepycat Back End Test

10K PUTs to new resources under the same container:

- ~18 min running time
- 0.108 s per resource
- 3.4M triples total in the repository at the end of the process
- Retrieval of parent resource (11400 triples), piped to /dev/null: 3.6 s
- Database size: 1.2 GB
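
For reference, the total and per-resource figures in these notes are related by simple division: 18 min is 1080 s, and 1080 s / 10000 resources = 0.108 s per resource. A minimal timing harness along these lines might look like the sketch below. It is not the script used for these runs; `benchmark` and its `operation` callable are hypothetical names, and in a real run `operation` would issue one HTTP PUT with the sample payload.

```python
import time


def benchmark(operation, count):
    """Call ``operation(i)`` ``count`` times and return timing figures.

    ``operation`` is a placeholder for whatever performs one request,
    e.g. a function that PUTs the sample data set to a new resource.
    """
    if count <= 0:
        raise ValueError('count must be positive')
    latencies = []
    start = time.perf_counter()
    for i in range(count):
        t0 = time.perf_counter()
        operation(i)
        latencies.append(time.perf_counter() - t0)
    total = time.perf_counter() - start
    return {
        'total_s': total,                  # wall-clock time for the run
        'per_resource_s': total / count,   # average time per request
        'latencies_s': latencies,          # individual request times
    }
```

Keeping the individual latencies, rather than only the total, makes it possible to look for irregular pauses afterwards.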

## LMDB Back End Test

### Strategy #4

10K PUTs to new resources under the same container:

- ~29 min running time
- 0.178 s per resource
- 3.4M triples total in the repository at the end of the process
- Some gaps every ~40-50 requests, probably due to blocking transactions or disk flushes
- Database size: 633 MB
- Retrieval of parent resource (11400 triples), piped to /dev/null: 3.48 s
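
Gaps like the ones noted above could be located more precisely by recording the latency of each request and flagging outliers. The following is a rough sketch under stated assumptions: latencies are available as a list of seconds, and a request counts as a "gap" when it takes more than `factor` times the median. Both the helper names and the default factor of 5 are arbitrary choices for illustration.

```python
import statistics


def find_gaps(latencies, factor=5.0):
    """Return indices of requests slower than ``factor`` times the median."""
    if not latencies:
        return []
    threshold = statistics.median(latencies) * factor
    return [i for i, value in enumerate(latencies) if value > threshold]


def gap_spacing(indices):
    """Return the distance, in requests, between consecutive gaps."""
    return [b - a for a, b in zip(indices, indices[1:])]
```

A fairly regular spacing would point towards something periodic, such as a flush or commit interval. Irregular spacing would be more consistent with contention between transactions.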

### Strategy #5

10K PUTs to new resources under the same container:

- 29 min running time
- 0.176 s per resource
- 3.4M triples total in the repository at the end of the process
- Fewer gaps than with strategy #4, but overall timing is almost identical. The bottleneck seems to be somewhere else.
- Database size: 422 MB
- Retrieval of parent resource (11400 triples), piped to /dev/null: 7.5 s

### After using triple methods rather than SPARQL for `extract_imr`

- 25 min running time
- 0.155 s per resource
- Database size: 523 MB
- Retrieval of parent resource, piped to /dev/null: 1.9 s