
Implement a MongoDB backend for SWH-provenance
Open, Normal, Public


Steps for phase 1

  • Find an initial experimental data model
  • Build an MVP (with the new backend and a sample dataset)
  • Run some experiments
  • Make scaling/performance improvements
  • Change the data model and indexes accordingly
  • Decide on the hosting strategy (single server vs. multiple instances)
  • Populate with production data and test
  • Deploy

Phase 2:

  • Support working with an incremental data stream

Event Timeline

jayeshv triaged this task as Normal priority.
jayeshv created this task.
zack renamed this task from Implement a MonoDB backend for SWH-provenance to Implement a MongoDB backend for SWH-provenance. Thu, Jul 15, 10:52 AM
jayeshv updated the task description.

Data model


Python interface
We will use pymongo, the low-level driver, to interface with the database.
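
A minimal sketch of what going through pymongo directly could look like. The data model is not settled in this task, so the `content` collection, its `sha1` and `ts` fields, the connection URI and the `provenance` database name are all assumptions made for illustration only.

```python
from datetime import datetime, timezone
from typing import Any


def connect(uri: str = "mongodb://localhost:27017", db_name: str = "provenance") -> Any:
    """Return a handle on the database (URI and name are hypothetical)."""
    # Imported lazily so the helper below can be exercised without pymongo.
    from pymongo import MongoClient

    return MongoClient(uri)[db_name]


def upsert_content(collection: Any, sha1: bytes, first_seen: datetime) -> None:
    """Record a content blob, keeping the earliest date seen so far.

    `collection` is expected to behave like a pymongo Collection. The
    document layout ({"sha1": ..., "ts": ...}) is an assumption, not the
    model chosen for SWH-provenance.
    """
    if len(sha1) != 20:
        raise ValueError("sha1 must be exactly 20 bytes")
    collection.update_one(
        {"sha1": sha1},                # match on the content hash
        {"$min": {"ts": first_seen}},  # only ever move the date earlier
        upsert=True,                   # create the document if it is missing
    )
```

With a MongoDB server running, usage would be along the lines of `db = connect()`, then `db.content.create_index("sha1", unique=True)` once, then `upsert_content(db.content, sha1, datetime.now(timezone.utc))` for each content. Using `$min` with `upsert=True` folds the "insert or keep the earliest date" logic into a single round trip.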

db version v5.0.1
Build Info: {
    "version": "5.0.1",
    "gitVersion": "318fd9cabc59dc9651f3189b622af6e06ab6cd33",
    "openSSLVersion": "OpenSSL 1.1.1f  31 Mar 2020",
    "modules": [],
    "allocator": "tcmalloc",
    "environment": {
        "distmod": "ubuntu2004",
        "distarch": "x86_64",
        "target_arch": "x86_64"