DB import: massive speed up, via sqlite tuning and better mem handling

Description

  • avoid loading the entire input file into memory, and rely on DB uniqueness constraints to spot duplicates
  • tune SQLite by disabling synchronous writes and the journal; this is unsafe, but the import is all-or-nothing anyway
  • minor: improve exception handling, propagating SQLite errors up the stack
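
The three points above can be sketched roughly as follows. This is a minimal illustration, not the code from this diff: the function name `import_swhids`, the table name `swhids`, and the use of `INSERT OR IGNORE` to skip duplicates are all assumptions made for the example.

```python
import sqlite3
from typing import Iterable


def import_swhids(db_path: str, lines: Iterable[str]) -> int:
    """Stream SWHIDs into SQLite and return how many new rows were stored.

    Hypothetical helper: name, schema and duplicate policy are assumptions.
    """
    conn = sqlite3.connect(db_path)
    try:
        # Unsafe but fast: no fsync on write and no rollback journal.
        # A crash can corrupt the file, so a failed import should be
        # discarded and restarted from scratch.
        conn.execute("PRAGMA synchronous = OFF")
        conn.execute("PRAGMA journal_mode = OFF")
        conn.execute(
            "CREATE TABLE IF NOT EXISTS swhids (swhid TEXT PRIMARY KEY)"
        )
        before = conn.execute("SELECT count(*) FROM swhids").fetchone()[0]

        # Generator: rows are produced one at a time, so the input is
        # never held in memory as a whole.
        rows = ((line.strip(),) for line in lines if line.strip())

        # The PRIMARY KEY constraint detects duplicates; OR IGNORE skips them.
        conn.executemany("INSERT OR IGNORE INTO swhids VALUES (?)", rows)
        conn.commit()

        after = conn.execute("SELECT count(*) FROM swhids").fetchone()[0]
        return after - before
    finally:
        # Any sqlite3.Error raised above propagates to the caller unchanged.
        conn.close()
```

Passing an open file object as `lines` keeps memory usage flat, since the file is read line by line while SQLite enforces uniqueness.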

With this change, import time for 30M SWHIDs went down from ~6m30s to ~55s, and
memory usage went down from 5 GiB to a few tens of MiB.

Closes T2836
Closes T2812