from_disk: skip intermediate dictionary creation when building model

Description

from_disk: skip intermediate dictionary creation when building model

Before this change, we would do the following:

  1. translate the from_disk objects into dicts,
  2. sort these dicts,
  3. feed the list to Directory.from_dict,
  4. create DirectoryEntry objects from these dicts.

Skipping the dictionary creation and directly creating the
DirectoryEntry objects provides us with a small but stable and
noticeable performance win.
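
The difference between the two paths can be sketched as follows. This is a minimal illustration, not the actual swh.model code: DiskNode and Entry are hypothetical stand-ins for the from_disk objects and for DirectoryEntry, and sorting by name alone is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class DiskNode:
    """Hypothetical stand-in for a from_disk object."""
    name: bytes
    type: str
    target: bytes
    perms: int


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for a model DirectoryEntry."""
    name: bytes
    type: str
    target: bytes
    perms: int


def entries_via_dicts(nodes: Iterable[DiskNode]) -> Tuple[Entry, ...]:
    # Old path: objects -> dicts -> sorted dicts -> entries.
    dicts = [
        {"name": n.name, "type": n.type, "target": n.target, "perms": n.perms}
        for n in nodes
    ]
    dicts.sort(key=lambda d: d["name"])  # assumption: sort by name only
    return tuple(Entry(**d) for d in dicts)


def entries_direct(nodes: Iterable[DiskNode]) -> Tuple[Entry, ...]:
    # New path: sort the objects and build the entries directly,
    # without allocating one intermediate dict per entry.
    return tuple(
        Entry(n.name, n.type, n.target, n.perms)
        for n in sorted(nodes, key=lambda n: n.name)
    )
```

Both functions return the same entries; the second one only avoids the per-entry dict allocation and the keyword unpacking.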

We tested this change on a simple run of the Mercurial loader,
with a no-op loader storage:

swh loader run mercurial https://foss.heptapod.net/mercurial/mercurial-devel directory=/data/repos/mercurial-devel

Median time of 3 runs:

before: 11 minutes 56 seconds
after: 11 minutes 50 seconds
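
For scale, the two medians can be turned into an absolute and a relative gain. The snippet below only restates the figures above; the variable names are illustrative.

```python
# Median wall-clock times reported above, converted to seconds.
before_s = 11 * 60 + 56  # 716 seconds
after_s = 11 * 60 + 50   # 710 seconds

saved_s = before_s - after_s
relative_gain = saved_s / before_s

print(f"saved {saved_s} s ({relative_gain:.2%} of the original run time)")
# prints "saved 6 s (0.84% of the original run time)"
```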

On a profile of the same run, the to_model call of from_disk's Directory class accounted for the following percentage of the run:
before: 17%
after: 15%

Details

Provenance
marmoute authored on Sep 22 2022, 6:24 PM
marmoute pushed on Sep 26 2022, 2:28 PM
Differential Revision
D8525: from_disk: skip intermediate dictionary creation when building model
Parents
rDMODad3ecac9beae: model: avoid another extra creation of Model object
Branches
Unknown
Tags
Unknown
Build Status
Buildable 31739
Build 49662: test-and-build