from_disk: only build a model object once

Before this change, one Directory object was built just to compute the `id`, which we then fed to a second Directory object built for `to_model`.

We tested this change on a simple run of the Mercurial loader, with a noop-loader storage:

    swh loader run mercurial https://foss.heptapod.net/mercurial/mercurial-devel directory=/data/repos/mercurial-devel

= Median time of 3 runs =

before: 17 minutes 48 seconds
after: 12 minutes 59 seconds

On a profile of the same run, the `to_model` call of the from_disk `Directory` class accounted for the following share of the profiled time:

before: 43%
after: 24%
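The pattern described above can be illustrated with a minimal sketch. This is not the actual swh.model code: `ModelDirectory`, `DiskDirectory`, `to_model_twice`, `to_model_once`, and the `BUILDS` counter are hypothetical names, and the hashing scheme is a placeholder chosen only so the example runs. It assumes the model object can compute its own `id` at construction time when none is supplied.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical counter: records how many model objects get constructed.
BUILDS = {"count": 0}


@dataclass(frozen=True)
class ModelDirectory:
    """Hypothetical stand-in for a model-layer directory object."""

    entries: Tuple[Tuple[str, str], ...]
    id: str = ""

    def __post_init__(self) -> None:
        BUILDS["count"] += 1
        if not self.id:
            # Frozen dataclass: set the computed id through object.__setattr__.
            object.__setattr__(self, "id", self._compute_id())

    def _compute_id(self) -> str:
        # Placeholder hash, NOT the real directory identifier algorithm.
        digest = hashlib.sha1()
        for name, target in self.entries:
            digest.update(f"{name}\0{target}\n".encode())
        return digest.hexdigest()


class DiskDirectory:
    """Hypothetical stand-in for the from_disk directory class."""

    def __init__(self, files: Dict[str, str]) -> None:
        self.files = files

    def _entries(self) -> Tuple[Tuple[str, str], ...]:
        return tuple(sorted(self.files.items()))

    def to_model_twice(self) -> ModelDirectory:
        # Old pattern: build one object only to read its id, then build another.
        directory_id = ModelDirectory(entries=self._entries()).id
        return ModelDirectory(entries=self._entries(), id=directory_id)

    def to_model_once(self) -> ModelDirectory:
        # New pattern: build a single object and let it compute its own id.
        return ModelDirectory(entries=self._entries())


if __name__ == "__main__":
    disk_dir = DiskDirectory({"b.txt": "2", "a.txt": "1"})
    for method in (disk_dir.to_model_twice, disk_dir.to_model_once):
        before = BUILDS["count"]
        model = method()
        print(method.__name__, "built", BUILDS["count"] - before, "object(s), id", model.id)
```

Both methods return equal objects; the only difference is how many constructions happen along the way, which is the cost the change removes.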
Details
- Reviewers
  olasd, vlorentz
- Group Reviewers
  Reviewers
- Maniphest Tasks
  T4595: Improve common tooling for loading
- Commits
  rDMOD814a6c8416d5: from_disk: only build a model object once
- Required Signatures
  L3 Software Heritage Contributor License Agreement, version 1.0
I ran tox, timing and profile
Diff Detail
- Repository
  rDMOD Data model
- Branch
  master
- Lint
  No Linters Available
- Unit
  No Unit Test Coverage
- Build Status
  Buildable 31622
  Build 49456: Phabricator diff pipeline on jenkins
  Build 49455: arc lint + arc unit
Event Timeline
Build is green
Patch application report for D8510 (id=30643)
Rebasing onto 9ce6feb9d6...
Current branch diff-target is up to date.
Changes applied before test
commit c5aa397ecdfcd8cd9c11657252fcae0de2e8e4c2
Author: Pierre-Yves David <pierre-yves.david@ens-lyon.org>
Date:   Tue Sep 20 14:26:17 2022 +0200

    from_disk: only build a model object once

    Before this change, a Directory object was built to compute the `id` that we fed to the Directory object we built for `to_model`.
See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/504/ for more details.
Build is green
Patch application report for D8510 (id=30644)
Rebasing onto 9ce6feb9d6...
Current branch diff-target is up to date.
Changes applied before test
commit 36185705f1805465d565b00b0df917b0e4f1c8e4
Author: Pierre-Yves David <pierre-yves.david@ens-lyon.org>
Date:   Tue Sep 20 14:26:17 2022 +0200

    from_disk: only build a model object once

    Before this change, a Directory object was built to compute the `id` that we fed to the Directory object we built for `to_model`.
See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/505/ for more details.
Build is green
Patch application report for D8510 (id=30712)
Rebasing onto 9ce6feb9d6...
Current branch diff-target is up to date.
Changes applied before test
commit 36185705f1805465d565b00b0df917b0e4f1c8e4
Author: Pierre-Yves David <pierre-yves.david@ens-lyon.org>
Date:   Tue Sep 20 14:26:17 2022 +0200

    from_disk: only build a model object once

    Before this change, a Directory object was built to compute the `id` that we fed to the Directory object we built for `to_model`.
See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/514/ for more details.
Build is green
Patch application report for D8510 (id=30720)
Rebasing onto 9ce6feb9d6...
Current branch diff-target is up to date.
Changes applied before test
commit 36185705f1805465d565b00b0df917b0e4f1c8e4
Author: Pierre-Yves David <pierre-yves.david@ens-lyon.org>
Date:   Tue Sep 20 14:26:17 2022 +0200

    from_disk: only build a model object once

    Before this change, a Directory object was built to compute the `id` that we fed to the Directory object we built for `to_model`.
See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/518/ for more details.
I have updated the commit message with more timing data, but I am not sure how to get Phabricator to reflect that.
Build is green
Patch application report for D8510 (id=30725)
Rebasing onto 9ce6feb9d6...
Current branch diff-target is up to date.
Changes applied before test
commit 814a6c8416d56f5f8b3e590d419d5aea7a888ab2
Author: Pierre-Yves David <pierre-yves.david@ens-lyon.org>
Date:   Tue Sep 20 14:26:17 2022 +0200

    from_disk: only build a model object once

    Before this change, a Directory object was built to compute the `id` that we fed to the Directory object we built for `to_model`.

    We tested this change on a simple run of the Mercurial loader, with a noop-loader storage:

        swh loader run mercurial https://foss.heptapod.net/mercurial/mercurial-devel directory=/data/repos/mercurial-devel

    = Median time of 3 runs =

    before: 17 minutes 48 seconds
    after: 12 minutes 59 seconds

    On a profile of the same run, the `to_model` call of the from_disk `Directory` class accounted for the following share of the profiled time:

    before: 43%
    after: 24%
See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/522/ for more details.
Build is green
Patch application report for D8510 (id=30732)
Rebasing onto 9ce6feb9d6...
Current branch diff-target is up to date.
Changes applied before test
commit 814a6c8416d56f5f8b3e590d419d5aea7a888ab2
Author: Pierre-Yves David <pierre-yves.david@ens-lyon.org>
Date:   Tue Sep 20 14:26:17 2022 +0200

    from_disk: only build a model object once

    Before this change, a Directory object was built to compute the `id` that we fed to the Directory object we built for `to_model`.

    We tested this change on a simple run of the Mercurial loader, with a noop-loader storage:

        swh loader run mercurial https://foss.heptapod.net/mercurial/mercurial-devel directory=/data/repos/mercurial-devel

    = Median time of 3 runs =

    before: 17 minutes 48 seconds
    after: 12 minutes 59 seconds

    On a profile of the same run, the `to_model` call of the from_disk `Directory` class accounted for the following share of the profiled time:

    before: 43%
    after: 24%
See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/526/ for more details.