
from_disk: only build a model object once
Closed · Public

Authored by marmoute on Sep 20 2022, 2:36 PM.

Details

Summary
from_disk: only build a model object once

Before this change, a Directory object was built just to compute the `id`
that we then fed to the Directory object we built for `to_model`.
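
To illustrate the pattern, here is a minimal, self-contained sketch. It is not the actual swh.model code: `ModelDirectory`, `hash_entries`, `build_twice` and `build_once` are hypothetical names, and the hashing scheme is a simplified stand-in. It only shows how building a throwaway object to obtain an `id` doubles the number of constructions, and one possible way to avoid that.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical entry representation: (name, target hash) pairs.
Entries = Tuple[Tuple[bytes, bytes], ...]


def hash_entries(entries: Entries) -> bytes:
    """Simplified stand-in for an intrinsic identifier computation."""
    h = hashlib.sha1()
    for name, target in sorted(entries):
        h.update(name + b"\0" + target)
    return h.digest()


@dataclass(frozen=True)
class ModelDirectory:
    """Hypothetical immutable model object (not swh.model's Directory)."""

    entries: Entries
    id: Optional[bytes] = None

    built = 0  # plain class attribute (not a field): counts constructions

    def __post_init__(self) -> None:
        type(self).built += 1


def build_twice(entries: Entries) -> ModelDirectory:
    """Old pattern: a temporary object exists only to compute the id."""
    tmp = ModelDirectory(entries)
    return ModelDirectory(entries, id=hash_entries(tmp.entries))


def build_once(entries: Entries) -> ModelDirectory:
    """New pattern: compute the id first, then build a single object."""
    return ModelDirectory(entries, id=hash_entries(entries))


entries = ((b"README", b"\x01" * 20), (b"src", b"\x02" * 20))

ModelDirectory.built = 0
old = build_twice(entries)
print("constructions with the old pattern:", ModelDirectory.built)

ModelDirectory.built = 0
new = build_once(entries)
print("constructions with the new pattern:", ModelDirectory.built)

print("same resulting object:", old == new)
```

Both functions return equal objects; only the number of constructions differs, which matters when constructing and validating a model object is expensive.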

We tested this change on a simple invocation of the Mercurial loader,
with a no-op loader storage:

    swh loader run mercurial https://foss.heptapod.net/mercurial/mercurial-devel directory=/data/repos/mercurial-devel

= Median time of 3 runs =
before: 17 minutes 48 seconds
after:  12 minutes 59 seconds

On a profile of the same run, the `to_model` call of the from_disk `Directory` class accounted for the following share of the run time:
before: 43%
after:  24%
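
As a quick sanity check of the figures above, the following sketch converts the median timings to seconds and derives the relative gain. The last two lines assume that the profile percentages can be applied to the median wall-clock times, which is only an approximation, since a profiled run is usually slower than an unprofiled one.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a 'minutes, seconds' duration to seconds."""
    return minutes * 60 + seconds


before = to_seconds(17, 48)  # median of 3 runs, before the change
after = to_seconds(12, 59)   # median of 3 runs, after the change

saved = before - after
reduction = saved / before   # fraction of wall-clock time removed
speedup = before / after

print(f"before: {before} s, after: {after} s, saved: {saved} s")
print(f"reduction: {reduction:.1%}, speedup: {speedup:.2f}x")

# Rough estimate only: profile shares applied to the median timings.
to_model_before = 0.43 * before
to_model_after = 0.24 * after
print(f"estimated time in to_model: {to_model_before:.0f} s -> {to_model_after:.0f} s")
```

Under that assumption, most of the saved time comes from `to_model` itself, which is consistent with the change only touching how the model object is built.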
Test Plan

I ran tox, then timed and profiled the run.

Diff Detail

Repository
rDMOD Data model
Branch
D8510
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 31689
Build 49577: Phabricator diff pipeline on Jenkins
Build 49576: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D8510 (id=30643)

Rebasing onto 9ce6feb9d6...

Current branch diff-target is up to date.
Changes applied before test
commit c5aa397ecdfcd8cd9c11657252fcae0de2e8e4c2
Author: Pierre-Yves David <pierre-yves.david@ens-lyon.org>
Date:   Tue Sep 20 14:26:17 2022 +0200


See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/504/ for more details.

Build is green

Patch application report for D8510 (id=30644)

Rebasing onto 9ce6feb9d6...

Current branch diff-target is up to date.
Changes applied before test
commit 36185705f1805465d565b00b0df917b0e4f1c8e4
Author: Pierre-Yves David <pierre-yves.david@ens-lyon.org>
Date:   Tue Sep 20 14:26:17 2022 +0200


See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/505/ for more details.

This revision is now accepted and ready to land. (Sep 21 2022, 11:40 AM)

Build is green

Patch application report for D8510 (id=30712)

Rebasing onto 9ce6feb9d6...

Current branch diff-target is up to date.
Changes applied before test
commit 36185705f1805465d565b00b0df917b0e4f1c8e4
Author: Pierre-Yves David <pierre-yves.david@ens-lyon.org>
Date:   Tue Sep 20 14:26:17 2022 +0200


See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/514/ for more details.

Build is green

Patch application report for D8510 (id=30720)

Rebasing onto 9ce6feb9d6...

Current branch diff-target is up to date.
Changes applied before test
commit 36185705f1805465d565b00b0df917b0e4f1c8e4
Author: Pierre-Yves David <pierre-yves.david@ens-lyon.org>
Date:   Tue Sep 20 14:26:17 2022 +0200


See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/518/ for more details.

I have updated the commit message with more timing data, but I am not sure how to get Phabricator to reflect that.

Build is green

Patch application report for D8510 (id=30725)

Rebasing onto 9ce6feb9d6...

Current branch diff-target is up to date.
Changes applied before test
commit 814a6c8416d56f5f8b3e590d419d5aea7a888ab2
Author: Pierre-Yves David <pierre-yves.david@ens-lyon.org>
Date:   Tue Sep 20 14:26:17 2022 +0200


See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/522/ for more details.

marmoute edited the test plan for this revision.
marmoute edited the summary of this revision.

batch-update

Build is green

Patch application report for D8510 (id=30732)

Rebasing onto 9ce6feb9d6...

Current branch diff-target is up to date.
Changes applied before test
commit 814a6c8416d56f5f8b3e590d419d5aea7a888ab2
Author: Pierre-Yves David <pierre-yves.david@ens-lyon.org>
Date:   Tue Sep 20 14:26:17 2022 +0200


See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/526/ for more details.