package/archive: Add snapshot_append parameter to ArchiveLoader
Closed · Public

Authored by anlambert on May 27 2021, 11:52 AM.

Details

Summary

It enables appending the latest snapshot content of an origin each
time the loader is invoked.

The purpose is to keep track of all the origin artifacts loaded so
far in each new visit of the origin.
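
For illustration only, a minimal sketch of the append behaviour described above. It is not the code from this diff: branches are modelled as plain dicts keyed by branch name, `append_snapshot` is a hypothetical helper, and letting freshly loaded branches win on a name clash is an assumption.

```python
from typing import Dict, Optional

# Hypothetical stand-in for snapshot content: branch name -> branch description.
Branches = Dict[bytes, dict]


def append_snapshot(previous: Optional[Branches], new: Branches) -> Branches:
    """Return the branches of the new visit, extended with the previous snapshot's.

    `previous` is None when the origin has never been visited before.
    """
    merged = dict(previous or {})
    # Assumption: on a name clash, the branch loaded during this visit is kept.
    merged.update(new)
    return merged


previous_visit = {b"releases/1.0": {"target": b"\x01", "target_type": "revision"}}
current_visit = {b"releases/2.0": {"target": b"\x02", "target_type": "revision"}}
merged = append_snapshot(previous_visit, current_visit)
```

With such a merge, an artifact that is no longer listed by the origin would still be referenced by the snapshot of the latest visit.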

Closes T3347

Diff Detail

Repository
rDLDBASE Generic VCS/Package Loader
Branch
archives-snapshot-append
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 21655
Build 33656: Phabricator diff pipeline on jenkins (Jenkins console · Jenkins)
Build 33655: arc lint + arc unit

Event Timeline

ardumont added inline comments.
swh/loader/package/archive/loader.py
175

or something?

Build is green

Patch application report for D5789 (id=20691)

Rebasing onto 0e4bb4bbc8...

Current branch diff-target is up to date.
Changes applied before test
commit 36e4275c53d2a1c1eb3168bffabb5e7a6d876f3b
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Thu May 27 11:47:11 2021 +0200

    package/archive: Add snapshot_append parameter to ArchiveLoader
    
    It enables appending the latest snapshot content of an origin each
    time the loader is invoked.
    
    The purpose is to keep track of all the origin artifacts loaded so
    far in each new visit of the origin.
    
    Closes T3347

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/477/ for more details.

swh/loader/package/archive/loader.py
175

extra_branches must return the dict version of the snapshot branches, not the model-based ones; see the traceback below:

Traceback (most recent call last):
  File "/home/anlambert/swh/swh-environment/swh-loader-core/swh/loader/package/loader.py", line 633, in load
    default_version, tmp_revisions, extra_branches
  File "/home/anlambert/swh/swh-environment/swh-loader-core/swh/loader/package/loader.py", line 788, in _load_snapshot
    snapshot = Snapshot.from_dict(snapshot_data)
  File "/home/anlambert/swh/swh-environment/swh-model/swh/model/model.py", line 425, in from_dict
    for (name, branch) in d.pop("branches").items()
  File "/home/anlambert/swh/swh-environment/swh-model/swh/model/collections.py", line 27, in __init__
    self.data = tuple(data)
  File "/home/anlambert/swh/swh-environment/swh-model/swh/model/model.py", line 425, in <genexpr>
    for (name, branch) in d.pop("branches").items()
  File "/home/anlambert/swh/swh-environment/swh-model/swh/model/model.py", line 400, in from_dict
    return cls(target=d["target"], target_type=TargetType(d["target_type"]))
TypeError: 'SnapshotBranch' object is not subscriptable
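
To make the failure mode concrete, here is a small self-contained sketch. `Branch` is a hypothetical stand-in, not the real `SnapshotBranch` class, and `branches_as_dicts` is an illustrative helper; the only point is that a `from_dict`-style constructor subscripts its argument, so it needs dicts rather than model objects.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Branch:
    """Hypothetical stand-in for a model-based snapshot branch."""

    target: bytes
    target_type: str

    def to_dict(self) -> dict:
        return {"target": self.target, "target_type": self.target_type}

    @classmethod
    def from_dict(cls, d: dict) -> "Branch":
        # Subscripting raises TypeError when `d` is already a Branch object.
        return cls(target=d["target"], target_type=d["target_type"])


def branches_as_dicts(branches: Dict[bytes, Branch]) -> Dict[bytes, dict]:
    """Convert model-based branches to the dict form expected by from_dict."""
    return {name: branch.to_dict() for name, branch in branches.items()}


model_branches = {b"HEAD": Branch(target=b"\x01", target_type="revision")}
dict_branches = branches_as_dicts(model_branches)
```

Passing `model_branches[b"HEAD"]` straight to `Branch.from_dict` fails with a "not subscriptable" `TypeError`, while the entries of `dict_branches` round-trip cleanly.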
swh/loader/package/archive/loader.py
175

Nevertheless, the extra dict copy is not needed at all; will update.

Build is green

Patch application report for D5789 (id=20693)

Rebasing onto 0e4bb4bbc8...

Current branch diff-target is up to date.
Changes applied before test
commit 584777f3a5d2cdcec8c77ce0642eb54bae887a5e
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Thu May 27 11:47:11 2021 +0200

    package/archive: Add snapshot_append parameter to ArchiveLoader
    
    It enables appending the latest snapshot content of an origin each
    time the loader is invoked.
    
    The purpose is to keep track of all the origin artifacts loaded so
    far in each new visit of the origin.
    
    Closes T3347

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/478/ for more details.

This revision is now accepted and ready to land. May 27 2021, 1:42 PM