Page MenuHomeSoftware Heritage

Add an extid_version field to ExtIDs
ClosedPublic

Authored by olasd on Jul 23 2021, 12:14 PM.

Details

Summary

This allows distinguishing multiple potential versions of the mapping
between external objects and their counterparts archived in Software
Heritage, for instance when a loader has a backwards-incompatible change
that should result in objects being loaded again.

The field defaults to zero, in which case it's backwards-compatible with
the previous implementation in terms of identifier computation.

Related to T3418

Test Plan

new objects added to existing tests

Diff Detail

Repository
rDMOD Data model
Branch
extid-version
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 22736
Build 35456: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 35455: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D6019 (id=21758)

Rebasing onto 153c6e8442...

Current branch diff-target is up to date.
Changes applied before test
commit d026e377b1678eeda2d68a413ddc88259d46102a
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Fri Jul 23 12:09:41 2021 +0200

    Add an extid_version field to ExtIDs
    
    This allows distinguishing multiple potential versions of the mapping
    between external objects and their counterparts archived in Software
    Heritage, for instance when a loader has a backwards-incompatible change
    that should result in objects being loaded again.
    
    The field defaults to zero, in which case it's backwards-compatible with
    the previous implementation in terms of identifier computation.

See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/366/ for more details.

ardumont added a subscriber: ardumont.

lgtm

Relatedly to the need (not this diff), I'm unclear on how to deal with the data
migration now. For current extid clashes (which is the source of the task afaiui), how
can we determine which extid is the oldest one and which are more recent (if more than 2
extids)? [1]

[1] oh how, i'm a pitb sometimes ;)

This revision is now accepted and ready to land.Jul 23 2021, 3:09 PM

Relatedly to the need (not this diff), I'm unclear on how to deal with the data
migration now. For current extid clashes (which is the source of the task afaiui), how
can we determine which extid is the oldest one and which are more recent (if more than 2
extids)? [1]

Maybe we actually won't have anything to do in regards to migration.

We could adapt the loader so it detects the clash. That is, the loader finds more than
one version for the extid but without any extid-version. Then considers them moot [1],
bumps a new extid-version to 1 or something (thus recomputing the hashes for those
origins). And maybe with this, the loaders get unstuck and we are back on track.

[1] because it's anterior to this diff and we can't really detect what's what currently
(see caveats described in the task for the other discarded implementation).

Add backwards-compat test

Relatedly to the need (not this diff), I'm unclear on how to deal with the data
migration now. For current extid clashes (which is the source of the task afaiui), how
can we determine which extid is the oldest one and which are more recent (if more than 2
extids)? [1]

Maybe we actually won't have anything to do in regards to migration.

We could adapt the loader so it detects the clash. That is, the loader finds more than
one version for the extid but without any extid-version. Then considers them moot [1],
bumps a new extid-version to 1 or something (thus recomputing the hashes for those
origins). And maybe with this, the loaders get unstuck and we are back on track.

[1] because it's anterior to this diff and we can't really detect what's what currently
(see caveats described in the task for the other discarded implementation).

Yeah, my intent is to add the version field to storage as defaulting to 0.

The current version of the mercurial loader, which is still pending full deployment, would look for extids of version 1 (for now), which don't exist at all (as no origins have been successfully loaded with this version of the loader yet).

Build is green

Patch application report for D6019 (id=21769)

Rebasing onto 153c6e8442...

Current branch diff-target is up to date.
Changes applied before test
commit 1545ef77e36d4539d372bdb569e02edf1f4b860d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Fri Jul 23 12:09:41 2021 +0200

    Add an extid_version field to ExtIDs
    
    This allows distinguishing multiple potential versions of the mapping
    between external objects and their counterparts archived in Software
    Heritage, for instance when a loader has a backwards-incompatible change
    that should result in objects being loaded again.
    
    The field defaults to zero, in which case it's backwards-compatible with
    the previous implementation in terms of identifier computation.

See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/367/ for more details.

This revision was automatically updated to reflect the committed changes.