Use versioned ExtIDs in loader mercurial implementation
Closed · Public

Authored by ardumont on Jul 28 2021, 10:32 AM.

Details

Summary

For now, this hardcodes the version to 1, both when reading and when storing ExtIDs.

This allows us to:

  • store the new hashes with a version (an ExtID without a version is treated as version 0);
  • keep the old Mercurial loader ExtID references in the archives (there is no need to clean them up, which would cause other problems with the journal);
  • in effect, unblock the current ingestion and updates of existing origins that already have more than one ExtID because of different, incompatible versions.

The storage implementation does not allow filtering on extid_version, so it is up to
the loader to do the filtering. Hence the current implementation.
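
To illustrate the loader-side filtering described above, here is a minimal sketch. It is not the code from this diff: the ExtID class below is a simplified, hypothetical stand-in for the real model object, and EXTID_VERSION and filter_extids are names made up for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

# Version read and written by the loader; hardcoded to 1 as the summary says.
# The constant name is an assumption made for this sketch.
EXTID_VERSION = 1


@dataclass(frozen=True)
class ExtID:
    """Simplified, hypothetical stand-in for the archive's ExtID object."""

    extid_type: str
    extid: bytes
    target: str
    extid_version: int = 0  # no explicit version means version 0


def filter_extids(
    extids: Iterable[ExtID], version: int = EXTID_VERSION
) -> Iterator[ExtID]:
    """Keep only the ExtIDs with the requested version.

    The storage cannot filter on extid_version, so the caller does it.
    """
    return (e for e in extids if e.extid_version == version)


# Two references for the same Mercurial node: a legacy one (version 0)
# and one written with version 1.
stored = [
    ExtID("hg-nodeid", b"\x01" * 20, "swh:1:rev:" + "a" * 40),
    ExtID("hg-nodeid", b"\x01" * 20, "swh:1:rev:" + "b" * 40, extid_version=1),
]
current = list(filter_extids(stored))
```

With this approach the legacy version 0 entries stay in the archives untouched; they are simply ignored when the loader looks up previously stored ExtIDs.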

Related to T3418
Depends on D6040

Test Plan

tox

Diff Detail

Repository
rDLDHG Mercurial loader
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 22783
Build 35525: Phabricator diff pipeline on jenkins (Jenkins console · Jenkins)
Build 35524: arc lint + arc unit

Event Timeline

Build has FAILED

Patch application report for D6036 (id=21818)

Rebasing onto f3232bfd67...

Current branch diff-target is up to date.
Changes applied before test
commit dabeff526263bc1279d02a073b89e6bba3cd831a
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 28 10:22:15 2021 +0200

    Use versioned ExtIDs in loader mercurial implementation
    
    For now this hardcodes the version to 1 be it to read or store extids.
    
    This allows:
    - store the new hashes with a version (actually no version means version 0).
    - to keep the old loader mercurial ExtID references in the archives (no need to clean
    them up as that poses other problems regarding the journal)
    - in effect unblock the current ingestion/updates of existing origins which already have
    more than one ExtIDs due to different incompatible versions.
    
    The storage implementation does not allow filtering on the extid_version so it's up to
    the loader to do the filtering. Hence the current implementation.

Link to build: https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/258/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/258/console

Harbormaster returned this revision to the author for changes because remote builds failed. Jul 28 2021, 10:34 AM
Harbormaster failed remote builds in B22782: Diff 21818!
ardumont edited the summary of this revision.

Build has FAILED

Patch application report for D6036 (id=21818)

Rebasing onto f3232bfd67...

Current branch diff-target is up to date.
Changes applied before test
commit dabeff526263bc1279d02a073b89e6bba3cd831a
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 28 10:22:15 2021 +0200

Link to build: https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/259/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/259/console

Build has FAILED

Patch application report for D6036 (id=21818)

Rebasing onto f3232bfd67...

Current branch diff-target is up to date.
Changes applied before test
commit dabeff526263bc1279d02a073b89e6bba3cd831a
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 28 10:22:15 2021 +0200

Link to build: https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/260/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/260/console

ardumont edited the summary of this revision.

rework commit msg

Build has FAILED

that has nothing to do with the change... /me *sighs*

Build has FAILED

Patch application report for D6036 (id=21819)

Rebasing onto f3232bfd67...

Current branch diff-target is up to date.
Changes applied before test
commit a38377ad29779181a68af736385651a1f7e31d23
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 28 10:22:15 2021 +0200

    Use versioned ExtIDs in loader mercurial implementation
    
    For now this hardcodes the version to 1 be it to read or store extids.
    
    This allows:
    - store the new hashes with a version (actually no version means version 0).
    - to keep the old loader mercurial ExtID references in the archives (no need to clean
      them up as that poses other problems regarding the journal)
    - in effect unblock the current ingestion/updates of existing origins which already have
      more than one ExtIDs due to different incompatible versions.
    
    The storage implementation does not allow filtering on the extid_version so it's up to
    the loader to do the filtering. Hence the current implementation.

Link to build: https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/261/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/261/console

Harbormaster returned this revision to the author for changes because remote builds failed. Jul 28 2021, 10:54 AM
Harbormaster failed remote builds in B22783: Diff 21819!

Build has FAILED

Patch application report for D6036 (id=21825)

Rebasing onto f3232bfd67...

Current branch diff-target is up to date.
Changes applied before test
commit 7d5b76fed2fe734492b5569d081db81ec0b9bd54
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 28 10:22:15 2021 +0200

    Use versioned ExtIDs in main loader mercurial implementation
    
    For now this hardcodes the version to 1 for either reading or writing instructions.
    
    This allows:
    - store the new hashes with a version (actually no version means version 0).
    - to keep the old loader mercurial ExtID references in the archives (no need to clean
      them up as that poses other problems regarding the journal)
    - in effect unblock the current ingestion/updates of existing origins which already have
      more than one ExtIDs due to different incompatible versions.
    
    The storage implementation does not allow filtering on the extid_version so it's up to
    the loader to do the filtering. Hence the current implementation.

Link to build: https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/262/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/262/console

Harbormaster returned this revision to the author for changes because remote builds failed. Jul 28 2021, 11:36 AM
Harbormaster failed remote builds in B22789: Diff 21825!

Build has FAILED

Patch application report for D6036 (id=21825)

Rebasing onto f3232bfd67...

Current branch diff-target is up to date.
Changes applied before test
commit 7d5b76fed2fe734492b5569d081db81ec0b9bd54
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 28 10:22:15 2021 +0200

Link to build: https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/263/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/263/console

ardumont edited the summary of this revision.

Rebase on top of D6040 so the failing test [1] subsides

[1] for the wrong reason

Build has FAILED

Patch application report for D6036 (id=21849)

Could not rebase; Attempt merge onto f3232bfd67...

Updating f3232bf..be7a469
Fast-forward
 swh/loader/mercurial/from_disk.py         | 21 ++++++++++++++++-----
 swh/loader/mercurial/hgutil.py            |  6 ++++--
 swh/loader/mercurial/tests/test_hgutil.py |  6 ++++--
 3 files changed, 24 insertions(+), 9 deletions(-)
Changes applied before test
commit be7a4694d4f23a85115480825add1c933de548d5
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 28 10:22:15 2021 +0200

    Use versioned ExtIDs in main loader mercurial implementation
    
    For now this hardcodes the version to 1 for either reading or writing instructions.
    
    This allows:
    - store the new hashes with a version (actually no version means version 0).
    - to keep the old loader mercurial ExtID references in the archives (no need to clean
      them up as that poses other problems regarding the journal)
    - in effect unblock the current ingestion/updates of existing origins which already have
      more than one ExtIDs due to different incompatible versions.
    
    The storage implementation does not allow filtering on the extid_version so it's up to
    the loader to do the filtering. Hence the current implementation.

commit 61ec738eacfd93de4cccb545dae5de8a65429ee2
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 28 16:35:59 2021 +0200

    test: Make the child process wait longer so it gets actually killed
    
    Prior to this, depending on the load on jenkins, the test could be flaky and fail for
    the wrong reason [1]
    
    [1] https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/263/console

Link to build: https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/273/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/273/console

Harbormaster returned this revision to the author for changes because remote builds failed. Jul 28 2021, 5:32 PM
Harbormaster failed remote builds in B22811: Diff 21849!

Build is green

Patch application report for D6036 (id=21851)

Could not rebase; Attempt merge onto f3232bfd67...

Updating f3232bf..a7a9007
Fast-forward
 swh/loader/mercurial/from_disk.py         | 21 ++++++++++++++++-----
 swh/loader/mercurial/hgutil.py            |  3 ++-
 swh/loader/mercurial/tests/test_hgutil.py |  8 +++++---
 3 files changed, 23 insertions(+), 9 deletions(-)
Changes applied before test
commit a7a90070a9f063a53b048632e5fbff637803d432
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 28 10:22:15 2021 +0200

    Use versioned ExtIDs in main loader mercurial implementation
    
    For now this hardcodes the version to 1 for either reading or writing instructions.
    
    This allows:
    - store the new hashes with a version (actually no version means version 0).
    - to keep the old loader mercurial ExtID references in the archives (no need to clean
      them up as that poses other problems regarding the journal)
    - in effect unblock the current ingestion/updates of existing origins which already have
      more than one ExtIDs due to different incompatible versions.
    
    The storage implementation does not allow filtering on the extid_version so it's up to
    the loader to do the filtering. Hence the current implementation.

commit d93fa51dbd0cb2c7a9e57c3a98c6b96d0cac8d43
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 28 16:35:59 2021 +0200

    test: Make the child process wait longer so it gets actually killed
    
    Prior to this, depending on the load on jenkins, the test could be flaky and fail for
    the wrong reason [1]
    
    [1] https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/263/console

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/275/ for more details.

Ensure the ExtIDs are written with version 1.

This does not yet ensure that we only read ExtIDs with version 1.

Build is green

Patch application report for D6036 (id=21853)

Could not rebase; Attempt merge onto f3232bfd67...

Updating f3232bf..6a9937b
Fast-forward
 swh/loader/mercurial/from_disk.py            | 21 ++++++++++++++++-----
 swh/loader/mercurial/hgutil.py               |  3 ++-
 swh/loader/mercurial/tests/test_from_disk.py | 18 +++++++++++++++++-
 swh/loader/mercurial/tests/test_hgutil.py    |  6 ++++--
 4 files changed, 39 insertions(+), 9 deletions(-)
Changes applied before test
commit 6a9937bb8728118def647e5803433221e7011493
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 28 10:22:15 2021 +0200

    Use versioned ExtIDs in main loader mercurial implementation
    
    For now this hardcodes the version to 1 for either reading or writing instructions.
    
    This allows:
    - store the new hashes with a version (actually no version means version 0).
    - to keep the old loader mercurial ExtID references in the archives (no need to clean
      them up as that poses other problems regarding the journal)
    - in effect unblock the current ingestion/updates of existing origins which already have
      more than one ExtIDs due to different incompatible versions.
    
    The storage implementation does not allow filtering on the extid_version so it's up to
    the loader to do the filtering. Hence the current implementation.

commit cae036cac707c07dc083aa134c59fcd48e3b313c
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 28 16:35:59 2021 +0200

    test: Make the child process wait longer so it gets actually killed
    
    Prior to this, depending on the load on jenkins, the test could be flaky and fail for
    the wrong reason [1]
    
    [1] https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/263/console

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/277/ for more details.

Build is green

Patch application report for D6036 (id=21858)

Rebasing onto cae036cac7...

Current branch diff-target is up to date.
Changes applied before test
commit 9be124af2151f60ea85dab678310cad7b0cdb270
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 28 10:22:15 2021 +0200

    Use versioned ExtIDs in main loader mercurial implementation
    
    For now this hardcodes the version to 1 for either reading or writing instructions.
    
    This allows:
    - store the new hashes with a version (actually no version means version 0).
    - to keep the old loader mercurial ExtID references in the archives (no need to clean
      them up as that poses other problems regarding the journal)
    - in effect unblock the current ingestion/updates of existing origins which already have
      more than one ExtIDs due to different incompatible versions.
    
    The storage implementation does not allow filtering on the extid_version so it's up to
    the loader to do the filtering. Hence the current implementation.

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/278/ for more details.

This revision is now accepted and ready to land. Jul 29 2021, 11:26 AM