Some corrupted repos have missing files or broken logical links in the
underlying Mercurial data structure, which means that they sometimes fail
for a given revision. This does not mean we should throw away the rest
of the repository. (Tested on repos with various levels and flavors of
corruption in the Boatbucket archive.)
Details
- Reviewers: vlorentz
- Group Reviewers: Reviewers
- Commits: rDLDHG888471483a7b: Handle more cases of corruption
Diff Detail
- Repository: rDLDHG Mercurial loader
- Lint: No Linters Available
- Unit: No Unit Test Coverage
- Build Status: Buildable 21060
  Build 32686: Phabricator diff pipeline on jenkins
  Build 32685: arc lint + arc unit
Event Timeline
Build has FAILED
Patch application report for D5627 (id=20067)
Could not rebase; Attempt merge onto f03f274065...
Updating f03f274..5413622
Fast-forward
 swh/loader/mercurial/from_disk.py | 27 +++++++++++++++++++++++----
 swh/loader/mercurial/hgutil.py    |  3 ++-
 swh/loader/mercurial/utils.py     |  3 ++-
 3 files changed, 27 insertions(+), 6 deletions(-)
Changes applied before test
commit 5413622cd6cf1a584d1b9d300b3dd1f5a6e94ce6
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Mon Apr 26 23:33:20 2021 +0200

    Handle more cases of corruption

    Some corrupted repos have missing files or broken logical links in the
    underlying Mercurial data structure, which means that they sometimes fail
    for a given revision. This does not mean we should throw away the rest
    of the repository. (Tested on repos of various levels and flavors of
    corruption in the Boatbucket archive)

commit c6c3b386ef246860e9292ab0331aaf32cf72d61b
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Mon Apr 26 23:28:50 2021 +0200

    Ignore the repository's config

    `HGRCPATH` only tells Mercurial to ignore the user's config files, but
    some repositories have a `.hg/hgrc` file (only in the case that you copy
    the files instead of cloning, if present) that is usually used for
    server-side configuration. We want to ignore this, since it might affect
    loading and ask for hooks that are not there or are otherwise
    annoying/dangerous, for example.

commit 2ec0206482f46491086791b6b8718d5094cb4d77
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Mon Apr 26 23:26:09 2021 +0200

    Also use minimal env in the new Mercurial loader

    The old loader (bundle2 loader) already received this treatment which
    ensures Mercurial doesn't pick up on any user customization, but I
    apparently forgot to apply the same changes to the new one.

commit 1d8b26c042011f3271e451b11b670f37b44e8685
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Tue Apr 27 10:53:56 2021 +0200

    Use billiard instead of stdlib multiprocessing

    This circumvents a few celery-related issues, and is consistent with
    what the rest of the codebase does.
Link to build: https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/207/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/207/console
Build is green
Patch application report for D5627 (id=20080)
Could not rebase; Attempt merge onto f03f274065...
Updating f03f274..d88ab53
Fast-forward
 swh/loader/mercurial/from_disk.py         | 27 +++++++++++++++++++++++----
 swh/loader/mercurial/hgutil.py            | 18 +++++++++++++-----
 swh/loader/mercurial/tests/test_hgutil.py | 11 +++++++----
 swh/loader/mercurial/utils.py             |  3 ++-
 4 files changed, 45 insertions(+), 14 deletions(-)
Changes applied before test
commit d88ab535e1f998136c1b27bf5aca6c585d2440d6
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Mon Apr 26 23:33:20 2021 +0200

    Handle more cases of corruption

    Some corrupted repos have missing files or broken logical links in the
    underlying Mercurial data structure, which means that they sometimes fail
    for a given revision. This does not mean we should throw away the rest
    of the repository. (Tested on repos of various levels and flavors of
    corruption in the Boatbucket archive)

commit 250edbb11b85a62498dde8def39e84367cd3cebb
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Mon Apr 26 23:28:50 2021 +0200

    Ignore the repository's config

    `HGRCPATH` only tells Mercurial to ignore the user's config files, but
    some repositories have a `.hg/hgrc` file (only in the case that you copy
    the files instead of cloning, if present) that is usually used for
    server-side configuration. We want to ignore this, since it might affect
    loading and ask for hooks that are not there or are otherwise
    annoying/dangerous, for example.

commit 457fb88bf36d6c4eedee5e9423f9747e3ea4abf4
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Mon Apr 26 23:26:09 2021 +0200

    Also use minimal env in the new Mercurial loader

    The old loader (bundle2 loader) already received this treatment which
    ensures Mercurial doesn't pick up on any user customization, but I
    apparently forgot to apply the same changes to the new one.

commit a89783c52f2e3c44e08018e0c5fa99f54471f994
Author: Raphaël Gomès <rgomes@octobus.net>
Date:   Tue Apr 27 10:53:56 2021 +0200

    Use billiard instead of stdlib multiprocessing

    This circumvents a few celery-related issues, and is consistent with
    what the rest of the codebase does.
See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/212/ for more details.
Build is green
Patch application report for D5627 (id=20174)
Rebasing onto 888471483a...
First, rewinding head to replay your work on top of it...
Fast-forwarded diff-target to base-revision-222-D5627.
Changes applied before test
See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/222/ for more details.
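For readers unfamiliar with the "minimal env" commit quoted in the patch reports above, the following is a rough sketch of what running Mercurial with a stripped-down environment can look like. It is an illustration only, not the loader's code: `minimal_hg_env` and `run_hg` are hypothetical helpers, and the exact set of variables the loader uses is an assumption. `HGRCPATH` and `HGPLAIN` are documented Mercurial environment variables. As the commit message notes, an empty `HGRCPATH` does not cover the repository's own `.hg/hgrc`, which this sketch does not address.

```python
import os
import subprocess
from typing import Dict, List


def minimal_hg_env() -> Dict[str, str]:
    """Build a small environment for invoking `hg` (hypothetical helper)."""
    env = {
        # Empty value: Mercurial skips user and system config files.
        "HGRCPATH": "",
        # Disable settings that alter Mercurial's default output.
        "HGPLAIN": "1",
    }
    # Keep PATH so the `hg` executable can still be found.
    if "PATH" in os.environ:
        env["PATH"] = os.environ["PATH"]
    return env


def run_hg(args: List[str], cwd: str) -> str:
    """Run `hg` with the minimal environment and return its stdout.

    Assumes an `hg` executable is available on PATH.
    """
    result = subprocess.run(
        ["hg", *args],
        cwd=cwd,
        env=minimal_hg_env(),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

A call such as `run_hg(["log", "-l", "1"], cwd="/path/to/repo")` would then run without picking up the invoking user's configuration (the path is a placeholder).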