This makes sure we don't write corrupt objects to the storage,
like the examples in T75.
Details
- Reviewers
ardumont - Group Reviewers
Reviewers - Maniphest Tasks
- T399: (Re-)Compute data checksums before insertion
T3135: Improve integrity of ingested content - Commits
- rDLDG6d7a998b1093: converters: Recompute hashes and check they match the originals
Diff Detail
- Repository
- rDLDG Git loader
- Branch
- master
- Lint
No Linters Available - Unit
No Unit Test Coverage - Build Status
Buildable 23684 Build 36971: Phabricator diff pipeline on jenkins Jenkins console · Jenkins Build 36970: arc lint + arc unit
Event Timeline
Build has FAILED
Patch application report for D6281 (id=22747)
Could not rebase; Attempt merge onto 85eb54015c...
Updating 85eb540..3558c5f Fast-forward requirements.txt | 1 + swh/loader/git/converters.py | 74 ++++++++++++++++---- swh/loader/git/tests/test_converters.py | 115 ++++++++++++++++++++++++++++---- 3 files changed, 164 insertions(+), 26 deletions(-)
Changes applied before test
commit 3558c5f5aaf1ba9f36e1544cb1719ebeb848aada
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Sep 16 14:01:20 2021 +0200
converters: Recompute hashes and check they match the originals
This makes sure we don't write corrupt objects to the storage,
like the examples in T75.
commit f413e171b6e0ffd703613089bdfbee24dc048bda
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Sep 16 13:39:59 2021 +0200
converters: Add typingLink to build: https://jenkins.softwareheritage.org/job/DLDG/job/tests-on-diff/106/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDG/job/tests-on-diff/106/console
Build has FAILED
Patch application report for D6281 (id=22749)
Could not rebase; Attempt merge onto 85eb54015c...
Updating 85eb540..4f17deb Fast-forward requirements.txt | 1 + swh/loader/git/converters.py | 74 ++++++++++++++++---- swh/loader/git/tests/test_converters.py | 115 ++++++++++++++++++++++++++++---- 3 files changed, 164 insertions(+), 26 deletions(-)
Changes applied before test
commit 4f17deb0037c7108c3e56344331014882b60475a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Sep 16 14:01:20 2021 +0200
converters: Recompute hashes and check they match the originals
This makes sure we don't write corrupt objects to the storage,
like the examples in T75.
commit f413e171b6e0ffd703613089bdfbee24dc048bda
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Sep 16 13:39:59 2021 +0200
converters: Add typingLink to build: https://jenkins.softwareheritage.org/job/DLDG/job/tests-on-diff/107/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDG/job/tests-on-diff/107/console
Build has FAILED
Patch application report for D6281 (id=22749)
Could not rebase; Attempt merge onto 85eb54015c...
Updating 85eb540..4f17deb Fast-forward requirements.txt | 1 + swh/loader/git/converters.py | 74 ++++++++++++++++---- swh/loader/git/tests/test_converters.py | 115 ++++++++++++++++++++++++++++---- 3 files changed, 164 insertions(+), 26 deletions(-)
Changes applied before test
commit 4f17deb0037c7108c3e56344331014882b60475a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Sep 16 14:01:20 2021 +0200
converters: Recompute hashes and check they match the originals
This makes sure we don't write corrupt objects to the storage,
like the examples in T75.
commit f413e171b6e0ffd703613089bdfbee24dc048bda
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Sep 16 13:39:59 2021 +0200
converters: Add typingLink to build: https://jenkins.softwareheritage.org/job/DLDG/job/tests-on-diff/108/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDG/job/tests-on-diff/108/console
Build is green
Patch application report for D6281 (id=22752)
Could not rebase; Attempt merge onto 85eb54015c...
Updating 85eb540..a34cbe5 Fast-forward requirements-swh.txt | 2 +- requirements.txt | 1 + swh/loader/git/converters.py | 74 ++++++++++++++++---- swh/loader/git/tests/test_converters.py | 115 ++++++++++++++++++++++++++++---- 4 files changed, 165 insertions(+), 27 deletions(-)
Changes applied before test
commit a34cbe5196eab14b98d311ae4eda3ab590c18675
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Sep 16 14:01:20 2021 +0200
converters: Recompute hashes and check they match the originals
This makes sure we don't write corrupt objects to the storage,
like the examples in T75.
commit f413e171b6e0ffd703613089bdfbee24dc048bda
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Sep 16 13:39:59 2021 +0200
converters: Add typingSee https://jenkins.softwareheritage.org/job/DLDG/job/tests-on-diff/109/ for more details.
lgtm, a couple of questions/remarks inline.
| swh/loader/git/converters.py | ||
|---|---|---|
| 45 | I have a hard time reading this ^ chunk (line 37 to 45). Does that mean something like, all functions decorated with check_ids must respect the signature (self, obj: ShaFile, log=None) -> _THashable when called upon? (That plus what the decorator actually does, checking for mismatched checksums). Now that i said it, it actually makes sense to me... | |
| swh/loader/git/tests/test_converters.py | ||
| 116–120 | why not ^ ? or even simpler: def sha(self):
return lambda: self._sha? | |
| swh/loader/git/tests/test_converters.py | ||
|---|---|---|
| 116–120 | because that's not what dulwich expects. | |
Build is green
Patch application report for D6281 (id=22763)
Could not rebase; Attempt merge onto 85eb54015c...
Updating 85eb540..61e2d2c Fast-forward requirements-swh.txt | 2 +- requirements.txt | 1 + swh/loader/git/converters.py | 64 +++++++++++++----- swh/loader/git/tests/test_converters.py | 115 ++++++++++++++++++++++++++++---- 4 files changed, 152 insertions(+), 30 deletions(-)
Changes applied before test
commit 61e2d2c1f1ce043e6bed9b9b22a4b065150835b4
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Sep 16 14:01:20 2021 +0200
converters: Recompute hashes and check they match the originals
This makes sure we don't write corrupt objects to the storage,
like the examples in T75.
commit f413e171b6e0ffd703613089bdfbee24dc048bda
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Sep 16 13:39:59 2021 +0200
converters: Add typingSee https://jenkins.softwareheritage.org/job/DLDG/job/tests-on-diff/110/ for more details.
Build is green
Patch application report for D6281 (id=22764)
Could not rebase; Attempt merge onto 85eb54015c...
Updating 85eb540..6d7a998 Fast-forward requirements-swh.txt | 2 +- swh/loader/git/converters.py | 64 +++++++++++++----- swh/loader/git/tests/test_converters.py | 115 ++++++++++++++++++++++++++++---- 3 files changed, 151 insertions(+), 30 deletions(-)
Changes applied before test
commit 6d7a998b1093aaba83d0660188fbe48c0ed5b269
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Sep 16 14:01:20 2021 +0200
converters: Recompute hashes and check they match the originals
This makes sure we don't write corrupt objects to the storage,
like the examples in T75.
commit f413e171b6e0ffd703613089bdfbee24dc048bda
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Sep 16 13:39:59 2021 +0200
converters: Add typingSee https://jenkins.softwareheritage.org/job/DLDG/job/tests-on-diff/111/ for more details.
| swh/loader/git/converters.py | ||
|---|---|---|
| 36 | why not make check_id return the object? return check_id(Directory(id=tree.sha().digest(), entries=tuple(entries),)) | |
| swh/loader/git/converters.py | ||
|---|---|---|
| 36 | because it's weird and forces line wrapping when constructing Revision | |
| swh/loader/git/tests/test_converters.py | ||
|---|---|---|
| 116–120 | ah! | |