The deduplication logic of 'person' objects is an internal detail of
storage backends, so it's better not to rely on it.
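As an illustration of the point above, here is a minimal sketch of a test helper that counts stored objects per type while leaving out 'person'. The function name `count_objects_by_type`, the `(object_type, object_id)` pairs, and the `BACKEND_DEPENDENT_TYPES` constant are all hypothetical and are not taken from the swh codebase.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical: object types whose counts depend on how a backend deduplicates.
BACKEND_DEPENDENT_TYPES = frozenset({"person"})


def count_objects_by_type(objects: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count distinct (type, id) pairs per type, skipping backend-dependent types."""
    seen = set()
    for object_type, object_id in objects:
        if object_type in BACKEND_DEPENDENT_TYPES:
            continue  # do not expose a count the test should not rely on
        seen.add((object_type, object_id))

    counts: Dict[str, int] = {}
    for object_type, _ in seen:
        counts[object_type] = counts.get(object_type, 0) + 1
    return counts
```

With this shape, an assertion on the returned dict stays valid whether or not a given backend merges identical 'person' entries.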
Details
- Reviewers: olasd
- Group Reviewers: Reviewers
- Commits: rDLDBASE64922781b0a1: tests.get_stats: Don't return a 'person' count.
Diff Detail
- Repository: rDLDBASE Generic VCS/Package Loader
- Lint: Automatic diff as part of commit; lint not applicable.
- Unit: Automatic diff as part of commit; unit tests not applicable.
Event Timeline
Build is green
Patch application report for D3977 (id=14011)
Could not rebase; Attempt merge onto fbe906c0c9...
Merge made by the 'recursive' strategy.
 swh/loader/core/loader.py                        | 17 ++++++++---------
 swh/loader/package/archive/tests/test_archive.py |  4 ----
 swh/loader/package/cran/tests/test_cran.py       |  2 --
 swh/loader/package/debian/tests/test_debian.py   |  3 ---
 swh/loader/package/deposit/tests/test_deposit.py |  3 ---
 swh/loader/package/nixguix/tests/test_nixguix.py |  3 ---
 swh/loader/package/npm/tests/test_npm.py         |  4 ----
 swh/loader/package/pypi/tests/test_pypi.py       |  7 -------
 swh/loader/tests/__init__.py                     |  1 -
 9 files changed, 8 insertions(+), 36 deletions(-)
Changes applied before test
commit 722eeac96ddb5391ed0b5d88592517dab7723dd2
Merge: fbe906c 60553db
Author: Jenkins user <jenkins@localhost>
Date:   Thu Sep 17 12:49:17 2020 +0000

    Merge branch 'diff-target' into HEAD

commit 60553dbd4b12d1990a284c4ac952ab846189177f
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Sep 17 14:48:15 2020 +0200

    tests.get_stats: Don't return a 'person' count.

    The deduplication logic of 'person' objects is an internal detail of
    storage backends, so it's better not to rely on it.

commit 46485fbe943b110a75196236dbae3da31263b755
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Sep 17 14:31:22 2020 +0200

    loader: Stop materializing full lists of objects to be stored.

    Since 43728c596498979cd5083b61e93360b4c2071c31, store_data consumes
    the entire iterator of contents, and since
    3b97703d7f14e145d6124f1c61f5f283ee8eecf2, it does the same for other
    object types.

    This causes all the (new) objects of the loaded repository to be
    loaded in memory at the same time before being sent to the storage,
    which can cause OOM errors.

    Instead, with this commit, objects are added one by one to the
    storage, which restores the lazy behavior we had before these two
    commits using the buffered storage proxy.
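The difference between materializing an iterator and feeding objects one by one into a buffering layer, as described in the last commit message above, can be sketched roughly as follows. This is an assumption-laden illustration, not the loader's code: `BufferedSink`, `generate_objects`, and `store_lazily` are hypothetical names, and integers stand in for archive objects.

```python
from typing import Iterable, Iterator, List


class BufferedSink:
    """Hypothetical stand-in for a buffering storage proxy (not the swh API)."""

    def __init__(self, flush_threshold: int) -> None:
        self.flush_threshold = flush_threshold
        self.buffer: List[int] = []
        self.flushed_batches: List[List[int]] = []
        self.max_buffered = 0  # peak number of objects held in memory at once

    def add(self, obj: int) -> None:
        self.buffer.append(obj)
        self.max_buffered = max(self.max_buffered, len(self.buffer))
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        if self.buffer:
            self.flushed_batches.append(self.buffer)
            self.buffer = []


def generate_objects(n: int) -> Iterator[int]:
    """Lazily produce n placeholder objects."""
    for i in range(n):
        yield i


def store_lazily(objects: Iterable[int], sink: BufferedSink) -> None:
    """Add objects one at a time, so only the sink's buffer is held in memory."""
    for obj in objects:
        sink.add(obj)
    sink.flush()  # send whatever remains below the threshold


sink = BufferedSink(flush_threshold=3)
store_lazily(generate_objects(10), sink)
```

In this sketch the sink never holds more than `flush_threshold` objects, whereas calling `list(generate_objects(10))` first would hold all of them at once before anything is sent.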
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/295/ for more details.
Build is green
Patch application report for D3977 (id=14027)
Rebasing onto 7b2c80e708...
Current branch diff-target is up to date.
Changes applied before test
commit 64922781b0a19c6a0b2a54ba79818c3a7bd65b6a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Sep 17 14:48:15 2020 +0200

    tests.get_stats: Don't return a 'person' count.

    The deduplication logic of 'person' objects is an internal detail of
    storage backends, so it's better not to rely on it.
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/296/ for more details.