The deduplication logic of 'person' objects is an internal detail of
storage backends, so it's better not to rely on it.
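As a rough illustration of the change, a stats helper for tests can simply leave `person` out of the keys it reports. The sketch below is a minimal stand-in, not the actual `tests.get_stats` code: the `STAT_KEYS` list, the plain-dict `counters` argument, and the sample values are all assumptions made for the example.

```python
from typing import Dict, Optional

# Hypothetical list of object types reported to tests; 'person' is
# deliberately absent because its count depends on backend deduplication.
STAT_KEYS = [
    "content",
    "directory",
    "origin",
    "origin_visit",
    "release",
    "revision",
    "skipped_content",
    "snapshot",
]


def get_stats(counters: Dict[str, int]) -> Dict[str, Optional[int]]:
    """Return only the counters that tests can compare across backends."""
    return {key: counters.get(key) for key in STAT_KEYS}


# Made-up counters, standing in for whatever a storage backend reports.
backend_counters = {"content": 2, "revision": 1, "person": 3}
stats = get_stats(backend_counters)
```

With this shape, expected-stats dictionaries in tests no longer need a `person` entry, so they stay valid even if a backend deduplicates persons differently.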
Details
- Reviewers: olasd
- Group Reviewers: Reviewers
- Commits: rDLDBASE64922781b0a1: tests.get_stats: Don't return a 'person' count.
Diff Detail
- Repository: rDLDBASE Generic VCS/Package Loader
- Lint: Automatic diff as part of commit; lint not applicable.
- Unit: Automatic diff as part of commit; unit tests not applicable.
Event Timeline
Build is green
Patch application report for D3977 (id=14011)
Could not rebase; Attempt merge onto fbe906c0c9...
Merge made by the 'recursive' strategy.
 swh/loader/core/loader.py | 17 ++++++++---------
 swh/loader/package/archive/tests/test_archive.py | 4 ----
 swh/loader/package/cran/tests/test_cran.py | 2 --
 swh/loader/package/debian/tests/test_debian.py | 3 ---
 swh/loader/package/deposit/tests/test_deposit.py | 3 ---
 swh/loader/package/nixguix/tests/test_nixguix.py | 3 ---
 swh/loader/package/npm/tests/test_npm.py | 4 ----
 swh/loader/package/pypi/tests/test_pypi.py | 7 -------
 swh/loader/tests/__init__.py | 1 -
 9 files changed, 8 insertions(+), 36 deletions(-)
Changes applied before test
commit 722eeac96ddb5391ed0b5d88592517dab7723dd2
Merge: fbe906c 60553db
Author: Jenkins user <jenkins@localhost>
Date: Thu Sep 17 12:49:17 2020 +0000
Merge branch 'diff-target' into HEAD
commit 60553dbd4b12d1990a284c4ac952ab846189177f
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Sep 17 14:48:15 2020 +0200
tests.get_stats: Don't return a 'person' count.
The deduplication logic of 'person' objects is an internal detail of
storage backends, so it's better not to rely on it.
commit 46485fbe943b110a75196236dbae3da31263b755
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Sep 17 14:31:22 2020 +0200
loader: Stop materializing full lists of objects to be stored.
Since 43728c596498979cd5083b61e93360b4c2071c31, store_data consumes the entire iterator
of contents, and since 3b97703d7f14e145d6124f1c61f5f283ee8eecf2, it does the same for
other object types.
This causes all the (new) objects of the loaded repository to be loaded
in memory at the same time before being sent to the storage, which can
cause OOM errors.
Instead, with this commit, objects are added one by one to the storage,
which restores the lazy behavior we had before these two commits using
the buffered storage proxy.
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/295/ for more details.
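The memory behavior described in the "Stop materializing full lists" commit message can be sketched as follows. This is only an illustration under stated assumptions: `BufferedStore`, `new_objects`, and `store_lazily` are hypothetical names invented for the example, not the loader's or the storage proxy's real API, and integers stand in for the loaded objects.

```python
from typing import Iterable, Iterator, List


class BufferedStore:
    """Hypothetical stand-in for a buffering storage proxy."""

    def __init__(self, batch_size: int) -> None:
        self.batch_size = batch_size
        self.buffer: List[int] = []
        self.flushed = 0       # total objects sent to the backend
        self.max_buffered = 0  # peak number of objects held in memory

    def add(self, obj: int) -> None:
        self.buffer.append(obj)
        self.max_buffered = max(self.max_buffered, len(self.buffer))
        # Send a batch as soon as the buffer is full.
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        self.flushed += len(self.buffer)
        self.buffer.clear()


def new_objects(n: int) -> Iterator[int]:
    """Lazily yield the objects discovered while loading a repository."""
    yield from range(n)


def store_lazily(store: BufferedStore, objects: Iterable[int]) -> None:
    # Objects are handed over one by one; the iterator is never
    # turned into a full list, so only one batch lives in memory.
    for obj in objects:
        store.add(obj)
    store.flush()


store = BufferedStore(batch_size=100)
store_lazily(store, new_objects(10_000))
```

In this sketch the peak number of buffered objects is bounded by `batch_size` rather than by the size of the repository, whereas calling `list(new_objects(10_000))` first would hold every object in memory at once.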
Build is green
Patch application report for D3977 (id=14027)
Rebasing onto 7b2c80e708...
Current branch diff-target is up to date.
Changes applied before test
commit 64922781b0a19c6a0b2a54ba79818c3a7bd65b6a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Sep 17 14:48:15 2020 +0200
tests.get_stats: Don't return a 'person' count.
The deduplication logic of 'person' objects is an internal detail of
storage backends, so it's better not to rely on it.
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/296/ for more details.