Duplicated entries are now filtered by a SELECT DISTINCT clause.
Details
- Reviewers: olasd
- Group Reviewers: Reviewers
- Commits: rDPROV3a2f11aadb7d: Fix direct sql query for directories to the archive
Diff Detail
- Repository: rDPROV Provenance database
- Lint: Automatic diff as part of commit; lint not applicable.
- Unit: Automatic diff as part of commit; unit tests not applicable.
Event Timeline
Comment Actions
LGTM.
As suggested on IRC, if you want to exercise that code, you could run `update directory set file_entries = file_entries || file_entries, dir_entries = dir_entries || dir_entries` after seeding the storage, which would replicate the duplication that seems to have happened in the production storage.
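For illustration only, here is a minimal sketch of the effect under discussion, using an in-memory SQLite database instead of the real PostgreSQL storage. The `directory_entry` table and its columns are hypothetical stand-ins (the real schema stores entries in array columns), and the self-insert is only a rough analogue of the `||` self-concatenation suggested above.

```python
import sqlite3

# Hypothetical, simplified schema: one row per (directory, entry) reference.
# The real storage keeps entry ids in array columns; this is only an analogue.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE directory_entry (dir_id TEXT, entry_id INTEGER);
    INSERT INTO directory_entry VALUES ('d1', 1), ('d1', 2), ('d1', 3);
    """
)

# Rough analogue of `file_entries = file_entries || file_entries`:
# every entry reference ends up stored twice.
conn.execute(
    "INSERT INTO directory_entry SELECT dir_id, entry_id FROM directory_entry"
)

# Without DISTINCT, the duplicated references leak into the result.
plain = conn.execute(
    "SELECT entry_id FROM directory_entry WHERE dir_id = ? ORDER BY entry_id",
    ("d1",),
).fetchall()

# With DISTINCT, each entry is returned once.
distinct = conn.execute(
    "SELECT DISTINCT entry_id FROM directory_entry WHERE dir_id = ? ORDER BY entry_id",
    ("d1",),
).fetchall()

print(len(plain), len(distinct))
```

A test built this way would fail on the query without `DISTINCT` and pass on the fixed one, which is the point of seeding the duplication first.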
Comment Actions
Build is green
Patch application report for D6991 (id=25354)
Could not rebase; Attempt merge onto cc7401096d...
Updating cc74010..128d173
Fast-forward
 requirements-swh.txt                 |  1 +
 swh/provenance/__init__.py           |  9 ++++++-
 swh/provenance/archive.py            |  3 +--
 swh/provenance/postgresql/archive.py | 14 +++++------
 swh/provenance/storage/archive.py    |  2 +-
 swh/provenance/swhgraph/__init__.py  |  0
 swh/provenance/swhgraph/archive.py   | 46 ++++++++++++++++++++++++++++++++++++
 7 files changed, 64 insertions(+), 11 deletions(-)
 create mode 100644 swh/provenance/swhgraph/__init__.py
 create mode 100644 swh/provenance/swhgraph/archive.py
Changes applied before test
commit 128d1734974798536f0716a213a2f0982a1f785e
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jan 20 17:51:18 2022 +0100

    Fix direct sql query for directories to the archive

    Duplicated entries are now filtered by a `SELECT DISTINCT` clause.

commit 846427ea1ce130e1e3d9fd62c40154ff587bbace
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jan 20 16:08:32 2022 +0100

    Add partial implementation of `ArchiveGraph` class

commit eebf1f7889f1c9072ba8b8c8d0325d151b1ff014
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jan 20 15:35:43 2022 +0100

    Remove ordered result constrain from `snapshot_get_heads`

    It is not require anymore after simplifying the origin-revision layer algorithm.
See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/562/ for more details.
Comment Actions
Build is green
Patch application report for D6991 (id=25355)
Rebasing onto cc7401096d...
Current branch diff-target is up to date.
Changes applied before test
commit 3a2f11aadb7d32d1ab8caa5c96d1fa2ea2b5f852
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jan 20 17:51:18 2022 +0100

    Fix direct sql query for directories to the archive

    Duplicated entries are now filtered by a `SELECT DISTINCT` clause.
See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/563/ for more details.