Use origin URLs for skipped_content['origin'] instead of origin ids.
Closed, Public

Authored by vlorentz on Sep 30 2019, 11:17 AM.

Details

Reviewers

douardda
ardumont
olasd

Group Reviewers

Reviewers

Commits

rDSTOCe2393243e07f: Use origin URLs for skipped_content['origin'] instead of origin ids.
rDSTOe2393243e07f: Use origin URLs for skipped_content['origin'] instead of origin ids.

Summary

This commit uses origin URLs *instead of* origin IDs, not in addition to them.
Supporting IDs should no longer be needed.
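
As a rough illustration of the change, here is a minimal sketch of a skipped-content record whose `origin` field holds a URL. The field names other than `origin`, all of the values, and the helper `origin_is_url` are assumptions made for this example; they are not taken from the diff.

```python
# Illustrative record only; every value here is invented.
skipped_content = {
    "sha1": bytes.fromhex("00" * 20),
    "length": 1024,
    "status": "absent",
    "reason": "Content too large",
    # With this change, the origin is referenced by its URL
    # rather than by an integer id such as 42.
    "origin": "https://example.org/user/repo",
}


def origin_is_url(record):
    """Hypothetical check: True if `origin` is a string, not an integer id."""
    return isinstance(record.get("origin"), str)


is_url = origin_is_url(skipped_content)
is_id = origin_is_url({"origin": 42})
```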

Diff Detail

Repository

rDSTO Storage manager

Branch

skipped-content-origin-url

Lint

No Linters Available

Unit

No Unit Test Coverage

Build Status

Buildable 7998
Build 11528: tox-on-jenkins	Jenkins
Build 11527: arc lint + arc unit

Event Timeline

vlorentz created this revision. Sep 30 2019, 11:17 AM

Build is green
See https://jenkins.softwareheritage.org/job/DSTO/job/tox/651/ for more details.

Harbormaster completed remote builds in B7997: Diff 6856. Sep 30 2019, 11:21 AM

I guess you only really need the hunk in the postgres storage? What is the in-memory storage change trying to achieve?

swh/storage/in_memory.py
76	the argument should probably be renamed `content_and_origins`
140	`contents_and_origins` :P
143	Surely that only works because we only add contents from a single origin at a time; after the filtering, `skipped_content_missing` and `origins` aren't the same length any more. You really need to pass the full content to `skipped_content_missing`, then do the content/origin splitting. Which, in addition to the double-zipping, makes me wonder if that's really the right way to go at all.
155–158	Could you turn this into a for loop? This isn't very readable.
190	`content_and_origins`?
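
The length mismatch described in the comment on line 143 can be illustrated with a minimal sketch. The function names and data below are hypothetical stand-ins, not the actual code in swh/storage/in_memory.py; they only show why zipping a filtered list against an unfiltered one can mispair items, and how filtering the (content, origin) pairs together avoids that.

```python
# Hypothetical stand-ins; not the actual swh.storage code.
def skipped_content_missing(contents, known):
    """Yield the contents whose id is not already known."""
    for content in contents:
        if content["id"] not in known:
            yield content


def pair_after_filtering(contents, origins, known):
    # Fragile: the filtered list may be shorter than `origins`,
    # so zip() can pair a content with the wrong origin.
    missing = list(skipped_content_missing(contents, known))
    return [(c["id"], o) for c, o in zip(missing, origins)]


def pair_before_filtering(contents, origins, known):
    # Safer: keep each content with its origin, then filter the pairs.
    return [
        (c["id"], o)
        for c, o in zip(contents, origins)
        if c["id"] not in known
    ]


contents = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
origins = [
    "https://example.org/1",
    "https://example.org/2",
    "https://example.org/3",
]
known = {"a"}  # content "a" is already stored, so it gets filtered out

wrong = pair_after_filtering(contents, origins, known)
right = pair_before_filtering(contents, origins, known)
```

When every content in a call comes from the same origin, both versions give the same pairs, which would explain why the problem did not show up in practice.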

This revision now requires changes to proceed. Sep 30 2019, 11:38 AM

In D2040#47240, @olasd wrote:

I guess you only really need the hunk in the postgres storage? What is the in-memory storage change trying to achieve?

You're right. We don't need to store it in the in-mem storage for now.

Remove most of the changes from the in-mem storage; we don't need to store those.

Build is green
See https://jenkins.softwareheritage.org/job/DSTO/job/tox/652/ for more details.

Harbormaster completed remote builds in B7998: Diff 6857. Sep 30 2019, 11:49 AM

olasd accepted this revision. Sep 30 2019, 12:01 PM

This revision is now accepted and ready to land. Sep 30 2019, 12:01 PM

Closed by commit rDSTOe2393243e07f: Use origin URLs for skipped_content['origin'] instead of origin ids. (authored by vlorentz). Sep 30 2019, 12:02 PM

This revision was automatically updated to reflect the committed changes.

vlorentz added a commit: rDSTOe2393243e07f: Use origin URLs for skipped_content['origin'] instead of origin ids.

vlorentz added a commit: rDSTOCe2393243e07f: Use origin URLs for skipped_content['origin'] instead of origin ids. Oct 30 2019, 5:21 PM