Page MenuHomeSoftware Heritage

Add an env var to the in-mem storage to disable origin ids.
ClosedPublic

Authored by vlorentz on Jul 4 2019, 5:00 PM.

Details

Summary

If it is true (the default), there is no change from the current behavior,
but setting it to False and running another package's tests allows
to make sure that package does not rely on origin ids at all. (eg. see D1691)

Diff Detail

Repository
rDSTO Storage manager
Branch
switch-disable-origin-ids
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 6710
Build 9373: tox-on-jenkinsJenkins
Build 9372: arc lint + arc unit

Event Timeline

ardumont added a subscriber: ardumont.

sounds fine.

a couple of suggestion to improve readability.

swh/storage/in_memory.py
1146

why not use the same instruction as below?

return origin_id if ENABLE_ORIGIN_IDS else origin['url']
1160

Just a note outside the scope of this diff.
If we aren't using those (fetch_history_{start,end}) anymore (you removed the call from the loader-core IIRC), this should go away as well.

1334

simplify this a bit:

key = 'id' if ENABLE_ORIGIN_IDS else 'url'
visit['origin'] = self._origins[visit['origin']['url']][key]
return visit
This revision is now accepted and ready to land.Jul 5 2019, 2:35 PM
  • make it an env var
  • simplify _convert_visit
vlorentz retitled this revision from Add a global boolean to the in-mem storage to disable origin ids. to Add an env var to the in-mem storage to disable origin ids..Jul 8 2019, 10:56 AM
vlorentz edited the summary of this revision. (Show Details)
vlorentz added inline comments.
swh/storage/in_memory.py
1146

I used the ternary to make tests shorter, but in the main code I prefer readability.
I don't have a strong opinion on this, though.

This revision was landed with ongoing or failed builds.Jul 8 2019, 1:53 PM
This revision was automatically updated to reflect the committed changes.