
Manipulate origin URLs instead of origin ids.
Needs Review · Public

Authored by vlorentz on Fri, Jun 7, 4:23 PM.

Details

Reviewers
douardda
Group Reviewers
Reviewers
Summary

Depends on D1559 (allows querying the storage without an origin id).

Diff Detail

Repository
rDCIDX Object indexer
Branch
origin-urls
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 6221
Build 8593: tox-on-jenkins (Jenkins)
Build 8592: arc lint + arc unit

Event Timeline

vlorentz created this revision. Fri, Jun 7, 4:23 PM
douardda requested changes to this revision. Thu, Jun 13, 2:26 PM
douardda added a subscriber: douardda.
douardda added inline comments.
swh/indexer/origin_head.py
46–49

Why change the behavior of the head selection mechanism here?

Why can't we know at this point which type the processed origin is?

In any case, if this try-based solution is now mandatory, it would be nice to have it encapsulated in a generic self.get_head() method, IMHO.
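
A minimal sketch of the kind of encapsulation suggested here, assuming the try-based approach amounts to attempting each type-specific head lookup in turn. The class name, the `_try_get_*` helpers, and the snapshot layout are hypothetical and not taken from swh/indexer/origin_head.py:

```python
from typing import Dict, Optional

# Hypothetical snapshot layout: branch name -> {"target": ..., "target_type": ...}
Snapshot = Dict[str, Dict[str, str]]


class HeadFinder:
    """Hypothetical helper, not the actual indexer class."""

    def _try_get_vcs_head(self, snapshot: Snapshot) -> Optional[str]:
        # Assumption: VCS-like snapshots expose a HEAD or master branch.
        for name in ("HEAD", "refs/heads/master"):
            branch = snapshot[name] if name in snapshot else None
            if branch is not None and branch["target_type"] == "revision":
                return branch["target"]
        return None

    def _try_get_archive_head(self, snapshot: Snapshot) -> Optional[str]:
        # Assumption: archive-like snapshots use "releases/<version>" branches;
        # the lexicographically last one is taken as the head.
        names = sorted(n for n in snapshot if n.startswith("releases/"))
        return snapshot[names[-1]]["target"] if names else None

    def get_head(self, snapshot: Snapshot) -> Optional[str]:
        """Try each type-specific lookup in turn and return the first hit."""
        for attempt in (self._try_get_vcs_head, self._try_get_archive_head):
            try:
                head = attempt(snapshot)
            except KeyError:
                # Malformed branch for this strategy: fall through to the next.
                continue
            if head is not None:
                return head
        return None
```

With this shape, callers only see `get_head()`, and the fallback order lives in one place.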

This revision now requires changes to proceed. Thu, Jun 13, 2:26 PM
vlorentz planned changes to this revision. Thu, Jun 13, 2:41 PM
vlorentz added inline comments.
swh/indexer/origin_head.py
46–49

Why can't we know at this point which type the processed origin is?

That requires a new API endpoint in the storage, but indeed, we could (and should).

vlorentz added inline comments. Fri, Jun 14, 10:34 AM
swh/indexer/origin_head.py
46–49
vlorentz updated this revision to Diff 5247. Fri, Jun 14, 3:05 PM
  • Drop origin ids from tests as well
  • Use new-style snapshot_add in the tests (long overdue!)
  • Use origin_visit_get_latest instead of snapshot_get_latest (which is deprecated), in order to know the visit type.
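
A rough sketch of how knowing the visit type could replace the try-based selection, under stated assumptions: `FakeStorage` is a stand-in written for this example, and its `origin_visit_get_latest` signature, the returned dict shape, and the strategy mapping are all hypothetical rather than the real storage API:

```python
from typing import Dict, List, Optional


class FakeStorage:
    """Stand-in for the storage; signature and return shape are assumptions."""

    def __init__(self, visits: Dict[str, List[dict]]) -> None:
        self._visits = visits

    def origin_visit_get_latest(self, origin_url: str) -> Optional[dict]:
        # Return the most recent visit recorded for this origin URL, if any.
        visits = self._visits.get(origin_url, [])
        return visits[-1] if visits else None


# Hypothetical mapping from visit type to a head-selection strategy name.
STRATEGY_BY_VISIT_TYPE = {"git": "vcs", "hg": "vcs", "ftp": "archive"}


def head_strategy_for(storage: FakeStorage, origin_url: str) -> Optional[str]:
    """Pick a head-selection strategy from the latest visit's type."""
    visit = storage.origin_visit_get_latest(origin_url)
    if visit is None:
        return None
    return STRATEGY_BY_VISIT_TYPE.get(visit["type"])
```

The lookup is keyed on the origin URL only, which matches the goal of this revision of not relying on origin ids.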
vlorentz updated this revision to Diff 5248. Fri, Jun 14, 3:08 PM

Add missing aliases.