relational exports: add ID field to origin table
Closed, Public

Authored by seirl on Apr 14 2022, 4:34 PM.

Details

Summary

The origin table now contains the origin URL and a sha1 of the URL as
the "ID" field. This makes it easier to join the table with the SWHIDs
retrieved from the compressed graph, and lets us generate the edge
dataset without having to compute sha1s manually.
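
As an illustration of what the new "ID" field holds, here is a minimal
sketch of deriving it from an origin URL. It assumes the ID is the
hex-encoded sha1 of the UTF-8 bytes of the URL and that the matching
graph node is written as swh:1:ori:<id>. The function names and the
example URL are hypothetical and are not taken from this diff.

```python
import hashlib


def origin_id(url: str) -> str:
    """Return the hex sha1 of the origin URL (assumed content of the "ID" field)."""
    return hashlib.sha1(url.encode("utf-8")).hexdigest()


def origin_swhid(url: str) -> str:
    """Build the origin SWHID string that the ID is expected to join against."""
    return f"swh:1:ori:{origin_id(url)}"


if __name__ == "__main__":
    # Hypothetical origin URL, used only for illustration.
    url = "https://example.org/user/repo.git"
    print(origin_id(url))
    print(origin_swhid(url))
```

With the ID stored in the table, a join against graph output only needs
to strip or prepend the swh:1:ori: prefix instead of hashing every URL
again.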

Diff Detail

Repository
rDDATASET Datasets
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D7585 (id=27465)

Rebasing onto 075b3c3068...

Current branch diff-target is up to date.
Changes applied before test
commit 9f342d9994aaaa406c4f586fe46803c8e60b850d
Author: Antoine Pietri <antoine.pietri1@gmail.com>
Date:   Thu Apr 14 14:11:18 2022 +0000

    relational exports: add ID field to origin table
    
    The origin table now contains the origin URL and a sha1 of the URL as
    the "ID" field. Allows us to join this table more easily with the SWHIDs
    retrieved from the compressed graph, as well as generate the edge
    dataset without having to compute sha1s manually.

See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/129/ for more details.

seirl requested review of this revision. Apr 14 2022, 4:37 PM
This revision is now accepted and ready to land. Apr 14 2022, 5:43 PM