luigi: Add swh-dataset version to export/meta.json
Closed · Public

Authored by vlorentz on Nov 21 2022, 3:52 PM.

Details

Summary

May be useful later, for traceability.

Depends on D8865.
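
As a rough illustration of the idea in the summary, the sketch below records the version of the installed package alongside the rest of the export metadata. It is not the code from this diff: the `tool` key, the `write_meta` and `package_version` helpers, and the `swh.dataset` distribution name are assumptions made for the example, and the real layout of `meta.json` may differ.

```python
import json
from importlib.metadata import PackageNotFoundError, version
from pathlib import Path
from typing import Optional


def package_version(name: str) -> Optional[str]:
    """Return the installed version of ``name``, or None if it is not installed."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def write_meta(export_dir: Path, meta: dict, package: str = "swh.dataset") -> Path:
    """Write meta.json under ``export_dir`` with a (hypothetical) ``tool`` record."""
    meta = dict(meta)  # copy, so the caller's dict is not mutated
    # Assumed key name; the actual field used by swh-dataset is not shown here.
    meta["tool"] = {"name": package, "version": package_version(package)}
    export_dir.mkdir(parents=True, exist_ok=True)
    path = export_dir / "meta.json"
    path.write_text(json.dumps(meta, indent=2))
    return path
```

Reading the version from the installed distribution's metadata, rather than hard-coding it, keeps the recorded value in step with whatever release actually produced the export.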

Diff Detail

Repository
rDDATASET Datasets
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 32885
Build 51537: Phabricator diff pipeline on jenkins (Jenkins console · Jenkins)
Build 51536: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D8866 (id=31957)

Could not rebase; Attempt merge onto 23853dbfac...

Updating 23853db..3bf294f
Fast-forward
 swh/dataset/luigi.py | 139 ++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 98 insertions(+), 41 deletions(-)
Changes applied before test
commit 3bf294f9beb6f94ea947ed9338c1cd061f4555ba
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Nov 21 15:52:39 2022 +0100

    luigi: Add swh-dataset version to export/meta.json
    
    May be useful later, for traceability.

commit e4df585f8dd66aa3bca0be967ef79cd6fa8a7c0a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Nov 21 15:47:03 2022 +0100

    luigi: Add LocalExport task
    
    It allows other packages (e.g. swh-graph) to depend on the presence of the local
    dataset, with a configurable way to obtain it if missing.

commit b39436e38be5fefe16c92d6553845cd113bafd14
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Nov 21 15:42:33 2022 +0100

    luigi: Remove copies of stamp files to/from S3
    
    They are only useful while exporting the dataset -- after the export is
    finished, meta.json is good enough and stamp files only save a couple
    of minutes when only some object types are needed (i.e. never in practice).

See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/167/ for more details.

This revision is now accepted and ready to land. (Nov 21 2022, 4:44 PM)
This revision was landed with ongoing or failed builds. (Nov 24 2022, 4:12 PM)
This revision was automatically updated to reflect the committed changes.

Build is green

Patch application report for D8866 (id=32025)

Could not rebase; Attempt merge onto 23853dbfac...

Updating 23853db..9a194f4
Fast-forward
 swh/dataset/luigi.py | 139 ++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 98 insertions(+), 41 deletions(-)
Changes applied before test
commit 9a194f421be314658e4ee39d38173d4125d17d11
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Nov 21 15:52:39 2022 +0100

    luigi: Add swh-dataset version to export/meta.json
    
    May be useful later, for traceability.

commit 0bf9c88d9604184b55735541b890797a890a9182
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Nov 21 15:47:03 2022 +0100

    luigi: Add LocalExport task
    
    It allows other packages (e.g. swh-graph) to depend on the presence of the local
    dataset, with a configurable way to obtain it if missing.

commit b39436e38be5fefe16c92d6553845cd113bafd14
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Nov 21 15:42:33 2022 +0100

    luigi: Remove copies of stamp files to/from S3
    
    They are only useful while exporting the dataset -- after the export is
    finished, meta.json is good enough and stamp files only save a couple
    of minutes when only some object types are needed (i.e. never in practice).

See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/169/ for more details.