luigi: Add swh-dataset version to export/meta.json
Closed, Public

Authored by vlorentz on Mon, Nov 21, 3:52 PM.

Details

Summary

Recording the swh-dataset version in export/meta.json may be useful later, for traceability.

Depends on D8865.
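
As a rough illustration of the idea (not the code in this diff), the sketch below writes the installed version of a package into a `meta.json` file. The helper names `tool_version` and `write_meta`, the `"tool"`/`"name"`/`"version"` keys, and the `"unknown"` fallback are all assumptions made for the example; the actual layout of export/meta.json is not shown on this page.

```python
import json
from importlib.metadata import PackageNotFoundError, version
from pathlib import Path


def tool_version(package: str) -> str:
    """Return the installed version of `package`, or "unknown" if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return "unknown"


def write_meta(export_dir: Path, package: str = "swh.dataset") -> dict:
    """Write a meta.json recording which tool version produced the export.

    The key names used here are hypothetical, chosen only for this sketch.
    """
    meta = {"tool": {"name": package, "version": tool_version(package)}}
    export_dir.mkdir(parents=True, exist_ok=True)
    (export_dir / "meta.json").write_text(json.dumps(meta, indent=2))
    return meta
```

Reading the file back with `json.loads` then tells a later consumer which version produced a given export, which is the traceability benefit the summary refers to.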

Diff Detail

Repository
rDDATASET Datasets
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D8866 (id=31957)

Could not rebase; Attempt merge onto 23853dbfac...

Updating 23853db..3bf294f
Fast-forward
 swh/dataset/luigi.py | 139 ++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 98 insertions(+), 41 deletions(-)
Changes applied before test
commit 3bf294f9beb6f94ea947ed9338c1cd061f4555ba
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Nov 21 15:52:39 2022 +0100

    luigi: Add swh-dataset version to export/meta.json
    
    May be useful later, for traceability.

commit e4df585f8dd66aa3bca0be967ef79cd6fa8a7c0a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Nov 21 15:47:03 2022 +0100

    luigi: Add LocalExport task
    
    It allows other packages (e.g. swh-graph) to depend on the presence of the local
    dataset, with a configurable way to obtain it if missing.

commit b39436e38be5fefe16c92d6553845cd113bafd14
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Nov 21 15:42:33 2022 +0100

    luigi: Remove copies of stamp files to/from S3
    
    They are only useful while exporting the dataset -- after the export is
    finished, meta.json is good enough and stamp files only save a couple
    of minutes when only some object types are needed (i.e. never in practice).

See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/167/ for more details.

This revision is now accepted and ready to land. (Mon, Nov 21, 4:44 PM)
This revision was landed with ongoing or failed builds. (Thu, Nov 24, 4:12 PM)
This revision was automatically updated to reflect the committed changes.

Build is green

Patch application report for D8866 (id=32025)

Could not rebase; Attempt merge onto 23853dbfac...

Updating 23853db..9a194f4
Fast-forward
 swh/dataset/luigi.py | 139 ++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 98 insertions(+), 41 deletions(-)
Changes applied before test
commit 9a194f421be314658e4ee39d38173d4125d17d11
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Nov 21 15:52:39 2022 +0100

    luigi: Add swh-dataset version to export/meta.json
    
    May be useful later, for traceability.

commit 0bf9c88d9604184b55735541b890797a890a9182
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Nov 21 15:47:03 2022 +0100

    luigi: Add LocalExport task
    
    It allows other packages (e.g. swh-graph) to depend on the presence of the local
    dataset, with a configurable way to obtain it if missing.

commit b39436e38be5fefe16c92d6553845cd113bafd14
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Nov 21 15:42:33 2022 +0100

    luigi: Remove copies of stamp files to/from S3
    
    They are only useful while exporting the dataset -- after the export is
    finished, meta.json is good enough and stamp files only save a couple
    of minutes when only some object types are needed (i.e. never in practice).

See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/169/ for more details.