This will allow running swh-graph tasks easily on machines that didn't
export the graph themselves.
Details
- Reviewers
  - ardumont
- Group Reviewers
  - Reviewers
- Commits
  - rDDATASET23853dbfacd4: luigi: Add DownloadFromS3 task
Diff Detail
- Repository
  - rDDATASET Datasets
- Lint
  - Automatic diff as part of commit; lint not applicable.
- Unit
  - Automatic diff as part of commit; unit tests not applicable.
Event Timeline
Build is green
Patch application report for D8832 (id=31839)
Could not rebase; Attempt merge onto 058e568492...
Updating 058e568..6640b70
Fast-forward
 swh/dataset/luigi.py | 117 +++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 104 insertions(+), 13 deletions(-)
Changes applied before test
commit 6640b70bbffae25113af73e06ecafde8ef4a779a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Nov 10 16:41:12 2022 +0100

    luigi: Add DownloadFromS3 task

    This will allow running swh-graph tasks easily on machines that didn't
    export the graph themselves.

commit 418cf1837e26e48f529f79afc4613d55a14060cf
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Nov 10 16:39:13 2022 +0100

    luigi: Make Format and ObjectType public

    Other tasks will import them in order to depend on tasks defined here
See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/162/ for more details.
one question inline.
swh/dataset/luigi.py | ||
---|---|---|
436 | Will that trigger an upload-to-S3 task? If so, I gather this task is triggered from the appropriate node (with data, S3 access, etc.), right? |
swh/dataset/luigi.py | ||
---|---|---|
425 | What does that do? |
swh/dataset/luigi.py | ||
---|---|---|
436 | Luigi tasks cannot be triggered remotely. This requirement means that the task will run if it is properly configured; if it is not, the whole workflow will fail. |
Build is green
Patch application report for D8832 (id=31869)
Could not rebase; Attempt merge onto 058e568492...
Updating 058e568..23853db
Fast-forward
 swh/dataset/luigi.py | 117 +++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 104 insertions(+), 13 deletions(-)
Changes applied before test
commit 23853dbfacd49aba0da526023e736a42cf4c328a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Nov 10 16:41:12 2022 +0100

    luigi: Add DownloadFromS3 task

    This will allow running swh-graph tasks easily on machines that didn't
    export the graph themselves.

commit 418cf1837e26e48f529f79afc4613d55a14060cf
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Nov 10 16:39:13 2022 +0100

    luigi: Make Format and ObjectType public

    Other tasks will import them in order to depend on tasks defined here
See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/163/ for more details.
swh/dataset/luigi.py | ||
---|---|---|
425 | significant=False means the parameter is not part of the task's identity: if two tasks are called with different values for this parameter but equal values for all others, Luigi will consider them the same task, treat one of them as redundant, and not run it. |
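
To illustrate the point about `significant=False`, here is a minimal standard-library sketch of the idea. It is a toy model, not Luigi's implementation: `Param`, `ToyTask`, `Download`, `schedule`, the `parallelism` parameter, and the dataset names are all hypothetical and do not come from this diff. In real Luigi code the parameter would be declared as `luigi.Parameter(significant=False)`.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class Param:
    """Hypothetical stand-in for a task parameter declaration."""
    significant: bool = True


class ToyTask:
    """Toy model: a task's identity is built from significant parameters only."""

    params: Dict[str, Param] = {}

    def __init__(self, **values: Any) -> None:
        self.values = values

    @property
    def task_id(self) -> Tuple[Any, ...]:
        # Insignificant parameters are skipped, so they cannot
        # distinguish two otherwise identical tasks.
        significant = tuple(
            (name, self.values[name])
            for name, decl in sorted(self.params.items())
            if decl.significant
        )
        return (type(self).__name__,) + significant


class Download(ToyTask):
    params = {
        "dataset": Param(),
        "parallelism": Param(significant=False),  # hypothetical tuning knob
    }


def schedule(tasks: List[ToyTask]) -> List[ToyTask]:
    """Keep the first task seen for each task_id; later ones are redundant."""
    unique: Dict[Tuple[Any, ...], ToyTask] = {}
    for task in tasks:
        unique.setdefault(task.task_id, task)
    return list(unique.values())


a = Download(dataset="2022-11-10", parallelism=4)
b = Download(dataset="2022-11-10", parallelism=16)  # differs only in parallelism
c = Download(dataset="2022-12-07", parallelism=4)   # differs in a significant value
to_run = schedule([a, b, c])
```

In this model `a` and `b` share a `task_id`, so only one of them is kept, while `c` is scheduled separately. This is why parameters that only affect how a task runs, and not what it produces, are good candidates for `significant=False`.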