This will allow running swh-graph tasks easily on machines that didn't
export the graph themselves.
Details
- Reviewers: ardumont
- Group Reviewers: Reviewers
- Commits: rDDATASET23853dbfacd4: luigi: Add DownloadFromS3 task
Diff Detail
- Repository: rDDATASET Datasets
- Lint: Automatic diff as part of commit; lint not applicable.
- Unit: Automatic diff as part of commit; unit tests not applicable.
Event Timeline
Build is green
Patch application report for D8832 (id=31839)
Could not rebase; Attempt merge onto 058e568492...
Updating 058e568..6640b70
Fast-forward
 swh/dataset/luigi.py | 117 +++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 104 insertions(+), 13 deletions(-)
Changes applied before test
commit 6640b70bbffae25113af73e06ecafde8ef4a779a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Nov 10 16:41:12 2022 +0100
luigi: Add DownloadFromS3 task
This will allow running swh-graph tasks easily on machines that didn't
export the graph themselves.
commit 418cf1837e26e48f529f79afc4613d55a14060cf
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Nov 10 16:39:13 2022 +0100
luigi: Make Format and ObjectType public
Other tasks will import them in order to depend on tasks defined here
See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/162/ for more details.
one question inline.
| swh/dataset/luigi.py | ||
|---|---|---|
| 436 | Will that trigger an upload-to-S3 task? If so, I gather this task is triggered from the appropriate node (with the data, S3 access, etc.), right? | |
| swh/dataset/luigi.py | ||
|---|---|---|
| 425 | What does that do? | |
| swh/dataset/luigi.py | ||
|---|---|---|
| 436 | Luigi tasks cannot be triggered remotely. This requirement means that the task will run if it is properly configured, or the whole workflow will fail if it is not. | |
Build is green
Patch application report for D8832 (id=31869)
Could not rebase; Attempt merge onto 058e568492...
Updating 058e568..23853db
Fast-forward
 swh/dataset/luigi.py | 117 +++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 104 insertions(+), 13 deletions(-)
Changes applied before test
commit 23853dbfacd49aba0da526023e736a42cf4c328a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Nov 10 16:41:12 2022 +0100
luigi: Add DownloadFromS3 task
This will allow running swh-graph tasks easily on machines that didn't
export the graph themselves.
commit 418cf1837e26e48f529f79afc4613d55a14060cf
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Nov 10 16:39:13 2022 +0100
luigi: Make Format and ObjectType public
Other tasks will import them in order to depend on tasks defined here
See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/163/ for more details.
| swh/dataset/luigi.py | ||
|---|---|---|
| 425 | significant=False means that if two tasks are called with different values for this parameter but equal values for all others, then Luigi will consider one of the tasks to be redundant and won't run it. | |