Details
- Reviewers
ardumont vlorentz - Group Reviewers
Reviewers - Maniphest Tasks
- T1724: Maven Central repository support
- Commits
- rDLDBASE89f5ccc7f5fc: loader: add new maven-jar loader
Diff Detail
- Repository
- rDLDBASE Generic VCS/Package Loader
- Branch
- T1724_maven_central_jar_loader
- Lint
No Linters Available - Unit
No Unit Test Coverage - Build Status
Buildable 25179 Build 39339: Phabricator diff pipeline on jenkins Jenkins console · Jenkins Build 39338: arc lint + arc unit
Event Timeline
- maven-loader: set the pom as metadata
- maven-loader: fix after final review (ardumont, vlorentz)
Build is green
Patch application report for D6396 (id=24339)
Rebasing onto 021880d030...
First, rewinding head to replay your work on top of it... Applying: loader: add new maven-jar loader Using index info to reconstruct a base tree... M docs/package-loader-specifications.rst Falling back to patching base and 3-way merge... Auto-merging docs/package-loader-specifications.rst Applying: maven-loader: set the pom as metadata Applying: maven-loader: fix after final review (ardumont, vlorentz)
Changes applied before test
commit 4d5e059e98901c3e24b4c36446bdf969f371ecbd Author: Boris Baldassari <boris@chrysalice.org> Date: Sun Nov 28 12:27:32 2021 +0100 maven-loader: fix after final review (ardumont, vlorentz) commit 08050bafb353f2567381a98e14cf82acf03db6f9 Author: Boris Baldassari <boris@chrysalice.org> Date: Sat Nov 27 14:15:09 2021 +0100 maven-loader: set the pom as metadata maven-loader: fix after final review (ardumont, vlorentz) commit f98171102d2bf541d9d3963d0ac7b23017bb77bb Author: Boris Baldassari <boris@chrysalice.org> Date: Sun Aug 29 22:14:59 2021 +0200 loader: add new maven-jar loader The maven loader loads jar and zip files as Maven artefacts into the software heritage archive. Note: Supersedes D6158 and addresses the review done in that diff. Related to T1724
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/626/ for more details.
Thanks for the last adaptations.
I have just one last question to discuss with valentin but i don't think that's blocking in the end.
swh/loader/package/maven/loader.py | ||
---|---|---|
80 | @vlorentz Do we allow such side-effects [1] to occur this early in the package loading process? I guess it's fine since it's called at the get_package_info time... [1] any form of interactions with the outside world (db, remote service, file, whatever can raise any form of issues...) |
swh/loader/package/maven/loader.py | ||
---|---|---|
80 | It's not too early; but this is outside any exception handling, which means a crash here (and they *will* happen since it's a network request) won't update the visit status, so it must be done somewhere else, eg. by specializing build_extrinsic_directory_metadata. | |
162–164 | ping? | |
swh/loader/package/maven/tests/test_maven.py | ||
318–342 | thanks! |
Build has FAILED
Patch application report for D6396 (id=24363)
Rebasing onto 021880d030...
First, rewinding head to replay your work on top of it... Applying: loader: add new maven-jar loader Using index info to reconstruct a base tree... M docs/package-loader-specifications.rst Falling back to patching base and 3-way merge... Auto-merging docs/package-loader-specifications.rst Applying: maven-loader: set the pom as metadata Applying: maven-loader: fix after final review (ardumont, vlorentz) Applying: maven-loader: fix metadata for completeness
Changes applied before test
commit 9c6b24c34f8b295d4bb827fe8e708ec0aae43662 Author: Boris Baldassari <boris@chrysalice.org> Date: Mon Nov 29 15:17:00 2021 +0100 maven-loader: fix metadata for completeness commit 0c2836237f89ebfda915b7c53388deac59a6e1e3 Author: Boris Baldassari <boris@chrysalice.org> Date: Sun Nov 28 12:27:32 2021 +0100 maven-loader: fix after final review (ardumont, vlorentz) commit 9803dbb0ff470447ea52f8ff622215445d15c910 Author: Boris Baldassari <boris@chrysalice.org> Date: Sat Nov 27 14:15:09 2021 +0100 maven-loader: set the pom as metadata maven-loader: fix after final review (ardumont, vlorentz) commit 7de48dffc1f7c04e6c18838278d4d471d86c9ca5 Author: Boris Baldassari <boris@chrysalice.org> Date: Sun Aug 29 22:14:59 2021 +0200 loader: add new maven-jar loader The maven loader loads jar and zip files as Maven artefacts into the software heritage archive. Note: Supersedes D6158 and addresses the review done in that diff. Related to T1724
Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/627/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/627/console
Build has FAILED
Patch application report for D6396 (id=24365)
Rebasing onto 021880d030...
First, rewinding head to replay your work on top of it... Applying: loader: add new maven-jar loader Using index info to reconstruct a base tree... M docs/package-loader-specifications.rst Falling back to patching base and 3-way merge... Auto-merging docs/package-loader-specifications.rst Applying: maven-loader: set the pom as metadata Applying: maven-loader: fix after final review (ardumont, vlorentz) Applying: maven-loader: fix metadata for completeness
Changes applied before test
commit 07583d1ba11f4ffbbe885841e2663b78efc6b694 Author: Boris Baldassari <boris@chrysalice.org> Date: Mon Nov 29 15:17:00 2021 +0100 maven-loader: fix metadata for completeness commit 7931894264347cee1f4c7c7143dd6efc90f7fb6a Author: Boris Baldassari <boris@chrysalice.org> Date: Sun Nov 28 12:27:32 2021 +0100 maven-loader: fix after final review (ardumont, vlorentz) commit 32ce31242b76ef65635b6b9af47cdbe389a382c1 Author: Boris Baldassari <boris@chrysalice.org> Date: Sat Nov 27 14:15:09 2021 +0100 maven-loader: set the pom as metadata maven-loader: fix after final review (ardumont, vlorentz) commit 66ac1d3f9f8a906d48920a827d702b9c6c45bf21 Author: Boris Baldassari <boris@chrysalice.org> Date: Sun Aug 29 22:14:59 2021 +0200 loader: add new maven-jar loader The maven loader loads jar and zip files as Maven artefacts into the software heritage archive. Note: Supersedes D6158 and addresses the review done in that diff. Related to T1724
Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/628/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/628/console
Build has FAILED
Patch application report for D6396 (id=24371)
Rebasing onto 021880d030...
First, rewinding head to replay your work on top of it... Applying: loader: add new maven-jar loader Using index info to reconstruct a base tree... M docs/package-loader-specifications.rst Falling back to patching base and 3-way merge... Auto-merging docs/package-loader-specifications.rst Applying: maven-loader: set the pom as metadata Applying: maven-loader: fix after final review (ardumont, vlorentz) Applying: maven-loader: fix metadata for completeness Applying: maven-loader: fix metadata for pom and json
Changes applied before test
commit a115b6abb4b73a212efb23484843e1955dedf5c5 Author: Boris Baldassari <boris@chrysalice.org> Date: Mon Nov 29 16:04:00 2021 +0100 maven-loader: fix metadata for pom and json commit f52ce3c27f444bc96814d51d1ab49a86abc21d49 Author: Boris Baldassari <boris@chrysalice.org> Date: Mon Nov 29 15:17:00 2021 +0100 maven-loader: fix metadata for completeness commit 4ec8cf818af490b422bb4c2fa2654d1ff78ebd15 Author: Boris Baldassari <boris@chrysalice.org> Date: Sun Nov 28 12:27:32 2021 +0100 maven-loader: fix after final review (ardumont, vlorentz) commit d9d7b66afdfd571ba88bc20749d23f1158c742fe Author: Boris Baldassari <boris@chrysalice.org> Date: Sat Nov 27 14:15:09 2021 +0100 maven-loader: set the pom as metadata maven-loader: fix after final review (ardumont, vlorentz) commit bb87b06a85be0a1dddd68907a027311aba3935f9 Author: Boris Baldassari <boris@chrysalice.org> Date: Sun Aug 29 22:14:59 2021 +0200 loader: add new maven-jar loader The maven loader loads jar and zip files as Maven artefacts into the software heritage archive. Note: Supersedes D6158 and addresses the review done in that diff. Related to T1724
Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/629/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/629/console
- loader: add new maven-jar loader
- maven-loader: set the pom as metadata
- maven-loader: fix after final review (ardumont, vlorentz)
- maven-loader: fix metadata for completeness
- maven-loader: fix metadata for pom and json
Build has FAILED
Patch application report for D6396 (id=24372)
Rebasing onto 021880d030...
Current branch diff-target is up to date.
Changes applied before test
commit a9e0076483e6d44f3ab5600fc8c6f9caecf1d3c8 Author: Boris Baldassari <boris@chrysalice.org> Date: Mon Nov 29 16:04:00 2021 +0100 maven-loader: fix metadata for pom and json commit 4f65c899bb66dc3457c603bfd8e32937fb93b7ba Author: Boris Baldassari <boris@chrysalice.org> Date: Mon Nov 29 15:17:00 2021 +0100 maven-loader: fix metadata for completeness commit 52fcafa36a3cf3834d3e8f3f278e8f40d128000e Author: Boris Baldassari <boris@chrysalice.org> Date: Sun Nov 28 12:27:32 2021 +0100 maven-loader: fix after final review (ardumont, vlorentz) commit 8ab8d11531953ca00cb6f58c7fd117ef5cdcd482 Author: Boris Baldassari <boris@chrysalice.org> Date: Sat Nov 27 14:15:09 2021 +0100 maven-loader: set the pom as metadata maven-loader: fix after final review (ardumont, vlorentz) commit 6f6dc40d66c21f183522861aaae792de7332d2a5 Author: Boris Baldassari <boris@chrysalice.org> Date: Sun Aug 29 22:14:59 2021 +0200 loader: add new maven-jar loader The maven loader loads jar and zip files as Maven artefacts into the software heritage archive. Note: Supersedes D6158 and addresses the review done in that diff. Related to T1724
Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/630/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/630/console
Many thanks ;)
I guess you "only" need to fix the build now ;)
swh/loader/package/maven/loader.py | ||
---|---|---|
97 | *thumbs up* |
Build has FAILED
Patch application report for D6396 (id=24379)
Rebasing onto 021880d030...
Current branch diff-target is up to date.
Changes applied before test
commit 0021e0cf7f8239a600e2cb0601d37c70bcb3af2e Author: Boris Baldassari <boris@chrysalice.org> Date: Wed Dec 1 12:56:31 2021 +0100 maven-loader: move request.get to own function commit a9e0076483e6d44f3ab5600fc8c6f9caecf1d3c8 Author: Boris Baldassari <boris@chrysalice.org> Date: Mon Nov 29 16:04:00 2021 +0100 maven-loader: fix metadata for pom and json commit 4f65c899bb66dc3457c603bfd8e32937fb93b7ba Author: Boris Baldassari <boris@chrysalice.org> Date: Mon Nov 29 15:17:00 2021 +0100 maven-loader: fix metadata for completeness commit 52fcafa36a3cf3834d3e8f3f278e8f40d128000e Author: Boris Baldassari <boris@chrysalice.org> Date: Sun Nov 28 12:27:32 2021 +0100 maven-loader: fix after final review (ardumont, vlorentz) commit 8ab8d11531953ca00cb6f58c7fd117ef5cdcd482 Author: Boris Baldassari <boris@chrysalice.org> Date: Sat Nov 27 14:15:09 2021 +0100 maven-loader: set the pom as metadata maven-loader: fix after final review (ardumont, vlorentz) commit 6f6dc40d66c21f183522861aaae792de7332d2a5 Author: Boris Baldassari <boris@chrysalice.org> Date: Sun Aug 29 22:14:59 2021 +0200 loader: add new maven-jar loader The maven loader loads jar and zip files as Maven artefacts into the software heritage archive. Note: Supersedes D6158 and addresses the review done in that diff. Related to T1724
Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/631/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/631/console
- loader: add new maven-jar loader
- maven-loader: set the pom as metadata
- maven-loader: fix after final review (ardumont, vlorentz)
- maven-loader: fix metadata for completeness
- maven-loader: fix metadata for pom and json
- maven-loader: move request.get to own function
- maven-loader: fix tests for unordered function call
Build has FAILED
Patch application report for D6396 (id=24380)
Rebasing onto 8a3df1376d...
Current branch diff-target is up to date.
Changes applied before test
commit 72514d26ae4de277015c284c307217e04f3619db Author: Boris Baldassari <boris@chrysalice.org> Date: Wed Dec 1 13:36:30 2021 +0100 maven-loader: fix tests for unordered function call commit 03ba42a29b5e1002311c629988f8bedf1aa66c05 Author: Boris Baldassari <boris@chrysalice.org> Date: Wed Dec 1 12:56:31 2021 +0100 maven-loader: move request.get to own function commit 7dbc0c13162d77c1914c8667b4d871baf451586c Author: Boris Baldassari <boris@chrysalice.org> Date: Mon Nov 29 16:04:00 2021 +0100 maven-loader: fix metadata for pom and json commit 5f9980709be4051db9267086a062ee7925e67db8 Author: Boris Baldassari <boris@chrysalice.org> Date: Mon Nov 29 15:17:00 2021 +0100 maven-loader: fix metadata for completeness commit 88d27d09654f080db5e316c4504d204931c353a4 Author: Boris Baldassari <boris@chrysalice.org> Date: Sun Nov 28 12:27:32 2021 +0100 maven-loader: fix after final review (ardumont, vlorentz) commit 247959d4d602edb0005b606ba472fb5ce3f94041 Author: Boris Baldassari <boris@chrysalice.org> Date: Sat Nov 27 14:15:09 2021 +0100 maven-loader: set the pom as metadata maven-loader: fix after final review (ardumont, vlorentz) commit 9a15f5064e29adfd8d3dd40169d9eeceb271d6e6 Author: Boris Baldassari <boris@chrysalice.org> Date: Sun Aug 29 22:14:59 2021 +0200 loader: add new maven-jar loader The maven loader loads jar and zip files as Maven artefacts into the software heritage archive. Note: Supersedes D6158 and addresses the review done in that diff. Related to T1724
Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/632/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/632/console
Build has FAILED
Patch application report for D6396 (id=24381)
Rebasing onto 8a3df1376d...
Current branch diff-target is up to date.
Changes applied before test
commit 7d878099945890d2f83d6cffdbc28204078918bb Author: Boris Baldassari <boris@chrysalice.org> Date: Wed Dec 1 13:41:20 2021 +0100 maven-loader: fix tests for unordered function call commit 72514d26ae4de277015c284c307217e04f3619db Author: Boris Baldassari <boris@chrysalice.org> Date: Wed Dec 1 13:36:30 2021 +0100 maven-loader: fix tests for unordered function call commit 03ba42a29b5e1002311c629988f8bedf1aa66c05 Author: Boris Baldassari <boris@chrysalice.org> Date: Wed Dec 1 12:56:31 2021 +0100 maven-loader: move request.get to own function commit 7dbc0c13162d77c1914c8667b4d871baf451586c Author: Boris Baldassari <boris@chrysalice.org> Date: Mon Nov 29 16:04:00 2021 +0100 maven-loader: fix metadata for pom and json commit 5f9980709be4051db9267086a062ee7925e67db8 Author: Boris Baldassari <boris@chrysalice.org> Date: Mon Nov 29 15:17:00 2021 +0100 maven-loader: fix metadata for completeness commit 88d27d09654f080db5e316c4504d204931c353a4 Author: Boris Baldassari <boris@chrysalice.org> Date: Sun Nov 28 12:27:32 2021 +0100 maven-loader: fix after final review (ardumont, vlorentz) commit 247959d4d602edb0005b606ba472fb5ce3f94041 Author: Boris Baldassari <boris@chrysalice.org> Date: Sat Nov 27 14:15:09 2021 +0100 maven-loader: set the pom as metadata maven-loader: fix after final review (ardumont, vlorentz) commit 9a15f5064e29adfd8d3dd40169d9eeceb271d6e6 Author: Boris Baldassari <boris@chrysalice.org> Date: Sun Aug 29 22:14:59 2021 +0200 loader: add new maven-jar loader The maven loader loads jar and zip files as Maven artefacts into the software heritage archive. Note: Supersedes D6158 and addresses the review done in that diff. Related to T1724
Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/633/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/633/console
Build is green
Patch application report for D6396 (id=24382)
Rebasing onto 8a3df1376d...
Current branch diff-target is up to date.
Changes applied before test
commit 813378833670c41fa777b5cf91a97de0fa1dab07 Author: Boris Baldassari <boris@chrysalice.org> Date: Sun Aug 29 22:14:59 2021 +0200 loader: add new maven-jar loader The maven loader loads jar and zip files as Maven artefacts into the software heritage archive. Note: Supersedes D6158 and addresses the review done in that diff. Related to T1724
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/634/ for more details.
@borisbaldassari Can you please push your commit in master?
So that will land the diff ;)
I'd love to, but I get rejected by the commit hook as a non-developer.
I can't remember how I did for the lister, I think it went fine.. What am I doing wrong?
(swh) boris@debian:swh-loader-core$ git status On branch master Your branch is ahead of 'origin/master' by 1 commit. (use "git push" to publish your local commits) nothing to commit, working tree clean (swh) boris@debian:swh-loader-core$ git push origin master Enumerating objects: 30, done. Counting objects: 100% (30/30), done. Delta compression using up to 3 threads Compressing objects: 100% (21/21), done. Writing objects: 100% (22/22), 22.43 KiB | 3.74 MiB/s, done. Total 22 (delta 10), reused 0 (delta 0) remote: *** PUSH REJECTED BY COMMIT HOOK *** remote: remote: This push was rejected by Herald push rule H16. remote: Change: commit/ remote: Rule: Non-developers require code review for pushing content remote: Reason: needs review remote: Transcript: https://forge.softwareheritage.org/herald/transcript/348895/ remote: To forge.softwareheritage.org:/source/swh-loader-core.git ! [remote rejected] master -> master (pre-receive hook declined) error: failed to push some refs to 'git@forge.softwareheritage.org:/source/swh-loader-core.git'
docs/package-loader-specifications.rst | ||
---|---|---|
63 | Changed both to "Synthetic release for maven-loader at {p_info.url}\n". | |
63 | Ok, no problem. Will do that Thursday (I'm busy until then). | |
65 | Yes, I understand that specifications are supposed to be in sync with the code. I had changed the specs to use swh_person in order to be aligned with the maven loader and with other loaders (e.g. deposit) to ensure consistency. Don't understand what happened. Anyway, I've set the empty person in the loader, reverted my last change in the spec, and now everything is empty_author. | |
swh/loader/package/maven/loader.py | ||
40 | Yes, now it's a string. Updated as suggested to use datetime. | |
45–46 | I think I can make it an int. | |
55–56 | As per the Lucene docs, it is the last update time of the file. Fixed. | |
57 | Interesting. It was initially copied from deposit (which i used as a template to start with). Removed it. | |
78–79 | No, we don't have it at the lister level either. And we are not sure the URL exists. This is what I had written in another comment -- but I have the feeling that some of my answers are lost, which does not help. We have no reliable way to know the POM; this is a heuristic that should work most of the time. | |
83 | neat! | |
84 | As stated above, yes. This is what I explained in another, now invisible answer. Let me look for it. Ah, actually it has simply moved, it's right below: So we can choose between having the XML pom (complete, but not reliable) or having the JSON (incomplete, but always provided). | |
102 | We don't have the original POM file around when loading the jar, and I don't see a reliable way to get it. We might use the (gId, aId, version) coordinates to try to download a pom and set it as metadata, but I'm not sure how often it would work. What do you prefer? I've set the a_metadata full info instead of the extract for now. | |
116 | See next comment, yes. We introduced last_update as a datetime, but the time parameter passed to the loader is an int. And it stil has the same meaning, i.e. last modification time on the server (as defined in the lucene index at least). Do you think it's not? | |
121 | Yes, it still is a timestamp. We changed the type to int, but the content is basically the same. | |
149–150 | Hum. Interesting, thanks for pointing that out. I thought it was somehow used by the loading mechanism to register its state, but on a second look it's probably just a carryover from the example I took in the archive loader. Removed. | |
162–164 | No, the lister is expected to always provide the version (alongside the gid and aid). If you're wondering why having this loop, it's because it's more elegant than artefact[0] and it feels more safe. But I can change it. (Note that there is only one artefact in the artifact list, as identified by the url.) | |
162–164 | I don't know. I'm not sure to understand what you mean or expect. What I know:
To answer your initial comment, when the loader is invoked there is indeed at most one package (whatever the gid, aid, version). Is there a specific case/situation you have in mind? | |
169 | Moved that next to the parsing step. I don't think it should be put in the lister, as what is yielded by the lister is already a UTC time. This line simply sets the loader local var as a UTC time to be interpreted correctly. | |
176 | Hum. No, I don't think so. The last_update parameter has been added as a datetime for the scheduler, but the time parameter passed to the loader is still an int. Am I missing something? | |
206 | I've updated the documentation to use "SWH robot", for consistency with the deposit and archive loaders (which I've used intensively as examples). | |
206 | As mentioned I've updated the package-loader-specifications.rst file to be consistent with other loaders. It was a mistake from me to create the maven entry with an empty person at first. | |
swh/loader/package/maven/tests/test_maven.py | ||
236–244 | Good point, thanks. | |
282–287 | Because requests_mock_datadir.request_history is a requests_mock.request._RequestObjectProxy object that cannot be compared to a list. It's ugly, though, so I've rewritten these lines for readability. Thanks. | |
282–287 | This has been addressed in the lines below: assert requests_mock_datadir.request_history[0].url == MVN_ARTIFACTS[1]["url"] assert len(requests_mock_datadir.request_history) == 1 I don't know what I can improve there. As said above the equality just isn't possible as I cannot compare requests_mock.request._RequestObjectProxy to a list. But the second line ensures that equality is satisfied (and that there is no other item). Any idea? | |
318–342 | @vlorentz you're not annoying. Anyway, I've done as suggested. |
Build is green
Patch application report for D6396 (id=24512)
Rebasing onto 5d22455c94...
Current branch diff-target is up to date.
Changes applied before test
commit 89f5ccc7f5fc030e6db514f79726788770c6fa50 Author: Boris Baldassari <boris@chrysalice.org> Date: Sun Aug 29 22:14:59 2021 +0200 loader: add new maven-jar loader The maven loader loads jar and zip files as Maven artefacts into the software heritage archive. Note: Supersedes D6158 and addresses the review done in that diff. Related to T1724
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/646/ for more details.