Lister for maven repositories
Details
Thu, Jun 16
Wed, Jun 15
Yesss! \o/
So in the end, the conclusion is that the loader already does the right thing so it's a noop, right?
Good news *can* happen, ahah! Thanks for notifying me.
To summarize, the initial intent was to adapt the loaded jar (as an extracted directory) to append the pom.xml so we do not lose that reference.
You're right boris, indeed it's already stored as extrinsic metadata, we hadn't checked properly :)
Thank you for your answer !
I'm not sure I understand the intent, as we already keep the pom in the extrinsic metadata (don't we?).
Double-checking in the SWH codebase, I believe you could build upon this: see [1] lines 166-180.
Congrats on the work done! I think that downloading the pom file from the same folder is indeed the way to go.
I think the simplest way to get the pom file associated with a specific release of a maven package is to download it from the folder where we can find the source jar.
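For illustration, here is a minimal sketch of that idea. It assumes the usual Maven repository layout, where `artifact-version-sources.jar` and `artifact-version.pom` sit in the same folder. The function name `pom_url_for_source_jar` and the example URL are hypothetical; this is not taken from the lister or loader code.

```python
from urllib.parse import urlsplit, urlunsplit

SOURCES_SUFFIX = "-sources.jar"


def pom_url_for_source_jar(jar_url: str) -> str:
    """Guess the URL of the pom sitting next to a '-sources.jar' file.

    Assumption: the repository follows the standard Maven layout, so the
    pom is named '<artifact>-<version>.pom' in the same folder as the jar.
    """
    parts = urlsplit(jar_url)
    folder, _, filename = parts.path.rpartition("/")
    if not filename.endswith(SOURCES_SUFFIX):
        raise ValueError(f"not a sources jar url: {jar_url}")
    # Swap the '-sources.jar' suffix for '.pom', keep the folder untouched.
    pom_name = filename[: -len(SOURCES_SUFFIX)] + ".pom"
    return urlunsplit(parts._replace(path=f"{folder}/{pom_name}"))


if __name__ == "__main__":
    # Hypothetical artifact, only used to show the shape of the urls.
    print(pom_url_for_source_jar(
        "https://repo1.maven.org/maven2/org/example/demo/1.0/demo-1.0-sources.jar"
    ))
```

The sketch only rewrites the url; actually fetching the pom and handling a missing file are left out.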
Mon, Jun 13
- some maven origins contain a zip instead of a jar, and in that case it looks like the pom.xml is included (e.g. https://webapp.staging.swh.network/browse/origin/directory/?origin_url=https://repo1.maven.org/maven2/org/jboss/snowdrop/snowdrop)
in a source.jar, the pom is not included by default but can be if specified:
Fri, Jun 3
There remain git and other dvcs-typed origins listed by maven, but no github ones.
status: triggered 2 full-maven lister runs on maven central and jboss [1]
And no more exotic github urls are popping up [2].
Yesterday, I fixed, diffed, released and pushed the diff [1] fixing the
canonicalization of the remaining exotic urls, cleaned up the 'git' origins (coming
out of a maven listing) and triggered a new listing. Today, checking those origins again
(staging scheduler), there was still noise which should no longer have been there...
Thu, Jun 2
The full listing is not finished yet, but there remain origins with exotic starting urls which are not canonicalized.
I'd say the issue lies with the canonicalize implementation in swh.core, which only deals with https:// and git:// urls.
So some improvements are needed there.
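To make the kind of improvement concrete, here is a rough sketch of a canonicalization that accepts more url shapes than https:// and git://. It is not the swh.core code: the function name `canonical_github_url`, the set of accepted prefixes (scm:git:, ssh://, scp-like git@github.com:) and the https://github.com/user/repo target form are all assumptions made for the example.

```python
import re
from typing import Optional

# Hypothetical pattern: which "exotic" forms show up in practice is an assumption.
_GITHUB_RE = re.compile(
    r"""^(?:scm:git:)?                       # optional Maven SCM prefix
        (?:
            (?:git|https?|ssh|git\+ssh)://   # scheme://
            (?:[^@/]+@)?github\.com/         # [user@]github.com/
          |
            (?:[^@/]+@)?github\.com:         # scp-like form, git@github.com:
        )
        (?P<user>[^/]+)/(?P<repo>[^/]+?)     # user/repo
        (?:\.git)?/?$                        # optional .git and trailing slash
    """,
    re.VERBOSE | re.IGNORECASE,
)


def canonical_github_url(url: str) -> Optional[str]:
    """Return https://github.com/<user>/<repo>, or None if url is not recognized."""
    match = _GITHUB_RE.match(url.strip())
    if match is None:
        return None
    return f"https://github.com/{match.group('user')}/{match.group('repo')}"


if __name__ == "__main__":
    for raw in (
        "git://github.com/user/repo.git",
        "scm:git:git@github.com:user/repo.git",
        "http://github.com/user/repo/",
        "https://gitlab.com/user/repo",
    ):
        print(raw, "->", canonical_github_url(raw))
```

Urls that are not recognized come back as None, so the caller can decide whether to keep them as plain git origins or drop them.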
Wed, Jun 1
Plan:
- P1369: Listing status after first round listing
- Clean up maven github origins listing [1]
- Trigger maven full run [2]
- Wait for listing to finish
- Listing status after new maven lister round of listing
- Ping in mailing list discussion with data!
Old maven behavior results in origins like git://github.com, ... [1]
The new maven lister behavior should now result in canonical github urls http://github.com/user/repo.
Analysis is ongoing; the report will follow this comment.