Jan 8 2023

gitlab-migration closed T1724: Maven Central repository support as Migrated.

This task has been migrated to GitLab.

Jan 8 2023, 10:21 PM · Maven loader, Maven lister, GSoC 2019, Archive coverage
gitlab-migration changed the status of T3874: staging: Analyze result of the maven listing and ingestion, a subtask of T3746: staging: Deploy maven indexer/lister/loader, from Resolved to Migrated.
Jan 8 2023, 10:03 PM · Maven loader, Maven lister, System administration, Archive coverage
gitlab-migration changed the status of T3874: staging: Analyze result of the maven listing and ingestion from Resolved to Migrated.

This task has been migrated to GitLab.

Jan 8 2023, 10:03 PM · Maven loader, Maven lister, Archive coverage
gitlab-migration changed the status of T4318: Consider using jar command to extract jar archives from Resolved to Migrated.

This task has been migrated to GitLab.

Jan 8 2023, 4:37 PM · Maven loader
gitlab-migration changed the status of T4232: Listers: Canonicalize listed github origins, a subtask of T3874: staging: Analyze result of the maven listing and ingestion, from Resolved to Migrated.
Jan 8 2023, 4:36 PM · Maven loader, Maven lister, Archive coverage

Nov 3 2022

bchauvet added a parent task for T4330: Deploy maven stack in production: T4079: Extend archive coverage.
Nov 3 2022, 10:17 AM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage

Oct 19 2022

gitlab-migration changed the status of T4330: Deploy maven stack in production from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:07 PM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage
gitlab-migration changed the status of T4330: Deploy maven stack in production, a subtask of T1724: Maven Central repository support, from Resolved to Migrated.
Oct 19 2022, 6:07 PM · Maven loader, Maven lister, GSoC 2019, Archive coverage
gitlab-migration changed the status of T4326: Archive the pom file additionally to the source folder, a subtask of T3746: staging: Deploy maven indexer/lister/loader, from Invalid to Migrated.
Oct 19 2022, 6:07 PM · Maven loader, Maven lister, System administration, Archive coverage
gitlab-migration changed the status of T4326: Archive the pom file additionally to the source folder from Invalid to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:07 PM · Maven loader, Maven lister, System administration, Archive coverage
gitlab-migration changed the status of T4215: staging: Deploy latest maven stack, a subtask of T3874: staging: Analyze result of the maven listing and ingestion, from Resolved to Migrated.
Oct 19 2022, 6:06 PM · Maven loader, Maven lister, Archive coverage
gitlab-migration changed the status of T4215: staging: Deploy latest maven stack from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:06 PM · Maven loader, Maven lister, System administration, Archive coverage
gitlab-migration changed the status of T4143: staging: Deploy maven stack fixes, a subtask of T3874: staging: Analyze result of the maven listing and ingestion, from Resolved to Migrated.
Oct 19 2022, 6:06 PM · Maven loader, Maven lister, Archive coverage
gitlab-migration changed the status of T4143: staging: Deploy maven stack fixes from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:06 PM · Maven loader, Maven lister, System administration, Archive coverage
gitlab-migration changed the status of T4105: Push maven-index-exporter image to docker hub, a subtask of T3746: staging: Deploy maven indexer/lister/loader, from Resolved to Migrated.
Oct 19 2022, 6:06 PM · Maven loader, Maven lister, System administration, Archive coverage
gitlab-migration changed the status of T3746: staging: Deploy maven indexer/lister/loader, a subtask of T1724: Maven Central repository support, from Resolved to Migrated.
Oct 19 2022, 6:05 PM · Maven loader, Maven lister, GSoC 2019, Archive coverage
gitlab-migration changed the status of T3746: staging: Deploy maven indexer/lister/loader from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:05 PM · Maven loader, Maven lister, System administration, Archive coverage

Sep 8 2022

ardumont closed T4330: Deploy maven stack in production, a subtask of T1724: Maven Central repository support, as Resolved.
Sep 8 2022, 3:03 PM · Maven loader, Maven lister, GSoC 2019, Archive coverage
ardumont closed T4330: Deploy maven stack in production as Resolved.
Sep 8 2022, 3:03 PM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage
vsellier added a comment to T4330: Deploy maven stack in production.

\o/ great

Sep 8 2022, 2:41 PM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage
ardumont moved T4330: Deploy maven stack in production from in-progress to deployed/landed/monitoring on the System administration board.
Sep 8 2022, 2:16 PM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage
ardumont updated the task description for T4330: Deploy maven stack in production.
Sep 8 2022, 2:16 PM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage
ardumont added a comment to T4330: Deploy maven stack in production.

Checks:

  • task has been scheduled by the scheduler runner process [1]
  • listing is being consumed by one worker [2]
  • the number of 'maven' listed origins is steadily growing [3]
  • New 'maven' listed origins are getting scheduled for ingestion [4]
  • maven loaders are ingesting those [5]
Sep 8 2022, 2:16 PM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage
ardumont updated the task description for T4330: Deploy maven stack in production.
Sep 8 2022, 2:11 PM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage
ardumont added a revision to T4330: Deploy maven stack in production: D8421: production: Update lister to consume maven listing.
Sep 8 2022, 2:11 PM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage
ardumont updated the task description for T4330: Deploy maven stack in production.
Sep 8 2022, 2:06 PM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage
ardumont added a comment to T4330: Deploy maven stack in production.

Schedule maven-central listing:

swhscheduler@saatchi:~$ curl -s https://repo1.maven.org/maven2/ | head -2
<!DOCTYPE html>
<html>
swhscheduler@saatchi:~$ curl -s https://maven-exporter.internal.softwareheritage.org/export-maven-central.fld | head -2
doc 0
  field 0
swhscheduler@saatchi:~$ curl -s http://saatchi.internal.softwareheritage.org:5008/
<html>
<head><title>Software Heritage scheduler RPC server</title></head>
<body>
<p>You have reached the
<a href="https://www.softwareheritage.org/">Software Heritage</a>
scheduler RPC server.<br />
See its
<a href="https://docs.softwareheritage.org/devel/swh-scheduler/">documentation
and API</a> for more information</p>
</body>
</html>
swhscheduler@saatchi:~$ swh scheduler --url http://saatchi.internal.softwareheritage.org:5008/ \
>   task add list-maven-full \
>     url=https://repo1.maven.org/maven2/ \
>     index_url=https://maven-exporter.internal.softwareheritage.org/export-maven-central.fld
Created 1 tasks
Sep 8 2022, 2:04 PM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage
ardumont updated the task description for T4330: Deploy maven stack in production.
Sep 8 2022, 1:47 PM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage
ardumont added a comment to T4330: Deploy maven stack in production.

Finally, export is done on maven central [1], the fld is computed [2]...
And it's also exposed, hence reachable from lister worker nodes.

Sep 8 2022, 1:47 PM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage
ardumont changed the status of T4330: Deploy maven stack in production, a subtask of T1724: Maven Central repository support, from Open to Work in Progress.
Sep 8 2022, 11:18 AM · Maven loader, Maven lister, GSoC 2019, Archive coverage
ardumont changed the status of T4330: Deploy maven stack in production from Open to Work in Progress.
Sep 8 2022, 11:18 AM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage
ardumont added a project to T4330: Deploy maven stack in production: System administration.
Sep 8 2022, 11:18 AM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage
ardumont updated the task description for T4330: Deploy maven stack in production.
Sep 8 2022, 10:30 AM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage
ardumont updated the task description for T4330: Deploy maven stack in production.
Sep 8 2022, 10:19 AM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage
ardumont updated the task description for T4330: Deploy maven stack in production.
Sep 8 2022, 10:05 AM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage
ardumont updated the task description for T4330: Deploy maven stack in production.
Sep 8 2022, 9:57 AM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage

Sep 7 2022

ardumont updated the task description for T4330: Deploy maven stack in production.
Sep 7 2022, 6:27 PM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage

Sep 6 2022

ardumont added a revision to T4330: Deploy maven stack in production: D8399: production: Provision new maven-exporter node.
Sep 6 2022, 2:56 PM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage
ardumont added a revision to T4330: Deploy maven stack in production: D8397: Deploy maven-exporter production node.
Sep 6 2022, 11:50 AM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage

Jun 16 2022

ardumont updated the task description for T4330: Deploy maven stack in production.
Jun 16 2022, 11:13 AM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage
ardumont triaged T4330: Deploy maven stack in production as Normal priority.
Jun 16 2022, 10:21 AM · System administration, Maven loader, Maven lister, GSoC 2019, Archive coverage

Jun 15 2022

bchauvet closed T4326: Archive the pom file additionally to the source folder, a subtask of T3746: staging: Deploy maven indexer/lister/loader, as Invalid.
Jun 15 2022, 5:20 PM · Maven loader, Maven lister, System administration, Archive coverage
bchauvet closed T4326: Archive the pom file additionally to the source folder as Invalid.
Jun 15 2022, 5:20 PM · Maven loader, Maven lister, System administration, Archive coverage
borisbaldassari added a comment to T4326: Archive the pom file additionally to the source folder.

Yesss! \o/

Jun 15 2022, 4:33 PM · Maven loader, Maven lister, System administration, Archive coverage
anlambert added a comment to T4326: Archive the pom file additionally to the source folder.

So in the end, the conclusion is that the loader already does the right thing so it's a noop, right?

Jun 15 2022, 4:26 PM · Maven loader, Maven lister, System administration, Archive coverage
ardumont renamed T4326: Archive the pom file additionally to the source folder from archive the pom file additionally to the source folder to Archive the pom file additionally to the source folder.
Jun 15 2022, 4:09 PM · Maven loader, Maven lister, System administration, Archive coverage
ardumont renamed T4326: Archive the pom file additionally to the source folder from archive the pom file additionnaly to the source folder to archive the pom file additionally to the source folder.
Jun 15 2022, 4:09 PM · Maven loader, Maven lister, System administration, Archive coverage
borisbaldassari added a comment to T4326: Archive the pom file additionally to the source folder.

Good news *can* happen, ahah! Thanks for notifying me.

Jun 15 2022, 3:42 PM · Maven loader, Maven lister, System administration, Archive coverage
ardumont updated subscribers of T4326: Archive the pom file additionally to the source folder.

To summarize, the initial intent was to adapt the jar loaded (as extracted directory) to append the pom.xml so we do not lose that reference.

Jun 15 2022, 3:33 PM · Maven loader, Maven lister, System administration, Archive coverage
bchauvet added a comment to T4326: Archive the pom file additionally to the source folder.

You're right Boris, indeed it's already stored as extrinsic metadata, we hadn't checked properly :)
Thank you for your answer!

Jun 15 2022, 3:33 PM · Maven loader, Maven lister, System administration, Archive coverage
borisbaldassari added a comment to T4326: Archive the pom file additionally to the source folder.

I'm not sure I understand the intent, as we already keep the pom in the extrinsic metadata (don't we?).
Double-checking in the SWH codebase, I believe you could build upon this: see [1] lines 166-180.

Jun 15 2022, 3:07 PM · Maven loader, Maven lister, System administration, Archive coverage
borisbaldassari added a comment to T4326: Archive the pom file additionally to the source folder.

Congrats on the work done! I think that downloading the pom file from the same folder is indeed the way to go.

Jun 15 2022, 2:54 PM · Maven loader, Maven lister, System administration, Archive coverage
anlambert added a comment to T4326: Archive the pom file additionally to the source folder.

I think the simplest way to get the pom file associated with a specific release of a maven package is to download it from the folder where we can find the source jar.
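
A minimal sketch of that idea, assuming the standard Maven repository layout (`.../<artifactId>/<version>/<artifactId>-<version>-sources.jar`, with the pom stored next to it as `<artifactId>-<version>.pom`). The function name `pom_url_for_sources_jar` is hypothetical and is not taken from the loader code:

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def pom_url_for_sources_jar(jar_url: str) -> str:
    """Guess the pom URL sitting in the same folder as a sources jar.

    Assumes the usual Maven layout: the last two path components of the
    folder are <artifactId>/<version>.
    """
    parts = urlsplit(jar_url)
    # Folder containing the jar, e.g. /maven2/.../<artifactId>/<version>
    directory = posixpath.dirname(parts.path)
    version = posixpath.basename(directory)
    artifact_id = posixpath.basename(posixpath.dirname(directory))
    pom_name = f"{artifact_id}-{version}.pom"
    # Keep scheme and host, only swap the file name in the path.
    return urlunsplit(parts._replace(path=posixpath.join(directory, pom_name)))
```

The pom name is rebuilt from the folder names rather than by editing the jar file name, so a classifier other than `-sources` would not change the result.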

Jun 15 2022, 1:48 PM · Maven loader, Maven lister, System administration, Archive coverage

Jun 13 2022

bchauvet updated the task description for T4326: Archive the pom file additionally to the source folder.
Jun 13 2022, 2:21 PM · Maven loader, Maven lister, System administration, Archive coverage
bchauvet added a comment to T4326: Archive the pom file additionally to the source folder.
Jun 13 2022, 2:16 PM · Maven loader, Maven lister, System administration, Archive coverage
bchauvet updated the task description for T4326: Archive the pom file additionally to the source folder.
Jun 13 2022, 1:36 PM · Maven loader, Maven lister, System administration, Archive coverage
bchauvet updated the task description for T4326: Archive the pom file additionally to the source folder.
Jun 13 2022, 1:30 PM · Maven loader, Maven lister, System administration, Archive coverage
bchauvet added a comment to T4326: Archive the pom file additionally to the source folder.

In a source.jar, the pom is not included by default, but it can be if specified:

Jun 13 2022, 1:27 PM · Maven loader, Maven lister, System administration, Archive coverage
bchauvet added a comment to T4326: Archive the pom file additionally to the source folder.
Jun 13 2022, 1:25 PM · Maven loader, Maven lister, System administration, Archive coverage
bchauvet updated the task description for T4326: Archive the pom file additionally to the source folder.
Jun 13 2022, 12:11 PM · Maven loader, Maven lister, System administration, Archive coverage
bchauvet triaged T4326: Archive the pom file additionally to the source folder as Normal priority.
Jun 13 2022, 12:10 PM · Maven loader, Maven lister, System administration, Archive coverage
anlambert closed T4318: Consider using jar command to extract jar archives as Resolved.
Jun 13 2022, 11:28 AM · Maven loader

Jun 9 2022

anlambert added a revision to T4318: Consider using jar command to extract jar archives: D7976: tarball: Use standard Python module zipfile to extract jar archive.
Jun 9 2022, 3:27 PM · Maven loader
ardumont added a comment to T4318: Consider using jar command to extract jar archives.

This is better as we will not have to install any new runtime dependencies in workers.

Jun 9 2022, 3:04 PM · Maven loader
anlambert added a comment to T4318: Consider using jar command to extract jar archives.

Alternatively, we could use the zipfile standard Python module, which seems to work in a similar way to the jar command, see below:

(swh) anlambert@carnavalet:/tmp/jar_test$ wget https://repo1.maven.org/maven2/org/pustefixframework/pustefix-archetype-basic/0.15.20/pustefix-archetype-basic-0.15.20-sources.jar
--2022-06-09 14:56:17--  https://repo1.maven.org/maven2/org/pustefixframework/pustefix-archetype-basic/0.15.20/pustefix-archetype-basic-0.15.20-sources.jar
Resolving repo1.maven.org (repo1.maven.org)... 151.101.120.209
Connecting to repo1.maven.org (repo1.maven.org)|151.101.120.209|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 45637 (45K) [application/java-archive]
Saving to: ‘pustefix-archetype-basic-0.15.20-sources.jar’
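
A minimal sketch of extracting a jar with the standard `zipfile` module, relying on the fact that a jar archive is a zip file. The helper name `extract_jar` and its signature are hypothetical; this is an illustration of the approach, not the code that landed in the loader:

```python
import zipfile
from pathlib import Path
from typing import List


def extract_jar(jar_path: str, dest_dir: str) -> List[str]:
    """Extract every member of a jar archive into dest_dir.

    Returns the member names found in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # A jar is a zip archive, so ZipFile can open it directly.
    with zipfile.ZipFile(jar_path) as jar:
        jar.extractall(dest)
        return jar.namelist()
```

Calling `extract_jar("pustefix-archetype-basic-0.15.20-sources.jar", "out")` on the file downloaded above would unpack it under `out/` without needing a Java runtime on the worker.
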
Jun 9 2022, 3:02 PM · Maven loader
ardumont added a comment to T4318: Consider using jar command to extract jar archives.

fwiw, this makes sense ;)

Jun 9 2022, 3:01 PM · Maven loader
swh-sentry-integration added a comment to T4318: Consider using jar command to extract jar archives.

Sentry issue: SWH-LOADER-CORE-150

Jun 9 2022, 1:51 PM · Maven loader
swh-sentry-integration added a comment to T4318: Consider using jar command to extract jar archives.

Sentry issue: SWH-LOADER-CORE-ZW

Jun 9 2022, 1:51 PM · Maven loader
anlambert triaged T4318: Consider using jar command to extract jar archives as Normal priority.
Jun 9 2022, 1:50 PM · Maven loader

Jun 3 2022

ardumont added a comment to T3874: staging: Analyze result of the maven listing and ingestion.

There remain git and other dvcs-typed origins [1] listed by maven, but not github ones [2].

Jun 3 2022, 4:11 PM · Maven loader, Maven lister, Archive coverage
ardumont closed T4232: Listers: Canonicalize listed github origins, a subtask of T3874: staging: Analyze result of the maven listing and ingestion, as Resolved.
Jun 3 2022, 3:19 PM · Maven loader, Maven lister, Archive coverage
ardumont closed T3874: staging: Analyze result of the maven listing and ingestion as Resolved.
Jun 3 2022, 3:18 PM · Maven loader, Maven lister, Archive coverage
ardumont closed T3874: staging: Analyze result of the maven listing and ingestion, a subtask of T3746: staging: Deploy maven indexer/lister/loader, as Resolved.
Jun 3 2022, 3:18 PM · Maven loader, Maven lister, System administration, Archive coverage
ardumont updated the task description for T3874: staging: Analyze result of the maven listing and ingestion.
Jun 3 2022, 3:18 PM · Maven loader, Maven lister, Archive coverage
ardumont added a subtask for T3874: staging: Analyze result of the maven listing and ingestion: T4232: Listers: Canonicalize listed github origins.
Jun 3 2022, 3:18 PM · Maven loader, Maven lister, Archive coverage
ardumont added a comment to T3874: staging: Analyze result of the maven listing and ingestion.

status: triggered 2 full-maven lister runs on maven central and jboss [1]
And no more exotic github urls are popping up [2].

Jun 3 2022, 2:18 PM · Maven loader, Maven lister, Archive coverage
ardumont added a comment to T3874: staging: Analyze result of the maven listing and ingestion.

Yesterday, I fixed, diffed, released and pushed the diff [1] fixing the
canonicalization of the remaining exotic urls, cleaned up the 'git' origins (coming
out of a maven listing) and triggered a new listing. Today, checking those origins
again (staging scheduler), there was still noise which should no longer have been there...

Jun 3 2022, 9:35 AM · Maven loader, Maven lister, Archive coverage

Jun 2 2022

ardumont added a revision to T3874: staging: Analyze result of the maven listing and ingestion: D7946: github/utils: Deal with exotic urls to canonicalize.
Jun 2 2022, 3:14 PM · Maven loader, Maven lister, Archive coverage
ardumont added a comment to T3874: staging: Analyze result of the maven listing and ingestion.

The full listing is not finished yet, but there remain origins with exotic starting urls which are not canonicalized.
I'd say the issue lies with the canonicalize implementation in swh.core, which only deals with https:// and git:// urls.
So some improvements are needed there.
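
To illustrate the kind of improvement meant here, a rough sketch of canonicalizing "exotic" github urls. Everything in it is an assumption made for illustration: the function name `canonical_github_url`, the set of prefixes handled (`scm:git:`, `ssh://`, `git@host:`) and the choice of `https://` for the output are not taken from the swh code:

```python
import re
from typing import Optional

# Hypothetical pattern: optional Maven SCM prefixes, optional scheme,
# optional "user@", then github.com followed by "/" or ":".
_GITHUB_URL = re.compile(
    r"^(?:scm:)?(?:git:)?(?:(?:https?|git|ssh)://)?(?:[^@/\s]+@)?"
    r"(?:www\.)?github\.com[:/]+(?P<user>[^/\s]+)/(?P<repo>[^/\s#?]+)",
    re.IGNORECASE,
)


def canonical_github_url(url: str) -> Optional[str]:
    """Return https://github.com/<user>/<repo>, or None if not a github url."""
    match = _GITHUB_URL.match(url.strip())
    if match is None:
        return None
    repo = match.group("repo")
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    if not repo:
        return None
    return f"https://github.com/{match.group('user')}/{repo}"
```

With this sketch, `scm:git:git@github.com:user/repo.git` and `git://github.com/user/repo.git` both map to `https://github.com/user/repo`, while urls on other hosts return `None` and are left for the caller to handle.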

Jun 2 2022, 2:08 PM · Maven loader, Maven lister, Archive coverage

Jun 1 2022

ardumont updated the task description for T3874: staging: Analyze result of the maven listing and ingestion.
Jun 1 2022, 3:06 PM · Maven loader, Maven lister, Archive coverage
ardumont added a comment to T3874: staging: Analyze result of the maven listing and ingestion.

Plan:

  • P1369: Listing status after first round listing
  • Clean up maven github origins listing [1]
  • Trigger maven full run [2]
  • Wait for listing to finish
  • Listing status after new maven lister round of listing
  • Ping in mailing list discussion with data!
Jun 1 2022, 3:05 PM · Maven loader, Maven lister, Archive coverage
ardumont updated the task description for T3874: staging: Analyze result of the maven listing and ingestion.
Jun 1 2022, 3:01 PM · Maven loader, Maven lister, Archive coverage
ardumont updated the task description for T3874: staging: Analyze result of the maven listing and ingestion.
Jun 1 2022, 10:50 AM · Maven loader, Maven lister, Archive coverage
ardumont added a comment to T3874: staging: Analyze result of the maven listing and ingestion.

The old maven lister behavior resulted in origins like git://github.com, ... [1]
The new maven lister behavior should now result in canonical github urls http://github.com/user/repo.
Analysis is ongoing and the report will follow this comment.

Jun 1 2022, 10:50 AM · Maven loader, Maven lister, Archive coverage
ardumont updated the task description for T3874: staging: Analyze result of the maven listing and ingestion.
Jun 1 2022, 10:47 AM · Maven loader, Maven lister, Archive coverage

May 13 2022

anlambert added projects to T4143: staging: Deploy maven stack fixes: Maven lister, Maven loader.
May 13 2022, 4:55 PM · Maven loader, Maven lister, System administration, Archive coverage
anlambert added projects to T3746: staging: Deploy maven indexer/lister/loader: Maven lister, Maven loader.
May 13 2022, 4:54 PM · Maven loader, Maven lister, System administration, Archive coverage
anlambert added projects to T1724: Maven Central repository support: Maven lister, Maven loader.
May 13 2022, 4:54 PM · Maven loader, Maven lister, GSoC 2019, Archive coverage
anlambert added projects to T4215: staging: Deploy latest maven stack: Maven lister, Maven loader.
May 13 2022, 4:54 PM · Maven loader, Maven lister, System administration, Archive coverage
anlambert added projects to T3874: staging: Analyze result of the maven listing and ingestion: Maven lister, Maven loader.
May 13 2022, 4:53 PM · Maven loader, Maven lister, Archive coverage
anlambert created Maven loader.
May 13 2022, 4:53 PM