\o/ great
Sep 9 2022
Sep 8 2022
Checks:
- task has been scheduled by the scheduler runner process [1]
- listing is being consumed by one worker [2]
- 'maven' listed origins is steadily growing [3]
- New 'maven' listed origins are getting scheduled for ingestion [4]
- maven loaders are ingesting those [5]
Schedule maven-central listing:
swhscheduler@saatchi:~$ curl -s https://repo1.maven.org/maven2/ | head -2
<!DOCTYPE html>
<html>
swhscheduler@saatchi:~$ curl -s https://maven-exporter.internal.softwareheritage.org/export-maven-central.fld | head -2
doc 0
field 0
swhscheduler@saatchi:~$ curl -s http://saatchi.internal.softwareheritage.org:5008/
<html>
<head><title>Software Heritage scheduler RPC server</title></head>
<body>
<p>You have reached the
<a href="https://www.softwareheritage.org/">Software Heritage</a>
scheduler RPC server.<br />
See its
<a href="https://docs.softwareheritage.org/devel/swh-scheduler/">documentation and API</a>
for more information</p>
</body>
</html>swhscheduler@saatchi:~$ swh scheduler --url http://saatchi.internal.softwareheritage.org:5008/ \
> task add list-maven-full \
> url=https://repo1.maven.org/maven2/ \
> index_url=https://maven-exporter.internal.softwareheritage.org/export-maven-central.fld
Created 1 tasks
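For repeat runs, the `task add` invocation above could be assembled programmatically. This is only a sketch: the helper name `build_task_add_command` is hypothetical, and the only CLI arguments used are the ones visible in the session above (`--url`, `task add`, the task type, and `key=value` pairs).

```python
import shlex


def build_task_add_command(scheduler_url, task_type, **task_kwargs):
    """Build the argv list for `swh scheduler ... task add` (hypothetical helper).

    Only mirrors the arguments shown in the session above; it does not
    run anything and does not check that the URLs are reachable.
    """
    cmd = ["swh", "scheduler", "--url", scheduler_url, "task", "add", task_type]
    # Task arguments are passed as key=value pairs, as in the session above.
    cmd += [f"{key}={value}" for key, value in task_kwargs.items()]
    return cmd


cmd = build_task_add_command(
    "http://saatchi.internal.softwareheritage.org:5008/",
    "list-maven-full",
    url="https://repo1.maven.org/maven2/",
    index_url="https://maven-exporter.internal.softwareheritage.org/export-maven-central.fld",
)
# Print a copy-pastable shell command line.
print(shlex.join(cmd))
```

Building the command as a list keeps each argument separate, so it can be handed to `subprocess.run` as-is or printed for review before anything is actually scheduled.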
Finally, the export of maven central is done [1] and the fld is computed [2]...
And it's also exposed, hence reachable from lister worker nodes.
Sep 7 2022
Sep 6 2022
Sep 5 2022
Sep 2 2022
Sep 1 2022
Aug 31 2022
Aug 30 2022
I've made a complete run on docker
In T4233#89838, @franckbret wrote: Arch Linux Lister Docker Report
The lister takes a lot of time and fails on max retries when scraping the repository directory (it ran fine a few weeks ago). Not sure at this point, but I suspect it is a random problem related to the network / HTTP server. I will run it multiple times to see whether it fails on the same resource.
By the way, I guess we need to define a strategy for those exceptions.
In T4465#89812, @franckbret wrote: Loader runs on Docker report
Loader runs fine on docker.
I first launched 100 and then 1900 loader tasks; they completed in less than an hour.
swh-scheduler=# select count(*) from origin_visit_stats where visit_type='pubdev' and last_visit_status='successful';
 count
-------
  1450
swh-scheduler=# select count(*) from origin_visit_stats where visit_type='pubdev' and last_visit_status='failed';
 count
-------
   550
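To put the two counts above in perspective, here is a small sketch of the success ratio they imply. The helper `success_rate` is hypothetical; the only inputs are the 1450 successful and 550 failed visits reported by the queries.

```python
def success_rate(successful: int, failed: int) -> float:
    """Return the fraction of visits that succeeded (hypothetical helper)."""
    total = successful + failed
    if total == 0:
        raise ValueError("no visits recorded")
    return successful / total


# Counts taken from the two origin_visit_stats queries above.
successful, failed = 1450, 550
rate = success_rate(successful, failed)
print(f"{rate:.1%} of {successful + failed} pubdev visits succeeded")
```

The 2000 total matches the 100 + 1900 loader tasks that were launched, so every task is accounted for as either successful or failed.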
Aug 29 2022
Aug 26 2022
Arch Linux Lister Docker Report
Aur Loader runs in Docker report
AUR Lister runs in Docker report
Loader runs on Docker report
Lister runs on Docker report
Aug 22 2022
Aug 20 2022
Aug 18 2022
Aug 17 2022
Aug 11 2022
Aug 9 2022
Aug 8 2022
This has now been discussed on the sourcehut mailing list and I took part in the conversation.
Aug 5 2022
We discussed internally what to do with inactive repositories.
We reached a decision to move unused repos to object storage.
Once implemented, they will still be accessible but take a bit longer to access after a long period of inactivity.
Aug 4 2022
Looks like there are many more repos that should be visitable but aren't:
worth opening a dedicated forge issue
Done. T4423
open an upstream issue
updated query running:
As usual, I'm uneasy with the (general) idea of manually handling some repositories to reduce one bit of lag. This will only increase lag in another area that we will want to cover next. Rinse, repeat.
answer: 4755
I am currently running a query to find how many origins are over one year overdue for a visit:
Jul 28 2022
Jul 19 2022
Jul 13 2022
Jun 29 2022
Will work on the incremental lister, and then document it (not done yet).
What are the next steps here?