Let's analyze the result of the ingestion of subversion repositories with external definitions
submitted on staging two days ago.
First, let's compute some statistics about the visit statuses of these repositories from the
save_origin_request table of the staging webapp database.
    13:51 $ ssh anlambert@webapp.internal.staging.swh.network
    Linux webapp 4.19.0-18-amd64 #1 SMP Debian 4.19.208-1 (2021-09-29) x86_64

    The programs included with the Debian GNU/Linux system are free software;
    the exact distribution terms for each program are described in the
    individual files in /usr/share/doc/*/copyright.

    Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
    permitted by applicable law.
    Last login: Fri Jan 21 12:29:24 2022 from 192.168.101.11
    anlambert@webapp:~$ django-admin dumpdata --settings=swh.web.settings.production > webapp.staging.db.json
    anlambert@carnavalet:~$ scp anlambert@webapp.internal.staging.swh.network:webapp.staging.db.json .
    anlambert@carnavalet:~$ cd ~/swh/swh-environment/swh-web
    (swh) ✘-PIPE ~/swh/swh-environment/swh-web [master|⚑ 106]
    13:56 $ rm swh/web/settings/db.sqlite3
    (swh) ✘-PIPE ~/swh/swh-environment/swh-web [master|⚑ 106]
    13:56 $ make run-migrations-dev
    python3 swh/web/manage.py migrate --settings=swh.web.settings.development -v0 2>/dev/null
    (swh) ✔ ~/swh/swh-environment/swh-web [master|⚑ 106]
    13:57 $ django-admin loaddata --settings=swh.web.settings.development ~/webapp.staging.db.json
Once the copy of the staging webapp database was done, we manually removed (using sqlitebrowser)
the rows of the save_origin_request table that do not concern subversion repositories with externals.
We can now compute the statistics about visit statuses.
    (swh) ✔ ~/swh/swh-environment/swh-web [master|⚑ 106]
    13:45 $ sqlite3 swh/web/settings/db.sqlite3
    SQLite version 3.34.1 2021-01-20 14:10:07
    Enter ".help" for usage hints.
    sqlite> select visit_status, count(*) from save_origin_request group by visit_status;
    |241
    created|4
    failed|7
    full|521
    not_found|1
    partial|117
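For anyone who wants to rerun the count without an interactive sqlite3 session, here is a minimal Python sketch of the same query. The helper name `count_visit_statuses` and the `<no visit yet>` label are hypothetical; only the table and column names come from the session above.

```python
import sqlite3
from typing import Dict


def count_visit_statuses(conn: sqlite3.Connection) -> Dict[str, int]:
    """Return the number of save requests per visit status.

    Hypothetical helper: it runs the same GROUP BY query as the
    sqlite3 session above, nothing more.
    """
    rows = conn.execute(
        "SELECT visit_status, COUNT(*) FROM save_origin_request "
        "GROUP BY visit_status"
    ).fetchall()
    counts: Dict[str, int] = {}
    for status, count in rows:
        # An empty or NULL status is reported under a single label;
        # in the results above it corresponds to requests not visited yet.
        label = status or "<no visit yet>"
        counts[label] = counts.get(label, 0) + count
    return counts
```

It could be called as `count_visit_statuses(sqlite3.connect("swh/web/settings/db.sqlite3"))` from the swh-web checkout, assuming the development database was populated as described.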
So at the time of writing, out of the 891 save code now requests for subversion repos with externals:
- 241 have not been visited yet
- 4 have the created visit status
- 7 failed
- 521 succeeded with a full repository loading
- 1 has the not_found visit status
- 117 succeeded with a partial repository loading
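To put these numbers in proportion, the short sketch below turns the counts returned by the query into percentages of the total. The `shares` helper and the dictionary labels are only illustrative; the counts themselves are copied from the query output.

```python
from typing import Dict

# Counts copied from the sqlite3 query output above.
visit_status_counts = {
    "not visited yet": 241,
    "created": 4,
    "failed": 7,
    "full": 521,
    "not_found": 1,
    "partial": 117,
}


def shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each count as a percentage of the total, rounded to 0.1."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


total_requests = sum(visit_status_counts.values())  # 891
status_shares = shares(visit_status_counts)

for label, pct in status_shares.items():
    print(f"{label}: {pct}%")
```

Rounded to one decimal, full loads account for about 58.5% of the requests and partial loads for about 13.1%.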
For the record, a partial repository loading can mean:
- the reconstructed filesystem for the last loaded revision differs from the one obtained with a subversion export operation on that revision (this check is performed by the post_load hook of the subversion loader)
- svnrdump could not dump the whole repository (because of a network issue, for instance) and only produced a partial set of revisions
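To illustrate the idea behind the first case, here is a rough sketch of comparing two directory trees by file content. It is not the loader's actual post_load implementation: the function names are hypothetical, and the sketch ignores empty directories, symlinks and permissions.

```python
import hashlib
from pathlib import Path
from typing import Dict


def tree_digest(root: Path) -> Dict[str, str]:
    """Map each regular file path (relative to root) to the SHA-1 of its content."""
    digests: Dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha1(path.read_bytes()).hexdigest()
    return digests


def trees_differ(reconstructed: Path, exported: Path) -> bool:
    """Return True when the two trees do not hold the same files and contents.

    Illustration only: a difference here would be the kind of mismatch
    that leads to a partial visit in the first case described above.
    """
    return tree_digest(reconstructed) != tree_digest(exported)
```

Under these assumptions, `reconstructed` would be the filesystem rebuilt by the loader for the last revision and `exported` the result of a subversion export of that same revision.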
So the results are not bad so far, but there are still some issues in the externals support implementation in the loader;
let's find them and fix them.
Related to T3864#77294