Projects related to the Git VCS
Jun 19 2018
Apr 12 2018
Last origin rescheduled and injected.
python3-dulwich (fix included) packaged and pushed to our debian repository.
After a discussion with jelmer (dulwich's author), he proposed and implemented the real solution: deal with bytes, which avoids muddying the waters with encodings altogether ;)
It's landed in dulwich/dulwich's master branch \m/.
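To illustrate why working on bytes sidesteps the problem, here is a minimal sketch. It is not dulwich's code: the function names and the sample reference name are hypothetical, chosen only to show that an eager decode can fail while an opaque byte string cannot.

```python
# Hypothetical illustration: names and sample data are assumptions, not dulwich code.

# "café" encoded as Latin-1: the trailing byte 0xE9 is not valid UTF-8.
RAW_REF = b"refs/heads/caf\xe9"


def ref_as_text(raw: bytes) -> str:
    """Decode eagerly; raises UnicodeDecodeError on non-UTF-8 input."""
    return raw.decode("utf-8")


def ref_as_bytes(raw: bytes) -> bytes:
    """Keep the value opaque: no decoding, hence no encoding error."""
    return raw


def display(raw: bytes) -> str:
    """Decode only at display time, replacing undecodable bytes."""
    return raw.decode("utf-8", errors="replace")
```

With this approach, decoding is deferred to the edges (logging, display), where a lossy replacement is acceptable, while storage and comparison keep the original bytes untouched.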
Apr 11 2018
Patching dulwich to try and detect the encoding (when the problem arose) seems to do the trick:
With the latest dulwich (> 0.19.1, current head), we now break somewhere else, still encoding-related:
I opened a discussion at https://github.com/jelmer/dulwich/issues/608 about this case.
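For readers unfamiliar with this kind of workaround, encoding detection on failure can be sketched as below. This is an assumption about the general technique, not the actual patch that was tried: the function name and the candidate list are hypothetical.

```python
from typing import Sequence, Tuple


def decode_with_fallback(
    raw: bytes, candidates: Sequence[str] = ("utf-8", "latin-1")
) -> Tuple[str, str]:
    """Try each candidate encoding in order; return (text, encoding used).

    Hypothetical helper: the candidate list is an assumption.
    """
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # No candidate matched: decode lossily rather than fail.
    return raw.decode("utf-8", errors="replace"), "utf-8"
```

Note that such a guess can be silently wrong (Latin-1 accepts any byte sequence), which is one reason this kind of detection is only a stopgap.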
Jan 19 2018
I was initially open to cleaning up the repository because I thought it was some form of corruption.
But now I no longer think that's the case, and I don't want to tamper with the sources.
Jan 18 2018
After some digging, it seems to be an encoding problem:
Analyzing that repository a bit further, we can see this:
Dec 21 2017
The SQL error was sheer bad luck: tested locally, there was no problem, so it was rescheduled and loaded successfully.
Only 2 errors left:
- 1 about a bad transaction in the db
- 1 about a unicode error:
Dec 19 2017
Updated and scheduled the last 170 repositories.
Now, those remain to be checked for errors.
Dec 15 2017
Nov 7 2017
Oct 27 2017
Oct 26 2017
Oct 3 2017
Jul 28 2017
For information, the last injection has been done. The remaining errors:
(but we should have a list of those repos, for posterity).
Jul 27 2017
These should be rescheduled and driven to successful completion.
Jul 26 2017
After much learning about how to read and extract logs from our kibana instance, here is the error distribution.
Jun 6 2017
As of now, ingestion, after multiple (re)schedulings, has been done.
Apr 26 2017
Update on this.
Apr 7 2017
Feb 15 2017
Visit dates have been fixed for the origins already injected.
Feb 12 2017
Feb 11 2017
Command to trigger the messages (from worker01):
cat /srv/storage/space/mirrors/gitorious.org/full_mapping.txt | SWH_WORKER_INSTANCE=swh_loader_git_disk ./load_gitorious.py --root-repositories /srv/storage/space/mirrors/gitorious.org/mnt/repositories
(The script defaults to using the right queue 'swh_loader_git_express' and the right origin-date 'Wed, 30 Mar 2016 09:40:04 +0200')