Jan 8 2023
Oct 19 2022
Jun 19 2020
We still need to try to ingest the zeq2 repo, but that can be done in a followup task.
May 30 2020
The following repositories failed to import. Their on-disk structure is either completely empty, or only contains refs (no actual git objects stored):
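For reference, the "only refs, no objects" condition can be checked directly on disk before attempting an import. The following is a minimal sketch, not the loader's actual code: the helper name `has_git_objects` is hypothetical, and it assumes the path points at a bare repository (or a `.git` directory) using the standard git layout of loose objects under two-hex-digit directories and packs under `objects/pack`.

```python
from pathlib import Path
from string import hexdigits


def has_git_objects(git_dir) -> bool:
    """Return True if the git directory stores at least one object.

    Hypothetical helper: `git_dir` is assumed to be a bare repository
    (or a `.git` directory) with the standard on-disk layout.
    """
    objects = Path(git_dir) / "objects"
    if not objects.is_dir():
        # Completely empty repository, or not a git directory at all.
        return False

    # Packed objects are stored as objects/pack/*.pack
    pack_dir = objects / "pack"
    if pack_dir.is_dir() and any(pack_dir.glob("*.pack")):
        return True

    # Loose objects are stored in fan-out directories named after the
    # first two hex digits of the object id, e.g. objects/1e/82c9...
    for sub in objects.iterdir():
        is_fanout = (
            sub.is_dir()
            and len(sub.name) == 2
            and all(c in hexdigits for c in sub.name)
        )
        if is_fanout and any(sub.iterdir()):
            return True
    return False
```

A repository for which this returns `False` has nothing to load, whatever its `refs/` directory contains.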
May 29 2020
After the first (naive, I guess) pass, 1470 repositories are still missing.
May 19 2020
The code for loading git repositories from disk hasn't been run in production in a while, so I've decided to run the imports of the missing repos manually.
We also have a single origin with no full visit:
After dumping all origins starting with https://gitorious.org/ in the archive:
Jun 19 2018
Apr 12 2018
Last origin rescheduled and injected.
python3-dulwich (fix included) packaged and pushed to our debian repository.
After discussion with jelmer (dulwich's author), he proposed and implemented the real solution: deal with bytes (avoiding muddying the encoding waters altogether ;)
It's landed in dulwich/dulwich's master branch \m/.
Apr 11 2018
Patching dulwich to try and detect the encoding (when the problem arose) seems to do the trick:
With the latest dulwich (> 0.19.1, current head), we now break somewhere else, still encoding related:
I opened a discussion at https://github.com/jelmer/dulwich/issues/608 about this case.
Jan 19 2018
I was initially open to cleaning up the repository because I thought it was some form of corruption.
But I no longer think that's the case, and I don't want to tamper with the sources.
Jan 18 2018
After some digging, it seems to be an encoding problem:
Analyzing that repository a bit further, we can see this:
Dec 21 2017
The SQL error was sheer bad luck: it could not be reproduced locally, so the repository was rescheduled and loaded successfully.
Only 2 errors left:
- 1 about bad transaction in db
- 1 about unicode error:
Dec 19 2017
Updated and scheduled the last 170 repositories.
What remains now is to check those for errors.
Dec 15 2017
Fixed with the latest packaged version:
Packaged it and pushed to our own repository.
Update on this:
- Issue opened.
- Pull Request (PR) proposed and merged.
Nov 13 2017
This error slipped under my radar last week.
I opened a related issue in dulwich since it should be handled upstream.
Nov 10 2017
PR got merged \m/
Nov 7 2017
Nov 4 2017
PR got merged \m/
Oct 31 2017
Follow up on this:
Oct 27 2017
The revision in question is:
Debugging some more, the date generating this error is the following, which does indeed raise the initial overflow error:
Possibly related error.
Debugging the problematic object shows 1e82c9224b8898672b3b6fe8b6b737f7eed24cf6, which git fsck references as well.
Turns out it's a badly formatted tag:
Oct 26 2017
Patching the version to print the identifier in error, I retrieve the following object: ae51106031a0bb39a8def57a8592f70116487eab (which is among the badly formatted tags listed by git fsck below).
In that particular repository, the tag has no time (tag.tag_time and tag.tag_timezone are None, tag._tag_timezone_neg_utc is False - those are the default values for that object).
But the swh-loader-git's code expects those values to exist.
In our model though, we are ok with that date not being provided.
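To illustrate the mismatch, here is a minimal sketch of a conversion that tolerates a tag without a date. It is not the actual swh-loader-git code: `FakeTag` is a hypothetical stand-in exposing only the three attributes mentioned above, and `tag_date` and the keys of the returned dict are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeTag:
    """Hypothetical stand-in for a tag object, with the defaults
    observed on the problematic tag (no time, no timezone)."""
    tag_time: Optional[int] = None
    tag_timezone: Optional[int] = None
    _tag_timezone_neg_utc: bool = False


def tag_date(tag) -> Optional[dict]:
    """Return a date dict, or None when the tag carries no time.

    The dict layout is an assumption made for this sketch.
    """
    if tag.tag_time is None:
        # No tagger date in the object: the model accepts a missing
        # date, so propagate None instead of failing.
        return None
    return {
        "timestamp": tag.tag_time,
        # Fall back to a zero offset if only the timezone is missing.
        "offset": tag.tag_timezone or 0,
        "negative_utc": tag._tag_timezone_neg_utc,
    }
```

With this guard, a tag left at its default values yields `None` rather than an error, while a well-formed tag is converted as before.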
Tweaking the git loader to print the actual sha1: