I stand by my previous remarks. Here is the breakdown of issues by origin:
Dec 8 2017
Dec 1 2017
Nov 29 2017
Nov 23 2017
For information, for the first URL, it is more of a bug in the origin_url computation, which yields the same origin_url for 2 different dumps with the same name:
ardumont@uffizi:~% grep ich-sys /srv/storage/space/mirrors/code.google.com/sources/INDEX-svn-dumps
http://ich-sys.googlecode.com/svn/ /srv/storage/space/mirrors/code.google.com/sources/v2/code.google.com/i/ich-sys/ich-sys-repo.svndump.gz
http://ich-sys.googlecode.com/svn/ /srv/storage/space/mirrors/code.google.com/sources/v2/eclipselabs.org/i/ich-sys/ich-sys-repo.svndump.gz
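To spot other dumps hit by the same collision, the index could be scanned for origin URLs that map to more than one dump path. This is a minimal sketch, assuming each index line holds an origin URL and a dump path separated by whitespace (inferred from the grep output above); the function name find_colliding_origins is hypothetical and not part of the loader.

```python
from collections import defaultdict


def find_colliding_origins(index_lines):
    """Return {origin_url: [dump_path, ...]} for URLs shared by several dumps.

    Assumption: each line is "<origin_url> <dump_path>", whitespace-separated.
    """
    dumps_by_origin = defaultdict(list)
    for line in index_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        origin_url, dump_path = parts
        dumps_by_origin[origin_url].append(dump_path)
    # Keep only the origin URLs that point at more than one dump.
    return {url: paths for url, paths in dumps_by_origin.items() if len(paths) > 1}


# The two index entries quoted above, plus one made-up non-colliding entry.
sample = [
    "http://ich-sys.googlecode.com/svn/ /srv/storage/space/mirrors/code.google.com/sources/v2/code.google.com/i/ich-sys/ich-sys-repo.svndump.gz",
    "http://ich-sys.googlecode.com/svn/ /srv/storage/space/mirrors/code.google.com/sources/v2/eclipselabs.org/i/ich-sys/ich-sys-repo.svndump.gz",
    "http://example.googlecode.com/svn/ /tmp/example-repo.svndump.gz",  # hypothetical
]
collisions = find_colliding_origins(sample)
for url, paths in collisions.items():
    print(url, len(paths))
```

On the real index this would be fed the lines of INDEX-svn-dumps instead of the sample list.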
Nov 14 2017
Possibly related to rDLDBASEd74506f6b53dc3ffdae66e8446fe561d006a64f9
Nov 10 2017
With latest fix from T839, this removes that edge case as well.
The symlink referenced here was not a symlink.
Which now makes sense with my first disconcerting analysis.
So that means the fix is:
That's awesome.
Another repository triggered something similar. So, not quite yet fixed.
Nov 8 2017
As usual, to be complete...
Nov 7 2017
The latest run, from last weekend, surfaced some new bugs (new tasks are or will be opened for detailed analysis and fixes):
Other repository impacted: /srv/storage/space/mirrors/code.google.com/sources/v2/code.google.com/h/humanroot/humanroot-repo.svndump.gz
Nov 6 2017
Apparently a link exists with the same path. Thus the error.
Oct 26 2017
Heads up on this: I fixed some wrong behavior and bugs:
Oct 25 2017
Looking further into it.
Defining such an option in our puppet manifest would eventually make it end up in /etc/systemd/system/swh-worker@${SERVICE_NAME}.service.d/parameters.conf.
In this case, ${SERVICE_NAME} is swh_loader_svn.
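As an illustration of what the resulting drop-in could look like, here is a minimal sketch that builds the path quoted above and a [Service] fragment setting LimitNOFILE. The helper names and the value 65536 are assumptions for the example only; the file actually generated by puppet may carry other settings.

```python
def dropin_path(service_name):
    """Path of the systemd drop-in for a given worker instance (from the comment above)."""
    return f"/etc/systemd/system/swh-worker@{service_name}.service.d/parameters.conf"


def dropin_content(nofile_limit):
    """Minimal [Service] fragment raising the open-file limit."""
    return f"[Service]\nLimitNOFILE={nofile_limit}\n"


path = dropin_path("swh_loader_svn")
content = dropin_content(65536)  # 65536 is a hypothetical value, not the deployed one
print(path)
print(content)
```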
We could use the loader-svn's systemd [Service] property LimitNOFILE:
- 'LimitNOFILE=' (ulimit equivalent: 'ulimit -n'; unit: number of file descriptors).
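To check which limit a running process actually gets, the same value that 'ulimit -n' reports can be read from inside Python with the standard resource module. A small sketch, assuming a Unix host; nofile_limits is a hypothetical helper name.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard


soft, hard = nofile_limits()
print(f"soft={soft} hard={hard}")
```

The soft limit is the one the process hits first; LimitNOFILE in the unit file would change what the worker sees here.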
Oct 24 2017
The 'googlecode' and 'unknown' key entries are googlecode svn related.
Most are stored under the 'unknown' key because we somehow don't have the task's input args (origin url + dump file), which is unfortunate for rescheduling those.
There is no file descriptor leak.
Oct 17 2017
Oct 10 2017
Oct 9 2017
Well, well, at that moment, the revision has no 'id' key yet...
Ok, fixing it.
Well (remembering now), the fact that only 400 revisions are stored is normal (as in, it is implemented that way).
Stacktrace of the reproduced error:
Ok, just so you know, I did not reproduce this behavior immediately (even though swh-env was updated, the db rebuilt, and the configuration file 'almost' identical to prod).
I had another issue prior to the one described (about DentryPerms.directory).
Oct 6 2017
Oct 4 2017
First, an important detail: those were loads of svn dumps from disk.
So, the loader first mounts a gzipped dump as an svn repository and then processes the history log.
So, this can be quite resource consuming (disk, memory).
Out of the 140k remaining svn repositories to mount and load, I have 606 errors.
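For readers unfamiliar with the "mount" step mentioned above, here is a rough sketch of how a gzipped dump can be turned into a local svn repository with the stock svnadmin tool. This is an assumption about the general technique, not necessarily what the loader does; mount_commands is a hypothetical helper that only builds the commands and does not run them.

```python
import shlex


def mount_commands(dump_path, repo_dir):
    """Build the commands to load a gzipped svn dump into a fresh repository.

    Returns (create_argv, load_pipeline): an argv list for 'svnadmin create'
    and a shell pipeline that streams the decompressed dump into 'svnadmin load'.
    """
    create_argv = ["svnadmin", "create", repo_dir]
    # gzip -dc decompresses to stdout; svnadmin load reads the dump from stdin.
    load_pipeline = (
        f"gzip -dc {shlex.quote(dump_path)} | svnadmin load {shlex.quote(repo_dir)}"
    )
    return create_argv, load_pipeline


create_argv, load_pipeline = mount_commands(
    "/tmp/example-repo.svndump.gz",  # hypothetical dump path
    "/tmp/example-repo",  # hypothetical target directory
)
print(create_argv)
print(load_pipeline)
```

Since the whole history is decompressed and replayed into a new repository before the log is even read, the disk and memory cost noted above follows directly.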
Oct 3 2017
Oct 2 2017
Feb 15 2017
Command used to trigger the production of tasks:
Feb 14 2017
This issue has been solved and the fix deployed everywhere.
I just actually stopped the SVN loaders :)