svn: An error occurred when running svnrdump and no exploitable dump file has been generated.
Closed, Migrated; Edits Locked

Description

This is a recurring problem. It also seems to be a prominent one in the sourceforge svn
origins dataset to ingest [2] [3].

This needs fixing.

A proposed modus operandi, which worked for the mercurial loader:
First, look for an svn origin with that problem.
Write a test which tries to load that repository and fails the same way.
Then debug it with pdb to determine the source of the issue.

[1] https://sentry.softwareheritage.org/share/issue/4295e7cc58884f528e73c5ea9ff6e235/

[2] No data to show here, other than looking it up in a kibana dashboard: the workers
(worker[17-18]) running that dataset do not push issues to sentry because of a current
systemd limitation in our puppet manifests regarding one env variable for sentry.

[3]

Sep 28 08:04:03 worker18 python3[58950]: [2021-09-28 08:04:03,449: INFO/MainProcess] Received task: swh.loader.svn.tasks.DumpMountAndLoadSvnRepository[decc664e-ed22-45ff-931b-a8ed02e1075b]
Sep 28 08:04:03 worker18 python3[105921]: [2021-09-28 08:04:03,516: INFO/ForkPoolWorker-232] Load origin 'https://svn.code.sf.net/p/white-rats-studios/svn' with type 'svn'
Sep 28 08:04:04 worker18 python3[105921]: [2021-09-28 08:04:04,177: ERROR/ForkPoolWorker-232] Loading failure, updating to `failed` status
                                          Traceback (most recent call last):
                                            File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 335, in load
                                              self.prepare()
                                            File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 756, in prepare
                                              dump_path = self.dump_svn_revisions(self.svn_url, last_loaded_svn_rev)
                                            File "/usr/lib/python3/dist-packages/swh/loader/svn/loader.py", line 745, in dump_svn_revisions
                                              "An error occurred when running svnrdump and "
                                          Exception: An error occurred when running svnrdump and no exploitable dump file has been generated.

Event Timeline

ardumont created this task.

@anlambert worst-case scenario, if we don't find the source of the problem, we may want to add a fallback to the basic implementation, which tries to load messages one at a time.
That way, we'd at least ingest something up to the point where we can no longer proceed.

@ardumont, looking at the debug logs, the error in [1] is due to a permission issue:

svnrdump: E170013: Unable to connect to a repository at URL 'https://llvm.org/svn/llvm-project'
svnrdump: E175013: Access to '/svn/llvm-project' forbidden

while the error in [3] is due to a nonexistent svn repository:

anlambert@carnavalet:/tmp$ svnrdump dump https://svn.code.sf.net/p/white-rats-studios/svn
svnrdump: E170013: Unable to connect to a repository at URL 'https://svn.code.sf.net/p/white-rats-studios/svn'
svnrdump: E070014: Could not open the requested SVN filesystem
anlambert@carnavalet:/tmp$

Before trying to fall back on the basic implementation, I think we should exploit the error code
returned by svnrdump. For instance, E070014 means the origin is not found.
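To illustrate the idea, here is a minimal sketch of pulling the error codes out of svnrdump's stderr, assuming the `svnrdump: E<digits>: <message>` line format shown in the output above. The function name `extract_svn_error_codes` is hypothetical and is not part of swh.loader.svn.

```python
import re
from typing import List

# Assumption: each svnrdump error line looks like
# "svnrdump: E170013: <message>", as in the output quoted above.
SVN_ERROR_RE = re.compile(r"^svnrdump: (E\d{6}): (.*)$", re.MULTILINE)


def extract_svn_error_codes(stderr: str) -> List[str]:
    """Return the error codes found in svnrdump's stderr, in order of appearance.

    Hypothetical helper for illustration only.
    """
    return [code for code, _message in SVN_ERROR_RE.findall(stderr)]


# stderr copied from the manual svnrdump run above
stderr = (
    "svnrdump: E170013: Unable to connect to a repository at URL "
    "'https://svn.code.sf.net/p/white-rats-studios/svn'\n"
    "svnrdump: E070014: Could not open the requested SVN filesystem\n"
)
codes = extract_svn_error_codes(stderr)
```

With the codes available as a list, the loader could decide what to do per code instead of raising one generic exception.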

Correct!

I was looking into it a bit more and realized that it's related to what you said
(origin not found).

So the actual issue is that we don't correctly exploit the error result, and so
we report the wrong status: failed instead of not_found.
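A rough sketch of what the status selection could look like, under stated assumptions: the function `visit_status_from_stderr` and the `NOT_FOUND_CODES` set are hypothetical names, only E070014 is treated as "not found" (as discussed above), and every other error, including the E175013 permission error, is left as failed here.

```python
# Assumption: only E070014 is mapped to not_found; other codes could be
# added once their meaning has been confirmed.
NOT_FOUND_CODES = frozenset({"E070014"})


def visit_status_from_stderr(stderr: str) -> str:
    """Pick the visit status to report from svnrdump's stderr.

    Hypothetical helper for illustration only, not the loader's actual code.
    """
    if any(code in stderr for code in NOT_FOUND_CODES):
        return "not_found"
    return "failed"


# stderr copied from the two svnrdump outputs quoted earlier in this task
missing_repo = (
    "svnrdump: E170013: Unable to connect to a repository at URL "
    "'https://svn.code.sf.net/p/white-rats-studios/svn'\n"
    "svnrdump: E070014: Could not open the requested SVN filesystem\n"
)
forbidden_repo = (
    "svnrdump: E170013: Unable to connect to a repository at URL "
    "'https://llvm.org/svn/llvm-project'\n"
    "svnrdump: E175013: Access to '/svn/llvm-project' forbidden\n"
)
status_missing = visit_status_from_stderr(missing_repo)
status_forbidden = visit_status_from_stderr(forbidden_repo)
```

Keeping the mapping in a single set makes it easy to extend if other svnrdump codes turn out to mean the origin is gone.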

ardumont changed the task status from Open to Work in Progress. Sep 29 2021, 6:34 PM
ardumont claimed this task.