Thanks for diving into this.
Jul 3 2020
Jul 2 2020
Jun 23 2020
Jun 3 2020
Jun 2 2020
May 7 2020
After retrying to load the repository manually, the real cause of the failed dump load is the following:
svnadmin: E125005: Invalid property value found in dump stream; consider repairing the source or using the --bypass-prop-validation option when loading. svnadmin: E125005: Property 'svn:log' rejected because it is not encoded in UTF-8
Passing the --bypass-prop-validation option effectively fixes the loading issue.
I think we should use it as we already handle properties decoding errors in the loader implementation.
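Since the loader drives these tools as subprocesses, the fix boils down to passing that flag when loading the dump. A minimal sketch of what that could look like (the helper names and paths are hypothetical, not the actual loader code):

```python
import subprocess

def build_svnadmin_load_cmd(repo_path, bypass_prop_validation=True):
    """Build an `svnadmin load` command line; the flag skips property
    validation so non-UTF-8 svn:log values no longer abort the load."""
    cmd = ["svnadmin", "load"]
    if bypass_prop_validation:
        cmd.append("--bypass-prop-validation")
    cmd.append(repo_path)
    return cmd

def load_dump(repo_path, dump_path):
    """Feed the dump file on stdin, as `svnadmin load repo < dump` would."""
    with open(dump_path, "rb") as dump:
        subprocess.run(build_svnadmin_load_cmd(repo_path), stdin=dump, check=True)
```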
Apr 21 2020
The svn loader now uses HEAD as the branch name (instead of master as in the early days).
Jan 22 2020
Related to T611
Aug 20 2019
Jul 3 2019
May 25 2019
Closing, we do have an SVN loader now: it still has some issues, but the bulk of the job is done.
@anlambert what's the status of ingesting very large SVN repos, now that we have put the loader in production?
Oct 15 2018
Oct 4 2018
Oct 2 2018
Indeed it does not; I need to think more about this...
Oct 1 2018
Thanks for the clarification, I needed it.
In T611#22696, @ardumont wrote: @zack Can you enlighten me as to why we want to store that information at the directory level (and not say at the revision one)?
According to the official documentation (noted as not a smart idea to reference), there has been a breaking format migration from svn 1.5 onwards.
Sep 30 2018
And also make sure the one visit date is the right one:
Sep 28 2018
as in T946#22626:
- 23 origins in error scheduled back [1]
- workers' dashboard log
softwareheritage=> \copy (select o.url, fh.status, fh.stderr from origin o inner join origin_visit ov on o.id=ov.origin inner join fetch_history fh on fh.origin=o.id where o.type='svn' and not fh.status and ov.visit = (select max(visit) from origin_visit where origin=o.id) and stderr like '%Inconsistency. CRLF detected in a converted file%') to '/home/ardumont/data-transit' ;
- new loader svn packaged and deployed
- origins in error scheduled back
- workers logs (kibana dashboard) [1]
All good! Time to close T946.
- loader.svn.tests: Add a scenario around user-defined svn properties
Sep 27 2018
Ok, so going for that fix.
\m/
So here are the results:
I was trying to play with your Python scripts to query the kibana logs, but it's been a while since I last wrote a query for Elasticsearch, and their JSON format is still awful.
Today is not a good day for me ;)
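For my future self, the kind of Elasticsearch query body involved is roughly the following (the index layout and the `message` field name are assumptions, not the actual kibana schema):

```python
import json

def build_error_query(message_fragment, size=50):
    """Build an Elasticsearch query body matching log entries whose
    message contains the given phrase (field name is an assumption)."""
    return {
        "size": size,
        "query": {
            "match_phrase": {
                "message": message_fragment,
            }
        },
    }

# The JSON body that would be POSTed to the _search endpoint:
body = build_error_query("CRLF detected in a converted file")
print(json.dumps(body, indent=2))
```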
Great to have kibana back!
Another try with another svn repo gives me the following output:
I had to reconfigure the new kibana0 (it was banco before) to start parsing those logs again.
This issue is reproduced on 6 repositories (so far, still running locally on some other big repositories).
To my old self, what are those other 5 repositories?
Sep 26 2018
To get some idea of what we can find, below are some examples of svn:externals property values from googlecode svn projects.
https://wow-xlog.googlecode.com/svn/ LibXEvent-1.0 https://wow-xlog.googlecode.com/svn/branches/LibXEvent-1.0/
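svn:externals values come in several historical formats; as a starting point, here is a minimal sketch that only handles the old `<directory> <URL>` form seen in the examples above (the newer URL-first forms and `-rN` revision pegs are deliberately skipped):

```python
def parse_externals(prop_value):
    """Parse an svn:externals property value of the old `<dir> <URL>`
    form into (directory, url) pairs; any other format is skipped."""
    result = []
    for line in prop_value.splitlines():
        parts = line.split()
        # Old format: exactly two tokens, the second being a URL.
        if len(parts) == 2 and parts[1].startswith(("http://", "https://", "svn://")):
            result.append((parts[0], parts[1]))
    return result
```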
Sep 25 2018
Digging deeper to try and improve the result to return (at the moment, an empty string).
I tried to use chardet to detect the encoding in order to decode the bytes.
This fails, as nothing appropriate is found.
By analyzing the repository dump file in emacs
Thinking a bit more about the issue, there might be a way to work around it in our client code instead of hacking in subvertpy.
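The client-side workaround I have in mind is roughly the following: accept the raw bytes and fall back to a lossy decode when they are not valid UTF-8. This is a sketch of the idea, not what subvertpy actually exposes:

```python
def decode_svn_prop(raw):
    """Decode a raw svn property value; fall back to a lossy decode
    when the bytes are not valid UTF-8."""
    if raw is None:
        return None
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # chardet found nothing appropriate for these values, so keep
        # the data with replacement characters rather than dropping it.
        return raw.decode("utf-8", errors="replace")
```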
Sep 24 2018
Tracking down the issue in subvertpy source code, the error occurs in the subvertpy._ra C extension module.
More precisely, an exception is raised at line 1068 in file subvertpy/editor.c when Python tries to decode a svn property value from 'utf-8' encoding.
1062 static svn_error_t *py_cb_editor_change_prop(void *dir_baton, const char *name, const svn_string_t *value, apr_pool_t *pool)
1063 {
1064     PyObject *self = (PyObject *)dir_baton, *ret;
1065     PyGILState_STATE state = PyGILState_Ensure();
1066
1067     if (value != NULL) {
1068         ret = PyObject_CallMethod(self, "change_prop", "sz#", name, value->data, value->len);
1069     } else {
1070         ret = PyObject_CallMethod(self, "change_prop", "sO", name, Py_None);
1071     }
1072     CB_CHECK_PYRETVAL(ret);
1073     Py_DECREF(ret);
1074     PyGILState_Release(state);
1075     return NULL;
1076 }
(If you are not interested in the details, no need to look further; it's just a raw extract ;)
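The failure can be reproduced in pure Python: the `s`/`z#` format units make `PyObject_CallMethod` build a str from the raw property bytes by decoding them as UTF-8, which is roughly equivalent to:

```python
# Latin-1 encoded text, as found in old svn:log properties, is not
# valid UTF-8; the decode step in subvertpy's C callback fails the
# same way this does.
latin1_log = "propriété".encode("latin-1")

try:
    latin1_log.decode("utf-8")
    failed = False
except UnicodeDecodeError:
    failed = True
```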
This one is still true. Still banging my head on this.
Based on my last tests, I was too confident that svnadmin would be able to load a dump containing an arbitrary revision range
(whether generated by svnrdump or rsvndump). So let's put that incremental dump idea on hold for the moment, as it needs
more investigation.
Sep 22 2018
FWIW, this is my main worry about this approach.
Hmm, it seems there are some subtle corner cases where incremental loading will fail ...
...
To be sure, I quickly patched the rsvndump source code and the issue went away, so my assumption seems right.
....
Sep 21 2018
Hmm, it seems there are some subtle corner cases where incremental loading will fail ...
This is what I got for instance, when playing with the Apache Subversion repository by
loading it incrementally (killing rsvndump randomly in order to load what we dumped so far).
Fix blank spaces in readme and rebase
Let's forget my comments about the svn_url parameter drop
Right.
Let's forget my comments about the svn_url parameter drop and let's land it!
Note: I'm willing to rebase all this on @anlambert's current work to improve the loading speed (D433)
Really nice rework!
Amend and improve the svnrepo initialization
Awesome!
Have you checked that the last part results in the same snapshot as the actual svn loader?
That is, do the full loading with the actual svn loader (up to the 20 revisions), take the snapshot, and compare it with the quoted one.
Really nice rework! Implementing fetch_data / store_data really helps to better understand the loader processing.
I added a couple of comments but it's all good for me.
Thinking more about this, I see something missing in the description.
Sep 20 2018
\m/
Thanks for the thorough description!
It's awesome.
So I took some time to dig a little further into that idea of creating a dump file using the
svnrdump command from the official tools that ship with subversion.
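The incremental idea would drive svnrdump over revision ranges, roughly as sketched below (the URL and revision bounds are hypothetical; as noted in the later comments, loading such partial dumps still needs more investigation):

```python
def build_svnrdump_cmd(repo_url, start_rev, end_rev):
    """Build an `svnrdump dump` command covering only [start_rev:end_rev],
    with --incremental so each dump applies on top of the previous one."""
    return [
        "svnrdump", "dump",
        "--revision", "%d:%d" % (start_rev, end_rev),
        "--incremental",
        repo_url,
    ]
```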
- docs: Remove old notes
- docs: Remove old comparison file