
Sep 19 2018

anlambert closed T570: svn loader: CRLF/LF mess in svn history results in hash computations divergence as Resolved by committing rDLDSVNd372c8d0b1cc: ra.py: Normalize line endings when svn:eol-style property is set.
Sep 19 2018, 3:43 PM · SVN Loader
anlambert closed T570: svn loader: CRLF/LF mess in svn history results in hash computations divergence, a subtask of T466: Test - Ingest XXL svn repository, as Resolved.
Sep 19 2018, 3:43 PM · SVN Loader
ardumont closed T879: Reschedule googlecode svn origins from scratch, a subtask of T617: ingest Google Code Subversion repositories, as Resolved.
Sep 19 2018, 1:56 PM · Archive coverage, Origin-GoogleCode, SVN Loader
ardumont closed T879: Reschedule googlecode svn origins from scratch as Resolved.

That's been done for a while now.

Sep 19 2018, 1:56 PM · Origin-GoogleCode, SVN Loader, Archive content

Sep 11 2018

anlambert added a comment to T570: svn loader: CRLF/LF mess in svn history results in hash computations divergence.

Just to be sure I'm getting your proposal right: this will ensure that the normalization is done aligning with what a checkout will do (rather than the other way around), right?
If so, I agree this is definitely the way to go, as we will be guaranteeing that we archive what users would get out of a (badly stored) SVN revision.

Sep 11 2018, 10:53 AM · SVN Loader
zack added a comment to T570: svn loader: CRLF/LF mess in svn history results in hash computations divergence.

Based on my understanding, this property does not seem to be taken into account
when using the remote access replay api and thus the reconstructed files may
contain line endings different from those generated by 'svn export', so there is a
tree divergence.

Sep 11 2018, 8:21 AM · SVN Loader
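
To make the normalization under discussion concrete, here is a minimal Python sketch of rewriting a reconstructed file's line endings according to its svn:eol-style value. It is an illustration only and not the loader's actual code: `normalize_eol` is a hypothetical helper, and the sketch assumes the property value is one of the documented styles (LF, CRLF, CR, native) and that every terminator found in the file should be rewritten, mixed ones included.

```python
import os
import re
from typing import Optional

# Matches any single line terminator; CRLF is tried first so it is not
# split into a lone CR followed by a lone LF.
_ANY_EOL = re.compile(rb"\r\n|\r|\n")

# Target terminator for each documented svn:eol-style value.
_EOL_BY_STYLE = {
    "LF": b"\n",
    "CRLF": b"\r\n",
    "CR": b"\r",
    "native": os.linesep.encode("ascii"),
}


def normalize_eol(data: bytes, eol_style: Optional[str]) -> bytes:
    """Hypothetical helper: rewrite every line terminator in `data` to the
    one requested by `eol_style`. Content is returned untouched when the
    property is unset or holds an unknown value."""
    if eol_style is None:
        return data
    eol = _EOL_BY_STYLE.get(eol_style.strip())
    if eol is None:
        return data
    # A callable replacement avoids any escape processing of the terminator.
    return _ANY_EOL.sub(lambda _match: eol, data)
```

For instance, `normalize_eol(b"a\r\nb\rc\n", "LF")` gives `b"a\nb\nc\n"`, while content without the property is left byte-for-byte identical.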

Sep 10 2018

ardumont added a comment to T570: svn loader: CRLF/LF mess in svn history results in hash computations divergence.

I took some time to dig into that issue last week and this is what I understood of it.

Sep 10 2018, 6:08 PM · SVN Loader
anlambert added a comment to T570: svn loader: CRLF/LF mess in svn history results in hash computations divergence.

I took some time to dig into that issue last week and this is what I understood of it.

Sep 10 2018, 5:27 PM · SVN Loader
ardumont renamed T570: svn loader: CRLF/LF mess in svn history results in hash computations divergence from Investigate svn bug about altered history to svn loader: CRLF/LF mess in svn history results in hash computations divergence.
Sep 10 2018, 10:29 AM · SVN Loader

Jul 29 2018

zack added a comment to T1161: SVN loader: Create local dump of remote repository to speed up loading task.

Good catch!

Jul 29 2018, 8:19 AM · SVN Loader

Jul 27 2018

anlambert renamed T1161: SVN loader: Create local dump of remote repository to speed up loading task from Subversion loader: Create a local dump of a remote repository to speed up loading task to SVN loader: Create local dump of remote repository to speed up loading task.
Jul 27 2018, 4:46 PM · SVN Loader
anlambert triaged T1161: SVN loader: Create local dump of remote repository to speed up loading task as Wishlist priority.
Jul 27 2018, 2:54 PM · SVN Loader

Jun 19 2018

zack edited projects for T617: ingest Google Code Subversion repositories, added: Archive coverage; removed Archive content.
Jun 19 2018, 3:28 PM · Archive coverage, Origin-GoogleCode, SVN Loader

Feb 16 2018

ardumont added a comment to T923: Mount the asf svn repository mirror.
  • Restored to an earlier point in time.
  • Fixed the input dumps so that they mount in order
  • Restarted the dump mounting routine
Feb 16 2018, 1:52 PM · SVN Loader

Feb 15 2018

ardumont added a comment to T923: Mount the asf svn repository mirror.

Well, the dump is all messed up now, to put it politely...

Feb 15 2018, 5:13 PM · SVN Loader
ardumont added a comment to T923: Mount the asf svn repository mirror.

Well, the dump is all messed up now, to put it politely...
Restoring to an old point in time.

Feb 15 2018, 4:41 PM · SVN Loader
ardumont added a comment to T923: Mount the asf svn repository mirror.

Need to notify our asf contact, but I'll make sure there are no other holes first.

Feb 15 2018, 4:10 PM · SVN Loader
ardumont added a comment to T923: Mount the asf svn repository mirror.

Well, yeah, that file is missing.

Feb 15 2018, 3:35 PM · SVN Loader
ardumont added a comment to T923: Mount the asf svn repository mirror.

Well, nothing too serious. A wrong, possibly even missing, dump file!

Feb 15 2018, 3:33 PM · SVN Loader
ardumont added a comment to T923: Mount the asf svn repository mirror.

Status on this: everything was fine up until today.

Feb 15 2018, 3:28 PM · SVN Loader

Feb 13 2018

ardumont added projects to T617: ingest Google Code Subversion repositories: SVN Loader, Origin-GoogleCode.
Feb 13 2018, 2:31 PM · Archive coverage, Origin-GoogleCode, SVN Loader
ardumont created T958: googlecode import: Clean up googlecode origin's origin_visits.
Feb 13 2018, 1:45 PM · SVN Loader, Origin-GoogleCode, Archive content

Feb 9 2018

ardumont added a comment to T923: Mount the asf svn repository mirror.

I don't see anything new here.

Feb 9 2018, 12:12 PM · SVN Loader
zack added a comment to T923: Mount the asf svn repository mirror.

I don't see anything new here. Subversion offers no integrity guarantees; this applies to the ASF repos just as it applies to any other SVN repo out there. We need to decide on a policy about when (if at all) to redo full ingestions of Subversion repos (which would allow re-injecting modified objects at the cost of forking the resulting history on Software Heritage), or just say *shrug* and never re-ingest in a non-incremental way any Subversion repo we have previously ingested.

Feb 9 2018, 11:59 AM · SVN Loader
ardumont added a comment to T923: Mount the asf svn repository mirror.

During our latest exchange with our asf contact (Greg Stein), I asked about history modification and here is his answer:

Feb 9 2018, 11:35 AM · SVN Loader
ardumont added a comment to T570: svn loader: CRLF/LF mess in svn history results in hash computations divergence.

There, asf has the same inconsistency error, which is now detected:

Feb 9 2018, 9:34 AM · SVN Loader

Feb 8 2018

ardumont added a comment to T923: Mount the asf svn repository mirror.

Update on this: as the dumps tested so far went well,
I have automated the mounting of the remaining dumps.
It's currently running.

Feb 8 2018, 12:35 PM · SVN Loader

Feb 7 2018

ardumont added a comment to T570: svn loader: CRLF/LF mess in svn history results in hash computations divergence.

Of course I forgot to mention (because I only remember it now): it's not only CRLF, it's sometimes a mix of CR/CRLF/LF as well...

Feb 7 2018, 6:29 PM · SVN Loader
ardumont added a comment to T570: svn loader: CRLF/LF mess in svn history results in hash computations divergence.

OK, digging further: all repositories are corrupted, or at least not consistent.
And this inconsistency is unfortunately permitted in the svn toolchain (at least, it was at some point).

Feb 7 2018, 6:14 PM · SVN Loader

Feb 6 2018

ardumont added a comment to T570: svn loader: CRLF/LF mess in svn history results in hash computations divergence.

So far, other dumps show the same error. All of those errors have the same cause: a mix of CRLF line terminators where export does not expect them:

Feb 6 2018, 4:09 PM · SVN Loader
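
As a rough way to spot the kind of file described here, the following Python sketch counts each terminator style in a file's bytes and flags content that mixes more than one. The helper names `count_eols` and `has_mixed_eols` are hypothetical and chosen for illustration; this is not the check the loader or svn itself performs.

```python
import re
from collections import Counter

# CRLF is tried first so it is counted once, not as a CR plus an LF.
_EOL = re.compile(rb"\r\n|\r|\n")
_NAMES = {b"\r\n": "CRLF", b"\r": "CR", b"\n": "LF"}


def count_eols(data: bytes) -> Counter:
    """Hypothetical helper: count line terminators in `data` by style."""
    return Counter(_NAMES[match.group()] for match in _EOL.finditer(data))


def has_mixed_eols(data: bytes) -> bool:
    """True when `data` uses more than one terminator style."""
    return len(count_eols(data)) > 1
```

Running `has_mixed_eols` over the files of a revision that fails the hash comparison would be one way to confirm whether mixed terminators are involved.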
ardumont closed T948: googlecode import: Loading failure on symbolic link edge cases as Resolved by committing rDLDSVN06bcb409d9af: swh.loader.svn: Fix corner edge case on symbolic link.
Feb 6 2018, 3:35 PM · Origin-GoogleCode, SVN Loader, Archive content
ardumont closed T947: googlecode import: Some dumps are just empty repository, a subtask of T879: Reschedule googlecode svn origins from scratch, as Resolved.
Feb 6 2018, 3:35 PM · Origin-GoogleCode, SVN Loader, Archive content
ardumont closed T948: googlecode import: Loading failure on symbolic link edge cases, a subtask of T879: Reschedule googlecode svn origins from scratch, as Resolved.
Feb 6 2018, 3:35 PM · Origin-GoogleCode, SVN Loader, Archive content
ardumont closed T947: googlecode import: Some dumps are just empty repository as Resolved by committing rDLDSVNde3c7a031f8b: swh.loader.svn: Deal with empty svn repository.
Feb 6 2018, 3:35 PM · Origin-GoogleCode, SVN Loader, Archive content
ardumont added a comment to T946: loader-svn: googlecode import: UnicodeDecodeError in user svn properties fails the loading.

This issue has been reproduced on 6 repositories so far (still running locally on some other big repositories).
The failures occur on commits holding non-Unicode characters in tree or file names (Japanese, for example).
What's failing is not clear at all though.

Feb 6 2018, 11:49 AM · Origin-GoogleCode, SVN Loader

Feb 5 2018

ardumont added a comment to T948: googlecode import: Loading failure on symbolic link edge cases.

It appears that in this case, the properties must be changed not to the symlink but to its source.

Feb 5 2018, 5:42 PM · Origin-GoogleCode, SVN Loader, Archive content
ardumont created T948: googlecode import: Loading failure on symbolic link edge cases.
Feb 5 2018, 3:46 PM · Origin-GoogleCode, SVN Loader, Archive content
ardumont changed the status of T947: googlecode import: Some dumps are just empty repository, a subtask of T879: Reschedule googlecode svn origins from scratch, from Open to Work in Progress.
Feb 5 2018, 1:45 PM · Origin-GoogleCode, SVN Loader, Archive content
ardumont renamed T947: googlecode import: Some dumps are just empty repository from googlecode import: Some dumps starts their log to revision 0 to googlecode import: Some dumps are just empty repository.
Feb 5 2018, 1:45 PM · Origin-GoogleCode, SVN Loader, Archive content
ardumont added a comment to T947: googlecode import: Some dumps are just empty repository.

It's more a case of an empty repository than of a repository starting its commit range at 0...

Feb 5 2018, 1:37 PM · Origin-GoogleCode, SVN Loader, Archive content
ardumont created T947: googlecode import: Some dumps are just empty repository.
Feb 5 2018, 11:43 AM · Origin-GoogleCode, SVN Loader, Archive content

Feb 2 2018

ardumont created T946: loader-svn: googlecode import: UnicodeDecodeError in user svn properties fails the loading.
Feb 2 2018, 1:52 PM · Origin-GoogleCode, SVN Loader
ardumont added a comment to T879: Reschedule googlecode svn origins from scratch.

This is on standby during the snapshot migration.

Feb 2 2018, 1:44 PM · Origin-GoogleCode, SVN Loader, Archive content

Jan 15 2018

ardumont added a comment to T923: Mount the asf svn repository mirror.

The first suggestion is currently being tested and so far so good (more than 700k revisions have been processed).

Jan 15 2018, 3:12 PM · SVN Loader

Jan 12 2018

ardumont changed the status of T923: Mount the asf svn repository mirror from Open to Work in Progress.
Jan 12 2018, 2:33 PM · SVN Loader
ardumont changed the status of T923: Mount the asf svn repository mirror, a subtask of T466: Test - Ingest XXL svn repository, from Open to Work in Progress.
Jan 12 2018, 2:33 PM · SVN Loader
ardumont added a comment to T923: Mount the asf svn repository mirror.

Repeating my initial comment.

Jan 12 2018, 2:32 PM · SVN Loader

Jan 11 2018

ardumont created T923: Mount the asf svn repository mirror.
Jan 11 2018, 12:19 PM · SVN Loader
ardumont added a comment to T466: Test - Ingest XXL svn repository.

Well, trying to mount the repository on the side seems to be a task on its own:

Jan 11 2018, 12:13 PM · SVN Loader

Jan 9 2018

ardumont added a comment to T466: Test - Ingest XXL svn repository.

Those dumps are currently being retrieved in uffizi:/srv/storage/space/mirrors/asf.

Jan 9 2018, 4:47 PM · SVN Loader
ardumont added a comment to T466: Test - Ingest XXL svn repository.

Even svn's own tools break on such cases (svnsync must be iteratively called to continue).

Jan 9 2018, 12:35 PM · SVN Loader

Dec 14 2017

ardumont closed T676: Google Code SVN import: Examine ingestion logs for errors and list them if any as Resolved.
Dec 14 2017, 3:24 PM · SVN Loader
ardumont closed T847: loader-svn: Some SVN origins have occurrences that point to non-existent objects as Resolved.
Dec 14 2017, 3:23 PM · SVN Loader
ardumont closed T897: Clean wrong origins which are eclipselabs/apache-extras ones (already injected), a subtask of T847: loader-svn: Some SVN origins have occurrences that point to non-existent objects, as Resolved.
Dec 14 2017, 3:22 PM · SVN Loader
ardumont closed T897: Clean wrong origins which are eclipselabs/apache-extras ones (already injected) as Resolved.
Dec 14 2017, 3:22 PM · SVN Loader
ardumont closed T896: Clean up wrong origins, a subtask of T879: Reschedule googlecode svn origins from scratch, as Resolved.
Dec 14 2017, 3:03 PM · Origin-GoogleCode, SVN Loader, Archive content
ardumont closed T896: Clean up wrong origins as Resolved.
Dec 14 2017, 3:03 PM · Origin-GoogleCode, SVN Loader, Archive content
ardumont added a comment to T897: Clean wrong origins which are eclipselabs/apache-extras ones (already injected).

origins-to-cleanup file: /srv/storage/space/lists/svn/INDEX-svn-dumps-to-cleanup

Dec 14 2017, 2:56 PM · SVN Loader
ardumont changed the status of T897: Clean wrong origins which are eclipselabs/apache-extras ones (already injected) from Open to Work in Progress.
Dec 14 2017, 2:16 PM · SVN Loader
ardumont changed the status of T897: Clean wrong origins which are eclipselabs/apache-extras ones (already injected), a subtask of T847: loader-svn: Some SVN origins have occurrences that point to non-existent objects, from Open to Work in Progress.
Dec 14 2017, 2:16 PM · SVN Loader
ardumont renamed T897: Clean wrong origins which are eclipselabs/apache-extras ones (already injected) from Clean up potential bad origins to Clean wrong origins which are eclipselabs/apache-extras ones (already injected).
Dec 14 2017, 2:15 PM · SVN Loader
ardumont updated the task description for T897: Clean wrong origins which are eclipselabs/apache-extras ones (already injected).
Dec 14 2017, 1:33 PM · SVN Loader
ardumont added a comment to T896: Clean up wrong origins.

P202 checked and OK locally.
Review now requested, as it will remove data from the main db.

Dec 14 2017, 12:06 PM · Origin-GoogleCode, SVN Loader, Archive content
ardumont changed the status of T896: Clean up wrong origins from Open to Work in Progress.
Dec 14 2017, 12:06 PM · Origin-GoogleCode, SVN Loader, Archive content
ardumont changed the status of T896: Clean up wrong origins, a subtask of T879: Reschedule googlecode svn origins from scratch, from Open to Work in Progress.
Dec 14 2017, 12:06 PM · Origin-GoogleCode, SVN Loader, Archive content
ardumont added a comment to T879: Reschedule googlecode svn origins from scratch.

After discussion with the team, it has been decided to remove from the re-scheduling the svn dumps whose compressed size exceeds 2 GiB.
This mirrors the decision taken for git repositories.

Dec 14 2017, 12:05 PM · Origin-GoogleCode, SVN Loader, Archive content
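
The size rule above could be applied with something like the following Python sketch. It assumes the compressed dumps are local files whose on-disk size is the compressed size; `dumps_to_reschedule` and `MAX_COMPRESSED_SIZE` are hypothetical names, not part of the actual scheduling tooling.

```python
import os
from typing import Iterable, List

# 2 GiB expressed in bytes; dumps strictly larger than this are skipped.
MAX_COMPRESSED_SIZE = 2 * 1024 ** 3


def dumps_to_reschedule(
    paths: Iterable[str], limit: int = MAX_COMPRESSED_SIZE
) -> List[str]:
    """Hypothetical helper: keep only the dump files whose size on disk
    does not exceed `limit` bytes, preserving the input order."""
    return [path for path in paths if os.path.getsize(path) <= limit]
```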

Dec 13 2017

ardumont added a comment to T897: Clean wrong origins which are eclipselabs/apache-extras ones (already injected).

T896 will help.

Dec 13 2017, 11:45 AM · SVN Loader
ardumont reopened T847: loader-svn: Some SVN origins have occurrences that point to non-existent objects as "Open".

Reopened as a new, previously missing child task was created.

Dec 13 2017, 11:42 AM · SVN Loader
ardumont created T897: Clean wrong origins which are eclipselabs/apache-extras ones (already injected).
Dec 13 2017, 11:42 AM · SVN Loader
ardumont created T896: Clean up wrong origins.
Dec 13 2017, 11:40 AM · Origin-GoogleCode, SVN Loader, Archive content

Dec 11 2017

ardumont added a comment to T879: Reschedule googlecode svn origins from scratch.

Scheduled back from saatchi (as I needed the producer credentials to access the queue properties):

Dec 11 2017, 5:08 PM · Origin-GoogleCode, SVN Loader, Archive content
ardumont raised the priority of T879: Reschedule googlecode svn origins from scratch from Normal to High.
Dec 11 2017, 11:03 AM · Origin-GoogleCode, SVN Loader, Archive content
ardumont changed the status of T879: Reschedule googlecode svn origins from scratch from Open to Work in Progress.
Dec 11 2017, 11:03 AM · Origin-GoogleCode, SVN Loader, Archive content
ardumont added a comment to T676: Google Code SVN import: Examine ingestion logs for errors and list them if any.

I am closing this in favor of T879.

Dec 11 2017, 11:02 AM · SVN Loader
ardumont updated the task description for T879: Reschedule googlecode svn origins from scratch.
Dec 11 2017, 11:01 AM · Origin-GoogleCode, SVN Loader, Archive content
ardumont created T879: Reschedule googlecode svn origins from scratch.
Dec 11 2017, 10:59 AM · Origin-GoogleCode, SVN Loader, Archive content
ardumont closed T863: loader-svn: Fix origin clashes for homonym but distinct svn dumps, a subtask of T847: loader-svn: Some SVN origins have occurrences that point to non-existent objects, as Resolved.
Dec 11 2017, 10:21 AM · SVN Loader
ardumont closed T863: loader-svn: Fix origin clashes for homonym but distinct svn dumps as Resolved.
Dec 11 2017, 10:21 AM · Origin-GoogleCode, SVN Loader
ardumont closed T847: loader-svn: Some SVN origins have occurrences that point to non-existent objects as Resolved.

Status: there are no longer any missing objects:

Dec 11 2017, 10:20 AM · SVN Loader
ardumont updated the task description for T863: loader-svn: Fix origin clashes for homonym but distinct svn dumps.
Dec 11 2017, 10:20 AM · Origin-GoogleCode, SVN Loader
ardumont added a comment to T863: loader-svn: Fix origin clashes for homonym but distinct svn dumps.

Well, there are errors.

Dec 11 2017, 9:42 AM · Origin-GoogleCode, SVN Loader
ardumont added a comment to T863: loader-svn: Fix origin clashes for homonym but distinct svn dumps.
  • Make sure nothing is amiss
Dec 11 2017, 9:18 AM · Origin-GoogleCode, SVN Loader

Dec 9 2017

ardumont updated the task description for T863: loader-svn: Fix origin clashes for homonym but distinct svn dumps.
Dec 9 2017, 11:25 AM · Origin-GoogleCode, SVN Loader
ardumont updated the task description for T863: loader-svn: Fix origin clashes for homonym but distinct svn dumps.
Dec 9 2017, 11:22 AM · Origin-GoogleCode, SVN Loader
ardumont updated the task description for T863: loader-svn: Fix origin clashes for homonym but distinct svn dumps.
Dec 9 2017, 11:16 AM · Origin-GoogleCode, SVN Loader
ardumont updated the task description for T863: loader-svn: Fix origin clashes for homonym but distinct svn dumps.
Dec 9 2017, 11:14 AM · Origin-GoogleCode, SVN Loader
ardumont added a comment to T570: svn loader: CRLF/LF mess in svn history results in hash computations divergence.
Dec 9 2017, 11:08 AM · SVN Loader
ardumont added a parent task for T847: loader-svn: Some SVN origins have occurrences that point to non-existent objects: T617: ingest Google Code Subversion repositories.
Dec 9 2017, 10:53 AM · SVN Loader

Dec 8 2017

ardumont updated the task description for T863: loader-svn: Fix origin clashes for homonym but distinct svn dumps.
Dec 8 2017, 6:16 PM · Origin-GoogleCode, SVN Loader
ardumont updated the task description for T863: loader-svn: Fix origin clashes for homonym but distinct svn dumps.
Dec 8 2017, 5:26 PM · Origin-GoogleCode, SVN Loader
ardumont updated the task description for T863: loader-svn: Fix origin clashes for homonym but distinct svn dumps.
Dec 8 2017, 5:12 PM · Origin-GoogleCode, SVN Loader
ardumont updated the task description for T863: loader-svn: Fix origin clashes for homonym but distinct svn dumps.
Dec 8 2017, 4:18 PM · Origin-GoogleCode, SVN Loader
ardumont updated the task description for T863: loader-svn: Fix origin clashes for homonym but distinct svn dumps.
Dec 8 2017, 4:11 PM · Origin-GoogleCode, SVN Loader
ardumont added a project to T876: loader-svn: Reschedule origins with missing data: Origin-GoogleCode.
Dec 8 2017, 3:39 PM · Origin-GoogleCode, SVN Loader
ardumont updated the task description for T863: loader-svn: Fix origin clashes for homonym but distinct svn dumps.
Dec 8 2017, 3:38 PM · Origin-GoogleCode, SVN Loader
ardumont renamed T863: loader-svn: Fix origin clashes for homonym but distinct svn dumps from loader-svn: Investigate origin clash for homonym but distinct svn dumps to loader-svn: Fix origin clashes for homonym but distinct svn dumps.
Dec 8 2017, 2:00 PM · Origin-GoogleCode, SVN Loader
ardumont renamed T863: loader-svn: Fix origin clashes for homonym but distinct svn dumps from loader-svn: Investigate potential origin clash for homonym but distinct svn dumps to loader-svn: Investigate origin clash for homonym but distinct svn dumps.
Dec 8 2017, 1:59 PM · Origin-GoogleCode, SVN Loader
ardumont renamed T847: loader-svn: Some SVN origins have occurrences that point to non-existent objects from Some SVN origins have occurrences that point to non-existent objects to loader-svn: Some SVN origins have occurrences that point to non-existent objects.
Dec 8 2017, 1:50 PM · SVN Loader
ardumont updated the task description for T876: loader-svn: Reschedule origins with missing data.
Dec 8 2017, 1:48 PM · Origin-GoogleCode, SVN Loader
ardumont claimed T847: loader-svn: Some SVN origins have occurrences that point to non-existent objects.
Dec 8 2017, 1:42 PM · SVN Loader