
Feb 21 2018

fiendish added a comment to T329: hg / mercurial loader.

Can we associate the name of the temporary storage directory for a load with that loader's pid, and then make every new loader instance compare existing temp storage dirs during init? If a storage directory exists for a process that does not exist (because the process was killed) then it can be deleted.
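The idea above could be sketched roughly as follows. This is only an illustration, not code from the loader: the `PREFIX` naming scheme and the helper names `make_temp_dir`, `pid_alive`, and `clean_stale_dirs` are hypothetical, and the liveness check assumes a POSIX system where `os.kill(pid, 0)` probes a process without signalling it. Note that pids can be reused by the OS, so a real implementation would probably want an extra safeguard.

```python
import os
import shutil
import tempfile

# Hypothetical naming scheme: <PREFIX><pid>.<random suffix>
PREFIX = "swh.loader.mercurial."


def make_temp_dir(root, pid=None):
    """Create a temp dir under root whose name embeds the owning pid."""
    pid = os.getpid() if pid is None else pid
    return tempfile.mkdtemp(prefix=f"{PREFIX}{pid}.", dir=root)


def pid_alive(pid):
    """Return True if a process with this pid exists (POSIX assumption)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def clean_stale_dirs(root, is_alive=pid_alive):
    """Delete temp dirs under root whose owning process no longer exists."""
    removed = []
    for name in os.listdir(root):
        if not name.startswith(PREFIX):
            continue
        pid_part = name[len(PREFIX):].split(".", 1)[0]
        if not pid_part.isdigit():
            continue  # not one of ours, leave it alone
        if not is_alive(int(pid_part)):
            shutil.rmtree(os.path.join(root, name), ignore_errors=True)
            removed.append(name)
    return sorted(removed)
```

A loader would call `clean_stale_dirs(root)` once during init, before calling `make_temp_dir(root)` for its own work.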

Feb 21 2018, 3:42 PM · Mercurial loader
fiendish added a comment to T329: hg / mercurial loader.

I worry that RAM is way more constrained than disk space is. It seems like the biggest problem is/was

Feb 21 2018, 3:27 PM · Mercurial loader
fiendish added a comment to T964: 2018-02-16 worker disk full postmortem.

If cache files are sticking around, then of course the code should make sure that they go away when done or aborted. But I think that a few G used during processing of extremely large repos should be acceptable. :/

Feb 21 2018, 5:18 AM · Mercurial loader
fiendish added a comment to T329: hg / mercurial loader.

I think 6e12c90b160ad3277a1edea27a05f9adea1bc92f may be a bad idea. Have you tested how much RAM it takes to hold the whole dirs dict in memory on a very large repo like mozilla-unified?

Feb 21 2018, 5:08 AM · Mercurial loader
fiendish added a comment to T970: mercurial loader: What to do in case of .hgtags?.

I agree with taking tags from both sides and discarding all lines that don't fit the pattern.
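A minimal sketch of that filtering, assuming "both sides" refers to the two sides of an unresolved merge conflict left in `.hgtags`, and that a valid line is a 40-character hex node, a space, and a tag name. The function name `parse_hgtags` is hypothetical, and letting later lines override earlier ones is a choice made for this sketch rather than something taken from the loader.

```python
import re

# A valid .hgtags line: 40 hex characters, one space, then the tag name.
HGTAGS_LINE = re.compile(rb"^([0-9a-f]{40}) (.+)$")


def parse_hgtags(data):
    """Map tag name to node, silently skipping lines that do not fit."""
    tags = {}
    for line in data.splitlines():
        match = HGTAGS_LINE.match(line.strip())
        if match is None:
            continue  # conflict markers, blank lines, other junk
        node, name = match.groups()
        tags[name] = node  # later lines override earlier ones (assumption)
    return tags
```

With this approach, conflict markers such as `<<<<<<<`, `=======`, and `>>>>>>>` simply fail the pattern, so entries from both sides of the conflict are kept without any special handling.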

Feb 21 2018, 4:28 AM · Archive content, Mercurial loader

Feb 20 2018

fiendish added a comment to T329: hg / mercurial loader.

As discussed on IRC a short while ago (just leaving this as a note here), seeing 2 caches is normal and expected, since one is spawned inside the reader and one in the loader. Will have to also pass that argument to the reader instance.

Feb 20 2018, 10:16 PM · Mercurial loader

Feb 15 2018

fiendish committed rDLDHG1770cdf04823: oops. naught nodes. (authored by fiendish).
Feb 15 2018, 9:56 AM
fiendish changed the edit policy for P224 (An Untitled Masterwork).
Feb 15 2018, 9:38 AM
fiendish committed rDLDHG2caf1b1fe568: use parent rev ids instead of node ids (authored by fiendish).
Feb 15 2018, 8:42 AM
fiendish committed rDLDHG7411fb6d8815: don't keep hglib attached. it can use a lot of ram (authored by fiendish).
Feb 15 2018, 8:24 AM

Feb 14 2018

fiendish added a comment to T329: hg / mercurial loader.

Does it make sense to open that in the loader's configuration property?

Feb 14 2018, 4:10 AM · Mercurial loader

Feb 13 2018

fiendish added a comment to T329: hg / mercurial loader.

The bundle loader is tunable to use less ram and therefore more disk for its live caching (though I need to revisit the counter to make the tuning argument less arbitrary and more representative of real bytes used, because it currently ignores overhead and python data has a lot of overhead).
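To illustrate the trade-off, here is a rough sketch of a cache with a RAM budget that spills to disk, where the counter uses `sys.getsizeof` so that per-object overhead is included rather than only payload length. This is not the bundle loader's actual cache: the class `SpillingCache` and its parameters are hypothetical, and the estimate still ignores the overhead of the dict itself.

```python
import os
import sys
import tempfile


class SpillingCache:
    """Keep values in RAM up to max_ram_bytes, then write new ones to disk.

    Assumes bytes values and that each key is stored only once.
    """

    def __init__(self, max_ram_bytes, directory):
        self.max_ram_bytes = max_ram_bytes  # the tuning knob
        self.directory = directory
        self.ram = {}
        self.on_disk = {}
        self.ram_bytes = 0

    @staticmethod
    def _cost(key, value):
        # getsizeof counts the Python object header as well as the payload,
        # so small objects cost noticeably more than len() would suggest.
        return sys.getsizeof(key) + sys.getsizeof(value)

    def store(self, key, value):
        cost = self._cost(key, value)
        if self.ram_bytes + cost <= self.max_ram_bytes:
            self.ram[key] = value
            self.ram_bytes += cost
        else:
            fd, path = tempfile.mkstemp(dir=self.directory)
            with os.fdopen(fd, "wb") as handle:
                handle.write(value)
            self.on_disk[key] = path

    def fetch(self, key):
        if key in self.ram:
            return self.ram[key]
        with open(self.on_disk[key], "rb") as handle:
            return handle.read()
```

Lowering `max_ram_bytes` pushes more entries to disk, which is the "less RAM, more disk" tuning described above.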

Feb 13 2018, 5:13 PM · Mercurial loader
fiendish added a comment to T329: hg / mercurial loader.

The bundle step, for some repositories, currently needs quite a lot of RAM

Feb 13 2018, 4:50 PM · Mercurial loader

Feb 9 2018

fiendish added a comment to T682: Ingest Google Code Mercurial repositories.

yay

Feb 9 2018, 10:53 PM · Archive coverage, Mercurial loader

Feb 8 2018

fiendish added a comment to T329: hg / mercurial loader.

Well I'm not sure what just happened, but I committed a patch (and apparently also some duplicate history).

Feb 8 2018, 8:21 PM · Mercurial loader
fiendish committed rDLDHG871812306973: Merge branch 'master' of ssh://forge.softwareheritage.org/source/swh-loader… (authored by fiendish).
Feb 8 2018, 8:13 PM
fiendish committed rDLDHG7d1134eecd06: Bump requirements for new swh.loader.core (authored by olasd).
Feb 8 2018, 8:13 PM
fiendish committed rDLDHG94b576313105: objects: make all functions conform to flake8 (authored by olasd).
Feb 8 2018, 8:13 PM
fiendish committed rDLDHG4b236447d16e: swh.mercurial.loader: Fix slow_loader release computation (authored by ardumont).
Feb 8 2018, 8:13 PM
fiendish committed rDLDHG1f0633f629b7: swh.loader.mercurial: Remove unneeded flush method (authored by ardumont).
Feb 8 2018, 8:12 PM
fiendish committed rDLDHG4e4f019d0cde: swh.mercurial.loader: Fix slow_loader + replace occs with snapshot (authored by ardumont).
Feb 8 2018, 8:12 PM
fiendish committed rDLDHG287ade13dedb: revert fbdd798b0e32a4cc0ef50b08ae2217d45f95e7ad and skip some work when possible (authored by fiendish).
Feb 8 2018, 8:12 PM
fiendish committed rDLDHGc2082f52fefe: swh.mercurial.loader: Replace occurrences with snapshot (authored by ardumont).
Feb 8 2018, 8:12 PM
fiendish added a reverting change for rDLDHGfbdd798b0e32: swh.loader.mercurial.loader: Fix content_missing call to storage: rDLDHG287ade13dedb: revert fbdd798b0e32a4cc0ef50b08ae2217d45f95e7ad and skip some work when possible.
Feb 8 2018, 8:12 PM
fiendish committed rDLDHG04cb69275736: comment about compression (authored by fiendish).
Feb 8 2018, 8:12 PM
fiendish added a comment to T329: hg / mercurial loader.

I'll do it as part of my patch, but I will need you to look at it. You made the original changes for good reasons, so I just want to make sure that the reasons are preserved.

Feb 8 2018, 10:09 AM · Mercurial loader

Feb 7 2018

fiendish added a comment to T329: hg / mercurial loader.

Also commit fbdd798b0e32a4cc0ef50b08ae2217d45f95e7ad is very problematic.

Feb 7 2018, 10:38 PM · Mercurial loader

Feb 2 2018

fiendish added a comment to T329: hg / mercurial loader.

I propose to treat remote and local repositories the same (for now at least) with hg incoming to write the bundle in bundle20_loader:prepare. (This may require building mercurial from available 4.5 source to not hit some giant memory leak)

Feb 2 2018, 10:18 PM · Mercurial loader
fiendish added a comment to T329: hg / mercurial loader.
Feb 2 2018, 6:56 AM · Mercurial loader

Jan 12 2018

fiendish added a comment to T329: hg / mercurial loader.

For fetching the blob, the only gotcha I see is that we may have contents without data (the big ones are filtered out).

Jan 12 2018, 2:58 AM · Mercurial loader

Dec 26 2017

fiendish added a comment to T329: hg / mercurial loader.
Dec 26 2017, 6:44 PM · Mercurial loader
fiendish added a comment to T329: hg / mercurial loader.

As entertained in the code, only the bundle20 format is supported

Dec 26 2017, 6:43 PM · Mercurial loader

Dec 25 2017

fiendish committed rDLDHGdbd12db8f569: force v2 bundle generation (authored by fiendish).
Dec 25 2017, 9:51 PM
fiendish committed rDLDHG8e5f04fa6671: prepare args were reordered in previous commit (authored by fiendish).
Dec 25 2017, 9:44 PM

Oct 16 2017

fiendish updated the summary of D256: mercurial bundle20 parser/loader.
Oct 16 2017, 10:28 PM
fiendish updated the diff for D256: mercurial bundle20 parser/loader.

go back to creation of chunked_reader

Oct 16 2017, 10:11 PM
fiendish added a task to D256: mercurial bundle20 parser/loader: T329: hg / mercurial loader.
Oct 16 2017, 10:06 PM
fiendish added a revision to T329: hg / mercurial loader: D256: mercurial bundle20 parser/loader.
Oct 16 2017, 10:06 PM · Mercurial loader
fiendish updated the summary of D256: mercurial bundle20 parser/loader.
Oct 16 2017, 10:02 PM
fiendish edited reviewers for D256: mercurial bundle20 parser/loader, added: Reviewers; removed: Mercurial loader, Core Loader.
Oct 16 2017, 10:01 PM
fiendish changed the edit policy for D256: mercurial bundle20 parser/loader.
Oct 16 2017, 9:58 PM
fiendish committed rDLDHG6b5527e45966: fast loader (authored by fiendish).
Oct 16 2017, 9:33 PM

Jun 9 2017

fiendish committed rDLDHG6cc321b5f6e5: chunked reader for hg20 bundles (authored by fiendish).
Jun 9 2017, 3:18 PM

May 30 2017

fiendish committed rDLDHGdbd3c734e632: fix logic and speed way up (authored by fiendish).
May 30 2017, 1:42 AM

May 24 2017

fiendish committed rDLDHG24a2433de277: remove the intermediate file read object (authored by fiendish).
May 24 2017, 9:58 AM

May 23 2017

fiendish committed rDLDHG7d40b4fff36a: allow second resolution in time offsets (authored by fiendish).
May 23 2017, 11:12 AM

May 21 2017

fiendish committed rDLDHG5b230a590787: soften comment (authored by fiendish).
May 21 2017, 10:55 AM
fiendish committed rDLDHGe589b64a1cbd: HG bundle20 parser first prototype (authored by fiendish).
May 21 2017, 10:46 AM

May 17 2017

fiendish edited P162 This comment currently resides at the top of the mercurial bundle_loader that I'm working on.
May 17 2017, 1:19 PM · Mercurial loader
fiendish created P162 This comment currently resides at the top of the mercurial bundle_loader that I'm working on.
May 17 2017, 12:44 PM · Mercurial loader

May 9 2017

fiendish added a comment to T329: hg / mercurial loader.

I don't know if I should finish cleaning up slow_loader for code review, since the hglib interface is so slow as to be next to useless.

May 9 2017, 5:57 PM · Mercurial loader
fiendish committed rDLDHG154cce580ac6: might as well push the slow hglib mercurial loader (authored by fiendish).
May 9 2017, 5:47 PM

Apr 21 2017

fiendish edited P153 hglib vs dulwich finding blobs.
Apr 21 2017, 3:29 PM

Apr 4 2017

fiendish edited P153 hglib vs dulwich finding blobs.
Apr 4 2017, 10:51 PM
fiendish edited P153 hglib vs dulwich finding blobs.
Apr 4 2017, 10:14 PM
fiendish edited P153 hglib vs dulwich finding blobs.
Apr 4 2017, 10:13 PM
fiendish created P153 hglib vs dulwich finding blobs.
Apr 4 2017, 4:19 PM

Mar 17 2017

fiendish added a comment to T592: ingest bitbucket git repositories.

When would be a good time to try to get this running?

Mar 17 2017, 5:37 PM · Archive coverage, Origin-Bitbucket

Mar 10 2017

fiendish added a comment to P143 (An Untitled Masterwork).
or f['perms'] != parent_dir[fname]['perms'])):  # please don't remove the double parens. pydocstyle needs them.
Mar 10 2017, 5:22 PM

Mar 6 2017

fiendish closed T591: lister for bitbucket repositories as Resolved.
Mar 6 2017, 5:45 PM · Origin-Bitbucket
fiendish closed T591: lister for bitbucket repositories, a subtask of T561: ingest bitbucket (meta task), as Resolved.
Mar 6 2017, 5:45 PM · Archive coverage, Origin-Bitbucket
fiendish closed D165: refactor github lister into something more generic.

does this work?

Mar 6 2017, 1:27 PM · GitHub lister, Bitbucket lister
fiendish committed rDLS68d77fd43f54: Refactor lister code (authored by fiendish).
Mar 6 2017, 12:36 PM

Mar 2 2017

fiendish updated the diff for D165: refactor github lister into something more generic.

Formatted the test responses and added commentary on the life goals of tasks.

Mar 2 2017, 6:08 PM · GitHub lister, Bitbucket lister
fiendish added inline comments to D165: refactor github lister into something more generic.
Mar 2 2017, 11:50 AM · GitHub lister, Bitbucket lister

Feb 23 2017

fiendish added a comment to D165: refactor github lister into something more generic.

what does this button do?

Feb 23 2017, 2:15 PM · GitHub lister, Bitbucket lister
fiendish updated the diff for D165: refactor github lister into something more generic.

shot at making the lister base and intermediate classes agnostic to the transport layer

Feb 23 2017, 2:01 PM · GitHub lister, Bitbucket lister

Feb 20 2017

fiendish added inline comments to D165: refactor github lister into something more generic.
Feb 20 2017, 6:41 PM · GitHub lister, Bitbucket lister
fiendish updated the diff for D165: refactor github lister into something more generic.

updated requirements

Feb 20 2017, 5:34 PM · GitHub lister, Bitbucket lister
fiendish updated the diff for D165: refactor github lister into something more generic.

rebase on origin/master, more tests + bug fixes

Feb 20 2017, 4:30 PM · GitHub lister, Bitbucket lister
fiendish updated the test plan for D165: refactor github lister into something more generic.
Feb 20 2017, 9:40 AM · GitHub lister, Bitbucket lister
fiendish updated the diff for D165: refactor github lister into something more generic.

longer tests, more refactoring

Feb 20 2017, 2:45 AM · GitHub lister, Bitbucket lister

Feb 18 2017

fiendish added a revision to T591: lister for bitbucket repositories: D165: refactor github lister into something more generic.
Feb 18 2017, 6:13 PM · Origin-Bitbucket
fiendish added a task to D165: refactor github lister into something more generic: T591: lister for bitbucket repositories.
Feb 18 2017, 6:13 PM · GitHub lister, Bitbucket lister
fiendish created D165: refactor github lister into something more generic.
Feb 18 2017, 6:12 PM · GitHub lister, Bitbucket lister