- User Since
- Jan 30 2017, 11:00 PM (312 w, 5 d)
Feb 21 2018
Can we associate the name of the temporary storage directory for a load with that loader's pid, and then make every new loader instance compare existing temp storage dirs during init? If a storage directory exists for a process that does not exist (because the process was killed) then it can be deleted.
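A minimal sketch of that idea, assuming a POSIX system and a made-up directory naming scheme (`hg-loader-<pid>-<random>`). The names `PREFIX`, `pid_is_alive`, `clean_stale_dirs`, and `make_temp_dir` are hypothetical and not taken from the loader code.

```python
import os
import re
import shutil
import tempfile

PREFIX = "hg-loader-"  # hypothetical prefix, not the loader's real one


def pid_is_alive(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def clean_stale_dirs(base_dir, prefix=PREFIX):
    """Delete temp dirs whose owning pid no longer exists; return their names."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)-")
    removed = []
    for name in os.listdir(base_dir):
        match = pattern.match(name)
        path = os.path.join(base_dir, name)
        if match is None or not os.path.isdir(path):
            continue
        if not pid_is_alive(int(match.group(1))):
            shutil.rmtree(path, ignore_errors=True)
            removed.append(name)
    return removed


def make_temp_dir(base_dir, prefix=PREFIX):
    """Clean up after dead loaders, then create a dir tagged with our pid."""
    clean_stale_dirs(base_dir, prefix)
    return tempfile.mkdtemp(prefix=f"{prefix}{os.getpid()}-", dir=base_dir)
```

One caveat with this approach: pids get reused, so a stale directory can survive if an unrelated process happens to have the same pid as the dead loader.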
I worry that RAM is way more constrained than disk space is. It seems like the biggest problem is/was
If cache files are sticking around, then of course the code should make sure that they go away when done or aborted. But I think that a few G used during processing of extremely large repos should be acceptable. :/
I think 6e12c90b160ad3277a1edea27a05f9adea1bc92f may be a bad idea. Have you tested how much RAM it takes to hold the whole dirs dict in memory on a very large repo like mozilla-unified?
I agree with taking tags from both sides and discarding all lines that don't fit the pattern.
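Roughly what I have in mind, as a sketch only. It assumes the pattern is the usual `.hgtags` line shape (a 40-character hex node, a space, then the tag name); `TAG_LINE` and `merge_tag_lines` are hypothetical names.

```python
import re

# Assumed pattern: "<40 hex chars> <tag name>", as in a .hgtags file.
TAG_LINE = re.compile(r"^([0-9a-f]{40}) (\S.*)$")


def merge_tag_lines(ours, theirs):
    """Keep well-formed tag lines from both sides, in order, without duplicates."""
    merged = []
    seen = set()
    for line in list(ours) + list(theirs):
        line = line.rstrip("\n")
        # Anything that doesn't fit the pattern (conflict markers, junk) is dropped.
        if TAG_LINE.match(line) and line not in seen:
            seen.add(line)
            merged.append(line)
    return merged
```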
Feb 20 2018
As discussed on IRC a short while ago (just leaving this as a note here), seeing 2 caches is normal and expected, since one is spawned inside the reader and one in the loader. I will also have to pass that argument to the reader instance.
Feb 15 2018
Feb 14 2018
Does it make sense to open that in the loader's configuration property?
Feb 13 2018
The bundle loader is tunable to use less RAM and therefore more disk for its live caching (though I need to revisit the counter to make the tuning argument less arbitrary and more representative of real bytes used, because it currently ignores overhead, and Python data has a lot of overhead).
For some repositories, the bundle step currently needs quite a lot of RAM.
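To illustrate the overhead point, here is a rough sketch of a counter that includes Python object overhead by using `sys.getsizeof` recursively. `approx_size` and `ByteBudget` are hypothetical names, not the loader's actual counter, and the estimate is only approximate (shared or interned objects are counted once).

```python
import sys


def approx_size(obj, _seen=None):
    """Estimate the bytes used by obj, including container and object overhead."""
    seen = set() if _seen is None else _seen
    if id(obj) in seen:
        return 0  # already counted (shared object)
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(approx_size(k, seen) + approx_size(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(approx_size(item, seen) for item in obj)
    return size


class ByteBudget:
    """Track estimated memory use and report when a limit is exceeded."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.used = 0

    def add(self, obj):
        """Account for obj; return True once the cache should spill to disk."""
        self.used += approx_size(obj)
        return self.used > self.limit_bytes
```

Counting only payload lengths would report far less than this for a dict of many small byte strings, which is the gap the tuning argument currently hides.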
Feb 9 2018
Feb 8 2018
Well, I'm not sure what just happened, but I committed a patch (and apparently also some duplicate history).
I'll do it as part of my patch, but I will need you to look at it. You made the original changes for good reasons, so I just want to make sure that the reasons are preserved.
Feb 7 2018
Also commit fbdd798b0e32a4cc0ef50b08ae2217d45f95e7ad is very problematic.
Feb 2 2018
I propose to treat remote and local repositories the same (for now at least), using hg incoming to write the bundle in bundle20_loader:prepare. (This may require building Mercurial from the available 4.5 source to avoid hitting a giant memory leak.)
Jan 12 2018
For fetching the blob, the only gotcha I see is that we may have contents without data (the big ones are filtered out).
Dec 26 2017
As entertained in the code, only the bundle20 format is supported.
Dec 25 2017
Oct 16 2017
go back to creation of chunked_reader
Jun 9 2017
May 30 2017
May 24 2017
May 23 2017
May 21 2017
May 17 2017
May 9 2017
I don't know if I should finish cleaning up slow_loader for code review, since the hglib interface is so slow as to be next to useless.
Apr 21 2017
Apr 4 2017
Mar 17 2017
When would be a good time to try to get this running?
Mar 10 2017
or f['perms'] != parent_dir[fname]['perms'])): # please don't remove the double parens. pydocstyle needs them.
Mar 6 2017
does this work?
Mar 2 2017
Formatted the test responses and added commentary on the life goals of tasks.
Feb 23 2017
what does this button do?
shot at making the lister base and intermediate classes agnostic to the transport layer
Feb 20 2017
rebase on origin/master, more tests + bug fixes
longer tests, more refactoring