Oh well...
All Stories
Oct 4 2021
The buffer proxy does not buffer visit statuses, so we need to flush the buffer before creating visit statuses.
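A minimal sketch of the resulting call pattern, assuming the usual swh.storage pipeline configuration (the flush() before origin_visit_status_add is the point; the rest is illustrative):

```python
from swh.storage import get_storage

# "buffer" batches contents/directories/revisions/etc. before the backend;
# the "memory" step stands in for the real backend in this sketch.
storage = get_storage(
    "pipeline",
    steps=[
        {"cls": "buffer"},
        {"cls": "memory"},
    ],
)

def store_visit_statuses(statuses):
    # Visit statuses bypass the buffer, so flush buffered objects first;
    # otherwise a status could land before the objects it refers to.
    storage.flush()
    storage.origin_visit_status_add(statuses)
```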
turn backend classes into context managers
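For reference, a generic sketch of the pattern (Backend and its open()/close() are placeholders, not the actual classes touched here):

```python
class Backend:
    """Placeholder backend; only the context-manager protocol matters here."""

    def open(self):
        ...  # acquire connections/resources

    def close(self):
        ...  # release them

    def __enter__(self):
        self.open()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow exceptions

# Usage: resources are released even if the block raises.
# with Backend() as backend:
#     ...
```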
squash with D6272
squash with D6273
It makes sense to create a dedicated swh-perfecthash package.
That I did not know, so indeed, if we need a specific wrapper for our needs, ...
In addition to being unmaintained, this could be addressed by asking the authors to be in charge of the package.
Thanks vlorentz. Done.
In D6399#166050, @ardumont wrote: Do you have the pre-commit installed in that repository yet? [1]
[1] https://docs.softwareheritage.org/devel/developer-setup.html#checkout-the-source-code
According to the snippet referenced by @ardumont, all branch names starting with refs/pull/ should be filtered out.
But in the recent snapshot of torvalds/linux there are a lot of branch names like that.
How come?
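For reference, the kind of filtering described above could look like this (a hypothetical helper, not the actual snippet referenced by @ardumont):

```python
def filter_pull_refs(refs: dict) -> dict:
    """Drop GitHub pull-request refs (refs/pull/...) before loading.

    Ref names are bytes, as in dulwich's refs mapping.
    """
    return {
        name: target
        for name, target in refs.items()
        if not name.startswith(b"refs/pull/")
    }
```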
And aside from the tox.ini/requirements.txt change, this diff actually replaces hacks with better code, so it's a win-win
Good idea.
@anlambert This ticket might have some performance implications.
For example, in the first case, to redirect /browse/origin/directory/?origin_url=<> to the root directory, we have to query the archive first. The obvious way would be to call the get_snapshot_context function:
https://forge.softwareheritage.org/source/swh-web/browse/master/swh/web/browse/snapshot_context.py$395
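A rough sketch of that approach as a Django view (the get_snapshot_context signature is simplified, and the "root_directory_url" key is an assumption for this sketch):

```python
from django.shortcuts import redirect

from swh.web.browse.snapshot_context import get_snapshot_context

def origin_directory_redirect(request):
    origin_url = request.GET["origin_url"]
    # The archive query: resolving the latest snapshot/visit of the
    # origin is what makes this redirect potentially expensive.
    snapshot_context = get_snapshot_context(origin_url=origin_url)
    # "root_directory_url" is a hypothetical key used for illustration.
    return redirect(snapshot_context["root_directory_url"])
```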
Ideally this doc would (briefly) describe how Bazaar works and how it differs from the already supported DVCSs, then document the chosen "mapping" of the bzr model onto swh (especially mentioning what is lost in the process).
In T3104#71609, @dachary wrote: SWH I guess: I don't see the difference whether it's embedded in swh-objstorage, winery or a dedicated package.
If I understand correctly, you're suggesting that I create a package at the same level as https://forge.softwareheritage.org/source/puppet-swh-site/, right? For instance https://forge.softwareheritage.org/source/swh-perfecthash/, by following the instructions from the documentation.
So does it make sense to use this package instead of reimplementing one? What's the catch?
In addition to being unmaintained,
Would it be possible to add some "design documentation" in the docs/ of the BZR loader repo (possibly with D6344 or as a standalone diff)?
SWH I guess: I don't see the difference whether it's embedded in swh-objstorage, winery or a dedicated package.
@borisbaldassari Click "Add Action..." over the comment box, select "Abandon Revision", then submit
keda looks promising. P1193 is an example configuration that works for the docker environment. It is able to scale to 0 when no messages are present on the queue.
When messages are present, the loaders are launched progressively until the cpu/memory limit of the host is reached or the maximum number of allowed workers is reached.
- commands
```
/usr/bin/time -v swh loader run git https://github.com/CocoaPods/Specs
/usr/bin/time -v swh loader run git https://github.com/cozy/cozy-stack
/usr/bin/time -v swh loader run git https://github.com/hylang/hy
/usr/bin/time -v swh loader run git https://github.com/vsellier/easy-cozy
/usr/bin/time -v swh loader run git https://github.com/rancher/dashboard
/usr/bin/time -v swh loader run git https://github.com/kubernetes/kubectl
/usr/bin/time -v swh loader run git https://github.com/git/git
/usr/bin/time -v swh loader run git https://github.com/torvalds/linux
/usr/bin/time -v swh loader run git https://github.com/rust-lang/rust
```
That was just a test, trash it.
```
07:38:12 py3 run-test: commands[0] | docker run debian:bullseye date
07:38:12 Unable to find image 'debian:bullseye' locally
07:38:12 bullseye: Pulling from library/debian
07:38:12 df5590a8898b: Already exists
07:38:12 Digest: sha256:86dddd82dddf445aea3d2ea26af46cebd727bf2f47ed810fa1450a0d79722d55
07:38:12 Status: Downloaded newer image for debian:bullseye
```
tox is called with an explicit -e, so adding a new environment is a no-op unless the matching Jenkins job is updated.
Oct 3 2021
All runs were done on medium to large repositories.
There are no diverging hashes, and the loader-git run with the patched version consistently uses less memory.
Runs on large repositories:
|---------+-----------------+-------+-------------------------+-------------------------+------------------------|
| Machine | Repository      | refs  | Snapshot                | Memory (max RSS kbytes) | Elapsed time (h:mm:ss) |
|---------+-----------------+-------+-------------------------+-------------------------+------------------------|
| staging | torvalds/linux  | 1496  | \xc2847...3fb4          | 1361324                 | 6:59:16                |
| prod    | //              | //    | \xc2847...3fb4          | 3080408                 | 24:13:11               |
|---------+-----------------+-------+-------------------------+-------------------------+------------------------|
| staging | CocoaPods/Specs | 14036 | X (hash mismatched) [1] | 5789344                 | 23:10:48               |
| prod    | //              | //    | X (killed) [2]          | 14280284                | 10:09:09               |
|---------+-----------------+-------+-------------------------+-------------------------+------------------------|
what about naming the parameter create_snapshot instead?
Fine with me.
Prior to this commit, it was implied that store_data could only be called once. It's a limitation that needs to change for some ongoing optimizations in the git loader.

Is it, though? It allows creating "partial" snapshots.
Well, yes. But even with this, that's still the case (if create_snapshot is True after the first round).

Without this, though, we cannot go through more than one iteration of the loop (in the git loader, which is the sole running subclass of this). The ingestion fails because it wants to create one snapshot after the first store_data call (so only one loop is allowed with the current implementation); as it is missing the references needed to build the snapshot, it fails.

Reading all the references is done over multiple iterations (an ongoing optimization to read the packfiles in multiple steps).
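A rough sketch of the intended control flow (method names follow the loader API discussed here; this is not the actual implementation):

```python
def load(self):
    more_data = True
    while more_data:
        # Each iteration reads one part of the packfile (the ongoing
        # optimization) and stores what has been read so far.
        more_data = self.fetch_data()
        # Only build the snapshot on the last round, once all the
        # references it needs have been read.
        self.store_data(create_snapshot=not more_data)
```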
Adapt according to latest analysis/development