Query: Advanced Search

recover_corrupt_objects: clean imports

recover_corrupt_objects: use logging instead of tqdm

recover_corrupt_objects: use stream_results instead of stream_results_optional

unattended_upgrades: Allow the new codename for debian-security

recover_corrupt_objects: make assertions more useful

recover_corrupt_objects: Remove spurious select

recover_corrupt_objects: Pull config from a configfile instead of the cli

I wonder if that'd be worth a warning. May end up being a bit noisy though.

Thanks! (and sorry for the hash algo ping-pong)

I'm not 100% convinced we need to recheck the objects at every addition (within a transaction that can still fail to commit) instead of afterwards, but it doesn't /hurt/ either. We'll make a full pass on all objects again later anyway.

Oof. I guess that's one more reason to be wary of TODOs in tests.

On most forges, 403 errors are used in place of 404 errors (so you're not able to do discovery of private repositories by checking the return code). I think treating them as not found is correct.
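A minimal sketch of that status handling, assuming the caller only has the HTTP status code at hand. The names `NOT_FOUND_STATUSES` and `is_not_found` are hypothetical and are not taken from any loader code.

```python
# Hypothetical helper: treat 403 like 404, since many forges answer 403
# for private or missing repositories to prevent discovery.
NOT_FOUND_STATUSES = frozenset({403, 404})


def is_not_found(status_code: int) -> bool:
    """Return True when the response should be handled as 'origin not found'."""
    return status_code in NOT_FOUND_STATUSES
```

Other error codes (for example 500) are deliberately left out, so they can still be reported as failures rather than as missing origins.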

Either way, I think this is good to go, thanks!

We still query the fields out of the Postgres database (to ignore them in the db -> model object conversion), right? Should we stop doing that too?

Sounds good to me, thanks!

The buffer bits look reasonable to me. I'm not so sure about the validator bits, as we're not actually storing the id fields for these objects, afaik?

Add provenance-client01 virtual machine

Update tfstate after provisioning the machine

Could we create this script, for each kafka cluster, on the kafka management host rather than on individual brokers, so we can have all the management in a single place?

In D6983#181657, @ardumont wrote:

lgtm

Note: jsyk, the inventory entry [1] references 40G for that node and the default is 32G (as per the test plan).

[1] https://inventory.internal.softwareheritage.org/virtualization/virtual-machines/108/edit/

Any chance you could add a couple of tests for this?

proxmox/terraform: Make the vmid variable properly optional

Mark kelvingrove's cpu as host instead of kvm64

softwareheritage=#   alter table revision
    add constraint revision_date_offset_not_null
    check (date is null or date_offset_bytes is not null) not valid,
    add constraint revision_committer_date_offset_not_null
    check (committer_date is null or committer_date_offset_bytes is not null) not valid;
ALTER TABLE

Add hints on how to deploy the storage db externally

Update mirror docker docs following a walkthrough

Migrate volume-dependent service placement constraints to labels

Gitignore the graph-replayer and content-replayer configs

Make the default storage db flavor "mirror"

Add hints on how to deploy the storage db externally

Rename secrets using a swh-mirror prefix

The code for the gauges feels like something that would be usefully handled with a context manager.
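Something along these lines, as a rough sketch: it assumes the gauge is incremented when work starts and decremented when it ends. The `Gauge` class and `track_in_progress` are hypothetical stand-ins, not the actual metrics code under review.

```python
from contextlib import contextmanager


class Gauge:
    """Hypothetical stand-in for a metrics gauge (not a real library class)."""

    def __init__(self) -> None:
        self.value = 0

    def inc(self) -> None:
        self.value += 1

    def dec(self) -> None:
        self.value -= 1


@contextmanager
def track_in_progress(gauge: Gauge):
    """Increment the gauge on entry and always decrement it on exit."""
    gauge.inc()
    try:
        yield gauge
    finally:
        # Runs even if the body raises, so the gauge cannot drift upwards.
        gauge.dec()
```

The `try`/`finally` is the point of the exercise: callers write `with track_in_progress(gauge): ...` and no longer need to pair every increment with a decrement by hand.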

I'm kinda wondering if this import stuff should move to a common module - I think we do kind of the same thing with entrypoints?

All revisions (supposedly) have been migrated to bytes offsets. I'll wait for the ongoing base backup and vacuum to complete before adding the constraints on the production database.

Sounds good. I don't think we need to make them unique though (and it might make stuff confusing if writes get rejected by this index rather than the primary one)

requirements-test: Drop pre-commit

This all looks fine to me, thanks!

In D6950#180643, @anlambert wrote:

In D6950#180642, @olasd wrote:

Rather than add this logic on the svn loader only, we could consider either making swh.model.from_disk support incremental computations by keeping track of the ctime / mtime of on-disk data, and by collecting objects for the new loader. This would make this logic reusable by all loaders.

Without changing the swh.model.from_disk logic, we could also just diff the sets of objects between iterations (and rely on the OS cache for the new computation to be vaguely efficient), and only send new ones to the storage.

Indeed, that feature could be implemented on the swh.model.from_disk side. The first proposal seems the most reasonable to me in terms of performance for a loader, but the second one could also be of interest.
Nevertheless, I'm not sure the implementation will be that straightforward, and there will be a lot of cases to cover with tests, so it is not a one-day job.
Currently the subversion loader is the only one that needs the directory diff feature, so I think we can keep the implementation as it is for the moment, but I will create a task to implement the directory diff feature in swh.model.from_disk.
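For reference, the second proposal (diffing the sets of objects between iterations) could look roughly like the sketch below. It assumes objects are identified by a hashable id. `NewObjectFilter` is a hypothetical name and is not part of swh.model.from_disk.

```python
from typing import Hashable, Iterable, Set


class NewObjectFilter:
    """Hypothetical helper: remember the ids already sent, return only new ones."""

    def __init__(self) -> None:
        self._seen: Set[Hashable] = set()

    def new_objects(self, object_ids: Iterable[Hashable]) -> Set[Hashable]:
        """Return the ids not seen in any previous iteration, then record them."""
        new = set(object_ids) - self._seen
        self._seen |= new
        return new
```

Each iteration still recomputes the full set of ids (relying on the OS cache, as noted above), but only the difference would be sent to the storage.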

Looks like it's been spending a lot of time on DNS lookups.

This looks like an impressive speedup, kudos.

Add my new gpg key to the debian repository keyring

Thanks for this change!

In D6936#180561, @vlorentz wrote:

We could, but it means another round of going through all four packages and changing every constructor call, and it's a lot of busy work

Couldn't we make the old constructor arguments optional first, to avoid breaking all dependencies at once?
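A rough sketch of that transition step, assuming the old argument can simply be ignored once it is optional. The `Loader` class and the `old_option` argument are hypothetical and only illustrate the pattern; they are not the actual constructors touched by this diff.

```python
import warnings
from typing import Optional


class Loader:
    """Hypothetical class; the argument names are illustrative only."""

    def __init__(self, url: str, old_option: Optional[str] = None) -> None:
        if old_option is not None:
            # Existing callers keep working, but are told to migrate.
            warnings.warn(
                "old_option is deprecated and ignored",
                DeprecationWarning,
                stacklevel=2,
            )
        self.url = url
```

Dependent packages could then drop the argument one at a time, and the parameter would be removed once no caller passes it.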

Looks good to me, thank you!

Jan 26 2022

Jan 25 2022

Jan 24 2022

Jan 21 2022

Jan 20 2022

Jan 19 2022

Jan 18 2022

Jan 17 2022

Jan 14 2022

Jan 12 2022
