
Jan 26 2022

olasd committed rDSNIP0455611094ee: recover_corrupt_objects: clean imports (authored by olasd).
recover_corrupt_objects: clean imports
Jan 26 2022, 1:02 PM
olasd committed rDSNIP5e20eee1b438: recover_corrupt_objects: use logging instead of tqdm (authored by olasd).
recover_corrupt_objects: use logging instead of tqdm
Jan 26 2022, 1:02 PM
olasd committed rDSNIP871da6ca0a09: recover_corrupt_objects: use stream_results instead of stream_results_optional (authored by olasd).
recover_corrupt_objects: use stream_results instead of stream_results_optional
Jan 26 2022, 1:02 PM

Jan 25 2022

olasd committed rSPSITE89642e23c829: unattended_upgrades: Allow the new codename for debian-security (authored by olasd).
unattended_upgrades: Allow the new codename for debian-security
Jan 25 2022, 8:24 PM
olasd committed rDSNIP99a3e4997191: recover_corrupt_objects: make assertions more useful (authored by olasd).
recover_corrupt_objects: make assertions more useful
Jan 25 2022, 6:27 PM
olasd added inline comments to D7035: Automate weekly-planning script.
Jan 25 2022, 5:00 PM
olasd committed rDSNIP27369c56cac9: recover_corrupt_objects: Remove spurious select (authored by olasd).
recover_corrupt_objects: Remove spurious select
Jan 25 2022, 4:52 PM
olasd committed rDSNIP44fe78774db2: recover_corrupt_objects: Pull config from a configfile instead of the cli (authored by olasd).
recover_corrupt_objects: Pull config from a configfile instead of the cli
Jan 25 2022, 4:52 PM
olasd added a comment to D7029: Filter out extids targeting non-existing releases.

I wonder if that'd be worth a warning. May end up being a bit noisy though.

Jan 25 2022, 3:52 PM
olasd accepted D7033: the desired key len is 32 for sha256.

Thanks! (and sorry for the hash algo ping-pong)

Jan 25 2022, 3:46 PM
olasd accepted D6957: Add recover_corrupt_objects.py.

I'm not 100% convinced we need to recheck the objects at every addition (within a transaction that can still fail to commit) instead of afterwards, but it doesn't /hurt/ either. We'll make a full pass on all objects again later anyway.

Jan 25 2022, 1:43 PM

Jan 24 2022

olasd added inline comments to D7024: Fix directory_add to actually insert the manifest + add directory_get_raw_manifest.
Jan 24 2022, 2:27 PM
olasd accepted D7024: Fix directory_add to actually insert the manifest + add directory_get_raw_manifest.

Oof. I guess that's one more reason to be wary of TODOs in tests.

Jan 24 2022, 1:32 PM
olasd triaged T3879: Replace ntp with systemd-timesyncd (or chrony) for time synchronization across all the infra as Low priority.
Jan 24 2022, 1:28 PM · System administration

Jan 21 2022

olasd accepted D7020: Configure the kafka clusters environment for the consumer lag exporter.

Neat, thanks.

Jan 21 2022, 7:01 PM
olasd added inline comments to D6979: Document the mirror credentials management.
Jan 21 2022, 5:39 PM
olasd accepted D6986: kafka: add a script to create the kafka credentials.

thanks!

Jan 21 2022, 5:38 PM
olasd accepted D7016: Refactor clone with timout util from hg loader.
Jan 21 2022, 4:51 PM
olasd accepted D7008: Stop using the deprecated 'TimestampWithTimezone.offset' attribute.

Thanks!

Jan 21 2022, 2:49 PM
olasd accepted D7007: Stop using the deprecated 'TimestampWithTimezone.offset' attribute.

Thanks!

Jan 21 2022, 2:48 PM
olasd accepted D7006: Stop using the deprecated 'TimestampWithTimezone.offset' attribute.

Thanks!

Jan 21 2022, 2:47 PM
olasd accepted D7009: Consider unauthorized access to origin as a not found visit status.

On most forges, 403 errors are used in place of 404 errors (so you're not able to do discovery of private repositories by checking the return code). I think treating them as not found is correct.

Jan 21 2022, 2:43 PM
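
The reasoning in the D7009 comment above (forges answering 403 where a 404 would leak the existence of a private repository) can be sketched as a small mapping from HTTP status codes to visit statuses. This is only an illustration: the function name `visit_status_for`, the status labels, and the inclusion of 401 alongside 403 are assumptions, not the loader's actual code.

```python
from http import HTTPStatus

# Assumption: codes a forge may return for a repository we cannot see.
# 403 comes from the comment above; 401 is added here on the basis of the
# diff title ("unauthorized access") and is only a guess.
HIDDEN_OR_MISSING = frozenset(
    {HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN, HTTPStatus.NOT_FOUND}
)


def visit_status_for(http_status: int) -> str:
    """Map an HTTP status code to a (hypothetical) visit status label."""
    if http_status in HIDDEN_OR_MISSING:
        # Treat "forbidden" like "missing": from the outside the two cases
        # are deliberately indistinguishable.
        return "not_found"
    if 200 <= http_status < 300:
        return "full"
    return "failed"
```

With this mapping, `visit_status_for(403)` and `visit_status_for(404)` yield the same label, while a server error such as 500 is still reported as a failure.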
olasd accepted D6937: Remove 'offset' and 'negative_utc'.

Either way, I think this is good to go, thanks!

Jan 21 2022, 2:01 PM
olasd added a comment to D6937: Remove 'offset' and 'negative_utc'.

We still query the fields out of the Postgres database (to ignore them in the db -> model object conversion), right? Should we stop doing that too?

Jan 21 2022, 1:53 PM
olasd accepted D7005: Add method 'TimestampWithTimezone.offset_minutes'.

Sounds good to me, thanks!

Jan 21 2022, 1:39 PM
olasd accepted D6984: Refactor clone with timout util from hg loader.

Thanks!

Jan 21 2022, 11:31 AM
olasd accepted D6944: Add a few statsd metrics in the kafka journal client.
Jan 21 2022, 11:28 AM

Jan 20 2022

olasd accepted D6991: Fix direct sql query for directories to the archive.
Jan 20 2022, 5:58 PM
olasd published D6991: Fix direct sql query for directories to the archive for review.
Jan 20 2022, 5:58 PM
olasd accepted D6989: Add a Statsd.status_gauge() context manager.

Awesome, thanks!

Jan 20 2022, 3:27 PM
olasd added a comment to D5646: Make buffer and validate proxy storage also handle other object types.

The buffer bits look reasonable to me. I'm not so sure about the validator bits, as we're not actually storing the id fields for these objects, afaik?

Jan 20 2022, 3:25 PM
olasd closed T3487: Installation of the new provenance server as Resolved.
Jan 20 2022, 11:47 AM · System administration
olasd closed D6983: Add provenance-client01 virtual machine.
Jan 20 2022, 11:22 AM
olasd committed rSPREee8a56b4df35: Add provenance-client01 virtual machine (authored by olasd).
Add provenance-client01 virtual machine
Jan 20 2022, 11:22 AM
olasd updated the diff for D6983: Add provenance-client01 virtual machine.

Update tfstate after provisioning the machine

Jan 20 2022, 11:22 AM
olasd added a comment to D6986: kafka: add a script to create the kafka credentials.

Could we create this script, for each kafka cluster, on the kafka management host rather than on individual brokers, so we can have all the management in a single place?

Jan 20 2022, 11:02 AM
olasd added a comment to D6983: Add provenance-client01 virtual machine.

lgtm

Note: jsyk, the inventory entry [1] references 40G for that node and the default is 32G (as per the test plan).

[1] https://inventory.internal.softwareheritage.org/virtualization/virtual-machines/108/edit/

Jan 20 2022, 10:25 AM
olasd added a comment to D6984: Refactor clone with timout util from hg loader.

Any chance you could add a couple of tests for this?

Jan 20 2022, 10:12 AM
olasd committed rSPREde0278bd78dd: proxmox/terraform: Make the vmid variable properly optional (authored by olasd).
proxmox/terraform: Make the vmid variable properly optional
Jan 20 2022, 8:50 AM
olasd closed D6982: proxmox/terraform: Make the vmid variable properly optional.
Jan 20 2022, 8:50 AM
olasd committed rSPRE686b74e25834: Mark kelvingrove's cpu as host instead of kvm64 (authored by olasd).
Mark kelvingrove's cpu as host instead of kvm64
Jan 20 2022, 8:50 AM
olasd closed D6981: Mark kelvingrove's cpu as host instead of kvm64.
Jan 20 2022, 8:50 AM

Jan 19 2022

olasd added inline comments to D6957: Add recover_corrupt_objects.py.
Jan 19 2022, 11:21 PM
olasd closed T3819: Deploy swh.model 4.1.0 / swh.storage 0.41.0 to production as Resolved.
Jan 19 2022, 7:12 PM · System administration
olasd closed T3819: Deploy swh.model 4.1.0 / swh.storage 0.41.0 to production, a subtask of T3752: Store/represent time offsets as strings, as Resolved.
Jan 19 2022, 7:12 PM · Data Model, Storage manager
olasd updated the task description for T3819: Deploy swh.model 4.1.0 / swh.storage 0.41.0 to production.
Jan 19 2022, 7:12 PM · System administration
olasd added a comment to T3819: Deploy swh.model 4.1.0 / swh.storage 0.41.0 to production.
softwareheritage=#   alter table revision
    add constraint revision_date_offset_not_null
    check (date is null or date_offset_bytes is not null) not valid,
    add constraint revision_committer_date_offset_not_null
    check (committer_date is null or committer_date_offset_bytes is not null) not valid;
ALTER TABLE
Jan 19 2022, 7:11 PM · System administration
olasd requested review of D6983: Add provenance-client01 virtual machine.
Jan 19 2022, 6:25 PM
olasd requested review of D6982: proxmox/terraform: Make the vmid variable properly optional.
Jan 19 2022, 6:24 PM
olasd requested review of D6981: Mark kelvingrove's cpu as host instead of kvm64.
Jan 19 2022, 6:23 PM
olasd committed rDDOCb2a68a9d2605: Add hints on how to deploy the storage db externally (authored by olasd).
Add hints on how to deploy the storage db externally
Jan 19 2022, 4:04 PM
olasd closed D6969: Update mirror docker docs following a walkthrough.
Jan 19 2022, 4:04 PM
olasd committed rDDOC2b8d86d713fb: Update mirror docker docs following a walkthrough (authored by olasd).
Update mirror docker docs following a walkthrough
Jan 19 2022, 4:04 PM
olasd updated the diff for D6969: Update mirror docker docs following a walkthrough.

Gotta love rebasing

Jan 19 2022, 3:47 PM
olasd committed rCDFP18d94c295557: Migrate volume-dependent service placement constraints to labels (authored by olasd).
Migrate volume-dependent service placement constraints to labels
Jan 19 2022, 3:42 PM
olasd committed rCDFP35cd316296af: Gitignore the graph-replayer and content-replayer configs (authored by olasd).
Gitignore the graph-replayer and content-replayer configs
Jan 19 2022, 3:42 PM
olasd closed D6978: Update docker stack configuration to match documentation changes.
Jan 19 2022, 3:42 PM
olasd committed rCDFPc26ab911ae13: Make the default storage db flavor "mirror" (authored by olasd).
Make the default storage db flavor "mirror"
Jan 19 2022, 3:42 PM
olasd removed a reviewer for D6978: Update docker stack configuration to match documentation changes: Reviewers.
Jan 19 2022, 3:40 PM
olasd updated the diff for D6969: Update mirror docker docs following a walkthrough.

Add hints on how to deploy the storage db externally

Jan 19 2022, 3:30 PM
olasd requested review of D6978: Update docker stack configuration to match documentation changes.
Jan 19 2022, 2:56 PM
olasd added a revision to T3829: Document mirror - how to create and deploy a mirror from scratch: D6978: Update docker stack configuration to match documentation changes.
Jan 19 2022, 2:56 PM · Mirror
olasd closed D6968: Rename secrets using a swh-mirror prefix.
Jan 19 2022, 2:54 PM
olasd committed rCDFP6e574b79e7c6: Rename secrets using a swh-mirror prefix (authored by olasd).
Rename secrets using a swh-mirror prefix
Jan 19 2022, 2:54 PM
olasd updated the diff for D6969: Update mirror docker docs following a walkthrough.

*shrug*

Jan 19 2022, 2:44 PM
olasd triaged T3863: Allow loading sensitive configuration values from distinct, more restricted, files as Normal priority.
Jan 19 2022, 2:41 PM · Core & foundations
olasd requested review of D6969: Update mirror docker docs following a walkthrough.
Jan 19 2022, 12:36 PM

Jan 18 2022

olasd requested review of D6968: Rename secrets using a swh-mirror prefix.
Jan 18 2022, 6:04 PM
olasd added a comment to D6944: Add a few statsd metrics in the kafka journal client.

The code for the gauges feels like something that would be usefully handled with a context manager.

Jan 18 2022, 2:54 PM
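
The suggestion in the D6944 comment above, wrapping gauge updates in a context manager, could look roughly like the following. It is a minimal sketch: `RecordingStatsd` and `gauge_while_active` are hypothetical names invented for this example and do not reproduce the API of any real statsd client.

```python
from contextlib import contextmanager
from typing import Dict, Iterator


class RecordingStatsd:
    """Hypothetical stand-in for a statsd client; it only records gauges."""

    def __init__(self) -> None:
        self.gauges: Dict[str, int] = {}

    def gauge(self, name: str, value: int) -> None:
        self.gauges[name] = value


@contextmanager
def gauge_while_active(
    statsd: RecordingStatsd, name: str, status: str
) -> Iterator[None]:
    """Set the `<name>.<status>` gauge to 1 inside the block, 0 afterwards."""
    key = f"{name}.{status}"
    statsd.gauge(key, 1)
    try:
        yield
    finally:
        # Runs even if the block raises, so the gauge never stays stuck at 1.
        statsd.gauge(key, 0)
```

The `try`/`finally` is the point of the design: the caller no longer has to remember to reset the gauge on every exit path.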
olasd accepted D6884: Add support for the rdkafka 'stats_cb' config option in JournalClient.

I'm kinda wondering if this import stuff should move to a common module - I think we do kind of the same thing with entrypoints?

Jan 18 2022, 1:42 PM
olasd added a comment to T3819: Deploy swh.model 4.1.0 / swh.storage 0.41.0 to production.

All revisions (supposedly) have been migrated to bytes offsets. I'll wait for the ongoing base backup and vacuum to complete before adding the constraints on the production database.

Jan 18 2022, 12:28 PM · System administration
olasd merged task T2449: Consider switching timestamp offset storage to strings/byte arrays into T3752: Store/represent time offsets as strings.
Jan 18 2022, 12:27 PM · Storage manager, Data Model
olasd merged T2449: Consider switching timestamp offset storage to strings/byte arrays into T3752: Store/represent time offsets as strings.
Jan 18 2022, 12:27 PM · Data Model, Storage manager
olasd added a subtask for T3752: Store/represent time offsets as strings: T3819: Deploy swh.model 4.1.0 / swh.storage 0.41.0 to production.
Jan 18 2022, 12:26 PM · Data Model, Storage manager
olasd added a parent task for T3819: Deploy swh.model 4.1.0 / swh.storage 0.41.0 to production: T3752: Store/represent time offsets as strings.
Jan 18 2022, 12:26 PM · System administration
olasd updated the task description for T3819: Deploy swh.model 4.1.0 / swh.storage 0.41.0 to production.
Jan 18 2022, 12:26 PM · System administration
olasd accepted D6956: model: Add support for more edge cases in _parse_offset_bytes.

Great, thanks!

Jan 18 2022, 11:01 AM
olasd added a comment to D6958: postgres: Add indices to keep track of objects with a raw_manifest.

Sounds good. I don't think we need to make them unique though (and it might make stuff confusing if writes get rejected by this index rather than the primary one)

Jan 18 2022, 11:00 AM

Jan 17 2022

olasd accepted D6939: Stop passing 'offset' and 'negative_utc' to TimestampWithTimezone().

Thanks!

Jan 17 2022, 4:44 PM
olasd added inline comments to D6939: Stop passing 'offset' and 'negative_utc' to TimestampWithTimezone().
Jan 17 2022, 4:30 PM
olasd closed D6858: requirements-test: Drop pre-commit.
Jan 17 2022, 4:27 PM
olasd committed rDCOREd374b6002955: requirements-test: Drop pre-commit (authored by olasd).
requirements-test: Drop pre-commit
Jan 17 2022, 4:27 PM
olasd updated the diff for D6858: requirements-test: Drop pre-commit.

Rebase

Jan 17 2022, 4:25 PM
olasd accepted D6894: converters: Write object_bytes and raw_manifest on revisions and releases.

This all looks fine to me, thanks!

Jan 17 2022, 4:22 PM

Jan 14 2022

olasd accepted D6923: converters: Write raw_manifest of Directory objects.

Awesome, thanks.

Jan 14 2022, 6:29 PM
olasd added inline comments to D6939: Stop passing 'offset' and 'negative_utc' to TimestampWithTimezone().
Jan 14 2022, 5:50 PM
olasd accepted D6940: tests: Use 'offset_bytes' instead of 'offset'/'negative_utc'.

Great, thanks!

Jan 14 2022, 5:49 PM
olasd accepted D6938: tests: Replace 'offset' and 'negative_utc' with 'offset_bytes'.

Cool, thanks!

Jan 14 2022, 5:48 PM
olasd accepted D6950: ra: Send modified objects only to storage after replaying a revision.
Jan 14 2022, 5:44 PM
olasd added a comment to D6950: ra: Send modified objects only to storage after replaying a revision.
In D6950#180642, @olasd wrote:

Rather than add this logic on the svn loader only, we could consider making swh.model.from_disk support incremental computations by keeping track of the ctime / mtime of on-disk data, and by collecting objects for the new loader. This would make this logic reusable by all loaders.

Without changing the swh.model.from_disk logic, we could also just diff the sets of objects between iterations (and rely on the OS cache for the new computation to be vaguely efficient), and only send new ones to the storage.

Indeed, that feature could be implemented on the swh.model.from_disk side. The first proposal seems the most reasonable to me in terms of performance for a loader, but the second one could also be of interest. Nevertheless, I am not sure the implementation will be so straightforward, and there will be a lot of cases to cover with tests, so it is not a one-day job. Currently the subversion loader is the only one that needs that directories diff feature, so I think we can keep the implementation as it is for the moment, but I will create a task to implement the directories diff feature in swh.model.from_disk.

Jan 14 2022, 4:10 PM
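
The second proposal in the D6950 discussion above (diffing the sets of objects between iterations and sending only the new ones to the storage) can be sketched as follows. The helper name `new_objects` and the representation of objects as a dict keyed by id are assumptions made for the example; nothing here is taken from swh.model.from_disk.

```python
from typing import Dict, Set, TypeVar

T = TypeVar("T")


def new_objects(seen: Set[bytes], current: Dict[bytes, T]) -> Dict[bytes, T]:
    """Return the objects whose id has not been seen in earlier iterations.

    `seen` is updated in place so that the next call only reports objects
    that appeared since this one.
    """
    fresh = {obj_id: obj for obj_id, obj in current.items() if obj_id not in seen}
    seen.update(fresh)
    return fresh
```

Each iteration still recomputes every object (relying on the OS cache, as the comment notes), but only the returned subset would need to be sent to the storage.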
olasd closed T3847: pergamon takes a long time to apply its own manifest as Resolved.

Looks like it's been spending a lot of time on DNS lookups.

Jan 14 2022, 2:36 PM · System administration
olasd added a comment to D6950: ra: Send modified objects only to storage after replaying a revision.

This looks like an impressive speedup, kudos.

Jan 14 2022, 2:08 PM
olasd added inline comments to D6936: TimestampWithTimezone: Make 'offset' and 'negative_utc' optional.
Jan 14 2022, 1:52 PM
olasd committed rSPSITE99361dc7e2f6: Add my new gpg key to the debian repository keyring (authored by olasd).
Add my new gpg key to the debian repository keyring
Jan 14 2022, 1:48 PM
olasd accepted D6936: TimestampWithTimezone: Make 'offset' and 'negative_utc' optional.

Thanks a lot!

Jan 14 2022, 1:46 PM
olasd added a comment to D6936: TimestampWithTimezone: Make 'offset' and 'negative_utc' optional.

Thanks for this change!

Jan 14 2022, 1:39 PM
olasd added a comment to D6936: TimestampWithTimezone: Make 'offset' and 'negative_utc' optional.

We could, but it means another round of going through all four packages and changing every constructor call, and it's a lot of busy work.

Jan 14 2022, 12:23 PM
olasd added a comment to D6936: TimestampWithTimezone: Make 'offset' and 'negative_utc' optional.

Couldn't we make the old constructor arguments optional first, to avoid breaking all dependencies at once?

Jan 14 2022, 11:24 AM
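
The approach raised in the D6936 comment above, making the old constructor arguments optional before removing them, can be illustrated with a small dataclass. `ExampleTimestamp` is a hypothetical class written for this sketch; it is not swh.model's `TimestampWithTimezone`, and the choice to emit a `DeprecationWarning` is an assumption.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExampleTimestamp:
    """Hypothetical model class with two deprecated, now-optional fields."""

    seconds: int
    offset_bytes: bytes
    # Deprecated arguments: still accepted so existing callers keep working.
    offset: Optional[int] = None
    negative_utc: Optional[bool] = None

    def __post_init__(self) -> None:
        if self.offset is not None or self.negative_utc is not None:
            warnings.warn(
                "'offset' and 'negative_utc' are deprecated; "
                "pass 'offset_bytes' only",
                DeprecationWarning,
            )
```

New callers can omit the old arguments immediately, while dependent packages that still pass them keep working and can be migrated one at a time.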

Jan 12 2022

olasd accepted D6924: Fix TimestampWithTimezone.from_dict() on datetimes before 1970 with non-integer seconds.

Thanks!

Jan 12 2022, 2:47 PM
olasd accepted D6921: sql: Clean up task/task_run data model.

Looks good to me, thank you!

Jan 12 2022, 11:25 AM