Oct 23 2020

douardda updated the task description for T2645: Add listing tasks for gitea instances.

Oct 23 2020, 10:20 AM · Origin-Gitea/Gogs, Archive coverage, Lister

Oct 22 2020

douardda created D4333: Normalize the expected config entry for the journal_client.

Oct 22 2020, 4:01 PM

douardda accepted D4312: Add `Hg20BundleLoader` tests from json files.

globally ok, but please add a README file as suggested in the previous comment

Oct 22 2020, 12:32 PM

douardda added a comment to D4312: Add `Hg20BundleLoader` tests from json files.

Would be nice to have a README file in tests/data explaining what these json files are and how to produce them.

Oct 22 2020, 11:52 AM

douardda added inline comments to D4082: Make the type of values of JournalWriter generic, so it works with types not from swh-model..

Oct 22 2020, 11:46 AM

douardda added a comment to D4082: Make the type of values of JournalWriter generic, so it works with types not from swh-model..

Wouldn't it be a bit easier to name the generic version of the journal writer something like GenericKafkaJournalWriter and have KafkaJournalWriter = GenericKafkaJournalWriter[BaseModel] (for backward compatibility)?
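
A minimal sketch of that naming scheme, for illustration only. The `BaseModel` stand-in, the in-memory `sent` list and the `write_addition` signature are assumptions made here, not the actual swh-journal code:

```python
from typing import Callable, Generic, List, Tuple, TypeVar

TValue = TypeVar("TValue")


class BaseModel:
    """Simplified stand-in for the swh-model base class (hypothetical)."""

    def __init__(self, id: bytes) -> None:
        self.id = id


class GenericKafkaJournalWriter(Generic[TValue]):
    """Hypothetical writer that records messages in memory instead of sending them to Kafka."""

    def __init__(self, value_sanitizer: Callable[[str, TValue], bytes]) -> None:
        self.value_sanitizer = value_sanitizer
        self.sent: List[Tuple[str, bytes]] = []

    def write_addition(self, object_type: str, value: TValue) -> None:
        # Serialize the value with the caller-provided sanitizer, then record it.
        self.sent.append((object_type, self.value_sanitizer(object_type, value)))


# Backward-compatible name: existing callers keep using KafkaJournalWriter.
KafkaJournalWriter = GenericKafkaJournalWriter[BaseModel]

writer = KafkaJournalWriter(lambda object_type, obj: obj.id)
writer.write_addition("content", BaseModel(b"\x01"))
```

One caveat with a plain alias: a subscripted generic cannot be passed to `isinstance()`, so code that relies on `isinstance(x, KafkaJournalWriter)` would need a subclass (`class KafkaJournalWriter(GenericKafkaJournalWriter[BaseModel])`) instead.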

Oct 22 2020, 11:43 AM

douardda requested changes to D4193: swh identify: add --exclude.

This globally LGTM but there is this path encoding issue. The 2 new functions in from_disk.py should take a bytes argument instead of a str one.
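
To illustrate why bytes matter here, a small sketch. The `is_excluded` helper and the use of `fnmatch` are assumptions for the example; the real functions in from_disk.py may match paths differently:

```python
import fnmatch
from typing import Iterable


def is_excluded(path: bytes, patterns: Iterable[bytes]) -> bool:
    """Hypothetical helper: True if the raw path matches any exclusion pattern."""
    return any(fnmatch.fnmatchcase(path, pattern) for pattern in patterns)


# A file name encoded in Latin-1: valid on disk, but not valid UTF-8.
latin1_name = b"caf\xe9.pyc"

try:
    latin1_name.decode("utf-8")
    decodable = True
except UnicodeDecodeError:
    decodable = False

# Matching on bytes works regardless of the file name's encoding.
excluded = is_excluded(latin1_name, [b"*.pyc"])
```

Here `decodable` ends up `False` while `excluded` is `True`: a str-based API would have to guess an encoding for this name, whereas a bytes-based one never needs to.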

Oct 22 2020, 11:25 AM

Oct 21 2020

douardda created P830 (An Untitled Masterwork).

Oct 21 2020, 2:39 PM

douardda updated the task description for T2645: Add listing tasks for gitea instances.

Oct 21 2020, 12:18 PM · Origin-Gitea/Gogs, Archive coverage, Lister

Oct 19 2020

douardda created P828 (An Untitled Masterwork).

Oct 19 2020, 5:48 PM

douardda added inline comments to D4216: add swh-hg-identify a cli to identify hg objects.

Oct 19 2020, 4:49 PM

douardda triaged T2717: Write an end-user documentation on how to use the authenticated stack as High priority.

Oct 19 2020, 1:26 PM · Web app, Documentation

Oct 16 2020

douardda added a comment to T2706: Benchmark objstorage for mirror (uffizi vs. azure vs. s3).

Same as before but with 1M (fresh) sha1s:

Oct 16 2020, 1:02 PM · Object storage, Mirror

douardda added a comment to T2706: Benchmark objstorage for mirror (uffizi vs. azure vs. s3).

Since the results on uffizi above did suffer from a few caveats, I've made a few more tests:

a first result has been obtained with a dataset that had only objects stored on the XFS part of the objstorage
a second dataset has been created (with the order by sha256 part to spread the sha1s)
but the results are a mix of hot and cold cache tests

Oct 16 2020, 11:59 AM · Object storage, Mirror

Oct 15 2020

douardda added a comment to T2706: Benchmark objstorage for mirror (uffizi vs. azure vs. s3).

Some results:

Oct 15 2020, 1:02 PM · Object storage, Mirror

douardda added a comment to T2706: Benchmark objstorage for mirror (uffizi vs. azure vs. s3).

Current benchmark scenario:

Oct 15 2020, 12:43 PM · Object storage, Mirror

douardda triaged T2706: Benchmark objstorage for mirror (uffizi vs. azure vs. s3) as High priority.

Oct 15 2020, 12:36 PM · Object storage, Mirror

Oct 14 2020

douardda created P821 (An Untitled Masterwork).

Oct 14 2020, 3:43 PM

douardda created P820 bench objstorage.

Oct 14 2020, 3:39 PM

Oct 13 2020

douardda accepted D4089: Add tests and fix behavior of scanner cli.

I'm mostly OK with this now, so I'll make it "accepted", but please refactor a bit the cli_run_[n]ok() helper functions before landing it.

Oct 13 2020, 1:24 PM

douardda set the repository for D4193: swh identify: add --exclude to rDMOD Data model.

Oct 13 2020, 1:04 PM

douardda added a comment to D4193: swh identify: add --exclude.

In D4193#104860, @ardumont wrote:

@douardda @zack Note that this diff somehow did not trigger the ci tests, only the linters. No idea why. just a heads up.

Oct 13 2020, 1:03 PM

Oct 9 2020

douardda committed rCDFPcdb00c8e5a34: Revert to a orgname/reponame based images naming scheme (authored by douardda).

Revert to a orgname/reponame based images naming scheme

Oct 9 2020, 5:08 PM

douardda committed rCDFP65cbf32679ad: Reorganize the images under the softwareheritage hub repo (authored by douardda).

Reorganize the images under the softwareheritage hub repo

Oct 9 2020, 4:39 PM

douardda accepted D4210: backfill: use the common `storage` top-level config key.

same

Oct 9 2020, 4:34 PM

douardda accepted D4209: backfill: support arbitrary journal writer configuration.

oh so much yes!

Oct 9 2020, 4:33 PM

douardda updated the task description for T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage).

Oct 9 2020, 3:47 PM · Staging environment, System administration

douardda triaged T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage) as High priority.

Oct 9 2020, 3:37 PM · Staging environment, System administration

Oct 8 2020

douardda added inline comments to D4198: PEP8 refactoring of scanner modules.

Oct 8 2020, 6:19 PM

douardda added a comment to D4198: PEP8 refactoring of scanner modules.

Overall ok, but I would have preferred the renaming be in a dedicated revision, separated from type annotation fixes/additions.

Oct 8 2020, 6:05 PM

douardda accepted D4182: conftest: Declare swh.core pytest_plugin.

otherwise fine with me

Oct 8 2020, 5:56 PM

douardda added a comment to D4182: conftest: Declare swh.core pytest_plugin.

Does this require the plugin's entrypoint in swh.core to be removed (e.g. because swh.core.pytest_plugin would be loaded twice, or something like that)? Or is it safe to apply and use with a swh.core that still declares its pytest_plugin as an entrypoint?

Oct 8 2020, 5:56 PM

douardda added a comment to D4078: Add a 'unique_key' method on model objects.

In D4078#103969, @douardda wrote:

In D4078#103968, @vlorentz wrote:

yes, it would make sense for values. Do you want to open a task for that?

you read my mind :-)

Oct 8 2020, 12:42 PM

douardda added a comment to T1279: swh-journal: The schema migration problem.

Since this "migration problem" also concerns cassandra, maybe a simple approach would be to add a Final version attribute to all model entities (a simple monotonic integer).

Oct 8 2020, 12:41 PM · Journal

douardda updated subscribers of D4078: Add a 'unique_key' method on model objects.

In D4078#103941, @vlorentz wrote:

In D4078#103938, @douardda wrote:

Maybe a stupid question, but why use a dict as the unique key (in many model classes)? Why not use a tuple? I mean, it seems to me that such a UID should be usable as a dict key or in a set directly.

I don't know, I just copied what we were already doing in swh-journal. Dicts have the nice property of being somewhat "self-documenting" though.

Oct 8 2020, 12:39 PM

douardda added a comment to D4078: Add a 'unique_key' method on model objects.

In D4078#103968, @vlorentz wrote:

yes, it would make sense for values. Do you want to open a task for that?

Oct 8 2020, 12:36 PM

douardda added a comment to D4078: Add a 'unique_key' method on model objects.

In D4078#103941, @vlorentz wrote:

In D4078#103938, @douardda wrote:

Maybe a stupid question, but why use a dict as the unique key (in many model classes)? Why not use a tuple? I mean, it seems to me that such a UID should be usable as a dict key or in a set directly.

I don't know, I just copied what we were already doing in swh-journal. Dicts have the nice property of being somewhat "self-documenting" though.

In D4078#103940, @douardda wrote:

Also (most probably dumb idea, writing as it pops in my mind), wouldn't it make sense to add some kind of 'per-object class model version' in the key?

This would prevent compacting away old versions of objects. Is this something we want?

Oct 8 2020, 12:34 PM

douardda added a comment to D4194: model: use visit ids in the unique key, instead of their date..

In D4194#103939, @vlorentz wrote:

microsecond in postgres, millisecond in cassandra.

Oct 8 2020, 12:27 PM

douardda accepted D4186: scanner.model: Fix Tree.toDict to be side-effect free.

The split in 2 revisions is not mandatory, just sayin' for good measure.

Oct 8 2020, 12:16 PM

douardda added a comment to D4186: scanner.model: Fix Tree.toDict to be side-effect free.

looks good (did not even notice toDict() is not even a recursive method! so this dict_nodes really makes no sense at all).

Oct 8 2020, 12:15 PM

douardda added a comment to D4078: Add a 'unique_key' method on model objects.

In D4078#103938, @douardda wrote:

Maybe a stupid question, but why use a dict as the unique key (in many model classes)? Why not use a tuple? I mean, it seems to me that such a UID should be usable as a dict key or in a set directly.

Oct 8 2020, 12:08 PM

douardda added a comment to D4078: Add a 'unique_key' method on model objects.

Maybe a stupid question, but why use a dict as the unique key (in many model classes)? Why not use a tuple? I mean, it seems to me that such a UID should be usable as a dict key or in a set directly.
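
The hashability point can be shown in a few lines. The field names (`origin`, `visit`) and the example URL are placeholders chosen for the sketch, not the actual unique keys from D4078:

```python
from typing import Dict, Tuple

# The same hypothetical unique key, once as a dict and once as a tuple.
key_as_dict: Dict[str, object] = {"origin": "https://example.org/repo.git", "visit": 1}
key_as_tuple: Tuple[str, int] = ("https://example.org/repo.git", 1)

seen: set = set()
seen.add(key_as_tuple)  # tuples of hashable values are hashable

try:
    seen.add(key_as_dict)  # dicts are mutable, hence unhashable
    dict_is_hashable = True
except TypeError:
    dict_is_hashable = False

# A dict key has to be frozen first, e.g. into a sorted tuple of items,
# which keeps the "self-documenting" field names.
frozen_key = tuple(sorted(key_as_dict.items()))
seen.add(frozen_key)
```

The frozen form is one possible middle ground between the two options discussed: it stays usable in sets and as a dict key while still carrying the field names.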

Oct 8 2020, 12:04 PM

douardda added a comment to D4194: model: use visit ids in the unique key, instead of their date..

dates are not unique (ie. multiple visits can share a date, and they
do in practice); and visit statuses already use visit ids in their
unique key.

Oct 8 2020, 11:52 AM

douardda added a comment to D4193: swh identify: add --exclude.

In D4193#103804, @zack wrote:

Thanks, even though this is a little bit disturbing discrepancy wrt swh-scanner exclusion mechanism,

Oct 8 2020, 11:24 AM

douardda requested changes to D4193: swh identify: add --exclude.

can you please remove the "noise" added by arc in the commit message? And update it (still the previous option name in there).

Oct 8 2020, 11:16 AM

Oct 7 2020

douardda requested changes to D3435: Add mercurial.from_disk.HgLoaderFromDisk.

Oct 7 2020, 3:52 PM

Oct 2 2020

douardda added a comment to T1410: Kill implicit configuration: new configuration scheme.

Maybe starting a pad/hackmd document would be easier at this point?

Oct 2 2020, 4:01 PM · Core & foundations

douardda added a comment to D4131: Remove parse_url helper that adds no real value.

In D4131#102303, @douardda wrote:

this is debatable, but it does "normalize" the given url, so it does something. I agree the https:// auto-add prefix is strange, but the trailing / still brings value. For example there are listers that do not implement this, so if you create a listing task with url=https://somehere.org/api/v1 it will fail because it will forge invalid urls (missing the trailing /).
[edit] and I find this very annoying

Oct 2 2020, 3:14 PM

douardda added a comment to D4131: Remove parse_url helper that adds no real value.

this is debatable, but it does "normalize" the given url, so it does something. I agree the https:// auto-add prefix is strange, but the trailing / still brings value. For example there are listers that do not implement this, so if you create a listing task with url=https://somehere.org/api/v1 it will fail because it will forge invalid urls (missing the trailing /).
[edit] and I find this very annoying
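
A short sketch of the failure mode, assuming a lister builds its endpoint URLs with `urllib.parse.urljoin`. That is an assumption for the example, as are the `ensure_trailing_slash` helper and the `repos` endpoint; actual listers may forge URLs differently:

```python
from urllib.parse import urljoin


def ensure_trailing_slash(url: str) -> str:
    """Hypothetical helper: append a trailing "/" when it is missing."""
    return url if url.endswith("/") else url + "/"


base = "https://somehere.org/api/v1"

# Without the trailing slash, urljoin replaces the last path segment ("v1").
broken = urljoin(base, "repos")  # https://somehere.org/api/repos

# With it, the endpoint is appended below "v1/".
fixed = urljoin(ensure_trailing_slash(base), "repos")  # https://somehere.org/api/v1/repos
```

Normalizing the base URL once, when the listing task is created, spares each lister from having to handle both forms.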

Oct 2 2020, 3:13 PM

douardda added a comment to D3334: Add a new TenaciousProxyStorage.

In D3334#95217, @vlorentz wrote:

@douardda ping?

Oct 2 2020, 2:32 PM

Oct 1 2020

douardda accepted D4112: jobs/tools/dockerfiles: Enable to trigger builds remotely.

Oct 1 2020, 2:24 PM

Sep 30 2020

douardda committed rCDFP0264e6b611ca: Fix a typo in the README file (authored by douardda).

Fix a typo in the README file

Sep 30 2020, 11:48 AM

douardda committed rCDFP110773478d07: Update the db initialization to swh-storage 0.15 (authored by douardda).

Update the db initialization to swh-storage 0.15

Sep 30 2020, 11:48 AM

douardda requested changes to D4089: Add tests and fix behavior of scanner cli.

Sep 30 2020, 10:13 AM

douardda accepted D4092: packer: Clean up spurious blanks.

Sep 30 2020, 9:41 AM

douardda closed T2313: Archive git.fsfe.org (Gitea) as Resolved.

Listed (oneshot full + recurring incremental) and loaded (as far as I can tell).

Sep 30 2020, 9:37 AM · Archive coverage, Lister

Sep 29 2020

douardda added a comment to D4055: Drop debian packaging, which is now handled on separate branches.

(just updated my swh-env, now I see where this diff comes from :-) )

Sep 29 2020, 12:09 PM

douardda accepted D4055: Drop debian packaging, which is now handled on separate branches.

sure

Sep 29 2020, 11:49 AM

douardda accepted D4068: jobs/swh-environment: Improve build script.

LGTM (except the "rm -f ../$module.log " for which I am not convinced it's a good idea)

Sep 29 2020, 11:47 AM

douardda added a comment to T2313: Archive git.fsfe.org (Gitea).

I've sent an email to the fsfe.

Sep 29 2020, 11:39 AM · Archive coverage, Lister

douardda triaged T2645: Add listing tasks for gitea instances as Normal priority.

Sep 29 2020, 10:09 AM · Origin-Gitea/Gogs, Archive coverage, Lister

Sep 28 2020

douardda added a comment to T2313: Archive git.fsfe.org (Gitea).

Can this be closed now? What's missing? Adding a listing task?

Sep 28 2020, 9:47 AM · Archive coverage, Lister

Sep 25 2020

douardda created P786 (An Untitled Masterwork).

Sep 25 2020, 6:37 PM

douardda committed rDMODbe8f1a559d82: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).

Adapt cli declaration entrypoint to swh.core 0.3

Sep 25 2020, 3:27 PM

douardda closed D4051: Adapt cli declaration entrypoint to swh.core 0.3.

Sep 25 2020, 3:27 PM

douardda updated the diff for D4051: Adapt cli declaration entrypoint to swh.core 0.3.

and with the pytest.ini hunk we don't need a (non working) dependency on swh.core[testing]

Sep 25 2020, 3:25 PM

douardda updated the diff for D4051: Adapt cli declaration entrypoint to swh.core 0.3.

add a precision in the ci message for the pytest.ini hunk

Sep 25 2020, 3:22 PM

douardda created D4051: Adapt cli declaration entrypoint to swh.core 0.3.

Sep 25 2020, 3:15 PM

douardda created P781 (An Untitled Masterwork).

Sep 25 2020, 12:59 PM

douardda created P780 (An Untitled Masterwork).

Sep 25 2020, 12:36 PM

douardda committed rDVAUec87dfe25879: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).

Adapt cli declaration entrypoint to swh.core 0.3

Sep 25 2020, 12:09 PM

douardda committed rDSCHbe7a5aeafa7f: Rename sql files according to swh.core 0.3 (authored by douardda).

Rename sql files according to swh.core 0.3

Sep 25 2020, 11:40 AM

douardda closed D4045: Rename sql files according to swh.core 0.3.

Sep 25 2020, 11:40 AM

douardda added a comment to D4045: Rename sql files according to swh.core 0.3.

In D4045#100032, @ardumont wrote:

ah yeah, it'd be best to align indeed.

Sep 25 2020, 11:39 AM

douardda committed rDOBJSRPL99571a1068b0: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).

Adapt cli declaration entrypoint to swh.core 0.3

Sep 25 2020, 10:02 AM

douardda created D4045: Rename sql files according to swh.core 0.3.

Sep 25 2020, 9:57 AM

douardda committed rDSCH5cc573d16f53: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).

Adapt cli declaration entrypoint to swh.core 0.3

Sep 25 2020, 9:53 AM

Sep 24 2020

douardda accepted D4039: Update sql paths for the moved SQL files.

crumbs everywhere

Sep 24 2020, 8:13 PM

douardda accepted D4038: tests: Run SQL files with psql instead of psycopg2.

sure

Sep 24 2020, 8:07 PM

douardda committed rCDFPf9dbb018395a: Update grafana dashboards (authored by douardda).

Update grafana dashboards

Sep 24 2020, 5:50 PM

douardda added a comment to T2624: Create strategy for documentation with a map or a full table of content.

Also https://plan.io/blog/technical-documentation/

Sep 24 2020, 3:43 PM · Roadmap 2021, meta-task, Documentation

douardda added a comment to D4012: core.loader: Log information about origin currently being ingested.

In D4012#99525, @olasd wrote:

I don't think the origin url and visit type should be sent in the task result; they're arguments of the task already.

If we want them logged by the worker when the task ends (which I agree would be useful), then we should improve logging on the worker/celery side to show some of the task arguments (for instance, if there's a "url" argument) instead of, or in addition to, the task id.

Sep 24 2020, 3:28 PM · Core Loader

douardda added inline comments to D4012: core.loader: Log information about origin currently being ingested.

Sep 24 2020, 3:26 PM · Core Loader

douardda triaged T2621: running tox fails because C.UTF-8 is not available as Normal priority.

Sep 24 2020, 3:10 PM · Development environment

douardda closed T2119: Monitoring of workers as Resolved.

Sep 24 2020, 3:08 PM · Scheduling utilities, Sprint 2019/12 (Monitor and Conquer)

douardda closed T1296: Configure static analysis (eg. radon) reporting using jenkins warnings ng plugin, a subtask of T1024: Proper continuous integration setup, as Wontfix.

Sep 24 2020, 3:07 PM · Restricted Project, Continuous Integration, System administration

douardda closed T1296: Configure static analysis (eg. radon) reporting using jenkins warnings ng plugin as Wontfix.

This is probably mostly deprecated now we have mypy & al. Also the reporting via warnings-ng-plugin may not be such a priority now.

Sep 24 2020, 3:07 PM · Restricted Project, Continuous Integration, System administration