Oct 13 2020

douardda added a comment to D4193: swh identify: add --exclude.

@douardda @zack Note that this diff somehow did not trigger the CI tests, only the linters. No idea why; just a heads-up.

Oct 13 2020, 1:03 PM

Oct 9 2020

douardda committed rCDFPcdb00c8e5a34: Revert to a orgname/reponame based images naming scheme (authored by douardda).
Revert to a orgname/reponame based images naming scheme
Oct 9 2020, 5:08 PM
douardda committed rCDFP65cbf32679ad: Reorganize the images under the softwareheritage hub repo (authored by douardda).
Reorganize the images under the softwareheritage hub repo
Oct 9 2020, 4:39 PM
douardda accepted D4210: backfill: use the common `storage` top-level config key.

same

Oct 9 2020, 4:34 PM
douardda accepted D4209: backfill: support arbitrary journal writer configuration.

oh so much yes!

Oct 9 2020, 4:33 PM
douardda updated the task description for T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage).
Oct 9 2020, 3:47 PM · Staging environment, System administration
douardda triaged T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage) as High priority.
Oct 9 2020, 3:37 PM · Staging environment, System administration

Oct 8 2020

douardda added inline comments to D4198: PEP8 refactoring of scanner modules.
Oct 8 2020, 6:19 PM
douardda added a comment to D4198: PEP8 refactoring of scanner modules.

Overall ok, but I would have preferred the renaming be in a dedicated revision, separated from type annotation fixes/additions.

Oct 8 2020, 6:05 PM
douardda accepted D4182: conftest: Declare swh.core pytest_plugin.

otherwise fine with me

Oct 8 2020, 5:56 PM
douardda added a comment to D4182: conftest: Declare swh.core pytest_plugin.

does this require the plugin's entrypoint in swh.core to be removed (e.g. because of swh.core.pytest_plugin being loaded twice or something like that)? Or is it safe to apply and use with a swh.core that still declares its pytest_plugin as an entrypoint?

Oct 8 2020, 5:56 PM
douardda added a comment to D4078: Add a 'unique_key' method on model objects.

yes, it would make sense for values. Do you want to open a task for that?

you read my mind :-)

Oct 8 2020, 12:42 PM
douardda added a comment to T1279: swh-journal: The schema migration problem.

Since this "migration problem" also concerns cassandra, maybe a simple approach would be to add a Final version attribute to all model entities (a simple monotonic integer).

Oct 8 2020, 12:41 PM · Journal
douardda updated subscribers of D4078: Add a 'unique_key' method on model objects.

maybe a stupid question, but why use a dict as the unique key (in many model classes)? Why not use a tuple? I mean, it seems to me that such a UID should be usable as a dict key or in a set directly.

I don't know, I just copied what we were already doing in swh-journal. Dicts have the nice property of being somewhat "self-documenting" though.
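
A minimal sketch of the trade-off raised in this exchange, not the actual swh.model code: the field names `origin` and `visit` are assumptions chosen for illustration. It shows that a dict cannot be put in a set or used as a dict key, while a tuple can, and that a sorted tuple of items keeps the "self-documenting" field names.

```python
# Hypothetical unique key for a visit; the field names are illustrative only.
key_as_dict = {"origin": "https://example.org/repo.git", "visit": 3}

# A dict is unhashable, so it cannot go into a set or be a dict key directly.
try:
    {key_as_dict}
    dict_is_hashable = True
except TypeError:
    dict_is_hashable = False

# A tuple of the values is hashable, but loses the field names.
key_as_tuple = (key_as_dict["origin"], key_as_dict["visit"])

# A sorted tuple of (name, value) pairs keeps the names and is still hashable.
key_as_items = tuple(sorted(key_as_dict.items()))

seen = {key_as_tuple, key_as_items}
```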

Oct 8 2020, 12:39 PM
douardda added a comment to D4078: Add a 'unique_key' method on model objects.

yes, it would make sense for values. Do you want to open a task for that?

Oct 8 2020, 12:36 PM
douardda added a comment to D4078: Add a 'unique_key' method on model objects.

maybe a stupid question, but why use a dict as the unique key (in many model classes)? Why not use a tuple? I mean, it seems to me that such a UID should be usable as a dict key or in a set directly.

I don't know, I just copied what we were already doing in swh-journal. Dicts have the nice property of being somewhat "self-documenting" though.

Also (most probably a dumb idea, writing as it pops into my mind), wouldn't it make sense to add some kind of 'per-object class model version' in the key?

This would prevent compacting away old versions of objects. Is this something we want?
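
A rough illustration of the concern above, assuming keep-latest-per-key compaction semantics; the `compact` helper and the message layout are hypothetical, not swh-journal code. With the version in the key, each version becomes a distinct key, so the older one is never compacted away.

```python
def compact(messages):
    """Keep only the last value seen for each key, as a compacted topic would."""
    latest = {}
    for key, value in messages:
        latest[key] = value
    return latest


# Hypothetical messages: (object id, model version, payload).
events = [("obj1", 1, "old layout"), ("obj1", 2, "new layout")]

# Key is the object id only: the older message is dropped.
without_version = compact((oid, payload) for oid, _, payload in events)

# Key includes the model version: both messages survive compaction.
with_version = compact(((oid, version), payload) for oid, version, payload in events)
```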

Oct 8 2020, 12:34 PM
douardda added a comment to D4194: model: use visit ids in the unique key, instead of their date..

microsecond in postgres, millisecond in cassandra.
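
A small sketch of what that precision difference implies, with a hypothetical `truncate_to_ms` helper: two timestamps that differ only below the millisecond are distinct at microsecond precision but become equal once truncated to milliseconds.

```python
from datetime import datetime, timezone


def truncate_to_ms(dt):
    """Drop the sub-millisecond part of a datetime (hypothetical helper)."""
    return dt.replace(microsecond=dt.microsecond // 1000 * 1000)


a = datetime(2020, 10, 8, 12, 27, 0, 123456, tzinfo=timezone.utc)
b = datetime(2020, 10, 8, 12, 27, 0, 123789, tzinfo=timezone.utc)

distinct_at_us = a != b
equal_at_ms = truncate_to_ms(a) == truncate_to_ms(b)
```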

Oct 8 2020, 12:27 PM
douardda accepted D4186: scanner.model: Fix Tree.toDict to be side-effect free.

The split in 2 revisions is not mandatory, just sayin' for good measure.

Oct 8 2020, 12:16 PM
douardda added a comment to D4186: scanner.model: Fix Tree.toDict to be side-effect free.

looks good (did not even notice toDict() is not even a recursive method! so this dict_nodes really makes no sense at all).

Oct 8 2020, 12:15 PM
douardda added a comment to D4078: Add a 'unique_key' method on model objects.

maybe a stupid question, but why use a dict as the unique key (in many model classes)? Why not use a tuple? I mean, it seems to me that such a UID should be usable as a dict key or in a set directly.

Oct 8 2020, 12:08 PM
douardda added a comment to D4078: Add a 'unique_key' method on model objects.

maybe a stupid question, but why use a dict as the unique key (in many model classes)? Why not use a tuple? I mean, it seems to me that such a UID should be usable as a dict key or in a set directly.

Oct 8 2020, 12:04 PM
douardda added a comment to D4194: model: use visit ids in the unique key, instead of their date..

dates are not unique (ie. multiple visits can share a date, and they
do in practice); and visit statuses already use visit ids in their
unique key.
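
To make the collision concrete, here is a minimal sketch with made-up visit records (the dict layout is an assumption, not the swh.model representation): two visits of the same origin that share a date collapse into one key when keyed by date, but stay distinct when keyed by visit id.

```python
from datetime import datetime, timezone

shared_date = datetime(2020, 10, 8, 11, 52, tzinfo=timezone.utc)

# Two hypothetical visits of the same origin recorded with the same date.
visits = [
    {"origin": "https://example.org/repo.git", "visit": 1, "date": shared_date},
    {"origin": "https://example.org/repo.git", "visit": 2, "date": shared_date},
]

keys_by_date = {(v["origin"], v["date"]) for v in visits}
keys_by_id = {(v["origin"], v["visit"]) for v in visits}
```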

Oct 8 2020, 11:52 AM
douardda added a comment to D4193: swh identify: add --exclude.
In D4193#103804, @zack wrote:

Thanks, even though this is a slightly disturbing discrepancy wrt. the swh-scanner exclusion mechanism,

Oct 8 2020, 11:24 AM
douardda requested changes to D4193: swh identify: add --exclude.

can you please remove the "noise" added by arc in the commit message? And update it (the previous option name is still in there).

Oct 8 2020, 11:16 AM

Oct 7 2020

douardda requested changes to D3435: Add mercurial.from_disk.HgLoaderFromDisk.
Oct 7 2020, 3:52 PM

Oct 2 2020

douardda added a comment to T1410: Kill implicit configuration: new configuration scheme.

Maybe starting a pad/hackmd document would be easier at this point?

Oct 2 2020, 4:01 PM · Core & foundations
douardda added a comment to D4131: Remove parse_url helper that adds no real value.

this is debatable, but it does "normalize" the given url, so it does something. I agree the automatically added https:// prefix is strange, but the trailing / still brings value. For example there are listers that do not implement this, so if you create a listing task with url=https://somehere.org/api/v1 it will fail because it will forge invalid urls (missing the trailing /).
[edit] and I find this very annoying
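
One way such invalid urls can arise, sketched with the standard library; whether a given lister builds its urls with `urljoin` is an assumption, and `ensure_trailing_slash` is a hypothetical stand-in rather than the parse_url helper itself. Without the trailing /, `urljoin` replaces the last path segment instead of appending to it.

```python
from urllib.parse import urljoin


def ensure_trailing_slash(url):
    """Hypothetical normalization: make sure the base url ends with a slash."""
    return url if url.endswith("/") else url + "/"


base = "https://somehere.org/api/v1"

# Without the trailing slash, "v1" is replaced by the relative path.
without_slash = urljoin(base, "repos")

# With the trailing slash, the relative path is appended under "v1/".
with_slash = urljoin(ensure_trailing_slash(base), "repos")
```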

Oct 2 2020, 3:14 PM
douardda added a comment to D4131: Remove parse_url helper that adds no real value.

this is debatable, but it does "normalize" the given url, so it does something. I agree the automatically added https:// prefix is strange, but the trailing / still brings value. For example there are listers that do not implement this, so if you create a listing task with url=https://somehere.org/api/v1 it will fail because it will forge invalid urls (missing the trailing /).
[edit] and I find this very annoying

Oct 2 2020, 3:13 PM
douardda added a comment to D3334: Add a new TenaciousProxyStorage.

@douardda ping?

Oct 2 2020, 2:32 PM

Oct 1 2020

douardda accepted D4112: jobs/tools/dockerfiles: Enable to trigger builds remotely.
Oct 1 2020, 2:24 PM

Sep 30 2020

douardda committed rCDFP0264e6b611ca: Fix a typo in the README file (authored by douardda).
Fix a typo in the README file
Sep 30 2020, 11:48 AM
douardda committed rCDFP110773478d07: Update the db initialization to swh-storage 0.15 (authored by douardda).
Update the db initialization to swh-storage 0.15
Sep 30 2020, 11:48 AM
douardda requested changes to D4089: Add tests and fix behavior of scanner cli.
Sep 30 2020, 10:13 AM
douardda accepted D4092: packer: Clean up spurious blanks.
Sep 30 2020, 9:41 AM
douardda closed T2313: Archive git.fsfe.org (Gitea) as Resolved.

Listed (oneshot full + recurring incremental) and loaded (as far as I can tell).

Sep 30 2020, 9:37 AM · Archive coverage, Lister

Sep 29 2020

douardda added a comment to D4055: Drop debian packaging, which is now handled on separate branches.

(just updated my swh-env, now I see where this diff comes from :-) )

Sep 29 2020, 12:09 PM
douardda accepted D4055: Drop debian packaging, which is now handled on separate branches.

sure

Sep 29 2020, 11:49 AM
douardda accepted D4068: jobs/swh-environment: Improve build script.

LGTM (except the "rm -f ../$module.log " for which I am not convinced it's a good idea)

Sep 29 2020, 11:47 AM
douardda added a comment to T2313: Archive git.fsfe.org (Gitea).

I've sent an email to the fsfe.

Sep 29 2020, 11:39 AM · Archive coverage, Lister
douardda triaged T2645: Add listing tasks for gitea instances as Normal priority.
Sep 29 2020, 10:09 AM · Origin-Gitea/Gogs, Archive coverage, Lister

Sep 28 2020

douardda added a comment to T2313: Archive git.fsfe.org (Gitea).

Can this be closed now? What's missing? Adding a listing task?

Sep 28 2020, 9:47 AM · Archive coverage, Lister

Sep 25 2020

douardda created P786 (An Untitled Masterwork).
Sep 25 2020, 6:37 PM
douardda committed rDMODbe8f1a559d82: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).
Adapt cli declaration entrypoint to swh.core 0.3
Sep 25 2020, 3:27 PM
douardda closed D4051: Adapt cli declaration entrypoint to swh.core 0.3.
Sep 25 2020, 3:27 PM
douardda updated the diff for D4051: Adapt cli declaration entrypoint to swh.core 0.3.

and with the pytest.ini hunk we don't need a (non working) dependency on swh.core[testing]

Sep 25 2020, 3:25 PM
douardda updated the diff for D4051: Adapt cli declaration entrypoint to swh.core 0.3.

add a clarification in the ci message for the pytest.ini hunk

Sep 25 2020, 3:22 PM
douardda created D4051: Adapt cli declaration entrypoint to swh.core 0.3.
Sep 25 2020, 3:15 PM
douardda created P781 (An Untitled Masterwork).
Sep 25 2020, 12:59 PM
douardda created P780 (An Untitled Masterwork).
Sep 25 2020, 12:36 PM
douardda committed rDVAUec87dfe25879: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).
Adapt cli declaration entrypoint to swh.core 0.3
Sep 25 2020, 12:09 PM
douardda committed rDSCHbe7a5aeafa7f: Rename sql files according to swh.core 0.3 (authored by douardda).
Rename sql files according to swh.core 0.3
Sep 25 2020, 11:40 AM
douardda closed D4045: Rename sql files according to swh.core 0.3.
Sep 25 2020, 11:40 AM
douardda added a comment to D4045: Rename sql files according to swh.core 0.3.

ah yeah, it'd be best to align indeed.

Sep 25 2020, 11:39 AM
douardda committed rDOBJSRPL99571a1068b0: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).
Adapt cli declaration entrypoint to swh.core 0.3
Sep 25 2020, 10:02 AM
douardda created D4045: Rename sql files according to swh.core 0.3.
Sep 25 2020, 9:57 AM
douardda committed rDSCH5cc573d16f53: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).
Adapt cli declaration entrypoint to swh.core 0.3
Sep 25 2020, 9:53 AM

Sep 24 2020

douardda accepted D4039: Update sql paths for the moved SQL files.

crumbs everywhere

Sep 24 2020, 8:13 PM
douardda accepted D4038: tests: Run SQL files with psql instead of psycopg2.

sure

Sep 24 2020, 8:07 PM
douardda committed rCDFPf9dbb018395a: Update grafana dashboards (authored by douardda).
Update grafana dashboards
Sep 24 2020, 5:50 PM
douardda added a comment to T2624: Create strategy for documentation with a map or a full table of content.

Also https://plan.io/blog/technical-documentation/

Sep 24 2020, 3:43 PM · Roadmap 2021, meta-task, Documentation
douardda added a comment to D4012: core.loader: Log information about origin currently being ingested.
In D4012#99525, @olasd wrote:

I don't think the origin url and visit type should be sent in the task result; they're arguments of the task already.

If we want them logged by the worker when the task ends (which I agree would be useful), then we should improve logging on the worker/celery side to show some of the task arguments (for instance, if there's a "url" argument) instead of / in addition to the task id.

Sep 24 2020, 3:28 PM · Core Loader
douardda added inline comments to D4012: core.loader: Log information about origin currently being ingested.
Sep 24 2020, 3:26 PM · Core Loader
douardda triaged T2621: running tox fails because C.UTF-8 is not available as Normal priority.
Sep 24 2020, 3:10 PM · Development environment
douardda closed T2119: Monitoring of workers as Resolved.
Sep 24 2020, 3:08 PM · Scheduling utilities, Sprint 2019/12 (Monitor and Conquer)
douardda closed T1296: Configure static analysis (eg. radon) reporting using jenkins warnings ng plugin, a subtask of T1024: Proper continuous integration setup, as Wontfix.
Sep 24 2020, 3:07 PM · Restricted Project, Continuous Integration, System administration
douardda closed T1296: Configure static analysis (eg. radon) reporting using jenkins warnings ng plugin as Wontfix.

This is probably mostly deprecated now that we have mypy et al. Also the reporting via warnings-ng-plugin may not be such a priority now.

Sep 24 2020, 3:07 PM · Restricted Project, Continuous Integration, System administration
douardda accepted D4030: Add an optional swh.storage read replica to the docker setup.
Sep 24 2020, 3:03 PM
douardda added a comment to T1330: Ensure documentation building process does not generates any rst syntax warning.

I'd like to close this task, but unfortunately:

Sep 24 2020, 2:47 PM · Documentation
douardda closed T358: doc: high-level architecture diagram, a subtask of T509: Generate and publish Software Heritage Development Documentation, as Resolved.
Sep 24 2020, 2:42 PM · Documentation
douardda closed T358: doc: high-level architecture diagram as Resolved.

I guess we can say that, let's close this.

Sep 24 2020, 2:42 PM · Documentation
douardda closed T1601: Journal client of swh-storage mirrors as Resolved.

I guess this can be closed now

Sep 24 2020, 2:42 PM · Sprint 2019 03
douardda accepted D3981: Support different database flavors in the SQL scripts.
Sep 24 2020, 2:35 PM
douardda committed rDENV402b154052d5: Rename mirror-related things from replica to mirror (authored by douardda).
Rename mirror-related things from replica to mirror
Sep 24 2020, 1:04 PM
douardda closed D4022: Rename mirror-related things from replica to mirror.
Sep 24 2020, 1:04 PM
douardda accepted D4024: Docker: Migrate relevant services to the split `swh db create` / `swh db init`.

thx a lot

Sep 24 2020, 1:03 PM
douardda accepted D4023: Update debian-like host instructions to reference the documentation.

sure go

Sep 24 2020, 12:08 PM
douardda committed rDWCLI04375ab5713d: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).
Adapt cli declaration entrypoint to swh.core 0.3
Sep 24 2020, 11:57 AM
douardda committed rDDEP047588a03f0b: Fix dependency-related hell in tox (authored by douardda).
Fix dependency-related hell in tox
Sep 24 2020, 11:26 AM
douardda closed D4029: Attempt to fix dependency-related hell in tox.
Sep 24 2020, 11:26 AM
douardda updated the diff for D4029: Attempt to fix dependency-related hell in tox.

update ci message

Sep 24 2020, 11:26 AM
douardda added a comment to D4029: Attempt to fix dependency-related hell in tox.

commit message! ;)

Sep 24 2020, 11:18 AM
douardda created D4029: Attempt to fix dependency-related hell in tox.
Sep 24 2020, 11:13 AM

Sep 23 2020

douardda committed rDSEA7b9a254fc016: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).
Adapt cli declaration entrypoint to swh.core 0.3
Sep 23 2020, 6:40 PM
douardda committed rDTSCNf10d3d87cbcb: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).
Adapt cli declaration entrypoint to swh.core 0.3
Sep 23 2020, 6:27 PM
douardda committed rDTPL19683b52773f: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).
Adapt cli declaration entrypoint to swh.core 0.3
Sep 23 2020, 6:21 PM
douardda committed rDOBJSfd1cc6dc71c8: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).
Adapt cli declaration entrypoint to swh.core 0.3
Sep 23 2020, 6:18 PM
douardda updated the diff for D4022: Rename mirror-related things from replica to mirror.

fix a couple of sedisms

Sep 23 2020, 6:12 PM
douardda created D4022: Rename mirror-related things from replica to mirror.
Sep 23 2020, 6:07 PM
douardda committed rDLDBASE7ed1927b9968: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).
Adapt cli declaration entrypoint to swh.core 0.3
Sep 23 2020, 5:50 PM
douardda committed rDLSd10f78d80c95: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).
Adapt cli declaration entrypoint to swh.core 0.3
Sep 23 2020, 5:42 PM
douardda committed rDCIDX8c4b33ba2c74: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).
Adapt cli declaration entrypoint to swh.core 0.3
Sep 23 2020, 5:26 PM
douardda committed rDICP750a15e4f9ae: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).
Adapt cli declaration entrypoint to swh.core 0.3
Sep 23 2020, 5:13 PM
douardda committed rDGRPHc293d0726a7b: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).
Adapt cli declaration entrypoint to swh.core 0.3
Sep 23 2020, 5:04 PM
douardda committed rDDEP3416abd01019: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).
Adapt cli declaration entrypoint to swh.core 0.3
Sep 23 2020, 4:37 PM
douardda committed rDDATASET6205fdee730b: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).
Adapt cli declaration entrypoint to swh.core 0.3
Sep 23 2020, 4:36 PM
douardda committed rDSTOc97b23bcd78b: Adapt cli declaration entrypoint to swh.core 0.3 (authored by douardda).
Adapt cli declaration entrypoint to swh.core 0.3
Sep 23 2020, 4:35 PM
douardda closed D4002: Adapt cli declaration entrypoint to swh.core 0.3.
Sep 23 2020, 4:35 PM
douardda updated the diff for D4002: Adapt cli declaration entrypoint to swh.core 0.3.

rebase

Sep 23 2020, 4:29 PM
douardda committed rCDFJ4e78deb8b89c: Add libfuse3-dev in the docker (authored by douardda).
Add libfuse3-dev in the docker
Sep 23 2020, 3:45 PM
douardda closed D4021: Add libfuse3-dev in the docker.
Sep 23 2020, 3:45 PM