
Apr 16 2021

zack added a comment to T3252: Better handling of erroneous origins submitted to save code now.

@rdicosmo great summary, I'm certainly on that page :)

Apr 16 2021, 3:30 PM · System administration, Save Code Now, Web app
zack added a comment to T3252: Better handling of erroneous origins submitted to save code now.

thanks !

Apr 16 2021, 1:15 PM · System administration, Save Code Now, Web app
zack added a comment to T3252: Better handling of erroneous origins submitted to save code now.

but adding an email field (auto filled for registered users) to send a notification after the origin was loaded seems a good tradeoff. To implement the email notification, we will have to add a journal client in swh-web processing origin visit messages.

Apr 16 2021, 11:43 AM · System administration, Save Code Now, Web app

Apr 15 2021

zack added a comment to T3252: Better handling of erroneous origins submitted to save code now.

Oh, and now that we have user profile pages, we should have a list of "my" save code now requests with their status visible in the user profile, for those who want to synchronously check the status of their requests (and might have disabled email notifications).

Apr 15 2021, 11:35 PM · System administration, Save Code Now, Web app
zack added a comment to T3252: Better handling of erroneous origins submitted to save code now.

It would be desirable to provide the user with feedback that helps fix the issue.

Apr 15 2021, 11:33 PM · System administration, Save Code Now, Web app
zack accepted D5540: docs: Update for new schema.

Thanks. Can you please make a release after landing this, so that docs.s.o gets updated?

Apr 15 2021, 9:42 PM

Apr 14 2021

zack closed T1968: existing graph endpoints should not return 404 upon missing arguments as Invalid.

Sure! My apologies @Hakimb, but it's thanks to your work that we realized what the right fate for this task was.

Apr 14 2021, 5:10 PM · Easy hack, Compressed graph service
zack updated subscribers of T1968: existing graph endpoints should not return 404 upon missing arguments.

@seirl, @vlorentz: I see your point, and I agree. We should never have used /nested/paths for this API.
Maybe we should just reconsider this and, once @Hakimb is ready with a new traversal language proposal, we can map it to a better REST API that uses query parameters and deals properly with 4xx return codes.

Apr 14 2021, 4:15 PM · Easy hack, Compressed graph service
zack added a comment to T2981: Graph API: add a (node type) result filters.
In T2981#63164, @Hakimb wrote:

questions:

1/ So for the "filter that applies to visits that return nodes one by one" part, we are talking about: neighbors, walk, visit/nodes only?

Apr 14 2021, 4:13 PM · Compressed graph service
zack requested changes to D5522: Add athena subcommand to create/query AWS Athena database.

Most of my comments are minor/nice to have, although I'd like to be able to pass queries directly on the CLI.

Apr 14 2021, 4:09 PM

Apr 12 2021

zack updated subscribers of T3242: Decommission ClearlyDefined resources.

@vsellier: ack on the outboarding, that is actionable as of now.

Apr 12 2021, 5:16 PM · System administration
zack added a comment to T3084: Fast track save code now requests.

Thanks for this!

Apr 12 2021, 5:11 PM · System administration, Web app

Apr 8 2021

zack added a comment to T3161: graph service: add anti-DoS limit on the number of edges traversed.

ok, so @Hakimb: go for no default value. If the query param is not passed, the visit will not stop before the end. If it's given, it will stop once the limit is reached. Call the query param ?max_edges. You will find that the java code already keeps track of the number of edges traversed, so you should just need to compare with that.
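The semantics described above can be sketched as follows. This is a minimal Python illustration, not the swh-graph Java code: the `bounded_visit` function, the dictionary-based graph, and the choice of a breadth-first visit are all assumptions made for the example. Only the `max_edges` name and its "no default, stop once reached" behavior come from the comment.

```python
from collections import deque
from typing import Dict, Iterator, List, Optional


def bounded_visit(
    graph: Dict[str, List[str]],
    start: str,
    max_edges: Optional[int] = None,
) -> Iterator[str]:
    """Breadth-first visit that stops once max_edges edges were traversed.

    max_edges=None models the query parameter not being passed:
    the visit then runs to the end.
    """
    seen = {start}
    queue = deque([start])
    edges_traversed = 0  # running count, compared against the limit
    yield start
    while queue:
        node = queue.popleft()
        for successor in graph.get(node, []):
            if max_edges is not None and edges_traversed >= max_edges:
                return  # stop the visit itself, not just the output
            edges_traversed += 1
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
                yield successor


# Hypothetical toy graph, for illustration only.
toy_graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
unbounded = list(bounded_visit(toy_graph, "a"))
bounded = list(bounded_visit(toy_graph, "a", max_edges=2))
```

Note that the limit is checked before each edge is followed, so the traversal work is bounded regardless of how many results are returned.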

Apr 8 2021, 2:44 PM · Compressed graph service
zack added a comment to T3161: graph service: add anti-DoS limit on the number of edges traversed.

To complement what @vlorentz mentioned, we should actually stop the visit once the maximum number of edges has been reached, because it is continuing the visit (no matter how many results are returned afterwards) that can DoS the swh-graph backend.

Apr 8 2021, 2:24 PM · Compressed graph service

Apr 7 2021

zack accepted D5438: bin/install: Add support for running outside a virtualenv.

(good catch also for the missing "$@" in the last invocation)

Apr 7 2021, 1:26 PM
zack added a comment to T3084: Fast track save code now requests.

@ardumont we briefly discussed this a while ago with @olasd. I think the proposed solution was indeed to have a separate queue (and workers) for "save code now" requests, but not necessarily one separate queue per loader, because the current priority system wasn't considered to be "fast enough". Maybe you can discuss this briefly with him and synthesize here what you come up with?

Apr 7 2021, 1:02 PM · System administration, Web app
zack added a comment to D5427: NodeIdMap: use the MPH + mmapped .order to translate SWHID -> node ID.

I don't think that's good enough. We should have an overview of swh-graph's design that doesn't require reading all the code in an unspecified order.
And reading the code does not give a rationale for the decision.

Apr 7 2021, 12:52 PM

Apr 6 2021

zack committed rMSLD93db6cff6c7b: swh-scanner talk: add links to code and pypi package (authored by zack).
swh-scanner talk: add links to code and pypi package
Apr 6 2021, 4:37 PM
zack committed rMSLD3d4ddee13cde: minor changes and updates for LLW 2021 talk (authored by zack).
minor changes and updates for LLW 2021 talk
Apr 6 2021, 4:28 PM
zack closed T3212: typo in the identify function in swh-model/swh/model/cli.py as Invalid.

No, swh identify is correct, as all SWH CLI commands register as sub-commands of the main swh executable.
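As a rough illustration of the sub-command pattern mentioned above, here is a sketch using `argparse` from the Python standard library. It is not the actual SWH CLI code; the `build_parser` helper and the `paths` argument are hypothetical names chosen for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # One main executable ("swh") with sub-commands registered under it.
    parser = argparse.ArgumentParser(prog="swh")
    subcommands = parser.add_subparsers(dest="command", required=True)
    identify = subcommands.add_parser("identify", help="compute identifiers")
    identify.add_argument("paths", nargs="+")  # hypothetical argument
    return parser


# Equivalent to running: swh identify README.md
args = build_parser().parse_args(["identify", "README.md"])
print(args.command, args.paths)
```

With this layout the command is invoked as `swh identify`, with a space, rather than as a separate hyphenated executable.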

Apr 6 2021, 4:03 PM · Documentation
zack resigned from D5411: return a 400 error when accessing endpoints without the arguments.
Apr 6 2021, 12:33 PM
zack added a project to T3209: Fix swh-scanner for python > 3.7: Code scanner.
Apr 6 2021, 12:01 PM · Code scanner
zack requested changes to D5411: return a 400 error when accessing endpoints without the arguments.

also, can you add tests verifying that calling the API without an argument does in fact return 400 error?
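A test of this kind could look like the sketch below. The `handle_neighbors` handler and its `src` parameter are hypothetical stand-ins, not the real swh-graph endpoint code; the point is only the shape of the assertion on the 400 status.

```python
from http import HTTPStatus
from typing import Dict, Tuple


def handle_neighbors(params: Dict[str, str]) -> Tuple[int, str]:
    """Hypothetical handler that requires a 'src' argument."""
    src = params.get("src")
    if not src:
        # Missing argument is a client error (400), not "not found" (404).
        return int(HTTPStatus.BAD_REQUEST), "missing required argument: src"
    return int(HTTPStatus.OK), f"neighbors of {src}"


def test_missing_argument_returns_400() -> None:
    status, body = handle_neighbors({})
    assert status == 400
    assert "src" in body


def test_valid_argument_returns_200() -> None:
    status, _ = handle_neighbors({"src": "swh:1:rev:" + "0" * 40})
    assert status == 200
```

The test functions follow the pytest naming convention, so a runner such as pytest would collect them automatically.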

Apr 6 2021, 11:59 AM
zack requested changes to D5420: cli/identify: Add support for --recursive.
Apr 6 2021, 11:39 AM
zack closed T1136: swh-identify: support recursive checksumming of directories as Invalid.

duplicate of T3160

Apr 6 2021, 11:36 AM · Data Model
zack committed rMSLD37e00f419eda: check-in slide skeleton for LLW 2021 (authored by zack).
check-in slide skeleton for LLW 2021
Apr 6 2021, 11:03 AM
zack added inline comments to D5420: cli/identify: Add support for --recursive.
Apr 6 2021, 10:44 AM

Apr 2 2021

zack added a reviewer for D5411: return a 400 error when accessing endpoints without the arguments: seirl.
Apr 2 2021, 6:09 PM
zack added a comment to T3196: Improve discoverability of the permalinks tab.

@anlambert it looks like we're thinking of the same placement for the link that opens the permalink box. The main difference seems to be "modal popup" vs. "drop-down section" (which makes the rest of the page scroll down). Maybe you can just try both and see what looks best?

Apr 2 2021, 8:13 AM · Web app

Apr 1 2021

zack added a comment to T3196: Improve discoverability of the permalinks tab.

Adding both something (the animation) and an optional checkbox to hide it (because it is potentially annoying in the long run) does not sound like a great UX.

Apr 1 2021, 10:28 PM · Web app
zack closed T2269: cron spam: <root@*> find /var/log/kafka -type f -not -name *.gz -a -ctime +1 -exec gzip {} \+ as Resolved.
Apr 1 2021, 9:25 PM · System administration
zack edited Description on Roadmap 2021.
Apr 1 2021, 11:03 AM
zack edited Description on Roadmap 2021.
Apr 1 2021, 11:00 AM

Mar 31 2021

zack committed rDGRPH8d30918cd7f8: docs: drop mention of conffile in quickstart (authored by zack).
docs: drop mention of conffile in quickstart
Mar 31 2021, 5:19 PM
zack renamed T1538: Add "forge" now from save "forge" now to Save "forge" now.
Mar 31 2021, 11:07 AM · Add Forge Now, Roadmap 2022, meta-task, Roadmap 2021
zack moved T3175: Prepare production environment from Backlog to Done on the Roadmap 2021 board.
Mar 31 2021, 11:05 AM · Roadmap 2021, System administration, Monitoring

Mar 30 2021

zack added a comment to T2833: cpan.loader - archive Perl modules from CPAN.

awesome, thanks @joenio ! you can also drop by our other devel communication channel if you want to discuss this in other ways: https://www.softwareheritage.org/community/developers/

Mar 30 2021, 3:29 PM · CPAN lister, Archive coverage
zack renamed T2833: cpan.loader - archive Perl modules from CPAN from [feature request] cpan.loader - preserver Perl modules from CPAN to cpan.loader - preserver Perl modules from CPAN.
Mar 30 2021, 8:22 AM · CPAN lister, Archive coverage
zack raised the priority of T2833: cpan.loader - archive Perl modules from CPAN from Wishlist to Normal.
Mar 30 2021, 8:22 AM · CPAN lister, Archive coverage
zack added a comment to T2833: cpan.loader - archive Perl modules from CPAN.

Hey, yes, we want to have one, but nobody is working on it at the moment, and we would rather have someone knowledgeable about that ecosystem work on it. So, if you're interested, you're more than welcome to help there! (And thank you in advance.)

Mar 30 2021, 8:21 AM · CPAN lister, Archive coverage

Mar 29 2021

zack committed rMSLD17714c5a3348: CYU talk: use more recent data model slide (authored by zack).
CYU talk: use more recent data model slide
Mar 29 2021, 7:34 PM
zack committed rMSLD42be91daa092: check in slides for tomorrow talk at CYU (authored by zack).
check in slides for tomorrow talk at CYU
Mar 29 2021, 3:01 PM

Mar 27 2021

zack closed T3180: [spam] as Invalid.
Mar 27 2021, 5:54 PM · General, Web client
zack committed R183:bb3690aee756: add recent papers (authored by zack).
add recent papers
Mar 27 2021, 2:31 PM
zack committed R183:04b760d62231: add citation for Apache Gremlin graph traversal language (authored by zack).
add citation for Apache Gremlin graph traversal language
Mar 27 2021, 2:17 PM

Mar 26 2021

zack triaged T3178: document how to export the graph dataset automatically as Normal priority.
Mar 26 2021, 12:25 PM · Documentation, Datasets
zack reopened T1847: fully automate export of the graph dataset, a subtask of T1848: refresh graph dataset export, as Open.
Mar 26 2021, 12:25 PM · Datasets
zack reopened T1847: fully automate export of the graph dataset as "Open".

reopening, as ideally we'd like to have run the entire ORC export once to completion before closing

Mar 26 2021, 12:25 PM · Compressed graph service, Datasets

Mar 23 2021

zack updated the task description for T3168: Proper deployment of swh-graph with debian package.
Mar 23 2021, 12:24 PM · Compressed graph service, Puppet recipes
zack added a project to T3168: Proper deployment of swh-graph with debian package: Compressed graph service.
Mar 23 2021, 12:23 PM · Compressed graph service, Puppet recipes

Mar 22 2021

zack renamed T3161: graph service: add anti-DoS limit on the number of edges traversed from graph service: add limit on the number of edges traversed to graph service: add anti-DoS limit on the number of edges traversed.
Mar 22 2021, 9:43 AM · Compressed graph service
zack added a subtask for T2220: swh-graph in production: T3161: graph service: add anti-DoS limit on the number of edges traversed.
Mar 22 2021, 9:43 AM · Roadmap 2022, meta-task, Roadmap 2021, Compressed graph service
zack added a parent task for T3161: graph service: add anti-DoS limit on the number of edges traversed: T2220: swh-graph in production.
Mar 22 2021, 9:43 AM · Compressed graph service
zack triaged T3161: graph service: add anti-DoS limit on the number of edges traversed as Normal priority.
Mar 22 2021, 9:12 AM · Compressed graph service
zack closed T2113: swh-graph: add support to optionally resolve ori PIDs to origin URLs as Wontfix.

Now that this is (optionally) done by swh-web, I don't think we want to implement it in swh-graph too.

Mar 22 2021, 8:56 AM · Compressed graph service

Mar 21 2021

zack added a comment to D5295: Add type annotations to metadata mappings.

While you are at it, and as a minor point, please also double check your commit message, it doesn't match our conventions (e.g., it is in passive voice, while it shouldn't).

Mar 21 2021, 10:58 PM

Mar 20 2021

zack committed rMSLD63f6936d4189: LibrePlanet talk: last touches (authored by zack).
LibrePlanet talk: last touches
Mar 20 2021, 2:37 PM
zack renamed T3160: swh identify: add a -R/--recursive flag from swh identify: add a -R/--recursive to swh identify: add a -R/--recursive flag.
Mar 20 2021, 2:22 PM · Easy hack, Data Model
zack updated the task description for T3160: swh identify: add a -R/--recursive flag.
Mar 20 2021, 2:21 PM · Easy hack, Data Model
zack triaged T3160: swh identify: add a -R/--recursive flag as Normal priority.
Mar 20 2021, 2:20 PM · Easy hack, Data Model

Mar 19 2021

zack accepted D5292: cli: Don't show a traceback or warning if the config file does not exist.
Mar 19 2021, 5:47 PM
zack committed rMSLD5ec7dc16b302: check in slides for LibrePlanet 2021 (authored by zack).
check in slides for LibrePlanet 2021
Mar 19 2021, 5:13 PM
zack placed T2234: Write use case-specific documentation up for grabs.

Please do not claim tasks @shivam2003, just submit a patch fixing the issue when you have one. Thanks.

Mar 19 2021, 5:10 PM · Roadmap 2021, meta-task, Documentation
zack committed rMSLD83819f6e6034: common: add SwhFS ICSE paper to biblio module (authored by zack).
common: add SwhFS ICSE paper to biblio module
Mar 19 2021, 4:29 PM
zack committed rMSLDdcf96f56494d: common: revamp some old/common slides to reflect current state (authored by zack).
common: revamp some old/common slides to reflect current state
Mar 19 2021, 4:29 PM
zack committed rMSLDcd8af720ce3c: common: add swh identify tutorial/example to SWHID module (authored by zack).
common: add swh identify tutorial/example to SWHID module
Mar 19 2021, 4:29 PM
zack committed rMSLDc2d00871acb0: common: add (minimal) slide module for swh-fuse (authored by zack).
common: add (minimal) slide module for swh-fuse
Mar 19 2021, 4:29 PM
zack committed rMSLDe9f19e6288df: common: add one-slider module about the Merkle structure (authored by zack).
common: add one-slider module about the Merkle structure
Mar 19 2021, 4:29 PM
zack created P979 Command-Line Input.
Mar 19 2021, 2:26 PM
zack committed rMSLDb6c2a59d3dc5: common/images: add archive coverage image + links for coverage & growth (authored by zack).
common/images: add archive coverage image + links for coverage & growth
Mar 19 2021, 2:15 PM
zack committed rDSEA7b3b0dca9d55: doc: capitalize heading title (authored by zack).
doc: capitalize heading title
Mar 19 2021, 11:02 AM
zack committed rDDOC54fe755ea8a9: make heading for swh-loader page consisted with other packages (authored by zack).
make heading for swh-loader page consisted with other packages
Mar 19 2021, 10:46 AM

Mar 18 2021

zack added a member for Developers: aeviso.
Mar 18 2021, 2:45 PM
zack removed a member for Developers: tenma.
Mar 18 2021, 2:45 PM
zack removed a member for Developers: fiendish.
Mar 18 2021, 2:44 PM

Mar 16 2021

zack placed T3137: Add type annotations to swh.loaders.svn up for grabs.
Mar 16 2021, 7:01 PM · Easy hack, SVN Loader
zack added a comment to T3137: Add type annotations to swh.loaders.svn.

@shashikant231 please do not claim tasks. Just submit a diff fixing the issue when you have one. Thanks.

Mar 16 2021, 7:01 PM · Easy hack, SVN Loader
zack placed T3136: Prior art detection service up for grabs.

@shashikant231 please do not claim tasks, thanks.

Mar 16 2021, 7:00 PM · Roadmap 2022, Code scanner, Scientific Community Building, Roadmap 2021, meta-task
zack placed T1968: existing graph endpoints should not return 404 upon missing arguments up for grabs.

Dear @Kaustuv942, sure, patches welcome. We do not use task claiming for non-regular contributors though; just submit a patch when you have one.

Mar 16 2021, 6:59 PM · Easy hack, Compressed graph service

Mar 15 2021

zack committed rDGRPH58b46f78ee3f: FindEarliestRevision: bug fix: do not follow rev:rev edges (authored by zack).
FindEarliestRevision: bug fix: do not follow rev:rev edges
Mar 15 2021, 9:34 PM

Mar 14 2021

zack added a parent task for T3125: add revision timestamp to the compression timeline: T3126: API: add endpoint to find the earliest revision referencing a dir/cnt node.
Mar 14 2021, 12:04 PM · Compressed graph service
zack added a subtask for T3126: API: add endpoint to find the earliest revision referencing a dir/cnt node: T3125: add revision timestamp to the compression timeline.
Mar 14 2021, 12:04 PM · Compressed graph service
zack triaged T3126: API: add endpoint to find the earliest revision referencing a dir/cnt node as Normal priority.
Mar 14 2021, 12:03 PM · Compressed graph service
zack triaged T3125: add revision timestamp to the compression timeline as Normal priority.
Mar 14 2021, 12:02 PM · Compressed graph service

Mar 13 2021

zack updated the task description for T3124: vpn (and ssh) connection to louvre.s.o failure.
Mar 13 2021, 1:49 PM · System administration
zack triaged T3124: vpn (and ssh) connection to louvre.s.o failure as Unbreak Now! priority.
Mar 13 2021, 1:48 PM · System administration

Mar 11 2021

zack committed rDGRPHe0ef3b9b124b: FindEarliestRevision: make it work as a *nix filter and add accounting (authored by zack).
FindEarliestRevision: make it work as a *nix filter and add accounting
Mar 11 2021, 8:19 AM

Mar 10 2021

zack accepted D5213: swh/scanner : Strip root path from json output.
Mar 10 2021, 6:17 PM
zack requested changes to D5213: swh/scanner : Strip root path from json output.

In my opinion, the textual output doesn't print anything other than the directory structure so instead of removing the whole root path we can just put the directory name (it looks better).
[...]
But the ndjson output must be stripped.

Mar 10 2021, 4:41 PM
zack added a comment to T3111: add recently published scientific papers to the website.

Done! Fortunately teachpress (the WP plugin that displays the publications list) has a bibtex import that works like a charm.

Mar 10 2021, 3:22 PM · Website
zack requested changes to D5213: swh/scanner : Strip root path from json output.

Thanks, this fix looks good.

Mar 10 2021, 12:27 PM
zack triaged T3111: add recently published scientific papers to the website as Normal priority.
Mar 10 2021, 11:16 AM · Website

Mar 8 2021

zack triaged T3101: Latest versions on Pypi are an incompatible combination as High priority.

@vlorentz: can you have a look at this? it's related to the recent changes around the CoreSWHID class, maybe just a release of swh-scanner is missing

Mar 8 2021, 3:26 PM · Code scanner

Mar 6 2021

zack accepted D5210: Add Kumar Shivendu in swh-indexer/CONTRIBUTORS.
Mar 6 2021, 4:43 PM
zack accepted D5209: Add Kumar Shivendu in swh-web/CONTRIBUTORS.
Mar 6 2021, 4:43 PM
zack resigned from D5210: Add Kumar Shivendu in swh-indexer/CONTRIBUTORS.
Mar 6 2021, 3:49 PM
zack resigned from D5209: Add Kumar Shivendu in swh-web/CONTRIBUTORS.
Mar 6 2021, 3:48 PM
zack requested changes to D5210: Add Kumar Shivendu in swh-indexer/CONTRIBUTORS.

please keep the file sorted alphabetically

Mar 6 2021, 3:40 PM
zack requested changes to D5209: Add Kumar Shivendu in swh-web/CONTRIBUTORS.

please keep the file sorted alphabetically

Mar 6 2021, 3:40 PM

Mar 5 2021

zack changed the visibility for Roadmap 2020.
Mar 5 2021, 4:24 PM