
zack (Stefano Zacchiroli)
User, Administrator

User Details

User Since: Sep 7 2015, 3:43 PM (233 w, 1 h)
Roles: Administrator

Recent Activity

Wed, Feb 19

zack committed rMSLD10ded1eb48f5: SANER 2020 slides: add co-authors to titlepage (authored by zack).
Wed, Feb 19, 12:23 PM
zack committed rMSLD9f9e6b4f5d48: check in slides for SANER 2020 swh-graph presentation (authored by zack).
Wed, Feb 19, 12:17 PM

Tue, Feb 18

zack accepted D2686: Re-introduce the swh.core dependency in swh.model[cli].

thanks !

Tue, Feb 18, 12:56 PM
zack renamed T2288: pip install swh.model[cli] no longer provides a usable "swh" command from pip install swh.model[cli] no longer provides a usable "swh identify" to pip install swh.model[cli] no longer provides a usable "swh" command.
Tue, Feb 18, 12:55 PM · Data Model
zack lowered the priority of T2288: pip install swh.model[cli] no longer provides a usable "swh" command from High to Normal.
In T2288#42054, @olasd wrote:

Your swh-identify backtrace looks like a $PATH caching issue (see also the path of the swh-identify script in the first line). I can't reproduce it in a fresh venv and swh-identify works fine.

Tue, Feb 18, 12:55 PM · Data Model
zack updated the task description for T2288: pip install swh.model[cli] no longer provides a usable "swh" command.
Tue, Feb 18, 9:44 AM · Data Model
zack updated the task description for T2288: pip install swh.model[cli] no longer provides a usable "swh" command.
Tue, Feb 18, 9:44 AM · Data Model
zack triaged T2288: pip install swh.model[cli] no longer provides a usable "swh" command as High priority.
Tue, Feb 18, 9:43 AM · Data Model
zack committed rMSLDccc7b7f7cda2: lyon talk: fix affiliation (authored by zack).
Tue, Feb 18, 8:39 AM

Mon, Feb 17

zack added inline comments to D2669: Add ?limit=N method variants to return first N results.
Mon, Feb 17, 1:58 PM

Fri, Feb 14

zack added inline comments to D2669: Add ?limit=N method variants to return first N results.
Fri, Feb 14, 2:59 PM

Thu, Feb 13

zack updated the summary of D2461: add Python client for the archive WEB API.
Thu, Feb 13, 2:38 PM
zack changed the status of T2279: Python client for the Web API from Open to Work in Progress.

a draft implementation of this idea is in D2461

Thu, Feb 13, 2:37 PM · Web app
zack triaged T2279: Python client for the Web API as Wishlist priority.
Thu, Feb 13, 2:37 PM · Web app
zack added inline comments to D2661: Web API: /known/ input size limit.
Thu, Feb 13, 9:10 AM

Tue, Feb 11

zack renamed T2277: varnish: limit maximum size of incoming POST requests for Web API from varnish: limit maximum size of incoming POST requests to varnish: limit maximum size of incoming POST requests for Web API.
Tue, Feb 11, 3:09 PM · System administration
zack updated the task description for T2276: Web API: /known: add a length limit to the list of accepted PID.
Tue, Feb 11, 3:08 PM · Web app
zack triaged T2277: varnish: limit maximum size of incoming POST requests for Web API as High priority.
Tue, Feb 11, 3:08 PM · System administration
zack triaged T2276: Web API: /known: add a length limit to the list of accepted PID as High priority.
Tue, Feb 11, 2:52 PM · Web app

Thu, Feb 6

zack committed rMSLDda10e13a735a: Merge branch 'master' of ssh://forge.softwareheritage.org/diffusion/64/slides (authored by zack).
Thu, Feb 6, 8:30 PM
zack committed rMSLD9ba6e593cf66: Lyon 1 talk: last touches (authored by zack).
Thu, Feb 6, 8:30 PM
zack committed rMSLDb8841ee71e42: graph-compression.org: new module about swh-graph (authored by zack).
Thu, Feb 6, 8:30 PM
zack committed rMSLD49a7d96527d9: dataset: typesetting and presentation improvements (minor) (authored by zack).
Thu, Feb 6, 8:30 PM
zack added a comment to T1351: (periodically) ingest GNU package releases.

Given this is done, where can one see the timeline of visits for a given origin coming from GNU?

Thu, Feb 6, 8:25 PM · Archive coverage
zack committed rMSLD27e5ed9d1a1d: check-in slides for graph talk at Univ. Lyon 1 (authored by zack).
Thu, Feb 6, 4:52 PM
zack committed rMSLDcaca20b637b1: biblio: add SANER 2020 paper (authored by zack).
Thu, Feb 6, 4:46 PM
zack committed rMSLD2a89a629111a: status extended: update graph (projected) figures for consistency (authored by zack).
Thu, Feb 6, 4:46 PM
zack committed rMSLD6710b7241122: Merkle structure slide: update biblio reference in title (authored by zack).
Thu, Feb 6, 3:59 PM
zack committed rMSLD9596bff20d3d: dataset: improve rendering of first slide (authored by zack).
Thu, Feb 6, 3:59 PM
zack committed rMSLD3bb545111f8c: dataset module: add more example queries (authored by zack).
Thu, Feb 6, 3:36 PM
zack committed rMSLDed2eb66eb932: Merge branch 'master' of ssh://forge.softwareheritage.org/diffusion/64/slides (authored by zack).
Thu, Feb 6, 1:30 PM
zack committed rMSLDeb240dc69f47: status module: update growth graph (authored by zack).
Thu, Feb 6, 1:30 PM
zack committed rMSLD586784e82c60: zurich talk: typo in location (authored by zack).
Thu, Feb 6, 12:28 PM
zack triaged T2269: cron spam: <root@*> find /var/log/kafka -type f -not -name *.gz -a -ctime +1 -exec gzip {} \+ as Normal priority.
Thu, Feb 6, 8:18 AM · System administration

Wed, Feb 5

zack added a reviewer for D2629: dataset: add graph export based on kafka: Reviewers.
Wed, Feb 5, 7:53 PM
zack committed R183:df8d41dab532: new entries: dejavu, rescience (authored by zack).
Wed, Feb 5, 2:56 PM
zack committed R183:359d3e0a969e: add several refs on large-scale studies on programming languages (authored by zack).
Wed, Feb 5, 2:08 PM

Tue, Feb 4

zack added a comment to D2623: Split Content class into two classes, for missing and non-missing contents..

"Missing" isn't perfect, maybe, but it's consistent with SQL storage tables at least.

No, SQL tables use "skipped". But I prefer "missing" because it's more generic (it also includes content we couldn't find)

Tue, Feb 4, 4:29 PM
zack added a comment to D2623: Split Content class into two classes, for missing and non-missing contents..

s/non missing/present/
would be an improvement.

Tue, Feb 4, 4:14 PM

Mon, Feb 3

zack added a comment to T2264: Use Sphinx 2.x to build the documentation.

tnx for debugging and fixing this!

Mon, Feb 3, 3:01 PM · Development documentation

Sat, Feb 1

zack committed rDGRPHa0e3bc848e6b: docs: revamp compression workflow (authored by zack).
Sat, Feb 1, 12:31 PM

Thu, Jan 30

zack added a comment to T2262: Dealing with IRIs.

I'm fine with switching to IRIs in the doc, just please expand what it means on first use (with a mention like "they are like URIs but"), as I don't think the acronym is that well-known yet, especially in the US.

Thu, Jan 30, 5:47 PM · Storage manager, Data Model
zack committed rDGRPH9f8e8b7e0d04: swh-graph doc: mention SANER 2020 paper (authored by zack).
Thu, Jan 30, 3:33 PM

Wed, Jan 29

zack added a comment to T1791: Web API: do not leak internal, non-intrinsic origin identifiers.

I propose to remove those id leaks from that endpoint and contact OpenAIRE to tell them to use the Link header from now on to paginate the results.

Wed, Jan 29, 10:48 PM · Web app
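
For readers unfamiliar with Link-header pagination as proposed in the comment above, here is a minimal sketch of how a client might follow it. It is not the archive's client code: the function name `next_page_url`, the example host, and the example path are all hypothetical, and the parser is deliberately simplified (it does not handle commas inside quoted parameter values).

```python
import re
from urllib.parse import urljoin

# One `<target>; param=value; ...` entry of a Link header (simplified:
# commas inside quoted parameter values are not supported).
_LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')


def next_page_url(link_header, base_url):
    """Return the absolute URL of the rel="next" link, or None if absent.

    `base_url` is the URL of the request that produced the header; it is
    used to resolve a relative link target into an absolute one.
    """
    for target, params in _LINK_RE.findall(link_header or ""):
        for param in params.split(";"):
            name, _, value = param.partition("=")
            if name.strip().lower() != "rel":
                continue
            # A rel value may list several space-separated relation types.
            relations = value.strip().strip('"').lower().split()
            if "next" in relations:
                return urljoin(base_url, target)
    return None


if __name__ == "__main__":
    # Hypothetical header and base URL, for illustration only.
    header = '</api/1/origins/?page=2>; rel="next"'
    print(next_page_url(header, "https://archive.example.org/api/1/origins/"))
```

Resolving the target with `urljoin` means the client works the same whether the server sends relative or absolute links, so it never has to rely on internal identifiers to build the next request.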
zack retitled D2599: cli: add support for reading a file content from stdin in 'swh identify' command from cli: add support for reading a file content fron stdin in 'swh identify' command to cli: add support for reading a file content from stdin in 'swh identify' command.
Wed, Jan 29, 3:00 PM
zack accepted D2599: cli: add support for reading a file content from stdin in 'swh identify' command.
Wed, Jan 29, 2:59 PM
zack updated the task description for T2254: textual search language for the Web UI.
Wed, Jan 29, 2:31 PM · Archive search, Web app
zack committed rDSTO68702b564044: CONTRIBUTORS: add Daniele Serafini (authored by zack).
Wed, Jan 29, 2:24 PM
zack committed rDWAPPS4e3de43c5a69: CONTRIBUTORS: add Daniele Serafini (authored by zack).
Wed, Jan 29, 2:23 PM
zack updated subscribers of T1791: Web API: do not leak internal, non-intrinsic origin identifiers.

@anlambert is this done and, if so, can you close it?

Wed, Jan 29, 2:20 PM · Web app
zack closed T2176: Web API: add a /known endpoint to check if a list of PIDs exist in the archive as Resolved.

this has been implemented by @DanSeraf in D2582

Wed, Jan 29, 2:15 PM · Web app
zack added a project to T2114: swh-graph API: add ?limit=N method variants to return first N results: Easy hack.
Wed, Jan 29, 2:09 PM · Easy hack, Graph service
zack triaged T2254: textual search language for the Web UI as Normal priority.
Wed, Jan 29, 1:31 PM · Archive search, Web app

Mon, Jan 27

zack accepted D2582: Web API endpoint /known/.
Mon, Jan 27, 7:47 PM
zack added inline comments to D2582: Web API endpoint /known/.
Mon, Jan 27, 3:59 PM
zack requested changes to D2582: Web API endpoint /known/.
Mon, Jan 27, 2:52 PM

Sun, Jan 26

zack added a comment to T2251: Update publication page.

thanks Morane!

Sun, Jan 26, 5:20 PM · Website

Jan 21 2020

zack updated the task description for T2242: GitHub loading optimization: skip repos with old enough updated_at/pushed_at timestamps.
Jan 21 2020, 1:34 PM · Git loader
zack triaged T2242: GitHub loading optimization: skip repos with old enough updated_at/pushed_at timestamps as Normal priority.
Jan 21 2020, 1:33 PM · Git loader
zack created T2242: GitHub loading optimization: skip repos with old enough updated_at/pushed_at timestamps.
Jan 21 2020, 1:33 PM · Git loader
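
The task title above only names the idea, so the following is a hedged sketch of what such a check could look like, not the loader's actual logic. The field names `updated_at` and `pushed_at` come from the task title; the function names, the "not newer than the last visit" criterion, and the sample data are assumptions made for illustration.

```python
from datetime import datetime, timezone


def parse_github_timestamp(value):
    """Parse a UTC timestamp of the form '2020-01-21T13:33:00Z'."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def can_skip(repo, last_visit):
    """Return True if the repository looks unchanged since `last_visit`.

    Assumption: "old enough" means that neither `updated_at` nor
    `pushed_at` is more recent than the last successful visit.
    """
    if last_visit is None:
        return False  # never visited: must be loaded
    stamps = [repo.get("updated_at"), repo.get("pushed_at")]
    if not all(stamps):
        return False  # missing metadata: be conservative and load
    return all(parse_github_timestamp(s) <= last_visit for s in stamps)


if __name__ == "__main__":
    # Hypothetical repository metadata, for illustration only.
    repo = {
        "updated_at": "2019-06-01T10:00:00Z",
        "pushed_at": "2019-05-30T08:00:00Z",
    }
    last_visit = datetime(2020, 1, 1, tzinfo=timezone.utc)
    print(can_skip(repo, last_visit))
```

The sketch errs on the side of loading whenever information is missing, since wrongly skipping a repository is costlier for an archive than a redundant visit.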

Jan 20 2020

zack committed R183:78fe649f1008: add Karl Fogel book Producing OSS (authored by zack).
Jan 20 2020, 9:30 AM

Jan 14 2020

zack closed T2177: Add origin persistent identifiers resolving as Wontfix.

For the time being we do not want to leak origin PIDs to the world, as they're not intrinsically computable from the origin itself, but only from its URL (which is just a locator for the origin, not the artifact itself, as an origin is not a digital object). Let's consider ori PIDs only an internal shorthand, useful in places where fixed-length shorthands for full URLs are needed (e.g., swh-graph).

Jan 14 2020, 3:03 PM · Web app
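
To illustrate the point made in the comment above, here is a small sketch of a fixed-length shorthand derived from a URL. The choice of SHA1 over the UTF-8 encoded URL is an assumption for the sake of the example (the comment does not say how ori PIDs are computed), and `origin_shorthand` is a hypothetical name.

```python
import hashlib


def origin_shorthand(url):
    """Return a fixed-length hex shorthand for an origin URL.

    Assumption: SHA1 of the UTF-8 encoded URL. The digest depends only on
    the locator string, not on whatever is hosted at that location.
    """
    return hashlib.sha1(url.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Two hypothetical locators that could point at the same repository.
    first = origin_shorthand("https://example.org/project")
    second = origin_shorthand("https://example.org/project.git")
    print(len(first), len(second))  # both shorthands have the same length
    print(first == second)          # different URLs, different shorthands
```

Because two URLs for the same repository produce unrelated digests, such an identifier says nothing intrinsic about the archived artifact, which matches the reasoning for keeping it internal.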

Jan 13 2020

zack triaged T2176: Web API: add a /known endpoint to check if a list of PIDs exist in the archive as Normal priority.
Jan 13 2020, 5:10 PM · Web app

Jan 10 2020

zack committed R183:06a94027ffa1: add Clauset ref on power law (authored by zack).
Jan 10 2020, 3:30 PM
zack committed R183:af744c1658ee: add hopcroft/hullman ref for CC algorithm (authored by zack).
Jan 10 2020, 8:54 AM

Jan 9 2020

zack committed R183:62bac0b6781d: add book references for numpy and scipy (authored by zack).
Jan 9 2020, 11:18 AM
zack added a comment to D2461: add Python client for the archive WEB API.

IMHO this should be a 'standalone' repo/project. I mean it should be possible to pip install it at least.

Jan 9 2020, 11:02 AM

Jan 8 2020

zack committed R183:0a349d260b0d: add Barabasi scale-free network paper (authored by zack).
Jan 8 2020, 5:14 PM
zack committed R183:8fa4fe98c2e1: add recent SWH papers: MSR (challenge), SANER (swh-graph), CISE (DOI) (authored by zack).
Jan 8 2020, 1:51 PM
zack committed R183:db3ecddd3ba8: add Barabasi blueprint for network analyses (authored by zack).
Jan 8 2020, 12:18 PM

Jan 7 2020

zack committed R183:5eb49454aac1: add BOA and WoC (authored by zack).
Jan 7 2020, 10:48 AM
zack committed R183:68a7a84ea194: add libraries.io, HOPL, and linguist refs (authored by zack).
Jan 7 2020, 10:45 AM

Jan 6 2020

zack committed R183:ebf04a3e944c: add provenance TR and swh-graph paper (authored by zack).
Jan 6 2020, 10:17 AM

Dec 21 2019

zack added a comment to D2492: api: Return absolute URIs in JSON responses.

I'm not a big fan of using absolute URLs. It makes the code more verbose, tests more complex, and prevents clients from using proxies. Why do we want/need that?

Dec 21 2019, 8:34 AM

Dec 19 2019

zack closed T2144: Define an architecture for end-to-end monitoring/testing, a subtask of T2118: Deposit: End to End monitoring, as Resolved.
Dec 19 2019, 10:06 AM · Sprint 2019/12 (Monitor and Conquer)
zack closed T2144: Define an architecture for end-to-end monitoring/testing, a subtask of T2117: Save Code Now: End to End monitoring, as Resolved.
Dec 19 2019, 10:06 AM · Sprint 2019/12 (Monitor and Conquer)
zack closed T2144: Define an architecture for end-to-end monitoring/testing, a subtask of T2129: Journal: End to end monitoring, as Resolved.
Dec 19 2019, 10:06 AM · Sprint 2019/12 (Monitor and Conquer)
zack closed T2144: Define an architecture for end-to-end monitoring/testing as Resolved.
Dec 19 2019, 10:06 AM · Sprint 2019/12 (Monitor and Conquer)
zack closed T2144: Define an architecture for end-to-end monitoring/testing, a subtask of T2125: Production Web UI end to end testing, as Resolved.
Dec 19 2019, 10:06 AM · Sprint 2019/12 (Monitor and Conquer)
zack closed T2144: Define an architecture for end-to-end monitoring/testing, a subtask of T2126: Production Vault end to end testing, as Resolved.
Dec 19 2019, 10:06 AM · Sprint 2019/12 (Monitor and Conquer)
zack added a comment to T2144: Define an architecture for end-to-end monitoring/testing.

(marking as done as it was moved to the done column on the sprint board, please reopen if not ok)

Dec 19 2019, 10:06 AM · Sprint 2019/12 (Monitor and Conquer)
zack closed T1359: Add sentry support in every swh running service as Resolved.

(marking as done as it was moved to the done column on the sprint board, please reopen if not ok)

Dec 19 2019, 10:06 AM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration
zack closed T1359: Add sentry support in every swh running service, a subtask of T1358: Setup a sentry service, as Resolved.
Dec 19 2019, 10:06 AM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration

Dec 17 2019

zack closed T2157: Web API: flaky 403 responses between python requests and curl as Invalid.

Turns out this was due to a left-over username/password in my ~/.netrc, which apparently requests uses by default and curl doesn't.
(But why the heck the caching effect? I've no idea...)

Dec 17 2019, 2:47 PM · Web app
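
A quick way to check for this kind of left-over entry is sketched below, using only Python's standard `netrc` module. The helper name `netrc_login_for` and the host in the usage note are hypothetical; the sketch only inspects a netrc file and makes no HTTP request.

```python
import netrc


def netrc_login_for(host, netrc_path):
    """Return the login stored for `host` in a netrc file, or None.

    Note: like the netrc format itself, this falls back to a `default`
    entry when the file has one and no entry matches `host`.
    """
    try:
        entries = netrc.netrc(netrc_path)
    except (FileNotFoundError, netrc.NetrcParseError):
        return None  # no usable netrc file, so nothing can be picked up
    auth = entries.authenticators(host)  # (login, account, password) or None
    return auth[0] if auth else None
```

For example, `netrc_login_for("archive.example.org", "/home/user/.netrc")` would return the stored login if such an entry exists. On the client side, `requests` can be told to ignore netrc credentials by setting `trust_env = False` on a `Session`, whereas curl only reads the file when invoked with `--netrc`.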
zack triaged T2157: Web API: flaky 403 responses between python requests and curl as Normal priority.
Dec 17 2019, 2:39 PM · Web app
zack added inline comments to D2461: add Python client for the archive WEB API.
Dec 17 2019, 10:25 AM
zack updated the diff for D2461: add Python client for the archive WEB API.
  • webclient: add missing error checking
Dec 17 2019, 10:25 AM
zack updated the diff for D2461: add Python client for the archive WEB API.
  • simplify code and streamline returned snapshot types
Dec 17 2019, 10:18 AM

Dec 16 2019

zack created D2461: add Python client for the archive WEB API.
Dec 16 2019, 4:50 PM

Dec 14 2019

zack committed R183:56cce6de6f0b: add ML big code survey (authored by zack).
Dec 14 2019, 5:52 PM
zack committed R183:903dc0770d4a: fix authors initials in MSR survey entry (authored by zack).
Dec 14 2019, 5:52 PM
zack added a comment to T2147: Web API: make next link contain full URLs.

@zack, this is now deployed to production.

Dec 14 2019, 12:35 PM · Web app

Dec 13 2019

zack added a comment to T2138: Return full URIS in next links of web API responses.

Darn, sorry, I've looked for a dupe before submitting, but I obviously failed at that. Thanks for closing the duplicate.

Dec 13 2019, 12:57 PM · Web app
zack triaged T2147: Web API: make next link contain full URLs as Low priority.
Dec 13 2019, 11:19 AM · Web app

Dec 12 2019

zack moved T2134: loader: Implement uniform loading CLI from in progress to done on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 12 2019, 5:51 PM · Sprint 2019/12 (Monitor and Conquer)
zack moved T2124: Save Code Now: monitoring of admin infra from in progress to done on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 12 2019, 5:50 PM · Sprint 2019/12 (Monitor and Conquer)

Dec 9 2019

zack committed R183:905b9c6a5c6c: fix a bunch of capitalization/missing case protection issues (authored by zack).
Dec 9 2019, 10:49 AM

Dec 6 2019

zack committed rDGRPH71ce98054b4e: CLI: generalize 'map lookup' to lookup many identifiers at once (authored by zack).
Dec 6 2019, 4:19 PM
zack closed T2112: make "swh graph map lookup" accept lists of identifiers as Resolved by committing rDGRPH71ce98054b4e: CLI: generalize 'map lookup' to lookup many identifiers at once.
Dec 6 2019, 4:19 PM · Graph service
zack closed D2379: CLI: generalize 'map lookup' to lookup many identifiers at once.
Dec 6 2019, 4:19 PM