Mar 5 2020

zack requested changes to D2770: Updated API Docs to add limit query param.
Mar 5 2020, 1:17 PM
zack committed R183:a82f6bc2ad46: fix biblio details for neo4j paper (authored by zack).
fix biblio details for neo4j paper
Mar 5 2020, 9:34 AM

Mar 4 2020

zack committed R183:52b824ddafc8: add Bogdan paper about CI quality (authored by zack).
add Bogdan paper about CI quality
Mar 4 2020, 11:38 AM

Mar 3 2020

zack committed R183:17b4268fd06b: add a bunch of papers (authored by zack).
add a bunch of papers
Mar 3 2020, 4:54 PM
zack committed rDGRPH07cb5353ca87: CONTRIBUTORS: add Léni Gauffier (authored by zack).
CONTRIBUTORS: add Léni Gauffier
Mar 3 2020, 4:06 PM
zack committed R183:e5b6bf3c3cb0: fix page number in various entries (authored by zack).
fix page number in various entries
Mar 3 2020, 2:42 PM
zack accepted D2748: model: output result to json file.

Good job!

Mar 3 2020, 2:25 PM

Mar 2 2020

zack renamed T2298: scanner: support alternative output formats from swh-scanner: more output formats to scanner: support alternative output formats.
Mar 2 2020, 7:38 PM · Code scanner
zack requested changes to D2748: model: output result to json file.
Mar 2 2020, 7:37 PM

Feb 28 2020

zack raised the priority of T1927: Web app: rate limiting based on per-client API tokens from Normal to High.
Feb 28 2020, 11:42 PM · Web app
zack assigned T2300: swh-scanner: print a nicer error message when rate limit is hit to DanSeraf.
Feb 28 2020, 5:08 PM · Easy hack, Code scanner
zack triaged T2300: swh-scanner: print a nicer error message when rate limit is hit as Low priority.
Feb 28 2020, 5:06 PM · Easy hack, Code scanner
zack triaged T2299: scanner: add integration tests as High priority.
Feb 28 2020, 4:10 PM · Code scanner
zack edited P599 Masterwork From Distant Lands.
Feb 28 2020, 4:08 PM
zack committed rDTSCN4f94e31ce4fd: README: add oneliner description (authored by zack).
README: add oneliner description
Feb 28 2020, 4:04 PM
zack added a project to T2298: scanner: support alternative output formats: Code scanner.
Feb 28 2020, 3:54 PM · Code scanner
zack created Code scanner.
Feb 28 2020, 3:54 PM
zack committed rDTSCNb81705307f7f: CONTRIBUTORS: add Daniele Serafini (authored by zack).
CONTRIBUTORS: add Daniele Serafini
Feb 28 2020, 3:45 PM
zack committed rDTSCN774935765aea: import pre-commit configuration from swh-py-template (authored by zack).
import pre-commit configuration from swh-py-template
Feb 28 2020, 3:44 PM
zack committed rDTSCNb91f1e1428fd: cli.py: make flake8 pass (authored by zack).
cli.py: make flake8 pass
Feb 28 2020, 3:44 PM
zack committed rDTSCN785de3317683: .gitignore: sync with current version from swh-py-template (authored by zack).
.gitignore: sync with current version from swh-py-template
Feb 28 2020, 3:43 PM
zack committed R183:551a38d538ef: add papers on README and technical debt analysis (authored by zack).
add papers on README and technical debt analysis
Feb 28 2020, 2:02 PM

Feb 27 2020

zack added inline comments to D2657: code scanner prototype.
Feb 27 2020, 4:28 PM
zack added a comment to D2657: code scanner prototype.
  • I'd make the color output optional; that would make the output easier to parse.
Feb 27 2020, 1:57 PM
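
As a rough illustration of the review comment above about optional color output, the sketch below keeps ANSI colors behind a flag so that the plain output stays tab-separated and easy to parse. The function names `format_result` and `should_use_color`, the tab-separated layout, and the known/unknown labels are assumptions made for this example; they are not taken from D2657.

```python
import sys

# ANSI escape sequences (standard terminal codes).
RED = "\033[31m"
GREEN = "\033[32m"
RESET = "\033[0m"


def should_use_color(stream, force=None):
    """Decide whether to colorize: an explicit choice wins, else only on a TTY."""
    if force is not None:
        return force
    return hasattr(stream, "isatty") and stream.isatty()


def format_result(path, known, use_color):
    """Render one result line; hypothetical layout: '<path>TAB<status>'."""
    status = "known" if known else "unknown"
    line = f"{path}\t{status}"
    if not use_color:
        # Plain mode: no escape codes, so the line can be split on the tab.
        return line
    color = GREEN if known else RED
    return f"{color}{line}{RESET}"


if __name__ == "__main__":
    colorize = should_use_color(sys.stdout)
    for path, known in [("src/a.c", True), ("src/b.c", False)]:
        print(format_result(path, known, colorize))
```

With this shape, piping the output to another program disables colors automatically, while a command-line flag (not shown) could pass `force=True` or `force=False` to override the detection.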

Feb 25 2020

zack triaged T2294: Web API: add a /ping endpoint as Low priority.
Feb 25 2020, 10:55 AM · Web app, Easy hack

Feb 19 2020

zack committed rMSLD10ded1eb48f5: SANER 2020 slides: add co-authors to titlepage (authored by zack).
SANER 2020 slides: add co-authors to titlepage
Feb 19 2020, 12:23 PM
zack committed rMSLD9f9e6b4f5d48: check in slides for SANER 2020 swh-graph presentation (authored by zack).
check in slides for SANER 2020 swh-graph presentation
Feb 19 2020, 12:17 PM

Feb 18 2020

zack accepted D2686: Re-introduce the swh.core dependency in swh.model[cli].

Thanks!

Feb 18 2020, 12:56 PM
zack renamed T2288: pip install swh.model[cli] no longer provides a usable "swh" command from pip install swh.model[cli] no longer provides a usable "swh identify" to pip install swh.model[cli] no longer provides a usable "swh" command.
Feb 18 2020, 12:55 PM · Data Model
zack lowered the priority of T2288: pip install swh.model[cli] no longer provides a usable "swh" command from High to Normal.
In T2288#42054, @olasd wrote:

Your swh-identify backtrace looks like a $PATH caching issue (see also the path of the swh-identify script in the first line). I can't reproduce it in a fresh venv and swh-identify works fine.

Feb 18 2020, 12:55 PM · Data Model
zack updated the task description for T2288: pip install swh.model[cli] no longer provides a usable "swh" command.
Feb 18 2020, 9:44 AM · Data Model
zack updated the task description for T2288: pip install swh.model[cli] no longer provides a usable "swh" command.
Feb 18 2020, 9:44 AM · Data Model
zack triaged T2288: pip install swh.model[cli] no longer provides a usable "swh" command as High priority.
Feb 18 2020, 9:43 AM · Data Model
zack committed rMSLDccc7b7f7cda2: lyon talk: fix affiliation (authored by zack).
lyon talk: fix affiliation
Feb 18 2020, 8:39 AM

Feb 17 2020

zack added inline comments to D2669: Add ?limit=N method variants to return first N results.
Feb 17 2020, 1:58 PM

Feb 14 2020

zack added inline comments to D2669: Add ?limit=N method variants to return first N results.
Feb 14 2020, 2:59 PM

Feb 13 2020

zack updated the summary of D2461: add Python client for the archive WEB API.
Feb 13 2020, 2:38 PM
zack changed the status of T2279: Python client for the Web API from Open to Work in Progress.

A draft implementation of this idea is in D2461.

Feb 13 2020, 2:37 PM · Web app
zack triaged T2279: Python client for the Web API as Wishlist priority.
Feb 13 2020, 2:37 PM · Web app
zack added inline comments to D2661: Web API: /known/ input size limit.
Feb 13 2020, 9:10 AM

Feb 11 2020

zack renamed T2277: varnish: limit maximum size of incoming POST requests for Web API from varnish: limit maximum size of incoming POST requests to varnish: limit maximum size of incoming POST requests for Web API.
Feb 11 2020, 3:09 PM · System administration
zack updated the task description for T2276: Web API: /known: add a length limit to the list of accepted PID.
Feb 11 2020, 3:08 PM · Web app
zack triaged T2277: varnish: limit maximum size of incoming POST requests for Web API as High priority.
Feb 11 2020, 3:08 PM · System administration
zack triaged T2276: Web API: /known: add a length limit to the list of accepted PID as High priority.
Feb 11 2020, 2:52 PM · Web app

Feb 6 2020

zack committed rMSLDda10e13a735a: Merge branch 'master' of ssh://forge.softwareheritage.org/diffusion/64/slides (authored by zack).
Merge branch 'master' of ssh://forge.softwareheritage.org/diffusion/64/slides
Feb 6 2020, 8:30 PM
zack committed rMSLD9ba6e593cf66: Lyon 1 talk: last touches (authored by zack).
Lyon 1 talk: last touches
Feb 6 2020, 8:30 PM
zack committed rMSLDb8841ee71e42: graph-compression.org: new module about swh-graph (authored by zack).
graph-compression.org: new module about swh-graph
Feb 6 2020, 8:30 PM
zack committed rMSLD49a7d96527d9: dataset: typesetting and presentation improvements (minor) (authored by zack).
dataset: typesetting and presentation improvements (minor)
Feb 6 2020, 8:30 PM
zack added a comment to T1351: (periodically) ingest GNU package releases.

Given this is done, where can one see the timeline of visits for a given origin coming from GNU?

Feb 6 2020, 8:25 PM · Archive coverage
zack committed rMSLD27e5ed9d1a1d: check-in slides for graph talk at Univ. Lyon 1 (authored by zack).
check-in slides for graph talk at Univ. Lyon 1
Feb 6 2020, 4:52 PM
zack committed rMSLDcaca20b637b1: biblio: add SANER 2020 paper (authored by zack).
biblio: add SANER 2020 paper
Feb 6 2020, 4:46 PM
zack committed rMSLD2a89a629111a: status extended: update graph (projected) figures for consistency (authored by zack).
status extended: update graph (projected) figures for consistency
Feb 6 2020, 4:46 PM
zack committed rMSLD6710b7241122: Merkle structure slide: update biblio reference in title (authored by zack).
Merkle structure slide: update biblio reference in title
Feb 6 2020, 3:59 PM
zack committed rMSLD9596bff20d3d: dataset: improve rendering of first slide (authored by zack).
dataset: improve rendering of first slide
Feb 6 2020, 3:59 PM
zack committed rMSLD3bb545111f8c: dataset module: add more example queries (authored by zack).
dataset module: add more example queries
Feb 6 2020, 3:36 PM
zack committed rMSLDed2eb66eb932: Merge branch 'master' of ssh://forge.softwareheritage.org/diffusion/64/slides (authored by zack).
Merge branch 'master' of ssh://forge.softwareheritage.org/diffusion/64/slides
Feb 6 2020, 1:30 PM
zack committed rMSLDeb240dc69f47: status module: update growth graph (authored by zack).
status module: update growth graph
Feb 6 2020, 1:30 PM
zack committed rMSLD586784e82c60: zurich talk: typo in location (authored by zack).
zurich talk: typo in location
Feb 6 2020, 12:28 PM
zack triaged T2269: cron spam: <root@*> find /var/log/kafka -type f -not -name *.gz -a -ctime +1 -exec gzip {} \+ as Normal priority.
Feb 6 2020, 8:18 AM · System administration

Feb 5 2020

zack added a reviewer for D2629: dataset: add graph export based on kafka: Reviewers.
Feb 5 2020, 7:53 PM
zack committed R183:df8d41dab532: new entries: dejavu, rescience (authored by zack).
new entries: dejavu, rescience
Feb 5 2020, 2:56 PM
zack committed R183:359d3e0a969e: add several refs on large-scale studies on programming languages (authored by zack).
add several refs on large-scale studies on programming languages
Feb 5 2020, 2:08 PM

Feb 4 2020

zack added a comment to D2623: Split Content class into two classes, for missing and non-missing contents..

"Missing" isn't perfect, maybe, but it's consistent with SQL storage tables at least.

No, SQL tables use "skipped". But I prefer "missing" because it's more generic (it also includes content we couldn't find)

Feb 4 2020, 4:29 PM
zack added a comment to D2623: Split Content class into two classes, for missing and non-missing contents..

s/non missing/present/
would be an improvement.

Feb 4 2020, 4:14 PM

Feb 3 2020

zack added a comment to T2264: Use Sphinx 2.x to build the documentation.

Thanks for debugging and fixing this!

Feb 3 2020, 3:01 PM · Documentation

Feb 1 2020

zack committed rDGRPHa0e3bc848e6b: docs: revamp compression workflow (authored by zack).
docs: revamp compression workflow
Feb 1 2020, 12:31 PM

Jan 30 2020

zack added a comment to T2262: Deal with IRIs.

I'm fine with switching to IRIs in the doc, just please expand what it means on first use (with a mention like "they are like URIs but"), as I don't think the acronym is that well-known yet, especially in the US.

Jan 30 2020, 5:47 PM · Storage manager, Data Model
zack committed rDGRPH9f8e8b7e0d04: swh-graph doc: mention SANER 2020 paper (authored by zack).
swh-graph doc: mention SANER 2020 paper
Jan 30 2020, 3:33 PM

Jan 29 2020

zack added a comment to T1791: Web API: do not leak internal, non-intrinsic origin identifiers.

I propose to remove those id leaks from that endpoint and contact OpenAIRE to tell them to use the Link header from now on to paginate the results.

Jan 29 2020, 10:48 PM · Web app
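
The pagination approach proposed above relies on the HTTP Link header rather than on internal identifiers. A minimal client-side sketch, assuming the server sends a standard `Link: <url>; rel="next"` header; the helper name `next_page_url` and the example URLs are hypothetical and not part of the Web API:

```python
import re

# One Link header entry looks like: <URL>; param1=...; param2=...
# Capture the URL and everything up to the next '<' as its parameters.
_LINK_ENTRY = re.compile(r"<([^>]+)>([^<]*)")
_REL_PARAM = re.compile(r'\brel\s*=\s*"?([^";,]*)"?')


def next_page_url(link_header):
    """Return the URL whose rel includes "next", or None if there is none."""
    if not link_header:
        return None
    for entry in _LINK_ENTRY.finditer(link_header):
        url, params = entry.group(1), entry.group(2)
        rel = _REL_PARAM.search(params)
        # A rel value may hold several space-separated relation types.
        if rel and "next" in rel.group(1).split():
            return url
    return None


if __name__ == "__main__":
    header = '<https://example.org/api/items/?page=2>; rel="next"'
    print(next_page_url(header))
```

A client would call this on each response's `Link` header and keep fetching until it returns `None`. HTTP libraries often offer the same parsing out of the box (for example, the `links` attribute on responses in the Python `requests` library), which is preferable to hand-rolled parsing where available.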
zack retitled D2599: cli: add support for reading a file content from stdin in 'swh identify' command from cli: add support for reading a file content fron stdin in 'swh identify' command to cli: add support for reading a file content from stdin in 'swh identify' command.
Jan 29 2020, 3:00 PM
zack accepted D2599: cli: add support for reading a file content from stdin in 'swh identify' command.
Jan 29 2020, 2:59 PM
zack updated the task description for T2254: textual search language for the Web UI.
Jan 29 2020, 2:31 PM · Archive search, Web app
zack committed rDSTO68702b564044: CONTRIBUTORS: add Daniele Serafini (authored by zack).
CONTRIBUTORS: add Daniele Serafini
Jan 29 2020, 2:24 PM
zack committed rDWAPPS4e3de43c5a69: CONTRIBUTORS: add Daniele Serafini (authored by zack).
CONTRIBUTORS: add Daniele Serafini
Jan 29 2020, 2:23 PM
zack updated subscribers of T1791: Web API: do not leak internal, non-intrinsic origin identifiers.

@anlambert is this done and, if so, can you close it?

Jan 29 2020, 2:20 PM · Web app
zack closed T2176: Web API: add a /known endpoint to check if a list of PIDs exist in the archive as Resolved.

This has been implemented by @DanSeraf in D2582.

Jan 29 2020, 2:15 PM · Web app
zack added a project to T2114: swh-graph API: add ?limit=N method variants to return first N results: Easy hack.
Jan 29 2020, 2:09 PM · Easy hack, Compressed graph service
zack triaged T2254: textual search language for the Web UI as Normal priority.
Jan 29 2020, 1:31 PM · Archive search, Web app

Jan 27 2020

zack accepted D2582: Web API endpoint /known/.
Jan 27 2020, 7:47 PM
zack added inline comments to D2582: Web API endpoint /known/.
Jan 27 2020, 3:59 PM
zack requested changes to D2582: Web API endpoint /known/.
Jan 27 2020, 2:52 PM

Jan 26 2020

zack added a comment to T2251: Update publication page.

thanks Morane!

Jan 26 2020, 5:20 PM · Website

Jan 21 2020

zack updated the task description for T2242: GitHub loading optimization: skip repos with old enough updated_at/pushed_at timestamps.
Jan 21 2020, 1:34 PM · Git loader
zack triaged T2242: GitHub loading optimization: skip repos with old enough updated_at/pushed_at timestamps as Normal priority.
Jan 21 2020, 1:33 PM · Git loader
zack created T2242: GitHub loading optimization: skip repos with old enough updated_at/pushed_at timestamps.
Jan 21 2020, 1:33 PM · Git loader

Jan 20 2020

zack committed R183:78fe649f1008: add Karl Fogel book Producing OSS (authored by zack).
add Karl Fogel book Producing OSS
Jan 20 2020, 9:30 AM

Jan 14 2020

zack closed T2177: Add origin persistent identifiers resolving as Wontfix.

For the time being we do not want to leak origin PIDs to the world, as they are not intrinsically computable from the origin itself, but only from its URL, which is not the artifact itself (which is not digital) but just a locator for it. Let's consider ori PIDs only an internal shorthand, useful in places where fixed-length shorthands for full URLs are needed (e.g., swh-graph).

Jan 14 2020, 3:03 PM · Web app

Jan 13 2020

zack triaged T2176: Web API: add a /known endpoint to check if a list of PIDs exist in the archive as Normal priority.
Jan 13 2020, 5:10 PM · Web app

Jan 10 2020

zack committed R183:06a94027ffa1: add Clauset ref on power law (authored by zack).
add Clauset ref on power law
Jan 10 2020, 3:30 PM
zack committed R183:af744c1658ee: add hopcroft/hullman ref for CC algorithm (authored by zack).
add hopcroft/hullman ref for CC algorithm
Jan 10 2020, 8:54 AM

Jan 9 2020

zack committed R183:62bac0b6781d: add book references for numpy and scipy (authored by zack).
add book references for numpy and scipy
Jan 9 2020, 11:18 AM
zack added a comment to D2461: add Python client for the archive WEB API.

IMHO this should be a 'standalone' repo/project. I mean it should be possible to pip install it at least.

Jan 9 2020, 11:02 AM

Jan 8 2020

zack committed R183:0a349d260b0d: add Barabasi scale-free network paper (authored by zack).
add Barabasi scale-free network paper
Jan 8 2020, 5:14 PM
zack committed R183:8fa4fe98c2e1: add recent SWH papers: MSR (challenge), SANER (swh-graph), CISE (DOI) (authored by zack).
add recent SWH papers: MSR (challenge), SANER (swh-graph), CISE (DOI)
Jan 8 2020, 1:51 PM
zack committed R183:db3ecddd3ba8: add Barabasi blueprint for network analyses (authored by zack).
add Barabasi blueprint for network analyses
Jan 8 2020, 12:18 PM

Jan 7 2020

zack committed R183:5eb49454aac1: add BOA and WoC (authored by zack).
add BOA and WoC
Jan 7 2020, 10:48 AM
zack committed R183:68a7a84ea194: add libraries.io, HOPL, and linguist refs (authored by zack).
add libraries.io, HOPL, and linguist refs
Jan 7 2020, 10:45 AM

Jan 6 2020

zack committed R183:ebf04a3e944c: add provenance TR and swh-graph paper (authored by zack).
add provenance TR and swh-graph paper
Jan 6 2020, 10:17 AM

Dec 21 2019

zack added a comment to D2492: api: Return absolute URIs in JSON responses.

I'm not a big fan of using absolute URLs. It makes the code more verbose, tests more complex, and prevents clients from using proxies. Why do we want/need that?

Dec 21 2019, 8:34 AM

Dec 19 2019

zack closed T2144: Define an architecture for end-to-end monitoring/testing, a subtask of T2118: Deposit: End to End monitoring, as Resolved.
Dec 19 2019, 10:06 AM · Sprint 2019/12 (Monitor and Conquer)