
Nov 29 2021

zack added a comment to T3755: misleading 100% known summary in sunburst rendering.

I've tried replacing the content of foo.txt with something unknown to the archive (random garbage) and the sunburst rendering still shows 100.0%.
So it could also be a rounding error instead.
Either way, it is misleading and should be fixed.

Nov 29 2021, 1:14 PM · Code scanner
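The rounding hypothesis in the comment above can be illustrated with a small sketch. The file counts and helper names below are hypothetical and are not taken from the scanner's code; the sketch only shows how formatting to one decimal place can turn "almost everything known" into 100.0%, and how truncating instead would avoid it.

```python
import math


def rounded_pct(known: int, total: int) -> str:
    """Format the known ratio with ordinary rounding to one decimal."""
    return f"{known / total * 100:.1f}%"


def floored_pct(known: int, total: int) -> str:
    """Format the known ratio truncated (not rounded) to one decimal,
    so that 100.0% only appears when every file is known."""
    return f"{math.floor(known / total * 1000) / 10:.1f}%"


# Hypothetical project: 3000 files, exactly one unknown to the archive.
misleading = rounded_pct(2999, 3000)
truncated = floored_pct(2999, 3000)
all_known = floored_pct(3000, 3000)
```

With these made-up numbers, ordinary rounding reports a full 100.0% even though one file is unknown, while the truncating variant stays below it.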
zack triaged T3755: misleading 100% known summary in sunburst rendering as Low priority.
Nov 29 2021, 1:10 PM · Code scanner
zack triaged T3754: scanning sunburst rendering fail with "ValueError: Empty data passed with indices specified." as Normal priority.
Nov 29 2021, 1:04 PM · Code scanner

Jul 21 2021

DanSeraf closed T3420: scanner: make the various query algorithms user-selectable as Resolved by committing rDTSCNd5a070e1429d: add scan policies.
Jul 21 2021, 2:00 PM · Code scanner

Jul 15 2021

DanSeraf added a revision to T3420: scanner: make the various query algorithms user-selectable: D5996: swh-scanner: new scan policies.
Jul 15 2021, 7:17 PM · Code scanner

Jul 8 2021

DanSeraf closed T2692: Move the output related functions to another (sub)module as Resolved by committing rDTSCN0d92c754c8df: use model.from_disk instead of scanner.model to store a source code project.
Jul 8 2021, 3:42 PM · Code scanner
DanSeraf closed T2730: scanner: should output the root SWHID as well as Resolved by committing rDTSCN0d92c754c8df: use model.from_disk instead of scanner.model to store a source code project.
Jul 8 2021, 3:42 PM · Easy hack, Code scanner
DanSeraf closed T3349: use swh.model.merkle/from_disk instead of swh.scanner.model, a subtask of T2730: scanner: should output the root SWHID as well, as Resolved.
Jul 8 2021, 3:42 PM · Easy hack, Code scanner
DanSeraf closed T3349: use swh.model.merkle/from_disk instead of swh.scanner.model as Resolved by committing rDTSCN0d92c754c8df: use model.from_disk instead of scanner.model to store a source code project.
Jul 8 2021, 3:42 PM · Code scanner
DanSeraf closed T3349: use swh.model.merkle/from_disk instead of swh.scanner.model, a subtask of T3420: scanner: make the various query algorithms user-selectable, as Resolved.
Jul 8 2021, 3:42 PM · Code scanner
zack changed the status of T2730: scanner: should output the root SWHID as well from Open to Work in Progress.
Jul 8 2021, 2:13 PM · Easy hack, Code scanner
zack changed the status of T2692: Move the output related functions to another (sub)module from Open to Work in Progress.
Jul 8 2021, 2:13 PM · Code scanner
zack moved T3318: scanner should use the known() method of web.client from In progress to Backlog on the Code scanner board.
Jul 8 2021, 2:13 PM · Code scanner
zack added a subtask for T3318: scanner should use the known() method of web.client: T2635: web client: add async API.
Jul 8 2021, 2:11 PM · Code scanner

Jul 5 2021

zack added a parent task for T3349: use swh.model.merkle/from_disk instead of swh.scanner.model: T2730: scanner: should output the root SWHID as well.
Jul 5 2021, 3:21 PM · Code scanner
zack added a subtask for T2730: scanner: should output the root SWHID as well: T3349: use swh.model.merkle/from_disk instead of swh.scanner.model.
Jul 5 2021, 3:21 PM · Easy hack, Code scanner
zack removed a parent task for T2730: scanner: should output the root SWHID as well: T3349: use swh.model.merkle/from_disk instead of swh.scanner.model.
Jul 5 2021, 3:20 PM · Easy hack, Code scanner
zack removed a subtask for T3349: use swh.model.merkle/from_disk instead of swh.scanner.model: T2730: scanner: should output the root SWHID as well.
Jul 5 2021, 3:20 PM · Code scanner
zack changed the status of T3420: scanner: make the various query algorithms user-selectable from Open to Work in Progress.
Jul 5 2021, 3:11 PM · Code scanner
zack assigned T3318: scanner should use the known() method of web.client to DanSeraf.
Jul 5 2021, 3:11 PM · Code scanner
zack changed the status of T3318: scanner should use the known() method of web.client from Open to Work in Progress.
Jul 5 2021, 3:11 PM · Code scanner
zack added a parent task for T3349: use swh.model.merkle/from_disk instead of swh.scanner.model: T3420: scanner: make the various query algorithms user-selectable.
Jul 5 2021, 3:10 PM · Code scanner
zack added a subtask for T3420: scanner: make the various query algorithms user-selectable: T3349: use swh.model.merkle/from_disk instead of swh.scanner.model.
Jul 5 2021, 3:10 PM · Code scanner
zack triaged T3420: scanner: make the various query algorithms user-selectable as Normal priority.
Jul 5 2021, 3:10 PM · Code scanner

Jun 30 2021

DanSeraf added a revision to T3349: use swh.model.merkle/from_disk instead of swh.scanner.model: D5951: model: make deduplication optional when iterating over the merkle tree.
Jun 30 2021, 4:21 PM · Code scanner

Jun 25 2021

DanSeraf added a revision to T2692: Move the output related functions to another (sub)module: D5926: swh.scanner: use model.from_disk instead of scanner.model to store a source code project.
Jun 25 2021, 1:57 PM · Code scanner
DanSeraf added a revision to T2730: scanner: should output the root SWHID as well: D5926: swh.scanner: use model.from_disk instead of scanner.model to store a source code project.
Jun 25 2021, 1:57 PM · Easy hack, Code scanner
DanSeraf added a revision to T3349: use swh.model.merkle/from_disk instead of swh.scanner.model: D5926: swh.scanner: use model.from_disk instead of scanner.model to store a source code project.
Jun 25 2021, 1:57 PM · Code scanner

Jun 15 2021

zack closed T3209: Fix swh-scanner for python > 3.7 as Resolved by committing rDTSCNd58bcb59a099: Fix swh-scanner for python 3.7 and >= 3.8.
Jun 15 2021, 11:10 AM · Code scanner

Jun 11 2021

zack renamed T3349: use swh.model.merkle/from_disk instead of swh.scanner.model from consider using swh.model.merkle/from_disk instead of swh.scanner.model to use swh.model.merkle/from_disk instead of swh.scanner.model.
Jun 11 2021, 11:16 AM · Code scanner
zack added a subtask for T3349: use swh.model.merkle/from_disk instead of swh.scanner.model: T2730: scanner: should output the root SWHID as well.
Jun 11 2021, 11:16 AM · Code scanner
zack added a parent task for T2730: scanner: should output the root SWHID as well: T3349: use swh.model.merkle/from_disk instead of swh.scanner.model.
Jun 11 2021, 11:16 AM · Easy hack, Code scanner

May 28 2021

zack changed the status of T3349: use swh.model.merkle/from_disk instead of swh.scanner.model from Open to Work in Progress.
May 28 2021, 11:13 AM · Code scanner
zack triaged T3349: use swh.model.merkle/from_disk instead of swh.scanner.model as Normal priority.
May 28 2021, 11:13 AM · Code scanner

May 10 2021

zack triaged T3318: scanner should use the known() method of web.client as Low priority.
May 10 2021, 9:02 AM · Code scanner

Apr 23 2021

vlorentz assigned T3136: Prior art detection service to zack.
Apr 23 2021, 4:51 PM · Code scanner, Scientific Community Building, Roadmap 2021, meta-task

Apr 18 2021

aastha1999 added a comment to T3209: Fix swh-scanner for python > 3.7.

I'm sorry for the delay. I was unaware that this task was assigned to me. I missed it, but solved it as soon as I came to know about it. Also, I wanted to ask whether there is a way to install a library listed in requirements.txt depending on the Python version in use?

Apr 18 2021, 3:54 PM · Code scanner

Apr 6 2021

zack added a project to T3209: Fix swh-scanner for python > 3.7: Code scanner.
Apr 6 2021, 12:01 PM · Code scanner

Mar 26 2021

DanSeraf closed T2679: Use the `swh.model` version of `extract_regex_objs` as Resolved.
Mar 26 2021, 4:54 PM · Code scanner
DanSeraf added a revision to T2679: Use the `swh.model` version of `extract_regex_objs`: D5359: scanner: use 'extract_regex_objs' from swh.model.
Mar 26 2021, 2:39 PM · Code scanner

Mar 19 2021

vlorentz triaged T3136: Prior art detection service as Normal priority.
Mar 19 2021, 4:24 PM · Code scanner, Scientific Community Building, Roadmap 2021, meta-task

Mar 16 2021

zack placed T3136: Prior art detection service up for grabs.

@shashikant231 please do not claim tasks, thanks.

Mar 16 2021, 7:00 PM · Code scanner, Scientific Community Building, Roadmap 2021, meta-task
shashikant231 claimed T3136: Prior art detection service.

Hi @rdicosmo, can you guide me on how to start working on this issue? I have already built this project on my system.

Mar 16 2021, 4:00 PM · Code scanner, Scientific Community Building, Roadmap 2021, meta-task

Mar 15 2021

rdicosmo added a subtask for T3136: Prior art detection service: T3112: Provenance index for the full archive.
Mar 15 2021, 8:59 PM · Code scanner, Scientific Community Building, Roadmap 2021, meta-task
rdicosmo added a project to T3136: Prior art detection service: Code scanner.
Mar 15 2021, 8:58 PM · Code scanner, Scientific Community Building, Roadmap 2021, meta-task

Mar 10 2021

KShivendu closed T2731: scanner: strip the path passed as argument from output as Resolved.
Mar 10 2021, 6:59 PM · Easy hack, Code scanner
KShivendu moved T2731: scanner: strip the path passed as argument from output from Backlog to In progress on the Easy hack board.
Mar 10 2021, 5:34 PM · Easy hack, Code scanner

Mar 8 2021

vlorentz closed T3101: Latest versions on Pypi are an incompatible combination as Resolved.

The issue should be fixed now

Mar 8 2021, 5:03 PM · Code scanner
vlorentz added a project to T3101: Latest versions on Pypi are an incompatible combination: Code scanner.
Mar 8 2021, 4:31 PM · Code scanner

Mar 7 2021

KShivendu added a revision to T2731: scanner: strip the path passed as argument from output: D5213: swh/scanner : Strip root path from json output.
Mar 7 2021, 2:34 PM · Easy hack, Code scanner

Mar 3 2021

zack added a project to T2730: scanner: should output the root SWHID as well: Easy hack.
Mar 3 2021, 9:49 AM · Easy hack, Code scanner
zack added a project to T2731: scanner: strip the path passed as argument from output: Easy hack.
Mar 3 2021, 9:49 AM · Easy hack, Code scanner

Dec 19 2020

zack placed T2300: swh-scanner: print a nicer error message when rate limit is hit up for grabs.
Dec 19 2020, 9:48 PM · Easy hack, Code scanner
zack closed T2813: swh scanner db import does not validate SWHIDs as Resolved by committing rDTSCN33a9cd4eb965: DB import: skip invalid SWHIDs during import.
Dec 19 2020, 9:47 PM · Code scanner
zack closed T2812: scanner import db is slow, improve its performances as Resolved by committing rDTSCNfe84403087cc: DB import: massive speed up, via sqlite tuning and better mem handling.
Dec 19 2020, 9:47 PM · Code scanner
zack closed T2836: swh scanner db import loads keeps all input SWHIDs in memory as Resolved by committing rDTSCNfe84403087cc: DB import: massive speed up, via sqlite tuning and better mem handling.
Dec 19 2020, 9:47 PM · Easy hack, Code scanner

Dec 15 2020

zack updated the task description for T2812: scanner import db is slow, improve its performances.
Dec 15 2020, 5:57 PM · Code scanner
zack updated the task description for T2812: scanner import db is slow, improve its performances.
Dec 15 2020, 5:50 PM · Code scanner
zack renamed T2812: scanner import db is slow, improve its performances from scanner: improve SWHID (txt) -> sqlite import time to scanner import db is slow, improve its performances.
Dec 15 2020, 5:48 PM · Code scanner

Dec 2 2020

zack added a project to T2836: swh scanner db import loads keeps all input SWHIDs in memory: Easy hack.
Dec 2 2020, 9:26 AM · Easy hack, Code scanner
zack triaged T2836: swh scanner db import loads keeps all input SWHIDs in memory as Normal priority.
Dec 2 2020, 9:26 AM · Easy hack, Code scanner

Nov 25 2020

zack triaged T2813: swh scanner db import does not validate SWHIDs as Low priority.
Nov 25 2020, 10:37 PM · Code scanner
zack triaged T2812: scanner import db is slow, improve its performances as Low priority.
Nov 25 2020, 10:00 PM · Code scanner
zack closed T2680: proxy support for swh scanner as Resolved by committing rDTSCN65f0b8e4c6ea: honor HTTP(S)_PROXY environment variables, to support HTTP proxies.
Nov 25 2020, 4:42 PM · Easy hack, Code scanner

Nov 24 2020

DanSeraf closed T2760: swh-scanner: add support for local DB of known SWHIDs as Resolved.
Nov 24 2020, 1:54 PM · Code scanner

Nov 22 2020

DanSeraf added a revision to T2760: swh-scanner: add support for local DB of known SWHIDs: D4552: 'db serve' option to start the API service.
Nov 22 2020, 4:19 PM · Code scanner

Nov 18 2020

DanSeraf added a revision to T2760: swh-scanner: add support for local DB of known SWHIDs: D4508: scanner: 'db import' option to create local database with known swhids.
Nov 18 2020, 2:24 PM · Code scanner

Nov 16 2020

DanSeraf changed the status of T2760: swh-scanner: add support for local DB of known SWHIDs from Open to Work in Progress.
Nov 16 2020, 10:41 AM · Code scanner

Nov 6 2020

zack updated the task description for T2760: swh-scanner: add support for local DB of known SWHIDs.
Nov 6 2020, 2:50 PM · Code scanner
zack triaged T2760: swh-scanner: add support for local DB of known SWHIDs as Normal priority.
Nov 6 2020, 2:32 PM · Code scanner

Oct 24 2020

zack triaged T2731: scanner: strip the path passed as argument from output as Low priority.
Oct 24 2020, 5:01 PM · Easy hack, Code scanner
zack updated the task description for T2730: scanner: should output the root SWHID as well.
Oct 24 2020, 4:58 PM · Easy hack, Code scanner
zack updated the task description for T2730: scanner: should output the root SWHID as well.
Oct 24 2020, 4:58 PM · Easy hack, Code scanner
zack triaged T2730: scanner: should output the root SWHID as well as Normal priority.
Oct 24 2020, 4:58 PM · Easy hack, Code scanner

Oct 13 2020

DanSeraf triaged T2692: Move the output related functions to another (sub)module as Normal priority.
Oct 13 2020, 9:57 AM · Code scanner
DanSeraf closed T2690: swh scanner reports double results in ndjson format as Resolved by committing rDTSCNc2768d171a78: model: dropped _iter_nodes_attr function.
Oct 13 2020, 9:36 AM · Code scanner

Oct 12 2020

zack triaged T2679: Use the `swh.model` version of `extract_regex_objs` as Low priority.
Oct 12 2020, 6:59 PM · Code scanner
zack triaged T2690: swh scanner reports double results in ndjson format as Normal priority.
Oct 12 2020, 6:59 PM · Code scanner
zvr created T2690: swh scanner reports double results in ndjson format.
Oct 12 2020, 6:47 PM · Code scanner

Oct 9 2020

zack added a project to T2680: proxy support for swh scanner: Easy hack.
Oct 9 2020, 2:59 PM · Easy hack, Code scanner
zack triaged T2680: proxy support for swh scanner as Normal priority.
Oct 9 2020, 2:58 PM · Easy hack, Code scanner
acezar created T2679: Use the `swh.model` version of `extract_regex_objs`.
Oct 9 2020, 2:47 PM · Code scanner

Sep 28 2020

tenma closed T2632: swh scanner fail to start when configuration file is missing as Resolved.
Sep 28 2020, 10:14 AM · Code scanner

Sep 25 2020

tenma added a revision to T2632: swh scanner fail to start when configuration file is missing: D4046: Fix default config file may be absent in scanner cli.
Sep 25 2020, 11:44 AM · Code scanner

Sep 23 2020

zack triaged T2632: swh scanner fail to start when configuration file is missing as High priority.
Sep 23 2020, 2:12 PM · Code scanner

Sep 14 2020

tenma closed T2572: swh-scanner: add support for authentication token to lift rate-limit as Resolved by committing rDTSCN0abe025e277b: Add standard config support and auth token for swh-scanner.
Sep 14 2020, 2:10 PM · Code scanner

Sep 9 2020

tenma added a revision to T2572: swh-scanner: add support for authentication token to lift rate-limit: D3900: Add standard config support and HTTP auth token for swh-scanner.
Sep 9 2020, 7:58 PM · Code scanner

Sep 8 2020

zack assigned T2572: swh-scanner: add support for authentication token to lift rate-limit to tenma.
Sep 8 2020, 10:50 AM · Code scanner
zack triaged T2572: swh-scanner: add support for authentication token to lift rate-limit as Normal priority.
Sep 8 2020, 10:25 AM · Code scanner
zack renamed T2300: swh-scanner: print a nicer error message when rate limit is hit from scanner: print a nicer error message when rate limit is hit to swh-scanner: print a nicer error message when rate limit is hit.
Sep 8 2020, 10:24 AM · Easy hack, Code scanner

Jun 22 2020

DanSeraf closed T2364: scanner: file browser in the sunburst/dashboard output as Resolved by committing rDTSCN0f10ec6ae8fe: dashboard: file visualization per directory path.
Jun 22 2020, 7:39 PM · Code scanner
DanSeraf closed T2336: scanner: add support for an exclusion list as Resolved.
Jun 22 2020, 2:57 PM · Code scanner

Apr 30 2020

DanSeraf closed T2365: scanner: add color legend for sunburst output as Resolved by committing rDTSCNfb8ae03e494c: plot: color legend.
Apr 30 2020, 12:41 PM · Code scanner

Apr 29 2020

DanSeraf closed T2363: scanner: json output should return both known and unknown files/dirs as Resolved by committing rDTSCN623a9dbe6157: ndjson output format.
Apr 29 2020, 4:40 PM · Code scanner

Apr 23 2020

olasd added a comment to T2363: scanner: json output should return both known and unknown files/dirs.

Just jumping in, I suggest using ndjson (newline-delimited json) instead of a full json tree, as the former is easier to stream / parse incrementally for large outputs (like the linux kernel).

Apr 23 2020, 12:14 PM · Code scanner
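As a rough illustration of the streaming argument in the comment above: with ndjson, each line is a complete JSON document, so a consumer can handle records one at a time instead of loading a whole tree. The record fields used here ("path", "known") are assumptions made for the example and not necessarily the scanner's actual output schema.

```python
import io
import json
from typing import Iterator, TextIO


def iter_ndjson(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed record per non-empty line, without buffering the rest."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical scanner-like output; a real run would read from a file or pipe.
sample = io.StringIO(
    '{"path": "dir1/subdir1/text.txt", "known": true}\n'
    '{"path": "dir1/other.txt", "known": false}\n'
)

records = list(iter_ndjson(sample))
unknown_paths = [r["path"] for r in records if not r["known"]]
```

In practice the generator would be consumed lazily (for example in a for loop over sys.stdin), so memory use stays flat even for very large scans.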
zack added a comment to T2363: scanner: json output should return both known and unknown files/dirs.
$ swh scanner scan -f json /tmp/test
{
    "dir1": {
        "children": {
            "subdir1": {
                "children": {
                    "text.txt": {
                        "known": true,
                        "swhid": "swh:1:cnt:ff5b57b7095eb5d168a36db6552ad2ce1f219bf6"
                    }
Apr 23 2020, 10:47 AM · Code scanner

Apr 22 2020

DanSeraf added a comment to T2363: scanner: json output should return both known and unknown files/dirs.

The new json output will be like the following:

Apr 22 2020, 6:13 PM · Code scanner

Apr 15 2020

DanSeraf closed T2362: scanner: aiohttp.client_exceptions.ServerDisconnectedError: None as Invalid.
Apr 15 2020, 5:58 PM · Code scanner
zack updated the task description for T2363: scanner: json output should return both known and unknown files/dirs.
Apr 15 2020, 2:07 PM · Code scanner
zack triaged T2365: scanner: add color legend for sunburst output as Low priority.
Apr 15 2020, 1:56 PM · Code scanner