Software Heritage

DanSeraf (Daniele Serafini)
User

Projects

User Details

User Since
Sep 23 2019, 3:10 PM (96 w, 4 d)

Recent Activity

Thu, Jul 29

DanSeraf closed D6044: scanner: translate CoreSWHID instances before the request to the backend.
Thu, Jul 29, 3:56 PM
DanSeraf committed rDTSCN98011a42261b: translate CoreSWHID instances before the request to the backend (authored by DanSeraf).
Thu, Jul 29, 3:56 PM
DanSeraf requested review of D6044: scanner: translate CoreSWHID instances before the request to the backend.
Thu, Jul 29, 3:49 PM
DanSeraf closed D6027: swh-scanner: add 'auto' option as default policy.
Thu, Jul 29, 2:22 PM
DanSeraf committed rDTSCNcd19bbbcee76: add 'auto' option as default policy (authored by DanSeraf).
Thu, Jul 29, 2:22 PM
DanSeraf updated the diff for D6027: swh-scanner: add 'auto' option as default policy.

policies description in CLI

Thu, Jul 29, 1:32 PM

Wed, Jul 28

DanSeraf updated the diff for D6027: swh-scanner: add 'auto' option as default policy.
  • docstring for each policy
  • remove redundant variables
  • test auto policy for a "big" source tree
Wed, Jul 28, 3:45 PM

Tue, Jul 27

DanSeraf added a comment to D6027: swh-scanner: add 'auto' option as default policy.

This diff touched the part of the code that needs to be documented, so it's fine to document it in the same diff.

Tue, Jul 27, 4:35 PM
DanSeraf added a comment to D6027: swh-scanner: add 'auto' option as default policy.

Could you document the scan policies?

Tue, Jul 27, 4:29 PM
DanSeraf requested review of D6027: swh-scanner: add 'auto' option as default policy.
Tue, Jul 27, 2:29 PM

Wed, Jul 21

DanSeraf closed T3420: scanner: make the various query algorithms user-selectable as Resolved by committing rDTSCNd5a070e1429d: add scan policies.
Wed, Jul 21, 2:00 PM · Code scanner
DanSeraf closed D5996: swh-scanner: new scan policies.
Wed, Jul 21, 2:00 PM
DanSeraf committed rDTSCNd5a070e1429d: add scan policies (authored by DanSeraf).
Wed, Jul 21, 2:00 PM
DanSeraf updated the diff for D5996: swh-scanner: new scan policies.

commit message

Wed, Jul 21, 12:31 PM
DanSeraf updated the diff for D5996: swh-scanner: new scan policies.

better comment wording

Wed, Jul 21, 12:16 PM

Tue, Jul 20

DanSeraf updated the diff for D5996: swh-scanner: new scan policies.

make CI pass

Tue, Jul 20, 5:42 PM
DanSeraf updated the diff for D5996: swh-scanner: new scan policies.

make CI pass

Tue, Jul 20, 4:23 PM
DanSeraf updated the diff for D5996: swh-scanner: new scan policies.

test scan policies SWHIDs request to the backend

Tue, Jul 20, 3:55 PM

Mon, Jul 19

DanSeraf added inline comments to D5996: swh-scanner: new scan policies.
Mon, Jul 19, 2:00 PM

Thu, Jul 15

DanSeraf requested review of D5996: swh-scanner: new scan policies.
Thu, Jul 15, 7:20 PM
DanSeraf added a revision to T3420: scanner: make the various query algorithms user-selectable: D5996: swh-scanner: new scan policies.
Thu, Jul 15, 7:17 PM · Code scanner

Thu, Jul 8

DanSeraf closed D5981: scanner: access MerkleNodeInfo with the correct key.
Thu, Jul 8, 6:34 PM
DanSeraf committed rDTSCN33b1316bcdab: access MerkleNodeInfo with the correct key (authored by DanSeraf).
Thu, Jul 8, 6:34 PM
DanSeraf requested review of D5981: scanner: access MerkleNodeInfo with the correct key.
Thu, Jul 8, 4:38 PM
DanSeraf closed T2692: Move the output related functions to another (sub)module as Resolved by committing rDTSCN0d92c754c8df: use model.from_disk instead of scanner.model to store a source code project.
Thu, Jul 8, 3:42 PM · Code scanner
DanSeraf closed T2730: scanner: should output the root SWHID as well as Resolved by committing rDTSCN0d92c754c8df: use model.from_disk instead of scanner.model to store a source code project.
Thu, Jul 8, 3:42 PM · Easy hack, Code scanner
DanSeraf closed T3349: use swh.model.merkle/from_disk instead of swh.scanner.model, a subtask of T2730: scanner: should output the root SWHID as well, as Resolved.
Thu, Jul 8, 3:42 PM · Easy hack, Code scanner
DanSeraf closed T3349: use swh.model.merkle/from_disk instead of swh.scanner.model as Resolved by committing rDTSCN0d92c754c8df: use model.from_disk instead of scanner.model to store a source code project.
Thu, Jul 8, 3:42 PM · Code scanner
DanSeraf closed T3349: use swh.model.merkle/from_disk instead of swh.scanner.model, a subtask of T3420: scanner: make the various query algorithms user-selectable, as Resolved.
Thu, Jul 8, 3:42 PM · Code scanner
DanSeraf closed D5926: swh.scanner: use model.from_disk instead of scanner.model to store a source code project.
Thu, Jul 8, 3:42 PM
DanSeraf committed rDTSCN0d92c754c8df: use model.from_disk instead of scanner.model to store a source code project (authored by DanSeraf).
Thu, Jul 8, 3:42 PM
DanSeraf updated the diff for D5926: swh.scanner: use model.from_disk instead of scanner.model to store a source code project.

rebase

Thu, Jul 8, 3:35 PM
DanSeraf updated the diff for D5926: swh.scanner: use model.from_disk instead of scanner.model to store a source code project.

commit message

Thu, Jul 8, 3:28 PM

Wed, Jul 7

DanSeraf updated the diff for D5926: swh.scanner: use model.from_disk instead of scanner.model to store a source code project.

make pytest pass

Wed, Jul 7, 6:05 PM
DanSeraf updated the diff for D5926: swh.scanner: use model.from_disk instead of scanner.model to store a source code project.

requested changes

Wed, Jul 7, 5:04 PM
DanSeraf added inline comments to D5926: swh.scanner: use model.from_disk instead of scanner.model to store a source code project.
Wed, Jul 7, 11:48 AM

Mon, Jul 5

DanSeraf requested review of D5926: swh.scanner: use model.from_disk instead of scanner.model to store a source code project.
Mon, Jul 5, 9:33 AM

Fri, Jul 2

DanSeraf closed D5951: model: make deduplication optional when iterating over the merkle tree.
Fri, Jul 2, 11:55 AM
DanSeraf committed rDMOD153c6e84421b: make deduplication optional when iterating over the merkle tree (authored by DanSeraf).
Fri, Jul 2, 11:55 AM
DanSeraf updated the diff for D5951: model: make deduplication optional when iterating over the merkle tree.

rebase

Fri, Jul 2, 11:53 AM
DanSeraf updated the diff for D5951: model: make deduplication optional when iterating over the merkle tree.

requested changes

Fri, Jul 2, 10:13 AM

Jun 30 2021

DanSeraf requested review of D5951: model: make deduplication optional when iterating over the merkle tree.
Jun 30 2021, 4:24 PM
DanSeraf added a revision to T3349: use swh.model.merkle/from_disk instead of swh.scanner.model: D5951: model: make deduplication optional when iterating over the merkle tree.
Jun 30 2021, 4:21 PM · Code scanner

Jun 25 2021

DanSeraf added a revision to T2692: Move the output related functions to another (sub)module: D5926: swh.scanner: use model.from_disk instead of scanner.model to store a source code project.
Jun 25 2021, 1:57 PM · Code scanner
DanSeraf added a revision to T2730: scanner: should output the root SWHID as well: D5926: swh.scanner: use model.from_disk instead of scanner.model to store a source code project.
Jun 25 2021, 1:57 PM · Easy hack, Code scanner
DanSeraf added a revision to T3349: use swh.model.merkle/from_disk instead of swh.scanner.model: D5926: swh.scanner: use model.from_disk instead of scanner.model to store a source code project.
Jun 25 2021, 1:57 PM · Code scanner

Jun 22 2021

DanSeraf committed rDTSCN88e90727ae9d: exclude .git directory during the repository extraction (authored by DanSeraf).
Jun 22 2021, 4:28 PM

Jun 21 2021

DanSeraf closed T3393: add swhid() method to from_disk classes as Resolved by committing rDMODe4566a6605ff: from_disk: get swhid from Content/Directory objects.
Jun 21 2021, 5:16 PM · Data Model
DanSeraf closed D5899: swh-model: get SWHID from Content/Directory objects in from_disk.
Jun 21 2021, 5:16 PM
DanSeraf committed rDMODe4566a6605ff: from_disk: get swhid from Content/Directory objects (authored by DanSeraf).
Jun 21 2021, 5:16 PM
DanSeraf updated the diff for D5899: swh-model: get SWHID from Content/Directory objects in from_disk.

requested changes

Jun 21 2021, 5:06 PM
DanSeraf updated the diff for D5899: swh-model: get SWHID from Content/Directory objects in from_disk.

unit test

Jun 21 2021, 4:00 PM

Jun 18 2021

DanSeraf added a comment to D5899: swh-model: get SWHID from Content/Directory objects in from_disk.
In D5899#150867, @zack wrote:

LGTM in general, but needs unit tests (in addition to the two nitpicks above about docstrings)

Jun 18 2021, 5:23 PM
DanSeraf updated the diff for D5899: swh-model: get SWHID from Content/Directory objects in from_disk.

requested changes

Jun 18 2021, 5:22 PM
DanSeraf requested review of D5899: swh-model: get SWHID from Content/Directory objects in from_disk.
Jun 18 2021, 4:50 PM
DanSeraf added a revision to T3393: add swhid() method to from_disk classes: D5899: swh-model: get SWHID from Content/Directory objects in from_disk.
Jun 18 2021, 4:48 PM · Data Model

Jun 15 2021

DanSeraf closed T3383: swh identify --recursive breaks --exclude, resulting in a "AttributeError: 'str' object has no attribute 'decode'" traceback as Resolved by committing rDMODe09446a6f44b: encode exclude patterns before extracting regex objects.
Jun 15 2021, 6:28 PM · Data Model
DanSeraf closed D5876: swh-model: encode exclude patterns before extracting regex objects.
Jun 15 2021, 6:28 PM
DanSeraf committed rDMODe09446a6f44b: encode exclude patterns before extracting regex objects (authored by DanSeraf).
Jun 15 2021, 6:28 PM
DanSeraf updated the diff for D5876: swh-model: encode exclude patterns before extracting regex objects.

commit message

Jun 15 2021, 6:24 PM
DanSeraf requested review of D5876: swh-model: encode exclude patterns before extracting regex objects.
Jun 15 2021, 5:53 PM
DanSeraf added a revision to T3383: swh identify --recursive breaks --exclude, resulting in a "AttributeError: 'str' object has no attribute 'decode'" traceback: D5876: swh-model: encode exclude patterns before extracting regex objects.
Jun 15 2021, 5:42 PM · Data Model
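
The traceback quoted in T3383 is the usual symptom of handing `str` values to code that expects `bytes`. The following is a minimal sketch of that failure mode and of the "encode first" fix named in the commit title. The helper names `compile_excludes` and `encode_patterns` are hypothetical and are not the swh.model API; only `fnmatch.translate` and `re.compile` are real standard-library calls.

```python
import fnmatch
import re
from typing import Iterable, Iterator, List, Pattern


def compile_excludes(patterns: Iterable[bytes]) -> Iterator[Pattern[bytes]]:
    """Hypothetical helper: turn bytes glob patterns into bytes regex objects."""
    for pattern in patterns:
        # .decode() exists only on bytes; a str pattern raises
        # AttributeError: 'str' object has no attribute 'decode'
        regex = fnmatch.translate(pattern.decode())
        yield re.compile(regex.encode())


def encode_patterns(patterns: Iterable[str]) -> List[bytes]:
    """Encode command-line (str) patterns before they reach compile_excludes."""
    return [p.encode() for p in patterns]


if __name__ == "__main__":
    cli_patterns = ["*.pyc"]  # what a CLI option parser typically yields
    regexes = list(compile_excludes(encode_patterns(cli_patterns)))
    print(bool(regexes[0].match(b"module.pyc")))
```

Encoding at the boundary keeps the matching code bytes-only, which suits file names that are not guaranteed to be valid text.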

Jun 11 2021

DanSeraf closed T3160: swh identify: add a -R/--recursive flag as Resolved.

closed by https://forge.softwareheritage.org/D5825

Jun 11 2021, 4:42 PM · Easy hack, Data Model
DanSeraf closed D5825: swh-model: add recursive option.

landed in https://forge.softwareheritage.org/rDMODae50e43fe091d8cc7b4cfbe0eea17565b38dbb0d

Jun 11 2021, 4:40 PM
DanSeraf updated the diff for D5825: swh-model: add recursive option.

rebase

Jun 11 2021, 4:33 PM
DanSeraf committed rDMODae50e43fe091: cli: add recursive option (authored by DanSeraf).
Jun 11 2021, 4:26 PM
DanSeraf updated the diff for D5825: swh-model: add recursive option.

requested changes

Jun 11 2021, 1:40 PM
DanSeraf added inline comments to D5825: swh-model: add recursive option.
Jun 11 2021, 1:26 PM
DanSeraf updated the diff for D5825: swh-model: add recursive option.
  • missing incompatibility checks
  • ignore --recursive if the input object is not a directory
  • print only the swhid if using --no-filename
Jun 11 2021, 12:59 PM
DanSeraf added a comment to D5825: swh-model: add recursive option.

It seems it's still missing incompatibility checks with --no-dereference, --no-filename, and --type

Actually, --no-filename could be compatible (and implemented in this diff); should I include it?

Jun 11 2021, 12:24 PM

Jun 10 2021

DanSeraf updated the diff for D5825: swh-model: add recursive option.
  • requested changes
  • get the object path based on the data contained in the node object
Jun 10 2021, 12:28 PM

Jun 9 2021

DanSeraf added inline comments to D5825: swh-model: add recursive option.
Jun 9 2021, 12:00 PM
DanSeraf added inline comments to D5825: swh-model: add recursive option.
Jun 9 2021, 11:58 AM

Jun 8 2021

DanSeraf added a comment to D5825: swh-model: add recursive option.
In D5825#148519, @zack wrote:
  1. If relevant, could you implement --verify too?

Sure, if it is useful I could open another diff for it.

This is not going to be easy, because one would need to pass a set of SWHIDs, mapped to individual paths, which is much harder to do on a CLI than passing a single SWHID (as in the current implementation of --verify).
The only easy way to do this would be passing a filename pointing to, e.g., a JSON-formatted manifest, to be compared with the computed SWHIDs, but I don't see much point in doing that.

I'm fine with --verify being incompatible with --recursive for the time being (maybe with an appropriate error message).

Jun 8 2021, 6:26 PM
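
For illustration only, the manifest idea raised in the comment above could look roughly like the sketch below. It is not part of swh-model. The manifest layout (a JSON object mapping relative paths to SWHIDs) and the function names are assumptions, and only file contents are handled, using the Git-compatible blob hash that content SWHIDs are built on; directory SWHIDs are left out.

```python
import hashlib
import json
import os
from typing import Dict


def content_swhid(data: bytes) -> str:
    """SWHID of a file content: SHA1 over a Git blob header plus the data."""
    header = b"blob %d\0" % len(data)
    return "swh:1:cnt:" + hashlib.sha1(header + data).hexdigest()


def verify_manifest(root: str, manifest: Dict[str, str]) -> Dict[str, bool]:
    """Compare each expected SWHID with the one computed from disk.

    `manifest` maps paths relative to `root` to expected content SWHIDs
    (hypothetical format, chosen for this sketch).
    """
    results = {}
    for rel_path, expected in manifest.items():
        with open(os.path.join(root, rel_path), "rb") as f:
            results[rel_path] = content_swhid(f.read()) == expected
    return results


def load_manifest(path: str) -> Dict[str, str]:
    """Read the JSON manifest file passed on the command line."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

A caller would run `verify_manifest(root, load_manifest("manifest.json"))` and report every path whose value is `False`. This also shows why the approach is heavier than the single-SWHID `--verify`: the user has to produce and maintain the manifest file.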
DanSeraf added a comment to D5825: swh-model: add recursive option.

It will show only the SWHID of the given directory; basically, the same process as before is applied.

But what was the process before? Did it ignore directory entries?

It checks only the given directories, generating a from_disk.Directory object for each directory. Should it use the same logic as the recursive option?

Jun 8 2021, 6:18 PM
DanSeraf added a comment to D5825: swh-model: add recursive option.
  1. What happens if one runs it on a directory and --recursive isn't given?

It will show only the SWHID of the given directory; basically, the same process as before is applied.

Jun 8 2021, 5:19 PM
DanSeraf requested review of D5825: swh-model: add recursive option.
Jun 8 2021, 3:00 PM

Apr 29 2021

DanSeraf closed D5644: scanner-benchmark: add algorithms timings in results.
Apr 29 2021, 3:20 PM
DanSeraf committed rDTSCNed78dce5eb6e: add algorithms timings in results (authored by DanSeraf).
Apr 29 2021, 3:20 PM
DanSeraf updated the diff for D5644: scanner-benchmark: add algorithms timings in results.
  • minimum required version for swh-model
  • treat algo_min in the same way as other cases
Apr 29 2021, 3:18 PM
DanSeraf requested review of D5644: scanner-benchmark: add algorithms timings in results.
Apr 29 2021, 2:40 PM

Mar 26 2021

DanSeraf closed T2679: Use the `swh.model` version of `extract_regex_objs` as Resolved.
Mar 26 2021, 4:54 PM · Code scanner
DanSeraf closed D5359: scanner: use 'extract_regex_objs' from swh.model.
Mar 26 2021, 4:54 PM
DanSeraf committed rDTSCNb3256c87728e: use 'extract_regex_objs' from swh.model (authored by DanSeraf).
Mar 26 2021, 4:54 PM
DanSeraf closed T2570: swh-identify: support exclusion patterns (e.g., for .git/) as swh-scanner does as Resolved.

Already implemented in D4193

Mar 26 2021, 3:15 PM · Data Model
DanSeraf updated the diff for D5359: scanner: use 'extract_regex_objs' from swh.model.

removed unnecessary conversion

Mar 26 2021, 3:12 PM
DanSeraf added inline comments to D5359: scanner: use 'extract_regex_objs' from swh.model.
Mar 26 2021, 3:07 PM
DanSeraf added inline comments to D5359: scanner: use 'extract_regex_objs' from swh.model.
Mar 26 2021, 3:00 PM
DanSeraf updated the summary of D5359: scanner: use 'extract_regex_objs' from swh.model.
Mar 26 2021, 2:42 PM
DanSeraf requested review of D5359: scanner: use 'extract_regex_objs' from swh.model.
Mar 26 2021, 2:40 PM
DanSeraf added a revision to T2679: Use the `swh.model` version of `extract_regex_objs`: D5359: scanner: use 'extract_regex_objs' from swh.model.
Mar 26 2021, 2:39 PM · Code scanner

Feb 5 2021

DanSeraf committed rDTSCN8daa353de986: reimplement algo_min (authored by DanSeraf).
Feb 5 2021, 5:51 PM
DanSeraf closed D5032: scanner-benchmark: improve logging information.
Feb 5 2021, 3:29 PM
DanSeraf committed rDTSCNe084e0f0f9ca: improve logging information (authored by DanSeraf).
Feb 5 2021, 3:29 PM
DanSeraf requested review of D5032: scanner-benchmark: improve logging information.
Feb 5 2021, 3:02 PM

Feb 4 2021

DanSeraf closed D5011: scanner-benchmark: use os.listdir() instead of os.walk() to avoid symlinks.
Feb 4 2021, 5:46 PM
DanSeraf committed rDTSCNe46e713d2145: run random algorithm only once (authored by DanSeraf).
Feb 4 2021, 5:46 PM
DanSeraf committed rDTSCN3004b66787b2: use os.listdir() instead of os.walk() to avoid symlinks (authored by DanSeraf).
Feb 4 2021, 5:46 PM
DanSeraf updated the diff for D5011: scanner-benchmark: use os.listdir() instead of os.walk() to avoid symlinks.

rebase

Feb 4 2021, 5:38 PM
DanSeraf updated the diff for D5011: scanner-benchmark: use os.listdir() instead of os.walk() to avoid symlinks.

rebase

Feb 4 2021, 5:08 PM