Fri, Sep 11

ardumont updated the task description for T2586: Investigate feasibility of revision diff fragments specification for SWHID.
Fri, Sep 11, 3:02 PM · Data Model
anlambert triaged T2586: Investigate feasibility of revision diff fragments specification for SWHID as Normal priority.
Fri, Sep 11, 3:01 PM · Data Model

Tue, Sep 8

zack added a comment to T2571: swh-identify: add support for --type revision.

Re supported VCSs: sure, but I'd start with git, which is low-hanging fruit.

Tue, Sep 8, 1:18 PM · Data Model
douardda added a comment to T2571: swh-identify: add support for --type revision.

Do we mean something other than echo swh:1:rev:$(git rev-parse HEAD)?
Do we want support for other (supported) VCSs?
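
As a rough illustration of the one-liner in this comment, here is a minimal Python sketch that builds a revision SWHID from a git commit id. The function name `revision_swhid` is hypothetical and is not part of swh-identify; the sketch assumes the commit id is a 40-character hexadecimal SHA-1, as printed by `git rev-parse HEAD`.

```python
import re

# A git commit id is assumed here to be a 40-character hexadecimal SHA-1.
_SHA1_HEX = re.compile(r"[0-9a-f]{40}")


def revision_swhid(commit_id: str) -> str:
    """Format a git commit id as a revision SWHID (hypothetical helper)."""
    commit_id = commit_id.strip().lower()
    if not _SHA1_HEX.fullmatch(commit_id):
        raise ValueError(f"not a 40-character hex SHA-1: {commit_id!r}")
    # Same shape as the shell one-liner: swh:1:rev:<commit id>
    return f"swh:1:rev:{commit_id}"


example = revision_swhid("0123456789abcdef0123456789abcdef01234567\n")
print(example)
```

In a git checkout, the commit id could be obtained with `git rev-parse HEAD` and passed to this helper; other VCSs would need their own mapping, which this sketch does not attempt.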

Tue, Sep 8, 12:34 PM · Data Model
zack triaged T2571: swh-identify: add support for --type revision as Normal priority.
Tue, Sep 8, 9:12 AM · Data Model
zack triaged T2570: swh-identify: support exclusion patterns (e.g., for .git/) as swh-scanner does as Normal priority.
Tue, Sep 8, 9:09 AM · Data Model
zack closed T1687: Add filename as an optional part in persistent identifiers as Resolved.

This seems to have been addressed with the path qualifier in SWHIDs.
Closing.
(Please reopen if I'm missing something.)

Tue, Sep 8, 9:06 AM · Data Model
zack updated subscribers of T1136: swh-identify: support recursive checksumming of directories.

As an update: a feature equivalent to this one has been implemented in swh-scanner by @DanSeraf.
I guess it would still be useful to have it in swh-identify as well (it seems like a natural need), but of course the code should not be duplicated.

Tue, Sep 8, 9:04 AM · Data Model

Fri, Sep 4

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3883: algos.diff: Add missed revision_get conversion.
Fri, Sep 4, 3:37 PM · Data Model, Storage manager

Thu, Sep 3

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3877: Adapt storage.revision_get calls according to latest api change.
Thu, Sep 3, 5:48 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3870: Adapt storage.revision_get calls according to latest api change.
Thu, Sep 3, 1:26 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3869: test_loader: Adapt to latest storage revision_get change.
Thu, Sep 3, 1:20 PM · Data Model, Storage manager

Wed, Sep 2

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3868: loader: Adapt to latest storage revision_get change.
Wed, Sep 2, 6:39 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3865: metadata: Adapt to latest storage revision_get change.
Wed, Sep 2, 4:08 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3864: migrations: Adapt according to latest storage revision_get api change.
Wed, Sep 2, 3:49 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3863: Refactor revision_get storage API to return Revision objects.
Wed, Sep 2, 3:25 PM · Data Model, Storage manager

Mon, Aug 31

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3854: swh.web: Adapt to latest storage release_get api change.
Mon, Aug 31, 4:44 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3853: test_loader: Adapt to latest storage release_get change.
Mon, Aug 31, 3:49 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3852: storage*: release_get(...) -> List[Optional[Release]].
Mon, Aug 31, 3:42 PM · Data Model, Storage manager

Fri, Aug 28

vlorentz added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

Update: I just tried out my script to use Disarchive as an alternative to pristine-tar. It was pretty easy since they both work similarly (from the outside, ofc).

Fri, Aug 28, 10:20 PM · Data Model

Thu, Aug 27

vlorentz added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

@samplet so it's an improvement over pristine-tar, great!

Thu, Aug 27, 11:33 PM · Data Model
samplet added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

@lewo Great! The code is still “technical preview” quality, so there are some rough edges. Also, I’m willing to make architectural changes to better suit SWH. For instance, the whole “database” model might not be necessary if we want to store the metadata along with the files themselves. Feel free to ask for whatever help or changes you need! In the meantime, I will work on fixing some bugs and cleaning things up.

Thu, Aug 27, 11:25 PM · Data Model
lewo added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

@samplet wow! that's pretty cool! Thank you ;)

Thu, Aug 27, 10:28 PM · Data Model
ardumont added a comment to T2478: backfill origin-visit and origin-visit-status topics.

i guess so, yes.

Thu, Aug 27, 5:18 PM · Storage manager, Data Model
ardumont added a comment to T645: Type swh-storage endpoints with swh.model objects.

Well, storage is typed now, but the rest remains inconsistent (as T645#47156 explains, proposing something to address it).

Thu, Aug 27, 5:11 PM · Data Model, Storage manager
douardda closed T2478: backfill origin-visit and origin-visit-status topics, a subtask of T2310: Make origin visits immutable, as Wontfix.
Thu, Aug 27, 4:43 PM · Storage manager, Data Model
douardda closed T2478: backfill origin-visit and origin-visit-status topics as Wontfix.

I guess this task can be closed, since this backfilling process will be part of the one we will run soon to fill the new Kafka cluster.

Thu, Aug 27, 4:43 PM · Storage manager, Data Model
douardda added a comment to T645: Type swh-storage endpoints with swh.model objects.

what's missing for this task to be closed?

Thu, Aug 27, 4:32 PM · Data Model, Storage manager

Wed, Aug 26

samplet added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

@rdicosmo the full ID is a better choice. Thanks!

Wed, Aug 26, 3:45 PM · Data Model
vlorentz added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

I tried this as well, before trying pristine-tar. An issue I ran into is that storing the value of fields isn't enough, because there are multiple ways to represent them in tar (e.g. numbers are usually null-terminated strings, so \x00\x00\x00\x00, 0\x00\x00\x00, 0000, and \x00123 represent the same value).
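
To make the point about ambiguous encodings concrete, here is a small Python sketch of a lenient numeric-field parser. It assumes the field is read as an octal string that ends at the first NUL byte, with an empty string treated as zero; `parse_tar_number` is a hypothetical name, and real tar implementations may differ in detail.

```python
def parse_tar_number(field: bytes) -> int:
    """Decode a tar numeric header field leniently (illustrative only).

    Assumption: the value is an ASCII octal string terminated by the
    first NUL byte; anything after that NUL is ignored, and an empty
    string counts as zero.
    """
    text = field.split(b"\x00", 1)[0].decode("ascii").strip()
    return int(text or "0", 8)


# The four byte strings from the comment above.
fields = [b"\x00\x00\x00\x00", b"0\x00\x00\x00", b"0000", b"\x00123"]
values = [parse_tar_number(f) for f in fields]
print(values)
```

Under these assumptions all four fields decode to the same number, so the decoded value alone cannot tell you which bytes to write back. Reproducing the original archive bit for bit would require recording the raw field as well.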

Wed, Aug 26, 10:06 AM · Data Model
rdicosmo added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

Thank you @samplet for sharing this great work: I am really looking forward to seeing you and @vlorentz compare notes and see whether we can archive the three extra files you produce as extrinsic metadata in the SWH archive!

Wed, Aug 26, 9:54 AM · Data Model
samplet added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

Hi all,

Wed, Aug 26, 6:37 AM · Data Model

Aug 19 2020

vlorentz triaged T2523: Archive opensource.samsung.com as Normal priority.
Aug 19 2020, 7:40 PM · Lister, Archive coverage

Aug 7 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3738: model: Add Sha1 alias.
Aug 7 2020, 9:54 AM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3737: Adapt type and rename content_get_metadata calls to content_get.
Aug 7 2020, 6:40 AM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3735: test_npm: Adapt content_get_metadata call to content_get.
Aug 7 2020, 12:59 AM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3734: indexer.rehash: Adapt content_get_metadata call to content_get.
Aug 7 2020, 12:53 AM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3733: storage*: Rename and type content_get(List[Sha1]) -> List[Optional[Content]].
Aug 7 2020, 12:45 AM · Data Model, Storage manager

Aug 6 2020

vlorentz added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

I started looking into this, using https://nix-community.github.io/nixpkgs-swh/sources-unstable.json as a source of archive files.

Aug 6 2020, 4:19 PM · Data Model
ardumont added a revision to T2517: Add remaining missing types to swh.storage.interface: D3719: Adapt code according to storage signature.
Aug 6 2020, 9:58 AM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3718: "text"-indexers: Migrate to partition index instead of range.
Aug 6 2020, 9:50 AM · Data Model, Storage manager
ardumont added a revision to T2517: Add remaining missing types to swh.storage.interface: D3717: Adapt code according to storage signature.
Aug 6 2020, 9:46 AM · Data Model, Storage manager
ardumont added a revision to T2517: Add remaining missing types to swh.storage.interface: D3716: Adapt code according to storage signature.
Aug 6 2020, 9:40 AM · Data Model, Storage manager

Aug 5 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3715: in_memory: Drop dead code.
Aug 5 2020, 8:02 PM · Data Model, Storage manager
ardumont added a comment to T645: Type swh-storage endpoints with swh.model objects.
  1. {object}_missing for object in {content, directory, revision, release, snapshot}
Aug 5 2020, 5:45 PM · Data Model, Storage manager
ardumont added a comment to T645: Type swh-storage endpoints with swh.model objects.

I'll type with what we have right now; that will simplify the next diffs, which introduce type changes.
It also demonstrates the inconsistencies we have right now.

Aug 5 2020, 5:44 PM · Data Model, Storage manager
ardumont closed T2517: Add remaining missing types to swh.storage.interface, a subtask of T645: Type swh-storage endpoints with swh.model objects, as Resolved.
Aug 5 2020, 5:29 PM · Data Model, Storage manager
ardumont closed T2517: Add remaining missing types to swh.storage.interface as Resolved.
Aug 5 2020, 5:29 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3713: storage*: content_get_partition(...) -> PagedResult[Content].
Aug 5 2020, 4:11 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3712: storage*: Drop deprecated content_get_range endpoint.
Aug 5 2020, 3:18 PM · Data Model, Storage manager
ardumont renamed T645: Type swh-storage endpoints with swh.model objects from Type swh-storage endpoints to Type swh-storage endpoints with swh.model objects.
Aug 5 2020, 2:06 PM · Data Model, Storage manager
ardumont removed revisions from T645: Type swh-storage endpoints with swh.model objects: D3708: storage*: origin_get_by_sha1: Drop generator from pgstorage, D3707: storage*: revision_*: Type remaining existing endpoints, D3706: storage*: directory_*: Type remaining existing endpoints, D3705: storage*: skipped_content_missing: Type remaining existing endpoints, D3704: storage*: content_missing_per_sha1(_git): Type remaining existing endpoints, D3703: storage*: content_missing: Unify and type remaining existing endpoints, D3702: storage*: content_get_partition: Type remaining existing endpoints, D3701: storage*: content_get_range: Type remaining existing endpoints, D3700: storage*: content_get: Type remaining existing endpoints, D3699: storage*: content_update: Type remaining existing endpoints, D3698: storage*: origin_get_by_sha1: Type remaining existing endpoints, D3697: storage*: check_config: Type remaining existing endpoints.
Aug 5 2020, 1:12 PM · Data Model, Storage manager
ardumont added revisions to T2517: Add remaining missing types to swh.storage.interface: D3708: storage*: origin_get_by_sha1: Drop generator from pgstorage, D3707: storage*: revision_*: Type remaining existing endpoints, D3706: storage*: directory_*: Type remaining existing endpoints, D3705: storage*: skipped_content_missing: Type remaining existing endpoints, D3704: storage*: content_missing_per_sha1(_git): Type remaining existing endpoints, D3703: storage*: content_missing: Unify and type remaining existing endpoints, D3702: storage*: content_get_partition: Type remaining existing endpoints, D3701: storage*: content_get_range: Type remaining existing endpoints, D3700: storage*: content_get: Type remaining existing endpoints, D3699: storage*: content_update: Type remaining existing endpoints, D3698: storage*: origin_get_by_sha1: Type remaining existing endpoints, D3697: storage*: check_config: Type remaining existing endpoints.
Aug 5 2020, 1:12 PM · Data Model, Storage manager
ardumont renamed T2517: Add remaining missing types to swh.storage.interface from Add remaining missing types to the interface to Add remaining missing types to swh.storage.interface.
Aug 5 2020, 1:10 PM · Data Model, Storage manager
ardumont added a revision to T2517: Add remaining missing types to swh.storage.interface: D3711: storage*: object_find_by_sha1_git: Type remaining existing endpoints.
Aug 5 2020, 1:06 PM · Data Model, Storage manager
ardumont added a revision to T2517: Add remaining missing types to swh.storage.interface: D3710: storage*: snapshot_*: Type remaining existing endpoints.
Aug 5 2020, 12:50 PM · Data Model, Storage manager
ardumont added a revision to T2517: Add remaining missing types to swh.storage.interface: D3709: storage*: release_*: Type remaining existing endpoints.
Aug 5 2020, 12:30 PM · Data Model, Storage manager
ardumont triaged T2517: Add remaining missing types to swh.storage.interface as Normal priority.
Aug 5 2020, 12:21 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3708: storage*: origin_get_by_sha1: Drop generator from pgstorage.
Aug 5 2020, 9:55 AM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3707: storage*: revision_*: Type remaining existing endpoints.
Aug 5 2020, 8:38 AM · Data Model, Storage manager

Aug 4 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3706: storage*: directory_*: Type remaining existing endpoints.
Aug 4 2020, 11:12 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3705: storage*: skipped_content_missing: Type remaining existing endpoints.
Aug 4 2020, 11:11 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3704: storage*: content_missing_per_sha1(_git): Type remaining existing endpoints.
Aug 4 2020, 6:58 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3703: storage*: content_missing: Unify and type remaining existing endpoints.
Aug 4 2020, 6:48 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3702: storage*: content_get_partition: Type remaining existing endpoints.
Aug 4 2020, 6:47 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3701: storage*: content_get_range: Type remaining existing endpoints.
Aug 4 2020, 6:47 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3700: storage*: content_get: Type remaining existing endpoints.
Aug 4 2020, 6:40 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3699: storage*: content_update: Type remaining existing endpoints.
Aug 4 2020, 6:40 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3698: storage*: origin_get_by_sha1: Type remaining existing endpoints.
Aug 4 2020, 6:02 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3697: storage*: check_config: Type remaining existing endpoints.
Aug 4 2020, 6:00 PM · Data Model, Storage manager
ardumont added a comment to T645: Type swh-storage endpoints with swh.model objects.

I'll type with what we have right now; that will simplify the next diffs, which introduce type changes.
It also demonstrates the inconsistencies we have right now.

Aug 4 2020, 5:26 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3693: service: Adapt according to the latest storage.content_find changes.
Aug 4 2020, 11:15 AM · Data Model, Storage manager
ardumont changed the status of T645: Type swh-storage endpoints with swh.model objects from Open to Work in Progress.

Current status: the endpoints related to origin, origin-visit and origin-visit-status are now done, both read and write.
What remains is to align and type the reading endpoints for the DAG model objects (content, directory, revision, release, snapshot).

Aug 4 2020, 10:17 AM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3692: storage*: Type content_find(...) -> List[Content].
Aug 4 2020, 10:11 AM · Data Model, Storage manager

Aug 3 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3687: storage*: Type {cnt,dir,rev,rel,snp}_get_random(...) -> Sha1Git.
Aug 3 2020, 1:27 PM · Data Model, Storage manager

Aug 2 2020

ardumont removed a revision from T645: Type swh-storage endpoints with swh.model objects: D3684: swh.search.get_search: Simplify instantiation.
Aug 2 2020, 11:15 AM · Data Model, Storage manager
ardumont removed a revision from T645: Type swh-storage endpoints with swh.model objects: D3685: swh.search: Define an interface for search backends and use it.
Aug 2 2020, 11:15 AM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3685: swh.search: Define an interface for search backends and use it.
Aug 2 2020, 11:11 AM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3684: swh.search.get_search: Simplify instantiation.
Aug 2 2020, 11:11 AM · Data Model, Storage manager

Aug 1 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3683: storage.in_memory: Fix origin_list implementation.
Aug 1 2020, 11:16 AM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3682: cli.task: Migrate scheduler cli to latest storage change on iter_origins.
Aug 1 2020, 10:04 AM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3681: storage*: Drop origin-get-range in favor of origin-list.
Aug 1 2020, 9:30 AM · Data Model, Storage manager

Jul 31 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3675: origin: Migrate use to storage.origin_list instead of origin_get_range.
Jul 31 2020, 4:19 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3671: storage*: Add type annotation to origin_count.
Jul 31 2020, 2:51 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3669: Reuse swh.core stream_results function.
Jul 31 2020, 1:58 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3664: api.classes: Open swh.core.api.classes.stream_results.
Jul 31 2020, 12:58 PM · Data Model, Storage manager

Jul 30 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3661: common/service: Migrate origin_search to latest apis change.
Jul 30 2020, 10:56 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3657: search*: Type origin_search(...) -> PagedResult[Dict].
Jul 30 2020, 7:33 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3651: storage*: Type origin_search(...) -> PagedResult[Origin].
Jul 30 2020, 4:10 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3650: storage*: Adapt origin_list(...) -> PagedResult[Origin].
Jul 30 2020, 2:32 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3648: algos.snapshot: Open snapshot_get_from_revision algorithm.
Jul 30 2020, 10:02 AM · Data Model, Storage manager

Jul 29 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3647: service: Migrate to latest origin_visit_get api change.
Jul 29 2020, 8:17 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3645: deposit.migrations: Migrate to latest storage api change.
Jul 29 2020, 7:35 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3643: storage*: Simplify next-page-token computation.
Jul 29 2020, 4:59 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3641: storage*: add origin_visit_status_get(...) -> PagedResult[OriginVisitStatus].
Jul 29 2020, 4:35 PM · Data Model, Storage manager

Jul 28 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3632: swh.core.api: Expose a serializable PagedResult object.
Jul 28 2020, 3:42 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3629: storage*: use an enum to explicit the order in origin_visit_get.
Jul 28 2020, 1:14 PM · Data Model, Storage manager

Jul 27 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3627: storage*: origin_visit_get(...) -> PagedResult[OriginVisit].
Jul 27 2020, 10:10 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3626: Update swh.storage.origin_visit_get_by calls to latest api change.
Jul 27 2020, 4:01 PM · Data Model, Storage manager
ardumont added a parent task for T645: Type swh-storage endpoints with swh.model objects: T2223: Type checking.
Jul 27 2020, 2:55 PM · Data Model, Storage manager