I'll type with what we have right now; that will simplify the next diffs, which introduce type changes.
It also demonstrates the inconsistencies we have right now.

Aug 4 2020, 5:26 PM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3693: service: Adapt according to the latest storage.content_find changes.

Aug 4 2020, 11:15 AM · Data Model, Storage manager

ardumont changed the status of T645: Type swh-storage endpoints with swh.model objects from Open to Work in Progress.

Current status: the endpoints related to origin, origin-visit and origin-visit-status are now done, both read and write.
What remains is to align and type the reading endpoints for the DAG model objects (content, directory, revision, release, snapshot).
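
For illustration, a typed and paginated read endpoint could look roughly like the sketch below. It is a minimal, self-contained sketch and not the actual swh.storage code: `Origin` and `PagedResult` here are simplified stand-ins written for this example, and `InMemoryStorage`, its integer-offset page token and `iter_origins` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Generic, Iterator, List, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Origin:
    """Simplified stand-in for a model object (assumption: one field only)."""

    url: str


@dataclass
class PagedResult(Generic[T]):
    """Simplified stand-in for a paginated result (assumed shape)."""

    results: List[T] = field(default_factory=list)
    next_page_token: Optional[str] = None


class InMemoryStorage:
    """Hypothetical storage that keeps its origins in a sorted list."""

    def __init__(self, urls: List[str]) -> None:
        self._origins = [Origin(url=u) for u in sorted(urls)]

    def origin_list(
        self, page_token: Optional[str] = None, limit: int = 2
    ) -> PagedResult[Origin]:
        # Assumption: the page token is simply the offset of the next item.
        start = int(page_token) if page_token is not None else 0
        end = start + limit
        next_token = str(end) if end < len(self._origins) else None
        return PagedResult(
            results=self._origins[start:end], next_page_token=next_token
        )


def iter_origins(storage: InMemoryStorage) -> Iterator[Origin]:
    """Walk every page until the storage stops returning a token."""
    token: Optional[str] = None
    while True:
        page = storage.origin_list(page_token=token)
        yield from page.results
        token = page.next_page_token
        if token is None:
            break
```

With a return type like this, a caller gets model objects rather than dicts, and a type checker can flag code that still treats the results as dicts.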

Aug 4 2020, 10:17 AM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3692: storage*: Type content_find(...) -> List[Content].

Aug 4 2020, 10:11 AM · Data Model, Storage manager

Aug 3 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3687: storage*: Type {cnt,dir,rev,rel,snp}_get_random(...) -> Sha1Git.

Aug 3 2020, 1:27 PM · Data Model, Storage manager

Aug 2 2020

ardumont removed a revision from T645: Type swh-storage endpoints with swh.model objects: D3684: swh.search.get_search: Simplify instantiation.

Aug 2 2020, 11:15 AM · Data Model, Storage manager

ardumont removed a revision from T645: Type swh-storage endpoints with swh.model objects: D3685: swh.search: Define an interface for search backends and use it.

Aug 2 2020, 11:15 AM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3685: swh.search: Define an interface for search backends and use it.

Aug 2 2020, 11:11 AM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3684: swh.search.get_search: Simplify instantiation.

Aug 2 2020, 11:11 AM · Data Model, Storage manager

Aug 1 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3683: storage.in_memory: Fix origin_list implementation.

Aug 1 2020, 11:16 AM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3682: cli.task: Migrate scheduler cli to latest storage change on iter_origins.

Aug 1 2020, 10:04 AM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3681: storage*: Drop origin-get-range in favor of origin-list.

Aug 1 2020, 9:30 AM · Data Model, Storage manager

Jul 31 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3675: origin: Migrate use to storage.origin_list instead of origin_get_range.

Jul 31 2020, 4:19 PM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3671: storage*: Add type annotation to origin_count.

Jul 31 2020, 2:51 PM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3669: Reuse swh.core stream_results function.

Jul 31 2020, 1:58 PM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3664: api.classes: Open swh.core.api.classes.stream_results.

Jul 31 2020, 12:58 PM · Data Model, Storage manager

Jul 30 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3661: common/service: Migrate origin_search to latest apis change.

Jul 30 2020, 10:56 PM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3657: search*: Type origin_search(...) -> PagedResult[Dict].

Jul 30 2020, 7:33 PM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3651: storage*: Type origin_search(...) -> PagedResult[Origin].

Jul 30 2020, 4:10 PM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3650: storage*: Adapt origin_list(...) -> PagedResult[Origin].

Jul 30 2020, 2:32 PM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3648: algos.snapshot: Open snapshot_get_from_revision algorithm.

Jul 30 2020, 10:02 AM · Data Model, Storage manager

Jul 29 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3647: service: Migrate to latest origin_visit_get api change.

Jul 29 2020, 8:17 PM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3645: deposit.migrations: Migrate to latest storage api change.

Jul 29 2020, 7:35 PM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3643: storage*: Simplify next-page-token computation.

Jul 29 2020, 4:59 PM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3641: storage*: add origin_visit_status_get(...) -> PagedResult[OriginVisitStatus].

Jul 29 2020, 4:35 PM · Data Model, Storage manager

Jul 28 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3632: swh.core.api: Expose a serializable PagedResult object.

Jul 28 2020, 3:42 PM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3629: storage*: use an enum to explicit the order in origin_visit_get.

Jul 28 2020, 1:14 PM · Data Model, Storage manager

Jul 27 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3627: storage*: origin_visit_get(...) -> PagedResult[OriginVisit].

Jul 27 2020, 10:10 PM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3626: Update swh.storage.origin_visit_get_by calls to latest api change.

Jul 27 2020, 4:01 PM · Data Model, Storage manager

ardumont added a parent task for T645: Type swh-storage endpoints with swh.model objects: T2223: Type checking.

Jul 27 2020, 2:55 PM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3625: storage*: origin_visit_get_by -> Optional[OriginVisit].

Jul 27 2020, 2:19 PM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3622: storage*: origin_visit_find_by_date -> Optional[OriginVisit].

Jul 27 2020, 12:49 PM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3620: storage*: type origin_visit_get_latest endpoint result.

Jul 27 2020, 8:16 AM · Data Model, Storage manager

Jul 25 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3619: metadata: Update swh.storage.origin_get call to latest api change.

Jul 25 2020, 8:02 AM · Data Model, Storage manager

Jul 24 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3618: Update swh.storage.origin_get calls to latest api change.

Jul 24 2020, 6:30 PM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3609: loader: Update swh.storage.origin_get call to latest api change.

Jul 24 2020, 8:51 AM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3608: loader: Update swh.storage.origin_get call to latest api change.

Jul 24 2020, 8:26 AM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3607: migrations: Update swh.storage.origin_get calls to latest api change.

Jul 24 2020, 8:14 AM · Data Model, Storage manager

Jul 23 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3605: storage*: origin_get(Iterable[str]) -> Iterable[Optional[Origin]].

Jul 23 2020, 5:55 PM · Data Model, Storage manager

ardumont renamed T645: Type swh-storage endpoints with swh.model objects from "Make swh-storage endpoints typed" to "Type swh-storage endpoints".

Jul 23 2020, 1:27 PM · Data Model, Storage manager

ardumont renamed T645: Type swh-storage endpoints with swh.model objects from "Add types to swh-storage's api internal data structure result" to "Make swh-storage endpoints typed".

Jul 23 2020, 1:27 PM · Data Model, Storage manager

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3594: storage*.origin_visit_get_random: Return model object instead of dict.

Jul 23 2020, 1:26 PM · Data Model, Storage manager

ardumont removed a project from T645: Type swh-storage endpoints with swh.model objects: Web app.

Jul 23 2020, 10:34 AM · Data Model, Storage manager

ardumont added a subtask for T645: Type swh-storage endpoints with swh.model objects: T2494: tests: Use data model objects within tests (drop dicts).

Jul 23 2020, 10:15 AM · Data Model, Storage manager

ardumont added a comment to T645: Type swh-storage endpoints with swh.model objects.

Well now:

The types are here.
All *_add endpoints of the storages now take the new model objects as input.
All storage tests now use the data model objects as input.

Jul 23 2020, 9:54 AM · Data Model, Storage manager

Jul 10 2020

douardda added a comment to T2421: Make model objects immutable.

Reading the code dealing with snapshot branches in several storage implementations, it really seems to me that storing them as a dict-like structure has no advantage.

Jul 10 2020, 12:17 PM · Data Model

douardda added a comment to T2421: Make model objects immutable.

But I'd like to use the opportunity of this cleanup to go a bit further than "the minimal amount of work for pedantic correctness", and actually make changes that have a conceptual meaning.

Jul 10 2020, 12:13 PM · Data Model

Jul 8 2020

anlambert added a revision to T2423: Extract the `extra_headers` away from `Revision.metadata` into a top-level immutable object: D3463: loader: Adapt to swh-model >= 0.4.0.

Jul 8 2020, 3:35 PM · Data Model

Jul 6 2020

douardda updated the task description for T2423: Extract the `extra_headers` away from `Revision.metadata` into a top-level immutable object.

Jul 6 2020, 1:09 PM · Data Model

ardumont added a comment to T2474: drop blake2 hashes.

Nothing against it either.
If that can make us ingest faster, it'd be neat.

Jul 6 2020, 11:23 AM · Data Model, Storage manager

douardda added a revision to T2423: Extract the `extra_headers` away from `Revision.metadata` into a top-level immutable object: D3426: Extract revision's extra_header as a top level attribute.

Jul 6 2020, 10:44 AM · Data Model

ardumont closed T2310: Make origin visits immutable as Resolved.

Jul 6 2020, 10:05 AM · Storage manager, Data Model

ardumont added a comment to T2310: Make origin visits immutable.

The main part, actually making the origin-visit immutable, is done.
It has now been fully deployed.

Jul 6 2020, 10:04 AM · Storage manager, Data Model

ardumont updated the task description for T2310: Make origin visits immutable.

Jul 6 2020, 10:03 AM · Storage manager, Data Model

ardumont updated the task description for T2310: Make origin visits immutable.

Jul 6 2020, 9:59 AM · Storage manager, Data Model

ardumont triaged T2478: backfill origin-visit and origin-visit-status topics as Normal priority.

Jul 6 2020, 9:58 AM · Storage manager, Data Model

Jul 3 2020

ardumont updated the task description for T2310: Make origin visits immutable.

Jul 3 2020, 4:51 PM · Storage manager, Data Model

ardumont added a revision to T2310: Make origin visits immutable: D3416: storage.db: Drop db.origin_visit_upsert behavior.

Jul 3 2020, 4:34 PM · Storage manager, Data Model

douardda added a comment to T2474: drop blake2 hashes.

Not sure about the db space as an argument, but the CPU savings are by themselves worth the move IMHO.

Jul 3 2020, 4:30 PM · Data Model, Storage manager

vlorentz added a project to T2474: drop blake2 hashes: Data Model.

Jul 3 2020, 4:17 PM · Data Model, Storage manager

Jul 2 2020

civodul added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

In T2430#46040, @zack wrote:

@civodul I wanted to raise the topic of storing container metadata (in the style of what tools like pristine-tar do) here too, so thanks for giving me the chance :-)

Jul 2 2020, 10:44 PM · Data Model

ardumont updated the task description for T2310: Make origin visits immutable.

Jul 2 2020, 12:23 PM · Storage manager, Data Model

zack added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

@civodul I wanted to raise the topic of storing container metadata (in the style of what tools like pristine-tar do) here too, so thanks for giving me the chance :-)
I agree it might be a technical solution, *but*, I'm not sure I see the point.
Didn't you agree that having a "lookup service" from tarball/container checksums to SWHIDs (the Software Heritage identifiers, which can then be used to look up stuff in the archive) would be enough to satisfy distro needs?
If yes, then "archiving container metadata" could be replaced by simply having a way to add entries to the lookup table. And allowing distros to do so is an option that we can explore. (Once the service exists, of course.)
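
To make the idea concrete, here is a minimal sketch of such a lookup table, under stated assumptions: no such service exists yet, so `ContainerLookup`, `add_entry`, `lookup` and `sha256_of` are hypothetical names, SHA-256 is just one possible choice of container checksum, and the only validation done is a check on the `swh:1:` prefix of the identifier.

```python
import hashlib
from typing import Dict, Optional


def sha256_of(data: bytes) -> str:
    """Hex SHA-256 of a container's raw bytes (assumed checksum choice)."""
    return hashlib.sha256(data).hexdigest()


class ContainerLookup:
    """Hypothetical table mapping container checksums to SWHIDs."""

    def __init__(self) -> None:
        self._table: Dict[str, str] = {}

    def add_entry(self, container_checksum: str, swhid: str) -> None:
        # Only a prefix check; a real service would validate the full identifier.
        if not swhid.startswith("swh:1:"):
            raise ValueError(f"not a SWHID: {swhid!r}")
        self._table[container_checksum] = swhid

    def lookup(self, container_checksum: str) -> Optional[str]:
        """Return the SWHID recorded for this checksum, or None if unknown."""
        return self._table.get(container_checksum)


# Usage: a distro registers the checksum of a tarball it ships, then
# anyone holding the same tarball can resolve it to an archive identifier.
table = ContainerLookup()
tarball = b"placeholder tarball bytes"
placeholder_swhid = "swh:1:dir:" + "0" * 40  # placeholder, not a real object
table.add_entry(sha256_of(tarball), placeholder_swhid)
```

In this sketch the table stores only the mapping, not the tarball itself, which is the point of the proposal: the archive keeps the deduplicated content and the lookup entry links the container checksum to it.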

Jul 2 2020, 12:07 PM · Data Model

civodul added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

Do I get it right that the primary reason why tarballs aren't systematically archived is that doing so would be too expensive storage-wise (no deduplication)?

Jul 2 2020, 12:00 PM · Data Model

ardumont updated the task description for T2310: Make origin visits immutable.

Jul 2 2020, 10:32 AM · Storage manager, Data Model