Well now:
- The types are here.
- All *_add endpoints in the storages now take the new model objects as input.
- All storage tests use the data model objects as input.
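To make the shape of the change concrete, here is a minimal sketch of what "*_add endpoints take model objects instead of dicts" looks like. The class and method names are illustrative stand-ins, not the actual swh.model/swh.storage API:

```python
from dataclasses import dataclass


# Hypothetical stand-in for the new immutable model objects
# (the real classes live in swh.model.model; fields here are illustrative).
@dataclass(frozen=True)
class Content:
    sha1: bytes
    length: int


class InMemoryStorage:
    """Toy storage whose content_add takes model objects, not dicts."""

    def __init__(self):
        self._contents = {}

    def content_add(self, contents):
        # Each item is a Content instance; attribute access replaces
        # the old dict-key access, and validation happened at construction.
        for c in contents:
            self._contents[c.sha1] = c
        return {"content:add": len(contents)}


storage = InMemoryStorage()
summary = storage.content_add([Content(sha1=b"\x00" * 20, length=42)])
```

The point is that validation moves to object construction time, so the storage backends no longer each re-check loosely structured dicts.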
Well now:
Reading the code that deals with snapshot branches in several storage implementations, it really seems to me that storing them as a dict-like structure has no advantage.
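For illustration, a sorted tuple of (name, target) pairs carries exactly the same information as the dict-like structure while being immutable and hashable; this is just a sketch of the idea, not the actual swh representation:

```python
# Dict-like branch mapping as currently stored (sketch).
branches_dict = {
    b"refs/heads/main": b"\x01" * 20,
    b"refs/tags/v1.0": b"\x02" * 20,
}

# Equivalent immutable representation: sorted (name, target) pairs.
branches_tuple = tuple(sorted(branches_dict.items()))

# The conversion is lossless: we can round-trip back to the dict.
assert dict(branches_tuple) == branches_dict
```

An immutable representation also gives a canonical ordering for free, which matters when hashing or comparing snapshots.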
But I'd like to use this cleanup as an opportunity to go a bit further than "the minimal amount of work for pedantic correctness", and actually make changes that have a conceptual meaning.
Nothing against it either.
If that can make us ingest faster, it'd be neat.
The main part is done: actually making the origin-visit immutable.
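"Immutable origin-visit" means deriving a new object instead of mutating one in place. A minimal sketch of the pattern, assuming illustrative field names (the real class lives in swh.model):

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


# Illustrative immutable origin-visit; field names are assumptions.
@dataclass(frozen=True)
class OriginVisit:
    origin: str
    date: datetime
    visit: Optional[int] = None


visit = OriginVisit(
    origin="https://example.org/repo.git",
    date=datetime(2020, 6, 1, tzinfo=timezone.utc),
)

# Instead of assigning visit.visit = 1 (which a frozen dataclass forbids),
# derive a new object carrying the id allocated by the storage:
registered = replace(visit, visit=1)
```

With this pattern, any origin-visit object in flight is a consistent snapshot that no other code path can silently modify.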
It's now fully deployed.
Not sure about the DB space argument, but the CPU savings alone are worth the move IMHO.
In T2430#46040, @zack wrote:
@civodul I wanted to raise the topic of storing container metadata (in the style of what tools like pristine-tar do) here too, so thanks for giving me the chance :-)
I agree it might be a technical solution, *but* I'm not sure I see the point.
Didn't you agree that having a "lookup service" from tarball/container checksums to SWHIDs (the Software Heritage identifiers, that can then be used to lookup stuff in the archive) would be enough to satisfy distro needs?
If yes, then "archiving container metadata" could be replaced by simply having a way to add entries to the lookup table. And allowing distros to do so is an option that we can explore. (Once the service exists, of course.)
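To sketch what such a lookup table amounts to: a mapping from a tarball's SHA256 to a SWHID. The example below is a toy, but the SWHID computation for content objects is real (it reuses git's blob hashing); the mapping itself is just a dict standing in for the hypothetical service:

```python
import hashlib


def swhid_for_content(data: bytes) -> str:
    """Compute the swh:1:cnt SWHID of raw content.

    Content SWHIDs reuse git's blob hashing scheme:
    sha1 over b"blob <length>\\0" followed by the data.
    """
    git_header = b"blob %d\x00" % len(data)
    sha1_git = hashlib.sha1(git_header + data).hexdigest()
    return "swh:1:cnt:" + sha1_git


# Toy stand-in for the lookup service: tarball SHA256 -> SWHID.
# (A real entry would likely map to a directory or release SWHID.)
lookup = {}

tarball = b"pretend this is a tarball"
lookup[hashlib.sha256(tarball).hexdigest()] = swhid_for_content(b"unpacked file")
```

A distro that only knows a tarball's SHA256 could then resolve it to a SWHID and fetch the corresponding objects from the archive.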
Do I get it right that the primary reason why tarballs aren't systematically archived is that doing so would be too expensive storage-wise (no deduplication)?
closed by D3152
Thanks for your feedback, @rdicosmo!
The previously proposed "short-term" solution does not work. So the only "short-term" solution is to make DiskBackedContent inherit from BaseModel (or BaseContent).
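A minimal sketch of that hierarchy, with names mirroring the ones discussed (the real classes live in swh.model; the field set and the lazy-read method are assumptions for illustration):

```python
from dataclasses import dataclass


# Hypothetical shared base: the metadata every content variant carries.
@dataclass(frozen=True)
class BaseContent:
    sha1: bytes
    length: int


# DiskBackedContent inherits the model interface from the base,
# but keeps its payload on disk instead of in memory.
@dataclass(frozen=True)
class DiskBackedContent(BaseContent):
    path: str = ""

    def data(self) -> bytes:
        # Read the bytes lazily, only when actually needed.
        with open(self.path, "rb") as f:
            return f.read()


c = DiskBackedContent(sha1=b"\x00" * 20, length=3, path="/tmp/example")
```

With the shared base, code that only needs hashes and lengths can treat both variants uniformly, which is what unblocks the endpoints above.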
This task is currently blocked by an implementation "detail":
In T2430#45767, @zimoun wrote:
- if you still have that tarball at hand, then it can be ingested in SWH, and we keep the correspondence between SWHID and SHA256; in principle, you need to trust us, but one can foresee having external parties checking that the correspondence is real while the tarball is still there, and adding their observation to the chain of trust means you need to trust us less and less
By "we keep the correspondence between SWHID and SHA256", do you mean "you" on the SWH side?
In T2430#45764, @civodul wrote:
@rdicosmo The discussion of the "source of trust" is an important one, and it's interesting to see how we can address it going forward.
The proposal of a correspondence table, as I wrote on swh-devel, leaves open the question of today's and yesterday's software, assuming SWHIDs become the de facto standard tomorrow. How can I check the integrity of code fetched from SWH if all I have is its tarball's SHA256 from its release announcement? How can I check its authenticity if all I have is an OpenPGP signature computed over a tarball?