2020-07-03 17:23:20 civodul zack: hey! could you clarify the policy wrt. tarballs?
2020-07-03 17:23:43 civodul i understand it's undesirable, but still it happens, so i'm trying to see if we can have our cake and eat it too :-)
2020-07-03 17:27:30 +zack civodul: I understand it's annoying for your needs, but for now I confirm what both I and rdicosmo have said in the ticket(s), i.e., no tarball archival for now
2020-07-03 17:28:01 +zack the lookup service is more likely to happen, but I (now) understand your point about verifiability of them
2020-07-03 17:28:49 +zack I confess I don't entirely buy it, especially because in the medium term the hashes you have today are going to be broken, and with them your verifiability will also become moot
2020-07-03 17:29:07 +zack but I can see how in the short term it would have been desirable
2020-07-03 17:29:15 +ardumont olasd: yeah, i hear you, i got too eager to be done with that thing, i think
2020-07-03 17:29:38 +ardumont trying to fix now (i have some error about cursor already closed right now ¯\_(ツ)_/¯)
2020-07-03 17:29:57 +olasd ardumont: to be fair, most of my remarks are about code that's around the area you touched, not code you touched specifically
2020-07-03 17:30:07 +zack civodul: I also still think that the lookup service will be better than nothing, but I'd totally understand if you instead consider it's not enough for your needs
2020-07-03 17:30:25 rdicosmo civodul: we see your point, but it's a matter of resources and priority... right now our hands are quite full
2020-07-03 17:30:52 +olasd moranegg[m]: swh:1:rel:ba01e42e250d30c80f3588bdb10fd25bb7769ca8;origin=https://forge.softwareheritage.org/source/swh-model.git;visit=swh:1:snp:d28fb8b2315a2582b7e96efefb4a6e5af381008f is the latest swh.model release
2020-07-03 17:31:01 +olasd (and it's indeed a release)
2020-07-03 17:34:29 +douardda seeing this, one advantage I see of having an swh:1:ori SWHID would be to control the length of other SWHID's qualifiers
2020-07-03 17:35:28 +olasd but that makes us an oracle, which is... not great
2020-07-03 17:36:06 +douardda why? it's just a matter of computing a sha(1|256) of the origin's url
2020-07-03 17:36:33 +douardda unless I missed something?
2020-07-03 17:37:57 rdicosmo douardda: the origin carries semantics that we want to make independent of a resolver ...
2020-07-03 17:38:43 rdicosmo ^^^ if we use a hash to encode an origin, then nobody can find out what the origin was without us
2020-07-03 17:39:16 +douardda oh yes I get it
2020-07-03 17:39:39 civodul rdicosmo, zack: alright, thanks for the clear reply
2020-07-03 17:39:49 civodul and yes, i perfectly understand that you have enough on your plate
2020-07-03 17:39:51 rdicosmo ^^^ if one just wants to have a shorter encoding, then it must be a bijection, not a one-way function (e.g. base64 etc.) but we discussed this some time ago and it seems not worth our while
2020-07-03 17:40:27 civodul in the short term the outcome may be that we'll have Tarball Heritage in parallel ;-)
2020-07-03 17:41:02 civodul NixOS, Guix, etc. will have to maintain their caches and be stricter about preservation
2020-07-03 17:41:55 rdicosmo civodul: keep that up in the short term, and we'll tackle it when we have more slack :-)
2020-07-03 17:42:28 civodul heh :-)
2020-07-03 17:42:59 civodul i understood it as "no" (regarding storing tarballs) rather than "yes, but later"
2020-07-03 17:43:01 civodul correct?
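[Editor's note] The 17:36 to 17:39 exchange between douardda and rdicosmo contrasts hashing an origin URL with encoding it reversibly. The snippet below is only a rough sketch of that contrast, not how SWHIDs are actually built: it reuses the origin URL quoted by olasd, and the variable names are made up for illustration. It also shows that base64, taken on its own, gives a longer string rather than a shorter one.

```python
import base64
import hashlib

# Origin URL quoted in olasd's message; everything else here is illustrative.
origin = "https://forge.softwareheritage.org/source/swh-model.git"

# One-way: a SHA-1 hex digest always has 40 characters, but the URL cannot be
# recovered from it without a lookup table held by some resolver (the "oracle").
digest = hashlib.sha1(origin.encode("utf-8")).hexdigest()

# Reversible: base64 can be decoded by anyone, with no resolver involved.
# Note that it expands the input by roughly a third instead of shortening it.
encoded = base64.urlsafe_b64encode(origin.encode("utf-8")).decode("ascii")
decoded = base64.urlsafe_b64decode(encoded).decode("utf-8")

print(len(origin), len(digest), len(encoded))
```

The digest has a fixed length but needs a third party to map it back to a URL, while the reversible form stays self-describing at the cost of length.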
2020-07-03 17:43:08 civodul (just to make sure there's no misunderstanding)
2020-07-03 17:44:08 rdicosmo civodul: as zack said, it's a "no for now", and the correct negation is "yes, but later" :-)
2020-07-03 17:45:11 civodul ok :-)
2020-07-03 17:46:44 rdicosmo civodul: if some wealthy benefactor pops up unexpectedly with significant resources to donate, later could be sooner, but this kind of event usually only happens in the movies
2020-07-03 17:47:03 civodul sure
2020-07-03 17:47:11 civodul TBH i'm also a bit worried about perception
2020-07-03 17:47:19 civodul dunno
2020-07-03 17:48:29 rdicosmo civodul: Software Heritage is a long-term undertaking, it will take time to roll out everything we need, but we'll get there
2020-07-03 17:49:47 * civodul nods
2020-07-03 17:51:17 civodul it's an issue we discussed 4 years ago though, which is why i bother you more today than back then
2020-07-03 17:51:35 civodul but anyway, thanks for taking the time again
2020-07-03 17:51:45 civodul we'll do our best on our side with the resources that we have
2020-07-03 17:59:23 +ardumont olasd: fixed the shortcomings in D3420 ;)
2020-07-03 17:59:23 -- Notice(swhbot): D3420 (author: ardumont, Needs Review) on swh-storage: pg-storage: Add missing cur parameter passing
2020-07-03 18:30:44 +zack civodul: fwiw the overhead is not only about storage, but also (mainly? not sure yet) about having another ingestion process to engineer and maintain, as storing tarballs doesn't fit our current ingestion mechanism well
2020-07-03 18:31:39 civodul zack: right, i see
2020-07-03 18:31:54 civodul though an option might be to not do anything special about it
2020-07-03 18:32:22 +zack re: perception, not sure. I think your POV is very specific, as *we will* archive any source code bundle that people want us to archive. But that is not the same as archiving its *container* as is
2020-07-03 18:33:18 +zack civodul: if you just throw the container as is, say, in our blob storage, you lose all the advantages of our data model (fine-grained deduplication, etc.) and you will make the resource problem worse
2020-07-03 18:33:41 civodul i'm pretty sure that concern is not limited to Guix + NixOS
2020-07-03 18:33:58 civodul it's just more acute there because these are the only distros that offer long-term reproducibility
2020-07-03 18:34:26 civodul but release announcements, papers, CMakeLists.txt, READMEs, etc. include hashes of tarballs
2020-07-03 18:35:00 +zack they also include version numbers
2020-07-03 18:35:09 civodul that's not enough
2020-07-03 18:35:13 +zack of course
2020-07-03 18:35:50 +zack but we need to worry first about saving the source code itself that's behind those identifiers, and we're doing it :)
2020-07-03 18:36:07 +zack it's not as if lacking the tarball hashes means not doing the most important part of the job
2020-07-03 18:36:19 civodul sure, and you're doing great :-)
2020-07-03 18:36:22 civodul well
2020-07-03 18:36:50 civodul code that cannot be authenticated just cannot be used in many cases
2020-07-03 18:37:37 +zack i'm still curious about your feedback on my comment on weak hashes
2020-07-03 18:37:51 +zack the more time passes, the more hashes in old announcements will become useless
2020-07-03 18:38:20 civodul yeah, that's true to some extent, though "useless" is a strong word
2020-07-03 18:38:30 civodul it cannot be used as an argument for not having any integrity check at all
2020-07-03 18:39:02 civodul and again, it's not just announcements + those crazy Guix folks ;-)
2020-07-03 18:39:09 +zack we have integrity checks :-)
2020-07-03 18:39:30 +zack we just don't have (yet) the integrity checks that you happen to rely on
2020-07-03 18:39:43 civodul i mean integrity checks for users: my script downloads hello-1.0 and it wants to make sure it really got what it asked for
2020-07-03 18:40:38 +zack anyway, your use case is clear, and we'll get to it --- it's just not available right now, sorry about that
2020-07-03 18:40:59 civodul yup, got it
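[Editor's note] civodul's 18:39:43 use case (a script downloads hello-1.0 and wants to be sure it got what it asked for) comes down to comparing a digest of the downloaded bytes with a published one. The following is a minimal sketch under assumptions: the log does not say which hash function or tooling Guix or any other project uses, so SHA-256 is picked for illustration, and the helper names and the in-memory stand-in payload are hypothetical.

```python
import hashlib
import hmac
import io


def sha256_of_stream(stream, chunk_size=1 << 16):
    """Return the hex SHA-256 of a binary stream, read in chunks."""
    h = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


def matches_published_hash(stream, expected_hex):
    """True if the stream's digest equals the published hex digest."""
    actual = sha256_of_stream(stream)
    return hmac.compare_digest(actual, expected_hex.strip().lower())


# Stand-in for a downloaded tarball; a real script would open the file in "rb" mode.
payload = b"stand-in for the bytes of a hello-1.0 tarball"
# In practice this value comes from a release announcement or a package recipe.
published = hashlib.sha256(payload).hexdigest()

ok = matches_published_hash(io.BytesIO(payload), published)
tampered = matches_published_hash(io.BytesIO(payload + b"x"), published)
print(ok, tampered)
```

The digest covers the exact bytes of the tarball container, so this check only succeeds when the container itself is retrieved byte for byte. That is the distinction zack draws at 18:32 between archiving the source code and archiving its container as is.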