Wed, Jan 22
In addition, we will need to modify storage to store and allow retrieval of hashed origin URLs.
Oct 4 2019
Sep 5 2019
Aug 20 2019
I think objects that we refuse to archive because of policy (that is, currently, contents larger than 100MB) also fit that description.
Jul 10 2019
Jul 8 2019
As it turns out, intrinsic origin identifiers are indeed handy for graph compression, so I'd like to see this task resolved.
Jun 30 2019
Jun 17 2019
Jun 15 2019
Build is green
See https://jenkins.softwareheritage.org/job/DSTO/job/tox/493/ for more details.
Jun 6 2019
Jun 5 2019
Jun 4 2019
Just a couple of comments:
- the current proposal is ori instead of org as the 3-letter stem
- your use cases are all valid, but would equally work with a full URL and with a hashed URL
I'm for the hashed origin only if we make it available as an identifier under our PID schema:
May 29 2019
Okay then. I'll work on updating the identifier specification.
show the first one by default, and allow the user to pick another one.
May 28 2019
May 23 2019
One way to answer the question of whether to use the hash or the tuple (or plain URL) is to know whether those identifiers are destined to be persistent ones or not.
If they are, the hash would be more consistent with the existing ones (swh:1:ori:<hash>?).
Also, they'd be simpler to use (read/type) in a URL (vs a URL within a URL).
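For illustration only, here is a minimal sketch of what a hashed origin identifier could look like. It assumes SHA-1 over the UTF-8 encoded URL, which this thread does not settle, and the helper name origin_identifier is hypothetical rather than anything from the existing codebase.

```python
import hashlib


def origin_identifier(url: str) -> str:
    """Hypothetical helper: build a swh:1:ori:<hash> style identifier.

    Assumption: the hash is the hex SHA-1 of the UTF-8 encoded origin URL.
    """
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
    return f"swh:1:ori:{digest}"


if __name__ == "__main__":
    # Example URL is a placeholder, not a real origin.
    print(origin_identifier("https://example.org/repo.git"))
```

The result has a fixed length and contains no slashes or query characters, which is what would make it easier to embed in another URL than the origin URL itself.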
May 22 2019
This sounds like a good idea.
Tangential, but impactful on this discussion: in the past we discussed removing origin types from our notion of origin (there might be a task about it, but I couldn't find it right now).