This is already well beyond a prototype, so I'm closing this task.
Dec 19 2017
Dec 15 2017
Dec 13 2017
Dec 12 2017
As the "prototype" part of the vault is definitely over, I'm closing this task.
Nov 10 2017
We can always allow people to truncate the identifier to some arbitrary (shorter) length. The canonical URI would be the full identifier, but our URI resolver can recognize shortened identifiers and point to a disambiguation page with all the objects whose identifier starts with the given string.
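A minimal sketch of that resolver behaviour, assuming an in-memory collection of known identifiers. The function name `resolve`, the returned labels, and the sample identifiers are hypothetical and only illustrate the prefix-matching idea; a real resolver would query the archive's storage instead.

```python
from typing import Iterable, List, Tuple


def resolve(identifier: str, known_ids: Iterable[str]) -> Tuple[str, List[str]]:
    """Resolve a full or shortened identifier.

    Returns a (kind, identifiers) pair, where kind is one of:
    - "object": the string is a full, canonical identifier;
    - "disambiguation": the string is a prefix of one or more identifiers;
    - "not_found": nothing starts with the given string.
    """
    ids = sorted(known_ids)
    if identifier in ids:
        # Exact match: this is the canonical form, no disambiguation needed.
        return ("object", [identifier])
    # Otherwise treat the input as a shortened identifier.
    matches = [candidate for candidate in ids if candidate.startswith(identifier)]
    if matches:
        return ("disambiguation", matches)
    return ("not_found", [])


# Hypothetical identifiers, for illustration only.
KNOWN = ["abc123ff", "abc1990e", "ffee0011"]

print(resolve("abc123ff", KNOWN))
print(resolve("abc1", KNOWN))
print(resolve("00", KNOWN))
```

In this sketch a shortened identifier always leads to the disambiguation list, even when only one object matches; redirecting straight to the object in that case would be an equally reasonable choice.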
Nov 6 2017
I agree with all the suggestions: the full id should definitely contain all this information.
Nevertheless, the sheer length of the result *may* turn out to be a blocker for adoption as a reference to software in the academic publishing framework. We can propose this, and see whether we also need to provide a shorter backup if there is strong negative feedback.
I'm not opposed to having explicit hash scheme names in the IDs—it is a good idea, only to be weighed against the cost in terms of length.
But we should also have schema version numbers, in case more radical changes are needed in the future, e.g., renaming the object types in the graph.
If we retain both suggestions, that would give:
- swh:1:revision:sha1_git:<git sha1 of a revision>
- swh:1:content:blake2s256:<blake2s256 of a content>
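To make the proposed layout concrete, here is a rough parsing sketch for identifiers of that shape. The `Identifier` type, the `parse_identifier` function, and the `HASH_HEX_LENGTHS` table are assumptions made for illustration, not an existing API; the only things taken from the proposal are the `swh` prefix, the schema version, the object type, the hash scheme name, and the hash.

```python
from typing import NamedTuple

# Assumed table of supported hash schemes and their hex digest lengths:
# SHA-1 digests are 160 bits (40 hex chars), BLAKE2s-256 digests are 256 bits (64).
HASH_HEX_LENGTHS = {"sha1_git": 40, "blake2s256": 64}


class Identifier(NamedTuple):
    version: int
    object_type: str
    hash_scheme: str
    object_hash: str


def parse_identifier(text: str) -> Identifier:
    """Split 'swh:<version>:<type>:<hash scheme>:<hex hash>' into its fields."""
    parts = text.split(":")
    if len(parts) != 5 or parts[0] != "swh":
        raise ValueError(f"not a swh identifier: {text!r}")
    _, version, object_type, hash_scheme, object_hash = parts
    if not (version.isascii() and version.isdigit()):
        raise ValueError(f"invalid schema version: {version!r}")
    expected_length = HASH_HEX_LENGTHS.get(hash_scheme)
    if expected_length is None:
        raise ValueError(f"unknown hash scheme: {hash_scheme!r}")
    is_hex = all(char in "0123456789abcdef" for char in object_hash)
    if len(object_hash) != expected_length or not is_hex:
        raise ValueError(f"invalid {hash_scheme} hash: {object_hash!r}")
    return Identifier(int(version), object_type, hash_scheme, object_hash)


# Placeholder hash, for illustration only.
example = parse_identifier("swh:1:revision:sha1_git:" + "a" * 40)
print(example.object_type, example.hash_scheme, len(example.object_hash))
```

Keeping the version as the second field means a parser can reject or dispatch on the schema version before interpreting the remaining fields, which is what makes later renames of object types manageable.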
I've been thinking about this in relation to T836.
Nov 5 2017
Oct 17 2017
Oct 16 2017
Oct 6 2017
Sep 15 2017
we're taking a different route for this now, based on @grouss WIP
Sep 5 2017
Sep 4 2017
Aug 4 2017
Jun 12 2017
May 9 2017
I don't know if I should finish cleaning up slow_loader for code review, since the hglib interface is so slow as to be next to useless.
May 7 2017
done (by @olasd last week)
Apr 24 2017
We now have a "full" content mirror on azure (the data for each 16th bucket is up to date as of the time the snapshot was taken).
Apr 7 2017
Mar 14 2017
systemd-journal-remote has been configured to receive logs on pergamon; systemd-journal-upload has been configured on all machines to push the journal messages to pergamon.
Mar 6 2017
Mar 1 2017
Feb 21 2017
Feb 15 2017
Feb 13 2017
Feb 12 2017
Feb 10 2017
Feb 9 2017
this was done by @olasd a while ago, as part of the work described at https://www.softwareheritage.org/2016/11/09/listing-47-million-repositories-refactoring-our-github-lister/
this is a duplicate of T120
Feb 8 2017
We now have an (empty) Diffusion repo for this: https://forge.softwareheritage.org/source/swh-loader-mercurial/
Feb 2 2017
*drum roll*
*tsssss*
Feb 1 2017
Jan 31 2017
Jan 30 2017
Jan 26 2017
Jan 25 2017
Jan 24 2017
Jan 20 2017
Jan 11 2017
Jan 9 2017
Jan 4 2017
Dec 6 2016
What remains is to update the azure workers with the latest indexer.
I'm on it.
Dec 2 2016
Nov 23 2016
Done as of T585.
Nov 22 2016
Nov 18 2016
Nov 15 2016
Sum up and status:
- Use of fossology's nomossa (nomos standalone version)
- DB schema updated (swh-storage)
- API endpoints opened to add/read nomos's output (swh-storage)
- New Indexer added (swh-indexer)
- Puppet manifest for that indexer (swh-profile, swh-site)
- Deploy DB upgrade (uffizi)
- Deploy swh-storage upgrade (uffizi)
- Deploy indexer on azure nodes
- Feed existing contents stored in azure to keep up with other indexers
- Update missing licenses in storage
- Feed failed contents (the ones that were tagged, license unknown) back in
Nov 5 2016
Updated previous comment. It's ok.
Nov 4 2016
Details of the current analysis are in P120