Re supported VCSs: sure, but I'd start with git, which is low-hanging fruit.
Sep 11 2020
Sep 8 2020
Do we mean something other than echo swh:1:rev:$(git rev-parse HEAD)?
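For reference, the one-liner above can be sketched in Python as follows. This is only an illustration: the function names revision_swhid and head_swhid are made up for this example and are not part of swh-identify or any other SWH tool. It assumes a SHA-1 git repository and a git binary on the PATH.

```python
import re
import subprocess

# A git commit id in a SHA-1 repository is 40 lowercase hex digits.
_SHA1_HEX = re.compile(r"[0-9a-f]{40}")


def revision_swhid(commit_id: str) -> str:
    """Build a core revision SWHID from a git commit id (hypothetical helper)."""
    commit_id = commit_id.strip().lower()
    if not _SHA1_HEX.fullmatch(commit_id):
        raise ValueError(f"not a 40-digit hex commit id: {commit_id!r}")
    return f"swh:1:rev:{commit_id}"


def head_swhid(repo_path: str = ".") -> str:
    """Ask git for HEAD in repo_path and wrap it as a SWHID (hypothetical helper)."""
    result = subprocess.run(
        ["git", "-C", repo_path, "rev-parse", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return revision_swhid(result.stdout)
```

Calling head_swhid() from inside a checkout should give the same string as the shell command, since it only prefixes the commit id with swh:1:rev:.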
Do we want support for other (supported) VCS?
This seems to have been addressed with the path qualifier in SWHIDs.
Closing.
(Please reopen if I'm missing something.)
As an update: a feature equivalent to this one has been implemented in swh-scanner by @DanSeraf.
I guess it would still be useful to have this in swh-identify as well (it seems like a natural need), but of course the code should not be duplicated.
Sep 4 2020
Sep 3 2020
Sep 2 2020
Aug 31 2020
Aug 28 2020
Update: I just tried out my script to use Disarchive as an alternative to pristine-tar. It was pretty easy since they both work similarly (from the outside, ofc).
Aug 27 2020
@samplet so it's an improvement over pristine-tar, great!
@lewo Great! The code is still “technical preview” quality, so there are some rough edges. Also, I’m willing to make architectural changes to better suit SWH. For instance, the whole “database” model might not be necessary if we want to store the metadata along with the files themselves. Feel free to ask for whatever help or changes you need! In the meantime, I will work on fixing some bugs and cleaning things up.
@samplet wow! that's pretty cool! Thank you ;)
I guess so, yes.
Well, storage is typed now, but some endpoints remain inconsistent (as T645#47156 makes explicit with a unification proposal, which is not done yet apart from content_get_data and content_get_metadata).
I guess this task can be closed, since this backfilling process will be part of the one we will run soon to fill the new Kafka cluster.
What's missing for this task to be closed?
Aug 26 2020
@rdicosmo the full ID is a better choice. Thanks!
I tried this as well, before trying pristine-tar. An issue I ran into is that storing the value of fields isn't enough, because there are multiple ways to represent them in tar (e.g. numbers are usually null-terminated strings, so \x00\x00\x00\x00, 0\x00\x00\x00, 0000, and \x00123 represent the same value).
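To make the ambiguity concrete, here is a minimal sketch of a C-string-style reader for a tar numeric field. It is a simplified model written for this comment (the name parse_tar_number is made up, and real tar implementations also handle things like the GNU base-256 encoding), not the parser used by pristine-tar or Disarchive.

```python
def parse_tar_number(field: bytes) -> int:
    """Decode a tar numeric header field the way a C-string reader would.

    Simplified model: keep the bytes before the first NUL, drop space
    padding, and read what is left as octal. An empty string counts as 0.
    """
    text = field.split(b"\x00", 1)[0].strip(b" ")
    if not text:
        return 0
    return int(text.decode("ascii"), 8)


# The four byte strings from the comment above, all decoding to the same value.
variants = [b"\x00\x00\x00\x00", b"0\x00\x00\x00", b"0000", b"\x00123"]
decoded = {parse_tar_number(v) for v in variants}
```

Since all four inputs collapse to one integer, the integer alone cannot tell you which bytes to write back, which is why the raw representation has to be recorded somewhere to rebuild the archive bit for bit.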
Hi all,
Aug 19 2020
Aug 7 2020
Aug 6 2020
I started looking into this, using https://nix-community.github.io/nixpkgs-swh/sources-unstable.json as a source of archive files.
Aug 5 2020
- {object}_missing for object in {content, directory, revision, release, snapshot}
I'll type with what we have right now; that will simplify the next diffs, which introduce type changes.
It also demonstrates the inconsistencies we have right now.
Aug 4 2020
Current status: the endpoints related to origin, origin-visit and origin-visit-status are now done, both read and write.
What remains is to align and type the reading endpoints for the DAG model objects (content, directory, revision, release, snapshot).