GitLab returns very little data while logged out, so we won't be able to collect much. This seems to differ from their documentation, so I opened a ticket: https://gitlab.com/gitlab-org/gitlab/-/issues/361952
Aug 24 2022
Aug 22 2022
Jul 19 2022
Jul 18 2022
Jul 13 2022
Jul 12 2022
Jul 6 2022
Jul 5 2022
Jun 2 2022
May 30 2022
May 17 2022
May 13 2022
May 10 2022
Currently can't do it on GitLab while logged out: https://gitlab.com/gitlab-org/gitlab/-/issues/361952
May 3 2022
Apr 29 2022
Apr 28 2022
Apr 27 2022
Apr 22 2022
WIP
Apr 21 2022
I started working on this design. We'll see if it needs to change later.
Apr 19 2022
In summary, we would archive everything with priority "high" or "mid", as well as the "license" and "main language" fields, as they are all easy to fetch and store.
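The selection rule above can be sketched as follows. This is only an illustration: the `Field` type, the example field names other than "license" and "main language", and the priorities assigned to them are assumptions made for the example, not the breakdown from the task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """A candidate metadata field (hypothetical type, for illustration only)."""
    name: str
    priority: str  # one of "high", "mid", "low"


# Fields archived regardless of priority, as stated in the summary.
ALWAYS_ARCHIVED = {"license", "main language"}


def fields_to_archive(fields):
    """Return the names of fields to archive, preserving input order."""
    return [
        f.name
        for f in fields
        if f.priority in {"high", "mid"} or f.name in ALWAYS_ARCHIVED
    ]


# Example priorities below are made up for the sake of the sketch.
example = [
    Field("description", "high"),
    Field("topics", "mid"),
    Field("license", "low"),
    Field("main language", "low"),
    Field("star count", "low"),
]
selected = fields_to_archive(example)
```

With these made-up priorities, "star count" is the only field left out.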
Apr 18 2022
id | name [url] | type | methods | auth/throttle | code_source | metadata_source | metadata_conformance | etl_code | status |
Apr 11 2022
In T1739#82939, @olasd wrote: Yes, all these are good points. As long as forges don't provide a way of loading the metadata in bulk, it makes sense to do it at the same time as loading.
In T1739#82920, @vlorentz wrote: The original idea for this was to have separate tasks to fetch metadata, so that loaders did not have forge-specific code to fetch metadata.
However, the idea of loading metadata from the loader is more appealing the more I think about it:
- Metadata are fetched at about the same time as we snapshot code, which would allow showing more consistent states of repositories
- Active repositories automatically have their metadata fetched more often than inactive ones
- We don't have one more moving part to monitor and schedule
- This allows the Git loader to know a new repo is a "forge fork" of another one before it starts loading, so it can do an incremental load
To me the advantages are strong, especially points 1 and 4.
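A minimal sketch of what points 1 and 4 could look like in a loader, assuming a forge-specific `fetch_metadata` callable is passed in. All names here (`ForgeMetadata`, `LoadPlan`, `plan_load`, `forked_from`) are hypothetical and do not come from the existing loader code.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set


@dataclass
class ForgeMetadata:
    """Metadata returned by a forge for one repository (hypothetical shape)."""
    origin_url: str
    forked_from: Optional[str] = None  # URL of the parent repo, if a forge fork


@dataclass
class LoadPlan:
    origin_url: str
    metadata: ForgeMetadata
    incremental_base: Optional[str]  # origin to load incrementally from, if any


def plan_load(
    origin_url: str,
    fetch_metadata: Callable[[str], ForgeMetadata],
    archived_origins: Set[str],
) -> LoadPlan:
    """Fetch forge metadata right before loading, then decide how to load."""
    # Point 1: metadata is fetched at about the same time as the code snapshot.
    metadata = fetch_metadata(origin_url)

    # Point 4: if the repo is a fork of an origin we already archived,
    # the loader can start from that origin instead of from scratch.
    base = None
    if metadata.forked_from is not None and metadata.forked_from in archived_origins:
        base = metadata.forked_from

    return LoadPlan(origin_url=origin_url, metadata=metadata, incremental_base=base)


def fake_fetch(url: str) -> ForgeMetadata:
    # Stand-in for a real forge API call; the URLs are placeholders.
    return ForgeMetadata(origin_url=url, forked_from="https://forge.example/upstream/repo")


plan = plan_load(
    "https://forge.example/user/repo",
    fake_fetch,
    archived_origins={"https://forge.example/upstream/repo"},
)
```

Passing the fetcher in as a parameter keeps the forge-specific code out of the loading logic itself, which partly addresses the concern behind the original separate-tasks idea.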
Looks like *what* we want to collect is a solved issue.
Mar 25 2022
Mar 23 2022
Feb 23 2022
Feb 22 2022
Feb 21 2022
Feb 7 2022
Jan 18 2022
Jan 13 2022
Nov 22 2021
Nov 10 2021
Nov 8 2021
Oct 21 2021
I think we (or, I should say, I) missed the ambiguity of the concept of origin.
Oct 8 2021
(this task is a dependency of T3638, because author != committer in revisions created by opam, and releases don't have a "committer" field, so switching to releases would lose this data)
Sep 6 2021
Sep 3 2021
Sep 2 2021
I updated the task with a breakdown of the cost of getting each info.