Jan 8 2023
Dec 1 2022
Oct 28 2022
Oct 19 2022
Aug 11 2022
Aug 5 2022
We discussed internally what to do with inactive repositories.
We reached a decision to move unused repos to object storage.
Once implemented, they will still be accessible but take a bit longer to access after a long period of inactivity.
Aug 4 2022
Looks like there are many more repos that should be visitable but aren't:
updated query running:
As usual, I'm uneasy with the (general) idea of manually handling some repositories to absorb one bit of lag. This will only increase lag in another area, which we will then want to cover next. Rinse, repeat.
answer: 4755
I am currently running a query to find how many origins are over one year overdue for a visit:
Jul 6 2022
GitLab returns very little data while logged out, so we won't be able to collect much. This seems to differ from their documentation, so I opened a ticket https://gitlab.com/gitlab-org/gitlab/-/issues/361952
Jun 21 2022
Jun 7 2022
May 30 2022
May 13 2022
May 10 2022
Currently can't do it on GitLab while logged out: https://gitlab.com/gitlab-org/gitlab/-/issues/361952
May 3 2022
Apr 29 2022
Apr 28 2022
Apr 21 2022
Apr 11 2022
Mar 23 2022
Feb 18 2022
Dec 13 2021
The main issue that prevents us from archiving these objects today is that our object storage still uses a plain sha1 as its primary key (hence the current uniqueness constraint on the sha1 field of the content table in our primary graph storage).
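To illustrate why a sha1-only key blocks archiving such objects, here is a minimal sketch. It is not the actual storage code: `Content`, `KeyCollision`, `Sha1KeyedStore` and `CompositeKeyedStore` are hypothetical names, and the hash values are short made-up placeholders rather than real digests.

```python
from typing import Dict, NamedTuple, Set


class Content(NamedTuple):
    """Hypothetical record holding two of the hashes of one object."""
    sha1: str
    sha256: str


class KeyCollision(Exception):
    """Raised when a key is already bound to a different object."""


class Sha1KeyedStore:
    """Toy store that uses the sha1 alone as primary key."""

    def __init__(self) -> None:
        self._by_sha1: Dict[str, Content] = {}

    def add(self, content: Content) -> None:
        existing = self._by_sha1.get(content.sha1)
        if existing is not None and existing != content:
            # Same sha1 but another hash differs: the key cannot hold both.
            raise KeyCollision(f"sha1 {content.sha1} is already taken")
        self._by_sha1[content.sha1] = content

    def __len__(self) -> int:
        return len(self._by_sha1)


class CompositeKeyedStore:
    """Toy store keyed on the full tuple of hashes."""

    def __init__(self) -> None:
        self._objects: Set[Content] = set()

    def add(self, content: Content) -> None:
        self._objects.add(content)

    def __len__(self) -> int:
        return len(self._objects)


# Placeholder values (not real digests): same sha1, different sha256.
first = Content(sha1="aaaa", sha256="1111")
second = Content(sha1="aaaa", sha256="2222")

sha1_store = Sha1KeyedStore()
sha1_store.add(first)
try:
    sha1_store.add(second)
    rejected = False
except KeyCollision:
    rejected = True

composite_store = CompositeKeyedStore()
composite_store.add(first)
composite_store.add(second)

print(rejected, len(sha1_store), len(composite_store))
```

With a key built from more than the sha1, both objects can coexist; with the sha1 alone, the second one has to be rejected.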
Dec 8 2021
Updated task name and description to reflect the findings from @anlambert
Dec 7 2021
Thanks a lot @anlambert for looking into this.
It is possible that more key cryptographic software will include these files.
We need a strategy to handle this situation. Could you add this example to the SWHID v2 task?
Loading the repository in the Docker environment gives me the following traceback:
swh-loader_1 | [2021-12-07 12:08:06,876: INFO/MainProcess] Task swh.loader.git.tasks.UpdateGitRepository[a1aa28c0-1cb0-4e2a-8ae2-720ba6ca439e] received
swh-loader_1 | [2021-12-07 12:08:06,877: INFO/MainProcess] loader@b11bfd448510 ready.
swh-loader_1 | [2021-12-07 12:08:06,957: DEBUG/ForkPoolWorker-1] Loading config file /loader.yml
swh-loader_1 | [2021-12-07 12:08:09,904: INFO/ForkPoolWorker-1] Load origin 'https://gitlab.com/sequoia-pgp/sequoia' with type 'git'
swh-loader_1 | [2021-12-07 12:08:09,908: DEBUG/ForkPoolWorker-1] Transport url to communicate with server: https://gitlab.com/sequoia-pgp/sequoia
swh-loader_1 | [2021-12-07 12:08:09,909: DEBUG/ForkPoolWorker-1] Client Urllib3HttpGitClient('https://gitlab.com/sequoia-pgp/sequoia/', dumb=None) to fetch pack at /sequoia-pgp/sequoia
swh-loader_1 | [2021-12-07 12:08:10,422: DEBUG/ForkPoolWorker-1] local_heads_count=0
swh-loader_1 | [2021-12-07 12:08:10,422: DEBUG/ForkPoolWorker-1] remote_heads_count=1821
swh-loader_1 | [2021-12-07 12:08:10,422: DEBUG/ForkPoolWorker-1] wanted_refs_count=1821
swh-loader_1 | [2021-12-07 12:09:17,112: ERROR/ForkPoolWorker-1] Loading failure, updating to `failed` status
swh-loader_1 | Traceback (most recent call last):
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/api/client.py", line 29, in raise_for_status
swh-loader_1 |     super().raise_for_status(response)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 344, in raise_for_status
swh-loader_1 |     raise exception from None
swh-loader_1 | swh.core.api.RemoteException: <RemoteException 500 HashCollision: ['sha1', '38762cf7f55934b34d179ae6a4c80cadccbb7f0a', [{'blake2s256': '30e4bd16c3f98e74429d237c19ca9def702e5720cb124cb4b92e74f989aaf116', 'sha1': '38762cf7f55934b34d179ae6a4c80cadccbb7f0a', 'sha1_git': 'b621eeccd5c7edac9b7dcba35a8d5afd075e24f2', 'sha256': 'd4488775d29bdef7993367d541064dbdda50d383f89f0aa13a6ff2e0894ba5ff'}, {'blake2s256': '8f677e3214ca8b2acad91884a1571ef3f12b786501f9a6bedfd6239d82095dd2', 'sha1': '38762cf7f55934b34d179ae6a4c80cadccbb7f0a', 'sha1_git': 'ba9aaa145ccd24ef760cf31c74d8f7ca1a2e47b0', 'sha256': '2bb787a73e37352f92383abe7e2902936d1059ad9f1ba6daaa9c1e58ee6970d0'}]]>
swh-loader_1 |
swh-loader_1 | During handling of the above exception, another exception occurred:
swh-loader_1 |
swh-loader_1 | Traceback (most recent call last):
swh-loader_1 |   File "/src/swh-loader-core/swh/loader/core/loader.py", line 339, in load
swh-loader_1 |     self.store_data()
swh-loader_1 |   File "/src/swh-loader-core/swh/loader/core/loader.py", line 458, in store_data
swh-loader_1 |     self.storage.directory_add([directory])
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/proxies/buffer.py", line 171, in directory_add
swh-loader_1 |     stats = self.object_add(directories, object_type="directory", keys=["id"])
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/proxies/buffer.py", line 224, in object_add
swh-loader_1 |     return self.flush()
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/proxies/buffer.py", line 286, in flush
swh-loader_1 |     stats = add_fn(list(batch))
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/proxies/filter.py", line 58, in content_add
swh-loader_1 |     [x for x in content if x.sha256 in contents_to_add]
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/api/client.py", line 45, in content_add
swh-loader_1 |     return self.post("content/add", {"content": content})
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 278, in post
swh-loader_1 |     return self._decode_response(response)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 354, in _decode_response
swh-loader_1 |     self.raise_for_status(response)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/api/client.py", line 39, in raise_for_status
swh-loader_1 |     raise HashCollision(*e.args[0]["args"])
swh-loader_1 | swh.storage.exc.HashCollision: ('sha1', '38762cf7f55934b34d179ae6a4c80cadccbb7f0a', [{'sha256': 'd4488775d29bdef7993367d541064dbdda50d383f89f0aa13a6ff2e0894ba5ff', 'sha1': '38762cf7f55934b34d179ae6a4c80cadccbb7f0a', 'sha1_git': 'b621eeccd5c7edac9b7dcba35a8d5afd075e24f2', 'blake2s256': '30e4bd16c3f98e74429d237c19ca9def702e5720cb124cb4b92e74f989aaf116'}, {'sha256': '2bb787a73e37352f92383abe7e2902936d1059ad9f1ba6daaa9c1e58ee6970d0', 'sha1': '38762cf7f55934b34d179ae6a4c80cadccbb7f0a', 'sha1_git': 'ba9aaa145ccd24ef760cf31c74d8f7ca1a2e47b0', 'blake2s256': '8f677e3214ca8b2acad91884a1571ef3f12b786501f9a6bedfd6239d82095dd2'}])
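To make the exception payload easier to read, the small sketch below compares the two hash dictionaries reported in the traceback and shows which algorithms agree. The dictionaries are copied from the log above; `compare_hashes` is a hypothetical helper written for this illustration, not part of swh.storage.

```python
from typing import Dict

# The two colliding entries, copied from the HashCollision arguments above.
first = {
    "sha1": "38762cf7f55934b34d179ae6a4c80cadccbb7f0a",
    "sha1_git": "b621eeccd5c7edac9b7dcba35a8d5afd075e24f2",
    "sha256": "d4488775d29bdef7993367d541064dbdda50d383f89f0aa13a6ff2e0894ba5ff",
    "blake2s256": "30e4bd16c3f98e74429d237c19ca9def702e5720cb124cb4b92e74f989aaf116",
}
second = {
    "sha1": "38762cf7f55934b34d179ae6a4c80cadccbb7f0a",
    "sha1_git": "ba9aaa145ccd24ef760cf31c74d8f7ca1a2e47b0",
    "sha256": "2bb787a73e37352f92383abe7e2902936d1059ad9f1ba6daaa9c1e58ee6970d0",
    "blake2s256": "8f677e3214ca8b2acad91884a1571ef3f12b786501f9a6bedfd6239d82095dd2",
}


def compare_hashes(a: Dict[str, str], b: Dict[str, str]) -> Dict[str, bool]:
    """Map each algorithm present in both dicts to True if the digests match."""
    return {algo: a[algo] == b[algo] for algo in sorted(a.keys() & b.keys())}


result = compare_hashes(first, second)
for algo, same in result.items():
    print(f"{algo}: {'identical' if same else 'different'}")
# blake2s256: different
# sha1: identical
# sha1_git: different
# sha256: different
```

Only the plain sha1 matches; the other three digests tell the two objects apart, which is why the collision is reported on the 'sha1' key.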
Oct 2 2021
Sep 23 2021
Sep 22 2021
Status on this:
- Added the foss heptapod instance for listing [1]
- Ensure the run went smoothly [2]
- Stop some 12 workers, keeping only 4 so that the instance is not overloaded [3]
- Trigger those new origins for ingestion
- When done (or almost), start the other workers back up [4]
- Update the archive logs when ingestion is done
Sep 17 2021
Thanks for the heads up, gonna simplify stuff then.
Repositories of the hg-git type are served as regular Mercurial repositories, so they can safely be listed as Mercurial repositories.