
Save Code Now
Active · Public

Details

Description

Tasks related to on-demand source code archival, also known as "save code now" (cf. http://save.softwareheritage.org/)

Recent Activity

Wed, May 18

anlambert closed T4240: Do not accept save requests with credentials leaked in the origin URL as Resolved.

This has been implemented and deployed.

Wed, May 18, 1:37 PM · Save Code Now, Web app

Tue, May 17

anlambert added a revision to T4240: Do not accept save requests with credentials leaked in the origin URL: D7843: origin_save: Reject save request when origin URL contains a password.
Tue, May 17, 5:34 PM · Save Code Now, Web app
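
The check named in D7843 (rejecting a save request when the origin URL contains a password) can be illustrated with a minimal sketch. This is not the code from the revision: the function names below are hypothetical, and the sketch only assumes that credentials appear in the standard user:password@host form, which Python's urllib.parse can extract.

```python
from urllib.parse import urlsplit


def origin_url_has_password(origin_url: str) -> bool:
    """Return True if the URL embeds a non-empty password (user:password@host)."""
    return bool(urlsplit(origin_url).password)


def check_save_request(origin_url: str) -> None:
    """Hypothetical validation hook: refuse origin URLs that leak a password."""
    if origin_url_has_password(origin_url):
        raise ValueError("origin URL contains a password; save request rejected")
```

A URL with only a user name (user@host) passes this check, since no password is exposed.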

Fri, May 13

anlambert triaged T4240: Do not accept save requests with credentials leaked in the origin URL as Normal priority.
Fri, May 13, 3:49 PM · Save Code Now, Web app

Mon, May 2

anlambert added a project to T4055: Save Code Now does not work for GitHub origins while GitHub's API is unavailable: Easy hack.
Mon, May 2, 5:30 PM · Easy hack, Save Code Now, Web app

Mar 25 2022

vlorentz closed T4051: Add forge now / Save code now: tabs are not preserved with the browser history as Resolved.
Mar 25 2022, 10:13 AM · Web app, Add Forge Now, Save Code Now

Mar 23 2022

bchauvet added projects to T3775: Dealing with repositories with contents that produces hash conflicts (example included from GitLab): Roadmap 2022, meta-task.
Mar 23 2022, 4:39 PM · meta-task, Roadmap 2022, Save Code Now, Origin-GitLab

Mar 17 2022

bchauvet added a parent task for T4051: Add forge now / Save code now: tabs are not preserved with the browser history: T4047: User interface to submit and follow Add forge now requests.
Mar 17 2022, 4:32 PM · Web app, Add Forge Now, Save Code Now
vlorentz triaged T4055: Save Code Now does not work for GitHub origins while GitHub's API is unavailable as Low priority.
Mar 17 2022, 12:14 PM · Easy hack, Save Code Now, Web app
vlorentz moved T4051: Add forge now / Save code now: tabs are not preserved with the browser history from MVP/weekly backlog to Backlog on the Add Forge Now board.
Mar 17 2022, 12:01 PM · Web app, Add Forge Now, Save Code Now
vlorentz moved T4051: Add forge now / Save code now: tabs are not preserved with the browser history from Backlog to MVP/weekly backlog on the Add Forge Now board.
Mar 17 2022, 12:01 PM · Web app, Add Forge Now, Save Code Now
vlorentz triaged T4052: [Add forge now] Handle errors from the server when submitting the form as High priority.
Mar 17 2022, 12:01 PM · Add Forge Now, Web app
anlambert added a comment to T4051: Add forge now / Save code now: tabs are not preserved with the browser history.

rDWAPPS39bab96e9f0484e04fe837b5162d889b8040980a fixed the issue for save code now but is not deployed yet.

Mar 17 2022, 12:00 PM · Web app, Add Forge Now, Save Code Now
vlorentz triaged T4051: Add forge now / Save code now: tabs are not preserved with the browser history as Normal priority.
Mar 17 2022, 11:58 AM · Web app, Add Forge Now, Save Code Now
vlorentz created T4051: Add forge now / Save code now: tabs are not preserved with the browser history.
Mar 17 2022, 11:58 AM · Web app, Add Forge Now, Save Code Now

Mar 16 2022

vlorentz added a revision to T3969: Save Code Now: blacklist *.github.io: D7362: Save Code Now: Rewrite github.io URLs.
Mar 16 2022, 4:07 PM · Save Code Now, Web app
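
As a rough illustration of what rewriting github.io URLs could look like, the sketch below maps a GitHub Pages address to the repository that presumably backs it. The mapping rule (https://<user>.github.io/<repo> becomes https://github.com/<user>/<repo>) and the function name are assumptions made for this example; D7362 may handle more cases or behave differently.

```python
from urllib.parse import urlsplit


def rewrite_github_io_url(origin_url: str) -> str:
    """Map https://<user>.github.io/<repo>/... to https://github.com/<user>/<repo>.

    URLs that do not match this shape are returned unchanged.
    """
    parts = urlsplit(origin_url)
    host = (parts.hostname or "").lower()
    suffix = ".github.io"
    if not host.endswith(suffix):
        return origin_url
    user = host[: -len(suffix)]
    # Keep only non-empty path segments; the first one is taken as the repository name.
    path_segments = [seg for seg in parts.path.split("/") if seg]
    if not user or not path_segments:
        return origin_url
    return f"https://github.com/{user}/{path_segments[0]}"
```

A user-level Pages site with no path (https://<user>.github.io/) is left untouched here, because the repository name cannot be derived from the URL alone.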

Mar 10 2022

anlambert added a revision to T3923: Include submodules recursively when saving git repositories: D7332: loader: Add support for submodules discovering.
Mar 10 2022, 3:25 PM · Git loader, Save Code Now

Mar 7 2022

vlorentz added a comment to T3923: Include submodules recursively when saving git repositories.

Somewhat related task: T3311

Mar 7 2022, 11:03 AM · Git loader, Save Code Now

Feb 21 2022

vlorentz lowered the priority of T3969: Save Code Now: blacklist *.github.io from Normal to Low.
Feb 21 2022, 8:35 PM · Save Code Now, Web app
vlorentz triaged T3969: Save Code Now: blacklist *.github.io as Normal priority.
Feb 21 2022, 8:35 PM · Save Code Now, Web app
anlambert added a project to T3923: Include submodules recursively when saving git repositories: Git loader.
Feb 21 2022, 1:32 PM · Git loader, Save Code Now

Feb 10 2022

zack triaged T3923: Include submodules recursively when saving git repositories as Normal priority.
Feb 10 2022, 7:44 AM · Git loader, Save Code Now

Jan 28 2022

anlambert closed T3848: Activate saved origin browse link only when loading data are available in database as Resolved.

Fixed and deployed.

Jan 28 2022, 4:56 PM · Save Code Now, Web app
ardumont removed a project from T3082: Improve Save Code Now handling: System administration.
Jan 28 2022, 3:39 PM · Save Code Now, meta-task, Roadmap 2021, Web app

Jan 27 2022

douardda closed T1481: add metric to monitor "save code now" efficiency as Resolved.

We can always improve it, but we now have a decent dashboard, so let's consider this done.

Jan 27 2022, 1:45 PM · Save Code Now, System administration, Metrics/monitoring
douardda closed T1481: add metric to monitor "save code now" efficiency, a subtask of T3082: Improve Save Code Now handling, as Resolved.
Jan 27 2022, 1:45 PM · Save Code Now, meta-task, Roadmap 2021, Web app

Jan 26 2022

anlambert added a revision to T3848: Activate saved origin browse link only when loading data are available in database: D7044: assets/save: Add origin browse link only for valid visit statuses.
Jan 26 2022, 4:34 PM · Save Code Now, Web app

Jan 19 2022

ardumont closed T3846: Unstuck running save code now origins as Resolved.

8 origins remain in the running state because they are currently being ingested (they are large origins).

Jan 19 2022, 6:12 PM · Save Code Now

Jan 17 2022

ardumont updated the task description for T3458: save code now: Requests are not getting updated from time to time.
Jan 17 2022, 6:05 PM · Save Code Now
ardumont added a comment to T3846: Unstuck running save code now origins.
  • Those tasks were updated with the status failed [1] [2]
  • Their associated scheduler task ids were archived recently [3]
  • They have been rescheduled through the save code now cli [4]
  • Their ingestion is ongoing and their associated status should update once done
Jan 17 2022, 2:04 PM · Save Code Now
ardumont changed the status of T3846: Unstuck running save code now origins from Open to Work in Progress.
Jan 17 2022, 12:22 PM · Save Code Now

Jan 14 2022

anlambert triaged T3848: Activate saved origin browse link only when loading data are available in database as Normal priority.
Jan 14 2022, 11:42 AM · Save Code Now, Web app

Jan 13 2022

ardumont updated the task description for T3846: Unstuck running save code now origins.
Jan 13 2022, 11:38 AM · Save Code Now
ardumont updated the task description for T3846: Unstuck running save code now origins.
Jan 13 2022, 11:38 AM · Save Code Now
ardumont updated the task description for T3846: Unstuck running save code now origins.
Jan 13 2022, 11:36 AM · Save Code Now
ardumont updated the task description for T3846: Unstuck running save code now origins.
Jan 13 2022, 11:34 AM · Save Code Now
ardumont renamed T3846: Unstuck running save code now origins from Unstuck running save code now origin to Unstuck running save code now origins.
Jan 13 2022, 11:34 AM · Save Code Now
ardumont triaged T3846: Unstuck running save code now origins as High priority.
Jan 13 2022, 11:34 AM · Save Code Now

Jan 10 2022

ardumont placed T3082: Improve Save Code Now handling up for grabs.
Jan 10 2022, 10:06 AM · Save Code Now, meta-task, Roadmap 2021, Web app

Dec 13 2021

olasd added a comment to T3775: Dealing with repositories with contents that produces hash conflicts (example included from GitLab).

The main issue that prevents us from archiving these objects today is that our object storage still uses a plain sha1 as primary key (hence the current uniqueness constraint on the sha1 field of the content table in our primary graph storage).

Dec 13 2021, 4:39 PM · meta-task, Roadmap 2022, Save Code Now, Origin-GitLab
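
To make the constraint concrete, here is a toy sketch of an index keyed on sha1 alone. It is not swh-storage code: the class and method below are hypothetical stand-ins, and only the hash values are real, copied from the HashCollision traceback reported on this task. Two contents with the same sha1 but different sha256 cannot both be stored under such a key.

```python
from typing import Dict


class HashCollision(Exception):
    """Raised when two distinct contents share the same primary-key hash."""


class Sha1KeyedStore:
    """Toy content index keyed on sha1 only (illustrative, not the real storage)."""

    def __init__(self) -> None:
        self._by_sha1: Dict[str, Dict[str, str]] = {}

    def content_add(self, hashes: Dict[str, str]) -> None:
        existing = self._by_sha1.get(hashes["sha1"])
        # Same sha1 but differing other hashes means two distinct contents collide.
        if existing is not None and existing != hashes:
            raise HashCollision("sha1", hashes["sha1"], [existing, hashes])
        self._by_sha1[hashes["sha1"]] = hashes


# Hash values taken from the traceback on this task (sha1 identical, sha256 different).
CONTENT_A = {
    "sha1": "38762cf7f55934b34d179ae6a4c80cadccbb7f0a",
    "sha256": "d4488775d29bdef7993367d541064dbdda50d383f89f0aa13a6ff2e0894ba5ff",
}
CONTENT_B = {
    "sha1": "38762cf7f55934b34d179ae6a4c80cadccbb7f0a",
    "sha256": "2bb787a73e37352f92383abe7e2902936d1059ad9f1ba6daaa9c1e58ee6970d0",
}
```

Adding CONTENT_A and then CONTENT_B to the same store raises HashCollision, while re-adding CONTENT_A is accepted. Keying on a stronger hash, or on a combination of hashes, would let both contents coexist.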

Dec 8 2021

rdicosmo added a comment to T3775: Dealing with repositories with contents that produces hash conflicts (example included from GitLab).

Updated task name and description to reflect the findings from @anlambert

Dec 8 2021, 3:16 PM · meta-task, Roadmap 2022, Save Code Now, Origin-GitLab
rdicosmo renamed T3775: Dealing with repositories with contents that produces hash conflicts (example included from GitLab) from Check failures in save code now requests for GitLab to Dealing with repositories with contents that produces hash conflicts (example included from GitLab).
Dec 8 2021, 3:16 PM · meta-task, Roadmap 2022, Save Code Now, Origin-GitLab

Dec 7 2021

anlambert added a comment to T3775: Dealing with repositories with contents that produces hash conflicts (example included from GitLab).

It is possible that more key cryptographic software will include these files.

Dec 7 2021, 4:56 PM · meta-task, Roadmap 2022, Save Code Now, Origin-GitLab
rdicosmo added a comment to T3775: Dealing with repositories with contents that produces hash conflicts (example included from GitLab).

Thanks a lot @anlambert for looking into this.
It is possible that more key cryptographic software will include these files.
We need a strategy to handle this situation. Could you add this example to the SWHID v2 task?

Dec 7 2021, 3:36 PM · meta-task, Roadmap 2022, Save Code Now, Origin-GitLab
anlambert added a comment to T3775: Dealing with repositories with contents that produces hash conflicts (example included from GitLab).

Loading the repository in the Docker environment gives me the following traceback:

swh-loader_1                        | [2021-12-07 12:08:06,876: INFO/MainProcess] Task swh.loader.git.tasks.UpdateGitRepository[a1aa28c0-1cb0-4e2a-8ae2-720ba6ca439e] received
swh-loader_1                        | [2021-12-07 12:08:06,877: INFO/MainProcess] loader@b11bfd448510 ready.
swh-loader_1                        | [2021-12-07 12:08:06,957: DEBUG/ForkPoolWorker-1] Loading config file /loader.yml
swh-loader_1                        | [2021-12-07 12:08:09,904: INFO/ForkPoolWorker-1] Load origin 'https://gitlab.com/sequoia-pgp/sequoia' with type 'git'
swh-loader_1                        | [2021-12-07 12:08:09,908: DEBUG/ForkPoolWorker-1] Transport url to communicate with server: https://gitlab.com/sequoia-pgp/sequoia
swh-loader_1                        | [2021-12-07 12:08:09,909: DEBUG/ForkPoolWorker-1] Client Urllib3HttpGitClient('https://gitlab.com/sequoia-pgp/sequoia/', dumb=None) to fetch pack at /sequoia-pgp/sequoia
swh-loader_1                        | [2021-12-07 12:08:10,422: DEBUG/ForkPoolWorker-1] local_heads_count=0
swh-loader_1                        | [2021-12-07 12:08:10,422: DEBUG/ForkPoolWorker-1] remote_heads_count=1821
swh-loader_1                        | [2021-12-07 12:08:10,422: DEBUG/ForkPoolWorker-1] wanted_refs_count=1821
swh-loader_1                        | [2021-12-07 12:09:17,112: ERROR/ForkPoolWorker-1] Loading failure, updating to `failed` status
swh-loader_1                        | Traceback (most recent call last):
swh-loader_1                        |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/api/client.py", line 29, in raise_for_status
swh-loader_1                        |     super().raise_for_status(response)
swh-loader_1                        |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 344, in raise_for_status
swh-loader_1                        |     raise exception from None
swh-loader_1                        | swh.core.api.RemoteException: <RemoteException 500 HashCollision: ['sha1', '38762cf7f55934b34d179ae6a4c80cadccbb7f0a', [{'blake2s256': '30e4bd16c3f98e74429d237c19ca9def702e5720cb124cb4b92e74f989aaf116', 'sha1': '38762cf7f55934b34d179ae6a4c80cadccbb7f0a', 'sha1_git': 'b621eeccd5c7edac9b7dcba35a8d5afd075e24f2', 'sha256': 'd4488775d29bdef7993367d541064dbdda50d383f89f0aa13a6ff2e0894ba5ff'}, {'blake2s256': '8f677e3214ca8b2acad91884a1571ef3f12b786501f9a6bedfd6239d82095dd2', 'sha1': '38762cf7f55934b34d179ae6a4c80cadccbb7f0a', 'sha1_git': 'ba9aaa145ccd24ef760cf31c74d8f7ca1a2e47b0', 'sha256': '2bb787a73e37352f92383abe7e2902936d1059ad9f1ba6daaa9c1e58ee6970d0'}]]>
swh-loader_1                        | 
swh-loader_1                        | During handling of the above exception, another exception occurred:
swh-loader_1                        | 
swh-loader_1                        | Traceback (most recent call last):
swh-loader_1                        |   File "/src/swh-loader-core/swh/loader/core/loader.py", line 339, in load
swh-loader_1                        |     self.store_data()
swh-loader_1                        |   File "/src/swh-loader-core/swh/loader/core/loader.py", line 458, in store_data
swh-loader_1                        |     self.storage.directory_add([directory])
swh-loader_1                        |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/proxies/buffer.py", line 171, in directory_add
swh-loader_1                        |     stats = self.object_add(directories, object_type="directory", keys=["id"])
swh-loader_1                        |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/proxies/buffer.py", line 224, in object_add
swh-loader_1                        |     return self.flush()
swh-loader_1                        |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/proxies/buffer.py", line 286, in flush
swh-loader_1                        |     stats = add_fn(list(batch))
swh-loader_1                        |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/proxies/filter.py", line 58, in content_add
swh-loader_1                        |     [x for x in content if x.sha256 in contents_to_add]
swh-loader_1                        |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/api/client.py", line 45, in content_add
swh-loader_1                        |     return self.post("content/add", {"content": content})
swh-loader_1                        |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 278, in post
swh-loader_1                        |     return self._decode_response(response)
swh-loader_1                        |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 354, in _decode_response
swh-loader_1                        |     self.raise_for_status(response)
swh-loader_1                        |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/api/client.py", line 39, in raise_for_status
swh-loader_1                        |     raise HashCollision(*e.args[0]["args"])
swh-loader_1                        | swh.storage.exc.HashCollision: ('sha1', '38762cf7f55934b34d179ae6a4c80cadccbb7f0a', [{'sha256': 'd4488775d29bdef7993367d541064dbdda50d383f89f0aa13a6ff2e0894ba5ff', 'sha1': '38762cf7f55934b34d179ae6a4c80cadccbb7f0a', 'sha1_git': 'b621eeccd5c7edac9b7dcba35a8d5afd075e24f2', 'blake2s256': '30e4bd16c3f98e74429d237c19ca9def702e5720cb124cb4b92e74f989aaf116'}, {'sha256': '2bb787a73e37352f92383abe7e2902936d1059ad9f1ba6daaa9c1e58ee6970d0', 'sha1': '38762cf7f55934b34d179ae6a4c80cadccbb7f0a', 'sha1_git': 'ba9aaa145ccd24ef760cf31c74d8f7ca1a2e47b0', 'blake2s256': '8f677e3214ca8b2acad91884a1571ef3f12b786501f9a6bedfd6239d82095dd2'}])
Dec 7 2021, 1:23 PM · meta-task, Roadmap 2022, Save Code Now, Origin-GitLab
rdicosmo triaged T3775: Dealing with repositories with contents that produces hash conflicts (example included from GitLab) as High priority.
Dec 7 2021, 12:18 PM · meta-task, Roadmap 2022, Save Code Now, Origin-GitLab

Dec 3 2021

ardumont placed T1481: add metric to monitor "save code now" efficiency up for grabs.
Dec 3 2021, 3:57 PM · Save Code Now, System administration, Metrics/monitoring
ardumont moved T1481: add metric to monitor "save code now" efficiency from deployed/landed/monitoring to Backlog on the System administration board.
Dec 3 2021, 3:57 PM · Save Code Now, System administration, Metrics/monitoring

Nov 9 2021

olasd added a comment to T3286: Use journal clients for webapp and deposit to subscribe to events.

Moving towards event notifications and stream processing instead of polling sounds worthwhile, before the amount of polling becomes more important than managing the event notification mechanism. For the two systems you've mentioned, I think we're really, really far away from that, but it's still worth considering a way to do event-driven notifications properly, so we don't have to rush it through.

Nov 9 2021, 5:26 PM · Save Code Now, SWORD deposit, Web app

Oct 26 2021

anlambert closed T3690: Save code now ui rejects valid git sourceforge origin as Resolved.

Fixed and deployed. Origin https://git.code.sf.net/u/bsomervi/hamlib.git has been successfully submitted and is currently being loaded into the archive. Closing this.

Oct 26 2021, 5:45 PM · Save Code Now
anlambert added a revision to T3690: Save code now ui rejects valid git sourceforge origin: D6551: assets/save: Fix sourceforge git repository URL validation.
Oct 26 2021, 11:30 AM · Save Code Now