
Re-run deposit loaders tasks with corrupt metadata
Closed, Migrated (Edits Locked)

Description

We had a bug last week that caused the deposit loader to corrupt metadata. It was introduced in swh-deposit by D7211 (first released in v0.17.0) and fixed in swh-loader-core by D7270 (first released in v2.6.0). We should re-load the deposits affected by this issue; all the data needed is still in the deposit DB.

Event Timeline

vlorentz triaged this task as High priority. Mar 7 2022, 9:39 AM
vlorentz created this task.
vlorentz updated the task description.

@vlorentz I'm missing information.

  • The fixes per impacted module?
  • What's the module version?
  • Are they deployed already? (If not, we probably need a deployment task for those first.)

See the updated task description

According to grafana [1], the deposit v0.17.0 and the loader-core [2] v2.5.4 were released
around 2022-02-25 12:47:03 [3].

[1] grafana is usually tagged when a release happens

[2] loader.core includes the deposit loader

[3] https://grafana.softwareheritage.org/goto/Az06nIPnk?orgId=1

Deposits in question (no upper bound though):

17:16:13 softwareheritage-deposit@belvedere:5432=> select id, load_task_id, reception_date, origin_url from deposit where reception_date > '2022-02-25' and load_task_id is not null order by reception_date;
+------+--------------+-------------------------------+---------------------------------------------------------------------------+
|  id  | load_task_id |        reception_date         |                                origin_url                                 |
+------+--------------+-------------------------------+---------------------------------------------------------------------------+
| 2085 | 407157795    | 2022-02-25 09:13:58.319933+00 | https://www.softwareheritage.org/check-deposit-2022-02-25T09:13:58.187228 |
| 2086 | 407163060    | 2022-02-25 11:01:53.288991+00 | https://hal.archives-ouvertes.fr/halshs-03526772                          |
| 2087 | 407165557    | 2022-02-25 11:54:29.004236+00 | https://hal.archives-ouvertes.fr/halshs-03561443                          |
| 2088 | 407165591    | 2022-02-25 11:55:14.647872+00 | https://hal.archives-ouvertes.fr/halshs-03561448                          |
| 2089 | 407169150    | 2022-02-25 13:12:29.491824+00 | https://www.softwareheritage.org/check-deposit-2022-02-25T13:12:29.267490 |
| 2090 | 407236102    | 2022-02-26 12:50:37.928105+00 | https://www.softwareheritage.org/check-deposit-2022-02-26T12:50:37.585596 |
| 2091 | 407309947    | 2022-02-27 12:28:43.070734+00 | https://www.softwareheritage.org/check-deposit-2022-02-27T12:28:42.913099 |
| 2092 | 407379705    | 2022-02-28 12:06:58.278931+00 | https://www.softwareheritage.org/check-deposit-2022-02-28T12:06:58.177279 |
| 2093 | 407445912    | 2022-03-01 11:45:09.835913+00 | https://www.softwareheritage.org/check-deposit-2022-03-01T11:45:09.706099 |
| 2094 | 407504412    | 2022-03-02 10:38:36.773328+00 | https://hal.archives-ouvertes.fr/hal-03592676                             |
| 2095 | 407506097    | 2022-03-02 11:23:16.452148+00 | https://www.softwareheritage.org/check-deposit-2022-03-02T11:23:16.351001 |
| 2096 | 407506910    | 2022-03-02 11:41:39.351296+00 | https://hal.archives-ouvertes.fr/hal-03592602                             |
| 2097 | 407565731    | 2022-03-03 11:01:23.521034+00 | https://www.softwareheritage.org/check-deposit-2022-03-03T11:01:23.348885 |
| 2098 | 407624130    | 2022-03-04 10:39:29.553787+00 | https://www.softwareheritage.org/check-deposit-2022-03-04T10:39:29.419994 |
| 2107 | 407643393    | 2022-03-04 18:28:46.408218+00 | https://www.softwareheritage.org/check-deposit-2022-03-04T18:28:46.309128 |
| 2109 | 407644857    | 2022-03-04 19:01:09.336105+00 | https://www.softwareheritage.org/check-deposit-2022-03-04T19:01:09.254140 |
| 2110 | 407645069    | 2022-03-04 19:05:19.914166+00 | https://www.softwareheritage.org/check-deposit-2022-03-04T19:05:19.837104 |
| 2111 | 407703264    | 2022-03-05 18:25:40.21825+00  | https://www.softwareheritage.org/check-deposit-2022-03-05T18:25:40.056679 |
| 2112 | 407758670    | 2022-03-06 18:18:03.578598+00 | https://www.softwareheritage.org/check-deposit-2022-03-06T18:18:03.434582 |
| 2113 | 407810770    | 2022-03-07 17:30:59.32006+00  | https://www.softwareheritage.org/check-deposit-2022-03-07T17:30:59.186357 |
| 2114 | 407864454    | 2022-03-08 16:44:03.261524+00 | https://www.softwareheritage.org/check-deposit-2022-03-08T16:44:03.110733 |
| 2115 | 407917592    | 2022-03-09 14:40:15.460855+00 | https://hal.archives-ouvertes.fr/hal-03279853                             |
| 2116 | 407920283    | 2022-03-09 15:56:58.620788+00 | https://www.softwareheritage.org/check-deposit-2022-03-09T15:56:58.537250 |
| 2117 | 407972080    | 2022-03-10 15:10:01.750027+00 | https://www.softwareheritage.org/check-deposit-2022-03-10T15:10:01.646333 |
| 2118 | 408024298    | 2022-03-11 14:23:05.220188+00 | https://www.softwareheritage.org/check-deposit-2022-03-11T14:23:05.105181 |
| 2119 | 408077288    | 2022-03-12 13:36:11.4087+00   | https://www.softwareheritage.org/check-deposit-2022-03-12T13:36:11.267814 |
| 2120 | 408133382    | 2022-03-13 13:35:45.329405+00 | https://www.softwareheritage.org/check-deposit-2022-03-13T13:35:45.138161 |
| 2121 | 408191367    | 2022-03-14 13:35:19.390942+00 | https://www.softwareheritage.org/check-deposit-2022-03-14T13:35:19.284882 |
| 2122 | 408192531    | 2022-03-14 14:05:26.424093+00 | https://hal.archives-ouvertes.fr/hal-03597952                             |
+------+--------------+-------------------------------+---------------------------------------------------------------------------+
(29 rows)

Time: 9.642 ms
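To get a quick view of which origins are involved, the rows can be grouped by origin hostname. This is only a sketch: `ROWS` holds a handful of rows copied from the result above, and `group_by_host` is a hypothetical helper, not something from swh-deposit.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# (deposit id, origin_url) pairs copied from a few rows of the query result above.
ROWS = [
    (2085, "https://www.softwareheritage.org/check-deposit-2022-02-25T09:13:58.187228"),
    (2086, "https://hal.archives-ouvertes.fr/halshs-03526772"),
    (2087, "https://hal.archives-ouvertes.fr/halshs-03561443"),
    (2094, "https://hal.archives-ouvertes.fr/hal-03592676"),
    (2122, "https://hal.archives-ouvertes.fr/hal-03597952"),
]


def group_by_host(rows):
    """Map each origin hostname to the sorted list of deposit ids using it."""
    groups = defaultdict(list)
    for deposit_id, origin_url in rows:
        groups[urlsplit(origin_url).hostname].append(deposit_id)
    return {host: sorted(ids) for host, ids in groups.items()}


print(group_by_host(ROWS))
```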
ardumont changed the task status from Open to Work in Progress. Mar 14 2022, 7:00 PM
ardumont moved this task from Weekly backlog to in-progress on the System administration board.

I'm confused by the deployment state.

production is ok (deposit server: v0.17.1, loader.core: 2.6.0 [1]), staging seems out of
sync (deposit server: v0.17.0, loader.core: 2.6.0 [2]).

It's not that important but keeping them in sync sounds better long term.
I'll realign this but I won't necessarily reschedule the staging deposits.

[1] prod:

root@pergamon:~# clush -b -w @swh-workers -w moma "DEBIAN_FRONTEND=noninteractive dpkg -l python3-swh.deposit* python3-swh.loader.core"
---------------
worker[01-16] (16)
---------------
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                       Version               Architecture Description
+++-==========================-=====================-============-====================================
ii  python3-swh.deposit        0.17.0-1~swh1~bpo10+1 all          Software Heritage Deposit Server
ii  python3-swh.deposit.client 0.17.0-1~swh1~bpo10+1 all          Software Heritage Deposit Api Client
ii  python3-swh.deposit.loader 0.17.0-1~swh1~bpo10+1 all          Software Heritage Deposit Loader
ii  python3-swh.loader.core    2.6.0-1~swh1~bpo10+1  all          Software Heritage Loader Core
---------------
moma
---------------
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                       Version               Architecture Description
+++-==========================-=====================-============-====================================
ii  python3-swh.deposit        0.17.1-1~swh1~bpo10+1 all          Software Heritage Deposit Server
ii  python3-swh.deposit.client 0.17.1-1~swh1~bpo10+1 all          Software Heritage Deposit Api Client
ii  python3-swh.deposit.loader 0.17.1-1~swh1~bpo10+1 all          Software Heritage Deposit Loader
un  python3-swh.loader.core    <none>                <none>       (no description available)

[2] staging:

root@pergamon:~# clush -b -w @staging-loader-workers -w deposit.internal.staging.swh.network "DEBIAN_FRONTEND=noninteractive dpkg -l python3-swh.deposit* python3-swh.loader.core"
---------------
worker[0-3].internal.staging.swh.network (4)
---------------
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                       Version               Architecture Description
+++-==========================-=====================-============-====================================
ii  python3-swh.deposit        0.17.1-1~swh1~bpo10+1 all          Software Heritage Deposit Server
ii  python3-swh.deposit.client 0.17.1-1~swh1~bpo10+1 all          Software Heritage Deposit Api Client
ii  python3-swh.deposit.loader 0.17.1-1~swh1~bpo10+1 all          Software Heritage Deposit Loader
ii  python3-swh.loader.core    2.6.0-1~swh1~bpo10+1  all          Software Heritage Loader Core
---------------
deposit.internal.staging.swh.network
---------------
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                       Version               Architecture Description
+++-==========================-=====================-============-====================================
ii  python3-swh.deposit        0.17.0-1~swh1~bpo10+1 all          Software Heritage Deposit Server
ii  python3-swh.deposit.client 0.17.0-1~swh1~bpo10+1 all          Software Heritage Deposit Api Client
ii  python3-swh.deposit.loader 0.17.0-1~swh1~bpo10+1 all          Software Heritage Deposit Loader
un  python3-swh.loader.core    <none>                <none>       (no description available)
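Spotting the drift by eye in the output above is tedious, so here is a rough sketch of how the comparison could be automated. It assumes the `dpkg -l` text has already been captured per host; `installed_versions` and `mismatches` are hypothetical helpers, and the two samples are lines copied from the moma and staging deposit server outputs.

```python
def installed_versions(dpkg_output):
    """Return {package: version} for lines flagged 'ii' (installed) in `dpkg -l` output."""
    versions = {}
    for line in dpkg_output.splitlines():
        fields = line.split()
        # Header lines and 'un' (not installed) entries are skipped.
        if len(fields) >= 3 and fields[0] == "ii":
            versions[fields[1]] = fields[2]
    return versions


def mismatches(a, b):
    """Packages installed on both hosts but at different versions."""
    return {pkg: (a[pkg], b[pkg]) for pkg in a.keys() & b.keys() if a[pkg] != b[pkg]}


PROD_SERVER = """\
ii  python3-swh.deposit        0.17.1-1~swh1~bpo10+1 all          Software Heritage Deposit Server
un  python3-swh.loader.core    <none>                <none>       (no description available)
"""
STAGING_SERVER = """\
ii  python3-swh.deposit        0.17.0-1~swh1~bpo10+1 all          Software Heritage Deposit Server
un  python3-swh.loader.core    <none>                <none>       (no description available)
"""

print(mismatches(installed_versions(PROD_SERVER), installed_versions(STAGING_SERVER)))
```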

Of course, it no longer works [1]... Where would the fun in that be?
I'll have a look later on (but don't let that be a blocker... if someone else wants to unstick it, that's fine ;)

[1]

swhdeposit@deposit:~$ swh deposit admin --config-file /etc/softwareheritage/deposit/server.yml --platform production deposit reschedule --deposit-id 938
Traceback (most recent call last):
  File "/usr/bin/swh", line 33, in <module>
    sys.exit(load_entry_point('swh.core==2.2.2', 'console_scripts', 'swh')())
  File "/usr/lib/python3/dist-packages/swh/core/cli/__init__.py", line 185, in main
    return swh(auto_envvar_prefix="SWH")
  File "/usr/lib/python3/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  [Previous line repeated 1 more time]
  File "/usr/lib/python3/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/lib/python3/dist-packages/swh/deposit/cli/admin.py", line 284, in adm_deposit_reschedule
    [task_id], status="next_run_not_scheduled", next_run=datetime.now()
  File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 182, in meth_
    return self._post(meth._endpoint_path, post_data)
  File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 263, in _post
    data = self._encode_data(data)
  File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 282, in _encode_data
    return encode_data(data, extra_encoders=self.extra_type_encoders)
  File "/usr/lib/python3/dist-packages/swh/core/api/serializers.py", line 126, in encode_data_client
    return msgpack_dumps(data, extra_encoders=extra_encoders)
  File "/usr/lib/python3/dist-packages/swh/core/api/serializers.py", line 275, in msgpack_dumps
    default=encode_types,
  File "/usr/lib/python3/dist-packages/msgpack/__init__.py", line 35, in packb
    return Packer(**kwargs).pack(o)
  File "msgpack/_packer.pyx", line 286, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 292, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 289, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 225, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 283, in msgpack._cmsgpack.Packer._pack
TypeError: can not serialize 'datetime.datetime' object
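The traceback ends with msgpack refusing a `datetime.datetime`, and the failing call passes `next_run=datetime.now()`, which is a naive datetime (no timezone attached). Whether that is the actual root cause, and what the eventual patch does, is not stated in this task. The snippet below only illustrates the naive/aware distinction; `is_naive` is a hypothetical helper and the UTC variant is an assumed alternative, not the confirmed fix.

```python
from datetime import datetime, timezone


def is_naive(dt):
    """True when the datetime carries no usable timezone information."""
    return dt.tzinfo is None or dt.tzinfo.utcoffset(dt) is None


# What the traceback shows being passed as next_run.
naive_now = datetime.now()
# Assumed alternative: an explicit, timezone-aware UTC timestamp.
aware_now = datetime.now(tz=timezone.utc)

print(is_naive(naive_now), is_naive(aware_now))
```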

Hot-fixing the staging deposit instance with val's patch seems to work ok now.

swhdeposit@deposit:~$ swh deposit admin --config-file /etc/softwareheritage/deposit/server.yml --platform production deposit reschedule --deposit-id 939
swhdeposit@deposit:~$ swh deposit admin --config-file /etc/softwareheritage/deposit/server.yml --platform production deposit reschedule --deposit-id 940

Not much changed for those deposits at the end of the loading.
The deposit status got reset and the loading happened.

Rescheduled the deposit ids in [941:952] ∪ [955:978] [1].

Results before and after the rescheduling [2], for comparison. Some deposits are still in
"verified" status for some reason... (scheduling stuff).

I'll check a bit later again.

[1]

swhdeposit@deposit:~$ for id in `seq 941 952` `seq 955 978`; do swh deposit admin --config-file /etc/softwareheritage/deposit/server.yml --platform production deposit reschedule --deposit-id $id; done
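For the record, the same id ranges and command lines can be generated as a dry run before launching anything. This is a sketch only: the CLI arguments are copied from the shell loop above, while `deposit_ids` and `reschedule_command` are hypothetical helpers that print the commands rather than execute them.

```python
import shlex

CONFIG_FILE = "/etc/softwareheritage/deposit/server.yml"


def deposit_ids():
    """Deposit ids in [941, 952] and [955, 978], both ranges inclusive like `seq`."""
    return list(range(941, 953)) + list(range(955, 979))


def reschedule_command(deposit_id):
    """Argument vector for rescheduling one deposit, mirroring the shell loop."""
    return [
        "swh", "deposit", "admin",
        "--config-file", CONFIG_FILE,
        "--platform", "production",
        "deposit", "reschedule",
        "--deposit-id", str(deposit_id),
    ]


for deposit_id in deposit_ids():
    # Dry run: print the command instead of executing it.
    print(shlex.join(reschedule_command(deposit_id)))
```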

[2] Before

15:09:31 swh-deposit@db1:5432=> select id, reception_date, status, swhid from deposit where reception_date > '2022-02-25' and load_task_id is not null and status='done' order by reception_date;
+-----+-------------------------------+--------+----------------------------------------------------+
| id  |        reception_date         | status |                       swhid                        |
+-----+-------------------------------+--------+----------------------------------------------------+
| 939 | 2022-02-25 13:03:55.397734+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 940 | 2022-02-25 13:08:02.986674+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 941 | 2022-02-25 13:10:56.660217+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 942 | 2022-02-26 12:03:11.587671+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 943 | 2022-02-27 10:55:33.071236+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 944 | 2022-02-28 09:47:52.072206+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 945 | 2022-03-01 08:40:08.637126+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 946 | 2022-03-01 10:11:21.292108+00 | done   | swh:1:dir:7267827f0c6ae331b20596ecdb71614bb86b47f0 |
| 947 | 2022-03-01 10:49:59.862959+00 | done   | swh:1:dir:7267827f0c6ae331b20596ecdb71614bb86b47f0 |
| 948 | 2022-03-01 12:38:55.412746+00 | done   | swh:1:dir:5e80d0672771d1d1ac037a37b33af9d6260efef2 |
| 949 | 2022-03-01 13:38:58.87221+00  | done   | swh:1:dir:5e80d0672771d1d1ac037a37b33af9d6260efef2 |
| 950 | 2022-03-01 13:43:10.602682+00 | done   | swh:1:dir:a5003f717459702d8e1a35e634832a1f0d23e5ce |
| 951 | 2022-03-02 07:32:30.859831+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 952 | 2022-03-02 15:22:28.090742+00 | done   | swh:1:dir:09b0e419a997c87ba4da06991bab867d568a2076 |
| 955 | 2022-03-03 06:24:43.783543+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 956 | 2022-03-04 05:16:51.966295+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 957 | 2022-03-04 16:48:50.738495+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 958 | 2022-03-04 17:24:46.894111+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 959 | 2022-03-04 17:29:11.729435+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 960 | 2022-03-04 18:07:53.033515+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 961 | 2022-03-05 18:06:55.046618+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 962 | 2022-03-06 17:07:08.389622+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 963 | 2022-03-07 16:54:59.823794+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 964 | 2022-03-08 16:43:01.348875+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 965 | 2022-03-09 11:17:30.319781+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 966 | 2022-03-09 14:31:42.062191+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 967 | 2022-03-09 14:37:30.339323+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 968 | 2022-03-09 14:44:17.355722+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 969 | 2022-03-09 15:02:44.768178+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 970 | 2022-03-09 15:48:16.431623+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 971 | 2022-03-09 15:53:22.332808+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 972 | 2022-03-09 16:30:56.037631+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 973 | 2022-03-10 16:18:53.270816+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 974 | 2022-03-11 10:18:43.741886+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 975 | 2022-03-11 16:06:49.603781+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 976 | 2022-03-12 15:59:32.34099+00  | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 977 | 2022-03-13 15:52:16.475602+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 978 | 2022-03-14 15:45:04.350296+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
+-----+-------------------------------+--------+----------------------------------------------------+
(38 rows)

Time: 8.187 ms

[2] After

15:21:51 swh-deposit@db1:5432=> select id, reception_date, status, swhid from deposit where reception_date > '2022-02-25' and load_task_id is not null order by reception_date;
+-----+-------------------------------+----------+----------------------------------------------------+
| id  |        reception_date         |  status  |                       swhid                        |
+-----+-------------------------------+----------+----------------------------------------------------+
| 938 | 2022-02-25 12:21:06.264691+00 | verified | (null)                                             |
| 939 | 2022-02-25 13:03:55.397734+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 940 | 2022-02-25 13:08:02.986674+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 941 | 2022-02-25 13:10:56.660217+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 942 | 2022-02-26 12:03:11.587671+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 943 | 2022-02-27 10:55:33.071236+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 944 | 2022-02-28 09:47:52.072206+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 945 | 2022-03-01 08:40:08.637126+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 946 | 2022-03-01 10:11:21.292108+00 | done     | swh:1:dir:7267827f0c6ae331b20596ecdb71614bb86b47f0 |
| 947 | 2022-03-01 10:49:59.862959+00 | done     | swh:1:dir:7267827f0c6ae331b20596ecdb71614bb86b47f0 |
| 948 | 2022-03-01 12:38:55.412746+00 | done     | swh:1:dir:5e80d0672771d1d1ac037a37b33af9d6260efef2 |
| 949 | 2022-03-01 13:38:58.87221+00  | done     | swh:1:dir:5e80d0672771d1d1ac037a37b33af9d6260efef2 |
| 950 | 2022-03-01 13:43:10.602682+00 | done     | swh:1:dir:a5003f717459702d8e1a35e634832a1f0d23e5ce |
| 951 | 2022-03-02 07:32:30.859831+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 952 | 2022-03-02 15:22:28.090742+00 | done     | swh:1:dir:09b0e419a997c87ba4da06991bab867d568a2076 |
| 955 | 2022-03-03 06:24:43.783543+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 956 | 2022-03-04 05:16:51.966295+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 957 | 2022-03-04 16:48:50.738495+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 958 | 2022-03-04 17:24:46.894111+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 959 | 2022-03-04 17:29:11.729435+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 960 | 2022-03-04 18:07:53.033515+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 961 | 2022-03-05 18:06:55.046618+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 962 | 2022-03-06 17:07:08.389622+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 963 | 2022-03-07 16:54:59.823794+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 964 | 2022-03-08 16:43:01.348875+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 965 | 2022-03-09 11:17:30.319781+00 | done     | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 966 | 2022-03-09 14:31:42.062191+00 | verified | (null)                                             |
| 967 | 2022-03-09 14:37:30.339323+00 | verified | (null)                                             |
| 968 | 2022-03-09 14:44:17.355722+00 | verified | (null)                                             |
| 969 | 2022-03-09 15:02:44.768178+00 | verified | (null)                                             |
| 970 | 2022-03-09 15:48:16.431623+00 | verified | (null)                                             |
| 971 | 2022-03-09 15:53:22.332808+00 | verified | (null)                                             |
| 972 | 2022-03-09 16:30:56.037631+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 973 | 2022-03-10 16:18:53.270816+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 974 | 2022-03-11 10:18:43.741886+00 | verified | (null)                                             |
| 975 | 2022-03-11 16:06:49.603781+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 976 | 2022-03-12 15:59:32.34099+00  | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 977 | 2022-03-13 15:52:16.475602+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 978 | 2022-03-14 15:45:04.350296+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
+-----+-------------------------------+----------+----------------------------------------------------+
(39 rows)

Time: 7.894 ms
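To make the leftovers explicit, the "After" result can be filtered for anything that is not "done". A small sketch, with the statuses transcribed by hand from the table above and `not_loaded` as a hypothetical helper:

```python
# Statuses transcribed from the "After" table: every listed deposit is "done"
# except the ones shown as "verified".
LISTED_IDS = list(range(938, 953)) + list(range(955, 979))
VERIFIED_IDS = {938, 966, 967, 968, 969, 970, 971, 974}
STATUSES = {i: ("verified" if i in VERIFIED_IDS else "done") for i in LISTED_IDS}


def not_loaded(statuses):
    """Sorted ids of deposits whose status is anything other than "done"."""
    return sorted(i for i, status in statuses.items() if status != "done")


print(not_loaded(STATUSES))
```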

They finally got scheduled but I still saw them without a swhid in the deposit backend.

I think those stuck in limbo got hit by the sentry issue SWH-DEPOSIT-2N
[1] [2]

They are currently disabled in the scheduler [3].

[1] https://sentry.softwareheritage.org/share/issue/3ac50437ebc74a8e8cae14e2068948ff/

[2] https://sentry.softwareheritage.org/share/issue/c6587364ecb24b59a8eed835a2750726/

[3]

16:15:46 swh-scheduler@db1:5432=> select * from task where type='load-deposit' and status='disabled';
+----------+--------------+------------------------------------------------------------------------------------------------------------------+-------------------------------+------------------+----------+---------+--------------+----------+
|    id    |     type     |                                                    arguments                                                     |           next_run            | current_interval |  status  | policy  | retries_left | priority |
+----------+--------------+------------------------------------------------------------------------------------------------------------------+-------------------------------+------------------+----------+---------+--------------+----------+
| 30183878 | load-deposit | {"args": [], "kwargs": {"url": "https://inria.halpreprod.archives-ouvertes.fr/hal-01243573", "deposit_id": 969}} | 2022-03-15 14:50:36.794404+00 | 1 day            | disabled | oneshot |            0 | (null)   |
| 30184464 | load-deposit | {"args": [], "kwargs": {"url": "https://inria.halpreprod.archives-ouvertes.fr/hal-01243573", "deposit_id": 970}} | 2022-03-15 14:50:36.794404+00 | 1 day            | disabled | oneshot |            0 | (null)   |
| 30184531 | load-deposit | {"args": [], "kwargs": {"url": "https://inria.halpreprod.archives-ouvertes.fr/hal-01243573", "deposit_id": 971}} | 2022-03-15 14:50:36.794404+00 | 1 day            | disabled | oneshot |            0 | (null)   |
| 30214716 | load-deposit | {"args": [], "kwargs": {"url": "https://inria.halpreprod.archives-ouvertes.fr/hal-01243573", "deposit_id": 974}} | 2022-03-15 14:50:36.794404+00 | 1 day            | disabled | oneshot |            0 | (null)   |
+----------+--------------+------------------------------------------------------------------------------------------------------------------+-------------------------------+------------------+----------+---------+--------------+----------+
(4 rows)
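The deposit ids of the disabled tasks sit inside the JSON `arguments` column, so they can be pulled out mechanically for the next reschedule. A minimal sketch, assuming the column values have been exported as strings (copied here from the rows above); `disabled_deposit_ids` is a hypothetical helper.

```python
import json

# `arguments` column values copied from the scheduler rows above.
ARGUMENTS = [
    '{"args": [], "kwargs": {"url": "https://inria.halpreprod.archives-ouvertes.fr/hal-01243573", "deposit_id": 969}}',
    '{"args": [], "kwargs": {"url": "https://inria.halpreprod.archives-ouvertes.fr/hal-01243573", "deposit_id": 970}}',
    '{"args": [], "kwargs": {"url": "https://inria.halpreprod.archives-ouvertes.fr/hal-01243573", "deposit_id": 971}}',
    '{"args": [], "kwargs": {"url": "https://inria.halpreprod.archives-ouvertes.fr/hal-01243573", "deposit_id": 974}}',
]


def disabled_deposit_ids(arguments):
    """Extract and sort the deposit_id of each task from its JSON arguments."""
    return sorted(json.loads(raw)["kwargs"]["deposit_id"] for raw in arguments)


print(disabled_deposit_ids(ARGUMENTS))
```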

@vlorentz I've tentatively opened D7363 to fix ^.

After hot-patching the staging deposit with that diff and rescheduling those deposits, they are now ok [1].

[1]

17:59:40 swh-deposit@db1:5432=> select id, type, reception_date, status, swhid from deposit where reception_date > '2022-02-25' and load_task_id is not null order by reception_date;
+-----+------+-------------------------------+--------+----------------------------------------------------+
| id  | type |        reception_date         | status |                       swhid                        |
+-----+------+-------------------------------+--------+----------------------------------------------------+
| 938 | code | 2022-02-25 12:21:06.264691+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 939 | code | 2022-02-25 13:03:55.397734+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 940 | code | 2022-02-25 13:08:02.986674+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 941 | code | 2022-02-25 13:10:56.660217+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 942 | code | 2022-02-26 12:03:11.587671+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 943 | code | 2022-02-27 10:55:33.071236+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 944 | code | 2022-02-28 09:47:52.072206+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 945 | code | 2022-03-01 08:40:08.637126+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 946 | code | 2022-03-01 10:11:21.292108+00 | done   | swh:1:dir:7267827f0c6ae331b20596ecdb71614bb86b47f0 |
| 947 | code | 2022-03-01 10:49:59.862959+00 | done   | swh:1:dir:7267827f0c6ae331b20596ecdb71614bb86b47f0 |
| 948 | code | 2022-03-01 12:38:55.412746+00 | done   | swh:1:dir:5e80d0672771d1d1ac037a37b33af9d6260efef2 |
| 949 | code | 2022-03-01 13:38:58.87221+00  | done   | swh:1:dir:5e80d0672771d1d1ac037a37b33af9d6260efef2 |
| 950 | code | 2022-03-01 13:43:10.602682+00 | done   | swh:1:dir:a5003f717459702d8e1a35e634832a1f0d23e5ce |
| 951 | code | 2022-03-02 07:32:30.859831+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 952 | code | 2022-03-02 15:22:28.090742+00 | done   | swh:1:dir:09b0e419a997c87ba4da06991bab867d568a2076 |
| 955 | code | 2022-03-03 06:24:43.783543+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 956 | code | 2022-03-04 05:16:51.966295+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 957 | code | 2022-03-04 16:48:50.738495+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 958 | code | 2022-03-04 17:24:46.894111+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 959 | code | 2022-03-04 17:29:11.729435+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 960 | code | 2022-03-04 18:07:53.033515+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 961 | code | 2022-03-05 18:06:55.046618+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 962 | code | 2022-03-06 17:07:08.389622+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 963 | code | 2022-03-07 16:54:59.823794+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 964 | code | 2022-03-08 16:43:01.348875+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 965 | code | 2022-03-09 11:17:30.319781+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 966 | code | 2022-03-09 14:31:42.062191+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 967 | code | 2022-03-09 14:37:30.339323+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 968 | code | 2022-03-09 14:44:17.355722+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 969 | code | 2022-03-09 15:02:44.768178+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 970 | code | 2022-03-09 15:48:16.431623+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 971 | code | 2022-03-09 15:53:22.332808+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 972 | code | 2022-03-09 16:30:56.037631+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 973 | code | 2022-03-10 16:18:53.270816+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 974 | code | 2022-03-11 10:18:43.741886+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 975 | code | 2022-03-11 16:06:49.603781+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 976 | code | 2022-03-12 15:59:32.34099+00  | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 977 | code | 2022-03-13 15:52:16.475602+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 978 | code | 2022-03-14 15:45:04.350296+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 979 | code | 2022-03-15 15:37:47.127079+00 | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 980 | code | 2022-03-16 08:43:42.338695+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 981 | code | 2022-03-16 09:00:01.562112+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 982 | code | 2022-03-16 10:12:26.166215+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
| 983 | code | 2022-03-16 14:38:20.51879+00  | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 984 | code | 2022-03-16 14:46:22.702811+00 | done   | swh:1:dir:c4993c872593e960dc84e4430dbbfbc34fd706d0 |
+-----+------+-------------------------------+--------+----------------------------------------------------+
(45 rows)

So production-wise, I've:

  • hot-patched the production deposit with the changes listed here.
  • rescheduled the deposits [1] [2]

It's done now.

[1] Before:

18:02:51 softwareheritage-deposit@belvedere:5432=> select id, type, reception_date, status, swhid from deposit where reception_date > '2022-02-25' and load_task_id is not null order by reception_date;
+------+------+-------------------------------+----------+----------------------------------------------------+
|  id  | type |        reception_date         |  status  |                       swhid                        |
+------+------+-------------------------------+----------+----------------------------------------------------+
| 2085 | code | 2022-02-25 09:13:58.319933+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2086 | code | 2022-02-25 11:01:53.288991+00 | done     | swh:1:dir:8cce425de4c89c6e291a39981a91714ce8a030ae |
| 2087 | code | 2022-02-25 11:54:29.004236+00 | done     | swh:1:dir:8ec648b4c887766e13bf3977e95411ce4c52656e |
| 2088 | code | 2022-02-25 11:55:14.647872+00 | done     | swh:1:dir:e0c6306ccca069405bba3f8e1bf2a7aa8a374dd8 |
| 2089 | code | 2022-02-25 13:12:29.491824+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2090 | code | 2022-02-26 12:50:37.928105+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2091 | code | 2022-02-27 12:28:43.070734+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2092 | code | 2022-02-28 12:06:58.278931+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2093 | code | 2022-03-01 11:45:09.835913+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2094 | code | 2022-03-02 10:38:36.773328+00 | done     | swh:1:dir:a37aaceff4433cc3f33f408d7a4268b912b33188 |
| 2095 | code | 2022-03-02 11:23:16.452148+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2096 | code | 2022-03-02 11:41:39.351296+00 | done     | swh:1:dir:5d6f2287a9e2e08aaad3f31f833c82d50bcf3c88 |
| 2097 | code | 2022-03-03 11:01:23.521034+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2098 | code | 2022-03-04 10:39:29.553787+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2107 | code | 2022-03-04 18:28:46.408218+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2109 | code | 2022-03-04 19:01:09.336105+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2110 | code | 2022-03-04 19:05:19.914166+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2111 | code | 2022-03-05 18:25:40.21825+00  | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2112 | code | 2022-03-06 18:18:03.578598+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2113 | code | 2022-03-07 17:30:59.32006+00  | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2114 | code | 2022-03-08 16:44:03.261524+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2115 | code | 2022-03-09 14:40:15.460855+00 | done     | swh:1:dir:a07dd9184bea77677716c30cbde8157761d1bd25 |
| 2116 | code | 2022-03-09 15:56:58.620788+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2117 | code | 2022-03-10 15:10:01.750027+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2118 | code | 2022-03-11 14:23:05.220188+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2119 | code | 2022-03-12 13:36:11.4087+00   | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2120 | code | 2022-03-13 13:35:45.329405+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2121 | code | 2022-03-14 13:35:19.390942+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2122 | code | 2022-03-14 14:05:26.424093+00 | verified | (null)                                             |
| 2123 | code | 2022-03-15 13:34:50.424512+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2124 | code | 2022-03-16 12:41:03.367607+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
+------+------+-------------------------------+----------+----------------------------------------------------+
(31 rows)

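[2] After the rescheduling (the query below is missing a comma after load_task_id, so the column labelled reception_date actually shows the load_task_id):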

18:13:41 softwareheritage-deposit@belvedere:5432=> select id, type, load_task_id reception_date, status, swhid from deposit where reception_date > '2022-02-25' and load_task_id is not null order by reception_date;
+------+------+----------------+--------+----------------------------------------------------+
|  id  | type | reception_date | status |                       swhid                        |
+------+------+----------------+--------+----------------------------------------------------+
| 2085 | code | 407157795      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2086 | code | 407163060      | done   | swh:1:dir:8cce425de4c89c6e291a39981a91714ce8a030ae |
| 2087 | code | 407165557      | done   | swh:1:dir:8ec648b4c887766e13bf3977e95411ce4c52656e |
| 2088 | code | 407165591      | done   | swh:1:dir:e0c6306ccca069405bba3f8e1bf2a7aa8a374dd8 |
| 2089 | code | 407169150      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2090 | code | 407236102      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2091 | code | 407309947      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2092 | code | 407379705      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2093 | code | 407445912      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2094 | code | 407504412      | done   | swh:1:dir:a37aaceff4433cc3f33f408d7a4268b912b33188 |
| 2095 | code | 407506097      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2096 | code | 407506910      | done   | swh:1:dir:5d6f2287a9e2e08aaad3f31f833c82d50bcf3c88 |
| 2097 | code | 407565731      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2098 | code | 407624130      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2107 | code | 407643393      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2109 | code | 407644857      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2110 | code | 407645069      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2111 | code | 407703264      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2112 | code | 407758670      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2113 | code | 407810770      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2114 | code | 407864454      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2115 | code | 407917592      | done   | swh:1:dir:a07dd9184bea77677716c30cbde8157761d1bd25 |
| 2116 | code | 407920283      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2117 | code | 407972080      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2118 | code | 408024298      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2119 | code | 408077288      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2120 | code | 408133382      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2121 | code | 408191367      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2122 | code | 408192531      | done   | swh:1:dir:a0462c70888c7bc954c8e5813dacd5bcde452447 |
| 2123 | code | 408248545      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2124 | code | 408299741      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
+------+------+----------------+--------+----------------------------------------------------+
(31 rows)

Time: 6.888 ms
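
The before/after listings above can be compared mechanically rather than by eye. Here is a minimal sketch in Python, assuming the psql output has been pasted into strings. `parse_psql_table` and `changed_deposits` are hypothetical helper names for illustration, not part of swh-deposit; the sample rows are copied from the two listings.

```python
def parse_psql_table(text):
    """Parse psql's ASCII-bordered output into a list of dicts keyed by column name."""
    rows, header = [], None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +---+ borders, prompts and "(N rows)" footers
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first piped line is the column header
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def changed_deposits(before, after, keys=("status", "swhid")):
    """Return {deposit id: {column: (old, new)}} for rows whose columns differ."""
    after_by_id = {row["id"]: row for row in after}
    changes = {}
    for row in before:
        new = after_by_id.get(row["id"])
        if new is None:
            continue  # deposit missing from the second listing
        delta = {k: (row[k], new[k]) for k in keys if row[k] != new[k]}
        if delta:
            changes[row["id"]] = delta
    return changes


# Excerpts of the [1] and [2] listings (three rows each).
BEFORE = """
|  id  | type |        reception_date         |  status  |                       swhid                        |
+------+------+-------------------------------+----------+----------------------------------------------------+
| 2121 | code | 2022-03-14 13:35:19.390942+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2122 | code | 2022-03-14 14:05:26.424093+00 | verified | (null)                                             |
| 2123 | code | 2022-03-15 13:34:50.424512+00 | done     | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
"""

AFTER = """
|  id  | type | reception_date | status |                       swhid                        |
+------+------+----------------+--------+----------------------------------------------------+
| 2121 | code | 408191367      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
| 2122 | code | 408192531      | done   | swh:1:dir:a0462c70888c7bc954c8e5813dacd5bcde452447 |
| 2123 | code | 408248545      | done   | swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9 |
"""

changes = changed_deposits(parse_psql_table(BEFORE), parse_psql_table(AFTER))
for deposit_id, delta in changes.items():
    print(deposit_id, delta)
```

On this excerpt the only difference reported is deposit 2122, which goes from verified with no swhid to done with a directory swhid.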