
staging infra: Reproduce existing production setup in a compact way
Closed, Migrated (Edits Locked)

Description

Dependency order to follow

  • gateway (routing)
  • swh-storage (db, swh-storage service)
  • swh-objstorage (disks, swh-objstorage service)
  • swh-indexer storage (db, service)
  • swh-web
  • swh-scheduler (db, rabbitmq, swh-scheduler services)
  • swh-deposit (db, service)
  • 1 worker with at least 1 loader (worker0 & worker1 with loader-git)
  • swh-vault
  • update workers with the deposit checker/loader
  • update workers with 1 lister (forge=gitlab, instance=inria)
  • update workers with 1 indexer
  • Make icinga checks green on the expected origins (parmap, cpython, etc.)

Note:

  • Should be a matter of creating the right branch ('staging' for example) in the swh-site repository
  • and selecting that environment when running the agent: `puppet agent --test --noop --environment=staging` (see the sketch after this list)
  • production code should only be modified if issues are identified (there were some ;)
  • in the end, the staging branch should be merged back into production (without any impact on production; that has been my implicit plan)
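
A minimal sketch of that workflow, assuming the usual swh-site layout with a 'production' branch and shell access to a staging node (remote and node names are illustrative):

# in the swh-site repository: create the staging puppet environment from production
$ git checkout -b staging production
$ git push origin staging

# on a staging node: dry-run the agent against that environment, then apply for real
$ puppet agent --test --noop --environment=staging
$ puppet agent --test --environment=staging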

Plan:
From provisioning the node (creation on our hypervisor) to delegating the configuration/installation of the services to puppet.
For some of our services, this is ambitious:

  • postgres: db and user creation puppetized (done); see the sketch after this list
  • postgres: bootstrap the db schema (this one needs upfront work in our different services to expose a common interface) [1]
  • rabbitmq: server installation with users setup
  • rabbitmq: improve the configuration options to be in sync with our current instance (P493)
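
For reference, a rough manual equivalent of what puppet automates for the postgres and rabbitmq user setup, using stock tooling (role, database, user, and vhost names below are illustrative, not the actual staging values):

# postgres: create the service role and its database (e.g. for swh-storage)
$ sudo -u postgres createuser --pwprompt swh-storage
$ sudo -u postgres createdb -O swh-storage swh-storage

# rabbitmq: create a user and grant it full permissions on the default vhost
$ sudo rabbitmqctl add_user swhconsumer <pass>
$ sudo rabbitmqctl set_permissions -p / swhconsumer '.*' '.*' '.*'

The schema bootstrap itself is left to each service's own db initialization tooling, which is precisely the part flagged in [1] as needing a common interface.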

Tests:

  • from webapp: save code now -> browsing ok (see the sketch after this list)
  • push a deposit via the deposit client cli -> browsing in webapp ok
  • from webapp: request vault cooking -> download in webapp ok
  • list one forge and see new origins from that forge [2]
  • origin-content-metadata indexer running
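
For the 'save code now' check, a hedged command-line equivalent, assuming the staging webapp exposes the same public API as the production archive (the origin url is an illustrative placeholder):

# request a 'save code now' on a git origin
$ curl -X POST https://webapp.internal.staging.swh.network/api/1/origin/save/git/url/https://gitlab.inria.fr/some/repo/

# once the visit completed, check the origin is known and browsable
$ curl https://webapp.internal.staging.swh.network/api/1/origin/https://gitlab.inria.fr/some/repo/get/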

[1] I consider this out of scope (I need to draw a line somewhere ;). Also, we started this as a team; it needs to be finalized, IIRC.

[2] https://webapp.internal.staging.swh.network/browse/search/?q=gitlab.inria.fr&with_visit&with_content
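
The same check can be done from the command line rather than the browser; a sketch assuming the staging webapp exposes the standard origin search API:

$ curl 'https://webapp.internal.staging.swh.network/api/1/origin/search/gitlab.inria.fr/?with_visit=true'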

Event Timeline


Progress on the deposit part:

$ swh deposit upload --url http://deposit.internal.staging.swh.network \
    --username hal-preprod \
    --password <pass> \
    --archive jesuisgpl.tgz \
    --name jesuisgpl \
    --author zack
INFO:swh.deposit.cli.client:{'deposit_id': '2', 'deposit_status': 'deposited', 'deposit_status_detail': None, 'deposit_date': 'Aug. 30, 2019, 2:08 p.m.'}

# deposit-id 1 fails because the metadata were incomplete
$ swh deposit status --url http://deposit.internal.staging.swh.network \
    --username hal-preprod \
    --password <pass> \
    --deposit-id 1
INFO:swh.deposit.cli.client:{'deposit_id': '1', 'deposit_status': 'rejected', 'deposit_status_detail': '- Mandatory fields are missing (author)\n- Mandatory alternate fields are missing (name or title)', 'deposit_swh_id': None, 'deposit_swh_id_context': None, 'deposit_swh_anchor_id': None, 'deposit_swh_anchor_id_context': None, 'deposit_external_id': '8740a2d3-d11c-4daf-8968-a6554dacbbc2'}

# deposit-id 2 is complete (so should work)
$ swh deposit status --url http://deposit.internal.staging.swh.network \
    --username hal-preprod \
    --password <pass> \
    --deposit-id 2
INFO:swh.deposit.cli.client:{'deposit_id': '2', 'deposit_status': 'verified', 'deposit_status_detail': 'Deposit is fully received, checked, and ready for loading', 'deposit_swh_id': None, 'deposit_swh_id_context': None, 'deposit_swh_anchor_id': None, 'deposit_swh_anchor_id_context': None, 'deposit_external_id': '8a60afdb-f424-46ec-abc6-49a03ccefce3'}

# but the loading then fails for storage migration reasons
$ swh deposit status --url http://deposit.internal.staging.swh.network \
    --username hal-preprod \
    --password <pass> \
    --deposit-id 2
INFO:swh.deposit.cli.client:{'deposit_id': '2', 'deposit_status': 'failed', 'deposit_status_detail': 'The deposit loading into the Software Heritage archive failed', 'deposit_swh_id': None, 'deposit_swh_id_context': None, 'deposit_swh_anchor_id': None, 'deposit_swh_anchor_id_context': None, 'deposit_external_id': '8a60afdb-f424-46ec-abc6-49a03ccefce3'}

Now the deposit loading fails because of changes in the storage layer (origins are no longer identified by an id but by their url):

Aug 30 14:08:34 worker1 python3[11461]: [2019-08-30 14:08:34,917: ERROR/ForkPoolWorker-1] Loading failure, updating to `partial` status
                                        Traceback (most recent call last):
                                          File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 876, in load
                                            self.store_metadata()
                                          File "/usr/lib/python3/dist-packages/swh/deposit/loader/loader.py", line 93, in store_metadata
                                            tool_id, metadata)
                                          File "/usr/lib/python3/dist-packages/retrying.py", line 49, in wrapped_f
                                            return Retrying(*dargs, **dkw).call(f, *args, **kw)
                                          File "/usr/lib/python3/dist-packages/retrying.py", line 206, in call
                                            return attempt.get(self._wrap_exception)
                                          File "/usr/lib/python3/dist-packages/retrying.py", line 247, in get
                                            six.reraise(self.value[0], self.value[1], self.value[2])
                                          File "/usr/lib/python3/dist-packages/six.py", line 686, in reraise
                                            raise value
                                          File "/usr/lib/python3/dist-packages/retrying.py", line 200, in call
                                            attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
                                          File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 352, in send_origin_metadata
                                            self.origin['url'], visit_date, provider_id, tool_id, metadata)
                                          File "/usr/lib/python3/dist-packages/swh/storage/api/client.py", line 241, in origin_metadata_add
                                            'metadata': metadata})
                                          File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 205, in post
                                            return self._decode_response(response)
                                          File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 237, in _decode_response
                                            raise pickle.loads(decode_response(response))
                                        psycopg2.DataError: invalid input syntax for integer: "https://inria.halpreprod.archives-ouvertes.fr/8a60afdb-f424-46ec-abc6-49a03ccefce3"
                                        LINE 2: ...          provider_id, tool_id, metadata) values ('https://i...

@vlorentz (stacktrace) ^

I think we missed the storage origin_metadata_add endpoint during the migration away from origin ids (both in-memory and pg storages).
The loader-core (0.0.44) has been changed to provide the url instead of the id, but the storage endpoint still expects the origin id.
That's my understanding, do you concur?

TIA

Indeed, I did not change it because I expected to do it at the same time as a refactoring of origin_metadata_* (aka implementing D1614).
I'll fix it ASAP.

Finally, after updating plenty of our modules (model, storage, ...), the deposit loader is fixed, so I can finally check that box! \m/

Sep 05 08:31:03 worker1 python3[11272]: [2019-09-05 08:31:03,260: INFO/ForkPoolWorker-1] Task swh.deposit.loader.tasks.LoadDepositArchiveTsk[da9f066e-4c44-4053-81a1-e38db35d6149] succeeded in 2.102082097902894s: {'status': 'eventful'}

Cheers,

On stand-by, as the package-loader priority (T1389) took over.
The staging infra will be used when the package loaders land (which will need some work: debian packaging + configuration updates).

Then work will resume to finalize it (one indexer which pulls from kafka).


It's fairly complete already.

For the remaining indexer, which pulls from kafka, we can always improve on this later.