
(staging) deposit: Investigate update query error
Closed, Migrated

Description

From an email:

I am writing to you about the errors I encounter when trying to modify the metadata of documents.
I make the request with curl as indicated in [1].

Here is the result of the request to retrieve the status:

<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:sword="http://purl.org/net/sword/"
       xmlns:dcterms="http://purl.org/dc/terms/">
    <deposit_id>21</deposit_id>
    <deposit_status>done</deposit_status>
    <deposit_status_detail>The deposit has been successfully loaded into the Software Heritage archive</deposit_status_detail>
    <deposit_swh_id>swh:1:dir:998b89033f20b31dbad54263d8acdbbcc7ce3c1f</deposit_swh_id>
    <deposit_swh_id_context>swh:1:dir:998b89033f20b31dbad54263d8acdbbcc7ce3c1f;origin=https://inria.halpreprod.archives-ouvertes.fr/hal-02511079;visit=swh:1:snp:a89c77297d90d8e49ed9f7062c24bfa84b1e4b7f;anchor=swh:1:rev:f6f4e99c40f2cc41fcb9af0d03ba9c63d5d908e0;path=/</deposit_swh_id_context>
    <deposit_external_id>hal-02511079</deposit_external_id>
</entry>
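
For reference, the fields of a status response like the one above can be read programmatically before attempting an update. This is a minimal sketch, assuming the response is exactly as quoted (the long context element is left out for brevity); the `read_status` helper is a hypothetical name, not part of any deposit client.

```python
import xml.etree.ElementTree as ET

# Status response copied from the thread (context element omitted for brevity).
STATUS_XML = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:sword="http://purl.org/net/sword/"
       xmlns:dcterms="http://purl.org/dc/terms/">
    <deposit_id>21</deposit_id>
    <deposit_status>done</deposit_status>
    <deposit_swh_id>swh:1:dir:998b89033f20b31dbad54263d8acdbbcc7ce3c1f</deposit_swh_id>
    <deposit_external_id>hal-02511079</deposit_external_id>
</entry>"""

# Unprefixed elements inherit the default Atom namespace of <entry>.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def read_status(xml_text):
    """Return (deposit_id, status, swhid) from a status response (hypothetical helper)."""
    root = ET.fromstring(xml_text)

    def text(tag):
        return root.findtext(f"atom:{tag}", namespaces=ATOM_NS)

    return text("deposit_id"), text("deposit_status"), text("deposit_swh_id")


deposit_id, status, swhid = read_status(STATUS_XML)
print(deposit_id, status, swhid)
```

The SWHID read this way is the single value to send along with the update request.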

Here is the result of the update request, with the error:

<?xml version="1.0" encoding="utf-8"?>
<sword:error xmlns="http://www.w3.org/2005/Atom"
       xmlns:sword="http://purl.org/net/sword/">
    <summary>Mismatched provided SWHID swh:1:dir:998b89033f20b31dbad54263d8acdbbcc7ce3c1f, swh:1:dir:998b89033f20b31dbad54263d8acdbbcc7ce3c1f with deposit&#39;s swh:1:dir:998b89033f20b31dbad54263d8acdbbcc7ce3c1f.</summary>
    <sword:treatment>processing failed</sword:treatment>

    <sword:verboseDescription>
        The provided SWHID does not match the deposit to update. Please ensure you send the correct deposit SWHID.
    </sword:verboseDescription>

</sword:error>

[1] P815

Event Timeline

ardumont triaged this task as Normal priority. Dec 9 2020, 9:26 AM
ardumont created this task.
ardumont renamed this task from deposit: Investigate update query error to (staging) deposit: Investigate update query error. Dec 9 2020, 10:30 AM

A further email inquiry was sent.

Indeed, the previous error was caused by my putting the SWHID twice in the headers.
The actual error returned is the following:

<h1>Server Error (500)</h1>
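
Regarding the earlier mismatch error: its summary lists the same SWHID twice, separated by a comma, which is consistent with the header having been sent twice. A minimal sketch of why such a value fails a comparison, assuming the duplicated header reaches the application as one comma-separated string; `swhid_matches` is a hypothetical stand-in, not the deposit's actual check.

```python
def swhid_matches(header_value, deposit_swhid):
    """Compare the raw header value to the deposit's SWHID (hypothetical check)."""
    return header_value.strip() == deposit_swhid


SWHID = "swh:1:dir:998b89033f20b31dbad54263d8acdbbcc7ce3c1f"

# Assumption: a header sent twice arrives as one comma-separated value,
# as the error summary suggests.
sent_twice = ", ".join([SWHID, SWHID])

print(swhid_matches(SWHID, SWHID))       # header sent once
print(swhid_matches(sent_twice, SWHID))  # header sent twice
```

Sending the header exactly once removes this particular error, which is what exposed the 500 above.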

@Theophane20 I'm still interested in the actual command used on your side.

In the meantime, looking more closely at the server logs, I came across the error below [1].
At first glance, I'd say it's an input problem (which is why I'm interested in the command used).
In any case, that problem is badly handled on our side, hence the 500 (internal error).

[1]

Dec 09 18:25:55 deposit python3[2185861]: 2020-12-09 18:25:55 [2185861] django.request:ERROR Internal Server Error: /1/hal-preprod/21/metadata/
                                          Traceback (most recent call last):
                                            File "/usr/lib/python3/dist-packages/django/core/handlers/exception.py", line 34, in inner
                                              response = get_response(request)
                                            File "/usr/lib/python3/dist-packages/django/core/handlers/base.py", line 115, in _get_response
                                              response = self.process_exception_by_middleware(e, request)
                                            File "/usr/lib/python3/dist-packages/django/core/handlers/base.py", line 113, in _get_response
                                              response = wrapped_callback(request, *callback_args, **callback_kwargs)
                                            File "/usr/lib/python3/dist-packages/django/views/decorators/csrf.py", line 54, in wrapped_view
                                              return view_func(*args, **kwargs)
                                            File "/usr/lib/python3/dist-packages/django/views/generic/base.py", line 71, in view
                                              return self.dispatch(request, *args, **kwargs)
                                            File "/usr/lib/python3/dist-packages/rest_framework/views.py", line 495, in dispatch
                                              response = self.handle_exception(exc)
                                            File "/usr/lib/python3/dist-packages/rest_framework/views.py", line 455, in handle_exception
                                              self.raise_uncaught_exception(exc)
                                            File "/usr/lib/python3/dist-packages/rest_framework/views.py", line 492, in dispatch
                                              response = handler(request, *args, **kwargs)
                                            File "/usr/lib/python3/dist-packages/swh/deposit/api/common.py", line 1133, in put
                                              data = self.process_put(request, headers, collection_name, deposit_id)
                                            File "/usr/lib/python3/dist-packages/swh/deposit/api/deposit_update.py", line 210, in process_put
                                              deposit, parse_swhid(swhid), metadata, raw_metadata, deposit.origin_url,
                                            File "/usr/lib/python3/dist-packages/swh/deposit/api/common.py", line 717, in _store_metadata_deposit
                                              self.storage_metadata.raw_extrinsic_metadata_add([metadata_object])
                                            File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 181, in meth_
                                              return self.post(meth._endpoint_path, post_data)
                                            File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 278, in post
                                              return self._decode_response(response)
                                            File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 352, in _decode_response
                                              self.raise_for_status(response)
                                            File "/usr/lib/python3/dist-packages/swh/storage/api/client.py", line 29, in raise_for_status
                                              super().raise_for_status(response)
                                            File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 335, in raise_for_status
                                              payload=data["exception"], response=response
                                          KeyError: 'exception'
Dec 09 18:25:55 deposit python3[2185861]: 2020-12-09 18:25:55 [2185861] gunicorn.access:INFO 127.0.0.1 - hal-preprod [09/Dec/2020:18:25:55 +0000] "PUT /1/hal-preprod/21/metadata/ HTTP/1.1" 500 27 "-" "SWORDAPP PHP v2 library (version 0.1)
http://php.swordapp.org/"

Note to self: I did not find this ^ in Sentry (no idea why).

In the code I am using PHP's curl module to make the request. This is what the update code looks like:

 $curl=curl_init();
 curl_setopt ($curl,CURLOPT_HEADER, true);
 curl_setopt ($curl,CURLOPT_RETURNTRANSFER,true);
 curl_setopt($curl, CURLOPT_USERPWD, "USER:PASSWORD");
 curl_setopt($curl, CURLOPT_URL, 'https://deposit.internal.staging.swh.network/1/hal-preprod/21/metadata/');
 curl_setopt($curl, CURLOPT_VERBOSE, false);
 $headers=[
            "Content-Type: application/atom+xml;type=entry",
            "In-Progress: false",
            "X_CHECK_SWHID:swh:1:dir:998b89033f20b31dbad54263d8acdbbcc7ce3c1f"
           ];
 curl_setopt($curl, CURLOPT_PUT, true);
 curl_setopt($curl, CURLOPT_INFILE, fopen('deposit-hal-preprod.xml', 'rb'));
 curl_setopt($curl, CURLOPT_INFILESIZE, filesize('deposit-hal-preprod.xml'));
 curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);
 $response= curl_exec($curl);
 curl_close($curl);

This code snippet looks sensible to me.

Since an exception is raised nonetheless, I'm wondering whether the XML sent
passes the functional checks. That exception would then cascade into the
technical exception visible in the stack trace above.

Can you please share that XML? You can either paste it here or in the paste
application [1].

Also, for information, I will soon deploy a new version of the deposit on
staging. Among other things, that version improves our exception handling.

Thanks in advance,

[1] https://forge.softwareheritage.org/paste

Heads up: deposit v0.7.2 has been deployed.

Unrelated to the problem at hand, but the following still needs to be changed
from:

curl_setopt($curl, CURLOPT_URL, 'https://deposit.internal.staging.swh.network/1/hal-preprod/21/metadata/');

to:

curl_setopt($curl, CURLOPT_URL, 'https://deposit.staging.swh.network/1/hal-preprod/21/atom/');

I also updated the paste you mentioned in the description [1].

Note also the change to the public DNS name
deposit.staging.swh.network (the internal address is not something that you
should be able to resolve on your side :).

[1] P815
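
For comparison, here is a rough Python equivalent of the corrected request. It is only a sketch: it builds the PUT request without sending it, the header names are copied from the PHP snippet above, and `build_update_request`, the `b"<entry/>"` body and the `USER`/`PASSWORD` values are placeholders.

```python
import base64
import urllib.request

# Public staging host and /atom/ endpoint, as given in the comment above.
URL = "https://deposit.staging.swh.network/1/hal-preprod/21/atom/"
SWHID = "swh:1:dir:998b89033f20b31dbad54263d8acdbbcc7ce3c1f"


def build_update_request(atom_xml, user, password):
    """Build (but do not send) the metadata update request (hypothetical helper)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {
        "Authorization": f"Basic {token}",
        "Content-Type": "application/atom+xml;type=entry",
        "In-Progress": "false",
        # Header name copied from the PHP snippet; the SWHID is sent once only.
        "X_CHECK_SWHID": SWHID,
    }
    return urllib.request.Request(URL, data=atom_xml, headers=headers, method="PUT")


# Placeholder body and credentials; the real body is the Atom XML file.
request = build_update_request(b"<entry/>", "USER", "PASSWORD")
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```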

Cheers,

  File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 335, in raise_for_status
    payload=data["exception"], response=response
KeyError: 'exception'

That's caused by a mismatch of swh.core versions between the client and the server.
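
To illustrate the failure mode only (this is not swh.core's actual code): a client that expects an "exception" key in the decoded error payload raises KeyError when the server uses a different layout, which hides the original server-side error. The payload below is hypothetical, and the tolerant variant is just one possible way to keep the error visible.

```python
def decode_error_strict(data):
    # Mirrors the failing access in the traceback: assumes the key exists.
    return data["exception"]


def decode_error_tolerant(data):
    # Falls back to the whole payload when the expected key is absent.
    return data.get("exception", data)


# Hypothetical payload from a server that uses a different error layout.
payload = {"type": "ValueError", "args": ["bad input"]}

try:
    decode_error_strict(payload)
except KeyError as exc:
    print("strict decoding failed:", exc)

print(decode_error_tolerant(payload))
```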

ardumont changed the task status from Open to Work in Progress. Dec 11 2020, 3:23 PM
ardumont moved this task from Backlog to In progress on the SWORD deposit board.

@Theophane20 Hi, can you confirm the issue is fixed? Thanks!

After a few exchanges with Yannick, we resolved the issue on HAL's end by changing an endpoint.