
SWORD deposit: backend server
Closed, Migrated, Edits Locked

Description

Scholarly archives, including HAL, use SWORD as their deposit protocol.

Implement SWORD endpoints to permit:

  • deposit of a new artifact (hal id + metadata) -> injection into swh (resulting in a synthetic revision)
  • update of an existing artifact (hal id + metadata) -> injection into swh (resulting in a new synthetic revision whose parent is the previous one)
  • returning a status for the action (swh id, hal id, possibly a manifest of what was deposited)
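
The create/update flow above can be sketched as a small in-memory model. This is only an illustration, not the swh-deposit implementation: `SyntheticRevision`, `DepositStore`, the SHA-1 based identifier and the `hal-01` id in the usage note are all hypothetical names chosen for the example.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class SyntheticRevision:
    """Hypothetical stand-in for the synthetic revision created by a deposit."""
    swh_id: str
    external_id: str                      # e.g. the hal id sent by the client
    metadata: Tuple[Tuple[str, str], ...]
    parent: Optional[str]                 # swh_id of the previous revision, if any


class DepositStore:
    """In-memory model: one chain of synthetic revisions per external id."""

    def __init__(self) -> None:
        self._heads: Dict[str, SyntheticRevision] = {}

    def deposit(self, external_id: str, metadata: Dict[str, str]) -> Dict[str, Optional[str]]:
        # A first deposit has no parent; an update points at the current head.
        head = self._heads.get(external_id)
        parent_id = head.swh_id if head else None
        items = tuple(sorted(metadata.items()))
        # Assumption: the identifier is derived from the content and the parent.
        digest = hashlib.sha1(repr((external_id, items, parent_id)).encode()).hexdigest()
        self._heads[external_id] = SyntheticRevision(digest, external_id, items, parent_id)
        # Status returned to the client: swh id, external id, parent link.
        return {"swh_id": digest, "external_id": external_id, "parent": parent_id}
```

Calling `deposit("hal-01", {...})` twice on the same store returns a status with no parent the first time and a status whose `parent` is the first `swh_id` the second time.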

Note:

  • artifacts are limited to archives (.zip)
  • metadata are to be defined (T717)
  • Related T647

Event Timeline

ardumont renamed this task from POC - HAL-SWH to HAL-SWH integration implementation. May 29 2017, 10:45 AM

We can embed the sequence chart image in the readme using the #F2403754 reference.

Can someone tag this task with the most appropriate project?
Ideally, no task should be left without a tag…

zack added a parent task: Unknown Object (Maniphest Task). Sep 16 2017, 2:10 PM

Documentation updated:

  • CREATE
  • UPDATE
  • DELETE

I should have created tasks upfront... doing so for everything now would be overkill.

Anyway, I will sum up here what has been done and what remains to be done:

  • sword 2.0
    • Read and understand the specification
    • Read other clients' documentation to make sense of some obscure points
  • Technicals
    • Understand the django framework (we moved away from flask in swh-web)
    • Understand how to test it
    • Understand how to deploy a django application
  • Implement public interface api for deposit (no injection, only the public interface):
    • Add basic authentication on subset implementation
    • Work on deploying subset implementation
    • Debian package missing dependency (pushed on pergamon)
    • Keep the specifications up-to-date with development
    • Implement service document SD-IRI endpoint (GET /1/servicedocument/)
    • Implement collection COL-IRI endpoint (POST /1/<collection-name>/)
    • Implement update endpoints Edit-IRI/SE-IRI and EM-IRI (replace, add). (POST/PUT /1/<collection-name>/<deposit-id>/{media|metadata})
    • Implement delete deposit endpoints on 'partial' deposit (DELETE /1/<collection-name>/<deposit-id>/{media|metadata})
    • Implement state endpoint on deposit (GET /1/<collection-name>/<deposit-id>/state/)
    • Implement content iri (GET /1/<collection-name>/<deposit-id>/content/)
    • Refactor common behaviors (errors, restrictions, etc...)
    • Integration tests with as few side effects as possible (django helps for the db)
    • Play with a sword client (swordapp/python2-sword-client) to exercise the api (this uncovered some misunderstood spots as well)
    • Read and improve documentation
    • Make documentation browsable using swh-docs (cf. swh-deposit/docs/index.rst).
    • Package and deploy https://$vhost.softwareheritage.org/ on moma (behind basic http auth)
    • Check for missing pieces by rereading the sword 2.0 specification (this is not a blocker for the hal test)
  • Build and deploy documentation - https://docs.softwareheritage.org/devel/swh-deposit/
  • Implement first deposit internal injection mechanism (scheduling one-shot task)
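
For reference, the public endpoints listed above can be summed up as a small lookup table. This is a sketch only: the methods and path templates are transcribed from the list, while `ENDPOINTS`, `build_request` and the `collection` / `deposit_id` parameter names are made up for the example and are not part of swh-deposit. The restriction of DELETE to 'partial' deposits is not modeled.

```python
from typing import Dict, Tuple

# Resource name -> (allowed HTTP methods, path template), as listed above.
ENDPOINTS: Dict[str, Tuple[Tuple[str, ...], str]] = {
    "servicedocument": (("GET",), "/1/servicedocument/"),
    "collection": (("POST",), "/1/{collection}/"),
    "media": (("POST", "PUT", "DELETE"), "/1/{collection}/{deposit_id}/media"),
    "metadata": (("POST", "PUT", "DELETE"), "/1/{collection}/{deposit_id}/metadata"),
    "state": (("GET",), "/1/{collection}/{deposit_id}/state/"),
    "content": (("GET",), "/1/{collection}/{deposit_id}/content/"),
}


def build_request(resource: str, method: str, **params: object) -> Tuple[str, str]:
    """Return (method, path) for a resource, rejecting methods not listed for it."""
    methods, template = ENDPOINTS[resource]
    if method not in methods:
        raise ValueError(f"{method} is not listed for {resource!r}")
    # A missing path parameter raises KeyError, which is good enough for a sketch.
    return method, template.format(**params)
```

For example, `build_request("state", "GET", collection="hal", deposit_id=42)` resolves the state endpoint for deposit 42 in a collection named `hal`, whereas asking for `DELETE` on `state` raises `ValueError`.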

I'll open tasks for the remaining actions:

  • T820 - Check with the hal people that the connectivity is ok (the latest tests were ok)
  • T821 - Finalize injection with origin_metadata

Note that this implementation is hal-agnostic.
In the current implementation, hal is simply a client and a software collection.

zack renamed this task from HAL-SWH integration implementation to SWORD deposit: backend server. Jan 3 2018, 10:30 AM
ardumont updated the task description.