Write specs about metadata workflow
Open, Normal, Public

Description

(opening tasks for myself on my last day might not be the best idea, but this should be written somewhere)

The metadata workflow and strategy are about recovering descriptive metadata for the artifacts in the archive.
This metadata can be found:

  • in the content itself -> intrinsic metadata (implemented with T1232)
  • not in the content -> extrinsic metadata
    • extrinsic metadata can be found with the content when listing or loading the content
    • or in a software registry (e.g. Wikidata, swMATH, ASCL)
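
To make the distinction above concrete, here is a minimal sketch of how the intrinsic/extrinsic split could be modelled. This is not the actual storage schema: `MetadataKind`, `MetadataSource`, `MetadataRecord` and `kind_of` are hypothetical names used only for illustration, and the real specification may well end up looking different.

```python
from dataclasses import dataclass, field
from enum import Enum


class MetadataKind(Enum):
    """Hypothetical classification following the task description."""
    INTRINSIC = "intrinsic"  # found in the content itself
    EXTRINSIC = "extrinsic"  # found outside the content


class MetadataSource(Enum):
    """Hypothetical places where metadata can be discovered."""
    CONTENT = "content"    # inside the archived artifact
    LISTER = "lister"      # seen while listing the content
    LOADER = "loader"      # seen while loading the content
    REGISTRY = "registry"  # a software registry (Wikidata, swMATH, ASCL)


def kind_of(source: MetadataSource) -> MetadataKind:
    """Metadata is intrinsic only when it comes from the content itself."""
    if source is MetadataSource.CONTENT:
        return MetadataKind.INTRINSIC
    return MetadataKind.EXTRINSIC


@dataclass
class MetadataRecord:
    """Hypothetical record tying metadata to an artifact and its origin."""
    artifact_id: str        # placeholder identifier, not a real one
    source: MetadataSource
    provider: str           # free-form provider name (assumption)
    data: dict = field(default_factory=dict)

    @property
    def kind(self) -> MetadataKind:
        return kind_of(self.source)


# Usage: metadata fetched from a registry is extrinsic.
record = MetadataRecord(
    artifact_id="example-artifact",
    source=MetadataSource.REGISTRY,
    provider="example-registry",
    data={"name": "example"},
)
print(record.kind.value)
```

Deriving the kind from the source, rather than storing both, keeps a record from claiming to be intrinsic while pointing at a registry. Whether the real model should work this way is exactly the kind of question the spec needs to settle.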

The different components and the storage infrastructure that was put in place to keep this information
should be specified and documented.

A discussion started over the metadata_provider in D637.

Event Timeline

moranegg created this task. Nov 14 2018, 4:03 PM
moranegg triaged this task as Normal priority.

Once this task is handled, an improved docstring for Storage.metadata_provider_add would be necessary.

Where would you put this type of specs?

vlorentz added a subscriber: vlorentz. Edited May 22 2019, 11:54 AM

Either in the docs (like the persistent identifiers spec) or on the wiki (like the snapshot spec).

What about writing it on the wiki, and moving it to the docs when it's finished?

EDIT: never mind, it's discussed in T1683

vlorentz changed the status of subtask T1737: Define and specify metadata providers from Open to Work in Progress. May 24 2019, 10:30 AM
moranegg added a parent task: Unknown Object (Maniphest Task). Jun 14 2019, 12:33 PM