
Provide stats on extracted metadata in the indexer storage api
Closed, Migrated

Description

The indexer storage API should report the number of origins that were indexed, how many of them have a non-empty set of metadata, and a breakdown per metadata type.

Event Timeline

vlorentz triaged this task as Normal priority. Jan 21 2019, 11:38 AM
vlorentz created this task.

Useful queries:

select count(*) from origin_intrinsic_metadata;
select count(*) from origin_intrinsic_metadata where metadata != '{"@context": "https://doi.org/10.5063/schema/codemeta-2.0"}';

(The latter is a hack. For a long-term solution, it would be better to use JSON operations to check whether there is any key other than @context.)
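
The key-based check described above could look roughly like the following Python sketch. It is only an illustration: the function names `has_real_metadata` and `metadata_stats`, and the assumption that rows arrive as JSON strings or already-decoded dicts, are hypothetical and not taken from the indexer storage code. It does not cover the per-type breakdown.

```python
import json

CONTEXT_KEY = "@context"


def has_real_metadata(metadata):
    """Return True if the document has at least one key other than @context."""
    return any(key != CONTEXT_KEY for key in metadata)


def metadata_stats(rows):
    """Count indexed rows and rows with non-empty metadata.

    `rows` is assumed (hypothetically) to be an iterable of metadata
    documents, each either a JSON string or an already-decoded dict.
    """
    total = 0
    non_empty = 0
    for row in rows:
        doc = json.loads(row) if isinstance(row, str) else row
        total += 1
        if has_real_metadata(doc):
            non_empty += 1
    return {"total": total, "non_empty": non_empty}
```

Unlike the string comparison in the second query, this check does not depend on the exact value of @context or on how the JSON is serialized. If the column is of type jsonb, the same idea can probably be expressed directly in PostgreSQL by removing the @context key with the jsonb `-` operator and comparing the result to an empty object.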

vlorentz renamed this task from Show stats on extracted metadata to Provide stats on extracted metadata. Jan 21 2019, 11:39 AM
vlorentz renamed this task from Provide stats on extracted metadata to Provide stats on extracted metadata in the indexer storage api.