- indexers call themselves directly instead of going through the scheduler
- metadata is attached to directories instead of revisions
Depends on D8084
Differential D8085
docs: Update description of the metadata workflow
Authored by vlorentz on Jul 6 2022, 4:05 PM.
Event Timeline

Build has FAILED

Patch application report for D8085 (id=29188)

Could not rebase; Attempt merge onto c2742b5b75...

Updating c2742b5..085837e
Fast-forward
 docs/images/tasks-metadata-indexers.uml      | 57 ++++++++----------
 docs/metadata-workflow.rst                   | 29 +++++----
 swh/indexer/codemeta.py                      | 57 +++++++++++------
 swh/indexer/metadata.py                      | 10 ++--
 swh/indexer/metadata_detector.py             |  4 +-
 swh/indexer/metadata_dictionary/__init__.py  | 12 +++-
 swh/indexer/metadata_dictionary/base.py      | 63 ++++++++++++--------
 swh/indexer/metadata_dictionary/cff.py       |  4 +-
 swh/indexer/metadata_dictionary/codemeta.py  |  4 +-
 swh/indexer/metadata_dictionary/composer.py  |  4 +-
 swh/indexer/metadata_dictionary/github.py    | 68 +++++++++++++++++++---
 swh/indexer/metadata_dictionary/maven.py     |  4 +-
 swh/indexer/metadata_dictionary/npm.py       |  4 +-
 swh/indexer/metadata_dictionary/python.py    |  4 +-
 swh/indexer/metadata_dictionary/ruby.py      | 11 +---
 .../tests/metadata_dictionary/test_github.py | 26 +++++++--
 16 files changed, 229 insertions(+), 132 deletions(-)

Changes applied before test

commit 085837edeea62d1439b9cfeb265b28ab858a5e7e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Jul 5 17:46:33 2022 +0200

    docs: Update description of the metadata workflow

    1. indexers call themselves directly instead of going through the scheduler
    2. metadata is attached to directories instead of revisions

commit 3458892274226aabe490a795abe5d6fce990be99
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Jul 5 16:43:02 2022 +0200

    github: Translate stargazers_count and watchers_count

commit e177c77baf48b69e420a3eed6b9125a7f209947f
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Jul 4 18:26:24 2022 +0200

    Simplify codemeta.make_absolute_uri()

commit dd9adebeca15c697cc27011693c8d84f6ec1544e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Jul 4 18:25:25 2022 +0200

    Document codemeta.make_absolute_uri()

commit 358ee08416dd847d7ebbddd0c721d7a287149175
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Jul 4 13:55:19 2022 +0200

    Use compact URIs for ForgeFed and ActivityStreams

    It makes resulting documents (usually) shorter, and tests more readable.

commit d41f26eef0561fd41932eb688bc6908f2253ef4c
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Jul 4 13:37:58 2022 +0200

    Use separate base classes for intrinsic and extrinsic mappings

    detect_metadata_files and extrinsic_metadata_formats (respectively) are
    somewhat mutually exclusive, so it does not make much sense to have them
    in the same class and MAPPINGS dict

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/341/

Build is green

Patch application report for D8085 (id=29191)

Could not rebase; Attempt merge onto c2742b5b75...

Updating c2742b5..724034d
Fast-forward
 docs/images/tasks-metadata-indexers.uml      | 57 ++++++++----------
 docs/metadata-workflow.rst                   | 31 +++++-----
 swh/indexer/codemeta.py                      | 57 +++++++++++------
 swh/indexer/metadata.py                      | 10 ++--
 swh/indexer/metadata_detector.py             |  4 +-
 swh/indexer/metadata_dictionary/__init__.py  | 12 +++-
 swh/indexer/metadata_dictionary/base.py      | 63 ++++++++++++--------
 swh/indexer/metadata_dictionary/cff.py       |  4 +-
 swh/indexer/metadata_dictionary/codemeta.py  |  4 +-
 swh/indexer/metadata_dictionary/composer.py  |  4 +-
 swh/indexer/metadata_dictionary/github.py    | 68 +++++++++++++++++++---
 swh/indexer/metadata_dictionary/maven.py     |  4 +-
 swh/indexer/metadata_dictionary/npm.py       |  4 +-
 swh/indexer/metadata_dictionary/python.py    |  4 +-
 swh/indexer/metadata_dictionary/ruby.py      | 11 +---
 .../tests/metadata_dictionary/test_github.py | 26 +++++++--
 16 files changed, 230 insertions(+), 133 deletions(-)

Changes applied before test

commit 724034de625f3a388a261e1eed3e6a2c9620c539
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Jul 5 17:46:33 2022 +0200

    docs: Update description of the metadata workflow

    1. indexers call themselves directly instead of going through the scheduler
    2. metadata is attached to directories instead of revisions

commit 3458892274226aabe490a795abe5d6fce990be99
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Jul 5 16:43:02 2022 +0200

    github: Translate stargazers_count and watchers_count

commit e177c77baf48b69e420a3eed6b9125a7f209947f
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Jul 4 18:26:24 2022 +0200

    Simplify codemeta.make_absolute_uri()

commit dd9adebeca15c697cc27011693c8d84f6ec1544e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Jul 4 18:25:25 2022 +0200

    Document codemeta.make_absolute_uri()

commit 358ee08416dd847d7ebbddd0c721d7a287149175
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Jul 4 13:55:19 2022 +0200

    Use compact URIs for ForgeFed and ActivityStreams

    It makes resulting documents (usually) shorter, and tests more readable.

commit d41f26eef0561fd41932eb688bc6908f2253ef4c
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Jul 4 13:37:58 2022 +0200

    Use separate base classes for intrinsic and extrinsic mappings

    detect_metadata_files and extrinsic_metadata_formats (respectively) are
    somewhat mutually exclusive, so it does not make much sense to have them
    in the same class and MAPPINGS dict

See https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/342/ for more details.