Add R metadata indexer
Details
Diff Detail
- Repository
- rDCIDX Metadata indexer
- Branch
- add-r-indexer
- Lint
- No Linters Available
- Unit
- No Unit Test Coverage
- Build Status
- Buildable 21508
- Build 33416: Phabricator diff pipeline on jenkins (Jenkins console · Jenkins)
- Build 33415: arc lint + arc unit
Event Timeline
Build has FAILED
Patch application report for D5417 (id=19371)
Rebasing onto 8f1fb0f931...
Current branch diff-target is up to date.
Changes applied before test
commit fecd016150e4a3fb6b4c335386f32c07251ac4b8
Author: aastha1999 <asthana.aastha1999@gmail.com>
Date: Sat Apr 3 21:13:53 2021 +0000
    Add R DESCRIPTION indexer
Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/168/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/168/console
Hi @vlorentz. This is my first attempt at creating an R metadata indexer. I still need to work more on normalization. Also, I couldn't find a translation for fields such as "Imports", "Collate" etc. in codemeta.
Build has FAILED
Patch application report for D5417 (id=19375)
Rebasing onto 8f1fb0f931...
Current branch diff-target is up to date.
Changes applied before test
commit b32269e8928c78571d21519abf8fa1eb90e2c427
Author: aastha1999 <asthana.aastha1999@gmail.com>
Date: Mon Apr 5 19:58:46 2021 +0000
    Add python-debian in requirements
commit fecd016150e4a3fb6b4c335386f32c07251ac4b8
Author: aastha1999 <asthana.aastha1999@gmail.com>
Date: Sat Apr 3 21:13:53 2021 +0000
    Add R DESCRIPTION indexer
Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/169/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/169/console
Build has FAILED
Patch application report for D5417 (id=19376)
Rebasing onto 8f1fb0f931...
Current branch diff-target is up to date.
Changes applied before test
commit 15af34bef18566a5d6d7a9a676e784f50deea9ce
Author: aastha1999 <asthana.aastha1999@gmail.com>
Date: Mon Apr 5 20:13:16 2021 +0000
    Updating D5417
commit b32269e8928c78571d21519abf8fa1eb90e2c427
Author: aastha1999 <asthana.aastha1999@gmail.com>
Date: Mon Apr 5 19:58:46 2021 +0000
    Add python-debian in requirements
commit fecd016150e4a3fb6b4c335386f32c07251ac4b8
Author: aastha1999 <asthana.aastha1999@gmail.com>
Date: Sat Apr 3 21:13:53 2021 +0000
    Add R DESCRIPTION indexer
Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/170/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/170/console
> Also, I couldn't find a translation for fields such as "Imports", "Collate" etc. in codemeta.
It's ok, we don't have to translate everything.
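To illustrate translating only the fields that have a known counterpart, here is a minimal sketch. It assumes a DESCRIPTION file made of `Key: value` records with indented continuation lines, and it uses a hand-rolled parser rather than python-debian so that it runs without extra dependencies. `FIELD_MAP`, `parse_description` and `translate` are hypothetical names, the mapping is illustrative rather than the actual CodeMeta crosswalk, and the short output keys are an assumption about the target vocabulary.

```python
# Hypothetical mapping for illustration only; not the actual CodeMeta crosswalk.
FIELD_MAP = {
    "Package": "identifier",
    "Title": "name",
    "Version": "version",
    "Description": "description",
}


def parse_description(text):
    """Parse 'Key: value' records; indented lines continue the previous field."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0] in " \t" and current is not None:
            # Continuation line: append to the field being read.
            fields[current] += " " + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            current = key.strip()
            fields[current] = value.strip()
    return fields


def translate(fields):
    """Keep only the fields that have an entry in FIELD_MAP; drop the rest."""
    return {FIELD_MAP[key]: value for key, value in fields.items() if key in FIELD_MAP}


SAMPLE = """Package: example
Title: An Example Package
Version: 0.1.0
Description: A longer description that
    continues on a second line.
Imports: methods, utils
Collate: 'a.R' 'b.R'
"""

result = translate(parse_description(SAMPLE))
```

In this sketch, "Imports" and "Collate" are parsed but silently left out of `result`, since they have no entry in the mapping.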
Build is green
Patch application report for D5417 (id=20529)
Rebasing onto 8fd4846af5...
First, rewinding head to replay your work on top of it...
Fast-forwarded diff-target to base-revision-175-D5417.
Changes applied before test
See https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/175/ for more details.
Build is green
Patch application report for D5417 (id=20530)
Could not rebase; Attempt merge onto 8fd4846af5...
Updating 8fd4846..64936d0
Fast-forward
 requirements.txt | 1 +
 swh/indexer/metadata_dictionary/R.py | 48 +++++++++++++++
 swh/indexer/metadata_dictionary/__init__.py | 3 +-
 swh/indexer/storage/__init__.py | 10 +++-
 swh/indexer/tests/storage/test_storage.py | 1 +
 swh/indexer/tests/test_cli.py | 1 +
 swh/indexer/tests/test_metadata.py | 92 +++++++++++++++++++++++++++++
 7 files changed, 154 insertions(+), 2 deletions(-)
 create mode 100644 swh/indexer/metadata_dictionary/R.py
Changes applied before test
commit 64936d037ede2d47fd2dd52564328b54dbfb8532
Author: aastha1999 <asthana.aastha1999@gmail.com>
Date: Mon Apr 5 20:13:16 2021 +0000
    Updating D5417
commit 3ce4e0407a2f852996208a5674866bc094032990
Author: aastha1999 <asthana.aastha1999@gmail.com>
Date: Mon Apr 5 19:58:46 2021 +0000
    Add python-debian in requirements
commit 5f4064de8a62d6e2f277b5ecb18a39aa873c089e
Author: aastha1999 <asthana.aastha1999@gmail.com>
Date: Sat Apr 3 21:13:53 2021 +0000
    Add R DESCRIPTION indexer
See https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/176/ for more details.
Build is green
Patch application report for D5417 (id=20531)
Rebasing onto 8fd4846af5...
Current branch diff-target is up to date.
Changes applied before test
commit 64936d037ede2d47fd2dd52564328b54dbfb8532
Author: aastha1999 <asthana.aastha1999@gmail.com>
Date: Mon Apr 5 20:13:16 2021 +0000
    Updating D5417
commit 3ce4e0407a2f852996208a5674866bc094032990
Author: aastha1999 <asthana.aastha1999@gmail.com>
Date: Mon Apr 5 19:58:46 2021 +0000
    Add python-debian in requirements
commit 5f4064de8a62d6e2f277b5ecb18a39aa873c089e
Author: aastha1999 <asthana.aastha1999@gmail.com>
Date: Sat Apr 3 21:13:53 2021 +0000
    Add R DESCRIPTION indexer
See https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/177/ for more details.
Thanks!
The schema:license field should be one of:
- a URI,
- a CreativeWork object, or
- an array of URIs and/or CreativeWork objects
URIs can be obtained from license names like this: https://forge.softwareheritage.org/source/swh-indexer/browse/master/swh/indexer/metadata_dictionary/npm.py$136-143
Unfortunately, it will be harder for the "+ file LICENSE" part. I don't see a good solution for this; the input data is just unusable :/
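As a rough sketch of one way to turn an R License value into an array of URIs: the code below assumes SPDX-style URIs of the form `https://spdx.org/licenses/<id>`, assumes alternatives are separated by `|`, and simply drops a trailing "+ file LICENSE" suffix, which loses whatever that file says. `SPDX_IDS` and `normalize_license` are hypothetical names, and the lookup table is a tiny illustrative subset, not a complete mapping of R license names.

```python
import re

SPDX_BASE = "https://spdx.org/licenses/"

# Hypothetical lookup table for illustration; a real one would need many more names.
SPDX_IDS = {
    "GPL-2": "GPL-2.0-only",
    "GPL-3": "GPL-3.0-only",
    "MIT": "MIT",
}

# Matches a trailing "+ file LICENSE" (or "LICENCE") suffix.
FILE_SUFFIX = re.compile(r"\s*\+\s*file\s+LICEN[CS]E\s*$")


def normalize_license(value):
    """Return a list of {"@id": uri} objects for the license names we recognize.

    Alternatives separated by "|" are handled one by one. Names that are not
    in SPDX_IDS (including a bare "file LICENSE") are skipped, not guessed.
    """
    uris = []
    for part in value.split("|"):
        name = FILE_SUFFIX.sub("", part.strip())
        spdx_id = SPDX_IDS.get(name)
        if spdx_id is not None:
            uris.append({"@id": SPDX_BASE + spdx_id})
    return uris
```

With this sketch, a value such as "MIT + file LICENSE" yields only the MIT URI, and a bare "file LICENSE" yields an empty list. That is lossy by design and only one possible trade-off.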
Can you look for other examples, to see what license field other packages use, and list them here?
A few comments below; they should be easy to fix:
swh/indexer/metadata_dictionary/R.py

| Line | Comment |
|---|---|
| 17 | Please fill this in. |
| 19 | Hmm, that's not a very clear name without context. What about "r-description"? |
| 41–45 | I don't see these in the expected output in the test. No need to parse them if they are not translatable. |

swh/indexer/tests/test_metadata.py

| Line | Comment |
|---|---|
| 247 | Hmm, this should be automatically converted from "schema:url" to "url". I'll look into the issue. |