Page MenuHomeSoftware Heritage

Add origin_metadata_get API endpoint
ClosedPublic

Authored by twitu on Jun 21 2019, 8:45 AM.

Active Operations

Details

Summary

I am adding an API end point to get origin intrinsic metadata to from origin_type and origin_url from indexer_storage. This involves adding an endpoint in origin.py and function that queries the db in service.py.

I use storage.origin_get to get basic origin information from the type and url. I use this to pass the id of the origin, to origin_metadata_get in indexer storage which returns to intrinsic metadata of the origin.

referencing T1613

Diff Detail

Repository
rDWAPPS Web applications
Branch
metadata-endpoint
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 6423
Build 8916: tox-on-jenkinsJenkins
Build 8915: arc lint + arc unit

Event Timeline

in reference to T1613
I am returning entry for only one origin id. Is this expected functionality?

We require that all endpoints have unit tests. Could you add some?
You can see in swh/web/tests/api/views/test_origin.py how we do it.

swh/web/api/views/origin.py
436โ€“437

Could you use this as route instead?

@api_route(r'/origin/(?P<origin_type>[a-z]+)/url/(?P<origin_url>.+)//metadata',
           'api-origin-metdata-get')

Both ways are consistent with existing APIs, but we're trying to migrate away from using origin_id. You can look at how the api_origin() endpoint does it.

469โ€“473

'results' is for the results of a search, you can return directly the return value of api_lookup(...).

This revision now requires changes to proceed.Jun 21 2019, 10:09 AM
  • Fix intrinsic metadata enpoint and test case
  • Remove unnecessary tox.ini dependency
swh/web/common/service.py
303

I am still making this call to convert url and type information to id of origin, because intrinsic metadata is only searchable by id

swh/web/tests/api/views/test_origin.py
333

Test case is failing here, citing an Attribute error here. I need multiple context managers because to test the api I need to mock both storage and index-storage. I am not sure why I am unable to mock storage, any suggestions?

twitu marked 2 inline comments as not done.Jun 22 2019, 7:25 AM
  • Change key reference from origin_id to id
swh/web/common/service.py
308

The test is successful when I change this line to origin_ids = [1]. Otherwise I receive a KeyError for id. This indicates that storage.origin_get is not being mocked and is not returning the correct values. The expected value is a dictionary containing key id. I am unable to figure out why this is not behaving as expected.

twitu edited the summary of this revision. (Show Details)
swh/web/common/service.py
308

I added print(origin_info) here to record to log the contents of origin_info.

.tox/py3/lib/python3.5/site-packages/swh/web/tests/api/views/test_origin.py {'url': 'https://github.com/wcoder/highlightjs-line-numbers.js', 'type': 'git'}
{'url': 'repo_with_submodules', 'type': 'git'}
{'url': 'https://github.com/memononen/libtess2', 'type': 'git'}
{'url': 'repo_with_submodules', 'type': 'git'}
{'url': 'repo_with_submodules', 'type': 'git'}
{'url': 'https://github.com/memononen/libtess2', 'type': 'git'}
{'url': 'https://github.com/memononen/libtess2', 'type': 'git'}
{'url': 'https://github.com/wcoder/highlightjs-line-numbers.js', 'type': 'git'}
{'url': 'https://github.com/wcoder/highlightjs-line-numbers.js', 'type': 'git'}
{'url': 'https://github.com/wcoder/highlightjs-line-numbers.js', 'type': 'git'}
{'url': 'repo_with_submodules', 'type': 'git'}
{'url': 'https://github.com/memononen/libtess2', 'type': 'git'}
{'url': 'https://github.com/memononen/libtess2', 'type': 'git'}
{'url': 'https://github.com/wcoder/highlightjs-line-numbers.js', 'type': 'git'}
{'url': 'https://github.com/memononen/libtess2', 'type': 'git'}
{'url': 'repo_with_submodules', 'type': 'git'}
{'url': 'repo_with_submodules', 'type': 'git'}
{'url': 'https://github.com/wcoder/highlightjs-line-numbers.js', 'type': 'git'}
{'url': 'https://github.com/wcoder/highlightjs-line-numbers.js', 'type': 'git'}
{'url': 'https://github.com/memononen/libtess2', 'type': 'git'}
{'url': 'repo_with_submodules', 'type': 'git'}
{'url': 'https://github.com/wcoder/highlightjs-line-numbers.js', 'type': 'git'}
{'url': 'repo_with_submodules', 'type': 'git'}
{'url': 'https://github.com/wcoder/highlightjs-line-numbers.js', 'type': 'git'}
{'url': 'repo_with_submodules', 'type': 'git'}
{'url': 'repo_with_submodules', 'type': 'git'}
{'url': 'https://github.com/wcoder/highlightjs-line-numbers.js', 'type': 'git'}
{'url': 'https://github.com/memononen/libtess2', 'type': 'git'}
{'url': 'https://github.com/memononen/libtess2', 'type': 'git'}
{'url': 'https://github.com/memononen/libtess2', 'type': 'git'}
{'url': 'repo_with_submodules', 'type': 'git'}
{'url': 'https://github.com/wcoder/highlightjs-line-numbers.js', 'type': 'git'}
{'url': 'https://github.com/memononen/libtess2', 'type': 'git'}
{'url': 'https://github.com/wcoder/highlightjs-line-numbers.js', 'type': 'git'}
{'url': 'repo_with_submodules', 'type': 'git'}
{'url': 'https://github.com/memononen/libtess2', 'type': 'git'}
{'url': 'https://github.com/memononen/libtess2', 'type': 'git'}
{'url': 'https://github.com/wcoder/highlightjs-line-numbers.js', 'type': 'git'}
{'url': 'repo_with_submodules', 'type': 'git'}
{'url': 'https://github.com/memononen/libtess2', 'type': 'git'}
{'url': 'repo_with_submodules', 'type': 'git'}
{'url': 'https://github.com/wcoder/highlightjs-line-numbers.js', 'type': 'git'}
{'url': 'repo_with_submodules', 'type': 'git'}
{'url': 'https://github.com/wcoder/highlightjs-line-numbers.js', 'type': 'git'}
{'url': 'https://github.com/memononen/libtess2', 'type': 'git'}
{'url': 'https://github.com/memononen/libtess2', 'type': 'git'}
{'url': 'https://github.com/memononen/libtess2', 'type': 'git'}
<-- removed repeated value for clarity -->

What I got back does not contain the any key for id, though it should be part of the returned values according to API docs.

twitu updated this revision to Diff 5470.EditedJun 25 2019, 5:09 AM

Fix test case. Silly mistakes are costly.

Cool, thanks! Two last nitpicks:

  1. you can drop the mocking of swh.web.common.service.storage and add the origin in it directly
  2. Could you check that origin_intrinsic_metadata_get is called with the right arguments? You can use this function: https://docs.python.org/3/library/unittest.mock.html#unittest.mock.Mock.assert_called_once_with
This revision now requires changes to proceed.Jun 25 2019, 10:29 AM

Add assert_called_once_with to test case

This revision is now accepted and ready to land.Jun 25 2019, 5:21 PM

Sorry, I just noticed, but this needs to be updated:

.. http:get:: /api/1/origin/(origin_type)/url/(origin_url)/intrinsic-metadata
This revision now requires changes to proceed.Jun 25 2019, 5:22 PM

Nevermind, I misread it. It's fine

This revision is now accepted and ready to land.Jun 25 2019, 5:22 PM

Oh, but there's a double slash and a typo here:

@api_route(r'/origin/(?P<origin_type>[a-z]+)/url/(?P<origin_url>.+)'
           '//intrinsic-metadata', 'api-origin-intrinsic-metdata-get')

it should be replaced by:

@api_route(r'/origin/(?P<origin_type>[a-z]+)/url/(?P<origin_url>.+)'
           '/intrinsic-metadata', 'api-origin-intrinsic-metadata-get')
This revision now requires changes to proceed.Jun 25 2019, 5:23 PM

The + is not needed at the end off the line

Remove + symbol from end of line

Ready to land made all the required changes.

This revision is now accepted and ready to land.Jun 25 2019, 5:48 PM
This revision was automatically updated to reflect the committed changes.