Details
- Reviewers
ardumont - Group Reviewers
Reviewers - Maniphest Tasks
- T737: create origin_metadata table
- Commits
- rDSTOCd13751aeaa9a: Refactor entry points to origin_metadata table with get_by function
rDSTOC48fc6f750003: Refactor origin_metadata and adding provider table and logic
rDSTOC3b425af2ebbc: Create origin_metadata tables and logic
rDSTOC17882a93521e: test for origin_metadata add and get functions pass
rDSTOC76c5326599f4: Create origin_metadata tables and logic
R65:d13751aeaa9a: Refactor entry points to origin_metadata table with get_by function
R65:48fc6f750003: Refactor origin_metadata and adding provider table and logic
R65:17882a93521e: test for origin_metadata add and get functions pass
R65:76c5326599f4: Create origin_metadata tables and logic
R65:3b425af2ebbc: Create origin_metadata tables and logic
rDSTOd13751aeaa9a: Refactor entry points to origin_metadata table with get_by function
rDSTO48fc6f750003: Refactor origin_metadata and adding provider table and logic
rDSTO76c5326599f4: Create origin_metadata tables and logic
rDSTO17882a93521e: test for origin_metadata add and get functions pass
rDSTO3b425af2ebbc: Create origin_metadata tables and logic
tests for add, get, get_all and get_by_provenance
Diff Detail
- Repository
- rDSTO Storage manager
- Lint
Automatic diff as part of commit; lint not applicable. - Unit
Automatic diff as part of commit; unit tests not applicable.
Event Timeline
I only have nitpicks about it:
- indentation consistency
- documentation (missing types or Args section)
- a test which misses something (at least another origin_metadata entry)
swh/storage/db.py | ||
---|---|---|
1053 ↗ | (On Diff #848) | Please add the variable's "type". |
1078 ↗ | (On Diff #848) | what's the \? |
1100 ↗ | (On Diff #848) | tabulation here is not the same as prior functions. |
swh/storage/storage.py | ||
1751 ↗ | (On Diff #848) | Missing the Args entry. |
swh/storage/tests/test_storage.py | ||
3542 | Maybe add another origin_metadata from another provenance. |
- Refactor origin_metadata and adding provider table and logic
- Merge branch 'master' of ssh://forge.softwareheritage.org/diffusion/DSTO/swh-storage
Added some remarks.
I did not check the tests yet though.
sql/swh-schema.sql | ||
---|---|---|
17 | Why 111 and not 110? | |
swh/storage/api/client.py | ||
287 | We should merge those methods (get/get_all). If someone wants only 1 origin_metadata, (s)he should simply asks for one origin_metadata (list of 1 element). That's what we do for most of our endpoints content, revision, etc... Not sure why we diverge for cache_content_get/get_all endpoint. To sum up, as we can avoid the duplication here, we should stick to the standard :) | |
293 | indentation :) | |
303 | provider/get? | |
swh/storage/db.py | ||
1033 ↗ | (On Diff #851) | missing one space tool and metadata. |
1038 ↗ | (On Diff #851) | time when the metadata was found? |
1046 ↗ | (On Diff #851) | cur = self._cursor(cur) |
1075 ↗ | (On Diff #851) | It should probably be named origin_metadata_get_by_origin. |
1095 ↗ | (On Diff #851) | This probably could be merged with the method above as get_by. |
1147 ↗ | (On Diff #851) | cur = self._cursor(cur) |
swh/storage/storage.py | ||
1772 ↗ | (On Diff #851) | a nitpick, if we take a look either at db or storage module, the pattern is inverted. if not om: return None return dict... |
1795 ↗ | (On Diff #851) | yield dict(... |
1798 ↗ | (On Diff #851) | Maybe we could merge origin_metadata_get_by_provider_type and origin_metadata_get_all (which is a get by origin actually) to origin_metadata_get_by. This could either take an origin_id and a provider_type (defaulting to None) or a dict of filters. You'd also need to unify the information returned (that is join on the provider table to return those information even in the case of the get_by_origin, to return more than just the provider id). |
1826 ↗ | (On Diff #851) | same :) |
- Refactor origin_metadata and adding provider table and logic
- Refactor entry point to origin_metadata table
sql/swh-schema.sql | ||
---|---|---|
17 | because i added origin_metadata on the 110 but i can put it back to 110 | |
swh/storage/db.py | ||
1075 ↗ | (On Diff #851) | you had a great idea making all get method, so now we have only origin_metadata_get
if a second param (provider_type) is given it's:
|
This sounds quite good.
I believe there remains some dead code though. Can you please check and adapt accordingly?
Cheers,
swh/storage/api/client.py | ||
---|---|---|
294 | I think this is some dead code. | |
swh/storage/api/server.py | ||
434 | If my previous assertion is correct, this should disappear as well. | |
swh/storage/tests/test_storage.py | ||
3593 | This tends to agree with my previous assertions :) |
swh/storage/db.py | ||
---|---|---|
1107 ↗ | (On Diff #868) | cur = self._cursor(cur) |