Details
Diff Detail
- Repository
- rDGRPH Compressed graph representation
- Branch
- benchmark-browsing
- Lint
- No Linters Available
- Unit
- No Unit Test Coverage
- Build Status
- Buildable 7240
- Build 10231: tox-on-jenkins Jenkins
- Build 10230: arc lint + arc unit
Event Timeline
Build has FAILED
Link to build: https://jenkins.softwareheritage.org/job/DGRPH/job/tox/2/
See console output for more information: https://jenkins.softwareheritage.org/job/DGRPH/job/tox/2/console
docs/api.rst | ||
---|---|---|
38 | A bit awkward/unclear. Simpler: "average time needed to traverse an edge", or something similar. | |
java/server/src/main/java/org/softwareheritage/graph/Endpoint.java | ||
59 | It's not great to return digested data at this inner level. It'd be best to return raw info here, like the total traversal time (which you already have) and the number of traversed edges. Then do the avg computation later, when you have to output/use it. | |
java/server/src/main/java/org/softwareheritage/graph/benchmark/Common.java | ||
42 | I haven't checked the code as I'm on my phone, but Statistics sounds like the right place where you want to compute the avg. You should just make sure the raw data needed to do so reach it. |
java/server/src/main/java/org/softwareheritage/graph/Endpoint.java | ||
---|---|---|
59 | The number of traversed edges is already returned implicitly as the size of the result field. | |
java/server/src/main/java/org/softwareheritage/graph/benchmark/Common.java | ||
42 | Right now the Statistics class is a general class to compute stats on double values, because AccessEdge handles raw timings rather than Endpoint.Output. I'll try to think of a way to have more specific, endpoint-related statistics classes/methods. |
java/server/src/main/java/org/softwareheritage/graph/Endpoint.java | ||
---|---|---|
59 | Yes and no. For instance, I suspect there is an off-by-one, due to the difference between returning the number of *nodes* and the number of *edges*. Also, for endpoints that only return the final node (e.g., for the provenance use cases), you will not have anything meaningful on which to call size(). Hence my suggestion to return the number of edges you've traversed alongside the timings (it's not a time, but it's still a measure of how much work the endpoint calculation did). Downstream users can then consume it to compute and output the average cost. |
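
To make the nodes-versus-edges point concrete, here is a minimal sketch. It is not the project's code: the class `EdgeCountSketch`, its `visit` method, and the adjacency-array graph are hypothetical stand-ins, and only the counter name `nbEdgesAccessed` comes from this revision. It counts every edge accessed during a visit, so the count can be compared with the size of the returned node list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not the project's code: a tiny adjacency-array graph
// used to compare "nodes returned" with "edges accessed".
class EdgeCountSketch {
    // Edges accessed by the last call to visit() (hypothetical counter).
    long nbEdgesAccessed = 0;

    // Depth-first visit from src; returns the visited nodes in visit order.
    List<Integer> visit(int[][] successors, int src) {
        nbEdgesAccessed = 0;
        List<Integer> visited = new ArrayList<>();
        Set<Integer> seen = new HashSet<>();
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(src);
        seen.add(src);
        while (!stack.isEmpty()) {
            int node = stack.pop();
            visited.add(node);
            for (int succ : successors[node]) {
                // Every outgoing edge counts, even if it leads to a seen node.
                nbEdgesAccessed++;
                if (seen.add(succ)) {
                    stack.push(succ);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // Chain: 0 -> 1 -> 2
        int[][] chain = { {1}, {2}, {} };
        // Diamond: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
        int[][] diamond = { {1, 2}, {3}, {3}, {} };
        EdgeCountSketch sketch = new EdgeCountSketch();
        for (int[][] graph : new int[][][] { chain, diamond }) {
            List<Integer> nodes = sketch.visit(graph, 0);
            System.out.println("nodes returned: " + nodes.size()
                    + ", edges accessed: " + sketch.nbEdgesAccessed);
        }
    }
}
```

On the chain the two numbers differ by one, and on the diamond they happen to coincide, so the size of the result is not a reliable proxy for the number of edges in either direction.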
java/server/src/main/java/org/softwareheritage/graph/Endpoint.java | ||
---|---|---|
59 | Oh I see, so yes, I do need to add a new meta info. However, I'm not sure what the proper way to do so would be. I don't want yet another wrapper class to return both the result and the meta information, especially since the meta information is all calculated in Endpoint, and this class was only created as a wrapper to do all the higher-level computation around the low-level traversal methods. |
- Add a new meta info nbEdgesAccessed
- Move utils/ into benchmark/utils/
I ended up using a member variable to count the number of edges accessed in the
lower-level class. I find this better than a wrapper class, but I'm still open to
other suggestions (or arguments for a wrapper class!). ;)
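
For readers following the thread, the member-variable approach could look roughly like the sketch below. This is an illustration under assumptions, not the code in this diff: `TraversalSketch`, `EndpointSketch`, the `Output` fields, `walk`, and `avgNanosPerEdge` are all hypothetical names, and only `nbEdgesAccessed` is taken from the summary above. The low-level class counts edges in a member variable, the endpoint layer copies the raw count and the raw elapsed time into its output, and the average is only computed by whoever consumes that output.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the low-level traversal class.
class TraversalSketch {
    // Member variable updated while traversing (name taken from the summary).
    private long nbEdgesAccessed = 0;

    // Follows the first successor of each node until a node without
    // successors is reached. Assumes the input graph is acyclic.
    List<Integer> walk(int[][] successors, int src) {
        nbEdgesAccessed = 0;
        List<Integer> path = new ArrayList<>();
        int node = src;
        path.add(node);
        while (successors[node].length > 0) {
            nbEdgesAccessed++;
            node = successors[node][0];
            path.add(node);
        }
        return path;
    }

    long getNbEdgesAccessed() {
        return nbEdgesAccessed;
    }
}

// Hypothetical stand-in for the endpoint layer: it returns raw measurements
// only and leaves any averaging to downstream code.
class EndpointSketch {
    static class Output {
        List<Integer> result;
        long traversalNanos;   // raw total traversal time
        long nbEdgesAccessed;  // raw amount of work done
    }

    static Output walk(int[][] successors, int src) {
        TraversalSketch traversal = new TraversalSketch();
        Output out = new Output();
        long start = System.nanoTime();
        out.result = traversal.walk(successors, src);
        out.traversalNanos = System.nanoTime() - start;
        out.nbEdgesAccessed = traversal.getNbEdgesAccessed();
        return out;
    }

    // Downstream consumer (e.g. a benchmark): derives the average here.
    static double avgNanosPerEdge(Output out) {
        if (out.nbEdgesAccessed == 0) {
            return 0.0; // nothing traversed, avoid dividing by zero
        }
        return (double) out.traversalNanos / out.nbEdgesAccessed;
    }

    public static void main(String[] args) {
        int[][] chain = { {1}, {2}, {} }; // 0 -> 1 -> 2
        Output out = walk(chain, 0);
        System.out.println("edges accessed: " + out.nbEdgesAccessed);
        System.out.println("avg ns per edge: " + avgNanosPerEdge(out));
    }
}
```

One trade-off of keeping the counter as a member variable is that a traversal object is not safe to share between concurrent requests, which is why this sketch creates a fresh one per call.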
Build has FAILED
Link to build: https://jenkins.softwareheritage.org/job/DGRPH/job/tox/3/
See console output for more information: https://jenkins.softwareheritage.org/job/DGRPH/job/tox/3/console