Details
Diff Detail
- Repository: rDDATASET Datasets
- Branch: master
- Lint: No Linters Available
- Unit: No Unit Test Coverage
- Build Status: Buildable 12259, Build 18593: arc lint + arc unit
Event Timeline
Some things you could try to improve performance after you land this diff:
- Declare the table WITHOUT ROWID (https://sqlite.org/withoutrowid.html).
- Use a cursor, add IF NOT EXISTS ... to the query, and check cursor.total_changes to see whether a row was actually written.
- Alternatively, just use IF NOT EXISTS ... without checking the changes, remove the creation of nodes.csv from this process, and create it from another process that reads the SQLite DB.
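
To make the first two suggestions concrete, here is a minimal sketch using Python's standard `sqlite3` module. It is not the code from this diff: the `nodes` table, its `id` column, and the helper names `open_node_set` and `add_node` are hypothetical. It also assumes `INSERT OR IGNORE` as the way to skip existing rows, and reads `cursor.rowcount` for the per-statement result, since in Python's `sqlite3` the `total_changes` counter lives on the connection rather than the cursor.

```python
import sqlite3


def open_node_set(path=":memory:"):
    """Open (or create) a SQLite-backed set of node identifiers.

    Hypothetical schema: one TEXT primary key, stored WITHOUT ROWID so the
    key itself is the clustered index and no separate rowid is kept.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS nodes (id TEXT PRIMARY KEY) WITHOUT ROWID"
    )
    return conn


def add_node(conn, node_id):
    """Insert node_id if absent; return True only if a row was written."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO nodes (id) VALUES (?)", (node_id,)
    )
    # rowcount is 1 for a fresh insert and 0 when the row already existed.
    return cur.rowcount == 1
```

With this shape, the caller can decide whether to emit a line to nodes.csv based on the return value of `add_node`, or ignore the return value entirely and export the table from a separate process, as the last suggestion proposes.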
swh/dataset/utils.py

Line | Comment
---|---
42 | A short docstring here, please.
54 | Here too, for the return type.
swh/dataset/graph.py

Line | Comment
---|---
49–52 | I think you need both the origin and the visit ID here, or you'll only get one visit per origin.
49–52 | And you probably need to filter the visits to keep only the ones whose status is "final".
49–52 | Good catch for the visit ID, thanks!
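
The two remarks on lines 49–52 can be sketched as follows. This is only an illustration under stated assumptions, not the code under review: visits are assumed to be dicts with hypothetical `origin`, `visit`, and `status` fields, and the set of statuses treated as "final" (`full` and `partial`) is a guess that should be checked against the actual data model.

```python
# Assumption: these are the statuses that count as "final".
FINAL_STATUSES = {"full", "partial"}


def final_visits(visits):
    """Yield each final visit once, keyed by (origin, visit ID).

    Keying on the origin alone would collapse all visits of an origin
    into a single entry, which is the problem raised in the review.
    """
    seen = set()
    for visit in visits:
        if visit.get("status") not in FINAL_STATUSES:
            continue  # skip visits that are still in progress
        key = (visit["origin"], visit["visit"])
        if key in seen:
            continue  # same visit seen again, keep the first occurrence
        seen.add(key)
        yield visit
```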