Merged in https://forge.softwareheritage.org/rDMOD9523be0552d822be617da77bf0d2ca2f479da572
Mar 26 2021
seirl committed rDMOD9523be0552d8: Model test data: add Release with no author/date (authored by seirl).
Remove phabricator garbage
The ORC exporter is done, and it's likely that we won't provide CSV exports in the future, or we'll generate them from the ORC format.
seirl closed T1847: fully automate export of the graph dataset, a subtask of T1848: refresh graph dataset export, as Resolved.
Mar 25 2021
Mar 24 2021
Mar 23 2021
Add permissions on edge labels
DirEntry: allow for empty permission field
seirl committed rDGRPHe0be35f0f59e: labels: use -label prefix for all edge labels, instead of -filename-labels (authored by seirl).
ReadLabelledGraph: use FCL instead of PFCL
java: add subdataset exporting functions
seirl committed rDGRPH9a20f2e9bc2c: LabelMapBuilder: use low-level scanning of the input file (authored by seirl).
LabelMapBuilder: restructure in functions
seirl committed rDGRPH7b31937a4715: LabelMapBuilder: non-static builder function (authored by seirl).
seirl committed rDGRPH2fcd96d7bb21: LabelMapBuilder: remove need for hashtable, sync streams (authored by seirl).
Use MPH functions operating on byte arrays
seirl committed rDGRPH4e2fedc3bce8: LabelMapBuilder: refactor logic in separate line iterators (authored by seirl).
seirl committed rDGRPH0aa061682e95: LabelMapBuilder: support both sorting methods (authored by seirl).
seirl committed rDGRPH968f9c6c2d0e: LabelMapBuilder: add TextualEdgeLabelLineIterator, fix BSort (authored by seirl).
Merge branch 'label_permissions'
Mar 4 2021
Feb 24 2021
Add FindEarliestRevision tool
Feb 15 2021
Add ORC exporter
ORC exporter: Add unit tests
seirl committed rDDATASETbf8d2625d3b3: Refactor export paths in the base Exporter class (authored by seirl).
seirl committed rDDATASET40f068d648d2: ORC exporter: avoid fromtimestamp(), use datetime() from epoch instead (authored by seirl).
Feb 12 2021
I added unit tests, reworked the logic, and addressed @olasd's comment. Could you please re-review? :-)
ORC exporter: avoid fromtimestamp(), use datetime() from epoch instead
- Add unit tests
- Refactor export paths in the base Exporter class
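A minimal sketch of what "use datetime() from epoch" can look like in Python. The helper name `datetime_from_epoch` and its parameters are hypothetical and are not taken from the exporter's code. The motivation is also an assumption: `datetime.fromtimestamp()` interprets the value in local time unless a `tz` is passed, and it can raise `OverflowError` or `OSError` for values outside the range the platform's C time functions support, whereas adding a `timedelta` to a fixed epoch stays in pure Python arithmetic.

```python
import datetime

# Fixed, timezone-aware reference point: the Unix epoch in UTC.
EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def datetime_from_epoch(seconds: int, microseconds: int = 0) -> datetime.datetime:
    """Build an aware UTC datetime by adding an offset to the Unix epoch.

    Hypothetical helper: unlike datetime.fromtimestamp(), this does not go
    through the platform's localtime()/gmtime(), so negative offsets
    (dates before 1970) behave the same on every platform.
    """
    return EPOCH + datetime.timedelta(seconds=seconds, microseconds=microseconds)


if __name__ == "__main__":
    print(datetime_from_epoch(0))   # 1970-01-01 00:00:00+00:00
    print(datetime_from_epoch(-1))  # 1969-12-31 23:59:59+00:00
```

The result is always expressed in UTC; any timezone offset stored alongside the timestamp would have to be applied separately.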
Feb 2 2021
seirl triaged T3021: Investigate why reading the journal of the content table takes so long as Normal priority.
Jan 8 2021
Rebase on master, include webgraph files
I'm realizing that this is missing the "simplified" step and needs more changes.
seirl committed rDGRPH5a987aae6e93: config: sane default for batch_size using a heuristic on ram size (authored by seirl).
seirl updated the diff for D4820: config: sane default for batch_size using a heuristic on ram size.
rebase
seirl updated the diff for D4820: config: sane default for batch_size using a heuristic on ram size.
Add task name to commit message
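For illustration only, a RAM-based default for `batch_size` could be derived along these lines. Everything here is an assumption rather than the logic in D4820: the function names, the 40% share of memory, the 8 bytes per item, and the lower bound are all hypothetical values chosen to show the shape of such a heuristic.

```python
import os


def total_ram_bytes() -> int:
    """Return physical RAM in bytes (Unix only, via os.sysconf)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def default_batch_size(
    ram_bytes: int,
    percent_of_ram: int = 40,    # hypothetical share of RAM given to one batch
    bytes_per_item: int = 8,     # hypothetical memory cost of one batch item
    minimum: int = 1_000_000,    # hypothetical lower bound for small machines
) -> int:
    """Pick a batch size proportional to available RAM, never below `minimum`."""
    budget = ram_bytes * percent_of_ram // 100
    return max(minimum, budget // bytes_per_item)


if __name__ == "__main__":
    # On a Unix host, this derives the default from the machine's actual RAM.
    print(default_batch_size(total_ram_bytes()))
```

An explicit `batch_size` in the configuration would still take precedence; the heuristic only supplies a value when none is set.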
seirl committed rDGRPH85da2e78d681: cli: compression: fix weird bug when using ranges of steps (authored by seirl).
seirl committed rDGRPH317205722b65: Compression: set custom temporary directory at the java level (authored by seirl).
seirl committed rDGRPHa4b6570e16ec: Compression: read only src/dst from the labelled edge file (authored by seirl).
Jan 7 2021
java: bump unimi dependencies
Jan 6 2021
Jan 5 2021
Dec 17 2020
seirl retitled D4762: Add ORC exporter from "Add ORC exporterThis adds a new exporter in columnar format (Apache ORC) using the PyORClibrary. The output can be used on various clouds like AWS S3." to "Add ORC exporter".
seirl retitled D4762: Add ORC exporter from "Add ORC exporter
This adds a new exporter in columnar format (Apache ORC) using the PyORC
library. The output can be used on various clouds like AWS S3." to "Add ORC exporterThis adds a new exporter in columnar format (Apache ORC) using the PyORClibrary. The output can be used on various clouds like AWS S3.".
seirl committed rDDATASETe439aa686f22: Edge exporter: use common remove_pull_requests() function (authored by seirl).
Dec 16 2020
seirl committed rDDATASETcb71cea14def: journalprocessor: be resilient to exporter errors (authored by seirl).
seirl committed rDDATASET6577f653f3c6: Export CLI: add a way to exclude specific object types (authored by seirl).
seirl committed rDDATASETf3b156598000: journalprocessor: fix hashing of origin_visit_status objects (authored by seirl).
Normalize .timestamp()
Dec 15 2020
seirl added a reviewer for D4750: journalprocessor: fix hashing of origin_visit_status objects: Reviewers.
Landed, but phabricator doesn't seem to see it.
- journalprocessor: remove comment about deserialize_message overload being a 'hack'
- journalprocessor: also partition sqlite files by first byte
- SQLite on-disk set: disable journalling and synchronous mode
- tests: fix test_export_origin
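The two SQLite items in the list above can be sketched with Python's standard `sqlite3` module. This is a hedged illustration, not the journal processor's code: the table name `seen`, the helpers `shard_path`, `open_fast_set` and `add_if_new`, and the file naming scheme are hypothetical. The only parts taken from the list are the ideas themselves: one SQLite file per first byte of the key, with journalling and synchronous writes turned off.

```python
import os
import sqlite3
import tempfile


def shard_path(directory: str, key: bytes) -> str:
    """Map a key to one SQLite file per first byte (hypothetical naming)."""
    return os.path.join(directory, f"{key[0]:02x}.sqlite")


def open_fast_set(path: str) -> sqlite3.Connection:
    """Open an on-disk set that trades crash safety for write speed."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode = OFF")  # no rollback journal
    conn.execute("PRAGMA synchronous = OFF")   # do not wait for fsync
    conn.execute("CREATE TABLE IF NOT EXISTS seen (key BLOB PRIMARY KEY)")
    return conn


def add_if_new(conn: sqlite3.Connection, key: bytes) -> bool:
    """Insert the key; return True only if it was not already present."""
    cur = conn.execute("INSERT OR IGNORE INTO seen (key) VALUES (?)", (key,))
    return cur.rowcount == 1


with tempfile.TemporaryDirectory() as tmp:
    key = bytes.fromhex("ab12")
    conn = open_fast_set(shard_path(tmp, key))
    first = add_if_new(conn, key)    # key is new
    second = add_if_new(conn, key)   # key is already in the set
    conn.close()
```

With both pragmas off, a crash can leave the file corrupted, so this setup is only reasonable when the set can be rebuilt from scratch.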
Dec 14 2020
Add 2020-12-07 coregraphie
Dec 11 2020
- Exporter documentation fixes
- Journal processor: fetch offsets in parallel
Fix various coding errors and minor improvements
seirl committed rDDATASETf1952316a1ea: Graph export: add labels to the export CSV format (authored by seirl).
Better commit message:
Dec 10 2020
Dec 9 2020
seirl committed rDDATASETb21d4a5ca327: graph exporter: schema upgrade for origin_visit_status (authored by seirl).
Dec 8 2020
Subscribe to the correct objects
Fix variable name
seirl added a reviewer for D4691: graph exporter: schema upgrade for origin_visit_status: Reviewers.