e.g. ORC exporter-specific config entries are now under the
'orc' section, like:

  journal:
    brokers: [...]
  orc:
    remove_pull_requests: true
    max_rows:
      revision: 100000
      directory: 10000

Depends on D7461.
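A minimal sketch of how an exporter could pick up its own section from a config laid out as above. The `exporter_config` helper and the `CONFIG` dict are hypothetical names introduced for illustration; they are not taken from the swh.dataset code, and the broker list is left empty because it is elided in the summary.

```python
from typing import Any, Dict

# Hypothetical in-memory config mirroring the layout from the summary.
# The broker list is elided ("[...]") in the summary, so it stays empty here.
CONFIG: Dict[str, Any] = {
    "journal": {"brokers": []},
    "orc": {
        "remove_pull_requests": True,
        "max_rows": {"revision": 100000, "directory": 10000},
    },
}


def exporter_config(config: Dict[str, Any], exporter_name: str) -> Dict[str, Any]:
    """Return the section dedicated to one exporter, or {} if it is absent.

    Hypothetical helper: it only illustrates the "one section per exporter"
    layout and is not the lookup used by the actual patch.
    """
    section = config.get(exporter_name)
    if section is None:
        return {}
    if not isinstance(section, dict):
        raise ValueError(f"config section {exporter_name!r} must be a mapping")
    return section


orc_config = exporter_config(CONFIG, "orc")
# Per-table limits live under orc.max_rows; tables not listed have no limit.
max_rows = orc_config.get("max_rows", {})
revision_limit = max_rows.get("revision")
release_limit = max_rows.get("release")  # None: not configured
```

Keeping each exporter's options under its own key means top-level entries such as `journal` stay shared, while exporter-specific keys cannot collide with each other.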
Differential D7462
Move exporter config entries in dedicated sections
Authored by douardda on Mar 29 2022, 5:44 PM.
Event Timeline

Build is green
Patch application report for D7462 (id=27047)
Could not rebase; Attempt merge onto 5a8a8a7847...
Updating 5a8a8a7..ebb5a89
Fast-forward
 swh/dataset/exporters/orc.py | 98 +++++++++++++++++++++++++++++++++++------
 swh/dataset/relational.py | 10 +++++
 swh/dataset/test/test_orc.py | 100 +++++++++++++++++++++++++++++++++++++++----
 3 files changed, 185 insertions(+), 23 deletions(-)

Changes applied before test

commit ebb5a89f95d73f52e87c456b872ec6c529d80fe3
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Mar 25 15:43:18 2022 +0100

    Move exporter config entries in dedicated sections

    eg. orc exporter specific exporter config entries are now under the
    'orc' section, like:

      journal:
        brokers: [...]
      orc:
        remove_pull_requests: true
        max_rows:
          revision: 100000
          directory: 10000

commit e8ccb166a6aaa82f5917388f9b995c830499170a
Author: David Douard <david.douard@sdfa3.org>
Date: Wed Mar 23 16:35:52 2022 +0100

    Add support for limited row numbers in ORC files

    Make it possible to specify a maximum number of rows a table can store
    in a single ORC file. The limit can only be set on main tables for now
    (i.e. cannot be specified for tables like revision_history or
    directory_entry).

    This can be set by configuration only (no extra cli options).

commit fd3f9aa61de374655fd4bc4920d5047eb7d0c4ca
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Mar 18 12:24:31 2022 +0100

    Add the raw_manifest column for revision, release and directory ORC files

commit 5c652bb058e2c1b59bafefd6817f392fdc171a20
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Mar 18 12:22:47 2022 +0100

    Export revision extra headers in a dedicated ORC file

commit 45c8124b7a310963a868eb6602ea24e240d761e4
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Mar 18 12:20:20 2022 +0100

    Add the type fields for revision and origin_visit_status ORC table

See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/99/ for more details.

Build has FAILED
Patch application report for D7462 (id=27066)
Could not rebase; Attempt merge onto fd3f9aa61d...
Updating fd3f9aa..c728c05
Fast-forward
 swh/dataset/exporters/orc.py | 99 +++++++++++++++++++++++++++++++++++---------
 swh/dataset/relational.py | 58 +++++++++++++++-----------
 swh/dataset/test/test_orc.py | 96 ++++++++++++++++++++++++++++++++++++++----
 3 files changed, 202 insertions(+), 51 deletions(-)

Changes applied before test

commit c728c05e630cf02f35081e810279a5e5b24ebf98
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Mar 25 15:43:18 2022 +0100

    Move exporter config entries in dedicated sections

    eg. orc exporter specific exporter config entries are now under the
    'orc' section, like:

      journal:
        brokers: [...]
      orc:
        remove_pull_requests: true
        max_rows:
          revision: 100000
          directory: 10000

commit 850ee3be47cf3b1e0ab53f8820cc5e4c86b94f38
Author: David Douard <david.douard@sdfa3.org>
Date: Wed Mar 23 16:35:52 2022 +0100

    Add support for limited row numbers in ORC files

    Make it possible to specify a maximum number of rows a table can store
    in a single ORC file. The limit can only be set on main tables for now
    (i.e. cannot be specified for tables like revision_history or
    directory_entry).

    This can be set by configuration only (no extra cli options).

Link to build: https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/105/

Build is green
Patch application report for D7462 (id=27075)
Could not rebase; Attempt merge onto fd3f9aa61d...
Updating fd3f9aa..e01daba
Fast-forward
 swh/dataset/exporters/orc.py | 99 +++++++++++++++++++++++++++++++++++---------
 swh/dataset/relational.py | 58 +++++++++++++++-----------
 swh/dataset/test/test_orc.py | 96 ++++++++++++++++++++++++++++++++++++++----
 3 files changed, 202 insertions(+), 51 deletions(-)

Changes applied before test

commit e01daba4d601733a86ce7401fe54247908d03e5c
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Mar 25 15:43:18 2022 +0100

    Move exporter config entries in dedicated sections

    eg. orc exporter specific exporter config entries are now under the
    'orc' section, like:

      journal:
        brokers: [...]
      orc:
        remove_pull_requests: true
        max_rows:
          revision: 100000
          directory: 10000

commit 3df08fd71759487e963e6569c8dfd0c502b060de
Author: David Douard <david.douard@sdfa3.org>
Date: Wed Mar 23 16:35:52 2022 +0100

    Add support for limited row numbers in ORC files

    Make it possible to specify a maximum number of rows a table can store
    in a single ORC file. The limit can only be set on main tables for now
    (i.e. cannot be specified for tables like revision_history or
    directory_entry).

    This can be set by configuration only (no extra cli options).

See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/110/ for more details.
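
The "limited row numbers" commit in the reports above describes capping how many rows one table may store in a single ORC file. The following is only a rough sketch of that idea, not the code from the patch: `RowLimitedTable` is a hypothetical class, and plain Python lists stand in for ORC files so the example needs no ORC library.

```python
from typing import List, Optional, Tuple


class RowLimitedTable:
    """Collect rows for one table, starting a new chunk once the limit is hit.

    Hypothetical illustration: each inner list stands in for one ORC file.
    """

    def __init__(self, name: str, max_rows: Optional[int] = None) -> None:
        if max_rows is not None and max_rows <= 0:
            raise ValueError("max_rows must be a positive integer or None")
        self.name = name
        self.max_rows = max_rows  # None means "no limit"
        self.chunks: List[List[Tuple]] = [[]]

    def write(self, row: Tuple) -> None:
        current = self.chunks[-1]
        # Rotate to a fresh chunk when the current one is already full.
        if self.max_rows is not None and len(current) >= self.max_rows:
            current = []
            self.chunks.append(current)
        current.append(row)


# Tiny limit so the rotation is visible; the config example uses 100000.
table = RowLimitedTable("revision", max_rows=3)
for i in range(7):
    table.write((i,))
sizes = [len(chunk) for chunk in table.chunks]

# A table without a configured limit keeps everything in one chunk.
unlimited = RowLimitedTable("release")
for i in range(7):
    unlimited.write((i,))
```

With `max_rows=3` and seven rows, the sketch ends up with three chunks, the last one partially filled. How the real exporter names or closes the resulting files is not stated in the commit message and is not modeled here.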