E.g. config entries specific to the ORC exporter are now under the
'orc' section, like:

    journal:
      brokers: [...]
    orc:
      remove_pull_requests: true
      max_rows:
        revision: 100000
        directory: 10000

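To illustrate how a consumer might read such a layout, here is a minimal Python sketch. It assumes the configuration has already been parsed into a plain dict; the helper name `exporter_section` and the broker address are hypothetical and are not taken from the swh.dataset code.

```python
from typing import Any, Dict

# Hypothetical parsed configuration mirroring the layout shown above.
# The broker address is a placeholder, not a real endpoint.
config: Dict[str, Any] = {
    "journal": {"brokers": ["broker.example.org:9092"]},
    "orc": {
        "remove_pull_requests": True,
        "max_rows": {"revision": 100000, "directory": 10000},
    },
}


def exporter_section(config: Dict[str, Any], exporter: str) -> Dict[str, Any]:
    """Return the section dedicated to one exporter (empty dict if absent)."""
    section = config.get(exporter) or {}
    if not isinstance(section, dict):
        raise ValueError(f"'{exporter}' section must be a mapping")
    return section


# Exporter-specific entries are looked up in the exporter's own section
# rather than at the top level of the configuration.
orc_config = exporter_section(config, "orc")
max_rows = orc_config.get("max_rows", {})
print(orc_config.get("remove_pull_requests", False))
print(max_rows.get("revision"))
```

Keeping each exporter's options in its own section means a second exporter could define an entry with the same name without clashing.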
Depends on D7461.
Differential D7462
Move exporter config entries in dedicated sections
Authored by douardda on Mar 29 2022, 5:44 PM.
Tags: None
Subscribers: None
Event Timeline

Build is green

Patch application report for D7462 (id=27047)

Could not rebase; Attempt merge onto 5a8a8a7847...

Updating 5a8a8a7..ebb5a89
Fast-forward
 swh/dataset/exporters/orc.py |  98 +++++++++++++++++++++++++++++++++++-------
 swh/dataset/relational.py    |  10 +++++
 swh/dataset/test/test_orc.py | 100 +++++++++++++++++++++++++++++++++++++++----
 3 files changed, 185 insertions(+), 23 deletions(-)

Changes applied before test

commit ebb5a89f95d73f52e87c456b872ec6c529d80fe3
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Mar 25 15:43:18 2022 +0100

    Move exporter config entries in dedicated sections

    eg. orc exporter specific exporter config entries are now under the
    'orc' section, like:

        journal:
          brokers: [...]
        orc:
          remove_pull_requests: true
          max_rows:
            revision: 100000
            directory: 10000

commit e8ccb166a6aaa82f5917388f9b995c830499170a
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Mar 23 16:35:52 2022 +0100

    Add support for limited row numbers in ORC files

    Make it possible to specify a maximum number of rows a table can store
    in a single ORC file. The limit can only be set on main tables for now
    (i.e. cannot be specified for tables like revision_history or
    directory_entry).

    This can be set by configuration only (no extra cli options).

commit fd3f9aa61de374655fd4bc4920d5047eb7d0c4ca
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Mar 18 12:24:31 2022 +0100

    Add the raw_manifest column for revision, release and directory ORC files

commit 5c652bb058e2c1b59bafefd6817f392fdc171a20
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Mar 18 12:22:47 2022 +0100

    Export revision extra headers in a dedicated ORC file

commit 45c8124b7a310963a868eb6602ea24e240d761e4
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Mar 18 12:20:20 2022 +0100

    Add the type fields for revision and origin_visit_status ORC table

See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/99/ for more details.

Build has FAILED

Patch application report for D7462 (id=27066)

Could not rebase; Attempt merge onto fd3f9aa61d...

Updating fd3f9aa..c728c05
Fast-forward
 swh/dataset/exporters/orc.py | 99 +++++++++++++++++++++++++++++++++++---------
 swh/dataset/relational.py    | 58 +++++++++++++++-----------
 swh/dataset/test/test_orc.py | 96 ++++++++++++++++++++++++++++++++++++++----
 3 files changed, 202 insertions(+), 51 deletions(-)

Changes applied before test

commit c728c05e630cf02f35081e810279a5e5b24ebf98
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Mar 25 15:43:18 2022 +0100

    Move exporter config entries in dedicated sections

    eg. orc exporter specific exporter config entries are now under the
    'orc' section, like:

        journal:
          brokers: [...]
        orc:
          remove_pull_requests: true
          max_rows:
            revision: 100000
            directory: 10000

commit 850ee3be47cf3b1e0ab53f8820cc5e4c86b94f38
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Mar 23 16:35:52 2022 +0100

    Add support for limited row numbers in ORC files

    Make it possible to specify a maximum number of rows a table can store
    in a single ORC file. The limit can only be set on main tables for now
    (i.e. cannot be specified for tables like revision_history or
    directory_entry).

    This can be set by configuration only (no extra cli options).

Link to build: https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/105/

Build is green

Patch application report for D7462 (id=27075)

Could not rebase; Attempt merge onto fd3f9aa61d...

Updating fd3f9aa..e01daba
Fast-forward
 swh/dataset/exporters/orc.py | 99 +++++++++++++++++++++++++++++++++++---------
 swh/dataset/relational.py    | 58 +++++++++++++++-----------
 swh/dataset/test/test_orc.py | 96 ++++++++++++++++++++++++++++++++++++++----
 3 files changed, 202 insertions(+), 51 deletions(-)

Changes applied before test

commit e01daba4d601733a86ce7401fe54247908d03e5c
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Mar 25 15:43:18 2022 +0100

    Move exporter config entries in dedicated sections

    eg. orc exporter specific exporter config entries are now under the
    'orc' section, like:

        journal:
          brokers: [...]
        orc:
          remove_pull_requests: true
          max_rows:
            revision: 100000
            directory: 10000

commit 3df08fd71759487e963e6569c8dfd0c502b060de
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Mar 23 16:35:52 2022 +0100

    Add support for limited row numbers in ORC files

    Make it possible to specify a maximum number of rows a table can store
    in a single ORC file. The limit can only be set on main tables for now
    (i.e. cannot be specified for tables like revision_history or
    directory_entry).

    This can be set by configuration only (no extra cli options).

See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/110/ for more details.
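
The per-table row limit described in the "Add support for limited row numbers in ORC files" commit can be pictured with a small Python sketch. This is not the code from swh/dataset/exporters/orc.py: the class `RowLimitedWriter` is hypothetical, it keeps rows in in-memory lists rather than writing real ORC files, and it assumes limits are positive integers and that a table without an entry in `max_rows` has no limit.

```python
from typing import Dict, List, Optional, Tuple


class RowLimitedWriter:
    """Toy stand-in for a per-table writer: "files" are in-memory lists."""

    def __init__(self, max_rows: Optional[Dict[str, int]] = None) -> None:
        # Assumed shape: table name -> maximum number of rows per file.
        self.max_rows = max_rows or {}
        # table name -> list of "files", each file being a list of rows
        self.files: Dict[str, List[List[Tuple]]] = {}

    def write(self, table: str, row: Tuple) -> None:
        files = self.files.setdefault(table, [[]])
        limit = self.max_rows.get(table)  # None means "no limit"
        if limit is not None and len(files[-1]) >= limit:
            files.append([])  # current file is full: start a new one
        files[-1].append(row)


writer = RowLimitedWriter(max_rows={"revision": 2})
for i in range(5):
    writer.write("revision", (i,))  # limited to 2 rows per file
    writer.write("release", (i,))   # no limit configured

print(len(writer.files["revision"]))
print(len(writer.files["release"]))
```

With a limit of 2, the five revision rows are spread over three "files", while the unlimited release table stays in a single one.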