- ensure the last offset is sent to the queue,
- fix the computation of the progress value (off-by-one).
Depends on D6232
Differential D6233
Make sure the progress bar for the export reaches 100%
Authored by douardda on Sep 9 2021, 5:54 PM.
Tags: None
Subscribers: None
Details
Depends on D6232
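The off-by-one named in the summary can be illustrated with a minimal sketch. It is not the code from `swh/dataset/journalprocessor.py`; the function name `progress_fraction` and the convention that `hi` is one past the last offset to export are assumptions made for the example. If the last handled offset is `hi - 1`, counting `last_offset - lo` messages as done stops one short of the total, so the bar never reaches 100%.

```python
def progress_fraction(lo: int, hi: int, last_offset: int) -> float:
    """Fraction of the half-open offset range [lo, hi) already processed.

    Hypothetical helper for illustration only. `last_offset` is the offset
    of the most recent message handled, so the number of messages done is
    `last_offset - lo + 1`; dropping the `+ 1` is the off-by-one.
    """
    total = hi - lo
    if total <= 0:
        # Empty range: nothing to export, report it as complete.
        return 1.0
    done = last_offset - lo + 1
    # Clamp so that out-of-range offsets never push the bar outside 0..100%.
    return max(0, min(done, total)) / total


if __name__ == "__main__":
    # The last message of the range [10, 20) has offset 19.
    print(progress_fraction(10, 20, 19))
```

The same reasoning explains the first bullet of the summary: the final offset has to be reported to whatever aggregates progress, otherwise `done` stays below `total` even with the correct formula.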
Event Timeline

Comment Actions
Build is green
Patch application report for D6233 (id=22552)
Could not rebase; Attempt merge onto 002ee70b99...
Updating 002ee70..48d246f
Fast-forward
 swh/dataset/journalprocessor.py | 40 ++++++++++++++++++++++++----------------
 1 file changed, 24 insertions(+), 16 deletions(-)
Changes applied before test
commit 48d246f178851dfe06e47b6c12555fcd095f5641
Author: David Douard <david.douard@sdfa3.org>
Date:   Thu Sep 9 11:54:15 2021 +0200

    Make sure the progress bar for the export reaches 100%

    - ensure the last offset is sent to the queue,
    - fix the computation of the progress value (off-by-one).

commit 3a2f5076dcbf791d1ef43982b70551f048ee7c3e
Author: David Douard <david.douard@sdfa3.org>
Date:   Thu Sep 9 11:47:57 2021 +0200

    Explicitly close the temporary kafka consumer in `get_offsets`

    used to retrieve partitions and lo/hi offsets.

    It could sometimes cause a dead-lock/long timeout kind of situation
    (especially in the developer docker environment).

commit 45126fd621e8b75c592d7c6cd3d8d1337f95c97e
Author: David Douard <david.douard@sdfa3.org>
Date:   Thu Sep 9 11:39:44 2021 +0200

    Simplify the lo/high partition offset computation

    The computation of lo and high offsets used to be done in 2 steps:
    - first get the watermark offsets (thus the absolute min and max
      offsets of the whole partition)
    - then, as a "hook" in `process()`, retrieve the last committed offset
      for the partition and "push" these current offsets in the progress
      queue.

    Instead, this simplifies the process a bit by querying the committed
    offsets while computing the hi/low offsets.

commit e47a3db1287b3f6ada32c3afb3270ef0947a7659
Author: David Douard <david.douard@sdfa3.org>
Date:   Thu Sep 9 14:22:37 2021 +0200

    Use proper signature for JournalClientOffsetRanges.process()

See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/4/ for more details.

Comment Actions
Build is green
Patch application report for D6233 (id=22555)
Could not rebase; Attempt merge onto 002ee70b99...
Updating 002ee70..3f331e1
Fast-forward
 swh/dataset/journalprocessor.py | 46 +++++++++++++++++++++++------------------
 1 file changed, 26 insertions(+), 20 deletions(-)
Changes applied before test
commit 3f331e1823e3329085f01f073fe8a6bd6f43473a
Author: David Douard <david.douard@sdfa3.org>
Date:   Thu Sep 9 14:30:25 2021 +0200

    Reduce the size of the progress bar

    so we get a chance to actually have a visible progress bar:
    - reduce the label size (shorter desc),
    - use a single 'workers' postfix (like "workers=n/m").

commit 48d246f178851dfe06e47b6c12555fcd095f5641  Make sure the progress bar for the export reaches 100%
commit 3a2f5076dcbf791d1ef43982b70551f048ee7c3e  Explicitly close the temporary kafka consumer in `get_offsets`
commit 45126fd621e8b75c592d7c6cd3d8d1337f95c97e  Simplify the lo/high partition offset computation
commit e47a3db1287b3f6ada32c3afb3270ef0947a7659  Use proper signature for JournalClientOffsetRanges.process()

See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/5/ for more details.

Comment Actions
Build is green
Patch application report for D6233 (id=22581)
Could not rebase; Attempt merge onto 002ee70b99...
Updating 002ee70..5881ae0
Fast-forward
 swh/dataset/journalprocessor.py | 54 +++++++++++++++++++++++++----------------
 1 file changed, 33 insertions(+), 21 deletions(-)
Changes applied before test
commit 5881ae06f636a74e7fb0addca04127bfe18b687d  Reduce the size of the progress bar
commit 47713ee38c9498a0548535e5b8361d8158ee3e09  Make sure the progress bar for the export reaches 100%
commit d07b2a632256da4e7778bf7b1f4a02acd03f9ca0  Explicitly close the temporary kafka consumer in `get_offsets`
commit 2760e322af7c5862e0329198671b49d2755491ef  Simplify the lo/high partition offset computation
commit e47a3db1287b3f6ada32c3afb3270ef0947a7659  Use proper signature for JournalClientOffsetRanges.process()

See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/9/ for more details.

Comment Actions
Build is green
Patch application report for D6233 (id=22610)
Could not rebase; Attempt merge onto 002ee70b99...
Updating 002ee70..358d849
Fast-forward
 swh/dataset/journalprocessor.py | 54 +++++++++++++++++++++++++----------------
 1 file changed, 33 insertions(+), 21 deletions(-)
Changes applied before test
commit 358d84938d01ee25706619e533213c6e62f4c828  Reduce the size of the progress bar
commit 47713ee38c9498a0548535e5b8361d8158ee3e09  Make sure the progress bar for the export reaches 100%
commit d07b2a632256da4e7778bf7b1f4a02acd03f9ca0  Explicitly close the temporary kafka consumer in `get_offsets`
commit 2760e322af7c5862e0329198671b49d2755491ef  Simplify the lo/high partition offset computation
commit e47a3db1287b3f6ada32c3afb3270ef0947a7659  Use proper signature for JournalClientOffsetRanges.process()

See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/14/ for more details.
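The two `get_offsets` commits above (compute lo/hi in one pass, and explicitly close the temporary consumer) can be sketched as follows. This is an illustration under stated assumptions, not the code of this diff: `FakeConsumer` and its methods `partitions()`, `watermark_offsets()` and `committed_offset()` are hypothetical stand-ins rather than a real Kafka client API, and `OFFSET_INVALID = -1001` is assumed as the "no committed offset" sentinel.

```python
from contextlib import closing
from typing import Dict, Tuple

# Assumption: sentinel returned when the group has no committed offset yet.
OFFSET_INVALID = -1001


class FakeConsumer:
    """Hypothetical stand-in for a Kafka consumer, for illustration only."""

    def __init__(self, watermarks: Dict[int, Tuple[int, int]],
                 committed: Dict[int, int]) -> None:
        self._watermarks = watermarks
        self._committed = committed
        self.closed = False

    def partitions(self):
        return sorted(self._watermarks)

    def watermark_offsets(self, partition: int) -> Tuple[int, int]:
        return self._watermarks[partition]

    def committed_offset(self, partition: int) -> int:
        return self._committed.get(partition, OFFSET_INVALID)

    def close(self) -> None:
        self.closed = True


def get_offsets(consumer) -> Dict[int, Tuple[int, int]]:
    """Return {partition: (lo, hi)}, starting from the committed offset."""
    offsets: Dict[int, Tuple[int, int]] = {}
    # `closing` guarantees consumer.close() runs, even if a query raises.
    with closing(consumer):
        for partition in consumer.partitions():
            lo, hi = consumer.watermark_offsets(partition)
            committed = consumer.committed_offset(partition)
            if committed >= 0:
                # Resume from the committed offset rather than the low watermark.
                lo = max(lo, committed)
            offsets[partition] = (lo, hi)
    return offsets


if __name__ == "__main__":
    consumer = FakeConsumer({0: (0, 100), 1: (5, 50)}, {0: 40})
    print(get_offsets(consumer), consumer.closed)
```

Using a context manager is one way to make the close explicit; calling `close()` in a `try`/`finally` block would serve the same purpose.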