- ensure the last offset is sent to the queue,
- fix the computation of the progress value (off-by-one).
Depends on D6232
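A minimal sketch of the off-by-one described above, assuming the high offset of each range is exclusive (as a Kafka high watermark is: the offset of the last message plus one) and that workers report the offset of the last message they handled. The function and variable names are illustrative only, not taken from the diff.

```python
from typing import Dict, Tuple

# partition id -> (lo, hi); hi is assumed to be exclusive
Ranges = Dict[int, Tuple[int, int]]


def total_messages(ranges: Ranges) -> int:
    """Number of messages to export across all partitions."""
    return sum(hi - lo for lo, hi in ranges.values())


def consumed_messages(ranges: Ranges, last_seen: Dict[int, int]) -> int:
    """Messages handled so far, given the last offset seen per partition.

    The "+ 1" matters: after handling the message at offset `lo`, one
    message has been consumed, not zero. Without it the bar stalls just
    short of the total.
    """
    return sum(offset - ranges[p][0] + 1 for p, offset in last_seen.items())


ranges = {0: (10, 20), 1: (0, 5)}
# Each worker has reported its final offset (hi - 1) for its partition.
final = {0: 19, 1: 4}
done = consumed_messages(ranges, final)
total = total_messages(ranges)
```

With these values `done` equals `total`, so the bar can only reach 100% if the last offset of every partition is actually reported.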
Differential D6233
Make sure the progress bar for the export reaches 100%
Authored by douardda on Sep 9 2021, 5:54 PM.
Tags: None
Subscribers: None
Event Timeline
Build is green
Patch application report for D6233 (id=22552)
Could not rebase; Attempt merge onto 002ee70b99...
Updating 002ee70..48d246f
Fast-forward
swh/dataset/journalprocessor.py | 40 ++++++++++++++++++++++++----------------
1 file changed, 24 insertions(+), 16 deletions(-)
Changes applied before test
commit 48d246f178851dfe06e47b6c12555fcd095f5641
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Sep 9 11:54:15 2021 +0200
Make sure the progress bar for the export reaches 100%
- ensure the last offset is sent to the queue,
- fix the computation of the progress value (off-by-one).
commit 3a2f5076dcbf791d1ef43982b70551f048ee7c3e
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Sep 9 11:47:57 2021 +0200
Explicitly close the temporary Kafka consumer in `get_offsets`
used to retrieve partitions and lo/hi offsets.
Leaving it open could sometimes cause deadlocks or long timeouts
(especially in the developer docker environment).
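A rough sketch of the pattern this commit message describes: create a short-lived consumer, query it, and close it even if the query raises. It assumes only that the consumer object exposes `get_watermark_offsets()` and `close()`; `FakeConsumer` and the `make_consumer` parameter are hypothetical stand-ins so the example runs without a broker, and this is not the code from the diff.

```python
from contextlib import closing
from typing import Callable, Dict, Iterable, Tuple


class FakeConsumer:
    """Hypothetical stand-in for a Kafka consumer, for illustration only."""

    def __init__(self) -> None:
        self.closed = False

    def get_watermark_offsets(self, partition: int) -> Tuple[int, int]:
        # Pretend every partition holds offsets 0..99.
        return (0, 100)

    def close(self) -> None:
        self.closed = True


def get_offsets(
    make_consumer: Callable[[], FakeConsumer], partitions: Iterable[int]
) -> Dict[int, Tuple[int, int]]:
    """Return (lo, hi) per partition using a temporary consumer.

    `closing()` calls consumer.close() on exit, including on error,
    so the temporary consumer never lingers.
    """
    with closing(make_consumer()) as consumer:
        return {p: consumer.get_watermark_offsets(p) for p in partitions}


consumer = FakeConsumer()
offsets = get_offsets(lambda: consumer, [0, 1])
```

After the call, `consumer.closed` is `True` and the offsets are still available to the caller.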
commit 45126fd621e8b75c592d7c6cd3d8d1337f95c97e
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Sep 9 11:39:44 2021 +0200
Simplify the lo/high partition offset computation
The computation of lo and hi offsets used to be done in 2 steps:
- first get the watermark offsets (thus the absolute min and max offsets
of the whole partition)
- then, as a "hook" in `process()`, retrieve the last committed offset
for the partition and "push" these current offsets in the progress
queue.
Instead, this commit simplifies the process by querying the committed
offsets while computing the lo/hi offsets.
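The single-step computation could look roughly like the sketch below. It assumes that a missing committed offset is reported as a negative sentinel (confluent-kafka uses `OFFSET_INVALID`, which is -1001) and that the high watermark is exclusive. The function name `offset_range` is made up for this illustration.

```python
from typing import Optional, Tuple


def offset_range(
    watermarks: Tuple[int, int], committed: int
) -> Optional[Tuple[int, int]]:
    """Combine watermark offsets and the committed offset in one step.

    Returns the (lo, hi) range still to be processed, or None when the
    partition is empty or the consumer group has already caught up.
    """
    lo, hi = watermarks
    # A negative sentinel (no committed offset) never wins the max(),
    # so the low watermark is kept in that case.
    lo = max(lo, committed)
    if hi > lo:
        return (lo, hi)
    return None


fresh = offset_range((0, 100), -1001)    # nothing committed yet
resumed = offset_range((0, 100), 40)     # resume from the committed offset
caught_up = offset_range((0, 100), 100)  # nothing left to process
```

Here `fresh` is `(0, 100)`, `resumed` is `(40, 100)` and `caught_up` is `None`.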
commit e47a3db1287b3f6ada32c3afb3270ef0947a7659
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Sep 9 14:22:37 2021 +0200
Use proper signature for JournalClientOffsetRanges.process()
See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/4/ for more details.
Build is green
Patch application report for D6233 (id=22555)
Could not rebase; Attempt merge onto 002ee70b99...
Updating 002ee70..3f331e1
Fast-forward
swh/dataset/journalprocessor.py | 46 +++++++++++++++++++++++------------------
1 file changed, 26 insertions(+), 20 deletions(-)
Changes applied before test
commit 3f331e1823e3329085f01f073fe8a6bd6f43473a
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Sep 9 14:30:25 2021 +0200
Reduce the size of the progress bar
so we get a chance to actually have a visible progress bar:
- reduce the label size (shorter desc),
- use a single 'workers' postfix (like "workers=n/m").
commit 48d246f178851dfe06e47b6c12555fcd095f5641
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Sep 9 11:54:15 2021 +0200
Make sure the progress bar for the export reaches 100%
- ensure the last offset is sent to the queue,
- fix the computation of the progress value (off-by-one).
commit 3a2f5076dcbf791d1ef43982b70551f048ee7c3e
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Sep 9 11:47:57 2021 +0200
Explicitly close the temporary Kafka consumer in `get_offsets`
used to retrieve partitions and lo/hi offsets.
Leaving it open could sometimes cause deadlocks or long timeouts
(especially in the developer docker environment).
commit 45126fd621e8b75c592d7c6cd3d8d1337f95c97e
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Sep 9 11:39:44 2021 +0200
Simplify the lo/high partition offset computation
The computation of lo and hi offsets used to be done in 2 steps:
- first get the watermark offsets (thus the absolute min and max offsets
of the whole partition)
- then, as a "hook" in `process()`, retrieve the last committed offset
for the partition and "push" these current offsets in the progress
queue.
Instead, this commit simplifies the process by querying the committed
offsets while computing the lo/hi offsets.
commit e47a3db1287b3f6ada32c3afb3270ef0947a7659
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Sep 9 14:22:37 2021 +0200
Use proper signature for JournalClientOffsetRanges.process()
See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/5/ for more details.
Build is green
Patch application report for D6233 (id=22581)
Could not rebase; Attempt merge onto 002ee70b99...
Updating 002ee70..5881ae0
Fast-forward
swh/dataset/journalprocessor.py | 54 +++++++++++++++++++++++++----------------
1 file changed, 33 insertions(+), 21 deletions(-)
Changes applied before test
commit 5881ae06f636a74e7fb0addca04127bfe18b687d
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Sep 9 14:30:25 2021 +0200
Reduce the size of the progress bar
so we get a chance to actually have a visible progress bar:
- reduce the label size (shorter desc),
- use a single 'workers' postfix (like "workers=n/m").
commit 47713ee38c9498a0548535e5b8361d8158ee3e09
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Sep 9 11:54:15 2021 +0200
Make sure the progress bar for the export reaches 100%
- ensure the last offset is sent to the queue,
- fix the computation of the progress value (off-by-one).
commit d07b2a632256da4e7778bf7b1f4a02acd03f9ca0
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Sep 9 11:47:57 2021 +0200
Explicitly close the temporary Kafka consumer in `get_offsets`
used to retrieve partitions and lo/hi offsets.
Leaving it open could sometimes cause deadlocks or long timeouts
(especially in the developer docker environment).
commit 2760e322af7c5862e0329198671b49d2755491ef
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Sep 9 11:39:44 2021 +0200
Simplify the lo/high partition offset computation
The computation of lo and hi offsets used to be done in 2 steps:
- first get the watermark offsets (thus the absolute min and max offsets
of the whole partition)
- then, as a "hook" in `process()`, retrieve the last committed offset
for the partition and "push" these current offsets in the progress
queue.
Instead, this commit simplifies the process by querying the committed
offsets while computing the lo/hi offsets.
commit e47a3db1287b3f6ada32c3afb3270ef0947a7659
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Sep 9 14:22:37 2021 +0200
Use proper signature for JournalClientOffsetRanges.process()
See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/9/ for more details.
Build is green
Patch application report for D6233 (id=22610)
Could not rebase; Attempt merge onto 002ee70b99...
Updating 002ee70..358d849
Fast-forward
swh/dataset/journalprocessor.py | 54 +++++++++++++++++++++++++----------------
1 file changed, 33 insertions(+), 21 deletions(-)
Changes applied before test
commit 358d84938d01ee25706619e533213c6e62f4c828
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Sep 9 14:30:25 2021 +0200
Reduce the size of the progress bar
so we get a chance to actually have a visible progress bar:
- reduce the label size (shorter desc),
- use a single 'workers' postfix (like "workers=n/m").
commit 47713ee38c9498a0548535e5b8361d8158ee3e09
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Sep 9 11:54:15 2021 +0200
Make sure the progress bar for the export reaches 100%
- ensure the last offset is sent to the queue,
- fix the computation of the progress value (off-by-one).
commit d07b2a632256da4e7778bf7b1f4a02acd03f9ca0
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Sep 9 11:47:57 2021 +0200
Explicitly close the temporary Kafka consumer in `get_offsets`
used to retrieve partitions and lo/hi offsets.
Leaving it open could sometimes cause deadlocks or long timeouts
(especially in the developer docker environment).
commit 2760e322af7c5862e0329198671b49d2755491ef
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Sep 9 11:39:44 2021 +0200
Simplify the lo/high partition offset computation
The computation of lo and hi offsets used to be done in 2 steps:
- first get the watermark offsets (thus the absolute min and max offsets
of the whole partition)
- then, as a "hook" in `process()`, retrieve the last committed offset
for the partition and "push" these current offsets in the progress
queue.
Instead, this commit simplifies the process by querying the committed
offsets while computing the lo/hi offsets.
commit e47a3db1287b3f6ada32c3afb3270ef0947a7659
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Sep 9 14:22:37 2021 +0200
Use proper signature for JournalClientOffsetRanges.process()
See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/14/ for more details.