benchmark.py: run random algorithm only once
Details
- Reviewers
  zack
- Group Reviewers
  Reviewers
- Commits
  rDTSCNe46e713d2145: run random algorithm only once
  rDTSCN3004b66787b2: use os.listdir() instead of os.walk() to avoid symlinks
Diff Detail
- Repository
  rDTSCN Code scanner
- Lint
  Automatic diff as part of commit; lint not applicable.
- Unit
  Automatic diff as part of commit; unit tests not applicable.
Event Timeline
Build has FAILED
Patch application report for D5011 (id=17882)
Could not rebase; Attempt merge onto 33a9cd4eb9...
Auto-merging swh/scanner/cli.py
Merge made by the 'recursive' strategy.
 benchmark.py                   | 136 ++++++++++++++
 run_backend.sh                 |  15 ++
 run_benchmark.sh               |  37 ++++
 swh/scanner/backend.py         |  16 +-
 swh/scanner/benchmark_algos.py | 395 +++++++++++++++++++++++++++++++++++++++++
 swh/scanner/cli.py             |  73 ++++++++
 swh/scanner/model.py           |  57 +++++-
 7 files changed, 718 insertions(+), 11 deletions(-)
 create mode 100755 benchmark.py
 create mode 100755 run_backend.sh
 create mode 100755 run_benchmark.sh
 create mode 100644 swh/scanner/benchmark_algos.py
Changes applied before test
commit 4d3001147e4469ca62353bcd681d9a696d596517
Merge: 33a9cd4 ba54311
Author: Jenkins user <jenkins@localhost>
Date: Thu Feb 4 13:29:34 2021 +0000
    Merge branch 'diff-target' into HEAD
commit ba54311a7c2a7eb16491044a04507f9701b3c57b
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Feb 4 14:28:31 2021 +0100
    run random algorithm only once
commit aaf3266f05c569bd0f7f30013d455c37df2aaf27
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Feb 4 14:17:59 2021 +0100
    use os.listdir() instead of os.walk() to avoid symlinks
commit 3d3665a4f5bb77c981a27ee9206a2c92717e82b0
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Feb 2 15:30:54 2021 +0100
    algo_min: delete the upstream directories if a (sub)directory is unknown
commit c42e643aa512cbd8c039be2350159e46d34daa0d
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Feb 2 13:24:12 2021 +0100
    model: wrong iteration in 'iterate_bfs' function
commit 0d3b5cb86144b87accab7f9a45d6457f457d47d0
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Feb 2 11:13:13 2021 +0100
    make 'set_children_status' works with different kind of nodes
commit b601f382db643ddb0af40c85d1d8fc5065bd7224
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Jan 28 16:45:45 2021 +0100
    file_priority: remove children only when the unset directory is known
    If the directory is unknown the algorithm should check the downstream directories since they could be unknown too.
commit 5e01c09af4c61a309d71adb0d4f61d1766b8a021
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Jan 26 10:10:00 2021 +0100
    retry request in case of backend failure
commit ebad16c02da6bffbc96a623e082a4b5f706d7b1f
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Mon Jan 25 13:48:14 2021 +0100
    algo_min: remove the current node as well
commit 5cd9f762467ece41d7d8e1ae1841e1d24aad45e4
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Mon Jan 18 10:26:06 2021 +0100
    fix: the temporary directory is removed by tempfile
commit 7a289332f73025f94f7f85ab5bd6755b876ebe68
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Jan 12 23:12:18 2021 +0100
    print results as a csv
commit 9e4df16d9486a891498124dd4cfb7558c57dfa0c
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Jan 12 23:10:39 2021 +0100
    extract repositories in temporary directories
commit 7bd1939949dcbcf0c52b8647f2b1750f2c9d2300
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Dec 10 23:59:31 2020 +0100
    scanner experiments
Link to build: https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/95/
See console output for more information: https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/95/console
swh/scanner/benchmark_algos.py | ||
---|---|---|
305–308 | If you want to avoid symlinks, these don't work, because the documentation (for both) says: "This follows symbolic links, so both islink() and isfile() can be true for the same path." You need to add a test before either of these, like: "if os.path.islink(...): ... continue ..." |
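To illustrate the point above, here is a minimal sketch of checking `os.path.islink()` before `os.path.isdir()` / `os.path.isfile()` while iterating over `os.listdir()`. The helper name `list_non_symlink_entries` and the temporary tree are hypothetical and only serve the example; this is not the code under review.

```python
import os
import tempfile


def list_non_symlink_entries(root_path):
    """Return (subdirs, files) found directly under root_path, skipping symlinks."""
    subdirs, files = [], []
    for name in sorted(os.listdir(root_path)):
        path = os.path.join(root_path, name)
        # islink() has to be tested first: isdir() and isfile() follow
        # symbolic links, so they are also true for links to dirs/files.
        if os.path.islink(path):
            continue
        if os.path.isdir(path):
            subdirs.append(name)
        elif os.path.isfile(path):
            files.append(name)
    return subdirs, files


# Usage on a throwaway tree: one real directory, one real file, two symlinks.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "real_dir"))
    with open(os.path.join(tmp, "real_file"), "w") as f:
        f.write("x")
    os.symlink(os.path.join(tmp, "real_dir"), os.path.join(tmp, "link_to_dir"))
    os.symlink(os.path.join(tmp, "real_file"), os.path.join(tmp, "link_to_file"))
    subdirs, files = list_non_symlink_entries(tmp)
```

Without the `islink()` guard, both links would be reported alongside the real entries.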
swh/scanner/benchmark_algos.py | ||
---|---|---|
305–308 | Actually, you probably do not want to ignore symlinks completely (I think? It depends on how your tree is then used). If you want to keep them, you should probably just avoid listing root_path if *it* is a symlink, by calling islink() on it before invoking listdir() on it. |
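The alternative suggested here could look roughly like the sketch below: symlinks are kept as entries of the tree, but a path that is itself a symlink is never passed to `os.listdir()`. The function name `iter_tree_keeping_symlinks` is hypothetical, and how the resulting entries are used is an assumption of the example.

```python
import os
import tempfile


def iter_tree_keeping_symlinks(root_path):
    """Yield every path under root_path; symlinks are yielded but never listed."""
    # Guard first: do not call listdir() on a symlink (or on a non-directory).
    if os.path.islink(root_path) or not os.path.isdir(root_path):
        return
    for name in sorted(os.listdir(root_path)):
        path = os.path.join(root_path, name)
        yield path
        # The recursive call returns immediately for symlinks and plain files.
        yield from iter_tree_keeping_symlinks(path)


# Usage on a throwaway tree: a real directory and a symlink pointing to it.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "real_dir"))
    with open(os.path.join(tmp, "real_dir", "inner.txt"), "w") as f:
        f.write("x")
    os.symlink(os.path.join(tmp, "real_dir"), os.path.join(tmp, "link_to_dir"))
    found = [os.path.relpath(p, tmp) for p in iter_tree_keeping_symlinks(tmp)]
```

The link itself shows up in `found`, but `inner.txt` is reported only once, under the real directory.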
Build has FAILED
Patch application report for D5011 (id=17892)
Could not rebase; Attempt merge onto 33a9cd4eb9...
Auto-merging swh/scanner/cli.py
Merge made by the 'recursive' strategy.
 benchmark.py                   | 136 ++++++++++++++
 run_backend.sh                 |  15 ++
 run_benchmark.sh               |  37 ++++
 swh/scanner/backend.py         |  16 +-
 swh/scanner/benchmark_algos.py | 396 +++++++++++++++++++++++++++++++++++++++++
 swh/scanner/cli.py             |  73 ++++++++
 swh/scanner/model.py           |  57 +++++-
 7 files changed, 719 insertions(+), 11 deletions(-)
 create mode 100755 benchmark.py
 create mode 100755 run_backend.sh
 create mode 100755 run_benchmark.sh
 create mode 100644 swh/scanner/benchmark_algos.py
Changes applied before test
commit 34d1383d95e3a26cd5d2e26aad84dbe624698a80
Merge: 33a9cd4 0806485
Author: Jenkins user <jenkins@localhost>
Date: Thu Feb 4 15:31:35 2021 +0000
    Merge branch 'diff-target' into HEAD
commit 080648583efcdf14c31af2f42ccc1c86f2745b63
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Feb 4 16:28:21 2021 +0100
    run random algorithm only once
commit 3004b66787b28cffa1047427876750397f02e06a
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Feb 4 16:27:59 2021 +0100
    use os.listdir() instead of os.walk() to avoid symlinks
commit 3d3665a4f5bb77c981a27ee9206a2c92717e82b0
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Feb 2 15:30:54 2021 +0100
    algo_min: delete the upstream directories if a (sub)directory is unknown
commit c42e643aa512cbd8c039be2350159e46d34daa0d
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Feb 2 13:24:12 2021 +0100
    model: wrong iteration in 'iterate_bfs' function
commit 0d3b5cb86144b87accab7f9a45d6457f457d47d0
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Feb 2 11:13:13 2021 +0100
    make 'set_children_status' works with different kind of nodes
commit b601f382db643ddb0af40c85d1d8fc5065bd7224
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Jan 28 16:45:45 2021 +0100
    file_priority: remove children only when the unset directory is known
    If the directory is unknown the algorithm should check the downstream directories since they could be unknown too.
commit 5e01c09af4c61a309d71adb0d4f61d1766b8a021
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Jan 26 10:10:00 2021 +0100
    retry request in case of backend failure
commit ebad16c02da6bffbc96a623e082a4b5f706d7b1f
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Mon Jan 25 13:48:14 2021 +0100
    algo_min: remove the current node as well
commit 5cd9f762467ece41d7d8e1ae1841e1d24aad45e4
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Mon Jan 18 10:26:06 2021 +0100
    fix: the temporary directory is removed by tempfile
commit 7a289332f73025f94f7f85ab5bd6755b876ebe68
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Jan 12 23:12:18 2021 +0100
    print results as a csv
commit 9e4df16d9486a891498124dd4cfb7558c57dfa0c
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Jan 12 23:10:39 2021 +0100
    extract repositories in temporary directories
commit 7bd1939949dcbcf0c52b8647f2b1750f2c9d2300
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Dec 10 23:59:31 2020 +0100
    scanner experiments
Link to build: https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/96/
See console output for more information: https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/96/console
Build has FAILED
Patch application report for D5011 (id=17893)
Rebasing onto 33a9cd4eb9...
First, rewinding head to replay your work on top of it...
Applying: scanner experiments
Applying: extract repositories in temporary directories
Applying: print results as a csv
Applying: fix: the temporary directory is removed by tempfile
Applying: algo_min: remove the current node as well
Applying: retry request in case of backend failure
Applying: file_priority: remove children only when the unset directory is known
Applying: make 'set_children_status' works with different kind of nodes
Applying: model: wrong iteration in 'iterate_bfs' function
Applying: algo_min: delete the upstream directories if a (sub)directory is unknown
Applying: check if path is a symlink
Applying: run random algorithm only once
Changes applied before test
commit 6c534b8af6b62468cf8467aa2791f63f1a471958
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Feb 4 16:47:24 2021 +0100
    run random algorithm only once
commit 3446bb600e3aeca5ddc22b5b9a17eda224996450
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Feb 4 16:27:59 2021 +0100
    check if path is a symlink
    exclude the path if it is a symlink.
    - os.listdir() instead of os.walk() to list subdirectories
commit b46c265a776490a6797454e64e5cbc607fba1e94
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Feb 2 15:30:54 2021 +0100
    algo_min: delete the upstream directories if a (sub)directory is unknown
commit 4cec0aa255ba71479acb7cd58048f697c3ad0aa5
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Feb 2 13:24:12 2021 +0100
    model: wrong iteration in 'iterate_bfs' function
commit 15cb48637cf708bf15fcab7a6958b2b97bdafe7b
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Feb 2 11:13:13 2021 +0100
    make 'set_children_status' works with different kind of nodes
commit 3ebcebddc15ac53203c53ac771a501339ff681a8
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Jan 28 16:45:45 2021 +0100
    file_priority: remove children only when the unset directory is known
    If the directory is unknown the algorithm should check the downstream directories since they could be unknown too.
commit d64b0d8d402872de7351b0674bde391efcff8fcf
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Jan 26 10:10:00 2021 +0100
    retry request in case of backend failure
commit ba29deefccf09642d1c006b1e0887f369d87d321
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Mon Jan 25 13:48:14 2021 +0100
    algo_min: remove the current node as well
commit fa7460a9f9a1a291ea43f7af60486c4a362d04d2
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Mon Jan 18 10:26:06 2021 +0100
    fix: the temporary directory is removed by tempfile
commit f7464b81a5169755a5dbcca853a694ccb29ec9e7
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Jan 12 23:12:18 2021 +0100
    print results as a csv
commit f0f34283cc77dd0795484f5904918a7bba67e329
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Jan 12 23:10:39 2021 +0100
    extract repositories in temporary directories
commit 2d4bf40939653e71d0715a4d3fdba6ce5765991c
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Dec 10 23:59:31 2020 +0100
    scanner experiments
Link to build: https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/97/
See console output for more information: https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/97/console
Build has FAILED
Patch application report for D5011 (id=17894)
Rebasing onto 33a9cd4eb9...
First, rewinding head to replay your work on top of it...
Applying: scanner experiments
Applying: extract repositories in temporary directories
Applying: print results as a csv
Applying: fix: the temporary directory is removed by tempfile
Applying: algo_min: remove the current node as well
Applying: retry request in case of backend failure
Applying: file_priority: remove children only when the unset directory is known
Applying: make 'set_children_status' works with different kind of nodes
Applying: model: wrong iteration in 'iterate_bfs' function
Applying: algo_min: delete the upstream directories if a (sub)directory is unknown
Applying: check if path is a symlink
Applying: run random algorithm only once
Changes applied before test
commit 2eca880da64bf5537ac4603a09cd2804c3151d40
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Feb 4 16:47:24 2021 +0100
    run random algorithm only once
commit 3a6203415be0be7825edd74cd505bb6d14ffb635
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Feb 4 16:27:59 2021 +0100
    check if path is a symlink
    exclude the path if it is a symlink.
    - os.listdir() instead of os.walk() to list subdirectories
commit e3e1a96f5913905a42762c672720d1480184f858
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Feb 2 15:30:54 2021 +0100
    algo_min: delete the upstream directories if a (sub)directory is unknown
commit 5f27ca465bc33d8babc70f8bfb258165934153e0
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Feb 2 13:24:12 2021 +0100
    model: wrong iteration in 'iterate_bfs' function
commit 590fc3252c7aabbdf30f5fce001d45d487a880d7
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Feb 2 11:13:13 2021 +0100
    make 'set_children_status' works with different kind of nodes
commit d829830b407e06b3bc2624a8552adcddb90278ce
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Jan 28 16:45:45 2021 +0100
    file_priority: remove children only when the unset directory is known
    If the directory is unknown the algorithm should check the downstream directories since they could be unknown too.
commit 4bceda44454777762d5bf677818478a72ad2f624
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Jan 26 10:10:00 2021 +0100
    retry request in case of backend failure
commit 00a2d73a2193406d6fba0a46c91e3098d800d986
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Mon Jan 25 13:48:14 2021 +0100
    algo_min: remove the current node as well
commit 243faa41794f2c5f4182d627bbf3a9dc2e14b75a
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Mon Jan 18 10:26:06 2021 +0100
    fix: the temporary directory is removed by tempfile
commit 942d63226f3e589ce0315ec89317118198048a8a
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Jan 12 23:12:18 2021 +0100
    print results as a csv
commit 88a9d3232e3a04f8e3d96e95ae05de7dc406c87a
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Jan 12 23:10:39 2021 +0100
    extract repositories in temporary directories
commit 7a55f8962e424771aaf5410d7c11103f8fcdbb7c
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Dec 10 23:59:31 2020 +0100
    scanner experiments
Link to build: https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/98/
See console output for more information: https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/98/console
Build has FAILED
Patch application report for D5011 (id=17895)
Could not rebase; Attempt merge onto 33a9cd4eb9...
Auto-merging swh/scanner/cli.py
Merge made by the 'recursive' strategy.
 benchmark.py                   | 136 ++++++++++++++
 run_backend.sh                 |  15 ++
 run_benchmark.sh               |  37 ++++
 swh/scanner/backend.py         |  16 +-
 swh/scanner/benchmark_algos.py | 396 +++++++++++++++++++++++++++++++++++++++++
 swh/scanner/cli.py             |  73 ++++++++
 swh/scanner/model.py           |  57 +++++-
 7 files changed, 719 insertions(+), 11 deletions(-)
 create mode 100755 benchmark.py
 create mode 100755 run_backend.sh
 create mode 100755 run_benchmark.sh
 create mode 100644 swh/scanner/benchmark_algos.py
Changes applied before test
commit 28ceb8e275f88e4fee71fbc725f9afb4360b5d0e
Merge: 33a9cd4 e46e713
Author: Jenkins user <jenkins@localhost>
Date: Thu Feb 4 16:39:04 2021 +0000
    Merge branch 'diff-target' into HEAD
commit e46e713d2145f69be19e16f5d22a565648e7c0ff
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Feb 4 16:28:21 2021 +0100
    run random algorithm only once
commit 3004b66787b28cffa1047427876750397f02e06a
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Feb 4 16:27:59 2021 +0100
    use os.listdir() instead of os.walk() to avoid symlinks
commit 3d3665a4f5bb77c981a27ee9206a2c92717e82b0
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Feb 2 15:30:54 2021 +0100
    algo_min: delete the upstream directories if a (sub)directory is unknown
commit c42e643aa512cbd8c039be2350159e46d34daa0d
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Feb 2 13:24:12 2021 +0100
    model: wrong iteration in 'iterate_bfs' function
commit 0d3b5cb86144b87accab7f9a45d6457f457d47d0
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Feb 2 11:13:13 2021 +0100
    make 'set_children_status' works with different kind of nodes
commit b601f382db643ddb0af40c85d1d8fc5065bd7224
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Jan 28 16:45:45 2021 +0100
    file_priority: remove children only when the unset directory is known
    If the directory is unknown the algorithm should check the downstream directories since they could be unknown too.
commit 5e01c09af4c61a309d71adb0d4f61d1766b8a021
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Jan 26 10:10:00 2021 +0100
    retry request in case of backend failure
commit ebad16c02da6bffbc96a623e082a4b5f706d7b1f
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Mon Jan 25 13:48:14 2021 +0100
    algo_min: remove the current node as well
commit 5cd9f762467ece41d7d8e1ae1841e1d24aad45e4
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Mon Jan 18 10:26:06 2021 +0100
    fix: the temporary directory is removed by tempfile
commit 7a289332f73025f94f7f85ab5bd6755b876ebe68
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Jan 12 23:12:18 2021 +0100
    print results as a csv
commit 9e4df16d9486a891498124dd4cfb7558c57dfa0c
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Tue Jan 12 23:10:39 2021 +0100
    extract repositories in temporary directories
commit 7bd1939949dcbcf0c52b8647f2b1750f2c9d2300
Author: Daniele Serafini <me@danieleserafini.eu>
Date: Thu Dec 10 23:59:31 2020 +0100
    scanner experiments
Link to build: https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/99/
See console output for more information: https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/99/console