Could you document the expected shard layout in reStructuredText format? This code sets the (current version of the) layout "forever", so we should make sure to document it properly.

Nov 9 2021, 10:48 AM

Nov 4 2021

dachary added inline comments to D6424: Perfect hashmap C implementation.

Nov 4 2021, 4:24 PM

dachary added inline comments to D6424: Perfect hashmap C implementation.

Nov 4 2021, 4:00 PM

dachary updated the diff for D6424: Perfect hashmap C implementation.

move _hash_cffi to swh.perfecthash._hash_cffi

Nov 4 2021, 3:59 PM

Nov 3 2021

dachary added inline comments to D6424: Perfect hashmap C implementation.

Nov 3 2021, 6:09 PM

Nov 2 2021

dachary added a comment to D6424: Perfect hashmap C implementation.

@ardumont I revisited all comments and noticed I missed a few, sorry about that. They should be good now :-) Is there anything else you need me to do for this proposed patch? Or should I just be patient and wait for whatever comes next in the review merge process? Thanks for your guidance!

Nov 2 2021, 9:46 AM

dachary updated the diff for D6424: Perfect hashmap C implementation.

add two missing type annotations

Nov 2 2021, 9:44 AM

Oct 26 2021

dachary updated the summary of D6424: Perfect hashmap C implementation.

Oct 26 2021, 8:22 PM

dachary updated the diff for D6424: Perfect hashmap C implementation.

adjust lookup speed results and expectations according to today's benchmark results

Oct 26 2021, 7:47 PM

dachary updated the diff for D6424: Perfect hashmap C implementation.

comment on hardcoded value

Oct 26 2021, 7:15 PM

dachary added a comment to D6424: Perfect hashmap C implementation.

Thanks a lot for the reviews!

Oct 26 2021, 7:14 PM

dachary updated the diff for D6424: Perfect hashmap C implementation.

address ardumont's comments

Oct 26 2021, 6:23 PM

dachary added inline comments to D6424: Perfect hashmap C implementation.

Oct 26 2021, 6:21 PM

dachary updated the diff for D6424: Perfect hashmap C implementation.

fix documentation bugs

Oct 26 2021, 9:32 AM

dachary updated the task description for T3670: fed4fire setup for winery benchmarks.

Oct 26 2021, 9:18 AM · Object storage

dachary updated the diff for D6424: Perfect hashmap C implementation.

document the benchmark process

Oct 26 2021, 9:17 AM

dachary added a comment to T3670: fed4fire setup for winery benchmarks.

But then Fed4Fire killed the experiment prematurely, ignoring the Grid5000 extension. I'll have another go at it.

Oct 26 2021, 9:16 AM · Object storage

dachary added a comment to T3670: fed4fire setup for winery benchmarks.

When trying to extend the duration of an experiment (slice in the Fed4Fire parlance), an error occurs.

Oct 26 2021, 5:57 AM · Object storage

dachary added a comment to T3670: fed4fire setup for winery benchmarks.

The CLI is actually more complicated because it requires inputs that are difficult to figure out:

Oct 26 2021, 4:32 AM · Object storage

Oct 25 2021

dachary added a comment to T3634: Create swh-perfecthash module.

@olasd thanks for adding the dependencies 🎉 D6545

Oct 25 2021, 11:26 PM · Object storage

dachary added a comment to T3670: fed4fire setup for winery benchmarks.

The https://jfed.ilabt.imec.be/downloads/ CLI may be easier to use than the graphical client when repeating experiments.

Oct 25 2021, 11:56 AM · Object storage

dachary retitled D6545: base-buster: add valgrind and googletest for swh.perfecthash from base-buster: add valgrind for swh.perfecthash to base-buster: add valgrind and googletest for swh.perfecthash.

Oct 25 2021, 11:18 AM

dachary updated the diff for D6545: base-buster: add valgrind and googletest for swh.perfecthash.

googletest is also needed

Oct 25 2021, 11:18 AM

dachary updated the diff for D6424: Perfect hashmap C implementation.

run the valgrind tests

Oct 25 2021, 11:08 AM

dachary retitled D6545: base-buster: add valgrind and googletest for swh.perfecthash from base-buster: add valrind for swh.perfecthash to base-buster: add valgrind for swh.perfecthash.

Oct 25 2021, 10:58 AM

dachary updated the diff for D6545: base-buster: add valgrind and googletest for swh.perfecthash.

fix typo in the commit comment

Oct 25 2021, 10:57 AM

dachary requested review of D6545: base-buster: add valgrind and googletest for swh.perfecthash.

Oct 25 2021, 10:56 AM

dachary added a revision to T3634: Create swh-perfecthash module: D6545: base-buster: add valgrind and googletest for swh.perfecthash.

Oct 25 2021, 10:56 AM · Object storage

dachary added a comment to T3634: Create swh-perfecthash module.

It's at https://forge.softwareheritage.org/source/swh-jenkins-dockerfiles/browse/master/base-buster/Dockerfile$39

Oct 25 2021, 10:55 AM · Object storage

Oct 21 2021

dachary added a comment to T3670: fed4fire setup for winery benchmarks.

Contribution to the Grid5000 documentation for Fed4Fire.

Oct 21 2021, 8:48 AM · Object storage

dachary added a comment to T3670: fed4fire setup for winery benchmarks.

Using Export As ansible, I unzipped the result.

Oct 21 2021, 8:44 AM · Object storage

dachary added a comment to T3670: fed4fire setup for winery benchmarks.

Created another experiment (with 25min lifetime only) and it's going better:

Oct 21 2021, 8:39 AM · Object storage

dachary added a comment to T3670: fed4fire setup for winery benchmarks.

Linked Grid5000 & Fed4Fire accounts.

Oct 21 2021, 8:37 AM · Object storage

dachary added a comment to T3670: fed4fire setup for winery benchmarks.

I wanted to terminate the experiment but it looks like it must expire (although Grid5000 has the option to terminate a job).

Oct 21 2021, 8:30 AM · Object storage

dachary added a comment to T3670: fed4fire setup for winery benchmarks.

The Grid5000 machines are found in the "

Oct 21 2021, 8:29 AM · Object storage

dachary updated the task description for T3670: fed4fire setup for winery benchmarks.

Oct 21 2021, 8:20 AM · Object storage

dachary added a comment to T3670: fed4fire setup for winery benchmarks.

https://www.grid5000.fr/w/Fed4FIRE is the better documentation for using Grid5000 via Fed4Fire.

Oct 21 2021, 8:19 AM · Object storage

dachary added a comment to T3634: Create swh-perfecthash module.

@olasd I'd like to add dependencies to the CI job running swh-perfecthash (valgrind) so that it can verify the C implementation is clean. Would you be so kind as to point me in the right direction? I looked in https://forge.softwareheritage.org/source/swh-jenkins-jobs but could not find the keyword cmph and concluded it must be in another repository.

Oct 21 2021, 5:19 AM · Object storage

dachary updated the task description for T3525: grid5000 tools and documentation.

Oct 21 2021, 5:11 AM · Object storage

dachary added a comment to T3670: fed4fire setup for winery benchmarks.

There is a monitor that shows all testbeds, among which is grid5000:

Oct 21 2021, 4:57 AM · Object storage

dachary added a comment to T3670: fed4fire setup for winery benchmarks.

Followed the tutorial at https://doc.fed4fire.eu/firstexperiment.html to run a first experiment, which failed.

Oct 21 2021, 4:48 AM · Object storage

dachary added a comment to T3670: fed4fire setup for winery benchmarks.

Followed the tutorial https://doc.fed4fire.eu/tools.html

Oct 21 2021, 4:35 AM · Object storage

dachary added a comment to T3521: Persistent readonly perfect hash table: benchmarks.

$ time tox -e py3 -- --basetemp=/mnt/pytest -s --shard-size $((100 * 1024)) --object-max-size $((4 * 1024)) -k test_build_speed
number of objects = 45973694, total size = 105903024192
baseline 165.74853587150574, write_duration 495.07564210891724, build_duration 24.210500478744507, total_duration 519.2861425876617
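
As a rough reading of these figures, the sketch below derives the average object size and the write throughput from the reported numbers. The variable names and the choice of MiB/s as a unit are assumptions made for illustration; only the five input values come from the output above.

```python
# Values copied from the benchmark output above.
number_of_objects = 45_973_694
total_size_bytes = 105_903_024_192
write_duration_s = 495.07564210891724
build_duration_s = 24.210500478744507
total_duration_s = 519.2861425876617

MIB = 1024 * 1024  # unit chosen for this sketch, not by the benchmark

# Average payload size per object, in bytes.
average_object_size = total_size_bytes / number_of_objects

# Throughput while writing objects, and over the whole run.
write_throughput_mib_s = total_size_bytes / write_duration_s / MIB
overall_throughput_mib_s = total_size_bytes / total_duration_s / MIB

# Fraction of the total time spent in the build step.
build_share = build_duration_s / total_duration_s

print(f"average object size: {average_object_size:.0f} bytes")
print(f"write throughput:    {write_throughput_mib_s:.1f} MiB/s")
print(f"overall throughput:  {overall_throughput_mib_s:.1f} MiB/s")
print(f"build share of run:  {build_share:.1%}")
```

The reported total_duration matches write_duration plus build_duration, so the baseline figure is a separate measurement and not a component of the total.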

Oct 21 2021, 3:03 AM · Object storage (RedHat collaboration)

dachary added a comment to D6424: Perfect hashmap C implementation.

@ardumont I believe all your comments were addressed. It is lucky that you did not review the actual logic: while running benchmarks I ran into a bug that led to a significant refactor (using fread/fwrite instead of mmap and addresses). Not to the point where you'll have to re-read everything from scratch though 😅

Oct 21 2021, 2:56 AM

dachary updated the diff for D6424: Perfect hashmap C implementation.

using mmap turned out to be more involved than expected, use
read/write instead

Oct 21 2021, 2:41 AM

Oct 20 2021

dachary added a comment to D6424: Perfect hashmap C implementation.

Thanks a lot for this review @ardumont & @douardda ! I'll ping you for review when:

Oct 20 2021, 6:22 PM

dachary added inline comments to D6424: Perfect hashmap C implementation.

Oct 20 2021, 6:20 PM

Oct 19 2021

dachary updated the task description for T3670: fed4fire setup for winery benchmarks.

Oct 19 2021, 8:10 PM · Object storage

dachary updated the diff for D6424: Perfect hashmap C implementation.

the return value of mmap is -1 on error, not NULL
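
To illustrate the point, here is a minimal sketch that calls the C library's mmap through ctypes on a POSIX system and shows why a NULL check misses the failure. It is not taken from the diff: the invalid file descriptor, the 4096-byte length, and the 64-bit `off_t` mapping to `c_long` are assumptions chosen to force an error.

```python
import ctypes
import ctypes.util
import errno
import mmap
import os

# Load the C library; CDLL(None) also exposes libc symbols if find_library fails.
libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = [
    ctypes.c_void_p,  # addr
    ctypes.c_size_t,  # length
    ctypes.c_int,     # prot
    ctypes.c_int,     # flags
    ctypes.c_int,     # fd
    ctypes.c_long,    # offset (assumes off_t is a C long, as on 64-bit Linux)
]

# MAP_FAILED is (void *) -1 in C, i.e. an all-ones pointer value.
MAP_FAILED = ctypes.c_void_p(-1).value

# A file-backed mapping with an invalid descriptor is expected to fail.
addr = libc.mmap(None, 4096, mmap.PROT_READ, mmap.MAP_SHARED, -1, 0)
err = ctypes.get_errno()

# ctypes turns a NULL pointer into None, so this is the "wrong" check.
null_check_detects_error = addr is None
# Comparing against MAP_FAILED is the check that catches the error.
map_failed_check_detects_error = addr == MAP_FAILED

print("NULL check detects the error:      ", null_check_detects_error)
print("MAP_FAILED check detects the error:", map_failed_check_detects_error)
if map_failed_check_detects_error:
    print("mmap failed:", os.strerror(err))
```

Once the failure is detected, errno carries the reason, which is the information that was missing when the error went unnoticed.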

Oct 19 2021, 9:24 AM

dachary added a comment to T3521: Persistent readonly perfect hash table: benchmarks.

An mmap error was not detected, so there was no information on why it failed. This was fixed.

Oct 19 2021, 8:19 AM · Object storage (RedHat collaboration)

dachary added a comment to T3521: Persistent readonly perfect hash table: benchmarks.

time tox -e py3 -- --basetemp=/mnt/pytest -s --shard-size $((100 * 1024)) --object-max-size $((4 * 1024)) -k test_build_speed
number of objects = 45973118
baseline 163.73826217651367, write_duration 300.58917450904846, build_duration 26.01908826828003, total_duration 326.6082627773285

Oct 19 2021, 5:36 AM · Object storage (RedHat collaboration)

dachary updated the diff for D6424: Perfect hashmap C implementation.

return on error if the write method exceeds the file capacity
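
A minimal sketch of this kind of check, assuming a writer that tracks how many bytes it has written against a fixed capacity. The `BoundedWriter` class, its -1 return value, and the 8-byte capacity are hypothetical and only illustrate the idea; they are not the code from the diff.

```python
import io


class BoundedWriter:
    """Hypothetical helper: refuses any write that would exceed `capacity` bytes."""

    def __init__(self, stream, capacity):
        self.stream = stream
        self.capacity = capacity
        self.written = 0

    def write(self, data):
        # Check before writing so that a rejected write leaves the stream untouched.
        if self.written + len(data) > self.capacity:
            return -1
        self.stream.write(data)
        self.written += len(data)
        return len(data)


buffer = io.BytesIO()
writer = BoundedWriter(buffer, capacity=8)

first = writer.write(b"12345")   # fits: 5 of 8 bytes used
second = writer.write(b"6789")   # would need 9 bytes: rejected
third = writer.write(b"678")     # fits exactly: 8 of 8 bytes used

print(first, second, third, buffer.getvalue())
```

Checking before the write, and not after, means a failed call never leaves a partially written object behind.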

Oct 19 2021, 5:09 AM

dachary added a comment to T3521: Persistent readonly perfect hash table: benchmarks.

Running benchmarks directly on grid5000

# Reserve one node of the dahu cluster in deploy mode, then install Debian 11 on it
oarsub -I -l "{cluster='dahu'}/host=1,walltime=1" -t deploy
kadeploy3 -f $OAR_NODE_FILE -e debian11-x64-base -k
ssh root@$(tail -1 $OAR_NODE_FILE)
# Format and mount the disk that holds the benchmark data (this erases /dev/sdb1)
mkfs.ext4 /dev/sdb1
mount /dev/sdb1 /mnt
apt-get install -y python3-venv libcmph-dev gcc git
git clone https://git.easter-eggs.org/biceps/swh-perfecthash/
python3 -m venv bench
source bench/bin/activate
# The requirements files are in the repository, so enter it before installing them
cd swh-perfecthash
pip install -r requirements.txt -r requirements-test.txt
tox -e py3
time tox -e py3 -- --basetemp=/mnt/pytest -s --shard-size $((100 * 1024)) --object-max-size $((100 * 1024)) -k test_build_speed
rm -fr /mnt/pytest

Oct 19 2021, 4:36 AM · Object storage (RedHat collaboration)

dachary added a comment to T3670: fed4fire setup for winery benchmarks.

/opt/jFed/jFed-Experimenter works but I'll have to wait on the approval of the account before proceeding further.

Oct 19 2021, 3:50 AM · Object storage

dachary updated the task description for T3670: fed4fire setup for winery benchmarks.

Oct 19 2021, 3:48 AM · Object storage

dachary added a parent task for T3670: fed4fire setup for winery benchmarks: T3432: Add winery backend.

Oct 19 2021, 3:44 AM · Object storage

dachary added a subtask for T3432: Add winery backend: T3670: fed4fire setup for winery benchmarks.

Oct 19 2021, 3:44 AM · Object storage

dachary changed the status of T3670: fed4fire setup for winery benchmarks from Open to Work in Progress.

Oct 19 2021, 3:44 AM · Object storage

dachary added a comment to T3521: Persistent readonly perfect hash table: benchmarks.

Created a project in https://portal.fed4fire.eu/ with the intention of using grid5000. It is pending approval from an administrator (see T3670).

Oct 19 2021, 3:11 AM · Object storage (RedHat collaboration)

Oct 18 2021

dachary changed the status of T3521: Persistent readonly perfect hash table: benchmarks, a subtask of T3104: Persistent readonly perfect hash table, from Open to Work in Progress.

Oct 18 2021, 9:02 PM · Object storage (RedHat collaboration)

dachary changed the status of T3521: Persistent readonly perfect hash table: benchmarks from Open to Work in Progress.

Oct 18 2021, 9:02 PM · Object storage (RedHat collaboration)

dachary changed the status of T3520: Persistent readonly perfect hash table: implementation, a subtask of T3104: Persistent readonly perfect hash table, from Open to Work in Progress.

Oct 18 2021, 9:02 PM · Object storage (RedHat collaboration)

dachary changed the status of T3520: Persistent readonly perfect hash table: implementation from Open to Work in Progress.

Oct 18 2021, 9:02 PM · Object storage (RedHat collaboration)

dachary changed the status of T3104: Persistent readonly perfect hash table, a subtask of T3432: Add winery backend, from Open to Work in Progress.

Oct 18 2021, 9:02 PM · Object storage

dachary changed the status of T3104: Persistent readonly perfect hash table, a subtask of T3054: Scale out object storage design, from Open to Work in Progress.

Oct 18 2021, 9:02 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task

dachary changed the status of T3104: Persistent readonly perfect hash table from Open to Work in Progress.

Oct 18 2021, 9:02 PM · Object storage (RedHat collaboration)

dachary closed T3519: Persistent readonly perfect hash table: CI and package, a subtask of T3104: Persistent readonly perfect hash table, as Wontfix.

Oct 18 2021, 9:00 PM · Object storage (RedHat collaboration)

dachary closed T3519: Persistent readonly perfect hash table: CI and package as Wontfix.

This is a duplicate of T3634

Oct 18 2021, 9:00 PM · Object storage

dachary added a comment to T3520: Persistent readonly perfect hash table: implementation.

The draft implementation is available publicly at https://git.easter-eggs.org/biceps/swh-perfecthash/-/tree/wip-hash while D6424 is under review.

Oct 18 2021, 8:58 PM · Object storage (RedHat collaboration)

dachary added a comment to T3521: Persistent readonly perfect hash table: benchmarks.

The implementation of the benchmarks is prepared at:

Oct 18 2021, 8:57 PM · Object storage (RedHat collaboration)

dachary added a revision to T3520: Persistent readonly perfect hash table: implementation: D6424: Perfect hashmap C implementation.

Oct 18 2021, 8:46 PM · Object storage (RedHat collaboration)

dachary added a task to D6424: Perfect hashmap C implementation: T3520: Persistent readonly perfect hash table: implementation.

Oct 18 2021, 8:46 PM

dachary added a comment to T3104: Persistent readonly perfect hash table.

For the record I created a "draft" repository for contributions to https://forge.softwareheritage.org/source/swh-perfecthash/ at https://git.easter-eggs.org/biceps/swh-perfecthash. It is only meant to be a publicly available sandbox.

Oct 18 2021, 8:39 PM · Object storage (RedHat collaboration)

Oct 12 2021

dachary updated the diff for D6424: Perfect hashmap C implementation.

create and lookup a Read Shard with a perfect hash
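
For readers unfamiliar with the idea, the sketch below shows a toy read-only shard indexed by a perfect hash. It is not the C implementation from this diff: the brute-force seed search stands in for a real perfect hash library, and `slot`, `find_seed`, `build_shard`, `lookup`, and the use of SHA-256 digests as keys are all hypothetical names and choices made for illustration.

```python
import hashlib


def slot(key, seed, size):
    """Map a key to a slot in [0, size) using a seeded hash."""
    digest = hashlib.sha256(seed.to_bytes(8, "big") + key).digest()
    return int.from_bytes(digest[:8], "big") % size


def find_seed(keys, max_seed=1_000_000):
    """Brute-force a seed for which every key gets its own slot (small key sets only)."""
    size = len(keys)
    for seed in range(max_seed):
        if len({slot(key, seed, size) for key in keys}) == size:
            return seed
    raise ValueError("no collision-free seed found")


def build_shard(objects):
    """Build (seed, index, payload) from a non-empty dict mapping key -> content."""
    if not objects:
        raise ValueError("a shard needs at least one object")
    keys = list(objects)
    seed = find_seed(keys)
    index = [None] * len(keys)
    payload = bytearray()
    for key in keys:
        # Each slot records the key plus where its content sits in the payload.
        index[slot(key, seed, len(keys))] = (key, len(payload), len(objects[key]))
        payload += objects[key]
    return seed, index, bytes(payload)


def lookup(shard, key):
    """Return the content for `key`, or raise KeyError if it is not in the shard."""
    seed, index, payload = shard
    stored_key, offset, length = index[slot(key, seed, len(index))]
    if stored_key != key:  # a perfect hash maps unknown keys to some slot too
        raise KeyError(key)
    return payload[offset:offset + length]


contents = [b"hello", b"perfect", b"hash", b"shard"]
objects = {hashlib.sha256(content).digest(): content for content in contents}
shard = build_shard(objects)
for key, content in objects.items():
    assert lookup(shard, key) == content
```

Because the key set is fixed once the shard is built, each lookup is a single hash computation followed by one index access, with no collision handling.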

Oct 12 2021, 4:55 PM

dachary added a comment to D6424: Perfect hashmap C implementation.

@ardumont @olasd this is ready for review... I think :-D

Oct 12 2021, 4:49 PM

dachary updated the diff for D6424: Perfect hashmap C implementation.

create and lookup a Read Shard with a perfect hash

Oct 12 2021, 4:48 PM

dachary added a comment to D6424: Perfect hashmap C implementation.

This is a working C implementation with tests and decent code coverage (all but error conditions). The test run is valgrind clean. The next step is to run the Python tests, then wrap up and document so it can be reviewed.

Oct 12 2021, 2:52 PM

dachary updated the diff for D6424: Perfect hashmap C implementation.

working implementation, with tests

Oct 12 2021, 2:49 PM

Oct 11 2021

dachary updated the diff for D6424: Perfect hashmap C implementation.

compiles but crashes

Oct 11 2021, 10:33 PM

dachary added a comment to D6424: Perfect hashmap C implementation.

In D6424#167442, @dachary wrote:

skeleton

Oct 11 2021, 4:17 PM

dachary updated the diff for D6424: Perfect hashmap C implementation.

skeleton

Oct 11 2021, 4:10 PM

dachary added a comment to D6424: Perfect hashmap C implementation.

In D6424#166886, @ardumont wrote:

@dachary It'd be nice if you could describe what this is about in the commit message and
the diff description (if you actually provide a commit description, then when you create
the diff, the commit message is used as a description bootstrap). I know it's more work
for you but it happens that:

it helps the reviewers to have some context directly here (without having to follow

between a multitude of tasks. FYI, I've followed through the task but it's not enough,
I also need to dig into that arborescence of tasks).

is also how we do it in every other module ;)

the curious could learn a thing or 2 even if they don't do a proper review.

Please and thanks in advance.

Cheers,