- worker0.staging is running within a venv with D6240 (more extid filtering) and D6268 (build snapshot)
- worker17 is running the latest packaged version (no optimizations)

Note that the filtering still happens in the mercurial loader (D6275 is not applied).

# tl;dr

```
|-----------------+------------+--------------------------------------------+-----------+--------------------------------------------|
| machine         | 1st run    | snapshot                                   | 2nd run   | snapshot                                   |
|-----------------+------------+--------------------------------------------+-----------+--------------------------------------------|
| worker17        | 12m11.504s | \xa2cd5ce3505f3b8bd7311455d0d200f7c5ad6294 | 0m34.106s | None                                       |
| worker0.staging | 8m47.374s  | \xa2cd5ce3505f3b8bd7311455d0d200f7c5ad6294 | 0m32.484s | \xa2cd5ce3505f3b8bd7311455d0d200f7c5ad6294 |
|-----------------+------------+--------------------------------------------+-----------+--------------------------------------------|
```

# Details

## first run

### worker17 (20 cpus, 64Gib ram, tmpfs):

```
swhworker@worker17:~$ time SWH_CONFIG_FILENAME=/etc/softwareheritage/loader_oneshot.yml swh loader run mercurial_from_disk https://foss.heptapod.net/mercurial/tortoisehg/thg
INFO:swh.loader.mercurial.LoaderFromDisk:Load origin 'https://foss.heptapod.net/mercurial/tortoisehg/thg' with type 'hg'
requesting all changes
adding changesets
adding manifests
adding file changes
added 19929 changesets with 32270 changes to 1595 files
new changesets bac32db38e52:0a83dac8c779
{'status': 'eventful'}

real    12m11.504s
user    8m43.197s
sys     0m33.806s
```

snapshot:

```
15:55:47 softwareheritage@belvedere:5432=> select * from origin o inner join origin_visit_status ovs on o.id=ovs.origin where o.url = 'https://foss.heptapod.net/mercurial/tortoisehg/thg' and ovs.type='hg' order by date desc;
+-----------+----------------------------------------------------+-----------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
| id        | url                                                | origin    | visit | date                          | status  | metadata | snapshot                                   | type |
+-----------+----------------------------------------------------+-----------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
| 163616646 | https://foss.heptapod.net/mercurial/tortoisehg/thg | 163616646 |     1 | 2021-09-16 13:56:00.958435+00 | full    | (null)   | \xa2cd5ce3505f3b8bd7311455d0d200f7c5ad6294 | hg   |
| 163616646 | https://foss.heptapod.net/mercurial/tortoisehg/thg | 163616646 |     1 | 2021-09-16 13:43:59.915896+00 | created | (null)   | (null)                                     | hg   |
+-----------+----------------------------------------------------+-----------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
```

### worker0.staging (4 cpus, 12Gib ram, /tmp on disk)

```
(ve) swhworker@worker0:~/swh-loader-mercurial$ time swh loader run mercurial_from_disk https://foss.heptapod.net/mercurial/tortoisehg/thg
INFO:swh.loader.mercurial.LoaderFromDisk:Load origin 'https://foss.heptapod.net/mercurial/tortoisehg/thg' with type 'hg'
requesting all changes
adding changesets
adding manifests
adding file changes
added 19929 changesets with 32270 changes to 1595 files
new changesets bac32db38e52:0a83dac8c779
INFO:swh.loader.mercurial.LoaderFromDisk:New revisions found: 19929
{'status': 'eventful'}

real    8m47.374s
user    6m17.912s
sys     0m7.192s
```

snapshot:

```
15:55:29 swh@db1:5432=> select * from origin o inner join origin_visit_status ovs on o.id=ovs.origin where o.url = 'https://foss.heptapod.net/mercurial/tortoisehg/thg' and ovs.type='hg' order by date desc limit 2;
+--------+----------------------------------------------------+--------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
| id     | url                                                | origin | visit | date                          | status  | metadata | snapshot                                   | type |
+--------+----------------------------------------------------+--------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
| 693991 | https://foss.heptapod.net/mercurial/tortoisehg/thg | 693991 |     4 | 2021-09-16 13:54:09.596121+00 | full    | (null)   | \xa2cd5ce3505f3b8bd7311455d0d200f7c5ad6294 | hg   |
| 693991 | https://foss.heptapod.net/mercurial/tortoisehg/thg | 693991 |     4 | 2021-09-16 13:45:25.397994+00 | created | (null)   | (null)                                     | hg   |
+--------+----------------------------------------------------+--------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
```

## 2nd run

### worker17

```
swhworker@worker17:~$ time SWH_CONFIG_FILENAME=/etc/softwareheritage/loader_oneshot.yml swh loader run mercurial_from_disk https://foss.heptapod.net/mercurial/tortoisehg/thg
INFO:swh.loader.mercurial.LoaderFromDisk:Load origin 'https://foss.heptapod.net/mercurial/tortoisehg/thg' with type 'hg'
requesting all changes
adding changesets
adding manifests
adding file changes
added 19929 changesets with 32270 changes to 1595 files
new changesets bac32db38e52:0a83dac8c779
{'status': 'uneventful'}

real    0m34.106s
user    0m30.524s
sys     0m2.514s
```

This does not create a snapshot (the bad behavior):

```
15:58:47 softwareheritage@belvedere:5432=> select * from origin o inner join origin_visit_status ovs on o.id=ovs.origin where o.url = 'https://foss.heptapod.net/mercurial/tortoisehg/thg' and ovs.type='hg' order by date desc;
+-----------+----------------------------------------------------+-----------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
| id        | url                                                | origin    | visit | date                          | status  | metadata | snapshot                                   | type |
+-----------+----------------------------------------------------+-----------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
| 163616646 | https://foss.heptapod.net/mercurial/tortoisehg/thg | 163616646 |     2 | 2021-09-16 14:05:50.894346+00 | full    | (null)   | (null)                                     | hg   |
| 163616646 | https://foss.heptapod.net/mercurial/tortoisehg/thg | 163616646 |     2 | 2021-09-16 14:05:18.307748+00 | created | (null)   | (null)                                     | hg   |
```

### worker0.staging

```
(ve) swhworker@worker0:~/swh-loader-mercurial$ time swh loader run mercurial_from_disk https://foss.heptapod.net/mercurial/tortoisehg/thg
INFO:swh.loader.mercurial.LoaderFromDisk:Load origin 'https://foss.heptapod.net/mercurial/tortoisehg/thg' with type 'hg'
requesting all changes
adding changesets
adding manifests
adding file changes
added 19929 changesets with 32270 changes to 1595 files
new changesets bac32db38e52:0a83dac8c779
{'status': 'uneventful'}

real    0m32.484s
user    0m19.929s
sys     0m1.139s
```

This now does create a snapshot nonetheless (the same one as before, since nothing changed):

```
16:03:28 swh@db1:5432=> select * from origin o inner join origin_visit_status ovs on o.id=ovs.origin where o.url = 'https://foss.heptapod.net/mercurial/tortoisehg/thg' and ovs.type='hg' order by date desc limit 2;
+--------+----------------------------------------------------+--------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
| id     | url                                                | origin | visit | date                          | status  | metadata | snapshot                                   | type |
+--------+----------------------------------------------------+--------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
| 693991 | https://foss.heptapod.net/mercurial/tortoisehg/thg | 693991 |     5 | 2021-09-16 14:06:08.419757+00 | full    | (null)   | \xa2cd5ce3505f3b8bd7311455d0d200f7c5ad6294 | hg   |
| 693991 | https://foss.heptapod.net/mercurial/tortoisehg/thg | 693991 |     5 | 2021-09-16 14:05:37.710598+00 | created | (null)   | (null)                                     | hg   |
+--------+----------------------------------------------------+--------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
```
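
To compare the wall-clock timings of the four runs above, here is a minimal Python sketch that converts the `real` values reported by `time` into seconds and computes the worker17 / worker0.staging ratio for each run. The names `parse_duration`, `timings` and `speedup` are only illustrative and are not part of any swh tooling. Keep in mind that the two machines differ (20 cpus with tmpfs versus 4 cpus with /tmp on disk), so the ratio is at best indicative of the effect of the patches.

```python
import re


def parse_duration(text: str) -> float:
    """Convert a bash `time` value such as '12m11.504s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unexpected duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# "real" timings copied from the transcripts above.
timings = {
    "worker17": {"1st run": "12m11.504s", "2nd run": "0m34.106s"},
    "worker0.staging": {"1st run": "8m47.374s", "2nd run": "0m32.484s"},
}


def speedup(run: str) -> float:
    """Ratio of worker17 wall-clock time over worker0.staging for a run."""
    packaged = parse_duration(timings["worker17"][run])
    patched = parse_duration(timings["worker0.staging"][run])
    return packaged / patched


for run in ("1st run", "2nd run"):
    print(f"{run}: worker17 / worker0.staging = {speedup(run):.2f}x")
```

On these numbers the first run is noticeably faster on worker0.staging, while the second (uneventful) run takes about the same time on both machines.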