worker0.staging is running from a venv with D6240 (more extid filtering) and D6268 (build snapshot) applied.
The filtering still happens in the mercurial loader, though (this is without D6275).
# tl;dr
```
|-----------------+------------+--------------------------------------------+-----------+--------------------------------------------|
| machine | 1st run | snapshot | 2nd run | snapshot |
|-----------------+------------+--------------------------------------------+-----------+--------------------------------------------|
| worker17 | 12m11.504s | \xa2cd5ce3505f3b8bd7311455d0d200f7c5ad6294 | 0m34.106s | None |
| worker0.staging | 8m47.374s | \xa2cd5ce3505f3b8bd7311455d0d200f7c5ad6294 | 0m32.484s | \xa2cd5ce3505f3b8bd7311455d0d200f7c5ad6294 |
|-----------------+------------+--------------------------------------------+-----------+--------------------------------------------|
```
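To put the timings side by side, here is a minimal sketch that converts the `real` values from the table into seconds and prints the ratio between the first and second run on each machine. The helper name `parse_real_time` is made up for this illustration; only the four durations come from the runs reported here.

```python
import re


def parse_real_time(value: str) -> float:
    """Convert a shell `time` duration such as '12m11.504s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognized duration: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# `real` timings copied from the table: (first run, second run).
timings = {
    "worker17": ("12m11.504s", "0m34.106s"),
    "worker0.staging": ("8m47.374s", "0m32.484s"),
}

for machine, (first, second) in timings.items():
    first_s = parse_real_time(first)
    second_s = parse_real_time(second)
    # The ratio shows how much shorter the second (uneventful) run is.
    print(f"{machine}: {first_s:.1f}s -> {second_s:.1f}s "
          f"(ratio {first_s / second_s:.1f})")
```

The two machines have different hardware and run different loader code, so the ratios are only a rough comparison of each machine against itself.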
# Details
## 1st run
### worker17 (20 CPUs, 64 GiB RAM, tmpfs)
```
swhworker@worker17:~$ time SWH_CONFIG_FILENAME=/etc/softwareheritage/loader_oneshot.yml swh loader run mercurial_from_disk https://foss.heptapod.net/mercurial/tortoisehg/thg
INFO:swh.loader.mercurial.LoaderFromDisk:Load origin 'https://foss.heptapod.net/mercurial/tortoisehg/thg' with type 'hg'
requesting all changes
adding changesets
adding manifests
adding file changes
added 19929 changesets with 32270 changes to 1595 files
new changesets bac32db38e52:0a83dac8c779
{'status': 'eventful'}
real 12m11.504s
user 8m43.197s
sys 0m33.806s
```
snapshot:
```
15:55:47 softwareheritage@belvedere:5432=> select * from origin o inner join origin_visit_status ovs on o.id=ovs.origin where o.url = 'https://foss.heptapod.net/mercurial/tortoisehg/thg' and ovs.type='hg' order by date desc;
+-----------+----------------------------------------------------+-----------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
| id | url | origin | visit | date | status | metadata | snapshot | type |
+-----------+----------------------------------------------------+-----------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
| 163616646 | https://foss.heptapod.net/mercurial/tortoisehg/thg | 163616646 | 1 | 2021-09-16 13:56:00.958435+00 | full | (null) | \xa2cd5ce3505f3b8bd7311455d0d200f7c5ad6294 | hg |
| 163616646 | https://foss.heptapod.net/mercurial/tortoisehg/thg | 163616646 | 1 | 2021-09-16 13:43:59.915896+00 | created | (null) | (null) | hg |
+-----------+----------------------------------------------------+-----------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
```
### worker0.staging (4 CPUs, 12 GiB RAM, /tmp on disk)
```
(ve) swhworker@worker0:~/swh-loader-mercurial$ time swh loader run mercurial_from_disk https://foss.heptapod.net/mercurial/tortoisehg/thg
INFO:swh.loader.mercurial.LoaderFromDisk:Load origin 'https://foss.heptapod.net/mercurial/tortoisehg/thg' with type 'hg'
requesting all changes
adding changesets
adding manifests
adding file changes
added 19929 changesets with 32270 changes to 1595 files
new changesets bac32db38e52:0a83dac8c779
INFO:swh.loader.mercurial.LoaderFromDisk:New revisions found: 19929
{'status': 'eventful'}
real 8m47.374s
user 6m17.912s
sys 0m7.192s
```
snapshot:
```
15:55:29 swh@db1:5432=> select * from origin o inner join origin_visit_status ovs on o.id=ovs.origin where o.url = 'https://foss.heptapod.net/mercurial/tortoisehg/thg' and ovs.type='hg' order by date desc limit 2;
+--------+----------------------------------------------------+--------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
| id | url | origin | visit | date | status | metadata | snapshot | type |
+--------+----------------------------------------------------+--------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
| 693991 | https://foss.heptapod.net/mercurial/tortoisehg/thg | 693991 | 4 | 2021-09-16 13:54:09.596121+00 | full | (null) | \xa2cd5ce3505f3b8bd7311455d0d200f7c5ad6294 | hg |
| 693991 | https://foss.heptapod.net/mercurial/tortoisehg/thg | 693991 | 4 | 2021-09-16 13:45:25.397994+00 | created | (null) | (null) | hg |
+--------+----------------------------------------------------+--------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
```
## 2nd run
### worker17
```
swhworker@worker17:~$ time SWH_CONFIG_FILENAME=/etc/softwareheritage/loader_oneshot.yml swh loader run mercurial_from_disk https://foss.heptapod.net/mercurial/tortoisehg/thg
INFO:swh.loader.mercurial.LoaderFromDisk:Load origin 'https://foss.heptapod.net/mercurial/tortoisehg/thg' with type 'hg'
requesting all changes
adding changesets
adding manifests
adding file changes
added 19929 changesets with 32270 changes to 1595 files
new changesets bac32db38e52:0a83dac8c779
{'status': 'uneventful'}
real 0m34.106s
user 0m30.524s
sys 0m2.514s
```
This run does not record a snapshot for the visit (the bad behavior):
```
15:58:47 softwareheritage@belvedere:5432=> select * from origin o inner join origin_visit_status ovs on o.id=ovs.origin where o.url = 'https://foss.heptapod.net/mercurial/tortoisehg/thg' and ovs.type='hg' order by date desc;
+-----------+----------------------------------------------------+-----------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
| id | url | origin | visit | date | status | metadata | snapshot | type |
+-----------+----------------------------------------------------+-----------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
| 163616646 | https://foss.heptapod.net/mercurial/tortoisehg/thg | 163616646 | 2 | 2021-09-16 14:05:50.894346+00 | full | (null) | (null) | hg |
| 163616646 | https://foss.heptapod.net/mercurial/tortoisehg/thg | 163616646 | 2 | 2021-09-16 14:05:18.307748+00 | created | (null) | (null) | hg |
```
### worker0.staging
```
(ve) swhworker@worker0:~/swh-loader-mercurial$ time swh loader run mercurial_from_disk https://foss.heptapod.net/mercurial/tortoisehg/thg
INFO:swh.loader.mercurial.LoaderFromDisk:Load origin 'https://foss.heptapod.net/mercurial/tortoisehg/thg' with type 'hg'
requesting all changes
adding changesets
adding manifests
adding file changes
added 19929 changesets with 32270 changes to 1595 files
new changesets bac32db38e52:0a83dac8c779
{'status': 'uneventful'}
real 0m32.484s
user 0m19.929s
sys 0m1.139s
```
This time the visit does record a snapshot even though it is uneventful (the same snapshot as before, since nothing changed):
```
16:03:28 swh@db1:5432=> select * from origin o inner join origin_visit_status ovs on o.id=ovs.origin where o.url = 'https://foss.heptapod.net/mercurial/tortoisehg/thg' and ovs.type='hg' order by date desc limit 2;
+--------+----------------------------------------------------+--------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
| id | url | origin | visit | date | status | metadata | snapshot | type |
+--------+----------------------------------------------------+--------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
| 693991 | https://foss.heptapod.net/mercurial/tortoisehg/thg | 693991 | 5 | 2021-09-16 14:06:08.419757+00 | full | (null) | \xa2cd5ce3505f3b8bd7311455d0d200f7c5ad6294 | hg |
| 693991 | https://foss.heptapod.net/mercurial/tortoisehg/thg | 693991 | 5 | 2021-09-16 14:05:37.710598+00 | created | (null) | (null) | hg |
+--------+----------------------------------------------------+--------+-------+-------------------------------+---------+----------+--------------------------------------------+------+
```
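The difference between the two machines comes down to one condition on the `origin_visit_status` rows: a visit whose status is `full` but whose snapshot is null. Below is a minimal sketch of that check over the rows shown in the queries of this report. The `VisitStatus` type and the `full_visits_without_snapshot` helper are hypothetical names for this illustration and not part of any swh module.

```python
from typing import List, NamedTuple, Optional

SNAPSHOT = "a2cd5ce3505f3b8bd7311455d0d200f7c5ad6294"


class VisitStatus(NamedTuple):
    """One origin_visit_status row, reduced to the columns of interest."""
    machine: str
    visit: int
    status: str
    snapshot: Optional[str]


def full_visits_without_snapshot(rows: List[VisitStatus]) -> List[VisitStatus]:
    """Return the rows that ended as 'full' without any snapshot attached."""
    return [r for r in rows if r.status == "full" and r.snapshot is None]


# Rows transcribed from the query results in this report.
rows = [
    VisitStatus("worker17", 1, "created", None),
    VisitStatus("worker17", 1, "full", SNAPSHOT),
    VisitStatus("worker17", 2, "created", None),
    VisitStatus("worker17", 2, "full", None),
    VisitStatus("worker0.staging", 4, "created", None),
    VisitStatus("worker0.staging", 4, "full", SNAPSHOT),
    VisitStatus("worker0.staging", 5, "created", None),
    VisitStatus("worker0.staging", 5, "full", SNAPSHOT),
]

for row in full_visits_without_snapshot(rows):
    print(f"{row.machine}: visit {row.visit} is 'full' but has no snapshot")
```

A `created` row with a null snapshot is expected, so the check only looks at `full` rows.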