To transfer data between two hosts, syncoid uses ssh with a control socket for connection sharing.
Because the socket name is based on the current time (with one-second precision), two synchronizations starting in the same second share the same connection. The first sync to finish closes the socket and makes the other one fail:
https://github.com/jimsalterjrs/sanoid/issues/532
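For illustration, here is a minimal sketch (in Python; syncoid itself is written in Perl) of why a second-precision timestamp collides. The naming scheme is an assumption reconstructed from the socket path visible in the logs below:

```python
import time

def control_socket_path(local_user: str, remote: str) -> str:
    # Assumed naming scheme, matching the path seen in the logs:
    # the suffix (e.g. 1645457263) is a Unix timestamp in whole seconds.
    return f"/tmp/syncoid-{local_user}-{remote}-{int(time.time())}"

# Two syncoid runs launched in the same second compute the same path,
# so the second ssh attaches to the first one's control socket.
remote = "root@storage1.internal.staging.swh.network"
print(control_socket_path("root", remote))
print(control_socket_path("root", remote))  # identical when called within the same second
```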
Feb 21 15:27:43 db1 systemd[1]: Starting ZFS dataset synchronization of...
Feb 21 15:27:44 db1 syncoid[4136471]: ControlSocket /tmp/syncoid-root-root@storage1.internal.staging.swh.network-1645457263 already exists, disabling multiplexing
Feb 21 15:27:45 db1 syncoid[4136469]: Sending incremental data/objects@syncoid_db1_2022-02-21:15:21:59 ... syncoid_db1_2022-02-21:15:27:44 (~ 301.4 MB):
Feb 21 15:27:49 db1 syncoid[4136796]: lzop: Inappropriate ioctl for device: <stdin>
Feb 21 15:27:50 db1 syncoid[4136792]: cannot receive incremental stream: checksum mismatch or incomplete stream.
Feb 21 15:27:50 db1 syncoid[4136792]: Partially received snapshot is saved.
Feb 21 15:27:50 db1 syncoid[4136792]: A resuming stream can be generated on the sending system by running:
Feb 21 15:27:50 db1 syncoid[4136792]: zfs send -t 1-10db178eb8-100-789c636064000310a501c49c50360710a715e5e7a69766a630404183d9521fcfe7ebdf2800d9ec48eaf293b252934b1818a2de0481d561c8a7a515a79630c001489e0d493ea9b224b518481f78b49f079bfe927c882b7cde7cddbe7676e42c0f24794eb07c5e626e2a0343>
Feb 21 15:27:50 db1 syncoid[4136469]: CRITICAL ERROR: ssh -i /root/.ssh/id_ed25519.syncoid_db1 -S /tmp/syncoid-root-root@storage1.internal.staging.swh.network-1645457263 root@storage1.internal.staging.swh.network ' zfs send -I '"'"'data/objects'"'"'@'"'"'syncoid_db1_2022-02-21:15:>
Feb 21 15:27:50 db1 systemd[1]: syncoid-storage1-objects.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Feb 21 15:27:50 db1 systemd[1]: syncoid-storage1-objects.service: Failed with result 'exit-code'.
Feb 21 15:27:50 db1 systemd[1]: Failed to start ZFS dataset synchronization of.
Feb 21 15:27:50 db1 systemd[1]: syncoid-storage1-objects.service: Consumed 2.132s CPU time.
Meanwhile, the concurrent sync of data/kafka, started in the same second, finished first and closed the shared socket:

Feb 21 15:27:43 db1 systemd[1]: Starting ZFS dataset synchronization of...
Feb 21 15:27:45 db1 syncoid[4136468]: Sending incremental data/kafka@syncoid_db1_2022-02-21:15:21:55 ... syncoid_db1_2022-02-21:15:27:44 (~ 192.1 MB):
Feb 21 15:27:48 db1 systemd[1]: syncoid-storage1-kafka.service: Succeeded.
Feb 21 15:27:48 db1 systemd[1]: Finished ZFS dataset synchronization of.
Feb 21 15:27:48 db1 systemd[1]: syncoid-storage1-kafka.service: Consumed 2.408s CPU time.
Fortunately, syncoid is resilient to this kind of error and stabilizes itself on the next run:
Feb 21 15:33:29 db1 systemd[1]: Starting ZFS dataset synchronization of...
Feb 21 15:33:30 db1 syncoid[4167164]: Resuming interrupted zfs send/receive from data/objects to data/sync/storage1/objects (~ 97.4 MB remaining):
Feb 21 15:33:38 db1 syncoid[4167164]: Sending incremental data/objects@syncoid_db1_2022-02-21:15:27:44 ... syncoid_db1_2022-02-21:15:33:36 (~ 269.3 MB):
Feb 21 15:33:51 db1 systemd[1]: syncoid-storage1-objects.service: Succeeded.
Feb 21 15:33:51 db1 systemd[1]: Finished ZFS dataset synchronization of.
Feb 21 15:33:51 db1 systemd[1]: syncoid-storage1-objects.service: Consumed 7.327s CPU time.
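The upstream issue discusses making the socket name unique per invocation. A minimal sketch of that idea (hypothetical Python, not syncoid's actual fix) mixes the process id into the name, so two syncs starting in the same second no longer collide:

```python
import os
import time

def control_socket_path(local_user: str, remote: str) -> str:
    # Hypothetical variant of the naming scheme sketched above: the pid
    # keeps the path unique even when two syncoid invocations start
    # within the same wall-clock second.
    return f"/tmp/syncoid-{local_user}-{remote}-{int(time.time())}-{os.getpid()}"
```

Any per-process discriminator (pid, random suffix) would do; the point is that wall-clock seconds alone are not unique across concurrent systemd units.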