
I/O error on worker06.internal
Started, Work in Progress, High, Public

Description

worker06:/var/log/auth.log.1 is partially unreadable.

worker06 is a VM, and its virtual drive reports read errors:

[Mon Jan 21 11:32:42 2019] sd 2:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Mon Jan 21 11:32:42 2019] sd 2:0:0:0: [sda] tag#0 Sense Key : Aborted Command [current]
[Mon Jan 21 11:32:42 2019] sd 2:0:0:0: [sda] tag#0 Add. Sense: I/O process terminated
[Mon Jan 21 11:32:42 2019] sd 2:0:0:0: [sda] tag#0 CDB: Read(10) 28 00 02 8f 6b 80 00 00 08 00
[Mon Jan 21 11:32:42 2019] blk_update_request: I/O error, dev sda, sector 42953600
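The CDB bytes in the log above can be cross-checked against the sector reported by blk_update_request. The following is a minimal sketch, not part of the original investigation: decode_read10 is a hypothetical helper name, and the 512-byte logical block size is an assumption (it is consistent with the LBA and the kernel sector number being equal). A READ(10) CDB carries the opcode 0x28 in byte 0, a big-endian LBA in bytes 2 to 5 and a big-endian transfer length in bytes 7 and 8.

```python
def decode_read10(cdb_hex):
    """Return (lba, block_count) from a READ(10) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)  # whitespace between bytes is accepted
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length
    return lba, blocks


lba, blocks = decode_read10("28 00 02 8f 6b 80 00 00 08 00")
# Assumption: 512-byte logical blocks on the guest's sda.
print(lba, blocks, blocks * 512)  # prints: 42953600 8 4096
```

The decoded LBA matches sector 42953600 in the last log line, and the failed request covers 8 blocks, i.e. a single 4 KiB read under the 512-byte assumption.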

Event Timeline

ftigeot created this task. Mon, Jan 21, 2:38 PM
ftigeot changed the task status from Open to Work in Progress.
ftigeot triaged this task as High priority.
ftigeot added a comment. Mon, Jan 21, 2:42 PM (edited)

worker06.internal.softwareheritage.org is a VM running on louvre. Its virtual disk is backed by /dev/dm-36 on the host.

louvre:/dev/dm-36 also reports I/O errors:

[Mon Jan 21 12:17:53 2019] buffer_io_error: 25 callbacks suppressed
[Mon Jan 21 12:17:53 2019] Buffer I/O error on dev dm-36, logical block 930275, async page read
[Mon Jan 21 12:17:53 2019] Buffer I/O error on dev dm-36, logical block 930275, async page read

louvre:/dev/dm-36 is backed by /dev/md3, which is itself layered on top of /dev/sda and /dev/sdb.
None of these low-level devices report any error.

ftigeot claimed this task. Tue, Jan 22, 8:41 AM

The /dev/md3 check completed successfully and did not report any error.

ftigeot changed the status of subtask T1518: I/O error on louvre:/dev/md3 from Open to Work in Progress. Tue, Feb 5, 4:18 PM

A brand-new virtual disk was created and the data copied onto it, skipping bad data blocks:

  • Shut down the VM
  • Create a new drive with identical size in the Proxmox web interface. There are now two virtual drives:
/dev/ssd/vm-206-disk-1 => dm-34
/dev/ssd/vm-206-disk-0 => dm-35
  • Activate the new lvm device:
lvchange -a y ssd/vm-206-disk-0
  • Copy the data in 4K blocks; unreadable blocks are skipped and zero-filled on the destination:
dd if=/dev/dm-34 of=/dev/dm-35 bs=4k conv=sync,noerror
  • In the Proxmox web interface, detach the old drive, tell the VM to boot from the new one, and restart it
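
For reference, the effect of dd with bs=4k conv=sync,noerror in the copy step can be sketched as follows. This is an illustration under stated assumptions, not the tool that was actually used: copy_skipping_bad_blocks is a hypothetical name, and the sketch ignores short writes and other details that dd handles. Each 4K block is read independently; a block that fails with an I/O error is recorded and written as zeros, so all later data stays at the same offset on the new drive.

```python
import os


def copy_skipping_bad_blocks(src_path, dst_path, block_size=4096):
    """Copy src to dst block by block; return indexes of unreadable blocks."""
    bad_blocks = []
    src = os.open(src_path, os.O_RDONLY)
    dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        size = os.lseek(src, 0, os.SEEK_END)  # also works on block devices
        for offset in range(0, size, block_size):
            try:
                data = os.pread(src, block_size, offset)
            except OSError:
                # Like conv=noerror: note the failure and keep going.
                data = b""
                bad_blocks.append(offset // block_size)
            # Like conv=sync: pad short or failed reads with NUL bytes
            # so the destination offsets stay aligned with the source.
            os.pwrite(dst, data.ljust(block_size, b"\0"), offset)
    finally:
        os.close(src)
        os.close(dst)
    return bad_blocks
```

Called as copy_skipping_bad_blocks("/dev/dm-34", "/dev/dm-35"), it would return the list of 4K block indexes that could not be read, which tells which parts of the guest filesystem now contain zeros instead of the original data.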