Deleting a failing drive from RAID6 fails

* Deleting a failing drive from RAID6 fails
@ 2019-12-25 19:25 Martin
  2019-12-26  5:03 ` Qu Wenruo
  0 siblings, 1 reply; 5+ messages in thread
From: Martin @ 2019-12-25 19:25 UTC (permalink / raw)
  To: linux-btrfs

Hi,

I have a drive that started failing (uncorrectable errors & lots of
relocated sectors) in a RAID6 (12 device/70TB total with 30TB of
data), btrfs scrub started showing corrected errors as well (seemingly
no big deal since its RAID6). I decided to remove the drive from the
array with:
    btrfs device delete /dev/sdg /mount_point

After about 20 hours and having rebalanced 90% of the data off the
drive, the operation failed with an I/O error. dmesg was showing csum
errors:
    BTRFS warning (device sdf): csum failed root -9 ino 2526 off
10673848320 csum 0x8941f998 expected csum 0x253c8e4b mirror 2
    BTRFS warning (device sdf): csum failed root -9 ino 2526 off
10673852416 csum 0x8941f998 expected csum 0x8a9a53fe mirror 2
    . . .

I pulled the drive out of the system and attempted the device deletion
again, but getting the same error.

Looking back through the logs to the previous scrubs, it showed the
file paths where errors were detected, so I deleted those files, and
tried removing the failing drive again. It moved along some more. Now
its down to only 13GiB of data remaining on the missing drive. Is
there any way to track the above errors to specific files so I can
delete them and finish the removal. Is there is a better way to finish
the device deletion?

Scrubbing with the device missing just racks up uncorrectable errors
right off the bat, so it seemingly doesn't like missing a device - I
assume it's not actually doing anything useful, right?

I'm currently traveling and away from the system physically. Is there
any way to complete the device removal without reconnecting the
failing drive? Otherwise, I'll have a replacement drive in a couple of
weeks when I'm back, and can try anything involving reconnecting the
drive.

Thanks,
Martin

^ permalink raw reply	[flat|nested] 5+ messages in thread