From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Cc: Martin <mbakiev@gmail.com>, linux-btrfs@vger.kernel.org
Subject: Re: Deleting a failing drive from RAID6 fails
Date: Thu, 26 Dec 2019 14:50:30 +0800	[thread overview]
Message-ID: <50661176-b04c-882b-d87c-ee5c0395c3f6@gmx.com> (raw)
In-Reply-To: <20191226054058.GC13306@hungrycats.org>



On 2019/12/26 1:40 PM, Zygo Blaxell wrote:
> On Thu, Dec 26, 2019 at 01:03:47PM +0800, Qu Wenruo wrote:
>>
>>
>> On 2019/12/26 3:25 AM, Martin wrote:
>>> Hi,
>>>
>>> I have a drive that started failing (uncorrectable errors & lots of
>>> relocated sectors) in a RAID6 (12 device/70TB total with 30TB of
>>> data), btrfs scrub started showing corrected errors as well (seemingly
>>> no big deal since its RAID6). I decided to remove the drive from the
>>> array with:
>>>     btrfs device delete /dev/sdg /mount_point
>>>
>>> After about 20 hours and having rebalanced 90% of the data off the
>>> drive, the operation failed with an I/O error. dmesg was showing csum
>>> errors:
>>>     BTRFS warning (device sdf): csum failed root -9 ino 2526 off
>>> 10673848320 csum 0x8941f998 expected csum 0x253c8e4b mirror 2
>>>     BTRFS warning (device sdf): csum failed root -9 ino 2526 off
>>> 10673852416 csum 0x8941f998 expected csum 0x8a9a53fe mirror 2
>>>     . . .
>>
>> This means some data reloc tree item had a csum mismatch.
>> The strange part is that we shouldn't hit a csum error here: if some
>> data were corrupted, the csum error should have been reported at read
>> time, not at this point in the relocation.
>>
>> This looks like something reported before.
>>
>>>
>>> I pulled the drive out of the system and attempted the device deletion
>>> again, but getting the same error.
>>>
>>> Looking back through the logs to the previous scrubs, it showed the
>>> file paths where errors were detected, so I deleted those files, and
>>> tried removing the failing drive again. It moved along some more. Now
>>> its down to only 13GiB of data remaining on the missing drive. Is
>>> there any way to track the above errors to specific files so I can
>>> delete them and finish the removal. Is there is a better way to finish
>>> the device deletion?
>>
>> As the message shows, it's the data reloc tree, which stores the newly
>> relocated data, so it doesn't map back to any file path.
>>
>>>
>>> Scrubbing with the device missing just racks up uncorrectable errors
>>> right off the bat, so it seemingly doesn't like missing a device - I
>>> assume it's not actually doing anything useful, right?
>>
>> Which kernel are you using?
>>
>> IIRC older kernels don't retry all possible device combinations, so
>> they can report uncorrectable errors even when the data should be
>> correctable.
> 
>> Another possible cause is the write hole, which reduces the tolerance
>> of RAID6 stripe by stripe.
> 
> Did you find a fix for
> 
> 	https://www.spinics.net/lists/linux-btrfs/msg94634.html
> 
> If that bug is happening in this case, it can abort a device delete
> on raid5/6 due to corrupted data every few block groups.

My bad, I keep losing track of my to-do items.

It does look like one possible cause indeed.

Thanks for reminding me of that bug,
Qu

> 
>> You can also try replacing the missing device.
>> In that case, it doesn't go through the regular relocation path but the
>> dev replace path (more like scrub), though you need physical access then.
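For reference, the replace path suggested above would look roughly like the sketch below. The devid (7) and target device (/dev/sdx) are hypothetical placeholders for illustration; the real devid of the missing device comes from `btrfs filesystem show`.

```shell
# Sketch only: devid 7 and /dev/sdx are hypothetical placeholders.

# 1. Find the devid of the missing device:
btrfs filesystem show /mount_point

# 2. Start the replace, addressing the missing device by its devid.
#    -r avoids reading from the source device where other good copies
#    exist (useful when the old drive is failing or already gone):
btrfs replace start -r 7 /dev/sdx /mount_point

# 3. The replace runs in the background; check progress with:
btrfs replace status /mount_point
```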
>>
>> Thanks,
>> Qu
>>
>>>
>>> I'm currently traveling and away from the system physically. Is there
>>> any way to complete the device removal without reconnecting the
>>> failing drive? Otherwise, I'll have a replacement drive in a couple of
>>> weeks when I'm back, and can try anything involving reconnecting the
>>> drive.
>>>
>>> Thanks,
>>> Martin
>>>
>>
> 
> 
> 



Thread overview: 5+ messages
2019-12-25 19:25 Deleting a failing drive from RAID6 fails Martin
2019-12-26  5:03 ` Qu Wenruo
2019-12-26  5:40   ` Zygo Blaxell
2019-12-26  6:50     ` Qu Wenruo [this message]
2019-12-26 19:37       ` Martin
