Re: Access Beyond End of Device & Input/Output Errors

From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Justin Brown <Justin.Brown@fandingo.org>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Access Beyond End of Device & Input/Output Errors
Date: Sun, 2 Aug 2020 07:30:17 +0800	[thread overview]
Message-ID: <bd921a29-cd4a-62dc-4e14-708e617ec156@gmx.com> (raw)
In-Reply-To: <CAKZK7uwFFpxiwA=Ye1VpqvkonAER=T-a2i_h_yGwpkieaeXcjg@mail.gmail.com>

[-- Attachment #1.1: Type: text/plain, Size: 12562 bytes --]

On 2020/8/1 下午7:56, Justin Brown wrote:
> Hi Qu,
> 
> Thanks for your continued help.
> 
> dump-super:
> 
> for i in a b d e f g; do x=$(sudo btrfs ins dump-super /dev/sd${i}1 |
> grep dev_item.uuid | cut -f 3); echo "/dev/sd${i}1 $x"; done
> /dev/sda1 cc3f9a00-bd69-4ceb-b6e5-4fb874be2aaf
> /dev/sdb1 27e1cf24-9349-4f72-a23b-86668b2a9e78
> /dev/sdd1 601d409e-8ffd-489c-91af-daf3e0cc9bd2
> /dev/sde1 2908ebfb-e6b5-4991-b25d-32d1487ff6a4
> /dev/sdf1 cb05aae6-6c03-49d3-b46d-bf51a0eb8cd0

They match with the device size. So no chunk item beyond device boundary.

> 
> btrfs check:
> 
> sudo btrfs check /dev/sda1
> Opening filesystem to check...
> Checking filesystem on /dev/sda1
> UUID: 51eef0c7-2977-4037-b271-3270ea22c7d9
> [1/7] checking root items
> [2/7] checking extents
...
> failed to load free space cache for block group 92568662507520
> failed to load free space cache for block group 92574031216640
> ...
> failed to load free space cache for block group 97722656817152
> failed to load free space cache for block group 97728025526272

This is interesting. Maybe that's related to the problem?

> [4/7] checking fs roots
> [5/7] checking only csums items (without verifying data)
> [6/7] checking root refs
> [7/7] checking quota groups skipped (not enabled on this FS)

Great that all metadata are fine.

> found 5148381876224 bytes used, no error found
> total csum bytes: 4998903140
> total tree bytes: 5301813248
> total fs tree bytes: 96894976
> total extent tree bytes: 41910272
> btree space waste bytes: 135561977
> file data blocks allocated: 8972043898880
> referenced 5113155596288
> 
> The alignment issue would be confined to performance, correct?

Yep, only related to performance and some noisy warning for newer kernel.
Not a big problem yet.

Since btrfs-check reports no obvious problem but free space cache
problems, maybe btrfs repair --clear-space-cache v1 is worthy trying.

BTW, since current kernel and btrfs-progs doesn't do restrict chunk
check against device boundary, I'll add such checks to both kernel and
progs soon.

In the mean time, I also see the following dmesg showing that kernel
failed to detect one device:

  Aug 01 01:15:26 spaceman.fandingo.org kernel: BTRFS warning (device
  sde1): devid 1 uuid cb05aae6-6c03-49d3-b46d-bf51a0eb8cd0 is missing

Can you reproduce that problem? And if so, maybe try "btrfs device scan"
and then mount again?

Thanks,
Qu

> 
> Thanks,
> Justin
> 
> /dev/sdg1 1b938c84-eafd-4396-b06c-8a5bf1339840On Sat, Aug 1, 2020 at
> 4:31 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
>>
>>
>> On 2020/8/1 下午4:30, Justin Brown wrote:
>>> Hi Qu,
>>>
>>> Thanks for the help.
>>>
>>> Here's is the lsblk -b:
>>>
>>> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
>>> sda 8:0 0 2000398934016 0 disk
>>> └─sda1 8:1 0 2000397868544 0 part
>>> sdb 8:16 0 8001563222016 0 disk
>>> └─sdb1 8:17 0 8001562156544 0 part
>>> sdc 8:32 0 120034123776 0 disk
>>> ├─sdc1 8:33 0 1048576 0 part
>>> ├─sdc2 8:34 0 524288000 0 part /boot
>>> └─sdc3 8:35 0 119507255296 0 part /home
>>> sdd 8:48 0 8001563222016 0 disk
>>> └─sdd1 8:49 0 8001562156544 0 part
>>> sde 8:64 0 2000398934016 0 disk
>>> └─sde1 8:65 0 2000397868544 0 part
>>> sdf 8:80 0 2000398934016 0 disk
>>> └─sdf1 8:81 0 2000397868544 0 part /var/media
>>> sdg 8:96 1 2000398934016 0 disk
>>> └─sdg1 8:97 1 2000397868544 0 part
>>>
>>> The `btrfs ins...` output is quite long. I've attached it as a txt and
>>> also uploaded it at
>>> https://gist.github.com/fandingo/aa345d6c6fa97162f810e86c9ab20d6a
>>
>>
>> Thanks, this already shows some device size difference.
>>
>> But all of them are in fact just a little smaller than device size, thus
>> it should be fine.
>>
>> Another problem I found is, it looks like either size or start of some
>> partitions are not aligned to 4K.
>>
>> It may be a problem for 4K aligned hard disks, so it may worthy some
>> concern after solving the btrfs problem.
>>
>> Would you please also provide some extra dump?
>> - btrfs check /dev/sda1
>>   It should detect any problems I missed
>>
>> - btrfs ins dump-super <device> | grep dev_item.uuid
>>   It's a little hard to find which device owns to which device id.
>>   So we need this dump of each btrfs device to make sure.
>>
>> Thanks,
>> Qu
>>
>>
>>>
>>> Thanks,
>>> Justin
>>>
>>> On Sat, Aug 1, 2020 at 2:02 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2020/8/1 下午2:58, Qu Wenruo wrote:
>>>>>
>>>>>
>>>>> On 2020/8/1 下午2:51, Justin Brown wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I've run into a strange problem that I haven't seen before, and I need
>>>>>> some help. I started getting generic "input/output" errors on a couple
>>>>>> of files, and when I looked deeper, the kernel logs are full of
>>>>>> messages like:
>>>>>>
>>>>>>     sd 5:0:0:0: [sdf] tag#29 access beyond end of device
>>>>>
>>>>> We had a new fix for trim. But according to your kernel message, it
>>>>> doesn't look like the case.
>>>>>
>>>>> (No obvious tag showing it's trim/discard)
>>>>>
>>>>>>
>>>>>> I've never seen anything like this before with any FS, so I figured it
>>>>>> was worth asking before I consider running the standard btrfs tools.
>>>>>> (I briefly started a scrub, but it was going crazy with uncorrectable
>>>>>> errors, so I cancelled it.)
>>>>>>
>>>>>> Here's my system info:
>>>>>>
>>>>>> Fedora 32, kernel 5.7.7-200.fc32.x86_64
>>>>>> btrfs-progs v5.7
>>>>>>
>>>>>> /etc/fstab entry:
>>>>>> LABEL=media /var/media btrfs subvol=media,discard 0 2
>>>>>>
>>>>>> btrfs fi show /var/media/
>>>>>> Label: 'media' uuid: 51eef0c7-2977-4037-b271-3270ea22c7d9
>>>>>> Total devices 6 FS bytes used 4.68TiB
>>>>>> devid 1 size 1.82TiB used 963.00GiB path /dev/sdf1
>>>>>> devid 2 size 1.82TiB used 962.00GiB path /dev/sde1
>>>>>> devid 4 size 1.82TiB used 963.00GiB path /dev/sdg1
>>>>>> devid 6 size 1.82TiB used 962.03GiB path /dev/sda1
>>>>>> devid 7 size 7.28TiB used 967.03GiB path /dev/sdb1
>>>>>> devid 8 size 7.28TiB used 967.03GiB path /dev/sdd1
>>>>>>
>>>>>> btrfs fi df /var/media/
>>>>>> Data, RAID5: total=4.69TiB, used=4.68TiB
>>>>>> System, RAID1C3: total=32.00MiB, used=304.00KiB
>>>>>> Metadata, RAID1C3: total=6.00GiB, used=4.94GiB
>>>>>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>>>>>
>>>>>> I can only mount -o degraded now. Here are the logs when mounting:
>>>>>>
>>>>>> Aug 01 01:15:26 spaceman.fandingo.org sudo[275572]: justin : TTY=pts/0
>>>>>> ; PWD=/home/justin ; USER=root ; COMMAND=/usr/bin/mount -t btrfs -o
>>>>>> degraded /dev/sda1 /var/media/
>>>>>> Aug 01 01:15:26 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#30
>>>>>> access beyond end of device
>>>>>> Aug 01 01:15:26 spaceman.fandingo.org kernel: blk_update_request: I/O
>>>>>> error, dev sdf, sector 2176 op 0x0:(READ) flags 0x0 phys_seg 1 prio
>>>>>> class 0
>>>>>
>>>>> OK, it's read, not DISCARD, thus a completely different problem.
>>>>>
>>>>>
>>>>>> Aug 01 01:15:26 spaceman.fandingo.org kernel: Buffer I/O error on dev
>>>>>> sdf1, logical block 16, async page read
>>>>>> Aug 01 01:15:26 spaceman.fandingo.org kernel: BTRFS info (device
>>>>>> sde1): allowing degraded mounts
>>>>>> Aug 01 01:15:26 spaceman.fandingo.org kernel: BTRFS info (device
>>>>>> sde1): disk space caching is enabled
>>>>>> Aug 01 01:15:26 spaceman.fandingo.org kernel: BTRFS warning (device
>>>>>> sde1): devid 1 uuid cb05aae6-6c03-49d3-b46d-bf51a0eb8cd0 is missing
>>>>>> Aug 01 01:15:26 spaceman.fandingo.org kernel: BTRFS info (device
>>>>>> sde1): bdev /dev/sdf1 errs: wr 4458026, rd 14571, flush 0, corrupt 0,
>>>>>> gen 0
>>>>>>
>>>>>> It seems like only relatively recently written files are encountering
>>>>>> I/O errors. If I `cat` one of the problematic files when the FS is
>>>>>> mounted normally, I see a ton of this:
>>>>>>
>>>>>> Aug 01 01:13:49 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#26
>>>>>> access beyond end of device
>>>>>> Aug 01 01:13:49 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#27
>>>>>> access beyond end of device
>>>>>> Aug 01 01:13:49 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#28
>>>>>> access beyond end of device
>>>>>> Aug 01 01:13:49 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#29
>>>>>> access beyond end of device
>>>>>> Aug 01 01:13:49 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#30
>>>>>> access beyond end of device
>>>>>> Aug 01 01:13:49 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#0
>>>>>> access beyond end of device
>>>>>> Aug 01 01:13:49 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#1
>>>>>> access beyond end of device
>>>>>> Aug 01 01:13:49 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#13
>>>>>> access beyond end of device
>>>>>> Aug 01 01:13:49 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#2
>>>>>> access beyond end of device
>>>>>>
>>>>>> Now that I'm remounted in -o degraded, I'm getting more comprehensible
>>>>>> warnings, but it still results in I/O read failures:
>>>>>>
>>>>>> Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
>>>>>> sde1): csum failed root 2820 ino 747435 off 99942400 csum 0x8941f998
>>>>>> expected csum 0xbe3f80a4 mirror 2
>>>>>> Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
>>>>>> sde1): csum failed root 2820 ino 747435 off 99946496 csum 0x8941f998
>>>>>> expected csum 0x9c36a6b4 mirror 2
>>>>>> Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
>>>>>> sde1): csum failed root 2820 ino 747435 off 99950592 csum 0x8941f998
>>>>>> expected csum 0x44d30ca2 mirror 2
>>>>>> Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
>>>>>> sde1): csum failed root 2820 ino 747435 off 99958784 csum 0x8941f998
>>>>>> expected csum 0xc0f08acc mirror 2
>>>>>> Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
>>>>>> sde1): csum failed root 2820 ino 747435 off 99954688 csum 0x8941f998
>>>>>> expected csum 0xcb11db59 mirror 2
>>>>>> Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
>>>>>> sde1): csum failed root 2820 ino 747435 off 99962880 csum 0x8941f998
>>>>>> expected csum 0x8a4ee0aa mirror 2
>>>>>> Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
>>>>>> sde1): csum failed root 2820 ino 747435 off 99971072 csum 0x8941f998
>>>>>> expected csum 0xdfb79e85 mirror 2
>>>>>> Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
>>>>>> sde1): csum failed root 2820 ino 747435 off 99966976 csum 0x8941f998
>>>>>> expected csum 0xc14921a0 mirror 2
>>>>>> Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
>>>>>> sde1): csum failed root 2820 ino 747435 off 99975168 csum 0x8941f998
>>>>>> expected csum 0xf2fe8774 mirror 2
>>>>>> Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
>>>>>> sde1): csum failed root 2820 ino 747435 off 99979264 csum 0x8941f998
>>>>>> expected csum 0xae1cafd6 mirror 2
>>>>>>
>>>>>> Why trying to research this problem, I came across a Github issue
>>>>>> https://github.com/kdave/btrfs-progs/issues/282 and a patch from Qu
>>>>>> from yesterday ([PATCH] btrfs: trim: fix underflow in trim length to
>>>>>> prevent access beyond device boundary). I do use the discard mount
>>>>>> option, and I have a weekly fstrim.timer enabled. I did replace 2x2TB
>>>>>> drives with the 2x8TB drives about 1 month ago, which involved a
>>>>>> conversion to -d raid5 -m raid1c3, which I suppose could hit the same
>>>>>> code paths that resize2fs would?
>>>>>
>>>>> The problem doesn't look like a trim one, but more likely some device
>>>>> boundary bug.
>>>>>
>>>>> Would you please provide the following info?
>>>>> - btrfs ins dump-tree -t chunk /dev/sde1
>>>>>   This contains the device info and chunk tree dump. Doesn't contain
>>>>>   any confidential info.
>>>>>   We can use this info to determine if there is some chunk really beyond
>>>>>   device boundary.
>>>>>   I guess some chunks are already beyond device boundary by somehow.
>>>>
>>>> And `lsblk -b` output.
>>>>
>>>> It may be possible that device size in btrfs doesn't match with the real
>>>> device...
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>
>>>>>>
>>>>>> Any advice on how to proceed would be greatly appreciated.
>>>>>>
>>>>>> Thanks,
>>>>>> Justin
>>>>>>
>>>>>
>>>>
>>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]