From: Justin Brown <Justin.Brown@fandingo.org>
To: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Access Beyond End of Device & Input/Output Errors
Date: Sat, 1 Aug 2020 01:51:16 -0500
Message-ID: <CAKZK7uwRs_tf6htRtJvw3kNhyNPMJ-juA6_WSJo+PbQA7f40Cg@mail.gmail.com>

Hello,

I've run into a strange problem that I haven't seen before, and I need
some help. I started getting generic "input/output" errors on a couple
of files, and when I looked deeper, I found the kernel logs full of
messages like:

    sd 5:0:0:0: [sdf] tag#29 access beyond end of device

I've never seen anything like this before with any FS, so I figured it
was worth asking before I consider running the standard btrfs tools.
(I briefly started a scrub, but it was going crazy with uncorrectable
errors, so I cancelled it.)

Here's my system info:

Fedora 32, kernel 5.7.7-200.fc32.x86_64
btrfs-progs v5.7

/etc/fstab entry:
LABEL=media /var/media btrfs subvol=media,discard 0 2

btrfs fi show /var/media/
Label: 'media' uuid: 51eef0c7-2977-4037-b271-3270ea22c7d9
Total devices 6 FS bytes used 4.68TiB
devid 1 size 1.82TiB used 963.00GiB path /dev/sdf1
devid 2 size 1.82TiB used 962.00GiB path /dev/sde1
devid 4 size 1.82TiB used 963.00GiB path /dev/sdg1
devid 6 size 1.82TiB used 962.03GiB path /dev/sda1
devid 7 size 7.28TiB used 967.03GiB path /dev/sdb1
devid 8 size 7.28TiB used 967.03GiB path /dev/sdd1

btrfs fi df /var/media/
Data, RAID5: total=4.69TiB, used=4.68TiB
System, RAID1C3: total=32.00MiB, used=304.00KiB
Metadata, RAID1C3: total=6.00GiB, used=4.94GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

I can only mount -o degraded now. Here are the logs when mounting:

Aug 01 01:15:26 spaceman.fandingo.org sudo[275572]: justin : TTY=pts/0
; PWD=/home/justin ; USER=root ; COMMAND=/usr/bin/mount -t btrfs -o
degraded /dev/sda1 /var/media/
Aug 01 01:15:26 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#30
access beyond end of device
Aug 01 01:15:26 spaceman.fandingo.org kernel: blk_update_request: I/O
error, dev sdf, sector 2176 op 0x0:(READ) flags 0x0 phys_seg 1 prio
class 0
Aug 01 01:15:26 spaceman.fandingo.org kernel: Buffer I/O error on dev
sdf1, logical block 16, async page read
Aug 01 01:15:26 spaceman.fandingo.org kernel: BTRFS info (device
sde1): allowing degraded mounts
Aug 01 01:15:26 spaceman.fandingo.org kernel: BTRFS info (device
sde1): disk space caching is enabled
Aug 01 01:15:26 spaceman.fandingo.org kernel: BTRFS warning (device
sde1): devid 1 uuid cb05aae6-6c03-49d3-b46d-bf51a0eb8cd0 is missing
Aug 01 01:15:26 spaceman.fandingo.org kernel: BTRFS info (device
sde1): bdev /dev/sdf1 errs: wr 4458026, rd 14571, flush 0, corrupt 0,
gen 0
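
A side note on the two I/O error lines above: "dev sdf, sector 2176" and
"dev sdf1, logical block 16" describe the same failed read at the disk
level and at the partition level. The sketch below works out how the two
numbers relate. It assumes 512-byte sectors and 4 KiB blocks, neither of
which appears in the logs, and the helper name is made up for
illustration. The 64 KiB offset of the primary btrfs superblock is the
documented on-disk location.

```python
# Assumptions (not shown in the logs): 512-byte sectors on sdf and
# 4 KiB blocks for the "logical block" number reported on sdf1.
SECTOR_SIZE = 512
BLOCK_SIZE = 4096

# btrfs keeps its primary superblock 64 KiB into the device.
BTRFS_PRIMARY_SUPERBLOCK = 64 * 1024


def partition_start_sector(disk_sector, partition_block):
    """Infer the partition's first sector from one read reported at both levels.

    Hypothetical helper: disk_sector is relative to the whole disk,
    partition_block is relative to the start of the partition.
    """
    sectors_into_partition = partition_block * BLOCK_SIZE // SECTOR_SIZE
    return disk_sector - sectors_into_partition


failed_sector = 2176  # from "dev sdf, sector 2176"
failed_block = 16     # from "dev sdf1, logical block 16"

byte_offset = failed_block * BLOCK_SIZE
start = partition_start_sector(failed_sector, failed_block)

print(f"offset within sdf1: {byte_offset} bytes")
print(f"inferred first sector of sdf1: {start}")
print(f"read targets the primary superblock: {byte_offset == BTRFS_PRIMARY_SUPERBLOCK}")
```

Under those assumptions the failed read is very near the start of a
1.82TiB disk, so the "access beyond end of device" message here is
unlikely to mean that btrfs asked for an offset past the real end of
the drive.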

It seems like only relatively recently written files are encountering
I/O errors. If I `cat` one of the problematic files when the FS is
mounted normally, I see a ton of this:

Aug 01 01:13:49 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#26
access beyond end of device
Aug 01 01:13:49 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#27
access beyond end of device
Aug 01 01:13:49 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#28
access beyond end of device
Aug 01 01:13:49 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#29
access beyond end of device
Aug 01 01:13:49 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#30
access beyond end of device
Aug 01 01:13:49 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#0
access beyond end of device
Aug 01 01:13:49 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#1
access beyond end of device
Aug 01 01:13:49 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#13
access beyond end of device
Aug 01 01:13:49 spaceman.fandingo.org kernel: sd 5:0:0:0: [sdf] tag#2
access beyond end of device

Now that I've remounted with -o degraded, I'm getting more comprehensible
warnings, but reads still fail with I/O errors:

Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
sde1): csum failed root 2820 ino 747435 off 99942400 csum 0x8941f998
expected csum 0xbe3f80a4 mirror 2
Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
sde1): csum failed root 2820 ino 747435 off 99946496 csum 0x8941f998
expected csum 0x9c36a6b4 mirror 2
Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
sde1): csum failed root 2820 ino 747435 off 99950592 csum 0x8941f998
expected csum 0x44d30ca2 mirror 2
Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
sde1): csum failed root 2820 ino 747435 off 99958784 csum 0x8941f998
expected csum 0xc0f08acc mirror 2
Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
sde1): csum failed root 2820 ino 747435 off 99954688 csum 0x8941f998
expected csum 0xcb11db59 mirror 2
Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
sde1): csum failed root 2820 ino 747435 off 99962880 csum 0x8941f998
expected csum 0x8a4ee0aa mirror 2
Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
sde1): csum failed root 2820 ino 747435 off 99971072 csum 0x8941f998
expected csum 0xdfb79e85 mirror 2
Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
sde1): csum failed root 2820 ino 747435 off 99966976 csum 0x8941f998
expected csum 0xc14921a0 mirror 2
Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
sde1): csum failed root 2820 ino 747435 off 99975168 csum 0x8941f998
expected csum 0xf2fe8774 mirror 2
Aug 01 01:31:53 spaceman.fandingo.org kernel: BTRFS warning (device
sde1): csum failed root 2820 ino 747435 off 99979264 csum 0x8941f998
expected csum 0xae1cafd6 mirror 2
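
Two things stand out in the warnings above: every failing block reports
the same computed csum (0x8941f998), and the offsets look like
consecutive 4 KiB blocks. One way to check both is sketched below. It
assumes the data checksum is plain CRC-32C (the btrfs default) and that
this kernel prints the csum bytes in on-disk, little-endian order; both
are assumptions, and the code only tests whether the logged value would
be what a block of zeros produces. Python has no CRC-32C in the standard
library, so a small table-driven version is included.

```python
def _make_crc32c_table():
    """Build the lookup table for CRC-32C (reflected Castagnoli polynomial)."""
    poly = 0x82F63B78
    table = []
    for n in range(256):
        crc = n
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC32C_TABLE = _make_crc32c_table()


def crc32c(data):
    """Standard CRC-32C: initial value and final XOR are both 0xFFFFFFFF."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _CRC32C_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


# Computed csum reported on every failing block in the log.
logged = bytes.fromhex("8941f998")

# Assumption: the log shows the csum bytes as stored on disk (little-endian).
zero_block_csum = crc32c(bytes(4096)).to_bytes(4, "little")
print("crc32c of a zeroed 4 KiB block:", zero_block_csum.hex())
print("same as logged csum:", zero_block_csum == logged)

# File offsets from the ten warnings, in the order they were logged.
offsets = [
    99942400, 99946496, 99950592, 99958784, 99954688,
    99962880, 99971072, 99966976, 99975168, 99979264,
]
ordered = sorted(offsets)
steps = {b - a for a, b in zip(ordered, ordered[1:])}
print("distance between neighbouring offsets:", steps)
```

If the comparison comes out true, the blocks read back from the
remaining devices are all zeros rather than random garbage, which would
narrow down where the data went missing.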

While trying to research this problem, I came across a GitHub issue
https://github.com/kdave/btrfs-progs/issues/282 and a patch from Qu
from yesterday ([PATCH] btrfs: trim: fix underflow in trim length to
prevent access beyond device boundary). I do use the discard mount
option, and I have a weekly fstrim.timer enabled. I did replace 2x2TB
drives with the 2x8TB drives about 1 month ago, which involved a
conversion to -d raid5 -m raid1c3, which I suppose could hit the same
code paths that resize2fs would?

Any advice on how to proceed would be greatly appreciated.

Thanks,
Justin

