From: Chris Murphy <lists@colorremedies.com>
To: Peter Chant <pete@petezilla.co.uk>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: Chasing IO errors. BTRFS: error (device dm-2) in btrfs_run_delayed_refs:2907: errno=-5 IO failure
Date: Tue, 20 Aug 2019 15:59:56 -0600
Message-ID: <CAJCQCtSWi+PUbOWXNwv0guCLRuSgZunWdvRBB4TKMG_X48jHFw@mail.gmail.com>
In-Reply-To: <fc2b166a-4466-4a5a-ee88-da5e57ee89b6@petezilla.co.uk>

On Tue, Aug 20, 2019 at 3:10 PM Peter Chant <pete@petezilla.co.uk> wrote:
>
> Chasing IO errors.  BTRFS: error (device dm-2) in
> btrfs_run_delayed_refs:2907: errno=-5 IO failure
>
>
> I've just had an odd one.
>
> Over the last few days I've noticed a file system blocking, if that is
> the correct term, and this morning go read only.  This resulted in a lot
> of checksum errors.

That doesn't sound good. Checksum errors where? A complete start to
finish dmesg is most useful in this case.


>
> Having spotted the file system go read only in the logs and then noted
> the error message in the subject shortly after booting I assumed a
> hardware error and changed the SATA cable.  That had no effect so I
> isolated the disk and mounted the respective file system degraded.
> Shortly after mounting the degraded file system I had the same error
> again. So I unmounted the file system edited fstab and swapped the disk
> which I thought originally had the error with the one now showing an error.

OK, but nothing you've told us so far says what the error is or which
device reported it, so it's all speculation. A complete dmesg is
definitely needed.

Or, if systemd-journald is logging to persistent media, you can look up
that boot with journalctl --list-boots, and export just the kernel
messages portion with something like this:

journalctl -b -2 -k -o short-monotonic > journalbtrfshang.txt

That's two boots back, kernel messages only, monotonic time stamp.
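
Once you have that export, a filter along these lines can narrow it down
before posting. This is only a sketch: filter_storage_errors is a
hypothetical helper name, and the pattern list is my assumption about
which lines tend to matter (Btrfs messages, libata port messages, generic
I/O errors). It is not exhaustive, and the full log is still the more
useful thing to post.

```shell
#!/bin/sh
# Hypothetical helper: print only the lines of an exported kernel log that
# mention Btrfs, a libata port (ata1, ata3.00, ...), or an I/O error.
# The pattern list is an assumption, not a complete set of relevant messages.
filter_storage_errors() {
    grep -E -i 'btrfs|(^|[^[:alnum:]])ata[0-9]+|I/O error' "$1"
}

# Usage: filter_storage_errors journalbtrfshang.txt
```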

Also useful, if you experience blocked tasks (the kind of thing where the
system seems to hang for 2 minutes), is a sysrq+w and sysrq+t dump. The
simple version, as root:

# echo 1 > /proc/sys/kernel/sysrq
# echo w > /proc/sysrq-trigger
# echo t > /proc/sysrq-trigger

Detailed version here:
https://fedoraproject.org/wiki/QA/Sysrq

That will dump a bunch of task info into kernel messages, which will
show up in dmesg or in the output of the journalctl command above. It's
useful to run the echo 1 setup before you reproduce the problem; and
even more useful to have the trigger commands already typed out in a
remote ssh session, so all you have to do is hit return upon reproducing
the hang. Otherwise it can take a long time to type it all out.
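
A minimal sketch of that "set up first, trigger later" split, in case it
helps. The function names and the SYSRQ_ENABLE / SYSRQ_TRIGGER variables
are hypothetical; the variables default to the real procfs paths and
exist only so the logic can be tried against ordinary files without
root.

```shell
#!/bin/sh
# Defaults are the real procfs files. The SYSRQ_* variables are
# hypothetical overrides so the functions can be dry-run on plain files.
SYSRQ_ENABLE=${SYSRQ_ENABLE:-/proc/sys/kernel/sysrq}
SYSRQ_TRIGGER=${SYSRQ_TRIGGER:-/proc/sysrq-trigger}

# Run before reproducing the problem: enable all sysrq functions.
sysrq_arm() {
    echo 1 > "$SYSRQ_ENABLE"
}

# Run once the hang occurs: dump blocked tasks (w), then all tasks (t).
sysrq_dump() {
    echo w > "$SYSRQ_TRIGGER"
    echo t > "$SYSRQ_TRIGGER"
}
```

As root in an ssh session, source the file, run sysrq_arm, then type
sysrq_dump and leave it sitting at the prompt until the hang shows up.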


> Does this sound like a hardware error?  I have ordered a replacement
> drive, if it is not needed as a replacement I will put it into a
> homebrew NAS.
>
> I've hit the issue again.  Hopefully the system is up long enough to
> post this.
>
> I'm a bit worried that trying to track this down disconnecting a disk at
> a time I might hit the btrfs split brain issue.

WDC Reds have an SCT ERC of, I think, 70 deciseconds (7 seconds) by
default, which you can check with 'smartctl -l scterc <dev>' for each
drive. If it's hardware related it probably isn't bad-block related, and
if the drive is aware of the problem it'll at least report it via libata
and you'll see such messages in the kernel log.
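
Checking each drive in turn could look something like the sketch below.
show_scterc and ds_to_seconds are hypothetical helper names, and the
SMARTCTL variable is an assumed override (defaulting to smartctl) so the
loop can be tried without the tool installed. The device names are
whatever you pass in; I don't know which ones apply on your system.

```shell
#!/bin/sh
# SMARTCTL is a hypothetical override; by default the real smartctl is used.
SMARTCTL=${SMARTCTL:-smartctl}

# Query the SCT ERC setting of every device given as an argument.
show_scterc() {
    for dev in "$@"; do
        echo "== $dev"
        "$SMARTCTL" -l scterc "$dev"
    done
}

# Convert an SCT ERC value in deciseconds to seconds, e.g. 70 -> 7.0.
ds_to_seconds() {
    echo "$(( $1 / 10 )).$(( $1 % 10 ))"
}
```

Run it as root with the member devices of the file system as arguments,
for example show_scterc /dev/sdX /dev/sdY with your own device names.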


-- 
Chris Murphy
