linux-btrfs.vger.kernel.org archive mirror
From: Chris Murphy <lists@colorremedies.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: btrfs vs write caching firmware bugs (was: Re: BTRFS recovery not possible)
Date: Mon, 24 Jun 2019 11:31:35 -0600
Message-ID: <CAJCQCtRrT5pUxOxfKWTC=zt9E=ZxRaiLeBxngqc6YVQEYp8n_g@mail.gmail.com>
In-Reply-To: <f1cfe396-aac7-b670-b8de-f5d3b795acfe@gmx.com>

On Sun, Jun 23, 2019 at 7:52 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
>
> On 2019/6/24 4:45 AM, Zygo Blaxell wrote:
> > I first observed these correlations back in 2016.  We had a lot of WD
> > Green and Black drives in service at the time--too many to replace or
> > upgrade them all early--so I looked for a workaround to force the
> > drives to behave properly.  Since it looked like a write ordering issue,
> > I disabled the write cache on drives with these firmware versions, and
> > found that the transid-verify filesystem failures stopped immediately
> > (they had been bi-weekly events with write cache enabled).
>
> So the worst-case scenario really does happen in the real world: badly
> implemented flush/FUA in firmware.
> Btrfs has no way to fix such a low-level problem.

Right. The questions I have: should Btrfs (or any file system) be able
to detect such devices and still protect the data? i.e. for the file
system to somehow be more suspicious, without impacting performance,
and go read-only sooner so that at least read-only mount can work? Or
is this so much work for such a tiny edge case that it's not worth it?

Arguably the hardware is some kind of zombie saboteur. It's not
totally dead, it gives the impression that it's working most of the
time, and then silently fails to do what we think it should in an
extraordinary departure from specs and expectations.

Are there other failure cases that could look like this and are
therefore worth handling? As storage stacks get more complicated, with
ever more complex firmware and firmware updates in the field, it might
be useful to have at least one file system that can detect such
problems sooner than others and go read-only to prevent further damage.
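
For illustration only, here is a toy model of the kind of check being
discussed: a block carries a generation number, the file system remembers
which generation it expects, and a mismatch on read flips the file system
read-only. This is not btrfs code. ToyDisk, ToyFs, and drop_writes are
hypothetical names, the expected generation is held in memory rather than
in an on-disk parent pointer, and blocks are overwritten in place instead
of being copied on write.

```python
class ReadOnlyError(Exception):
    """Raised when the toy file system refuses to continue."""


class ToyDisk:
    """Block store that can silently drop writes while still acking them."""

    def __init__(self):
        self.blocks = {}          # block number -> (generation, payload)
        self.drop_writes = False  # hypothetical switch: simulate a lost write

    def write(self, blockno, generation, payload):
        # The caller always sees success, whether or not data was persisted.
        if not self.drop_writes:
            self.blocks[blockno] = (generation, payload)

    def read(self, blockno):
        return self.blocks[blockno]


class ToyFs:
    """Tracks the generation it expects for each block it has written."""

    def __init__(self, disk):
        self.disk = disk
        self.expected = {}    # block number -> generation we believe is on disk
        self.read_only = False

    def update(self, blockno, generation, payload):
        if self.read_only:
            raise ReadOnlyError("file system is read-only")
        self.disk.write(blockno, generation, payload)
        self.expected[blockno] = generation

    def checked_read(self, blockno):
        generation, payload = self.disk.read(blockno)
        if generation != self.expected[blockno]:
            # Stale block: stop writing before more damage accumulates.
            self.read_only = True
            raise ReadOnlyError(
                f"block {blockno}: wanted generation "
                f"{self.expected[blockno]}, found {generation}"
            )
        return payload
```

With drop_writes set, an update appears to succeed, the next
checked_read of that block raises ReadOnlyError, and later updates are
refused. The sketch only catches a lost write once the stale block is
read back; it says nothing about how to notice the problem earlier
without a performance cost, which is the open question here.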


> BTW, have you seen any corruption using the bad drives (with write cache)
> with a traditional journal-based fs like XFS/EXT4?
>
> Btrfs relies more on the hardware to implement barrier/flush properly,
> or CoW can be easily ruined.
> If the firmware is only tested (if tested) against such fs, it may be
> the problem of the vendor.

I think we can definitely say this is a vendor problem. But the
question still is whether the file system has a role in at least
disqualifying hardware when it knows it's acting up, before the file
system is thoroughly damaged.

I also wonder how ext4 and XFS will behave. In some ways they might
tolerate the problem without noticing it for longer, where instead of
kernel space recognizing it, it's actually user space / application
layer that gets confused first, if it's bogus data that's being
returned. Filesystem metadata is a relatively small target for such
corruption when the file system mostly does overwrites.

I also wonder how ZFS handles this. Both in the single device case,
and in the RAIDZ case.


-- 
Chris Murphy
