All of lore.kernel.org
 help / color / mirror / Atom feed
From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: Martin Steigerwald <martin@lichtvoll.de>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: It's broke, Jim. BTRFS mounted read only after corruption errors on Samsung 980 Pro
Date: Sat, 4 Sep 2021 19:38:42 -0400	[thread overview]
Message-ID: <20210904233842.GD27656@hungrycats.org> (raw)
In-Reply-To: <10160891.dnzDEemO5X@ananda>

On Sat, Sep 04, 2021 at 03:10:25PM +0200, Martin Steigerwald wrote:
> Hi Zygo.
> 
> Thank you for your hints.
> 
> Zygo Blaxell - 30.08.21, 06:45:50 CEST:
> > On Thu, Aug 26, 2021 at 10:17:33AM +0200, Martin Steigerwald wrote:
> […]
> > > Last time suspected XXHASH as the culprit, just a gut feeling. So I
> > > used the standard crc32c. Cause I did not think of it I initially
> > > used space cache
> > > v1. But I switched it to space cache v2 I think yesterday.
> > > 
> > > Anyway it can't be xxhash cause crc32c also causes trouble.
> > 
> > Normally if the drive (or scheduler for that matter) was the problem,
> > you would get mostly parent transid verify failed and csum errors,
> > and every btrfs check error would be preceded by a csum verify
> > failure.
> > 
> > This is not happening here, so the drive is probably fine (unless it's
> > corrupting the hibernation image, but you can verify that by enabling
> > hashes on the hibernation image).
> 
> My bet is also that the drive is fine.
> 
> > > My current theory is this:
> > > 
> > > It happens that after resume from hibernation the kernel stops
> > > working correcly. A few seconds till maybe a minute or two after
> > > hibernation. I captured the backtraces in the log this time. See
> > > below.
> > > 
> > > There is some BTRFS backtrace, but it appears to be related to
> > > obtaining kernel memory, and then also a general protection fault.
> > > I saw general protection faults before already on this machine, but
> > > never anywhere else.
> > 
> > That's a very bad sign.  If there are random GPFs, you'll need to take
> > the machine out of production until they are resolved.  Either
> > something is externally corrupting kernel RAM (like RAM failure or a
> > firmware bug) or there's a bug in the kernel that is corrupting
> > kernel RAM (like a use-after-free or hibernation code bug).
> 
> Yeah. I avoid hibernation for now.
> 
> > > My idea now is if the kernel crashes while BTRFS likes to write
> > > something BTRFS might not be consistent afterwards.
> > 
> > That isn't how btrfs works.  btrfs orders writes so that if the kernel
> > is operating correctly, but simply stops writing to the filesystem at
> > any time, no corruption occurs (the filesystem simply reverts to the
> > previously committed transaction) (assuming no raid5/6).
> 
> Or I think I did not word it clearly enough. I said "crash" but thought 
> about a longer term failure scenario as you outlined below:
> 
> > A _completely different_ scenario is that the kernel's memory is
> > corrupted, but the kernel does not stop immediately--it continues
> > writing to the filesystem, unaware of corruption, persisting bad data
> > on disk. If the kernel doesn't crash immediately, btrfs can complete
> > transactions with bad metadata.  In that case, btrfs (or any
> > filesystem) will be damaged, as will any data that machine touches. 
> > The damage may not be repairable as it will bypass or destroy
> > filesystem data integrity protections.
> 
> That is exactly what I think might have happened.
> 
> So I think there is not much of a point to keep my "homedefect" logical 
> volume around. No developer asked me to do anything to obtain further 
> information on the issue. If it stays like that for say another week, I 
> will return the 300G for the "homedefect" LV to the wear leveling pool 
> of the SSD.
> 
> > > I still think it shall survive
> > > a sudden interruption when writting data even if the cause for that
> > > is a kernel crash, but as Linux is a monolithic kernel, whatever
> > > has caused the kernel mal function may affect BTRFS as well.
> > 
> > It doesn't matter if the kernel is monolithic.  RAM is monolithic.
> > RAM corruption, however caused, can and will break everything in RAM,
> > whether it is kernel, userspace, or firmware.
> 
> I think it could matter in case of a software bug, i.e. another kernel 
> component accidentally writing into memory BTRFS uses. This would be 
> prevented by a micro kernel.

Depends on the bug and the microkernel.  Microkernels are not immune to
corruption bugs, though some specific paths leading to corruption are
blocked.  e.g. instead of memcpy() with a bad pointer, it's a device
driver process calculating the wrong physical address for a DMA write.

Also, regardless of kernel implementation, the hibernation procedure
is extremely _invasive_.  It necessarily transports every process and
significant chunks of kernel state to disk and back to memory, and has
opportunities to corrupt data in transit or at rest.

> > > So what I will try next is to shutdown the machine after each day or
> > > just
> > > keep it in standby and hope the battery will not take damage from
> > > it. In other words I will aim at avoiding hibernation. Its hard for
> > > me. I use it
> > > on all of my laptops and now I have the fastest laptop I ever had so
> > > far,
> > > but I cannot seem to use it reliably anymore. The faster the laptop,
> > > the faster it crashes if it does, so to say.
> > > 
> > > Any other ideas, theories, hints?
> > 
> > Start a memory tester running (e.g. 'memtester' or '7z b 9999 -md=xxM'
> > where xx is large enough to allocate most of the free RAM), do
> > hibernation and resume, and see if the memory tester reports failure
> > after resume.
> 
> I may still do this. However I already run the UEFI Lenovo Diagnostic 
> tool and had it do a memory test. It appears to be fine. I did the quick 
> test but then also ran a good part, but not all of a several hour long 
> test. No issues.

The environment might be different for a stand-alone memory tester
compared to one that can run on Linux alongside a production workload.
e.g. if a bus component is failing due to heat stress, that failure might
not be triggered in the standalone case where the failing bus componenet
is inactive, but will be triggered under a full system load where the
failing component is active.  Similarly bugs in the Linux kernel itself,
or interactions between kernel and firmware, will not be visible with
a standalone test.

On the other hand, if the problem _is_ a RAM component failure, a
standalone test will verify it's that and nothing else.

> > If so, you have firmware data integrity issues, or there is a bug in
> > the hibernation code.
> > 
> > Use suspend-to-RAM as a workaround (unless that corrupts memory too).
> 
> Yeap, that is what I am using at the moment. It does not seem to corrupt 
> memory. So without hibernation and without the USB-C dock which 
> triggered all kinds of issues with network driver for that r8152 card in 
> there and all kind of other strange issues, the laptop appears to finally 
> be stable. Seems for AMD based Gen 1 ThinkPads, I bet Gen 2 as well 
> there is still some R&D needed to get them stable in these situations as 
> well.
> 
> Best,
> -- 
> Martin
> 
> 
> 

  reply	other threads:[~2021-09-04 23:38 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-22 11:14 It's broke, Jim. BTRFS mounted read only after corruption errors on Samsung 980 Pro Martin Steigerwald
2021-08-26  8:17 ` Martin Steigerwald
2021-08-30  4:45   ` Zygo Blaxell
2021-09-04 13:10     ` Martin Steigerwald
2021-09-04 23:38       ` Zygo Blaxell [this message]
2022-02-11  9:45         ` Martin Steigerwald
2021-08-26 10:44 ` Martin Steigerwald
2021-08-26 12:39   ` Martin Steigerwald

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210904233842.GD27656@hungrycats.org \
    --to=ce3g8jdj@umail.furryterror.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=martin@lichtvoll.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.