From: "Swâmi Petaramesh" <swami@petaramesh.org>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: System freeze with BTRFS corruption on 4 systems with kernel 5.12 (MANJARO)
Date: Wed, 19 May 2021 15:46:44 +0200
Message-ID: <c2989b24-f8a0-d01b-3584-8c0bb2c056ab@petaramesh.org>
In-Reply-To: <2e5259da-fcec-0abe-09a6-3c86c1750477@gmx.com>

On 5/19/21 12:02 PM, Qu Wenruo wrote:
>
> Have you tried something like net-console to catch something?

Nope, but each time the machines were plain dead: screen frozen, mouse 
frozen, keyboard frozen (LEDs not changing), no ssh, no ping, not even any 
reaction to [Magic SysRq] keys...

>
> If it's some hang, after 120s it would have some dmesg popping out.
> But in that hang case, you should still be able to do a lot of things.
>
More than a hang, it appears to be a complete kernel crash.
> Without the dying message, it's really hard to further debug.
>
I would guess so...

>> AFAIR it was some kind of “generation mismatch”, expected something,
>> found another, in very large quantities.
>
> That means flush command doesn't work as expected.
>
I would suppose that, since those machines run bcache in writeback mode, 
some data didn't make it to permanent storage by the time the system 
suffered a sudden death...

Thus incomplete or out-of-order data on disk.
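
For anyone wanting to confirm which cache mode a bcache device is using, here is a minimal sketch. It assumes the usual sysfs layout, where /sys/block/<device>/bcache/cache_mode lists the available modes with the active one in square brackets. The device name bcache0 and both helper names are only illustrative, not something taken from the affected machines.

```python
from pathlib import Path


def active_cache_mode(text):
    """Return the bracketed entry from a cache_mode line, or None if absent."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def read_cache_mode(device="bcache0"):
    """Read the active cache mode for a bcache device (hypothetical name)."""
    # Assumed path layout: /sys/block/<device>/bcache/cache_mode
    path = Path("/sys/block") / device / "bcache" / "cache_mode"
    return active_cache_mode(path.read_text())


if __name__ == "__main__":
    try:
        print(read_cache_mode())
    except OSError as exc:
        print(f"could not read cache mode: {exc}")
```

If this reports writeback, dirty data may still be sitting only on the cache device when the machine dies, which would fit the guess above.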

> Considering there are extra layers involved, it's pretty hard to tell
> which is the cause, btrfs or dm-* modules.
>
Well... My view is that the systems crash with or without bcache, but 
BTRFS gets corrupted only when bcache is in use. So I would say that 
bcache is not responsible for the system crashing, but is responsible 
for data not having been committed to disk properly, or in the right 
order, at the time the system crashes...

I was wondering if you got any report of other kernel 5.12 issues with 
BTRFS in different configs, or kernel 5.12 crashes that might not be 
related to BTRFS...

>>
>> The machine with BTRFS RAID-1 could heal itself out of this by running a
>> simple btrfs scrub,
>
> This further proves it may be lower layer doing something wrong.
>
I would guess so...

> It's really a good practice to have LUKS under all your fs, but it also
> introduces an extra layer of flush problems.

Yes. However I've been doing this for years on a bunch of machines and 
never got any problem that would relate to this except with this 5.12 
kernel.

I was however wondering if some new optimizations introduced in BTRFS in 
5.12 could have made it prone to crashes, or maybe to something not being 
properly committed to disk (use of fsyncs, barriers or whatever)...

> Did you have any raw btrfs directly over HDD/SSD experiencing such 
> problem?

Unfortunately I don't have any BTRFS outside of LUKS, except for /boot on 
some machines, but this one gets so little activity that I wouldn't 
expect an issue with a /boot partition.

Thanks again for your help Qu.


ॐ

-- 
Swâmi Petaramesh <swami@petaramesh.org> PGP 9076E32E

