Re: About the weird tree block corruption

From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Tavian Barnes <tavianator@tavianator.com>
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: About the weird tree block corruption
Date: Sat, 16 Mar 2024 06:21:58 +1030
Message-ID: <49c7a4c8-852c-4b9c-ba57-938a097aaa6a@gmx.com>
In-Reply-To: <CABg4E-nKSZR4kvAGfxKLwAoH1_fJXwQb91spFAMsU9L1vqEpiA@mail.gmail.com>

On 2024/3/16 01:53, Tavian Barnes wrote:
> On Wed, Mar 13, 2024 at 2:07 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>> [SNIP]
>>
>> The second patch makes tree-checker BUG_ON() when something
>> goes wrong.
>> This patch should only be applied if you can reliably reproduce it
>> inside a VM.
>
> Okay, I have finally reproduced this in a VM.  I had to add this hunk
> to your patch 0002 in order to trigger the BUG_ON:
>
> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
> index c8fbcae4e88e..4ee7a717642a 100644
> --- a/fs/btrfs/tree-checker.c
> +++ b/fs/btrfs/tree-checker.c
> @@ -2047,6 +2051,7 @@ int btrfs_check_eb_owner(const struct extent_buffer *eb, u64 root_owner)
>                                  btrfs_header_level(eb) == 0 ? "leaf" : "node",
>                                  root_owner, btrfs_header_bytenr(eb), eb_owner,
>                                  root_owner);
> +                       BUG_ON(1);
>                          return -EUCLEAN;
>                  }
>                  return 0;
> @@ -2062,6 +2067,7 @@ int btrfs_check_eb_owner(const struct extent_buffer *eb, u64 root_owner)
>                          btrfs_header_level(eb) == 0 ? "leaf" : "node",
>                          root_owner, btrfs_header_bytenr(eb), eb_owner,
>                          BTRFS_FIRST_FREE_OBJECTID, BTRFS_LAST_FREE_OBJECTID);
> +               BUG_ON(1);
>                  return -EUCLEAN;
>          }
>          return 0;
>
>> When using the 2nd patch, it's strongly recommended to enable the
>> following sysctl:
>>
>>    kernel.ftrace_dump_on_oops = 1
>>    kernel.panic = 5
>>    kernel.panic_on_oops = 1
>
> I also set kernel.oops_all_cpu_backtrace = 1, and ran with nowatchdog,
> otherwise I got watchdog backtraces (due to slow console) interspersed
> with the traces which was hard to read.

oops_all_cpu_backtrace looks like a bit of overkill, and it seems to
flood the output.

>
>> And you need a way to reliably access the VM (either netconsole or a
>> serial console setup).
>> In that case, you would get the whole ftrace buffer dumped to the
>> netconsole/serial console.
>>
>> This has the extra benefit of reducing the noise, but it really needs a
>> reliable VM setup and can be a little tricky to set up.
>
> I got this to work, the console logs are attached.  I added
>
>      echo 1 > $tracefs/buffer_size_kb
>
> otherwise it tried to dump 48MiB over the serial console which I
> didn't have the patience for.  Hopefully that's a big enough buffer, I
> can re-run it if you need more logs.

That's totally fine, and that's exactly what I do during debugging.

>
>> Feel free to ask for any extra help setting up the environment, as you're
>> our last hope to properly pin down the bug.
>
> Hopefully this trace helps you debug this.  Let me know whenever you
> have something else for me to test.

The btrfs_crit() line prints btrfs_header_bytenr(), which is read from
the tree block header and so can itself be corrupted.

So it's much better to add an extra trace_printk() that prints
eb->start, so that we can match it against the rest of the trace output.

But there is some interesting output: the trace_printk() calls in
btrfs_release_extent_buffer_pages() already show that the page refs are
0 and that the page contents are already incorrect.

It may be a clue, but without the proper matching trace, it's still hard
to say.

I'm afraid you will need to add extra trace_printk() lines, much like
this one, at all the return -EUCLEAN locations:

	trace_printk("eb=%llu\n", eb->start);
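
To illustrate why eb->start is the value worth logging, here is a
minimal userspace sketch, not kernel code. struct mock_eb,
mock_check_eb_owner() and demo_corrupted_header() are made-up names for
illustration, trace_printk() is replaced by an fprintf() stand-in, and
the addresses are invented; only the trace_printk() format line is taken
from above.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* Linux value; fallback for libcs that lack it */
#endif

typedef unsigned long long u64;

/* Userspace stand-in for the kernel's trace_printk(): prints to stderr. */
#define trace_printk(...) fprintf(stderr, __VA_ARGS__)

/* Hypothetical, heavily reduced extent buffer with only the fields used here. */
struct mock_eb {
	u64 start;		/* address the buffer was looked up by, not read from the block */
	u64 header_bytenr;	/* bytenr stored in the block header, may be corrupted */
	u64 header_owner;	/* owner stored in the block header, may be corrupted */
};

/* Simplified owner check: returns 0 on match, -EUCLEAN on mismatch. */
int mock_check_eb_owner(const struct mock_eb *eb, u64 root_owner)
{
	assert(eb != NULL);

	if (eb->header_owner != root_owner) {
		/* Existing-style message: relies on the (possibly bad) header. */
		fprintf(stderr, "corrupted block=%llu owner mismatch, have %llu expect %llu\n",
			eb->header_bytenr, eb->header_owner, root_owner);
		/* Extra trace line: eb->start stays valid even if the header is garbage. */
		trace_printk("eb=%llu\n", eb->start);
		return -EUCLEAN;
	}
	return 0;
}

/* Demo: a block whose header has been zeroed, checked against root 5. */
int demo_corrupted_header(void)
{
	struct mock_eb eb = {
		.start = 30408704ULL,	/* invented address */
		.header_bytenr = 0,	/* corrupted header field */
		.header_owner = 0,	/* corrupted header field */
	};

	return mock_check_eb_owner(&eb, 5);
}
```

In this sketch the first message reports block=0, which matches nothing
else in the log, while the eb= line still carries the address the buffer
was looked up by, so it can be matched against the other trace lines.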

>
> I can also try to send you the VM, but I'm not sure how to package it
> up exactly.  It has two (emulated) NVMEs with LUKS and BTRFS raid0 on
> top.
>

Just sending me the rootfs qcow2 would be more than enough.
I can set up LUKS and btrfs myself here.

Thanks,
Qu