linux-btrfs.vger.kernel.org archive mirror
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: tavianator@tavianator.com, wqu@suse.com
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: tree-checker: dump the page status if hit something wrong
Date: Tue, 6 Feb 2024 16:24:32 +1030
Message-ID: <8932de78-729c-431a-b371-a858e986066d@gmx.com>
In-Reply-To: <20240206033807.15498-1-tavianator@tavianator.com>



On 2024/2/6 14:08, tavianator@tavianator.com wrote:
> On Sat, 27 Jan 2024 10:18:36 +1030, Qu Wenruo wrote:
>> [BUG]
>> There is a bug report about a very suspicious tree-checker error being triggered:
>>
>>    BTRFS critical (device dm-0): corrupted node, root=256
>> block=8550954455682405139 owner mismatch, have 11858205567642294356
>> expect [256, 18446744073709551360]
>>    BTRFS critical (device dm-0): corrupted node, root=256
>> block=8550954455682405139 owner mismatch, have 11858205567642294356
>> expect [256, 18446744073709551360]
>>    BTRFS critical (device dm-0): corrupted node, root=256
>> block=8550954455682405139 owner mismatch, have 11858205567642294356
>> expect [256, 18446744073709551360]
>>    SELinux: inode_doinit_use_xattr:  getxattr returned 117 for dev=dm-0
>> ino=5737268
>
> I can reproduce this error.  I applied a modified version of your patch,
> against v6.7.2 because that's what I triggered it on.
>
> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
> index 50fdc69fdddf..3f1fc49cd4dc 100644
> --- a/fs/btrfs/tree-checker.c
> +++ b/fs/btrfs/tree-checker.c
> @@ -2038,6 +2044,7 @@ int btrfs_check_eb_owner(const struct extent_buffer *eb, u64 root_owner)
>          if (!is_subvol) {
>                  /* For non-subvolume trees, the eb owner should match root owner */
>                  if (unlikely(root_owner != eb_owner)) {
> +                       dump_page(eb->pages[0], "eb page dump");
>                          btrfs_crit(eb->fs_info,
>   "corrupted %s, root=%llu block=%llu owner mismatch, have %llu expect %llu",
>                                  btrfs_header_level(eb) == 0 ? "leaf" : "node",
> @@ -2053,6 +2060,7 @@ int btrfs_check_eb_owner(const struct extent_buffer *eb, u64 root_owner)
>           * to subvolume trees.
>           */
>          if (unlikely(is_subvol != is_fstree(eb_owner))) {
> +               dump_page(eb->pages[0], "eb page dump");
>                  btrfs_crit(eb->fs_info,
>   "corrupted %s, root=%llu block=%llu owner mismatch, have %llu expect [%llu, %llu]",
>                          btrfs_header_level(eb) == 0 ? "leaf" : "node",
>
> Here's the corresponding dmesg output:
>
>      page:00000000789c68b4 refcount:4 mapcount:0 mapping:00000000ce99bfc3 index:0x7df93c74 pfn:0x1269558
>      memcg:ffff9f20d10df000
>      aops:btree_aops [btrfs] ino:1
>      flags: 0x12ffff180000820c(referenced|uptodate|workingset|private|node=2|zone=2|lastcpupid=0xffff)
>      page_type: 0xffffffff()
>      raw: 12ffff180000820c 0000000000000000 dead000000000122 ffff9f118586feb8
>      raw: 000000007df93c74 ffff9f2232376e80 00000004ffffffff ffff9f20d10df000
>      page dumped because: eb page dump
>      BTRFS critical (device dm-1): corrupted leaf, root=709 block=8656838410240 owner mismatch, have 2694891690930195334 expect [256, 18446744073709551360]

The page index matches eb->start (with 4K pages, block 8656838410240 >> 12
is index 0x7df93c74), so the page attaching part is correct.
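
As a quick sanity check of that index match, the sketch below recomputes the page index from the block number for each of the three dumps quoted in this mail. It assumes 4K pages (a shift of 12), which the dump itself does not state; the names PAGE_SHIFT, DUMPS and index_matches are only for illustration.

```python
# Assumption: 4 KiB pages, so a page index is the byte offset shifted right by 12.
PAGE_SHIFT = 12

# (page index from dump_page(), block number from the "BTRFS critical" line)
DUMPS = [
    (0x7DF93C74, 8656838410240),
    (0x8DAE804C, 9736288518144),
    (0x7609CBDC, 8111527936000),
]


def index_matches(index, block, shift=PAGE_SHIFT):
    """Return True if the page index corresponds to the block's byte offset."""
    return block >> shift == index


for index, block in DUMPS:
    print(f"block={block} index={index:#x} match={index_matches(index, block)}")
```

Under the 4K assumption all three pairs line up, which is consistent with the pages being attached to the right extent buffers.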

And the refcount is also 4, which matches the common case.

Although I still need to look into the extra workingset flag.
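
For reference while looking at the flags, here is a minimal sketch that decodes the low 16 bits of the flags word from the dumps. The bit positions are an assumption: they follow what the v6.7 enum pageflags layout appears to be and are consistent with the four names dump_page() printed, but they differ between kernel versions. The upper bits (node, zone, lastcpupid) are ignored here.

```python
# Assumed bit positions (v6.7 enum pageflags); only the four flags seen in the dump.
ASSUMED_FLAG_BITS = {
    2: "referenced",
    3: "uptodate",
    9: "workingset",
    15: "private",
}


def decode_low_flags(flags):
    """Name the bits set in the low 16 bits; bits without a known name show as bitN."""
    names = []
    for bit in range(16):
        if flags & (1 << bit):
            names.append(ASSUMED_FLAG_BITS.get(bit, f"bit{bit}"))
    return names


for flags in (0x12FFFF180000820C, 0x0AFFFF180000820C):
    print(f"{flags:#018x}: {'|'.join(decode_low_flags(flags))}")
```

Both flag words share the same low bits (0x820c), so with this mapping the only difference between the dumps is the NUMA node encoded in the upper bits.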

>      page:000000006b7dfcdc refcount:4 mapcount:0 mapping:00000000ce99bfc3 index:0x8dae804c pfn:0x408347
>      memcg:ffff9f20d10df000
>      aops:btree_aops [btrfs] ino:1
>      flags: 0xaffff180000820c(referenced|uptodate|workingset|private|node=1|zone=2|lastcpupid=0xffff)
>      page_type: 0xffffffff()
>      raw: 0affff180000820c 0000000000000000 dead000000000122 ffff9f118586feb8
>      raw: 000000008dae804c ffff9f1497257d00 00000004ffffffff ffff9f20d10df000
>      page dumped because: eb page dump
>      BTRFS critical (device dm-1): corrupted leaf, root=518 block=9736288518144 owner mismatch, have 1691386650333431481 expect [256, 18446744073709551360]
>      page:00000000fb0df6cd refcount:4 mapcount:0 mapping:00000000ce99bfc3 index:0x7609cbdc pfn:0x129e719
>      memcg:ffff9f20d10df000
>      aops:btree_aops [btrfs] ino:1
>      flags: 0x12ffff180000820c(referenced|uptodate|workingset|private|node=2|zone=2|lastcpupid=0xffff)
>      page_type: 0xffffffff()
>      raw: 12ffff180000820c 0000000000000000 dead000000000122 ffff9f118586feb8
>      raw: 000000007609cbdc ffff9f231de92658 00000004ffffffff ffff9f20d10df000
>      page dumped because: eb page dump
>      BTRFS critical (device dm-1): corrupted leaf, root=518 block=8111527936000 owner mismatch, have 10652220539197264134 expect [256, 18446744073709551360]
>
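
To see why every reported owner is rejected, the check can be modelled in userspace. This is a rough sketch modelled on is_fstree() as it appears to be in v6.7; the 48-bit qgroup level shift and the helper names are assumptions, not code copied from the kernel.

```python
BTRFS_FS_TREE_OBJECTID = 5
BTRFS_FIRST_FREE_OBJECTID = 256
BTRFS_LAST_FREE_OBJECTID = 2**64 - 256  # printed in the log as 18446744073709551360
QGROUP_LEVEL_SHIFT = 48  # assumption: the qgroup level lives in the top 16 bits


def to_s64(value):
    """Reinterpret an unsigned 64-bit value as signed, like the kernel's (s64) cast."""
    return value - 2**64 if value >= 2**63 else value


def is_fstree(rootid):
    """Model of the subvolume-tree test: FS tree, or a free objectid at qgroup level 0."""
    if rootid == BTRFS_FS_TREE_OBJECTID:
        return True
    return (to_s64(rootid) >= BTRFS_FIRST_FREE_OBJECTID
            and (rootid >> QGROUP_LEVEL_SHIFT) == 0)


# (root, "have" owner) pairs taken from the BTRFS critical lines in this thread
REPORTED = [
    (256, 11858205567642294356),
    (709, 2694891690930195334),
    (518, 1691386650333431481),
    (518, 10652220539197264134),
]

for root, owner in REPORTED:
    print(f"root={root} subvol={is_fstree(root)} owner={owner} subvol={is_fstree(owner)}")
```

In this model two of the owners are negative when read as s64, and the other two fall numerically inside the printed range but have non-zero upper 16 bits, so none of them qualifies as a subvolume tree while all of the roots do.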
> Hope this helps!  Let me know if you have other debug patches to try.
>
> Here's my reproducer if you want to try it yourself.  It uses bfs, a
> find(1) clone I wrote with multi-threading and io_uring support.  I'm
> in the process of adding multi-threaded stat(), which is what I assume
> triggers the bug.
>
>      $ git clone "https://github.com/tavianator/bfs"
>      $ cd bfs
>      $ git checkout euclean
>      $ make release
>
> Then repeat these steps until it triggers:
>
>      # sysctl vm.drop_caches=3
>      $ ./bin/bfs /mnt -links 100
>      bfs: error: /mnt/slash/@/var/lib/docker/btrfs/subvolumes/f07d37d1c148e9fcdbae166a3a4de36eec49009ce651174d0921fab18d55cee6/dev/ram0: Structure needs cleaning.

It looks like the filesystem mounted at /mnt is pretty large, with a lot
of data pre-populated?

I tried populating a btrfs filesystem with my Linux git repo (around
6.5G, with some GC needed), but even 256 runs didn't hit the problem.

The main part of the script looks like this:

for (( i = 0; i < 256; i++ )); do
	mount "$dev1" "$mnt"
	sysctl vm.drop_caches=3
	/home/adam/bfs/bin/bfs "$mnt" -links 100
	umount "$mnt"
done
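
A variant of that loop which stops at the first hit might look like the sketch below. It is only an illustration: the function names are made up, run_once() has to run as root, and it assumes bfs reports the failure on stderr with the "Structure needs cleaning" text seen in the report (the message for errno 117, EUCLEAN).

```python
import subprocess

# Message text for EUCLEAN (errno 117) as it appears in the bfs error in this thread.
EUCLEAN_TEXT = "Structure needs cleaning"


def hit_euclean(stderr_text):
    """Return True if the captured stderr contains the EUCLEAN message."""
    return EUCLEAN_TEXT in stderr_text


def run_once(dev, mnt, bfs):
    """One mount / drop caches / bfs / umount cycle; True if EUCLEAN was reported."""
    subprocess.run(["mount", dev, mnt], check=True)
    try:
        subprocess.run(["sysctl", "vm.drop_caches=3"], check=True)
        result = subprocess.run([bfs, mnt, "-links", "100"],
                                capture_output=True, text=True, errors="replace")
        return hit_euclean(result.stderr)
    finally:
        # Always unmount so the next iteration starts with a fresh mount.
        subprocess.run(["umount", mnt], check=True)


def reproduce(run, max_runs=256):
    """Call run() up to max_runs times; return the 1-based attempt that hit, or None."""
    for attempt in range(1, max_runs + 1):
        if run():
            return attempt
    return None


# Example, using the paths from the script and df output in this mail:
# reproduce(lambda: run_once("/dev/mapper/test-scratch1", "/mnt/btrfs",
#                            "/home/adam/bfs/bin/bfs"))
```

Keeping reproduce() separate from run_once() means the retry logic can be exercised without touching a real device.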

And the device looks like this:

/dev/mapper/test-scratch1  10485760  6472292   3679260  64% /mnt/btrfs

One difference is that I'm using the btrfs for-next branch
(https://github.com/btrfs/linux/tree/for-next).

Maybe v6.7.2 is missing some fixes that are not yet upstream?
My current guess is that it is related to my commit 09e6cef19c9f
("btrfs: refactor alloc_extent_buffer() to allocate-then-attach method"),
but since I cannot reproduce it, that is only a guess...

Thanks,
Qu

>      ...
>
