linux-btrfs.vger.kernel.org archive mirror
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Tavian Barnes <tavianator@tavianator.com>
Cc: linux-btrfs@vger.kernel.org, wqu@suse.com
Subject: Re: [PATCH] btrfs: tree-checker: dump the page status if hit something wrong
Date: Wed, 7 Feb 2024 07:09:55 +1030
Message-ID: <60724d87-293d-495f-92ed-80032dab5c47@gmx.com>
In-Reply-To: <20240206201247.4120-1-tavianator@tavianator.com>



On 2024/2/7 06:42, Tavian Barnes wrote:
> On Tue, 6 Feb 2024 16:24:32 +1030, Qu Wenruo wrote:
>> On 2024/2/6 14:08, tavianator@tavianator.com wrote:
>>> Here's the corresponding dmesg output:
>>>
>>>       page:00000000789c68b4 refcount:4 mapcount:0 mapping:00000000ce99bfc3 index:0x7df93c74 pfn:0x1269558
>>>       memcg:ffff9f20d10df000
>>>       aops:btree_aops [btrfs] ino:1
>>>       flags: 0x12ffff180000820c(referenced|uptodate|workingset|private|node=2|zone=2|lastcpupid=0xffff)
>>>       page_type: 0xffffffff()
>>>       raw: 12ffff180000820c 0000000000000000 dead000000000122 ffff9f118586feb8
>>>       raw: 000000007df93c74 ffff9f2232376e80 00000004ffffffff ffff9f20d10df000
>>>       page dumped because: eb page dump
>>>       BTRFS critical (device dm-1): corrupted leaf, root=709 block=8656838410240 owner mismatch, have 2694891690930195334 expect [256, 18446744073709551360]
>>
>> The eb->start matches the page index, so the page attaching part is
>> correct.
>>
>> And the refcount is also 4, which matches the common case.
>>
>> Although I still need to check the extra flags for workingset.
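The index match can be checked by hand from the dump. The sketch below assumes 4 KiB pages (PAGE_SHIFT of 12, an assumption about this machine, not something stated in the thread) and uses a hypothetical helper, index_to_bytenr, to turn the "index:" field of a page dump into a byte offset that can be compared with the "block=" value in the matching BTRFS critical line.

```shell
# Assumption: 4 KiB pages, so a page index maps to a byte offset by << 12.
PAGE_SHIFT=12

# Hypothetical helper: $1 is the "index:" field of a page dump (hex).
index_to_bytenr() {
	echo $(( $1 << PAGE_SHIFT ))
}

# Index from the refcount:4 dump; compare with block=8656838410240.
index_to_bytenr 0x7df93c74
# Index from the refcount:3 dumps; compare with block=9806651031552.
index_to_bytenr 0x8eb49f38
```

Under that assumption both indexes land exactly on the reported block numbers, which is consistent with the page being attached at the right offset.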
>
> I did get some other splats with refcount:3, e.g.
>
>      page:000000005ca43abb refcount:3 mapcount:0 mapping:00000000ce99bfc3 index:0x8eb49f38 pfn:0x17e8520
>      page:000000005ca43abb refcount:3 mapcount:0 mapping:00000000ce99bfc3 index:0x8eb49f38 pfn:0x17e8520
>      memcg:ffff9f211ab95000
>      page:000000005ca43abb refcount:3 mapcount:0 mapping:00000000ce99bfc3 index:0x8eb49f38 pfn:0x17e8520
>      memcg:ffff9f211ab95000
>      page:000000005ca43abb refcount:3 mapcount:0 mapping:00000000ce99bfc3 index:0x8eb49f38 pfn:0x17e8520
>      memcg:ffff9f211ab95000
>      memcg:ffff9f211ab95000
>      page:000000005ca43abb refcount:3 mapcount:0 mapping:00000000ce99bfc3 index:0x8eb49f38 pfn:0x17e8520
>      memcg:ffff9f211ab95000
>      BTRFS critical (device dm-1): inode mode mismatch with dir: inode mode=042255 btrfs type=2 dir type=1
>      aops:btree_aops [btrfs] ino:1
>      aops:btree_aops [btrfs] ino:1
>      aops:btree_aops [btrfs] ino:1
>      aops:btree_aops [btrfs] ino:1
>      aops:btree_aops [btrfs] ino:1
>      flags: 0x12ffff580000822c(referenced|uptodate|lru|workingset|private|node=2|zone=2|lastcpupid=0xffff)
>      flags: 0x12ffff580000822c(referenced|uptodate|lru|workingset|private|node=2|zone=2|lastcpupid=0xffff)
>      page_type: 0xffffffff()
>      page_type: 0xffffffff()
>      raw: 12ffff580000822c fffffabb9f5f8288 fffffabb9fa14848 ffff9f118586feb8
>      raw: 12ffff580000822c fffffabb9f5f8288 fffffabb9fa14848 ffff9f118586feb8
>      raw: 000000008eb49f38 ffff9f16ae564cb0 00000003ffffffff ffff9f211ab95000
>      raw: 000000008eb49f38 ffff9f16ae564cb0 00000003ffffffff ffff9f211ab95000
>      flags: 0x12ffff580000822c(referenced|uptodate|lru|workingset|private|node=2|zone=2|lastcpupid=0xffff)
>      page dumped because: eb page dump
>      page dumped because: eb page dump
>      page_type: 0xffffffff()
>      BTRFS critical (device dm-1): corrupted leaf, root=136202 block=9806651031552 owner mismatch, have 174692946400338119 expect [256, 18446744073709551360]
>      BTRFS critical (device dm-1): corrupted leaf, root=136202 block=9806651031552 owner mismatch, have 174692946400338119 expect [256, 18446744073709551360]
>      flags: 0x12ffff580000822c(referenced|uptodate|lru|workingset|private|node=2|zone=2|lastcpupid=0xffff)
>      raw: 12ffff580000822c fffffabb9f5f8288 fffffabb9fa14848 ffff9f118586feb8
>      raw: 000000008eb49f38 ffff9f16ae564cb0 00000003ffffffff ffff9f211ab95000
>      page_type: 0xffffffff()
>      page dumped because: eb page dump
>      raw: 12ffff580000822c fffffabb9f5f8288 fffffabb9fa14848 ffff9f118586feb8
>      BTRFS critical (device dm-1): corrupted leaf, root=136202 block=9806651031552 owner mismatch, have 174692946400338119 expect [256, 18446744073709551360]
>      raw: 000000008eb49f38 ffff9f16ae564cb0 00000003ffffffff ffff9f211ab95000
>      page dumped because: eb page dump
>      BTRFS critical (device dm-1): corrupted leaf, root=136202 block=9806651031552 owner mismatch, have 174692946400338119 expect [256, 18446744073709551360]
>      flags: 0x12ffff580000822c(referenced|uptodate|lru|workingset|private|node=2|zone=2|lastcpupid=0xffff)
>      page_type: 0xffffffff()
>      raw: 12ffff580000822c fffffabb9f5f8288 fffffabb9fa14848 ffff9f118586feb8
>      raw: 000000008eb49f38 ffff9f16ae564cb0 00000003ffffffff ffff9f211ab95000
>      page dumped because: eb page dump
>      BTRFS critical (device dm-1): corrupted leaf, root=136202 block=9806651031552 owner mismatch, have 174692946400338119 expect [256, 18446744073709551360]
>
>>> Here's my reproducer if you want to try it yourself.  It uses bfs, a
>>> find(1) clone I wrote with multi-threading and io_uring support.  I'm
>>> in the process of adding multi-threaded stat(), which is what I assume
>>> triggers the bug.
>>>
>>>       $ git clone "https://github.com/tavianator/bfs"
>>>       $ cd bfs
>>>       $ git checkout euclean
>>>       $ make release
>>>
>>> Then repeat these steps until it triggers:
>>>
>>>       # sysctl vm.drop_caches=3
>>>       $ ./bin/bfs /mnt -links 100
>>>       bfs: error: /mnt/slash/@/var/lib/docker/btrfs/subvolumes/f07d37d1c148e9fcdbae166a3a4de36eec49009ce651174d0921fab18d55cee6/dev/ram0: Structure needs cleaning.
>>
>> It looks like the mount point /mnt/ is pretty large with a lot of things
>> pre-populated?
>
> Right, /mnt contains a few filesystems.  /mnt/slash is my root fs (the
> subvolume @ is mounted as /).  It's quite large, with over 41 million
> files and 640 subvolumes.  It's a BTRFS RAID0 array on 4 1TB NVMEs with
> LUKS encryption.
>
>> I tried to populate the btrfs with my linux git repo (which is around
>> 6.5G with some GC needed), but even 256 runs didn't hit the problem.
>>
>> The main part of the script looks like this:
>>
>> for (( i = 0; i < 256; i++ )); do
>> 	mount $dev1 $mnt
>> 	sysctl vm.drop_caches=3
>> 	/home/adam/bfs/bin/bfs $mnt -links 100
>> 	umount $mnt
>> done
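A variant of the loop above that stops at the first failing pass may make it easier to catch the dmesg output while it is fresh. This is only a sketch: run_until_failure and one_pass are hypothetical helpers, and it assumes bfs exits non-zero when it reports an error such as "Structure needs cleaning".

```shell
# Run a command up to $1 times and stop at the first non-zero exit status.
# Returns 0 if a failure was seen, 1 if every iteration succeeded.
run_until_failure() {
	max=$1
	shift
	i=1
	while [ "$i" -le "$max" ]; do
		if ! "$@"; then
			echo "failed on iteration $i"
			return 0
		fi
		i=$((i + 1))
	done
	echo "no failure in $max iterations"
	return 1
}

# Hypothetical wrapper for one pass of the reproducer. Needs root, and
# expects $dev1 and $mnt to be set as in the original script.
one_pass() {
	mount "$dev1" "$mnt" &&
		sysctl vm.drop_caches=3 &&
		/home/adam/bfs/bin/bfs "$mnt" -links 100
	status=$?
	# Always unmount, then report the status of the pass itself.
	umount "$mnt"
	return "$status"
}

# Usage (as root, not run here):
#   dev1=/dev/mapper/test-scratch1 mnt=/mnt/btrfs run_until_failure 256 one_pass
```

Note that a failed mount also counts as a failing pass here, so the message only says that something in the pass returned non-zero.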
>>
>> And the device looks like this:
>>
>> /dev/mapper/test-scratch1  10485760  6472292   3679260  64% /mnt/btrfs
>
> I also noticed that it seems easier to reproduce right after a reboot.
> I failed to reproduce it this morning, but after a reboot it triggered
> immediately.
>
>> Although the difference is, I'm using btrfs/for-next branch
>> (https://github.com/btrfs/linux/tree/for-next).
>>
>> Maybe it's missing some fixes not yet in upstream?
>> My current guess is related to my commit 09e6cef19c9f ("btrfs: refactor
>> alloc_extent_buffer() to allocate-then-attach method"), but since I can
>> not reproduce it, it's only a guess...
>
> That's possible!  I tried to follow the existing code in
> alloc_extent_buffer() but didn't see any obvious races.  I will try again
> with the for-next tree and report back.

The most obvious way to prove it is, if you can reproduce it really
reliably, to go back to that commit and verify that it still causes
the problem.
Then go one commit before it, and verify that the problem no longer
occurs.
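
The two checks can be written down as a short list of steps. The sketch below only prints the commands and executes nothing; print_verify_steps is a hypothetical helper, and the hash is the one quoted earlier in this thread, which may differ between trees.

```shell
# Print the checkout steps for a suspect commit and for its parent.
# Nothing is executed; the output is a checklist to follow by hand.
print_verify_steps() {
	commit=$1
	# "$commit^" is git's notation for the first parent of $commit.
	for rev in "$commit" "$commit^"; do
		echo "git checkout $rev"
		echo "# build and boot this kernel, then run the reproducer"
	done
}

print_verify_steps 09e6cef19c9f
```

If the problem shows up on the first checkout and not on the second, that points at the commit itself.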

Although without a way to reproduce locally, it's really hard to say or
debug from my end.

Thanks,
Qu
>
>> Thanks,
>> Qu
>
> --
>
> Tavian Barnes

