linux-btrfs.vger.kernel.org archive mirror
From: Tavian Barnes <tavianator@tavianator.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: About the weird tree block corruption
Date: Thu, 14 Mar 2024 16:25:17 -0400
Message-ID: <CABg4E-nFJs01tYpxB_D=tYz0OaRJ_euq45CKFJtcuD=xWbsxBw@mail.gmail.com>
In-Reply-To: <CABg4E-nJ8MDOdBDEpJFhZtjUPtqGTmRPieGSg-NMceJ7EZCD-A@mail.gmail.com>

On Thu, Mar 14, 2024 at 2:42 PM Tavian Barnes <tavianator@tavianator.com> wrote:
> On Thu, Mar 14, 2024 at 1:44 PM Tavian Barnes <tavianator@tavianator.com> wrote:
> > On Wed, Mar 13, 2024 at 2:07 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> > > Hi Tavian,
> > >
> > > Thanks for all the awesome help debugging the weird tree block corruption.
> > > And sorry for the late reply.
> >
> > No worries, thanks for your help!
> >
> > > Unfortunately I still failed to reproduce the bug, so I can only craft a
> > > debug patchset for you to test.
> >
> > Good news: I also failed to reproduce the bug on the latest
> > btrfs/for-next branch (ec616f34eba1 "btrfs: do not skip
> > re-registration for the mounted device").  It was still reproducing
> > the last time I pulled btrfs/for-next (09e6cef19c9f "btrfs: refactor
> > alloc_extent_buffer() to allocate-then-attach method").

Actually, I have to walk all of this back.  I can still reproduce the
bug on ec616f34eba1.  I can't reproduce it on that commit *plus your
debugging patch* (btrfs: add extra debug for eb read path), which
suggests a timing issue: the extra debug code may be closing a race
window.

Additionally, shortly after reproducing the bug on ec616f34eba1, I got
this splat:

[  119.257741] BTRFS critical (device dm-2): corrupted node, root=518
block=16438782945911875046 owner mismatch, have 4517169229596899607
expect [256, 18446744073709551360]
[  119.257764] BTRFS critical (device dm-2): corrupted node, root=518
block=16438782945911875046 owner mismatch, have 4517169229596899607
expect [256, 18446744073709551360]
[  119.257775] BTRFS critical (device dm-2): corrupted node, root=518
block=16438782945911875046 owner mismatch, have 4517169229596899607
expect [256, 18446744073709551360]
[  119.257773] BTRFS critical (device dm-2): corrupted node, root=518
block=16438782945911875046 owner mismatch, have 4517169229596899607
expect [256, 18446744073709551360]
[  119.257782] BTRFS critical (device dm-2): corrupted node, root=518
block=16438782945911875046 owner mismatch, have 4517169229596899607
expect [256, 18446744073709551360]
[  119.257787] BTRFS critical (device dm-2): corrupted node, root=518
block=16438782945911875046 owner mismatch, have 4517169229596899607
expect [256, 18446744073709551360]
[  178.564993] ------------[ cut here ]------------
[  178.567792] BTRFS: Transaction aborted (error -17)
[  178.570606] WARNING: CPU: 2 PID: 5191 at fs/btrfs/inode.c:6450
btrfs_create_new_inode+0xa46/0xa70 [btrfs]
[  178.573433] Modules linked in: vhost_net vhost vhost_iotlb tap
xt_addrtype xt_conntrack xt_comment veth rpcrdma rdma_cm cmac iw_cm
algif_hash nct6775 algif_skcipher ib_cm nct6775_core af_alg ib_core
bnep hwmon_vid lm92 uvcvideo uvc intel_rapl_msr intel_rapl_common
amd64_edac edac_mce_amd snd_hda_codec_hdmi kvm_amd snd_hda_intel btusb
snd_intel_dspcfg btrtl snd_intel_sdw_acpi gspca_vc032x btintel
snd_usb_audio snd_hda_codec btbcm gspca_main kvm btmtk
videobuf2_vmalloc snd_usbmidi_lib nls_iso8859_1 videobuf2_memops
snd_ump videobuf2_v4l2 snd_hda_core snd_rawmidi vfat videodev
bluetooth irqbypass snd_hwdep snd_seq_device fat videobuf2_common mc
snd_pcm ecdh_generic crc16 snd_timer rapl wmi_bmof acpi_cpufreq
sp5100_tco pcspkr snd soundcore k10temp i2c_piix4 mousedev joydev
mac_hid nfsd usbip_host usbip_core pkcs8_key_parser auth_rpcgss
i2c_dev sg nfs_acl lockd crypto_user grace fuse loop sunrpc btrfs
blake2b_generic xor raid6_pq dm_crypt cbc encrypted_keys trusted
asn1_encoder tee xt_MASQUERADE xt_tcpudp xt_mark
[  178.573514]  hid_logitech_hidpp hid_logitech_dj dm_mod tun uas
crct10dif_pclmul crc32_pclmul polyval_clmulni usb_storage
polyval_generic gf128mul ghash_clmulni_intel sha512_ssse3 usbhid
sha256_ssse3 sha1_ssse3 aesni_intel nvme igb bridge iwlwifi
crypto_simd nvme_core mxm_wmi ccp sr_mod cryptd dca xhci_pci cdrom
i2c_algo_bit xhci_pci_renesas nvme_auth stp wmi llc nf_tables cfg80211
ip6table_nat ip6table_filter ip6_tables iptable_nat nf_nat
nf_conntrack rfkill nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c
crc32c_generic crc32c_intel iptable_filter nfnetlink ip_tables
x_tables
[  178.600199] CPU: 2 PID: 5191 Comm: git Not tainted
6.8.0-rc7-euclean+ #15 6731aa79f7b285a380c42229104b62a18d4313c1
[  178.603981] Hardware name: Micro-Star International Co., Ltd.
MS-7C60/TRX40 PRO WIFI (MS-7C60), BIOS 2.80 05/17/2022
[  178.607815] RIP: 0010:btrfs_create_new_inode+0xa46/0xa70 [btrfs]
[  178.611667] Code: cc 0a 00 c7 44 24 0c 01 00 00 00 e9 fc fb ff ff
41 bc f4 ff ff ff e9 a4 f8 ff ff 44 89 e6 48 c7 c7 48 00 3b c1 e8 1a
e7 c4 c2 <0f> 0b eb a4 44 89 e6 48 c7 c7 48 00 3b c1 e8 07 e7 c4 c2 0f
0b eb
[  178.615664] RSP: 0018:ffffbb8edf8afad8 EFLAGS: 00010282
[  178.619577] RAX: 0000000000000000 RBX: ffffbb8edf8afbb8 RCX: 0000000000000000
[  178.623516] RDX: ffff8fe8beeae780 RSI: ffff8fe8beea19c0 RDI: ffff8fe8beea19c0
[  178.627458] RBP: ffffbb8edf8afba0 R08: 0000000000000000 R09: ffffbb8edf8af960
[  178.631370] R10: 0000000000000003 R11: ffff8fe8bf1df668 R12: 00000000ffffffef
[  178.635281] R13: ffff8fc9cdc57d88 R14: ffff8fcb654541f8 R15: ffff8fc9d7c07110
[  178.639187] FS:  00007f86d59d9740(0000) GS:ffff8fe8bee80000(0000)
knlGS:0000000000000000
[  178.643131] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  178.647061] CR2: 000055c7d0a7a938 CR3: 000000023e8a8000 CR4: 0000000000350ef0
[  178.650998] Call Trace:
[  178.654898]  <TASK>
[  178.658761]  ? btrfs_create_new_inode+0xa46/0xa70 [btrfs
0221bef32b31f5d5536e63a0526995bec5247918]
[  178.662770]  ? __warn+0x81/0x130
[  178.666696]  ? btrfs_create_new_inode+0xa46/0xa70 [btrfs
0221bef32b31f5d5536e63a0526995bec5247918]
[  178.670749]  ? report_bug+0x171/0x1a0
[  178.674761]  ? prb_read_valid+0x1b/0x30
[  178.678786]  ? srso_return_thunk+0x5/0x5f
[  178.682786]  ? handle_bug+0x3c/0x80
[  178.686779]  ? exc_invalid_op+0x17/0x70
[  178.690769]  ? asm_exc_invalid_op+0x1a/0x20
[  178.694773]  ? btrfs_create_new_inode+0xa46/0xa70 [btrfs
0221bef32b31f5d5536e63a0526995bec5247918]
[  178.698925]  ? btrfs_create_new_inode+0xa46/0xa70 [btrfs
0221bef32b31f5d5536e63a0526995bec5247918]
[  178.703008]  btrfs_create_common+0xc4/0x130 [btrfs
0221bef32b31f5d5536e63a0526995bec5247918]
[  178.707133]  path_openat+0xe9f/0x1190
[  178.711196]  do_filp_open+0xb3/0x160
[  178.715254]  do_sys_openat2+0xab/0xe0
[  178.719294]  __x64_sys_openat+0x57/0xa0
[  178.723321]  do_syscall_64+0x89/0x170
[  178.727336]  ? srso_return_thunk+0x5/0x5f
[  178.731356]  ? do_syscall_64+0x96/0x170
[  178.735373]  ? srso_return_thunk+0x5/0x5f
[  178.739387]  ? syscall_exit_to_user_mode+0x80/0x230
[  178.743420]  ? srso_return_thunk+0x5/0x5f
[  178.747434]  ? do_syscall_64+0x96/0x170
[  178.751441]  ? srso_return_thunk+0x5/0x5f
[  178.755450]  ? srso_return_thunk+0x5/0x5f
[  178.759405]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[  178.763406] RIP: 0033:0x7f86d5ad6dc0
[  178.767424] Code: 48 89 44 24 20 75 94 44 89 54 24 0c e8 d9 c9 f8
ff 44 8b 54 24 0c 89 da 48 89 ee 41 89 c0 bf 9c ff ff ff b8 01 01 00
00 0f 05 <48> 3d 00 f0 ff ff 77 38 44 89 c7 89 44 24 0c e8 2c ca f8 ff
8b 44
[  178.771580] RSP: 002b:00007ffcc4791f50 EFLAGS: 00000293 ORIG_RAX:
0000000000000101
[  178.775720] RAX: ffffffffffffffda RBX: 00000000000000c1 RCX: 00007f86d5ad6dc0
[  178.779865] RDX: 00000000000000c1 RSI: 000055c7d0a77140 RDI: 00000000ffffff9c
[  178.784014] RBP: 000055c7d0a77140 R08: 0000000000000000 R09: 000055c7d0a77140
[  178.788152] R10: 00000000000001b6 R11: 0000000000000293 R12: 00007ffcc4792360
[  178.792210] R13: 00007ffcc4792000 R14: 000055c7d06bb788 R15: 000055c7d0a77140
[  178.796192]  </TASK>
[  178.800079] ---[ end trace 0000000000000000 ]---
[  178.804024] BTRFS: error (device dm-2: state A) in
btrfs_create_new_inode:6450: errno=-17 Object already exists
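
For readers decoding the numbers in the splat above, here is a minimal
Python sketch.  It is not part of any btrfs tooling, and the helper
names (as_signed, the variable names) are made up for illustration.  It
reinterprets the unsigned values from the log as signed integers and
looks up the errno name, assuming Linux errno numbering.

```python
import errno


def as_signed(value: int, bits: int) -> int:
    """Reinterpret an unsigned `bits`-wide integer as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= (1 << (bits - 1)) else value


# Values copied verbatim from the log above.
expect_low, expect_high = 256, 18446744073709551360
have_owner = 4517169229596899607
block = 16438782945911875046
r12 = 0x00000000FFFFFFEF  # R12 at the time of the WARNING

# The upper bound of the expected range is 2**64 - 256, i.e. -256 as s64.
print("expect high:", hex(expect_high), as_signed(expect_high, 64))

# Hex makes it easier to judge whether these look like plausible values.
print("have owner: ", hex(have_owner))
print("block:      ", hex(block))

# The low 32 bits of R12 hold the error code reported in the abort message.
err = as_signed(r12, 32)
print("R12 as int: ", err, errno.errorcode[-err])
```

The last line ties the register dump to the "Transaction aborted (error
-17)" message: the low 32 bits of R12 are -17, which is EEXIST.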

-- 
Tavian Barnes
