* Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
@ 2021-07-14  3:21 Boyang Xue
  2021-07-14  3:57 ` Boyang Xue
  2021-07-14  4:11 ` Roman Gushchin
  0 siblings, 2 replies; 21+ messages in thread

From: Boyang Xue @ 2021-07-14 3:21 UTC (permalink / raw)
To: linux-fsdevel; +Cc: guro

Hello,

I'm not sure if this is the right place to report this bug, please
correct me if I'm wrong.

I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's
running xfstests generic/256 on ext4 [1]. Looking at the call trace,
it looks like the bug had been introduced by the commit

c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes

It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing
was performed with the latest xfstests, and the bug can be reproduced
on ext{2, 3, 4} with {1k, 2k, 4k} block sizes.

Thanks,
Boyang

1. dmesg
```
[ 4366.380974] run fstests generic/256 at 2021-07-12 05:41:40
[ 4368.337078] EXT4-fs (vda3): mounted filesystem with ordered data mode. Opts: . Quota mode: none.
[ 4371.275986] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 4371.278210] Mem abort info:
[ 4371.278880]   ESR = 0x96000005
[ 4371.279603]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 4371.280878]   SET = 0, FnV = 0
[ 4371.281621]   EA = 0, S1PTW = 0
[ 4371.282396]   FSC = 0x05: level 1 translation fault
[ 4371.283635] Data abort info:
[ 4371.284333]   ISV = 0, ISS = 0x00000005
[ 4371.285246]   CM = 0, WnR = 0
[ 4371.285975] user pgtable: 64k pages, 48-bit VAs, pgdp=00000000b0502000
[ 4371.287640] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[ 4371.290016] Internal error: Oops: 96000005 [#1] SMP
[ 4371.291251] Modules linked in: dm_flakey dm_snapshot dm_bufio dm_zero dm_mod loop tls rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc ext4 vfat fat mbcache jbd2 drm fuse xfs libcrc32c crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce virtio_blk virtio_net net_failover virtio_console failover virtio_mmio aes_neon_bs [last unloaded: scsi_debug]
[ 4371.300059] CPU: 0 PID: 408468 Comm: kworker/u8:5 Tainted: G X --------- ---  5.14.0-0.rc1.15.bx.el9.aarch64 #1
[ 4371.303009] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 4371.304685] Workqueue: events_unbound cleanup_offline_cgwbs_workfn
[ 4371.306329] pstate: 004000c5 (nzcv daIF +PAN -UAO -TCO BTYPE=--)
[ 4371.307867] pc : cleanup_offline_cgwbs_workfn+0x320/0x394
[ 4371.309254] lr : cleanup_offline_cgwbs_workfn+0xe0/0x394
[ 4371.310597] sp : ffff80001554fd10
[ 4371.311443] x29: ffff80001554fd10 x28: 0000000000000000 x27: 0000000000000001
[ 4371.313320] x26: 0000000000000000 x25: 00000000000000e0 x24: ffffd2a2fbe671a8
[ 4371.315159] x23: ffff80001554fd88 x22: ffffd2a2fbe67198 x21: ffffd2a2fc25a730
[ 4371.316945] x20: ffff210412bc3000 x19: ffff210412bc3280 x18: 0000000000000000
[ 4371.318690] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 4371.320437] x14: 0000000000000000 x13: 0000000000000030 x12: 0000000000000040
[ 4371.322444] x11: ffff210481572238 x10: ffff21048157223a x9 : ffffd2a2fa276c60
[ 4371.324243] x8 : ffff210484106b60 x7 : 0000000000000000 x6 : 000000000007d18a
[ 4371.326049] x5 : ffff210416a86400 x4 : ffff210412bc0280 x3 : 0000000000000000
[ 4371.327898] x2 : ffff80001554fd88 x1 : ffff210412bc0280 x0 : 0000000000000003
[ 4371.329748] Call trace:
[ 4371.330372]  cleanup_offline_cgwbs_workfn+0x320/0x394
[ 4371.331694]  process_one_work+0x1f4/0x4b0
[ 4371.332767]  worker_thread+0x184/0x540
[ 4371.333732]  kthread+0x114/0x120
[ 4371.334535]  ret_from_fork+0x10/0x18
[ 4371.335440] Code: d63f0020 97f99963 17ffffa6 f8588263 (f9400061)
[ 4371.337174] ---[ end trace e250fe289272792a ]---
[ 4371.338365] Kernel panic - not syncing: Oops: Fatal exception
[ 4371.339884] SMP: stopping secondary CPUs
[ 4372.424137] SMP: failed to stop secondary CPUs 0-2
[ 4372.436894] Kernel Offset: 0x52a2e9fa0000 from 0xffff800010000000
[ 4372.438408] PHYS_OFFSET: 0xfff0defca0000000
[ 4372.439496] CPU features: 0x00200251,23200840
[ 4372.440603] Memory Limit: none
[ 4372.441374] ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---
```
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash 2021-07-14 3:21 Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash Boyang Xue @ 2021-07-14 3:57 ` Boyang Xue 2021-07-14 4:11 ` Roman Gushchin 1 sibling, 0 replies; 21+ messages in thread From: Boyang Xue @ 2021-07-14 3:57 UTC (permalink / raw) To: linux-fsdevel; +Cc: guro On Wed, Jul 14, 2021 at 11:21 AM Boyang Xue <bxue@redhat.com> wrote: > > Hello, > > I'm not sure if this is the right place to report this bug, please > correct me if I'm wrong. > > I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's > running xfstests generic/256 on ext4 [1]. Looking at the call trace, > it looks like the bug had been introduced by the commit > > c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes > > It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing Correction: It only happens on aarch64 and ppc64le, not on x86_64 and s390x. > was performed with the latest xfstests, and the bug can be reproduced > on ext{2, 3, 4} with {1k, 2k, 4k} block sizes. > > Thanks, > Boyang > > 1. dmesg > ``` > [ 4366.380974] run fstests generic/256 at 2021-07-12 05:41:40 > [ 4368.337078] EXT4-fs (vda3): mounted filesystem with ordered data > mode. Opts: . Quota mode: none. 
> [ 4371.275986] Unable to handle kernel NULL pointer dereference at > virtual address 0000000000000000 > [ 4371.278210] Mem abort info: > [ 4371.278880] ESR = 0x96000005 > [ 4371.279603] EC = 0x25: DABT (current EL), IL = 32 bits > [ 4371.280878] SET = 0, FnV = 0 > [ 4371.281621] EA = 0, S1PTW = 0 > [ 4371.282396] FSC = 0x05: level 1 translation fault > [ 4371.283635] Data abort info: > [ 4371.284333] ISV = 0, ISS = 0x00000005 > [ 4371.285246] CM = 0, WnR = 0 > [ 4371.285975] user pgtable: 64k pages, 48-bit VAs, pgdp=00000000b0502000 > [ 4371.287640] [0000000000000000] pgd=0000000000000000, > p4d=0000000000000000, pud=0000000000000000 > [ 4371.290016] Internal error: Oops: 96000005 [#1] SMP > [ 4371.291251] Modules linked in: dm_flakey dm_snapshot dm_bufio > dm_zero dm_mod loop tls rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver > nfs lockd grace fscache netfs rfkill sunrpc ext4 vfat fat mbcache jbd2 > drm fuse xfs libcrc32c crct10dif_ce ghash_ce sha2_ce sha256_arm64 > sha1_ce virtio_blk virtio_net net_failover virtio_console failover > virtio_mmio aes_neon_bs [last unloaded: scsi_debug] > [ 4371.300059] CPU: 0 PID: 408468 Comm: kworker/u8:5 Tainted: G > X --------- --- 5.14.0-0.rc1.15.bx.el9.aarch64 #1 > [ 4371.303009] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 > [ 4371.304685] Workqueue: events_unbound cleanup_offline_cgwbs_workfn > [ 4371.306329] pstate: 004000c5 (nzcv daIF +PAN -UAO -TCO BTYPE=--) > [ 4371.307867] pc : cleanup_offline_cgwbs_workfn+0x320/0x394 > [ 4371.309254] lr : cleanup_offline_cgwbs_workfn+0xe0/0x394 > [ 4371.310597] sp : ffff80001554fd10 > [ 4371.311443] x29: ffff80001554fd10 x28: 0000000000000000 x27: 0000000000000001 > [ 4371.313320] x26: 0000000000000000 x25: 00000000000000e0 x24: ffffd2a2fbe671a8 > [ 4371.315159] x23: ffff80001554fd88 x22: ffffd2a2fbe67198 x21: ffffd2a2fc25a730 > [ 4371.316945] x20: ffff210412bc3000 x19: ffff210412bc3280 x18: 0000000000000000 > [ 4371.318690] x17: 0000000000000000 x16: 
0000000000000000 x15: 0000000000000000 > [ 4371.320437] x14: 0000000000000000 x13: 0000000000000030 x12: 0000000000000040 > [ 4371.322444] x11: ffff210481572238 x10: ffff21048157223a x9 : ffffd2a2fa276c60 > [ 4371.324243] x8 : ffff210484106b60 x7 : 0000000000000000 x6 : 000000000007d18a > [ 4371.326049] x5 : ffff210416a86400 x4 : ffff210412bc0280 x3 : 0000000000000000 > [ 4371.327898] x2 : ffff80001554fd88 x1 : ffff210412bc0280 x0 : 0000000000000003 > [ 4371.329748] Call trace: > [ 4371.330372] cleanup_offline_cgwbs_workfn+0x320/0x394 > [ 4371.331694] process_one_work+0x1f4/0x4b0 > [ 4371.332767] worker_thread+0x184/0x540 > [ 4371.333732] kthread+0x114/0x120 > [ 4371.334535] ret_from_fork+0x10/0x18 > [ 4371.335440] Code: d63f0020 97f99963 17ffffa6 f8588263 (f9400061) > [ 4371.337174] ---[ end trace e250fe289272792a ]--- > [ 4371.338365] Kernel panic - not syncing: Oops: Fatal exception > [ 4371.339884] SMP: stopping secondary CPUs > [ 4372.424137] SMP: failed to stop secondary CPUs 0-2 > [ 4372.436894] Kernel Offset: 0x52a2e9fa0000 from 0xffff800010000000 > [ 4372.438408] PHYS_OFFSET: 0xfff0defca0000000 > [ 4372.439496] CPU features: 0x00200251,23200840 > [ 4372.440603] Memory Limit: none > [ 4372.441374] ---[ end Kernel panic - not syncing: Oops: Fatal exception ]--- > ``` ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-14  3:21 Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash Boyang Xue
  2021-07-14  3:57 ` Boyang Xue
@ 2021-07-14  4:11 ` Roman Gushchin
  2021-07-14  8:44   ` Boyang Xue
  1 sibling, 1 reply; 21+ messages in thread

From: Roman Gushchin @ 2021-07-14 4:11 UTC (permalink / raw)
To: Boyang Xue; +Cc: linux-fsdevel, Jan Kara

On Wed, Jul 14, 2021 at 11:21:12AM +0800, Boyang Xue wrote:
> Hello,
>
> I'm not sure if this is the right place to report this bug, please
> correct me if I'm wrong.
>
> I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's
> running xfstests generic/256 on ext4 [1]. Looking at the call trace,
> it looks like the bug had been introduced by the commit
>
> c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes
>
> It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing
> was performed with the latest xfstests, and the bug can be reproduced
> on ext{2, 3, 4} with {1k, 2k, 4k} block sizes.

Hello Boyang,

thank you for the report!

Do you know on which line the oops happens?

I'll try to reproduce the problem. Do you mind sharing your .config, kvm
options and any other meaningful details?

Thank you!

Roman
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash 2021-07-14 4:11 ` Roman Gushchin @ 2021-07-14 8:44 ` Boyang Xue 2021-07-14 9:26 ` Jan Kara 0 siblings, 1 reply; 21+ messages in thread From: Boyang Xue @ 2021-07-14 8:44 UTC (permalink / raw) To: Roman Gushchin; +Cc: linux-fsdevel, Jan Kara Hi Roman, On Wed, Jul 14, 2021 at 12:12 PM Roman Gushchin <guro@fb.com> wrote: > > On Wed, Jul 14, 2021 at 11:21:12AM +0800, Boyang Xue wrote: > > Hello, > > > > I'm not sure if this is the right place to report this bug, please > > correct me if I'm wrong. > > > > I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's > > running xfstests generic/256 on ext4 [1]. Looking at the call trace, > > it looks like the bug had been introduced by the commit > > > > c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes > > > > It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing > > was performed with the latest xfstests, and the bug can be reproduced > > on ext{2, 3, 4} with {1k, 2k, 4k} block sizes. > > Hello Boyang, > > thank you for the report! > > Do you know on which line the oops happens? I was trying to inspect the vmcore with crash utility, but unfortunately it doesn't work. ``` # crash /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux vmcore ... crash: invalid structure member offset: task_struct_state FILE: task.c LINE: 5929 FUNCTION: task_state() [/usr/bin/crash] error trace: aaaae238b080 => aaaae238aff0 => aaaae23ff4e8 => aaaae23ff440 ... ``` Could you suggest other ways to know "the line the oops happens"? > I'll try to reproduce the problem. Do you mind sharing your .config, kvm options > and any other meaningful details? I can't access the VM host, so sorry I can't provide the kvm configuration for now. 
Please check the following other info:

xfstests local.config
```
# cat local.config
FSTYP="ext4"
TEST_DIR="/test"
TEST_DEV="/dev/vda3"
SCRATCH_MNT="/scratch"
SCRATCH_DEV="/dev/vda4"
LOGWRITES_MNT="/logwrites"
LOGWRITES_DEV="/dev/vda6"
MKFS_OPTIONS="-b 4096"
MOUNT_OPTIONS="-o rw,relatime,seclabel"
TEST_FS_MOUNT_OPTS="-o rw,relatime,seclabel"
```

```
# lscpu
Architecture:          aarch64
CPU op-mode(s):        64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Vendor ID:             Cavium
BIOS Vendor ID:        QEMU
Model name:            ThunderX2 99xx
BIOS Model name:       virt-rhel7.6.0
Model:                 1
Thread(s) per core:    1
Core(s) per cluster:   4
Socket(s):             4
Cluster(s):            1
Stepping:              0x1
BogoMIPS:              400.00
Flags:                 fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics cpuid asimdrdm
NUMA:
  NUMA node(s):        1
  NUMA node0 CPU(s):   0-3
Vulnerabilities:
  Itlb multihit:       Not affected
  L1tf:                Not affected
  Mds:                 Not affected
  Meltdown:            Not affected
  Spec store bypass:   Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:          Mitigation; __user pointer sanitization
  Spectre v2:          Mitigation; Branch predictor hardening
  Srbds:               Not affected
  Tsx async abort:     Not affected

# getconf PAGESIZE
65536
```

Please let me know if there's other useful info I can provide.

Thanks,
Boyang

>
> Thank you!
>
> Roman
>
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-14  8:44 ` Boyang Xue
@ 2021-07-14  9:26 ` Jan Kara
  2021-07-14 16:22   ` Boyang Xue
  0 siblings, 1 reply; 21+ messages in thread

From: Jan Kara @ 2021-07-14 9:26 UTC (permalink / raw)
To: Boyang Xue; +Cc: Roman Gushchin, linux-fsdevel, Jan Kara

On Wed 14-07-21 16:44:33, Boyang Xue wrote:
> Hi Roman,
>
> On Wed, Jul 14, 2021 at 12:12 PM Roman Gushchin <guro@fb.com> wrote:
> >
> > On Wed, Jul 14, 2021 at 11:21:12AM +0800, Boyang Xue wrote:
> > > Hello,
> > >
> > > I'm not sure if this is the right place to report this bug, please
> > > correct me if I'm wrong.
> > >
> > > I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's
> > > running xfstests generic/256 on ext4 [1]. Looking at the call trace,
> > > it looks like the bug had been introduced by the commit
> > >
> > > c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes
> > >
> > > It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing
> > > was performed with the latest xfstests, and the bug can be reproduced
> > > on ext{2, 3, 4} with {1k, 2k, 4k} block sizes.
> >
> > Hello Boyang,
> >
> > thank you for the report!
> >
> > Do you know on which line the oops happens?
>
> I was trying to inspect the vmcore with crash utility, but
> unfortunately it doesn't work.

Thanks for report! Have you tried addr2line utility? Looking at the oops I
can see:

[ 4371.307867] pc : cleanup_offline_cgwbs_workfn+0x320/0x394

Which means there's probably heavy inlining going on (do you use LTO by
any chance?) because I don't think cleanup_offline_cgwbs_workfn() itself
would compile into ~1k of code (but I don't have much experience with
aarch64). Anyway, addr2line should tell us.

Also pasting oops into scripts/decodecode on aarch64 machine should tell
us more about where and why the kernel crashed.
								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash 2021-07-14 9:26 ` Jan Kara @ 2021-07-14 16:22 ` Boyang Xue 2021-07-14 23:46 ` Roman Gushchin 2021-07-15 2:35 ` Matthew Wilcox 0 siblings, 2 replies; 21+ messages in thread From: Boyang Xue @ 2021-07-14 16:22 UTC (permalink / raw) To: Jan Kara; +Cc: Roman Gushchin, linux-fsdevel Hi Jan, On Wed, Jul 14, 2021 at 5:26 PM Jan Kara <jack@suse.cz> wrote: > > On Wed 14-07-21 16:44:33, Boyang Xue wrote: > > Hi Roman, > > > > On Wed, Jul 14, 2021 at 12:12 PM Roman Gushchin <guro@fb.com> wrote: > > > > > > On Wed, Jul 14, 2021 at 11:21:12AM +0800, Boyang Xue wrote: > > > > Hello, > > > > > > > > I'm not sure if this is the right place to report this bug, please > > > > correct me if I'm wrong. > > > > > > > > I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's > > > > running xfstests generic/256 on ext4 [1]. Looking at the call trace, > > > > it looks like the bug had been introduced by the commit > > > > > > > > c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes > > > > > > > > It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing > > > > was performed with the latest xfstests, and the bug can be reproduced > > > > on ext{2, 3, 4} with {1k, 2k, 4k} block sizes. > > > > > > Hello Boyang, > > > > > > thank you for the report! > > > > > > Do you know on which line the oops happens? > > > > I was trying to inspect the vmcore with crash utility, but > > unfortunately it doesn't work. > > Thanks for report! Have you tried addr2line utility? Looking at the oops I > can see: Thanks for the tips! It's unclear to me that where to find the required address in the addr2line command line, i.e. 
addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux <what address here?>

But I have tried gdb like this:

```
# gdb /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
GNU gdb (GDB) Red Hat Enterprise Linux 10.1-14.el9
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "aarch64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux...
(gdb) list *(cleanup_offline_cgwbs_workfn+0x320)
0xffff8000102d6ddc is in cleanup_offline_cgwbs_workfn (./arch/arm64/include/asm/jump_label.h:38).
33      }
34
35      static __always_inline bool arch_static_branch_jump(struct static_key *key,
36                                                          bool branch)
37      {
38              asm_volatile_goto(
39              "1:     b               %l[l_yes]               \n\t"
40              "       .pushsection    __jump_table, \"aw\"    \n\t"
41              "       .align          3                       \n\t"
42              "       .long           1b - ., %l[l_yes] - .   \n\t"
(gdb)
```

I'm not sure if it's meaningful?

> [ 4371.307867] pc : cleanup_offline_cgwbs_workfn+0x320/0x394
>
> Which means there's probably heavy inlining going on (do you use LTO by
> any chance?) because I don't think cleanup_offline_cgwbs_workfn() itself
> would compile into ~1k of code (but I don't have much experience with
> aarch64). Anyway, addr2line should tell us.

Actually I built the kernel on an internal build service, so I don't know
much of the build details, like LTO.
>
> Also pasting oops into scripts/decodecode on aarch64 machine should tell
> us more about where and why the kernel crashed.

The output is:

```
# echo "Code: d63f0020 97f99963 17ffffa6 f8588263 (f9400061)" | /usr/src/kernels/5.14.0-0.rc1.15.bx.el9.aarch64/scripts/decodecode
Code: d63f0020 97f99963 17ffffa6 f8588263 (f9400061)
All code
========
   0:   d63f0020        blr     x1
   4:   97f99963        bl      0xffffffffffe66590
   8:   17ffffa6        b       0xfffffffffffffea0
   c:   f8588263        ldur    x3, [x19, #-120]
  10:*  f9400061        ldr     x1, [x3]        <-- trapping instruction

Code starting with the faulting instruction
===========================================
   0:   f9400061        ldr     x1, [x3]
```

>
> Honza
>
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
>

Thanks,
Boyang
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash 2021-07-14 16:22 ` Boyang Xue @ 2021-07-14 23:46 ` Roman Gushchin 2021-07-15 1:42 ` Boyang Xue 2021-07-15 2:35 ` Matthew Wilcox 1 sibling, 1 reply; 21+ messages in thread From: Roman Gushchin @ 2021-07-14 23:46 UTC (permalink / raw) To: Boyang Xue; +Cc: Jan Kara, linux-fsdevel On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote: > Hi Jan, > > On Wed, Jul 14, 2021 at 5:26 PM Jan Kara <jack@suse.cz> wrote: > > > > On Wed 14-07-21 16:44:33, Boyang Xue wrote: > > > Hi Roman, > > > > > > On Wed, Jul 14, 2021 at 12:12 PM Roman Gushchin <guro@fb.com> wrote: > > > > > > > > On Wed, Jul 14, 2021 at 11:21:12AM +0800, Boyang Xue wrote: > > > > > Hello, > > > > > > > > > > I'm not sure if this is the right place to report this bug, please > > > > > correct me if I'm wrong. > > > > > > > > > > I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's > > > > > running xfstests generic/256 on ext4 [1]. Looking at the call trace, > > > > > it looks like the bug had been introduced by the commit > > > > > > > > > > c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes > > > > > > > > > > It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing > > > > > was performed with the latest xfstests, and the bug can be reproduced > > > > > on ext{2, 3, 4} with {1k, 2k, 4k} block sizes. > > > > > > > > Hello Boyang, > > > > > > > > thank you for the report! > > > > > > > > Do you know on which line the oops happens? > > > > > > I was trying to inspect the vmcore with crash utility, but > > > unfortunately it doesn't work. > > > > Thanks for report! Have you tried addr2line utility? Looking at the oops I > > can see: > > Thanks for the tips! > > It's unclear to me that where to find the required address in the > addr2line command line, i.e. 
>
> addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> <what address here?>

You can use $nm <vmlinux> to get an address of cleanup_offline_cgwbs_workfn()
and then add 0x320.

Alternatively, maybe you can put the image you're using somewhere?

I'm working on getting my arm64 setup and reproduce the problem, but it takes
time, and I'm not sure I'll be able to reproduce it in qemu running on top of
x86.

Thanks!
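The symbol-plus-offset arithmetic Roman describes can be sketched directly in the shell. A minimal example using the values from this thread's oops (the base address comes from `nm vmlinux` or System.map, the offset from the `pc : cleanup_offline_cgwbs_workfn+0x320/0x394` line; substitute your own):

```shell
# Hypothetical values, taken from this thread's report:
base=0xffff8000102d6ab0    # cleanup_offline_cgwbs_workfn per System.map
offset=0x320               # from "pc : cleanup_offline_cgwbs_workfn+0x320/0x394"

# Compute the absolute address of the faulting instruction:
printf '%x\n' $((base + offset))    # -> ffff8000102d6dd0

# That address is what addr2line expects:
#   addr2line -e vmlinux ffff8000102d6dd0
```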
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash 2021-07-14 23:46 ` Roman Gushchin @ 2021-07-15 1:42 ` Boyang Xue 2021-07-15 9:31 ` Jan Kara 0 siblings, 1 reply; 21+ messages in thread From: Boyang Xue @ 2021-07-15 1:42 UTC (permalink / raw) To: Roman Gushchin; +Cc: Jan Kara, linux-fsdevel On Thu, Jul 15, 2021 at 7:46 AM Roman Gushchin <guro@fb.com> wrote: > > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote: > > Hi Jan, > > > > On Wed, Jul 14, 2021 at 5:26 PM Jan Kara <jack@suse.cz> wrote: > > > > > > On Wed 14-07-21 16:44:33, Boyang Xue wrote: > > > > Hi Roman, > > > > > > > > On Wed, Jul 14, 2021 at 12:12 PM Roman Gushchin <guro@fb.com> wrote: > > > > > > > > > > On Wed, Jul 14, 2021 at 11:21:12AM +0800, Boyang Xue wrote: > > > > > > Hello, > > > > > > > > > > > > I'm not sure if this is the right place to report this bug, please > > > > > > correct me if I'm wrong. > > > > > > > > > > > > I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's > > > > > > running xfstests generic/256 on ext4 [1]. Looking at the call trace, > > > > > > it looks like the bug had been introduced by the commit > > > > > > > > > > > > c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes > > > > > > > > > > > > It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing > > > > > > was performed with the latest xfstests, and the bug can be reproduced > > > > > > on ext{2, 3, 4} with {1k, 2k, 4k} block sizes. > > > > > > > > > > Hello Boyang, > > > > > > > > > > thank you for the report! > > > > > > > > > > Do you know on which line the oops happens? > > > > > > > > I was trying to inspect the vmcore with crash utility, but > > > > unfortunately it doesn't work. > > > > > > Thanks for report! Have you tried addr2line utility? Looking at the oops I > > > can see: > > > > Thanks for the tips! 
> > It's unclear to me that where to find the required address in the
> > addr2line command line, i.e.
> >
> > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > <what address here?>
>
> You can use $nm <vmlinux> to get an address of cleanup_offline_cgwbs_workfn()
> and then add 0x320.

Thanks! Hope the following helps:

# grep cleanup_offline_cgwbs_workfn /boot/System.map-5.14.0-0.rc1.15.bx.el9.aarch64
ffff8000102d6ab0 t cleanup_offline_cgwbs_workfn

## ffff8000102d6ab0+0x320=FFFF8000102D6DD0

# addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux FFFF8000102D6DD0
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2265

# vi /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h
```
arch_atomic64_fetch_add_unless(atomic64_t *v, s64 a, s64 u)
{
	s64 c = arch_atomic64_read(v);    <=== line#2265

	do {
		if (unlikely(c == u))
			break;
	} while (!arch_atomic64_try_cmpxchg(v, &c, c + a));

	return c;
}
```

# addr2line -i -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux FFFF8000102D6DD0
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2265
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2290
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-instrumented.h:1149
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-long.h:491
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:247
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:266
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:227
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:224
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c:679

# vi /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c
```
static void cleanup_offline_cgwbs_workfn(struct work_struct *work)
{
	struct bdi_writeback *wb;
	LIST_HEAD(processed);

	spin_lock_irq(&cgwb_lock);

	while (!list_empty(&offline_cgwbs)) {
		wb = list_first_entry(&offline_cgwbs, struct bdi_writeback,
				      offline_node);
		list_move(&wb->offline_node, &processed);

		/*
		 * If wb is dirty, cleaning up the writeback by switching
		 * attached inodes will result in an effective removal of any
		 * bandwidth restrictions, which isn't the goal. Instead,
		 * it can be postponed until the next time, when all io
		 * will be likely completed. If in the meantime some inodes
		 * will get re-dirtied, they should be eventually switched to
		 * a new cgwb.
		 */
		if (wb_has_dirty_io(wb))
			continue;

		if (!wb_tryget(wb))    <=== line#679
			continue;

		spin_unlock_irq(&cgwb_lock);
		while (cleanup_offline_cgwb(wb))
			cond_resched();
		spin_lock_irq(&cgwb_lock);

		wb_put(wb);
	}

	if (!list_empty(&processed))
		list_splice_tail(&processed, &offline_cgwbs);

	spin_unlock_irq(&cgwb_lock);
}
```

>
> Alternatively, maybe you can put the image you're using somewhere?

I put those rpms in the Google Drive
https://drive.google.com/drive/folders/1aw-WK2yWD11UWB059bJt6WKNW1OP_fex?usp=sharing

>
> I'm working on getting my arm64 setup and reproduce the problem, but it
> takes time, and I'm not sure I'll be able to reproduce it in qemu running
> on top of x86.

Thanks! It's only reproducible on aarch64 and ppc64le in my test. I'm happy
to help test a patch, if it would help.

>
> Thanks!
>
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash 2021-07-15 1:42 ` Boyang Xue @ 2021-07-15 9:31 ` Jan Kara 2021-07-15 16:04 ` Roman Gushchin 0 siblings, 1 reply; 21+ messages in thread From: Jan Kara @ 2021-07-15 9:31 UTC (permalink / raw) To: Boyang Xue; +Cc: Roman Gushchin, Jan Kara, linux-fsdevel On Thu 15-07-21 09:42:06, Boyang Xue wrote: > On Thu, Jul 15, 2021 at 7:46 AM Roman Gushchin <guro@fb.com> wrote: > > > > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote: > > > Hi Jan, > > > > > > On Wed, Jul 14, 2021 at 5:26 PM Jan Kara <jack@suse.cz> wrote: > > > > > > > > On Wed 14-07-21 16:44:33, Boyang Xue wrote: > > > > > Hi Roman, > > > > > > > > > > On Wed, Jul 14, 2021 at 12:12 PM Roman Gushchin <guro@fb.com> wrote: > > > > > > > > > > > > On Wed, Jul 14, 2021 at 11:21:12AM +0800, Boyang Xue wrote: > > > > > > > Hello, > > > > > > > > > > > > > > I'm not sure if this is the right place to report this bug, please > > > > > > > correct me if I'm wrong. > > > > > > > > > > > > > > I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's > > > > > > > running xfstests generic/256 on ext4 [1]. Looking at the call trace, > > > > > > > it looks like the bug had been introduced by the commit > > > > > > > > > > > > > > c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes > > > > > > > > > > > > > > It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing > > > > > > > was performed with the latest xfstests, and the bug can be reproduced > > > > > > > on ext{2, 3, 4} with {1k, 2k, 4k} block sizes. > > > > > > > > > > > > Hello Boyang, > > > > > > > > > > > > thank you for the report! > > > > > > > > > > > > Do you know on which line the oops happens? > > > > > > > > > > I was trying to inspect the vmcore with crash utility, but > > > > > unfortunately it doesn't work. > > > > > > > > Thanks for report! Have you tried addr2line utility? 
Looking at the oops I
> > > > can see:
> > >
> > > Thanks for the tips!
> > >
> > > It's unclear to me that where to find the required address in the
> > > addr2line command line, i.e.
> > >
> > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > <what address here?>
> >
> > You can use $nm <vmlinux> to get an address of cleanup_offline_cgwbs_workfn()
> > and then add 0x320.
>
> Thanks! Hope the following helps:

Thanks for the data!

> static void cleanup_offline_cgwbs_workfn(struct work_struct *work)
> {
> 	struct bdi_writeback *wb;
> 	LIST_HEAD(processed);
>
> 	spin_lock_irq(&cgwb_lock);
>
> 	while (!list_empty(&offline_cgwbs)) {
> 		wb = list_first_entry(&offline_cgwbs, struct bdi_writeback,
> 				      offline_node);
> 		list_move(&wb->offline_node, &processed);
>
> 		/*
> 		 * If wb is dirty, cleaning up the writeback by switching
> 		 * attached inodes will result in an effective removal of any
> 		 * bandwidth restrictions, which isn't the goal. Instead,
> 		 * it can be postponed until the next time, when all io
> 		 * will be likely completed. If in the meantime some inodes
> 		 * will get re-dirtied, they should be eventually switched to
> 		 * a new cgwb.
> 		 */
> 		if (wb_has_dirty_io(wb))
> 			continue;
>
> 		if (!wb_tryget(wb))    <=== line#679
> 			continue;

Aha, interesting. So it seems we crashed trying to dereference
wb->refcnt->data. So it looks like cgwb_release_workfn() raced with
cleanup_offline_cgwbs_workfn() and percpu_ref_exit() got called from
cgwb_release_workfn() and then cleanup_offline_cgwbs_workfn() called
wb_tryget().

I think the proper fix is to move:

	spin_lock_irq(&cgwb_lock);
	list_del(&wb->offline_node);
	spin_unlock_irq(&cgwb_lock);

in cgwb_release_workfn() to the beginning of that function so that we are
sure even cleanup_offline_cgwbs_workfn() cannot be working with the wb when
it is being released. Roman?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
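The reordering Jan proposes can be sketched as a diff against cgwb_release_workfn() in mm/backing-dev.c. This is illustrative only, not a tested patch: the rest of the function body is elided, and the exact surrounding context (and the `release_work` member name) is assumed from the 5.14-rc1 sources:

```diff
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ static void cgwb_release_workfn(struct work_struct *work)
 	struct bdi_writeback *wb = container_of(work, struct bdi_writeback,
 						release_work);
 
+	/* Take wb off offline_cgwbs before any teardown, so that
+	 * cleanup_offline_cgwbs_workfn() can no longer find it and call
+	 * wb_tryget() on a wb whose percpu ref is already being destroyed.
+	 */
+	spin_lock_irq(&cgwb_lock);
+	list_del(&wb->offline_node);
+	spin_unlock_irq(&cgwb_lock);
+
 	/* ... shutdown and the rest of the teardown elided ... */
 
-	spin_lock_irq(&cgwb_lock);
-	list_del(&wb->offline_node);
-	spin_unlock_irq(&cgwb_lock);
-
 	percpu_ref_exit(&wb->refcnt);
 	/* ... */
```

With the removal done first, by the time percpu_ref_exit() runs the cleanup worker can no longer reach this wb through the offline_cgwbs list, which closes the window Jan describes.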
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash 2021-07-15 9:31 ` Jan Kara @ 2021-07-15 16:04 ` Roman Gushchin 2021-07-16 1:37 ` Boyang Xue 0 siblings, 1 reply; 21+ messages in thread From: Roman Gushchin @ 2021-07-15 16:04 UTC (permalink / raw) To: Jan Kara, Boyang Xue; +Cc: linux-fsdevel On Thu, Jul 15, 2021 at 11:31:17AM +0200, Jan Kara wrote: > On Thu 15-07-21 09:42:06, Boyang Xue wrote: > > On Thu, Jul 15, 2021 at 7:46 AM Roman Gushchin <guro@fb.com> wrote: > > > > > > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote: > > > > Hi Jan, > > > > > > > > On Wed, Jul 14, 2021 at 5:26 PM Jan Kara <jack@suse.cz> wrote: > > > > > > > > > > On Wed 14-07-21 16:44:33, Boyang Xue wrote: > > > > > > Hi Roman, > > > > > > > > > > > > On Wed, Jul 14, 2021 at 12:12 PM Roman Gushchin <guro@fb.com> wrote: > > > > > > > > > > > > > > On Wed, Jul 14, 2021 at 11:21:12AM +0800, Boyang Xue wrote: > > > > > > > > Hello, > > > > > > > > > > > > > > > > I'm not sure if this is the right place to report this bug, please > > > > > > > > correct me if I'm wrong. > > > > > > > > > > > > > > > > I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's > > > > > > > > running xfstests generic/256 on ext4 [1]. Looking at the call trace, > > > > > > > > it looks like the bug had been introduced by the commit > > > > > > > > > > > > > > > > c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes > > > > > > > > > > > > > > > > It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing > > > > > > > > was performed with the latest xfstests, and the bug can be reproduced > > > > > > > > on ext{2, 3, 4} with {1k, 2k, 4k} block sizes. > > > > > > > > > > > > > > Hello Boyang, > > > > > > > > > > > > > > thank you for the report! > > > > > > > > > > > > > > Do you know on which line the oops happens? 
> > > > > > > > > > > > I was trying to inspect the vmcore with crash utility, but > > > > > > unfortunately it doesn't work. > > > > > > > > > > Thanks for report! Have you tried addr2line utility? Looking at the oops I > > > > > can see: > > > > > > > > Thanks for the tips! > > > > > > > > It's unclear to me that where to find the required address in the > > > > addr2line command line, i.e. > > > > > > > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux > > > > <what address here?> > > > > > > You can use $nm <vmlinux> to get an address of cleanup_offline_cgwbs_workfn() > > > and then add 0x320. > > > > Thanks! Hope the following helps: > > Thanks for the data! > > > static void cleanup_offline_cgwbs_workfn(struct work_struct *work) > > { > > struct bdi_writeback *wb; > > LIST_HEAD(processed); > > > > spin_lock_irq(&cgwb_lock); > > > > while (!list_empty(&offline_cgwbs)) { > > wb = list_first_entry(&offline_cgwbs, struct bdi_writeback, > > offline_node); > > list_move(&wb->offline_node, &processed); > > > > /* > > * If wb is dirty, cleaning up the writeback by switching > > * attached inodes will result in an effective removal of any > > * bandwidth restrictions, which isn't the goal. Instead, > > * it can be postponed until the next time, when all io > > * will be likely completed. If in the meantime some inodes > > * will get re-dirtied, they should be eventually switched to > > * a new cgwb. > > */ > > if (wb_has_dirty_io(wb)) > > continue; > > > > if (!wb_tryget(wb)) <=== line#679 > > continue; > > Aha, interesting. So it seems we crashed trying to dereference > wb->refcnt->data. So it looks like cgwb_release_workfn() raced with > cleanup_offline_cgwbs_workfn() and percpu_ref_exit() got called from > cgwb_release_workfn() and then cleanup_offline_cgwbs_workfn() called > wb_tryget(). 
I think the proper fix is to move: > > spin_lock_irq(&cgwb_lock); > list_del(&wb->offline_node); > spin_unlock_irq(&cgwb_lock); > > in cgwb_release_workfn() to the beginning of that function so that we are > sure even cleanup_offline_cgwbs_workfn() cannot be working with the wb when > it is being released. Roman? Yes, it sounds like the most reasonable explanation. Thank you! Boyang, would you mind testing the following patch? diff --git a/mm/backing-dev.c b/mm/backing-dev.c index 271f2ca862c8..f5561ea7d90a 100644 --- a/mm/backing-dev.c +++ b/mm/backing-dev.c @@ -398,12 +398,12 @@ static void cgwb_release_workfn(struct work_struct *work) blkcg_unpin_online(blkcg); fprop_local_destroy_percpu(&wb->memcg_completions); - percpu_ref_exit(&wb->refcnt); spin_lock_irq(&cgwb_lock); list_del(&wb->offline_node); spin_unlock_irq(&cgwb_lock); + percpu_ref_exit(&wb->refcnt); wb_exit(wb); WARN_ON_ONCE(!list_empty(&wb->b_attached)); kfree_rcu(wb, rcu); ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash 2021-07-15 16:04 ` Roman Gushchin @ 2021-07-16 1:37 ` Boyang Xue 0 siblings, 0 replies; 21+ messages in thread From: Boyang Xue @ 2021-07-16 1:37 UTC (permalink / raw) To: Roman Gushchin; +Cc: Jan Kara, linux-fsdevel On Fri, Jul 16, 2021 at 12:05 AM Roman Gushchin <guro@fb.com> wrote: > > On Thu, Jul 15, 2021 at 11:31:17AM +0200, Jan Kara wrote: > > On Thu 15-07-21 09:42:06, Boyang Xue wrote: > > > On Thu, Jul 15, 2021 at 7:46 AM Roman Gushchin <guro@fb.com> wrote: > > > > > > > > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote: > > > > > Hi Jan, > > > > > > > > > > On Wed, Jul 14, 2021 at 5:26 PM Jan Kara <jack@suse.cz> wrote: > > > > > > > > > > > > On Wed 14-07-21 16:44:33, Boyang Xue wrote: > > > > > > > Hi Roman, > > > > > > > > > > > > > > On Wed, Jul 14, 2021 at 12:12 PM Roman Gushchin <guro@fb.com> wrote: > > > > > > > > > > > > > > > > On Wed, Jul 14, 2021 at 11:21:12AM +0800, Boyang Xue wrote: > > > > > > > > > Hello, > > > > > > > > > > > > > > > > > > I'm not sure if this is the right place to report this bug, please > > > > > > > > > correct me if I'm wrong. > > > > > > > > > > > > > > > > > > I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's > > > > > > > > > running xfstests generic/256 on ext4 [1]. Looking at the call trace, > > > > > > > > > it looks like the bug had been introduced by the commit > > > > > > > > > > > > > > > > > > c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes > > > > > > > > > > > > > > > > > > It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing > > > > > > > > > was performed with the latest xfstests, and the bug can be reproduced > > > > > > > > > on ext{2, 3, 4} with {1k, 2k, 4k} block sizes. > > > > > > > > > > > > > > > > Hello Boyang, > > > > > > > > > > > > > > > > thank you for the report! 
> > > > > > > > > > > > > > > > Do you know on which line the oops happens? > > > > > > > > > > > > > > I was trying to inspect the vmcore with crash utility, but > > > > > > > unfortunately it doesn't work. > > > > > > > > > > > > Thanks for report! Have you tried addr2line utility? Looking at the oops I > > > > > > can see: > > > > > > > > > > Thanks for the tips! > > > > > > > > > > It's unclear to me that where to find the required address in the > > > > > addr2line command line, i.e. > > > > > > > > > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux > > > > > <what address here?> > > > > > > > > You can use $nm <vmlinux> to get an address of cleanup_offline_cgwbs_workfn() > > > > and then add 0x320. > > > > > > Thanks! Hope the following helps: > > > > Thanks for the data! > > > > > static void cleanup_offline_cgwbs_workfn(struct work_struct *work) > > > { > > > struct bdi_writeback *wb; > > > LIST_HEAD(processed); > > > > > > spin_lock_irq(&cgwb_lock); > > > > > > while (!list_empty(&offline_cgwbs)) { > > > wb = list_first_entry(&offline_cgwbs, struct bdi_writeback, > > > offline_node); > > > list_move(&wb->offline_node, &processed); > > > > > > /* > > > * If wb is dirty, cleaning up the writeback by switching > > > * attached inodes will result in an effective removal of any > > > * bandwidth restrictions, which isn't the goal. Instead, > > > * it can be postponed until the next time, when all io > > > * will be likely completed. If in the meantime some inodes > > > * will get re-dirtied, they should be eventually switched to > > > * a new cgwb. > > > */ > > > if (wb_has_dirty_io(wb)) > > > continue; > > > > > > if (!wb_tryget(wb)) <=== line#679 > > > continue; > > > > Aha, interesting. So it seems we crashed trying to dereference > > wb->refcnt->data. 
So it looks like cgwb_release_workfn() raced with > > cleanup_offline_cgwbs_workfn() and percpu_ref_exit() got called from > > cgwb_release_workfn() and then cleanup_offline_cgwbs_workfn() called > > wb_tryget(). I think the proper fix is to move: > > > > spin_lock_irq(&cgwb_lock); > > list_del(&wb->offline_node); > > spin_unlock_irq(&cgwb_lock); > > > > in cgwb_release_workfn() to the beginning of that function so that we are > > sure even cleanup_offline_cgwbs_workfn() cannot be working with the wb when > > it is being released. Roman? > > Yes, it sounds like the most reasonable explanation. > Thank you! > > Boyang, would you mind to test the following patch? No problem. I'm testing it. Thanks for the patch. > > diff --git a/mm/backing-dev.c b/mm/backing-dev.c > index 271f2ca862c8..f5561ea7d90a 100644 > --- a/mm/backing-dev.c > +++ b/mm/backing-dev.c > @@ -398,12 +398,12 @@ static void cgwb_release_workfn(struct work_struct *work) > blkcg_unpin_online(blkcg); > > fprop_local_destroy_percpu(&wb->memcg_completions); > - percpu_ref_exit(&wb->refcnt); > > spin_lock_irq(&cgwb_lock); > list_del(&wb->offline_node); > spin_unlock_irq(&cgwb_lock); > > + percpu_ref_exit(&wb->refcnt); > wb_exit(wb); > WARN_ON_ONCE(!list_empty(&wb->b_attached)); > kfree_rcu(wb, rcu); > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash 2021-07-14 16:22 ` Boyang Xue 2021-07-14 23:46 ` Roman Gushchin @ 2021-07-15 2:35 ` Matthew Wilcox 2021-07-15 3:51 ` Boyang Xue 1 sibling, 1 reply; 21+ messages in thread From: Matthew Wilcox @ 2021-07-15 2:35 UTC (permalink / raw) To: Boyang Xue; +Cc: Jan Kara, Roman Gushchin, linux-fsdevel On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote: > It's unclear to me that where to find the required address in the > addr2line command line, i.e. > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux > <what address here?> ./scripts/faddr2line /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux cleanup_offline_cgwbs_workfn+0x320/0x394 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash 2021-07-15 2:35 ` Matthew Wilcox @ 2021-07-15 3:51 ` Boyang Xue 2021-07-15 17:10 ` Darrick J. Wong 0 siblings, 1 reply; 21+ messages in thread From: Boyang Xue @ 2021-07-15 3:51 UTC (permalink / raw) To: Matthew Wilcox; +Cc: Jan Kara, Roman Gushchin, linux-fsdevel On Thu, Jul 15, 2021 at 10:36 AM Matthew Wilcox <willy@infradead.org> wrote: > > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote: > > It's unclear to me that where to find the required address in the > > addr2line command line, i.e. > > > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux > > <what address here?> > > ./scripts/faddr2line /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux cleanup_offline_cgwbs_workfn+0x320/0x394 > Thanks! The result is the same as the addr2line -i -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux FFFF8000102D6DD0 But this script is very handy. 
# /usr/src/kernels/5.14.0-0.rc1.15.bx.el9.aarch64/scripts/faddr2line /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux cleanup_offline_cgwbs_workfn+0x320/0x394 cleanup_offline_cgwbs_workfn+0x320/0x394: arch_atomic64_fetch_add_unless at /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2265 (inlined by) arch_atomic64_add_unless at /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2290 (inlined by) atomic64_add_unless at /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-instrumented.h:1149 (inlined by) atomic_long_add_unless at /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-long.h:491 (inlined by) percpu_ref_tryget_many at /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:247 (inlined by) percpu_ref_tryget at /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:266 (inlined by) wb_tryget at /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:227 (inlined by) wb_tryget at /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:224 (inlined by) cleanup_offline_cgwbs_workfn at /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c:679 # vi /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c ``` static void cleanup_offline_cgwbs_workfn(struct work_struct *work) { struct bdi_writeback *wb; LIST_HEAD(processed); spin_lock_irq(&cgwb_lock); while (!list_empty(&offline_cgwbs)) { wb = list_first_entry(&offline_cgwbs, struct bdi_writeback, offline_node); list_move(&wb->offline_node, &processed); /* 
* If wb is dirty, cleaning up the writeback by switching * attached inodes will result in an effective removal of any * bandwidth restrictions, which isn't the goal. Instead, * it can be postponed until the next time, when all io * will be likely completed. If in the meantime some inodes * will get re-dirtied, they should be eventually switched to * a new cgwb. */ if (wb_has_dirty_io(wb)) continue; if (!wb_tryget(wb)) <=== line#679 continue; spin_unlock_irq(&cgwb_lock); while (cleanup_offline_cgwb(wb)) cond_resched(); spin_lock_irq(&cgwb_lock); wb_put(wb); } if (!list_empty(&processed)) list_splice_tail(&processed, &offline_cgwbs); spin_unlock_irq(&cgwb_lock); } ``` BTW, this bug can only be reproduced on a non-debug production kernel build (a.k.a. the kernel rpm package); it's not reproducible on a debug build with various debug configurations enabled (a.k.a. the kernel-debug rpm package) ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash 2021-07-15 3:51 ` Boyang Xue @ 2021-07-15 17:10 ` Darrick J. Wong 2021-07-15 20:08 ` Roman Gushchin 0 siblings, 1 reply; 21+ messages in thread From: Darrick J. Wong @ 2021-07-15 17:10 UTC (permalink / raw) To: Boyang Xue; +Cc: Matthew Wilcox, Jan Kara, Roman Gushchin, linux-fsdevel On Thu, Jul 15, 2021 at 11:51:50AM +0800, Boyang Xue wrote: > On Thu, Jul 15, 2021 at 10:36 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote: > > > It's unclear to me that where to find the required address in the > > > addr2line command line, i.e. > > > > > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux > > > <what address here?> > > > > ./scripts/faddr2line /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux cleanup_offline_cgwbs_workfn+0x320/0x394 > > > > Thanks! The result is the same as the > > addr2line -i -e > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux > FFFF8000102D6DD0 > > But this script is very handy. 
> > # /usr/src/kernels/5.14.0-0.rc1.15.bx.el9.aarch64/scripts/faddr2line > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux > cleanup_offlin > e_cgwbs_workfn+0x320/0x394 > cleanup_offline_cgwbs_workfn+0x320/0x394: > arch_atomic64_fetch_add_unless at > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2265 > (inlined by) arch_atomic64_add_unless at > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2290 > (inlined by) atomic64_add_unless at > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-instrumented.h:1149 > (inlined by) atomic_long_add_unless at > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-long.h:491 > (inlined by) percpu_ref_tryget_many at > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:247 > (inlined by) percpu_ref_tryget at > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:266 > (inlined by) wb_tryget at > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:227 > (inlined by) wb_tryget at > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:224 > (inlined by) cleanup_offline_cgwbs_workfn at > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c:679 > > # vi /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c > ``` > static void cleanup_offline_cgwbs_workfn(struct work_struct *work) > { > struct bdi_writeback *wb; > LIST_HEAD(processed); > > spin_lock_irq(&cgwb_lock); > > while (!list_empty(&offline_cgwbs)) { > wb = list_first_entry(&offline_cgwbs, struct 
bdi_writeback, > offline_node); > list_move(&wb->offline_node, &processed); > > /* > * If wb is dirty, cleaning up the writeback by switching > * attached inodes will result in an effective removal of any > * bandwidth restrictions, which isn't the goal. Instead, > * it can be postponed until the next time, when all io > * will be likely completed. If in the meantime some inodes > * will get re-dirtied, they should be eventually switched to > * a new cgwb. > */ > if (wb_has_dirty_io(wb)) > continue; > > if (!wb_tryget(wb)) <=== line#679 > continue; > > spin_unlock_irq(&cgwb_lock); > while (cleanup_offline_cgwb(wb)) > cond_resched(); > spin_lock_irq(&cgwb_lock); > > wb_put(wb); > } > > if (!list_empty(&processed)) > list_splice_tail(&processed, &offline_cgwbs); > > spin_unlock_irq(&cgwb_lock); > } > ``` > > BTW, this bug can be only reproduced on a non-debug production built > kernel (a.k.a kernel rpm package), it's not reproducible on a debug > build with various debug configuration enabled (a.k.a kernel-debug rpm > package) FWIW I've also seen this regularly on x86_64 kernels on ext4 with all default mkfs settings when running generic/256. # FSTYP=ext4 MOUNT_OPTIONS="-o acl,user_xattr," ./check FSTYP -- ext4 PLATFORM -- Linux/x86_64 flax-mtr00 5.14.0-rc1-xfsx #rc1 SMP PREEMPT Wed Jul 14 17:36:18 PDT 2021 MKFS_OPTIONS -- /dev/sdf MOUNT_OPTIONS -- -o acl,user_xattr, /dev/sdf /opt generic/256 Message from syslogd@flax-mtr00 at Jul 15 09:58:14 ... kernel:[ 2508.987522] Dumping ftrace buffer: And the dmesg looks like: run fstests generic/256 at 2021-07-15 09:56:34 EXT4-fs (sdf): mounted filesystem with ordered data mode. Opts: acl,user_xattr. Quota mode: none. 
BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 1 PID: 108604 Comm: u9:3 Not tainted 5.14.0-rc1-xfsx #rc1 486fb938eb99d57e79080268009b49f63f777aec Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Workqueue: events_unbound cleanup_offline_cgwbs_workfn RIP: 0010:cleanup_offline_cgwbs_workfn+0x1ef/0x220 Code: ff ff f0 48 83 28 01 0f 85 55 ff ff ff 48 8b 83 60 ff ff ff 48 8d bb 58 ff ff ff ff 50 08 e9 3f ff ff ff 48 8b 93 60 ff ff ff <48> 8b 02 48 85 c0 0f 84 2c ff ff ff 48 8d 48 01 f0 48 0f b1 0a 75 RSP: 0018:ffffc9000278be60 EFLAGS: 00010006 RAX: 0000000000000003 RBX: ffff888282dc0b30 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffc9000278be60 RDI: ffff888282dc0b30 RBP: ffff888282dc0800 R08: ffff88828006af30 R09: ffff88828006af30 R10: 000000000000000f R11: 000000000000000f R12: ffffc9000278be60 R13: ffff8881000d6800 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff888277d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000102262003 CR4: 00000000001706a0 Call Trace: process_one_work+0x1dd/0x3c0 worker_thread+0x53/0x3c0 ? rescuer_thread+0x390/0x390 kthread+0x149/0x170 ? set_kthread_struct+0x40/0x40 ret_from_fork+0x1f/0x30 Modules linked in: ext2 ext4 jbd2 dm_flakey mbcache xfs libcrc32c ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_tcpudp ip_set_hash_ip ip_set_hash_net xt_set ip_set_hash_mac ip_set nfnetlink ip6table_filter ip6_tables bfq iptable_filter pvpanic_mmio pvpanic sch_fq_codel ip_tables x_tables overlay nfsv4 af_packet [last unloaded: jbd2] Dumping ftrace buffer: (ftrace buffer empty) CR2: 0000000000000000 ---[ end trace 242113b767739fb9 ]--- The faddr2line output points at the same line of code. --D ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash 2021-07-15 17:10 ` Darrick J. Wong @ 2021-07-15 20:08 ` Roman Gushchin 2021-07-15 22:28 ` Darrick J. Wong 0 siblings, 1 reply; 21+ messages in thread From: Roman Gushchin @ 2021-07-15 20:08 UTC (permalink / raw) To: Darrick J. Wong; +Cc: Boyang Xue, Matthew Wilcox, Jan Kara, linux-fsdevel On Thu, Jul 15, 2021 at 10:10:50AM -0700, Darrick J. Wong wrote: > On Thu, Jul 15, 2021 at 11:51:50AM +0800, Boyang Xue wrote: > > On Thu, Jul 15, 2021 at 10:36 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > > > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote: > > > > It's unclear to me that where to find the required address in the > > > > addr2line command line, i.e. > > > > > > > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux > > > > <what address here?> > > > > > > ./scripts/faddr2line /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux cleanup_offline_cgwbs_workfn+0x320/0x394 > > > > > > > Thanks! The result is the same as the > > > > addr2line -i -e > > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux > > FFFF8000102D6DD0 > > > > But this script is very handy. 
> > > > # /usr/src/kernels/5.14.0-0.rc1.15.bx.el9.aarch64/scripts/faddr2line > > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux > > cleanup_offlin > > e_cgwbs_workfn+0x320/0x394 > > cleanup_offline_cgwbs_workfn+0x320/0x394: > > arch_atomic64_fetch_add_unless at > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2265 > > (inlined by) arch_atomic64_add_unless at > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2290 > > (inlined by) atomic64_add_unless at > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-instrumented.h:1149 > > (inlined by) atomic_long_add_unless at > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-long.h:491 > > (inlined by) percpu_ref_tryget_many at > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:247 > > (inlined by) percpu_ref_tryget at > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:266 > > (inlined by) wb_tryget at > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:227 > > (inlined by) wb_tryget at > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:224 > > (inlined by) cleanup_offline_cgwbs_workfn at > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c:679 > > > > # vi /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c > > ``` > > static void cleanup_offline_cgwbs_workfn(struct work_struct *work) > > { > > struct bdi_writeback *wb; > > LIST_HEAD(processed); > > > > spin_lock_irq(&cgwb_lock); > > > > while 
(!list_empty(&offline_cgwbs)) { > > wb = list_first_entry(&offline_cgwbs, struct bdi_writeback, > > offline_node); > > list_move(&wb->offline_node, &processed); > > > > /* > > * If wb is dirty, cleaning up the writeback by switching > > * attached inodes will result in an effective removal of any > > * bandwidth restrictions, which isn't the goal. Instead, > > * it can be postponed until the next time, when all io > > * will be likely completed. If in the meantime some inodes > > * will get re-dirtied, they should be eventually switched to > > * a new cgwb. > > */ > > if (wb_has_dirty_io(wb)) > > continue; > > > > if (!wb_tryget(wb)) <=== line#679 > > continue; > > > > spin_unlock_irq(&cgwb_lock); > > while (cleanup_offline_cgwb(wb)) > > cond_resched(); > > spin_lock_irq(&cgwb_lock); > > > > wb_put(wb); > > } > > > > if (!list_empty(&processed)) > > list_splice_tail(&processed, &offline_cgwbs); > > > > spin_unlock_irq(&cgwb_lock); > > } > > ``` > > > > BTW, this bug can be only reproduced on a non-debug production built > > kernel (a.k.a kernel rpm package), it's not reproducible on a debug > > build with various debug configuration enabled (a.k.a kernel-debug rpm > > package) > > FWIW I've also seen this regularly on x86_64 kernels on ext4 with all > default mkfs settings when running generic/256. Oh, that's useful information, thank you! Btw, would you mind giving the patch from an earlier message in the thread a test? I'd highly appreciate it. Thanks! ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash 2021-07-15 20:08 ` Roman Gushchin @ 2021-07-15 22:28 ` Darrick J. Wong 2021-07-16 16:23 ` Darrick J. Wong 0 siblings, 1 reply; 21+ messages in thread From: Darrick J. Wong @ 2021-07-15 22:28 UTC (permalink / raw) To: Roman Gushchin; +Cc: Boyang Xue, Matthew Wilcox, Jan Kara, linux-fsdevel On Thu, Jul 15, 2021 at 01:08:15PM -0700, Roman Gushchin wrote: > On Thu, Jul 15, 2021 at 10:10:50AM -0700, Darrick J. Wong wrote: > > On Thu, Jul 15, 2021 at 11:51:50AM +0800, Boyang Xue wrote: > > > On Thu, Jul 15, 2021 at 10:36 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > > > > > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote: > > > > > It's unclear to me that where to find the required address in the > > > > > addr2line command line, i.e. > > > > > > > > > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux > > > > > <what address here?> > > > > > > > > ./scripts/faddr2line /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux cleanup_offline_cgwbs_workfn+0x320/0x394 > > > > > > > > > > Thanks! The result is the same as the > > > > > > addr2line -i -e > > > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux > > > FFFF8000102D6DD0 > > > > > > But this script is very handy. 
> > > > > > # /usr/src/kernels/5.14.0-0.rc1.15.bx.el9.aarch64/scripts/faddr2line > > > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux > > > cleanup_offlin > > > e_cgwbs_workfn+0x320/0x394 > > > cleanup_offline_cgwbs_workfn+0x320/0x394: > > > arch_atomic64_fetch_add_unless at > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2265 > > > (inlined by) arch_atomic64_add_unless at > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2290 > > > (inlined by) atomic64_add_unless at > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-instrumented.h:1149 > > > (inlined by) atomic_long_add_unless at > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-long.h:491 > > > (inlined by) percpu_ref_tryget_many at > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:247 > > > (inlined by) percpu_ref_tryget at > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:266 > > > (inlined by) wb_tryget at > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:227 > > > (inlined by) wb_tryget at > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:224 > > > (inlined by) cleanup_offline_cgwbs_workfn at > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c:679 > > > > > > # vi /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c > > > ``` > > > static void cleanup_offline_cgwbs_workfn(struct work_struct *work) > > > { > > > struct bdi_writeback *wb; > > > LIST_HEAD(processed); 
> > > > > > spin_lock_irq(&cgwb_lock); > > > > > > while (!list_empty(&offline_cgwbs)) { > > > wb = list_first_entry(&offline_cgwbs, struct bdi_writeback, > > > offline_node); > > > list_move(&wb->offline_node, &processed); > > > > > > /* > > > * If wb is dirty, cleaning up the writeback by switching > > > * attached inodes will result in an effective removal of any > > > * bandwidth restrictions, which isn't the goal. Instead, > > > * it can be postponed until the next time, when all io > > > * will be likely completed. If in the meantime some inodes > > > * will get re-dirtied, they should be eventually switched to > > > * a new cgwb. > > > */ > > > if (wb_has_dirty_io(wb)) > > > continue; > > > > > > if (!wb_tryget(wb)) <=== line#679 > > > continue; > > > > > > spin_unlock_irq(&cgwb_lock); > > > while (cleanup_offline_cgwb(wb)) > > > cond_resched(); > > > spin_lock_irq(&cgwb_lock); > > > > > > wb_put(wb); > > > } > > > > > > if (!list_empty(&processed)) > > > list_splice_tail(&processed, &offline_cgwbs); > > > > > > spin_unlock_irq(&cgwb_lock); > > > } > > > ``` > > > > > > BTW, this bug can be only reproduced on a non-debug production built > > > kernel (a.k.a kernel rpm package), it's not reproducible on a debug > > > build with various debug configuration enabled (a.k.a kernel-debug rpm > > > package) > > > > FWIW I've also seen this regularly on x86_64 kernels on ext4 with all > > default mkfs settings when running generic/256. > > Oh, that's a useful information, thank you! > > Btw, would you mind to give a patch from an earlier message in the thread > a test? I'd highly appreciate it. > > Thanks! Will do. --D ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-15 22:28 ` Darrick J. Wong
@ 2021-07-16 16:23   ` Darrick J. Wong
  2021-07-16 20:03     ` Roman Gushchin
  0 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2021-07-16 16:23 UTC (permalink / raw)
To: Roman Gushchin; +Cc: Boyang Xue, Matthew Wilcox, Jan Kara, linux-fsdevel

On Thu, Jul 15, 2021 at 03:28:12PM -0700, Darrick J. Wong wrote:
> On Thu, Jul 15, 2021 at 01:08:15PM -0700, Roman Gushchin wrote:
> > On Thu, Jul 15, 2021 at 10:10:50AM -0700, Darrick J. Wong wrote:
> > > On Thu, Jul 15, 2021 at 11:51:50AM +0800, Boyang Xue wrote:
> > > > On Thu, Jul 15, 2021 at 10:36 AM Matthew Wilcox <willy@infradead.org> wrote:
> > > > >
> > > > > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote:
> > > > > > It's unclear to me where to find the required address for the
> > > > > > addr2line command line, i.e.
> > > > > >
> > > > > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > > > > <what address here?>
> > > > >
> > > > > ./scripts/faddr2line /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux cleanup_offline_cgwbs_workfn+0x320/0x394
> > > >
> > > > Thanks! The result is the same as
> > > >
> > > > addr2line -i -e
> > > > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > > FFFF8000102D6DD0
> > > >
> > > > But this script is very handy.
> > > >
> > > > # /usr/src/kernels/5.14.0-0.rc1.15.bx.el9.aarch64/scripts/faddr2line
> > > > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > > cleanup_offline_cgwbs_workfn+0x320/0x394
> > > > cleanup_offline_cgwbs_workfn+0x320/0x394:
> > > > arch_atomic64_fetch_add_unless at
> > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2265
> > > > (inlined by) arch_atomic64_add_unless at
> > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2290
> > > > (inlined by) atomic64_add_unless at
> > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-instrumented.h:1149
> > > > (inlined by) atomic_long_add_unless at
> > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-long.h:491
> > > > (inlined by) percpu_ref_tryget_many at
> > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:247
> > > > (inlined by) percpu_ref_tryget at
> > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:266
> > > > (inlined by) wb_tryget at
> > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:227
> > > > (inlined by) wb_tryget at
> > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:224
> > > > (inlined by) cleanup_offline_cgwbs_workfn at
> > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c:679
> > > >
> > > > # vi /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c
> > > > ```
> > > > static void cleanup_offline_cgwbs_workfn(struct work_struct *work)
> > > > {
> > > > [... full function as quoted above; wb_tryget() is mm/backing-dev.c:679 ...]
> > > > }
> > > > ```
> > > >
> > > > BTW, this bug can only be reproduced on a non-debug production
> > > > kernel build (a.k.a. the kernel rpm package); it's not reproducible
> > > > on a debug build with various debug configurations enabled (a.k.a.
> > > > the kernel-debug rpm package).
> > >
> > > FWIW I've also seen this regularly on x86_64 kernels on ext4 with all
> > > default mkfs settings when running generic/256.
> >
> > Oh, that's useful information, thank you!
> >
> > Btw, would you mind giving the patch from an earlier message in the
> > thread a test? I'd highly appreciate it.
> >
> > Thanks!
>
> Will do.

fstests passed here, so

Tested-by: Darrick J. Wong <djwong@kernel.org>

--D

>
> --D

^ permalink raw reply	[flat|nested] 21+ messages in thread
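To make the control flow of the quoted worker loop easier to follow, the list handling can be sketched in user-space Python (a hedged model with invented names: a deque stands in for the kernel's linked lists, and the locking and the real cleanup_offline_cgwb() work are omitted): each writeback is moved onto a local `processed` list, dirty ones are postponed, and everything is spliced back at the end so skipped entries are retried on a later pass.

```python
from collections import deque

def cleanup_pass(offline, is_dirty):
    """Sketch of the list handling in cleanup_offline_cgwbs_workfn():
    every wb is moved from the offline list to a local 'processed'
    list (list_first_entry + list_move); dirty ones are skipped for
    this pass; at the end 'processed' is spliced back onto the
    offline list (list_splice_tail) so they can be retried later."""
    processed = deque()
    cleaned = []
    while offline:
        wb = offline.popleft()    # list_first_entry + list_move
        processed.append(wb)
        if is_dirty(wb):
            continue              # postpone dirty writebacks
        cleaned.append(wb)        # stands in for cleanup_offline_cgwb()
    offline.extend(processed)     # list_splice_tail(&processed, &offline_cgwbs)
    return cleaned

offline = deque(["wb0", "wb1", "wb2"])
cleaned = cleanup_pass(offline, is_dirty=lambda wb: wb == "wb1")
# cleaned is ["wb0", "wb2"]; "wb1" stays queued for the next pass
```

Note that in the real function the `continue` statements are what make `wb_tryget()` the only guard between the list walk and a writeback whose last reference may already be gone, which is where the reported oops fires.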
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-16 16:23 ` Darrick J. Wong
@ 2021-07-16 20:03   ` Roman Gushchin
  2021-07-17 12:00     ` Boyang Xue
  0 siblings, 1 reply; 21+ messages in thread
From: Roman Gushchin @ 2021-07-16 20:03 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: Boyang Xue, Matthew Wilcox, Jan Kara, linux-fsdevel

On Fri, Jul 16, 2021 at 09:23:40AM -0700, Darrick J. Wong wrote:
> On Thu, Jul 15, 2021 at 03:28:12PM -0700, Darrick J. Wong wrote:
> > On Thu, Jul 15, 2021 at 01:08:15PM -0700, Roman Gushchin wrote:
> > > [... same quoted faddr2line output and cleanup_offline_cgwbs_workfn()
> > > listing as in the messages above ...]
> >
> > Will do.
>
> fstests passed here, so
>
> Tested-by: Darrick J. Wong <djwong@kernel.org>

Great, thank you!

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-16 20:03 ` Roman Gushchin
@ 2021-07-17 12:00   ` Boyang Xue
  2021-07-22  5:29     ` Boyang Xue
  0 siblings, 1 reply; 21+ messages in thread
From: Boyang Xue @ 2021-07-17 12:00 UTC (permalink / raw)
To: Roman Gushchin; +Cc: Darrick J. Wong, Matthew Wilcox, Jan Kara, linux-fsdevel

Testing fstests on aarch64, x86_64 and s390x all passed. There's a
shortage of ppc64le systems, so I can't provide the ppc64le test
result for now, but I hope I can report it next week.

Thanks,
Boyang

On Sat, Jul 17, 2021 at 4:04 AM Roman Gushchin <guro@fb.com> wrote:
>
> On Fri, Jul 16, 2021 at 09:23:40AM -0700, Darrick J. Wong wrote:
> > [... same quoted faddr2line output and cleanup_offline_cgwbs_workfn()
> > listing as in the messages above ...]
> >
> > fstests passed here, so
> >
> > Tested-by: Darrick J. Wong <djwong@kernel.org>
>
> Great, thank you!

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-17 12:00 ` Boyang Xue
@ 2021-07-22  5:29   ` Boyang Xue
  2021-07-22  5:41     ` Roman Gushchin
  0 siblings, 1 reply; 21+ messages in thread
From: Boyang Xue @ 2021-07-22 5:29 UTC (permalink / raw)
To: Roman Gushchin; +Cc: Darrick J. Wong, Matthew Wilcox, Jan Kara, linux-fsdevel

Just FYI, the tests on ppc64le are done, with no more kernel panics, so
my tests on all arches pass now.

On Sat, Jul 17, 2021 at 8:00 PM Boyang Xue <bxue@redhat.com> wrote:
>
> Testing fstests on aarch64, x86_64 and s390x all passed. There's a
> shortage of ppc64le systems, so I can't provide the ppc64le test
> result for now, but I hope I can report it next week.
>
> Thanks,
> Boyang
>
> [... same quoted faddr2line output and cleanup_offline_cgwbs_workfn()
> listing as in the messages above ...]

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-22  5:29 ` Boyang Xue
@ 2021-07-22  5:41   ` Roman Gushchin
  0 siblings, 0 replies; 21+ messages in thread
From: Roman Gushchin @ 2021-07-22 5:41 UTC (permalink / raw)
To: Boyang Xue; +Cc: Darrick J. Wong, Matthew Wilcox, Jan Kara, linux-fsdevel

Thank you very much for testing it!

Sent from my iPhone

> On Jul 21, 2021, at 22:29, Boyang Xue <bxue@redhat.com> wrote:
>
> Just FYI, the tests on ppc64le are done, with no more kernel panics, so
> my tests on all arches pass now.
>
> [... same quoted test results, faddr2line output and
> cleanup_offline_cgwbs_workfn() listing as in the messages above ...]

^ permalink raw reply	[flat|nested] 21+ messages in thread
end of thread, other threads: [~2021-07-22  5:42 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-14  3:21 Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash Boyang Xue
2021-07-14  3:57 ` Boyang Xue
2021-07-14  4:11 ` Roman Gushchin
2021-07-14  8:44 ` Boyang Xue
2021-07-14  9:26 ` Jan Kara
2021-07-14 16:22 ` Boyang Xue
2021-07-14 23:46 ` Roman Gushchin
2021-07-15  1:42 ` Boyang Xue
2021-07-15  9:31 ` Jan Kara
2021-07-15 16:04 ` Roman Gushchin
2021-07-16  1:37 ` Boyang Xue
2021-07-15  2:35 ` Matthew Wilcox
2021-07-15  3:51 ` Boyang Xue
2021-07-15 17:10 ` Darrick J. Wong
2021-07-15 20:08 ` Roman Gushchin
2021-07-15 22:28 ` Darrick J. Wong
2021-07-16 16:23 ` Darrick J. Wong
2021-07-16 20:03 ` Roman Gushchin
2021-07-17 12:00 ` Boyang Xue
2021-07-22  5:29 ` Boyang Xue
2021-07-22  5:41 ` Roman Gushchin