* Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
@ 2021-07-14  3:21 Boyang Xue
  2021-07-14  3:57 ` Boyang Xue
  2021-07-14  4:11 ` Roman Gushchin
  0 siblings, 2 replies; 21+ messages in thread
From: Boyang Xue @ 2021-07-14  3:21 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: guro

Hello,

I'm not sure if this is the right place to report this bug, please
correct me if I'm wrong.

I found that kernel-5.14.0-rc1 (built from the Linus tree) crashes while
running xfstests generic/256 on ext4 [1]. Looking at the call trace,
it looks like the bug was introduced by the commit

c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes

It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing
was performed with the latest xfstests, and the bug can be reproduced
on ext{2, 3, 4} with {1k, 2k, 4k} block sizes.

Thanks,
Boyang

1. dmesg
```
[ 4366.380974] run fstests generic/256 at 2021-07-12 05:41:40
[ 4368.337078] EXT4-fs (vda3): mounted filesystem with ordered data
mode. Opts: . Quota mode: none.
[ 4371.275986] Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000000
[ 4371.278210] Mem abort info:
[ 4371.278880]   ESR = 0x96000005
[ 4371.279603]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 4371.280878]   SET = 0, FnV = 0
[ 4371.281621]   EA = 0, S1PTW = 0
[ 4371.282396]   FSC = 0x05: level 1 translation fault
[ 4371.283635] Data abort info:
[ 4371.284333]   ISV = 0, ISS = 0x00000005
[ 4371.285246]   CM = 0, WnR = 0
[ 4371.285975] user pgtable: 64k pages, 48-bit VAs, pgdp=00000000b0502000
[ 4371.287640] [0000000000000000] pgd=0000000000000000,
p4d=0000000000000000, pud=0000000000000000
[ 4371.290016] Internal error: Oops: 96000005 [#1] SMP
[ 4371.291251] Modules linked in: dm_flakey dm_snapshot dm_bufio
dm_zero dm_mod loop tls rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver
nfs lockd grace fscache netfs rfkill sunrpc ext4 vfat fat mbcache jbd2
drm fuse xfs libcrc32c crct10dif_ce ghash_ce sha2_ce sha256_arm64
sha1_ce virtio_blk virtio_net net_failover virtio_console failover
virtio_mmio aes_neon_bs [last unloaded: scsi_debug]
[ 4371.300059] CPU: 0 PID: 408468 Comm: kworker/u8:5 Tainted: G
       X --------- ---  5.14.0-0.rc1.15.bx.el9.aarch64 #1
[ 4371.303009] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 4371.304685] Workqueue: events_unbound cleanup_offline_cgwbs_workfn
[ 4371.306329] pstate: 004000c5 (nzcv daIF +PAN -UAO -TCO BTYPE=--)
[ 4371.307867] pc : cleanup_offline_cgwbs_workfn+0x320/0x394
[ 4371.309254] lr : cleanup_offline_cgwbs_workfn+0xe0/0x394
[ 4371.310597] sp : ffff80001554fd10
[ 4371.311443] x29: ffff80001554fd10 x28: 0000000000000000 x27: 0000000000000001
[ 4371.313320] x26: 0000000000000000 x25: 00000000000000e0 x24: ffffd2a2fbe671a8
[ 4371.315159] x23: ffff80001554fd88 x22: ffffd2a2fbe67198 x21: ffffd2a2fc25a730
[ 4371.316945] x20: ffff210412bc3000 x19: ffff210412bc3280 x18: 0000000000000000
[ 4371.318690] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 4371.320437] x14: 0000000000000000 x13: 0000000000000030 x12: 0000000000000040
[ 4371.322444] x11: ffff210481572238 x10: ffff21048157223a x9 : ffffd2a2fa276c60
[ 4371.324243] x8 : ffff210484106b60 x7 : 0000000000000000 x6 : 000000000007d18a
[ 4371.326049] x5 : ffff210416a86400 x4 : ffff210412bc0280 x3 : 0000000000000000
[ 4371.327898] x2 : ffff80001554fd88 x1 : ffff210412bc0280 x0 : 0000000000000003
[ 4371.329748] Call trace:
[ 4371.330372]  cleanup_offline_cgwbs_workfn+0x320/0x394
[ 4371.331694]  process_one_work+0x1f4/0x4b0
[ 4371.332767]  worker_thread+0x184/0x540
[ 4371.333732]  kthread+0x114/0x120
[ 4371.334535]  ret_from_fork+0x10/0x18
[ 4371.335440] Code: d63f0020 97f99963 17ffffa6 f8588263 (f9400061)
[ 4371.337174] ---[ end trace e250fe289272792a ]---
[ 4371.338365] Kernel panic - not syncing: Oops: Fatal exception
[ 4371.339884] SMP: stopping secondary CPUs
[ 4372.424137] SMP: failed to stop secondary CPUs 0-2
[ 4372.436894] Kernel Offset: 0x52a2e9fa0000 from 0xffff800010000000
[ 4372.438408] PHYS_OFFSET: 0xfff0defca0000000
[ 4372.439496] CPU features: 0x00200251,23200840
[ 4372.440603] Memory Limit: none
[ 4372.441374] ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---
```



* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-14  3:21 Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash Boyang Xue
@ 2021-07-14  3:57 ` Boyang Xue
  2021-07-14  4:11 ` Roman Gushchin
  1 sibling, 0 replies; 21+ messages in thread
From: Boyang Xue @ 2021-07-14  3:57 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: guro

On Wed, Jul 14, 2021 at 11:21 AM Boyang Xue <bxue@redhat.com> wrote:
>
> Hello,
>
> I'm not sure if this is the right place to report this bug, please
> correct me if I'm wrong.
>
> I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's
> running xfstests generic/256 on ext4 [1]. Looking at the call trace,
> it looks like the bug had been introduced by the commit
>
> c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes
>
> It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing

Correction: It only happens on aarch64 and ppc64le, not on x86_64 and s390x.

> was performed with the latest xfstests, and the bug can be reproduced
> on ext{2, 3, 4} with {1k, 2k, 4k} block sizes.
>
> Thanks,
> Boyang
>
> 1. dmesg
> ```
> [ 4366.380974] run fstests generic/256 at 2021-07-12 05:41:40
> [ 4368.337078] EXT4-fs (vda3): mounted filesystem with ordered data
> mode. Opts: . Quota mode: none.
> [ 4371.275986] Unable to handle kernel NULL pointer dereference at
> virtual address 0000000000000000
> [ 4371.278210] Mem abort info:
> [ 4371.278880]   ESR = 0x96000005
> [ 4371.279603]   EC = 0x25: DABT (current EL), IL = 32 bits
> [ 4371.280878]   SET = 0, FnV = 0
> [ 4371.281621]   EA = 0, S1PTW = 0
> [ 4371.282396]   FSC = 0x05: level 1 translation fault
> [ 4371.283635] Data abort info:
> [ 4371.284333]   ISV = 0, ISS = 0x00000005
> [ 4371.285246]   CM = 0, WnR = 0
> [ 4371.285975] user pgtable: 64k pages, 48-bit VAs, pgdp=00000000b0502000
> [ 4371.287640] [0000000000000000] pgd=0000000000000000,
> p4d=0000000000000000, pud=0000000000000000
> [ 4371.290016] Internal error: Oops: 96000005 [#1] SMP
> [ 4371.291251] Modules linked in: dm_flakey dm_snapshot dm_bufio
> dm_zero dm_mod loop tls rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver
> nfs lockd grace fscache netfs rfkill sunrpc ext4 vfat fat mbcache jbd2
> drm fuse xfs libcrc32c crct10dif_ce ghash_ce sha2_ce sha256_arm64
> sha1_ce virtio_blk virtio_net net_failover virtio_console failover
> virtio_mmio aes_neon_bs [last unloaded: scsi_debug]
> [ 4371.300059] CPU: 0 PID: 408468 Comm: kworker/u8:5 Tainted: G
>        X --------- ---  5.14.0-0.rc1.15.bx.el9.aarch64 #1
> [ 4371.303009] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
> [ 4371.304685] Workqueue: events_unbound cleanup_offline_cgwbs_workfn
> [ 4371.306329] pstate: 004000c5 (nzcv daIF +PAN -UAO -TCO BTYPE=--)
> [ 4371.307867] pc : cleanup_offline_cgwbs_workfn+0x320/0x394
> [ 4371.309254] lr : cleanup_offline_cgwbs_workfn+0xe0/0x394
> [ 4371.310597] sp : ffff80001554fd10
> [ 4371.311443] x29: ffff80001554fd10 x28: 0000000000000000 x27: 0000000000000001
> [ 4371.313320] x26: 0000000000000000 x25: 00000000000000e0 x24: ffffd2a2fbe671a8
> [ 4371.315159] x23: ffff80001554fd88 x22: ffffd2a2fbe67198 x21: ffffd2a2fc25a730
> [ 4371.316945] x20: ffff210412bc3000 x19: ffff210412bc3280 x18: 0000000000000000
> [ 4371.318690] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> [ 4371.320437] x14: 0000000000000000 x13: 0000000000000030 x12: 0000000000000040
> [ 4371.322444] x11: ffff210481572238 x10: ffff21048157223a x9 : ffffd2a2fa276c60
> [ 4371.324243] x8 : ffff210484106b60 x7 : 0000000000000000 x6 : 000000000007d18a
> [ 4371.326049] x5 : ffff210416a86400 x4 : ffff210412bc0280 x3 : 0000000000000000
> [ 4371.327898] x2 : ffff80001554fd88 x1 : ffff210412bc0280 x0 : 0000000000000003
> [ 4371.329748] Call trace:
> [ 4371.330372]  cleanup_offline_cgwbs_workfn+0x320/0x394
> [ 4371.331694]  process_one_work+0x1f4/0x4b0
> [ 4371.332767]  worker_thread+0x184/0x540
> [ 4371.333732]  kthread+0x114/0x120
> [ 4371.334535]  ret_from_fork+0x10/0x18
> [ 4371.335440] Code: d63f0020 97f99963 17ffffa6 f8588263 (f9400061)
> [ 4371.337174] ---[ end trace e250fe289272792a ]---
> [ 4371.338365] Kernel panic - not syncing: Oops: Fatal exception
> [ 4371.339884] SMP: stopping secondary CPUs
> [ 4372.424137] SMP: failed to stop secondary CPUs 0-2
> [ 4372.436894] Kernel Offset: 0x52a2e9fa0000 from 0xffff800010000000
> [ 4372.438408] PHYS_OFFSET: 0xfff0defca0000000
> [ 4372.439496] CPU features: 0x00200251,23200840
> [ 4372.440603] Memory Limit: none
> [ 4372.441374] ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---
> ```



* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-14  3:21 Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash Boyang Xue
  2021-07-14  3:57 ` Boyang Xue
@ 2021-07-14  4:11 ` Roman Gushchin
  2021-07-14  8:44   ` Boyang Xue
  1 sibling, 1 reply; 21+ messages in thread
From: Roman Gushchin @ 2021-07-14  4:11 UTC (permalink / raw)
  To: Boyang Xue; +Cc: linux-fsdevel, Jan Kara

On Wed, Jul 14, 2021 at 11:21:12AM +0800, Boyang Xue wrote:
> Hello,
> 
> I'm not sure if this is the right place to report this bug, please
> correct me if I'm wrong.
> 
> I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's
> running xfstests generic/256 on ext4 [1]. Looking at the call trace,
> it looks like the bug had been introduced by the commit
> 
> c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes
> 
> It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing
> was performed with the latest xfstests, and the bug can be reproduced
> on ext{2, 3, 4} with {1k, 2k, 4k} block sizes.

Hello Boyang,

thank you for the report!

Do you know on which line the oops happens?
I'll try to reproduce the problem. Do you mind sharing your .config, kvm options
and any other meaningful details?

Thank you!

Roman


* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-14  4:11 ` Roman Gushchin
@ 2021-07-14  8:44   ` Boyang Xue
  2021-07-14  9:26     ` Jan Kara
  0 siblings, 1 reply; 21+ messages in thread
From: Boyang Xue @ 2021-07-14  8:44 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: linux-fsdevel, Jan Kara

Hi Roman,

On Wed, Jul 14, 2021 at 12:12 PM Roman Gushchin <guro@fb.com> wrote:
>
> On Wed, Jul 14, 2021 at 11:21:12AM +0800, Boyang Xue wrote:
> > Hello,
> >
> > I'm not sure if this is the right place to report this bug, please
> > correct me if I'm wrong.
> >
> > I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's
> > running xfstests generic/256 on ext4 [1]. Looking at the call trace,
> > it looks like the bug had been introduced by the commit
> >
> > c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes
> >
> > It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing
> > was performed with the latest xfstests, and the bug can be reproduced
> > on ext{2, 3, 4} with {1k, 2k, 4k} block sizes.
>
> Hello Boyang,
>
> thank you for the report!
>
> Do you know on which line the oops happens?

I was trying to inspect the vmcore with crash utility, but
unfortunately it doesn't work.

```
# crash /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux vmcore
...
crash: invalid structure member offset: task_struct_state
       FILE: task.c  LINE: 5929  FUNCTION: task_state()

     [/usr/bin/crash] error trace: aaaae238b080 => aaaae238aff0 =>
aaaae23ff4e8 => aaaae23ff440
...
```
Could you suggest other ways to find out on which line the oops happens?

> I'll try to reproduce the problem. Do you mind sharing your .config, kvm options
> and any other meaningful details?

I can't access the VM host, so sorry I can't provide the kvm
configuration for now. Please check the following other info:

xfstests local.config
```
# cat local.config
FSTYP="ext4"
TEST_DIR="/test"
TEST_DEV="/dev/vda3"
SCRATCH_MNT="/scratch"
SCRATCH_DEV="/dev/vda4"
LOGWRITES_MNT="/logwrites"
LOGWRITES_DEV="/dev/vda6"
MKFS_OPTIONS="-b 4096"
MOUNT_OPTIONS="-o rw,relatime,seclabel"
TEST_FS_MOUNT_OPTS="-o rw,relatime,seclabel"
```

# lscpu
Architecture:            aarch64
  CPU op-mode(s):        64-bit
  Byte Order:            Little Endian
CPU(s):                  4
  On-line CPU(s) list:   0-3
Vendor ID:               Cavium
  BIOS Vendor ID:        QEMU
  Model name:            ThunderX2 99xx
    BIOS Model name:     virt-rhel7.6.0
    Model:               1
    Thread(s) per core:  1
    Core(s) per cluster: 4
    Socket(s):           4
    Cluster(s):          1
    Stepping:            0x1
    BogoMIPS:            400.00
    Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32
atomics cpuid asimdrdm
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-3
Vulnerabilities:
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; __user pointer sanitization
  Spectre v2:            Mitigation; Branch predictor hardening
  Srbds:                 Not affected
  Tsx async abort:       Not affected

# getconf PAGESIZE
65536

Please let me know if there's other useful info I can provide.

Thanks,
Boyang

>
> Thank you!
>
> Roman
>



* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-14  8:44   ` Boyang Xue
@ 2021-07-14  9:26     ` Jan Kara
  2021-07-14 16:22       ` Boyang Xue
  0 siblings, 1 reply; 21+ messages in thread
From: Jan Kara @ 2021-07-14  9:26 UTC (permalink / raw)
  To: Boyang Xue; +Cc: Roman Gushchin, linux-fsdevel, Jan Kara

On Wed 14-07-21 16:44:33, Boyang Xue wrote:
> Hi Roman,
> 
> On Wed, Jul 14, 2021 at 12:12 PM Roman Gushchin <guro@fb.com> wrote:
> >
> > On Wed, Jul 14, 2021 at 11:21:12AM +0800, Boyang Xue wrote:
> > > Hello,
> > >
> > > I'm not sure if this is the right place to report this bug, please
> > > correct me if I'm wrong.
> > >
> > > I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's
> > > running xfstests generic/256 on ext4 [1]. Looking at the call trace,
> > > it looks like the bug had been introduced by the commit
> > >
> > > c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes
> > >
> > > It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing
> > > was performed with the latest xfstests, and the bug can be reproduced
> > > on ext{2, 3, 4} with {1k, 2k, 4k} block sizes.
> >
> > Hello Boyang,
> >
> > thank you for the report!
> >
> > Do you know on which line the oops happens?
> 
> I was trying to inspect the vmcore with crash utility, but
> unfortunately it doesn't work.

Thanks for the report!  Have you tried the addr2line utility? Looking at the oops I
can see:

[ 4371.307867] pc : cleanup_offline_cgwbs_workfn+0x320/0x394

Which means there's probably heavy inlining going on (do you use LTO by
any chance?) because I don't think cleanup_offline_cgwbs_workfn() itself
would compile into ~1k of code (but I don't have much experience with
aarch64). Anyway, addr2line should tell us.

Also pasting oops into scripts/decodecode on aarch64 machine should tell
us more about where and why the kernel crashed.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-14  9:26     ` Jan Kara
@ 2021-07-14 16:22       ` Boyang Xue
  2021-07-14 23:46         ` Roman Gushchin
  2021-07-15  2:35         ` Matthew Wilcox
  0 siblings, 2 replies; 21+ messages in thread
From: Boyang Xue @ 2021-07-14 16:22 UTC (permalink / raw)
  To: Jan Kara; +Cc: Roman Gushchin, linux-fsdevel

Hi Jan,

On Wed, Jul 14, 2021 at 5:26 PM Jan Kara <jack@suse.cz> wrote:
>
> On Wed 14-07-21 16:44:33, Boyang Xue wrote:
> > Hi Roman,
> >
> > On Wed, Jul 14, 2021 at 12:12 PM Roman Gushchin <guro@fb.com> wrote:
> > >
> > > On Wed, Jul 14, 2021 at 11:21:12AM +0800, Boyang Xue wrote:
> > > > Hello,
> > > >
> > > > I'm not sure if this is the right place to report this bug, please
> > > > correct me if I'm wrong.
> > > >
> > > > I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's
> > > > running xfstests generic/256 on ext4 [1]. Looking at the call trace,
> > > > it looks like the bug had been introduced by the commit
> > > >
> > > > c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes
> > > >
> > > > It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing
> > > > was performed with the latest xfstests, and the bug can be reproduced
> > > > on ext{2, 3, 4} with {1k, 2k, 4k} block sizes.
> > >
> > > Hello Boyang,
> > >
> > > thank you for the report!
> > >
> > > Do you know on which line the oops happens?
> >
> > I was trying to inspect the vmcore with crash utility, but
> > unfortunately it doesn't work.
>
> Thanks for report!  Have you tried addr2line utility? Looking at the oops I
> can see:

Thanks for the tips!

It's unclear to me where to find the required address for the
addr2line command line, i.e.

addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
<what address here?>

But I have tried gdb like this,

# gdb /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
GNU gdb (GDB) Red Hat Enterprise Linux 10.1-14.el9
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "aarch64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from
/usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux...
(gdb) list *(cleanup_offline_cgwbs_workfn+0x320)
0xffff8000102d6ddc is in cleanup_offline_cgwbs_workfn
(./arch/arm64/include/asm/jump_label.h:38).
33      }
34
35      static __always_inline bool arch_static_branch_jump(struct
static_key *key,
36                                                          bool branch)
37      {
38              asm_volatile_goto(
39                      "1:     b               %l[l_yes]               \n\t"
40                       "      .pushsection    __jump_table, \"aw\"    \n\t"
41                       "      .align          3                       \n\t"
42                       "      .long           1b - ., %l[l_yes] - .   \n\t"
(gdb)

I'm not sure whether it is meaningful.

>
> [ 4371.307867] pc : cleanup_offline_cgwbs_workfn+0x320/0x394
>
> Which means there's probably heavy inlining going on (do you use LTO by
> any chance?) because I don't think cleanup_offline_cgwbs_workfn() itself
> would compile into ~1k of code (but I don't have much experience with
> aarch64). Anyway, add2line should tell us.

Actually I built the kernel on an internal build service, so I don't
know much about the build details, like LTO.

>
> Also pasting oops into scripts/decodecode on aarch64 machine should tell
> us more about where and why the kernel crashed.

The output is:

# echo "Code: d63f0020 97f99963 17ffffa6 f8588263 (f9400061)" |
/usr/src/kernels/5.14.0-0.rc1.15.bx.el9.aarch64/scripts/decodecode
Code: d63f0020 97f99963 17ffffa6 f8588263 (f9400061)
All code
========
   0:   d63f0020        blr     x1
   4:   97f99963        bl      0xffffffffffe66590
   8:   17ffffa6        b       0xfffffffffffffea0
   c:   f8588263        ldur    x3, [x19, #-120]
  10:*  f9400061        ldr     x1, [x3]                <-- trapping instruction

Code starting with the faulting instruction
===========================================
   0:   f9400061        ldr     x1, [x3]

>
>                                                                 Honza
>
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
>

Thanks,
Boyang



* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-14 16:22       ` Boyang Xue
@ 2021-07-14 23:46         ` Roman Gushchin
  2021-07-15  1:42           ` Boyang Xue
  2021-07-15  2:35         ` Matthew Wilcox
  1 sibling, 1 reply; 21+ messages in thread
From: Roman Gushchin @ 2021-07-14 23:46 UTC (permalink / raw)
  To: Boyang Xue; +Cc: Jan Kara, linux-fsdevel

On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote:
> Hi Jan,
> 
> On Wed, Jul 14, 2021 at 5:26 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Wed 14-07-21 16:44:33, Boyang Xue wrote:
> > > Hi Roman,
> > >
> > > On Wed, Jul 14, 2021 at 12:12 PM Roman Gushchin <guro@fb.com> wrote:
> > > >
> > > > On Wed, Jul 14, 2021 at 11:21:12AM +0800, Boyang Xue wrote:
> > > > > Hello,
> > > > >
> > > > > I'm not sure if this is the right place to report this bug, please
> > > > > correct me if I'm wrong.
> > > > >
> > > > > I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's
> > > > > running xfstests generic/256 on ext4 [1]. Looking at the call trace,
> > > > > it looks like the bug had been introduced by the commit
> > > > >
> > > > > c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes
> > > > >
> > > > > It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing
> > > > > was performed with the latest xfstests, and the bug can be reproduced
> > > > > on ext{2, 3, 4} with {1k, 2k, 4k} block sizes.
> > > >
> > > > Hello Boyang,
> > > >
> > > > thank you for the report!
> > > >
> > > > Do you know on which line the oops happens?
> > >
> > > I was trying to inspect the vmcore with crash utility, but
> > > unfortunately it doesn't work.
> >
> > Thanks for report!  Have you tried addr2line utility? Looking at the oops I
> > can see:
> 
> Thanks for the tips!
> 
> It's unclear to me that where to find the required address in the
> addr2line command line, i.e.
> 
> addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> <what address here?>

You can use $nm <vmlinux> to get an address of cleanup_offline_cgwbs_workfn()
and then add 0x320.

Alternatively, maybe you can put the image you're using somewhere?

I'm working on getting my arm64 setup ready to reproduce the problem, but it takes
time, and I'm not sure I'll be able to reproduce it in qemu running on top of x86.

Thanks!


* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-14 23:46         ` Roman Gushchin
@ 2021-07-15  1:42           ` Boyang Xue
  2021-07-15  9:31             ` Jan Kara
  0 siblings, 1 reply; 21+ messages in thread
From: Boyang Xue @ 2021-07-15  1:42 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: Jan Kara, linux-fsdevel

On Thu, Jul 15, 2021 at 7:46 AM Roman Gushchin <guro@fb.com> wrote:
>
> On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote:
> > Hi Jan,
> >
> > On Wed, Jul 14, 2021 at 5:26 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Wed 14-07-21 16:44:33, Boyang Xue wrote:
> > > > Hi Roman,
> > > >
> > > > On Wed, Jul 14, 2021 at 12:12 PM Roman Gushchin <guro@fb.com> wrote:
> > > > >
> > > > > On Wed, Jul 14, 2021 at 11:21:12AM +0800, Boyang Xue wrote:
> > > > > > Hello,
> > > > > >
> > > > > > I'm not sure if this is the right place to report this bug, please
> > > > > > correct me if I'm wrong.
> > > > > >
> > > > > > I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's
> > > > > > running xfstests generic/256 on ext4 [1]. Looking at the call trace,
> > > > > > it looks like the bug had been introduced by the commit
> > > > > >
> > > > > > c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes
> > > > > >
> > > > > > It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing
> > > > > > was performed with the latest xfstests, and the bug can be reproduced
> > > > > > on ext{2, 3, 4} with {1k, 2k, 4k} block sizes.
> > > > >
> > > > > Hello Boyang,
> > > > >
> > > > > thank you for the report!
> > > > >
> > > > > Do you know on which line the oops happens?
> > > >
> > > > I was trying to inspect the vmcore with crash utility, but
> > > > unfortunately it doesn't work.
> > >
> > > Thanks for report!  Have you tried addr2line utility? Looking at the oops I
> > > can see:
> >
> > Thanks for the tips!
> >
> > It's unclear to me that where to find the required address in the
> > addr2line command line, i.e.
> >
> > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > <what address here?>
>
> You can use $nm <vmlinux> to get an address of cleanup_offline_cgwbs_workfn()
> and then add 0x320.

Thanks! Hope the following helps:

# grep  cleanup_offline_cgwbs_workfn
/boot/System.map-5.14.0-0.rc1.15.bx.el9.aarch64
ffff8000102d6ab0 t cleanup_offline_cgwbs_workfn

## ffff8000102d6ab0+0x320=FFFF8000102D6DD0

# addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
FFFF8000102D6DD0
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2265
# vi /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h
```
arch_atomic64_fetch_add_unless(atomic64_t *v, s64 a, s64 u)
{
        s64 c = arch_atomic64_read(v); <=== line#2265

        do {
                if (unlikely(c == u))
                        break;
        } while (!arch_atomic64_try_cmpxchg(v, &c, c + a));

        return c;
}
```

# addr2line -i -e
/usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
FFFF8000102D6DD0
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2265
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2290
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-instrumented.h:1149
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-long.h:491
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:247
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:266
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:227
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:224
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c:679
# vi /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c
```
static void cleanup_offline_cgwbs_workfn(struct work_struct *work)
{
        struct bdi_writeback *wb;
        LIST_HEAD(processed);

        spin_lock_irq(&cgwb_lock);

        while (!list_empty(&offline_cgwbs)) {
                wb = list_first_entry(&offline_cgwbs, struct bdi_writeback,
                                      offline_node);
                list_move(&wb->offline_node, &processed);

                /*
                 * If wb is dirty, cleaning up the writeback by switching
                 * attached inodes will result in an effective removal of any
                 * bandwidth restrictions, which isn't the goal.  Instead,
                 * it can be postponed until the next time, when all io
                 * will be likely completed.  If in the meantime some inodes
                 * will get re-dirtied, they should be eventually switched to
                 * a new cgwb.
                 */
                if (wb_has_dirty_io(wb))
                        continue;

                if (!wb_tryget(wb))  <=== line#679
                        continue;

                spin_unlock_irq(&cgwb_lock);
                while (cleanup_offline_cgwb(wb))
                        cond_resched();
                spin_lock_irq(&cgwb_lock);

                wb_put(wb);
        }

        if (!list_empty(&processed))
                list_splice_tail(&processed, &offline_cgwbs);

        spin_unlock_irq(&cgwb_lock);
}
```

>
> Alternatively, maybe you can put the image you're using somewhere?

I put those rpms in the Google Drive
https://drive.google.com/drive/folders/1aw-WK2yWD11UWB059bJt6WKNW1OP_fex?usp=sharing

>
> I'm working on getting my arm64 setup and reproduce the problem, but it takes
> time, and I'm not sure I'll be able to reproduce it in qemu running on top of x86.

Thanks! It's only reproducible on aarch64 and ppc64le in my tests. I'm
happy to help test a patch, if that would help.

>
> Thanks!
>



* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-14 16:22       ` Boyang Xue
  2021-07-14 23:46         ` Roman Gushchin
@ 2021-07-15  2:35         ` Matthew Wilcox
  2021-07-15  3:51           ` Boyang Xue
  1 sibling, 1 reply; 21+ messages in thread
From: Matthew Wilcox @ 2021-07-15  2:35 UTC (permalink / raw)
  To: Boyang Xue; +Cc: Jan Kara, Roman Gushchin, linux-fsdevel

On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote:
> It's unclear to me that where to find the required address in the
> addr2line command line, i.e.
> 
> addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> <what address here?>

./scripts/faddr2line /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux cleanup_offline_cgwbs_workfn+0x320/0x394



* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-15  2:35         ` Matthew Wilcox
@ 2021-07-15  3:51           ` Boyang Xue
  2021-07-15 17:10             ` Darrick J. Wong
  0 siblings, 1 reply; 21+ messages in thread
From: Boyang Xue @ 2021-07-15  3:51 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Jan Kara, Roman Gushchin, linux-fsdevel

On Thu, Jul 15, 2021 at 10:36 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote:
> > It's unclear to me that where to find the required address in the
> > addr2line command line, i.e.
> >
> > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > <what address here?>
>
> ./scripts/faddr2line /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux cleanup_offline_cgwbs_workfn+0x320/0x394
>

Thanks! The result is the same as the

addr2line -i -e
/usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
FFFF8000102D6DD0

But this script is very handy.

# /usr/src/kernels/5.14.0-0.rc1.15.bx.el9.aarch64/scripts/faddr2line
/usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
cleanup_offline_cgwbs_workfn+0x320/0x394
cleanup_offline_cgwbs_workfn+0x320/0x394:
arch_atomic64_fetch_add_unless at
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2265
(inlined by) arch_atomic64_add_unless at
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2290
(inlined by) atomic64_add_unless at
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-instrumented.h:1149
(inlined by) atomic_long_add_unless at
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-long.h:491
(inlined by) percpu_ref_tryget_many at
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:247
(inlined by) percpu_ref_tryget at
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:266
(inlined by) wb_tryget at
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:227
(inlined by) wb_tryget at
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:224
(inlined by) cleanup_offline_cgwbs_workfn at
/usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c:679

# vi /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c
```
static void cleanup_offline_cgwbs_workfn(struct work_struct *work)
{
        struct bdi_writeback *wb;
        LIST_HEAD(processed);

        spin_lock_irq(&cgwb_lock);

        while (!list_empty(&offline_cgwbs)) {
                wb = list_first_entry(&offline_cgwbs, struct bdi_writeback,
                                      offline_node);
                list_move(&wb->offline_node, &processed);

                /*
                 * If wb is dirty, cleaning up the writeback by switching
                 * attached inodes will result in an effective removal of any
                 * bandwidth restrictions, which isn't the goal.  Instead,
                 * it can be postponed until the next time, when all io
                 * will be likely completed.  If in the meantime some inodes
                 * will get re-dirtied, they should be eventually switched to
                 * a new cgwb.
                 */
                if (wb_has_dirty_io(wb))
                        continue;

                if (!wb_tryget(wb))  <=== line#679
                        continue;

                spin_unlock_irq(&cgwb_lock);
                while (cleanup_offline_cgwb(wb))
                        cond_resched();
                spin_lock_irq(&cgwb_lock);

                wb_put(wb);
        }

        if (!list_empty(&processed))
                list_splice_tail(&processed, &offline_cgwbs);

        spin_unlock_irq(&cgwb_lock);
}
```

BTW, this bug can only be reproduced on a non-debug production build of the
kernel (a.k.a. the kernel rpm package); it's not reproducible on a debug
build with various debug configurations enabled (a.k.a. the kernel-debug rpm
package).



* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-15  1:42           ` Boyang Xue
@ 2021-07-15  9:31             ` Jan Kara
  2021-07-15 16:04               ` Roman Gushchin
  0 siblings, 1 reply; 21+ messages in thread
From: Jan Kara @ 2021-07-15  9:31 UTC (permalink / raw)
  To: Boyang Xue; +Cc: Roman Gushchin, Jan Kara, linux-fsdevel

On Thu 15-07-21 09:42:06, Boyang Xue wrote:
> On Thu, Jul 15, 2021 at 7:46 AM Roman Gushchin <guro@fb.com> wrote:
> >
> > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote:
> > > Hi Jan,
> > >
> > > On Wed, Jul 14, 2021 at 5:26 PM Jan Kara <jack@suse.cz> wrote:
> > > >
> > > > On Wed 14-07-21 16:44:33, Boyang Xue wrote:
> > > > > Hi Roman,
> > > > >
> > > > > On Wed, Jul 14, 2021 at 12:12 PM Roman Gushchin <guro@fb.com> wrote:
> > > > > >
> > > > > > On Wed, Jul 14, 2021 at 11:21:12AM +0800, Boyang Xue wrote:
> > > > > > > Hello,
> > > > > > >
> > > > > > > I'm not sure if this is the right place to report this bug, please
> > > > > > > correct me if I'm wrong.
> > > > > > >
> > > > > > > I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's
> > > > > > > running xfstests generic/256 on ext4 [1]. Looking at the call trace,
> > > > > > > it looks like the bug had been introduced by the commit
> > > > > > >
> > > > > > > c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes
> > > > > > >
> > > > > > > It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing
> > > > > > > was performed with the latest xfstests, and the bug can be reproduced
> > > > > > > on ext{2, 3, 4} with {1k, 2k, 4k} block sizes.
> > > > > >
> > > > > > Hello Boyang,
> > > > > >
> > > > > > thank you for the report!
> > > > > >
> > > > > > Do you know on which line the oops happens?
> > > > >
> > > > > I was trying to inspect the vmcore with crash utility, but
> > > > > unfortunately it doesn't work.
> > > >
> > > > Thanks for report!  Have you tried addr2line utility? Looking at the oops I
> > > > can see:
> > >
> > > Thanks for the tips!
> > >
> > > It's unclear to me that where to find the required address in the
> > > addr2line command line, i.e.
> > >
> > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > <what address here?>
> >
> > You can use $nm <vmlinux> to get an address of cleanup_offline_cgwbs_workfn()
> > and then add 0x320.
> 
> Thanks! Hope the following helps:

Thanks for the data! 

> static void cleanup_offline_cgwbs_workfn(struct work_struct *work)
> {
>         struct bdi_writeback *wb;
>         LIST_HEAD(processed);
> 
>         spin_lock_irq(&cgwb_lock);
> 
>         while (!list_empty(&offline_cgwbs)) {
>                 wb = list_first_entry(&offline_cgwbs, struct bdi_writeback,
>                                       offline_node);
>                 list_move(&wb->offline_node, &processed);
> 
>                 /*
>                  * If wb is dirty, cleaning up the writeback by switching
>                  * attached inodes will result in an effective removal of any
>                  * bandwidth restrictions, which isn't the goal.  Instead,
>                  * it can be postponed until the next time, when all io
>                  * will be likely completed.  If in the meantime some inodes
>                  * will get re-dirtied, they should be eventually switched to
>                  * a new cgwb.
>                  */
>                 if (wb_has_dirty_io(wb))
>                         continue;
> 
>                 if (!wb_tryget(wb))  <=== line#679
>                         continue;

Aha, interesting. So it seems we crashed trying to dereference
wb->refcnt->data. So it looks like cgwb_release_workfn() raced with
cleanup_offline_cgwbs_workfn() and percpu_ref_exit() got called from
cgwb_release_workfn() and then cleanup_offline_cgwbs_workfn() called
wb_tryget(). I think the proper fix is to move:

        spin_lock_irq(&cgwb_lock);
        list_del(&wb->offline_node);
        spin_unlock_irq(&cgwb_lock);

in cgwb_release_workfn() to the beginning of that function so that we are
sure even cleanup_offline_cgwbs_workfn() cannot be working with the wb when
it is being released. Roman?
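
To make the ordering visible, here is a minimal, self-contained userspace
sketch of the suspected race (all names are made up for illustration; this is
not the kernel code): the release path tears down the refcount data first and
only unlinks the wb from the offline list afterwards, so the cleanup path can
still find the wb and try to take a reference on a refcount whose data is
already gone.

```c
/*
 * Toy userspace model of the suspected race (hypothetical names, not kernel
 * code).  release_old_order() stands in for cgwb_release_workfn() before the
 * fix, cleanup() for cleanup_offline_cgwbs_workfn().
 */
#include <stdio.h>
#include <stdlib.h>

struct ref { long *data; };     /* refcount with dynamically allocated data */
struct wb  { struct ref refcnt; int on_offline_list; };

static void ref_exit(struct ref *r)     /* plays the role of percpu_ref_exit() */
{
	free(r->data);
	r->data = NULL;
}

static int wb_tryget(struct wb *wb)     /* plays the role of wb_tryget() */
{
	if (!wb->refcnt.data) {
		printf("BUG: tryget after ref_exit() -> NULL dereference in the kernel\n");
		return 0;
	}
	(*wb->refcnt.data)++;
	return 1;
}

static void cleanup(struct wb *wb)      /* walks the offline list */
{
	if (wb->on_offline_list)
		wb_tryget(wb);          /* the crash site in the oops above */
}

static void release_old_order(struct wb *wb)
{
	ref_exit(&wb->refcnt);          /* refcount data destroyed first ... */
	cleanup(wb);                    /* ... while wb is still on the list,
	                                   so a concurrent cleanup can race in */
	wb->on_offline_list = 0;        /* only now is wb unlinked */
}

int main(void)
{
	struct wb *wb = calloc(1, sizeof(*wb));

	wb->refcnt.data = calloc(1, sizeof(long));
	wb->on_offline_list = 1;        /* wb sits on offline_cgwbs */
	release_old_order(wb);
	free(wb);
	return 0;
}
```

Moving the list removal in front of the refcount teardown, as suggested above,
or delaying the teardown until after the removal (as in the patch posted later
in the thread), closes the window: the cleanup path can then no longer find a
wb whose refcount data has already been freed.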

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-15  9:31             ` Jan Kara
@ 2021-07-15 16:04               ` Roman Gushchin
  2021-07-16  1:37                 ` Boyang Xue
  0 siblings, 1 reply; 21+ messages in thread
From: Roman Gushchin @ 2021-07-15 16:04 UTC (permalink / raw)
  To: Jan Kara, Boyang Xue; +Cc: linux-fsdevel

On Thu, Jul 15, 2021 at 11:31:17AM +0200, Jan Kara wrote:
> On Thu 15-07-21 09:42:06, Boyang Xue wrote:
> > On Thu, Jul 15, 2021 at 7:46 AM Roman Gushchin <guro@fb.com> wrote:
> > >
> > > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote:
> > > > Hi Jan,
> > > >
> > > > On Wed, Jul 14, 2021 at 5:26 PM Jan Kara <jack@suse.cz> wrote:
> > > > >
> > > > > On Wed 14-07-21 16:44:33, Boyang Xue wrote:
> > > > > > Hi Roman,
> > > > > >
> > > > > > On Wed, Jul 14, 2021 at 12:12 PM Roman Gushchin <guro@fb.com> wrote:
> > > > > > >
> > > > > > > On Wed, Jul 14, 2021 at 11:21:12AM +0800, Boyang Xue wrote:
> > > > > > > > Hello,
> > > > > > > >
> > > > > > > > I'm not sure if this is the right place to report this bug, please
> > > > > > > > correct me if I'm wrong.
> > > > > > > >
> > > > > > > > I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's
> > > > > > > > running xfstests generic/256 on ext4 [1]. Looking at the call trace,
> > > > > > > > it looks like the bug had been introduced by the commit
> > > > > > > >
> > > > > > > > c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes
> > > > > > > >
> > > > > > > > It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing
> > > > > > > > was performed with the latest xfstests, and the bug can be reproduced
> > > > > > > > on ext{2, 3, 4} with {1k, 2k, 4k} block sizes.
> > > > > > >
> > > > > > > Hello Boyang,
> > > > > > >
> > > > > > > thank you for the report!
> > > > > > >
> > > > > > > Do you know on which line the oops happens?
> > > > > >
> > > > > > I was trying to inspect the vmcore with crash utility, but
> > > > > > unfortunately it doesn't work.
> > > > >
> > > > > Thanks for report!  Have you tried addr2line utility? Looking at the oops I
> > > > > can see:
> > > >
> > > > Thanks for the tips!
> > > >
> > > > It's unclear to me that where to find the required address in the
> > > > addr2line command line, i.e.
> > > >
> > > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > > <what address here?>
> > >
> > > You can use $nm <vmlinux> to get an address of cleanup_offline_cgwbs_workfn()
> > > and then add 0x320.
> > 
> > Thanks! Hope the following helps:
> 
> Thanks for the data! 
> 
> > static void cleanup_offline_cgwbs_workfn(struct work_struct *work)
> > {
> >         struct bdi_writeback *wb;
> >         LIST_HEAD(processed);
> > 
> >         spin_lock_irq(&cgwb_lock);
> > 
> >         while (!list_empty(&offline_cgwbs)) {
> >                 wb = list_first_entry(&offline_cgwbs, struct bdi_writeback,
> >                                       offline_node);
> >                 list_move(&wb->offline_node, &processed);
> > 
> >                 /*
> >                  * If wb is dirty, cleaning up the writeback by switching
> >                  * attached inodes will result in an effective removal of any
> >                  * bandwidth restrictions, which isn't the goal.  Instead,
> >                  * it can be postponed until the next time, when all io
> >                  * will be likely completed.  If in the meantime some inodes
> >                  * will get re-dirtied, they should be eventually switched to
> >                  * a new cgwb.
> >                  */
> >                 if (wb_has_dirty_io(wb))
> >                         continue;
> > 
> >                 if (!wb_tryget(wb))  <=== line#679
> >                         continue;
> 
> Aha, interesting. So it seems we crashed trying to dereference
> wb->refcnt->data. So it looks like cgwb_release_workfn() raced with
> cleanup_offline_cgwbs_workfn() and percpu_ref_exit() got called from
> cgwb_release_workfn() and then cleanup_offline_cgwbs_workfn() called
> wb_tryget(). I think the proper fix is to move:
> 
>         spin_lock_irq(&cgwb_lock);
>         list_del(&wb->offline_node);
>         spin_unlock_irq(&cgwb_lock);
> 
> in cgwb_release_workfn() to the beginning of that function so that we are
> sure even cleanup_offline_cgwbs_workfn() cannot be working with the wb when
> it is being released. Roman?

Yes, it sounds like the most reasonable explanation.
Thank you!

Boyang, would you mind testing the following patch?

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 271f2ca862c8..f5561ea7d90a 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -398,12 +398,12 @@ static void cgwb_release_workfn(struct work_struct *work)
        blkcg_unpin_online(blkcg);
 
        fprop_local_destroy_percpu(&wb->memcg_completions);
-       percpu_ref_exit(&wb->refcnt);
 
        spin_lock_irq(&cgwb_lock);
        list_del(&wb->offline_node);
        spin_unlock_irq(&cgwb_lock);
 
+       percpu_ref_exit(&wb->refcnt);
        wb_exit(wb);
        WARN_ON_ONCE(!list_empty(&wb->b_attached));
        kfree_rcu(wb, rcu);


* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-15  3:51           ` Boyang Xue
@ 2021-07-15 17:10             ` Darrick J. Wong
  2021-07-15 20:08               ` Roman Gushchin
  0 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2021-07-15 17:10 UTC (permalink / raw)
  To: Boyang Xue; +Cc: Matthew Wilcox, Jan Kara, Roman Gushchin, linux-fsdevel

On Thu, Jul 15, 2021 at 11:51:50AM +0800, Boyang Xue wrote:
> On Thu, Jul 15, 2021 at 10:36 AM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote:
> > > It's unclear to me that where to find the required address in the
> > > addr2line command line, i.e.
> > >
> > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > <what address here?>
> >
> > ./scripts/faddr2line /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux cleanup_offline_cgwbs_workfn+0x320/0x394
> >
> 
> Thanks! The result is the same as the
> 
> addr2line -i -e
> /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> FFFF8000102D6DD0
> 
> But this script is very handy.
> 
> # /usr/src/kernels/5.14.0-0.rc1.15.bx.el9.aarch64/scripts/faddr2line
> /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> cleanup_offlin
> e_cgwbs_workfn+0x320/0x394
> cleanup_offline_cgwbs_workfn+0x320/0x394:
> arch_atomic64_fetch_add_unless at
> /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2265
> (inlined by) arch_atomic64_add_unless at
> /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2290
> (inlined by) atomic64_add_unless at
> /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-instrumented.h:1149
> (inlined by) atomic_long_add_unless at
> /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-long.h:491
> (inlined by) percpu_ref_tryget_many at
> /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:247
> (inlined by) percpu_ref_tryget at
> /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:266
> (inlined by) wb_tryget at
> /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:227
> (inlined by) wb_tryget at
> /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:224
> (inlined by) cleanup_offline_cgwbs_workfn at
> /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c:679
> 
> # vi /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c
> ```
> static void cleanup_offline_cgwbs_workfn(struct work_struct *work)
> {
>         struct bdi_writeback *wb;
>         LIST_HEAD(processed);
> 
>         spin_lock_irq(&cgwb_lock);
> 
>         while (!list_empty(&offline_cgwbs)) {
>                 wb = list_first_entry(&offline_cgwbs, struct bdi_writeback,
>                                       offline_node);
>                 list_move(&wb->offline_node, &processed);
> 
>                 /*
>                  * If wb is dirty, cleaning up the writeback by switching
>                  * attached inodes will result in an effective removal of any
>                  * bandwidth restrictions, which isn't the goal.  Instead,
>                  * it can be postponed until the next time, when all io
>                  * will be likely completed.  If in the meantime some inodes
>                  * will get re-dirtied, they should be eventually switched to
>                  * a new cgwb.
>                  */
>                 if (wb_has_dirty_io(wb))
>                         continue;
> 
>                 if (!wb_tryget(wb))  <=== line#679
>                         continue;
> 
>                 spin_unlock_irq(&cgwb_lock);
>                 while (cleanup_offline_cgwb(wb))
>                         cond_resched();
>                 spin_lock_irq(&cgwb_lock);
> 
>                 wb_put(wb);
>         }
> 
>         if (!list_empty(&processed))
>                 list_splice_tail(&processed, &offline_cgwbs);
> 
>         spin_unlock_irq(&cgwb_lock);
> }
> ```
> 
> BTW, this bug can be only reproduced on a non-debug production built
> kernel (a.k.a kernel rpm package), it's not reproducible on a debug
> build with various debug configuration enabled (a.k.a kernel-debug rpm
> package)

FWIW I've also seen this regularly on x86_64 kernels on ext4 with all
default mkfs settings when running generic/256.

# FSTYP=ext4 MOUNT_OPTIONS="-o acl,user_xattr," ./check
FSTYP         -- ext4
PLATFORM      -- Linux/x86_64 flax-mtr00 5.14.0-rc1-xfsx #rc1 SMP
PREEMPT Wed Jul 14 17:36:18 PDT 2021
MKFS_OPTIONS  -- /dev/sdf
MOUNT_OPTIONS -- -o acl,user_xattr, /dev/sdf /opt

generic/256
Message from syslogd@flax-mtr00 at Jul 15 09:58:14 ...
 kernel:[ 2508.987522] Dumping ftrace buffer:

And the dmesg looks like:

run fstests generic/256 at 2021-07-15 09:56:34
EXT4-fs (sdf): mounted filesystem with ordered data mode. Opts: acl,user_xattr. Quota mode: none.
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0 
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 108604 Comm: u9:3 Not tainted 5.14.0-rc1-xfsx #rc1 486fb938eb99d57e79080268009b49f63f777aec
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Workqueue: events_unbound cleanup_offline_cgwbs_workfn
RIP: 0010:cleanup_offline_cgwbs_workfn+0x1ef/0x220
Code: ff ff f0 48 83 28 01 0f 85 55 ff ff ff 48 8b 83 60 ff ff ff 48 8d bb 58 ff ff ff ff 50 08 e9 3f ff ff ff 48 8b 93 60 ff ff ff <48> 8b 02 48 85 c0 0f 84 2c ff ff ff 48 8d 48 01 f0 48 0f b1 0a 75
RSP: 0018:ffffc9000278be60 EFLAGS: 00010006
RAX: 0000000000000003 RBX: ffff888282dc0b30 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffc9000278be60 RDI: ffff888282dc0b30
RBP: ffff888282dc0800 R08: ffff88828006af30 R09: ffff88828006af30
R10: 000000000000000f R11: 000000000000000f R12: ffffc9000278be60
R13: ffff8881000d6800 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff888277d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000102262003 CR4: 00000000001706a0
Call Trace:
 process_one_work+0x1dd/0x3c0
 worker_thread+0x53/0x3c0
 ? rescuer_thread+0x390/0x390
 kthread+0x149/0x170
 ? set_kthread_struct+0x40/0x40
 ret_from_fork+0x1f/0x30
Modules linked in: ext2 ext4 jbd2 dm_flakey mbcache xfs libcrc32c ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_tcpudp ip_set_hash_ip ip_set_hash_net xt_set ip_set_hash_mac ip_set nfnetlink ip6table_filter ip6_tables bfq iptable_filter pvpanic_mmio pvpanic sch_fq_codel ip_tables x_tables overlay nfsv4 af_packet [last unloaded: jbd2]
Dumping ftrace buffer:
   (ftrace buffer empty)
CR2: 0000000000000000
---[ end trace 242113b767739fb9 ]---

The faddr2line output points at the same line of code.

--D


* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-15 17:10             ` Darrick J. Wong
@ 2021-07-15 20:08               ` Roman Gushchin
  2021-07-15 22:28                 ` Darrick J. Wong
  0 siblings, 1 reply; 21+ messages in thread
From: Roman Gushchin @ 2021-07-15 20:08 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Boyang Xue, Matthew Wilcox, Jan Kara, linux-fsdevel

On Thu, Jul 15, 2021 at 10:10:50AM -0700, Darrick J. Wong wrote:
> On Thu, Jul 15, 2021 at 11:51:50AM +0800, Boyang Xue wrote:
> > On Thu, Jul 15, 2021 at 10:36 AM Matthew Wilcox <willy@infradead.org> wrote:
> > >
> > > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote:
> > > > It's unclear to me that where to find the required address in the
> > > > addr2line command line, i.e.
> > > >
> > > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > > <what address here?>
> > >
> > > ./scripts/faddr2line /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux cleanup_offline_cgwbs_workfn+0x320/0x394
> > >
> > 
> > Thanks! The result is the same as the
> > 
> > addr2line -i -e
> > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > FFFF8000102D6DD0
> > 
> > But this script is very handy.
> > 
> > # /usr/src/kernels/5.14.0-0.rc1.15.bx.el9.aarch64/scripts/faddr2line
> > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > cleanup_offline_cgwbs_workfn+0x320/0x394
> > cleanup_offline_cgwbs_workfn+0x320/0x394:
> > arch_atomic64_fetch_add_unless at
> > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2265
> > (inlined by) arch_atomic64_add_unless at
> > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2290
> > (inlined by) atomic64_add_unless at
> > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-instrumented.h:1149
> > (inlined by) atomic_long_add_unless at
> > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-long.h:491
> > (inlined by) percpu_ref_tryget_many at
> > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:247
> > (inlined by) percpu_ref_tryget at
> > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:266
> > (inlined by) wb_tryget at
> > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:227
> > (inlined by) wb_tryget at
> > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:224
> > (inlined by) cleanup_offline_cgwbs_workfn at
> > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c:679
> > 
> > # vi /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c
> > ```
> > static void cleanup_offline_cgwbs_workfn(struct work_struct *work)
> > {
> >         struct bdi_writeback *wb;
> >         LIST_HEAD(processed);
> > 
> >         spin_lock_irq(&cgwb_lock);
> > 
> >         while (!list_empty(&offline_cgwbs)) {
> >                 wb = list_first_entry(&offline_cgwbs, struct bdi_writeback,
> >                                       offline_node);
> >                 list_move(&wb->offline_node, &processed);
> > 
> >                 /*
> >                  * If wb is dirty, cleaning up the writeback by switching
> >                  * attached inodes will result in an effective removal of any
> >                  * bandwidth restrictions, which isn't the goal.  Instead,
> >                  * it can be postponed until the next time, when all io
> >                  * will be likely completed.  If in the meantime some inodes
> >                  * will get re-dirtied, they should be eventually switched to
> >                  * a new cgwb.
> >                  */
> >                 if (wb_has_dirty_io(wb))
> >                         continue;
> > 
> >                 if (!wb_tryget(wb))  <=== line#679
> >                         continue;
> > 
> >                 spin_unlock_irq(&cgwb_lock);
> >                 while (cleanup_offline_cgwb(wb))
> >                         cond_resched();
> >                 spin_lock_irq(&cgwb_lock);
> > 
> >                 wb_put(wb);
> >         }
> > 
> >         if (!list_empty(&processed))
> >                 list_splice_tail(&processed, &offline_cgwbs);
> > 
> >         spin_unlock_irq(&cgwb_lock);
> > }
> > ```
> > 
> > BTW, this bug can be only reproduced on a non-debug production built
> > kernel (a.k.a kernel rpm package), it's not reproducible on a debug
> > build with various debug configuration enabled (a.k.a kernel-debug rpm
> > package)
> 
> FWIW I've also seen this regularly on x86_64 kernels on ext4 with all
> default mkfs settings when running generic/256.

Oh, that's useful information, thank you!

Btw, would you mind giving the patch from an earlier message in this thread
a test? I'd highly appreciate it.

Thanks!

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-15 20:08               ` Roman Gushchin
@ 2021-07-15 22:28                 ` Darrick J. Wong
  2021-07-16 16:23                   ` Darrick J. Wong
  0 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2021-07-15 22:28 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: Boyang Xue, Matthew Wilcox, Jan Kara, linux-fsdevel

On Thu, Jul 15, 2021 at 01:08:15PM -0700, Roman Gushchin wrote:
> On Thu, Jul 15, 2021 at 10:10:50AM -0700, Darrick J. Wong wrote:
> > On Thu, Jul 15, 2021 at 11:51:50AM +0800, Boyang Xue wrote:
> > > On Thu, Jul 15, 2021 at 10:36 AM Matthew Wilcox <willy@infradead.org> wrote:
> > > >
> > > > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote:
> > > > > It's unclear to me that where to find the required address in the
> > > > > addr2line command line, i.e.
> > > > >
> > > > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > > > <what address here?>
> > > >
> > > > ./scripts/faddr2line /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux cleanup_offline_cgwbs_workfn+0x320/0x394
> > > >
> > > 
> > > Thanks! The result is the same as the
> > > 
> > > addr2line -i -e
> > > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > FFFF8000102D6DD0
> > > 
> > > But this script is very handy.
> > > 
> > > # /usr/src/kernels/5.14.0-0.rc1.15.bx.el9.aarch64/scripts/faddr2line
> > > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > cleanup_offlin
> > > e_cgwbs_workfn+0x320/0x394
> > > cleanup_offline_cgwbs_workfn+0x320/0x394:
> > > arch_atomic64_fetch_add_unless at
> > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2265
> > > (inlined by) arch_atomic64_add_unless at
> > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2290
> > > (inlined by) atomic64_add_unless at
> > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-instrumented.h:1149
> > > (inlined by) atomic_long_add_unless at
> > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-long.h:491
> > > (inlined by) percpu_ref_tryget_many at
> > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:247
> > > (inlined by) percpu_ref_tryget at
> > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:266
> > > (inlined by) wb_tryget at
> > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:227
> > > (inlined by) wb_tryget at
> > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:224
> > > (inlined by) cleanup_offline_cgwbs_workfn at
> > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c:679
> > > 
> > > # vi /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c
> > > ```
> > > static void cleanup_offline_cgwbs_workfn(struct work_struct *work)
> > > {
> > >         struct bdi_writeback *wb;
> > >         LIST_HEAD(processed);
> > > 
> > >         spin_lock_irq(&cgwb_lock);
> > > 
> > >         while (!list_empty(&offline_cgwbs)) {
> > >                 wb = list_first_entry(&offline_cgwbs, struct bdi_writeback,
> > >                                       offline_node);
> > >                 list_move(&wb->offline_node, &processed);
> > > 
> > >                 /*
> > >                  * If wb is dirty, cleaning up the writeback by switching
> > >                  * attached inodes will result in an effective removal of any
> > >                  * bandwidth restrictions, which isn't the goal.  Instead,
> > >                  * it can be postponed until the next time, when all io
> > >                  * will be likely completed.  If in the meantime some inodes
> > >                  * will get re-dirtied, they should be eventually switched to
> > >                  * a new cgwb.
> > >                  */
> > >                 if (wb_has_dirty_io(wb))
> > >                         continue;
> > > 
> > >                 if (!wb_tryget(wb))  <=== line#679
> > >                         continue;
> > > 
> > >                 spin_unlock_irq(&cgwb_lock);
> > >                 while (cleanup_offline_cgwb(wb))
> > >                         cond_resched();
> > >                 spin_lock_irq(&cgwb_lock);
> > > 
> > >                 wb_put(wb);
> > >         }
> > > 
> > >         if (!list_empty(&processed))
> > >                 list_splice_tail(&processed, &offline_cgwbs);
> > > 
> > >         spin_unlock_irq(&cgwb_lock);
> > > }
> > > ```
> > > 
> > > BTW, this bug can be only reproduced on a non-debug production built
> > > kernel (a.k.a kernel rpm package), it's not reproducible on a debug
> > > build with various debug configuration enabled (a.k.a kernel-debug rpm
> > > package)
> > 
> > FWIW I've also seen this regularly on x86_64 kernels on ext4 with all
> > default mkfs settings when running generic/256.
> 
> Oh, that's a useful information, thank you!
> 
> Btw, would you mind to give a patch from an earlier message in the thread
> a test? I'd highly appreciate it.
> 
> Thanks!

Will do.

--D

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-15 16:04               ` Roman Gushchin
@ 2021-07-16  1:37                 ` Boyang Xue
  0 siblings, 0 replies; 21+ messages in thread
From: Boyang Xue @ 2021-07-16  1:37 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: Jan Kara, linux-fsdevel

On Fri, Jul 16, 2021 at 12:05 AM Roman Gushchin <guro@fb.com> wrote:
>
> On Thu, Jul 15, 2021 at 11:31:17AM +0200, Jan Kara wrote:
> > On Thu 15-07-21 09:42:06, Boyang Xue wrote:
> > > On Thu, Jul 15, 2021 at 7:46 AM Roman Gushchin <guro@fb.com> wrote:
> > > >
> > > > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote:
> > > > > Hi Jan,
> > > > >
> > > > > On Wed, Jul 14, 2021 at 5:26 PM Jan Kara <jack@suse.cz> wrote:
> > > > > >
> > > > > > On Wed 14-07-21 16:44:33, Boyang Xue wrote:
> > > > > > > Hi Roman,
> > > > > > >
> > > > > > > On Wed, Jul 14, 2021 at 12:12 PM Roman Gushchin <guro@fb.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, Jul 14, 2021 at 11:21:12AM +0800, Boyang Xue wrote:
> > > > > > > > > Hello,
> > > > > > > > >
> > > > > > > > > I'm not sure if this is the right place to report this bug, please
> > > > > > > > > correct me if I'm wrong.
> > > > > > > > >
> > > > > > > > > I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's
> > > > > > > > > running xfstests generic/256 on ext4 [1]. Looking at the call trace,
> > > > > > > > > it looks like the bug had been introduced by the commit
> > > > > > > > >
> > > > > > > > > c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes
> > > > > > > > >
> > > > > > > > > It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing
> > > > > > > > > was performed with the latest xfstests, and the bug can be reproduced
> > > > > > > > > on ext{2, 3, 4} with {1k, 2k, 4k} block sizes.
> > > > > > > >
> > > > > > > > Hello Boyang,
> > > > > > > >
> > > > > > > > thank you for the report!
> > > > > > > >
> > > > > > > > Do you know on which line the oops happens?
> > > > > > >
> > > > > > > I was trying to inspect the vmcore with crash utility, but
> > > > > > > unfortunately it doesn't work.
> > > > > >
> > > > > > Thanks for report!  Have you tried addr2line utility? Looking at the oops I
> > > > > > can see:
> > > > >
> > > > > Thanks for the tips!
> > > > >
> > > > > It's unclear to me that where to find the required address in the
> > > > > addr2line command line, i.e.
> > > > >
> > > > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > > > <what address here?>
> > > >
> > > > You can use $nm <vmlinux> to get an address of cleanup_offline_cgwbs_workfn()
> > > > and then add 0x320.
> > >
> > > Thanks! Hope the following helps:
> >
> > Thanks for the data!
> >
> > > static void cleanup_offline_cgwbs_workfn(struct work_struct *work)
> > > {
> > >         struct bdi_writeback *wb;
> > >         LIST_HEAD(processed);
> > >
> > >         spin_lock_irq(&cgwb_lock);
> > >
> > >         while (!list_empty(&offline_cgwbs)) {
> > >                 wb = list_first_entry(&offline_cgwbs, struct bdi_writeback,
> > >                                       offline_node);
> > >                 list_move(&wb->offline_node, &processed);
> > >
> > >                 /*
> > >                  * If wb is dirty, cleaning up the writeback by switching
> > >                  * attached inodes will result in an effective removal of any
> > >                  * bandwidth restrictions, which isn't the goal.  Instead,
> > >                  * it can be postponed until the next time, when all io
> > >                  * will be likely completed.  If in the meantime some inodes
> > >                  * will get re-dirtied, they should be eventually switched to
> > >                  * a new cgwb.
> > >                  */
> > >                 if (wb_has_dirty_io(wb))
> > >                         continue;
> > >
> > >                 if (!wb_tryget(wb))  <=== line#679
> > >                         continue;
> >
> > Aha, interesting. So it seems we crashed trying to dereference
> > wb->refcnt->data. So it looks like cgwb_release_workfn() raced with
> > cleanup_offline_cgwbs_workfn() and percpu_ref_exit() got called from
> > cgwb_release_workfn() and then cleanup_offline_cgwbs_workfn() called
> > wb_tryget(). I think the proper fix is to move:
> >
> >         spin_lock_irq(&cgwb_lock);
> >         list_del(&wb->offline_node);
> >         spin_unlock_irq(&cgwb_lock);
> >
> > in cgwb_release_workfn() to the beginning of that function so that we are
> > sure even cleanup_offline_cgwbs_workfn() cannot be working with the wb when
> > it is being released. Roman?
>
> Yes, it sounds like the most reasonable explanation.
> Thank you!
>
> Boyang, would you mind to test the following patch?

No problem. I'm testing it. Thanks for the patch.

>
> diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> index 271f2ca862c8..f5561ea7d90a 100644
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -398,12 +398,12 @@ static void cgwb_release_workfn(struct work_struct *work)
>         blkcg_unpin_online(blkcg);
>
>         fprop_local_destroy_percpu(&wb->memcg_completions);
> -       percpu_ref_exit(&wb->refcnt);
>
>         spin_lock_irq(&cgwb_lock);
>         list_del(&wb->offline_node);
>         spin_unlock_irq(&cgwb_lock);
>
> +       percpu_ref_exit(&wb->refcnt);
>         wb_exit(wb);
>         WARN_ON_ONCE(!list_empty(&wb->b_attached));
>         kfree_rcu(wb, rcu);
>
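
If I'm reading the hunk right, the tail of cgwb_release_workfn() would then
look roughly like this (a sketch reconstructed from the diff context above,
not the complete function), with the wb taken off the offline_cgwbs list
under cgwb_lock before its percpu ref is destroyed:

```
	blkcg_unpin_online(blkcg);

	fprop_local_destroy_percpu(&wb->memcg_completions);

	/* remove wb from offline_cgwbs before dropping its percpu ref */
	spin_lock_irq(&cgwb_lock);
	list_del(&wb->offline_node);
	spin_unlock_irq(&cgwb_lock);

	/*
	 * cleanup_offline_cgwbs_workfn() can no longer find this wb on the
	 * list, so it is now safe to destroy the ref and free the wb.
	 */
	percpu_ref_exit(&wb->refcnt);
	wb_exit(wb);
	WARN_ON_ONCE(!list_empty(&wb->b_attached));
	kfree_rcu(wb, rcu);
```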


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-15 22:28                 ` Darrick J. Wong
@ 2021-07-16 16:23                   ` Darrick J. Wong
  2021-07-16 20:03                     ` Roman Gushchin
  0 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2021-07-16 16:23 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: Boyang Xue, Matthew Wilcox, Jan Kara, linux-fsdevel

On Thu, Jul 15, 2021 at 03:28:12PM -0700, Darrick J. Wong wrote:
> On Thu, Jul 15, 2021 at 01:08:15PM -0700, Roman Gushchin wrote:
> > On Thu, Jul 15, 2021 at 10:10:50AM -0700, Darrick J. Wong wrote:
> > > On Thu, Jul 15, 2021 at 11:51:50AM +0800, Boyang Xue wrote:
> > > > On Thu, Jul 15, 2021 at 10:36 AM Matthew Wilcox <willy@infradead.org> wrote:
> > > > >
> > > > > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote:
> > > > > > It's unclear to me that where to find the required address in the
> > > > > > addr2line command line, i.e.
> > > > > >
> > > > > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > > > > <what address here?>
> > > > >
> > > > > ./scripts/faddr2line /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux cleanup_offline_cgwbs_workfn+0x320/0x394
> > > > >
> > > > 
> > > > Thanks! The result is the same as the
> > > > 
> > > > addr2line -i -e
> > > > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > > FFFF8000102D6DD0
> > > > 
> > > > But this script is very handy.
> > > > 
> > > > # /usr/src/kernels/5.14.0-0.rc1.15.bx.el9.aarch64/scripts/faddr2line
> > > > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > > cleanup_offlin
> > > > e_cgwbs_workfn+0x320/0x394
> > > > cleanup_offline_cgwbs_workfn+0x320/0x394:
> > > > arch_atomic64_fetch_add_unless at
> > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2265
> > > > (inlined by) arch_atomic64_add_unless at
> > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2290
> > > > (inlined by) atomic64_add_unless at
> > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-instrumented.h:1149
> > > > (inlined by) atomic_long_add_unless at
> > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-long.h:491
> > > > (inlined by) percpu_ref_tryget_many at
> > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:247
> > > > (inlined by) percpu_ref_tryget at
> > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:266
> > > > (inlined by) wb_tryget at
> > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:227
> > > > (inlined by) wb_tryget at
> > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:224
> > > > (inlined by) cleanup_offline_cgwbs_workfn at
> > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c:679
> > > > 
> > > > # vi /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c
> > > > ```
> > > > static void cleanup_offline_cgwbs_workfn(struct work_struct *work)
> > > > {
> > > >         struct bdi_writeback *wb;
> > > >         LIST_HEAD(processed);
> > > > 
> > > >         spin_lock_irq(&cgwb_lock);
> > > > 
> > > >         while (!list_empty(&offline_cgwbs)) {
> > > >                 wb = list_first_entry(&offline_cgwbs, struct bdi_writeback,
> > > >                                       offline_node);
> > > >                 list_move(&wb->offline_node, &processed);
> > > > 
> > > >                 /*
> > > >                  * If wb is dirty, cleaning up the writeback by switching
> > > >                  * attached inodes will result in an effective removal of any
> > > >                  * bandwidth restrictions, which isn't the goal.  Instead,
> > > >                  * it can be postponed until the next time, when all io
> > > >                  * will be likely completed.  If in the meantime some inodes
> > > >                  * will get re-dirtied, they should be eventually switched to
> > > >                  * a new cgwb.
> > > >                  */
> > > >                 if (wb_has_dirty_io(wb))
> > > >                         continue;
> > > > 
> > > >                 if (!wb_tryget(wb))  <=== line#679
> > > >                         continue;
> > > > 
> > > >                 spin_unlock_irq(&cgwb_lock);
> > > >                 while (cleanup_offline_cgwb(wb))
> > > >                         cond_resched();
> > > >                 spin_lock_irq(&cgwb_lock);
> > > > 
> > > >                 wb_put(wb);
> > > >         }
> > > > 
> > > >         if (!list_empty(&processed))
> > > >                 list_splice_tail(&processed, &offline_cgwbs);
> > > > 
> > > >         spin_unlock_irq(&cgwb_lock);
> > > > }
> > > > ```
> > > > 
> > > > BTW, this bug can be only reproduced on a non-debug production built
> > > > kernel (a.k.a kernel rpm package), it's not reproducible on a debug
> > > > build with various debug configuration enabled (a.k.a kernel-debug rpm
> > > > package)
> > > 
> > > FWIW I've also seen this regularly on x86_64 kernels on ext4 with all
> > > default mkfs settings when running generic/256.
> > 
> > Oh, that's a useful information, thank you!
> > 
> > Btw, would you mind to give a patch from an earlier message in the thread
> > a test? I'd highly appreciate it.
> > 
> > Thanks!
> 
> Will do.

fstests passed here, so

Tested-by: Darrick J. Wong <djwong@kernel.org>

--D

> 
> --D

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-16 16:23                   ` Darrick J. Wong
@ 2021-07-16 20:03                     ` Roman Gushchin
  2021-07-17 12:00                       ` Boyang Xue
  0 siblings, 1 reply; 21+ messages in thread
From: Roman Gushchin @ 2021-07-16 20:03 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Boyang Xue, Matthew Wilcox, Jan Kara, linux-fsdevel

On Fri, Jul 16, 2021 at 09:23:40AM -0700, Darrick J. Wong wrote:
> On Thu, Jul 15, 2021 at 03:28:12PM -0700, Darrick J. Wong wrote:
> > On Thu, Jul 15, 2021 at 01:08:15PM -0700, Roman Gushchin wrote:
> > > On Thu, Jul 15, 2021 at 10:10:50AM -0700, Darrick J. Wong wrote:
> > > > On Thu, Jul 15, 2021 at 11:51:50AM +0800, Boyang Xue wrote:
> > > > > On Thu, Jul 15, 2021 at 10:36 AM Matthew Wilcox <willy@infradead.org> wrote:
> > > > > >
> > > > > > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote:
> > > > > > > It's unclear to me that where to find the required address in the
> > > > > > > addr2line command line, i.e.
> > > > > > >
> > > > > > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > > > > > <what address here?>
> > > > > >
> > > > > > ./scripts/faddr2line /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux cleanup_offline_cgwbs_workfn+0x320/0x394
> > > > > >
> > > > > 
> > > > > Thanks! The result is the same as the
> > > > > 
> > > > > addr2line -i -e
> > > > > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > > > FFFF8000102D6DD0
> > > > > 
> > > > > But this script is very handy.
> > > > > 
> > > > > # /usr/src/kernels/5.14.0-0.rc1.15.bx.el9.aarch64/scripts/faddr2line
> > > > > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > > > cleanup_offlin
> > > > > e_cgwbs_workfn+0x320/0x394
> > > > > cleanup_offline_cgwbs_workfn+0x320/0x394:
> > > > > arch_atomic64_fetch_add_unless at
> > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2265
> > > > > (inlined by) arch_atomic64_add_unless at
> > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2290
> > > > > (inlined by) atomic64_add_unless at
> > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-instrumented.h:1149
> > > > > (inlined by) atomic_long_add_unless at
> > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-long.h:491
> > > > > (inlined by) percpu_ref_tryget_many at
> > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:247
> > > > > (inlined by) percpu_ref_tryget at
> > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:266
> > > > > (inlined by) wb_tryget at
> > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:227
> > > > > (inlined by) wb_tryget at
> > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:224
> > > > > (inlined by) cleanup_offline_cgwbs_workfn at
> > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c:679
> > > > > 
> > > > > # vi /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c
> > > > > ```
> > > > > static void cleanup_offline_cgwbs_workfn(struct work_struct *work)
> > > > > {
> > > > >         struct bdi_writeback *wb;
> > > > >         LIST_HEAD(processed);
> > > > > 
> > > > >         spin_lock_irq(&cgwb_lock);
> > > > > 
> > > > >         while (!list_empty(&offline_cgwbs)) {
> > > > >                 wb = list_first_entry(&offline_cgwbs, struct bdi_writeback,
> > > > >                                       offline_node);
> > > > >                 list_move(&wb->offline_node, &processed);
> > > > > 
> > > > >                 /*
> > > > >                  * If wb is dirty, cleaning up the writeback by switching
> > > > >                  * attached inodes will result in an effective removal of any
> > > > >                  * bandwidth restrictions, which isn't the goal.  Instead,
> > > > >                  * it can be postponed until the next time, when all io
> > > > >                  * will be likely completed.  If in the meantime some inodes
> > > > >                  * will get re-dirtied, they should be eventually switched to
> > > > >                  * a new cgwb.
> > > > >                  */
> > > > >                 if (wb_has_dirty_io(wb))
> > > > >                         continue;
> > > > > 
> > > > >                 if (!wb_tryget(wb))  <=== line#679
> > > > >                         continue;
> > > > > 
> > > > >                 spin_unlock_irq(&cgwb_lock);
> > > > >                 while (cleanup_offline_cgwb(wb))
> > > > >                         cond_resched();
> > > > >                 spin_lock_irq(&cgwb_lock);
> > > > > 
> > > > >                 wb_put(wb);
> > > > >         }
> > > > > 
> > > > >         if (!list_empty(&processed))
> > > > >                 list_splice_tail(&processed, &offline_cgwbs);
> > > > > 
> > > > >         spin_unlock_irq(&cgwb_lock);
> > > > > }
> > > > > ```
> > > > > 
> > > > > BTW, this bug can be only reproduced on a non-debug production built
> > > > > kernel (a.k.a kernel rpm package), it's not reproducible on a debug
> > > > > build with various debug configuration enabled (a.k.a kernel-debug rpm
> > > > > package)
> > > > 
> > > > FWIW I've also seen this regularly on x86_64 kernels on ext4 with all
> > > > default mkfs settings when running generic/256.
> > > 
> > > Oh, that's a useful information, thank you!
> > > 
> > > Btw, would you mind to give a patch from an earlier message in the thread
> > > a test? I'd highly appreciate it.
> > > 
> > > Thanks!
> > 
> > Will do.
> 
> fstests passed here, so
> 
> Tested-by: Darrick J. Wong <djwong@kernel.org>

Great, thank you!

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-16 20:03                     ` Roman Gushchin
@ 2021-07-17 12:00                       ` Boyang Xue
  2021-07-22  5:29                         ` Boyang Xue
  0 siblings, 1 reply; 21+ messages in thread
From: Boyang Xue @ 2021-07-17 12:00 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: Darrick J. Wong, Matthew Wilcox, Jan Kara, linux-fsdevel

Testing fstests on aarch64, x86_64, and s390x all passed. There's a
shortage of ppc64le systems, so I can't provide the ppc64le test
result right now, but I hope to report it next week.

Thanks,
Boyang

On Sat, Jul 17, 2021 at 4:04 AM Roman Gushchin <guro@fb.com> wrote:
>
> On Fri, Jul 16, 2021 at 09:23:40AM -0700, Darrick J. Wong wrote:
> > On Thu, Jul 15, 2021 at 03:28:12PM -0700, Darrick J. Wong wrote:
> > > On Thu, Jul 15, 2021 at 01:08:15PM -0700, Roman Gushchin wrote:
> > > > On Thu, Jul 15, 2021 at 10:10:50AM -0700, Darrick J. Wong wrote:
> > > > > On Thu, Jul 15, 2021 at 11:51:50AM +0800, Boyang Xue wrote:
> > > > > > On Thu, Jul 15, 2021 at 10:36 AM Matthew Wilcox <willy@infradead.org> wrote:
> > > > > > >
> > > > > > > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote:
> > > > > > > > It's unclear to me that where to find the required address in the
> > > > > > > > addr2line command line, i.e.
> > > > > > > >
> > > > > > > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > > > > > > <what address here?>
> > > > > > >
> > > > > > > ./scripts/faddr2line /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux cleanup_offline_cgwbs_workfn+0x320/0x394
> > > > > > >
> > > > > >
> > > > > > Thanks! The result is the same as the
> > > > > >
> > > > > > addr2line -i -e
> > > > > > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > > > > FFFF8000102D6DD0
> > > > > >
> > > > > > But this script is very handy.
> > > > > >
> > > > > > # /usr/src/kernels/5.14.0-0.rc1.15.bx.el9.aarch64/scripts/faddr2line
> > > > > > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > > > > cleanup_offlin
> > > > > > e_cgwbs_workfn+0x320/0x394
> > > > > > cleanup_offline_cgwbs_workfn+0x320/0x394:
> > > > > > arch_atomic64_fetch_add_unless at
> > > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2265
> > > > > > (inlined by) arch_atomic64_add_unless at
> > > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2290
> > > > > > (inlined by) atomic64_add_unless at
> > > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-instrumented.h:1149
> > > > > > (inlined by) atomic_long_add_unless at
> > > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-long.h:491
> > > > > > (inlined by) percpu_ref_tryget_many at
> > > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:247
> > > > > > (inlined by) percpu_ref_tryget at
> > > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:266
> > > > > > (inlined by) wb_tryget at
> > > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:227
> > > > > > (inlined by) wb_tryget at
> > > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:224
> > > > > > (inlined by) cleanup_offline_cgwbs_workfn at
> > > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c:679
> > > > > >
> > > > > > # vi /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c
> > > > > > ```
> > > > > > static void cleanup_offline_cgwbs_workfn(struct work_struct *work)
> > > > > > {
> > > > > >         struct bdi_writeback *wb;
> > > > > >         LIST_HEAD(processed);
> > > > > >
> > > > > >         spin_lock_irq(&cgwb_lock);
> > > > > >
> > > > > >         while (!list_empty(&offline_cgwbs)) {
> > > > > >                 wb = list_first_entry(&offline_cgwbs, struct bdi_writeback,
> > > > > >                                       offline_node);
> > > > > >                 list_move(&wb->offline_node, &processed);
> > > > > >
> > > > > >                 /*
> > > > > >                  * If wb is dirty, cleaning up the writeback by switching
> > > > > >                  * attached inodes will result in an effective removal of any
> > > > > >                  * bandwidth restrictions, which isn't the goal.  Instead,
> > > > > >                  * it can be postponed until the next time, when all io
> > > > > >                  * will be likely completed.  If in the meantime some inodes
> > > > > >                  * will get re-dirtied, they should be eventually switched to
> > > > > >                  * a new cgwb.
> > > > > >                  */
> > > > > >                 if (wb_has_dirty_io(wb))
> > > > > >                         continue;
> > > > > >
> > > > > >                 if (!wb_tryget(wb))  <=== line#679
> > > > > >                         continue;
> > > > > >
> > > > > >                 spin_unlock_irq(&cgwb_lock);
> > > > > >                 while (cleanup_offline_cgwb(wb))
> > > > > >                         cond_resched();
> > > > > >                 spin_lock_irq(&cgwb_lock);
> > > > > >
> > > > > >                 wb_put(wb);
> > > > > >         }
> > > > > >
> > > > > >         if (!list_empty(&processed))
> > > > > >                 list_splice_tail(&processed, &offline_cgwbs);
> > > > > >
> > > > > >         spin_unlock_irq(&cgwb_lock);
> > > > > > }
> > > > > > ```
> > > > > >
> > > > > > BTW, this bug can be only reproduced on a non-debug production built
> > > > > > kernel (a.k.a kernel rpm package), it's not reproducible on a debug
> > > > > > build with various debug configuration enabled (a.k.a kernel-debug rpm
> > > > > > package)
> > > > >
> > > > > FWIW I've also seen this regularly on x86_64 kernels on ext4 with all
> > > > > default mkfs settings when running generic/256.
> > > >
> > > > Oh, that's a useful information, thank you!
> > > >
> > > > Btw, would you mind to give a patch from an earlier message in the thread
> > > > a test? I'd highly appreciate it.
> > > >
> > > > Thanks!
> > >
> > > Will do.
> >
> > fstests passed here, so
> >
> > Tested-by: Darrick J. Wong <djwong@kernel.org>
>
> Great, thank you!
>


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-17 12:00                       ` Boyang Xue
@ 2021-07-22  5:29                         ` Boyang Xue
  2021-07-22  5:41                           ` Roman Gushchin
  0 siblings, 1 reply; 21+ messages in thread
From: Boyang Xue @ 2021-07-22  5:29 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: Darrick J. Wong, Matthew Wilcox, Jan Kara, linux-fsdevel

Just FYI, the tests on ppc64le are done and there is no longer a kernel
panic, so my tests on all arches pass now.

On Sat, Jul 17, 2021 at 8:00 PM Boyang Xue <bxue@redhat.com> wrote:
>
> Testing fstests on aarch64, x86_64, s390x all passed. There's a
> shortage of ppc64le systems, so I can't provide the ppc64le test
> result for now, but I hope I can report the result next week.
>
> Thanks,
> Boyang
>
> On Sat, Jul 17, 2021 at 4:04 AM Roman Gushchin <guro@fb.com> wrote:
> >
> > On Fri, Jul 16, 2021 at 09:23:40AM -0700, Darrick J. Wong wrote:
> > > On Thu, Jul 15, 2021 at 03:28:12PM -0700, Darrick J. Wong wrote:
> > > > On Thu, Jul 15, 2021 at 01:08:15PM -0700, Roman Gushchin wrote:
> > > > > On Thu, Jul 15, 2021 at 10:10:50AM -0700, Darrick J. Wong wrote:
> > > > > > On Thu, Jul 15, 2021 at 11:51:50AM +0800, Boyang Xue wrote:
> > > > > > > On Thu, Jul 15, 2021 at 10:36 AM Matthew Wilcox <willy@infradead.org> wrote:
> > > > > > > >
> > > > > > > > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote:
> > > > > > > > > It's unclear to me that where to find the required address in the
> > > > > > > > > addr2line command line, i.e.
> > > > > > > > >
> > > > > > > > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > > > > > > > <what address here?>
> > > > > > > >
> > > > > > > > ./scripts/faddr2line /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux cleanup_offline_cgwbs_workfn+0x320/0x394
> > > > > > > >
> > > > > > >
> > > > > > > Thanks! The result is the same as the
> > > > > > >
> > > > > > > addr2line -i -e
> > > > > > > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > > > > > FFFF8000102D6DD0
> > > > > > >
> > > > > > > But this script is very handy.
> > > > > > >
> > > > > > > # /usr/src/kernels/5.14.0-0.rc1.15.bx.el9.aarch64/scripts/faddr2line
> > > > > > > /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > > > > > cleanup_offlin
> > > > > > > e_cgwbs_workfn+0x320/0x394
> > > > > > > cleanup_offline_cgwbs_workfn+0x320/0x394:
> > > > > > > arch_atomic64_fetch_add_unless at
> > > > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2265
> > > > > > > (inlined by) arch_atomic64_add_unless at
> > > > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2290
> > > > > > > (inlined by) atomic64_add_unless at
> > > > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-instrumented.h:1149
> > > > > > > (inlined by) atomic_long_add_unless at
> > > > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-long.h:491
> > > > > > > (inlined by) percpu_ref_tryget_many at
> > > > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:247
> > > > > > > (inlined by) percpu_ref_tryget at
> > > > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:266
> > > > > > > (inlined by) wb_tryget at
> > > > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:227
> > > > > > > (inlined by) wb_tryget at
> > > > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:224
> > > > > > > (inlined by) cleanup_offline_cgwbs_workfn at
> > > > > > > /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c:679
> > > > > > >
> > > > > > > # vi /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c
> > > > > > > ```
> > > > > > > static void cleanup_offline_cgwbs_workfn(struct work_struct *work)
> > > > > > > {
> > > > > > >         struct bdi_writeback *wb;
> > > > > > >         LIST_HEAD(processed);
> > > > > > >
> > > > > > >         spin_lock_irq(&cgwb_lock);
> > > > > > >
> > > > > > >         while (!list_empty(&offline_cgwbs)) {
> > > > > > >                 wb = list_first_entry(&offline_cgwbs, struct bdi_writeback,
> > > > > > >                                       offline_node);
> > > > > > >                 list_move(&wb->offline_node, &processed);
> > > > > > >
> > > > > > >                 /*
> > > > > > >                  * If wb is dirty, cleaning up the writeback by switching
> > > > > > >                  * attached inodes will result in an effective removal of any
> > > > > > >                  * bandwidth restrictions, which isn't the goal.  Instead,
> > > > > > >                  * it can be postponed until the next time, when all io
> > > > > > >                  * will be likely completed.  If in the meantime some inodes
> > > > > > >                  * will get re-dirtied, they should be eventually switched to
> > > > > > >                  * a new cgwb.
> > > > > > >                  */
> > > > > > >                 if (wb_has_dirty_io(wb))
> > > > > > >                         continue;
> > > > > > >
> > > > > > >                 if (!wb_tryget(wb))  <=== line#679
> > > > > > >                         continue;
> > > > > > >
> > > > > > >                 spin_unlock_irq(&cgwb_lock);
> > > > > > >                 while (cleanup_offline_cgwb(wb))
> > > > > > >                         cond_resched();
> > > > > > >                 spin_lock_irq(&cgwb_lock);
> > > > > > >
> > > > > > >                 wb_put(wb);
> > > > > > >         }
> > > > > > >
> > > > > > >         if (!list_empty(&processed))
> > > > > > >                 list_splice_tail(&processed, &offline_cgwbs);
> > > > > > >
> > > > > > >         spin_unlock_irq(&cgwb_lock);
> > > > > > > }
> > > > > > > ```
> > > > > > >
> > > > > > > BTW, this bug can be only reproduced on a non-debug production built
> > > > > > > kernel (a.k.a kernel rpm package), it's not reproducible on a debug
> > > > > > > build with various debug configuration enabled (a.k.a kernel-debug rpm
> > > > > > > package)
> > > > > >
> > > > > > FWIW I've also seen this regularly on x86_64 kernels on ext4 with all
> > > > > > default mkfs settings when running generic/256.
> > > > >
> > > > > Oh, that's a useful information, thank you!
> > > > >
> > > > > Btw, would you mind to give a patch from an earlier message in the thread
> > > > > a test? I'd highly appreciate it.
> > > > >
> > > > > Thanks!
> > > >
> > > > Will do.
> > >
> > > fstests passed here, so
> > >
> > > Tested-by: Darrick J. Wong <djwong@kernel.org>
> >
> > Great, thank you!
> >


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash
  2021-07-22  5:29                         ` Boyang Xue
@ 2021-07-22  5:41                           ` Roman Gushchin
  0 siblings, 0 replies; 21+ messages in thread
From: Roman Gushchin @ 2021-07-22  5:41 UTC (permalink / raw)
  To: Boyang Xue; +Cc: Darrick J. Wong, Matthew Wilcox, Jan Kara, linux-fsdevel

Thank you very much for testing it!

Sent from my iPhone

> On Jul 21, 2021, at 22:29, Boyang Xue <bxue@redhat.com> wrote:
> 
> Just FYI, the tests on ppc64le are done, no longer kernel panic, so my
> tests on all arches are fine now.
> 
>> On Sat, Jul 17, 2021 at 8:00 PM Boyang Xue <bxue@redhat.com> wrote:
>> 
>> Testing fstests on aarch64, x86_64, s390x all passed. There's a
>> shortage of ppc64le systems, so I can't provide the ppc64le test
>> result for now, but I hope I can report the result next week.
>> 
>> Thanks,
>> Boyang
>> 
>>> On Sat, Jul 17, 2021 at 4:04 AM Roman Gushchin <guro@fb.com> wrote:
>>> 
>>> On Fri, Jul 16, 2021 at 09:23:40AM -0700, Darrick J. Wong wrote:
>>>> On Thu, Jul 15, 2021 at 03:28:12PM -0700, Darrick J. Wong wrote:
>>>>> On Thu, Jul 15, 2021 at 01:08:15PM -0700, Roman Gushchin wrote:
>>>>>> On Thu, Jul 15, 2021 at 10:10:50AM -0700, Darrick J. Wong wrote:
>>>>>>> On Thu, Jul 15, 2021 at 11:51:50AM +0800, Boyang Xue wrote:
>>>>>>>> On Thu, Jul 15, 2021 at 10:36 AM Matthew Wilcox <willy@infradead.org> wrote:
>>>>>>>>> 
>>>>>>>>> On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote:
>>>>>>>>>> It's unclear to me that where to find the required address in the
>>>>>>>>>> addr2line command line, i.e.
>>>>>>>>>> 
>>>>>>>>>> addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
>>>>>>>>>> <what address here?>
>>>>>>>>> 
>>>>>>>>> ./scripts/faddr2line /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux cleanup_offline_cgwbs_workfn+0x320/0x394
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks! The result is the same as the
>>>>>>>> 
>>>>>>>> addr2line -i -e
>>>>>>>> /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
>>>>>>>> FFFF8000102D6DD0
>>>>>>>> 
>>>>>>>> But this script is very handy.
>>>>>>>> 
>>>>>>>> # /usr/src/kernels/5.14.0-0.rc1.15.bx.el9.aarch64/scripts/faddr2line
>>>>>>>> /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
>>>>>>>> cleanup_offlin
>>>>>>>> e_cgwbs_workfn+0x320/0x394
>>>>>>>> cleanup_offline_cgwbs_workfn+0x320/0x394:
>>>>>>>> arch_atomic64_fetch_add_unless at
>>>>>>>> /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2265
>>>>>>>> (inlined by) arch_atomic64_add_unless at
>>>>>>>> /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/atomic-arch-fallback.h:2290
>>>>>>>> (inlined by) atomic64_add_unless at
>>>>>>>> /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-instrumented.h:1149
>>>>>>>> (inlined by) atomic_long_add_unless at
>>>>>>>> /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/asm-generic/atomic-long.h:491
>>>>>>>> (inlined by) percpu_ref_tryget_many at
>>>>>>>> /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:247
>>>>>>>> (inlined by) percpu_ref_tryget at
>>>>>>>> /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/percpu-refcount.h:266
>>>>>>>> (inlined by) wb_tryget at
>>>>>>>> /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:227
>>>>>>>> (inlined by) wb_tryget at
>>>>>>>> /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/./include/linux/backing-dev-defs.h:224
>>>>>>>> (inlined by) cleanup_offline_cgwbs_workfn at
>>>>>>>> /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c:679
>>>>>>>> 
>>>>>>>> # vi /usr/src/debug/kernel-5.14.0-0.rc1.15.bx/linux-5.14.0-0.rc1.15.bx.el9.aarch64/mm/backing-dev.c
>>>>>>>> ```
>>>>>>>> static void cleanup_offline_cgwbs_workfn(struct work_struct *work)
>>>>>>>> {
>>>>>>>>        struct bdi_writeback *wb;
>>>>>>>>        LIST_HEAD(processed);
>>>>>>>> 
>>>>>>>>        spin_lock_irq(&cgwb_lock);
>>>>>>>> 
>>>>>>>>        while (!list_empty(&offline_cgwbs)) {
>>>>>>>>                wb = list_first_entry(&offline_cgwbs, struct bdi_writeback,
>>>>>>>>                                      offline_node);
>>>>>>>>                list_move(&wb->offline_node, &processed);
>>>>>>>> 
>>>>>>>>                /*
>>>>>>>>                 * If wb is dirty, cleaning up the writeback by switching
>>>>>>>>                 * attached inodes will result in an effective removal of any
>>>>>>>>                 * bandwidth restrictions, which isn't the goal.  Instead,
>>>>>>>>                 * it can be postponed until the next time, when all io
>>>>>>>>                 * will be likely completed.  If in the meantime some inodes
>>>>>>>>                 * will get re-dirtied, they should be eventually switched to
>>>>>>>>                 * a new cgwb.
>>>>>>>>                 */
>>>>>>>>                if (wb_has_dirty_io(wb))
>>>>>>>>                        continue;
>>>>>>>> 
>>>>>>>>                if (!wb_tryget(wb))  <=== line#679
>>>>>>>>                        continue;
>>>>>>>> 
>>>>>>>>                spin_unlock_irq(&cgwb_lock);
>>>>>>>>                while (cleanup_offline_cgwb(wb))
>>>>>>>>                        cond_resched();
>>>>>>>>                spin_lock_irq(&cgwb_lock);
>>>>>>>> 
>>>>>>>>                wb_put(wb);
>>>>>>>>        }
>>>>>>>> 
>>>>>>>>        if (!list_empty(&processed))
>>>>>>>>                list_splice_tail(&processed, &offline_cgwbs);
>>>>>>>> 
>>>>>>>>        spin_unlock_irq(&cgwb_lock);
>>>>>>>> }
>>>>>>>> ```
>>>>>>>> 
>>>>>>>> BTW, this bug can be only reproduced on a non-debug production built
>>>>>>>> kernel (a.k.a kernel rpm package), it's not reproducible on a debug
>>>>>>>> build with various debug configuration enabled (a.k.a kernel-debug rpm
>>>>>>>> package)
>>>>>>> 
>>>>>>> FWIW I've also seen this regularly on x86_64 kernels on ext4 with all
>>>>>>> default mkfs settings when running generic/256.
>>>>>> 
>>>>>> Oh, that's a useful information, thank you!
>>>>>> 
>>>>>> Btw, would you mind to give a patch from an earlier message in the thread
>>>>>> a test? I'd highly appreciate it.
>>>>>> 
>>>>>> Thanks!
>>>>> 
>>>>> Will do.
>>>> 
>>>> fstests passed here, so
>>>> 
>>>> Tested-by: Darrick J. Wong <djwong@kernel.org>
>>> 
>>> Great, thank you!
>>> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2021-07-22  5:42 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-14  3:21 Patch 'writeback, cgroup: release dying cgwbs by switching attached inodes' leads to kernel crash Boyang Xue
2021-07-14  3:57 ` Boyang Xue
2021-07-14  4:11 ` Roman Gushchin
2021-07-14  8:44   ` Boyang Xue
2021-07-14  9:26     ` Jan Kara
2021-07-14 16:22       ` Boyang Xue
2021-07-14 23:46         ` Roman Gushchin
2021-07-15  1:42           ` Boyang Xue
2021-07-15  9:31             ` Jan Kara
2021-07-15 16:04               ` Roman Gushchin
2021-07-16  1:37                 ` Boyang Xue
2021-07-15  2:35         ` Matthew Wilcox
2021-07-15  3:51           ` Boyang Xue
2021-07-15 17:10             ` Darrick J. Wong
2021-07-15 20:08               ` Roman Gushchin
2021-07-15 22:28                 ` Darrick J. Wong
2021-07-16 16:23                   ` Darrick J. Wong
2021-07-16 20:03                     ` Roman Gushchin
2021-07-17 12:00                       ` Boyang Xue
2021-07-22  5:29                         ` Boyang Xue
2021-07-22  5:41                           ` Roman Gushchin
