linux-kernel.vger.kernel.org archive mirror
* "rcu: React to callback overload by aggressively seeking quiescent states" hangs on boot
@ 2019-12-13  6:13 Qian Cai
  2019-12-13 22:46 ` Paul E. McKenney
  0 siblings, 1 reply; 11+ messages in thread
From: Qian Cai @ 2019-12-13  6:13 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Joel Fernandes (Google),
	Tejun Heo, Josh Triplett, Steven Rostedt, rcu,
	Linux Kernel Mailing List

The linux-next commit 82150cb53dcb ("rcu: React to callback overload by aggressively seeking quiescent states")
causes hangs on boot on almost all arches. Reverting it fixed the issue.

=== x86_64 (Intel) ===

https://raw.githubusercontent.com/cailca/linux-mm/master/x86.config

[   29.130611][    T0] mce: CPU0: Thermal monitoring enabled (TM1)
[   29.136598][    T0] process: using mwait in idle threads
[   29.140582][    T0] Last level iTLB entries: 4KB 64, 2MB 8, 4MB 8
[   29.146704][    T0] Last level dTLB entries: 4KB 64, 2MB 0, 4MB 0, 1GB 4
[   29.150570][    T0] Spectre V1 : Mitigation: usercopy/swapgs barriers and __user pointer sanitization
[   29.160584][    T0] Spectre V2 : Mitigation: Full generic retpoline
[   29.166881][    T0] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
[   29.170567][    T0] Spectre V2 : Enabling Restricted Speculation for firmware calls
[   29.180569][    T0] Spectre V2 : mitigation: Enabling conditional Indirect Branch Prediction Barrier
[   29.190567][    T0] Spectre V2 : User space: Mitigation: STIBP via seccomp and prctl
[   29.200569][    T0] Speculative Store Bypass: Mitigation: Speculative Store Bypass disabled via prctl and seccomp
[   29.210570][    T0] TAA: Vulnerable: Clear CPU buffers attempted, no microcode
[   28.995181][    T0] MDS: Vulnerable: Clear CPU buffers attempted, no microcode
[   29.005929][    T0] debug: unmapping init [mem 0xffffffffb50ec000-0xffffffffb50f0fff]
[   29.035681][    T1] smpboot: CPU0: Intel(R) Xeon(R) 
<hang ….>

=== arm64 ===

https://raw.githubusercontent.com/cailca/linux-mm/master/arm64.config

[    0.000000][    T0] ITS [mem 0x440100000-0x44011ffff]
[    0.000000][    T0] ITS@0x0000000440100000: allocated 65536 Devices @8bfd080000 (flat, esz 8, psz 64K, shr 0)
[    0.000000][    T0] ITS@0x0000000440100000: allocated 32768 Interrupt Collections @8bfd020000 (flat, esz 2, psz 16K, shr 0)
[    0.000000][    T0] ITS: using cache flushing for cmd queue
[    0.000000][    T0] GICv3: using LPI property table @0x0000000880db0000
[    0.000000][    T0] GIC: using cache flushing for LPI property table
[    0.000000][    T0] GICv3: CPU0: using allocated LPI pending table @0x0000000880dd0000
[    0.000000][    T0] arch_timer: cp15 timer(s) running at 200.00MHz (phys).
[    0.000000][    T0] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x2e2049d3e8, max_idle_ns: 440795210634 ns
[    0.000005][    T0] sched_clock: 56 bits at 200MHz, resolution 5ns, wraps every 4398046511102ns
[    0.061872][    T0] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[    0.070420][    T0] ... MAX_LOCKDEP_SUBCLASSES:  8
[    0.075298][    T0] ... MAX_LOCK_DEPTH:          48
[    0.080240][    T0] ... MAX_LOCKDEP_KEYS:        8192
[    0.085379][    T0] ... CLASSHASH_SIZE:          4096
[    0.090496][    T0] ... MAX_LOCKDEP_ENTRIES:     32768
[    0.095722][    T0] ... MAX_LOCKDEP_CHAINS:      65536
[    0.100926][    T0] ... CHAINHASH_SIZE:          32768
[    0.106153][    T0]  memory used by lock dependency info: 6237 kB
[    0.112324][    T0]  memory used for stack traces: 4224 kB
[    0.117902][    T0]  per task-struct memory footprint: 1920 bytes
[    0.158652][    T0] ACPI: Core revision 20191018
[    0.194716][    T0] Calibrating delay loop (skipped), value calculated using timer frequency.. 400.00 BogoMIPS (lpj=2000000)
[    0.206116][    T0] pid_max: default: 262144 minimum: 2048
[    0.355206][    T0] Dentry cache hash table entries: 8388608 (order: 10, 67108864 bytes, vmalloc)
[    0.396920][    T0] Inode-cache hash table entries: 4194304 (order: 9, 33554432 bytes, vmalloc)
[    0.422261][    T0] Mount-cache hash table entries: 131072 (order: 4, 1048576 bytes, vmalloc)
[    0.431925][    T0] Mountpoint-cache hash table entries: 131072 (order: 4, 1048576 bytes, vmalloc)
[    0.736297][    T1] ASID allocator initialised with 32768 entries
[    0.743932][    T1] rcu: Hierarchical SRCU implementation.
[    0.759898][    T1] Platform MSI: ITS@0x400100000 domain created
[    0.766249][    T1] Platform MSI: ITS@0x440100000 domain created
[    0.772602][    T1] PCI/MSI: ITS@0x400100000 domain created
[    0.778561][    T1] PCI/MSI: ITS@0x440100000 domain created
[    0.784292][    T1] Remapping and enabling EFI services.
<hang …>

=== powerpc ===

https://raw.githubusercontent.com/cailca/linux-mm/master/powerpc.config

[    0.000000][    T0] SLUB: HWalign=128, Order=0-0, MinObjects=0, CPUs=128, Nodes=256
[    0.000000][    T0] ODEBUG: selftest passed
[    0.000000][    T0] ftrace: allocating 19886 entries in 8 pages
[    0.000000][    T0] ftrace: allocated 8 pages with 1 groups
[    0.000000][    T0] Running RCU self tests
[    0.000000][    T0] rcu: Hierarchical RCU implementation.
[    0.000000][    T0] rcu: 	RCU lockdep checking is enabled.
[    0.000000][    T0] rcu: 	RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=128.
[    0.000000][    T0] rcu: 	RCU callback double-/use-after-free debug enabled.
[    0.000000][    T0] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
[    0.000000][    T0] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=128
[    0.000000][    T0] NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
[    0.000000][    T0] xive: Interrupt handling initialized with native backend
[    0.000000][    T0] xive: Using priority 7 for all interrupts
[    0.000000][    T0] xive: Using 64kB queues
[    0.000007][    T0] time_init: 56 bit decrementer (max: 7fffffffffffff)
[    0.003188][    T0] clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x761537d007, max_idle_ns: 440795202126 ns
[    0.011496][    T0] clocksource: timebase mult[1f40000] shift[24] registered
[    0.029470][    T0] printk: console [hvc0] enabled
[    0.029470][    T0] printk: console [hvc0] enabled
[    0.035652][    T0] printk: bootconsole [udbg0] disabled
[    0.035652][    T0] printk: bootconsole [udbg0] disabled
[    0.040864][    T0] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[    0.040892][    T0] ... MAX_LOCKDEP_SUBCLASSES:  8
[    0.040918][    T0] ... MAX_LOCK_DEPTH:          48
[    0.040944][    T0] ... MAX_LOCKDEP_KEYS:        8192
[    0.040969][    T0] ... CLASSHASH_SIZE:     
<hang ...>
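
For comparing where each architecture stops, the console logs above can be parsed mechanically. The following is only a minimal sketch: it assumes every log line carries the `[timestamp][caller]` prefix seen above (the second bracket being the printk caller id, `T<pid>` for a task and `C<cpu>` for interrupt context), and the helper names are hypothetical. It reports the last message printed before the hang and flags places where timestamps step backwards, as they do between the TAA and MDS lines in the x86_64 log.

```python
import re

# Matches "[   29.130611][    T0] message"; the second bracket is the
# printk caller id (T<pid> for a task, C<cpu> for interrupt context).
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\[\s*([TC]\d+)\]\s?(.*)$")


def parse_console_log(text):
    """Return a list of (timestamp, caller, message) for matching lines."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            entries.append((float(match.group(1)), match.group(2), match.group(3)))
    return entries


def last_entry(text):
    """Return the last parsed entry, i.e. the final message before the hang."""
    entries = parse_console_log(text)
    return entries[-1] if entries else None


def backward_steps(text):
    """Return pairs of adjacent entries whose timestamps go backwards."""
    entries = parse_console_log(text)
    return [(a, b) for a, b in zip(entries, entries[1:]) if b[0] < a[0]]


# Tail of the x86_64 log from this report; lines without the prefix are skipped.
sample = """\
[   29.210570][    T0] TAA: Vulnerable: Clear CPU buffers attempted, no microcode
[   28.995181][    T0] MDS: Vulnerable: Clear CPU buffers attempted, no microcode
[   29.005929][    T0] debug: unmapping init [mem 0xffffffffb50ec000-0xffffffffb50f0fff]
[   29.035681][    T1] smpboot: CPU0: Intel(R) Xeon(R)
<hang>
"""

print(last_entry(sample))
print(len(backward_steps(sample)))
```

Running the same helpers over the arm64 and powerpc logs gives the last message each machine printed, which makes it easier to see that the three hangs occur at different points in boot.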

* Re: "rcu: React to callback overload by aggressively seeking quiescent states" hangs on boot
  2019-12-13  6:13 "rcu: React to callback overload by aggressively seeking quiescent states" hangs on boot Qian Cai
@ 2019-12-13 22:46 ` Paul E. McKenney
  2019-12-13 23:11   ` Qian Cai
  0 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2019-12-13 22:46 UTC (permalink / raw)
  To: Qian Cai
  Cc: Joel Fernandes (Google),
	Tejun Heo, Josh Triplett, Steven Rostedt, rcu,
	Linux Kernel Mailing List

On Fri, Dec 13, 2019 at 01:13:27AM -0500, Qian Cai wrote:
> The linux-next commit 82150cb53dcb ("rcu: React to callback overload by aggressively seeking quiescent states")
> causes hangs on boot on almost all arches. Reverting it fixed the issue.

I am running this on a number of x86 systems, but will try it on a
wider variety.  If I cannot reproduce it, would you be willing to
run diagnostics?

Just to double-check...  Are you running rcutorture built into the kernel?
(My guess is "no", but figured that I should ask.)

							Thanx, Paul


* Re: "rcu: React to callback overload by aggressively seeking quiescent states" hangs on boot
  2019-12-13 22:46 ` Paul E. McKenney
@ 2019-12-13 23:11   ` Qian Cai
  2019-12-14  6:40     ` Paul E. McKenney
  0 siblings, 1 reply; 11+ messages in thread
From: Qian Cai @ 2019-12-13 23:11 UTC (permalink / raw)
  To: paulmck
  Cc: Joel Fernandes (Google),
	Tejun Heo, Josh Triplett, Steven Rostedt, rcu,
	Linux Kernel Mailing List



> On Dec 13, 2019, at 5:46 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> 
> I am running this on a number of x86 systems, but will try it on a

The config to reproduce includes several debugging options that might be required to recreate it.

> wider variety.  If I cannot reproduce it, would you be willing to
> run diagnostics?

Yes.

> 
> Just to double-check...  Are you running rcutorture built into the kernel?
> (My guess is "no", but figured that I should ask.)

No, as you can see from the config I linked in the original email.
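
Whether rcutorture is built in can be read straight from the kernel config. As a rough sketch (the helper names are hypothetical, and the sample fragment is illustrative rather than taken from the linked x86.config), the check amounts to looking at the state of `CONFIG_RCU_TORTURE_TEST`, where `y` means built in, `m` means modular, and "is not set" means disabled:

```python
import re


def config_option_states(config_text, pattern):
    """Map each matching CONFIG_* option to its value or to 'not set'."""
    states = {}
    needle = pattern.lower()
    for line in config_text.splitlines():
        line = line.strip()
        unset = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if unset:
            name, value = unset.group(1), "not set"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
        else:
            continue  # blank lines and ordinary comments
        if needle in name.lower():
            states[name] = value
    return states


def rcutorture_built_in(config_text):
    """True only if CONFIG_RCU_TORTURE_TEST=y (built in, not modular or unset)."""
    states = config_option_states(config_text, "RCU_TORTURE_TEST")
    return states.get("CONFIG_RCU_TORTURE_TEST") == "y"


# Illustrative fragment, equivalent to the output of: grep -i torture .config
sample = """\
CONFIG_LOCK_TORTURE_TEST=m
CONFIG_TORTURE_TEST=m
# CONFIG_RCU_TORTURE_TEST is not set
"""

print(config_option_states(sample, "torture"))
print(rcutorture_built_in(sample))
```

A case-insensitive substring match is used so the helper behaves like `grep -i`; a stricter check would compare option names exactly.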

* Re: "rcu: React to callback overload by aggressively seeking quiescent states" hangs on boot
  2019-12-13 23:11   ` Qian Cai
@ 2019-12-14  6:40     ` Paul E. McKenney
       [not found]       ` <CAA42JLbBFkpYHXRVvyveYO76DnbkE3gyRW-=qmBGZcJTAiB6Uw@mail.gmail.com>
  0 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2019-12-14  6:40 UTC (permalink / raw)
  To: Qian Cai
  Cc: Joel Fernandes (Google),
	Tejun Heo, Josh Triplett, Steven Rostedt, rcu,
	Linux Kernel Mailing List

On Fri, Dec 13, 2019 at 06:11:16PM -0500, Qian Cai wrote:
> 
> 
> > On Dec 13, 2019, at 5:46 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> > 
> > I am running this on a number of x86 systems, but will try it on a
> 
> The config to reproduce includes several debugging options that might
> be required to recreate it.

If you run without those debugging options, do you still see the hangs?
If not, please let me know which debugging options are involved.

> > wider variety.  If I cannot reproduce it, would you be willing to
> > run diagnostics?
> 
> Yes.

Very good!  Let me see what I can put together.  (No luck reproducing
at my end thus far.)

> > Just to double-check...  Are you running rcutorture built into the kernel?
> > (My guess is "no", but figured that I should ask.)
> 
> No, as you can see from the config I linked in the original email.

Fair point, and please accept my apologies for the pointless question.

							Thanx, Paul

* Re: "rcu: React to callback overload by aggressively seeking quiescent states" hangs on boot
       [not found]       ` <CAA42JLbBFkpYHXRVvyveYO76DnbkE3gyRW-=qmBGZcJTAiB6Uw@mail.gmail.com>
@ 2019-12-15 19:29         ` Dexuan-Linux Cui
  2019-12-15 20:20         ` Paul E. McKenney
  1 sibling, 0 replies; 11+ messages in thread
From: Dexuan-Linux Cui @ 2019-12-15 19:29 UTC (permalink / raw)
  To: paulmck
  Cc: Qian Cai, Joel Fernandes (Google),
	Tejun Heo, Josh Triplett, Steven Rostedt, rcu,
	Linux Kernel Mailing List, Lili Deng, Dexuan Cui, Baihua Lu

On Sun, Dec 15, 2019 at 11:18 AM Dexuan-Linux Cui
<dexuan.linux@gmail.com> wrote:
>
> Hi,
> We're seeing the same hang issue with a recent Linux next-20191213
> kernel.  If we revert the same commit 82150cb53dcb ("rcu: React to
> callback overload by aggressively seeking quiescent states"), the
> issue goes away.
>
> Note: we're running the x86-64 Linux VM Hyper-V, and the the torture
> test is not used:
Sorry for missing an "on" after the word "VM" (and typing an extra "the").

* Re: "rcu: React to callback overload by aggressively seeking quiescent states" hangs on boot
       [not found]       ` <CAA42JLbBFkpYHXRVvyveYO76DnbkE3gyRW-=qmBGZcJTAiB6Uw@mail.gmail.com>
  2019-12-15 19:29         ` Dexuan-Linux Cui
@ 2019-12-15 20:20         ` Paul E. McKenney
  2019-12-15 20:40           ` Dexuan Cui
  1 sibling, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2019-12-15 20:20 UTC (permalink / raw)
  To: Dexuan-Linux Cui
  Cc: Qian Cai, Joel Fernandes (Google),
	Tejun Heo, Josh Triplett, Steven Rostedt, rcu,
	Linux Kernel Mailing List, Lili Deng, Dexuan Cui, Baihua Lu

On Sun, Dec 15, 2019 at 11:18:43AM -0800, Dexuan-Linux Cui wrote:
> On Fri, Dec 13, 2019 at 10:41 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Fri, Dec 13, 2019 at 06:11:16PM -0500, Qian Cai wrote:
> > >
> > >
> > > > On Dec 13, 2019, at 5:46 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> > > >
> > > > I am running this on a number of x86 systems, but will try it on a
> > >
> > > The config to reproduce includes several debugging options that might
> > > be required to recreate it.
> >
> > If you run without those debugging options, do you still see the hangs?
> > If not, please let me know which debugging options are involved.
> >
> > > > wider variety.  If I cannot reproduce it, would you be willing to
> > > > run diagnostics?
> > >
> > > Yes.
> >
> > Very good!  Let me see what I can put together.  (No luck reproducing
> > at my end thus far.)
> >
> > > > Just to double-check...  Are you running rcutorture built into the kernel?
> > > > (My guess is "no", but figured that I should ask.)
> > >
> > > No, as you can see from the config I linked in the original email.
> >
> > Fair point, and please accept my apologies for the pointless question.
> >
> >                                                         Thanx, Paul
> 
> Hi,
> We're seeing the same hang issue with a recent Linux next-20191213
> kernel.  If we revert the same commit 82150cb53dcb ("rcu: React to
> callback overload by aggressively seeking quiescent states"), the
> issue goes away.
> 
> Note: we're running the x86-64 Linux VM on Hyper-V, and the torture
> test is not used:
> 
> $ grep  -i torture  .config
> CONFIG_LOCK_TORTURE_TEST=m
> CONFIG_TORTURE_TEST=m
> # CONFIG_RCU_TORTURE_TEST is not set
> 
> (FYI: the kernel config and the serial console log are attached).
> 
> When the issue happens, I force a kernel panic via NMI several times
> and I can see that rcu_gp_kthread hangs at various places, but it looks
> like all of them are in the loop below:
> 
> (The first panic log is in the attachment)
> (gdb) l *(rcu_gp_kthread+0x703)
> 0xffffffff811128c3 is in rcu_gp_kthread (kernel/rcu/tree.c:1763).
> 1758                    if (rnp == rdp->mynode)
> 1759                            needgp = __note_gp_changes(rnp, rdp) || needgp;
> 1760                    /* smp_mb() provided by prior unlock-lock pair. */
> 1761                    needgp = rcu_future_gp_cleanup(rnp) || needgp;
> 1762                    // Reset overload indication for CPUs no longer overloaded
> 1763                    for_each_leaf_node_cpu_mask(rnp, cpu, rnp->cbovldmask) {
> 1764                            rdp = per_cpu_ptr(&rcu_data, cpu);
> 1765                            check_cb_ovld_locked(rdp, rnp);
> 1766                    }
> 1767                    sq = rcu_nocb_gp_get(rnp);

This is consistent with what I saw in Qian Cai's report, FYI.  So I
am very interested in learning whether the first patch in my reply [1]
helps you.

							Thanx, Paul

[1]  https://lore.kernel.org/lkml/20191215201646.GK2889@paulmck-ThinkPad-P72/
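
The gdb listing quoted above stops in a loop that visits the CPUs flagged in a leaf node's overload mask. As a conceptual model only, the sketch below shows what such a walk has to do: map each set bit to a CPU number within the node's range and always advance, so the walk is bounded by the size of the range. This is plain Python, not the kernel's `for_each_leaf_node_cpu_mask()` macro; the parameter names `grplo` and `grphi` are assumptions borrowed from the leaf-node CPU range, and nothing here is meant to describe the root cause of the hang, which this thread does not state.

```python
def cpus_in_leaf_mask(grplo, grphi, mask):
    """Return CPU numbers in [grplo, grphi] whose bit is set in mask.

    Bit 0 of mask stands for CPU grplo, bit 1 for grplo + 1, and so on.
    Bits that would map beyond grphi are ignored.
    """
    cpus = []
    bit = 0
    while grplo + bit <= grphi:
        if mask & (1 << bit):
            cpus.append(grplo + bit)
        bit += 1  # always advance, so the loop runs grphi - grplo + 1 times
    return cpus


# A hypothetical leaf node covering CPUs 16..31 with bits 0 and 2 set.
print(cpus_in_leaf_mask(16, 31, 0b101))
```

The property worth checking in any real implementation is the same one the comment points at: the iteration variable must make progress on every pass, including when the mask is empty or changes while it is being walked.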

* RE: "rcu: React to callback overload by aggressively seeking quiescent states" hangs on boot
  2019-12-15 20:20         ` Paul E. McKenney
@ 2019-12-15 20:40           ` Dexuan Cui
  2019-12-15 20:56             ` Paul E. McKenney
  0 siblings, 1 reply; 11+ messages in thread
From: Dexuan Cui @ 2019-12-15 20:40 UTC (permalink / raw)
  To: paulmck, Dexuan-Linux Cui
  Cc: Qian Cai, Joel Fernandes (Google),
	Tejun Heo, Josh Triplett, Steven Rostedt, rcu,
	Linux Kernel Mailing List, Lili Deng, Baihua Lu

> From: Paul E. McKenney <paulmck@kernel.org>
> Sent: Sunday, December 15, 2019 12:20 PM
> 
> This is consistent with what I saw in Qian Cai's report, FYI.  So I
> am very interested in learning whether the first patch in my reply [1]
> helps you.
> 							Thanx, Paul

Hi Paul, yes, your first patch (below) fixes the hang issue:

commit e8d6182b015bdd8221164477f4ab1c307bd2fbe9
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Sun Dec 15 10:59:06 2019 -0800

    squash! rcu: React to callback overload by aggressively seeking quiescent states

Thanks,
-- Dexuan

* Re: "rcu: React to callback overload by aggressively seeking quiescent states" hangs on boot
  2019-12-15 20:40           ` Dexuan Cui
@ 2019-12-15 20:56             ` Paul E. McKenney
  2019-12-15 21:02               ` Dexuan Cui
  0 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2019-12-15 20:56 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: Dexuan-Linux Cui, Qian Cai, Joel Fernandes (Google),
	Tejun Heo, Josh Triplett, Steven Rostedt, rcu,
	Linux Kernel Mailing List, Lili Deng, Baihua Lu

On Sun, Dec 15, 2019 at 08:40:40PM +0000, Dexuan Cui wrote:
> > From: Paul E. McKenney <paulmck@kernel.org>
> > Sent: Sunday, December 15, 2019 12:20 PM
> > 
> > This is consistent with what I saw in Qian Cai's report, FYI.  So I
> > am very interested in learning whether the first patch in my reply [1]
> > helps you.
> > 							Thanx, Paul
> 
> Hi Paul, yes, your first patch (below) fixes the hang issue:
> 
> commit e8d6182b015bdd8221164477f4ab1c307bd2fbe9
> Author: Paul E. McKenney <paulmck@kernel.org>
> Date:   Sun Dec 15 10:59:06 2019 -0800
> 
>     squash! rcu: React to callback overload by aggressively seeking quiescent states

Thank you!  May I add your Tested-by?

							Thanx, Paul

* RE: "rcu: React to callback overload by aggressively seeking quiescent states" hangs on boot
  2019-12-15 20:56             ` Paul E. McKenney
@ 2019-12-15 21:02               ` Dexuan Cui
  0 siblings, 0 replies; 11+ messages in thread
From: Dexuan Cui @ 2019-12-15 21:02 UTC (permalink / raw)
  To: paulmck
  Cc: Dexuan-Linux Cui, Qian Cai, Joel Fernandes (Google),
	Tejun Heo, Josh Triplett, Steven Rostedt, rcu,
	Linux Kernel Mailing List, Lili Deng, Baihua Lu

> From: Paul E. McKenney <paulmck@kernel.org>
> Sent: Sunday, December 15, 2019 12:57 PM
> To: Dexuan Cui <decui@microsoft.com>
> Cc: Dexuan-Linux Cui <dexuan.linux@gmail.com>; Qian Cai <cai@lca.pw>; Joel
> Fernandes (Google) <joel@joelfernandes.org>; Tejun Heo <tj@kernel.org>;
> Josh Triplett <josh@joshtriplett.org>; Steven Rostedt <rostedt@goodmis.org>;
> rcu@vger.kernel.org; Linux Kernel Mailing List <linux-kernel@vger.kernel.org>;
> Lili Deng <Lili.Deng@microsoft.com>; Baihua Lu <Baihua.Lu@microsoft.com>
> Subject: Re: "rcu: React to callback overload by aggressively seeking quiescent
> states" hangs on boot
> 
> On Sun, Dec 15, 2019 at 08:40:40PM +0000, Dexuan Cui wrote:
> > > From: Paul E. McKenney <paulmck@kernel.org>
> > > Sent: Sunday, December 15, 2019 12:20 PM
> > >
> > > This is consistent with what I saw in Qian Cai's report, FYI.  So I
> > > am very interested in learning whether the first patch in my reply [1]
> > > helps you.
> > > 							Thanx, Paul
> >
> > Hi Paul, yes, your first patch (below) fixes the hang issue:
> >
> > commit e8d6182b015bdd8221164477f4ab1c307bd2fbe9
> > Author: Paul E. McKenney <paulmck@kernel.org>
> > Date:   Sun Dec 15 10:59:06 2019 -0800
> >
> > >     squash! rcu: React to callback overload by aggressively seeking quiescent states
> 
> Thank you!  May I add your Tested-by?
> 
> 							Thanx, Paul

Tested-by: Dexuan Cui <decui@microsoft.com>

* RE: "rcu: React to callback overload by aggressively seeking quiescent states" hangs on boot
       [not found] <BCD69C9E-4E61-405F-A514-36096E0F34F4@lca.pw>
  2019-12-15 20:18 ` Paul E. McKenney
@ 2019-12-15 20:19 ` Dexuan Cui
  1 sibling, 0 replies; 11+ messages in thread
From: Dexuan Cui @ 2019-12-15 20:19 UTC (permalink / raw)
  To: Qian Cai, Dexuan-Linux Cui
  Cc: paulmck, Joel Fernandes (Google),
	Tejun Heo, Josh Triplett, Steven Rostedt, rcu,
	Linux Kernel Mailing List, Lili Deng, Baihua Lu

> From: Qian Cai <cai@lca.pw>
> Sent: Sunday, December 15, 2019 11:54 AM
> Subject: Re: "rcu: React to callback overload by aggressively seeking
> quiescent states" hangs on boot
>
>
> On Dec 15, 2019, at 2:18 PM, Dexuan-Linux Cui <dexuan.linux@gmail.com> wrote:
> We're seeing the same hang issue with a recent Linux next-20191213
> kernel.  If we revert the same commit 82150cb53dcb ("rcu: React to
> callback overload by aggressively seeking quiescent states"), the
> issue goes away.
>
> Does this patch work for you?
> https://lore.kernel.org/rcu/20191215065242.7155-1-cai@lca.pw/#r

Yes, this works for me. Thanks!

* Re: "rcu: React to callback overload by aggressively seeking quiescent states" hangs on boot
       [not found] <BCD69C9E-4E61-405F-A514-36096E0F34F4@lca.pw>
@ 2019-12-15 20:18 ` Paul E. McKenney
  2019-12-15 20:19 ` Dexuan Cui
  1 sibling, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2019-12-15 20:18 UTC (permalink / raw)
  To: Qian Cai
  Cc: Dexuan-Linux Cui, Joel Fernandes (Google),
	Tejun Heo, Josh Triplett, Steven Rostedt, rcu,
	Linux Kernel Mailing List, Lili Deng, Dexuan Cui, Baihua Lu

On Sun, Dec 15, 2019 at 02:53:37PM -0500, Qian Cai wrote:
> > On Dec 15, 2019, at 2:18 PM, Dexuan-Linux Cui <dexuan.linux@gmail.com> wrote:
> > We're seeing the same hang issue with a recent Linux next-20191213
> > kernel.  If we revert the same commit 82150cb53dcb ("rcu: React to
> > callback overload by aggressively seeking quiescent states"), the
> > issue goes away.
> 
> Does this patch work for you?
> 
> https://lore.kernel.org/rcu/20191215065242.7155-1-cai@lca.pw

Same question for this one:

https://lore.kernel.org/lkml/20191215201646.GK2889@paulmck-ThinkPad-P72/

							Thanx, Paul
