* WARNING: suspicious RCU usage
@ 2017-12-10 11:39 Peng Fan
  2017-12-10 12:00 ` Russell King - ARM Linux
  0 siblings, 1 reply; 25+ messages in thread
From: Peng Fan @ 2017-12-10 11:39 UTC (permalink / raw)
  To: linux-arm-kernel

Hi All,

I met an RCU warning when testing suspend/resume on an i.MX6Q-SDB (4 Cortex-A9 cores).
Suspend/resume itself still works, but I do not have a good idea about
the RCU warning. Please help if you have any ideas.

The defconfig is imx_v6_v7_defconfig of "4.15.0-rc2-00483-g4ded3bec65a0"

Test log:
root@imx6qpdlsolox:~# /unit_tests/Power_Management/suspend_quick_auto.sh

---- Running < suspend_quick_auto.sh > test ----

---- Running < /unit_tests/SRTC/rtcwakeup.out > test ----

rtcwakeup.out: wakeup from "mem" using rtc0 at Thu Sep  7 00:06:21 2017
[   21.688752] PM: suspend entry (deep)
[   21.693550] PM: Syncing filesystems ... done.
[   21.728528] Freezing user space processes ... (elapsed 0.002 seconds) done.
[   21.739020] OOM killer disabled.
[   21.742326] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
[   21.816672] PM: suspend devices took 0.060 seconds
[   21.832558] Disabling non-boot CPUs ...
[   21.932773]
[   21.933333] =============================
[   21.933338] WARNING: suspicious RCU usage
[   21.933342] 4.15.0-rc2-00483-g4ded3bec65a0 #85 Not tainted
[   21.933348] -----------------------------
[   21.933354] kernel/sched/fair.c:6333 suspicious rcu_dereference_check() usage!
[   21.933358]
[   21.933358] other info that might help us debug this:
[   21.933358]
[   21.933364]
[   21.933364] RCU used illegally from offline CPU!
[   21.933364] rcu_scheduler_active = 2, debug_locks = 0
[   21.933369] 3 locks held by swapper/2/0:
[   21.933373]  #0:  ((cpu_died).wait.lock){....}, at: [<4f528fc4>] complete+0x1c/0x58
[   21.933396]  #1:  (&p->pi_lock){-.-.}, at: [<656ae3ef>] try_to_wake_up+0x28/0x368
[   21.933415]  #2:  (rcu_read_lock){....}, at: [<73be2596>] select_task_rq_fair+0x150/0xf0c
[   21.933433]
[   21.933433] stack backtrace:
[   21.933441] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.15.0-rc2-00483-g4ded3bec65a0 #85
[   21.933445] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[   21.933448] Backtrace:
[   21.933459] [<c010c890>] (dump_backtrace) from [<c010cb60>] (show_stack+0x18/0x1c)
[   21.933466]  r7:00000000 r6:60070093 r5:00000000 r4:c107bd50
[   21.933475] [<c010cb48>] (show_stack) from [<c09eca50>] (dump_stack+0xb4/0xe8)
[   21.933484] [<c09ec99c>] (dump_stack) from [<c016e47c>] (lockdep_rcu_suspicious+0xb0/0x110)
[   21.933493]  r9:c0f7a380 r8:c107cd92 r7:c0cfc5c8 r6:000018bd r5:c0cff438 r4:ec0a0cc0
[   21.933503] [<c016e3cc>] (lockdep_rcu_suspicious) from [<c015a4e8>] (select_task_rq_fair+0xcf8/0xf0c)
[   21.933509]  r7:00000000 r6:00000000 r5:00000000 r4:00000000
[   21.933518] [<c01597f0>] (select_task_rq_fair) from [<c01505e4>] (try_to_wake_up+0x108/0x368)
[   21.933526]  r10:c1009290 r9:c0f7a380 r8:c01597f0 r7:40070093 r6:ec8e1e00 r5:00000000
[   21.933531]  r4:ec8e1980
[   21.933540] [<c01504dc>] (try_to_wake_up) from [<c0150918>] (default_wake_function+0x14/0x18)
[   21.933548]  r10:00000000 r9:00000000 r8:c0150904 r7:00000000 r6:c100e470 r5:00000001
[   21.933552]  r4:c100e464
[   21.933562] [<c0150904>] (default_wake_function) from [<c0166bd4>] (__wake_up_common+0x94/0x154)
[   21.933572] [<c0166b40>] (__wake_up_common) from [<c0166d8c>] (__wake_up_locked+0x20/0x28)
[   21.933580]  r10:00000000 r9:412fc09a r8:c0f79840 r7:c1009174 r6:60070093 r5:c100e44c
[   21.933584]  r4:c100e448
[   21.933594] [<c0166d6c>] (__wake_up_locked) from [<c0167970>] (complete+0x48/0x58)
[   21.933605] [<c0167928>] (complete) from [<c010fb98>] (arch_cpu_idle_dead+0x3c/0xa4)
[   21.933611]  r7:c1009174 r6:00000004 r5:00000002 r4:c0db0054
[   21.933622] [<c010fb5c>] (arch_cpu_idle_dead) from [<c0167d2c>] (do_idle+0x170/0x230)
[   21.933629]  r5:c1009128 r4:ec0b0000
[   21.933638] [<c0167bbc>] (do_idle) from [<c0168178>] (cpu_startup_entry+0x20/0x24)
[   21.933647]  r10:00000000 r9:412fc09a r8:1000406a r7:c107dbe0 r6:10c0387d r5:00000002
[   21.933653]  r4:00000085 r3:ec0a0cc0
[   21.933663] [<c0168158>] (cpu_startup_entry) from [<c010fd48>] (secondary_start_kernel+0x148/0x174)
[   21.933672] [<c010fc00>] (secondary_start_kernel) from [<10101a0c>] (0x10101a0c)
[   21.933677]  r5:00000051 r4:3c08c06a
[   21.991289] Enabling non-boot CPUs ...
[   22.278265] CPU1 is up
[   22.281305] CPU2 is up
[   22.284299] CPU3 is up
[   23.183405]
[   23.183409] ======================================================
[   23.183413] WARNING: possible circular locking dependency detected
[   23.183415] 4.15.0-rc2-00483-g4ded3bec65a0 #85 Not tainted
[   23.183418] ------------------------------------------------------
[   23.183421] swapper/2/0 is trying to acquire lock:
[   23.183423]  ((console_sem).lock){....}, at: [<66600b38>] down_trylock+0x14/0x34
[   23.183431]
[   23.183433] but task is already holding lock:
[   23.183435]  (&p->pi_lock){-.-.}, at: [<656ae3ef>] try_to_wake_up+0x28/0x368
[   23.183442]
[   23.183444] which lock already depends on the new lock.
[   23.183446]
[   23.183448]
[   23.183450] the existing dependency chain (in reverse order) is:
[   23.183452]
[   23.183453] -> #1 (&p->pi_lock){-.-.}:
[   23.183461]        lock_acquire+0x70/0x90
[   23.183463]        _raw_spin_lock_irqsave+0x40/0x54
[   23.183466]        try_to_wake_up+0x28/0x368
[   23.183468]        wake_up_process+0x18/0x1c
[   23.183470]        __up+0x3c/0x40
[   23.183472]        up+0x54/0x68
[   23.183474]        __up_console_sem+0x34/0x60
[   23.183476]        console_unlock+0x374/0x4d0
[   23.183478]        vprintk_emit+0x220/0x2bc
[   23.183480]        dev_vprintk_emit+0x9c/0x1d4
[   23.183483]        dev_printk_emit+0x28/0x30
[   23.183485]        __dev_printk+0x4c/0x90
[   23.183487]        dev_warn+0x3c/0x48
[   23.183489]        _request_firmware+0x448/0x610
[   23.183491]        request_firmware_work_func+0x34/0x6c
[   23.183493]        process_one_work+0x254/0x4f0
[   23.183496]        worker_thread+0x40/0x5a8
[   23.183498]        kthread+0x144/0x194
[   23.183500]        ret_from_fork+0x14/0x20
[   23.183501]
[   23.183503] -> #0 ((console_sem).lock){....}:
[   23.183510]        __lock_acquire+0x131c/0x14a8
[   23.183513]        lock_acquire+0x70/0x90
[   23.183515]        _raw_spin_lock_irqsave+0x40/0x54
[   23.183517]        down_trylock+0x14/0x34
[   23.183519]        __down_trylock_console_sem+0x2c/0x88
[   23.183522]        console_trylock+0x18/0x60
[   23.183524]        vprintk_emit+0x214/0x2bc
[   23.183526]        vprintk_default+0x28/0x30
[   23.183528]        vprintk_func+0x78/0xcc
[   23.183530]        printk+0x28/0x30
[   23.183532]        lockdep_rcu_suspicious+0x30/0x110
[   23.183535]        select_task_rq_fair+0xcf8/0xf0c
[   23.183537]        try_to_wake_up+0x108/0x368
[   23.183539]        default_wake_function+0x14/0x18
[   23.183541]        __wake_up_common+0x94/0x154
[   23.183543]        __wake_up_locked+0x20/0x28
[   23.183545]        complete+0x48/0x58
[   23.183548]        arch_cpu_idle_dead+0x3c/0xa4
[   23.183550]        do_idle+0x170/0x230
[   23.183552]        cpu_startup_entry+0x20/0x24
[   23.183554]        secondary_start_kernel+0x148/0x174
[   23.183556]        0x10101a0c
[   23.183557]
[   23.183560] other info that might help us debug this:
[   23.183561]
[   23.183563]  Possible unsafe locking scenario:
[   23.183565]
[   23.183567]        CPU0                    CPU1
[   23.183569]        ----                    ----
[   23.183571]   lock(&p->pi_lock);
[   23.183576]                                lock((console_sem).lock);
[   23.183581]                                lock(&p->pi_lock);
[   23.183586]   lock((console_sem).lock);
[   23.183590]
[   23.183592]  *** DEADLOCK ***
[   23.183594]
[   23.183596] 3 locks held by swapper/2/0:
[   23.183597]  #0:  ((cpu_died).wait.lock){....}, at: [<4f528fc4>] complete+0x1c/0x58
[   23.183605]  #1:  (&p->pi_lock){-.-.}, at: [<656ae3ef>] try_to_wake_up+0x28/0x368
[   23.183613]  #2:  (rcu_read_lock){....}, at: [<73be2596>] select_task_rq_fair+0x150/0xf0c
[   23.183621]
[   23.183623] stack backtrace:
[   23.183626] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.15.0-rc2-00483-g4ded3bec65a0 #85
[   23.183629] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[   23.183630] Backtrace:
[   23.183635] [<c010c890>] (dump_backtrace) from [<c010cb60>] (show_stack+0x18/0x1c)
[   23.183638]  r7:00000000 r6:60070093 r5:00000000 r4:c107bd50
[   23.183641] [<c010cb48>] (show_stack) from [<c09eca50>] (dump_stack+0xb4/0xe8)
[   23.183643] [<c09ec99c>] (dump_stack) from [<c016e750>] (print_circular_bug+0x218/0x3c8)
[   23.183646]  r9:00000000 r8:ec0a0cc0 r7:c11a4120 r6:c11a4120 r5:ec0a11e0 r4:c11a8700
[   23.183649] [<c016e538>] (print_circular_bug) from [<c016f938>] (check_prev_add+0x42c/0x7b4)
[   23.183652]  r10:c1029bac r9:ec0a0cc0 r8:ec0a11e0 r7:ec0a1220 r6:ec0a11e0 r5:00000000
[   23.183654]  r4:ec0a0cc0 r3:ec0a11e0
[   23.183657] [<c016f50c>] (check_prev_add) from [<c0172678>] (__lock_acquire+0x131c/0x14a8)
[   23.183660]  r10:c1029bac r9:ec0a0cc0 r8:ec0a11e0 r7:c17cad80 r6:00000001 r5:00000000
[   23.183661]  r4:ec0a1220
[   23.183664] [<c017135c>] (__lock_acquire) from [<c017307c>] (lock_acquire+0x70/0x90)
[   23.183667]  r10:00000000 r9:60070093 r8:00000001 r7:00000001 r6:60070093 r5:00000000
[   23.183669]  r4:ffffe000
[   23.183672] [<c017300c>] (lock_acquire) from [<c0a094dc>] (_raw_spin_lock_irqsave+0x40/0x54)
[   23.183674]  r8:00000000 r7:00000004 r6:c016b368 r5:60070093 r4:c101ba28
[   23.183677] [<c0a0949c>] (_raw_spin_lock_irqsave) from [<c016b368>] (down_trylock+0x14/0x34)
[   23.183679]  r6:c017cb30 r5:c101ba28 r4:00000000
[   23.183682] [<c016b354>] (down_trylock) from [<c017b684>] (__down_trylock_console_sem+0x2c/0x88)
[   23.183684]  r5:60070093 r4:00000000
[   23.183687] [<c017b658>] (__down_trylock_console_sem) from [<c017b6f8>] (console_trylock+0x18/0x60)
[   23.183690]  r6:00000006 r5:c017cb30 r4:00000000
[   23.183692] [<c017b6e0>] (console_trylock) from [<c017cb30>] (vprintk_emit+0x214/0x2bc)
[   23.183694]  r5:00000000 r4:00000000
[   23.183697] [<c017c91c>] (vprintk_emit) from [<c017cd74>] (vprintk_default+0x28/0x30)
[   23.183700]  r10:ec8e1980 r9:c0f7a380 r8:c107cd92 r7:c0cfc5c8 r6:000018bd r5:c0cffdfc
[   23.183702]  r4:c1009290
[   23.183705] [<c017cd4c>] (vprintk_default) from [<c017e130>] (vprintk_func+0x78/0xcc)
[   23.183707] [<c017e0b8>] (vprintk_func) from [<c017da50>] (printk+0x28/0x30)
[   23.183709]  r5:c0cff438 r4:ec0a0cc0
[   23.183712] [<c017da2c>] (printk) from [<c016e3fc>] (lockdep_rcu_suspicious+0x30/0x110)
[   23.183715]  r3:ec0b1df0 r2:c0cfc5c8 r1:000018bd r0:c0cffdfc
[   23.183718] [<c016e3cc>] (lockdep_rcu_suspicious) from [<c015a4e8>] (select_task_rq_fair+0xcf8/0xf0c)
[   23.183720]  r7:00000000 r6:00000000 r5:00000000 r4:00000000
[   23.183723] [<c01597f0>] (select_task_rq_fair) from [<c01505e4>] (try_to_wake_up+0x108/0x368)
[   23.183726]  r10:c1009290 r9:c0f7a380 r8:c01597f0 r7:40070093 r6:ec8e1e00 r5:00000000
[   23.183728]  r4:ec8e1980
[   23.183731] [<c01504dc>] (try_to_wake_up) from [<c0150918>] (default_wake_function+0x14/0x18)
[   23.183734]  r10:00000000 r9:00000000 r8:c0150904 r7:00000000 r6:c100e470 r5:00000001
[   23.183736]  r4:c100e464
[   23.183739] [<c0150904>] (default_wake_function) from [<c0166bd4>] (__wake_up_common+0x94/0x154)
[   23.183741] [<c0166b40>] (__wake_up_common) from [<c0166d8c>] (__wake_up_locked+0x20/0x28)
[   23.183744]  r10:00000000 r9:412fc09a r8:c0f79840 r7:c1009174 r6:60070093 r5:c100e44c
[   23.183746]  r4:c100e448
[   23.183749] [<c0166d6c>] (__wake_up_locked) from [<c0167970>] (complete+0x48/0x58)
[   23.183752] [<c0167928>] (complete) from [<c010fb98>] (arch_cpu_idle_dead+0x3c/0xa4)
[   23.183754]  r7:c1009174 r6:00000004 r5:00000002 r4:c0db0054
[   23.183757] [<c010fb5c>] (arch_cpu_idle_dead) from [<c0167d2c>] (do_idle+0x170/0x230)
[   23.183759]  r5:c1009128 r4:ec0b0000
[   23.183762] [<c0167bbc>] (do_idle) from [<c0168178>] (cpu_startup_entry+0x20/0x24)
[   23.183765]  r10:00000000 r9:412fc09a r8:1000406a r7:c107dbe0 r6:10c0387d r5:00000002
[   23.183767]  r4:00000085 r3:ec0a0cc0
[   23.183770] [<c0168158>] (cpu_startup_entry) from [<c010fd48>] (secondary_start_kernel+0x148/0x174)
[   23.183773] [<c010fc00>] (secondary_start_kernel) from [<10101a0c>] (0x10101a0c)
[   23.183775]  r5:00000051 r4:3c08c06a
[   23.471735] ata1: SATA link down (SStatus 0 SControl 300)
[   23.899204] PM: resume devices took 0.750 seconds
[   23.903961] OOM killer enabled.
[   23.907111] Restarting tasks ... done.
[   23.938591] PM: suspend exit
===============================
suspend 1 times
===============================

Thanks,
Peng.

-- 


* WARNING: suspicious RCU usage
  2017-12-10 11:39 WARNING: suspicious RCU usage Peng Fan
@ 2017-12-10 12:00 ` Russell King - ARM Linux
  2017-12-10 19:07   ` Paul E. McKenney
  0 siblings, 1 reply; 25+ messages in thread
From: Russell King - ARM Linux @ 2017-12-10 12:00 UTC (permalink / raw)
  To: linux-arm-kernel

+Paul

Annoyingly, it looks like calling "complete()" from a dying CPU is
triggering the RCU usage warning.  From what I remember, this is an
old problem, and we still have no better solution for this other than
to persist with the warning.

I suspect the following lockdep warning is triggered by the RCU code
bringing the console semaphore into the mix of locks.

On Sun, Dec 10, 2017 at 07:39:33PM +0800, Peng Fan wrote:
> Hi All,
> 
> I met an RCU warning when testing suspend/resume on an i.MX6Q-SDB (4 Cortex-A9 cores).
> Suspend/resume itself still works, but I do not have a good idea about
> the RCU warning. Please help if you have any ideas.
> 
> The defconfig is imx_v6_v7_defconfig of "4.15.0-rc2-00483-g4ded3bec65a0"
> 
> Test log:
> root@imx6qpdlsolox:~# /unit_tests/Power_Management/suspend_quick_auto.sh
> 
> ---- Running < suspend_quick_auto.sh > test ----
> 
> ---- Running < /unit_tests/SRTC/rtcwakeup.out > test ----
> 
> rtcwakeup.out: wakeup from "mem" using rtc0 at Thu Sep  7 00:06:21 2017
> [   21.688752] PM: suspend entry (deep)
> [   21.693550] PM: Syncing filesystems ... done.
> [   21.728528] Freezing user space processes ... (elapsed 0.002 seconds) done.
> [   21.739020] OOM killer disabled.
> [   21.742326] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
> [   21.816672] PM: suspend devices took 0.060 seconds
> [   21.832558] Disabling non-boot CPUs ...
> [   21.932773]
> [   21.933333] =============================
> [   21.933338] WARNING: suspicious RCU usage
> [   21.933342] 4.15.0-rc2-00483-g4ded3bec65a0 #85 Not tainted
> [   21.933348] -----------------------------
> [   21.933354] kernel/sched/fair.c:6333 suspicious rcu_dereference_check() usage!
> [   21.933358]
> [   21.933358] other info that might help us debug this:
> [   21.933358]
> [   21.933364]
> [   21.933364] RCU used illegally from offline CPU!
> [   21.933364] rcu_scheduler_active = 2, debug_locks = 0
> [   21.933369] 3 locks held by swapper/2/0:
> [   21.933373]  #0:  ((cpu_died).wait.lock){....}, at: [<4f528fc4>] complete+0x1c/0x58
> [   21.933396]  #1:  (&p->pi_lock){-.-.}, at: [<656ae3ef>] try_to_wake_up+0x28/0x368
> [   21.933415]  #2:  (rcu_read_lock){....}, at: [<73be2596>] select_task_rq_fair+0x150/0xf0c
> [   21.933433]
> [   21.933433] stack backtrace:
> [   21.933441] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.15.0-rc2-00483-g4ded3bec65a0 #85
> [   21.933445] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> [   21.933448] Backtrace:
> [   21.933459] [<c010c890>] (dump_backtrace) from [<c010cb60>] (show_stack+0x18/0x1c)
> [   21.933466]  r7:00000000 r6:60070093 r5:00000000 r4:c107bd50
> [   21.933475] [<c010cb48>] (show_stack) from [<c09eca50>] (dump_stack+0xb4/0xe8)
> [   21.933484] [<c09ec99c>] (dump_stack) from [<c016e47c>] (lockdep_rcu_suspicious+0xb0/0x110)
> [   21.933493]  r9:c0f7a380 r8:c107cd92 r7:c0cfc5c8 r6:000018bd r5:c0cff438 r4:ec0a0cc0
> [   21.933503] [<c016e3cc>] (lockdep_rcu_suspicious) from [<c015a4e8>] (select_task_rq_fair+0xcf8/0xf0c)
> [   21.933509]  r7:00000000 r6:00000000 r5:00000000 r4:00000000
> [   21.933518] [<c01597f0>] (select_task_rq_fair) from [<c01505e4>] (try_to_wake_up+0x108/0x368)
> [   21.933526]  r10:c1009290 r9:c0f7a380 r8:c01597f0 r7:40070093 r6:ec8e1e00 r5:00000000
> [   21.933531]  r4:ec8e1980
> [   21.933540] [<c01504dc>] (try_to_wake_up) from [<c0150918>] (default_wake_function+0x14/0x18)
> [   21.933548]  r10:00000000 r9:00000000 r8:c0150904 r7:00000000 r6:c100e470 r5:00000001
> [   21.933552]  r4:c100e464
> [   21.933562] [<c0150904>] (default_wake_function) from [<c0166bd4>] (__wake_up_common+0x94/0x154)
> [   21.933572] [<c0166b40>] (__wake_up_common) from [<c0166d8c>] (__wake_up_locked+0x20/0x28)
> [   21.933580]  r10:00000000 r9:412fc09a r8:c0f79840 r7:c1009174 r6:60070093 r5:c100e44c
> [   21.933584]  r4:c100e448
> [   21.933594] [<c0166d6c>] (__wake_up_locked) from [<c0167970>] (complete+0x48/0x58)
> [   21.933605] [<c0167928>] (complete) from [<c010fb98>] (arch_cpu_idle_dead+0x3c/0xa4)
> [   21.933611]  r7:c1009174 r6:00000004 r5:00000002 r4:c0db0054
> [   21.933622] [<c010fb5c>] (arch_cpu_idle_dead) from [<c0167d2c>] (do_idle+0x170/0x230)
> [   21.933629]  r5:c1009128 r4:ec0b0000
> [   21.933638] [<c0167bbc>] (do_idle) from [<c0168178>] (cpu_startup_entry+0x20/0x24)
> [   21.933647]  r10:00000000 r9:412fc09a r8:1000406a r7:c107dbe0 r6:10c0387d r5:00000002
> [   21.933653]  r4:00000085 r3:ec0a0cc0
> [   21.933663] [<c0168158>] (cpu_startup_entry) from [<c010fd48>] (secondary_start_kernel+0x148/0x174)
> [   21.933672] [<c010fc00>] (secondary_start_kernel) from [<10101a0c>] (0x10101a0c)
> [   21.933677]  r5:00000051 r4:3c08c06a
> [   21.991289] Enabling non-boot CPUs ...
> [   22.278265] CPU1 is up
> [   22.281305] CPU2 is up
> [   22.284299] CPU3 is up
> [   23.183405]
> [   23.183409] ======================================================
> [   23.183413] WARNING: possible circular locking dependency detected
> [   23.183415] 4.15.0-rc2-00483-g4ded3bec65a0 #85 Not tainted
> [   23.183418] ------------------------------------------------------
> [   23.183421] swapper/2/0 is trying to acquire lock:
> [   23.183423]  ((console_sem).lock){....}, at: [<66600b38>] down_trylock+0x14/0x34
> [   23.183431]
> [   23.183433] but task is already holding lock:
> [   23.183435]  (&p->pi_lock){-.-.}, at: [<656ae3ef>] try_to_wake_up+0x28/0x368
> [   23.183442]
> [   23.183444] which lock already depends on the new lock.
> [   23.183446]
> [   23.183448]
> [   23.183450] the existing dependency chain (in reverse order) is:
> [   23.183452]
> [   23.183453] -> #1 (&p->pi_lock){-.-.}:
> [   23.183461]        lock_acquire+0x70/0x90
> [   23.183463]        _raw_spin_lock_irqsave+0x40/0x54
> [   23.183466]        try_to_wake_up+0x28/0x368
> [   23.183468]        wake_up_process+0x18/0x1c
> [   23.183470]        __up+0x3c/0x40
> [   23.183472]        up+0x54/0x68
> [   23.183474]        __up_console_sem+0x34/0x60
> [   23.183476]        console_unlock+0x374/0x4d0
> [   23.183478]        vprintk_emit+0x220/0x2bc
> [   23.183480]        dev_vprintk_emit+0x9c/0x1d4
> [   23.183483]        dev_printk_emit+0x28/0x30
> [   23.183485]        __dev_printk+0x4c/0x90
> [   23.183487]        dev_warn+0x3c/0x48
> [   23.183489]        _request_firmware+0x448/0x610
> [   23.183491]        request_firmware_work_func+0x34/0x6c
> [   23.183493]        process_one_work+0x254/0x4f0
> [   23.183496]        worker_thread+0x40/0x5a8
> [   23.183498]        kthread+0x144/0x194
> [   23.183500]        ret_from_fork+0x14/0x20
> [   23.183501]
> [   23.183503] -> #0 ((console_sem).lock){....}:
> [   23.183510]        __lock_acquire+0x131c/0x14a8
> [   23.183513]        lock_acquire+0x70/0x90
> [   23.183515]        _raw_spin_lock_irqsave+0x40/0x54
> [   23.183517]        down_trylock+0x14/0x34
> [   23.183519]        __down_trylock_console_sem+0x2c/0x88
> [   23.183522]        console_trylock+0x18/0x60
> [   23.183524]        vprintk_emit+0x214/0x2bc
> [   23.183526]        vprintk_default+0x28/0x30
> [   23.183528]        vprintk_func+0x78/0xcc
> [   23.183530]        printk+0x28/0x30
> [   23.183532]        lockdep_rcu_suspicious+0x30/0x110
> [   23.183535]        select_task_rq_fair+0xcf8/0xf0c
> [   23.183537]        try_to_wake_up+0x108/0x368
> [   23.183539]        default_wake_function+0x14/0x18
> [   23.183541]        __wake_up_common+0x94/0x154
> [   23.183543]        __wake_up_locked+0x20/0x28
> [   23.183545]        complete+0x48/0x58
> [   23.183548]        arch_cpu_idle_dead+0x3c/0xa4
> [   23.183550]        do_idle+0x170/0x230
> [   23.183552]        cpu_startup_entry+0x20/0x24
> [   23.183554]        secondary_start_kernel+0x148/0x174
> [   23.183556]        0x10101a0c
> [   23.183557]
> [   23.183560] other info that might help us debug this:
> [   23.183561]
> [   23.183563]  Possible unsafe locking scenario:
> [   23.183565]
> [   23.183567]        CPU0                    CPU1
> [   23.183569]        ----                    ----
> [   23.183571]   lock(&p->pi_lock);
> [   23.183576]                                lock((console_sem).lock);
> [   23.183581]                                lock(&p->pi_lock);
> [   23.183586]   lock((console_sem).lock);
> [   23.183590]
> [   23.183592]  *** DEADLOCK ***
> [   23.183594]
> [   23.183596] 3 locks held by swapper/2/0:
> [   23.183597]  #0:  ((cpu_died).wait.lock){....}, at: [<4f528fc4>] complete+0x1c/0x58
> [   23.183605]  #1:  (&p->pi_lock){-.-.}, at: [<656ae3ef>] try_to_wake_up+0x28/0x368
> [   23.183613]  #2:  (rcu_read_lock){....}, at: [<73be2596>] select_task_rq_fair+0x150/0xf0c
> [   23.183621]
> [   23.183623] stack backtrace:
> [   23.183626] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.15.0-rc2-00483-g4ded3bec65a0 #85
> [   23.183629] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> [   23.183630] Backtrace:
> [   23.183635] [<c010c890>] (dump_backtrace) from [<c010cb60>] (show_stack+0x18/0x1c)
> [   23.183638]  r7:00000000 r6:60070093 r5:00000000 r4:c107bd50
> [   23.183641] [<c010cb48>] (show_stack) from [<c09eca50>] (dump_stack+0xb4/0xe8)
> [   23.183643] [<c09ec99c>] (dump_stack) from [<c016e750>] (print_circular_bug+0x218/0x3c8)
> [   23.183646]  r9:00000000 r8:ec0a0cc0 r7:c11a4120 r6:c11a4120 r5:ec0a11e0 r4:c11a8700
> [   23.183649] [<c016e538>] (print_circular_bug) from [<c016f938>] (check_prev_add+0x42c/0x7b4)
> [   23.183652]  r10:c1029bac r9:ec0a0cc0 r8:ec0a11e0 r7:ec0a1220 r6:ec0a11e0 r5:00000000
> [   23.183654]  r4:ec0a0cc0 r3:ec0a11e0
> [   23.183657] [<c016f50c>] (check_prev_add) from [<c0172678>] (__lock_acquire+0x131c/0x14a8)
> [   23.183660]  r10:c1029bac r9:ec0a0cc0 r8:ec0a11e0 r7:c17cad80 r6:00000001 r5:00000000
> [   23.183661]  r4:ec0a1220
> [   23.183664] [<c017135c>] (__lock_acquire) from [<c017307c>] (lock_acquire+0x70/0x90)
> [   23.183667]  r10:00000000 r9:60070093 r8:00000001 r7:00000001 r6:60070093 r5:00000000
> [   23.183669]  r4:ffffe000
> [   23.183672] [<c017300c>] (lock_acquire) from [<c0a094dc>] (_raw_spin_lock_irqsave+0x40/0x54)
> [   23.183674]  r8:00000000 r7:00000004 r6:c016b368 r5:60070093 r4:c101ba28
> [   23.183677] [<c0a0949c>] (_raw_spin_lock_irqsave) from [<c016b368>] (down_trylock+0x14/0x34)
> [   23.183679]  r6:c017cb30 r5:c101ba28 r4:00000000
> [   23.183682] [<c016b354>] (down_trylock) from [<c017b684>] (__down_trylock_console_sem+0x2c/0x88)
> [   23.183684]  r5:60070093 r4:00000000
> [   23.183687] [<c017b658>] (__down_trylock_console_sem) from [<c017b6f8>] (console_trylock+0x18/0x60)
> [   23.183690]  r6:00000006 r5:c017cb30 r4:00000000
> [   23.183692] [<c017b6e0>] (console_trylock) from [<c017cb30>] (vprintk_emit+0x214/0x2bc)
> [   23.183694]  r5:00000000 r4:00000000
> [   23.183697] [<c017c91c>] (vprintk_emit) from [<c017cd74>] (vprintk_default+0x28/0x30)
> [   23.183700]  r10:ec8e1980 r9:c0f7a380 r8:c107cd92 r7:c0cfc5c8 r6:000018bd r5:c0cffdfc
> [   23.183702]  r4:c1009290
> [   23.183705] [<c017cd4c>] (vprintk_default) from [<c017e130>] (vprintk_func+0x78/0xcc)
> [   23.183707] [<c017e0b8>] (vprintk_func) from [<c017da50>] (printk+0x28/0x30)
> [   23.183709]  r5:c0cff438 r4:ec0a0cc0
> [   23.183712] [<c017da2c>] (printk) from [<c016e3fc>] (lockdep_rcu_suspicious+0x30/0x110)
> [   23.183715]  r3:ec0b1df0 r2:c0cfc5c8 r1:000018bd r0:c0cffdfc
> [   23.183718] [<c016e3cc>] (lockdep_rcu_suspicious) from [<c015a4e8>] (select_task_rq_fair+0xcf8/0xf0c)
> [   23.183720]  r7:00000000 r6:00000000 r5:00000000 r4:00000000
> [   23.183723] [<c01597f0>] (select_task_rq_fair) from [<c01505e4>] (try_to_wake_up+0x108/0x368)
> [   23.183726]  r10:c1009290 r9:c0f7a380 r8:c01597f0 r7:40070093 r6:ec8e1e00 r5:00000000
> [   23.183728]  r4:ec8e1980
> [   23.183731] [<c01504dc>] (try_to_wake_up) from [<c0150918>] (default_wake_function+0x14/0x18)
> [   23.183734]  r10:00000000 r9:00000000 r8:c0150904 r7:00000000 r6:c100e470 r5:00000001
> [   23.183736]  r4:c100e464
> [   23.183739] [<c0150904>] (default_wake_function) from [<c0166bd4>] (__wake_up_common+0x94/0x154)
> [   23.183741] [<c0166b40>] (__wake_up_common) from [<c0166d8c>] (__wake_up_locked+0x20/0x28)
> [   23.183744]  r10:00000000 r9:412fc09a r8:c0f79840 r7:c1009174 r6:60070093 r5:c100e44c
> [   23.183746]  r4:c100e448
> [   23.183749] [<c0166d6c>] (__wake_up_locked) from [<c0167970>] (complete+0x48/0x58)
> [   23.183752] [<c0167928>] (complete) from [<c010fb98>] (arch_cpu_idle_dead+0x3c/0xa4)
> [   23.183754]  r7:c1009174 r6:00000004 r5:00000002 r4:c0db0054
> [   23.183757] [<c010fb5c>] (arch_cpu_idle_dead) from [<c0167d2c>] (do_idle+0x170/0x230)
> [   23.183759]  r5:c1009128 r4:ec0b0000
> [   23.183762] [<c0167bbc>] (do_idle) from [<c0168178>] (cpu_startup_entry+0x20/0x24)
> [   23.183765]  r10:00000000 r9:412fc09a r8:1000406a r7:c107dbe0 r6:10c0387d r5:00000002
> [   23.183767]  r4:00000085 r3:ec0a0cc0
> [   23.183770] [<c0168158>] (cpu_startup_entry) from [<c010fd48>] (secondary_start_kernel+0x148/0x174)
> [   23.183773] [<c010fc00>] (secondary_start_kernel) from [<10101a0c>] (0x10101a0c)
> [   23.183775]  r5:00000051 r4:3c08c06a
> [   23.471735] ata1: SATA link down (SStatus 0 SControl 300)
> [   23.899204] PM: resume devices took 0.750 seconds
> [   23.903961] OOM killer enabled.
> [   23.907111] Restarting tasks ... done.
> [   23.938591] PM: suspend exit
> ===============================
> suspend 1 times
> ===============================
> 
> Thanks,
> Peng.
> 
> -- 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


* WARNING: suspicious RCU usage
  2017-12-10 12:00 ` Russell King - ARM Linux
@ 2017-12-10 19:07   ` Paul E. McKenney
  2017-12-10 19:34     ` Russell King - ARM Linux
  0 siblings, 1 reply; 25+ messages in thread
From: Paul E. McKenney @ 2017-12-10 19:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Dec 10, 2017 at 12:00:12PM +0000, Russell King - ARM Linux wrote:
> +Paul
> 
> Annoyingly, it looks like calling "complete()" from a dying CPU is
> triggering the RCU usage warning.  From what I remember, this is an
> old problem, and we still have no better solution for this other than
> to persist with the warning.

I thought that this issue was resolved with tglx's use of IPIs from
the outgoing CPU.  Or is this due to an additional complete() from the
ARM code?  If so, could it also use tglx's IPI trick?

> I suspect the following lockdep warning is triggered by the RCU code
> bringing the console semaphore into the mix of locks.

It does indeed look to me that it is quite possible that resolving
the complete() issue would prevent the lockdep splat from appearing,
which might in turn prevent acquisition of the console semaphore.

						Thanx, Paul

> On Sun, Dec 10, 2017 at 07:39:33PM +0800, Peng Fan wrote:
> > Hi All,
> > 
> > I met an RCU warning when testing suspend/resume on an i.MX6Q-SDB (4 Cortex-A9 cores).
> > Suspend/resume itself still works, but I do not have a good idea about
> > the RCU warning. Please help if you have any ideas.
> > 
> > The defconfig is imx_v6_v7_defconfig of "4.15.0-rc2-00483-g4ded3bec65a0"
> > 
> > Test log:
> > root@imx6qpdlsolox:~# /unit_tests/Power_Management/suspend_quick_auto.sh
> > 
> > ---- Running < suspend_quick_auto.sh > test ----
> > 
> > ---- Running < /unit_tests/SRTC/rtcwakeup.out > test ----
> > 
> > rtcwakeup.out: wakeup from "mem" using rtc0 at Thu Sep  7 00:06:21 2017
> > [   21.688752] PM: suspend entry (deep)
> > [   21.693550] PM: Syncing filesystems ... done.
> > [   21.728528] Freezing user space processes ... (elapsed 0.002 seconds) done.
> > [   21.739020] OOM killer disabled.
> > [   21.742326] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
> > [   21.816672] PM: suspend devices took 0.060 seconds
> > [   21.832558] Disabling non-boot CPUs ...
> > [   21.932773]
> > [   21.933333] =============================
> > [   21.933338] WARNING: suspicious RCU usage
> > [   21.933342] 4.15.0-rc2-00483-g4ded3bec65a0 #85 Not tainted
> > [   21.933348] -----------------------------
> > [   21.933354] kernel/sched/fair.c:6333 suspicious rcu_dereference_check() usage!
> > [   21.933358]
> > [   21.933358] other info that might help us debug this:
> > [   21.933358]
> > [   21.933364]
> > [   21.933364] RCU used illegally from offline CPU!
> > [   21.933364] rcu_scheduler_active = 2, debug_locks = 0
> > [   21.933369] 3 locks held by swapper/2/0:
> > [   21.933373]  #0:  ((cpu_died).wait.lock){....}, at: [<4f528fc4>] complete+0x1c/0x58
> > [   21.933396]  #1:  (&p->pi_lock){-.-.}, at: [<656ae3ef>] try_to_wake_up+0x28/0x368
> > [   21.933415]  #2:  (rcu_read_lock){....}, at: [<73be2596>] select_task_rq_fair+0x150/0xf0c
> > [   21.933433]
> > [   21.933433] stack backtrace:
> > [   21.933441] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.15.0-rc2-00483-g4ded3bec65a0 #85
> > [   21.933445] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> > [   21.933448] Backtrace:
> > [   21.933459] [<c010c890>] (dump_backtrace) from [<c010cb60>] (show_stack+0x18/0x1c)
> > [   21.933466]  r7:00000000 r6:60070093 r5:00000000 r4:c107bd50
> > [   21.933475] [<c010cb48>] (show_stack) from [<c09eca50>] (dump_stack+0xb4/0xe8)
> > [   21.933484] [<c09ec99c>] (dump_stack) from [<c016e47c>] (lockdep_rcu_suspicious+0xb0/0x110)
> > [   21.933493]  r9:c0f7a380 r8:c107cd92 r7:c0cfc5c8 r6:000018bd r5:c0cff438 r4:ec0a0cc0
> > [   21.933503] [<c016e3cc>] (lockdep_rcu_suspicious) from [<c015a4e8>] (select_task_rq_fair+0xcf8/0xf0c)
> > [   21.933509]  r7:00000000 r6:00000000 r5:00000000 r4:00000000
> > [   21.933518] [<c01597f0>] (select_task_rq_fair) from [<c01505e4>] (try_to_wake_up+0x108/0x368)
> > [   21.933526]  r10:c1009290 r9:c0f7a380 r8:c01597f0 r7:40070093 r6:ec8e1e00 r5:00000000
> > [   21.933531]  r4:ec8e1980
> > [   21.933540] [<c01504dc>] (try_to_wake_up) from [<c0150918>] (default_wake_function+0x14/0x18)
> > [   21.933548]  r10:00000000 r9:00000000 r8:c0150904 r7:00000000 r6:c100e470 r5:00000001
> > [   21.933552]  r4:c100e464
> > [   21.933562] [<c0150904>] (default_wake_function) from [<c0166bd4>] (__wake_up_common+0x94/0x154)
> > [   21.933572] [<c0166b40>] (__wake_up_common) from [<c0166d8c>] (__wake_up_locked+0x20/0x28)
> > [   21.933580]  r10:00000000 r9:412fc09a r8:c0f79840 r7:c1009174 r6:60070093 r5:c100e44c
> > [   21.933584]  r4:c100e448
> > [   21.933594] [<c0166d6c>] (__wake_up_locked) from [<c0167970>] (complete+0x48/0x58)
> > [   21.933605] [<c0167928>] (complete) from [<c010fb98>] (arch_cpu_idle_dead+0x3c/0xa4)
> > [   21.933611]  r7:c1009174 r6:00000004 r5:00000002 r4:c0db0054
> > [   21.933622] [<c010fb5c>] (arch_cpu_idle_dead) from [<c0167d2c>] (do_idle+0x170/0x230)
> > [   21.933629]  r5:c1009128 r4:ec0b0000
> > [   21.933638] [<c0167bbc>] (do_idle) from [<c0168178>] (cpu_startup_entry+0x20/0x24)
> > [   21.933647]  r10:00000000 r9:412fc09a r8:1000406a r7:c107dbe0 r6:10c0387d r5:00000002
> > [   21.933653]  r4:00000085 r3:ec0a0cc0
> > [   21.933663] [<c0168158>] (cpu_startup_entry) from [<c010fd48>] (secondary_start_kernel+0x148/0x174)
> > [   21.933672] [<c010fc00>] (secondary_start_kernel) from [<10101a0c>] (0x10101a0c)
> > [   21.933677]  r5:00000051 r4:3c08c06a
> > [   21.991289] Enabling non-boot CPUs ...
> > [   22.278265] CPU1 is up
> > [   22.281305] CPU2 is up
> > [   22.284299] CPU3 is up
> > [   23.183405]
> > [   23.183409] ======================================================
> > [   23.183413] WARNING: possible circular locking dependency detected
> > [   23.183415] 4.15.0-rc2-00483-g4ded3bec65a0 #85 Not tainted
> > [   23.183418] ------------------------------------------------------
> > [   23.183421] swapper/2/0 is trying to acquire lock:
> > [   23.183423]  ((console_sem).lock){....}, at: [<66600b38>] down_trylock+0x14/0x34
> > [   23.183431]
> > [   23.183433] but task is already holding lock:
> > [   23.183435]  (&p->pi_lock){-.-.}, at: [<656ae3ef>] try_to_wake_up+0x28/0x368
> > [   23.183442]
> > [   23.183444] which lock already depends on the new lock.
> > [   23.183446]
> > [   23.183448]
> > [   23.183450] the existing dependency chain (in reverse order) is:
> > [   23.183452]
> > [   23.183453] -> #1 (&p->pi_lock){-.-.}:
> > [   23.183461]        lock_acquire+0x70/0x90
> > [   23.183463]        _raw_spin_lock_irqsave+0x40/0x54
> > [   23.183466]        try_to_wake_up+0x28/0x368
> > [   23.183468]        wake_up_process+0x18/0x1c
> > [   23.183470]        __up+0x3c/0x40
> > [   23.183472]        up+0x54/0x68
> > [   23.183474]        __up_console_sem+0x34/0x60
> > [   23.183476]        console_unlock+0x374/0x4d0
> > [   23.183478]        vprintk_emit+0x220/0x2bc
> > [   23.183480]        dev_vprintk_emit+0x9c/0x1d4
> > [   23.183483]        dev_printk_emit+0x28/0x30
> > [   23.183485]        __dev_printk+0x4c/0x90
> > [   23.183487]        dev_warn+0x3c/0x48
> > [   23.183489]        _request_firmware+0x448/0x610
> > [   23.183491]        request_firmware_work_func+0x34/0x6c
> > [   23.183493]        process_one_work+0x254/0x4f0
> > [   23.183496]        worker_thread+0x40/0x5a8
> > [   23.183498]        kthread+0x144/0x194
> > [   23.183500]        ret_from_fork+0x14/0x20
> > [   23.183501]
> > [   23.183503] -> #0 ((console_sem).lock){....}:
> > [   23.183510]        __lock_acquire+0x131c/0x14a8
> > [   23.183513]        lock_acquire+0x70/0x90
> > [   23.183515]        _raw_spin_lock_irqsave+0x40/0x54
> > [   23.183517]        down_trylock+0x14/0x34
> > [   23.183519]        __down_trylock_console_sem+0x2c/0x88
> > [   23.183522]        console_trylock+0x18/0x60
> > [   23.183524]        vprintk_emit+0x214/0x2bc
> > [   23.183526]        vprintk_default+0x28/0x30
> > [   23.183528]        vprintk_func+0x78/0xcc
> > [   23.183530]        printk+0x28/0x30
> > [   23.183532]        lockdep_rcu_suspicious+0x30/0x110
> > [   23.183535]        select_task_rq_fair+0xcf8/0xf0c
> > [   23.183537]        try_to_wake_up+0x108/0x368
> > [   23.183539]        default_wake_function+0x14/0x18
> > [   23.183541]        __wake_up_common+0x94/0x154
> > [   23.183543]        __wake_up_locked+0x20/0x28
> > [   23.183545]        complete+0x48/0x58
> > [   23.183548]        arch_cpu_idle_dead+0x3c/0xa4
> > [   23.183550]        do_idle+0x170/0x230
> > [   23.183552]        cpu_startup_entry+0x20/0x24
> > [   23.183554]        secondary_start_kernel+0x148/0x174
> > [   23.183556]        0x10101a0c
> > [   23.183557]
> > [   23.183560] other info that might help us debug this:
> > [   23.183561]
> > [   23.183563]  Possible unsafe locking scenario:
> > [   23.183565]
> > [   23.183567]        CPU0                    CPU1
> > [   23.183569]        ----                    ----
> > [   23.183571]   lock(&p->pi_lock);
> > [   23.183576]                                lock((console_sem).lock);
> > [   23.183581]                                lock(&p->pi_lock);
> > [   23.183586]   lock((console_sem).lock);
> > [   23.183590]
> > [   23.183592]  *** DEADLOCK ***
> > [   23.183594]
> > [   23.183596] 3 locks held by swapper/2/0:
> > [   23.183597]  #0:  ((cpu_died).wait.lock){....}, at: [<4f528fc4>] complete+0x1c/0x58
> > [   23.183605]  #1:  (&p->pi_lock){-.-.}, at: [<656ae3ef>] try_to_wake_up+0x28/0x368
> > [   23.183613]  #2:  (rcu_read_lock){....}, at: [<73be2596>] select_task_rq_fair+0x150/0xf0c
> > [   23.183621]
> > [   23.183623] stack backtrace:
> > [   23.183626] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.15.0-rc2-00483-g4ded3bec65a0 #85
> > [   23.183629] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> > [   23.183630] Backtrace:
> > [   23.183635] [<c010c890>] (dump_backtrace) from [<c010cb60>] (show_stack+0x18/0x1c)
> > [   23.183638]  r7:00000000 r6:60070093 r5:00000000 r4:c107bd50
> > [   23.183641] [<c010cb48>] (show_stack) from [<c09eca50>] (dump_stack+0xb4/0xe8)
> > [   23.183643] [<c09ec99c>] (dump_stack) from [<c016e750>] (print_circular_bug+0x218/0x3c8)
> > [   23.183646]  r9:00000000 r8:ec0a0cc0 r7:c11a4120 r6:c11a4120 r5:ec0a11e0 r4:c11a8700
> > [   23.183649] [<c016e538>] (print_circular_bug) from [<c016f938>] (check_prev_add+0x42c/0x7b4)
> > [   23.183652]  r10:c1029bac r9:ec0a0cc0 r8:ec0a11e0 r7:ec0a1220 r6:ec0a11e0 r5:00000000
> > [   23.183654]  r4:ec0a0cc0 r3:ec0a11e0
> > [   23.183657] [<c016f50c>] (check_prev_add) from [<c0172678>] (__lock_acquire+0x131c/0x14a8)
> > [   23.183660]  r10:c1029bac r9:ec0a0cc0 r8:ec0a11e0 r7:c17cad80 r6:00000001 r5:00000000
> > [   23.183661]  r4:ec0a1220
> > [   23.183664] [<c017135c>] (__lock_acquire) from [<c017307c>] (lock_acquire+0x70/0x90)
> > [   23.183667]  r10:00000000 r9:60070093 r8:00000001 r7:00000001 r6:60070093 r5:00000000
> > [   23.183669]  r4:ffffe000
> > [   23.183672] [<c017300c>] (lock_acquire) from [<c0a094dc>] (_raw_spin_lock_irqsave+0x40/0x54)
> > [   23.183674]  r8:00000000 r7:00000004 r6:c016b368 r5:60070093 r4:c101ba28
> > [   23.183677] [<c0a0949c>] (_raw_spin_lock_irqsave) from [<c016b368>] (down_trylock+0x14/0x34)
> > [   23.183679]  r6:c017cb30 r5:c101ba28 r4:00000000
> > [   23.183682] [<c016b354>] (down_trylock) from [<c017b684>] (__down_trylock_console_sem+0x2c/0x88)
> > [   23.183684]  r5:60070093 r4:00000000
> > [   23.183687] [<c017b658>] (__down_trylock_console_sem) from [<c017b6f8>] (console_trylock+0x18/0x60)
> > [   23.183690]  r6:00000006 r5:c017cb30 r4:00000000
> > [   23.183692] [<c017b6e0>] (console_trylock) from [<c017cb30>] (vprintk_emit+0x214/0x2bc)
> > [   23.183694]  r5:00000000 r4:00000000
> > [   23.183697] [<c017c91c>] (vprintk_emit) from [<c017cd74>] (vprintk_default+0x28/0x30)
> > [   23.183700]  r10:ec8e1980 r9:c0f7a380 r8:c107cd92 r7:c0cfc5c8 r6:000018bd r5:c0cffdfc
> > [   23.183702]  r4:c1009290
> > [   23.183705] [<c017cd4c>] (vprintk_default) from [<c017e130>] (vprintk_func+0x78/0xcc)
> > [   23.183707] [<c017e0b8>] (vprintk_func) from [<c017da50>] (printk+0x28/0x30)
> > [   23.183709]  r5:c0cff438 r4:ec0a0cc0
> > [   23.183712] [<c017da2c>] (printk) from [<c016e3fc>] (lockdep_rcu_suspicious+0x30/0x110)
> > [   23.183715]  r3:ec0b1df0 r2:c0cfc5c8 r1:000018bd r0:c0cffdfc
> > [   23.183718] [<c016e3cc>] (lockdep_rcu_suspicious) from [<c015a4e8>] (select_task_rq_fair+0xcf8/0xf0c)
> > [   23.183720]  r7:00000000 r6:00000000 r5:00000000 r4:00000000
> > [   23.183723] [<c01597f0>] (select_task_rq_fair) from [<c01505e4>] (try_to_wake_up+0x108/0x368)
> > [   23.183726]  r10:c1009290 r9:c0f7a380 r8:c01597f0 r7:40070093 r6:ec8e1e00 r5:00000000
> > [   23.183728]  r4:ec8e1980
> > [   23.183731] [<c01504dc>] (try_to_wake_up) from [<c0150918>] (default_wake_function+0x14/0x18)
> > [   23.183734]  r10:00000000 r9:00000000 r8:c0150904 r7:00000000 r6:c100e470 r5:00000001
> > [   23.183736]  r4:c100e464
> > [   23.183739] [<c0150904>] (default_wake_function) from [<c0166bd4>] (__wake_up_common+0x94/0x154)
> > [   23.183741] [<c0166b40>] (__wake_up_common) from [<c0166d8c>] (__wake_up_locked+0x20/0x28)
> > [   23.183744]  r10:00000000 r9:412fc09a r8:c0f79840 r7:c1009174 r6:60070093 r5:c100e44c
> > [   23.183746]  r4:c100e448
> > [   23.183749] [<c0166d6c>] (__wake_up_locked) from [<c0167970>] (complete+0x48/0x58)
> > [   23.183752] [<c0167928>] (complete) from [<c010fb98>] (arch_cpu_idle_dead+0x3c/0xa4)
> > [   23.183754]  r7:c1009174 r6:00000004 r5:00000002 r4:c0db0054
> > [   23.183757] [<c010fb5c>] (arch_cpu_idle_dead) from [<c0167d2c>] (do_idle+0x170/0x230)
> > [   23.183759]  r5:c1009128 r4:ec0b0000
> > [   23.183762] [<c0167bbc>] (do_idle) from [<c0168178>] (cpu_startup_entry+0x20/0x24)
> > [   23.183765]  r10:00000000 r9:412fc09a r8:1000406a r7:c107dbe0 r6:10c0387d r5:00000002
> > [   23.183767]  r4:00000085 r3:ec0a0cc0
> > [   23.183770] [<c0168158>] (cpu_startup_entry) from [<c010fd48>] (secondary_start_kernel+0x148/0x174)
> > [   23.183773] [<c010fc00>] (secondary_start_kernel) from [<10101a0c>] (0x10101a0c)
> > [   23.183775]  r5:00000051 r4:3c08c06a
> > [   23.471735] ata1: SATA link down (SStatus 0 SControl 300)
> > [   23.899204] PM: resume devices took 0.750 seconds
> > [   23.903961] OOM killer enabled.
> > [   23.907111] Restarting tasks ... done.
> > [   23.938591] PM: suspend exit
> > ===============================
> > suspend 1 times
> > ===============================
> > 
> > Thanks,
> > Peng.
> > 
> > -- 
> > 
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel at lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 
> -- 
> RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
> According to speedtest.net: 8.21Mbps down 510kbps up
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* WARNING: suspicious RCU usage
  2017-12-10 19:07   ` Paul E. McKenney
@ 2017-12-10 19:34     ` Russell King - ARM Linux
  2017-12-10 21:39       ` Paul E. McKenney
  0 siblings, 1 reply; 25+ messages in thread
From: Russell King - ARM Linux @ 2017-12-10 19:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Dec 10, 2017 at 11:07:27AM -0800, Paul E. McKenney wrote:
> On Sun, Dec 10, 2017 at 12:00:12PM +0000, Russell King - ARM Linux wrote:
> > +Paul
> > 
> > Annoyingly, it looks like calling "complete()" from a dying CPU is
> > triggering the RCU usage warning.  From what I remember, this is an
> > old problem, and we still have no better solution for this other than
> > to persist with the warning.
> 
> I thought that this issue was resolved with tglx's use of IPIs from
> the outgoing CPU.  Or is this due to an additional complete() from the
> ARM code?  If so, could it also use tglx's IPI trick?

I don't think it was tglx's IPI trick, I've had code sitting in my tree
for a while for it, but it has its own set of problems which are not
resolvable:

1. it needs more IPIs than we have available on all platforms
2. there's some optional locking in the GIC driver that causes problems
   for the CPU dying path.

The consensus last time around was that the IPI solution is a non-
starter, so the seven-year proven-reliable solution (disregarding the
RCU warning) persists because I don't think anyone came up with a
better solution.

> > I suspect the following lockdep warning is triggered by the RCU code
> > bringing the console semaphore into the mix of locks.
> 
> It does indeed look to me that it is quite possible that resolving
> the complete() issue would prevent the lockdep splat from appearing,
> which might in turn prevent acquisition of the console semaphore.

Yeah, if only it were simple to resolve that.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

^ permalink raw reply	[flat|nested] 25+ messages in thread

* WARNING: suspicious RCU usage
  2017-12-10 19:34     ` Russell King - ARM Linux
@ 2017-12-10 21:39       ` Paul E. McKenney
  2017-12-10 21:54         ` Russell King - ARM Linux
  2017-12-12 16:49         ` Paul E. McKenney
  0 siblings, 2 replies; 25+ messages in thread
From: Paul E. McKenney @ 2017-12-10 21:39 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Dec 10, 2017 at 07:34:39PM +0000, Russell King - ARM Linux wrote:
> On Sun, Dec 10, 2017 at 11:07:27AM -0800, Paul E. McKenney wrote:
> > On Sun, Dec 10, 2017 at 12:00:12PM +0000, Russell King - ARM Linux wrote:
> > > +Paul
> > > 
> > > Annoyingly, it looks like calling "complete()" from a dying CPU is
> > > triggering the RCU usage warning.  From what I remember, this is an
> > > old problem, and we still have no better solution for this other than
> > > to persist with the warning.
> > 
> > I thought that this issue was resolved with tglx's use of IPIs from
> > the outgoing CPU.  Or is this due to an additional complete() from the
> > ARM code?  If so, could it also use tglx's IPI trick?
> 
> I don't think it was tglx's IPI trick, I've had code sitting in my tree
> for a while for it, but it has its own set of problems which are not
> resolvable:
> 
> 1. it needs more IPIs than we have available on all platforms

OK, I will ask the stupid question...  Is it possible to multiplex
the IPIs, for example, by using smp_call_function_single()?

> 2. there's some optional locking in the GIC driver that causes problems
>    for the CPU dying path.

On this, I must plead ignorance.

> The consensus last time around was that the IPI solution is a non-
> starter, so the seven-year proven-reliable solution (disregarding the
> RCU warning) persists because I don't think anyone came up with a
> better solution.

Seven years already?  Sigh, I don't have the heart to check because
I wouldn't want to find out that it has actually been longer.  ;-)

							Thanx, Paul

> > > I suspect the following lockdep warning is triggered by the RCU code
> > > bringing the console semaphore into the mix of locks.
> > 
> > It does indeed look to me that it is quite possible that resolving
> > the complete() issue would prevent the lockdep splat from appearing,
> > which might in turn prevent acquisition of the console semaphore.
> 
> Yeah, if only it were simple to resolve that.
> 
> -- 
> RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
> According to speedtest.net: 8.21Mbps down 510kbps up
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* WARNING: suspicious RCU usage
  2017-12-10 21:39       ` Paul E. McKenney
@ 2017-12-10 21:54         ` Russell King - ARM Linux
  2017-12-10 23:16           ` Paul E. McKenney
  2017-12-12 16:49         ` Paul E. McKenney
  1 sibling, 1 reply; 25+ messages in thread
From: Russell King - ARM Linux @ 2017-12-10 21:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Dec 10, 2017 at 01:39:30PM -0800, Paul E. McKenney wrote:
> On Sun, Dec 10, 2017 at 07:34:39PM +0000, Russell King - ARM Linux wrote:
> > On Sun, Dec 10, 2017 at 11:07:27AM -0800, Paul E. McKenney wrote:
> > > On Sun, Dec 10, 2017 at 12:00:12PM +0000, Russell King - ARM Linux wrote:
> > > > +Paul
> > > > 
> > > > Annoyingly, it looks like calling "complete()" from a dying CPU is
> > > > triggering the RCU usage warning.  From what I remember, this is an
> > > > old problem, and we still have no better solution for this other than
> > > > to persist with the warning.
> > > 
> > > I thought that this issue was resolved with tglx's use of IPIs from
> > > the outgoing CPU.  Or is this due to an additional complete() from the
> > > ARM code?  If so, could it also use tglx's IPI trick?
> > 
> > I don't think it was tglx's IPI trick, I've had code sitting in my tree
> > for a while for it, but it has its own set of problems which are not
> > resolvable:
> > 
> > 1. it needs more IPIs than we have available on all platforms
> 
> OK, I will ask the stupid question...  Is it possible to multiplex
> the IPIs, for example, by using smp_call_function_single()?
> 
> > 2. there's some optional locking in the GIC driver that causes problems
> >    for the CPU dying path.
> 
> On this, I must plead ignorance.
> 
> > The consensus last time around was that the IPI solution is a non-
> > starter, so the seven-year proven-reliable solution (disregarding the
> > RCU warning) persists because I don't think anyone came up with a
> > better solution.
> 
> Seven years already?  Sigh, I don't have the heart to check because
> I wouldn't want to find out that it has actually been longer.  ;-)

*Sigh* thanks for what you just said, you do realise that you've just
said that you don't believe what I write in emails?  FFS.  Is there
any point in continuing to discuss this if that's the case?  Really?

commit 3c030beabf937b1d3b4ecaedfd1fb2f1e2aa0c70
Author: Russell King <rmk+kernel@arm.linux.org.uk>
Date:   Tue Nov 30 11:07:35 2010 +0000

    ARM: CPU hotplug: move cpu_killed completion to core code

    We always need to wait for the dying CPU to reach a safe state before
    taking it down, irrespective of the requirements of the platform.
    Move the completion code into the ARM SMP hotplug code rather than
    having each platform re-implement this.

    Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index a30c4094db3a..8c81ff9b3732 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -24,6 +24,7 @@
 #include <linux/irq.h>
 #include <linux/percpu.h>
 #include <linux/clockchips.h>
+#include <linux/completion.h>

 #include <asm/atomic.h>
 #include <asm/cacheflush.h>
@@ -238,12 +239,20 @@ int __cpu_disable(void)
        return 0;
 }

+static DECLARE_COMPLETION(cpu_died);
+
 /*
  * called on the thread which is asking for a CPU to be shutdown -
  * waits until shutdown has completed, or it is timed out.
  */
 void __cpu_die(unsigned int cpu)
 {
+       if (!wait_for_completion_timeout(&cpu_died, msecs_to_jiffies(5000))) {
+               pr_err("CPU%u: cpu didn't die\n", cpu);
+               return;
+       }
+       printk(KERN_NOTICE "CPU%u: shutdown\n", cpu);
+
        if (!platform_cpu_kill(cpu))
                printk("CPU%u: unable to kill\n", cpu);
 }
@@ -263,9 +272,12 @@ void __ref cpu_die(void)
        local_irq_disable();
        idle_task_exit();

+       /* Tell __cpu_die() that this CPU is now safe to dispose of */
+       complete(&cpu_died);
+
        /*
         * actual CPU shutdown procedure is at least platform (if not
-        * CPU) specific
+        * CPU) specific.
         */
        platform_cpu_die(cpu);


So, 30th Nov 2010 to 10th Dec 2017 is seven years, one week and three
days to be exact.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* WARNING: suspicious RCU usage
  2017-12-10 21:54         ` Russell King - ARM Linux
@ 2017-12-10 23:16           ` Paul E. McKenney
  0 siblings, 0 replies; 25+ messages in thread
From: Paul E. McKenney @ 2017-12-10 23:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Dec 10, 2017 at 09:54:33PM +0000, Russell King - ARM Linux wrote:
> On Sun, Dec 10, 2017 at 01:39:30PM -0800, Paul E. McKenney wrote:
> > On Sun, Dec 10, 2017 at 07:34:39PM +0000, Russell King - ARM Linux wrote:
> > > On Sun, Dec 10, 2017 at 11:07:27AM -0800, Paul E. McKenney wrote:
> > > > On Sun, Dec 10, 2017 at 12:00:12PM +0000, Russell King - ARM Linux wrote:
> > > > > +Paul
> > > > > 
> > > > > Annoyingly, it looks like calling "complete()" from a dying CPU is
> > > > > triggering the RCU usage warning.  From what I remember, this is an
> > > > > old problem, and we still have no better solution for this other than
> > > > > to persist with the warning.
> > > > 
> > > > I thought that this issue was resolved with tglx's use of IPIs from
> > > > the outgoing CPU.  Or is this due to an additional complete() from the
> > > > ARM code?  If so, could it also use tglx's IPI trick?
> > > 
> > > I don't think it was tglx's IPI trick, I've had code sitting in my tree
> > > for a while for it, but it has its own set of problems which are not
> > > resolvable:
> > > 
> > > 1. it needs more IPIs than we have available on all platforms
> > 
> > OK, I will ask the stupid question...  Is it possible to multiplex
> > the IPIs, for example, by using smp_call_function_single()?
> > 
> > > 2. there's some optional locking in the GIC driver that causes problems
> > >    for the CPU dying path.
> > 
> > On this, I must plead ignorance.
> > 
> > > The consensus last time around was that the IPI solution is a non-
> > > starter, so the seven-year proven-reliable solution (disregarding the
> > > RCU warning) persists because I don't think anyone came up with a
> > > better solution.
> > 
> > Seven years already?  Sigh, I don't have the heart to check because
> > I wouldn't want to find out that it has actually been longer.  ;-)
> 
> *Sigh* thanks for what you just said, you do realise that you've just
> said that you don't believe what I write in emails?  FFS.  Is there
> any point in continuing to discuss this if that's the case?  Really?

Sorry to have offended you.  I will leave this matter in your hands.

							Thanx, Paul

> commit 3c030beabf937b1d3b4ecaedfd1fb2f1e2aa0c70
> Author: Russell King <rmk+kernel@arm.linux.org.uk>
> Date:   Tue Nov 30 11:07:35 2010 +0000
> 
>     ARM: CPU hotplug: move cpu_killed completion to core code
> 
>     We always need to wait for the dying CPU to reach a safe state before
>     taking it down, irrespective of the requirements of the platform.
>     Move the completion code into the ARM SMP hotplug code rather than
>     having each platform re-implement this.
> 
>     Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> 
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index a30c4094db3a..8c81ff9b3732 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -24,6 +24,7 @@
>  #include <linux/irq.h>
>  #include <linux/percpu.h>
>  #include <linux/clockchips.h>
> +#include <linux/completion.h>
> 
>  #include <asm/atomic.h>
>  #include <asm/cacheflush.h>
> @@ -238,12 +239,20 @@ int __cpu_disable(void)
>         return 0;
>  }
> 
> +static DECLARE_COMPLETION(cpu_died);
> +
>  /*
>   * called on the thread which is asking for a CPU to be shutdown -
>   * waits until shutdown has completed, or it is timed out.
>   */
>  void __cpu_die(unsigned int cpu)
>  {
> +       if (!wait_for_completion_timeout(&cpu_died, msecs_to_jiffies(5000))) {
> +               pr_err("CPU%u: cpu didn't die\n", cpu);
> +               return;
> +       }
> +       printk(KERN_NOTICE "CPU%u: shutdown\n", cpu);
> +
>         if (!platform_cpu_kill(cpu))
>                 printk("CPU%u: unable to kill\n", cpu);
>  }
> @@ -263,9 +272,12 @@ void __ref cpu_die(void)
>         local_irq_disable();
>         idle_task_exit();
> 
> +       /* Tell __cpu_die() that this CPU is now safe to dispose of */
> +       complete(&cpu_died);
> +
>         /*
>          * actual CPU shutdown procedure is at least platform (if not
> -        * CPU) specific
> +        * CPU) specific.
>          */
>         platform_cpu_die(cpu);
> 
> 
> So, 30th Nov 2010 to 10th Dec 2017 is seven years, one week and three
> days to be exact.
> 
> -- 
> RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
> According to speedtest.net: 8.21Mbps down 510kbps up
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* WARNING: suspicious RCU usage
  2017-12-10 21:39       ` Paul E. McKenney
  2017-12-10 21:54         ` Russell King - ARM Linux
@ 2017-12-12 16:49         ` Paul E. McKenney
  2017-12-12 16:56           ` Fabio Estevam
  1 sibling, 1 reply; 25+ messages in thread
From: Paul E. McKenney @ 2017-12-12 16:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Dec 10, 2017 at 01:39:30PM -0800, Paul E. McKenney wrote:
> On Sun, Dec 10, 2017 at 07:34:39PM +0000, Russell King - ARM Linux wrote:
> > On Sun, Dec 10, 2017 at 11:07:27AM -0800, Paul E. McKenney wrote:
> > > On Sun, Dec 10, 2017 at 12:00:12PM +0000, Russell King - ARM Linux wrote:
> > > > +Paul
> > > > 
> > > > Annoyingly, it looks like calling "complete()" from a dying CPU is
> > > > triggering the RCU usage warning.  From what I remember, this is an
> > > > old problem, and we still have no better solution for this other than
> > > > to persist with the warning.
> > > 
> > > I thought that this issue was resolved with tglx's use of IPIs from
> > > the outgoing CPU.  Or is this due to an additional complete() from the
> > > ARM code?  If so, could it also use tglx's IPI trick?
> > 
> > I don't think it was tglx's IPI trick, I've had code sitting in my tree
> > for a while for it, but it has its own set of problems which are not
> > resolvable:
> > 
> > 1. it needs more IPIs than we have available on all platforms
> 
> OK, I will ask the stupid question...  Is it possible to multiplex
> the IPIs, for example, by using smp_call_function_single()?

On the perhaps unlikely off-chance that it is both useful and welcome,
the (untested, probably does not even build) patch below illustrates the
use of smp_call_function_single().  This is based on the patch Russell
sent -- for all I know, it might well be that there are other places
needing similar changes.

But something to try out for anyone wishing to do so.

							Thanx, Paul

------------------------------------------------------------------------

commit c579a1494ccbc7ebf5548115571a2988ea1a1fe5
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Mon Dec 11 09:40:58 2017 -0800

    ARM: CPU hotplug: Delegate complete() to surviving CPU
    
    The ARM implementation of arch_cpu_idle_dead() invokes complete(), but
    does so after RCU has stopped watching the outgoing CPU, which results
    in lockdep complaints because complete() invokes functions containing RCU
    readers.  This patch therefore uses Thomas Gleixner's trick of delegating
    the complete() call to a surviving CPU via smp_call_function_single().
    
    This patch is untested, and probably does not even build.

    Reported-by: Peng Fan <van.freenix@gmail.com>
    Reported-by: Russell King - ARM Linux <linux@armlinux.org.uk>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index b4fbf00ee4ad..75f85e20aafa 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -267,6 +267,14 @@ void __cpu_die(unsigned int cpu)
 }
 
 /*
+ * Invoke complete() on behalf of the outgoing CPU.
+ */
+static void arch_cpu_idle_dead_complete(void *arg)
+{
+	complete(&cpu_died);
+}
+
+/*
  * Called from the idle thread for the CPU which has been shutdown.
  *
  * Note that we disable IRQs here, but do not re-enable them
@@ -293,9 +301,11 @@ void arch_cpu_idle_dead(void)
 	/*
 	 * Tell __cpu_die() that this CPU is now safe to dispose of.  Once
 	 * this returns, power and/or clocks can be removed at any point
-	 * from this CPU and its cache by platform_cpu_kill().
+	 * from this CPU and its cache by platform_cpu_kill().  We cannot
+	 * call complete() this late, so we delegate it to an online CPU.
 	 */
-	complete(&cpu_died);
+	smp_call_function_single(cpumask_first(cpu_online_mask),
+				 arch_cpu_idle_dead_complete, NULL, 0);
 
 	/*
 	 * Ensure that the cache lines associated with that completion are

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* WARNING: suspicious RCU usage
  2017-12-12 16:49         ` Paul E. McKenney
@ 2017-12-12 16:56           ` Fabio Estevam
  2017-12-12 17:21             ` Paul E. McKenney
  2017-12-12 17:34             ` Russell King - ARM Linux
  0 siblings, 2 replies; 25+ messages in thread
From: Fabio Estevam @ 2017-12-12 16:56 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Paul,

On Tue, Dec 12, 2017 at 2:49 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:

> On the perhaps unlikely off-chance that it is both useful and welcome,
> the (untested, probably does not even build) patch below illustrates the
> use of smp_call_function_single().  This is based on the patch Russell
> sent -- for all I know, it might well be that there are other places
> needing similar changes.
>
> But something to try out for anyone wishing to do so.
>
>                                                         Thanx, Paul
>
> ------------------------------------------------------------------------
>
> commit c579a1494ccbc7ebf5548115571a2988ea1a1fe5
> Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Date:   Mon Dec 11 09:40:58 2017 -0800
>
>     ARM: CPU hotplug: Delegate complete() to surviving CPU
>
>     The ARM implementation of arch_cpu_idle_dead() invokes complete(), but
>     does so after RCU has stopped watching the outgoing CPU, which results
>     in lockdep complaints because complete() invokes functions containing RCU
>     readers.  This patch therefore uses Thomas Gleixner's trick of delegating
>     the complete() call to a surviving CPU via smp_call_function_single().
>
>     This patch is untested, and probably does not even build.
>
>     Reported-by: Peng Fan <van.freenix@gmail.com>
>     Reported-by: Russell King - ARM Linux <linux@armlinux.org.uk>
>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

With your patch applied I no longer get the RCU warning, thanks:

Tested-by: Fabio Estevam <fabio.estevam@nxp.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* WARNING: suspicious RCU usage
  2017-12-12 16:56           ` Fabio Estevam
@ 2017-12-12 17:21             ` Paul E. McKenney
  2017-12-12 17:34             ` Russell King - ARM Linux
  1 sibling, 0 replies; 25+ messages in thread
From: Paul E. McKenney @ 2017-12-12 17:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 12, 2017 at 02:56:18PM -0200, Fabio Estevam wrote:
> Hi Paul,
> 
> On Tue, Dec 12, 2017 at 2:49 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On the perhaps unlikely off-chance that it is both useful and welcome,
> > the (untested, probably does not even build) patch below illustrates the
> > use of smp_call_function_single().  This is based on the patch Russell
> > sent -- for all I know, it might well be that there are other places
> > needing similar changes.
> >
> > But something to try out for anyone wishing to do so.
> >
> >                                                         Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > commit c579a1494ccbc7ebf5548115571a2988ea1a1fe5
> > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Date:   Mon Dec 11 09:40:58 2017 -0800
> >
> >     ARM: CPU hotplug: Delegate complete() to surviving CPU
> >
> >     The ARM implementation of arch_cpu_idle_dead() invokes complete(), but
> >     does so after RCU has stopped watching the outgoing CPU, which results
> >     in lockdep complaints because complete() invokes functions containing RCU
> >     readers.  This patch therefore uses Thomas Gleixner's trick of delegating
> >     the complete() call to a surviving CPU via smp_call_function_single().
> >
> >     This patch is untested, and probably does not even build.
> >
> >     Reported-by: Peng Fan <van.freenix@gmail.com>
> >     Reported-by: Russell King - ARM Linux <linux@armlinux.org.uk>
> >     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> With your patch applied I no longer get the RCU warning, thanks:
> 
> Tested-by: Fabio Estevam <fabio.estevam@nxp.com>

Well, I guess that it is no longer untested, and thank you for that.  ;-)

I sent a more official posting of the patch.

							Thanx, Paul


* WARNING: suspicious RCU usage
  2017-12-12 16:56           ` Fabio Estevam
  2017-12-12 17:21             ` Paul E. McKenney
@ 2017-12-12 17:34             ` Russell King - ARM Linux
  2017-12-12 18:11               ` Fabio Estevam
  1 sibling, 1 reply; 25+ messages in thread
From: Russell King - ARM Linux @ 2017-12-12 17:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 12, 2017 at 02:56:18PM -0200, Fabio Estevam wrote:
> Hi Paul,
> 
> On Tue, Dec 12, 2017 at 2:49 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On the perhaps unlikely off-chance that it is both useful and welcome,
> > the (untested, probably does not even build) patch below illustrates the
> > use of smp_call_function_single().  This is based on the patch Russell
> > sent -- for all I know, it might well be that there are other places
> > needing similar changes.
> >
> > But something to try out for anyone wishing to do so.
> >
> >                                                         Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > commit c579a1494ccbc7ebf5548115571a2988ea1a1fe5
> > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Date:   Mon Dec 11 09:40:58 2017 -0800
> >
> >     ARM: CPU hotplug: Delegate complete() to surviving CPU
> >
> >     The ARM implementation of arch_cpu_idle_dead() invokes complete(), but
> >     does so after RCU has stopped watching the outgoing CPU, which results
> >     in lockdep complaints because complete() invokes functions containing RCU
> >     readers.  This patch therefore uses Thomas Gleixner's trick of delegating
> >     the complete() call to a surviving CPU via smp_call_function_single().
> >
> >     This patch is untested, and probably does not even build.
> >
> >     Reported-by: Peng Fan <van.freenix@gmail.com>
> >     Reported-by: Russell King - ARM Linux <linux@armlinux.org.uk>
> >     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> With your patch applied I no longer get the RCU warning, thanks:
> 
> Tested-by: Fabio Estevam <fabio.estevam@nxp.com>

It's fundamentally unsafe.

You need to test with CONFIG_BL_SWITCHER enabled - there's spinlocks
in smp_call_function_single() path that are conditional on that symbol.
If CONFIG_BL_SWITCHER is disabled, then the spinlocks are not present.

The problem is that the IPI will be sent with the spinlock held.  The
IPI'd CPU will then do the completion, and the CPU requesting the
death will continue, and could power down the dying CPU _before_ the
unlock of that spinlock becomes visible to other CPUs in the system.

So, we end up with a spinlock permanently held.

Whether this happens or not depends on timing, and whether the unlock
gets evicted from the dying CPU.

If you attempt to clean the caches after the unlock to force that unlock
out, then you need a way to make the requesting CPU wait for the dying
CPU to finish that action... oh, that's what this complete() is trying
to do here in the first place.

So we're back to exactly where we were without this patch.
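To make the race concrete, here is one possible interleaving as a hypothetical timeline; the lock name follows the later reference to "cpu_map_lock" in drivers/irqchip/irq-gic.c, and the details of the CONFIG_BL_SWITCHER path are assumed:

```text
requesting CPU                    dying CPU                   IPI'd CPU
--------------                    ---------                   ---------
wait_for_completion(&cpu_died)
                                  spin_lock(&cpu_map_lock)
                                  write GICD_SGIR (send IPI)
                                                              complete(&cpu_died)
wait_for_completion() returns
platform_cpu_kill(cpu)
  ... power removed ...
                                  spin_unlock(&cpu_map_lock)
                                    the unlock sits dirty in the now-dead
                                    CPU's cache and never becomes visible:
                                    the lock is held forever
```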

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


* WARNING: suspicious RCU usage
  2017-12-12 17:34             ` Russell King - ARM Linux
@ 2017-12-12 18:11               ` Fabio Estevam
  2017-12-12 19:36                 ` Paul E. McKenney
  2017-12-13  9:12                 ` Russell King - ARM Linux
  0 siblings, 2 replies; 25+ messages in thread
From: Fabio Estevam @ 2017-12-12 18:11 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russell,

On Tue, Dec 12, 2017 at 3:34 PM, Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:

> It's fundamentally unsafe.
>
> You need to test with CONFIG_BL_SWITCHER enabled - there's spinlocks
> in smp_call_function_single() path that are conditional on that symbol.
> If CONFIG_BL_SWITCHER is disabled, then the spinlocks are not present.

Ok, just tested with CONFIG_BL_SWITCHER=y on an imx6q-cubox-i:

# echo enabled > /sys/class/tty/ttymxc0/power/wakeup
# echo mem > /sys/power/state
[   10.503462] PM: suspend entry (deep)
[   10.507479] PM: Syncing filesystems ... done.
[   10.555024] Freezing user space processes ... (elapsed 0.002 seconds) done.
[   10.564511] OOM killer disabled.
[   10.567760] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
[   10.577420] Suspending console(s) (use no_console_suspend to debug)
[   10.657748] PM: suspend devices took 0.080 seconds
[   10.669329] Disabling non-boot CPUs ...
[   10.717049] IRQ17 no longer affine to CPU1
[   10.837141] Enabling non-boot CPUs ...
[   10.839386] CPU1 is up
[   10.840342] CPU2 is up
[   10.841300] CPU3 is up
[   11.113735] mmc0: queuing unknown CIS tuple 0x80 (2 bytes)
[   11.115676] mmc0: queuing unknown CIS tuple 0x80 (3 bytes)
[   11.117595] mmc0: queuing unknown CIS tuple 0x80 (3 bytes)
[   11.121014] mmc0: queuing unknown CIS tuple 0x80 (7 bytes)
[   11.124454] mmc0: queuing unknown CIS tuple 0x80 (7 bytes)
[   11.177299] ata1: SATA link down (SStatus 0 SControl 300)
[   11.181930] PM: resume devices took 0.330 seconds
[   11.243729] OOM killer enabled.
[   11.246886] Restarting tasks ... done.
[   11.253012] PM: suspend exit


* WARNING: suspicious RCU usage
  2017-12-12 18:11               ` Fabio Estevam
@ 2017-12-12 19:36                 ` Paul E. McKenney
  2017-12-12 19:44                   ` Fabio Estevam
  2017-12-13  9:12                 ` Russell King - ARM Linux
  1 sibling, 1 reply; 25+ messages in thread
From: Paul E. McKenney @ 2017-12-12 19:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 12, 2017 at 04:11:07PM -0200, Fabio Estevam wrote:
> Hi Russell,
> 
> On Tue, Dec 12, 2017 at 3:34 PM, Russell King - ARM Linux
> <linux@armlinux.org.uk> wrote:
> 
> > It's fundamentally unsafe.
> >
> > You need to test with CONFIG_BL_SWITCHER enabled - there's spinlocks
> > in smp_call_function_single() path that are conditional on that symbol.
> > If CONFIG_BL_SWITCHER is disabled, then the spinlocks are not present.
> 
> Ok, just tested with CONFIG_BL_SWITCHER=y on a imx6q-cubox-i:

Just to confirm, your dmesg below is illustrating the hang, correct?

							Thanx, Paul

> # echo enabled > /sys/class/tty/ttymxc0/power/wakeup
> # echo mem > /sys/power/state
> [   10.503462] PM: suspend entry (deep)
> [   10.507479] PM: Syncing filesystems ... done.
> [   10.555024] Freezing user space processes ... (elapsed 0.002 seconds) done.
> [   10.564511] OOM killer disabled.
> [   10.567760] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) d.
> [   10.577420] Suspending console(s) (use no_console_suspend to debug)
> [   10.657748] PM: suspend devices took 0.080 seconds
> [   10.669329] Disabling non-boot CPUs ...
> [   10.717049] IRQ17 no longer affine to CPU1
> [   10.837141] Enabling non-boot CPUs ...
> [   10.839386] CPU1 is up
> [   10.840342] CPU2 is up
> [   10.841300] CPU3 is up
> [   11.113735] mmc0: queuing unknown CIS tuple 0x80 (2 bytes)
> [   11.115676] mmc0: queuing unknown CIS tuple 0x80 (3 bytes)
> [   11.117595] mmc0: queuing unknown CIS tuple 0x80 (3 bytes)
> [   11.121014] mmc0: queuing unknown CIS tuple 0x80 (7 bytes)
> [   11.124454] mmc0: queuing unknown CIS tuple 0x80 (7 bytes)
> [   11.177299] ata1: SATA link down (SStatus 0 SControl 300)
> [   11.181930] PM: resume devices took 0.330 seconds
> [   11.243729] OOM killer enabled.
> [   11.246886] Restarting tasks ... done.
> [   11.253012] PM: suspend exit
> 


* WARNING: suspicious RCU usage
  2017-12-12 19:36                 ` Paul E. McKenney
@ 2017-12-12 19:44                   ` Fabio Estevam
  2017-12-12 19:54                     ` Russell King - ARM Linux
  0 siblings, 1 reply; 25+ messages in thread
From: Fabio Estevam @ 2017-12-12 19:44 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Paul,

On Tue, Dec 12, 2017 at 5:36 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:

>> Ok, just tested with CONFIG_BL_SWITCHER=y on a imx6q-cubox-i:
>
> Just to confirm, your dmesg below is illustrating the hang, correct?

Sorry for not being clear. Let me clarify my tests.

If I run a mainline kernel on an imx6q I do see the exact same RCU
warning as reported by Peng Fan in this thread.

This is 100% reproducible: the first time I do a suspend/resume after
boot the RCU warning is present. Subsequent suspend/resume cycles do
not show the warning.

With your patch applied I don't see the RCU warning anymore.

I originally tested the standard imx_v6_v7_defconfig and also a kernel
with CONFIG_BL_SWITCHER=y as suggested by Russell.

In my tests even with CONFIG_BL_SWITCHER=y suspend/resume works fine
and no more RCU warnings were seen.

The dmesg I shared shows the normal output without the RCU warning.

Thanks


* WARNING: suspicious RCU usage
  2017-12-12 19:44                   ` Fabio Estevam
@ 2017-12-12 19:54                     ` Russell King - ARM Linux
  2017-12-12 21:05                       ` Fabio Estevam
  0 siblings, 1 reply; 25+ messages in thread
From: Russell King - ARM Linux @ 2017-12-12 19:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 12, 2017 at 05:44:07PM -0200, Fabio Estevam wrote:
> Hi Paul,
> 
> On Tue, Dec 12, 2017 at 5:36 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> 
> >> Ok, just tested with CONFIG_BL_SWITCHER=y on a imx6q-cubox-i:
> >
> > Just to confirm, your dmesg below is illustrating the hang, correct?
> 
> Sorry for not being clear. Let me clarify my tests.
> 
> If I run a mainline kernel on a imx6q I do see the exact same RCU
> warning as reported by Peng Fan in this thread.
> 
> This is 100% reproducible: the first time I do a suspend/resume after
> boot the RCU warning is present. Subsequent suspend/resume cycles do
> not show the warning.
> 
> With your patch applied I don't see the RCU warning anymore.
> 
> I originally tested the standard imx_v6_v7_defconfig and also a kernel
> with CONFIG_BL_SWITCHER=y as suggested by Russell.
> 
> In my tests even with CONFIG_BL_SWITCHER=y suspend/resume works fine
> and no more RCU warnings were seen.
> 
> The dmesg I shared shows the normal output without the RCU warning.

Which is exactly what I would expect with imx6.  Just because it
works for imx6 does not mean it works for everyone.

I need to get Will Deacon's permission before I can send an email
containing our discussion on the points here from 2013, and that
probably won't happen until tomorrow - and I'm having to do that
because none of you seem to be listening to what I'm saying wrt
that spinlock.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


* WARNING: suspicious RCU usage
  2017-12-12 19:54                     ` Russell King - ARM Linux
@ 2017-12-12 21:05                       ` Fabio Estevam
  0 siblings, 0 replies; 25+ messages in thread
From: Fabio Estevam @ 2017-12-12 21:05 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russell,

On Tue, Dec 12, 2017 at 5:54 PM, Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:

> Which is exactly what I would expect with imx6.  Just because it
> works for imx6 does not mean it works for everyone.

Sure, I trust your judgement.

You asked me to test with CONFIG_BL_SWITCHER=y and that's what I did
and reported back.


* WARNING: suspicious RCU usage
  2017-12-12 18:11               ` Fabio Estevam
  2017-12-12 19:36                 ` Paul E. McKenney
@ 2017-12-13  9:12                 ` Russell King - ARM Linux
  2017-12-15  6:38                   ` Paul E. McKenney
  1 sibling, 1 reply; 25+ messages in thread
From: Russell King - ARM Linux @ 2017-12-13  9:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 12, 2017 at 04:11:07PM -0200, Fabio Estevam wrote:
> Hi Russell,
> 
> On Tue, Dec 12, 2017 at 3:34 PM, Russell King - ARM Linux
> <linux@armlinux.org.uk> wrote:
> 
> > It's fundamentally unsafe.
> >
> > You need to test with CONFIG_BL_SWITCHER enabled - there's spinlocks
> > in smp_call_function_single() path that are conditional on that symbol.
> > If CONFIG_BL_SWITCHER is disabled, then the spinlocks are not present.
> 
> Ok, just tested with CONFIG_BL_SWITCHER=y on a imx6q-cubox-i:
> 
> # echo enabled > /sys/class/tty/ttymxc0/power/wakeup
> # echo mem > /sys/power/state
> [   10.503462] PM: suspend entry (deep)
> [   10.507479] PM: Syncing filesystems ... done.
> [   10.555024] Freezing user space processes ... (elapsed 0.002 seconds) done.
> [   10.564511] OOM killer disabled.
> [   10.567760] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) d.
> [   10.577420] Suspending console(s) (use no_console_suspend to debug)
> [   10.657748] PM: suspend devices took 0.080 seconds
> [   10.669329] Disabling non-boot CPUs ...
> [   10.717049] IRQ17 no longer affine to CPU1
> [   10.837141] Enabling non-boot CPUs ...
> [   10.839386] CPU1 is up
> [   10.840342] CPU2 is up
> [   10.841300] CPU3 is up
> [   11.113735] mmc0: queuing unknown CIS tuple 0x80 (2 bytes)
> [   11.115676] mmc0: queuing unknown CIS tuple 0x80 (3 bytes)
> [   11.117595] mmc0: queuing unknown CIS tuple 0x80 (3 bytes)
> [   11.121014] mmc0: queuing unknown CIS tuple 0x80 (7 bytes)
> [   11.124454] mmc0: queuing unknown CIS tuple 0x80 (7 bytes)
> [   11.177299] ata1: SATA link down (SStatus 0 SControl 300)
> [   11.181930] PM: resume devices took 0.330 seconds
> [   11.243729] OOM killer enabled.
> [   11.246886] Restarting tasks ... done.
> [   11.253012] PM: suspend exit

Right, one test.  What makes this safe, and what does this prove?

It's probably worth quoting a discussion I had with Will Deacon on
this subject back in 2013 - it's on that complete(), but the points
discussed there are entirely relevant to the spinlock in the GIC
code.

imx6 won't see a problem because you have additional synchronisation
between the calling CPU and the dying CPU, so I'm afraid that any
testing done on imx6 is meaningless wrt the safety of Paul's change
from an architecture point of view.

And that's the problem - once that complete() happens, the dying CPU
can have power removed _at any moment_, and that could happen when
the cache line for "cpu_map_lock" in drivers/irqchip/irq-gic.c is
owned by the dying CPU.

If you read the discussion below, that was one of Will's concerns
with using complete() before we nailed down how complete() works.  I'm
sorry, but I'm not wrapping this...

18:00 < rmk> wildea01: can you think of any reason not to use flush_cache_louis() in cpu_die() ?
18:20 < wildea01> let me see...
18:20 < rmk> what I'm thinking of is:
18:20 < rmk>         idle_task_exit();
18:20 < rmk>         local_irq_disable();
18:20 < rmk>         flush_cache_louis();
18:20 < rmk>         mb();
18:20 < rmk>         RCU_NONIDLE(complete(&cpu_died));
18:20 < rmk>         mb();
18:20 < rmk>         if (smp_ops.cpu_die)
18:20 < rmk>                 smp_ops.cpu_die(cpu);
18:21 < rmk> and then killing most of the flush_cache_all() calls in smp_ops.cpu_die()
18:22 < rmk> the only thing I haven't worked out is why some places disable the L1 cache and then flush - especially as that can make any dirty data in the L1 cache suddenly vanish from the CPUs visibility
18:22 < wildea01> might need to be careful with the completion
18:23 < rmk> that's why the mb() is there - another CPU will read the cpu_died thing which means it must have become visible to the other CPUs
18:24 < wildea01> but the cacheline could still be exclusive in our L1 I think
18:24 < rmk> how?  surely it must have become shared because another CPU has read from it?
18:25 < wildea01> I'm thinking of the spin lock -- can we guarantee that another core will have read that before we turn off our cache?
18:27 < rmk> well, can we get out of wait_for_completion without the spin lock having been unlocked?
18:27 < rmk> one of the points of completions is that it should be safe for stuff like this
18:59 < rmk> yes, one of the things that wait_for_completion/complete was invented for was for synchronising a kernel thread exit with cleaning up after it
19:01 < rmk> and if you look at the above functions, the spinlock can't be owned by the CPU calling complete() because wait_for_completion() must reacquire it after complete() has woken the wait_for_completion thread u p
19:04 < rmk> well, I just tried it out on omap4430 and it seems to work fine
19:04 < wildea01> rmk: but complete does a spin_unlock_irqrestore(&x->wait.lock, flags);, now if that sits exclusive in our cache and we power-off before the waiting CPU gets the lock, then we're dead
19:04 < rmk> yes it does, but...
19:04 < wildea01> maybe that's so incredibly unlikely that we don't mind
19:04 < rmk> the other CPU must exit wait_for_completion()
19:05 < rmk> which involves reading/writing that lock too
19:05 < rmk>                         spin_lock_irq(&x->wait.lock);
19:05 < rmk>                 } while (!x->done && timeout);
19:05 < rmk>                 __remove_wait_queue(&x->wait, &wait);
19:05 < rmk>         }
19:05 < rmk>         x->done--;
19:05 < rmk>         return timeout ?: 1;
19:05 < rmk>         spin_unlock_irq(&x->wait.lock);
19:06 < wildea01> hmm, where is that code?
19:06 < rmk> will all be executed on another CPU after that spin_unlock
19:06 < rmk> wait_for_completion->wait_for_common->__wait_for_common->do_wait_for_common
19:07 < wildea01> gotcha, I didn't go as far as do_wait_for_common
19:07 * wildea01 scrolls up
19:07 < rmk> the bits I quoted is the exit path from do_wait_for_common back to wait_for_completion
19:15 < wildea01> I guess the only bit I'm missing is why the the other CPU must exit wait_for_completion before we can proceed with the cpu_die
19:16 < rmk> wrong way round.
19:16 < rmk> complete() must exit completely and be visible before wait_for_completion can return
19:17 < rmk> so there's no way that platform_cpu_kill() can end up being called before that complete() has unlocked that spinlock
19:17 < rmk> and as platform_cpu_kill() is what should be removing power to the dead CPU
19:17 < wildea01> but I don't see how we can guarantee that the other guy has read it
19:17 < wildea01> he might be *just* about to read it
19:18 < wildea01> but it might not have happened
19:18 < rmk>         if (!wait_for_completion_timeout(&cpu_died, msecs_to_jiffies(5000))) {
19:18 < rmk>         }
19:18 < rmk>         printk(KERN_NOTICE "CPU%u: shutdown\n", cpu);
19:18 < rmk> you can't get to that printk until complete() has unlocked the completions spinlock.
19:18 < wildea01> agreed
19:18 < rmk> and that unlock has become visible to the CPU executing the above code
19:19 < wildea01> wait a second: I'm assuming that cpu_die is killing the lights, which some people seem to do iirc?
19:20 < wildea01> if that's all done in platform_cpu_kill, then I think we're ok
19:20 < wildea01> as you say
19:20 < rmk> indeed.
19:20 < rmk> it actually depends on how the offlining works.
19:20 < wildea01> so the question is: are there smp_ops.cpu_die which hit the power controller?
19:20 < rmk> yes, but weirdly... because the CPU goes into dead mode when it executes the WFI
19:21 < wildea01> yeah, there's some signal that goes high when that happens, so people like to tie that to the power switch
19:22 < rmk> but that is also fine, because if that's what triggers the power off, we will get there anyway (because platform_cpu_kill won't do that)
19:22 < rmk> err, won't kill the power
19:24 < wildea01> hmm, not sure I follow
19:24 < wildea01> highbank does what you're saying, so we could take that as an example
19:25 < wildea01> (pretending that the cache flush isn't there)
19:26 < rmk> ok.
19:26 < rmk> so what will happen is...
19:27 < rmk> one CPU (not the dying CPU) will end up calling __cpu_die()
19:27 < rmk> the dying CPU will call cpu_die()
19:27 < wildea01> yup
19:27 < rmk> lets call the first one the calling CPU (even though it may not be)
19:28 < wildea01> sure -- it's the guy waiting for the offline to happen
19:28 < rmk> the calling CPU calls into wait_for_completion_timeout() and sits there waiting for the dying CPU to call that complete()
19:28 < rmk> meanwhile, the dying CPU does the idle task exit, disables IRQs, and flushes its caches of any dirty data.
19:29 < rmk> now, the calling CPU has had to take the completion's spinlock, check the counter, release the spinlock, and is waiting for the completion...
19:30 < rmk> so, the dying CPU takes the spinlock, increments the counter, and triggers a wakeup of the calling CPU, and then releases the spinlock.
19:30 < wildea01> yep
19:30 < rmk> the dying CPU now has dirty cache lines again which it probably exclusively owns
19:30 < wildea01> at this point, can we assume that the calling CPU goes off to handle an interrupt or something?
19:31 < rmk> it can do, it could even be the scheduler IPI
19:31 < wildea01> good
19:31 < wildea01> so the dying CPU can proceed past the complete(, with the calling CPU occupied somewhere else
19:31 < rmk> it can do, yes.
19:32 < wildea01> and off into smp_ops.cpu_die => highbank_cpu_die
19:32 < rmk> *OH*, I see what you're getting at
19:32 < wildea01> :)
19:33 < rmk> hmm, how can we get around that...
19:34 < wildea01> it's tricky, because platform_cpu_kill runs on the caller cpu
19:34 < rmk> because, in the case where the power is cut from platform_cpu_kill(), the dying CPU can loose power at any point after that complete()
19:34 < wildea01> so we'd need to split up the `next wfi on core n should kill it' from the `here's my wfi'
19:36 < rmk> I think we could do another flush_cache_louis() after it
19:37 < rmk> if, in the case of platform_cpu_kill() doing the killing of the CPU, that should be fine.
19:37 < wildea01> actually... why do we even need the one before it?
19:37 < wildea01> why not just have one after the complete has returned?
19:37 < rmk> because if platform_cpu_kill() is the call which removes the power, we need to ensure the cache contents have been written out
19:38 < wildea01> ah yeah, I was thinking of requiring both the kill and the die in order for the powerdown, but that's not alwasy necessary
19:38 < wildea01> *always
19:38 < wildea01> last time I checked, nobody used plaform_cpu_kill
19:39 < rmk> umm, lots do
19:39 < rmk> and some do use it to do stuff
19:39 < wildea01> damn, I must've been thinking of something else
19:40 < rmk> well, imx and tegra seem to, but they have their own custom waits implemented probably because of the lack of cache handling in the generic code
19:42 < wildea01> I don't know enough about tegra to understand why their kill and die don't race
19:44 < rmk> ok, looking at the locking barrier doc, we don't need the mb() after the complete() call
19:44 < rmk> but I think to address your concern, we must have another flush_cache_louis() there
19:44 < wildea01> yeah, the unlock should give you that mb
19:45 < wildea01> just seems a shame to have two flushes when they're not usually both needed
19:45 < wildea01> (he assumes)
19:45 < wildea01> like I said, the tegra could looks broken to me
19:45 < wildea01> *code
19:45 < rmk> if we had a louis() version which could flush the cpu_died completion...
19:51 < wildea01> do we even need the mb before the complete?
19:52 < rmk> I've been debating about that, and I think not
19:52 < rmk> I'm just augmenting this with comments as well
19:53 < wildea01> good thinking, I think we've established that it's not simple to understand!
19:59 < rmk> http://www.home.arm.linux.org.uk/~rmk/misc/smp-hotplug.diff
19:59 < rmk> new version of it with lots of comments :)
20:03 < wildea01> looks good to me. The additional flush is a pity, but required, and it's hotplug-off anyway, so not exactly speedy
20:03 < wildea01> one typo in a comment: s/loosing/losing/

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


* WARNING: suspicious RCU usage
  2017-12-13  9:12                 ` Russell King - ARM Linux
@ 2017-12-15  6:38                   ` Paul E. McKenney
  2017-12-15 13:16                     ` Fabio Estevam
  0 siblings, 1 reply; 25+ messages in thread
From: Paul E. McKenney @ 2017-12-15  6:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 13, 2017 at 09:12:45AM +0000, Russell King - ARM Linux wrote:
> On Tue, Dec 12, 2017 at 04:11:07PM -0200, Fabio Estevam wrote:
> > Hi Russell,
> > 
> > On Tue, Dec 12, 2017 at 3:34 PM, Russell King - ARM Linux
> > <linux@armlinux.org.uk> wrote:
> > 
> > > It's fundamentally unsafe.
> > >
> > > You need to test with CONFIG_BL_SWITCHER enabled - there's spinlocks
> > > in smp_call_function_single() path that are conditional on that symbol.
> > > If CONFIG_BL_SWITCHER is disabled, then the spinlocks are not present.
> > 
> > Ok, just tested with CONFIG_BL_SWITCHER=y on a imx6q-cubox-i:
> > 
> > # echo enabled > /sys/class/tty/ttymxc0/power/wakeup
> > # echo mem > /sys/power/state
> > [   10.503462] PM: suspend entry (deep)
> > [   10.507479] PM: Syncing filesystems ... done.
> > [   10.555024] Freezing user space processes ... (elapsed 0.002 seconds) done.
> > [   10.564511] OOM killer disabled.
> > [   10.567760] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) d.
> > [   10.577420] Suspending console(s) (use no_console_suspend to debug)
> > [   10.657748] PM: suspend devices took 0.080 seconds
> > [   10.669329] Disabling non-boot CPUs ...
> > [   10.717049] IRQ17 no longer affine to CPU1
> > [   10.837141] Enabling non-boot CPUs ...
> > [   10.839386] CPU1 is up
> > [   10.840342] CPU2 is up
> > [   10.841300] CPU3 is up
> > [   11.113735] mmc0: queuing unknown CIS tuple 0x80 (2 bytes)
> > [   11.115676] mmc0: queuing unknown CIS tuple 0x80 (3 bytes)
> > [   11.117595] mmc0: queuing unknown CIS tuple 0x80 (3 bytes)
> > [   11.121014] mmc0: queuing unknown CIS tuple 0x80 (7 bytes)
> > [   11.124454] mmc0: queuing unknown CIS tuple 0x80 (7 bytes)
> > [   11.177299] ata1: SATA link down (SStatus 0 SControl 300)
> > [   11.181930] PM: resume devices took 0.330 seconds
> > [   11.243729] OOM killer enabled.
> > [   11.246886] Restarting tasks ... done.
> > [   11.253012] PM: suspend exit
> 
> Right, one test.  What makes this safe, and what does this prove?
> 
> It's probably worth quoting a discussion I had with Will Deacon on
> this subject back in 2013 - it's on that complete(), but the points
> discussed there are entirely relevant to the spinlock in the GIC
> code.
> 
> imx6 won't see a problem because you have additional synchronisation
> between the calling CPU and the dying CPU, so I'm afraid that any
> testing done on imx6 is meaningless wrt the safety of Paul's change
> from an architecture point of view.
> 
> And that's the problem - once that complete() happens, the dying CPU
> can have power removed _at any moment_, and that could happen when
> the cache line for "cpu_map_lock" in drivers/irqchip/irq-gic.c is
> owned by the dying CPU.
> 
> If you read the discussion below, that was one of Will's concerns
> with using complete() before we nailed down complete() works.  I'm
> sorry, but I'm not wrapping this...

No worries!  It is quite legible here.  Additional questions and (you
guessed it!) another patch below.

> 18:00 < rmk> wildea01: can you think of any reason not to use flush_cache_louis() in cpu_die() ?
> 18:20 < wildea01> let me see...
> 18:20 < rmk> what I'm thinking of is:
> 18:20 < rmk>         idle_task_exit();
> 18:20 < rmk>         local_irq_disable();
> 18:20 < rmk>         flush_cache_louis();
> 18:20 < rmk>         mb();
> 18:20 < rmk>         RCU_NONIDLE(complete(&cpu_died));
> 18:20 < rmk>         mb();
> 18:20 < rmk>         if (smp_ops.cpu_die)
> 18:20 < rmk>                 smp_ops.cpu_die(cpu);
> 18:21 < rmk> and then killing most of the flush_cache_all() calls in smp_ops.cpu_die()
> 18:22 < rmk> the only thing I haven't worked out is why some places disable the L1 cache and then flush - especially as that can make any dirty data in the L1 cache suddenly vanish from the CPUs visibility
> 18:22 < wildea01> might need to be careful with the completion
> 18:23 < rmk> that's why the mb() is there - another CPU will read the cpu_died thing which means it must have become visible to the other CPUs
> 18:24 < wildea01> but the cacheline could still be exclusive in our L1 I think
> 18:24 < rmk> how?  surely it must have become shared because another CPU has read from it?

OK, I am taking this as meaning that it is OK for the outgoing CPU to
be powered off with unflushed data in its cache, as long as that data is
shared so that some other CPU has a copy.  In that case, the disappearance
of the outgoing CPU's copy is a non-problem, correct?

> 18:25 < wildea01> I'm thinking of the spin lock -- can we guarantee that another core will have read that before we turn off our cache?
> 18:27 < rmk> well, can we get out of wait_for_completion without the spin lock having been unlocked?
> 18:27 < rmk> one of the points of completions is that it should be safe for stuff like this
> 18:59 < rmk> yes, one of the things that wait_for_completion/complete was invented for was for synchronising a kernel thread exit with cleaning up after it
> 19:01 < rmk> and if you look at the above functions, the spinlock can't be owned by the CPU calling complete() because wait_for_completion() must reacquire it after complete() has woken the wait_for_completion thread u p
> 19:04 < rmk> well, I just tried it out on omap4430 and it seems to work fine
> 19:04 < wildea01> rmk: but complete does a spin_unlock_irqrestore(&x->wait.lock, flags);, now if that sits exclusive in our cache and we power-off before the waiting CPU gets the lock, then we're dead
> 19:04 < rmk> yes it does, but...
> 19:04 < wildea01> maybe that's so incredibly unlikely that we don't mind
> 19:04 < rmk> the other CPU must exit wait_for_completion()
> 19:05 < rmk> which involves reading/writing that lock too
> 19:05 < rmk>                         spin_lock_irq(&x->wait.lock);
> 19:05 < rmk>                 } while (!x->done && timeout);
> 19:05 < rmk>                 __remove_wait_queue(&x->wait, &wait);
> 19:05 < rmk>         }
> 19:05 < rmk>         x->done--;
> 19:05 < rmk>         return timeout ?: 1;
> 19:05 < rmk>         spin_unlock_irq(&x->wait.lock);
> 19:06 < wildea01> hmm, where is that code?
> 19:06 < rmk> will all be executed on another CPU after that spin_unlock
> 19:06 < rmk> wait_for_completion->wait_for_common->__wait_for_common->do_wait_for_common
> 19:07 < wildea01> gotcha, I didn't go as far as do_wait_for_common
> 19:07 * wildea01 scrolls up
> 19:07 < rmk> the bits I quoted is the exit path from do_wait_for_common back to wait_for_completion
> 19:15 < wildea01> I guess the only bit I'm missing is why the the other CPU must exit wait_for_completion before we can proceed with the cpu_die
> 19:16 < rmk> wrong way round.
> 19:16 < rmk> complete() must exit completely and be visible before wait_for_completion can return
> 19:17 < rmk> so there's no way that platform_cpu_kill() can end up being called before that complete() has unlocked that spinlock
> 19:17 < rmk> and as platform_cpu_kill() is what should be removing power to the dead CPU
> 19:17 < wildea01> but I don't see how we can guarantee that the other guy has read it
> 19:17 < wildea01> he might be *just* about to read it
> 19:18 < wildea01> but it might not have happened
> 19:18 < rmk>         if (!wait_for_completion_timeout(&cpu_died, msecs_to_jiffies(5000))) {
> 19:18 < rmk>         }
> 19:18 < rmk>         printk(KERN_NOTICE "CPU%u: shutdown\n", cpu);
> 19:18 < rmk> you can't get to that printk until complete() has unlocked the completion's spinlock.

And a read-modify-write atomic (LDREX/STREX pair) is guaranteed to
pull the corresponding cache line out of the outgoing CPU's cache,
thus preserving the values in that cache line for posterity.

> 19:18 < wildea01> agreed
> 19:18 < rmk> and that unlock has become visible to the CPU executing the above code
> 19:19 < wildea01> wait a second: I'm assuming that cpu_die is killing the lights, which some people seem to do iirc?
> 19:20 < wildea01> if that's all done in platform_cpu_kill, then I think we're ok
> 19:20 < wildea01> as you say
> 19:20 < rmk> indeed.
> 19:20 < rmk> it actually depends on how the offlining works.
> 19:20 < wildea01> so the question is: are there smp_ops.cpu_die which hit the power controller?
> 19:20 < rmk> yes, but weirdly... because the CPU goes into dead mode when it executes the WFI
> 19:21 < wildea01> yeah, there's some signal that goes high when that happens, so people like to tie that to the power switch
> 19:22 < rmk> but that is also fine, because if that's what triggers the power off, we will get there anyway (because platform_cpu_kill won't do that)
> 19:22 < rmk> err, won't kill the power
> 19:24 < wildea01> hmm, not sure I follow
> 19:24 < wildea01> highbank does what you're saying, so we could take that as an example
> 19:25 < wildea01> (pretending that the cache flush isn't there)
> 19:26 < rmk> ok.
> 19:26 < rmk> so what will happen is...
> 19:27 < rmk> one CPU (not the dying CPU) will end up calling __cpu_die()
> 19:27 < rmk> the dying CPU will call cpu_die()
> 19:27 < wildea01> yup
> 19:27 < rmk> lets call the first one the calling CPU (even though it may not be)
> 19:28 < wildea01> sure -- it's the guy waiting for the offline to happen
> 19:28 < rmk> the calling CPU calls into wait_for_completion_timeout() and sits there waiting for the dying CPU to call that complete()
> 19:28 < rmk> meanwhile, the dying CPU does the idle task exit, disables IRQs, and flushes its caches of any dirty data.
> 19:29 < rmk> now, the calling CPU has had to take the completion's spinlock, check the counter, release the spinlock, and is waiting for the completion...
> 19:30 < rmk> so, the dying CPU takes the spinlock, increments the counter, and triggers a wakeup of the calling CPU, and then releases the spinlock.
> 19:30 < wildea01> yep
> 19:30 < rmk> the dying CPU now has dirty cache lines again which it probably exclusively owns
> 19:30 < wildea01> at this point, can we assume that the calling CPU goes off to handle an interrupt or something?
> 19:31 < rmk> it can do, it could even be the scheduler IPI
> 19:31 < wildea01> good
> 19:31 < wildea01> so the dying CPU can proceed past the complete(), with the calling CPU occupied somewhere else
> 19:31 < rmk> it can do, yes.
> 19:32 < wildea01> and off into smp_ops.cpu_die => highbank_cpu_die
> 19:32 < rmk> *OH*, I see what you're getting at
> 19:32 < wildea01> :)
> 19:33 < rmk> hmm, how can we get around that...
> 19:34 < wildea01> it's tricky, because platform_cpu_kill runs on the caller cpu
> 19:34 < rmk> because, in the case where the power is cut from platform_cpu_kill(), the dying CPU can lose power at any point after that complete()
> 19:34 < wildea01> so we'd need to split up the `next wfi on core n should kill it' from the `here's my wfi'
> 19:36 < rmk> I think we could do another flush_cache_louis() after it
> 19:37 < rmk> if, in the case of platform_cpu_kill() doing the killing of the CPU, that should be fine.

So CPUs that power themselves off are responsible for flushing their own
caches.  Makes sense.

> 19:37 < wildea01> actually... why do we even need the one before it?
> 19:37 < wildea01> why not just have one after the complete has returned?
> 19:37 < rmk> because if platform_cpu_kill() is the call which removes the power, we need to ensure the cache contents have been written out
> 19:38 < wildea01> ah yeah, I was thinking of requiring both the kill and the die in order for the powerdown, but that's not alwasy necessary
> 19:38 < wildea01> *always
> 19:38 < wildea01> last time I checked, nobody used platform_cpu_kill
> 19:39 < rmk> umm, lots do
> 19:39 < rmk> and some do use it to do stuff
> 19:39 < wildea01> damn, I must've been thinking of something else
> 19:40 < rmk> well, imx and tegra seem to, but they have their own custom waits implemented probably because of the lack of cache handling in the generic code
> 19:42 < wildea01> I don't know enough about tegra to understand why their kill and die don't race
> 19:44 < rmk> ok, looking at the locking barrier doc, we don't need the mb() after the complete() call
> 19:44 < rmk> but I think to address your concern, we must have another flush_cache_louis() there
> 19:44 < wildea01> yeah, the unlock should give you that mb
> 19:45 < wildea01> just seems a shame to have two flushes when they're not usually both needed
> 19:45 < wildea01> (he assumes)
> 19:45 < wildea01> like I said, the tegra could looks broken to me
> 19:45 < wildea01> *code
> 19:45 < rmk> if we had a louis() version which could flush the cpu_died completion...
> 19:51 < wildea01> do we even need the mb before the complete?
> 19:52 < rmk> I've been debating about that, and I think not
> 19:52 < rmk> I'm just augmenting this with comments as well
> 19:53 < wildea01> good thinking, I think we've established that it's not simple to understand!
> 19:59 < rmk> http://www.home.arm.linux.org.uk/~rmk/misc/smp-hotplug.diff
> 19:59 < rmk> new version of it with lots of comments :)
> 20:03 < wildea01> looks good to me. The additional flush is a pity, but required, and it's hotplug-off anyway, so not exactly speedy
> 20:03 < wildea01> one typo in a comment: s/loosing/losing/

So for the lock word, the trick is that when acquiring the lock requires
an LDREX/STREX pair, the act of acquiring that lock will remove the lock
word from the outgoing CPU's cache.  Cute!

Also, as long as a given cacheline is in shared state (so that there
is another copy of it in some other CPU's cache), it is OK to power off
the outgoing CPU without flushing that shared-state cacheline.

A few questions, as always, for the case where the outgoing CPU is
powered down before it executes any instructions following the return
from complete():

1.	It is possible that complete() and the functions that it invokes
	might write to the outgoing CPU's stack, which could result
	in those lines becoming exclusive in the outgoing CPU's cache.
	My guess is that the surviving CPU won't extract the corresponding
	lines from the outgoing CPU's cache.  Is this correct?

	(I could imagine forgiving cache-coherence hardware just reading
	old values from memory, which might be OK for the stack because
	no one should care about the data that the outgoing CPU wrote
	just before being powered off.  Or maybe something else is
	going on.)

2.	The complete() function also updates the ->done field of the
	completion structure.  This is OK because it is in the same
	cacheline as the lock?

3.	The try_to_wake_up() function that can be invoked indirectly
	by complete() writes to a number of fields in the task struct
	of the task blocked in wait_for_completion().  What ensures
	that the corresponding cachelines are flushed from the outgoing
	CPU's cache?  (I know that some of them are read by the scheduler
	on the CPU on which the task is being awakened (which puts them
	in shared state, thus preserving them), but it is not clear that
	this is the case for all of them, particularly statistics.)

There are probably other questions, but those are the ones that come to
mind at the moment.

For your amusement, I have a patch below that takes a paranoid view of
the possible answers to these questions.  This patch is untested and
probably does not even build.  Plus its polling loop is quite naive.
On the other hand, given the info in the IRC log, it seems like it
should provide a very robust guarantee that no data gets stranded in
the outgoing CPU's cache.

							Thanx, Paul

------------------------------------------------------------------------

commit 6c9e28385ce1417628c4f2e58c078b723f35d62a
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Mon Dec 11 09:40:58 2017 -0800

    ARM: CPU hotplug: Delegate complete() to surviving CPU
    
    The ARM implementation of arch_cpu_idle_dead() invokes complete(), but
    does so after RCU has stopped watching the outgoing CPU, which results
    in lockdep complaints because complete() invokes functions containing RCU
    readers.  In addition, if the outgoing CPU really were to consume several
    seconds of its five-second allotted time, multiple RCU updates could
    complete, possibly giving the outgoing CPU an inconsistent view of the
    scheduler data structures on which complete() relies.
    
    This (untested, probably does not build) commit avoids this problem by
    polling the outgoing CPU.  The polling strategy in this prototype patch
    is quite naive, with one jiffy between each poll and without any sort of
    adaptive spin phase.  The key point is that the polling CPU uses xchg(),
    which evicts the flag from the outgoing CPU's cache.  The outgoing CPU
    simply does a WRITE_ONCE(), which minimizes opportunities for other data
    to get pulled into the outgoing CPU's cache.  This pulling of values
    from the outgoing CPU's cache is important because the outgoing CPU
    might be unceremoniously powered off before it has time to execute any
    code after the WRITE_ONCE().
    
    Reported-by: Peng Fan <van.freenix@gmail.com>
    Reported-by: Russell King - ARM Linux <linux@armlinux.org.uk>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Cc: Russell King <linux@armlinux.org.uk>
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Fabio Estevam <fabio.estevam@nxp.com>
    Cc: <linux-arm-kernel@lists.infradead.org>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index b4fbf00ee4ad..da363923503b 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -241,7 +241,7 @@ int __cpu_disable(void)
 	return 0;
 }
 
-static DECLARE_COMPLETION(cpu_died);
+static char cpu_died;
 
 /*
  * called on the thread which is asking for a CPU to be shutdown -
@@ -249,7 +249,16 @@ static DECLARE_COMPLETION(cpu_died);
  */
 void __cpu_die(unsigned int cpu)
 {
-	if (!wait_for_completion_timeout(&cpu_died, msecs_to_jiffies(5000))) {
+	unsigned long deadline = jiffies + msecs_to_jiffies(5000);
+	char ret;
+
+	while (time_before(jiffies, deadline)) {
+		ret = xchg(&cpu_died, 0);
+		if (ret)
+			break;
+		schedule_timeout_interruptible(1);
+	}
+	if (!ret) {
 		pr_err("CPU%u: cpu didn't die\n", cpu);
 		return;
 	}
@@ -295,7 +304,7 @@ void arch_cpu_idle_dead(void)
 	 * this returns, power and/or clocks can be removed at any point
 	 * from this CPU and its cache by platform_cpu_kill().
 	 */
-	complete(&cpu_died);
+	WRITE_ONCE(cpu_died, 1);
 
 	/*
 	 * Ensure that the cache lines associated with that completion are

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* WARNING: suspicious RCU usage
  2017-12-15  6:38                   ` Paul E. McKenney
@ 2017-12-15 13:16                     ` Fabio Estevam
  2017-12-15 15:52                       ` Paul E. McKenney
  0 siblings, 1 reply; 25+ messages in thread
From: Fabio Estevam @ 2017-12-15 13:16 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Paul,

On Fri, Dec 15, 2017 at 4:38 AM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:

> For your amusement, I have a patch below that takes a paranoid view of
> the possible answers to these questions.  This patch is untested and
> probably does not even build.  Plus its polling loop is quite naive.

I tried to build it, but it fails to link:

  LD      vmlinux.o
  MODPOST vmlinux.o
arch/arm/kernel/smp.o: In function `__cpu_die':
smp.c:(.text+0x44c): undefined reference to `__bad_xchg'
Makefile:1024: recipe for target 'vmlinux' failed
make: *** [vmlinux] Error 1

Thanks

^ permalink raw reply	[flat|nested] 25+ messages in thread

* WARNING: suspicious RCU usage
  2017-12-15 13:16                     ` Fabio Estevam
@ 2017-12-15 15:52                       ` Paul E. McKenney
  2017-12-15 18:23                         ` Paul E. McKenney
  0 siblings, 1 reply; 25+ messages in thread
From: Paul E. McKenney @ 2017-12-15 15:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Dec 15, 2017 at 11:16:43AM -0200, Fabio Estevam wrote:
> Hi Paul,
> 
> On Fri, Dec 15, 2017 at 4:38 AM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> 
> > For your amusement, I have a patch below that takes a paranoid view of
> > the possible answers to these questions.  This patch is untested and
> > probably does not even build.  Plus its polling loop is quite naive.
> 
> I tried to build it, but it fails to link:
> 
>   LD      vmlinux.o
>   MODPOST vmlinux.o
> arch/arm/kernel/smp.o: In function `__cpu_die':
> smp.c:(.text+0x44c): undefined reference to `__bad_xchg'
> Makefile:1024: recipe for target 'vmlinux' failed
> make: *** [vmlinux] Error 1

OK, I will need to make a better choice of atomic operation.

Thank you for testing this!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 25+ messages in thread

* WARNING: suspicious RCU usage
  2017-12-15 15:52                       ` Paul E. McKenney
@ 2017-12-15 18:23                         ` Paul E. McKenney
  2017-12-15 20:36                           ` Fabio Estevam
  0 siblings, 1 reply; 25+ messages in thread
From: Paul E. McKenney @ 2017-12-15 18:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Dec 15, 2017 at 07:52:18AM -0800, Paul E. McKenney wrote:
> On Fri, Dec 15, 2017 at 11:16:43AM -0200, Fabio Estevam wrote:
> > Hi Paul,
> > 
> > On Fri, Dec 15, 2017 at 4:38 AM, Paul E. McKenney
> > <paulmck@linux.vnet.ibm.com> wrote:
> > 
> > > For your amusement, I have a patch below that takes a paranoid view of
> > > the possible answers to these questions.  This patch is untested and
> > > probably does not even build.  Plus its polling loop is quite naive.
> > 
> > I tried to build it, but it fails to link:
> > 
> >   LD      vmlinux.o
> >   MODPOST vmlinux.o
> > arch/arm/kernel/smp.o: In function `__cpu_die':
> > smp.c:(.text+0x44c): undefined reference to `__bad_xchg'
> > Makefile:1024: recipe for target 'vmlinux' failed
> > make: *** [vmlinux] Error 1
> 
> OK, I will need to make a better choice of atomic operation.
> 
> Thank you for testing this!

How about this one, also untested etc.?

								Thanx, Paul

------------------------------------------------------------------------

commit 66c038bfa64ff75e0fcdf6756f6225d4253f5a81
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Mon Dec 11 09:40:58 2017 -0800

    ARM: CPU hotplug: Delegate complete() to surviving CPU
    
    The ARM implementation of arch_cpu_idle_dead() invokes complete(), but
    does so after RCU has stopped watching the outgoing CPU, which results
    in lockdep complaints because complete() invokes functions containing RCU
    readers.  In addition, if the outgoing CPU really were to consume several
    seconds of its five-second allotted time, multiple RCU updates could
    complete, possibly giving the outgoing CPU an inconsistent view of the
    scheduler data structures on which complete() relies.
    
    This (untested, probably does not build) commit avoids this problem by
    polling the outgoing CPU.  The polling strategy in this prototype patch
    is quite naive, with one jiffy between each poll and without any sort
    of adaptive spin phase.  The key point is that the polling CPU uses
    atomic_dec_and_test(), which evicts the flag from the outgoing CPU's
    cache.  The outgoing CPU simply does an atomic_set() of the value 1 which
    causes the next atomic_dec_and_test() to return true, and which also
    minimizes opportunities for other data to get pulled into the outgoing
    CPU's cache.  This pulling of values from the outgoing CPU's cache is
    important because the outgoing CPU might be unceremoniously powered off
    before it has time to execute any code after the atomic_set().
    
    Underflow is avoided because there can be at most 5,000 invocations of
    atomic_dec_and_test() for a given offline operation, and the counter is
    set back to zero each time.
    
    Reported-by: Peng Fan <van.freenix@gmail.com>
    Reported-by: Russell King - ARM Linux <linux@armlinux.org.uk>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Cc: Russell King <linux@armlinux.org.uk>
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Fabio Estevam <fabio.estevam@nxp.com>
    Cc: <linux-arm-kernel@lists.infradead.org>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index b4fbf00ee4ad..2fcffccf26ab 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -241,7 +241,7 @@ int __cpu_disable(void)
 	return 0;
 }
 
-static DECLARE_COMPLETION(cpu_died);
+static atomic_t cpu_died;
 
 /*
  * called on the thread which is asking for a CPU to be shutdown -
@@ -249,7 +249,17 @@ static DECLARE_COMPLETION(cpu_died);
  */
 void __cpu_die(unsigned int cpu)
 {
-	if (!wait_for_completion_timeout(&cpu_died, msecs_to_jiffies(5000))) {
+	unsigned long deadline = jiffies + msecs_to_jiffies(5000);
+	char ret;
+
+	while (time_before(jiffies, deadline)) {
+		ret = atomic_dec_and_test(&cpu_died);
+		if (ret)
+			break;
+		schedule_timeout_interruptible(1);
+	}
+	atomic_set(&cpu_died, 0);
+	if (!ret) {
 		pr_err("CPU%u: cpu didn't die\n", cpu);
 		return;
 	}
@@ -295,7 +305,7 @@ void arch_cpu_idle_dead(void)
 	 * this returns, power and/or clocks can be removed at any point
 	 * from this CPU and its cache by platform_cpu_kill().
 	 */
-	complete(&cpu_died);
+	atomic_set(&cpu_died, 1);
 
 	/*
 	 * Ensure that the cache lines associated with that completion are

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* WARNING: suspicious RCU usage
  2017-12-15 18:23                         ` Paul E. McKenney
@ 2017-12-15 20:36                           ` Fabio Estevam
  2017-12-15 21:34                             ` Paul E. McKenney
  0 siblings, 1 reply; 25+ messages in thread
From: Fabio Estevam @ 2017-12-15 20:36 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Paul,

On Fri, Dec 15, 2017 at 4:23 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:

> How about this one, also untested etc.?

This one gives a build warning:

arch/arm/kernel/smp.c: In function '__cpu_die':
arch/arm/kernel/smp.c:262:5: warning: 'ret' may be used uninitialized
in this function [-Wmaybe-uninitialized]
  if (!ret) {
     ^

but when I run suspend/resume I don't see the RCU warning with this
patch applied.

Thanks

^ permalink raw reply	[flat|nested] 25+ messages in thread

* WARNING: suspicious RCU usage
  2017-12-15 20:36                           ` Fabio Estevam
@ 2017-12-15 21:34                             ` Paul E. McKenney
  2017-12-15 21:43                               ` Fabio Estevam
  0 siblings, 1 reply; 25+ messages in thread
From: Paul E. McKenney @ 2017-12-15 21:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Dec 15, 2017 at 06:36:49PM -0200, Fabio Estevam wrote:
> Hi Paul,
> 
> On Fri, Dec 15, 2017 at 4:23 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> 
> > How about this one, also untested etc.?
> 
> This one gives a build warning:
> 
> arch/arm/kernel/smp.c: In function '__cpu_die':
> arch/arm/kernel/smp.c:262:5: warning: 'ret' may be used uninitialized
> in this function [-Wmaybe-uninitialized]
>   if (!ret) {
>      ^

That would be me being stupid.  Please see below for an updated patch.

> but when I run suspend/resume I don't see the RCU warning with this
> patch applied.

So some progress, at least!  Thank you for your testing efforts!!!

							Thanx, Paul

------------------------------------------------------------------------

commit e1bb1ddc3402220eba1138322fedd520801ee510
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Mon Dec 11 09:40:58 2017 -0800

    ARM: CPU hotplug: Delegate complete() to surviving CPU
    
    The ARM implementation of arch_cpu_idle_dead() invokes complete(), but
    does so after RCU has stopped watching the outgoing CPU, which results
    in lockdep complaints because complete() invokes functions containing RCU
    readers.  In addition, if the outgoing CPU really were to consume several
    seconds of its five-second allotted time, multiple RCU updates could
    complete, possibly giving the outgoing CPU an inconsistent view of the
    scheduler data structures on which complete() relies.
    
    This (untested, probably does not build) commit avoids this problem by
    polling the outgoing CPU.  The polling strategy in this prototype patch
    is quite naive, with one jiffy between each poll and without any sort
    of adaptive spin phase.  The key point is that the polling CPU uses
    atomic_dec_and_test(), which evicts the flag from the outgoing CPU's
    cache.  The outgoing CPU simply does an atomic_set() of the value 1 which
    causes the next atomic_dec_and_test() to return true, and which also
    minimizes opportunities for other data to get pulled into the outgoing
    CPU's cache.  This pulling of values from the outgoing CPU's cache is
    important because the outgoing CPU might be unceremoniously powered off
    before it has time to execute any code after the atomic_set().
    
    Underflow is avoided because there can be at most 5,000 invocations of
    atomic_dec_and_test() for a given offline operation, and the counter is
    set back to zero each time.
    
    Reported-by: Peng Fan <van.freenix@gmail.com>
    Reported-by: Russell King - ARM Linux <linux@armlinux.org.uk>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Cc: Russell King <linux@armlinux.org.uk>
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Fabio Estevam <fabio.estevam@nxp.com>
    Cc: <linux-arm-kernel@lists.infradead.org>

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index b4fbf00ee4ad..f9a4990689f3 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -241,7 +241,7 @@ int __cpu_disable(void)
 	return 0;
 }
 
-static DECLARE_COMPLETION(cpu_died);
+static atomic_t cpu_died;
 
 /*
  * called on the thread which is asking for a CPU to be shutdown -
@@ -249,7 +249,17 @@ static DECLARE_COMPLETION(cpu_died);
  */
 void __cpu_die(unsigned int cpu)
 {
-	if (!wait_for_completion_timeout(&cpu_died, msecs_to_jiffies(5000))) {
+	unsigned long deadline = jiffies + msecs_to_jiffies(5000);
+	char ret = 0;
+
+	while (time_before(jiffies, deadline)) {
+		ret = atomic_dec_and_test(&cpu_died);
+		if (ret)
+			break;
+		schedule_timeout_interruptible(1);
+	}
+	atomic_set(&cpu_died, 0);
+	if (!ret) {
 		pr_err("CPU%u: cpu didn't die\n", cpu);
 		return;
 	}
@@ -295,7 +305,7 @@ void arch_cpu_idle_dead(void)
 	 * this returns, power and/or clocks can be removed at any point
 	 * from this CPU and its cache by platform_cpu_kill().
 	 */
-	complete(&cpu_died);
+	atomic_set(&cpu_died, 1);
 
 	/*
 	 * Ensure that the cache lines associated with that completion are

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* WARNING: suspicious RCU usage
  2017-12-15 21:34                             ` Paul E. McKenney
@ 2017-12-15 21:43                               ` Fabio Estevam
  2017-12-15 22:56                                 ` Paul E. McKenney
  0 siblings, 1 reply; 25+ messages in thread
From: Fabio Estevam @ 2017-12-15 21:43 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Paul,

On Fri, Dec 15, 2017 at 7:34 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:

> That would be me being stupid.  Please see below for an updated patch.

This one builds cleanly and works fine on my imx6q board. Thanks

^ permalink raw reply	[flat|nested] 25+ messages in thread

* WARNING: suspicious RCU usage
  2017-12-15 21:43                               ` Fabio Estevam
@ 2017-12-15 22:56                                 ` Paul E. McKenney
  0 siblings, 0 replies; 25+ messages in thread
From: Paul E. McKenney @ 2017-12-15 22:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Dec 15, 2017 at 07:43:36PM -0200, Fabio Estevam wrote:
> Hi Paul,
> 
> On Fri, Dec 15, 2017 at 7:34 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> 
> > That would be me being stupid.  Please see below for an updated patch.
> 
> This one builds cleanly and works fine on my imx6q board. Thanks

Very good, thank you!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2017-12-15 22:56 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-10 11:39 WARNING: suspicious RCU usage Peng Fan
2017-12-10 12:00 ` Russell King - ARM Linux
2017-12-10 19:07   ` Paul E. McKenney
2017-12-10 19:34     ` Russell King - ARM Linux
2017-12-10 21:39       ` Paul E. McKenney
2017-12-10 21:54         ` Russell King - ARM Linux
2017-12-10 23:16           ` Paul E. McKenney
2017-12-12 16:49         ` Paul E. McKenney
2017-12-12 16:56           ` Fabio Estevam
2017-12-12 17:21             ` Paul E. McKenney
2017-12-12 17:34             ` Russell King - ARM Linux
2017-12-12 18:11               ` Fabio Estevam
2017-12-12 19:36                 ` Paul E. McKenney
2017-12-12 19:44                   ` Fabio Estevam
2017-12-12 19:54                     ` Russell King - ARM Linux
2017-12-12 21:05                       ` Fabio Estevam
2017-12-13  9:12                 ` Russell King - ARM Linux
2017-12-15  6:38                   ` Paul E. McKenney
2017-12-15 13:16                     ` Fabio Estevam
2017-12-15 15:52                       ` Paul E. McKenney
2017-12-15 18:23                         ` Paul E. McKenney
2017-12-15 20:36                           ` Fabio Estevam
2017-12-15 21:34                             ` Paul E. McKenney
2017-12-15 21:43                               ` Fabio Estevam
2017-12-15 22:56                                 ` Paul E. McKenney

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.