From: Qian Cai <quic_qiancai@quicinc.com>
To: Peter Zijlstra <peterz@infradead.org>, <gor@linux.ibm.com>,
<jpoimboe@redhat.com>, <jikos@kernel.org>, <mbenes@suse.cz>,
<pmladek@suse.com>, <mingo@kernel.org>
Cc: <linux-kernel@vger.kernel.org>, <joe.lawrence@redhat.com>,
<fweisbec@gmail.com>, <tglx@linutronix.de>, <hca@linux.ibm.com>,
<svens@linux.ibm.com>, <sumanthk@linux.ibm.com>,
<live-patching@vger.kernel.org>, <paulmck@kernel.org>,
<rostedt@goodmis.org>, <x86@kernel.org>
Subject: Re: [PATCH v2 04/11] sched: Simplify wake_up_*idle*()
Date: Wed, 13 Oct 2021 10:32:26 -0400
Message-ID: <ba4ca17f-100e-bef7-6d7b-4de0f5a515b9@quicinc.com>
In-Reply-To: <20210929152428.769328779@infradead.org>

On 9/29/2021 11:17 AM, Peter Zijlstra wrote:
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -1170,14 +1170,14 @@ void wake_up_all_idle_cpus(void)
>  {
>  	int cpu;
>
> -	preempt_disable();
> +	cpus_read_lock();
>  	for_each_online_cpu(cpu) {
> -		if (cpu == smp_processor_id())
> +		if (cpu == raw_smp_processor_id())
>  			continue;
>
>  		wake_up_if_idle(cpu);
>  	}
> -	preempt_enable();
> +	cpus_read_unlock();

Peter, it looks like this thing introduced a deadlock during CPU online/offline.
[ 630.145166][ T129] WARNING: possible recursive locking detected
[ 630.151164][ T129] 5.15.0-rc5-next-20211013+ #145 Not tainted
[ 630.156988][ T129] --------------------------------------------
[ 630.162984][ T129] cpuhp/21/129 is trying to acquire lock:
[ 630.168547][ T129] ffff800011f466d0 (cpu_hotplug_lock){++++}-{0:0}, at: wake_up_all_idle_cpus+0x40/0xe8
wake_up_all_idle_cpus at /usr/src/linux-next/kernel/smp.c:1174
[ 630.178040][ T129]
[ 630.178040][ T129] but task is already holding lock:
[...] (cpu_hotplug_lock){++++}-{0:0}, at: [...]
[...] other info that might help us debug this:
[ 630.202292][ T129] Possible unsafe locking scenario:
[ 630.202292][ T129]
[ 630.209590][ T129] CPU0
[ 630.212720][ T129] ----
[ 630.215851][ T129] lock(cpu_hotplug_lock);
[ 630.220202][ T129] lock(cpu_hotplug_lock);
[ 630.224553][ T129]
[ 630.224553][ T129] *** DEADLOCK ***
[ 630.224553][ T129]
[ 630.232545][ T129] May be due to missing lock nesting notation
[ 630.232545][ T129]
[ 630.240711][ T129] 3 locks held by cpuhp/21/129:
[ 630.245406][ T129] #0: ffff800011f466d0 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0xe0/0x588
[ 630.254976][ T129] #1: ffff800011f46780 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun+0xe0/0x588
[ 630.264372][ T129] #2: ffff8000191fb9c8 (cpuidle_lock){+.+.}-{3:3}, at: cpuidle_pause_and_lock+0x24/0x38
[ 630.274031][ T129]
[ 630.274031][ T129] stack backtrace:
[ 630.279767][ T129] CPU: 21 PID: 129 Comm: cpuhp/21 Not tainted 5.15.0-rc5-next-20211013+ #145
[ 630.288371][ T129] Hardware name: MiTAC RAPTOR EV-883832-X3-0001/RAPTOR, BIOS 1.6 06/28/2020
[ 630.296886][ T129] Call trace:
[ 630.300017][ T129] dump_backtrace+0x0/0x3b8
[ 630.304369][ T129] show_stack+0x20/0x30
[ 630.308371][ T129] dump_stack_lvl+0x8c/0xb8
[ 630.312722][ T129] dump_stack+0x1c/0x38
[ 630.316723][ T129] validate_chain+0x1d84/0x1da0
[ 630.321421][ T129] __lock_acquire+0xab0/0x2040
[ 630.326033][ T129] lock_acquire+0x32c/0xb08
[ 630.330390][ T129] cpus_read_lock+0x94/0x308
[ 630.334827][ T129] wake_up_all_idle_cpus+0x40/0xe8
[ 630.339784][ T129] cpuidle_uninstall_idle_handler+0x3c/0x50
[ 630.345524][ T129] cpuidle_pause_and_lock+0x28/0x38
[ 630.350569][ T129] acpi_processor_hotplug+0xc0/0x170
[ 630.355701][ T129] acpi_soft_cpu_online+0x124/0x250
[ 630.360745][ T129] cpuhp_invoke_callback+0x51c/0x2ab8
[ 630.365963][ T129] cpuhp_thread_fun+0x204/0x588
[ 630.370659][ T129] smpboot_thread_fn+0x3f0/0xc40
[ 630.375444][ T129] kthread+0x3d8/0x488
[ 630.379360][ T129] ret_from_fork+0x10/0x20
[ 863.525716][ T191] INFO: task cpuhp/21:129 blocked for more than 122 seconds.
[ 863.532954][ T191] Not tainted 5.15.0-rc5-next-20211013+ #145
[ 863.539361][ T191] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 863.547927][ T191] task:cpuhp/21 state:D stack:59104 pid: 129 ppid: 2 flags:0x00000008
[ 863.557029][ T191] Call trace:
[ 863.560171][ T191] __switch_to+0x184/0x400
[ 863.564448][ T191] __schedule+0x74c/0x1940
[ 863.568753][ T191] schedule+0x110/0x318
[ 863.572764][ T191] percpu_rwsem_wait+0x1b8/0x348
[ 863.577592][ T191] __percpu_down_read+0xb0/0x148
[ 863.582386][ T191] cpus_read_lock+0x2b0/0x308
[ 863.586961][ T191] wake_up_all_idle_cpus+0x40/0xe8
[ 863.591931][ T191] cpuidle_uninstall_idle_handler+0x3c/0x50
[ 863.597716][ T191] cpuidle_pause_and_lock+0x28/0x38
[ 863.602771][ T191] acpi_processor_hotplug+0xc0/0x170
[ 863.607946][ T191] acpi_soft_cpu_online+0x124/0x250
[ 863.613001][ T191] cpuhp_invoke_callback+0x51c/0x2ab8
[ 863.618261][ T191] cpuhp_thread_fun+0x204/0x588
[ 863.622967][ T191] smpboot_thread_fn+0x3f0/0xc40
[ 863.627787][ T191] kthread+0x3d8/0x488
[ 863.631712][ T191] ret_from_fork+0x10/0x20
[ 863.636020][ T191] INFO: task kworker/0:2:189 blocked for more than 122 seconds.
[ 863.643500][ T191] Not tainted 5.15.0-rc5-next-20211013+ #145
[ 863.649882][ T191] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 863.658425][ T191] task:kworker/0:2 state:D stack:58368 pid: 189 ppid: 2 flags:0x00000008
[ 863.667516][ T191] Workqueue: events vmstat_shepherd
[ 863.672573][ T191] Call trace:
[ 863.675731][ T191] __switch_to+0x184/0x400
[ 863.680001][ T191] __schedule+0x74c/0x1940
[ 863.684268][ T191] schedule+0x110/0x318
[ 863.688295][ T191] percpu_rwsem_wait+0x1b8/0x348
[ 863.693085][ T191] __percpu_down_read+0xb0/0x148
[ 863.697892][ T191] cpus_read_lock+0x2b0/0x308
[ 863.702421][ T191] vmstat_shepherd+0x5c/0x1a8
[ 863.706977][ T191] process_one_work+0x808/0x19d0
[ 863.711767][ T191] worker_thread+0x334/0xae0
[ 863.716227][ T191] kthread+0x3d8/0x488
[ 863.720149][ T191] ret_from_fork+0x10/0x20
[ 863.724487][ T191] INFO: task lsbug:4642 blocked for more than 123 seconds.
[ 863.731565][ T191] Not tainted 5.15.0-rc5-next-20211013+ #145
[ 863.737938][ T191] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 863.746490][ T191] task:lsbug state:D stack:55536 pid: 4642 ppid: 4638 flags:0x00000008
[ 863.755549][ T191] Call trace:
[ 863.758712][ T191] __switch_to+0x184/0x400
[ 863.762984][ T191] __schedule+0x74c/0x1940
[ 863.767286][ T191] schedule+0x110/0x318
[ 863.771294][ T191] schedule_timeout+0x188/0x238
[ 863.776016][ T191] wait_for_completion+0x174/0x290
[ 863.780979][ T191] __cpuhp_kick_ap+0x158/0x1a8
[ 863.785592][ T191] cpuhp_kick_ap+0x1f0/0x828
[ 863.790053][ T191] bringup_cpu+0x180/0x1e0
[ 863.794320][ T191] cpuhp_invoke_callback+0x51c/0x2ab8
[ 863.799561][ T191] cpuhp_invoke_callback_range+0xa4/0x108
[ 863.805130][ T191] cpu_up+0x528/0xd78
[ 863.808982][ T191] cpu_device_up+0x4c/0x68
[ 863.813249][ T191] cpu_subsys_online+0xc0/0x1f8
[ 863.817972][ T191] device_online+0x10c/0x180
[ 863.822413][ T191] online_store+0x10c/0x118
[ 863.826791][ T191] dev_attr_store+0x44/0x78
[ 863.831148][ T191] sysfs_kf_write+0xe8/0x138
[ 863.835590][ T191] kernfs_fop_write_iter+0x26c/0x3d0
[ 863.840745][ T191] new_sync_write+0x2bc/0x4f8
[ 863.845275][ T191] vfs_write+0x714/0xcd8
[ 863.849387][ T191] ksys_write+0xf8/0x1e0
[ 863.853481][ T191] __arm64_sys_write+0x74/0xa8
[ 863.858113][ T191] invoke_syscall.constprop.0+0xdc/0x1d8
[ 863.863597][ T191] do_el0_svc+0xe4/0x298
[ 863.867710][ T191] el0_svc+0x64/0x130
[ 863.871545][ T191] el0t_64_sync_handler+0xb0/0xb8
[ 863.876437][ T191] el0t_64_sync+0x180/0x184
[ 863.880798][ T191] INFO: task mount:4682 blocked for more than 123 seconds.
[ 863.887881][ T191] Not tainted 5.15.0-rc5-next-20211013+ #145
[ 863.894232][ T191] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 863.902776][ T191] task:mount state:D stack:55856 pid: 4682 ppid: 1101 flags:0x00000000
[ 863.911865][ T191] Call trace:
[ 863.915003][ T191] __switch_to+0x184/0x400
[ 863.919296][ T191] __schedule+0x74c/0x1940
[ 863.923564][ T191] schedule+0x110/0x318
[ 863.927590][ T191] percpu_rwsem_wait+0x1b8/0x348
[ 863.932380][ T191] __percpu_down_read+0xb0/0x148
[ 863.937187][ T191] cpus_read_lock+0x2b0/0x308
[ 863.941715][ T191] alloc_workqueue+0x730/0xd48
[ 863.946357][ T191] loop_configure+0x2d4/0x1180 [loop]
[ 863.951592][ T191] lo_ioctl+0x5dc/0x1228 [loop]
[ 863.956321][ T191] blkdev_ioctl+0x258/0x820
[ 863.960678][ T191] __arm64_sys_ioctl+0x114/0x180
[ 863.965468][ T191] invoke_syscall.constprop.0+0xdc/0x1d8
[ 863.970974][ T191] do_el0_svc+0xe4/0x298
[ 863.975069][ T191] el0_svc+0x64/0x130
[ 863.978922][ T191] el0t_64_sync_handler+0xb0/0xb8
[ 863.983798][ T191] el0t_64_sync+0x180/0x184
[ 863.988172][ T191] INFO: lockdep is turned off.