From: Marek Szyprowski <m.szyprowski@samsung.com>
To: Peter Zijlstra <peterz@infradead.org>,
	gor@linux.ibm.com, jpoimboe@redhat.com, jikos@kernel.org,
	mbenes@suse.cz, pmladek@suse.com, mingo@kernel.org
Cc: linux-kernel@vger.kernel.org, joe.lawrence@redhat.com,
	fweisbec@gmail.com, tglx@linutronix.de, hca@linux.ibm.com,
	svens@linux.ibm.com, sumanthk@linux.ibm.com,
	live-patching@vger.kernel.org, paulmck@kernel.org,
	rostedt@goodmis.org, x86@kernel.org
Subject: Re: [PATCH v2 04/11] sched: Simplify wake_up_*idle*()
Date: Fri, 22 Oct 2021 15:46:29 +0200
Message-ID: <ff361300-a390-651d-8316-1f4e8d390af3@samsung.com>
In-Reply-To: <20210929152428.769328779@infradead.org>

Hi,

On 29.09.2021 17:17, Peter Zijlstra wrote:
> Simplify and make wake_up_if_idle() more robust, also don't iterate
> the whole machine with preempt_disable() in it's caller:
> wake_up_all_idle_cpus().
>
> This prepares for another wake_up_if_idle() user that needs a full
> do_idle() cycle.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

This patch recently landed in linux-next as commit 8850cb663b5c
("sched: Simplify wake_up_*idle*()"). It triggers the following lockdep
warning on the arm64 'virt' machine under QEMU during a system
suspend/resume cycle:

--->8---

  printk: Suspending console(s) (use no_console_suspend to debug)

  ============================================
  WARNING: possible recursive locking detected
  5.15.0-rc6-next-20211022 #10905 Not tainted
  --------------------------------------------
  rtcwake/1326 is trying to acquire lock:
  ffffd4e9192e8130 (cpu_hotplug_lock){++++}-{0:0}, at: wake_up_all_idle_cpus+0x24/0x98

  but task is already holding lock:
  ffffd4e9192e8130 (cpu_hotplug_lock){++++}-{0:0}, at: suspend_devices_and_enter+0x740/0x9f0

  other info that might help us debug this:
   Possible unsafe locking scenario:

         CPU0
         ----
    lock(cpu_hotplug_lock);
    lock(cpu_hotplug_lock);

   *** DEADLOCK ***

   May be due to missing lock nesting notation

  5 locks held by rtcwake/1326:
   #0: ffff54ad86a78438 (sb_writers#7){.+.+}-{0:0}, at: ksys_write+0x64/0xf0
   #1: ffff54ad84094a88 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0xf4/0x1a8
   #2: ffff54ad83b17a88 (kn->active#43){.+.+}-{0:0}, at: kernfs_fop_write_iter+0xfc/0x1a8
   #3: ffffd4e9192efab0 (system_transition_mutex){+.+.}-{3:3}, at: pm_suspend+0x214/0x3d0
   #4: ffffd4e9192e8130 (cpu_hotplug_lock){++++}-{0:0}, at: suspend_devices_and_enter+0x740/0x9f0

  stack backtrace:
  CPU: 0 PID: 1326 Comm: rtcwake Not tainted 5.15.0-rc6-next-20211022 #10905
  Hardware name: linux,dummy-virt (DT)
  Call trace:
   dump_backtrace+0x0/0x1d0
   show_stack+0x14/0x20
   dump_stack_lvl+0x88/0xb0
   dump_stack+0x14/0x2c
   __lock_acquire+0x171c/0x17b8
   lock_acquire+0x234/0x378
   cpus_read_lock+0x5c/0x150
   wake_up_all_idle_cpus+0x24/0x98
   suspend_devices_and_enter+0x748/0x9f0
   pm_suspend+0x2b0/0x3d0
   state_store+0x84/0x108
   kobj_attr_store+0x14/0x28
   sysfs_kf_write+0x60/0x70
   kernfs_fop_write_iter+0x124/0x1a8
   new_sync_write+0xe8/0x1b0
   vfs_write+0x1d0/0x408
   ksys_write+0x64/0xf0
   __arm64_sys_write+0x14/0x20
   invoke_syscall+0x40/0xf8
   el0_svc_common.constprop.3+0x8c/0x120
   do_el0_svc_compat+0x18/0x48
   el0_svc_compat+0x48/0x100
   el0t_32_sync_handler+0xec/0x140
   el0t_32_sync+0x170/0x174
  OOM killer enabled.
  Restarting tasks ... done.
  PM: suspend exit

--->8---
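
Judging only from the trace above, cpu_hotplug_lock is already held from
within suspend_devices_and_enter() when wake_up_all_idle_cpus() calls
cpus_read_lock() on it again. Below is a minimal userspace sketch of that
nesting, not kernel code: struct held_locks and every model_*() helper are
hypothetical names made up for illustration, and the same-class check is a
rough simplification of what lockdep does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_HELD 8

/* Hypothetical model of a task's held-lock stack (not a kernel struct). */
struct held_locks {
	const char *class_name[MAX_HELD];
	size_t depth;
};

/* True if lock_class is already somewhere on the held-lock stack. */
static bool would_recurse(const struct held_locks *h, const char *lock_class)
{
	for (size_t i = 0; i < h->depth; i++)
		if (strcmp(h->class_name[i], lock_class) == 0)
			return true;
	return false;
}

/*
 * Record an acquisition. Returns false, recording nothing, when the same
 * lock class is already held or the stack is full.
 */
static bool model_acquire(struct held_locks *h, const char *lock_class)
{
	if (h->depth == MAX_HELD || would_recurse(h, lock_class))
		return false;
	h->class_name[h->depth++] = lock_class;
	return true;
}

/* Drop the most recently acquired lock. */
static void model_release(struct held_locks *h)
{
	assert(h->depth > 0);
	h->depth--;
}

/* Models the patched wake_up_all_idle_cpus(): lock, iterate, unlock. */
static bool model_wake_up_all_idle_cpus(struct held_locks *h)
{
	if (!model_acquire(h, "cpu_hotplug_lock"))
		return false;	/* this is where lockdep would complain */
	/* ... for_each_online_cpu(cpu) wake_up_if_idle(cpu) ... */
	model_release(h);
	return true;
}

/* Models the caller in the trace, which already holds the lock. */
static bool model_suspend_path(struct held_locks *h)
{
	bool ok;

	if (!model_acquire(h, "cpu_hotplug_lock"))
		return false;
	ok = model_wake_up_all_idle_cpus(h);
	model_release(h);
	return ok;
}
```

In this model, calling model_wake_up_all_idle_cpus() with an empty stack
succeeds, while model_suspend_path() fails at the inner acquisition, which
corresponds to the "possible recursive locking" report.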

Let me know if there is anything I can do to help debug and fix this issue.

> ---
>   kernel/sched/core.c |   14 +++++---------
>   kernel/smp.c        |    6 +++---
>   2 files changed, 8 insertions(+), 12 deletions(-)
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3691,15 +3691,11 @@ void wake_up_if_idle(int cpu)
>   	if (!is_idle_task(rcu_dereference(rq->curr)))
>   		goto out;
>   
> -	if (set_nr_if_polling(rq->idle)) {
> -		trace_sched_wake_idle_without_ipi(cpu);
> -	} else {
> -		rq_lock_irqsave(rq, &rf);
> -		if (is_idle_task(rq->curr))
> -			smp_send_reschedule(cpu);
> -		/* Else CPU is not idle, do nothing here: */
> -		rq_unlock_irqrestore(rq, &rf);
> -	}
> +	rq_lock_irqsave(rq, &rf);
> +	if (is_idle_task(rq->curr))
> +		resched_curr(rq);
> +	/* Else CPU is not idle, do nothing here: */
> +	rq_unlock_irqrestore(rq, &rf);
>   
>   out:
>   	rcu_read_unlock();
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -1170,14 +1170,14 @@ void wake_up_all_idle_cpus(void)
>   {
>   	int cpu;
>   
> -	preempt_disable();
> +	cpus_read_lock();
>   	for_each_online_cpu(cpu) {
> -		if (cpu == smp_processor_id())
> +		if (cpu == raw_smp_processor_id())
>   			continue;
>   
>   		wake_up_if_idle(cpu);
>   	}
> -	preempt_enable();
> +	cpus_read_unlock();
>   }
>   EXPORT_SYMBOL_GPL(wake_up_all_idle_cpus);
>   
>
>
>
Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

