linux-kernel.vger.kernel.org archive mirror
From: "Xu, Yanfei" <yanfei.xu@windriver.com>
To: paulmck@kernel.org
Cc: josh@joshtriplett.org, rostedt@goodmis.org,
	mathieu.desnoyers@efficios.com, jiangshanlai@gmail.com,
	joel@joelfernandes.org, rcu@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] rcu: fix a deadlock caused by not release rcu_node->lock
Date: Mon, 17 May 2021 09:55:30 +0800
Message-ID: <72a3b3f4-1b74-6c03-9d04-ac4bb721a55a@windriver.com>
In-Reply-To: <20210516225853.GD4441@paulmck-ThinkPad-P17-Gen-1>



On 5/17/21 6:58 AM, Paul E. McKenney wrote:
> On Sun, May 16, 2021 at 05:50:10PM +0800, yanfei.xu@windriver.com wrote:
>> From: Yanfei Xu <yanfei.xu@windriver.com>
>>
>> rcu_node->lock isn't released in rcu_print_task_stall() if the rcu_node
>> doesn't contain any tasks that are blocking the GP. However, this
>> rcu_node->lock will be acquired again in rcu_dump_cpu_stacks() soon
>> afterward if ndetected is non-zero. As a result, the CPU will hang on
>> this deadlock.
>>
>> Fixes: c583bcb8f5ed ("rcu: Don't invoke try_invoke_on_locked_down_task() with irqs disabled")
>> Signed-off-by: Yanfei Xu <yanfei.xu@windriver.com>
> 
> Also a good catch, thank you!  Queued for further review and testing,
> wordsmithed as shown below.  The rcutorture scripts have been known to
> work on ARM in the past, and might still do so.  (I test on x86.)
> 
> As always, please check to make sure that I didn't mess something up.
> 

Looks good to me, thanks!

Regards,
Yanfei

>                                                          Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> commit e0a9b77f245ae4fe1537120fd5319bf9e091618e
> Author: Yanfei Xu <yanfei.xu@windriver.com>
> Date:   Sun May 16 17:50:10 2021 +0800
> 
>      rcu: Fix stall-warning deadlock due to non-release of rcu_node ->lock
> 
>      If rcu_print_task_stall() is invoked on an rcu_node structure that does
>      not contain any tasks blocking the current grace period, it takes an
>      early exit that fails to release that rcu_node structure's lock.  This
>      results in a self-deadlock, which is detected by lockdep.
> 
>      To reproduce this bug:
> 
>      tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 3 --trust-make --configs "TREE03" --kconfig "CONFIG_PROVE_LOCKING=y" --bootargs "rcutorture.stall_cpu=30 rcutorture.stall_cpu_block=1 rcutorture.fwd_progress=0 rcutorture.test_boost=0"
> 
>      This will also result in other complaints, including RCU's scheduler
>      hook complaining about blocking rather than preemption and an rcutorture
>      writer stall.
> 
>      Only a partial RCU CPU stall warning message will be printed because of
>      the self-deadlock.
> 
>      This commit therefore releases the lock on the rcu_print_task_stall()
>      function's early exit path.
> 
>      Fixes: c583bcb8f5ed ("rcu: Don't invoke try_invoke_on_locked_down_task() with irqs disabled")
>      Signed-off-by: Yanfei Xu <yanfei.xu@windriver.com>
>      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> 
> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> index a10ea1f1f81f..d574e3bbd929 100644
> --- a/kernel/rcu/tree_stall.h
> +++ b/kernel/rcu/tree_stall.h
> @@ -267,8 +267,10 @@ static int rcu_print_task_stall(struct rcu_node *rnp, unsigned long flags)
>          struct task_struct *ts[8];
> 
>          lockdep_assert_irqs_disabled();
> -       if (!rcu_preempt_blocked_readers_cgp(rnp))
> +       if (!rcu_preempt_blocked_readers_cgp(rnp)) {
> +               raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
>                  return 0;
> +       }
>          pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):",
>                 rnp->level, rnp->grplo, rnp->grphi);
>          t = list_entry(rnp->gp_tasks->prev,
> 
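
For readers following the thread, the failure mode fixed above can be
sketched in userspace. This is only an analogy, not kernel code: a pthread
mutex stands in for the rcu_node ->lock, and struct node,
print_task_stall_buggy(), print_task_stall_fixed() and lock_leaked() are
hypothetical names made up for the illustration. As in the patch, the
callee is entered with the lock held and is expected to release it on
every return path.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct rcu_node; not the kernel structure. */
struct node {
	pthread_mutex_t lock;
	bool has_blocked_readers;
};

/* Entered with n->lock held; the callee is expected to release it. */
static int print_task_stall_buggy(struct node *n)
{
	if (!n->has_blocked_readers)
		return 0;	/* BUG: early exit leaves n->lock held */
	/* ... report the blocked tasks here ... */
	pthread_mutex_unlock(&n->lock);
	return 1;
}

/* Same contract, but the early exit also drops the lock. */
static int print_task_stall_fixed(struct node *n)
{
	if (!n->has_blocked_readers) {
		pthread_mutex_unlock(&n->lock);
		return 0;
	}
	/* ... report the blocked tasks here ... */
	pthread_mutex_unlock(&n->lock);
	return 1;
}

/*
 * Take the lock, call fn on a node with no blocked readers, then probe
 * the lock with trylock instead of lock so that a leak is reported
 * rather than hanging the caller.  Returns true if fn left it held.
 */
static bool lock_leaked(int (*fn)(struct node *))
{
	struct node n = { .has_blocked_readers = false };
	bool leaked;

	pthread_mutex_init(&n.lock, NULL);
	pthread_mutex_lock(&n.lock);
	fn(&n);
	leaked = (pthread_mutex_trylock(&n.lock) == EBUSY);
	/* Held either way now: leaked by fn, or acquired by trylock. */
	pthread_mutex_unlock(&n.lock);
	pthread_mutex_destroy(&n.lock);
	return leaked;
}
```

Calling lock_leaked(print_task_stall_buggy) returns true, while
lock_leaked(print_task_stall_fixed) returns false. In the kernel there is
no such probe on the second acquisition, which is why the unfixed early
exit ends in a self-deadlock instead of an error return.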

Thread overview: 4+ messages
2021-05-16  9:50 [PATCH v2] rcu: fix a deadlock caused by not release rcu_node->lock yanfei.xu
2021-05-16 12:33 ` Xu, Yanfei
2021-05-16 22:58 ` Paul E. McKenney
2021-05-17  1:55   ` Xu, Yanfei [this message]
