From: "Paul E. McKenney" <paulmck@kernel.org>
To: yanfei.xu@windriver.com
Cc: josh@joshtriplett.org, rostedt@goodmis.org,
	mathieu.desnoyers@efficios.com, jiangshanlai@gmail.com,
	joel@joelfernandes.org, rcu@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] rcu: fix a deadlock caused by not release rcu_node->lock
Date: Sun, 16 May 2021 15:58:53 -0700
Message-ID: <20210516225853.GD4441@paulmck-ThinkPad-P17-Gen-1>
In-Reply-To: <20210516095010.3657134-1-yanfei.xu@windriver.com>

On Sun, May 16, 2021 at 05:50:10PM +0800, yanfei.xu@windriver.com wrote:
> From: Yanfei Xu <yanfei.xu@windriver.com>
> 
> rcu_node->lock isn't released in rcu_print_task_stall() if the rcu_node
> doesn't contain any tasks blocking the GP. However, this rcu_node->lock
> is acquired again shortly afterward in rcu_dump_cpu_stacks() when ndetected
> is non-zero. As a result, the CPU hangs in this deadlock.
> 
> Fixes: c583bcb8f5ed ("rcu: Don't invoke try_invoke_on_locked_down_task() with irqs disabled")
> Signed-off-by: Yanfei Xu <yanfei.xu@windriver.com>

Also a good catch, thank you!  Queued for further review and testing,
wordsmithed as shown below.  The rcutorture scripts have been known to
work on ARM in the past, and might still do so.  (I test on x86.)

As always, please check to make sure that I didn't mess something up.

							Thanx, Paul

------------------------------------------------------------------------

commit e0a9b77f245ae4fe1537120fd5319bf9e091618e
Author: Yanfei Xu <yanfei.xu@windriver.com>
Date:   Sun May 16 17:50:10 2021 +0800

    rcu: Fix stall-warning deadlock due to non-release of rcu_node ->lock
    
    If rcu_print_task_stall() is invoked on an rcu_node structure that does
    not contain any tasks blocking the current grace period, it takes an
    early exit that fails to release that rcu_node structure's lock.  This
    results in a self-deadlock, which is detected by lockdep.
    
    To reproduce this bug:
    
    tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 3 --trust-make --configs "TREE03" --kconfig "CONFIG_PROVE_LOCKING=y" --bootargs "rcutorture.stall_cpu=30 rcutorture.stall_cpu_block=1 rcutorture.fwd_progress=0 rcutorture.test_boost=0"
    
    This will also result in other complaints, including RCU's scheduler
    hook complaining about blocking rather than preemption and an rcutorture
    writer stall.
    
    Only a partial RCU CPU stall warning message will be printed because of
    the self-deadlock.
    
    This commit therefore releases the lock on the rcu_print_task_stall()
    function's early exit path.
    
    Fixes: c583bcb8f5ed ("rcu: Don't invoke try_invoke_on_locked_down_task() with irqs disabled")
    Signed-off-by: Yanfei Xu <yanfei.xu@windriver.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index a10ea1f1f81f..d574e3bbd929 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -267,8 +267,10 @@ static int rcu_print_task_stall(struct rcu_node *rnp, unsigned long flags)
 	struct task_struct *ts[8];
 
 	lockdep_assert_irqs_disabled();
-	if (!rcu_preempt_blocked_readers_cgp(rnp))
+	if (!rcu_preempt_blocked_readers_cgp(rnp)) {
+		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 		return 0;
+	}
 	pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):",
 	       rnp->level, rnp->grplo, rnp->grphi);
 	t = list_entry(rnp->gp_tasks->prev,
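
The locking contract that the patch restores can be modeled in user space.
The following is only a minimal sketch under stated assumptions: struct node,
has_blocked_readers, print_task_stall_buggy(), print_task_stall_fixed(), and
lock_leaked() are hypothetical names invented for illustration, a pthread
mutex stands in for the rcu_node ->lock, and the irq flags handling of the
real function is left out.  It shows a callee that is entered with the lock
held and is expected to release it on every return path, plus a helper that
uses trylock to check whether the lock was left held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rcu_node: a lock plus one flag. */
struct node {
	pthread_mutex_t lock;
	bool has_blocked_readers;
};

/* Contract: entered with n->lock held, must release it before returning. */
static int print_task_stall_buggy(struct node *n)
{
	if (!n->has_blocked_readers)
		return 0;	/* Bug: early exit leaves n->lock held. */
	/* The real function would print the blocked tasks here. */
	pthread_mutex_unlock(&n->lock);
	return 1;
}

/* Same contract, with the early exit path also releasing the lock. */
static int print_task_stall_fixed(struct node *n)
{
	if (!n->has_blocked_readers) {
		pthread_mutex_unlock(&n->lock);
		return 0;
	}
	pthread_mutex_unlock(&n->lock);
	return 1;
}

/*
 * Acquire the lock, call fn, then probe the lock with trylock so that a
 * leaked lock is reported instead of hanging the caller.
 */
static bool lock_leaked(int (*fn)(struct node *), bool blocked)
{
	struct node n = { .has_blocked_readers = blocked };
	bool leaked;
	int rc;

	rc = pthread_mutex_init(&n.lock, NULL);
	assert(rc == 0);
	rc = pthread_mutex_lock(&n.lock);
	assert(rc == 0);
	(void)rc;

	fn(&n);

	/* trylock fails (EBUSY) if the callee returned with the lock held. */
	leaked = pthread_mutex_trylock(&n.lock) != 0;

	/* Held either way: by the leak or by the successful trylock. */
	pthread_mutex_unlock(&n.lock);
	pthread_mutex_destroy(&n.lock);
	return leaked;
}
```

Under these assumptions, lock_leaked(print_task_stall_buggy, false) returns
true, while every call that uses print_task_stall_fixed() returns false.  In
the kernel the second acquisition is a blocking one rather than a trylock,
which is why the stall-warning path self-deadlocks instead of reporting the
problem.  Build with -pthread.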
