All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Boqun Feng <boqun.feng@gmail.com>
Cc: Waiman Long <llong@redhat.com>,
	peterz@infradead.org, mingo@redhat.com, will@kernel.org,
	linux-kernel@vger.kernel.org, richard@nod.at
Subject: Re: Confusing lockdep splat
Date: Sat, 25 Sep 2021 05:50:25 -0700	[thread overview]
Message-ID: <20210925125025.GO880162@paulmck-ThinkPad-P17-Gen-1> (raw)
In-Reply-To: <YU6LpI+nSmFcVP4G@boqun-archlinux>

On Sat, Sep 25, 2021 at 10:38:28AM +0800, Boqun Feng wrote:
> On Fri, Sep 24, 2021 at 03:43:37PM -0700, Paul E. McKenney wrote:
> > On Fri, Sep 24, 2021 at 05:41:17PM -0400, Waiman Long wrote:
> > > On 9/24/21 5:02 PM, Paul E. McKenney wrote:
> > > > Hello!
> > > > 
> > > > I got the lockdep splat below from an SRCU-T rcutorture run, which uses
> > > > a !SMP !PREEMPT kernel.  This is a random event, and about half the time
> > > > it happens within an hour or two.  My reproducer (on current -rcu "dev"
> > > > branch for a 16-CPU system) is:
> > > > 
> > > > 	tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 16 --configs "16*SRCU-T" --duration 7200
> > > > 
> > > > My points of confusion are as follows:
> > > > 
> > > > 1.	The locks involved in this deadlock cycle are irq-disabled
> > > > 	raw spinlocks.	The claimed deadlock cycle uses two CPUs.
> > > > 	There is only one CPU.	There is no possibility of preemption
> > > > 	or interrupts.	So how can this deadlock actually happen?
> > > > 
> > > > 2.	If there was more than one CPU, then yes, there would be
> > > > 	a deadlock.  The PI lock is acquired by the wakeup code after
> > > > 	acquiring the workqueue lock, and rcutorture tests the new ability
> > > > 	of the scheduler to hold the PI lock across rcu_read_unlock(),
> > > > 	and while it is at it, across the rest of the unlock primitives.
> > > > 
> > > > 	But if there was more than one CPU, Tree SRCU would be used
> > > > 	instead of Tiny SRCU, and there would be no wakeup invoked from
> > > > 	srcu_read_unlock().
> > > > 
> > > > 	Given only one CPU, there is no way to complete the deadlock
> > > > 	cycle.
> > > > 
> > > > For now, I am working around this by preventing rcutorture from holding
> > > > the PI lock across Tiny srcu_read_unlock().
> > > > 
> > > > Am I missing something subtle here?
> > > 
> > > I would say that the lockdep code just doesn't have enough intelligence to
> > > identify that deadlock is not possible in this special case. There are
> > > certainly false positives, and it can be hard to get rid of them.
> > 
> > Would it make sense for lockdep to filter out reports involving more
> > than one CPU unless there is at least one sleeplock in the cycle?
> 
> I think SRCU is special here, because it has different implementations
> in SMP and UP. For other code, if the implemenation in SMP and UP is the
> same, we want lockdep to detect the deadlock even if it's not in UP.

Ah, fair point!  There are a few others, for example, kernel/up.c, but
it seems to just disable interrupts as its "big UP kernel lock".

> We can provide an annotation similar to data_race() for SRCU to mark
> UP-only code
> 
> 	#define LOCKDEP_UP_ONLY(expr) ({			\
> 		BUILD_BUG_ON(IS_ENABLED(CONFIG_SMP));		\
> 
> 		<disable lockdep>
> 		<...> v = expr;
> 		<enable lockdep>
> 		v
> 	})
> 
> and in __srcu_read_unlock():
> 
> 	LOCKDEP_UP_ONLY(swake_up_one(...));
> 
> Thoughts?

With the workaround I have now, all is well unless someone needs to hold
a PI lock across an rcu_read_unlock(), which seems unlikely.  If such a
case does arise, lockdep will let us know.  In which case what you are
suggesting might be a good way to go.  Alternatively, I could use the
trick that RCU Tasks Trace uses, with the swake_up_one() deferred to an
irq_work_queue() handler.

It does appear that !SMP kernels are nowhere near as important to the
community as they were 20 years ago.  ;-)

							Thanx, Paul

      reply	other threads:[~2021-09-25 12:50 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-24 21:02 Confusing lockdep splat Paul E. McKenney
2021-09-24 21:41 ` Waiman Long
2021-09-24 22:43   ` Paul E. McKenney
2021-09-25  0:30     ` Waiman Long
2021-09-25 12:55       ` Paul E. McKenney
2021-09-25  2:38     ` Boqun Feng
2021-09-25 12:50       ` Paul E. McKenney [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210925125025.GO880162@paulmck-ThinkPad-P17-Gen-1 \
    --to=paulmck@kernel.org \
    --cc=boqun.feng@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=llong@redhat.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=richard@nod.at \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.