From: "Paul E. McKenney" <paulmck@kernel.org>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Joel Fernandes <joel@joelfernandes.org>,
	Scott Wood <swood@redhat.com>,
	linux-rt-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Clark Williams <williams@redhat.com>
Subject: Re: [PATCH RT v2 2/3] sched: migrate_enable: Use sleeping_lock to indicate involuntary sleep
Date: Tue, 27 Aug 2019 08:53:06 -0700
Message-ID: <20190827155306.GF26530@linux.ibm.com>
In-Reply-To: <20190827092333.jp3darw7teyyw67g@linutronix.de>

On Tue, Aug 27, 2019 at 11:23:33AM +0200, Sebastian Andrzej Siewior wrote:
> On 2019-08-26 09:29:45 [-0700], Paul E. McKenney wrote:
> > > The mechanism that is used here may change in future. I just wanted to
> > > make sure that from RCU's side it is okay to schedule here.
> > 
> > Good point.
> > 
> > The effect from RCU's viewpoint will be to split any non-rcu_read_lock()
> > RCU read-side critical section at this point.  This already happens in a
> > few places, for example, rcu_note_context_switch() constitutes an RCU
> > quiescent state despite being invoked with interrupts disabled (as is
> > required!).  The __schedule() function just needs to understand (and does
> > understand) that the RCU read-side critical section that would otherwise
> > span that call to rcu_note_context_switch() is split in two by that call.
> 
> Okay. So I read this as invoking schedule() at this point is okay. 

As long as no one is relying on a non-rcu_read_lock() RCU
read-side critical section (local_bh_disable(), preempt_disable(),
local_irq_disable(), ...) spanning this call.  But that depends on the
calling code and on the other code that it interacts with, not on any
specific need on the part of RCU itself.

> Looking at this again, this could also happen on a PREEMPT=y kernel if
> the kernel decides to preempt a task within a rcu_read_lock() section
> and put it back later on another CPU.

This is an rcu_read_lock() critical section, so yes, on a PREEMPT=y
kernel, executing schedule() will cause the corresponding RCU read-side
critical section to persist, following the preempted task.  Give or
take lockdep complaints.

On a PREEMPT=n kernel, schedule() within an RCU read-side critical
section instead results in that critical section being split in two.
And this will also result in lockdep complaints.

> > However, if this was instead an rcu_read_lock() critical section within
> > a PREEMPT=y kernel, then if a schedule() occurred within stop_one_cpu(),
> > RCU would consider that critical section to be preempted.  This means
> > that any RCU grace period that is blocked by this RCU read-side critical
> > section would remain blocked until stop_one_cpu() resumed, returned,
> > and so on until the matching rcu_read_unlock() was reached.  In other
> > words, RCU would consider that RCU read-side critical section to span
> > the call to stop_one_cpu() even if stop_one_cpu() invoked schedule().
> 
> Isn't that my example from above and what we do in RT? My understanding
> is that this is the reason why we need BOOST on RT otherwise the RCU
> critical section could remain blocked for some time.

At this point, I must confess that I have lost track of whose example
it is.  It was first reported in 2006, if I remember correctly.  ;-)

But yes, you are correct, the point of RCU priority boosting is to
cause tasks that have been preempted while within RCU read-side critical
sections to be scheduled so that they can reach their rcu_read_unlock()
calls, thus allowing the current grace period to end.

> > On the other hand, within a PREEMPT=n kernel, the call to schedule()
> > would split even an rcu_read_lock() critical section.  Which is why I
> > asked earlier if sleeping_lock_inc() and sleeping_lock_dec() are no-ops
> > in !PREEMPT_RT_BASE kernels.  We would after all want the usual lockdep
> > complaints in that case.
> 
> sleeping_lock_inc() +dec() is only RT specific. It is part of RT's
> spin_lock() implementation and used by RCU (rcu_note_context_switch())
> to not complain if invoked within a critical section.

Then this is being called when we have something like this, correct?

	DEFINE_SPINLOCK(mylock); // As opposed to DEFINE_RAW_SPINLOCK().

	...

	rcu_read_lock();
	do_something();
	spin_lock(&mylock); // Can block in -rt, thus needs sleeping_lock_inc()
	...
	rcu_read_unlock();

Without sleeping_lock_inc(), lockdep would complain about a voluntary
schedule within an RCU read-side critical section.  But in -rt, voluntary
schedules due to sleeping on a "spinlock" are OK.

Am I understanding this correctly?

							Thanx, Paul
