All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: paulmck@linux.vnet.ibm.com
Cc: linux-kernel@vger.kernel.org, mingo@elte.hu,
	laijs@cn.fujitsu.com, dipankar@in.ibm.com,
	akpm@linux-foundation.org, mathieu.desnoyers@polymtl.ca,
	josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
	peterz@infradead.org, rostedt@goodmis.org,
	Valdis.Kletnieks@vt.edu, dhowells@redhat.com,
	eric.dumazet@gmail.com, darren@dvhart.com, fweisbec@gmail.com,
	patches@linaro.org, "Paul E. McKenney" <paul.mckenney@linaro.org>,
	venki@google.com, KOSAKI Motohiro <kosaki.motohiro@gmail.com>,
	"rusty@rustcorp.com.au" <rusty@rustcorp.com.au>
Subject: Re: [PATCH RFC tip/core/rcu 1/6] rcu: Stabilize use of num_online_cpus() for GP short circuit
Date: Tue, 24 Apr 2012 23:16:34 +0530	[thread overview]
Message-ID: <4F96E6FA.40900@linux.vnet.ibm.com> (raw)
In-Reply-To: <20120424165056.GB2403@linux.vnet.ibm.com>

On 04/24/2012 10:20 PM, Paul E. McKenney wrote:

> On Tue, Apr 24, 2012 at 09:05:20PM +0530, Srivatsa S. Bhat wrote:
> 
>>
>>> From: "Paul E. McKenney" <paul.mckenney@linaro.org>
>>>
>>> The rcu_blocking_is_gp() function tests to see if there is only one
>>> online CPU, and if so, synchronize_sched() and friends become no-ops.
>>> However, for larger systems, num_online_cpus() scans a large vector,
>>
>>
>> Venki had posted a patch to optimize that by using a variable, so that we
>> don't calculate the value each and every time.
>> http://thread.gmane.org/gmane.linux.kernel/1240569/focus=1246659
>>
>> However, unfortunately there was some confusion around that patch and
>> even though it made it to akpm's tree and stayed there briefly, it didn't
>> go upstream. Venki had attempted to resolve the confusion here:
>> http://thread.gmane.org/gmane.linux.kernel/1240569/focus=1260702
> 
> Having a single variable tracking the online state would be good,
> but as you say it isn't there yet.
> 
>>> and might be preempted while doing so.  While preempted, any number
>>> of CPUs might come online and go offline, potentially resulting in
>>> num_online_cpus() returning 1 when there never had only been one
>>> CPU online.  This would result in a too-short RCU grace period, which
>>> could in turn result in total failure.
>>
>>
[...]

> 
> The problematic case is instead the one where we were SMP throughout,
> but rcu_blocking_is_gp() mistakenly decides that we were UP.  For example,
> consider the following sequence of events, based on the commit log's
> sentence "While preempted, any number of CPUs might come online and go
> offline, potentially resulting in num_online_cpus() returning 1 when
> there never had only been one CPU online":
> 


Oh, I didn't think in the direction illustrated below when reading that
sentence.. :-(

> o	CPUs 100 and 150 are initially online, with a long-running RCU
> 	read-side critical section on CPU 100 and rcu_blocking_is_gp()
> 	initially running on CPU 150.
> 
> o	The rcu_blocking_is_gp() function checks the bits for CPUs
> 	0-63, and counts zero online CPUs.
> 
> o	CPU 1 comes online.
> 
> o	The rcu_blocking_is_gp() function checks the bits for CPUs
> 	64-127, and counts one online CPUs, for a total thus far
> 	of one CPU online..
> 
> o	CPU 150 goes offline.  Ah, but it cannot do this, because
> 	this is non-preemptible RCU, which means that the RCU
> 	read-side critical section has disabled preemption on
> 	CPU 100, which prevents CPU 150 from going offline, which
> 	prevents this scenario from occurring.
> 
> 	So, yes, rcu_blocking_is_gp() can be fooled into incorrectly
> 	stating that the system has only one CPU (or even that it has
> 	only zero CPUs), but not while there is actually a non-preemptible
> 	RCU read-side critical section running.  Yow!
>


Awesome :-)

 
> 	I clearly had not thought this change through far enough,
> 	thank you for calling it to my attention!
> 
> So I could replace this patch with a patch that adds a comment
> explaining why this works.


Yes, that would be great..

>  Though this patch might be simpler and
> easier to understand.  ;-)


Oh well, but I completely missed the intention behind the patch!
So I guess a comment would be better ;-)

>  But not so good for real-time response
> on large systems, I suppose.
> 
> And rcu_blocking_is_gp() is called only from synchronize_sched() and
> synchronize_rcu_bh(), so it is safe for all current callers.  But it is
> defined publicly, so I should move it to somewhere like kernel/rcutree.c
> to keep new users from being fatally confused and disappointed.
> 
> I can also change the comparison from "num_online_cpus() == 1" to
> "num_online_cpus() <= 1".  That should at least serve as a caution to
> people who might attempt to use it where it shouldn't be used.  ;-)
> 


Hehe, yeah!

Regards,
Srivatsa S. Bhat


  reply	other threads:[~2012-04-24 17:47 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-23 16:41 [PATCH RFC 0/6] Miscellaneous RCU fixes for 3.5 Paul E. McKenney
2012-04-23 16:42 ` [PATCH RFC tip/core/rcu 1/6] rcu: Stabilize use of num_online_cpus() for GP short circuit Paul E. McKenney
2012-04-23 16:42   ` [PATCH RFC tip/core/rcu 2/6] rcu: List-debug variants of rcu list routines Paul E. McKenney
2012-04-23 16:42   ` [PATCH RFC tip/core/rcu 3/6] rcu: Replace list_first_entry_rcu() with list_first_or_null_rcu() Paul E. McKenney
2012-04-23 16:42   ` [PATCH RFC tip/core/rcu 4/6] rcu: Clarify help text for RCU_BOOST_PRIO Paul E. McKenney
2012-04-26 12:46     ` Peter Zijlstra
2012-04-26 17:28       ` Paul E. McKenney
2012-04-23 16:42   ` [PATCH RFC tip/core/rcu 5/6] rcu: Make __kfree_rcu() less dependent on compiler choices Paul E. McKenney
2012-04-26 12:48     ` Peter Zijlstra
2012-04-26 13:29       ` Jan Engelhardt
2012-04-26 13:50         ` Peter Zijlstra
2012-04-23 16:42   ` [PATCH RFC tip/core/rcu 6/6] rcu: Reduce cache-miss initialization latencies for large systems Paul E. McKenney
2012-04-26 12:51     ` Peter Zijlstra
2012-04-26 14:12       ` Paul E. McKenney
2012-04-26 15:28         ` Peter Zijlstra
2012-04-26 16:15           ` Paul E. McKenney
2012-04-26 19:41             ` Peter Zijlstra
2012-04-26 19:47               ` Peter Zijlstra
2012-04-26 20:29                 ` Paul E. McKenney
2012-04-26 22:04                   ` Peter Zijlstra
2012-04-26 20:28               ` Paul E. McKenney
2012-04-26 22:01                 ` Peter Zijlstra
2012-04-27 14:17                   ` Paul E. McKenney
2012-04-27  4:36     ` Mike Galbraith
2012-04-27 15:15       ` Paul E. McKenney
2012-04-28  4:42         ` Mike Galbraith
2012-04-28 17:21           ` Paul E. McKenney
2012-04-29  3:54             ` Mike Galbraith
2012-04-24 15:35   ` [PATCH RFC tip/core/rcu 1/6] rcu: Stabilize use of num_online_cpus() for GP short circuit Srivatsa S. Bhat
2012-04-24 16:50     ` Paul E. McKenney
2012-04-24 17:46       ` Srivatsa S. Bhat [this message]
2012-05-07  3:47       ` Rusty Russell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F96E6FA.40900@linux.vnet.ibm.com \
    --to=srivatsa.bhat@linux.vnet.ibm.com \
    --cc=Valdis.Kletnieks@vt.edu \
    --cc=akpm@linux-foundation.org \
    --cc=darren@dvhart.com \
    --cc=dhowells@redhat.com \
    --cc=dipankar@in.ibm.com \
    --cc=eric.dumazet@gmail.com \
    --cc=fweisbec@gmail.com \
    --cc=josh@joshtriplett.org \
    --cc=kosaki.motohiro@gmail.com \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@polymtl.ca \
    --cc=mingo@elte.hu \
    --cc=niv@us.ibm.com \
    --cc=patches@linaro.org \
    --cc=paul.mckenney@linaro.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=rusty@rustcorp.com.au \
    --cc=tglx@linutronix.de \
    --cc=venki@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.