From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org, mingo@kernel.org,
	jiangshanlai@gmail.com, dipankar@in.ibm.com,
	akpm@linux-foundation.org, mathieu.desnoyers@efficios.com,
	josh@joshtriplett.org, tglx@linutronix.de, rostedt@goodmis.org,
	dhowells@redhat.com, edumazet@google.com, fweisbec@gmail.com,
	oleg@redhat.com, bobby.prani@gmail.com
Subject: Re: [PATCH tip/core/rcu 04/13] rcu: Make RCU_FANOUT_LEAF help text more explicit about skew_tick
Date: Wed, 19 Apr 2017 08:08:09 -0700
Message-ID: <20170419150809.GL3956@linux.vnet.ibm.com>
In-Reply-To: <20170419134835.bpuhurle2jjr66hm@hirez.programming.kicks-ass.net>

On Wed, Apr 19, 2017 at 03:48:35PM +0200, Peter Zijlstra wrote:
> On Wed, Apr 19, 2017 at 03:22:26PM +0200, Peter Zijlstra wrote:
> > On Thu, Apr 13, 2017 at 11:42:32AM -0700, Paul E. McKenney wrote:
> > 
> > > I believe that you are missing the fact that RCU grace-period
> > > initialization and cleanup walks through the rcu_node tree breadth
> > > first, using rcu_for_each_node_breadth_first().
> > 
> > Indeed. That is the part I completely missed.
> > 
> > >                                                 This macro (shown below)
> > > implements this breadth-first walk using a simple sequential traversal of
> > > the ->node[] array that provides the structures making up the rcu_node
> > > tree.  As you can see, this scan is completely independent of how CPU
> > > numbers might be mapped to rcu_data slots in the leaf rcu_node structures.
> > 
> > So this code is clearly not a hotpath, but still its performance
> > matters?
> > 
> > Seems like you cannot win here :/
> 
> So I sort of see what that code does, but I cannot quite grasp from the
> comments near there _why_ it is doing this.
> 
> My thinking is that normal (active CPUs) will update their state at tick
> time through the tree, and once the state reaches the root node, IOW all
> CPUs agree they've observed that particular state, we advance the global
> state, rinse repeat. That's how tree-rcu works.
> 
> NOHZ-idle stuff would be excluded entirely; that is, if we're allowed to
> go idle we're up-to-date, and completely drop out of the state tracking.
> When we become active again, we can simply sync the CPU's state to the
> active state and go from there -- ignoring whatever happened in the
> mean-time.
> 
> So why do we have to do machine wide updates? How can we get at the end
> up a grace period without all CPUs already agreeing that its complete?
> 
> /me puzzled.

This is a decent overall summary of how RCU grace periods work, but there
are quite a few corner cases that complicate things.  In this email,
I will focus on just one of them, starting with CPUs returning from
NOHZ-idle state.

In theory, you are correct when you say that we could have CPUs sync up
with current RCU state immediately upon return from idle.  In practice,
people are already screaming at me about the single CPU-local atomic
operation and memory barriers, so adding code on the idle-exit fastpath
to acquire the leaf rcu_node structure's lock and grab the current
state would do nothing but cause Marc Zyngier and many others to report
performance bugs to me.

And even that would not be completely sufficient.  After all, the state
in the leaf rcu_node structure will be out of date during grace-period
initialization and cleanup.  So to -completely- synchronize state for
the incoming CPU, I would have to acquire the root rcu_node structure's
lock and look at the live state.  Needless to say, the performance and
scalability implications of acquiring a global lock on each and every
idle exit event are not going to be at all pretty.

This means that even non-idle CPUs must necessarily be allowed to have
different opinions about which grace period is currently in effect.  We
simply cannot have total agreement on when a given grace period starts
or ends, because such agreement is just too expensive.  Therefore, when
a grace period begins, the grace-period kthread scans the rcu_node tree,
propagating this transition to each rcu_node structure.  And similarly
when a grace period ends.
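
To make the window of disagreement concrete, here is a minimal
single-threaded sketch in C.  It is not the kernel's code: the names
toy_node, toy_cpu, toy_gp_scan() and toy_cpu_note_gp() are hypothetical,
the tree is reduced to one root plus two leaves, and all locking and
memory ordering is omitted.  It only illustrates that a CPU learns of a
new grace period from its own leaf, and only when it gets around to
looking.

```c
#include <stdbool.h>

#define TOY_NUM_NODES 3		/* Assumed shape: node 0 is the root, 1 and 2 are leaves. */

struct toy_node {
	unsigned long gpnum;	/* Grace period this node has been told about. */
};

struct toy_cpu {
	unsigned long gpnum;	/* Grace period this CPU has noticed so far. */
	int leaf;		/* Index of this CPU's leaf node. */
};

static struct toy_node toy_nodes[TOY_NUM_NODES];

/*
 * Model of the grace-period kthread's scan: visit the first nvisit
 * nodes in index order and record the new grace-period number in each.
 * Passing nvisit < TOY_NUM_NODES models a scan that is still in flight.
 */
static void toy_gp_scan(unsigned long new_gp, int nvisit)
{
	for (int i = 0; i < nvisit && i < TOY_NUM_NODES; i++)
		toy_nodes[i].gpnum = new_gp;
}

/*
 * Model of a CPU checking its leaf at its own convenience.  Returns
 * true if the CPU just learned of a grace period it had not seen.
 */
static bool toy_cpu_note_gp(struct toy_cpu *cpu)
{
	unsigned long leaf_gp = toy_nodes[cpu->leaf].gpnum;

	if (cpu->gpnum == leaf_gp)
		return false;
	cpu->gpnum = leaf_gp;
	return true;
}
```

If the scan has reached the first leaf but not the second, a CPU under
the first leaf already sees the new grace period while a CPU under the
second leaf still sees the old one.  Neither CPU took a global lock to
find out.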

Because the rcu_node tree is mapped into a dense array, and because
the scan proceeds in index order, the scan operation is pretty much
best-case for the cache hardware.  But on large machines with large
cache-miss latencies, it can still inflict a bit of pain -- almost all
of which has been addressed by the switch to grace-period kthreads.
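
As a rough illustration of why an index-order scan of a dense array is
also a breadth-first walk, consider the sketch below.  It assumes a
hypothetical seven-node binary tree stored in level order (root first,
then interior nodes, then leaves); the toy_rnode type and the
toy_for_each_node_breadth_first() macro are made up for this example and
only loosely resemble the kernel's rcu_for_each_node_breadth_first().

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_TREE_NODES 7	/* Assumed shape: 1 root + 2 interior + 4 leaves. */

struct toy_rnode {
	struct toy_rnode *parent;	/* NULL for the root. */
	int level;			/* 0 for the root, increasing downward. */
	bool visited;
};

/* All nodes live in one dense array, laid out level by level. */
static struct toy_rnode toy_tree[TOY_TREE_NODES];

static void toy_tree_init(void)
{
	toy_tree[0].parent = NULL;
	toy_tree[0].level = 0;
	toy_tree[0].visited = false;
	for (int i = 1; i < TOY_TREE_NODES; i++) {
		/* Level-order layout: the parent of node i is node (i - 1) / 2. */
		toy_tree[i].parent = &toy_tree[(i - 1) / 2];
		toy_tree[i].level = toy_tree[i].parent->level + 1;
		toy_tree[i].visited = false;
	}
}

/* A plain sequential pass over the array, lowest index first. */
#define toy_for_each_node_breadth_first(rnp) \
	for ((rnp) = &toy_tree[0]; (rnp) < &toy_tree[TOY_TREE_NODES]; (rnp)++)

/*
 * Walk the array in index order and check the two properties that
 * make it a breadth-first walk: levels never decrease, and every
 * node's parent has been visited before the node itself.
 */
static bool toy_scan_is_breadth_first(void)
{
	struct toy_rnode *rnp;
	int last_level = 0;

	toy_for_each_node_breadth_first(rnp) {
		if (rnp->parent && !rnp->parent->visited)
			return false;
		if (rnp->level < last_level)
			return false;
		last_level = rnp->level;
		rnp->visited = true;
	}
	return true;
}
```

The loop touches memory strictly in ascending address order, which is
the access pattern that hardware prefetchers handle best.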

Hey, you asked!!!  ;-)

							Thanx, Paul
