From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932955AbcBXFS4 (ORCPT ); Wed, 24 Feb 2016 00:18:56 -0500 Received: from e18.ny.us.ibm.com ([129.33.205.208]:40809 "EHLO e18.ny.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757409AbcBXFHn (ORCPT ); Wed, 24 Feb 2016 00:07:43 -0500 X-IBM-Helo: d01dlp03.pok.ibm.com X-IBM-MailFrom: paulmck@linux.vnet.ibm.com X-IBM-RcptTo: linux-kernel@vger.kernel.org From: "Paul E. McKenney" To: linux-kernel@vger.kernel.org Cc: mingo@kernel.org, jiangshanlai@gmail.com, dipankar@in.ibm.com, akpm@linux-foundation.org, mathieu.desnoyers@efficios.com, josh@joshtriplett.org, tglx@linutronix.de, peterz@infradead.org, rostedt@goodmis.org, dhowells@redhat.com, edumazet@google.com, dvhart@linux.intel.com, fweisbec@gmail.com, oleg@redhat.com, bobby.prani@gmail.com, "Paul E. McKenney" Subject: [PATCH tip/core/rcu 13/14] documentation: Explain how RCU's combining tree fights contention Date: Tue, 23 Feb 2016 21:00:46 -0800 Message-Id: <1456290047-16654-13-git-send-email-paulmck@linux.vnet.ibm.com> X-Mailer: git-send-email 2.5.2 In-Reply-To: <20160224050021.GA14616@linux.vnet.ibm.com> References: <20160224050021.GA14616@linux.vnet.ibm.com> X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16022405-0045-0000-0000-0000036A5E02 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This commit adds a couple of paragraphs to the description of RCU's combining tree explaining how the combining tree keeps lock contention acceptably low, despite RCU grace periods being global operations. Signed-off-by: Paul E. McKenney --- .../Design/Data-Structures/Data-Structures.html | 23 ++++++++++++++++++++++ .../Design/Data-Structures/Data-Structures.htmlx | 23 ++++++++++++++++++++++ 2 files changed, 46 insertions(+) diff --git a/Documentation/RCU/Design/Data-Structures/Data-Structures.html b/Documentation/RCU/Design/Data-Structures/Data-Structures.html index ba9fbb5177f6..d15744b87b99 100644 --- a/Documentation/RCU/Design/Data-Structures/Data-Structures.html +++ b/Documentation/RCU/Design/Data-Structures/Data-Structures.html @@ -100,6 +100,29 @@ On the other hand, you can set CONFIG_RCU_FANOUT to be as small as 2 if you wish, which would permit only 16 CPUs, which is useful for testing. +

This multi-level combining tree allows us to get most of the +performance and scalability +benefits of partitioning, even though RCU grace-period detection is +inherently a global operation. +The trick here is that only the last CPU to report a quiescent state +into a given rcu_node structure need advance to the rcu_node +structure at the next level up the tree. +This means that at the leaf-level rcu_node structure, only +one access out of sixteen will progress up the tree. +For the internal rcu_node structures, the situation is even +more extreme: Only one access out of sixty-four will progress up +the tree. +Because the vast majority of the CPUs do not progress up the tree, +the lock contention remains roughly constant up the tree. +No matter how many CPUs there are in the system, at most 64 quiescent-state +reports per grace period will progress all the way to the root +rcu_node structure, thus ensuring that the lock contention +on that root rcu_node structure remains acceptably low. + +

In effect, the combining tree acts like a big shock absorber, +keeping lock contention under control at all tree levels regardless +of the level of loading on the system. +

The Linux kernel actually supports multiple flavors of RCU running concurrently, so RCU builds separate data structures for each flavor. diff --git a/Documentation/RCU/Design/Data-Structures/Data-Structures.htmlx b/Documentation/RCU/Design/Data-Structures/Data-Structures.htmlx index c08fd8e9574a..8e88e3e7e2ef 100644 --- a/Documentation/RCU/Design/Data-Structures/Data-Structures.htmlx +++ b/Documentation/RCU/Design/Data-Structures/Data-Structures.htmlx @@ -121,6 +121,29 @@ On the other hand, you can set CONFIG_RCU_FANOUT to be as small as 2 if you wish, which would permit only 16 CPUs, which is useful for testing. +

This multi-level combining tree allows us to get most of the +performance and scalability +benefits of partitioning, even though RCU grace-period detection is +inherently a global operation. +The trick here is that only the last CPU to report a quiescent state +into a given rcu_node structure need advance to the rcu_node +structure at the next level up the tree. +This means that at the leaf-level rcu_node structure, only +one access out of sixteen will progress up the tree. +For the internal rcu_node structures, the situation is even +more extreme: Only one access out of sixty-four will progress up +the tree. +Because the vast majority of the CPUs do not progress up the tree, +the lock contention remains roughly constant up the tree. +No matter how many CPUs there are in the system, at most 64 quiescent-state +reports per grace period will progress all the way to the root +rcu_node structure, thus ensuring that the lock contention +on that root rcu_node structure remains acceptably low. + +

In effect, the combining tree acts like a big shock absorber, +keeping lock contention under control at all tree levels regardless +of the level of loading on the system. +

The Linux kernel actually supports multiple flavors of RCU running concurrently, so RCU builds separate data structures for each flavor. -- 2.5.2