Date: Thu, 15 Dec 2016 11:43:52 +0000
From: Mark Rutland
To: Boqun Feng
Cc: linux-kernel@vger.kernel.org, "Paul E. McKenney", Josh Triplett,
    Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Colin King
Subject: Re: [RFC v2 1/5] rcu: Introduce for_each_leaf_node_cpu()
Message-ID: <20161215114351.GA21758@leverpostej>
References: <20161215024204.28620-1-boqun.feng@gmail.com>
    <20161215024204.28620-2-boqun.feng@gmail.com>
In-Reply-To: <20161215024204.28620-2-boqun.feng@gmail.com>

On Thu, Dec 15, 2016 at 10:42:00AM +0800, Boqun Feng wrote:
> There are some places inside the RCU core where we need to iterate over
> all mask (->qsmask, ->expmask, etc.) bits in a leaf node in order to
> iterate over all corresponding CPUs. The current code iterates over all
> possible CPUs in this leaf node and then checks the mask to see whether
> the bit is set.
>
> However, given that most bits in cpu_possible_mask are set but few bits
> in an RCU leaf node mask are set (in other words, ->qsmask and its
> friends are usually sparser than cpu_possible_mask), it's better to
> iterate the other way, that is, over the mask bits in a leaf node. By
> doing so, we can save several checks in the loop; moreover, the
> fast-path check (e.g. ->qsmask == 0) can then be folded into the loop
> logic.
>
> This patch introduces for_each_leaf_node_cpu() to iterate over mask bits
> in a more efficient way.
>
> By design, the CPUs whose bits are set in the leaf node masks should be
> a subset of the possible CPUs, so we don't need an extra check with
> cpu_possible(). However, a WARN_ON_ONCE() is put in the loop to check
> whether there are some nasty cases we missed.
>
> Signed-off-by: Boqun Feng
> ---
>  kernel/rcu/tree.h | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index c0a4bf8f1ed0..70ef44a082e0 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -295,6 +295,22 @@ struct rcu_node {
>  	     cpu <= rnp->grphi; \
>  	     cpu = cpumask_next((cpu), cpu_possible_mask))
>
> +
> +#define MASK_BITS(mask)	(BITS_PER_BYTE * sizeof(mask))
> +/*
> + * Iterate over all CPUs of a leaf RCU node which are still masked in
> + * @mask.
> + *
> + * Note @rnp has to be a leaf node and @mask has to belong to @rnp.

Not a big deal, but perhaps it's worth enforcing this?

If we took just the name of the mask here (e.g. qsmask rather than
rnp->qsmask), we could have the macro always use (rnp)->mask. That
would also make the invocations shorter.

> And we
> + * assume that no CPU is masked in @mask but not set in cpu_possible_mask. IOW,
> + * masks of a leaf node never set a bit for an "impossible" CPU.
> + */
> +#define for_each_leaf_node_cpu(rnp, mask, cpu) \
> +	for ((cpu) = (rnp)->grplo + find_first_bit(&(mask), MASK_BITS(mask)); \
> +	     (cpu) <= (rnp)->grphi && !WARN_ON_ONCE(!cpu_possible(cpu)); \

If this happens, we'll exit the loop. If there are any remaining
possible CPUs, we'll skip them, which would be less than ideal. I guess
this shouldn't happen anyway, but it might be worth continuing.

> +	     (cpu) = (rnp)->grplo + find_next_bit(&(mask), MASK_BITS(mask), \
> +						   (cpu) - (rnp)->grplo + 1))
> +

I was going to ask if that + 1 was correct, but I see that it is!
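
To spell out why the + 1 is needed: find_next_bit() searches from its
offset argument inclusive, so without the + 1 it would keep returning
the bit we just visited. Below is a rough userspace sketch of the same
loop shape. The sketch_* helpers are stand-ins I made up for
illustration, not the kernel implementations; they only assume the
documented behaviour that both helpers return the size when no set bit
is found.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MASK_BITS(mask) (CHAR_BIT * sizeof(mask))

/*
 * Stand-in (assumption) for the kernel's find_next_bit() on a single
 * unsigned long: first set bit at position >= offset, or size if none.
 */
static size_t sketch_find_next_bit(unsigned long mask, size_t size,
				   size_t offset)
{
	for (size_t bit = offset; bit < size; bit++)
		if (mask & (1UL << bit))
			return bit;
	return size;
}

/* Stand-in (assumption) for find_first_bit(): search from bit 0. */
static size_t sketch_find_first_bit(unsigned long mask, size_t size)
{
	return sketch_find_next_bit(mask, size, 0);
}

/*
 * Walk the set bits of mask and store the matching CPU numbers in out.
 * Bit N of mask corresponds to CPU grplo + N. When no bit is left, the
 * helper returns MASK_BITS(mask), which pushes cpu past grphi and ends
 * the loop. Returns the number of CPUs stored (at most max).
 */
static int sketch_collect_cpus(int grplo, int grphi, unsigned long mask,
			       int *out, int max)
{
	int n = 0;

	for (int cpu = grplo + (int)sketch_find_first_bit(mask, MASK_BITS(mask));
	     cpu <= grphi;
	     cpu = grplo + (int)sketch_find_next_bit(mask, MASK_BITS(mask),
						     (size_t)(cpu - grplo) + 1)) {
		if (n < max)
			out[n++] = cpu;
	}
	return n;
}
```

With grplo = 8, grphi = 23 and mask = 0x25 (bits 0, 2 and 5), the loop
visits CPUs 8, 10 and 13 and then stops.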

So FWIW:

Acked-by: Mark Rutland

I had a go at handling my comments above, but I'm not sure it's any
better:

#define cpu_to_grp(rnp, cpu) ((cpu) - (rnp)->grplo)
#define grp_to_cpu(rnp, cpu) ((cpu) + (rnp)->grplo)

#define node_first_cpu(rnp, mask) \
	grp_to_cpu(rnp, find_first_bit(&(rnp)->mask, MASK_BITS((rnp)->mask)))

#define node_next_cpu(rnp, mask, cpu) \
	grp_to_cpu(rnp, find_next_bit(&(rnp)->mask, MASK_BITS((rnp)->mask), \
				      cpu_to_grp(rnp, cpu) + 1))

#define for_each_leaf_node_cpu(rnp, mask, cpu) \
	for ((cpu) = node_first_cpu(rnp, mask); \
	     (cpu) <= (rnp)->grphi; \
	     (cpu) = node_next_cpu(rnp, mask, cpu)) \
		if (WARN_ON_ONCE(!cpu_possible(cpu))) \
			continue; \
		else

Thanks,
Mark.
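
P.S. In case the trailing "if (...) continue; else" looks odd: the
caller's loop body becomes the else branch, so an unexpected entry is
skipped without ending the iteration. A minimal userspace sketch of
that idiom follows. The demo_* names are hypothetical, and
demo_possible() is only a made-up stand-in for cpu_possible().

```c
#include <assert.h>

/* Hypothetical stand-in for cpu_possible(): only even ids are "possible". */
static int demo_possible(int id)
{
	return (id % 2) == 0;
}

/*
 * Visit every id in [lo, hi], skipping ids that are not possible.
 * The statement following the macro binds to the final "else", and the
 * "continue" jumps to the for loop's increment rather than leaving it.
 */
#define demo_for_each_possible(id, lo, hi)		\
	for ((id) = (lo); (id) <= (hi); (id)++)		\
		if (!demo_possible(id))			\
			continue;			\
		else

/* Sum the possible ids in [lo, hi] using the iterator above. */
static int demo_sum_possible(int lo, int hi)
{
	int id, sum = 0;

	demo_for_each_possible(id, lo, hi)
		sum += id;

	return sum;
}
```

Here demo_sum_possible(0, 6) adds up 0, 2, 4 and 6: the odd ids are
skipped, but the loop still reaches the later even ones.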