Date: Wed, 1 Jul 2015 13:56:42 +0200
From: Peter Zijlstra
To: "Paul E. McKenney"
Cc: Oleg Nesterov, tj@kernel.org, mingo@redhat.com,
	linux-kernel@vger.kernel.org, der.herr@hofr.at, dave@stgolabs.net,
	riel@redhat.com, viro@ZenIV.linux.org.uk, torvalds@linux-foundation.org
Subject: Re: [RFC][PATCH 12/13] stop_machine: Remove lglock
Message-ID: <20150701115642.GU19282@twins.programming.kicks-ass.net>
References: <20150624175830.GS3644@twins.programming.kicks-ass.net>
 <20150625032303.GO3717@linux.vnet.ibm.com>
 <20150625110734.GX3644@twins.programming.kicks-ass.net>
 <20150625134726.GR3717@linux.vnet.ibm.com>
 <20150625142011.GU19282@twins.programming.kicks-ass.net>
 <20150625145133.GT3717@linux.vnet.ibm.com>
 <20150626123207.GZ19282@twins.programming.kicks-ass.net>
 <20150626161415.GY3717@linux.vnet.ibm.com>
 <20150629075645.GD19282@twins.programming.kicks-ass.net>
 <20150630213258.GO3717@linux.vnet.ibm.com>
In-Reply-To: <20150630213258.GO3717@linux.vnet.ibm.com>

On Tue, Jun 30, 2015 at 02:32:58PM -0700, Paul E. McKenney wrote:
> > I had indeed forgotten that got farmed out to the kthread; on which, my
> > poor desktop seems to have spent ~140 minutes of its (most recent)
> > existence poking RCU things.
> >
> >     7 root      20   0       0      0      0 S  0.0  0.0  56:34.66 rcu_sched
> >     8 root      20   0       0      0      0 S  0.0  0.0  20:58.19 rcuos/0
> >     9 root      20   0       0      0      0 S  0.0  0.0  18:50.75 rcuos/1
> >    10 root      20   0       0      0      0 S  0.0  0.0  18:30.62 rcuos/2
> >    11 root      20   0       0      0      0 S  0.0  0.0  17:33.24 rcuos/3
> >    12 root      20   0       0      0      0 S  0.0  0.0   2:43.54 rcuos/4
> >    13 root      20   0       0      0      0 S  0.0  0.0   3:00.31 rcuos/5
> >    14 root      20   0       0      0      0 S  0.0  0.0   3:09.27 rcuos/6
> >    15 root      20   0       0      0      0 S  0.0  0.0   2:52.98 rcuos/7
> >
> > Which is almost as much time as my konsole:
> >
> >  2853 peterz    20   0  586240 103664  41848 S  1.0  0.3 147:39.50 konsole
> >
> > Which seems somewhat excessive. But who knows.
>
> No idea.  How long has that system been up?  What has it been doing?

Some 40-odd days it seems. It's my desktop; I read email (in mutt in
Konsole), I type patches (in vim in Konsole), I compile kernels (in
Konsole), etc.

Now konsole is threaded and each new window/tab is just another thread
in the same process, so runtime should accumulate. However, I just found
that for some obscure reason there are two konsole processes around, and
the other one is the one I'm using most; it also has significantly more
runtime.

 3264 ?        Sl   452:43  \_ /usr/bin/konsole

Must be some of that brain-damaged desktop shite that confused things --
I see the one is started with some -session argument. Some day I'll
discover how to destroy all that nonsense and make things behave as they
should.

> The rcu_sched overhead is expected behavior if the system has run between
> ten and one hundred million grace periods, give or take an order of
> magnitude depending on the number of idle CPUs and so on.
>
> The overhead for the RCU offload kthreads is what it is.  A kfree() takes
> as much time as a kfree() does, and they are all nicely counted up for you.

Yah, if only we could account it back to whomever caused it :/

> > Although here I'll once again go ahead and say something ignorant; how
> > come that's a problem? Surely if we know the kthread thing has finished
> > starting a GP, any one CPU issuing a full memory barrier (as would be
> > implied by switching to the stop worker) must then indeed observe that
> > global state, due to that transitivity thing?
> >
> > That is, I'm having a wee bit of bother seeing how you'd need
> > manipulation of global variables, as you allude to below.
>
> Well, I thought that you wanted to leverage the combining tree to
> determine when the grace period had completed.  If a given CPU isn't
> pushing its quiescent states up the combining tree, then the combining
> tree can't do much for you.

Right, that is what I wanted, and sure, the combining thing needs to
happen with atomics, but that's not new; it already does that.

What I was talking about was the interaction between the
force-quiescent-state machinery and the poking that detects that a QS
had indeed been started.

> Well, I do have something that seems reasonably straightforward.  Sending
> the patches along separately.  Not sure that it is worth its weight.
>
> The idea is that we keep the expedited grace periods working as they do
> now, independently of the normal grace period.  The normal grace period
> takes a sequence number just after initialization, and checks to see
> if an expedited grace period happened in the meantime at the beginning
> of each quiescent-state forcing episode.  This saves the last one or
> two quiescent-state forcing scans in the case where an expedited grace
> period really did happen.
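
If I'm following, the shape would be something like the below -- purely
my sketch, the names are invented and it glosses over how you'd treat an
expedited GP that was already in flight when the snapshot got taken:

static unsigned long exp_gp_seq;        /* bumped when an expedited GP completes */
static unsigned long exp_gp_snap;       /* snapshot taken at normal-GP init */

static void normal_gp_init(void)
{
        /* Take the sequence number just after initialization. */
        exp_gp_snap = READ_ONCE(exp_gp_seq);

        /* ... the rest of normal GP initialization ... */
}

/*
 * True when a full expedited GP has completed since the snapshot, in
 * which case every CPU has passed through a quiescent state since this
 * normal GP began and the remaining forcing scans can be skipped.
 *
 * (Assumes no expedited GP was already in flight when the snapshot was
 * taken; presumably the real counter encodes "in progress" somehow.)
 */
static bool exp_gp_completed_since_snap(void)
{
        return READ_ONCE(exp_gp_seq) != exp_gp_snap;
}

At which point the GP kthread can report the outstanding quiescent
states itself instead of scanning for them, right?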

> It is possible for the expedited grace period to help things along by
> waking up the grace-period kthread, but of course doing this too much
> further increases the time consumed by your rcu_sched kthread.

Ah, so that is the purpose of that patch.

Still, I'm having trouble seeing how you can do this too much; you would
only be waking it if there was a GP pending completion, right? At which
point waking it is the right thing. If you wake it unconditionally, even
if there's nothing to do, then yes, that'd be a waste of cycles.
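
IOW I'd expect the wakeup on the expedited side to be guarded by
something like this -- again just a sketch, hand-waving the exact
helpers:

        /*
         * Only poke the GP kthread when there actually is a normal GP
         * in flight for the expedited pass to help along; waking it
         * unconditionally just adds to the rcu_sched runtime.
         */
        if (rcu_gp_in_progress(rsp))
                rcu_gp_kthread_wake(rsp);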