Message-ID: <1427936126.2556.10.camel@j-VirtualBox>
Subject: Re: [PATCH V2] sched: Improve load balancing in the presence of idle CPUs
From: Jason Low
To: Morten Rasmussen
Cc: Preeti U Murthy, Peter Zijlstra, "mingo@kernel.org", "riel@redhat.com",
    "daniel.lezcano@linaro.org", "vincent.guittot@linaro.org",
    "srikar@linux.vnet.ibm.com", "pjt@google.com", "benh@kernel.crashing.org",
    "efault@gmx.de", "linux-kernel@vger.kernel.org", "iamjoonsoo.kim@lge.com",
    "svaidy@linux.vnet.ibm.com", "tim.c.chen@linux.intel.com", jason.low2@hp.com
Date: Wed, 01 Apr 2015 17:55:26 -0700
In-Reply-To: <20150401130355.GW18994@e105550-lin.cambridge.arm.com>
References: <20150326130014.21532.17158.stgit@preeti.in.ibm.com>
    <20150327143839.GO18994@e105550-lin.cambridge.arm.com>
    <55158966.4050300@linux.vnet.ibm.com>
    <20150327175651.GR18994@e105550-lin.cambridge.arm.com>
    <20150330110632.GT23123@twins.programming.kicks-ass.net>
    <20150330120302.GT18994@e105550-lin.cambridge.arm.com>
    <551A61A9.6020009@linux.vnet.ibm.com>
    <1427823008.2492.19.camel@j-VirtualBox>
    <551B8FF3.70608@linux.vnet.ibm.com>
    <20150401130355.GW18994@e105550-lin.cambridge.arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2015-04-01 at 14:03 +0100, Morten Rasmussen wrote:

Hi Morten,

> > Alright I see. But it is one additional wake up. And the wake up will be
> > within the cluster.
> > We will not wake up any CPU in the neighboring
> > cluster unless there are tasks to be pulled. So, we can wake up a core
> > out of a deep idle state and never a cluster in the problem described.
> > In terms of energy efficiency, this is not so bad a scenario, is it?
>
> After Peter pointed out that it shouldn't happen across clusters due to
> group_classify()/sg_capacity_factor() it isn't as bad as I initially
> thought. It is still not an ideal solution I think. Wake-ups aren't nice
> for battery-powered devices. Waking up a cpu in an already active
> cluster may still imply powering up the core and bringing the L1 cache
> into a usable state, but it isn't as bad as waking up a cluster. I would
> prefer to avoid it if we can.

Right. I still think that the patch is justified if it addresses the 10
second latency issue, but if we could find a better solution, that would
be great :)

> Thinking more about it, don't we also risk doing a lot of iterations in
> nohz_idle_balance() leading to nothing (pure overhead) in certain corner
> cases? If find_new_ilb() is the last cpu in the cluster and we have one
> task for each cpu in the cluster but one cpu is currently having two.
> Don't we end up trying all nohz-idle cpus before giving up and balancing
> the balancer cpu itself. On big machines, going through everyone could
> take a while I think. No?

Iterating through many CPUs could take a while, but since we only do
nohz_idle_balance() when the CPU is idle, and exit the loop if
need_resched() is set, we only do so when nothing else needs to run.
Also, we only attempt balancing for a CPU when
time_after_eq(jiffies, rq->next_balance) holds, so much of the time we
don't actually traverse all the CPUs. So this may not be too big of an
issue.
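
To make the two early-outs above concrete, here is a minimal userspace
sketch of the loop shape. It is not the kernel code: struct model_rq,
struct walk_stats, tick_after_eq(), model_nohz_idle_balance() and the
resched_after parameter are all made up for illustration, and the
actual balance pass is reduced to a counter. need_resched() is modeled
as "becomes true after resched_after idle CPUs have been visited".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-CPU runqueue; not the kernel's struct rq. */
struct model_rq {
    bool idle;                  /* CPU is in the nohz-idle set */
    unsigned long next_balance; /* tick at which this CPU is next due */
};

/* Counters returned by the walk, for illustration only. */
struct walk_stats {
    size_t visited;  /* idle CPUs looked at before stopping */
    size_t balanced; /* idle CPUs whose balance interval had expired */
};

/* Wrap-safe "a >= b" on tick counters, the same idea as time_after_eq(). */
static bool tick_after_eq(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}

/*
 * Walk the idle CPUs on behalf of the balancer CPU `this_cpu`.
 *
 * resched_after is an assumption of this model: the number of idle CPUs
 * that can be visited before work shows up on the balancer CPU. Pass
 * SIZE_MAX for "never".
 */
static struct walk_stats model_nohz_idle_balance(const struct model_rq *rqs,
                                                 size_t ncpus,
                                                 size_t this_cpu,
                                                 unsigned long now,
                                                 size_t resched_after)
{
    struct walk_stats st = { 0, 0 };

    assert(rqs != NULL || ncpus == 0);

    for (size_t cpu = 0; cpu < ncpus; cpu++) {
        /* Skip the balancer itself and any CPU that is not idle. */
        if (cpu == this_cpu || !rqs[cpu].idle)
            continue;

        /* Early-out 1: the balancer has work of its own, stop here. */
        if (st.visited >= resched_after)
            break;
        st.visited++;

        /* Early-out 2: only balance a CPU whose interval has expired. */
        if (tick_after_eq(now, rqs[cpu].next_balance))
            st.balanced++; /* the real code would run a balance pass here */
    }
    return st;
}
```

For example, with four CPUs where CPU 0 is the balancer, CPU 2 is busy,
and only CPU 1 is due at now = 100, the walk visits two idle CPUs and
balances one. With resched_after = 0 it visits none. The per-CPU cost
when nothing is due is just the comparison, which is the point made
above; the corner case Morten describes is the one where many CPUs are
due at once and resched_after is effectively unbounded.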