From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755279AbbBTT0W (ORCPT ); Fri, 20 Feb 2015 14:26:22 -0500
Received: from service87.mimecast.com ([91.220.42.44]:45315 "EHLO
	service87.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753612AbbBTT0U convert rfc822-to-8bit (ORCPT );
	Fri, 20 Feb 2015 14:26:20 -0500
Message-ID: <54E78A6E.30301@arm.com>
Date: Fri, 20 Feb 2015 19:26:38 +0000
From: Dietmar Eggemann
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0
MIME-Version: 1.0
To: Morten Rasmussen, "peterz@infradead.org", "mingo@redhat.com"
CC: "vincent.guittot@linaro.org", "yuyang.du@intel.com",
	"preeti@linux.vnet.ibm.com", "mturquette@linaro.org",
	"nico@linaro.org", "rjw@rjwysocki.net", Juri Lelli,
	"linux-kernel@vger.kernel.org"
Subject: Re: [RFCv3 PATCH 48/48] sched: Disable energy-unfriendly nohz kicks
References: <1423074685-6336-1-git-send-email-morten.rasmussen@arm.com>
	<1423074685-6336-49-git-send-email-morten.rasmussen@arm.com>
In-Reply-To: <1423074685-6336-49-git-send-email-morten.rasmussen@arm.com>
X-OriginalArrivalTime: 20 Feb 2015 19:26:17.0429 (UTC) FILETIME=[192D4850:01D04D43]
X-MC-Unique: 115022019261800301
Content-Type: text/plain; charset=WINDOWS-1252
Content-Transfer-Encoding: 8BIT
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Morten,

On 04/02/15 18:31, Morten Rasmussen wrote:
> With energy-aware scheduling enabled nohz_kick_needed() generates many
> nohz idle-balance kicks which lead to nothing when multiple tasks get
> packed on a single cpu to save energy. This causes unnecessary wake-ups
> and hence wastes energy. Make these conditions depend on !energy_aware()
> for now until the energy-aware nohz story gets sorted out.
>
> cc: Ingo Molnar
> cc: Peter Zijlstra
>
> Signed-off-by: Morten Rasmussen
> ---
>  kernel/sched/fair.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 1c248f8..cfe65ae 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8195,6 +8195,8 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
>  	clear_bit(NOHZ_BALANCE_KICK, nohz_flags(this_cpu));
>  }
>
> +static int cpu_overutilized(int cpu, struct sched_domain *sd);
> +
>  /*
>   * Current heuristic for kicking the idle load balancer in the presence
>   * of an idle cpu in the system.
> @@ -8234,12 +8236,13 @@ static inline bool nohz_kick_needed(struct rq *rq)
>  	if (time_before(now, nohz.next_balance))
>  		return false;
>
> -	if (rq->nr_running >= 2)
> +	sd = rcu_dereference(rq->sd);
> +	if (rq->nr_running >= 2 && (!energy_aware() || cpu_overutilized(cpu, sd)))
>  		return true;

CONFIG_PROVE_RCU checking revealed this one:

[    3.814454] ===============================
[    3.826989] [ INFO: suspicious RCU usage. ]
[    3.839526] 3.19.0-rc7+ #10 Not tainted
[    3.851018] -------------------------------
[    3.863554] kernel/sched/fair.c:8239 suspicious rcu_dereference_check() usage!
[    3.885216]
[    3.885216] other info that might help us debug this:
[    3.885216]
[    3.909236]
[    3.909236] rcu_scheduler_active = 1, debug_locks = 1
[    3.928817] no locks held by kthreadd/437.

The RCU read-side critical section has to be extended to incorporate
this sd = rcu_dereference(rq->sd):

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index cfe65aec3237..145360ee6e4a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8236,11 +8236,13 @@ static inline bool nohz_kick_needed(struct rq *rq)
 	if (time_before(now, nohz.next_balance))
 		return false;
 
+	rcu_read_lock();
 	sd = rcu_dereference(rq->sd);
-	if (rq->nr_running >= 2 && (!energy_aware() || cpu_overutilized(cpu, sd)))
-		return true;
+	if (rq->nr_running >= 2 && (!energy_aware() || cpu_overutilized(cpu, sd))) {
+		kick = true;
+		goto unlock;
+	}
 
-	rcu_read_lock();
 	sd = rcu_dereference(per_cpu(sd_busy, cpu));
 	if (sd && !energy_aware()) {
 		sgc = sd->groups->sgc;

-- Dietmar

>
>  	rcu_read_lock();
>  	sd = rcu_dereference(per_cpu(sd_busy, cpu));
> -	if (sd) {
> +	if (sd && !energy_aware()) {
>  		sgc = sd->groups->sgc;
>  		nr_busy = atomic_read(&sgc->nr_busy_cpus);
>
>
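
For illustration only, here is a rough user-space model of why the
splat fires and why moving rcu_read_lock() up silences it. Nothing
below is kernel code: the mock_* helpers, struct mock_rq, the
overutilized flag and energy_aware_enabled are hypothetical stand-ins
for rcu_read_lock(), rcu_dereference(), struct rq, cpu_overutilized()
and energy_aware(). The sketch assumes the function already has a kick
variable and an unlock label that drops the read lock, as the hunk
above implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel RCU primitives. */
int rcu_depth;       /* nesting depth of the mock read-side section */
int rcu_violations;  /* dereferences seen with no read lock held */

void mock_rcu_read_lock(void)
{
	rcu_depth++;
}

void mock_rcu_read_unlock(void)
{
	assert(rcu_depth > 0);  /* catch unbalanced unlocks */
	rcu_depth--;
}

/* Roughly what the lockdep check does: complain if no read lock is held. */
void *mock_rcu_dereference(void *p)
{
	if (rcu_depth == 0)
		rcu_violations++;
	return p;
}

struct mock_rq {
	int nr_running;
	void *sd;
	bool overutilized;  /* stand-in for cpu_overutilized(cpu, sd) */
};

bool energy_aware_enabled = true;  /* stand-in for energy_aware() */

/* Shape of the original hunk: rq->sd is dereferenced before the lock. */
bool kick_needed_unlocked(struct mock_rq *rq)
{
	void *sd = mock_rcu_dereference(rq->sd);

	(void)sd;
	if (rq->nr_running >= 2 && (!energy_aware_enabled || rq->overutilized))
		return true;

	mock_rcu_read_lock();
	/* remaining sd_busy/sd_asym checks elided */
	mock_rcu_read_unlock();
	return false;
}

/* Shape of the fix: lock first, leave through a single unlock path. */
bool kick_needed_locked(struct mock_rq *rq)
{
	bool kick = false;
	void *sd;

	mock_rcu_read_lock();
	sd = mock_rcu_dereference(rq->sd);
	(void)sd;
	if (rq->nr_running >= 2 && (!energy_aware_enabled || rq->overutilized)) {
		kick = true;
		goto unlock;
	}
	/* remaining sd_busy/sd_asym checks elided */
unlock:
	mock_rcu_read_unlock();
	return kick;
}
```

Both variants return the same decision; only the first one trips the
mock check. The goto matters because an early "return true" inside the
critical section would leave the read lock held.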