From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753936AbcEWL7e (ORCPT ); Mon, 23 May 2016 07:59:34 -0400
Received: from foss.arm.com ([217.140.101.70]:50550 "EHLO foss.arm.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751865AbcEWL7d (ORCPT ); Mon, 23 May 2016 07:59:33 -0400
Date: Mon, 23 May 2016 13:00:10 +0100
From: Morten Rasmussen
To: Mike Galbraith
Cc: peterz@infradead.org, mingo@redhat.com, dietmar.eggemann@arm.com,
	yuyang.du@intel.com, vincent.guittot@linaro.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 03/16] sched/fair: Disregard idle task wakee_flips in wake_wide
Message-ID: <20160523120010.GB27946@e105550-lin.cambridge.arm.com>
References: <1464001138-25063-1-git-send-email-morten.rasmussen@arm.com>
	<1464001138-25063-4-git-send-email-morten.rasmussen@arm.com>
	<1464001927.4537.118.camel@suse.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1464001927.4537.118.camel@suse.de>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, May 23, 2016 at 01:12:07PM +0200, Mike Galbraith wrote:
> On Mon, 2016-05-23 at 11:58 +0100, Morten Rasmussen wrote:
> > wake_wide() is based on task wakee_flips of the waker and the wakee to
> > decide whether an affine wakeup is desirable. On lightly loaded systems
> > the waker is frequently the idle task (pid=0) which can accumulate a lot
> > of wakee_flips in that scenario. It makes little sense to prevent affine
> > wakeups on an idle cpu due to the idle task wakee_flips, so it makes
> > more sense to ignore them in wake_wide().
>
> You sure? What's the difference between a task flipping enough to
> warrant spreading the load, and an interrupt source doing the same?
> I've both witnessed firsthand, and received user confirmation of this
> very thing improving utilization.

Right, I didn't consider the interrupt source scenario, my fault.

The problem then seems to be distinguishing between a cpu that is truly
idle and one that is busy handling interrupts. The issue that I observe
is that wake_wide() likes pushing tasks around in lightly loaded
scenarios, which isn't desirable for power management. Selecting the
same cpu again may potentially let others reach deeper C-states.

With that in mind I will see if I can do better. Suggestions are
welcome :-)

>
> > cc: Ingo Molnar
> > cc: Peter Zijlstra
> >
> > Signed-off-by: Morten Rasmussen
> > ---
> >  kernel/sched/fair.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index c49e25a..0fe3020 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -5007,6 +5007,10 @@ static int wake_wide(struct task_struct *p)
> >  	unsigned int slave = p->wakee_flips;
> >  	int factor = this_cpu_read(sd_llc_size);
> >
> > +	/* Don't let the idle task prevent affine wakeups */
> > +	if (is_idle_task(current))
> > +		return 0;
> > +
> >  	if (master < slave)
> >  		swap(master, slave);
> >  	if (slave < factor || master < slave * factor)
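
To make the trade-off discussed above concrete, here is a minimal
userspace sketch of the wake_wide() decision with and without the
proposed idle-task check. It is not kernel code: struct wake_ctx,
wake_wide_model() and the ignore_idle_waker flag are hypothetical names
introduced only for illustration, and the return values after the last
quoted condition (0 for an affine wakeup, 1 for a wide one) are an
assumption, since the quoted hunk ends before them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state wake_wide() reads in the kernel. */
struct wake_ctx {
	unsigned int waker_flips;	/* models current->wakee_flips */
	unsigned int wakee_flips;	/* models p->wakee_flips */
	unsigned int llc_size;		/* models this_cpu_read(sd_llc_size) */
	bool waker_is_idle;		/* models is_idle_task(current) */
};

/*
 * Returns 1 when the wakeup should be spread (wide), 0 when an affine
 * wakeup is allowed. ignore_idle_waker enables the check added by the
 * patch; the 0/1 return convention is assumed, not taken from the hunk.
 */
static int wake_wide_model(const struct wake_ctx *c, bool ignore_idle_waker)
{
	unsigned int master = c->waker_flips;
	unsigned int slave = c->wakee_flips;
	unsigned int factor = c->llc_size;

	/* Patched behaviour: an idle waker never prevents affine wakeups. */
	if (ignore_idle_waker && c->waker_is_idle)
		return 0;

	/* Keep the larger flip count in master, as swap() does in the hunk. */
	if (master < slave) {
		unsigned int tmp = master;

		master = slave;
		slave = tmp;
	}

	if (slave < factor || master < slave * factor)
		return 0;
	return 1;
}

/* Worked cases with made-up flip counts and an LLC of 4 cpus. */
static void wake_wide_model_examples(void)
{
	struct wake_ctx idle_waker = { 100, 10, 4, true };
	struct wake_ctx busy_waker = { 100, 10, 4, false };
	struct wake_ctx few_flips = { 100, 2, 4, false };

	/* Unpatched: the idle task's flips force a wide wakeup. */
	assert(wake_wide_model(&idle_waker, false) == 1);
	/* Patched: the same counts now allow an affine wakeup. */
	assert(wake_wide_model(&idle_waker, true) == 0);
	/* A non-idle waker is unaffected by the patch. */
	assert(wake_wide_model(&busy_waker, true) == 1);
	/* Too few flips on one side always stays affine. */
	assert(wake_wide_model(&few_flips, false) == 0);
}
```

Note that in this model the idle_waker case cannot tell a truly idle
cpu from one whose flips were accumulated by interrupt-driven wakeups,
which is exactly the objection raised in the quoted reply.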