From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758514Ab3GRMf7 (ORCPT ); Thu, 18 Jul 2013 08:35:59 -0400
Received: from merlin.infradead.org ([205.233.59.134]:33027 "EHLO merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754403Ab3GRMf6 (ORCPT ); Thu, 18 Jul 2013 08:35:58 -0400
Date: Thu, 18 Jul 2013 14:35:31 +0200
From: Peter Zijlstra
To: Srikar Dronamraju
Cc: Rik van Riel , Jason Low , Ingo Molnar , LKML , Mike Galbraith , Thomas Gleixner , Paul Turner , Alex Shi , Preeti U Murthy , Vincent Guittot , Morten Rasmussen , Namhyung Kim , Andrew Morton , Kees Cook , Mel Gorman , aswin@hp.com, scott.norton@hp.com, chegu_vinod@hp.com
Subject: Re: [RFC] sched: Limit idle_balance() when it is being used too frequently
Message-ID: <20130718123531.GO27075@twins.programming.kicks-ass.net>
References: <1374048701.6000.21.camel@j-VirtualBox> <20130717093913.GP23818@dyad.programming.kicks-ass.net> <1374076741.7412.35.camel@j-VirtualBox> <20130717161815.GR23818@dyad.programming.kicks-ass.net> <51E6D9B7.1030705@redhat.com> <20130717180156.GS23818@dyad.programming.kicks-ass.net> <1374120144.1816.45.camel@j-VirtualBox> <20130718093218.GH27075@twins.programming.kicks-ass.net> <51E7D89A.8010009@redhat.com> <20130718121546.GB3745@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20130718121546.GB3745@linux.vnet.ibm.com>
User-Agent: Mutt/1.5.21 (2012-12-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 18, 2013 at 05:45:46PM +0530, Srikar Dronamraju wrote:
> We take locks if and only if we see imbalance and want to pull the
> tasks.
> However if the newly idle balance is not finding an imbalance then this
> may not be an issue.
>
> Probably /proc/schedstats will give a better picture.

Right, so we're interested in move_tasks() calls that fail to 'deliver'.
There are a few conditions in there that can cause us to not move a task,
most of them not counted. The few that are counted come from
can_migrate_task():

  se.statistics.nr_failed_migrations_affine
  se.statistics.nr_failed_migrations_running
  se.statistics.nr_failed_migrations_hot

If we see significant increments on those we'll be taking locks.

The only one I can see a good way around is the hot one: we could ignore
hotness in favour of newidle -- although I could see that being
detrimental, we'll just have to try or so ;-)

_running shouldn't be much of a problem since we don't bother if
nr_running <= 1. And _affine is out of our reach anyway.
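
To check for "significant increments" on those three counters, something
like the sketch below could be used. It is only a rough illustration and
rests on assumptions not stated in this mail: that the kernel is built
with schedstats and scheduler debugging enabled, that the counters show
up in /proc/<pid>/sched as "name : value" lines, and that the prefix
before nr_failed_migrations_* may differ between kernel versions. The
helper names (parse_failed_migrations, delta, read_task) are made up for
the example.

```python
import re

# Counter names taken from the mail; the "name : value" line layout
# is an assumption about the /proc/<pid>/sched output format.
COUNTERS = (
    "nr_failed_migrations_affine",
    "nr_failed_migrations_running",
    "nr_failed_migrations_hot",
)

# Accept an optional dotted prefix such as "se.statistics." before the name.
_LINE = re.compile(r"^\s*(?:[\w.]+\.)?(nr_failed_migrations_\w+)\s*:\s*(\d+)\s*$")


def parse_failed_migrations(text):
    """Extract the three failed-migration counters from one snapshot.

    Counters that are missing from the text are reported as 0.
    """
    counts = dict.fromkeys(COUNTERS, 0)
    for line in text.splitlines():
        m = _LINE.match(line)
        if m and m.group(1) in counts:
            counts[m.group(1)] = int(m.group(2))
    return counts


def delta(before, after):
    """Per-counter increase between two parsed snapshots."""
    return {name: after[name] - before[name] for name in COUNTERS}


def read_task(pid):
    """Hypothetical helper: parse the live counters of one task."""
    with open(f"/proc/{pid}/sched") as f:
        return parse_failed_migrations(f.read())
```

Taking read_task() for the tasks of interest before and after a benchmark
run and passing both results to delta() shows which of the three counters
moved. These are per-task counters, so a system-wide picture would need
the deltas summed over all tasks.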