From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751852AbcFBFu3 (ORCPT ); Thu, 2 Jun 2016 01:50:29 -0400 Received: from mail-wm0-f46.google.com ([74.125.82.46]:36977 "EHLO mail-wm0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750958AbcFBFu2 (ORCPT ); Thu, 2 Jun 2016 01:50:28 -0400 Message-ID: <1464846623.3766.49.camel@gmail.com> Subject: Re: [RFC PATCH 1/2] sched: Clean up SD_BALANCE_WAKE flags in sched domain build-up From: Mike Galbraith To: Yuyang Du Cc: Peter Zijlstra , mingo@kernel.org, linux-kernel@vger.kernel.org, bsegall@google.com, pjt@google.com, morten.rasmussen@arm.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com Date: Thu, 02 Jun 2016 07:50:23 +0200 In-Reply-To: <20160601200325.GA18670@intel.com> References: <1464657098-24880-1-git-send-email-yuyang.du@intel.com> <1464657098-24880-2-git-send-email-yuyang.du@intel.com> <20160531092146.GT3192@twins.programming.kicks-ass.net> <20160531013132.GQ18670@intel.com> <1464757633.4023.39.camel@gmail.com> <20160601000105.GU18670@intel.com> <1464773799.4023.72.camel@gmail.com> <20160601200325.GA18670@intel.com> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.16.5 Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 2016-06-02 at 04:03 +0800, Yuyang Du wrote: > On Wed, Jun 01, 2016 at 11:36:39AM +0200, Mike Galbraith wrote: > > > Yup. Up to this point, we don't have any disagreement. And I don't think we > > > have any disagreement conceptually. What the next patch really does is: > > > > > > (1) we don't remove SD_BALANCE_WAKE as an important sched_domain flag, on > > > the contrary, we strengthen it. > > > > > > (2) the semantic of SD_BALANCE_WAKE is currently represented by SD_WAKE_AFFINE, > > > we actually remove this representation. > > > > Nope, those two have different meanings. We pass SD_BALANCE_WAKE to > > identify a ttwu() wakeup, just as we pass SD_BALANCE_FORK to say we're > > waking a child. SD_WAKE_AFFINE means exactly what it says, but is only > > applicable to ttwu() wakeups. > > I don't disagree, but want to add that, SD_WAKE_AFFINE has no meaning that is so > special and so important for anyone to use the flag to tune anything. If you want > to do any SD_BALANCE_*, waker CPU is a valid candidate if allowed, that is it. That flag lets the user specifically tell us that he doesn't want us to bounce his tasks around the box, cache misses be damned. The user may _know_ that say cross node migrations hurt his load more than help, and not want us to do that, thus expresses himself by turning the flag off at whatever level. People do that. You can force them to take other measures, but why do that? > IIUC your XXX mark and your comment "Prefer wake_affine over balance flags", you > said the same thing: SD_WAKE_AFFINE should be part of SD_BALANCE_WAKE, and should > be part of all SD_BALANCE_* flags, Peter wrote that, but I don't read it the way you do. I read as if the user wants the benefits of affine wakeups, he surely doesn't want us to send the wakee off to god know where on every wakeup _instead_ of waking affine, he wants to balance iff he can't have an affine wakeup. > > > (3) regarding the semantic of SD_WAKE_AFFINE, it is really not about selecting > > > waker CPU or about the fast path. Conceptually, it is just saying the waker > > > CPU is a valid and important candidate if SD_BALANCE_WAKE, which is just so > > > obvious, so I don't think it deserves to be a separate sched_domain flag. > > > > SD_WAKE_AFFINE being a separate domain flag, the user can turn it > > on/off... separately :) > > Sure, that is very true, :) But turning it off for what, that is a big question mark. > We don't want a flag unless the flag is for goodness, and not a flag with big question > mark. It has a good excuse to exist. > > > (4) the outcome is, if SD_BALANCE_WAKE, we definitely will/should try waker CPU, > > > and if !SD_BALANCE_WAKE, we don't try waker CPU. So nothing functional is > > > changed. > > > > If wake_wide() says we do not want an affine wakeup, but you apply > > SD_WAKE_AFFINE meaning to SD_BALANCE_WAKE and turn it on in ->flags, > > we'll give the user a free sample of full balance cost, no? > > Yes, and otherwise we don't select anything? That is just bad engough whether worse > or not. So the whole fuss I made is really that this is a right thing to start with. :) Nope, leaving tasks where they were is not a bad thing. Lots of stuff likes the scheduler best when it leaves them the hell alone :) That works out well all around, balance cycles are spent in userspace instead, scheduler produces wins by doing nothing, perfect. -Mike