From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754520AbbGCGlH (ORCPT ); Fri, 3 Jul 2015 02:41:07 -0400
Received: from mail-wi0-f179.google.com ([209.85.212.179]:38261 "EHLO
	mail-wi0-f179.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753974AbbGCGlB (ORCPT ); Fri, 3 Jul 2015 02:41:01 -0400
Message-ID: <1435905658.6418.52.camel@gmail.com>
Subject: Re: [PATCH RESEND] sched: prefer an idle cpu vs an idle sibling for BALANCE_WAKE
From: Mike Galbraith
To: Josef Bacik
Cc: Peter Zijlstra, riel@redhat.com, mingo@redhat.com,
	linux-kernel@vger.kernel.org, morten.rasmussen@arm.com, kernel-team
Date: Fri, 03 Jul 2015 08:40:58 +0200
In-Reply-To: <55957871.7080906@fb.com>
References: <1432761736-22093-1-git-send-email-jbacik@fb.com>
	<20150528102127.GD3644@twins.programming.kicks-ass.net>
	<20150528110514.GR18673@twins.programming.kicks-ass.net>
	<1434087305.3674.26.camel@gmail.com>
	<5581B70D.2000800@fb.com>
	<1434588939.3444.25.camel@gmail.com>
	<55823F33.7040005@fb.com>
	<1434600765.3393.9.camel@gmail.com>
	<55957871.7080906@fb.com>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.12.11
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 2015-07-02 at 13:44 -0400, Josef Bacik wrote:

> Now for 3.10 vs 4.0 our request duration time is the same if not
> slightly better on 4.0, so once the workers are doing their job
> everything is a-ok.
>
> The problem is the probability the select queue >= 1 is way different on
> 4.0 vs 3.10. Normally this graph looks like an S, it's essentially 0 up
> to some RPS (requests per second) threshold and then shoots up to 100%
> after the threshold. I'll make a table of these graphs that hopefully
> makes sense, the numbers are different from run to run because of
> traffic and such, the test and control are both run at the same time.
> The header is the probability the select queue >=1
>
>                   25%   50%   75%
> 4.0 plain:        371   388   402
> control:          386   394   402
> difference:        15     6     0

So control is 3.10? Virgin?

> So with 4.0 its basically a straight line, at lower RPS we are getting a
> higher probability of a select queue >= 1. We are measuring the cpu
> delay avg ms thing from the scheduler netlink stuff which is how I
> noticed it was scheduler related, our cpu delay is way higher on 4.0
> than it is on 3.10 or 4.0 with the wake idle patch.
>
> So the next test is NO_PREFER_IDLE. This is slightly better than 4.0 plain
>                   25%   50%   75%
> NO_PREFER_IDLE:   399   401   414
> control:          385   408   416
> difference:        14     7     2

Hm. Throttling nohz may make larger delta. But never mind that.

> The numbers don't really show it well, but the graphs are closer
> together, it's slightly more s shaped, but still not great.
>
> Next is NO_WAKE_WIDE, which is horrible
>
>                   25%   50%   75%
> NO_WAKE_WIDE:     315   344   369
> control:          373   380   388
> difference:        58    36    19
>
> This isn't even in the same ballpark, it's a way worse regression than
> plain.

Ok, this jibes perfectly with 1:N waker/wakee thingy.

> The next bit is NO_WAKE_WIDE|NO_PREFER_IDLE, which is just as bad
>
>                   25%   50%   75%
> EVERYTHING:       327   360   383
> control:          381   390   399
> difference:        54    30    19

Ditto. Hm. Seems what this load should like best is if we detect 1:N,
skip all of the routine gyrations, ie move the N (workers) infrequently,
expend search cycles frequently only on the 1 (dispatch).

Ponder..

	-Mike