Date: Mon, 31 Jul 2017 14:48:07 +0000
From: Josef Bacik
To: Mike Galbraith
Cc: Josef Bacik, Joel Fernandes, Peter Zijlstra, LKML, Juri Lelli, Dietmar Eggemann, Patrick Bellasi, Brendan Jackman, Chris Redpath, Michael Wang, Matt Fleming
Subject: Re: wake_wide mechanism clarification
Message-ID: <20170731144806.GA7791@li70-116.members.linode.com>
References: <20170630142815.GA9743@destiny> <1498842140.15161.66.camel@gmail.com> <1501340845.7706.168.camel@gmail.com> <20170731122149.GA7539@li70-116.members.linode.com> <1501508545.6867.32.camel@gmail.com>
In-Reply-To: <1501508545.6867.32.camel@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 31, 2017 at 03:42:25PM +0200, Mike Galbraith wrote:
> On Mon, 2017-07-31 at 12:21 +0000, Josef Bacik wrote:
> >
> > I've been working in this area recently because of a cpu imbalance problem.
> > Wake_wide() definitely makes it so we're waking affine way too often, but I
> > think messing with wake_wide to solve that problem is the wrong solution. This
> > is just a heuristic to see if we should wake affine, the simpler the better.
> > I solved the problem of waking affine too often like this
> >
> > https://marc.info/?l=linux-kernel&m=150003849602535&w=2
>
> Wait a minute, that's not quite fair :)  Wake_wide() can't be blamed
> for causing too frequent affine wakeups when what it does is filter
> some out.  While it may not reject aggressively enough for you (why you
> bent it up to be very aggressive), seems the problem from your loads
> POV is the scheduler generally being too eager to bounce.
>

Yeah sorry, I hate this stuff because it's so hard to talk about without mixing
up different ideas.  I should say the scheduler in general prefers to wake
affine super hard, and wake_wide() is conservative in its filtering of this
behavior.  The rest still holds true: I think tinkering with it is just hard,
and it's the wrong place to do it.  It's a good first step, and we can be
smarter further down.

> I've also played with rate limiting migration per task, but it had
> negative effects too: when idle/periodic balance pulls buddies apart,
> rate limiting inhibits them quickly finding each other again, making
> undoing all that hard load balancer work a throughput win.  Sigh.
>

That's why I did the HZ thing: we don't touch the task for HZ jiffies (one
second) to let things settle out, and then allow affine wakeups after that.
Now HZ may be an eternity in scheduler time, but I think it's a good middle
ground.  For our case the box is loaded constantly, so we basically never want
affine wakeups for our app.  For the case where there's spiky behavior we'll
return to normal affine wakeups a short while later.  But from my admittedly
limited testing it appears to be a win overall.  Thanks,

Josef
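
For illustration only, the HZ cool-off described above can be sketched in
user-space C roughly as follows.  This is not the patch linked earlier in the
thread: struct task_sketch, its last_move field, time_after_sketch() and
affine_wakeup_allowed() are hypothetical names made up for this sketch, and
HZ is assumed to be 1000.  It only shows the idea of leaving a task alone for
HZ jiffies after it was last moved and permitting affine wakeups after that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed tick rate for this sketch: 1000 jiffies per second. */
#define HZ 1000UL

/* Hypothetical per-task state; not the kernel's struct task_struct. */
struct task_sketch {
	unsigned long last_move;	/* jiffies value when the task was last moved */
};

/*
 * Wrap-safe "a is after b" comparison, modelled on the kernel's
 * time_after() macro: correct across jiffies wraparound as long as the
 * two values are less than half the counter range apart.
 */
static bool time_after_sketch(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Affine wakeups are refused until more than HZ jiffies (one second)
 * have passed since the task was last moved, then allowed again.
 */
static bool affine_wakeup_allowed(const struct task_sketch *t,
				  unsigned long now)
{
	assert(t != NULL);
	return time_after_sketch(now, t->last_move + HZ);
}
```

On a constantly loaded box where tasks keep getting moved, last_move keeps
being refreshed and the check above basically never passes; on a spiky load
it starts passing again one second after the last move.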