From: Joel Fernandes
Date: Sat, 29 Jul 2017 15:41:56 -0700
Subject: Re: wake_wide mechanism clarification
To: Mike Galbraith
Cc: Josef Bacik, Peter Zijlstra, LKML, Juri Lelli, Dietmar Eggemann, Patrick Bellasi, Brendan Jackman, Chris Redpath, Michael Wang, Matt Fleming
References: <20170630004912.GA2457@destiny> <20170630142815.GA9743@destiny> <1498842140.15161.66.camel@gmail.com> <1501340845.7706.168.camel@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Jul 29, 2017 at 3:28 PM, Joel Fernandes wrote:
>>>> Again I didn't follow why the second condition couldn't just be:
>>>> waker->nr_wakee_switch > factor, or, (waker->nr_wakee_switch +
>>>> wakee->nr_wakee_switch) > factor, based on the above explanation from
>>>> Michael Wang that I quoted.
>>>> and why he's instead doing the whole multiplication thing there that I
>>>> was talking about earlier: "factor * wakee->nr_wakee_switch".
>>>>
>>>> Rephrasing my question in another way, why are we taking the ratio of
>>>> master/slave instead of the sum when comparing if it's > factor? I am
>>>> surely missing something here.
>>>
>>> Because the heuristic tries to not demolish 1:1 buddies. Big partner
>>> flip delta means the pair are unlikely to be a communicating pair,
>>> perhaps at high frequency where misses hurt like hell.
>>
>> But it does seem to me to demolish the N:N communicating pairs from a
>> latency/load balancing standpoint. For the case of N readers and N
>> writers, the ratio (master/slave) comes down to 1:1 and we wake
>> affine. Hopefully I didn't miss something too obvious about that.
>
> I think wake_affine() should correctly handle the case (of
> overloading) I bring up here where wake_wide() is too conservative and
> does affine a lot, (I don't have any data for this though, this is just
> from code reading), so I take this comment back for this reason.

aargh, nope :( it still runs select_idle_sibling() although on the
previous CPU even if want_affine is 0 (and doesn't do the wider
wakeup..), so the comment still applies.. it's easy to get lost in
the code with so many if statements :-\

sorry about the noise :)

thanks,

-Joel
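
For readers following the thread, here is a minimal stand-alone sketch of the two checks being compared above: the ratio-style test (a slave threshold plus a "master vs. factor * slave" comparison) and the sum-style alternative asked about in the quoted question. This is only a model under stated assumptions, not the code in kernel/sched/fair.c: the function names wake_wide_model() and wake_wide_sum() and their parameters are hypothetical, the flip counters stand in for what the thread calls nr_wakee_switch, and factor stands in for whatever the scheduler uses (assumed here to be a small per-domain constant).

```c
#include <stdbool.h>

/*
 * Hypothetical model of the ratio-style check discussed in the thread.
 * waker_flips / wakee_flips are illustrative stand-ins for the per-task
 * counters the thread calls nr_wakee_switch; factor is an assumed
 * small constant. Returns true for "wake wide", false for "stay affine".
 */
static bool wake_wide_model(unsigned int waker_flips,
                            unsigned int wakee_flips,
                            unsigned int factor)
{
	unsigned int master = waker_flips;
	unsigned int slave = wakee_flips;

	/* Treat whichever side flips more as the master. */
	if (master < slave) {
		unsigned int tmp = master;
		master = slave;
		slave = tmp;
	}

	/*
	 * Stay affine unless the slave flips at least 'factor' times and
	 * the master flips at least 'factor' times as often as the slave.
	 */
	if (slave < factor || master < slave * factor)
		return false;
	return true;
}

/*
 * Hypothetical sum-style alternative from the quoted question:
 * go wide as soon as the combined flip count exceeds factor.
 */
static bool wake_wide_sum(unsigned int waker_flips,
                          unsigned int wakee_flips,
                          unsigned int factor)
{
	return waker_flips + wakee_flips > factor;
}
```

With factor = 4 and both sides at 100 flips (the 1:1 ratio that the N readers / N writers case reduces to), wake_wide_model() stays affine while wake_wide_sum() goes wide. That is the trade-off under discussion: the ratio form protects 1:1 buddies, at the cost of also treating balanced N:N traffic as affine.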