From: Joel Fernandes <joelaf@google.com>
To: Josef Bacik <josef@toxicpanda.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>,
Peter Zijlstra <peterz@infradead.org>,
LKML <linux-kernel@vger.kernel.org>,
Juri Lelli <Juri.Lelli@arm.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Patrick Bellasi <patrick.bellasi@arm.com>,
Brendan Jackman <brendan.jackman@arm.com>,
Chris Redpath <Chris.Redpath@arm.com>,
Michael Wang <wangyun@linux.vnet.ibm.com>,
Matt Fleming <matt@codeblueprint.co.uk>
Subject: Re: wake_wide mechanism clarification
Date: Mon, 31 Jul 2017 09:21:46 -0700 [thread overview]
Message-ID: <CAJWu+oorSt-vDwVGHBNtJmLJEQD8Vni79QYO0=YnvBpZUy8u9A@mail.gmail.com> (raw)
In-Reply-To: <20170731122149.GA7539@li70-116.members.linode.com>
Hi Josef,
On Mon, Jul 31, 2017 at 5:21 AM, Josef Bacik <josef@toxicpanda.com> wrote:
> On Sat, Jul 29, 2017 at 03:41:56PM -0700, Joel Fernandes wrote:
>> On Sat, Jul 29, 2017 at 3:28 PM, Joel Fernandes <joelaf@google.com> wrote:
>> <snip>
>> >>>> Again I didn't follow why the second condition couldn't just be:
>> >>>> waker->nr_wakee_switch > factor, or, (waker->nr_wakee_switch +
>> >>>> wakee->nr_wakee_switch) > factor, based on the above explanation from
>> >>>> Micheal Wang that I quoted.
>> >>>> and why he's instead doing the whole multiplication thing there that I
>> >>>> was talking about earlier: "factor * wakee->nr_wakee_switch".
>> >>>>
>> >>>> Rephrasing my question in another way, why are we talking the ratio of
>> >>>> master/slave instead of the sum when comparing if its > factor? I am
>> >>>> surely missing something here.
>> >>>
>> >>> Because the heuristic tries to not demolish 1:1 buddies. Big partner
>> >>> flip delta means the pair are unlikely to be a communicating pair,
>> >>> perhaps at high frequency where misses hurt like hell.
>> >>
>> >> But it does seem to me to demolish the N:N communicating pairs from a
>> >> latency/load balancing standpoint. For he case of N readers and N
>> >> writers, the ratio (master/slave) comes down to 1:1 and we wake
>> >> affine. Hopefully I didn't miss something too obvious about that.
>> >
>> > I think wake_affine() should correctly handle the case (of
>> > overloading) I bring up here where wake_wide() is too conservative and
>> > does affine a lot, (I don't have any data for this though, this just
>> > from code reading), so I take this comment back for this reason.
>>
>> aargh, nope :( it still runs select_idle_sibling although on the
>> previous CPU even if want_affine is 0 (and doesn't do the wider
>> wakeup..), so the comment still applies.. its easy to get lost into
>> the code with so many if statements :-\ sorry about the noise :)
>>
>
> I've been working in this area recently because of a cpu imbalance problem.
> Wake_wide() definitely makes it so we're waking affine way too often, but I
> think messing with wake_waide to solve that problem is the wrong solution. This
> is just a heuristic to see if we should wake affine, the simpler the better. I
> solved the problem of waking affine too often like this
>
> https://marc.info/?l=linux-kernel&m=150003849602535&w=2
Thanks! Cool!
>
> So why do you care about wake_wide() anyway? Are you observing some problem
> that you suspect is affected by the affine wakeup stuff? Or are you just trying
I am dealing with an affine wake up issue, yes.
> to understand what is going on for fun? Cause if you are just doing this for
> fun you are a very strange person, thanks,
Its not just for fun :) Let me give you some background about me, I
work in the Android team and one of the things I want to do is to take
an out of tree patch that's been carried for some time and post a more
upstreamable solution - this is related to wake ups from the binder
driver which does sync wake ups (WF_SYNC). I can't find the exact out
of tree patch publicly since it wasn't posted to a list, but the code
is here [1]. What's worse is I have recently found really bad issues
with this patch itself where runnable times are increased. I should
have provided this background earlier (sorry that I didn't, my plan
was to trigger a separate discussion about the binder sync wake up
thing as a part of a patch/proposal I want to post - which I plan to
do so). Anyway, as a part of this effort, I want to understand
wake_wide() better and "respect" it since it sits in the wake up path
and I wanted to my proposal to work well with it, especially since I
want to solve this problem in an upstream-friendly way.
The other reason to trigger the discussion, is, I have seen
wake_wide() enough number of times and asked enough number of folks
how it works that it seems sensible to ask about it here (I was also
suggested to ask about wake_wide on LKML because since few people
seemingly understand how it works) and hopefully now its a bit better
understood.
I agree with you that instead of spending insane amounts of time on
wake_wide itself, its better to directly work on a problem and collect
some data - which is also what I'm doing, but I still thought its
worth doing some digging into wake_wide() during some free time I had,
thanks.
Cheers,
-Joel
[1] https://android.googlesource.com/kernel/msm.git/+/377e6e28b6097b3d6de7245d3d3def45fc8c9ffc/kernel/sched/fair.c#5492
next prev parent reply other threads:[~2017-07-31 16:21 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-06-30 0:19 wake_wide mechanism clarification Joel Fernandes
2017-06-30 0:49 ` Josef Bacik
2017-06-30 3:04 ` Joel Fernandes
2017-06-30 14:28 ` Josef Bacik
2017-06-30 17:02 ` Mike Galbraith
2017-06-30 17:55 ` Josef Bacik
2017-08-03 10:53 ` Brendan Jackman
2017-08-03 13:15 ` Josef Bacik
2017-08-03 15:05 ` Brendan Jackman
2017-08-09 21:22 ` Atish Patra
2017-08-10 9:48 ` Brendan Jackman
2017-08-10 17:41 ` Atish Patra
2017-07-29 8:01 ` Joel Fernandes
2017-07-29 8:13 ` Joel Fernandes
2017-08-02 8:26 ` Michael Wang
2017-08-03 23:48 ` Joel Fernandes
2017-07-29 15:07 ` Mike Galbraith
2017-07-29 20:19 ` Joel Fernandes
2017-07-29 22:28 ` Joel Fernandes
2017-07-29 22:41 ` Joel Fernandes
2017-07-31 12:21 ` Josef Bacik
2017-07-31 13:42 ` Mike Galbraith
2017-07-31 14:48 ` Josef Bacik
2017-07-31 17:23 ` Mike Galbraith
2017-07-31 16:21 ` Joel Fernandes [this message]
2017-07-31 16:42 ` Josef Bacik
2017-07-31 17:55 ` Joel Fernandes
2017-06-30 3:11 ` Mike Galbraith
2017-06-30 13:11 ` Matt Fleming
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAJWu+oorSt-vDwVGHBNtJmLJEQD8Vni79QYO0=YnvBpZUy8u9A@mail.gmail.com' \
--to=joelaf@google.com \
--cc=Chris.Redpath@arm.com \
--cc=Juri.Lelli@arm.com \
--cc=brendan.jackman@arm.com \
--cc=dietmar.eggemann@arm.com \
--cc=josef@toxicpanda.com \
--cc=linux-kernel@vger.kernel.org \
--cc=matt@codeblueprint.co.uk \
--cc=patrick.bellasi@arm.com \
--cc=peterz@infradead.org \
--cc=umgwanakikbuti@gmail.com \
--cc=wangyun@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).