All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Galbraith <efault@gmx.de>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Valentin Schneider <valentin.schneider@arm.com>,
	Aubrey Li <aubrey.li@linux.intel.com>,
	Barry Song <song.bao.hua@hisilicon.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: wakeup_affine_weight() is b0rked - was Re: [PATCH 2/2] sched/fair: Scale wakeup granularity relative to nr_running
Date: Sun, 03 Oct 2021 05:07:58 +0200	[thread overview]
Message-ID: <02c977d239c312de5e15c77803118dcf1e11f216.camel@gmx.de> (raw)
In-Reply-To: <ea2f9038f00d3b4c0008235079e1868145b47621.camel@gmx.de>

On Wed, 2021-09-22 at 07:22 +0200, Mike Galbraith wrote:
> On Tue, 2021-09-21 at 11:36 +0100, Mel Gorman wrote:
> > On Tue, Sep 21, 2021 at 05:52:32AM +0200, Mike Galbraith wrote:
> >
> >
> > > Preemption does rapidly run into diminishing return as load climbs for
> > > a lot of loads, but as you know, it's a rather sticky wicket because
> > > even when over-committed, preventing light control threads from slicing
> > > through (what can be a load's own work crew of) hogs can seriously
> > > injure performance.
> > >
> >
> > Turning this into a classic Rob Peter To Pay Paul problem. We don't know
> > if there is a light control thread that needs to run or not that affects
> > overall performance. It all depends on whether that control thread needs
> > to make progress for the overall workload or whether there are a mix of
> > workloads resulting in overloading.
>
> WRT overload, and our good buddies Peter and Paul :) I added...
>         if (gran >= sysctl_sched_latency >> 1)
>                 trace_printk("runnable:%d preempt disabled\n",cfs_rq->nr_running);
> ...to watch, and met the below when I.. logged in. 
>
> homer:..debug/tracing # tail -20 trace
>                X-2229    [002] d..5.    60.690322: wakeup_gran: runnable:9 preempt disabled
>                X-2229    [002] d..5.    60.690325: wakeup_gran: runnable:10 preempt disabled
>                X-2229    [002] d..5.    60.690330: wakeup_gran: runnable:11 preempt disabled
>                X-2229    [002] d..5.    60.690363: wakeup_gran: runnable:13 preempt disabled
>                X-2229    [002] d..5.    60.690377: wakeup_gran: runnable:14 preempt disabled
>                X-2229    [002] d..5.    60.690390: wakeup_gran: runnable:15 preempt disabled
>                X-2229    [002] d..5.    60.690404: wakeup_gran: runnable:16 preempt disabled
>                X-2229    [002] d..5.    60.690425: wakeup_gran: runnable:9 preempt disabled
>        ksmserver-2694    [003] d..3.    60.690432: wakeup_gran: runnable:6 preempt disabled
>        ksmserver-2694    [003] d..3.    60.690436: wakeup_gran: runnable:7 preempt disabled
>                X-2229    [002] d..5.    60.690451: wakeup_gran: runnable:6 preempt disabled
>                X-2229    [002] d..5.    60.690465: wakeup_gran: runnable:7 preempt disabled
>             kmix-2736    [000] d..3.    60.690491: wakeup_gran: runnable:6 preempt disabled
>                X-2229    [004] d..5.    92.889635: wakeup_gran: runnable:6 preempt disabled
>                X-2229    [004] d..5.    92.889675: wakeup_gran: runnable:6 preempt disabled
>                X-2229    [004] d..5.    92.889863: wakeup_gran: runnable:6 preempt disabled
>                X-2229    [004] d..5.    92.889944: wakeup_gran: runnable:6 preempt disabled
>                X-2229    [004] d..5.    92.889957: wakeup_gran: runnable:7 preempt disabled
>                X-2229    [004] d..5.    92.889968: wakeup_gran: runnable:8 preempt disabled
>   QXcbEventQueue-2740    [000] d..4.    92.890025: wakeup_gran: runnable:6 preempt disabled
> homer:..debug/tracing
>
> Watching 'while sleep 1; do clear;tail trace; done' with nothing but a
> kbuild running is like watching top.  There's enough stacking during
> routine use of my desktop box that it runs into the tick granularity
> wall pretty much continuously, so 'overload' may want redefining.

I looked into that crazy stacking depth...

static int
wake_affine_weight(struct sched_domain *sd, struct task_struct *p,
                   int this_cpu, int prev_cpu, int sync)
{
        s64 this_eff_load, prev_eff_load;
        unsigned long task_load;

        this_eff_load = cpu_load(cpu_rq(this_cpu));
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^ the butler didit!

That's pretty darn busted as it sits.  Between load updates, X, or any
other waker of many, can stack wakees to a ludicrous depth.  Tracing
kbuild vs firefox playing a youtube clip, I watched X stack 20 of the
zillion firefox minions while their previous CPUs all had 1 lousy task
running but a cpu_load() higher than the cpu_load() of X's CPU.  Most
of those prev_cpus were where X had left them when it migrated. Each
and every crazy depth migration was wake_affine_weight() deciding we
should pull based on crappy data.  As instantaneous load on the waker
CPU blew through the roof in my trace snapshot, its cpu_load() did
finally budge.. a tiny bit.. downward.  No idea where the stack would
have topped out, my tracing_off() limit was 20.

Hohum, my box grew a WA_INST companion to SIS_MIN_LAT cache cold task
distribulator feature ;-)  Not particularly lovely, but it knocks over
the leaning tower of minions.

	-Mike

  parent reply	other threads:[~2021-10-03  3:08 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-20 14:26 [PATCH 0/2] Scale wakeup granularity relative to nr_running Mel Gorman
2021-09-20 14:26 ` [PATCH 1/2] sched/fair: Remove redundant lookup of rq in check_preempt_wakeup Mel Gorman
2021-09-21  7:21   ` Vincent Guittot
2021-09-21  7:53     ` Mel Gorman
2021-09-21  8:12       ` Vincent Guittot
2021-09-21  8:21       ` Peter Zijlstra
2021-09-21 10:03         ` Mel Gorman
2021-09-20 14:26 ` [PATCH 2/2] sched/fair: Scale wakeup granularity relative to nr_running Mel Gorman
2021-09-21  3:52   ` Mike Galbraith
2021-09-21  5:50     ` Mike Galbraith
2021-09-21  7:04     ` Mike Galbraith
2021-09-21 10:36     ` Mel Gorman
2021-09-21 12:32       ` Mike Galbraith
2021-09-21 14:03         ` Mel Gorman
2021-10-05  9:24         ` Peter Zijlstra
2021-09-22  5:22       ` Mike Galbraith
2021-09-22 13:20         ` Mel Gorman
2021-09-22 14:04           ` Mike Galbraith
2021-09-22 14:15           ` Vincent Guittot
2021-09-22 15:04             ` Mel Gorman
2021-09-22 16:00               ` Vincent Guittot
2021-09-22 17:38                 ` Mel Gorman
2021-09-22 18:22                   ` Vincent Guittot
2021-09-22 18:57                     ` Mel Gorman
2021-09-23  1:47                     ` Mike Galbraith
2021-09-23  8:40                       ` Vincent Guittot
2021-09-23  9:21                         ` Mike Galbraith
2021-09-23 12:41                           ` Vincent Guittot
2021-09-23 13:14                             ` Mike Galbraith
2021-09-27 11:17                             ` Mel Gorman
2021-09-27 14:17                               ` Mike Galbraith
2021-10-04  8:05                                 ` Mel Gorman
2021-10-04 16:37                                   ` Vincent Guittot
2021-10-05  7:41                                     ` Mel Gorman
2021-09-27 14:19                               ` Vincent Guittot
2021-09-27 15:02                                 ` Mel Gorman
2021-09-23 12:24                         ` Phil Auld
2021-10-05 10:36                           ` Peter Zijlstra
2021-10-05 14:12                             ` Phil Auld
2021-10-05 14:32                               ` Peter Zijlstra
2021-10-05 10:28                     ` Peter Zijlstra
2021-10-05 10:23                   ` Peter Zijlstra
2021-10-05  9:41               ` Peter Zijlstra
2021-09-22 15:05             ` Vincent Guittot
2021-10-05  9:32           ` Peter Zijlstra
2021-10-03  3:07         ` Mike Galbraith [this message]
2021-10-03  7:34           ` wakeup_affine_weight() is b0rked - was " Barry Song
2021-10-03 14:52             ` Mike Galbraith
2021-10-03 21:06               ` Barry Song
2021-10-04  1:49                 ` Mike Galbraith
2021-10-04  4:34             ` Mike Galbraith
2021-10-04  9:06               ` Mike Galbraith
2021-10-05  7:47                 ` Mel Gorman
2021-10-05  8:42                   ` Mike Galbraith
2021-10-05  9:31                     ` Mel Gorman
2021-10-06  6:46                       ` Mike Galbraith
2021-10-08  5:06                       ` Mike Galbraith
2021-09-21  8:03   ` Vincent Guittot
2021-09-21 10:45     ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=02c977d239c312de5e15c77803118dcf1e11f216.camel@gmx.de \
    --to=efault@gmx.de \
    --cc=aubrey.li@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=song.bao.hua@hisilicon.com \
    --cc=srikar@linux.vnet.ibm.com \
    --cc=valentin.schneider@arm.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.