From: Mike Galbraith <umgwanakikbuti@gmail.com>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Rik van Riel <riel@redhat.com>
Subject: [v2 patch v3.18+ regression fix] sched: Further improve spurious CPU_IDLE active migrations
Date: Mon, 05 Sep 2016 18:26:53 +0200
Message-ID: <1473092813.4412.6.camel@gmail.com>
In-Reply-To: <CAKfTPtDxTH62HGrze+rSrw9+kZc6xHSfJemhWqxhyhLZzM0qDg@mail.gmail.com>

Coming back to this, how about this instead: only increase the group
imbalance threshold when sd_llc_size == 2.  Newer L3-equipped
processors then aren't affected.



43f4d666 partially cured spurious migrations, but when there are
completely idle groups on a lightly loaded processor, and there is
a buddy pair occupying the busiest group, we will not attempt to
migrate due to select_idle_sibling() buddy placement, leaving the
busiest queue with one task.  We skip balancing, but increment
nr_balance_failed until we kick active balancing, and bounce a
buddy pair endlessly, demolishing throughput.

Increase group imbalance threshold to two when sd_llc_size == 2 to
allow buddies to share L2 without affecting larger L3 processors.

Regression detected on X5472 box, which has 4 MC groups of 2 cores.

netperf -l 60 -H 127.0.0.1 -t UDP_STREAM -i5,1 -I 95,5
pre:
!!! WARNING
!!! Desired confidence was not achieved within the specified iterations.
!!! This implies that there was variability in the test environment that
!!! must be investigated before going further.
!!! Confidence intervals: Throughput      : 66.421%
!!!                       Local CPU util  : 0.000%
!!!                       Remote CPU util : 0.000%

Socket  Message  Elapsed      Messages                
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992   65507   60.00     1779143      0    15539.49
212992           60.00     1773551           15490.65

post:
Socket  Message  Elapsed      Messages                
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992   65507   60.00     3719377      0    32486.01
212992           60.00     3717492           32469.54

Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
Fixes: caeb178c ("sched/fair: Make update_sd_pick_busiest() return 'true' on a busier sd")
Cc: <stable@vger.kernel.org> # v3.18+
---
 kernel/sched/fair.c |   17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7249,12 +7249,19 @@ static struct sched_group *find_busiest_
 		 * This cpu is idle. If the busiest group is not overloaded
 		 * and there is no imbalance between this and busiest group
 		 * wrt idle cpus, it is balanced. The imbalance becomes
-		 * significant if the diff is greater than 1 otherwise we
-		 * might end up to just move the imbalance on another group
+		 * significant if the diff is greater than 1 for most CPUs,
+		 * or 2 for older CPUs having multiple groups of 2 cores
+		 * sharing an L2, otherwise we may end up uselessly moving
+		 * the imbalance to another group, or starting a tug of war
+		 * with idle L2 groups constantly ripping communicating
+		 * tasks apart, and no L3 to mitigate the cache miss pain.
 		 */
-		if ((busiest->group_type != group_overloaded) &&
-				(local->idle_cpus <= (busiest->idle_cpus + 1)))
-			goto out_balanced;
+		if (busiest->group_type != group_overloaded) {
+			int imbalance = __this_cpu_read(sd_llc_size) == 2 ? 2 : 1;
+
+			if (local->idle_cpus <= busiest->idle_cpus + imbalance)
+				goto out_balanced;
+		}
 	} else {
 		/*
 		 * In the CPU_NEWLY_IDLE, CPU_NOT_IDLE cases, use
