LKML Archive on lore.kernel.org
From: tip-bot for Peter Zijlstra <tipbot@zytor.com>
To: linux-tip-commits@vger.kernel.org
Cc: clm@fb.com, torvalds@linux-foundation.org, efault@gmx.de,
	mingo@kernel.org, kitsunyan@inbox.ru, peterz@infradead.org,
	linux-kernel@vger.kernel.org, tglx@linutronix.de,
	matt@codeblueprint.co.uk, hpa@zytor.com
Subject: [tip:sched/core] sched/core: Implement new approach to scale select_idle_cpu()
Date: Thu, 8 Jun 2017 02:22:53 -0700
Message-ID: <tip-1ad3aaf3fcd2444406628a19a9b9e0922b95e2d4@git.kernel.org>
In-Reply-To: <20170517105350.hk5m4h4jb6dfr65a@hirez.programming.kicks-ass.net>

Commit-ID:  1ad3aaf3fcd2444406628a19a9b9e0922b95e2d4
Gitweb:     http://git.kernel.org/tip/1ad3aaf3fcd2444406628a19a9b9e0922b95e2d4
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Wed, 17 May 2017 12:53:50 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 8 Jun 2017 10:25:17 +0200

sched/core: Implement new approach to scale select_idle_cpu()

Hackbench recently suffered a bunch of pain, first by commit:

  4c77b18cf8b7 ("sched/fair: Make select_idle_cpu() more aggressive")

and then by commit:

  c743f0a5c50f ("sched/fair, cpumask: Export for_each_cpu_wrap()")

which fixed a bug in the initial for_each_cpu_wrap() implementation
that made select_idle_cpu() even more expensive. The bug was that it
would skip over CPUs when bits were consecutive in the bitmask.
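
For illustration only, the behaviour expected of a wrap-around walk can
be sketched in user space as below. This is not the kernel's
for_each_cpu_wrap(); visit_cpus_wrap() and its parameters are
hypothetical names, and the sketch assumes the mask fits in 64 bits and
that 0 <= start < nbits.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: visit every set bit of 'mask'
 * (bits 0..nbits-1) exactly once, starting at 'start' and wrapping past
 * the end. The visited indices are written to 'out', which must have
 * room for 'nbits' entries. Returns the number of indices written.
 */
static int visit_cpus_wrap(uint64_t mask, int nbits, int start, int *out)
{
	int n = 0;

	for (int i = 0; i < nbits; i++) {
		int cpu = (start + i) % nbits;

		/* Consecutive set bits must all be visited, none skipped. */
		if (mask & (UINT64_C(1) << cpu))
			out[n++] = cpu;
	}
	return n;
}
```

With mask 0x0f, nbits 8 and start 2, the walk visits CPUs 2, 3, 0, 1 in
that order, so a run of consecutive bits is covered in full.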

This, however, gave me an idea to fix select_idle_cpu(): where the old
scheme was a cliff-edge throttle on idle scanning, this introduces a
more gradual approach. Instead of stopping the scan entirely, we limit
how many CPUs we scan.
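
As a rough user-space sketch of that limit, the budget computed under
SIS_PROP in the diff below amounts to the following.
sis_prop_scan_budget() and its parameter names are hypothetical; the
clamp to INT_MAX is an addition of this sketch and is not in the patch.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical user-space model of the SIS_PROP scan budget.
 *   rq_avg_idle:   average idle time of this CPU (rq->avg_idle)
 *   avg_scan_cost: average cost of scanning the LLC domain
 *   span_weight:   number of CPUs in the scanned domain
 */
static int sis_prop_scan_budget(uint64_t rq_avg_idle, uint64_t avg_scan_cost,
				unsigned int span_weight)
{
	uint64_t avg_idle = rq_avg_idle / 512;	/* large fuzz factor */
	uint64_t avg_cost = avg_scan_cost + 1;	/* never divide by zero */
	uint64_t span_avg = (uint64_t)span_weight * avg_idle;
	uint64_t nr;

	/* Scale the budget with idle time, but never go below 4. */
	if (span_avg <= 4 * avg_cost)
		return 4;

	nr = span_avg / avg_cost;
	return nr > INT_MAX ? INT_MAX : (int)nr;	/* clamp: sketch only */
}
```

Because the loop tests "if (!--nr)" before looking at each CPU, a budget
of 4 lets the scan examine at most 3 CPUs before it gives up.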

Initial benchmarks show that it mostly recovers hackbench while not
hurting anything else, except Mason's schbench, and even that is not
hurt as badly as with the old scheme.

It also appears to recover the high end of tbench, which suffered like
hackbench.

Tested-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: hpa@zytor.com
Cc: kitsunyan <kitsunyan@inbox.ru>
Cc: linux-kernel@vger.kernel.org
Cc: lvenanci@redhat.com
Cc: riel@redhat.com
Cc: xiaolong.ye@intel.com
Link: http://lkml.kernel.org/r/20170517105350.hk5m4h4jb6dfr65a@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c     | 21 ++++++++++++++++-----
 kernel/sched/features.h |  1 +
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 47a0c55..396bca9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5794,27 +5794,38 @@ static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd
 static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int target)
 {
 	struct sched_domain *this_sd;
-	u64 avg_cost, avg_idle = this_rq()->avg_idle;
+	u64 avg_cost, avg_idle;
 	u64 time, cost;
 	s64 delta;
-	int cpu;
+	int cpu, nr = INT_MAX;
 
 	this_sd = rcu_dereference(*this_cpu_ptr(&sd_llc));
 	if (!this_sd)
 		return -1;
 
-	avg_cost = this_sd->avg_scan_cost;
-
 	/*
 	 * Due to large variance we need a large fuzz factor; hackbench in
 	 * particularly is sensitive here.
 	 */
-	if (sched_feat(SIS_AVG_CPU) && (avg_idle / 512) < avg_cost)
+	avg_idle = this_rq()->avg_idle / 512;
+	avg_cost = this_sd->avg_scan_cost + 1;
+
+	if (sched_feat(SIS_AVG_CPU) && avg_idle < avg_cost)
 		return -1;
 
+	if (sched_feat(SIS_PROP)) {
+		u64 span_avg = sd->span_weight * avg_idle;
+		if (span_avg > 4*avg_cost)
+			nr = div_u64(span_avg, avg_cost);
+		else
+			nr = 4;
+	}
+
 	time = local_clock();
 
 	for_each_cpu_wrap(cpu, sched_domain_span(sd), target) {
+		if (!--nr)
+			return -1;
 		if (!cpumask_test_cpu(cpu, &p->cpus_allowed))
 			continue;
 		if (idle_cpu(cpu))
diff --git a/kernel/sched/features.h b/kernel/sched/features.h
index dc4d148..d3fb155 100644
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -55,6 +55,7 @@ SCHED_FEAT(TTWU_QUEUE, true)
  * When doing wakeups, attempt to limit superfluous scans of the LLC domain.
  */
 SCHED_FEAT(SIS_AVG_CPU, false)
+SCHED_FEAT(SIS_PROP, true)
 
 /*
  * Issue a WARN when we do multiple update_rq_clock() calls

Thread overview: 29+ messages
2017-04-13 13:56 [RFC 0/3] sched/topology: fix sched groups on NUMA machines with mesh topology Lauro Ramos Venancio
2017-04-13 13:56 ` [RFC 1/3] sched/topology: Refactor function build_overlap_sched_groups() Lauro Ramos Venancio
2017-04-13 14:50   ` Rik van Riel
2017-05-15  9:02   ` [tip:sched/core] " tip-bot for Lauro Ramos Venancio
2017-04-13 13:56 ` [RFC 2/3] sched/topology: fix sched groups on NUMA machines with mesh topology Lauro Ramos Venancio
2017-04-13 15:16   ` Rik van Riel
2017-04-13 15:48   ` Peter Zijlstra
2017-04-13 20:21     ` Lauro Venancio
2017-04-13 21:06       ` Lauro Venancio
2017-04-13 23:38         ` Rik van Riel
2017-04-14 10:48           ` Peter Zijlstra
2017-04-14 11:38   ` Peter Zijlstra
2017-04-14 12:20     ` Peter Zijlstra
2017-05-15  9:03       ` [tip:sched/core] sched/fair, cpumask: Export for_each_cpu_wrap() tip-bot for Peter Zijlstra
2017-05-17 10:53         ` hackbench vs select_idle_sibling; was: " Peter Zijlstra
2017-05-17 12:46           ` Matt Fleming
2017-05-17 14:49           ` Chris Mason
2017-05-19 15:00           ` Matt Fleming
2017-06-05 13:00             ` Matt Fleming
2017-06-06  9:21               ` Peter Zijlstra
2017-06-09 17:52                 ` Chris Mason
2017-06-08  9:22           ` tip-bot for Peter Zijlstra [this message]
2017-04-14 16:58     ` [RFC 2/3] sched/topology: fix sched groups on NUMA machines with mesh topology Peter Zijlstra
2017-04-17 14:40       ` Lauro Venancio
2017-04-13 13:56 ` [RFC 3/3] sched/topology: Different sched groups must not have the same balance cpu Lauro Ramos Venancio
2017-04-13 15:27   ` Rik van Riel
2017-04-14 16:49   ` Peter Zijlstra
2017-04-17 15:34     ` Lauro Venancio
2017-04-18 12:32       ` Peter Zijlstra
