From: Mike Galbraith <efault@gmx.de>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: Barry Song <21cnbao@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Valentin Schneider <valentin.schneider@arm.com>,
	Aubrey Li <aubrey.li@linux.intel.com>,
	Barry Song <song.bao.hua@hisilicon.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: wakeup_affine_weight() is b0rked - was Re: [PATCH 2/2] sched/fair: Scale wakeup granularity relative to nr_running
Date: Fri, 08 Oct 2021 07:06:51 +0200
Message-ID: <63057cf75e91bd0d348b5475ffab8e5a9f5d20f4.camel@gmx.de>
In-Reply-To: <20211005093137.GQ3959@techsingularity.net>

On Tue, 2021-10-05 at 10:31 +0100, Mel Gorman wrote:
> Ideally, I would do some tracing to confirm that maximum runqueue depth
> is really reduced by the path.

I would expect your worst case to remain unchanged; mine does.  The
patch mitigates; it does not eradicate.
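
If you do want numbers, worst case stacking depth is easy enough to
eyeball from the enqueue path.  A sketch, assuming a BTF enabled kernel
so bpftrace can resolve the struct rq cast:

  bpftrace -e 'kprobe:enqueue_task_fair { @depth[cpu] = max(((struct rq *)arg0)->nr_running + 1); }'

Run it with and without the patch under the same load; @depth holds the
deepest runqueue each CPU saw at enqueue time (+1 for the incoming task,
since the probe fires before it is counted).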

I dug up a late 2016 mitigation patch, wedged it into 2021, and added a
BFH that does eradicate my stacking depth woes.  I'll probably keep it,
at least for a while.  Not because I feel anything on my desktop, but
because meeting this again (and it being deeper than I recall) reminded
me of measuring its impact on NFS etc., making it a tad difficult to
ignore.  Oh well, I'll forget about it eventually... BTDT.

(standard beloved Granny disclaimer)

sched: Add SIS stacking mitigation feature

Select the least loaded LLC CPU for cache cold tasks and kthreads.

Addendum: renamed the feature, and gave it a big brother.

Not-Signed-off-by: Mike Galbraith <efault@gmx.de>
---
 kernel/sched/fair.c     |   54 ++++++++++++++++++++++++++++++++++++++++++++----
 kernel/sched/features.h |    5 ++++
 2 files changed, 55 insertions(+), 4 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6261,6 +6261,26 @@ static inline int select_idle_smt(struct

 #endif /* CONFIG_SCHED_SMT */

+static bool task_is_kthread_or_cold(struct task_struct *p)
+{
+	s64 cold = sysctl_sched_migration_cost;
+
+	if (p->flags & PF_KTHREAD)
+		return true;
+	if (cold <= 0)
+		return false;
+	return task_rq(p)->clock_task - p->se.exec_start > cold;
+}
+
+static bool cpu_load_inconsistent(int cpu)
+{
+	struct rq *rq = cpu_rq(cpu);
+
+	if (rq->cfs.h_nr_running < 4)
+		return false;
+	return cpu_load(rq) << 2 < scale_load_down(rq->cfs.load.weight);
+}
+
 /*
  * Scan the LLC domain for idle CPUs; this is dynamically regulated by
  * comparing the average scan cost (tracked in sd->avg_scan_cost) against the
@@ -6269,7 +6289,7 @@ static inline int select_idle_smt(struct
 static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool has_idle_core, int target)
 {
 	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
-	int i, cpu, idle_cpu = -1, nr = INT_MAX;
+	int i, cpu, idle_cpu = -1, nr = INT_MAX, ld = -1;
 	struct rq *this_rq = this_rq();
 	int this = smp_processor_id();
 	struct sched_domain *this_sd;
@@ -6309,6 +6329,21 @@ static int select_idle_cpu(struct task_s
 		time = cpu_clock(this);
 	}

+	/*
+	 * For kthreads/cold tasks (SIS_SPOT) or an inconsistently loaded
+	 * target (SIS_REXY), pick the least loaded LLC CPU if none is idle.
+	 */
+	if ((sched_feat(SIS_SPOT) && task_is_kthread_or_cold(p)) ||
+	    (sched_feat(SIS_REXY) && cpu_load_inconsistent(target))) {
+		idle_cpu = task_cpu(p);
+		if (idle_cpu != target && !cpus_share_cache(idle_cpu, target))
+			idle_cpu = target;
+		if (unlikely(!sched_cpu_cookie_match(cpu_rq(idle_cpu), p)))
+			idle_cpu = -1;
+		else
+			ld = scale_load_down(cpu_rq(idle_cpu)->cfs.load.weight);
+	}
+
 	for_each_cpu_wrap(cpu, cpus, target + 1) {
 		if (has_idle_core) {
 			i = select_idle_core(p, cpu, cpus, &idle_cpu);
@@ -6317,10 +6352,21 @@ static int select_idle_cpu(struct task_s

 		} else {
 			if (!--nr)
-				return -1;
-			idle_cpu = __select_idle_cpu(cpu, p);
-			if ((unsigned int)idle_cpu < nr_cpumask_bits)
+				return idle_cpu;
+			i = __select_idle_cpu(cpu, p);
+			if ((unsigned int)i < nr_cpumask_bits) {
+				idle_cpu = i;
 				break;
+			}
+		}
+		if (ld > 0 && sched_cpu_cookie_match(cpu_rq(cpu), p)) {
+			i = scale_load_down(cpu_rq(cpu)->cfs.load.weight);
+			if (i < ld) {
+				idle_cpu = cpu;
+				if (i == 0)
+					break;
+				ld = i;
+			}
 		}
 	}

--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -95,3 +95,8 @@ SCHED_FEAT(LATENCY_WARN, false)

 SCHED_FEAT(ALT_PERIOD, true)
 SCHED_FEAT(BASE_SLICE, true)
+
+/* Mitigate PELT induced stacking.  */
+SCHED_FEAT(SIS_SPOT, true)
+/* Spot's 12 ton big brother. */
+SCHED_FEAT(SIS_REXY, true)
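
If anyone wants to poke at it, both knobs should be flippable at
runtime like any other sched feature; the path below assumes a kernel
where the debug bits have already moved under /sys/kernel/debug/sched:

  echo NO_SIS_SPOT > /sys/kernel/debug/sched/features	# mitigation off
  echo SIS_SPOT > /sys/kernel/debug/sched/features	# and back on
  echo NO_SIS_REXY > /sys/kernel/debug/sched/features	# big brother off

For reference, SIS_REXY triggers when a runqueue holds at least four
cfs tasks but its load average is still below a quarter of its
instantaneous queue weight, i.e. PELT hasn't caught up with a burst of
enqueues, which is exactly the window where wakeup placement stacks
tasks.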

