From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
To: Ingo Molnar <mingo@kernel.org>, Peter Zijlstra <peterz@infradead.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Rik van Riel <riel@surriel.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: [PATCH 5/6] sched/numa: Reset scan rate whenever task moves across nodes
Date: Fri, 3 Aug 2018 11:44:00 +0530
Message-ID: <1533276841-16341-6-git-send-email-srikar@linux.vnet.ibm.com>
In-Reply-To: <1533276841-16341-1-git-send-email-srikar@linux.vnet.ibm.com>
Currently the task scan rate is reset when the NUMA balancer migrates
the task to a different node. If the NUMA balancer initiates a swap, the
reset applies only to the task that initiated the swap. Similarly, no
scan rate reset is done if the task is migrated across nodes by the
traditional load balancer.

Instead, move the scan reset to migrate_task_rq. This ensures that a
task moved out of its preferred node either gets back to its preferred
node quickly or finds a new preferred node. Doing so is fair to all
tasks migrating across nodes.
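
As a rough illustration of the behaviour described above, here is a
minimal user-space sketch, not the kernel code itself. struct task, the
4-CPU/2-node topology, and the fixed 1000 ms value returned by the
task_scan_start() stand-in are all assumptions made for the example;
only the shape of the check mirrors the hunk added to
migrate_task_rq_fair().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical topology for illustration: 4 CPUs, 2 per node. */
#define NR_CPUS        4
#define CPUS_PER_NODE  2

/* Hypothetical stand-in for the few task_struct fields the patch reads. */
struct task {
	int cpu;                          /* CPU the task currently runs on */
	bool has_mm;                      /* false models p->mm == NULL */
	bool exiting;                     /* models PF_EXITING */
	const unsigned long *numa_faults; /* NULL until NUMA faults are tracked */
	unsigned int numa_scan_period;    /* current scan period, in ms */
};

/* Stand-in for cpu_to_node() on the toy topology. */
static int cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu / CPUS_PER_NODE;
}

/* Stand-in for task_scan_start(); the constant is a placeholder value. */
static unsigned int task_scan_start(const struct task *t)
{
	(void)t;
	return 1000;
}

/*
 * Move the task to new_cpu. The scan period is reset only when the
 * task has NUMA fault statistics and the move crosses a node boundary,
 * regardless of who requested the migration.
 */
static void migrate_task(struct task *t, int new_cpu)
{
	if (t->has_mm && !t->exiting && t->numa_faults &&
	    cpu_to_node(t->cpu) != cpu_to_node(new_cpu))
		t->numa_scan_period = task_scan_start(t);
	t->cpu = new_cpu;
}
```

In this sketch a move from CPU 0 to CPU 1 stays on node 0 and leaves
the scan period alone, while a move from CPU 1 to CPU 2 crosses into
node 1 and resets it. Because the check sits in the common migration
path, both tasks in a swap and tasks moved by the load balancer take
the same route.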

specjbb2005 / bops/JVM / higher bops are better

on 2 Socket/2 Node Intel
JVMS  Prev     Current  %Change
4     210118   208862   -0.597759
1     313171   307007   -1.96825

on 2 Socket/4 Node Power8 (PowerNV)
JVMS  Prev     Current  %Change
8     91027.5  89911.4  -1.22611
1     216460   216176   -0.131202

on 2 Socket/2 Node Power9 (PowerNV)
JVMS  Prev     Current  %Change
4     191918   196078   2.16759
1     207043   214664   3.68088

on 4 Socket/4 Node Power7
JVMS  Prev     Current  %Change
8     58462.1  60719.2  3.86079
1     108334   112615   3.95167

dbench / transactions / higher numbers are better

on 2 Socket/2 Node Intel
count  Min      Max      Avg      Variance  %Change
5      11851.8  11937.3  11890.9  33.5169
5      12511.7  12559.4  12539.5  15.5883   5.45459

on 2 Socket/4 Node Power8 (PowerNV)
count  Min      Max      Avg      Variance  %Change
5      4791     5016.08  4962.55  85.9625
5      4709.28  4979.28  4919.32  105.126   -0.871125

on 2 Socket/2 Node Power9 (PowerNV)
count  Min      Max      Avg      Variance  %Change
5      9353.43  9380.49  9369.6   9.04361
5      9388.38  9406.29  9395.1   5.98959   0.272157

on 4 Socket/4 Node Power7
count  Min      Max      Avg      Variance  %Change
5      149.518  215.412  179.083  21.5903
5      157.71   184.929  174.754  10.7275   -2.41731
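
The %Change figures above appear consistent with the relative change of
the newer value over the older one, in percent. For dbench this is
taken over the Avg column of the two rows. A small sketch of that
arithmetic follows; pct_change() is a name chosen for this example and
is not part of the patch or of the benchmark tooling.

```c
#include <assert.h>

/* Relative change of cur over prev, expressed in percent. */
static double pct_change(double prev, double cur)
{
	assert(prev != 0.0);
	return (cur - prev) / prev * 100.0;
}
```

For instance, pct_change(210118, 208862) gives roughly -0.5978, in line
with the first specjbb2005 row, and pct_change(11890.9, 12539.5) gives
roughly 5.4546, in line with the first dbench table.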
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
kernel/sched/fair.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a5936ed..4ea0eff 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1837,12 +1837,6 @@ static int task_numa_migrate(struct task_struct *p)
 	if (env.best_cpu == -1)
 		return -EAGAIN;
 
-	/*
-	 * Reset the scan period if the task is being rescheduled on an
-	 * alternative node to recheck if the tasks is now properly placed.
-	 */
-	p->numa_scan_period = task_scan_start(p);
-
 	best_rq = cpu_rq(env.best_cpu);
 	if (env.best_task == NULL) {
 		ret = migrate_task_to(p, env.best_cpu);
@@ -6361,6 +6355,19 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu __maybe_unus
 
 	/* We have migrated, no longer consider this task hot */
 	p->se.exec_start = 0;
+
+#ifdef CONFIG_NUMA_BALANCING
+	if (!p->mm || (p->flags & PF_EXITING))
+		return;
+
+	if (p->numa_faults) {
+		int src_nid = cpu_to_node(task_cpu(p));
+		int dst_nid = cpu_to_node(new_cpu);
+
+		if (src_nid != dst_nid)
+			p->numa_scan_period = task_scan_start(p);
+	}
+#endif
 }
 
 static void task_dead_fair(struct task_struct *p)
--
1.8.3.1