From: Ingo Molnar <mingo@kernel.org>
To: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
LKML <linux-kernel@vger.kernel.org>,
Mel Gorman <mgorman@techsingularity.net>,
Rik van Riel <riel@surriel.com>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH 1/6] sched/numa: Stop multiple tasks from moving to the cpu at the same time
Date: Mon, 10 Sep 2018 10:42:37 +0200
Message-ID: <20180910084237.GC48257@gmail.com>
In-Reply-To: <1533276841-16341-2-git-send-email-srikar@linux.vnet.ibm.com>
* Srikar Dronamraju <srikar@linux.vnet.ibm.com> wrote:
> Task migration under numa balancing can happen in parallel. More than
> one task might choose to migrate to the same cpu at the same time. This
> can result in
> - During task swap, choosing a task that was not part of the evaluation.
> - During task swap, task which just got moved into its preferred node,
>   moving to a completely different node.
> - During task swap, task failing to move to the preferred node, will have
>   to wait an extra interval for the next migrate opportunity.
> - During task movement, multiple task movements can cause load imbalance.

Please capitalize both 'CPU' and 'NUMA' in changelogs and code comments.

> This problem is more likely if there are more cores per node or more
> nodes in the system.
>
> Use a per run-queue variable to check if numa-balance is active on the
> run-queue.
>
> specjbb2005 / bops/JVM / higher bops are better
> on 2 Socket/2 Node Intel
> JVMS  Prev     Current  %Change
> 4     199709   206350   3.32534
> 1     330830   319963   -3.28477
>
>
> on 2 Socket/4 Node Power8 (PowerNV)
> JVMS  Prev     Current  %Change
> 8     89011.9  89627.8  0.69193
> 1     218946   211338   -3.47483
>
>
> on 2 Socket/2 Node Power9 (PowerNV)
> JVMS  Prev     Current  %Change
> 4     180473   186539   3.36117
> 1     212805   220344   3.54268
>
>
> on 4 Socket/4 Node Power7
> JVMS  Prev     Current  %Change
> 8     56941.8  56836    -0.185804
> 1     111686   112970   1.14965
>
>
> dbench / transactions / higher numbers are better
> on 2 Socket/2 Node Intel
> count  Min      Max      Avg      Variance  %Change
> 5      12029.8  12124.6  12060.9  34.0076
> 5      13136.1  13170.2  13150.2  14.7482   9.03166
>
>
> on 2 Socket/4 Node Power8 (PowerNV)
> count  Min      Max      Avg      Variance  %Change
> 5      4968.51  5006.62  4981.31  13.4151
> 5      4319.79  4998.19  4836.53  261.109   -2.90646
>
>
> on 2 Socket/2 Node Power9 (PowerNV)
> count  Min      Max      Avg      Variance  %Change
> 5      9342.92  9381.44  9363.92  12.8587
> 5      9325.56  9402.7   9362.49  25.9638   -0.0152714
>
>
> on 4 Socket/4 Node Power7
> count  Min      Max      Avg      Variance  %Change
> 5      143.4    188.892  170.225  16.9929
> 5      132.581  191.072  170.554  21.6444   0.193274

I have applied this patch, but the benchmark dump with zero commentary is annoying, as the numbers
do not show unconditional advantages: there are some performance increases and some regressions.

In particular this:
> dbench / transactions / higher numbers are better
> on 2 Socket/4 Node Power8 (PowerNV)
> count  Min      Max      Avg      Variance  %Change
> 5      4968.51  5006.62  4981.31  13.4151
> 5      4319.79  4998.19  4836.53  261.109   -2.90646
is concerning: not only did we lose some performance, variance went up by a *lot*. Is this just
a measurement fluke? We cannot know and you didn't comment.
Thanks,
Ingo
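
For readers following the thread: the changelog's "per run-queue variable" can be pictured as a one-word claim on the destination run-queue, taken with an atomic exchange so that only one migrating task wins at a time. Below is a minimal userspace sketch in C11. The names `struct demo_rq`, `numa_balance_active`, `demo_rq_claim()` and `demo_rq_release()` are hypothetical and chosen for illustration; they are not taken from the patch, and the real kernel code may well be structured differently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-CPU run-queue; not the kernel's struct rq. */
struct demo_rq {
	atomic_int numa_balance_active;	/* 1 while a NUMA migration targets this rq */
};

/*
 * Try to claim @rq as a migration destination. Returns true for exactly
 * one caller until the claim is released; any other caller has to pick
 * a different CPU or wait for a later opportunity.
 */
static bool demo_rq_claim(struct demo_rq *rq)
{
	/* atomic_exchange() returns the previous value: 0 means we won. */
	return atomic_exchange(&rq->numa_balance_active, 1) == 0;
}

/* Drop the claim once the move or swap has completed or been abandoned. */
static void demo_rq_release(struct demo_rq *rq)
{
	int was_set = atomic_exchange(&rq->numa_balance_active, 0);

	assert(was_set == 1);	/* releasing an unclaimed rq would be a bug */
	(void)was_set;		/* silence unused warning under NDEBUG */
}
```

In this sketch a task that loses the exchange simply skips that CPU for the current round, which is one way parallel migrations toward the same CPU could be serialized without taking a lock.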