linux-kernel.vger.kernel.org archive mirror
From: Jirka Hladky <jhladky@redhat.com>
To: linux-kernel@vger.kernel.org
Subject: sched : performance regression 24% between 4.4rc4 and 4.3 kernel
Date: Fri, 11 Dec 2015 15:17:50 +0100	[thread overview]
Message-ID: <CAE4VaGAziCRGRXPPO6YtiHLXqeLUMcuYCh3mkmGpDjfW8GaetQ@mail.gmail.com> (raw)

Hello,

we are doing performance testing of the new kernel scheduler (commit
53528695ff6d8b77011bc818407c13e30914a946). In most cases we see
performance improvements compared to the 4.3 kernel, with the exception
of the stream benchmark when running on a 4 NUMA node server.

When we run 4 stream benchmark processes on a 4 NUMA node server and
compare the total performance, we see a drop of about 24% compared to
the 4.3 kernel. This is caused by 2 stream instances running on the
same NUMA node while 1 NUMA node runs no stream instance at all. With
kernel 4.3, the load is distributed evenly among all 4 NUMA nodes.
When two stream instances run on the same NUMA node, their runtime is
almost twice as long as that of a single stream instance running alone
on one NUMA node. See the log files [1] below.

Please see the graph comparing stream benchmark results between
kernels 4.3 and 4.4rc4 (for the legend see [2] below).
https://jhladky.fedorapeople.org/sched_stream_kernel_4.3vs4.4rc4/Stream_benchmark_on_4_NUMA_node_server_4.3vs4.4rc4_kernel.png

Could you please help us identify the root cause of this regression?
We don't have the skills to fix the problem ourselves, but we will be
more than happy to test any proposed patch for this issue.

Thanks a lot for your help with this!
Jirka

Further details:

[1] Log files can be downloaded here:
https://jhladky.fedorapeople.org/sched_stream_kernel_4.3vs4.4rc4/4.4RC4_stream_log_files.tar.bz2

$ grep "User time" *log
stream.defaultRun.004streams.loop01.instance001.log:User time:  12.370 seconds
stream.defaultRun.004streams.loop01.instance002.log:User time:  10.560 seconds
stream.defaultRun.004streams.loop01.instance003.log:User time:  19.330 seconds
stream.defaultRun.004streams.loop01.instance004.log:User time:  17.820 seconds
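The slowdown can be quantified directly from these user times; a quick
sanity check in Python (instances 003 and 004 are the two that, per the
node listings below, share NUMA node #3):

```python
# User times (seconds) taken from the log excerpt above.
solo = [12.370, 10.560]    # instances 001 and 002, each alone on a NUMA node
shared = [19.330, 17.820]  # instances 003 and 004, both on NUMA node #3

avg_solo = sum(solo) / len(solo)
avg_shared = sum(shared) / len(shared)
slowdown = avg_shared / avg_solo   # average slowdown from sharing a node
worst = max(shared) / min(solo)    # worst colliding vs. best solo instance

print(f"average slowdown: {slowdown:.2f}x, worst case: {worst:.2f}x")
```

This gives roughly a 1.6x average slowdown for the colliding instances,
with the worst pairing close to 2x.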


$ grep "NUMA nodes:" *log
stream.defaultRun.004streams.loop01.instance001.log:NUMA nodes:     2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2
stream.defaultRun.004streams.loop01.instance002.log:NUMA nodes:     0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
stream.defaultRun.004streams.loop01.instance003.log:NUMA nodes:     3
3 3 3 3 3 3 3 3 0 0 0 0 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 0 0 0 0 0 0 0 0 0 0 3 3 3 3
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
3 3 3 3 3 3 3 3 3 3 3 3 3 3
stream.defaultRun.004streams.loop01.instance004.log:NUMA nodes:     3
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
3 3 0 0 0 0 0 0 0 0 0 0 0 0

=> Please note that NO benchmark is running on NUMA node #1, while
instances #3 and #4 are both running on NUMA node #3. This has a huge
performance impact: the stream instances on node #3 need 19 and 17
seconds to finish, compared to 10 and 12 seconds for the instances
each running alone on a NUMA node.
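The per-sample node IDs in the logs above can be summarized
programmatically. A minimal sketch, assuming each "NUMA nodes:" entry
is simply a list of the node ID sampled per interval (the exact log
format may differ):

```python
from collections import Counter

def summarize_nodes(samples):
    """Return (dominant node, fraction of samples spent on it)."""
    counts = Counter(samples)
    node, hits = counts.most_common(1)[0]
    return node, hits / len(samples)

# Rough shape of the instance003 trace above: mostly node 3, briefly node 0.
instance003 = [3] * 100 + [0] * 16
node, frac = summarize_nodes(instance003)
print(f"dominant node {node}, {frac:.0%} of samples")
```

Applying this per instance makes collisions obvious: two instances with
the same dominant node are competing for one node's memory bandwidth.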

[2] Graph:
https://jhladky.fedorapeople.org/sched_stream_kernel_4.3vs4.4rc4/Stream_benchmark_on_4_NUMA_node_server_4.3vs4.4rc4_kernel.png

Graph Legend:
GREEN line => kernel 4.3
BLUE line  => kernel 4.4rc4
x-axis     => number of parallel stream instances
y-axis     => Sum [1/runtime] over all stream instances
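For reference, the y-axis metric can be computed directly from the user
times in [1]; a minimal sketch for the 4.4rc4 run with 4 instances:

```python
# User times (seconds) of the four 4.4rc4 stream instances from [1].
runtimes = [12.370, 10.560, 19.330, 17.820]

# Aggregate throughput metric used on the y-axis: sum of 1/runtime.
throughput = sum(1.0 / t for t in runtimes)
print(f"aggregate throughput: {throughput:.4f} 1/s")
```

A perfectly balanced run (all instances near the solo runtimes) would
score noticeably higher on this metric, which is the gap the graph shows.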


Details on server: DELL PowerEdge R820, 4x E5-4607 0 @ 2.20GHz and 128GB RAM
http://ark.intel.com/products/64604


Thread overview: 13+ messages
2015-12-11 14:17 Jirka Hladky [this message]
2015-12-12  7:04 ` sched : performance regression 24% between 4.4rc4 and 4.3 kernel Mike Galbraith
2015-12-12 14:16   ` Jirka Hladky
2015-12-12 14:37     ` Mike Galbraith
2015-12-15  0:02       ` Jirka Hladky
2015-12-15  0:04       ` Jirka Hladky
     [not found]       ` <CAE4VaGCgAvvQXDsv=Gn8B0JtTzCnXe0oP63HLQWSCyY_QNOB7g@mail.gmail.com>
2015-12-15  2:12         ` Rik van Riel
2015-12-15  8:49           ` Jirka Hladky
2015-12-16 12:56             ` Jirka Hladky
2015-12-16 13:50               ` Peter Zijlstra
2015-12-16 17:04                 ` Jirka Hladky
     [not found]                   ` <CAE4VaGD49UAsBJn3jgg0kREWqjYz8UnvWOi8zU4d5HgNgNS-sQ@mail.gmail.com>
2015-12-17 15:43                     ` Jirka Hladky
2015-12-18  2:49                       ` Mike Galbraith
