From: Jirka Hladky <jhladky@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@redhat.com>,
	Kamil Kolakowski <kkolakow@redhat.com>
Subject: Re: Kernel 4.7rc3 - Performance drop 30-40% for SPECjbb2005 and SPECjvm2008 benchmarks against 4.6 kernel
Date: Tue, 21 Jun 2016 15:17:52 +0200
Message-ID: <CAE4VaGC595XMvCxJBSso53t6pmsSySvQ2HKEbvABKfHCJAUnHQ@mail.gmail.com>
In-Reply-To: <CAE4VaGCYvAbvNRD65SPfLPGVCspRfUYDAtfPt-VmkukPbt-L4Q@mail.gmail.com>

Hi Peter,

I have an update on this performance issue. I have tested several
kernels and I'm now at the parent of

  2159197d6677 sched/core: Enable increased load resolution on 64-bit kernels

and I still see the performance regression for multithreaded workloads.

There are only 27 commits remaining between v4.6 (last known to be OK)
and the current HEAD:

  6ecdd74962f246dfe8750b7bea481a1c0816315d sched/fair: Generalize the
  load/util averages resolution definition

See below [0].

Any hint as to which commit I should try next?

Thanks a lot!
Jirka

[0]
$ git log --pretty=oneline v4.6..HEAD kernel/sched
6ecdd74962f246dfe8750b7bea481a1c0816315d sched/fair: Generalize the load/util averages resolution definition
2159197d66770ec01f75c93fb11dc66df81fd45b sched/core: Enable increased load resolution on 64-bit kernels
e7904a28f5331c21d17af638cb477c83662e3cb6 locking/lockdep, sched/core: Implement a better lock pinning scheme
eb58075149b7f0300ff19142e6245fe75db2a081 sched/core: Introduce 'struct rq_flags'
3e71a462dd483ce508a723356b293731e7d788ea sched/core: Move task_rq_lock() out of line
64b7aad5798478ffff52e110878ccaae4c3aaa34 Merge branch 'sched/urgent' into sched/core, to pick up fixes before applying new changes
f98db6013c557c216da5038d9c52045be55cd039 sched/core: Add switch_mm_irqs_off() and use it in the scheduler
594dd290cf5403a9a5818619dfff42d8e8e0518e sched/cpufreq: Optimize cpufreq update kicker to avoid update multiple times
fec148c000d0f9ac21679601722811eb60b4cc52 sched/deadline: Fix a bug in dl_overflow()
9fd81dd5ce0b12341c9f83346f8d32ac68bd3841 sched/fair: Optimize !CONFIG_NO_HZ_COMMON CPU load updates
1f41906a6fda1114debd3898668bd7ab6470ee41 sched/fair: Correctly handle nohz ticks CPU load accounting
cee1afce3053e7aa0793fbd5f2e845fa2cef9e33 sched/fair: Gather CPU load functions under a more conventional namespace
a2c6c91f98247fef0fe75216d607812485aeb0df sched/fair: Call cpufreq hook in additional paths
41e0d37f7ac81297c07ba311e4ad39465b8c8295 sched/fair: Do not call cpufreq hook unless util changed
21e96f88776deead303ecd30a17d1d7c2a1776e3 sched/fair: Move cpufreq hook to update_cfs_rq_load_avg()
1f621e028baf391f6684003e32e009bc934b750f sched/fair: Fix asym packing to select correct CPU
bd92883051a0228cc34996b8e766111ba10c9aac sched/cpuacct: Check for NULL when using task_pt_regs()
2c923e94cd9c6acff3b22f0ae29cfe65e2658b40 sched/clock: Make local_clock()/cpu_clock() inline
c78b17e28cc2c2df74264afc408bdc6aaf3fbcc8 sched/clock: Remove pointless test in cpu_clock/local_clock
fb90a6e93c0684ab2629a42462400603aa829b9c sched/debug: Don't dump sched debug info in SysRq-W
2b8c41daba327c633228169e8bd8ec067ab443f8 sched/fair: Initiate a new task's util avg to a bounded value
1c3de5e19fc96206dd086e634129d08e5f7b1000 sched/fair: Update comments after a variable rename
47252cfbac03644ee4a3adfa50c77896aa94f2bb sched/core: Add preempt checks in preempt_schedule() code
bfdb198ccd99472c5bded689699eb30dd06316bb sched/numa: Remove unnecessary NUMA dequeue update from non-SMP kernels
d02c071183e1c01a76811c878c8a52322201f81f sched/fair: Reset nr_balance_failed after active balancing
d740037fac7052e49450f6fa1454f1144a103b55 sched/cpuacct: Split usage accounting into user_usage and sys_usage
5ca3726af7f66a8cc71ce4414cfeb86deb784491 sched/cpuacct: Show all possible CPUs in cpuacct output
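
For reference, a rough sketch of how the remaining range could be
bisected. The helper name bisect_steps is made up for illustration and
is not from this thread. The git commands in the comments assume a
kernel git tree where v4.6 is known good and HEAD is known bad. The
step count is only an estimate, since merges in the range can change
how git splits it.

```shell
#!/bin/sh
# Hypothetical helper: estimate how many build-and-test rounds a
# bisection over N candidate commits needs, i.e. ceil(log2(N)).
bisect_steps() {
    n=$1
    steps=0
    span=1
    # Double the covered span until it reaches the number of candidates.
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

# 27 candidate commits, as in the listing above.
bisect_steps 27

# Assumed workflow, to be run inside the kernel tree:
#   git bisect start HEAD v4.6 -- kernel/sched   # bad = HEAD, good = v4.6
#   ... build, boot, run the benchmark ...
#   git bisect good      # or: git bisect bad, depending on the result
#   git bisect reset     # when finished
```

Limiting the bisection to kernel/sched mirrors the path filter used in
the git log command above; if the regression comes from outside that
directory, the path limit would have to be dropped.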

On Fri, Jun 17, 2016 at 1:04 AM, Jirka Hladky <jhladky@redhat.com> wrote:
>> > we see performance drop 30-40% for SPECjbb2005 and SPECjvm2008
>> Blergh, of course I don't have those.. :/
>
> SPECjvm2008 is publicly available.
> https://www.spec.org/download.html
>
> We will prepare a reproducer and attach it to the BZ.
>
>> What kind of config and userspace setup? Do you run this cruft in a
>> cgroup of sorts?
>
>  No, we don't do any special setup except to control the number of threads.
>
> Thanks for the hints which commits are most likely the root cause for
> this. We will try to find the commit which has caused it.
>
> Jirka
>
>
>
> On Thu, Jun 16, 2016 at 7:22 PM, Peter Zijlstra <peterz@infradead.org> wrote:
>> On Thu, Jun 16, 2016 at 06:38:50PM +0200, Jirka Hladky wrote:
>>> Hello,
>>>
>>> we see performance drop 30-40% for SPECjbb2005 and SPECjvm2008
>>
>> Blergh, of course I don't have those.. :/
>>
>>> benchmarks starting from 4.7.0-0.rc0 kernel compared to 4.6 kernel.
>>>
>>> We have tested kernels 4.7.0-0.rc1 and 4.7.0-0.rc3 and these are as
>>> well affected.
>>>
>>> We have observed the drop on variety of different x86_64 servers with
>>> different configuration (different CPU models, RAM sizes, both with
>>> Hyper Threading ON and OFF, different NUMA configurations (2 and 4
>>> NUMA nodes)
>>
>> What kind of config and userspace setup? Do you run this cruft in a
>> cgroup of sorts?
>>
>> If so, does it change anything if you run it in the root cgroup?
>>
>>> Linpack and Stream benchmarks do not show any performance drop.
>>>
>>> The performance drop increases with higher number of threads. The
>>> maximum number of threads in each benchmark is the same as number of
>>> CPUs.
>>>
>>> We have opened a BZ to track the progress:
>>> https://bugzilla.kernel.org/show_bug.cgi?id=120481
>>>
>>> You can find more details along with graphs and tables there.
>>>
>>> Do you have any hints which commit should we try to reverse?
>>
>> There were only 66 commits or so, and I think we can rule out the
>> hotplug changes, which should reduce it even further.
>>
>> You could see what the parent of this one does:
>>
>>   2159197d6677 sched/core: Enable increased load resolution on 64-bit kernels
>>
>> If not that, maybe the parent of:
>>
>>   c58d25f371f5 sched/fair: Move record_wakee()
>>
>> After that I suppose you'll have to go bisect.
>>

