* Potential scheduler regression
@ 2017-07-05 15:42 Ben Guthro
  2017-07-05 16:48 ` Peter Zijlstra
  0 siblings, 1 reply; 11+ messages in thread
From: Ben Guthro @ 2017-07-05 15:42 UTC (permalink / raw)
  To: Linux Kernel Mailing List, Peter Zijlstra (Intel),
	Linus Torvalds, Mike Galbraith, Thomas Gleixner, Ingo Molnar

Hello,

I've been in the process of updating our kernel in our appliance VM
from an old LTS kernel (4.1.y) to something a bit more modern (4.9.y)
- and ran into a performance regression, when our QA team was running
some regression suites.


I bisected this behavior to the following commit, introduced in the 4.9
merge window:


commit 1b568f0aabf280555125bc7cefc08321ff0ebaba
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Mon May 9 10:38:41 2016 +0200

    sched/core: Optimize SCHED_SMT

    Avoid pointless SCHED_SMT code when running on !SMT hardware.

    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Mike Galbraith <efault@gmx.de>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>



It seems that this commit can have a performance impact on virtual
machines running on VMware ESXi.
Now... this seemed strange to me, since the bulk of the change appears
to come down to this code in kernel/sched/core.c:

#ifdef CONFIG_SCHED_SMT
DEFINE_STATIC_KEY_FALSE(sched_smt_present);

static void sched_init_smt(void)
{
    /*
     * We've enumerated all CPUs and will assume that if any CPU
     * has SMT siblings, CPU0 will too.
     */
    if (cpumask_weight(cpu_smt_mask(0)) > 1)
        static_branch_enable(&sched_smt_present);
}
#else


I have verified that, in this environment, the vCPU presented to the
guest has hyperthreading enabled, but only presents a single
hyperthread: cpumask_weight(cpu_smt_mask(0)) resolves to 1.

This is backed up by the /proc/cpuinfo and lscpu output as well.

Results of /proc/cpuinfo for cpu0:

~$ cat /proc/cpuinfo | head -27
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 63
model name      : Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz
stepping        : 2
microcode       : 0x2d
cpu MHz         : 2599.732
cache size      : 35840 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 15
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss ht syscall nx
pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology
tsc_reliable nonstop_tsc aperfmperf eagerfpu pni pclmulqdq ssse3 fma
cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes
xsave avx f16c rdrand hypervisor lahf_lm abm epb fsgsbase tsc_adjust
bmi1 avx2 smep bmi2 invpcid xsaveopt dtherm ida arat pln pts
bugs            :
bogomips        : 5199.99
clflush size    : 64
cache_alignment : 64
address sizes   : 42 bits physical, 48 bits virtual
power management:

Results of "lscpu" :

~$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 63
Stepping:              2
CPU MHz:               2599.732
BogoMIPS:              5199.99
Hypervisor vendor:     VMware
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              35840K
NUMA node0 CPU(s):     0-3
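
For anyone who wants to cross-check the single-sibling result from user
space, here is a minimal sketch. It assumes the standard sysfs file
/sys/devices/system/cpu/cpu0/topology/thread_siblings_list is available
in the guest. The helper names (cpu_list_weight, smt_siblings_present,
read_cpu0_siblings) are made up for illustration and are not kernel
APIs; the comparison only mirrors the "weight > 1" test in
sched_init_smt() quoted above, it does not call into the kernel.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Count the CPUs named in a sysfs-style CPU list such as "0", "0,4" or
 * "0-1". Parsing stops at the first token that is not a number or range.
 * (Hypothetical helper, not a kernel or libc function.)
 */
int cpu_list_weight(const char *list)
{
	int weight = 0;
	const char *p = list;

	assert(list != NULL);

	while (*p) {
		char *end;
		long lo = strtol(p, &end, 10);
		long hi = lo;

		if (end == p || lo < 0)		/* no CPU number here */
			break;
		if (*end == '-') {		/* a range "lo-hi" */
			const char *q = end + 1;

			hi = strtol(q, &end, 10);
			if (end == q)
				break;
		}
		if (hi >= lo)
			weight += (int)(hi - lo + 1);

		/* skip separators between entries and the trailing newline */
		p = end;
		while (*p == ',' || *p == ' ' || *p == '\n')
			p++;
	}
	return weight;
}

/* User-space analogue of the check: more than one thread sibling on CPU0? */
int smt_siblings_present(const char *cpu0_sibling_list)
{
	return cpu_list_weight(cpu0_sibling_list) > 1;
}

/* Read CPU0's sibling list from sysfs; returns 0 on success, -1 on failure. */
int read_cpu0_siblings(char *buf, size_t len)
{
	FILE *f = fopen("/sys/devices/system/cpu/cpu0/topology/thread_siblings_list", "r");

	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return 0;
}
```

With "Thread(s) per core: 1" as reported by lscpu above, the sibling
list for cpu0 would presumably contain only CPU 0, so
smt_siblings_present() would return 0, the same outcome that leaves
sched_smt_present disabled.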



Now - I suppose that we could just carry around a patch, to revert
this commit, whenever we wanted to update our kernel...but I'd prefer
to understand the problem better - since this is currently falling
into the category of "being able to have progress, or understanding,
but not necessarily both"


In advance of the question - the tip of the tree (v4.12 at an earlier
RC version) was tested, and at that time, no discernable difference
was noticed, from 4.9, WRT this performance regression in our tests.
However - this code remains unchanged AFAICT in v4.12


This is my first dip back into LKML in probably 4 years - so apologies
if this has been previously discussed. I tried to do my research ahead
of time - but either this has not been discussed, or my google-fu was
weak when attempting the search parameters.


Do you happen to know what might be happening here?



Thank you in advance, for any information that you may be able to provide


Ben Guthro
SimpliVity / Hewlett Packard Enterprise


* Re: Potential scheduler regression
  2017-07-05 15:42 Potential scheduler regression Ben Guthro
@ 2017-07-05 16:48 ` Peter Zijlstra
  2017-07-07 20:55   ` Ben Guthro
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2017-07-05 16:48 UTC (permalink / raw)
  To: Ben Guthro
  Cc: Linux Kernel Mailing List, Linus Torvalds, Mike Galbraith,
	Thomas Gleixner, Ingo Molnar

On Wed, Jul 05, 2017 at 11:42:46AM -0400, Ben Guthro wrote:
> Hello,
> 
> I've been in the process of updating our kernel in our appliance VM
> from an old LTS kernel (4.1.y) to something a bit more modern (4.9.y)
> - and ran into a performance regression, when our QA team was running
> some regression suites.
> 
> 
> I bisect this behavior to the following commit, introduced in the 4.9
> merge window:
> 

Could you test a later kernel that includes commit:

  1ad3aaf3fcd2 ("sched/core: Implement new approach to scale select_idle_cpu()")


* Re: Potential scheduler regression
  2017-07-05 16:48 ` Peter Zijlstra
@ 2017-07-07 20:55   ` Ben Guthro
  2017-07-10  9:25     ` Peter Zijlstra
  0 siblings, 1 reply; 11+ messages in thread
From: Ben Guthro @ 2017-07-07 20:55 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Linux Kernel Mailing List, Linus Torvalds, Mike Galbraith,
	Thomas Gleixner, Ingo Molnar

On Wed, Jul 5, 2017 at 12:48 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, Jul 05, 2017 at 11:42:46AM -0400, Ben Guthro wrote:
>> Hello,
>>
>> I've been in the process of updating our kernel in our appliance VM
>> from an old LTS kernel (4.1.y) to something a bit more modern (4.9.y)
>> - and ran into a performance regression, when our QA team was running
>> some regression suites.
>>
>>
>> I bisect this behavior to the following commit, introduced in the 4.9
>> merge window:
>>
>
> Could you test a later kernel that includes commit:
>
>   1ad3aaf3fcd2 ("sched/core: Implement new approach to scale select_idle_cpu()")
>

(resend without html)

Apologies on the delay - it took a bit to get the machines, to run the test.

I am happy to report that the kernel at 1ad3aaf3fcd2 seems to regain
the performance lost with 1b568f0aab, in our test environment.

Since 4.9 is an LTS kernel - is this appropriate to suggest to be
included in the linux-stable list?


* Re: Potential scheduler regression
  2017-07-07 20:55   ` Ben Guthro
@ 2017-07-10  9:25     ` Peter Zijlstra
  2017-07-10 15:26       ` Greg KH
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2017-07-10  9:25 UTC (permalink / raw)
  To: Ben Guthro
  Cc: Linux Kernel Mailing List, Linus Torvalds, Mike Galbraith,
	Thomas Gleixner, Ingo Molnar, Greg Kroah-Hartman

On Fri, Jul 07, 2017 at 04:55:27PM -0400, Ben Guthro wrote:

> Apologies on the delay - it took a bit to get the machines, to run the test.
> 
> I am happy to report that the kernel at 1ad3aaf3fcd2, seems to regain
> performance loss from 1b568f0aab, in our test environment.

Excellent.

> Since 4.9 is an LTS kernel - is this appropriate to suggest to be
> included in the linux-stable list?

Hurm... so I typically suck at (also) keeping track of -stable things.

But given LTS, there might be a few more commits that might make sense
to include.

This series corrects NUMA topology creation:

8c0334697dc3 ("sched/topology: Refactor function build_overlap_sched_groups()")
c743f0a5c50f ("sched/fair, cpumask: Export for_each_cpu_wrap()")
0372dd2736e0 ("sched/topology: Fix building of overlapping sched-groups")
91eaed0d6131 ("sched/topology: Simplify build_overlap_sched_groups()")
b0151c25548c ("sched/debug: Print the scheduler topology group mask")
a420b0630362 ("sched/topology: Verify the first group matches the child domain")
f32d782e31bf ("sched/topology: Optimize build_group_mask()")
c20e1ea4b61c ("sched/topology: Move comment about asymmetric node setups")
af85596c74de ("sched/topology: Remove FORCE_SD_OVERLAP")
73bb059f9b8a ("sched/topology: Fix overlapping sched_group_mask")
8d5dc5126bb2 ("sched/topology: Small cleanup")
005f874dd284 ("sched/topology: Add sched_group_capacity debugging")
1676330ecfa8 ("sched/topology: Fix overlapping sched_group_capacity")

(there's a few more commits at the end of that series that add comments
and renames a bunch of stuff which doesn't really fix anything).

Cures a BUG_ON through sysrq:

896bbb252258 ("sched/core: Allow __sched_setscheduler() in interrupts when PI is not used")


Performance issues:


502ce005ab95 ("sched/fair: Use task_groups instead of leaf_cfs_rq_list to walk all cfs_rqs")
a9e7f6544b9c ("sched/fair: Fix O(nr_cgroups) in load balance path")

c249f255aab8 ("sched/rt: Minimize rq->lock contention in do_sched_rt_period_timer()")

8655d5497735 ("sched/numa: Use down_read_trylock() for the mmap_sem")



And then the patch you want for this:

1ad3aaf3fcd2 ("sched/core: Implement new approach to scale select_idle_cpu()")



I have no real idea how many of those qualify for 4.9, but I know most
of those patches ended up in the various enterprise distros in some form
or other.


In any case, some of that will need some massaging to apply and it
obviously needs testing of sorts. So I'm not sure what all makes sense
to do.


* Re: Potential scheduler regression
  2017-07-10  9:25     ` Peter Zijlstra
@ 2017-07-10 15:26       ` Greg KH
  2017-07-10 15:43         ` Ben Guthro
  0 siblings, 1 reply; 11+ messages in thread
From: Greg KH @ 2017-07-10 15:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ben Guthro, Linux Kernel Mailing List, Linus Torvalds,
	Mike Galbraith, Thomas Gleixner, Ingo Molnar

On Mon, Jul 10, 2017 at 11:25:32AM +0200, Peter Zijlstra wrote:
> On Fri, Jul 07, 2017 at 04:55:27PM -0400, Ben Guthro wrote:
> 
> > Apologies on the delay - it took a bit to get the machines, to run the test.
> > 
> > I am happy to report that the kernel at 1ad3aaf3fcd2, seems to regain
> > performance loss from 1b568f0aab, in our test environment.
> 
> Excellent.
> 
> > Since 4.9 is an LTS kernel - is this appropriate to suggest to be
> > included in the linux-stable list?
> 
> Hurm... so I typically suck at (also) keeping track of -stable things.
> 
> But given LTS, there might be a few more commits that might make sense
> to include.
> 
> This series corrects NUMA topology creation:
> 
> 8c0334697dc3 ("sched/topology: Refactor function build_overlap_sched_groups()")
> c743f0a5c50f ("sched/fair, cpumask: Export for_each_cpu_wrap()")
> 0372dd2736e0 ("sched/topology: Fix building of overlapping sched-groups")
> 91eaed0d6131 ("sched/topology: Simplify build_overlap_sched_groups()")
> b0151c25548c ("sched/debug: Print the scheduler topology group mask")
> a420b0630362 ("sched/topology: Verify the first group matches the child domain")
> f32d782e31bf ("sched/topology: Optimize build_group_mask()")
> c20e1ea4b61c ("sched/topology: Move comment about asymmetric node setups")
> af85596c74de ("sched/topology: Remove FORCE_SD_OVERLAP")
> 73bb059f9b8a ("sched/topology: Fix overlapping sched_group_mask")
> 8d5dc5126bb2 ("sched/topology: Small cleanup")
> 005f874dd284 ("sched/topology: Add sched_group_capacity debugging")
> 1676330ecfa8 ("sched/topology: Fix overlapping sched_group_capacity")
> 
> (there's a few more commits at the end of that series that add comments
> and renames a bunch of stuff which doesn't really fix anything).
> 
> Cures a BUG_ON through sysrq:
> 
> 896bbb252258 ("sched/core: Allow __sched_setscheduler() in interrupts when PI is not used")
> 
> 
> Performance issues:
> 
> 
> 502ce005ab95 ("sched/fair: Use task_groups instead of leaf_cfs_rq_list to walk all cfs_rqs")
> a9e7f6544b9c ("sched/fair: Fix O(nr_cgroups) in load balance path")
> 
> c249f255aab8 ("sched/rt: Minimize rq->lock contention in do_sched_rt_period_timer()")
> 
> 8655d5497735 ("sched/numa: Use down_read_trylock() for the mmap_sem")
> 
> 
> 
> And then the patch you want for this:
> 
> 1ad3aaf3fcd2 ("sched/core: Implement new approach to scale select_idle_cpu()")
> 
> 
> 
> I have no real idea how much of any those qualify for 4.9, but know most
> of those patches ended up in the various enterprise distros in some form
> or other.

If people have experience with these in the "enterprise" distros, or
any other tree, and want to provide me with backported, and tested,
patches, I'll be glad to consider them for stable kernels.

thanks,

greg k-h


* Re: Potential scheduler regression
  2017-07-10 15:26       ` Greg KH
@ 2017-07-10 15:43         ` Ben Guthro
  2017-07-11  8:30           ` Ingo Molnar
  0 siblings, 1 reply; 11+ messages in thread
From: Ben Guthro @ 2017-07-10 15:43 UTC (permalink / raw)
  To: Greg KH
  Cc: Peter Zijlstra, Linux Kernel Mailing List, Linus Torvalds,
	Mike Galbraith, Thomas Gleixner, Ingo Molnar

On Mon, Jul 10, 2017 at 11:26 AM, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Mon, Jul 10, 2017 at 11:25:32AM +0200, Peter Zijlstra wrote:
>> On Fri, Jul 07, 2017 at 04:55:27PM -0400, Ben Guthro wrote:
>>
>> > Apologies on the delay - it took a bit to get the machines, to run the test.
>> >
>> > I am happy to report that the kernel at 1ad3aaf3fcd2, seems to regain
>> > performance loss from 1b568f0aab, in our test environment.
>>
>> Excellent.
>>
>> > Since 4.9 is an LTS kernel - is this appropriate to suggest to be
>> > included in the linux-stable list?
>>
>> Hurm... so I typically suck at (also) keeping track of -stable things.
>>
>> But given LTS, there might be a few more commits that might make sense
>> to include.
>>
>> This series corrects NUMA topology creation:
>>
>> 8c0334697dc3 ("sched/topology: Refactor function build_overlap_sched_groups()")
>> c743f0a5c50f ("sched/fair, cpumask: Export for_each_cpu_wrap()")
>> 0372dd2736e0 ("sched/topology: Fix building of overlapping sched-groups")
>> 91eaed0d6131 ("sched/topology: Simplify build_overlap_sched_groups()")
>> b0151c25548c ("sched/debug: Print the scheduler topology group mask")
>> a420b0630362 ("sched/topology: Verify the first group matches the child domain")
>> f32d782e31bf ("sched/topology: Optimize build_group_mask()")
>> c20e1ea4b61c ("sched/topology: Move comment about asymmetric node setups")
>> af85596c74de ("sched/topology: Remove FORCE_SD_OVERLAP")
>> 73bb059f9b8a ("sched/topology: Fix overlapping sched_group_mask")
>> 8d5dc5126bb2 ("sched/topology: Small cleanup")
>> 005f874dd284 ("sched/topology: Add sched_group_capacity debugging")
>> 1676330ecfa8 ("sched/topology: Fix overlapping sched_group_capacity")
>>
>> (there's a few more commits at the end of that series that add comments
>> and renames a bunch of stuff which doesn't really fix anything).
>>
>> Cures a BUG_ON through sysrq:
>>
>> 896bbb252258 ("sched/core: Allow __sched_setscheduler() in interrupts when PI is not used")
>>
>>
>> Performance issues:
>>
>>
>> 502ce005ab95 ("sched/fair: Use task_groups instead of leaf_cfs_rq_list to walk all cfs_rqs")
>> a9e7f6544b9c ("sched/fair: Fix O(nr_cgroups) in load balance path")
>>
>> c249f255aab8 ("sched/rt: Minimize rq->lock contention in do_sched_rt_period_timer()")
>>
>> 8655d5497735 ("sched/numa: Use down_read_trylock() for the mmap_sem")
>>
>>
>>
>> And then the patch you want for this:
>>
>> 1ad3aaf3fcd2 ("sched/core: Implement new approach to scale select_idle_cpu()")
>>
>>
>>
>> I have no real idea how much of any those qualify for 4.9, but know most
>> of those patches ended up in the various enterprise distros in some form
>> or other.
>
> If people have experience with these in the "enterprise" distros, or
> any other tree, and want to provide me with backported, and tested,
> patches, I'll be glad to consider them for stable kernels.
>
> thanks,
>
> greg k-h

I tried to do a simple cherry-pick of the suggested patches - but they
apply against files that don't exist in the 4.9 series.
This means it would be a more complicated port, that, without having
the original author's context - there's a non-zero possibility that
I'd botch the port. As such, I'll yield to Peter's expertise here.

In my release of 4.9 - I'm planning on doing the simpler revert of
1b568f0aab that introduced the performance degradation, rather than
pulling in lots of code from newer kernels.

Thanks
Ben G


* Re: Potential scheduler regression
  2017-07-10 15:43         ` Ben Guthro
@ 2017-07-11  8:30           ` Ingo Molnar
  2017-07-11  9:55             ` Greg KH
  0 siblings, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2017-07-11  8:30 UTC (permalink / raw)
  To: Ben Guthro
  Cc: Greg KH, Peter Zijlstra, Linux Kernel Mailing List,
	Linus Torvalds, Mike Galbraith, Thomas Gleixner


* Ben Guthro <ben@guthro.net> wrote:

> > If people have experience with these in the "enterprise" distros, or any other 
> > tree, and want to provide me with backported, and tested, patches, I'll be 
> > glad to consider them for stable kernels.
> >
> > thanks,
> >
> > greg k-h
> 
> I tried to do a simple cherry-pick of the suggested patches - but they
> apply against files that don't exist in the 4.9 series.

I think there are only two strategies to maintain a backport which work in the 
long run:

 - insist on the simplest fixes and pure cherry-picks

 - or pick up _everything_ to sync up the two versions.

The latter would mean a lot of commits - and I'm afraid it would also involve the 
scheduler header split-up, which literally involves hundreds of files plus 
perpetual build-breakage risk, so it's a no-no.

> In my release of 4.9 - I'm planning on doing the simpler revert of 1b568f0aab 
> that introduced the performance degradation, rather than pulling in lots of code 
> from newer kernels.

That sounds much saner - I'd even Ack that approach for -stable as a special 
exception, than to complicate things with excessive backports.

Thanks,

	Ingo


* Re: Potential scheduler regression
  2017-07-11  8:30           ` Ingo Molnar
@ 2017-07-11  9:55             ` Greg KH
  2017-07-13 19:24               ` Ben Guthro
  0 siblings, 1 reply; 11+ messages in thread
From: Greg KH @ 2017-07-11  9:55 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Ben Guthro, Peter Zijlstra, Linux Kernel Mailing List,
	Linus Torvalds, Mike Galbraith, Thomas Gleixner

On Tue, Jul 11, 2017 at 10:30:14AM +0200, Ingo Molnar wrote:
> 
> * Ben Guthro <ben@guthro.net> wrote:
> 
> > > If people have experience with these in the "enterprise" distros, or any other 
> > > tree, and want to provide me with backported, and tested, patches, I'll be 
> > > glad to consider them for stable kernels.
> > >
> > > thanks,
> > >
> > > greg k-h
> > 
> > I tried to do a simple cherry-pick of the suggested patches - but they
> > apply against files that don't exist in the 4.9 series.
> 
> I think there are only two strategies to maintain a backport which work in the 
> long run:
> 
>  - insist on the simplest fixes and pure cherry-picks
> 
>  - or pick up _everything_ to sync up the two versions.
> 
> The latter would mean a lot of commits - and I'm afraid it would also involve the 
> scheduler header split-up, which literally involves hundreds of files plus 
> perpetual build-breakage risk, so it's a no-no.
> 
> > In my release of 4.9 - I'm planning on doing the simpler revert of 1b568f0aab 
> > that introduced the performance degradation, rather than pulling in lots of code 
> > from newer kernels.
> 
> That sounds much saner - I'd even Ack that approach for -stable as a special 
> exception, than to complicate things with excessive backports.

Ok, I'll revert that for the next stable release after this one that is
currently under review.

thanks,

greg k-h


* Re: Potential scheduler regression
  2017-07-11  9:55             ` Greg KH
@ 2017-07-13 19:24               ` Ben Guthro
  2017-07-14  6:54                 ` Greg KH
  0 siblings, 1 reply; 11+ messages in thread
From: Ben Guthro @ 2017-07-13 19:24 UTC (permalink / raw)
  To: Greg KH
  Cc: Ingo Molnar, Peter Zijlstra, Linux Kernel Mailing List,
	Linus Torvalds, Mike Galbraith, Thomas Gleixner

On Tue, Jul 11, 2017 at 5:55 AM, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Tue, Jul 11, 2017 at 10:30:14AM +0200, Ingo Molnar wrote:
>>
>> * Ben Guthro <ben@guthro.net> wrote:
>>
>> > > If people have experience with these in the "enterprise" distros, or any other
>> > > tree, and want to provide me with backported, and tested, patches, I'll be
>> > > glad to consider them for stable kernels.
>> > >
>> > > thanks,
>> > >
>> > > greg k-h
>> >
>> > I tried to do a simple cherry-pick of the suggested patches - but they
>> > apply against files that don't exist in the 4.9 series.
>>
>> I think there are only two strategies to maintain a backport which work in the
>> long run:
>>
>>  - insist on the simplest fixes and pure cherry-picks
>>
>>  - or pick up _everything_ to sync up the two versions.
>>
>> The latter would mean a lot of commits - and I'm afraid it would also involve the
>> scheduler header split-up, which literally involves hundreds of files plus
>> perpetual build-breakage risk, so it's a no-no.
>>
>> > In my release of 4.9 - I'm planning on doing the simpler revert of 1b568f0aab
>> > that introduced the performance degradation, rather than pulling in lots of code
>> > from newer kernels.
>>
>> That sounds much saner - I'd even Ack that approach for -stable as a special
>> exception, than to complicate things with excessive backports.
>
> Ok, I'll revert that for the next stable release after this one that is
> currently under review.
>
> thanks,
>
> greg k-h

Greg,

Just for clarity - is the "next one" 4.9.38 (posted today for review)
- or the one following?

Thanks,
Ben


* Re: Potential scheduler regression
  2017-07-13 19:24               ` Ben Guthro
@ 2017-07-14  6:54                 ` Greg KH
  2017-07-19  8:02                   ` Greg KH
  0 siblings, 1 reply; 11+ messages in thread
From: Greg KH @ 2017-07-14  6:54 UTC (permalink / raw)
  To: Ben Guthro
  Cc: Ingo Molnar, Peter Zijlstra, Linux Kernel Mailing List,
	Linus Torvalds, Mike Galbraith, Thomas Gleixner

On Thu, Jul 13, 2017 at 03:24:02PM -0400, Ben Guthro wrote:
> On Tue, Jul 11, 2017 at 5:55 AM, Greg KH <gregkh@linuxfoundation.org> wrote:
> > On Tue, Jul 11, 2017 at 10:30:14AM +0200, Ingo Molnar wrote:
> >>
> >> * Ben Guthro <ben@guthro.net> wrote:
> >>
> >> > > If people have experience with these in the "enterprise" distros, or any other
> >> > > tree, and want to provide me with backported, and tested, patches, I'll be
> >> > > glad to consider them for stable kernels.
> >> > >
> >> > > thanks,
> >> > >
> >> > > greg k-h
> >> >
> >> > I tried to do a simple cherry-pick of the suggested patches - but they
> >> > apply against files that don't exist in the 4.9 series.
> >>
> >> I think there are only two strategies to maintain a backport which work in the
> >> long run:
> >>
> >>  - insist on the simplest fixes and pure cherry-picks
> >>
> >>  - or pick up _everything_ to sync up the two versions.
> >>
> >> The latter would mean a lot of commits - and I'm afraid it would also involve the
> >> scheduler header split-up, which literally involves hundreds of files plus
> >> perpetual build-breakage risk, so it's a no-no.
> >>
> >> > In my release of 4.9 - I'm planning on doing the simpler revert of 1b568f0aab
> >> > that introduced the performance degradation, rather than pulling in lots of code
> >> > from newer kernels.
> >>
> >> That sounds much saner - I'd even Ack that approach for -stable as a special
> >> exception, than to complicate things with excessive backports.
> >
> > Ok, I'll revert that for the next stable release after this one that is
> > currently under review.
> >
> > thanks,
> >
> > greg k-h
> 
> Greg,
> 
> Just for clarity - is the "next one" 4.9.38 (posted today for review)
> - or the one following?

Doh, I forgot it for this release, sorry about that, will try to get to
it for the next one after this.

greg k-h


* Re: Potential scheduler regression
  2017-07-14  6:54                 ` Greg KH
@ 2017-07-19  8:02                   ` Greg KH
  0 siblings, 0 replies; 11+ messages in thread
From: Greg KH @ 2017-07-19  8:02 UTC (permalink / raw)
  To: Ben Guthro
  Cc: Ingo Molnar, Peter Zijlstra, Linux Kernel Mailing List,
	Linus Torvalds, Mike Galbraith, Thomas Gleixner

On Fri, Jul 14, 2017 at 08:54:07AM +0200, Greg KH wrote:
> On Thu, Jul 13, 2017 at 03:24:02PM -0400, Ben Guthro wrote:
> > On Tue, Jul 11, 2017 at 5:55 AM, Greg KH <gregkh@linuxfoundation.org> wrote:
> > > On Tue, Jul 11, 2017 at 10:30:14AM +0200, Ingo Molnar wrote:
> > >>
> > >> * Ben Guthro <ben@guthro.net> wrote:
> > >>
> > >> > > If people have experience with these in the "enterprise" distros, or any other
> > >> > > tree, and want to provide me with backported, and tested, patches, I'll be
> > >> > > glad to consider them for stable kernels.
> > >> > >
> > >> > > thanks,
> > >> > >
> > >> > > greg k-h
> > >> >
> > >> > I tried to do a simple cherry-pick of the suggested patches - but they
> > >> > apply against files that don't exist in the 4.9 series.
> > >>
> > >> I think there are only two strategies to maintain a backport which work in the
> > >> long run:
> > >>
> > >>  - insist on the simplest fixes and pure cherry-picks
> > >>
> > >>  - or pick up _everything_ to sync up the two versions.
> > >>
> > >> The latter would mean a lot of commits - and I'm afraid it would also involve the
> > >> scheduler header split-up, which literally involves hundreds of files plus
> > >> perpetual build-breakage risk, so it's a no-no.
> > >>
> > >> > In my release of 4.9 - I'm planning on doing the simpler revert of 1b568f0aab
> > >> > that introduced the performance degradation, rather than pulling in lots of code
> > >> > from newer kernels.
> > >>
> > >> That sounds much saner - I'd even Ack that approach for -stable as a special
> > >> exception, than to complicate things with excessive backports.
> > >
> > > Ok, I'll revert that for the next stable release after this one that is
> > > currently under review.
> > >
> > > thanks,
> > >
> > > greg k-h
> > 
> > Greg,
> > 
> > Just for clarity - is the "next one" 4.9.38 (posted today for review)
> > - or the one following?
> 
> Doh, I forgot it for this release, sorry about that, will try to get to
> it for the next one after this.

Now reverted.

thanks,

greg k-h

