linux-kernel.vger.kernel.org archive mirror
* Performance issue since 3.2.6
@ 2013-01-18 21:05 Olivier Doucet
  2013-01-18 21:24 ` Borislav Petkov
  0 siblings, 1 reply; 17+ messages in thread
From: Olivier Doucet @ 2013-01-18 21:05 UTC (permalink / raw)
  To: linux-kernel

Hello,

I think I found a performance issue in the kernel. This problem was
introduced in 3.2.6 and affects all later versions, including 3.7.1
(latest tested).

I measured several kernel builds with a homebrew LAMP platform
benchmark (250 runs, average value kept).

Kernel 3.2.0 had been in production so far, and I tried an upgrade to
3.7.1 (the latest kernel at that time). No software other than the
kernel was modified. The kernel was built with the same .config file
(new options left at their default values).
The difference in performance was huge:
Kernel 3.7.1 : ~ 3250 queries / second
Kernel 3.2.0 : ~ 4300 queries / second

Yes, this is a 25% performance drop ...
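
As a quick check of the figure above, the relative drop can be computed from the two throughput numbers. This is only a sketch; the function name is made up for illustration and is not part of the benchmark used in this thread.

```python
def relative_drop(baseline_qps: float, new_qps: float) -> float:
    """Return the fractional throughput loss relative to the baseline."""
    return (baseline_qps - new_qps) / baseline_qps


# Figures reported above: ~4300 q/s on 3.2.0, ~3250 q/s on 3.7.1.
drop = relative_drop(4300, 3250)
print(f"{drop:.1%}")  # prints 24.4%
```

So "25%" is a fair rounding of the measured loss.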

To narrow things down, I tested several kernels :
3.2.0 to 3.2.5 : OK
3.2.6, 3.2.11, 3.2.28, 3.2.36, 3.7.1 : PERFORMANCE DROP
On the affected kernels, the performance drop is always the same.

At this point, I knew that the bug was introduced in version 3.2.6. I
then used git bisect (amazing tool, btw) to find the faulty commit:
f51d67a64f32cd81ea8b67ca964fb7cf7e783b2e
    PM / QoS: CPU C-state breakage with PM Qos change
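
The idea behind bisecting can be sketched as a plain binary search over an ordered history. This is a simplified model, not how git itself is implemented: real history is a graph rather than a list, the release tags below stand in for individual commits, and the predicate stands in for "build, boot, run the benchmark".

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search an ordered history for the first bad revision.

    Assumes revisions[0] is known good, revisions[-1] is known bad, and
    every revision after the first bad one is also bad.
    """
    good, bad = 0, len(revisions) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(revisions[mid]):
            bad = mid   # the culprit is at mid or earlier
        else:
            good = mid  # the culprit is after mid
    return revisions[bad]


# Illustrative only: tags from the test list above, with a stand-in predicate.
tags = ["3.2.0", "3.2.1", "3.2.2", "3.2.3", "3.2.4", "3.2.5", "3.2.6"]
print(first_bad(tags, lambda tag: tag == "3.2.6"))  # prints 3.2.6
```

Each step halves the remaining range, which is why only a handful of builds are needed even for a long history.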

Next test: I took 3.2.6 and reverted this patch; performance was
back to normal.
I also reverted this patch on 3.2.36 (latest 3.2.x), and performance
was also OK. I was unable to revert this patch on 3.7.1 (the structure
of the source file include/linux/pm_qos.h had changed too heavily).

The kernel was built on an x86_64 platform with binutils 2.19.1 and gcc 4.4.0.
.config file used: https://gist.github.com/4567342
The benchmark was run on a dual-CPU Intel Xeon L5630 machine with 4 GB of RAM.

What can I do to resolve this problem ?
Thanks !

Olivier


* Re: Performance issue since 3.2.6
  2013-01-18 21:05 Performance issue since 3.2.6 Olivier Doucet
@ 2013-01-18 21:24 ` Borislav Petkov
  2013-01-18 21:46   ` Rafael J. Wysocki
  0 siblings, 1 reply; 17+ messages in thread
From: Borislav Petkov @ 2013-01-18 21:24 UTC (permalink / raw)
  To: Olivier Doucet; +Cc: linux-kernel, Venkatesh Pallipadi, Rafael J. Wysocki

Good description.

I'm leaving the whole email in for reference, see below:

On Fri, Jan 18, 2013 at 10:05:41PM +0100, Olivier Doucet wrote:
> Hello,
> 
> I think I found a performance issue in kernel. This problem was
> introduced in 3.2.6 and is affecting all version, including 3.7.1
> (latest tested).
> 
> I measured several kernel builds with a homebrew LAMP platform
> benchmark (250 runs, average value kept).
> 
> Kernel 3.2.0 was on production so far, and I tried an upgrade to 3.7.1
> (latest kernel at that time). No software other than the kernel was
> modified. Kernel was built with the same .config file (new options
> left with default value).
> Difference in performance was quite huge :
> Kernel 3.7.1 : ~ 3250 queries / second
> Kernel 3.2.0 : ~ 4300 queries / second
> 
> Yes, this is a 25% performance drop ...
> 
> To narrow things down, I tested several kernels :
> 3.2.0 to 3.2.5 : OK
> 3.2.6, 3.2.11, 3.2.28, 3.2.36, 3.7.1 : PERFORMANCE DROP
> On faulty kernels, performance drop is always the same.
> 
> At this step, I know that bug was introduced in version 3.2.6. I then
> used git bisect (amazing tool btw) to find the faulty commit :
> f51d67a64f32cd81ea8b67ca964fb7cf7e783b2e	PM / QoS: CPU C-state
> breakage with PM Qos change
> 
> Next test : I used a 3.2.6 and reverted this patch : performance was
> back to normal.
> I also reverted this patch on 3.2.36 (latest 3.2.X), and performance
> was also OK. I was unable to revert this patch on 3.7.1 (structure of
> the source file  /include/linux/pm_qos.h changed too heavily).
> 
> Kernel was built on a x86_64 platform with binutils 2.19.1 and gcc 4.4.0.
> .config file used : https://gist.github.com/4567342
> Benchmark was run on a dual CPU INTEL L5630 with 4GB of RAM
> 
> What can I do to resolve this problem ?

Btw, the commit-id you've given is the stable commit-id but this is
still ok, the mainline commit is mentioned in the commit message and is:

commit d020283dc694c9ec31b410f522252f7a8397e67d
Author: Venkatesh Pallipadi <venki@google.com>
Date:   Fri Feb 3 22:22:25 2012 +0100

    PM / QoS: CPU C-state breakage with PM Qos change

So, let's invite the parties from the commit to CC, see what they have
to say.

Thanks.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


* Re: Performance issue since 3.2.6
  2013-01-18 21:46   ` Rafael J. Wysocki
@ 2013-01-18 21:43     ` Olivier Doucet
  2013-01-18 22:17       ` Rafael J. Wysocki
  0 siblings, 1 reply; 17+ messages in thread
From: Olivier Doucet @ 2013-01-18 21:43 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Borislav Petkov, linux-kernel, Venkatesh Pallipadi

> Olivier, may I see the kernel .config file?
>
> Rafael

I put it online because I didn't want to spam LKML with a long text file:
https://gist.github.com/4567342

Olivier


* Re: Performance issue since 3.2.6
  2013-01-18 21:24 ` Borislav Petkov
@ 2013-01-18 21:46   ` Rafael J. Wysocki
  2013-01-18 21:43     ` Olivier Doucet
  0 siblings, 1 reply; 17+ messages in thread
From: Rafael J. Wysocki @ 2013-01-18 21:46 UTC (permalink / raw)
  To: Borislav Petkov, Olivier Doucet; +Cc: linux-kernel, Venkatesh Pallipadi

On Friday, January 18, 2013 10:24:48 PM Borislav Petkov wrote:
> Good description.
> 
> I'm leaving the whole email in for reference, see below:
> 
> On Fri, Jan 18, 2013 at 10:05:41PM +0100, Olivier Doucet wrote:
> > Hello,
> > 
> > I think I found a performance issue in kernel. This problem was
> > introduced in 3.2.6 and is affecting all version, including 3.7.1
> > (latest tested).
> > 
> > I measured several kernel builds with a homebrew LAMP platform
> > benchmark (250 runs, average value kept).
> > 
> > Kernel 3.2.0 was on production so far, and I tried an upgrade to 3.7.1
> > (latest kernel at that time). No software other than the kernel was
> > modified. Kernel was built with the same .config file (new options
> > left with default value).
> > Difference in performance was quite huge :
> > Kernel 3.7.1 : ~ 3250 queries / second
> > Kernel 3.2.0 : ~ 4300 queries / second
> > 
> > Yes, this is a 25% performance drop ...
> > 
> > To narrow things down, I tested several kernels :
> > 3.2.0 to 3.2.5 : OK
> > 3.2.6, 3.2.11, 3.2.28, 3.2.36, 3.7.1 : PERFORMANCE DROP
> > On faulty kernels, performance drop is always the same.
> > 
> > At this step, I know that bug was introduced in version 3.2.6. I then
> > used git bisect (amazing tool btw) to find the faulty commit :
> > f51d67a64f32cd81ea8b67ca964fb7cf7e783b2e	PM / QoS: CPU C-state
> > breakage with PM Qos change
> > 
> > Next test : I used a 3.2.6 and reverted this patch : performance was
> > back to normal.
> > I also reverted this patch on 3.2.36 (latest 3.2.X), and performance
> > was also OK. I was unable to revert this patch on 3.7.1 (structure of
> > the source file  /include/linux/pm_qos.h changed too heavily).
> > 
> > Kernel was built on a x86_64 platform with binutils 2.19.1 and gcc 4.4.0.
> > .config file used : https://gist.github.com/4567342
> > Benchmark was run on a dual CPU INTEL L5630 with 4GB of RAM
> > 
> > What can I do to resolve this problem ?
> 
> Btw, the commit-id you've given is the stable commit-id but this is
> still ok, the mainline commit is mentioned in the commit message and is:
> 
> commit d020283dc694c9ec31b410f522252f7a8397e67d
> Author: Venkatesh Pallipadi <venki@google.com>
> Date:   Fri Feb 3 22:22:25 2012 +0100
> 
>     PM / QoS: CPU C-state breakage with PM Qos change
> 
> So, let's invite the parties from the commit to CC, see what they have
> to say.

Thanks Boris!

Olivier, may I see the kernel .config file?

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: Performance issue since 3.2.6
  2013-01-18 21:43     ` Olivier Doucet
@ 2013-01-18 22:17       ` Rafael J. Wysocki
  2013-01-21 13:18         ` Olivier Doucet
  0 siblings, 1 reply; 17+ messages in thread
From: Rafael J. Wysocki @ 2013-01-18 22:17 UTC (permalink / raw)
  To: Olivier Doucet; +Cc: Borislav Petkov, linux-kernel, Venkatesh Pallipadi

On Friday, January 18, 2013 10:43:45 PM Olivier Doucet wrote:
> > Olivier, may I see the kernel .config file?
> >
> > Rafael
> 
> I put it online, cause I didn't want to spam lkml with a long text file :
> https://gist.github.com/4567342

OK, thanks.

It looks like the value returned by pm_qos_request() with CONFIG_PM unset
for PM_QOS_CPU_DMA_LATENCY is not the right one.

I'll need to find out what value should be returned instead.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: Performance issue since 3.2.6
  2013-01-18 22:17       ` Rafael J. Wysocki
@ 2013-01-21 13:18         ` Olivier Doucet
  2013-01-21 13:29           ` Rafael J. Wysocki
  0 siblings, 1 reply; 17+ messages in thread
From: Olivier Doucet @ 2013-01-21 13:18 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Borislav Petkov, linux-kernel

Hello,

2013/1/18 Rafael J. Wysocki <rjw@sisk.pl>:
> It looks like the value returned by pm_qos_request() with CONFIG_PM unset
> for PM_QOS_CPU_DMA_LATENCY is not the right one.

FYI, I benchmarked a new version with :
CONFIG_PM_RUNTIME=y
CONFIG_PM=y

but the performance loss is still present.

Olivier


* Re: Performance issue since 3.2.6
  2013-01-21 13:18         ` Olivier Doucet
@ 2013-01-21 13:29           ` Rafael J. Wysocki
  2013-01-21 14:26             ` Olivier Doucet
  0 siblings, 1 reply; 17+ messages in thread
From: Rafael J. Wysocki @ 2013-01-21 13:29 UTC (permalink / raw)
  To: Olivier Doucet; +Cc: Borislav Petkov, linux-kernel

On Monday, January 21, 2013 02:18:29 PM Olivier Doucet wrote:
> Hello,
> 
> 2013/1/18 Rafael J. Wysocki <rjw@sisk.pl>:
> > It looks like the value returned by pm_qos_request() with CONFIG_PM unset
> > for PM_QOS_CPU_DMA_LATENCY is not the right one.
> 
> FYI, I benchmarked a new version with :
> CONFIG_PM_RUNTIME=y
> CONFIG_PM=y
> 
> but the performance loss is still present.

In that case it is not very likely that the commit you bisected to
really introduced the problem, because it doesn't change things for
CONFIG_PM=y.

Does reverting that commit still help?

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: Performance issue since 3.2.6
  2013-01-21 13:29           ` Rafael J. Wysocki
@ 2013-01-21 14:26             ` Olivier Doucet
  2013-01-21 22:49               ` Rafael J. Wysocki
  0 siblings, 1 reply; 17+ messages in thread
From: Olivier Doucet @ 2013-01-21 14:26 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Borislav Petkov, linux-kernel

>> FYI, I benchmarked a new version with :
>> CONFIG_PM_RUNTIME=y
>> CONFIG_PM=y
>>
>> but the performance loss is still present.
>
> In that case it is not quite likely that the commit you bisected to
> really introduced the problem, because it doesn't change things for
> CONFIG_PM=y.
>
> Does reverting that commit still help?

I tested several combinations. The results follow:
3.2.6 (base) + CONFIG_PM unset  => BAD
3.2.6 (base) + CONFIG_PM=y  => BAD
3.2.6 (base) + patch reverted + CONFIG_PM=y  => BAD
3.2.6 (base) + patch reverted + CONFIG_PM unset  => GOOD

So if I understand correctly, the targeted patch introduced the bug when
CONFIG_PM is unset, but there is another bug when this option is set.
I'll try to track down that commit.

Olivier


* Re: Performance issue since 3.2.6
  2013-01-21 14:26             ` Olivier Doucet
@ 2013-01-21 22:49               ` Rafael J. Wysocki
  2013-01-21 22:57                 ` Olivier Doucet
  0 siblings, 1 reply; 17+ messages in thread
From: Rafael J. Wysocki @ 2013-01-21 22:49 UTC (permalink / raw)
  To: Olivier Doucet; +Cc: Borislav Petkov, linux-kernel

On Monday, January 21, 2013 03:26:10 PM Olivier Doucet wrote:
> >> FYI, I benchmarked a new version with :
> >> CONFIG_PM_RUNTIME=y
> >> CONFIG_PM=y
> >>
> >> but the performance loss is still present.
> >
> > In that case it is not quite likely that the commit you bisected to
> > really introduced the problem, because it doesn't change things for
> > CONFIG_PM=y.
> >
> > Does reverting that commit still help?
> 
> I tested several combinations. Results follows :
> 3.2.6 (base) + CONFIG_PM unset  => BAD
> 3.2.6 (base) + CONFIG_PM=y  => BAD
> 3.2.6 (base) + patch reverted + CONFIG_PM=y  => BAD
> 3.2.6 (base) + patch reverted + CONFIG_PM unset  => GOOD

I see.  Two bugs, then.

> So if I understand right, the targeted patch introduced the bug when
> CONFIG_PM is unset, but there is an other bug when this var is set.
> I'll try to track this commit.

Thanks a lot!

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: Performance issue since 3.2.6
  2013-01-21 22:49               ` Rafael J. Wysocki
@ 2013-01-21 22:57                 ` Olivier Doucet
  2013-01-21 23:09                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 17+ messages in thread
From: Olivier Doucet @ 2013-01-21 22:57 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Borislav Petkov, linux-kernel

Good evening,

>> I'll try to track this commit.
>
> Thanks a lot!

I failed to find a working version today with CONFIG_PM=y
(tested 3.2.6, 3.2.5, 3.2.0, 3.0.1, 2.6.32.8). I'll try older kernels tomorrow.

I'm very sceptical now: how can a bug with such a big impact (25%
performance drop) survive for so many years with no one noticing it?
The good point is that the benchmark is easy to run (it takes less than
an hour to test a new version). If you have patches to be tested, feel
free to ask :)

In the meantime, do you think I can compile a working 3.7 kernel?
Maybe by enabling / disabling specific options?

Olivier


* Re: Performance issue since 3.2.6
  2013-01-21 22:57                 ` Olivier Doucet
@ 2013-01-21 23:09                   ` Rafael J. Wysocki
  2013-01-31 15:15                     ` Olivier Doucet
  0 siblings, 1 reply; 17+ messages in thread
From: Rafael J. Wysocki @ 2013-01-21 23:09 UTC (permalink / raw)
  To: Olivier Doucet; +Cc: Borislav Petkov, linux-kernel

On Monday, January 21, 2013 11:57:06 PM Olivier Doucet wrote:
> Good evening,
> 
> >> I'll try to track this commit.
> >
> > Thanks a lot!
> 
> I failed to find a working version today with CONFIG_PM=y
> (tested 3.2.6, 3.2.5, 3.2.0, 3.0.1, 2.6.32.8). I'll try older kernels tomorrow.

I don't think that's necessary.  It looks like CONFIG_PM=y has never really
worked for you, but you weren't aware of that.

> I'm very sceptical now : how can a bug with such a big impact (25%
> performance drop) can survive for so many years, with no one seeing it
> ?
> Good point is that benchmark can be easily done (it took less than an
> hour to test a new version). If you have patches to be tested, feel
> free to ask :)
> 
> In the meanwhile, do you think I can compile a working 3.7 kernel ?
> Maybe by activating / deactivating specific options ?

I'm really not sure.  It looks like CONFIG_PM makes PM QoS affect cpuidle
for you in a wrong way, so to speak.

I suppose it's time to look into the code and see what makes the difference.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: Performance issue since 3.2.6
  2013-01-21 23:09                   ` Rafael J. Wysocki
@ 2013-01-31 15:15                     ` Olivier Doucet
  2013-02-12 21:09                       ` Olivier Doucet
  0 siblings, 1 reply; 17+ messages in thread
From: Olivier Doucet @ 2013-01-31 15:15 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Borislav Petkov, linux-kernel

Hello,

> I'm really not sure.  It looks like CONFIG_PM makes PM QoS affect cpuidle
> for you in a wrong way, so to speak.
>
> I suppose it's time to look into the code and see what makes the difference.

I'm not a kernel guru, but I can help by testing any patches or
suggestions. Let me know what I can do to get this fixed.

Olivier


* Re: Performance issue since 3.2.6
  2013-01-31 15:15                     ` Olivier Doucet
@ 2013-02-12 21:09                       ` Olivier Doucet
  2013-05-17 18:17                         ` Olivier Doucet
  0 siblings, 1 reply; 17+ messages in thread
From: Olivier Doucet @ 2013-02-12 21:09 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Borislav Petkov, linux-kernel

Hello,

A quick update on my latest tests :
I was able to compile a working 3.7.1 kernel (by 'working', I mean
with no performance penalty). I'm sure 3.7.7 will be OK also (do you
want me to test latest RC of 3.8 ?)

I had to disable CONFIG_ACPI_PROCESSOR to disable power management.
So now these two options are unset :
CONFIG_CPU_IDLE
CONFIG_ACPI_PROCESSOR
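
When comparing builds like this, it can help to check mechanically which options a given .config sets. The following is a minimal sketch that relies on the usual .config conventions (`CONFIG_FOO=value` for set options, `# CONFIG_FOO is not set` for unset ones). The function name and the sample fragment are invented for illustration; the sample is not the content of the gist linked below.

```python
import re
from typing import Dict

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kconfig(text: str) -> Dict[str, str]:
    """Map option names to their values; explicitly unset options map to 'n'."""
    options: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


# Hypothetical fragment, only to show the two line formats.
sample = """\
CONFIG_X86_64=y
# CONFIG_CPU_IDLE is not set
# CONFIG_ACPI_PROCESSOR is not set
"""
opts = parse_kconfig(sample)
for name in ("CONFIG_CPU_IDLE", "CONFIG_ACPI_PROCESSOR"):
    print(name, opts.get(name, "n"))
```

Options that do not appear in the file at all are treated as unset by the `opts.get(name, "n")` lookup.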

I've posted the whole .config file here :
https://gist.github.com/odoucet/4773390

I'll be glad to test any patch that may help reactivate PM on my
system (CPU Intel Xeon L5630)

Olivier


* Re: Performance issue since 3.2.6
  2013-02-12 21:09                       ` Olivier Doucet
@ 2013-05-17 18:17                         ` Olivier Doucet
  2013-05-17 19:50                           ` Srivatsa S. Bhat
  0 siblings, 1 reply; 17+ messages in thread
From: Olivier Doucet @ 2013-05-17 18:17 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Borislav Petkov, linux-kernel

Hello,

This performance penalty is still present in kernel 3.9.2, and
CONFIG_PM can no longer be disabled.

I was able to build a working 3.9.2 (meaning with no penalty) with the
following config and patch:
CONFIG_PM=y
CONFIG_PM_SLEEP=y
CONFIG_PM_SLEEP_SMP=y
CONFIG_CPU_IDLE=y
CONFIG_ACPI=y
CONFIG_ACPI_PROCESSOR=y

Patch : https://gist.github.com/odoucet/5600630

I know this patch is not perfect, because it is just equivalent to
reverting commit f51d67a64f32cd81ea8b67ca964fb7cf7e783b2e.

I really want this to be fixed in kernel, so I would be glad to test
any patch / config file you want.




* Re: Performance issue since 3.2.6
  2013-05-17 18:17                         ` Olivier Doucet
@ 2013-05-17 19:50                           ` Srivatsa S. Bhat
  2013-05-17 23:51                             ` Rafael J. Wysocki
  0 siblings, 1 reply; 17+ messages in thread
From: Srivatsa S. Bhat @ 2013-05-17 19:50 UTC (permalink / raw)
  To: Olivier Doucet
  Cc: Rafael J. Wysocki, Borislav Petkov, linux-kernel,
	Srivatsa S. Bhat, Linux PM mailing list

On 05/17/2013 11:47 PM, Olivier Doucet wrote:
> Hello,
> 
> This performance penalty is still present in kernel 3.9.2. And
> CONFIG_PM cannot be deactivated anymore.
> 
> I was able to make a working 3.9.2 (meaning with no penalty)  with
> following config and patch :
> CONFIG_PM=y
> CONFIG_PM_SLEEP=y
> CONFIG_PM_SLEEP_SMP=y
> CONFIG_CPU_IDLE=y
> CONFIG_ACPI=y
> CONFIG_ACPI_PROCESSOR=y
> 
> Patch : https://gist.github.com/odoucet/5600630
> 
> I know this patch is not perfect because it is just equivalent to
> rollback commit f51d67a64f32cd81ea8b67ca964fb7cf7e783b2e ;
> 
> I really want this to be fixed in kernel, so I would be glad to test
> any patch / config file you want.
>

I went through your previous mails and here is what I think:
This is not a regression that needs to be fixed. Instead, it
seems to me that you started depending on the _flaw_ introduced by
commit e8db0be124 (PM QoS: Move and rename the implementation files).

Your requirement is very simple: you don't want CPUs to go to deep
idle states, since your benchmark is very performance critical.

Commit e8db0be124 made the mistake of returning 0 in pm_qos_request()
when CONFIG_PM was unset. And that has the effect of disabling deeper
idle states, which is exactly what you wanted.

But, as noted by commit d020283d (PM / QoS: CPU C-state breakage with
PM Qos change), this is quite wrong: it makes the system consume a
*lot* of CPU power, since the CPUs never go to idle states and instead
keep polling.

Now, you might ask why is it wrong to set the default value to 0
(IOW, disable deep idle states) when CONFIG_PM is unset? Again, commit
d020283d answers that indirectly - not every power-management
configuration falls under CONFIG_PM, like CONFIG_CPU_IDLE,
CONFIG_INTEL_IDLE etc. So we need a sane default for pm_qos_request()
when CONFIG_PM is unset, to prevent the power usage from shooting
through the roof and surprising the user.
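
To make the mechanism concrete, here is a rough model of how a latency limit constrains the choice of idle state. It is a simplification under stated assumptions, not the kernel's governor code: the state table and its exit latencies are invented for illustration, and a real governor also weighs the predicted idle duration. The "no constraint" constant is modelled on PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE, which to my knowledge is defined as 2000 seconds expressed in microseconds.

```python
from typing import List, Tuple

# (name, exit latency in microseconds); values are invented for illustration.
IDLE_STATES: List[Tuple[str, int]] = [
    ("POLL", 0),
    ("C1", 3),
    ("C3", 20),
    ("C6", 200),
]

# Effectively "no constraint": 2000 seconds in microseconds.
NO_CONSTRAINT_US = 2000 * 1_000_000


def deepest_allowed(latency_limit_us: int) -> str:
    """Return the deepest state whose exit latency fits within the limit."""
    allowed = [name for name, exit_us in IDLE_STATES if exit_us <= latency_limit_us]
    return allowed[-1]


print(deepest_allowed(0))                 # prints POLL
print(deepest_allowed(NO_CONSTRAINT_US))  # prints C6
```

With a limit of 0 only the polling state qualifies, which matches the "behaves like idle=poll" description: wake-ups are fastest, but the CPU never saves power.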

You started your comparisons with 3.2.0 which had commit e8db0be124
included. If you had tried any previous kernel, I'm pretty sure that
you would have found "performance penalties" too.

So, to summarize my thoughts:
 - IMHO there is no regression here, you just depended on a bug included
   in 3.2.0 (which made it behave like idle=poll with CONFIG_PM=n) and
   started your comparisons from there. The later kernels (3.2.6+) got
   that bug fixed which is why you saw "performance drops".

 - As much as we would like to do it, we can't set the value of
   PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE to 0 when CONFIG_PM=n, because
   CONFIG_PM doesn't encompass all power-management features (which is
   a pity). Doing that would need a big overhaul of all the relevant
   Kconfigs, which might or might not be worth the effort. (Because, who
   says that CONFIG_PM=n kernels are supposed to eat power like crazy??)

So here is my suggestion - use the interfaces provided by the kernel to
fix your problem:
   - you can give idle=poll in the kernel command line,
   - OR you can echo 0 > /dev/cpu_dma_latency

Irrespective of your kernel configuration options (CONFIG_PM=y/n), the
CPUs will not enter deep idle states, giving you the performance
improvement that you are looking for.
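
One caveat on the second option: as far as I know from the kernel's PM QoS interface documentation, a request made through /dev/cpu_dma_latency only stays in effect while the writing process keeps the file open, so a one-shot `echo` from a shell may be dropped as soon as the shell closes the file. A minimal sketch of holding the request from a long-running process follows. The function name and the `path` parameter are assumptions made for illustration (the parameter exists only so the function can be tried against an ordinary file), and writing to the real device normally requires root.

```python
import os
import struct


def hold_cpu_dma_latency(limit_us: int = 0, path: str = "/dev/cpu_dma_latency") -> int:
    """Open the PM QoS device, write the limit, and return the open descriptor.

    The caller must keep the returned descriptor open for as long as the
    limit should apply, and os.close() it afterwards, for example:

        fd = hold_cpu_dma_latency(0)
        try:
            ...  # run the latency-sensitive workload here
        finally:
            os.close(fd)
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        # A 4-byte write is interpreted as a native signed 32-bit value.
        os.write(fd, struct.pack("=i", limit_us))
    except OSError:
        os.close(fd)
        raise
    return fd
```

Unlike idle=poll on the command line, this approach can be limited to the lifetime of the workload that needs it.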

Regards,
Srivatsa S. Bhat



* Re: Performance issue since 3.2.6
  2013-05-17 19:50                           ` Srivatsa S. Bhat
@ 2013-05-17 23:51                             ` Rafael J. Wysocki
  2013-05-18  5:04                               ` Srivatsa S. Bhat
  0 siblings, 1 reply; 17+ messages in thread
From: Rafael J. Wysocki @ 2013-05-17 23:51 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Olivier Doucet, Borislav Petkov, linux-kernel, Linux PM mailing list

On Saturday, May 18, 2013 01:20:10 AM Srivatsa S. Bhat wrote:
> On 05/17/2013 11:47 PM, Olivier Doucet wrote:
> > Hello,
> > 
> > This performance penalty is still present in kernel 3.9.2. And
> > CONFIG_PM cannot be deactivated anymore.
> > 
> > I was able to make a working 3.9.2 (meaning with no penalty)  with
> > following config and patch :
> > CONFIG_PM=y
> > CONFIG_PM_SLEEP=y
> > CONFIG_PM_SLEEP_SMP=y
> > CONFIG_CPU_IDLE=y
> > CONFIG_ACPI=y
> > CONFIG_ACPI_PROCESSOR=y
> > 
> > Patch : https://gist.github.com/odoucet/5600630
> > 
> > I know this patch is not perfect because it is just equivalent to
> > rollback commit f51d67a64f32cd81ea8b67ca964fb7cf7e783b2e ;
> > 
> > I really want this to be fixed in kernel, so I would be glad to test
> > any patch / config file you want.
> >
> 
> I went through your previous mails and here is what I think:
> I think this is not a regression that needs to be fixed. Instead it
> occurs to me that you started depending on the _flaw_ introduced by
> commit e8db0be124 (PM QoS: Move and rename the implementation files).
> 
> Your requirement is very simple: you don't want CPUs to go to deep
> idle states, since your benchmark is very performance critical.
> 
> Commit e8db0be124 made the mistake of returning 0 in pm_qos_request()
> when CONFIG_PM was unset. And that has the effect of disabling deeper
> idle states, which is exactly what you wanted.
> 
> But, as noted by commit d020283d (PM / QoS: CPU C-state breakage with
> PM Qos change), this is quite a bit wrong, because it makes the system
> consume a *lot* of CPU power, because the CPUs never go to idle states
> and instead keep polling.
> 
> Now, you might ask why is it wrong to set the default value to 0
> (IOW, disable deep idle states) when CONFIG_PM is unset? Again, commit
> d020283d answers that indirectly - not every power-management
> configuration falls under CONFIG_PM, like CONFIG_CPU_IDLE,
> CONFIG_INTEL_IDLE etc. So we need a sane default for pm_qos_request()
> when CONFIG_PM is unset, to prevent the power usage from shooting
> through the roof and surprising the user.
> 
> You started your comparisons with 3.2.0 which had commit e8db0be124
> included. If you had tried any previous kernel, I'm pretty sure that
> you would have found "performance penalties" too.
> 
> So, to summarize my thoughts:
>  - IMHO there is no regression here, you just depended on a bug included
>    in 3.2.0 (which made it behave like idle=poll with CONFIG_PM=n) and
>    started your comparisons from there. The later kernels (3.2.6+) got
>    that bug fixed which is why you saw "performance drops".
> 
>  - As much as we would like to do it, we can't set the value of
>    PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE to 0 when CONFIG_PM=n, because
>    CONFIG_PM doesn't encompass all power-management features (which is
>    a pity). Doing that would need a big overhaul of all the relevant
>    Kconfigs, which might or might not be worth the effort. (Because, who
>    says that CONFIG_PM=n kernels are supposed to eat power like crazy??)

I think it *is* worth the effort.  We could drop some CONFIG_PM* options in the
process which would simplify things quite a bit too.

> So here is my suggestion - use the interfaces provided by the kernel to
> fix your problem:
>    - you can give idle=poll in the kernel command line,
>    - OR you can echo 0 > /dev/cpu_dma_latency
> 
> Irrespective of your kernel configuration options (CONFIG_PM=y/n), the
> CPUs will not enter deep idle states, giving you the performance
> improvement that you are looking for.

Thanks a lot for the very clear explanation of this!

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: Performance issue since 3.2.6
  2013-05-17 23:51                             ` Rafael J. Wysocki
@ 2013-05-18  5:04                               ` Srivatsa S. Bhat
  0 siblings, 0 replies; 17+ messages in thread
From: Srivatsa S. Bhat @ 2013-05-18  5:04 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Olivier Doucet, Borislav Petkov, linux-kernel, Linux PM mailing list

On 05/18/2013 05:21 AM, Rafael J. Wysocki wrote:
> On Saturday, May 18, 2013 01:20:10 AM Srivatsa S. Bhat wrote:
>> On 05/17/2013 11:47 PM, Olivier Doucet wrote:
>>> Hello,
>>>
>>> This performance penalty is still present in kernel 3.9.2. And
>>> CONFIG_PM cannot be deactivated anymore.
>>>
>>> I was able to make a working 3.9.2 (meaning with no penalty)  with
>>> following config and patch :
>>> CONFIG_PM=y
>>> CONFIG_PM_SLEEP=y
>>> CONFIG_PM_SLEEP_SMP=y
>>> CONFIG_CPU_IDLE=y
>>> CONFIG_ACPI=y
>>> CONFIG_ACPI_PROCESSOR=y
>>>
>>> Patch : https://gist.github.com/odoucet/5600630
>>>
[...]
>> So, to summarize my thoughts:
>>  - IMHO there is no regression here, you just depended on a bug included
>>    in 3.2.0 (which made it behave like idle=poll with CONFIG_PM=n) and
>>    started your comparisons from there. The later kernels (3.2.6+) got
>>    that bug fixed which is why you saw "performance drops".
>>
>>  - As much as we would like to do it, we can't set the value of
>>    PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE to 0 when CONFIG_PM=n, because
>>    CONFIG_PM doesn't encompass all power-management features (which is
>>    a pity). Doing that would need a big overhaul of all the relevant
>>    Kconfigs, which might or might not be worth the effort. (Because, who
>>    says that CONFIG_PM=n kernels are supposed to eat power like crazy??)
> 
> I think it *is* worth the effort.  We could drop some CONFIG_PM* options in the
> process which would simplify things quite a bit too.
>

Ah, ok..
 
>> So here is my suggestion - use the interfaces provided by the kernel to
>> fix your problem:
>>    - you can give idle=poll in the kernel command line,
>>    - OR you can echo 0 > /dev/cpu_dma_latency
>>
>> Irrespective of your kernel configuration options (CONFIG_PM=y/n), the
>> CPUs will not enter deep idle states, giving you the performance
>> improvement that you are looking for.
> 
> Thanks a lot for the very clear explanation of this!
> 

No problem! :-)

Regards,
Srivatsa S. Bhat

