linux-kernel.vger.kernel.org archive mirror
* ACPI _CST introduced performance regressions on Haswell
@ 2020-10-06  8:36 Mel Gorman
  2020-10-06 16:00 ` Rafael J. Wysocki
  0 siblings, 1 reply; 18+ messages in thread
From: Mel Gorman @ 2020-10-06  8:36 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Takashi Iwai, linux-kernel

Hi Rafael,

Numerous workload regressions have repeatedly bisected to the commit
6d4f08a6776 ("intel_idle: Use ACPI _CST on server systems"), but only on
a set of Haswell machines that all have the same CPU.

CPU(s):              48
On-line CPU(s) list: 0-47
Thread(s) per core:  2
Core(s) per socket:  12
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               63
Model name:          Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz
Stepping:            2
CPU MHz:             1200.359
CPU max MHz:         3100.0000
CPU min MHz:         1200.0000

As well as being bisected in mainline, backporting the patch to a
distribution kernel showed the same type of problem, so the patch is
definitely suspicious. An example comparison of the performance before
_CST was enabled against recent mainline kernels is as follows:

netperf UDP_STREAM
                                      5.5.0              5.5.0-rc2              5.5.0-rc2                  5.6.0              5.9.0-rc8
                                    vanilla      sle15-sp2-pre-cst   sle15-sp2-enable-cst                vanilla                vanilla
Hmean     send-64         203.21 (   0.00%)      206.43 *   1.58%*      176.89 * -12.95%*      181.18 * -10.84%*      194.45 *  -4.31%*
Hmean     send-128        401.40 (   0.00%)      414.19 *   3.19%*      355.84 * -11.35%*      364.13 *  -9.29%*      387.83 *  -3.38%*
Hmean     send-256        786.69 (   0.00%)      799.70 (   1.65%)      700.65 * -10.94%*      719.82 *  -8.50%*      756.40 *  -3.85%*
Hmean     send-1024      3059.57 (   0.00%)     3106.57 *   1.54%*     2659.62 * -13.07%*     2793.58 *  -8.69%*     3006.95 *  -1.72%*
Hmean     send-2048      5976.66 (   0.00%)     6102.64 (   2.11%)     5249.34 * -12.17%*     5392.04 *  -9.78%*     5805.02 *  -2.87%*
Hmean     send-3312      9145.09 (   0.00%)     9304.85 *   1.75%*     8197.25 * -10.36%*     8398.36 *  -8.17%*     9120.88 (  -0.26%)
Hmean     send-4096     10871.63 (   0.00%)    11129.76 *   2.37%*     9667.68 * -11.07%*     9929.70 *  -8.66%*    10863.41 (  -0.08%)
Hmean     send-8192     17747.35 (   0.00%)    17969.19 (   1.25%)    15652.91 * -11.80%*    16081.20 *  -9.39%*    17316.13 *  -2.43%*
Hmean     send-16384    29187.16 (   0.00%)    29418.75 *   0.79%*    26296.64 *  -9.90%*    27028.18 *  -7.40%*    26941.26 *  -7.69%*
Hmean     recv-64         203.21 (   0.00%)      206.43 *   1.58%*      176.89 * -12.95%*      181.18 * -10.84%*      194.45 *  -4.31%*
Hmean     recv-128        401.40 (   0.00%)      414.19 *   3.19%*      355.84 * -11.35%*      364.13 *  -9.29%*      387.83 *  -3.38%*
Hmean     recv-256        786.69 (   0.00%)      799.70 (   1.65%)      700.65 * -10.94%*      719.82 *  -8.50%*      756.40 *  -3.85%*
Hmean     recv-1024      3059.57 (   0.00%)     3106.57 *   1.54%*     2659.62 * -13.07%*     2793.58 *  -8.69%*     3006.95 *  -1.72%*
Hmean     recv-2048      5976.66 (   0.00%)     6102.64 (   2.11%)     5249.34 * -12.17%*     5392.00 *  -9.78%*     5805.02 *  -2.87%*
Hmean     recv-3312      9145.09 (   0.00%)     9304.85 *   1.75%*     8197.25 * -10.36%*     8398.36 *  -8.17%*     9120.88 (  -0.26%)
Hmean     recv-4096     10871.63 (   0.00%)    11129.76 *   2.37%*     9667.68 * -11.07%*     9929.70 *  -8.66%*    10863.38 (  -0.08%)
Hmean     recv-8192     17747.35 (   0.00%)    17969.19 (   1.25%)    15652.91 * -11.80%*    16081.20 *  -9.39%*    17315.96 *  -2.43%*
Hmean     recv-16384    29187.13 (   0.00%)    29418.72 *   0.79%*    26296.63 *  -9.90%*    27028.18 *  -7.40%*    26941.23 *  -7.69%*

pre-cst is just before commit 6d4f08a6776 ("intel_idle: Use ACPI _CST on
server systems") and enable-cst is the commit itself. The regression was
not fixed by 5.6 or 5.9-rc8. A lot of bisections ended up here, including
kernel compilation, tbench, a syscall entry/exit microbenchmark,
hackbench, Java workloads, etc.
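For reference, each "Hmean" row above is the harmonic mean of the
individual benchmark samples, and the percentages are differences relative
to the first column. A minimal sketch of the calculation (the sample
values here are made up):

```shell
# Harmonic mean: n / sum(1/x_i), computed over three hypothetical samples
printf '203.1\n203.3\n203.2\n' | \
    awk '{ n++; s += 1 / $1 } END { printf "%.2f\n", n / s }'
# -> 203.20
```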

What I don't understand is why. The C-state exit latencies are identical
before and after the patch:

/sys/devices/system/cpu/cpu0/cpuidle/state0/latency:0
/sys/devices/system/cpu/cpu0/cpuidle/state1/latency:2
/sys/devices/system/cpu/cpu0/cpuidle/state2/latency:10
/sys/devices/system/cpu/cpu0/cpuidle/state3/latency:33
/sys/devices/system/cpu/cpu0/cpuidle/state4/latency:133

Perf profiles did not show anything interesting, and a diff of
/sys/devices/system/cpu/cpu0/cpuidle/state0/ before and after the patch
was equally unrevealing. Any idea why this patch turns out to be
hazardous on Haswell in particular?

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: ACPI _CST introduced performance regressions on Haswell
  2020-10-06  8:36 ACPI _CST introduced performance regressions on Haswell Mel Gorman
@ 2020-10-06 16:00 ` Rafael J. Wysocki
  2020-10-06 19:03   ` Mel Gorman
  0 siblings, 1 reply; 18+ messages in thread
From: Rafael J. Wysocki @ 2020-10-06 16:00 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Takashi Iwai, linux-kernel

Hi Mel,

On 10/6/2020 10:36 AM, Mel Gorman wrote:
> Hi Rafael,
>
> Numerous workload regressions have bisected repeatedly to the commit
> 6d4f08a6776 ("intel_idle: Use ACPI _CST on server systems") but only on
> a set of haswell machines that all have the same CPU.
>
> CPU(s):              48
> On-line CPU(s) list: 0-47
> Thread(s) per core:  2
> Core(s) per socket:  12
> Socket(s):           2
> NUMA node(s):        2
> Vendor ID:           GenuineIntel
> CPU family:          6
> Model:               63
> Model name:          Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz
> Stepping:            2
> CPU MHz:             1200.359
> CPU max MHz:         3100.0000
> CPU min MHz:         1200.0000
>
> As well as being bisected in mainline, backporting the patch to a
> distribution kernel also showed the same type of problem so the patch
> is definitely suspicious. An example comparison showing the performance
> before CST was enabled and recent mainline kernels is as follow
>
> netperf UDP_STREAM
>                                        5.5.0              5.5.0-rc2              5.5.0-rc2                  5.6.0              5.9.0-rc8
>                                      vanilla      sle15-sp2-pre-cst   sle15-sp2-enable-cst                vanilla                vanilla
> Hmean     send-64         203.21 (   0.00%)      206.43 *   1.58%*      176.89 * -12.95%*      181.18 * -10.84%*      194.45 *  -4.31%*
> Hmean     send-128        401.40 (   0.00%)      414.19 *   3.19%*      355.84 * -11.35%*      364.13 *  -9.29%*      387.83 *  -3.38%*
> Hmean     send-256        786.69 (   0.00%)      799.70 (   1.65%)      700.65 * -10.94%*      719.82 *  -8.50%*      756.40 *  -3.85%*
> Hmean     send-1024      3059.57 (   0.00%)     3106.57 *   1.54%*     2659.62 * -13.07%*     2793.58 *  -8.69%*     3006.95 *  -1.72%*
> Hmean     send-2048      5976.66 (   0.00%)     6102.64 (   2.11%)     5249.34 * -12.17%*     5392.04 *  -9.78%*     5805.02 *  -2.87%*
> Hmean     send-3312      9145.09 (   0.00%)     9304.85 *   1.75%*     8197.25 * -10.36%*     8398.36 *  -8.17%*     9120.88 (  -0.26%)
> Hmean     send-4096     10871.63 (   0.00%)    11129.76 *   2.37%*     9667.68 * -11.07%*     9929.70 *  -8.66%*    10863.41 (  -0.08%)
> Hmean     send-8192     17747.35 (   0.00%)    17969.19 (   1.25%)    15652.91 * -11.80%*    16081.20 *  -9.39%*    17316.13 *  -2.43%*
> Hmean     send-16384    29187.16 (   0.00%)    29418.75 *   0.79%*    26296.64 *  -9.90%*    27028.18 *  -7.40%*    26941.26 *  -7.69%*
> Hmean     recv-64         203.21 (   0.00%)      206.43 *   1.58%*      176.89 * -12.95%*      181.18 * -10.84%*      194.45 *  -4.31%*
> Hmean     recv-128        401.40 (   0.00%)      414.19 *   3.19%*      355.84 * -11.35%*      364.13 *  -9.29%*      387.83 *  -3.38%*
> Hmean     recv-256        786.69 (   0.00%)      799.70 (   1.65%)      700.65 * -10.94%*      719.82 *  -8.50%*      756.40 *  -3.85%*
> Hmean     recv-1024      3059.57 (   0.00%)     3106.57 *   1.54%*     2659.62 * -13.07%*     2793.58 *  -8.69%*     3006.95 *  -1.72%*
> Hmean     recv-2048      5976.66 (   0.00%)     6102.64 (   2.11%)     5249.34 * -12.17%*     5392.00 *  -9.78%*     5805.02 *  -2.87%*
> Hmean     recv-3312      9145.09 (   0.00%)     9304.85 *   1.75%*     8197.25 * -10.36%*     8398.36 *  -8.17%*     9120.88 (  -0.26%)
> Hmean     recv-4096     10871.63 (   0.00%)    11129.76 *   2.37%*     9667.68 * -11.07%*     9929.70 *  -8.66%*    10863.38 (  -0.08%)
> Hmean     recv-8192     17747.35 (   0.00%)    17969.19 (   1.25%)    15652.91 * -11.80%*    16081.20 *  -9.39%*    17315.96 *  -2.43%*
> Hmean     recv-16384    29187.13 (   0.00%)    29418.72 *   0.79%*    26296.63 *  -9.90%*    27028.18 *  -7.40%*    26941.23 *  -7.69%*
>
> pre-cst is just before commit 6d4f08a6776 ("intel_idle: Use ACPI _CST on
> server systems") and enable-cst is the commit. It was not fixed by 5.6 or
> 5.9-rc8. A lot of bisections ended up here including kernel compilation,
> tbench, syscall entry/exit microbenchmark, hackbench, Java workloads etc.
>
> What I don't understand is why. The latencies for c-state exit states
> before and after the patch are both as follows
>
> /sys/devices/system/cpu/cpu0/cpuidle/state0/latency:0
> /sys/devices/system/cpu/cpu0/cpuidle/state1/latency:2
> /sys/devices/system/cpu/cpu0/cpuidle/state2/latency:10
> /sys/devices/system/cpu/cpu0/cpuidle/state3/latency:33
> /sys/devices/system/cpu/cpu0/cpuidle/state4/latency:133
>
> Perf profiles did not show up anything interesting. A diff of
> /sys/devices/system/cpu/cpu0/cpuidle/state0/ before and after the patch
> showed up nothing interesting. Any idea why exactly this patch shows up
> as being hazardous on Haswell in particular?
>
Presumably, some of the idle states are disabled by default on the 
affected machines.

Can you check the disable and default_status attributes of each state 
before and after the commit in question?

Cheers!




* Re: ACPI _CST introduced performance regressions on Haswell
  2020-10-06 16:00 ` Rafael J. Wysocki
@ 2020-10-06 19:03   ` Mel Gorman
  2020-10-06 19:29     ` Rafael J. Wysocki
  2020-10-06 19:47     ` Mel Gorman
  0 siblings, 2 replies; 18+ messages in thread
From: Mel Gorman @ 2020-10-06 19:03 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Takashi Iwai, linux-kernel

On Tue, Oct 06, 2020 at 06:00:18PM +0200, Rafael J. Wysocki wrote:
> > server systems") and enable-cst is the commit. It was not fixed by 5.6 or
> > 5.9-rc8. A lot of bisections ended up here including kernel compilation,
> > tbench, syscall entry/exit microbenchmark, hackbench, Java workloads etc.
> > 
> > What I don't understand is why. The latencies for c-state exit states
> > before and after the patch are both as follows
> > 
> > /sys/devices/system/cpu/cpu0/cpuidle/state0/latency:0
> > /sys/devices/system/cpu/cpu0/cpuidle/state1/latency:2
> > /sys/devices/system/cpu/cpu0/cpuidle/state2/latency:10
> > /sys/devices/system/cpu/cpu0/cpuidle/state3/latency:33
> > /sys/devices/system/cpu/cpu0/cpuidle/state4/latency:133
> > 
> > Perf profiles did not show up anything interesting. A diff of
> > /sys/devices/system/cpu/cpu0/cpuidle/state0/ before and after the patch
> > showed up nothing interesting. Any idea why exactly this patch shows up
> > as being hazardous on Haswell in particular?
> > 
> Presumably, some of the idle states are disabled by default on the affected
> machines.
> 
> Can you check the disable and default_status attributes of each state before
> and after the commit in question?
> 

# grep . pre-cst/cpuidle/state*/disable
pre-cst/cpuidle/state0/disable:0
pre-cst/cpuidle/state1/disable:0
pre-cst/cpuidle/state2/disable:0
pre-cst/cpuidle/state3/disable:0
pre-cst/cpuidle/state4/disable:0
# grep . enable-cst/cpuidle/state*/disable
enable-cst/cpuidle/state0/disable:0
enable-cst/cpuidle/state1/disable:0
enable-cst/cpuidle/state2/disable:0
enable-cst/cpuidle/state3/disable:0
enable-cst/cpuidle/state4/disable:0
# grep . pre-cst/cpuidle/state*/default_status
pre-cst/cpuidle/state0/default_status:enabled
pre-cst/cpuidle/state1/default_status:enabled
pre-cst/cpuidle/state2/default_status:enabled
pre-cst/cpuidle/state3/default_status:enabled
pre-cst/cpuidle/state4/default_status:enabled

After the commit, the default_status file does not appear in /sys

-- 
Mel Gorman
SUSE Labs


* Re: ACPI _CST introduced performance regressions on Haswell
  2020-10-06 19:03   ` Mel Gorman
@ 2020-10-06 19:29     ` Rafael J. Wysocki
  2020-10-06 21:18       ` Mel Gorman
  2020-10-06 19:47     ` Mel Gorman
  1 sibling, 1 reply; 18+ messages in thread
From: Rafael J. Wysocki @ 2020-10-06 19:29 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Takashi Iwai, linux-kernel

On 10/6/2020 9:03 PM, Mel Gorman wrote:
> On Tue, Oct 06, 2020 at 06:00:18PM +0200, Rafael J. Wysocki wrote:
>>> server systems") and enable-cst is the commit. It was not fixed by 5.6 or
>>> 5.9-rc8. A lot of bisections ended up here including kernel compilation,
>>> tbench, syscall entry/exit microbenchmark, hackbench, Java workloads etc.
>>>
>>> What I don't understand is why. The latencies for c-state exit states
>>> before and after the patch are both as follows
>>>
>>> /sys/devices/system/cpu/cpu0/cpuidle/state0/latency:0
>>> /sys/devices/system/cpu/cpu0/cpuidle/state1/latency:2
>>> /sys/devices/system/cpu/cpu0/cpuidle/state2/latency:10
>>> /sys/devices/system/cpu/cpu0/cpuidle/state3/latency:33
>>> /sys/devices/system/cpu/cpu0/cpuidle/state4/latency:133
>>>
>>> Perf profiles did not show up anything interesting. A diff of
>>> /sys/devices/system/cpu/cpu0/cpuidle/state0/ before and after the patch
>>> showed up nothing interesting. Any idea why exactly this patch shows up
>>> as being hazardous on Haswell in particular?
>>>
>> Presumably, some of the idle states are disabled by default on the affected
>> machines.
>>
>> Can you check the disable and default_status attributes of each state before
>> and after the commit in question?
>>
> # grep . pre-cst/cpuidle/state*/disable
> pre-cst/cpuidle/state0/disable:0
> pre-cst/cpuidle/state1/disable:0
> pre-cst/cpuidle/state2/disable:0
> pre-cst/cpuidle/state3/disable:0
> pre-cst/cpuidle/state4/disable:0
> # grep . enable-cst/cpuidle/state*/disable
> enable-cst/cpuidle/state0/disable:0
> enable-cst/cpuidle/state1/disable:0
> enable-cst/cpuidle/state2/disable:0
> enable-cst/cpuidle/state3/disable:0
> enable-cst/cpuidle/state4/disable:0
> # grep . pre-cst/cpuidle/state*/default_status
> pre-cst/cpuidle/state0/default_status:enabled
> pre-cst/cpuidle/state1/default_status:enabled
> pre-cst/cpuidle/state2/default_status:enabled
> pre-cst/cpuidle/state3/default_status:enabled
> pre-cst/cpuidle/state4/default_status:enabled
>
> After the commit, the default_status file does not appear in /sys
>
Something is amiss, then, because the commit doesn't affect the presence 
of this file.

The only thing it does is to set the use_acpi flag for several processor 
models in intel_idle.c.

It can be effectively reversed by removing all of the ".use_acpi = 
true," lines from intel_idle.c.

In particular, please check if changing the value of use_acpi in struct 
idle_cpu_hsx from 'true' to 'false' alone (without reverting the commit) 
makes the issue go away in 5.9-rc8 (the default_status file should be 
present regardless).

Cheers!




* Re: ACPI _CST introduced performance regressions on Haswell
  2020-10-06 19:03   ` Mel Gorman
  2020-10-06 19:29     ` Rafael J. Wysocki
@ 2020-10-06 19:47     ` Mel Gorman
  2020-10-07 15:40       ` Rafael J. Wysocki
  1 sibling, 1 reply; 18+ messages in thread
From: Mel Gorman @ 2020-10-06 19:47 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Takashi Iwai, linux-kernel

On Tue, Oct 06, 2020 at 08:03:22PM +0100, Mel Gorman wrote:
> On Tue, Oct 06, 2020 at 06:00:18PM +0200, Rafael J. Wysocki wrote:
> > > server systems") and enable-cst is the commit. It was not fixed by 5.6 or
> > > 5.9-rc8. A lot of bisections ended up here including kernel compilation,
> > > tbench, syscall entry/exit microbenchmark, hackbench, Java workloads etc.
> > > 
> > > What I don't understand is why. The latencies for c-state exit states
> > > before and after the patch are both as follows
> > > 
> > > /sys/devices/system/cpu/cpu0/cpuidle/state0/latency:0
> > > /sys/devices/system/cpu/cpu0/cpuidle/state1/latency:2
> > > /sys/devices/system/cpu/cpu0/cpuidle/state2/latency:10
> > > /sys/devices/system/cpu/cpu0/cpuidle/state3/latency:33
> > > /sys/devices/system/cpu/cpu0/cpuidle/state4/latency:133
> > > 
> > > Perf profiles did not show up anything interesting. A diff of
> > > /sys/devices/system/cpu/cpu0/cpuidle/state0/ before and after the patch
> > > showed up nothing interesting. Any idea why exactly this patch shows up
> > > as being hazardous on Haswell in particular?
> > > 
> > Presumably, some of the idle states are disabled by default on the affected
> > machines.
> > 
> > Can you check the disable and default_status attributes of each state before
> > and after the commit in question?
> > 
> 
> # grep . pre-cst/cpuidle/state*/disable

Sorry, here is a second attempt after the first results made no sense at
all. It turns out I fat-fingered setting up the enable-cst kernel when
collecting what you asked for, and the patch was not applied at all.

# grep . pre-cst/cpuidle/state*/disable
pre-cst/cpuidle/state0/disable:0
pre-cst/cpuidle/state1/disable:0
pre-cst/cpuidle/state2/disable:0
pre-cst/cpuidle/state3/disable:0
pre-cst/cpuidle/state4/disable:0
# grep . pre-cst/cpuidle/state*/default_status
pre-cst/cpuidle/state0/default_status:enabled
pre-cst/cpuidle/state1/default_status:enabled
pre-cst/cpuidle/state2/default_status:enabled
pre-cst/cpuidle/state3/default_status:enabled
pre-cst/cpuidle/state4/default_status:enabled
# grep . enable-cst/cpuidle/state*/disable
enable-cst/cpuidle/state0/disable:0
enable-cst/cpuidle/state1/disable:0
enable-cst/cpuidle/state2/disable:0
enable-cst/cpuidle/state3/disable:1
enable-cst/cpuidle/state4/disable:1
# grep . enable-cst/cpuidle/state*/default_status
enable-cst/cpuidle/state0/default_status:enabled
enable-cst/cpuidle/state1/default_status:enabled
enable-cst/cpuidle/state2/default_status:enabled
enable-cst/cpuidle/state3/default_status:disabled
enable-cst/cpuidle/state4/default_status:disabled

It looks like C3 and C6 are disabled by default after the patch.

# grep . enable-cst/cpuidle/state*/name
enable-cst/cpuidle/state0/name:POLL
enable-cst/cpuidle/state1/name:C1
enable-cst/cpuidle/state2/name:C1E
enable-cst/cpuidle/state3/name:C3
enable-cst/cpuidle/state4/name:C6

-- 
Mel Gorman
SUSE Labs


* Re: ACPI _CST introduced performance regressions on Haswell
  2020-10-06 19:29     ` Rafael J. Wysocki
@ 2020-10-06 21:18       ` Mel Gorman
  2020-10-07 15:45         ` Rafael J. Wysocki
  0 siblings, 1 reply; 18+ messages in thread
From: Mel Gorman @ 2020-10-06 21:18 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Takashi Iwai, linux-kernel

On Tue, Oct 06, 2020 at 09:29:24PM +0200, Rafael J. Wysocki wrote:
> > After the commit, the default_status file does not appear in /sys
> > 
> Something is amiss, then, because the commit doesn't affect the presence of
> this file.
> 

This was cleared up in another mail.

> The only thing it does is to set the use_acpi flag for several processor
> models in intel_idle.c.
> 
> It can be effectively reversed by removing all of the ".use_acpi = true,"
> lines from intel_idle.c.
> 
> In particular, please check if changing the value of use_acpi in struct
> idle_cpu_hsx from 'true' to 'false' alone (without reverting the commit)
> makes the issue go away in 5.9-rc8 (the default_status file should be
> present regardless).
> 

Thanks. I applied the following:

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 9a810e4a7946..6478347669a9 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -1044,7 +1044,7 @@ static const struct idle_cpu idle_cpu_hsw __initconst = {
 static const struct idle_cpu idle_cpu_hsx __initconst = {
 	.state_table = hsw_cstates,
 	.disable_promotion_to_c1e = true,
-	.use_acpi = true,
+	.use_acpi = false,
 };

netperf UDP_STREAM
                                      pre                 enable                 enable                5.9-rc8                5.9-rc8
                                      cst                    cst        cst-no-hsx-acpi                vanilla            no-hsx-acpi
Hmean     send-64       203.96 (   0.00%)      179.23 * -12.13%*      201.04 (  -1.44%)      203.24 (  -0.36%)      233.43 *  14.45%*
Hmean     send-128      403.66 (   0.00%)      355.99 * -11.81%*      402.28 (  -0.34%)      387.65 *  -3.97%*      461.47 *  14.32%*
Hmean     send-256      784.39 (   0.00%)      697.78 * -11.04%*      782.15 (  -0.29%)      758.49 *  -3.30%*      895.31 *  14.14%*
Hmean     recv-64       203.96 (   0.00%)      179.23 * -12.13%*      201.04 (  -1.44%)      203.24 (  -0.36%)      233.43 *  14.45%*
Hmean     recv-128      403.66 (   0.00%)      355.99 * -11.81%*      402.28 (  -0.34%)      387.65 *  -3.97%*      461.47 *  14.32%*
Hmean     recv-256      784.39 (   0.00%)      697.78 * -11.04%*      782.15 (  -0.29%)      758.49 *  -3.30%*      895.28 *  14.14%*

This is a more limited run to save time but is enough to illustrate
the point.

pre-cst is just before your patch
enable-cst is your patch that was bisected
enable-cst-no-hsx-acpi is your patch with use_acpi disabled
5.9-rc8-vanilla is what it sounds like
5.9-rc8-no-hsx-acpi disables use_acpi

The enable-cst-no-hsx-acpi result indicates that use_acpi was the issue for
Haswell (at least on these machines). Looking just at 5.9-rc8-vanilla might
have been misleading because its performance is not far off the baseline
due to unrelated changes that mostly offset the performance penalty.

The key question is: how appropriate would it be to disable ACPI _CST use
for Haswell? Would that be generally safe, or could it hide other surprises?

-- 
Mel Gorman
SUSE Labs


* Re: ACPI _CST introduced performance regressions on Haswell
  2020-10-06 19:47     ` Mel Gorman
@ 2020-10-07 15:40       ` Rafael J. Wysocki
  2020-10-07 19:23         ` Mel Gorman
  0 siblings, 1 reply; 18+ messages in thread
From: Rafael J. Wysocki @ 2020-10-07 15:40 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Takashi Iwai, linux-kernel

On 10/6/2020 9:47 PM, Mel Gorman wrote:
> On Tue, Oct 06, 2020 at 08:03:22PM +0100, Mel Gorman wrote:
>> On Tue, Oct 06, 2020 at 06:00:18PM +0200, Rafael J. Wysocki wrote:
>>>> server systems") and enable-cst is the commit. It was not fixed by 5.6 or
>>>> 5.9-rc8. A lot of bisections ended up here including kernel compilation,
>>>> tbench, syscall entry/exit microbenchmark, hackbench, Java workloads etc.
>>>>
>>>> What I don't understand is why. The latencies for c-state exit states
>>>> before and after the patch are both as follows
>>>>
>>>> /sys/devices/system/cpu/cpu0/cpuidle/state0/latency:0
>>>> /sys/devices/system/cpu/cpu0/cpuidle/state1/latency:2
>>>> /sys/devices/system/cpu/cpu0/cpuidle/state2/latency:10
>>>> /sys/devices/system/cpu/cpu0/cpuidle/state3/latency:33
>>>> /sys/devices/system/cpu/cpu0/cpuidle/state4/latency:133
>>>>
>>>> Perf profiles did not show up anything interesting. A diff of
>>>> /sys/devices/system/cpu/cpu0/cpuidle/state0/ before and after the patch
>>>> showed up nothing interesting. Any idea why exactly this patch shows up
>>>> as being hazardous on Haswell in particular?
>>>>
>>> Presumably, some of the idle states are disabled by default on the affected
>>> machines.
>>>
>>> Can you check the disable and default_status attributes of each state before
>>> and after the commit in question?
>>>
>> # grep . pre-cst/cpuidle/state*/disable
> Sorry, second attempt after thinking the results made no sense at all.
> Turns out I fat fingered setting up the enable-cst kernel the second time
> to collect what you asked for and the patch was not applied at all.
>
> # grep . pre-cst/cpuidle/state*/disable
> pre-cst/cpuidle/state0/disable:0
> pre-cst/cpuidle/state1/disable:0
> pre-cst/cpuidle/state2/disable:0
> pre-cst/cpuidle/state3/disable:0
> pre-cst/cpuidle/state4/disable:0
> # grep . pre-cst/cpuidle/state*/default_status
> pre-cst/cpuidle/state0/default_status:enabled
> pre-cst/cpuidle/state1/default_status:enabled
> pre-cst/cpuidle/state2/default_status:enabled
> pre-cst/cpuidle/state3/default_status:enabled
> pre-cst/cpuidle/state4/default_status:enabled
> # grep . enable-cst/cpuidle/state*/disable
> enable-cst/cpuidle/state0/disable:0
> enable-cst/cpuidle/state1/disable:0
> enable-cst/cpuidle/state2/disable:0
> enable-cst/cpuidle/state3/disable:1
> enable-cst/cpuidle/state4/disable:1
> # grep . enable-cst/cpuidle/state*/default_status
> enable-cst/cpuidle/state0/default_status:enabled
> enable-cst/cpuidle/state1/default_status:enabled
> enable-cst/cpuidle/state2/default_status:enabled
> enable-cst/cpuidle/state3/default_status:disabled
> enable-cst/cpuidle/state4/default_status:disabled
>
> That looks like C3 and C6 are disabled after the patch.
>
> # grep . enable-cst/cpuidle/state*/name
> enable-cst/cpuidle/state0/name:POLL
> enable-cst/cpuidle/state1/name:C1
> enable-cst/cpuidle/state2/name:C1E
> enable-cst/cpuidle/state3/name:C3
> enable-cst/cpuidle/state4/name:C6
>
That's kind of unexpected, and there may be two reasons for it.

First off, the MWAIT hints in the ACPI tables for C3 and C6 may be 
different from the ones in the intel_idle internal table.

Second, the ACPI tables may only be listing C1.
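The first case can be sketched with a toy model (this is not the actual
kernel code, and the hints below are made up): a state whose MWAIT hint is
not advertised by _CST ends up disabled by default, unless it carries
CPUIDLE_FLAG_ALWAYS_ENABLE (omitted here for brevity).

```shell
# Toy model only -- not the intel_idle implementation. Internal states with
# hypothetical MWAIT hints, matched against a _CST that lists fewer states.
internal="C1:0x00 C1E:0x01 C3:0x10 C6:0x20"
cst_hints="0x00 0x01"   # hypothetical _CST advertising only C1 and C1E

for entry in $internal; do
    name=${entry%%:*}
    hint=${entry##*:}
    case " $cst_hints " in
        *" $hint "*) echo "$name: enabled by default" ;;
        *)           echo "$name: disabled by default" ;;
    esac
done
```

With a _CST like that, C3 and C6 come up disabled by default, which would
match the sysfs output shown earlier in the thread.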

Can you send me the acpidump output from the affected machine, please?




* Re: ACPI _CST introduced performance regressions on Haswell
  2020-10-06 21:18       ` Mel Gorman
@ 2020-10-07 15:45         ` Rafael J. Wysocki
  2020-10-08  9:09           ` Mel Gorman
  0 siblings, 1 reply; 18+ messages in thread
From: Rafael J. Wysocki @ 2020-10-07 15:45 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Takashi Iwai, linux-kernel

On 10/6/2020 11:18 PM, Mel Gorman wrote:
> On Tue, Oct 06, 2020 at 09:29:24PM +0200, Rafael J. Wysocki wrote:
>>> After the commit, the default_status file does not appear in /sys
>>>
>> Something is amiss, then, because the commit doesn't affect the presence of
>> this file.
>>
> This was cleared up in another mail.
>
>> The only thing it does is to set the use_acpi flag for several processor
>> models in intel_idle.c.
>>
>> It can be effectively reversed by removing all of the ".use_acpi = true,"
>> lines from intel_idle.c.
>>
>> In particular, please check if changing the value of use_acpi in struct
>> idle_cpu_hsx from 'true' to 'false' alone (without reverting the commit)
>> makes the issue go away in 5.9-rc8 (the default_status file should be
>> present regardless).
>>
> Thanks.  I applied the following
>
> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> index 9a810e4a7946..6478347669a9 100644
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -1044,7 +1044,7 @@ static const struct idle_cpu idle_cpu_hsw __initconst = {
>   static const struct idle_cpu idle_cpu_hsx __initconst = {
>   	.state_table = hsw_cstates,
>   	.disable_promotion_to_c1e = true,
> -	.use_acpi = true,
> +	.use_acpi = false,
>   };
>
> netperf UDP_STREAM
>                                        pre                 enable                 enable                5.9-rc8                5.9-rc8
>                                        cst                    cst        cst-no-hsx-acpi                vanilla            no-hsx-acpi
> Hmean     send-64       203.96 (   0.00%)      179.23 * -12.13%*      201.04 (  -1.44%)      203.24 (  -0.36%)      233.43 *  14.45%*
> Hmean     send-128      403.66 (   0.00%)      355.99 * -11.81%*      402.28 (  -0.34%)      387.65 *  -3.97%*      461.47 *  14.32%*
> Hmean     send-256      784.39 (   0.00%)      697.78 * -11.04%*      782.15 (  -0.29%)      758.49 *  -3.30%*      895.31 *  14.14%*
> Hmean     recv-64       203.96 (   0.00%)      179.23 * -12.13%*      201.04 (  -1.44%)      203.24 (  -0.36%)      233.43 *  14.45%*
> Hmean     recv-128      403.66 (   0.00%)      355.99 * -11.81%*      402.28 (  -0.34%)      387.65 *  -3.97%*      461.47 *  14.32%*
> Hmean     recv-256      784.39 (   0.00%)      697.78 * -11.04%*      782.15 (  -0.29%)      758.49 *  -3.30%*      895.28 *  14.14%*
>
> This is a more limited run to save time but is enough to illustrate
> the point.
>
> pre-cst is just before your patch
> enable-cst is your patch that was bisected
> enable-cst-no-hsx-acpi is your patch with use_acpi disabled
> 5.9-rc8-vanilla is what it sounds like
> 5.9-rc8-no-hsx-acpi disables use_acpi
>
> The enable-cst-no-hsx-acpi result indicates that use_acpi was the issue for
> Haswell (at least these machines). Looking just at 5.9-rc8-vanillaa might
> have been misleading because its performance is not far off the baseline
> due to unrelated changes that mostly offset the performance penalty.
>
> The key question is -- how appropriate would it be to disable acpi for
> Haswell? Would that be generally safe or could it hide other surprises?
>
It should be safe, but let's try to do something more fine-grained.

There is the CPUIDLE_FLAG_ALWAYS_ENABLE flag that is set for C1E.  Can 
you please try to set it for C6 in hsw_cstates instead of clearing 
use_acpi in idle_cpu_hsx and retest?
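I.e., something like the sketch below (illustrative only -- the exact set
of existing flags on the C6 entry may differ between kernel versions):

```diff
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ static struct cpuidle_state hsw_cstates[] __initdata = {
 	{
 		.name = "C6",
 		.desc = "MWAIT 0x20",
-		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED |
+			 CPUIDLE_FLAG_ALWAYS_ENABLE,
```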




* Re: ACPI _CST introduced performance regressions on Haswell
  2020-10-07 15:40       ` Rafael J. Wysocki
@ 2020-10-07 19:23         ` Mel Gorman
  0 siblings, 0 replies; 18+ messages in thread
From: Mel Gorman @ 2020-10-07 19:23 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Takashi Iwai, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1326 bytes --]

On Wed, Oct 07, 2020 at 05:40:43PM +0200, Rafael J. Wysocki wrote:
> > # grep . enable-cst/cpuidle/state*/disable
> > enable-cst/cpuidle/state0/disable:0
> > enable-cst/cpuidle/state1/disable:0
> > enable-cst/cpuidle/state2/disable:0
> > enable-cst/cpuidle/state3/disable:1
> > enable-cst/cpuidle/state4/disable:1
> > # grep . enable-cst/cpuidle/state*/default_status
> > enable-cst/cpuidle/state0/default_status:enabled
> > enable-cst/cpuidle/state1/default_status:enabled
> > enable-cst/cpuidle/state2/default_status:enabled
> > enable-cst/cpuidle/state3/default_status:disabled
> > enable-cst/cpuidle/state4/default_status:disabled
> > 
> > That looks like C3 and C6 are disabled after the patch.
> > 
> > # grep . enable-cst/cpuidle/state*/name
> > enable-cst/cpuidle/state0/name:POLL
> > enable-cst/cpuidle/state1/name:C1
> > enable-cst/cpuidle/state2/name:C1E
> > enable-cst/cpuidle/state3/name:C3
> > enable-cst/cpuidle/state4/name:C6
> > 
>
> That's kind of unexpected and there may be two reasons for that.
> 
> First off, the MWAIT hints in the ACPI tables for C3 and C6 may be different
> from the ones in the intel_idle internal table.
> 
> Second, the ACPI tables may only be listing C1.
> 
> Can you send me the acpidump output from the affected machine, please?
> 

Attached.

Thanks.

-- 
Mel Gorman
SUSE Labs

[-- Attachment #2: acpidump.xz --]
[-- Type: application/x-xz, Size: 92724 bytes --]


* Re: ACPI _CST introduced performance regressions on Haswell
  2020-10-07 15:45         ` Rafael J. Wysocki
@ 2020-10-08  9:09           ` Mel Gorman
  2020-10-08 17:15             ` Rafael J. Wysocki
  0 siblings, 1 reply; 18+ messages in thread
From: Mel Gorman @ 2020-10-08  9:09 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Takashi Iwai, linux-kernel

On Wed, Oct 07, 2020 at 05:45:30PM +0200, Rafael J. Wysocki wrote:
> > pre-cst is just before your patch
> > enable-cst is your patch that was bisected
> > enable-cst-no-hsx-acpi is your patch with use_acpi disabled
> > 5.9-rc8-vanilla is what it sounds like
> > 5.9-rc8-no-hsx-acpi disables use_acpi
> > 
> > The enable-cst-no-hsx-acpi result indicates that use_acpi was the issue for
> > Haswell (at least these machines). Looking just at 5.9-rc8-vanilla might
> > have been misleading because its performance is not far off the baseline
> > due to unrelated changes that mostly offset the performance penalty.
> > 
> > The key question is -- how appropriate would it be to disable acpi for
> > Haswell? Would that be generally safe or could it hide other surprises?
> > 
> It should be safe, but let's try to do something more fine-grained.
> 
> There is the CPUIDLE_FLAG_ALWAYS_ENABLE flag that is set for C1E.  Can you
> please try to set it for C6 in hsw_cstates instead of clearing use_acpi in
> idle_cpu_hsx and retest?
> 

Performance-wise, always enabling C6 helps but it may be specific to
this workload. Looking across all tested kernels I get:

netperf-udp
                                      5.5.0              5.5.0-rc2              5.5.0-rc2              5.9.0-rc8              5.9.0-rc8              5.9.0-rc8
                                    vanilla                pre-cst             enable-cst                vanilla           disable-acpi              enable-c6
Hmean     send-64         196.31 (   0.00%)      208.56 *   6.24%*      181.15 *  -7.72%*      199.84 *   1.80%*      235.09 *  19.76%*      234.79 *  19.60%*
Hmean     send-128        391.75 (   0.00%)      408.13 *   4.18%*      359.92 *  -8.12%*      396.81 (   1.29%)      469.44 *  19.83%*      465.55 *  18.84%*
Hmean     send-256        776.38 (   0.00%)      798.39 *   2.84%*      707.31 *  -8.90%*      781.63 (   0.68%)      917.19 *  18.14%*      905.06 *  16.57%*
Hmean     send-1024      3019.64 (   0.00%)     3099.00 *   2.63%*     2756.32 *  -8.72%*     3017.06 (  -0.09%)     3509.84 *  16.23%*     3532.85 *  17.00%*
Hmean     send-2048      5790.31 (   0.00%)     6209.53 *   7.24%*     5394.42 *  -6.84%*     5846.11 (   0.96%)     6861.93 *  18.51%*     6852.08 *  18.34%*
Hmean     send-3312      8909.98 (   0.00%)     9483.92 *   6.44%*     8332.35 *  -6.48%*     9047.52 *   1.54%*    10677.93 *  19.84%*    10509.41 *  17.95%*
Hmean     send-4096     10517.63 (   0.00%)    11044.19 *   5.01%*     9851.70 *  -6.33%*    10914.24 *   3.77%*    12719.58 *  20.94%*    12731.06 *  21.04%*
Hmean     send-8192     17355.48 (   0.00%)    18344.50 *   5.70%*    15844.38 *  -8.71%*    17690.46 (   1.93%)    20777.97 *  19.72%*    20220.24 *  16.51%*
Hmean     send-16384    28585.78 (   0.00%)    28950.90 (   1.28%)    25946.88 *  -9.23%*    26643.69 *  -6.79%*    30891.89 *   8.07%*    30701.46 *   7.40%*

The difference between always using ACPI and force enabling C6 is
negligible in this case, but more on that later.

netperf-udp
                                  5.9.0-rc8              5.9.0-rc8
                               disable-acpi              enable-c6
Hmean     send-64         235.09 (   0.00%)      234.79 (  -0.13%)
Hmean     send-128        469.44 (   0.00%)      465.55 (  -0.83%)
Hmean     send-256        917.19 (   0.00%)      905.06 (  -1.32%)
Hmean     send-1024      3509.84 (   0.00%)     3532.85 (   0.66%)
Hmean     send-2048      6861.93 (   0.00%)     6852.08 (  -0.14%)
Hmean     send-3312     10677.93 (   0.00%)    10509.41 *  -1.58%*
Hmean     send-4096     12719.58 (   0.00%)    12731.06 (   0.09%)
Hmean     send-8192     20777.97 (   0.00%)    20220.24 *  -2.68%*
Hmean     send-16384    30891.89 (   0.00%)    30701.46 (  -0.62%)
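For reference, the Hmean rows in these tables are harmonic means of the per-iteration throughput samples, and each percentage is the change relative to the baseline column. A minimal sketch of the comparison (this is not the actual mmtests reporting code, just the arithmetic it implies):

```python
from statistics import harmonic_mean

def hmean_delta(baseline, candidate):
    """Percentage change of candidate vs baseline using harmonic means;
    negative values are regressions, matching the tables above."""
    b = harmonic_mean(baseline)
    c = harmonic_mean(candidate)
    return (c - b) / b * 100.0

# e.g. hmean_delta([30891.89], [30701.46]) is roughly -0.62, as in the
# send-16384 row comparing disable-acpi against enable-c6.
```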

The default status and enabled states differ.

For 5.9-rc8 vanilla, the default and disabled status for cstates are

./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/disable:0
./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/disable:0
./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/disable:0
./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/disable:1
./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/disable:1
./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/default_status:enabled
./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/default_status:enabled
./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/default_status:enabled
./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/default_status:disabled
./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/default_status:disabled

For use_acpi == false, all c-states are enabled

./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/disable:0
./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/disable:0
./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/disable:0
./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/disable:0
./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/disable:0
./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/default_status:enabled
./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/default_status:enabled
./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/default_status:enabled
./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/default_status:enabled
./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/default_status:enabled

Force enabling C6

./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/disable:0
./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/disable:0
./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/disable:0
./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/disable:1
./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/disable:0
./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/default_status:enabled
./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/default_status:enabled
./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/default_status:enabled
./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/default_status:disabled
./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/default_status:enabled

Note that as expected, C3 remains disabled when only C6 is forced (state3
== c3, state4 == c6). While this particular workload does not appear to
care as it does not remain idle for long, the exit latency difference
between c3 and c6 is large so potentially a workload that idles for short
durations that are somewhere between c1e and c3 exit latency might take
a larger penalty exiting from c6 state if the deeper c-state is selected
for idling.

./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/residency:0
./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/residency:2
./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/residency:20
./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/residency:100
./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/residency:400
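To illustrate the selection concern above: the governor (roughly) picks the deepest enabled state whose target residency does not exceed the predicted idle duration, so with C3 disabled an idle period of, say, 150us falls back to C1E rather than stepping up to C6. A simplified sketch, not the actual menu governor logic (which also weighs exit latency against a latency limit):

```python
def pick_state(states, predicted_idle_us):
    """states: list of (name, target_residency_us, disabled) ordered from
    shallowest to deepest, mirroring the sysfs residency values above."""
    chosen = states[0][0]  # POLL (residency 0) is always available
    for name, residency, disabled in states:
        if not disabled and residency <= predicted_idle_us:
            chosen = name  # keep the deepest state that still qualifies
    return chosen

# Residencies as in the 5.9.0-rc8-enable-c6 listing: C3 disabled, C6 enabled.
states = [("POLL", 0, False), ("C1", 2, False), ("C1E", 20, False),
          ("C3", 100, True), ("C6", 400, False)]
```

With this table, a predicted 150us idle selects C1E even though C3's residency would have qualified, which is the gap the paragraph above describes.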

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: ACPI _CST introduced performance regressions on Haswell
  2020-10-08  9:09           ` Mel Gorman
@ 2020-10-08 17:15             ` Rafael J. Wysocki
  2020-10-08 17:34               ` Mel Gorman
  0 siblings, 1 reply; 18+ messages in thread
From: Rafael J. Wysocki @ 2020-10-08 17:15 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Takashi Iwai, linux-kernel

On 10/8/2020 11:09 AM, Mel Gorman wrote:
> On Wed, Oct 07, 2020 at 05:45:30PM +0200, Rafael J. Wysocki wrote:
>>> pre-cst is just before your patch
>>> enable-cst is your patch that was bisected
>>> enable-cst-no-hsx-acpi is your patch with use_acpi disabled
>>> 5.9-rc8-vanilla is what it sounds like
>>> 5.9-rc8-no-hsx-acpi disables use_acpi
>>>
>>> The enable-cst-no-hsx-acpi result indicates that use_acpi was the issue for
>>> Haswell (at least these machines). Looking just at 5.9-rc8-vanilla might
>>> have been misleading because its performance is not far off the baseline
>>> due to unrelated changes that mostly offset the performance penalty.
>>>
>>> The key question is -- how appropriate would it be to disable acpi for
>>> Haswell? Would that be generally safe or could it hide other surprises?
>>>
>> It should be safe, but let's try to do something more fine-grained.
>>
>> There is the CPUIDLE_FLAG_ALWAYS_ENABLE flag that is set for C1E.  Can you
>> please try to set it for C6 in hsw_cstates instead of clearing use_acpi in
>> idle_cpu_hsx and retest?
>>
> Performance-wise, always enabling C6 helps but it may be specific to
> this workload. Looking across all tested kernels I get;
>
> netperf-udp
>                                        5.5.0              5.5.0-rc2              5.5.0-rc2              5.9.0-rc8              5.9.0-rc8              5.9.0-rc8
>                                      vanilla                pre-cst             enable-cst                vanilla           disable-acpi              enable-c6
> Hmean     send-64         196.31 (   0.00%)      208.56 *   6.24%*      181.15 *  -7.72%*      199.84 *   1.80%*      235.09 *  19.76%*      234.79 *  19.60%*
> Hmean     send-128        391.75 (   0.00%)      408.13 *   4.18%*      359.92 *  -8.12%*      396.81 (   1.29%)      469.44 *  19.83%*      465.55 *  18.84%*
> Hmean     send-256        776.38 (   0.00%)      798.39 *   2.84%*      707.31 *  -8.90%*      781.63 (   0.68%)      917.19 *  18.14%*      905.06 *  16.57%*
> Hmean     send-1024      3019.64 (   0.00%)     3099.00 *   2.63%*     2756.32 *  -8.72%*     3017.06 (  -0.09%)     3509.84 *  16.23%*     3532.85 *  17.00%*
> Hmean     send-2048      5790.31 (   0.00%)     6209.53 *   7.24%*     5394.42 *  -6.84%*     5846.11 (   0.96%)     6861.93 *  18.51%*     6852.08 *  18.34%*
> Hmean     send-3312      8909.98 (   0.00%)     9483.92 *   6.44%*     8332.35 *  -6.48%*     9047.52 *   1.54%*    10677.93 *  19.84%*    10509.41 *  17.95%*
> Hmean     send-4096     10517.63 (   0.00%)    11044.19 *   5.01%*     9851.70 *  -6.33%*    10914.24 *   3.77%*    12719.58 *  20.94%*    12731.06 *  21.04%*
> Hmean     send-8192     17355.48 (   0.00%)    18344.50 *   5.70%*    15844.38 *  -8.71%*    17690.46 (   1.93%)    20777.97 *  19.72%*    20220.24 *  16.51%*
> Hmean     send-16384    28585.78 (   0.00%)    28950.90 (   1.28%)    25946.88 *  -9.23%*    26643.69 *  -6.79%*    30891.89 *   8.07%*    30701.46 *   7.40%*
>
> The difference between always using ACPI and force enabling C6 is
> negligible in this case but more on that later
>
> netperf-udp
>                                    5.9.0-rc8              5.9.0-rc8
>                                 disable-acpi              enable-c6
> Hmean     send-64         235.09 (   0.00%)      234.79 (  -0.13%)
> Hmean     send-128        469.44 (   0.00%)      465.55 (  -0.83%)
> Hmean     send-256        917.19 (   0.00%)      905.06 (  -1.32%)
> Hmean     send-1024      3509.84 (   0.00%)     3532.85 (   0.66%)
> Hmean     send-2048      6861.93 (   0.00%)     6852.08 (  -0.14%)
> Hmean     send-3312     10677.93 (   0.00%)    10509.41 *  -1.58%*
> Hmean     send-4096     12719.58 (   0.00%)    12731.06 (   0.09%)
> Hmean     send-8192     20777.97 (   0.00%)    20220.24 *  -2.68%*
> Hmean     send-16384    30891.89 (   0.00%)    30701.46 (  -0.62%)
>
> The default status and enabled states differ.
>
> For 5.9-rc8 vanilla, the default and disabled status for cstates are
>
> ./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/disable:0
> ./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/disable:0
> ./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/disable:0
> ./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/disable:1
> ./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/disable:1
> ./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/default_status:enabled
> ./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/default_status:enabled
> ./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/default_status:enabled
> ./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/default_status:disabled
> ./5.9.0-rc8-vanilla/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/default_status:disabled
>
> For use_acpi == false, all c-states are enabled
>
> ./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/disable:0
> ./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/disable:0
> ./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/disable:0
> ./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/disable:0
> ./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/disable:0
> ./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/default_status:enabled
> ./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/default_status:enabled
> ./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/default_status:enabled
> ./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/default_status:enabled
> ./5.9.0-rc8-disable-acpi/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/default_status:enabled
>
> Force enabling C6
>
> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/disable:0
> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/disable:0
> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/disable:0
> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/disable:1
> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/disable:0
> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/default_status:enabled
> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/default_status:enabled
> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/default_status:enabled
> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/default_status:disabled
> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/default_status:enabled
>
> Note that as expected, C3 remains disabled when only C6 is forced (state3
> == c3, state4 == c6). While this particular workload does not appear to
> care as it does not remain idle for long, the exit latency difference
> between c3 and c6 is large so potentially a workload that idles for short
> durations that are somewhere between c1e and c3 exit latency might take
> a larger penalty exiting from c6 state if the deeper c-state is selected
> for idling.
>
> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/residency:0
> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/residency:2
> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/residency:20
> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/residency:100
> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/residency:400
>
If you are worried that C6 might be used instead of C3 in some cases, 
this is not going to happen.

In all cases in which C3 would have been used had it not been disabled,
C1E will be used instead.

Which BTW indicates that using C1E more often adds a lot of latency to 
the workload (if C3 and C6 are both disabled, C1E is used in all cases 
in which one of them would have been used). With C6 enabled, that state 
is used at least sometimes (so C1E is used less often), but PC6 doesn't 
seem to be really used - it looks like only core C6 is entered, which 
may be why C6 adds less latency than C1E (and analogously for C3).



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: ACPI _CST introduced performance regressions on Haswell
  2020-10-08 17:15             ` Rafael J. Wysocki
@ 2020-10-08 17:34               ` Mel Gorman
  2020-10-13 18:55                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 18+ messages in thread
From: Mel Gorman @ 2020-10-08 17:34 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Takashi Iwai, linux-kernel

On Thu, Oct 08, 2020 at 07:15:46PM +0200, Rafael J. Wysocki wrote:
> > Force enabling C6
> > 
> > ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/disable:0
> > ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/disable:0
> > ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/disable:0
> > ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/disable:1
> > ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/disable:0
> > ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/default_status:enabled
> > ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/default_status:enabled
> > ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/default_status:enabled
> > ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/default_status:disabled
> > ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/default_status:enabled
> > 
> > Note that as expected, C3 remains disabled when only C6 is forced (state3
> > == c3, state4 == c6). While this particular workload does not appear to
> > care as it does not remain idle for long, the exit latency difference
> > between c3 and c6 is large so potentially a workload that idles for short
> > durations that are somewhere between c1e and c3 exit latency might take
> > a larger penalty exiting from c6 state if the deeper c-state is selected
> > for idling.
> > 
> > ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/residency:0
> > ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/residency:2
> > ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/residency:20
> > ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/residency:100
> > ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/residency:400
> > 
>
> If you are worried that C6 might be used instead of C3 in some cases, this
> is not going to happen.
>

Ok, so it goes in the C1E direction instead. I've lost track of how the
C-state is selected based on predictions about the future; it's changed
a bit over time.
 
> In all cases in which C3 would have been used had it not been disabled, C1E
> will be used instead.
> 
> Which BTW indicates that using C1E more often adds a lot of latency to the
> workload (if C3 and C6 are both disabled, C1E is used in all cases in which
> one of them would have been used).

Which is weird. From the exit latency alone, I'd think it would be faster
to use C1E instead of C3. It implies that using C1E instead of C3/C6 has
some other side-effect on Haswell. At one point, there was plenty of advice
on disabling C1E but very little concrete information on what impact it
has exactly and why it might cause problems that other c-states avoid.

> With C6 enabled, that state is used at
> least sometimes (so C1E is used less often), but PC6 doesn't seem to be
> really used - it looks like only core C6 is entered, which may be why C6
> adds less latency than C1E (and analogously for C3).
> 

At the moment, I'm happy with either solution but mostly because I can't
tell what other trade-offs should be considered :/

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: ACPI _CST introduced performance regressions on Haswell
  2020-10-08 17:34               ` Mel Gorman
@ 2020-10-13 18:55                 ` Rafael J. Wysocki
  2020-10-14 22:37                   ` Mel Gorman
  0 siblings, 1 reply; 18+ messages in thread
From: Rafael J. Wysocki @ 2020-10-13 18:55 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Takashi Iwai, linux-kernel, Len Brown, Srinivas Pandruvada,
	rafael, Linux PM

On 10/8/2020 7:34 PM, Mel Gorman wrote:
> On Thu, Oct 08, 2020 at 07:15:46PM +0200, Rafael J. Wysocki wrote:
>>> Force enabling C6
>>>
>>> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/disable:0
>>> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/disable:0
>>> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/disable:0
>>> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/disable:1
>>> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/disable:0
>>> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/default_status:enabled
>>> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/default_status:enabled
>>> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/default_status:enabled
>>> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/default_status:disabled
>>> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/default_status:enabled
>>>
>>> Note that as expected, C3 remains disabled when only C6 is forced (state3
>>> == c3, state4 == c6). While this particular workload does not appear to
>>> care as it does not remain idle for long, the exit latency difference
>>> between c3 and c6 is large so potentially a workload that idles for short
>>> durations that are somewhere between c1e and c3 exit latency might take
>>> a larger penalty exiting from c6 state if the deeper c-state is selected
>>> for idling.
>>>
>>> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state0/residency:0
>>> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state1/residency:2
>>> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state2/residency:20
>>> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state3/residency:100
>>> ./5.9.0-rc8-enable-c6/iter-0/sys/devices/system/cpu/cpu0/cpuidle/state4/residency:400
>>>
>> If you are worried that C6 might be used instead of C3 in some cases, this
>> is not going to happen.
>>
> Ok, so it goes in the C1E direction instead. I lost track of how C-state
> is selected based on predictions about the future. It's changed a bit
> over time.
>   
>> In all cases in which C3 would have been used had it not been disabled, C1E
>> will be used instead.
>>
>> Which BTW indicates that using C1E more often adds a lot of latency to the
>> workload (if C3 and C6 are both disabled, C1E is used in all cases in which
>> one of them would have been used).
> Which is weird. From the exit latency alone, I'd think it would be faster
> to use C1E instead of C3. It implies that using C1E instead of C3/C6 has
> some other side-effect on Haswell. At one point, there was plenty of advice
> on disabling C1E but very little concrete information on what impact it
> has exactly and why it might cause problems that other c-states avoid.
>
>> With C6 enabled, that state is used at
>> least sometimes (so C1E is used less often), but PC6 doesn't seem to be
>> really used - it looks like only core C6 is entered, which may be why C6
>> adds less latency than C1E (and analogously for C3).
>>
> At the moment, I'm happy with either solution but mostly because I can't
> tell what other trade-offs should be considered :/
>
I talked to Len and Srinivas about this and my theory above didn't survive.

The most likely reason why you see a performance drop after enabling the 
ACPI support (which effectively causes C3 and C6 to be disabled by 
default on the affected machines) is because the benchmarks in question 
require sufficiently high one-CPU performance and the CPUs cannot reach 
high enough one-core turbo P-states without the other CPUs going into C6.
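As a rough illustration of that effect (the turbo bin values below are invented for the example and are not the real E5-2670 v3 turbo table; only the 2.3 GHz base and 3.1 GHz single-core max come from the CPU description earlier in the thread):

```python
def max_turbo_mhz(active_cores, base_mhz=2300,
                  turbo_bins_mhz=(800, 700, 500, 300)):
    """Toy model of opportunistic turbo: the fewer cores awake (i.e. not
    parked in C6), the larger the frequency bin the remaining cores may
    use. Bin values are hypothetical, for illustration only."""
    if active_cores < 1:
        raise ValueError("need at least one active core")
    # Beyond a few active cores, the all-core turbo bin applies.
    idx = min(active_cores, len(turbo_bins_mhz)) - 1
    return base_mhz + turbo_bins_mhz[idx]
```

In this model a single busy core reaches 3100 MHz only while the others sit in C6; with C6 disabled by default, they never park deeply enough and the one-core turbo headroom is lost.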

Inspection of the ACPI tables you sent me indicates that there is a BIOS 
switch in that system allowing C6 to be enabled.  Would it be possible 
to check whether or not there is a BIOS setup option to change that setting?

Also, I need to know what happens if that system is started with 
intel_idle disabled.  That is, what idle states are visible in sysfs in 
that configuration (what their names and descriptions are in particular) 
and whether or not the issue is still present then.

In addition to that, can you please run the benchmark on that system 
under turbostat both with unmodified intel_idle and with intel_idle 
disabled and post the turbostat output in each of those cases?

Cheers!



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: ACPI _CST introduced performance regressions on Haswell
  2020-10-13 18:55                 ` Rafael J. Wysocki
@ 2020-10-14 22:37                   ` Mel Gorman
  2020-10-15 18:34                     ` Mel Gorman
  0 siblings, 1 reply; 18+ messages in thread
From: Mel Gorman @ 2020-10-14 22:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Takashi Iwai, linux-kernel, Len Brown, Srinivas Pandruvada,
	rafael, Linux PM

On Tue, Oct 13, 2020 at 08:55:26PM +0200, Rafael J. Wysocki wrote:
> > > With C6 enabled, that state is used at
> > > least sometimes (so C1E is used less often), but PC6 doesn't seem to be
> > > really used - it looks like only core C6 is entered, which may be why C6
> > > adds less latency than C1E (and analogously for C3).
> > > 
> > At the moment, I'm happy with either solution but mostly because I can't
> > tell what other trade-offs should be considered :/
> > 
>
> I talked to Len and Srinivas about this and my theory above didn't survive.
> 
> The most likely reason why you see a performance drop after enabling the
> ACPI support (which effectively causes C3 and C6 to be disabled by default
> on the affected machines) is because the benchmarks in question require
> sufficiently high one-CPU performance and the CPUs cannot reach high enough
> one-core turbo P-states without the other CPUs going into C6.
> 

That makes sense. It can also explain anomalies, like server/clients
split across NUMA nodes with no other activity sometimes performing
better, because turbo states matter more than memory locality for
some benchmarks.

> Inspection of the ACPI tables you sent me indicates that there is a BIOS
> switch in that system allowing C6 to be enabled.  Would it be possible to
> check whether or not there is a BIOS setup option to change that setting?
> 

Yes, it's well hidden but it's there. If the profile is set to custom,
the p-states can be selected, and the "custom" default enables C6 but not
C3 (there is a note saying that it's not recommended for that CPU). If I
then switch it back to the normal profile, the c-states are not restored,
so this is a one-way trip even if you disable the c-state in custom,
reboot, switch back, and reboot. Same if the machine is reset to "optimal
default settings". Yay for BIOS developers.

This means I have a limited number of attempts to do something about
this. Two machines can no longer reproduce the problem reliably.

However, that also suggests a "solution" is to enable the state on each of
these machines, discard the historical results, and carry on. In practice,
if reports are received either upstream or in distros about strange
behaviour on old machines when upgrading, first check the c-states
and the CPU generation. Given long enough, the machines with that specific
CPU and a crappy BIOS will phase out :/

> Also, I need to know what happens if that system is started with intel_idle
> disabled.  That is, what idle states are visible in sysfs in that
> configuration (what their names and descriptions are in particular) and
> whether or not the issue is still present then.
> 

Kernel name		c-state	  exit latency	 disabled?  default?
-----------             ------    ------------   ---------  --------
5.9-vanilla              POLL     latency:0      disabled:0 default:enabled
5.9-vanilla              C1       latency:2      disabled:0 default:enabled
5.9-vanilla              C1E      latency:10     disabled:0 default:enabled
5.9-vanilla              C3       latency:33     disabled:1 default:disabled
5.9-vanilla              C6       latency:133    disabled:0 default:enabled
5.9-intel_idle-disabled  POLL     latency:0      disabled:0 default:enabled
5.9-intel_idle-disabled  C1       latency:1      disabled:0 default:enabled
5.9-intel_idle-disabled  C2       latency:41     disabled:0 default:enabled
5.9-acpi-disable         POLL     latency:0      disabled:0 default:enabled
5.9-acpi-disable         C1       latency:2      disabled:0 default:enabled
5.9-acpi-disable         C1E      latency:10     disabled:0 default:enabled
5.9-acpi-disable         C3       latency:33     disabled:0 default:enabled
5.9-acpi-disable         C6       latency:133    disabled:0 default:enabled
5.9-custom-powerprofile  POLL     latency:0      disabled:0 default:enabled
5.9-custom-powerprofile  C1       latency:2      disabled:0 default:enabled
5.9-custom-powerprofile  C1E      latency:10     disabled:0 default:enabled
5.9-custom-powerprofile  C3       latency:33     disabled:1 default:disabled
5.9-custom-powerprofile  C6       latency:133    disabled:0 default:enabled
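The table above was collected from the standard sysfs cpuidle attributes. A minimal collection helper might look like this (a sketch, not a tool used in the thread; the directory is passed in so it can be pointed at any cpuN/cpuidle path or a saved copy of one):

```python
from pathlib import Path

def read_cpuidle_states(cpuidle_dir):
    """Return name/exit latency/disable for each stateN directory under
    cpuidle_dir (e.g. /sys/devices/system/cpu/cpu0/cpuidle)."""
    states = []
    # Lexical sort is fine for the handful of states seen on these machines.
    for d in sorted(Path(cpuidle_dir).glob("state[0-9]*")):
        states.append({
            "name": (d / "name").read_text().strip(),
            "latency_us": int((d / "latency").read_text()),
            "disabled": (d / "disable").read_text().strip() == "1",
        })
    return states
```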

In this case, the test results are similar. I vaguely recall the BIOS
was reconfigured on the second machine for unrelated reasons and it's no
longer able to reproduce the problem properly. As the results show little
difference in this case, I didn't include the turbostat figures. The
summary from mmtests showed this:

                          5.9                  5.9           5.9                  5.9
                      vanilla  intel_idle-disabled  acpi-disable  custom-powerprofile
Hmean Avg_MHz        8.31        8.29        8.35        8.26
Hmean Busy%          0.58        0.56        0.58        0.57
Hmean CPU%c1         3.19       40.81        3.14        3.18
Hmean CPU%c3         0.00        0.00        0.00        0.00
Hmean CPU%c6        92.24        0.00       92.21       92.20
Hmean CPU%c7         0.00        0.00        0.00        0.00
Hmean PkgWatt       47.98        0.00       47.95       47.68

i.e., the average time in C6 was high on the vanilla kernel, whereas it
would not have been when the problem was originally reproduced (I
clearly broke this test machine in a way I can't "fix"). Disabling
intel_idle kept it mostly in the C1 state.

I'll try a third machine tomorrow but even if I reproduce the problem,
I won't be able to do it again because ... yey bios developers.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: ACPI _CST introduced performance regressions on Haswell
  2020-10-14 22:37                   ` Mel Gorman
@ 2020-10-15 18:34                     ` Mel Gorman
  2020-10-16 13:41                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 18+ messages in thread
From: Mel Gorman @ 2020-10-15 18:34 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Takashi Iwai, linux-kernel, Len Brown, Srinivas Pandruvada,
	rafael, Linux PM

> Yes, it's well hidden but it's there. If the profile is made custom, then
> the p-states can be selected and "custom" default enables C6 but not C3
> (there is a note saying that it's not recommended for that CPU). If I
> then switch it back to the normal profile, the c-states are not restored
> so this is a one-way trip even if you disable the c-state in custom,
> reboot, switch back, reboot. Same if the machine is reset to "optimal
> default settings". Yey for BIOS developers.
> 
> This means I have a limited number of attempts to do something about
> this. 2 machines can no longer reproduce the problem reliably.
> 

Turns out I didn't even have that. On another machine (same model,
same cpu, different BIOS that cannot be updated), enabling the C6 state
still did not enable it on boot and dmesg complained about CST not being
usable. This is weird because one would expect that if CST was unusable
it would be the same as use_acpi == false.

This could potentially happen if the ACPI tables are unsuitable due to
bad FFH information for a lower C-state. If _CST is not found or not
usable, should acpi_state_table.count be reset to go back to the old
behaviour?

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 13600c403035..3b84f8631b40 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -1261,6 +1261,7 @@ static bool intel_idle_acpi_cst_extract(void)
 		return true;
 	}
 
+	acpi_state_table.count = 0;
 	pr_debug("ACPI _CST not found or not usable\n");
 	return false;
 }

-- 
Mel Gorman
SUSE Labs


* Re: ACPI _CST introduced performance regressions on Haswell
  2020-10-15 18:34                     ` Mel Gorman
@ 2020-10-16 13:41                       ` Rafael J. Wysocki
  2020-10-16 14:09                         ` Mel Gorman
  0 siblings, 1 reply; 18+ messages in thread
From: Rafael J. Wysocki @ 2020-10-16 13:41 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Rafael J. Wysocki, Takashi Iwai, Linux Kernel Mailing List,
	Len Brown, Srinivas Pandruvada, Rafael J. Wysocki, Linux PM

On Thu, Oct 15, 2020 at 8:34 PM Mel Gorman <mgorman@techsingularity.net> wrote:
>
> > Yes, it's well hidden but it's there. If the profile is made custom, then
> > the p-states can be selected and "custom" default enables C6 but not C3
> > (there is a note saying that it's not recommended for that CPU). If I
> > then switch it back to the normal profile, the c-states are not restored
> > so this is a one-way trip even if you disable the c-state in custom,
> > reboot, switch back, reboot. Same if the machine is reset to "optimal
> > default settings". Yey for BIOS developers.
> >
> > This means I have a limited number of attempts to do something about
> > this. 2 machines can no longer reproduce the problem reliably.
> >
>
> Turns out I didn't even have that. On another machine (same model,
> same cpu, different BIOS that cannot be updated), enabling the C6 state
> still did not enable it on boot and dmesg complained about CST not being
> usable. This is weird because one would expect that if CST was unusable
> it would be the same as use_acpi == false.
>
> This could potentially happen if the ACPI tables are unsuitable due to
> bad FFH information for a lower C-state. If _CST is not found or not
> usable, should acpi_state_table.count be reset to go back to the old
> behaviour?

Yes, it should, although I would reset it in intel_idle_cst_usable()
right away before returning 'false'.

I can send a patch to do the above or please submit the one below as
it works too.

> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> index 13600c403035..3b84f8631b40 100644
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -1261,6 +1261,7 @@ static bool intel_idle_acpi_cst_extract(void)
>                 return true;
>         }
>
> +       acpi_state_table.count = 0;
>         pr_debug("ACPI _CST not found or not usable\n");
>         return false;
>  }
>
> --


* Re: ACPI _CST introduced performance regressions on Haswell
  2020-10-16 13:41                       ` Rafael J. Wysocki
@ 2020-10-16 14:09                         ` Mel Gorman
  2020-10-16 15:29                           ` Rafael J. Wysocki
  0 siblings, 1 reply; 18+ messages in thread
From: Mel Gorman @ 2020-10-16 14:09 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Takashi Iwai, Linux Kernel Mailing List,
	Len Brown, Srinivas Pandruvada, Linux PM

On Fri, Oct 16, 2020 at 03:41:12PM +0200, Rafael J. Wysocki wrote:
> > Turns out I didn't even have that. On another machine (same model,
> > same cpu, different BIOS that cannot be updated), enabling the C6 state
> > still did not enable it on boot and dmesg complained about CST not being
> > usable. This is weird because one would expect that if CST was unusable
> > it would be the same as use_acpi == false.
> >
> > This could potentially happen if the ACPI tables are unsuitable due to
> > bad FFH information for a lower C-state. If _CST is not found or not
> > usable, should acpi_state_table.count be reset to go back to the old
> > behaviour?
> 
> Yes, it should, although I would reset it in intel_idle_cst_usable()
> right away before returning 'false'.
> 

Good stuff.

> I can send a patch to do the above or please submit the one below as
> it works too.
> 

I'm happy to go with your alternative if you prefer. For a finish,
I decided it was worth reporting if the _CST was ignored regardless of
the reason. It performs roughly the same as setting use_acpi = false on
the affected machines.

---8<---
intel_idle: Ignore _CST if control cannot be taken from the platform

e6d4f08a6776 ("intel_idle: Use ACPI _CST on server systems") avoids
enabling c-states that have been disabled by the platform with the
exception of C1E.

Unfortunately, BIOS implementations are not always consistent in terms
of how capabilities are advertised and control cannot always be handed
over. If control cannot be handed over then intel_idle reports that "ACPI
_CST not found or not usable" but does not clear acpi_state_table.count
meaning the information is still partially used.

This patch ignores ACPI information if CST control cannot be requested from
the platform. This was only observed on a number of Haswell platforms that
had identical CPUs but not identical BIOS versions.  While this problem
may be rare overall, 24 separate test cases bisected to this specific
commit across 4 separate test machines, so it is worth addressing. If the
situation occurs, the kernel behaves as it did before commit e6d4f08a6776
and uses any c-states that are discovered.

The affected test cases were all ones that involved a small number of
processes -- exec microbenchmark, pipe microbenchmark, git test suite,
netperf, tbench with one client and system call microbenchmark. Each
case benefits from being able to use turboboost which is prevented if the
lower c-states are unavailable. This may mask real regressions specific
to older hardware so it is worth addressing.

C-state status before and after the patch

5.9.0-vanilla            POLL     latency:0      disabled:0 default:enabled
5.9.0-vanilla            C1       latency:2      disabled:0 default:enabled
5.9.0-vanilla            C1E      latency:10     disabled:0 default:enabled
5.9.0-vanilla            C3       latency:33     disabled:1 default:disabled
5.9.0-vanilla            C6       latency:133    disabled:1 default:disabled
5.9.0-ignore-cst-v1r1    POLL     latency:0      disabled:0 default:enabled
5.9.0-ignore-cst-v1r1    C1       latency:2      disabled:0 default:enabled
5.9.0-ignore-cst-v1r1    C1E      latency:10     disabled:0 default:enabled
5.9.0-ignore-cst-v1r1    C3       latency:33     disabled:0 default:enabled
5.9.0-ignore-cst-v1r1    C6       latency:133    disabled:0 default:enabled

Patch enables C3/C6.

Netperf UDP_STREAM

netperf-udp
                                      5.5.0                  5.9.0
                                    vanilla        ignore-cst-v1r1
Hmean     send-64         193.41 (   0.00%)      226.54 *  17.13%*
Hmean     send-128        392.16 (   0.00%)      450.54 *  14.89%*
Hmean     send-256        769.94 (   0.00%)      881.85 *  14.53%*
Hmean     send-1024      2994.21 (   0.00%)     3468.95 *  15.85%*
Hmean     send-2048      5725.60 (   0.00%)     6628.99 *  15.78%*
Hmean     send-3312      8468.36 (   0.00%)    10288.02 *  21.49%*
Hmean     send-4096     10135.46 (   0.00%)    12387.57 *  22.22%*
Hmean     send-8192     17142.07 (   0.00%)    19748.11 *  15.20%*
Hmean     send-16384    28539.71 (   0.00%)    30084.45 *   5.41%*

Fixes: e6d4f08a6776 ("intel_idle: Use ACPI _CST on server systems")
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 drivers/idle/intel_idle.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 9a810e4a7946..4af2d3f2c8aa 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -1212,14 +1212,13 @@ static bool __init intel_idle_acpi_cst_extract(void)
 		if (!intel_idle_cst_usable())
 			continue;
 
-		if (!acpi_processor_claim_cst_control()) {
-			acpi_state_table.count = 0;
-			return false;
-		}
+		if (!acpi_processor_claim_cst_control())
+			break;
 
 		return true;
 	}
 
+	acpi_state_table.count = 0;
 	pr_debug("ACPI _CST not found or not usable\n");
 	return false;
 }


* Re: ACPI _CST introduced performance regressions on Haswell
  2020-10-16 14:09                         ` Mel Gorman
@ 2020-10-16 15:29                           ` Rafael J. Wysocki
  0 siblings, 0 replies; 18+ messages in thread
From: Rafael J. Wysocki @ 2020-10-16 15:29 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Takashi Iwai,
	Linux Kernel Mailing List, Len Brown, Srinivas Pandruvada,
	Linux PM

On Fri, Oct 16, 2020 at 4:09 PM Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Fri, Oct 16, 2020 at 03:41:12PM +0200, Rafael J. Wysocki wrote:
> > > Turns out I didn't even have that. On another machine (same model,
> > > same cpu, different BIOS that cannot be updated), enabling the C6 state
> > > still did not enable it on boot and dmesg complained about CST not being
> > > usable. This is weird because one would expect that if CST was unusable
> > > it would be the same as use_acpi == false.
> > >
> > > This could potentially happen if the ACPI tables are unsuitable due to
> > > bad FFH information for a lower C-state. If _CST is not found or not
> > > usable, should acpi_state_table.count be reset to go back to the old
> > > behaviour?
> >
> > Yes, it should, although I would reset it in intel_idle_cst_usable()
> > right away before returning 'false'.
> >
>
> Good stuff.
>
> > I can send a patch to do the above or please submit the one below as
> > it works too.
> >
>
> I'm happy to go with your alternative if you prefer. For a finish,
> I decided it was worth reporting if the _CST was ignored regardless of
> the reason. It performs roughly the same as setting use_acpi = false on
> the affected machines.
>
> ---8<---
> intel_idle: Ignore _CST if control cannot be taken from the platform
>
> e6d4f08a6776 ("intel_idle: Use ACPI _CST on server systems") avoids
> enabling c-states that have been disabled by the platform with the
> exception of C1E.
>
> Unfortunately, BIOS implementations are not always consistent in terms
> of how capabilities are advertised and control cannot always be handed
> over. If control cannot be handed over then intel_idle reports that "ACPI
> _CST not found or not usable" but does not clear acpi_state_table.count
> meaning the information is still partially used.
>
> This patch ignores ACPI information if CST control cannot be requested from
> the platform. This was only observed on a number of Haswell platforms that
> had identical CPUs but not identical BIOS versions.  While this problem
> may be rare overall, 24 separate test cases bisected to this specific
> commit across 4 separate test machines, so it is worth addressing. If the
> situation occurs, the kernel behaves as it did before commit e6d4f08a6776
> and uses any c-states that are discovered.
>
> The affected test cases were all ones that involved a small number of
> processes -- exec microbenchmark, pipe microbenchmark, git test suite,
> netperf, tbench with one client and system call microbenchmark. Each
> case benefits from being able to use turboboost which is prevented if the
> lower c-states are unavailable. This may mask real regressions specific
> to older hardware so it is worth addressing.
>
> C-state status before and after the patch
>
> 5.9.0-vanilla            POLL     latency:0      disabled:0 default:enabled
> 5.9.0-vanilla            C1       latency:2      disabled:0 default:enabled
> 5.9.0-vanilla            C1E      latency:10     disabled:0 default:enabled
> 5.9.0-vanilla            C3       latency:33     disabled:1 default:disabled
> 5.9.0-vanilla            C6       latency:133    disabled:1 default:disabled
> 5.9.0-ignore-cst-v1r1    POLL     latency:0      disabled:0 default:enabled
> 5.9.0-ignore-cst-v1r1    C1       latency:2      disabled:0 default:enabled
> 5.9.0-ignore-cst-v1r1    C1E      latency:10     disabled:0 default:enabled
> 5.9.0-ignore-cst-v1r1    C3       latency:33     disabled:0 default:enabled
> 5.9.0-ignore-cst-v1r1    C6       latency:133    disabled:0 default:enabled
>
> Patch enables C3/C6.
>
> Netperf UDP_STREAM
>
> netperf-udp
>                                       5.5.0                  5.9.0
>                                     vanilla        ignore-cst-v1r1
> Hmean     send-64         193.41 (   0.00%)      226.54 *  17.13%*
> Hmean     send-128        392.16 (   0.00%)      450.54 *  14.89%*
> Hmean     send-256        769.94 (   0.00%)      881.85 *  14.53%*
> Hmean     send-1024      2994.21 (   0.00%)     3468.95 *  15.85%*
> Hmean     send-2048      5725.60 (   0.00%)     6628.99 *  15.78%*
> Hmean     send-3312      8468.36 (   0.00%)    10288.02 *  21.49%*
> Hmean     send-4096     10135.46 (   0.00%)    12387.57 *  22.22%*
> Hmean     send-8192     17142.07 (   0.00%)    19748.11 *  15.20%*
> Hmean     send-16384    28539.71 (   0.00%)    30084.45 *   5.41%*
>
> Fixes: e6d4f08a6776 ("intel_idle: Use ACPI _CST on server systems")
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---
>  drivers/idle/intel_idle.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> index 9a810e4a7946..4af2d3f2c8aa 100644
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -1212,14 +1212,13 @@ static bool __init intel_idle_acpi_cst_extract(void)
>                 if (!intel_idle_cst_usable())
>                         continue;
>
> -               if (!acpi_processor_claim_cst_control()) {
> -                       acpi_state_table.count = 0;
> -                       return false;
> -               }
> +               if (!acpi_processor_claim_cst_control())
> +                       break;
>
>                 return true;
>         }
>
> +       acpi_state_table.count = 0;
>         pr_debug("ACPI _CST not found or not usable\n");
>         return false;
>  }

Applied as a fix for 5.10-rc1, thanks!


end of thread, other threads:[~2020-10-16 15:29 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-06  8:36 ACPI _CST introduced performance regressions on Haswell Mel Gorman
2020-10-06 16:00 ` Rafael J. Wysocki
2020-10-06 19:03   ` Mel Gorman
2020-10-06 19:29     ` Rafael J. Wysocki
2020-10-06 21:18       ` Mel Gorman
2020-10-07 15:45         ` Rafael J. Wysocki
2020-10-08  9:09           ` Mel Gorman
2020-10-08 17:15             ` Rafael J. Wysocki
2020-10-08 17:34               ` Mel Gorman
2020-10-13 18:55                 ` Rafael J. Wysocki
2020-10-14 22:37                   ` Mel Gorman
2020-10-15 18:34                     ` Mel Gorman
2020-10-16 13:41                       ` Rafael J. Wysocki
2020-10-16 14:09                         ` Mel Gorman
2020-10-16 15:29                           ` Rafael J. Wysocki
2020-10-06 19:47     ` Mel Gorman
2020-10-07 15:40       ` Rafael J. Wysocki
2020-10-07 19:23         ` Mel Gorman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).