* PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
@ 2016-02-20  8:49 Arto Jantunen
  2016-02-20 16:31 ` Doug Smythies
  0 siblings, 1 reply; 22+ messages in thread
From: Arto Jantunen @ 2016-02-20  8:49 UTC (permalink / raw)
  To: Rafael J. Wysocki, Viresh Kumar, linux-pm

Hi,

With kernel 4.5-rc4 my Skylake machine runs very warm because all CPU
cores are always kept at 3.10 GHz (the maximum without turbo boost is
2.6 GHz), regardless of load. Switching between the governors
(performance and powersave) doesn't change the result in any way; the
frequency remains at a constant 3.10 GHz.

The machine is using the intel_pstate driver. I don't know if it's
possible to use the acpi-cpufreq driver instead (without recompiling
the kernel so that intel_pstate is not supported).

I can force the frequency down manually with cpufreq-set, so the problem
doesn't seem to be with the actual frequency changing.

Cpuinfo (taken from 4.4 which doesn't have the problem):

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 94
model name	: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
stepping	: 3
microcode	: 0x33
cpu MHz		: 1067.929
cache size	: 6144 KB
physical id	: 0
siblings	: 8
core id		: 0
cpu cores	: 4
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch ida arat epb pln pts dtherm hwp hwp_notify hwp_act_window hwp_epp intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1
bugs		:
bogomips	: 5183.96
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

-- 
Arto Jantunen


* RE: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-02-20  8:49 PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4 Arto Jantunen
@ 2016-02-20 16:31 ` Doug Smythies
  2016-02-20 17:10   ` Arto Jantunen
  0 siblings, 1 reply; 22+ messages in thread
From: Doug Smythies @ 2016-02-20 16:31 UTC (permalink / raw)
  To: 'Arto Jantunen'
  Cc: 'Rafael J. Wysocki', 'Viresh Kumar',
	linux-pm, 'Srinivas Pandruvada'

On 2016.02.20 00:50 Arto Jantunen wrote:

> When using kernel 4.5-rc4 my Skylake machine runs very warm since all
> cpu cores are always kept at 3.10Ghz (with maximum without turboboost
> being 2.6Ghz), completely regardless of load. Swapping between the
> governors (performance and powersave) doesn't change the result in any
> way, frequency remains at a constant 3.10Ghz.
>
> The machine is using the intel_pstate driver, I don't know if it's
> possible to choose to use the acpi driver instead (without recompiling
> the kernel so that intel_pstate is not supported).

If you use grub then you can disable the intel_pstate driver by modifying
the grub command line. Here is an example, where I have left other stuff
that I use:

GRUB_CMDLINE_LINUX_DEFAULT="ipv6.disable=1 intel_pstate=disable net.ifnames=1 biosdevname=0"

Here is an example, with the intel_pstate directive by itself:

GRUB_CMDLINE_LINUX_DEFAULT="intel_pstate=disable"

If the intel_pstate driver is disabled, then the acpi-cpufreq CPU
frequency scaling driver will be used. 
Remember to update grub after the above edit.

If you do not use grub, then I don't know.
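
After rebooting with the new command line, it may help to confirm which
driver actually bound to the CPUs. A minimal sketch, assuming the standard
cpufreq sysfs layout; the function name and the optional sysfs-root
argument are illustrative only:

```shell
# Print the cpufreq scaling driver in use for one CPU.
# $1: CPU number (default 0)
# $2: sysfs root (default /sys; overridable so the function can be tested)
scaling_driver() {
    sd_cpu=${1:-0}
    sd_root=${2:-/sys}
    sd_file="$sd_root/devices/system/cpu/cpu$sd_cpu/cpufreq/scaling_driver"
    if [ -r "$sd_file" ]; then
        cat "$sd_file"
    else
        # No cpufreq directory usually means no driver registered at all.
        echo "no cpufreq driver bound to cpu$sd_cpu" >&2
        return 1
    fi
}
```

Running "scaling_driver 0" would be expected to print intel_pstate before
the edit and acpi-cpufreq afterwards, provided the parameter took effect.
"cat /proc/cmdline" shows whether the kernel saw the parameter.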

As a test, please also try this and report back:

GRUB_CMDLINE_LINUX_DEFAULT="intel_pstate=no_hwp"

> I can force the frequency down manually with cpufreq-set, so the problem
> doesn't seem to be with the actual frequency changing.

> Cpuinfo (taken from 4.4 which doesn't have the problem):

Is kernel 4.5-rc4 the first one you have tried in the 4.5 series?
What I am asking is whether you know if the issue was introduced between
kernels 4.4 and 4.5-rc1 (most likely), between 4.5-rc3 and 4.5-rc4, or
somewhere else.

What distribution, if any, of Linux do you use?

It might make sense to take this off-list and into a bugzilla bug report.

... Doug




* Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-02-20 16:31 ` Doug Smythies
@ 2016-02-20 17:10   ` Arto Jantunen
  2016-02-20 18:03     ` Chen, Yu C
  0 siblings, 1 reply; 22+ messages in thread
From: Arto Jantunen @ 2016-02-20 17:10 UTC (permalink / raw)
  To: Doug Smythies
  Cc: 'Rafael J. Wysocki', 'Viresh Kumar',
	linux-pm, 'Srinivas Pandruvada'

"Doug Smythies" <dsmythies@telus.net> writes:

> On 2106.02.20 00:50 Arto Jantunen wrote:
>> The machine is using the intel_pstate driver, I don't know if it's
>> possible to choose to use the acpi driver instead (without recompiling
>> the kernel so that intel_pstate is not supported).
>
> Here is an example, with the intel_pstate directive by itself:
>
> GRUB_CMDLINE_LINUX_DEFAULT="intel_pstate=disable"
>
> If the intel_pstate driver is disabled, then the acpi-cpufreq CPU
> frequency scaling driver will be used. 
> Remember to update grub after the above edit.
>
> If you do not use grub, then I don't know.
>
> Please, and as a test, also try this and report back:
>
> GRUB_CMDLINE_LINUX_DEFAULT="intel_pstate=no_hwp"

Thanks, I'll test this.

>> I can force the frequency down manually with cpufreq-set, so the problem
>> doesn't seem to be with the actual frequency changing.
>
>> Cpuinfo (taken from 4.4 which doesn't have the problem):
>
> Is kernel 4.5-rc4 the first one you have tried in the 4.5 series?
> What I am asking is if you know if the issue was introduced between
> kernels 4.4 and 4.5-rc1 (most likely) or between 4.5-rc3 and 4.5-rc4 or?

4.5-rc4 is the first one I have tested. I assume that the problem was
introduced between 4.4 and 4.5-rc1.

> What distribution, if any, of Linux do you use?

The system is running Debian Unstable, but the kernel is built from
upstream git (no Debian patches).

-- 
Arto Jantunen


* RE: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-02-20 17:10   ` Arto Jantunen
@ 2016-02-20 18:03     ` Chen, Yu C
  2016-02-21  8:45       ` Arto Jantunen
  0 siblings, 1 reply; 22+ messages in thread
From: Chen, Yu C @ 2016-02-20 18:03 UTC (permalink / raw)
  To: Arto Jantunen, Doug Smythies
  Cc: 'Rafael J. Wysocki', 'Viresh Kumar',
	linux-pm, 'Srinivas Pandruvada'

> -----Original Message-----
> From: linux-pm-owner@vger.kernel.org [mailto:linux-pm-
> owner@vger.kernel.org] On Behalf Of Arto Jantunen
> Sent: Sunday, February 21, 2016 1:11 AM
> To: Doug Smythies
> Cc: 'Rafael J. Wysocki'; 'Viresh Kumar'; linux-pm@vger.kernel.org; 'Srinivas
> Pandruvada'
> Subject: Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on
> 4.5-rc4
> 
> "Doug Smythies" <dsmythies@telus.net> writes:
> 
> > On 2106.02.20 00:50 Arto Jantunen wrote:
> >> I can force the frequency down manually with cpufreq-set, so the
> >> problem doesn't seem to be with the actual frequency changing.

cpufreq-set modifies the value of scaling_max_freq. It looks like your
system always demands the max freq. Can you provide the output of:
grep . /sys/devices/system/cpu/intel_pstate/*perf_pct
under both powersave and performance?
I also agree with Doug: you can file a report at Bugzilla.
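
The request above can be sketched as a pair of helper functions. This is
only an illustration: the function names and the sysfs-root argument are
made up here, and writing scaling_governor on a real system needs root.

```shell
# Print every *perf_pct limit exposed by intel_pstate as "name:value".
# $1: sysfs root (default /sys; overridable for testing)
pstate_limits() {
    pl_dir="${1:-/sys}/devices/system/cpu/intel_pstate"
    for pl_file in "$pl_dir"/*perf_pct; do
        [ -r "$pl_file" ] || continue
        printf '%s:%s\n' "$(basename "$pl_file")" "$(cat "$pl_file")"
    done
}

# Select a governor on every CPU policy, then print the resulting limits.
# $1: governor name (performance or powersave for intel_pstate)
# $2: sysfs root (default /sys)
limits_under_governor() {
    lg_root=${2:-/sys}
    for lg_gov in "$lg_root"/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip policies that are missing or not writable.
        [ -w "$lg_gov" ] && echo "$1" > "$lg_gov"
    done
    echo "$1:"
    pstate_limits "$lg_root"
}
```

Calling limits_under_governor once with powersave and once with
performance gives the two listings asked for.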



* Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-02-20 18:03     ` Chen, Yu C
@ 2016-02-21  8:45       ` Arto Jantunen
  2016-02-21  8:52         ` Chen, Yu C
  0 siblings, 1 reply; 22+ messages in thread
From: Arto Jantunen @ 2016-02-21  8:45 UTC (permalink / raw)
  To: Chen, Yu C
  Cc: Doug Smythies, 'Rafael J. Wysocki',
	'Viresh Kumar', linux-pm, 'Srinivas Pandruvada'

"Chen, Yu C" <yu.c.chen@intel.com> writes:

>> -----Original Message-----
>> From: linux-pm-owner@vger.kernel.org [mailto:linux-pm-
>> owner@vger.kernel.org] On Behalf Of Arto Jantunen
>> Sent: Sunday, February 21, 2016 1:11 AM
>> To: Doug Smythies
>> Cc: 'Rafael J. Wysocki'; 'Viresh Kumar'; linux-pm@vger.kernel.org; 'Srinivas
>> Pandruvada'
>> Subject: Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on
>> 4.5-rc4
>> 
>> "Doug Smythies" <dsmythies@telus.net> writes:
>> 
>> > On 2106.02.20 00:50 Arto Jantunen wrote:
>> >> I can force the frequency down manually with cpufreq-set, so the
>> >> problem doesn't seem to be with the actual frequency changing.
>  cpufreq-set modifies the value of scaling_max_freq, it looks like
> your system always demands for the max freq, can you provide:
> grep . /sys/devices/system/cpu/intel_pstate/*perf_pct
> under powersave and performance?

Powersave:
/sys/devices/system/cpu/intel_pstate/max_perf_pct:100
/sys/devices/system/cpu/intel_pstate/min_perf_pct:22

Performance:
/sys/devices/system/cpu/intel_pstate/max_perf_pct:100
/sys/devices/system/cpu/intel_pstate/min_perf_pct:100

Also, adding intel_pstate=no_hwp to the command line does not change the
result. Changing that to intel_pstate=disable does fix the problem, so
the bug seems to be somewhere in intel_pstate instead of cpufreq core.

-- 
Arto Jantunen


* RE: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-02-21  8:45       ` Arto Jantunen
@ 2016-02-21  8:52         ` Chen, Yu C
  2016-02-21 20:02           ` Srinivas Pandruvada
  0 siblings, 1 reply; 22+ messages in thread
From: Chen, Yu C @ 2016-02-21  8:52 UTC (permalink / raw)
  To: Arto Jantunen
  Cc: Doug Smythies, 'Rafael J. Wysocki',
	'Viresh Kumar', linux-pm, 'Srinivas Pandruvada'


> -----Original Message-----
> From: Arto Jantunen [mailto:viiru@iki.fi]
> Sent: Sunday, February 21, 2016 4:45 PM
> To: Chen, Yu C
> Cc: Doug Smythies; 'Rafael J. Wysocki'; 'Viresh Kumar'; linux-
> pm@vger.kernel.org; 'Srinivas Pandruvada'
> Subject: Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on
> 4.5-rc4
> 
> "Chen, Yu C" <yu.c.chen@intel.com> writes:
> 
> >> -----Original Message-----
> >> From: linux-pm-owner@vger.kernel.org [mailto:linux-pm-
> >> owner@vger.kernel.org] On Behalf Of Arto Jantunen
> >> Sent: Sunday, February 21, 2016 1:11 AM
> >> To: Doug Smythies
> >> Cc: 'Rafael J. Wysocki'; 'Viresh Kumar'; linux-pm@vger.kernel.org;
> >> 'Srinivas Pandruvada'
> >> Subject: Re: PROBLEM: Cpufreq constantly keeps frequency at maximum
> >> on
> >> 4.5-rc4
> >>
> >> "Doug Smythies" <dsmythies@telus.net> writes:
> >>
> >> > On 2106.02.20 00:50 Arto Jantunen wrote:
> >> >> I can force the frequency down manually with cpufreq-set, so the
> >> >> problem doesn't seem to be with the actual frequency changing.
> >  cpufreq-set modifies the value of scaling_max_freq, it looks like
> > your system always demands for the max freq, can you provide:
> > grep . /sys/devices/system/cpu/intel_pstate/*perf_pct
> > under powersave and performance?
> 
> Powersave:
> /sys/devices/system/cpu/intel_pstate/max_perf_pct:100
> /sys/devices/system/cpu/intel_pstate/min_perf_pct:22
> 
> Performance:
> /sys/devices/system/cpu/intel_pstate/max_perf_pct:100
> /sys/devices/system/cpu/intel_pstate/min_perf_pct:100
> 
> Also, adding intel_pstate=no_hwp to the command line does not change the
> result. Changing that to intel_pstate=disable does fix the problem, so the
> bug seems to be somewhere in intel_pstate instead of cpufreq core.
> 
1. It would be nice if git bisect were used to find the commit causing
this problem.

2. Please also capture a trace:
# cd /sys/kernel/debug/tracing/
# echo 1 > events/power/pstate_sample/enable
# echo 1 > events/power/cpu_frequency/enable
# cat trace
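
The tracing commands above leave both events enabled. A hedged sketch of
a wrapper that records for a few seconds and then switches them off
again; the function name, the default duration, and the directory
argument are assumptions for illustration:

```shell
# Record intel_pstate samples and frequency changes for a short period.
# $1: seconds to record (default 5)
# $2: tracing directory (default /sys/kernel/debug/tracing)
# Needs root on a real system.
capture_pstate_trace() {
    ct_secs=${1:-5}
    ct_dir=${2:-/sys/kernel/debug/tracing}
    for ct_ev in pstate_sample cpu_frequency; do
        echo 1 > "$ct_dir/events/power/$ct_ev/enable" || return 1
    done
    sleep "$ct_secs"
    cat "$ct_dir/trace"
    # Disable the events again so the ring buffer stops filling.
    for ct_ev in pstate_sample cpu_frequency; do
        echo 0 > "$ct_dir/events/power/$ct_ev/enable"
    done
}
```

For example, "capture_pstate_trace 10 > pstate-trace.txt" would produce a
file that can be attached to a reply or a bug report.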


* Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-02-21  8:52         ` Chen, Yu C
@ 2016-02-21 20:02           ` Srinivas Pandruvada
  2016-02-21 20:33             ` Arto Jantunen
  0 siblings, 1 reply; 22+ messages in thread
From: Srinivas Pandruvada @ 2016-02-21 20:02 UTC (permalink / raw)
  To: Chen, Yu C, Arto Jantunen
  Cc: Doug Smythies, 'Rafael J. Wysocki',
	'Viresh Kumar',
	linux-pm



On 02/21/2016 12:52 AM, Chen, Yu C wrote:
>> -----Original Message-----
>> From: Arto Jantunen [mailto:viiru@iki.fi]
>> Sent: Sunday, February 21, 2016 4:45 PM
>> To: Chen, Yu C
>> Cc: Doug Smythies; 'Rafael J. Wysocki'; 'Viresh Kumar'; linux-
>> pm@vger.kernel.org; 'Srinivas Pandruvada'
>> Subject: Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on
>> 4.5-rc4
>>
>> "Chen, Yu C" <yu.c.chen@intel.com> writes:
>>
>>>> -----Original Message-----
>>>> From: linux-pm-owner@vger.kernel.org [mailto:linux-pm-
>>>> owner@vger.kernel.org] On Behalf Of Arto Jantunen
>>>> Sent: Sunday, February 21, 2016 1:11 AM
>>>> To: Doug Smythies
>>>> Cc: 'Rafael J. Wysocki'; 'Viresh Kumar'; linux-pm@vger.kernel.org;
>>>> 'Srinivas Pandruvada'
>>>> Subject: Re: PROBLEM: Cpufreq constantly keeps frequency at maximum
>>>> on
>>>> 4.5-rc4
>>>>
>>>> "Doug Smythies" <dsmythies@telus.net> writes:
>>>>
>>>>> On 2106.02.20 00:50 Arto Jantunen wrote:
>>>>>> I can force the frequency down manually with cpufreq-set, so the
>>>>>> problem doesn't seem to be with the actual frequency changing.
>>>   cpufreq-set modifies the value of scaling_max_freq, it looks like
>>> your system always demands for the max freq, can you provide:
>>> grep . /sys/devices/system/cpu/intel_pstate/*perf_pct
>>> under powersave and performance?
>> Powersave:
>> /sys/devices/system/cpu/intel_pstate/max_perf_pct:100
>> /sys/devices/system/cpu/intel_pstate/min_perf_pct:22
>>
>> Performance:
>> /sys/devices/system/cpu/intel_pstate/max_perf_pct:100
>> /sys/devices/system/cpu/intel_pstate/min_perf_pct:100
>>
>> Also, adding intel_pstate=no_hwp to the command line does not change the
>> result. Changing that to intel_pstate=disable does fix the problem, so the
>> bug seems to be somewhere in intel_pstate instead of cpufreq core.

You have Skylake, which is not compatible with legacy ACPI P-states with
_PSS tables, so running without intel_pstate is not much use. With the
driver disabled it may simply be running at a low P-state.

Does Debian use performance as the default mode? I think Ubuntu uses
performance mode as the default.
What is the output of
cat cpu*/cpufreq/scaling_governor

Thanks,
Srinivas
> 1.It would be nice if a git bisect is used to find the commit causing this problem.
>
> 2.
> # cd /sys/kernel/debug/tracing/
> # echo 1 > events/power/pstate_sample/enable
> # echo 1 > events/power/cpu_frequency/enable
> # cat trace
>



* Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-02-21 20:02           ` Srinivas Pandruvada
@ 2016-02-21 20:33             ` Arto Jantunen
  2016-02-22  6:16               ` Viresh Kumar
  0 siblings, 1 reply; 22+ messages in thread
From: Arto Jantunen @ 2016-02-21 20:33 UTC (permalink / raw)
  To: Srinivas Pandruvada
  Cc: Chen, Yu C, Doug Smythies, 'Rafael J. Wysocki',
	'Viresh Kumar',
	linux-pm

Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> writes:
>>> Powersave:
>>> /sys/devices/system/cpu/intel_pstate/max_perf_pct:100
>>> /sys/devices/system/cpu/intel_pstate/min_perf_pct:22
>>>
>>> Performance:
>>> /sys/devices/system/cpu/intel_pstate/max_perf_pct:100
>>> /sys/devices/system/cpu/intel_pstate/min_perf_pct:100
>>>
>>> Also, adding intel_pstate=no_hwp to the command line does not change the
>>> result. Changing that to intel_pstate=disable does fix the problem, so the
>>> bug seems to be somewhere in intel_pstate instead of cpufreq core.
> You have Skylake, which is not compatible with legacy ACPI P states with _PSS
> tables, so not running without intel_pstate is not much use. It may be running
> at at low P-state by disabling.
>
> Is Debian use default mode as performance? I think Ubuntu uses performance
> mode as default.
> What is the output of
> cat cpu*/cpufreq/scaling_governor

I have tested both available governors, and see the same behavior either
way. The kernel I have defaults to performance; I think I'll try
building another one which defaults to powersave to see if that changes
anything (perhaps both governors actually work but it isn't possible to
switch between them at runtime?). The Debian userspace defaults to
ondemand, which doesn't exist for intel_pstate.

-- 
Arto Jantunen


* Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-02-21 20:33             ` Arto Jantunen
@ 2016-02-22  6:16               ` Viresh Kumar
  2016-02-22 16:39                 ` Arto Jantunen
  0 siblings, 1 reply; 22+ messages in thread
From: Viresh Kumar @ 2016-02-22  6:16 UTC (permalink / raw)
  To: Arto Jantunen
  Cc: Srinivas Pandruvada, Chen, Yu C, Doug Smythies,
	'Rafael J. Wysocki',
	linux-pm

On 21-02-16, 22:33, Arto Jantunen wrote:
> Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> writes:
> >>> Powersave:
> >>> /sys/devices/system/cpu/intel_pstate/max_perf_pct:100
> >>> /sys/devices/system/cpu/intel_pstate/min_perf_pct:22
> >>>
> >>> Performance:
> >>> /sys/devices/system/cpu/intel_pstate/max_perf_pct:100
> >>> /sys/devices/system/cpu/intel_pstate/min_perf_pct:100
> >>>
> >>> Also, adding intel_pstate=no_hwp to the command line does not change the
> >>> result. Changing that to intel_pstate=disable does fix the problem, so the
> >>> bug seems to be somewhere in intel_pstate instead of cpufreq core.
> > You have Skylake, which is not compatible with legacy ACPI P states with _PSS
> > tables, so not running without intel_pstate is not much use. It may be running
> > at at low P-state by disabling.
> >
> > Is Debian use default mode as performance? I think Ubuntu uses performance
> > mode as default.
> > What is the output of
> > cat cpu*/cpufreq/scaling_governor
> 
> I have tested both available governors, and see the same behavior either
> way. The kernel I have defaults to performance, I think I'll try
> building another one which defaults to powersave to see if that changes
> anything (perhaps both governors actually work but it isn't possible to
> switch between them at runtime?). The Debian userspace defaults to
> ondemand, which doesn't exist for intel_pstate.

I took a close look at the git log between 4.4 and 4.5-rc1 for
intel_pstate and it had only three patches:

157386b6fc14 cpufreq: intel_pstate: Configurable algorithm to get target pstate
e70eed2b6454 cpufreq: intel_pstate: Account for non C0 time
63d1d656a523 cpufreq: intel_pstate: Account for IO wait time

The first one selects special routines based on the CPU model you have.
Yours is 94, i.e. 0x5e, which means core_params is used in your case, and so
you will be using get_target_pstate_use_performance() for .get_target_pstate().
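
The decimal-to-hex step can be checked directly from /proc/cpuinfo. A
small sketch; the function name and the file argument are illustrative:

```shell
# Print the CPU model number from cpuinfo in hex (e.g. 94 becomes 0x5e).
# $1: cpuinfo file (default /proc/cpuinfo; overridable for testing)
cpu_model_hex() {
    # Match the "model" line only, not "model name".
    cm_model=$(awk -F': *' '/^model[[:space:]]*:/ { print $2; exit }' "${1:-/proc/cpuinfo}")
    [ -n "$cm_model" ] || return 1
    printf '0x%x\n' "$cm_model"
}
```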

The two later patches don't make any changes to the working of core_params()
and so shouldn't have changed anything for Skylake.

Anyway, please try reverting the above three patches to see if there is a bug
somewhere there. So you need to do:

git revert 63d1d656a523
git revert e70eed2b6454
git revert 157386b6fc14

-- 
viresh


* Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-02-22  6:16               ` Viresh Kumar
@ 2016-02-22 16:39                 ` Arto Jantunen
  2016-02-22 16:41                   ` Viresh Kumar
  0 siblings, 1 reply; 22+ messages in thread
From: Arto Jantunen @ 2016-02-22 16:39 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Srinivas Pandruvada, Chen, Yu C, Doug Smythies,
	'Rafael J. Wysocki',
	linux-pm

Viresh Kumar <viresh.kumar@linaro.org> writes:

> On 21-02-16, 22:33, Arto Jantunen wrote:
>> I have tested both available governors, and see the same behavior either
>> way. The kernel I have defaults to performance, I think I'll try
>> building another one which defaults to powersave to see if that changes
>> anything (perhaps both governors actually work but it isn't possible to
>> switch between them at runtime?). The Debian userspace defaults to
>> ondemand, which doesn't exist for intel_pstate.
>
> I took a close look at git log between 4.4 and 4.5-rc1 for intel-pstate and it
> had only three patches:
>
> 157386b6fc14 cpufreq: intel_pstate: Configurable algorithm to get target pstate
> e70eed2b6454 cpufreq: intel_pstate: Account for non C0 time
> 63d1d656a523 cpufreq: intel_pstate: Account for IO wait time
>
> The first one creates special routines based on the CPU model you have, yours is
> 94, i.e. 5e, which means we are going to use: core_params in your case. And so
> you will be using get_target_pstate_use_performance() for .get_target_pstate().
>
> The two later patches doesn't make any changes to the working of core_params()
> and so shouldn't have changed anything for skylake.
>
> Anyway, Please trying reverting the above three patches to see if there is a bug
> somewhere there. So you need to do:
>
> git revert 63d1d656a523
> git revert e70eed2b6454
> git revert 157386b6fc14

Thanks. I tried this, and somewhat surprisingly it doesn't change the
result. I guess we are back to doing a full bisect?

-- 
Arto Jantunen


* Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-02-22 16:39                 ` Arto Jantunen
@ 2016-02-22 16:41                   ` Viresh Kumar
  2016-02-22 16:48                     ` Viresh Kumar
  2016-02-28 15:43                     ` Arto Jantunen
  0 siblings, 2 replies; 22+ messages in thread
From: Viresh Kumar @ 2016-02-22 16:41 UTC (permalink / raw)
  To: Arto Jantunen
  Cc: Srinivas Pandruvada, Chen, Yu C, Doug Smythies,
	'Rafael J. Wysocki',
	linux-pm

On 22-02-16, 18:39, Arto Jantunen wrote:
> Viresh Kumar <viresh.kumar@linaro.org> writes:
> 
> > On 21-02-16, 22:33, Arto Jantunen wrote:
> >> I have tested both available governors, and see the same behavior either
> >> way. The kernel I have defaults to performance, I think I'll try
> >> building another one which defaults to powersave to see if that changes
> >> anything (perhaps both governors actually work but it isn't possible to
> >> switch between them at runtime?). The Debian userspace defaults to
> >> ondemand, which doesn't exist for intel_pstate.
> >
> > I took a close look at git log between 4.4 and 4.5-rc1 for intel-pstate and it
> > had only three patches:
> >
> > 157386b6fc14 cpufreq: intel_pstate: Configurable algorithm to get target pstate
> > e70eed2b6454 cpufreq: intel_pstate: Account for non C0 time
> > 63d1d656a523 cpufreq: intel_pstate: Account for IO wait time
> >
> > The first one creates special routines based on the CPU model you have, yours is
> > 94, i.e. 5e, which means we are going to use: core_params in your case. And so
> > you will be using get_target_pstate_use_performance() for .get_target_pstate().
> >
> > The two later patches doesn't make any changes to the working of core_params()
> > and so shouldn't have changed anything for skylake.
> >
> > Anyway, Please trying reverting the above three patches to see if there is a bug
> > somewhere there. So you need to do:
> >
> > git revert 63d1d656a523
> > git revert e70eed2b6454
> > git revert 157386b6fc14
> 
> Thanks. I tried this, and somewhat surprisingly it doesn't change the
> result. I guess we are back to doing a full bisect?

Good. That was kind of what I expected, so no surprise :)

I think bisect wouldn't be that difficult, please try :)
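
To get a feel for the effort, the number of build-and-boot rounds grows
only with the logarithm of the commit range. A rough sketch; the helper
name is made up, and the tag names in the comments assume an upstream
kernel tree:

```shell
# Approximate number of rounds a bisect needs for N candidate commits.
# Each round halves the range, so this is ceil(log2(N)).
bisect_steps() {
    bs_n=$1
    bs_steps=0
    while [ "$bs_n" -gt 1 ]; do
        bs_n=$(( (bs_n + 1) / 2 ))
        bs_steps=$((bs_steps + 1))
    done
    echo "$bs_steps"
}

# In a kernel git tree (not executed here):
#   bisect_steps "$(git rev-list --count v4.4..v4.5-rc4)"
#   git bisect start v4.5-rc4 v4.4    # bad revision first, then good
#   ...build, boot, check the frequency, then run one of:
#   git bisect good
#   git bisect bad
```

As a scale check, a range of 10000 commits comes out at 14 rounds.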

-- 
viresh


* Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-02-22 16:41                   ` Viresh Kumar
@ 2016-02-22 16:48                     ` Viresh Kumar
  2016-02-22 19:25                       ` Srinivas Pandruvada
  2016-02-28 15:43                     ` Arto Jantunen
  1 sibling, 1 reply; 22+ messages in thread
From: Viresh Kumar @ 2016-02-22 16:48 UTC (permalink / raw)
  To: Arto Jantunen
  Cc: Srinivas Pandruvada, Chen, Yu C, Doug Smythies,
	'Rafael J. Wysocki',
	linux-pm

On 22-02-16, 22:11, Viresh Kumar wrote:
> > Thanks. I tried this, and somewhat surprisingly it doesn't change the
> > result. I guess we are back to doing a full bisect?
> 
> Good. That was kind of what I expected, so no surprise :)
> 
> I think bisect wouldn't be that difficult, please try :)

Okay, for the record, I looked at all the patches that went into
drivers/cpufreq/ during the 4.5-rcs, and there is absolutely nothing
in there that could have caused this. I suspect some x86 core changes
have broken this stuff.

-- 
viresh


* Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-02-22 16:48                     ` Viresh Kumar
@ 2016-02-22 19:25                       ` Srinivas Pandruvada
  0 siblings, 0 replies; 22+ messages in thread
From: Srinivas Pandruvada @ 2016-02-22 19:25 UTC (permalink / raw)
  To: Viresh Kumar, Arto Jantunen
  Cc: Chen, Yu C, Doug Smythies, 'Rafael J. Wysocki', linux-pm



On 02/22/2016 08:48 AM, Viresh Kumar wrote:
> On 22-02-16, 22:11, Viresh Kumar wrote:
>>> Thanks. I tried this, and somewhat surprisingly it doesn't change the
>>> result. I guess we are back to doing a full bisect?
>> Good. That was kind of what I expected, so no surprise :)
>>
>> I think bisect wouldn't be that difficult, please try :)
> Okay, for record, I looked at all the patches that went into 4.5-rcs
> into drivers/cpufreq/ directory and there is absolutely nothing in
> there, that could have caused this. I suspect some x86 core changes
> have broken this stuff.
Correct. There is no change submitted in this release cycle that would
impact the cpufreq drivers.

Arto,

I want to understand your issue. I just gave it a try by setting the
default governor to ondemand and to performance and changing them on the
fly. I didn't see any issue. So try the steps below. If it doesn't work
for you, then I have some dumps to send at the end.

Verify that you are using HWP
[linux]$ dmesg | grep -i HWP
[    2.136799] intel_pstate: HWP enabled
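
The dmesg line shows that the driver enabled HWP. Whether the hardware
advertises HWP at all can be read from the CPU flags. A minimal sketch;
the function name and file argument are illustrative:

```shell
# Succeed if the CPU flags line advertises HWP.
# $1: cpuinfo file (default /proc/cpuinfo; overridable for testing)
cpu_has_hwp() {
    # -w matches "hwp" as a whole word, so hwp_notify alone does not count.
    grep -m1 '^flags' "${1:-/proc/cpuinfo}" | grep -qw hwp
}
```

Usage would be along the lines of: cpu_has_hwp && echo "HWP advertised".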


1. Test with default governor as powersave

The default governor is ondemand, which will be "powersave" for the
intel_pstate driver. In the kernel config:
CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y

[linux ~]$ uname -a
Linux spandruv-mobl.jf.intel.com 4.5.0-rc2+ #13 SMP Fri Feb 19 13:58:16 PST 2016 x86_64 x86_64 x86_64 GNU/Linux

[linux ~]$ sudo cpupower frequency-info
analyzing CPU 0:
   driver: intel_pstate
   CPUs which run at the same hardware frequency: 0
   CPUs which need to have their frequency coordinated by software: 0
   maximum transition latency: 0.97 ms.
   hardware limits: 400 MHz - 3.10 GHz
   available cpufreq governors: performance, powersave
   current policy: frequency should be within 400 MHz and 3.10 GHz.
                   The governor "powersave" may decide which speed to use
                   within this range.
   current CPU frequency is 503 MHz (asserted by call to hardware).
   boost state support:
     Supported: yes
     Active: yes

[linux ~]$ sudo turbostat -i 1
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      36    4.59     776    1512
        0      42    5.10     817    1511
        2      22    2.91     766    1512
        1      44    5.74     770    1512
        3      34    4.61     746    1512
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      17    3.06     551    1512
        0      32    5.97     543    1512
        2       7    1.26     546    1512
        1      17    2.94     570    1512
        3      11    2.06     550    1512
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      24    4.45     541    1512
        0      21    3.81     557    1512
        2      25    4.76     525    1512
        1      35    6.35     552    1512
        3      15    2.90     525    1512
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      17    3.25     524    1512
        0      20    3.71     530    1512
        2      14    2.78     510    1512
        1       8    1.56     516    1512
        3      26    4.97     529    1512

Now change the default policy to performance
[linux]$ sudo cpupower frequency-set -g performance
Setting cpu: 0
Setting cpu: 1
Setting cpu: 2
Setting cpu: 3

[linux ~]$ sudo turbostat -i 1
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      33    1.14    2925    1512
        0      19    0.66    2879    1512
        2      83    2.79    2993    1512
        1      13    0.48    2662    1512
        3      18    0.63    2879    1512
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      34    1.17    2942    1512
        0      19    0.64    2914    1512
        2      83    2.77    2996    1512
        1      24    0.84    2812    1512
        3      12    0.42    2888    1512
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      64    2.30    2779    1512
        0      93    3.36    2768    1512
        2      49    1.73    2850    1512
        1      29    1.16    2497    1512
        3      85    2.96    2860    1512
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      31    1.08    2828    1512
        0      15    0.57    2582    1512
        2      18    0.67    2645    1512
        1      31    1.12    2792    1512
        3      59    1.97    2982    1512

As expected, busy MHz changed to the possible max.

[linux]$ sudo cpupower frequency-set -g powersave
Setting cpu: 0
Setting cpu: 1
Setting cpu: 2
Setting cpu: 3

[linux ~]$ sudo turbostat -i 1
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      22    3.31     672    1512
        0      19    2.40     809    1511
        2      19    2.18     863    1512
        1      41    7.10     584    1512
        3       9    1.57     596    1512
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      57    7.59     749    1512
        0      59    6.78     876    1512
        2      54    6.96     770    1512
        1      71   10.18     700    1512
        3      43    6.43     672    1512
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      78   10.50     739    1512
        0     115   14.19     811    1512
        2      86   11.08     774    1512
        1      53    6.98     758    1512
        3      57    9.75     582    1512
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      51    7.14     720    1512
        0      35    4.95     710    1512
        2      47    5.88     798    1512
        1      78    9.32     832    1512
        3      46    8.40     546    1512

Busy MHz is no longer max possible.

2.
Now set the default governor in kernel config as "performance" and rebuild.
CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y

[linux ~]$ sudo turbostat -i 1
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -     323   12.00    2689    1512
        0     293   11.25    2607    1512
        2     220    8.58    2571    1512
        1     176    6.87    2565    1512
        3     601   21.30    2821    1512
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      37    1.26    2913    1512
        0      74    2.47    2988    1512
        2      35    1.18    2920    1512
        1      21    0.79    2683    1512
        3      18    0.61    2891    1512
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      30    1.04    2879    1512
        0      64    2.17    2970    1512
        2      16    0.56    2901    1512
        1      15    0.57    2620    1512
        3      25    0.88    2810    1512
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      48    1.65    2887    1512
        0      36    1.29    2814    1512
        2      77    2.59    2957    1512
        1      23    0.87    2661    1512
        3      55    1.85    2943    1512

Now change to powersave
[linux ~]$ sudo cpupower frequency-set -g powersave
Setting cpu: 0
Setting cpu: 1
Setting cpu: 2
Setting cpu: 3
[linux ~]$ sudo turbostat -i 1
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      18    3.22     565    1512
        0      16    2.51     647    1511
        2       8    1.45     551    1512
        1      33    5.94     551    1512
        3      16    2.98     532    1512
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      20    3.57     548    1512
        0      14    2.06     685    1512
        2      13    2.38     527    1512
        1      16    2.85     559    1512
        3      36    6.99     511    1512
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      20    3.62     542    1512
        0      17    2.87     597    1512
        2      13    2.18     585    1512
        1      36    7.02     509    1512
        3      13    2.41     536    1512

Change back to performance
[linux ~]$ sudo cpupower frequency-set -g performance
Setting cpu: 0
Setting cpu: 1
Setting cpu: 2
Setting cpu: 3
[linux ~]$ sudo turbostat -i 1
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      33    1.14    2898    1512
        0      78    2.61    2998    1513
        2      24    0.86    2815    1512
        1      16    0.61    2618    1512
        3      14    0.49    2867    1512
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      33    1.11    2938    1512
        0      79    2.62    3019    1512
        2      19    0.68    2832    1512
        1      21    0.77    2740    1512
        3      11    0.38    2973    1512
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      35    1.23    2888    1512
        0      23    0.84    2722    1512
        2      21    0.72    2898    1512
        1      82    2.78    2938    1512
        3      16    0.56    2870    1512
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      63    2.24    2820    1512
        0      67    2.38    2800    1512
        2      73    2.56    2864    1512
        1      87    3.07    2842    1512
        3      25    0.93    2676    1512
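
To compare runs like the ones above without reading every row, the package
summary rows (CPU column "-") can be averaged. This is only a sketch: the
helper name avg_bzy_mhz is hypothetical, and it assumes the five-column
turbostat layout shown in this mail.

```shell
#!/bin/sh
# avg_bzy_mhz: read turbostat output on stdin and print the mean Bzy_MHz
# of the package summary rows (the rows whose CPU column is "-").
# Assumes the column order: CPU Avg_MHz %Busy Bzy_MHz TSC_MHz
avg_bzy_mhz() {
	awk '$1 == "-" { sum += $4; n++ }
	     END { if (n) printf "%.0f\n", sum / n }'
}

# The four summary rows from the last "performance" run above.
avg_bzy_mhz <<'EOF'
      CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
        -      33    1.14    2898    1512
        -      33    1.11    2938    1512
        -      35    1.23    2888    1512
        -      63    2.24    2820    1512
EOF
```

For those four rows it prints 2886, against roughly 550 for the powersave
run.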


If you don't get the above results, then send the output of the following
for each mode, "powersave" and "performance".

rdmsr 0xCE
rdmsr 0x1AD
rdmsr 0x770
rdmsr 0x771
rdmsr 0x772
rdmsr 0x773
rdmsr 0x774
rdmsr 0x777
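
The reads above could be collected in one go with a small loop. A minimal
sketch, assuming rdmsr comes from msr-tools, the msr module is loaded, and
the script runs as root; the function name dump_pstate_msrs and the RDMSR
override are hypothetical conveniences, not part of any tool.

```shell
#!/bin/sh
# dump_pstate_msrs [CPU]: print each MSR listed above for one CPU
# as "0xADDR: value". CPU defaults to 0.
# RDMSR names the reader to invoke; it defaults to rdmsr from msr-tools
# (an assumption about the setup, as is running with root privileges).
dump_pstate_msrs() {
	cpu=${1:-0}
	for reg in 0xCE 0x1AD 0x770 0x771 0x772 0x773 0x774 0x777; do
		printf '%s: ' "$reg"
		# -p selects the CPU whose MSR is read; some of these
		# registers may not be readable on every machine.
		"${RDMSR:-rdmsr}" -p "$cpu" "$reg" || echo "read failed"
	done
}
```

Calling dump_pstate_msrs once after selecting each governor gives the two
sets of values asked for.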

Thanks,
Srinivas












^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-02-22 16:41                   ` Viresh Kumar
  2016-02-22 16:48                     ` Viresh Kumar
@ 2016-02-28 15:43                     ` Arto Jantunen
  2016-02-29  6:22                       ` Doug Smythies
                                         ` (2 more replies)
  1 sibling, 3 replies; 22+ messages in thread
From: Arto Jantunen @ 2016-02-28 15:43 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Srinivas Pandruvada, Chen, Yu C, Doug Smythies,
	'Rafael J. Wysocki',
	linux-pm, Rik van Riel

Viresh Kumar <viresh.kumar@linaro.org> writes:

> On 22-02-16, 18:39, Arto Jantunen wrote:
>> Viresh Kumar <viresh.kumar@linaro.org> writes:
>> 
>> > On 21-02-16, 22:33, Arto Jantunen wrote:
>> >> I have tested both available governors, and see the same behavior either
>> >> way. The kernel I have defaults to performance, I think I'll try
>> >> building another one which defaults to powersave to see if that changes
>> >> anything (perhaps both governors actually work but it isn't possible to
>> >> switch between them at runtime?). The Debian userspace defaults to
>> >> ondemand, which doesn't exist for intel_pstate.
>> >
>> > I took a close look at git log between 4.4 and 4.5-rc1 for intel-pstate and it
>> > had only three patches:
>> >
>> > 157386b6fc14 cpufreq: intel_pstate: Configurable algorithm to get target pstate
>> > e70eed2b6454 cpufreq: intel_pstate: Account for non C0 time
>> > 63d1d656a523 cpufreq: intel_pstate: Account for IO wait time
>> >
>> > The first one creates special routines based on the CPU model you have, yours is
>> > 94, i.e. 5e, which means we are going to use: core_params in your case. And so
>> > you will be using get_target_pstate_use_performance() for .get_target_pstate().
>> >
>> > The two later patches don't make any changes to the working of core_params()
>> > and so shouldn't have changed anything for Skylake.
>> >
>> > Anyway, please try reverting the above three patches to see if there is a bug
>> > somewhere there. So you need to do:
>> >
>> > git revert 63d1d656a523
>> > git revert e70eed2b6454
>> > git revert 157386b6fc14
>> 
>> Thanks. I tried this, and somewhat surprisingly it doesn't change the
>> result. I guess we are back to doing a full bisect?
>
> Good. That was kind of what I expected, so no surprise :)
>
> I think bisect wouldn't be that difficult, please try :)

Bisect comes up with this commit:

commit a9ceb78bc75ca47972096372ff3d48648b16317a
Author: Rik van Riel <riel@redhat.com>
Date:   Tue Nov 3 17:34:18 2015 -0500

    cpuidle,menu: use interactivity_req to disable polling
    
    The menu governor carefully figures out how much time we typically
    sleep for an estimated sleep interval, or whether there is a repeating
    pattern going on, and corrects that estimate for the CPU load.
    
    Then it proceeds to ignore that information when determining whether
    or not to consider polling. This is not a big deal on most x86 CPUs,
    which have very low C1 latencies, and the patch should not have any
    effect on those CPUs.
    
    However, certain CPUs (eg. Atom) have much higher C1 latencies, and
    it would be good to not waste performance and power on those CPUs if
    we are expecting a very low wakeup latency.
    
    Disable polling based on the estimated interactivity requirement, not
    on the time to the next timer interrupt.
    
    Signed-off-by: Rik van Riel <riel@redhat.com>
    Acked-by: Arjan van de Ven <arjan@linux.intel.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

I verified the result by reverting
9c4b2867ed7c8c8784dd417ffd16e705e81eb145 and
a9ceb78bc75ca47972096372ff3d48648b16317a from 4.5-rc5, the resulting
kernel does not have the bug.

Since this is about cpuidle, I'll also mention that this hardware
requires idle=nomwait on the command line, otherwise the kernel will not
boot.

-- 
Arto Jantunen

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-02-28 15:43                     ` Arto Jantunen
@ 2016-02-29  6:22                       ` Doug Smythies
  2016-03-01 19:28                         ` Doug Smythies
  2016-02-29 16:49                       ` Srinivas Pandruvada
       [not found]                       ` <20160229201946.0bdcc48e@annuminas.surriel.com>
  2 siblings, 1 reply; 22+ messages in thread
From: Doug Smythies @ 2016-02-29  6:22 UTC (permalink / raw)
  To: 'Arto Jantunen', 'Rafael J. Wysocki'
  Cc: 'Srinivas Pandruvada', 'Chen, Yu C',
	linux-pm, 'Rik van Riel', 'Viresh Kumar'

On 2016.02.28 07:44 Arto Jantunen wrote:
> Viresh Kumar <viresh.kumar@linaro.org> writes:
>> On 22-02-16, 18:39, Arto Jantunen wrote:
>>> Viresh Kumar <viresh.kumar@linaro.org> writes: 
>>>> On 21-02-16, 22:33, Arto Jantunen wrote:

> Bisect comes up with this commit:
>
> commit a9ceb78bc75ca47972096372ff3d48648b16317a
>
> I verified the result by reverting
> 9c4b2867ed7c8c8784dd417ffd16e705e81eb145 and
> a9ceb78bc75ca47972096372ff3d48648b16317a from 4.5-rc5, the resulting
> kernel does not have the bug.

Interesting.

I also reverted those two commits, and they made a huge
difference to something else I have been working on, heretofore
not thought to be related.

Recall:

A couple of weeks ago, I was trying Rafael's 3 patch set with the
intel_pstate driver:
"[PATCH 0/3] cpufreq: Replace timers with utilization update callbacks"
And while it solved the long standing issue of potentially incorrectly
driving down the target pstate due to the CPU being idle on jiffy
boundaries but otherwise busy, there were other scenarios (previously
masked by the dominance of the jiffy boundary method).

Example:

CPU 5
Core_busy: 92
Scaled busy: 0
Old target pstate: 18
New target pstate: 16 (minimum for the processor)
mperf: 7181473469
aperf: 6675884860
tsc: 7193569488
freq: 3160539 kHz
load: 99.83
duration: 2108.79 mSec

Assertion: At such a very high CPU load, there should have been
many passes through the intel_pstate driver in that 2.1 second
interval, all driving up the target pstate towards the maximum
of 38 for the processor involved. Instead the target pstate is
driven down, due to the long duration.
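
The load and busy figures in the sample can be rechecked from its raw
counters. This is a sketch under two assumed relations, which are not
spelled out in this mail but do match its numbers: load = mperf / tsc and
core_busy = aperf / mperf, both as percentages. The helper name
sample_ratios is hypothetical.

```shell
#!/bin/sh
# sample_ratios MPERF APERF TSC: recompute two fields of a sample.
# Assumed relations (they reproduce the numbers in the example above):
#   load      = mperf / tsc   * 100   (share of time spent in C0)
#   core_busy = aperf / mperf * 100   (delivered vs. reference frequency)
sample_ratios() {
	awk -v mperf="$1" -v aperf="$2" -v tsc="$3" 'BEGIN {
		printf "load: %.2f\n", 100 * mperf / tsc
		printf "core_busy: %.2f\n", 100 * aperf / mperf
	}'
}

# Counters from the CPU 5 example above.
sample_ratios 7181473469 6675884860 7193569488
```

This prints a load of 99.83 and a core_busy of 92.96; the sample shows the
latter as 92, presumably as an integer part.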

So what does this have to do with these commits?

Reverting the commits dramatically reduces, but does not eliminate,
the frequency of the high CPU load long durations situation.

Test results: Each test was 33 minutes, involving 9 incremental
kernel compiles.

Why an incremental kernel compile? Because it just so happens
to demonstrate the issue well.

Kernel 1 = 4.5-rc5 + rjw v10 3 patch set. Called "rjwv10".
Kernel 2 = kernel 1 + above 2 commits reverted. Called "reverted".

Test 1: rjwv10 4056 occurrences
Test 2: reverted 293 occurrences
Test 3: rjwv10 7878 occurrences
Test 4: reverted 259 occurrences
Test 5: rjwv10 3708 occurrences
Test 6: reverted 54 occurrences

Average issue reduction ratio: 26 times better.
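
The figure of 26 appears to be the ratio of the summed occurrence counts
rather than a mean of per-pair ratios. A sketch of that arithmetic, with a
hypothetical helper name:

```shell
#!/bin/sh
# reduction_ratio "BASELINE COUNTS" "REVERTED COUNTS":
# sum each space-separated list and print baseline_total / reverted_total.
reduction_ratio() {
	awk -v a="$1" -v b="$2" 'BEGIN {
		na = split(a, x, " "); nb = split(b, y, " ")
		for (i = 1; i <= na; i++) sa += x[i]
		for (i = 1; i <= nb; i++) sb += y[i]
		printf "%.1f\n", sa / sb
	}'
}

# rjwv10 counts (tests 1, 3, 5) against reverted counts (tests 2, 4, 6).
reduction_ratio "4056 7878 3708" "293 259 54"
```

It prints 25.8 (15642 / 606), which rounds to the 26 quoted above.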

... Doug



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-02-28 15:43                     ` Arto Jantunen
  2016-02-29  6:22                       ` Doug Smythies
@ 2016-02-29 16:49                       ` Srinivas Pandruvada
  2016-03-01  0:37                         ` Rafael J. Wysocki
       [not found]                       ` <20160229201946.0bdcc48e@annuminas.surriel.com>
  2 siblings, 1 reply; 22+ messages in thread
From: Srinivas Pandruvada @ 2016-02-29 16:49 UTC (permalink / raw)
  To: Arto Jantunen, Viresh Kumar, Brown, Len, len.brown
  Cc: Chen, Yu C, Doug Smythies, 'Rafael J. Wysocki',
	linux-pm, Rik van Riel

+Len

On Sun, 2016-02-28 at 17:43 +0200, Arto Jantunen wrote:
> Viresh Kumar <viresh.kumar@linaro.org> writes:
> 
> > On 22-02-16, 18:39, Arto Jantunen wrote:
> > > Viresh Kumar <viresh.kumar@linaro.org> writes:
> > > 
> > > > On 21-02-16, 22:33, Arto Jantunen wrote:
> > > > > I have tested both available governors, and see the same
> > > > > behavior either
> > > > > way. The kernel I have defaults to performance, I think I'll
> > > > > try
> > > > > building another one which defaults to powersave to see if
> > > > > that changes
> > > > > anything (perhaps both governors actually work but it isn't
> > > > > possible to
> > > > > switch between them at runtime?). The Debian userspace
> > > > > defaults to
> > > > > ondemand, which doesn't exist for intel_pstate.
> > > > 
> > > > I took a close look at git log between 4.4 and 4.5-rc1 for
> > > > intel-pstate and it
> > > > had only three patches:
> > > > 
> > > > 157386b6fc14 cpufreq: intel_pstate: Configurable algorithm to
> > > > get target pstate
> > > > e70eed2b6454 cpufreq: intel_pstate: Account for non C0 time
> > > > 63d1d656a523 cpufreq: intel_pstate: Account for IO wait time
> > > > 
> > > > The first one creates special routines based on the CPU model
> > > > you have, yours is
> > > > 94, i.e. 5e, which means we are going to use: core_params in
> > > > your case. And so
> > > > you will be using get_target_pstate_use_performance() for
> > > > .get_target_pstate().
> > > > 
> > > > The two later patches don't make any changes to the working
> > > > of core_params()
> > > > and so shouldn't have changed anything for Skylake.
> > > > 
> > > > Anyway, please try reverting the above three patches to see
> > > > if there is a bug
> > > > somewhere there. So you need to do:
> > > > 
> > > > git revert 63d1d656a523
> > > > git revert e70eed2b6454
> > > > git revert 157386b6fc14
> > > 
> > > Thanks. I tried this, and somewhat surprisingly it doesn't change
> > > the
> > > result. I guess we are back to doing a full bisect?
> > 
> > Good. That was kind of what I expected, so no surprise :)
> > 
> > I think bisect wouldn't be that difficult, please try :)
> 
> Bisect comes up with this commit:
> 
> commit a9ceb78bc75ca47972096372ff3d48648b16317a
> Author: Rik van Riel <riel@redhat.com>
> Date:   Tue Nov 3 17:34:18 2015 -0500
> 
>     cpuidle,menu: use interactivity_req to disable polling
>     
>     The menu governor carefully figures out how much time we
> typically
>     sleep for an estimated sleep interval, or whether there is a
> repeating
>     pattern going on, and corrects that estimate for the CPU load.
>     
>     Then it proceeds to ignore that information when determining
> whether
>     or not to consider polling. This is not a big deal on most x86
> CPUs,
>     which have very low C1 latencies, and the patch should not have
> any
>     effect on those CPUs.
>     
>     However, certain CPUs (eg. Atom) have much higher C1 latencies,
> and
>     it would be good to not waste performance and power on those CPUs
> if
>     we are expecting a very low wakeup latency.
>     
>     Disable polling based on the estimated interactivity requirement,
> not
>     on the time to the next timer interrupt.
>     
>     Signed-off-by: Rik van Riel <riel@redhat.com>
>     Acked-by: Arjan van de Ven <arjan@linux.intel.com>
>     Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> I verified the result by reverting
> 9c4b2867ed7c8c8784dd417ffd16e705e81eb145 and
> a9ceb78bc75ca47972096372ff3d48648b16317a from 4.5-rc5, the resulting
> kernel does not have the bug.
> 
> Since this is about cpuidle, I'll also mention that this hardware
> requires idle=nomwait on the command line, otherwise the kernel will
> not
> boot.
This is a problem.

Thanks,
Srinivas

> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-02-29 16:49                       ` Srinivas Pandruvada
@ 2016-03-01  0:37                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2016-03-01  0:37 UTC (permalink / raw)
  To: Srinivas Pandruvada
  Cc: Arto Jantunen, Viresh Kumar, Brown, Len, len.brown, Chen, Yu C,
	Doug Smythies, Rafael J. Wysocki, linux-pm, Rik van Riel

On Mon, Feb 29, 2016 at 5:49 PM, Srinivas Pandruvada
<srinivas.pandruvada@linux.intel.com> wrote:
> +Len
>
> On Sun, 2016-02-28 at 17:43 +0200, Arto Jantunen wrote:
>> Viresh Kumar <viresh.kumar@linaro.org> writes:
>>
>> > On 22-02-16, 18:39, Arto Jantunen wrote:
>> > > Viresh Kumar <viresh.kumar@linaro.org> writes:
>> > >
>> > > > On 21-02-16, 22:33, Arto Jantunen wrote:
>> > > > > I have tested both available governors, and see the same
>> > > > > behavior either
>> > > > > way. The kernel I have defaults to performance, I think I'll
>> > > > > try
>> > > > > building another one which defaults to powersave to see if
>> > > > > that changes
>> > > > > anything (perhaps both governors actually work but it isn't
>> > > > > possible to
>> > > > > switch between them at runtime?). The Debian userspace
>> > > > > defaults to
>> > > > > ondemand, which doesn't exist for intel_pstate.
>> > > >
>> > > > I took a close look at git log between 4.4 and 4.5-rc1 for
>> > > > intel-pstate and it
>> > > > had only three patches:
>> > > >
>> > > > 157386b6fc14 cpufreq: intel_pstate: Configurable algorithm to
>> > > > get target pstate
>> > > > e70eed2b6454 cpufreq: intel_pstate: Account for non C0 time
>> > > > 63d1d656a523 cpufreq: intel_pstate: Account for IO wait time
>> > > >
>> > > > The first one creates special routines based on the CPU model
>> > > > you have, yours is
>> > > > 94, i.e. 5e, which means we are going to use: core_params in
>> > > > your case. And so
>> > > > you will be using get_target_pstate_use_performance() for
>> > > > .get_target_pstate().
>> > > >
>> > > > The two later patches don't make any changes to the working
>> > > > of core_params()
>> > > > and so shouldn't have changed anything for Skylake.
>> > > >
>> > > > Anyway, please try reverting the above three patches to see
>> > > > if there is a bug
>> > > > somewhere there. So you need to do:
>> > > >
>> > > > git revert 63d1d656a523
>> > > > git revert e70eed2b6454
>> > > > git revert 157386b6fc14
>> > >
>> > > Thanks. I tried this, and somewhat surprisingly it doesn't change
>> > > the
>> > > result. I guess we are back to doing a full bisect?
>> >
>> > Good. That was kind of what I expected, so no surprise :)
>> >
>> > I think bisect wouldn't be that difficult, please try :)
>>
>> Bisect comes up with this commit:
>>
>> commit a9ceb78bc75ca47972096372ff3d48648b16317a
>> Author: Rik van Riel <riel@redhat.com>
>> Date:   Tue Nov 3 17:34:18 2015 -0500
>>
>>     cpuidle,menu: use interactivity_req to disable polling
>>
>>     The menu governor carefully figures out how much time we
>> typically
>>     sleep for an estimated sleep interval, or whether there is a
>> repeating
>>     pattern going on, and corrects that estimate for the CPU load.
>>
>>     Then it proceeds to ignore that information when determining
>> whether
>>     or not to consider polling. This is not a big deal on most x86
>> CPUs,
>>     which have very low C1 latencies, and the patch should not have
>> any
>>     effect on those CPUs.
>>
>>     However, certain CPUs (eg. Atom) have much higher C1 latencies,
>> and
>>     it would be good to not waste performance and power on those CPUs
>> if
>>     we are expecting a very low wakeup latency.
>>
>>     Disable polling based on the estimated interactivity requirement,
>> not
>>     on the time to the next timer interrupt.
>>
>>     Signed-off-by: Rik van Riel <riel@redhat.com>
>>     Acked-by: Arjan van de Ven <arjan@linux.intel.com>
>>     Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> I verified the result by reverting
>> 9c4b2867ed7c8c8784dd417ffd16e705e81eb145 and
>> a9ceb78bc75ca47972096372ff3d48648b16317a from 4.5-rc5, the resulting
>> kernel does not have the bug.
>>
>> Since this is about cpuidle, I'll also mention that this hardware
>> requires idle=nomwait on the command line, otherwise the kernel will
>> not
>> boot.
> This is a problem.

I'll give Rik a couple of days to respond to this and if he doesn't,
I'll queue up the reverts for 4.5.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
       [not found]                       ` <20160229201946.0bdcc48e@annuminas.surriel.com>
@ 2016-03-01  7:06                         ` Arto Jantunen
  2016-03-01 16:59                           ` Arto Jantunen
  2016-03-01 19:22                           ` Rik van Riel
  0 siblings, 2 replies; 22+ messages in thread
From: Arto Jantunen @ 2016-03-01  7:06 UTC (permalink / raw)
  To: Rik van Riel; +Cc: linux-pm

Rik van Riel <riel@redhat.com> writes:

> On Sun, 28 Feb 2016 17:43:46 +0200
> Arto Jantunen <viiru@iki.fi> wrote:
>
>> Viresh Kumar <viresh.kumar@linaro.org> writes:
>> 
>> > On 22-02-16, 18:39, Arto Jantunen wrote:
>> >> Viresh Kumar <viresh.kumar@linaro.org> writes:
>> >> 
>> >> > On 21-02-16, 22:33, Arto Jantunen wrote:
>> >> >> I have tested both available governors, and see the same behavior either
>> >> >> way. The kernel I have defaults to performance, I think I'll try
>> >> >> building another one which defaults to powersave to see if that changes
>> >> >> anything (perhaps both governors actually work but it isn't possible to
>> >> >> switch between them at runtime?). The Debian userspace defaults to
>> >> >> ondemand, which doesn't exist for intel_pstate.
>> >> >
>> >> > I took a close look at git log between 4.4 and 4.5-rc1 for intel-pstate and it
>> >> > had only three patches:
>> >> >
>> >> > 157386b6fc14 cpufreq: intel_pstate: Configurable algorithm to get target pstate
>> >> > e70eed2b6454 cpufreq: intel_pstate: Account for non C0 time
>> >> > 63d1d656a523 cpufreq: intel_pstate: Account for IO wait time
>> >> >
>> >> > The first one creates special routines based on the CPU model you have, yours is
>> >> > 94, i.e. 5e, which means we are going to use: core_params in your case. And so
>> >> > you will be using get_target_pstate_use_performance() for .get_target_pstate().
>
> What exactly is the problem you are seeing, and on what CPUs?

Quoting from the first mail in this thread:

When using kernel 4.5-rc4 my Skylake machine runs very warm since all
cpu cores are always kept at 3.10Ghz (with maximum without turboboost
being 2.6Ghz), completely regardless of load. Swapping between the
governors (performance and powersave) doesn't change the result in any
way, frequency remains at a constant 3.10Ghz.

I can force the frequency down manually with cpufreq-set, so the problem
doesn't seem to be with the actual frequency changing.

Cpuinfo (taken from 4.4 which doesn't have the problem):

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 94
model name	: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
stepping	: 3
microcode	: 0x33
cpu MHz		: 1067.929
cache size	: 6144 KB
physical id	: 0
siblings	: 8
core id		: 0
cpu cores	: 4
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 \
clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm \
constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf \
eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr \
pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c \
rdrand lahf_lm abm 3dnowprefetch ida arat epb pln pts dtherm hwp hwp_notify \
hwp_act_window hwp_epp intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase \
tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt \
xsaveopt xsavec xgetbv1
bugs		:
bogomips	: 5183.96
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

> What does the cpufreq table for that CPU look like?
>
> Does HLT (or its equivalent) have a really, really high
> exit or residency latency?

I don't know the answer to either of these questions. How do I find out?

>> commit a9ceb78bc75ca47972096372ff3d48648b16317a
>> Author: Rik van Riel <riel@redhat.com>
>> Date:   Tue Nov 3 17:34:18 2015 -0500
>> 
>>     cpuidle,menu: use interactivity_req to disable polling
>
> I could see that patch being a little aggressive on some CPUs,
> due to interactivity_req being corrected by the load on the
> CPU.  Does using data->predicted_us help?

I'll test this tonight and get back to you.

> ---8<---
>
> Subject: cpuidle: use predicted_us not interactivity_req to consider polling
>
> The interactivity_req variable is the expected sleep time, divided
> by the CPU load. This can be too aggressive a factor in deciding
> whether or not to consider polling in the cpuidle state selection.
>
> Use the (not corrected for load) predicted_us instead.
>
> Signed-off-by: Rik van Riel <riel@redhat.com>
> ---
>  drivers/cpuidle/governors/menu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
> index 0742b3296673..97022ae01d2e 100644
> --- a/drivers/cpuidle/governors/menu.c
> +++ b/drivers/cpuidle/governors/menu.c
> @@ -330,7 +330,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
>  		 * We want to default to C1 (hlt), not to busy polling
>  		 * unless the timer is happening really really soon.
>  		 */
> -		if (interactivity_req > 20 &&
> +		if (data->predicted_us > 20 &&
>  		    !drv->states[CPUIDLE_DRIVER_STATE_START].disabled &&
>  			dev->states_usage[CPUIDLE_DRIVER_STATE_START].disable == 0)
>  			data->last_state_idx = CPUIDLE_DRIVER_STATE_START;
>
-- 
Arto Jantunen

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-03-01  7:06                         ` Arto Jantunen
@ 2016-03-01 16:59                           ` Arto Jantunen
  2016-03-01 19:22                           ` Rik van Riel
  1 sibling, 0 replies; 22+ messages in thread
From: Arto Jantunen @ 2016-03-01 16:59 UTC (permalink / raw)
  To: Rik van Riel; +Cc: linux-pm

Arto Jantunen <viiru@iki.fi> writes:

> Rik van Riel <riel@redhat.com> writes:
>>> commit a9ceb78bc75ca47972096372ff3d48648b16317a
>>> Author: Rik van Riel <riel@redhat.com>
>>> Date:   Tue Nov 3 17:34:18 2015 -0500
>>> 
>>>     cpuidle,menu: use interactivity_req to disable polling
>>
>> I could see that patch being a little aggressive on some CPUs,
>> due to interactivity_req being corrected by the load on the
>> CPU.  Does using data->predicted_us help?
>
> I'll test this tonight and get back to you.
>
>> ---8<---
>>
>> Subject: cpuidle: use predicted_us not interactivity_req to consider polling
>>
>> The interactivity_req variable is the expected sleep time, divided
>> by the CPU load. This can be too aggressive a factor in deciding
>> whether or not to consider polling in the cpuidle state selection.
>>
>> Use the (not corrected for load) predicted_us instead.
>>
>> Signed-off-by: Rik van Riel <riel@redhat.com>
>> ---
>>  drivers/cpuidle/governors/menu.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
>> index 0742b3296673..97022ae01d2e 100644
>> --- a/drivers/cpuidle/governors/menu.c
>> +++ b/drivers/cpuidle/governors/menu.c
>> @@ -330,7 +330,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
>>  		 * We want to default to C1 (hlt), not to busy polling
>>  		 * unless the timer is happening really really soon.
>>  		 */
>> -		if (interactivity_req > 20 &&
>> +		if (data->predicted_us > 20 &&
>>  		    !drv->states[CPUIDLE_DRIVER_STATE_START].disabled &&
>>  			dev->states_usage[CPUIDLE_DRIVER_STATE_START].disable == 0)
>>  			data->last_state_idx = CPUIDLE_DRIVER_STATE_START;
>>

This patch has no effect on the symptoms I'm seeing.

-- 
Arto Jantunen

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-03-01  7:06                         ` Arto Jantunen
  2016-03-01 16:59                           ` Arto Jantunen
@ 2016-03-01 19:22                           ` Rik van Riel
  2016-03-01 19:47                             ` Arto Jantunen
  1 sibling, 1 reply; 22+ messages in thread
From: Rik van Riel @ 2016-03-01 19:22 UTC (permalink / raw)
  To: Arto Jantunen; +Cc: linux-pm

On Tue, 2016-03-01 at 09:06 +0200, Arto Jantunen wrote:
> Rik van Riel <riel@redhat.com> writes:
> 
> > What exactly is the problem you are seeing, and on what CPUs?
> 
> Quoting from the first mail in this thread:
> 
> When using kernel 4.5-rc4 my Skylake machine runs very warm since all
> cpu cores are always kept at 3.10Ghz (with maximum without turboboost
> being 2.6Ghz), completely regardless of load. Swapping between the
> governors (performance and powersave) doesn't change the result in
> any
> way, frequency remains at a constant 3.10Ghz.

> > What does the cpufreq table for that CPU look like?
> > 
> > Does HLT (or its equivalent) have a really, really high
> > exit or residency latency?
> 
> I don't know the answer to either of these questions. How do I find
> out?

Could you run this?

#!/bin/sh
for i in `seq 0 10`; do
	if [ -d /sys/devices/system/cpu/cpu0/cpuidle/state$i ]; then
		echo -n "state $i latency: "
		cat /sys/devices/system/cpu/cpu0/cpuidle/state$i/latency
	fi
done

-- 
All Rights Reversed.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-02-29  6:22                       ` Doug Smythies
@ 2016-03-01 19:28                         ` Doug Smythies
  0 siblings, 0 replies; 22+ messages in thread
From: Doug Smythies @ 2016-03-01 19:28 UTC (permalink / raw)
  To: 'Arto Jantunen', 'Rafael J. Wysocki'
  Cc: 'Srinivas Pandruvada', 'Chen, Yu C',
	linux-pm, 'Rik van Riel', 'Viresh Kumar',
	'Doug Smythies'

Note: Just following up with some energy numbers, that perhaps
should have been included to begin with.

> On 2016.02.28 22:22 Doug Smythies wrote:
>> On 2016.02.28 07:44 Arto Jantunen wrote:

>> Bisect comes up with this commit:
>>
>> commit a9ceb78bc75ca47972096372ff3d48648b16317a
>>
>> I verified the result by reverting
>> 9c4b2867ed7c8c8784dd417ffd16e705e81eb145 and
>> a9ceb78bc75ca47972096372ff3d48648b16317a from 4.5-rc5, the resulting
>> kernel does not have the bug.

...[cut]...

> So what does this have to do with these commits?
>
> Reverting the commits dramatically reduces, but does not eliminate,
> the frequency of the high CPU load long durations situation.
>
> Test results: Each test was 33 minutes, involving 9 incremental
> kernel compiles.
>
> Why an incremental kernel compile? Because it just so happens
> to demonstrate the issue well.

Why 9? Just to have a longer test time for better averaging.

>
> Kernel 1 = 4.5-rc5 + rjw v10 3 patch set. Called "rjwv10".
> Kernel 2 = kernel 1 + above 2 commits reverted. Called "reverted".
>
> Test 1: rjwv10 4056 occurrences
> Test 2: reverted 293 occurrences
> Test 3: rjwv10 7878 occurrences
> Test 4: reverted 259 occurrences
> Test 5: rjwv10 3708 occurrences
> Test 6: reverted 54 occurrences
>
> Average issue reduction ratio: 26 times better.

Turbostat was used for the following:

Test 7: reverted: Package Joules: 47830
Test 8: rjwv10: Package Joules: 54419 (revert saves 12.1% energy)
Test 9: reverted: Package Joules: 49326
Test 10: rjwv10: Package Joules: 55442 (revert saves 11% energy)
Test 11: reverted: acpi-cpufreq ondemand: Package Joules: 49146
Test 12: rjwv10: acpi-cpufreq ondemand: Package Joules: 56302 (revert saves 12.7% energy)
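
The percentages in parentheses follow from (rjwv10 - reverted) / rjwv10 for
each pair of tests. A sketch of that calculation; the helper name
energy_saving is hypothetical:

```shell
#!/bin/sh
# energy_saving REVERTED_JOULES BASELINE_JOULES:
# print the energy saved by the revert as a percentage of the baseline.
energy_saving() {
	awk -v rev="$1" -v base="$2" 'BEGIN {
		printf "%.1f\n", 100 * (base - rev) / base
	}'
}

energy_saving 47830 54419   # tests 7 and 8
energy_saving 49326 55442   # tests 9 and 10
energy_saving 49146 56302   # tests 11 and 12
```

It prints 12.1, 11.0 and 12.7, matching the figures above.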

... Doug



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4
  2016-03-01 19:22                           ` Rik van Riel
@ 2016-03-01 19:47                             ` Arto Jantunen
  0 siblings, 0 replies; 22+ messages in thread
From: Arto Jantunen @ 2016-03-01 19:47 UTC (permalink / raw)
  To: Rik van Riel; +Cc: linux-pm

Rik van Riel <riel@redhat.com> writes:

> On Tue, 2016-03-01 at 09:06 +0200, Arto Jantunen wrote:
>> Rik van Riel <riel@redhat.com> writes:
>> > What does the cpufreq table for that CPU look like?
>> > 
>> > Does HLT (or its equivalent) have a really, really high
>> > exit or residency latency?
>> 
>> I don't know the answer to either of these questions. How do I find
>> out?
>
> Could you run this?
>
> #!/bin/sh
> for i in `seq 0 10`; do
> 	if [ -d /sys/devices/system/cpu/cpu0/cpuidle/state$i ]; then
> 		echo -n "state $i latency: "
> 		cat /sys/devices/system/cpu/cpu0/cpuidle/state$i/latency
> 	fi
> done

Here is the output:

state 0 latency: 0
state 1 latency: 1
state 2 latency: 151
state 3 latency: 1034
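
For reference, a variant of the script that also prints each state's name,
which makes it easier to tell which entry is the polling state and which is
C1. A minimal sketch: the function name list_idle_latencies and the
optional directory argument are hypothetical, and it assumes the usual
"name" and "latency" files (latency in microseconds) in each state
directory.

```shell
#!/bin/sh
# list_idle_latencies [DIR]: print name and exit latency (microseconds)
# for every cpuidle state directory under DIR.
# DIR defaults to cpu0's cpuidle directory in sysfs.
list_idle_latencies() {
	dir=${1:-/sys/devices/system/cpu/cpu0/cpuidle}
	for state in "$dir"/state*; do
		# Skip the unexpanded pattern when no state directory exists.
		[ -d "$state" ] || continue
		printf '%s (%s) latency: %s\n' \
			"${state##*/}" \
			"$(cat "$state/name")" \
			"$(cat "$state/latency")"
	done
}

list_idle_latencies
```

The shell expands the glob in lexical order, so a state10 would be listed
before state2 on machines with that many states.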

-- 
Arto Jantunen

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2016-03-01 19:47 UTC | newest]

Thread overview: 22+ messages
2016-02-20  8:49 PROBLEM: Cpufreq constantly keeps frequency at maximum on 4.5-rc4 Arto Jantunen
2016-02-20 16:31 ` Doug Smythies
2016-02-20 17:10   ` Arto Jantunen
2016-02-20 18:03     ` Chen, Yu C
2016-02-21  8:45       ` Arto Jantunen
2016-02-21  8:52         ` Chen, Yu C
2016-02-21 20:02           ` Srinivas Pandruvada
2016-02-21 20:33             ` Arto Jantunen
2016-02-22  6:16               ` Viresh Kumar
2016-02-22 16:39                 ` Arto Jantunen
2016-02-22 16:41                   ` Viresh Kumar
2016-02-22 16:48                     ` Viresh Kumar
2016-02-22 19:25                       ` Srinivas Pandruvada
2016-02-28 15:43                     ` Arto Jantunen
2016-02-29  6:22                       ` Doug Smythies
2016-03-01 19:28                         ` Doug Smythies
2016-02-29 16:49                       ` Srinivas Pandruvada
2016-03-01  0:37                         ` Rafael J. Wysocki
     [not found]                       ` <20160229201946.0bdcc48e@annuminas.surriel.com>
2016-03-01  7:06                         ` Arto Jantunen
2016-03-01 16:59                           ` Arto Jantunen
2016-03-01 19:22                           ` Rik van Riel
2016-03-01 19:47                             ` Arto Jantunen
