From: rwells@codeaurora.org
To: ahs3@redhat.com
Cc: Ashwin Chaugule <ashwin.chaugule@linaro.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Len Brown <lenb@kernel.org>,
	linux acpi <linux-acpi@vger.kernel.org>,
	lkml <linux-kernel@vger.kernel.org>,
	linux-pm@vger.kernel.org
Subject: Re: [PATCH v2] Force cppc_cpufreq to report values in KHz to fix user space reporting
Date: Tue, 26 Apr 2016 14:46:32 -0400
Message-ID: <9bf9293aa62b0a7caef7c032ed18749b@codeaurora.org>
In-Reply-To: <5716B706.2050700@redhat.com>

On 2016-04-19 18:53, Al Stone wrote:
> On 04/19/2016 02:12 PM, Ashwin Chaugule wrote:
>> + Ryan
>> 
>> Hi Al,
>> 
>> On 18 April 2016 at 20:11, Al Stone <ahs3@redhat.com> wrote:
>>> When CPPC is being used by ACPI on arm64, user space tools such as
>>> cpupower report CPU frequency values from sysfs that are incorrect.
>>> 
>>> What the driver was doing was reporting the values given by ACPI tables
>>> in whatever scale was used to provide them.  However, the ACPI spec
>>> defines the CPPC values as unitless abstract numbers.  Internal kernel
>>> structures such as struct perf_cap, in contrast, expect these values
>>> to be in KHz.  When these struct values get reported via sysfs, the
>>> user space tools also assume they are in KHz, causing them to report
>>> incorrect values (for example, reporting a CPU frequency of 1MHz when
>>> it should be 1.8GHz).
>>> 
>>> While the investigation for a long term fix proceeds (several options
>>> are being explored, some of which may require spec changes or other
>>> much more invasive fixes), this patch forces the values read by CPPC
>>> to be read in KHz, regardless of what they actually represent.
>>> 
>>> The downside is that this approach has some assumptions:
>>> 
>>>    (1) It relies on SMBIOS3 being used, *and* that the Max Frequency
>>>    value for a processor is set to a non-zero value.
>>> 
>>>    (2) It assumes that all processors run at the same speed.  This
>>>    patch retrieves the first CPU Max Frequency from a type 4 DMI
>>>    record that it can find.  This may not be an issue, however, as a
>>>    sampling of DMI data on x86 and arm64 indicates there is often only
>>>    one such record regardless.
>>> 
>>> For arm64 servers, this may be sufficient, but it does rely on
>>> firmware values being set correctly.  Hence, other approaches are
>>> also being considered.
>>> 
>>> This has been tested on three arm64 servers, with and without DMI,
>>> with and without CPPC support.
>>> 
>>> Changes for v2:
>>>     -- Corrected thinko: needed to have DEPENDS on DMI in Kconfig.arm,
>>>        not SELECT DMI (found by build daemon)
>>> 
>>> Signed-off-by: Al Stone <ahs3@redhat.com>
>> 
>> This looks like a good short term solution. Does it make more sense to
>> move this to the cppc_cpufreq driver though? Since that ties more
>> closely into the cpufreq framework which requires the kHz values in
>> sysfs. That way we can keep the cppc_acpi.c shim compliant with the
>> ACPI spec. (i.e. values read in cppc structures remain abstract and
>> unitless).
> 
> Perhaps.  Doing it that way made the patch a bit messier since
> cppc_acpi.c would set values that then had to be replaced in
> cppc_cpufreq.c, so initialization looked odd to me; that's how
> I ended up here.  You do raise a good point, however; I'll look
> at that approach again since I could have missed an easier way
> to do it.
> 
>> Rafael, Viresh, others,
>> 
>> Any other ideas how to handle this better in the long term?
>> 
>>  - Decouple the cpufreq sysfs from the cppc driver and introduce its
>> own entries. Is it possible to do this cleanly while still allowing
>> usage of cpufreq registration with existing governors?
>> 
>>  - Come up with a scaling factor using the PMU cycle counter at boot
>> before the CPPC drivers are initialized. This would use the current
>> freq set by some UEFI var. This would possibly require some messy
>> perfevents plumbing and added bootup time though.
>> 
>> - .. ?
>> 
>> 
>> Cheers,
>> Ashwin.
>> 
> 
> The other thought that occurs to me is to go back through the
> perf_cap and cpufreq structs and make them more general -- perhaps
> store the units being used and pointers to functions to convert them
> to KHz.  This may require separating sysfs data for perf_cap from the
> cpufreq sysfs data from the cppc sysfs data.  But, if units are then
> reported out to sysfs, user space tools can do whatever conversions
> they want, or at least know what they're reporting instead of there
> being an implicit ABI between the kernel and the tools.  This would
> be a far more invasive patch set, I think, but it still may be the
> right thing to do for the long term.
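
For illustration only, the "store the units plus a conversion function"
idea might look roughly like the sketch below. Everything in it is
hypothetical: enum perf_unit, struct perf_value, and the helper names are
made up for this example and do not exist in the kernel's perf_cap or
cpufreq structures.

```c
#include <assert.h>   /* only needed if the checks in the note below are used */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical unit tag that could be reported to user space via sysfs. */
enum perf_unit { PERF_UNIT_ABSTRACT, PERF_UNIT_KHZ, PERF_UNIT_MHZ };

/* Hypothetical container: a raw value, its unit, and an optional converter. */
struct perf_value {
	uint64_t raw;
	enum perf_unit unit;
	/* Stores kHz in *khz and returns 0, or is NULL if no mapping exists. */
	int (*to_khz)(uint64_t raw, uint64_t *khz);
};

/* Converter for values that firmware happens to express in MHz. */
static int mhz_to_khz(uint64_t raw, uint64_t *khz)
{
	*khz = raw * 1000;
	return 0;
}

/* Returns 0 and fills *khz, or -1 when the value is abstract/unconvertible. */
static int perf_value_khz(const struct perf_value *v, uint64_t *khz)
{
	if (v == NULL || v->to_khz == NULL)
		return -1;
	return v->to_khz(v->raw, khz);
}
```

With this shape, a value tagged PERF_UNIT_MHZ converts cleanly, while a
purely abstract CPPC value (to_khz left NULL) is reported as having no
frequency equivalent rather than being silently misread as kHz.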

The issue is a little more fundamental than that even.  We are 
retrofitting a performance management interface (CPPC) into a frequency 
management framework (cpufreq) and accompanying tools.  Regardless of 
what scheme we come up with for deriving/exposing frequency, we still 
haven’t completely solved the problem as that assumes a linear 
relationship between freq and performance.  This will work for many but 
not necessarily all CPPC systems.  In fact, making that assumption is 
explicitly forbidden in the ACPI spec: "OSPM must make no assumption 
about the exact meaning of the performance values presented by the 
platform, or how they may correlate to specific hardware metrics like 
processor frequency."

So to be completely consistent with the current spec, we would need to 
wean the tools off of frequency altogether and move to abstract 
performance - either specifically when the CPPC driver is loaded or more 
generally.  If we think reporting frequency is required, that might still 
be doable, but it would need to be a separate interface from the CPPC perf 
scale.  But I agree with Al that this is a more invasive change.

For the time being, I don't think it is unreasonable to assume 
performance is linear with frequency and come up with a scaling factor 
via one of several mechanisms:
    1) SMBIOS as Al proposed (caveats above)
    2) Measure at boot using PMU or other mechanism as Ashwin floated
       (more complicated but removes dependency on SMBIOS and assumption
       that freq scale is same across all CPUs)
    3) Just hardcode a CPPC perf to kHz mapping - maybe everyone is using
       MHz today? (simplest but obviously least flexible)
    4) Others?
None of these is a full solution - it is a question of how many different 
scenarios we need to cover with the initial solution.  I think the SMBIOS 
one is probably a good simplicity/flexibility compromise.
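
A rough sketch of what the linear scaling amounts to, for illustration
only. perf_to_khz_linear() and its parameters are made-up names, not
existing kernel code. It assumes that the highest CPPC perf value
corresponds to the SMBIOS Max Frequency (given in MHz) and that the same
scale holds for every CPU, which are exactly the caveats noted above.

```c
#include <assert.h>   /* only needed if the checks in the note below are used */
#include <stdint.h>

/*
 * Hypothetical helper: map an abstract CPPC perf value to kHz by assuming
 * performance is linear in frequency and that highest_perf corresponds to
 * max_mhz (e.g. the Max Frequency from a DMI type 4 record).
 * Returns 0 when no scale can be derived (highest_perf or max_mhz is 0).
 */
static uint64_t perf_to_khz_linear(uint64_t perf, uint64_t highest_perf,
				   uint64_t max_mhz)
{
	if (highest_perf == 0 || max_mhz == 0)
		return 0;

	/* Multiply before dividing to keep integer precision. */
	return perf * max_mhz * 1000 / highest_perf;
}
```

For example, with highest_perf = 1800 and a Max Frequency of 1800 MHz, a
perf value of 900 maps to 900000 kHz.  If firmware leaves Max Frequency at
zero, the helper returns 0 and the caller would have to fall back to
reporting the raw values.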

-Ryan

Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
Linux Foundation Collaborative Project
