xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Jason Andryuk <jandryuk@gmail.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: "Andrew Cooper" <andrew.cooper3@citrix.com>,
	"George Dunlap" <george.dunlap@citrix.com>,
	"Ian Jackson" <iwj@xenproject.org>,
	"Julien Grall" <julien@xen.org>,
	"Stefano Stabellini" <sstabellini@kernel.org>,
	"Wei Liu" <wl@xen.org>, "Roger Pau Monné" <roger.pau@citrix.com>,
	xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH 04/13] cpufreq: Add Hardware P-State (HWP) driver
Date: Thu, 3 Jun 2021 07:55:00 -0400	[thread overview]
Message-ID: <CAKf6xpvCkzHOZsBY2yMQSVxq844_muaAaKd-JZUQfd7UCrXLVg@mail.gmail.com> (raw)
In-Reply-To: <2c3400e8-8236-8558-08e4-37c8b1494de7@suse.com>

On Fri, May 28, 2021 at 2:35 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 27.05.2021 20:50, Jason Andryuk wrote:
> > On Wed, May 26, 2021 at 11:00 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 03.05.2021 21:28, Jason Andryuk wrote:
> >>> +    hwp_verbose("HWP: FAST_IA32_HWP_REQUEST %ssupported\n",
> >>> +                eax & CPUID6_EAX_FAST_HWP_MSR ? "" : "not ");
> >>> +    if ( eax & CPUID6_EAX_FAST_HWP_MSR )
> >>> +    {
> >>> +        if ( rdmsr_safe(MSR_FAST_UNCORE_MSRS_CAPABILITY, val) )
> >>> +            hwp_err("error rdmsr_safe(MSR_FAST_UNCORE_MSRS_CAPABILITY)\n");
> >>> +
> >>> +        hwp_verbose("HWP: MSR_FAST_UNCORE_MSRS_CAPABILITY: %016lx\n", val);
> >>
> >> Missing "else" above here?
> >
> > Are unbalanced braces acceptable or must they be balanced?  Is this acceptable:
> > if ()
> >   one;
> > else {
> >   two;
> >   three;
> > }
>
> Yes, it is. But I don't see how the question relates to my comment.
> All that needs to go in the else's body is the hwp_verbose().

'val' shouldn't be used to set features when the rdmsr fails, so the
following code needs to be within the else.  Unless you want to rely
on a failed rdmsr returning 0.

> >>> +static void hdc_set_pkg_hdc_ctl(bool val)
> >>> +{
> >>> +    uint64_t msr;
> >>> +
> >>> +    if ( rdmsr_safe(MSR_IA32_PKG_HDC_CTL, msr) )
> >>> +    {
> >>> +        hwp_err("error rdmsr_safe(MSR_IA32_PKG_HDC_CTL)\n");
> >>
> >> I'm not convinced of the need of having such log messages after
> >> failed rdmsr/wrmsr, but I'm definitely against them getting logged
> >> unconditionally. In verbose mode, maybe.
> >
> > We should not fail the rdmsr if our earlier cpuid check passed.  So in
> > that respect these messages can be removed.  The benefit here is that
> > you can see which MSR failed.  If you relied on extable_fixup, you
> > would have to cross reference manually.  How will administors know if
> > hwp isn't working properly there are no messages?
>
> This same question would go for various other MSR accesses which
> might fail, but which aren't accompanied by an explicit log message.
> I wouldn't mind a single summary message reporting if e.g. HWP
> setup failed. More detailed analysis of such would be more of a
> developer's than an administrator's job then anyway.
>
> > For HWP in general, the SDM says to check CPUID for the availability
> > of MSRs.  Therefore rdmsr/wrmsr shouldn't fail.  During development, I
> > saw wrmsr failures when with "Valid Maximum" and other "Valid" bits
> > set.  I think that was because I hadn't set up the Package Request
> > MSR.  That has been fixed by not using those bits.  Anyway,
> > rdmsr/wrmsr shouldn't fail, so how much code should be put into
> > checking for those failures which shouldn't happen?
>
> If there's any risk of accesses causing a fault, using *msr_safe()
> is of course the right choice. Beyond that you will need to consider
> what the consequences are. Sadly this needs doing on every single
> case individually. See "Handling unexpected conditions" section of
> ./CODING_STYLE for guidelines (even if the specific constructs
> aren't in use here).

Using *msr_safe(), I think the worst case is the users can't set HWP
in the way they want.  So power/performance may not be what the users
wanted, but Xen won't crash.  The hardware will throttle itself if
needed for self-protection, so that should be okay as well.

> >>> +static void hdc_set_pm_ctl1(bool val)
> >>> +{
> >>> +    uint64_t msr;
> >>> +
> >>> +    if ( rdmsr_safe(MSR_IA32_PM_CTL1, msr) )
> >>> +    {
> >>> +        hwp_err("error rdmsr_safe(MSR_IA32_PM_CTL1)\n");
> >>> +
> >>> +        return;
> >>> +    }
> >>> +
> >>> +    msr = val ? IA32_PM_CTL1_HDC_Allow_Block : 0;
> >>
> >> Same here then, and ...
> >>
> >>> +static void hwp_fast_uncore_msrs_ctl(bool val)
> >>> +{
> >>> +    uint64_t msr;
> >>> +
> >>> +    if ( rdmsr_safe(MSR_FAST_UNCORE_MSRS_CTL, msr) )
> >>> +        hwp_err("error rdmsr_safe(MSR_FAST_UNCORE_MSRS_CTL)\n");
> >>> +
> >>> +    msr = val;
> >>
> >> ... here (where you imply bit 0 instead of using a proper
> >> constant).
> >>
> >> Also for all three functions I'm not convinced their names are
> >> in good sync with their parameters being boolean.
> >
> > Would you prefer something named closer to the bit names like:
> > hdc_set_pkg_hdc_ctl -> hdc_set_pkg_enable
> > hdc_set_pm_ctl1 -> hdc_set_allow_block
> > hwp_fast_uncore_msrs_ctl -> hwp_fast_request_enable
>
> My primary request is for function name, purpose, and parameter(s)
> to be in line. So e.g. if you left the parameters as boolean, then
> I think your suggested name changes make sense. Alternatively you
> could make the functions e.g. be full register updating ones, with
> suitable parameters, which could be a mask-to-set and mask-to-clear.

I'm going to use the replacement names while keeping the boolean
argument.  Those MSRs only have single bits defined today, so
functions with boolean parameters are simpler.

> >>> +static void hwp_read_capabilities(void *info)
> >>> +{
> >>> +    struct cpufreq_policy *policy = info;
> >>> +    struct hwp_drv_data *data = hwp_drv_data[policy->cpu];
> >>> +
> >>> +    if ( rdmsr_safe(MSR_IA32_HWP_CAPABILITIES, data->hwp_caps) )
> >>> +    {
> >>> +        hwp_err("CPU%u: error rdmsr_safe(MSR_IA32_HWP_CAPABILITIES)\n",
> >>> +                policy->cpu);
> >>> +
> >>> +        return;
> >>> +    }
> >>> +
> >>> +    if ( rdmsr_safe(MSR_IA32_HWP_REQUEST, data->curr_req.raw) )
> >>> +    {
> >>> +        hwp_err("CPU%u: error rdmsr_safe(MSR_IA32_HWP_REQUEST)\n", policy->cpu);
> >>> +
> >>> +        return;
> >>> +    }
> >>> +}
> >>
> >> This function doesn't indicate failure to its caller(s), so am I
> >> to understand that failure to read either ofthe MSRs is actually
> >> benign to the driver?
> >
> > They really shouldn't fail since the CPUID check passed during
> > initialization.  If you can't read/write MSRs, something is broken and
> > the driver cannot function.  The machine would still run, but HWP
> > would be uncontrollable.  How much care should be given to
> > "impossible" situations?
>
> See above. The main point is to avoid crashing. In the specific
> case here I think you could simply drop both the log messages and
> the early return, assuming the caller can deal fine with the zero
> value(s) that rdmsr_safe() will substitute for the MSR value(s).
> Bailing early, otoh, calls for handing back an error indicator
> imo.

I changed it to have failure set curr_req.raw = -1.  Then
cpufreq_driver.init returns -ENODEV in that case.

> >>> +    policy->governor = &hwp_cpufreq_governor;
> >>> +
> >>> +    data = xzalloc(typeof(*data));
> >>
> >> Commonly we specify the type explicitly in such cases, rather than using
> >> typeof(). I will admit though that I'm not entirely certain which one's
> >> better. But consistency across the code base is perhaps preferable for
> >> the time being.
> >
> > I thought typeof(*data) is always preferable since it will always be
> > the matching type.  I can change it, but I feel like it's a step
> > backwards.
>
> There's exactly one similar use in the entire code base. The original
> idea with xmalloc() was that one would explicitly specify the intended
> type, such that without looking elsewhere one can see what exactly is
> to be allocated. One could further rely on the compiler warning if
> there was a disagreement between declaration and assignment.

Oh, okay.  Thanks for the explanation of xmalloc().

> >>> +    if ( feature_hwp_energy_perf )
> >>> +        data->energy_perf = 0x80;
> >>> +    else
> >>> +        data->energy_perf = 7;
> >>
> >> Where's this 7 coming from? (You do mention the 0x80 at least in the
> >> description.)
> >
> > When HWP energy performance preference is unavailable, it falls back
> > to IA32_ENERGY_PERF_BIAS with a 0-15 range.  Per the SDM Vol3 14.3.4,
> > "A value of 7 roughly translates into a hint to balance performance
> > with energy consumption."  I will add a comment.
>
> Actually, as per a comment on a later patch, I'm instead expecting
> you to add a #define, the name of which will serve as comment.

Yes, sounds good.

Regards,
Jason


  reply	other threads:[~2021-06-03 11:55 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-03 19:27 [PATCH 00/13] Intel Hardware P-States (HWP) support Jason Andryuk
2021-05-03 19:27 ` [PATCH 01/13] cpufreq: Allow restricting to internal governors only Jason Andryuk
2021-05-26 13:18   ` Jan Beulich
2021-05-26 14:12     ` Jason Andryuk
2021-05-26 15:09       ` Jan Beulich
2021-05-26 16:44         ` Jason Andryuk
2021-05-03 19:27 ` [PATCH 02/13] cpufreq: Add perf_freq to cpuinfo Jason Andryuk
2021-05-26 13:24   ` Jan Beulich
2021-05-26 14:19     ` Jason Andryuk
2021-05-03 19:28 ` [PATCH 03/13] cpufreq: Export intel_feature_detect Jason Andryuk
2021-05-26 13:27   ` Jan Beulich
2021-05-26 14:44     ` Jason Andryuk
2021-05-26 15:11       ` Jan Beulich
2021-05-03 19:28 ` [PATCH 04/13] cpufreq: Add Hardware P-State (HWP) driver Jason Andryuk
2021-05-26 14:59   ` Jan Beulich
2021-05-27  7:23     ` Jan Beulich
2021-05-27 18:50     ` Jason Andryuk
2021-05-28  6:35       ` Jan Beulich
2021-06-03 11:55         ` Jason Andryuk [this message]
2021-06-04  6:39           ` Jan Beulich
2021-05-27  7:45   ` Jan Beulich
2021-05-03 19:28 ` [PATCH 05/13] xenpm: Change get-cpufreq-para output for internal Jason Andryuk
2021-05-26 15:21   ` Jan Beulich
2021-05-27  5:54     ` Jan Beulich
2021-05-03 19:28 ` [PATCH 06/13] cpufreq: Export HWP parameters to userspace Jason Andryuk
2021-05-27  7:55   ` Jan Beulich
2021-05-28 13:19     ` Jason Andryuk
2021-05-28 13:39       ` Jan Beulich
2021-05-03 19:28 ` [PATCH 07/13] libxc: Include hwp_para in definitions Jason Andryuk
2021-05-03 19:28 ` [PATCH 08/13] xenpm: Print HWP parameters Jason Andryuk
2021-05-27  8:02   ` Jan Beulich
2021-05-03 19:28 ` [PATCH 09/13] xen: Add SET_CPUFREQ_HWP xen_sysctl_pm_op Jason Andryuk
2021-05-27  8:33   ` Jan Beulich
2021-05-03 19:28 ` [PATCH 10/13] libxc: Add xc_set_cpufreq_hwp Jason Andryuk
2021-05-04  8:03   ` Jan Beulich
2021-05-04 11:31     ` Jason Andryuk
2021-05-27  9:45   ` Jan Beulich
2021-05-03 19:28 ` [PATCH 11/13] xenpm: Factor out a non-fatal cpuid_parse variant Jason Andryuk
2021-05-27  8:41   ` Jan Beulich
2021-05-03 19:28 ` [PATCH 12/13] xenpm: Add set-cpufreq-hwp subcommand Jason Andryuk
2021-05-27  9:46   ` Jan Beulich
2021-05-03 19:28 ` [PATCH 13/13] CHANGELOG: Add Intel HWP entry Jason Andryuk
2021-05-27  9:48   ` Jan Beulich
2021-05-20 14:57 ` [PATCH 00/13] Intel Hardware P-States (HWP) support Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKf6xpvCkzHOZsBY2yMQSVxq844_muaAaKd-JZUQfd7UCrXLVg@mail.gmail.com \
    --to=jandryuk@gmail.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=george.dunlap@citrix.com \
    --cc=iwj@xenproject.org \
    --cc=jbeulich@suse.com \
    --cc=julien@xen.org \
    --cc=roger.pau@citrix.com \
    --cc=sstabellini@kernel.org \
    --cc=wl@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).