From: Jason Andryuk <jandryuk@gmail.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: "Andrew Cooper" <andrew.cooper3@citrix.com>,
"George Dunlap" <george.dunlap@citrix.com>,
"Ian Jackson" <iwj@xenproject.org>,
"Julien Grall" <julien@xen.org>,
"Stefano Stabellini" <sstabellini@kernel.org>,
"Wei Liu" <wl@xen.org>, "Roger Pau Monné" <roger.pau@citrix.com>,
xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH 04/13] cpufreq: Add Hardware P-State (HWP) driver
Date: Thu, 3 Jun 2021 07:55:00 -0400 [thread overview]
Message-ID: <CAKf6xpvCkzHOZsBY2yMQSVxq844_muaAaKd-JZUQfd7UCrXLVg@mail.gmail.com> (raw)
In-Reply-To: <2c3400e8-8236-8558-08e4-37c8b1494de7@suse.com>
On Fri, May 28, 2021 at 2:35 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 27.05.2021 20:50, Jason Andryuk wrote:
> > On Wed, May 26, 2021 at 11:00 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 03.05.2021 21:28, Jason Andryuk wrote:
> >>> + hwp_verbose("HWP: FAST_IA32_HWP_REQUEST %ssupported\n",
> >>> + eax & CPUID6_EAX_FAST_HWP_MSR ? "" : "not ");
> >>> + if ( eax & CPUID6_EAX_FAST_HWP_MSR )
> >>> + {
> >>> + if ( rdmsr_safe(MSR_FAST_UNCORE_MSRS_CAPABILITY, val) )
> >>> + hwp_err("error rdmsr_safe(MSR_FAST_UNCORE_MSRS_CAPABILITY)\n");
> >>> +
> >>> + hwp_verbose("HWP: MSR_FAST_UNCORE_MSRS_CAPABILITY: %016lx\n", val);
> >>
> >> Missing "else" above here?
> >
> > Are unbalanced braces acceptable or must they be balanced? Is this acceptable:
> > if ()
> > one;
> > else {
> > two;
> > three;
> > }
>
> Yes, it is. But I don't see how the question relates to my comment.
> All that needs to go in the else's body is the hwp_verbose().
'val' shouldn't be used to set features when the rdmsr fails, so the
following code needs to be within the else. Unless you want to rely
on a failed rdmsr returning 0.
> >>> +static void hdc_set_pkg_hdc_ctl(bool val)
> >>> +{
> >>> + uint64_t msr;
> >>> +
> >>> + if ( rdmsr_safe(MSR_IA32_PKG_HDC_CTL, msr) )
> >>> + {
> >>> + hwp_err("error rdmsr_safe(MSR_IA32_PKG_HDC_CTL)\n");
> >>
> >> I'm not convinced of the need of having such log messages after
> >> failed rdmsr/wrmsr, but I'm definitely against them getting logged
> >> unconditionally. In verbose mode, maybe.
> >
> > We should not fail the rdmsr if our earlier cpuid check passed. So in
> > that respect these messages can be removed. The benefit here is that
> > you can see which MSR failed. If you relied on extable_fixup, you
> > would have to cross reference manually. How will administors know if
> > hwp isn't working properly there are no messages?
>
> This same question would go for various other MSR accesses which
> might fail, but which aren't accompanied by an explicit log message.
> I wouldn't mind a single summary message reporting if e.g. HWP
> setup failed. More detailed analysis of such would be more of a
> developer's than an administrator's job then anyway.
>
> > For HWP in general, the SDM says to check CPUID for the availability
> > of MSRs. Therefore rdmsr/wrmsr shouldn't fail. During development, I
> > saw wrmsr failures when with "Valid Maximum" and other "Valid" bits
> > set. I think that was because I hadn't set up the Package Request
> > MSR. That has been fixed by not using those bits. Anyway,
> > rdmsr/wrmsr shouldn't fail, so how much code should be put into
> > checking for those failures which shouldn't happen?
>
> If there's any risk of accesses causing a fault, using *msr_safe()
> is of course the right choice. Beyond that you will need to consider
> what the consequences are. Sadly this needs doing on every single
> case individually. See "Handling unexpected conditions" section of
> ./CODING_STYLE for guidelines (even if the specific constructs
> aren't in use here).
Using *msr_safe(), I think the worst case is the users can't set HWP
in the way they want. So power/performance may not be what the users
wanted, but Xen won't crash. The hardware will throttle itself if
needed for self-protection, so that should be okay as well.
> >>> +static void hdc_set_pm_ctl1(bool val)
> >>> +{
> >>> + uint64_t msr;
> >>> +
> >>> + if ( rdmsr_safe(MSR_IA32_PM_CTL1, msr) )
> >>> + {
> >>> + hwp_err("error rdmsr_safe(MSR_IA32_PM_CTL1)\n");
> >>> +
> >>> + return;
> >>> + }
> >>> +
> >>> + msr = val ? IA32_PM_CTL1_HDC_Allow_Block : 0;
> >>
> >> Same here then, and ...
> >>
> >>> +static void hwp_fast_uncore_msrs_ctl(bool val)
> >>> +{
> >>> + uint64_t msr;
> >>> +
> >>> + if ( rdmsr_safe(MSR_FAST_UNCORE_MSRS_CTL, msr) )
> >>> + hwp_err("error rdmsr_safe(MSR_FAST_UNCORE_MSRS_CTL)\n");
> >>> +
> >>> + msr = val;
> >>
> >> ... here (where you imply bit 0 instead of using a proper
> >> constant).
> >>
> >> Also for all three functions I'm not convinced their names are
> >> in good sync with their parameters being boolean.
> >
> > Would you prefer something named closer to the bit names like:
> > hdc_set_pkg_hdc_ctl -> hdc_set_pkg_enable
> > hdc_set_pm_ctl1 -> hdc_set_allow_block
> > hwp_fast_uncore_msrs_ctl -> hwp_fast_request_enable
>
> My primary request is for function name, purpose, and parameter(s)
> to be in line. So e.g. if you left the parameters as boolean, then
> I think your suggested name changes make sense. Alternatively you
> could make the functions e.g. be full register updating ones, with
> suitable parameters, which could be a mask-to-set and mask-to-clear.
I'm going to use the replacement names while keeping the boolean
argument. Those MSRs only have single bits defined today, so
functions with boolean parameters are simpler.
> >>> +static void hwp_read_capabilities(void *info)
> >>> +{
> >>> + struct cpufreq_policy *policy = info;
> >>> + struct hwp_drv_data *data = hwp_drv_data[policy->cpu];
> >>> +
> >>> + if ( rdmsr_safe(MSR_IA32_HWP_CAPABILITIES, data->hwp_caps) )
> >>> + {
> >>> + hwp_err("CPU%u: error rdmsr_safe(MSR_IA32_HWP_CAPABILITIES)\n",
> >>> + policy->cpu);
> >>> +
> >>> + return;
> >>> + }
> >>> +
> >>> + if ( rdmsr_safe(MSR_IA32_HWP_REQUEST, data->curr_req.raw) )
> >>> + {
> >>> + hwp_err("CPU%u: error rdmsr_safe(MSR_IA32_HWP_REQUEST)\n", policy->cpu);
> >>> +
> >>> + return;
> >>> + }
> >>> +}
> >>
> >> This function doesn't indicate failure to its caller(s), so am I
> >> to understand that failure to read either ofthe MSRs is actually
> >> benign to the driver?
> >
> > They really shouldn't fail since the CPUID check passed during
> > initialization. If you can't read/write MSRs, something is broken and
> > the driver cannot function. The machine would still run, but HWP
> > would be uncontrollable. How much care should be given to
> > "impossible" situations?
>
> See above. The main point is to avoid crashing. In the specific
> case here I think you could simply drop both the log messages and
> the early return, assuming the caller can deal fine with the zero
> value(s) that rdmsr_safe() will substitute for the MSR value(s).
> Bailing early, otoh, calls for handing back an error indicator
> imo.
I changed it to have failure set curr_req.raw = -1. Then
cpufreq_driver.init returns -ENODEV in that case.
> >>> + policy->governor = &hwp_cpufreq_governor;
> >>> +
> >>> + data = xzalloc(typeof(*data));
> >>
> >> Commonly we specify the type explicitly in such cases, rather than using
> >> typeof(). I will admit though that I'm not entirely certain which one's
> >> better. But consistency across the code base is perhaps preferable for
> >> the time being.
> >
> > I thought typeof(*data) is always preferable since it will always be
> > the matching type. I can change it, but I feel like it's a step
> > backwards.
>
> There's exactly one similar use in the entire code base. The original
> idea with xmalloc() was that one would explicitly specify the intended
> type, such that without looking elsewhere one can see what exactly is
> to be allocated. One could further rely on the compiler warning if
> there was a disagreement between declaration and assignment.
Oh, okay. Thanks for the explanation of xmalloc().
> >>> + if ( feature_hwp_energy_perf )
> >>> + data->energy_perf = 0x80;
> >>> + else
> >>> + data->energy_perf = 7;
> >>
> >> Where's this 7 coming from? (You do mention the 0x80 at least in the
> >> description.)
> >
> > When HWP energy performance preference is unavailable, it falls back
> > to IA32_ENERGY_PERF_BIAS with a 0-15 range. Per the SDM Vol3 14.3.4,
> > "A value of 7 roughly translates into a hint to balance performance
> > with energy consumption." I will add a comment.
>
> Actually, as per a comment on a later patch, I'm instead expecting
> you to add a #define, the name of which will serve as comment.
Yes, sounds good.
Regards,
Jason
next prev parent reply other threads:[~2021-06-03 11:55 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-03 19:27 [PATCH 00/13] Intel Hardware P-States (HWP) support Jason Andryuk
2021-05-03 19:27 ` [PATCH 01/13] cpufreq: Allow restricting to internal governors only Jason Andryuk
2021-05-26 13:18 ` Jan Beulich
2021-05-26 14:12 ` Jason Andryuk
2021-05-26 15:09 ` Jan Beulich
2021-05-26 16:44 ` Jason Andryuk
2021-05-03 19:27 ` [PATCH 02/13] cpufreq: Add perf_freq to cpuinfo Jason Andryuk
2021-05-26 13:24 ` Jan Beulich
2021-05-26 14:19 ` Jason Andryuk
2021-05-03 19:28 ` [PATCH 03/13] cpufreq: Export intel_feature_detect Jason Andryuk
2021-05-26 13:27 ` Jan Beulich
2021-05-26 14:44 ` Jason Andryuk
2021-05-26 15:11 ` Jan Beulich
2021-05-03 19:28 ` [PATCH 04/13] cpufreq: Add Hardware P-State (HWP) driver Jason Andryuk
2021-05-26 14:59 ` Jan Beulich
2021-05-27 7:23 ` Jan Beulich
2021-05-27 18:50 ` Jason Andryuk
2021-05-28 6:35 ` Jan Beulich
2021-06-03 11:55 ` Jason Andryuk [this message]
2021-06-04 6:39 ` Jan Beulich
2021-05-27 7:45 ` Jan Beulich
2021-05-03 19:28 ` [PATCH 05/13] xenpm: Change get-cpufreq-para output for internal Jason Andryuk
2021-05-26 15:21 ` Jan Beulich
2021-05-27 5:54 ` Jan Beulich
2021-05-03 19:28 ` [PATCH 06/13] cpufreq: Export HWP parameters to userspace Jason Andryuk
2021-05-27 7:55 ` Jan Beulich
2021-05-28 13:19 ` Jason Andryuk
2021-05-28 13:39 ` Jan Beulich
2021-05-03 19:28 ` [PATCH 07/13] libxc: Include hwp_para in definitions Jason Andryuk
2021-05-03 19:28 ` [PATCH 08/13] xenpm: Print HWP parameters Jason Andryuk
2021-05-27 8:02 ` Jan Beulich
2021-05-03 19:28 ` [PATCH 09/13] xen: Add SET_CPUFREQ_HWP xen_sysctl_pm_op Jason Andryuk
2021-05-27 8:33 ` Jan Beulich
2021-05-03 19:28 ` [PATCH 10/13] libxc: Add xc_set_cpufreq_hwp Jason Andryuk
2021-05-04 8:03 ` Jan Beulich
2021-05-04 11:31 ` Jason Andryuk
2021-05-27 9:45 ` Jan Beulich
2021-05-03 19:28 ` [PATCH 11/13] xenpm: Factor out a non-fatal cpuid_parse variant Jason Andryuk
2021-05-27 8:41 ` Jan Beulich
2021-05-03 19:28 ` [PATCH 12/13] xenpm: Add set-cpufreq-hwp subcommand Jason Andryuk
2021-05-27 9:46 ` Jan Beulich
2021-05-03 19:28 ` [PATCH 13/13] CHANGELOG: Add Intel HWP entry Jason Andryuk
2021-05-27 9:48 ` Jan Beulich
2021-05-20 14:57 ` [PATCH 00/13] Intel Hardware P-States (HWP) support Jan Beulich
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAKf6xpvCkzHOZsBY2yMQSVxq844_muaAaKd-JZUQfd7UCrXLVg@mail.gmail.com \
--to=jandryuk@gmail.com \
--cc=andrew.cooper3@citrix.com \
--cc=george.dunlap@citrix.com \
--cc=iwj@xenproject.org \
--cc=jbeulich@suse.com \
--cc=julien@xen.org \
--cc=roger.pau@citrix.com \
--cc=sstabellini@kernel.org \
--cc=wl@xen.org \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).