* [PATCH] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
@ 2020-10-06 16:23 Roger Pau Monne
2020-10-07 12:06 ` Andrew Cooper
2020-10-13 13:49 ` Jan Beulich
0 siblings, 2 replies; 7+ messages in thread
From: Roger Pau Monne @ 2020-10-06 16:23 UTC (permalink / raw)
To: xen-devel
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu,
George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini
Currently a PV hardware domain can also be given control over the CPU
frequency, and such guest is allowed to write to MSR_IA32_PERF_CTL.
However since commit 322ec7c89f6 the default behavior has been changed
to reject accesses to not explicitly handled MSRs, preventing PV
guests that manage CPU frequency from reading
MSR_IA32_PERF_{STATUS/CTL}.
Additionally some HVM guests (Windows at least) will attempt to read
MSR_IA32_PERF_CTL and will panic if given back a #GP fault:
vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0
Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
handling shared between HVM and PV guests, and add an explicit case
for reads to MSR_IA32_PERF_{STATUS/CTL}.
Restore previous behavior and allow PV guests with the required
permissions to read the contents of the mentioned MSRs. Non privileged
guests will get 0 when trying to read those registers, as writes to
MSR_IA32_PERF_CTL by such guest will already be silently dropped.
Fixes: 322ec7c89f6 ('x86/pv: disallow access to unknown MSRs')
Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
xen/arch/x86/msr.c | 20 ++++++++++++++++++++
xen/arch/x86/pv/emul-priv-op.c | 14 --------------
xen/include/xen/sched.h | 6 ++++++
3 files changed, 26 insertions(+), 14 deletions(-)
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 81b34fb212..e4c4fa6127 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -242,6 +242,17 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
goto gp_fault;
break;
+ case MSR_IA32_PERF_STATUS:
+ case MSR_IA32_PERF_CTL:
+ if ( cp->x86_vendor != X86_VENDOR_INTEL )
+ goto gp_fault;
+ *val = 0;
+ if ( likely(!is_cpufreq_controller(d)) ||
+ boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
+ rdmsr_safe(msr, *val) == 0 )
+ break;
+ goto gp_fault;
+
case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
if ( !is_hvm_domain(d) || v != curr )
goto gp_fault;
@@ -442,6 +453,15 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
goto gp_fault;
break;
+ case MSR_IA32_PERF_CTL:
+ if ( cp->x86_vendor != X86_VENDOR_INTEL )
+ goto gp_fault;
+ if ( likely(!is_cpufreq_controller(d)) ||
+ boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
+ wrmsr_safe(msr, val) == 0 )
+ break;
+ goto gp_fault;
+
case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
if ( !is_hvm_domain(d) || v != curr )
goto gp_fault;
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index 7cc16d6eda..dbceed8a05 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -849,12 +849,6 @@ static inline uint64_t guest_misc_enable(uint64_t val)
return val;
}
-static inline bool is_cpufreq_controller(const struct domain *d)
-{
- return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
- is_hardware_domain(d));
-}
-
static uint64_t guest_efer(const struct domain *d)
{
uint64_t val;
@@ -1121,14 +1115,6 @@ static int write_msr(unsigned int reg, uint64_t val,
return X86EMUL_OKAY;
break;
- case MSR_IA32_PERF_CTL:
- if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
- break;
- if ( likely(!is_cpufreq_controller(currd)) ||
- wrmsr_safe(reg, val) == 0 )
- return X86EMUL_OKAY;
- break;
-
case MSR_IA32_THERM_CONTROL:
case MSR_IA32_ENERGY_PERF_BIAS:
if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index d8ed83f869..41baa3b7a1 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -1069,6 +1069,12 @@ extern enum cpufreq_controller {
FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
} cpufreq_controller;
+static inline bool is_cpufreq_controller(const struct domain *d)
+{
+ return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
+ is_hardware_domain(d));
+}
+
int cpupool_move_domain(struct domain *d, struct cpupool *c);
int cpupool_do_sysctl(struct xen_sysctl_cpupool_op *op);
int cpupool_get_id(const struct domain *d);
--
2.28.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
2020-10-06 16:23 [PATCH] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL} Roger Pau Monne
@ 2020-10-07 12:06 ` Andrew Cooper
2020-10-07 16:41 ` Roger Pau Monné
2020-10-13 13:49 ` Jan Beulich
1 sibling, 1 reply; 7+ messages in thread
From: Andrew Cooper @ 2020-10-07 12:06 UTC (permalink / raw)
To: Roger Pau Monne, xen-devel
Cc: Jan Beulich, Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
Stefano Stabellini
On 06/10/2020 17:23, Roger Pau Monne wrote:
> Currently a PV hardware domain can also be given control over the CPU
> frequency, and such guest is allowed to write to MSR_IA32_PERF_CTL.
This might be how the current logic "works", but its straight up broken.
PERF_CTL is thread scope, so unless dom0 is identity pinned and has one
vcpu for every pcpu, it cannot use the interface correctly.
> However since commit 322ec7c89f6 the default behavior has been changed
> to reject accesses to not explicitly handled MSRs, preventing PV
> guests that manage CPU frequency from reading
> MSR_IA32_PERF_{STATUS/CTL}.
>
> Additionally some HVM guests (Windows at least) will attempt to read
> MSR_IA32_PERF_CTL and will panic if given back a #GP fault:
>
> vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
> d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0
>
> Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
> handling shared between HVM and PV guests, and add an explicit case
> for reads to MSR_IA32_PERF_{STATUS/CTL}.
OTOH, PERF_CTL does have a seemingly architectural "please disable turbo
for me" bit, which is supposed to be for calibration loops. I wonder if
anyone uses this, and whether we ought to honour it (probably not).
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index d8ed83f869..41baa3b7a1 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -1069,6 +1069,12 @@ extern enum cpufreq_controller {
> FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
> } cpufreq_controller;
>
> +static inline bool is_cpufreq_controller(const struct domain *d)
> +{
> + return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
> + is_hardware_domain(d));
This won't compile on !CONFIG_X86, due to CONFIG_HAS_CPUFREQ
Honestly - I don't see any point to this code. Its opt-in via the
command line only, and doesn't provide adequate checks for enablement.
(It's not as if we're lacking complexity or moving parts when it comes
to power/frequency management).
~Andrew
> +}
> +
> int cpupool_move_domain(struct domain *d, struct cpupool *c);
> int cpupool_do_sysctl(struct xen_sysctl_cpupool_op *op);
> int cpupool_get_id(const struct domain *d);
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
2020-10-07 12:06 ` Andrew Cooper
@ 2020-10-07 16:41 ` Roger Pau Monné
2020-10-13 13:44 ` Jan Beulich
2020-10-15 13:34 ` Roger Pau Monné
0 siblings, 2 replies; 7+ messages in thread
From: Roger Pau Monné @ 2020-10-07 16:41 UTC (permalink / raw)
To: Andrew Cooper
Cc: xen-devel, Jan Beulich, Wei Liu, George Dunlap, Ian Jackson,
Julien Grall, Stefano Stabellini
On Wed, Oct 07, 2020 at 01:06:08PM +0100, Andrew Cooper wrote:
> On 06/10/2020 17:23, Roger Pau Monne wrote:
> > Currently a PV hardware domain can also be given control over the CPU
> > frequency, and such guest is allowed to write to MSR_IA32_PERF_CTL.
>
> This might be how the current logic "works", but its straight up broken.
>
> PERF_CTL is thread scope, so unless dom0 is identity pinned and has one
> vcpu for every pcpu, it cannot use the interface correctly.
Selecting cpufreq=dom0-kernel will force vCPU pinning. I'm not able
however to see anywhere that would force dom0 vCPUs == pCPUs.
> > However since commit 322ec7c89f6 the default behavior has been changed
> > to reject accesses to not explicitly handled MSRs, preventing PV
> > guests that manage CPU frequency from reading
> > MSR_IA32_PERF_{STATUS/CTL}.
> >
> > Additionally some HVM guests (Windows at least) will attempt to read
> > MSR_IA32_PERF_CTL and will panic if given back a #GP fault:
> >
> > vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
> > d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0
> >
> > Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
> > handling shared between HVM and PV guests, and add an explicit case
> > for reads to MSR_IA32_PERF_{STATUS/CTL}.
>
> OTOH, PERF_CTL does have a seemingly architectural "please disable turbo
> for me" bit, which is supposed to be for calibration loops. I wonder if
> anyone uses this, and whether we ought to honour it (probably not).
If we let guests play with this we would have to save/restore the
guest value on context switch. Unless there's a strong case for this,
I would say no.
> > diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> > index d8ed83f869..41baa3b7a1 100644
> > --- a/xen/include/xen/sched.h
> > +++ b/xen/include/xen/sched.h
> > @@ -1069,6 +1069,12 @@ extern enum cpufreq_controller {
> > FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
> > } cpufreq_controller;
> >
> > +static inline bool is_cpufreq_controller(const struct domain *d)
> > +{
> > + return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
> > + is_hardware_domain(d));
>
> This won't compile on !CONFIG_X86, due to CONFIG_HAS_CPUFREQ
It does seem to build on Arm, because this is only used in x86 code:
https://gitlab.com/xen-project/people/royger/xen/-/jobs/778207412
The extern declaration of cpufreq_controller is just above, so if you
tried to use is_cpufreq_controller on Arm you would get a link time
error, otherwise it builds fine. The compiler removes the function on
Arm as it has the inline attribute and it's not used.
Alternatively I could look into moving cpufreq_controller (and
is_cpufreq_controller) out of sched.h into somewhere else, I haven't
looked at why it needs to live there.
> Honestly - I don't see any point to this code. Its opt-in via the
> command line only, and doesn't provide adequate checks for enablement.
> (It's not as if we're lacking complexity or moving parts when it comes
> to power/frequency management).
Right, I could do a pre-patch to remove this, but I also don't think
we should block this fix on removing FREQCTL_dom0_kernel, so I would
rather fix the regression and then remove the feature if we agree it
can be removed.
Thanks, Roger.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
2020-10-07 16:41 ` Roger Pau Monné
@ 2020-10-13 13:44 ` Jan Beulich
2020-10-15 13:34 ` Roger Pau Monné
1 sibling, 0 replies; 7+ messages in thread
From: Jan Beulich @ 2020-10-13 13:44 UTC (permalink / raw)
To: Roger Pau Monné
Cc: Andrew Cooper, xen-devel, Wei Liu, George Dunlap, Ian Jackson,
Julien Grall, Stefano Stabellini
On 07.10.2020 18:41, Roger Pau Monné wrote:
> On Wed, Oct 07, 2020 at 01:06:08PM +0100, Andrew Cooper wrote:
>> On 06/10/2020 17:23, Roger Pau Monne wrote:
>>> Currently a PV hardware domain can also be given control over the CPU
>>> frequency, and such guest is allowed to write to MSR_IA32_PERF_CTL.
>>
>> This might be how the current logic "works", but its straight up broken.
>>
>> PERF_CTL is thread scope, so unless dom0 is identity pinned and has one
>> vcpu for every pcpu, it cannot use the interface correctly.
>
> Selecting cpufreq=dom0-kernel will force vCPU pinning. I'm not able
> however to see anywhere that would force dom0 vCPUs == pCPUs.
Unless there are other overriding command line options, doesn't the
way sched_select_initial_cpu() works guarantee this?
>>> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
>>> index d8ed83f869..41baa3b7a1 100644
>>> --- a/xen/include/xen/sched.h
>>> +++ b/xen/include/xen/sched.h
>>> @@ -1069,6 +1069,12 @@ extern enum cpufreq_controller {
>>> FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
>>> } cpufreq_controller;
>>>
>>> +static inline bool is_cpufreq_controller(const struct domain *d)
>>> +{
>>> + return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
>>> + is_hardware_domain(d));
>>
>> This won't compile on !CONFIG_X86, due to CONFIG_HAS_CPUFREQ
>
> It does seem to build on Arm, because this is only used in x86 code:
>
> https://gitlab.com/xen-project/people/royger/xen/-/jobs/778207412
>
> The extern declaration of cpufreq_controller is just above, so if you
> tried to use is_cpufreq_controller on Arm you would get a link time
> error, otherwise it builds fine. The compiler removes the function on
> Arm as it has the inline attribute and it's not used.
>
> Alternatively I could look into moving cpufreq_controller (and
> is_cpufreq_controller) out of sched.h into somewhere else, I haven't
> looked at why it needs to live there.
>
>> Honestly - I don't see any point to this code. Its opt-in via the
>> command line only, and doesn't provide adequate checks for enablement.
>> (It's not as if we're lacking complexity or moving parts when it comes
>> to power/frequency management).
>
> Right, I could do a pre-patch to remove this, but I also don't think
> we should block this fix on removing FREQCTL_dom0_kernel, so I would
> rather fix the regression and then remove the feature if we agree it
> can be removed.
I agree.
Jan
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
2020-10-06 16:23 [PATCH] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL} Roger Pau Monne
2020-10-07 12:06 ` Andrew Cooper
@ 2020-10-13 13:49 ` Jan Beulich
1 sibling, 0 replies; 7+ messages in thread
From: Jan Beulich @ 2020-10-13 13:49 UTC (permalink / raw)
To: Roger Pau Monne
Cc: xen-devel, Andrew Cooper, Wei Liu, George Dunlap, Ian Jackson,
Julien Grall, Stefano Stabellini
On 06.10.2020 18:23, Roger Pau Monne wrote:
> Currently a PV hardware domain can also be given control over the CPU
> frequency, and such guest is allowed to write to MSR_IA32_PERF_CTL.
> However since commit 322ec7c89f6 the default behavior has been changed
> to reject accesses to not explicitly handled MSRs, preventing PV
> guests that manage CPU frequency from reading
> MSR_IA32_PERF_{STATUS/CTL}.
>
> Additionally some HVM guests (Windows at least) will attempt to read
> MSR_IA32_PERF_CTL and will panic if given back a #GP fault:
>
> vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
> d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0
>
> Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
> handling shared between HVM and PV guests, and add an explicit case
> for reads to MSR_IA32_PERF_{STATUS/CTL}.
>
> Restore previous behavior and allow PV guests with the required
> permissions to read the contents of the mentioned MSRs. Non privileged
> guests will get 0 when trying to read those registers, as writes to
> MSR_IA32_PERF_CTL by such guest will already be silently dropped.
>
> Fixes: 322ec7c89f6 ('x86/pv: disallow access to unknown MSRs')
> Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
I would have given this my R-b, but Andrew's "straight up broken"
comment needs resolving first, one way or another.
Jan
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
2020-10-07 16:41 ` Roger Pau Monné
2020-10-13 13:44 ` Jan Beulich
@ 2020-10-15 13:34 ` Roger Pau Monné
2020-11-09 13:21 ` Roger Pau Monné
1 sibling, 1 reply; 7+ messages in thread
From: Roger Pau Monné @ 2020-10-15 13:34 UTC (permalink / raw)
To: Andrew Cooper
Cc: xen-devel, Jan Beulich, Wei Liu, George Dunlap, Ian Jackson,
Julien Grall, Stefano Stabellini
On Wed, Oct 07, 2020 at 06:41:17PM +0200, Roger Pau Monné wrote:
> On Wed, Oct 07, 2020 at 01:06:08PM +0100, Andrew Cooper wrote:
> > On 06/10/2020 17:23, Roger Pau Monne wrote:
> > > Currently a PV hardware domain can also be given control over the CPU
> > > frequency, and such guest is allowed to write to MSR_IA32_PERF_CTL.
> >
> > This might be how the current logic "works", but its straight up broken.
> >
> > PERF_CTL is thread scope, so unless dom0 is identity pinned and has one
> > vcpu for every pcpu, it cannot use the interface correctly.
>
> Selecting cpufreq=dom0-kernel will force vCPU pinning. I'm not able
> however to see anywhere that would force dom0 vCPUs == pCPUs.
>
> > > However since commit 322ec7c89f6 the default behavior has been changed
> > > to reject accesses to not explicitly handled MSRs, preventing PV
> > > guests that manage CPU frequency from reading
> > > MSR_IA32_PERF_{STATUS/CTL}.
> > >
> > > Additionally some HVM guests (Windows at least) will attempt to read
> > > MSR_IA32_PERF_CTL and will panic if given back a #GP fault:
> > >
> > > vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
> > > d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0
> > >
> > > Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
> > > handling shared between HVM and PV guests, and add an explicit case
> > > for reads to MSR_IA32_PERF_{STATUS/CTL}.
> >
> > OTOH, PERF_CTL does have a seemingly architectural "please disable turbo
> > for me" bit, which is supposed to be for calibration loops. I wonder if
> > anyone uses this, and whether we ought to honour it (probably not).
>
> If we let guests play with this we would have to save/restore the
> guest value on context switch. Unless there's a strong case for this,
> I would say no.
>
> > > diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> > > index d8ed83f869..41baa3b7a1 100644
> > > --- a/xen/include/xen/sched.h
> > > +++ b/xen/include/xen/sched.h
> > > @@ -1069,6 +1069,12 @@ extern enum cpufreq_controller {
> > > FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
> > > } cpufreq_controller;
> > >
> > > +static inline bool is_cpufreq_controller(const struct domain *d)
> > > +{
> > > + return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
> > > + is_hardware_domain(d));
> >
> > This won't compile on !CONFIG_X86, due to CONFIG_HAS_CPUFREQ
>
> It does seem to build on Arm, because this is only used in x86 code:
>
> https://gitlab.com/xen-project/people/royger/xen/-/jobs/778207412
>
> The extern declaration of cpufreq_controller is just above, so if you
> tried to use is_cpufreq_controller on Arm you would get a link time
> error, otherwise it builds fine. The compiler removes the function on
> Arm as it has the inline attribute and it's not used.
>
> Alternatively I could look into moving cpufreq_controller (and
> is_cpufreq_controller) out of sched.h into somewhere else, I haven't
> looked at why it needs to live there.
>
> > Honestly - I don't see any point to this code. Its opt-in via the
> > command line only, and doesn't provide adequate checks for enablement.
> > (It's not as if we're lacking complexity or moving parts when it comes
> > to power/frequency management).
>
> Right, I could do a pre-patch to remove this, but I also don't think
> we should block this fix on removing FREQCTL_dom0_kernel, so I would
> rather fix the regression and then remove the feature if we agree it
> can be removed.
Can we get some consensus on what to do next?
I think I've provided replies to all the points above, and I'm not sure
what do to next in order to proceed with this patch.
Thanks, Roger.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
2020-10-15 13:34 ` Roger Pau Monné
@ 2020-11-09 13:21 ` Roger Pau Monné
0 siblings, 0 replies; 7+ messages in thread
From: Roger Pau Monné @ 2020-11-09 13:21 UTC (permalink / raw)
To: Andrew Cooper
Cc: xen-devel, Jan Beulich, Wei Liu, George Dunlap, Ian Jackson,
Julien Grall, Stefano Stabellini
Ping?
On Thu, Oct 15, 2020 at 03:34:12PM +0200, Roger Pau Monné wrote:
> On Wed, Oct 07, 2020 at 06:41:17PM +0200, Roger Pau Monné wrote:
> > On Wed, Oct 07, 2020 at 01:06:08PM +0100, Andrew Cooper wrote:
> > > On 06/10/2020 17:23, Roger Pau Monne wrote:
> > > > Currently a PV hardware domain can also be given control over the CPU
> > > > frequency, and such guest is allowed to write to MSR_IA32_PERF_CTL.
> > >
> > > This might be how the current logic "works", but its straight up broken.
> > >
> > > PERF_CTL is thread scope, so unless dom0 is identity pinned and has one
> > > vcpu for every pcpu, it cannot use the interface correctly.
> >
> > Selecting cpufreq=dom0-kernel will force vCPU pinning. I'm not able
> > however to see anywhere that would force dom0 vCPUs == pCPUs.
> >
> > > > However since commit 322ec7c89f6 the default behavior has been changed
> > > > to reject accesses to not explicitly handled MSRs, preventing PV
> > > > guests that manage CPU frequency from reading
> > > > MSR_IA32_PERF_{STATUS/CTL}.
> > > >
> > > > Additionally some HVM guests (Windows at least) will attempt to read
> > > > MSR_IA32_PERF_CTL and will panic if given back a #GP fault:
> > > >
> > > > vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
> > > > d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0
> > > >
> > > > Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
> > > > handling shared between HVM and PV guests, and add an explicit case
> > > > for reads to MSR_IA32_PERF_{STATUS/CTL}.
> > >
> > > OTOH, PERF_CTL does have a seemingly architectural "please disable turbo
> > > for me" bit, which is supposed to be for calibration loops. I wonder if
> > > anyone uses this, and whether we ought to honour it (probably not).
> >
> > If we let guests play with this we would have to save/restore the
> > guest value on context switch. Unless there's a strong case for this,
> > I would say no.
> >
> > > > diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> > > > index d8ed83f869..41baa3b7a1 100644
> > > > --- a/xen/include/xen/sched.h
> > > > +++ b/xen/include/xen/sched.h
> > > > @@ -1069,6 +1069,12 @@ extern enum cpufreq_controller {
> > > > FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
> > > > } cpufreq_controller;
> > > >
> > > > +static inline bool is_cpufreq_controller(const struct domain *d)
> > > > +{
> > > > + return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
> > > > + is_hardware_domain(d));
> > >
> > > This won't compile on !CONFIG_X86, due to CONFIG_HAS_CPUFREQ
> >
> > It does seem to build on Arm, because this is only used in x86 code:
> >
> > https://gitlab.com/xen-project/people/royger/xen/-/jobs/778207412
> >
> > The extern declaration of cpufreq_controller is just above, so if you
> > tried to use is_cpufreq_controller on Arm you would get a link time
> > error, otherwise it builds fine. The compiler removes the function on
> > Arm as it has the inline attribute and it's not used.
> >
> > Alternatively I could look into moving cpufreq_controller (and
> > is_cpufreq_controller) out of sched.h into somewhere else, I haven't
> > looked at why it needs to live there.
> >
> > > Honestly - I don't see any point to this code. Its opt-in via the
> > > command line only, and doesn't provide adequate checks for enablement.
> > > (It's not as if we're lacking complexity or moving parts when it comes
> > > to power/frequency management).
> >
> > Right, I could do a pre-patch to remove this, but I also don't think
> > we should block this fix on removing FREQCTL_dom0_kernel, so I would
> > rather fix the regression and then remove the feature if we agree it
> > can be removed.
>
> Can we get some consensus on what to do next?
>
> I think I've provided replies to all the points above, and I'm not sure
> what do to next in order to proceed with this patch.
>
> Thanks, Roger.
>
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2020-11-09 13:25 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-06 16:23 [PATCH] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL} Roger Pau Monne
2020-10-07 12:06 ` Andrew Cooper
2020-10-07 16:41 ` Roger Pau Monné
2020-10-13 13:44 ` Jan Beulich
2020-10-15 13:34 ` Roger Pau Monné
2020-11-09 13:21 ` Roger Pau Monné
2020-10-13 13:49 ` Jan Beulich
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).