* [PATCH v2] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
@ 2020-11-09 17:38 Andrew Cooper
2020-11-09 18:31 ` Roger Pau Monné
2020-11-10 8:03 ` Jan Beulich
0 siblings, 2 replies; 7+ messages in thread
From: Andrew Cooper @ 2020-11-09 17:38 UTC (permalink / raw)
To: Xen-devel; +Cc: Roger Pau Monné, Andrew Cooper, Jan Beulich, Wei Liu
From: Roger Pau Monné <roger.pau@citrix.com>
Currently a PV hardware domain can also be given control over the CPU
frequency, and such guest is allowed to write to MSR_IA32_PERF_CTL.
However since commit 322ec7c89f6 the default behavior has been changed
to reject accesses to not explicitly handled MSRs, preventing PV
guests that manage CPU frequency from reading
MSR_IA32_PERF_{STATUS/CTL}.
Additionally some HVM guests (Windows at least) will attempt to read
MSR_IA32_PERF_CTL and will panic if given back a #GP fault:
vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0
Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
handling shared between HVM and PV guests, and add an explicit case
for reads to MSR_IA32_PERF_{STATUS/CTL}.
Restore previous behavior and allow PV guests with the required
permissions to read the contents of the mentioned MSRs. Non privileged
guests will get 0 when trying to read those registers, as writes to
MSR_IA32_PERF_CTL by such guest will already be silently dropped.
Fixes: 322ec7c89f6 ('x86/pv: disallow access to unknown MSRs')
Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
v2:
* fix is_cpufreq_controller() to exclude PVH dom0, and collapse to nothing in
!CONFIG_PV builds
* Drop the cross-vendor checks. It isn't possible to configure dom0 as
cross-vendor, and anyone using is_cpufreq_controller() without an exact
model match has far bigger problems.
* At least Centaur implements these MSRs. Add access.
---
xen/arch/x86/msr.c | 34 ++++++++++++++++++++++++++++++++++
xen/arch/x86/pv/emul-priv-op.c | 14 --------------
xen/include/xen/sched.h | 17 +++++++++++++++++
3 files changed, 51 insertions(+), 14 deletions(-)
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 9c69ef8792..0a8ae4d22c 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -242,6 +242,25 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
goto gp_fault;
break;
+ /*
+ * These MSRs are not enumerated in CPUID. They have been around
+ * since the Pentium 4, and implemented by other vendors.
+ *
+ * Some versions of Windows try reading these before setting up a #GP
+ * handler, and Linux has several unguarded reads as well. Provide
+ * RAZ semantics, in general, but permit a cpufreq controller dom0 to
+ * have full access.
+ */
+ case MSR_IA32_PERF_STATUS:
+ case MSR_IA32_PERF_CTL:
+ if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+ goto gp_fault;
+
+ *val = 0;
+ if ( likely(!is_cpufreq_controller(d)) || rdmsr_safe(msr, *val) == 0 )
+ break;
+ goto gp_fault;
+
case MSR_IA32_THERM_STATUS:
if ( cp->x86_vendor != X86_VENDOR_INTEL )
goto gp_fault;
@@ -448,6 +467,21 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
goto gp_fault;
break;
+ /*
+ * This MSR are not enumerated in CPUID. It has been around since the
+ * Pentium 4, and implemented by other vendors.
+ *
+ * To match the RAZ semantics, implement as write-discard, except for
+ * a cpufreq controller dom0 which has full access.
+ */
+ case MSR_IA32_PERF_CTL:
+ if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+ goto gp_fault;
+
+ if ( likely(!is_cpufreq_controller(d)) || wrmsr_safe(msr, val) == 0 )
+ break;
+ goto gp_fault;
+
case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
if ( !is_hvm_domain(d) || v != curr )
goto gp_fault;
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index 7cc16d6eda..dbceed8a05 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -849,12 +849,6 @@ static inline uint64_t guest_misc_enable(uint64_t val)
return val;
}
-static inline bool is_cpufreq_controller(const struct domain *d)
-{
- return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
- is_hardware_domain(d));
-}
-
static uint64_t guest_efer(const struct domain *d)
{
uint64_t val;
@@ -1121,14 +1115,6 @@ static int write_msr(unsigned int reg, uint64_t val,
return X86EMUL_OKAY;
break;
- case MSR_IA32_PERF_CTL:
- if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
- break;
- if ( likely(!is_cpufreq_controller(currd)) ||
- wrmsr_safe(reg, val) == 0 )
- return X86EMUL_OKAY;
- break;
-
case MSR_IA32_THERM_CONTROL:
case MSR_IA32_ENERGY_PERF_BIAS:
if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index d8ed83f869..b4d3e53310 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -1069,6 +1069,23 @@ extern enum cpufreq_controller {
FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
} cpufreq_controller;
+static always_inline bool is_cpufreq_controller(const struct domain *d)
+{
+ /*
+ * A PV dom0 can be nominated as the cpufreq controller, instead of using
+ * Xen's cpufreq driver, at which point dom0 gets direct access to certain
+ * MSRs.
+ *
+ * This interface only works when dom0 is identity pinned and has the same
+ * number of vCPUs as pCPUs on the system.
+ *
+ * It would be far better to paravirtualise the interface.
+ */
+ return (IS_ENABLED(CONFIG_PV) &&
+ (cpufreq_controller == FREQCTL_dom0_kernel) &&
+ is_pv_domain(d) && is_hardware_domain(d));
+}
+
int cpupool_move_domain(struct domain *d, struct cpupool *c);
int cpupool_do_sysctl(struct xen_sysctl_cpupool_op *op);
int cpupool_get_id(const struct domain *d);
--
2.11.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v2] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
2020-11-09 17:38 [PATCH v2] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL} Andrew Cooper
@ 2020-11-09 18:31 ` Roger Pau Monné
2020-11-10 8:04 ` Jan Beulich
2020-11-10 8:03 ` Jan Beulich
1 sibling, 1 reply; 7+ messages in thread
From: Roger Pau Monné @ 2020-11-09 18:31 UTC (permalink / raw)
To: Andrew Cooper; +Cc: Xen-devel, Jan Beulich, Wei Liu
On Mon, Nov 09, 2020 at 05:38:19PM +0000, Andrew Cooper wrote:
> From: Roger Pau Monné <roger.pau@citrix.com>
>
> Currently a PV hardware domain can also be given control over the CPU
> frequency, and such guest is allowed to write to MSR_IA32_PERF_CTL.
> However since commit 322ec7c89f6 the default behavior has been changed
> to reject accesses to not explicitly handled MSRs, preventing PV
> guests that manage CPU frequency from reading
> MSR_IA32_PERF_{STATUS/CTL}.
>
> Additionally some HVM guests (Windows at least) will attempt to read
> MSR_IA32_PERF_CTL and will panic if given back a #GP fault:
>
> vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
> d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0
>
> Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
> handling shared between HVM and PV guests, and add an explicit case
> for reads to MSR_IA32_PERF_{STATUS/CTL}.
>
> Restore previous behavior and allow PV guests with the required
> permissions to read the contents of the mentioned MSRs. Non privileged
> guests will get 0 when trying to read those registers, as writes to
> MSR_IA32_PERF_CTL by such guest will already be silently dropped.
>
> Fixes: 322ec7c89f6 ('x86/pv: disallow access to unknown MSRs')
> Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Wei Liu <wl@xen.org>
>
> v2:
> * fix is_cpufreq_controller() to exclude PVH dom0, and collapse to nothing in
> !CONFIG_PV builds
> * Drop the cross-vendor checks. It isn't possible to configure dom0 as
> cross-vendor, and anyone using is_cpufreq_controller() without an exact
> model match has far bigger problems.
I was on the verge of doing this in v1, but wasn't really sure whether
there was any use case to change the vendor for dom0 cpuid.
> * At least Centaur implements these MSRs. Add access.
> ---
> xen/arch/x86/msr.c | 34 ++++++++++++++++++++++++++++++++++
> xen/arch/x86/pv/emul-priv-op.c | 14 --------------
> xen/include/xen/sched.h | 17 +++++++++++++++++
> 3 files changed, 51 insertions(+), 14 deletions(-)
>
> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
> index 9c69ef8792..0a8ae4d22c 100644
> --- a/xen/arch/x86/msr.c
> +++ b/xen/arch/x86/msr.c
> @@ -242,6 +242,25 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
> goto gp_fault;
> break;
>
> + /*
> + * These MSRs are not enumerated in CPUID. They have been around
> + * since the Pentium 4, and implemented by other vendors.
> + *
> + * Some versions of Windows try reading these before setting up a #GP
> + * handler, and Linux has several unguarded reads as well. Provide
> + * RAZ semantics, in general, but permit a cpufreq controller dom0 to
> + * have full access.
> + */
> + case MSR_IA32_PERF_STATUS:
> + case MSR_IA32_PERF_CTL:
> + if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
> + goto gp_fault;
> +
> + *val = 0;
> + if ( likely(!is_cpufreq_controller(d)) || rdmsr_safe(msr, *val) == 0 )
> + break;
> + goto gp_fault;
> +
> case MSR_IA32_THERM_STATUS:
> if ( cp->x86_vendor != X86_VENDOR_INTEL )
> goto gp_fault;
> @@ -448,6 +467,21 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
> goto gp_fault;
> break;
>
> + /*
> + * This MSR are not enumerated in CPUID. It has been around since the
> + * Pentium 4, and implemented by other vendors.
> + *
> + * To match the RAZ semantics, implement as write-discard, except for
> + * a cpufreq controller dom0 which has full access.
> + */
> + case MSR_IA32_PERF_CTL:
> + if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
> + goto gp_fault;
> +
> + if ( likely(!is_cpufreq_controller(d)) || wrmsr_safe(msr, val) == 0 )
> + break;
> + goto gp_fault;
> +
> case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
> if ( !is_hvm_domain(d) || v != curr )
> goto gp_fault;
> diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
> index 7cc16d6eda..dbceed8a05 100644
> --- a/xen/arch/x86/pv/emul-priv-op.c
> +++ b/xen/arch/x86/pv/emul-priv-op.c
> @@ -849,12 +849,6 @@ static inline uint64_t guest_misc_enable(uint64_t val)
> return val;
> }
>
> -static inline bool is_cpufreq_controller(const struct domain *d)
> -{
> - return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
> - is_hardware_domain(d));
> -}
> -
> static uint64_t guest_efer(const struct domain *d)
> {
> uint64_t val;
> @@ -1121,14 +1115,6 @@ static int write_msr(unsigned int reg, uint64_t val,
> return X86EMUL_OKAY;
> break;
>
> - case MSR_IA32_PERF_CTL:
> - if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
> - break;
> - if ( likely(!is_cpufreq_controller(currd)) ||
> - wrmsr_safe(reg, val) == 0 )
> - return X86EMUL_OKAY;
> - break;
> -
> case MSR_IA32_THERM_CONTROL:
> case MSR_IA32_ENERGY_PERF_BIAS:
> if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index d8ed83f869..b4d3e53310 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -1069,6 +1069,23 @@ extern enum cpufreq_controller {
> FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
> } cpufreq_controller;
>
> +static always_inline bool is_cpufreq_controller(const struct domain *d)
> +{
> + /*
> + * A PV dom0 can be nominated as the cpufreq controller, instead of using
> + * Xen's cpufreq driver, at which point dom0 gets direct access to certain
> + * MSRs.
> + *
> + * This interface only works when dom0 is identity pinned and has the same
> + * number of vCPUs as pCPUs on the system.
> + *
> + * It would be far better to paravirtualise the interface.
> + */
Would it be useful to add an assert here that the domain cpuid vendor
and the BSP cpuid vendor match?
Is it even possible to fake a different cpuid vendor for dom0?
Thanks, Roger.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
2020-11-09 17:38 [PATCH v2] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL} Andrew Cooper
2020-11-09 18:31 ` Roger Pau Monné
@ 2020-11-10 8:03 ` Jan Beulich
2020-11-10 10:32 ` Andrew Cooper
1 sibling, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2020-11-10 8:03 UTC (permalink / raw)
To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel
On 09.11.2020 18:38, Andrew Cooper wrote:
> From: Roger Pau Monné <roger.pau@citrix.com>
>
> Currently a PV hardware domain can also be given control over the CPU
> frequency, and such guest is allowed to write to MSR_IA32_PERF_CTL.
> However since commit 322ec7c89f6 the default behavior has been changed
> to reject accesses to not explicitly handled MSRs, preventing PV
> guests that manage CPU frequency from reading
> MSR_IA32_PERF_{STATUS/CTL}.
>
> Additionally some HVM guests (Windows at least) will attempt to read
> MSR_IA32_PERF_CTL and will panic if given back a #GP fault:
>
> vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
> d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0
>
> Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
> handling shared between HVM and PV guests, and add an explicit case
> for reads to MSR_IA32_PERF_{STATUS/CTL}.
>
> Restore previous behavior and allow PV guests with the required
> permissions to read the contents of the mentioned MSRs. Non privileged
> guests will get 0 when trying to read those registers, as writes to
> MSR_IA32_PERF_CTL by such guest will already be silently dropped.
>
> Fixes: 322ec7c89f6 ('x86/pv: disallow access to unknown MSRs')
> Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
with a nit, a minor adjustment request, and a question:
> @@ -448,6 +467,21 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
> goto gp_fault;
> break;
>
> + /*
> + * This MSR are not enumerated in CPUID. It has been around since the
s/are/is/
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -1069,6 +1069,23 @@ extern enum cpufreq_controller {
> FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
> } cpufreq_controller;
>
> +static always_inline bool is_cpufreq_controller(const struct domain *d)
> +{
> + /*
> + * A PV dom0 can be nominated as the cpufreq controller, instead of using
> + * Xen's cpufreq driver, at which point dom0 gets direct access to certain
> + * MSRs.
> + *
> + * This interface only works when dom0 is identity pinned and has the same
> + * number of vCPUs as pCPUs on the system.
> + *
> + * It would be far better to paravirtualise the interface.
> + */
> + return (IS_ENABLED(CONFIG_PV) &&
> + (cpufreq_controller == FREQCTL_dom0_kernel) &&
> + is_pv_domain(d) && is_hardware_domain(d));
> +}
IS_ENABLED(CONFIG_PV) is already part of is_pv_domain() and hence
imo shouldn't be used explicitly here.
Also, this being an x86 concept, is it really a good idea to put
in xen/sched.h? I guess this builds on Arm just because of DCE
from the IS_ENABLED(CONFIG_PV) (where afaict the one in
is_pv_domain() will still do). (But yes, I do realize that
cpufreq_controller itself gets declared in this file, so maybe
better to do some subsequent cleanup.)
Jan
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
2020-11-09 18:31 ` Roger Pau Monné
@ 2020-11-10 8:04 ` Jan Beulich
2020-11-10 10:24 ` Andrew Cooper
0 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2020-11-10 8:04 UTC (permalink / raw)
To: Roger Pau Monné; +Cc: Xen-devel, Wei Liu, Andrew Cooper
On 09.11.2020 19:31, Roger Pau Monné wrote:
> On Mon, Nov 09, 2020 at 05:38:19PM +0000, Andrew Cooper wrote:
>> From: Roger Pau Monné <roger.pau@citrix.com>
>>
>> Currently a PV hardware domain can also be given control over the CPU
>> frequency, and such guest is allowed to write to MSR_IA32_PERF_CTL.
>> However since commit 322ec7c89f6 the default behavior has been changed
>> to reject accesses to not explicitly handled MSRs, preventing PV
>> guests that manage CPU frequency from reading
>> MSR_IA32_PERF_{STATUS/CTL}.
>>
>> Additionally some HVM guests (Windows at least) will attempt to read
>> MSR_IA32_PERF_CTL and will panic if given back a #GP fault:
>>
>> vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
>> d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0
>>
>> Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
>> handling shared between HVM and PV guests, and add an explicit case
>> for reads to MSR_IA32_PERF_{STATUS/CTL}.
>>
>> Restore previous behavior and allow PV guests with the required
>> permissions to read the contents of the mentioned MSRs. Non privileged
>> guests will get 0 when trying to read those registers, as writes to
>> MSR_IA32_PERF_CTL by such guest will already be silently dropped.
>>
>> Fixes: 322ec7c89f6 ('x86/pv: disallow access to unknown MSRs')
>> Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
>
>> ---
>> CC: Jan Beulich <JBeulich@suse.com>
>> CC: Roger Pau Monné <roger.pau@citrix.com>
>> CC: Wei Liu <wl@xen.org>
>>
>> v2:
>> * fix is_cpufreq_controller() to exclude PVH dom0, and collapse to nothing in
>> !CONFIG_PV builds
>> * Drop the cross-vendor checks. It isn't possible to configure dom0 as
>> cross-vendor, and anyone using is_cpufreq_controller() without an exact
>> model match has far bigger problems.
This already answers ...
>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -1069,6 +1069,23 @@ extern enum cpufreq_controller {
>> FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
>> } cpufreq_controller;
>>
>> +static always_inline bool is_cpufreq_controller(const struct domain *d)
>> +{
>> + /*
>> + * A PV dom0 can be nominated as the cpufreq controller, instead of using
>> + * Xen's cpufreq driver, at which point dom0 gets direct access to certain
>> + * MSRs.
>> + *
>> + * This interface only works when dom0 is identity pinned and has the same
>> + * number of vCPUs as pCPUs on the system.
>> + *
>> + * It would be far better to paravirtualise the interface.
>> + */
>
> Would it be useful to add an assert here that the domain cpuid vendor
> and the BSP cpuid vendor match?
>
> Is it even possible to fake a different cpuid vendor for dom0?
... your question here.
Jan
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
2020-11-10 8:04 ` Jan Beulich
@ 2020-11-10 10:24 ` Andrew Cooper
0 siblings, 0 replies; 7+ messages in thread
From: Andrew Cooper @ 2020-11-10 10:24 UTC (permalink / raw)
To: Jan Beulich, Roger Pau Monné; +Cc: Xen-devel, Wei Liu
On 10/11/2020 08:04, Jan Beulich wrote:
> On 09.11.2020 19:31, Roger Pau Monné wrote:
>> On Mon, Nov 09, 2020 at 05:38:19PM +0000, Andrew Cooper wrote:
>>> From: Roger Pau Monné <roger.pau@citrix.com>
>>>
>>> Currently a PV hardware domain can also be given control over the CPU
>>> frequency, and such guest is allowed to write to MSR_IA32_PERF_CTL.
>>> However since commit 322ec7c89f6 the default behavior has been changed
>>> to reject accesses to not explicitly handled MSRs, preventing PV
>>> guests that manage CPU frequency from reading
>>> MSR_IA32_PERF_{STATUS/CTL}.
>>>
>>> Additionally some HVM guests (Windows at least) will attempt to read
>>> MSR_IA32_PERF_CTL and will panic if given back a #GP fault:
>>>
>>> vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
>>> d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0
>>>
>>> Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
>>> handling shared between HVM and PV guests, and add an explicit case
>>> for reads to MSR_IA32_PERF_{STATUS/CTL}.
>>>
>>> Restore previous behavior and allow PV guests with the required
>>> permissions to read the contents of the mentioned MSRs. Non privileged
>>> guests will get 0 when trying to read those registers, as writes to
>>> MSR_IA32_PERF_CTL by such guest will already be silently dropped.
>>>
>>> Fixes: 322ec7c89f6 ('x86/pv: disallow access to unknown MSRs')
>>> Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
>>
>>> ---
>>> CC: Jan Beulich <JBeulich@suse.com>
>>> CC: Roger Pau Monné <roger.pau@citrix.com>
>>> CC: Wei Liu <wl@xen.org>
>>>
>>> v2:
>>> * fix is_cpufreq_controller() to exclude PVH dom0, and collapse to nothing in
>>> !CONFIG_PV builds
>>> * Drop the cross-vendor checks. It isn't possible to configure dom0 as
>>> cross-vendor, and anyone using is_cpufreq_controller() without an exact
>>> model match has far bigger problems.
> This already answers ...
>
>>> --- a/xen/include/xen/sched.h
>>> +++ b/xen/include/xen/sched.h
>>> @@ -1069,6 +1069,23 @@ extern enum cpufreq_controller {
>>> FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
>>> } cpufreq_controller;
>>>
>>> +static always_inline bool is_cpufreq_controller(const struct domain *d)
>>> +{
>>> + /*
>>> + * A PV dom0 can be nominated as the cpufreq controller, instead of using
>>> + * Xen's cpufreq driver, at which point dom0 gets direct access to certain
>>> + * MSRs.
>>> + *
>>> + * This interface only works when dom0 is identity pinned and has the same
>>> + * number of vCPUs as pCPUs on the system.
>>> + *
>>> + * It would be far better to paravirtualise the interface.
>>> + */
>> Would it be useful to add an assert here that the domain cpuid vendor
>> and the BSP cpuid vendor match?
>>
>> Is it even possible to fake a different cpuid vendor for dom0?
> ... your question here.
Technically at the moment it is possible to configure a non-dom0
hardware domain as cross vendor, but anyone doing so can keep all the
resulting pieces.
~Andrew
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
2020-11-10 8:03 ` Jan Beulich
@ 2020-11-10 10:32 ` Andrew Cooper
2020-11-10 12:12 ` Jan Beulich
0 siblings, 1 reply; 7+ messages in thread
From: Andrew Cooper @ 2020-11-10 10:32 UTC (permalink / raw)
To: Jan Beulich; +Cc: Roger Pau Monné, Wei Liu, Xen-devel
On 10/11/2020 08:03, Jan Beulich wrote:
> On 09.11.2020 18:38, Andrew Cooper wrote:
>> From: Roger Pau Monné <roger.pau@citrix.com>
>>
>> Currently a PV hardware domain can also be given control over the CPU
>> frequency, and such guest is allowed to write to MSR_IA32_PERF_CTL.
>> However since commit 322ec7c89f6 the default behavior has been changed
>> to reject accesses to not explicitly handled MSRs, preventing PV
>> guests that manage CPU frequency from reading
>> MSR_IA32_PERF_{STATUS/CTL}.
>>
>> Additionally some HVM guests (Windows at least) will attempt to read
>> MSR_IA32_PERF_CTL and will panic if given back a #GP fault:
>>
>> vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
>> d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0
>>
>> Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
>> handling shared between HVM and PV guests, and add an explicit case
>> for reads to MSR_IA32_PERF_{STATUS/CTL}.
>>
>> Restore previous behavior and allow PV guests with the required
>> permissions to read the contents of the mentioned MSRs. Non privileged
>> guests will get 0 when trying to read those registers, as writes to
>> MSR_IA32_PERF_CTL by such guest will already be silently dropped.
>>
>> Fixes: 322ec7c89f6 ('x86/pv: disallow access to unknown MSRs')
>> Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Thanks,
> with a nit, a minor adjustment request, and a question:
>
>> @@ -448,6 +467,21 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
>> goto gp_fault;
>> break;
>>
>> + /*
>> + * This MSR are not enumerated in CPUID. It has been around since the
> s/are/is/
Oops, yes.
>
>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -1069,6 +1069,23 @@ extern enum cpufreq_controller {
>> FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
>> } cpufreq_controller;
>>
>> +static always_inline bool is_cpufreq_controller(const struct domain *d)
>> +{
>> + /*
>> + * A PV dom0 can be nominated as the cpufreq controller, instead of using
>> + * Xen's cpufreq driver, at which point dom0 gets direct access to certain
>> + * MSRs.
>> + *
>> + * This interface only works when dom0 is identity pinned and has the same
>> + * number of vCPUs as pCPUs on the system.
>> + *
>> + * It would be far better to paravirtualise the interface.
>> + */
>> + return (IS_ENABLED(CONFIG_PV) &&
>> + (cpufreq_controller == FREQCTL_dom0_kernel) &&
>> + is_pv_domain(d) && is_hardware_domain(d));
>> +}
> IS_ENABLED(CONFIG_PV) is already part of is_pv_domain() and hence
> imo shouldn't be used explicitly here.
Ah yes. Will drop.
> Also, this being an x86 concept, is it really a good idea to put
> in xen/sched.h? I guess this builds on Arm just because of DCE
> from the IS_ENABLED(CONFIG_PV) (where afaict the one in
> is_pv_domain() will still do). (But yes, I do realize that
> cpufreq_controller itself gets declared in this file, so maybe
> better to do some subsequent cleanup.)
I can't spot anywhere obviously better for it to live. We don't have a
common cpufreq.h, and I'm not sure that cpuidle.h is an appropriate
place to live either.
Thanks,
~Andrew
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
2020-11-10 10:32 ` Andrew Cooper
@ 2020-11-10 12:12 ` Jan Beulich
0 siblings, 0 replies; 7+ messages in thread
From: Jan Beulich @ 2020-11-10 12:12 UTC (permalink / raw)
To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel
On 10.11.2020 11:32, Andrew Cooper wrote:
> On 10/11/2020 08:03, Jan Beulich wrote:
>> On 09.11.2020 18:38, Andrew Cooper wrote:
>>> --- a/xen/include/xen/sched.h
>>> +++ b/xen/include/xen/sched.h
>>> @@ -1069,6 +1069,23 @@ extern enum cpufreq_controller {
>>> FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
>>> } cpufreq_controller;
>>>
>>> +static always_inline bool is_cpufreq_controller(const struct domain *d)
>>> +{
>>> + /*
>>> + * A PV dom0 can be nominated as the cpufreq controller, instead of using
>>> + * Xen's cpufreq driver, at which point dom0 gets direct access to certain
>>> + * MSRs.
>>> + *
>>> + * This interface only works when dom0 is identity pinned and has the same
>>> + * number of vCPUs as pCPUs on the system.
>>> + *
>>> + * It would be far better to paravirtualise the interface.
>>> + */
>>> + return (IS_ENABLED(CONFIG_PV) &&
>>> + (cpufreq_controller == FREQCTL_dom0_kernel) &&
>>> + is_pv_domain(d) && is_hardware_domain(d));
>>> +}
>> IS_ENABLED(CONFIG_PV) is already part of is_pv_domain() and hence
>> imo shouldn't be used explicitly here.
>
> Ah yes. Will drop.
>
>> Also, this being an x86 concept, is it really a good idea to put
>> in xen/sched.h? I guess this builds on Arm just because of DCE
>> from the IS_ENABLED(CONFIG_PV) (where afaict the one in
>> is_pv_domain() will still do). (But yes, I do realize that
>> cpufreq_controller itself gets declared in this file, so maybe
>> better to do some subsequent cleanup.)
>
> I can't spot anywhere obviously better for it to live. We don't have a
> common cpufreq.h,
Not the most obvious place for it to live at, but
xen/include/acpi/cpufreq/cpufreq.h ?
Jan
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2020-11-10 12:12 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-09 17:38 [PATCH v2] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL} Andrew Cooper
2020-11-09 18:31 ` Roger Pau Monné
2020-11-10 8:04 ` Jan Beulich
2020-11-10 10:24 ` Andrew Cooper
2020-11-10 8:03 ` Jan Beulich
2020-11-10 10:32 ` Andrew Cooper
2020-11-10 12:12 ` Jan Beulich
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).