* [PATCH 00/27] xen/x86: Per-domain CPUID policies
@ 2017-01-04 12:39 Andrew Cooper
  2017-01-04 12:39 ` [PATCH 01/27] x86/cpuid: Untangle the <asm/cpufeature.h> include hierarchy Andrew Cooper
                   ` (26 more replies)
  0 siblings, 27 replies; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Hello,

Presented herewith is the first part of improvement work to support full
per-domain CPUID policies.  More work is pending on top of this series.

This series is available in git form from:

  http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/xen-cpuid-v1

Testing-wise, this series has been bisected, checking at each stage that the
guest-visible CPUID information is identical (other than reported-frequency
values) for different VMs in a number of configurations.

Jan: This series textually conflicts with some of your register renaming.  I
am happy to rebase if needs be.

Andrew Cooper (27):
  x86/cpuid: Untangle the <asm/cpufeature.h> include hierarchy
  x86/cpuid: Introduce guest_cpuid() and struct cpuid_leaf
  x86/cpuid: Introduce struct cpuid_policy
  x86/cpuid: Move featuresets into struct cpuid_policy
  x86/cpuid: Allocate a CPUID policy for every domain
  x86/domctl: Make XEN_DOMCTL_set_address_size singleshot
  x86/cpuid: Recalculate a domain's CPUID policy when appropriate
  x86/hvm: Dispatch cpuid_viridian_leaves() from guest_cpuid()
  x86/cpuid: Dispatch cpuid_hypervisor_leaves() from guest_cpuid()
  x86/cpuid: Introduce named feature bitmaps
  x86/hvm: Improve hvm_efer_valid() using named features
  x86/hvm: Improve CR4 verification using named features
  x86/vvmx: Use hvm_cr4_guest_valid_bits() to calculate MSR_IA32_VMX_CR4_FIXED1
  x86/pv: Improve pv_cpuid() using named features
  x86/hvm: Improve CPUID and MSR handling using named features
  x86/svm: Improvements using named features
  x86/pv: Use per-domain policy information when calculating the cpumasks
  x86/pv: Use per-domain policy information in pv_cpuid()
  x86/hvm: Use per-domain policy information in hvm_cpuid()
  x86/cpuid: Drop the temporary linear feature bitmap from struct cpuid_policy
  x86/cpuid: Calculate appropriate max_leaf values for the global policies
  x86/cpuid: Perform max_leaf calculations in guest_cpuid()
  x86/cpuid: Move all leaf 7 handling into guest_cpuid()
  x86/hvm: Use guest_cpuid() rather than hvm_cpuid()
  x86/svm: Use guest_cpuid() rather than hvm_cpuid()
  x86/cpuid: Effectively remove pv_cpuid() and hvm_cpuid()
  x86/cpuid: Alter the legacy-path prototypes to match guest_cpuid()

 tools/tests/x86_emulator/x86_emulate.c |  15 +-
 tools/tests/x86_emulator/x86_emulate.h |  60 ++-
 xen/arch/x86/cpuid.c                   | 837 ++++++++++++++++++++++++++++++---
 xen/arch/x86/domain.c                  |  44 +-
 xen/arch/x86/domctl.c                  |  52 +-
 xen/arch/x86/hvm/emulate.c             |  10 +-
 xen/arch/x86/hvm/hvm.c                 | 521 +++-----------------
 xen/arch/x86/hvm/mtrr.c                |  13 +-
 xen/arch/x86/hvm/nestedhvm.c           |   6 +-
 xen/arch/x86/hvm/svm/svm.c             |  62 +--
 xen/arch/x86/hvm/viridian.c            |  65 ++-
 xen/arch/x86/hvm/vmx/vmx.c             |  35 +-
 xen/arch/x86/hvm/vmx/vvmx.c            |  58 +--
 xen/arch/x86/setup.c                   |   4 +-
 xen/arch/x86/sysctl.c                  |  21 +-
 xen/arch/x86/traps.c                   | 476 ++-----------------
 xen/arch/x86/x86_emulate/x86_emulate.c |  31 +-
 xen/arch/x86/x86_emulate/x86_emulate.h |  12 +-
 xen/include/asm-x86/bitops.h           |   3 +-
 xen/include/asm-x86/cpufeature.h       |  30 +-
 xen/include/asm-x86/cpufeatures.h      |  24 +
 xen/include/asm-x86/cpufeatureset.h    |   6 +-
 xen/include/asm-x86/cpuid.h            | 254 +++++++++-
 xen/include/asm-x86/domain.h           |   5 +-
 xen/include/asm-x86/hvm/emulate.h      |   8 +-
 xen/include/asm-x86/hvm/hvm.h          |   7 +-
 xen/include/asm-x86/hvm/nestedhvm.h    |   2 +-
 xen/include/asm-x86/hvm/viridian.h     |   9 +-
 xen/include/asm-x86/mm.h               |   4 +-
 xen/include/asm-x86/processor.h        |   6 +-
 xen/include/xen/compat.h               |   1 -
 31 files changed, 1381 insertions(+), 1300 deletions(-)
 create mode 100644 xen/include/asm-x86/cpufeatures.h

-- 
2.1.4



* [PATCH 01/27] x86/cpuid: Untangle the <asm/cpufeature.h> include hierarchy
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-04 13:39   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 02/27] x86/cpuid: Introduce guest_cpuid() and struct cpuid_leaf Andrew Cooper
                   ` (25 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

The use of X86_FEATURES_ONLY was short-lived in Linux, due to the same problem
encountered here.  The following series needs to add extra includes to
asm/cpuid.h, which breaks the build elsewhere given the current hierarchy.

Move the feature definitions into a separate header file, which also matches
the solution Linux used.
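
As a reminder of the mechanism (a minimal sketch, not part of the patch
itself): <asm/cpufeatures.h> is intended for multiple inclusion, with each
includer providing its own XEN_CPUFEATURE() definition, e.g. to generate an
enum the way cpufeatureset.h does:

    /* The includer chooses what each feature entry expands to... */
    #define XEN_CPUFEATURE(name, value) X86_FEATURE_##name = value,
    enum {
    /* ...and the header expands once per synthesised feature. */
    #include <asm/cpufeatures.h>
    };
    #undef XEN_CPUFEATURE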

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/include/asm-x86/bitops.h        |  3 +--
 xen/include/asm-x86/cpufeature.h    | 29 +++--------------------------
 xen/include/asm-x86/cpufeatures.h   | 24 ++++++++++++++++++++++++
 xen/include/asm-x86/cpufeatureset.h |  6 +++---
 xen/include/asm-x86/cpuid.h         |  4 ----
 5 files changed, 31 insertions(+), 35 deletions(-)
 create mode 100644 xen/include/asm-x86/cpufeatures.h

diff --git a/xen/include/asm-x86/bitops.h b/xen/include/asm-x86/bitops.h
index a8db7e4..fd494e8 100644
--- a/xen/include/asm-x86/bitops.h
+++ b/xen/include/asm-x86/bitops.h
@@ -6,8 +6,7 @@
  */
 
 #include <asm/alternative.h>
-#define X86_FEATURES_ONLY
-#include <asm/cpufeature.h>
+#include <asm/cpufeatureset.h>
 
 /*
  * We specify the memory operand as both input and output because the memory
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index c7c8520..d45e650 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -3,29 +3,8 @@
  *
  * Defines x86 CPU feature bits
  */
-#if defined(XEN_CPUFEATURE)
-
-/* Other features, Xen-defined mapping. */
-/* This range is used for feature bits which conflict or are synthesized */
-XEN_CPUFEATURE(CONSTANT_TSC,    (FSCAPINTS+0)*32+ 0) /* TSC ticks at a constant rate */
-XEN_CPUFEATURE(NONSTOP_TSC,     (FSCAPINTS+0)*32+ 1) /* TSC does not stop in C states */
-XEN_CPUFEATURE(ARAT,            (FSCAPINTS+0)*32+ 2) /* Always running APIC timer */
-XEN_CPUFEATURE(ARCH_PERFMON,    (FSCAPINTS+0)*32+ 3) /* Intel Architectural PerfMon */
-XEN_CPUFEATURE(TSC_RELIABLE,    (FSCAPINTS+0)*32+ 4) /* TSC is known to be reliable */
-XEN_CPUFEATURE(XTOPOLOGY,       (FSCAPINTS+0)*32+ 5) /* cpu topology enum extensions */
-XEN_CPUFEATURE(CPUID_FAULTING,  (FSCAPINTS+0)*32+ 6) /* cpuid faulting */
-XEN_CPUFEATURE(CLFLUSH_MONITOR, (FSCAPINTS+0)*32+ 7) /* clflush reqd with monitor */
-XEN_CPUFEATURE(APERFMPERF,      (FSCAPINTS+0)*32+ 8) /* APERFMPERF */
-XEN_CPUFEATURE(MFENCE_RDTSC,    (FSCAPINTS+0)*32+ 9) /* MFENCE synchronizes RDTSC */
-XEN_CPUFEATURE(XEN_SMEP,        (FSCAPINTS+0)*32+ 10) /* SMEP gets used by Xen itself */
-XEN_CPUFEATURE(XEN_SMAP,        (FSCAPINTS+0)*32+ 11) /* SMAP gets used by Xen itself */
-
-#define NCAPINTS (FSCAPINTS + 1) /* N 32-bit words worth of info */
-
-#elif !defined(__ASM_I386_CPUFEATURE_H)
-#ifndef X86_FEATURES_ONLY
+#ifndef __ASM_I386_CPUFEATURE_H
 #define __ASM_I386_CPUFEATURE_H
-#endif
 
 #include <xen/const.h>
 #include <asm/cpuid.h>
@@ -37,7 +16,7 @@ XEN_CPUFEATURE(XEN_SMAP,        (FSCAPINTS+0)*32+ 11) /* SMAP gets used by Xen i
 /* An alias of a feature we know is always going to be present. */
 #define X86_FEATURE_ALWAYS      X86_FEATURE_LM
 
-#if !defined(__ASSEMBLY__) && !defined(X86_FEATURES_ONLY)
+#ifndef __ASSEMBLY__
 #include <xen/bitops.h>
 
 #define cpu_has(c, bit)		test_bit(bit, (c)->x86_capability)
@@ -139,9 +118,7 @@ struct cpuid4_info {
 };
 
 int cpuid4_cache_lookup(int index, struct cpuid4_info *this_leaf);
-#endif
-
-#undef X86_FEATURES_ONLY
+#endif /* !__ASSEMBLY__ */
 
 #endif /* __ASM_I386_CPUFEATURE_H */
 
diff --git a/xen/include/asm-x86/cpufeatures.h b/xen/include/asm-x86/cpufeatures.h
new file mode 100644
index 0000000..bc98227
--- /dev/null
+++ b/xen/include/asm-x86/cpufeatures.h
@@ -0,0 +1,24 @@
+/*
+ * Explicitly intended for multiple inclusion.
+ */
+
+#include <asm/cpuid-autogen.h>
+
+#define FSCAPINTS FEATURESET_NR_ENTRIES
+
+#define NCAPINTS (FSCAPINTS + 1) /* N 32-bit words worth of info */
+
+/* Other features, Xen-defined mapping. */
+/* This range is used for feature bits which conflict or are synthesized */
+XEN_CPUFEATURE(CONSTANT_TSC,    (FSCAPINTS+0)*32+ 0) /* TSC ticks at a constant rate */
+XEN_CPUFEATURE(NONSTOP_TSC,     (FSCAPINTS+0)*32+ 1) /* TSC does not stop in C states */
+XEN_CPUFEATURE(ARAT,            (FSCAPINTS+0)*32+ 2) /* Always running APIC timer */
+XEN_CPUFEATURE(ARCH_PERFMON,    (FSCAPINTS+0)*32+ 3) /* Intel Architectural PerfMon */
+XEN_CPUFEATURE(TSC_RELIABLE,    (FSCAPINTS+0)*32+ 4) /* TSC is known to be reliable */
+XEN_CPUFEATURE(XTOPOLOGY,       (FSCAPINTS+0)*32+ 5) /* cpu topology enum extensions */
+XEN_CPUFEATURE(CPUID_FAULTING,  (FSCAPINTS+0)*32+ 6) /* cpuid faulting */
+XEN_CPUFEATURE(CLFLUSH_MONITOR, (FSCAPINTS+0)*32+ 7) /* clflush reqd with monitor */
+XEN_CPUFEATURE(APERFMPERF,      (FSCAPINTS+0)*32+ 8) /* APERFMPERF */
+XEN_CPUFEATURE(MFENCE_RDTSC,    (FSCAPINTS+0)*32+ 9) /* MFENCE synchronizes RDTSC */
+XEN_CPUFEATURE(XEN_SMEP,        (FSCAPINTS+0)*32+10) /* SMEP gets used by Xen itself */
+XEN_CPUFEATURE(XEN_SMAP,        (FSCAPINTS+0)*32+11) /* SMAP gets used by Xen itself */
diff --git a/xen/include/asm-x86/cpufeatureset.h b/xen/include/asm-x86/cpufeatureset.h
index c54ff2b..f179229 100644
--- a/xen/include/asm-x86/cpufeatureset.h
+++ b/xen/include/asm-x86/cpufeatureset.h
@@ -8,20 +8,20 @@
 #define XEN_CPUFEATURE(name, value) X86_FEATURE_##name = value,
 enum {
 #include <public/arch-x86/cpufeatureset.h>
-#include <asm/cpufeature.h>
+#include <asm/cpufeatures.h>
 };
 #undef XEN_CPUFEATURE
 
 #define XEN_CPUFEATURE(name, value) asm (".equ X86_FEATURE_" #name ", " \
                                          __stringify(value));
 #include <public/arch-x86/cpufeatureset.h>
-#include <asm/cpufeature.h>
+#include <asm/cpufeatures.h>
 
 #else /* !__ASSEMBLY__ */
 
 #define XEN_CPUFEATURE(name, value) .equ X86_FEATURE_##name, value
 #include <public/arch-x86/cpufeatureset.h>
-#include <asm/cpufeature.h>
+#include <asm/cpufeatures.h>
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index ec8bbb5..05f2c9a 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -1,10 +1,6 @@
 #ifndef __X86_CPUID_H__
 #define __X86_CPUID_H__
 
-#include <asm/cpuid-autogen.h>
-
-#define FSCAPINTS FEATURESET_NR_ENTRIES
-
 #include <asm/cpufeatureset.h>
 #include <asm/percpu.h>
 
-- 
2.1.4



* [PATCH 02/27] x86/cpuid: Introduce guest_cpuid() and struct cpuid_leaf
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
  2017-01-04 12:39 ` [PATCH 01/27] x86/cpuid: Untangle the <asm/cpufeature.h> include hierarchy Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-04 14:01   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 03/27] x86/cpuid: Introduce struct cpuid_policy Andrew Cooper
                   ` (24 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Paul Durrant,
	Jun Nakajima, Boris Ostrovsky, Suravee Suthikulpanit

Long term, pv_cpuid() and hvm_cpuid() will be merged into a single
guest_cpuid(), which is also capable of working outside of current context.

To aid this transition, introduce guest_cpuid() with the intended API, which
simply defers back to pv_cpuid() or hvm_cpuid() as appropriate.

Introduce struct cpuid_leaf, which is used to represent the results of a CPUID
query in a more efficient manner than passing four pointers through the
calltree.

Update all codepaths which should use the new guest_cpuid() API.  These are
the codepaths which have variable inputs, and (other than some specific
x86_emulate() cases) all pertain to servicing a CPUID instruction from a
guest.
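
For illustration, a caller servicing a CPUID instruction now reads roughly as
follows (a condensed sketch of the svm_vmexit_do_cpuid() hunk below, not a
verbatim quote):

    struct cpuid_leaf res;

    /* Query the guest's view of the requested leaf/subleaf... */
    guest_cpuid(current, regs->_eax, regs->_ecx, &res);

    /* ...and write the four output registers back. */
    regs->rax = res.a;
    regs->rbx = res.b;
    regs->rcx = res.c;
    regs->rdx = res.d;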

The other codepaths using {pv,hvm}_cpuid() with fixed inputs will later be
adjusted to read their data straight from the policy block.

No intended functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Paul Durrant <paul.durrant@citrix.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Jan: I note that this patch textually conflicts with your register renaming
series.
---
 tools/tests/x86_emulator/x86_emulate.c | 15 ++++-----
 tools/tests/x86_emulator/x86_emulate.h | 60 ++++++++++++++++------------------
 xen/arch/x86/cpuid.c                   | 33 +++++++++++++++++++
 xen/arch/x86/hvm/emulate.c             | 10 ++----
 xen/arch/x86/hvm/svm/svm.c             | 23 ++++++-------
 xen/arch/x86/hvm/vmx/vmx.c             | 35 ++++++--------------
 xen/arch/x86/traps.c                   | 26 +++++++--------
 xen/arch/x86/x86_emulate/x86_emulate.c | 31 +++++++++---------
 xen/arch/x86/x86_emulate/x86_emulate.h | 12 ++++---
 xen/include/asm-x86/cpuid.h            |  4 +++
 xen/include/asm-x86/hvm/emulate.h      |  8 ++---
 xen/include/asm-x86/mm.h               |  4 +--
 12 files changed, 135 insertions(+), 126 deletions(-)

diff --git a/tools/tests/x86_emulator/x86_emulate.c b/tools/tests/x86_emulator/x86_emulate.c
index 7f644d381..165f98a 100644
--- a/tools/tests/x86_emulator/x86_emulate.c
+++ b/tools/tests/x86_emulator/x86_emulate.c
@@ -39,22 +39,21 @@ bool emul_test_make_stack_executable(void)
 }
 
 int emul_test_cpuid(
-    unsigned int *eax,
-    unsigned int *ebx,
-    unsigned int *ecx,
-    unsigned int *edx,
+    unsigned int leaf,
+    unsigned int subleaf,
+    struct cpuid_leaf *res,
     struct x86_emulate_ctxt *ctxt)
 {
-    unsigned int leaf = *eax;
-
-    asm ("cpuid" : "+a" (*eax), "+c" (*ecx), "=d" (*edx), "=b" (*ebx));
+    asm ("cpuid"
+         : "=a" (res->a), "=b" (res->b), "=c" (res->c), "=d" (res->d)
+         : "a" (leaf), "c" (subleaf));
 
     /*
      * The emulator doesn't itself use MOVBE, so we can always run the
      * respective tests.
      */
     if ( leaf == 1 )
-        *ecx |= 1U << 22;
+        res->c |= 1U << 22;
 
     return X86EMUL_OKAY;
 }
diff --git a/tools/tests/x86_emulator/x86_emulate.h b/tools/tests/x86_emulator/x86_emulate.h
index c14c613..ed9951a 100644
--- a/tools/tests/x86_emulator/x86_emulate.h
+++ b/tools/tests/x86_emulator/x86_emulate.h
@@ -58,61 +58,59 @@ static inline uint64_t xgetbv(uint32_t xcr)
 }
 
 #define cache_line_size() ({		     \
-    unsigned int eax = 1, ebx, ecx = 0, edx; \
-    emul_test_cpuid(&eax, &ebx, &ecx, &edx, NULL); \
-    edx & (1U << 19) ? (ebx >> 5) & 0x7f8 : 0; \
+    struct cpuid_leaf res; \
+    emul_test_cpuid(1, 0, &res, NULL); \
+    res.d & (1U << 19) ? (res.b >> 5) & 0x7f8 : 0; \
 })
 
 #define cpu_has_mmx ({ \
-    unsigned int eax = 1, ecx = 0, edx; \
-    emul_test_cpuid(&eax, &ecx, &ecx, &edx, NULL); \
-    (edx & (1U << 23)) != 0; \
+    struct cpuid_leaf res; \
+    emul_test_cpuid(1, 0, &res, NULL); \
+    (res.d & (1U << 23)) != 0; \
 })
 
 #define cpu_has_sse ({ \
-    unsigned int eax = 1, ecx = 0, edx; \
-    emul_test_cpuid(&eax, &ecx, &ecx, &edx, NULL); \
-    (edx & (1U << 25)) != 0; \
+    struct cpuid_leaf res; \
+    emul_test_cpuid(1, 0, &res, NULL); \
+    (res.d & (1U << 25)) != 0; \
 })
 
 #define cpu_has_sse2 ({ \
-    unsigned int eax = 1, ecx = 0, edx; \
-    emul_test_cpuid(&eax, &ecx, &ecx, &edx, NULL); \
-    (edx & (1U << 26)) != 0; \
+    struct cpuid_leaf res; \
+    emul_test_cpuid(1, 0, &res, NULL); \
+    (res.d & (1U << 26)) != 0; \
 })
 
 #define cpu_has_xsave ({ \
-    unsigned int eax = 1, ecx = 0; \
-    emul_test_cpuid(&eax, &eax, &ecx, &eax, NULL); \
+    struct cpuid_leaf res; \
+    emul_test_cpuid(1, 0, &res, NULL); \
     /* Intentionally checking OSXSAVE here. */ \
-    (ecx & (1U << 27)) != 0; \
+    (res.c & (1U << 27)) != 0; \
 })
 
 #define cpu_has_avx ({ \
-    unsigned int eax = 1, ecx = 0; \
-    emul_test_cpuid(&eax, &eax, &ecx, &eax, NULL); \
-    if ( !(ecx & (1U << 27)) || ((xgetbv(0) & 6) != 6) ) \
-        ecx = 0; \
-    (ecx & (1U << 28)) != 0; \
+    struct cpuid_leaf res; \
+    emul_test_cpuid(1, 0, &res, NULL); \
+    if ( !(res.c & (1U << 27)) || ((xgetbv(0) & 6) != 6) ) \
+        res.c = 0; \
+    (res.c & (1U << 28)) != 0; \
 })
 
 #define cpu_has_avx2 ({ \
-    unsigned int eax = 1, ebx, ecx = 0; \
-    emul_test_cpuid(&eax, &ebx, &ecx, &eax, NULL); \
-    if ( !(ecx & (1U << 27)) || ((xgetbv(0) & 6) != 6) ) \
-        ebx = 0; \
+    struct cpuid_leaf res; \
+    emul_test_cpuid(1, 0, &res, NULL); \
+    if ( !(res.c & (1U << 27)) || ((xgetbv(0) & 6) != 6) ) \
+        res.b = 0; \
     else { \
-        eax = 7, ecx = 0; \
-        emul_test_cpuid(&eax, &ebx, &ecx, &eax, NULL); \
+        emul_test_cpuid(7, 0, &res, NULL); \
     } \
-    (ebx & (1U << 5)) != 0; \
+    (res.b & (1U << 5)) != 0; \
 })
 
 int emul_test_cpuid(
-    unsigned int *eax,
-    unsigned int *ebx,
-    unsigned int *ecx,
-    unsigned int *edx,
+    unsigned int leaf,
+    unsigned int subleaf,
+    struct cpuid_leaf *res,
     struct x86_emulate_ctxt *ctxt);
 
 int emul_test_read_cr(
diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 3e85a63..7796df6 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -1,5 +1,6 @@
 #include <xen/init.h>
 #include <xen/lib.h>
+#include <xen/sched.h>
 #include <asm/cpuid.h>
 #include <asm/hvm/hvm.h>
 #include <asm/hvm/vmx/vmcs.h>
@@ -17,6 +18,8 @@ uint32_t __read_mostly raw_featureset[FSCAPINTS];
 uint32_t __read_mostly pv_featureset[FSCAPINTS];
 uint32_t __read_mostly hvm_featureset[FSCAPINTS];
 
+#define EMPTY_LEAF (struct cpuid_leaf){}
+
 static void __init sanitise_featureset(uint32_t *fs)
 {
     /* for_each_set_bit() uses unsigned longs.  Extend with zeroes. */
@@ -215,6 +218,36 @@ const uint32_t * __init lookup_deep_deps(uint32_t feature)
     return NULL;
 }
 
+void guest_cpuid(const struct vcpu *v, unsigned int leaf,
+                 unsigned int subleaf, struct cpuid_leaf *res)
+{
+    *res = EMPTY_LEAF;
+
+    /* {pv,hvm}_cpuid() have this expectation. */
+    ASSERT(v == current);
+
+    if ( is_pv_vcpu(v) || is_pvh_vcpu(v) )
+    {
+        struct cpu_user_regs regs = *guest_cpu_user_regs();
+
+        regs.rax = leaf;
+        regs.rcx = subleaf;
+
+        pv_cpuid(&regs);
+
+        res->a = regs._eax;
+        res->b = regs._ebx;
+        res->c = regs._ecx;
+        res->d = regs._edx;
+    }
+    else
+    {
+        res->c = subleaf;
+
+        hvm_cpuid(leaf, &res->a, &res->b, &res->c, &res->d);
+    }
+}
+
 static void __init __maybe_unused build_assertions(void)
 {
     BUILD_BUG_ON(ARRAY_SIZE(known_features) != FSCAPINTS);
diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 41bd4f5..8d1bf51 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1552,12 +1552,8 @@ static int hvmemul_wbinvd(
     return X86EMUL_OKAY;
 }
 
-int hvmemul_cpuid(
-    unsigned int *eax,
-    unsigned int *ebx,
-    unsigned int *ecx,
-    unsigned int *edx,
-    struct x86_emulate_ctxt *ctxt)
+int hvmemul_cpuid(unsigned int leaf, unsigned int subleaf,
+                  struct cpuid_leaf *res, struct x86_emulate_ctxt *ctxt)
 {
     /*
      * x86_emulate uses this function to query CPU features for its own internal
@@ -1568,7 +1564,7 @@ int hvmemul_cpuid(
          hvm_check_cpuid_faulting(current) )
         return X86EMUL_EXCEPTION;
 
-    hvm_cpuid(*eax, eax, ebx, ecx, edx);
+    guest_cpuid(current, leaf, subleaf, res);
     return X86EMUL_OKAY;
 }
 
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 89daa39..de20f64 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1572,23 +1572,20 @@ static void svm_fpu_dirty_intercept(void)
 
 static void svm_vmexit_do_cpuid(struct cpu_user_regs *regs)
 {
-    unsigned int eax, ebx, ecx, edx, inst_len;
+    struct vcpu *curr = current;
+    unsigned int inst_len;
+    struct cpuid_leaf res;
 
-    if ( (inst_len = __get_instruction_length(current, INSTR_CPUID)) == 0 )
+    if ( (inst_len = __get_instruction_length(curr, INSTR_CPUID)) == 0 )
         return;
 
-    eax = regs->_eax;
-    ebx = regs->_ebx;
-    ecx = regs->_ecx;
-    edx = regs->_edx;
-
-    hvm_cpuid(regs->_eax, &eax, &ebx, &ecx, &edx);
-    HVMTRACE_5D(CPUID, regs->_eax, eax, ebx, ecx, edx);
+    guest_cpuid(curr, regs->_eax, regs->_ecx, &res);
+    HVMTRACE_5D(CPUID, regs->_eax, res.a, res.b, res.c, res.d);
 
-    regs->rax = eax;
-    regs->rbx = ebx;
-    regs->rcx = ecx;
-    regs->rdx = edx;
+    regs->rax = res.a;
+    regs->rbx = res.b;
+    regs->rcx = res.c;
+    regs->rdx = res.d;
 
     __update_guest_eip(regs, inst_len);
 }
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 15d66a2..ada5358 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2361,8 +2361,9 @@ static void vmx_fpu_dirty_intercept(void)
 
 static int vmx_do_cpuid(struct cpu_user_regs *regs)
 {
-    unsigned int eax, ebx, ecx, edx;
-    unsigned int leaf, subleaf;
+    struct vcpu *curr = current;
+    unsigned int leaf = regs->_eax, subleaf = regs->_ecx;
+    struct cpuid_leaf res;
 
     if ( hvm_check_cpuid_faulting(current) )
     {
@@ -2370,21 +2371,13 @@ static int vmx_do_cpuid(struct cpu_user_regs *regs)
         return 1;  /* Don't advance the guest IP! */
     }
 
-    eax = regs->eax;
-    ebx = regs->ebx;
-    ecx = regs->ecx;
-    edx = regs->edx;
-
-    leaf = regs->eax;
-    subleaf = regs->ecx;
+    guest_cpuid(curr, leaf, subleaf, &res);
+    HVMTRACE_5D(CPUID, leaf, res.a, res.b, res.c, res.d);
 
-    hvm_cpuid(leaf, &eax, &ebx, &ecx, &edx);
-    HVMTRACE_5D(CPUID, leaf, eax, ebx, ecx, edx);
-
-    regs->eax = eax;
-    regs->ebx = ebx;
-    regs->ecx = ecx;
-    regs->edx = edx;
+    regs->rax = res.a;
+    regs->rbx = res.b;
+    regs->rcx = res.c;
+    regs->rdx = res.d;
 
     return hvm_monitor_cpuid(get_instruction_length(), leaf, subleaf);
 }
@@ -3562,15 +3555,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
     }
     case EXIT_REASON_CPUID:
     {
-        int rc;
-
-        if ( is_pvh_vcpu(v) )
-        {
-            pv_cpuid(regs);
-            rc = 0;
-        }
-        else
-            rc = vmx_do_cpuid(regs);
+        int rc = vmx_do_cpuid(regs);
 
         /*
          * rc < 0 error in monitor/vm_event, crash
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 2d211d1..02f2d5c 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1412,6 +1412,7 @@ static int emulate_forced_invalid_op(struct cpu_user_regs *regs)
 {
     char sig[5], instr[2];
     unsigned long eip, rc;
+    struct cpuid_leaf res;
 
     eip = regs->rip;
 
@@ -1444,7 +1445,12 @@ static int emulate_forced_invalid_op(struct cpu_user_regs *regs)
 
     eip += sizeof(instr);
 
-    pv_cpuid(regs);
+    guest_cpuid(current, regs->_eax, regs->_ecx, &res);
+
+    regs->rax = res.a;
+    regs->rbx = res.b;
+    regs->rcx = res.c;
+    regs->rdx = res.d;
 
     instruction_done(regs, eip);
 
@@ -3246,10 +3252,10 @@ static int priv_op_wbinvd(struct x86_emulate_ctxt *ctxt)
     return X86EMUL_OKAY;
 }
 
-int pv_emul_cpuid(unsigned int *eax, unsigned int *ebx, unsigned int *ecx,
-                  unsigned int *edx, struct x86_emulate_ctxt *ctxt)
+int pv_emul_cpuid(unsigned int leaf, unsigned int subleaf,
+                  struct cpuid_leaf *res, struct x86_emulate_ctxt *ctxt)
 {
-    struct cpu_user_regs regs = *ctxt->regs;
+    struct vcpu *curr = current;
 
     /*
      * x86_emulate uses this function to query CPU features for its own
@@ -3258,7 +3264,6 @@ int pv_emul_cpuid(unsigned int *eax, unsigned int *ebx, unsigned int *ecx,
      */
     if ( ctxt->opcode == X86EMUL_OPC(0x0f, 0xa2) )
     {
-        const struct vcpu *curr = current;
 
         /* If cpuid faulting is enabled and CPL>0 leave the #GP untouched. */
         if ( curr->arch.cpuid_faulting &&
@@ -3266,16 +3271,7 @@ int pv_emul_cpuid(unsigned int *eax, unsigned int *ebx, unsigned int *ecx,
             return X86EMUL_EXCEPTION;
     }
 
-    regs._eax = *eax;
-    regs._ecx = *ecx;
-
-    pv_cpuid(&regs);
-
-    *eax = regs._eax;
-    *ebx = regs._ebx;
-    *ecx = regs._ecx;
-    *edx = regs._edx;
-
+    guest_cpuid(curr, leaf, subleaf, res);
     return X86EMUL_OKAY;
 }
 
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 3076c0c..4456df9 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1265,19 +1265,19 @@ static bool vcpu_has(
     struct x86_emulate_ctxt *ctxt,
     const struct x86_emulate_ops *ops)
 {
-    unsigned int ebx = 0, ecx = 0, edx = 0;
+    struct cpuid_leaf res;
     int rc = X86EMUL_OKAY;
 
     fail_if(!ops->cpuid);
-    rc = ops->cpuid(&eax, &ebx, &ecx, &edx, ctxt);
+    rc = ops->cpuid(eax, 0, &res, ctxt);
     if ( rc == X86EMUL_OKAY )
     {
         switch ( reg )
         {
-        case EAX: reg = eax; break;
-        case EBX: reg = ebx; break;
-        case ECX: reg = ecx; break;
-        case EDX: reg = edx; break;
+        case EAX: reg = res.a; break;
+        case EBX: reg = res.b; break;
+        case ECX: reg = res.c; break;
+        case EDX: reg = res.d; break;
         default: BUG();
         }
         if ( !(reg & (1U << bit)) )
@@ -4502,15 +4502,15 @@ x86_emulate(
 
         case 0xfc: /* clzero */
         {
-            unsigned int eax = 1, ebx = 0, dummy = 0;
+            struct cpuid_leaf res;
             unsigned long zero = 0;
 
             base = ad_bytes == 8 ? _regs.eax :
                    ad_bytes == 4 ? (uint32_t)_regs.eax : (uint16_t)_regs.eax;
             limit = 0;
             if ( vcpu_has_clflush() &&
-                 ops->cpuid(&eax, &ebx, &dummy, &dummy, ctxt) == X86EMUL_OKAY )
-                limit = ((ebx >> 8) & 0xff) * 8;
+                 ops->cpuid(1, 0, &res, ctxt) == X86EMUL_OKAY )
+                limit = ((res.b >> 8) & 0xff) * 8;
             generate_exception_if(limit < sizeof(long) ||
                                   (limit & (limit - 1)), EXC_UD);
             base &= ~(limit - 1);
@@ -5150,17 +5150,18 @@ x86_emulate(
         dst.val = test_cc(b, _regs.eflags);
         break;
 
-    case X86EMUL_OPC(0x0f, 0xa2): /* cpuid */ {
-        unsigned int eax = _regs.eax, ebx = _regs.ebx;
-        unsigned int ecx = _regs.ecx, edx = _regs.edx;
+    case X86EMUL_OPC(0x0f, 0xa2): /* cpuid */
+    {
+        struct cpuid_leaf res;
+
         fail_if(ops->cpuid == NULL);
-        rc = ops->cpuid(&eax, &ebx, &ecx, &edx, ctxt);
+        rc = ops->cpuid(_regs._eax, _regs._ecx, &res, ctxt);
         generate_exception_if(rc == X86EMUL_EXCEPTION,
                               EXC_GP, 0); /* CPUID Faulting? */
         if ( rc != X86EMUL_OKAY )
             goto done;
-        _regs.eax = eax; _regs.ebx = ebx;
-        _regs.ecx = ecx; _regs.edx = edx;
+        _regs.eax = res.a; _regs.ebx = res.b;
+        _regs.ecx = res.c; _regs.edx = res.d;
         break;
     }
 
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 75f57ba..769bb32 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -164,6 +164,11 @@ enum x86_emulate_fpu_type {
     X86EMUL_FPU_ymm  /* AVX/XOP instruction set (%ymm0-%ymm7/15) */
 };
 
+struct cpuid_leaf
+{
+    uint32_t a, b, c, d;
+};
+
 struct x86_emulate_state;
 
 /*
@@ -415,10 +420,9 @@ struct x86_emulate_ops
      * #GP[0].  Used to implement CPUID faulting.
      */
     int (*cpuid)(
-        unsigned int *eax,
-        unsigned int *ebx,
-        unsigned int *ecx,
-        unsigned int *edx,
+        unsigned int leaf,
+        unsigned int subleaf,
+        struct cpuid_leaf *res,
         struct x86_emulate_ctxt *ctxt);
 
     /*
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 05f2c9a..a6488af 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -17,6 +17,7 @@
 
 #ifndef __ASSEMBLY__
 #include <xen/types.h>
+#include <asm/x86_emulate.h>
 #include <public/sysctl.h>
 
 extern const uint32_t known_features[FSCAPINTS];
@@ -64,6 +65,9 @@ extern struct cpuidmasks cpuidmask_defaults;
 /* Whether or not cpuid faulting is available for the current domain. */
 DECLARE_PER_CPU(bool, cpuid_faulting_enabled);
 
+void guest_cpuid(const struct vcpu *v, unsigned int leaf,
+                 unsigned int subleaf, struct cpuid_leaf *res);
+
 #endif /* __ASSEMBLY__ */
 #endif /* !__X86_CPUID_H__ */
 
diff --git a/xen/include/asm-x86/hvm/emulate.h b/xen/include/asm-x86/hvm/emulate.h
index 68a95e4..88e4856 100644
--- a/xen/include/asm-x86/hvm/emulate.h
+++ b/xen/include/asm-x86/hvm/emulate.h
@@ -57,12 +57,8 @@ void hvm_emulate_init_per_insn(
     unsigned int insn_bytes);
 void hvm_emulate_writeback(
     struct hvm_emulate_ctxt *hvmemul_ctxt);
-int hvmemul_cpuid(
-    unsigned int *eax,
-    unsigned int *ebx,
-    unsigned int *ecx,
-    unsigned int *edx,
-    struct x86_emulate_ctxt *ctxt);
+int hvmemul_cpuid(unsigned int leaf, unsigned int subleaf,
+                  struct cpuid_leaf *res, struct x86_emulate_ctxt *ctxt);
 struct segment_register *hvmemul_get_seg_reg(
     enum x86_segment seg,
     struct hvm_emulate_ctxt *hvmemul_ctxt);
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index a15029c..d573ca1 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -504,8 +504,8 @@ extern int mmcfg_intercept_write(enum x86_segment seg,
                                  void *p_data,
                                  unsigned int bytes,
                                  struct x86_emulate_ctxt *ctxt);
-int pv_emul_cpuid(unsigned int *eax, unsigned int *ebx, unsigned int *ecx,
-                  unsigned int *edx, struct x86_emulate_ctxt *ctxt);
+int pv_emul_cpuid(unsigned int leaf, unsigned int subleaf,
+                  struct cpuid_leaf *res, struct x86_emulate_ctxt *ctxt);
 
 int  ptwr_do_page_fault(struct vcpu *, unsigned long,
                         struct cpu_user_regs *);
-- 
2.1.4



* [PATCH 03/27] x86/cpuid: Introduce struct cpuid_policy
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
  2017-01-04 12:39 ` [PATCH 01/27] x86/cpuid: Untangle the <asm/cpufeature.h> include hierarchy Andrew Cooper
  2017-01-04 12:39 ` [PATCH 02/27] x86/cpuid: Introduce guest_cpuid() and struct cpuid_leaf Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-04 14:22   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 04/27] x86/cpuid: Move featuresets into " Andrew Cooper
                   ` (23 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

struct cpuid_policy will eventually be a complete replacement for the cpuids[]
array, with a fixed layout and named fields to allow O(1) access to specific
information.

For now, the CPUID content is capped at the 0xd and 0x8000001c leaves, which
matches the maximum policy that the toolstack will generate for a domain.  The
xstate leaves extend up to LWP, and the structured features leaf is
implemented with subleaf properties (in anticipation of subleaf 1 appearing
soon), although only subleaf 0 is currently implemented.

Introduce calculate_raw_policy() which fills raw_policy with information,
making use of the new helpers, cpuid_{,count_}leaf().

Finally, rename calculate_featuresets() to init_guest_cpuid(), as it is going
to perform rather more work.
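
For illustration, the union layout lets the same storage be read either as
raw leaves or via named fields (a sketch of the intended access pattern from
within cpuid.c; per the comment added to the header, only max_{,sub}leaf and
the xcr0/xss words can be relied upon at this stage):

    const struct cpuid_policy *p = &raw_policy;

    /* Raw form, exactly as the CPUID instruction reported it... */
    unsigned int max = p->basic.raw[0].a;

    /* ...or O(1) access to the same storage via its named alias. */
    unsigned int max_leaf = p->basic.max_leaf;  /* == max */

    /* 64-bit xstate information reassembled from the named words. */
    uint64_t xcr0 = ((uint64_t)p->xstate.xcr0_high << 32) |
                    p->xstate.xcr0_low;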

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/cpuid.c        | 78 ++++++++++++++++++++++++++++++++++++++++++++-
 xen/arch/x86/setup.c        |  2 +-
 xen/include/asm-x86/cpuid.h | 75 ++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 152 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 7796df6..e0751c9 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -20,6 +20,19 @@ uint32_t __read_mostly hvm_featureset[FSCAPINTS];
 
 #define EMPTY_LEAF (struct cpuid_leaf){}
 
+static struct cpuid_policy __read_mostly raw_policy;
+
+static void cpuid_leaf(unsigned int leaf, struct cpuid_leaf *data)
+{
+    cpuid(leaf, &data->a, &data->b, &data->c, &data->d);
+}
+
+static void cpuid_count_leaf(unsigned int leaf, unsigned int subleaf,
+                             struct cpuid_leaf *data)
+{
+    cpuid_count(leaf, subleaf, &data->a, &data->b, &data->c, &data->d);
+}
+
 static void __init sanitise_featureset(uint32_t *fs)
 {
     /* for_each_set_bit() uses unsigned longs.  Extend with zeroes. */
@@ -67,6 +80,55 @@ static void __init sanitise_featureset(uint32_t *fs)
                           (fs[FEATURESET_e1d] & ~CPUID_COMMON_1D_FEATURES));
 }
 
+static void __init calculate_raw_policy(void)
+{
+    struct cpuid_policy *p = &raw_policy;
+    unsigned int i;
+
+    cpuid_leaf(0, &p->basic.raw[0]);
+    for ( i = 1; i < min(ARRAY_SIZE(p->basic.raw),
+                         p->basic.max_leaf + 1ul); ++i )
+    {
+        /* Collected later. */
+        if ( i == 0x7 || i == 0xd )
+            continue;
+
+        cpuid_leaf(i, &p->basic.raw[i]);
+    }
+
+    if ( p->basic.max_leaf >= 7 )
+    {
+        cpuid_count_leaf(7, 0, &p->feat.raw[0]);
+
+        for ( i = 1; i < min(ARRAY_SIZE(p->feat.raw),
+                             p->feat.max_subleaf + 1ul); ++i )
+            cpuid_count_leaf(7, i, &p->feat.raw[i]);
+    }
+
+    if ( p->basic.max_leaf >= 0xd )
+    {
+        uint64_t xstates;
+
+        cpuid_count_leaf(0xd, 0, &p->xstate.raw[0]);
+        cpuid_count_leaf(0xd, 1, &p->xstate.raw[1]);
+
+        xstates = ((uint64_t)(p->xstate.xcr0_high | p->xstate.xss_high) << 32) |
+            (p->xstate.xcr0_low | p->xstate.xss_low);
+
+        for ( i = 2; i < 63; ++i )
+        {
+            if ( xstates & (1ul << i) )
+                cpuid_count_leaf(0xd, i, &p->xstate.raw[i]);
+        }
+    }
+
+    /* Extended leaves. */
+    cpuid_leaf(0x80000000, &p->extd.raw[0]);
+    for ( i = 1; i < min(ARRAY_SIZE(p->extd.raw),
+                         p->extd.max_leaf + 1 - 0x80000000ul); ++i )
+        cpuid_leaf(0x80000000 + i, &p->extd.raw[i]);
+}
+
 static void __init calculate_raw_featureset(void)
 {
     unsigned int max, tmp;
@@ -181,8 +243,10 @@ static void __init calculate_hvm_featureset(void)
     sanitise_featureset(hvm_featureset);
 }
 
-void __init calculate_featuresets(void)
+void __init init_guest_cpuid(void)
 {
+    calculate_raw_policy();
+
     calculate_raw_featureset();
     calculate_pv_featureset();
     calculate_hvm_featureset();
@@ -256,6 +320,18 @@ static void __init __maybe_unused build_assertions(void)
     BUILD_BUG_ON(ARRAY_SIZE(hvm_shadow_featuremask) != FSCAPINTS);
     BUILD_BUG_ON(ARRAY_SIZE(hvm_hap_featuremask) != FSCAPINTS);
     BUILD_BUG_ON(ARRAY_SIZE(deep_features) != FSCAPINTS);
+
+    /* Find some more clever allocation scheme if this trips. */
+    BUILD_BUG_ON(sizeof(struct cpuid_policy) > PAGE_SIZE);
+
+    BUILD_BUG_ON(sizeof(raw_policy.basic) !=
+                 sizeof(raw_policy.basic.raw));
+    BUILD_BUG_ON(sizeof(raw_policy.feat) !=
+                 sizeof(raw_policy.feat.raw));
+    BUILD_BUG_ON(sizeof(raw_policy.xstate) !=
+                 sizeof(raw_policy.xstate.raw));
+    BUILD_BUG_ON(sizeof(raw_policy.extd) !=
+                 sizeof(raw_policy.extd.raw));
 }
 
 /*
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index d473ac8..94db514 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1590,7 +1590,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                "Multiple initrd candidates, picking module #%u\n",
                initrdidx);
 
-    calculate_featuresets();
+    init_guest_cpuid();
 
     /*
      * Temporarily clear SMAP in CR4 to allow user-accesses in construct_dom0().
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index a6488af..50844f2 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -17,6 +17,7 @@
 
 #ifndef __ASSEMBLY__
 #include <xen/types.h>
+#include <xen/kernel.h>
 #include <asm/x86_emulate.h>
 #include <public/sysctl.h>
 
@@ -28,7 +29,7 @@ extern uint32_t raw_featureset[FSCAPINTS];
 extern uint32_t pv_featureset[FSCAPINTS];
 extern uint32_t hvm_featureset[FSCAPINTS];
 
-void calculate_featuresets(void);
+void init_guest_cpuid(void);
 
 const uint32_t *lookup_deep_deps(uint32_t feature);
 
@@ -65,6 +66,78 @@ extern struct cpuidmasks cpuidmask_defaults;
 /* Whether or not cpuid faulting is available for the current domain. */
 DECLARE_PER_CPU(bool, cpuid_faulting_enabled);
 
+#define CPUID_GUEST_NR_BASIC      (0xdu + 1)
+#define CPUID_GUEST_NR_FEAT       (0u + 1)
+#define CPUID_GUEST_NR_XSTATE     (62u + 1)
+#define CPUID_GUEST_NR_EXTD_INTEL (0x8u + 1)
+#define CPUID_GUEST_NR_EXTD_AMD   (0x1cu + 1)
+#define CPUID_GUEST_NR_EXTD       MAX(CPUID_GUEST_NR_EXTD_INTEL, \
+                                      CPUID_GUEST_NR_EXTD_AMD)
+
+struct cpuid_policy
+{
+    /*
+     * WARNING: During the CPUID transition period, not all information here
+     * is accurate.  The following items are accurate, and can be relied upon.
+     *
+     * Global *_policy objects:
+     *
+     * - Host accurate:
+     *   - max_{,sub}leaf
+     *   - {xcr0,xss}_{high,low}
+     *
+     * - Guest appropriate:
+     *   - Nothing
+     *
+     * Everything else should be considered inaccurate, and not necessarily 0.
+     */
+
+    /* Basic leaves: 0x000000xx */
+    union {
+        struct cpuid_leaf raw[CPUID_GUEST_NR_BASIC];
+        struct {
+            /* Leaf 0x0 - Max and vendor. */
+            struct {
+                uint32_t max_leaf, :32, :32, :32;
+            };
+        };
+    } basic;
+
+    /* Structured feature leaf: 0x00000007[xx] */
+    union {
+        struct cpuid_leaf raw[CPUID_GUEST_NR_FEAT];
+        struct {
+            struct {
+                uint32_t max_subleaf, :32, :32, :32;
+            };
+        };
+    } feat;
+
+    /* Xstate feature leaf: 0x0000000D[xx] */
+    union {
+        struct cpuid_leaf raw[CPUID_GUEST_NR_XSTATE];
+        struct {
+            struct {
+                uint32_t xcr0_low, :32, :32, xcr0_high;
+            };
+            struct {
+                uint32_t :32, :32, xss_low, xss_high;
+            };
+        };
+    } xstate;
+
+    /* Extended leaves: 0x800000xx */
+    union {
+        struct cpuid_leaf raw[CPUID_GUEST_NR_EXTD];
+        struct {
+            /* Leaf 0x80000000 - Max and vendor. */
+            struct {
+                uint32_t max_leaf, :32, :32, :32;
+            };
+        };
+    } extd;
+};
+
 void guest_cpuid(const struct vcpu *v, unsigned int leaf,
                  unsigned int subleaf, struct cpuid_leaf *res);
 
-- 
2.1.4



* [PATCH 04/27] x86/cpuid: Move featuresets into struct cpuid_policy
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (2 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 03/27] x86/cpuid: Introduce struct cpuid_policy Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-04 14:35   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 05/27] x86/cpuid: Allocate a CPUID policy for every domain Andrew Cooper
                   ` (22 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Featuresets will eventually live only once in a struct cpuid_policy, but lots
of code currently uses the global featuresets as a linear bitmap.  Remove the
existing global *_featureset bitmaps, replacing them with *_policy objects
containing named featureset words and a fs[] linear bitmap.

Two new helpers are introduced to scatter/gather a linear featureset bitmap
to/from the fixed word locations in struct cpuid_policy.

The existing calculate_raw_policy() already obtains the scattered raw
featureset.  Gather the raw featureset into raw_policy.fs in
calculate_raw_policy() and drop calculate_raw_featureset() entirely.

Now that host_featureset can't be a straight define of
boot_cpu_data.x86_capability, introduce calculate_host_policy() to suitably
fill in host_policy from boot_cpu_data.x86_capability.  (Future changes will
have additional sanitization logic in this function.)

The PV and HVM policy objects and calculation functions gain "max" in their
names, as there will eventually be a distinction between max and default
policies for each domain type.  The existing logic still works in terms of
linear bitmaps, so scatter the result back into the policy objects.
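
The round trip between the two representations then looks like this (a
minimal sketch of the intended usage of the new helpers, not a quote of the
code below):

    struct cpuid_policy *p = &pv_max_policy;
    uint32_t fs[FSCAPINTS];

    /* Gather the named featureset words into a linear bitmap... */
    cpuid_policy_to_featureset(p, fs);

    /* ...adjust the linear form (mask, sanitise_featureset(fs), etc.)... */

    /* ...and scatter the result back into the policy's named words. */
    cpuid_featureset_to_policy(fs, p);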

Leave some compatibility defines providing the old *_featureset API.  This
results in no observed change in the *_featureset values, which are still used
at the hypercall and guest_cpuid() interfaces.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/cpuid.c        | 64 +++++++++++------------------------
 xen/include/asm-x86/cpuid.h | 81 ++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 93 insertions(+), 52 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index e0751c9..92e825e 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -14,13 +14,12 @@ static const uint32_t __initconst hvm_shadow_featuremask[] = INIT_HVM_SHADOW_FEA
 static const uint32_t __initconst hvm_hap_featuremask[] = INIT_HVM_HAP_FEATURES;
 static const uint32_t __initconst deep_features[] = INIT_DEEP_FEATURES;
 
-uint32_t __read_mostly raw_featureset[FSCAPINTS];
-uint32_t __read_mostly pv_featureset[FSCAPINTS];
-uint32_t __read_mostly hvm_featureset[FSCAPINTS];
-
 #define EMPTY_LEAF (struct cpuid_leaf){}
 
-static struct cpuid_policy __read_mostly raw_policy;
+struct cpuid_policy __read_mostly raw_policy,
+    __read_mostly host_policy,
+    __read_mostly pv_max_policy,
+    __read_mostly hvm_max_policy;
 
 static void cpuid_leaf(unsigned int leaf, struct cpuid_leaf *data)
 {
@@ -127,47 +126,22 @@ static void __init calculate_raw_policy(void)
     for ( i = 1; i < min(ARRAY_SIZE(p->extd.raw),
                          p->extd.max_leaf + 1 - 0x80000000ul); ++i )
         cpuid_leaf(0x80000000 + i, &p->extd.raw[i]);
+
+    cpuid_policy_to_featureset(p, p->fs);
 }
 
-static void __init calculate_raw_featureset(void)
+static void __init calculate_host_policy(void)
 {
-    unsigned int max, tmp;
-
-    max = cpuid_eax(0);
-
-    if ( max >= 1 )
-        cpuid(0x1, &tmp, &tmp,
-              &raw_featureset[FEATURESET_1c],
-              &raw_featureset[FEATURESET_1d]);
-    if ( max >= 7 )
-        cpuid_count(0x7, 0, &tmp,
-                    &raw_featureset[FEATURESET_7b0],
-                    &raw_featureset[FEATURESET_7c0],
-                    &raw_featureset[FEATURESET_7d0]);
-    if ( max >= 0xd )
-        cpuid_count(0xd, 1,
-                    &raw_featureset[FEATURESET_Da1],
-                    &tmp, &tmp, &tmp);
-
-    max = cpuid_eax(0x80000000);
-    if ( (max >> 16) != 0x8000 )
-        return;
+    struct cpuid_policy *p = &host_policy;
 
-    if ( max >= 0x80000001 )
-        cpuid(0x80000001, &tmp, &tmp,
-              &raw_featureset[FEATURESET_e1c],
-              &raw_featureset[FEATURESET_e1d]);
-    if ( max >= 0x80000007 )
-        cpuid(0x80000007, &tmp, &tmp, &tmp,
-              &raw_featureset[FEATURESET_e7d]);
-    if ( max >= 0x80000008 )
-        cpuid(0x80000008, &tmp,
-              &raw_featureset[FEATURESET_e8b],
-              &tmp, &tmp);
+    memcpy(p->fs, boot_cpu_data.x86_capability, sizeof(p->fs));
+
+    cpuid_featureset_to_policy(host_featureset, p);
 }
 
-static void __init calculate_pv_featureset(void)
+static void __init calculate_pv_max_policy(void)
 {
+    struct cpuid_policy *p = &pv_max_policy;
     unsigned int i;
 
     for ( i = 0; i < FSCAPINTS; ++i )
@@ -185,10 +159,12 @@ static void __init calculate_pv_featureset(void)
     __set_bit(X86_FEATURE_CMP_LEGACY, pv_featureset);
 
     sanitise_featureset(pv_featureset);
+    cpuid_featureset_to_policy(pv_featureset, p);
 }
 
-static void __init calculate_hvm_featureset(void)
+static void __init calculate_hvm_max_policy(void)
 {
+    struct cpuid_policy *p = &hvm_max_policy;
     unsigned int i;
     const uint32_t *hvm_featuremask;
 
@@ -241,15 +217,15 @@ static void __init calculate_hvm_featureset(void)
     }
 
     sanitise_featureset(hvm_featureset);
+    cpuid_featureset_to_policy(hvm_featureset, p);
 }
 
 void __init init_guest_cpuid(void)
 {
     calculate_raw_policy();
-
-    calculate_raw_featureset();
-    calculate_pv_featureset();
-    calculate_hvm_featureset();
+    calculate_host_policy();
+    calculate_pv_max_policy();
+    calculate_hvm_max_policy();
 }
 
 const uint32_t * __init lookup_deep_deps(uint32_t feature)
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 50844f2..b67c10e 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -24,11 +24,6 @@
 extern const uint32_t known_features[FSCAPINTS];
 extern const uint32_t special_features[FSCAPINTS];
 
-extern uint32_t raw_featureset[FSCAPINTS];
-#define host_featureset boot_cpu_data.x86_capability
-extern uint32_t pv_featureset[FSCAPINTS];
-extern uint32_t hvm_featureset[FSCAPINTS];
-
 void init_guest_cpuid(void);
 
 const uint32_t *lookup_deep_deps(uint32_t feature);
@@ -87,7 +82,7 @@ struct cpuid_policy
      *   - {xcr0,xss}_{high,low}
      *
      * - Guest appropriate:
-     *   - Nothing
+     *   - All FEATURESET_* words
      *
      * Everything else should be considered inaccurate, and not necessarily 0.
      */
@@ -100,6 +95,11 @@ struct cpuid_policy
             struct {
                 uint32_t max_leaf, :32, :32, :32;
             };
+
+            /* Leaf 0x1 - family/model/stepping and features. */
+            struct {
+                uint32_t :32, :32, _1c, _1d;
+            };
         };
     } basic;
 
@@ -108,7 +108,7 @@ struct cpuid_policy
         struct cpuid_leaf raw[CPUID_GUEST_NR_FEAT];
         struct {
             struct {
-                uint32_t max_subleaf, :32, :32, :32;
+                uint32_t max_subleaf, _7b0, _7c0, _7d0;
             };
         };
     } feat;
@@ -121,7 +121,7 @@ struct cpuid_policy
                 uint32_t xcr0_low, :32, :32, xcr0_high;
             };
             struct {
-                uint32_t :32, :32, xss_low, xss_high;
+                uint32_t Da1, :32, xss_low, xss_high;
             };
         };
     } xstate;
@@ -134,10 +134,75 @@ struct cpuid_policy
             struct {
                 uint32_t max_leaf, :32, :32, :32;
             };
+
+            /* Leaf 0x80000001 - family/model/stepping and features. */
+            struct {
+                uint32_t :32, :32, e1c, e1d;
+            };
+
+            uint64_t :64, :64; /* Brand string. */
+            uint64_t :64, :64; /* Brand string. */
+            uint64_t :64, :64; /* Brand string. */
+            uint64_t :64, :64; /* L1 cache/TLB. */
+            uint64_t :64, :64; /* L2/3 cache/TLB. */
+
+            /* Leaf 0x80000007 - Advanced Power Management. */
+            struct {
+                uint32_t :32, :32, :32, e7d;
+            };
+
+            /* Leaf 0x80000008 - Misc addr/feature info. */
+            struct {
+                uint32_t :32, e8b, :32, :32;
+            };
         };
     } extd;
+
+    /* Temporary featureset bitmap. */
+    uint32_t fs[FSCAPINTS];
 };
 
+/* Fill in a featureset bitmap from a CPUID policy. */
+static inline void cpuid_policy_to_featureset(
+    const struct cpuid_policy *p, uint32_t fs[FSCAPINTS])
+{
+    fs[FEATURESET_1d]  = p->basic._1d;
+    fs[FEATURESET_1c]  = p->basic._1c;
+    fs[FEATURESET_e1d] = p->extd.e1d;
+    fs[FEATURESET_e1c] = p->extd.e1c;
+    fs[FEATURESET_Da1] = p->xstate.Da1;
+    fs[FEATURESET_7b0] = p->feat._7b0;
+    fs[FEATURESET_7c0] = p->feat._7c0;
+    fs[FEATURESET_e7d] = p->extd.e7d;
+    fs[FEATURESET_e8b] = p->extd.e8b;
+    fs[FEATURESET_7d0] = p->feat._7d0;
+}
+
+/* Fill in a CPUID policy from a featureset bitmap. */
+static inline void cpuid_featureset_to_policy(
+    const uint32_t fs[FSCAPINTS], struct cpuid_policy *p)
+{
+    p->basic._1d  = fs[FEATURESET_1d];
+    p->basic._1c  = fs[FEATURESET_1c];
+    p->extd.e1d   = fs[FEATURESET_e1d];
+    p->extd.e1c   = fs[FEATURESET_e1c];
+    p->xstate.Da1 = fs[FEATURESET_Da1];
+    p->feat._7b0  = fs[FEATURESET_7b0];
+    p->feat._7c0  = fs[FEATURESET_7c0];
+    p->extd.e7d   = fs[FEATURESET_e7d];
+    p->extd.e8b   = fs[FEATURESET_e8b];
+    p->feat._7d0  = fs[FEATURESET_7d0];
+}
+
+extern struct cpuid_policy raw_policy, host_policy, pv_max_policy,
+    hvm_max_policy;
+
+/* Temporary compatibility defines. */
+#define raw_featureset raw_policy.fs
+#define host_featureset host_policy.fs
+#define pv_featureset pv_max_policy.fs
+#define hvm_featureset hvm_max_policy.fs
+
 void guest_cpuid(const struct vcpu *v, unsigned int leaf,
                  unsigned int subleaf, struct cpuid_leaf *res);
 
-- 
2.1.4



* [PATCH 05/27] x86/cpuid: Allocate a CPUID policy for every domain
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (3 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 04/27] x86/cpuid: Move featuresets into " Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-04 14:40   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 06/27] x86/domctl: Make XEN_DOMCTL_set_address_size singleshot Andrew Cooper
                   ` (21 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Introduce init_domain_cpuid_policy() to allocate an appropriate cpuid policy
for the domain (currently the domain's maximum applicable policy), and call it
during domain construction.

init_guest_cpuid() now needs calling before dom0 is constructed.
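
The construction-time call pattern is then simply (a sketch of the
arch_domain_create() hunk below):

    /* Seed the new domain's policy from the applicable maximum policy. */
    if ( (rc = init_domain_cpuid_policy(d)) )
        goto fail;  /* Only fails with -ENOMEM. */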

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/cpuid.c         | 12 ++++++++++++
 xen/arch/x86/domain.c        |  6 ++++++
 xen/arch/x86/setup.c         |  4 ++--
 xen/include/asm-x86/cpuid.h  | 13 +++++++++++++
 xen/include/asm-x86/domain.h |  3 +++
 5 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 92e825e..e7bb0d5 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -258,6 +258,18 @@ const uint32_t * __init lookup_deep_deps(uint32_t feature)
     return NULL;
 }
 
+int init_domain_cpuid_policy(struct domain *d)
+{
+    d->arch.cpuid = xmalloc(struct cpuid_policy);
+
+    if ( !d->arch.cpuid )
+        return -ENOMEM;
+
+    *d->arch.cpuid = is_pv_domain(d) ? pv_max_policy : hvm_max_policy;
+
+    return 0;
+}
+
 void guest_cpuid(const struct vcpu *v, unsigned int leaf,
                  unsigned int subleaf, struct cpuid_leaf *res)
 {
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 11fa379..3082a6c 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -563,6 +563,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
     if ( is_idle_domain(d) )
     {
         d->arch.emulation_flags = 0;
+        d->arch.cpuid = ZERO_BLOCK_PTR; /* Catch stray misuses. */
     }
     else
     {
@@ -632,6 +633,9 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
             goto fail;
         paging_initialised = 1;
 
+        if ( (rc = init_domain_cpuid_policy(d)) )
+            goto fail;
+
         d->arch.cpuids = xmalloc_array(cpuid_input_t, MAX_CPUID_INPUT);
         rc = -ENOMEM;
         if ( d->arch.cpuids == NULL )
@@ -705,6 +709,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
     cleanup_domain_irq_mapping(d);
     free_xenheap_page(d->shared_info);
     xfree(d->arch.cpuids);
+    xfree(d->arch.cpuid);
     if ( paging_initialised )
         paging_final_teardown(d);
     free_perdomain_mappings(d);
@@ -723,6 +728,7 @@ void arch_domain_destroy(struct domain *d)
 
     xfree(d->arch.e820);
     xfree(d->arch.cpuids);
+    xfree(d->arch.cpuid);
 
     free_domain_pirqs(d);
     if ( !is_idle_domain(d) )
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 94db514..0ccef1d 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1540,6 +1540,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     if ( !tboot_protect_mem_regions() )
         panic("Could not protect TXT memory regions");
 
+    init_guest_cpuid();
+
     if ( opt_dom0pvh )
         domcr_flags |= DOMCRF_pvh | DOMCRF_hap;
 
@@ -1590,8 +1592,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                "Multiple initrd candidates, picking module #%u\n",
                initrdidx);
 
-    init_guest_cpuid();
-
     /*
      * Temporarily clear SMAP in CR4 to allow user-accesses in construct_dom0().
      * This saves a large number of corner cases interactions with
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index b67c10e..86fa0b1 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -84,6 +84,16 @@ struct cpuid_policy
      * - Guest appropriate:
      *   - All FEATURESET_* words
      *
+     * Per-domain objects:
+     *
+     * - Host accurate:
+     *   - max_{,sub}leaf
+     *   - {xcr0,xss}_{high,low}
+     *   - All FEATURESET_* words
+     *
+     * - Guest accurate:
+     *   - Nothing
+     *
      * Everything else should be considered inaccurate, and not necessarily 0.
      */
 
@@ -203,6 +213,9 @@ extern struct cpuid_policy raw_policy, host_policy, pv_max_policy,
 #define pv_featureset pv_max_policy.fs
 #define hvm_featureset hvm_max_policy.fs
 
+/* Allocate and initialise a CPUID policy suitable for the domain. */
+int init_domain_cpuid_policy(struct domain *d);
+
 void guest_cpuid(const struct vcpu *v, unsigned int leaf,
                  unsigned int subleaf, struct cpuid_leaf *res);
 
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 95762cf..d2e087f 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -340,6 +340,9 @@ struct arch_domain
     /* Is PHYSDEVOP_eoi to automatically unmask the event channel? */
     bool_t auto_unmask;
 
+    /* CPUID Policy. */
+    struct cpuid_policy *cpuid;
+
     /* Values snooped from updates to cpuids[] (below). */
     u8 x86;                  /* CPU family */
     u8 x86_vendor;           /* CPU vendor */
-- 
2.1.4



* [PATCH 06/27] x86/domctl: Make XEN_DOMCTL_set_address_size singleshot
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (4 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 05/27] x86/cpuid: Allocate a CPUID policy for every domain Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-04 14:42   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 07/27] x86/cpuid: Recalculate a domains CPUID policy when appropriate Andrew Cooper
                   ` (20 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Toolstacks (including some out-of-tree ones) use XEN_DOMCTL_set_address_size
at most once per domain, and it ends up having a destructive effect on the
available CPUID policy for a domain.

To avoid ordering issues between altering the policy via domctl and the
constructive effects which would have to happen when switching back to native,
explicitly reject the switch back to native.
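
As an illustration (not part of this patch), a minimal toolstack-side sketch of
the resulting behaviour, assuming the libxc wrapper
xc_domain_set_address_size(); the exact error value seen by the caller may
differ:

/*
 * Hypothetical caller-side sketch: switching a fresh PV domain to 32bit
 * still works and stays idempotent, but switching back to 64bit is now
 * rejected instead of being performed destructively.
 */
#include <stdint.h>
#include <stdio.h>
#include <xenctrl.h>

int make_domain_32bit(xc_interface *xch, uint32_t domid)
{
    int rc = xc_domain_set_address_size(xch, domid, 32);

    if ( rc )
        return rc;                /* Fails if the domain already has memory. */

    /* A second identical call remains a successful no-op... */
    rc = xc_domain_set_address_size(xch, domid, 32);

    /* ...but switching back to native is now explicitly rejected. */
    if ( xc_domain_set_address_size(xch, domid, 64) == 0 )
        fprintf(stderr, "unexpected: switch back to 64bit succeeded\n");

    return rc;
}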

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/domain.c    | 33 +--------------------------------
 xen/arch/x86/domctl.c    | 17 ++++++-----------
 xen/include/xen/compat.h |  1 -
 3 files changed, 7 insertions(+), 44 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 3082a6c..c1f95cc 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -322,43 +322,12 @@ static void release_compat_l4(struct vcpu *v)
     v->arch.guest_table_user = pagetable_null();
 }
 
-static inline int may_switch_mode(struct domain *d)
-{
-    return (!is_hvm_domain(d) && (d->tot_pages == 0));
-}
-
-int switch_native(struct domain *d)
-{
-    struct vcpu *v;
-
-    if ( !may_switch_mode(d) )
-        return -EACCES;
-    if ( !is_pv_32bit_domain(d) && !is_pvh_32bit_domain(d) )
-        return 0;
-
-    d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0;
-
-    for_each_vcpu( d, v )
-    {
-        free_compat_arg_xlat(v);
-
-        if ( !is_pvh_domain(d) )
-            release_compat_l4(v);
-        else
-            hvm_set_mode(v, 8);
-    }
-
-    d->arch.x87_fip_width = cpu_has_fpu_sel ? 0 : 8;
-
-    return 0;
-}
-
 int switch_compat(struct domain *d)
 {
     struct vcpu *v;
     int rc;
 
-    if ( !may_switch_mode(d) )
+    if ( is_hvm_domain(d) || d->tot_pages != 0 )
         return -EACCES;
     if ( is_pv_32bit_domain(d) || is_pvh_32bit_domain(d) )
         return 0;
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index eb71c9e..069f1fe 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -514,18 +514,13 @@ long arch_do_domctl(
         break;
 
     case XEN_DOMCTL_set_address_size:
-        switch ( domctl->u.address_size.size )
-        {
-        case 32:
+        if ( ((domctl->u.address_size.size == 64) && !d->arch.is_32bit_pv) ||
+             ((domctl->u.address_size.size == 32) && d->arch.is_32bit_pv) )
+            ret = 0;
+        else if ( domctl->u.address_size.size == 32 )
             ret = switch_compat(d);
-            break;
-        case 64:
-            ret = switch_native(d);
-            break;
-        default:
-            ret = (domctl->u.address_size.size == BITS_PER_LONG) ? 0 : -EINVAL;
-            break;
-        }
+        else
+            ret = -EINVAL;
         break;
 
     case XEN_DOMCTL_get_address_size:
diff --git a/xen/include/xen/compat.h b/xen/include/xen/compat.h
index ce913ac..0868350 100644
--- a/xen/include/xen/compat.h
+++ b/xen/include/xen/compat.h
@@ -231,7 +231,6 @@ struct vcpu_runstate_info;
 void xlat_vcpu_runstate_info(struct vcpu_runstate_info *);
 
 int switch_compat(struct domain *);
-int switch_native(struct domain *);
 
 #else
 
-- 
2.1.4



* [PATCH 07/27] x86/cpuid: Recalculate a domains CPUID policy when appropriate
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (5 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 06/27] x86/domctl: Make XEN_DOMCTL_set_address_size singleshot Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-04 15:01   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 08/27] x86/hvm: Dispatch cpuid_viridian_leaves() from guest_cpuid() Andrew Cooper
                   ` (19 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Introduce recalculate_cpuid_policy() which clamps a CPUID policy based on the
domain's current restrictions.

Recalculate on domain creation immediately after copying the appropriate
policy, when switching a PV guest to being compat, and when the toolstack sets
CPUID policy data.

This needs sanitise_featureset() and lookup_deep_deps() to move out of __init.

From this point on, domains have full and correct feature-leaf information in
their CPUID policies, allowing for substantial cleanup and improvements.
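
To illustrate the clamping idea in isolation, here is a simplified standalone
model (the word count, bit assignments and dependency below are made up for the
example; Xen's real tables come from the autogenerated featureset data):

#include <stdint.h>
#include <stdio.h>

#define NWORDS 2    /* Stand-in for FSCAPINTS. */

/*
 * AND the guest featureset with the applicable maximum, then drop any
 * feature whose (illustrative) prerequisite has been lost.
 */
static void clamp_featureset(uint32_t *fs, const uint32_t *max_fs)
{
    unsigned int i;

    for ( i = 0; i < NWORDS; i++ )
        fs[i] &= max_fs[i];

    /* Toy deep dependency: bit 1 ("AVX") requires bit 0 ("XSAVE"). */
    if ( !(fs[0] & 1u) )
        fs[0] &= ~2u;
}

int main(void)
{
    uint32_t fs[NWORDS]     = { 0x3, 0xff }; /* What the toolstack asked for. */
    uint32_t max_fs[NWORDS] = { 0x2, 0x0f }; /* What this guest type permits. */

    clamp_featureset(fs, max_fs);
    printf("%#x %#x\n", fs[0], fs[1]);       /* Prints "0 0xf". */
    return 0;
}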

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/cpuid.c             | 60 +++++++++++++++++++++++++++++++++++-----
 xen/arch/x86/domain.c            |  1 +
 xen/arch/x86/domctl.c            | 23 +++++++++++++++
 xen/include/asm-x86/cpufeature.h |  1 +
 xen/include/asm-x86/cpuid.h      | 10 +++----
 5 files changed, 82 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index e7bb0d5..36d11c0 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -9,10 +9,10 @@
 const uint32_t known_features[] = INIT_KNOWN_FEATURES;
 const uint32_t special_features[] = INIT_SPECIAL_FEATURES;
 
-static const uint32_t __initconst pv_featuremask[] = INIT_PV_FEATURES;
-static const uint32_t __initconst hvm_shadow_featuremask[] = INIT_HVM_SHADOW_FEATURES;
-static const uint32_t __initconst hvm_hap_featuremask[] = INIT_HVM_HAP_FEATURES;
-static const uint32_t __initconst deep_features[] = INIT_DEEP_FEATURES;
+static const uint32_t pv_featuremask[] = INIT_PV_FEATURES;
+static const uint32_t hvm_shadow_featuremask[] = INIT_HVM_SHADOW_FEATURES;
+static const uint32_t hvm_hap_featuremask[] = INIT_HVM_HAP_FEATURES;
+static const uint32_t deep_features[] = INIT_DEEP_FEATURES;
 
 #define EMPTY_LEAF (struct cpuid_leaf){}
 
@@ -32,7 +32,7 @@ static void cpuid_count_leaf(unsigned int leaf, unsigned int subleaf,
     cpuid_count(leaf, subleaf, &data->a, &data->b, &data->c, &data->d);
 }
 
-static void __init sanitise_featureset(uint32_t *fs)
+static void sanitise_featureset(uint32_t *fs)
 {
     /* for_each_set_bit() uses unsigned longs.  Extend with zeroes. */
     uint32_t disabled_features[
@@ -228,12 +228,12 @@ void __init init_guest_cpuid(void)
     calculate_hvm_max_policy();
 }
 
-const uint32_t * __init lookup_deep_deps(uint32_t feature)
+const uint32_t *lookup_deep_deps(uint32_t feature)
 {
     static const struct {
         uint32_t feature;
         uint32_t fs[FSCAPINTS];
-    } deep_deps[] __initconst = INIT_DEEP_DEPS;
+    } deep_deps[] = INIT_DEEP_DEPS;
     unsigned int start = 0, end = ARRAY_SIZE(deep_deps);
 
     BUILD_BUG_ON(ARRAY_SIZE(deep_deps) != NR_DEEP_DEPS);
@@ -258,6 +258,50 @@ const uint32_t * __init lookup_deep_deps(uint32_t feature)
     return NULL;
 }
 
+void recalculate_cpuid_policy(struct domain *d)
+{
+    struct cpuid_policy *p = d->arch.cpuid;
+    const struct cpuid_policy *max =
+        is_pv_domain(d) ? &pv_max_policy : &hvm_max_policy;
+    uint32_t fs[FSCAPINTS], max_fs[FSCAPINTS];
+    unsigned int i;
+
+    cpuid_policy_to_featureset(p, fs);
+    memcpy(max_fs, max->fs, sizeof(max_fs));
+
+    /* Allow a toolstack to possibly select ITSC... */
+    if ( cpu_has_itsc )
+        __set_bit(X86_FEATURE_ITSC, max_fs);
+
+    for ( i = 0; i < ARRAY_SIZE(fs); i++ )
+        fs[i] &= max_fs[i];
+
+    if ( is_pv_32bit_domain(d) )
+    {
+        __clear_bit(X86_FEATURE_LM, fs);
+        if ( boot_cpu_data.x86_vendor != X86_VENDOR_AMD )
+            __clear_bit(X86_FEATURE_SYSCALL, fs);
+    }
+
+    if ( is_hvm_domain(d) && !hap_enabled(d) )
+    {
+        for ( i = 0; i < ARRAY_SIZE(fs); i++ )
+            fs[i] &= hvm_shadow_featuremask[i];
+    }
+
+    /* ... but hide ITSC in the common case. */
+    if ( !d->disable_migrate && !d->arch.vtsc )
+        __clear_bit(X86_FEATURE_ITSC, fs);
+
+    /* Fold host's FDP_EXCP_ONLY and NO_FPU_SEL into guest's view. */
+    fs[FEATURESET_7b0] &= ~special_features[FEATURESET_7b0];
+    fs[FEATURESET_7b0] |= (host_featureset[FEATURESET_7b0] &
+                           special_features[FEATURESET_7b0]);
+
+    sanitise_featureset(fs);
+    cpuid_featureset_to_policy(fs, p);
+}
+
 int init_domain_cpuid_policy(struct domain *d)
 {
     d->arch.cpuid = xmalloc(struct cpuid_policy);
@@ -267,6 +311,8 @@ int init_domain_cpuid_policy(struct domain *d)
 
     *d->arch.cpuid = is_pv_domain(d) ? pv_max_policy : hvm_max_policy;
 
+    recalculate_cpuid_policy(d);
+
     return 0;
 }
 
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index c1f95cc..7d33c41 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -352,6 +352,7 @@ int switch_compat(struct domain *d)
     }
 
     domain_set_alloc_bitsize(d);
+    recalculate_cpuid_policy(d);
 
     d->arch.x87_fip_width = 4;
 
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 069f1fe..c7e74dd 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -51,6 +51,29 @@ static int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop)
 static void update_domain_cpuid_info(struct domain *d,
                                      const xen_domctl_cpuid_t *ctl)
 {
+    struct cpuid_policy *p = d->arch.cpuid;
+    struct cpuid_leaf leaf = { ctl->eax, ctl->ebx, ctl->ecx, ctl->edx };
+
+    if ( ctl->input[0] < ARRAY_SIZE(p->basic.raw) )
+    {
+        if ( ctl->input[0] == 7 )
+        {
+            if ( ctl->input[1] < ARRAY_SIZE(p->feat.raw) )
+                p->feat.raw[ctl->input[1]] = leaf;
+        }
+        else if ( ctl->input[0] == 0xd )
+        {
+            if ( ctl->input[1] < ARRAY_SIZE(p->xstate.raw) )
+                p->xstate.raw[ctl->input[1]] = leaf;
+        }
+        else
+            p->basic.raw[ctl->input[0]] = leaf;
+    }
+    else if ( (ctl->input[0] - 0x80000000) < ARRAY_SIZE(p->extd.raw) )
+        p->extd.raw[ctl->input[0] - 0x80000000] = leaf;
+
+    recalculate_cpuid_policy(d);
+
     switch ( ctl->input[0] )
     {
     case 0: {
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index d45e650..e7181bb 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -73,6 +73,7 @@
 #define cpu_has_eist		boot_cpu_has(X86_FEATURE_EIST)
 #define cpu_has_hypervisor	boot_cpu_has(X86_FEATURE_HYPERVISOR)
 #define cpu_has_cmp_legacy	boot_cpu_has(X86_FEATURE_CMP_LEGACY)
+#define cpu_has_itsc		boot_cpu_has(X86_FEATURE_ITSC)
 
 enum _cache_type {
     CACHE_TYPE_NULL = 0,
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 86fa0b1..e20b0d2 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -86,13 +86,8 @@ struct cpuid_policy
      *
      * Per-domain objects:
      *
-     * - Host accurate:
-     *   - max_{,sub}leaf
-     *   - {xcr0,xss}_{high,low}
-     *   - All FEATURESET_* words
-     *
      * - Guest accurate:
-     *   - Nothing
+     *   - All FEATURESET_* words
      *
      * Everything else should be considered inaccurate, and not necesserily 0.
      */
@@ -216,6 +211,9 @@ extern struct cpuid_policy raw_policy, host_policy, pv_max_policy,
 /* Allocate and initialise a CPUID policy suitable for the domain. */
 int init_domain_cpuid_policy(struct domain *d);
 
+/* Clamp the CPUID policy to reality. */
+void recalculate_cpuid_policy(struct domain *d);
+
 void guest_cpuid(const struct vcpu *v, unsigned int leaf,
                  unsigned int subleaf, struct cpuid_leaf *res);
 
-- 
2.1.4



* [PATCH 08/27] x86/hvm: Dispatch cpuid_viridian_leaves() from guest_cpuid()
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (6 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 07/27] x86/cpuid: Recalculate a domains CPUID policy when appropriate Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-04 15:24   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 09/27] x86/cpuid: Dispatch cpuid_hypervisor_leaves() " Andrew Cooper
                   ` (18 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

... rather than from the legacy path.  Update the API to match guest_cpuid(),
and remove its dependence on current.

One check against EFER_SVME is replaced with the more appropriate cpu_has_svm
when determining whether MSR bitmaps are available.  Make use of the fact that
guest_cpuid() unconditionally zeroes res, to avoid repeated re-zeroing.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/cpuid.c               | 13 ++++++++
 xen/arch/x86/hvm/hvm.c             |  3 --
 xen/arch/x86/hvm/viridian.c        | 65 ++++++++++++++++++--------------------
 xen/include/asm-x86/hvm/viridian.h |  9 ++----
 4 files changed, 46 insertions(+), 44 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 36d11c0..c38e477 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -319,8 +319,21 @@ int init_domain_cpuid_policy(struct domain *d)
 void guest_cpuid(const struct vcpu *v, unsigned int leaf,
                  unsigned int subleaf, struct cpuid_leaf *res)
 {
+    const struct domain *d = v->domain;
+
     *res = EMPTY_LEAF;
 
+    /*
+     * First pass:
+     * - Dispatch the virtualised leaves to their respective handlers.
+     */
+    switch ( leaf )
+    {
+    case 0x40000000 ... 0x400000ff:
+        if ( is_viridian_domain(d) )
+            return cpuid_viridian_leaves(v, leaf, subleaf, res);
+    }
+
     /* {pv,hvm}_cpuid() have this expectation. */
     ASSERT(v == current);
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 70afcc6..ce2785f 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3353,9 +3353,6 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
     if ( !edx )
         edx = &dummy;
 
-    if ( cpuid_viridian_leaves(input, eax, ebx, ecx, edx) )
-        return;
-
     if ( cpuid_hypervisor_leaves(input, count, eax, ebx, ecx, edx) )
         return;
 
diff --git a/xen/arch/x86/hvm/viridian.c b/xen/arch/x86/hvm/viridian.c
index f6abdd2..f2ac0ab 100644
--- a/xen/arch/x86/hvm/viridian.c
+++ b/xen/arch/x86/hvm/viridian.c
@@ -66,77 +66,74 @@
 #define CPUID6A_MSR_BITMAPS     (1 << 1)
 #define CPUID6A_NESTED_PAGING   (1 << 3)
 
-int cpuid_viridian_leaves(unsigned int leaf, unsigned int *eax,
-                          unsigned int *ebx, unsigned int *ecx,
-                          unsigned int *edx)
+void cpuid_viridian_leaves(const struct vcpu *v, unsigned int leaf,
+                           unsigned int subleaf, struct cpuid_leaf *res)
 {
-    struct domain *d = current->domain;
+    const struct domain *d = v->domain;
 
-    if ( !is_viridian_domain(d) )
-        return 0;
+    ASSERT(is_viridian_domain(d));
+    ASSERT(leaf >= 0x40000000 && leaf < 0x40000100);
 
     leaf -= 0x40000000;
-    if ( leaf > 6 )
-        return 0;
 
-    *eax = *ebx = *ecx = *edx = 0;
     switch ( leaf )
     {
     case 0:
-        *eax = 0x40000006; /* Maximum leaf */
-        *ebx = 0x7263694d; /* Magic numbers  */
-        *ecx = 0x666F736F;
-        *edx = 0x76482074;
+        res->a = 0x40000006; /* Maximum leaf */
+        res->b = 0x7263694d; /* Magic numbers  */
+        res->c = 0x666F736F;
+        res->d = 0x76482074;
         break;
+
     case 1:
-        *eax = 0x31237648; /* Version number */
+        res->a = 0x31237648; /* Version number */
         break;
+
     case 2:
         /* Hypervisor information, but only if the guest has set its
            own version number. */
         if ( d->arch.hvm_domain.viridian.guest_os_id.raw == 0 )
             break;
-        *eax = 1; /* Build number */
-        *ebx = (xen_major_version() << 16) | xen_minor_version();
-        *ecx = 0; /* SP */
-        *edx = 0; /* Service branch and number */
+        res->a = 1; /* Build number */
+        res->b = (xen_major_version() << 16) | xen_minor_version();
         break;
+
     case 3:
         /* Which hypervisor MSRs are available to the guest */
-        *eax = (CPUID3A_MSR_APIC_ACCESS |
-                CPUID3A_MSR_HYPERCALL   |
-                CPUID3A_MSR_VP_INDEX);
+        res->a = (CPUID3A_MSR_APIC_ACCESS |
+                  CPUID3A_MSR_HYPERCALL   |
+                  CPUID3A_MSR_VP_INDEX);
         if ( !(viridian_feature_mask(d) & HVMPV_no_freq) )
-            *eax |= CPUID3A_MSR_FREQ;
+            res->a |= CPUID3A_MSR_FREQ;
         if ( viridian_feature_mask(d) & HVMPV_time_ref_count )
-            *eax |= CPUID3A_MSR_TIME_REF_COUNT;
+            res->a |= CPUID3A_MSR_TIME_REF_COUNT;
         if ( viridian_feature_mask(d) & HVMPV_reference_tsc )
-            *eax |= CPUID3A_MSR_REFERENCE_TSC;
+            res->a |= CPUID3A_MSR_REFERENCE_TSC;
         break;
+
     case 4:
         /* Recommended hypercall usage. */
         if ( (d->arch.hvm_domain.viridian.guest_os_id.raw == 0) ||
              (d->arch.hvm_domain.viridian.guest_os_id.fields.os < 4) )
             break;
-        *eax = CPUID4A_RELAX_TIMER_INT;
+        res->a = CPUID4A_RELAX_TIMER_INT;
         if ( viridian_feature_mask(d) & HVMPV_hcall_remote_tlb_flush )
-            *eax |= CPUID4A_HCALL_REMOTE_TLB_FLUSH;
+            res->a |= CPUID4A_HCALL_REMOTE_TLB_FLUSH;
         if ( !cpu_has_vmx_apic_reg_virt )
-            *eax |= CPUID4A_MSR_BASED_APIC;
-        *ebx = 2047; /* long spin count */
+            res->a |= CPUID4A_MSR_BASED_APIC;
+        res->b = 2047; /* long spin count */
         break;
+
     case 6:
         /* Detected and in use hardware features. */
         if ( cpu_has_vmx_virtualize_apic_accesses )
-            *eax |= CPUID6A_APIC_OVERLAY;
-        if ( cpu_has_vmx_msr_bitmap || (read_efer() & EFER_SVME) )
-            *eax |= CPUID6A_MSR_BITMAPS;
+            res->a |= CPUID6A_APIC_OVERLAY;
+        if ( cpu_has_vmx_msr_bitmap || cpu_has_svm )
+            res->a |= CPUID6A_MSR_BITMAPS;
         if ( hap_enabled(d) )
-            *eax |= CPUID6A_NESTED_PAGING;
+            res->a |= CPUID6A_NESTED_PAGING;
         break;
     }
-
-    return 1;
 }
 
 static void dump_guest_os_id(const struct domain *d)
diff --git a/xen/include/asm-x86/hvm/viridian.h b/xen/include/asm-x86/hvm/viridian.h
index bdbccd5..6e062df 100644
--- a/xen/include/asm-x86/hvm/viridian.h
+++ b/xen/include/asm-x86/hvm/viridian.h
@@ -97,13 +97,8 @@ struct viridian_domain
     union viridian_reference_tsc reference_tsc;
 };
 
-int
-cpuid_viridian_leaves(
-    unsigned int leaf,
-    unsigned int *eax,
-    unsigned int *ebx,
-    unsigned int *ecx,
-    unsigned int *edx);
+void cpuid_viridian_leaves(const struct vcpu *v, unsigned int leaf,
+                           unsigned int subleaf, struct cpuid_leaf *res);
 
 int
 wrmsr_viridian_regs(
-- 
2.1.4



* [PATCH 09/27] x86/cpuid: Dispatch cpuid_hypervisor_leaves() from guest_cpuid()
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (7 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 08/27] x86/hvm: Dispatch cpuid_viridian_leaves() from guest_cpuid() Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-04 15:34   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 10/27] x86/cpuid: Introduce named feature bitmaps Andrew Cooper
                   ` (17 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

... rather than from the legacy path.  Update the API to match guest_cpuid(),
and remove its dependence on current.

Make use of the fact that guest_cpuid() unconditionally zeroes res, to avoid
repeated re-zeroing.  To use a const struct domain, domain_cpuid() needs to be
const-corrected.
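
For background on why these leaves sit at 0x40000000, or at 0x40000100 when the
Viridian leaves occupy the base, this is roughly how a guest locates them (a
standalone sketch of the conventional probe, not code from this series):

#include <cpuid.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint32_t xen_cpuid_base(void)
{
    uint32_t base, eax, sig[3];

    /* Scan the hypervisor range in steps of 0x100 for the Xen signature. */
    for ( base = 0x40000000; base < 0x40010000; base += 0x100 )
    {
        __cpuid(base, eax, sig[0], sig[1], sig[2]);

        if ( !memcmp(sig, "XenVMMXenVMM", 12) && (eax - base) >= 2 )
            return base;
    }

    return 0;
}

int main(void)
{
    uint32_t base = xen_cpuid_base();

    if ( base )
        printf("Xen CPUID leaves start at %#x\n", base);
    else
        printf("Not running on Xen\n");
    return 0;
}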

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/cpuid.c            |   3 ++
 xen/arch/x86/domain.c           |   2 +-
 xen/arch/x86/hvm/hvm.c          |   3 --
 xen/arch/x86/traps.c            | 102 ++++++++++++++++------------------------
 xen/include/asm-x86/domain.h    |   2 +-
 xen/include/asm-x86/processor.h |   4 +-
 6 files changed, 48 insertions(+), 68 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index c38e477..86f598f 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -332,6 +332,9 @@ void guest_cpuid(const struct vcpu *v, unsigned int leaf,
     case 0x40000000 ... 0x400000ff:
         if ( is_viridian_domain(d) )
             return cpuid_viridian_leaves(v, leaf, subleaf, res);
+        /* Fallthrough. */
+    case 0x40000100 ... 0x4fffffff:
+        return cpuid_hypervisor_leaves(v, leaf, subleaf, res);
     }
 
     /* {pv,hvm}_cpuid() have this expectation. */
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 7d33c41..b554a9c 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -2623,7 +2623,7 @@ void arch_dump_vcpu_info(struct vcpu *v)
 }
 
 void domain_cpuid(
-    struct domain *d,
+    const struct domain *d,
     unsigned int  input,
     unsigned int  sub_input,
     unsigned int  *eax,
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index ce2785f..bdf8ca8 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3353,9 +3353,6 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
     if ( !edx )
         edx = &dummy;
 
-    if ( cpuid_hypervisor_leaves(input, count, eax, ebx, ecx, edx) )
-        return;
-
     if ( input & 0x7fffffff )
     {
         /*
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 02f2d5c..8b395f8 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -902,20 +902,18 @@ int wrmsr_hypervisor_regs(uint32_t idx, uint64_t val)
     return 0;
 }
 
-int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
-               uint32_t *eax, uint32_t *ebx, uint32_t *ecx, uint32_t *edx)
+void cpuid_hypervisor_leaves(const struct vcpu *v, unsigned int leaf,
+                             unsigned int subleaf, struct cpuid_leaf *res)
 {
-    struct vcpu *curr = current;
-    struct domain *currd = curr->domain;
-    /* Optionally shift out of the way of Viridian architectural leaves. */
-    uint32_t base = is_viridian_domain(currd) ? 0x40000100 : 0x40000000;
+    const struct domain *d = v->domain;
+    unsigned int base = leaf & ~0xff;
+    unsigned int idx  = leaf - base;
     uint32_t limit, dummy;
 
-    idx -= base;
-    if ( idx > XEN_CPUID_MAX_NUM_LEAVES )
-        return 0; /* Avoid unnecessary pass through domain_cpuid() */
+    if ( base > 0x40000100 || idx > XEN_CPUID_MAX_NUM_LEAVES )
+        return; /* Avoid unnecessary pass through domain_cpuid() */
 
-    domain_cpuid(currd, base, 0, &limit, &dummy, &dummy, &dummy);
+    domain_cpuid(d, base, 0, &limit, &dummy, &dummy, &dummy);
     if ( limit == 0 )
         /* Default number of leaves */
         limit = XEN_CPUID_MAX_NUM_LEAVES;
@@ -929,83 +927,71 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
             limit = XEN_CPUID_MAX_NUM_LEAVES;
     }
 
-    if ( idx > limit ) 
-        return 0;
+    if ( idx > limit )
+        return;
 
     switch ( idx )
     {
     case 0:
-        *eax = base + limit; /* Largest leaf */
-        *ebx = XEN_CPUID_SIGNATURE_EBX;
-        *ecx = XEN_CPUID_SIGNATURE_ECX;
-        *edx = XEN_CPUID_SIGNATURE_EDX;
+        res->a = base + limit; /* Largest leaf */
+        res->b = XEN_CPUID_SIGNATURE_EBX;
+        res->c = XEN_CPUID_SIGNATURE_ECX;
+        res->d = XEN_CPUID_SIGNATURE_EDX;
         break;
 
     case 1:
-        *eax = (xen_major_version() << 16) | xen_minor_version();
-        *ebx = 0;          /* Reserved */
-        *ecx = 0;          /* Reserved */
-        *edx = 0;          /* Reserved */
+        res->a = (xen_major_version() << 16) | xen_minor_version();
         break;
 
     case 2:
-        *eax = 1;          /* Number of hypercall-transfer pages */
-        *ebx = 0x40000000; /* MSR base address */
-        if ( is_viridian_domain(currd) )
-            *ebx = 0x40000200;
-        *ecx = 0;          /* Features 1 */
-        *edx = 0;          /* Features 2 */
-        if ( is_pv_domain(currd) )
-            *ecx |= XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD;
+        res->a = 1;          /* Number of hypercall-transfer pages */
+        res->b = 0x40000000; /* MSR base address */
+        if ( is_viridian_domain(d) )
+            res->b = 0x40000200;
+        if ( is_pv_domain(d) )
+            res->c |= XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD;
         break;
 
     case 3: /* Time leaf. */
-        switch ( sub_idx )
+        switch ( subleaf )
         {
         case 0: /* features */
-            *eax = ((!!currd->arch.vtsc << 0) |
-                    (!!host_tsc_is_safe() << 1) |
-                    (!!boot_cpu_has(X86_FEATURE_RDTSCP) << 2));
-            *ebx = currd->arch.tsc_mode;
-            *ecx = currd->arch.tsc_khz;
-            *edx = currd->arch.incarnation;
+            res->a = ((!!d->arch.vtsc << 0) |
+                      (!!host_tsc_is_safe() << 1) |
+                      (!!boot_cpu_has(X86_FEATURE_RDTSCP) << 2));
+            res->b = d->arch.tsc_mode;
+            res->c = d->arch.tsc_khz;
+            res->d = d->arch.incarnation;
             break;
 
         case 1: /* scale and offset */
         {
             uint64_t offset;
 
-            if ( !currd->arch.vtsc )
-                offset = currd->arch.vtsc_offset;
+            if ( !d->arch.vtsc )
+                offset = d->arch.vtsc_offset;
             else
                 /* offset already applied to value returned by virtual rdtscp */
                 offset = 0;
-            *eax = (uint32_t)offset;
-            *ebx = (uint32_t)(offset >> 32);
-            *ecx = currd->arch.vtsc_to_ns.mul_frac;
-            *edx = (s8)currd->arch.vtsc_to_ns.shift;
+            res->a = (uint32_t)offset;
+            res->b = (uint32_t)(offset >> 32);
+            res->c = d->arch.vtsc_to_ns.mul_frac;
+            res->d = (s8)d->arch.vtsc_to_ns.shift;
             break;
         }
 
         case 2: /* physical cpu_khz */
-            *eax = cpu_khz;
-            *ebx = *ecx = *edx = 0;
-            break;
-
-        default:
-            *eax = *ebx = *ecx = *edx = 0;
+            res->a = cpu_khz;
             break;
         }
         break;
 
     case 4: /* HVM hypervisor leaf. */
-        *eax = *ebx = *ecx = *edx = 0;
-
-        if ( !has_hvm_container_domain(currd) || sub_idx != 0 )
+        if ( !has_hvm_container_domain(d) || subleaf != 0 )
             break;
 
         if ( cpu_has_vmx_apic_reg_virt )
-            *eax |= XEN_HVM_CPUID_APIC_ACCESS_VIRT;
+            res->a |= XEN_HVM_CPUID_APIC_ACCESS_VIRT;
 
         /*
          * We want to claim that x2APIC is virtualized if APIC MSR accesses
@@ -1016,24 +1002,22 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
         if ( cpu_has_vmx_virtualize_x2apic_mode &&
              cpu_has_vmx_apic_reg_virt &&
              cpu_has_vmx_virtual_intr_delivery )
-            *eax |= XEN_HVM_CPUID_X2APIC_VIRT;
+            res->a |= XEN_HVM_CPUID_X2APIC_VIRT;
 
         /*
          * Indicate that memory mapped from other domains (either grants or
          * foreign pages) has valid IOMMU entries.
          */
-        *eax |= XEN_HVM_CPUID_IOMMU_MAPPINGS;
+        res->a |= XEN_HVM_CPUID_IOMMU_MAPPINGS;
 
         /* Indicate presence of vcpu id and set it in ebx */
-        *eax |= XEN_HVM_CPUID_VCPU_ID_PRESENT;
-        *ebx = curr->vcpu_id;
+        res->a |= XEN_HVM_CPUID_VCPU_ID_PRESENT;
+        res->b = v->vcpu_id;
         break;
 
     default:
         BUG();
     }
-
-    return 1;
 }
 
 void pv_cpuid(struct cpu_user_regs *regs)
@@ -1047,9 +1031,6 @@ void pv_cpuid(struct cpu_user_regs *regs)
     subleaf = c = regs->_ecx;
     d = regs->_edx;
 
-    if ( cpuid_hypervisor_leaves(leaf, subleaf, &a, &b, &c, &d) )
-        goto out;
-
     if ( leaf & 0x7fffffff )
     {
         /*
@@ -1381,7 +1362,6 @@ void pv_cpuid(struct cpu_user_regs *regs)
         break;
     }
 
- out:
     regs->rax = a;
     regs->rbx = b;
     regs->rcx = c;
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index d2e087f..1ae0cd3 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -617,7 +617,7 @@ unsigned long pv_guest_cr4_fixup(const struct vcpu *, unsigned long guest_cr4);
              X86_CR4_OSXSAVE | X86_CR4_SMEP |               \
              X86_CR4_FSGSBASE | X86_CR4_SMAP))
 
-void domain_cpuid(struct domain *d,
+void domain_cpuid(const struct domain *d,
                   unsigned int  input,
                   unsigned int  sub_input,
                   unsigned int  *eax,
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index aff115b..e43b956 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -610,8 +610,8 @@ struct stubs {
 DECLARE_PER_CPU(struct stubs, stubs);
 unsigned long alloc_stub_page(unsigned int cpu, unsigned long *mfn);
 
-int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
-          uint32_t *eax, uint32_t *ebx, uint32_t *ecx, uint32_t *edx);
+void cpuid_hypervisor_leaves(const struct vcpu *v, unsigned int leaf,
+                             unsigned int subleaf, struct cpuid_leaf *res);
 int rdmsr_hypervisor_regs(uint32_t idx, uint64_t *val);
 int wrmsr_hypervisor_regs(uint32_t idx, uint64_t val);
 
-- 
2.1.4



* [PATCH 10/27] x86/cpuid: Introduce named feature bitmaps
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (8 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 09/27] x86/cpuid: Dispatch cpuid_hypervisor_leaves() " Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-04 15:44   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 11/27] x86/hvm: Improve hvm_efer_valid() using named features Andrew Cooper
                   ` (16 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Use anonymous unions to access the feature leaves either as complete words or
by individual named feature.

A feature name is introduced for every architectural X86_FEATURE_*, other than
the dynamically calculated values such as APIC, OSXSAVE and OSPKE.

No functional change.
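
The technique, reduced to a minimal standalone example (the leaf and the
handful of bit names below are purely illustrative, and the bitfield layout is
relied upon in the same implementation-specific way the hypervisor already
does):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct example_policy {
    union {
        uint32_t _1d;                              /* CPUID.1.EDX as a word. */
        struct {
            bool fpu:1, vme:1, de:1, pse:1, tsc:1; /* ...and as named bits 0-4. */
        };
    };
};

int main(void)
{
    struct example_policy p = { ._1d = 0x11 };     /* Bits 0 and 4 set. */

    /* Both views alias the same storage. */
    printf("fpu=%d tsc=%d vme=%d word=%#x\n", p.fpu, p.tsc, p.vme, p._1d);

    p.vme = true;                                  /* Setting a named bit... */
    printf("word is now %#x\n", p._1d);            /* ...updates the word (0x13). */
    return 0;
}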

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/include/asm-x86/cpuid.h | 103 +++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 97 insertions(+), 6 deletions(-)

diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index e20b0d2..0371e6e 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -103,7 +103,25 @@ struct cpuid_policy
 
             /* Leaf 0x1 - family/model/stepping and features. */
             struct {
-                uint32_t :32, :32, _1c, _1d;
+                uint32_t :32, :32;
+                union {
+                    uint32_t _1c;
+                    struct {
+                        bool sse3:1, pclmulqdq:1, dtes64:1, monitor:1, dscpl:1, vmx:1, smx:1, eist:1,
+                             tm2:1, ssse3:1, /* cid */:1, /* sdbg */:1, fma:1, cx16:1, xtpr:1, pcdm:1,
+                             :1, pcid:1, dca:1, sse4_1:1, sse4_2:1, x2apic:1, movebe:1, popcnt:1,
+                             tsc_deadline:1, aesni:1, xsave:1, /* osxsave */:1, avx:1, f16c:1, rdrand:1, hv:1;
+                    };
+                };
+                union {
+                    uint32_t _1d;
+                    struct {
+                        bool fpu:1, vme:1, de:1, pse:1, tsc:1, msr:1, pae:1, mce:1,
+                             cx8:1, /* apic */:1, :1, sep:1, mtrr:1, pge:1, mca:1, cmov:1,
+                             pat:1, pse36:1, /* psn */:1, clflush:1, :1, ds:1, acpi:1, mmx:1,
+                             fxsr:1, sse:1, sse2:1, /* ss */:1, htt:1, tm1:1, /* IA-64 */:1, pbe:1;
+                    };
+                };
             };
         };
     } basic;
@@ -113,7 +131,34 @@ struct cpuid_policy
         struct cpuid_leaf raw[CPUID_GUEST_NR_FEAT];
         struct {
             struct {
-                uint32_t max_subleaf, _7b0, _7c0, _7d0;
+                uint32_t max_subleaf;
+                union {
+                    uint32_t _7b0;
+                    struct {
+                        bool fsgsbase:1, tsc_adjust:1, sgx:1, bmi1:1, hle:1, avx2:1, fdp_excp_only:1, smep:1,
+                             bmi2:1, erms:1, invpcid:1, rtm:1, pqm:1, no_fpu_sel:1, mpx:1, pqe:1,
+                             avx512f:1, avx512dq:1, rdseed:1, adx:1, smap:1, avx512ifma:1, /* pcommit */:1, clflushopt:1,
+                             clwb:1, /* pt */:1, avx512pf:1, avx512er:1, avx512cd:1, sha:1, avx512bw:1, avx512vl:1;
+                    };
+                };
+                union {
+                    uint32_t _7c0;
+                    struct {
+                        bool prefetchwt1:1, avx512vbmi:1, :1, pku: 1, :1, :1, :1, :1,
+                             :1, :1, :1, :1, :1, :1, :1, :1,
+                             :1, :1, :1, :1, :1, :1, :1, :1,
+                             :1, :1, :1, :1, :1, :1, :1, :1;
+                    };
+                };
+                union {
+                    uint32_t _7d0;
+                    struct {
+                        bool :1, avx512_4vnniw:1, avx512_4fmaps:1, :1, :1, :1, :1, :1,
+                             :1, :1, :1, :1, :1, :1, :1, :1,
+                             :1, :1, :1, :1, :1, :1, :1, :1,
+                             :1, :1, :1, :1, :1, :1, :1, :1;
+                    };
+                };
             };
         };
     } feat;
@@ -126,7 +171,16 @@ struct cpuid_policy
                 uint32_t xcr0_low, :32, :32, xcr0_high;
             };
             struct {
-                uint32_t Da1, :32, xss_low, xss_high;
+                union {
+                    uint32_t Da1;
+                    struct {
+                        bool xsaveopt: 1, xsavec: 1, xgetbv1: 1, xsaves: 1, :1, :1, :1, :1,
+                             :1, :1, :1, :1, :1, :1, :1, :1,
+                             :1, :1, :1, :1, :1, :1, :1, :1,
+                             :1, :1, :1, :1, :1, :1, :1, :1;
+                    };
+                };
+                uint32_t :32, xss_low, xss_high;
             };
         };
     } xstate;
@@ -142,7 +196,25 @@ struct cpuid_policy
 
             /* Leaf 0x80000001 - family/model/stepping and features. */
             struct {
-                uint32_t :32, :32, e1c, e1d;
+                uint32_t :32, :32;
+                union {
+                    uint32_t e1c;
+                    struct {
+                        bool lahf_lm:1, cmp_legacy:1, svm:1, extapic:1, cr8_legacy:1, abm:1, sse4a:1, misalignsse:1,
+                             _3dnowprefetch:1, osvw: 1, ibs:1, xop:1, skinit:1, wdt:1, :1, lwp:1,
+                             fma4:1, :1, :1, nodeid_msr:1, :1, tbm:1, topoext:1, :1,
+                             :1, :1, dbext:1, :1, :1, monitorx:1, :1, :1;
+                    };
+                };
+                union {
+                    uint32_t e1d;
+                    struct {
+                        bool :1, :1, :1, :1, :1, :1, :1, :1,
+                             :1, :1, :1, syscall:1, :1, :1, :1, :1,
+                             :1, :1, :1, :1, nx:1, :1, mmxext:1, :1,
+                             :1, ffxsr:1, page1gb:1, rdtscp:1, :1, lm:1, _3dnowext:1, _3dnow:1;
+                    };
+                };
             };
 
             uint64_t :64, :64; /* Brand string. */
@@ -153,12 +225,31 @@ struct cpuid_policy
 
             /* Leaf 0x80000007 - Advanced Power Management. */
             struct {
-                uint32_t :32, :32, :32, e7d;
+                uint32_t :32, :32, :32;
+                union {
+                    uint32_t e7d;
+                    struct {
+                        bool :1, :1, :1, :1, :1, :1, :1, :1,
+                             itsc:1, :1, efro:1, :1, :1, :1, :1, :1,
+                             :1, :1, :1, :1, :1, :1, :1, :1,
+                             :1, :1, :1, :1, :1, :1, :1, :1;
+                    };
+                };
             };
 
             /* Leaf 0x80000008 - Misc addr/feature info. */
             struct {
-                uint32_t :32, e8b, :32, :32;
+                uint32_t :32;
+                union {
+                    uint32_t e8b;
+                    struct {
+                        bool clzero:1, :1, :1, :1, :1, :1, :1, :1,
+                             :1, :1, :1, :1, :1, :1, :1, :1,
+                             :1, :1, :1, :1, :1, :1, :1, :1,
+                             :1, :1, :1, :1, :1, :1, :1, :1;
+                    };
+                };
+                uint32_t :32, :32;
             };
         };
     } extd;
-- 
2.1.4



* [PATCH 11/27] x86/hvm: Improve hvm_efer_valid() using named features
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (9 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 10/27] x86/cpuid: Introduce named feature bitmaps Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-05 11:34   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 12/27] x86/hvm: Improve CR4 verification " Andrew Cooper
                   ` (15 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Pick the appropriate cpuid_policy object rather than using hvm_cpuid() or
boot_cpu_data.  This breaks the dependency on current.

As data is read straight out of cpuid_policy, there is no need to work around
the fact that X86_FEATURE_SYSCALL might be clear because of the dynamic
adjustment in hvm_cpuid().  This simplifies the SCE handling, as EFER.SCE can
be set in isolation in 32bit mode on Intel hardware.

Alter nestedhvm_enabled() to be const-correct, allowing hvm_efer_valid() to be
properly const-correct.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/hvm/hvm.c              | 43 ++++++++++---------------------------
 xen/arch/x86/hvm/nestedhvm.c        |  6 ++----
 xen/include/asm-x86/hvm/hvm.h       |  3 +--
 xen/include/asm-x86/hvm/nestedhvm.h |  2 +-
 4 files changed, 15 insertions(+), 39 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index bdf8ca8..d651d0b 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -914,56 +914,35 @@ static int hvm_save_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
 }
 
 /* Return a string indicating the error, or NULL for valid. */
-const char *hvm_efer_valid(const struct vcpu *v, uint64_t value,
-                           signed int cr0_pg)
+const char *hvm_efer_valid(const struct vcpu *v, uint64_t value, int cr0_pg)
 {
-    unsigned int ext1_ecx = 0, ext1_edx = 0;
+    const struct domain *d = v->domain;
+    const struct cpuid_policy *p;
 
-    if ( cr0_pg < 0 && !is_hardware_domain(v->domain) )
-    {
-        unsigned int level;
-
-        ASSERT(v->domain == current->domain);
-        hvm_cpuid(0x80000000, &level, NULL, NULL, NULL);
-        if ( (level >> 16) == 0x8000 && level > 0x80000000 )
-            hvm_cpuid(0x80000001, NULL, NULL, &ext1_ecx, &ext1_edx);
-    }
+    if ( cr0_pg < 0 && !is_hardware_domain(d) )
+        p = d->arch.cpuid;
     else
-    {
-        ext1_edx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_LM)];
-        ext1_ecx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_SVM)];
-    }
+        p = &host_policy;
 
-    /*
-     * Guests may want to set EFER.SCE and EFER.LME at the same time, so we
-     * can't make the check depend on only X86_FEATURE_SYSCALL (which on VMX
-     * will be clear without the guest having entered 64-bit mode).
-     */
-    if ( (value & EFER_SCE) &&
-         !(ext1_edx & cpufeat_mask(X86_FEATURE_SYSCALL)) &&
-         (cr0_pg >= 0 || !(value & EFER_LME)) )
+    if ( (value & EFER_SCE) && !p->extd.syscall )
         return "SCE without feature";
 
-    if ( (value & (EFER_LME | EFER_LMA)) &&
-         !(ext1_edx & cpufeat_mask(X86_FEATURE_LM)) )
+    if ( (value & (EFER_LME | EFER_LMA)) && !p->extd.lm )
         return "LME/LMA without feature";
 
     if ( (value & EFER_LMA) && (!(value & EFER_LME) || !cr0_pg) )
         return "LMA/LME/CR0.PG inconsistency";
 
-    if ( (value & EFER_NX) && !(ext1_edx & cpufeat_mask(X86_FEATURE_NX)) )
+    if ( (value & EFER_NX) && !p->extd.nx )
         return "NX without feature";
 
-    if ( (value & EFER_SVME) &&
-         (!(ext1_ecx & cpufeat_mask(X86_FEATURE_SVM)) ||
-          !nestedhvm_enabled(v->domain)) )
+    if ( (value & EFER_SVME) && (!p->extd.svm || !nestedhvm_enabled(d)) )
         return "SVME without nested virt";
 
     if ( (value & EFER_LMSLE) && !cpu_has_lmsl )
         return "LMSLE without support";
 
-    if ( (value & EFER_FFXSE) &&
-         !(ext1_edx & cpufeat_mask(X86_FEATURE_FFXSR)) )
+    if ( (value & EFER_FFXSE) && !p->extd.ffxsr )
         return "FFXSE without feature";
 
     return NULL;
diff --git a/xen/arch/x86/hvm/nestedhvm.c b/xen/arch/x86/hvm/nestedhvm.c
index c09c5b2..a400d55 100644
--- a/xen/arch/x86/hvm/nestedhvm.c
+++ b/xen/arch/x86/hvm/nestedhvm.c
@@ -27,11 +27,9 @@
 static unsigned long *shadow_io_bitmap[3];
 
 /* Nested HVM on/off per domain */
-bool_t
-nestedhvm_enabled(struct domain *d)
+bool nestedhvm_enabled(const struct domain *d)
 {
-    return is_hvm_domain(d) &&
-           d->arch.hvm_domain.params[HVM_PARAM_NESTEDHVM];
+    return is_hvm_domain(d) && d->arch.hvm_domain.params[HVM_PARAM_NESTEDHVM];
 }
 
 /* Nested VCPU */
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 8c95c08..4248546 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -613,8 +613,7 @@ static inline bool altp2m_vcpu_emulate_ve(struct vcpu *v)
 }
 
 /* Check CR4/EFER values */
-const char *hvm_efer_valid(const struct vcpu *v, uint64_t value,
-                           signed int cr0_pg);
+const char *hvm_efer_valid(const struct vcpu *v, uint64_t value, int cr0_pg);
 unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v, bool_t restore);
 
 /*
diff --git a/xen/include/asm-x86/hvm/nestedhvm.h b/xen/include/asm-x86/hvm/nestedhvm.h
index bc82425..47165fc 100644
--- a/xen/include/asm-x86/hvm/nestedhvm.h
+++ b/xen/include/asm-x86/hvm/nestedhvm.h
@@ -33,7 +33,7 @@ enum nestedhvm_vmexits {
 };
 
 /* Nested HVM on/off per domain */
-bool_t nestedhvm_enabled(struct domain *d);
+bool nestedhvm_enabled(const struct domain *d);
 
 /* Nested VCPU */
 int nestedhvm_vcpu_initialise(struct vcpu *v);
-- 
2.1.4



* [PATCH 12/27] x86/hvm: Improve CR4 verification using named features
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (10 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 11/27] x86/hvm: Improve hvm_efer_valid() using named features Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-05 11:39   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 13/27] x86/vvmx: Use hvm_cr4_guest_valid_bits() to calculate MSR_IA32_VMX_CR4_FIXED1 Andrew Cooper
                   ` (14 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Alter the function to return the valid CR4 bits, rather than the invalid CR4
bits.  This will allow reuse in other areas of code.

Pick the appropriate cpuid_policy object rather than using hvm_cpuid() or
boot_cpu_data.  This breaks the dependency on current.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/domain.c         |  2 +-
 xen/arch/x86/hvm/hvm.c        | 92 +++++++++++++++----------------------------
 xen/include/asm-x86/hvm/hvm.h |  2 +-
 3 files changed, 34 insertions(+), 62 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index b554a9c..319cc8a 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1512,7 +1512,7 @@ int arch_set_info_hvm_guest(struct vcpu *v, const vcpu_hvm_context_t *ctx)
     if ( v->arch.hvm_vcpu.guest_efer & EFER_LME )
         v->arch.hvm_vcpu.guest_efer |= EFER_LMA;
 
-    if ( v->arch.hvm_vcpu.guest_cr[4] & hvm_cr4_guest_reserved_bits(v, 0) )
+    if ( v->arch.hvm_vcpu.guest_cr[4] & ~hvm_cr4_guest_valid_bits(v, 0) )
     {
         gprintk(XENLOG_ERR, "Bad CR4 value: %#016lx\n",
                 v->arch.hvm_vcpu.guest_cr[4]);
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index d651d0b..e2060d2 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -956,67 +956,39 @@ const char *hvm_efer_valid(const struct vcpu *v, uint64_t value, int cr0_pg)
         X86_CR0_WP | X86_CR0_AM | X86_CR0_NW |  \
         X86_CR0_CD | X86_CR0_PG)))
 
-/* These bits in CR4 cannot be set by the guest. */
-unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v,bool_t restore)
+/* These bits in CR4 can be set by the guest. */
+unsigned long hvm_cr4_guest_valid_bits(const struct vcpu *v, bool restore)
 {
-    unsigned int leaf1_ecx = 0, leaf1_edx = 0;
-    unsigned int leaf7_0_ebx = 0, leaf7_0_ecx = 0;
-
-    if ( !restore && !is_hardware_domain(v->domain) )
-    {
-        unsigned int level;
+    const struct domain *d = v->domain;
+    const struct cpuid_policy *p;
+    bool mce, vmxe;
 
-        ASSERT(v->domain == current->domain);
-        hvm_cpuid(0, &level, NULL, NULL, NULL);
-        if ( level >= 1 )
-            hvm_cpuid(1, NULL, NULL, &leaf1_ecx, &leaf1_edx);
-        if ( level >= 7 )
-            hvm_cpuid(7, NULL, &leaf7_0_ebx, &leaf7_0_ecx, NULL);
-    }
+    if ( !restore && !is_hardware_domain(d) )
+        p = d->arch.cpuid;
     else
-    {
-        leaf1_edx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_VME)];
-        leaf1_ecx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_PCID)];
-        leaf7_0_ebx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_FSGSBASE)];
-        leaf7_0_ecx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_PKU)];
-    }
-
-    return ~(unsigned long)
-            ((leaf1_edx & cpufeat_mask(X86_FEATURE_VME) ?
-              X86_CR4_VME | X86_CR4_PVI : 0) |
-             (leaf1_edx & cpufeat_mask(X86_FEATURE_TSC) ?
-              X86_CR4_TSD : 0) |
-             (leaf1_edx & cpufeat_mask(X86_FEATURE_DE) ?
-              X86_CR4_DE : 0) |
-             (leaf1_edx & cpufeat_mask(X86_FEATURE_PSE) ?
-              X86_CR4_PSE : 0) |
-             (leaf1_edx & cpufeat_mask(X86_FEATURE_PAE) ?
-              X86_CR4_PAE : 0) |
-             (leaf1_edx & (cpufeat_mask(X86_FEATURE_MCE) |
-                           cpufeat_mask(X86_FEATURE_MCA)) ?
-              X86_CR4_MCE : 0) |
-             (leaf1_edx & cpufeat_mask(X86_FEATURE_PGE) ?
-              X86_CR4_PGE : 0) |
-             X86_CR4_PCE |
-             (leaf1_edx & cpufeat_mask(X86_FEATURE_FXSR) ?
-              X86_CR4_OSFXSR : 0) |
-             (leaf1_edx & cpufeat_mask(X86_FEATURE_SSE) ?
-              X86_CR4_OSXMMEXCPT : 0) |
-             ((restore || nestedhvm_enabled(v->domain)) &&
-              (leaf1_ecx & cpufeat_mask(X86_FEATURE_VMX)) ?
-              X86_CR4_VMXE : 0) |
-             (leaf7_0_ebx & cpufeat_mask(X86_FEATURE_FSGSBASE) ?
-              X86_CR4_FSGSBASE : 0) |
-             (leaf1_ecx & cpufeat_mask(X86_FEATURE_PCID) ?
-              X86_CR4_PCIDE : 0) |
-             (leaf1_ecx & cpufeat_mask(X86_FEATURE_XSAVE) ?
-              X86_CR4_OSXSAVE : 0) |
-             (leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMEP) ?
-              X86_CR4_SMEP : 0) |
-             (leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMAP) ?
-              X86_CR4_SMAP : 0) |
-              (leaf7_0_ecx & cpufeat_mask(X86_FEATURE_PKU) ?
-              X86_CR4_PKE : 0));
+        p = &host_policy;
+
+    /* Logic broken out simply to aid readability below. */
+    mce  = p->basic.mce || p->basic.mca;
+    vmxe = p->basic.vmx && (restore || nestedhvm_enabled(d));
+
+    return ((p->basic.vme     ? X86_CR4_VME | X86_CR4_PVI : 0) |
+            (p->basic.tsc     ? X86_CR4_TSD               : 0) |
+            (p->basic.de      ? X86_CR4_DE                : 0) |
+            (p->basic.pse     ? X86_CR4_PSE               : 0) |
+            (p->basic.pae     ? X86_CR4_PAE               : 0) |
+            (mce              ? X86_CR4_MCE               : 0) |
+            (p->basic.pge     ? X86_CR4_PGE               : 0) |
+                                X86_CR4_PCE                    |
+            (p->basic.fxsr    ? X86_CR4_OSFXSR            : 0) |
+            (p->basic.sse     ? X86_CR4_OSXMMEXCPT        : 0) |
+            (vmxe             ? X86_CR4_VMXE              : 0) |
+            (p->feat.fsgsbase ? X86_CR4_FSGSBASE          : 0) |
+            (p->basic.pcid    ? X86_CR4_PCIDE             : 0) |
+            (p->basic.xsave   ? X86_CR4_OSXSAVE           : 0) |
+            (p->feat.smep     ? X86_CR4_SMEP              : 0) |
+            (p->feat.smap     ? X86_CR4_SMAP              : 0) |
+            (p->feat.pku      ? X86_CR4_PKE               : 0));
 }
 
 static int hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
@@ -1053,7 +1025,7 @@ static int hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
         return -EINVAL;
     }
 
-    if ( ctxt.cr4 & hvm_cr4_guest_reserved_bits(v, 1) )
+    if ( ctxt.cr4 & ~hvm_cr4_guest_valid_bits(v, 1) )
     {
         printk(XENLOG_G_ERR "HVM%d restore: bad CR4 %#" PRIx64 "\n",
                d->domain_id, ctxt.cr4);
@@ -2389,7 +2361,7 @@ int hvm_set_cr4(unsigned long value, bool_t may_defer)
     struct vcpu *v = current;
     unsigned long old_cr;
 
-    if ( value & hvm_cr4_guest_reserved_bits(v, 0) )
+    if ( value & ~hvm_cr4_guest_valid_bits(v, 0) )
     {
         HVM_DBG_LOG(DBG_LEVEL_1,
                     "Guest attempts to set reserved bit in CR4: %lx",
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 4248546..c1b07b7 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -614,7 +614,7 @@ static inline bool altp2m_vcpu_emulate_ve(struct vcpu *v)
 
 /* Check CR4/EFER values */
 const char *hvm_efer_valid(const struct vcpu *v, uint64_t value, int cr0_pg);
-unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v, bool_t restore);
+unsigned long hvm_cr4_guest_valid_bits(const struct vcpu *v, bool restore);
 
 /*
  * This must be defined as a macro instead of an inline function,
-- 
2.1.4



* [PATCH 13/27] x86/vvmx: Use hvm_cr4_guest_valid_bits() to calculate MSR_IA32_VMX_CR4_FIXED1
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (11 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 12/27] x86/hvm: Improve CR4 verification " Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-05  2:40   ` Tian, Kevin
  2017-01-05 11:42   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 14/27] x86/pv: Improve pv_cpuid() using named features Andrew Cooper
                   ` (13 subsequent siblings)
  26 siblings, 2 replies; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, Jan Beulich

Reuse the logic in hvm_cr4_guest_valid_bits() instead of duplicating it.

This fixes a bug in the handling of X86_CR4_PCE.  The RDPMC instruction
predates the architectural performance monitoring feature, and has been around
since the P6.  X86_CR4_PCE is like X86_CR4_TSD and only controls whether RDPMC
is available at cpl != 0, not whether RDPMC is available at all.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
---
 xen/arch/x86/hvm/vmx/vvmx.c | 58 +++------------------------------------------
 1 file changed, 3 insertions(+), 55 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index d53c576..0a298c7 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -1849,16 +1849,12 @@ int nvmx_handle_invvpid(struct cpu_user_regs *regs)
 int nvmx_msr_read_intercept(unsigned int msr, u64 *msr_content)
 {
     struct vcpu *v = current;
-    unsigned int eax, ebx, ecx, edx;
+    struct domain *d = v->domain;
     u64 data = 0, host_data = 0;
     int r = 1;
 
-    if ( !nestedhvm_enabled(v->domain) )
-        return 0;
-
     /* VMX capablity MSRs are available only when guest supports VMX. */
-    hvm_cpuid(0x1, NULL, NULL, &ecx, &edx);
-    if ( !(ecx & cpufeat_mask(X86_FEATURE_VMX)) )
+    if ( !nestedhvm_enabled(d) || !d->arch.cpuid->basic.vmx )
         return 0;
 
     /*
@@ -2004,55 +2000,7 @@ int nvmx_msr_read_intercept(unsigned int msr, u64 *msr_content)
         data = X86_CR4_VMXE;
         break;
     case MSR_IA32_VMX_CR4_FIXED1:
-        if ( edx & cpufeat_mask(X86_FEATURE_VME) )
-            data |= X86_CR4_VME | X86_CR4_PVI;
-        if ( edx & cpufeat_mask(X86_FEATURE_TSC) )
-            data |= X86_CR4_TSD;
-        if ( edx & cpufeat_mask(X86_FEATURE_DE) )
-            data |= X86_CR4_DE;
-        if ( edx & cpufeat_mask(X86_FEATURE_PSE) )
-            data |= X86_CR4_PSE;
-        if ( edx & cpufeat_mask(X86_FEATURE_PAE) )
-            data |= X86_CR4_PAE;
-        if ( edx & cpufeat_mask(X86_FEATURE_MCE) )
-            data |= X86_CR4_MCE;
-        if ( edx & cpufeat_mask(X86_FEATURE_PGE) )
-            data |= X86_CR4_PGE;
-        if ( edx & cpufeat_mask(X86_FEATURE_FXSR) )
-            data |= X86_CR4_OSFXSR;
-        if ( edx & cpufeat_mask(X86_FEATURE_SSE) )
-            data |= X86_CR4_OSXMMEXCPT;
-        if ( ecx & cpufeat_mask(X86_FEATURE_VMX) )
-            data |= X86_CR4_VMXE;
-        if ( ecx & cpufeat_mask(X86_FEATURE_SMX) )
-            data |= X86_CR4_SMXE;
-        if ( ecx & cpufeat_mask(X86_FEATURE_PCID) )
-            data |= X86_CR4_PCIDE;
-        if ( ecx & cpufeat_mask(X86_FEATURE_XSAVE) )
-            data |= X86_CR4_OSXSAVE;
-
-        hvm_cpuid(0x0, &eax, NULL, NULL, NULL);
-        switch ( eax )
-        {
-        default:
-            hvm_cpuid(0xa, &eax, NULL, NULL, NULL);
-            /* Check whether guest has the perf monitor feature. */
-            if ( (eax & 0xff) && (eax & 0xff00) )
-                data |= X86_CR4_PCE;
-            /* fall through */
-        case 0x7 ... 0x9:
-            ecx = 0;
-            hvm_cpuid(0x7, NULL, &ebx, &ecx, NULL);
-            if ( ebx & cpufeat_mask(X86_FEATURE_FSGSBASE) )
-                data |= X86_CR4_FSGSBASE;
-            if ( ebx & cpufeat_mask(X86_FEATURE_SMEP) )
-                data |= X86_CR4_SMEP;
-            if ( ebx & cpufeat_mask(X86_FEATURE_SMAP) )
-                data |= X86_CR4_SMAP;
-            /* fall through */
-        case 0x0 ... 0x6:
-            break;
-        }
+        data = hvm_cr4_guest_valid_bits(v, 0);
         break;
     case MSR_IA32_VMX_MISC:
         /* Do not support CR3-target feature now */
-- 
2.1.4



* [PATCH 14/27] x86/pv: Improve pv_cpuid() using named features
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (12 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 13/27] x86/vvmx: Use hvm_cr4_guest_valid_bits() to calculate MSR_IA32_VMX_CR4_FIXED1 Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-05 11:43   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 15/27] x86/hvm: Improve CPUID and MSR handling " Andrew Cooper
                   ` (12 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

This avoids referring back to domain_cpuid() or native CPUID to obtain
information which is directly available.
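
As a rough illustration (taken from the XSTATE handling in the diff below),
this sequence:

    if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
        domain_cpuid(currd, 1, 0, &tmp, &tmp, &_ecx, &tmp);
    else
        _ecx = cpuid_ecx(1);
    _ecx &= pv_featureset[FEATURESET_1c];

    if ( !(_ecx & cpufeat_mask(X86_FEATURE_XSAVE)) || subleaf >= 63 )
        goto unsupported;

collapses to a single check against the per-domain policy:

    const struct cpuid_policy *p = currd->arch.cpuid;

    if ( !p->basic.xsave || subleaf >= 63 )
        goto unsupported;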

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/traps.c | 22 +++++-----------------
 1 file changed, 5 insertions(+), 17 deletions(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 8b395f8..f19e015 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1025,6 +1025,7 @@ void pv_cpuid(struct cpu_user_regs *regs)
     uint32_t leaf, subleaf, a, b, c, d;
     struct vcpu *curr = current;
     struct domain *currd = curr->domain;
+    const struct cpuid_policy *p = currd->arch.cpuid;
 
     leaf = a = regs->_eax;
     b = regs->_ebx;
@@ -1061,7 +1062,7 @@ void pv_cpuid(struct cpu_user_regs *regs)
 
     switch ( leaf )
     {
-        uint32_t tmp, _ecx, _ebx;
+        uint32_t tmp;
 
     case 0x00000001:
         c &= pv_featureset[FEATURESET_1c];
@@ -1247,14 +1248,7 @@ void pv_cpuid(struct cpu_user_regs *regs)
         break;
 
     case XSTATE_CPUID:
-
-        if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
-            domain_cpuid(currd, 1, 0, &tmp, &tmp, &_ecx, &tmp);
-        else
-            _ecx = cpuid_ecx(1);
-        _ecx &= pv_featureset[FEATURESET_1c];
-
-        if ( !(_ecx & cpufeat_mask(X86_FEATURE_XSAVE)) || subleaf >= 63 )
+        if ( !p->basic.xsave || subleaf >= 63 )
             goto unsupported;
         switch ( subleaf )
         {
@@ -1263,20 +1257,14 @@ void pv_cpuid(struct cpu_user_regs *regs)
             uint64_t xfeature_mask = XSTATE_FP_SSE;
             uint32_t xstate_size = XSTATE_AREA_MIN_SIZE;
 
-            if ( _ecx & cpufeat_mask(X86_FEATURE_AVX) )
+            if ( p->basic.avx )
             {
                 xfeature_mask |= XSTATE_YMM;
                 xstate_size = (xstate_offsets[_XSTATE_YMM] +
                                xstate_sizes[_XSTATE_YMM]);
             }
 
-            if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
-                domain_cpuid(currd, 7, 0, &tmp, &_ebx, &tmp, &tmp);
-            else
-                cpuid_count(7, 0, &tmp, &_ebx, &tmp, &tmp);
-            _ebx &= pv_featureset[FEATURESET_7b0];
-
-            if ( _ebx & cpufeat_mask(X86_FEATURE_AVX512F) )
+            if ( p->feat.avx512f )
             {
                 xfeature_mask |= XSTATE_OPMASK | XSTATE_ZMM | XSTATE_HI_ZMM;
                 xstate_size = max(xstate_size,
-- 
2.1.4



* [PATCH 15/27] x86/hvm: Improve CPUID and MSR handling using named features
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (13 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 14/27] x86/pv: Improve pv_cpuid() using named features Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-05 12:06   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 16/27] x86/svm: Improvements " Andrew Cooper
                   ` (11 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

This avoids hvm_cpuid() recursing into itself, and avoids the MSR paths
calling hvm_cpuid() to obtain information which is directly available.
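
For example, the MTRR MSR paths no longer precompute an 'mtrr' boolean by
re-entering the CPUID code (simplified from the diff below):

    /* Before */
    hvm_cpuid(1, NULL, NULL, NULL, &edx);
    if ( edx & cpufeat_mask(X86_FEATURE_MTRR) )
        mtrr = true;
    ...
    case MSR_MTRRcap:
        if ( !mtrr )
            goto gp_fault;

    /* After */
    case MSR_MTRRcap:
        if ( !d->arch.cpuid->basic.mtrr )
            goto gp_fault;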

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/hvm/hvm.c | 95 +++++++++++++++-----------------------------------
 1 file changed, 29 insertions(+), 66 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index e2060d2..6a3fdaa 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3292,6 +3292,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
 {
     struct vcpu *v = current;
     struct domain *d = v->domain;
+    const struct cpuid_policy *p = d->arch.cpuid;
     unsigned int count, dummy = 0;
 
     if ( !eax )
@@ -3329,8 +3330,6 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
 
     switch ( input )
     {
-        unsigned int _ebx, _ecx, _edx;
-
     case 0x1:
         /* Fix up VLAPIC details. */
         *ebx &= 0x00FFFFFFu;
@@ -3413,8 +3412,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         break;
 
     case XSTATE_CPUID:
-        hvm_cpuid(1, NULL, NULL, &_ecx, NULL);
-        if ( !(_ecx & cpufeat_mask(X86_FEATURE_XSAVE)) || count >= 63 )
+        if ( !p->basic.xsave || count >= 63 )
         {
             *eax = *ebx = *ecx = *edx = 0;
             break;
@@ -3426,7 +3424,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
             uint64_t xfeature_mask = XSTATE_FP_SSE;
             uint32_t xstate_size = XSTATE_AREA_MIN_SIZE;
 
-            if ( _ecx & cpufeat_mask(X86_FEATURE_AVX) )
+            if ( p->basic.avx )
             {
                 xfeature_mask |= XSTATE_YMM;
                 xstate_size = max(xstate_size,
@@ -3434,10 +3432,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
                                   xstate_sizes[_XSTATE_YMM]);
             }
 
-            _ecx = 0;
-            hvm_cpuid(7, NULL, &_ebx, &_ecx, NULL);
-
-            if ( _ebx & cpufeat_mask(X86_FEATURE_MPX) )
+            if ( p->feat.mpx )
             {
                 xfeature_mask |= XSTATE_BNDREGS | XSTATE_BNDCSR;
                 xstate_size = max(xstate_size,
@@ -3445,7 +3440,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
                                   xstate_sizes[_XSTATE_BNDCSR]);
             }
 
-            if ( _ebx & cpufeat_mask(X86_FEATURE_AVX512F) )
+            if ( p->feat.avx512f )
             {
                 xfeature_mask |= XSTATE_OPMASK | XSTATE_ZMM | XSTATE_HI_ZMM;
                 xstate_size = max(xstate_size,
@@ -3459,7 +3454,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
                                   xstate_sizes[_XSTATE_HI_ZMM]);
             }
 
-            if ( _ecx & cpufeat_mask(X86_FEATURE_PKU) )
+            if ( p->feat.pku )
             {
                 xfeature_mask |= XSTATE_PKRU;
                 xstate_size = max(xstate_size,
@@ -3467,9 +3462,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
                                   xstate_sizes[_XSTATE_PKRU]);
             }
 
-            hvm_cpuid(0x80000001, NULL, NULL, &_ecx, NULL);
-
-            if ( _ecx & cpufeat_mask(X86_FEATURE_LWP) )
+            if ( p->extd.lwp )
             {
                 xfeature_mask |= XSTATE_LWP;
                 xstate_size = max(xstate_size,
@@ -3493,7 +3486,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         case 1:
             *eax &= hvm_featureset[FEATURESET_Da1];
 
-            if ( *eax & cpufeat_mask(X86_FEATURE_XSAVES) )
+            if ( p->xstate.xsaves )
             {
                 /*
                  * Always read CPUID[0xD,1].EBX from hardware, rather than
@@ -3574,14 +3567,11 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         if ( *eax > count )
             *eax = count;
 
-        hvm_cpuid(1, NULL, NULL, NULL, &_edx);
-        count = _edx & (cpufeat_mask(X86_FEATURE_PAE) |
-                        cpufeat_mask(X86_FEATURE_PSE36)) ? 36 : 32;
+        count = (p->basic.pae || p->basic.pse36) ? 36 : 32;
         if ( *eax < count )
             *eax = count;
 
-        hvm_cpuid(0x80000001, NULL, NULL, NULL, &_edx);
-        *eax |= (_edx & cpufeat_mask(X86_FEATURE_LM) ? vaddr_bits : 32) << 8;
+        *eax |= (p->extd.lm ? vaddr_bits : 32) << 8;
 
         *ebx &= hvm_featureset[FEATURESET_e8b];
         break;
@@ -3648,26 +3638,16 @@ void hvm_rdtsc_intercept(struct cpu_user_regs *regs)
 int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
 {
     struct vcpu *v = current;
+    struct domain *d = v->domain;
     uint64_t *var_range_base, *fixed_range_base;
-    bool mtrr = false;
     int ret = X86EMUL_OKAY;
 
     var_range_base = (uint64_t *)v->arch.hvm_vcpu.mtrr.var_ranges;
     fixed_range_base = (uint64_t *)v->arch.hvm_vcpu.mtrr.fixed_ranges;
 
-    if ( msr == MSR_MTRRcap ||
-         (msr >= MSR_IA32_MTRR_PHYSBASE(0) && msr <= MSR_MTRRdefType) )
-    {
-        unsigned int edx;
-
-        hvm_cpuid(1, NULL, NULL, NULL, &edx);
-        if ( edx & cpufeat_mask(X86_FEATURE_MTRR) )
-            mtrr = true;
-    }
-
     switch ( msr )
     {
-        unsigned int eax, ebx, ecx, index;
+        unsigned int index;
 
     case MSR_EFER:
         *msr_content = v->arch.hvm_vcpu.guest_efer;
@@ -3703,53 +3683,49 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
         break;
 
     case MSR_MTRRcap:
-        if ( !mtrr )
+        if ( !d->arch.cpuid->basic.mtrr )
             goto gp_fault;
         *msr_content = v->arch.hvm_vcpu.mtrr.mtrr_cap;
         break;
     case MSR_MTRRdefType:
-        if ( !mtrr )
+        if ( !d->arch.cpuid->basic.mtrr )
             goto gp_fault;
         *msr_content = v->arch.hvm_vcpu.mtrr.def_type
                         | (v->arch.hvm_vcpu.mtrr.enabled << 10);
         break;
     case MSR_MTRRfix64K_00000:
-        if ( !mtrr )
+        if ( !d->arch.cpuid->basic.mtrr )
             goto gp_fault;
         *msr_content = fixed_range_base[0];
         break;
     case MSR_MTRRfix16K_80000:
     case MSR_MTRRfix16K_A0000:
-        if ( !mtrr )
+        if ( !d->arch.cpuid->basic.mtrr )
             goto gp_fault;
         index = msr - MSR_MTRRfix16K_80000;
         *msr_content = fixed_range_base[index + 1];
         break;
     case MSR_MTRRfix4K_C0000...MSR_MTRRfix4K_F8000:
-        if ( !mtrr )
+        if ( !d->arch.cpuid->basic.mtrr )
             goto gp_fault;
         index = msr - MSR_MTRRfix4K_C0000;
         *msr_content = fixed_range_base[index + 3];
         break;
     case MSR_IA32_MTRR_PHYSBASE(0)...MSR_IA32_MTRR_PHYSMASK(MTRR_VCNT-1):
-        if ( !mtrr )
+        if ( !d->arch.cpuid->basic.mtrr )
             goto gp_fault;
         index = msr - MSR_IA32_MTRR_PHYSBASE(0);
         *msr_content = var_range_base[index];
         break;
 
     case MSR_IA32_XSS:
-        ecx = 1;
-        hvm_cpuid(XSTATE_CPUID, &eax, NULL, &ecx, NULL);
-        if ( !(eax & cpufeat_mask(X86_FEATURE_XSAVES)) )
+        if ( !d->arch.cpuid->xstate.xsaves )
             goto gp_fault;
         *msr_content = v->arch.hvm_vcpu.msr_xss;
         break;
 
     case MSR_IA32_BNDCFGS:
-        ecx = 0;
-        hvm_cpuid(7, NULL, &ebx, &ecx, NULL);
-        if ( !(ebx & cpufeat_mask(X86_FEATURE_MPX)) ||
+        if ( !d->arch.cpuid->feat.mpx ||
              !hvm_get_guest_bndcfgs(v, msr_content) )
             goto gp_fault;
         break;
@@ -3790,21 +3766,12 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content,
                             bool_t may_defer)
 {
     struct vcpu *v = current;
-    bool mtrr = false;
+    struct domain *d = v->domain;
     int ret = X86EMUL_OKAY;
 
     HVMTRACE_3D(MSR_WRITE, msr,
                (uint32_t)msr_content, (uint32_t)(msr_content >> 32));
 
-    if ( msr >= MSR_IA32_MTRR_PHYSBASE(0) && msr <= MSR_MTRRdefType )
-    {
-        unsigned int edx;
-
-        hvm_cpuid(1, NULL, NULL, NULL, &edx);
-        if ( edx & cpufeat_mask(X86_FEATURE_MTRR) )
-            mtrr = true;
-    }
-
     if ( may_defer && unlikely(monitored_msr(v->domain, msr)) )
     {
         ASSERT(v->arch.vm_event);
@@ -3820,7 +3787,7 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content,
 
     switch ( msr )
     {
-        unsigned int eax, ebx, ecx, index;
+        unsigned int index;
 
     case MSR_EFER:
         if ( hvm_set_efer(msr_content) )
@@ -3866,14 +3833,14 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content,
         goto gp_fault;
 
     case MSR_MTRRdefType:
-        if ( !mtrr )
+        if ( !d->arch.cpuid->basic.mtrr )
             goto gp_fault;
         if ( !mtrr_def_type_msr_set(v->domain, &v->arch.hvm_vcpu.mtrr,
                                     msr_content) )
            goto gp_fault;
         break;
     case MSR_MTRRfix64K_00000:
-        if ( !mtrr )
+        if ( !d->arch.cpuid->basic.mtrr )
             goto gp_fault;
         if ( !mtrr_fix_range_msr_set(v->domain, &v->arch.hvm_vcpu.mtrr, 0,
                                      msr_content) )
@@ -3881,7 +3848,7 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content,
         break;
     case MSR_MTRRfix16K_80000:
     case MSR_MTRRfix16K_A0000:
-        if ( !mtrr )
+        if ( !d->arch.cpuid->basic.mtrr )
             goto gp_fault;
         index = msr - MSR_MTRRfix16K_80000 + 1;
         if ( !mtrr_fix_range_msr_set(v->domain, &v->arch.hvm_vcpu.mtrr,
@@ -3889,7 +3856,7 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content,
             goto gp_fault;
         break;
     case MSR_MTRRfix4K_C0000...MSR_MTRRfix4K_F8000:
-        if ( !mtrr )
+        if ( !d->arch.cpuid->basic.mtrr )
             goto gp_fault;
         index = msr - MSR_MTRRfix4K_C0000 + 3;
         if ( !mtrr_fix_range_msr_set(v->domain, &v->arch.hvm_vcpu.mtrr,
@@ -3897,7 +3864,7 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content,
             goto gp_fault;
         break;
     case MSR_IA32_MTRR_PHYSBASE(0)...MSR_IA32_MTRR_PHYSMASK(MTRR_VCNT-1):
-        if ( !mtrr )
+        if ( !d->arch.cpuid->basic.mtrr )
             goto gp_fault;
         if ( !mtrr_var_range_msr_set(v->domain, &v->arch.hvm_vcpu.mtrr,
                                      msr, msr_content) )
@@ -3905,18 +3872,14 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content,
         break;
 
     case MSR_IA32_XSS:
-        ecx = 1;
-        hvm_cpuid(XSTATE_CPUID, &eax, NULL, &ecx, NULL);
         /* No XSS features currently supported for guests. */
-        if ( !(eax & cpufeat_mask(X86_FEATURE_XSAVES)) || msr_content != 0 )
+        if ( !d->arch.cpuid->xstate.xsaves || msr_content != 0 )
             goto gp_fault;
         v->arch.hvm_vcpu.msr_xss = msr_content;
         break;
 
     case MSR_IA32_BNDCFGS:
-        ecx = 0;
-        hvm_cpuid(7, NULL, &ebx, &ecx, NULL);
-        if ( !(ebx & cpufeat_mask(X86_FEATURE_MPX)) ||
+        if ( !d->arch.cpuid->feat.mpx ||
              !hvm_set_guest_bndcfgs(v, msr_content) )
             goto gp_fault;
         break;
-- 
2.1.4



* [PATCH 16/27] x86/svm: Improvements using named features
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (14 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 15/27] x86/hvm: Improve CPUID and MSR handling " Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-04 14:52   ` Boris Ostrovsky
  2017-01-04 12:39 ` [PATCH 17/27] x86/pv: Use per-domain policy information when calculating the cpumasks Andrew Cooper
                   ` (10 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit, Jan Beulich

This avoids calling into hvm_cpuid() to obtain information which is directly
available.  In particular, this avoids the need to overload flag_dr_dirty
because hvm_cpuid() is unavailable in svm_save_dr().
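
The key simplification is in svm_save_dr(), which runs with v != current and
therefore could not use hvm_cpuid().  It can now read the named feature
straight from the domain's policy, roughly:

    /* svm_save_dr(): safe even though v != current. */
    if ( v->domain->arch.cpuid->extd.dbext )
    {
        svm_intercept_msr(v, MSR_AMD64_DR0_ADDRESS_MASK, MSR_INTERCEPT_RW);
        /* ... likewise for the other address mask MSRs ... */
    }

so the extra bit previously stashed in flag_dr_dirty by
__restore_debug_registers() can be dropped.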

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 xen/arch/x86/hvm/svm/svm.c | 33 ++++++++-------------------------
 1 file changed, 8 insertions(+), 25 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index de20f64..8f6737c 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -173,7 +173,7 @@ static void svm_save_dr(struct vcpu *v)
     v->arch.hvm_vcpu.flag_dr_dirty = 0;
     vmcb_set_dr_intercepts(vmcb, ~0u);
 
-    if ( flag_dr_dirty & 2 )
+    if ( v->domain->arch.cpuid->extd.dbext )
     {
         svm_intercept_msr(v, MSR_AMD64_DR0_ADDRESS_MASK, MSR_INTERCEPT_RW);
         svm_intercept_msr(v, MSR_AMD64_DR1_ADDRESS_MASK, MSR_INTERCEPT_RW);
@@ -196,8 +196,6 @@ static void svm_save_dr(struct vcpu *v)
 
 static void __restore_debug_registers(struct vmcb_struct *vmcb, struct vcpu *v)
 {
-    unsigned int ecx;
-
     if ( v->arch.hvm_vcpu.flag_dr_dirty )
         return;
 
@@ -205,8 +203,8 @@ static void __restore_debug_registers(struct vmcb_struct *vmcb, struct vcpu *v)
     vmcb_set_dr_intercepts(vmcb, 0);
 
     ASSERT(v == current);
-    hvm_cpuid(0x80000001, NULL, NULL, &ecx, NULL);
-    if ( test_bit(X86_FEATURE_DBEXT & 31, &ecx) )
+
+    if ( v->domain->arch.cpuid->extd.dbext )
     {
         svm_intercept_msr(v, MSR_AMD64_DR0_ADDRESS_MASK, MSR_INTERCEPT_NONE);
         svm_intercept_msr(v, MSR_AMD64_DR1_ADDRESS_MASK, MSR_INTERCEPT_NONE);
@@ -217,9 +215,6 @@ static void __restore_debug_registers(struct vmcb_struct *vmcb, struct vcpu *v)
         wrmsrl(MSR_AMD64_DR1_ADDRESS_MASK, v->arch.hvm_svm.dr_mask[1]);
         wrmsrl(MSR_AMD64_DR2_ADDRESS_MASK, v->arch.hvm_svm.dr_mask[2]);
         wrmsrl(MSR_AMD64_DR3_ADDRESS_MASK, v->arch.hvm_svm.dr_mask[3]);
-
-        /* Can't use hvm_cpuid() in svm_save_dr(): v != current. */
-        v->arch.hvm_vcpu.flag_dr_dirty |= 2;
     }
 
     write_debugreg(0, v->arch.debugreg[0]);
@@ -1359,11 +1354,7 @@ static void svm_init_erratum_383(struct cpuinfo_x86 *c)
 
 static int svm_handle_osvw(struct vcpu *v, uint32_t msr, uint64_t *val, bool_t read)
 {
-    unsigned int ecx;
-
-    /* Guest OSVW support */
-    hvm_cpuid(0x80000001, NULL, NULL, &ecx, NULL);
-    if ( !test_bit((X86_FEATURE_OSVW & 31), &ecx) )
+    if ( !v->domain->arch.cpuid->extd.osvw )
         return -1;
 
     if ( read )
@@ -1622,8 +1613,6 @@ static int svm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
 
     switch ( msr )
     {
-        unsigned int ecx;
-
     case MSR_IA32_SYSENTER_CS:
         *msr_content = v->arch.hvm_svm.guest_sysenter_cs;
         break;
@@ -1701,15 +1690,13 @@ static int svm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
         break;
 
     case MSR_AMD64_DR0_ADDRESS_MASK:
-        hvm_cpuid(0x80000001, NULL, NULL, &ecx, NULL);
-        if ( !test_bit(X86_FEATURE_DBEXT & 31, &ecx) )
+        if ( !v->domain->arch.cpuid->extd.dbext )
             goto gpf;
         *msr_content = v->arch.hvm_svm.dr_mask[0];
         break;
 
     case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
-        hvm_cpuid(0x80000001, NULL, NULL, &ecx, NULL);
-        if ( !test_bit(X86_FEATURE_DBEXT & 31, &ecx) )
+        if ( !v->domain->arch.cpuid->extd.dbext )
             goto gpf;
         *msr_content =
             v->arch.hvm_svm.dr_mask[msr - MSR_AMD64_DR1_ADDRESS_MASK + 1];
@@ -1783,8 +1770,6 @@ static int svm_msr_write_intercept(unsigned int msr, uint64_t msr_content)
 
     switch ( msr )
     {
-        unsigned int ecx;
-
     case MSR_IA32_SYSENTER_CS:
         vmcb->sysenter_cs = v->arch.hvm_svm.guest_sysenter_cs = msr_content;
         break;
@@ -1862,15 +1847,13 @@ static int svm_msr_write_intercept(unsigned int msr, uint64_t msr_content)
         break;
 
     case MSR_AMD64_DR0_ADDRESS_MASK:
-        hvm_cpuid(0x80000001, NULL, NULL, &ecx, NULL);
-        if ( !test_bit(X86_FEATURE_DBEXT & 31, &ecx) || (msr_content >> 32) )
+        if ( !v->domain->arch.cpuid->extd.dbext || (msr_content >> 32) )
             goto gpf;
         v->arch.hvm_svm.dr_mask[0] = msr_content;
         break;
 
     case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
-        hvm_cpuid(0x80000001, NULL, NULL, &ecx, NULL);
-        if ( !test_bit(X86_FEATURE_DBEXT & 31, &ecx) || (msr_content >> 32) )
+        if ( !v->domain->arch.cpuid->extd.dbext || (msr_content >> 32) )
             goto gpf;
         v->arch.hvm_svm.dr_mask[msr - MSR_AMD64_DR1_ADDRESS_MASK + 1] =
             msr_content;
-- 
2.1.4



* [PATCH 17/27] x86/pv: Use per-domain policy information when calculating the cpumasks
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (15 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 16/27] x86/svm: Improvements " Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-05 12:23   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 18/27] x86/pv: Use per-domain policy information in pv_cpuid() Andrew Cooper
                   ` (9 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

... rather than dynamically clamping against the PV maximum policy.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/domctl.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index c7e74dd..c1a4d00 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -99,8 +99,8 @@ static void update_domain_cpuid_info(struct domain *d,
         if ( is_pv_domain(d) && ((levelling_caps & LCAP_1cd) == LCAP_1cd) )
         {
             uint64_t mask = cpuidmask_defaults._1cd;
-            uint32_t ecx = ctl->ecx & pv_featureset[FEATURESET_1c];
-            uint32_t edx = ctl->edx & pv_featureset[FEATURESET_1d];
+            uint32_t ecx = p->basic._1c;
+            uint32_t edx = p->basic._1d;
 
             /*
              * Must expose hosts HTT and X2APIC value so a guest using native
@@ -174,7 +174,7 @@ static void update_domain_cpuid_info(struct domain *d,
         {
             uint64_t mask = cpuidmask_defaults._7ab0;
             uint32_t eax = ctl->eax;
-            uint32_t ebx = ctl->ebx & pv_featureset[FEATURESET_7b0];
+            uint32_t ebx = p->feat._7b0;
 
             if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
                 mask &= ((uint64_t)eax << 32) | ebx;
@@ -190,7 +190,7 @@ static void update_domain_cpuid_info(struct domain *d,
         if ( is_pv_domain(d) && ((levelling_caps & LCAP_Da1) == LCAP_Da1) )
         {
             uint64_t mask = cpuidmask_defaults.Da1;
-            uint32_t eax = ctl->eax & pv_featureset[FEATURESET_Da1];
+            uint32_t eax = p->xstate.Da1;
 
             if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
                 mask &= (~0ULL << 32) | eax;
@@ -203,8 +203,8 @@ static void update_domain_cpuid_info(struct domain *d,
         if ( is_pv_domain(d) && ((levelling_caps & LCAP_e1cd) == LCAP_e1cd) )
         {
             uint64_t mask = cpuidmask_defaults.e1cd;
-            uint32_t ecx = ctl->ecx & pv_featureset[FEATURESET_e1c];
-            uint32_t edx = ctl->edx & pv_featureset[FEATURESET_e1d];
+            uint32_t ecx = p->extd.e1c;
+            uint32_t edx = p->extd.e1d;
 
             /*
              * Must expose hosts CMP_LEGACY value so a guest using native
-- 
2.1.4



* [PATCH 18/27] x86/pv: Use per-domain policy information in pv_cpuid()
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (16 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 17/27] x86/pv: Use per-domain policy information when calculating the cpumasks Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-05 12:44   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 19/27] x86/hvm: Use per-domain policy information in hvm_cpuid() Andrew Cooper
                   ` (8 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

... rather than performing runtime adjustments.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/traps.c | 44 ++++++++++++--------------------------------
 1 file changed, 12 insertions(+), 32 deletions(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index f19e015..1c384cf 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1065,11 +1065,8 @@ void pv_cpuid(struct cpu_user_regs *regs)
         uint32_t tmp;
 
     case 0x00000001:
-        c &= pv_featureset[FEATURESET_1c];
-        d &= pv_featureset[FEATURESET_1d];
-
-        if ( is_pv_32bit_domain(currd) )
-            c &= ~cpufeat_mask(X86_FEATURE_CX16);
+        c = p->basic._1c;
+        d = p->basic._1d;
 
         if ( !is_pvh_domain(currd) )
         {
@@ -1128,7 +1125,7 @@ void pv_cpuid(struct cpu_user_regs *regs)
              *    Emulated vs Faulted CPUID is distinguised based on whether a
              *    #UD or #GP is currently being serviced.
              */
-            /* OSXSAVE cleared by pv_featureset.  Fast-forward CR4 back in. */
+            /* OSXSAVE clear in policy.  Fast-forward CR4 back in. */
             if ( (curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_OSXSAVE) ||
                  (regs->entry_vector == TRAP_invalid_op &&
                   guest_kernel_mode(curr, regs) &&
@@ -1204,21 +1201,14 @@ void pv_cpuid(struct cpu_user_regs *regs)
             if ( cpu_has(&current_cpu_data, X86_FEATURE_DSCPL) )
                 c |= cpufeat_mask(X86_FEATURE_DSCPL);
         }
-
-        c |= cpufeat_mask(X86_FEATURE_HYPERVISOR);
         break;
 
     case 0x00000007:
         if ( subleaf == 0 )
         {
-            /* Fold host's FDP_EXCP_ONLY and NO_FPU_SEL into guest's view. */
-            b &= (pv_featureset[FEATURESET_7b0] &
-                  ~special_features[FEATURESET_7b0]);
-            b |= (host_featureset[FEATURESET_7b0] &
-                  special_features[FEATURESET_7b0]);
-
-            c &= pv_featureset[FEATURESET_7c0];
-            d &= pv_featureset[FEATURESET_7d0];
+            b = currd->arch.cpuid->feat._7b0;
+            c = currd->arch.cpuid->feat._7c0;
+            d = currd->arch.cpuid->feat._7d0;
 
             if ( !is_pvh_domain(currd) )
             {
@@ -1227,7 +1217,7 @@ void pv_cpuid(struct cpu_user_regs *regs)
                  * and HVM guests no longer enter a PV codepath.
                  */
 
-                /* OSPKE cleared by pv_featureset.  Fast-forward CR4 back in. */
+                /* OSPKE clear in policy.  Fast-forward CR4 back in. */
                 if ( curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_PKE )
                     c |= cpufeat_mask(X86_FEATURE_OSPKE);
             }
@@ -1292,15 +1282,15 @@ void pv_cpuid(struct cpu_user_regs *regs)
         }
 
         case 1:
-            a &= pv_featureset[FEATURESET_Da1];
+            a = p->xstate.Da1;
             b = c = d = 0;
             break;
         }
         break;
 
     case 0x80000001:
-        c &= pv_featureset[FEATURESET_e1c];
-        d &= pv_featureset[FEATURESET_e1d];
+        c = p->extd.e1c;
+        d = p->extd.e1d;
 
         /* If not emulating AMD, clear the duplicated features in e1d. */
         if ( currd->arch.x86_vendor != X86_VENDOR_AMD )
@@ -1318,25 +1308,15 @@ void pv_cpuid(struct cpu_user_regs *regs)
         if ( is_hardware_domain(currd) && guest_kernel_mode(curr, regs) &&
              cpu_has_mtrr )
             d |= cpufeat_mask(X86_FEATURE_MTRR);
-
-        if ( is_pv_32bit_domain(currd) )
-        {
-            d &= ~cpufeat_mask(X86_FEATURE_LM);
-            c &= ~cpufeat_mask(X86_FEATURE_LAHF_LM);
-
-            if ( boot_cpu_data.x86_vendor != X86_VENDOR_AMD )
-                d &= ~cpufeat_mask(X86_FEATURE_SYSCALL);
-        }
         break;
 
     case 0x80000007:
-        d &= (pv_featureset[FEATURESET_e7d] |
-              (host_featureset[FEATURESET_e7d] & cpufeat_mask(X86_FEATURE_ITSC)));
+        d = p->extd.e7d;
         break;
 
     case 0x80000008:
         a = paddr_bits | (vaddr_bits << 8);
-        b &= pv_featureset[FEATURESET_e8b];
+        b = p->extd.e8b;
         break;
 
     case 0x00000005: /* MONITOR/MWAIT */
-- 
2.1.4



* [PATCH 19/27] x86/hvm: Use per-domain policy information in hvm_cpuid()
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (17 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 18/27] x86/pv: Use per-domain policy information in pv_cpuid() Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-05 12:55   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 20/27] x86/cpuid: Drop the temporary linear feature bitmap from struct cpuid_policy Andrew Cooper
                   ` (7 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

... rather than performing runtime adjustments.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/hvm/hvm.c | 113 +++++++++++++++++++------------------------------
 1 file changed, 44 insertions(+), 69 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 6a3fdaa..7cda53f 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3335,39 +3335,33 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         *ebx &= 0x00FFFFFFu;
         *ebx |= (v->vcpu_id * 2) << 24;
 
-        *ecx &= hvm_featureset[FEATURESET_1c];
-        *edx &= hvm_featureset[FEATURESET_1d];
+        *ecx = p->basic._1c;
+        *edx = p->basic._1d;
 
         /* APIC exposed to guests, but Fast-forward MSR_APIC_BASE.EN back in. */
         if ( vlapic_hw_disabled(vcpu_vlapic(v)) )
             *edx &= ~cpufeat_bit(X86_FEATURE_APIC);
 
-        /* OSXSAVE cleared by hvm_featureset.  Fast-forward CR4 back in. */
+        /* OSXSAVE clear in policy.  Fast-forward CR4 back in. */
         if ( v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE )
             *ecx |= cpufeat_mask(X86_FEATURE_OSXSAVE);
 
-        /* Don't expose HAP-only features to non-hap guests. */
-        if ( !hap_enabled(d) )
-        {
-            *ecx &= ~cpufeat_mask(X86_FEATURE_PCID);
-
-            /*
-             * PSE36 is not supported in shadow mode.  This bit should be
-             * unilaterally cleared.
-             *
-             * However, an unspecified version of Hyper-V from 2011 refuses
-             * to start as the "cpu does not provide required hw features" if
-             * it can't see PSE36.
-             *
-             * As a workaround, leak the toolstack-provided PSE36 value into a
-             * shadow guest if the guest is already using PAE paging (and
-             * won't care about reverting back to PSE paging).  Otherwise,
-             * knoble it, so a 32bit guest doesn't get the impression that it
-             * could try to use PSE36 paging.
-             */
-            if ( !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
-                *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
-        }
+        /*
+         * PSE36 is not supported in shadow mode.  This bit should be
+         * unilaterally cleared.
+         *
+         * However, an unspecified version of Hyper-V from 2011 refuses
+         * to start as the "cpu does not provide required hw features" if
+         * it can't see PSE36.
+         *
+         * As a workaround, leak the toolstack-provided PSE36 value into a
+         * shadow guest if the guest is already using PAE paging (and won't
+         * care about reverting back to PSE paging).  Otherwise, knoble it, so
+         * a 32bit guest doesn't get the impression that it could try to use
+         * PSE36 paging.
+         */
+        if ( !hap_enabled(d) && !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
+            *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
 
         if ( vpmu_enabled(v) &&
              vpmu_is_set(vcpu_vpmu(v), VPMU_CPU_HAS_DS) )
@@ -3384,23 +3378,11 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
     case 0x7:
         if ( count == 0 )
         {
-            /* Fold host's FDP_EXCP_ONLY and NO_FPU_SEL into guest's view. */
-            *ebx &= (hvm_featureset[FEATURESET_7b0] &
-                     ~special_features[FEATURESET_7b0]);
-            *ebx |= (host_featureset[FEATURESET_7b0] &
-                     special_features[FEATURESET_7b0]);
-
-            *ecx &= hvm_featureset[FEATURESET_7c0];
-            *edx &= hvm_featureset[FEATURESET_7d0];
-
-            /* Don't expose HAP-only features to non-hap guests. */
-            if ( !hap_enabled(d) )
-            {
-                 *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
-                 *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
-            }
+            *ebx = d->arch.cpuid->feat._7b0;
+            *ecx = d->arch.cpuid->feat._7c0;
+            *edx = d->arch.cpuid->feat._7d0;
 
-            /* OSPKE cleared by hvm_featureset.  Fast-forward CR4 back in. */
+            /* OSPKE clear in policy.  Fast-forward CR4 back in. */
             if ( v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE )
                 *ecx |= cpufeat_mask(X86_FEATURE_OSPKE);
         }
@@ -3484,7 +3466,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         }
 
         case 1:
-            *eax &= hvm_featureset[FEATURESET_Da1];
+            *eax = p->xstate.Da1;
 
             if ( p->xstate.xsaves )
             {
@@ -3516,8 +3498,8 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         break;
 
     case 0x80000001:
-        *ecx &= hvm_featureset[FEATURESET_e1c];
-        *edx &= hvm_featureset[FEATURESET_e1d];
+        *ecx = p->extd.e1c;
+        *edx = p->extd.e1d;
 
         /* If not emulating AMD, clear the duplicated features in e1d. */
         if ( d->arch.x86_vendor != X86_VENDOR_AMD )
@@ -3526,28 +3508,22 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         else if ( vlapic_hw_disabled(vcpu_vlapic(v)) )
             *edx &= ~cpufeat_bit(X86_FEATURE_APIC);
 
-        /* Don't expose HAP-only features to non-hap guests. */
-        if ( !hap_enabled(d) )
-        {
-            *edx &= ~cpufeat_mask(X86_FEATURE_PAGE1GB);
-
-            /*
-             * PSE36 is not supported in shadow mode.  This bit should be
-             * unilaterally cleared.
-             *
-             * However, an unspecified version of Hyper-V from 2011 refuses
-             * to start as the "cpu does not provide required hw features" if
-             * it can't see PSE36.
-             *
-             * As a workaround, leak the toolstack-provided PSE36 value into a
-             * shadow guest if the guest is already using PAE paging (and
-             * won't care about reverting back to PSE paging).  Otherwise,
-             * knoble it, so a 32bit guest doesn't get the impression that it
-             * could try to use PSE36 paging.
-             */
-            if ( !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
-                *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
-        }
+        /*
+         * PSE36 is not supported in shadow mode.  This bit should be
+         * unilaterally cleared.
+         *
+         * However, an unspecified version of Hyper-V from 2011 refuses
+         * to start as the "cpu does not provide required hw features" if
+         * it can't see PSE36.
+         *
+         * As a workaround, leak the toolstack-provided PSE36 value into a
+         * shadow guest if the guest is already using PAE paging (and won't
+         * care about reverting back to PSE paging).  Otherwise, knoble it, so
+         * a 32bit guest doesn't get the impression that it could try to use
+         * PSE36 paging.
+         */
+        if ( !hap_enabled(d) && !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
+            *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
 
         /* SYSCALL is hidden outside of long mode on Intel. */
         if ( d->arch.x86_vendor == X86_VENDOR_INTEL &&
@@ -3557,8 +3533,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         break;
 
     case 0x80000007:
-        *edx &= (hvm_featureset[FEATURESET_e7d] |
-                 (host_featureset[FEATURESET_e7d] & cpufeat_mask(X86_FEATURE_ITSC)));
+        *edx = p->extd.e7d;
         break;
 
     case 0x80000008:
@@ -3573,7 +3548,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
 
         *eax |= (p->extd.lm ? vaddr_bits : 32) << 8;
 
-        *ebx &= hvm_featureset[FEATURESET_e8b];
+        *ebx = p->extd.e8b;
         break;
 
     case 0x8000001c:
-- 
2.1.4



* [PATCH 20/27] x86/cpuid: Drop the temporary linear feature bitmap from struct cpuid_policy
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (18 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 19/27] x86/hvm: Use per-domain policy information in hvm_cpuid() Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-05 13:07   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 21/27] x86/cpuid: Calculate appropriate max_leaf values for the global policies Andrew Cooper
                   ` (6 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

With most uses of the *_featureset API removed, the remaining uses are only
during XEN_SYSCTL_get_cpu_featureset, init_guest_cpuid(), and
recalculate_cpuid_policy(), none of which are hot paths.

Drop the temporary infrastructure, and have the current users recreate the
linear bitmap using cpuid_policy_to_featureset().  This avoids storing
duplicated information in struct cpuid_policy.
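
The remaining users build the linear bitmap on the stack when they need it,
e.g. (simplified from the sysctl change below, with 'idx' standing in for
sysctl->u.cpu_featureset.index):

    const struct cpuid_policy *p = policy_table[idx];
    uint32_t featureset[FSCAPINTS];

    cpuid_policy_to_featureset(p, featureset);
    /* ... copy 'featureset' back to the caller ... */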

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/cpuid.c        | 19 ++++++++++---------
 xen/arch/x86/sysctl.c       | 21 ++++++++++++---------
 xen/include/asm-x86/cpuid.h |  9 ---------
 3 files changed, 22 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 86f598f..a261843 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -126,24 +126,23 @@ static void __init calculate_raw_policy(void)
     for ( i = 1; i < min(ARRAY_SIZE(p->extd.raw),
                          p->extd.max_leaf + 1 - 0x80000000ul); ++i )
         cpuid_leaf(0x80000000 + i, &p->extd.raw[i]);
-
-    cpuid_policy_to_featureset(p, p->fs);
 }
 
 static void __init calculate_host_policy(void)
 {
     struct cpuid_policy *p = &host_policy;
 
-    memcpy(p->fs, boot_cpu_data.x86_capability, sizeof(p->fs));
-
-    cpuid_featureset_to_policy(host_featureset, p);
+    cpuid_featureset_to_policy(boot_cpu_data.x86_capability, p);
 }
 
 static void __init calculate_pv_max_policy(void)
 {
     struct cpuid_policy *p = &pv_max_policy;
+    uint32_t pv_featureset[FSCAPINTS], host_featureset[FSCAPINTS];
     unsigned int i;
 
+    cpuid_policy_to_featureset(&host_policy, host_featureset);
+
     for ( i = 0; i < FSCAPINTS; ++i )
         pv_featureset[i] = host_featureset[i] & pv_featuremask[i];
 
@@ -165,12 +164,15 @@ static void __init calculate_pv_max_policy(void)
 static void __init calculate_hvm_max_policy(void)
 {
     struct cpuid_policy *p = &hvm_max_policy;
+    uint32_t hvm_featureset[FSCAPINTS], host_featureset[FSCAPINTS];
     unsigned int i;
     const uint32_t *hvm_featuremask;
 
     if ( !hvm_enabled )
         return;
 
+    cpuid_policy_to_featureset(&host_policy, host_featureset);
+
     hvm_featuremask = hvm_funcs.hap_supported ?
         hvm_hap_featuremask : hvm_shadow_featuremask;
 
@@ -199,8 +201,7 @@ static void __init calculate_hvm_max_policy(void)
      * long mode (and init_amd() has cleared it out of host capabilities), but
      * HVM guests are able if running in protected mode.
      */
-    if ( (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) &&
-         test_bit(X86_FEATURE_SEP, raw_featureset) )
+    if ( (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) && raw_policy.basic.sep )
         __set_bit(X86_FEATURE_SEP, hvm_featureset);
 
     /*
@@ -267,7 +268,7 @@ void recalculate_cpuid_policy(struct domain *d)
     unsigned int i;
 
     cpuid_policy_to_featureset(p, fs);
-    memcpy(max_fs, max->fs, sizeof(max_fs));
+    cpuid_policy_to_featureset(max, max_fs);
 
     /* Allow a toolstack to possibly select ITSC... */
     if ( cpu_has_itsc )
@@ -295,7 +296,7 @@ void recalculate_cpuid_policy(struct domain *d)
 
     /* Fold host's FDP_EXCP_ONLY and NO_FPU_SEL into guest's view. */
     fs[FEATURESET_7b0] &= ~special_features[FEATURESET_7b0];
-    fs[FEATURESET_7b0] |= (host_featureset[FEATURESET_7b0] &
+    fs[FEATURESET_7b0] |= (host_policy.feat._7b0 &
                            special_features[FEATURESET_7b0]);
 
     sanitise_featureset(fs);
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index 14e7dc7..87da541 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -199,13 +199,14 @@ long arch_do_sysctl(
 
     case XEN_SYSCTL_get_cpu_featureset:
     {
-        static const uint32_t *const featureset_table[] = {
-            [XEN_SYSCTL_cpu_featureset_raw]  = raw_featureset,
-            [XEN_SYSCTL_cpu_featureset_host] = host_featureset,
-            [XEN_SYSCTL_cpu_featureset_pv]   = pv_featureset,
-            [XEN_SYSCTL_cpu_featureset_hvm]  = hvm_featureset,
+        static const struct cpuid_policy *const policy_table[] = {
+            [XEN_SYSCTL_cpu_featureset_raw]  = &raw_policy,
+            [XEN_SYSCTL_cpu_featureset_host] = &host_policy,
+            [XEN_SYSCTL_cpu_featureset_pv]   = &pv_max_policy,
+            [XEN_SYSCTL_cpu_featureset_hvm]  = &hvm_max_policy,
         };
-        const uint32_t *featureset = NULL;
+        const struct cpuid_policy *p = NULL;
+        uint32_t featureset[FSCAPINTS];
         unsigned int nr;
 
         /* Request for maximum number of features? */
@@ -223,13 +224,15 @@ long arch_do_sysctl(
                    FSCAPINTS);
 
         /* Look up requested featureset. */
-        if ( sysctl->u.cpu_featureset.index < ARRAY_SIZE(featureset_table) )
-            featureset = featureset_table[sysctl->u.cpu_featureset.index];
+        if ( sysctl->u.cpu_featureset.index < ARRAY_SIZE(policy_table) )
+            p = policy_table[sysctl->u.cpu_featureset.index];
 
         /* Bad featureset index? */
-        if ( !featureset )
+        if ( !p )
             ret = -EINVAL;
 
+        cpuid_policy_to_featureset(p, featureset);
+
         /* Copy the requested featureset into place. */
         if ( !ret && copy_to_guest(sysctl->u.cpu_featureset.features,
                                    featureset, nr) )
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 0371e6e..9788cac 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -253,9 +253,6 @@ struct cpuid_policy
             };
         };
     } extd;
-
-    /* Temporary featureset bitmap. */
-    uint32_t fs[FSCAPINTS];
 };
 
 /* Fill in a featureset bitmap from a CPUID policy. */
@@ -293,12 +290,6 @@ static inline void cpuid_featureset_to_policy(
 extern struct cpuid_policy raw_policy, host_policy, pv_max_policy,
     hvm_max_policy;
 
-/* Temporary compatibility defines. */
-#define raw_featureset raw_policy.fs
-#define host_featureset host_policy.fs
-#define pv_featureset pv_max_policy.fs
-#define hvm_featureset hvm_max_policy.fs
-
 /* Allocate and initialise a CPUID policy suitable for the domain. */
 int init_domain_cpuid_policy(struct domain *d);
 
-- 
2.1.4



* [PATCH 21/27] x86/cpuid: Calculate appropriate max_leaf values for the global policies
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (19 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 20/27] x86/cpuid: Drop the temporary linear feature bitmap from struct cpuid_policy Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-05 13:43   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 22/27] x86/cpuid: Perform max_leaf calculations in guest_cpuid() Andrew Cooper
                   ` (5 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Derive host_policy from raw_policy, and {pv,hvm}_max_policy from host_policy.
Clamp the raw values to the maximum we will offer to guests.

This simplifies the PV and HVM policy calculations, removing the need for an
intermediate linear host_featureset bitmap.
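
The clamping amounts to (from calculate_host_policy() below, and similarly for
feat.max_subleaf and extd.max_leaf):

    *p = raw_policy;

    p->basic.max_leaf =
        min_t(uint32_t, p->basic.max_leaf, ARRAY_SIZE(p->basic.raw) - 1);

after which {pv,hvm}_max_policy start from a straight copy of host_policy.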

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/cpuid.c        | 28 ++++++++++++++++++++--------
 xen/include/asm-x86/cpuid.h |  2 +-
 2 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index a261843..01bb906 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -132,19 +132,30 @@ static void __init calculate_host_policy(void)
 {
     struct cpuid_policy *p = &host_policy;
 
+    *p = raw_policy;
+
+    p->basic.max_leaf =
+        min_t(uint32_t, p->basic.max_leaf,   ARRAY_SIZE(p->basic.raw) - 1);
+    p->feat.max_subleaf =
+        min_t(uint32_t, p->feat.max_subleaf, ARRAY_SIZE(p->feat.raw) - 1);
+    p->extd.max_leaf =
+        min_t(uint32_t, p->extd.max_leaf,
+              0x80000000u + ARRAY_SIZE(p->extd.raw) - 1);
+
     cpuid_featureset_to_policy(boot_cpu_data.x86_capability, p);
 }
 
 static void __init calculate_pv_max_policy(void)
 {
     struct cpuid_policy *p = &pv_max_policy;
-    uint32_t pv_featureset[FSCAPINTS], host_featureset[FSCAPINTS];
+    uint32_t pv_featureset[FSCAPINTS];
     unsigned int i;
 
-    cpuid_policy_to_featureset(&host_policy, host_featureset);
+    *p = host_policy;
+    cpuid_policy_to_featureset(p, pv_featureset);
 
-    for ( i = 0; i < FSCAPINTS; ++i )
-        pv_featureset[i] = host_featureset[i] & pv_featuremask[i];
+    for ( i = 0; i < ARRAY_SIZE(pv_featureset); ++i )
+        pv_featureset[i] &= pv_featuremask[i];
 
     /* Unconditionally claim to be able to set the hypervisor bit. */
     __set_bit(X86_FEATURE_HYPERVISOR, pv_featureset);
@@ -164,20 +175,21 @@ static void __init calculate_pv_max_policy(void)
 static void __init calculate_hvm_max_policy(void)
 {
     struct cpuid_policy *p = &hvm_max_policy;
-    uint32_t hvm_featureset[FSCAPINTS], host_featureset[FSCAPINTS];
+    uint32_t hvm_featureset[FSCAPINTS];
     unsigned int i;
     const uint32_t *hvm_featuremask;
 
     if ( !hvm_enabled )
         return;
 
-    cpuid_policy_to_featureset(&host_policy, host_featureset);
+    *p = host_policy;
+    cpuid_policy_to_featureset(p, hvm_featureset);
 
     hvm_featuremask = hvm_funcs.hap_supported ?
         hvm_hap_featuremask : hvm_shadow_featuremask;
 
-    for ( i = 0; i < FSCAPINTS; ++i )
-        hvm_featureset[i] = host_featureset[i] & hvm_featuremask[i];
+    for ( i = 0; i < ARRAY_SIZE(hvm_featureset); ++i )
+        hvm_featureset[i] &= hvm_featuremask[i];
 
     /* Unconditionally claim to be able to set the hypervisor bit. */
     __set_bit(X86_FEATURE_HYPERVISOR, hvm_featureset);
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 9788cac..c621a6f 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -78,10 +78,10 @@ struct cpuid_policy
      * Global *_policy objects:
      *
      * - Host accurate:
-     *   - max_{,sub}leaf
      *   - {xcr0,xss}_{high,low}
      *
      * - Guest appropriate:
+     *   - max_{,sub}leaf
      *   - All FEATURESET_* words
      *
      * Per-domain objects:
-- 
2.1.4



* [PATCH 22/27] x86/cpuid: Perform max_leaf calculations in guest_cpuid()
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (20 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 21/27] x86/cpuid: Calculate appropriate max_leaf values for the global policies Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-05 13:51   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 23/27] x86/cpuid: Move all leaf 7 handling into guest_cpuid() Andrew Cooper
                   ` (4 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Clamp the toolstack-provided max_leaf values in recalculate_cpuid_policy(),
so that the per-domain policy contains guest-accurate data.

Have guest_cpuid() exit early if a requested leaf is out of range, rather than
falling into the legacy path.
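
The out-of-range handling in guest_cpuid() then reduces to comparisons
against the per-domain policy, e.g. (simplified from the diff below):

    case 0x7:
        if ( subleaf > p->feat.max_subleaf )
            return;    /* *res is already EMPTY_LEAF */
        break;

    case 0x80000000 ... 0x80000000 + CPUID_GUEST_NR_EXTD - 1:
        if ( leaf > p->extd.max_leaf )
            return;
        break;

keeping the existing behaviour of following the AMD model and returning
zeroes for out-of-range leaves.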

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/cpuid.c        | 36 ++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/hvm.c      | 21 ---------------------
 xen/arch/x86/traps.c        | 23 -----------------------
 xen/include/asm-x86/cpuid.h |  1 +
 4 files changed, 37 insertions(+), 44 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 01bb906..d140482 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -279,6 +279,10 @@ void recalculate_cpuid_policy(struct domain *d)
     uint32_t fs[FSCAPINTS], max_fs[FSCAPINTS];
     unsigned int i;
 
+    p->basic.max_leaf   = min(p->basic.max_leaf,   max->basic.max_leaf);
+    p->feat.max_subleaf = min(p->feat.max_subleaf, max->feat.max_subleaf);
+    p->extd.max_leaf    = min(p->extd.max_leaf,    max->extd.max_leaf);
+
     cpuid_policy_to_featureset(p, fs);
     cpuid_policy_to_featureset(max, max_fs);
 
@@ -306,6 +310,9 @@ void recalculate_cpuid_policy(struct domain *d)
     if ( !d->disable_migrate && !d->arch.vtsc )
         __clear_bit(X86_FEATURE_ITSC, fs);
 
+    if ( p->basic.max_leaf < 0xd )
+        __clear_bit(X86_FEATURE_XSAVE, fs);
+
     /* Fold host's FDP_EXCP_ONLY and NO_FPU_SEL into guest's view. */
     fs[FEATURESET_7b0] &= ~special_features[FEATURESET_7b0];
     fs[FEATURESET_7b0] |= (host_policy.feat._7b0 &
@@ -333,21 +340,50 @@ void guest_cpuid(const struct vcpu *v, unsigned int leaf,
                  unsigned int subleaf, struct cpuid_leaf *res)
 {
     const struct domain *d = v->domain;
+    const struct cpuid_policy *p = d->arch.cpuid;
 
     *res = EMPTY_LEAF;
 
     /*
      * First pass:
      * - Dispatch the virtualised leaves to their respective handlers.
+     * - Perform max_leaf/subleaf calculations, maybe returning early.
      */
     switch ( leaf )
     {
+    case 0x0 ... 0x6:
+    case 0x8 ... 0xc:
+#if 0 /* For when CPUID_GUEST_NR_BASIC isn't 0xd */
+    case 0xe ... CPUID_GUEST_NR_BASIC - 1:
+#endif
+        if ( leaf > p->basic.max_leaf )
+            return;
+        break;
+
+    case 0x7:
+        if ( subleaf > p->feat.max_subleaf )
+            return;
+        break;
+
+    case 0xd:
+        if ( subleaf > ARRAY_SIZE(p->xstate.raw) )
+            return;
+        break;
+
     case 0x40000000 ... 0x400000ff:
         if ( is_viridian_domain(d) )
             return cpuid_viridian_leaves(v, leaf, subleaf, res);
         /* Fallthrough. */
     case 0x40000100 ... 0x4fffffff:
         return cpuid_hypervisor_leaves(v, leaf, subleaf, res);
+
+    case 0x80000000 ... 0x80000000 + CPUID_GUEST_NR_EXTD - 1:
+        if ( leaf > p->extd.max_leaf )
+            return;
+        break;
+
+    default:
+        return;
     }
 
     /* {pv,hvm}_cpuid() have this expectation. */
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 7cda53f..1dd92e3 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3305,27 +3305,6 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
     if ( !edx )
         edx = &dummy;
 
-    if ( input & 0x7fffffff )
-    {
-        /*
-         * Requests outside the supported leaf ranges return zero on AMD
-         * and the highest basic leaf output on Intel. Uniformly follow
-         * the AMD model as the more sane one.
-         */
-        unsigned int limit;
-
-        domain_cpuid(d, (input >> 16) != 0x8000 ? 0 : 0x80000000, 0,
-                     &limit, &dummy, &dummy, &dummy);
-        if ( input > limit )
-        {
-            *eax = 0;
-            *ebx = 0;
-            *ecx = 0;
-            *edx = 0;
-            return;
-        }
-    }
-
     domain_cpuid(d, input, count, eax, ebx, ecx, edx);
 
     switch ( input )
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 1c384cf..aed96c3 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1032,29 +1032,6 @@ void pv_cpuid(struct cpu_user_regs *regs)
     subleaf = c = regs->_ecx;
     d = regs->_edx;
 
-    if ( leaf & 0x7fffffff )
-    {
-        /*
-         * Requests outside the supported leaf ranges return zero on AMD
-         * and the highest basic leaf output on Intel. Uniformly follow
-         * the AMD model as the more sane one.
-         */
-        unsigned int limit = (leaf >> 16) != 0x8000 ? 0 : 0x80000000, dummy;
-
-        if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
-            domain_cpuid(currd, limit, 0, &limit, &dummy, &dummy, &dummy);
-        else
-            limit = cpuid_eax(limit);
-        if ( leaf > limit )
-        {
-            regs->rax = 0;
-            regs->rbx = 0;
-            regs->rcx = 0;
-            regs->rdx = 0;
-            return;
-        }
-    }
-
     if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
         domain_cpuid(currd, leaf, subleaf, &a, &b, &c, &d);
     else
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index c621a6f..7363263 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -87,6 +87,7 @@ struct cpuid_policy
      * Per-domain objects:
      *
      * - Guest accurate:
+     *   - max_{,sub}leaf
      *   - All FEATURESET_* words
      *
      * Everything else should be considered inaccurate, and not necesserily 0.
-- 
2.1.4



* [PATCH 23/27] x86/cpuid: Move all leaf 7 handling into guest_cpuid()
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (21 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 22/27] x86/cpuid: Perform max_leaf calculations in guest_cpuid() Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-05 14:01   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 24/27] x86/hvm: Use guest_cpuid() rather than hvm_cpuid() Andrew Cooper
                   ` (3 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

All per-domain policy data concerning leaf 7 is accurate.  Handle it all in
guest_cpuid() by reading out of the raw array block, and introducing a dynamic
adjustment for OSPKE.
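
In sketch form, the "fast-forward" part works like this; the snippet below is
illustrative only, with simplified stand-in types and constants rather than
the real Xen definitions: the policy always keeps OSPKE clear, and the bit is
recomputed at query time from the vcpu's CR4.PKE.

    /* Illustrative sketch only; not the real Xen types or helpers. */
    #include <stdint.h>
    #include <stdio.h>

    #define X86_CR4_PKE   (1u << 22)   /* CR4 bit enabling protection keys */
    #define FEAT_OSPKE    (1u << 4)    /* CPUID.7[0].ecx bit 4 */

    struct cpuid_leaf { uint32_t a, b, c, d; };

    /*
     * Leaf 7, subleaf 0: start from the static policy value, then
     * fast-forward OSPKE back in from the guest's current CR4.PKE.
     */
    static void leaf_7_0(struct cpuid_leaf *res,
                         const struct cpuid_leaf *policy, uint32_t guest_cr4)
    {
        *res = *policy;              /* OSPKE is always clear in the policy. */
        if ( guest_cr4 & X86_CR4_PKE )
            res->c |= FEAT_OSPKE;    /* Reflect what the guest has enabled. */
    }

    int main(void)
    {
        struct cpuid_leaf policy = { 0 }, res;

        leaf_7_0(&res, &policy, X86_CR4_PKE);
        printf("ecx=%#x\n", (unsigned)res.c); /* OSPKE set, as CR4.PKE is. */
        return 0;
    }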

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/cpuid.c        | 41 +++++++++++++++++++++++++++++++++++++----
 xen/arch/x86/hvm/hvm.c      | 17 ++++-------------
 xen/arch/x86/traps.c        | 28 ++++------------------------
 xen/include/asm-x86/cpuid.h |  2 ++
 4 files changed, 47 insertions(+), 41 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index d140482..9181fc7 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -339,6 +339,7 @@ int init_domain_cpuid_policy(struct domain *d)
 void guest_cpuid(const struct vcpu *v, unsigned int leaf,
                  unsigned int subleaf, struct cpuid_leaf *res)
 {
+    const struct vcpu *curr = current;
     const struct domain *d = v->domain;
     const struct cpuid_policy *p = d->arch.cpuid;
 
@@ -348,6 +349,7 @@ void guest_cpuid(const struct vcpu *v, unsigned int leaf,
      * First pass:
      * - Dispatch the virtualised leaves to their respective handlers.
      * - Perform max_leaf/subleaf calculations, maybe returning early.
+     * - Fill in *res for leaves no longer handled on the legacy path.
      */
     switch ( leaf )
     {
@@ -358,17 +360,20 @@ void guest_cpuid(const struct vcpu *v, unsigned int leaf,
 #endif
         if ( leaf > p->basic.max_leaf )
             return;
-        break;
+        goto legacy;
 
     case 0x7:
         if ( subleaf > p->feat.max_subleaf )
             return;
+
+        BUG_ON(subleaf >= ARRAY_SIZE(p->feat.raw));
+        *res = p->feat.raw[subleaf];
         break;
 
     case 0xd:
         if ( subleaf > ARRAY_SIZE(p->xstate.raw) )
             return;
-        break;
+        goto legacy;
 
     case 0x40000000 ... 0x400000ff:
         if ( is_viridian_domain(d) )
@@ -380,14 +385,42 @@ void guest_cpuid(const struct vcpu *v, unsigned int leaf,
     case 0x80000000 ... 0x80000000 + CPUID_GUEST_NR_EXTD - 1:
         if ( leaf > p->extd.max_leaf )
             return;
-        break;
+        goto legacy;
 
     default:
         return;
     }
 
+    /* Skip dynamic adjustments if we are in the wrong context. */
+    if ( v != curr )
+        return;
+
+    /*
+     * Second pass:
+     * - Dynamic adjustments
+     */
+    switch ( leaf )
+    {
+    case 0x7:
+        switch ( subleaf )
+        {
+        case 0:
+            /* OSPKE clear in policy.  Fast-forward CR4 back in. */
+            if ( (is_pv_vcpu(v)
+                  ? v->arch.pv_vcpu.ctrlreg[4]
+                  : v->arch.hvm_vcpu.guest_cr[4]) & X86_CR4_PKE )
+                res->c |= cpufeat_mask(X86_FEATURE_OSPKE);
+            break;
+        }
+        break;
+    }
+
+    /* Done. */
+    return;
+
+ legacy:
     /* {pv,hvm}_cpuid() have this expectation. */
-    ASSERT(v == current);
+    ASSERT(v == curr);
 
     if ( is_pv_vcpu(v) || is_pvh_vcpu(v) )
     {
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 1dd92e3..4ded533 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3354,19 +3354,6 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
 
         break;
 
-    case 0x7:
-        if ( count == 0 )
-        {
-            *ebx = d->arch.cpuid->feat._7b0;
-            *ecx = d->arch.cpuid->feat._7c0;
-            *edx = d->arch.cpuid->feat._7d0;
-
-            /* OSPKE clear in policy.  Fast-forward CR4 back in. */
-            if ( v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE )
-                *ecx |= cpufeat_mask(X86_FEATURE_OSPKE);
-        }
-        break;
-
     case 0xb:
         /* Fix the x2APIC identifier. */
         *edx = v->vcpu_id * 2;
@@ -3543,6 +3530,10 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         else
             *eax = 0;
         break;
+
+    case 0x7:
+        ASSERT_UNREACHABLE();
+        /* Now handled in guest_cpuid(). */
     }
 }
 
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index aed96c3..e2669b1 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1180,30 +1180,6 @@ void pv_cpuid(struct cpu_user_regs *regs)
         }
         break;
 
-    case 0x00000007:
-        if ( subleaf == 0 )
-        {
-            b = currd->arch.cpuid->feat._7b0;
-            c = currd->arch.cpuid->feat._7c0;
-            d = currd->arch.cpuid->feat._7d0;
-
-            if ( !is_pvh_domain(currd) )
-            {
-                /*
-                 * Delete the PVH condition when HVMLite formally replaces PVH,
-                 * and HVM guests no longer enter a PV codepath.
-                 */
-
-                /* OSPKE clear in policy.  Fast-forward CR4 back in. */
-                if ( curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_PKE )
-                    c |= cpufeat_mask(X86_FEATURE_OSPKE);
-            }
-        }
-        else
-            b = c = d = 0;
-        a = 0;
-        break;
-
     case 0x0000000a: /* Architectural Performance Monitor Features (Intel) */
         if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
              !vpmu_enabled(curr) )
@@ -1305,6 +1281,10 @@ void pv_cpuid(struct cpu_user_regs *regs)
     unsupported:
         a = b = c = d = 0;
         break;
+
+    case 0x7:
+        ASSERT_UNREACHABLE();
+        /* Now handled in guest_cpuid(). */
     }
 
     regs->rax = a;
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 7363263..c7e9df5 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -81,12 +81,14 @@ struct cpuid_policy
      *   - {xcr0,xss}_{high,low}
      *
      * - Guest appropriate:
+     *   - All of the feat union
      *   - max_{,sub}leaf
      *   - All FEATURESET_* words
      *
      * Per-domain objects:
      *
      * - Guest accurate:
+     *   - All of the feat union
      *   - max_{,sub}leaf
      *   - All FEATURESET_* words
      *
-- 
2.1.4


* [PATCH 24/27] x86/hvm: Use guest_cpuid() rather than hvm_cpuid()
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (22 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 23/27] x86/cpuid: Move all leaf 7 handling into guest_cpuid() Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-05 14:02   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 25/27] x86/svm: " Andrew Cooper
                   ` (2 subsequent siblings)
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

More work is required before maxphysaddr can be read straight out of the
cpuid_policy block, but in the meantime hvm_cpuid() wants to disappear, so
update the code to use the newer interface.

Use the behaviour of max_leaf handling (returning all zeros) to avoid a double
call into guest_cpuid().
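
As a sketch of why a single call suffices (simplified stand-ins, not the real
interface): an out-of-range leaf now comes back as all zeroes, so the GNU ?:
fallback covers both "no leaf 0x80000008 at all" and "field happens to be 0".

    /* Illustrative sketch only; simplified stand-ins for the real code. */
    #include <stdint.h>
    #include <stdio.h>

    struct cpuid_leaf { uint32_t a, b, c, d; };

    /* Stand-in for guest_cpuid(): pretend the policy tops out below leaf
     * 0x80000008, so that leaf reads as all zeroes. */
    static void fake_guest_cpuid(uint32_t leaf, struct cpuid_leaf *res)
    {
        res->a = res->b = res->c = res->d = 0;
        if ( leaf <= 0x80000004u )
            res->a = 0x3028;         /* arbitrary in-range data */
    }

    int main(void)
    {
        struct cpuid_leaf res;
        uint32_t phys_addr;

        fake_guest_cpuid(0x80000008u, &res);
        phys_addr = (uint8_t)res.a ?: 36;   /* zero leaf => default 36 bits */
        printf("phys_addr=%u\n", (unsigned)phys_addr);
        return 0;
    }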

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/hvm/mtrr.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c
index 228dac1..709759c 100644
--- a/xen/arch/x86/hvm/mtrr.c
+++ b/xen/arch/x86/hvm/mtrr.c
@@ -440,7 +440,7 @@ bool_t mtrr_fix_range_msr_set(struct domain *d, struct mtrr_state *m,
 bool_t mtrr_var_range_msr_set(
     struct domain *d, struct mtrr_state *m, uint32_t msr, uint64_t msr_content)
 {
-    uint32_t index, phys_addr, eax;
+    uint32_t index, phys_addr;
     uint64_t msr_mask;
     uint64_t *var_range_base = (uint64_t*)m->var_ranges;
 
@@ -453,13 +453,10 @@ bool_t mtrr_var_range_msr_set(
 
     if ( d == current->domain )
     {
-        phys_addr = 36;
-        hvm_cpuid(0x80000000, &eax, NULL, NULL, NULL);
-        if ( (eax >> 16) == 0x8000 && eax >= 0x80000008 )
-        {
-            hvm_cpuid(0x80000008, &eax, NULL, NULL, NULL);
-            phys_addr = (uint8_t)eax;
-        }
+        struct cpuid_leaf res;
+
+        guest_cpuid(current, 0x80000008, 0, &res);
+        phys_addr = (uint8_t)res.a ?: 36;
     }
     else
         phys_addr = paddr_bits;
-- 
2.1.4


* [PATCH 25/27] x86/svm: Use guest_cpuid() rather than hvm_cpuid()
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (23 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 24/27] x86/hvm: Use guest_cpuid() rather than hvm_cpuid() Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-04 15:26   ` Boris Ostrovsky
  2017-01-05 14:04   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 26/27] x86/cpuid: Effectively remove pv_cpuid() and hvm_cpuid() Andrew Cooper
  2017-01-04 12:39 ` [PATCH 27/27] x86/cpuid: Alter the legacy-path prototypes to match guest_cpuid() Andrew Cooper
  26 siblings, 2 replies; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit, Jan Beulich

More work is required before LWP details can be read straight out of the
cpuid_policy block, but in the meantime hvm_cpuid() wants to disappear, so
update the code to use the newer interface.
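
As an illustration of the resulting check (simplified stand-ins only): the
guest-visible EDX of leaf 0x8000001c acts as the mask of LWP features which a
guest may turn on via LWP_CFG.

    /* Illustrative sketch only; not the real Xen code. */
    #include <stdint.h>
    #include <stdio.h>

    struct cpuid_leaf { uint32_t a, b, c, d; };

    /* Stand-in for guest_cpuid(v, 0x8000001c, 0, &res). */
    static void fake_guest_cpuid_lwp(struct cpuid_leaf *res)
    {
        res->a = res->b = res->c = 0;
        res->d = 0x0000000fu;        /* pretend four feature bits are allowed */
    }

    /* Return -1 (i.e. inject #GP) if unsupported feature bits are requested. */
    static int check_lwp_cfg(uint64_t msr_content)
    {
        struct cpuid_leaf res;
        uint32_t msr_low = (uint32_t)msr_content;

        fake_guest_cpuid_lwp(&res);
        return (msr_low & ~res.d) ? -1 : 0;
    }

    int main(void)
    {
        printf("%d %d\n", check_lwp_cfg(0x3), check_lwp_cfg(0x10));
        return 0;
    }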

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 xen/arch/x86/hvm/svm/svm.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 8f6737c..36c7edd 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -926,17 +926,17 @@ static inline void svm_lwp_load(struct vcpu *v)
 /* Update LWP_CFG MSR (0xc0000105). Return -1 if error; otherwise returns 0. */
 static int svm_update_lwp_cfg(struct vcpu *v, uint64_t msr_content)
 {
-    unsigned int edx;
+    struct cpuid_leaf res;
     uint32_t msr_low;
     static uint8_t lwp_intr_vector;
 
     if ( xsave_enabled(v) && cpu_has_lwp )
     {
-        hvm_cpuid(0x8000001c, NULL, NULL, NULL, &edx);
+        guest_cpuid(v, 0x8000001c, 0, &res);
         msr_low = (uint32_t)msr_content;
         
         /* generate #GP if guest tries to turn on unsupported features. */
-        if ( msr_low & ~edx)
+        if ( msr_low & ~res.d)
             return -1;
 
         v->arch.hvm_svm.guest_lwp_cfg = msr_content;
-- 
2.1.4


* [PATCH 26/27] x86/cpuid: Effectively remove pv_cpuid() and hvm_cpuid()
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (24 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 25/27] x86/svm: " Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-05 14:06   ` Jan Beulich
  2017-01-04 12:39 ` [PATCH 27/27] x86/cpuid: Alter the legacy-path prototypes to match guest_cpuid() Andrew Cooper
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

All callers of pv_cpuid() and hvm_cpuid() (other than the guest_cpuid() legacy
path) have been removed from the codebase.  Move them into cpuid.c to avoid
any further use, leaving guest_cpuid() as the sole API to use.
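
For illustration, the calling convention which remains public after this
change looks roughly as follows; simplified stand-in types and a dummy body,
not the real implementation.

    /* Illustrative sketch only; simplified stand-ins, dummy body. */
    #include <stdint.h>
    #include <stdio.h>

    struct cpuid_leaf { uint32_t a, b, c, d; };
    struct vcpu;                     /* opaque for the purposes of the sketch */

    /* One entry point: leaf/subleaf in, a whole leaf's worth of output back. */
    static void guest_cpuid(const struct vcpu *v, uint32_t leaf,
                            uint32_t subleaf, struct cpuid_leaf *res)
    {
        (void)v;
        res->a = leaf;               /* dummy data standing in for real policy */
        res->b = subleaf;
        res->c = res->d = 0;
    }

    int main(void)
    {
        struct cpuid_leaf res;

        /*
         * Old style: hvm_cpuid(0x1, &eax, NULL, &ecx, &edx), with NULLs for
         * outputs the caller didn't want.  New style: always fill the lot.
         */
        guest_cpuid(NULL, 0x1, 0, &res);
        printf("%x %x\n", (unsigned)res.a, (unsigned)res.b);
        return 0;
    }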

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/cpuid.c            | 524 ++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/hvm.c          | 250 -------------------
 xen/arch/x86/traps.c            | 273 ---------------------
 xen/include/asm-x86/hvm/hvm.h   |   2 -
 xen/include/asm-x86/processor.h |   2 -
 5 files changed, 524 insertions(+), 527 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 9181fc7..5e7e8cc 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -5,6 +5,7 @@
 #include <asm/hvm/hvm.h>
 #include <asm/hvm/vmx/vmcs.h>
 #include <asm/processor.h>
+#include <asm/xstate.h>
 
 const uint32_t known_features[] = INIT_KNOWN_FEATURES;
 const uint32_t special_features[] = INIT_SPECIAL_FEATURES;
@@ -336,6 +337,529 @@ int init_domain_cpuid_policy(struct domain *d)
     return 0;
 }
 
+static void pv_cpuid(struct cpu_user_regs *regs)
+{
+    uint32_t leaf, subleaf, a, b, c, d;
+    struct vcpu *curr = current;
+    struct domain *currd = curr->domain;
+    const struct cpuid_policy *p = currd->arch.cpuid;
+
+    leaf = a = regs->_eax;
+    b = regs->_ebx;
+    subleaf = c = regs->_ecx;
+    d = regs->_edx;
+
+    if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
+        domain_cpuid(currd, leaf, subleaf, &a, &b, &c, &d);
+    else
+        cpuid_count(leaf, subleaf, &a, &b, &c, &d);
+
+    switch ( leaf )
+    {
+        uint32_t tmp;
+
+    case 0x00000001:
+        c = p->basic._1c;
+        d = p->basic._1d;
+
+        if ( !is_pvh_domain(currd) )
+        {
+            /*
+             * Delete the PVH condition when HVMLite formally replaces PVH,
+             * and HVM guests no longer enter a PV codepath.
+             */
+
+            /*
+             * !!! OSXSAVE handling for PV guests is non-architectural !!!
+             *
+             * Architecturally, the correct code here is simply:
+             *
+             *   if ( curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_OSXSAVE )
+             *       c |= cpufeat_mask(X86_FEATURE_OSXSAVE);
+             *
+             * However because of bugs in Xen (before c/s bd19080b, Nov 2010,
+             * the XSAVE cpuid flag leaked into guests despite the feature not
+             * being available for use), buggy workarounds where introduced to
+             * Linux (c/s 947ccf9c, also Nov 2010) which relied on the fact
+             * that Xen also incorrectly leaked OSXSAVE into the guest.
+             *
+             * Furthermore, providing architectural OSXSAVE behaviour to a
+             * many Linux PV guests triggered a further kernel bug when the
+             * fpu code observes that XSAVEOPT is available, assumes that
+             * xsave state had been set up for the task, and follows a wild
+             * pointer.
+             *
+             * Older Linux PVOPS kernels however do require architectural
+             * behaviour.  They observe Xen's leaked OSXSAVE and assume they
+             * can already use XSETBV, dying with a #UD because the shadowed
+             * CR4.OSXSAVE is clear.  This behaviour has been adjusted in all
+             * observed cases via stable backports of the above changeset.
+             *
+             * Therefore, the leaking of Xen's OSXSAVE setting has become a
+             * defacto part of the PV ABI and can't reasonably be corrected.
+             * It can however be restricted to only the enlightened CPUID
+             * view, as seen by the guest kernel.
+             *
+             * The following situations and logic now applies:
+             *
+             * - Hardware without CPUID faulting support and native CPUID:
+             *    There is nothing Xen can do here.  The hosts XSAVE flag will
+             *    leak through and Xen's OSXSAVE choice will leak through.
+             *
+             *    In the case that the guest kernel has not set up OSXSAVE, only
+             *    SSE will be set in xcr0, and guest userspace can't do too much
+             *    damage itself.
+             *
+             * - Enlightened CPUID or CPUID faulting available:
+             *    Xen can fully control what is seen here.  Guest kernels need
+             *    to see the leaked OSXSAVE via the enlightened path, but
+             *    guest userspace and the native is given architectural
+             *    behaviour.
+             *
+             *    Emulated vs Faulted CPUID is distinguised based on whether a
+             *    #UD or #GP is currently being serviced.
+             */
+            /* OSXSAVE clear in policy.  Fast-forward CR4 back in. */
+            if ( (curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_OSXSAVE) ||
+                 (regs->entry_vector == TRAP_invalid_op &&
+                  guest_kernel_mode(curr, regs) &&
+                  (read_cr4() & X86_CR4_OSXSAVE)) )
+                c |= cpufeat_mask(X86_FEATURE_OSXSAVE);
+
+            /*
+             * At the time of writing, a PV domain is the only viable option
+             * for Dom0.  Several interactions between dom0 and Xen for real
+             * hardware setup have unfortunately been implemented based on
+             * state which incorrectly leaked into dom0.
+             *
+             * These leaks are retained for backwards compatibility, but
+             * restricted to the hardware domains kernel only.
+             */
+            if ( is_hardware_domain(currd) && guest_kernel_mode(curr, regs) )
+            {
+                /*
+                 * MTRR used to unconditionally leak into PV guests.  They
+                 * cannot MTRR infrastructure at all, and shouldn't be able to
+                 * see the feature.
+                 *
+                 * Modern PVOPS Linux self-clobbers the MTRR feature, to avoid
+                 * trying to use the associated MSRs.  Xenolinux-based PV dom0's
+                 * however use the MTRR feature as an indication of the presence
+                 * of the XENPF_{add,del,read}_memtype hypercalls.
+                 */
+                if ( cpu_has_mtrr )
+                    d |= cpufeat_mask(X86_FEATURE_MTRR);
+
+                /*
+                 * MONITOR never leaked into PV guests, as PV guests cannot
+                 * use the MONITOR/MWAIT instructions.  As such, they require
+                 * the feature to not being present in emulated CPUID.
+                 *
+                 * Modern PVOPS Linux try to be cunning and use native CPUID
+                 * to see if the hardware actually supports MONITOR, and by
+                 * extension, deep C states.
+                 *
+                 * If the feature is seen, deep-C state information is
+                 * obtained from the DSDT and handed back to Xen via the
+                 * XENPF_set_processor_pminfo hypercall.
+                 *
+                 * This mechanism is incompatible with an HVM-based hardware
+                 * domain, and also with CPUID Faulting.
+                 *
+                 * Luckily, Xen can be just as 'cunning', and distinguish an
+                 * emulated CPUID from a faulted CPUID by whether a #UD or #GP
+                 * fault is currently being serviced.  Yuck...
+                 */
+                if ( cpu_has_monitor && regs->entry_vector == TRAP_gp_fault )
+                    c |= cpufeat_mask(X86_FEATURE_MONITOR);
+
+                /*
+                 * While MONITOR never leaked into PV guests, EIST always used
+                 * to.
+                 *
+                 * Modern PVOPS will only parse P state information from the
+                 * DSDT and return it to Xen if EIST is seen in the emulated
+                 * CPUID information.
+                 */
+                if ( cpu_has_eist )
+                    c |= cpufeat_mask(X86_FEATURE_EIST);
+            }
+        }
+
+        if ( vpmu_enabled(curr) &&
+             vpmu_is_set(vcpu_vpmu(curr), VPMU_CPU_HAS_DS) )
+        {
+            d |= cpufeat_mask(X86_FEATURE_DS);
+            if ( cpu_has(&current_cpu_data, X86_FEATURE_DTES64) )
+                c |= cpufeat_mask(X86_FEATURE_DTES64);
+            if ( cpu_has(&current_cpu_data, X86_FEATURE_DSCPL) )
+                c |= cpufeat_mask(X86_FEATURE_DSCPL);
+        }
+        break;
+
+    case 0x0000000a: /* Architectural Performance Monitor Features (Intel) */
+        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
+             !vpmu_enabled(curr) )
+            goto unsupported;
+
+        /* Report at most version 3 since that's all we currently emulate. */
+        if ( (a & 0xff) > 3 )
+            a = (a & ~0xff) | 3;
+        break;
+
+    case XSTATE_CPUID:
+        if ( !p->basic.xsave || subleaf >= 63 )
+            goto unsupported;
+        switch ( subleaf )
+        {
+        case 0:
+        {
+            uint64_t xfeature_mask = XSTATE_FP_SSE;
+            uint32_t xstate_size = XSTATE_AREA_MIN_SIZE;
+
+            if ( p->basic.avx )
+            {
+                xfeature_mask |= XSTATE_YMM;
+                xstate_size = (xstate_offsets[_XSTATE_YMM] +
+                               xstate_sizes[_XSTATE_YMM]);
+            }
+
+            if ( p->feat.avx512f )
+            {
+                xfeature_mask |= XSTATE_OPMASK | XSTATE_ZMM | XSTATE_HI_ZMM;
+                xstate_size = max(xstate_size,
+                                  xstate_offsets[_XSTATE_OPMASK] +
+                                  xstate_sizes[_XSTATE_OPMASK]);
+                xstate_size = max(xstate_size,
+                                  xstate_offsets[_XSTATE_ZMM] +
+                                  xstate_sizes[_XSTATE_ZMM]);
+                xstate_size = max(xstate_size,
+                                  xstate_offsets[_XSTATE_HI_ZMM] +
+                                  xstate_sizes[_XSTATE_HI_ZMM]);
+            }
+
+            a = (uint32_t)xfeature_mask;
+            d = (uint32_t)(xfeature_mask >> 32);
+            c = xstate_size;
+
+            /*
+             * Always read CPUID.0xD[ECX=0].EBX from hardware, rather than
+             * domain policy.  It varies with enabled xstate, and the correct
+             * xcr0 is in context.
+             */
+            cpuid_count(leaf, subleaf, &tmp, &b, &tmp, &tmp);
+            break;
+        }
+
+        case 1:
+            a = p->xstate.Da1;
+            b = c = d = 0;
+            break;
+        }
+        break;
+
+    case 0x80000001:
+        c = p->extd.e1c;
+        d = p->extd.e1d;
+
+        /* If not emulating AMD, clear the duplicated features in e1d. */
+        if ( currd->arch.x86_vendor != X86_VENDOR_AMD )
+            d &= ~CPUID_COMMON_1D_FEATURES;
+
+        /*
+         * MTRR used to unconditionally leak into PV guests.  They cannot MTRR
+         * infrastructure at all, and shouldn't be able to see the feature.
+         *
+         * Modern PVOPS Linux self-clobbers the MTRR feature, to avoid trying
+         * to use the associated MSRs.  Xenolinux-based PV dom0's however use
+         * the MTRR feature as an indication of the presence of the
+         * XENPF_{add,del,read}_memtype hypercalls.
+         */
+        if ( is_hardware_domain(currd) && guest_kernel_mode(curr, regs) &&
+             cpu_has_mtrr )
+            d |= cpufeat_mask(X86_FEATURE_MTRR);
+        break;
+
+    case 0x80000007:
+        d = p->extd.e7d;
+        break;
+
+    case 0x80000008:
+        a = paddr_bits | (vaddr_bits << 8);
+        b = p->extd.e8b;
+        break;
+
+    case 0x00000005: /* MONITOR/MWAIT */
+    case 0x0000000b: /* Extended Topology Enumeration */
+    case 0x8000000a: /* SVM revision and features */
+    case 0x8000001b: /* Instruction Based Sampling */
+    case 0x8000001c: /* Light Weight Profiling */
+    case 0x8000001e: /* Extended topology reporting */
+    unsupported:
+        a = b = c = d = 0;
+        break;
+
+    case 0x7:
+        ASSERT_UNREACHABLE();
+        /* Now handled in guest_cpuid(). */
+    }
+
+    regs->rax = a;
+    regs->rbx = b;
+    regs->rcx = c;
+    regs->rdx = d;
+}
+
+static void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
+                      unsigned int *ecx, unsigned int *edx)
+{
+    struct vcpu *v = current;
+    struct domain *d = v->domain;
+    const struct cpuid_policy *p = d->arch.cpuid;
+    unsigned int count, dummy = 0;
+
+    if ( !eax )
+        eax = &dummy;
+    if ( !ebx )
+        ebx = &dummy;
+    if ( !ecx )
+        ecx = &dummy;
+    count = *ecx;
+    if ( !edx )
+        edx = &dummy;
+
+    domain_cpuid(d, input, count, eax, ebx, ecx, edx);
+
+    switch ( input )
+    {
+    case 0x1:
+        /* Fix up VLAPIC details. */
+        *ebx &= 0x00FFFFFFu;
+        *ebx |= (v->vcpu_id * 2) << 24;
+
+        *ecx = p->basic._1c;
+        *edx = p->basic._1d;
+
+        /* APIC exposed to guests, but Fast-forward MSR_APIC_BASE.EN back in. */
+        if ( vlapic_hw_disabled(vcpu_vlapic(v)) )
+            *edx &= ~cpufeat_bit(X86_FEATURE_APIC);
+
+        /* OSXSAVE clear in policy.  Fast-forward CR4 back in. */
+        if ( v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE )
+            *ecx |= cpufeat_mask(X86_FEATURE_OSXSAVE);
+
+        /*
+         * PSE36 is not supported in shadow mode.  This bit should be
+         * unilaterally cleared.
+         *
+         * However, an unspecified version of Hyper-V from 2011 refuses
+         * to start as the "cpu does not provide required hw features" if
+         * it can't see PSE36.
+         *
+         * As a workaround, leak the toolstack-provided PSE36 value into a
+         * shadow guest if the guest is already using PAE paging (and won't
+         * care about reverting back to PSE paging).  Otherwise, knoble it, so
+         * a 32bit guest doesn't get the impression that it could try to use
+         * PSE36 paging.
+         */
+        if ( !hap_enabled(d) && !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
+            *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
+
+        if ( vpmu_enabled(v) &&
+             vpmu_is_set(vcpu_vpmu(v), VPMU_CPU_HAS_DS) )
+        {
+            *edx |= cpufeat_mask(X86_FEATURE_DS);
+            if ( cpu_has(&current_cpu_data, X86_FEATURE_DTES64) )
+                *ecx |= cpufeat_mask(X86_FEATURE_DTES64);
+            if ( cpu_has(&current_cpu_data, X86_FEATURE_DSCPL) )
+                *ecx |= cpufeat_mask(X86_FEATURE_DSCPL);
+        }
+
+        break;
+
+    case 0xb:
+        /* Fix the x2APIC identifier. */
+        *edx = v->vcpu_id * 2;
+        break;
+
+    case XSTATE_CPUID:
+        if ( !p->basic.xsave || count >= 63 )
+        {
+            *eax = *ebx = *ecx = *edx = 0;
+            break;
+        }
+        switch ( count )
+        {
+        case 0:
+        {
+            uint64_t xfeature_mask = XSTATE_FP_SSE;
+            uint32_t xstate_size = XSTATE_AREA_MIN_SIZE;
+
+            if ( p->basic.avx )
+            {
+                xfeature_mask |= XSTATE_YMM;
+                xstate_size = max(xstate_size,
+                                  xstate_offsets[_XSTATE_YMM] +
+                                  xstate_sizes[_XSTATE_YMM]);
+            }
+
+            if ( p->feat.mpx )
+            {
+                xfeature_mask |= XSTATE_BNDREGS | XSTATE_BNDCSR;
+                xstate_size = max(xstate_size,
+                                  xstate_offsets[_XSTATE_BNDCSR] +
+                                  xstate_sizes[_XSTATE_BNDCSR]);
+            }
+
+            if ( p->feat.avx512f )
+            {
+                xfeature_mask |= XSTATE_OPMASK | XSTATE_ZMM | XSTATE_HI_ZMM;
+                xstate_size = max(xstate_size,
+                                  xstate_offsets[_XSTATE_OPMASK] +
+                                  xstate_sizes[_XSTATE_OPMASK]);
+                xstate_size = max(xstate_size,
+                                  xstate_offsets[_XSTATE_ZMM] +
+                                  xstate_sizes[_XSTATE_ZMM]);
+                xstate_size = max(xstate_size,
+                                  xstate_offsets[_XSTATE_HI_ZMM] +
+                                  xstate_sizes[_XSTATE_HI_ZMM]);
+            }
+
+            if ( p->feat.pku )
+            {
+                xfeature_mask |= XSTATE_PKRU;
+                xstate_size = max(xstate_size,
+                                  xstate_offsets[_XSTATE_PKRU] +
+                                  xstate_sizes[_XSTATE_PKRU]);
+            }
+
+            if ( p->extd.lwp )
+            {
+                xfeature_mask |= XSTATE_LWP;
+                xstate_size = max(xstate_size,
+                                  xstate_offsets[_XSTATE_LWP] +
+                                  xstate_sizes[_XSTATE_LWP]);
+            }
+
+            *eax = (uint32_t)xfeature_mask;
+            *edx = (uint32_t)(xfeature_mask >> 32);
+            *ecx = xstate_size;
+
+            /*
+             * Always read CPUID[0xD,0].EBX from hardware, rather than domain
+             * policy.  It varies with enabled xstate, and the correct xcr0 is
+             * in context.
+             */
+            cpuid_count(input, count, &dummy, ebx, &dummy, &dummy);
+            break;
+        }
+
+        case 1:
+            *eax = p->xstate.Da1;
+
+            if ( p->xstate.xsaves )
+            {
+                /*
+                 * Always read CPUID[0xD,1].EBX from hardware, rather than
+                 * domain policy.  It varies with enabled xstate, and the
+                 * correct xcr0/xss are in context.
+                 */
+                cpuid_count(input, count, &dummy, ebx, &dummy, &dummy);
+            }
+            else
+                *ebx = 0;
+
+            *ecx = *edx = 0;
+            break;
+        }
+        break;
+
+    case 0x0000000a: /* Architectural Performance Monitor Features (Intel) */
+        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL || !vpmu_enabled(v) )
+        {
+            *eax = *ebx = *ecx = *edx = 0;
+            break;
+        }
+
+        /* Report at most version 3 since that's all we currently emulate */
+        if ( (*eax & 0xff) > 3 )
+            *eax = (*eax & ~0xff) | 3;
+        break;
+
+    case 0x80000001:
+        *ecx = p->extd.e1c;
+        *edx = p->extd.e1d;
+
+        /* If not emulating AMD, clear the duplicated features in e1d. */
+        if ( d->arch.x86_vendor != X86_VENDOR_AMD )
+            *edx &= ~CPUID_COMMON_1D_FEATURES;
+        /* fast-forward MSR_APIC_BASE.EN if it hasn't already been clobbered. */
+        else if ( vlapic_hw_disabled(vcpu_vlapic(v)) )
+            *edx &= ~cpufeat_bit(X86_FEATURE_APIC);
+
+        /*
+         * PSE36 is not supported in shadow mode.  This bit should be
+         * unilaterally cleared.
+         *
+         * However, an unspecified version of Hyper-V from 2011 refuses
+         * to start as the "cpu does not provide required hw features" if
+         * it can't see PSE36.
+         *
+         * As a workaround, leak the toolstack-provided PSE36 value into a
+         * shadow guest if the guest is already using PAE paging (and won't
+         * care about reverting back to PSE paging).  Otherwise, knoble it, so
+         * a 32bit guest doesn't get the impression that it could try to use
+         * PSE36 paging.
+         */
+        if ( !hap_enabled(d) && !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
+            *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
+
+        /* SYSCALL is hidden outside of long mode on Intel. */
+        if ( d->arch.x86_vendor == X86_VENDOR_INTEL &&
+             !hvm_long_mode_enabled(v))
+            *edx &= ~cpufeat_mask(X86_FEATURE_SYSCALL);
+
+        break;
+
+    case 0x80000007:
+        *edx = p->extd.e7d;
+        break;
+
+    case 0x80000008:
+        *eax &= 0xff;
+        count = d->arch.paging.gfn_bits + PAGE_SHIFT;
+        if ( *eax > count )
+            *eax = count;
+
+        count = (p->basic.pae || p->basic.pse36) ? 36 : 32;
+        if ( *eax < count )
+            *eax = count;
+
+        *eax |= (p->extd.lm ? vaddr_bits : 32) << 8;
+
+        *ebx = p->extd.e8b;
+        break;
+
+    case 0x8000001c:
+        if ( !cpu_has_svm )
+        {
+            *eax = *ebx = *ecx = *edx = 0;
+            break;
+        }
+
+        if ( cpu_has_lwp && (v->arch.xcr0 & XSTATE_LWP) )
+            /* Turn on available bit and other features specified in lwp_cfg. */
+            *eax = (*edx & v->arch.hvm_svm.guest_lwp_cfg) | 1;
+        else
+            *eax = 0;
+        break;
+
+    case 0x7:
+        ASSERT_UNREACHABLE();
+        /* Now handled in guest_cpuid(). */
+    }
+}
+
 void guest_cpuid(const struct vcpu *v, unsigned int leaf,
                  unsigned int subleaf, struct cpuid_leaf *res)
 {
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 4ded533..cf38907 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3287,256 +3287,6 @@ unsigned long copy_from_user_hvm(void *to, const void *from, unsigned len)
     return rc ? len : 0; /* fake a copy_from_user() return code */
 }
 
-void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
-                                   unsigned int *ecx, unsigned int *edx)
-{
-    struct vcpu *v = current;
-    struct domain *d = v->domain;
-    const struct cpuid_policy *p = d->arch.cpuid;
-    unsigned int count, dummy = 0;
-
-    if ( !eax )
-        eax = &dummy;
-    if ( !ebx )
-        ebx = &dummy;
-    if ( !ecx )
-        ecx = &dummy;
-    count = *ecx;
-    if ( !edx )
-        edx = &dummy;
-
-    domain_cpuid(d, input, count, eax, ebx, ecx, edx);
-
-    switch ( input )
-    {
-    case 0x1:
-        /* Fix up VLAPIC details. */
-        *ebx &= 0x00FFFFFFu;
-        *ebx |= (v->vcpu_id * 2) << 24;
-
-        *ecx = p->basic._1c;
-        *edx = p->basic._1d;
-
-        /* APIC exposed to guests, but Fast-forward MSR_APIC_BASE.EN back in. */
-        if ( vlapic_hw_disabled(vcpu_vlapic(v)) )
-            *edx &= ~cpufeat_bit(X86_FEATURE_APIC);
-
-        /* OSXSAVE clear in policy.  Fast-forward CR4 back in. */
-        if ( v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE )
-            *ecx |= cpufeat_mask(X86_FEATURE_OSXSAVE);
-
-        /*
-         * PSE36 is not supported in shadow mode.  This bit should be
-         * unilaterally cleared.
-         *
-         * However, an unspecified version of Hyper-V from 2011 refuses
-         * to start as the "cpu does not provide required hw features" if
-         * it can't see PSE36.
-         *
-         * As a workaround, leak the toolstack-provided PSE36 value into a
-         * shadow guest if the guest is already using PAE paging (and won't
-         * care about reverting back to PSE paging).  Otherwise, knoble it, so
-         * a 32bit guest doesn't get the impression that it could try to use
-         * PSE36 paging.
-         */
-        if ( !hap_enabled(d) && !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
-            *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
-
-        if ( vpmu_enabled(v) &&
-             vpmu_is_set(vcpu_vpmu(v), VPMU_CPU_HAS_DS) )
-        {
-            *edx |= cpufeat_mask(X86_FEATURE_DS);
-            if ( cpu_has(&current_cpu_data, X86_FEATURE_DTES64) )
-                *ecx |= cpufeat_mask(X86_FEATURE_DTES64);
-            if ( cpu_has(&current_cpu_data, X86_FEATURE_DSCPL) )
-                *ecx |= cpufeat_mask(X86_FEATURE_DSCPL);
-        }
-
-        break;
-
-    case 0xb:
-        /* Fix the x2APIC identifier. */
-        *edx = v->vcpu_id * 2;
-        break;
-
-    case XSTATE_CPUID:
-        if ( !p->basic.xsave || count >= 63 )
-        {
-            *eax = *ebx = *ecx = *edx = 0;
-            break;
-        }
-        switch ( count )
-        {
-        case 0:
-        {
-            uint64_t xfeature_mask = XSTATE_FP_SSE;
-            uint32_t xstate_size = XSTATE_AREA_MIN_SIZE;
-
-            if ( p->basic.avx )
-            {
-                xfeature_mask |= XSTATE_YMM;
-                xstate_size = max(xstate_size,
-                                  xstate_offsets[_XSTATE_YMM] +
-                                  xstate_sizes[_XSTATE_YMM]);
-            }
-
-            if ( p->feat.mpx )
-            {
-                xfeature_mask |= XSTATE_BNDREGS | XSTATE_BNDCSR;
-                xstate_size = max(xstate_size,
-                                  xstate_offsets[_XSTATE_BNDCSR] +
-                                  xstate_sizes[_XSTATE_BNDCSR]);
-            }
-
-            if ( p->feat.avx512f )
-            {
-                xfeature_mask |= XSTATE_OPMASK | XSTATE_ZMM | XSTATE_HI_ZMM;
-                xstate_size = max(xstate_size,
-                                  xstate_offsets[_XSTATE_OPMASK] +
-                                  xstate_sizes[_XSTATE_OPMASK]);
-                xstate_size = max(xstate_size,
-                                  xstate_offsets[_XSTATE_ZMM] +
-                                  xstate_sizes[_XSTATE_ZMM]);
-                xstate_size = max(xstate_size,
-                                  xstate_offsets[_XSTATE_HI_ZMM] +
-                                  xstate_sizes[_XSTATE_HI_ZMM]);
-            }
-
-            if ( p->feat.pku )
-            {
-                xfeature_mask |= XSTATE_PKRU;
-                xstate_size = max(xstate_size,
-                                  xstate_offsets[_XSTATE_PKRU] +
-                                  xstate_sizes[_XSTATE_PKRU]);
-            }
-
-            if ( p->extd.lwp )
-            {
-                xfeature_mask |= XSTATE_LWP;
-                xstate_size = max(xstate_size,
-                                  xstate_offsets[_XSTATE_LWP] +
-                                  xstate_sizes[_XSTATE_LWP]);
-            }
-
-            *eax = (uint32_t)xfeature_mask;
-            *edx = (uint32_t)(xfeature_mask >> 32);
-            *ecx = xstate_size;
-
-            /*
-             * Always read CPUID[0xD,0].EBX from hardware, rather than domain
-             * policy.  It varies with enabled xstate, and the correct xcr0 is
-             * in context.
-             */
-            cpuid_count(input, count, &dummy, ebx, &dummy, &dummy);
-            break;
-        }
-
-        case 1:
-            *eax = p->xstate.Da1;
-
-            if ( p->xstate.xsaves )
-            {
-                /*
-                 * Always read CPUID[0xD,1].EBX from hardware, rather than
-                 * domain policy.  It varies with enabled xstate, and the
-                 * correct xcr0/xss are in context.
-                 */
-                cpuid_count(input, count, &dummy, ebx, &dummy, &dummy);
-            }
-            else
-                *ebx = 0;
-
-            *ecx = *edx = 0;
-            break;
-        }
-        break;
-
-    case 0x0000000a: /* Architectural Performance Monitor Features (Intel) */
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL || !vpmu_enabled(v) )
-        {
-            *eax = *ebx = *ecx = *edx = 0;
-            break;
-        }
-
-        /* Report at most version 3 since that's all we currently emulate */
-        if ( (*eax & 0xff) > 3 )
-            *eax = (*eax & ~0xff) | 3;
-        break;
-
-    case 0x80000001:
-        *ecx = p->extd.e1c;
-        *edx = p->extd.e1d;
-
-        /* If not emulating AMD, clear the duplicated features in e1d. */
-        if ( d->arch.x86_vendor != X86_VENDOR_AMD )
-            *edx &= ~CPUID_COMMON_1D_FEATURES;
-        /* fast-forward MSR_APIC_BASE.EN if it hasn't already been clobbered. */
-        else if ( vlapic_hw_disabled(vcpu_vlapic(v)) )
-            *edx &= ~cpufeat_bit(X86_FEATURE_APIC);
-
-        /*
-         * PSE36 is not supported in shadow mode.  This bit should be
-         * unilaterally cleared.
-         *
-         * However, an unspecified version of Hyper-V from 2011 refuses
-         * to start as the "cpu does not provide required hw features" if
-         * it can't see PSE36.
-         *
-         * As a workaround, leak the toolstack-provided PSE36 value into a
-         * shadow guest if the guest is already using PAE paging (and won't
-         * care about reverting back to PSE paging).  Otherwise, knoble it, so
-         * a 32bit guest doesn't get the impression that it could try to use
-         * PSE36 paging.
-         */
-        if ( !hap_enabled(d) && !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
-            *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
-
-        /* SYSCALL is hidden outside of long mode on Intel. */
-        if ( d->arch.x86_vendor == X86_VENDOR_INTEL &&
-             !hvm_long_mode_enabled(v))
-            *edx &= ~cpufeat_mask(X86_FEATURE_SYSCALL);
-
-        break;
-
-    case 0x80000007:
-        *edx = p->extd.e7d;
-        break;
-
-    case 0x80000008:
-        *eax &= 0xff;
-        count = d->arch.paging.gfn_bits + PAGE_SHIFT;
-        if ( *eax > count )
-            *eax = count;
-
-        count = (p->basic.pae || p->basic.pse36) ? 36 : 32;
-        if ( *eax < count )
-            *eax = count;
-
-        *eax |= (p->extd.lm ? vaddr_bits : 32) << 8;
-
-        *ebx = p->extd.e8b;
-        break;
-
-    case 0x8000001c:
-        if ( !cpu_has_svm )
-        {
-            *eax = *ebx = *ecx = *edx = 0;
-            break;
-        }
-
-        if ( cpu_has_lwp && (v->arch.xcr0 & XSTATE_LWP) )
-            /* Turn on available bit and other features specified in lwp_cfg. */
-            *eax = (*edx & v->arch.hvm_svm.guest_lwp_cfg) | 1;
-        else
-            *eax = 0;
-        break;
-
-    case 0x7:
-        ASSERT_UNREACHABLE();
-        /* Now handled in guest_cpuid(). */
-    }
-}
-
 bool hvm_check_cpuid_faulting(struct vcpu *v)
 {
     if ( !v->arch.cpuid_faulting )
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index e2669b1..501d5b3 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1020,279 +1020,6 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, unsigned int leaf,
     }
 }
 
-void pv_cpuid(struct cpu_user_regs *regs)
-{
-    uint32_t leaf, subleaf, a, b, c, d;
-    struct vcpu *curr = current;
-    struct domain *currd = curr->domain;
-    const struct cpuid_policy *p = currd->arch.cpuid;
-
-    leaf = a = regs->_eax;
-    b = regs->_ebx;
-    subleaf = c = regs->_ecx;
-    d = regs->_edx;
-
-    if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
-        domain_cpuid(currd, leaf, subleaf, &a, &b, &c, &d);
-    else
-        cpuid_count(leaf, subleaf, &a, &b, &c, &d);
-
-    switch ( leaf )
-    {
-        uint32_t tmp;
-
-    case 0x00000001:
-        c = p->basic._1c;
-        d = p->basic._1d;
-
-        if ( !is_pvh_domain(currd) )
-        {
-            /*
-             * Delete the PVH condition when HVMLite formally replaces PVH,
-             * and HVM guests no longer enter a PV codepath.
-             */
-
-            /*
-             * !!! OSXSAVE handling for PV guests is non-architectural !!!
-             *
-             * Architecturally, the correct code here is simply:
-             *
-             *   if ( curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_OSXSAVE )
-             *       c |= cpufeat_mask(X86_FEATURE_OSXSAVE);
-             *
-             * However because of bugs in Xen (before c/s bd19080b, Nov 2010,
-             * the XSAVE cpuid flag leaked into guests despite the feature not
-             * being available for use), buggy workarounds where introduced to
-             * Linux (c/s 947ccf9c, also Nov 2010) which relied on the fact
-             * that Xen also incorrectly leaked OSXSAVE into the guest.
-             *
-             * Furthermore, providing architectural OSXSAVE behaviour to a
-             * many Linux PV guests triggered a further kernel bug when the
-             * fpu code observes that XSAVEOPT is available, assumes that
-             * xsave state had been set up for the task, and follows a wild
-             * pointer.
-             *
-             * Older Linux PVOPS kernels however do require architectural
-             * behaviour.  They observe Xen's leaked OSXSAVE and assume they
-             * can already use XSETBV, dying with a #UD because the shadowed
-             * CR4.OSXSAVE is clear.  This behaviour has been adjusted in all
-             * observed cases via stable backports of the above changeset.
-             *
-             * Therefore, the leaking of Xen's OSXSAVE setting has become a
-             * defacto part of the PV ABI and can't reasonably be corrected.
-             * It can however be restricted to only the enlightened CPUID
-             * view, as seen by the guest kernel.
-             *
-             * The following situations and logic now applies:
-             *
-             * - Hardware without CPUID faulting support and native CPUID:
-             *    There is nothing Xen can do here.  The hosts XSAVE flag will
-             *    leak through and Xen's OSXSAVE choice will leak through.
-             *
-             *    In the case that the guest kernel has not set up OSXSAVE, only
-             *    SSE will be set in xcr0, and guest userspace can't do too much
-             *    damage itself.
-             *
-             * - Enlightened CPUID or CPUID faulting available:
-             *    Xen can fully control what is seen here.  Guest kernels need
-             *    to see the leaked OSXSAVE via the enlightened path, but
-             *    guest userspace and the native is given architectural
-             *    behaviour.
-             *
-             *    Emulated vs Faulted CPUID is distinguised based on whether a
-             *    #UD or #GP is currently being serviced.
-             */
-            /* OSXSAVE clear in policy.  Fast-forward CR4 back in. */
-            if ( (curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_OSXSAVE) ||
-                 (regs->entry_vector == TRAP_invalid_op &&
-                  guest_kernel_mode(curr, regs) &&
-                  (read_cr4() & X86_CR4_OSXSAVE)) )
-                c |= cpufeat_mask(X86_FEATURE_OSXSAVE);
-
-            /*
-             * At the time of writing, a PV domain is the only viable option
-             * for Dom0.  Several interactions between dom0 and Xen for real
-             * hardware setup have unfortunately been implemented based on
-             * state which incorrectly leaked into dom0.
-             *
-             * These leaks are retained for backwards compatibility, but
-             * restricted to the hardware domains kernel only.
-             */
-            if ( is_hardware_domain(currd) && guest_kernel_mode(curr, regs) )
-            {
-                /*
-                 * MTRR used to unconditionally leak into PV guests.  They
-                 * cannot MTRR infrastructure at all, and shouldn't be able to
-                 * see the feature.
-                 *
-                 * Modern PVOPS Linux self-clobbers the MTRR feature, to avoid
-                 * trying to use the associated MSRs.  Xenolinux-based PV dom0's
-                 * however use the MTRR feature as an indication of the presence
-                 * of the XENPF_{add,del,read}_memtype hypercalls.
-                 */
-                if ( cpu_has_mtrr )
-                    d |= cpufeat_mask(X86_FEATURE_MTRR);
-
-                /*
-                 * MONITOR never leaked into PV guests, as PV guests cannot
-                 * use the MONITOR/MWAIT instructions.  As such, they require
-                 * the feature to not being present in emulated CPUID.
-                 *
-                 * Modern PVOPS Linux try to be cunning and use native CPUID
-                 * to see if the hardware actually supports MONITOR, and by
-                 * extension, deep C states.
-                 *
-                 * If the feature is seen, deep-C state information is
-                 * obtained from the DSDT and handed back to Xen via the
-                 * XENPF_set_processor_pminfo hypercall.
-                 *
-                 * This mechanism is incompatible with an HVM-based hardware
-                 * domain, and also with CPUID Faulting.
-                 *
-                 * Luckily, Xen can be just as 'cunning', and distinguish an
-                 * emulated CPUID from a faulted CPUID by whether a #UD or #GP
-                 * fault is currently being serviced.  Yuck...
-                 */
-                if ( cpu_has_monitor && regs->entry_vector == TRAP_gp_fault )
-                    c |= cpufeat_mask(X86_FEATURE_MONITOR);
-
-                /*
-                 * While MONITOR never leaked into PV guests, EIST always used
-                 * to.
-                 *
-                 * Modern PVOPS will only parse P state information from the
-                 * DSDT and return it to Xen if EIST is seen in the emulated
-                 * CPUID information.
-                 */
-                if ( cpu_has_eist )
-                    c |= cpufeat_mask(X86_FEATURE_EIST);
-            }
-        }
-
-        if ( vpmu_enabled(curr) &&
-             vpmu_is_set(vcpu_vpmu(curr), VPMU_CPU_HAS_DS) )
-        {
-            d |= cpufeat_mask(X86_FEATURE_DS);
-            if ( cpu_has(&current_cpu_data, X86_FEATURE_DTES64) )
-                c |= cpufeat_mask(X86_FEATURE_DTES64);
-            if ( cpu_has(&current_cpu_data, X86_FEATURE_DSCPL) )
-                c |= cpufeat_mask(X86_FEATURE_DSCPL);
-        }
-        break;
-
-    case 0x0000000a: /* Architectural Performance Monitor Features (Intel) */
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
-             !vpmu_enabled(curr) )
-            goto unsupported;
-
-        /* Report at most version 3 since that's all we currently emulate. */
-        if ( (a & 0xff) > 3 )
-            a = (a & ~0xff) | 3;
-        break;
-
-    case XSTATE_CPUID:
-        if ( !p->basic.xsave || subleaf >= 63 )
-            goto unsupported;
-        switch ( subleaf )
-        {
-        case 0:
-        {
-            uint64_t xfeature_mask = XSTATE_FP_SSE;
-            uint32_t xstate_size = XSTATE_AREA_MIN_SIZE;
-
-            if ( p->basic.avx )
-            {
-                xfeature_mask |= XSTATE_YMM;
-                xstate_size = (xstate_offsets[_XSTATE_YMM] +
-                               xstate_sizes[_XSTATE_YMM]);
-            }
-
-            if ( p->feat.avx512f )
-            {
-                xfeature_mask |= XSTATE_OPMASK | XSTATE_ZMM | XSTATE_HI_ZMM;
-                xstate_size = max(xstate_size,
-                                  xstate_offsets[_XSTATE_OPMASK] +
-                                  xstate_sizes[_XSTATE_OPMASK]);
-                xstate_size = max(xstate_size,
-                                  xstate_offsets[_XSTATE_ZMM] +
-                                  xstate_sizes[_XSTATE_ZMM]);
-                xstate_size = max(xstate_size,
-                                  xstate_offsets[_XSTATE_HI_ZMM] +
-                                  xstate_sizes[_XSTATE_HI_ZMM]);
-            }
-
-            a = (uint32_t)xfeature_mask;
-            d = (uint32_t)(xfeature_mask >> 32);
-            c = xstate_size;
-
-            /*
-             * Always read CPUID.0xD[ECX=0].EBX from hardware, rather than
-             * domain policy.  It varies with enabled xstate, and the correct
-             * xcr0 is in context.
-             */
-            cpuid_count(leaf, subleaf, &tmp, &b, &tmp, &tmp);
-            break;
-        }
-
-        case 1:
-            a = p->xstate.Da1;
-            b = c = d = 0;
-            break;
-        }
-        break;
-
-    case 0x80000001:
-        c = p->extd.e1c;
-        d = p->extd.e1d;
-
-        /* If not emulating AMD, clear the duplicated features in e1d. */
-        if ( currd->arch.x86_vendor != X86_VENDOR_AMD )
-            d &= ~CPUID_COMMON_1D_FEATURES;
-
-        /*
-         * MTRR used to unconditionally leak into PV guests.  They cannot MTRR
-         * infrastructure at all, and shouldn't be able to see the feature.
-         *
-         * Modern PVOPS Linux self-clobbers the MTRR feature, to avoid trying
-         * to use the associated MSRs.  Xenolinux-based PV dom0's however use
-         * the MTRR feature as an indication of the presence of the
-         * XENPF_{add,del,read}_memtype hypercalls.
-         */
-        if ( is_hardware_domain(currd) && guest_kernel_mode(curr, regs) &&
-             cpu_has_mtrr )
-            d |= cpufeat_mask(X86_FEATURE_MTRR);
-        break;
-
-    case 0x80000007:
-        d = p->extd.e7d;
-        break;
-
-    case 0x80000008:
-        a = paddr_bits | (vaddr_bits << 8);
-        b = p->extd.e8b;
-        break;
-
-    case 0x00000005: /* MONITOR/MWAIT */
-    case 0x0000000b: /* Extended Topology Enumeration */
-    case 0x8000000a: /* SVM revision and features */
-    case 0x8000001b: /* Instruction Based Sampling */
-    case 0x8000001c: /* Light Weight Profiling */
-    case 0x8000001e: /* Extended topology reporting */
-    unsupported:
-        a = b = c = d = 0;
-        break;
-
-    case 0x7:
-        ASSERT_UNREACHABLE();
-        /* Now handled in guest_cpuid(). */
-    }
-
-    regs->rax = a;
-    regs->rbx = b;
-    regs->rcx = c;
-    regs->rdx = d;
-}
-
 static int emulate_invalid_rdtscp(struct cpu_user_regs *regs)
 {
     char opcode[3];
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index c1b07b7..a8f9824 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -392,8 +392,6 @@ bool hvm_set_guest_bndcfgs(struct vcpu *v, u64 val);
 #define has_viridian_apic_assist(d) \
     (is_viridian_domain(d) && (viridian_feature_mask(d) & HVMPV_apic_assist))
 
-void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
-                                   unsigned int *ecx, unsigned int *edx);
 bool hvm_check_cpuid_faulting(struct vcpu *v);
 void hvm_migrate_timers(struct vcpu *v);
 void hvm_do_resume(struct vcpu *v);
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index e43b956..3bb93a2 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -627,8 +627,6 @@ enum get_cpu_vendor {
 int get_cpu_vendor(uint32_t b, uint32_t c, uint32_t d, enum get_cpu_vendor mode);
 uint8_t get_cpu_family(uint32_t raw, uint8_t *model, uint8_t *stepping);
 
-void pv_cpuid(struct cpu_user_regs *regs);
-
 #endif /* !__ASSEMBLY__ */
 
 #endif /* __ASM_X86_PROCESSOR_H */
-- 
2.1.4


* [PATCH 27/27] x86/cpuid: Alter the legacy-path prototypes to match guest_cpuid()
  2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
                   ` (25 preceding siblings ...)
  2017-01-04 12:39 ` [PATCH 26/27] x86/cpuid: Effectively remove pv_cpuid() and hvm_cpuid() Andrew Cooper
@ 2017-01-04 12:39 ` Andrew Cooper
  2017-01-05 14:19   ` Jan Beulich
  26 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 12:39 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

This allows the compiler to have a far easier time inlining the legacy paths
into guest_cpuid(), and avoids the need to have a full struct cpu_user_regs in
the guest_cpuid() stack frame.
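
In sketch form, the shape of the prototype change (simplified stand-ins and
dummy bodies, not the real leaf handling):

    /* Illustrative sketch only. */
    #include <stdint.h>
    #include <stdio.h>

    struct cpuid_leaf { uint32_t a, b, c, d; };
    struct cpu_user_regs { uint32_t _eax, _ebx, _ecx, _edx; };

    /* Before: the legacy helper decoded and wrote back the registers itself. */
    static void pv_cpuid_old(struct cpu_user_regs *regs)
    {
        uint32_t leaf = regs->_eax, subleaf = regs->_ecx;

        /* ... per-leaf handling ... */
        regs->_eax = leaf + 1;       /* dummy result */
        regs->_ebx = regs->_edx = 0;
        regs->_ecx = subleaf;
    }

    /* After: same shape as guest_cpuid(), so it inlines cleanly and the
     * caller no longer needs a full struct cpu_user_regs on its stack. */
    static void pv_cpuid_new(uint32_t leaf, uint32_t subleaf,
                             struct cpuid_leaf *res)
    {
        /* ... per-leaf handling ... */
        res->a = leaf + 1;           /* dummy result */
        res->b = res->d = 0;
        res->c = subleaf;
    }

    int main(void)
    {
        struct cpu_user_regs regs = { ._eax = 7, ._ecx = 0 };
        struct cpuid_leaf res;

        pv_cpuid_old(&regs);
        pv_cpuid_new(7, 0, &res);
        printf("%x %x\n", (unsigned)regs._eax, (unsigned)res.a);
        return 0;
    }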

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>

Here and elsewhere, it becomes very obvious that the PVH path using
pv_cpuid() is broken, as the guest_kernel_mode() check using
guest_cpu_user_regs() is erroneous.  I am tempted to just switch PVH onto the
HVM path, which won't make it any more broken than it currently is.
---
 xen/arch/x86/cpuid.c | 200 +++++++++++++++++++++------------------------------
 1 file changed, 81 insertions(+), 119 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 5e7e8cc..a75ef28 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -337,30 +337,26 @@ int init_domain_cpuid_policy(struct domain *d)
     return 0;
 }
 
-static void pv_cpuid(struct cpu_user_regs *regs)
+static void pv_cpuid(unsigned int leaf, unsigned int subleaf,
+                     struct cpuid_leaf *res)
 {
-    uint32_t leaf, subleaf, a, b, c, d;
+    const struct cpu_user_regs *regs = guest_cpu_user_regs();
     struct vcpu *curr = current;
     struct domain *currd = curr->domain;
     const struct cpuid_policy *p = currd->arch.cpuid;
 
-    leaf = a = regs->_eax;
-    b = regs->_ebx;
-    subleaf = c = regs->_ecx;
-    d = regs->_edx;
-
     if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
-        domain_cpuid(currd, leaf, subleaf, &a, &b, &c, &d);
+        domain_cpuid(currd, leaf, subleaf, &res->a, &res->b, &res->c, &res->d);
     else
-        cpuid_count(leaf, subleaf, &a, &b, &c, &d);
+        cpuid_count(leaf, subleaf, &res->a, &res->b, &res->c, &res->d);
 
     switch ( leaf )
     {
         uint32_t tmp;
 
     case 0x00000001:
-        c = p->basic._1c;
-        d = p->basic._1d;
+        res->c = p->basic._1c;
+        res->d = p->basic._1d;
 
         if ( !is_pvh_domain(currd) )
         {
@@ -424,7 +420,7 @@ static void pv_cpuid(struct cpu_user_regs *regs)
                  (regs->entry_vector == TRAP_invalid_op &&
                   guest_kernel_mode(curr, regs) &&
                   (read_cr4() & X86_CR4_OSXSAVE)) )
-                c |= cpufeat_mask(X86_FEATURE_OSXSAVE);
+                res->c |= cpufeat_mask(X86_FEATURE_OSXSAVE);
 
             /*
              * At the time of writing, a PV domain is the only viable option
@@ -448,7 +444,7 @@ static void pv_cpuid(struct cpu_user_regs *regs)
                  * of the XENPF_{add,del,read}_memtype hypercalls.
                  */
                 if ( cpu_has_mtrr )
-                    d |= cpufeat_mask(X86_FEATURE_MTRR);
+                    res->d |= cpufeat_mask(X86_FEATURE_MTRR);
 
                 /*
                  * MONITOR never leaked into PV guests, as PV guests cannot
@@ -471,7 +467,7 @@ static void pv_cpuid(struct cpu_user_regs *regs)
                  * fault is currently being serviced.  Yuck...
                  */
                 if ( cpu_has_monitor && regs->entry_vector == TRAP_gp_fault )
-                    c |= cpufeat_mask(X86_FEATURE_MONITOR);
+                    res->c |= cpufeat_mask(X86_FEATURE_MONITOR);
 
                 /*
                  * While MONITOR never leaked into PV guests, EIST always used
@@ -482,18 +478,18 @@ static void pv_cpuid(struct cpu_user_regs *regs)
                  * CPUID information.
                  */
                 if ( cpu_has_eist )
-                    c |= cpufeat_mask(X86_FEATURE_EIST);
+                    res->c |= cpufeat_mask(X86_FEATURE_EIST);
             }
         }
 
         if ( vpmu_enabled(curr) &&
              vpmu_is_set(vcpu_vpmu(curr), VPMU_CPU_HAS_DS) )
         {
-            d |= cpufeat_mask(X86_FEATURE_DS);
+            res->d |= cpufeat_mask(X86_FEATURE_DS);
             if ( cpu_has(&current_cpu_data, X86_FEATURE_DTES64) )
-                c |= cpufeat_mask(X86_FEATURE_DTES64);
+                res->c |= cpufeat_mask(X86_FEATURE_DTES64);
             if ( cpu_has(&current_cpu_data, X86_FEATURE_DSCPL) )
-                c |= cpufeat_mask(X86_FEATURE_DSCPL);
+                res->c |= cpufeat_mask(X86_FEATURE_DSCPL);
         }
         break;
 
@@ -503,8 +499,8 @@ static void pv_cpuid(struct cpu_user_regs *regs)
             goto unsupported;
 
         /* Report at most version 3 since that's all we currently emulate. */
-        if ( (a & 0xff) > 3 )
-            a = (a & ~0xff) | 3;
+        if ( (res->a & 0xff) > 3 )
+            res->a = (res->a & ~0xff) | 3;
         break;
 
     case XSTATE_CPUID:
@@ -538,33 +534,33 @@ static void pv_cpuid(struct cpu_user_regs *regs)
                                   xstate_sizes[_XSTATE_HI_ZMM]);
             }
 
-            a = (uint32_t)xfeature_mask;
-            d = (uint32_t)(xfeature_mask >> 32);
-            c = xstate_size;
+            res->a = (uint32_t)xfeature_mask;
+            res->d = (uint32_t)(xfeature_mask >> 32);
+            res->c = xstate_size;
 
             /*
              * Always read CPUID.0xD[ECX=0].EBX from hardware, rather than
              * domain policy.  It varies with enabled xstate, and the correct
              * xcr0 is in context.
              */
-            cpuid_count(leaf, subleaf, &tmp, &b, &tmp, &tmp);
+            cpuid_count(leaf, subleaf, &tmp, &res->b, &tmp, &tmp);
             break;
         }
 
         case 1:
-            a = p->xstate.Da1;
-            b = c = d = 0;
+            res->a = p->xstate.Da1;
+            res->b = res->c = res->d = 0;
             break;
         }
         break;
 
     case 0x80000001:
-        c = p->extd.e1c;
-        d = p->extd.e1d;
+        res->c = p->extd.e1c;
+        res->d = p->extd.e1d;
 
         /* If not emulating AMD, clear the duplicated features in e1d. */
         if ( currd->arch.x86_vendor != X86_VENDOR_AMD )
-            d &= ~CPUID_COMMON_1D_FEATURES;
+            res->d &= ~CPUID_COMMON_1D_FEATURES;
 
         /*
          * MTRR used to unconditionally leak into PV guests.  They cannot MTRR
@@ -577,16 +573,16 @@ static void pv_cpuid(struct cpu_user_regs *regs)
          */
         if ( is_hardware_domain(currd) && guest_kernel_mode(curr, regs) &&
              cpu_has_mtrr )
-            d |= cpufeat_mask(X86_FEATURE_MTRR);
+            res->d |= cpufeat_mask(X86_FEATURE_MTRR);
         break;
 
     case 0x80000007:
-        d = p->extd.e7d;
+        res->d = p->extd.e7d;
         break;
 
     case 0x80000008:
-        a = paddr_bits | (vaddr_bits << 8);
-        b = p->extd.e8b;
+        res->a = paddr_bits | (vaddr_bits << 8);
+        res->b = p->extd.e8b;
         break;
 
     case 0x00000005: /* MONITOR/MWAIT */
@@ -596,57 +592,43 @@ static void pv_cpuid(struct cpu_user_regs *regs)
     case 0x8000001c: /* Light Weight Profiling */
     case 0x8000001e: /* Extended topology reporting */
     unsupported:
-        a = b = c = d = 0;
+        *res = EMPTY_LEAF;
         break;
 
     case 0x7:
         ASSERT_UNREACHABLE();
         /* Now handled in guest_cpuid(). */
     }
-
-    regs->rax = a;
-    regs->rbx = b;
-    regs->rcx = c;
-    regs->rdx = d;
 }
 
-static void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
-                      unsigned int *ecx, unsigned int *edx)
+static void hvm_cpuid(unsigned int leaf, unsigned int subleaf,
+                      struct cpuid_leaf *res)
 {
     struct vcpu *v = current;
     struct domain *d = v->domain;
     const struct cpuid_policy *p = d->arch.cpuid;
-    unsigned int count, dummy = 0;
-
-    if ( !eax )
-        eax = &dummy;
-    if ( !ebx )
-        ebx = &dummy;
-    if ( !ecx )
-        ecx = &dummy;
-    count = *ecx;
-    if ( !edx )
-        edx = &dummy;
 
-    domain_cpuid(d, input, count, eax, ebx, ecx, edx);
+    domain_cpuid(d, leaf, subleaf, &res->a, &res->b, &res->c, &res->d);
 
-    switch ( input )
+    switch ( leaf )
     {
+        unsigned int tmp;
+
     case 0x1:
         /* Fix up VLAPIC details. */
-        *ebx &= 0x00FFFFFFu;
-        *ebx |= (v->vcpu_id * 2) << 24;
+        res->b &= 0x00FFFFFFu;
+        res->b |= (v->vcpu_id * 2) << 24;
 
-        *ecx = p->basic._1c;
-        *edx = p->basic._1d;
+        res->c = p->basic._1c;
+        res->d = p->basic._1d;
 
         /* APIC exposed to guests, but Fast-forward MSR_APIC_BASE.EN back in. */
         if ( vlapic_hw_disabled(vcpu_vlapic(v)) )
-            *edx &= ~cpufeat_bit(X86_FEATURE_APIC);
+            res->d &= ~cpufeat_bit(X86_FEATURE_APIC);
 
         /* OSXSAVE clear in policy.  Fast-forward CR4 back in. */
         if ( v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE )
-            *ecx |= cpufeat_mask(X86_FEATURE_OSXSAVE);
+            res->c |= cpufeat_mask(X86_FEATURE_OSXSAVE);
 
         /*
          * PSE36 is not supported in shadow mode.  This bit should be
@@ -663,32 +645,32 @@ static void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
          * PSE36 paging.
          */
         if ( !hap_enabled(d) && !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
-            *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
+            res->d &= ~cpufeat_mask(X86_FEATURE_PSE36);
 
         if ( vpmu_enabled(v) &&
              vpmu_is_set(vcpu_vpmu(v), VPMU_CPU_HAS_DS) )
         {
-            *edx |= cpufeat_mask(X86_FEATURE_DS);
+            res->d |= cpufeat_mask(X86_FEATURE_DS);
             if ( cpu_has(&current_cpu_data, X86_FEATURE_DTES64) )
-                *ecx |= cpufeat_mask(X86_FEATURE_DTES64);
+                res->c |= cpufeat_mask(X86_FEATURE_DTES64);
             if ( cpu_has(&current_cpu_data, X86_FEATURE_DSCPL) )
-                *ecx |= cpufeat_mask(X86_FEATURE_DSCPL);
+                res->c |= cpufeat_mask(X86_FEATURE_DSCPL);
         }
 
         break;
 
     case 0xb:
         /* Fix the x2APIC identifier. */
-        *edx = v->vcpu_id * 2;
+        res->d = v->vcpu_id * 2;
         break;
 
     case XSTATE_CPUID:
-        if ( !p->basic.xsave || count >= 63 )
+        if ( !p->basic.xsave || subleaf >= 63 )
         {
-            *eax = *ebx = *ecx = *edx = 0;
+            *res = EMPTY_LEAF;
             break;
         }
-        switch ( count )
+        switch ( subleaf )
         {
         case 0:
         {
@@ -741,21 +723,21 @@ static void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
                                   xstate_sizes[_XSTATE_LWP]);
             }
 
-            *eax = (uint32_t)xfeature_mask;
-            *edx = (uint32_t)(xfeature_mask >> 32);
-            *ecx = xstate_size;
+            res->a = (uint32_t)xfeature_mask;
+            res->d = (uint32_t)(xfeature_mask >> 32);
+            res->c = xstate_size;
 
             /*
              * Always read CPUID[0xD,0].EBX from hardware, rather than domain
              * policy.  It varies with enabled xstate, and the correct xcr0 is
              * in context.
              */
-            cpuid_count(input, count, &dummy, ebx, &dummy, &dummy);
+            cpuid_count(leaf, subleaf, &tmp, &res->b, &tmp, &tmp);
             break;
         }
 
         case 1:
-            *eax = p->xstate.Da1;
+            res->a = p->xstate.Da1;
 
             if ( p->xstate.xsaves )
             {
@@ -764,12 +746,12 @@ static void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
                  * domain policy.  It varies with enabled xstate, and the
                  * correct xcr0/xss are in context.
                  */
-                cpuid_count(input, count, &dummy, ebx, &dummy, &dummy);
+                cpuid_count(leaf, subleaf, &tmp, &res->b, &tmp, &tmp);
             }
             else
-                *ebx = 0;
+                res->b = 0;
 
-            *ecx = *edx = 0;
+            res->c = res->d = 0;
             break;
         }
         break;
@@ -777,25 +759,25 @@ static void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
     case 0x0000000a: /* Architectural Performance Monitor Features (Intel) */
         if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL || !vpmu_enabled(v) )
         {
-            *eax = *ebx = *ecx = *edx = 0;
+            *res = EMPTY_LEAF;
             break;
         }
 
         /* Report at most version 3 since that's all we currently emulate */
-        if ( (*eax & 0xff) > 3 )
-            *eax = (*eax & ~0xff) | 3;
+        if ( (res->a & 0xff) > 3 )
+            res->a = (res->a & ~0xff) | 3;
         break;
 
     case 0x80000001:
-        *ecx = p->extd.e1c;
-        *edx = p->extd.e1d;
+        res->c = p->extd.e1c;
+        res->d = p->extd.e1d;
 
         /* If not emulating AMD, clear the duplicated features in e1d. */
         if ( d->arch.x86_vendor != X86_VENDOR_AMD )
-            *edx &= ~CPUID_COMMON_1D_FEATURES;
+            res->d &= ~CPUID_COMMON_1D_FEATURES;
         /* fast-forward MSR_APIC_BASE.EN if it hasn't already been clobbered. */
         else if ( vlapic_hw_disabled(vcpu_vlapic(v)) )
-            *edx &= ~cpufeat_bit(X86_FEATURE_APIC);
+            res->d &= ~cpufeat_bit(X86_FEATURE_APIC);
 
         /*
          * PSE36 is not supported in shadow mode.  This bit should be
@@ -812,46 +794,46 @@ static void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
          * PSE36 paging.
          */
         if ( !hap_enabled(d) && !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
-            *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
+            res->d &= ~cpufeat_mask(X86_FEATURE_PSE36);
 
         /* SYSCALL is hidden outside of long mode on Intel. */
         if ( d->arch.x86_vendor == X86_VENDOR_INTEL &&
              !hvm_long_mode_enabled(v))
-            *edx &= ~cpufeat_mask(X86_FEATURE_SYSCALL);
+            res->d &= ~cpufeat_mask(X86_FEATURE_SYSCALL);
 
         break;
 
     case 0x80000007:
-        *edx = p->extd.e7d;
+        res->d = p->extd.e7d;
         break;
 
     case 0x80000008:
-        *eax &= 0xff;
-        count = d->arch.paging.gfn_bits + PAGE_SHIFT;
-        if ( *eax > count )
-            *eax = count;
+        res->a &= 0xff;
+        tmp = d->arch.paging.gfn_bits + PAGE_SHIFT;
+        if ( res->a > tmp )
+            res->a = tmp;
 
-        count = (p->basic.pae || p->basic.pse36) ? 36 : 32;
-        if ( *eax < count )
-            *eax = count;
+        tmp = (p->basic.pae || p->basic.pse36) ? 36 : 32;
+        if ( res->a < tmp )
+            res->a = tmp;
 
-        *eax |= (p->extd.lm ? vaddr_bits : 32) << 8;
+        res->a |= (p->extd.lm ? vaddr_bits : 32) << 8;
 
-        *ebx = p->extd.e8b;
+        res->b = p->extd.e8b;
         break;
 
     case 0x8000001c:
         if ( !cpu_has_svm )
         {
-            *eax = *ebx = *ecx = *edx = 0;
+            *res = EMPTY_LEAF;
             break;
         }
 
         if ( cpu_has_lwp && (v->arch.xcr0 & XSTATE_LWP) )
             /* Turn on available bit and other features specified in lwp_cfg. */
-            *eax = (*edx & v->arch.hvm_svm.guest_lwp_cfg) | 1;
+            res->a = (res->d & v->arch.hvm_svm.guest_lwp_cfg) | 1;
         else
-            *eax = 0;
+            res->a = 0;
         break;
 
     case 0x7:
@@ -945,27 +927,7 @@ void guest_cpuid(const struct vcpu *v, unsigned int leaf,
  legacy:
     /* {pv,hvm}_cpuid() have this expectation. */
     ASSERT(v == curr);
-
-    if ( is_pv_vcpu(v) || is_pvh_vcpu(v) )
-    {
-        struct cpu_user_regs regs = *guest_cpu_user_regs();
-
-        regs.rax = leaf;
-        regs.rcx = subleaf;
-
-        pv_cpuid(&regs);
-
-        res->a = regs._eax;
-        res->b = regs._ebx;
-        res->c = regs._ecx;
-        res->d = regs._edx;
-    }
-    else
-    {
-        res->c = subleaf;
-
-        hvm_cpuid(leaf, &res->a, &res->b, &res->c, &res->d);
-    }
+    (is_pv_vcpu(v) || is_pvh_vcpu(v) ? pv_cpuid : hvm_cpuid)(leaf, subleaf, res);
 }
 
 static void __init __maybe_unused build_assertions(void)
-- 
2.1.4



^ permalink raw reply related	[flat|nested] 93+ messages in thread

* Re: [PATCH 01/27] x86/cpuid: Untangle the <asm/cpufeature.h> include hierachy
  2017-01-04 12:39 ` [PATCH 01/27] x86/cpuid: Untangle the <asm/cpufeature.h> include hierachy Andrew Cooper
@ 2017-01-04 13:39   ` Jan Beulich
  0 siblings, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2017-01-04 13:39 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> The use of X86_FEATURES_ONLY was short-lived in Linux because of the same
> problem encountered here.  The following series needs to add extra includes
> to asm/cpuid.h, which breaks the build elsewhere given the current hierarchy.
> 
> Move the feature definitions into a separate header file, which also matches
> the solution Linux used.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 02/27] x86/cpuid: Introduce guest_cpuid() and struct cpuid_leaf
  2017-01-04 12:39 ` [PATCH 02/27] x86/cpuid: Introduce guest_cpuid() and struct cpuid_leaf Andrew Cooper
@ 2017-01-04 14:01   ` Jan Beulich
  2017-01-04 14:47     ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-04 14:01 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Kevin Tian, Suravee Suthikulpanit, Xen-devel, Paul Durrant,
	Jun Nakajima, Boris Ostrovsky

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> Jan: I note that this patch textually conflicts with your register renaming
> series.

And it conflicts, not just textually, with the SSE/AVX moves series still
awaiting your review.

> @@ -17,6 +18,8 @@ uint32_t __read_mostly raw_featureset[FSCAPINTS];
>  uint32_t __read_mostly pv_featureset[FSCAPINTS];
>  uint32_t __read_mostly hvm_featureset[FSCAPINTS];
>  
> +#define EMPTY_LEAF (struct cpuid_leaf){}

Perhaps another pair of parentheses around the entire thing?
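
For illustration, the suggestion amounts to something like

    #define EMPTY_LEAF ((struct cpuid_leaf){})

so that the compound literal expands safely in any expression context.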

> @@ -215,6 +218,36 @@ const uint32_t * __init lookup_deep_deps(uint32_t feature)
>      return NULL;
>  }
>  
> +void guest_cpuid(const struct vcpu *v, unsigned int leaf,
> +                 unsigned int subleaf, struct cpuid_leaf *res)
> +{
> +    *res = EMPTY_LEAF;

Why? There's no valid path leaving the structure uninitialized.

> +    /* {pv,hvm}_cpuid() have this expectation. */
> +    ASSERT(v == current);
> +
> +    if ( is_pv_vcpu(v) || is_pvh_vcpu(v) )
> +    {
> +        struct cpu_user_regs regs = *guest_cpu_user_regs();

I assume this is only a transient thing, in which case I'm fine with
this relatively big item getting placed on the stack.

> +        regs.rax = leaf;
> +        regs.rcx = subleaf;

DYM _eax/_ecx respectively? The upper halves are of no interest.

> @@ -3246,10 +3252,10 @@ static int priv_op_wbinvd(struct x86_emulate_ctxt *ctxt)
>      return X86EMUL_OKAY;
>  }
>  
> -int pv_emul_cpuid(unsigned int *eax, unsigned int *ebx, unsigned int *ecx,
> -                  unsigned int *edx, struct x86_emulate_ctxt *ctxt)
> +int pv_emul_cpuid(unsigned int leaf, unsigned int subleaf,
> +                  struct cpuid_leaf *res, struct x86_emulate_ctxt *ctxt)
>  {
> -    struct cpu_user_regs regs = *ctxt->regs;
> +    struct vcpu *curr = current;
>  
>      /*
>       * x86_emulate uses this function to query CPU features for its own
> @@ -3258,7 +3264,6 @@ int pv_emul_cpuid(unsigned int *eax, unsigned int *ebx, unsigned int *ecx,
>       */
>      if ( ctxt->opcode == X86EMUL_OPC(0x0f, 0xa2) )
>      {
> -        const struct vcpu *curr = current;

There was a "const" here - did you really mean to get rid of it?

> @@ -3266,16 +3271,7 @@ int pv_emul_cpuid(unsigned int *eax, unsigned int *ebx, unsigned int *ecx,
>              return X86EMUL_EXCEPTION;
>      }
>  
> -    regs._eax = *eax;
> -    regs._ecx = *ecx;
> -
> -    pv_cpuid(&regs);
> -
> -    *eax = regs._eax;
> -    *ebx = regs._ebx;
> -    *ecx = regs._ecx;
> -    *edx = regs._edx;
> -
> +    guest_cpuid(curr, leaf, subleaf, res);
>      return X86EMUL_OKAY;
>  }

Please retain the blank line before the return statement.

> @@ -4502,15 +4502,15 @@ x86_emulate(
>  
>          case 0xfc: /* clzero */
>          {
> -            unsigned int eax = 1, ebx = 0, dummy = 0;
> +            struct cpuid_leaf res;

Please put a single instance of this at the top of the body of the giant
switch() statement (likely calling for it to be named other than "res").

> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -164,6 +164,11 @@ enum x86_emulate_fpu_type {
>      X86EMUL_FPU_ymm  /* AVX/XOP instruction set (%ymm0-%ymm7/15) */
>  };
>  
> +struct cpuid_leaf
> +{
> +    uint32_t a, b, c, d;

Could you please consistently use uint32_t or unsigned int between
here and ...

> @@ -415,10 +420,9 @@ struct x86_emulate_ops
>       * #GP[0].  Used to implement CPUID faulting.
>       */
>      int (*cpuid)(
> -        unsigned int *eax,
> -        unsigned int *ebx,
> -        unsigned int *ecx,
> -        unsigned int *edx,
> +        unsigned int leaf,
> +        unsigned int subleaf,
> +        struct cpuid_leaf *res,

... here? I have no particular preference which of the two to use.

> @@ -64,6 +65,9 @@ extern struct cpuidmasks cpuidmask_defaults;
>  /* Whether or not cpuid faulting is available for the current domain. */
>  DECLARE_PER_CPU(bool, cpuid_faulting_enabled);
>  
> +void guest_cpuid(const struct vcpu *v, unsigned int leaf,
> +                 unsigned int subleaf, struct cpuid_leaf *res);

Same for this one then, obviously (and a few others).

Jan


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 03/27] x86/cpuid: Introduce struct cpuid_policy
  2017-01-04 12:39 ` [PATCH 03/27] x86/cpuid: Introduce struct cpuid_policy Andrew Cooper
@ 2017-01-04 14:22   ` Jan Beulich
  2017-01-04 15:05     ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-04 14:22 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> struct cpuid_policy will eventually be a complete replacement for the cpuids[]
> array, with a fixed layout and named fields to allow O(1) access to specific
> information.
> 
> For now, the CPUID content is capped at the 0xd and 0x8000001c leaves, which
> matches the maximum policy that the toolstack will generate for a domain.

Especially (but not only) leaf 0x17 and extended leaf 0x8000001e
make me wonder whether this is a good starting point.

> @@ -67,6 +80,55 @@ static void __init sanitise_featureset(uint32_t *fs)
>                            (fs[FEATURESET_e1d] & ~CPUID_COMMON_1D_FEATURES));
>  }
>  
> +static void __init calculate_raw_policy(void)
> +{
> +    struct cpuid_policy *p = &raw_policy;
> +    unsigned int i;
> +
> +    cpuid_leaf(0, &p->basic.raw[0]);
> +    for ( i = 1; i < min(ARRAY_SIZE(p->basic.raw),
> +                         p->basic.max_leaf + 1ul); ++i )
> +    {
> +        /* Collected later. */
> +        if ( i == 0x7 || i == 0xd )
> +            continue;
> +
> +        cpuid_leaf(i, &p->basic.raw[i]);

> Leaves 2, 4, 0xb, and 0xf are, iirc, multiple-invocation ones too.
There should at least be a comment here clarifying why they don't
need treatment similar to 7 and 0xd.

> +    }
> +
> +    if ( p->basic.max_leaf >= 7 )
> +    {
> +        cpuid_count_leaf(7, 0, &p->feat.raw[0]);
> +
> +        for ( i = 1; i < min(ARRAY_SIZE(p->feat.raw),
> +                             p->feat.max_subleaf + 1ul); ++i )
> +            cpuid_count_leaf(7, i, &p->feat.raw[i]);
> +    }
> +
> +    if ( p->basic.max_leaf >= 0xd )
> +    {
> +        uint64_t xstates;
> +
> +        cpuid_count_leaf(0xd, 0, &p->xstate.raw[0]);
> +        cpuid_count_leaf(0xd, 1, &p->xstate.raw[1]);
> +
> +        xstates = ((uint64_t)(p->xstate.xcr0_high | p->xstate.xss_high) << 32) |
> +            (p->xstate.xcr0_low | p->xstate.xss_low);
> +
> +        for ( i = 2; i < 63; ++i )
> +        {
> +            if ( xstates & (1u << i) )

1ull
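
The point being that "1u << i" is a 32-bit shift, which is undefined
behaviour once i reaches 32, while i runs up to 62 here.  A sketch of the
corrected test:

    if ( xstates & (1ull << i) )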

> @@ -65,6 +66,78 @@ extern struct cpuidmasks cpuidmask_defaults;
>  /* Whether or not cpuid faulting is available for the current domain. */
>  DECLARE_PER_CPU(bool, cpuid_faulting_enabled);
>  
> +#define CPUID_GUEST_NR_BASIC      (0xdu + 1)
> +#define CPUID_GUEST_NR_FEAT       (0u + 1)
> +#define CPUID_GUEST_NR_XSTATE     (62u + 1)
> +#define CPUID_GUEST_NR_EXTD_INTEL (0x8u + 1)
> +#define CPUID_GUEST_NR_EXTD_AMD   (0x1cu + 1)
> +#define CPUID_GUEST_NR_EXTD       MAX(CPUID_GUEST_NR_EXTD_INTEL, \
> +                                      CPUID_GUEST_NR_EXTD_AMD)
> +
> +struct cpuid_policy
> +{
> +    /*
> +     * WARNING: During the CPUID transition period, not all information here
> +     * is accurate.  The following items are accurate, and can be relied upon.
> +     *
> +     * Global *_policy objects:
> +     *
> +     * - Host accurate:
> +     *   - max_{,sub}leaf
> +     *   - {xcr0,xss}_{high,low}
> +     *
> +     * - Guest appropriate:
> +     *   - Nothing

I don't understand the meaning of the "accurate" above and
the "appropriate" here.

> +     *
> +     * Everything else should be considered inaccurate, and not necessarily 0.
> +     */
> +
> +    /* Basic leaves: 0x000000xx */
> +    union {
> +        struct cpuid_leaf raw[CPUID_GUEST_NR_BASIC];
> +        struct {
> +            /* Leaf 0x0 - Max and vendor. */
> +            struct {
> +                uint32_t max_leaf, :32, :32, :32;

These unnamed bitfields are here solely for the BUILD_BUG_ON()s?
Wouldn't it make sense to omit them here, and use > instead of !=
there? Also is there really value in nesting unnamed structures like
this?

Jan



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 04/27] x86/cpuid: Move featuresets into struct cpuid_policy
  2017-01-04 12:39 ` [PATCH 04/27] x86/cpuid: Move featuresets into " Andrew Cooper
@ 2017-01-04 14:35   ` Jan Beulich
  2017-01-04 15:10     ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-04 14:35 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> +static void __init calculate_host_policy(void)
>  {
> -    unsigned int max, tmp;
> -
> -    max = cpuid_eax(0);
> -
> -    if ( max >= 1 )
> -        cpuid(0x1, &tmp, &tmp,
> -              &raw_featureset[FEATURESET_1c],
> -              &raw_featureset[FEATURESET_1d]);
> -    if ( max >= 7 )
> -        cpuid_count(0x7, 0, &tmp,
> -                    &raw_featureset[FEATURESET_7b0],
> -                    &raw_featureset[FEATURESET_7c0],
> -                    &raw_featureset[FEATURESET_7d0]);
> -    if ( max >= 0xd )
> -        cpuid_count(0xd, 1,
> -                    &raw_featureset[FEATURESET_Da1],
> -                    &tmp, &tmp, &tmp);
> -
> -    max = cpuid_eax(0x80000000);
> -    if ( (max >> 16) != 0x8000 )
> -        return;
> +    struct cpuid_policy *p = &host_policy;
>  
> -    if ( max >= 0x80000001 )
> -        cpuid(0x80000001, &tmp, &tmp,
> -              &raw_featureset[FEATURESET_e1c],
> -              &raw_featureset[FEATURESET_e1d]);
> -    if ( max >= 0x80000007 )
> -        cpuid(0x80000007, &tmp, &tmp, &tmp,
> -              &raw_featureset[FEATURESET_e7d]);
> -    if ( max >= 0x80000008 )
> -        cpuid(0x80000008, &tmp,
> -              &raw_featureset[FEATURESET_e8b],
> -              &tmp, &tmp);
> +    memcpy(p->fs, boot_cpu_data.x86_capability, sizeof(p->fs));

What are the plans for keeping this up-to-date wrt later
adjustments to boot_cpu_data.x86_capability? Wouldn't it be
better for the field to be a pointer, and the above to be a simple
assignment of &boot_cpu_data.x86_capability?
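
As an illustration only (assuming the field keeps its current name), the
pointer variant would look something like:

    /* In struct cpuid_policy, instead of "uint32_t fs[FSCAPINTS];": */
    const uint32_t *fs;

    /* ... and here, instead of the memcpy(): */
    p->fs = boot_cpu_data.x86_capability;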

> +static void __init calculate_pv_max_policy(void)
>  {
> +    struct cpuid_policy *p = &pv_max_policy;

I assume later patches will add further uses of this variable?
Otherwise ...

> @@ -185,10 +159,12 @@ static void __init calculate_pv_featureset(void)
>      __set_bit(X86_FEATURE_CMP_LEGACY, pv_featureset);
>  
>      sanitise_featureset(pv_featureset);
> +    cpuid_featureset_to_policy(pv_featureset, p);

... using &pv_max_policy directly here would seem more friendly
to readers.

Jan



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 05/27] x86/cpuid: Allocate a CPUID policy for every domain
  2017-01-04 12:39 ` [PATCH 05/27] x86/cpuid: Allocate a CPUID policy for every domain Andrew Cooper
@ 2017-01-04 14:40   ` Jan Beulich
  0 siblings, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2017-01-04 14:40 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/include/asm-x86/domain.h
> +++ b/xen/include/asm-x86/domain.h
> @@ -340,6 +340,9 @@ struct arch_domain
>      /* Is PHYSDEVOP_eoi to automatically unmask the event channel? */
>      bool_t auto_unmask;
>  
> +    /* CPUID Policy. */
> +    struct cpuid_policy *cpuid;
> +
>      /* Values snooped from updates to cpuids[] (below). */
>      u8 x86;                  /* CPU family */
>      u8 x86_vendor;           /* CPU vendor */

This is rather undesirable placement, producing 2 4-byte holes. With
it moved to a better spot
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 06/27] x86/domctl: Make XEN_DOMCTL_set_address_size singleshot
  2017-01-04 12:39 ` [PATCH 06/27] x86/domctl: Make XEN_DOMCTL_set_address_size singleshot Andrew Cooper
@ 2017-01-04 14:42   ` Jan Beulich
  0 siblings, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2017-01-04 14:42 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> Toolstacks (including some out-of-tree ones) use XEN_DOMCTL_set_address_size
> at most once per domain, and it ends up having a destructive effect on the
> available CPUID policy for a domain.
> 
> To avoid ordering issues between altering the policy via domctl, and the
> constructive effects which would have to happen from switching back to 
> native,
> explicitly reject this case.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 02/27] x86/cpuid: Introduce guest_cpuid() and struct cpuid_leaf
  2017-01-04 14:01   ` Jan Beulich
@ 2017-01-04 14:47     ` Andrew Cooper
  2017-01-04 15:49       ` Jan Beulich
  0 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 14:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, Suravee Suthikulpanit, Xen-devel, Paul Durrant,
	Jun Nakajima, Boris Ostrovsky

On 04/01/17 14:01, Jan Beulich wrote:
>
>> @@ -17,6 +18,8 @@ uint32_t __read_mostly raw_featureset[FSCAPINTS];
>>  uint32_t __read_mostly pv_featureset[FSCAPINTS];
>>  uint32_t __read_mostly hvm_featureset[FSCAPINTS];
>>  
>> +#define EMPTY_LEAF (struct cpuid_leaf){}
> Perhaps another pair of parentheses around the entire thing?

Can do.

>
>> @@ -215,6 +218,36 @@ const uint32_t * __init lookup_deep_deps(uint32_t feature)
>>      return NULL;
>>  }
>>  
>> +void guest_cpuid(const struct vcpu *v, unsigned int leaf,
>> +                 unsigned int subleaf, struct cpuid_leaf *res)
>> +{
>> +    *res = EMPTY_LEAF;
> Why? There's no valid path leaving the structure uninitialized.

Paths are introduced later in the series.

>
>> +    /* {pv,hvm}_cpuid() have this expectation. */
>> +    ASSERT(v == current);
>> +
>> +    if ( is_pv_vcpu(v) || is_pvh_vcpu(v) )
>> +    {
>> +        struct cpu_user_regs regs = *guest_cpu_user_regs();
> I assume this is only a transient thing, in which case I'm fine with
> this relatively big item getting placed on the stack.

Yes.  Removed in patch 27.

>
>> +        regs.rax = leaf;
>> +        regs.rcx = subleaf;
> DYM _eax/_ecx respectively? The upper halves are of no interest.

Mainly for my peace of mind.  This, like the above, is only transient.

>
>> @@ -3246,10 +3252,10 @@ static int priv_op_wbinvd(struct x86_emulate_ctxt *ctxt)
>>      return X86EMUL_OKAY;
>>  }
>>  
>> -int pv_emul_cpuid(unsigned int *eax, unsigned int *ebx, unsigned int *ecx,
>> -                  unsigned int *edx, struct x86_emulate_ctxt *ctxt)
>> +int pv_emul_cpuid(unsigned int leaf, unsigned int subleaf,
>> +                  struct cpuid_leaf *res, struct x86_emulate_ctxt *ctxt)
>>  {
>> -    struct cpu_user_regs regs = *ctxt->regs;
>> +    struct vcpu *curr = current;
>>  
>>      /*
>>       * x86_emulate uses this function to query CPU features for its own
>> @@ -3258,7 +3264,6 @@ int pv_emul_cpuid(unsigned int *eax, unsigned int *ebx, unsigned int *ecx,
>>       */
>>      if ( ctxt->opcode == X86EMUL_OPC(0x0f, 0xa2) )
>>      {
>> -        const struct vcpu *curr = current;
> There was a "const" here - did you really mean to get rid of it?

No.  Will fix.

>
>> @@ -3266,16 +3271,7 @@ int pv_emul_cpuid(unsigned int *eax, unsigned int *ebx, unsigned int *ecx,
>>              return X86EMUL_EXCEPTION;
>>      }
>>  
>> -    regs._eax = *eax;
>> -    regs._ecx = *ecx;
>> -
>> -    pv_cpuid(&regs);
>> -
>> -    *eax = regs._eax;
>> -    *ebx = regs._ebx;
>> -    *ecx = regs._ecx;
>> -    *edx = regs._edx;
>> -
>> +    guest_cpuid(curr, leaf, subleaf, res);
>>      return X86EMUL_OKAY;
>>  }
> Please retain the blank line before the return statement.

Ok.

>
>> @@ -4502,15 +4502,15 @@ x86_emulate(
>>  
>>          case 0xfc: /* clzero */
>>          {
>> -            unsigned int eax = 1, ebx = 0, dummy = 0;
>> +            struct cpuid_leaf res;
> Please put a single instance of this at the top of the body of the giant
> switch() statement (likely calling for it to be named other than "res").

struct cpuid_leaf cpuid_leaf?

I can't think of anything clearer.

>
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
>> @@ -164,6 +164,11 @@ enum x86_emulate_fpu_type {
>>      X86EMUL_FPU_ymm  /* AVX/XOP instruction set (%ymm0-%ymm7/15) */
>>  };
>>  
>> +struct cpuid_leaf
>> +{
>> +    uint32_t a, b, c, d;
> Could you please consistently use uint32_t or unsigned int between
> here and ...
>
>> @@ -415,10 +420,9 @@ struct x86_emulate_ops
>>       * #GP[0].  Used to implement CPUID faulting.
>>       */
>>      int (*cpuid)(
>> -        unsigned int *eax,
>> -        unsigned int *ebx,
>> -        unsigned int *ecx,
>> -        unsigned int *edx,
>> +        unsigned int leaf,
>> +        unsigned int subleaf,
>> +        struct cpuid_leaf *res,
> ... here? I have no particular preference which of the two to use.

Will use uint32_t.

~Andrew


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 16/27] x86/svm: Improvements using named features
  2017-01-04 12:39 ` [PATCH 16/27] x86/svm: Improvements " Andrew Cooper
@ 2017-01-04 14:52   ` Boris Ostrovsky
  2017-01-04 15:42     ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Boris Ostrovsky @ 2017-01-04 14:52 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Suravee Suthikulpanit, Jan Beulich

On 01/04/2017 07:39 AM, Andrew Cooper wrote:
> This avoids calling into hvm_cpuid() to obtain information which is directly
> available.  In particular, this avoids the need to overload flag_dr_dirty
> because of hvm_cpuid() being unavailable svm_save_dr()

"unavailabe in" (or from)

>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>  xen/arch/x86/hvm/svm/svm.c | 33 ++++++++-------------------------
>  1 file changed, 8 insertions(+), 25 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index de20f64..8f6737c 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -173,7 +173,7 @@ static void svm_save_dr(struct vcpu *v)
>      v->arch.hvm_vcpu.flag_dr_dirty = 0;
>      vmcb_set_dr_intercepts(vmcb, ~0u);
>  
> -    if ( flag_dr_dirty & 2 )
> +    if ( v->domain->arch.cpuid->extd.dbext )
>      {
>          svm_intercept_msr(v, MSR_AMD64_DR0_ADDRESS_MASK, MSR_INTERCEPT_RW);
>          svm_intercept_msr(v, MSR_AMD64_DR1_ADDRESS_MASK, MSR_INTERCEPT_RW);
> @@ -196,8 +196,6 @@ static void svm_save_dr(struct vcpu *v)
>  
>  static void __restore_debug_registers(struct vmcb_struct *vmcb, struct vcpu *v)
>  {
> -    unsigned int ecx;
> -
>      if ( v->arch.hvm_vcpu.flag_dr_dirty )
>          return;
>  
> @@ -205,8 +203,8 @@ static void __restore_debug_registers(struct vmcb_struct *vmcb, struct vcpu *v)
>      vmcb_set_dr_intercepts(vmcb, 0);
>  
>      ASSERT(v == current);
> -    hvm_cpuid(0x80000001, NULL, NULL, &ecx, NULL);
> -    if ( test_bit(X86_FEATURE_DBEXT & 31, &ecx) )
> +
> +    if ( v->domain->arch.cpuid->extd.dbext )
>      {
>          svm_intercept_msr(v, MSR_AMD64_DR0_ADDRESS_MASK, MSR_INTERCEPT_NONE);
>          svm_intercept_msr(v, MSR_AMD64_DR1_ADDRESS_MASK, MSR_INTERCEPT_NONE);
> @@ -217,9 +215,6 @@ static void __restore_debug_registers(struct vmcb_struct *vmcb, struct vcpu *v)
>          wrmsrl(MSR_AMD64_DR1_ADDRESS_MASK, v->arch.hvm_svm.dr_mask[1]);
>          wrmsrl(MSR_AMD64_DR2_ADDRESS_MASK, v->arch.hvm_svm.dr_mask[2]);
>          wrmsrl(MSR_AMD64_DR3_ADDRESS_MASK, v->arch.hvm_svm.dr_mask[3]);
> -
> -        /* Can't use hvm_cpuid() in svm_save_dr(): v != current. */
> -        v->arch.hvm_vcpu.flag_dr_dirty |= 2;

Should v->arch.hvm_vcpu.flag_dr_dirty be converted to bool then?

-boris



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 07/27] x86/cpuid: Recalculate a domains CPUID policy when appropriate
  2017-01-04 12:39 ` [PATCH 07/27] x86/cpuid: Recalculate a domains CPUID policy when appropriate Andrew Cooper
@ 2017-01-04 15:01   ` Jan Beulich
  2017-01-04 15:33     ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-04 15:01 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> +void recalculate_cpuid_policy(struct domain *d)
> +{
> +    struct cpuid_policy *p = d->arch.cpuid;
> +    const struct cpuid_policy *max =
> +        is_pv_domain(d) ? &pv_max_policy : &hvm_max_policy;
> +    uint32_t fs[FSCAPINTS], max_fs[FSCAPINTS];
> +    unsigned int i;
> +
> +    cpuid_policy_to_featureset(p, fs);
> +    memcpy(max_fs, max->fs, sizeof(max_fs));
> +
> +    /* Allow a toolstack to possibly select ITSC... */
> +    if ( cpu_has_itsc )
> +        __set_bit(X86_FEATURE_ITSC, max_fs);

This special casing calls for some explanation in the commit message
(or the comment here).

> +    for ( i = 0; i < ARRAY_SIZE(fs); i++ )
> +        fs[i] &= max_fs[i];
> +
> +    if ( is_pv_32bit_domain(d) )
> +    {
> +        __clear_bit(X86_FEATURE_LM, fs);
> +        if ( boot_cpu_data.x86_vendor != X86_VENDOR_AMD )
> +            __clear_bit(X86_FEATURE_SYSCALL, fs);
> +    }
> +
> +    if ( is_hvm_domain(d) && !hap_enabled(d) )
> +    {
> +        for ( i = 0; i < ARRAY_SIZE(fs); i++ )
> +            fs[i] &= hvm_shadow_featuremask[i];
> +    }

Wouldn't this better go into the other loop ANDing fs[i]?

> +    /* ... but hide ITSC in the common case. */
> +    if ( !d->disable_migrate && !d->arch.vtsc )
> +        __clear_bit(X86_FEATURE_ITSC, fs);

The 32-bit PV logic could easily move below here afaics, reducing
the distance between the two parts of the comment.

Also this requires adjustment of the policy by (the caller of)
tsc_set_info().

>  static void update_domain_cpuid_info(struct domain *d,
>                                       const xen_domctl_cpuid_t *ctl)
>  {
> +    struct cpuid_policy *p = d->arch.cpuid;
> +    struct cpuid_leaf leaf = { ctl->eax, ctl->ebx, ctl->ecx, ctl->edx };
> +
> +    if ( ctl->input[0] < ARRAY_SIZE(p->basic.raw) )
> +    {
> +        if ( ctl->input[0] == 7 )
> +        {
> +            if ( ctl->input[1] < ARRAY_SIZE(p->feat.raw) )
> +                p->feat.raw[ctl->input[1]] = leaf;
> +        }
> +        else if ( ctl->input[0] == 0xd )
> +        {
> +            if ( ctl->input[1] < ARRAY_SIZE(p->xstate.raw) )
> +                p->xstate.raw[ctl->input[1]] = leaf;
> +        }
> +        else
> +            p->basic.raw[ctl->input[0]] = leaf;
> +    }
> +    else if ( (ctl->input[0] - 0x80000000) < ARRAY_SIZE(p->extd.raw) )
> +        p->extd.raw[ctl->input[0] - 0x80000000] = leaf;

These checks against ARRAY_SIZE() worry me - wouldn't it be better to
refuse any attempts to set values not representable in the policy?

Jan



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 03/27] x86/cpuid: Introduce struct cpuid_policy
  2017-01-04 14:22   ` Jan Beulich
@ 2017-01-04 15:05     ` Andrew Cooper
  2017-01-04 15:58       ` Jan Beulich
  0 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 15:05 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 04/01/17 14:22, Jan Beulich wrote:
>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>> struct cpuid_policy will eventually be a complete replacement for the cpuids[]
>> array, with a fixed layout and named fields to allow O(1) access to specific
>> information.
>>
>> For now, the CPUID content is capped at the 0xd and 0x8000001c leaves, which
>> matches the maximum policy that the toolstack will generate for a domain.
> Especially (but not only) leaf 0x17 and extended leaf 0x8000001e
> make me wonder whether this is a good starting point.

The starting point matches what the toolstack currently does.

I'd prefer to logically separate this series (reworking how the
hypervisor deals with CPUID data), from altering the default policy
given to guests, but I do agree that we should move in the direction you
suggest.

>
>> @@ -67,6 +80,55 @@ static void __init sanitise_featureset(uint32_t *fs)
>>                            (fs[FEATURESET_e1d] & ~CPUID_COMMON_1D_FEATURES));
>>  }
>>  
>> +static void __init calculate_raw_policy(void)
>> +{
>> +    struct cpuid_policy *p = &raw_policy;
>> +    unsigned int i;
>> +
>> +    cpuid_leaf(0, &p->basic.raw[0]);
>> +    for ( i = 1; i < min(ARRAY_SIZE(p->basic.raw),
>> +                         p->basic.max_leaf + 1ul); ++i )
>> +    {
>> +        /* Collected later. */
>> +        if ( i == 0x7 || i == 0xd )
>> +            continue;
>> +
>> +        cpuid_leaf(i, &p->basic.raw[i]);
> Leaves 2, 4, 0xb, and 0xf are, iirc, multiple-invocation ones too.
> There should at least be a comment here clarifying why they don't
> need treatment similar to 7 and 0xd.

Leaf 2 is magic.  It doesn't take a subleaf parameter, but may return
different information on repeated invocation.  I am half tempted to
require this to be a static leaf, which appears to be the case on most
hardware I have available to me.
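
For reference, the architectural behaviour is that the low byte of EAX
from leaf 2 says how many times the leaf must be executed to collect all
descriptors (1 on more or less all current hardware).  A sketch of
honouring that, reusing the cpuid_leaf() helper from this series:

    struct cpuid_leaf l;
    unsigned int n;

    cpuid_leaf(2, &l);
    for ( n = l.a & 0xff; n > 1; --n )
        cpuid_leaf(2, &l); /* A real consumer would accumulate each pass. */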

The handling of leaf 4 is all per-cpu rather than per-domain, which is
why it isn't expressed in this structure.  Doing that sensibly is going to
require the per-cpu topology work.  (There is a lot more CPUID work than
is presented in this series, but it was frankly getting unwieldily large.)

0xf isn't currently exposed (due to max_leaf being 0xd), and we don't
support PQM in guests yet.  I'd expect the work to expose it to guests
to add a new union, following the 0x7/0xd example.

>
>> +    }
>> +
>> +    if ( p->basic.max_leaf >= 7 )
>> +    {
>> +        cpuid_count_leaf(7, 0, &p->feat.raw[0]);
>> +
>> +        for ( i = 1; i < min(ARRAY_SIZE(p->feat.raw),
>> +                             p->feat.max_subleaf + 1ul); ++i )
>> +            cpuid_count_leaf(7, i, &p->feat.raw[i]);
>> +    }
>> +
>> +    if ( p->basic.max_leaf >= 0xd )
>> +    {
>> +        uint64_t xstates;
>> +
>> +        cpuid_count_leaf(0xd, 0, &p->xstate.raw[0]);
>> +        cpuid_count_leaf(0xd, 1, &p->xstate.raw[1]);
>> +
>> +        xstates = ((uint64_t)(p->xstate.xcr0_high | p->xstate.xss_high) << 32) |
>> +            (p->xstate.xcr0_low | p->xstate.xss_low);
>> +
>> +        for ( i = 2; i < 63; ++i )
>> +        {
>> +            if ( xstates & (1u << i) )
> 1ull

Oops yes.

>
>> @@ -65,6 +66,78 @@ extern struct cpuidmasks cpuidmask_defaults;
>>  /* Whether or not cpuid faulting is available for the current domain. */
>>  DECLARE_PER_CPU(bool, cpuid_faulting_enabled);
>>  
>> +#define CPUID_GUEST_NR_BASIC      (0xdu + 1)
>> +#define CPUID_GUEST_NR_FEAT       (0u + 1)
>> +#define CPUID_GUEST_NR_XSTATE     (62u + 1)
>> +#define CPUID_GUEST_NR_EXTD_INTEL (0x8u + 1)
>> +#define CPUID_GUEST_NR_EXTD_AMD   (0x1cu + 1)
>> +#define CPUID_GUEST_NR_EXTD       MAX(CPUID_GUEST_NR_EXTD_INTEL, \
>> +                                      CPUID_GUEST_NR_EXTD_AMD)
>> +
>> +struct cpuid_policy
>> +{
>> +    /*
>> +     * WARNING: During the CPUID transition period, not all information here
>> +     * is accurate.  The following items are accurate, and can be relied upon.
>> +     *
>> +     * Global *_policy objects:
>> +     *
>> +     * - Host accurate:
>> +     *   - max_{,sub}leaf
>> +     *   - {xcr0,xss}_{high,low}
>> +     *
>> +     * - Guest appropriate:
>> +     *   - Nothing
> I don't understand the meaning of the "accurate" above and
> the "appropriate" here.

This might make more sense in the context of patches 7 and 22, where we
end up with a mix of host values, unsanitised and sanitised guest
values.  This comment describes which values fall into which category,
and is updated across the series.

>
>> +     *
>> +     * Everything else should be considered inaccurate, and not necessarily 0.
>> +     */
>> +
>> +    /* Basic leaves: 0x000000xx */
>> +    union {
>> +        struct cpuid_leaf raw[CPUID_GUEST_NR_BASIC];
>> +        struct {
>> +            /* Leaf 0x0 - Max and vendor. */
>> +            struct {
>> +                uint32_t max_leaf, :32, :32, :32;
> These unnamed bitfields are here solely for the BUILD_BUG_ON()s?

They are to help my counting when adding structs for new leaves, but
they do get filled in with named fields later.

> Wouldn't it make sense to omit them here, and use > instead of !=
> there?

I tried it that way to start with, and got in a mess.

> Also is there really value in nesting unnamed structures like this?

It makes the output of tools like `pahole`, when looking at struct
cpuid_policy, far clearer to read.  But like above, it aids clarity,
particularly when adding in the higher-numbered fields in later patches.

~Andrew


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 04/27] x86/cpuid: Move featuresets into struct cpuid_policy
  2017-01-04 14:35   ` Jan Beulich
@ 2017-01-04 15:10     ` Andrew Cooper
  2017-01-04 15:59       ` Jan Beulich
  0 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 15:10 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 04/01/17 14:35, Jan Beulich wrote:
>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>> +static void __init calculate_host_policy(void)
>>  {
>> -    unsigned int max, tmp;
>> -
>> -    max = cpuid_eax(0);
>> -
>> -    if ( max >= 1 )
>> -        cpuid(0x1, &tmp, &tmp,
>> -              &raw_featureset[FEATURESET_1c],
>> -              &raw_featureset[FEATURESET_1d]);
>> -    if ( max >= 7 )
>> -        cpuid_count(0x7, 0, &tmp,
>> -                    &raw_featureset[FEATURESET_7b0],
>> -                    &raw_featureset[FEATURESET_7c0],
>> -                    &raw_featureset[FEATURESET_7d0]);
>> -    if ( max >= 0xd )
>> -        cpuid_count(0xd, 1,
>> -                    &raw_featureset[FEATURESET_Da1],
>> -                    &tmp, &tmp, &tmp);
>> -
>> -    max = cpuid_eax(0x80000000);
>> -    if ( (max >> 16) != 0x8000 )
>> -        return;
>> +    struct cpuid_policy *p = &host_policy;
>>  
>> -    if ( max >= 0x80000001 )
>> -        cpuid(0x80000001, &tmp, &tmp,
>> -              &raw_featureset[FEATURESET_e1c],
>> -              &raw_featureset[FEATURESET_e1d]);
>> -    if ( max >= 0x80000007 )
>> -        cpuid(0x80000007, &tmp, &tmp, &tmp,
>> -              &raw_featureset[FEATURESET_e7d]);
>> -    if ( max >= 0x80000008 )
>> -        cpuid(0x80000008, &tmp,
>> -              &raw_featureset[FEATURESET_e8b],
>> -              &tmp, &tmp);
>> +    memcpy(p->fs, boot_cpu_data.x86_capability, sizeof(p->fs));
> What are the plans for keeping this up-to-date wrt later
> adjustments to boot_cpu_data.x86_capability?  Wouldn't it be
> better for the field to be a pointer, and the above to be a simple
> assignment of &boot_cpu_data.x86_capability?

The fs field is temporary and removed in patch 20.

calculate_host_policy() is called immediately before dom0 is
constructed, which is after AP bringup.  Realistically,
boot_cpu_data.x86_capability won't be changing by this point, even for
PCPU hotplug.

>
>> +static void __init calculate_pv_max_policy(void)
>>  {
>> +    struct cpuid_policy *p = &pv_max_policy;
> I assume later patches will add further uses of this variable?

Yes.

> Otherwise ...
>
>> @@ -185,10 +159,12 @@ static void __init calculate_pv_featureset(void)
>>      __set_bit(X86_FEATURE_CMP_LEGACY, pv_featureset);
>>  
>>      sanitise_featureset(pv_featureset);
>> +    cpuid_featureset_to_policy(pv_featureset, p);
> ... using &pv_max_policy directly here would seem more friendly
> to readers.

Expressing it this way makes for shorter diffs across the series.

~Andrew


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 08/27] x86/hvm: Dispatch cpuid_viridian_leaves() from guest_cpuid()
  2017-01-04 12:39 ` [PATCH 08/27] x86/hvm: Dispatch cpuid_viridian_leaves() from guest_cpuid() Andrew Cooper
@ 2017-01-04 15:24   ` Jan Beulich
  2017-01-04 15:36     ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-04 15:24 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Paul Durrant, Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> One check against EFER_SVME is replaced with the more appropriate cpu_has_svm,
> when determining whether MSR bitmaps are available.

I don't think this is correct - start_svm() may fail, in which case
the CPUID flag doesn't get cleared, yet EFER.SVME also doesn't
get set. How about comparing hvm_funcs (if not NULL) ->name
against "SVM"?
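
Roughly, i.e. something along these lines (a sketch only, assuming the
global hvm_funcs table exposes the name string the SVM/VMX setup code
registers):

    if ( hvm_funcs.name && !strcmp(hvm_funcs.name, "SVM") )
        ... /* SVM is genuinely available. */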

> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>

Cc: Paul Durrant <paul.durrant@citrix.com>

> --- a/xen/arch/x86/cpuid.c
> +++ b/xen/arch/x86/cpuid.c
> @@ -319,8 +319,21 @@ int init_domain_cpuid_policy(struct domain *d)
>  void guest_cpuid(const struct vcpu *v, unsigned int leaf,
>                   unsigned int subleaf, struct cpuid_leaf *res)
>  {
> +    const struct domain *d = v->domain;
> +
>      *res = EMPTY_LEAF;
>  
> +    /*
> +     * First pass:
> +     * - Dispatch the virtualised leaves to their respective handlers.
> +     */
> +    switch ( leaf )
> +    {
> +    case 0x40000000 ... 0x400000ff:
> +        if ( is_viridian_domain(d) )
> +            return cpuid_viridian_leaves(v, leaf, subleaf, res);
> +    }

Can we please have a break statement above here?

> +void cpuid_viridian_leaves(const struct vcpu *v, unsigned int leaf,
> +                           unsigned int subleaf, struct cpuid_leaf *res)
>  {
> -    struct domain *d = current->domain;
> +    const struct domain *d = v->domain;
>  
> -    if ( !is_viridian_domain(d) )
> -        return 0;
> +    ASSERT(is_viridian_domain(d));
> +    ASSERT(leaf >= 0x40000000 && leaf < 0x40000100);
>  
>      leaf -= 0x40000000;
> -    if ( leaf > 6 )
> -        return 0;
>  
> -    *eax = *ebx = *ecx = *edx = 0;
>      switch ( leaf )
>      {
>      case 0:
> -        *eax = 0x40000006; /* Maximum leaf */
> -        *ebx = 0x7263694d; /* Magic numbers  */
> -        *ecx = 0x666F736F;
> -        *edx = 0x76482074;
> +        res->a = 0x40000006; /* Maximum leaf */
> +        res->b = 0x7263694d; /* Magic numbers  */
> +        res->c = 0x666F736F;
> +        res->d = 0x76482074;
>          break;
> +
>      case 1:
> -        *eax = 0x31237648; /* Version number */
> +        res->a = 0x31237648; /* Version number */
>          break;
> +
>      case 2:
>          /* Hypervisor information, but only if the guest has set its
>             own version number. */
>          if ( d->arch.hvm_domain.viridian.guest_os_id.raw == 0 )
>              break;
> -        *eax = 1; /* Build number */
> -        *ebx = (xen_major_version() << 16) | xen_minor_version();
> -        *ecx = 0; /* SP */
> -        *edx = 0; /* Service branch and number */
> +        res->a = 1; /* Build number */
> +        res->b = (xen_major_version() << 16) | xen_minor_version();

I think the comments warrant retaining the zeroing of ECX and EDX.
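
I.e. something along these lines, keeping the explicit stores purely as
documentation even though the leaf is pre-zeroed:

        res->a = 1; /* Build number */
        res->b = (xen_major_version() << 16) | xen_minor_version();
        res->c = 0; /* SP */
        res->d = 0; /* Service branch and number */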

Jan



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 25/27] x86/svm: Use guest_cpuid() rather than hvm_cpuid()
  2017-01-04 12:39 ` [PATCH 25/27] x86/svm: " Andrew Cooper
@ 2017-01-04 15:26   ` Boris Ostrovsky
  2017-01-05 14:04   ` Jan Beulich
  1 sibling, 0 replies; 93+ messages in thread
From: Boris Ostrovsky @ 2017-01-04 15:26 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Suravee Suthikulpanit, Jan Beulich

On 01/04/2017 07:39 AM, Andrew Cooper wrote:
> More work is required before LWP details can be read straight out of the
> cpuid_policy block, but in the meantime hvm_cpuid() wants to disappear so
> update the code to use the newer interface.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 07/27] x86/cpuid: Recalculate a domains CPUID policy when appropriate
  2017-01-04 15:01   ` Jan Beulich
@ 2017-01-04 15:33     ` Andrew Cooper
  2017-01-04 16:04       ` Jan Beulich
  0 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 15:33 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 04/01/17 15:01, Jan Beulich wrote:
>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>> +void recalculate_cpuid_policy(struct domain *d)
>> +{
>> +    struct cpuid_policy *p = d->arch.cpuid;
>> +    const struct cpuid_policy *max =
>> +        is_pv_domain(d) ? &pv_max_policy : &hvm_max_policy;
>> +    uint32_t fs[FSCAPINTS], max_fs[FSCAPINTS];
>> +    unsigned int i;
>> +
>> +    cpuid_policy_to_featureset(p, fs);
>> +    memcpy(max_fs, max->fs, sizeof(max_fs));
>> +
>> +    /* Allow a toolstack to possibly select ITSC... */
>> +    if ( cpu_has_itsc )
>> +        __set_bit(X86_FEATURE_ITSC, max_fs);
> This special casing calls for some explanation in the commit message
> (or the comment here).

Ah - this logic is all copied from the current dynamic adjustment we
make in {pv,hvm}_cpuid().  This ITSC one is expressed differently, but
it should hopefully be obvious in the context of patches 18 and 19.

I will adjust the commit message.

>
>> +    for ( i = 0; i < ARRAY_SIZE(fs); i++ )
>> +        fs[i] &= max_fs[i];
>> +
>> +    if ( is_pv_32bit_domain(d) )
>> +    {
>> +        __clear_bit(X86_FEATURE_LM, fs);
>> +        if ( boot_cpu_data.x86_vendor != X86_VENDOR_AMD )
>> +            __clear_bit(X86_FEATURE_SYSCALL, fs);
>> +    }
>> +
>> +    if ( is_hvm_domain(d) && !hap_enabled(d) )
>> +    {
>> +        for ( i = 0; i < ARRAY_SIZE(fs); i++ )
>> +            fs[i] &= hvm_shadow_featuremask[i];
>> +    }
> Wouldn't this better go into the other loop ANDing fs[i]?

Maybe (although that would put a conditional in the middle of a tight loop).

I think that answer will depend on how much other content ends up here,
and I haven't finished replacing hvm_cpuid() yet.
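
For the sake of discussion, folding it in while hoisting the predicate
out of the loop might look something like this (shadow_fs is just a local
name for illustration):

    const uint32_t *shadow_fs =
        (is_hvm_domain(d) && !hap_enabled(d)) ? hvm_shadow_featuremask : NULL;

    for ( i = 0; i < ARRAY_SIZE(fs); i++ )
    {
        fs[i] &= max_fs[i];
        if ( shadow_fs )
            fs[i] &= shadow_fs[i];
    }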

>
>> +    /* ... but hide ITSC in the common case. */
>> +    if ( !d->disable_migrate && !d->arch.vtsc )
>> +        __clear_bit(X86_FEATURE_ITSC, fs);
> The 32-bit PV logic could easily move below here afaics, reducing
> the distance between the two parts of the comment.
>
> Also this requires adjustment of the policy by (the caller of)
> tsc_set_info().

And also XEN_DOMCTL_set_disable_migrate.

Currently the various toolstacks issue these hypercalls in the correct
order, so I was planning to ignore these edge cases until the toolstack
side work (see below).

>
>>  static void update_domain_cpuid_info(struct domain *d,
>>                                       const xen_domctl_cpuid_t *ctl)
>>  {
>> +    struct cpuid_policy *p = d->arch.cpuid;
>> +    struct cpuid_leaf leaf = { ctl->eax, ctl->ebx, ctl->ecx, ctl->edx };
>> +
>> +    if ( ctl->input[0] < ARRAY_SIZE(p->basic.raw) )
>> +    {
>> +        if ( ctl->input[0] == 7 )
>> +        {
>> +            if ( ctl->input[1] < ARRAY_SIZE(p->feat.raw) )
>> +                p->feat.raw[ctl->input[1]] = leaf;
>> +        }
>> +        else if ( ctl->input[0] == 0xd )
>> +        {
>> +            if ( ctl->input[1] < ARRAY_SIZE(p->xstate.raw) )
>> +                p->xstate.raw[ctl->input[1]] = leaf;
>> +        }
>> +        else
>> +            p->basic.raw[ctl->input[0]] = leaf;
>> +    }
>> +    else if ( (ctl->input[0] - 0x80000000) < ARRAY_SIZE(p->extd.raw) )
>> +        p->extd.raw[ctl->input[0] - 0x80000000] = leaf;
> These checks against ARRAY_SIZE() worry me - wouldn't we better
> refuse any attempts to set values not representable in the policy?

We can't do that yet, without toolstack-side changes.  Currently the
toolstack can lodge any values it wishes (which can be arbitrary
information from a cpuid= clause), and all we do is ignore them.

The plan (which will fix construction of domains when dom0 isn't a PV
guest) is to introduce a new DOMCTL_get_cpuid_policy, which the toolstack
will modify to its taste and present back to Xen, at which point I will
tighten up the hypercall interfaces to require a sensible policy from the
toolstack.
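
To give a rough idea of the shape (interface sketch only - none of these
names are final, and xen_cpuid_leaf_t is hypothetical at this point):

    struct xen_domctl_cpuid_policy {
        uint32_t nr_leaves;                           /* IN/OUT */
        XEN_GUEST_HANDLE_64(xen_cpuid_leaf_t) leaves;
    };

with a matching get/set pair of domctls operating on the whole policy at
once, rather than one leaf at a time.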

~Andrew


* Re: [PATCH 09/27] x86/cpuid: Dispatch cpuid_hypervisor_leaves() from guest_cpuid()
  2017-01-04 12:39 ` [PATCH 09/27] x86/cpuid: Dispatch cpuid_hypervisor_leaves() " Andrew Cooper
@ 2017-01-04 15:34   ` Jan Beulich
  2017-01-04 15:40     ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-04 15:34 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/arch/x86/cpuid.c
> +++ b/xen/arch/x86/cpuid.c
> @@ -332,6 +332,9 @@ void guest_cpuid(const struct vcpu *v, unsigned int leaf,
>      case 0x40000000 ... 0x400000ff:
>          if ( is_viridian_domain(d) )
>              return cpuid_viridian_leaves(v, leaf, subleaf, res);
> +        /* Fallthrough. */
> +    case 0x40000100 ... 0x4fffffff:
> +        return cpuid_hypervisor_leaves(v, leaf, subleaf, res);
>      }

Ah - that's why you didn't have a break statement there. But: Is this
correct? You're now returning Xen leaves in two windows for non-
Viridian domains.

> @@ -929,83 +927,71 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
>              limit = XEN_CPUID_MAX_NUM_LEAVES;
>      }
>  
> -    if ( idx > limit ) 
> -        return 0;
> +    if ( idx > limit )
> +        return;
>  
>      switch ( idx )
>      {
>      case 0:
> -        *eax = base + limit; /* Largest leaf */
> -        *ebx = XEN_CPUID_SIGNATURE_EBX;
> -        *ecx = XEN_CPUID_SIGNATURE_ECX;
> -        *edx = XEN_CPUID_SIGNATURE_EDX;
> +        res->a = base + limit; /* Largest leaf */
> +        res->b = XEN_CPUID_SIGNATURE_EBX;
> +        res->c = XEN_CPUID_SIGNATURE_ECX;
> +        res->d = XEN_CPUID_SIGNATURE_EDX;
>          break;
>  
>      case 1:
> -        *eax = (xen_major_version() << 16) | xen_minor_version();
> -        *ebx = 0;          /* Reserved */
> -        *ecx = 0;          /* Reserved */
> -        *edx = 0;          /* Reserved */
> +        res->a = (xen_major_version() << 16) | xen_minor_version();
>          break;
>  
>      case 2:
> -        *eax = 1;          /* Number of hypercall-transfer pages */
> -        *ebx = 0x40000000; /* MSR base address */
> -        if ( is_viridian_domain(currd) )
> -            *ebx = 0x40000200;
> -        *ecx = 0;          /* Features 1 */
> -        *edx = 0;          /* Features 2 */
> -        if ( is_pv_domain(currd) )
> -            *ecx |= XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD;
> +        res->a = 1;          /* Number of hypercall-transfer pages */
> +        res->b = 0x40000000; /* MSR base address */
> +        if ( is_viridian_domain(d) )
> +            res->b = 0x40000200;

Could I talk you into making this a conditional expression, as you're
touching it anyway?

> +        if ( is_pv_domain(d) )
> +            res->c |= XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD;
>          break;
>  
>      case 3: /* Time leaf. */
> -        switch ( sub_idx )
> +        switch ( subleaf )
>          {
>          case 0: /* features */
> -            *eax = ((!!currd->arch.vtsc << 0) |
> -                    (!!host_tsc_is_safe() << 1) |
> -                    (!!boot_cpu_has(X86_FEATURE_RDTSCP) << 2));
> -            *ebx = currd->arch.tsc_mode;
> -            *ecx = currd->arch.tsc_khz;
> -            *edx = currd->arch.incarnation;
> +            res->a = ((!!d->arch.vtsc << 0) |
> +                      (!!host_tsc_is_safe() << 1) |
> +                      (!!boot_cpu_has(X86_FEATURE_RDTSCP) << 2));

The latter two !! appear to still be necessary, but the first can go
away now that we use bool, and bool_t is an alias thereof.

> +            res->b = d->arch.tsc_mode;
> +            res->c = d->arch.tsc_khz;
> +            res->d = d->arch.incarnation;
>              break;
>  
>          case 1: /* scale and offset */
>          {
>              uint64_t offset;
>  
> -            if ( !currd->arch.vtsc )
> -                offset = currd->arch.vtsc_offset;
> +            if ( !d->arch.vtsc )
> +                offset = d->arch.vtsc_offset;
>              else
>                  /* offset already applied to value returned by virtual rdtscp */
>                  offset = 0;
> -            *eax = (uint32_t)offset;
> -            *ebx = (uint32_t)(offset >> 32);
> -            *ecx = currd->arch.vtsc_to_ns.mul_frac;
> -            *edx = (s8)currd->arch.vtsc_to_ns.shift;
> +            res->a = (uint32_t)offset;
> +            res->b = (uint32_t)(offset >> 32);

The casts aren't really necessary.

Jan


* Re: [PATCH 08/27] x86/hvm: Dispatch cpuid_viridian_leaves() from guest_cpuid()
  2017-01-04 15:24   ` Jan Beulich
@ 2017-01-04 15:36     ` Andrew Cooper
  2017-01-04 16:11       ` Jan Beulich
  0 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 15:36 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Paul Durrant, Xen-devel

On 04/01/17 15:24, Jan Beulich wrote:
>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>> One check against EFER_SVME is replaced with the more appropriate cpu_has_svm,
>> when determining whether MSR bitmaps are available.
> I don't think this is correct - start_svm() may fail, in which case
> the CPUID flag doesn't get cleared, yet EFER.SVME also doesn't
> get set. How about comparing hvm_funcs (if not NULL) ->name
> against "SVM"?

Hmm.  This shows that the same logical bug is present in the vmx side. 
Let me see about finding a better way of doing this.
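
For reference, the shape of that suggestion would be something like the
following (sketch only; variable name illustrative, and assuming
hvm_funcs.name is only populated once start_svm()/start_vmx() has
succeeded):

    bool msr_bitmaps_available =
        hvm_funcs.name && !strcmp(hvm_funcs.name, "SVM");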

>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Jan Beulich <JBeulich@suse.com>
> Cc: Paul Durrant <paul.durrant@citrix.com>

Oops yes sorry.

>
>> --- a/xen/arch/x86/cpuid.c
>> +++ b/xen/arch/x86/cpuid.c
>> @@ -319,8 +319,21 @@ int init_domain_cpuid_policy(struct domain *d)
>>  void guest_cpuid(const struct vcpu *v, unsigned int leaf,
>>                   unsigned int subleaf, struct cpuid_leaf *res)
>>  {
>> +    const struct domain *d = v->domain;
>> +
>>      *res = EMPTY_LEAF;
>>  
>> +    /*
>> +     * First pass:
>> +     * - Dispatch the virtualised leaves to their respective handlers.
>> +     */
>> +    switch ( leaf )
>> +    {
>> +    case 0x40000000 ... 0x400000ff:
>> +        if ( is_viridian_domain(d) )
>> +            return cpuid_viridian_leaves(v, leaf, subleaf, res);
>> +    }
> Can we please have a break statement above here?

I think this got lost in a rebase.  The following patch makes it all
sensible.  I will adjust.
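
i.e. for this patch on its own, something like:

    switch ( leaf )
    {
    case 0x40000000 ... 0x400000ff:
        if ( is_viridian_domain(d) )
            return cpuid_viridian_leaves(v, leaf, subleaf, res);
        break;
    }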

>
>> +void cpuid_viridian_leaves(const struct vcpu *v, unsigned int leaf,
>> +                           unsigned int subleaf, struct cpuid_leaf *res)
>>  {
>> -    struct domain *d = current->domain;
>> +    const struct domain *d = v->domain;
>>  
>> -    if ( !is_viridian_domain(d) )
>> -        return 0;
>> +    ASSERT(is_viridian_domain(d));
>> +    ASSERT(leaf >= 0x40000000 && leaf < 0x40000100);
>>  
>>      leaf -= 0x40000000;
>> -    if ( leaf > 6 )
>> -        return 0;
>>  
>> -    *eax = *ebx = *ecx = *edx = 0;
>>      switch ( leaf )
>>      {
>>      case 0:
>> -        *eax = 0x40000006; /* Maximum leaf */
>> -        *ebx = 0x7263694d; /* Magic numbers  */
>> -        *ecx = 0x666F736F;
>> -        *edx = 0x76482074;
>> +        res->a = 0x40000006; /* Maximum leaf */
>> +        res->b = 0x7263694d; /* Magic numbers  */
>> +        res->c = 0x666F736F;
>> +        res->d = 0x76482074;
>>          break;
>> +
>>      case 1:
>> -        *eax = 0x31237648; /* Version number */
>> +        res->a = 0x31237648; /* Version number */
>>          break;
>> +
>>      case 2:
>>          /* Hypervisor information, but only if the guest has set its
>>             own version number. */
>>          if ( d->arch.hvm_domain.viridian.guest_os_id.raw == 0 )
>>              break;
>> -        *eax = 1; /* Build number */
>> -        *ebx = (xen_major_version() << 16) | xen_minor_version();
>> -        *ecx = 0; /* SP */
>> -        *edx = 0; /* Service branch and number */
>> +        res->a = 1; /* Build number */
>> +        res->b = (xen_major_version() << 16) | xen_minor_version();
> I think the comments warrant the zeroing of ECX and EDX to be
> retained.

Ok.

~Andrew


* Re: [PATCH 09/27] x86/cpuid: Dispatch cpuid_hypervisor_leaves() from guest_cpuid()
  2017-01-04 15:34   ` Jan Beulich
@ 2017-01-04 15:40     ` Andrew Cooper
  2017-01-04 16:14       ` Jan Beulich
  0 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 15:40 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 04/01/17 15:34, Jan Beulich wrote:
>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>> --- a/xen/arch/x86/cpuid.c
>> +++ b/xen/arch/x86/cpuid.c
>> @@ -332,6 +332,9 @@ void guest_cpuid(const struct vcpu *v, unsigned int leaf,
>>      case 0x40000000 ... 0x400000ff:
>>          if ( is_viridian_domain(d) )
>>              return cpuid_viridian_leaves(v, leaf, subleaf, res);
>> +        /* Fallthrough. */
>> +    case 0x40000100 ... 0x4fffffff:
>> +        return cpuid_hypervisor_leaves(v, leaf, subleaf, res);
>>      }
> Ah - that's why you didn't have a break statement there. But: Is this
> correct? You're now returning Xen leaves in two windows for non-
> Viridian domains.

Oh - good point.  I think for now I will retain the is_viridian_domain()
check in cpuid_hypervisor_leaves().

The awkward issue is that the toolstack can provide the Xen max leaf
field.  I was considering switching the interface around to having the
toolstack choose all of leaf 0 for the virtualised leaves, and in the
longer term I am looking to have unions for these leaves in struct
cpuid_policy.

>
>> @@ -929,83 +927,71 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
>>              limit = XEN_CPUID_MAX_NUM_LEAVES;
>>      }
>>  
>> -    if ( idx > limit ) 
>> -        return 0;
>> +    if ( idx > limit )
>> +        return;
>>  
>>      switch ( idx )
>>      {
>>      case 0:
>> -        *eax = base + limit; /* Largest leaf */
>> -        *ebx = XEN_CPUID_SIGNATURE_EBX;
>> -        *ecx = XEN_CPUID_SIGNATURE_ECX;
>> -        *edx = XEN_CPUID_SIGNATURE_EDX;
>> +        res->a = base + limit; /* Largest leaf */
>> +        res->b = XEN_CPUID_SIGNATURE_EBX;
>> +        res->c = XEN_CPUID_SIGNATURE_ECX;
>> +        res->d = XEN_CPUID_SIGNATURE_EDX;
>>          break;
>>  
>>      case 1:
>> -        *eax = (xen_major_version() << 16) | xen_minor_version();
>> -        *ebx = 0;          /* Reserved */
>> -        *ecx = 0;          /* Reserved */
>> -        *edx = 0;          /* Reserved */
>> +        res->a = (xen_major_version() << 16) | xen_minor_version();
>>          break;
>>  
>>      case 2:
>> -        *eax = 1;          /* Number of hypercall-transfer pages */
>> -        *ebx = 0x40000000; /* MSR base address */
>> -        if ( is_viridian_domain(currd) )
>> -            *ebx = 0x40000200;
>> -        *ecx = 0;          /* Features 1 */
>> -        *edx = 0;          /* Features 2 */
>> -        if ( is_pv_domain(currd) )
>> -            *ecx |= XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD;
>> +        res->a = 1;          /* Number of hypercall-transfer pages */
>> +        res->b = 0x40000000; /* MSR base address */
>> +        if ( is_viridian_domain(d) )
>> +            res->b = 0x40000200;
> Could I talk you into making this a conditional expression, as you're
> touching it anyway?

Ok.  I did find the value of 0x40000200 particularly odd (given that we
split the CPUID leaves on the 100 boundary), but it has been like that
for ages.
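
i.e. something like:

    res->b = is_viridian_domain(d) ? 0x40000200 : 0x40000000;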

>
>> +        if ( is_pv_domain(d) )
>> +            res->c |= XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD;
>>          break;
>>  
>>      case 3: /* Time leaf. */
>> -        switch ( sub_idx )
>> +        switch ( subleaf )
>>          {
>>          case 0: /* features */
>> -            *eax = ((!!currd->arch.vtsc << 0) |
>> -                    (!!host_tsc_is_safe() << 1) |
>> -                    (!!boot_cpu_has(X86_FEATURE_RDTSCP) << 2));
>> -            *ebx = currd->arch.tsc_mode;
>> -            *ecx = currd->arch.tsc_khz;
>> -            *edx = currd->arch.incarnation;
>> +            res->a = ((!!d->arch.vtsc << 0) |
>> +                      (!!host_tsc_is_safe() << 1) |
>> +                      (!!boot_cpu_has(X86_FEATURE_RDTSCP) << 2));
> The latter two !! appear to still be necessary, but the first can go
> away now that we use bool, and bool_t is an alias thereof.

Ok.

>
>> +            res->b = d->arch.tsc_mode;
>> +            res->c = d->arch.tsc_khz;
>> +            res->d = d->arch.incarnation;
>>              break;
>>  
>>          case 1: /* scale and offset */
>>          {
>>              uint64_t offset;
>>  
>> -            if ( !currd->arch.vtsc )
>> -                offset = currd->arch.vtsc_offset;
>> +            if ( !d->arch.vtsc )
>> +                offset = d->arch.vtsc_offset;
>>              else
>>                  /* offset already applied to value returned by virtual rdtscp */
>>                  offset = 0;
>> -            *eax = (uint32_t)offset;
>> -            *ebx = (uint32_t)(offset >> 32);
>> -            *ecx = currd->arch.vtsc_to_ns.mul_frac;
>> -            *edx = (s8)currd->arch.vtsc_to_ns.shift;
>> +            res->a = (uint32_t)offset;
>> +            res->b = (uint32_t)(offset >> 32);
> The casts aren't really necessary.

Will drop.

~Andrew


* Re: [PATCH 16/27] x86/svm: Improvements using named features
  2017-01-04 14:52   ` Boris Ostrovsky
@ 2017-01-04 15:42     ` Andrew Cooper
  0 siblings, 0 replies; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 15:42 UTC (permalink / raw)
  To: Boris Ostrovsky, Xen-devel; +Cc: Suravee Suthikulpanit, Jan Beulich

On 04/01/17 14:52, Boris Ostrovsky wrote:
> On 01/04/2017 07:39 AM, Andrew Cooper wrote:
>> This avoids calling into hvm_cpuid() to obtain information which is directly
>> available.  In particular, this avoids the need to overload flag_dr_dirty
>> because of hvm_cpuid() being unavailable svm_save_dr()
> "unavailabe in" (or from)

Will fix.

>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Jan Beulich <JBeulich@suse.com>
>> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> ---
>>  xen/arch/x86/hvm/svm/svm.c | 33 ++++++++-------------------------
>>  1 file changed, 8 insertions(+), 25 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
>> index de20f64..8f6737c 100644
>> --- a/xen/arch/x86/hvm/svm/svm.c
>> +++ b/xen/arch/x86/hvm/svm/svm.c
>> @@ -173,7 +173,7 @@ static void svm_save_dr(struct vcpu *v)
>>      v->arch.hvm_vcpu.flag_dr_dirty = 0;
>>      vmcb_set_dr_intercepts(vmcb, ~0u);
>>  
>> -    if ( flag_dr_dirty & 2 )
>> +    if ( v->domain->arch.cpuid->extd.dbext )
>>      {
>>          svm_intercept_msr(v, MSR_AMD64_DR0_ADDRESS_MASK, MSR_INTERCEPT_RW);
>>          svm_intercept_msr(v, MSR_AMD64_DR1_ADDRESS_MASK, MSR_INTERCEPT_RW);
>> @@ -196,8 +196,6 @@ static void svm_save_dr(struct vcpu *v)
>>  
>>  static void __restore_debug_registers(struct vmcb_struct *vmcb, struct vcpu *v)
>>  {
>> -    unsigned int ecx;
>> -
>>      if ( v->arch.hvm_vcpu.flag_dr_dirty )
>>          return;
>>  
>> @@ -205,8 +203,8 @@ static void __restore_debug_registers(struct vmcb_struct *vmcb, struct vcpu *v)
>>      vmcb_set_dr_intercepts(vmcb, 0);
>>  
>>      ASSERT(v == current);
>> -    hvm_cpuid(0x80000001, NULL, NULL, &ecx, NULL);
>> -    if ( test_bit(X86_FEATURE_DBEXT & 31, &ecx) )
>> +
>> +    if ( v->domain->arch.cpuid->extd.dbext )
>>      {
>>          svm_intercept_msr(v, MSR_AMD64_DR0_ADDRESS_MASK, MSR_INTERCEPT_NONE);
>>          svm_intercept_msr(v, MSR_AMD64_DR1_ADDRESS_MASK, MSR_INTERCEPT_NONE);
>> @@ -217,9 +215,6 @@ static void __restore_debug_registers(struct vmcb_struct *vmcb, struct vcpu *v)
>>          wrmsrl(MSR_AMD64_DR1_ADDRESS_MASK, v->arch.hvm_svm.dr_mask[1]);
>>          wrmsrl(MSR_AMD64_DR2_ADDRESS_MASK, v->arch.hvm_svm.dr_mask[2]);
>>          wrmsrl(MSR_AMD64_DR3_ADDRESS_MASK, v->arch.hvm_svm.dr_mask[3]);
>> -
>> -        /* Can't use hvm_cpuid() in svm_save_dr(): v != current. */
>> -        v->arch.hvm_vcpu.flag_dr_dirty |= 2;
> Should v->arch.hvm_vcpu.flag_dr_dirty be converted to bool then?

Hmm.  That is how the code was before c/s c097f54912, so yes - I will
switch it back.

~Andrew


* Re: [PATCH 10/27] x86/cpuid: Introduce named feature bitmaps
  2017-01-04 12:39 ` [PATCH 10/27] x86/cpuid: Introduce named feature bitmaps Andrew Cooper
@ 2017-01-04 15:44   ` Jan Beulich
  2017-01-04 17:21     ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-04 15:44 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> Use anonymous unions to access the feature leaves as complete words, and by
> named individual feature.
> 
> A feature name is introduced for every architectural X86_FEATURE_*, other than
> the dynamically calculated values such as APIC, OSXSAVE and OSPKE.

A rationale for this change would be nice to have here, as the
redundancy with public/arch-x86/cpufeatureset.h means any
addition will now need to change two places. Would it be possible
for gen-cpuid.py to generate these bitfield declarations?

> --- a/xen/include/asm-x86/cpuid.h
> +++ b/xen/include/asm-x86/cpuid.h
> @@ -103,7 +103,25 @@ struct cpuid_policy
>  
>              /* Leaf 0x1 - family/model/stepping and features. */
>              struct {
> -                uint32_t :32, :32, _1c, _1d;
> +                uint32_t :32, :32;
> +                union {
> +                    uint32_t _1c;
> +                    struct {
> +                        bool sse3:1, pclmulqdq:1, dtes64:1, monitor:1, dscpl:1, vmx:1, smx:1, eist:1,
> +                             tm2:1, ssse3:1, /* cid */:1, /* sdbg */:1, fma:1, cx16:1, xtpr:1, pcdm:1,
> +                             :1, pcid:1, dca:1, sse4_1:1, sse4_2:1, x2apic:1, movebe:1, popcnt:1,

movbe

> +                             tsc_deadline:1, aesni:1, xsave:1, /* osxsave */:1, avx:1, f16c:1, rdrand:1, hv:1;

hypervisor (please make sure the names match up with X86_FEATURE_*).

> @@ -113,7 +131,34 @@ struct cpuid_policy
>          struct cpuid_leaf raw[CPUID_GUEST_NR_FEAT];
>          struct {
>              struct {
> -                uint32_t max_subleaf, _7b0, _7c0, _7d0;
> +                uint32_t max_subleaf;
> +                union {
> +                    uint32_t _7b0;
> +                    struct {
> +                        bool fsgsbase:1, tsc_adjust:1, sgx:1, bmi1:1, hle:1, avx2:1, fdp_excp_only:1, smep:1,
> +                             bmi2:1, erms:1, invpcid:1, rtm:1, pqm:1, no_fpu_sel:1, mpx:1, pqe:1,
> +                             avx512f:1, avx512dq:1, rdseed:1, adx:1, smap:1, avx512ifma:1, /* pcommit */:1, clflushopt:1,
> +                             clwb:1, /* pt */:1, avx512pf:1, avx512er:1, avx512cd:1, sha:1, avx512bw:1, avx512vl:1;

The commented out entries here don't match the commit message.

> +                    };
> +                };
> +                union {
> +                    uint32_t _7c0;
> +                    struct {
> +                        bool prefetchwt1:1, avx512vbmi:1, :1, pku: 1, :1, :1, :1, :1,
> +                             :1, :1, :1, :1, :1, :1, :1, :1,
> +                             :1, :1, :1, :1, :1, :1, :1, :1,
> +                             :1, :1, :1, :1, :1, :1, :1, :1;

This is ugly, but I remember you saying (on irc?) the compiler
doesn't allow bitfields wider than one bit for bool ...

> @@ -126,7 +171,16 @@ struct cpuid_policy
>                  uint32_t xcr0_low, :32, :32, xcr0_high;
>              };
>              struct {
> -                uint32_t Da1, :32, xss_low, xss_high;
> +                union {
> +                    uint32_t Da1;
> +                    struct {
> +                        bool xsaveopt: 1, xsavec: 1, xgetbv1: 1, xsaves: 1, :1, :1, :1, :1,

Why the blanks after the colons?

Jan



* Re: [PATCH 02/27] x86/cpuid: Introduce guest_cpuid() and struct cpuid_leaf
  2017-01-04 14:47     ` Andrew Cooper
@ 2017-01-04 15:49       ` Jan Beulich
  0 siblings, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2017-01-04 15:49 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Kevin Tian, Suravee Suthikulpanit, Xen-devel, Paul Durrant,
	Jun Nakajima, Boris Ostrovsky

>>> On 04.01.17 at 15:47, <andrew.cooper3@citrix.com> wrote:
> On 04/01/17 14:01, Jan Beulich wrote:
>>> @@ -4502,15 +4502,15 @@ x86_emulate(
>>>  
>>>          case 0xfc: /* clzero */
>>>          {
>>> -            unsigned int eax = 1, ebx = 0, dummy = 0;
>>> +            struct cpuid_leaf res;
>> Please put a single instance of this at the top of the body of the giant
>> switch() statement (likely calling for it to be named other than "res").
> 
> struct cpuid_leaf cpuid_leaf?
> 
> I can't think of anything clearer.

Fine with me.

>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
>>> @@ -164,6 +164,11 @@ enum x86_emulate_fpu_type {
>>>      X86EMUL_FPU_ymm  /* AVX/XOP instruction set (%ymm0-%ymm7/15) */
>>>  };
>>>  
>>> +struct cpuid_leaf
>>> +{
>>> +    uint32_t a, b, c, d;
>> Could you please consistently use uint32_t or unsigned int between
>> here and ...
>>
>>> @@ -415,10 +420,9 @@ struct x86_emulate_ops
>>>       * #GP[0].  Used to implement CPUID faulting.
>>>       */
>>>      int (*cpuid)(
>>> -        unsigned int *eax,
>>> -        unsigned int *ebx,
>>> -        unsigned int *ecx,
>>> -        unsigned int *edx,
>>> +        unsigned int leaf,
>>> +        unsigned int subleaf,
>>> +        struct cpuid_leaf *res,
>> ... here? I have no particular preference which of the two to use.
> 
> Will use uint32_t.

Having gone a little farther through the series, that's the option
which would apparently incur the higher amount of follow-on
changes.

Jan



* Re: [PATCH 03/27] x86/cpuid: Introduce struct cpuid_policy
  2017-01-04 15:05     ` Andrew Cooper
@ 2017-01-04 15:58       ` Jan Beulich
  0 siblings, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2017-01-04 15:58 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 16:05, <andrew.cooper3@citrix.com> wrote:
> On 04/01/17 14:22, Jan Beulich wrote:
>>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>>> struct cpuid_policy will eventually be a complete replacement for the cpuids[]
>>> array, with a fixed layout and named fields to allow O(1) access to specific
>>> information.
>>>
>>> For now, the CPUID content is capped at the 0xd and 0x8000001c leaves, which
>>> matches the maximum policy that the toolstack will generate for a domain.
>> Especially (but not only) leaf 0x17 and extended leaf 0x8000001e
>> make me wonder whether this is a good starting point.
> 
> The starting point matches what the toolstack currently does.
> 
> I'd prefer to logically separate this series (reworking how the
> hypervisor deals with CPUID data) from altering the default policy
> given to guests, but I do agree that we should move in the direction you
> suggest.

Okay.

>>> @@ -67,6 +80,55 @@ static void __init sanitise_featureset(uint32_t *fs)
>>>                            (fs[FEATURESET_e1d] & ~CPUID_COMMON_1D_FEATURES));
>>>  }
>>>  
>>> +static void __init calculate_raw_policy(void)
>>> +{
>>> +    struct cpuid_policy *p = &raw_policy;
>>> +    unsigned int i;
>>> +
>>> +    cpuid_leaf(0, &p->basic.raw[0]);
>>> +    for ( i = 1; i < min(ARRAY_SIZE(p->basic.raw),
>>> +                         p->basic.max_leaf + 1ul); ++i )
>>> +    {
>>> +        /* Collected later. */
>>> +        if ( i == 0x7 || i == 0xd )
>>> +            continue;
>>> +
>>> +        cpuid_leaf(i, &p->basic.raw[i]);
>> Leaves 2, 4, 0xb, and 0xf are, iirc, multiple invocation ones too.
>> There should at least be a comment here clarifying why they don't
>> need treatment similar to 7 and 0xd.
> 
> Leaf 2 is magic.  It doesn't take a subleaf parameter, but may return
> different information on repeated invocation.  I am half tempted to
> require this to be a static leaf, which appears to be the case on most
> hardware I have available to me.

Then we should have at least a warning logged somewhere if
multiple invocations would be needed, in the hopes that people
encountering it would tell us.
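
Something along these lines would seem sufficient (sketch only):

    cpuid_leaf(2, &p->basic.raw[2]);
    if ( (p->basic.raw[2].a & 0xff) != 1 )
        printk(XENLOG_WARNING
               "CPUID leaf 2 wants %u invocations; only one collected\n",
               p->basic.raw[2].a & 0xff);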

> The handling of leaf 4 is all per-cpu rather than per-domain, which is
> why it isn't expressed in this structure.  That is going to require the
> per-cpu topology work to do sensibly.  (There is a lot more CPUID work
> than is just presented in this series, but it was frankly getting
> unwieldy.)

Well, okay, understood.

> 0xf isn't currently exposed (due to max_leaf being 0xd),

True.

>>> @@ -65,6 +66,78 @@ extern struct cpuidmasks cpuidmask_defaults;
>>>  /* Whether or not cpuid faulting is available for the current domain. */
>>>  DECLARE_PER_CPU(bool, cpuid_faulting_enabled);
>>>  
>>> +#define CPUID_GUEST_NR_BASIC      (0xdu + 1)
>>> +#define CPUID_GUEST_NR_FEAT       (0u + 1)
>>> +#define CPUID_GUEST_NR_XSTATE     (62u + 1)
>>> +#define CPUID_GUEST_NR_EXTD_INTEL (0x8u + 1)
>>> +#define CPUID_GUEST_NR_EXTD_AMD   (0x1cu + 1)
>>> +#define CPUID_GUEST_NR_EXTD       MAX(CPUID_GUEST_NR_EXTD_INTEL, \
>>> +                                      CPUID_GUEST_NR_EXTD_AMD)
>>> +
>>> +struct cpuid_policy
>>> +{
>>> +    /*
>>> +     * WARNING: During the CPUID transition period, not all information here
>>> +     * is accurate.  The following items are accurate, and can be relied upon.
>>> +     *
>>> +     * Global *_policy objects:
>>> +     *
>>> +     * - Host accurate:
>>> +     *   - max_{,sub}leaf
>>> +     *   - {xcr0,xss}_{high,low}
>>> +     *
>>> +     * - Guest appropriate:
>>> +     *   - Nothing
>> I don't understand the meaning of the "accurate" above and
>> the "appropriate" here.
> 
> This might make more sense in the context of patches 7 and 22, where we
> end up with a mix of host values, unsanitised and sanitised guest
> values.  This comment describes which values fall into which category,
> and is updated across the series.

Well - my main issue here is the use of two _different_ words,
whereas later for per-domain stuff you use the same word twice.

>>> +     *
>>> +     * Everything else should be considered inaccurate, and not necessarily 0.
>>> +     */
>>> +
>>> +    /* Basic leaves: 0x000000xx */
>>> +    union {
>>> +        struct cpuid_leaf raw[CPUID_GUEST_NR_BASIC];
>>> +        struct {
>>> +            /* Leaf 0x0 - Max and vendor. */
>>> +            struct {
>>> +                uint32_t max_leaf, :32, :32, :32;
>> Also is there really value in nesting unnamed structures like this?
> 
> It makes the output of tools like `pahole` on struct cpuid_policy far
> clearer to read.  But as above, it aids clarity, particularly when adding
> in the higher-numbered fields in later patches.

I've seen some of those later patches meanwhile, but I don't think
the double struct-s help (other than cluttering these already
hard-to-read declarations). The comments which are there should be
enough to separate groups.

Jan


* Re: [PATCH 04/27] x86/cpuid: Move featuresets into struct cpuid_policy
  2017-01-04 15:10     ` Andrew Cooper
@ 2017-01-04 15:59       ` Jan Beulich
  0 siblings, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2017-01-04 15:59 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 16:10, <andrew.cooper3@citrix.com> wrote:
> On 04/01/17 14:35, Jan Beulich wrote:
>>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>>> +static void __init calculate_host_policy(void)
>>>  {
>>> -    unsigned int max, tmp;
>>> -
>>> -    max = cpuid_eax(0);
>>> -
>>> -    if ( max >= 1 )
>>> -        cpuid(0x1, &tmp, &tmp,
>>> -              &raw_featureset[FEATURESET_1c],
>>> -              &raw_featureset[FEATURESET_1d]);
>>> -    if ( max >= 7 )
>>> -        cpuid_count(0x7, 0, &tmp,
>>> -                    &raw_featureset[FEATURESET_7b0],
>>> -                    &raw_featureset[FEATURESET_7c0],
>>> -                    &raw_featureset[FEATURESET_7d0]);
>>> -    if ( max >= 0xd )
>>> -        cpuid_count(0xd, 1,
>>> -                    &raw_featureset[FEATURESET_Da1],
>>> -                    &tmp, &tmp, &tmp);
>>> -
>>> -    max = cpuid_eax(0x80000000);
>>> -    if ( (max >> 16) != 0x8000 )
>>> -        return;
>>> +    struct cpuid_policy *p = &host_policy;
>>>  
>>> -    if ( max >= 0x80000001 )
>>> -        cpuid(0x80000001, &tmp, &tmp,
>>> -              &raw_featureset[FEATURESET_e1c],
>>> -              &raw_featureset[FEATURESET_e1d]);
>>> -    if ( max >= 0x80000007 )
>>> -        cpuid(0x80000007, &tmp, &tmp, &tmp,
>>> -              &raw_featureset[FEATURESET_e7d]);
>>> -    if ( max >= 0x80000008 )
>>> -        cpuid(0x80000008, &tmp,
>>> -              &raw_featureset[FEATURESET_e8b],
>>> -              &tmp, &tmp);
>>> +    memcpy(p->fs, boot_cpu_data.x86_capability, sizeof(p->fs));
>> What are the plans for keeping this up-to-date wrt later
>> adjustments to boot_cpu_data.x86_capability?  Wouldn't it be
>> better for the field to be a pointer, and the above to be a simple
>> assignment of &boot_cpu_data.x86_capability?
> 
> The fs field is temporary and removed in patch 20.
> 
> calculate_host_policy() is called immediately before dom0 is
> constructed, which is after AP bringup.  Realistically,
> boot_cpu_data.x86_capability won't be changing by this point, even for
> PCPU hotplug.
> 
>>
>>> +static void __init calculate_pv_max_policy(void)
>>>  {
>>> +    struct cpuid_policy *p = &pv_max_policy;
>> I assume later patches will add further uses of this variable?
> 
> Yes.
> 
>> Otherwise ...
>>
>>> @@ -185,10 +159,12 @@ static void __init calculate_pv_featureset(void)
>>>      __set_bit(X86_FEATURE_CMP_LEGACY, pv_featureset);
>>>  
>>>      sanitise_featureset(pv_featureset);
>>> +    cpuid_featureset_to_policy(pv_featureset, p);
>> ... using &pv_max_policy directly here would seem more friendly
>> to readers.
> 
> Expressing it this way makes shorter diffs along the series.

Okay then:
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



* Re: [PATCH 07/27] x86/cpuid: Recalculate a domains CPUID policy when appropriate
  2017-01-04 15:33     ` Andrew Cooper
@ 2017-01-04 16:04       ` Jan Beulich
  2017-01-04 17:37         ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-04 16:04 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 16:33, <andrew.cooper3@citrix.com> wrote:
> On 04/01/17 15:01, Jan Beulich wrote:
>>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>>> +void recalculate_cpuid_policy(struct domain *d)
>>> +{
>>> +    struct cpuid_policy *p = d->arch.cpuid;
>>> +    const struct cpuid_policy *max =
>>> +        is_pv_domain(d) ? &pv_max_policy : &hvm_max_policy;
>>> +    uint32_t fs[FSCAPINTS], max_fs[FSCAPINTS];
>>> +    unsigned int i;
>>> +
>>> +    cpuid_policy_to_featureset(p, fs);
>>> +    memcpy(max_fs, max->fs, sizeof(max_fs));
>>> +
>>> +    /* Allow a toolstack to possibly select ITSC... */
>>> +    if ( cpu_has_itsc )
>>> +        __set_bit(X86_FEATURE_ITSC, max_fs);
>> This special casing calls for some explanation in the commit message
>> (or the comment here).
> 
> Ah - this logic is all copied from the current dynamic adjustment we
> make in {pv,hvm}_cpuid().  This ITSC one is expressed differently, but
> it should hopefully be obvious in the context of patches 18 and 19.

Well, before making the comment I did check. The clearing of
the flag is a copy of existing code, but the setting of it here
isn't afaics.

>>> +    /* ... but hide ITSC in the common case. */
>>> +    if ( !d->disable_migrate && !d->arch.vtsc )
>>> +        __clear_bit(X86_FEATURE_ITSC, fs);
>> The 32-bit PV logic could easily move below here afaics, reducing
>> the distance between the two parts of the comment.
>>
>> Also this requires adjustment of the policy by (the caller of)
>> tsc_set_info().
> 
> And also XEN_DOMCTL_set_disable_migrate.
> 
> Currently the various toolstacks issue these hypercalls in the correct
> order, so I was planning to ignore these edge cases until the toolstack
> side work (see below).

Let's not do that - it'll be some time until that other work lands,
I assume, and introducing (further) dependencies on tool stacks
to do things in the right order is quite bad imo.

>>>  static void update_domain_cpuid_info(struct domain *d,
>>>                                       const xen_domctl_cpuid_t *ctl)
>>>  {
>>> +    struct cpuid_policy *p = d->arch.cpuid;
>>> +    struct cpuid_leaf leaf = { ctl->eax, ctl->ebx, ctl->ecx, ctl->edx };
>>> +
>>> +    if ( ctl->input[0] < ARRAY_SIZE(p->basic.raw) )
>>> +    {
>>> +        if ( ctl->input[0] == 7 )
>>> +        {
>>> +            if ( ctl->input[1] < ARRAY_SIZE(p->feat.raw) )
>>> +                p->feat.raw[ctl->input[1]] = leaf;
>>> +        }
>>> +        else if ( ctl->input[0] == 0xd )
>>> +        {
>>> +            if ( ctl->input[1] < ARRAY_SIZE(p->xstate.raw) )
>>> +                p->xstate.raw[ctl->input[1]] = leaf;
>>> +        }
>>> +        else
>>> +            p->basic.raw[ctl->input[0]] = leaf;
>>> +    }
>>> +    else if ( (ctl->input[0] - 0x80000000) < ARRAY_SIZE(p->extd.raw) )
>>> +        p->extd.raw[ctl->input[0] - 0x80000000] = leaf;
>> These checks against ARRAY_SIZE() worry me - wouldn't we better
>> refuse any attempts to set values not representable in the policy?
> 
> We can't do that yet, without toolstack-side changes.  Currently the
> toolstack can lodge any values it wishes (which can be arbitrary
> information from a cpuid= clause), and all we do is ignore them.

Hmm, do we really _ignore_ them in all cases (rather than handing
them through to guests)? If so, that should indeed be good enough
for now.

Jan



* Re: [PATCH 08/27] x86/hvm: Dispatch cpuid_viridian_leaves() from guest_cpuid()
  2017-01-04 15:36     ` Andrew Cooper
@ 2017-01-04 16:11       ` Jan Beulich
  0 siblings, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2017-01-04 16:11 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Paul Durrant, Xen-devel

>>> On 04.01.17 at 16:36, <andrew.cooper3@citrix.com> wrote:
> On 04/01/17 15:24, Jan Beulich wrote:
>>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>>> One check against EFER_SVME is replaced with the more appropriate cpu_has_svm,
>>> when determining whether MSR bitmaps are available.
>> I don't think this is correct - start_svm() may fail, in which case
>> the CPUID flag doesn't get cleared, yet EFER.SVME also doesn't
>> get set. How about comparing hvm_funcs (if not NULL) ->name
>> against "SVM"?
> 
> Hmm.  This shows that the same logical bug is present in the vmx side. 

Oh, indeed. I had assumed vmx_secondary_exec_control & Co
would remain zero until after the last failure point, or get zeroed
in case of failure. I think we need to make that happen, or else
I think we have the same problem elsewhere. For the actual
CPUID feature flags, otoh, I don't think we should clear them in
these failure cases though.

Jan



* Re: [PATCH 09/27] x86/cpuid: Dispatch cpuid_hypervisor_leaves() from guest_cpuid()
  2017-01-04 15:40     ` Andrew Cooper
@ 2017-01-04 16:14       ` Jan Beulich
  0 siblings, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2017-01-04 16:14 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 16:40, <andrew.cooper3@citrix.com> wrote:
> On 04/01/17 15:34, Jan Beulich wrote:
>>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>>> @@ -929,83 +927,71 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
>>>              limit = XEN_CPUID_MAX_NUM_LEAVES;
>>>      }
>>>  
>>> -    if ( idx > limit ) 
>>> -        return 0;
>>> +    if ( idx > limit )
>>> +        return;
>>>  
>>>      switch ( idx )
>>>      {
>>>      case 0:
>>> -        *eax = base + limit; /* Largest leaf */
>>> -        *ebx = XEN_CPUID_SIGNATURE_EBX;
>>> -        *ecx = XEN_CPUID_SIGNATURE_ECX;
>>> -        *edx = XEN_CPUID_SIGNATURE_EDX;
>>> +        res->a = base + limit; /* Largest leaf */
>>> +        res->b = XEN_CPUID_SIGNATURE_EBX;
>>> +        res->c = XEN_CPUID_SIGNATURE_ECX;
>>> +        res->d = XEN_CPUID_SIGNATURE_EDX;
>>>          break;
>>>  
>>>      case 1:
>>> -        *eax = (xen_major_version() << 16) | xen_minor_version();
>>> -        *ebx = 0;          /* Reserved */
>>> -        *ecx = 0;          /* Reserved */
>>> -        *edx = 0;          /* Reserved */
>>> +        res->a = (xen_major_version() << 16) | xen_minor_version();
>>>          break;
>>>  
>>>      case 2:
>>> -        *eax = 1;          /* Number of hypercall-transfer pages */
>>> -        *ebx = 0x40000000; /* MSR base address */
>>> -        if ( is_viridian_domain(currd) )
>>> -            *ebx = 0x40000200;
>>> -        *ecx = 0;          /* Features 1 */
>>> -        *edx = 0;          /* Features 2 */
>>> -        if ( is_pv_domain(currd) )
>>> -            *ecx |= XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD;
>>> +        res->a = 1;          /* Number of hypercall-transfer pages */
>>> +        res->b = 0x40000000; /* MSR base address */
>>> +        if ( is_viridian_domain(d) )
>>> +            res->b = 0x40000200;
>> Could I talk you into making this a conditional expression, as you're
>> touching it anyway?
> 
> Ok.  I did find the value of 0x40000200 particularly odd (given that we
> split the CPUID leaves on the 100 boundary), but it has been like that
> for ages.

I guess the assumption was that the MSR space might grow faster
than the CPUID one, and personally I would agree with such a
guess. Not sure whether the Viridian spec perhaps calls out that
value as an upper limit for its MSR space (otherwise I'm not really
clear how we would deal with them growing beyond that boundary).

Jan



* Re: [PATCH 10/27] x86/cpuid: Introduce named feature bitmaps
  2017-01-04 15:44   ` Jan Beulich
@ 2017-01-04 17:21     ` Andrew Cooper
  2017-01-05  8:27       ` Jan Beulich
  0 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 17:21 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 04/01/17 15:44, Jan Beulich wrote:
>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>> Use anonymous unions to access the feature leaves as complete words, and by
>> named individual feature.
>>
>> A feature name is introduced for every architectural X86_FEATURE_*, other than
>> the dynamically calculated values such as APIC, OSXSAVE and OSPKE.
> A rationale for this change would be nice to have here, as the
> redundancy with public/arch-x86/cpufeatureset.h means any
> addition will now need to change two places. Would it be possible
> for gen-cpuid.py to generate these bitfield declarations?

Hmm.  I hadn't considered that as an option.

Thinking about it however, I'd ideally prefer not to hide the
declarations behind a macro.

>
>> --- a/xen/include/asm-x86/cpuid.h
>> +++ b/xen/include/asm-x86/cpuid.h
>> @@ -103,7 +103,25 @@ struct cpuid_policy
>>  
>>              /* Leaf 0x1 - family/model/stepping and features. */
>>              struct {
>> -                uint32_t :32, :32, _1c, _1d;
>> +                uint32_t :32, :32;
>> +                union {
>> +                    uint32_t _1c;
>> +                    struct {
>> +                        bool sse3:1, pclmulqdq:1, dtes64:1, monitor:1, dscpl:1, vmx:1, smx:1, eist:1,
>> +                             tm2:1, ssse3:1, /* cid */:1, /* sdbg */:1, fma:1, cx16:1, xtpr:1, pcdm:1,
>> +                             :1, pcid:1, dca:1, sse4_1:1, sse4_2:1, x2apic:1, movebe:1, popcnt:1,
> movbe
>
>> +                             tsc_deadline:1, aesni:1, xsave:1, /* osxsave */:1, avx:1, f16c:1, rdrand:1, hv:1;
> hypervisor (please make sure the names match up with X86_FEATURE_*).
>
>> @@ -113,7 +131,34 @@ struct cpuid_policy
>>          struct cpuid_leaf raw[CPUID_GUEST_NR_FEAT];
>>          struct {
>>              struct {
>> -                uint32_t max_subleaf, _7b0, _7c0, _7d0;
>> +                uint32_t max_subleaf;
>> +                union {
>> +                    uint32_t _7b0;
>> +                    struct {
>> +                        bool fsgsbase:1, tsc_adjust:1, sgx:1, bmi1:1, hle:1, avx2:1, fdp_excp_only:1, smep:1,
>> +                             bmi2:1, erms:1, invpcid:1, rtm:1, pqm:1, no_fpu_sel:1, mpx:1, pqe:1,
>> +                             avx512f:1, avx512dq:1, rdseed:1, adx:1, smap:1, avx512ifma:1, /* pcommit */:1, clflushopt:1,
>> +                             clwb:1, /* pt */:1, avx512pf:1, avx512er:1, avx512cd:1, sha:1, avx512bw:1, avx512vl:1;
> The commented out entries here don't match the commit message.

Well - they are not yet implemented.

>
>> +                    };
>> +                };
>> +                union {
>> +                    uint32_t _7c0;
>> +                    struct {
>> +                        bool prefetchwt1:1, avx512vbmi:1, :1, pku: 1, :1, :1, :1, :1,
>> +                             :1, :1, :1, :1, :1, :1, :1, :1,
>> +                             :1, :1, :1, :1, :1, :1, :1, :1,
>> +                             :1, :1, :1, :1, :1, :1, :1, :1;
> This is ugly, but I remember you saying (on irc?) the compiler
> doesn't allow bitfields wider than one bit for bool ...

Correct.  I was quite surprised by this, but I can understand that bool
foo:2 is quite meaningless when foo can strictly only take a binary value.

>
>> @@ -126,7 +171,16 @@ struct cpuid_policy
>>                  uint32_t xcr0_low, :32, :32, xcr0_high;
>>              };
>>              struct {
>> -                uint32_t Da1, :32, xss_low, xss_high;
>> +                union {
>> +                    uint32_t Da1;
>> +                    struct {
>> +                        bool xsaveopt: 1, xsavec: 1, xgetbv1: 1, xsaves: 1, :1, :1, :1, :1,
> Why the blanks after the colons?

Older formatting choice.  I will fix up.

~Andrew


* Re: [PATCH 07/27] x86/cpuid: Recalculate a domains CPUID policy when appropriate
  2017-01-04 16:04       ` Jan Beulich
@ 2017-01-04 17:37         ` Andrew Cooper
  2017-01-05  8:24           ` Jan Beulich
  0 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-04 17:37 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 04/01/17 16:04, Jan Beulich wrote:
>>>> On 04.01.17 at 16:33, <andrew.cooper3@citrix.com> wrote:
>> On 04/01/17 15:01, Jan Beulich wrote:
>>>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>>>> +void recalculate_cpuid_policy(struct domain *d)
>>>> +{
>>>> +    struct cpuid_policy *p = d->arch.cpuid;
>>>> +    const struct cpuid_policy *max =
>>>> +        is_pv_domain(d) ? &pv_max_policy : &hvm_max_policy;
>>>> +    uint32_t fs[FSCAPINTS], max_fs[FSCAPINTS];
>>>> +    unsigned int i;
>>>> +
>>>> +    cpuid_policy_to_featureset(p, fs);
>>>> +    memcpy(max_fs, max->fs, sizeof(max_fs));
>>>> +
>>>> +    /* Allow a toolstack to possibly select ITSC... */
>>>> +    if ( cpu_has_itsc )
>>>> +        __set_bit(X86_FEATURE_ITSC, max_fs);
>>> This special casing calls for some explanation in the commit message
>>> (or the comment here).
>> Ah - this logic is all copied from the current dynamic adjustment we
>> make in {pv,hvm}_cpuid().  This ITSC one is expressed differently, but
>> it should hopefully be obvious in the context of patches 18 and 19.
> Well, before making the comment I did check. The clearing of
> the flag is a copy of existing code, but the setting of it here
> isn't afaics.

The relevant hunk is:

     case 0x80000007:
-        d &= (pv_featureset[FEATURESET_e7d] |
-              (host_featureset[FEATURESET_e7d] & cpufeat_mask(X86_FEATURE_ITSC)));
+        d = p->extd.e7d;
         break;

which was introduced as an ITSC bugfix in c/s a5adcee740d.  ITSC is by
default hidden in the featureset offered to toolstacks at the moment,
but we need to cope with the toolstack explicitly setting ITSC after
setting nomigrate.

This will eventually be fixed by having ITSC set in the max_policy but
clear in the default_policy; however, that work is also mixed in with the
toolstack changes.
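
Roughly speaking (illustrative only - default_policy and these featureset
names don't exist yet):

    if ( cpu_has_itsc )
        __set_bit(X86_FEATURE_ITSC, max_policy_fs);    /* selectable ... */
    __clear_bit(X86_FEATURE_ITSC, default_policy_fs);  /* ... but not default */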

>
>>>> +    /* ... but hide ITSC in the common case. */
>>>> +    if ( !d->disable_migrate && !d->arch.vtsc )
>>>> +        __clear_bit(X86_FEATURE_ITSC, fs);
>>> The 32-bit PV logic could easily move below here afaics, reducing
>>> the distance between the two parts of the comment.
>>>
>>> Also this requires adjustment of the policy by (the caller of)
>>> tsc_set_info().
>> And also XEN_DOMCTL_set_disable_migrate.
>>
>> Currently the various toolstacks issues these hypercalls in the correct
>> order, so I was planning to ignore these edge cases until the toolstack
>> side work (see below).
> Let's not do that - it'll be some time until that other work lands,
> I assume, and introducing (further) dependencies on tool stacks
> to do things in the right order is quite bad imo.

This is code which hasn't changed in years.  But if you insist, then I
will see how best to do an x86-only change to the common code.

>
>>>>  static void update_domain_cpuid_info(struct domain *d,
>>>>                                       const xen_domctl_cpuid_t *ctl)
>>>>  {
>>>> +    struct cpuid_policy *p = d->arch.cpuid;
>>>> +    struct cpuid_leaf leaf = { ctl->eax, ctl->ebx, ctl->ecx, ctl->edx };
>>>> +
>>>> +    if ( ctl->input[0] < ARRAY_SIZE(p->basic.raw) )
>>>> +    {
>>>> +        if ( ctl->input[0] == 7 )
>>>> +        {
>>>> +            if ( ctl->input[1] < ARRAY_SIZE(p->feat.raw) )
>>>> +                p->feat.raw[ctl->input[1]] = leaf;
>>>> +        }
>>>> +        else if ( ctl->input[0] == 0xd )
>>>> +        {
>>>> +            if ( ctl->input[1] < ARRAY_SIZE(p->xstate.raw) )
>>>> +                p->xstate.raw[ctl->input[1]] = leaf;
>>>> +        }
>>>> +        else
>>>> +            p->basic.raw[ctl->input[0]] = leaf;
>>>> +    }
>>>> +    else if ( (ctl->input[0] - 0x80000000) < ARRAY_SIZE(p->extd.raw) )
>>>> +        p->extd.raw[ctl->input[0] - 0x80000000] = leaf;
>>> These checks against ARRAY_SIZE() worry me - wouldn't we better
>>> refuse any attempts to set values not representable in the policy?
>> We can't do that yet, without toolstack side changes.  Currently the
>> toolstack can lodge any values it wishes, and all we do is ignore them,
>> which can be arbitrary information from a cpuid= clause.
> Hmm, do we really _ignore_ them in all cases (rather than handing
> them through to guests)? If so, that should indeed be good enough
> for now.

Any arbitrary values can get inserted into the cpuids[] array but,
given your fairly-recent change to check max_leaf, we don't guarantee to
hand the values to a guest.

~Andrew


* Re: [PATCH 13/27] x86/vvmx: Use hvm_cr4_guest_valid_bits() to calculate MSR_IA32_VMX_CR4_FIXED1
  2017-01-04 12:39 ` [PATCH 13/27] x86/vvmx: Use hvm_cr4_guest_valid_bits() to calculate MSR_IA32_VMX_CR4_FIXED1 Andrew Cooper
@ 2017-01-05  2:40   ` Tian, Kevin
  2017-01-05 11:42   ` Jan Beulich
  1 sibling, 0 replies; 93+ messages in thread
From: Tian, Kevin @ 2017-01-05  2:40 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Nakajima, Jun, Jan Beulich

> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Wednesday, January 04, 2017 8:39 PM
> 
> Reuse the logic in hvm_cr4_guest_valid_bits() instead of duplicating it.
> 
> This fixes a bug to do with the handling of X86_CR4_PCE.  The RDPMC
> instruction predates the architectural performance feature, and has been around
> since the P6.  X86_CR4_PCE is like X86_CR4_TSD and only controls whether RDPMC
> is available at cpl!=0, not whether RDPMC is generally unavailable.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Kevin Tian <kevin.tian@intel.com>


* Re: [PATCH 07/27] x86/cpuid: Recalculate a domains CPUID policy when appropriate
  2017-01-04 17:37         ` Andrew Cooper
@ 2017-01-05  8:24           ` Jan Beulich
  2017-01-05 14:42             ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-05  8:24 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 18:37, <andrew.cooper3@citrix.com> wrote:
> On 04/01/17 16:04, Jan Beulich wrote:
>>>>> On 04.01.17 at 16:33, <andrew.cooper3@citrix.com> wrote:
>>> On 04/01/17 15:01, Jan Beulich wrote:
>>>>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>>>>> +    /* ... but hide ITSC in the common case. */
>>>>> +    if ( !d->disable_migrate && !d->arch.vtsc )
>>>>> +        __clear_bit(X86_FEATURE_ITSC, fs);
>>>> The 32-bit PV logic could easily move below here afaics, reducing
>>>> the distance between the two parts of the comment.
>>>>
>>>> Also this requires adjustment of the policy by (the caller of)
>>>> tsc_set_info().
>>> And also XEN_DOMCTL_set_disable_migrate.
>>>
>>> Currently the various toolstacks issue these hypercalls in the correct
>>> order, so I was planning to ignore these edge cases until the toolstack
>>> side work (see below).
>> Let's not do that - it'll be some time until that other work lands,
>> I assume, and introducing (further) dependencies on tool stacks
>> to do things in the right order is quite bad imo.
> 
> This is code which hasn't changed in years.  But if you insist, then I
> will see how best to do an x86-only change to the common code.

The tsc_set_info() adjustment would likely be in x86-specific code, but
the set_disable_migrate one would, as you say, presumably want handling
in/from common code. So unless this would turn out to be a rather
costly change, I'd indeed prefer if you adjusted these.
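
I.e. roughly (sketch only; exact placement to your taste):

    case XEN_DOMCTL_settscinfo:
        /* ... existing handling ... */
        tsc_set_info(d, /* ... */);
        recalculate_cpuid_policy(d);  /* pick up any ITSC change */
        break;

with the disable_migrate path wanting the same treatment.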

>>>>>  static void update_domain_cpuid_info(struct domain *d,
>>>>>                                       const xen_domctl_cpuid_t *ctl)
>>>>>  {
>>>>> +    struct cpuid_policy *p = d->arch.cpuid;
>>>>> +    struct cpuid_leaf leaf = { ctl->eax, ctl->ebx, ctl->ecx, ctl->edx };
>>>>> +
>>>>> +    if ( ctl->input[0] < ARRAY_SIZE(p->basic.raw) )
>>>>> +    {
>>>>> +        if ( ctl->input[0] == 7 )
>>>>> +        {
>>>>> +            if ( ctl->input[1] < ARRAY_SIZE(p->feat.raw) )
>>>>> +                p->feat.raw[ctl->input[1]] = leaf;
>>>>> +        }
>>>>> +        else if ( ctl->input[0] == 0xd )
>>>>> +        {
>>>>> +            if ( ctl->input[1] < ARRAY_SIZE(p->xstate.raw) )
>>>>> +                p->xstate.raw[ctl->input[1]] = leaf;
>>>>> +        }
>>>>> +        else
>>>>> +            p->basic.raw[ctl->input[0]] = leaf;
>>>>> +    }
>>>>> +    else if ( (ctl->input[0] - 0x80000000) < ARRAY_SIZE(p->extd.raw) )
>>>>> +        p->extd.raw[ctl->input[0] - 0x80000000] = leaf;
>>>> These checks against ARRAY_SIZE() worry me - wouldn't we better
>>>> refuse any attempts to set values not representable in the policy?
>>> We can't do that yet, without toolstack-side changes.  Currently the
>>> toolstack can lodge any values it wishes (which can be arbitrary
>>> information from a cpuid= clause), and all we do is ignore them.
>> Hmm, do we really _ignore_ them in all cases (rather than handing
>> them through to guests)? If so, that should indeed be good enough
>> for now.
> 
> Any arbitrary values can get inserted into the cpuids[] array but,
> given your fairly-recent change to check max_leaf, we don't guarantee to
> hand the values to a guest.

"we don't guarantee" != "we guarantee not to"

But my main point here is that a domain's cpuid= may specify a
higher-than-default max leaf, and I think going forward we ought
to still return all zero for those leaves in that case, or else the
overall spirit of whitelisting would get violated.
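
E.g. in guest_cpuid(), for the basic leaves and with p being the domain's
policy (sketch only):

    if ( leaf > min_t(uint32_t, p->basic.max_leaf,
                      ARRAY_SIZE(p->basic.raw) - 1) )
        return; /* *res is already zeroed. */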

Jan



* Re: [PATCH 10/27] x86/cpuid: Introduce named feature bitmaps
  2017-01-04 17:21     ` Andrew Cooper
@ 2017-01-05  8:27       ` Jan Beulich
  2017-01-05 14:53         ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-05  8:27 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 18:21, <andrew.cooper3@citrix.com> wrote:
> On 04/01/17 15:44, Jan Beulich wrote:
>>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>>> Use anonymous unions to access the feature leaves as complete words, and by
>>> named individual feature.
>>>
>>> A feature name is introduced for every architectural X86_FEATURE_*, other than
>>> the dynamically calculated values such as APIC, OSXSAVE and OSPKE.
>> A rationale for this change would be nice to have here, as the
>> redundancy with public/arch-x86/cpufeatureset.h means any
>> addition will now need to change two places. Would it be possible
>> for gen-cpuid.py to generate these bitfield declarations?
> 
> Hmm.  I hadn't considered that as an option.
> 
> Thinking about it however, I'd ideally prefer not to hide the
> declarations behind a macro.

What's wrong with that? It's surely better than having to keep two
pieces of code in sync manually.
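
As a rough illustration of the idea (entirely hypothetical at this point -
neither the macro name nor the mechanism is part of the series), gen-cpuid.py
could emit the declaration alongside the featureset information:

#define CPUID_BITFIELD_7b0                                              \
    bool fsgsbase:1, tsc_adjust:1, sgx:1, bmi1:1, hle:1, avx2:1,        \
         fdp_excp_only:1, smep:1, bmi2:1, erms:1, invpcid:1, rtm:1,     \
         pqm:1, no_fpu_sel:1, mpx:1, pqe:1, avx512f:1, avx512dq:1,      \
         rdseed:1, adx:1, smap:1, avx512ifma:1, /* pcommit */:1,        \
         clflushopt:1, clwb:1, /* pt */:1, avx512pf:1, avx512er:1,      \
         avx512cd:1, sha:1, avx512bw:1, avx512vl:1

which the structure definition would then consume as:

                union {
                    uint32_t _7b0;
                    struct { CPUID_BITFIELD_7b0; };
                };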

>>> @@ -113,7 +131,34 @@ struct cpuid_policy
>>>          struct cpuid_leaf raw[CPUID_GUEST_NR_FEAT];
>>>          struct {
>>>              struct {
>>> -                uint32_t max_subleaf, _7b0, _7c0, _7d0;
>>> +                uint32_t max_subleaf;
>>> +                union {
>>> +                    uint32_t _7b0;
>>> +                    struct {
>>> +                        bool fsgsbase:1, tsc_adjust:1, sgx:1, bmi1:1, hle:1, avx2:1, fdp_excp_only:1, smep:1,
>>> +                             bmi2:1, erms:1, invpcid:1, rtm:1, pqm:1, no_fpu_sel:1, mpx:1, pqe:1,
>>> +                             avx512f:1, avx512dq:1, rdseed:1, adx:1, smap:1, avx512ifma:1, /* pcommit */:1, clflushopt:1,
>>> +                             clwb:1, /* pt */:1, avx512pf:1, avx512er:1, avx512cd:1, sha:1, avx512bw:1, avx512vl:1;
>> The commented out entries here don't match the commit message.
> 
> Well - they are not yet implemented.

Or have been removed, in the case of pcommit. But my point really
was that the commit message might better be relaxed/extended a
little in this regard.

>>> +                    };
>>> +                };
>>> +                union {
>>> +                    uint32_t _7c0;
>>> +                    struct {
>>> +                        bool prefetchwt1:1, avx512vbmi:1, :1, pku: 1, :1, :1, :1, :1,
>>> +                             :1, :1, :1, :1, :1, :1, :1, :1,
>>> +                             :1, :1, :1, :1, :1, :1, :1, :1,
>>> +                             :1, :1, :1, :1, :1, :1, :1, :1;
>> This is ugly, but I remember you saying (on irc?) the compiler
>> doesn't allow bitfields wider than one bit for bool ...
> 
> Correct.  I was quite surprised by this, but I can understand that bool
> foo:2 is quite meaningless when foo can strictly only take a binary value.

Thinking about it another time - what's wrong with using uint32_t
instead of bool here, allowing consecutive unknown fields to be
folded?
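
For reference, the folded form being suggested would look along these lines
(same bit layout as the hunk quoted above; illustration only):

                union {
                    uint32_t _7c0;
                    struct {
                        uint32_t prefetchwt1:1, avx512vbmi:1, :1, pku:1, :28;
                    };
                };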

Jan



* Re: [PATCH 11/27] x86/hvm: Improve hvm_efer_valid() using named features
  2017-01-04 12:39 ` [PATCH 11/27] x86/hvm: Improve hvm_efer_valid() using named features Andrew Cooper
@ 2017-01-05 11:34   ` Jan Beulich
  2017-01-05 14:57     ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 11:34 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -914,56 +914,35 @@ static int hvm_save_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
>  }
>  
>  /* Return a string indicating the error, or NULL for valid. */
> -const char *hvm_efer_valid(const struct vcpu *v, uint64_t value,
> -                           signed int cr0_pg)
> +const char *hvm_efer_valid(const struct vcpu *v, uint64_t value, int cr0_pg)

Please can we keep the "signed" here, to make clear signedness
indeed matters (as opposed to various other uses of plain int we
still have which could equally well be unsigned int)?

Other than that
Reviewed-by: Jan Beulich <jbeulich@suse.com>
albeit I have one more question:

>      if ( (value & EFER_LMSLE) && !cpu_has_lmsl )
>          return "LMSLE without support";

Do you have any plans to include such non-CPUID-based features
into the policy?
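
(For comparison, the CPUID-based checks in this patch move to a named-feature
style roughly as below - the exact field names are assumed rather than quoted
from the series:)

    const struct cpuid_policy *p = v->domain->arch.cpuid;

    if ( (value & EFER_NX) && !p->extd.nx )
        return "NX without feature";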

Jan



* Re: [PATCH 12/27] x86/hvm: Improve CR4 verification using named features
  2017-01-04 12:39 ` [PATCH 12/27] x86/hvm: Improve CR4 verification " Andrew Cooper
@ 2017-01-05 11:39   ` Jan Beulich
  0 siblings, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 11:39 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> Alter the function to return the valid CR4 bits, rather than the invalid CR4
> bits.  This will allow reuse in other areas of code.
> 
> Pick the appropriate cpuid_policy object rather than using hvm_cpuid() or
> boot_cpu_data.  This breaks the dependency on current.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>



* Re: [PATCH 13/27] x86/vvmx: Use hvm_cr4_guest_valid_bits() to calculate MSR_IA32_VMX_CR4_FIXED1
  2017-01-04 12:39 ` [PATCH 13/27] x86/vvmx: Use hvm_cr4_guest_valid_bits() to calculate MSR_IA32_VMX_CR4_FIXED1 Andrew Cooper
  2017-01-05  2:40   ` Tian, Kevin
@ 2017-01-05 11:42   ` Jan Beulich
  1 sibling, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 11:42 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Kevin Tian, Jun Nakajima, Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> Reuse the logic in hvm_cr4_guest_valid_bits() instead of duplicating it.
> 
> This fixes a bug to do with the handling of X86_CR4_PCE.  The RDPMC
> instruction predates the architectural performance feature, and has been around
> since the P6.  X86_CR4_PCE is like X86_CR4_TSD and only controls whether RDPMC
> is available at cpl!=0, not whether RDPMC is generally unavailable.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Nice!

Reviewed-by: Jan Beulich <jbeulich@suse.com>
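
The reuse being reviewed boils down to something like the following in the
nested-VMX MSR emulation (a sketch; the precise parameter list of
hvm_cr4_guest_valid_bits() is assumed):

    case MSR_IA32_VMX_CR4_FIXED1:
        /* A guest may set any CR4 bit which is valid for it. */
        data = hvm_cr4_guest_valid_bits(v, 0);
        break;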



* Re: [PATCH 14/27] x86/pv: Improve pv_cpuid() using named features
  2017-01-04 12:39 ` [PATCH 14/27] x86/pv: Improve pv_cpuid() using named features Andrew Cooper
@ 2017-01-05 11:43   ` Jan Beulich
  0 siblings, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 11:43 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> This avoids referring back to domain_cpuid() or native CPUID to obtain
> information which is directly available.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>



* Re: [PATCH 15/27] x86/hvm: Improve CPUID and MSR handling using named features
  2017-01-04 12:39 ` [PATCH 15/27] x86/hvm: Improve CPUID and MSR handling " Andrew Cooper
@ 2017-01-05 12:06   ` Jan Beulich
  0 siblings, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 12:06 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> This avoids hvm_cpuid() recursing into itself, and the MSR paths using
> hvm_cpuid() to obtain information which is directly available.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>



* Re: [PATCH 17/27] x86/pv: Use per-domain policy information when calculating the cpumasks
  2017-01-04 12:39 ` [PATCH 17/27] x86/pv: Use per-domain policy information when calculating the cpumasks Andrew Cooper
@ 2017-01-05 12:23   ` Jan Beulich
  2017-01-05 12:24     ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 12:23 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> ... rather than dynamically claming against the PV maximum policy.

What is "claming"? I'm even having a hard time guessing what you
may have meant - clamping maybe?

> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>



* Re: [PATCH 17/27] x86/pv: Use per-domain policy information when calculating the cpumasks
  2017-01-05 12:23   ` Jan Beulich
@ 2017-01-05 12:24     ` Andrew Cooper
  0 siblings, 0 replies; 93+ messages in thread
From: Andrew Cooper @ 2017-01-05 12:24 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 05/01/17 12:23, Jan Beulich wrote:
>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>> ... rather than dynamically claming against the PV maximum policy.
> What is "claming"? I'm even having a hard time guessing what you
> may have meant - clamping maybe?

Yes.  I did mean clamping and had already fixed this up locally.

>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>

Thanks,

~Andrew


* Re: [PATCH 18/27] x86/pv: Use per-domain policy information in pv_cpuid()
  2017-01-04 12:39 ` [PATCH 18/27] x86/pv: Use per-domain policy information in pv_cpuid() Andrew Cooper
@ 2017-01-05 12:44   ` Jan Beulich
  2017-01-05 12:46     ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 12:44 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1065,11 +1065,8 @@ void pv_cpuid(struct cpu_user_regs *regs)
>          uint32_t tmp;
>  
>      case 0x00000001:
> -        c &= pv_featureset[FEATURESET_1c];
> -        d &= pv_featureset[FEATURESET_1d];
> -
> -        if ( is_pv_32bit_domain(currd) )
> -            c &= ~cpufeat_mask(X86_FEATURE_CX16);
> +        c = p->basic._1c;
> +        d = p->basic._1d;

Being able to drop the clearing of CX16 is because it depends on
LM, and LM gets cleared explicitly in recalculate_cpuid_policy(). If
that's right (and intended that way),
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



* Re: [PATCH 18/27] x86/pv: Use per-domain policy information in pv_cpuid()
  2017-01-05 12:44   ` Jan Beulich
@ 2017-01-05 12:46     ` Andrew Cooper
  0 siblings, 0 replies; 93+ messages in thread
From: Andrew Cooper @ 2017-01-05 12:46 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 05/01/17 12:44, Jan Beulich wrote:
>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>> --- a/xen/arch/x86/traps.c
>> +++ b/xen/arch/x86/traps.c
>> @@ -1065,11 +1065,8 @@ void pv_cpuid(struct cpu_user_regs *regs)
>>          uint32_t tmp;
>>  
>>      case 0x00000001:
>> -        c &= pv_featureset[FEATURESET_1c];
>> -        d &= pv_featureset[FEATURESET_1d];
>> -
>> -        if ( is_pv_32bit_domain(currd) )
>> -            c &= ~cpufeat_mask(X86_FEATURE_CX16);
>> +        c = p->basic._1c;
>> +        d = p->basic._1d;
> Being able to drop the clearing of CX16 is because it depends on
> LM, and LM gets cleared explicitly in recalculate_cpuid_policy(). If
> that's right (and intended that way),
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Yes.  I will update the commit message to explicitly state that
recalculate_cpuid_policy() performs these adjustments when the policy is
loaded.

~Andrew


* Re: [PATCH 19/27] x86/hvm: Use per-domain policy information in hvm_cpuid()
  2017-01-04 12:39 ` [PATCH 19/27] x86/hvm: Use per-domain policy information in hvm_cpuid() Andrew Cooper
@ 2017-01-05 12:55   ` Jan Beulich
  2017-01-05 13:03     ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 12:55 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -3335,39 +3335,33 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
>          *ebx &= 0x00FFFFFFu;
>          *ebx |= (v->vcpu_id * 2) << 24;
>  
> -        *ecx &= hvm_featureset[FEATURESET_1c];
> -        *edx &= hvm_featureset[FEATURESET_1d];
> +        *ecx = p->basic._1c;
> +        *edx = p->basic._1d;
>  
>          /* APIC exposed to guests, but Fast-forward MSR_APIC_BASE.EN back in. */
>          if ( vlapic_hw_disabled(vcpu_vlapic(v)) )
>              *edx &= ~cpufeat_bit(X86_FEATURE_APIC);
>  
> -        /* OSXSAVE cleared by hvm_featureset.  Fast-forward CR4 back in. */
> +        /* OSXSAVE clear in policy.  Fast-forward CR4 back in. */
>          if ( v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE )
>              *ecx |= cpufeat_mask(X86_FEATURE_OSXSAVE);
>  
> -        /* Don't expose HAP-only features to non-hap guests. */
> -        if ( !hap_enabled(d) )
> -        {
> -            *ecx &= ~cpufeat_mask(X86_FEATURE_PCID);
> -
> -            /*
> -             * PSE36 is not supported in shadow mode.  This bit should be
> -             * unilaterally cleared.
> -             *
> -             * However, an unspecified version of Hyper-V from 2011 refuses
> -             * to start as the "cpu does not provide required hw features" if
> -             * it can't see PSE36.
> -             *
> -             * As a workaround, leak the toolstack-provided PSE36 value into a
> -             * shadow guest if the guest is already using PAE paging (and
> -             * won't care about reverting back to PSE paging).  Otherwise,
> -             * knoble it, so a 32bit guest doesn't get the impression that it
> -             * could try to use PSE36 paging.
> -             */
> -            if ( !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
> -                *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
> -        }
> +        /*
> +         * PSE36 is not supported in shadow mode.  This bit should be
> +         * unilaterally cleared.
> +         *
> +         * However, an unspecified version of Hyper-V from 2011 refuses
> +         * to start as the "cpu does not provide required hw features" if
> +         * it can't see PSE36.
> +         *
> +         * As a workaround, leak the toolstack-provided PSE36 value into a
> +         * shadow guest if the guest is already using PAE paging (and won't
> +         * care about reverting back to PSE paging).  Otherwise, knoble it, so
> +         * a 32bit guest doesn't get the impression that it could try to use
> +         * PSE36 paging.
> +         */
> +        if ( !hap_enabled(d) && !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
> +            *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);

The PSE36 part of this is fine, but the PCID dropping (as well as the
PKU part below) made me go look back at patch 7: You AND together
hvm_max_policy.fs[] and hvm_shadow_featuremask[], which aren't
quite the same (the equivalent of the latter would be
hvm_hap_featuremask[]). Don't we risk wrongly hiding features
in shadow mode this way, at least as soon as max != default?

Since that would rather affect the other patch, this one is
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



* Re: [PATCH 19/27] x86/hvm: Use per-domain policy information in hvm_cpuid()
  2017-01-05 12:55   ` Jan Beulich
@ 2017-01-05 13:03     ` Andrew Cooper
  0 siblings, 0 replies; 93+ messages in thread
From: Andrew Cooper @ 2017-01-05 13:03 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 05/01/17 12:55, Jan Beulich wrote:
>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -3335,39 +3335,33 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
>>          *ebx &= 0x00FFFFFFu;
>>          *ebx |= (v->vcpu_id * 2) << 24;
>>  
>> -        *ecx &= hvm_featureset[FEATURESET_1c];
>> -        *edx &= hvm_featureset[FEATURESET_1d];
>> +        *ecx = p->basic._1c;
>> +        *edx = p->basic._1d;
>>  
>>          /* APIC exposed to guests, but Fast-forward MSR_APIC_BASE.EN back in. */
>>          if ( vlapic_hw_disabled(vcpu_vlapic(v)) )
>>              *edx &= ~cpufeat_bit(X86_FEATURE_APIC);
>>  
>> -        /* OSXSAVE cleared by hvm_featureset.  Fast-forward CR4 back in. */
>> +        /* OSXSAVE clear in policy.  Fast-forward CR4 back in. */
>>          if ( v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE )
>>              *ecx |= cpufeat_mask(X86_FEATURE_OSXSAVE);
>>  
>> -        /* Don't expose HAP-only features to non-hap guests. */
>> -        if ( !hap_enabled(d) )
>> -        {
>> -            *ecx &= ~cpufeat_mask(X86_FEATURE_PCID);
>> -
>> -            /*
>> -             * PSE36 is not supported in shadow mode.  This bit should be
>> -             * unilaterally cleared.
>> -             *
>> -             * However, an unspecified version of Hyper-V from 2011 refuses
>> -             * to start as the "cpu does not provide required hw features" if
>> -             * it can't see PSE36.
>> -             *
>> -             * As a workaround, leak the toolstack-provided PSE36 value into a
>> -             * shadow guest if the guest is already using PAE paging (and
>> -             * won't care about reverting back to PSE paging).  Otherwise,
>> -             * knoble it, so a 32bit guest doesn't get the impression that it
>> -             * could try to use PSE36 paging.
>> -             */
>> -            if ( !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
>> -                *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
>> -        }
>> +        /*
>> +         * PSE36 is not supported in shadow mode.  This bit should be
>> +         * unilaterally cleared.
>> +         *
>> +         * However, an unspecified version of Hyper-V from 2011 refuses
>> +         * to start as the "cpu does not provide required hw features" if
>> +         * it can't see PSE36.
>> +         *
>> +         * As a workaround, leak the toolstack-provided PSE36 value into a
>> +         * shadow guest if the guest is already using PAE paging (and won't
>> +         * care about reverting back to PSE paging).  Otherwise, knoble it, so
>> +         * a 32bit guest doesn't get the impression that it could try to use
>> +         * PSE36 paging.
>> +         */
>> +        if ( !hap_enabled(d) && !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
>> +            *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
> The PSE36 part of this is fine, but the PCID dropping (as well as the
> PKU part below) made me go look back at patch 7: You AND together
> hvm_max_policy.fs[] and hvm_shadow_featuremask[], which aren't
> quite the same (the equivalent of the latter would be
> hvm_hap_featuremask[]). Don't we risk wrongly hiding features
> in shadow mode this way, at least as soon as max != default?

hvm_shadow_featuremask[] is strictly a subset of hvm_hap_featuremask[],
by virtue of A/S/H annotations in cpufeatureset.h.

hvm_max_policy.fs[] is most likely the HAP set, but might be Shadow if
HAP is entirely unavailable on the host.

I don't foresee a situation where shadow would ever be more featureful
than hap.  The only differences are paging-related features.

~Andrew


* Re: [PATCH 20/27] x86/cpuid: Drop the temporary linear feature bitmap from struct cpuid_policy
  2017-01-04 12:39 ` [PATCH 20/27] x86/cpuid: Drop the temporary linear feature bitmap from struct cpuid_policy Andrew Cooper
@ 2017-01-05 13:07   ` Jan Beulich
  2017-01-05 13:12     ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 13:07 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>  static void __init calculate_pv_max_policy(void)
>  {
>      struct cpuid_policy *p = &pv_max_policy;
> +    uint32_t pv_featureset[FSCAPINTS], host_featureset[FSCAPINTS];
>      unsigned int i;
>  
> +    cpuid_policy_to_featureset(&host_policy, host_featureset);
> +
>      for ( i = 0; i < FSCAPINTS; ++i )
>          pv_featureset[i] = host_featureset[i] & pv_featuremask[i];

While at init time we shouldn't be tight on stack space, it would still
feel better if you didn't put two such (growing in the future) arrays
on the stack. Would you consider it unreasonable to do

    cpuid_policy_to_featureset(&host_policy, pv_featureset);
    for ( i = 0; i < FSCAPINTS; ++i )
        pv_featureset[i] &= pv_featuremask[i];

(and then similarly for HVM)? Either way
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



* Re: [PATCH 20/27] x86/cpuid: Drop the temporary linear feature bitmap from struct cpuid_policy
  2017-01-05 13:07   ` Jan Beulich
@ 2017-01-05 13:12     ` Andrew Cooper
  0 siblings, 0 replies; 93+ messages in thread
From: Andrew Cooper @ 2017-01-05 13:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 05/01/17 13:07, Jan Beulich wrote:
>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>>  static void __init calculate_pv_max_policy(void)
>>  {
>>      struct cpuid_policy *p = &pv_max_policy;
>> +    uint32_t pv_featureset[FSCAPINTS], host_featureset[FSCAPINTS];
>>      unsigned int i;
>>  
>> +    cpuid_policy_to_featureset(&host_policy, host_featureset);
>> +
>>      for ( i = 0; i < FSCAPINTS; ++i )
>>          pv_featureset[i] = host_featureset[i] & pv_featuremask[i];
> While at init time we shouldn't be tight on stack space, it would still
> feel better if you didn't put two such (growing in the future) arrays
> on the stack. Would you consider it unreasonable to do
>
>     cpuid_policy_to_featureset(&host_policy, pv_featureset);
>     for ( i = 0; i < FSCAPINTS; ++i )
>         pv_featureset[i] &= pv_featuremask[i];
>
> (and then similarly for HVM)?

The following patch does (basically) this, and drops one of the arrays.

~Andrew


* Re: [PATCH 21/27] x86/cpuid: Calculate appropriate max_leaf values for the global policies
  2017-01-04 12:39 ` [PATCH 21/27] x86/cpuid: Calculate appropriate max_leaf values for the global policies Andrew Cooper
@ 2017-01-05 13:43   ` Jan Beulich
  2017-01-05 14:13     ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 13:43 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/include/asm-x86/cpuid.h
> +++ b/xen/include/asm-x86/cpuid.h
> @@ -78,10 +78,10 @@ struct cpuid_policy
>       * Global *_policy objects:
>       *
>       * - Host accurate:
> -     *   - max_{,sub}leaf
>       *   - {xcr0,xss}_{high,low}
>       *
>       * - Guest appropriate:
> +     *   - max_{,sub}leaf
>       *   - All FEATURESET_* words

I can see the point of the addition, but why the removal?

Jan



* Re: [PATCH 22/27] x86/cpuid: Perform max_leaf calculations in guest_cpuid()
  2017-01-04 12:39 ` [PATCH 22/27] x86/cpuid: Perform max_leaf calculations in guest_cpuid() Andrew Cooper
@ 2017-01-05 13:51   ` Jan Beulich
  2017-01-05 14:28     ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 13:51 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> @@ -306,6 +310,9 @@ void recalculate_cpuid_policy(struct domain *d)
>      if ( !d->disable_migrate && !d->arch.vtsc )
>          __clear_bit(X86_FEATURE_ITSC, fs);
>  
> +    if ( p->basic.max_leaf < 0xd )

XSTATE_CPUID

> @@ -333,21 +340,50 @@ void guest_cpuid(const struct vcpu *v, unsigned int leaf,
>                   unsigned int subleaf, struct cpuid_leaf *res)
>  {
>      const struct domain *d = v->domain;
> +    const struct cpuid_policy *p = d->arch.cpuid;
>  
>      *res = EMPTY_LEAF;
>  
>      /*
>       * First pass:
>       * - Dispatch the virtualised leaves to their respective handlers.
> +     * - Perform max_leaf/subleaf calculations, maybe returning early.
>       */
>      switch ( leaf )
>      {
> +    case 0x0 ... 0x6:
> +    case 0x8 ... 0xc:
> +#if 0 /* For when CPUID_GUEST_NR_BASIC isn't 0xd */
> +    case 0xe ... CPUID_GUEST_NR_BASIC - 1:
> +#endif

Perhaps have a BUILD_BUG_ON() in an #else here?

> +        if ( leaf > p->basic.max_leaf )
> +            return;
> +        break;
> +
> +    case 0x7:
> +        if ( subleaf > p->feat.max_subleaf )
> +            return;
> +        break;
> +
> +    case 0xd:

XSTATE_CPUID again, which raises the question whether switch()
really is the best way to deal with things here.

> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -3305,27 +3305,6 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
>      if ( !edx )
>          edx = &dummy;
>  
> -    if ( input & 0x7fffffff )
> -    {
> -        /*
> -         * Requests outside the supported leaf ranges return zero on AMD
> -         * and the highest basic leaf output on Intel. Uniformly follow
> -         * the AMD model as the more sane one.
> -         */

I think this comment would better be moved instead of deleted.

Jan



* Re: [PATCH 23/27] x86/cpuid: Move all leaf 7 handling into guest_cpuid()
  2017-01-04 12:39 ` [PATCH 23/27] x86/cpuid: Move all leaf 7 handling into guest_cpuid() Andrew Cooper
@ 2017-01-05 14:01   ` Jan Beulich
  2017-01-05 14:39     ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 14:01 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> @@ -380,14 +385,42 @@ void guest_cpuid(const struct vcpu *v, unsigned int leaf,
>      case 0x80000000 ... 0x80000000 + CPUID_GUEST_NR_EXTD - 1:
>          if ( leaf > p->extd.max_leaf )
>              return;
> -        break;
> +        goto legacy;
>  
>      default:
>          return;
>      }
>  
> +    /* Skip dynamic adjustments if we are in the wrong context. */
> +    if ( v != curr )
> +        return;
> +
> +    /*
> +     * Second pass:
> +     * - Dynamic adjustments
> +     */
> +    switch ( leaf )
> +    {
> +    case 0x7:
> +        switch ( subleaf )
> +        {
> +        case 0:
> +            /* OSPKE clear in policy.  Fast-forward CR4 back in. */
> +            if ( (is_pv_vcpu(v)
> +                  ? v->arch.pv_vcpu.ctrlreg[4]
> +                  : v->arch.hvm_vcpu.guest_cr[4]) & X86_CR4_PKE )
> +                res->c |= cpufeat_mask(X86_FEATURE_OSPKE);

What's wrong with doing this adjustment when v != curr? By
the time the caller looks at the result, the state of guest
software controlled bits can't be relied upon anyway. Which
then raises the question whether a second switch() statement
for a second pass is all that useful in the first place (I
realize this may depend on future plans of yours).

Jan



* Re: [PATCH 24/27] x86/hvm: Use guest_cpuid() rather than hvm_cpuid()
  2017-01-04 12:39 ` [PATCH 24/27] x86/hvm: Use guest_cpuid() rather than hvm_cpuid() Andrew Cooper
@ 2017-01-05 14:02   ` Jan Beulich
  0 siblings, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 14:02 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> More work is required before maxphysaddr can be read straight out of the
> cpuid_policy block, but in the meantime hvm_cpuid() wants to disappear so
> update the code to use the newer interface.
> 
> Use the behaviour of max_leaf handling (returning all zeros) to avoid a double
> call into guest_cpuid().
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>



* Re: [PATCH 25/27] x86/svm: Use guest_cpuid() rather than hvm_cpuid()
  2017-01-04 12:39 ` [PATCH 25/27] x86/svm: " Andrew Cooper
  2017-01-04 15:26   ` Boris Ostrovsky
@ 2017-01-05 14:04   ` Jan Beulich
  1 sibling, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 14:04 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Boris Ostrovsky, Suravee Suthikulpanit, Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> More work is required before LWP details can be read straight out of the
> cpuid_policy block, but in the meantime hvm_cpuid() wants to disappear so
> update the code to use the newer interface.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
albeit ...

> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -926,17 +926,17 @@ static inline void svm_lwp_load(struct vcpu *v)
> /* Update LWP_CFG MSR (0xc0000105). Return -1 if error; otherwise returns 0. */
>  static int svm_update_lwp_cfg(struct vcpu *v, uint64_t msr_content)
>  {
> -    unsigned int edx;
> +    struct cpuid_leaf res;
>      uint32_t msr_low;
>      static uint8_t lwp_intr_vector;
>  
>      if ( xsave_enabled(v) && cpu_has_lwp )
>      {
> -        hvm_cpuid(0x8000001c, NULL, NULL, NULL, &edx);
> +        guest_cpuid(v, 0x8000001c, 0, &res);
>          msr_low = (uint32_t)msr_content;
>          
>          /* generate #GP if guest tries to turn on unsupported features. */
> -        if ( msr_low & ~edx)
> +        if ( msr_low & ~res.d)
>              return -1;

... please consider moving res into the inner scope.
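
i.e. something along the lines of (a sketch of the requested tweak, applied
to the hunk above):

    if ( xsave_enabled(v) && cpu_has_lwp )
    {
        struct cpuid_leaf res;       /* scope narrowed as suggested */

        guest_cpuid(v, 0x8000001c, 0, &res);
        msr_low = (uint32_t)msr_content;

        /* generate #GP if guest tries to turn on unsupported features. */
        if ( msr_low & ~res.d )
            return -1;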

Jan



* Re: [PATCH 26/27] x86/cpuid: Effectively remove pv_cpuid() and hvm_cpuid()
  2017-01-04 12:39 ` [PATCH 26/27] x86/cpuid: Effectively remove pv_cpuid() and hvm_cpuid() Andrew Cooper
@ 2017-01-05 14:06   ` Jan Beulich
  2017-01-05 14:11     ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 14:06 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> All callers of pv_cpuid() and hvm_cpuid() (other than guest_cpuid() legacy
> path) have been removed from the codebase.  Move them into cpuid.c to avoid
> any further use, leaving guest_cpuid() as the sole API to use.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Assuming this is only code movement with no further editing
(other than possibly for coding style)
Acked-by: Jan Beulich <jbeulich@suse.com>

Jan



* Re: [PATCH 26/27] x86/cpuid: Effectively remove pv_cpuid() and hvm_cpuid()
  2017-01-05 14:06   ` Jan Beulich
@ 2017-01-05 14:11     ` Andrew Cooper
  0 siblings, 0 replies; 93+ messages in thread
From: Andrew Cooper @ 2017-01-05 14:11 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 05/01/17 14:06, Jan Beulich wrote:
>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>> All callers of pv_cpuid() and hvm_cpuid() (other than guest_cpuid() legacy
>> path) have been removed from the codebase.  Move them into cpuid.c to avoid
>> any further use, leaving guest_cpuid() as the sole API to use.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Assuming this is only code movement with no further editing
> (other than possibly for coding style)
> Acked-by: Jan Beulich <jbeulich@suse.com>

Literally a straight move without any adjustments, other than a static
qualifier.

~Andrew


* Re: [PATCH 21/27] x86/cpuid: Calculate appropriate max_leaf values for the global policies
  2017-01-05 13:43   ` Jan Beulich
@ 2017-01-05 14:13     ` Andrew Cooper
  2017-01-05 14:24       ` Jan Beulich
  0 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-05 14:13 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 05/01/17 13:43, Jan Beulich wrote:
>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>> --- a/xen/include/asm-x86/cpuid.h
>> +++ b/xen/include/asm-x86/cpuid.h
>> @@ -78,10 +78,10 @@ struct cpuid_policy
>>       * Global *_policy objects:
>>       *
>>       * - Host accurate:
>> -     *   - max_{,sub}leaf
>>       *   - {xcr0,xss}_{high,low}
>>       *
>>       * - Guest appropriate:
>> +     *   - max_{,sub}leaf
>>       *   - All FEATURESET_* words
> I can see the point of the addition, but why the removal?

Because the max_{,sub}leaf fields are no longer host-accurate.  They are
min(host, tolerated policy).

~Andrew


* Re: [PATCH 27/27] x86/cpuid: Alter the legacy-path prototypes to match guest_cpuid()
  2017-01-04 12:39 ` [PATCH 27/27] x86/cpuid: Alter the legacy-path prototypes to match guest_cpuid() Andrew Cooper
@ 2017-01-05 14:19   ` Jan Beulich
  2017-01-05 15:09     ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 14:19 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
> Here and elsewhere, it becomes very obvious that the PVH path using
> pv_cpuid() is broken, as the guest_kernel_mode() check using
> guest_cpu_user_regs() is erroneous.  I am tempted to just switch PVH onto the
> HVM path, which won't make it any more broken than it currently is.

Are you sure? There was a reason it had been done this way back then.

> --- a/xen/arch/x86/cpuid.c
> +++ b/xen/arch/x86/cpuid.c
> @@ -337,30 +337,26 @@ int init_domain_cpuid_policy(struct domain *d)
>      return 0;
>  }
>  
> -static void pv_cpuid(struct cpu_user_regs *regs)
> +static void pv_cpuid(unsigned int leaf, unsigned int subleaf,
> +                     struct cpuid_leaf *res)
>  {
> -    uint32_t leaf, subleaf, a, b, c, d;
> +    const struct cpu_user_regs *regs = guest_cpu_user_regs();

Please consider moving this into the !is_pvh_domain() scope,
open coding the one access outside of that.

> @@ -538,33 +534,33 @@ static void pv_cpuid(struct cpu_user_regs *regs)
>                                    xstate_sizes[_XSTATE_HI_ZMM]);
>              }
>  
> -            a = (uint32_t)xfeature_mask;
> -            d = (uint32_t)(xfeature_mask >> 32);
> -            c = xstate_size;
> +            res->a = (uint32_t)xfeature_mask;
> +            res->d = (uint32_t)(xfeature_mask >> 32);
> +            res->c = xstate_size;

Please consider at once dropping these pointless casts (also on the
HVM side then).
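
i.e. simply relying on the implicit truncation to the 32-bit leaf fields:

            res->a = xfeature_mask;
            res->d = xfeature_mask >> 32;
            res->c = xstate_size;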

> @@ -945,27 +927,7 @@ void guest_cpuid(const struct vcpu *v, unsigned int leaf,
>   legacy:
>      /* {pv,hvm}_cpuid() have this expectation. */
>      ASSERT(v == curr);
> -
> -    if ( is_pv_vcpu(v) || is_pvh_vcpu(v) )
> -    {
> -        struct cpu_user_regs regs = *guest_cpu_user_regs();
> -
> -        regs.rax = leaf;
> -        regs.rcx = subleaf;
> -
> -        pv_cpuid(&regs);
> -
> -        res->a = regs._eax;
> -        res->b = regs._ebx;
> -        res->c = regs._ecx;
> -        res->d = regs._edx;
> -    }
> -    else
> -    {
> -        res->c = subleaf;
> -
> -        hvm_cpuid(leaf, &res->a, &res->b, &res->c, &res->d);
> -    }
> +    (is_pv_vcpu(v) || is_pvh_vcpu(v) ? pv_cpuid : hvm_cpuid)(leaf, subleaf, res);

Afaics as of patch 8 you have v->domain already latched into a
local variable, so please use is_*_domain() here.
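
i.e. (a one-line sketch of the requested adjustment):

    (is_pv_domain(d) || is_pvh_domain(d) ? pv_cpuid : hvm_cpuid)(leaf, subleaf, res);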

In any event
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



* Re: [PATCH 21/27] x86/cpuid: Calculate appropriate max_leaf values for the global policies
  2017-01-05 14:13     ` Andrew Cooper
@ 2017-01-05 14:24       ` Jan Beulich
  0 siblings, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 14:24 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 05.01.17 at 15:13, <andrew.cooper3@citrix.com> wrote:
> On 05/01/17 13:43, Jan Beulich wrote:
>>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>>> --- a/xen/include/asm-x86/cpuid.h
>>> +++ b/xen/include/asm-x86/cpuid.h
>>> @@ -78,10 +78,10 @@ struct cpuid_policy
>>>       * Global *_policy objects:
>>>       *
>>>       * - Host accurate:
>>> -     *   - max_{,sub}leaf
>>>       *   - {xcr0,xss}_{high,low}
>>>       *
>>>       * - Guest appropriate:
>>> +     *   - max_{,sub}leaf
>>>       *   - All FEATURESET_* words
>> I can see the point of the addition, but why the removal?
> 
> Because the max_{,sub}leaf fields are no longer host-accurate.  They are
> min(host, tolerated policy).

Oh, I see.

Reviewed-by: Jan Beulich <jbeulich@suse.com>



* Re: [PATCH 22/27] x86/cpuid: Perform max_leaf calculations in guest_cpuid()
  2017-01-05 13:51   ` Jan Beulich
@ 2017-01-05 14:28     ` Andrew Cooper
  2017-01-05 14:52       ` Jan Beulich
  0 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-05 14:28 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 05/01/17 13:51, Jan Beulich wrote:
>
>> @@ -333,21 +340,50 @@ void guest_cpuid(const struct vcpu *v, unsigned int leaf,
>>                   unsigned int subleaf, struct cpuid_leaf *res)
>>  {
>>      const struct domain *d = v->domain;
>> +    const struct cpuid_policy *p = d->arch.cpuid;
>>  
>>      *res = EMPTY_LEAF;
>>  
>>      /*
>>       * First pass:
>>       * - Dispatch the virtualised leaves to their respective handlers.
>> +     * - Perform max_leaf/subleaf calculations, maybe returning early.
>>       */
>>      switch ( leaf )
>>      {
>> +    case 0x0 ... 0x6:
>> +    case 0x8 ... 0xc:
>> +#if 0 /* For when CPUID_GUEST_NR_BASIC isn't 0xd */
>> +    case 0xe ... CPUID_GUEST_NR_BASIC - 1:
>> +#endif
> Perhaps have a BUILD_BUG_ON() in an #else here?

The presence of this was to be a reminder to whoever tries upping
max_leaf beyond 0xd.  Then again, there is a reasonable chance it will
be me.

I am half tempted to leave it out.

>
>> +        if ( leaf > p->basic.max_leaf )
>> +            return;
>> +        break;
>> +
>> +    case 0x7:
>> +        if ( subleaf > p->feat.max_subleaf )
>> +            return;
>> +        break;
>> +
>> +    case 0xd:
> XSTATE_CPUID again,

I considered this, but having a mix of named and numbered leaves is worse
than having them uniformly numbered, especially when visually checking
the conditions around the #if 0 case above.

I had considered making a cpuid-index.h for leaf names, but most leaves
are more commonly referred to by number than name, so I am really not
sure if that would be helpful or a hindrance in the long run.

> which raises the question whether switch() really is the best way to deal with things here.

What else would you suggest?  One way or another (better shown in the
context of the following patch), we need one block per union{} to apply
max_leaf calculations and read the base data from p->$FOO.raw[$IDX].

>
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -3305,27 +3305,6 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
>>      if ( !edx )
>>          edx = &dummy;
>>  
>> -    if ( input & 0x7fffffff )
>> -    {
>> -        /*
>> -         * Requests outside the supported leaf ranges return zero on AMD
>> -         * and the highest basic leaf output on Intel. Uniformly follow
>> -         * the AMD model as the more sane one.
>> -         */
> I think this comment would better be moved instead of deleted.

Where would you like it?  It doesn't have an easy logical place to live
in guest_cpuid().  The best I can think of is probably as an extension
of the "First Pass" comment.

~Andrew


* Re: [PATCH 23/27] x86/cpuid: Move all leaf 7 handling into guest_cpuid()
  2017-01-05 14:01   ` Jan Beulich
@ 2017-01-05 14:39     ` Andrew Cooper
  2017-01-05 14:55       ` Jan Beulich
  0 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-05 14:39 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 05/01/17 14:01, Jan Beulich wrote:
>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>> @@ -380,14 +385,42 @@ void guest_cpuid(const struct vcpu *v, unsigned int leaf,
>>      case 0x80000000 ... 0x80000000 + CPUID_GUEST_NR_EXTD - 1:
>>          if ( leaf > p->extd.max_leaf )
>>              return;
>> -        break;
>> +        goto legacy;
>>  
>>      default:
>>          return;
>>      }
>>  
>> +    /* Skip dynamic adjustments if we are in the wrong context. */
>> +    if ( v != curr )
>> +        return;
>> +
>> +    /*
>> +     * Second pass:
>> +     * - Dynamic adjustments
>> +     */
>> +    switch ( leaf )
>> +    {
>> +    case 0x7:
>> +        switch ( subleaf )
>> +        {
>> +        case 0:
>> +            /* OSPKE clear in policy.  Fast-forward CR4 back in. */
>> +            if ( (is_pv_vcpu(v)
>> +                  ? v->arch.pv_vcpu.ctrlreg[4]
>> +                  : v->arch.hvm_vcpu.guest_cr[4]) & X86_CR4_PKE )
>> +                res->c |= cpufeat_mask(X86_FEATURE_OSPKE);
> What's wrong with doing this adjustment when v != curr?

A guest's %cr4 is stale if it is running elsewhere.

> By the time the caller looks at the result, the state of guest
> software controlled bits can't be relied upon anyway.

This particular adjustment can be done out of curr context, but others
are harder.  I have taken the approach that it is better to do nothing
consistently than to expend effort filling in data we know is going to
be wrong for the caller.

(I hit a rat's nest with the xstate leaf and dynamic %ebx's, which is why
those patches are still pending some more work, and I haven't yet
decided how to do the pv hardware domain leakage.)

> Which then raises the question whether a second switch() statement
> for a second pass is all that useful in the first place (I
> realize this may depend on future plans of yours).

This switch statement will soon be far larger and more complicated than
the first-pass one, and I think it is important to separate the static
and dynamic nature of the two.

~Andrew


* Re: [PATCH 07/27] x86/cpuid: Recalculate a domains CPUID policy when appropriate
  2017-01-05  8:24           ` Jan Beulich
@ 2017-01-05 14:42             ` Andrew Cooper
  2017-01-05 14:56               ` Jan Beulich
  0 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-05 14:42 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 05/01/17 08:24, Jan Beulich wrote:
>>>> On 04.01.17 at 18:37, <andrew.cooper3@citrix.com> wrote:
>> On 04/01/17 16:04, Jan Beulich wrote:
>>>>>> On 04.01.17 at 16:33, <andrew.cooper3@citrix.com> wrote:
>>>> On 04/01/17 15:01, Jan Beulich wrote:
>>>>>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>>>>>> +    /* ... but hide ITSC in the common case. */
>>>>>> +    if ( !d->disable_migrate && !d->arch.vtsc )
>>>>>> +        __clear_bit(X86_FEATURE_ITSC, fs);
>>>>> The 32-bit PV logic could easily move below here afaics, reducing
>>>>> the distance between the two parts of the comment.
>>>>>
>>>>> Also this requires adjustment of the policy by (the caller of)
>>>>> tsc_set_info().
>>>> And also XEN_DOMCTL_set_disable_migrate.
>>>>
>>>> Currently the various toolstacks issue these hypercalls in the correct
>>>> order, so I was planning to ignore these edge cases until the toolstack
>>>> side work (see below).
>>> Let's not do that - it'll be some time until that other work lands,
>>> I assume, and introducing (further) dependencies on tool stacks
>>> to do things in the right order is quite bad imo.
>> This is code which hasn't changed in years.  But if you insist, then I
>> will see how best to do an x86-only change to the common code.
> The tsc_set_info() would likely be in x86 specific code, but the
> set_disable_migrate would, as you say, presumably want handling
> in/from common code. So unless this would turn out to be a rather
> costly change, I'd indeed prefer if you adjusted these.
>
>>>>>>  static void update_domain_cpuid_info(struct domain *d,
>>>>>>                                       const xen_domctl_cpuid_t *ctl)
>>>>>>  {
>>>>>> +    struct cpuid_policy *p = d->arch.cpuid;
>>>>>> +    struct cpuid_leaf leaf = { ctl->eax, ctl->ebx, ctl->ecx, ctl->edx };
>>>>>> +
>>>>>> +    if ( ctl->input[0] < ARRAY_SIZE(p->basic.raw) )
>>>>>> +    {
>>>>>> +        if ( ctl->input[0] == 7 )
>>>>>> +        {
>>>>>> +            if ( ctl->input[1] < ARRAY_SIZE(p->feat.raw) )
>>>>>> +                p->feat.raw[ctl->input[1]] = leaf;
>>>>>> +        }
>>>>>> +        else if ( ctl->input[0] == 0xd )
>>>>>> +        {
>>>>>> +            if ( ctl->input[1] < ARRAY_SIZE(p->xstate.raw) )
>>>>>> +                p->xstate.raw[ctl->input[1]] = leaf;
>>>>>> +        }
>>>>>> +        else
>>>>>> +            p->basic.raw[ctl->input[0]] = leaf;
>>>>>> +    }
>>>>>> +    else if ( (ctl->input[0] - 0x80000000) < ARRAY_SIZE(p->extd.raw) )
>>>>>> +        p->extd.raw[ctl->input[0] - 0x80000000] = leaf;
>>>>> These checks against ARRAY_SIZE() worry me - wouldn't we better
>>>>> refuse any attempts to set values not representable in the policy?
>>>> We can't do that yet, without toolstack side changes.  Currently the
>>>> toolstack can lodge any values it wishes, and all we do is ignore them,
>>>> which can be arbitrary information from a cpuid= clause.
>>> Hmm, do we really _ignore_ them in all cases (rather than handing
>>> them through to guests)? If so, that should indeed be good enough
>>> for now.
>> Any arbitrary values can get inserted into the cpuids[] array but,
>> given your fairly-recent change to check max_leaf, we don't guarantee to
>> hand the values to a guest.
> "we don't guarantee" != "we guarantee not to"
>
> But my main point here is that a domain's cpuid= may specify a
> higher-than-default max leaf, and I think going forward we ought
> to still return all zeroes for those leaves in that case, or else the
> overall spirit of whitelisting would get violated.

Does this concern still stand in light of max_leaf handling in patches
21 and 22?

~Andrew


* Re: [PATCH 22/27] x86/cpuid: Perform max_leaf calculations in guest_cpuid()
  2017-01-05 14:28     ` Andrew Cooper
@ 2017-01-05 14:52       ` Jan Beulich
  2017-01-05 15:02         ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 14:52 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 05.01.17 at 15:28, <andrew.cooper3@citrix.com> wrote:
> On 05/01/17 13:51, Jan Beulich wrote:
>>> @@ -333,21 +340,50 @@ void guest_cpuid(const struct vcpu *v, unsigned int leaf,
>>>                   unsigned int subleaf, struct cpuid_leaf *res)
>>>  {
>>>      const struct domain *d = v->domain;
>>> +    const struct cpuid_policy *p = d->arch.cpuid;
>>>  
>>>      *res = EMPTY_LEAF;
>>>  
>>>      /*
>>>       * First pass:
>>>       * - Dispatch the virtualised leaves to their respective handlers.
>>> +     * - Perform max_leaf/subleaf calculations, maybe returning early.
>>>       */
>>>      switch ( leaf )
>>>      {
>>> +    case 0x0 ... 0x6:
>>> +    case 0x8 ... 0xc:
>>> +#if 0 /* For when CPUID_GUEST_NR_BASIC isn't 0xd */
>>> +    case 0xe ... CPUID_GUEST_NR_BASIC - 1:
>>> +#endif
>> Perhaps have a BUILD_BUG_ON() in an #else here?
> 
> The presence of this was to be a reminder to whoever tries upping
> max_leaf beyond 0xd.  Then again, there is a reasonable chance it will
> be me.

Well, that's why the recommendation to add a BUILD_BUG_ON() -
that's a reminder to that "whoever".
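
Concretely, the suggestion amounts to something like this (a sketch, with the
condition mirroring the existing #if 0 comment):

#if 0 /* For when CPUID_GUEST_NR_BASIC isn't 0xd */
    case 0xe ... CPUID_GUEST_NR_BASIC - 1:
#else
        BUILD_BUG_ON(CPUID_GUEST_NR_BASIC != 0xd);
#endif
        if ( leaf > p->basic.max_leaf )
            return;
        break;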

>>> +        if ( leaf > p->basic.max_leaf )
>>> +            return;
>>> +        break;
>>> +
>>> +    case 0x7:
>>> +        if ( subleaf > p->feat.max_subleaf )
>>> +            return;
>>> +        break;
>>> +
>>> +    case 0xd:
>> XSTATE_CPUID again,
> 
> I considered this, but having a mix of named and numbered leaves is worse
> than having them uniformly numbered, especially when visually checking
> the conditions around the #if 0 case above.
> 
> I had considered making a cpuid-index.h for leaf names, but most leaves
> are more commonly referred to by number than name, so I am really not
> sure if that would be helpful or a hindrance in the long run.
> 
>> which raises the question whether switch() really is the best way to deal 
> with things here.
> 
> What else would you suggest?  One way or another (better shown in the
> context of the following patch), we need one block per union{} to apply
> max_leaf calculations and read the base data from p->$FOO.raw[$IDX].

Actually, perhaps a mixture: Inside the default case have

    if ( leaf == 0x7 )
    {
        if ( subleaf > p->feat.max_subleaf )
            return;
    }
    else if ( leaf == 0xd )
    {
        if ( subleaf > ARRAY_SIZE(p->xstate.raw) )
            return;
    }
    if ( leaf > p->basic.max_leaf )
        return;

Which (by making the last one an if rather than an else-if) also fixes an
issue I've spotted only now: So far you exclude leaves 7 and 0xd
from the basic.max_leaf checking. (And this way that check could
also go first.)

>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -3305,27 +3305,6 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
>>>      if ( !edx )
>>>          edx = &dummy;
>>>  
>>> -    if ( input & 0x7fffffff )
>>> -    {
>>> -        /*
>>> -         * Requests outside the supported leaf ranges return zero on AMD
>>> -         * and the highest basic leaf output on Intel. Uniformly follow
>>> -         * the AMD model as the more sane one.
>>> -         */
>> I think this comment would better be moved instead of deleted.
> 
> Where would you like it?  It doesn't have an easy logical place to live
> in guest_cpuid().  The best I can think of is probably as an extension
> of the "First Pass" comment.

Right there, yes, as an extension to the line you're already adding.

Jan



* Re: [PATCH 10/27] x86/cpuid: Introduce named feature bitmaps
  2017-01-05  8:27       ` Jan Beulich
@ 2017-01-05 14:53         ` Andrew Cooper
  2017-01-05 15:00           ` Jan Beulich
  0 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-05 14:53 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 05/01/17 08:27, Jan Beulich wrote:
>>>> On 04.01.17 at 18:21, <andrew.cooper3@citrix.com> wrote:
>> On 04/01/17 15:44, Jan Beulich wrote:
>>>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>>>> Use anonymous unions to access the feature leaves as complete words, and by
>>>> named individual feature.
>>>>
>>>> A feature name is introduced for every architectural X86_FEATURE_*, other than
>>>> the dynamically calculated values such as APIC, OSXSAVE and OSPKE.
>>> A rationale for this change would be nice to have here, as the
>>> redundancy with public/arch-x86/cpufeatureset.h means any
>>> addition will now need to change two places. Would it be possible
>>> for gen-cpuid.py to generate these bitfield declarations?
>> Hmm.  I hadn't considered that as an option.
>>
>> Thinking about it however, I'd ideally prefer not to hide the
>> declarations behind a macro.
> What's wrong with that?

My specific dislike of hiding code from tools like grep and cscope.

> It's surely better than having to keep two pieces of code in sync manually.

True, but that doesn't come with zero cost.

Thinking about it, the spanner in the works for easily generating this
in an automatic way is MAWAU in leaf 7, which is the first non-bit thing
in the feature leaves.
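
For illustration, the wrinkle is a multi-bit field sitting in the middle of
the single-bit flags, i.e. any generator would need to cope with something
like this (the position and width of MAWAU are assumed here):

                        uint32_t prefetchwt1:1, avx512vbmi:1, :1, pku:1,
                                 :13, mawau:5, :10;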

>>>> +                    };
>>>> +                };
>>>> +                union {
>>>> +                    uint32_t _7c0;
>>>> +                    struct {
>>>> +                        bool prefetchwt1:1, avx512vbmi:1, :1, pku: 1, :1, :1, :1, :1,
>>>> +                             :1, :1, :1, :1, :1, :1, :1, :1,
>>>> +                             :1, :1, :1, :1, :1, :1, :1, :1,
>>>> +                             :1, :1, :1, :1, :1, :1, :1, :1;
>>> This is ugly, but I remember you saying (on irc?) the compiler
>>> doesn't allow bitfields wider than one bit for bool ...
>> Correct.  I was quite surprised by this, but I can understand that bool
>> foo:2 is quite meaningless when foo can strictly only take a binary value.
> Thinking about it another time - what's wrong with using uint32_t
> instead of bool here, allowing consecutive unknown fields to be
> folded?

I first tried using uint32_t and had many problems counting bits
(although this is less of an issue if it were automatically generated).

I also wanted to maintain bool properties, but now that I think back, I don't
foresee any situation where we would make an assignment to one of these
named features.

~Andrew


* Re: [PATCH 23/27] x86/cpuid: Move all leaf 7 handling into guest_cpuid()
  2017-01-05 14:39     ` Andrew Cooper
@ 2017-01-05 14:55       ` Jan Beulich
  0 siblings, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 14:55 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 05.01.17 at 15:39, <andrew.cooper3@citrix.com> wrote:
> On 05/01/17 14:01, Jan Beulich wrote:
>>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>>> @@ -380,14 +385,42 @@ void guest_cpuid(const struct vcpu *v, unsigned int 
> leaf,
>>>      case 0x80000000 ... 0x80000000 + CPUID_GUEST_NR_EXTD - 1:
>>>          if ( leaf > p->extd.max_leaf )
>>>              return;
>>> -        break;
>>> +        goto legacy;
>>>  
>>>      default:
>>>          return;
>>>      }
>>>  
>>> +    /* Skip dynamic adjustments if we are in the wrong context. */
>>> +    if ( v != curr )
>>> +        return;
>>> +
>>> +    /*
>>> +     * Second pass:
>>> +     * - Dynamic adjustments
>>> +     */
>>> +    switch ( leaf )
>>> +    {
>>> +    case 0x7:
>>> +        switch ( subleaf )
>>> +        {
>>> +        case 0:
>>> +            /* OSPKE clear in policy.  Fast-forward CR4 back in. */
>>> +            if ( (is_pv_vcpu(v)
>>> +                  ? v->arch.pv_vcpu.ctrlreg[4]
>>> +                  : v->arch.hvm_vcpu.guest_cr[4]) & X86_CR4_PKE )
>>> +                res->c |= cpufeat_mask(X86_FEATURE_OSPKE);
>> What's wrong with doing this adjustment when v != curr?
> 
> A guest's %cr4 is stale if it is running elsewhere.
> 
>> By the time the caller looks at the result, the state of guest
>> software controlled bits can't be relied upon anyway.
> 
> This particular adjustment can be done out of curr context, but others
> are harder.  I have taken the approach that it is better to do nothing
> consistently, than to expend effort filling in data we know is going to
> be wrong for the caller.

May I then suggest adding the early bail-out at the time it actually
becomes necessary, or at the very least extending its comment to
make clear that this isn't always strictly needed?
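
Something like this, perhaps (wording is only a sketch):

    /*
     * Skip dynamic adjustments if we are in the wrong context.  Not every
     * adjustment strictly requires this, but doing nothing consistently
     * is preferable to filling in data known to be stale for the caller.
     */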

Jan



* Re: [PATCH 07/27] x86/cpuid: Recalculate a domains CPUID policy when appropriate
  2017-01-05 14:42             ` Andrew Cooper
@ 2017-01-05 14:56               ` Jan Beulich
  0 siblings, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 14:56 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 05.01.17 at 15:42, <andrew.cooper3@citrix.com> wrote:
> On 05/01/17 08:24, Jan Beulich wrote:
>>>>> On 04.01.17 at 18:37, <andrew.cooper3@citrix.com> wrote:
>>> On 04/01/17 16:04, Jan Beulich wrote:
>>>>>>> On 04.01.17 at 16:33, <andrew.cooper3@citrix.com> wrote:
>>>>> On 04/01/17 15:01, Jan Beulich wrote:
>>>>>>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>>>>>>>  static void update_domain_cpuid_info(struct domain *d,
>>>>>>>                                       const xen_domctl_cpuid_t *ctl)
>>>>>>>  {
>>>>>>> +    struct cpuid_policy *p = d->arch.cpuid;
>>>>>>> +    struct cpuid_leaf leaf = { ctl->eax, ctl->ebx, ctl->ecx, ctl->edx };
>>>>>>> +
>>>>>>> +    if ( ctl->input[0] < ARRAY_SIZE(p->basic.raw) )
>>>>>>> +    {
>>>>>>> +        if ( ctl->input[0] == 7 )
>>>>>>> +        {
>>>>>>> +            if ( ctl->input[1] < ARRAY_SIZE(p->feat.raw) )
>>>>>>> +                p->feat.raw[ctl->input[1]] = leaf;
>>>>>>> +        }
>>>>>>> +        else if ( ctl->input[0] == 0xd )
>>>>>>> +        {
>>>>>>> +            if ( ctl->input[1] < ARRAY_SIZE(p->xstate.raw) )
>>>>>>> +                p->xstate.raw[ctl->input[1]] = leaf;
>>>>>>> +        }
>>>>>>> +        else
>>>>>>> +            p->basic.raw[ctl->input[0]] = leaf;
>>>>>>> +    }
>>>>>>> +    else if ( (ctl->input[0] - 0x80000000) < ARRAY_SIZE(p->extd.raw) )
>>>>>>> +        p->extd.raw[ctl->input[0] - 0x80000000] = leaf;
>>>>>> These checks against ARRAY_SIZE() worry me - wouldn't we better
>>>>>> refuse any attempts to set values not representable in the policy?
>>>>> We can't do that yet, without toolstack side changes.  Currently the
>>>>> toolstack can lodge any values it wishes, and all we do is ignore them,
>>>>> which can be arbitrary information from a cpuid= clause.
>>>> Hmm, do we really _ignore_ them in all cases (rather than handing
>>>> them through to guests)? If so, that should indeed be good enough
>>>> for now.
>>> Any arbitrary values get can get inserted into the cpuids[] array but,
>>> given your fairly-recent change to check max_leaf, we don't guarantee to
>>> hand the values to a guest.
>> "we don't guarantee" != "we guarantee not to"
>>
>> But my main point here is that a domain's cpuid= may specify a
>> higher than default max leaf, and I think going forward we ought
>> to still return all zero for those leaves in that case, or else the
>> overall spirit of white listing would get violated.
> 
> Does this concern still stand in light of max_leaf handling in patches
> 21 and 22?

Indeed, now that I've seen the full series, this should be fine.

Jan



* Re: [PATCH 11/27] x86/hvm: Improve hvm_efer_valid() using named features
  2017-01-05 11:34   ` Jan Beulich
@ 2017-01-05 14:57     ` Andrew Cooper
  0 siblings, 0 replies; 93+ messages in thread
From: Andrew Cooper @ 2017-01-05 14:57 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 05/01/17 11:34, Jan Beulich wrote:
>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -914,56 +914,35 @@ static int hvm_save_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
>>  }
>>  
>>  /* Return a string indicating the error, or NULL for valid. */
>> -const char *hvm_efer_valid(const struct vcpu *v, uint64_t value,
>> -                           signed int cr0_pg)
>> +const char *hvm_efer_valid(const struct vcpu *v, uint64_t value, int cr0_pg)
> Please can we keep the "signed" here, to make clear signedness
> indeed matters (as opposed to various other uses of plain int we
> still have which could equally well be unsigned int)?

Ok.

>
> Other than that
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> albeit I have one more question:
>
>>      if ( (value & EFER_LMSLE) && !cpu_has_lmsl )
>>          return "LMSLE without support";
> Do you have any plans to include such non-CPUID-based features
> into the policy?

That is on my TODO list, because one way or another, I will need it when
doing the migration improvements.

Either way, I think we are going to have to include non-architectural
information in our architectural representation of the policy.
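
Purely as an illustration of the sort of thing I mean (nothing below is
part of this series, and the naming is invented):

    struct cpuid_policy {
        ...
        /*
         * Synthesised, non-architectural state with no CPUID bit of its
         * own (e.g. LMSLE support), which still needs representing for
         * migration purposes.
         */
        struct {
            bool lmsl:1;
        } extra;
    };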

~Andrew


* Re: [PATCH 10/27] x86/cpuid: Introduce named feature bitmaps
  2017-01-05 14:53         ` Andrew Cooper
@ 2017-01-05 15:00           ` Jan Beulich
  0 siblings, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 15:00 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 05.01.17 at 15:53, <andrew.cooper3@citrix.com> wrote:
> On 05/01/17 08:27, Jan Beulich wrote:
>>>>> On 04.01.17 at 18:21, <andrew.cooper3@citrix.com> wrote:
>>> On 04/01/17 15:44, Jan Beulich wrote:
>>>>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>>>>> Use anonymous unions to access the feature leaves as complete words, and by
>>>>> named individual feature.
>>>>>
>>>>> A feature name is introduced for every architectural X86_FEATURE_*, other than
>>>>> the dynamically calculated values such as APIC, OSXSAVE and OSPKE.
>>>> A rationale for this change would be nice to have here, as the
>>>> redundancy with public/arch-x86/cpufeatureset.h means any
>>>> addition will now need to change two places. Would it be possible
>>>> for gen-cpuid.py to generate these bitfield declarations?
>>> Hmm.  I hadn't considered that as an option.
>>>
>>> Thinking about it however, I'd ideally prefer not to hide the
>>> declarations behind a macro.
>> What's wrong with that?
> 
> My specific dislike of hiding code from tools like grep and cscope.
> 
>> It's surely better than having to keep two pieces of code in sync manually.
> 
> True, but that doesn't come with zero cost.
> 
> Thinking about it, the spanner in the works for easily generating this
> in an automatic way is MAWAU in leaf 7, which is the first non-bit thing
> in the feature leaves.

That's a case needing something new anyway, as you can't even
express it using XEN_CPUFEATURE().

Jan



* Re: [PATCH 22/27] x86/cpuid: Perform max_leaf calculations in guest_cpuid()
  2017-01-05 14:52       ` Jan Beulich
@ 2017-01-05 15:02         ` Andrew Cooper
  2017-01-05 15:39           ` Jan Beulich
  0 siblings, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2017-01-05 15:02 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 05/01/17 14:52, Jan Beulich wrote:
>>>> On 05.01.17 at 15:28, <andrew.cooper3@citrix.com> wrote:
>> On 05/01/17 13:51, Jan Beulich wrote:
>>>> @@ -333,21 +340,50 @@ void guest_cpuid(const struct vcpu *v, unsigned int leaf,
>>>>                   unsigned int subleaf, struct cpuid_leaf *res)
>>>>  {
>>>>      const struct domain *d = v->domain;
>>>> +    const struct cpuid_policy *p = d->arch.cpuid;
>>>>  
>>>>      *res = EMPTY_LEAF;
>>>>  
>>>>      /*
>>>>       * First pass:
>>>>       * - Dispatch the virtualised leaves to their respective handlers.
>>>> +     * - Perform max_leaf/subleaf calculations, maybe returning early.
>>>>       */
>>>>      switch ( leaf )
>>>>      {
>>>> +    case 0x0 ... 0x6:
>>>> +    case 0x8 ... 0xc:
>>>> +#if 0 /* For when CPUID_GUEST_NR_BASIC isn't 0xd */
>>>> +    case 0xe ... CPUID_GUEST_NR_BASIC - 1:
>>>> +#endif
>>> Perhaps have a BUILD_BUG_ON() in an #else here?
>> The presence of this was to be a reminder to whoever tries upping
>> max_leaf beyond 0xd.  Then again, there is a reasonable chance it will
>> be me.
> Well, that's why the recommendation to add a BUILD_BUG_ON() -
> that's a reminder to that "whoever".

Ok.
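
Presumably something of this shape (sketch only):

    case 0x0 ... 0x6:
    case 0x8 ... 0xc:
    #if 0 /* For when CPUID_GUEST_NR_BASIC isn't 0xd */
    case 0xe ... CPUID_GUEST_NR_BASIC - 1:
    #else
        /* Reminder for whoever raises CPUID_GUEST_NR_BASIC. */
        BUILD_BUG_ON(CPUID_GUEST_NR_BASIC > 0xe);
    #endif
        if ( leaf > p->basic.max_leaf )
            return;
        break;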

>
>>>> +        if ( leaf > p->basic.max_leaf )
>>>> +            return;
>>>> +        break;
>>>> +
>>>> +    case 0x7:
>>>> +        if ( subleaf > p->feat.max_subleaf )
>>>> +            return;
>>>> +        break;
>>>> +
>>>> +    case 0xd:
>>> XSTATE_CPUID again,
>> I considered this, but having a mix of named and numbered leaves is worse
>> than having them uniformly numbered, especially when visually checking
>> the conditions around the #if 0 case above.
>>
>> I had considered making a cpuid-index.h for leaf names, but most leaves
>> are more commonly referred to by number than name, so I am really not
>> sure if that would be helpful or hindering in the long run.
>>
>>> which raises the question whether switch() really is the best way to deal 
>>> with things here.
>>
>> What else would you suggest?  One way or another (better shown in the
>> context of the following patch), we need one block per union{} to apply
>> max_leaf calculations and read the base data from p->$FOO.raw[$IDX].
> Actually, perhaps a mixture: Inside the default case have
>
>     if ( leaf == 0x7 )
>     {
>         if ( subleaf > p->feat.max_subleaf )
>             return;
>     }
>     else if ( leaf == 0xd)
>     {
>         if ( subleaf > ARRAY_SIZE(p->xstate.raw) )
>             return;
>     }
>     if ( leaf > p->basic.max_leaf )
>         return;
>
> Which (by making the last one if rather than else-if) also fixes an
> issue I've spotted only now: So far you exclude leaves 7 and 0xd
> from the basic.max_leaf checking. (And this way that check could
> also go first.)

Very good point, although I think I'd still prefer a logic block
in this form inside a case 0 ... 0x3fffffff to avoid potential leakage
if other logic changes.
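
i.e. roughly (sketch only, with the constants as I remember them):

    case 0x0 ... 0x3fffffff:
        if ( leaf == 0x7 )
        {
            if ( subleaf > p->feat.max_subleaf )
                return;
        }
        else if ( leaf == 0xd )
        {
            if ( subleaf >= ARRAY_SIZE(p->xstate.raw) )
                return;
        }
        if ( leaf > p->basic.max_leaf )
            return;
        break;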

~Andrew


* Re: [PATCH 27/27] x86/cpuid: Alter the legacy-path prototypes to match guest_cpuid()
  2017-01-05 14:19   ` Jan Beulich
@ 2017-01-05 15:09     ` Andrew Cooper
  0 siblings, 0 replies; 93+ messages in thread
From: Andrew Cooper @ 2017-01-05 15:09 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 05/01/17 14:19, Jan Beulich wrote:
>>>> On 04.01.17 at 13:39, <andrew.cooper3@citrix.com> wrote:
>> Here and elsewhere, it becomes very obvious that the PVH path using
>> pv_cpuid() is broken, as the guest_kernel_mode() check using
>> guest_cpu_user_regs() is erroneous.  I am tempted to just switch PVH onto the
>> HVM path, which won't make it any more broken than it currently is.
> Are you sure? There was a reason it had been done this way back then.

Oh yes, the same problem Roger is having with PVHv2.  Only the
pv_cpuid() path has logic to read from native in the case of the control
domain, for whom no policy is constructed.

This series lays a lot of groundwork for fixing the dom0 policy problem,
but won't be fully working for PVHv2 until I remove all of the legacy
path (and that is at least the same quantity of work again, I reckon).
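
(For reference, the legacy behaviour in question is roughly of this shape,
paraphrased from memory rather than quoted:

    if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
        domain_cpuid(currd, leaf, subleaf, &a, &b, &c, &d);
    else
        cpuid_count(leaf, subleaf, &a, &b, &c, &d);

i.e. dom0 reads straight from hardware, because no policy is constructed
for it.)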

>
>> --- a/xen/arch/x86/cpuid.c
>> +++ b/xen/arch/x86/cpuid.c
>> @@ -337,30 +337,26 @@ int init_domain_cpuid_policy(struct domain *d)
>>      return 0;
>>  }
>>  
>> -static void pv_cpuid(struct cpu_user_regs *regs)
>> +static void pv_cpuid(unsigned int leaf, unsigned int subleaf,
>> +                     struct cpuid_leaf *res)
>>  {
>> -    uint32_t leaf, subleaf, a, b, c, d;
>> +    const struct cpu_user_regs *regs = guest_cpu_user_regs();
> Please consider moving this into the !is_pvh_domain() scope,
> open coding the one access outside of that.
>
>> @@ -538,33 +534,33 @@ static void pv_cpuid(struct cpu_user_regs *regs)
>>                                    xstate_sizes[_XSTATE_HI_ZMM]);
>>              }
>>  
>> -            a = (uint32_t)xfeature_mask;
>> -            d = (uint32_t)(xfeature_mask >> 32);
>> -            c = xstate_size;
>> +            res->a = (uint32_t)xfeature_mask;
>> +            res->d = (uint32_t)(xfeature_mask >> 32);
>> +            res->c = xstate_size;
> Please consider at once dropping these pointless casts (also on the
> HVM side then).

I can do, but after this patch, I only ever expect to delete code from
these functions as more leaves move over to the new infrastructure.

>
>> @@ -945,27 +927,7 @@ void guest_cpuid(const struct vcpu *v, unsigned int leaf,
>>   legacy:
>>      /* {pv,hvm}_cpuid() have this expectation. */
>>      ASSERT(v == curr);
>> -
>> -    if ( is_pv_vcpu(v) || is_pvh_vcpu(v) )
>> -    {
>> -        struct cpu_user_regs regs = *guest_cpu_user_regs();
>> -
>> -        regs.rax = leaf;
>> -        regs.rcx = subleaf;
>> -
>> -        pv_cpuid(&regs);
>> -
>> -        res->a = regs._eax;
>> -        res->b = regs._ebx;
>> -        res->c = regs._ecx;
>> -        res->d = regs._edx;
>> -    }
>> -    else
>> -    {
>> -        res->c = subleaf;
>> -
>> -        hvm_cpuid(leaf, &res->a, &res->b, &res->c, &res->d);
>> -    }
>> +    (is_pv_vcpu(v) || is_pvh_vcpu(v) ? pv_cpuid : hvm_cpuid)(leaf, subleaf, res);
> Afaics as of patch 8 you have v->domain already latched into a
> local variable, so please use is_*_domain() here.

Actually, I will switch it around to is_hvm_domain(), which is shorter
and will require no modification when PVHv1 finally gets excised.
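
i.e. (sketch):

    (is_hvm_domain(d) ? hvm_cpuid : pv_cpuid)(leaf, subleaf, res);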

~Andrew


* Re: [PATCH 22/27] x86/cpuid: Perform max_leaf calculations in guest_cpuid()
  2017-01-05 15:02         ` Andrew Cooper
@ 2017-01-05 15:39           ` Jan Beulich
  0 siblings, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2017-01-05 15:39 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 05.01.17 at 16:02, <andrew.cooper3@citrix.com> wrote:
> On 05/01/17 14:52, Jan Beulich wrote:
>>>>> On 05.01.17 at 15:28, <andrew.cooper3@citrix.com> wrote:
>>> What else would you suggest?  One way or another (better shown in the
>>> context of the following patch), we need one block per union{} to apply
>>> max_leaf calculations and read the base data from p->$FOO.raw[$IDX].
>> Actually, perhaps a mixture: Inside the default case have
>>
>>     if ( leaf == 0x7 )
>>     {
>>         if ( subleaf > p->feat.max_subleaf )
>>             return;
>>     }
>>     else if ( leaf == 0xd)
>>     {
>>         if ( subleaf > ARRAY_SIZE(p->xstate.raw) )
>>             return;
>>     }
>>     if ( leaf > p->basic.max_leaf )
>>         return;
>>
>> Which (by making the last one if rather than else-if) also fixes an
>> issue I've spotted only now: So far you exclude leaves 7 and 0xd
>> from the basic.max_leaf checking. (And this way that check could
>> also go first.)
> 
> Very good point, although I think I'd still prefer a logic block
> in this form inside a case 0 ... 0x3fffffff to avoid potential leakage
> if other logic changes.

Well, that's certainly fine with me.

Jan



end of thread

Thread overview: 93+ messages
2017-01-04 12:39 [PATCH 00/27] xen/x86: Per-domain CPUID policies Andrew Cooper
2017-01-04 12:39 ` [PATCH 01/27] x86/cpuid: Untangle the <asm/cpufeature.h> include hierachy Andrew Cooper
2017-01-04 13:39   ` Jan Beulich
2017-01-04 12:39 ` [PATCH 02/27] x86/cpuid: Introduce guest_cpuid() and struct cpuid_leaf Andrew Cooper
2017-01-04 14:01   ` Jan Beulich
2017-01-04 14:47     ` Andrew Cooper
2017-01-04 15:49       ` Jan Beulich
2017-01-04 12:39 ` [PATCH 03/27] x86/cpuid: Introduce struct cpuid_policy Andrew Cooper
2017-01-04 14:22   ` Jan Beulich
2017-01-04 15:05     ` Andrew Cooper
2017-01-04 15:58       ` Jan Beulich
2017-01-04 12:39 ` [PATCH 04/27] x86/cpuid: Move featuresets into " Andrew Cooper
2017-01-04 14:35   ` Jan Beulich
2017-01-04 15:10     ` Andrew Cooper
2017-01-04 15:59       ` Jan Beulich
2017-01-04 12:39 ` [PATCH 05/27] x86/cpuid: Allocate a CPUID policy for every domain Andrew Cooper
2017-01-04 14:40   ` Jan Beulich
2017-01-04 12:39 ` [PATCH 06/27] x86/domctl: Make XEN_DOMCTL_set_address_size singleshot Andrew Cooper
2017-01-04 14:42   ` Jan Beulich
2017-01-04 12:39 ` [PATCH 07/27] x86/cpuid: Recalculate a domains CPUID policy when appropriate Andrew Cooper
2017-01-04 15:01   ` Jan Beulich
2017-01-04 15:33     ` Andrew Cooper
2017-01-04 16:04       ` Jan Beulich
2017-01-04 17:37         ` Andrew Cooper
2017-01-05  8:24           ` Jan Beulich
2017-01-05 14:42             ` Andrew Cooper
2017-01-05 14:56               ` Jan Beulich
2017-01-04 12:39 ` [PATCH 08/27] x86/hvm: Dispatch cpuid_viridian_leaves() from guest_cpuid() Andrew Cooper
2017-01-04 15:24   ` Jan Beulich
2017-01-04 15:36     ` Andrew Cooper
2017-01-04 16:11       ` Jan Beulich
2017-01-04 12:39 ` [PATCH 09/27] x86/cpuid: Dispatch cpuid_hypervisor_leaves() " Andrew Cooper
2017-01-04 15:34   ` Jan Beulich
2017-01-04 15:40     ` Andrew Cooper
2017-01-04 16:14       ` Jan Beulich
2017-01-04 12:39 ` [PATCH 10/27] x86/cpuid: Introduce named feature bitmaps Andrew Cooper
2017-01-04 15:44   ` Jan Beulich
2017-01-04 17:21     ` Andrew Cooper
2017-01-05  8:27       ` Jan Beulich
2017-01-05 14:53         ` Andrew Cooper
2017-01-05 15:00           ` Jan Beulich
2017-01-04 12:39 ` [PATCH 11/27] x86/hvm: Improve hvm_efer_valid() using named features Andrew Cooper
2017-01-05 11:34   ` Jan Beulich
2017-01-05 14:57     ` Andrew Cooper
2017-01-04 12:39 ` [PATCH 12/27] x86/hvm: Improve CR4 verification " Andrew Cooper
2017-01-05 11:39   ` Jan Beulich
2017-01-04 12:39 ` [PATCH 13/27] x86/vvmx: Use hvm_cr4_guest_valid_bits() to calculate MSR_IA32_VMX_CR4_FIXED1 Andrew Cooper
2017-01-05  2:40   ` Tian, Kevin
2017-01-05 11:42   ` Jan Beulich
2017-01-04 12:39 ` [PATCH 14/27] x86/pv: Improve pv_cpuid() using named features Andrew Cooper
2017-01-05 11:43   ` Jan Beulich
2017-01-04 12:39 ` [PATCH 15/27] x86/hvm: Improve CPUID and MSR handling " Andrew Cooper
2017-01-05 12:06   ` Jan Beulich
2017-01-04 12:39 ` [PATCH 16/27] x86/svm: Improvements " Andrew Cooper
2017-01-04 14:52   ` Boris Ostrovsky
2017-01-04 15:42     ` Andrew Cooper
2017-01-04 12:39 ` [PATCH 17/27] x86/pv: Use per-domain policy information when calculating the cpumasks Andrew Cooper
2017-01-05 12:23   ` Jan Beulich
2017-01-05 12:24     ` Andrew Cooper
2017-01-04 12:39 ` [PATCH 18/27] x86/pv: Use per-domain policy information in pv_cpuid() Andrew Cooper
2017-01-05 12:44   ` Jan Beulich
2017-01-05 12:46     ` Andrew Cooper
2017-01-04 12:39 ` [PATCH 19/27] x86/hvm: Use per-domain policy information in hvm_cpuid() Andrew Cooper
2017-01-05 12:55   ` Jan Beulich
2017-01-05 13:03     ` Andrew Cooper
2017-01-04 12:39 ` [PATCH 20/27] x86/cpuid: Drop the temporary linear feature bitmap from struct cpuid_policy Andrew Cooper
2017-01-05 13:07   ` Jan Beulich
2017-01-05 13:12     ` Andrew Cooper
2017-01-04 12:39 ` [PATCH 21/27] x86/cpuid: Calculate appropriate max_leaf values for the global policies Andrew Cooper
2017-01-05 13:43   ` Jan Beulich
2017-01-05 14:13     ` Andrew Cooper
2017-01-05 14:24       ` Jan Beulich
2017-01-04 12:39 ` [PATCH 22/27] x86/cpuid: Perform max_leaf calculations in guest_cpuid() Andrew Cooper
2017-01-05 13:51   ` Jan Beulich
2017-01-05 14:28     ` Andrew Cooper
2017-01-05 14:52       ` Jan Beulich
2017-01-05 15:02         ` Andrew Cooper
2017-01-05 15:39           ` Jan Beulich
2017-01-04 12:39 ` [PATCH 23/27] x86/cpuid: Move all leaf 7 handling into guest_cpuid() Andrew Cooper
2017-01-05 14:01   ` Jan Beulich
2017-01-05 14:39     ` Andrew Cooper
2017-01-05 14:55       ` Jan Beulich
2017-01-04 12:39 ` [PATCH 24/27] x86/hvm: Use guest_cpuid() rather than hvm_cpuid() Andrew Cooper
2017-01-05 14:02   ` Jan Beulich
2017-01-04 12:39 ` [PATCH 25/27] x86/svm: " Andrew Cooper
2017-01-04 15:26   ` Boris Ostrovsky
2017-01-05 14:04   ` Jan Beulich
2017-01-04 12:39 ` [PATCH 26/27] x86/cpuid: Effectively remove pv_cpuid() and hvm_cpuid() Andrew Cooper
2017-01-05 14:06   ` Jan Beulich
2017-01-05 14:11     ` Andrew Cooper
2017-01-04 12:39 ` [PATCH 27/27] x86/cpuid: Alter the legacy-path prototypes to match guest_cpuid() Andrew Cooper
2017-01-05 14:19   ` Jan Beulich
2017-01-05 15:09     ` Andrew Cooper
