xen-devel.lists.xenproject.org archive mirror
* [PATCH v4 00/26] x86: Improvements to cpuid handling for guests
@ 2016-03-23 16:36 Andrew Cooper
  2016-03-23 16:36 ` [PATCH v4 01/26] xen/public: Export cpu featureset information in the public API Andrew Cooper
                   ` (26 more replies)
  0 siblings, 27 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

This series is available in git form at:
  http://xenbits.xen.org/git-http/people/andrewcoop/xen.git levelling-v4

There are no major changes from v3.  There were minor adjustments to the
feature dependency tree, OSXSAVE/OSPKE handling for PV guests and collection
of Acks/Reviews.

Most patches now have Acks/Reviews.  The remaining patches are #1 (Rest),
#6-8, #11-13, #18 (x86), #20 (ARM) and #26 (Toolstack).

The current cpuid code, both in the hypervisor and toolstack, has grown
organically for a very long time, and is flawed in many ways.  This series
focuses specifically on fixing the bits pertaining to the visible features,
and I will be fixing other areas in future work (e.g. per-core and
per-package values, auditing of incoming migration values, etc.)

These changes alter the workflow of cpuid handling as follows:

Xen boots and evaluates its current capabilities.  It uses this information to
calculate the maximum featuresets it can provide to guests, and provides this
information for toolstack consumption.  A toolstack may then calculate a safe
set of features (taking into account migratability), and set a guest's cpuid
policy.  Xen then takes care of context switching the levelling state.

In particular, this means that PV guests may have different levels while
running on the same host, an option which was not previously available.
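
As a rough illustration of the toolstack step described above (not code from
this series), the sketch below intersects the maximum featuresets reported by
several hosts so that only features available everywhere remain.  The names
FS_WORDS and featureset_intersect() are hypothetical, and real levelling
logic would also have to respect feature dependencies, which this ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FS_WORDS 9 /* hypothetical: number of 32-bit words in a featureset */

/*
 * Reduce 'safe' to the features present in both 'safe' and 'host'.
 * Calling this once per host in a migration pool leaves only the
 * features that every host in the pool can provide.
 */
static void featureset_intersect(uint32_t safe[FS_WORDS],
                                 const uint32_t host[FS_WORDS])
{
    assert(safe != NULL && host != NULL);

    for (size_t i = 0; i < FS_WORDS; i++)
        safe[i] &= host[i];
}
```

A caller would start from one host's featureset and fold in each of the
others; the result is a candidate for the guest's cpuid policy.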

Andrew Cooper (26):
  xen/public: Export cpu featureset information in the public API
  xen/x86: Script to automatically process featureset information
  xen/x86: Collect more cpuid feature leaves
  xen/x86: Mask out unknown features from Xen's capabilities
  xen/x86: Annotate special features
  xen/x86: Annotate VM applicability in featureset
  xen/x86: Calculate maximum host and guest featuresets
  xen/x86: Generate deep dependencies of features
  xen/x86: Clear dependent features when clearing a cpu cap
  xen/x86: Improve disabling of features which have dependencies
  xen/x86: Improvements to in-hypervisor cpuid sanity checks
  x86/cpu: Move set_cpumask() calls into c_early_init()
  x86/cpu: Sysctl and common infrastructure for levelling context
    switching
  x86/cpu: Rework AMD masking MSR setup
  x86/cpu: Rework Intel masking/faulting setup
  x86/cpu: Context switch cpuid masks and faulting state in
    context_switch()
  x86/pv: Provide custom cpumasks for PV domains
  x86/domctl: Update PV domain cpumasks when setting cpuid policy
  xen+tools: Export maximum host and guest cpu featuresets via SYSCTL
  tools/libxc: Modify bitmap operations to take void pointers
  tools/libxc: Use public/featureset.h for cpuid policy generation
  tools/libxc: Expose the automatically generated cpu featuremask
    information
  tools: Utility for dealing with featuresets
  tools/libxc: Wire a featureset through to cpuid policy logic
  tools/libxc: Use featuresets rather than guesswork
  tools/libxc: Calculate xstate cpuid leaf from guest information

 .gitignore                                  |   2 +
 tools/libxc/Makefile                        |   9 +
 tools/libxc/include/xenctrl.h               |  22 +-
 tools/libxc/xc_bitops.h                     |  37 +-
 tools/libxc/xc_cpufeature.h                 | 151 -------
 tools/libxc/xc_cpuid_x86.c                  | 621 +++++++++++++++++-----------
 tools/libxl/libxl_cpuid.c                   |   2 +-
 tools/misc/Makefile                         |   4 +
 tools/misc/xen-cpuid.c                      | 394 ++++++++++++++++++
 tools/ocaml/libs/xc/xenctrl.ml              |   3 +
 tools/ocaml/libs/xc/xenctrl.mli             |   4 +
 tools/ocaml/libs/xc/xenctrl_stubs.c         |  37 +-
 tools/python/xen/lowlevel/xc/xc.c           |   2 +-
 xen/arch/x86/Makefile                       |   1 +
 xen/arch/x86/apic.c                         |   2 +-
 xen/arch/x86/cpu/amd.c                      | 308 ++++++++------
 xen/arch/x86/cpu/common.c                   |  49 ++-
 xen/arch/x86/cpu/intel.c                    | 263 +++++++-----
 xen/arch/x86/cpuid.c                        | 240 +++++++++++
 xen/arch/x86/crash.c                        |   3 +
 xen/arch/x86/domain.c                       |  20 +-
 xen/arch/x86/domctl.c                       | 138 +++++++
 xen/arch/x86/hvm/hvm.c                      | 125 ++++--
 xen/arch/x86/setup.c                        |   3 +
 xen/arch/x86/sysctl.c                       |  57 +++
 xen/arch/x86/traps.c                        | 209 ++++++----
 xen/arch/x86/xstate.c                       |   6 +-
 xen/include/Makefile                        |  10 +
 xen/include/asm-x86/cpufeature.h            | 153 +------
 xen/include/asm-x86/cpufeatureset.h         |  32 ++
 xen/include/asm-x86/cpuid.h                 |  77 ++++
 xen/include/asm-x86/domain.h                |   2 +
 xen/include/asm-x86/processor.h             |   2 +-
 xen/include/public/arch-x86/cpufeatureset.h | 245 +++++++++++
 xen/include/public/sysctl.h                 |  50 +++
 xen/tools/gen-cpuid.py                      | 405 ++++++++++++++++++
 36 files changed, 2788 insertions(+), 900 deletions(-)
 delete mode 100644 tools/libxc/xc_cpufeature.h
 create mode 100644 tools/misc/xen-cpuid.c
 create mode 100644 xen/arch/x86/cpuid.c
 create mode 100644 xen/include/asm-x86/cpufeatureset.h
 create mode 100644 xen/include/asm-x86/cpuid.h
 create mode 100644 xen/include/public/arch-x86/cpufeatureset.h
 create mode 100755 xen/tools/gen-cpuid.py

-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* [PATCH v4 01/26] xen/public: Export cpu featureset information in the public API
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-24 14:08   ` Jan Beulich
  2016-03-23 16:36 ` [PATCH v4 02/26] xen/x86: Script to automatically process featureset information Andrew Cooper
                   ` (25 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Tim Deegan, Jan Beulich

For the featureset to be a useful object, it needs a stable interpretation, a
property which is missing from the current hw_caps interface.

Additionally, introduce TSC_ADJUST, FDP_EXCP_ONLY, SHA, PREFETCHWT1, ITSC, EFRO
and CLZERO, which will be used by later changes.

To maintain compilation, FSCAPINTS is currently hardcoded at 9.  Future
changes will generate it dynamically.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Tim Deegan <tim@xen.org>

v2:
 * Rebase over upstream changes
 * Collect all feature introductions from later in the series
 * Restrict API to Xen and toolstack
v3:
 * Allow the constants to be in a namespace of the includer's choosing.
 * Add FDP_EXCP_ONLY
v4:
 * Magic blocks in new file.
 * Remove default ASM support.
 * Renumber the synthetic values from 0.
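
Illustrative sketch (not part of the patch): the header is meant to be
expanded in different ways by redefining XEN_CPUFEATURE() before including
it.  To stay self-contained, the example below replaces the #include with a
three-entry stand-in macro, FEATURE_LIST, which is hypothetical, as is the
feature_name() helper.  Only the XEN_CPUFEATURE(name, value) shape, the
X86_FEATURE_ prefix, FSCAPINTS/NCAPINTS and the cpufeat_word()/cpufeat_bit()
macros are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the body of cpufeatureset.h; values copied from the patch. */
#define FEATURE_LIST                 \
    XEN_CPUFEATURE(FPU,    0*32+ 0)  \
    XEN_CPUFEATURE(SSE3,   1*32+ 0)  \
    XEN_CPUFEATURE(CLZERO, 8*32+ 0)

/* Expansion 1: enum constants in the X86_FEATURE_ namespace. */
#define XEN_CPUFEATURE(name, value) X86_FEATURE_##name = value,
enum { FEATURE_LIST };
#undef XEN_CPUFEATURE

/* Expansion 2: a table mapping each value back to its name. */
#define XEN_CPUFEATURE(name, value) { value, #name },
static const struct {
    unsigned int idx;
    const char *name;
} feature_names[] = { FEATURE_LIST };
#undef XEN_CPUFEATURE

#define FSCAPINTS 9
#define NCAPINTS (FSCAPINTS + 1)
#define cpufeat_word(idx) ((idx) / 32)
#define cpufeat_bit(idx)  ((idx) % 32)

/* Synthetic features count from bit 0 of the word after the featureset. */
#define X86_FEATURE_CONSTANT_TSC ((FSCAPINTS+0)*32+ 0)

/* Return the name for a feature index, or NULL if it is not in the list. */
static const char *feature_name(unsigned int idx)
{
    assert(cpufeat_word(idx) < NCAPINTS);

    for (size_t i = 0; i < sizeof(feature_names) / sizeof(feature_names[0]); i++)
        if (feature_names[i].idx == idx)
            return feature_names[i].name;

    return NULL;
}
```

Because both expansions come from the same list, the enum and the name table
cannot drift apart, which is the property the single public header is after.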
---
 xen/include/asm-x86/cpufeature.h            | 152 ++-----------------
 xen/include/asm-x86/cpufeatureset.h         |  32 ++++
 xen/include/public/arch-x86/cpufeatureset.h | 228 ++++++++++++++++++++++++++++
 3 files changed, 273 insertions(+), 139 deletions(-)
 create mode 100644 xen/include/asm-x86/cpufeatureset.h
 create mode 100644 xen/include/public/arch-x86/cpufeatureset.h

diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 1bac562..a044616 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -10,148 +10,22 @@
 #endif
 
 #include <xen/const.h>
+#include <asm/cpufeatureset.h>
 
-#define NCAPINTS	9	/* N 32-bit words worth of info */
+#define FSCAPINTS 9
+#define NCAPINTS (FSCAPINTS + 1) /* N 32-bit words worth of info */
 
-/* Intel-defined CPU features, CPUID level 0x00000001 (edx), word 0 */
-#define X86_FEATURE_FPU		(0*32+ 0) /* Onboard FPU */
-#define X86_FEATURE_VME		(0*32+ 1) /* Virtual Mode Extensions */
-#define X86_FEATURE_DE		(0*32+ 2) /* Debugging Extensions */
-#define X86_FEATURE_PSE 	(0*32+ 3) /* Page Size Extensions */
-#define X86_FEATURE_TSC		(0*32+ 4) /* Time Stamp Counter */
-#define X86_FEATURE_MSR		(0*32+ 5) /* Model-Specific Registers, RDMSR, WRMSR */
-#define X86_FEATURE_PAE		(0*32+ 6) /* Physical Address Extensions */
-#define X86_FEATURE_MCE		(0*32+ 7) /* Machine Check Architecture */
-#define X86_FEATURE_CX8		(0*32+ 8) /* CMPXCHG8 instruction */
-#define X86_FEATURE_APIC	(0*32+ 9) /* Onboard APIC */
-#define X86_FEATURE_SEP		(0*32+11) /* SYSENTER/SYSEXIT */
-#define X86_FEATURE_MTRR	(0*32+12) /* Memory Type Range Registers */
-#define X86_FEATURE_PGE		(0*32+13) /* Page Global Enable */
-#define X86_FEATURE_MCA		(0*32+14) /* Machine Check Architecture */
-#define X86_FEATURE_CMOV	(0*32+15) /* CMOV instruction (FCMOVCC and FCOMI too if FPU present) */
-#define X86_FEATURE_PAT		(0*32+16) /* Page Attribute Table */
-#define X86_FEATURE_PSE36	(0*32+17) /* 36-bit PSEs */
-#define X86_FEATURE_CLFLUSH	(0*32+19) /* Supports the CLFLUSH instruction */
-#define X86_FEATURE_DS		(0*32+21) /* Debug Store */
-#define X86_FEATURE_ACPI	(0*32+22) /* ACPI via MSR */
-#define X86_FEATURE_MMX		(0*32+23) /* Multimedia Extensions */
-#define X86_FEATURE_FXSR	(0*32+24) /* FXSAVE and FXRSTOR instructions (fast save and restore */
-				          /* of FPU context), and CR4.OSFXSR available */
-#define X86_FEATURE_SSE		(0*32+25) /* Streaming SIMD Extensions */
-#define X86_FEATURE_SSE2	(0*32+26) /* Streaming SIMD Extensions-2 */
-#define X86_FEATURE_HTT		(0*32+28) /* Hyper-Threading Technology */
-#define X86_FEATURE_TM1		(0*32+29) /* Thermal Monitor 1 */
-#define X86_FEATURE_PBE		(0*32+31) /* Pending Break Enable */
-
-/* AMD-defined CPU features, CPUID level 0x80000001, word 1 */
-/* Don't duplicate feature flags which are redundant with Intel! */
-#define X86_FEATURE_SYSCALL	(1*32+11) /* SYSCALL/SYSRET */
-#define X86_FEATURE_NX		(1*32+20) /* Execute Disable */
-#define X86_FEATURE_MMXEXT	(1*32+22) /* AMD MMX extensions */
-#define X86_FEATURE_FFXSR       (1*32+25) /* FFXSR instruction optimizations */
-#define X86_FEATURE_PAGE1GB	(1*32+26) /* 1Gb large page support */
-#define X86_FEATURE_RDTSCP	(1*32+27) /* RDTSCP */
-#define X86_FEATURE_LM		(1*32+29) /* Long Mode (x86-64) */
-#define X86_FEATURE_3DNOWEXT	(1*32+30) /* AMD 3DNow! extensions */
-#define X86_FEATURE_3DNOW	(1*32+31) /* 3DNow! */
-
-/* Intel-defined CPU features, CPUID level 0x0000000D:1 (eax), word 2 */
-#define X86_FEATURE_XSAVEOPT	(2*32+ 0) /* XSAVEOPT instruction. */
-#define X86_FEATURE_XSAVEC	(2*32+ 1) /* XSAVEC/XRSTORC instructions. */
-#define X86_FEATURE_XGETBV1	(2*32+ 2) /* XGETBV with %ecx=1. */
-#define X86_FEATURE_XSAVES	(2*32+ 3) /* XSAVES/XRSTORS instructions. */
-
-/* Other features, Linux-defined mapping, word 3 */
+/* Other features, Xen-defined mapping. */
 /* This range is used for feature bits which conflict or are synthesized */
-#define X86_FEATURE_CONSTANT_TSC (3*32+ 8) /* TSC ticks at a constant rate */
-#define X86_FEATURE_NONSTOP_TSC	(3*32+ 9) /* TSC does not stop in C states */
-#define X86_FEATURE_ARAT	(3*32+ 10) /* Always running APIC timer */
-#define X86_FEATURE_ARCH_PERFMON (3*32+11) /* Intel Architectural PerfMon */
-#define X86_FEATURE_TSC_RELIABLE (3*32+12) /* TSC is known to be reliable */
-#define X86_FEATURE_XTOPOLOGY    (3*32+13) /* cpu topology enum extensions */
-#define X86_FEATURE_CPUID_FAULTING (3*32+14) /* cpuid faulting */
-#define X86_FEATURE_CLFLUSH_MONITOR (3*32+15) /* clflush reqd with monitor */
-#define X86_FEATURE_APERFMPERF   (3*32+16) /* APERFMPERF */
-
-/* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
-#define X86_FEATURE_SSE3	(4*32+ 0) /* Streaming SIMD Extensions-3 */
-#define X86_FEATURE_PCLMULQDQ	(4*32+ 1) /* Carry-less mulitplication */
-#define X86_FEATURE_DTES64	(4*32+ 2) /* 64-bit Debug Store */
-#define X86_FEATURE_MONITOR	(4*32+ 3) /* Monitor/Mwait support */
-#define X86_FEATURE_DSCPL	(4*32+ 4) /* CPL Qualified Debug Store */
-#define X86_FEATURE_VMX		(4*32+ 5) /* Virtual Machine Extensions */
-#define X86_FEATURE_SMX		(4*32+ 6) /* Safer Mode Extensions */
-#define X86_FEATURE_EIST	(4*32+ 7) /* Enhanced SpeedStep */
-#define X86_FEATURE_TM2		(4*32+ 8) /* Thermal Monitor 2 */
-#define X86_FEATURE_SSSE3	(4*32+ 9) /* Supplemental Streaming SIMD Extensions-3 */
-#define X86_FEATURE_FMA		(4*32+12) /* Fused Multiply Add */
-#define X86_FEATURE_CX16        (4*32+13) /* CMPXCHG16B */
-#define X86_FEATURE_XTPR	(4*32+14) /* Send Task Priority Messages */
-#define X86_FEATURE_PDCM	(4*32+15) /* Perf/Debug Capability MSR */
-#define X86_FEATURE_PCID	(4*32+17) /* Process Context ID */
-#define X86_FEATURE_DCA		(4*32+18) /* Direct Cache Access */
-#define X86_FEATURE_SSE4_1	(4*32+19) /* Streaming SIMD Extensions 4.1 */
-#define X86_FEATURE_SSE4_2	(4*32+20) /* Streaming SIMD Extensions 4.2 */
-#define X86_FEATURE_X2APIC	(4*32+21) /* Extended xAPIC */
-#define X86_FEATURE_MOVBE	(4*32+22) /* movbe instruction */
-#define X86_FEATURE_POPCNT	(4*32+23) /* POPCNT instruction */
-#define X86_FEATURE_TSC_DEADLINE (4*32+24) /* "tdt" TSC Deadline Timer */
-#define X86_FEATURE_AESNI	(4*32+25) /* AES instructions */
-#define X86_FEATURE_XSAVE	(4*32+26) /* XSAVE/XRSTOR/XSETBV/XGETBV */
-#define X86_FEATURE_OSXSAVE	(4*32+27) /* OSXSAVE */
-#define X86_FEATURE_AVX 	(4*32+28) /* Advanced Vector Extensions */
-#define X86_FEATURE_F16C 	(4*32+29) /* Half-precision convert instruction */
-#define X86_FEATURE_RDRAND 	(4*32+30) /* Digital Random Number Generator */
-#define X86_FEATURE_HYPERVISOR	(4*32+31) /* Running under some hypervisor */
-
-/* UNUSED, word 5 */
-
-/* More extended AMD flags: CPUID level 0x80000001, ecx, word 6 */
-#define X86_FEATURE_LAHF_LM     (6*32+ 0) /* LAHF/SAHF in long mode */
-#define X86_FEATURE_CMP_LEGACY  (6*32+ 1) /* If yes HyperThreading not valid */
-#define X86_FEATURE_SVM         (6*32+ 2) /* Secure virtual machine */
-#define X86_FEATURE_EXTAPIC     (6*32+ 3) /* Extended APIC space */
-#define X86_FEATURE_CR8_LEGACY  (6*32+ 4) /* CR8 in 32-bit mode */
-#define X86_FEATURE_ABM         (6*32+ 5) /* Advanced bit manipulation */
-#define X86_FEATURE_SSE4A       (6*32+ 6) /* SSE-4A */
-#define X86_FEATURE_MISALIGNSSE (6*32+ 7) /* Misaligned SSE mode */
-#define X86_FEATURE_3DNOWPREFETCH (6*32+ 8) /* 3DNow prefetch instructions */
-#define X86_FEATURE_OSVW        (6*32+ 9) /* OS Visible Workaround */
-#define X86_FEATURE_IBS         (6*32+10) /* Instruction Based Sampling */
-#define X86_FEATURE_XOP         (6*32+11) /* extended AVX instructions */
-#define X86_FEATURE_SKINIT      (6*32+12) /* SKINIT/STGI instructions */
-#define X86_FEATURE_WDT         (6*32+13) /* Watchdog timer */
-#define X86_FEATURE_LWP         (6*32+15) /* Light Weight Profiling */
-#define X86_FEATURE_FMA4        (6*32+16) /* 4 operands MAC instructions */
-#define X86_FEATURE_NODEID_MSR  (6*32+19) /* NodeId MSR */
-#define X86_FEATURE_TBM         (6*32+21) /* trailing bit manipulations */
-#define X86_FEATURE_TOPOEXT     (6*32+22) /* topology extensions CPUID leafs */
-#define X86_FEATURE_DBEXT       (6*32+26) /* data breakpoint extension */
-#define X86_FEATURE_MONITORX    (6*32+29) /* MWAIT extension (MONITORX/MWAITX) */
-
-/* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx), word 7 */
-#define X86_FEATURE_FSGSBASE	(7*32+ 0) /* {RD,WR}{FS,GS}BASE instructions */
-#define X86_FEATURE_BMI1	(7*32+ 3) /* 1st bit manipulation extensions */
-#define X86_FEATURE_HLE 	(7*32+ 4) /* Hardware Lock Elision */
-#define X86_FEATURE_AVX2	(7*32+ 5) /* AVX2 instructions */
-#define X86_FEATURE_SMEP	(7*32+ 7) /* Supervisor Mode Execution Protection */
-#define X86_FEATURE_BMI2	(7*32+ 8) /* 2nd bit manipulation extensions */
-#define X86_FEATURE_ERMS	(7*32+ 9) /* Enhanced REP MOVSB/STOSB */
-#define X86_FEATURE_INVPCID	(7*32+10) /* Invalidate Process Context ID */
-#define X86_FEATURE_RTM 	(7*32+11) /* Restricted Transactional Memory */
-#define X86_FEATURE_PQM 	(7*32+12) /* Platform QoS Monitoring */
-#define X86_FEATURE_NO_FPU_SEL 	(7*32+13) /* FPU CS/DS stored as zero */
-#define X86_FEATURE_MPX		(7*32+14) /* Memory Protection Extensions */
-#define X86_FEATURE_PQE 	(7*32+15) /* Platform QoS Enforcement */
-#define X86_FEATURE_RDSEED	(7*32+18) /* RDSEED instruction */
-#define X86_FEATURE_ADX		(7*32+19) /* ADCX, ADOX instructions */
-#define X86_FEATURE_SMAP	(7*32+20) /* Supervisor Mode Access Prevention */
-#define X86_FEATURE_PCOMMIT	(7*32+22) /* PCOMMIT instruction */
-#define X86_FEATURE_CLFLUSHOPT	(7*32+23) /* CLFLUSHOPT instruction */
-
-/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 8 */
-#define X86_FEATURE_PKU	(8*32+ 3) /* Protection Keys for Userspace */
-#define X86_FEATURE_OSPKE	(8*32+ 4) /* OS Protection Keys Enable */
+#define X86_FEATURE_CONSTANT_TSC	((FSCAPINTS+0)*32+ 0) /* TSC ticks at a constant rate */
+#define X86_FEATURE_NONSTOP_TSC		((FSCAPINTS+0)*32+ 1) /* TSC does not stop in C states */
+#define X86_FEATURE_ARAT		((FSCAPINTS+0)*32+ 2) /* Always running APIC timer */
+#define X86_FEATURE_ARCH_PERFMON	((FSCAPINTS+0)*32+ 3) /* Intel Architectural PerfMon */
+#define X86_FEATURE_TSC_RELIABLE	((FSCAPINTS+0)*32+ 4) /* TSC is known to be reliable */
+#define X86_FEATURE_XTOPOLOGY		((FSCAPINTS+0)*32+ 5) /* cpu topology enum extensions */
+#define X86_FEATURE_CPUID_FAULTING	((FSCAPINTS+0)*32+ 6) /* cpuid faulting */
+#define X86_FEATURE_CLFLUSH_MONITOR	((FSCAPINTS+0)*32+ 7) /* clflush reqd with monitor */
+#define X86_FEATURE_APERFMPERF		((FSCAPINTS+0)*32+ 8) /* APERFMPERF */
 
 #define cpufeat_word(idx)	((idx) / 32)
 #define cpufeat_bit(idx)	((idx) % 32)
diff --git a/xen/include/asm-x86/cpufeatureset.h b/xen/include/asm-x86/cpufeatureset.h
new file mode 100644
index 0000000..07ee32f
--- /dev/null
+++ b/xen/include/asm-x86/cpufeatureset.h
@@ -0,0 +1,32 @@
+#ifndef __XEN_X86_CPUFEATURESET_H__
+#define __XEN_X86_CPUFEATURESET_H__
+
+#ifndef __ASSEMBLY__
+
+#define XEN_CPUFEATURE(name, value) X86_FEATURE_##name = value,
+enum {
+#include <public/arch-x86/cpufeatureset.h>
+#undef XEN_CPUFEATURE
+};
+
+#define XEN_CPUFEATURE(name, value) asm (".equ X86_FEATURE_" #name ", " #value);
+#include <public/arch-x86/cpufeatureset.h>
+
+#else /* !__ASSEMBLY__ */
+
+#define XEN_CPUFEATURE(name, value) .equ X86_FEATURE_##name, value
+#include <public/arch-x86/cpufeatureset.h>
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* !__XEN_X86_CPUFEATURESET_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
new file mode 100644
index 0000000..5da37eb
--- /dev/null
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -0,0 +1,228 @@
+/*
+ * arch-x86/cpufeatureset.h
+ *
+ * CPU featureset definitions
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (c) 2015, 2016 Citrix Systems, Inc.
+ */
+
+/*
+ * There are two expected ways of including this header.
+ *
+ * 1) The "default" case (expected from tools etc).
+ *
+ * Simply #include <public/arch-x86/cpufeatureset.h>
+ *
+ * In this circumstance, normal header guards apply and the includer shall get
+ * an enumeration in the XEN_X86_FEATURE_xxx namespace.
+ *
+ * 2) The special case where the includer provides XEN_CPUFEATURE() in scope.
+ *
+ * In this case, no inclusion guards apply and the caller is responsible for
+ * their XEN_CPUFEATURE() being appropriate in the included context.
+ */
+
+#ifndef XEN_CPUFEATURE
+
+/*
+ * Includer has not provided a custom XEN_CPUFEATURE().  Arrange for normal
+ * header guards, an enum and constants in the XEN_X86_FEATURE_xxx namespace.
+ */
+#ifndef __XEN_PUBLIC_ARCH_X86_CPUFEATURESET_H__
+#define __XEN_PUBLIC_ARCH_X86_CPUFEATURESET_H__
+
+#define XEN_CPUFEATURESET_DEFAULT_INCLUDE
+
+#define XEN_CPUFEATURE(name, value) XEN_X86_FEATURE_##name = value,
+enum {
+
+#endif /* __XEN_PUBLIC_ARCH_X86_CPUFEATURESET_H__ */
+#endif /* !XEN_CPUFEATURE */
+
+
+#ifdef XEN_CPUFEATURE
+/*
+ * A featureset is a bitmap of x86 features, represented as a collection of
+ * 32bit words.
+ *
+ * Words are as specified in vendors' programming manuals, and shall not
+ * contain any synthesized values.  New words may be added to the end of
+ * the featureset.
+ *
+ * All featureset words currently originate from leaves specified for the
+ * CPUID instruction, but this does not preclude other sources of information.
+ */
+
+/* Intel-defined CPU features, CPUID level 0x00000001.edx, word 0 */
+XEN_CPUFEATURE(FPU,           0*32+ 0) /*   Onboard FPU */
+XEN_CPUFEATURE(VME,           0*32+ 1) /*   Virtual Mode Extensions */
+XEN_CPUFEATURE(DE,            0*32+ 2) /*   Debugging Extensions */
+XEN_CPUFEATURE(PSE,           0*32+ 3) /*   Page Size Extensions */
+XEN_CPUFEATURE(TSC,           0*32+ 4) /*   Time Stamp Counter */
+XEN_CPUFEATURE(MSR,           0*32+ 5) /*   Model-Specific Registers, RDMSR, WRMSR */
+XEN_CPUFEATURE(PAE,           0*32+ 6) /*   Physical Address Extensions */
+XEN_CPUFEATURE(MCE,           0*32+ 7) /*   Machine Check Exception */
+XEN_CPUFEATURE(CX8,           0*32+ 8) /*   CMPXCHG8 instruction */
+XEN_CPUFEATURE(APIC,          0*32+ 9) /*   Onboard APIC */
+XEN_CPUFEATURE(SEP,           0*32+11) /*   SYSENTER/SYSEXIT */
+XEN_CPUFEATURE(MTRR,          0*32+12) /*   Memory Type Range Registers */
+XEN_CPUFEATURE(PGE,           0*32+13) /*   Page Global Enable */
+XEN_CPUFEATURE(MCA,           0*32+14) /*   Machine Check Architecture */
+XEN_CPUFEATURE(CMOV,          0*32+15) /*   CMOV instruction (FCMOVCC and FCOMI too if FPU present) */
+XEN_CPUFEATURE(PAT,           0*32+16) /*   Page Attribute Table */
+XEN_CPUFEATURE(PSE36,         0*32+17) /*   36-bit PSEs */
+XEN_CPUFEATURE(CLFLUSH,       0*32+19) /*   CLFLUSH instruction */
+XEN_CPUFEATURE(DS,            0*32+21) /*   Debug Store */
+XEN_CPUFEATURE(ACPI,          0*32+22) /*   ACPI via MSR */
+XEN_CPUFEATURE(MMX,           0*32+23) /*   Multimedia Extensions */
+XEN_CPUFEATURE(FXSR,          0*32+24) /*   FXSAVE and FXRSTOR instructions */
+XEN_CPUFEATURE(SSE,           0*32+25) /*   Streaming SIMD Extensions */
+XEN_CPUFEATURE(SSE2,          0*32+26) /*   Streaming SIMD Extensions-2 */
+XEN_CPUFEATURE(HTT,           0*32+28) /*   Hyper-Threading Technology */
+XEN_CPUFEATURE(TM1,           0*32+29) /*   Thermal Monitor 1 */
+XEN_CPUFEATURE(PBE,           0*32+31) /*   Pending Break Enable */
+
+/* Intel-defined CPU features, CPUID level 0x00000001.ecx, word 1 */
+XEN_CPUFEATURE(SSE3,          1*32+ 0) /*   Streaming SIMD Extensions-3 */
+XEN_CPUFEATURE(PCLMULQDQ,     1*32+ 1) /*   Carry-less multiplication */
+XEN_CPUFEATURE(DTES64,        1*32+ 2) /*   64-bit Debug Store */
+XEN_CPUFEATURE(MONITOR,       1*32+ 3) /*   Monitor/Mwait support */
+XEN_CPUFEATURE(DSCPL,         1*32+ 4) /*   CPL Qualified Debug Store */
+XEN_CPUFEATURE(VMX,           1*32+ 5) /*   Virtual Machine Extensions */
+XEN_CPUFEATURE(SMX,           1*32+ 6) /*   Safer Mode Extensions */
+XEN_CPUFEATURE(EIST,          1*32+ 7) /*   Enhanced SpeedStep */
+XEN_CPUFEATURE(TM2,           1*32+ 8) /*   Thermal Monitor 2 */
+XEN_CPUFEATURE(SSSE3,         1*32+ 9) /*   Supplemental Streaming SIMD Extensions-3 */
+XEN_CPUFEATURE(FMA,           1*32+12) /*   Fused Multiply Add */
+XEN_CPUFEATURE(CX16,          1*32+13) /*   CMPXCHG16B */
+XEN_CPUFEATURE(XTPR,          1*32+14) /*   Send Task Priority Messages */
+XEN_CPUFEATURE(PDCM,          1*32+15) /*   Perf/Debug Capability MSR */
+XEN_CPUFEATURE(PCID,          1*32+17) /*   Process Context ID */
+XEN_CPUFEATURE(DCA,           1*32+18) /*   Direct Cache Access */
+XEN_CPUFEATURE(SSE4_1,        1*32+19) /*   Streaming SIMD Extensions 4.1 */
+XEN_CPUFEATURE(SSE4_2,        1*32+20) /*   Streaming SIMD Extensions 4.2 */
+XEN_CPUFEATURE(X2APIC,        1*32+21) /*   Extended xAPIC */
+XEN_CPUFEATURE(MOVBE,         1*32+22) /*   movbe instruction */
+XEN_CPUFEATURE(POPCNT,        1*32+23) /*   POPCNT instruction */
+XEN_CPUFEATURE(TSC_DEADLINE,  1*32+24) /*   TSC Deadline Timer */
+XEN_CPUFEATURE(AESNI,         1*32+25) /*   AES instructions */
+XEN_CPUFEATURE(XSAVE,         1*32+26) /*   XSAVE/XRSTOR/XSETBV/XGETBV */
+XEN_CPUFEATURE(OSXSAVE,       1*32+27) /*   OSXSAVE */
+XEN_CPUFEATURE(AVX,           1*32+28) /*   Advanced Vector Extensions */
+XEN_CPUFEATURE(F16C,          1*32+29) /*   Half-precision convert instruction */
+XEN_CPUFEATURE(RDRAND,        1*32+30) /*   Digital Random Number Generator */
+XEN_CPUFEATURE(HYPERVISOR,    1*32+31) /*   Running under some hypervisor */
+
+/* AMD-defined CPU features, CPUID level 0x80000001.edx, word 2 */
+XEN_CPUFEATURE(SYSCALL,       2*32+11) /*   SYSCALL/SYSRET */
+XEN_CPUFEATURE(NX,            2*32+20) /*   Execute Disable */
+XEN_CPUFEATURE(MMXEXT,        2*32+22) /*   AMD MMX extensions */
+XEN_CPUFEATURE(FFXSR,         2*32+25) /*   FFXSR instruction optimizations */
+XEN_CPUFEATURE(PAGE1GB,       2*32+26) /*   1Gb large page support */
+XEN_CPUFEATURE(RDTSCP,        2*32+27) /*   RDTSCP */
+XEN_CPUFEATURE(LM,            2*32+29) /*   Long Mode (x86-64) */
+XEN_CPUFEATURE(3DNOWEXT,      2*32+30) /*   AMD 3DNow! extensions */
+XEN_CPUFEATURE(3DNOW,         2*32+31) /*   3DNow! */
+
+/* AMD-defined CPU features, CPUID level 0x80000001.ecx, word 3 */
+XEN_CPUFEATURE(LAHF_LM,       3*32+ 0) /*   LAHF/SAHF in long mode */
+XEN_CPUFEATURE(CMP_LEGACY,    3*32+ 1) /*   If yes HyperThreading not valid */
+XEN_CPUFEATURE(SVM,           3*32+ 2) /*   Secure virtual machine */
+XEN_CPUFEATURE(EXTAPIC,       3*32+ 3) /*   Extended APIC space */
+XEN_CPUFEATURE(CR8_LEGACY,    3*32+ 4) /*   CR8 in 32-bit mode */
+XEN_CPUFEATURE(ABM,           3*32+ 5) /*   Advanced bit manipulation */
+XEN_CPUFEATURE(SSE4A,         3*32+ 6) /*   SSE-4A */
+XEN_CPUFEATURE(MISALIGNSSE,   3*32+ 7) /*   Misaligned SSE mode */
+XEN_CPUFEATURE(3DNOWPREFETCH, 3*32+ 8) /*   3DNow prefetch instructions */
+XEN_CPUFEATURE(OSVW,          3*32+ 9) /*   OS Visible Workaround */
+XEN_CPUFEATURE(IBS,           3*32+10) /*   Instruction Based Sampling */
+XEN_CPUFEATURE(XOP,           3*32+11) /*   extended AVX instructions */
+XEN_CPUFEATURE(SKINIT,        3*32+12) /*   SKINIT/STGI instructions */
+XEN_CPUFEATURE(WDT,           3*32+13) /*   Watchdog timer */
+XEN_CPUFEATURE(LWP,           3*32+15) /*   Light Weight Profiling */
+XEN_CPUFEATURE(FMA4,          3*32+16) /*   4 operands MAC instructions */
+XEN_CPUFEATURE(NODEID_MSR,    3*32+19) /*   NodeId MSR */
+XEN_CPUFEATURE(TBM,           3*32+21) /*   trailing bit manipulations */
+XEN_CPUFEATURE(TOPOEXT,       3*32+22) /*   topology extensions CPUID leafs */
+XEN_CPUFEATURE(DBEXT,         3*32+26) /*   data breakpoint extension */
+XEN_CPUFEATURE(MONITORX,      3*32+29) /*   MONITOR extension (MONITORX/MWAITX) */
+
+/* Intel-defined CPU features, CPUID level 0x0000000D:1.eax, word 4 */
+XEN_CPUFEATURE(XSAVEOPT,      4*32+ 0) /*   XSAVEOPT instruction */
+XEN_CPUFEATURE(XSAVEC,        4*32+ 1) /*   XSAVEC/XRSTORC instructions */
+XEN_CPUFEATURE(XGETBV1,       4*32+ 2) /*   XGETBV with %ecx=1 */
+XEN_CPUFEATURE(XSAVES,        4*32+ 3) /*   XSAVES/XRSTORS instructions */
+
+/* Intel-defined CPU features, CPUID level 0x00000007:0.ebx, word 5 */
+XEN_CPUFEATURE(FSGSBASE,      5*32+ 0) /*   {RD,WR}{FS,GS}BASE instructions */
+XEN_CPUFEATURE(TSC_ADJUST,    5*32+ 1) /*   TSC_ADJUST MSR available */
+XEN_CPUFEATURE(BMI1,          5*32+ 3) /*   1st bit manipulation extensions */
+XEN_CPUFEATURE(HLE,           5*32+ 4) /*   Hardware Lock Elision */
+XEN_CPUFEATURE(AVX2,          5*32+ 5) /*   AVX2 instructions */
+XEN_CPUFEATURE(FDP_EXCP_ONLY, 5*32+ 6) /*   x87 FDP only updated on exception. */
+XEN_CPUFEATURE(SMEP,          5*32+ 7) /*   Supervisor Mode Execution Protection */
+XEN_CPUFEATURE(BMI2,          5*32+ 8) /*   2nd bit manipulation extensions */
+XEN_CPUFEATURE(ERMS,          5*32+ 9) /*   Enhanced REP MOVSB/STOSB */
+XEN_CPUFEATURE(INVPCID,       5*32+10) /*   Invalidate Process Context ID */
+XEN_CPUFEATURE(RTM,           5*32+11) /*   Restricted Transactional Memory */
+XEN_CPUFEATURE(PQM,           5*32+12) /*   Platform QoS Monitoring */
+XEN_CPUFEATURE(NO_FPU_SEL,    5*32+13) /*   FPU CS/DS stored as zero */
+XEN_CPUFEATURE(MPX,           5*32+14) /*   Memory Protection Extensions */
+XEN_CPUFEATURE(PQE,           5*32+15) /*   Platform QoS Enforcement */
+XEN_CPUFEATURE(RDSEED,        5*32+18) /*   RDSEED instruction */
+XEN_CPUFEATURE(ADX,           5*32+19) /*   ADCX, ADOX instructions */
+XEN_CPUFEATURE(SMAP,          5*32+20) /*   Supervisor Mode Access Prevention */
+XEN_CPUFEATURE(PCOMMIT,       5*32+22) /*   PCOMMIT instruction */
+XEN_CPUFEATURE(CLFLUSHOPT,    5*32+23) /*   CLFLUSHOPT instruction */
+XEN_CPUFEATURE(CLWB,          5*32+24) /*   CLWB instruction */
+XEN_CPUFEATURE(SHA,           5*32+29) /*   SHA1 & SHA256 instructions */
+
+/* Intel-defined CPU features, CPUID level 0x00000007:0.ecx, word 6 */
+XEN_CPUFEATURE(PREFETCHWT1,   6*32+ 0) /*   PREFETCHWT1 instruction */
+XEN_CPUFEATURE(PKU,           6*32+ 3) /*   Protection Keys for Userspace */
+XEN_CPUFEATURE(OSPKE,         6*32+ 4) /*   OS Protection Keys Enable */
+
+/* AMD-defined CPU features, CPUID level 0x80000007.edx, word 7 */
+XEN_CPUFEATURE(ITSC,          7*32+ 8) /*   Invariant TSC */
+XEN_CPUFEATURE(EFRO,          7*32+10) /*   APERF/MPERF Read Only interface */
+
+/* AMD-defined CPU features, CPUID level 0x80000008.ebx, word 8 */
+XEN_CPUFEATURE(CLZERO,        8*32+ 0) /*   CLZERO instruction */
+
+#endif /* XEN_CPUFEATURE */
+
+/* Clean up from a default include.  Close the enum (for C). */
+#ifdef XEN_CPUFEATURESET_DEFAULT_INCLUDE
+#undef XEN_CPUFEATURESET_DEFAULT_INCLUDE
+#undef XEN_CPUFEATURE
+};
+
+#endif /* XEN_CPUFEATURESET_DEFAULT_INCLUDE */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.1.4



* [PATCH v4 02/26] xen/x86: Script to automatically process featureset information
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
  2016-03-23 16:36 ` [PATCH v4 01/26] xen/public: Export cpu featureset information in the public API Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-23 16:36 ` [PATCH v4 03/26] xen/x86: Collect more cpuid feature leaves Andrew Cooper
                   ` (24 subsequent siblings)
  26 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

This script consumes include/public/arch-x86/cpufeatureset.h and generates a
single include/asm-x86/cpuid-autogen.h containing all the processed
information.

It currently generates just FEATURESET_NR_ENTRIES.  Future changes will
generate more information.
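
As a rough illustration of the calculation (a sketch, not the script itself):
each feature is a bit index of the form word*32+bit, so the number of 32-bit
words in a featureset is the highest index shifted right by 5, plus one.  The
helper name nr_entries() is hypothetical; the sample values are taken from
cpufeatureset.h.

```python
# Sketch: derive the featureset size from the feature bit indices.
def nr_entries(bit_indices):
    """Number of 32-bit words needed to hold the highest bit index."""
    return (max(bit_indices) >> 5) + 1

# A few entries copied from cpufeatureset.h (name -> bit index).
features = {
    "SMAP":   5 * 32 + 20,
    "OSPKE":  6 * 32 + 4,
    "ITSC":   7 * 32 + 8,
    "CLZERO": 8 * 32 + 0,
}

# CLZERO lives in word 8, so nine words are needed, matching the
# hardcoded FSCAPINTS value which this patch replaces.
print(nr_entries(features.values()))
```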

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <JBeulich@suse.com>
---
v2:
 * New
v3:
 * Rebased over the new namespacing in cpufeatureset.h
v4:
 * Speeling fixes
---
 .gitignore                       |   1 +
 xen/include/Makefile             |  10 ++
 xen/include/asm-x86/cpufeature.h |   3 +-
 xen/tools/gen-cpuid.py           | 197 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 210 insertions(+), 1 deletion(-)
 create mode 100755 xen/tools/gen-cpuid.py

diff --git a/.gitignore b/.gitignore
index 91f690c..b40453e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -252,6 +252,7 @@ xen/include/headers.chk
 xen/include/headers++.chk
 xen/include/asm
 xen/include/asm-*/asm-offsets.h
+xen/include/asm-x86/cpuid-autogen.h
 xen/include/compat/*
 xen/include/config/
 xen/include/generated/
diff --git a/xen/include/Makefile b/xen/include/Makefile
index 9c8188b..268bc9d 100644
--- a/xen/include/Makefile
+++ b/xen/include/Makefile
@@ -117,5 +117,15 @@ headers++.chk: $(PUBLIC_HEADERS) Makefile
 
 endif
 
+ifeq ($(XEN_TARGET_ARCH),x86_64)
+
+$(BASEDIR)/include/asm-x86/cpuid-autogen.h: $(BASEDIR)/include/public/arch-x86/cpufeatureset.h $(BASEDIR)/tools/gen-cpuid.py FORCE
+	$(PYTHON) $(BASEDIR)/tools/gen-cpuid.py -i $^ -o $@.new
+	$(call move-if-changed,$@.new,$@)
+
+all: $(BASEDIR)/include/asm-x86/cpuid-autogen.h
+endif
+
 clean::
 	rm -rf compat headers.chk headers++.chk
+	rm -f $(BASEDIR)/include/asm-x86/cpuid-autogen.h
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index a044616..bcda09b 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -11,8 +11,9 @@
 
 #include <xen/const.h>
 #include <asm/cpufeatureset.h>
+#include <asm/cpuid-autogen.h>
 
-#define FSCAPINTS 9
+#define FSCAPINTS FEATURESET_NR_ENTRIES
 #define NCAPINTS (FSCAPINTS + 1) /* N 32-bit words worth of info */
 
 /* Other features, Xen-defined mapping. */
diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py
new file mode 100755
index 0000000..c6bd98d
--- /dev/null
+++ b/xen/tools/gen-cpuid.py
@@ -0,0 +1,197 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+import sys, os, re
+
+class Fail(Exception):
+    pass
+
+class State(object):
+
+    def __init__(self, input, output):
+
+        self.source = input
+        self.input  = open_file_or_fd(input, "r", 2)
+        self.output = open_file_or_fd(output, "w", 2)
+
+        # State parsed from input
+        self.names = {} # Name => value mapping
+
+        # State calculated
+        self.nr_entries = 0 # Number of words in a featureset
+
+def parse_definitions(state):
+    """
+    Parse featureset information from state.input and mutate the global
+    namespace with symbols
+    """
+    feat_regex = re.compile(
+        r"^XEN_CPUFEATURE\(([A-Z0-9_]+),"
+        "\s+([\s\d]+\*[\s\d]+\+[\s\d]+)\).*$")
+
+    this = sys.modules[__name__]
+
+    for l in state.input.readlines():
+        # Short circuit the regex...
+        if not l.startswith("XEN_CPUFEATURE("):
+            continue
+
+        res = feat_regex.match(l)
+
+        if res is None:
+            raise Fail("Failed to interpret '%s'" % (l.strip(), ))
+
+        name = res.groups()[0]
+        val = eval(res.groups()[1]) # Regex confines this to a very simple expression
+
+        if hasattr(this, name):
+            raise Fail("Duplicate symbol %s" % (name,))
+
+        if val in state.names:
+            raise Fail("Aliased value between %s and %s" %
+                       (name, state.names[val]))
+
+        # Mutate the current namespace to insert a feature literal with its
+        # bit index.  Prepend an underscore if the name starts with a digit.
+        if name[0] in "0123456789":
+            this_name = "_" + name
+        else:
+            this_name = name
+        setattr(this, this_name, val)
+
+        # Construct a reverse mapping of value to name
+        state.names[val] = name
+
+    if len(state.names) == 0:
+        raise Fail("No features found")
+
+def featureset_to_uint32s(fs, nr):
+    """ Represent a featureset as a list of C-compatible uint32_t's """
+
+    bitmap = 0L
+    for f in fs:
+        bitmap |= 1L << f
+
+    words = []
+    while bitmap:
+        words.append(bitmap & ((1L << 32) - 1))
+        bitmap >>= 32
+
+    assert len(words) <= nr
+
+    if len(words) < nr:
+        words.extend([0] * (nr - len(words)))
+
+    return [ "0x%08xU" % x for x in words ]
+
+def format_uint32s(words, indent):
+    """ Format a list of uint32_t's suitable for a macro definition """
+    spaces = " " * indent
+    return spaces + (", \\\n" + spaces).join(words) + ", \\"
+
+
+def crunch_numbers(state):
+
+    # Size of bitmaps
+    state.nr_entries = nr_entries = (max(state.names.keys()) >> 5) + 1
+
+
+def write_results(state):
+    state.output.write(
+"""/*
+ * Automatically generated by %s - Do not edit!
+ * Source data: %s
+ */
+#ifndef __XEN_X86__FEATURESET_DATA__
+#define __XEN_X86__FEATURESET_DATA__
+""" % (sys.argv[0], state.source))
+
+    state.output.write(
+"""
+#define FEATURESET_NR_ENTRIES %sU
+""" % (state.nr_entries,
+       ))
+
+    state.output.write(
+"""
+#endif /* __XEN_X86__FEATURESET_DATA__ */
+""")
+
+
+def open_file_or_fd(val, mode, buffering):
+    """
+    If 'val' looks like a decimal integer, open it as an fd.  If not, try to
+    open it as a regular file.
+    """
+
+    fd = -1
+    try:
+        # Does it look like an integer?
+        try:
+            fd = int(val, 10)
+        except ValueError:
+            pass
+
+        if fd == 0:
+            return sys.stdin
+        elif fd == 1:
+            return sys.stdout
+        elif fd == 2:
+            return sys.stderr
+
+        # Try to open it...
+        if fd != -1:
+            return os.fdopen(fd, mode, buffering)
+        else:
+            return open(val, mode, buffering)
+
+    except StandardError, e:
+        if fd != -1:
+            raise Fail("Unable to open fd %d: %s: %s" %
+                       (fd, e.__class__.__name__, e))
+        else:
+            raise Fail("Unable to open file '%s': %s: %s" %
+                       (val, e.__class__.__name__, e))
+
+    raise SystemExit(2)
+
+def main():
+    from optparse import OptionParser
+
+    # Change stdout to be line-buffered.
+    sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 1)
+
+    parser = OptionParser(usage = "%prog [options] -i INPUT -o OUTPUT",
+                          description =
+                          "Process featureset information")
+
+    parser.add_option("-i", "--in", dest = "fin", metavar = "<FD or FILE>",
+                      default = "0",
+                      help = "Featureset definitions")
+    parser.add_option("-o", "--out", dest = "fout", metavar = "<FD or FILE>",
+                      default = "1",
+                      help = "Featureset calculated information")
+
+    opts, _ = parser.parse_args()
+
+    if opts.fin is None or opts.fout is None:
+        parser.print_help(sys.stderr)
+        raise SystemExit(1)
+
+    state = State(opts.fin, opts.fout)
+
+    parse_definitions(state)
+    crunch_numbers(state)
+    write_results(state)
+
+
+if __name__ == "__main__":
+    try:
+        sys.exit(main())
+    except Fail, e:
+        print >>sys.stderr, "%s:" % (sys.argv[0],), e
+        sys.exit(1)
+    except SystemExit, e:
+        sys.exit(e.code)
+    except KeyboardInterrupt:
+        sys.exit(2)
-- 
2.1.4


* [PATCH v4 03/26] xen/x86: Collect more cpuid feature leaves
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
  2016-03-23 16:36 ` [PATCH v4 01/26] xen/public: Export cpu featureset information in the public API Andrew Cooper
  2016-03-23 16:36 ` [PATCH v4 02/26] xen/x86: Script to automatically process featureset information Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-23 16:36 ` [PATCH v4 04/26] xen/x86: Mask out unknown features from Xen's capabilities Andrew Cooper
                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

New words are:
 * 0x80000007.edx - Contains Invariant TSC
 * 0x80000008.ebx - Newly used for AMD Zen processors

In addition, replace some open-coded ITSC and EFRO manipulation.
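
To illustrate why the open-coded tests can be replaced (a sketch under stated
assumptions): a feature index splits into a word number and a bit within that
word.  The ITSC and EFRO values are from cpufeatureset.h.  The Python helpers
below are stand-ins which assume the usual idx/32 and idx%32 split for
cpufeat_word() and cpufeat_mask(), whose definitions are not part of this
patch.

```python
def cpufeat_word(idx):
    """Index of the 32-bit word holding feature bit idx."""
    return idx // 32

def cpufeat_mask(idx):
    """Mask selecting feature bit idx within its word."""
    return 1 << (idx % 32)

def cpu_has(caps, idx):
    """True if feature bit idx is set in the capability words."""
    return bool(caps[cpufeat_word(idx)] & cpufeat_mask(idx))

ITSC = 7 * 32 + 8    # 0x80000007.edx bit 8
EFRO = 7 * 32 + 10   # 0x80000007.edx bit 10

# Pretend CPUID 0x80000007.edx reported only bit 8, and store it in word 7
# the way generic_identify() does for the real register value.
caps = [0] * 9
caps[cpufeat_word(ITSC)] = 1 << 8

print(cpu_has(caps, ITSC), cpu_has(caps, EFRO))
```

With the leaf collected once, cpu_has(c, X86_FEATURE_ITSC) tests the same bit
as the previous cpuid_edx(0x80000007) & (1<<8) check.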

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <JBeulich@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
v2:
 * Rely on ordering of generic_identify() to simplify init_amd()
 * Remove opencoded EFRO manipulation as well
---
 xen/arch/x86/cpu/amd.c    | 21 +++------------------
 xen/arch/x86/cpu/common.c |  6 ++++++
 xen/arch/x86/cpu/intel.c  |  2 +-
 xen/arch/x86/domain.c     |  2 +-
 4 files changed, 11 insertions(+), 20 deletions(-)

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index a4bef21..47a38c6 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -294,21 +294,6 @@ int cpu_has_amd_erratum(const struct cpuinfo_x86 *cpu, int osvw_id, ...)
 	return 0;
 }
 
-/* Can this system suffer from TSC drift due to C1 clock ramping? */
-static int c1_ramping_may_cause_clock_drift(struct cpuinfo_x86 *c) 
-{ 
-	if (cpuid_edx(0x80000007) & (1<<8)) {
-		/*
-		 * CPUID.AdvPowerMgmtInfo.TscInvariant
-		 * EDX bit 8, 8000_0007
-		 * Invariant TSC on 8th Gen or newer, use it
-		 * (assume all cores have invariant TSC)
-		 */
-		return 0;
-	}
-	return 1;
-}
-
 /*
  * Disable C1-Clock ramping if enabled in PMM7.CpuLowPwrEnh on 8th-generation
  * cores only. Assume BIOS has setup all Northbridges equivalently.
@@ -475,7 +460,7 @@ static void init_amd(struct cpuinfo_x86 *c)
 	}
 
 	if (c->extended_cpuid_level >= 0x80000007) {
-		if (cpuid_edx(0x80000007) & (1<<8)) {
+		if (cpu_has(c, X86_FEATURE_ITSC)) {
 			__set_bit(X86_FEATURE_CONSTANT_TSC, c->x86_capability);
 			__set_bit(X86_FEATURE_NONSTOP_TSC, c->x86_capability);
 			if (c->x86 != 0x11)
@@ -600,14 +585,14 @@ static void init_amd(struct cpuinfo_x86 *c)
 		wrmsrl(MSR_K7_PERFCTR3, 0);
 	}
 
-	if (cpuid_edx(0x80000007) & (1 << 10)) {
+	if (cpu_has(c, X86_FEATURE_EFRO)) {
 		rdmsr(MSR_K7_HWCR, l, h);
 		l |= (1 << 27); /* Enable read-only APERF/MPERF bit */
 		wrmsr(MSR_K7_HWCR, l, h);
 	}
 
 	/* Prevent TSC drift in non single-processor, single-core platforms. */
-	if ((smp_processor_id() == 1) && c1_ramping_may_cause_clock_drift(c))
+	if ((smp_processor_id() == 1) && !cpu_has(c, X86_FEATURE_ITSC))
 		disable_c1_ramping();
 
 	set_cpuidmask(c);
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 8b94c1b..1a278b1 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -269,6 +269,12 @@ static void generic_identify(struct cpuinfo_x86 *c)
 
 	if (c->extended_cpuid_level >= 0x80000004)
 		get_model_name(c); /* Default name */
+	if (c->extended_cpuid_level >= 0x80000007)
+		c->x86_capability[cpufeat_word(X86_FEATURE_ITSC)]
+			= cpuid_edx(0x80000007);
+	if (c->extended_cpuid_level >= 0x80000008)
+		c->x86_capability[cpufeat_word(X86_FEATURE_CLZERO)]
+			= cpuid_ebx(0x80000008);
 
 	/* Intel-defined flags: level 0x00000007 */
 	if ( c->cpuid_level >= 0x00000007 )
diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
index d4f574b..bdf89f6 100644
--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -281,7 +281,7 @@ static void init_intel(struct cpuinfo_x86 *c)
 	if ((c->x86 == 0xf && c->x86_model >= 0x03) ||
 		(c->x86 == 0x6 && c->x86_model >= 0x0e))
 		__set_bit(X86_FEATURE_CONSTANT_TSC, c->x86_capability);
-	if (cpuid_edx(0x80000007) & (1u<<8)) {
+	if (cpu_has(c, X86_FEATURE_ITSC)) {
 		__set_bit(X86_FEATURE_CONSTANT_TSC, c->x86_capability);
 		__set_bit(X86_FEATURE_NONSTOP_TSC, c->x86_capability);
 		__set_bit(X86_FEATURE_TSC_RELIABLE, c->x86_capability);
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index a33f975..6ec7554 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -2614,7 +2614,7 @@ void domain_cpuid(
              */
             if ( (input == 0x80000007) && /* Advanced Power Management */
                  !d->disable_migrate && !d->arch.vtsc )
-                *edx &= ~(1u<<8); /* TSC Invariant */
+                *edx &= ~cpufeat_mask(X86_FEATURE_ITSC);
 
             return;
         }
-- 
2.1.4


* [PATCH v4 04/26] xen/x86: Mask out unknown features from Xen's capabilities
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (2 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 03/26] xen/x86: Collect more cpuid feature leaves Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-23 16:36 ` [PATCH v4 05/26] xen/x86: Annotate special features Andrew Cooper
                   ` (22 subsequent siblings)
  26 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

If Xen doesn't know about a feature, it is unsafe for use and should be
deliberately hidden from Xen's capabilities.

This doesn't make a practical difference yet, but will make a difference
later when the guest featuresets are seeded from the host featureset.
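
The masking itself amounts to a word-by-word AND, which can be sketched as
follows.  This is an illustration only: mask_unknown() is a hypothetical name
and the word values are made up, not real CPUID data.

```python
def mask_unknown(host_words, known_words):
    """Clear every bit in the host featureset which is not a known feature."""
    return [h & k for h, k in zip(host_words, known_words)]

# Made-up example values, two words for brevity.
host  = [0xffffffff, 0x00000101]   # what the hardware reports
known = [0x0000ffff, 0x00000100]   # bits with an XEN_CPUFEATURE() entry

masked = mask_unknown(host, known)
print(["0x%08x" % w for w in masked])
```

Any bit the hardware advertises without a matching entry in known_features
is dropped before the featureset is used further.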

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <JBeulich@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
v2:
 * Reduced substantially from v1, by using the autogenerated information.
v3:
 * Drop redundant braces.
---
 xen/arch/x86/Makefile            |  1 +
 xen/arch/x86/cpu/common.c        |  2 ++
 xen/arch/x86/cpuid.c             | 20 ++++++++++++++++++++
 xen/include/asm-x86/cpufeature.h |  4 +---
 xen/include/asm-x86/cpuid.h      | 25 +++++++++++++++++++++++++
 xen/tools/gen-cpuid.py           | 24 ++++++++++++++++++++++++
 6 files changed, 73 insertions(+), 3 deletions(-)
 create mode 100644 xen/arch/x86/cpuid.c
 create mode 100644 xen/include/asm-x86/cpuid.h

diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index 1bcb08b..729065b 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -12,6 +12,7 @@ obj-y += bitops.o
 obj-bin-y += bzimage.init.o
 obj-bin-y += clear_page.o
 obj-bin-y += copy_page.o
+obj-y += cpuid.o
 obj-y += compat.o x86_64/compat.o
 obj-$(CONFIG_KEXEC) += crash.o
 obj-y += debug.o
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 1a278b1..d302272 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -341,6 +341,8 @@ void identify_cpu(struct cpuinfo_x86 *c)
 	 * The vendor-specific functions might have changed features.  Now
 	 * we do "generic changes."
 	 */
+	for (i = 0; i < FSCAPINTS; ++i)
+		c->x86_capability[i] &= known_features[i];
 
 	for (i = 0 ; i < NCAPINTS ; ++i)
 		c->x86_capability[i] &= ~cleared_caps[i];
diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
new file mode 100644
index 0000000..05cd646
--- /dev/null
+++ b/xen/arch/x86/cpuid.c
@@ -0,0 +1,20 @@
+#include <xen/init.h>
+#include <xen/lib.h>
+#include <asm/cpuid.h>
+
+const uint32_t known_features[] = INIT_KNOWN_FEATURES;
+
+static void __init __maybe_unused build_assertions(void)
+{
+    BUILD_BUG_ON(ARRAY_SIZE(known_features) != FSCAPINTS);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index bcda09b..e29b024 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -10,10 +10,8 @@
 #endif
 
 #include <xen/const.h>
-#include <asm/cpufeatureset.h>
-#include <asm/cpuid-autogen.h>
+#include <asm/cpuid.h>
 
-#define FSCAPINTS FEATURESET_NR_ENTRIES
 #define NCAPINTS (FSCAPINTS + 1) /* N 32-bit words worth of info */
 
 /* Other features, Xen-defined mapping. */
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
new file mode 100644
index 0000000..b72d88f
--- /dev/null
+++ b/xen/include/asm-x86/cpuid.h
@@ -0,0 +1,25 @@
+#ifndef __X86_CPUID_H__
+#define __X86_CPUID_H__
+
+#include <asm/cpufeatureset.h>
+#include <asm/cpuid-autogen.h>
+
+#define FSCAPINTS FEATURESET_NR_ENTRIES
+
+#ifndef __ASSEMBLY__
+#include <xen/types.h>
+
+extern const uint32_t known_features[FSCAPINTS];
+
+#endif /* __ASSEMBLY__ */
+#endif /* !__X86_CPUID_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py
index c6bd98d..44b4c98 100755
--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -19,6 +19,8 @@ class State(object):
 
         # State calculated
         self.nr_entries = 0 # Number of words in a featureset
+        self.common_1d = 0 # Common features between 1d and e1d
+        self.known = [] # All known features
 
 def parse_definitions(state):
     """
@@ -95,6 +97,22 @@ def crunch_numbers(state):
     # Size of bitmaps
     state.nr_entries = nr_entries = (max(state.names.keys()) >> 5) + 1
 
+    # Features common between 1d and e1d.
+    common_1d = (FPU, VME, DE, PSE, TSC, MSR, PAE, MCE, CX8, APIC,
+                 MTRR, PGE, MCA, CMOV, PAT, PSE36, MMX, FXSR)
+
+    # All known features.  Duplicate the common features in e1d
+    e1d_base = SYSCALL & ~31
+    state.known = featureset_to_uint32s(
+        state.names.keys() + [ e1d_base + (x % 32) for x in common_1d ],
+        nr_entries)
+
+    # Fold common back into names
+    for f in common_1d:
+        state.names[e1d_base + (f % 32)] = "E1D_" + state.names[f]
+
+    state.common_1d = featureset_to_uint32s(common_1d, 1)[0]
+
 
 def write_results(state):
     state.output.write(
@@ -109,7 +127,13 @@ def write_results(state):
     state.output.write(
 """
 #define FEATURESET_NR_ENTRIES %sU
+
+#define CPUID_COMMON_1D_FEATURES %s
+
+#define INIT_KNOWN_FEATURES { \\\n%s\n}
 """ % (state.nr_entries,
+       state.common_1d,
+       format_uint32s(state.known, 4),
        ))
 
     state.output.write(
-- 
2.1.4


* [PATCH v4 05/26] xen/x86: Annotate special features
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (3 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 04/26] xen/x86: Mask out unknown features from Xen's capabilities Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-23 16:36 ` [PATCH v4 06/26] xen/x86: Annotate VM applicability in featureset Andrew Cooper
                   ` (21 subsequent siblings)
  26 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

Some bits in a featureset are not simply an indication of new functionality,
and require special handling.

APIC, OSXSAVE and OSPKE are fast-forwards of other pieces of state;
IA32_APIC_BASE.EN, CR4.OSXSAVE and CR4.OSPKE.  Xen will take care of filling
these appropriately at runtime.

FDP_EXCP_ONLY and NO_FPU_SEL are bits indicating reduced functionality in the
x87 pipeline.  The effects of these cannot be hidden from the guest, so the
host values will always be provided.

HTT, X2APIC and CMP_LEGACY indicate how to interpret other cpuid leaves.  In
most cases, the toolstack value will be used (with the expectation that these
flags will match the other provided topology information).  However, with cpuid
masking, the host values are presented, as masking cannot influence what the
guest sees in the dependent leaves.

HYPERVISOR is unconditionally set in the PV ABI, but follows the toolstack
setting for HVM guests.
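
A minimal sketch of how the '!' annotation can be picked out of a definition
line, using the same pattern as the gen-cpuid.py change below.  parse_line()
is a hypothetical helper, and the arithmetic is done by hand here instead of
with eval() purely to keep the example self-contained.

```python
import re

# Same pattern as in the patch, written as raw strings throughout.
FEAT_REGEX = re.compile(
    r"^XEN_CPUFEATURE\(([A-Z0-9_]+),"
    r"\s+([\s\d]+\*[\s\d]+\+[\s\d]+)\)"
    r"\s+/\*([!]*) .*$")

def parse_line(line):
    """Return (name, bit index, attribute string) for one definition."""
    res = FEAT_REGEX.match(line)
    if res is None:
        raise ValueError("Failed to interpret '%s'" % line.strip())
    name, expr, attr = res.groups()
    # Evaluate "word*32+bit" explicitly; int() tolerates the padding spaces.
    word, rest = expr.split("*")
    width, bit = rest.split("+")
    return name, int(word) * int(width) + int(bit), attr

SAMPLE = [
    "XEN_CPUFEATURE(FPU,           0*32+ 0) /*   Onboard FPU */",
    "XEN_CPUFEATURE(APIC,          0*32+ 9) /*!  Onboard APIC */",
    "XEN_CPUFEATURE(OSPKE,         6*32+ 4) /*!  OS Protection Keys Enable */",
]

# Collect the bit indices of every feature marked special.
special = set()
for l in SAMPLE:
    name, val, attr = parse_line(l)
    if "!" in attr:
        special.add(val)

print(sorted(special))
```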

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
v3:
 * Essentially new.  Replaces "Store antifeatures inverted in a featureset"
v4:
 * Include X2APIC and HYPERVISOR as special bits.
---
 xen/arch/x86/cpuid.c                        |  2 ++
 xen/include/asm-x86/cpuid.h                 |  1 +
 xen/include/public/arch-x86/cpufeatureset.h | 30 ++++++++++++++++++++---------
 xen/tools/gen-cpuid.py                      | 17 +++++++++++++++-
 4 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 05cd646..77e008a 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -3,10 +3,12 @@
 #include <asm/cpuid.h>
 
 const uint32_t known_features[] = INIT_KNOWN_FEATURES;
+const uint32_t special_features[] = INIT_SPECIAL_FEATURES;
 
 static void __init __maybe_unused build_assertions(void)
 {
     BUILD_BUG_ON(ARRAY_SIZE(known_features) != FSCAPINTS);
+    BUILD_BUG_ON(ARRAY_SIZE(special_features) != FSCAPINTS);
 }
 
 /*
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index b72d88f..0ecf357 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -10,6 +10,7 @@
 #include <xen/types.h>
 
 extern const uint32_t known_features[FSCAPINTS];
+extern const uint32_t special_features[FSCAPINTS];
 
 #endif /* __ASSEMBLY__ */
 #endif /* !__X86_CPUID_H__ */
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 5da37eb..8308972 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -71,6 +71,18 @@ enum {
  * CPUID instruction, but this is not preclude other sources of information.
  */
 
+/*
+ * Attribute syntax:
+ *
+ * Attributes for a particular feature are provided as characters before the
+ * first space in the comment immediately following the feature value.
+ *
+ * Special: '!'
+ *   This bit has special properties and is not a straight indication of a
+ *   piece of new functionality.  Xen will handle these differently,
+ *   and may override toolstack settings completely.
+ */
+
 /* Intel-defined CPU features, CPUID level 0x00000001.edx, word 0 */
 XEN_CPUFEATURE(FPU,           0*32+ 0) /*   Onboard FPU */
 XEN_CPUFEATURE(VME,           0*32+ 1) /*   Virtual Mode Extensions */
@@ -81,7 +93,7 @@ XEN_CPUFEATURE(MSR,           0*32+ 5) /*   Model-Specific Registers, RDMSR, WRM
 XEN_CPUFEATURE(PAE,           0*32+ 6) /*   Physical Address Extensions */
 XEN_CPUFEATURE(MCE,           0*32+ 7) /*   Machine Check Architecture */
 XEN_CPUFEATURE(CX8,           0*32+ 8) /*   CMPXCHG8 instruction */
-XEN_CPUFEATURE(APIC,          0*32+ 9) /*   Onboard APIC */
+XEN_CPUFEATURE(APIC,          0*32+ 9) /*!  Onboard APIC */
 XEN_CPUFEATURE(SEP,           0*32+11) /*   SYSENTER/SYSEXIT */
 XEN_CPUFEATURE(MTRR,          0*32+12) /*   Memory Type Range Registers */
 XEN_CPUFEATURE(PGE,           0*32+13) /*   Page Global Enable */
@@ -96,7 +108,7 @@ XEN_CPUFEATURE(MMX,           0*32+23) /*   Multimedia Extensions */
 XEN_CPUFEATURE(FXSR,          0*32+24) /*   FXSAVE and FXRSTOR instructions */
 XEN_CPUFEATURE(SSE,           0*32+25) /*   Streaming SIMD Extensions */
 XEN_CPUFEATURE(SSE2,          0*32+26) /*   Streaming SIMD Extensions-2 */
-XEN_CPUFEATURE(HTT,           0*32+28) /*   Hyper-Threading Technology */
+XEN_CPUFEATURE(HTT,           0*32+28) /*!  Hyper-Threading Technology */
 XEN_CPUFEATURE(TM1,           0*32+29) /*   Thermal Monitor 1 */
 XEN_CPUFEATURE(PBE,           0*32+31) /*   Pending Break Enable */
 
@@ -119,17 +131,17 @@ XEN_CPUFEATURE(PCID,          1*32+17) /*   Process Context ID */
 XEN_CPUFEATURE(DCA,           1*32+18) /*   Direct Cache Access */
 XEN_CPUFEATURE(SSE4_1,        1*32+19) /*   Streaming SIMD Extensions 4.1 */
 XEN_CPUFEATURE(SSE4_2,        1*32+20) /*   Streaming SIMD Extensions 4.2 */
-XEN_CPUFEATURE(X2APIC,        1*32+21) /*   Extended xAPIC */
+XEN_CPUFEATURE(X2APIC,        1*32+21) /*!  Extended xAPIC */
 XEN_CPUFEATURE(MOVBE,         1*32+22) /*   movbe instruction */
 XEN_CPUFEATURE(POPCNT,        1*32+23) /*   POPCNT instruction */
 XEN_CPUFEATURE(TSC_DEADLINE,  1*32+24) /*   TSC Deadline Timer */
 XEN_CPUFEATURE(AESNI,         1*32+25) /*   AES instructions */
 XEN_CPUFEATURE(XSAVE,         1*32+26) /*   XSAVE/XRSTOR/XSETBV/XGETBV */
-XEN_CPUFEATURE(OSXSAVE,       1*32+27) /*   OSXSAVE */
+XEN_CPUFEATURE(OSXSAVE,       1*32+27) /*!  OSXSAVE */
 XEN_CPUFEATURE(AVX,           1*32+28) /*   Advanced Vector Extensions */
 XEN_CPUFEATURE(F16C,          1*32+29) /*   Half-precision convert instruction */
 XEN_CPUFEATURE(RDRAND,        1*32+30) /*   Digital Random Number Generator */
-XEN_CPUFEATURE(HYPERVISOR,    1*32+31) /*   Running under some hypervisor */
+XEN_CPUFEATURE(HYPERVISOR,    1*32+31) /*!  Running under some hypervisor */
 
 /* AMD-defined CPU features, CPUID level 0x80000001.edx, word 2 */
 XEN_CPUFEATURE(SYSCALL,       2*32+11) /*   SYSCALL/SYSRET */
@@ -144,7 +156,7 @@ XEN_CPUFEATURE(3DNOW,         2*32+31) /*   3DNow! */
 
 /* AMD-defined CPU features, CPUID level 0x80000001.ecx, word 3 */
 XEN_CPUFEATURE(LAHF_LM,       3*32+ 0) /*   LAHF/SAHF in long mode */
-XEN_CPUFEATURE(CMP_LEGACY,    3*32+ 1) /*   If yes HyperThreading not valid */
+XEN_CPUFEATURE(CMP_LEGACY,    3*32+ 1) /*!  If yes HyperThreading not valid */
 XEN_CPUFEATURE(SVM,           3*32+ 2) /*   Secure virtual machine */
 XEN_CPUFEATURE(EXTAPIC,       3*32+ 3) /*   Extended APIC space */
 XEN_CPUFEATURE(CR8_LEGACY,    3*32+ 4) /*   CR8 in 32-bit mode */
@@ -177,14 +189,14 @@ XEN_CPUFEATURE(TSC_ADJUST,    5*32+ 1) /*   TSC_ADJUST MSR available */
 XEN_CPUFEATURE(BMI1,          5*32+ 3) /*   1st bit manipulation extensions */
 XEN_CPUFEATURE(HLE,           5*32+ 4) /*   Hardware Lock Elision */
 XEN_CPUFEATURE(AVX2,          5*32+ 5) /*   AVX2 instructions */
-XEN_CPUFEATURE(FDP_EXCP_ONLY, 5*32+ 6) /*   x87 FDP only updated on exception. */
+XEN_CPUFEATURE(FDP_EXCP_ONLY, 5*32+ 6) /*!  x87 FDP only updated on exception. */
 XEN_CPUFEATURE(SMEP,          5*32+ 7) /*   Supervisor Mode Execution Protection */
 XEN_CPUFEATURE(BMI2,          5*32+ 8) /*   2nd bit manipulation extensions */
 XEN_CPUFEATURE(ERMS,          5*32+ 9) /*   Enhanced REP MOVSB/STOSB */
 XEN_CPUFEATURE(INVPCID,       5*32+10) /*   Invalidate Process Context ID */
 XEN_CPUFEATURE(RTM,           5*32+11) /*   Restricted Transactional Memory */
 XEN_CPUFEATURE(PQM,           5*32+12) /*   Platform QoS Monitoring */
-XEN_CPUFEATURE(NO_FPU_SEL,    5*32+13) /*   FPU CS/DS stored as zero */
+XEN_CPUFEATURE(NO_FPU_SEL,    5*32+13) /*!  FPU CS/DS stored as zero */
 XEN_CPUFEATURE(MPX,           5*32+14) /*   Memory Protection Extensions */
 XEN_CPUFEATURE(PQE,           5*32+15) /*   Platform QoS Enforcement */
 XEN_CPUFEATURE(RDSEED,        5*32+18) /*   RDSEED instruction */
@@ -198,7 +210,7 @@ XEN_CPUFEATURE(SHA,           5*32+29) /*   SHA1 & SHA256 instructions */
 /* Intel-defined CPU features, CPUID level 0x00000007:0.ecx, word 6 */
 XEN_CPUFEATURE(PREFETCHWT1,   6*32+ 0) /*   PREFETCHWT1 instruction */
 XEN_CPUFEATURE(PKU,           6*32+ 3) /*   Protection Keys for Userspace */
-XEN_CPUFEATURE(OSPKE,         6*32+ 4) /*   OS Protection Keys Enable */
+XEN_CPUFEATURE(OSPKE,         6*32+ 4) /*!  OS Protection Keys Enable */
 
 /* AMD-defined CPU features, CPUID level 0x80000007.edx, word 7 */
 XEN_CPUFEATURE(ITSC,          7*32+ 8) /*   Invariant TSC */
diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py
index 44b4c98..5076fb9 100755
--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -16,11 +16,13 @@ class State(object):
 
         # State parsed from input
         self.names = {} # Name => value mapping
+        self.raw_special = set()
 
         # State calculated
         self.nr_entries = 0 # Number of words in a featureset
         self.common_1d = 0 # Common features between 1d and e1d
         self.known = [] # All known features
+        self.special = [] # Features with special semantics
 
 def parse_definitions(state):
     """
@@ -29,7 +31,8 @@ def parse_definitions(state):
     """
     feat_regex = re.compile(
         r"^XEN_CPUFEATURE\(([A-Z0-9_]+),"
-        "\s+([\s\d]+\*[\s\d]+\+[\s\d]+)\).*$")
+        "\s+([\s\d]+\*[\s\d]+\+[\s\d]+)\)"
+        "\s+/\*([!]*) .*$")
 
     this = sys.modules[__name__]
 
@@ -45,6 +48,7 @@ def parse_definitions(state):
 
         name = res.groups()[0]
         val = eval(res.groups()[1]) # Regex confines this to a very simple expression
+        attr = res.groups()[2]
 
         if hasattr(this, name):
             raise Fail("Duplicate symbol %s" % (name,))
@@ -64,6 +68,13 @@ def parse_definitions(state):
         # Construct a reverse mapping of value to name
         state.names[val] = name
 
+        for a in attr:
+
+            if a == "!":
+                state.raw_special.add(val)
+            else:
+                raise Fail("Unrecognised attribute '%s' for %s" % (a, name))
+
     if len(state.names) == 0:
         raise Fail("No features found")
 
@@ -112,6 +123,7 @@ def crunch_numbers(state):
         state.names[e1d_base + (f % 32)] = "E1D_" + state.names[f]
 
     state.common_1d = featureset_to_uint32s(common_1d, 1)[0]
+    state.special = featureset_to_uint32s(state.raw_special, nr_entries)
 
 
 def write_results(state):
@@ -131,9 +143,12 @@ def write_results(state):
 #define CPUID_COMMON_1D_FEATURES %s
 
 #define INIT_KNOWN_FEATURES { \\\n%s\n}
+
+#define INIT_SPECIAL_FEATURES { \\\n%s\n}
 """ % (state.nr_entries,
        state.common_1d,
        format_uint32s(state.known, 4),
+       format_uint32s(state.special, 4),
        ))
 
     state.output.write(
-- 
2.1.4


* [PATCH v4 06/26] xen/x86: Annotate VM applicability in featureset
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (4 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 05/26] xen/x86: Annotate special features Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-23 16:36 ` [PATCH v4 07/26] xen/x86: Calculate maximum host and guest featuresets Andrew Cooper
                   ` (20 subsequent siblings)
  26 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Use attributes to specify whether a feature may be exposed to:
 1) All guests
 2) HVM guests
 3) HVM HAP guests
and, via absence of an attribute, to no guests.

There is no current need for other categories (e.g. PV-only features), and
such categories should not be introduced if possible.  These categories follow
from the fact that, with increased hardware support, a guest gets more
features to use.
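
The nesting of the three categories can be sketched as below.  This is an
assumption-laden illustration rather than the script's implementation:
guest_featuresets() and the FEAT_* names are hypothetical, and the letters
assigned to them are made up.

```python
def guest_featuresets(attrs):
    """
    attrs maps a feature name to its applicability letter ('A', 'S' or 'H'),
    or to None when the feature is not exposed to any guest.
    Returns the (pv, hvm_shadow, hvm_hap) sets of feature names.
    """
    pv = set(f for f, a in attrs.items() if a == "A")
    hvm_shadow = pv | set(f for f, a in attrs.items() if a == "S")
    hvm_hap = hvm_shadow | set(f for f, a in attrs.items() if a == "H")
    return pv, hvm_shadow, hvm_hap

# Hypothetical features, one per category.
attrs = {
    "FEAT_ALL":  "A",
    "FEAT_HVM":  "S",
    "FEAT_HAP":  "H",
    "FEAT_NONE": None,
}

pv, hvm_shadow, hvm_hap = guest_featuresets(attrs)
print(sorted(pv), sorted(hvm_shadow), sorted(hvm_hap))
```

Each set is a superset of the previous one, which is why a PV-only category
has no natural place in this scheme.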

These settings are derived from the existing code in {pv,hvm}_cpuid(), and
xc_cpuid_x86.c.  One notable exception is EXTAPIC which was previously
erroneously exposed to guests.  PV guests don't get to use the APIC and the
HVM APIC emulation doesn't support extended space.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
CC: Jan Beulich <JBeulich@suse.com>

v2:
 * Annotate features using a magic comment and autogeneration.
v3:
 * Rebase over the new namespacing changes.
 * Expand commit message.
 * Correct PSE36 to being a HAP-only feature.
v4:
 * Re-break PSE36.
 * Hide LWP from PV guests.
---
 xen/include/public/arch-x86/cpufeatureset.h | 187 ++++++++++++++--------------
 xen/tools/gen-cpuid.py                      |  30 ++++-
 2 files changed, 125 insertions(+), 92 deletions(-)

diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 8308972..75dd2ac 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -81,135 +81,140 @@ enum {
  *   This bit has special properties and is not a straight indication of a
  *   piece of new functionality.  Xen will handle these differently,
  *   and may override toolstack settings completely.
+ *
+ * Applicability to guests: 'A', 'S' or 'H'
+ *   'A' = All guests.
+ *   'S' = All HVM guests (not PV guests).
+ *   'H' = HVM HAP guests (not PV or HVM Shadow guests).
  */
 
 /* Intel-defined CPU features, CPUID level 0x00000001.edx, word 0 */
-XEN_CPUFEATURE(FPU,           0*32+ 0) /*   Onboard FPU */
-XEN_CPUFEATURE(VME,           0*32+ 1) /*   Virtual Mode Extensions */
-XEN_CPUFEATURE(DE,            0*32+ 2) /*   Debugging Extensions */
-XEN_CPUFEATURE(PSE,           0*32+ 3) /*   Page Size Extensions */
-XEN_CPUFEATURE(TSC,           0*32+ 4) /*   Time Stamp Counter */
-XEN_CPUFEATURE(MSR,           0*32+ 5) /*   Model-Specific Registers, RDMSR, WRMSR */
-XEN_CPUFEATURE(PAE,           0*32+ 6) /*   Physical Address Extensions */
-XEN_CPUFEATURE(MCE,           0*32+ 7) /*   Machine Check Architecture */
-XEN_CPUFEATURE(CX8,           0*32+ 8) /*   CMPXCHG8 instruction */
-XEN_CPUFEATURE(APIC,          0*32+ 9) /*!  Onboard APIC */
-XEN_CPUFEATURE(SEP,           0*32+11) /*   SYSENTER/SYSEXIT */
-XEN_CPUFEATURE(MTRR,          0*32+12) /*   Memory Type Range Registers */
-XEN_CPUFEATURE(PGE,           0*32+13) /*   Page Global Enable */
-XEN_CPUFEATURE(MCA,           0*32+14) /*   Machine Check Architecture */
-XEN_CPUFEATURE(CMOV,          0*32+15) /*   CMOV instruction (FCMOVCC and FCOMI too if FPU present) */
-XEN_CPUFEATURE(PAT,           0*32+16) /*   Page Attribute Table */
-XEN_CPUFEATURE(PSE36,         0*32+17) /*   36-bit PSEs */
-XEN_CPUFEATURE(CLFLUSH,       0*32+19) /*   CLFLUSH instruction */
+XEN_CPUFEATURE(FPU,           0*32+ 0) /*A  Onboard FPU */
+XEN_CPUFEATURE(VME,           0*32+ 1) /*S  Virtual Mode Extensions */
+XEN_CPUFEATURE(DE,            0*32+ 2) /*A  Debugging Extensions */
+XEN_CPUFEATURE(PSE,           0*32+ 3) /*S  Page Size Extensions */
+XEN_CPUFEATURE(TSC,           0*32+ 4) /*A  Time Stamp Counter */
+XEN_CPUFEATURE(MSR,           0*32+ 5) /*A  Model-Specific Registers, RDMSR, WRMSR */
+XEN_CPUFEATURE(PAE,           0*32+ 6) /*A  Physical Address Extensions */
+XEN_CPUFEATURE(MCE,           0*32+ 7) /*A  Machine Check Exception */
+XEN_CPUFEATURE(CX8,           0*32+ 8) /*A  CMPXCHG8 instruction */
+XEN_CPUFEATURE(APIC,          0*32+ 9) /*!A Onboard APIC */
+XEN_CPUFEATURE(SEP,           0*32+11) /*A  SYSENTER/SYSEXIT */
+XEN_CPUFEATURE(MTRR,          0*32+12) /*S  Memory Type Range Registers */
+XEN_CPUFEATURE(PGE,           0*32+13) /*S  Page Global Enable */
+XEN_CPUFEATURE(MCA,           0*32+14) /*A  Machine Check Architecture */
+XEN_CPUFEATURE(CMOV,          0*32+15) /*A  CMOV instruction (FCMOVCC and FCOMI too if FPU present) */
+XEN_CPUFEATURE(PAT,           0*32+16) /*A  Page Attribute Table */
+XEN_CPUFEATURE(PSE36,         0*32+17) /*S  36-bit PSEs */
+XEN_CPUFEATURE(CLFLUSH,       0*32+19) /*A  CLFLUSH instruction */
 XEN_CPUFEATURE(DS,            0*32+21) /*   Debug Store */
-XEN_CPUFEATURE(ACPI,          0*32+22) /*   ACPI via MSR */
-XEN_CPUFEATURE(MMX,           0*32+23) /*   Multimedia Extensions */
-XEN_CPUFEATURE(FXSR,          0*32+24) /*   FXSAVE and FXRSTOR instructions */
-XEN_CPUFEATURE(SSE,           0*32+25) /*   Streaming SIMD Extensions */
-XEN_CPUFEATURE(SSE2,          0*32+26) /*   Streaming SIMD Extensions-2 */
-XEN_CPUFEATURE(HTT,           0*32+28) /*!  Hyper-Threading Technology */
+XEN_CPUFEATURE(ACPI,          0*32+22) /*A  ACPI via MSR */
+XEN_CPUFEATURE(MMX,           0*32+23) /*A  Multimedia Extensions */
+XEN_CPUFEATURE(FXSR,          0*32+24) /*A  FXSAVE and FXRSTOR instructions */
+XEN_CPUFEATURE(SSE,           0*32+25) /*A  Streaming SIMD Extensions */
+XEN_CPUFEATURE(SSE2,          0*32+26) /*A  Streaming SIMD Extensions-2 */
+XEN_CPUFEATURE(HTT,           0*32+28) /*!A Hyper-Threading Technology */
 XEN_CPUFEATURE(TM1,           0*32+29) /*   Thermal Monitor 1 */
 XEN_CPUFEATURE(PBE,           0*32+31) /*   Pending Break Enable */
 
 /* Intel-defined CPU features, CPUID level 0x00000001.ecx, word 1 */
-XEN_CPUFEATURE(SSE3,          1*32+ 0) /*   Streaming SIMD Extensions-3 */
-XEN_CPUFEATURE(PCLMULQDQ,     1*32+ 1) /*   Carry-less mulitplication */
+XEN_CPUFEATURE(SSE3,          1*32+ 0) /*A  Streaming SIMD Extensions-3 */
+XEN_CPUFEATURE(PCLMULQDQ,     1*32+ 1) /*A  Carry-less multiplication */
 XEN_CPUFEATURE(DTES64,        1*32+ 2) /*   64-bit Debug Store */
 XEN_CPUFEATURE(MONITOR,       1*32+ 3) /*   Monitor/Mwait support */
 XEN_CPUFEATURE(DSCPL,         1*32+ 4) /*   CPL Qualified Debug Store */
-XEN_CPUFEATURE(VMX,           1*32+ 5) /*   Virtual Machine Extensions */
+XEN_CPUFEATURE(VMX,           1*32+ 5) /*S  Virtual Machine Extensions */
 XEN_CPUFEATURE(SMX,           1*32+ 6) /*   Safer Mode Extensions */
 XEN_CPUFEATURE(EIST,          1*32+ 7) /*   Enhanced SpeedStep */
 XEN_CPUFEATURE(TM2,           1*32+ 8) /*   Thermal Monitor 2 */
-XEN_CPUFEATURE(SSSE3,         1*32+ 9) /*   Supplemental Streaming SIMD Extensions-3 */
-XEN_CPUFEATURE(FMA,           1*32+12) /*   Fused Multiply Add */
-XEN_CPUFEATURE(CX16,          1*32+13) /*   CMPXCHG16B */
+XEN_CPUFEATURE(SSSE3,         1*32+ 9) /*A  Supplemental Streaming SIMD Extensions-3 */
+XEN_CPUFEATURE(FMA,           1*32+12) /*A  Fused Multiply Add */
+XEN_CPUFEATURE(CX16,          1*32+13) /*A  CMPXCHG16B */
 XEN_CPUFEATURE(XTPR,          1*32+14) /*   Send Task Priority Messages */
 XEN_CPUFEATURE(PDCM,          1*32+15) /*   Perf/Debug Capability MSR */
-XEN_CPUFEATURE(PCID,          1*32+17) /*   Process Context ID */
+XEN_CPUFEATURE(PCID,          1*32+17) /*H  Process Context ID */
 XEN_CPUFEATURE(DCA,           1*32+18) /*   Direct Cache Access */
-XEN_CPUFEATURE(SSE4_1,        1*32+19) /*   Streaming SIMD Extensions 4.1 */
-XEN_CPUFEATURE(SSE4_2,        1*32+20) /*   Streaming SIMD Extensions 4.2 */
-XEN_CPUFEATURE(X2APIC,        1*32+21) /*!  Extended xAPIC */
-XEN_CPUFEATURE(MOVBE,         1*32+22) /*   movbe instruction */
-XEN_CPUFEATURE(POPCNT,        1*32+23) /*   POPCNT instruction */
-XEN_CPUFEATURE(TSC_DEADLINE,  1*32+24) /*   TSC Deadline Timer */
-XEN_CPUFEATURE(AESNI,         1*32+25) /*   AES instructions */
-XEN_CPUFEATURE(XSAVE,         1*32+26) /*   XSAVE/XRSTOR/XSETBV/XGETBV */
+XEN_CPUFEATURE(SSE4_1,        1*32+19) /*A  Streaming SIMD Extensions 4.1 */
+XEN_CPUFEATURE(SSE4_2,        1*32+20) /*A  Streaming SIMD Extensions 4.2 */
+XEN_CPUFEATURE(X2APIC,        1*32+21) /*!A Extended xAPIC */
+XEN_CPUFEATURE(MOVBE,         1*32+22) /*A  movbe instruction */
+XEN_CPUFEATURE(POPCNT,        1*32+23) /*A  POPCNT instruction */
+XEN_CPUFEATURE(TSC_DEADLINE,  1*32+24) /*S  TSC Deadline Timer */
+XEN_CPUFEATURE(AESNI,         1*32+25) /*A  AES instructions */
+XEN_CPUFEATURE(XSAVE,         1*32+26) /*A  XSAVE/XRSTOR/XSETBV/XGETBV */
 XEN_CPUFEATURE(OSXSAVE,       1*32+27) /*!  OSXSAVE */
-XEN_CPUFEATURE(AVX,           1*32+28) /*   Advanced Vector Extensions */
-XEN_CPUFEATURE(F16C,          1*32+29) /*   Half-precision convert instruction */
-XEN_CPUFEATURE(RDRAND,        1*32+30) /*   Digital Random Number Generator */
-XEN_CPUFEATURE(HYPERVISOR,    1*32+31) /*!  Running under some hypervisor */
+XEN_CPUFEATURE(AVX,           1*32+28) /*A  Advanced Vector Extensions */
+XEN_CPUFEATURE(F16C,          1*32+29) /*A  Half-precision convert instruction */
+XEN_CPUFEATURE(RDRAND,        1*32+30) /*A  Digital Random Number Generator */
+XEN_CPUFEATURE(HYPERVISOR,    1*32+31) /*!A Running under some hypervisor */
 
 /* AMD-defined CPU features, CPUID level 0x80000001.edx, word 2 */
-XEN_CPUFEATURE(SYSCALL,       2*32+11) /*   SYSCALL/SYSRET */
-XEN_CPUFEATURE(NX,            2*32+20) /*   Execute Disable */
-XEN_CPUFEATURE(MMXEXT,        2*32+22) /*   AMD MMX extensions */
-XEN_CPUFEATURE(FFXSR,         2*32+25) /*   FFXSR instruction optimizations */
-XEN_CPUFEATURE(PAGE1GB,       2*32+26) /*   1Gb large page support */
-XEN_CPUFEATURE(RDTSCP,        2*32+27) /*   RDTSCP */
-XEN_CPUFEATURE(LM,            2*32+29) /*   Long Mode (x86-64) */
-XEN_CPUFEATURE(3DNOWEXT,      2*32+30) /*   AMD 3DNow! extensions */
-XEN_CPUFEATURE(3DNOW,         2*32+31) /*   3DNow! */
+XEN_CPUFEATURE(SYSCALL,       2*32+11) /*A  SYSCALL/SYSRET */
+XEN_CPUFEATURE(NX,            2*32+20) /*A  Execute Disable */
+XEN_CPUFEATURE(MMXEXT,        2*32+22) /*A  AMD MMX extensions */
+XEN_CPUFEATURE(FFXSR,         2*32+25) /*A  FFXSR instruction optimizations */
+XEN_CPUFEATURE(PAGE1GB,       2*32+26) /*H  1Gb large page support */
+XEN_CPUFEATURE(RDTSCP,        2*32+27) /*S  RDTSCP */
+XEN_CPUFEATURE(LM,            2*32+29) /*A  Long Mode (x86-64) */
+XEN_CPUFEATURE(3DNOWEXT,      2*32+30) /*A  AMD 3DNow! extensions */
+XEN_CPUFEATURE(3DNOW,         2*32+31) /*A  3DNow! */
 
 /* AMD-defined CPU features, CPUID level 0x80000001.ecx, word 3 */
-XEN_CPUFEATURE(LAHF_LM,       3*32+ 0) /*   LAHF/SAHF in long mode */
-XEN_CPUFEATURE(CMP_LEGACY,    3*32+ 1) /*!  If yes HyperThreading not valid */
-XEN_CPUFEATURE(SVM,           3*32+ 2) /*   Secure virtual machine */
+XEN_CPUFEATURE(LAHF_LM,       3*32+ 0) /*A  LAHF/SAHF in long mode */
+XEN_CPUFEATURE(CMP_LEGACY,    3*32+ 1) /*!A If yes HyperThreading not valid */
+XEN_CPUFEATURE(SVM,           3*32+ 2) /*S  Secure virtual machine */
 XEN_CPUFEATURE(EXTAPIC,       3*32+ 3) /*   Extended APIC space */
-XEN_CPUFEATURE(CR8_LEGACY,    3*32+ 4) /*   CR8 in 32-bit mode */
-XEN_CPUFEATURE(ABM,           3*32+ 5) /*   Advanced bit manipulation */
-XEN_CPUFEATURE(SSE4A,         3*32+ 6) /*   SSE-4A */
-XEN_CPUFEATURE(MISALIGNSSE,   3*32+ 7) /*   Misaligned SSE mode */
-XEN_CPUFEATURE(3DNOWPREFETCH, 3*32+ 8) /*   3DNow prefetch instructions */
+XEN_CPUFEATURE(CR8_LEGACY,    3*32+ 4) /*S  CR8 in 32-bit mode */
+XEN_CPUFEATURE(ABM,           3*32+ 5) /*A  Advanced bit manipulation */
+XEN_CPUFEATURE(SSE4A,         3*32+ 6) /*A  SSE-4A */
+XEN_CPUFEATURE(MISALIGNSSE,   3*32+ 7) /*A  Misaligned SSE mode */
+XEN_CPUFEATURE(3DNOWPREFETCH, 3*32+ 8) /*A  3DNow prefetch instructions */
 XEN_CPUFEATURE(OSVW,          3*32+ 9) /*   OS Visible Workaround */
-XEN_CPUFEATURE(IBS,           3*32+10) /*   Instruction Based Sampling */
-XEN_CPUFEATURE(XOP,           3*32+11) /*   extended AVX instructions */
+XEN_CPUFEATURE(IBS,           3*32+10) /*S  Instruction Based Sampling */
+XEN_CPUFEATURE(XOP,           3*32+11) /*A  extended AVX instructions */
 XEN_CPUFEATURE(SKINIT,        3*32+12) /*   SKINIT/STGI instructions */
 XEN_CPUFEATURE(WDT,           3*32+13) /*   Watchdog timer */
-XEN_CPUFEATURE(LWP,           3*32+15) /*   Light Weight Profiling */
-XEN_CPUFEATURE(FMA4,          3*32+16) /*   4 operands MAC instructions */
+XEN_CPUFEATURE(LWP,           3*32+15) /*S  Light Weight Profiling */
+XEN_CPUFEATURE(FMA4,          3*32+16) /*A  4 operands MAC instructions */
 XEN_CPUFEATURE(NODEID_MSR,    3*32+19) /*   NodeId MSR */
-XEN_CPUFEATURE(TBM,           3*32+21) /*   trailing bit manipulations */
+XEN_CPUFEATURE(TBM,           3*32+21) /*A  trailing bit manipulations */
 XEN_CPUFEATURE(TOPOEXT,       3*32+22) /*   topology extensions CPUID leafs */
-XEN_CPUFEATURE(DBEXT,         3*32+26) /*   data breakpoint extension */
+XEN_CPUFEATURE(DBEXT,         3*32+26) /*A  data breakpoint extension */
 XEN_CPUFEATURE(MONITORX,      3*32+29) /*   MONITOR extension (MONITORX/MWAITX) */
 
 /* Intel-defined CPU features, CPUID level 0x0000000D:1.eax, word 4 */
-XEN_CPUFEATURE(XSAVEOPT,      4*32+ 0) /*   XSAVEOPT instruction */
-XEN_CPUFEATURE(XSAVEC,        4*32+ 1) /*   XSAVEC/XRSTORC instructions */
-XEN_CPUFEATURE(XGETBV1,       4*32+ 2) /*   XGETBV with %ecx=1 */
-XEN_CPUFEATURE(XSAVES,        4*32+ 3) /*   XSAVES/XRSTORS instructions */
+XEN_CPUFEATURE(XSAVEOPT,      4*32+ 0) /*A  XSAVEOPT instruction */
+XEN_CPUFEATURE(XSAVEC,        4*32+ 1) /*A  XSAVEC/XRSTORC instructions */
+XEN_CPUFEATURE(XGETBV1,       4*32+ 2) /*A  XGETBV with %ecx=1 */
+XEN_CPUFEATURE(XSAVES,        4*32+ 3) /*S  XSAVES/XRSTORS instructions */
 
 /* Intel-defined CPU features, CPUID level 0x00000007:0.ebx, word 5 */
-XEN_CPUFEATURE(FSGSBASE,      5*32+ 0) /*   {RD,WR}{FS,GS}BASE instructions */
-XEN_CPUFEATURE(TSC_ADJUST,    5*32+ 1) /*   TSC_ADJUST MSR available */
-XEN_CPUFEATURE(BMI1,          5*32+ 3) /*   1st bit manipulation extensions */
-XEN_CPUFEATURE(HLE,           5*32+ 4) /*   Hardware Lock Elision */
-XEN_CPUFEATURE(AVX2,          5*32+ 5) /*   AVX2 instructions */
+XEN_CPUFEATURE(FSGSBASE,      5*32+ 0) /*A  {RD,WR}{FS,GS}BASE instructions */
+XEN_CPUFEATURE(TSC_ADJUST,    5*32+ 1) /*S  TSC_ADJUST MSR available */
+XEN_CPUFEATURE(BMI1,          5*32+ 3) /*A  1st bit manipulation extensions */
+XEN_CPUFEATURE(HLE,           5*32+ 4) /*A  Hardware Lock Elision */
+XEN_CPUFEATURE(AVX2,          5*32+ 5) /*A  AVX2 instructions */
 XEN_CPUFEATURE(FDP_EXCP_ONLY, 5*32+ 6) /*!  x87 FDP only updated on exception. */
-XEN_CPUFEATURE(SMEP,          5*32+ 7) /*   Supervisor Mode Execution Protection */
-XEN_CPUFEATURE(BMI2,          5*32+ 8) /*   2nd bit manipulation extensions */
-XEN_CPUFEATURE(ERMS,          5*32+ 9) /*   Enhanced REP MOVSB/STOSB */
-XEN_CPUFEATURE(INVPCID,       5*32+10) /*   Invalidate Process Context ID */
-XEN_CPUFEATURE(RTM,           5*32+11) /*   Restricted Transactional Memory */
+XEN_CPUFEATURE(SMEP,          5*32+ 7) /*S  Supervisor Mode Execution Protection */
+XEN_CPUFEATURE(BMI2,          5*32+ 8) /*A  2nd bit manipulation extensions */
+XEN_CPUFEATURE(ERMS,          5*32+ 9) /*A  Enhanced REP MOVSB/STOSB */
+XEN_CPUFEATURE(INVPCID,       5*32+10) /*H  Invalidate Process Context ID */
+XEN_CPUFEATURE(RTM,           5*32+11) /*A  Restricted Transactional Memory */
 XEN_CPUFEATURE(PQM,           5*32+12) /*   Platform QoS Monitoring */
 XEN_CPUFEATURE(NO_FPU_SEL,    5*32+13) /*!  FPU CS/DS stored as zero */
-XEN_CPUFEATURE(MPX,           5*32+14) /*   Memory Protection Extensions */
+XEN_CPUFEATURE(MPX,           5*32+14) /*S  Memory Protection Extensions */
 XEN_CPUFEATURE(PQE,           5*32+15) /*   Platform QoS Enforcement */
-XEN_CPUFEATURE(RDSEED,        5*32+18) /*   RDSEED instruction */
-XEN_CPUFEATURE(ADX,           5*32+19) /*   ADCX, ADOX instructions */
-XEN_CPUFEATURE(SMAP,          5*32+20) /*   Supervisor Mode Access Prevention */
-XEN_CPUFEATURE(PCOMMIT,       5*32+22) /*   PCOMMIT instruction */
-XEN_CPUFEATURE(CLFLUSHOPT,    5*32+23) /*   CLFLUSHOPT instruction */
-XEN_CPUFEATURE(CLWB,          5*32+24) /*   CLWB instruction */
-XEN_CPUFEATURE(SHA,           5*32+29) /*   SHA1 & SHA256 instructions */
+XEN_CPUFEATURE(RDSEED,        5*32+18) /*A  RDSEED instruction */
+XEN_CPUFEATURE(ADX,           5*32+19) /*A  ADCX, ADOX instructions */
+XEN_CPUFEATURE(SMAP,          5*32+20) /*S  Supervisor Mode Access Prevention */
+XEN_CPUFEATURE(PCOMMIT,       5*32+22) /*A  PCOMMIT instruction */
+XEN_CPUFEATURE(CLFLUSHOPT,    5*32+23) /*A  CLFLUSHOPT instruction */
+XEN_CPUFEATURE(CLWB,          5*32+24) /*A  CLWB instruction */
+XEN_CPUFEATURE(SHA,           5*32+29) /*A  SHA1 & SHA256 instructions */
 
 /* Intel-defined CPU features, CPUID level 0x00000007:0.ecx, word 6 */
-XEN_CPUFEATURE(PREFETCHWT1,   6*32+ 0) /*   PREFETCHWT1 instruction */
-XEN_CPUFEATURE(PKU,           6*32+ 3) /*   Protection Keys for Userspace */
+XEN_CPUFEATURE(PREFETCHWT1,   6*32+ 0) /*A  PREFETCHWT1 instruction */
+XEN_CPUFEATURE(PKU,           6*32+ 3) /*H  Protection Keys for Userspace */
 XEN_CPUFEATURE(OSPKE,         6*32+ 4) /*!  OS Protection Keys Enable */
 
 /* AMD-defined CPU features, CPUID level 0x80000007.edx, word 7 */
@@ -217,7 +222,7 @@ XEN_CPUFEATURE(ITSC,          7*32+ 8) /*   Invariant TSC */
 XEN_CPUFEATURE(EFRO,          7*32+10) /*   APERF/MPERF Read Only interface */
 
 /* AMD-defined CPU features, CPUID level 0x80000008.ebx, word 8 */
-XEN_CPUFEATURE(CLZERO,        8*32+ 0) /*   CLZERO instruction */
+XEN_CPUFEATURE(CLZERO,        8*32+ 0) /*A  CLZERO instruction */
 
 #endif /* XEN_CPUFEATURE */
 
diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py
index 5076fb9..4fd603d 100755
--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -17,12 +17,18 @@ class State(object):
         # State parsed from input
         self.names = {} # Name => value mapping
         self.raw_special = set()
+        self.raw_pv = set()
+        self.raw_hvm_shadow = set()
+        self.raw_hvm_hap = set()
 
         # State calculated
         self.nr_entries = 0 # Number of words in a featureset
         self.common_1d = 0 # Common features between 1d and e1d
         self.known = [] # All known features
         self.special = [] # Features with special semantics
+        self.pv = []
+        self.hvm_shadow = []
+        self.hvm_hap = []
 
 def parse_definitions(state):
     """
@@ -32,7 +38,7 @@ def parse_definitions(state):
     feat_regex = re.compile(
         r"^XEN_CPUFEATURE\(([A-Z0-9_]+),"
         "\s+([\s\d]+\*[\s\d]+\+[\s\d]+)\)"
-        "\s+/\*([!]*) .*$")
+        "\s+/\*([\w!]*) .*$")
 
     this = sys.modules[__name__]
 
@@ -72,6 +78,16 @@ def parse_definitions(state):
 
             if a == "!":
                 state.raw_special.add(val)
+            elif a in "ASH":
+                if a == "A":
+                    state.raw_pv.add(val)
+                    state.raw_hvm_shadow.add(val)
+                    state.raw_hvm_hap.add(val)
+                elif a == "S":
+                    state.raw_hvm_shadow.add(val)
+                    state.raw_hvm_hap.add(val)
+                elif a == "H":
+                    state.raw_hvm_hap.add(val)
             else:
                 raise Fail("Unrecognised attribute '%s' for %s" % (a, name))
 
@@ -124,6 +140,9 @@ def crunch_numbers(state):
 
     state.common_1d = featureset_to_uint32s(common_1d, 1)[0]
     state.special = featureset_to_uint32s(state.raw_special, nr_entries)
+    state.pv = featureset_to_uint32s(state.raw_pv, nr_entries)
+    state.hvm_shadow = featureset_to_uint32s(state.raw_hvm_shadow, nr_entries)
+    state.hvm_hap = featureset_to_uint32s(state.raw_hvm_hap, nr_entries)
 
 
 def write_results(state):
@@ -145,10 +164,19 @@ def write_results(state):
 #define INIT_KNOWN_FEATURES { \\\n%s\n}
 
 #define INIT_SPECIAL_FEATURES { \\\n%s\n}
+
+#define INIT_PV_FEATURES { \\\n%s\n}
+
+#define INIT_HVM_SHADOW_FEATURES { \\\n%s\n}
+
+#define INIT_HVM_HAP_FEATURES { \\\n%s\n}
 """ % (state.nr_entries,
        state.common_1d,
        format_uint32s(state.known, 4),
        format_uint32s(state.special, 4),
+       format_uint32s(state.pv, 4),
+       format_uint32s(state.hvm_shadow, 4),
+       format_uint32s(state.hvm_hap, 4),
        ))
 
     state.output.write(
-- 
2.1.4



* [PATCH v4 07/26] xen/x86: Calculate maximum host and guest featuresets
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (5 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 06/26] xen/x86: Annotate VM applicability in featureset Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-29  8:57   ` Jan Beulich
  2016-03-23 16:36 ` [PATCH v4 08/26] xen/x86: Generate deep dependencies of features Andrew Cooper
                   ` (19 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

All of this information will be used by the toolstack to make informed
levelling decisions for VMs, and by Xen to sanity check toolstack-provided
information.
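
A minimal sketch of the two calculations involved, assuming a featureset is a
list of 32-bit words.  The word indices mirror FEATURESET_1d and
FEATURESET_e1d, but COMMON_1D and the sample words are placeholder values, and
both helpers are hypothetical names used only for this example:

```python
# Sketch only: COMMON_1D and the sample featuresets are placeholder values.

FEATURESET_1d, FEATURESET_e1d = 0, 2
COMMON_1D = 0x0000ffff  # placeholder for CPUID_COMMON_1D_FEATURES


def max_featureset(host, mask):
    """Maximum guest featureset: host capabilities clamped by a type mask."""
    return [h & m for h, m in zip(host, mask)]


def duplicate_shared_bits(fs):
    """Copy the 1d bits shared with e1d into e1d, keeping e1d-only bits."""
    fs = list(fs)
    fs[FEATURESET_e1d] = ((fs[FEATURESET_1d] & COMMON_1D) |
                          (fs[FEATURESET_e1d] & ~COMMON_1D))
    return fs


host = [0x00000f0f, 0x00000000, 0x20000000]
mask = [0x000000ff, 0x00000000, 0xffffffff]
fs = duplicate_shared_bits(max_featureset(host, mask))
```

Here the 1d word is clamped to 0x0f by the mask, and those shared bits are
then mirrored into the e1d word, as is done for the cross-vendor case.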

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
CC: Jan Beulich <JBeulich@suse.com>

v3:
 * Move as much as possible into .init.
 * Fix the handling of the shared bits for the cross-vendor case.
 * Fix extended check.
v4:
 * Fix copy&paste error in calculate_hvm_featureset()
---
 xen/arch/x86/cpuid.c        | 162 ++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/setup.c        |   3 +
 xen/include/asm-x86/cpuid.h |  17 +++++
 3 files changed, 182 insertions(+)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 77e008a..41439f8 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -1,14 +1,176 @@
 #include <xen/init.h>
 #include <xen/lib.h>
 #include <asm/cpuid.h>
+#include <asm/hvm/hvm.h>
+#include <asm/hvm/vmx/vmcs.h>
+#include <asm/processor.h>
 
 const uint32_t known_features[] = INIT_KNOWN_FEATURES;
 const uint32_t special_features[] = INIT_SPECIAL_FEATURES;
 
+static const uint32_t __initconst pv_featuremask[] = INIT_PV_FEATURES;
+static const uint32_t __initconst hvm_shadow_featuremask[] = INIT_HVM_SHADOW_FEATURES;
+static const uint32_t __initconst hvm_hap_featuremask[] = INIT_HVM_HAP_FEATURES;
+
+uint32_t __read_mostly raw_featureset[FSCAPINTS];
+uint32_t __read_mostly pv_featureset[FSCAPINTS];
+uint32_t __read_mostly hvm_featureset[FSCAPINTS];
+
+static void __init sanitise_featureset(uint32_t *fs)
+{
+    unsigned int i;
+
+    for ( i = 0; i < FSCAPINTS; ++i )
+    {
+        /* Clamp to known mask. */
+        fs[i] &= known_features[i];
+    }
+
+    /*
+     * Sort out shared bits.  We are constructing a featureset which needs to
+     * be applicable to a cross-vendor case.  Intel strictly clears the common
+     * bits in e1d, while AMD strictly duplicates them.
+     *
+     * We duplicate them here to be compatible with AMD while on Intel, and
+     * rely on logic closer to the guest to make the featureset stricter if
+     * emulating Intel.
+     */
+    fs[FEATURESET_e1d] = ((fs[FEATURESET_1d]  &  CPUID_COMMON_1D_FEATURES) |
+                          (fs[FEATURESET_e1d] & ~CPUID_COMMON_1D_FEATURES));
+}
+
+static void __init calculate_raw_featureset(void)
+{
+    unsigned int max, tmp;
+
+    max = cpuid_eax(0);
+
+    if ( max >= 1 )
+        cpuid(0x1, &tmp, &tmp,
+              &raw_featureset[FEATURESET_1c],
+              &raw_featureset[FEATURESET_1d]);
+    if ( max >= 7 )
+        cpuid_count(0x7, 0, &tmp,
+                    &raw_featureset[FEATURESET_7b0],
+                    &raw_featureset[FEATURESET_7c0],
+                    &tmp);
+    if ( max >= 0xd )
+        cpuid_count(0xd, 1,
+                    &raw_featureset[FEATURESET_Da1],
+                    &tmp, &tmp, &tmp);
+
+    max = cpuid_eax(0x80000000);
+    if ( (max >> 16) != 0x8000 )
+        return;
+
+    if ( max >= 0x80000001 )
+        cpuid(0x80000001, &tmp, &tmp,
+              &raw_featureset[FEATURESET_e1c],
+              &raw_featureset[FEATURESET_e1d]);
+    if ( max >= 0x80000007 )
+        cpuid(0x80000007, &tmp, &tmp, &tmp,
+              &raw_featureset[FEATURESET_e7d]);
+    if ( max >= 0x80000008 )
+        cpuid(0x80000008, &tmp,
+              &raw_featureset[FEATURESET_e8b],
+              &tmp, &tmp);
+}
+
+static void __init calculate_pv_featureset(void)
+{
+    unsigned int i;
+
+    for ( i = 0; i < FSCAPINTS; ++i )
+        pv_featureset[i] = host_featureset[i] & pv_featuremask[i];
+
+    /* Unconditionally claim to be able to set the hypervisor bit. */
+    __set_bit(X86_FEATURE_HYPERVISOR, pv_featureset);
+
+    /*
+     * Allow the toolstack to set HTT, X2APIC and CMP_LEGACY.  These bits
+     * affect how to interpret topology information in other cpuid leaves.
+     */
+    __set_bit(X86_FEATURE_HTT, pv_featureset);
+    __set_bit(X86_FEATURE_X2APIC, pv_featureset);
+    __set_bit(X86_FEATURE_CMP_LEGACY, pv_featureset);
+
+    sanitise_featureset(pv_featureset);
+}
+
+static void __init calculate_hvm_featureset(void)
+{
+    unsigned int i;
+    const uint32_t *hvm_featuremask;
+
+    if ( !hvm_enabled )
+        return;
+
+    hvm_featuremask = hvm_funcs.hap_supported ?
+        hvm_hap_featuremask : hvm_shadow_featuremask;
+
+    for ( i = 0; i < FSCAPINTS; ++i )
+        hvm_featureset[i] = host_featureset[i] & hvm_featuremask[i];
+
+    /* Unconditionally claim to be able to set the hypervisor bit. */
+    __set_bit(X86_FEATURE_HYPERVISOR, hvm_featureset);
+
+    /*
+     * Allow the toolstack to set HTT, X2APIC and CMP_LEGACY.  These bits
+     * affect how to interpret topology information in other cpuid leaves.
+     */
+    __set_bit(X86_FEATURE_HTT, hvm_featureset);
+    __set_bit(X86_FEATURE_X2APIC, hvm_featureset);
+    __set_bit(X86_FEATURE_CMP_LEGACY, hvm_featureset);
+
+    /*
+     * Xen can provide an APIC emulation to HVM guests even if the host's APIC
+     * isn't enabled.
+     */
+    __set_bit(X86_FEATURE_APIC, hvm_featureset);
+
+    /*
+     * On AMD, PV guests are entirely unable to use SYSENTER as Xen runs in
+     * long mode (and init_amd() has cleared it out of host capabilities), but
+     * HVM guests are able if running in protected mode.
+     */
+    if ( (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) &&
+         test_bit(X86_FEATURE_SEP, raw_featureset) )
+        __set_bit(X86_FEATURE_SEP, hvm_featureset);
+
+    /*
+     * With VT-x, some features are only supported by Xen if dedicated
+     * hardware support is also available.
+     */
+    if ( cpu_has_vmx )
+    {
+        if ( !(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) ||
+             !(vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS) )
+            __clear_bit(X86_FEATURE_MPX, hvm_featureset);
+
+        if ( !cpu_has_vmx_xsaves )
+            __clear_bit(X86_FEATURE_XSAVES, hvm_featureset);
+
+        if ( !cpu_has_vmx_pcommit )
+            __clear_bit(X86_FEATURE_PCOMMIT, hvm_featureset);
+    }
+
+    sanitise_featureset(hvm_featureset);
+}
+
+void __init calculate_featuresets(void)
+{
+    calculate_raw_featureset();
+    calculate_pv_featureset();
+    calculate_hvm_featureset();
+}
+
 static void __init __maybe_unused build_assertions(void)
 {
     BUILD_BUG_ON(ARRAY_SIZE(known_features) != FSCAPINTS);
     BUILD_BUG_ON(ARRAY_SIZE(special_features) != FSCAPINTS);
+    BUILD_BUG_ON(ARRAY_SIZE(pv_featuremask) != FSCAPINTS);
+    BUILD_BUG_ON(ARRAY_SIZE(hvm_shadow_featuremask) != FSCAPINTS);
+    BUILD_BUG_ON(ARRAY_SIZE(hvm_hap_featuremask) != FSCAPINTS);
 }
 
 /*
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index ee65f55..3696c31 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -50,6 +50,7 @@
 #include <asm/nmi.h>
 #include <asm/alternative.h>
 #include <asm/mc146818rtc.h>
+#include <asm/cpuid.h>
 
 /* opt_nosmp: If true, secondary processors are ignored. */
 static bool_t __initdata opt_nosmp;
@@ -1525,6 +1526,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                "Multiple initrd candidates, picking module #%u\n",
                initrdidx);
 
+    calculate_featuresets();
+
     /*
      * Temporarily clear SMAP in CR4 to allow user-accesses in construct_dom0().
      * This saves a large number of corner cases interactions with
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 0ecf357..5041bcd 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -6,12 +6,29 @@
 
 #define FSCAPINTS FEATURESET_NR_ENTRIES
 
+#define FEATURESET_1d     0 /* 0x00000001.edx      */
+#define FEATURESET_1c     1 /* 0x00000001.ecx      */
+#define FEATURESET_e1d    2 /* 0x80000001.edx      */
+#define FEATURESET_e1c    3 /* 0x80000001.ecx      */
+#define FEATURESET_Da1    4 /* 0x0000000d:1.eax    */
+#define FEATURESET_7b0    5 /* 0x00000007:0.ebx    */
+#define FEATURESET_7c0    6 /* 0x00000007:0.ecx    */
+#define FEATURESET_e7d    7 /* 0x80000007.edx      */
+#define FEATURESET_e8b    8 /* 0x80000008.ebx      */
+
 #ifndef __ASSEMBLY__
 #include <xen/types.h>
 
 extern const uint32_t known_features[FSCAPINTS];
 extern const uint32_t special_features[FSCAPINTS];
 
+extern uint32_t raw_featureset[FSCAPINTS];
+#define host_featureset boot_cpu_data.x86_capability
+extern uint32_t pv_featureset[FSCAPINTS];
+extern uint32_t hvm_featureset[FSCAPINTS];
+
+void calculate_featuresets(void);
+
 #endif /* __ASSEMBLY__ */
 #endif /* !__X86_CPUID_H__ */
 
-- 
2.1.4



* [PATCH v4 08/26] xen/x86: Generate deep dependencies of features
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (6 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 07/26] xen/x86: Calculate maximum host and guest featuresets Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-24 16:16   ` Jan Beulich
  2016-03-23 16:36 ` [PATCH v4 09/26] xen/x86: Clear dependent features when clearing a cpu cap Andrew Cooper
                   ` (18 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Some features depend on other features.  Working out and maintaining the exact
dependency tree is complicated, so it is expressed in the automatic generation
script.

At runtime, Xen needs to disable all features which are dependent on a
feature being disabled.  Because the dependency tree is flattened at compile
time, runtime code can use a single mask per feature to disable everything
which eventually depends on it.
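
The flattening step can be sketched as a transitive closure over the direct
dependencies.  This is an illustration under stated assumptions rather than
the code added to gen-cpuid.py: flatten() is a hypothetical helper and the
table is only a small fragment of a dependency tree:

```python
# Sketch only: flatten() is a hypothetical helper, and 'deps' is a small
# illustrative fragment, not the full table from gen-cpuid.py.

def flatten(deps):
    """Return {feature: set of all direct and indirect dependents}."""
    deep = {}
    for feat in deps:
        seen = set()
        todo = list(deps[feat])
        while todo:
            f = todo.pop()
            if f in seen:
                continue
            seen.add(f)
            # Queue the dependents of this dependent, if it has any.
            todo.extend(deps.get(f, ()))
        deep[feat] = seen
    return deep


deps = {"XSAVE": ["AVX", "XSAVEOPT"], "AVX": ["AVX2", "FMA"]}
deep = flatten(deps)
```

With the result precomputed, disabling XSAVE only needs the one flattened set
deep["XSAVE"]; no tree walk is required at runtime.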

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>

v2:
 * New.
v3:
 * Vastly more research and comments.
v4:
 * Expand commit message.
 * More tweaks to the dependency tree.
 * Avoid for_each_set_bit() walking off the end of disabled_features[].
   Expanding disabled_features[] turns out to be far simpler than
   attempting to open-code for_each_set_bit().
---
 xen/arch/x86/cpuid.c        |  56 +++++++++++++++++
 xen/include/asm-x86/cpuid.h |   2 +
 xen/tools/gen-cpuid.py      | 143 +++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 200 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 41439f8..e1e0e44 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -11,6 +11,7 @@ const uint32_t special_features[] = INIT_SPECIAL_FEATURES;
 static const uint32_t __initconst pv_featuremask[] = INIT_PV_FEATURES;
 static const uint32_t __initconst hvm_shadow_featuremask[] = INIT_HVM_SHADOW_FEATURES;
 static const uint32_t __initconst hvm_hap_featuremask[] = INIT_HVM_HAP_FEATURES;
+static const uint32_t __initconst deep_features[] = INIT_DEEP_FEATURES;
 
 uint32_t __read_mostly raw_featureset[FSCAPINTS];
 uint32_t __read_mostly pv_featureset[FSCAPINTS];
@@ -18,12 +19,36 @@ uint32_t __read_mostly hvm_featureset[FSCAPINTS];
 
 static void __init sanitise_featureset(uint32_t *fs)
 {
+    /* for_each_set_bit() uses unsigned longs.  Extend with zeroes. */
+    uint32_t disabled_features[
+        ROUNDUP(FSCAPINTS, sizeof(unsigned long)/sizeof(uint32_t))] = {};
     unsigned int i;
 
     for ( i = 0; i < FSCAPINTS; ++i )
     {
         /* Clamp to known mask. */
         fs[i] &= known_features[i];
+
+        /*
+         * Identify which features with deep dependencies have been
+         * disabled.
+         */
+        disabled_features[i] = ~fs[i] & deep_features[i];
+    }
+
+    for_each_set_bit(i, (void *)disabled_features,
+                     sizeof(disabled_features) * 8)
+    {
+        const uint32_t *dfs = lookup_deep_deps(i);
+        unsigned int j;
+
+        ASSERT(dfs); /* deep_features[] should guarantee this. */
+
+        for ( j = 0; j < FSCAPINTS; ++j )
+        {
+            fs[j] &= ~dfs[j];
+            disabled_features[j] &= ~dfs[j];
+        }
     }
 
     /*
@@ -164,6 +189,36 @@ void __init calculate_featuresets(void)
     calculate_hvm_featureset();
 }
 
+const uint32_t * __init lookup_deep_deps(uint32_t feature)
+{
+    static const struct {
+        uint32_t feature;
+        uint32_t fs[FSCAPINTS];
+    } deep_deps[] __initconst = INIT_DEEP_DEPS;
+    unsigned int start = 0, end = ARRAY_SIZE(deep_deps);
+
+    BUILD_BUG_ON(ARRAY_SIZE(deep_deps) != NR_DEEP_DEPS);
+
+    /* Fast early exit. */
+    if ( !test_bit(feature, deep_features) )
+        return NULL;
+
+    /* deep_deps[] is sorted.  Perform a binary search. */
+    while ( start < end )
+    {
+        unsigned int mid = start + ((end - start) / 2);
+
+        if ( deep_deps[mid].feature > feature )
+            end = mid;
+        else if ( deep_deps[mid].feature < feature )
+            start = mid + 1;
+        else
+            return deep_deps[mid].fs;
+    }
+
+    return NULL;
+}
+
 static void __init __maybe_unused build_assertions(void)
 {
     BUILD_BUG_ON(ARRAY_SIZE(known_features) != FSCAPINTS);
@@ -171,6 +226,7 @@ static void __init __maybe_unused build_assertions(void)
     BUILD_BUG_ON(ARRAY_SIZE(pv_featuremask) != FSCAPINTS);
     BUILD_BUG_ON(ARRAY_SIZE(hvm_shadow_featuremask) != FSCAPINTS);
     BUILD_BUG_ON(ARRAY_SIZE(hvm_hap_featuremask) != FSCAPINTS);
+    BUILD_BUG_ON(ARRAY_SIZE(deep_features) != FSCAPINTS);
 }
 
 /*
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 5041bcd..4725672 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -29,6 +29,8 @@ extern uint32_t hvm_featureset[FSCAPINTS];
 
 void calculate_featuresets(void);
 
+const uint32_t *lookup_deep_deps(uint32_t feature);
+
 #endif /* __ASSEMBLY__ */
 #endif /* !__X86_CPUID_H__ */
 
diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py
index 4fd603d..1cec5d8 100755
--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -144,6 +144,131 @@ def crunch_numbers(state):
     state.hvm_shadow = featureset_to_uint32s(state.raw_hvm_shadow, nr_entries)
     state.hvm_hap = featureset_to_uint32s(state.raw_hvm_hap, nr_entries)
 
+    #
+    # Feature dependency information.
+    #
+    # !!! WARNING !!!
+    #
+    # A lot of this information is derived from the written text of vendors'
+    # software manuals, rather than directly from a statement.  As such, it
+    # does contain guesswork and assumptions, and may not accurately match
+    # hardware implementations.
+    #
+    # It is however designed to create an end result for a guest which does
+    # plausibly match real hardware.
+    #
+    # !!! WARNING !!!
+    #
+    # The format of this dictionary is that the feature in the key is a direct
+    # prerequisite of each feature in the value.
+    #
+    # The first consideration is about which functionality is physically built
+    # on top of other features.  The second consideration, which is more
+    # subjective, is whether real hardware would ever be found supporting
+    # feature X but not Y.
+    #
+    deps = {
+        # FPU is taken to mean support for the x87 registers as well as the
+        # instructions.  MMX is documented to alias the %MM registers over the
+        # x87 %ST registers in hardware.
+        FPU: [MMX],
+
+        # The PSE36 feature indicates that reserved bits in a PSE superpage
+        # may be used as extra physical address bits.
+        PSE: [PSE36],
+
+        # Entering Long Mode requires that %CR4.PAE is set.  The NX and PKU
+        # pagetable bits are only representable in the 64bit PTE format
+        # offered by PAE.
+        PAE: [LM, NX, PKU],
+
+        # APIC is special, but X2APIC does depend on APIC being available in
+        # the first place.
+        APIC: [X2APIC],
+
+        # AMD built MMXExtensions and 3DNow as extensions to MMX.
+        MMX: [MMXEXT, _3DNOW],
+
+        # The FXSAVE/FXRSTOR instructions were introduced into hardware before
+        # SSE, which is why they behave differently based on %CR4.OSFXSAVE and
+        # have their own feature bit.  AMD however introduced the Fast FXSR
+        # feature as an optimisation.
+        FXSR: [FFXSR, SSE],
+
+        # SSE is taken to mean support for the %XMM registers as well as the
+        # instructions.
+        SSE: [SSE2],
+
+        # SSE2 was also re-specified as core for 64bit.  The AESNI and SHA
+        # instruction groups are documented to require checking for SSE2
+        # support as a prerequisite.
+        SSE2: [SSE3, LM, AESNI, SHA],
+
+        # AMD K10 processors have SSE3 and SSE4A.  Bobcat/Barcelona processors
+        # subsequently included SSSE3, and Bulldozer subsequently included
+        # SSE4_1.  Intel have never shipped SSE4A.
+        SSE3: [SSSE3, SSE4_1, SSE4_2, SSE4A],
+
+        # The INVPCID instruction depends on PCID infrastructure being
+        # available.
+        PCID: [INVPCID],
+
+        # XSAVE is an extra set of instructions for state management, but
+        # doesn't constitute new state itself.  Some of the dependent features
+        # are instructions built on top of base XSAVE, while others are new
+        # instruction groups which are specified to require XSAVE for state
+        # management.
+        XSAVE: [XSAVEOPT, XSAVEC, XGETBV1, XSAVES,
+                AVX, MPX, PKU, LWP],
+
+        # AVX is taken to mean hardware support for VEX encoded instructions,
+        # 256bit registers, and the instructions themselves.  Each of these
+        # subsequent instruction groups may only be VEX encoded.
+        AVX: [FMA, FMA4, F16C, AVX2, XOP],
+
+        # CX16 is only encodable in Long Mode.  LAHF_LM indicates that the
+        # SAHF/LAHF instructions are reintroduced in Long Mode.  1GB
+        # superpages and PCID are only available in 4 level paging.
+        LM: [CX16, PCID, LAHF_LM, PAGE1GB],
+
+        # AMD K6-2+ and K6-III processors shipped with 3DNow+, beyond the
+        # standard 3DNow in the earlier K6 processors.
+        _3DNOW: [_3DNOWEXT],
+    }
+
+    deep_features = tuple(sorted(deps.keys()))
+    state.deep_deps = {}
+
+    for feat in deep_features:
+
+        seen = [feat]
+        to_process = list(deps[feat])
+
+        while len(to_process):
+
+            # To debug, uncomment the following lines:
+            # def repl(l):
+            #     return "[" + ", ".join((state.names[x] for x in l)) + "]"
+            # print >>sys.stderr, "Feature %s, seen %s, to_process %s " % \
+            #     (state.names[feat], repl(seen), repl(to_process))
+
+            f = to_process.pop(0)
+
+            if f in seen:
+                raise Fail("ERROR: Cycle found with %s when processing %s"
+                           % (state.names[f], state.names[feat]))
+
+            seen.append(f)
+            to_process = list(set(to_process + deps.get(f, [])))
+
+        state.deep_deps[feat] = seen[1:]
+
+    state.deep_features = featureset_to_uint32s(deps.keys(), nr_entries)
+    state.nr_deep_deps = len(state.deep_deps.keys())
+
+    for k, v in state.deep_deps.iteritems():
+        state.deep_deps[k] = featureset_to_uint32s(v, nr_entries)
+
 
 def write_results(state):
     state.output.write(
@@ -170,6 +295,12 @@ def write_results(state):
 #define INIT_HVM_SHADOW_FEATURES { \\\n%s\n}
 
 #define INIT_HVM_HAP_FEATURES { \\\n%s\n}
+
+#define NR_DEEP_DEPS %sU
+
+#define INIT_DEEP_FEATURES { \\\n%s\n}
+
+#define INIT_DEEP_DEPS { \\
 """ % (state.nr_entries,
        state.common_1d,
        format_uint32s(state.known, 4),
@@ -177,10 +308,20 @@ def write_results(state):
        format_uint32s(state.pv, 4),
        format_uint32s(state.hvm_shadow, 4),
        format_uint32s(state.hvm_hap, 4),
+       state.nr_deep_deps,
+       format_uint32s(state.deep_features, 4),
        ))
 
+    for dep in sorted(state.deep_deps.keys()):
+        state.output.write(
+            "    { %#xU, /* %s */ { \\\n%s\n    }, }, \\\n"
+            % (dep, state.names[dep],
+               format_uint32s(state.deep_deps[dep], 8)
+           ))
+
     state.output.write(
-"""
+"""}
+
 #endif /* __XEN_X86__FEATURESET_DATA__ */
 """)
 
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v4 09/26] xen/x86: Clear dependent features when clearing a cpu cap
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (7 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 08/26] xen/x86: Generate deep dependencies of features Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-23 16:36 ` [PATCH v4 10/26] xen/x86: Improve disabling of features which have dependencies Andrew Cooper
                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

When clearing a cpu cap, clear all dependent features.  This avoids having a
featureset with intermediate features disabled, but leaf features enabled.
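
A minimal sketch of the intended behaviour, written in Python for brevity
rather than as the C in this patch.  The bit numbers and the `DEEP_DEPS`
table are hypothetical stand-ins for the generated data, and
`capability`/`cleared` model boot_cpu_data.x86_capability and cleared_caps as
plain integers.

```python
# Hypothetical bit assignments, for illustration only.
XSAVE, AVX, AVX2, FPU = 0, 1, 2, 3

# Flattened dependents of each feature, as a bitmask (assumed precomputed).
DEEP_DEPS = {
    XSAVE: (1 << AVX) | (1 << AVX2),
    AVX: (1 << AVX2),
}


def clear_cap(capability, cleared, cap):
    """Return updated (capability, cleared) after clearing cap and all of
    its dependent features."""
    bit = 1 << cap
    if cleared & bit:
        # Already cleared earlier, so its dependents were handled then.
        return capability, cleared

    mask = bit | DEEP_DEPS.get(cap, 0)
    return capability & ~mask, cleared | mask


caps = (1 << XSAVE) | (1 << AVX) | (1 << AVX2) | (1 << FPU)
caps, cleared = clear_cap(caps, 0, XSAVE)
```

After the call only the FPU bit remains set, and a second attempt to clear
XSAVE returns early without touching either bitmap.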

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
---
v3:
 * Style fixes.  Use __test_and_set_bit()
---
 xen/arch/x86/cpu/common.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index d302272..0942b44 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -53,8 +53,22 @@ static unsigned int cleared_caps[NCAPINTS];
 
 void __init setup_clear_cpu_cap(unsigned int cap)
 {
+	const uint32_t *dfs;
+	unsigned int i;
+
+	if (__test_and_set_bit(cap, cleared_caps))
+		return;
+
 	__clear_bit(cap, boot_cpu_data.x86_capability);
-	__set_bit(cap, cleared_caps);
+	dfs = lookup_deep_deps(cap);
+
+	if (!dfs)
+		return;
+
+	for (i = 0; i < FSCAPINTS; ++i) {
+		cleared_caps[i] |= dfs[i];
+		boot_cpu_data.x86_capability[i] &= ~dfs[i];
+	}
 }
 
 static void default_init(struct cpuinfo_x86 * c)
-- 
2.1.4



^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v4 10/26] xen/x86: Improve disabling of features which have dependencies
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (8 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 09/26] xen/x86: Clear dependent features when clearing a cpu cap Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-28 15:18   ` Konrad Rzeszutek Wilk
  2016-03-23 16:36 ` [PATCH v4 11/26] xen/x86: Improvements to in-hypervisor cpuid sanity checks Andrew Cooper
                   ` (16 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

APIC and XSAVE have dependent features, which also need disabling if Xen
chooses to disable either of them.

Use setup_clear_cpu_cap() rather than clear_bit(), as it takes care of
dependent features as well.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
---
v2: Move boolean_param() adjacent to use_xsave in xstate_init()
---
 xen/arch/x86/apic.c       |  2 +-
 xen/arch/x86/cpu/common.c | 12 +++---------
 xen/arch/x86/xstate.c     |  6 +++++-
 3 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c
index b9601ad..8df5bd3 100644
--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -1349,7 +1349,7 @@ void pmu_apic_interrupt(struct cpu_user_regs *regs)
 int __init APIC_init_uniprocessor (void)
 {
     if (enable_local_apic < 0)
-        __clear_bit(X86_FEATURE_APIC, boot_cpu_data.x86_capability);
+        setup_clear_cpu_cap(X86_FEATURE_APIC);
 
     if (!smp_found_config && !cpu_has_apic) {
         skip_ioapic_setup = 1;
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 0942b44..b5c023f 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -16,9 +16,6 @@
 
 #include "cpu.h"
 
-static bool_t use_xsave = 1;
-boolean_param("xsave", use_xsave);
-
 bool_t opt_arat = 1;
 boolean_param("arat", opt_arat);
 
@@ -341,12 +338,6 @@ void identify_cpu(struct cpuinfo_x86 *c)
 	if (this_cpu->c_init)
 		this_cpu->c_init(c);
 
-        /* Initialize xsave/xrstor features */
-	if ( !use_xsave )
-		__clear_bit(X86_FEATURE_XSAVE, boot_cpu_data.x86_capability);
-
-	if ( cpu_has_xsave )
-		xstate_init(c);
 
    	if ( !opt_pku )
 		setup_clear_cpu_cap(X86_FEATURE_PKU);
@@ -370,6 +361,9 @@ void identify_cpu(struct cpuinfo_x86 *c)
 
 	/* Now the feature flags better reflect actual CPU features! */
 
+	if ( cpu_has_xsave )
+		xstate_init(c);
+
 #ifdef NOISY_CAPS
 	printk(KERN_DEBUG "CPU: After all inits, caps:");
 	for (i = 0; i < NCAPINTS; i++)
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index f649405..5060704 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -502,11 +502,15 @@ unsigned int xstate_ctxt_size(u64 xcr0)
 /* Collect the information of processor's extended state */
 void xstate_init(struct cpuinfo_x86 *c)
 {
+    static bool_t __initdata use_xsave = 1;
+    boolean_param("xsave", use_xsave);
+
     bool_t bsp = c == &boot_cpu_data;
     u32 eax, ebx, ecx, edx;
     u64 feature_mask;
 
-    if ( boot_cpu_data.cpuid_level < XSTATE_CPUID )
+    if ( (bsp && !use_xsave) ||
+         boot_cpu_data.cpuid_level < XSTATE_CPUID )
     {
         BUG_ON(!bsp);
         setup_clear_cpu_cap(X86_FEATURE_XSAVE);
-- 
2.1.4



^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v4 11/26] xen/x86: Improvements to in-hypervisor cpuid sanity checks
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (9 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 10/26] xen/x86: Improve disabling of features which have dependencies Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-24 15:38   ` Andrew Cooper
                     ` (2 more replies)
  2016-03-23 16:36 ` [PATCH v4 12/26] x86/cpu: Move set_cpumask() calls into c_early_init() Andrew Cooper
                   ` (15 subsequent siblings)
  26 siblings, 3 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Currently, {pv,hvm}_cpuid() has a large quantity of essentially-static logic
for modifying the features visible to a guest.  A lot of this can be subsumed
by {pv,hvm}_featuremask, which identify the features available on this
hardware which could be given to a PV or HVM guest.

This is a step in the direction of full per-domain cpuid policies, but lots
more development is needed for that.  As a result, the static checks are
simplified, but the dynamic checks need to remain for now.

As a side effect, some of the logic for special features can be improved.
OSXSAVE and OSPKE will be automatically cleared because of being absent in the
featuremask.  This allows the fast-forward logic to be simpler.
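
The fast-forward pattern can be sketched as follows, loosely modelled on the
HVM leaf 1 case.  Python is used for brevity; the bit positions are those of
CPUID.1:ECX and CR4, but the function name and the example values are
hypothetical.

```python
# CPUID.1:ECX bit positions and the CR4.OSXSAVE bit.
ECX_XSAVE = 1 << 26
ECX_OSXSAVE = 1 << 27
CR4_OSXSAVE = 1 << 18


def guest_leaf1_ecx(hw_ecx, featureset_1c, guest_cr4):
    """Compute the ECX value a guest sees for CPUID leaf 1.

    featureset_1c never contains OSXSAVE, so the AND clears it
    unconditionally; it is then reinstated from the guest's own CR4.
    """
    ecx = hw_ecx & featureset_1c
    if guest_cr4 & CR4_OSXSAVE:
        ecx |= ECX_OSXSAVE
    return ecx


# Example values (assumed): hardware reports both bits, policy offers XSAVE.
hw = ECX_XSAVE | ECX_OSXSAVE
policy = ECX_XSAVE
```

Because the mask already removes the special bit, no else-branch is needed
to clear it when the guest has not set CR4.OSXSAVE.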

In addition, there are some corrections to the existing logic:

 * Hiding PSE36 out of PAE mode is architecturally wrong.  It turns out that
   it was a bugfix for running HyperV under Xen, which wanted to see PSE36
   even after choosing to use PAE paging.  PSE36 is not supported by shadow
   paging, so is hidden from non-HAP guests, but is still visible for HAP
   guests.
 * Changing the visibility of RDTSCP based on host TSC stability or virtual
   TSC mode is bogus, so dropped.
 * When emulating Intel to a guest, the common features in e1d should be
   cleared.
 * The APIC bit in e1d (on non-Intel) is also a fast-forward from the
   APIC_BASE MSR.

As a small improvement, use compiler-visible &'s and |'s, rather than
{clear,set}_bit().

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>

v2:
 * Reinstate some of the dynamic checks for now.  Future development work will
   instate a complete per-domain policy.
 * Fix OSXSAVE handling for PV guests.
v3:
 * Better handling of the cross-vendor case.
 * Improvements to the handling of special features.
 * Correct PSE36 to being a HAP-only feature.
 * Yet more OSXSAVE fixes for PV guests.
v4:
 * Leak PSE36 into shadow guests to fix buggy versions of Hyper-V.
 * Leak MTRR into the hardware domain to fix Xenolinux dom0.
 * Change cross-vendor 1D disabling logic.
 * Avoid reading arch.pv_vcpu for PVH guests.
---
 xen/arch/x86/hvm/hvm.c | 125 ++++++++++++++++++-----------
 xen/arch/x86/traps.c   | 209 ++++++++++++++++++++++++++++++++-----------------
 2 files changed, 216 insertions(+), 118 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 80d59ff..6593bb1 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -71,6 +71,7 @@
 #include <public/memory.h>
 #include <public/vm_event.h>
 #include <public/arch-x86/cpuid.h>
+#include <asm/cpuid.h>
 
 bool_t __read_mostly hvm_enabled;
 
@@ -4668,62 +4669,71 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         /* Fix up VLAPIC details. */
         *ebx &= 0x00FFFFFFu;
         *ebx |= (v->vcpu_id * 2) << 24;
+
+        *ecx &= hvm_featureset[FEATURESET_1c];
+        *edx &= hvm_featureset[FEATURESET_1d];
+
+        /* APIC exposed to guests, but Fast-forward MSR_APIC_BASE.EN back in. */
         if ( vlapic_hw_disabled(vcpu_vlapic(v)) )
-            __clear_bit(X86_FEATURE_APIC & 31, edx);
+            *edx &= ~cpufeat_mask(X86_FEATURE_APIC);
 
-        /* Fix up OSXSAVE. */
-        if ( *ecx & cpufeat_mask(X86_FEATURE_XSAVE) &&
-             (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE) )
+        /* OSXSAVE cleared by hvm_featureset.  Fast-forward CR4 back in. */
+        if ( v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE )
             *ecx |= cpufeat_mask(X86_FEATURE_OSXSAVE);
-        else
-            *ecx &= ~cpufeat_mask(X86_FEATURE_OSXSAVE);
 
-        /* Don't expose PCID to non-hap hvm. */
+        /* Don't expose HAP-only features to non-hap guests. */
         if ( !hap_enabled(d) )
+        {
             *ecx &= ~cpufeat_mask(X86_FEATURE_PCID);
 
-        /* Only provide PSE36 when guest runs in 32bit PAE or in long mode */
-        if ( !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
-            *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
+            /*
+             * PSE36 is not supported in shadow mode.  This bit should be
+             * unilaterally cleared.
+             *
+             * However, an unspecified version of Hyper-V from 2011 refuses
+             * to start as the "cpu does not provide required hw features" if
+             * it can't see PSE36.
+             *
+             * As a workaround, leak the toolstack-provided PSE36 value into a
+             * shadow guest if the guest is already using PAE paging (and
+             * won't care about reverting back to PSE paging).  Otherwise,
+             * nobble it, so a 32bit guest doesn't get the impression that it
+             * could try to use PSE36 paging.
+             */
+            if ( !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
+                *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
+        }
         break;
+
     case 0x7:
         if ( count == 0 )
         {
-            if ( !cpu_has_smep )
-                *ebx &= ~cpufeat_mask(X86_FEATURE_SMEP);
-
-            if ( !cpu_has_smap )
-                *ebx &= ~cpufeat_mask(X86_FEATURE_SMAP);
+            /* Fold host's FDP_EXCP_ONLY and NO_FPU_SEL into guest's view. */
+            *ebx &= (hvm_featureset[FEATURESET_7b0] &
+                     ~special_features[FEATURESET_7b0]);
+            *ebx |= (host_featureset[FEATURESET_7b0] &
+                     special_features[FEATURESET_7b0]);
 
-            /* Don't expose MPX to hvm when VMX support is not available. */
-            if ( !(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) ||
-                 !(vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS) )
-                *ebx &= ~cpufeat_mask(X86_FEATURE_MPX);
+            *ecx &= hvm_featureset[FEATURESET_7c0];
 
+            /* Don't expose HAP-only features to non-hap guests. */
             if ( !hap_enabled(d) )
             {
-                 /* Don't expose INVPCID to non-hap hvm. */
                  *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
-                 /* X86_FEATURE_PKU is not yet implemented for shadow paging. */
                  *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
             }
 
-            if ( (*ecx & cpufeat_mask(X86_FEATURE_PKU)) &&
-                 (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) )
+            /* OSPKE cleared by hvm_featureset.  Fast-forward CR4 back in. */
+            if ( v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE )
                 *ecx |= cpufeat_mask(X86_FEATURE_OSPKE);
-            else
-                *ecx &= ~cpufeat_mask(X86_FEATURE_OSPKE);
-
-            /* Don't expose PCOMMIT to hvm when VMX support is not available. */
-            if ( !cpu_has_vmx_pcommit )
-                *ebx &= ~cpufeat_mask(X86_FEATURE_PCOMMIT);
         }
-
         break;
+
     case 0xb:
         /* Fix the x2APIC identifier. */
         *edx = v->vcpu_id * 2;
         break;
+
     case 0xd:
         /* EBX value of main leaf 0 depends on enabled xsave features */
         if ( count == 0 && v->arch.xcr0 ) 
@@ -4740,9 +4750,12 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
                     *ebx = _eax + _ebx;
             }
         }
+
         if ( count == 1 )
         {
-            if ( cpu_has_xsaves && cpu_has_vmx_xsaves )
+            *eax &= hvm_featureset[FEATURESET_Da1];
+
+            if ( *eax & cpufeat_mask(X86_FEATURE_XSAVES) )
             {
                 *ebx = XSTATE_AREA_MIN_SIZE;
                 if ( v->arch.xcr0 | v->arch.hvm_vcpu.msr_xss )
@@ -4757,20 +4770,42 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         break;
 
     case 0x80000001:
-        /* We expose RDTSCP feature to guest only when
-           tsc_mode == TSC_MODE_DEFAULT and host_tsc_is_safe() returns 1 */
-        if ( d->arch.tsc_mode != TSC_MODE_DEFAULT ||
-             !host_tsc_is_safe() )
-            *edx &= ~cpufeat_mask(X86_FEATURE_RDTSCP);
-        /* Hide 1GB-superpage feature if we can't emulate it. */
-        if (!hvm_pse1gb_supported(d))
+        *ecx &= hvm_featureset[FEATURESET_e1c];
+        *edx &= hvm_featureset[FEATURESET_e1d];
+
+        /* If not emulating AMD, clear the duplicated features in e1d. */
+        if ( d->arch.x86_vendor != X86_VENDOR_AMD )
+            *edx &= ~CPUID_COMMON_1D_FEATURES;
+        /* fast-forward MSR_APIC_BASE.EN if it hasn't already been clobbered. */
+        else if ( vlapic_hw_disabled(vcpu_vlapic(v)) )
+            *edx &= ~cpufeat_mask(X86_FEATURE_APIC);
+
+        /* Don't expose HAP-only features to non-hap guests. */
+        if ( !hap_enabled(d) )
+        {
             *edx &= ~cpufeat_mask(X86_FEATURE_PAGE1GB);
-        /* Only provide PSE36 when guest runs in 32bit PAE or in long mode */
-        if ( !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
-            *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
-        /* Hide data breakpoint extensions if the hardware has no support. */
-        if ( !boot_cpu_has(X86_FEATURE_DBEXT) )
-            *ecx &= ~cpufeat_mask(X86_FEATURE_DBEXT);
+
+            /*
+             * PSE36 is not supported in shadow mode.  This bit should be
+             * unilaterally cleared.
+             *
+             * However, an unspecified version of Hyper-V from 2011 refuses
+             * to start as the "cpu does not provide required hw features" if
+             * it can't see PSE36.
+             *
+             * As a workaround, leak the toolstack-provided PSE36 value into a
+             * shadow guest if the guest is already using PAE paging (and
+             * won't care about reverting back to PSE paging).  Otherwise,
+             * nobble it, so a 32bit guest doesn't get the impression that it
+             * could try to use PSE36 paging.
+             */
+            if ( !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
+                *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
+        }
+        break;
+
+    case 0x80000007:
+        *edx &= hvm_featureset[FEATURESET_e7d];
         break;
 
     case 0x80000008:
@@ -4788,6 +4823,8 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         hvm_cpuid(0x80000001, NULL, NULL, NULL, &_edx);
         *eax = (*eax & ~0xffff00) | (_edx & cpufeat_mask(X86_FEATURE_LM)
                                      ? 0x3000 : 0x2000);
+
+        *ebx &= hvm_featureset[FEATURESET_e8b];
         break;
     }
 }
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 6fbb1cf..dfa1cb6 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -73,6 +73,7 @@
 #include <asm/hpet.h>
 #include <asm/vpmu.h>
 #include <public/arch-x86/cpuid.h>
+#include <asm/cpuid.h>
 #include <xsm/xsm.h>
 
 /*
@@ -932,69 +933,116 @@ void pv_cpuid(struct cpu_user_regs *regs)
     else
         cpuid_count(leaf, subleaf, &a, &b, &c, &d);
 
-    if ( (leaf & 0x7fffffff) == 0x00000001 )
-    {
-        /* Modify Feature Information. */
-        if ( !cpu_has_apic )
-            __clear_bit(X86_FEATURE_APIC, &d);
-
-        if ( !is_pvh_domain(currd) )
-        {
-            __clear_bit(X86_FEATURE_PSE, &d);
-            __clear_bit(X86_FEATURE_PGE, &d);
-            __clear_bit(X86_FEATURE_PSE36, &d);
-            __clear_bit(X86_FEATURE_VME, &d);
-        }
-    }
-
     switch ( leaf )
     {
     case 0x00000001:
-        /* Modify Feature Information. */
-        if ( !cpu_has_sep )
-            __clear_bit(X86_FEATURE_SEP, &d);
-        __clear_bit(X86_FEATURE_DS, &d);
-        __clear_bit(X86_FEATURE_TM1, &d);
-        __clear_bit(X86_FEATURE_PBE, &d);
-        if ( is_pvh_domain(currd) )
-            __clear_bit(X86_FEATURE_MTRR, &d);
-
-        __clear_bit(X86_FEATURE_DTES64 % 32, &c);
-        __clear_bit(X86_FEATURE_MONITOR % 32, &c);
-        __clear_bit(X86_FEATURE_DSCPL % 32, &c);
-        __clear_bit(X86_FEATURE_VMX % 32, &c);
-        __clear_bit(X86_FEATURE_SMX % 32, &c);
-        __clear_bit(X86_FEATURE_TM2 % 32, &c);
+        c &= pv_featureset[FEATURESET_1c];
+        d &= pv_featureset[FEATURESET_1d];
+
         if ( is_pv_32bit_domain(currd) )
-            __clear_bit(X86_FEATURE_CX16 % 32, &c);
-        __clear_bit(X86_FEATURE_XTPR % 32, &c);
-        __clear_bit(X86_FEATURE_PDCM % 32, &c);
-        __clear_bit(X86_FEATURE_PCID % 32, &c);
-        __clear_bit(X86_FEATURE_DCA % 32, &c);
-        if ( !cpu_has_xsave )
+            c &= ~cpufeat_mask(X86_FEATURE_CX16);
+
+        /*
+         * !!! Warning - OSXSAVE handling for PV guests is non-architectural !!!
+         *
+         * Architecturally, the correct code here is simply:
+         *
+         *   if ( curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_OSXSAVE )
+         *       c |= cpufeat_mask(X86_FEATURE_OSXSAVE);
+         *
+         * However because of bugs in Xen (before c/s bd19080b, Nov 2010, the
+         * XSAVE cpuid flag leaked into guests despite the feature not being
+         * available for use), buggy workarounds were introduced to Linux (c/s
+         * 947ccf9c, also Nov 2010) which relied on the fact that Xen also
+         * incorrectly leaked OSXSAVE into the guest.
+         *
+         * Furthermore, providing architectural OSXSAVE behaviour to many
+         * Linux PV guests triggered a further kernel bug when the fpu code
+         * observes that XSAVEOPT is available, assumes that xsave state had
+         * been set up for the task, and follows a wild pointer.
+         *
+         * Older Linux PVOPS kernels however do require architectural
+         * behaviour.  They observe Xen's leaked OSXSAVE and assume they can
+         * already use XSETBV, dying with a #UD because the shadowed
+         * CR4.OSXSAVE is clear.  This behaviour has been adjusted in all
+         * observed cases via stable backports of the above changeset.
+         *
+         * Therefore, the leaking of Xen's OSXSAVE setting has become a
+         * de facto part of the PV ABI and can't reasonably be corrected.
+         *
+         * The following situations and logic now apply:
+         *
+         * - Hardware without CPUID faulting support and native CPUID:
+         *    There is nothing Xen can do here.  The host's XSAVE flag will
+         *    leak through and Xen's OSXSAVE choice will leak through.
+         *
+         *    In the case that the guest kernel has not set up OSXSAVE, only
+         *    SSE will be set in xcr0, and guest userspace can't do too much
+         *    damage itself.
+         *
+         * - Enlightened CPUID or CPUID faulting available:
+         *    Xen can fully control what is seen here.  Guest kernels need to
+         *    see the leaked OSXSAVE, but guest userspace is given
+         *    architectural behaviour, to reflect the guest kernel's
+         *    intentions.
+         */
+        if ( !is_pvh_domain(currd) )
         {
-            __clear_bit(X86_FEATURE_XSAVE % 32, &c);
-            __clear_bit(X86_FEATURE_AVX % 32, &c);
+            /*
+             * Delete the PVH condition when HVMLite formally replaces PVH,
+             * and HVM guests no longer enter a PV codepath.
+             */
+
+            /* OSXSAVE cleared by pv_featureset.  Fast-forward CR4 back in. */
+            if ( (is_pv_domain(currd) && guest_kernel_mode(curr, regs) &&
+                  (read_cr4() & X86_CR4_OSXSAVE)) ||
+                 (curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_OSXSAVE) )
+                c |= cpufeat_mask(X86_FEATURE_OSXSAVE);
         }
-        if ( !cpu_has_apic )
-           __clear_bit(X86_FEATURE_X2APIC % 32, &c);
-        __set_bit(X86_FEATURE_HYPERVISOR % 32, &c);
+
+        /*
+         * PV guests cannot use any MTRR infrastructure, so shouldn't see the
+         * feature bit.  It used to leak in to PV guests.
+         *
+         * PVOPS Linux self-clobbers the MTRR feature, to avoid trying to use
+         * the associated MSRs.  Xenolinux-based PV dom0's however use the
+         * MTRR feature as an indication of the presence of the
+         * XENPF_{add,del,read}_memtype hypercalls.
+         *
+         * Leak the host MTRR value into the hardware domain only.
+         */
+        if ( is_hardware_domain(currd) && cpu_has_mtrr )
+            d |= cpufeat_mask(X86_FEATURE_MTRR);
+
+        c |= cpufeat_mask(X86_FEATURE_HYPERVISOR);
         break;
 
     case 0x00000007:
         if ( subleaf == 0 )
-            b &= (cpufeat_mask(X86_FEATURE_BMI1) |
-                  cpufeat_mask(X86_FEATURE_HLE)  |
-                  cpufeat_mask(X86_FEATURE_AVX2) |
-                  cpufeat_mask(X86_FEATURE_BMI2) |
-                  cpufeat_mask(X86_FEATURE_ERMS) |
-                  cpufeat_mask(X86_FEATURE_RTM)  |
-                  cpufeat_mask(X86_FEATURE_RDSEED)  |
-                  cpufeat_mask(X86_FEATURE_ADX)  |
-                  cpufeat_mask(X86_FEATURE_FSGSBASE));
+        {
+            /* Fold host's FDP_EXCP_ONLY and NO_FPU_SEL into guest's view. */
+            b &= (pv_featureset[FEATURESET_7b0] &
+                  ~special_features[FEATURESET_7b0]);
+            b |= (host_featureset[FEATURESET_7b0] &
+                  special_features[FEATURESET_7b0]);
+
+            c &= pv_featureset[FEATURESET_7c0];
+
+            if ( !is_pvh_domain(currd) )
+            {
+                /*
+                 * Delete the PVH condition when HVMLite formally replaces PVH,
+                 * and HVM guests no longer enter a PV codepath.
+                 */
+
+                /* OSPKE cleared by pv_featureset.  Fast-forward CR4 back in. */
+                if ( curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_PKE )
+                    c |= cpufeat_mask(X86_FEATURE_OSPKE);
+            }
+        }
         else
-            b = 0;
-        a = c = d = 0;
+            b = c = 0;
+        a = d = 0;
         break;
 
     case XSTATE_CPUID:
@@ -1017,37 +1065,50 @@ void pv_cpuid(struct cpu_user_regs *regs)
         }
 
         case 1:
-            a &= (boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_XSAVEOPT)] &
-                  ~cpufeat_mask(X86_FEATURE_XSAVES));
+            a &= pv_featureset[FEATURESET_Da1];
             b = c = d = 0;
             break;
         }
         break;
 
     case 0x80000001:
-        /* Modify Feature Information. */
+        c &= pv_featureset[FEATURESET_e1c];
+        d &= pv_featureset[FEATURESET_e1d];
+
+        /* If not emulating AMD, clear the duplicated features in e1d. */
+        if ( currd->arch.x86_vendor != X86_VENDOR_AMD )
+            d &= ~CPUID_COMMON_1D_FEATURES;
+
+        /*
+         * PV guests cannot use any MTRR infrastructure, so shouldn't see the
+         * feature bit.  It used to leak into PV guests.
+         *
+         * PVOPS Linux self-clobbers the MTRR feature, to avoid trying to use
+         * the associated MSRs.  Xenolinux-based PV dom0's however use the
+         * MTRR feature as an indication of the presence of the
+         * XENPF_{add,del,read}_memtype hypercalls.
+         *
+         * Leak the host MTRR value into the hardware domain only.
+         */
+        if ( is_hardware_domain(currd) && cpu_has_mtrr )
+            d |= cpufeat_mask(X86_FEATURE_MTRR);
+
         if ( is_pv_32bit_domain(currd) )
         {
-            __clear_bit(X86_FEATURE_LM % 32, &d);
-            __clear_bit(X86_FEATURE_LAHF_LM % 32, &c);
+            d &= ~cpufeat_mask(X86_FEATURE_LM);
+            c &= ~cpufeat_mask(X86_FEATURE_LAHF_LM);
+
+            if ( boot_cpu_data.x86_vendor != X86_VENDOR_AMD )
+                d &= ~cpufeat_mask(X86_FEATURE_SYSCALL);
         }
-        if ( is_pv_32bit_domain(currd) &&
-             boot_cpu_data.x86_vendor != X86_VENDOR_AMD )
-            __clear_bit(X86_FEATURE_SYSCALL % 32, &d);
-        __clear_bit(X86_FEATURE_PAGE1GB % 32, &d);
-        __clear_bit(X86_FEATURE_RDTSCP % 32, &d);
-
-        __clear_bit(X86_FEATURE_SVM % 32, &c);
-        if ( !cpu_has_apic )
-           __clear_bit(X86_FEATURE_EXTAPIC % 32, &c);
-        __clear_bit(X86_FEATURE_OSVW % 32, &c);
-        __clear_bit(X86_FEATURE_IBS % 32, &c);
-        __clear_bit(X86_FEATURE_SKINIT % 32, &c);
-        __clear_bit(X86_FEATURE_WDT % 32, &c);
-        __clear_bit(X86_FEATURE_LWP % 32, &c);
-        __clear_bit(X86_FEATURE_NODEID_MSR % 32, &c);
-        __clear_bit(X86_FEATURE_TOPOEXT % 32, &c);
-        __clear_bit(X86_FEATURE_MONITORX % 32, &c);
+        break;
+
+    case 0x80000007:
+        d &= pv_featureset[FEATURESET_e7d];
+        break;
+
+    case 0x80000008:
+        b &= pv_featureset[FEATURESET_e8b];
         break;
 
     case 0x0000000a: /* Architectural Performance Monitor Features (Intel) */
-- 
2.1.4



* [PATCH v4 12/26] x86/cpu: Move set_cpumask() calls into c_early_init()
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (10 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 11/26] xen/x86: Improvements to in-hypervisor cpuid sanity checks Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-28 15:55   ` Konrad Rzeszutek Wilk
  2016-03-23 16:36 ` [PATCH v4 13/26] x86/cpu: Sysctl and common infrastructure for levelling context switching Andrew Cooper
                   ` (14 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Before c/s 44e24f8567 "x86: don't call generic_identify() redundantly", the
commandline-provided masks would take effect in Xen's view of the features.

As the masks got applied after the query for features, the redundant call to
generic_identify() would clobber the pre-masking feature information with the
post-masking information.

Move the set_cpuidmask() calls into c_early_init() so their effects take place
before the main query for features in generic_identify().

The cpuid_mask_* command line parameters now limit the entire system, a
feature XenServer was relying on for testing purposes.  Subsequent changes
will cause the mask MSRs to be context switched per-domain, removing the need
to use the command line parameters for heterogeneous levelling purposes.
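
The ordering issue can be illustrated with a small toy model.  This is only
a sketch: struct toy_cpu and the toy_*() helpers are hypothetical stand-ins
for the hardware, generic_identify() and set_cpuidmask(), not code from this
patch.

```c
#include <stdint.h>

/* Toy model of one CPU: raw hardware features plus a masking MSR. */
struct toy_cpu {
    uint32_t hw_features;   /* what the silicon implements */
    uint32_t mask_msr;      /* all-ones means "no masking" */
    uint32_t xen_view;      /* snapshot taken by the feature query */
};

/* Stand-in for the CPUID instruction: masked-out features are hidden. */
static uint32_t toy_cpuid(const struct toy_cpu *c)
{
    return c->hw_features & c->mask_msr;
}

/* Stand-in for generic_identify(): snapshot the visible features. */
static void toy_identify(struct toy_cpu *c)
{
    c->xen_view = toy_cpuid(c);
}

/* Stand-in for set_cpuidmask(): program the masking MSR. */
static void toy_set_mask(struct toy_cpu *c, uint32_t mask)
{
    c->mask_msr = mask;
}

/* Masking in c_early_init(): the mask is in place before the query. */
static uint32_t mask_then_identify(uint32_t hw, uint32_t mask)
{
    struct toy_cpu c = { .hw_features = hw, .mask_msr = ~0u };

    toy_set_mask(&c, mask);
    toy_identify(&c);
    return c.xen_view;
}

/* Masking in c_init(): the only query has already run. */
static uint32_t identify_then_mask(uint32_t hw, uint32_t mask)
{
    struct toy_cpu c = { .hw_features = hw, .mask_msr = ~0u };

    toy_identify(&c);
    toy_set_mask(&c, mask);
    return c.xen_view;
}
```

In this model only mask_then_identify() yields a view of the features which
reflects the mask, which is the behaviour this patch restores.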

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/cpu/amd.c   |  8 ++++++--
 xen/arch/x86/cpu/intel.c | 34 +++++++++++++++++-----------------
 2 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 47a38c6..5516777 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -407,6 +407,11 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
                                                          c->cpu_core_id);
 }
 
+static void early_init_amd(struct cpuinfo_x86 *c)
+{
+	set_cpuidmask(c);
+}
+
 static void init_amd(struct cpuinfo_x86 *c)
 {
 	u32 l, h;
@@ -595,14 +600,13 @@ static void init_amd(struct cpuinfo_x86 *c)
 	if ((smp_processor_id() == 1) && !cpu_has(c, X86_FEATURE_ITSC))
 		disable_c1_ramping();
 
-	set_cpuidmask(c);
-
 	check_syscfg_dram_mod_en();
 }
 
 static const struct cpu_dev amd_cpu_dev = {
 	.c_vendor	= "AMD",
 	.c_ident 	= { "AuthenticAMD" },
+	.c_early_init	= early_init_amd,
 	.c_init		= init_amd,
 };
 
diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
index bdf89f6..ad22375 100644
--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -189,6 +189,23 @@ static void early_init_intel(struct cpuinfo_x86 *c)
 	if (boot_cpu_data.x86 == 0xF && boot_cpu_data.x86_model == 3 &&
 	    (boot_cpu_data.x86_mask == 3 || boot_cpu_data.x86_mask == 4))
 		paddr_bits = 36;
+
+	if (c == &boot_cpu_data && c->x86 == 6) {
+		if (probe_intel_cpuid_faulting())
+			__set_bit(X86_FEATURE_CPUID_FAULTING,
+				  c->x86_capability);
+	} else if (boot_cpu_has(X86_FEATURE_CPUID_FAULTING)) {
+		BUG_ON(!probe_intel_cpuid_faulting());
+		__set_bit(X86_FEATURE_CPUID_FAULTING, c->x86_capability);
+	}
+
+	if (!cpu_has_cpuid_faulting)
+		set_cpuidmask(c);
+	else if ((c == &boot_cpu_data) &&
+		 (~(opt_cpuid_mask_ecx & opt_cpuid_mask_edx &
+		    opt_cpuid_mask_ext_ecx & opt_cpuid_mask_ext_edx &
+		    opt_cpuid_mask_xsave_eax)))
+		printk("No CPUID feature masking support available\n");
 }
 
 /*
@@ -258,23 +275,6 @@ static void init_intel(struct cpuinfo_x86 *c)
 		detect_ht(c);
 	}
 
-	if (c == &boot_cpu_data && c->x86 == 6) {
-		if (probe_intel_cpuid_faulting())
-			__set_bit(X86_FEATURE_CPUID_FAULTING,
-				  c->x86_capability);
-	} else if (boot_cpu_has(X86_FEATURE_CPUID_FAULTING)) {
-		BUG_ON(!probe_intel_cpuid_faulting());
-		__set_bit(X86_FEATURE_CPUID_FAULTING, c->x86_capability);
-	}
-
-	if (!cpu_has_cpuid_faulting)
-		set_cpuidmask(c);
-	else if ((c == &boot_cpu_data) &&
-		 (~(opt_cpuid_mask_ecx & opt_cpuid_mask_edx &
-		    opt_cpuid_mask_ext_ecx & opt_cpuid_mask_ext_edx &
-		    opt_cpuid_mask_xsave_eax)))
-		printk("No CPUID feature masking support available\n");
-
 	/* Work around errata */
 	Intel_errata_workarounds(c);
 
-- 
2.1.4



* [PATCH v4 13/26] x86/cpu: Sysctl and common infrastructure for levelling context switching
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (11 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 12/26] x86/cpu: Move set_cpumask() calls into c_early_init() Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-24 16:58   ` Jan Beulich
                     ` (2 more replies)
  2016-03-23 16:36 ` [PATCH v4 14/26] x86/cpu: Rework AMD masking MSR setup Andrew Cooper
                   ` (13 subsequent siblings)
  26 siblings, 3 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

A toolstack needs to know how much control Xen has over the visible cpuid
values in PV guests.  Provide an explicit mechanism to query what Xen is
capable of.

This interface will currently report no capabilities.  This change is
scaffolding for future patches, which will introduce detection and switching
logic, after which the interface will report hardware capabilities correctly.
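
As a rough sketch of how a consumer might interpret the returned bitmap: the
LEVELCAP_* values below mirror the XEN_SYSCTL_CPU_LEVELCAP_* constants added
by this patch, while can_level() is a hypothetical helper and not part of any
toolstack.  It assumes that faulting gives full control over every leaf, and
that masking is only usable when every requested register is covered.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the XEN_SYSCTL_CPU_LEVELCAP_* bit positions. */
#define LEVELCAP_faulting    (1u << 0) /* CPUID faulting   */
#define LEVELCAP_ecx         (1u << 1) /* 0x00000001.ecx   */
#define LEVELCAP_edx         (1u << 2) /* 0x00000001.edx   */
#define LEVELCAP_extd_ecx    (1u << 3) /* 0x80000001.ecx   */
#define LEVELCAP_extd_edx    (1u << 4) /* 0x80000001.edx   */

/*
 * Hypothetical helper: can the registers described by 'needed' be levelled
 * for a PV guest, given the caps reported by Xen?
 */
static bool can_level(uint32_t caps, uint32_t needed)
{
    /* With faulting, Xen intercepts every CPUID instruction. */
    if (caps & LEVELCAP_faulting)
        return true;

    /* Otherwise every requested register needs masking support. */
    return (caps & needed) == needed;
}
```

With this patch alone the reported caps are 0, so a check such as
can_level(caps, LEVELCAP_ecx | LEVELCAP_edx) would be false everywhere.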

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>

v2:
 * s/cpumasks/cpuidmasks/
v3:
 * Reintroduce XEN_SYSCTL_get_levelling_caps (requested by Joao for some
   development he has planned).
 * Rename to XEN_SYSCTL_get_cpu_levelling_caps, and rename the constants to
   match the Xen command line options.
v4:
 * Move declarations from processor.h to cpuid.h
 * API corrections for XEN_SYSCTL_get_levelling_caps
---
 xen/arch/x86/cpu/common.c        |  6 ++++++
 xen/arch/x86/sysctl.c            |  6 ++++++
 xen/include/asm-x86/cpufeature.h |  1 +
 xen/include/asm-x86/cpuid.h      | 32 ++++++++++++++++++++++++++++++++
 xen/include/public/sysctl.h      | 23 +++++++++++++++++++++++
 5 files changed, 68 insertions(+)

diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index b5c023f..7ef75b0 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -36,6 +36,12 @@ integer_param("cpuid_mask_ext_ecx", opt_cpuid_mask_ext_ecx);
 unsigned int opt_cpuid_mask_ext_edx = ~0u;
 integer_param("cpuid_mask_ext_edx", opt_cpuid_mask_ext_edx);
 
+unsigned int __initdata expected_levelling_cap;
+unsigned int __read_mostly levelling_caps;
+
+DEFINE_PER_CPU(struct cpuidmasks, cpuidmasks);
+struct cpuidmasks __read_mostly cpuidmask_defaults;
+
 const struct cpu_dev *__read_mostly cpu_devs[X86_VENDOR_NUM] = {};
 
 unsigned int paddr_bits __read_mostly = 36;
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index 58cbd70..f68cbec 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -190,6 +190,12 @@ long arch_do_sysctl(
         }
         break;
 
+    case XEN_SYSCTL_get_cpu_levelling_caps:
+        sysctl->u.cpu_levelling_caps.caps = levelling_caps;
+        if ( __copy_field_to_guest(u_sysctl, sysctl, u.cpu_levelling_caps.caps) )
+            ret = -EFAULT;
+        break;
+
     default:
         ret = -ENOSYS;
         break;
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index e29b024..84d3220 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -81,6 +81,7 @@
 #define cpu_has_xsavec		boot_cpu_has(X86_FEATURE_XSAVEC)
 #define cpu_has_xgetbv1		boot_cpu_has(X86_FEATURE_XGETBV1)
 #define cpu_has_xsaves		boot_cpu_has(X86_FEATURE_XSAVES)
+#define cpu_has_hypervisor	boot_cpu_has(X86_FEATURE_HYPERVISOR)
 
 enum _cache_type {
     CACHE_TYPE_NULL = 0,
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 4725672..9a21c25 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -3,6 +3,7 @@
 
 #include <asm/cpufeatureset.h>
 #include <asm/cpuid-autogen.h>
+#include <asm/percpu.h>
 
 #define FSCAPINTS FEATURESET_NR_ENTRIES
 
@@ -18,6 +19,7 @@
 
 #ifndef __ASSEMBLY__
 #include <xen/types.h>
+#include <public/sysctl.h>
 
 extern const uint32_t known_features[FSCAPINTS];
 extern const uint32_t special_features[FSCAPINTS];
@@ -31,6 +33,36 @@ void calculate_featuresets(void);
 
 const uint32_t *lookup_deep_deps(uint32_t feature);
 
+/*
+ * Expected levelling capabilities (given cpuid vendor/family information),
+ * and levelling capabilities actually available (given MSR probing).
+ */
+#define LCAP_faulting XEN_SYSCTL_CPU_LEVELCAP_faulting
+#define LCAP_1cd      (XEN_SYSCTL_CPU_LEVELCAP_ecx |        \
+                       XEN_SYSCTL_CPU_LEVELCAP_edx)
+#define LCAP_e1cd     (XEN_SYSCTL_CPU_LEVELCAP_extd_ecx |   \
+                       XEN_SYSCTL_CPU_LEVELCAP_extd_edx)
+#define LCAP_Da1      XEN_SYSCTL_CPU_LEVELCAP_xsave_eax
+#define LCAP_6c       XEN_SYSCTL_CPU_LEVELCAP_thermal_ecx
+#define LCAP_7ab0     (XEN_SYSCTL_CPU_LEVELCAP_l7s0_eax |   \
+                       XEN_SYSCTL_CPU_LEVELCAP_l7s0_ebx)
+extern unsigned int expected_levelling_cap, levelling_caps;
+
+struct cpuidmasks
+{
+    uint64_t _1cd;
+    uint64_t e1cd;
+    uint64_t Da1;
+    uint64_t _6c;
+    uint64_t _7ab0;
+};
+
+/* Per CPU shadows of masking MSR values, for lazy context switching. */
+DECLARE_PER_CPU(struct cpuidmasks, cpuidmasks);
+
+/* Default masking MSR values, calculated at boot. */
+extern struct cpuidmasks cpuidmask_defaults;
+
 #endif /* __ASSEMBLY__ */
 #endif /* !__X86_CPUID_H__ */
 
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 96680eb..1ab16db 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -766,6 +766,27 @@ struct xen_sysctl_tmem_op {
 typedef struct xen_sysctl_tmem_op xen_sysctl_tmem_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_tmem_op_t);
 
+/*
+ * XEN_SYSCTL_get_cpu_levelling_caps (x86 specific)
+ *
+ * Return hardware capabilities concerning masking or faulting of the cpuid
+ * instruction for PV guests.
+ */
+struct xen_sysctl_cpu_levelling_caps {
+#define XEN_SYSCTL_CPU_LEVELCAP_faulting    (1ul <<  0) /* CPUID faulting    */
+#define XEN_SYSCTL_CPU_LEVELCAP_ecx         (1ul <<  1) /* 0x00000001.ecx    */
+#define XEN_SYSCTL_CPU_LEVELCAP_edx         (1ul <<  2) /* 0x00000001.edx    */
+#define XEN_SYSCTL_CPU_LEVELCAP_extd_ecx    (1ul <<  3) /* 0x80000001.ecx    */
+#define XEN_SYSCTL_CPU_LEVELCAP_extd_edx    (1ul <<  4) /* 0x80000001.edx    */
+#define XEN_SYSCTL_CPU_LEVELCAP_xsave_eax   (1ul <<  5) /* 0x0000000D:1.eax  */
+#define XEN_SYSCTL_CPU_LEVELCAP_thermal_ecx (1ul <<  6) /* 0x00000006.ecx    */
+#define XEN_SYSCTL_CPU_LEVELCAP_l7s0_eax    (1ul <<  7) /* 0x00000007:0.eax  */
+#define XEN_SYSCTL_CPU_LEVELCAP_l7s0_ebx    (1ul <<  8) /* 0x00000007:0.ebx  */
+    uint32_t caps;
+};
+typedef struct xen_sysctl_cpu_levelling_caps xen_sysctl_cpu_levelling_caps_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cpu_levelling_caps_t);
+
 struct xen_sysctl {
     uint32_t cmd;
 #define XEN_SYSCTL_readconsole                    1
@@ -791,6 +812,7 @@ struct xen_sysctl {
 #define XEN_SYSCTL_pcitopoinfo                   22
 #define XEN_SYSCTL_psr_cat_op                    23
 #define XEN_SYSCTL_tmem_op                       24
+#define XEN_SYSCTL_get_cpu_levelling_caps        25
     uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
     union {
         struct xen_sysctl_readconsole       readconsole;
@@ -816,6 +838,7 @@ struct xen_sysctl {
         struct xen_sysctl_psr_cmt_op        psr_cmt_op;
         struct xen_sysctl_psr_cat_op        psr_cat_op;
         struct xen_sysctl_tmem_op           tmem_op;
+        struct xen_sysctl_cpu_levelling_caps cpu_levelling_caps;
         uint8_t                             pad[128];
     } u;
 };
-- 
2.1.4



* [PATCH v4 14/26] x86/cpu: Rework AMD masking MSR setup
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (12 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 13/26] x86/cpu: Sysctl and common infrastructure for levelling context switching Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-28 18:55   ` Konrad Rzeszutek Wilk
  2016-03-23 16:36 ` [PATCH v4 15/26] x86/cpu: Rework Intel masking/faulting setup Andrew Cooper
                   ` (12 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

This patch is best reviewed as its end result rather than as a diff, as it
rewrites almost all of the setup.

On the BSP, cpuid information is used to evaluate the potentially available
set of masking MSRs, and they are unconditionally probed, filling in the
availability information and hardware defaults.
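
The probing step can be sketched as follows.  This is a toy model only:
struct toy_msr and the toy_*() accessors are hypothetical stand-ins for
rdmsr_amd_safe()/wrmsr_amd_safe(), and expected_caps/real_caps stand in for
expected_levelling_cap/levelling_caps.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy MSR: accesses fail if the MSR is not implemented. */
struct toy_msr {
    bool present;
    uint64_t value;
};

static unsigned int expected_caps, real_caps;

/* Returns 0 on success, non-zero if the access would fault. */
static int toy_rdmsr_safe(const struct toy_msr *m, uint64_t *val)
{
    if (!m->present)
        return -1;
    *val = m->value;
    return 0;
}

static int toy_wrmsr_safe(struct toy_msr *m, uint64_t val)
{
    if (!m->present)
        return -1;
    m->value = val;
    return 0;
}

/*
 * Record that 'caps' are expected, then probe by reading the MSR and
 * writing the same value back.  Only if both succeed are the caps marked
 * as really available.  Returns the hardware default, or 0 if missing.
 */
static uint64_t toy_probe_mask_msr(struct toy_msr *m, unsigned int caps)
{
    uint64_t val = 0;

    expected_caps |= caps;

    if (toy_rdmsr_safe(m, &val) == 0 && toy_wrmsr_safe(m, val) == 0)
        real_caps |= caps;

    return val;
}
```

After probing, the caps which were expected but not found can be computed as
expected_caps & ~real_caps.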

The command line parameters are then combined with the hardware defaults to
further restrict the Xen default masking level.  Each cpu is then context
switched into the default levelling state.
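
The lazy context switch can be sketched with a single mask and a simulated
MSR.  All toy_*() names are hypothetical; the real code uses the per-cpu
cpuidmasks shadow and wrmsr_amd().

```c
#include <stdint.h>

struct toy_masks {
    uint64_t _1cd;
};

static unsigned int toy_msr_writes;   /* counts simulated WRMSRs */
static uint64_t toy_msr_1cd;          /* simulated masking MSR */

static void toy_wrmsr(uint64_t val)
{
    toy_msr_1cd = val;
    toy_msr_writes++;
}

/*
 * Write the MSR only when the shadow copy differs from the wanted value
 * and the hardware supports the mask.  The shadow is updated on a write,
 * so repeated switches to the same state cost nothing.
 */
static void toy_ctxt_switch(struct toy_masks *shadow,
                            const struct toy_masks *next,
                            unsigned int caps, unsigned int needed)
{
    if (shadow->_1cd != next->_1cd && (caps & needed) == needed) {
        toy_wrmsr(next->_1cd);
        shadow->_1cd = next->_1cd;
    }
}
```

The point of the shadow is that MSR writes are comparatively expensive, so
switching between domains which share the same levelling state should not
touch the hardware at all.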

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
---
v2:
 * Provide extra information if opt_cpu_info
 * Extra comment indicating the expected use of amd_ctxt_switch_levelling()
v3:
 * Fix the interaction of the fast-forward bits with the override MSRs.
 * Style fixups.
---
 xen/arch/x86/cpu/amd.c | 276 ++++++++++++++++++++++++++++++++-----------------
 1 file changed, 179 insertions(+), 97 deletions(-)

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 5516777..0e1c8b9 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -80,6 +80,13 @@ static inline int wrmsr_amd_safe(unsigned int msr, unsigned int lo,
 	return err;
 }
 
+static void wrmsr_amd(unsigned int msr, uint64_t val)
+{
+	asm volatile("wrmsr" ::
+		     "c" (msr), "a" ((uint32_t)val),
+		     "d" (val >> 32), "D" (0x9c5a203a));
+}
+
 static const struct cpuidmask {
 	uint16_t fam;
 	char rev[2];
@@ -126,126 +133,198 @@ static const struct cpuidmask *__init noinline get_cpuidmask(const char *opt)
 }
 
 /*
+ * Sets caps in expected_levelling_cap, probes for the specified mask MSR, and
+ * sets caps in levelling_caps if it is found.  Processors prior to Fam 10h
+ * required a 32-bit password for masking MSRs.  Returns the default value.
+ */
+static uint64_t __init _probe_mask_msr(unsigned int msr, uint64_t caps)
+{
+	unsigned int hi, lo;
+
+	expected_levelling_cap |= caps;
+
+	if ((rdmsr_amd_safe(msr, &lo, &hi) == 0) &&
+	    (wrmsr_amd_safe(msr, lo, hi) == 0))
+		levelling_caps |= caps;
+
+	return ((uint64_t)hi << 32) | lo;
+}
+
+/*
+ * Probe for the existence of the expected masking MSRs.  They might easily
+ * not be available if Xen is running virtualised.
+ */
+static void __init noinline probe_masking_msrs(void)
+{
+	const struct cpuinfo_x86 *c = &boot_cpu_data;
+
+	/*
+	 * First, work out which masking MSRs we should have, based on
+	 * revision and cpuid.
+	 */
+
+	/* Fam11 doesn't support masking at all. */
+	if (c->x86 == 0x11)
+		return;
+
+	cpuidmask_defaults._1cd =
+		_probe_mask_msr(MSR_K8_FEATURE_MASK, LCAP_1cd);
+	cpuidmask_defaults.e1cd =
+		_probe_mask_msr(MSR_K8_EXT_FEATURE_MASK, LCAP_e1cd);
+
+	if (c->cpuid_level >= 7)
+		cpuidmask_defaults._7ab0 =
+			_probe_mask_msr(MSR_AMD_L7S0_FEATURE_MASK, LCAP_7ab0);
+
+	if (c->x86 == 0x15 && c->cpuid_level >= 6 && cpuid_ecx(6))
+		cpuidmask_defaults._6c =
+			_probe_mask_msr(MSR_AMD_THRM_FEATURE_MASK, LCAP_6c);
+
+	/*
+	 * Don't bother warning about a mismatch if virtualised.  These MSRs
+	 * are not architectural and almost never virtualised.
+	 */
+	if ((expected_levelling_cap == levelling_caps) ||
+	    cpu_has_hypervisor)
+		return;
+
+	printk(XENLOG_WARNING "Mismatch between expected (%#x) "
+	       "and real (%#x) levelling caps: missing %#x\n",
+	       expected_levelling_cap, levelling_caps,
+	       (expected_levelling_cap ^ levelling_caps) & expected_levelling_cap);
+	printk(XENLOG_WARNING "Fam %#x, model %#x level %#x\n",
+	       c->x86, c->x86_model, c->cpuid_level);
+	printk(XENLOG_WARNING
+	       "If not running virtualised, please report a bug\n");
+}
+
+/*
+ * Context switch levelling state to the next domain.  A parameter of NULL is
+ * used to context switch to the default host state, and is used by the BSP/AP
+ * startup code.
+ */
+static void amd_ctxt_switch_levelling(const struct domain *nextd)
+{
+	struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
+	const struct cpuidmasks *masks = &cpuidmask_defaults;
+
+#define LAZY(cap, msr, field)						\
+	({								\
+		if (unlikely(these_masks->field != masks->field) &&	\
+		    ((levelling_caps & cap) == cap))			\
+		{							\
+			wrmsr_amd(msr, masks->field);			\
+			these_masks->field = masks->field;		\
+		}							\
+	})
+
+	LAZY(LCAP_1cd,  MSR_K8_FEATURE_MASK,       _1cd);
+	LAZY(LCAP_e1cd, MSR_K8_EXT_FEATURE_MASK,   e1cd);
+	LAZY(LCAP_7ab0, MSR_AMD_L7S0_FEATURE_MASK, _7ab0);
+	LAZY(LCAP_6c,   MSR_AMD_THRM_FEATURE_MASK, _6c);
+
+#undef LAZY
+}
+
+/*
  * Mask the features and extended features returned by CPUID.  Parameters are
  * set from the boot line via two methods:
  *
  *   1) Specific processor revision string
  *   2) User-defined masks
  *
- * The processor revision string parameter has precedene.
+ * The user-defined masks take precedence.
  */
-static void set_cpuidmask(const struct cpuinfo_x86 *c)
+static void __init noinline amd_init_levelling(void)
 {
-	static unsigned int feat_ecx, feat_edx;
-	static unsigned int extfeat_ecx, extfeat_edx;
-	static unsigned int l7s0_eax, l7s0_ebx;
-	static unsigned int thermal_ecx;
-	static bool_t skip_feat, skip_extfeat;
-	static bool_t skip_l7s0_eax_ebx, skip_thermal_ecx;
-	static enum { not_parsed, no_mask, set_mask } status;
-	unsigned int eax, ebx, ecx, edx;
-
-	if (status == no_mask)
-		return;
+	const struct cpuidmask *m = NULL;
 
-	if (status == set_mask)
-		goto setmask;
+	probe_masking_msrs();
 
-	ASSERT((status == not_parsed) && (c == &boot_cpu_data));
-	status = no_mask;
+	if (*opt_famrev != '\0') {
+		m = get_cpuidmask(opt_famrev);
 
-	/* Fam11 doesn't support masking at all. */
-	if (c->x86 == 0x11)
-		return;
+		if (!m)
+			printk("Invalid processor string: %s\n", opt_famrev);
+	}
 
-	if (~(opt_cpuid_mask_ecx & opt_cpuid_mask_edx &
-	      opt_cpuid_mask_ext_ecx & opt_cpuid_mask_ext_edx &
-	      opt_cpuid_mask_l7s0_eax & opt_cpuid_mask_l7s0_ebx &
-	      opt_cpuid_mask_thermal_ecx)) {
-		feat_ecx = opt_cpuid_mask_ecx;
-		feat_edx = opt_cpuid_mask_edx;
-		extfeat_ecx = opt_cpuid_mask_ext_ecx;
-		extfeat_edx = opt_cpuid_mask_ext_edx;
-		l7s0_eax = opt_cpuid_mask_l7s0_eax;
-		l7s0_ebx = opt_cpuid_mask_l7s0_ebx;
-		thermal_ecx = opt_cpuid_mask_thermal_ecx;
-	} else if (*opt_famrev == '\0') {
-		return;
-	} else {
-		const struct cpuidmask *m = get_cpuidmask(opt_famrev);
+	if ((levelling_caps & LCAP_1cd) == LCAP_1cd) {
+		uint32_t ecx, edx, tmp;
 
-		if (!m) {
-			printk("Invalid processor string: %s\n", opt_famrev);
-			printk("CPUID will not be masked\n");
-			return;
+		cpuid(0x00000001, &tmp, &tmp, &ecx, &edx);
+
+		if(~(opt_cpuid_mask_ecx & opt_cpuid_mask_edx)) {
+			ecx &= opt_cpuid_mask_ecx;
+			edx &= opt_cpuid_mask_edx;
+		} else if (m) {
+			ecx &= m->ecx;
+			edx &= m->edx;
 		}
-		feat_ecx = m->ecx;
-		feat_edx = m->edx;
-		extfeat_ecx = m->ext_ecx;
-		extfeat_edx = m->ext_edx;
+
+		/* Fast-forward bits - Must be set. */
+		if (ecx & cpufeat_mask(X86_FEATURE_XSAVE))
+			ecx |= cpufeat_mask(X86_FEATURE_OSXSAVE);
+		edx |= cpufeat_mask(X86_FEATURE_APIC);
+
+		/* Allow the HYPERVISOR bit to be set via guest policy. */
+		ecx |= cpufeat_mask(X86_FEATURE_HYPERVISOR);
+
+		cpuidmask_defaults._1cd = ((uint64_t)ecx << 32) | edx;
 	}
 
-        /* Setting bits in the CPUID mask MSR that are not set in the
-         * unmasked CPUID response can cause those bits to be set in the
-         * masked response.  Avoid that by explicitly masking in software. */
-        feat_ecx &= cpuid_ecx(0x00000001);
-        feat_edx &= cpuid_edx(0x00000001);
-        extfeat_ecx &= cpuid_ecx(0x80000001);
-        extfeat_edx &= cpuid_edx(0x80000001);
+	if ((levelling_caps & LCAP_e1cd) == LCAP_e1cd) {
+		uint32_t ecx, edx, tmp;
 
-	status = set_mask;
-	printk("Writing CPUID feature mask ECX:EDX -> %08Xh:%08Xh\n", 
-	       feat_ecx, feat_edx);
-	printk("Writing CPUID extended feature mask ECX:EDX -> %08Xh:%08Xh\n", 
-	       extfeat_ecx, extfeat_edx);
+		cpuid(0x80000001, &tmp, &tmp, &ecx, &edx);
 
-	if (c->cpuid_level >= 7)
-		cpuid_count(7, 0, &eax, &ebx, &ecx, &edx);
-	else
-		ebx = eax = 0;
-	if ((eax | ebx) && ~(l7s0_eax & l7s0_ebx)) {
-		if (l7s0_eax > eax)
-			l7s0_eax = eax;
-		l7s0_ebx &= ebx;
-		printk("Writing CPUID leaf 7 subleaf 0 feature mask EAX:EBX -> %08Xh:%08Xh\n",
-		       l7s0_eax, l7s0_ebx);
-	} else
-		skip_l7s0_eax_ebx = 1;
-
-	/* Only Fam15 has the respective MSR. */
-	ecx = c->x86 == 0x15 && c->cpuid_level >= 6 ? cpuid_ecx(6) : 0;
-	if (ecx && ~thermal_ecx) {
-		thermal_ecx &= ecx;
-		printk("Writing CPUID thermal/power feature mask ECX -> %08Xh\n",
-		       thermal_ecx);
-	} else
-		skip_thermal_ecx = 1;
-
- setmask:
-	/* AMD processors prior to family 10h required a 32-bit password */
-	if (!skip_feat &&
-	    wrmsr_amd_safe(MSR_K8_FEATURE_MASK, feat_edx, feat_ecx)) {
-		skip_feat = 1;
-		printk("Failed to set CPUID feature mask\n");
+		if(~(opt_cpuid_mask_ext_ecx & opt_cpuid_mask_ext_edx)) {
+			ecx &= opt_cpuid_mask_ext_ecx;
+			edx &= opt_cpuid_mask_ext_edx;
+		} else if (m) {
+			ecx &= m->ext_ecx;
+			edx &= m->ext_edx;
+		}
+
+		/* Fast-forward bits - Must be set. */
+		edx |= cpufeat_mask(X86_FEATURE_APIC);
+
+		cpuidmask_defaults.e1cd = ((uint64_t)ecx << 32) | edx;
 	}
 
-	if (!skip_extfeat &&
-	    wrmsr_amd_safe(MSR_K8_EXT_FEATURE_MASK, extfeat_edx, extfeat_ecx)) {
-		skip_extfeat = 1;
-		printk("Failed to set CPUID extended feature mask\n");
+	if ((levelling_caps & LCAP_7ab0) == LCAP_7ab0) {
+		uint32_t eax, ebx, tmp;
+
+		cpuid(0x00000007, &eax, &ebx, &tmp, &tmp);
+
+		if(~(opt_cpuid_mask_l7s0_eax & opt_cpuid_mask_l7s0_ebx)) {
+			eax &= opt_cpuid_mask_l7s0_eax;
+			ebx &= opt_cpuid_mask_l7s0_ebx;
+		}
+
+		cpuidmask_defaults._7ab0 &= ((uint64_t)eax << 32) | ebx;
 	}
 
-	if (!skip_l7s0_eax_ebx &&
-	    wrmsr_amd_safe(MSR_AMD_L7S0_FEATURE_MASK, l7s0_ebx, l7s0_eax)) {
-		skip_l7s0_eax_ebx = 1;
-		printk("Failed to set CPUID leaf 7 subleaf 0 feature mask\n");
+	if ((levelling_caps & LCAP_6c) == LCAP_6c) {
+		uint32_t ecx = cpuid_ecx(6);
+
+		if (~opt_cpuid_mask_thermal_ecx)
+			ecx &= opt_cpuid_mask_thermal_ecx;
+
+		cpuidmask_defaults._6c &= (~0ULL << 32) | ecx;
 	}
 
-	if (!skip_thermal_ecx &&
-	    (rdmsr_amd_safe(MSR_AMD_THRM_FEATURE_MASK, &eax, &edx) ||
-	     wrmsr_amd_safe(MSR_AMD_THRM_FEATURE_MASK, thermal_ecx, edx))){
-		skip_thermal_ecx = 1;
-		printk("Failed to set CPUID thermal/power feature mask\n");
+	if (opt_cpu_info) {
+		printk(XENLOG_INFO "Levelling caps: %#x\n", levelling_caps);
+		printk(XENLOG_INFO
+		       "MSR defaults: 1d 0x%08x, 1c 0x%08x, e1d 0x%08x, "
+		       "e1c 0x%08x, 7a0 0x%08x, 7b0 0x%08x, 6c 0x%08x\n",
+		       (uint32_t)cpuidmask_defaults._1cd,
+		       (uint32_t)(cpuidmask_defaults._1cd >> 32),
+		       (uint32_t)cpuidmask_defaults.e1cd,
+		       (uint32_t)(cpuidmask_defaults.e1cd >> 32),
+		       (uint32_t)(cpuidmask_defaults._7ab0 >> 32),
+		       (uint32_t)cpuidmask_defaults._7ab0,
+		       (uint32_t)cpuidmask_defaults._6c);
 	}
 }
 
@@ -409,7 +488,10 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
 
 static void early_init_amd(struct cpuinfo_x86 *c)
 {
-	set_cpuidmask(c);
+	if (c == &boot_cpu_data)
+		amd_init_levelling();
+
+	amd_ctxt_switch_levelling(NULL);
 }
 
 static void init_amd(struct cpuinfo_x86 *c)
-- 
2.1.4



* [PATCH v4 15/26] x86/cpu: Rework Intel masking/faulting setup
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (13 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 14/26] x86/cpu: Rework AMD masking MSR setup Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-28 19:14   ` Konrad Rzeszutek Wilk
  2016-03-23 16:36 ` [PATCH v4 16/26] x86/cpu: Context switch cpuid masks and faulting state in context_switch() Andrew Cooper
                   ` (11 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

This patch is best reviewed as its end result rather than as a diff, as it
rewrites almost all of the setup.

On the BSP, cpuid information is used to evaluate the potentially available
set of masking MSRs, and they are unconditionally probed, filling in the
availability information and hardware defaults.  A side effect of this is that
probe_intel_cpuid_faulting() can move to being __init.

The command line parameters are then combined with the hardware defaults to
further restrict the Xen default masking level.  Each cpu is then context
switched into the default levelling state.
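
The combination of command line masks with the hardware defaults can be
sketched as below.  The ~(a & b) test is the idiom used in the existing code
to detect whether either option has been set to something other than
all-ones.  The helper names and the hi/lo packing of the two 32-bit masks
into one 64-bit value are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if either option differs from its all-ones "no mask" default. */
static bool any_mask_given(uint32_t opt_a, uint32_t opt_b)
{
    return ~(opt_a & opt_b) != 0;
}

/*
 * Restrict the hardware default with the command line masks.  The masks
 * can only clear bits, never set ones the hardware default lacks.
 */
static uint64_t toy_default_mask(uint64_t hw_default,
                                 uint32_t opt_hi, uint32_t opt_lo)
{
    if (any_mask_given(opt_hi, opt_lo))
        hw_default &= ((uint64_t)opt_hi << 32) | opt_lo;

    return hw_default;
}
```

Because the options are ANDed with the probed default, a command line mask
requesting a bit which the hardware does not offer has no effect.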

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
---
v2:
 * Style fixes.
 * Provide extra information if opt_cpu_info.
 * Extra comment indicating the expected use of intel_ctxt_switch_levelling().
v3:
 * Style fixes.
 * Avoid printing the cpumask defaults if faulting is available.
---
 xen/arch/x86/cpu/intel.c | 234 ++++++++++++++++++++++++++++++-----------------
 1 file changed, 149 insertions(+), 85 deletions(-)

diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
index ad22375..b2666a8 100644
--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -18,11 +18,18 @@
 
 #define select_idle_routine(x) ((void)0)
 
-static unsigned int probe_intel_cpuid_faulting(void)
+static bool_t __init probe_intel_cpuid_faulting(void)
 {
 	uint64_t x;
-	return !rdmsr_safe(MSR_INTEL_PLATFORM_INFO, x) &&
-		(x & MSR_PLATFORM_INFO_CPUID_FAULTING);
+
+	if (rdmsr_safe(MSR_INTEL_PLATFORM_INFO, x) ||
+	    !(x & MSR_PLATFORM_INFO_CPUID_FAULTING))
+		return 0;
+
+	expected_levelling_cap |= LCAP_faulting;
+	levelling_caps |=  LCAP_faulting;
+	__set_bit(X86_FEATURE_CPUID_FAULTING, boot_cpu_data.x86_capability);
+	return 1;
 }
 
 static DEFINE_PER_CPU(bool_t, cpuid_faulting_enabled);
@@ -44,36 +51,40 @@ void set_cpuid_faulting(bool_t enable)
 }
 
 /*
- * opt_cpuid_mask_ecx/edx: cpuid.1[ecx, edx] feature mask.
- * For example, E8400[Intel Core 2 Duo Processor series] ecx = 0x0008E3FD,
- * edx = 0xBFEBFBFF when executing CPUID.EAX = 1 normally. If you want to
- * 'rev down' to E8400, you can set these values in these Xen boot parameters.
+ * Set caps in expected_levelling_cap, probe a specific masking MSR, and set
+ * caps in levelling_caps if it is found, or clobber the MSR index if missing.
+ * If present, returns the default value.
  */
-static void set_cpuidmask(const struct cpuinfo_x86 *c)
+static uint64_t __init _probe_mask_msr(unsigned int *msr, uint64_t caps)
 {
-	static unsigned int msr_basic, msr_ext, msr_xsave;
-	static enum { not_parsed, no_mask, set_mask } status;
-	u64 msr_val;
+	uint64_t val = 0;
 
-	if (status == no_mask)
-		return;
+	expected_levelling_cap |= caps;
 
-	if (status == set_mask)
-		goto setmask;
+	if (rdmsr_safe(*msr, val) || wrmsr_safe(*msr, val))
+		*msr = 0;
+	else
+		levelling_caps |= caps;
 
-	ASSERT((status == not_parsed) && (c == &boot_cpu_data));
-	status = no_mask;
+	return val;
+}
 
-	if (!~(opt_cpuid_mask_ecx & opt_cpuid_mask_edx &
-	       opt_cpuid_mask_ext_ecx & opt_cpuid_mask_ext_edx &
-	       opt_cpuid_mask_xsave_eax))
-		return;
+/* Indices of the masking MSRs, or 0 if unavailable. */
+static unsigned int __read_mostly msr_basic, __read_mostly msr_ext,
+	__read_mostly msr_xsave;
+
+/*
+ * Probe for the existence of the expected masking MSRs.  They might easily
+ * not be available if Xen is running virtualised.
+ */
+static void __init probe_masking_msrs(void)
+{
+	const struct cpuinfo_x86 *c = &boot_cpu_data;
+	unsigned int exp_msr_basic, exp_msr_ext, exp_msr_xsave;
 
 	/* Only family 6 supports this feature. */
-	if (c->x86 != 6) {
-		printk("No CPUID feature masking support available\n");
+	if (c->x86 != 6)
 		return;
-	}
 
 	switch (c->x86_model) {
 	case 0x17: /* Yorkfield, Wolfdale, Penryn, Harpertown(DP) */
@@ -100,59 +111,121 @@ static void set_cpuidmask(const struct cpuinfo_x86 *c)
 		break;
 	}
 
-	status = set_mask;
+	exp_msr_basic = msr_basic;
+	exp_msr_ext   = msr_ext;
+	exp_msr_xsave = msr_xsave;
 
-	if (~(opt_cpuid_mask_ecx & opt_cpuid_mask_edx)) {
-		if (msr_basic)
-			printk("Writing CPUID feature mask ecx:edx -> %08x:%08x\n",
-			       opt_cpuid_mask_ecx, opt_cpuid_mask_edx);
-		else
-			printk("No CPUID feature mask available\n");
-	}
-	else
-		msr_basic = 0;
-
-	if (~(opt_cpuid_mask_ext_ecx & opt_cpuid_mask_ext_edx)) {
-		if (msr_ext)
-			printk("Writing CPUID extended feature mask ecx:edx -> %08x:%08x\n",
-			       opt_cpuid_mask_ext_ecx, opt_cpuid_mask_ext_edx);
-		else
-			printk("No CPUID extended feature mask available\n");
-	}
-	else
-		msr_ext = 0;
-
-	if (~opt_cpuid_mask_xsave_eax) {
-		if (msr_xsave)
-			printk("Writing CPUID xsave feature mask eax -> %08x\n",
-			       opt_cpuid_mask_xsave_eax);
-		else
-			printk("No CPUID xsave feature mask available\n");
+	if (msr_basic)
+		cpuidmask_defaults._1cd = _probe_mask_msr(&msr_basic, LCAP_1cd);
+
+	if (msr_ext)
+		cpuidmask_defaults.e1cd = _probe_mask_msr(&msr_ext, LCAP_e1cd);
+
+	if (msr_xsave)
+		cpuidmask_defaults.Da1 = _probe_mask_msr(&msr_xsave, LCAP_Da1);
+
+	/*
+	 * Don't bother warning about a mismatch if virtualised.  These MSRs
+	 * are not architectural and almost never virtualised.
+	 */
+	if ((expected_levelling_cap == levelling_caps) ||
+	    cpu_has_hypervisor)
+		return;
+
+	printk(XENLOG_WARNING "Mismatch between expected (%#x) "
+	       "and real (%#x) levelling caps: missing %#x\n",
+	       expected_levelling_cap, levelling_caps,
+	       expected_levelling_cap & ~levelling_caps);
+	printk(XENLOG_WARNING "Fam %#x, model %#x expected (%#x/%#x/%#x), "
+	       "got (%#x/%#x/%#x)\n", c->x86, c->x86_model,
+	       exp_msr_basic, exp_msr_ext, exp_msr_xsave,
+	       msr_basic, msr_ext, msr_xsave);
+	printk(XENLOG_WARNING
+	       "If not running virtualised, please report a bug\n");
+}
+
+/*
+ * Context switch levelling state to the next domain.  A parameter of NULL is
+ * used to context switch to the default host state, and is used by the BSP/AP
+ * startup code.
+ */
+static void intel_ctxt_switch_levelling(const struct domain *nextd)
+{
+	struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
+	const struct cpuidmasks *masks = &cpuidmask_defaults;
+
+#define LAZY(msr, field)						\
+	({								\
+		if (unlikely(these_masks->field != masks->field) &&	\
+		    (msr))						\
+		{							\
+			wrmsrl((msr), masks->field);			\
+			these_masks->field = masks->field;		\
+		}							\
+	})
+
+	LAZY(msr_basic, _1cd);
+	LAZY(msr_ext,   e1cd);
+	LAZY(msr_xsave, Da1);
+
+#undef LAZY
+}
+
+/*
+ * opt_cpuid_mask_ecx/edx: cpuid.1[ecx, edx] feature mask.
+ * For example, E8400[Intel Core 2 Duo Processor series] ecx = 0x0008E3FD,
+ * edx = 0xBFEBFBFF when executing CPUID.EAX = 1 normally. If you want to
+ * 'rev down' to E8400, you can set these values in these Xen boot parameters.
+ */
+static void __init noinline intel_init_levelling(void)
+{
+	if (!probe_intel_cpuid_faulting())
+		probe_masking_msrs();
+
+	if (msr_basic) {
+		uint32_t ecx, edx, tmp;
+
+		cpuid(0x00000001, &tmp, &tmp, &ecx, &edx);
+
+		ecx &= opt_cpuid_mask_ecx;
+		edx &= opt_cpuid_mask_edx;
+
+		cpuidmask_defaults._1cd &= ((u64)edx << 32) | ecx;
 	}
-	else
-		msr_xsave = 0;
-
- setmask:
-	if (msr_basic &&
-	    wrmsr_safe(msr_basic,
-		       ((u64)opt_cpuid_mask_edx << 32) | opt_cpuid_mask_ecx)){
-		msr_basic = 0;
-		printk("Failed to set CPUID feature mask\n");
+
+	if (msr_ext) {
+		uint32_t ecx, edx, tmp;
+
+		cpuid(0x80000001, &tmp, &tmp, &ecx, &edx);
+
+		ecx &= opt_cpuid_mask_ext_ecx;
+		edx &= opt_cpuid_mask_ext_edx;
+
+		cpuidmask_defaults.e1cd &= ((u64)edx << 32) | ecx;
 	}
 
-	if (msr_ext &&
-	    wrmsr_safe(msr_ext,
-		       ((u64)opt_cpuid_mask_ext_edx << 32) | opt_cpuid_mask_ext_ecx)){
-		msr_ext = 0;
-		printk("Failed to set CPUID extended feature mask\n");
+	if (msr_xsave) {
+		uint32_t eax, tmp;
+
+		cpuid_count(0x0000000d, 1, &eax, &tmp, &tmp, &tmp);
+
+		eax &= opt_cpuid_mask_xsave_eax;
+
+		cpuidmask_defaults.Da1 &= (~0ULL << 32) | eax;
 	}
 
-	if (msr_xsave &&
-	    (rdmsr_safe(msr_xsave, msr_val) ||
-	     wrmsr_safe(msr_xsave,
-			(msr_val & (~0ULL << 32)) | opt_cpuid_mask_xsave_eax))){
-		msr_xsave = 0;
-		printk("Failed to set CPUID xsave feature mask\n");
+	if (opt_cpu_info) {
+		printk(XENLOG_INFO "Levelling caps: %#x\n", levelling_caps);
+
+		if (!cpu_has_cpuid_faulting)
+			printk(XENLOG_INFO
+			       "MSR defaults: 1d 0x%08x, 1c 0x%08x, e1d 0x%08x, "
+			       "e1c 0x%08x, Da1 0x%08x\n",
+			       (uint32_t)(cpuidmask_defaults._1cd >> 32),
+			       (uint32_t)cpuidmask_defaults._1cd,
+			       (uint32_t)(cpuidmask_defaults.e1cd >> 32),
+			       (uint32_t)cpuidmask_defaults.e1cd,
+			       (uint32_t)cpuidmask_defaults.Da1);
 	}
 }
 
@@ -190,22 +263,13 @@ static void early_init_intel(struct cpuinfo_x86 *c)
 	    (boot_cpu_data.x86_mask == 3 || boot_cpu_data.x86_mask == 4))
 		paddr_bits = 36;
 
-	if (c == &boot_cpu_data && c->x86 == 6) {
-		if (probe_intel_cpuid_faulting())
-			__set_bit(X86_FEATURE_CPUID_FAULTING,
-				  c->x86_capability);
-	} else if (boot_cpu_has(X86_FEATURE_CPUID_FAULTING)) {
-		BUG_ON(!probe_intel_cpuid_faulting());
+	if (c == &boot_cpu_data)
+		intel_init_levelling();
+
+	if (test_bit(X86_FEATURE_CPUID_FAULTING, boot_cpu_data.x86_capability))
 		__set_bit(X86_FEATURE_CPUID_FAULTING, c->x86_capability);
-	}
 
-	if (!cpu_has_cpuid_faulting)
-		set_cpuidmask(c);
-	else if ((c == &boot_cpu_data) &&
-		 (~(opt_cpuid_mask_ecx & opt_cpuid_mask_edx &
-		    opt_cpuid_mask_ext_ecx & opt_cpuid_mask_ext_edx &
-		    opt_cpuid_mask_xsave_eax)))
-		printk("No CPUID feature masking support available\n");
+	intel_ctxt_switch_levelling(NULL);
 }
 
 /*
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

* [PATCH v4 16/26] x86/cpu: Context switch cpuid masks and faulting state in context_switch()
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (14 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 15/26] x86/cpu: Rework Intel masking/faulting setup Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-28 19:27   ` Konrad Rzeszutek Wilk
  2016-03-23 16:36 ` [PATCH v4 17/26] x86/pv: Provide custom cpumasks for PV domains Andrew Cooper
                   ` (10 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

A single ctxt_switch_levelling() function pointer is provided
(defaulting to an empty nop), which is overridden in the appropriate
$VENDOR_init_levelling().

set_cpuid_faulting() is made private and included within
intel_ctxt_switch_levelling().

One functional change is that the faulting configuration is no longer special
cased for dom0.  There was never any need to, and it will cause dom0 to
observe the same information through native and enlightened cpuid.
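
As a rough illustration of the dispatch described above, the following is a
minimal standalone sketch, not the Xen implementation.  The cut-down
struct domain, the faulting_active/hw_writes bookkeeping, the
intel_like_ctxt_switch_levelling() and init_levelling() names and the
faulting_enabled_after() helper are all hypothetical stand-ins; only the
shape (a nop default, overridden when levelling caps are present, with the
hardware write skipped when the state is unchanged) follows the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for Xen's struct domain. */
struct domain { bool is_pv; };

/* Hypothetical record of the state last "programmed" into hardware. */
static bool faulting_active;
static unsigned int hw_writes;

static void default_ctxt_switch_levelling(const struct domain *nextd)
{
    (void)nextd; /* Nop */
}

/* Single hook, defaulting to the nop. */
static void (*ctxt_switch_levelling)(const struct domain *nextd) =
    default_ctxt_switch_levelling;

/* Vendor hook: fault for PV guests, not for HVM guests or NULL (host). */
static void intel_like_ctxt_switch_levelling(const struct domain *nextd)
{
    bool enable = nextd && nextd->is_pv;

    if (faulting_active == enable)
        return; /* Lazy: nothing to write. */

    faulting_active = enable;
    hw_writes++; /* Stands in for the MSR write. */
}

/* Override the hook only if the host has some levelling capability. */
static void init_levelling(unsigned int levelling_caps)
{
    if (levelling_caps)
        ctxt_switch_levelling = intel_like_ctxt_switch_levelling;
}

/* Illustration helper: switch to nextd and report the resulting state. */
static bool faulting_enabled_after(const struct domain *nextd)
{
    assert(ctxt_switch_levelling != NULL);
    ctxt_switch_levelling(nextd);
    return faulting_active;
}
```

Because the default is a real function rather than a NULL pointer, callers
such as context_switch() can invoke the hook unconditionally.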

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
---
v3:
 * Don't leave cpuid masking/faulting active for the kexec kernel.
v2:
 * Style fixes
 * ASSERT() that faulting is available in set_cpuid_faulting()
---
 xen/arch/x86/cpu/amd.c          |  3 +++
 xen/arch/x86/cpu/common.c       |  7 +++++++
 xen/arch/x86/cpu/intel.c        | 20 +++++++++++++++-----
 xen/arch/x86/crash.c            |  3 +++
 xen/arch/x86/domain.c           |  4 +---
 xen/include/asm-x86/processor.h |  2 +-
 6 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 0e1c8b9..484d4b0 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -326,6 +326,9 @@ static void __init noinline amd_init_levelling(void)
 		       (uint32_t)cpuidmask_defaults._7ab0,
 		       (uint32_t)cpuidmask_defaults._6c);
 	}
+
+	if (levelling_caps)
+		ctxt_switch_levelling = amd_ctxt_switch_levelling;
 }
 
 /*
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 7ef75b0..fe6eab4 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -88,6 +88,13 @@ static const struct cpu_dev default_cpu = {
 };
 static const struct cpu_dev *this_cpu = &default_cpu;
 
+static void default_ctxt_switch_levelling(const struct domain *nextd)
+{
+	/* Nop */
+}
+void (* __read_mostly ctxt_switch_levelling)(const struct domain *nextd) =
+	default_ctxt_switch_levelling;
+
 bool_t opt_cpu_info;
 boolean_param("cpuinfo", opt_cpu_info);
 
diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
index b2666a8..71b1199 100644
--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -32,13 +32,15 @@ static bool_t __init probe_intel_cpuid_faulting(void)
 	return 1;
 }
 
-static DEFINE_PER_CPU(bool_t, cpuid_faulting_enabled);
-void set_cpuid_faulting(bool_t enable)
+static void set_cpuid_faulting(bool_t enable)
 {
+	static DEFINE_PER_CPU(bool_t, cpuid_faulting_enabled);
+	bool_t *this_enabled = &this_cpu(cpuid_faulting_enabled);
 	uint32_t hi, lo;
 
-	if (!cpu_has_cpuid_faulting ||
-	    this_cpu(cpuid_faulting_enabled) == enable )
+	ASSERT(cpu_has_cpuid_faulting);
+
+	if (*this_enabled == enable)
 		return;
 
 	rdmsr(MSR_INTEL_MISC_FEATURES_ENABLES, lo, hi);
@@ -47,7 +49,7 @@ void set_cpuid_faulting(bool_t enable)
 		lo |= MSR_MISC_FEATURES_CPUID_FAULTING;
 	wrmsr(MSR_INTEL_MISC_FEATURES_ENABLES, lo, hi);
 
-	this_cpu(cpuid_faulting_enabled) = enable;
+	*this_enabled = enable;
 }
 
 /*
@@ -154,6 +156,11 @@ static void intel_ctxt_switch_levelling(const struct domain *nextd)
 	struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
 	const struct cpuidmasks *masks = &cpuidmask_defaults;
 
+	if (cpu_has_cpuid_faulting) {
+		set_cpuid_faulting(nextd && is_pv_domain(nextd));
+		return;
+	}
+
 #define LAZY(msr, field)						\
 	({								\
 		if (unlikely(these_masks->field != masks->field) &&	\
@@ -227,6 +234,9 @@ static void __init noinline intel_init_levelling(void)
 			       (uint32_t)cpuidmask_defaults.e1cd,
 			       (uint32_t)cpuidmask_defaults.Da1);
 	}
+
+	if (levelling_caps)
+		ctxt_switch_levelling = intel_ctxt_switch_levelling;
 }
 
 static void early_init_intel(struct cpuinfo_x86 *c)
diff --git a/xen/arch/x86/crash.c b/xen/arch/x86/crash.c
index 888a214..f28f527 100644
--- a/xen/arch/x86/crash.c
+++ b/xen/arch/x86/crash.c
@@ -189,6 +189,9 @@ void machine_crash_shutdown(void)
 
     nmi_shootdown_cpus();
 
+    /* Reset CPUID masking and faulting to the host's default. */
+    ctxt_switch_levelling(NULL);
+
     info = kexec_crash_save_info();
     info->xen_phys_start = xen_phys_start;
     info->dom0_pfn_to_mfn_frame_list_list =
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 6ec7554..abc7194 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -2088,9 +2088,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
             load_segments(next);
         }
 
-        set_cpuid_faulting(is_pv_domain(nextd) &&
-                           !is_control_domain(nextd) &&
-                           !is_hardware_domain(nextd));
+        ctxt_switch_levelling(nextd);
     }
 
     context_saved(prev);
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index a950554..f29d370 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -209,7 +209,7 @@ extern struct cpuinfo_x86 boot_cpu_data;
 extern struct cpuinfo_x86 cpu_data[];
 #define current_cpu_data cpu_data[smp_processor_id()]
 
-extern void set_cpuid_faulting(bool_t enable);
+extern void (*ctxt_switch_levelling)(const struct domain *nextd);
 
 extern u64 host_pat;
 extern bool_t opt_cpu_info;
-- 
2.1.4


* [PATCH v4 17/26] x86/pv: Provide custom cpumasks for PV domains
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (15 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 16/26] x86/cpu: Context switch cpuid masks and faulting state in context_switch() Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-28 19:40   ` Konrad Rzeszutek Wilk
  2016-03-23 16:36 ` [PATCH v4 18/26] x86/domctl: Update PV domain cpumasks when setting cpuid policy Andrew Cooper
                   ` (9 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

And use them in preference to cpuidmask_defaults on context switch.  HVM domains
must not be masked (to avoid interfering with cpuid calls within the guest),
so always lazily context switch to the host default.
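
The selection rule can be sketched in isolation as below.  This is a
simplified model under stated assumptions, not the patch itself: struct
cpuidmasks is reduced to one field, struct domain is a hypothetical
stand-in, "loaded" models the per-CPU copy of what is in the MSR, and
msr_writes stands in for wrmsrl().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced to a single mask for illustration. */
struct cpuidmasks { uint64_t _1cd; };

/* Hypothetical stand-in for the relevant parts of struct domain. */
struct domain {
    bool is_pv;
    struct cpuidmasks *cpuidmasks; /* NULL if no custom masks allocated. */
};

static const struct cpuidmasks cpuidmask_defaults = { ._1cd = ~0ULL };
static struct cpuidmasks loaded = { ._1cd = ~0ULL }; /* Per-CPU in Xen. */
static unsigned int msr_writes;

/* PV domains with custom masks use them; everything else gets defaults. */
static const struct cpuidmasks *select_masks(const struct domain *nextd)
{
    return (nextd && nextd->is_pv && nextd->cpuidmasks)
        ? nextd->cpuidmasks : &cpuidmask_defaults;
}

/* Only touch the (modelled) MSR when the wanted value actually differs. */
static void ctxt_switch_masks(const struct domain *nextd)
{
    const struct cpuidmasks *masks = select_masks(nextd);

    assert(masks != NULL);

    if (loaded._1cd != masks->_1cd) {
        msr_writes++; /* Stands in for wrmsrl(). */
        loaded._1cd = masks->_1cd;
    }
}
```

Switching between two domains that want the same mask costs no MSR write,
which is the point of comparing against the per-CPU copy first.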

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
---
v2:
 * s/cpumasks/cpuidmasks/
 * Use structure assignment
 * Fix error path in arch_domain_create()
v3:
 * Indentation fixes.
 * Only allocate PV cpuidmasks if the host has cpumasks to use.
---
 xen/arch/x86/cpu/amd.c       |  4 +++-
 xen/arch/x86/cpu/intel.c     |  5 ++++-
 xen/arch/x86/domain.c        | 14 ++++++++++++++
 xen/include/asm-x86/domain.h |  2 ++
 4 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 484d4b0..8cb04f0 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -206,7 +206,9 @@ static void __init noinline probe_masking_msrs(void)
 static void amd_ctxt_switch_levelling(const struct domain *nextd)
 {
 	struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
-	const struct cpuidmasks *masks = &cpuidmask_defaults;
+	const struct cpuidmasks *masks =
+		(nextd && is_pv_domain(nextd) && nextd->arch.pv_domain.cpuidmasks)
+		? nextd->arch.pv_domain.cpuidmasks : &cpuidmask_defaults;
 
 #define LAZY(cap, msr, field)						\
 	({								\
diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
index 71b1199..00a9987 100644
--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -154,13 +154,16 @@ static void __init probe_masking_msrs(void)
 static void intel_ctxt_switch_levelling(const struct domain *nextd)
 {
 	struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
-	const struct cpuidmasks *masks = &cpuidmask_defaults;
+	const struct cpuidmasks *masks;
 
 	if (cpu_has_cpuid_faulting) {
 		set_cpuid_faulting(nextd && is_pv_domain(nextd));
 		return;
 	}
 
+	masks = (nextd && is_pv_domain(nextd) && nextd->arch.pv_domain.cpuidmasks)
+		? nextd->arch.pv_domain.cpuidmasks : &cpuidmask_defaults;
+
 #define LAZY(msr, field)						\
 	({								\
 		if (unlikely(these_masks->field != masks->field) &&	\
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index abc7194..d0d9773 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -577,6 +577,14 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
             goto fail;
         clear_page(d->arch.pv_domain.gdt_ldt_l1tab);
 
+        if ( levelling_caps & ~LCAP_faulting )
+        {
+            d->arch.pv_domain.cpuidmasks = xmalloc(struct cpuidmasks);
+            if ( !d->arch.pv_domain.cpuidmasks )
+                goto fail;
+            *d->arch.pv_domain.cpuidmasks = cpuidmask_defaults;
+        }
+
         rc = create_perdomain_mapping(d, GDT_LDT_VIRT_START,
                                       GDT_LDT_MBYTES << (20 - PAGE_SHIFT),
                                       NULL, NULL);
@@ -672,7 +680,10 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
         paging_final_teardown(d);
     free_perdomain_mappings(d);
     if ( is_pv_domain(d) )
+    {
+        xfree(d->arch.pv_domain.cpuidmasks);
         free_xenheap_page(d->arch.pv_domain.gdt_ldt_l1tab);
+    }
     psr_domain_free(d);
     return rc;
 }
@@ -692,7 +703,10 @@ void arch_domain_destroy(struct domain *d)
 
     free_perdomain_mappings(d);
     if ( is_pv_domain(d) )
+    {
         free_xenheap_page(d->arch.pv_domain.gdt_ldt_l1tab);
+        xfree(d->arch.pv_domain.cpuidmasks);
+    }
 
     free_xenheap_page(d->shared_info);
     cleanup_domain_irq_mapping(d);
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index de60def..90f021f 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -252,6 +252,8 @@ struct pv_domain
 
     /* map_domain_page() mapping cache. */
     struct mapcache_domain mapcache;
+
+    struct cpuidmasks *cpuidmasks;
 };
 
 struct monitor_write_data {
-- 
2.1.4


* [PATCH v4 18/26] x86/domctl: Update PV domain cpumasks when setting cpuid policy
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (16 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 17/26] x86/pv: Provide custom cpumasks for PV domains Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-24 17:04   ` Jan Beulich
  2016-03-28 19:51   ` Konrad Rzeszutek Wilk
  2016-03-23 16:36 ` [PATCH v4 19/26] xen+tools: Export maximum host and guest cpu featuresets via SYSCTL Andrew Cooper
                   ` (8 subsequent siblings)
  26 siblings, 2 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

This allows PV domains with different featuresets to observe different values
from a native cpuid instruction, on supporting hardware.

It is important to leak the host view of HTT and CMP_LEGACY through to guests,
even though they could be hidden.  These flags affect how to interpret other
cpuid leaves which are not maskable.
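
To make the vendor difference concrete, here is a simplified sketch of the
leaf 1 calculation, assuming the behaviour stated in this patch's comments
(Intel: a plain AND mask with edx in the high half; AMD: ecx in the high
half, with the OSXSAVE and APIC bits forced on so that hardware can
fast-forward them).  The leaf1_mask() name and its parameters are
hypothetical, and the clamping to pv_featureset and the X2APIC leak are
left out for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* CPUID leaf 1 feature bits. */
#define ECX_XSAVE   (1u << 26)
#define ECX_OSXSAVE (1u << 27)
#define EDX_APIC    (1u << 9)
#define EDX_HTT     (1u << 28)

enum vendor { VENDOR_INTEL, VENDOR_AMD };

/* Hypothetical helper: build the 64-bit masking MSR value for leaf 1. */
static uint64_t leaf1_mask(enum vendor v, uint64_t defaults,
                           uint32_t ecx, uint32_t edx, int host_has_htt)
{
    uint64_t mask = defaults;

    assert(v == VENDOR_INTEL || v == VENDOR_AMD);

    /* Leak the host's HTT so unmaskable leaves stay interpretable. */
    if (host_has_htt)
        edx |= EDX_HTT;

    switch (v) {
    case VENDOR_INTEL:
        /* AND mask: edx in the high half, ecx in the low half. */
        mask &= ((uint64_t)edx << 32) | ecx;
        break;

    case VENDOR_AMD:
        /* Override: ecx in the high half, edx in the low half. */
        mask &= ((uint64_t)ecx << 32) | edx;

        /* Fast-forwarded bits must be set for hardware to reflect them. */
        ecx = (ecx & ECX_XSAVE) ? ECX_OSXSAVE : 0;
        edx = EDX_APIC;
        mask |= ((uint64_t)ecx << 32) | edx;
        break;
    }

    return mask;
}
```

Note the register halves are swapped between the two vendors, so the same
policy produces differently laid out MSR values.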

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>

v2:
 * Use switch() rather than if/elseif chain
 * Clamp to static PV featuremask
v3:
 * Only set a shadow cpumask if it is available in hardware.  This causes
   fewer branches in the context switch.
 * Fix interaction between fastforward bits and override MSR.
 * Fix up the cross-vendor case.
 * Fix the host view of HTT/CMP_LEGACY.
v4:
 * More comments explaining the masking MSRs behaviour.
 * s/CPU/CPUID/
 * Leak host X2APIC.
---
 xen/arch/x86/domctl.c            | 138 +++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/cpufeature.h |   1 +
 2 files changed, 139 insertions(+)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index b7c7f42..403bae8 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -36,6 +36,7 @@
 #include <asm/xstate.h>
 #include <asm/debugger.h>
 #include <asm/psr.h>
+#include <asm/cpuid.h>
 
 static int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop)
 {
@@ -87,6 +88,143 @@ static void update_domain_cpuid_info(struct domain *d,
         d->arch.x86_model = (ctl->eax >> 4) & 0xf;
         if ( d->arch.x86 >= 0x6 )
             d->arch.x86_model |= (ctl->eax >> 12) & 0xf0;
+
+        if ( is_pv_domain(d) && ((levelling_caps & LCAP_1cd) == LCAP_1cd) )
+        {
+            uint64_t mask = cpuidmask_defaults._1cd;
+            uint32_t ecx = ctl->ecx & pv_featureset[FEATURESET_1c];
+            uint32_t edx = ctl->edx & pv_featureset[FEATURESET_1d];
+
+            /*
+             * Must expose the host's HTT and X2APIC values so a guest using
+             * native CPUID can correctly interpret other leaves which cannot
+             * be masked.
+             */
+            if ( cpu_has_x2apic )
+                ecx |= cpufeat_mask(X86_FEATURE_X2APIC);
+            if ( cpu_has_htt )
+                edx |= cpufeat_mask(X86_FEATURE_HTT);
+
+            switch ( boot_cpu_data.x86_vendor )
+            {
+            case X86_VENDOR_INTEL:
+                /*
+                 * Intel masking MSRs are documented as AND masks.
+                 * Experimentally, they are applied before OSXSAVE and APIC
+                 * are fast-forwarded from real hardware state.
+                 */
+                mask &= ((uint64_t)edx << 32) | ecx;
+                break;
+
+            case X86_VENDOR_AMD:
+                mask &= ((uint64_t)ecx << 32) | edx;
+
+                /*
+                 * AMD masking MSRs are documented as overrides.
+                 * Experimentally, fast-forwarding of the OSXSAVE and APIC
+                 * bits from real hardware state only occurs if the MSR has
+                 * the respective bits set.
+                 */
+                if ( ecx & cpufeat_mask(X86_FEATURE_XSAVE) )
+                    ecx = cpufeat_mask(X86_FEATURE_OSXSAVE);
+                else
+                    ecx = 0;
+                edx = cpufeat_mask(X86_FEATURE_APIC);
+
+                mask |= ((uint64_t)ecx << 32) | edx;
+                break;
+            }
+
+            d->arch.pv_domain.cpuidmasks->_1cd = mask;
+        }
+        break;
+
+    case 6:
+        if ( is_pv_domain(d) && ((levelling_caps & LCAP_6c) == LCAP_6c) )
+        {
+            uint64_t mask = cpuidmask_defaults._6c;
+
+            if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
+                mask &= (~0ULL << 32) | ctl->ecx;
+
+            d->arch.pv_domain.cpuidmasks->_6c = mask;
+        }
+        break;
+
+    case 7:
+        if ( ctl->input[1] != 0 )
+            break;
+
+        if ( is_pv_domain(d) && ((levelling_caps & LCAP_7ab0) == LCAP_7ab0) )
+        {
+            uint64_t mask = cpuidmask_defaults._7ab0;
+            uint32_t eax = ctl->eax;
+            uint32_t ebx = ctl->ebx & pv_featureset[FEATURESET_7b0];
+
+            if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
+                mask &= ((uint64_t)eax << 32) | ebx;
+
+            d->arch.pv_domain.cpuidmasks->_7ab0 = mask;
+        }
+        break;
+
+    case 0xd:
+        if ( ctl->input[1] != 1 )
+            break;
+
+        if ( is_pv_domain(d) && ((levelling_caps & LCAP_Da1) == LCAP_Da1) )
+        {
+            uint64_t mask = cpuidmask_defaults.Da1;
+            uint32_t eax = ctl->eax & pv_featureset[FEATURESET_Da1];
+
+            if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
+                mask &= (~0ULL << 32) | eax;
+
+            d->arch.pv_domain.cpuidmasks->Da1 = mask;
+        }
+        break;
+
+    case 0x80000001:
+        if ( is_pv_domain(d) && ((levelling_caps & LCAP_e1cd) == LCAP_e1cd) )
+        {
+            uint64_t mask = cpuidmask_defaults.e1cd;
+            uint32_t ecx = ctl->ecx & pv_featureset[FEATURESET_e1c];
+            uint32_t edx = ctl->edx & pv_featureset[FEATURESET_e1d];
+
+            /*
+             * Must expose the host's CMP_LEGACY value so a guest using native
+             * CPUID can correctly interpret other leaves which cannot be
+             * masked.
+             */
+            if ( cpu_has_cmp_legacy )
+                ecx |= cpufeat_mask(X86_FEATURE_CMP_LEGACY);
+
+            /* If not emulating AMD, clear the duplicated features in e1d. */
+            if ( d->arch.x86_vendor != X86_VENDOR_AMD )
+                edx &= ~CPUID_COMMON_1D_FEATURES;
+
+            switch ( boot_cpu_data.x86_vendor )
+            {
+            case X86_VENDOR_INTEL:
+                mask &= ((uint64_t)edx << 32) | ecx;
+                break;
+
+            case X86_VENDOR_AMD:
+                mask &= ((uint64_t)ecx << 32) | edx;
+
+                /*
+                 * Fast-forward bits - Must be set in the masking MSR for
+                 * fast-forwarding to occur in hardware.
+                 */
+                ecx = 0;
+                edx = cpufeat_mask(X86_FEATURE_APIC);
+
+                mask |= ((uint64_t)ecx << 32) | edx;
+                break;
+            }
+
+            d->arch.pv_domain.cpuidmasks->e1cd = mask;
+        }
         break;
     }
 }
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 84d3220..ae7b47b 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -81,6 +81,7 @@
 #define cpu_has_xsavec		boot_cpu_has(X86_FEATURE_XSAVEC)
 #define cpu_has_xgetbv1		boot_cpu_has(X86_FEATURE_XGETBV1)
 #define cpu_has_xsaves		boot_cpu_has(X86_FEATURE_XSAVES)
+#define cpu_has_cmp_legacy	boot_cpu_has(X86_FEATURE_CMP_LEGACY)
 #define cpu_has_hypervisor	boot_cpu_has(X86_FEATURE_HYPERVISOR)
 
 enum _cache_type {
-- 
2.1.4


* [PATCH v4 19/26] xen+tools: Export maximum host and guest cpu featuresets via SYSCTL
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (17 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 18/26] x86/domctl: Update PV domain cpumasks when setting cpuid policy Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-28 19:59   ` Konrad Rzeszutek Wilk
  2016-03-23 16:36 ` [PATCH v4 20/26] tools/libxc: Modify bitmap operations to take void pointers Andrew Cooper
                   ` (7 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Tim Deegan

And provide stubs for toolstack use.
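
The intended calling convention appears to be a two-step query: pass a NULL
buffer to learn the featureset length, then call again with a buffer of that
size.  The sketch below models this with a mock rather than the real
hypercall; mock_get_featureset(), MOCK_FSCAPINTS and the table contents are
hypothetical, and writing the clipped length back on the second call is an
assumption inferred from the libxc wrapper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOCK_FSCAPINTS 4u /* Hypothetical featureset length, in words. */

/* Arbitrary example contents. */
static const uint32_t mock_featureset[MOCK_FSCAPINTS] = {
    0x11111111u, 0x22222222u, 0x33333333u, 0x44444444u,
};

/*
 * Mock of the query: a NULL buffer asks for the maximum length; otherwise
 * the copy is clipped to the smaller of the caller's and the mock's length.
 */
static int mock_get_featureset(uint32_t *nr_features, uint32_t *featureset)
{
    uint32_t nr;

    assert(nr_features != NULL);

    if (featureset == NULL) {
        *nr_features = MOCK_FSCAPINTS;
        return 0;
    }

    nr = (*nr_features < MOCK_FSCAPINTS) ? *nr_features : MOCK_FSCAPINTS;
    memcpy(featureset, mock_featureset, nr * sizeof(*featureset));
    *nr_features = nr;

    return 0;
}
```

A caller would first invoke the function with a NULL buffer, size its array
from the returned length, and then invoke it again to fetch the words.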

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: David Scott <dave@recoil.org>
Acked-by: Jan Beulich <JBeulich@suse.com>
---
CC: Tim Deegan <tim@xen.org>

v2:
 * Rebased to use libxencall
 * Improve hypercall documentation
v3:
 * Provide libxc implementation for XEN_SYSCTL_get_cpu_levelling_caps as well.
v4:
 * More const.
---
 tools/libxc/include/xenctrl.h       |  4 +++
 tools/libxc/xc_cpuid_x86.c          | 41 +++++++++++++++++++++++++++++
 tools/ocaml/libs/xc/xenctrl.ml      |  3 +++
 tools/ocaml/libs/xc/xenctrl.mli     |  4 +++
 tools/ocaml/libs/xc/xenctrl_stubs.c | 35 +++++++++++++++++++++++++
 xen/arch/x86/sysctl.c               | 51 +++++++++++++++++++++++++++++++++++++
 xen/include/public/sysctl.h         | 27 ++++++++++++++++++++
 7 files changed, 165 insertions(+)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 150d727..c136aa8 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2529,6 +2529,10 @@ int xc_psr_cat_get_domain_data(xc_interface *xch, uint32_t domid,
 int xc_psr_cat_get_l3_info(xc_interface *xch, uint32_t socket,
                            uint32_t *cos_max, uint32_t *cbm_len,
                            bool *cdp_enabled);
+
+int xc_get_cpu_levelling_caps(xc_interface *xch, uint32_t *caps);
+int xc_get_cpu_featureset(xc_interface *xch, uint32_t index,
+                          uint32_t *nr_features, uint32_t *featureset);
 #endif
 
 /* Compat shims */
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 733add4..5780397 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -33,6 +33,47 @@
 #define DEF_MAX_INTELEXT  0x80000008u
 #define DEF_MAX_AMDEXT    0x8000001cu
 
+int xc_get_cpu_levelling_caps(xc_interface *xch, uint32_t *caps)
+{
+    DECLARE_SYSCTL;
+    int ret;
+
+    sysctl.cmd = XEN_SYSCTL_get_cpu_levelling_caps;
+    ret = do_sysctl(xch, &sysctl);
+
+    if ( !ret )
+        *caps = sysctl.u.cpu_levelling_caps.caps;
+
+    return ret;
+}
+
+int xc_get_cpu_featureset(xc_interface *xch, uint32_t index,
+                          uint32_t *nr_features, uint32_t *featureset)
+{
+    DECLARE_SYSCTL;
+    DECLARE_HYPERCALL_BOUNCE(featureset,
+                             *nr_features * sizeof(*featureset),
+                             XC_HYPERCALL_BUFFER_BOUNCE_OUT);
+    int ret;
+
+    if ( xc_hypercall_bounce_pre(xch, featureset) )
+        return -1;
+
+    sysctl.cmd = XEN_SYSCTL_get_cpu_featureset;
+    sysctl.u.cpu_featureset.index = index;
+    sysctl.u.cpu_featureset.nr_features = *nr_features;
+    set_xen_guest_handle(sysctl.u.cpu_featureset.features, featureset);
+
+    ret = do_sysctl(xch, &sysctl);
+
+    xc_hypercall_bounce_post(xch, featureset);
+
+    if ( !ret )
+        *nr_features = sysctl.u.cpu_featureset.nr_features;
+
+    return ret;
+}
+
 struct cpuid_domain_info
 {
     enum
diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
index 58a53a1..75006e7 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -242,6 +242,9 @@ external version_changeset: handle -> string = "stub_xc_version_changeset"
 external version_capabilities: handle -> string =
   "stub_xc_version_capabilities"
 
+type featureset_index = Featureset_raw | Featureset_host | Featureset_pv | Featureset_hvm
+external get_cpu_featureset : handle -> featureset_index -> int64 array = "stub_xc_get_cpu_featureset"
+
 external watchdog : handle -> int -> int32 -> int
   = "stub_xc_watchdog"
 
diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index 16443df..720e4b2 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -147,6 +147,10 @@ external version_compile_info : handle -> compile_info
 external version_changeset : handle -> string = "stub_xc_version_changeset"
 external version_capabilities : handle -> string
   = "stub_xc_version_capabilities"
+
+type featureset_index = Featureset_raw | Featureset_host | Featureset_pv | Featureset_hvm
+external get_cpu_featureset : handle -> featureset_index -> int64 array = "stub_xc_get_cpu_featureset"
+
 type core_magic = Magic_hvm | Magic_pv
 type core_header = {
   xch_magic : core_magic;
diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
index 74928e9..e7adf37 100644
--- a/tools/ocaml/libs/xc/xenctrl_stubs.c
+++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
@@ -1214,6 +1214,41 @@ CAMLprim value stub_xc_domain_deassign_device(value xch, value domid, value desc
 	CAMLreturn(Val_unit);
 }
 
+CAMLprim value stub_xc_get_cpu_featureset(value xch, value idx)
+{
+	CAMLparam2(xch, idx);
+	CAMLlocal1(bitmap_val);
+
+	/* Safe, because of the global ocaml lock. */
+	static uint32_t fs_len;
+
+	if (fs_len == 0)
+	{
+		int ret = xc_get_cpu_featureset(_H(xch), 0, &fs_len, NULL);
+
+		if (ret || (fs_len == 0))
+			failwith_xc(_H(xch));
+	}
+
+	{
+		/* To/from hypervisor to retrieve actual featureset */
+		uint32_t fs[fs_len], len = fs_len;
+		unsigned int i;
+
+		int ret = xc_get_cpu_featureset(_H(xch), Int_val(idx), &len, fs);
+
+		if (ret)
+			failwith_xc(_H(xch));
+
+		bitmap_val = caml_alloc(len, 0);
+
+		for (i = 0; i < len; ++i)
+			Store_field(bitmap_val, i, caml_copy_int64(fs[i]));
+	}
+
+	CAMLreturn(bitmap_val);
+}
+
 CAMLprim value stub_xc_watchdog(value xch, value domid, value timeout)
 {
 	CAMLparam3(xch, domid, timeout);
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index f68cbec..9c75de6 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -30,6 +30,7 @@
 #include <xen/cpu.h>
 #include <xsm/xsm.h>
 #include <asm/psr.h>
+#include <asm/cpuid.h>
 
 struct l3_cache_info {
     int ret;
@@ -196,6 +197,56 @@ long arch_do_sysctl(
             ret = -EFAULT;
         break;
 
+    case XEN_SYSCTL_get_cpu_featureset:
+    {
+        static const uint32_t *const featureset_table[] = {
+            [XEN_SYSCTL_cpu_featureset_raw]  = raw_featureset,
+            [XEN_SYSCTL_cpu_featureset_host] = host_featureset,
+            [XEN_SYSCTL_cpu_featureset_pv]   = pv_featureset,
+            [XEN_SYSCTL_cpu_featureset_hvm]  = hvm_featureset,
+        };
+        const uint32_t *featureset = NULL;
+        unsigned int nr;
+
+        /* Request for maximum number of features? */
+        if ( guest_handle_is_null(sysctl->u.cpu_featureset.features) )
+        {
+            sysctl->u.cpu_featureset.nr_features = FSCAPINTS;
+            if ( __copy_field_to_guest(u_sysctl, sysctl,
+                                       u.cpu_featureset.nr_features) )
+                ret = -EFAULT;
+            break;
+        }
+
+        /* Clip the number of entries. */
+        nr = min(sysctl->u.cpu_featureset.nr_features, FSCAPINTS);
+
+        /* Look up requested featureset. */
+        if ( sysctl->u.cpu_featureset.index < ARRAY_SIZE(featureset_table) )
+            featureset = featureset_table[sysctl->u.cpu_featureset.index];
+
+        /* Bad featureset index? */
+        if ( !featureset )
+            ret = -EINVAL;
+
+        /* Copy the requested featureset into place. */
+        if ( !ret && copy_to_guest(sysctl->u.cpu_featureset.features,
+                                   featureset, nr) )
+            ret = -EFAULT;
+
+        /* Inform the caller of how many features we wrote. */
+        sysctl->u.cpu_featureset.nr_features = nr;
+        if ( !ret && __copy_field_to_guest(u_sysctl, sysctl,
+                                           u.cpu_featureset.nr_features) )
+            ret = -EFAULT;
+
+        /* Inform the caller if there was more data to provide. */
+        if ( !ret && nr < FSCAPINTS )
+            ret = -ENOBUFS;
+
+        break;
+    }
+
     default:
         ret = -ENOSYS;
         break;
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 1ab16db..4596d20 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -787,6 +787,31 @@ struct xen_sysctl_cpu_levelling_caps {
 typedef struct xen_sysctl_cpu_levelling_caps xen_sysctl_cpu_levelling_caps_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cpu_levelling_caps_t);
 
+/*
+ * XEN_SYSCTL_get_cpu_featureset (x86 specific)
+ *
+ * Return information about featuresets available on this host.
+ *  -  Raw: The real cpuid values.
+ *  - Host: The values Xen is using, (after command line overrides, etc).
+ *  -   PV: Maximum set of features which can be given to a PV guest.
+ *  -  HVM: Maximum set of features which can be given to a HVM guest.
+ */
+struct xen_sysctl_cpu_featureset {
+#define XEN_SYSCTL_cpu_featureset_raw      0
+#define XEN_SYSCTL_cpu_featureset_host     1
+#define XEN_SYSCTL_cpu_featureset_pv       2
+#define XEN_SYSCTL_cpu_featureset_hvm      3
+    uint32_t index;       /* IN: Which featureset to query? */
+    uint32_t nr_features; /* IN/OUT: Number of entries in/written to
+                           * 'features', or the maximum number of features if
+                           * the guest handle is NULL.  NB. All featuresets
+                           * come from the same numberspace, so have the same
+                           * maximum length. */
+    XEN_GUEST_HANDLE_64(uint32) features; /* OUT: */
+};
+typedef struct xen_sysctl_cpu_featureset xen_sysctl_cpu_featureset_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cpu_featureset_t);
+
 struct xen_sysctl {
     uint32_t cmd;
 #define XEN_SYSCTL_readconsole                    1
@@ -813,6 +838,7 @@ struct xen_sysctl {
 #define XEN_SYSCTL_psr_cat_op                    23
 #define XEN_SYSCTL_tmem_op                       24
 #define XEN_SYSCTL_get_cpu_levelling_caps        25
+#define XEN_SYSCTL_get_cpu_featureset            26
     uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
     union {
         struct xen_sysctl_readconsole       readconsole;
@@ -839,6 +865,7 @@ struct xen_sysctl {
         struct xen_sysctl_psr_cat_op        psr_cat_op;
         struct xen_sysctl_tmem_op           tmem_op;
         struct xen_sysctl_cpu_levelling_caps cpu_levelling_caps;
+        struct xen_sysctl_cpu_featureset    cpu_featureset;
         uint8_t                             pad[128];
     } u;
 };
-- 
2.1.4
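
The XEN_SYSCTL_get_cpu_featureset handler in this patch implements a two-step
length negotiation: a NULL 'features' handle returns the maximum length, and a
real buffer receives at most that many entries, with -ENOBUFS when the buffer
was too short.  The following is a minimal standalone sketch of that contract
only.  It does not call Xen or libxc; demo_get_featureset(), demo_fetch_all(),
DEMO_FSCAPINTS and the table contents are hypothetical stand-ins, and the
featureset index lookup is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical length; the real handler uses FSCAPINTS. */
#define DEMO_FSCAPINTS 4u

/* Hypothetical featureset contents, for illustration only. */
static const uint32_t demo_featureset[DEMO_FSCAPINTS] = {
    0x11111111u, 0x22222222u, 0x33333333u, 0x44444444u,
};

/*
 * Mimics the handler's contract:
 *  - NULL buffer: report the maximum length in *nr and return 0.
 *  - Otherwise: copy min(*nr, maximum) words, report the count written in
 *    *nr, and return -ENOBUFS if there was more data than the buffer held.
 */
static int demo_get_featureset(uint32_t *nr, uint32_t *buf)
{
    uint32_t n;

    assert(nr != NULL);

    if (buf == NULL) {
        *nr = DEMO_FSCAPINTS;
        return 0;
    }

    n = *nr < DEMO_FSCAPINTS ? *nr : DEMO_FSCAPINTS;
    memcpy(buf, demo_featureset, n * sizeof(*buf));
    *nr = n;

    return n < DEMO_FSCAPINTS ? -ENOBUFS : 0;
}

/*
 * Caller side: query the length first, then fetch the data.
 * Returns the number of words fetched, or a negative errno value.
 */
static int demo_fetch_all(uint32_t *buf, uint32_t buf_entries)
{
    uint32_t nr = 0;
    int ret = demo_get_featureset(&nr, NULL);

    if (ret)
        return ret;
    if (nr > buf_entries)
        return -ENOBUFS;

    ret = demo_get_featureset(&nr, buf);
    return ret ? ret : (int)nr;
}
```

A caller with a large enough buffer gets DEMO_FSCAPINTS words back from
demo_fetch_all(); a caller passing a short buffer directly to
demo_get_featureset() still receives the leading words, plus -ENOBUFS to
signal the truncation.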


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v4 20/26] tools/libxc: Modify bitmap operations to take void pointers
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (18 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 19/26] xen+tools: Export maximum host and guest cpu featuresets via SYSCTL Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-28 20:05   ` Konrad Rzeszutek Wilk
  2016-03-23 16:36 ` [PATCH v4 21/26] tools/libxc: Use public/featureset.h for cpuid policy generation Andrew Cooper
                   ` (6 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Julien Grall, Ian Jackson, Wei Liu, Stefano Stabellini

The type of the pointer to a bitmap is not interesting; it does not affect the
representation of the block of bits being pointed to.

Make the libxc functions consistent with those in Xen, so they can work just
as well with 'unsigned int *' based bitmaps.

As part of doing so, change the implementation to be in terms of char rather
than unsigned long.  This fixes alignment concerns with ARM.
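
As a rough illustration of the point above, the following standalone sketch
mirrors the byte-based helpers and applies them to 'unsigned int' storage.
The demo_* names are illustrative and not part of libxc, and the sketch uses
unsigned char rather than plain char to keep the shifts and masks free of
sign concerns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes needed to hold nr_bits bits, rounded up to a whole byte. */
static inline int demo_bitmap_size(int nr_bits)
{
    return (nr_bits + 7) / 8;
}

/* Return 1 if bit 'nr' is set in the bitmap at '_addr', else 0. */
static inline int demo_test_bit(int nr, const void *_addr)
{
    const unsigned char *addr = _addr;
    return (addr[nr / 8] >> (nr % 8)) & 1;
}

/* Set bit 'nr' in the bitmap at '_addr'. */
static inline void demo_set_bit(int nr, void *_addr)
{
    unsigned char *addr = _addr;
    addr[nr / 8] |= (unsigned char)(1u << (nr % 8));
}

/* Clear bit 'nr' in the bitmap at '_addr'. */
static inline void demo_clear_bit(int nr, void *_addr)
{
    unsigned char *addr = _addr;
    addr[nr / 8] &= (unsigned char)~(1u << (nr % 8));
}

/*
 * Set, test and clear bit 'nr' in an 'unsigned int' backed bitmap.
 * Returns 1 if the bit read back as set and then as clear.
 */
static int demo_roundtrip(int nr)
{
    unsigned int words[4];
    int was_set;

    memset(words, 0, sizeof(words));
    assert(nr >= 0 && nr < (int)(sizeof(words) * 8));

    demo_set_bit(nr, words);
    was_set = demo_test_bit(nr, words);
    demo_clear_bit(nr, words);

    return was_set && !demo_test_bit(nr, words);
}
```

Because every access is a single byte, no alignment wider than a byte is
required of the caller's buffer.  Note that byte-wise bit numbering only
matches word-wise bit numbering on little-endian hosts.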

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Stefano Stabellini <stefano.stabellini@citrix.com>
CC: Julien Grall <julien.grall@arm.com>

v2:
 * New
v3:
 * Implement in terms of char rather than unsigned long to fix alignment
   issues for ARM.
v4:
 * Fix erroneous calculation in bitmap_size()
---
 tools/libxc/xc_bitops.h | 37 ++++++++++++++++++++-----------------
 1 file changed, 20 insertions(+), 17 deletions(-)

diff --git a/tools/libxc/xc_bitops.h b/tools/libxc/xc_bitops.h
index cd749f4..3e7a544 100644
--- a/tools/libxc/xc_bitops.h
+++ b/tools/libxc/xc_bitops.h
@@ -6,70 +6,73 @@
 #include <stdlib.h>
 #include <string.h>
 
+/* Needed by several includees, but no longer used for bitops. */
 #define BITS_PER_LONG (sizeof(unsigned long) * 8)
 #define ORDER_LONG (sizeof(unsigned long) == 4 ? 5 : 6)
 
-#define BITMAP_ENTRY(_nr,_bmap) ((_bmap))[(_nr)/BITS_PER_LONG]
-#define BITMAP_SHIFT(_nr) ((_nr) % BITS_PER_LONG)
+#define BITMAP_ENTRY(_nr,_bmap) ((_bmap))[(_nr) / 8]
+#define BITMAP_SHIFT(_nr) ((_nr) % 8)
 
 /* calculate required space for number of longs needed to hold nr_bits */
 static inline int bitmap_size(int nr_bits)
 {
-    int nr_long, nr_bytes;
-    nr_long = (nr_bits + BITS_PER_LONG - 1) >> ORDER_LONG;
-    nr_bytes = nr_long * sizeof(unsigned long);
-    return nr_bytes;
+    return (nr_bits + 7) / 8;
 }
 
-static inline unsigned long *bitmap_alloc(int nr_bits)
+static inline void *bitmap_alloc(int nr_bits)
 {
     return calloc(1, bitmap_size(nr_bits));
 }
 
-static inline void bitmap_set(unsigned long *addr, int nr_bits)
+static inline void bitmap_set(void *addr, int nr_bits)
 {
     memset(addr, 0xff, bitmap_size(nr_bits));
 }
 
-static inline void bitmap_clear(unsigned long *addr, int nr_bits)
+static inline void bitmap_clear(void *addr, int nr_bits)
 {
     memset(addr, 0, bitmap_size(nr_bits));
 }
 
-static inline int test_bit(int nr, unsigned long *addr)
+static inline int test_bit(int nr, const void *_addr)
 {
+    const char *addr = _addr;
     return (BITMAP_ENTRY(nr, addr) >> BITMAP_SHIFT(nr)) & 1;
 }
 
-static inline void clear_bit(int nr, unsigned long *addr)
+static inline void clear_bit(int nr, void *_addr)
 {
+    char *addr = _addr;
     BITMAP_ENTRY(nr, addr) &= ~(1UL << BITMAP_SHIFT(nr));
 }
 
-static inline void set_bit(int nr, unsigned long *addr)
+static inline void set_bit(int nr, void *_addr)
 {
+    char *addr = _addr;
     BITMAP_ENTRY(nr, addr) |= (1UL << BITMAP_SHIFT(nr));
 }
 
-static inline int test_and_clear_bit(int nr, unsigned long *addr)
+static inline int test_and_clear_bit(int nr, void *addr)
 {
     int oldbit = test_bit(nr, addr);
     clear_bit(nr, addr);
     return oldbit;
 }
 
-static inline int test_and_set_bit(int nr, unsigned long *addr)
+static inline int test_and_set_bit(int nr, void *addr)
 {
     int oldbit = test_bit(nr, addr);
     set_bit(nr, addr);
     return oldbit;
 }
 
-static inline void bitmap_or(unsigned long *dst, const unsigned long *other,
+static inline void bitmap_or(void *_dst, const void *_other,
                              int nr_bits)
 {
-    int i, nr_longs = (bitmap_size(nr_bits) / sizeof(unsigned long));
-    for ( i = 0; i < nr_longs; ++i )
+    char *dst = _dst;
+    const char *other = _other;
+    int i;
+    for ( i = 0; i < bitmap_size(nr_bits); ++i )
         dst[i] |= other[i];
 }
 
-- 
2.1.4



^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v4 21/26] tools/libxc: Use public/featureset.h for cpuid policy generation
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (19 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 20/26] tools/libxc: Modify bitmap operations to take void pointers Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-28 20:07   ` Konrad Rzeszutek Wilk
  2016-03-23 16:36 ` [PATCH v4 22/26] tools/libxc: Expose the automatically generated cpu featuremask information Andrew Cooper
                   ` (5 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Ian Jackson

This replaces the separate local copy of some of the feature definitions
in xc_cpufeature.h, which is deleted.

Modify the xc_cpuid_x86.c cpumask helpers to appropriately truncate the
new values.

As some of the features have been renamed in the public API, similar renames
are made here.
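
A minimal sketch of the truncation follows.  It assumes feature values are
encoded as word * 32 + bit, which is what the '& 31' in the new bitmaskof()
implies.  The DEMO_* features and their values are hypothetical, and the
macros are renamed copies of the helpers in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature values, encoded as word * 32 + bit. */
enum {
    DEMO_FEATURE_A = 0 * 32 + 28,
    DEMO_FEATURE_B = 1 * 32 + 5,
    DEMO_FEATURE_C = 5 * 32 + 7,
};

/* Reduce a featureset-wide index to a bit within one 32-bit register. */
#define demo_bitmaskof(idx)      (1u << ((idx) & 31))
#define demo_clear_bit(idx, dst) ((dst) &= ~demo_bitmaskof(idx))
#define demo_set_bit(idx, dst)   ((dst) |=  demo_bitmaskof(idx))

/* Index of the featureset word which holds the given feature. */
static inline unsigned int demo_word_of(unsigned int idx)
{
    return idx / 32;
}

/* Return a copy of 'reg' with the bit for feature 'idx' set. */
static inline uint32_t demo_set_in_reg(uint32_t reg, unsigned int idx)
{
    demo_set_bit(idx, reg);
    return reg;
}

/* Return a copy of 'reg' with the bit for feature 'idx' cleared. */
static inline uint32_t demo_clear_in_reg(uint32_t reg, unsigned int idx)
{
    demo_clear_bit(idx, reg);
    return reg;
}
```

Without the mask, a shift by DEMO_FEATURE_B (37) on a 32-bit operand would be
undefined behaviour in C; with it, the same constant can be used directly
against the register that corresponds to its word.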

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>

v3:
 * Adjust naming to match Xen.
---
 tools/libxc/xc_cpufeature.h | 151 --------------------------------------------
 tools/libxc/xc_cpuid_x86.c  |  37 ++++++-----
 2 files changed, 20 insertions(+), 168 deletions(-)
 delete mode 100644 tools/libxc/xc_cpufeature.h

diff --git a/tools/libxc/xc_cpufeature.h b/tools/libxc/xc_cpufeature.h
deleted file mode 100644
index 01dbeec..0000000
--- a/tools/libxc/xc_cpufeature.h
+++ /dev/null
@@ -1,151 +0,0 @@
-/*
- * This library is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation;
- * version 2.1 of the License.
- *
- * This library is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public
- * License along with this library; If not, see <http://www.gnu.org/licenses/>.
- */
-
-#ifndef __LIBXC_CPUFEATURE_H
-#define __LIBXC_CPUFEATURE_H
-
-/* Intel-defined CPU features, CPUID level 0x00000001 (edx) */
-#define X86_FEATURE_FPU          0 /* Onboard FPU */
-#define X86_FEATURE_VME          1 /* Virtual Mode Extensions */
-#define X86_FEATURE_DE           2 /* Debugging Extensions */
-#define X86_FEATURE_PSE          3 /* Page Size Extensions */
-#define X86_FEATURE_TSC          4 /* Time Stamp Counter */
-#define X86_FEATURE_MSR          5 /* Model-Specific Registers, RDMSR, WRMSR */
-#define X86_FEATURE_PAE          6 /* Physical Address Extensions */
-#define X86_FEATURE_MCE          7 /* Machine Check Architecture */
-#define X86_FEATURE_CX8          8 /* CMPXCHG8 instruction */
-#define X86_FEATURE_APIC         9 /* Onboard APIC */
-#define X86_FEATURE_SEP         11 /* SYSENTER/SYSEXIT */
-#define X86_FEATURE_MTRR        12 /* Memory Type Range Registers */
-#define X86_FEATURE_PGE         13 /* Page Global Enable */
-#define X86_FEATURE_MCA         14 /* Machine Check Architecture */
-#define X86_FEATURE_CMOV        15 /* CMOV instruction */
-#define X86_FEATURE_PAT         16 /* Page Attribute Table */
-#define X86_FEATURE_PSE36       17 /* 36-bit PSEs */
-#define X86_FEATURE_PN          18 /* Processor serial number */
-#define X86_FEATURE_CLFLSH      19 /* Supports the CLFLUSH instruction */
-#define X86_FEATURE_DS          21 /* Debug Store */
-#define X86_FEATURE_ACPI        22 /* ACPI via MSR */
-#define X86_FEATURE_MMX         23 /* Multimedia Extensions */
-#define X86_FEATURE_FXSR        24 /* FXSAVE and FXRSTOR instructions */
-#define X86_FEATURE_XMM         25 /* Streaming SIMD Extensions */
-#define X86_FEATURE_XMM2        26 /* Streaming SIMD Extensions-2 */
-#define X86_FEATURE_SELFSNOOP   27 /* CPU self snoop */
-#define X86_FEATURE_HT          28 /* Hyper-Threading */
-#define X86_FEATURE_ACC         29 /* Automatic clock control */
-#define X86_FEATURE_IA64        30 /* IA-64 processor */
-#define X86_FEATURE_PBE         31 /* Pending Break Enable */
-
-/* AMD-defined CPU features, CPUID level 0x80000001 */
-/* Don't duplicate feature flags which are redundant with Intel! */
-#define X86_FEATURE_SYSCALL     11 /* SYSCALL/SYSRET */
-#define X86_FEATURE_MP          19 /* MP Capable. */
-#define X86_FEATURE_NX          20 /* Execute Disable */
-#define X86_FEATURE_MMXEXT      22 /* AMD MMX extensions */
-#define X86_FEATURE_FFXSR       25 /* FFXSR instruction optimizations */
-#define X86_FEATURE_PAGE1GB     26 /* 1Gb large page support */
-#define X86_FEATURE_RDTSCP      27 /* RDTSCP */
-#define X86_FEATURE_LM          29 /* Long Mode (x86-64) */
-#define X86_FEATURE_3DNOWEXT    30 /* AMD 3DNow! extensions */
-#define X86_FEATURE_3DNOW       31 /* 3DNow! */
-
-/* Intel-defined CPU features, CPUID level 0x00000001 (ecx) */
-#define X86_FEATURE_XMM3         0 /* Streaming SIMD Extensions-3 */
-#define X86_FEATURE_PCLMULQDQ    1 /* Carry-less multiplication */
-#define X86_FEATURE_DTES64       2 /* 64-bit Debug Store */
-#define X86_FEATURE_MWAIT        3 /* Monitor/Mwait support */
-#define X86_FEATURE_DSCPL        4 /* CPL Qualified Debug Store */
-#define X86_FEATURE_VMXE         5 /* Virtual Machine Extensions */
-#define X86_FEATURE_SMXE         6 /* Safer Mode Extensions */
-#define X86_FEATURE_EST          7 /* Enhanced SpeedStep */
-#define X86_FEATURE_TM2          8 /* Thermal Monitor 2 */
-#define X86_FEATURE_SSSE3        9 /* Supplemental Streaming SIMD Exts-3 */
-#define X86_FEATURE_CID         10 /* Context ID */
-#define X86_FEATURE_FMA         12 /* Fused Multiply Add */
-#define X86_FEATURE_CX16        13 /* CMPXCHG16B */
-#define X86_FEATURE_XTPR        14 /* Send Task Priority Messages */
-#define X86_FEATURE_PDCM        15 /* Perf/Debug Capability MSR */
-#define X86_FEATURE_PCID        17 /* Process Context ID */
-#define X86_FEATURE_DCA         18 /* Direct Cache Access */
-#define X86_FEATURE_SSE4_1      19 /* Streaming SIMD Extensions 4.1 */
-#define X86_FEATURE_SSE4_2      20 /* Streaming SIMD Extensions 4.2 */
-#define X86_FEATURE_X2APIC      21 /* x2APIC */
-#define X86_FEATURE_MOVBE       22 /* movbe instruction */
-#define X86_FEATURE_POPCNT      23 /* POPCNT instruction */
-#define X86_FEATURE_TSC_DEADLINE 24 /* "tdt" TSC Deadline Timer */
-#define X86_FEATURE_AES         25 /* AES acceleration instructions */
-#define X86_FEATURE_XSAVE       26 /* XSAVE/XRSTOR/XSETBV/XGETBV */
-#define X86_FEATURE_AVX         28 /* Advanced Vector Extensions */
-#define X86_FEATURE_F16C        29 /* Half-precision convert instruction */
-#define X86_FEATURE_RDRAND      30 /* Digital Random Number Generator */
-#define X86_FEATURE_HYPERVISOR  31 /* Running under some hypervisor */
-
-/* VIA/Cyrix/Centaur-defined CPU features, CPUID level 0xC0000001 */
-#define X86_FEATURE_XSTORE       2 /* on-CPU RNG present (xstore insn) */
-#define X86_FEATURE_XSTORE_EN    3 /* on-CPU RNG enabled */
-#define X86_FEATURE_XCRYPT       6 /* on-CPU crypto (xcrypt insn) */
-#define X86_FEATURE_XCRYPT_EN    7 /* on-CPU crypto enabled */
-#define X86_FEATURE_ACE2         8 /* Advanced Cryptography Engine v2 */
-#define X86_FEATURE_ACE2_EN      9 /* ACE v2 enabled */
-#define X86_FEATURE_PHE         10 /* PadLock Hash Engine */
-#define X86_FEATURE_PHE_EN      11 /* PHE enabled */
-#define X86_FEATURE_PMM         12 /* PadLock Montgomery Multiplier */
-#define X86_FEATURE_PMM_EN      13 /* PMM enabled */
-
-/* More extended AMD flags: CPUID level 0x80000001, ecx */
-#define X86_FEATURE_LAHF_LM      0 /* LAHF/SAHF in long mode */
-#define X86_FEATURE_CMP_LEGACY   1 /* If yes HyperThreading not valid */
-#define X86_FEATURE_SVM          2 /* Secure virtual machine */
-#define X86_FEATURE_EXTAPIC      3 /* Extended APIC space */
-#define X86_FEATURE_CR8_LEGACY   4 /* CR8 in 32-bit mode */
-#define X86_FEATURE_ABM          5 /* Advanced bit manipulation */
-#define X86_FEATURE_SSE4A        6 /* SSE-4A */
-#define X86_FEATURE_MISALIGNSSE  7 /* Misaligned SSE mode */
-#define X86_FEATURE_3DNOWPREFETCH 8 /* 3DNow prefetch instructions */
-#define X86_FEATURE_OSVW         9 /* OS Visible Workaround */
-#define X86_FEATURE_IBS         10 /* Instruction Based Sampling */
-#define X86_FEATURE_XOP         11 /* extended AVX instructions */
-#define X86_FEATURE_SKINIT      12 /* SKINIT/STGI instructions */
-#define X86_FEATURE_WDT         13 /* Watchdog timer */
-#define X86_FEATURE_LWP         15 /* Light Weight Profiling */
-#define X86_FEATURE_FMA4        16 /* 4 operands MAC instructions */
-#define X86_FEATURE_NODEID_MSR  19 /* NodeId MSR */
-#define X86_FEATURE_TBM         21 /* trailing bit manipulations */
-#define X86_FEATURE_TOPOEXT     22 /* topology extensions CPUID leafs */
-#define X86_FEATURE_DBEXT       26 /* data breakpoint extension */
-
-/* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx) */
-#define X86_FEATURE_FSGSBASE     0 /* {RD,WR}{FS,GS}BASE instructions */
-#define X86_FEATURE_TSC_ADJUST   1 /* Tsc thread offset */
-#define X86_FEATURE_BMI1         3 /* 1st group bit manipulation extensions */
-#define X86_FEATURE_HLE          4 /* Hardware Lock Elision */
-#define X86_FEATURE_AVX2         5 /* AVX2 instructions */
-#define X86_FEATURE_SMEP         7 /* Supervisor Mode Execution Protection */
-#define X86_FEATURE_BMI2         8 /* 2nd group bit manipulation extensions */
-#define X86_FEATURE_ERMS         9 /* Enhanced REP MOVSB/STOSB */
-#define X86_FEATURE_INVPCID     10 /* Invalidate Process Context ID */
-#define X86_FEATURE_RTM         11 /* Restricted Transactional Memory */
-#define X86_FEATURE_MPX         14 /* Memory Protection Extensions */
-#define X86_FEATURE_RDSEED      18 /* RDSEED instruction */
-#define X86_FEATURE_ADX         19 /* ADCX, ADOX instructions */
-#define X86_FEATURE_SMAP        20 /* Supervisor Mode Access Protection */
-#define X86_FEATURE_PCOMMIT     22 /* PCOMMIT instruction */
-#define X86_FEATURE_CLFLUSHOPT  23 /* CLFLUSHOPT instruction */
-#define X86_FEATURE_CLWB        24 /* CLWB instruction */
-
-/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx) */
-#define X86_FEATURE_PKU     3
-
-#endif /* __LIBXC_CPUFEATURE_H */
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 5780397..d3674db 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -22,12 +22,16 @@
 #include <stdlib.h>
 #include <stdbool.h>
 #include "xc_private.h"
-#include "xc_cpufeature.h"
 #include <xen/hvm/params.h>
 
-#define bitmaskof(idx)      (1u << (idx))
-#define clear_bit(idx, dst) ((dst) &= ~(1u << (idx)))
-#define set_bit(idx, dst)   ((dst) |= (1u << (idx)))
+enum {
+#define XEN_CPUFEATURE(name, value) X86_FEATURE_##name = value,
+#include <xen/arch-x86/cpufeatureset.h>
+};
+
+#define bitmaskof(idx)      (1u << ((idx) & 31))
+#define clear_bit(idx, dst) ((dst) &= ~bitmaskof(idx))
+#define set_bit(idx, dst)   ((dst) |=  bitmaskof(idx))
 
 #define DEF_MAX_BASE 0x0000000du
 #define DEF_MAX_INTELEXT  0x80000008u
@@ -223,7 +227,6 @@ static void amd_xc_cpuid_policy(xc_interface *xch,
                     bitmaskof(X86_FEATURE_LM) |
                     bitmaskof(X86_FEATURE_PAGE1GB) |
                     bitmaskof(X86_FEATURE_SYSCALL) |
-                    bitmaskof(X86_FEATURE_MP) |
                     bitmaskof(X86_FEATURE_MMXEXT) |
                     bitmaskof(X86_FEATURE_FFXSR) |
                     bitmaskof(X86_FEATURE_3DNOW) |
@@ -280,7 +283,7 @@ static void intel_xc_cpuid_policy(xc_interface *xch,
     case 0x00000001:
         /* ECX[5] is availability of VMX */
         if ( info->nestedhvm )
-            set_bit(X86_FEATURE_VMXE, regs[2]);
+            set_bit(X86_FEATURE_VMX, regs[2]);
         break;
 
     case 0x00000004:
@@ -398,7 +401,7 @@ static void xc_cpuid_hvm_policy(xc_interface *xch,
          */
         regs[1] = (regs[1] & 0x0000ffffu) | ((regs[1] & 0x007f0000u) << 1);
 
-        regs[2] &= (bitmaskof(X86_FEATURE_XMM3) |
+        regs[2] &= (bitmaskof(X86_FEATURE_SSE3) |
                     bitmaskof(X86_FEATURE_PCLMULQDQ) |
                     bitmaskof(X86_FEATURE_SSSE3) |
                     bitmaskof(X86_FEATURE_FMA) |
@@ -408,7 +411,7 @@ static void xc_cpuid_hvm_policy(xc_interface *xch,
                     bitmaskof(X86_FEATURE_SSE4_2) |
                     bitmaskof(X86_FEATURE_MOVBE)  |
                     bitmaskof(X86_FEATURE_POPCNT) |
-                    bitmaskof(X86_FEATURE_AES) |
+                    bitmaskof(X86_FEATURE_AESNI) |
                     bitmaskof(X86_FEATURE_F16C) |
                     bitmaskof(X86_FEATURE_RDRAND) |
                     ((info->xfeature_mask != 0) ?
@@ -435,13 +438,13 @@ static void xc_cpuid_hvm_policy(xc_interface *xch,
                     bitmaskof(X86_FEATURE_MCA) |
                     bitmaskof(X86_FEATURE_CMOV) |
                     bitmaskof(X86_FEATURE_PAT) |
-                    bitmaskof(X86_FEATURE_CLFLSH) |
+                    bitmaskof(X86_FEATURE_CLFLUSH) |
                     bitmaskof(X86_FEATURE_PSE36) |
                     bitmaskof(X86_FEATURE_MMX) |
                     bitmaskof(X86_FEATURE_FXSR) |
-                    bitmaskof(X86_FEATURE_XMM) |
-                    bitmaskof(X86_FEATURE_XMM2) |
-                    bitmaskof(X86_FEATURE_HT));
+                    bitmaskof(X86_FEATURE_SSE) |
+                    bitmaskof(X86_FEATURE_SSE2) |
+                    bitmaskof(X86_FEATURE_HTT));
             
         /* We always support MTRR MSRs. */
         regs[3] |= bitmaskof(X86_FEATURE_MTRR);
@@ -560,15 +563,15 @@ static void xc_cpuid_pv_policy(xc_interface *xch,
         if ( info->vendor == VENDOR_AMD )
             clear_bit(X86_FEATURE_SEP, regs[3]);
         clear_bit(X86_FEATURE_DS, regs[3]);
-        clear_bit(X86_FEATURE_ACC, regs[3]);
+        clear_bit(X86_FEATURE_TM1, regs[3]);
         clear_bit(X86_FEATURE_PBE, regs[3]);
 
         clear_bit(X86_FEATURE_DTES64, regs[2]);
-        clear_bit(X86_FEATURE_MWAIT, regs[2]);
+        clear_bit(X86_FEATURE_MONITOR, regs[2]);
         clear_bit(X86_FEATURE_DSCPL, regs[2]);
-        clear_bit(X86_FEATURE_VMXE, regs[2]);
-        clear_bit(X86_FEATURE_SMXE, regs[2]);
-        clear_bit(X86_FEATURE_EST, regs[2]);
+        clear_bit(X86_FEATURE_VMX, regs[2]);
+        clear_bit(X86_FEATURE_SMX, regs[2]);
+        clear_bit(X86_FEATURE_EIST, regs[2]);
         clear_bit(X86_FEATURE_TM2, regs[2]);
         if ( !info->pv64 )
             clear_bit(X86_FEATURE_CX16, regs[2]);
-- 
2.1.4



^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v4 22/26] tools/libxc: Expose the automatically generated cpu featuremask information
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (20 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 21/26] tools/libxc: Use public/featureset.h for cpuid policy generation Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-28 20:08   ` Konrad Rzeszutek Wilk
  2016-03-23 16:36 ` [PATCH v4 23/26] tools: Utility for dealing with featuresets Andrew Cooper
                   ` (4 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Ian Jackson

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>

New in v2
---
 tools/libxc/Makefile          |  9 ++++++
 tools/libxc/include/xenctrl.h | 14 ++++++++
 tools/libxc/xc_cpuid_x86.c    | 75 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 98 insertions(+)

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 608404f..ef02c9d 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -145,6 +145,15 @@ $(eval $(genpath-target))
 
 xc_private.h: _paths.h
 
+ifeq ($(CONFIG_X86),y)
+
+_xc_cpuid_autogen.h: $(XEN_ROOT)/xen/include/public/arch-x86/cpufeatureset.h $(XEN_ROOT)/xen/tools/gen-cpuid.py
+	$(PYTHON) $(XEN_ROOT)/xen/tools/gen-cpuid.py -i $^ -o $@.new
+	$(call move-if-changed,$@.new,$@)
+
+build: _xc_cpuid_autogen.h
+endif
+
 $(CTRL_LIB_OBJS) $(GUEST_LIB_OBJS) \
 $(CTRL_PIC_OBJS) $(GUEST_PIC_OBJS): xc_private.h
 
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index c136aa8..66acbd1 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2533,6 +2533,20 @@ int xc_psr_cat_get_l3_info(xc_interface *xch, uint32_t socket,
 int xc_get_cpu_levelling_caps(xc_interface *xch, uint32_t *caps);
 int xc_get_cpu_featureset(xc_interface *xch, uint32_t index,
                           uint32_t *nr_features, uint32_t *featureset);
+
+uint32_t xc_get_cpu_featureset_size(void);
+
+enum xc_static_cpu_featuremask {
+    XC_FEATUREMASK_KNOWN,
+    XC_FEATUREMASK_SPECIAL,
+    XC_FEATUREMASK_PV,
+    XC_FEATUREMASK_HVM_SHADOW,
+    XC_FEATUREMASK_HVM_HAP,
+    XC_FEATUREMASK_DEEP_FEATURES,
+};
+const uint32_t *xc_get_static_cpu_featuremask(enum xc_static_cpu_featuremask);
+const uint32_t *xc_get_feature_deep_deps(uint32_t feature);
+
 #endif
 
 /* Compat shims */
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index d3674db..0cffb36 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -28,6 +28,7 @@ enum {
 #define XEN_CPUFEATURE(name, value) X86_FEATURE_##name = value,
 #include <xen/arch-x86/cpufeatureset.h>
 };
+#include "_xc_cpuid_autogen.h"
 
 #define bitmaskof(idx)      (1u << ((idx) & 31))
 #define clear_bit(idx, dst) ((dst) &= ~bitmaskof(idx))
@@ -78,6 +79,80 @@ int xc_get_cpu_featureset(xc_interface *xch, uint32_t index,
     return ret;
 }
 
+uint32_t xc_get_cpu_featureset_size(void)
+{
+    return FEATURESET_NR_ENTRIES;
+}
+
+const uint32_t *xc_get_static_cpu_featuremask(
+    enum xc_static_cpu_featuremask mask)
+{
+    const static uint32_t known[FEATURESET_NR_ENTRIES] = INIT_KNOWN_FEATURES,
+        special[FEATURESET_NR_ENTRIES] = INIT_SPECIAL_FEATURES,
+        pv[FEATURESET_NR_ENTRIES] = INIT_PV_FEATURES,
+        hvm_shadow[FEATURESET_NR_ENTRIES] = INIT_HVM_SHADOW_FEATURES,
+        hvm_hap[FEATURESET_NR_ENTRIES] = INIT_HVM_HAP_FEATURES,
+        deep_features[FEATURESET_NR_ENTRIES] = INIT_DEEP_FEATURES;
+
+    XC_BUILD_BUG_ON(ARRAY_SIZE(known) != FEATURESET_NR_ENTRIES);
+    XC_BUILD_BUG_ON(ARRAY_SIZE(special) != FEATURESET_NR_ENTRIES);
+    XC_BUILD_BUG_ON(ARRAY_SIZE(pv) != FEATURESET_NR_ENTRIES);
+    XC_BUILD_BUG_ON(ARRAY_SIZE(hvm_shadow) != FEATURESET_NR_ENTRIES);
+    XC_BUILD_BUG_ON(ARRAY_SIZE(hvm_hap) != FEATURESET_NR_ENTRIES);
+    XC_BUILD_BUG_ON(ARRAY_SIZE(deep_features) != FEATURESET_NR_ENTRIES);
+
+    switch ( mask )
+    {
+    case XC_FEATUREMASK_KNOWN:
+        return known;
+
+    case XC_FEATUREMASK_SPECIAL:
+        return special;
+
+    case XC_FEATUREMASK_PV:
+        return pv;
+
+    case XC_FEATUREMASK_HVM_SHADOW:
+        return hvm_shadow;
+
+    case XC_FEATUREMASK_HVM_HAP:
+        return hvm_hap;
+
+    case XC_FEATUREMASK_DEEP_FEATURES:
+        return deep_features;
+
+    default:
+        return NULL;
+    }
+}
+
+const uint32_t *xc_get_feature_deep_deps(uint32_t feature)
+{
+    static const struct {
+        uint32_t feature;
+        uint32_t fs[FEATURESET_NR_ENTRIES];
+    } deep_deps[] = INIT_DEEP_DEPS;
+
+    unsigned int start = 0, end = ARRAY_SIZE(deep_deps);
+
+    XC_BUILD_BUG_ON(ARRAY_SIZE(deep_deps) != NR_DEEP_DEPS);
+
+    /* deep_deps[] is sorted.  Perform a binary search. */
+    while ( start < end )
+    {
+        unsigned int mid = start + ((end - start) / 2);
+
+        if ( deep_deps[mid].feature > feature )
+            end = mid;
+        else if ( deep_deps[mid].feature < feature )
+            start = mid + 1;
+        else
+            return deep_deps[mid].fs;
+    }
+
+    return NULL;
+}
+
 struct cpuid_domain_info
 {
     enum
-- 
2.1.4
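
The lookup in xc_get_feature_deep_deps() above depends on its table being
sorted by feature.  The following standalone sketch shows the same half-open
binary search over a small table.  The demo_* names, DEMO_NR_ENTRIES and the
table contents are hypothetical and are not taken from the generated header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical featureset length, standing in for FEATURESET_NR_ENTRIES. */
#define DEMO_NR_ENTRIES 2

struct demo_dep {
    uint32_t feature;
    uint32_t fs[DEMO_NR_ENTRIES];
};

/* Hypothetical table.  It must be sorted by ascending 'feature'. */
static const struct demo_dep demo_deps[] = {
    {  3, { 0x00000010u, 0x00000000u } },
    { 26, { 0x10000000u, 0x00000001u } },
    { 40, { 0x00000000u, 0x00000300u } },
};

/*
 * Binary search over the half-open range [start, end).
 * Returns the dependency mask for 'feature', or NULL if it has no entry.
 */
static const uint32_t *demo_lookup(uint32_t feature)
{
    size_t start = 0, end = sizeof(demo_deps) / sizeof(demo_deps[0]);

    while (start < end) {
        size_t mid = start + (end - start) / 2;

        if (demo_deps[mid].feature > feature)
            end = mid;
        else if (demo_deps[mid].feature < feature)
            start = mid + 1;
        else
            return demo_deps[mid].fs;
    }

    return NULL;
}
```

Computing the midpoint as start + (end - start) / 2 avoids overflow in
start + end, and keeping 'end' exclusive means the loop terminates cleanly
when the feature is absent.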



^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v4 23/26] tools: Utility for dealing with featuresets
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (21 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 22/26] tools/libxc: Expose the automatically generated cpu featuremask information Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-28 20:26   ` Konrad Rzeszutek Wilk
  2016-03-23 16:36 ` [PATCH v4 24/26] tools/libxc: Wire a featureset through to cpuid policy logic Andrew Cooper
                   ` (3 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Ian Jackson

It is able to report the current featuresets, both the static masks and the
dynamic featuresets from Xen, or to decode an arbitrary featureset into
`/proc/cpuinfo` style strings.
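
As a rough sketch of the decoding step, the following standalone code turns
one 32-bit featureset word into a space-separated list of names.  It is not
the code from xen-cpuid.c: demo_names[] copies only the first ten entries of
str_1d[], and printing unnamed bits as "<bit>" is a choice made for this
sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* First ten names of the leaf 1 EDX word; all other entries are NULL. */
static const char *const demo_names[32] = {
    [0] = "fpu", [1] = "vme", [2] = "de",  [3] = "pse", [4] = "tsc",
    [5] = "msr", [6] = "pae", [7] = "mce", [8] = "cx8", [9] = "apic",
};

/*
 * Write the names of the bits set in 'word' into 'buf', separated by single
 * spaces.  Bits without a name are written as "<bit>".  Output is silently
 * truncated if 'buf' is too small.  Returns 'buf'.
 */
static char *demo_decode_word(uint32_t word, char *buf, size_t len)
{
    size_t used = 0;
    unsigned int bit;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';

    for (bit = 0; bit < 32 && used < len; ++bit) {
        const char *sep = used ? " " : "";
        int n;

        if (!(word & (1u << bit)))
            continue;

        if (demo_names[bit])
            n = snprintf(buf + used, len - used, "%s%s", sep, demo_names[bit]);
        else
            n = snprintf(buf + used, len - used, "%s<%u>", sep, bit);

        if (n < 0)
            break;
        used += (size_t)n;
    }

    return buf;
}
```

For example, a word of 0x5 has bits 0 and 2 set and decodes to "fpu de".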

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>

v2: No linking hackery
---
 .gitignore             |   1 +
 tools/misc/Makefile    |   4 +
 tools/misc/xen-cpuid.c | 394 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 399 insertions(+)
 create mode 100644 tools/misc/xen-cpuid.c

diff --git a/.gitignore b/.gitignore
index b40453e..20ffa2d 100644
--- a/.gitignore
+++ b/.gitignore
@@ -179,6 +179,7 @@ tools/misc/cpuperf/cpuperf-perfcntr
 tools/misc/cpuperf/cpuperf-xen
 tools/misc/xc_shadow
 tools/misc/xen_cpuperf
+tools/misc/xen-cpuid
 tools/misc/xen-detect
 tools/misc/xen-tmem-list-parse
 tools/misc/xenperf
diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index a2ef0ec..a94dad9 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -10,6 +10,7 @@ CFLAGS += $(CFLAGS_xeninclude)
 CFLAGS += $(CFLAGS_libxenstore)
 
 # Everything to be installed in regular bin/
+INSTALL_BIN-$(CONFIG_X86)      += xen-cpuid
 INSTALL_BIN-$(CONFIG_X86)      += xen-detect
 INSTALL_BIN                    += xencons
 INSTALL_BIN                    += xencov_split
@@ -68,6 +69,9 @@ clean:
 .PHONY: distclean
 distclean: clean
 
+xen-cpuid: xen-cpuid.o
+	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) $(APPEND_LDFLAGS)
+
 xen-hvmctx: xen-hvmctx.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
diff --git a/tools/misc/xen-cpuid.c b/tools/misc/xen-cpuid.c
new file mode 100644
index 0000000..608c488
--- /dev/null
+++ b/tools/misc/xen-cpuid.c
@@ -0,0 +1,394 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <err.h>
+#include <getopt.h>
+#include <string.h>
+
+#include <xenctrl.h>
+
+#define ARRAY_SIZE(a) (sizeof a / sizeof *a)
+static uint32_t nr_features;
+
+static const char *str_1d[32] =
+{
+    [ 0] = "fpu",  [ 1] = "vme",
+    [ 2] = "de",   [ 3] = "pse",
+    [ 4] = "tsc",  [ 5] = "msr",
+    [ 6] = "pae",  [ 7] = "mce",
+    [ 8] = "cx8",  [ 9] = "apic",
+    [10] = "REZ",  [11] = "sysenter",
+    [12] = "mtrr", [13] = "pge",
+    [14] = "mca",  [15] = "cmov",
+    [16] = "pat",  [17] = "pse36",
+    [18] = "psn",  [19] = "clflush",
+    [20] = "REZ",  [21] = "ds",
+    [22] = "acpi", [23] = "mmx",
+    [24] = "fxsr", [25] = "sse",
+    [26] = "sse2", [27] = "ss",
+    [28] = "htt",  [29] = "tm",
+    [30] = "ia64", [31] = "pbe",
+};
+
+static const char *str_1c[32] =
+{
+    [ 0] = "sse3",    [ 1] = "pclmulqdq",
+    [ 2] = "dtes64",  [ 3] = "monitor",
+    [ 4] = "ds-cpl",  [ 5] = "vmx",
+    [ 6] = "smx",     [ 7] = "est",
+    [ 8] = "tm2",     [ 9] = "ssse3",
+    [10] = "cntx-id", [11] = "sdbg",
+    [12] = "fma",     [13] = "cx16",
+    [14] = "xtpr",    [15] = "pdcm",
+    [16] = "REZ",     [17] = "pcid",
+    [18] = "dca",     [19] = "sse41",
+    [20] = "sse42",   [21] = "x2apic",
+    [22] = "movbe",   [23] = "popcnt",
+    [24] = "tsc-dl",  [25] = "aesni",
+    [26] = "xsave",   [27] = "osxsave",
+    [28] = "avx",     [29] = "f16c",
+    [30] = "rdrnd",   [31] = "hyper",
+};
+
+static const char *str_e1d[32] =
+{
+    [ 0] = "fpu",    [ 1] = "vme",
+    [ 2] = "de",     [ 3] = "pse",
+    [ 4] = "tsc",    [ 5] = "msr",
+    [ 6] = "pae",    [ 7] = "mce",
+    [ 8] = "cx8",    [ 9] = "apic",
+    [10] = "REZ",    [11] = "syscall",
+    [12] = "mtrr",   [13] = "pge",
+    [14] = "mca",    [15] = "cmov",
+    [16] = "fcmov",  [17] = "pse36",
+    [18] = "REZ",    [19] = "mp",
+    [20] = "nx",     [21] = "REZ",
+    [22] = "mmx+",   [23] = "mmx",
+    [24] = "fxsr",   [25] = "fxsr+",
+    [26] = "pg1g",   [27] = "rdtscp",
+    [28] = "REZ",    [29] = "lm",
+    [30] = "3dnow+", [31] = "3dnow",
+};
+
+static const char *str_e1c[32] =
+{
+    [ 0] = "lahf_lm",    [ 1] = "cmp",
+    [ 2] = "svm",        [ 3] = "extapic",
+    [ 4] = "cr8d",       [ 5] = "lzcnt",
+    [ 6] = "sse4a",      [ 7] = "msse",
+    [ 8] = "3dnowpf",    [ 9] = "osvw",
+    [10] = "ibs",        [11] = "xop",
+    [12] = "skinit",     [13] = "wdt",
+    [14] = "REZ",        [15] = "lwp",
+    [16] = "fma4",       [17] = "tce",
+    [18] = "REZ",        [19] = "nodeid",
+    [20] = "REZ",        [21] = "tbm",
+    [22] = "topoext",    [23] = "perfctr_core",
+    [24] = "perfctr_nb", [25] = "REZ",
+    [26] = "dbx",        [27] = "perftsc",
+    [28] = "pcx_l2i",    [29] = "monitorx",
+
+    [30 ... 31] = "REZ",
+};
+
+static const char *str_7b0[32] =
+{
+    [ 0] = "fsgsbase", [ 1] = "tsc-adj",
+    [ 2] = "sgx",      [ 3] = "bmi1",
+    [ 4] = "hle",      [ 5] = "avx2",
+    [ 6] = "REZ",      [ 7] = "smep",
+    [ 8] = "bmi2",     [ 9] = "erms",
+    [10] = "invpcid",  [11] = "rtm",
+    [12] = "pqm",      [13] = "depfpp",
+    [14] = "mpx",      [15] = "pqe",
+    [16] = "avx512f",  [17] = "avx512dq",
+    [18] = "rdseed",   [19] = "adx",
+    [20] = "smap",     [21] = "avx512ifma",
+    [22] = "pcomit",   [23] = "clflushopt",
+    [24] = "clwb",     [25] = "pt",
+    [26] = "avx512pf", [27] = "avx512er",
+    [28] = "avx512cd", [29] = "sha",
+    [30] = "avx512bw", [31] = "avx512vl",
+};
+
+static const char *str_Da1[32] =
+{
+    [ 0] = "xsaveopt", [ 1] = "xsavec",
+    [ 2] = "xgetbv1",  [ 3] = "xsaves",
+
+    [4 ... 31] = "REZ",
+};
+
+static const char *str_7c0[32] =
+{
+    [ 0] = "prechwt1", [ 1] = "avx512vbmi",
+    [ 2] = "REZ",      [ 3] = "pku",
+    [ 4] = "ospke",
+
+    [5 ... 31] = "REZ",
+};
+
+static const char *str_e7d[32] =
+{
+    [0 ... 7] = "REZ",
+
+    [ 8] = "itsc",
+
+    [9 ... 31] = "REZ",
+};
+
+static const char *str_e8b[32] =
+{
+    [ 0] = "clzero",
+
+    [1 ... 31] = "REZ",
+};
+
+static struct {
+    const char *name;
+    const char *abbr;
+    const char **strs;
+} decodes[] =
+{
+    { "0x00000001.edx",   "1d",  str_1d },
+    { "0x00000001.ecx",   "1c",  str_1c },
+    { "0x80000001.edx",   "e1d", str_e1d },
+    { "0x80000001.ecx",   "e1c", str_e1c },
+    { "0x0000000d:1.eax", "Da1", str_Da1 },
+    { "0x00000007:0.ebx", "7b0", str_7b0 },
+    { "0x00000007:0.ecx", "7c0", str_7c0 },
+    { "0x80000007.edx",   "e7d", str_e7d },
+    { "0x80000008.ebx",   "e8b", str_e8b },
+};
+
+#define COL_ALIGN "18"
+
+static struct fsinfo {
+    const char *name;
+    uint32_t len;
+    uint32_t *fs;
+} featuresets[] =
+{
+    [XEN_SYSCTL_cpu_featureset_host] = { "Host", 0, NULL },
+    [XEN_SYSCTL_cpu_featureset_raw]  = { "Raw",  0, NULL },
+    [XEN_SYSCTL_cpu_featureset_pv]   = { "PV",   0, NULL },
+    [XEN_SYSCTL_cpu_featureset_hvm]  = { "HVM",  0, NULL },
+};
+
+static void dump_leaf(uint32_t leaf, const char **strs)
+{
+    unsigned i;
+
+    if ( !strs )
+    {
+        printf(" ???");
+        return;
+    }
+
+    for ( i = 0; i < 32; ++i )
+        if ( leaf & (1u << i) )
+            printf(" %s", strs[i] ?: "???" );
+}
+
+static void decode_featureset(const uint32_t *features,
+                              const uint32_t length,
+                              const char *name,
+                              bool detail)
+{
+    unsigned int i;
+
+    printf("%-"COL_ALIGN"s        ", name);
+    for ( i = 0; i < length; ++i )
+        printf("%08x%c", features[i],
+               i < length - 1 ? ':' : '\n');
+
+    if ( !detail )
+        return;
+
+    for ( i = 0; i < length && i < ARRAY_SIZE(decodes); ++i )
+    {
+        printf("  [%02u] %-"COL_ALIGN"s", i, decodes[i].name ?: "<UNKNOWN>");
+        if ( decodes[i].name )
+            dump_leaf(features[i], decodes[i].strs);
+        printf("\n");
+    }
+}
+
+static void get_featureset(xc_interface *xch, unsigned int idx)
+{
+    struct fsinfo *f = &featuresets[idx];
+
+    f->len = xc_get_cpu_featureset_size();
+    f->fs = calloc(nr_features, sizeof(*f->fs));
+
+    if ( !f->fs )
+        err(1, "calloc(, featureset)");
+
+    if ( xc_get_cpu_featureset(xch, idx, &f->len, f->fs) )
+        err(1, "xc_get_featureset()");
+}
+
+static void dump_info(xc_interface *xch, bool detail)
+{
+    unsigned int i;
+
+    printf("nr_features: %u\n", nr_features);
+
+    if ( !detail )
+    {
+        printf("       %"COL_ALIGN"s ", "KEY");
+        for ( i = 0; i < ARRAY_SIZE(decodes); ++i )
+            printf("%-8s ", decodes[i].abbr ?: "???");
+        printf("\n");
+    }
+
+    printf("\nStatic sets:\n");
+    decode_featureset(xc_get_static_cpu_featuremask(XC_FEATUREMASK_KNOWN),
+                      nr_features, "Known", detail);
+    decode_featureset(xc_get_static_cpu_featuremask(XC_FEATUREMASK_SPECIAL),
+                      nr_features, "Special", detail);
+    decode_featureset(xc_get_static_cpu_featuremask(XC_FEATUREMASK_PV),
+                      nr_features, "PV Mask", detail);
+    decode_featureset(xc_get_static_cpu_featuremask(XC_FEATUREMASK_HVM_SHADOW),
+                      nr_features, "HVM Shadow Mask", detail);
+    decode_featureset(xc_get_static_cpu_featuremask(XC_FEATUREMASK_HVM_HAP),
+                      nr_features, "HVM Hap Mask", detail);
+
+    printf("\nDynamic sets:\n");
+    for ( i = 0; i < ARRAY_SIZE(featuresets); ++i )
+    {
+        get_featureset(xch, i);
+
+        decode_featureset(featuresets[i].fs, featuresets[i].len,
+                          featuresets[i].name, detail);
+    }
+
+    for ( i = 0; i < ARRAY_SIZE(featuresets); ++i )
+        free(featuresets[i].fs);
+}
+
+int main(int argc, char **argv)
+{
+    enum { MODE_UNKNOWN, MODE_INFO, MODE_DETAIL, MODE_INTERPRET }
+    mode = MODE_UNKNOWN;
+
+    nr_features = xc_get_cpu_featureset_size();
+
+    for ( ;; )
+    {
+        int option_index = 0, c;
+        static struct option long_options[] =
+        {
+            { "help", no_argument, NULL, 'h' },
+            { "info", no_argument, NULL, 'i' },
+            { "detail", no_argument, NULL, 'd' },
+            { "verbose", no_argument, NULL, 'v' },
+            { NULL, 0, NULL, 0 },
+        };
+
+        c = getopt_long(argc, argv, "hidv", long_options, &option_index);
+
+        if ( c == -1 )
+            break;
+
+        switch ( c )
+        {
+        case 'h':
+ option_error:
+            printf("Usage: %s [ info | detail | <featureset>* ]\n", argv[0]);
+            return 0;
+
+        case 'i':
+            mode = MODE_INFO;
+            break;
+
+        case 'd':
+        case 'v':
+            mode = MODE_DETAIL;
+            break;
+
+        default:
+            printf("Bad option '%c'\n", c);
+            goto option_error;
+        }
+    }
+
+    if ( mode == MODE_UNKNOWN )
+    {
+        if ( optind == argc )
+            mode = MODE_INFO;
+        else if ( optind < argc )
+        {
+            if ( !strcmp(argv[optind], "info") )
+            {
+                mode = MODE_INFO;
+                optind++;
+            }
+            else if ( !strcmp(argv[optind], "detail") )
+            {
+                mode = MODE_DETAIL;
+                optind++;
+            }
+            else
+                mode = MODE_INTERPRET;
+        }
+        else
+            mode = MODE_INTERPRET;
+    }
+
+    if ( mode == MODE_INFO || mode == MODE_DETAIL )
+    {
+        xc_interface *xch = xc_interface_open(0, 0, 0);
+
+        if ( !xch )
+            err(1, "xc_interface_open");
+
+        if ( xc_get_cpu_featureset(xch, 0, &nr_features, NULL) )
+            err(1, "xc_get_featureset(, NULL)");
+
+        dump_info(xch, mode == MODE_DETAIL);
+
+        xc_interface_close(xch);
+    }
+    else
+    {
+        uint32_t fs[nr_features + 1];
+
+        while ( optind < argc )
+        {
+            char *ptr = argv[optind++];
+            unsigned int i = 0;
+            int offset;
+
+            memset(fs, 0, sizeof(fs));
+
+            while ( sscanf(ptr, "%x%n", &fs[i], &offset) == 1 )
+            {
+                i++;
+                ptr += offset;
+
+                if ( i == nr_features )
+                    break;
+
+                if ( *ptr == ':' )
+                {
+                    ptr++; continue;
+                }
+                break;
+            }
+
+            decode_featureset(fs, i, "Raw", true);
+        }
+    }
+
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.1.4



^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v4 24/26] tools/libxc: Wire a featureset through to cpuid policy logic
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (22 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 23/26] tools: Utility for dealing with featuresets Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-28 20:39   ` Konrad Rzeszutek Wilk
  2016-03-23 16:36 ` [PATCH v4 25/26] tools/libxc: Use featuresets rather than guesswork Andrew Cooper
                   ` (2 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Ian Jackson

Later changes will cause the cpuid generation logic to seed its information
from a featureset.  This patch adds the infrastructure to specify a
featureset, and obtains the appropriate default from Xen if one is omitted.
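
The intended handling of a caller-supplied featureset whose length differs
from the host's can be sketched as below.  This is a minimal illustration
under stated assumptions, not the libxc code: copy_featureset() is a
hypothetical helper, with dst_len standing in for the host featureset size.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy a caller-supplied featureset of src_len words into dst, which holds
 * dst_len words.  Shorter input is zero-extended.  Longer input is accepted
 * only if the words which do not fit are all zero, so that no set feature
 * bit is silently dropped.  (Hypothetical helper, for illustration only.)
 */
static int copy_featureset(uint32_t *dst, unsigned int dst_len,
                           const uint32_t *src, unsigned int src_len)
{
    unsigned int i, common = src_len < dst_len ? src_len : dst_len;

    memset(dst, 0, dst_len * sizeof(*dst));
    memcpy(dst, src, common * sizeof(*dst));

    /* Check for truncated set bits. */
    for ( i = common; i < src_len; ++i )
        if ( src[i] != 0 )
            return -EOPNOTSUPP;

    return 0;
}
```

Only the words the caller actually passed are ever read, whichever of the two
lengths is larger.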

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>

v2:
 * Modify existing call rather than introducing a new one.
 * Fix up in-tree callsites.
---
 tools/libxc/include/xenctrl.h       |  4 ++-
 tools/libxc/xc_cpuid_x86.c          | 69 ++++++++++++++++++++++++++++++++-----
 tools/libxl/libxl_cpuid.c           |  2 +-
 tools/ocaml/libs/xc/xenctrl_stubs.c |  2 +-
 tools/python/xen/lowlevel/xc/xc.c   |  2 +-
 5 files changed, 66 insertions(+), 13 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 66acbd1..872fd08 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1896,7 +1896,9 @@ int xc_cpuid_set(xc_interface *xch,
                  const char **config,
                  char **config_transformed);
 int xc_cpuid_apply_policy(xc_interface *xch,
-                          domid_t domid);
+                          domid_t domid,
+                          uint32_t *featureset,
+                          unsigned int nr_features);
 void xc_cpuid_to_str(const unsigned int *regs,
                      char **strs); /* some strs[] may be NULL if ENOMEM */
 int xc_mca_op(xc_interface *xch, struct xen_mc *mc);
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 0cffb36..a92f5e4 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -166,6 +166,9 @@ struct cpuid_domain_info
     bool pvh;
     uint64_t xfeature_mask;
 
+    uint32_t *featureset;
+    unsigned int nr_features;
+
     /* PV-only information. */
     bool pv64;
 
@@ -197,11 +200,14 @@ static void cpuid(const unsigned int *input, unsigned int *regs)
 }
 
 static int get_cpuid_domain_info(xc_interface *xch, domid_t domid,
-                                 struct cpuid_domain_info *info)
+                                 struct cpuid_domain_info *info,
+                                 uint32_t *featureset,
+                                 unsigned int nr_features)
 {
     struct xen_domctl domctl = {};
     xc_dominfo_t di;
     unsigned int in[2] = { 0, ~0U }, regs[4];
+    unsigned int i, host_nr_features = xc_get_cpu_featureset_size();
     int rc;
 
     cpuid(in, regs);
@@ -223,6 +229,23 @@ static int get_cpuid_domain_info(xc_interface *xch, domid_t domid,
     info->hvm = di.hvm;
     info->pvh = di.pvh;
 
+    info->featureset = calloc(host_nr_features, sizeof(*info->featureset));
+    if ( !info->featureset )
+        return -ENOMEM;
+
+    info->nr_features = host_nr_features;
+
+    if ( featureset )
+    {
+        memcpy(info->featureset, featureset,
+               min(host_nr_features, nr_features) * sizeof(*info->featureset));
+
+        /* Check for truncated set bits. */
+        for ( i = host_nr_features; i < nr_features; ++i )
+            if ( featureset[i] != 0 )
+                return -EOPNOTSUPP;
+    }
+
     /* Get xstate information. */
     domctl.cmd = XEN_DOMCTL_getvcpuextstate;
     domctl.domain = domid;
@@ -247,6 +270,14 @@ static int get_cpuid_domain_info(xc_interface *xch, domid_t domid,
             return rc;
 
         info->nestedhvm = !!val;
+
+        if ( !featureset )
+        {
+            rc = xc_get_cpu_featureset(xch, XEN_SYSCTL_cpu_featureset_hvm,
+                                       &host_nr_features, info->featureset);
+            if ( rc )
+                return rc;
+        }
     }
     else
     {
@@ -257,11 +288,24 @@ static int get_cpuid_domain_info(xc_interface *xch, domid_t domid,
             return rc;
 
         info->pv64 = (width == 8);
+
+        if ( !featureset )
+        {
+            rc = xc_get_cpu_featureset(xch, XEN_SYSCTL_cpu_featureset_pv,
+                                       &host_nr_features, info->featureset);
+            if ( rc )
+                return rc;
+        }
     }
 
     return 0;
 }
 
+static void free_cpuid_domain_info(struct cpuid_domain_info *info)
+{
+    free(info->featureset);
+}
+
 static void amd_xc_cpuid_policy(xc_interface *xch,
                                 const struct cpuid_domain_info *info,
                                 const unsigned int *input, unsigned int *regs)
@@ -789,16 +833,18 @@ void xc_cpuid_to_str(const unsigned int *regs, char **strs)
     }
 }
 
-int xc_cpuid_apply_policy(xc_interface *xch, domid_t domid)
+int xc_cpuid_apply_policy(xc_interface *xch, domid_t domid,
+                          uint32_t *featureset,
+                          unsigned int nr_features)
 {
     struct cpuid_domain_info info = {};
     unsigned int input[2] = { 0, 0 }, regs[4];
     unsigned int base_max, ext_max;
     int rc;
 
-    rc = get_cpuid_domain_info(xch, domid, &info);
+    rc = get_cpuid_domain_info(xch, domid, &info, featureset, nr_features);
     if ( rc )
-        return rc;
+        goto out;
 
     cpuid(input, regs);
     base_max = (regs[0] <= DEF_MAX_BASE) ? regs[0] : DEF_MAX_BASE;
@@ -821,7 +867,7 @@ int xc_cpuid_apply_policy(xc_interface *xch, domid_t domid)
         {
             rc = xc_cpuid_do_domctl(xch, domid, input, regs);
             if ( rc )
-                return rc;
+                goto out;
         }
 
         /* Intel cache descriptor leaves. */
@@ -849,7 +895,9 @@ int xc_cpuid_apply_policy(xc_interface *xch, domid_t domid)
             break;
     }
 
-    return 0;
+ out:
+    free_cpuid_domain_info(&info);
+    return rc;
 }
 
 /*
@@ -938,9 +986,9 @@ int xc_cpuid_set(
 
     memset(config_transformed, 0, 4 * sizeof(*config_transformed));
 
-    rc = get_cpuid_domain_info(xch, domid, &info);
+    rc = get_cpuid_domain_info(xch, domid, &info, NULL, 0);
     if ( rc )
-        return rc;
+        goto out;
 
     cpuid(input, regs);
 
@@ -991,7 +1039,7 @@ int xc_cpuid_set(
 
     rc = xc_cpuid_do_domctl(xch, domid, input, regs);
     if ( rc == 0 )
-        return 0;
+        goto out;
 
  fail:
     for ( i = 0; i < 4; i++ )
@@ -999,5 +1047,8 @@ int xc_cpuid_set(
         free(config_transformed[i]);
         config_transformed[i] = NULL;
     }
+
+ out:
+    free_cpuid_domain_info(&info);
     return rc;
 }
diff --git a/tools/libxl/libxl_cpuid.c b/tools/libxl/libxl_cpuid.c
index c66e912..fc20157 100644
--- a/tools/libxl/libxl_cpuid.c
+++ b/tools/libxl/libxl_cpuid.c
@@ -334,7 +334,7 @@ int libxl_cpuid_parse_config_xend(libxl_cpuid_policy_list *cpuid,
 
 void libxl_cpuid_apply_policy(libxl_ctx *ctx, uint32_t domid)
 {
-    xc_cpuid_apply_policy(ctx->xch, domid);
+    xc_cpuid_apply_policy(ctx->xch, domid, NULL, 0);
 }
 
 void libxl_cpuid_set(libxl_ctx *ctx, uint32_t domid,
diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
index e7adf37..5477df3 100644
--- a/tools/ocaml/libs/xc/xenctrl_stubs.c
+++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
@@ -796,7 +796,7 @@ CAMLprim value stub_xc_domain_cpuid_apply_policy(value xch, value domid)
 #if defined(__i386__) || defined(__x86_64__)
 	int r;
 
-	r = xc_cpuid_apply_policy(_H(xch), _D(domid));
+	r = xc_cpuid_apply_policy(_H(xch), _D(domid), NULL, 0);
 	if (r < 0)
 		failwith_xc(_H(xch));
 #else
diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
index c40a4e9..22a1c9f 100644
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -731,7 +731,7 @@ static PyObject *pyxc_dom_set_policy_cpuid(XcObject *self,
     if ( !PyArg_ParseTuple(args, "i", &domid) )
         return NULL;
 
-    if ( xc_cpuid_apply_policy(self->xc_handle, domid) )
+    if ( xc_cpuid_apply_policy(self->xc_handle, domid, NULL, 0) )
         return pyxc_error_to_exception(self->xc_handle);
 
     Py_INCREF(zero);
-- 
2.1.4



^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v4 25/26] tools/libxc: Use featuresets rather than guesswork
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (23 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 24/26] tools/libxc: Wire a featureset through to cpuid policy logic Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-23 16:36 ` [PATCH v4 26/26] tools/libxc: Calculate xstate cpuid leaf from guest information Andrew Cooper
  2016-03-24 10:27 ` [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Jan Beulich
  26 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Ian Jackson

It is conceptually wrong to base a VM's featureset on the features visible to
the toolstack which happens to construct it.

Instead, the featureset used is either an explicit one passed by the
toolstack, or the default which Xen believes it can give to the guest.

Collect all the feature manipulation into a single function which adjusts the
featureset, and perform deep dependency removal.
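
As a rough sketch of what deep dependency removal means: when a feature is
absent, every feature which depends on it, directly or indirectly, is cleared
as well.  The bitmaskof() and featureword_of() macros match the ones in this
patch.  Everything else below (the FEAT_* indices, NR_WORDS, the table
contents and sanitise_featureset()) is hypothetical and only illustrates the
technique.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_WORDS 2u
#define bitmaskof(idx)      (1u << ((idx) & 31))
#define featureword_of(idx) ((idx) >> 5)

/* Hypothetical feature indices, encoded as word * 32 + bit. */
enum {
    FEAT_A = 0 * 32 + 3,
    FEAT_B = 1 * 32 + 5,    /* Depends on FEAT_A. */
    FEAT_C = 1 * 32 + 9,    /* Depends on FEAT_B. */
};

static bool test_feature(unsigned int idx, const uint32_t *fs)
{
    return fs[featureword_of(idx)] & bitmaskof(idx);
}

/*
 * Hypothetical table.  Each entry lists every feature which depends on
 * 'feature', with indirect dependents already folded in.
 */
static const struct {
    unsigned int feature;
    uint32_t deps[NR_WORDS];
} deep_deps[] = {
    { FEAT_A, { 0, bitmaskof(FEAT_B) | bitmaskof(FEAT_C) } },
    { FEAT_B, { 0, bitmaskof(FEAT_C) } },
};

/* Clear all dependents of any feature which is missing from fs[]. */
static void sanitise_featureset(uint32_t *fs)
{
    unsigned int i, w;

    for ( i = 0; i < sizeof(deep_deps) / sizeof(*deep_deps); ++i )
    {
        if ( test_feature(deep_deps[i].feature, fs) )
            continue;

        for ( w = 0; w < NR_WORDS; ++w )
            fs[w] &= ~deep_deps[i].deps[w];
    }
}
```

Because each entry already contains the indirect dependents, a single pass
over the table is enough and the order of the entries does not matter.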

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>

v2:
 * Join several related patches together.
v3:
 * Correctly adjust HTT/CMP_LEGACY in the policy.  PV guests see host details,
   so get the host features.  HVM guests have their vcpu topology presented in
   an HTT compatible manner (even if it ends up reporting 1 cpu), so have
   CMP_LEGACY unconditionally cleared.
---
 tools/libxc/xc_cpuid_x86.c | 356 +++++++++++++++++----------------------------
 1 file changed, 137 insertions(+), 219 deletions(-)

diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index a92f5e4..fc7e20a 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -21,7 +21,9 @@
 
 #include <stdlib.h>
 #include <stdbool.h>
+#include <limits.h>
 #include "xc_private.h"
+#include "xc_bitops.h"
 #include <xen/hvm/params.h>
 
 enum {
@@ -31,12 +33,14 @@ enum {
 #include "_xc_cpuid_autogen.h"
 
 #define bitmaskof(idx)      (1u << ((idx) & 31))
-#define clear_bit(idx, dst) ((dst) &= ~bitmaskof(idx))
-#define set_bit(idx, dst)   ((dst) |=  bitmaskof(idx))
+#define featureword_of(idx) ((idx) >> 5)
+#define clear_feature(idx, dst) ((dst) &= ~bitmaskof(idx))
+#define set_feature(idx, dst)   ((dst) |=  bitmaskof(idx))
 
 #define DEF_MAX_BASE 0x0000000du
 #define DEF_MAX_INTELEXT  0x80000008u
 #define DEF_MAX_AMDEXT    0x8000001cu
+#define COMMON_1D CPUID_COMMON_1D_FEATURES
 
 int xc_get_cpu_levelling_caps(xc_interface *xch, uint32_t *caps)
 {
@@ -322,37 +326,6 @@ static void amd_xc_cpuid_policy(xc_interface *xch,
             regs[0] = DEF_MAX_AMDEXT;
         break;
 
-    case 0x80000001: {
-        if ( !info->pae )
-            clear_bit(X86_FEATURE_PAE, regs[3]);
-
-        /* Filter all other features according to a whitelist. */
-        regs[2] &= (bitmaskof(X86_FEATURE_LAHF_LM) |
-                    bitmaskof(X86_FEATURE_CMP_LEGACY) |
-                    (info->nestedhvm ? bitmaskof(X86_FEATURE_SVM) : 0) |
-                    bitmaskof(X86_FEATURE_CR8_LEGACY) |
-                    bitmaskof(X86_FEATURE_ABM) |
-                    bitmaskof(X86_FEATURE_SSE4A) |
-                    bitmaskof(X86_FEATURE_MISALIGNSSE) |
-                    bitmaskof(X86_FEATURE_3DNOWPREFETCH) |
-                    bitmaskof(X86_FEATURE_OSVW) |
-                    bitmaskof(X86_FEATURE_XOP) |
-                    bitmaskof(X86_FEATURE_LWP) |
-                    bitmaskof(X86_FEATURE_FMA4) |
-                    bitmaskof(X86_FEATURE_TBM) |
-                    bitmaskof(X86_FEATURE_DBEXT));
-        regs[3] &= (0x0183f3ff | /* features shared with 0x00000001:EDX */
-                    bitmaskof(X86_FEATURE_NX) |
-                    bitmaskof(X86_FEATURE_LM) |
-                    bitmaskof(X86_FEATURE_PAGE1GB) |
-                    bitmaskof(X86_FEATURE_SYSCALL) |
-                    bitmaskof(X86_FEATURE_MMXEXT) |
-                    bitmaskof(X86_FEATURE_FFXSR) |
-                    bitmaskof(X86_FEATURE_3DNOW) |
-                    bitmaskof(X86_FEATURE_3DNOWEXT));
-        break;
-    }
-
     case 0x80000008:
         /*
          * ECX[15:12] is ApicIdCoreSize: ECX[7:0] is NumberOfCores (minus one).
@@ -399,12 +372,6 @@ static void intel_xc_cpuid_policy(xc_interface *xch,
 {
     switch ( input[0] )
     {
-    case 0x00000001:
-        /* ECX[5] is availability of VMX */
-        if ( info->nestedhvm )
-            set_bit(X86_FEATURE_VMX, regs[2]);
-        break;
-
     case 0x00000004:
         /*
          * EAX[31:26] is Maximum Cores Per Package (minus one).
@@ -420,19 +387,6 @@ static void intel_xc_cpuid_policy(xc_interface *xch,
             regs[0] = DEF_MAX_INTELEXT;
         break;
 
-    case 0x80000001: {
-        /* Only a few features are advertised in Intel's 0x80000001. */
-        regs[2] &= (bitmaskof(X86_FEATURE_LAHF_LM) |
-                    bitmaskof(X86_FEATURE_3DNOWPREFETCH) |
-                    bitmaskof(X86_FEATURE_ABM));
-        regs[3] &= (bitmaskof(X86_FEATURE_NX) |
-                    bitmaskof(X86_FEATURE_LM) |
-                    bitmaskof(X86_FEATURE_PAGE1GB) |
-                    bitmaskof(X86_FEATURE_SYSCALL) |
-                    bitmaskof(X86_FEATURE_RDTSCP));
-        break;
-    }
-
     case 0x80000005:
         regs[0] = regs[1] = regs[2] = 0;
         break;
@@ -444,10 +398,6 @@ static void intel_xc_cpuid_policy(xc_interface *xch,
     }
 }
 
-#define XSAVEOPT        (1 << 0)
-#define XSAVEC          (1 << 1)
-#define XGETBV1         (1 << 2)
-#define XSAVES          (1 << 3)
 /* Configure extended state enumeration leaves (0x0000000D for xsave) */
 static void xc_cpuid_config_xsave(xc_interface *xch,
                                   const struct cpuid_domain_info *info,
@@ -484,9 +434,7 @@ static void xc_cpuid_config_xsave(xc_interface *xch,
         regs[1] = 512 + 64; /* FP/SSE + XSAVE.HEADER */
         break;
     case 1: /* leaf 1 */
-        regs[0] &= (XSAVEOPT | XSAVEC | XGETBV1 | XSAVES);
-        if ( !info->hvm )
-            regs[0] &= ~XSAVES;
+        regs[0] = info->featureset[featureword_of(X86_FEATURE_XSAVEOPT)];
         regs[2] &= info->xfeature_mask;
         regs[3] = 0;
         break;
@@ -520,85 +468,22 @@ static void xc_cpuid_hvm_policy(xc_interface *xch,
          */
         regs[1] = (regs[1] & 0x0000ffffu) | ((regs[1] & 0x007f0000u) << 1);
 
-        regs[2] &= (bitmaskof(X86_FEATURE_SSE3) |
-                    bitmaskof(X86_FEATURE_PCLMULQDQ) |
-                    bitmaskof(X86_FEATURE_SSSE3) |
-                    bitmaskof(X86_FEATURE_FMA) |
-                    bitmaskof(X86_FEATURE_CX16) |
-                    bitmaskof(X86_FEATURE_PCID) |
-                    bitmaskof(X86_FEATURE_SSE4_1) |
-                    bitmaskof(X86_FEATURE_SSE4_2) |
-                    bitmaskof(X86_FEATURE_MOVBE)  |
-                    bitmaskof(X86_FEATURE_POPCNT) |
-                    bitmaskof(X86_FEATURE_AESNI) |
-                    bitmaskof(X86_FEATURE_F16C) |
-                    bitmaskof(X86_FEATURE_RDRAND) |
-                    ((info->xfeature_mask != 0) ?
-                     (bitmaskof(X86_FEATURE_AVX) |
-                      bitmaskof(X86_FEATURE_XSAVE)) : 0));
-
-        regs[2] |= (bitmaskof(X86_FEATURE_HYPERVISOR) |
-                    bitmaskof(X86_FEATURE_TSC_DEADLINE) |
-                    bitmaskof(X86_FEATURE_X2APIC));
-
-        regs[3] &= (bitmaskof(X86_FEATURE_FPU) |
-                    bitmaskof(X86_FEATURE_VME) |
-                    bitmaskof(X86_FEATURE_DE) |
-                    bitmaskof(X86_FEATURE_PSE) |
-                    bitmaskof(X86_FEATURE_TSC) |
-                    bitmaskof(X86_FEATURE_MSR) |
-                    bitmaskof(X86_FEATURE_PAE) |
-                    bitmaskof(X86_FEATURE_MCE) |
-                    bitmaskof(X86_FEATURE_CX8) |
-                    bitmaskof(X86_FEATURE_APIC) |
-                    bitmaskof(X86_FEATURE_SEP) |
-                    bitmaskof(X86_FEATURE_MTRR) |
-                    bitmaskof(X86_FEATURE_PGE) |
-                    bitmaskof(X86_FEATURE_MCA) |
-                    bitmaskof(X86_FEATURE_CMOV) |
-                    bitmaskof(X86_FEATURE_PAT) |
-                    bitmaskof(X86_FEATURE_CLFLUSH) |
-                    bitmaskof(X86_FEATURE_PSE36) |
-                    bitmaskof(X86_FEATURE_MMX) |
-                    bitmaskof(X86_FEATURE_FXSR) |
-                    bitmaskof(X86_FEATURE_SSE) |
-                    bitmaskof(X86_FEATURE_SSE2) |
-                    bitmaskof(X86_FEATURE_HTT));
-            
-        /* We always support MTRR MSRs. */
-        regs[3] |= bitmaskof(X86_FEATURE_MTRR);
-
-        if ( !info->pae )
-        {
-            clear_bit(X86_FEATURE_PAE, regs[3]);
-            clear_bit(X86_FEATURE_PSE36, regs[3]);
-        }
+        regs[2] = info->featureset[featureword_of(X86_FEATURE_SSE3)];
+        regs[3] = (info->featureset[featureword_of(X86_FEATURE_FPU)] |
+                   bitmaskof(X86_FEATURE_HTT));
         break;
 
     case 0x00000007: /* Intel-defined CPU features */
-        if ( input[1] == 0 ) {
-            regs[1] &= (bitmaskof(X86_FEATURE_TSC_ADJUST) |
-                        bitmaskof(X86_FEATURE_BMI1) |
-                        bitmaskof(X86_FEATURE_HLE)  |
-                        bitmaskof(X86_FEATURE_AVX2) |
-                        bitmaskof(X86_FEATURE_SMEP) |
-                        bitmaskof(X86_FEATURE_BMI2) |
-                        bitmaskof(X86_FEATURE_ERMS) |
-                        bitmaskof(X86_FEATURE_INVPCID) |
-                        bitmaskof(X86_FEATURE_RTM)  |
-                        ((info->xfeature_mask != 0) ?
-                        bitmaskof(X86_FEATURE_MPX) : 0)  |
-                        bitmaskof(X86_FEATURE_RDSEED)  |
-                        bitmaskof(X86_FEATURE_ADX)  |
-                        bitmaskof(X86_FEATURE_SMAP) |
-                        bitmaskof(X86_FEATURE_FSGSBASE) |
-                        bitmaskof(X86_FEATURE_PCOMMIT) |
-                        bitmaskof(X86_FEATURE_CLWB) |
-                        bitmaskof(X86_FEATURE_CLFLUSHOPT));
-            regs[2] &= bitmaskof(X86_FEATURE_PKU);
-        } else
-            regs[1] = regs[2] = 0;
-
+        if ( input[1] == 0 )
+        {
+            regs[1] = info->featureset[featureword_of(X86_FEATURE_FSGSBASE)];
+            regs[2] = info->featureset[featureword_of(X86_FEATURE_PREFETCHWT1)];
+        }
+        else
+        {
+            regs[1] = 0;
+            regs[2] = 0;
+        }
         regs[0] = regs[3] = 0;
         break;
 
@@ -611,14 +496,9 @@ static void xc_cpuid_hvm_policy(xc_interface *xch,
         break;
 
     case 0x80000001:
-        if ( !info->pae )
-        {
-            clear_bit(X86_FEATURE_LAHF_LM, regs[2]);
-            clear_bit(X86_FEATURE_LM, regs[3]);
-            clear_bit(X86_FEATURE_NX, regs[3]);
-            clear_bit(X86_FEATURE_PSE36, regs[3]);
-            clear_bit(X86_FEATURE_PAGE1GB, regs[3]);
-        }
+        regs[2] = (info->featureset[featureword_of(X86_FEATURE_LAHF_LM)] &
+                   ~bitmaskof(X86_FEATURE_CMP_LEGACY));
+        regs[3] = info->featureset[featureword_of(X86_FEATURE_SYSCALL)];
         break;
 
     case 0x80000007:
@@ -662,68 +542,34 @@ static void xc_cpuid_pv_policy(xc_interface *xch,
                                const struct cpuid_domain_info *info,
                                const unsigned int *input, unsigned int *regs)
 {
-    if ( (input[0] & 0x7fffffff) == 0x00000001 )
-    {
-        clear_bit(X86_FEATURE_VME, regs[3]);
-        if ( !info->pvh )
-        {
-            clear_bit(X86_FEATURE_PSE, regs[3]);
-            clear_bit(X86_FEATURE_PGE, regs[3]);
-        }
-        clear_bit(X86_FEATURE_MCE, regs[3]);
-        clear_bit(X86_FEATURE_MCA, regs[3]);
-        clear_bit(X86_FEATURE_MTRR, regs[3]);
-        clear_bit(X86_FEATURE_PSE36, regs[3]);
-    }
-
     switch ( input[0] )
     {
     case 0x00000001:
-        if ( info->vendor == VENDOR_AMD )
-            clear_bit(X86_FEATURE_SEP, regs[3]);
-        clear_bit(X86_FEATURE_DS, regs[3]);
-        clear_bit(X86_FEATURE_TM1, regs[3]);
-        clear_bit(X86_FEATURE_PBE, regs[3]);
-
-        clear_bit(X86_FEATURE_DTES64, regs[2]);
-        clear_bit(X86_FEATURE_MONITOR, regs[2]);
-        clear_bit(X86_FEATURE_DSCPL, regs[2]);
-        clear_bit(X86_FEATURE_VMX, regs[2]);
-        clear_bit(X86_FEATURE_SMX, regs[2]);
-        clear_bit(X86_FEATURE_EIST, regs[2]);
-        clear_bit(X86_FEATURE_TM2, regs[2]);
-        if ( !info->pv64 )
-            clear_bit(X86_FEATURE_CX16, regs[2]);
-        if ( info->xfeature_mask == 0 )
-        {
-            clear_bit(X86_FEATURE_XSAVE, regs[2]);
-            clear_bit(X86_FEATURE_AVX, regs[2]);
-        }
-        clear_bit(X86_FEATURE_XTPR, regs[2]);
-        clear_bit(X86_FEATURE_PDCM, regs[2]);
-        clear_bit(X86_FEATURE_PCID, regs[2]);
-        clear_bit(X86_FEATURE_DCA, regs[2]);
-        set_bit(X86_FEATURE_HYPERVISOR, regs[2]);
+    {
+        /* Host topology exposed to PV guest.  Provide host value. */
+        bool host_htt = regs[3] & bitmaskof(X86_FEATURE_HTT);
+
+        regs[2] = info->featureset[featureword_of(X86_FEATURE_SSE3)];
+        regs[3] = (info->featureset[featureword_of(X86_FEATURE_FPU)] &
+                   ~bitmaskof(X86_FEATURE_HTT));
+
+        if ( host_htt )
+            regs[3] |= bitmaskof(X86_FEATURE_HTT);
         break;
+    }
 
     case 0x00000007:
         if ( input[1] == 0 )
         {
-            regs[1] &= (bitmaskof(X86_FEATURE_BMI1) |
-                        bitmaskof(X86_FEATURE_HLE)  |
-                        bitmaskof(X86_FEATURE_AVX2) |
-                        bitmaskof(X86_FEATURE_BMI2) |
-                        bitmaskof(X86_FEATURE_ERMS) |
-                        bitmaskof(X86_FEATURE_RTM)  |
-                        bitmaskof(X86_FEATURE_RDSEED)  |
-                        bitmaskof(X86_FEATURE_ADX)  |
-                        bitmaskof(X86_FEATURE_FSGSBASE));
-            if ( info->xfeature_mask == 0 )
-                clear_bit(X86_FEATURE_MPX, regs[1]);
+            regs[1] = info->featureset[featureword_of(X86_FEATURE_FSGSBASE)];
+            regs[2] = info->featureset[featureword_of(X86_FEATURE_PREFETCHWT1)];
         }
         else
+        {
             regs[1] = 0;
-        regs[0] = regs[2] = regs[3] = 0;
+            regs[2] = 0;
+        }
+        regs[0] = regs[3] = 0;
         break;
 
     case 0x0000000d:
@@ -731,30 +577,19 @@ static void xc_cpuid_pv_policy(xc_interface *xch,
         break;
 
     case 0x80000001:
-        if ( !info->pv64 )
-        {
-            clear_bit(X86_FEATURE_LM, regs[3]);
-            clear_bit(X86_FEATURE_LAHF_LM, regs[2]);
-            if ( info->vendor != VENDOR_AMD )
-                clear_bit(X86_FEATURE_SYSCALL, regs[3]);
-        }
-        else
-        {
-            set_bit(X86_FEATURE_SYSCALL, regs[3]);
-        }
-        if ( !info->pvh )
-            clear_bit(X86_FEATURE_PAGE1GB, regs[3]);
-        clear_bit(X86_FEATURE_RDTSCP, regs[3]);
-
-        clear_bit(X86_FEATURE_SVM, regs[2]);
-        clear_bit(X86_FEATURE_OSVW, regs[2]);
-        clear_bit(X86_FEATURE_IBS, regs[2]);
-        clear_bit(X86_FEATURE_SKINIT, regs[2]);
-        clear_bit(X86_FEATURE_WDT, regs[2]);
-        clear_bit(X86_FEATURE_LWP, regs[2]);
-        clear_bit(X86_FEATURE_NODEID_MSR, regs[2]);
-        clear_bit(X86_FEATURE_TOPOEXT, regs[2]);
+    {
+        /* Host topology exposed to PV guest.  Provide host CMP_LEGACY value. */
+        bool host_cmp_legacy = regs[2] & bitmaskof(X86_FEATURE_CMP_LEGACY);
+
+        regs[2] = (info->featureset[featureword_of(X86_FEATURE_LAHF_LM)] &
+                   ~bitmaskof(X86_FEATURE_CMP_LEGACY));
+        regs[3] = info->featureset[featureword_of(X86_FEATURE_SYSCALL)];
+
+        if ( host_cmp_legacy )
+            regs[2] |= bitmaskof(X86_FEATURE_CMP_LEGACY);
+
         break;
+    }
 
     case 0x00000005: /* MONITOR/MWAIT */
     case 0x0000000a: /* Architectural Performance Monitor Features */
@@ -833,6 +668,87 @@ void xc_cpuid_to_str(const unsigned int *regs, char **strs)
     }
 }
 
+static void sanitise_featureset(struct cpuid_domain_info *info)
+{
+    const uint32_t fs_size = xc_get_cpu_featureset_size();
+    uint32_t disabled_features[fs_size];
+    static const uint32_t deep_features[] = INIT_DEEP_FEATURES;
+    unsigned int i, b;
+
+    if ( info->hvm )
+    {
+        /* HVM Guest */
+
+        if ( !info->pae )
+            clear_bit(X86_FEATURE_PAE, info->featureset);
+
+        if ( !info->nestedhvm )
+        {
+            clear_bit(X86_FEATURE_SVM, info->featureset);
+            clear_bit(X86_FEATURE_VMX, info->featureset);
+        }
+    }
+    else
+    {
+        /* PV or PVH Guest */
+
+        if ( !info->pv64 )
+        {
+            clear_bit(X86_FEATURE_LM, info->featureset);
+            if ( info->vendor != VENDOR_AMD )
+                clear_bit(X86_FEATURE_SYSCALL, info->featureset);
+        }
+
+        if ( !info->pvh )
+        {
+            clear_bit(X86_FEATURE_PSE, info->featureset);
+            clear_bit(X86_FEATURE_PSE36, info->featureset);
+            clear_bit(X86_FEATURE_PGE, info->featureset);
+            clear_bit(X86_FEATURE_PAGE1GB, info->featureset);
+        }
+    }
+
+    if ( info->xfeature_mask == 0 )
+        clear_bit(X86_FEATURE_XSAVE, info->featureset);
+
+    /* Disable deep dependencies of disabled features. */
+    for ( i = 0; i < ARRAY_SIZE(disabled_features); ++i )
+        disabled_features[i] = ~info->featureset[i] & deep_features[i];
+
+    for ( b = 0; b < sizeof(disabled_features) * CHAR_BIT; ++b )
+    {
+        const uint32_t *dfs;
+
+        if ( !test_bit(b, disabled_features) ||
+             !(dfs = xc_get_feature_deep_deps(b)) )
+             continue;
+
+        for ( i = 0; i < ARRAY_SIZE(disabled_features); ++i )
+        {
+            info->featureset[i] &= ~dfs[i];
+            disabled_features[i] &= ~dfs[i];
+        }
+    }
+
+    switch ( info->vendor )
+    {
+    case VENDOR_INTEL:
+        /* Intel clears the common bits in e1d. */
+        info->featureset[featureword_of(X86_FEATURE_SYSCALL)] &= ~COMMON_1D;
+        break;
+
+    case VENDOR_AMD:
+        /* AMD duplicates the common bits between 1d and e1d. */
+        info->featureset[featureword_of(X86_FEATURE_SYSCALL)] =
+            ((info->featureset[featureword_of(X86_FEATURE_FPU)] & COMMON_1D) |
+             (info->featureset[featureword_of(X86_FEATURE_SYSCALL)] & ~COMMON_1D));
+        break;
+
+    default:
+        break;
+    }
+}
+
 int xc_cpuid_apply_policy(xc_interface *xch, domid_t domid,
                           uint32_t *featureset,
                           unsigned int nr_features)
@@ -856,6 +772,8 @@ int xc_cpuid_apply_policy(xc_interface *xch, domid_t domid,
     else
         ext_max = (regs[0] <= DEF_MAX_INTELEXT) ? regs[0] : DEF_MAX_INTELEXT;
 
+    sanitise_featureset(&info);
+
     input[0] = 0;
     input[1] = XEN_CPUID_INPUT_UNUSED;
     for ( ; ; )
@@ -1027,9 +945,9 @@ int xc_cpuid_set(
                 val = polval;
 
             if ( val )
-                set_bit(31 - j, regs[i]);
+                set_feature(31 - j, regs[i]);
             else
-                clear_bit(31 - j, regs[i]);
+                clear_feature(31 - j, regs[i]);
 
             config_transformed[i][j] = config[i][j];
             if ( config[i][j] == 's' )
-- 
2.1.4
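
The deep-dependency pass in sanitise_featureset() above can be looked at in
isolation.  The following is a minimal sketch only, not the libxc code: the
two-word featureset, the F_* feature numbers, the deep_deps[] table and the
clear_deep_deps() helper are all hypothetical stand-ins (the real code gets
its data from INIT_DEEP_FEATURES and xc_get_feature_deep_deps()).  It assumes
each table row already holds the transitive closure of dependents, which is
why a dependent can be dropped from the work list as soon as it is cleared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes; the real featureset length is reported by Xen. */
#define FS_WORDS    2
#define NR_FEATURES (FS_WORDS * 32)

/* Illustrative feature numbers only, not Xen's real encoding. */
enum { F_XSAVE = 0, F_AVX = 1, F_FMA = 2, F_AVX2 = 33 };

/*
 * deep_deps[b] is the set of features which transitively depend on
 * feature b.  Rows for features without dependents are all zero.
 */
static const uint32_t deep_deps[NR_FEATURES][FS_WORDS] = {
    [F_XSAVE] = { (1u << F_AVX) | (1u << F_FMA), 1u << (F_AVX2 % 32) },
    [F_AVX]   = { 1u << F_FMA,                   1u << (F_AVX2 % 32) },
};

static int fs_test(const uint32_t *fs, unsigned int bit)
{
    assert(bit < NR_FEATURES);
    return (fs[bit / 32] >> (bit % 32)) & 1;
}

/* Clear every feature whose (transitive) prerequisite is absent from fs. */
static void clear_deep_deps(uint32_t *fs)
{
    uint32_t disabled[FS_WORDS];
    unsigned int i, b;

    /* Work list: every feature which is currently absent. */
    for ( i = 0; i < FS_WORDS; i++ )
        disabled[i] = ~fs[i];

    for ( b = 0; b < NR_FEATURES; b++ )
    {
        if ( !fs_test(disabled, b) )
            continue;

        for ( i = 0; i < FS_WORDS; i++ )
        {
            /* Drop the dependents from the featureset ... */
            fs[i] &= ~deep_deps[b][i];
            /* ... and from the work list; row b already covers them. */
            disabled[i] &= ~deep_deps[b][i];
        }
    }
}
```

With this table, a featureset containing XSAVE and FMA but not AVX comes out
with only XSAVE set, because FMA and AVX2 are listed as dependents of AVX.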


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v4 26/26] tools/libxc: Calculate xstate cpuid leaf from guest information
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (24 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 25/26] tools/libxc: Use featuresets rather than guesswork Andrew Cooper
@ 2016-03-23 16:36 ` Andrew Cooper
  2016-03-24 17:20   ` Wei Liu
  2016-03-31  7:48   ` Jan Beulich
  2016-03-24 10:27 ` [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Jan Beulich
  26 siblings, 2 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-03-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Ian Jackson, Jan Beulich

It is unsafe to generate the guest's xstate leaves from host information, as it
prevents the differences between hosts from being hidden.

In addition, some further improvements and corrections:
 - don't discard the known flags in sub-leaves 2..63 ECX
 - zap sub-leaves beyond 62
 - zap all bits in leaf 1, EBX/ECX.  No XSS features are currently supported.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
CC: Wei Liu <wei.liu2@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>

v3:
 * Reintroduce MPX adjustment (this series has been in development since
   before the introduction of MPX upstream, and it got lost in a rebase)
v4:
 * Fold further improvements from Jan
---
 tools/libxc/xc_cpuid_x86.c | 71 +++++++++++++++++++++++++++++++++++++---------
 1 file changed, 57 insertions(+), 14 deletions(-)

diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index fc7e20a..cf1f6b7 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -398,54 +398,97 @@ static void intel_xc_cpuid_policy(xc_interface *xch,
     }
 }
 
+/* XSTATE bits in XCR0. */
+#define X86_XCR0_X87    (1ULL <<  0)
+#define X86_XCR0_SSE    (1ULL <<  1)
+#define X86_XCR0_AVX    (1ULL <<  2)
+#define X86_XCR0_BNDREG (1ULL <<  3)
+#define X86_XCR0_BNDCSR (1ULL <<  4)
+#define X86_XCR0_LWP    (1ULL << 62)
+
+#define X86_XSS_MASK    (0) /* No XSS states supported yet. */
+
+/* Per-component subleaf flags. */
+#define XSTATE_XSS      (1ULL <<  0)
+#define XSTATE_ALIGN64  (1ULL <<  1)
+
 /* Configure extended state enumeration leaves (0x0000000D for xsave) */
 static void xc_cpuid_config_xsave(xc_interface *xch,
                                   const struct cpuid_domain_info *info,
                                   const unsigned int *input, unsigned int *regs)
 {
-    if ( info->xfeature_mask == 0 )
+    uint64_t guest_xfeature_mask;
+
+    if ( info->xfeature_mask == 0 ||
+         !test_bit(X86_FEATURE_XSAVE, info->featureset) )
     {
         regs[0] = regs[1] = regs[2] = regs[3] = 0;
         return;
     }
 
+    guest_xfeature_mask = X86_XCR0_SSE | X86_XCR0_X87;
+
+    if ( test_bit(X86_FEATURE_AVX, info->featureset) )
+        guest_xfeature_mask |= X86_XCR0_AVX;
+
+    if ( test_bit(X86_FEATURE_MPX, info->featureset) )
+        guest_xfeature_mask |= X86_XCR0_BNDREG | X86_XCR0_BNDCSR;
+
+    if ( test_bit(X86_FEATURE_LWP, info->featureset) )
+        guest_xfeature_mask |= X86_XCR0_LWP;
+
+    /*
+     * Clamp to host mask.  Should be no-op, as guest_xfeature_mask should not
+     * be able to be calculated as larger than info->xfeature_mask.
+     *
+     * TODO - see about making this a harder error.
+     */
+    guest_xfeature_mask &= info->xfeature_mask;
+
     switch ( input[1] )
     {
-    case 0: 
+    case 0:
         /* EAX: low 32bits of xfeature_enabled_mask */
-        regs[0] = info->xfeature_mask & 0xFFFFFFFF;
+        regs[0] = guest_xfeature_mask;
         /* EDX: high 32bits of xfeature_enabled_mask */
-        regs[3] = (info->xfeature_mask >> 32) & 0xFFFFFFFF;
+        regs[3] = guest_xfeature_mask >> 32;
         /* ECX: max size required by all HW features */
         {
             unsigned int _input[2] = {0xd, 0x0}, _regs[4];
             regs[2] = 0;
-            for ( _input[1] = 2; _input[1] < 64; _input[1]++ )
+            for ( _input[1] = 2; _input[1] <= 62; _input[1]++ )
             {
                 cpuid(_input, _regs);
                 if ( (_regs[0] + _regs[1]) > regs[2] )
                     regs[2] = _regs[0] + _regs[1];
             }
         }
-        /* EBX: max size required by enabled features. 
-         * This register contains a dynamic value, which varies when a guest 
-         * enables or disables XSTATE features (via xsetbv). The default size 
-         * after reset is 576. */ 
+        /* EBX: max size required by enabled features.
+         * This register contains a dynamic value, which varies when a guest
+         * enables or disables XSTATE features (via xsetbv). The default size
+         * after reset is 576. */
         regs[1] = 512 + 64; /* FP/SSE + XSAVE.HEADER */
         break;
+
     case 1: /* leaf 1 */
         regs[0] = info->featureset[featureword_of(X86_FEATURE_XSAVEOPT)];
-        regs[2] &= info->xfeature_mask;
-        regs[3] = 0;
+        regs[2] = guest_xfeature_mask & X86_XSS_MASK;
+        regs[3] = (guest_xfeature_mask >> 32) & X86_XSS_MASK;
         break;
-    case 2 ... 63: /* sub-leaves */
-        if ( !(info->xfeature_mask & (1ULL << input[1])) )
+
+    case 2 ... 62: /* per-component sub-leaves */
+        if ( !(guest_xfeature_mask & (1ULL << input[1])) )
         {
             regs[0] = regs[1] = regs[2] = regs[3] = 0;
             break;
         }
         /* Don't touch EAX, EBX. Also cleanup ECX and EDX */
-        regs[2] = regs[3] = 0;
+        regs[2] &= XSTATE_XSS | XSTATE_ALIGN64;
+        regs[3] = 0;
+        break;
+
+    default:
+        regs[0] = regs[1] = regs[2] = regs[3] = 0;
         break;
     }
 }
-- 
2.1.4
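
The guest mask calculation in the patch above can be exercised on its own.
This is a sketch under stated assumptions rather than the libxc
implementation: struct guest_features, guest_xcr0_mask() and xcr0_to_leaf0()
are hypothetical names, and the boolean fields stand in for the test_bit()
checks against info->featureset.  The XCR0 bit positions are the ones defined
in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* XCR0 state-component bits, as defined in the patch. */
#define XCR0_X87    (1ULL <<  0)
#define XCR0_SSE    (1ULL <<  1)
#define XCR0_AVX    (1ULL <<  2)
#define XCR0_BNDREG (1ULL <<  3)
#define XCR0_BNDCSR (1ULL <<  4)
#define XCR0_LWP    (1ULL << 62)

/* Hypothetical stand-in for the guest's featureset bits. */
struct guest_features {
    bool xsave, avx, mpx, lwp;
};

/* Build the XCR0 mask from guest features, then clamp it to the host. */
static uint64_t guest_xcr0_mask(const struct guest_features *g,
                                uint64_t host_mask)
{
    uint64_t mask;

    assert(g != NULL);

    /* No xsave on the host or in the guest: the whole leaf reads as zero. */
    if ( host_mask == 0 || !g->xsave )
        return 0;

    mask = XCR0_X87 | XCR0_SSE;

    if ( g->avx )
        mask |= XCR0_AVX;
    if ( g->mpx )
        mask |= XCR0_BNDREG | XCR0_BNDCSR;
    if ( g->lwp )
        mask |= XCR0_LWP;

    /* Expected to be a no-op if the featureset was derived sanely. */
    return mask & host_mask;
}

/* Leaf 0xD, sub-leaf 0: EAX holds the low half, EDX the high half. */
static void xcr0_to_leaf0(uint64_t mask, uint32_t *eax, uint32_t *edx)
{
    *eax = (uint32_t)mask;
    *edx = (uint32_t)(mask >> 32);
}
```

The point of the split is that two guests on the same host can be given
different masks, which is not possible when the leaf is copied straight from
info->xfeature_mask.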



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 00/26] x86: Improvements to cpuid handling for guests
  2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
                   ` (25 preceding siblings ...)
  2016-03-23 16:36 ` [PATCH v4 26/26] tools/libxc: Calculate xstate cpuid leaf from guest information Andrew Cooper
@ 2016-03-24 10:27 ` Jan Beulich
  2016-03-24 10:28   ` Andrew Cooper
  26 siblings, 1 reply; 72+ messages in thread
From: Jan Beulich @ 2016-03-24 10:27 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
> Most patches do now have Acks/Reviews.  The remaining patches are #1 (Rest),
> #6-8,11-13,18 (x86), #20 (ARM), 26 (Toolstack).

#20 doesn't have anything ARM specific, it's pure tools stuff.

Jan



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 00/26] x86: Improvements to cpuid handling for guests
  2016-03-24 10:27 ` [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Jan Beulich
@ 2016-03-24 10:28   ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-03-24 10:28 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 24/03/16 10:27, Jan Beulich wrote:
>>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
>> Most patches do now have Acks/Reviews.  The remaining patches are #1 (Rest),
>> #6-8,11-13,18 (x86), #20 (ARM), 26 (Toolstack).
> #20 doesn't have anything ARM specific, it's pure tools stuff.

It is trying to fix a pointer alignment issue on ARM.  I want a second
opinion that I haven't broken it.

~Andrew


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 01/26] xen/public: Export cpu featureset information in the public API
  2016-03-23 16:36 ` [PATCH v4 01/26] xen/public: Export cpu featureset information in the public API Andrew Cooper
@ 2016-03-24 14:08   ` Jan Beulich
  2016-03-24 14:12     ` Andrew Cooper
  0 siblings, 1 reply; 72+ messages in thread
From: Jan Beulich @ 2016-03-24 14:08 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Tim Deegan, Xen-devel

>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
> --- /dev/null
> +++ b/xen/include/asm-x86/cpufeatureset.h
> @@ -0,0 +1,32 @@
> +#ifndef __XEN_X86_CPUFEATURESET_H__
> +#define __XEN_X86_CPUFEATURESET_H__
> +
> +#ifndef __ASSEMBLY__
> +
> +#define XEN_CPUFEATURE(name, value) X86_FEATURE_##name = value,
> +enum {
> +#include <public/arch-x86/cpufeatureset.h>
> +#undef XEN_CPUFEATURE
> +};
> +
> +#define XEN_CPUFEATURE(name, value) asm (".equ X86_FEATURE_" #name ", " #value);
> +#include <public/arch-x86/cpufeatureset.h>
> +
> +#else /* !__ASSEMBLY__ */
> +
> +#define XEN_CPUFEATURE(name, value) .equ X86_FEATURE_##name, value
> +#include <public/arch-x86/cpufeatureset.h>
> +
> +#endif /* __ASSEMBLY__ */
> +
> +#endif /* !__XEN_X86_CPUFEATURESET_H__ */

While this is now no longer in a public header, I still don't like
XEN_CPUFEATURE() remaining defined here.

With that
Reviewed-by: Jan Beulich <jbeulich@suse.com>

If you agree, I could fold this in while committing.

Independently - is the asm() indeed unconditionally necessary?
If so, how much clutter to symbol tables does it introduce? I notice
that public/errno.h causes some unnecessary bloat in this regard
too. Should we perhaps add --strip-local-absolute to the assembler
options (for both .c and .S files)?

Jan



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 01/26] xen/public: Export cpu featureset information in the public API
  2016-03-24 14:08   ` Jan Beulich
@ 2016-03-24 14:12     ` Andrew Cooper
  2016-03-24 14:16       ` Jan Beulich
  0 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2016-03-24 14:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Tim Deegan, Xen-devel

On 24/03/16 14:08, Jan Beulich wrote:
>>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
>> --- /dev/null
>> +++ b/xen/include/asm-x86/cpufeatureset.h
>> @@ -0,0 +1,32 @@
>> +#ifndef __XEN_X86_CPUFEATURESET_H__
>> +#define __XEN_X86_CPUFEATURESET_H__
>> +
>> +#ifndef __ASSEMBLY__
>> +
>> +#define XEN_CPUFEATURE(name, value) X86_FEATURE_##name = value,
>> +enum {
>> +#include <public/arch-x86/cpufeatureset.h>
>> +#undef XEN_CPUFEATURE
>> +};
>> +
>> +#define XEN_CPUFEATURE(name, value) asm (".equ X86_FEATURE_" #name ", " #value);
>> +#include <public/arch-x86/cpufeatureset.h>
>> +
>> +#else /* !__ASSEMBLY__ */
>> +
>> +#define XEN_CPUFEATURE(name, value) .equ X86_FEATURE_##name, value
>> +#include <public/arch-x86/cpufeatureset.h>
>> +
>> +#endif /* __ASSEMBLY__ */
>> +
>> +#endif /* !__XEN_X86_CPUFEATURESET_H__ */
> While this is now no longer in a public header, I still don't like
> XEN_CPUFEATURE() remaining defined here.
>
> With that
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
> If you agree, I could fold this in while committing.

Ok.

>
> Independently - is the asm() indeed unconditionally necessary?

Yes.  Otherwise alternative blocks in C fail to compile.  They
__stringify() the feature name, which used to turn into a number (when
the feature was a #define), but now stays as the identifier's name because
of the new enum.

> If so, how much clutter to symbol tables does it introduce? I notice
> that public/errno.h causes some unnecessary bloat in this regard
> too. Should we perhaps add --strip-local-absolute to the assembler
> options (for both .c and .S files)?

I was wondering if there was anything we could do to fix the errno
clutter.  That sounds like a good plan.

~Andrew
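
The stringification difference described above is easy to reproduce outside
of Xen.  The sketch below uses hypothetical names throughout (FEATURE_LIST,
OLD_FEATURE_SSE3, NEW_FEATURE_*, STRINGIFY); it only illustrates the
preprocessor behaviour, not the actual Xen headers.  A #define is expanded
before stringification, whereas an enumerator is invisible to the
preprocessor, so the assembler is handed a symbol name which then has to be
defined by some other means (such as the .equ in the patch).

```c
#include <assert.h>   /* for callers checking the results */
#include <string.h>

/* Two-level stringify, so macro arguments are expanded first. */
#define STRINGIFY_1(x) #x
#define STRINGIFY(x)   STRINGIFY_1(x)

/* Old style: the feature is a preprocessor define. */
#define OLD_FEATURE_SSE3 32

/* New style: one list (hypothetical), expanded into an enum. */
#define FEATURE_LIST(X) \
    X(FPU,   0)         \
    X(SSE3, 32)

#define AS_ENUM(name, value) NEW_FEATURE_##name = value,
enum { FEATURE_LIST(AS_ENUM) };
#undef AS_ENUM

/* The define is expanded, so this yields the string "32". */
static const char *old_style(void)
{
    return STRINGIFY(OLD_FEATURE_SSE3);
}

/* The enumerator is not a macro, so this yields its name instead. */
static const char *new_style(void)
{
    return STRINGIFY(NEW_FEATURE_SSE3);
}
```

Both spellings still have the same value in C; only the text pasted into an
asm() string differs.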


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 01/26] xen/public: Export cpu featureset information in the public API
  2016-03-24 14:12     ` Andrew Cooper
@ 2016-03-24 14:16       ` Jan Beulich
  0 siblings, 0 replies; 72+ messages in thread
From: Jan Beulich @ 2016-03-24 14:16 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Tim Deegan, Xen-devel

>>> On 24.03.16 at 15:12, <andrew.cooper3@citrix.com> wrote:
> On 24/03/16 14:08, Jan Beulich wrote:
>> Independently - is the asm() indeed unconditionally necessary?
> 
> Yes.  Otherwise alternative blocks in C fail to compile.  They
> __stringify() the feature name, which used to turn into a number (when
> the feature was a define), but stay as a string identifier because of
> the new enum.
> 
>> If so, how much clutter to symbol tables does it introduce? I notice
>> that public/errno.h causes some unnecessary bloat in this regard
>> too. Should we perhaps add --strip-local-absolute to the assembler
>> options (for both .c and .S files)?
> 
> I was wondering if there was anything we could do to fix the errno
> clutter.  That sounds like a good plan.

Okay - luckily this exists in gas 2.16 already, so we can add it
unconditionally. I'll prepare a patch...

Jan



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 11/26] xen/x86: Improvements to in-hypervisor cpuid sanity checks
  2016-03-23 16:36 ` [PATCH v4 11/26] xen/x86: Improvements to in-hypervisor cpuid sanity checks Andrew Cooper
@ 2016-03-24 15:38   ` Andrew Cooper
  2016-03-24 16:47   ` Jan Beulich
  2016-03-28 15:29   ` Konrad Rzeszutek Wilk
  2 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-03-24 15:38 UTC (permalink / raw)
  To: Xen-devel; +Cc: Jan Beulich

On 23/03/16 16:36, Andrew Cooper wrote:
> Currently, {pv,hvm}_cpuid() has a large quantity of essentially-static logic
> for modifying the features visible to a guest.  A lot of this can be subsumed
> by {pv,hvm}_featuremask, which identify the features available on this
> hardware which could be given to a PV or HVM guest.
>
> This is a step in the direction of full per-domain cpuid policies, but lots
> more development is needed for that.  As a result, the static checks are
> simplified, but the dynamic checks need to remain for now.
>
> As a side effect, some of the logic for special features can be improved.
> OSXSAVE and OSPKE will be automatically cleared because of being absent in the
> featuremask.  This allows the fast-forward logic to be more simple.
>
> In addition, there are some corrections to the existing logic:
>
>  * Hiding PSE36 out of PAE mode is architecturally wrong.  It turns out that
>    it was a bugfix for running HyperV under Xen, which wanted to see PSE36
>    even after choosing to use PAE paging.  PSE36 is not supported by shadow
>    paging, so is hidden from non-HAP guests, but is still visible for HAP
>    guests.

This paragraph is slightly stale.

It needs one further sentence: "It is also leaked into non-HAP guests when
the guest is running in PAE mode, to placate HyperV."


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 08/26] xen/x86: Generate deep dependencies of features
  2016-03-23 16:36 ` [PATCH v4 08/26] xen/x86: Generate deep dependencies of features Andrew Cooper
@ 2016-03-24 16:16   ` Jan Beulich
  0 siblings, 0 replies; 72+ messages in thread
From: Jan Beulich @ 2016-03-24 16:16 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
> +        # SSE2 was also re-specified as core for 64bit.  The AESNI and SHA
> +        # instruction groups are documented to require checking for SSE2
> +        # support as a prerequisite.
> +        SSE2: [SSE3, LM, AESNI, SHA],

No idea why I didn't notice this in v3 already: Are you referring to
"The SHA extensions require only XMM state support on operating
systems, similar to SSE2 instructions"? I don't think this can be read
to mean that SSE2 is a prereq. Instead, to me this means that there
are no requirements on the OS beyond supporting SSE register
state and exception handling.

Similarly for AESNI it refers to "Checking for SSE/SSE2 Support",
which in turn says "Check that the processor supports the SSE
and/or SSE2 extensions" - note the "and/or" there and again in the
following referral to the actual CPUID bits.

Similarly SSE3 then only depends on SSE, ...

> +        # AMD K10 processors has SSE3 and SSE4A.  Bobcat/Barcelona processors
> +        # subsequently included SSSE3, and Bulldozer subsequently included
> +        # SSE4_1.  Intel have never shipped SSE4A.
> +        SSE3: [SSSE3, SSE4_1, SSE4_2, SSE4A],

... as does SSSE3 (despite its name).

SSE4.1 is different indeed: They require us to check SSE3 and SSSE3
along with SSE4.1.

SSE4.2 is even more complicated due to the CRC32 and POPCNT
special cases (which also implies that the latter two may need to
become dependents of SSE4.2).

Jan



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 11/26] xen/x86: Improvements to in-hypervisor cpuid sanity checks
  2016-03-23 16:36 ` [PATCH v4 11/26] xen/x86: Improvements to in-hypervisor cpuid sanity checks Andrew Cooper
  2016-03-24 15:38   ` Andrew Cooper
@ 2016-03-24 16:47   ` Jan Beulich
  2016-03-24 17:01     ` Andrew Cooper
  2016-03-28 15:29   ` Konrad Rzeszutek Wilk
  2 siblings, 1 reply; 72+ messages in thread
From: Jan Beulich @ 2016-03-24 16:47 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
> +        if ( !is_pvh_domain(currd) )
>          {
> -            __clear_bit(X86_FEATURE_XSAVE % 32, &c);
> -            __clear_bit(X86_FEATURE_AVX % 32, &c);
> +            /*
> +             * Delete the PVH condition when HVMLite formally replaces PVH,
> +             * and HVM guests no longer enter a PV codepath.
> +             */
> +
> +            /* OSXSAVE cleared by pv_featureset.  Fast-forward CR4 back in. */
> +            if ( (is_pv_domain(currd) && guest_kernel_mode(curr, regs) &&
> +                  (read_cr4() & X86_CR4_OSXSAVE)) ||
> +                 (curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_OSXSAVE) )
> +                c |= cpufeat_mask(X86_FEATURE_OSXSAVE);
>          }

The is_pv_domain() is now redundant with the is_pvh_domain()
earlier on, and it would likely end up confusing the reader if on
the right side of the || then ->arch.pv_vcpu is being referenced.

With that addressed, the commit message updated as already
indicated and ...

> +        /*
> +         * PV guests cannot use any MTRR infrastructure, so shouldn't see the
> +         * feature bit.  It used to leak in to PV guests.
> +         *
> +         * PVOPS Linux self-clobbers the MTRR feature, to avoid trying to use

... this starting "Modern PVOPS Linux self-clobbers ...":
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 13/26] x86/cpu: Sysctl and common infrastructure for levelling context switching
  2016-03-23 16:36 ` [PATCH v4 13/26] x86/cpu: Sysctl and common infrastructure for levelling context switching Andrew Cooper
@ 2016-03-24 16:58   ` Jan Beulich
  2016-03-28 16:12   ` Konrad Rzeszutek Wilk
  2016-03-28 17:37   ` Konrad Rzeszutek Wilk
  2 siblings, 0 replies; 72+ messages in thread
From: Jan Beulich @ 2016-03-24 16:58 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
> A toolstack needs to know how much control Xen has over the visible cpuid
> values in PV guests.  Provide an explicit mechanism to query what Xen is
> capable of.
> 
> This interface will currently report no capabilities.  This change is
> scaffolding for future patches, which will introduce detection and switching
> logic, after which the interface will report hardware capabilities 
> correctly.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 11/26] xen/x86: Improvements to in-hypervisor cpuid sanity checks
  2016-03-24 16:47   ` Jan Beulich
@ 2016-03-24 17:01     ` Andrew Cooper
  2016-03-24 17:11       ` Jan Beulich
  0 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2016-03-24 17:01 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 24/03/16 16:47, Jan Beulich wrote:
>>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
>> +        if ( !is_pvh_domain(currd) )
>>          {
>> -            __clear_bit(X86_FEATURE_XSAVE % 32, &c);
>> -            __clear_bit(X86_FEATURE_AVX % 32, &c);
>> +            /*
>> +             * Delete the PVH condition when HVMLite formally replaces PVH,
>> +             * and HVM guests no longer enter a PV codepath.
>> +             */
>> +
>> +            /* OSXSAVE cleared by pv_featureset.  Fast-forward CR4 back in. */
>> +            if ( (is_pv_domain(currd) && guest_kernel_mode(curr, regs) &&
>> +                  (read_cr4() & X86_CR4_OSXSAVE)) ||
>> +                 (curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_OSXSAVE) )
>> +                c |= cpufeat_mask(X86_FEATURE_OSXSAVE);
>>          }
> The is_pv_domain() is now redundant with the is_pvh_domain()
> earlier on, and it would likely end up confusing the reader if on
> the right side of the || then ->arch.pv_vcpu is being referenced.

I specifically chose to order the code like this to make it easier to
remove the is_pvh_domain() conditional in the future, without having to
re-edit the PV path.

This layout matches the OSPKE version, and I would prefer to keep it
this way unless you really insist on changing it.

~Andrew


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 18/26] x86/domctl: Update PV domain cpumasks when setting cpuid policy
  2016-03-23 16:36 ` [PATCH v4 18/26] x86/domctl: Update PV domain cpumasks when setting cpuid policy Andrew Cooper
@ 2016-03-24 17:04   ` Jan Beulich
  2016-03-24 17:05     ` Andrew Cooper
  2016-03-28 19:51   ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 72+ messages in thread
From: Jan Beulich @ 2016-03-24 17:04 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
> This allows PV domains with different featuresets to observe different values
> from a native cpuid instruction, on supporting hardware.
> 
> It is important to leak the host view of HTT and CMP_LEGACY through to 
> guests,
> even though they could be hidden.  These flags affect how to interpret other
> cpuid leaves which are not maskable.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> 
> v2:
>  * Use switch() rather than if/elseif chain
>  * Clamp to static PV featuremask
> v3:
>  * Only set a shadow cpumask if it is available in hardware.  This causes
>    fewer branches in the context switch.
>  * Fix interaction between fastforward bits and override MSR.
>  * Fix up the cross-vendor case.
>  * Fix the host view of HTT/CMP_LEGACY.
> v4:
>  * More comments explaining the masking MSRs behaviour.
>  * s/CPU/CPUID/
>  * Leak host X2APIC.

Did you perhaps mean to also mention this one in the commit message
then? Anyway,
Reviewed-by: Jan Beulich <jbeulich@suse.com>
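
For readers following the HTT/CMP_LEGACY/X2APIC point, the "leak the host view"
behaviour can be pictured as a per-register merge.  This is only a sketch: the
bit positions are the architectural ones, but the helper name and its calling
convention are assumptions made for illustration and are not taken from the
patch.

```c
#include <stdint.h>

/* Architectural CPUID bit positions. */
#define CPUID1_ECX_X2APIC       (1u << 21)  /* leaf 1, ECX */
#define CPUID1_EDX_HTT          (1u << 28)  /* leaf 1, EDX */
#define CPUIDE1_ECX_CMP_LEGACY  (1u <<  1)  /* leaf 0x80000001, ECX */

/*
 * Hypothetical helper: start from the toolstack-chosen value for one
 * register, but force every bit in 'leak' to whatever the host reports.
 */
static uint32_t leak_host_bits(uint32_t guest, uint32_t host, uint32_t leak)
{
    return (guest & ~leak) | (host & leak);
}
```

With leak set to CPUID1_EDX_HTT, a guest policy that hides HTT still observes
it when the host has it, and a policy that sets HTT does not observe it when
the host lacks it.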


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 18/26] x86/domctl: Update PV domain cpumasks when setting cpuid policy
  2016-03-24 17:04   ` Jan Beulich
@ 2016-03-24 17:05     ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-03-24 17:05 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 24/03/16 17:04, Jan Beulich wrote:
>>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
>> This allows PV domains with different featuresets to observe different values
>> from a native cpuid instruction, on supporting hardware.
>>
>> It is important to leak the host view of HTT and CMP_LEGACY through to 
>> guests,
>> even though they could be hidden.  These flags affect how to interpret other
>> cpuid leaves which are not maskable.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Jan Beulich <JBeulich@suse.com>
>>
>> v2:
>>  * Use switch() rather than if/elseif chain
>>  * Clamp to static PV featuremask
>> v3:
>>  * Only set a shadow cpumask if it is available in hardware.  This causes
>>    fewer branches in the context switch.
>>  * Fix interaction between fastforward bits and override MSR.
>>  * Fix up the cross-vendor case.
>>  * Fix the host view of HTT/CMP_LEGACY.
>> v4:
>>  * More comments explaining the masking MSRs behaviour.
>>  * s/CPU/CPUID/
>>  * Leak host X2APIC.
> Did you perhaps mean to also mention this one in the commit message
> then? Anyway,
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Yes, I did.  I will fix up the message.

~Andrew

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 11/26] xen/x86: Improvements to in-hypervisor cpuid sanity checks
  2016-03-24 17:01     ` Andrew Cooper
@ 2016-03-24 17:11       ` Jan Beulich
  2016-03-24 17:12         ` Andrew Cooper
  0 siblings, 1 reply; 72+ messages in thread
From: Jan Beulich @ 2016-03-24 17:11 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 24.03.16 at 18:01, <andrew.cooper3@citrix.com> wrote:
> On 24/03/16 16:47, Jan Beulich wrote:
>>>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
>>> +        if ( !is_pvh_domain(currd) )
>>>          {
>>> -            __clear_bit(X86_FEATURE_XSAVE % 32, &c);
>>> -            __clear_bit(X86_FEATURE_AVX % 32, &c);
>>> +            /*
>>> +             * Delete the PVH condition when HVMLite formally replaces PVH,
>>> +             * and HVM guests no longer enter a PV codepath.
>>> +             */
>>> +
>>> +            /* OSXSAVE cleared by pv_featureset.  Fast-forward CR4 back in. */
>>> +            if ( (is_pv_domain(currd) && guest_kernel_mode(curr, regs) &&
>>> +                  (read_cr4() & X86_CR4_OSXSAVE)) ||
>>> +                 (curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_OSXSAVE) )
>>> +                c |= cpufeat_mask(X86_FEATURE_OSXSAVE);
>>>          }
>> The is_pv_domain() is now redundant with the is_pvh_domain()
>> earlier on, and it would likely end up confusing the reader if on
>> the right side of the || then ->arch.pv_vcpu is being referenced.
> 
> I specifically chose to order the code like this to make it easier to
> remove the is_pvh_domain() conditional in the future, without having to
> re-edit the PV path.
> 
> This layout matches the OSPKE version, and I would prefer to keep it
> this way unless you really insist on changing it.

Well, after removing is_pvh_domain() the is_pv_domain() still
won't be needed here, or would also be needed to guard the
curr->arch.pv_vcpu access. So yes, I insist on _some_ change
to make the whole thing consistent.

Jan
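
One consistent shape for the check, as a minimal sketch only: the PV test is
hoisted so that it guards both halves of the ||.  The struct and its field
names are stand-ins invented for illustration (the real code reads these
values through is_pv_domain(), guest_kernel_mode(), read_cr4() and
curr->arch.pv_vcpu.ctrlreg[4]), and this is not necessarily the form that was
eventually committed.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_CR4_OSXSAVE (1u << 18)  /* architectural CR4.OSXSAVE bit */

/* Hypothetical stand-ins for the state consulted by the check. */
struct vcpu_state {
    bool is_pv;          /* stand-in for is_pv_domain(currd) */
    bool kernel_mode;    /* stand-in for guest_kernel_mode(curr, regs) */
    uint32_t hw_cr4;     /* stand-in for read_cr4() */
    uint32_t guest_cr4;  /* stand-in for curr->arch.pv_vcpu.ctrlreg[4] */
};

/* Should OSXSAVE be fast-forwarded into the guest-visible leaf? */
static bool osxsave_visible(const struct vcpu_state *v)
{
    /* The PV check now covers the pv_vcpu access as well. */
    if ( !v->is_pv )
        return false;

    return (v->kernel_mode && (v->hw_cr4 & X86_CR4_OSXSAVE)) ||
           (v->guest_cr4 & X86_CR4_OSXSAVE);
}
```

The other consistent option is to drop the PV test altogether and rely on the
enclosing condition.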


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 11/26] xen/x86: Improvements to in-hypervisor cpuid sanity checks
  2016-03-24 17:11       ` Jan Beulich
@ 2016-03-24 17:12         ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-03-24 17:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 24/03/16 17:11, Jan Beulich wrote:
>>>> On 24.03.16 at 18:01, <andrew.cooper3@citrix.com> wrote:
>> On 24/03/16 16:47, Jan Beulich wrote:
>>>>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
>>>> +        if ( !is_pvh_domain(currd) )
>>>>          {
>>>> -            __clear_bit(X86_FEATURE_XSAVE % 32, &c);
>>>> -            __clear_bit(X86_FEATURE_AVX % 32, &c);
>>>> +            /*
>>>> +             * Delete the PVH condition when HVMLite formally replaces PVH,
>>>> +             * and HVM guests no longer enter a PV codepath.
>>>> +             */
>>>> +
>>>> +            /* OSXSAVE cleared by pv_featureset.  Fast-forward CR4 back in. */
>>>> +            if ( (is_pv_domain(currd) && guest_kernel_mode(curr, regs) &&
>>>> +                  (read_cr4() & X86_CR4_OSXSAVE)) ||
>>>> +                 (curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_OSXSAVE) )
>>>> +                c |= cpufeat_mask(X86_FEATURE_OSXSAVE);
>>>>          }
>>> The is_pv_domain() is now redundant with the is_pvh_domain()
>>> earlier on, and it would likely end up confusing the reader if on
>>> the right side of the || then ->arch.pv_vcpu is being referenced.
>> I specifically chose to order the code like this to make it easier to
>> remove the is_pvh_domain() conditional in the future, without having to
>> re-edit the PV path.
>>
>> This layout matches the OSPKE version, and I would prefer to keep it
>> this way unless you really insist on changing it.
> Well, after removing is_pvh_domain() the is_pv_domain() still
> won't be needed here, or would also be needed to guard the
> curr->arch.pv_vcpu access. So yes, I insist on _some_ change
> to make the whole thing consistent.

Ah ok - I will tweak it.

~Andrew

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 26/26] tools/libxc: Calculate xstate cpuid leaf from guest information
  2016-03-23 16:36 ` [PATCH v4 26/26] tools/libxc: Calculate xstate cpuid leaf from guest information Andrew Cooper
@ 2016-03-24 17:20   ` Wei Liu
  2016-03-31  7:48   ` Jan Beulich
  1 sibling, 0 replies; 72+ messages in thread
From: Wei Liu @ 2016-03-24 17:20 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Ian Jackson, Wei Liu, Jan Beulich, Xen-devel

On Wed, Mar 23, 2016 at 04:36:29PM +0000, Andrew Cooper wrote:
> It is unsafe to generate the guests xstate leaves from host information, as it
> prevents the differences between hosts from being hidden.
> 
> In addition, some further improvements and corrections:
>  - don't discard the known flags in sub-leaves 2..63 ECX
>  - zap sub-leaves beyond 62
>  - zap all bits in leaf 1, EBX/ECX.  No XSS features are currently supported.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 10/26] xen/x86: Improve disabling of features which have dependencies
  2016-03-23 16:36 ` [PATCH v4 10/26] xen/x86: Improve disabling of features which have dependencies Andrew Cooper
@ 2016-03-28 15:18   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 72+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-03-28 15:18 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

On Wed, Mar 23, 2016 at 04:36:13PM +0000, Andrew Cooper wrote:
> APIC and XSAVE have dependent features, which also need disabling if Xen
> chooses to disable a feature.
> 
> Use setup_clear_cpu_cap() rather than clear_bit(), as it takes care of
> dependent features as well.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <JBeulich@suse.com>

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
> v2: Move boolean_param() adjacent to use_xsave in xstate_init()
> ---
>  xen/arch/x86/apic.c       |  2 +-
>  xen/arch/x86/cpu/common.c | 12 +++---------
>  xen/arch/x86/xstate.c     |  6 +++++-
>  3 files changed, 9 insertions(+), 11 deletions(-)
> 
> diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c
> index b9601ad..8df5bd3 100644
> --- a/xen/arch/x86/apic.c
> +++ b/xen/arch/x86/apic.c
> @@ -1349,7 +1349,7 @@ void pmu_apic_interrupt(struct cpu_user_regs *regs)
>  int __init APIC_init_uniprocessor (void)
>  {
>      if (enable_local_apic < 0)
> -        __clear_bit(X86_FEATURE_APIC, boot_cpu_data.x86_capability);
> +        setup_clear_cpu_cap(X86_FEATURE_APIC);
>  
>      if (!smp_found_config && !cpu_has_apic) {
>          skip_ioapic_setup = 1;
> diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
> index 0942b44..b5c023f 100644
> --- a/xen/arch/x86/cpu/common.c
> +++ b/xen/arch/x86/cpu/common.c
> @@ -16,9 +16,6 @@
>  
>  #include "cpu.h"
>  
> -static bool_t use_xsave = 1;
> -boolean_param("xsave", use_xsave);
> -
>  bool_t opt_arat = 1;
>  boolean_param("arat", opt_arat);
>  
> @@ -341,12 +338,6 @@ void identify_cpu(struct cpuinfo_x86 *c)
>  	if (this_cpu->c_init)
>  		this_cpu->c_init(c);
>  
> -        /* Initialize xsave/xrstor features */
> -	if ( !use_xsave )
> -		__clear_bit(X86_FEATURE_XSAVE, boot_cpu_data.x86_capability);
> -
> -	if ( cpu_has_xsave )
> -		xstate_init(c);
>  
>     	if ( !opt_pku )
>  		setup_clear_cpu_cap(X86_FEATURE_PKU);
> @@ -370,6 +361,9 @@ void identify_cpu(struct cpuinfo_x86 *c)
>  
>  	/* Now the feature flags better reflect actual CPU features! */
>  
> +	if ( cpu_has_xsave )
> +		xstate_init(c);
> +
>  #ifdef NOISY_CAPS
>  	printk(KERN_DEBUG "CPU: After all inits, caps:");
>  	for (i = 0; i < NCAPINTS; i++)
> diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
> index f649405..5060704 100644
> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -502,11 +502,15 @@ unsigned int xstate_ctxt_size(u64 xcr0)
>  /* Collect the information of processor's extended state */
>  void xstate_init(struct cpuinfo_x86 *c)
>  {
> +    static bool_t __initdata use_xsave = 1;
> +    boolean_param("xsave", use_xsave);
> +
>      bool_t bsp = c == &boot_cpu_data;
>      u32 eax, ebx, ecx, edx;
>      u64 feature_mask;
>  
> -    if ( boot_cpu_data.cpuid_level < XSTATE_CPUID )
> +    if ( (bsp && !use_xsave) ||
> +         boot_cpu_data.cpuid_level < XSTATE_CPUID )
>      {
>          BUG_ON(!bsp);
>          setup_clear_cpu_cap(X86_FEATURE_XSAVE);
> -- 
> 2.1.4
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
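
The difference between a bare clear_bit() and a dependency-aware clear, as
described in the commit message for patch 10, can be sketched as follows.  The
feature indices, the dependency table and the function name are all
hypothetical and chosen for illustration; the real logic lives behind
setup_clear_cpu_cap() and is not reproduced here.

```c
#include <stdint.h>

/* Hypothetical feature indices, for illustration only. */
enum { FEAT_XSAVE, FEAT_AVX, FEAT_AVX2, FEAT_APIC, FEAT_X2APIC, FEAT_COUNT };

/* deps[f] is a bitmap of the features that directly require feature f. */
static const uint32_t deps[FEAT_COUNT] = {
    [FEAT_XSAVE] = 1u << FEAT_AVX,
    [FEAT_AVX]   = 1u << FEAT_AVX2,
    [FEAT_APIC]  = 1u << FEAT_X2APIC,
};

/* Clear feature f and, transitively, everything that depends on it. */
static uint32_t clear_with_deps(uint32_t caps, unsigned int f)
{
    uint32_t to_clear = 1u << f;  /* features scheduled for removal */
    uint32_t visited = 0;         /* features whose dependents are queued */

    while ( to_clear & ~visited )
    {
        for ( unsigned int i = 0; i < FEAT_COUNT; i++ )
        {
            uint32_t bit = 1u << i;

            if ( (to_clear & bit) && !(visited & bit) )
            {
                visited |= bit;
                to_clear |= deps[i];
            }
        }
    }

    return caps & ~to_clear;
}
```

Clearing FEAT_XSAVE from a full set also removes FEAT_AVX and FEAT_AVX2, which
a plain clear_bit() on the XSAVE bit would leave behind.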

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 11/26] xen/x86: Improvements to in-hypervisor cpuid sanity checks
  2016-03-23 16:36 ` [PATCH v4 11/26] xen/x86: Improvements to in-hypervisor cpuid sanity checks Andrew Cooper
  2016-03-24 15:38   ` Andrew Cooper
  2016-03-24 16:47   ` Jan Beulich
@ 2016-03-28 15:29   ` Konrad Rzeszutek Wilk
  2016-04-05 15:25     ` Andrew Cooper
  2 siblings, 1 reply; 72+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-03-28 15:29 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Jan Beulich, Xen-devel

On Wed, Mar 23, 2016 at 04:36:14PM +0000, Andrew Cooper wrote:
> Currently, {pv,hvm}_cpuid() has a large quantity of essentially-static logic
> for modifying the features visible to a guest.  A lot of this can be subsumed
> by {pv,hvm}_featuremask, which identify the features available on this
> hardware which could be given to a PV or HVM guest.
> 
> This is a step in the direction of full per-domain cpuid policies, but lots
> more development is needed for that.  As a result, the static checks are
> simplified, but the dynamic checks need to remain for now.
> 
> As a side effect, some of the logic for special features can be improved.
> OSXSAVE and OSPKE will be automatically cleared because of being absent in the
> featuremask.  This allows the fast-forward logic to be more simple.
> 
> In addition, there are some corrections to the existing logic:
> 
>  * Hiding PSE36 out of PAE mode is architecturally wrong.  It turns out that
>    it was a bugfix for running HyperV under Xen, which wanted to see PSE36
>    even after choosing to use PAE paging.  PSE36 is not supported by shadow
>    paging, so is hidden from non-HAP guests, but is still visible for HAP
>    guests.
>  * Changing the visibility of RDTSCP based on host TSC stability or virtual
>    TSC mode is bogus, so dropped.

Why is it bogus? It is a PV ABI type CPUID.

Independently of that, you would also need to modify the tsc_mode.txt file and
all uses of 'tsc_mode=3'.



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 12/26] x86/cpu: Move set_cpumask() calls into c_early_init()
  2016-03-23 16:36 ` [PATCH v4 12/26] x86/cpu: Move set_cpumask() calls into c_early_init() Andrew Cooper
@ 2016-03-28 15:55   ` Konrad Rzeszutek Wilk
  2016-04-05 16:19     ` Andrew Cooper
  0 siblings, 1 reply; 72+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-03-28 15:55 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Jan Beulich, Xen-devel

On Wed, Mar 23, 2016 at 04:36:15PM +0000, Andrew Cooper wrote:
> Before c/s 44e24f8567 "x86: don't call generic_identify() redundantly", the
> commandline-provided masks would take effect in Xen's view of the features.

s/the// ?

Or perhaps s/the/cpuid/ ?
> 
> As the masks got applied after the query for features, the redundant call to
> generic_identify() would clobber the pre-masking feature information with the
> post-masking information.
> 
> Move the set_cpumask() calls into c_early_init() so their effects take place

s/their effects take/its effect takes/

> before the main query for features in generic_identify().

.. and unifying the behavior of all c_early_init() functions?

> 
> The cpuid_mask_* command line parameters now limit the entire system, a
> feature XenServer was relying on for testing purposes.  Subsequent changes

.. what are those changes? Could you mention the title of the patch perhaps?

> will cause the mask MSRs to be context switched per-domain, removing the need
> to use the command line parameters for heterogeneous levelling purposes.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Besides those little nitpicks:

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> ---
>  xen/arch/x86/cpu/amd.c   |  8 ++++++--
>  xen/arch/x86/cpu/intel.c | 34 +++++++++++++++++-----------------
>  2 files changed, 23 insertions(+), 19 deletions(-)
> 
> diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
> index 47a38c6..5516777 100644
> --- a/xen/arch/x86/cpu/amd.c
> +++ b/xen/arch/x86/cpu/amd.c
> @@ -407,6 +407,11 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
>                                                           c->cpu_core_id);
>  }
>  
> +static void early_init_amd(struct cpuinfo_x86 *c)
> +{
> +	set_cpuidmask(c);
> +}
> +
>  static void init_amd(struct cpuinfo_x86 *c)
>  {
>  	u32 l, h;
> @@ -595,14 +600,13 @@ static void init_amd(struct cpuinfo_x86 *c)
>  	if ((smp_processor_id() == 1) && !cpu_has(c, X86_FEATURE_ITSC))
>  		disable_c1_ramping();
>  
> -	set_cpuidmask(c);
> -
>  	check_syscfg_dram_mod_en();
>  }
>  
>  static const struct cpu_dev amd_cpu_dev = {
>  	.c_vendor	= "AMD",
>  	.c_ident 	= { "AuthenticAMD" },
> +	.c_early_init	= early_init_amd,
>  	.c_init		= init_amd,
>  };
>  
> diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
> index bdf89f6..ad22375 100644
> --- a/xen/arch/x86/cpu/intel.c
> +++ b/xen/arch/x86/cpu/intel.c
> @@ -189,6 +189,23 @@ static void early_init_intel(struct cpuinfo_x86 *c)
>  	if (boot_cpu_data.x86 == 0xF && boot_cpu_data.x86_model == 3 &&
>  	    (boot_cpu_data.x86_mask == 3 || boot_cpu_data.x86_mask == 4))
>  		paddr_bits = 36;
> +
> +	if (c == &boot_cpu_data && c->x86 == 6) {
> +		if (probe_intel_cpuid_faulting())
> +			__set_bit(X86_FEATURE_CPUID_FAULTING,
> +				  c->x86_capability);
> +	} else if (boot_cpu_has(X86_FEATURE_CPUID_FAULTING)) {
> +		BUG_ON(!probe_intel_cpuid_faulting());
> +		__set_bit(X86_FEATURE_CPUID_FAULTING, c->x86_capability);
> +	}
> +
> +	if (!cpu_has_cpuid_faulting)
> +		set_cpuidmask(c);
> +	else if ((c == &boot_cpu_data) &&
> +		 (~(opt_cpuid_mask_ecx & opt_cpuid_mask_edx &
> +		    opt_cpuid_mask_ext_ecx & opt_cpuid_mask_ext_edx &
> +		    opt_cpuid_mask_xsave_eax)))
> +		printk("No CPUID feature masking support available\n");
>  }
>  
>  /*
> @@ -258,23 +275,6 @@ static void init_intel(struct cpuinfo_x86 *c)
>  		detect_ht(c);
>  	}
>  
> -	if (c == &boot_cpu_data && c->x86 == 6) {
> -		if (probe_intel_cpuid_faulting())
> -			__set_bit(X86_FEATURE_CPUID_FAULTING,
> -				  c->x86_capability);
> -	} else if (boot_cpu_has(X86_FEATURE_CPUID_FAULTING)) {
> -		BUG_ON(!probe_intel_cpuid_faulting());
> -		__set_bit(X86_FEATURE_CPUID_FAULTING, c->x86_capability);
> -	}
> -
> -	if (!cpu_has_cpuid_faulting)
> -		set_cpuidmask(c);
> -	else if ((c == &boot_cpu_data) &&
> -		 (~(opt_cpuid_mask_ecx & opt_cpuid_mask_edx &
> -		    opt_cpuid_mask_ext_ecx & opt_cpuid_mask_ext_edx &
> -		    opt_cpuid_mask_xsave_eax)))
> -		printk("No CPUID feature masking support available\n");
> -
>  	/* Work around errata */
>  	Intel_errata_workarounds(c);
>  
> -- 
> 2.1.4
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
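
The ordering argument in the commit message for patch 12 can be illustrated
with a toy model.  Everything below is a stand-in invented for illustration:
query_features() plays the role of the feature query in generic_identify(),
and apply_mask() plays the role of set_cpuidmask().  No real MSR access is
modelled.

```c
#include <stdint.h>

/* Hypothetical model of a CPU whose CPUID output is filtered by a mask. */
struct fake_cpu {
    uint32_t hw_features;  /* what the silicon implements */
    uint32_t mask_msr;     /* all-ones means "no masking" */
};

/* Stand-in for the feature query: hardware features seen through the mask. */
static uint32_t query_features(const struct fake_cpu *cpu)
{
    return cpu->hw_features & cpu->mask_msr;
}

/* Stand-in for programming the command-line mask into the CPU. */
static void apply_mask(struct fake_cpu *cpu, uint32_t opt_mask)
{
    cpu->mask_msr = opt_mask;
}

/* Mask applied after the query: the recorded view is unmasked. */
static uint32_t view_mask_late(struct fake_cpu cpu, uint32_t opt_mask)
{
    uint32_t view = query_features(&cpu);

    apply_mask(&cpu, opt_mask);
    return view;
}

/* Mask applied in the early-init hook, before the query. */
static uint32_t view_mask_early(struct fake_cpu cpu, uint32_t opt_mask)
{
    apply_mask(&cpu, opt_mask);
    return query_features(&cpu);
}
```

In this model only the early variant makes the hypervisor's own view of the
features respect the command-line mask, which is the behaviour the commit
message says the cpuid_mask_* parameters should have.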

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 13/26] x86/cpu: Sysctl and common infrastructure for levelling context switching
  2016-03-23 16:36 ` [PATCH v4 13/26] x86/cpu: Sysctl and common infrastructure for levelling context switching Andrew Cooper
  2016-03-24 16:58   ` Jan Beulich
@ 2016-03-28 16:12   ` Konrad Rzeszutek Wilk
  2016-04-05 16:33     ` Andrew Cooper
  2016-03-28 17:37   ` Konrad Rzeszutek Wilk
  2 siblings, 1 reply; 72+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-03-28 16:12 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Jan Beulich, Xen-devel

On Wed, Mar 23, 2016 at 04:36:16PM +0000, Andrew Cooper wrote:
> A toolstack needs to know how much control Xen has over the visible cpuid
> values in PV guests.  Provide an explicit mechanism to query what Xen is
> capable of.
> 
> This interface will currently report no capabilities.  This change is
> scaffolding for future patches, which will introduce detection and switching
> logic, after which the interface will report hardware capabilities correctly.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> 
> v2:
>  * s/cpumasks/cpuidmasks/
> v3:
>  * Reintroduce XEN_SYSCTL_get_levelling_caps (requested by Joao for some
>    development he has planned).

s/some/libvirt/

>  * Rename to XEN_SYSCTL_get_cpu_levelling_caps, and rename the constants to
>    match the Xen command line options.
> v4:
>  * Move declarations from processor.h to cpuid.h
>  * API corrections for XEN_SYSCTL_get_levelling_caps
> ---
>  xen/arch/x86/cpu/common.c        |  6 ++++++
>  xen/arch/x86/sysctl.c            |  6 ++++++
>  xen/include/asm-x86/cpufeature.h |  1 +
>  xen/include/asm-x86/cpuid.h      | 32 ++++++++++++++++++++++++++++++++
>  xen/include/public/sysctl.h      | 23 +++++++++++++++++++++++
>  5 files changed, 68 insertions(+)
> 
> diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
> index b5c023f..7ef75b0 100644
> --- a/xen/arch/x86/cpu/common.c
> +++ b/xen/arch/x86/cpu/common.c
> @@ -36,6 +36,12 @@ integer_param("cpuid_mask_ext_ecx", opt_cpuid_mask_ext_ecx);
>  unsigned int opt_cpuid_mask_ext_edx = ~0u;
>  integer_param("cpuid_mask_ext_edx", opt_cpuid_mask_ext_edx);
>  
> +unsigned int __initdata expected_levelling_cap;
> +unsigned int __read_mostly levelling_caps;
> +
> +DEFINE_PER_CPU(struct cpuidmasks, cpuidmasks);
> +struct cpuidmasks __read_mostly cpuidmask_defaults;
> +

Stray changes?

>  const struct cpu_dev *__read_mostly cpu_devs[X86_VENDOR_NUM] = {};
>  
>  unsigned int paddr_bits __read_mostly = 36;
> diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
> index 58cbd70..f68cbec 100644
> --- a/xen/arch/x86/sysctl.c
> +++ b/xen/arch/x86/sysctl.c
> @@ -190,6 +190,12 @@ long arch_do_sysctl(
>          }
>          break;
>  
> +    case XEN_SYSCTL_get_cpu_levelling_caps:
> +        sysctl->u.cpu_levelling_caps.caps = levelling_caps;
> +        if ( __copy_field_to_guest(u_sysctl, sysctl, u.cpu_levelling_caps.caps) )
> +            ret = -EFAULT;
> +        break;
> +


You are missing the XSM checks in flask_sysctl and the proper label in attributes files
(and also the default policy in xen.te).

>      default:
>          ret = -ENOSYS;
>          break;
> diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
> index e29b024..84d3220 100644
> --- a/xen/include/asm-x86/cpufeature.h
> +++ b/xen/include/asm-x86/cpufeature.h
> @@ -81,6 +81,7 @@
>  #define cpu_has_xsavec		boot_cpu_has(X86_FEATURE_XSAVEC)
>  #define cpu_has_xgetbv1		boot_cpu_has(X86_FEATURE_XGETBV1)
>  #define cpu_has_xsaves		boot_cpu_has(X86_FEATURE_XSAVES)
> +#define cpu_has_hypervisor	boot_cpu_has(X86_FEATURE_HYPERVISOR)
>  
>  enum _cache_type {
>      CACHE_TYPE_NULL = 0,
> diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
> index 4725672..9a21c25 100644
> --- a/xen/include/asm-x86/cpuid.h
> +++ b/xen/include/asm-x86/cpuid.h
> @@ -3,6 +3,7 @@
>  
>  #include <asm/cpufeatureset.h>
>  #include <asm/cpuid-autogen.h>
> +#include <asm/percpu.h>
>  
>  #define FSCAPINTS FEATURESET_NR_ENTRIES
>  
> @@ -18,6 +19,7 @@
>  
>  #ifndef __ASSEMBLY__
>  #include <xen/types.h>
> +#include <public/sysctl.h>
>  
>  extern const uint32_t known_features[FSCAPINTS];
>  extern const uint32_t special_features[FSCAPINTS];
> @@ -31,6 +33,36 @@ void calculate_featuresets(void);
>  
>  const uint32_t *lookup_deep_deps(uint32_t feature);
>  
> +/*
> + * Expected levelling capabilities (given cpuid vendor/family information),
> + * and levelling capabilities actually available (given MSR probing).
> + */
> +#define LCAP_faulting XEN_SYSCTL_CPU_LEVELCAP_faulting
> +#define LCAP_1cd      (XEN_SYSCTL_CPU_LEVELCAP_ecx |        \
> +                       XEN_SYSCTL_CPU_LEVELCAP_edx)
> +#define LCAP_e1cd     (XEN_SYSCTL_CPU_LEVELCAP_extd_ecx |   \
> +                       XEN_SYSCTL_CPU_LEVELCAP_extd_edx)
> +#define LCAP_Da1      XEN_SYSCTL_CPU_LEVELCAP_xsave_eax
> +#define LCAP_6c       XEN_SYSCTL_CPU_LEVELCAP_thermal_ecx
> +#define LCAP_7ab0     (XEN_SYSCTL_CPU_LEVELCAP_l7s0_eax |   \
> +                       XEN_SYSCTL_CPU_LEVELCAP_l7s0_ebx)
> +extern unsigned int expected_levelling_cap, levelling_caps;
> +
> +struct cpuidmasks
> +{
> +    uint64_t _1cd;
> +    uint64_t e1cd;
> +    uint64_t Da1;
> +    uint64_t _6c;
> +    uint64_t _7ab0;
> +};
> +
> +/* Per CPU shadows of masking MSR values, for lazy context switching. */
> +DECLARE_PER_CPU(struct cpuidmasks, cpuidmasks);

Ditto? Should these be in another patchset?

Why not make this patch introduce only the sub-op, and squash these
declarations into the first patch that actually uses the cpuidmasks?

> +
> +/* Default masking MSR values, calculated at boot. */
> +extern struct cpuidmasks cpuidmask_defaults;
> +
>  #endif /* __ASSEMBLY__ */
>  #endif /* !__X86_CPUID_H__ */
>  
> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
> index 96680eb..1ab16db 100644
> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -766,6 +766,27 @@ struct xen_sysctl_tmem_op {
>  typedef struct xen_sysctl_tmem_op xen_sysctl_tmem_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_tmem_op_t);
>  
> +/*
> + * XEN_SYSCTL_get_cpu_levelling_caps (x86 specific)
> + *
> + * Return hardware capabilities concerning masking or faulting of the cpuid
> + * instruction for PV guests.

Just PV? Not HVM?

> + */
> +struct xen_sysctl_cpu_levelling_caps {

I presume these are the /*IN*/ args?


> +#define XEN_SYSCTL_CPU_LEVELCAP_faulting    (1ul <<  0) /* CPUID faulting    */

Perhaps also point to what they look like in the Xen header file?

And mention that they differ in word placement from what Linux has
(which in turn impacts how a toolstack may mash this together).

> +#define XEN_SYSCTL_CPU_LEVELCAP_ecx         (1ul <<  1) /* 0x00000001.ecx    */
> +#define XEN_SYSCTL_CPU_LEVELCAP_edx         (1ul <<  2) /* 0x00000001.edx    */
> +#define XEN_SYSCTL_CPU_LEVELCAP_extd_ecx    (1ul <<  3) /* 0x80000001.ecx    */
> +#define XEN_SYSCTL_CPU_LEVELCAP_extd_edx    (1ul <<  4) /* 0x80000001.edx    */
> +#define XEN_SYSCTL_CPU_LEVELCAP_xsave_eax   (1ul <<  5) /* 0x0000000D:1.eax  */
> +#define XEN_SYSCTL_CPU_LEVELCAP_thermal_ecx (1ul <<  6) /* 0x00000006.ecx    */
> +#define XEN_SYSCTL_CPU_LEVELCAP_l7s0_eax    (1ul <<  7) /* 0x00000007:0.eax  */
> +#define XEN_SYSCTL_CPU_LEVELCAP_l7s0_ebx    (1ul <<  8) /* 0x00000007:0.ebx  */
> +    uint32_t caps;
> +};
> +typedef struct xen_sysctl_cpu_levelling_caps xen_sysctl_cpu_levelling_caps_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cpu_levelling_caps_t);
> +
>  struct xen_sysctl {
>      uint32_t cmd;
>  #define XEN_SYSCTL_readconsole                    1
> @@ -791,6 +812,7 @@ struct xen_sysctl {
>  #define XEN_SYSCTL_pcitopoinfo                   22
>  #define XEN_SYSCTL_psr_cat_op                    23
>  #define XEN_SYSCTL_tmem_op                       24
> +#define XEN_SYSCTL_get_cpu_levelling_caps        25
>      uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
>      union {
>          struct xen_sysctl_readconsole       readconsole;
> @@ -816,6 +838,7 @@ struct xen_sysctl {
>          struct xen_sysctl_psr_cmt_op        psr_cmt_op;
>          struct xen_sysctl_psr_cat_op        psr_cat_op;
>          struct xen_sysctl_tmem_op           tmem_op;
> +        struct xen_sysctl_cpu_levelling_caps cpu_levelling_caps;
>          uint8_t                             pad[128];
>      } u;
>  };
> -- 
> 2.1.4
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
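
A consumer of the caps field from XEN_SYSCTL_get_cpu_levelling_caps might
decode it along these lines.  The bit definitions are copied from the quoted
sysctl.h hunk; the table and the function name are hypothetical, and how the
value is obtained from the hypervisor is deliberately left out rather than
guessed at.

```c
#include <stdint.h>
#include <stdio.h>

/* Bit definitions copied from the quoted sysctl.h hunk. */
#define XEN_SYSCTL_CPU_LEVELCAP_faulting    (1ul <<  0) /* CPUID faulting    */
#define XEN_SYSCTL_CPU_LEVELCAP_ecx         (1ul <<  1) /* 0x00000001.ecx    */
#define XEN_SYSCTL_CPU_LEVELCAP_edx         (1ul <<  2) /* 0x00000001.edx    */
#define XEN_SYSCTL_CPU_LEVELCAP_extd_ecx    (1ul <<  3) /* 0x80000001.ecx    */
#define XEN_SYSCTL_CPU_LEVELCAP_extd_edx    (1ul <<  4) /* 0x80000001.edx    */
#define XEN_SYSCTL_CPU_LEVELCAP_xsave_eax   (1ul <<  5) /* 0x0000000D:1.eax  */
#define XEN_SYSCTL_CPU_LEVELCAP_thermal_ecx (1ul <<  6) /* 0x00000006.ecx    */
#define XEN_SYSCTL_CPU_LEVELCAP_l7s0_eax    (1ul <<  7) /* 0x00000007:0.eax  */
#define XEN_SYSCTL_CPU_LEVELCAP_l7s0_ebx    (1ul <<  8) /* 0x00000007:0.ebx  */

/* Hypothetical table pairing each capability bit with its quoted comment. */
static const struct {
    uint32_t bit;
    const char *what;
} levelcap_names[] = {
    { XEN_SYSCTL_CPU_LEVELCAP_faulting,    "CPUID faulting"   },
    { XEN_SYSCTL_CPU_LEVELCAP_ecx,         "0x00000001.ecx"   },
    { XEN_SYSCTL_CPU_LEVELCAP_edx,         "0x00000001.edx"   },
    { XEN_SYSCTL_CPU_LEVELCAP_extd_ecx,    "0x80000001.ecx"   },
    { XEN_SYSCTL_CPU_LEVELCAP_extd_edx,    "0x80000001.edx"   },
    { XEN_SYSCTL_CPU_LEVELCAP_xsave_eax,   "0x0000000D:1.eax" },
    { XEN_SYSCTL_CPU_LEVELCAP_thermal_ecx, "0x00000006.ecx"   },
    { XEN_SYSCTL_CPU_LEVELCAP_l7s0_eax,    "0x00000007:0.eax" },
    { XEN_SYSCTL_CPU_LEVELCAP_l7s0_ebx,    "0x00000007:0.ebx" },
};

/* Print one line per capability set in caps; return how many were set. */
static unsigned int describe_levelling_caps(uint32_t caps, FILE *out)
{
    unsigned int i, count = 0;

    for ( i = 0; i < sizeof(levelcap_names) / sizeof(levelcap_names[0]); i++ )
    {
        if ( caps & levelcap_names[i].bit )
        {
            fprintf(out, "available: %s\n", levelcap_names[i].what);
            count++;
        }
    }

    return count;
}
```

As the commit message notes, the interface reports no capabilities at this
point in the series, so such a decoder would print nothing until the later
detection patches are applied.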

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 13/26] x86/cpu: Sysctl and common infrastructure for levelling context switching
  2016-03-23 16:36 ` [PATCH v4 13/26] x86/cpu: Sysctl and common infrastructure for levelling context switching Andrew Cooper
  2016-03-24 16:58   ` Jan Beulich
  2016-03-28 16:12   ` Konrad Rzeszutek Wilk
@ 2016-03-28 17:37   ` Konrad Rzeszutek Wilk
  2 siblings, 0 replies; 72+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-03-28 17:37 UTC (permalink / raw)
  To: Andrew Cooper, Joao Martins; +Cc: Jan Beulich, Xen-devel

On Wed, Mar 23, 2016 at 04:36:16PM +0000, Andrew Cooper wrote:
> A toolstack needs to know how much control Xen has over the visible cpuid
> values in PV guests.  Provide an explicit mechanism to query what Xen is
> capable of.

If it is only for PV, why not give the subop a name that says so?

> 
> This interface will currently report no capabilities.  This change is
> scaffolding for future patches, which will introduce detection and switching
> logic, after which the interface will report hardware capabilities correctly.

> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>

Also I would suggest you add Joao so he can be informed of this patch.
(CCing him here).
> 
> v2:
>  * s/cpumasks/cpuidmasks/
> v3:
>  * Reintroduce XEN_SYSCTL_get_levelling_caps (requested by Joao for some
>    development he has planned).
>  * Rename to XEN_SYSCTL_get_cpu_levelling_caps, and rename the constants to
>    match the Xen command line options.
> v4:
>  * Move declarations from processor.h to cpuid.h
>  * API corrections for XEN_SYSCTL_get_levelling_caps
> ---
>  xen/arch/x86/cpu/common.c        |  6 ++++++
>  xen/arch/x86/sysctl.c            |  6 ++++++
>  xen/include/asm-x86/cpufeature.h |  1 +
>  xen/include/asm-x86/cpuid.h      | 32 ++++++++++++++++++++++++++++++++
>  xen/include/public/sysctl.h      | 23 +++++++++++++++++++++++
>  5 files changed, 68 insertions(+)
> 
> diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
> index b5c023f..7ef75b0 100644
> --- a/xen/arch/x86/cpu/common.c
> +++ b/xen/arch/x86/cpu/common.c
> @@ -36,6 +36,12 @@ integer_param("cpuid_mask_ext_ecx", opt_cpuid_mask_ext_ecx);
>  unsigned int opt_cpuid_mask_ext_edx = ~0u;
>  integer_param("cpuid_mask_ext_edx", opt_cpuid_mask_ext_edx);
>  
> +unsigned int __initdata expected_levelling_cap;
> +unsigned int __read_mostly levelling_caps;
> +
> +DEFINE_PER_CPU(struct cpuidmasks, cpuidmasks);
> +struct cpuidmasks __read_mostly cpuidmask_defaults;
> +
>  const struct cpu_dev *__read_mostly cpu_devs[X86_VENDOR_NUM] = {};
>  
>  unsigned int paddr_bits __read_mostly = 36;
> diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
> index 58cbd70..f68cbec 100644
> --- a/xen/arch/x86/sysctl.c
> +++ b/xen/arch/x86/sysctl.c
> @@ -190,6 +190,12 @@ long arch_do_sysctl(
>          }
>          break;
>  
> +    case XEN_SYSCTL_get_cpu_levelling_caps:
> +        sysctl->u.cpu_levelling_caps.caps = levelling_caps;
> +        if ( __copy_field_to_guest(u_sysctl, sysctl, u.cpu_levelling_caps.caps) )
> +            ret = -EFAULT;
> +        break;
> +
>      default:
>          ret = -ENOSYS;
>          break;
> diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
> index e29b024..84d3220 100644
> --- a/xen/include/asm-x86/cpufeature.h
> +++ b/xen/include/asm-x86/cpufeature.h
> @@ -81,6 +81,7 @@
>  #define cpu_has_xsavec		boot_cpu_has(X86_FEATURE_XSAVEC)
>  #define cpu_has_xgetbv1		boot_cpu_has(X86_FEATURE_XGETBV1)
>  #define cpu_has_xsaves		boot_cpu_has(X86_FEATURE_XSAVES)
> +#define cpu_has_hypervisor	boot_cpu_has(X86_FEATURE_HYPERVISOR)
>  
>  enum _cache_type {
>      CACHE_TYPE_NULL = 0,
> diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
> index 4725672..9a21c25 100644
> --- a/xen/include/asm-x86/cpuid.h
> +++ b/xen/include/asm-x86/cpuid.h
> @@ -3,6 +3,7 @@
>  
>  #include <asm/cpufeatureset.h>
>  #include <asm/cpuid-autogen.h>
> +#include <asm/percpu.h>
>  
>  #define FSCAPINTS FEATURESET_NR_ENTRIES
>  
> @@ -18,6 +19,7 @@
>  
>  #ifndef __ASSEMBLY__
>  #include <xen/types.h>
> +#include <public/sysctl.h>
>  
>  extern const uint32_t known_features[FSCAPINTS];
>  extern const uint32_t special_features[FSCAPINTS];
> @@ -31,6 +33,36 @@ void calculate_featuresets(void);
>  
>  const uint32_t *lookup_deep_deps(uint32_t feature);
>  
> +/*
> + * Expected levelling capabilities (given cpuid vendor/family information),
> + * and levelling capabilities actually available (given MSR probing).
> + */
> +#define LCAP_faulting XEN_SYSCTL_CPU_LEVELCAP_faulting
> +#define LCAP_1cd      (XEN_SYSCTL_CPU_LEVELCAP_ecx |        \
> +                       XEN_SYSCTL_CPU_LEVELCAP_edx)
> +#define LCAP_e1cd     (XEN_SYSCTL_CPU_LEVELCAP_extd_ecx |   \
> +                       XEN_SYSCTL_CPU_LEVELCAP_extd_edx)
> +#define LCAP_Da1      XEN_SYSCTL_CPU_LEVELCAP_xsave_eax
> +#define LCAP_6c       XEN_SYSCTL_CPU_LEVELCAP_thermal_ecx
> +#define LCAP_7ab0     (XEN_SYSCTL_CPU_LEVELCAP_l7s0_eax |   \
> +                       XEN_SYSCTL_CPU_LEVELCAP_l7s0_ebx)
> +extern unsigned int expected_levelling_cap, levelling_caps;
> +
> +struct cpuidmasks
> +{
> +    uint64_t _1cd;
> +    uint64_t e1cd;
> +    uint64_t Da1;
> +    uint64_t _6c;
> +    uint64_t _7ab0;
> +};
> +
> +/* Per CPU shadows of masking MSR values, for lazy context switching. */
> +DECLARE_PER_CPU(struct cpuidmasks, cpuidmasks);
> +
> +/* Default masking MSR values, calculated at boot. */
> +extern struct cpuidmasks cpuidmask_defaults;
> +
>  #endif /* __ASSEMBLY__ */
>  #endif /* !__X86_CPUID_H__ */
>  
> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
> index 96680eb..1ab16db 100644
> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -766,6 +766,27 @@ struct xen_sysctl_tmem_op {
>  typedef struct xen_sysctl_tmem_op xen_sysctl_tmem_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_tmem_op_t);
>  
> +/*
> + * XEN_SYSCTL_get_cpu_levelling_caps (x86 specific)
> + *
> + * Return hardware capabilities concerning masking or faulting of the cpuid
> + * instruction for PV guests.
> + */
> +struct xen_sysctl_cpu_levelling_caps {
> +#define XEN_SYSCTL_CPU_LEVELCAP_faulting    (1ul <<  0) /* CPUID faulting    */
> +#define XEN_SYSCTL_CPU_LEVELCAP_ecx         (1ul <<  1) /* 0x00000001.ecx    */
> +#define XEN_SYSCTL_CPU_LEVELCAP_edx         (1ul <<  2) /* 0x00000001.edx    */
> +#define XEN_SYSCTL_CPU_LEVELCAP_extd_ecx    (1ul <<  3) /* 0x80000001.ecx    */
> +#define XEN_SYSCTL_CPU_LEVELCAP_extd_edx    (1ul <<  4) /* 0x80000001.edx    */
> +#define XEN_SYSCTL_CPU_LEVELCAP_xsave_eax   (1ul <<  5) /* 0x0000000D:1.eax  */
> +#define XEN_SYSCTL_CPU_LEVELCAP_thermal_ecx (1ul <<  6) /* 0x00000006.ecx    */
> +#define XEN_SYSCTL_CPU_LEVELCAP_l7s0_eax    (1ul <<  7) /* 0x00000007:0.eax  */
> +#define XEN_SYSCTL_CPU_LEVELCAP_l7s0_ebx    (1ul <<  8) /* 0x00000007:0.ebx  */
> +    uint32_t caps;
> +};
> +typedef struct xen_sysctl_cpu_levelling_caps xen_sysctl_cpu_levelling_caps_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cpu_levelling_caps_t);
> +
>  struct xen_sysctl {
>      uint32_t cmd;
>  #define XEN_SYSCTL_readconsole                    1
> @@ -791,6 +812,7 @@ struct xen_sysctl {
>  #define XEN_SYSCTL_pcitopoinfo                   22
>  #define XEN_SYSCTL_psr_cat_op                    23
>  #define XEN_SYSCTL_tmem_op                       24
> +#define XEN_SYSCTL_get_cpu_levelling_caps        25
>      uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
>      union {
>          struct xen_sysctl_readconsole       readconsole;
> @@ -816,6 +838,7 @@ struct xen_sysctl {
>          struct xen_sysctl_psr_cmt_op        psr_cmt_op;
>          struct xen_sysctl_psr_cat_op        psr_cat_op;
>          struct xen_sysctl_tmem_op           tmem_op;
> +        struct xen_sysctl_cpu_levelling_caps cpu_levelling_caps;
>          uint8_t                             pad[128];
>      } u;
>  };
> -- 
> 2.1.4
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 14/26] x86/cpu: Rework AMD masking MSR setup
  2016-03-23 16:36 ` [PATCH v4 14/26] x86/cpu: Rework AMD masking MSR setup Andrew Cooper
@ 2016-03-28 18:55   ` Konrad Rzeszutek Wilk
  2016-04-05 16:44     ` Andrew Cooper
  0 siblings, 1 reply; 72+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-03-28 18:55 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

On Wed, Mar 23, 2016 at 04:36:17PM +0000, Andrew Cooper wrote:
> This patch is best reviewed as its end result rather than as a diff, as it
> rewrites almost all of the setup.
> 
> On the BSP, cpuid information is used to evaluate the potential available set
> of masking MSRs, and they are unconditionally probed, filling in the
> availability information and hardware defaults.
> 
> The command line parameters are then combined with the hardware defaults to
> further restrict the Xen default masking level.  Each cpu is then context
> switched into the default levelling state.

Context switched? Why not just say:

When booting up CPUs we set the same MSR mask for each CPU.

amd_ctxt_switch_levelling() can also be used (in patch XYZ) to switch
levelling at per-guest granularity.
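
For reference, the "context switch" wording seems to come from the lazy
shadow scheme in the hunk below: each CPU caches the last value written to a
masking MSR and only writes again when the wanted value differs. A minimal
sketch of that idea, with the MSR simulated by a plain variable. All names
here are made up for illustration, and the real LAZY() macro additionally
checks levelling_caps before writing:

```c
#include <stdint.h>

static uint64_t hw_mask_msr;     /* stand-in for the real masking MSR */
static uint64_t shadow_mask;     /* stand-in for this_cpu(cpuidmasks).field */
static unsigned int msr_writes;  /* counts simulated WRMSRs */

/* Write the MSR only if the wanted value differs from the cached one. */
static void lazy_switch_mask(uint64_t wanted)
{
    if (shadow_mask != wanted) {
        hw_mask_msr = wanted;    /* the real code issues wrmsr_amd() here */
        shadow_mask = wanted;
        msr_writes++;
    }
}
```

Switching twice to the same value costs one write, which is the point of
keeping the per-CPU shadow.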

> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <JBeulich@suse.com>
> ---
> v2:
>  * Provide extra information if opt_cpu_info
>  * Extra comment indicating the expected use of amd_ctxt_switch_levelling()
> v3:
>  * Fix the interaction of the fast-forward bits with the override MSRs.
>  * Style fixups.
> ---
>  xen/arch/x86/cpu/amd.c | 276 ++++++++++++++++++++++++++++++++-----------------
>  1 file changed, 179 insertions(+), 97 deletions(-)
> 
> diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
> index 5516777..0e1c8b9 100644
> --- a/xen/arch/x86/cpu/amd.c
> +++ b/xen/arch/x86/cpu/amd.c
> @@ -80,6 +80,13 @@ static inline int wrmsr_amd_safe(unsigned int msr, unsigned int lo,
>  	return err;
>  }
>  
> +static void wrmsr_amd(unsigned int msr, uint64_t val)
> +{
> +	asm volatile("wrmsr" ::
> +		     "c" (msr), "a" ((uint32_t)val),
> +		     "d" (val >> 32), "D" (0x9c5a203a));
> +}
> +
>  static const struct cpuidmask {
>  	uint16_t fam;
>  	char rev[2];
> @@ -126,126 +133,198 @@ static const struct cpuidmask *__init noinline get_cpuidmask(const char *opt)
>  }
>  
>  /*
> + * Sets caps in expected_levelling_cap, probes for the specified mask MSR, and
> + * set caps in levelling_caps if it is found.  Processors prior to Fam 10h
> + * required a 32-bit password for masking MSRs.  Returns the default value.
> + */
> +static uint64_t __init _probe_mask_msr(unsigned int msr, uint64_t caps)
> +{
> +	unsigned int hi, lo;
> +
> +	expected_levelling_cap |= caps;
> +
> +	if ((rdmsr_amd_safe(msr, &lo, &hi) == 0) &&
> +	    (wrmsr_amd_safe(msr, lo, hi) == 0))
> +		levelling_caps |= caps;
> +
> +	return ((uint64_t)hi << 32) | lo;
> +}
> +
> +/*
> + * Probe for the existance of the expected masking MSRs.  They might easily
> + * not be available if Xen is running virtualised.
> + */
> +static void __init noinline probe_masking_msrs(void)

Why noinline?

> +{
> +	const struct cpuinfo_x86 *c = &boot_cpu_data;
> +
> +	/*
> +	 * First, work out which masking MSRs we should have, based on
> +	 * revision and cpuid.
> +	 */
> +
> +	/* Fam11 doesn't support masking at all. */
> +	if (c->x86 == 0x11)
> +		return;
> +
> +	cpuidmask_defaults._1cd =
> +		_probe_mask_msr(MSR_K8_FEATURE_MASK, LCAP_1cd);
> +	cpuidmask_defaults.e1cd =
> +		_probe_mask_msr(MSR_K8_EXT_FEATURE_MASK, LCAP_e1cd);
> +
> +	if (c->cpuid_level >= 7)
> +		cpuidmask_defaults._7ab0 =
> +			_probe_mask_msr(MSR_AMD_L7S0_FEATURE_MASK, LCAP_7ab0);
> +
> +	if (c->x86 == 0x15 && c->cpuid_level >= 6 && cpuid_ecx(6))
> +		cpuidmask_defaults._6c =
> +			_probe_mask_msr(MSR_AMD_THRM_FEATURE_MASK, LCAP_6c);
> +
> +	/*
> +	 * Don't bother warning about a mismatch if virtualised.  These MSRs
> +	 * are not architectural and almost never virtualised.
> +	 */
> +	if ((expected_levelling_cap == levelling_caps) ||
> +	    cpu_has_hypervisor)
> +		return;
> +
> +	printk(XENLOG_WARNING "Mismatch between expected (%#x) "
> +	       "and real (%#x) levelling caps: missing %#x\n",
> +	       expected_levelling_cap, levelling_caps,
> +	       (expected_levelling_cap ^ levelling_caps) & levelling_caps);
> +	printk(XENLOG_WARNING "Fam %#x, model %#x level %#x\n",
> +	       c->x86, c->x86_model, c->cpuid_level);
> +	printk(XENLOG_WARNING
> +	       "If not running virtualised, please report a bug\n");

You already have a cpu_has_hypervisor check above? Or is this for
hypervisors which do not set that bit?
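
Separately, a note on the "missing" value printed by the first warning. This
is only my reading of the quoted hunk, not a confirmed bug: _probe_mask_msr()
always ORs caps into expected_levelling_cap and only sometimes into
levelling_caps, so the real caps look like a subset of the expected ones. If
that holds, ANDing the difference with levelling_caps always gives 0, and
ANDing with expected_levelling_cap gives the missing bits. A sketch with
made-up function names:

```c
#include <stdint.h>

/* The expression as posted in the printk(). */
static uint32_t missing_as_posted(uint32_t expected, uint32_t actual)
{
    return (expected ^ actual) & actual;
}

/* Alternative: bits that were expected but not found. */
static uint32_t missing_from_expected(uint32_t expected, uint32_t actual)
{
    return (expected ^ actual) & expected;
}
```

With expected = 0x1e (LCAP_1cd | LCAP_e1cd) and actual = 0x06 (LCAP_1cd
only), the first form yields 0 and the second yields 0x18.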

> +}
> +
> +/*
> + * Context switch levelling state to the next domain.  A parameter of NULL is
> + * used to context switch to the default host state, and is used by the BSP/AP
> + * startup code.

OK, how about:
> + */
> +static void amd_ctxt_switch_levelling(const struct domain *nextd)
> +{
> +	struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
> +	const struct cpuidmasks *masks = &cpuidmask_defaults;
> +

	ASSERT(!nextd && system_state != SYS_STATE_active); ?

> +#define LAZY(cap, msr, field)						\
> +	({								\
> +		if (unlikely(these_masks->field != masks->field) &&	\
> +		    ((levelling_caps & cap) == cap))			\
> +		{							\
> +			wrmsr_amd(msr, masks->field);			\
> +			these_masks->field = masks->field;		\
> +		}							\
> +	})
> +
> +	LAZY(LCAP_1cd,  MSR_K8_FEATURE_MASK,       _1cd);
> +	LAZY(LCAP_e1cd, MSR_K8_EXT_FEATURE_MASK,   e1cd);
> +	LAZY(LCAP_7ab0, MSR_AMD_L7S0_FEATURE_MASK, _7ab0);
> +	LAZY(LCAP_6c,   MSR_AMD_THRM_FEATURE_MASK, _6c);
> +
> +#undef LAZY
> +}
> +
> +/*
>   * Mask the features and extended features returned by CPUID.  Parameters are
>   * set from the boot line via two methods:
>   *
>   *   1) Specific processor revision string
>   *   2) User-defined masks
>   *
> - * The processor revision string parameter has precedene.
> + * The user-defined masks take precedence.
>   */
> -static void set_cpuidmask(const struct cpuinfo_x86 *c)
> +static void __init noinline amd_init_levelling(void)
>  {
> -	static unsigned int feat_ecx, feat_edx;
> -	static unsigned int extfeat_ecx, extfeat_edx;
> -	static unsigned int l7s0_eax, l7s0_ebx;
> -	static unsigned int thermal_ecx;
> -	static bool_t skip_feat, skip_extfeat;
> -	static bool_t skip_l7s0_eax_ebx, skip_thermal_ecx;
> -	static enum { not_parsed, no_mask, set_mask } status;
> -	unsigned int eax, ebx, ecx, edx;
> -
> -	if (status == no_mask)
> -		return;
> +	const struct cpuidmask *m = NULL;
>  
> -	if (status == set_mask)
> -		goto setmask;
> +	probe_masking_msrs();
>  
> -	ASSERT((status == not_parsed) && (c == &boot_cpu_data));
> -	status = no_mask;
> +	if (*opt_famrev != '\0') {
> +		m = get_cpuidmask(opt_famrev);
>  
> -	/* Fam11 doesn't support masking at all. */
> -	if (c->x86 == 0x11)
> -		return;
> +		if (!m)
> +			printk("Invalid processor string: %s\n", opt_famrev);
> +	}
>  
> -	if (~(opt_cpuid_mask_ecx & opt_cpuid_mask_edx &
> -	      opt_cpuid_mask_ext_ecx & opt_cpuid_mask_ext_edx &
> -	      opt_cpuid_mask_l7s0_eax & opt_cpuid_mask_l7s0_ebx &
> -	      opt_cpuid_mask_thermal_ecx)) {
> -		feat_ecx = opt_cpuid_mask_ecx;
> -		feat_edx = opt_cpuid_mask_edx;
> -		extfeat_ecx = opt_cpuid_mask_ext_ecx;
> -		extfeat_edx = opt_cpuid_mask_ext_edx;
> -		l7s0_eax = opt_cpuid_mask_l7s0_eax;
> -		l7s0_ebx = opt_cpuid_mask_l7s0_ebx;
> -		thermal_ecx = opt_cpuid_mask_thermal_ecx;
> -	} else if (*opt_famrev == '\0') {
> -		return;
> -	} else {
> -		const struct cpuidmask *m = get_cpuidmask(opt_famrev);
> +	if ((levelling_caps & LCAP_1cd) == LCAP_1cd) {
> +		uint32_t ecx, edx, tmp;
>  
> -		if (!m) {
> -			printk("Invalid processor string: %s\n", opt_famrev);
> -			printk("CPUID will not be masked\n");
> -			return;
> +		cpuid(0x00000001, &tmp, &tmp, &ecx, &edx);
> +
> +		if(~(opt_cpuid_mask_ecx & opt_cpuid_mask_edx)) {

Why the missing space?
> +			ecx &= opt_cpuid_mask_ecx;
> +			edx &= opt_cpuid_mask_edx;
> +		} else if (m) {
> +			ecx &= m->ecx;
> +			edx &= m->edx;
>  		}
> -		feat_ecx = m->ecx;
> -		feat_edx = m->edx;
> -		extfeat_ecx = m->ext_ecx;
> -		extfeat_edx = m->ext_edx;
> +
> +		/* Fast-forward bits - Must be set. */
> +		if (ecx & cpufeat_mask(X86_FEATURE_XSAVE))
> +			ecx |= cpufeat_mask(X86_FEATURE_OSXSAVE);
> +		edx |= cpufeat_mask(X86_FEATURE_APIC);
> +
> +		/* Allow the HYPERVISOR bit to be set via guest policy. */
> +		ecx |= cpufeat_mask(X86_FEATURE_HYPERVISOR);

Hmm. http://support.amd.com/TechDocs/52740_16h_Models_30h-3Fh_BKDG.pdf
pg 624 lists this (bit 63) as 'Reserved'. Should we really set it?
Ah, but earlier (pg 530) it says 'Reserved for use by hypervisor to indicate
guest status'.


> +
> +		cpuidmask_defaults._1cd = ((uint64_t)ecx << 32) | edx;

Considering the document says 'Reserved', should we preserve the bits that
were set by the initial probe that fills out cpuidmask_defaults?

The original code also had:
>  	}
>  
> -        /* Setting bits in the CPUID mask MSR that are not set in the
> -         * unmasked CPUID response can cause those bits to be set in the
> -         * masked response.  Avoid that by explicitly masking in software. */

 that comment in it. Would it make sense to keep it (or a reworked version,
since I wasn't exactly sure what it was saying)?
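
My reading of that comment (an assumption on my part, not something the
author confirmed): a 1 written to the mask MSR for a feature the CPU lacks
can show up in the masked CPUID output, so the requested mask has to be
limited to what CPUID really reports. The old code ANDed the requested mask
with the CPUID value; the new code starts from the CPUID value and ANDs the
command line mask into it, which is the same operation in the other order.
The fast-forward bits are then ORed in on purpose. A sketch for leaf 1 ECX,
using made-up helper names and the architectural bit positions for XSAVE
(26), OSXSAVE (27) and HYPERVISOR (31):

```c
#include <stdint.h>

#define ECX_XSAVE      (1u << 26)
#define ECX_OSXSAVE    (1u << 27)
#define ECX_HYPERVISOR (1u << 31)

/* Never allow a bit that the hardware does not report. */
static uint32_t clamp_to_hw(uint32_t hw, uint32_t requested)
{
    return hw & requested;
}

/* Clamp first, then add the bits which must always be allowed through. */
static uint32_t default_1c(uint32_t hw_ecx, uint32_t opt_ecx)
{
    uint32_t ecx = clamp_to_hw(hw_ecx, opt_ecx);

    if (ecx & ECX_XSAVE)
        ecx |= ECX_OSXSAVE;     /* fast-forward bit, follows XSAVE */
    ecx |= ECX_HYPERVISOR;      /* left for guest policy to decide */

    return ecx;
}
```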

> -        feat_ecx &= cpuid_ecx(0x00000001);
> -        feat_edx &= cpuid_edx(0x00000001);
> -        extfeat_ecx &= cpuid_ecx(0x80000001);
> -        extfeat_edx &= cpuid_edx(0x80000001);
> +	if ((levelling_caps & LCAP_e1cd) == LCAP_e1cd) {
> +		uint32_t ecx, edx, tmp;
>  
> -	status = set_mask;
> -	printk("Writing CPUID feature mask ECX:EDX -> %08Xh:%08Xh\n", 
> -	       feat_ecx, feat_edx);
> -	printk("Writing CPUID extended feature mask ECX:EDX -> %08Xh:%08Xh\n", 
> -	       extfeat_ecx, extfeat_edx);
> +		cpuid(0x80000001, &tmp, &tmp, &ecx, &edx);
>  
> -	if (c->cpuid_level >= 7)
> -		cpuid_count(7, 0, &eax, &ebx, &ecx, &edx);
> -	else
> -		ebx = eax = 0;
> -	if ((eax | ebx) && ~(l7s0_eax & l7s0_ebx)) {
> -		if (l7s0_eax > eax)
> -			l7s0_eax = eax;
> -		l7s0_ebx &= ebx;
> -		printk("Writing CPUID leaf 7 subleaf 0 feature mask EAX:EBX -> %08Xh:%08Xh\n",
> -		       l7s0_eax, l7s0_ebx);
> -	} else
> -		skip_l7s0_eax_ebx = 1;
> -
> -	/* Only Fam15 has the respective MSR. */
> -	ecx = c->x86 == 0x15 && c->cpuid_level >= 6 ? cpuid_ecx(6) : 0;
> -	if (ecx && ~thermal_ecx) {
> -		thermal_ecx &= ecx;
> -		printk("Writing CPUID thermal/power feature mask ECX -> %08Xh\n",
> -		       thermal_ecx);
> -	} else
> -		skip_thermal_ecx = 1;
> -
> - setmask:
> -	/* AMD processors prior to family 10h required a 32-bit password */
> -	if (!skip_feat &&
> -	    wrmsr_amd_safe(MSR_K8_FEATURE_MASK, feat_edx, feat_ecx)) {
> -		skip_feat = 1;
> -		printk("Failed to set CPUID feature mask\n");
> +		if(~(opt_cpuid_mask_ext_ecx & opt_cpuid_mask_ext_edx)) {

Here the space looks to be missing?
> +			ecx &= opt_cpuid_mask_ext_ecx;
> +			edx &= opt_cpuid_mask_ext_edx;
> +		} else if (m) {
> +			ecx &= m->ext_ecx;
> +			edx &= m->ext_edx;
> +		}
> +
> +		/* Fast-forward bits - Must be set. */
> +		edx |= cpufeat_mask(X86_FEATURE_APIC);
> +
> +		cpuidmask_defaults.e1cd = ((uint64_t)ecx << 32) | edx;

Should this be &= ?

>  	}
>  
> -	if (!skip_extfeat &&
> -	    wrmsr_amd_safe(MSR_K8_EXT_FEATURE_MASK, extfeat_edx, extfeat_ecx)) {
> -		skip_extfeat = 1;
> -		printk("Failed to set CPUID extended feature mask\n");
> +	if ((levelling_caps & LCAP_7ab0) == LCAP_7ab0) {
> +		uint32_t eax, ebx, tmp;
> +
> +		cpuid(0x00000007, &eax, &ebx, &tmp, &tmp);
> +
> +		if(~(opt_cpuid_mask_l7s0_eax & opt_cpuid_mask_l7s0_ebx)) {

Ditto.
> +			eax &= opt_cpuid_mask_l7s0_eax;
> +			ebx &= opt_cpuid_mask_l7s0_ebx;
> +		}
> +
> +		cpuidmask_defaults._7ab0 &= ((uint64_t)eax << 32) | ebx;
>  	}
>  
> -	if (!skip_l7s0_eax_ebx &&
> -	    wrmsr_amd_safe(MSR_AMD_L7S0_FEATURE_MASK, l7s0_ebx, l7s0_eax)) {
> -		skip_l7s0_eax_ebx = 1;
> -		printk("Failed to set CPUID leaf 7 subleaf 0 feature mask\n");
> +	if ((levelling_caps & LCAP_6c) == LCAP_6c) {
> +		uint32_t ecx = cpuid_ecx(6);
> +
> +		if (~opt_cpuid_mask_thermal_ecx)
> +			ecx &= opt_cpuid_mask_thermal_ecx;
> +
> +		cpuidmask_defaults._6c &= (~0ULL << 32) | ecx;


Is there any documentation about this? The BKDG from 03/2016 does not mention
this MSR (C001_1003). Ah, but it is mentioned in the docs for Family 15h. How nice.

>  	}
>  
> -	if (!skip_thermal_ecx &&
> -	    (rdmsr_amd_safe(MSR_AMD_THRM_FEATURE_MASK, &eax, &edx) ||
> -	     wrmsr_amd_safe(MSR_AMD_THRM_FEATURE_MASK, thermal_ecx, edx))){
> -		skip_thermal_ecx = 1;
> -		printk("Failed to set CPUID thermal/power feature mask\n");
> +	if (opt_cpu_info) {
> +		printk(XENLOG_INFO "Levelling caps: %#x\n", levelling_caps);
> +		printk(XENLOG_INFO
> +		       "MSR defaults: 1d 0x%08x, 1c 0x%08x, e1d 0x%08x, "
> +		       "e1c 0x%08x, 7a0 0x%08x, 7b0 0x%08x, 6c 0x%08x\n",
> +		       (uint32_t)cpuidmask_defaults._1cd,
> +		       (uint32_t)(cpuidmask_defaults._1cd >> 32),
> +		       (uint32_t)cpuidmask_defaults.e1cd,
> +		       (uint32_t)(cpuidmask_defaults.e1cd >> 32),
> +		       (uint32_t)(cpuidmask_defaults._7ab0 >> 32),
> +		       (uint32_t)cpuidmask_defaults._7ab0,
> +		       (uint32_t)cpuidmask_defaults._6c);

Why don't you bit shift cpuidmask_defaults._6c too?
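
Thinking about it a bit more, and going only by the hunk above: _1cd, e1cd
and _7ab0 pack two registers into one 64-bit value with the first register
in the high half, while _6c keeps its single ECX mask in the low 32 bits
(the update is "&= (~0ULL << 32) | ecx"). If that is right, a plain
(uint32_t) cast already yields the ECX value. A sketch of the two layouts,
with made-up helper names:

```c
#include <stdint.h>

/* Two-register masks: first register in bits 63:32, second in bits 31:0. */
static uint64_t pack_pair(uint32_t hi_reg, uint32_t lo_reg)
{
    return ((uint64_t)hi_reg << 32) | lo_reg;
}

/* Single-register mask: narrow the low 32 bits, leave the high half alone. */
static uint64_t narrow_low(uint64_t msr_default, uint32_t ecx)
{
    return msr_default & ((~0ULL << 32) | ecx);
}
```
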
>  	}
>  }
>  
> @@ -409,7 +488,10 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
>  
>  static void early_init_amd(struct cpuinfo_x86 *c)
>  {
> -	set_cpuidmask(c);
> +	if (c == &boot_cpu_data)
> +		amd_init_levelling();
> +
> +	amd_ctxt_switch_levelling(NULL);
>  }
>  
>  static void init_amd(struct cpuinfo_x86 *c)
> -- 
> 2.1.4
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 15/26] x86/cpu: Rework Intel masking/faulting setup
  2016-03-23 16:36 ` [PATCH v4 15/26] x86/cpu: Rework Intel masking/faulting setup Andrew Cooper
@ 2016-03-28 19:14   ` Konrad Rzeszutek Wilk
  2016-04-05 16:45     ` Andrew Cooper
  0 siblings, 1 reply; 72+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-03-28 19:14 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

> + * Context switch levelling state to the next domain.  A parameter of NULL is
> + * used to context switch to the default host state, and is used by the BSP/AP
> + * startup code.
> + */
> +static void intel_ctxt_switch_levelling(const struct domain *nextd)
> +{
> +	struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
> +	const struct cpuidmasks *masks = &cpuidmask_defaults;
> +

Same question as on the AMD patch: would it make sense to add an ASSERT
to make sure that !nextd && system_state != SYS_STATE_active?

.. snip..

> +static void __init noinline intel_init_levelling(void)
> +{
> +	if (opt_cpu_info) {
> +		printk(XENLOG_INFO "Levelling caps: %#x\n", levelling_caps);
> +
> +		if (!cpu_has_cpuid_faulting)
> +			printk(XENLOG_INFO
> +			       "MSR defaults: 1d 0x%08x, 1c 0x%08x, e1d 0x%08x, "
> +			       "e1c 0x%08x, Da1 0x%08x\n",
> +			       (uint32_t)(cpuidmask_defaults._1cd >> 32),
> +			       (uint32_t)cpuidmask_defaults._1cd,
> +			       (uint32_t)(cpuidmask_defaults.e1cd >> 32),
> +			       (uint32_t)cpuidmask_defaults.e1cd,
> +			       (uint32_t)cpuidmask_defaults.Da1);

Perhaps shift Da1, as there is no need to see the upper bits?

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 16/26] x86/cpu: Context switch cpuid masks and faulting state in context_switch()
  2016-03-23 16:36 ` [PATCH v4 16/26] x86/cpu: Context switch cpuid masks and faulting state in context_switch() Andrew Cooper
@ 2016-03-28 19:27   ` Konrad Rzeszutek Wilk
  2016-04-05 18:34     ` Andrew Cooper
  0 siblings, 1 reply; 72+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-03-28 19:27 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

On Wed, Mar 23, 2016 at 04:36:19PM +0000, Andrew Cooper wrote:
> A single ctxt_switch_levelling() function pointer is provided
> (defaulting to an empty nop), which is overridden in the appropriate
> $VENDOR_init_levelling().
> 
> set_cpuid_faulting() is made private and included within
> intel_ctxt_switch_levelling().
> 
> One functional change is that the faulting configuration is no longer special
> cased for dom0.  There was never any need to, and it will cause dom0 to

There was. See 1d6ffea6
    ACPI: add _PDC input override mechanism

And in Linux see xen_check_mwait().

> observe the same information through native and enlightened cpuid.

This will be a regression when it comes to ACPI C-states, as we won't
expose the deeper ones (C6 or such) on SandyBridge CPUs.

But looking at this document:
http://www.intel.com/content/dam/www/public/us/en/documents/application-notes/virtualization-technology-flexmigration-application-note.pdf

The CPUID masking discussion there is all about VM guests, but PV guests are
not really VMs (there is no VMCS container for them). Does that mean that if a
PV guest executes a 'native' CPUID, the results are not masked by CPUID
masking (or faulting)? I would think not since:

> @@ -154,6 +156,11 @@ static void intel_ctxt_switch_levelling(const struct domain *nextd)
>  	struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
>  	const struct cpuidmasks *masks = &cpuidmask_defaults;
>  
> +	if (cpu_has_cpuid_faulting) {
> +		set_cpuid_faulting(nextd && is_pv_domain(nextd));

Which would give us a NULL for Dom0. So no engaging of CPUID faulting for PV guests.
And I suppose the CPUID masking is only for guests in a VMCS container?
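
To make the question concrete, here is how I read the predicate, as a sketch
with a made-up stand-in for struct domain. It assumes what the commit message
says, namely that dom0 is no longer special cased and is passed in like any
other PV domain, with NULL reserved for the BSP/AP startup path. If that
assumption is wrong, the table below is wrong too:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in; the real struct domain is far larger. */
struct domain {
    bool is_pv;
};

/* Mirrors the argument passed to set_cpuid_faulting() in the hunk. */
static bool want_cpuid_faulting(const struct domain *nextd)
{
    return nextd && nextd->is_pv;
}
```

Under that reading: NULL gives false, any PV domain (dom0 included) gives
true, and an HVM domain gives false.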

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 17/26] x86/pv: Provide custom cpumasks for PV domains
  2016-03-23 16:36 ` [PATCH v4 17/26] x86/pv: Provide custom cpumasks for PV domains Andrew Cooper
@ 2016-03-28 19:40   ` Konrad Rzeszutek Wilk
  2016-04-05 16:55     ` Andrew Cooper
  0 siblings, 1 reply; 72+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-03-28 19:40 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

On Wed, Mar 23, 2016 at 04:36:20PM +0000, Andrew Cooper wrote:
> And use them in preference to cpumask_defaults on context switch.  HVM domains

Extra space before HVM
> must not be masked (to avoid interfering with cpuid calls within the guest),
> so always lazily context switch to the host default.

Could you please add:
Host default being set by cpuid_mask_* boot parameters.
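
For readers skimming the hunks below, the selection rule as I understand it
is: use the domain's own masks only for a PV domain that actually has them
allocated, and fall back to the host default in every other case (NULL, HVM,
or PV without an allocation). A sketch with simplified, made-up types; the
real fields live under nextd->arch.pv_domain:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct cpuidmasks {
    uint64_t _1cd;
};

/* Hypothetical flattened stand-in for struct domain. */
struct domain {
    bool is_pv;
    struct cpuidmasks *pv_cpuidmasks;
};

static struct cpuidmasks cpuidmask_defaults;

/* Pick the masks to load when switching to nextd. */
static const struct cpuidmasks *pick_masks(const struct domain *nextd)
{
    return (nextd && nextd->is_pv && nextd->pv_cpuidmasks)
        ? nextd->pv_cpuidmasks : &cpuidmask_defaults;
}
```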

> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <JBeulich@suse.com>
> ---
> v2:
>  * s/cpumasks/cpuidmasks/
>  * Use structure assignment
>  * Fix error path in arch_domain_create()
> v3:
>  * Indentation fixes.
>  * Only allocate PV cpuidmasks if the host is has cpumasks to use.
> ---
>  xen/arch/x86/cpu/amd.c       |  4 +++-
>  xen/arch/x86/cpu/intel.c     |  5 ++++-
>  xen/arch/x86/domain.c        | 14 ++++++++++++++
>  xen/include/asm-x86/domain.h |  2 ++
>  4 files changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
> index 484d4b0..8cb04f0 100644
> --- a/xen/arch/x86/cpu/amd.c
> +++ b/xen/arch/x86/cpu/amd.c
> @@ -206,7 +206,9 @@ static void __init noinline probe_masking_msrs(void)
>  static void amd_ctxt_switch_levelling(const struct domain *nextd)
>  {
>  	struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
> -	const struct cpuidmasks *masks = &cpuidmask_defaults;
> +	const struct cpuidmasks *masks =
> +		(nextd && is_pv_domain(nextd) && nextd->arch.pv_domain.cpuidmasks)
> +		? nextd->arch.pv_domain.cpuidmasks : &cpuidmask_defaults;
>  
>  #define LAZY(cap, msr, field)						\
>  	({								\
> diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
> index 71b1199..00a9987 100644
> --- a/xen/arch/x86/cpu/intel.c
> +++ b/xen/arch/x86/cpu/intel.c
> @@ -154,13 +154,16 @@ static void __init probe_masking_msrs(void)
>  static void intel_ctxt_switch_levelling(const struct domain *nextd)
>  {
>  	struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
> -	const struct cpuidmasks *masks = &cpuidmask_defaults;
> +	const struct cpuidmasks *masks;
>  
>  	if (cpu_has_cpuid_faulting) {
>  		set_cpuid_faulting(nextd && is_pv_domain(nextd));
>  		return;
>  	}
>  
> +	masks = (nextd && is_pv_domain(nextd) && nextd->arch.pv_domain.cpuidmasks)
> +		? nextd->arch.pv_domain.cpuidmasks : &cpuidmask_defaults;
> +
>  #define LAZY(msr, field)						\
>  	({								\
>  		if (unlikely(these_masks->field != masks->field) &&	\
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index abc7194..d0d9773 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -577,6 +577,14 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
>              goto fail;
>          clear_page(d->arch.pv_domain.gdt_ldt_l1tab);
>  
> +        if ( levelling_caps & ~LCAP_faulting )
> +        {
> +            d->arch.pv_domain.cpuidmasks = xmalloc(struct cpuidmasks);
> +            if ( !d->arch.pv_domain.cpuidmasks )
> +                goto fail;
> +            *d->arch.pv_domain.cpuidmasks = cpuidmask_defaults;
> +        }
> +
>          rc = create_perdomain_mapping(d, GDT_LDT_VIRT_START,
>                                        GDT_LDT_MBYTES << (20 - PAGE_SHIFT),
>                                        NULL, NULL);
> @@ -672,7 +680,10 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
>          paging_final_teardown(d);
>      free_perdomain_mappings(d);
>      if ( is_pv_domain(d) )
> +    {
> +        xfree(d->arch.pv_domain.cpuidmasks);
>          free_xenheap_page(d->arch.pv_domain.gdt_ldt_l1tab);
> +    }
>      psr_domain_free(d);
>      return rc;
>  }
> @@ -692,7 +703,10 @@ void arch_domain_destroy(struct domain *d)
>  
>      free_perdomain_mappings(d);
>      if ( is_pv_domain(d) )
> +    {
>          free_xenheap_page(d->arch.pv_domain.gdt_ldt_l1tab);
> +        xfree(d->arch.pv_domain.cpuidmasks);
> +    }
>  
>      free_xenheap_page(d->shared_info);
>      cleanup_domain_irq_mapping(d);
> diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
> index de60def..90f021f 100644
> --- a/xen/include/asm-x86/domain.h
> +++ b/xen/include/asm-x86/domain.h
> @@ -252,6 +252,8 @@ struct pv_domain
>  
>      /* map_domain_page() mapping cache. */
>      struct mapcache_domain mapcache;
> +
> +    struct cpuidmasks *cpuidmasks;
>  };
>  
>  struct monitor_write_data {
> -- 
> 2.1.4
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 18/26] x86/domctl: Update PV domain cpumasks when setting cpuid policy
  2016-03-23 16:36 ` [PATCH v4 18/26] x86/domctl: Update PV domain cpumasks when setting cpuid policy Andrew Cooper
  2016-03-24 17:04   ` Jan Beulich
@ 2016-03-28 19:51   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 72+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-03-28 19:51 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Jan Beulich, Xen-devel

On Wed, Mar 23, 2016 at 04:36:21PM +0000, Andrew Cooper wrote:
> This allows PV domains with different featuresets to observe different values
> from a native cpuid instruction, on supporting hardware.
> 
> It is important to leak the host view of HTT and CMP_LEGACY through to guests,
> even though they could be hidden.  These flags affect how to interpret other
> cpuid leaves which are not maskable.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 19/26] xen+tools: Export maximum host and guest cpu featuresets via SYSCTL
  2016-03-23 16:36 ` [PATCH v4 19/26] xen+tools: Export maximum host and guest cpu featuresets via SYSCTL Andrew Cooper
@ 2016-03-28 19:59   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 72+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-03-28 19:59 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Tim Deegan, Xen-devel

On Wed, Mar 23, 2016 at 04:36:22PM +0000, Andrew Cooper wrote:
> And provide stubs for toolstack use.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Wei Liu <wei.liu2@citrix.com>
> Acked-by: David Scott <dave@recoil.org>
> Acked-by: Jan Beulich <JBeulich@suse.com>
> ---
> CC: Tim Deegan <tim@xen.org>

You should also CC Daniel De Graaf..
> 
> v2:
>  * Rebased to use libxencall
>  * Improve hypercall documentation
> v3:
>  * Provide libxc implementation for XEN_SYSCTL_get_cpu_levelling_caps as well.
> v4:
>  * More const.
> ---
>  tools/libxc/include/xenctrl.h       |  4 +++
>  tools/libxc/xc_cpuid_x86.c          | 41 +++++++++++++++++++++++++++++
>  tools/ocaml/libs/xc/xenctrl.ml      |  3 +++
>  tools/ocaml/libs/xc/xenctrl.mli     |  4 +++
>  tools/ocaml/libs/xc/xenctrl_stubs.c | 35 +++++++++++++++++++++++++
>  xen/arch/x86/sysctl.c               | 51 +++++++++++++++++++++++++++++++++++++
>  xen/include/public/sysctl.h         | 27 ++++++++++++++++++++

And implement the XSM flask sub-ops, including the appropriate xen.te policy
modification, along with updating the attributes file.

Otherwise the code looks good and you can tack on Reviewed-by: Konrad Rzeszutek
Wilk <konrad.wilk@oracle.com>. Or wait until you repost it with the XSM checks and
I can extend the 'Reviewed-by' for the XSM code..
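
As an aside for toolstack consumers of XEN_SYSCTL_get_cpu_levelling_caps, a
minimal sketch of testing the returned caps word. The bit values are copied
from the public/sysctl.h hunk earlier in the series; caps_has_all() is a
made-up helper, not something in the patch or in libxc. It uses the same
all-bits test as the LAZY() macro in patch 14, which matters for the paired
caps: having only one of ecx/edx should not count as being able to level
leaf 1.

```c
#include <stdint.h>

/* Bit values copied from the quoted public/sysctl.h hunk. */
#define XEN_SYSCTL_CPU_LEVELCAP_faulting    (1ul <<  0)
#define XEN_SYSCTL_CPU_LEVELCAP_ecx         (1ul <<  1)
#define XEN_SYSCTL_CPU_LEVELCAP_edx         (1ul <<  2)
#define XEN_SYSCTL_CPU_LEVELCAP_extd_ecx    (1ul <<  3)
#define XEN_SYSCTL_CPU_LEVELCAP_extd_edx    (1ul <<  4)

/* Groupings as in the quoted asm-x86/cpuid.h hunk. */
#define LCAP_1cd  (XEN_SYSCTL_CPU_LEVELCAP_ecx | \
                   XEN_SYSCTL_CPU_LEVELCAP_edx)
#define LCAP_e1cd (XEN_SYSCTL_CPU_LEVELCAP_extd_ecx | \
                   XEN_SYSCTL_CPU_LEVELCAP_extd_edx)

/* Hypothetical helper: true only if every bit in 'want' is present. */
static int caps_has_all(uint32_t caps, unsigned long want)
{
    return (caps & want) == want;
}
```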

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 20/26] tools/libxc: Modify bitmap operations to take void pointers
  2016-03-23 16:36 ` [PATCH v4 20/26] tools/libxc: Modify bitmap operations to take void pointers Andrew Cooper
@ 2016-03-28 20:05   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 72+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-03-28 20:05 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Wei Liu, Julien Grall, Ian Jackson, Stefano Stabellini, Xen-devel

On Wed, Mar 23, 2016 at 04:36:23PM +0000, Andrew Cooper wrote:
> The type of the pointer to a bitmap is not interesting; it does not affect the
> representation of the block of bits being pointed to.
> 
> Make the libxc functions consistent with those in Xen, so they can work just
> as well with 'unsigned int *' based bitmaps.
> 
> As part of doing so, change the implementation to be in terms of char rather
> than unsigned long.  This fixes alignment concerns with ARM.

If you could, please also modify the comment in xc_misc.c. See

 3cab67ac83b1d56c3daedd9c4adfed497a114246
    libxl/cpumap: Add xc_cpumap_[setcpu, clearcpu, testcpu] to complement xc_cpumap_alloc.

Otherwise: Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 21/26] tools/libxc: Use public/featureset.h for cpuid policy generation
  2016-03-23 16:36 ` [PATCH v4 21/26] tools/libxc: Use public/featureset.h for cpuid policy generation Andrew Cooper
@ 2016-03-28 20:07   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 72+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-03-28 20:07 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Ian Jackson, Xen-devel

On Wed, Mar 23, 2016 at 04:36:24PM +0000, Andrew Cooper wrote:
> Rather than having a different local copy of some of the feature
> definitions.
> 
> Modify the xc_cpuid_x86.c cpumask helpers to appropriate truncate the

s/appropriate/appropriately/ ?

> new values.
> 
> As some of the feature have been renamed in the public API, similar renames
> are made here.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

* Re: [PATCH v4 22/26] tools/libxc: Expose the automatically generated cpu featuremask information
  2016-03-23 16:36 ` [PATCH v4 22/26] tools/libxc: Expose the automatically generated cpu featuremask information Andrew Cooper
@ 2016-03-28 20:08   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 72+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-03-28 20:08 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Ian Jackson, Xen-devel

On Wed, Mar 23, 2016 at 04:36:25PM +0000, Andrew Cooper wrote:
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

* Re: [PATCH v4 23/26] tools: Utility for dealing with featuresets
  2016-03-23 16:36 ` [PATCH v4 23/26] tools: Utility for dealing with featuresets Andrew Cooper
@ 2016-03-28 20:26   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 72+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-03-28 20:26 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Ian Jackson, Xen-devel

On Wed, Mar 23, 2016 at 04:36:26PM +0000, Andrew Cooper wrote:
> It is able to reports the current featuresets; both the static masks and
> dynamic featuresets from Xen, or to decode an arbitrary featureset into
> `/proc/cpuinfo` style strings.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Wei Liu <wei.liu2@citrix.com>

.. snip..
> +int main(int argc, char **argv)
> +{
> +    enum { MODE_UNKNOWN, MODE_INFO, MODE_DETAIL, MODE_INTERPRET }
> +    mode = MODE_UNKNOWN;
> +
> +    nr_features = xc_get_cpu_featureset_size();
> +
> +    for ( ;; )
> +    {
> +        int option_index = 0, c;
> +        static struct option long_options[] =
> +        {
> +            { "help", no_argument, NULL, 'h' },
> +            { "info", no_argument, NULL, 'i' },
> +            { "detail", no_argument, NULL, 'd' },
> +            { "verbose", no_argument, NULL, 'v' },
> +            { NULL, 0, NULL, 0 },
> +        };
> +
> +        c = getopt_long(argc, argv, "hidv", long_options, &option_index);
> +
> +        if ( c == -1 )
> +            break;
> +
> +        switch ( c )
> +        {
> +        case 'h':
> + option_error:
> +            printf("Usage: %s [ info | detail | <featureset>* ]\n", argv[0]);
> +            return 0;
> +
> +        case 'i':
> +            mode = MODE_INFO;
> +            break;
> +
> +        case 'd':
> +        case 'v':
> +            mode = MODE_DETAIL;
> +            break;
> +
> +        default:
> +            printf("Bad option '%c'\n", c);
> +            goto option_error;

Oh my. A backward goto! How about moving this default right above "case 'h':"
and doing a fallthrough?

Granted, one could consider that even worse looking than this goto.
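
Roughly what I have in mind, as a sketch only. The handle_option() wrapper
and its return convention are made up for illustration; the patch does this
inline in main().

```c
#include <stdio.h>

enum mode { MODE_UNKNOWN, MODE_INFO, MODE_DETAIL };

/*
 * Hypothetical helper: handle one option character as returned by
 * getopt_long().  Returns 1 if usage was printed and the caller should
 * stop, 0 if option parsing should continue.
 */
static int handle_option(int c, const char *prog, enum mode *mode)
{
    switch ( c )
    {
    default:
        printf("Bad option '%c'\n", c);
        /* Fall through. */
    case 'h':
        printf("Usage: %s [ info | detail | <featureset>* ]\n", prog);
        return 1;

    case 'i':
        *mode = MODE_INFO;
        return 0;

    case 'd':
    case 'v':
        *mode = MODE_DETAIL;
        return 0;
    }
}
```

The default label comes first, so the bad-option message is followed by the
usage text without any jump.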

Either way:

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

* Re: [PATCH v4 24/26] tools/libxc: Wire a featureset through to cpuid policy logic
  2016-03-23 16:36 ` [PATCH v4 24/26] tools/libxc: Wire a featureset through to cpuid policy logic Andrew Cooper
@ 2016-03-28 20:39   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 72+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-03-28 20:39 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Ian Jackson, Xen-devel

On Wed, Mar 23, 2016 at 04:36:27PM +0000, Andrew Cooper wrote:
> Later changes will cause the cpuid generation logic to seed their information

s/Later changes/Patch titled tools/libxc: Use featuresets rather than guesswork

> from a featureset.  This patch adds the infrastructure to specify a
> featureset, and will obtain the appropriate default from Xen if omitted.
s/default/defaults/

> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Wei Liu <wei.liu2@citrix.com>
> ---
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> 
> v2:
>  * Modify existing call rather than introducing a new one.
>  * Fix up in-tree callsites.
> ---
>  tools/libxc/include/xenctrl.h       |  4 ++-
>  tools/libxc/xc_cpuid_x86.c          | 69 ++++++++++++++++++++++++++++++++-----
>  tools/libxl/libxl_cpuid.c           |  2 +-
>  tools/ocaml/libs/xc/xenctrl_stubs.c |  2 +-
>  tools/python/xen/lowlevel/xc/xc.c   |  2 +-
>  5 files changed, 66 insertions(+), 13 deletions(-)
> 
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index 66acbd1..872fd08 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -1896,7 +1896,9 @@ int xc_cpuid_set(xc_interface *xch,
>                   const char **config,
>                   char **config_transformed);
>  int xc_cpuid_apply_policy(xc_interface *xch,
> -                          domid_t domid);
> +                          domid_t domid,
> +                          uint32_t *featureset,
> +                          unsigned int nr_features);
>  void xc_cpuid_to_str(const unsigned int *regs,
>                       char **strs); /* some strs[] may be NULL if ENOMEM */
>  int xc_mca_op(xc_interface *xch, struct xen_mc *mc);
> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
> index 0cffb36..a92f5e4 100644
> --- a/tools/libxc/xc_cpuid_x86.c
> +++ b/tools/libxc/xc_cpuid_x86.c
> @@ -166,6 +166,9 @@ struct cpuid_domain_info
>      bool pvh;
>      uint64_t xfeature_mask;
>  
> +    uint32_t *featureset;
> +    unsigned int nr_features;
> +
>      /* PV-only information. */
>      bool pv64;
>  
> @@ -197,11 +200,14 @@ static void cpuid(const unsigned int *input, unsigned int *regs)
>  }
>  
>  static int get_cpuid_domain_info(xc_interface *xch, domid_t domid,
> -                                 struct cpuid_domain_info *info)
> +                                 struct cpuid_domain_info *info,
> +                                 uint32_t *featureset,
> +                                 unsigned int nr_features)
>  {
>      struct xen_domctl domctl = {};
>      xc_dominfo_t di;
>      unsigned int in[2] = { 0, ~0U }, regs[4];
> +    unsigned int i, host_nr_features = xc_get_cpu_featureset_size();
>      int rc;
>  
>      cpuid(in, regs);
> @@ -223,6 +229,23 @@ static int get_cpuid_domain_info(xc_interface *xch, domid_t domid,
>      info->hvm = di.hvm;
>      info->pvh = di.pvh;
>  
> +    info->featureset = calloc(host_nr_features, sizeof(*info->featureset));
> +    if ( !info->featureset )
> +        return -ENOMEM;
> +
> +    info->nr_features = host_nr_features;
> +
> +    if ( featureset )
> +    {
> +        memcpy(info->featureset, featureset,
> +               min(host_nr_features, nr_features) * sizeof(*info->featureset));
> +
> +        /* Check for truncated set bits. */
> +        for ( i = nr_features; i < host_nr_features; ++i )

What if nr_features is greater than host_nr_features? Should we fail immediately?


> +            if ( featureset[i] != 0 )

Could you make this: if ( featureset[i] )  - to complement the style?

Otherwise: Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
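
To make the nr_features > host_nr_features question concrete, here is a
sketch of one way the copy could behave. copy_featureset() is a hypothetical
helper, not the code in the patch, and returning -EINVAL is my assumption
about what "fail" should mean.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not the libxc implementation: copy a caller-provided
 * featureset of nr_features words into a host-sized buffer of host_nr words.
 * A longer input is accepted only if the words that would be dropped are
 * all zero; a shorter input is zero-extended.
 */
static int copy_featureset(uint32_t *dst, unsigned int host_nr,
                           const uint32_t *src, unsigned int nr_features)
{
    unsigned int i, n = nr_features < host_nr ? nr_features : host_nr;

    memset(dst, 0, host_nr * sizeof(*dst));
    memcpy(dst, src, n * sizeof(*dst));

    /* Refuse to silently drop set bits the host has no word for. */
    for ( i = host_nr; i < nr_features; ++i )
        if ( src[i] )
            return -EINVAL;

    return 0;
}
```

The loop only ever reads words inside the caller's array, so it is a no-op
when nr_features <= host_nr.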

* Re: [PATCH v4 07/26] xen/x86: Calculate maximum host and guest featuresets
  2016-03-23 16:36 ` [PATCH v4 07/26] xen/x86: Calculate maximum host and guest featuresets Andrew Cooper
@ 2016-03-29  8:57   ` Jan Beulich
  0 siblings, 0 replies; 72+ messages in thread
From: Jan Beulich @ 2016-03-29  8:57 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
> All of this information will be used by the toolstack to make informed
> levelling decisions for VMs, and by Xen to sanity check toolstack-provided
> information.

Not only am I still missing a sentence or two here on the two HVM
feature sets (namely why only one gets exposed), I'm also now
realizing that your intention of not exposing both in the public
interface is in contradiction with patch 6, which does expose both
of them, even if only in textual (comment) form. I.e. I think the
public header should then also get a note added that these
annotations are not part of the public interface.

Jan


* Re: [PATCH v4 26/26] tools/libxc: Calculate xstate cpuid leaf from guest information
  2016-03-23 16:36 ` [PATCH v4 26/26] tools/libxc: Calculate xstate cpuid leaf from guest information Andrew Cooper
  2016-03-24 17:20   ` Wei Liu
@ 2016-03-31  7:48   ` Jan Beulich
  2016-04-05 17:48     ` Andrew Cooper
  1 sibling, 1 reply; 72+ messages in thread
From: Jan Beulich @ 2016-03-31  7:48 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Ian Jackson, Wei Liu, Xen-devel

>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
> --- a/tools/libxc/xc_cpuid_x86.c
> +++ b/tools/libxc/xc_cpuid_x86.c
> @@ -398,54 +398,97 @@ static void intel_xc_cpuid_policy(xc_interface *xch,
>      }
>  }
>  
> +/* XSTATE bits in XCR0. */
> +#define X86_XCR0_X87    (1ULL <<  0)
> +#define X86_XCR0_SSE    (1ULL <<  1)
> +#define X86_XCR0_AVX    (1ULL <<  2)
> +#define X86_XCR0_BNDREG (1ULL <<  3)
> +#define X86_XCR0_BNDCSR (1ULL <<  4)
> +#define X86_XCR0_LWP    (1ULL << 62)

Why an incomplete set? At least PKRU should be needed right
away. And I see no reason why the three AVX-512 pieces can't
be put here right away too.

> +#define X86_XSS_MASK    (0) /* No XSS states supported yet. */
> +
> +/* Per-component subleaf flags. */
> +#define XSTATE_XSS      (1ULL <<  0)
> +#define XSTATE_ALIGN64  (1ULL <<  1)
> +
>  /* Configure extended state enumeration leaves (0x0000000D for xsave) */
>  static void xc_cpuid_config_xsave(xc_interface *xch,
>                                    const struct cpuid_domain_info *info,
>                                    const unsigned int *input, unsigned int *regs)
>  {
> -    if ( info->xfeature_mask == 0 )
> +    uint64_t guest_xfeature_mask;
> +
> +    if ( info->xfeature_mask == 0 ||
> +         !test_bit(X86_FEATURE_XSAVE, info->featureset) )
>      {
>          regs[0] = regs[1] = regs[2] = regs[3] = 0;
>          return;
>      }
>  
> +    guest_xfeature_mask = X86_XCR0_SSE | X86_XCR0_X87;
> +
> +    if ( test_bit(X86_FEATURE_AVX, info->featureset) )
> +        guest_xfeature_mask |= X86_XCR0_AVX;
> +
> +    if ( test_bit(X86_FEATURE_MPX, info->featureset) )
> +        guest_xfeature_mask |= X86_XCR0_BNDREG | X86_XCR0_BNDCSR;
> +
> +    if ( test_bit(X86_FEATURE_LWP, info->featureset) )
> +        guest_xfeature_mask |= X86_XCR0_LWP;
> +
> +    /*
> +     * Clamp to host mask.  Should be no-op, as guest_xfeature_mask should not
> +     * be able to be calculated as larger than info->xfeature_mask.
> +     *
> +     * TODO - see about making this a harder error.
> +     */
> +    guest_xfeature_mask &= info->xfeature_mask;

This is ugly. For one, your dependency mechanism should be able to
express the dependencies you "manually" enforce above. And beyond
that masking with info->xfeature_mask should be all that's needed,
together with enforcing the XCR0 / XSS split ...

>      switch ( input[1] )
>      {
> -    case 0: 
> +    case 0:
>          /* EAX: low 32bits of xfeature_enabled_mask */
> -        regs[0] = info->xfeature_mask & 0xFFFFFFFF;
> +        regs[0] = guest_xfeature_mask;
>          /* EDX: high 32bits of xfeature_enabled_mask */
> -        regs[3] = (info->xfeature_mask >> 32) & 0xFFFFFFFF;
> +        regs[3] = guest_xfeature_mask >> 32;

... here and ...

>      case 1: /* leaf 1 */
>          regs[0] = info->featureset[featureword_of(X86_FEATURE_XSAVEOPT)];
> -        regs[2] &= info->xfeature_mask;
> -        regs[3] = 0;
> +        regs[2] = guest_xfeature_mask & X86_XSS_MASK;
> +        regs[3] = (guest_xfeature_mask >> 32) & X86_XSS_MASK;

... here. Yet not by a compile time defined mask, but by using
(host) CPUID output: It is clear that once a bit got assigned to XCR0
vs XSS, it won't ever change. Hence it doesn't matter whether you
use the guest or host view of that split. And this will then also - other
than you've said before would be unavoidable - make it unnecessary to
always update this code when new states get added.
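
A sketch of what I mean, under the assumption that the host's leaf 0xD
register values have already been read and are passed in. The struct and
function names are made up for illustration. Subleaf 0 EAX:EDX enumerates
the XCR0-managed components and subleaf 1 ECX:EDX the XSS-managed ones.

```c
#include <stdint.h>

/* Host view of CPUID leaf 0xD as raw register values (hypothetical type). */
struct host_xstate {
    uint32_t d0_eax, d0_edx; /* subleaf 0: XCR0-managed components */
    uint32_t d1_ecx, d1_edx; /* subleaf 1: XSS-managed components  */
};

static uint64_t host_xcr0_mask(const struct host_xstate *h)
{
    return ((uint64_t)h->d0_edx << 32) | h->d0_eax;
}

static uint64_t host_xss_mask(const struct host_xstate *h)
{
    return ((uint64_t)h->d1_edx << 32) | h->d1_ecx;
}

/* Split a guest's xfeature mask without a compile-time X86_XSS_MASK. */
static void split_guest_mask(const struct host_xstate *h, uint64_t guest,
                             uint64_t *xcr0, uint64_t *xss)
{
    *xcr0 = guest & host_xcr0_mask(h);
    *xss  = guest & host_xss_mask(h);
}
```

With this shape a newly defined state component lands in the right half
automatically, as long as the host enumerates it.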

> -    case 2 ... 63: /* sub-leaves */
> -        if ( !(info->xfeature_mask & (1ULL << input[1])) )
> +
> +    case 2 ... 62: /* per-component sub-leaves */
> +        if ( !(guest_xfeature_mask & (1ULL << input[1])) )
>          {
>              regs[0] = regs[1] = regs[2] = regs[3] = 0;
>              break;
>          }
>          /* Don't touch EAX, EBX. Also cleanup ECX and EDX */
> -        regs[2] = regs[3] = 0;
> +        regs[2] &= XSTATE_XSS | XSTATE_ALIGN64;

Wouldn't this better also use the "known features" approach, by
adding yet another word in cpufeatureset.h?

Btw., looking at that header again I now wonder whether it
wouldn't have been neater to make XEN_CPUFEATURE() a
3-parameter macro, with word and bit specified separately
and a default definition of

#define XEN_CPUFEATURE(name, word, bit) XEN_X86_FEATURE_##name = (word) * 32 + (bit),

avoiding the ugly repeated "*32" in all macro invocations. Of
course we'd need to adjust this before we release with this new
interface.
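
For illustration, a self-contained sketch of how invocations would look with
that definition. The three entries and the two helper functions are only
examples chosen for this sketch, not a proposal for the header's contents.

```c
#include <stdint.h>

/* Proposed three-parameter form of the macro. */
#define XEN_CPUFEATURE(name, word, bit) \
    XEN_X86_FEATURE_##name = (word) * 32 + (bit),

/* Example invocations only; the real list lives in cpufeatureset.h. */
enum {
    XEN_CPUFEATURE(FPU,   0,  0)
    XEN_CPUFEATURE(SSE3,  1,  0)
    XEN_CPUFEATURE(XSAVE, 1, 26)
};

#undef XEN_CPUFEATURE

/* Hypothetical helpers: recover word index and bit position. */
static inline unsigned int feature_word(unsigned int feature)
{
    return feature / 32;
}

static inline uint32_t feature_bit_mask(unsigned int feature)
{
    return UINT32_C(1) << (feature % 32);
}
```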

Jan

* Re: [PATCH v4 11/26] xen/x86: Improvements to in-hypervisor cpuid sanity checks
  2016-03-28 15:29   ` Konrad Rzeszutek Wilk
@ 2016-04-05 15:25     ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-04-05 15:25 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Jan Beulich, Xen-devel

On 28/03/16 16:29, Konrad Rzeszutek Wilk wrote:
> On Wed, Mar 23, 2016 at 04:36:14PM +0000, Andrew Cooper wrote:
>> Currently, {pv,hvm}_cpuid() has a large quantity of essentially-static logic
>> for modifying the features visible to a guest.  A lot of this can be subsumed
>> by {pv,hvm}_featuremask, which identify the features available on this
>> hardware which could be given to a PV or HVM guest.
>>
>> This is a step in the direction of full per-domain cpuid policies, but lots
>> more development is needed for that.  As a result, the static checks are
>> simplified, but the dynamic checks need to remain for now.
>>
>> As a side effect, some of the logic for special features can be improved.
>> OSXSAVE and OSPKE will be automatically cleared because of being absent in the
>> featuremask.  This allows the fast-forward logic to be more simple.
>>
>> In addition, there are some corrections to the existing logic:
>>
>>  * Hiding PSE36 out of PAE mode is architecturally wrong.  It turns out that
>>    it was a bugfix for running HyperV under Xen, which wanted to see PSE36
>>    even after choosing to use PAE paging.  PSE36 is not supported by shadow
>>    paging, so is hidden from non-HAP guests, but is still visible for HAP
>>    guests.
>>  * Changing the visibility of RDTSCP based on host TSC stability or virtual
>>    TSC mode is bogus, so dropped.
> Why is it bogus? It is a PV ABI type CPUID.

The CPUID bit has a well defined meaning, and the vtsc infrastructure
went and diverged the ABI provided by Intel and AMD.

If it were only PV guests, it would be less bad.  However, it breaks HVM
guests as well which is absolutely not ok.

The meaning of the RDTSCP feature bit is well defined.  The presence of
the `rdtscp` instruction, and the TSC_AUX MSR.

The vtsc options control whether rdtsc(p) is intercepted by Xen, and
whether the guest or Xen controls the AUX MSR.  These options are
unrelated, and have no bearing on the availability of the instruction in
the first place.
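
As a small sketch of what "well defined" means here: the bit is
CPUID.80000001H:EDX[27], and a guest only needs the raw register value to
test it. The macro and function names below are hypothetical, and the EDX
value is assumed to have been obtained from the cpuid instruction elsewhere.

```c
#include <stdbool.h>
#include <stdint.h>

/* RDTSCP is enumerated in CPUID leaf 0x80000001, EDX bit 27. */
#define CPUID_E1D_RDTSCP (UINT32_C(1) << 27)

/* Hypothetical helper: leaf_e1_edx is EDX from CPUID leaf 0x80000001. */
static bool has_rdtscp(uint32_t leaf_e1_edx)
{
    return (leaf_e1_edx & CPUID_E1D_RDTSCP) != 0;
}
```

Nothing in that test says anything about whether the instruction is
intercepted, which is the point.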

>
> Independently of that you would also need to modify tsc_mode.txt file and all uses
> of 'tsc_mode=3'.

A guest kernel cannot use the presence or absence of the rdtscp feature
to identify which vtsc mode is intended, which necessarily means
there is an extra out-of-band signal controlling its use.

In particular, the current issue fixed by this change is that for the
default case, on migrate, the rdtscp feature disappears from the domain
as the tsc logic decides that the frequency has changed and vtsc mode
should be enabled.

~Andrew

* Re: [PATCH v4 12/26] x86/cpu: Move set_cpumask() calls into c_early_init()
  2016-03-28 15:55   ` Konrad Rzeszutek Wilk
@ 2016-04-05 16:19     ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-04-05 16:19 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Jan Beulich, Xen-devel

On 28/03/16 16:55, Konrad Rzeszutek Wilk wrote:
> On Wed, Mar 23, 2016 at 04:36:15PM +0000, Andrew Cooper wrote:
>> Before c/s 44e24f8567 "x86: don't call generic_identify() redundantly", the
>> commandline-provided masks would take effect in Xen's view of the features.
> s/the// ?
>
> Or perhaps s/the/cpuid// ?

"the processor features"? 

>> As the masks got applied after the query for features, the redundant call to
>> generic_identify() would clobber the pre-masking feature information with the
>> post-masking information.
>>
>> Move the set_cpumask() calls into c_early_init() so their effects take place
> s/their effects take/it's effect takes/ 

"their" is supposed to be referring to the cpuid mask command line
parameters, but I will see if I can reword.

>
>> before the main query for features in generic_identify().
> .. and unifying all c_early_init() functions behavior?

I don't understand what you are trying to get at here.  I am moving the
masking setup from c_init() to c_early_init() for both Intel and AMD.  I
don't see what I am supposedly unifying.

>
>> The cpuid_mask_* command line parameters now limit the entire system, a
>> feature XenServer was relying on for testing purposes.  Subsequent changes
> .. what are those changes? Could you mention the title of the patch perhaps?

Specifically, the complete combination of patches 14 through 18, but as
this part of the series can't reasonably be committed piecemeal, there
is no risk of other changes getting in between.

~Andrew

* Re: [PATCH v4 13/26] x86/cpu: Sysctl and common infrastructure for levelling context switching
  2016-03-28 16:12   ` Konrad Rzeszutek Wilk
@ 2016-04-05 16:33     ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-04-05 16:33 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Jan Beulich, Xen-devel

On 28/03/16 17:12, Konrad Rzeszutek Wilk wrote:
> On Wed, Mar 23, 2016 at 04:36:16PM +0000, Andrew Cooper wrote:
>> A toolstack needs to know how much control Xen has over the visible cpuid
>> values in PV guests.  Provide an explicit mechanism to query what Xen is
>> capable of.
>>
>> This interface will currently report no capabilities.  This change is
>> scaffolding for future patches, which will introduce detection and switching
>> logic, after which the interface will report hardware capabilities correctly.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Jan Beulich <JBeulich@suse.com>
>>
>> v2:
>>  * s/cpumasks/cpuidmasks/
>> v3:
>>  * Reintroduce XEN_SYSCTL_get_levelling_caps (requested by Joao for some
>>    development he has planned).
> s/some/libvirt/
>
>>  * Rename to XEN_SYSCTL_get_cpu_levelling_caps, and rename the constants to
>>    match the Xen command line options.
>> v4:
>>  * Move declarations from processor.h to cpuid.h
>>  * API corrections for XEN_SYSCTL_get_levelling_caps
>> ---
>>  xen/arch/x86/cpu/common.c        |  6 ++++++
>>  xen/arch/x86/sysctl.c            |  6 ++++++
>>  xen/include/asm-x86/cpufeature.h |  1 +
>>  xen/include/asm-x86/cpuid.h      | 32 ++++++++++++++++++++++++++++++++
>>  xen/include/public/sysctl.h      | 23 +++++++++++++++++++++++
>>  5 files changed, 68 insertions(+)
>>
>> diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
>> index b5c023f..7ef75b0 100644
>> --- a/xen/arch/x86/cpu/common.c
>> +++ b/xen/arch/x86/cpu/common.c
>> @@ -36,6 +36,12 @@ integer_param("cpuid_mask_ext_ecx", opt_cpuid_mask_ext_ecx);
>>  unsigned int opt_cpuid_mask_ext_edx = ~0u;
>>  integer_param("cpuid_mask_ext_edx", opt_cpuid_mask_ext_edx);
>>  
>> +unsigned int __initdata expected_levelling_cap;
>> +unsigned int __read_mostly levelling_caps;
>> +
>> +DEFINE_PER_CPU(struct cpuidmasks, cpuidmasks);
>> +struct cpuidmasks __read_mostly cpuidmask_defaults;
>> +
> Stray changes?

No.  Please refer to the commit message.

If it isn't sufficiently clear, please suggest how better to say "This
patch is deliberately like this to reduce the complexity of the two
following patches, both of which are already very complicated to review".

Compile-wise, you can either have this patch separate (and accept that I
introduce some variables before the code which sets them to a non-zero
value), or folded into the following patch.  I know which way I would
prefer to review them...

>
>>  const struct cpu_dev *__read_mostly cpu_devs[X86_VENDOR_NUM] = {};
>>  
>>  unsigned int paddr_bits __read_mostly = 36;
>> diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
>> index 58cbd70..f68cbec 100644
>> --- a/xen/arch/x86/sysctl.c
>> +++ b/xen/arch/x86/sysctl.c
>> @@ -190,6 +190,12 @@ long arch_do_sysctl(
>>          }
>>          break;
>>  
>> +    case XEN_SYSCTL_get_cpu_levelling_caps:
>> +        sysctl->u.cpu_levelling_caps.caps = levelling_caps;
>> +        if ( __copy_field_to_guest(u_sysctl, sysctl, u.cpu_levelling_caps.caps) )
>> +            ret = -EFAULT;
>> +        break;
>> +
>
> You are missing the XSM checks in flask_sysctl and the proper label in attributes files
> (and also the default policy in xen.te).

So I have - will fix.

>
>>      default:
>>          ret = -ENOSYS;
>>          break;
>> diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
>> index e29b024..84d3220 100644
>> --- a/xen/include/asm-x86/cpufeature.h
>> +++ b/xen/include/asm-x86/cpufeature.h
>> @@ -81,6 +81,7 @@
>>  #define cpu_has_xsavec		boot_cpu_has(X86_FEATURE_XSAVEC)
>>  #define cpu_has_xgetbv1		boot_cpu_has(X86_FEATURE_XGETBV1)
>>  #define cpu_has_xsaves		boot_cpu_has(X86_FEATURE_XSAVES)
>> +#define cpu_has_hypervisor	boot_cpu_has(X86_FEATURE_HYPERVISOR)
>>  
>>  enum _cache_type {
>>      CACHE_TYPE_NULL = 0,
>> diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
>> index 4725672..9a21c25 100644
>> --- a/xen/include/asm-x86/cpuid.h
>> +++ b/xen/include/asm-x86/cpuid.h
>> @@ -3,6 +3,7 @@
>>  
>>  #include <asm/cpufeatureset.h>
>>  #include <asm/cpuid-autogen.h>
>> +#include <asm/percpu.h>
>>  
>>  #define FSCAPINTS FEATURESET_NR_ENTRIES
>>  
>> @@ -18,6 +19,7 @@
>>  
>>  #ifndef __ASSEMBLY__
>>  #include <xen/types.h>
>> +#include <public/sysctl.h>
>>  
>>  extern const uint32_t known_features[FSCAPINTS];
>>  extern const uint32_t special_features[FSCAPINTS];
>> @@ -31,6 +33,36 @@ void calculate_featuresets(void);
>>  
>>  const uint32_t *lookup_deep_deps(uint32_t feature);
>>  
>> +/*
>> + * Expected levelling capabilities (given cpuid vendor/family information),
>> + * and levelling capabilities actually available (given MSR probing).
>> + */
>> +#define LCAP_faulting XEN_SYSCTL_CPU_LEVELCAP_faulting
>> +#define LCAP_1cd      (XEN_SYSCTL_CPU_LEVELCAP_ecx |        \
>> +                       XEN_SYSCTL_CPU_LEVELCAP_edx)
>> +#define LCAP_e1cd     (XEN_SYSCTL_CPU_LEVELCAP_extd_ecx |   \
>> +                       XEN_SYSCTL_CPU_LEVELCAP_extd_edx)
>> +#define LCAP_Da1      XEN_SYSCTL_CPU_LEVELCAP_xsave_eax
>> +#define LCAP_6c       XEN_SYSCTL_CPU_LEVELCAP_thermal_ecx
>> +#define LCAP_7ab0     (XEN_SYSCTL_CPU_LEVELCAP_l7s0_eax |   \
>> +                       XEN_SYSCTL_CPU_LEVELCAP_l7s0_ebx)
>> +extern unsigned int expected_levelling_cap, levelling_caps;
>> +
>> +struct cpuidmasks
>> +{
>> +    uint64_t _1cd;
>> +    uint64_t e1cd;
>> +    uint64_t Da1;
>> +    uint64_t _6c;
>> +    uint64_t _7ab0;
>> +};
>> +
>> +/* Per CPU shadows of masking MSR values, for lazy context switching. */
>> +DECLARE_PER_CPU(struct cpuidmasks, cpuidmasks);
> Ditto? Should these be in another patchset?
>
> Why not make this patch only introduce the sub-ops and the very first
> patch that uses the cpuidmasks have these squashed in?

Because squashing this patch and the following patch will make something
far more difficult to review.

>
>> +
>> +/* Default masking MSR values, calculated at boot. */
>> +extern struct cpuidmasks cpuidmask_defaults;
>> +
>>  #endif /* __ASSEMBLY__ */
>>  #endif /* !__X86_CPUID_H__ */
>>  
>> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
>> index 96680eb..1ab16db 100644
>> --- a/xen/include/public/sysctl.h
>> +++ b/xen/include/public/sysctl.h
>> @@ -766,6 +766,27 @@ struct xen_sysctl_tmem_op {
>>  typedef struct xen_sysctl_tmem_op xen_sysctl_tmem_op_t;
>>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_tmem_op_t);
>>  
>> +/*
>> + * XEN_SYSCTL_get_cpu_levelling_caps (x86 specific)
>> + *
>> + * Return hardware capabilities concerning masking or faulting of the cpuid
>> + * instruction for PV guests.
> Just PV? Not HVM?

HVM doesn't have any cpuid problems; the cpuid instruction traps to Xen
under all circumstances.

>
>> + */
>> +struct xen_sysctl_cpu_levelling_caps {
> I presume these are the /*IN*/ args?

They are /*OUT*/, as only a get_cpu_levelling_caps hypercall is
defined.  I can't see a plausible use for a set variant of this
hypercall, but I have named the structure in a neutral way.

>
>
>> +#define XEN_SYSCTL_CPU_LEVELCAP_faulting    (1ul <<  0) /* CPUID faulting    */
> Perhaps also point to what they look like in the Xen header file?
>
> And mention that they differ in word placement from what Linux has
> (which in turns impacts how toolstack may mash this together).

Are you getting confused?  This has nothing to do with the featureset api.

This reports which cpuid leaves can be controlled by Xen, based on the
hardware availability of masking msrs or cpuid faulting.
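
As a sketch of how a toolstack might consume the result: only the faulting
bit value below is taken from this patch. The ecx/edx bit positions are
assumptions made for the example, as is the can_level_leaf1() helper.

```c
#include <stdbool.h>
#include <stdint.h>

#define XEN_SYSCTL_CPU_LEVELCAP_faulting (1ul << 0) /* value from the patch */
/* Assumed bit positions, for illustration only. */
#define XEN_SYSCTL_CPU_LEVELCAP_ecx      (1ul << 1)
#define XEN_SYSCTL_CPU_LEVELCAP_edx      (1ul << 2)

/*
 * Hypothetical helper: can Xen control what a PV guest sees in the basic
 * feature leaf?  Either cpuid faulting is available, or both the ecx and
 * edx masking capabilities are.
 */
static bool can_level_leaf1(uint32_t caps)
{
    const uint32_t masks = XEN_SYSCTL_CPU_LEVELCAP_ecx |
                           XEN_SYSCTL_CPU_LEVELCAP_edx;

    if ( caps & XEN_SYSCTL_CPU_LEVELCAP_faulting )
        return true;

    return (caps & masks) == masks;
}
```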

~Andrew

* Re: [PATCH v4 14/26] x86/cpu: Rework AMD masking MSR setup
  2016-03-28 18:55   ` Konrad Rzeszutek Wilk
@ 2016-04-05 16:44     ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-04-05 16:44 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Xen-devel

On 28/03/16 19:55, Konrad Rzeszutek Wilk wrote:
> On Wed, Mar 23, 2016 at 04:36:17PM +0000, Andrew Cooper wrote:
>> This patch is best reviewed as its end result rather than as a diff, as it
>> rewrites almost all of the setup.
>>
>> On the BSP, cpuid information is used to evaluate the potential available set
>> of masking MSRs, and they are unconditionally probed, filling in the
>> availability information and hardware defaults.
>>
>> The command line parameters are then combined with the hardware defaults to
>> further restrict the Xen default masking level.  Each cpu is then context
>> switched into the default levelling state.
> Context switched? Why not just say:
>
> When booting up CPUs we set the same MSR mask for each CPU.
>
> The amd_ctxt_switch_levelling can be also used (in patch XYZ) to swap
> levelling per guest granularity.
>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> Reviewed-by: Jan Beulich <JBeulich@suse.com>
>> ---
>> v2:
>>  * Provide extra information if opt_cpu_info
>>  * Extra comment indicating the expected use of amd_ctxt_switch_levelling()
>> v3:
>>  * Fix the interaction of the fast-forward bits with the override MSRs.
>>  * Style fixups.
>> ---
>>  xen/arch/x86/cpu/amd.c | 276 ++++++++++++++++++++++++++++++++-----------------
>>  1 file changed, 179 insertions(+), 97 deletions(-)
>>
>> diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
>> index 5516777..0e1c8b9 100644
>> --- a/xen/arch/x86/cpu/amd.c
>> +++ b/xen/arch/x86/cpu/amd.c
>> @@ -80,6 +80,13 @@ static inline int wrmsr_amd_safe(unsigned int msr, unsigned int lo,
>>  	return err;
>>  }
>>  
>> +static void wrmsr_amd(unsigned int msr, uint64_t val)
>> +{
>> +	asm volatile("wrmsr" ::
>> +		     "c" (msr), "a" ((uint32_t)val),
>> +		     "d" (val >> 32), "D" (0x9c5a203a));
>> +}
>> +
>>  static const struct cpuidmask {
>>  	uint16_t fam;
>>  	char rev[2];
>> @@ -126,126 +133,198 @@ static const struct cpuidmask *__init noinline get_cpuidmask(const char *opt)
>>  }
>>  
>>  /*
>> + * Sets caps in expected_levelling_cap, probes for the specified mask MSR, and
>> + * set caps in levelling_caps if it is found.  Processors prior to Fam 10h
>> + * required a 32-bit password for masking MSRs.  Returns the default value.
>> + */
>> +static uint64_t __init _probe_mask_msr(unsigned int msr, uint64_t caps)
>> +{
>> +	unsigned int hi, lo;
>> +
>> +	expected_levelling_cap |= caps;
>> +
>> +	if ((rdmsr_amd_safe(msr, &lo, &hi) == 0) &&
>> +	    (wrmsr_amd_safe(msr, lo, hi) == 0))
>> +		levelling_caps |= caps;
>> +
>> +	return ((uint64_t)hi << 32) | lo;
>> +}
>> +
>> +/*
>> + * Probe for the existance of the expected masking MSRs.  They might easily
>> + * not be available if Xen is running virtualised.
>> + */
>> +static void __init noinline probe_masking_msrs(void)
> Why noninline?

So this large quantity of __init code doesn't get inlined into its sole
caller, which is not __init.

>
>> +{
>> +	const struct cpuinfo_x86 *c = &boot_cpu_data;
>> +
>> +	/*
>> +	 * First, work out which masking MSRs we should have, based on
>> +	 * revision and cpuid.
>> +	 */
>> +
>> +	/* Fam11 doesn't support masking at all. */
>> +	if (c->x86 == 0x11)
>> +		return;
>> +
>> +	cpuidmask_defaults._1cd =
>> +		_probe_mask_msr(MSR_K8_FEATURE_MASK, LCAP_1cd);
>> +	cpuidmask_defaults.e1cd =
>> +		_probe_mask_msr(MSR_K8_EXT_FEATURE_MASK, LCAP_e1cd);
>> +
>> +	if (c->cpuid_level >= 7)
>> +		cpuidmask_defaults._7ab0 =
>> +			_probe_mask_msr(MSR_AMD_L7S0_FEATURE_MASK, LCAP_7ab0);
>> +
>> +	if (c->x86 == 0x15 && c->cpuid_level >= 6 && cpuid_ecx(6))
>> +		cpuidmask_defaults._6c =
>> +			_probe_mask_msr(MSR_AMD_THRM_FEATURE_MASK, LCAP_6c);
>> +
>> +	/*
>> +	 * Don't bother warning about a mismatch if virtualised.  These MSRs
>> +	 * are not architectural and almost never virtualised.
>> +	 */
>> +	if ((expected_levelling_cap == levelling_caps) ||
>> +	    cpu_has_hypervisor)
>> +		return;
>> +
>> +	printk(XENLOG_WARNING "Mismatch between expected (%#x) "
>> +	       "and real (%#x) levelling caps: missing %#x\n",
>> +	       expected_levelling_cap, levelling_caps,
>> +	       (expected_levelling_cap ^ levelling_caps) & levelling_caps);
>> +	printk(XENLOG_WARNING "Fam %#x, model %#x level %#x\n",
>> +	       c->x86, c->x86_model, c->cpuid_level);
>> +	printk(XENLOG_WARNING
>> +	       "If not running virtualised, please report a bug\n");
> You already have an cpu_has_hypervisor check above? Or is that for
> the hypervisor which do not set that bit?

Correct.

>
>> +}
>> +
>> +/*
>> + * Context switch levelling state to the next domain.  A parameter of NULL is
>> + * used to context switch to the default host state, and is used by the BSP/AP
>> + * startup code.
> OK, how about:
>> + */
>> +static void amd_ctxt_switch_levelling(const struct domain *nextd)
>> +{
>> +	struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
>> +	const struct cpuidmasks *masks = &cpuidmask_defaults;
>> +
> 	ASSERT(!d && system_state != SYS_STATE_active); ?

Because context switching back to NULL also happens in other situations,
such as the crash path.

>> +			ecx &= opt_cpuid_mask_ecx;
>> +			edx &= opt_cpuid_mask_edx;
>> +		} else if (m) {
>> +			ecx &= m->ecx;
>> +			edx &= m->edx;
>>  		}
>> -		feat_ecx = m->ecx;
>> -		feat_edx = m->edx;
>> -		extfeat_ecx = m->ext_ecx;
>> -		extfeat_edx = m->ext_edx;
>> +
>> +		/* Fast-forward bits - Must be set. */
>> +		if (ecx & cpufeat_mask(X86_FEATURE_XSAVE))
>> +			ecx |= cpufeat_mask(X86_FEATURE_OSXSAVE);
>> +		edx |= cpufeat_mask(X86_FEATURE_APIC);
>> +
>> +		/* Allow the HYPERVISOR bit to be set via guest policy. */
>> +		ecx |= cpufeat_mask(X86_FEATURE_HYPERVISOR);
> Hmm. The http://support.amd.com/TechDocs/52740_16h_Models_30h-3Fh_BKDG.pdf
> pg 624 mentions this (bit 63) as 'Reserved'. Should we really set it?
> Ah, but then earlier (pg 530) it says 'Reserved for use by hypervisor to indicate
> guest status.
>
>
>> +
>> +		cpuidmask_defaults._1cd = ((uint64_t)ecx << 32) | edx;
> Considering the document mentions Reserved should we preserve the bits that
> are set by the initial call that fills out the cpuidmask_default?

We already do.  Observe that the defaults will be reflected by the
cpuid() call.

>
> The original code also had:
>>  	}
>>  
>> -        /* Setting bits in the CPUID mask MSR that are not set in the
>> -         * unmasked CPUID response can cause those bits to be set in the
>> -         * masked response.  Avoid that by explicitly masking in software. */
>  that comment in it. Would it make sense to include it (or a rework of it since
> I wasn't exactly sure what it was saying).

These MSRs are overrides, which in practice allow you to advertise
features which are not actually supported by hardware.  This is bad, as
attempting to use the feature will still cause a #UD fault.
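
As a rough standalone sketch of the software masking (not the code in the
patch; build_1cd_mask() and its parameters are made-up names for
illustration), the value destined for the override MSR is clamped to what
cpuid actually reports, so a bit the hardware lacks can never be set:

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * hw_ecx/hw_edx:     unmasked feature words as reported by cpuid.
 * want_ecx/want_edx: the mask requested by the administrator.
 *
 * ANDing the two guarantees that a bit which is clear in hardware stays
 * clear in the value written to the override MSR.
 */
static uint64_t build_1cd_mask(uint32_t hw_ecx, uint32_t hw_edx,
                               uint32_t want_ecx, uint32_t want_edx)
{
    uint32_t ecx = hw_ecx & want_ecx;
    uint32_t edx = hw_edx & want_edx;

    /* Same ecx:edx packing as cpuidmask_defaults._1cd in the patch. */
    return ((uint64_t)ecx << 32) | edx;
}
```

A request for a feature which cpuid does not report (e.g. want_ecx with bit
31 set while hw_ecx is 0) simply drops out of the result.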

>
>> +			ecx &= opt_cpuid_mask_ext_ecx;
>> +			edx &= opt_cpuid_mask_ext_edx;
>> +		} else if (m) {
>> +			ecx &= m->ext_ecx;
>> +			edx &= m->ext_edx;
>> +		}
>> +
>> +		/* Fast-forward bits - Must be set. */
>> +		edx |= cpufeat_mask(X86_FEATURE_APIC);
>> +
>> +		cpuidmask_defaults.e1cd = ((uint64_t)ecx << 32) | edx;
> Should this be &= ?

No, because that would prevent correct handling of the fast forward bits.
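
A small standalone sketch of the difference, assuming a probed default which
happens to have the forced bit clear (FF_BIT, update_assign() and
update_and() are made-up names, not part of the patch):

```c
#include <stdint.h>

/* Stand-in for a fast-forward bit such as APIC (bit 9 of edx). */
#define FF_BIT (1u << 9)

/* Plain assignment: the forced bit survives whatever 'old' held. */
static uint64_t update_assign(uint64_t old, uint32_t ecx, uint32_t edx)
{
    (void)old;          /* The previous value is deliberately ignored. */
    edx |= FF_BIT;
    return ((uint64_t)ecx << 32) | edx;
}

/* '&=' variant: the forced bit is lost again if 'old' had it clear. */
static uint64_t update_and(uint64_t old, uint32_t ecx, uint32_t edx)
{
    edx |= FF_BIT;
    return old & (((uint64_t)ecx << 32) | edx);
}
```

With old == 0, update_assign() still yields a value with FF_BIT set, whereas
update_and() yields 0.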

>> +			eax &= opt_cpuid_mask_l7s0_eax;
>> +			ebx &= opt_cpuid_mask_l7s0_ebx;
>> +		}
>> +
>> +		cpuidmask_defaults._7ab0 &= ((uint64_t)eax << 32) | ebx;
>>  	}
>>  
>> -	if (!skip_l7s0_eax_ebx &&
>> -	    wrmsr_amd_safe(MSR_AMD_L7S0_FEATURE_MASK, l7s0_ebx, l7s0_eax)) {
>> -		skip_l7s0_eax_ebx = 1;
>> -		printk("Failed to set CPUID leaf 7 subleaf 0 feature mask\n");
>> +	if ((levelling_caps & LCAP_6c) == LCAP_6c) {
>> +		uint32_t ecx = cpuid_ecx(6);
>> +
>> +		if (~opt_cpuid_mask_thermal_ecx)
>> +			ecx &= opt_cpuid_mask_thermal_ecx;
>> +
>> +		cpuidmask_defaults._6c &= (~0ULL << 32) | ecx;
>
> Is there any documentation about this? The BKDG from 03/2016 does not mention
> this MSR (C001_1003). Ah but it is mentioned in docs for Family 15th. How nice.

The documentation in this regard is remarkably poor.

>
>>  	}.
>>  
>> -	if (!skip_thermal_ecx &&
>> -	    (rdmsr_amd_safe(MSR_AMD_THRM_FEATURE_MASK, &eax, &edx) ||
>> -	     wrmsr_amd_safe(MSR_AMD_THRM_FEATURE_MASK, thermal_ecx, edx))){
>> -		skip_thermal_ecx = 1;
>> -		printk("Failed to set CPUID thermal/power feature mask\n");
>> +	if (opt_cpu_info) {
>> +		printk(XENLOG_INFO "Levelling caps: %#x\n", levelling_caps);
>> +		printk(XENLOG_INFO
>> +		       "MSR defaults: 1d 0x%08x, 1c 0x%08x, e1d 0x%08x, "
>> +		       "e1c 0x%08x, 7a0 0x%08x, 7b0 0x%08x, 6c 0x%08x\n",
>> +		       (uint32_t)cpuidmask_defaults._1cd,
>> +		       (uint32_t)(cpuidmask_defaults._1cd >> 32),
>> +		       (uint32_t)cpuidmask_defaults.e1cd,
>> +		       (uint32_t)(cpuidmask_defaults.e1cd >> 32),
>> +		       (uint32_t)(cpuidmask_defaults._7ab0 >> 32),
>> +		       (uint32_t)cpuidmask_defaults._7ab0,
>> +		       (uint32_t)cpuidmask_defaults._6c);
> Why don't you bit shift cpuidmask_defaults._6c too?

Because only the bottom 32 bits are relevant.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 15/26] x86/cpu: Rework Intel masking/faulting setup
  2016-03-28 19:14   ` Konrad Rzeszutek Wilk
@ 2016-04-05 16:45     ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-04-05 16:45 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Xen-devel

On 28/03/16 20:14, Konrad Rzeszutek Wilk wrote:
>> + * Context switch levelling state to the next domain.  A parameter of NULL is
>> + * used to context switch to the default host state, and is used by the BSP/AP
>> + * startup code.
>> + */
>> +static void intel_ctxt_switch_levelling(const struct domain *nextd)
>> +{
>> +	struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
>> +	const struct cpuidmasks *masks = &cpuidmask_defaults;
>> +
> Same question as on the AMD - would it make sense to add an ASSERT
> to make sure that !nextd && system_state != SYS_STATE_active?

Same answer.  (It would break the crash path).

>
> .. snip..
>
>> +static void __init noinline intel_init_levelling(void)
>> +{
>> +	if (opt_cpu_info) {
>> +		printk(XENLOG_INFO "Levelling caps: %#x\n", levelling_caps);
>> +
>> +		if (!cpu_has_cpuid_faulting)
>> +			printk(XENLOG_INFO
>> +			       "MSR defaults: 1d 0x%08x, 1c 0x%08x, e1d 0x%08x, "
>> +			       "e1c 0x%08x, Da1 0x%08x\n",
>> +			       (uint32_t)(cpuidmask_defaults._1cd >> 32),
>> +			       (uint32_t)cpuidmask_defaults._1cd,
>> +			       (uint32_t)(cpuidmask_defaults.e1cd >> 32),
>> +			       (uint32_t)cpuidmask_defaults.e1cd,
>> +			       (uint32_t)cpuidmask_defaults.Da1);
> Perhaps shift Da1 as there is no need in seeing the upper bits?

That is what the uint32_t cast does.
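
For reference, a minimal standalone illustration of the two idioms used in
the printk() arguments (low32() and high32() are made-up helper names):

```c
#include <stdint.h>

/* Conversion to uint32_t keeps the value modulo 2^32, i.e. the low half. */
static uint32_t low32(uint64_t v)
{
    return (uint32_t)v;
}

/* Shift first, then truncate, to obtain the high half. */
static uint32_t high32(uint64_t v)
{
    return (uint32_t)(v >> 32);
}
```

So for a value of 0x1234567889abcdef, low32() gives 0x89abcdef and high32()
gives 0x12345678; no extra masking or shifting is needed for the low half.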

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 17/26] x86/pv: Provide custom cpumasks for PV domains
  2016-03-28 19:40   ` Konrad Rzeszutek Wilk
@ 2016-04-05 16:55     ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-04-05 16:55 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Xen-devel

On 28/03/16 20:40, Konrad Rzeszutek Wilk wrote:
> On Wed, Mar 23, 2016 at 04:36:20PM +0000, Andrew Cooper wrote:
>> And use them in preference to cpumask_defaults on context switch.  HVM domains
> Extra space before HVM

It is normal to have two spaces following a full stop in written text. 
Observe that this is consistent for all text I write.

>> must not be masked (to avoid interfering with cpuid calls within the guest),
>> so always lazily context switch to the host default.
> Could you add please:
> Host default being set by cpuid_mask_* boot paramters.

But that would be misleading.  The exceedingly likely case is that those
parameters are not specified.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v4 26/26] tools/libxc: Calculate xstate cpuid leaf from guest information
  2016-03-31  7:48   ` Jan Beulich
@ 2016-04-05 17:48     ` Andrew Cooper
  2016-04-07  0:16       ` Jan Beulich
  0 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2016-04-05 17:48 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Ian Jackson, Wei Liu, Xen-devel

On 31/03/16 08:48, Jan Beulich wrote:
>>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
>> --- a/tools/libxc/xc_cpuid_x86.c
>> +++ b/tools/libxc/xc_cpuid_x86.c
>> @@ -398,54 +398,97 @@ static void intel_xc_cpuid_policy(xc_interface *xch,
>>      }
>>  }
>>  
>> +/* XSTATE bits in XCR0. */
>> +#define X86_XCR0_X87    (1ULL <<  0)
>> +#define X86_XCR0_SSE    (1ULL <<  1)
>> +#define X86_XCR0_AVX    (1ULL <<  2)
>> +#define X86_XCR0_BNDREG (1ULL <<  3)
>> +#define X86_XCR0_BNDCSR (1ULL <<  4)
>> +#define X86_XCR0_LWP    (1ULL << 62)
> Why an incomplete set? At least PKRU should be needed right
> away. And I see no reason why the three AVX-512 pieces can't
> be put here right away too.

PKRU is another victim of this series being rebased over the
introduction of new functionality.  I will re-add it.

AVX-512 would require adding the AVX feature flags, and deciphering the
dependency tree for all of them.  I have no ability to test any such
additions (no available hardware), and don't want to introduce
possibly-buggy code ahead of full support being added.

>
>> +#define X86_XSS_MASK    (0) /* No XSS states supported yet. */
>> +
>> +/* Per-component subleaf flags. */
>> +#define XSTATE_XSS      (1ULL <<  0)
>> +#define XSTATE_ALIGN64  (1ULL <<  1)
>> +
>>  /* Configure extended state enumeration leaves (0x0000000D for xsave) */
>>  static void xc_cpuid_config_xsave(xc_interface *xch,
>>                                    const struct cpuid_domain_info *info,
>>                                    const unsigned int *input, unsigned int *regs)
>>  {
>> -    if ( info->xfeature_mask == 0 )
>> +    uint64_t guest_xfeature_mask;
>> +
>> +    if ( info->xfeature_mask == 0 ||
>> +         !test_bit(X86_FEATURE_XSAVE, info->featureset) )
>>      {
>>          regs[0] = regs[1] = regs[2] = regs[3] = 0;
>>          return;
>>      }
>>  
>> +    guest_xfeature_mask = X86_XCR0_SSE | X86_XCR0_X87;
>> +
>> +    if ( test_bit(X86_FEATURE_AVX, info->featureset) )
>> +        guest_xfeature_mask |= X86_XCR0_AVX;
>> +
>> +    if ( test_bit(X86_FEATURE_MPX, info->featureset) )
>> +        guest_xfeature_mask |= X86_XCR0_BNDREG | X86_XCR0_BNDCSR;
>> +
>> +    if ( test_bit(X86_FEATURE_LWP, info->featureset) )
>> +        guest_xfeature_mask |= X86_XCR0_LWP;
>> +
>> +    /*
>> +     * Clamp to host mask.  Should be no-op, as guest_xfeature_mask should not
>> +     * be able to be calculated as larger than info->xfeature_mask.
>> +     *
>> +     * TODO - see about making this a harder error.
>> +     */
>> +    guest_xfeature_mask &= info->xfeature_mask;
> This is ugly.

And now I think about it, wrong.  Dom0's cpuid view is that of a PV
guest, which comes with no XSAVES (which will impact the future support
of Processor Trace), and no PKRU.

>  For one, your dependency mechanism should be able to
> express the dependencies you "manually"enforce above. And beyond
> that masking with info->xfeature_mask should be all that's needed,
> together with enforcing the XCR0 / XSS split ...
>
>>      switch ( input[1] )
>>      {
>> -    case 0: 
>> +    case 0:
>>          /* EAX: low 32bits of xfeature_enabled_mask */
>> -        regs[0] = info->xfeature_mask & 0xFFFFFFFF;
>> +        regs[0] = guest_xfeature_mask;
>>          /* EDX: high 32bits of xfeature_enabled_mask */
>> -        regs[3] = (info->xfeature_mask >> 32) & 0xFFFFFFFF;
>> +        regs[3] = guest_xfeature_mask >> 32;
> ... here and ...
>
>>      case 1: /* leaf 1 */
>>          regs[0] = info->featureset[featureword_of(X86_FEATURE_XSAVEOPT)];
>> -        regs[2] &= info->xfeature_mask;
>> -        regs[3] = 0;
>> +        regs[2] = guest_xfeature_mask & X86_XSS_MASK;
>> +        regs[3] = (guest_xfeature_mask >> 32) & X86_XSS_MASK;
> ... here. Yet not by a compile time defined mask, but by using
> (host) CPUID output: It is clear that once a bit got assigned to XCR0
> vs XSS, it won't ever change. Hence it doesn't matter whether you
> use the guest or host view of that split. And this will then also - other
> than you've said before would be unavoidable - make unnecessary to
> always update this code when new states get added.

There is no possible way of avoiding having a whitelist somewhere, which
limits what Xen will tolerate supporting for the guest.

All of this code should have been implemented in Xen in the first
place.  I am afraid that this can't be fixed properly without my further
plans to do full policy handling in Xen.

I will see if I can find a minimal way of fixing this for 4.7, but it is
yet another example of xstate handling simply being broken in tree.

>
>> -    case 2 ... 63: /* sub-leaves */
>> -        if ( !(info->xfeature_mask & (1ULL << input[1])) )
>> +
>> +    case 2 ... 62: /* per-component sub-leaves */
>> +        if ( !(guest_xfeature_mask & (1ULL << input[1])) )
>>          {
>>              regs[0] = regs[1] = regs[2] = regs[3] = 0;
>>              break;
>>          }
>>          /* Don't touch EAX, EBX. Also cleanup ECX and EDX */
>> -        regs[2] = regs[3] = 0;
>> +        regs[2] &= XSTATE_XSS | XSTATE_ALIGN64;
> Wouldn't this better also use the "known features" approach, by
> adding yet another word in cpufeatureset.h?

No - I thought I had already explained why.

There is a mapping between features and available xstate to use those
features (with some features mapping to multiple xstates).  Having the
valid xstates derived from the configured features prevents the two
getting out of sync, and advertising a feature without its applicable
xstate, or advertising an xstate without the appropriate feature bit.
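
As a simplified standalone sketch of that derivation (the XCR0 bit values are
the ones from the patch; xstates_from_features() and its boolean parameters
are a made-up stand-in for the real featureset bitmap):

```c
#include <stdbool.h>
#include <stdint.h>

/* XSTATE bits in XCR0, as defined in the patch. */
#define X86_XCR0_X87    (1ULL <<  0)
#define X86_XCR0_SSE    (1ULL <<  1)
#define X86_XCR0_AVX    (1ULL <<  2)
#define X86_XCR0_BNDREG (1ULL <<  3)
#define X86_XCR0_BNDCSR (1ULL <<  4)

/*
 * Hypothetical helper: derive the valid xstates purely from the feature
 * flags, so the two cannot get out of sync.
 */
static uint64_t xstates_from_features(bool xsave, bool avx, bool mpx)
{
    uint64_t mask;

    if ( !xsave )
        return 0;           /* No XSAVE means no xstate leaf at all. */

    mask = X86_XCR0_X87 | X86_XCR0_SSE;

    if ( avx )
        mask |= X86_XCR0_AVX;

    if ( mpx )              /* One feature mapping to two xstates. */
        mask |= X86_XCR0_BNDREG | X86_XCR0_BNDCSR;

    return mask;
}
```

Clearing a feature automatically drops its xstates, and there is no way to
end up with an xstate whose feature bit is absent.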

>
> Btw., looking at that header again I now wonder whether it
> wouldn't have been neater to make XEN_CPUFEATURE() a
> 3-parameter macro, with word and bit specified separately
> and a default definition of
>
> #define XEN_CPUFEATURE(name, word, bit) XEN_X86_FEATURE_##name = (word) * 32 + (bit),
>
> avoiding the ugly repeated "*32" in all macro invocations. Of
> course we'd need to adjust this before we release with this new
> interface.

I'd prefer not to.  The "*32" is the expected way of reading the
constants, and providing the word and bit separately allows for someone
to try and do something silly by not multiplying by 32 themselves.
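
To illustrate how such a constant reads, a standalone sketch (EXAMPLE_FEATURE
is a made-up value and the helper names are illustrative, not the ones in the
tree):

```c
#include <stdint.h>

/* Made-up feature number: word 4, bit 3. */
#define EXAMPLE_FEATURE (4 * 32 + 3)

/* Index of the 32-bit featureset word holding the feature. */
static unsigned int word_of(unsigned int feat)
{
    return feat / 32;
}

/* Bit position of the feature within that word. */
static unsigned int bit_of(unsigned int feat)
{
    return feat % 32;
}

/* Single-bit mask for the feature within its word. */
static uint32_t mask_of(unsigned int feat)
{
    return 1u << (feat % 32);
}
```

The word and bit fall straight out of the "4 * 32 + 3" spelling, and there is
only ever a single number for a caller to pass around.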

~Andrew

* Re: [PATCH v4 16/26] x86/cpu: Context switch cpuid masks and faulting state in context_switch()
  2016-03-28 19:27   ` Konrad Rzeszutek Wilk
@ 2016-04-05 18:34     ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-04-05 18:34 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Xen-devel

On 28/03/16 20:27, Konrad Rzeszutek Wilk wrote:
> On Wed, Mar 23, 2016 at 04:36:19PM +0000, Andrew Cooper wrote:
>> A single ctxt_switch_levelling() function pointer is provided
>> (defaulting to an empty nop), which is overridden in the appropriate
>> $VENDOR_init_levelling().
>>
>> set_cpuid_faulting() is made private and included within
>> intel_ctxt_switch_levelling().
>>
>> One functional change is that the faulting configuration is no longer special
>> cased for dom0.  There was never any need to, and it will cause dom0 to
> There was. See 1d6ffea6
>     ACPI: add _PDC input override mechanism
>
> And in Linux see xen_check_mwait().

This logic is fundamentally broken.  It is, and has *always* been, wrong
to try and equate dom0's view of cpuid with the host's view of cpuid...

>
>> observe the same information through native and enlightened cpuid.
> Which will be a regression when it comes to ACPI C-states

... it can't possibly work for an HVM-based dom0, where there is no
distinction between a native and enlightened cpuid.

I will clearly have to re-break this in a similar way to the MTRR bits,
whitelisted for the hardware domain only, but Linux will have to be
changed to get working deep cstates for PVH/HVMLite.

~Andrew

* Re: [PATCH v4 26/26] tools/libxc: Calculate xstate cpuid leaf from guest information
  2016-04-05 17:48     ` Andrew Cooper
@ 2016-04-07  0:16       ` Jan Beulich
  2016-04-07  0:40         ` Andrew Cooper
  0 siblings, 1 reply; 72+ messages in thread
From: Jan Beulich @ 2016-04-07  0:16 UTC (permalink / raw)
  To: andrew.cooper3; +Cc: Ian.Jackson, wei.liu2, xen-devel

>>> Andrew Cooper <andrew.cooper3@citrix.com> 04/05/16 7:49 PM >>>
>On 31/03/16 08:48, Jan Beulich wrote:
>>>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
>>>      switch ( input[1] )
>>>      {
>>> -    case 0: 
>>> +    case 0:
>>>          /* EAX: low 32bits of xfeature_enabled_mask */
>>> -        regs[0] = info->xfeature_mask & 0xFFFFFFFF;
>>> +        regs[0] = guest_xfeature_mask;
>>          /* EDX: high 32bits of xfeature_enabled_mask */
>> -        regs[3] = (info->xfeature_mask >> 32) & 0xFFFFFFFF;
>> +        regs[3] = guest_xfeature_mask >> 32;
>> ... here and ...
>>
>>>      case 1: /* leaf 1 */
<>>          regs[0] = info->featureset[featureword_of(X86_FEATURE_XSAVEOPT)];
>>> -        regs[2] &= info->xfeature_mask;
>>> -        regs[3] = 0;
>>> +        regs[2] = guest_xfeature_mask & X86_XSS_MASK;
>>> +        regs[3] = (guest_xfeature_mask >> 32) & X86_XSS_MASK;
>> ... here. Yet not by a compile time defined mask, but by using
>> (host) CPUID output: It is clear that once a bit got assigned to XCR0
>> vs XSS, it won't ever change. Hence it doesn't matter whether you
>> use the guest or host view of that split. And this will then also - other
>> than you've said before would be unavoidable - make unnecessary to
>> always update this code when new states get added.
>
>There is no possible way of avoiding having a whitelist somewhere, which
>limits what Xen will tolerate supporting for the guest.

Right, but preferably in exactly one place. And imo that ought to be
info->xfeature_mask.

Jan


* Re: [PATCH v4 26/26] tools/libxc: Calculate xstate cpuid leaf from guest information
  2016-04-07  0:16       ` Jan Beulich
@ 2016-04-07  0:40         ` Andrew Cooper
  2016-04-07  0:56           ` Jan Beulich
  0 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2016-04-07  0:40 UTC (permalink / raw)
  To: Jan Beulich; +Cc: wei.liu2, Ian.Jackson, xen-devel

On 07/04/2016 01:16, Jan Beulich wrote:
>>>> Andrew Cooper <andrew.cooper3@citrix.com> 04/05/16 7:49 PM >>>
>> On 31/03/16 08:48, Jan Beulich wrote:
>>>>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
>>>>      switch ( input[1] )
>>>>      {
>>>> -    case 0: 
>>>> +    case 0:
>>>>          /* EAX: low 32bits of xfeature_enabled_mask */
>>>> -        regs[0] = info->xfeature_mask & 0xFFFFFFFF;
>>>> +        regs[0] = guest_xfeature_mask;
>>>          /* EDX: high 32bits of xfeature_enabled_mask */
>>> -        regs[3] = (info->xfeature_mask >> 32) & 0xFFFFFFFF;
>>> +        regs[3] = guest_xfeature_mask >> 32;
>>> ... here and ...
>>>
>>>>      case 1: /* leaf 1 */
> <>>          regs[0] = info->featureset[featureword_of(X86_FEATURE_XSAVEOPT)];
>>>> -        regs[2] &= info->xfeature_mask;
>>>> -        regs[3] = 0;
>>>> +        regs[2] = guest_xfeature_mask & X86_XSS_MASK;
>>>> +        regs[3] = (guest_xfeature_mask >> 32) & X86_XSS_MASK;
>>> ... here. Yet not by a compile time defined mask, but by using
>>> (host) CPUID output: It is clear that once a bit got assigned to XCR0
>>> vs XSS, it won't ever change. Hence it doesn't matter whether you
>>> use the guest or host view of that split. And this will then also - other
>>> than you've said before would be unavoidable - make unnecessary to
>>> always update this code when new states get added.
>> There is no possible way of avoiding having a whitelist somewhere, which
>> limits what Xen will tolerate supporting for the guest.
> Right, but preferably in exactly one place. And imo that ought to be
> info->xfeature_mask.

info->xfeature_mask is actually Xen's limit, as obtained from
XEN_DOMCTL_getvcpuextstate, so is an authoritative source of "the
maximum Xen will support".

However, the guest_xfeature_mask must be generated and used as in this
patch.  Without it, a domU will break if it migrates from a more capable
xstate host to a less capable host, as using info->xfeature_mask alone
leaks in state which should be levelled out.

Currently upstream, heterogeneous migration of domains using xsave is
broken if the domain first boots on the more-capable host.
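
A worked example with made-up host masks, as a standalone sketch
(advertised_xstates() is a hypothetical helper, not a libxc function):

```c
#include <stdint.h>

/*
 * What the guest gets to see: its own (levelled) xstate mask, clamped to
 * what the current host supports.
 */
static uint64_t advertised_xstates(uint64_t guest_mask, uint64_t host_mask)
{
    return guest_mask & host_mask;
}
```

Take host A supporting x87|SSE|AVX (0x7), host B supporting only x87|SSE
(0x3), and a guest levelled to 0x3.  Advertising the host mask alone shows
the guest 0x7 on A, so it may enable AVX state and then fail to migrate to B.
Going through the guest mask yields 0x3 on both hosts.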

~Andrew

* Re: [PATCH v4 26/26] tools/libxc: Calculate xstate cpuid leaf from guest information
  2016-04-07  0:40         ` Andrew Cooper
@ 2016-04-07  0:56           ` Jan Beulich
  2016-04-07 11:34             ` Andrew Cooper
  0 siblings, 1 reply; 72+ messages in thread
From: Jan Beulich @ 2016-04-07  0:56 UTC (permalink / raw)
  To: andrew.cooper3; +Cc: Ian.Jackson, wei.liu2, xen-devel

>>> Andrew Cooper <andrew.cooper3@citrix.com> 04/07/16 2:40 AM >>>
>On 07/04/2016 01:16, Jan Beulich wrote:
>>>>> Andrew Cooper <andrew.cooper3@citrix.com> 04/05/16 7:49 PM >>>
>>> There is no possible way of avoiding having a whitelist somewhere, which
>>> limits what Xen will tolerate supporting for the guest.
>> Right, but preferably in exactly one place. And imo that ought to be
>> info->xfeature_mask.
>
>info->xfeature_mask is actually Xen's limit, as obtained from
>XEN_DOMCTL_getvcpuextstate, so is an authoritative source of "the
>maximum Xen will support".
>
>However, the guest_xfeature_mask must be generated and used as this
>patch.  Without it, a domU will break if it migrates from a more capable
>xstate host to a less capable host, as using info->xfeature_mask alone
>leaks in state which should be levelled out.
>
>Currently upstream, heterogeneous migration of domains using xsave is
>broken if the domain first boots on the more-capable host.

I don't follow, I'm afraid: To me this looks like two separate things. One is to
suitably level the guest (via its config file), and the other is to not allow it to
use things the host doesn't support. If you want the guest to be migratable
to a less capable host, you need to configure the guest accordingly instead
of relying on a second instance of white listing.

Jan


* Re: [PATCH v4 26/26] tools/libxc: Calculate xstate cpuid leaf from guest information
  2016-04-07  0:56           ` Jan Beulich
@ 2016-04-07 11:34             ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2016-04-07 11:34 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Ian.Jackson, wei.liu2, xen-devel

On 07/04/16 01:56, Jan Beulich wrote:
>>>> Andrew Cooper <andrew.cooper3@citrix.com> 04/07/16 2:40 AM >>>
>> On 07/04/2016 01:16, Jan Beulich wrote:
>>>>>> Andrew Cooper <andrew.cooper3@citrix.com> 04/05/16 7:49 PM >>>
>>>> There is no possible way of avoiding having a whitelist somewhere, which
>>>> limits what Xen will tolerate supporting for the guest.
>>> Right, but preferably in exactly one place. And imo that ought to be
>>> info->xfeature_mask.
>> info->xfeature_mask is actually Xen's limit, as obtained from
>> XEN_DOMCTL_getvcpuextstate, so is an authoritative source of "the
>> maximum Xen will support".
>>
>> However, the guest_xfeature_mask must be generated and used as this
>> patch.  Without it, a domU will break if it migrates from a more capable
>> xstate host to a less capable host, as using info->xfeature_mask alone
>> leaks in state which should be levelled out.
>>
>> Currently upstream, heterogeneous migration of domains using xsave is
>> broken if the domain first boots on the more-capable host.
> I don't follow, I'm afraid: To me this looks like two separate things. One is to
> suitably level the guest (via its config file), and the other is to not allow it to
> use things the host doesn't support. If you want the guest to be migratable
> to a less capable host, you need to configure the guest accordingly instead
> of relying on a second instance of white listing.

Agreed, on all points.

But I assert that my change moves the code from being broken to working,
per the above description.

I have reworded several bits for v5 - perhaps that will make the patch
clearer.

~Andrew

end of thread, other threads:[~2016-04-07 11:34 UTC | newest]

Thread overview: 72+ messages
2016-03-23 16:36 [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Andrew Cooper
2016-03-23 16:36 ` [PATCH v4 01/26] xen/public: Export cpu featureset information in the public API Andrew Cooper
2016-03-24 14:08   ` Jan Beulich
2016-03-24 14:12     ` Andrew Cooper
2016-03-24 14:16       ` Jan Beulich
2016-03-23 16:36 ` [PATCH v4 02/26] xen/x86: Script to automatically process featureset information Andrew Cooper
2016-03-23 16:36 ` [PATCH v4 03/26] xen/x86: Collect more cpuid feature leaves Andrew Cooper
2016-03-23 16:36 ` [PATCH v4 04/26] xen/x86: Mask out unknown features from Xen's capabilities Andrew Cooper
2016-03-23 16:36 ` [PATCH v4 05/26] xen/x86: Annotate special features Andrew Cooper
2016-03-23 16:36 ` [PATCH v4 06/26] xen/x86: Annotate VM applicability in featureset Andrew Cooper
2016-03-23 16:36 ` [PATCH v4 07/26] xen/x86: Calculate maximum host and guest featuresets Andrew Cooper
2016-03-29  8:57   ` Jan Beulich
2016-03-23 16:36 ` [PATCH v4 08/26] xen/x86: Generate deep dependencies of features Andrew Cooper
2016-03-24 16:16   ` Jan Beulich
2016-03-23 16:36 ` [PATCH v4 09/26] xen/x86: Clear dependent features when clearing a cpu cap Andrew Cooper
2016-03-23 16:36 ` [PATCH v4 10/26] xen/x86: Improve disabling of features which have dependencies Andrew Cooper
2016-03-28 15:18   ` Konrad Rzeszutek Wilk
2016-03-23 16:36 ` [PATCH v4 11/26] xen/x86: Improvements to in-hypervisor cpuid sanity checks Andrew Cooper
2016-03-24 15:38   ` Andrew Cooper
2016-03-24 16:47   ` Jan Beulich
2016-03-24 17:01     ` Andrew Cooper
2016-03-24 17:11       ` Jan Beulich
2016-03-24 17:12         ` Andrew Cooper
2016-03-28 15:29   ` Konrad Rzeszutek Wilk
2016-04-05 15:25     ` Andrew Cooper
2016-03-23 16:36 ` [PATCH v4 12/26] x86/cpu: Move set_cpumask() calls into c_early_init() Andrew Cooper
2016-03-28 15:55   ` Konrad Rzeszutek Wilk
2016-04-05 16:19     ` Andrew Cooper
2016-03-23 16:36 ` [PATCH v4 13/26] x86/cpu: Sysctl and common infrastructure for levelling context switching Andrew Cooper
2016-03-24 16:58   ` Jan Beulich
2016-03-28 16:12   ` Konrad Rzeszutek Wilk
2016-04-05 16:33     ` Andrew Cooper
2016-03-28 17:37   ` Konrad Rzeszutek Wilk
2016-03-23 16:36 ` [PATCH v4 14/26] x86/cpu: Rework AMD masking MSR setup Andrew Cooper
2016-03-28 18:55   ` Konrad Rzeszutek Wilk
2016-04-05 16:44     ` Andrew Cooper
2016-03-23 16:36 ` [PATCH v4 15/26] x86/cpu: Rework Intel masking/faulting setup Andrew Cooper
2016-03-28 19:14   ` Konrad Rzeszutek Wilk
2016-04-05 16:45     ` Andrew Cooper
2016-03-23 16:36 ` [PATCH v4 16/26] x86/cpu: Context switch cpuid masks and faulting state in context_switch() Andrew Cooper
2016-03-28 19:27   ` Konrad Rzeszutek Wilk
2016-04-05 18:34     ` Andrew Cooper
2016-03-23 16:36 ` [PATCH v4 17/26] x86/pv: Provide custom cpumasks for PV domains Andrew Cooper
2016-03-28 19:40   ` Konrad Rzeszutek Wilk
2016-04-05 16:55     ` Andrew Cooper
2016-03-23 16:36 ` [PATCH v4 18/26] x86/domctl: Update PV domain cpumasks when setting cpuid policy Andrew Cooper
2016-03-24 17:04   ` Jan Beulich
2016-03-24 17:05     ` Andrew Cooper
2016-03-28 19:51   ` Konrad Rzeszutek Wilk
2016-03-23 16:36 ` [PATCH v4 19/26] xen+tools: Export maximum host and guest cpu featuresets via SYSCTL Andrew Cooper
2016-03-28 19:59   ` Konrad Rzeszutek Wilk
2016-03-23 16:36 ` [PATCH v4 20/26] tools/libxc: Modify bitmap operations to take void pointers Andrew Cooper
2016-03-28 20:05   ` Konrad Rzeszutek Wilk
2016-03-23 16:36 ` [PATCH v4 21/26] tools/libxc: Use public/featureset.h for cpuid policy generation Andrew Cooper
2016-03-28 20:07   ` Konrad Rzeszutek Wilk
2016-03-23 16:36 ` [PATCH v4 22/26] tools/libxc: Expose the automatically generated cpu featuremask information Andrew Cooper
2016-03-28 20:08   ` Konrad Rzeszutek Wilk
2016-03-23 16:36 ` [PATCH v4 23/26] tools: Utility for dealing with featuresets Andrew Cooper
2016-03-28 20:26   ` Konrad Rzeszutek Wilk
2016-03-23 16:36 ` [PATCH v4 24/26] tools/libxc: Wire a featureset through to cpuid policy logic Andrew Cooper
2016-03-28 20:39   ` Konrad Rzeszutek Wilk
2016-03-23 16:36 ` [PATCH v4 25/26] tools/libxc: Use featuresets rather than guesswork Andrew Cooper
2016-03-23 16:36 ` [PATCH v4 26/26] tools/libxc: Calculate xstate cpuid leaf from guest information Andrew Cooper
2016-03-24 17:20   ` Wei Liu
2016-03-31  7:48   ` Jan Beulich
2016-04-05 17:48     ` Andrew Cooper
2016-04-07  0:16       ` Jan Beulich
2016-04-07  0:40         ` Andrew Cooper
2016-04-07  0:56           ` Jan Beulich
2016-04-07 11:34             ` Andrew Cooper
2016-03-24 10:27 ` [PATCH v4 00/26] x86: Improvements to cpuid handling for guests Jan Beulich
2016-03-24 10:28   ` Andrew Cooper
