* [PATCH 00/10] x86/cpuid: Remove the legacy infrastructure
@ 2017-02-20 11:00 Andrew Cooper
  2017-02-20 11:00 ` [PATCH 01/10] x86/cpuid: Disallow policy updates once the domain is running Andrew Cooper
                   ` (9 more replies)
  0 siblings, 10 replies; 65+ messages in thread
From: Andrew Cooper @ 2017-02-20 11:00 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

This is the final piece of work to move the remaining legacy CPUID logic onto
the new guest_cpuid() path, allowing the legacy logic to be dropped entirely.

It is available in git form from http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/xen-cpuid-v4



Andrew Cooper (10):
  x86/cpuid: Disallow policy updates once the domain is running
  x86/gen-cpuid: Clarify the intended meaning of AVX wrt feature
    dependencies
  x86/cpuid: Handle leaf 0x1 in guest_cpuid()
  x86/cpuid: Handle leaf 0x4 in guest_cpuid()
  x86/cpuid: Handle leaf 0x5 in guest_cpuid()
  x86/cpuid: Handle leaf 0x6 in guest_cpuid()
  x86/cpuid: Handle leaf 0xa in guest_cpuid()
  x86/cpuid: Handle leaf 0xb in guest_cpuid()
  x86/cpuid: Drop legacy CPUID infrastructure
  x86/cpuid: Always enable faulting for the control domain

 xen/arch/x86/cpu/intel.c    |  18 +-
 xen/arch/x86/cpuid.c        | 497 ++++++++++++++++++++------------------------
 xen/arch/x86/domctl.c       |  55 +----
 xen/include/asm-x86/cpuid.h |  43 ++--
 xen/tools/gen-cpuid.py      |  10 +-
 5 files changed, 255 insertions(+), 368 deletions(-)

-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 01/10] x86/cpuid: Disallow policy updates once the domain is running
  2017-02-20 11:00 [PATCH 00/10] x86/cpuid: Remove the legacy infrastructure Andrew Cooper
@ 2017-02-20 11:00 ` Andrew Cooper
  2017-02-21 16:37   ` Jan Beulich
  2017-02-20 11:00 ` [PATCH 02/10] x86/gen-cpuid: Clarify the intended meaning of AVX wrt feature dependencies Andrew Cooper
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-02-20 11:00 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

On real hardware, the bulk of CPUID data is system-specific and constant.
Hold the toolstack to the same behaviour when constructing domains.

Values which are expected to change dynamically (e.g. OSXSAVE) are unaffected
and continue to function as before.
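The new rule can be modelled in isolation as follows.  This is an illustrative
sketch only, not Xen code: `struct domain` here is a minimal stand-in for the
real structure, and the function is a hypothetical reduction of the
arch_do_domctl() hunk below.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Illustrative model of the ordering rule: a CPUID policy update
 * succeeds only while the domain is still being constructed.  The real
 * check lives in arch_do_domctl()'s XEN_DOMCTL_set_cpuid case.
 */
struct domain {
    bool creation_finished;  /* Set once the toolstack unpauses the domain. */
};

static int set_cpuid_policy(struct domain *d)
{
    if ( d->creation_finished )
        return -EEXIST;  /* No changing once the domain is running. */

    /* ... pause the domain and apply the new policy here ... */
    return 0;
}
```

A toolstack must therefore issue all its set_cpuid calls before unpausing the
new domain for the first time.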

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/domctl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 3b5c3c9..fc42cb1 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -941,6 +941,8 @@ long arch_do_domctl(
     case XEN_DOMCTL_set_cpuid:
         if ( d == currd ) /* no domain_pause() */
             ret = -EINVAL;
+        else if ( d->creation_finished )
+            ret = -EEXIST; /* No changing once the domain is running. */
         else
         {
             domain_pause(d);
-- 
2.1.4




* [PATCH 02/10] x86/gen-cpuid: Clarify the intended meaning of AVX wrt feature dependencies
  2017-02-20 11:00 [PATCH 00/10] x86/cpuid: Remove the legacy infrastructure Andrew Cooper
  2017-02-20 11:00 ` [PATCH 01/10] x86/cpuid: Disallow policy updates once the domain is running Andrew Cooper
@ 2017-02-20 11:00 ` Andrew Cooper
  2017-02-21 16:40   ` Jan Beulich
  2017-02-20 11:00 ` [PATCH 03/10] x86/cpuid: Handle leaf 0x1 in guest_cpuid() Andrew Cooper
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-02-20 11:00 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/tools/gen-cpuid.py | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py
index 5cab6db..6531ca8 100755
--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -225,9 +225,13 @@ def crunch_numbers(state):
         XSAVE: [XSAVEOPT, XSAVEC, XGETBV1, XSAVES,
                 AVX, MPX, PKU, LWP],
 
-        # AVX is taken to mean hardware support for VEX encoded instructions,
-        # 256bit registers, and the instructions themselves.  Each of these
-        # subsequent instruction groups may only be VEX encoded.
+        # AVX is taken to mean hardware support for 256bit registers (which in
+        # practice depends on the VEX prefix to encode), and the instructions
+        # themselves.
+        #
+        # AVX is not taken to mean support for the VEX prefix itself.
+        # VEX-encoded GPR instructions, such as those from the BMI{1,2} sets,
+        # function fine in the absence of any enabled xstate.
         AVX: [FMA, FMA4, F16C, AVX2, XOP],
 
         # CX16 is only encodable in Long Mode.  LAHF_LM indicates that the
-- 
2.1.4




* [PATCH 03/10] x86/cpuid: Handle leaf 0x1 in guest_cpuid()
  2017-02-20 11:00 [PATCH 00/10] x86/cpuid: Remove the legacy infrastructure Andrew Cooper
  2017-02-20 11:00 ` [PATCH 01/10] x86/cpuid: Disallow policy updates once the domain is running Andrew Cooper
  2017-02-20 11:00 ` [PATCH 02/10] x86/gen-cpuid: Clarify the intended meaning of AVX wrt feature dependencies Andrew Cooper
@ 2017-02-20 11:00 ` Andrew Cooper
  2017-02-21 16:59   ` Jan Beulich
  2017-02-20 11:00 ` [PATCH 04/10] x86/cpuid: Handle leaf 0x4 " Andrew Cooper
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-02-20 11:00 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Boris Ostrovsky, Jan Beulich

The feature words, ecx and edx, are already audited as part of the featureset
logic.  The existing leaf 0x80000001 dynamic logic has its SYSCALL adjustment
split out, as the rest of the adjustments are common with leaf 0x1.  The
existing leaf 0x1 feature adjustments from {pv,hvm}_cpuid() are moved
wholesale into guest_cpuid(), although deduped against the common adjustments.

The eax word is family/model/stepping information, and is fine to use as
provided by the toolstack, although with reserved bits cleared.

The ebx word is more problematic.  The low 8 bits are the brand ID and safe to
pass straight through.  The next 8 bits are the CLFLUSH line size.  This value
is forwarded straight from hardware, as nothing good can possibly come of
providing an alternative value to the guest.

The next 8 bits are slightly different between Intel and AMD, but are both
some property of the number of logical cores in the current physical package.
For now, the toolstack value is used unchanged until better topology support
is available.

The final 8 bits are the initial legacy APIC ID.  For HVM guests, this was
overridden to vcpu_id * 2.  The same logic is now applied to PV guests, so
guests don't observe a constant number on all vcpus via their emulated or
faulted view.
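The ebx layout described above can be sketched as a small decoder.  This is
illustrative only, not Xen code: the field names mirror the new cpuid_policy
fields introduced below (clflush_size, lppp, apic_id), and the sample value in
the test is hypothetical.

```c
#include <stdint.h>

/* Illustrative decoder for the four byte fields of CPUID leaf 0x1 EBX. */
struct leaf1_ebx {
    uint8_t brand_id;     /* Bits  7:0  - legacy brand ID. */
    uint8_t clflush_size; /* Bits 15:8  - cache line size, in 8-byte units. */
    uint8_t lppp;         /* Bits 23:16 - logical processors per package. */
    uint8_t apic_id;      /* Bits 31:24 - initial legacy APIC ID. */
};

static struct leaf1_ebx decode_leaf1_ebx(uint32_t ebx)
{
    return (struct leaf1_ebx){
        .brand_id     = ebx & 0xff,
        .clflush_size = (ebx >> 8) & 0xff,
        .lppp         = (ebx >> 16) & 0xff,
        .apic_id      = (ebx >> 24) & 0xff,
    };
}
```

Under the vcpu_id * 2 rule above, a vcpu with vcpu_id 3 would observe 6 in the
top byte.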

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>

Boris: This conflicts textually but not functionally with your vPMU
adjustments.  Whichever way round we end up needing to rebase should be easy.
---
 xen/arch/x86/cpuid.c        | 351 +++++++++++++++++++-------------------------
 xen/include/asm-x86/cpuid.h |   6 +-
 2 files changed, 158 insertions(+), 199 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index e0a387e..3ecb794 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -176,6 +176,9 @@ static void recalculate_misc(struct cpuid_policy *p)
     switch ( p->x86_vendor )
     {
     case X86_VENDOR_INTEL:
+        p->basic.raw_fms &= 0x0fff3fff;
+        p->basic.apic_id = 0; /* Dynamic. */
+
         p->basic.l2_nr_queries = 1; /* Fixed to 1 query. */
         p->basic.raw[0x3] = EMPTY_LEAF; /* PSN - always hidden. */
         p->basic.raw[0x9] = EMPTY_LEAF; /* DCA - always hidden. */
@@ -194,6 +197,9 @@ static void recalculate_misc(struct cpuid_policy *p)
         break;
 
     case X86_VENDOR_AMD:
+        p->basic.raw_fms &= 0x0fff0fff;
+        p->basic.apic_id = 0; /* Dynamic. */
+
         zero_leaves(p->basic.raw, 0x2, 0x3);
         p->basic.raw[0x9] = EMPTY_LEAF;
 
@@ -502,6 +508,9 @@ void recalculate_cpuid_policy(struct domain *d)
 
     cpuid_featureset_to_policy(fs, p);
 
+    /* Pass host cacheline size through to guests. */
+    p->basic.clflush_size = max->basic.clflush_size;
+
     p->extd.maxphysaddr = min(p->extd.maxphysaddr, max->extd.maxphysaddr);
     p->extd.maxphysaddr = min_t(uint8_t, p->extd.maxphysaddr,
                                 d->arch.paging.gfn_bits + PAGE_SHIFT);
@@ -574,7 +583,6 @@ static void pv_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
 {
     struct vcpu *curr = current;
     struct domain *currd = curr->domain;
-    const struct cpuid_policy *p = currd->arch.cpuid;
 
     if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
         domain_cpuid(currd, leaf, subleaf, res);
@@ -583,147 +591,6 @@ static void pv_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
 
     switch ( leaf )
     {
-    case 0x00000001:
-        res->c = p->basic._1c;
-        res->d = p->basic._1d;
-
-        if ( !is_pvh_domain(currd) )
-        {
-            const struct cpu_user_regs *regs = guest_cpu_user_regs();
-
-            /*
-             * Delete the PVH condition when HVMLite formally replaces PVH,
-             * and HVM guests no longer enter a PV codepath.
-             */
-
-            /*
-             * !!! OSXSAVE handling for PV guests is non-architectural !!!
-             *
-             * Architecturally, the correct code here is simply:
-             *
-             *   if ( curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_OSXSAVE )
-             *       c |= cpufeat_mask(X86_FEATURE_OSXSAVE);
-             *
-             * However because of bugs in Xen (before c/s bd19080b, Nov 2010,
-             * the XSAVE cpuid flag leaked into guests despite the feature not
-             * being available for use), buggy workarounds where introduced to
-             * Linux (c/s 947ccf9c, also Nov 2010) which relied on the fact
-             * that Xen also incorrectly leaked OSXSAVE into the guest.
-             *
-             * Furthermore, providing architectural OSXSAVE behaviour to a
-             * many Linux PV guests triggered a further kernel bug when the
-             * fpu code observes that XSAVEOPT is available, assumes that
-             * xsave state had been set up for the task, and follows a wild
-             * pointer.
-             *
-             * Older Linux PVOPS kernels however do require architectural
-             * behaviour.  They observe Xen's leaked OSXSAVE and assume they
-             * can already use XSETBV, dying with a #UD because the shadowed
-             * CR4.OSXSAVE is clear.  This behaviour has been adjusted in all
-             * observed cases via stable backports of the above changeset.
-             *
-             * Therefore, the leaking of Xen's OSXSAVE setting has become a
-             * defacto part of the PV ABI and can't reasonably be corrected.
-             * It can however be restricted to only the enlightened CPUID
-             * view, as seen by the guest kernel.
-             *
-             * The following situations and logic now applies:
-             *
-             * - Hardware without CPUID faulting support and native CPUID:
-             *    There is nothing Xen can do here.  The hosts XSAVE flag will
-             *    leak through and Xen's OSXSAVE choice will leak through.
-             *
-             *    In the case that the guest kernel has not set up OSXSAVE, only
-             *    SSE will be set in xcr0, and guest userspace can't do too much
-             *    damage itself.
-             *
-             * - Enlightened CPUID or CPUID faulting available:
-             *    Xen can fully control what is seen here.  Guest kernels need
-             *    to see the leaked OSXSAVE via the enlightened path, but
-             *    guest userspace and the native is given architectural
-             *    behaviour.
-             *
-             *    Emulated vs Faulted CPUID is distinguised based on whether a
-             *    #UD or #GP is currently being serviced.
-             */
-            /* OSXSAVE clear in policy.  Fast-forward CR4 back in. */
-            if ( (curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_OSXSAVE) ||
-                 (regs->entry_vector == TRAP_invalid_op &&
-                  guest_kernel_mode(curr, regs) &&
-                  (read_cr4() & X86_CR4_OSXSAVE)) )
-                res->c |= cpufeat_mask(X86_FEATURE_OSXSAVE);
-
-            /*
-             * At the time of writing, a PV domain is the only viable option
-             * for Dom0.  Several interactions between dom0 and Xen for real
-             * hardware setup have unfortunately been implemented based on
-             * state which incorrectly leaked into dom0.
-             *
-             * These leaks are retained for backwards compatibility, but
-             * restricted to the hardware domains kernel only.
-             */
-            if ( is_hardware_domain(currd) && guest_kernel_mode(curr, regs) )
-            {
-                /*
-                 * MTRR used to unconditionally leak into PV guests.  They
-                 * cannot MTRR infrastructure at all, and shouldn't be able to
-                 * see the feature.
-                 *
-                 * Modern PVOPS Linux self-clobbers the MTRR feature, to avoid
-                 * trying to use the associated MSRs.  Xenolinux-based PV dom0's
-                 * however use the MTRR feature as an indication of the presence
-                 * of the XENPF_{add,del,read}_memtype hypercalls.
-                 */
-                if ( cpu_has_mtrr )
-                    res->d |= cpufeat_mask(X86_FEATURE_MTRR);
-
-                /*
-                 * MONITOR never leaked into PV guests, as PV guests cannot
-                 * use the MONITOR/MWAIT instructions.  As such, they require
-                 * the feature to not being present in emulated CPUID.
-                 *
-                 * Modern PVOPS Linux try to be cunning and use native CPUID
-                 * to see if the hardware actually supports MONITOR, and by
-                 * extension, deep C states.
-                 *
-                 * If the feature is seen, deep-C state information is
-                 * obtained from the DSDT and handed back to Xen via the
-                 * XENPF_set_processor_pminfo hypercall.
-                 *
-                 * This mechanism is incompatible with an HVM-based hardware
-                 * domain, and also with CPUID Faulting.
-                 *
-                 * Luckily, Xen can be just as 'cunning', and distinguish an
-                 * emulated CPUID from a faulted CPUID by whether a #UD or #GP
-                 * fault is currently being serviced.  Yuck...
-                 */
-                if ( cpu_has_monitor && regs->entry_vector == TRAP_gp_fault )
-                    res->c |= cpufeat_mask(X86_FEATURE_MONITOR);
-
-                /*
-                 * While MONITOR never leaked into PV guests, EIST always used
-                 * to.
-                 *
-                 * Modern PVOPS will only parse P state information from the
-                 * DSDT and return it to Xen if EIST is seen in the emulated
-                 * CPUID information.
-                 */
-                if ( cpu_has_eist )
-                    res->c |= cpufeat_mask(X86_FEATURE_EIST);
-            }
-        }
-
-        if ( vpmu_enabled(curr) &&
-             vpmu_is_set(vcpu_vpmu(curr), VPMU_CPU_HAS_DS) )
-        {
-            res->d |= cpufeat_mask(X86_FEATURE_DS);
-            if ( cpu_has(&current_cpu_data, X86_FEATURE_DTES64) )
-                res->c |= cpufeat_mask(X86_FEATURE_DTES64);
-            if ( cpu_has(&current_cpu_data, X86_FEATURE_DSCPL) )
-                res->c |= cpufeat_mask(X86_FEATURE_DSCPL);
-        }
-        break;
-
     case 0x0000000a: /* Architectural Performance Monitor Features (Intel) */
         if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
              !vpmu_enabled(curr) )
@@ -740,8 +607,7 @@ static void pv_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
         *res = EMPTY_LEAF;
         break;
 
-    case 0x0:
-    case 0x2 ... 0x3:
+    case 0x0 ... 0x3:
     case 0x7 ... 0x9:
     case 0xc ... XSTATE_CPUID:
     case 0x80000000 ... 0xffffffff:
@@ -754,57 +620,11 @@ static void hvm_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
 {
     struct vcpu *v = current;
     struct domain *d = v->domain;
-    const struct cpuid_policy *p = d->arch.cpuid;
 
     domain_cpuid(d, leaf, subleaf, res);
 
     switch ( leaf )
     {
-    case 0x1:
-        /* Fix up VLAPIC details. */
-        res->b &= 0x00FFFFFFu;
-        res->b |= (v->vcpu_id * 2) << 24;
-
-        res->c = p->basic._1c;
-        res->d = p->basic._1d;
-
-        /* APIC exposed to guests, but Fast-forward MSR_APIC_BASE.EN back in. */
-        if ( vlapic_hw_disabled(vcpu_vlapic(v)) )
-            res->d &= ~cpufeat_bit(X86_FEATURE_APIC);
-
-        /* OSXSAVE clear in policy.  Fast-forward CR4 back in. */
-        if ( v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE )
-            res->c |= cpufeat_mask(X86_FEATURE_OSXSAVE);
-
-        /*
-         * PSE36 is not supported in shadow mode.  This bit should be
-         * unilaterally cleared.
-         *
-         * However, an unspecified version of Hyper-V from 2011 refuses
-         * to start as the "cpu does not provide required hw features" if
-         * it can't see PSE36.
-         *
-         * As a workaround, leak the toolstack-provided PSE36 value into a
-         * shadow guest if the guest is already using PAE paging (and won't
-         * care about reverting back to PSE paging).  Otherwise, knoble it, so
-         * a 32bit guest doesn't get the impression that it could try to use
-         * PSE36 paging.
-         */
-        if ( !hap_enabled(d) && !hvm_pae_enabled(v) )
-            res->d &= ~cpufeat_mask(X86_FEATURE_PSE36);
-
-        if ( vpmu_enabled(v) &&
-             vpmu_is_set(vcpu_vpmu(v), VPMU_CPU_HAS_DS) )
-        {
-            res->d |= cpufeat_mask(X86_FEATURE_DS);
-            if ( cpu_has(&current_cpu_data, X86_FEATURE_DTES64) )
-                res->c |= cpufeat_mask(X86_FEATURE_DTES64);
-            if ( cpu_has(&current_cpu_data, X86_FEATURE_DSCPL) )
-                res->c |= cpufeat_mask(X86_FEATURE_DSCPL);
-        }
-
-        break;
-
     case 0xb:
         /* Fix the x2APIC identifier. */
         res->d = v->vcpu_id * 2;
@@ -822,8 +642,7 @@ static void hvm_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
             res->a = (res->a & ~0xff) | 3;
         break;
 
-    case 0x0:
-    case 0x2 ... 0x3:
+    case 0x0 ... 0x3:
     case 0x7 ... 0x9:
     case 0xc ... XSTATE_CPUID:
     case 0x80000000 ... 0xffffffff:
@@ -876,8 +695,7 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
         default:
             goto legacy;
 
-        case 0x0:
-        case 0x2 ... 0x3:
+        case 0x0 ... 0x3:
         case 0x8 ... 0x9:
         case 0xc:
             *res = p->basic.raw[leaf];
@@ -928,6 +746,141 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
      */
     switch ( leaf )
     {
+        const struct cpu_user_regs *regs;
+
+    case 0x1:
+        /* TODO: Rework topology logic. */
+        res->b &= 0x00ffffffu;
+        res->b |= (v->vcpu_id * 2) << 24;
+
+        /* TODO: Rework vPMU control in terms of toolstack choices. */
+        if ( vpmu_enabled(v) &&
+             vpmu_is_set(vcpu_vpmu(v), VPMU_CPU_HAS_DS) )
+        {
+            res->d |= cpufeat_mask(X86_FEATURE_DS);
+            if ( cpu_has(&current_cpu_data, X86_FEATURE_DTES64) )
+                res->c |= cpufeat_mask(X86_FEATURE_DTES64);
+            if ( cpu_has(&current_cpu_data, X86_FEATURE_DSCPL) )
+                res->c |= cpufeat_mask(X86_FEATURE_DSCPL);
+        }
+
+        if ( has_hvm_container_domain(d) )
+        {
+            /* OSXSAVE clear in policy.  Fast-forward CR4 back in. */
+            if ( v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE )
+                res->c |= cpufeat_mask(X86_FEATURE_OSXSAVE);
+        }
+        else /* PV domain */
+        {
+            regs = guest_cpu_user_regs();
+
+            /*
+             * !!! OSXSAVE handling for PV guests is non-architectural !!!
+             *
+             * Architecturally, the correct code here is simply:
+             *
+             *   if ( v->arch.pv_vcpu.ctrlreg[4] & X86_CR4_OSXSAVE )
+             *       c |= cpufeat_mask(X86_FEATURE_OSXSAVE);
+             *
+             * However because of bugs in Xen (before c/s bd19080b, Nov 2010,
+             * the XSAVE cpuid flag leaked into guests despite the feature not
+             * being available for use), buggy workarounds were introduced to
+             * Linux (c/s 947ccf9c, also Nov 2010) which relied on the fact
+             * that Xen also incorrectly leaked OSXSAVE into the guest.
+             *
+             * Furthermore, providing architectural OSXSAVE behaviour to
+             * many Linux PV guests triggered a further kernel bug when the
+             * fpu code observes that XSAVEOPT is available, assumes that
+             * xsave state had been set up for the task, and follows a wild
+             * pointer.
+             *
+             * Older Linux PVOPS kernels however do require architectural
+             * behaviour.  They observe Xen's leaked OSXSAVE and assume they
+             * can already use XSETBV, dying with a #UD because the shadowed
+             * CR4.OSXSAVE is clear.  This behaviour has been adjusted in all
+             * observed cases via stable backports of the above changeset.
+             *
+             * Therefore, the leaking of Xen's OSXSAVE setting has become a
+             * de facto part of the PV ABI and can't reasonably be corrected.
+             * It can however be restricted to only the enlightened CPUID
+             * view, as seen by the guest kernel.
+             *
+             * The following situations and logic now apply:
+             *
+             * - Hardware without CPUID faulting support and native CPUID:
+             *    There is nothing Xen can do here.  The host's XSAVE flag will
+             *    leak through and Xen's OSXSAVE choice will leak through.
+             *
+             *    In the case that the guest kernel has not set up OSXSAVE, only
+             *    SSE will be set in xcr0, and guest userspace can't do too much
+             *    damage itself.
+             *
+             * - Enlightened CPUID or CPUID faulting available:
+             *    Xen can fully control what is seen here.  Guest kernels need
+             *    to see the leaked OSXSAVE via the enlightened path, but
+             *    guest userspace and the native view are given architectural
+             *    behaviour.
+             *
+             *    Emulated vs Faulted CPUID is distinguished based on whether a
+             *    #UD or #GP is currently being serviced.
+             */
+            /* OSXSAVE clear in policy.  Fast-forward CR4 back in. */
+            if ( (v->arch.pv_vcpu.ctrlreg[4] & X86_CR4_OSXSAVE) ||
+                 (regs->entry_vector == TRAP_invalid_op &&
+                  guest_kernel_mode(v, regs) &&
+                  (read_cr4() & X86_CR4_OSXSAVE)) )
+                res->c |= cpufeat_mask(X86_FEATURE_OSXSAVE);
+
+            /*
+             * At the time of writing, a PV domain is the only viable option
+             * for Dom0.  Several interactions between dom0 and Xen for real
+             * hardware setup have unfortunately been implemented based on
+             * state which incorrectly leaked into dom0.
+             *
+             * These leaks are retained for backwards compatibility, but
+             * restricted to the hardware domain's kernel only.
+             */
+            if ( is_hardware_domain(d) && guest_kernel_mode(v, regs) )
+            {
+                /*
+                 * MONITOR never leaked into PV guests, as PV guests cannot
+                 * use the MONITOR/MWAIT instructions.  As such, they require
+             * the feature not to be present in emulated CPUID.
+                 *
+             * Modern PVOPS Linux tries to be cunning and uses native CPUID
+                 * to see if the hardware actually supports MONITOR, and by
+                 * extension, deep C states.
+                 *
+                 * If the feature is seen, deep-C state information is
+                 * obtained from the DSDT and handed back to Xen via the
+                 * XENPF_set_processor_pminfo hypercall.
+                 *
+                 * This mechanism is incompatible with an HVM-based hardware
+                 * domain, and also with CPUID Faulting.
+                 *
+                 * Luckily, Xen can be just as 'cunning', and distinguish an
+                 * emulated CPUID from a faulted CPUID by whether a #UD or #GP
+                 * fault is currently being serviced.  Yuck...
+                 */
+                if ( cpu_has_monitor && regs->entry_vector == TRAP_gp_fault )
+                    res->c |= cpufeat_mask(X86_FEATURE_MONITOR);
+
+                /*
+                 * While MONITOR never leaked into PV guests, EIST always used
+                 * to.
+                 *
+                 * Modern PVOPS Linux will only parse P state information from
+                 * the DSDT and return it to Xen if EIST is seen in the
+                 * emulated CPUID information.
+                 */
+                if ( cpu_has_eist )
+                    res->c |= cpufeat_mask(X86_FEATURE_EIST);
+            }
+        }
+
+        /* Adjustments common with leaf 0x80000001. */
+        goto common_dynamic_adjustments;
+
     case 0x7:
         switch ( subleaf )
         {
@@ -967,6 +920,12 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
         break;
 
     case 0x80000001:
+        /* SYSCALL is hidden outside of long mode on Intel. */
+        if ( p->x86_vendor == X86_VENDOR_INTEL &&
+             has_hvm_container_domain(d) && !hvm_long_mode_enabled(v) )
+            res->d &= ~cpufeat_mask(X86_FEATURE_SYSCALL);
+
+    common_dynamic_adjustments: /* Adjustments common with leaf 1. */
         if ( has_hvm_container_domain(d) )
         {
             /* Fast-forward MSR_APIC_BASE.EN. */
@@ -989,10 +948,6 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
              */
             if ( !hap_enabled(d) && !hvm_pae_enabled(v) )
                 res->d &= ~cpufeat_mask(X86_FEATURE_PSE36);
-
-            /* SYSCALL is hidden outside of long mode on Intel. */
-            if ( p->x86_vendor == X86_VENDOR_INTEL && !hvm_long_mode_enabled(v) )
-                res->d &= ~cpufeat_mask(X86_FEATURE_SYSCALL);
         }
         else /* PV domain */
         {
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index bc3fc7c..6d1990b 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -106,7 +106,11 @@ struct cpuid_policy
             uint32_t max_leaf, vendor_ebx, vendor_ecx, vendor_edx;
 
             /* Leaf 0x1 - Family/model/stepping and features. */
-            uint32_t raw_fms, /* b */:32;
+            uint32_t raw_fms;
+            uint8_t :8,       /* Brand ID. */
+                clflush_size, /* Number of 8-byte blocks per cache line. */
+                lppp,         /* Logical processors per package. */
+                apic_id;      /* Initial APIC ID. */
             union {
                 uint32_t _1c;
                 struct { DECL_BITFIELD(1c); };
-- 
2.1.4




* [PATCH 04/10] x86/cpuid: Handle leaf 0x4 in guest_cpuid()
  2017-02-20 11:00 [PATCH 00/10] x86/cpuid: Remove the legacy infrastructure Andrew Cooper
                   ` (2 preceding siblings ...)
  2017-02-20 11:00 ` [PATCH 03/10] x86/cpuid: Handle leaf 0x1 in guest_cpuid() Andrew Cooper
@ 2017-02-20 11:00 ` Andrew Cooper
  2017-02-21 17:16   ` Jan Beulich
  2017-03-10 16:27   ` [PATCH v2 " Andrew Cooper
  2017-02-20 11:00 ` [PATCH 05/10] x86/cpuid: Handle leaf 0x5 " Andrew Cooper
                   ` (5 subsequent siblings)
  9 siblings, 2 replies; 65+ messages in thread
From: Andrew Cooper @ 2017-02-20 11:00 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Leaf 0x4 is reserved by AMD.  For Intel, it is a multi-invocation leaf with
ecx enumerating different cache details.

Add a new union for it in struct cpuid_policy, collect it from hardware in
calculate_raw_policy(), audit it in recalculate_cpuid_policy(), and update
guest_cpuid() and update_domain_cpuid_info() to properly insert/extract data.

A lot of the data here will need further auditing/refinement when better
topology support is introduced, but for now, this matches the existing
toolstack behaviour.
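The enumeration pattern used in calculate_raw_policy() can be sketched as
follows.  This is illustrative only, not Xen code: a pre-captured table stands
in for real CPUID execution, and count_cache_subleaves() is a hypothetical
helper.

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Illustrative sketch of multi-invocation leaf 0x4 handling: the leaf is
 * queried with increasing ecx, and enumeration stops when the cache-type
 * field (EAX bits 4:0) reads 0.
 */
struct cpuid_leaf { uint32_t a, b, c, d; };

static size_t count_cache_subleaves(const struct cpuid_leaf *raw, size_t nr)
{
    size_t i;

    for ( i = 0; i < nr; ++i )
        if ( (raw[i].a & 0x1f) == 0 )  /* Type 0: no more caches. */
            break;

    return i;
}
```

As in the patch, reaching the end of the array without finding a type-0 entry
indicates the (arbitrary) CPUID_GUEST_NR_CACHE bound needs enlarging for the
hardware in question.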

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/cpuid.c        | 51 +++++++++++++++++++++++++++++++++++++++++++--
 xen/arch/x86/domctl.c       |  8 ++++++-
 xen/include/asm-x86/cpuid.h | 10 +++++++++
 3 files changed, 66 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 3ecb794..ca38d04 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -163,6 +163,9 @@ static void recalculate_xstate(struct cpuid_policy *p)
  */
 static void recalculate_misc(struct cpuid_policy *p)
 {
+    /* Leaves with subleaf unions. */
+    p->basic.raw[0x4] = p->basic.raw[0x7] = p->basic.raw[0xd] = EMPTY_LEAF;
+
     p->basic.raw[0x8] = EMPTY_LEAF;
     p->basic.raw[0xc] = EMPTY_LEAF;
 
@@ -201,6 +204,7 @@ static void recalculate_misc(struct cpuid_policy *p)
         p->basic.apic_id = 0; /* Dynamic. */
 
         zero_leaves(p->basic.raw, 0x2, 0x3);
+        memset(p->cache.raw, 0, sizeof(p->cache.raw));
         p->basic.raw[0x9] = EMPTY_LEAF;
 
         p->extd.vendor_ebx = p->basic.vendor_ebx;
@@ -244,6 +248,25 @@ static void __init calculate_raw_policy(void)
         cpuid_leaf(i, &p->basic.raw[i]);
     }
 
+    if ( p->basic.max_leaf >= 4 )
+    {
+        for ( i = 0; i < ARRAY_SIZE(p->cache.raw); ++i )
+        {
+            cpuid_count_leaf(4, i, &p->cache.raw[i]);
+
+            if ( p->cache.subleaf[i].type == 0 )
+                break;
+        }
+
+        /*
+         * The choice of CPUID_GUEST_NR_CACHE is arbitrary.  It is expected
+         * that it will eventually need increasing for future hardware.
+         */
+        if ( i == ARRAY_SIZE(p->cache.raw) )
+            printk(XENLOG_WARNING
+                   "CPUID: Insufficient Leaf 4 space for this hardware\n");
+    }
+
     if ( p->basic.max_leaf >= 7 )
     {
         cpuid_count_leaf(7, 0, &p->feat.raw[0]);
@@ -522,6 +545,23 @@ void recalculate_cpuid_policy(struct domain *d)
     recalculate_xstate(p);
     recalculate_misc(p);
 
+    for ( i = 0; i < ARRAY_SIZE(p->cache.raw); ++i )
+    {
+        if ( p->cache.subleaf[i].type >= 1 &&
+             p->cache.subleaf[i].type <= 3 )
+        {
+            /* Subleaf has a valid cache type. Zero reserved fields. */
+            p->cache.raw[i].a &= 0xffffc3ffu;
+            p->cache.raw[i].d &= 0x00000007u;
+        }
+        else
+        {
+            /* Subleaf is not valid.  Zero the rest of the union. */
+            zero_leaves(p->cache.raw, i, ARRAY_SIZE(p->cache.raw) - 1);
+            break;
+        }
+    }
+
     if ( !p->extd.svm )
         p->extd.raw[0xa] = EMPTY_LEAF;
 
@@ -607,7 +647,7 @@ static void pv_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
         *res = EMPTY_LEAF;
         break;
 
-    case 0x0 ... 0x3:
+    case 0x0 ... 0x4:
     case 0x7 ... 0x9:
     case 0xc ... XSTATE_CPUID:
     case 0x80000000 ... 0xffffffff:
@@ -642,7 +682,7 @@ static void hvm_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
             res->a = (res->a & ~0xff) | 3;
         break;
 
-    case 0x0 ... 0x3:
+    case 0x0 ... 0x4:
     case 0x7 ... 0x9:
     case 0xc ... XSTATE_CPUID:
     case 0x80000000 ... 0xffffffff:
@@ -676,6 +716,13 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
 
         switch ( leaf )
         {
+        case 0x4:
+            if ( subleaf >= ARRAY_SIZE(p->cache.raw) )
+                return;
+
+            *res = p->cache.raw[subleaf];
+            break;
+
         case 0x7:
             ASSERT(p->feat.max_subleaf < ARRAY_SIZE(p->feat.raw));
             if ( subleaf > min_t(uint32_t, p->feat.max_subleaf,
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index fc42cb1..aa0d914 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -101,6 +101,10 @@ static int update_domain_cpuid_info(struct domain *d,
     switch ( ctl->input[0] )
     {
     case 0x00000000 ... ARRAY_SIZE(p->basic.raw) - 1:
+        if ( ctl->input[0] == 4 &&
+             ctl->input[1] >= ARRAY_SIZE(p->cache.raw) )
+            return 0;
+
         if ( ctl->input[0] == 7 &&
              ctl->input[1] >= ARRAY_SIZE(p->feat.raw) )
             return 0;
@@ -129,7 +133,9 @@ static int update_domain_cpuid_info(struct domain *d,
     switch ( ctl->input[0] )
     {
     case 0x00000000 ... ARRAY_SIZE(p->basic.raw) - 1:
-        if ( ctl->input[0] == 7 )
+        if ( ctl->input[0] == 4 )
+            p->cache.raw[ctl->input[1]] = leaf;
+        else if ( ctl->input[0] == 7 )
             p->feat.raw[ctl->input[1]] = leaf;
         else if ( ctl->input[0] == XSTATE_CPUID )
             p->xstate.raw[ctl->input[1]] = leaf;
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 6d1990b..14fc95c 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -63,6 +63,7 @@ DECLARE_PER_CPU(bool, cpuid_faulting_enabled);
 
 #define CPUID_GUEST_NR_BASIC      (0xdu + 1)
 #define CPUID_GUEST_NR_FEAT       (0u + 1)
+#define CPUID_GUEST_NR_CACHE      (5u + 1)
 #define CPUID_GUEST_NR_XSTATE     (62u + 1)
 #define CPUID_GUEST_NR_EXTD_INTEL (0x8u + 1)
 #define CPUID_GUEST_NR_EXTD_AMD   (0x1cu + 1)
@@ -125,6 +126,15 @@ struct cpuid_policy
         };
     } basic;
 
+    /* Structured cache leaf: 0x00000004[xx] */
+    union {
+        struct cpuid_leaf raw[CPUID_GUEST_NR_CACHE];
+        struct {
+            uint32_t type:4,
+                :28, :32, :32, :32;
+        } subleaf[CPUID_GUEST_NR_CACHE];
+    } cache;
+
     /* Structured feature leaf: 0x00000007[xx] */
     union {
         struct cpuid_leaf raw[CPUID_GUEST_NR_FEAT];
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 05/10] x86/cpuid: Handle leaf 0x5 in guest_cpuid()
  2017-02-20 11:00 [PATCH 00/10] x86/cpuid: Remove the legacy infrastructure Andrew Cooper
                   ` (3 preceding siblings ...)
  2017-02-20 11:00 ` [PATCH 04/10] x86/cpuid: Handle leaf 0x4 " Andrew Cooper
@ 2017-02-20 11:00 ` Andrew Cooper
  2017-02-21 17:22   ` Jan Beulich
  2017-02-20 11:00 ` [PATCH 06/10] x86/cpuid: Handle leaf 0x6 " Andrew Cooper
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-02-20 11:00 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

The MONITOR flag isn't exposed to guests.  The existing toolstack logic, and
pv_cpuid() in the hypervisor, zero the MONITOR leaf for queries.

However, the MONITOR leaf is still visible in the hardware domain's native
CPUID view, and Linux depends on this to set up C-state information.  Leak the
host's MONITOR leaf under the same circumstances that the MONITOR feature is
leaked.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/cpuid.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index ca38d04..7862d62 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -166,6 +166,8 @@ static void recalculate_misc(struct cpuid_policy *p)
     /* Leaves with subleaf unions. */
     p->basic.raw[0x4] = p->basic.raw[0x7] = p->basic.raw[0xd] = EMPTY_LEAF;
 
+    p->basic.raw[0x5] = EMPTY_LEAF; /* MONITOR not exposed to guests. */
+
     p->basic.raw[0x8] = EMPTY_LEAF;
     p->basic.raw[0xc] = EMPTY_LEAF;
 
@@ -641,13 +643,12 @@ static void pv_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
             res->a = (res->a & ~0xff) | 3;
         break;
 
-    case 0x00000005: /* MONITOR/MWAIT */
     case 0x0000000b: /* Extended Topology Enumeration */
     unsupported:
         *res = EMPTY_LEAF;
         break;
 
-    case 0x0 ... 0x4:
+    case 0x0 ... 0x5:
     case 0x7 ... 0x9:
     case 0xc ... XSTATE_CPUID:
     case 0x80000000 ... 0xffffffff:
@@ -682,7 +683,7 @@ static void hvm_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
             res->a = (res->a & ~0xff) | 3;
         break;
 
-    case 0x0 ... 0x4:
+    case 0x0 ... 0x5:
     case 0x7 ... 0x9:
     case 0xc ... XSTATE_CPUID:
     case 0x80000000 ... 0xffffffff:
@@ -743,6 +744,7 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
             goto legacy;
 
         case 0x0 ... 0x3:
+        case 0x5:
         case 0x8 ... 0x9:
         case 0xc:
             *res = p->basic.raw[leaf];
@@ -928,6 +930,18 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
         /* Adjustments common with leaf 0x80000001. */
         goto common_dynamic_adjustments;
 
+    case 0x5:
+        /*
+         * Leak the hardware MONITOR leaf under the same conditions that the
+         * MONITOR feature flag is leaked.  See above for details.
+         */
+        regs = guest_cpu_user_regs();
+        if ( is_pv_domain(d) && is_hardware_domain(d) &&
+             guest_kernel_mode(v, regs) && cpu_has_monitor &&
+             regs->entry_vector == TRAP_gp_fault )
+            *res = raw_policy.basic.raw[leaf];
+        break;
+
     case 0x7:
         switch ( subleaf )
         {
-- 
2.1.4



^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 06/10] x86/cpuid: Handle leaf 0x6 in guest_cpuid()
  2017-02-20 11:00 [PATCH 00/10] x86/cpuid: Remove the legacy infrastructure Andrew Cooper
                   ` (4 preceding siblings ...)
  2017-02-20 11:00 ` [PATCH 05/10] x86/cpuid: Handle leaf 0x5 " Andrew Cooper
@ 2017-02-20 11:00 ` Andrew Cooper
  2017-02-21 17:25   ` Jan Beulich
  2017-03-10 16:32   ` [PATCH v2 " Andrew Cooper
  2017-02-20 11:00 ` [PATCH 07/10] x86/cpuid: Handle leaf 0xa " Andrew Cooper
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 65+ messages in thread
From: Andrew Cooper @ 2017-02-20 11:00 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

The thermal/performance leaf was previously hidden from HVM guests, but fully
visible to PV guests.  Most of the leaf refers to MSR availability, and there
is nothing an unprivileged PV guest can do with the information, so hide the
leaf entirely.

The PV MSR handling logic has minimal support for some thermal/perf operations
from the hardware domain, so leak through the implemented subset of features.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/cpuid.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 7862d62..2a5e011 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -167,6 +167,7 @@ static void recalculate_misc(struct cpuid_policy *p)
     p->basic.raw[0x4] = p->basic.raw[0x7] = p->basic.raw[0xd] = EMPTY_LEAF;
 
     p->basic.raw[0x5] = EMPTY_LEAF; /* MONITOR not exposed to guests. */
+    p->basic.raw[0x6] = EMPTY_LEAF; /* Therm/Power not exposed to guests. */
 
     p->basic.raw[0x8] = EMPTY_LEAF;
     p->basic.raw[0xc] = EMPTY_LEAF;
@@ -648,8 +649,7 @@ static void pv_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
         *res = EMPTY_LEAF;
         break;
 
-    case 0x0 ... 0x5:
-    case 0x7 ... 0x9:
+    case 0x0 ... 0x9:
     case 0xc ... XSTATE_CPUID:
     case 0x80000000 ... 0xffffffff:
         ASSERT_UNREACHABLE();
@@ -683,8 +683,7 @@ static void hvm_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
             res->a = (res->a & ~0xff) | 3;
         break;
 
-    case 0x0 ... 0x5:
-    case 0x7 ... 0x9:
+    case 0x0 ... 0x9:
     case 0xc ... XSTATE_CPUID:
     case 0x80000000 ... 0xffffffff:
         ASSERT_UNREACHABLE();
@@ -744,7 +743,7 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
             goto legacy;
 
         case 0x0 ... 0x3:
-        case 0x5:
+        case 0x5 ... 0x6:
         case 0x8 ... 0x9:
         case 0xc:
             *res = p->basic.raw[leaf];
@@ -942,6 +941,22 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
             *res = raw_policy.basic.raw[leaf];
         break;
 
+    case 0x6:
+        /*
+         * Leak the implemented-subset of therm/power features into PV
+         * hardware domain kernels.
+         */
+        if ( is_pv_domain(d) && is_hardware_domain(d) &&
+             guest_kernel_mode(v, guest_cpu_user_regs()) )
+        {
+            /* Temp, Turbo, ARAT. */
+            res->a = raw_policy.basic.raw[leaf].a & 0x00000007;
+
+            /* APERF/MPERF, perf/energy bias. */
+            res->c = raw_policy.basic.raw[leaf].c & 0x00000009;
+        }
+        break;
+
     case 0x7:
         switch ( subleaf )
         {
-- 
2.1.4



^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 07/10] x86/cpuid: Handle leaf 0xa in guest_cpuid()
  2017-02-20 11:00 [PATCH 00/10] x86/cpuid: Remove the legacy infrastructure Andrew Cooper
                   ` (5 preceding siblings ...)
  2017-02-20 11:00 ` [PATCH 06/10] x86/cpuid: Handle leaf 0x6 " Andrew Cooper
@ 2017-02-20 11:00 ` Andrew Cooper
  2017-02-22  9:11   ` Jan Beulich
  2017-02-20 11:00 ` [PATCH 08/10] x86/cpuid: Handle leaf 0xb " Andrew Cooper
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-02-20 11:00 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Leaf 0xa is reserved by AMD, and only exposed to Intel guests when vPMU is
enabled.  Leave the logic as-was, ready to be cleaned up when further
toolstack infrastructure is in place.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/cpuid.c | 43 ++++++++++++++++---------------------------
 1 file changed, 16 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 2a5e011..c8dadab 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -208,7 +208,7 @@ static void recalculate_misc(struct cpuid_policy *p)
 
         zero_leaves(p->basic.raw, 0x2, 0x3);
         memset(p->cache.raw, 0, sizeof(p->cache.raw));
-        p->basic.raw[0x9] = EMPTY_LEAF;
+        zero_leaves(p->basic.raw, 0x9, 0xa);
 
         p->extd.vendor_ebx = p->basic.vendor_ebx;
         p->extd.vendor_ecx = p->basic.vendor_ecx;
@@ -634,22 +634,11 @@ static void pv_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
 
     switch ( leaf )
     {
-    case 0x0000000a: /* Architectural Performance Monitor Features (Intel) */
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
-             !vpmu_enabled(curr) )
-            goto unsupported;
-
-        /* Report at most version 3 since that's all we currently emulate. */
-        if ( (res->a & 0xff) > 3 )
-            res->a = (res->a & ~0xff) | 3;
-        break;
-
     case 0x0000000b: /* Extended Topology Enumeration */
-    unsupported:
         *res = EMPTY_LEAF;
         break;
 
-    case 0x0 ... 0x9:
+    case 0x0 ... 0xa:
     case 0xc ... XSTATE_CPUID:
     case 0x80000000 ... 0xffffffff:
         ASSERT_UNREACHABLE();
@@ -671,19 +660,7 @@ static void hvm_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
         res->d = v->vcpu_id * 2;
         break;
 
-    case 0x0000000a: /* Architectural Performance Monitor Features (Intel) */
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL || !vpmu_enabled(v) )
-        {
-            *res = EMPTY_LEAF;
-            break;
-        }
-
-        /* Report at most version 3 since that's all we currently emulate */
-        if ( (res->a & 0xff) > 3 )
-            res->a = (res->a & ~0xff) | 3;
-        break;
-
-    case 0x0 ... 0x9:
+    case 0x0 ... 0xa:
     case 0xc ... XSTATE_CPUID:
     case 0x80000000 ... 0xffffffff:
         ASSERT_UNREACHABLE();
@@ -744,7 +721,7 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
 
         case 0x0 ... 0x3:
         case 0x5 ... 0x6:
-        case 0x8 ... 0x9:
+        case 0x8 ... 0xa:
         case 0xc:
             *res = p->basic.raw[leaf];
             break;
@@ -970,6 +947,18 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
         }
         break;
 
+    case 0xa:
+        /* TODO: Rework vPMU control in terms of toolstack choices. */
+        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL || !vpmu_enabled(v) )
+            *res = EMPTY_LEAF;
+        else
+        {
+            /* Report at most v3 since that's all we currently emulate. */
+            if ( (res->a & 0xff) > 3 )
+                res->a = (res->a & ~0xff) | 3;
+        }
+        break;
+
     case XSTATE_CPUID:
         switch ( subleaf )
         {
-- 
2.1.4



^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 08/10] x86/cpuid: Handle leaf 0xb in guest_cpuid()
  2017-02-20 11:00 [PATCH 00/10] x86/cpuid: Remove the legacy infrastructure Andrew Cooper
                   ` (6 preceding siblings ...)
  2017-02-20 11:00 ` [PATCH 07/10] x86/cpuid: Handle leaf 0xa " Andrew Cooper
@ 2017-02-20 11:00 ` Andrew Cooper
  2017-02-22  9:16   ` Jan Beulich
  2017-03-10 16:44   ` [PATCH v2 " Andrew Cooper
  2017-02-20 11:00 ` [PATCH 09/10] x86/cpuid: Drop legacy CPUID infrastructure Andrew Cooper
  2017-02-20 11:00 ` [PATCH 10/10] x86/cpuid: Always enable faulting for the control domain Andrew Cooper
  9 siblings, 2 replies; 65+ messages in thread
From: Andrew Cooper @ 2017-02-20 11:00 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Leaf 0xb is reserved by AMD, and uniformly hidden from guests by the toolstack
logic and hypervisor PV logic.

The previous dynamic logic filled in the x2APIC ID for all HVM guests.  This
is modified to respect the entire leaf being reserved by AMD, and extended to
cover PV Intel guests, so they get more sensible values in their emulated and
faulted view of CPUID.

Sensibly exposing the rest of the leaf requires further topology
infrastructure.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/cpuid.c | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index c8dadab..f1bcd7d 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -189,6 +189,9 @@ static void recalculate_misc(struct cpuid_policy *p)
         p->basic.raw[0x3] = EMPTY_LEAF; /* PSN - always hidden. */
         p->basic.raw[0x9] = EMPTY_LEAF; /* DCA - always hidden. */
 
+        /* TODO: Rework topology logic. */
+        p->basic.raw[0xb] = EMPTY_LEAF;
+
         p->extd.vendor_ebx = 0;
         p->extd.vendor_ecx = 0;
         p->extd.vendor_edx = 0;
@@ -208,7 +211,7 @@ static void recalculate_misc(struct cpuid_policy *p)
 
         zero_leaves(p->basic.raw, 0x2, 0x3);
         memset(p->cache.raw, 0, sizeof(p->cache.raw));
-        zero_leaves(p->basic.raw, 0x9, 0xa);
+        zero_leaves(p->basic.raw, 0x9, 0xb);
 
         p->extd.vendor_ebx = p->basic.vendor_ebx;
         p->extd.vendor_ecx = p->basic.vendor_ecx;
@@ -634,12 +637,7 @@ static void pv_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
 
     switch ( leaf )
     {
-    case 0x0000000b: /* Extended Topology Enumeration */
-        *res = EMPTY_LEAF;
-        break;
-
-    case 0x0 ... 0xa:
-    case 0xc ... XSTATE_CPUID:
+    case 0x0 ... XSTATE_CPUID:
     case 0x80000000 ... 0xffffffff:
         ASSERT_UNREACHABLE();
         /* Now handled in guest_cpuid(). */
@@ -655,13 +653,7 @@ static void hvm_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
 
     switch ( leaf )
     {
-    case 0xb:
-        /* Fix the x2APIC identifier. */
-        res->d = v->vcpu_id * 2;
-        break;
-
-    case 0x0 ... 0xa:
-    case 0xc ... XSTATE_CPUID:
+    case 0x0 ... XSTATE_CPUID:
     case 0x80000000 ... 0xffffffff:
         ASSERT_UNREACHABLE();
         /* Now handled in guest_cpuid(). */
@@ -721,8 +713,7 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
 
         case 0x0 ... 0x3:
         case 0x5 ... 0x6:
-        case 0x8 ... 0xa:
-        case 0xc:
+        case 0x8 ... 0xc:
             *res = p->basic.raw[leaf];
             break;
         }
@@ -959,6 +950,14 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
         }
         break;
 
+    case 0xb:
+        if ( p->x86_vendor == X86_VENDOR_INTEL )
+        {
+            /* Fix the x2APIC identifier. */
+            res->d = v->vcpu_id * 2;
+        }
+        break;
+
     case XSTATE_CPUID:
         switch ( subleaf )
         {
-- 
2.1.4



^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 09/10] x86/cpuid: Drop legacy CPUID infrastructure
  2017-02-20 11:00 [PATCH 00/10] x86/cpuid: Remove the legacy infrastructure Andrew Cooper
                   ` (7 preceding siblings ...)
  2017-02-20 11:00 ` [PATCH 08/10] x86/cpuid: Handle leaf 0xb " Andrew Cooper
@ 2017-02-20 11:00 ` Andrew Cooper
  2017-02-22  9:19   ` Jan Beulich
  2017-02-20 11:00 ` [PATCH 10/10] x86/cpuid: Always enable faulting for the control domain Andrew Cooper
  9 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-02-20 11:00 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Now that all leaves have been altered to use the guest_cpuid() path, remove
all the remaining legacy infrastructure.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/cpuid.c        | 76 ---------------------------------------------
 xen/arch/x86/domctl.c       | 45 +--------------------------
 xen/include/asm-x86/cpuid.h | 27 ----------------
 3 files changed, 1 insertion(+), 147 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index f1bcd7d..eaeec97 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -582,8 +582,6 @@ void recalculate_cpuid_policy(struct domain *d)
 
 int init_domain_cpuid_policy(struct domain *d)
 {
-    unsigned int i;
-
     d->arch.cpuid = xmalloc(struct cpuid_policy);
 
     if ( !d->arch.cpuid )
@@ -596,70 +594,9 @@ int init_domain_cpuid_policy(struct domain *d)
 
     recalculate_cpuid_policy(d);
 
-    for ( i = 0; i < MAX_CPUID_INPUT; i++ )
-    {
-        d->arch.cpuid->legacy[i].input[0] = XEN_CPUID_INPUT_UNUSED;
-        d->arch.cpuid->legacy[i].input[1] = XEN_CPUID_INPUT_UNUSED;
-    }
-
     return 0;
 }
 
-static void domain_cpuid(const struct domain *d, uint32_t leaf,
-                         uint32_t subleaf, struct cpuid_leaf *res)
-{
-    unsigned int i;
-
-    for ( i = 0; i < MAX_CPUID_INPUT; i++ )
-    {
-        xen_domctl_cpuid_t *cpuid = &d->arch.cpuid->legacy[i];
-
-        if ( (cpuid->input[0] == leaf) &&
-             ((cpuid->input[1] == XEN_CPUID_INPUT_UNUSED) ||
-              (cpuid->input[1] == subleaf)) )
-        {
-            *res = (struct cpuid_leaf){ cpuid->eax, cpuid->ebx,
-                                        cpuid->ecx, cpuid->edx };
-            return;
-        }
-    }
-}
-
-static void pv_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
-{
-    struct vcpu *curr = current;
-    struct domain *currd = curr->domain;
-
-    if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
-        domain_cpuid(currd, leaf, subleaf, res);
-    else
-        cpuid_count_leaf(leaf, subleaf, res);
-
-    switch ( leaf )
-    {
-    case 0x0 ... XSTATE_CPUID:
-    case 0x80000000 ... 0xffffffff:
-        ASSERT_UNREACHABLE();
-        /* Now handled in guest_cpuid(). */
-    }
-}
-
-static void hvm_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
-{
-    struct vcpu *v = current;
-    struct domain *d = v->domain;
-
-    domain_cpuid(d, leaf, subleaf, res);
-
-    switch ( leaf )
-    {
-    case 0x0 ... XSTATE_CPUID:
-    case 0x80000000 ... 0xffffffff:
-        ASSERT_UNREACHABLE();
-        /* Now handled in guest_cpuid(). */
-    }
-}
-
 void guest_cpuid(const struct vcpu *v, uint32_t leaf,
                  uint32_t subleaf, struct cpuid_leaf *res)
 {
@@ -709,11 +646,6 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
             break;
 
         default:
-            goto legacy;
-
-        case 0x0 ... 0x3:
-        case 0x5 ... 0x6:
-        case 0x8 ... 0xc:
             *res = p->basic.raw[leaf];
             break;
         }
@@ -1037,14 +969,6 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
             res->a = (res->d & v->arch.hvm_svm.guest_lwp_cfg) | 1;
         break;
     }
-
-    /* Done. */
-    return;
-
- legacy:
-    /* {hvm,pv}_cpuid() have this expectation. */
-    ASSERT(v == current);
-    (is_hvm_domain(d) ? hvm_cpuid : pv_cpuid)(leaf, subleaf, res);
 }
 
 static void __init __maybe_unused build_assertions(void)
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index aa0d914..e345767 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -47,51 +47,12 @@ static int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop)
     return iop->remain ? -EFAULT : 0;
 }
 
-static int update_legacy_cpuid_array(struct domain *d,
-                                     const xen_domctl_cpuid_t *ctl)
-{
-    xen_domctl_cpuid_t *cpuid, *unused = NULL;
-    unsigned int i;
-
-    /* Try to insert ctl into d->arch.cpuids[] */
-    for ( i = 0; i < MAX_CPUID_INPUT; i++ )
-    {
-        cpuid = &d->arch.cpuid->legacy[i];
-
-        if ( cpuid->input[0] == XEN_CPUID_INPUT_UNUSED )
-        {
-            if ( !unused )
-                unused = cpuid;
-            continue;
-        }
-
-        if ( (cpuid->input[0] == ctl->input[0]) &&
-             ((cpuid->input[1] == XEN_CPUID_INPUT_UNUSED) ||
-              (cpuid->input[1] == ctl->input[1])) )
-            break;
-    }
-
-    if ( !(ctl->eax | ctl->ebx | ctl->ecx | ctl->edx) )
-    {
-        if ( i < MAX_CPUID_INPUT )
-            cpuid->input[0] = XEN_CPUID_INPUT_UNUSED;
-    }
-    else if ( i < MAX_CPUID_INPUT )
-        *cpuid = *ctl;
-    else if ( unused )
-        *unused = *ctl;
-    else
-        return -ENOENT;
-
-    return 0;
-}
-
 static int update_domain_cpuid_info(struct domain *d,
                                     const xen_domctl_cpuid_t *ctl)
 {
     struct cpuid_policy *p = d->arch.cpuid;
     const struct cpuid_leaf leaf = { ctl->eax, ctl->ebx, ctl->ecx, ctl->edx };
-    int rc, old_vendor = p->x86_vendor;
+    int old_vendor = p->x86_vendor;
 
     /*
      * Skip update for leaves we don't care about.  This avoids the overhead
@@ -125,10 +86,6 @@ static int update_domain_cpuid_info(struct domain *d,
         return 0;
     }
 
-    rc = update_legacy_cpuid_array(d, ctl);
-    if ( rc )
-        return rc;
-
     /* Insert ctl data into cpuid_policy. */
     switch ( ctl->input[0] )
     {
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 14fc95c..8d6a653 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -72,29 +72,6 @@ DECLARE_PER_CPU(bool, cpuid_faulting_enabled);
 
 struct cpuid_policy
 {
-    /*
-     * WARNING: During the CPUID transition period, not all information here
-     * is accurate.  The following items are accurate, and can be relied upon.
-     *
-     * Global *_policy objects:
-     *
-     * - Guest accurate:
-     *   - All of the feat, xstate and extd unions
-     *   - max_{,sub}leaf
-     *   - All FEATURESET_* words
-     *   - Short vendor infomation
-     *
-     * Per-domain objects:
-     *
-     * - Guest accurate:
-     *   - All of the feat, xstate and extd unions
-     *   - max_{,sub}leaf
-     *   - All FEATURESET_* words
-     *   - Short vendor infomation
-     *
-     * Everything else should be considered inaccurate, and not necesserily 0.
-     */
-
 #define DECL_BITFIELD(word) _DECL_BITFIELD(FEATURESET_ ## word)
 #define _DECL_BITFIELD(x)   __DECL_BITFIELD(x)
 #define __DECL_BITFIELD(x)  CPUID_BITFIELD_ ## x
@@ -230,10 +207,6 @@ struct cpuid_policy
 
     /* Value calculated from raw data above. */
     uint8_t x86_vendor;
-
-    /* Temporary: Legacy data array. */
-#define MAX_CPUID_INPUT 40
-    xen_domctl_cpuid_t legacy[MAX_CPUID_INPUT];
 };
 
 /* Fill in a featureset bitmap from a CPUID policy. */
-- 
2.1.4



^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 10/10] x86/cpuid: Always enable faulting for the control domain
  2017-02-20 11:00 [PATCH 00/10] x86/cpuid: Remove the legacy infrastructure Andrew Cooper
                   ` (8 preceding siblings ...)
  2017-02-20 11:00 ` [PATCH 09/10] x86/cpuid: Drop legacy CPUID infrastructure Andrew Cooper
@ 2017-02-20 11:00 ` Andrew Cooper
  2017-02-22  9:23   ` Jan Beulich
  9 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-02-20 11:00 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

The domain builder in libxc no longer depends on leaked CPUID information to
properly construct HVM domains.  Remove the control domain exclusion.

On capable hardware, this prevents all unintended leakage of hardware CPUID
values into the control domain, and brings the hypervisor leaves into view.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/cpu/intel.c | 18 +++---------------
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
index 2e20327..238c47d 100644
--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -159,22 +159,10 @@ static void intel_ctxt_switch_levelling(const struct vcpu *next)
 
 	if (cpu_has_cpuid_faulting) {
 		/*
-		 * We *should* be enabling faulting for the control domain.
-		 *
-		 * Unfortunately, the domain builder (having only ever been a
-		 * PV guest) expects to be able to see host cpuid state in a
-		 * native CPUID instruction, to correctly build a CPUID policy
-		 * for HVM guests (notably the xstate leaves).
-		 *
-		 * This logic is fundimentally broken for HVM toolstack
-		 * domains, and faulting causes PV guests to behave like HVM
-		 * guests from their point of view.
-		 *
-		 * Future development plans will move responsibility for
-		 * generating the maximum full cpuid policy into Xen, at which
-		 * this problem will disappear.
+		 * Always enable faulting for all PV domains.  Enable faulting
+		 * for HVM guests if they have configured it.
 		 */
-		set_cpuid_faulting(nextd && !is_control_domain(nextd) &&
+		set_cpuid_faulting(nextd &&
 				   (is_pv_domain(nextd) ||
 				    next->arch.cpuid_faulting));
 		return;
-- 
2.1.4



^ permalink raw reply related	[flat|nested] 65+ messages in thread

* Re: [PATCH 01/10] x86/cpuid: Disallow policy updates once the domain is running
  2017-02-20 11:00 ` [PATCH 01/10] x86/cpuid: Disallow policy updates once the domain is running Andrew Cooper
@ 2017-02-21 16:37   ` Jan Beulich
  0 siblings, 0 replies; 65+ messages in thread
From: Jan Beulich @ 2017-02-21 16:37 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
> On real hardware, the bulk of CPUID data is system-specific and constant.
> Hold the toolstack to the same behaviour when constructing domains.
> 
> Values which are expected to change dynamically (e.g. OSXSAVE) are 
> unaffected
> and continue to function as before.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>




^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 02/10] x86/gen-cpuid: Clarify the intended meaning of AVX wrt feature dependencies
  2017-02-20 11:00 ` [PATCH 02/10] x86/gen-cpuid: Clarify the intended meaning of AVX wrt feature dependencies Andrew Cooper
@ 2017-02-21 16:40   ` Jan Beulich
  2017-02-21 16:41     ` Andrew Cooper
  2017-02-21 16:47     ` Jan Beulich
  0 siblings, 2 replies; 65+ messages in thread
From: Jan Beulich @ 2017-02-21 16:40 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/tools/gen-cpuid.py
> +++ b/xen/tools/gen-cpuid.py
> @@ -225,9 +225,13 @@ def crunch_numbers(state):
>          XSAVE: [XSAVEOPT, XSAVEC, XGETBV1, XSAVES,
>                  AVX, MPX, PKU, LWP],
>  
> -        # AVX is taken to mean hardware support for VEX encoded instructions,
> -        # 256bit registers, and the instructions themselves.  Each of these
> -        # subsequent instruction groups may only be VEX encoded.
> +        # AVX is taken to mean hardware support for 256bit registers (which in
> +        # practice depends on the VEX prefix to encode), and the instructions
> +        # themselves.
> +        #
> +        # AVX is not taken to mean support for the VEX prefix itself.
> +        # VEX-encoded GPR instructions, such as those from the BMI{1,2} sets
> +        # function fine in the absence of any enabled xstate.
>          AVX: [FMA, FMA4, F16C, AVX2, XOP],

Even if there aren't any EVEX-encoded non-AVX512 instructions so
far, I'd prefer the AVX512F entry to get adjusted along the same
lines. With that
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan
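For context, gen-cpuid.py flattens tables like the one quoted above into a transitive closure, so disabling a feature also disables everything depending on it. A minimal standalone sketch of that flattening (with an abbreviated, illustrative edge set, not the real table) might look like:

```python
# Illustrative subset of a gen-cpuid.py-style dependency table.
# Feature names and edges here are examples, not the full in-tree data.
deps = {
    "XSAVE": ["AVX"],          # AVX state is enumerated/managed via xsave
    "AVX":   ["AVX2", "FMA"],  # 256-bit SIMD features depend on AVX
    # Note: BMI1/BMI2 are VEX-encoded GPR features and deliberately NOT
    # children of AVX, since they work without any enabled xstate.
}

def closure(feat, deps):
    """Return the set of all features transitively dependent on feat."""
    out = set()
    for child in deps.get(feat, []):
        out.add(child)
        out |= closure(child, deps)
    return out

assert closure("XSAVE", deps) == {"AVX", "AVX2", "FMA"}
assert "BMI1" not in closure("AVX", deps)
```

The comment change under review only documents the intended semantics of the AVX node; the closure mechanics are unchanged.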



* Re: [PATCH 02/10] x86/gen-cpuid: Clarify the intended meaning of AVX wrt feature dependencies
  2017-02-21 16:40   ` Jan Beulich
@ 2017-02-21 16:41     ` Andrew Cooper
  2017-02-21 16:47     ` Jan Beulich
  1 sibling, 0 replies; 65+ messages in thread
From: Andrew Cooper @ 2017-02-21 16:41 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 21/02/17 16:40, Jan Beulich wrote:
>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>> --- a/xen/tools/gen-cpuid.py
>> +++ b/xen/tools/gen-cpuid.py
>> @@ -225,9 +225,13 @@ def crunch_numbers(state):
>>          XSAVE: [XSAVEOPT, XSAVEC, XGETBV1, XSAVES,
>>                  AVX, MPX, PKU, LWP],
>>  
>> -        # AVX is taken to mean hardware support for VEX encoded instructions,
>> -        # 256bit registers, and the instructions themselves.  Each of these
>> -        # subsequent instruction groups may only be VEX encoded.
>> +        # AVX is taken to mean hardware support for 256bit registers (which in
>> +        # practice depends on the VEX prefix to encode), and the instructions
>> +        # themselves.
>> +        #
>> +        # AVX is not taken to mean support for the VEX prefix itself.
>> +        # VEX-encoded GPR instructions, such as those from the BMI{1,2} sets
>> +        # function fine in the absence of any enabled xstate.
>>          AVX: [FMA, FMA4, F16C, AVX2, XOP],
> Even if there aren't any EVEX-encoded non-AVX512 instructions so
> far, I'd prefer the AVX512F entry to get adjusted along the same
> lines.

Good point.  I will update.

> With that Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks,

~Andrew


* Re: [PATCH 02/10] x86/gen-cpuid: Clarify the intended meaning of AVX wrt feature dependencies
  2017-02-21 16:40   ` Jan Beulich
  2017-02-21 16:41     ` Andrew Cooper
@ 2017-02-21 16:47     ` Jan Beulich
  2017-02-21 16:53       ` Andrew Cooper
  1 sibling, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2017-02-21 16:47 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 21.02.17 at 17:40, <JBeulich@suse.com> wrote:
>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>> --- a/xen/tools/gen-cpuid.py
>> +++ b/xen/tools/gen-cpuid.py
>> @@ -225,9 +225,13 @@ def crunch_numbers(state):
>>          XSAVE: [XSAVEOPT, XSAVEC, XGETBV1, XSAVES,
>>                  AVX, MPX, PKU, LWP],
>>  
>> -        # AVX is taken to mean hardware support for VEX encoded instructions,
>> -        # 256bit registers, and the instructions themselves.  Each of these
>> -        # subsequent instruction groups may only be VEX encoded.
>> +        # AVX is taken to mean hardware support for 256bit registers (which in
>> +        # practice depends on the VEX prefix to encode), and the instructions
>> +        # themselves.
>> +        #
>> +        # AVX is not taken to mean support for the VEX prefix itself.
>> +        # VEX-encoded GPR instructions, such as those from the BMI{1,2} sets
>> +        # function fine in the absence of any enabled xstate.
>>          AVX: [FMA, FMA4, F16C, AVX2, XOP],
> 
> Even if there aren't any EVEX-encoded non-AVX512 instructions so
> far, I'd prefer the AVX512F entry to get adjusted along the same
> lines. With that
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Actually, one more thing: XOP really has dual meaning (encoding
and certain SIMD instructions). Perhaps it would be good to clarify
this here too.

Jan



* Re: [PATCH 02/10] x86/gen-cpuid: Clarify the intended meaning of AVX wrt feature dependencies
  2017-02-21 16:47     ` Jan Beulich
@ 2017-02-21 16:53       ` Andrew Cooper
  2017-02-21 17:07         ` Jan Beulich
  0 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-02-21 16:53 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 21/02/17 16:47, Jan Beulich wrote:
>>>> On 21.02.17 at 17:40, <JBeulich@suse.com> wrote:
>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>>> --- a/xen/tools/gen-cpuid.py
>>> +++ b/xen/tools/gen-cpuid.py
>>> @@ -225,9 +225,13 @@ def crunch_numbers(state):
>>>          XSAVE: [XSAVEOPT, XSAVEC, XGETBV1, XSAVES,
>>>                  AVX, MPX, PKU, LWP],
>>>  
>>> -        # AVX is taken to mean hardware support for VEX encoded instructions,
>>> -        # 256bit registers, and the instructions themselves.  Each of these
>>> -        # subsequent instruction groups may only be VEX encoded.
>>> +        # AVX is taken to mean hardware support for 256bit registers (which in
>>> +        # practice depends on the VEX prefix to encode), and the instructions
>>> +        # themselves.
>>> +        #
>>> +        # AVX is not taken to mean support for the VEX prefix itself.
>>> +        # VEX-encoded GPR instructions, such as those from the BMI{1,2} sets
>>> +        # function fine in the absence of any enabled xstate.
>>>          AVX: [FMA, FMA4, F16C, AVX2, XOP],
>> Even if there aren't any EVEX-encoded non-AVX512 instructions so
>> far, I'd prefer the AVX512F entry to get adjusted along the same
>> lines. With that
>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> Actually, one more thing: XOP really has dual meaning (encoding
> and certain SIMD instructions). Perhaps it would be good to clarify
> this here too.

We don't currently express any dependencies based on XOP, so there is no
text about it.

As AMD have dropped XOP encoding in their Zen line, I don't expect this
will change going forwards.

~Andrew


* Re: [PATCH 03/10] x86/cpuid: Handle leaf 0x1 in guest_cpuid()
  2017-02-20 11:00 ` [PATCH 03/10] x86/cpuid: Handle leaf 0x1 in guest_cpuid() Andrew Cooper
@ 2017-02-21 16:59   ` Jan Beulich
  2017-02-21 17:13     ` Andrew Cooper
  0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2017-02-21 16:59 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Boris Ostrovsky, Xen-devel

>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
> The ebx word is more problematic.  The low 8 bits are the brand ID and safe to
> pass straight through.  The next 8 bits are the CLFLUSH line size.  This value
> is forwarded straight from hardware, as nothing good can possibly come of
> providing an alternative value to the guest.

This risks the value guests see changing across migration.

> The final 8 bits are the initial legacy APIC ID.  For HVM guests, this was
> overridden to vcpu_id * 2.  The same logic is now applied to PV guests, so
> guests don't observe a constant number on all vcpus via their emulated or
> faulted view.

They won't be the same everywhere, but every 128th CPU will
share values. I'm therefore not sure it wouldn't be better to hand
out zero or all ones here.
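The EBX word layout being debated here decomposes into four byte-wide fields; a sketch (field positions per the x86 architecture manuals) makes the discussion concrete:

```python
def decode_leaf1_ebx(ebx):
    """Split CPUID leaf 1 EBX into its four byte-wide fields."""
    return {
        "brand_id":      ebx & 0xff,          # bits  7:0
        "clflush_size":  (ebx >> 8) & 0xff,   # bits 15:8, in 8-byte units
        "max_log_procs": (ebx >> 16) & 0xff,  # bits 23:16
        "initial_apic":  (ebx >> 24) & 0xff,  # bits 31:24
    }

# A CLFLUSH field of 8 means a 64-byte cache line.
f = decode_leaf1_ebx(0x02100800)
assert f["clflush_size"] * 8 == 64
assert f["initial_apic"] == 0x02   # e.g. vcpu 1 with id = vcpu_id * 2
```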

> --- a/xen/arch/x86/cpuid.c
> +++ b/xen/arch/x86/cpuid.c
> @@ -176,6 +176,9 @@ static void recalculate_misc(struct cpuid_policy *p)
>      switch ( p->x86_vendor )
>      {
>      case X86_VENDOR_INTEL:
> +        p->basic.raw_fms &= 0x0fff3fff;

Wouldn't it be better to use the same mask as for AMD here? The
field is effectively unused / reserved on 64-bit processors.
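The quoted Intel mask can be decoded against the standard leaf 1 EAX family/model/stepping layout (a sketch; field positions per the architecture manuals). The mask keeps the processor-type bits 13:12, and the suggestion above amounts to clearing them as well:

```python
INTEL_FMS_MASK = 0x0fff3fff  # from the quoted hunk: clears EAX bits 31:28, 15:14

def decode_fms(eax):
    """Decode the family/model/stepping fields of CPUID leaf 1 EAX."""
    return {
        "stepping":   eax & 0xf,           # bits  3:0
        "model":      (eax >> 4) & 0xf,    # bits  7:4
        "family":     (eax >> 8) & 0xf,    # bits 11:8
        "type":       (eax >> 12) & 0x3,   # bits 13:12 (processor type)
        "ext_model":  (eax >> 16) & 0xf,   # bits 19:16
        "ext_family": (eax >> 20) & 0xff,  # bits 27:20
    }

# 0x000306a9 is a plausible family-6 signature; the mask leaves it intact.
assert decode_fms(0x000306a9 & INTEL_FMS_MASK)["family"] == 6
# The mask preserves the type field; an AMD-style mask would clear it too.
assert decode_fms(0xffffffff & INTEL_FMS_MASK)["type"] == 0x3
```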

> @@ -967,6 +920,12 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
>          break;
>  
>      case 0x80000001:
> +        /* SYSCALL is hidden outside of long mode on Intel. */
> +        if ( p->x86_vendor == X86_VENDOR_INTEL &&
> +             has_hvm_container_domain(d) && !hvm_long_mode_enabled(v) )
> +            res->d &= ~cpufeat_mask(X86_FEATURE_SYSCALL);
> +
> +    common_dynamic_adjustments: /* Adjustments common with leaf 1. */

Perhaps have "leaf1" in the label name then?

Jan



* Re: [PATCH 02/10] x86/gen-cpuid: Clarify the intended meaning of AVX wrt feature dependencies
  2017-02-21 16:53       ` Andrew Cooper
@ 2017-02-21 17:07         ` Jan Beulich
  2017-02-21 17:12           ` Andrew Cooper
  0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2017-02-21 17:07 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 21.02.17 at 17:53, <andrew.cooper3@citrix.com> wrote:
> On 21/02/17 16:47, Jan Beulich wrote:
>>>>> On 21.02.17 at 17:40, <JBeulich@suse.com> wrote:
>>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>>>> --- a/xen/tools/gen-cpuid.py
>>>> +++ b/xen/tools/gen-cpuid.py
>>>> @@ -225,9 +225,13 @@ def crunch_numbers(state):
>>>>          XSAVE: [XSAVEOPT, XSAVEC, XGETBV1, XSAVES,
>>>>                  AVX, MPX, PKU, LWP],
>>>>  
>>>> -        # AVX is taken to mean hardware support for VEX encoded instructions,
>>>> -        # 256bit registers, and the instructions themselves.  Each of these
>>>> -        # subsequent instruction groups may only be VEX encoded.
>>>> +        # AVX is taken to mean hardware support for 256bit registers (which in
>>>> +        # practice depends on the VEX prefix to encode), and the instructions
>>>> +        # themselves.
>>>> +        #
>>>> +        # AVX is not taken to mean support for the VEX prefix itself.
>>>> +        # VEX-encoded GPR instructions, such as those from the BMI{1,2} sets
>>>> +        # function fine in the absence of any enabled xstate.
>>>>          AVX: [FMA, FMA4, F16C, AVX2, XOP],
>>> Even if there aren't any EVEX-encoded non-AVX512 instructions so
>>> far, I'd prefer the AVX512F entry to get adjusted along the same
>>> lines. With that
>>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>> Actually, one more thing: XOP really has dual meaning (encoding
>> and certain SIMD instructions). Perhaps it would be good to clarify
>> this here too.
> 
> We don't currently express any dependencies based on XOP, so there is no
> text about it.

Well, I'm referring to the text (and dependency) above. In
particular the dependency is meant for the XOP-encoded SIMD
insns (which the XOP feature flag represents), not the XOP-encoded
GPR ones (which have distinct feature flags, but the naming sadly
collides).

Jan



* Re: [PATCH 02/10] x86/gen-cpuid: Clarify the intended meaning of AVX wrt feature dependencies
  2017-02-21 17:07         ` Jan Beulich
@ 2017-02-21 17:12           ` Andrew Cooper
  2017-02-21 17:17             ` Jan Beulich
  0 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-02-21 17:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 21/02/17 17:07, Jan Beulich wrote:
>>>> On 21.02.17 at 17:53, <andrew.cooper3@citrix.com> wrote:
>> On 21/02/17 16:47, Jan Beulich wrote:
>>>>>> On 21.02.17 at 17:40, <JBeulich@suse.com> wrote:
>>>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>>>>> --- a/xen/tools/gen-cpuid.py
>>>>> +++ b/xen/tools/gen-cpuid.py
>>>>> @@ -225,9 +225,13 @@ def crunch_numbers(state):
>>>>>          XSAVE: [XSAVEOPT, XSAVEC, XGETBV1, XSAVES,
>>>>>                  AVX, MPX, PKU, LWP],
>>>>>  
>>>>> -        # AVX is taken to mean hardware support for VEX encoded instructions,
>>>>> -        # 256bit registers, and the instructions themselves.  Each of these
>>>>> -        # subsequent instruction groups may only be VEX encoded.
>>>>> +        # AVX is taken to mean hardware support for 256bit registers (which in
>>>>> +        # practice depends on the VEX prefix to encode), and the instructions
>>>>> +        # themselves.
>>>>> +        #
>>>>> +        # AVX is not taken to mean support for the VEX prefix itself.
>>>>> +        # VEX-encoded GPR instructions, such as those from the BMI{1,2} sets
>>>>> +        # function fine in the absence of any enabled xstate.
>>>>>          AVX: [FMA, FMA4, F16C, AVX2, XOP],
>>>> Even if there aren't any EVEX-encoded non-AVX512 instructions so
>>>> far, I'd prefer the AVX512F entry to get adjusted along the same
>>>> lines. With that
>>>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>>> Actually, one more thing: XOP really has dual meaning (encoding
>>> and certain SIMD instructions). Perhaps it would be good to clarify
>>> this here too.
>> We don't currently express any dependencies based on XOP, so there is no
>> text about it.
> Well, I'm referring to the text (and dependency) above. In
> particular the dependency is meant for the XOP-encoded SIMD
> insns (which the XOP feature flag represents), not the XOP-encoded
> GPR ones (which have distinct feature flags, but the naming sadly
> collides).

Which particular feature groups?  I could extend that text to say

"VEX/XOP-encoded GPR instructions, such as those from the BMI{1,2} and
$X sets..."

~Andrew


* Re: [PATCH 03/10] x86/cpuid: Handle leaf 0x1 in guest_cpuid()
  2017-02-21 16:59   ` Jan Beulich
@ 2017-02-21 17:13     ` Andrew Cooper
  2017-02-21 17:20       ` Jan Beulich
  0 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-02-21 17:13 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Boris Ostrovsky, Xen-devel

On 21/02/17 16:59, Jan Beulich wrote:
>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>> The ebx word is more problematic.  The low 8 bits are the brand ID and safe to
>> pass straight through.  The next 8 bits are the CLFLUSH line size.  This value
>> is forwarded straight from hardware, as nothing good can possibly come of
>> providing an alternative value to the guest.
> Risking the value guests see to change across migration.

CPUID state isn't sent in the migration stream.  It is still regenerated
from scratch on the destination side, thus has always (potentially) been
changing under the feet of the guest.

All this logic is doing is preventing the toolstack from offering
anything but the real hardware value.

With the planned extra toolstack and migration logic in place,
clflush_size being different will be a hard failure to migrate, as it is
not safe for the guest.

>
>> The final 8 bits are the initial legacy APIC ID.  For HVM guests, this was
>> overridden to vcpu_id * 2.  The same logic is now applied to PV guests, so
>> guests don't observe a constant number on all vcpus via their emulated or
>> faulted view.
> They won't be the same everywhere, but every 128th CPU will
> share values. I'm therefore not sure it wouldn't be better to hand
> out zero or all ones here.

There is no case where 128 cpus work sensibly under Xen ATM.

On real hardware, this APIC ID field does start repeating, so IMO it is
the correct action to take.
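The aliasing in question is easy to demonstrate: with an 8-bit legacy APIC ID field populated as vcpu_id * 2, the value repeats every 128 vcpus, mirroring what real hardware does when the field wraps:

```python
def initial_apic_id(vcpu_id):
    # 8-bit legacy APIC ID field in CPUID leaf 1 EBX, filled as vcpu_id * 2.
    return (vcpu_id * 2) & 0xff

assert initial_apic_id(0) == initial_apic_id(128) == 0
assert initial_apic_id(1) == initial_apic_id(129) == 2
# Every 128th vcpu aliases to the same 8-bit value.
```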

>
>> --- a/xen/arch/x86/cpuid.c
>> +++ b/xen/arch/x86/cpuid.c
>> @@ -176,6 +176,9 @@ static void recalculate_misc(struct cpuid_policy *p)
>>      switch ( p->x86_vendor )
>>      {
>>      case X86_VENDOR_INTEL:
>> +        p->basic.raw_fms &= 0x0fff3fff;
> Wouldn't it be better to use the same mask as for AMD here? The
> field is effectively unused / reserved on 64-bit processors.

Perhaps.

>
>> @@ -967,6 +920,12 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
>>          break;
>>  
>>      case 0x80000001:
>> +        /* SYSCALL is hidden outside of long mode on Intel. */
>> +        if ( p->x86_vendor == X86_VENDOR_INTEL &&
>> +             has_hvm_container_domain(d) && !hvm_long_mode_enabled(v) )
>> +            res->d &= ~cpufeat_mask(X86_FEATURE_SYSCALL);
>> +
>> +    common_dynamic_adjustments: /* Adjustments common with leaf 1. */
> Perhaps have "leaf1" in the label named then?

Ok.

~Andrew


* Re: [PATCH 04/10] x86/cpuid: Handle leaf 0x4 in guest_cpuid()
  2017-02-20 11:00 ` [PATCH 04/10] x86/cpuid: Handle leaf 0x4 " Andrew Cooper
@ 2017-02-21 17:16   ` Jan Beulich
  2017-02-21 17:35     ` Andrew Cooper
  2017-03-10 16:27   ` [PATCH v2 " Andrew Cooper
  1 sibling, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2017-02-21 17:16 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/arch/x86/cpuid.c
> +++ b/xen/arch/x86/cpuid.c
> @@ -163,6 +163,9 @@ static void recalculate_xstate(struct cpuid_policy *p)
>   */
>  static void recalculate_misc(struct cpuid_policy *p)
>  {
> +    /* Leaves with subleaf unions. */
> +    p->basic.raw[0x4] = p->basic.raw[0x7] = p->basic.raw[0xd] = EMPTY_LEAF;

How come you play with leaves 7 and 0xd here?

> @@ -244,6 +248,25 @@ static void __init calculate_raw_policy(void)
>          cpuid_leaf(i, &p->basic.raw[i]);
>      }
>  
> +    if ( p->basic.max_leaf >= 4 )
> +    {
> +        for ( i = 0; i < ARRAY_SIZE(p->cache.raw); ++i )
> +        {
> +            cpuid_count_leaf(4, i, &p->cache.raw[i]);
> +
> +            if ( p->cache.subleaf[i].type == 0 )
> +                break;
> +        }
> +
> +        /*
> +         * The choice of CPUID_GUEST_NR_CACHE is arbitrary.  It is expected
> +         * that it will eventually need increasing for future hardware.
> +         */
> +        if ( i == ARRAY_SIZE(p->cache.raw) )
> +            printk(XENLOG_WARNING
> +                   "CPUID: Insufficient Leaf 4 space for this hardware\n");
> +    }

It probably doesn't hurt, but it's off by one: there's not enough space
only when the next (i-th) subleaf doesn't report type 0.

> @@ -125,6 +126,15 @@ struct cpuid_policy
>          };
>      } basic;
>  
> +    /* Structured cache leaf: 0x00000004[xx] */
> +    union {
> +        struct cpuid_leaf raw[CPUID_GUEST_NR_CACHE];
> +        struct {
> +            uint32_t type:4,

According to the SDM version I'm looking at this is a 5 bit field.

Jan



* Re: [PATCH 02/10] x86/gen-cpuid: Clarify the intended meaning of AVX wrt feature dependencies
  2017-02-21 17:12           ` Andrew Cooper
@ 2017-02-21 17:17             ` Jan Beulich
  2017-02-21 17:42               ` Andrew Cooper
  0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2017-02-21 17:17 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 21.02.17 at 18:12, <andrew.cooper3@citrix.com> wrote:
> Which particular feature groups?  I could extend that text to say
> 
> "VEX/XOP-encoded GPR instructions, such as those from the BMI{1,2} and
> $X sets..."

TBM and LWP.

Jan



* Re: [PATCH 03/10] x86/cpuid: Handle leaf 0x1 in guest_cpuid()
  2017-02-21 17:13     ` Andrew Cooper
@ 2017-02-21 17:20       ` Jan Beulich
  2017-02-21 17:29         ` Andrew Cooper
  0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2017-02-21 17:20 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Boris Ostrovsky, Xen-devel

>>> On 21.02.17 at 18:13, <andrew.cooper3@citrix.com> wrote:
> On 21/02/17 16:59, Jan Beulich wrote:
>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>>> The ebx word is more problematic.  The low 8 bits are the brand ID and safe to
>>> pass straight through.  The next 8 bits are the CLFLUSH line size.  This value
>>> is forwarded straight from hardware, as nothing good can possibly come of
>>> providing an alternative value to the guest.
>> Risking the value guests see to change across migration.
> 
> CPUID state isn't sent in the migration stream.  It is still regenerated
> from scratch on the destination side, thus has always (potentially) been
> changing under the feet of the guest.

True.

>>> The final 8 bits are the initial legacy APIC ID.  For HVM guests, this was
>>> overridden to vcpu_id * 2.  The same logic is now applied to PV guests, so
>>> guests don't observe a constant number on all vcpus via their emulated or
>>> faulted view.
>> They won't be the same everywhere, but every 128th CPU will
>> share values. I'm therefore not sure it wouldn't be better to hand
>> out zero or all ones here.
> 
> There is no case where 128 cpus work sensibly under Xen ATM.

For HVM you mean. I'm sure I've seen > 128 vCPU PV guests
(namely Dom0-s).

> On real hardware, this APIC ID field does start repeating, so IMO it is
> the correct action to take.

Oh, right. That's fine then.

>>> --- a/xen/arch/x86/cpuid.c
>>> +++ b/xen/arch/x86/cpuid.c
>>> @@ -176,6 +176,9 @@ static void recalculate_misc(struct cpuid_policy *p)
>>>      switch ( p->x86_vendor )
>>>      {
>>>      case X86_VENDOR_INTEL:
>>> +        p->basic.raw_fms &= 0x0fff3fff;
>> Wouldn't it be better to use the same mask as for AMD here? The
>> field is effectively unused / reserved on 64-bit processors.
> 
> Perhaps.
> 
>>
>>> @@ -967,6 +920,12 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
>>>          break;
>>>  
>>>      case 0x80000001:
>>> +        /* SYSCALL is hidden outside of long mode on Intel. */
>>> +        if ( p->x86_vendor == X86_VENDOR_INTEL &&
>>> +             has_hvm_container_domain(d) && !hvm_long_mode_enabled(v) )
>>> +            res->d &= ~cpufeat_mask(X86_FEATURE_SYSCALL);
>>> +
>>> +    common_dynamic_adjustments: /* Adjustments common with leaf 1. */
>> Perhaps have "leaf1" in the label named then?
> 
> Ok.

With those (last) two adjustments,
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



* Re: [PATCH 05/10] x86/cpuid: Handle leaf 0x5 in guest_cpuid()
  2017-02-20 11:00 ` [PATCH 05/10] x86/cpuid: Handle leaf 0x5 " Andrew Cooper
@ 2017-02-21 17:22   ` Jan Beulich
  0 siblings, 0 replies; 65+ messages in thread
From: Jan Beulich @ 2017-02-21 17:22 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
> The MONITOR flag isn't exposed to guests.  The existing toolstack logic, and
> pv_cpuid() in the hypervisor, zero the MONITOR leaf for queries.
> 
> However, the MONITOR leaf is still visible in the hardware domain's native
> CPUID view, and Linux depends on this to set up C-state information.  Leak the
> host's MONITOR leaf under the same circumstances that the MONITOR feature is
> leaked.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>




* Re: [PATCH 06/10] x86/cpuid: Handle leaf 0x6 in guest_cpuid()
  2017-02-20 11:00 ` [PATCH 06/10] x86/cpuid: Handle leaf 0x6 " Andrew Cooper
@ 2017-02-21 17:25   ` Jan Beulich
  2017-02-21 17:40     ` Andrew Cooper
  2017-03-10 16:32   ` [PATCH v2 " Andrew Cooper
  1 sibling, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2017-02-21 17:25 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
> The thermal/performance leaf was previously hidden from HVM guests, but fully
> visible to PV guests.  Most of the leaf refers to MSR availability, and there
> is nothing an unprivileged PV guest can do with the information, so hide the
> leaf entirely.
> 
> The PV MSR handling logic as minimal support for some thermal/perf operations

... has ...

> from the hardware domain, so leak through the implemented subset of 
> features.

Does it make sense to continue to special case PV hwdom here?
Should there perhaps be at least a fixme note?

Jan



* Re: [PATCH 03/10] x86/cpuid: Handle leaf 0x1 in guest_cpuid()
  2017-02-21 17:20       ` Jan Beulich
@ 2017-02-21 17:29         ` Andrew Cooper
  2017-02-22  7:16           ` Jan Beulich
  0 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-02-21 17:29 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Boris Ostrovsky, Xen-devel

On 21/02/17 17:20, Jan Beulich wrote:
>
>>>> The final 8 bits are the initial legacy APIC ID.  For HVM guests, this was
>>>> overridden to vcpu_id * 2.  The same logic is now applied to PV guests, so
>>>> guests don't observe a constant number on all vcpus via their emulated or
>>>> faulted view.
>>> They won't be the same everywhere, but every 128th CPU will
>>> share values. I'm therefore not sure it wouldn't be better to hand
>>> out zero or all ones here.
>> There is no case where 128 cpus work sensibly under Xen ATM.
> For HVM you mean. I'm sure I've seen > 128 vCPU PV guests
> (namely Dom0-s).

You can physically create PV domains with up to 8192 vcpus.  I tried
this once.

The NMI watchdog (even set to 10s) is unforgiving of some of the
for_each_vcpu() loops during domain destruction.

I can also still create workloads in a 64vcpu HVM guest which will cause
a 5 second watchdog timeout, which is why XenServer's upper supported
vcpu limit is still 32.

~Andrew


* Re: [PATCH 04/10] x86/cpuid: Handle leaf 0x4 in guest_cpuid()
  2017-02-21 17:16   ` Jan Beulich
@ 2017-02-21 17:35     ` Andrew Cooper
  2017-02-22  7:23       ` Jan Beulich
  0 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-02-21 17:35 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 21/02/17 17:16, Jan Beulich wrote:
>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>> --- a/xen/arch/x86/cpuid.c
>> +++ b/xen/arch/x86/cpuid.c
>> @@ -163,6 +163,9 @@ static void recalculate_xstate(struct cpuid_policy *p)
>>   */
>>  static void recalculate_misc(struct cpuid_policy *p)
>>  {
>> +    /* Leaves with subleaf unions. */
>> +    p->basic.raw[0x4] = p->basic.raw[0x7] = p->basic.raw[0xd] = EMPTY_LEAF;
> How come you play with leaves 7 and 0xd here?

This particular piece of clobbering was something which has only just
occurred to me now when implementing the leaf 4 union.

Then again, there is no supported way of getting any values into those
particular rows, or reading out of them, so I could just rely on no-one
caring?

>
>> @@ -244,6 +248,25 @@ static void __init calculate_raw_policy(void)
>>          cpuid_leaf(i, &p->basic.raw[i]);
>>      }
>>  
>> +    if ( p->basic.max_leaf >= 4 )
>> +    {
>> +        for ( i = 0; i < ARRAY_SIZE(p->cache.raw); ++i )
>> +        {
>> +            cpuid_count_leaf(4, i, &p->cache.raw[i]);
>> +
>> +            if ( p->cache.subleaf[i].type == 0 )
>> +                break;
>> +        }
>> +
>> +        /*
>> +         * The choice of CPUID_GUEST_NR_CACHE is arbitrary.  It is expected
>> +         * that it will eventually need increasing for future hardware.
>> +         */
>> +        if ( i == ARRAY_SIZE(p->cache.raw) )
>> +            printk(XENLOG_WARNING
>> +                   "CPUID: Insufficient Leaf 4 space for this hardware\n");
>> +    }
> It probably doesn't hurt, but it's off by one: there's not enough space
> only when the next (i-th) subleaf doesn't report type 0.

This bit of logic is slightly awkward.  We read into p->cache.raw[i]
before looking to see whether p->cache.subleaf[i].type is the end of the
list.  As such we always read one-past-the-end.

>
>> @@ -125,6 +126,15 @@ struct cpuid_policy
>>          };
>>      } basic;
>>  
>> +    /* Structured cache leaf: 0x00000004[xx] */
>> +    union {
>> +        struct cpuid_leaf raw[CPUID_GUEST_NR_CACHE];
>> +        struct {
>> +            uint32_t type:4,
> According to the SDM version I'm looking at this is a 5 bit field.

Right you are.  I'd got confused by the "Bits 04 - 00".  Will fix.
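The width fix being agreed here is just a wider mask (a sketch; per the discussion the cache-type field occupies leaf 4 EAX bits 4:0):

```python
def cache_type(eax):
    # Leaf 4 EAX bits 4:0 are the cache type: 5 bits wide per the SDM,
    # not 4 as in the original hunk.
    return eax & 0x1f

assert cache_type(0x1f) == 0x1f   # a 4-bit field would truncate this to 0xf
assert cache_type(0x21) == 1      # data cache; bit 5 (level field) untouched
```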

~Andrew


* Re: [PATCH 06/10] x86/cpuid: Handle leaf 0x6 in guest_cpuid()
  2017-02-21 17:25   ` Jan Beulich
@ 2017-02-21 17:40     ` Andrew Cooper
  2017-02-21 17:44       ` Andrew Cooper
  2017-02-22  7:31       ` Jan Beulich
  0 siblings, 2 replies; 65+ messages in thread
From: Andrew Cooper @ 2017-02-21 17:40 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 21/02/17 17:25, Jan Beulich wrote:
>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>> The thermal/performance leaf was previously hidden from HVM guests, but fully
>> visible to PV guests.  Most of the leaf refers to MSR availability, and there
>> is nothing an unprivileged PV guest can do with the information, so hide the
>> leaf entirely.
>>
>> The PV MSR handling logic as minimal support for some thermal/perf operations
> ... has ...
>
>> from the hardware domain, so leak through the implemented subset of 
>> features.
> Does it make sense to continue to special case PV hwdom here?

Being able to play with these MSRs will be actively wrong for HVM
context.  It is already fairly wrong for PV context, as nothing prevents
you being rescheduled across pcpus while in the middle of a read/write
cycle on the MSRs.

> Should there perhaps be at least a fixme note?

One way or another, we have to invest some different mechanism of
providing real hardware details to the hardware domain which don't
collide with their vcpus.

~Andrew


* Re: [PATCH 02/10] x86/gen-cpuid: Clarify the intended meaning of AVX wrt feature dependencies
  2017-02-21 17:17             ` Jan Beulich
@ 2017-02-21 17:42               ` Andrew Cooper
  2017-02-22  7:13                 ` Jan Beulich
  0 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-02-21 17:42 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 21/02/17 17:17, Jan Beulich wrote:
>>>> On 21.02.17 at 18:12, <andrew.cooper3@citrix.com> wrote:
>> Which particular feature groups?  I could extend that text to say
>>
>> "VEX/XOP-encoded GPR instructions, such as those from the BMI{1,2} and
>> $X sets..."
> TBM and LWP.

Ok.  Final text reads:

# AVX is not taken to mean support for the VEX prefix itself (nor XOP
# for the XOP prefix).  VEX/XOP-encoded GPR instructions, such as
# those from the BMI{1,2}, TBM and LWP sets function fine in the
# absence of any enabled xstate.

~Andrew


* Re: [PATCH 06/10] x86/cpuid: Handle leaf 0x6 in guest_cpuid()
  2017-02-21 17:40     ` Andrew Cooper
@ 2017-02-21 17:44       ` Andrew Cooper
  2017-02-22  7:31       ` Jan Beulich
  1 sibling, 0 replies; 65+ messages in thread
From: Andrew Cooper @ 2017-02-21 17:44 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 21/02/17 17:40, Andrew Cooper wrote:
> On 21/02/17 17:25, Jan Beulich wrote:
>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>>> The thermal/performance leaf was previously hidden from HVM guests, but fully
>>> visible to PV guests.  Most of the leaf refers to MSR availability, and there
>>> is nothing an unprivileged PV guest can do with the information, so hide the
>>> leaf entirely.
>>>
>>> The PV MSR handling logic as minimal support for some thermal/perf operations
>> ... has ...
>>
>>> from the hardware domain, so leak through the implemented subset of 
>>> features.
>> Does it make sense to continue to special case PV hwdom here?
> Being able to play with these MSRs will be actively wrong for HVM
> context.  It is already fairly wrong for PV context, as nothing prevents
> you being rescheduled across pcpus while in the middle of a read/write
> cycle on the MSRs.
>
>> Should there perhaps be at least a fixme note?
> One way or another, we have to invest some different mechanism of
> providing real hardware details to the hardware domain which don't
> collide with their vcpus.

s/invest/invent/.  Sorry for the confusion.

~Andrew


* Re: [PATCH 02/10] x86/gen-cpuid: Clarify the intended meaning of AVX wrt feature dependencies
  2017-02-21 17:42               ` Andrew Cooper
@ 2017-02-22  7:13                 ` Jan Beulich
  0 siblings, 0 replies; 65+ messages in thread
From: Jan Beulich @ 2017-02-22  7:13 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 21.02.17 at 18:42, <andrew.cooper3@citrix.com> wrote:
> On 21/02/17 17:17, Jan Beulich wrote:
>>>>> On 21.02.17 at 18:12, <andrew.cooper3@citrix.com> wrote:
>>> Which particular feature groups?  I could extend that text to say
>>>
>>> "VEX/XOP-encoded GPR instructions, such as those from the BMI{1,2} and
>>> $X sets..."
>> TBM and LWP.
> 
> Ok.  Final text reads:
> 
> # AVX is not taken to mean support for the VEX prefix itself (nor XOP
> # for the XOP prefix).  VEX/XOP-encoded GPR instructions, such as
> # those from the BMI{1,2}, TBM and LWP sets function fine in the
> # absence of any enabled xstate.

Thanks, looks good.

Jan



* Re: [PATCH 03/10] x86/cpuid: Handle leaf 0x1 in guest_cpuid()
  2017-02-21 17:29         ` Andrew Cooper
@ 2017-02-22  7:16           ` Jan Beulich
  0 siblings, 0 replies; 65+ messages in thread
From: Jan Beulich @ 2017-02-22  7:16 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Boris Ostrovsky, Xen-devel

>>> On 21.02.17 at 18:29, <andrew.cooper3@citrix.com> wrote:
> On 21/02/17 17:20, Jan Beulich wrote:
>>
>>>>> The final 8 bits are the initial legacy APIC ID.  For HVM guests, this was
>>>>> overridden to vcpu_id * 2.  The same logic is now applied to PV guests, so
>>>>> guests don't observe a constant number on all vcpus via their emulated or
>>>>> faulted view.
>>>> They won't be the same everywhere, but every 128th CPU will
>>>> share values. I'm therefore not sure it wouldn't be better to hand
>>>> out zero or all ones here.
>>> There is no case where 128 cpus work sensibly under Xen ATM.
>> For HVM you mean. I'm sure I've seen > 128 vCPU PV guests
>> (namely Dom0-s).
> 
> You can physically create PV domains with up to 8192 vcpus.  I tried
> this once.
> 
> The NMI watchdog (even set to 10s) is unforgiving of some of the
> for_each_vcpu() loops during domain destruction.
> 
> I can also still create workloads in a 64vcpu HVM guest which will cause
> a 5 second watchdog timeout, which is why XenServer's upper supported
> vcpu limit is still 32.

Which does not contradict what I've said: I didn't claim 8k-vCPU
guests would work well, but I'm pretty convinced ones in the
range 128...512 have reasonable chances of working. And we
both know the situation sadly is worse for HVM ones.

Jan



* Re: [PATCH 04/10] x86/cpuid: Handle leaf 0x4 in guest_cpuid()
  2017-02-21 17:35     ` Andrew Cooper
@ 2017-02-22  7:23       ` Jan Beulich
  2017-02-22  7:55         ` Andrew Cooper
  0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2017-02-22  7:23 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 21.02.17 at 18:35, <andrew.cooper3@citrix.com> wrote:
> On 21/02/17 17:16, Jan Beulich wrote:
>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>>> --- a/xen/arch/x86/cpuid.c
>>> +++ b/xen/arch/x86/cpuid.c
>>> @@ -163,6 +163,9 @@ static void recalculate_xstate(struct cpuid_policy *p)
>>>   */
>>>  static void recalculate_misc(struct cpuid_policy *p)
>>>  {
>>> +    /* Leaves with subleaf unions. */
>>> +    p->basic.raw[0x4] = p->basic.raw[0x7] = p->basic.raw[0xd] = EMPTY_LEAF;
>> How come you play with leaves 7 and 0xd here?
> 
> This particular piece of clobbering was something which has only just
> occurred to me now when implementing the leaf 4 union.
> 
> Then again, there is no supported way of getting any values into those
> particular rows, or reading out of them, so I could just rely on no-one
> caring?

Well, if they start out as EMPTY_LEAF and there's no way to get
other values into them, why bother filling them here? All the more so
with a line that doesn't extend neatly should one more such leaf
need adding here. I'd say if you want to clobber the
values here just in case, merge the assignments above (in
numeric order) with the ones that are already there just below
(visible in the original patch context).

>>> @@ -244,6 +248,25 @@ static void __init calculate_raw_policy(void)
>>>          cpuid_leaf(i, &p->basic.raw[i]);
>>>      }
>>>  
>>> +    if ( p->basic.max_leaf >= 4 )
>>> +    {
>>> +        for ( i = 0; i < ARRAY_SIZE(p->cache.raw); ++i )
>>> +        {
>>> +            cpuid_count_leaf(4, i, &p->cache.raw[i]);
>>> +
>>> +            if ( p->cache.subleaf[i].type == 0 )
>>> +                break;
>>> +        }
>>> +
>>> +        /*
>>> +         * The choice of CPUID_GUEST_NR_CACHE is arbitrary.  It is expected
>>> +         * that it will eventually need increasing for future hardware.
>>> +         */
>>> +        if ( i == ARRAY_SIZE(p->cache.raw) )
>>> +            printk(XENLOG_WARNING
>>> +                   "CPUID: Insufficient Leaf 4 space for this hardware\n");
>>> +    }
>> It probably doesn't hurt, but it's off by one: there's not enough space
>> only when the next (i-th) sub-leaf doesn't report type 0.
> 
> This bit of logic is slightly awkward.  We read into p->cache.raw[i]
> before looking to see whether p->cache.subleaf[i].type is the end of the
> list.  As such we always read one-past-the-end.

Sure. Issuing the message prematurely could of course be avoided
nevertheless, by reading sub-leaf i (regardless of whether i ==
CPUID_GUEST_NR_CACHE) into a local variable and checking type
there. But as said, it's not something I strictly ask for to be done,
as I can also see upsides of seeing this warning earlier than
absolutely needed.

Jan



* Re: [PATCH 06/10] x86/cpuid: Handle leaf 0x6 in guest_cpuid()
  2017-02-21 17:40     ` Andrew Cooper
  2017-02-21 17:44       ` Andrew Cooper
@ 2017-02-22  7:31       ` Jan Beulich
  2017-02-22  8:23         ` Andrew Cooper
  1 sibling, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2017-02-22  7:31 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 21.02.17 at 18:40, <andrew.cooper3@citrix.com> wrote:
> On 21/02/17 17:25, Jan Beulich wrote:
>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>>> The thermal/performance leaf was previously hidden from HVM guests, but fully
>>> visible to PV guests.  Most of the leaf refers to MSR availability, and there
>>> is nothing an unprivileged PV guest can do with the information, so hide the
>>> leaf entirely.
>>>
>>> The PV MSR handling logic as minimal support for some thermal/perf operations
>> ... has ...
>>
>>> from the hardware domain, so leak through the implemented subset of 
>>> features.
>> Does it make sense to continue to special case PV hwdom here?
> 
> Being able to play with these MSRs will be actively wrong for HVM
> context.  It is already fairly wrong for PV context, as nothing prevents
> you being rescheduled across pcpus while in the middle of a read/write
> cycle on the MSRs.

So the MSRs in question are, afaics
- MSR_IA32_MPERF, MSR_IA32_APERF, MSR_IA32_PERF_CTL (all
  of which are is_cpufreq_controller() dependent)
- MSR_IA32_THERM_CONTROL, MSR_IA32_ENERGY_PERF_BIAS
  (both of which are is_pinned_vcpu() dependent)
For the latter your argument doesn't apply. For the former, I've
been wondering for a while whether we shouldn't do away with
"cpufreq=dom0-kernel".

>> Should there perhaps be at least a fixme note?
> 
> One way or another, we have to invest some different mechanism of
> providing real hardware details to the hardware domain which don't
> collide with their vcpus.

Hence the question whether to at least add a "fixme" note.

Jan



* Re: [PATCH 04/10] x86/cpuid: Handle leaf 0x4 in guest_cpuid()
  2017-02-22  7:23       ` Jan Beulich
@ 2017-02-22  7:55         ` Andrew Cooper
  0 siblings, 0 replies; 65+ messages in thread
From: Andrew Cooper @ 2017-02-22  7:55 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 22/02/17 07:23, Jan Beulich wrote:
>>>> On 21.02.17 at 18:35, <andrew.cooper3@citrix.com> wrote:
>> On 21/02/17 17:16, Jan Beulich wrote:
>>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>>>> --- a/xen/arch/x86/cpuid.c
>>>> +++ b/xen/arch/x86/cpuid.c
>>>> @@ -163,6 +163,9 @@ static void recalculate_xstate(struct cpuid_policy *p)
>>>>   */
>>>>  static void recalculate_misc(struct cpuid_policy *p)
>>>>  {
>>>> +    /* Leaves with subleaf unions. */
>>>> +    p->basic.raw[0x4] = p->basic.raw[0x7] = p->basic.raw[0xd] = EMPTY_LEAF;
>>> How come you play with leaves 7 and 0xd here?
>> This particular piece of clobbering was something which has only just
>> occurred to me now when implementing the leaf 4 union.
>>
>> Then again, there is no supported way of getting any values into those
>> particular rows, or reading out of them, so I could just rely on no-one
>> caring?
> Well, if they start out as EMPTY_LEAF and there's no way to get
> other values into them, why bother filling them here? All the more so
> with a line that doesn't extend neatly should one more such leaf
> need adding here. I'd say if you want to clobber the
> values here just in case, merge the assignments above (in
> numeric order) with the ones that are already there just below
> (visible in the original patch context).

I will just drop the clobbering.  Even this bit of logic isn't going to
survive to the end of eventual toolstack changes.

>
>>>> @@ -244,6 +248,25 @@ static void __init calculate_raw_policy(void)
>>>>          cpuid_leaf(i, &p->basic.raw[i]);
>>>>      }
>>>>  
>>>> +    if ( p->basic.max_leaf >= 4 )
>>>> +    {
>>>> +        for ( i = 0; i < ARRAY_SIZE(p->cache.raw); ++i )
>>>> +        {
>>>> +            cpuid_count_leaf(4, i, &p->cache.raw[i]);
>>>> +
>>>> +            if ( p->cache.subleaf[i].type == 0 )
>>>> +                break;
>>>> +        }
>>>> +
>>>> +        /*
>>>> +         * The choice of CPUID_GUEST_NR_CACHE is arbitrary.  It is expected
>>>> +         * that it will eventually need increasing for future hardware.
>>>> +         */
>>>> +        if ( i == ARRAY_SIZE(p->cache.raw) )
>>>> +            printk(XENLOG_WARNING
>>>> +                   "CPUID: Insufficient Leaf 4 space for this hardware\n");
>>>> +    }
>>> It probably doesn't hurt, but it's off by one: there's not enough space
>>> only when the next (i-th) sub-leaf doesn't report type 0.
>> This bit of logic is slightly awkward.  We read into p->cache.raw[i]
>> before looking to see whether p->cache.subleaf[i].type is the end of the
>> list.  As such we always read one-past-the-end.
> Sure. Issuing the message prematurely could of course be avoided
> nevertheless, by reading sub-leaf i (regardless of whether i ==
> CPUID_GUEST_NR_CACHE) into a local variable and checking type
> there. But as said, it's not something I strictly ask for to be done,
> as I can also see upsides of seeing this warning earlier than
> absolutely needed.

Ok.  I will leave it as-is.

~Andrew


* Re: [PATCH 06/10] x86/cpuid: Handle leaf 0x6 in guest_cpuid()
  2017-02-22  7:31       ` Jan Beulich
@ 2017-02-22  8:23         ` Andrew Cooper
  2017-02-22  9:12           ` Andrew Cooper
  0 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-02-22  8:23 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 22/02/17 07:31, Jan Beulich wrote:
>>>> On 21.02.17 at 18:40, <andrew.cooper3@citrix.com> wrote:
>> On 21/02/17 17:25, Jan Beulich wrote:
>>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>>>> The thermal/performance leaf was previously hidden from HVM guests, but fully
>>>> visible to PV guests.  Most of the leaf refers to MSR availability, and there
>>>> is nothing an unprivileged PV guest can do with the information, so hide the
>>>> leaf entirely.
>>>>
>>>> The PV MSR handling logic as minimal support for some thermal/perf operations
>>> ... has ...
>>>
>>>> from the hardware domain, so leak through the implemented subset of 
>>>> features.
>>> Does it make sense to continue to special case PV hwdom here?
>> Being able to play with these MSRs will be actively wrong for HVM
>> context.  It is already fairly wrong for PV context, as nothing prevents
>> you being rescheduled across pcpus while in the middle of a read/write
>> cycle on the MSRs.
> So the MSRs in question are, afaics
> - MSR_IA32_MPERF, MSR_IA32_APERF, MSR_IA32_PERF_CTL (all
>   of which are is_cpufreq_controller() dependent)
> - MSR_IA32_THERM_CONTROL, MSR_IA32_ENERGY_PERF_BIAS
>   (both of which are is_pinned_vcpu() dependent)
> For the latter your argument doesn't apply. For the former, I've
> been wondering for a while whether we shouldn't do away with
> "cpufreq=dom0-kernel".

Hmm.  All good points.  If I can get away without leaking any of this,
that would be ideal.  (Let's see what Linux thinks of such a setup.)

~Andrew


* Re: [PATCH 07/10] x86/cpuid: Handle leaf 0xa in guest_cpuid()
  2017-02-20 11:00 ` [PATCH 07/10] x86/cpuid: Handle leaf 0xa " Andrew Cooper
@ 2017-02-22  9:11   ` Jan Beulich
  0 siblings, 0 replies; 65+ messages in thread
From: Jan Beulich @ 2017-02-22  9:11 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
> Leaf 0xa is reserved by AMD, and only exposed to Intel guests when vPMU is
> enabled.  Leave the logic as-was, ready to be cleaned up when further
> toolstack infrastructure is in place.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>




* Re: [PATCH 06/10] x86/cpuid: Handle leaf 0x6 in guest_cpuid()
  2017-02-22  8:23         ` Andrew Cooper
@ 2017-02-22  9:12           ` Andrew Cooper
  2017-02-22  9:26             ` Jan Beulich
  0 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-02-22  9:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 22/02/17 08:23, Andrew Cooper wrote:
> On 22/02/17 07:31, Jan Beulich wrote:
>>>>> On 21.02.17 at 18:40, <andrew.cooper3@citrix.com> wrote:
>>> On 21/02/17 17:25, Jan Beulich wrote:
>>>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>>>>> The thermal/performance leaf was previously hidden from HVM guests, but fully
>>>>> visible to PV guests.  Most of the leaf refers to MSR availability, and there
>>>>> is nothing an unprivileged PV guest can do with the information, so hide the
>>>>> leaf entirely.
>>>>>
>>>>> The PV MSR handling logic as minimal support for some thermal/perf operations
>>>> ... has ...
>>>>
>>>>> from the hardware domain, so leak through the implemented subset of 
>>>>> features.
>>>> Does it make sense to continue to special case PV hwdom here?
>>> Being able to play with these MSRs will be actively wrong for HVM
>>> context.  It is already fairly wrong for PV context, as nothing prevents
>>> you being rescheduled across pcpus while in the middle of a read/write
>>> cycle on the MSRs.
>> So the MSRs in question are, afaics
>> - MSR_IA32_MPERF, MSR_IA32_APERF, MSR_IA32_PERF_CTL (all
>>   of which are is_cpufreq_controller() dependent)
>> - MSR_IA32_THERM_CONTROL, MSR_IA32_ENERGY_PERF_BIAS
>>   (both of which are is_pinned_vcpu() dependent)
>> For the latter your argument doesn't apply. For the former, I've
>> been wondering for a while whether we shouldn't do away with
>> "cpufreq=dom0-kernel".
> Hmm.  All good points.  If I can get away without leaking any of this,
> that would be ideal.  (Let's see what Linux thinks of such a setup.)

Linux seems fine without any of this leakage.  I have checked and C/P
state information is still propagated up to Xen.  I will drop the entire
dynamic adjustment.

~Andrew


* Re: [PATCH 08/10] x86/cpuid: Handle leaf 0xb in guest_cpuid()
  2017-02-20 11:00 ` [PATCH 08/10] x86/cpuid: Handle leaf 0xb " Andrew Cooper
@ 2017-02-22  9:16   ` Jan Beulich
  2017-02-22 10:22     ` Andrew Cooper
  2017-03-10 16:44   ` [PATCH v2 " Andrew Cooper
  1 sibling, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2017-02-22  9:16 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
> Leaf 0xb is reserved by AMD, and uniformly hidden from guests by the toolstack
> logic and hypervisor PV logic.
> 
> The previous dynamic logic filled in the x2APIC ID for all HVM guests.  This
> is modified to respect the entire leaf being reserved by AMD, but is altered
> to include PV Intel guests, so they get more sensible values in their emulated
> and faulted view of CPUID.

Don't we expose x2APIC to HVM guests even on AMD systems? In
which case we surely will want to also expose the x2APIC ID.

> @@ -959,6 +950,14 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
>          }
>          break;
>  
> +    case 0xb:
> +        if ( p->x86_vendor == X86_VENDOR_INTEL )
> +        {
> +            /* Fix the x2APIC identifier. */
> +            res->d = v->vcpu_id * 2;
> +        }
> +        break;

Irrespective of the comment above, wouldn't the if() here better
look at the x2APIC feature flag of the domain?

Jan



* Re: [PATCH 09/10] x86/cpuid: Drop legacy CPUID infrastructure
  2017-02-20 11:00 ` [PATCH 09/10] x86/cpuid: Drop legacy CPUID infrastructure Andrew Cooper
@ 2017-02-22  9:19   ` Jan Beulich
  0 siblings, 0 replies; 65+ messages in thread
From: Jan Beulich @ 2017-02-22  9:19 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
> Now that all leaves have been altered to use the guest_cpuid() path, remove
> all the remaining legacy infrastructure.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

> ---
>  xen/arch/x86/cpuid.c        | 76 ---------------------------------------------
>  xen/arch/x86/domctl.c       | 45 +--------------------------
>  xen/include/asm-x86/cpuid.h | 27 ----------------
>  3 files changed, 1 insertion(+), 147 deletions(-)

Nice.

Jan



* Re: [PATCH 10/10] x86/cpuid: Always enable faulting for the control domain
  2017-02-20 11:00 ` [PATCH 10/10] x86/cpuid: Always enable faulting for the control domain Andrew Cooper
@ 2017-02-22  9:23   ` Jan Beulich
  2017-02-22 10:00     ` Andrew Cooper
  0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2017-02-22  9:23 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
> The domain builder in libxc no longer depends on leaked CPUID information to
> properly construct HVM domains.  Remove the control domain exclusion.

Am I missing some intermediate step? As long as there's a raw
CPUID invocation in xc_cpuid_x86.c (which is still there in staging
and I don't recall this series removing it) it at least _feels_ unsafe.

Also the change here then results in Dom0 observing different
behavior between faulting-capable and faulting-incapable hosts.
I'm not convinced this is desirable.

Jan



* Re: [PATCH 06/10] x86/cpuid: Handle leaf 0x6 in guest_cpuid()
  2017-02-22  9:12           ` Andrew Cooper
@ 2017-02-22  9:26             ` Jan Beulich
  2017-02-27 14:30               ` Andrew Cooper
  0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2017-02-22  9:26 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 22.02.17 at 10:12, <andrew.cooper3@citrix.com> wrote:
> On 22/02/17 08:23, Andrew Cooper wrote:
>> On 22/02/17 07:31, Jan Beulich wrote:
>>>>>> On 21.02.17 at 18:40, <andrew.cooper3@citrix.com> wrote:
>>>> On 21/02/17 17:25, Jan Beulich wrote:
>>>>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>>>>>> The PV MSR handling logic as minimal support for some thermal/perf operations
>>>>>> from the hardware domain, so leak through the implemented subset of 
>>>>>> features.
>>>>> Does it make sense to continue to special case PV hwdom here?
>>>> Being able to play with these MSRs will be actively wrong for HVM
>>>> context.  It is already fairly wrong for PV context, as nothing prevents
>>>> you being rescheduled across pcpus while in the middle of a read/write
>>>> cycle on the MSRs.
>>> So the MSRs in question are, afaics
>>> - MSR_IA32_MPERF, MSR_IA32_APERF, MSR_IA32_PERF_CTL (all
>>>   of which are is_cpufreq_controller() dependent)
>>> - MSR_IA32_THERM_CONTROL, MSR_IA32_ENERGY_PERF_BIAS
>>>   (both of which are is_pinned_vcpu() dependent)
>>> For the latter your argument doesn't apply. For the former, I've
>>> been wondering for a while whether we shouldn't do away with
>>> "cpufreq=dom0-kernel".
>> Hmm.  All good points.  If I can get away without leaking any of this,
>> that would be ideal.  (Let's see what Linux thinks of such a setup.)
> 
> Linux seems fine without any of this leakage.

But is that for a broad range of versions, or just the one you had
to hand?

Jan



* Re: [PATCH 10/10] x86/cpuid: Always enable faulting for the control domain
  2017-02-22  9:23   ` Jan Beulich
@ 2017-02-22 10:00     ` Andrew Cooper
  2017-02-22 10:10       ` Jan Beulich
  0 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-02-22 10:00 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 22/02/17 09:23, Jan Beulich wrote:
>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>> The domain builder in libxc no longer depends on leaked CPUID information to
>> properly construct HVM domains.  Remove the control domain exclusion.
> Am I missing some intermediate step? As long as there's a raw
> CPUID invocation in xc_cpuid_x86.c (which is still there in staging
> and I don't recall this series removing it) it at least _feels_ unsafe.

Strictly speaking, the domain builder part of this was completed after
my xsave adjustments.  All the guest-type-dependent information now
comes from non-cpuid sources in libxc, or Xen ignores the toolstack
values and recalculates information itself.

However, until the Intel leaves were complete, dom0 had a hard time
booting with this change as there was no toolstack-provided policy and
no leakage from hardware.

>
> Also the change here then results in Dom0 observing different
> behavior between faulting-capable and faulting-incapable hosts.
> I'm not convinced this is desirable.

I disagree.  Avoiding the leakage is very desirable moving forwards.

Other side effects are that it makes PV and PVH dom0 functionally
identical WRT CPUID, and PV userspace (which, unlike the kernel, tends
not to be Xen-aware) sees sensible information.

~Andrew


* Re: [PATCH 10/10] x86/cpuid: Always enable faulting for the control domain
  2017-02-22 10:00     ` Andrew Cooper
@ 2017-02-22 10:10       ` Jan Beulich
  2017-02-27 15:10         ` Andrew Cooper
  0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2017-02-22 10:10 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 22.02.17 at 11:00, <andrew.cooper3@citrix.com> wrote:
> On 22/02/17 09:23, Jan Beulich wrote:
>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>>> The domain builder in libxc no longer depends on leaked CPUID information to
>>> properly construct HVM domains.  Remove the control domain exclusion.
>> Am I missing some intermediate step? As long as there's a raw
>> CPUID invocation in xc_cpuid_x86.c (which is still there in staging
>> and I don't recall this series removing it) it at least _feels_ unsafe.
> 
> Strictly speaking, the domain builder part of this was completed after
> my xsave adjustments.  All the guest-type-dependent information now
> comes from non-cpuid sources in libxc, or Xen ignores the toolstack
> values and recalculates information itself.
> 
> However, until the Intel leaves were complete, dom0 had a hard time
> booting with this change as there was no toolstack-provided policy and
> no leakage from hardware.

So what are the CPUID uses in libxc then needed for at this point?
Could they be removed in a prereq patch to make clear all needed
information is now being obtained via hypercalls?

>> Also the change here then results in Dom0 observing different
>> behavior between faulting-capable and faulting-incapable hosts.
>> I'm not convinced this is desirable.
> 
> I disagree.  Avoiding the leakage is very desirable moving forwards.
> 
> Other side effects are that it makes PV and PVH dom0 functionally
> identical WRT CPUID, and PV userspace (which, unlike the kernel, tends
> not to be Xen-aware) sees sensible information.

I can see the upsides too, hence the "I'm not convinced" ...

Jan



* Re: [PATCH 08/10] x86/cpuid: Handle leaf 0xb in guest_cpuid()
  2017-02-22  9:16   ` Jan Beulich
@ 2017-02-22 10:22     ` Andrew Cooper
  2017-02-22 10:37       ` Jan Beulich
  0 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-02-22 10:22 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 22/02/17 09:16, Jan Beulich wrote:
>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>> Leaf 0xb is reserved by AMD, and uniformly hidden from guests by the toolstack
>> logic and hypervisor PV logic.
>>
>> The previous dynamic logic filled in the x2APIC ID for all HVM guests.  This
>> is modified to respect the entire leaf being reserved by AMD, but is altered
>> to include PV Intel guests, so they get more sensible values in their emulated
>> and faulted view of CPUID.
> Don't we expose x2APIC to HVM guests even on AMD systems? In
> which case we surely will want to also expose the x2APIC ID.

The x2apic feature bit is still listed as reserved in the latest AMD
SDM, and I haven't seen a piece of hardware which supports it.

Also, x2apic or not, this leaf is documented as fully reserved, so we
shouldn't be filling it in anyway.

>
>> @@ -959,6 +950,14 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
>>          }
>>          break;
>>  
>> +    case 0xb:
>> +        if ( p->x86_vendor == X86_VENDOR_INTEL )
>> +        {
>> +            /* Fix the x2APIC identifier. */
>> +            res->d = v->vcpu_id * 2;
>> +        }
>> +        break;
> Irrespective of the comment above, wouldn't the if() here better
> look at the x2APIC feature flag of the domain?

Perhaps.  There isn't a formal link called out, but looking at the
content, I am guessing that real hardware is consistent with their
support of x2apic and the availability of this leaf?

~Andrew


* Re: [PATCH 08/10] x86/cpuid: Handle leaf 0xb in guest_cpuid()
  2017-02-22 10:22     ` Andrew Cooper
@ 2017-02-22 10:37       ` Jan Beulich
  2017-02-27 15:05         ` Andrew Cooper
  0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2017-02-22 10:37 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 22.02.17 at 11:22, <andrew.cooper3@citrix.com> wrote:
> On 22/02/17 09:16, Jan Beulich wrote:
>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>>> Leaf 0xb is reserved by AMD, and uniformly hidden from guests by the toolstack
>>> logic and hypervisor PV logic.
>>>
>>> The previous dynamic logic filled in the x2APIC ID for all HVM guests.  This
>>> is modified to respect the entire leaf being reserved by AMD, but is altered
>>> to include PV Intel guests, so they get more sensible values in their emulated
>>> and faulted view of CPUID.
>> Don't we expose x2APIC to HVM guests even on AMD systems? In
>> which case we surely will want to also expose the x2APIC ID.
> 
> The x2apic feature bit is still listed as reserved in the latest AMD
> SDM, and I haven't seen a piece of hardware which supports it.

This doesn't seem to answer my question, as hardware behavior is
of no interest here (our emulation works on old Intel hardware too).
Just to be sure I've checked a HVM guest on an AMD box, and this
is what Linux reports:

<6>Enabling x2apic
<6>Enabled x2apic
<6>Switched APIC routing to physical x2apic.

> Also, x2apic or not, this leaf is documented as fully reserved, so we
> shouldn't be filling it in anyway.

While guests appear to do fine without, I think this is inconsistent
with providing x2APIC support. But as current behavior is the same,
we can as well leave this for the future.

>>> @@ -959,6 +950,14 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
>>>          }
>>>          break;
>>>  
>>> +    case 0xb:
>>> +        if ( p->x86_vendor == X86_VENDOR_INTEL )
>>> +        {
>>> +            /* Fix the x2APIC identifier. */
>>> +            res->d = v->vcpu_id * 2;
>>> +        }
>>> +        break;
>> Irrespective of the comment above, wouldn't the if() here better
>> look at the x2APIC feature flag of the domain?
> 
> Perhaps.  There isn't a formal link called out, but looking at the
> content, I am guessing that real hardware is consistent in its
> support of x2apic and the availability of this leaf?

I suppose so, at least they've got introduced together iirc.

Jan



* Re: [PATCH 06/10] x86/cpuid: Handle leaf 0x6 in guest_cpuid()
  2017-02-22  9:26             ` Jan Beulich
@ 2017-02-27 14:30               ` Andrew Cooper
  0 siblings, 0 replies; 65+ messages in thread
From: Andrew Cooper @ 2017-02-27 14:30 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 22/02/17 09:26, Jan Beulich wrote:
>>>> On 22.02.17 at 10:12, <andrew.cooper3@citrix.com> wrote:
>> On 22/02/17 08:23, Andrew Cooper wrote:
>>> On 22/02/17 07:31, Jan Beulich wrote:
>>>>>>> On 21.02.17 at 18:40, <andrew.cooper3@citrix.com> wrote:
>>>>> On 21/02/17 17:25, Jan Beulich wrote:
>>>>>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>>>>>>> The PV MSR handling logic has minimal support for some thermal/perf operations
>>>>>>> from the hardware domain, so leak through the implemented subset of 
>>>>>>> features.
>>>>>> Does it make sense to continue to special case PV hwdom here?
>>>>> Being able to play with these MSRs will be actively wrong for HVM
>>>>> context.  It is already fairly wrong for PV context, as nothing prevents
>>>>> you being rescheduled across pcpus while in the middle of a read/write
>>>>> cycle on the MSRs.
>>>> So the MSRs in question are, afaics
>>>> - MSR_IA32_MPERF, MSR_IA32_APERF, MSR_IA32_PERF_CTL (all
>>>>   of which are is_cpufreq_controller() dependent)
>>>> - MSR_IA32_THERM_CONTROL, MSR_IA32_ENERGY_PERF_BIAS
>>>>   (both of which are is_pinned_vcpu() dependent)
>>>> For the latter your argument doesn't apply. For the former, I've
>>>> been wondering for a while whether we shouldn't do away with
>>>> "cpufreq=dom0-kernel".
>>> Hmm.  All good points.  If I can get away without leaking any of this,
>>> that would be ideal.  (Lets see what Linux thinks of such a setup.)
>> Linux seems fine without any of this leakage.
> But is that for a broad range of versions, or just the one you had
> to hand?

3.10 and 4.4 PVOps.  Looking at the 2.6.32-classic source, I can't see
anything which would be a problem.

~Andrew


* Re: [PATCH 08/10] x86/cpuid: Handle leaf 0xb in guest_cpuid()
  2017-02-22 10:37       ` Jan Beulich
@ 2017-02-27 15:05         ` Andrew Cooper
  0 siblings, 0 replies; 65+ messages in thread
From: Andrew Cooper @ 2017-02-27 15:05 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 22/02/17 10:37, Jan Beulich wrote:
>>>> On 22.02.17 at 11:22, <andrew.cooper3@citrix.com> wrote:
>> On 22/02/17 09:16, Jan Beulich wrote:
>>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>>>> Leaf 0xb is reserved by AMD, and uniformly hidden from guests by the
>>>> toolstack logic and hypervisor PV logic.
>>>>
>>>> The previous dynamic logic filled in the x2APIC ID for all HVM guests.  This
>>>> is modified to respect the entire leaf being reserved by AMD, but is altered
>>>> to include PV Intel guests, so they get more sensible values in their emulated
>>>> and faulted view of CPUID.
>>> Don't we expose x2APIC to HVM guests even on AMD systems? In
>>> which case we surely will want to also expose the x2APIC ID.
>> The x2apic feature bit is still listed as reserved in the latest AMD
>> APM, and I haven't seen a piece of hardware which supports it.
> This doesn't seem to answer my question, as hardware behavior is
> of no interest here (our emulation works on old Intel hardware too).
> Just to be sure I've checked a HVM guest on an AMD box, and this
> is what Linux reports:
>
> <6>Enabling x2apic
> <6>Enabled x2apic
> <6>Switched APIC routing to physical x2apic.

Hmm - that is a good point - we do offer x2apic even to AMD HVM guests.

However, this is problematic and highlights another can of worms.  We
cannot turn on APICV/AVIC for guests if our cpuid features claim a
greater featureset than hardware can support.  (Looking at c/s
3e0c8272f, TSC_DEADLINE is a feature in a similar position.)

I think the only feasible way of making this work is to ensure that the
default policy exactly matches hardware, and the max policy contains all
features we are able to emulate.  Determination of whether to use
APICV/AVIC or not must be on whether the chosen featureset is compatible
with hardware or not.

>
>> Also, x2apic or not, this leaf is documented as fully reserved, so we
>> shouldn't be filling it in anyway.
> While guests appear to do fine without, I think this is inconsistent
> with providing x2APIC support. But as current behavior is the same,
> we can as well leave this for the future.
>
>>>> @@ -959,6 +950,14 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
>>>>          }
>>>>          break;
>>>>  
>>>> +    case 0xb:
>>>> +        if ( p->x86_vendor == X86_VENDOR_INTEL )
>>>> +        {
>>>> +            /* Fix the x2APIC identifier. */
>>>> +            res->d = v->vcpu_id * 2;
>>>> +        }
>>>> +        break;
>>> Irrespective of the comment above, wouldn't the if() here better
>>> look at the x2APIC feature flag of the domain?
>> Perhaps.  There isn't a formal link called out, but looking at the
>> content, I am guessing that real hardware is consistent in its
>> support of x2apic and the availability of this leaf?
> I suppose so, at least they've got introduced together iirc.

Ok.  I will change the determination to be based exclusively on x2apic
being advertised, and leave a comment explaining why.

~Andrew


* Re: [PATCH 10/10] x86/cpuid: Always enable faulting for the control domain
  2017-02-22 10:10       ` Jan Beulich
@ 2017-02-27 15:10         ` Andrew Cooper
  2017-02-28  9:31           ` Jan Beulich
  0 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-02-27 15:10 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 22/02/17 10:10, Jan Beulich wrote:
>>>> On 22.02.17 at 11:00, <andrew.cooper3@citrix.com> wrote:
>> On 22/02/17 09:23, Jan Beulich wrote:
>>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>>>> The domain builder in libxc no longer depends on leaked CPUID information to
>>>> properly construct HVM domains.  Remove the control domain exclusion.
>>> Am I missing some intermediate step? As long as there's a raw
>>> CPUID invocation in xc_cpuid_x86.c (which is still there in staging
>>> and I don't recall this series removing it) it at least _feels_ unsafe.
>> Strictly speaking, the domain builder part of this was completed after
>> my xsave adjustments.  All the guest-type-dependent information now
>> comes from non-cpuid sources in libxc, or Xen ignores the toolstack
>> values and recalculates information itself.
>>
>> However, until the Intel leaves were complete, dom0 had a hard time
>> booting with this change as there was no toolstack-provided policy and
>> no leakage from hardware.
> So what are the CPUID uses in libxc then needed for at this point?
> Could they be removed in a prereq patch to make clear all needed
> information is now being obtained via hypercalls?

I'd prefer to defer that work.  The next chunk of CPUID work is going to
be redesigning and reimplementing the hypervisor/libxc interface, and
all cpuid() calls in libxc will fall out there, but it's not a trivial
set of changes to make.

>
>>> Also the change here then results in Dom0 observing different
>>> behavior between faulting-capable and faulting-incapable hosts.
>>> I'm not convinced this is desirable.
>> I disagree.  Avoiding the leakage is very desirable moving forwards.
>>
>> Other side effects are that it makes PV and PVH dom0 functionally
>> identical WRT CPUID, and PV userspace (which, unlike the kernel, tends
>> not to be Xen-aware) sees sensible information.
> I can see the upsides too, hence the "I'm not convinced" ...

So is that an ack or a nack?  I am afraid that this isn't very helpful.

~Andrew


* Re: [PATCH 10/10] x86/cpuid: Always enable faulting for the control domain
  2017-02-27 15:10         ` Andrew Cooper
@ 2017-02-28  9:31           ` Jan Beulich
  2017-03-10 17:10             ` Andrew Cooper
  0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2017-02-28  9:31 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 27.02.17 at 16:10, <andrew.cooper3@citrix.com> wrote:
> On 22/02/17 10:10, Jan Beulich wrote:
>>>>> On 22.02.17 at 11:00, <andrew.cooper3@citrix.com> wrote:
>>> On 22/02/17 09:23, Jan Beulich wrote:
>>>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>>>>> The domain builder in libxc no longer depends on leaked CPUID information to
>>>>> properly construct HVM domains.  Remove the control domain exclusion.
>>>> Am I missing some intermediate step? As long as there's a raw
>>>> CPUID invocation in xc_cpuid_x86.c (which is still there in staging
>>>> and I don't recall this series removing it) it at least _feels_ unsafe.
>>> Strictly speaking, the domain builder part of this was completed after
>>> my xsave adjustments.  All the guest-type-dependent information now
>>> comes from non-cpuid sources in libxc, or Xen ignores the toolstack
>>> values and recalculates information itself.
>>>
>>> However, until the Intel leaves were complete, dom0 had a hard time
>>> booting with this change as there was no toolstack-provided policy and
>>> no leakage from hardware.
>> So what are the CPUID uses in libxc then needed for at this point?
>> Could they be removed in a prereq patch to make clear all needed
>> information is now being obtained via hypercalls?
> 
> I'd prefer to defer that work.  The next chunk of CPUID work is going to
> be redesigning and reimplementing the hypervisor/libxc interface, and
>> all cpuid() calls in libxc will fall out there, but it's not a trivial
> set of changes to make.

With that, could you live with deferring the patch here until then?
I ask because ...

>>>> Also the change here then results in Dom0 observing different
>>>> behavior between faulting-capable and faulting-incapable hosts.
>>>> I'm not convinced this is desirable.
>>> I disagree.  Avoiding the leakage is very desirable moving forwards.
>>>
>>> Other side effects are that it makes PV and PVH dom0 functionally
>>> identical WRT CPUID, and PV userspace (which, unlike the kernel, tends
>>> not to be Xen-aware) sees sensible information.
>> I can see the upsides too, hence the "I'm not convinced" ...
> 
> So is that an ack or a nack?  I am afraid that this isn't very helpful.

... I understand this isn't helpful, yet no, at this point it's neither
an ack nor a nak.

Jan



* [PATCH v2 04/10] x86/cpuid: Handle leaf 0x4 in guest_cpuid()
  2017-02-20 11:00 ` [PATCH 04/10] x86/cpuid: Handle leaf 0x4 " Andrew Cooper
  2017-02-21 17:16   ` Jan Beulich
@ 2017-03-10 16:27   ` Andrew Cooper
  2017-03-13 12:03     ` Jan Beulich
  1 sibling, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-03-10 16:27 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Leaf 0x4 is reserved by AMD.  For Intel, it is a multi-invocation leaf with
ecx enumerating different cache details.

Add a new union for it in struct cpuid_policy, collect it from hardware in
calculate_raw_policy(), audit it in recalculate_cpuid_policy() and update
guest_cpuid() and update_domain_cpuid_info() to properly insert/extract data.

A lot of the data here will need further auditing/refinement when better
topology support is introduced, but for now, this matches the existing
toolstack behaviour.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>

v2:
 * The cache type field is 5 bits wide, rather than 4.
 * Don't bother clobbering p->basic.raw[0x{4,7,d}]
---
 xen/arch/x86/cpuid.c        | 48 +++++++++++++++++++++++++++++++++++++++++++--
 xen/arch/x86/domctl.c       |  8 +++++++-
 xen/include/asm-x86/cpuid.h | 10 ++++++++++
 3 files changed, 63 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index d6f6b88..e85415f 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -199,6 +199,7 @@ static void recalculate_misc(struct cpuid_policy *p)
 
     case X86_VENDOR_AMD:
         zero_leaves(p->basic.raw, 0x2, 0x3);
+        memset(p->cache.raw, 0, sizeof(p->cache.raw));
         p->basic.raw[0x9] = EMPTY_LEAF;
 
         p->extd.vendor_ebx = p->basic.vendor_ebx;
@@ -242,6 +243,25 @@ static void __init calculate_raw_policy(void)
         cpuid_leaf(i, &p->basic.raw[i]);
     }
 
+    if ( p->basic.max_leaf >= 4 )
+    {
+        for ( i = 0; i < ARRAY_SIZE(p->cache.raw); ++i )
+        {
+            cpuid_count_leaf(4, i, &p->cache.raw[i]);
+
+            if ( p->cache.subleaf[i].type == 0 )
+                break;
+        }
+
+        /*
+         * The choice of CPUID_GUEST_NR_CACHE is arbitrary.  It is expected
+         * that it will eventually need increasing for future hardware.
+         */
+        if ( i == ARRAY_SIZE(p->cache.raw) )
+            printk(XENLOG_WARNING
+                   "CPUID: Insufficient Leaf 4 space for this hardware\n");
+    }
+
     if ( p->basic.max_leaf >= 7 )
     {
         cpuid_count_leaf(7, 0, &p->feat.raw[0]);
@@ -520,6 +540,23 @@ void recalculate_cpuid_policy(struct domain *d)
     recalculate_xstate(p);
     recalculate_misc(p);
 
+    for ( i = 0; i < ARRAY_SIZE(p->cache.raw); ++i )
+    {
+        if ( p->cache.subleaf[i].type >= 1 &&
+             p->cache.subleaf[i].type <= 3 )
+        {
+            /* Subleaf has a valid cache type. Zero reserved fields. */
+            p->cache.raw[i].a &= 0xffffc3ffu;
+            p->cache.raw[i].d &= 0x00000007u;
+        }
+        else
+        {
+            /* Subleaf is not valid.  Zero the rest of the union. */
+            zero_leaves(p->cache.raw, i, ARRAY_SIZE(p->cache.raw) - 1);
+            break;
+        }
+    }
+
     if ( !p->extd.svm )
         p->extd.raw[0xa] = EMPTY_LEAF;
 
@@ -605,7 +642,7 @@ static void pv_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
         *res = EMPTY_LEAF;
         break;
 
-    case 0x0 ... 0x3:
+    case 0x0 ... 0x4:
     case 0x7 ... 0x9:
     case 0xc ... XSTATE_CPUID:
     case 0x80000000 ... 0xffffffff:
@@ -640,7 +677,7 @@ static void hvm_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
             res->a = (res->a & ~0xff) | 3;
         break;
 
-    case 0x0 ... 0x3:
+    case 0x0 ... 0x4:
     case 0x7 ... 0x9:
     case 0xc ... XSTATE_CPUID:
     case 0x80000000 ... 0xffffffff:
@@ -674,6 +711,13 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
 
         switch ( leaf )
         {
+        case 0x4:
+            if ( subleaf >= ARRAY_SIZE(p->cache.raw) )
+                return;
+
+            *res = p->cache.raw[subleaf];
+            break;
+
         case 0x7:
             ASSERT(p->feat.max_subleaf < ARRAY_SIZE(p->feat.raw));
             if ( subleaf > min_t(uint32_t, p->feat.max_subleaf,
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 02b48e8..d87efa2 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -101,6 +101,10 @@ static int update_domain_cpuid_info(struct domain *d,
     switch ( ctl->input[0] )
     {
     case 0x00000000 ... ARRAY_SIZE(p->basic.raw) - 1:
+        if ( ctl->input[0] == 4 &&
+             ctl->input[1] >= ARRAY_SIZE(p->cache.raw) )
+            return 0;
+
         if ( ctl->input[0] == 7 &&
              ctl->input[1] >= ARRAY_SIZE(p->feat.raw) )
             return 0;
@@ -129,7 +133,9 @@ static int update_domain_cpuid_info(struct domain *d,
     switch ( ctl->input[0] )
     {
     case 0x00000000 ... ARRAY_SIZE(p->basic.raw) - 1:
-        if ( ctl->input[0] == 7 )
+        if ( ctl->input[0] == 4 )
+            p->cache.raw[ctl->input[1]] = leaf;
+        else if ( ctl->input[0] == 7 )
             p->feat.raw[ctl->input[1]] = leaf;
         else if ( ctl->input[0] == XSTATE_CPUID )
             p->xstate.raw[ctl->input[1]] = leaf;
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index aa482b7..d13b322 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -63,6 +63,7 @@ DECLARE_PER_CPU(bool, cpuid_faulting_enabled);
 
 #define CPUID_GUEST_NR_BASIC      (0xdu + 1)
 #define CPUID_GUEST_NR_FEAT       (0u + 1)
+#define CPUID_GUEST_NR_CACHE      (5u + 1)
 #define CPUID_GUEST_NR_XSTATE     (62u + 1)
 #define CPUID_GUEST_NR_EXTD_INTEL (0x8u + 1)
 #define CPUID_GUEST_NR_EXTD_AMD   (0x1cu + 1)
@@ -137,6 +138,15 @@ struct cpuid_policy
         };
     } basic;
 
+    /* Structured cache leaf: 0x00000004[xx] */
+    union {
+        struct cpuid_leaf raw[CPUID_GUEST_NR_CACHE];
+        struct {
+            uint32_t type:5,
+                :27, :32, :32, :32;
+        } subleaf[CPUID_GUEST_NR_CACHE];
+    } cache;
+
     /* Structured feature leaf: 0x00000007[xx] */
     union {
         struct cpuid_leaf raw[CPUID_GUEST_NR_FEAT];
-- 
2.1.4



* [PATCH v2 06/10] x86/cpuid: Handle leaf 0x6 in guest_cpuid()
  2017-02-20 11:00 ` [PATCH 06/10] x86/cpuid: Handle leaf 0x6 " Andrew Cooper
  2017-02-21 17:25   ` Jan Beulich
@ 2017-03-10 16:32   ` Andrew Cooper
  2017-03-13 12:04     ` Jan Beulich
  1 sibling, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-03-10 16:32 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

The thermal/performance leaf was previously hidden from HVM guests, but fully
visible to PV guests.  Most of the leaf refers to MSR availability, and there
is nothing an unprivileged PV guest can do with the information, so hide the
leaf entirely.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>

v2:
 * Don't leak any state into the hardware domain.
---
 xen/arch/x86/cpuid.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 11e50e9..8585d94 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -168,6 +168,7 @@ static void recalculate_misc(struct cpuid_policy *p)
     p->basic.apic_id = 0; /* Dynamic. */
 
     p->basic.raw[0x5] = EMPTY_LEAF; /* MONITOR not exposed to guests. */
+    p->basic.raw[0x6] = EMPTY_LEAF; /* Therm/Power not exposed to guests. */
 
     p->basic.raw[0x8] = EMPTY_LEAF;
     p->basic.raw[0xc] = EMPTY_LEAF;
@@ -643,8 +644,7 @@ static void pv_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
         *res = EMPTY_LEAF;
         break;
 
-    case 0x0 ... 0x5:
-    case 0x7 ... 0x9:
+    case 0x0 ... 0x9:
     case 0xc ... XSTATE_CPUID:
     case 0x80000000 ... 0xffffffff:
         ASSERT_UNREACHABLE();
@@ -678,8 +678,7 @@ static void hvm_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
             res->a = (res->a & ~0xff) | 3;
         break;
 
-    case 0x0 ... 0x5:
-    case 0x7 ... 0x9:
+    case 0x0 ... 0x9:
     case 0xc ... XSTATE_CPUID:
     case 0x80000000 ... 0xffffffff:
         ASSERT_UNREACHABLE();
@@ -739,7 +738,7 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
             goto legacy;
 
         case 0x0 ... 0x3:
-        case 0x5:
+        case 0x5 ... 0x6:
         case 0x8 ... 0x9:
         case 0xc:
             *res = p->basic.raw[leaf];
-- 
2.1.4



* [PATCH v2 08/10] x86/cpuid: Handle leaf 0xb in guest_cpuid()
  2017-02-20 11:00 ` [PATCH 08/10] x86/cpuid: Handle leaf 0xb " Andrew Cooper
  2017-02-22  9:16   ` Jan Beulich
@ 2017-03-10 16:44   ` Andrew Cooper
  2017-03-13 12:13     ` Jan Beulich
  1 sibling, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-03-10 16:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Leaf 0xb is reserved by AMD, and uniformly hidden from guests by the toolstack
logic and hypervisor PV logic.  The previous dynamic logic filled in the
x2APIC ID for all HVM guests.

In practice, leaf 0xb is tightly linked with x2APIC, and x2APIC is offered to
guests on AMD hardware, as Xen's APIC emulation is x2APIC capable even if
hardware isn't.

Sensibly exposing the rest of the leaf requires further topology
infrastructure.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>

v2:
 * Switch logic to be in terms of the visibility of x2APIC.
---
 xen/arch/x86/cpuid.c | 34 +++++++++++++++++++---------------
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 17769fb..dd8b583 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -171,6 +171,7 @@ static void recalculate_misc(struct cpuid_policy *p)
     p->basic.raw[0x6] = EMPTY_LEAF; /* Therm/Power not exposed to guests. */
 
     p->basic.raw[0x8] = EMPTY_LEAF;
+    p->basic.raw[0xb] = EMPTY_LEAF; /* TODO: Rework topology logic. */
     p->basic.raw[0xc] = EMPTY_LEAF;
 
     p->extd.e1d &= ~CPUID_COMMON_1D_FEATURES;
@@ -629,12 +630,7 @@ static void pv_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
 
     switch ( leaf )
     {
-    case 0x0000000b: /* Extended Topology Enumeration */
-        *res = EMPTY_LEAF;
-        break;
-
-    case 0x0 ... 0xa:
-    case 0xc ... XSTATE_CPUID:
+    case 0x0 ... XSTATE_CPUID:
     case 0x80000000 ... 0xffffffff:
         ASSERT_UNREACHABLE();
         /* Now handled in guest_cpuid(). */
@@ -650,13 +646,7 @@ static void hvm_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res)
 
     switch ( leaf )
     {
-    case 0xb:
-        /* Fix the x2APIC identifier. */
-        res->d = v->vcpu_id * 2;
-        break;
-
-    case 0x0 ... 0xa:
-    case 0xc ... XSTATE_CPUID:
+    case 0x0 ... XSTATE_CPUID:
     case 0x80000000 ... 0xffffffff:
         ASSERT_UNREACHABLE();
         /* Now handled in guest_cpuid(). */
@@ -716,8 +706,7 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
 
         case 0x0 ... 0x3:
         case 0x5 ... 0x6:
-        case 0x8 ... 0xa:
-        case 0xc:
+        case 0x8 ... 0xc:
             *res = p->basic.raw[leaf];
             break;
         }
@@ -938,6 +927,21 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
         }
         break;
 
+    case 0xb:
+        /*
+         * In principle, this leaf is Intel-only.  In practice, it is tightly
+         * coupled with x2apic, and we offer an x2apic-capable APIC emulation
+         * to guests on AMD hardware as well.
+         *
+         * TODO: Rework topology logic.
+         */
+        if ( p->basic.x2apic )
+        {
+            /* Fix the x2APIC identifier. */
+            res->d = v->vcpu_id * 2;
+        }
+        break;
+
     case XSTATE_CPUID:
         switch ( subleaf )
         {
-- 
2.1.4



* Re: [PATCH 10/10] x86/cpuid: Always enable faulting for the control domain
  2017-02-28  9:31           ` Jan Beulich
@ 2017-03-10 17:10             ` Andrew Cooper
  2017-03-13 11:48               ` Jan Beulich
  0 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-03-10 17:10 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 28/02/17 09:31, Jan Beulich wrote:
>>>> On 27.02.17 at 16:10, <andrew.cooper3@citrix.com> wrote:
>> On 22/02/17 10:10, Jan Beulich wrote:
>>>>>> On 22.02.17 at 11:00, <andrew.cooper3@citrix.com> wrote:
>>>> On 22/02/17 09:23, Jan Beulich wrote:
>>>>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>>>>>> The domain builder in libxc no longer depends on leaked CPUID information to
>>>>>> properly construct HVM domains.  Remove the control domain exclusion.
>>>>> Am I missing some intermediate step? As long as there's a raw
>>>>> CPUID invocation in xc_cpuid_x86.c (which is still there in staging
>>>>> and I don't recall this series removing it) it at least _feels_ unsafe.
>>>> Strictly speaking, the domain builder part of this was completed after
>>>> my xsave adjustments.  All the guest-type-dependent information now
>>>> comes from non-cpuid sources in libxc, or Xen ignores the toolstack
>>>> values and recalculates information itself.
>>>>
>>>> However, until the Intel leaves were complete, dom0 had a hard time
>>>> booting with this change as there was no toolstack-provided policy and
>>>> no leakage from hardware.
>>> So what are the CPUID uses in libxc then needed for at this point?
>>> Could they be removed in a prereq patch to make clear all needed
>>> information is now being obtained via hypercalls?
>> I'd prefer to defer that work.  The next chunk of CPUID work is going to
>> be redesigning and reimplementing the hypervisor/libxc interface, and
>> all cpuid() calls in libxc will fall out there, but it's not a trivial
>> set of changes to make.
> With that, could you live with deferring the patch here until then?

We currently have a lot of implicit dependencies on CPUID state
leaked into PV dom0.

With this series, I believe I have identified all leaked dependencies,
and what I really want to prevent is introducing any new implicit
dependencies accidentally.

~Andrew


* Re: [PATCH 10/10] x86/cpuid: Always enable faulting for the control domain
  2017-03-10 17:10             ` Andrew Cooper
@ 2017-03-13 11:48               ` Jan Beulich
  2017-03-14 15:06                 ` Wei Liu
  0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2017-03-13 11:48 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Tim Deegan,
	Ian Jackson, Xen-devel

>>> On 10.03.17 at 18:10, <andrew.cooper3@citrix.com> wrote:
> On 28/02/17 09:31, Jan Beulich wrote:
>>>>> On 27.02.17 at 16:10, <andrew.cooper3@citrix.com> wrote:
>>> On 22/02/17 10:10, Jan Beulich wrote:
>>>>>>> On 22.02.17 at 11:00, <andrew.cooper3@citrix.com> wrote:
>>>>> On 22/02/17 09:23, Jan Beulich wrote:
>>>>>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>>>>>>> The domain builder in libxc no longer depends on leaked CPUID information to
>>>>>>> properly construct HVM domains.  Remove the control domain exclusion.
>>>>>> Am I missing some intermediate step? As long as there's a raw
>>>>>> CPUID invocation in xc_cpuid_x86.c (which is still there in staging
>>>>>> and I don't recall this series removing it) it at least _feels_ unsafe.
>>>>> Strictly speaking, the domain builder part of this was completed after
>>>>> my xsave adjustments.  All the guest-type-dependent information now
>>>>> comes from non-cpuid sources in libxc, or Xen ignores the toolstack
>>>>> values and recalculates information itself.
>>>>>
>>>>> However, until the Intel leaves were complete, dom0 had a hard time
>>>>> booting with this change as there was no toolstack-provided policy and
>>>>> no leakage from hardware.
>>>> So what are the CPUID uses in libxc then needed for at this point?
>>>> Could they be removed in a prereq patch to make clear all needed
>>>> information is now being obtained via hypercalls?
>>> I'd prefer to defer that work.  The next chunk of CPUID work is going to
>>> be redesigning and reimplementing the hypervisor/libxc interface, and
>>> all cpuid() calls in libxc will fall out there, but it's not a trivial
>>> set of changes to make.
>> With that, could you live with deferring the patch here until then?
> 
> We currently have a lot of implicit dependencies on CPUID state
> leaked into PV dom0.
> 
> With this series, I believe I have identified all leaked dependencies,
> and what I really want to prevent is introducing any new implicit
> dependencies accidentally.

I can certainly understand this, but the state the libxc code is in
then makes this a rather implicit thing, instead of being fully explicit. I
think I'd like to have another (tools or REST) maintainer voice a 3rd
opinion. Extending Cc list ...

Jan



* Re: [PATCH v2 04/10] x86/cpuid: Handle leaf 0x4 in guest_cpuid()
  2017-03-10 16:27   ` [PATCH v2 " Andrew Cooper
@ 2017-03-13 12:03     ` Jan Beulich
  2017-03-13 12:51       ` Andrew Cooper
  0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2017-03-13 12:03 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 10.03.17 at 17:27, <andrew.cooper3@citrix.com> wrote:
> Leaf 0x4 is reserved by AMD.  For Intel, it is a multi-invocation leaf with
> ecx enumerating different cache details.
> 
> Add a new union for it in struct cpuid_policy, collect it from hardware in
> calculate_raw_policy(), audit it in recalculate_cpuid_policy() and update
> guest_cpuid() and update_domain_cpuid_info() to properly insert/extract 
> data.
> 
> A lot of the data here will need further auditing/refinement when better
> topology support is introduced, but for now, this matches the existing
> toolstack behaviour.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

but with a couple of remarks:

> @@ -242,6 +243,25 @@ static void __init calculate_raw_policy(void)
>          cpuid_leaf(i, &p->basic.raw[i]);
>      }
>  
> +    if ( p->basic.max_leaf >= 4 )
> +    {
> +        for ( i = 0; i < ARRAY_SIZE(p->cache.raw); ++i )
> +        {
> +            cpuid_count_leaf(4, i, &p->cache.raw[i]);
> +
> +            if ( p->cache.subleaf[i].type == 0 )
> +                break;
> +        }
> +
> +        /*
> +         * The choice of CPUID_GUEST_NR_CACHE is arbitrary.  It is expected
> +         * that it will eventually need increasing for future hardware.
> +         */
> +        if ( i == ARRAY_SIZE(p->cache.raw) )
> +            printk(XENLOG_WARNING
> +                   "CPUID: Insufficient Leaf 4 space for this hardware\n");
> +    }

As expressed before (perhaps in the context of another patch),
the warning may be logged prematurely, which I'd prefer to be
avoided.

> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -101,6 +101,10 @@ static int update_domain_cpuid_info(struct domain *d,
>      switch ( ctl->input[0] )
>      {
>      case 0x00000000 ... ARRAY_SIZE(p->basic.raw) - 1:
> +        if ( ctl->input[0] == 4 &&
> +             ctl->input[1] >= ARRAY_SIZE(p->cache.raw) )
> +            return 0;
> +
>          if ( ctl->input[0] == 7 &&
>               ctl->input[1] >= ARRAY_SIZE(p->feat.raw) )
>              return 0;
> @@ -129,7 +133,9 @@ static int update_domain_cpuid_info(struct domain *d,
>      switch ( ctl->input[0] )
>      {
>      case 0x00000000 ... ARRAY_SIZE(p->basic.raw) - 1:
> -        if ( ctl->input[0] == 7 )
> +        if ( ctl->input[0] == 4 )
> +            p->cache.raw[ctl->input[1]] = leaf;
> +        else if ( ctl->input[0] == 7 )
>              p->feat.raw[ctl->input[1]] = leaf;
>          else if ( ctl->input[0] == XSTATE_CPUID )
>              p->xstate.raw[ctl->input[1]] = leaf;

The contexts of these two hunks make it pretty likely that inner
switch() statements would help readability.

Jan



* Re: [PATCH v2 06/10] x86/cpuid: Handle leaf 0x6 in guest_cpuid()
  2017-03-10 16:32   ` [PATCH v2 " Andrew Cooper
@ 2017-03-13 12:04     ` Jan Beulich
  0 siblings, 0 replies; 65+ messages in thread
From: Jan Beulich @ 2017-03-13 12:04 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 10.03.17 at 17:32, <andrew.cooper3@citrix.com> wrote:
> The thermal/performance leaf was previously hidden from HVM guests, but fully
> visible to PV guests.  Most of the leaf refers to MSR availability, and there
> is nothing an unprivileged PV guest can do with the information, so hide the
> leaf entirely.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>




* Re: [PATCH v2 08/10] x86/cpuid: Handle leaf 0xb in guest_cpuid()
  2017-03-10 16:44   ` [PATCH v2 " Andrew Cooper
@ 2017-03-13 12:13     ` Jan Beulich
  0 siblings, 0 replies; 65+ messages in thread
From: Jan Beulich @ 2017-03-13 12:13 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 10.03.17 at 17:44, <andrew.cooper3@citrix.com> wrote:
> @@ -938,6 +927,21 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
>          }
>          break;
>  
> +    case 0xb:
> +        /*
> +         * In principle, this leaf is Intel-only.  In practice, it is tightly
> +         * coupled with x2apic, and we offer an x2apic-capable APIC emulation
> +         * to guests on AMD hardware as well.
> +         *
> +         * TODO: Rework topology logic.
> +         */
> +        if ( p->basic.x2apic )
> +        {
> +            /* Fix the x2APIC identifier. */
> +            res->d = v->vcpu_id * 2;
> +        }
> +        break;

The SDM says that ECX[7:0] output matches the input. I think you
want to mimic that (not doing so would imo be acceptable only if
you returned all zeros uniformly).

With that taken care of
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



* Re: [PATCH v2 04/10] x86/cpuid: Handle leaf 0x4 in guest_cpuid()
  2017-03-13 12:03     ` Jan Beulich
@ 2017-03-13 12:51       ` Andrew Cooper
  2017-03-13 13:05         ` Jan Beulich
  0 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-03-13 12:51 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 13/03/17 12:03, Jan Beulich wrote:
>>>> On 10.03.17 at 17:27, <andrew.cooper3@citrix.com> wrote:
>> Leaf 0x4 is reserved by AMD.  For Intel, it is a multi-invocation leaf with
>> ecx enumerating different cache details.
>>
>> Add a new union for it in struct cpuid_policy, collect it from hardware in
>> calculate_raw_policy(), audit it in recalculate_cpuid_policy() and update
>> guest_cpuid() and update_domain_cpuid_info() to properly insert/extract 
>> data.
>>
>> A lot of the data here will need further auditing/refinement when better
>> topology support is introduced, but for now, this matches the existing
>> toolstack behaviour.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
> but with a couple of remarks:
>
>> @@ -242,6 +243,25 @@ static void __init calculate_raw_policy(void)
>>          cpuid_leaf(i, &p->basic.raw[i]);
>>      }
>>  
>> +    if ( p->basic.max_leaf >= 4 )
>> +    {
>> +        for ( i = 0; i < ARRAY_SIZE(p->cache.raw); ++i )
>> +        {
>> +            cpuid_count_leaf(4, i, &p->cache.raw[i]);
>> +
>> +            if ( p->cache.subleaf[i].type == 0 )
>> +                break;
>> +        }
>> +
>> +        /*
>> +         * The choice of CPUID_GUEST_NR_CACHE is arbitrary.  It is expected
>> +         * that it will eventually need increasing for future hardware.
>> +         */
>> +        if ( i == ARRAY_SIZE(p->cache.raw) )
>> +            printk(XENLOG_WARNING
>> +                   "CPUID: Insufficient Leaf 4 space for this hardware\n");
>> +    }
> As expressed before (perhaps in the context of another patch),
> the warning may be logged prematurely, which I'd prefer to be
> avoided.

How would you like it then?  You previously indicated that it probably
wasn't a problem leaving it like this, which is why I did.

>
>> --- a/xen/arch/x86/domctl.c
>> +++ b/xen/arch/x86/domctl.c
>> @@ -101,6 +101,10 @@ static int update_domain_cpuid_info(struct domain *d,
>>      switch ( ctl->input[0] )
>>      {
>>      case 0x00000000 ... ARRAY_SIZE(p->basic.raw) - 1:
>> +        if ( ctl->input[0] == 4 &&
>> +             ctl->input[1] >= ARRAY_SIZE(p->cache.raw) )
>> +            return 0;
>> +
>>          if ( ctl->input[0] == 7 &&
>>               ctl->input[1] >= ARRAY_SIZE(p->feat.raw) )
>>              return 0;
>> @@ -129,7 +133,9 @@ static int update_domain_cpuid_info(struct domain *d,
>>      switch ( ctl->input[0] )
>>      {
>>      case 0x00000000 ... ARRAY_SIZE(p->basic.raw) - 1:
>> -        if ( ctl->input[0] == 7 )
>> +        if ( ctl->input[0] == 4 )
>> +            p->cache.raw[ctl->input[1]] = leaf;
>> +        else if ( ctl->input[0] == 7 )
>>              p->feat.raw[ctl->input[1]] = leaf;
>>          else if ( ctl->input[0] == XSTATE_CPUID )
>>              p->xstate.raw[ctl->input[1]] = leaf;
> The contexts of these two hunks make it pretty likely that inner
> switch() statements would help readability.

Will do.

~Andrew


* Re: [PATCH v2 04/10] x86/cpuid: Handle leaf 0x4 in guest_cpuid()
  2017-03-13 12:51       ` Andrew Cooper
@ 2017-03-13 13:05         ` Jan Beulich
  2017-03-13 13:24           ` Andrew Cooper
  0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2017-03-13 13:05 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 13.03.17 at 13:51, <andrew.cooper3@citrix.com> wrote:
> On 13/03/17 12:03, Jan Beulich wrote:
>>>>> On 10.03.17 at 17:27, <andrew.cooper3@citrix.com> wrote:
>>> @@ -242,6 +243,25 @@ static void __init calculate_raw_policy(void)
>>>          cpuid_leaf(i, &p->basic.raw[i]);
>>>      }
>>>  
>>> +    if ( p->basic.max_leaf >= 4 )
>>> +    {
>>> +        for ( i = 0; i < ARRAY_SIZE(p->cache.raw); ++i )
>>> +        {
>>> +            cpuid_count_leaf(4, i, &p->cache.raw[i]);
>>> +
>>> +            if ( p->cache.subleaf[i].type == 0 )
>>> +                break;
>>> +        }
>>> +
>>> +        /*
>>> +         * The choice of CPUID_GUEST_NR_CACHE is arbitrary.  It is expected
>>> +         * that it will eventually need increasing for future hardware.
>>> +         */
>>> +        if ( i == ARRAY_SIZE(p->cache.raw) )
>>> +            printk(XENLOG_WARNING
>>> +                   "CPUID: Insufficient Leaf 4 space for this hardware\n");
>>> +    }
>> As expressed before (perhaps in the context of another patch),
>> the warning may be logged prematurely, which I'd prefer to be
>> avoided.
> 
> How would you like it then?  You previously indicated that it probably
> wasn't a problem leaving it like this, which is why I did.

And I continue to be of that opinion: It's probably not a problem to
leave it as is (hence the R-b), but I'd prefer if the warning was only
issued when we actually can't fit at least one sub-leaf.

Jan



* Re: [PATCH v2 04/10] x86/cpuid: Handle leaf 0x4 in guest_cpuid()
  2017-03-13 13:05         ` Jan Beulich
@ 2017-03-13 13:24           ` Andrew Cooper
  2017-03-13 13:36             ` Jan Beulich
  0 siblings, 1 reply; 65+ messages in thread
From: Andrew Cooper @ 2017-03-13 13:24 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 13/03/17 13:05, Jan Beulich wrote:
>>>> On 13.03.17 at 13:51, <andrew.cooper3@citrix.com> wrote:
>> On 13/03/17 12:03, Jan Beulich wrote:
>>>>>> On 10.03.17 at 17:27, <andrew.cooper3@citrix.com> wrote:
>>>> @@ -242,6 +243,25 @@ static void __init calculate_raw_policy(void)
>>>>          cpuid_leaf(i, &p->basic.raw[i]);
>>>>      }
>>>>  
>>>> +    if ( p->basic.max_leaf >= 4 )
>>>> +    {
>>>> +        for ( i = 0; i < ARRAY_SIZE(p->cache.raw); ++i )
>>>> +        {
>>>> +            cpuid_count_leaf(4, i, &p->cache.raw[i]);
>>>> +
>>>> +            if ( p->cache.subleaf[i].type == 0 )
>>>> +                break;
>>>> +        }
>>>> +
>>>> +        /*
>>>> +         * The choice of CPUID_GUEST_NR_CACHE is arbitrary.  It is expected
>>>> +         * that it will eventually need increasing for future hardware.
>>>> +         */
>>>> +        if ( i == ARRAY_SIZE(p->cache.raw) )
>>>> +            printk(XENLOG_WARNING
>>>> +                   "CPUID: Insufficient Leaf 4 space for this hardware\n");
>>>> +    }
>>> As expressed before (perhaps in the context of another patch),
>>> the warning may be logged prematurely, which I'd prefer to be
>>> avoided.
>> How would you like it then?  You previously indicated that it probably
>> wasn't a problem leaving it like this, which is why I did.
> And I continue to be of that opinion: It's probably not a problem to
> leave it as is (hence the R-b), but I'd prefer if the warning was only
> issued when we actually can't fit at least one sub-leaf.

Is this delta ok?

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index b70fc0b..7caf6af 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -251,10 +251,17 @@ static void __init calculate_raw_policy(void)
     {
         for ( i = 0; i < ARRAY_SIZE(p->cache.raw); ++i )
         {
-            cpuid_count_leaf(4, i, &p->cache.raw[i]);
+            union {
+                struct cpuid_leaf l;
+                struct cpuid_cache_leaf c;
+            } u;
 
-            if ( p->cache.subleaf[i].type == 0 )
+            cpuid_count_leaf(4, i, &u.l);
+
+            if ( u.c.type == 0 )
                 break;
+
+            p->cache.subleaf[i] = u.c;
         }
 
         /*
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 816f06d..ac25908 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -118,7 +118,7 @@ struct cpuid_policy
     /* Structured cache leaf: 0x00000004[xx] */
     union {
         struct cpuid_leaf raw[CPUID_GUEST_NR_CACHE];
-        struct {
+        struct cpuid_cache_leaf {
             uint32_t type:5,
                 :27, :32, :32, :32;
         } subleaf[CPUID_GUEST_NR_CACHE];



* Re: [PATCH v2 04/10] x86/cpuid: Handle leaf 0x4 in guest_cpuid()
  2017-03-13 13:24           ` Andrew Cooper
@ 2017-03-13 13:36             ` Jan Beulich
  0 siblings, 0 replies; 65+ messages in thread
From: Jan Beulich @ 2017-03-13 13:36 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 13.03.17 at 14:24, <andrew.cooper3@citrix.com> wrote:
> On 13/03/17 13:05, Jan Beulich wrote:
>>>>> On 13.03.17 at 13:51, <andrew.cooper3@citrix.com> wrote:
>>> On 13/03/17 12:03, Jan Beulich wrote:
>>>>>>> On 10.03.17 at 17:27, <andrew.cooper3@citrix.com> wrote:
>>>>> @@ -242,6 +243,25 @@ static void __init calculate_raw_policy(void)
>>>>>          cpuid_leaf(i, &p->basic.raw[i]);
>>>>>      }
>>>>>  
>>>>> +    if ( p->basic.max_leaf >= 4 )
>>>>> +    {
>>>>> +        for ( i = 0; i < ARRAY_SIZE(p->cache.raw); ++i )
>>>>> +        {
>>>>> +            cpuid_count_leaf(4, i, &p->cache.raw[i]);
>>>>> +
>>>>> +            if ( p->cache.subleaf[i].type == 0 )
>>>>> +                break;
>>>>> +        }
>>>>> +
>>>>> +        /*
>>>>> +         * The choice of CPUID_GUEST_NR_CACHE is arbitrary.  It is expected
>>>>> +         * that it will eventually need increasing for future hardware.
>>>>> +         */
>>>>> +        if ( i == ARRAY_SIZE(p->cache.raw) )
>>>>> +            printk(XENLOG_WARNING
>>>>> +                   "CPUID: Insufficient Leaf 4 space for this hardware\n");
>>>>> +    }
>>>> As expressed before (perhaps in the context of another patch),
>>>> the warning may be logged prematurely, which I'd prefer to be
>>>> avoided.
>>> How would you like it then?  You previously indicated that it probably
>>> wasn't a problem leaving it like this, which is why I did.
>> And I continue to be of that opinion: It's probably not a problem to
>> leave it as is (hence the R-b), but I'd prefer if the warning was only
>> issued when we actually can't fit at least one sub-leaf.
> 
> Is this delta ok?

Yes, thanks. R-b stands, if in doubt.

Jan

> diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
> index b70fc0b..7caf6af 100644
> --- a/xen/arch/x86/cpuid.c
> +++ b/xen/arch/x86/cpuid.c
> @@ -251,10 +251,17 @@ static void __init calculate_raw_policy(void)
>      {
>          for ( i = 0; i < ARRAY_SIZE(p->cache.raw); ++i )
>          {
> -            cpuid_count_leaf(4, i, &p->cache.raw[i]);
> +            union {
> +                struct cpuid_leaf l;
> +                struct cpuid_cache_leaf c;
> +            } u;
>  
> -            if ( p->cache.subleaf[i].type == 0 )
> +            cpuid_count_leaf(4, i, &u.l);
> +
> +            if ( u.c.type == 0 )
>                  break;
> +
> +            p->cache.subleaf[i] = u.c;
>          }
>  
>          /*
> diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
> index 816f06d..ac25908 100644
> --- a/xen/include/asm-x86/cpuid.h
> +++ b/xen/include/asm-x86/cpuid.h
> @@ -118,7 +118,7 @@ struct cpuid_policy
>      /* Structured cache leaf: 0x00000004[xx] */
>      union {
>          struct cpuid_leaf raw[CPUID_GUEST_NR_CACHE];
> -        struct {
> +        struct cpuid_cache_leaf {
>              uint32_t type:5,
>                  :27, :32, :32, :32;
>          } subleaf[CPUID_GUEST_NR_CACHE];





* Re: [PATCH 10/10] x86/cpuid: Always enable faulting for the control domain
  2017-03-13 11:48               ` Jan Beulich
@ 2017-03-14 15:06                 ` Wei Liu
  2017-03-14 15:13                   ` Jan Beulich
  0 siblings, 1 reply; 65+ messages in thread
From: Wei Liu @ 2017-03-14 15:06 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, Xen-devel

On Mon, Mar 13, 2017 at 05:48:44AM -0600, Jan Beulich wrote:
> >>> On 10.03.17 at 18:10, <andrew.cooper3@citrix.com> wrote:
> > On 28/02/17 09:31, Jan Beulich wrote:
> >>>>> On 27.02.17 at 16:10, <andrew.cooper3@citrix.com> wrote:
> >>> On 22/02/17 10:10, Jan Beulich wrote:
> >>>>>>> On 22.02.17 at 11:00, <andrew.cooper3@citrix.com> wrote:
> >>>>> On 22/02/17 09:23, Jan Beulich wrote:
> >>>>>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
> >>>>>>> The domain builder in libxc no longer depends on leaked CPUID information to
> >>>>>>> properly construct HVM domains.  Remove the control domain exclusion.
> >>>>>> Am I missing some intermediate step? As long as there's a raw
> >>>>>> CPUID invocation in xc_cpuid_x86.c (which is still there in staging
> >>>>>> and I don't recall this series removing it) it at least _feels_ unsafe.
> >>>>> Strictly speaking, the domain builder part of this was completed after
> >>>>> my xsave adjustments.  All the guest-type-dependent information now
> >>>>> comes from non-cpuid sources in libxc, or Xen ignores the toolstack
> >>>>> values and recalculates information itself.
> >>>>>
> >>>>> However, until the Intel leaves were complete, dom0 had a hard time
> >>>>> booting with this change as there were no toolstack-provided policy and
> >>>>> no leakage from hardware.
> >>>> So what are the CPUID uses in libxc then needed for at this point?
> >>>> Could they be removed in a prereq patch to make clear all needed
> >>>> information is now being obtained via hypercalls?
> >>> I'd prefer to defer that work.  The next chunk of CPUID work is going to
> >>> be redesigning and reimplementing the hypervisor/libxc interface, and
> >>> all cpuid() calls in libxc will fall out there, but its not a trivial
> >>> set of changes to make.
> >> With that, could you live with deferring the patch here until then?
> > 
> > We currently have a lot of dom0 implicit dependencies on leaked CPUID
> > state into PV dom0.
> > 
> > With this series, I believe I have identified all leaked dependencies,
> > and I really want to prevent us introducing any new implicit dependencies
> > accidentally.
> 
> I can certainly understand this, but the state libxc code is in then
> makes this a rather implicit thing, instead of being fully explicit. I
> think I'd like to have another (tools or REST) maintainer voice a 3rd
> opinion. Extending Cc list ...
> 

I'm not sure I follow the implicit vs explicit bit.

Wei.

> Jan
> 


* Re: [PATCH 10/10] x86/cpuid: Always enable faulting for the control domain
  2017-03-14 15:06                 ` Wei Liu
@ 2017-03-14 15:13                   ` Jan Beulich
  2017-03-14 16:05                     ` Wei Liu
  0 siblings, 1 reply; 65+ messages in thread
From: Jan Beulich @ 2017-03-14 15:13 UTC (permalink / raw)
  To: Wei Liu
  Cc: Tim Deegan, Stefano Stabellini, George Dunlap, Andrew Cooper,
	Ian Jackson, Xen-devel

>>> On 14.03.17 at 16:06, <wei.liu2@citrix.com> wrote:
> On Mon, Mar 13, 2017 at 05:48:44AM -0600, Jan Beulich wrote:
>> >>> On 10.03.17 at 18:10, <andrew.cooper3@citrix.com> wrote:
>> > On 28/02/17 09:31, Jan Beulich wrote:
>> >>>>> On 27.02.17 at 16:10, <andrew.cooper3@citrix.com> wrote:
>> >>> On 22/02/17 10:10, Jan Beulich wrote:
>> >>>>>>> On 22.02.17 at 11:00, <andrew.cooper3@citrix.com> wrote:
>> >>>>> On 22/02/17 09:23, Jan Beulich wrote:
>> >>>>>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
>> >>>>>>> The domain builder in libxc no longer depends on leaked CPUID
>> >>>>>>> information to properly construct HVM domains.  Remove the control
>> >>>>>>> domain exclusion.
>> >>>>>> Am I missing some intermediate step? As long as there's a raw
>> >>>>>> CPUID invocation in xc_cpuid_x86.c (which is still there in staging
>> >>>>>> and I don't recall this series removing it) it at least _feels_ unsafe.
>> >>>>> Strictly speaking, the domain builder part of this was completed after
>> >>>>> my xsave adjustments.  All the guest-type-dependent information now
>> >>>>> comes from non-cpuid sources in libxc, or Xen ignores the toolstack
>> >>>>> values and recalculates information itself.
>> >>>>>
>> >>>>> However, until the Intel leaves were complete, dom0 had a hard time
>> >>>>> booting with this change as there were no toolstack-provided policy and
>> >>>>> no leakage from hardware.
>> >>>> So what are the CPUID uses in libxc then needed for at this point?
>> >>>> Could they be removed in a prereq patch to make clear all needed
>> >>>> information is now being obtained via hypercalls?
>> >>> I'd prefer to defer that work.  The next chunk of CPUID work is going to
>> >>> be redesigning and reimplementing the hypervisor/libxc interface, and
>> >>> all cpuid() calls in libxc will fall out there, but its not a trivial
>> >>> set of changes to make.
>> >> With that, could you live with deferring the patch here until then?
>> > 
>> > We currently have a lot of dom0 implicit dependencies on leaked CPUID
>> > state into PV dom0.
>> > 
>> > With this series, I believe I have identified all leaked dependencies,
>> > and I really want to prevent us introducing any new implicit dependencies
>> > accidentally.
>> 
>> I can certainly understand this, but the state libxc code is in then
>> makes this a rather implicit thing, instead of being fully explicit. I
>> think I'd like to have another (tools or REST) maintainer voice a 3rd
>> opinion. Extending Cc list ...
> 
> I'm not sure I follow the implicit vs explicit bit.

Right now libxc still uses the CPUID instruction, thus implicitly (via
CPUID faulting / masking) obtaining the intended (filtered) state of
individual feature flags. The intended explicit way would be for it
to instead use hypercalls to retrieve the information.

Jan



* Re: [PATCH 10/10] x86/cpuid: Always enable faulting for the control domain
  2017-03-14 15:13                   ` Jan Beulich
@ 2017-03-14 16:05                     ` Wei Liu
  0 siblings, 0 replies; 65+ messages in thread
From: Wei Liu @ 2017-03-14 16:05 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, Xen-devel

On Tue, Mar 14, 2017 at 09:13:08AM -0600, Jan Beulich wrote:
> >>> On 14.03.17 at 16:06, <wei.liu2@citrix.com> wrote:
> > On Mon, Mar 13, 2017 at 05:48:44AM -0600, Jan Beulich wrote:
> >> >>> On 10.03.17 at 18:10, <andrew.cooper3@citrix.com> wrote:
> >> > On 28/02/17 09:31, Jan Beulich wrote:
> >> >>>>> On 27.02.17 at 16:10, <andrew.cooper3@citrix.com> wrote:
> >> >>> On 22/02/17 10:10, Jan Beulich wrote:
> >> >>>>>>> On 22.02.17 at 11:00, <andrew.cooper3@citrix.com> wrote:
> >> >>>>> On 22/02/17 09:23, Jan Beulich wrote:
> >> >>>>>>>>> On 20.02.17 at 12:00, <andrew.cooper3@citrix.com> wrote:
> >> >>>>>>> The domain builder in libxc no longer depends on leaked CPUID
> >> >>>>>>> information to properly construct HVM domains.  Remove the control
> >> >>>>>>> domain exclusion.
> >> >>>>>> Am I missing some intermediate step? As long as there's a raw
> >> >>>>>> CPUID invocation in xc_cpuid_x86.c (which is still there in staging
> >> >>>>>> and I don't recall this series removing it) it at least _feels_ unsafe.
> >> >>>>> Strictly speaking, the domain builder part of this was completed after
> >> >>>>> my xsave adjustments.  All the guest-type-dependent information now
> >> >>>>> comes from non-cpuid sources in libxc, or Xen ignores the toolstack
> >> >>>>> values and recalculates information itself.
> >> >>>>>
> >> >>>>> However, until the Intel leaves were complete, dom0 had a hard time
> >> >>>>> booting with this change as there were no toolstack-provided policy and
> >> >>>>> no leakage from hardware.
> >> >>>> So what are the CPUID uses in libxc then needed for at this point?
> >> >>>> Could they be removed in a prereq patch to make clear all needed
> >> >>>> information is now being obtained via hypercalls?
> >> >>> I'd prefer to defer that work.  The next chunk of CPUID work is going to
> >> >>> be redesigning and reimplementing the hypervisor/libxc interface, and
> >> >>> all cpuid() calls in libxc will fall out there, but its not a trivial
> >> >>> set of changes to make.
> >> >> With that, could you live with deferring the patch here until then?
> >> > 
> >> > We currently have a lot of dom0 implicit dependencies on leaked CPUID
> >> > state into PV dom0.
> >> > 
> >> > With this series, I believe I have identified all leaked dependencies,
> >> > and I really want to prevent us introducing any new implicit dependencies
> >> > accidentally.
> >> 
> >> I can certainly understand this, but the state libxc code is in then
> >> makes this a rather implicit thing, instead of being fully explicit. I
> >> think I'd like to have another (tools or REST) maintainer voice a 3rd
> >> opinion. Extending Cc list ...
> > 
> > I'm not sure I follow the implicit vs explicit bit.
> 
> Right now libxc still uses the CPUID instruction, thus implicitly (via
> CPUID faulting / masking) obtaining the intended (filtered) state of
> individual feature flags. The intended explicit way would be for it
> to instead use hypercalls to retrieve the information.
> 

AFAICT having this patch won't make things worse than before, with the
added benefit of avoiding dependencies on leaked information. If my
understanding is correct, I'm slightly in favour of committing it now.

Wei.

> Jan
> 



Thread overview: 65+ messages
2017-02-20 11:00 [PATCH 00/10] x86/cpuid: Remove the legacy infrastructure Andrew Cooper
2017-02-20 11:00 ` [PATCH 01/10] x86/cpuid: Disallow policy updates once the domain is running Andrew Cooper
2017-02-21 16:37   ` Jan Beulich
2017-02-20 11:00 ` [PATCH 02/10] x86/gen-cpuid: Clarify the intended meaning of AVX wrt feature dependencies Andrew Cooper
2017-02-21 16:40   ` Jan Beulich
2017-02-21 16:41     ` Andrew Cooper
2017-02-21 16:47     ` Jan Beulich
2017-02-21 16:53       ` Andrew Cooper
2017-02-21 17:07         ` Jan Beulich
2017-02-21 17:12           ` Andrew Cooper
2017-02-21 17:17             ` Jan Beulich
2017-02-21 17:42               ` Andrew Cooper
2017-02-22  7:13                 ` Jan Beulich
2017-02-20 11:00 ` [PATCH 03/10] x86/cpuid: Handle leaf 0x1 in guest_cpuid() Andrew Cooper
2017-02-21 16:59   ` Jan Beulich
2017-02-21 17:13     ` Andrew Cooper
2017-02-21 17:20       ` Jan Beulich
2017-02-21 17:29         ` Andrew Cooper
2017-02-22  7:16           ` Jan Beulich
2017-02-20 11:00 ` [PATCH 04/10] x86/cpuid: Handle leaf 0x4 " Andrew Cooper
2017-02-21 17:16   ` Jan Beulich
2017-02-21 17:35     ` Andrew Cooper
2017-02-22  7:23       ` Jan Beulich
2017-02-22  7:55         ` Andrew Cooper
2017-03-10 16:27   ` [PATCH v2 " Andrew Cooper
2017-03-13 12:03     ` Jan Beulich
2017-03-13 12:51       ` Andrew Cooper
2017-03-13 13:05         ` Jan Beulich
2017-03-13 13:24           ` Andrew Cooper
2017-03-13 13:36             ` Jan Beulich
2017-02-20 11:00 ` [PATCH 05/10] x86/cpuid: Handle leaf 0x5 " Andrew Cooper
2017-02-21 17:22   ` Jan Beulich
2017-02-20 11:00 ` [PATCH 06/10] x86/cpuid: Handle leaf 0x6 " Andrew Cooper
2017-02-21 17:25   ` Jan Beulich
2017-02-21 17:40     ` Andrew Cooper
2017-02-21 17:44       ` Andrew Cooper
2017-02-22  7:31       ` Jan Beulich
2017-02-22  8:23         ` Andrew Cooper
2017-02-22  9:12           ` Andrew Cooper
2017-02-22  9:26             ` Jan Beulich
2017-02-27 14:30               ` Andrew Cooper
2017-03-10 16:32   ` [PATCH v2 " Andrew Cooper
2017-03-13 12:04     ` Jan Beulich
2017-02-20 11:00 ` [PATCH 07/10] x86/cpuid: Handle leaf 0xa " Andrew Cooper
2017-02-22  9:11   ` Jan Beulich
2017-02-20 11:00 ` [PATCH 08/10] x86/cpuid: Handle leaf 0xb " Andrew Cooper
2017-02-22  9:16   ` Jan Beulich
2017-02-22 10:22     ` Andrew Cooper
2017-02-22 10:37       ` Jan Beulich
2017-02-27 15:05         ` Andrew Cooper
2017-03-10 16:44   ` [PATCH v2 " Andrew Cooper
2017-03-13 12:13     ` Jan Beulich
2017-02-20 11:00 ` [PATCH 09/10] x86/cpuid: Drop legacy CPUID infrastructure Andrew Cooper
2017-02-22  9:19   ` Jan Beulich
2017-02-20 11:00 ` [PATCH 10/10] x86/cpuid: Always enable faulting for the control domain Andrew Cooper
2017-02-22  9:23   ` Jan Beulich
2017-02-22 10:00     ` Andrew Cooper
2017-02-22 10:10       ` Jan Beulich
2017-02-27 15:10         ` Andrew Cooper
2017-02-28  9:31           ` Jan Beulich
2017-03-10 17:10             ` Andrew Cooper
2017-03-13 11:48               ` Jan Beulich
2017-03-14 15:06                 ` Wei Liu
2017-03-14 15:13                   ` Jan Beulich
2017-03-14 16:05                     ` Wei Liu
