All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/4] Expose HW APIC virtualization support to HVM guests
@ 2014-03-19  0:58 Boris Ostrovsky
  2014-03-19  0:58 ` [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file Boris Ostrovsky
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Boris Ostrovsky @ 2014-03-19  0:58 UTC (permalink / raw)
  To: jbeulich, keir, jun.nakajima, eddie.dong, ian.jackson,
	stefano.stabellini, ian.campbell
  Cc: yang.z.zhang, andrew.cooper3, boris.ostrovsky, xen-devel

Version 5:
* Remember to deal with Viridian leaves in xc_cpuid_constrain()

Version 4:
* asm changes per Jan's suggestions
* Don't pass leaf index number to handlers, check for sub_idx==0
* Added (possibly somewhat overwrought) interface to constrain user-requested
  CPUID to what is allowed (specifically, for leaf 0x40000000 only EAX[7:0]
  can be ovverwritten).

Version 3:
* Removed sysctl for querying hypervisor leaf from libxc, replaced it with
  prefixed CPUID instruction
* Limited ability change hypervisor leaves to 0x40000000
* Fixed IS_HYPERVISOR_LEAF macro

Version 2:
* Added ability to specify hypervisor CPUID leaves in config file (this requires
  new sysctl)
* Use 2 bits to indicate what is supported --- one for APIC memory access and the
  other for x2APIC. Still not sure whether virtual interrupt delivery should be
  exposed as well.



HVM guests running on HW that supports HW APIC virtualization features
(APIC-register virtualization, virtual interrupt delivery, etc) may
want to use APIC instead of hvm_pirqs. Since we are not guaranteed to
have these features on VMX (for example, there is a boot option to
turn it off) and there is no such support on SVM we need to make the
guest aware that its APIC accesses may not be so bad.

CPUID seems to be a good way to provide this info to the guest.

Having a guest switch to APIC shows fairly good impact on number of
VMEXITs. For example, with a pass-through NIC, netperf sees almost
half as many. Here are results for 'xentrace -e 0x00083fff -c 2 -D -T 2'

(The guest here essentially turned off XENFEAT_hvm_pirqs but we may
want to use APIC for MSI interrupts only and leave pirqs for gsi).


[root@ovs105 virt]#  cat orig  |xentrace_format  ~/xen/tools/xentrace/formats | awk  '{print $5}' | sort  | uniq -c
     94 cpu_change
  13944 HLT
  26341 INJ_VIRQ
  12054 INTR
  30784 INTR_WINDOW
  10126 TRAP
 124783 VMENTRY
 124782 VMEXIT
  59217 VMMCALL
     35 wrap_buffer
 
 
[root@ovs105 virt]#  cat apicv  |xentrace_format  ~/xen/tools/xentrace/formats | awk  '{print $5}' | sort  | uniq -c
     49 cpu_change
  16157 HLT
     31 INJ_VIRQ
  10652 INTR
     38 INTR_WINDOW
     10 NPF
  10286 TRAP
  71269 VMENTRY
  71269 VMEXIT
  34129 VMMCALL
     15 wrap_buffer

The difference is even larger when the guest is busy.

These results are in line with what has been reported for KVM. For example
http://events.linuxfoundation.org/sites/events/files/cojp13_natapov.pdf
http://www.linuxplumbersconf.org/2012/wp-content/uploads/2012/09/2012-lpc-virt-intel-vt-feat-nakajima.pdf

I am also not sure whether (cpu_has_vmx_apic_reg_virt &
cpu_has_vmx_virtualize_x2apic_mode) is sufficient to declare full HW
APIC support to a guest. The tests show ~95K VMEXITs when virtual
interrupt delivery and posted interrupts are turned off so there
appears to still be some benefit. I suppose we can use another CPUID
bit for these two (although I am not particularly eager to do this).

Boris Ostrovsky (4):
  xen/libxc: Allow changes to hypervisor CPUID leaf from config file
  x86/hvm: Revert 80ecb40362365ba77e68fc609de8bd3b7208ae19
  x86/hvm: Add HVM-specific hypervisor CPUID leaf
  x86/hvm: Indicate avaliability of HW support of APIC virtualization
    to HVM guests

 tools/libxc/xc_cpuid_x86.c          |   71 ++++++++++++++++++++++++++++++++++-
 xen/arch/x86/domain.c               |   19 ++++++++--
 xen/arch/x86/hvm/hvm.c              |    9 +++++
 xen/arch/x86/hvm/vmx/vmx.c          |   15 ++++++++
 xen/arch/x86/traps.c                |   18 ++++-----
 xen/include/asm-x86/domain.h        |    7 ++++
 xen/include/asm-x86/hvm/hvm.h       |    7 ++++
 xen/include/public/arch-x86/cpuid.h |   10 +++++
 8 files changed, 142 insertions(+), 14 deletions(-)

-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file
  2014-03-19  0:58 [PATCH v5 0/4] Expose HW APIC virtualization support to HVM guests Boris Ostrovsky
@ 2014-03-19  0:58 ` Boris Ostrovsky
  2014-03-19  9:26   ` Jan Beulich
  2014-03-19  9:27   ` Ian Campbell
  2014-03-19  0:58 ` [PATCH v5 2/4] x86/hvm: Revert 80ecb40362365ba77e68fc609de8bd3b7208ae19 Boris Ostrovsky
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 18+ messages in thread
From: Boris Ostrovsky @ 2014-03-19  0:58 UTC (permalink / raw)
  To: jbeulich, keir, jun.nakajima, eddie.dong, ian.jackson,
	stefano.stabellini, ian.campbell
  Cc: yang.z.zhang, andrew.cooper3, boris.ostrovsky, xen-devel

Currently only "real" cpuid leaves can be overwritten by users via
'cpuid' option in the configuration file. This patch provides ability to
do the same for hypervisor leaves (but for now only 0x40000000 is allowed).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 tools/libxc/xc_cpuid_x86.c   |   71 ++++++++++++++++++++++++++++++++++++++++--
 xen/arch/x86/domain.c        |   19 +++++++++--
 xen/arch/x86/traps.c         |    3 ++
 xen/include/asm-x86/domain.h |    7 +++++
 4 files changed, 95 insertions(+), 5 deletions(-)

diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index bbbf9b8..5501d5b 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -33,6 +33,8 @@
 #define DEF_MAX_INTELEXT  0x80000008u
 #define DEF_MAX_AMDEXT    0x8000001cu
 
+#define IS_HYPERVISOR_LEAF(idx) (((idx) & 0xffff0000) == 0x40000000)
+
 static int hypervisor_is_64bit(xc_interface *xch)
 {
     xen_capabilities_info_t xen_caps = "";
@@ -43,22 +45,31 @@ static int hypervisor_is_64bit(xc_interface *xch)
 static void cpuid(const unsigned int *input, unsigned int *regs)
 {
     unsigned int count = (input[1] == XEN_CPUID_INPUT_UNUSED) ? 0 : input[1];
+    uint8_t is_hyp = IS_HYPERVISOR_LEAF(input[0]);
 #ifdef __i386__
     /* Use the stack to avoid reg constraint failures with some gcc flags */
     asm (
         "push %%ebx; push %%edx\n\t"
+        "testb $0xff,%5\n\t"
+        "jz .Lcpuid%=\n\t"
+        XEN_EMULATE_PREFIX
+        ".Lcpuid%=:\n\t"
         "cpuid\n\t"
         "mov %%ebx,4(%4)\n\t"
         "mov %%edx,12(%4)\n\t"
         "pop %%edx; pop %%ebx\n\t"
         : "=a" (regs[0]), "=c" (regs[2])
-        : "0" (input[0]), "1" (count), "S" (regs)
+        : "0" (input[0]), "1" (count), "S" (regs), "q" (is_hyp)
         : "memory" );
 #else
     asm (
+        "testb $0xff,%6\n\t"
+        "jz .Lcpuid%=\n\t"
+        XEN_EMULATE_PREFIX
+        ".Lcpuid%=:\n\t"
         "cpuid"
         : "=a" (regs[0]), "=b" (regs[1]), "=c" (regs[2]), "=d" (regs[3])
-        : "0" (input[0]), "2" (count) );
+        : "0" (input[0]), "2" (count), "q" (is_hyp) );
 #endif
 }
 
@@ -555,6 +566,9 @@ static int xc_cpuid_policy(
 {
     xc_dominfo_t        info;
 
+    if ( IS_HYPERVISOR_LEAF(input[0]) )
+        return 0;
+
     if ( xc_domain_getinfo(xch, domid, 1, &info) == 0 )
         return -EINVAL;
 
@@ -726,6 +740,57 @@ int xc_cpuid_check(
     return rc;
 }
 
+static void xc_cpuid_revert(unsigned int *reg, unsigned int polreg,
+    unsigned int start, unsigned int end, char *config)
+{
+    unsigned int i;
+
+    for (i = start; i <= end; i++)
+    {
+        if ( polreg & (1U << i) )
+        {
+            set_bit(i, *reg);
+            config[32 - i] = '1';
+        }
+        else
+        {
+            clear_bit(i, *reg);
+            config[32 - i] = '0';
+        }
+    }
+}
+
+static void xc_cpuid_constrain(const unsigned int *input, unsigned int *regs,
+    unsigned int *polregs, char **config_transformed)
+{
+    unsigned int i;
+
+    if ( IS_HYPERVISOR_LEAF(input[0]) )
+    {
+        switch ( input[0] & 0xff )
+        {
+            case 0:
+                if ( (regs[0] & 0xffffff00) != (polregs[0] & 0xffffff00) )
+                        xc_cpuid_revert(&regs[0], polregs[0], 8, 31,
+                                        config_transformed[0]);
+                if ( (regs[0] & 0xff) < 2 )
+                        xc_cpuid_revert(&regs[0], polregs[0], 0, 7,
+                                        config_transformed[0]);
+                for (i = 1; i < 4; i++)
+                    if ( regs[i] != polregs[i] )
+                        xc_cpuid_revert(&regs[i], polregs[i], 0, 31,
+                                        config_transformed[i]);
+                break;
+
+            default:
+                for (i = 0; i < 4; i++)
+                    xc_cpuid_revert(&regs[i], polregs[i], 0, 31,
+                                    config_transformed[i]);
+                break;
+        }
+    }
+}
+
 /*
  * Configure a single input with the informatiom from config.
  *
@@ -801,6 +866,8 @@ int xc_cpuid_set(
         }
     }
 
+    xc_cpuid_constrain(input, regs, polregs, config_transformed);
+
     rc = xc_cpuid_do_domctl(xch, domid, input, regs);
     if ( rc == 0 )
         return 0;
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index c42a079..98e2b5f 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1997,7 +1997,7 @@ void arch_dump_vcpu_info(struct vcpu *v)
         vpmu_dump(v);
 }
 
-void domain_cpuid(
+bool_t domain_cpuid_exists(
     struct domain *d,
     unsigned int  input,
     unsigned int  sub_input,
@@ -2030,11 +2030,24 @@ void domain_cpuid(
                  !d->disable_migrate && !d->arch.vtsc )
                 *edx &= ~(1u<<8); /* TSC Invariant */
 
-            return;
+            return 1;
         }
     }
 
-    *eax = *ebx = *ecx = *edx = 0;
+    return 0;
+}
+
+void domain_cpuid(
+    struct domain *d,
+    unsigned int  input,
+    unsigned int  sub_input,
+    unsigned int  *eax,
+    unsigned int  *ebx,
+    unsigned int  *ecx,
+    unsigned int  *edx)
+{
+    if ( !domain_cpuid_exists(d, input, sub_input, eax, ebx, ecx, edx) )
+        *eax = *ebx = *ecx = *edx = 0;
 }
 
 void vcpu_kick(struct vcpu *v)
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index c462317..e1f39d9 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -690,6 +690,9 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
     if ( idx > limit ) 
         return 0;
 
+    if ( domain_cpuid_exists(d, base + idx, sub_idx, eax, ebx, ecx, edx) )
+        return 1;
+
     switch ( idx )
     {
     case 0:
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 49f7c0c..5fd47de 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -471,6 +471,13 @@ unsigned long pv_guest_cr4_fixup(const struct vcpu *, unsigned long guest_cr4);
     ((c) & ~(X86_CR4_PGE | X86_CR4_PSE | X86_CR4_TSD |      \
              X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE))
 
+bool_t domain_cpuid_exists(struct domain *d,
+                           unsigned int  input,
+                           unsigned int  sub_input,
+                           unsigned int  *eax,
+                           unsigned int  *ebx,
+                           unsigned int  *ecx,
+                           unsigned int  *edx);
 void domain_cpuid(struct domain *d,
                   unsigned int  input,
                   unsigned int  sub_input,
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 2/4] x86/hvm: Revert 80ecb40362365ba77e68fc609de8bd3b7208ae19
  2014-03-19  0:58 [PATCH v5 0/4] Expose HW APIC virtualization support to HVM guests Boris Ostrovsky
  2014-03-19  0:58 ` [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file Boris Ostrovsky
@ 2014-03-19  0:58 ` Boris Ostrovsky
  2014-03-19  0:58 ` [PATCH v5 3/4] x86/hvm: Add HVM-specific hypervisor CPUID leaf Boris Ostrovsky
  2014-03-19  0:58 ` [PATCH v5 4/4] x86/hvm: Indicate avaliability of HW support of APIC virtualization to HVM guests Boris Ostrovsky
  3 siblings, 0 replies; 18+ messages in thread
From: Boris Ostrovsky @ 2014-03-19  0:58 UTC (permalink / raw)
  To: jbeulich, keir, jun.nakajima, eddie.dong, ian.jackson,
	stefano.stabellini, ian.campbell
  Cc: yang.z.zhang, andrew.cooper3, boris.ostrovsky, xen-devel

The Solaris bug that commit 80ecb40362365ba77e68fc609de8bd3b7208ae19
addressed has been fixed and backported to earlier releases.

Those still using those releases can specify number of hypervisor leaves
explicitly via 'cpuid' xl config option.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/arch/x86/traps.c |   11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index e1f39d9..75a0ceb 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -677,17 +677,10 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
     struct domain *d = current->domain;
     /* Optionally shift out of the way of Viridian architectural leaves. */
     uint32_t base = is_viridian_domain(d) ? 0x40000100 : 0x40000000;
-    uint32_t limit;
 
     idx -= base;
 
-    /*
-     * Some Solaris PV drivers fail if max > base + 2. Help them out by
-     * hiding the PVRDTSCP leaf if PVRDTSCP is disabled.
-     */
-    limit = (d->arch.tsc_mode < TSC_MODE_PVRDTSCP) ? 2 : 3;
-
-    if ( idx > limit ) 
+    if ( idx > 3 ) 
         return 0;
 
     if ( domain_cpuid_exists(d, base + idx, sub_idx, eax, ebx, ecx, edx) )
@@ -696,7 +689,7 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
     switch ( idx )
     {
     case 0:
-        *eax = base + limit; /* Largest leaf */
+        *eax = base + 3; /* Largest leaf */
         *ebx = XEN_CPUID_SIGNATURE_EBX;
         *ecx = XEN_CPUID_SIGNATURE_ECX;
         *edx = XEN_CPUID_SIGNATURE_EDX;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 3/4] x86/hvm: Add HVM-specific hypervisor CPUID leaf
  2014-03-19  0:58 [PATCH v5 0/4] Expose HW APIC virtualization support to HVM guests Boris Ostrovsky
  2014-03-19  0:58 ` [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file Boris Ostrovsky
  2014-03-19  0:58 ` [PATCH v5 2/4] x86/hvm: Revert 80ecb40362365ba77e68fc609de8bd3b7208ae19 Boris Ostrovsky
@ 2014-03-19  0:58 ` Boris Ostrovsky
  2014-03-19  0:58 ` [PATCH v5 4/4] x86/hvm: Indicate avaliability of HW support of APIC virtualization to HVM guests Boris Ostrovsky
  3 siblings, 0 replies; 18+ messages in thread
From: Boris Ostrovsky @ 2014-03-19  0:58 UTC (permalink / raw)
  To: jbeulich, keir, jun.nakajima, eddie.dong, ian.jackson,
	stefano.stabellini, ian.campbell
  Cc: yang.z.zhang, andrew.cooper3, boris.ostrovsky, xen-devel

CPUID leaf 0x40000004 is for HVM-specific features.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/arch/x86/hvm/hvm.c        |    9 +++++++++
 xen/arch/x86/traps.c          |    8 ++++++--
 xen/include/asm-x86/hvm/hvm.h |    7 +++++++
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index ae24211..b07f11e 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2980,6 +2980,15 @@ unsigned long copy_from_user_hvm(void *to, const void *from, unsigned len)
     return rc ? len : 0; /* fake a copy_from_user() return code */
 }
 
+void hvm_hypervisor_cpuid_leaf(uint32_t sub_idx,
+                               uint32_t *eax, uint32_t *ebx,
+                               uint32_t *ecx, uint32_t *edx)
+{
+    *eax = *ebx = *ecx = *edx = 0;
+    if ( hvm_funcs.hypervisor_cpuid_leaf )
+        hvm_funcs.hypervisor_cpuid_leaf(sub_idx, eax, ebx, ecx, edx);
+}
+
 void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
                                    unsigned int *ecx, unsigned int *edx)
 {
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 75a0ceb..6dfb721 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -680,7 +680,7 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
 
     idx -= base;
 
-    if ( idx > 3 ) 
+    if ( idx > 4 ) 
         return 0;
 
     if ( domain_cpuid_exists(d, base + idx, sub_idx, eax, ebx, ecx, edx) )
@@ -689,7 +689,7 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
     switch ( idx )
     {
     case 0:
-        *eax = base + 3; /* Largest leaf */
+        *eax = base + 4; /* Largest leaf */
         *ebx = XEN_CPUID_SIGNATURE_EBX;
         *ecx = XEN_CPUID_SIGNATURE_ECX;
         *edx = XEN_CPUID_SIGNATURE_EDX;
@@ -718,6 +718,10 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
         cpuid_time_leaf( sub_idx, eax, ebx, ecx, edx );
         break;
 
+    case 4:
+        hvm_hypervisor_cpuid_leaf(sub_idx, eax, ebx, ecx, edx);
+        break;
+
     default:
         BUG();
     }
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index dcc3483..a030ea4 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -200,6 +200,10 @@ struct hvm_function_table {
                                 paddr_t *L1_gpa, unsigned int *page_order,
                                 uint8_t *p2m_acc, bool_t access_r,
                                 bool_t access_w, bool_t access_x);
+
+    void (*hypervisor_cpuid_leaf)(uint32_t sub_idx,
+                                  uint32_t *eax, uint32_t *ebx,
+                                  uint32_t *ecx, uint32_t *edx);
 };
 
 extern struct hvm_function_table hvm_funcs;
@@ -336,6 +340,9 @@ static inline unsigned long hvm_get_shadow_gs_base(struct vcpu *v)
 #define is_viridian_domain(_d)                                             \
  (is_hvm_domain(_d) && ((_d)->arch.hvm_domain.params[HVM_PARAM_VIRIDIAN]))
 
+void hvm_hypervisor_cpuid_leaf(uint32_t sub_idx,
+                               uint32_t *eax, uint32_t *ebx,
+                               uint32_t *ecx, uint32_t *edx);
 void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
                                    unsigned int *ecx, unsigned int *edx);
 void hvm_migrate_timers(struct vcpu *v);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 4/4] x86/hvm: Indicate avaliability of HW support of APIC virtualization to HVM guests
  2014-03-19  0:58 [PATCH v5 0/4] Expose HW APIC virtualization support to HVM guests Boris Ostrovsky
                   ` (2 preceding siblings ...)
  2014-03-19  0:58 ` [PATCH v5 3/4] x86/hvm: Add HVM-specific hypervisor CPUID leaf Boris Ostrovsky
@ 2014-03-19  0:58 ` Boris Ostrovsky
  3 siblings, 0 replies; 18+ messages in thread
From: Boris Ostrovsky @ 2014-03-19  0:58 UTC (permalink / raw)
  To: jbeulich, keir, jun.nakajima, eddie.dong, ian.jackson,
	stefano.stabellini, ian.campbell
  Cc: yang.z.zhang, andrew.cooper3, boris.ostrovsky, xen-devel

Set bits in hypervisor CPUID leaf indicating that HW provides (and the
hypervisor enables) HW support for APIC and x2APIC virtualization.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/arch/x86/hvm/vmx/vmx.c          |   15 +++++++++++++++
 xen/include/public/arch-x86/cpuid.h |   10 ++++++++++
 2 files changed, 25 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 8395e86..a59bac6 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -57,6 +57,7 @@
 #include <asm/apic.h>
 #include <asm/hvm/nestedhvm.h>
 #include <asm/event.h>
+#include <public/arch-x86/cpuid.h>
 
 enum handler_return { HNDL_done, HNDL_unhandled, HNDL_exception_raised };
 
@@ -1646,6 +1647,19 @@ static void vmx_handle_eoi(u8 vector)
     __vmwrite(GUEST_INTR_STATUS, status);
 }
 
+void vmx_hypervisor_cpuid_leaf(uint32_t sub_idx,
+                               uint32_t *eax, uint32_t *ebx,
+                               uint32_t *ecx, uint32_t *edx)
+{
+    if ( sub_idx != 0 )
+        return;
+
+    if ( cpu_has_vmx_apic_reg_virt )
+        *eax |= XEN_HVM_CPUID_APIC_ACCESS_VIRT;
+    if ( cpu_has_vmx_virtualize_x2apic_mode )
+        *eax |= XEN_HVM_CPUID_X2APIC_VIRT;
+}
+
 static struct hvm_function_table __initdata vmx_function_table = {
     .name                 = "VMX",
     .cpu_up_prepare       = vmx_cpu_up_prepare,
@@ -1703,6 +1717,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .sync_pir_to_irr      = vmx_sync_pir_to_irr,
     .handle_eoi           = vmx_handle_eoi,
     .nhvm_hap_walk_L1_p2m = nvmx_hap_walk_L1_p2m,
+    .hypervisor_cpuid_leaf= vmx_hypervisor_cpuid_leaf,
 };
 
 const struct hvm_function_table * __init start_vmx(void)
diff --git a/xen/include/public/arch-x86/cpuid.h b/xen/include/public/arch-x86/cpuid.h
index d9bd627..7118760 100644
--- a/xen/include/public/arch-x86/cpuid.h
+++ b/xen/include/public/arch-x86/cpuid.h
@@ -65,4 +65,14 @@
 #define _XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD 0
 #define XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD  (1u<<0)
 
+/*
+ * Leaf 5 (0x40000004)
+ * HVM-specific features
+ */
+
+/* EAX Features */
+#define XEN_HVM_CPUID_APIC_ACCESS_VIRT (1u << 0) /* Virtualized APIC registers */
+#define XEN_HVM_CPUID_X2APIC_VIRT      (1u << 1) /* Virtualized x2APIC accesses */
+
+
 #endif /* __XEN_PUBLIC_ARCH_X86_CPUID_H__ */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file
  2014-03-19  0:58 ` [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file Boris Ostrovsky
@ 2014-03-19  9:26   ` Jan Beulich
  2014-03-19  9:29     ` Ian Campbell
  2014-03-19  9:27   ` Ian Campbell
  1 sibling, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2014-03-19  9:26 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: keir, ian.campbell, stefano.stabellini, andrew.cooper3,
	eddie.dong, xen-devel, jun.nakajima, yang.z.zhang, ian.jackson

>>> On 19.03.14 at 01:58, Boris Ostrovsky <boris.ostrovsky@oracle.com> wrote:
> Currently only "real" cpuid leaves can be overwritten by users via
> 'cpuid' option in the configuration file. This patch provides ability to
> do the same for hypervisor leaves (but for now only 0x40000000 is allowed).

I'm still not really happy with the approach:

> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -690,6 +690,9 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
>      if ( idx > limit ) 
>          return 0;
>  
> +    if ( domain_cpuid_exists(d, base + idx, sub_idx, eax, ebx, ecx, edx) )
> +        return 1;

You basically rely here on the tools having fed back the values the
hypervisor returned to them. What if this gets tweaked at some
point based on some other domain characteristics? I'd really like to
see the hypervisor, and only the hypervisor, being in charge of
returning the hypervisor leaf values. The sole restriction the tools
may place ought to be on the maximum leaf.

Sadly I didn't see anyone else commenting on this model of yours,
so I can't judge whether I'm the only one having reservations
against the model you have been proposing so far.

Jan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file
  2014-03-19  0:58 ` [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file Boris Ostrovsky
  2014-03-19  9:26   ` Jan Beulich
@ 2014-03-19  9:27   ` Ian Campbell
  2014-03-19  9:32     ` Jan Beulich
  2014-03-19 14:41     ` Boris Ostrovsky
  1 sibling, 2 replies; 18+ messages in thread
From: Ian Campbell @ 2014-03-19  9:27 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: keir, jbeulich, stefano.stabellini, eddie.dong, ian.jackson,
	xen-devel, jun.nakajima, andrew.cooper3, yang.z.zhang

On Tue, 2014-03-18 at 20:58 -0400, Boris Ostrovsky wrote:
> Currently only "real" cpuid leaves can be overwritten by users via
> 'cpuid' option in the configuration file. This patch provides ability to
> do the same for hypervisor leaves (but for now only 0x40000000 is allowed).
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> ---
>  tools/libxc/xc_cpuid_x86.c   |   71 ++++++++++++++++++++++++++++++++++++++++--
>  xen/arch/x86/domain.c        |   19 +++++++++--
>  xen/arch/x86/traps.c         |    3 ++
>  xen/include/asm-x86/domain.h |    7 +++++
>  4 files changed, 95 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
> index bbbf9b8..5501d5b 100644
> --- a/tools/libxc/xc_cpuid_x86.c
> +++ b/tools/libxc/xc_cpuid_x86.c
> @@ -33,6 +33,8 @@
>  #define DEF_MAX_INTELEXT  0x80000008u
>  #define DEF_MAX_AMDEXT    0x8000001cu
>  
> +#define IS_HYPERVISOR_LEAF(idx) (((idx) & 0xffff0000) == 0x40000000)

Not idx == 0x40000000?

Also as I think Jan said before if viridian support is enabled then the
Xen leaves may be elsewhere (at 0x100 increments above that address
IIRC).

> +
>  static int hypervisor_is_64bit(xc_interface *xch)
>  {
>      xen_capabilities_info_t xen_caps = "";
> @@ -43,22 +45,31 @@ static int hypervisor_is_64bit(xc_interface *xch)
>  static void cpuid(const unsigned int *input, unsigned int *regs)
>  {
>      unsigned int count = (input[1] == XEN_CPUID_INPUT_UNUSED) ? 0 : input[1];
> +    uint8_t is_hyp = IS_HYPERVISOR_LEAF(input[0]);
>  #ifdef __i386__
>      /* Use the stack to avoid reg constraint failures with some gcc flags */
>      asm (
>          "push %%ebx; push %%edx\n\t"
> +        "testb $0xff,%5\n\t"
> +        "jz .Lcpuid%=\n\t"
> +        XEN_EMULATE_PREFIX
> +        ".Lcpuid%=:\n\t"
>          "cpuid\n\t"
>          "mov %%ebx,4(%4)\n\t"
>          "mov %%edx,12(%4)\n\t"
>          "pop %%edx; pop %%ebx\n\t"
>          : "=a" (regs[0]), "=c" (regs[2])
> -        : "0" (input[0]), "1" (count), "S" (regs)
> +        : "0" (input[0]), "1" (count), "S" (regs), "q" (is_hyp)

I think this would be clearer refactored into make_real_cpuid() and
make_pv_cpuid() functions.

It also seems strange to use emulated for a subset of leafs, although I
understand why.

How does this play out in e.g. a PVH toolstack domain where the even
"real" cpuid might be faked?

Perhaps we should have a hypercall to retrieve the complete set of real
h/w, levelled h/w, pv, emulated etc values for a given leaf?

> @@ -726,6 +740,57 @@ int xc_cpuid_check(
>      return rc;
>  }
>  
> +static void xc_cpuid_revert(unsigned int *reg, unsigned int polreg,
> +    unsigned int start, unsigned int end, char *config)
[...]
> +static void xc_cpuid_constrain(const unsigned int *input, unsigned int *regs,
> +    unsigned int *polregs, char **config_transformed)

These complicated functions most certainly need some sort of explanation
about what they are trying to do.

Why do the hypervisor leaves need to be manipulated by new functions, do
the existing facilities used for all the others not work?

> +{
> +    unsigned int i;
> +
> +    if ( IS_HYPERVISOR_LEAF(input[0]) )
> +    {
> +        switch ( input[0] & 0xff )
> +        {
> +            case 0:
> +                if ( (regs[0] & 0xffffff00) != (polregs[0] & 0xffffff00) )
> +                        xc_cpuid_revert(&regs[0], polregs[0], 8, 31,
> +                                        config_transformed[0]);
> +                if ( (regs[0] & 0xff) < 2 )

Lots of magic numbers with varying number's of f's. Please #define them.

> +                        xc_cpuid_revert(&regs[0], polregs[0], 0, 7,
> +                                        config_transformed[0]);
> +                for (i = 1; i < 4; i++)
> +                    if ( regs[i] != polregs[i] )
> +                        xc_cpuid_revert(&regs[i], polregs[i], 0, 31,
> +                                        config_transformed[i]);
> +                break;
> +
> +            default:
> +                for (i = 0; i < 4; i++)
> +                    xc_cpuid_revert(&regs[i], polregs[i], 0, 31,
> +                                    config_transformed[i]);
> +                break;
> +        }
> +    }
> +}
> +
>  /*
>   * Configure a single input with the informatiom from config.
>   *

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file
  2014-03-19  9:26   ` Jan Beulich
@ 2014-03-19  9:29     ` Ian Campbell
  2014-03-19 14:26       ` Boris Ostrovsky
  0 siblings, 1 reply; 18+ messages in thread
From: Ian Campbell @ 2014-03-19  9:29 UTC (permalink / raw)
  To: Jan Beulich
  Cc: keir, eddie.dong, stefano.stabellini, andrew.cooper3,
	ian.jackson, xen-devel, jun.nakajima, yang.z.zhang,
	Boris Ostrovsky

On Wed, 2014-03-19 at 09:26 +0000, Jan Beulich wrote:
> >>> On 19.03.14 at 01:58, Boris Ostrovsky <boris.ostrovsky@oracle.com> wrote:
> > Currently only "real" cpuid leaves can be overwritten by users via
> > 'cpuid' option in the configuration file. This patch provides ability to
> > do the same for hypervisor leaves (but for now only 0x40000000 is allowed).
> 
> I'm still not really happy with the approach:
> 
> > --- a/xen/arch/x86/traps.c
> > +++ b/xen/arch/x86/traps.c
> > @@ -690,6 +690,9 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
> >      if ( idx > limit ) 
> >          return 0;
> >  
> > +    if ( domain_cpuid_exists(d, base + idx, sub_idx, eax, ebx, ecx, edx) )
> > +        return 1;
> 
> You basically rely here on the tools having fed back the values the
> hypervisor returned to them. What if this gets tweaked at some
> point based on some other domain characteristics? I'd really like to
> see the hypervisor, and only the hypervisor, being in charge of
> returning the hypervisor leaf values. The sole restriction the tools
> may place ought to be on the maximum leaf.

That does sound like it would simplify things.

> Sadly I didn't see anyone else commenting on this model of yours,
> so I can't judge whether I'm the only one having reservations
> against the model you have been proposing so far.

Well, the libxc side is certainly very complicated if all which is
needed is the ability to limit the maximum hypervisor leaf which is
exposed. That ought to be a dozen lines max across tools and hypervisor
I would have said.

Ian.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file
  2014-03-19  9:27   ` Ian Campbell
@ 2014-03-19  9:32     ` Jan Beulich
  2014-03-19  9:52       ` Ian Campbell
  2014-03-19 14:41     ` Boris Ostrovsky
  1 sibling, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2014-03-19  9:32 UTC (permalink / raw)
  To: Ian Campbell
  Cc: keir, eddie.dong, stefano.stabellini, andrew.cooper3,
	ian.jackson, xen-devel, jun.nakajima, yang.z.zhang,
	Boris Ostrovsky

>>> On 19.03.14 at 10:27, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Tue, 2014-03-18 at 20:58 -0400, Boris Ostrovsky wrote:
>> Currently only "real" cpuid leaves can be overwritten by users via
>> 'cpuid' option in the configuration file. This patch provides ability to
>> do the same for hypervisor leaves (but for now only 0x40000000 is allowed).
>> 
>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> ---
>>  tools/libxc/xc_cpuid_x86.c   |   71 
> ++++++++++++++++++++++++++++++++++++++++--
>>  xen/arch/x86/domain.c        |   19 +++++++++--
>>  xen/arch/x86/traps.c         |    3 ++
>>  xen/include/asm-x86/domain.h |    7 +++++
>>  4 files changed, 95 insertions(+), 5 deletions(-)
>> 
>> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
>> index bbbf9b8..5501d5b 100644
>> --- a/tools/libxc/xc_cpuid_x86.c
>> +++ b/tools/libxc/xc_cpuid_x86.c
>> @@ -33,6 +33,8 @@
>>  #define DEF_MAX_INTELEXT  0x80000008u
>>  #define DEF_MAX_AMDEXT    0x8000001cu
>>  
>> +#define IS_HYPERVISOR_LEAF(idx) (((idx) & 0xffff0000) == 0x40000000)
> 
> Not idx == 0x40000000?
> 
> Also as I think Jan said before if viridian support is enabled then the
> Xen leaves may be elsewhere (at 0x100 increments above that address
> IIRC).

But that's exactly why the low 16 bits should be masked off in
above comparison.

Jan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file
  2014-03-19  9:32     ` Jan Beulich
@ 2014-03-19  9:52       ` Ian Campbell
  2014-03-19 10:06         ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Ian Campbell @ 2014-03-19  9:52 UTC (permalink / raw)
  To: Jan Beulich
  Cc: keir, eddie.dong, stefano.stabellini, andrew.cooper3,
	ian.jackson, xen-devel, jun.nakajima, yang.z.zhang,
	Boris Ostrovsky

On Wed, 2014-03-19 at 09:32 +0000, Jan Beulich wrote:
> >>> On 19.03.14 at 10:27, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Tue, 2014-03-18 at 20:58 -0400, Boris Ostrovsky wrote:
> >> Currently only "real" cpuid leaves can be overwritten by users via
> >> 'cpuid' option in the configuration file. This patch provides ability to
> >> do the same for hypervisor leaves (but for now only 0x40000000 is allowed).
> >> 
> >> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> >> ---
> >>  tools/libxc/xc_cpuid_x86.c   |   71 
> > ++++++++++++++++++++++++++++++++++++++++--
> >>  xen/arch/x86/domain.c        |   19 +++++++++--
> >>  xen/arch/x86/traps.c         |    3 ++
> >>  xen/include/asm-x86/domain.h |    7 +++++
> >>  4 files changed, 95 insertions(+), 5 deletions(-)
> >> 
> >> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
> >> index bbbf9b8..5501d5b 100644
> >> --- a/tools/libxc/xc_cpuid_x86.c
> >> +++ b/tools/libxc/xc_cpuid_x86.c
> >> @@ -33,6 +33,8 @@
> >>  #define DEF_MAX_INTELEXT  0x80000008u
> >>  #define DEF_MAX_AMDEXT    0x8000001cu
> >>  
> >> +#define IS_HYPERVISOR_LEAF(idx) (((idx) & 0xffff0000) == 0x40000000)
> > 
> > Not idx == 0x40000000?
> > 
> > Also as I think Jan said before if viridian support is enabled then the
> > Xen leaves may be elsewhere (at 0x100 increments above that address
> > IIRC).
> 
> But that's exactly why the low 16 bits should be masked off in
> above comparison.

Is it 0x100 or 0x1000 increments? I thought it was 0x100, in which case
shouldn't the mask be 0xfffff000?

I think this should be clarified in the xen/include/public/cpuid.h
header, perhaps including this sort of macro for convenience.

Ian.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file
  2014-03-19  9:52       ` Ian Campbell
@ 2014-03-19 10:06         ` Jan Beulich
  2014-03-19 10:39           ` Ian Campbell
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2014-03-19 10:06 UTC (permalink / raw)
  To: Ian Campbell
  Cc: keir, eddie.dong, stefano.stabellini, andrew.cooper3,
	ian.jackson, xen-devel, jun.nakajima, yang.z.zhang,
	Boris Ostrovsky

>>> On 19.03.14 at 10:52, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Wed, 2014-03-19 at 09:32 +0000, Jan Beulich wrote:
>> >>> On 19.03.14 at 10:27, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> > On Tue, 2014-03-18 at 20:58 -0400, Boris Ostrovsky wrote:
>> >> Currently only "real" cpuid leaves can be overwritten by users via
>> >> 'cpuid' option in the configuration file. This patch provides ability to
>> >> do the same for hypervisor leaves (but for now only 0x40000000 is allowed).
>> >> 
>> >> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> >> ---
>> >>  tools/libxc/xc_cpuid_x86.c   |   71 
>> > ++++++++++++++++++++++++++++++++++++++++--
>> >>  xen/arch/x86/domain.c        |   19 +++++++++--
>> >>  xen/arch/x86/traps.c         |    3 ++
>> >>  xen/include/asm-x86/domain.h |    7 +++++
>> >>  4 files changed, 95 insertions(+), 5 deletions(-)
>> >> 
>> >> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
>> >> index bbbf9b8..5501d5b 100644
>> >> --- a/tools/libxc/xc_cpuid_x86.c
>> >> +++ b/tools/libxc/xc_cpuid_x86.c
>> >> @@ -33,6 +33,8 @@
>> >>  #define DEF_MAX_INTELEXT  0x80000008u
>> >>  #define DEF_MAX_AMDEXT    0x8000001cu
>> >>  
>> >> +#define IS_HYPERVISOR_LEAF(idx) (((idx) & 0xffff0000) == 0x40000000)
>> > 
>> > Not idx == 0x40000000?
>> > 
>> > Also as I think Jan said before if viridian support is enabled then the
>> > Xen leaves may be elsewhere (at 0x100 increments above that address
>> > IIRC).
>> 
>> But that's exactly why the low 16 bits should be masked off in
>> above comparison.
> 
> Is it 0x100 or 0x1000 increments? I thought it was 0x100, in which case
> shouldn't the mask be 0xfffff000?

It's 0x100 increments, but that doesn't relate to the mask to be
applied here - major groups appear to be using 64k increments
(0000 - basic, 4000 - hypervisor, 8000 - extended, 8086 -
Transmeta, C000 - VIA/Cyrix, and I guess there are others I
don't know about). I don't think I've seen this publicly/formally
documented so far.

Jan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file
  2014-03-19 10:06         ` Jan Beulich
@ 2014-03-19 10:39           ` Ian Campbell
  2014-03-19 11:44             ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Ian Campbell @ 2014-03-19 10:39 UTC (permalink / raw)
  To: Jan Beulich
  Cc: keir, stefano.stabellini, andrew.cooper3, eddie.dong, xen-devel,
	jun.nakajima, yang.z.zhang, Boris Ostrovsky, ian.jackson

On Wed, 2014-03-19 at 10:06 +0000, Jan Beulich wrote:
> >>> On 19.03.14 at 10:52, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Wed, 2014-03-19 at 09:32 +0000, Jan Beulich wrote:
> >> >>> On 19.03.14 at 10:27, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> >> > On Tue, 2014-03-18 at 20:58 -0400, Boris Ostrovsky wrote:
> >> >> Currently only "real" cpuid leaves can be overwritten by users via
> >> >> 'cpuid' option in the configuration file. This patch provides ability to
> >> >> do the same for hypervisor leaves (but for now only 0x40000000 is allowed).
> >> >> 
> >> >> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> >> >> ---
> >> >>  tools/libxc/xc_cpuid_x86.c   |   71 
> >> > ++++++++++++++++++++++++++++++++++++++++--
> >> >>  xen/arch/x86/domain.c        |   19 +++++++++--
> >> >>  xen/arch/x86/traps.c         |    3 ++
> >> >>  xen/include/asm-x86/domain.h |    7 +++++
> >> >>  4 files changed, 95 insertions(+), 5 deletions(-)
> >> >> 
> >> >> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
> >> >> index bbbf9b8..5501d5b 100644
> >> >> --- a/tools/libxc/xc_cpuid_x86.c
> >> >> +++ b/tools/libxc/xc_cpuid_x86.c
> >> >> @@ -33,6 +33,8 @@
> >> >>  #define DEF_MAX_INTELEXT  0x80000008u
> >> >>  #define DEF_MAX_AMDEXT    0x8000001cu
> >> >>  
> >> >> +#define IS_HYPERVISOR_LEAF(idx) (((idx) & 0xffff0000) == 0x40000000)
> >> > 
> >> > Not idx == 0x40000000?
> >> > 
> >> > Also as I think Jan said before if viridian support is enabled then the
> >> > Xen leaves may be elsewhere (at 0x100 increments above that address
> >> > IIRC).
> >> 
> >> But that's exactly why the low 16 bits should be masked off in
> >> above comparison.
> > 
> > Is it 0x100 or 0x1000 increments? I thought it was 0x100, in which case
> > shouldn't the mask be 0xfffff000?
> 
> It's 0x100 increments, but that doesn't relate to the mask to be
> applied here - major groups appear to be using 64k increments
> (0000 - basic, 4000 - hypervisor, 8000 - extended, 8086 -
> Transmeta, C000 - VIA/Cyrix, and I guess there are others I
> don't know about). I don't think I've seen this publicly/formally
> documented so far.

OK, that makes sense from a major group perspective.

But I think the "first minor group" of hypervisor nodes at 0x40000000
stops at 0x40010000, at least implicitly due to the existing code in e.g
unmodified_drivers/linux-2.6/platform-pci/platform-pci.c and
tools/misc/xen-detect.c. I don't think it is out of the question that we
might want to put other stuff at e.g. 0x40020000 (or at least we should
retain the option).

Ian.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file
  2014-03-19 10:39           ` Ian Campbell
@ 2014-03-19 11:44             ` Jan Beulich
  2014-03-19 12:02               ` Ian Campbell
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2014-03-19 11:44 UTC (permalink / raw)
  To: Ian Campbell
  Cc: keir, eddie.dong, stefano.stabellini, andrew.cooper3,
	ian.jackson, xen-devel, jun.nakajima, yang.z.zhang,
	Boris Ostrovsky

>>> On 19.03.14 at 11:39, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Wed, 2014-03-19 at 10:06 +0000, Jan Beulich wrote:
>> >>> On 19.03.14 at 10:52, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> > On Wed, 2014-03-19 at 09:32 +0000, Jan Beulich wrote:
>> >> >>> On 19.03.14 at 10:27, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> >> > On Tue, 2014-03-18 at 20:58 -0400, Boris Ostrovsky wrote:
>> >> >> Currently only "real" cpuid leaves can be overwritten by users via
>> >> >> 'cpuid' option in the configuration file. This patch provides ability to
>> >> >> do the same for hypervisor leaves (but for now only 0x40000000 is 
> allowed).
>> >> >> 
>> >> >> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> >> >> ---
>> >> >>  tools/libxc/xc_cpuid_x86.c   |   71 
>> >> > ++++++++++++++++++++++++++++++++++++++++--
>> >> >>  xen/arch/x86/domain.c        |   19 +++++++++--
>> >> >>  xen/arch/x86/traps.c         |    3 ++
>> >> >>  xen/include/asm-x86/domain.h |    7 +++++
>> >> >>  4 files changed, 95 insertions(+), 5 deletions(-)
>> >> >> 
>> >> >> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
>> >> >> index bbbf9b8..5501d5b 100644
>> >> >> --- a/tools/libxc/xc_cpuid_x86.c
>> >> >> +++ b/tools/libxc/xc_cpuid_x86.c
>> >> >> @@ -33,6 +33,8 @@
>> >> >>  #define DEF_MAX_INTELEXT  0x80000008u
>> >> >>  #define DEF_MAX_AMDEXT    0x8000001cu
>> >> >>  
>> >> >> +#define IS_HYPERVISOR_LEAF(idx) (((idx) & 0xffff0000) == 0x40000000)
>> >> > 
>> >> > Not idx == 0x40000000?
>> >> > 
>> >> > Also as I think Jan said before if viridian support is enabled then the
>> >> > Xen leaves may be elsewhere (at 0x100 increments above that address
>> >> > IIRC).
>> >> 
>> >> But that's exactly why the low 16 bits should be masked off in
>> >> above comparison.
>> > 
>> > Is it 0x100 or 0x1000 increments? I thought it was 0x100, in which case
>> > shouldn't the mask be 0xfffff000?
>> 
>> It's 0x100 increments, but that doesn't relate to the mask to be
>> applied here - major groups appear to be using 64k increments
>> (0000 - basic, 4000 - hypervisor, 8000 - extended, 8086 -
>> Transmeta, C000 - VIA/Cyrix, and I guess there are others I
>> don't know about). I don't think I've seen this publicly/formally
>> documented so far.
> 
> OK, that makes sense from a major group perspective.
> 
> But I think the "first minor group" of hypervisor nodes at 0x40000000
> stops at 0x40010000, at least implicitly due to the existing code in e.g
> unmodified_drivers/linux-2.6/platform-pci/platform-pci.c and
> tools/misc/xen-detect.c. I don't think it is out of the question that we
> might want to put other stuff at e.g. 0x40020000 (or at least we should
> retain the option).

So that means you advocate for shrinking the number of significant
bits checked for. Question then is by how much - perhaps we should
then consider the whole range 40000000-7FFFFFFF as hypervisor
reserved?

Jan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file
  2014-03-19 11:44             ` Jan Beulich
@ 2014-03-19 12:02               ` Ian Campbell
  0 siblings, 0 replies; 18+ messages in thread
From: Ian Campbell @ 2014-03-19 12:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: keir, eddie.dong, stefano.stabellini, andrew.cooper3,
	ian.jackson, xen-devel, jun.nakajima, yang.z.zhang,
	Boris Ostrovsky

On Wed, 2014-03-19 at 11:44 +0000, Jan Beulich wrote:
> >>> On 19.03.14 at 11:39, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Wed, 2014-03-19 at 10:06 +0000, Jan Beulich wrote:
> >> >>> On 19.03.14 at 10:52, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> >> > On Wed, 2014-03-19 at 09:32 +0000, Jan Beulich wrote:
> >> >> >>> On 19.03.14 at 10:27, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> >> >> > On Tue, 2014-03-18 at 20:58 -0400, Boris Ostrovsky wrote:
> >> >> >> Currently only "real" cpuid leaves can be overwritten by users via
> >> >> >> 'cpuid' option in the configuration file. This patch provides ability to
> >> >> >> do the same for hypervisor leaves (but for now only 0x40000000 is 
> > allowed).
> >> >> >> 
> >> >> >> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> >> >> >> ---
> >> >> >>  tools/libxc/xc_cpuid_x86.c   |   71 
> >> >> > ++++++++++++++++++++++++++++++++++++++++--
> >> >> >>  xen/arch/x86/domain.c        |   19 +++++++++--
> >> >> >>  xen/arch/x86/traps.c         |    3 ++
> >> >> >>  xen/include/asm-x86/domain.h |    7 +++++
> >> >> >>  4 files changed, 95 insertions(+), 5 deletions(-)
> >> >> >> 
> >> >> >> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
> >> >> >> index bbbf9b8..5501d5b 100644
> >> >> >> --- a/tools/libxc/xc_cpuid_x86.c
> >> >> >> +++ b/tools/libxc/xc_cpuid_x86.c
> >> >> >> @@ -33,6 +33,8 @@
> >> >> >>  #define DEF_MAX_INTELEXT  0x80000008u
> >> >> >>  #define DEF_MAX_AMDEXT    0x8000001cu
> >> >> >>  
> >> >> >> +#define IS_HYPERVISOR_LEAF(idx) (((idx) & 0xffff0000) == 0x40000000)
> >> >> > 
> >> >> > Not idx == 0x40000000?
> >> >> > 
> >> >> > Also as I think Jan said before if viridian support is enabled then the
> >> >> > Xen leaves may be elsewhere (at 0x100 increments above that address
> >> >> > IIRC).
> >> >> 
> >> >> But that's exactly why the low 16 bits should be masked off in
> >> >> above comparison.
> >> > 
> >> > Is it 0x100 or 0x1000 increments? I thought it was 0x100, in which case
> >> > shouldn't the mask be 0xfffff000?
> >> 
> >> It's 0x100 increments, but that doesn't relate to the mask to be
> >> applied here - major groups appear to be using 64k increments
> >> (0000 - basic, 4000 - hypervisor, 8000 - extended, 8086 -
> >> Transmeta, C000 - VIA/Cyrix, and I guess there are others I
> >> don't know about). I don't think I've seen this publicly/formally
> >> documented so far.
> > 
> > OK, that makes sense from a major group perspective.
> > 
> > But I think the "first minor group" of hypervisor nodes at 0x40000000
> > stops at 0x40010000, at least implicitly due to the existing code in e.g
> > unmodified_drivers/linux-2.6/platform-pci/platform-pci.c and
> > tools/misc/xen-detect.c. I don't think it is out of the question that we
> > might want to put other stuff at e.g. 0x40020000 (or at least we should
> > retain the option).

I think I confused myself -- thinking I meant 0x40002000 here when the
existing Xen leaves are to 0x40010000 not to 0x40001000 as I thought
despite what I managed to write and then misinterpreted what that meant
anyway. Sigh.

> So that means you advocate for shrinking the number of significant
> bits checked for. Question then is by how much - perhaps we should
> then consider the whole range 40000000-7FFFFFFF as hypervisor
> reserved?

Regardless of my confusion, I don't think this is necessarily a bad
idea, although it is somewhat subject to the whims of the h/w designs,
meaning that in the end we should cross this bridge when we come to it
rather than now.

Sorry for the noise. Needed more coffee obviously.

Ian.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file
  2014-03-19  9:29     ` Ian Campbell
@ 2014-03-19 14:26       ` Boris Ostrovsky
  0 siblings, 0 replies; 18+ messages in thread
From: Boris Ostrovsky @ 2014-03-19 14:26 UTC (permalink / raw)
  To: Ian Campbell
  Cc: keir, eddie.dong, stefano.stabellini, andrew.cooper3,
	ian.jackson, xen-devel, Jan Beulich, jun.nakajima, yang.z.zhang

On 03/19/2014 05:29 AM, Ian Campbell wrote:
> On Wed, 2014-03-19 at 09:26 +0000, Jan Beulich wrote:
>>>>> On 19.03.14 at 01:58, Boris Ostrovsky <boris.ostrovsky@oracle.com> wrote:
>>> Currently only "real" cpuid leaves can be overwritten by users via
>>> 'cpuid' option in the configuration file. This patch provides ability to
>>> do the same for hypervisor leaves (but for now only 0x40000000 is allowed).
>> I'm still not really happy with the approach:
>>
>>> --- a/xen/arch/x86/traps.c
>>> +++ b/xen/arch/x86/traps.c
>>> @@ -690,6 +690,9 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
>>>       if ( idx > limit )
>>>           return 0;
>>>   
>>> +    if ( domain_cpuid_exists(d, base + idx, sub_idx, eax, ebx, ecx, edx) )
>>> +        return 1;
>> You basically rely here on the tools having fed back the values the
>> hypervisor returned to them. What if this gets tweaked at some
>> point based on some other domain characteristics? I'd really like to
>> see the hypervisor, and only the hypervisor, being in charge of
>> returning the hypervisor leaf values. The sole restriction the tools
>> may place ought to be on the maximum leaf.
> That does sound like it would simplify things.


I'll move the enforcement of cpuid values to cpuid_hypervisor_leaves() 
then. It will also address (to some extent) the comment below.

-boris

>> Sadly I didn't see anyone else commenting on this model of yours,
>> so I can't judge whether I'm the only one having reservations
>> against the model you have been proposing so far.
> Well, the libxc side is certainly very complicated if all which is
> needed is the ability to limit the maximum hypervisor leaf which is
> exposed. That ought to be a dozen lines max across tools and hypervisor
> I would have said.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file
  2014-03-19  9:27   ` Ian Campbell
  2014-03-19  9:32     ` Jan Beulich
@ 2014-03-19 14:41     ` Boris Ostrovsky
  2014-03-20  9:25       ` Ian Campbell
  1 sibling, 1 reply; 18+ messages in thread
From: Boris Ostrovsky @ 2014-03-19 14:41 UTC (permalink / raw)
  To: Ian Campbell
  Cc: keir, jbeulich, stefano.stabellini, eddie.dong, ian.jackson,
	xen-devel, jun.nakajima, andrew.cooper3, yang.z.zhang

On 03/19/2014 05:27 AM, Ian Campbell wrote:
> On Tue, 2014-03-18 at 20:58 -0400, Boris Ostrovsky wrote:
>> Currently only "real" cpuid leaves can be overwritten by users via
>> 'cpuid' option in the configuration file. This patch provides ability to
>> do the same for hypervisor leaves (but for now only 0x40000000 is allowed).
>>
>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> ---
>>   tools/libxc/xc_cpuid_x86.c   |   71 ++++++++++++++++++++++++++++++++++++++++--
>>   xen/arch/x86/domain.c        |   19 +++++++++--
>>   xen/arch/x86/traps.c         |    3 ++
>>   xen/include/asm-x86/domain.h |    7 +++++
>>   4 files changed, 95 insertions(+), 5 deletions(-)
>>
>> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
>> index bbbf9b8..5501d5b 100644
>> --- a/tools/libxc/xc_cpuid_x86.c
>> +++ b/tools/libxc/xc_cpuid_x86.c
>> @@ -33,6 +33,8 @@
>>   #define DEF_MAX_INTELEXT  0x80000008u
>>   #define DEF_MAX_AMDEXT    0x8000001cu
>>   
>> +#define IS_HYPERVISOR_LEAF(idx) (((idx) & 0xffff0000) == 0x40000000)
> Not idx == 0x40000000?
>
> Also as I think Jan said before if viridian support is enabled then the
> Xen leaves may be elsewhere (at 0x100 increments above that address
> IIRC).
>
>> +
>>   static int hypervisor_is_64bit(xc_interface *xch)
>>   {
>>       xen_capabilities_info_t xen_caps = "";
>> @@ -43,22 +45,31 @@ static int hypervisor_is_64bit(xc_interface *xch)
>>   static void cpuid(const unsigned int *input, unsigned int *regs)
>>   {
>>       unsigned int count = (input[1] == XEN_CPUID_INPUT_UNUSED) ? 0 : input[1];
>> +    uint8_t is_hyp = IS_HYPERVISOR_LEAF(input[0]);
>>   #ifdef __i386__
>>       /* Use the stack to avoid reg constraint failures with some gcc flags */
>>       asm (
>>           "push %%ebx; push %%edx\n\t"
>> +        "testb $0xff,%5\n\t"
>> +        "jz .Lcpuid%=\n\t"
>> +        XEN_EMULATE_PREFIX
>> +        ".Lcpuid%=:\n\t"
>>           "cpuid\n\t"
>>           "mov %%ebx,4(%4)\n\t"
>>           "mov %%edx,12(%4)\n\t"
>>           "pop %%edx; pop %%ebx\n\t"
>>           : "=a" (regs[0]), "=c" (regs[2])
>> -        : "0" (input[0]), "1" (count), "S" (regs)
>> +        : "0" (input[0]), "1" (count), "S" (regs), "q" (is_hyp)
> I think this would be clearer refactored into make_real_cpuid() and
> make_pv_cpuid() functions.

Would a comment explaining why we do it this way be sufficient or do you 
really want to split this into two routines? (And I assume you meant 
make_hv_cpuid, not make_pv_cpuid.)

>
> It also seems strange to use emulated for a subset of leafs, although I
> understand why.
>
> How does this play out in e.g. a PVH toolstack domain where the even
> "real" cpuid might be faked?

It shouldn't matter what the guest it, the hypervisor leaves are 
guest-independent.


>
> Perhaps we should have a hypercall to retrieve the complete set of real
> h/w, levelled h/w, pv, emulated etc values for a given leaf?

That's what I was thinking about (except for leveled values) when I 
implemented sysctl in the early version of this series but then Andrew 
pointed out that for what I need prefixed cpuid was sufficient, so I 
went that route.


>
>> @@ -726,6 +740,57 @@ int xc_cpuid_check(
>>       return rc;
>>   }
>>   
>> +static void xc_cpuid_revert(unsigned int *reg, unsigned int polreg,
>> +    unsigned int start, unsigned int end, char *config)
> [...]
>> +static void xc_cpuid_constrain(const unsigned int *input, unsigned int *regs,
>> +    unsigned int *polregs, char **config_transformed)
> These complicated functions most certainly need some sort of explanation
> about what they are trying to do.

They will be gone since the functionality will be implemented in the 
hypervisor.

-boris

>
> Why do the hypervisor leaves need to be manipulated by new functions, do
> the existing facilities used for all the others not work?
>
>> +{
>> +    unsigned int i;
>> +
>> +    if ( IS_HYPERVISOR_LEAF(input[0]) )
>> +    {
>> +        switch ( input[0] & 0xff )
>> +        {
>> +            case 0:
>> +                if ( (regs[0] & 0xffffff00) != (polregs[0] & 0xffffff00) )
>> +                        xc_cpuid_revert(&regs[0], polregs[0], 8, 31,
>> +                                        config_transformed[0]);
>> +                if ( (regs[0] & 0xff) < 2 )
> Lots of magic numbers with varying number's of f's. Please #define them.
>
>> +                        xc_cpuid_revert(&regs[0], polregs[0], 0, 7,
>> +                                        config_transformed[0]);
>> +                for (i = 1; i < 4; i++)
>> +                    if ( regs[i] != polregs[i] )
>> +                        xc_cpuid_revert(&regs[i], polregs[i], 0, 31,
>> +                                        config_transformed[i]);
>> +                break;
>> +
>> +            default:
>> +                for (i = 0; i < 4; i++)
>> +                    xc_cpuid_revert(&regs[i], polregs[i], 0, 31,
>> +                                    config_transformed[i]);
>> +                break;
>> +        }
>> +    }
>> +}
>> +
>>   /*
>>    * Configure a single input with the informatiom from config.
>>    *

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file
  2014-03-19 14:41     ` Boris Ostrovsky
@ 2014-03-20  9:25       ` Ian Campbell
  2014-03-20 13:50         ` Boris Ostrovsky
  0 siblings, 1 reply; 18+ messages in thread
From: Ian Campbell @ 2014-03-20  9:25 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: keir, jbeulich, stefano.stabellini, eddie.dong, ian.jackson,
	xen-devel, jun.nakajima, andrew.cooper3, yang.z.zhang

On Wed, 2014-03-19 at 10:41 -0400, Boris Ostrovsky wrote:
> On 03/19/2014 05:27 AM, Ian Campbell wrote:
> > On Tue, 2014-03-18 at 20:58 -0400, Boris Ostrovsky wrote:
> >> Currently only "real" cpuid leaves can be overwritten by users via
> >> 'cpuid' option in the configuration file. This patch provides ability to
> >> do the same for hypervisor leaves (but for now only 0x40000000 is allowed).
> >>
> >> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> >> ---
> >>   tools/libxc/xc_cpuid_x86.c   |   71 ++++++++++++++++++++++++++++++++++++++++--
> >>   xen/arch/x86/domain.c        |   19 +++++++++--
> >>   xen/arch/x86/traps.c         |    3 ++
> >>   xen/include/asm-x86/domain.h |    7 +++++
> >>   4 files changed, 95 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
> >> index bbbf9b8..5501d5b 100644
> >> --- a/tools/libxc/xc_cpuid_x86.c
> >> +++ b/tools/libxc/xc_cpuid_x86.c
> >> @@ -33,6 +33,8 @@
> >>   #define DEF_MAX_INTELEXT  0x80000008u
> >>   #define DEF_MAX_AMDEXT    0x8000001cu
> >>   
> >> +#define IS_HYPERVISOR_LEAF(idx) (((idx) & 0xffff0000) == 0x40000000)
> > Not idx == 0x40000000?
> >
> > Also as I think Jan said before if viridian support is enabled then the
> > Xen leaves may be elsewhere (at 0x100 increments above that address
> > IIRC).
> >
> >> +
> >>   static int hypervisor_is_64bit(xc_interface *xch)
> >>   {
> >>       xen_capabilities_info_t xen_caps = "";
> >> @@ -43,22 +45,31 @@ static int hypervisor_is_64bit(xc_interface *xch)
> >>   static void cpuid(const unsigned int *input, unsigned int *regs)
> >>   {
> >>       unsigned int count = (input[1] == XEN_CPUID_INPUT_UNUSED) ? 0 : input[1];
> >> +    uint8_t is_hyp = IS_HYPERVISOR_LEAF(input[0]);
> >>   #ifdef __i386__
> >>       /* Use the stack to avoid reg constraint failures with some gcc flags */
> >>       asm (
> >>           "push %%ebx; push %%edx\n\t"
> >> +        "testb $0xff,%5\n\t"
> >> +        "jz .Lcpuid%=\n\t"
> >> +        XEN_EMULATE_PREFIX
> >> +        ".Lcpuid%=:\n\t"
> >>           "cpuid\n\t"
> >>           "mov %%ebx,4(%4)\n\t"
> >>           "mov %%edx,12(%4)\n\t"
> >>           "pop %%edx; pop %%ebx\n\t"
> >>           : "=a" (regs[0]), "=c" (regs[2])
> >> -        : "0" (input[0]), "1" (count), "S" (regs)
> >> +        : "0" (input[0]), "1" (count), "S" (regs), "q" (is_hyp)
> > I think this would be clearer refactored into make_real_cpuid() and
> > make_pv_cpuid() functions.
> 
> Would a comment explaining why we do it this way be sufficient or do you 
> really want to split this into two routines?

I think splitting would be clearer, just by virtue of being able to give
the functions comprehensible names.

>  (And I assume you meant 
> make_hv_cpuid, not make_pv_cpuid.)

I meant pv -- XEN_EMUALTE_PREFIX+cpuid instr is a "pv cpuid", isn't it?
It's not clear what "hypervisor cpuid" would be -- is it the cpuid which
the hypervisor sees (i.e. real) or is it some fabrication, in which case
how does it depend on the context (i.e. with which guest is it with
respect too).

> > It also seems strange to use emulated for a subset of leafs, although I
> > understand why.
> >
> > How does this play out in e.g. a PVH toolstack domain where the even
> > "real" cpuid might be faked?
> 
> It shouldn't matter what the guest it, the hypervisor leaves are 
> guest-independent.

Except when you've change them for a guest using the functionality you
are adding here, surely?

> > Perhaps we should have a hypercall to retrieve the complete set of real
> > h/w, levelled h/w, pv, emulated etc values for a given leaf?
> 
> That's what I was thinking about (except for leveled values) when I 
> implemented sysctl in the early version of this series but then Andrew 
> pointed out that for what I need prefixed cpuid was sufficient, so I 
> went that route.

Hrm, it may be sufficient, but is it a good interface?

> >> @@ -726,6 +740,57 @@ int xc_cpuid_check(
> >>       return rc;
> >>   }
> >>   
> >> +static void xc_cpuid_revert(unsigned int *reg, unsigned int polreg,
> >> +    unsigned int start, unsigned int end, char *config)
> > [...]
> >> +static void xc_cpuid_constrain(const unsigned int *input, unsigned int *regs,
> >> +    unsigned int *polregs, char **config_transformed)
> > These complicated functions most certainly need some sort of explanation
> > about what they are trying to do.
> 
> They will be gone since the functionality will be implemented in the 
> hypervisor.

OK good, although if they reappear in hypervisor context I suppose my
comments will still stand.

Ian.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file
  2014-03-20  9:25       ` Ian Campbell
@ 2014-03-20 13:50         ` Boris Ostrovsky
  0 siblings, 0 replies; 18+ messages in thread
From: Boris Ostrovsky @ 2014-03-20 13:50 UTC (permalink / raw)
  To: Ian Campbell
  Cc: keir, jbeulich, stefano.stabellini, eddie.dong, ian.jackson,
	xen-devel, jun.nakajima, andrew.cooper3, yang.z.zhang

On 03/20/2014 05:25 AM, Ian Campbell wrote:
> On Wed, 2014-03-19 at 10:41 -0400, Boris Ostrovsky wrote:
>> On 03/19/2014 05:27 AM, Ian Campbell wrote:
>>> On Tue, 2014-03-18 at 20:58 -0400, Boris Ostrovsky wrote:
>>>> Currently only "real" cpuid leaves can be overwritten by users via
>>>> 'cpuid' option in the configuration file. This patch provides ability to
>>>> do the same for hypervisor leaves (but for now only 0x40000000 is allowed).
>>>>
>>>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>> ---
>>>>    tools/libxc/xc_cpuid_x86.c   |   71 ++++++++++++++++++++++++++++++++++++++++--
>>>>    xen/arch/x86/domain.c        |   19 +++++++++--
>>>>    xen/arch/x86/traps.c         |    3 ++
>>>>    xen/include/asm-x86/domain.h |    7 +++++
>>>>    4 files changed, 95 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
>>>> index bbbf9b8..5501d5b 100644
>>>> --- a/tools/libxc/xc_cpuid_x86.c
>>>> +++ b/tools/libxc/xc_cpuid_x86.c
>>>> @@ -33,6 +33,8 @@
>>>>    #define DEF_MAX_INTELEXT  0x80000008u
>>>>    #define DEF_MAX_AMDEXT    0x8000001cu
>>>>    
>>>> +#define IS_HYPERVISOR_LEAF(idx) (((idx) & 0xffff0000) == 0x40000000)
>>> Not idx == 0x40000000?
>>>
>>> Also as I think Jan said before if viridian support is enabled then the
>>> Xen leaves may be elsewhere (at 0x100 increments above that address
>>> IIRC).
>>>
>>>> +
>>>>    static int hypervisor_is_64bit(xc_interface *xch)
>>>>    {
>>>>        xen_capabilities_info_t xen_caps = "";
>>>> @@ -43,22 +45,31 @@ static int hypervisor_is_64bit(xc_interface *xch)
>>>>    static void cpuid(const unsigned int *input, unsigned int *regs)
>>>>    {
>>>>        unsigned int count = (input[1] == XEN_CPUID_INPUT_UNUSED) ? 0 : input[1];
>>>> +    uint8_t is_hyp = IS_HYPERVISOR_LEAF(input[0]);
>>>>    #ifdef __i386__
>>>>        /* Use the stack to avoid reg constraint failures with some gcc flags */
>>>>        asm (
>>>>            "push %%ebx; push %%edx\n\t"
>>>> +        "testb $0xff,%5\n\t"
>>>> +        "jz .Lcpuid%=\n\t"
>>>> +        XEN_EMULATE_PREFIX
>>>> +        ".Lcpuid%=:\n\t"
>>>>            "cpuid\n\t"
>>>>            "mov %%ebx,4(%4)\n\t"
>>>>            "mov %%edx,12(%4)\n\t"
>>>>            "pop %%edx; pop %%ebx\n\t"
>>>>            : "=a" (regs[0]), "=c" (regs[2])
>>>> -        : "0" (input[0]), "1" (count), "S" (regs)
>>>> +        : "0" (input[0]), "1" (count), "S" (regs), "q" (is_hyp)
>>> I think this would be clearer refactored into make_real_cpuid() and
>>> make_pv_cpuid() functions.
>> Would a comment explaining why we do it this way be sufficient or do you
>> really want to split this into two routines?
> I think splitting would be clearer, just by virtue of being able to give
> the functions comprehensible names.

Actually, now that I hear both of you arguing for only being able to 
change the max number of hypervisor leaves I think I'll drop pretty much 
all changes to libxc and pass user's values directly to the hypervisor. 
And the check there will again be specific for 0x40000x00.eax[7:0], the 
rest will be Xen's default and user's request will be ignored.

I was aiming for a more generic support for changes to hypervisor leaves 
but since there is no interest in this (as there is no real need right 
now) we can come back to this when the need arises.

>
>>   (And I assume you meant
>> make_hv_cpuid, not make_pv_cpuid.)
> I meant pv -- XEN_EMUALTE_PREFIX+cpuid instr is a "pv cpuid", isn't it?
> It's not clear what "hypervisor cpuid" would be -- is it the cpuid which
> the hypervisor sees (i.e. real) or is it some fabrication, in which case
> how does it depend on the context (i.e. with which guest is it with
> respect too).


I was thinking this would be used for hypervisor leaves only but you are 
right, this is a pv cpuid.


>
>>> It also seems strange to use emulated for a subset of leafs, although I
>>> understand why.
>>>
>>> How does this play out in e.g. a PVH toolstack domain where the even
>>> "real" cpuid might be faked?
>> It shouldn't matter what the guest it, the hypervisor leaves are
>> guest-independent.
> Except when you've change them for a guest using the functionality you
> are adding here, surely?

Yes, what I meant was that changing hypervisor leaves will work for all 
types of guests.

I think.

>
>>> Perhaps we should have a hypercall to retrieve the complete set of real
>>> h/w, levelled h/w, pv, emulated etc values for a given leaf?
>> That's what I was thinking about (except for leveled values) when I
>> implemented sysctl in the early version of this series but then Andrew
>> pointed out that for what I need prefixed cpuid was sufficient, so I
>> went that route.
> Hrm, it may be sufficient, but is it a good interface?

Not as it was written --- it was for a single leaf and not for the whole 
set in one call. But something along those lines.

>
>>>> @@ -726,6 +740,57 @@ int xc_cpuid_check(
>>>>        return rc;
>>>>    }
>>>>    
>>>> +static void xc_cpuid_revert(unsigned int *reg, unsigned int polreg,
>>>> +    unsigned int start, unsigned int end, char *config)
>>> [...]
>>>> +static void xc_cpuid_constrain(const unsigned int *input, unsigned int *regs,
>>>> +    unsigned int *polregs, char **config_transformed)
>>> These complicated functions most certainly need some sort of explanation
>>> about what they are trying to do.
>> They will be gone since the functionality will be implemented in the
>> hypervisor.
> OK good, although if they reappear in hypervisor context I suppose my
> comments will still stand.

They won't look like they were. As I said above, it will be a simple, 
targeted test.

Thanks.
-boris

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2014-03-20 13:50 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-19  0:58 [PATCH v5 0/4] Expose HW APIC virtualization support to HVM guests Boris Ostrovsky
2014-03-19  0:58 ` [PATCH v5 1/4] xen/libxc: Allow changes to hypervisor CPUID leaf from config file Boris Ostrovsky
2014-03-19  9:26   ` Jan Beulich
2014-03-19  9:29     ` Ian Campbell
2014-03-19 14:26       ` Boris Ostrovsky
2014-03-19  9:27   ` Ian Campbell
2014-03-19  9:32     ` Jan Beulich
2014-03-19  9:52       ` Ian Campbell
2014-03-19 10:06         ` Jan Beulich
2014-03-19 10:39           ` Ian Campbell
2014-03-19 11:44             ` Jan Beulich
2014-03-19 12:02               ` Ian Campbell
2014-03-19 14:41     ` Boris Ostrovsky
2014-03-20  9:25       ` Ian Campbell
2014-03-20 13:50         ` Boris Ostrovsky
2014-03-19  0:58 ` [PATCH v5 2/4] x86/hvm: Revert 80ecb40362365ba77e68fc609de8bd3b7208ae19 Boris Ostrovsky
2014-03-19  0:58 ` [PATCH v5 3/4] x86/hvm: Add HVM-specific hypervisor CPUID leaf Boris Ostrovsky
2014-03-19  0:58 ` [PATCH v5 4/4] x86/hvm: Indicate avaliability of HW support of APIC virtualization to HVM guests Boris Ostrovsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.