All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/2] xen: Report and use hardware APIC virtualization capabilities
@ 2022-03-07 15:06 Jane Malalane
  2022-03-07 15:06 ` [PATCH v5 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86 Jane Malalane
  2022-03-07 15:06 ` [PATCH v5 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC Jane Malalane
  0 siblings, 2 replies; 27+ messages in thread
From: Jane Malalane @ 2022-03-07 15:06 UTC (permalink / raw)
  To: Xen-devel; +Cc: Jane Malalane

Jane Malalane (2):
  xen+tools: Report Interrupt Controller Virtualization capabilities on
    x86
  x86/xen: Allow per-domain usage of hardware virtualized APIC

 docs/man/xl.cfg.5.pod.in                | 19 ++++++++++++++++
 docs/man/xl.conf.5.pod.in               | 12 ++++++++++
 tools/golang/xenlight/helpers.gen.go    | 16 ++++++++++++++
 tools/golang/xenlight/types.gen.go      |  4 ++++
 tools/include/libxl.h                   | 14 ++++++++++++
 tools/libs/light/libxl.c                |  3 +++
 tools/libs/light/libxl_arch.h           |  9 ++++++--
 tools/libs/light/libxl_arm.c            | 12 ++++++++--
 tools/libs/light/libxl_create.c         | 22 +++++++++++--------
 tools/libs/light/libxl_types.idl        |  4 ++++
 tools/libs/light/libxl_x86.c            | 39 +++++++++++++++++++++++++++++++--
 tools/ocaml/libs/xc/xenctrl.ml          |  7 ++++++
 tools/ocaml/libs/xc/xenctrl.mli         |  7 ++++++
 tools/ocaml/libs/xc/xenctrl_stubs.c     | 16 ++++++++++----
 tools/xl/xl.c                           |  8 +++++++
 tools/xl/xl.h                           |  2 ++
 tools/xl/xl_info.c                      |  6 +++--
 tools/xl/xl_parse.c                     | 16 ++++++++++++++
 xen/arch/x86/domain.c                   | 28 ++++++++++++++++++++++-
 xen/arch/x86/hvm/vmx/vmcs.c             | 13 +++++++++++
 xen/arch/x86/hvm/vmx/vmx.c              | 13 ++++-------
 xen/arch/x86/include/asm/domain.h       |  3 +++
 xen/arch/x86/include/asm/hvm/domain.h   |  6 +++++
 xen/arch/x86/include/asm/hvm/vmx/vmcs.h |  3 +++
 xen/arch/x86/sysctl.c                   |  7 ++++++
 xen/arch/x86/traps.c                    |  9 ++++----
 xen/include/public/arch-x86/xen.h       |  2 ++
 xen/include/public/sysctl.h             | 11 +++++++++-
 28 files changed, 275 insertions(+), 36 deletions(-)

-- 
2.11.0



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v5 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86
  2022-03-07 15:06 [PATCH v5 0/2] xen: Report and use hardware APIC virtualization capabilities Jane Malalane
@ 2022-03-07 15:06 ` Jane Malalane
  2022-03-08 10:39   ` Roger Pau Monné
                     ` (2 more replies)
  2022-03-07 15:06 ` [PATCH v5 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC Jane Malalane
  1 sibling, 3 replies; 27+ messages in thread
From: Jane Malalane @ 2022-03-07 15:06 UTC (permalink / raw)
  To: Xen-devel
  Cc: Jane Malalane, Wei Liu, Anthony PERARD, Juergen Gross,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Volodymyr Babchuk, Bertrand Marquis,
	Jun Nakajima, Kevin Tian, Roger Pau Monné

Add XEN_SYSCTL_PHYSCAP_ARCH_ASSISTED_xapic and
XEN_SYSCTL_PHYSCAP_ARCH_ASSISTED_x2apic to report accelerated xapic
and x2apic, on x86 hardware.
No such features are currently implemented on AMD hardware.

HW assisted xAPIC virtualization will be reported if HW, at the
minimum, supports virtualize_apic_accesses as this feature alone means
that an access to the APIC page will cause an APIC-access VM exit. An
APIC-access VM exit provides a VMM with information about the access
causing the VM exit, unlike a regular EPT fault, thus simplifying some
internal handling.

HW assisted x2APIC virtualization will be reported if HW supports
virtualize_x2apic_mode and, at least, either apic_reg_virt or
virtual_intr_delivery. This is due to apic_reg_virt and
virtual_intr_delivery preventing a VM exit from occuring or at least
replacing a regular EPT fault VM-exit with an APIC-access VM-exit on
read and write APIC accesses, respectively. This also means that
sysctl follows the conditionals in vmx_vlapic_msr_changed().

For that purpose, also add an arch-specific "capabilities" parameter
to struct xen_sysctl_physinfo.

Note that this interface is intended to be compatible with AMD so that
AVIC support can be introduced in a future patch. Unlike Intel that
has multiple controls for APIC Virtualization, AMD has one global
'AVIC Enable' control bit, so fine-graining of APIC virtualization
control cannot be done on a common interface.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jane Malalane <jane.malalane@citrix.com>
---
CC: Wei Liu <wl@xen.org>
CC: Anthony PERARD <anthony.perard@citrix.com>
CC: Juergen Gross <jgross@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: George Dunlap <george.dunlap@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Julien Grall <julien@xen.org>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Bertrand Marquis <bertrand.marquis@arm.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: "Roger Pau Monné" <roger.pau@citrix.com>

v5:
* Have assisted_xapic_available solely depend on
  cpu_has_vmx_virtualize_apic_accesses and assisted_x2apic_available
  depend on cpu_has_vmx_virtualize_x2apic_mode and
  cpu_has_vmx_apic_reg_virt OR cpu_has_vmx_virtual_intr_delivery.

v4:
 * Fallback to the original v2/v1 conditions for setting
   assisted_xapic_available and assisted_x2apic_available so that in
   the future APIC virtualization can be exposed on AMD hardware
   since fine-graining of "AVIC" is not supported, i.e., AMD solely
   uses "AVIC Enable". This also means that sysctl mimics what's
   exposed in CPUID.

v3:
 * Define XEN_SYSCTL_PHYSCAP_ARCH_MAX for ABI checking and actually
   set "arch_capbilities", via a call to c_bitmap_to_ocaml_list()
 * Have assisted_x2apic_available only depend on
   cpu_has_vmx_virtualize_x2apic_mode

v2:
 * Use one macro LIBXL_HAVE_PHYSINFO_ASSISTED_APIC instead of two
 * Pass xcpyshinfo as a pointer in libxl__arch_get_physinfo
 * Set assisted_x{2}apic_available to be conditional upon "bsp" and
   annotate it with __ro_after_init
 * Change XEN_SYSCTL_PHYSCAP_ARCH_ASSISTED_X{2}APIC to
   _X86_ASSISTED_X{2}APIC
 * Keep XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X{2}APIC contained within
   sysctl.h
 * Fix padding introduced in struct xen_sysctl_physinfo and bump
   XEN_SYSCTL_INTERFACE_VERSION
---
 tools/golang/xenlight/helpers.gen.go |  4 ++++
 tools/golang/xenlight/types.gen.go   |  2 ++
 tools/include/libxl.h                |  7 +++++++
 tools/libs/light/libxl.c             |  3 +++
 tools/libs/light/libxl_arch.h        |  4 ++++
 tools/libs/light/libxl_arm.c         |  5 +++++
 tools/libs/light/libxl_types.idl     |  2 ++
 tools/libs/light/libxl_x86.c         | 11 +++++++++++
 tools/ocaml/libs/xc/xenctrl.ml       |  5 +++++
 tools/ocaml/libs/xc/xenctrl.mli      |  5 +++++
 tools/ocaml/libs/xc/xenctrl_stubs.c  | 14 +++++++++++---
 tools/xl/xl_info.c                   |  6 ++++--
 xen/arch/x86/hvm/vmx/vmcs.c          |  9 +++++++++
 xen/arch/x86/include/asm/domain.h    |  3 +++
 xen/arch/x86/sysctl.c                |  7 +++++++
 xen/include/public/sysctl.h          | 11 ++++++++++-
 16 files changed, 92 insertions(+), 6 deletions(-)

diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
index b746ff1081..dd4e6c9f14 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -3373,6 +3373,8 @@ x.CapVmtrace = bool(xc.cap_vmtrace)
 x.CapVpmu = bool(xc.cap_vpmu)
 x.CapGnttabV1 = bool(xc.cap_gnttab_v1)
 x.CapGnttabV2 = bool(xc.cap_gnttab_v2)
+x.CapAssistedXapic = bool(xc.cap_assisted_xapic)
+x.CapAssistedX2Apic = bool(xc.cap_assisted_x2apic)
 
  return nil}
 
@@ -3407,6 +3409,8 @@ xc.cap_vmtrace = C.bool(x.CapVmtrace)
 xc.cap_vpmu = C.bool(x.CapVpmu)
 xc.cap_gnttab_v1 = C.bool(x.CapGnttabV1)
 xc.cap_gnttab_v2 = C.bool(x.CapGnttabV2)
+xc.cap_assisted_xapic = C.bool(x.CapAssistedXapic)
+xc.cap_assisted_x2apic = C.bool(x.CapAssistedX2Apic)
 
  return nil
  }
diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
index b1e84d5258..87be46c745 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -1014,6 +1014,8 @@ CapVmtrace bool
 CapVpmu bool
 CapGnttabV1 bool
 CapGnttabV2 bool
+CapAssistedXapic bool
+CapAssistedX2Apic bool
 }
 
 type Connectorinfo struct {
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 51a9b6cfac..94e6355822 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -528,6 +528,13 @@
 #define LIBXL_HAVE_MAX_GRANT_VERSION 1
 
 /*
+ * LIBXL_HAVE_PHYSINFO_ASSISTED_APIC indicates that libxl_physinfo has
+ * cap_assisted_xapic and cap_assisted_x2apic fields, which indicates
+ * the availability of x{2}APIC hardware assisted virtualization.
+ */
+#define LIBXL_HAVE_PHYSINFO_ASSISTED_APIC 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
diff --git a/tools/libs/light/libxl.c b/tools/libs/light/libxl.c
index a0bf7d186f..6d699951e2 100644
--- a/tools/libs/light/libxl.c
+++ b/tools/libs/light/libxl.c
@@ -15,6 +15,7 @@
 #include "libxl_osdeps.h"
 
 #include "libxl_internal.h"
+#include "libxl_arch.h"
 
 int libxl_ctx_alloc(libxl_ctx **pctx, int version,
                     unsigned flags, xentoollog_logger * lg)
@@ -410,6 +411,8 @@ int libxl_get_physinfo(libxl_ctx *ctx, libxl_physinfo *physinfo)
     physinfo->cap_gnttab_v2 =
         !!(xcphysinfo.capabilities & XEN_SYSCTL_PHYSCAP_gnttab_v2);
 
+    libxl__arch_get_physinfo(physinfo, &xcphysinfo);
+
     GC_FREE;
     return 0;
 }
diff --git a/tools/libs/light/libxl_arch.h b/tools/libs/light/libxl_arch.h
index 1522ecb97f..207ceac6a1 100644
--- a/tools/libs/light/libxl_arch.h
+++ b/tools/libs/light/libxl_arch.h
@@ -86,6 +86,10 @@ int libxl__arch_extra_memory(libxl__gc *gc,
                              uint64_t *out);
 
 _hidden
+void libxl__arch_get_physinfo(libxl_physinfo *physinfo,
+                              const xc_physinfo_t *xcphysinfo);
+
+_hidden
 void libxl__arch_update_domain_config(libxl__gc *gc,
                                       libxl_domain_config *dst,
                                       const libxl_domain_config *src);
diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index eef1de0939..39fdca1b49 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -1431,6 +1431,11 @@ int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
     return rc;
 }
 
+void libxl__arch_get_physinfo(libxl_physinfo *physinfo,
+                              const xc_physinfo_t *xcphysinfo)
+{
+}
+
 void libxl__arch_update_domain_config(libxl__gc *gc,
                                       libxl_domain_config *dst,
                                       const libxl_domain_config *src)
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 2a42da2f7d..42ac6c357b 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -1068,6 +1068,8 @@ libxl_physinfo = Struct("physinfo", [
     ("cap_vpmu", bool),
     ("cap_gnttab_v1", bool),
     ("cap_gnttab_v2", bool),
+    ("cap_assisted_xapic", bool),
+    ("cap_assisted_x2apic", bool),
     ], dir=DIR_OUT)
 
 libxl_connectorinfo = Struct("connectorinfo", [
diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
index 1feadebb18..e0a06ecfe3 100644
--- a/tools/libs/light/libxl_x86.c
+++ b/tools/libs/light/libxl_x86.c
@@ -866,6 +866,17 @@ int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
     return rc;
 }
 
+void libxl__arch_get_physinfo(libxl_physinfo *physinfo,
+                              const xc_physinfo_t *xcphysinfo)
+{
+    physinfo->cap_assisted_xapic =
+        !!(xcphysinfo->arch_capabilities &
+           XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC);
+    physinfo->cap_assisted_x2apic =
+        !!(xcphysinfo->arch_capabilities &
+           XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC);
+}
+
 void libxl__arch_update_domain_config(libxl__gc *gc,
                                       libxl_domain_config *dst,
                                       const libxl_domain_config *src)
diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
index 7503031d8f..21783d3622 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -127,6 +127,10 @@ type physinfo_cap_flag =
 	| CAP_Gnttab_v1
 	| CAP_Gnttab_v2
 
+type physinfo_arch_cap_flag =
+	| CAP_X86_ASSISTED_XAPIC
+	| CAP_X86_ASSISTED_X2APIC
+
 type physinfo =
 {
 	threads_per_core : int;
@@ -139,6 +143,7 @@ type physinfo =
 	scrub_pages      : nativeint;
 	(* XXX hw_cap *)
 	capabilities     : physinfo_cap_flag list;
+	arch_capabilities : physinfo_arch_cap_flag list;
 	max_nr_cpus      : int;
 }
 
diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index d1d9c9247a..af6ba3d1a0 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -112,6 +112,10 @@ type physinfo_cap_flag =
   | CAP_Gnttab_v1
   | CAP_Gnttab_v2
 
+type physinfo_arch_cap_flag =
+  | CAP_X86_ASSISTED_XAPIC
+  | CAP_X86_ASSISTED_X2APIC
+
 type physinfo = {
   threads_per_core : int;
   cores_per_socket : int;
@@ -122,6 +126,7 @@ type physinfo = {
   free_pages       : nativeint;
   scrub_pages      : nativeint;
   capabilities     : physinfo_cap_flag list;
+  arch_capabilities : physinfo_arch_cap_flag list;
   max_nr_cpus      : int; (** compile-time max possible number of nr_cpus *)
 }
 type version = { major : int; minor : int; extra : string; }
diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
index 5b4fe72c8d..e0d49b18d2 100644
--- a/tools/ocaml/libs/xc/xenctrl_stubs.c
+++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
@@ -712,7 +712,7 @@ CAMLprim value stub_xc_send_debug_keys(value xch, value keys)
 CAMLprim value stub_xc_physinfo(value xch)
 {
 	CAMLparam1(xch);
-	CAMLlocal2(physinfo, cap_list);
+	CAMLlocal3(physinfo, cap_list, arch_cap_list);
 	xc_physinfo_t c_physinfo;
 	int r;
 
@@ -730,8 +730,15 @@ CAMLprim value stub_xc_physinfo(value xch)
 		/* ! physinfo_cap_flag CAP_ lc */
 		/* ! XEN_SYSCTL_PHYSCAP_ XEN_SYSCTL_PHYSCAP_MAX max */
 		(c_physinfo.capabilities);
+	/*
+	 * arch_capabilities: physinfo_arch_cap_flag list;
+	 */
+	arch_cap_list = c_bitmap_to_ocaml_list
+		/* ! physinfo_arch_cap_flag CAP_ none */
+		/* ! XEN_SYSCTL_PHYSCAP_ XEN_SYSCTL_PHYSCAP_X86_MAX max */
+		(c_physinfo.arch_capabilities);
 
-	physinfo = caml_alloc_tuple(10);
+	physinfo = caml_alloc_tuple(11);
 	Store_field(physinfo, 0, Val_int(c_physinfo.threads_per_core));
 	Store_field(physinfo, 1, Val_int(c_physinfo.cores_per_socket));
 	Store_field(physinfo, 2, Val_int(c_physinfo.nr_cpus));
@@ -741,7 +748,8 @@ CAMLprim value stub_xc_physinfo(value xch)
 	Store_field(physinfo, 6, caml_copy_nativeint(c_physinfo.free_pages));
 	Store_field(physinfo, 7, caml_copy_nativeint(c_physinfo.scrub_pages));
 	Store_field(physinfo, 8, cap_list);
-	Store_field(physinfo, 9, Val_int(c_physinfo.max_cpu_id + 1));
+	Store_field(physinfo, 9, arch_cap_list);
+	Store_field(physinfo, 10, Val_int(c_physinfo.max_cpu_id + 1));
 
 	CAMLreturn(physinfo);
 }
diff --git a/tools/xl/xl_info.c b/tools/xl/xl_info.c
index 712b7638b0..3205270754 100644
--- a/tools/xl/xl_info.c
+++ b/tools/xl/xl_info.c
@@ -210,7 +210,7 @@ static void output_physinfo(void)
          info.hw_cap[4], info.hw_cap[5], info.hw_cap[6], info.hw_cap[7]
         );
 
-    maybe_printf("virt_caps              :%s%s%s%s%s%s%s%s%s%s%s\n",
+    maybe_printf("virt_caps              :%s%s%s%s%s%s%s%s%s%s%s%s%s\n",
          info.cap_pv ? " pv" : "",
          info.cap_hvm ? " hvm" : "",
          info.cap_hvm && info.cap_hvm_directio ? " hvm_directio" : "",
@@ -221,7 +221,9 @@ static void output_physinfo(void)
          info.cap_vmtrace ? " vmtrace" : "",
          info.cap_vpmu ? " vpmu" : "",
          info.cap_gnttab_v1 ? " gnttab-v1" : "",
-         info.cap_gnttab_v2 ? " gnttab-v2" : ""
+         info.cap_gnttab_v2 ? " gnttab-v2" : "",
+         info.cap_assisted_xapic ? " assisted_xapic" : "",
+         info.cap_assisted_x2apic ? " assisted_x2apic" : ""
         );
 
     vinfo = libxl_get_version_info(ctx);
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index e1e1fa14e6..06831099ed 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -343,6 +343,15 @@ static int vmx_init_vmcs_config(bool bsp)
             MSR_IA32_VMX_PROCBASED_CTLS2, &mismatch);
     }
 
+    /* Check whether hardware supports accelerated xapic and x2apic. */
+    if ( bsp )
+    {
+        assisted_xapic_available = cpu_has_vmx_virtualize_apic_accesses;
+        assisted_x2apic_available = (cpu_has_vmx_virtualize_x2apic_mode &&
+                                     (cpu_has_vmx_apic_reg_virt ||
+                                      cpu_has_vmx_virtual_intr_delivery));
+    }
+
     /* The IA32_VMX_EPT_VPID_CAP MSR exists only when EPT or VPID available */
     if ( _vmx_secondary_exec_control & (SECONDARY_EXEC_ENABLE_EPT |
                                         SECONDARY_EXEC_ENABLE_VPID) )
diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h
index e62e109598..72431df26d 100644
--- a/xen/arch/x86/include/asm/domain.h
+++ b/xen/arch/x86/include/asm/domain.h
@@ -756,6 +756,9 @@ static inline void pv_inject_sw_interrupt(unsigned int vector)
                       : is_pv_32bit_domain(d) ? PV32_VM_ASSIST_MASK \
                                               : PV64_VM_ASSIST_MASK)
 
+extern bool assisted_xapic_available;
+extern bool assisted_x2apic_available;
+
 #endif /* __ASM_DOMAIN_H__ */
 
 /*
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index f82abc2488..ad95c86aef 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -69,6 +69,9 @@ struct l3_cache_info {
     unsigned long size;
 };
 
+bool __ro_after_init assisted_xapic_available;
+bool __ro_after_init assisted_x2apic_available;
+
 static void cf_check l3_cache_get(void *arg)
 {
     struct cpuid4_info info;
@@ -135,6 +138,10 @@ void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
         pi->capabilities |= XEN_SYSCTL_PHYSCAP_hap;
     if ( IS_ENABLED(CONFIG_SHADOW_PAGING) )
         pi->capabilities |= XEN_SYSCTL_PHYSCAP_shadow;
+    if ( assisted_xapic_available )
+        pi->arch_capabilities |= XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC;
+    if ( assisted_x2apic_available )
+        pi->arch_capabilities |= XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC;
 }
 
 long arch_do_sysctl(
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 55252e97f2..7fe05be0c9 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -35,7 +35,7 @@
 #include "domctl.h"
 #include "physdev.h"
 
-#define XEN_SYSCTL_INTERFACE_VERSION 0x00000014
+#define XEN_SYSCTL_INTERFACE_VERSION 0x00000015
 
 /*
  * Read console content from Xen buffer ring.
@@ -111,6 +111,13 @@ struct xen_sysctl_tbuf_op {
 /* Max XEN_SYSCTL_PHYSCAP_* constant.  Used for ABI checking. */
 #define XEN_SYSCTL_PHYSCAP_MAX XEN_SYSCTL_PHYSCAP_gnttab_v2
 
+/* The platform supports x{2}apic hardware assisted emulation. */
+#define XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC  (1u << 0)
+#define XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC (1u << 1)
+
+/* Max XEN_SYSCTL_PHYSCAP_X86__* constant. Used for ABI checking. */
+#define XEN_SYSCTL_PHYSCAP_X86_MAX XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC
+
 struct xen_sysctl_physinfo {
     uint32_t threads_per_core;
     uint32_t cores_per_socket;
@@ -120,6 +127,8 @@ struct xen_sysctl_physinfo {
     uint32_t max_node_id; /* Largest possible node ID on this host */
     uint32_t cpu_khz;
     uint32_t capabilities;/* XEN_SYSCTL_PHYSCAP_??? */
+    uint32_t arch_capabilities;/* XEN_SYSCTL_PHYSCAP_X86{ARM}_??? */
+    uint32_t pad;
     uint64_aligned_t total_pages;
     uint64_aligned_t free_pages;
     uint64_aligned_t scrub_pages;
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v5 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
  2022-03-07 15:06 [PATCH v5 0/2] xen: Report and use hardware APIC virtualization capabilities Jane Malalane
  2022-03-07 15:06 ` [PATCH v5 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86 Jane Malalane
@ 2022-03-07 15:06 ` Jane Malalane
  2022-03-08 11:38   ` Roger Pau Monné
  2022-03-08 17:36   ` [PATCH v6 " Jane Malalane
  1 sibling, 2 replies; 27+ messages in thread
From: Jane Malalane @ 2022-03-07 15:06 UTC (permalink / raw)
  To: Xen-devel
  Cc: Jane Malalane, Wei Liu, Anthony PERARD, Juergen Gross,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Christian Lindig, David Scott,
	Volodymyr Babchuk, Roger Pau Monné

Introduce a new per-domain creation x86 specific flag to
select whether hardware assisted virtualization should be used for
x{2}APIC.

A per-domain option is added to xl in order to select the usage of
x{2}APIC hardware assisted virtualization, as well as a global
configuration option.

Having all APIC interaction exit to Xen for emulation is slow and can
induce much overhead. Hardware can speed up x{2}APIC by decoding the
APIC access and providing a VM exit with a more specific exit reason
than a regular EPT fault or by altogether avoiding a VM exit.

On the other hand, being able to disable x{2}APIC hardware assisted
virtualization can be useful for testing and debugging purposes.

Note: vmx_install_vlapic_mapping doesn't require modifications
regardless of whether the guest has "Virtualize APIC accesses" enabled
or not, i.e., setting the APIC_ACCESS_ADDR VMCS field is fine so long
as virtualize_apic_accesses is supported by the CPU.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jane Malalane <jane.malalane@citrix.com>
---
CC: Wei Liu <wl@xen.org>
CC: Anthony PERARD <anthony.perard@citrix.com>
CC: Juergen Gross <jgross@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: George Dunlap <george.dunlap@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Julien Grall <julien@xen.org>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Christian Lindig <christian.lindig@citrix.com>
CC: David Scott <dave@recoil.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: "Roger Pau Monné" <roger.pau@citrix.com>

v5:
* Revert v4 changes in vmx_vlapic_msr_changed(), preserving the use of
  the has_assisted_x{2}apic macros
* Following changes in assisted_x{2}apic_available definitions in
  patch 1, retighten conditionals for setting
  XEN_HVM_CPUID_APIC_ACCESS_VIRT and XEN_HVM_CPUID_X2APIC_VIRT in
  cpuid_hypervisor_leaves()

v4:
 * Add has_assisted_x{2}apic macros and use them where appropriate
 * Replace CPU checks with per-domain assisted_x{2}apic control
   options in vmx_vlapic_msr_changed() and cpuid_hypervisor_leaves(),
   following edits to assisted_x{2}apic_available definitions in
   patch 1
   Note: new assisted_x{2}apic_available definitions make later
   cpu_has_vmx_apic_reg_virt and cpu_has_vmx_virtual_intr_delivery
   checks redundant in vmx_vlapic_msr_changed()

v3:
 * Change info in xl.cfg to better express reality and fix
   capitalization of x{2}apic
 * Move "physinfo" variable definition to the beggining of
   libxl__domain_build_info_setdefault()
 * Reposition brackets in if statement to match libxl coding style
 * Shorten logic in libxl__arch_domain_build_info_setdefault()
 * Correct dprintk message in arch_sanitise_domain_config()
 * Make appropriate changes in vmx_vlapic_msr_changed() and
   cpuid_hypervisor_leaves() for amended "assisted_x2apic" bit
 * Remove unneeded parantheses

v2:
 * Add a LIBXL_HAVE_ASSISTED_APIC macro
 * Pass xcpyshinfo as a pointer in libxl__arch_get_physinfo
 * Add a return statement in now "int"
   libxl__arch_domain_build_info_setdefault()
 * Preserve libxl__arch_domain_build_info_setdefault 's location in
   libxl_create.c
 * Correct x{2}apic default setting logic in
   libxl__arch_domain_prepare_config()
 * Correct logic for parsing assisted_x{2}apic host/guest options in
   xl_parse.c and initialize them to -1 in xl.c
 * Use guest options directly in vmx_vlapic_msr_changed
 * Fix indentation of bool assisted_x{2}apic in struct hvm_domain
 * Add a change in xenctrl_stubs.c to pass xenctrl ABI checks
---
 docs/man/xl.cfg.5.pod.in                | 19 +++++++++++++++++++
 docs/man/xl.conf.5.pod.in               | 12 ++++++++++++
 tools/golang/xenlight/helpers.gen.go    | 12 ++++++++++++
 tools/golang/xenlight/types.gen.go      |  2 ++
 tools/include/libxl.h                   |  7 +++++++
 tools/libs/light/libxl_arch.h           |  5 +++--
 tools/libs/light/libxl_arm.c            |  7 +++++--
 tools/libs/light/libxl_create.c         | 22 +++++++++++++---------
 tools/libs/light/libxl_types.idl        |  2 ++
 tools/libs/light/libxl_x86.c            | 28 ++++++++++++++++++++++++++--
 tools/ocaml/libs/xc/xenctrl.ml          |  2 ++
 tools/ocaml/libs/xc/xenctrl.mli         |  2 ++
 tools/ocaml/libs/xc/xenctrl_stubs.c     |  2 +-
 tools/xl/xl.c                           |  8 ++++++++
 tools/xl/xl.h                           |  2 ++
 tools/xl/xl_parse.c                     | 16 ++++++++++++++++
 xen/arch/x86/domain.c                   | 28 +++++++++++++++++++++++++++-
 xen/arch/x86/hvm/vmx/vmcs.c             |  4 ++++
 xen/arch/x86/hvm/vmx/vmx.c              | 13 ++++---------
 xen/arch/x86/include/asm/hvm/domain.h   |  6 ++++++
 xen/arch/x86/include/asm/hvm/vmx/vmcs.h |  3 +++
 xen/arch/x86/traps.c                    |  9 +++++----
 xen/include/public/arch-x86/xen.h       |  2 ++
 23 files changed, 183 insertions(+), 30 deletions(-)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index b98d161398..dcca564a23 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -1862,6 +1862,25 @@ firmware tables when using certain older guest Operating
 Systems. These tables have been superseded by newer constructs within
 the ACPI tables.
 
+=item B<assisted_xapic=BOOLEAN>
+
+B<(x86 only)> Enables or disables hardware assisted virtualization for
+xAPIC. With this option enabled, a memory-mapped APIC access will be
+decoded by hardware and either issue a more specific VM exit than just
+an EPT fault, or altogether avoid a VM exit. Notice full
+virtualization for xAPIC can only be achieved if hardware supports
+“APIC-register virtualization” and “virtual-interrupt delivery”. The
+default is settable via L<xl.conf(5)>.
+
+=item B<assisted_x2apic=BOOLEAN>
+
+B<(x86 only)> Enables or disables hardware assisted virtualization for
+x2APIC. With this option enabled, an MSR-Based APIC access will
+either issue a VM exit or altogether avoid one. Notice full
+virtualization for x2APIC can only be achieved if hardware supports
+“APIC-register virtualization” and “virtual-interrupt delivery”. The
+default is settable via L<xl.conf(5)>.
+
 =item B<nx=BOOLEAN>
 
 B<(x86 only)> Hides or exposes the No-eXecute capability. This allows a guest
diff --git a/docs/man/xl.conf.5.pod.in b/docs/man/xl.conf.5.pod.in
index df20c08137..95d136d1ea 100644
--- a/docs/man/xl.conf.5.pod.in
+++ b/docs/man/xl.conf.5.pod.in
@@ -107,6 +107,18 @@ Sets the default value for the C<max_grant_version> domain config value.
 
 Default: maximum grant version supported by the hypervisor.
 
+=item B<assisted_xapic=BOOLEAN>
+
+If enabled, domains will use xAPIC hardware assisted virtualization by default.
+
+Default: enabled if supported.
+
+=item B<assisted_x2apic=BOOLEAN>
+
+If enabled, domains will use x2APIC hardware assisted virtualization by default.
+
+Default: enabled if supported.
+
 =item B<vif.default.script="PATH">
 
 Configures the default hotplug script used by virtual network devices.
diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
index dd4e6c9f14..dece545ee0 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -1120,6 +1120,12 @@ x.ArchArm.Vuart = VuartType(xc.arch_arm.vuart)
 if err := x.ArchX86.MsrRelaxed.fromC(&xc.arch_x86.msr_relaxed);err != nil {
 return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
 }
+if err := x.ArchX86.AssistedXapic.fromC(&xc.arch_x86.assisted_xapic);err != nil {
+return fmt.Errorf("converting field ArchX86.AssistedXapic: %v", err)
+}
+if err := x.ArchX86.AssistedX2Apic.fromC(&xc.arch_x86.assisted_x2apic);err != nil {
+return fmt.Errorf("converting field ArchX86.AssistedX2Apic: %v", err)
+}
 x.Altp2M = Altp2MMode(xc.altp2m)
 x.VmtraceBufKb = int(xc.vmtrace_buf_kb)
 if err := x.Vpmu.fromC(&xc.vpmu);err != nil {
@@ -1605,6 +1611,12 @@ xc.arch_arm.vuart = C.libxl_vuart_type(x.ArchArm.Vuart)
 if err := x.ArchX86.MsrRelaxed.toC(&xc.arch_x86.msr_relaxed); err != nil {
 return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
 }
+if err := x.ArchX86.AssistedXapic.toC(&xc.arch_x86.assisted_xapic); err != nil {
+return fmt.Errorf("converting field ArchX86.AssistedXapic: %v", err)
+}
+if err := x.ArchX86.AssistedX2Apic.toC(&xc.arch_x86.assisted_x2apic); err != nil {
+return fmt.Errorf("converting field ArchX86.AssistedX2Apic: %v", err)
+}
 xc.altp2m = C.libxl_altp2m_mode(x.Altp2M)
 xc.vmtrace_buf_kb = C.int(x.VmtraceBufKb)
 if err := x.Vpmu.toC(&xc.vpmu); err != nil {
diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
index 87be46c745..253c9ad93d 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -520,6 +520,8 @@ Vuart VuartType
 }
 ArchX86 struct {
 MsrRelaxed Defbool
+AssistedXapic Defbool
+AssistedX2Apic Defbool
 }
 Altp2M Altp2MMode
 VmtraceBufKb int
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 94e6355822..cdcccd6d01 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -535,6 +535,13 @@
 #define LIBXL_HAVE_PHYSINFO_ASSISTED_APIC 1
 
 /*
+ * LIBXL_HAVE_ASSISTED_APIC indicates that libxl_domain_build_info has
+ * assisted_xapic and assisted_x2apic fields for enabling hardware
+ * assisted virtualization for x{2}apic per domain.
+ */
+#define LIBXL_HAVE_ASSISTED_APIC 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
diff --git a/tools/libs/light/libxl_arch.h b/tools/libs/light/libxl_arch.h
index 207ceac6a1..03b89929e6 100644
--- a/tools/libs/light/libxl_arch.h
+++ b/tools/libs/light/libxl_arch.h
@@ -71,8 +71,9 @@ void libxl__arch_domain_create_info_setdefault(libxl__gc *gc,
                                                libxl_domain_create_info *c_info);
 
 _hidden
-void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
-                                              libxl_domain_build_info *b_info);
+int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
+                                             libxl_domain_build_info *b_info,
+                                             const libxl_physinfo *physinfo);
 
 _hidden
 int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index 39fdca1b49..ba5b8f433f 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -1384,8 +1384,9 @@ void libxl__arch_domain_create_info_setdefault(libxl__gc *gc,
     }
 }
 
-void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
-                                              libxl_domain_build_info *b_info)
+int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
+                                             libxl_domain_build_info *b_info,
+                                             const libxl_physinfo *physinfo)
 {
     /* ACPI is disabled by default */
     libxl_defbool_setdefault(&b_info->acpi, false);
@@ -1399,6 +1400,8 @@ void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
     memset(&b_info->u, '\0', sizeof(b_info->u));
     b_info->type = LIBXL_DOMAIN_TYPE_INVALID;
     libxl_domain_build_info_init_type(b_info, LIBXL_DOMAIN_TYPE_PVH);
+
+    return 0;
 }
 
 int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index 15ed021f41..88d08d7277 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -75,6 +75,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
                                         libxl_domain_build_info *b_info)
 {
     int i, rc;
+    libxl_physinfo info;
 
     if (b_info->type != LIBXL_DOMAIN_TYPE_HVM &&
         b_info->type != LIBXL_DOMAIN_TYPE_PV &&
@@ -264,7 +265,18 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
     if (!b_info->event_channels)
         b_info->event_channels = 1023;
 
-    libxl__arch_domain_build_info_setdefault(gc, b_info);
+    rc = libxl_get_physinfo(CTX, &info);
+    if (rc) {
+        LOG(ERROR, "failed to get hypervisor info");
+        return rc;
+    }
+
+    rc = libxl__arch_domain_build_info_setdefault(gc, b_info, &info);
+    if (rc) {
+        LOG(ERROR, "unable to set domain arch build info defaults");
+        return rc;
+    }
+
     libxl_defbool_setdefault(&b_info->dm_restrict, false);
 
     if (b_info->iommu_memkb == LIBXL_MEMKB_DEFAULT)
@@ -457,14 +469,6 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
     }
 
     if (b_info->max_grant_version == LIBXL_MAX_GRANT_DEFAULT) {
-        libxl_physinfo info;
-
-        rc = libxl_get_physinfo(CTX, &info);
-        if (rc) {
-            LOG(ERROR, "failed to get hypervisor info");
-            return rc;
-        }
-
         if (info.cap_gnttab_v2)
             b_info->max_grant_version = 2;
         else if (info.cap_gnttab_v1)
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 42ac6c357b..db5eb0a0b3 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -648,6 +648,8 @@ libxl_domain_build_info = Struct("domain_build_info",[
                                ("vuart", libxl_vuart_type),
                               ])),
     ("arch_x86", Struct(None, [("msr_relaxed", libxl_defbool),
+                               ("assisted_xapic", libxl_defbool),
+                               ("assisted_x2apic", libxl_defbool),
                               ])),
     # Alternate p2m is not bound to any architecture or guest type, as it is
     # supported by x86 HVM and ARM support is planned.
diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
index e0a06ecfe3..c377d13b19 100644
--- a/tools/libs/light/libxl_x86.c
+++ b/tools/libs/light/libxl_x86.c
@@ -23,6 +23,14 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
     if (libxl_defbool_val(d_config->b_info.arch_x86.msr_relaxed))
         config->arch.misc_flags |= XEN_X86_MSR_RELAXED;
 
+    if (d_config->c_info.type != LIBXL_DOMAIN_TYPE_PV)
+    {
+        if (libxl_defbool_val(d_config->b_info.arch_x86.assisted_xapic))
+            config->arch.misc_flags |= XEN_X86_ASSISTED_XAPIC;
+
+        if (libxl_defbool_val(d_config->b_info.arch_x86.assisted_x2apic))
+            config->arch.misc_flags |= XEN_X86_ASSISTED_X2APIC;
+    }
     return 0;
 }
 
@@ -819,11 +827,27 @@ void libxl__arch_domain_create_info_setdefault(libxl__gc *gc,
 {
 }
 
-void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
-                                              libxl_domain_build_info *b_info)
+int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
+                                             libxl_domain_build_info *b_info,
+                                             const libxl_physinfo *physinfo)
 {
     libxl_defbool_setdefault(&b_info->acpi, true);
     libxl_defbool_setdefault(&b_info->arch_x86.msr_relaxed, false);
+
+    if (b_info->type != LIBXL_DOMAIN_TYPE_PV) {
+        libxl_defbool_setdefault(&b_info->arch_x86.assisted_xapic,
+                             physinfo->cap_assisted_xapic);
+        libxl_defbool_setdefault(&b_info->arch_x86.assisted_x2apic,
+                             physinfo->cap_assisted_x2apic);
+    }
+
+    else if (!libxl_defbool_is_default(b_info->arch_x86.assisted_xapic) ||
+             !libxl_defbool_is_default(b_info->arch_x86.assisted_x2apic)) {
+        LOG(ERROR, "Interrupt Controller Virtualization not supported for PV");
+        return ERROR_INVAL;
+    }
+
+    return 0;
 }
 
 int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
index 21783d3622..672a11ceb6 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -50,6 +50,8 @@ type x86_arch_emulation_flags =
 
 type x86_arch_misc_flags =
 	| X86_MSR_RELAXED
+	| X86_ASSISTED_XAPIC
+	| X86_ASSISTED_X2APIC
 
 type xen_x86_arch_domainconfig =
 {
diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index af6ba3d1a0..f9a6aa3a0f 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -44,6 +44,8 @@ type x86_arch_emulation_flags =
 
 type x86_arch_misc_flags =
   | X86_MSR_RELAXED
+  | X86_ASSISTED_XAPIC
+  | X86_ASSISTED_X2APIC
 
 type xen_x86_arch_domainconfig = {
   emulation_flags: x86_arch_emulation_flags list;
diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
index e0d49b18d2..ecfc7125d5 100644
--- a/tools/ocaml/libs/xc/xenctrl_stubs.c
+++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
@@ -239,7 +239,7 @@ CAMLprim value stub_xc_domain_create(value xch, value wanted_domid, value config
 
 		cfg.arch.misc_flags = ocaml_list_to_c_bitmap
 			/* ! x86_arch_misc_flags X86_ none */
-			/* ! XEN_X86_ XEN_X86_MSR_RELAXED all */
+			/* ! XEN_X86_ XEN_X86_ASSISTED_X2APIC max */
 			(VAL_MISC_FLAGS);
 
 #undef VAL_MISC_FLAGS
diff --git a/tools/xl/xl.c b/tools/xl/xl.c
index 2d1ec18ea3..31eb223309 100644
--- a/tools/xl/xl.c
+++ b/tools/xl/xl.c
@@ -57,6 +57,8 @@ int max_grant_frames = -1;
 int max_maptrack_frames = -1;
 int max_grant_version = LIBXL_MAX_GRANT_DEFAULT;
 libxl_domid domid_policy = INVALID_DOMID;
+int assisted_xapic = -1;
+int assisted_x2apic = -1;
 
 xentoollog_level minmsglevel = minmsglevel_default;
 
@@ -201,6 +203,12 @@ static void parse_global_config(const char *configfile,
     if (!xlu_cfg_get_long (config, "claim_mode", &l, 0))
         claim_mode = l;
 
+    if (!xlu_cfg_get_long (config, "assisted_xapic", &l, 0))
+        assisted_xapic = l;
+
+    if (!xlu_cfg_get_long (config, "assisted_x2apic", &l, 0))
+        assisted_x2apic = l;
+
     xlu_cfg_replace_string (config, "remus.default.netbufscript",
         &default_remus_netbufscript, 0);
     xlu_cfg_replace_string (config, "colo.default.proxyscript",
diff --git a/tools/xl/xl.h b/tools/xl/xl.h
index c5c4bedbdd..528deb3feb 100644
--- a/tools/xl/xl.h
+++ b/tools/xl/xl.h
@@ -286,6 +286,8 @@ extern libxl_bitmap global_vm_affinity_mask;
 extern libxl_bitmap global_hvm_affinity_mask;
 extern libxl_bitmap global_pv_affinity_mask;
 extern libxl_domid domid_policy;
+extern int assisted_xapic;
+extern int assisted_x2apic;
 
 enum output_format {
     OUTPUT_FORMAT_JSON,
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index 117fcdcb2b..0ab9b145fe 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -1681,6 +1681,22 @@ void parse_config_data(const char *config_source,
         xlu_cfg_get_defbool(config, "vpt_align", &b_info->u.hvm.vpt_align, 0);
         xlu_cfg_get_defbool(config, "apic", &b_info->apic, 0);
 
+        e = xlu_cfg_get_long(config, "assisted_xapic", &l , 0);
+        if ((e == ESRCH && assisted_xapic != -1)) /* use global default if present */
+            libxl_defbool_set(&b_info->arch_x86.assisted_xapic, assisted_xapic);
+        else if (!e)
+            libxl_defbool_set(&b_info->arch_x86.assisted_xapic, l);
+        else
+            exit(1);
+
+        e = xlu_cfg_get_long(config, "assisted_x2apic", &l, 0);
+        if ((e == ESRCH && assisted_x2apic != -1)) /* use global default if present */
+            libxl_defbool_set(&b_info->arch_x86.assisted_x2apic, assisted_x2apic);
+        else if (!e)
+            libxl_defbool_set(&b_info->arch_x86.assisted_x2apic, l);
+        else
+            exit(1);
+
         switch (xlu_cfg_get_list(config, "viridian",
                                  &viridian, &num_viridian, 1))
         {
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index a5048ed654..bcca0dc900 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -619,6 +619,8 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
     bool hvm = config->flags & XEN_DOMCTL_CDF_hvm;
     bool hap = config->flags & XEN_DOMCTL_CDF_hap;
     bool nested_virt = config->flags & XEN_DOMCTL_CDF_nested_virt;
+    bool assisted_xapic = config->arch.misc_flags & XEN_X86_ASSISTED_XAPIC;
+    bool assisted_x2apic = config->arch.misc_flags & XEN_X86_ASSISTED_X2APIC;
     unsigned int max_vcpus;
 
     if ( hvm ? !hvm_enabled : !IS_ENABLED(CONFIG_PV) )
@@ -685,13 +687,31 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
         }
     }
 
-    if ( config->arch.misc_flags & ~XEN_X86_MSR_RELAXED )
+    if ( config->arch.misc_flags & ~(XEN_X86_MSR_RELAXED |
+                                     XEN_X86_ASSISTED_XAPIC |
+                                     XEN_X86_ASSISTED_X2APIC) )
     {
         dprintk(XENLOG_INFO, "Invalid arch misc flags %#x\n",
                 config->arch.misc_flags);
         return -EINVAL;
     }
 
+    if ( (assisted_xapic || assisted_x2apic) && !hvm )
+    {
+        dprintk(XENLOG_INFO,
+                "Interrupt Controller Virtualization not supported for PV\n");
+        return -EINVAL;
+    }
+
+    if ( (assisted_xapic && !assisted_xapic_available) ||
+         (assisted_x2apic && !assisted_x2apic_available) )
+    {
+        dprintk(XENLOG_INFO,
+                "Hardware assisted x%sAPIC requested but not available\n",
+                assisted_xapic && !assisted_xapic_available ? "" : "2");
+        return -EINVAL;
+    }
+
     return 0;
 }
 
@@ -864,6 +884,12 @@ int arch_domain_create(struct domain *d,
 
     d->arch.msr_relaxed = config->arch.misc_flags & XEN_X86_MSR_RELAXED;
 
+    d->arch.hvm.assisted_xapic =
+        config->arch.misc_flags & XEN_X86_ASSISTED_XAPIC;
+
+    d->arch.hvm.assisted_x2apic =
+        config->arch.misc_flags & XEN_X86_ASSISTED_X2APIC;
+
     return 0;
 
  fail:
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 06831099ed..e4503a02a7 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1157,6 +1157,10 @@ static int construct_vmcs(struct vcpu *v)
         __vmwrite(PLE_WINDOW, ple_window);
     }
 
+    if ( !has_assisted_xapic(v->domain) )
+        v->arch.hvm.vmx.secondary_exec_control &=
+            ~SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
+
     if ( cpu_has_vmx_secondary_exec_control )
         __vmwrite(SECONDARY_VM_EXEC_CONTROL,
                   v->arch.hvm.vmx.secondary_exec_control);
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index c075370f64..949ddd684c 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3344,16 +3344,11 @@ static void vmx_install_vlapic_mapping(struct vcpu *v)
 
 void vmx_vlapic_msr_changed(struct vcpu *v)
 {
-    int virtualize_x2apic_mode;
     struct vlapic *vlapic = vcpu_vlapic(v);
     unsigned int msr;
 
-    virtualize_x2apic_mode = ( (cpu_has_vmx_apic_reg_virt ||
-                                cpu_has_vmx_virtual_intr_delivery) &&
-                               cpu_has_vmx_virtualize_x2apic_mode );
-
-    if ( !cpu_has_vmx_virtualize_apic_accesses &&
-         !virtualize_x2apic_mode )
+    if ( !has_assisted_xapic(v->domain) &&
+         !has_assisted_x2apic(v->domain) )
         return;
 
     vmx_vmcs_enter(v);
@@ -3363,7 +3358,7 @@ void vmx_vlapic_msr_changed(struct vcpu *v)
     if ( !vlapic_hw_disabled(vlapic) &&
          (vlapic_base_address(vlapic) == APIC_DEFAULT_PHYS_BASE) )
     {
-        if ( virtualize_x2apic_mode && vlapic_x2apic_mode(vlapic) )
+        if ( has_assisted_x2apic(v->domain) && vlapic_x2apic_mode(vlapic) )
         {
             v->arch.hvm.vmx.secondary_exec_control |=
                 SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE;
@@ -3384,7 +3379,7 @@ void vmx_vlapic_msr_changed(struct vcpu *v)
                 vmx_clear_msr_intercept(v, MSR_X2APIC_SELF, VMX_MSR_W);
             }
         }
-        else
+        else if ( has_assisted_xapic(v->domain) )
             v->arch.hvm.vmx.secondary_exec_control |=
                 SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
     }
diff --git a/xen/arch/x86/include/asm/hvm/domain.h b/xen/arch/x86/include/asm/hvm/domain.h
index 698455444e..92bf53483c 100644
--- a/xen/arch/x86/include/asm/hvm/domain.h
+++ b/xen/arch/x86/include/asm/hvm/domain.h
@@ -117,6 +117,12 @@ struct hvm_domain {
 
     bool                   is_s3_suspended;
 
+    /* xAPIC hardware assisted virtualization. */
+    bool                   assisted_xapic;
+
+    /* x2APIC hardware assisted virtualization. */
+    bool                   assisted_x2apic;
+
     /* hypervisor intercepted msix table */
     struct list_head       msixtbl_list;
 
diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
index 9119aa8536..5b7d662ed7 100644
--- a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
+++ b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
@@ -220,6 +220,9 @@ void vmx_vmcs_reload(struct vcpu *v);
 #define CPU_BASED_ACTIVATE_SECONDARY_CONTROLS 0x80000000
 extern u32 vmx_cpu_based_exec_control;
 
+#define has_assisted_xapic(d)   ((d)->arch.hvm.assisted_xapic)
+#define has_assisted_x2apic(d)  ((d)->arch.hvm.assisted_x2apic)
+
 #define PIN_BASED_EXT_INTR_MASK         0x00000001
 #define PIN_BASED_NMI_EXITING           0x00000008
 #define PIN_BASED_VIRTUAL_NMIS          0x00000020
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index a2278d9499..a0c6b89a88 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1121,7 +1121,8 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
         if ( !is_hvm_domain(d) || subleaf != 0 )
             break;
 
-        if ( cpu_has_vmx_apic_reg_virt )
+        if ( cpu_has_vmx_apic_reg_virt &&
+             has_assisted_xapic(d) )
             res->a |= XEN_HVM_CPUID_APIC_ACCESS_VIRT;
 
         /*
@@ -1130,9 +1131,9 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
          * and wrmsr in the guest will run without VMEXITs (see
          * vmx_vlapic_msr_changed()).
          */
-        if ( cpu_has_vmx_virtualize_x2apic_mode &&
-             cpu_has_vmx_apic_reg_virt &&
-             cpu_has_vmx_virtual_intr_delivery )
+        if ( cpu_has_vmx_apic_reg_virt &&
+             cpu_has_vmx_virtual_intr_delivery &&
+             has_assisted_x2apic(d) )
             res->a |= XEN_HVM_CPUID_X2APIC_VIRT;
 
         /*
diff --git a/xen/include/public/arch-x86/xen.h b/xen/include/public/arch-x86/xen.h
index 7acd94c8eb..9da32c6239 100644
--- a/xen/include/public/arch-x86/xen.h
+++ b/xen/include/public/arch-x86/xen.h
@@ -317,6 +317,8 @@ struct xen_arch_domainconfig {
  * doesn't allow the guest to read or write to the underlying MSR.
  */
 #define XEN_X86_MSR_RELAXED (1u << 0)
+#define XEN_X86_ASSISTED_XAPIC (1u << 1)
+#define XEN_X86_ASSISTED_X2APIC (1u << 2)
     uint32_t misc_flags;
 };
 
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86
  2022-03-07 15:06 ` [PATCH v5 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86 Jane Malalane
@ 2022-03-08 10:39   ` Roger Pau Monné
  2022-03-08 17:31   ` [PATCH v6 " Jane Malalane
  2022-03-14  6:40   ` [PATCH v5 " Tian, Kevin
  2 siblings, 0 replies; 27+ messages in thread
From: Roger Pau Monné @ 2022-03-08 10:39 UTC (permalink / raw)
  To: Jane Malalane
  Cc: Xen-devel, Wei Liu, Anthony PERARD, Juergen Gross, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Volodymyr Babchuk, Bertrand Marquis, Jun Nakajima, Kevin Tian

On Mon, Mar 07, 2022 at 03:06:08PM +0000, Jane Malalane wrote:
> Add XEN_SYSCTL_PHYSCAP_ARCH_ASSISTED_xapic and
> XEN_SYSCTL_PHYSCAP_ARCH_ASSISTED_x2apic to report accelerated xapic
> and x2apic, on x86 hardware.

I think the commit message has gone out of sync with the code, those
should now be XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X{2,}APIC

> No such features are currently implemented on AMD hardware.
> 
> HW assisted xAPIC virtualization will be reported if HW, at the
> minimum, supports virtualize_apic_accesses as this feature alone means
> that an access to the APIC page will cause an APIC-access VM exit. An
> APIC-access VM exit provides a VMM with information about the access
> causing the VM exit, unlike a regular EPT fault, thus simplifying some
> internal handling.
> 
> HW assisted x2APIC virtualization will be reported if HW supports
> virtualize_x2apic_mode and, at least, either apic_reg_virt or
> virtual_intr_delivery. This is due to apic_reg_virt and
> virtual_intr_delivery preventing a VM exit from occuring or at least
> replacing a regular EPT fault VM-exit with an APIC-access VM-exit on

For x2APIC there's no EPT fault, because x2APIC accesses are in the
MSR space. I think you can omit this whole sentence and just mention
that this now follows the conditionals in vmx_vlapic_msr_changed.

> read and write APIC accesses, respectively. This also means that
> sysctl follows the conditionals in vmx_vlapic_msr_changed().
> 
> For that purpose, also add an arch-specific "capabilities" parameter
> to struct xen_sysctl_physinfo.
> 
> Note that this interface is intended to be compatible with AMD so that
> AVIC support can be introduced in a future patch. Unlike Intel that
> has multiple controls for APIC Virtualization, AMD has one global
> 'AVIC Enable' control bit, so fine-graining of APIC virtualization
> control cannot be done on a common interface.
> 
> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jane Malalane <jane.malalane@citrix.com>
> ---
> CC: Wei Liu <wl@xen.org>
> CC: Anthony PERARD <anthony.perard@citrix.com>
> CC: Juergen Gross <jgross@suse.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: George Dunlap <george.dunlap@citrix.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Julien Grall <julien@xen.org>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
> CC: Bertrand Marquis <bertrand.marquis@arm.com>
> CC: Jun Nakajima <jun.nakajima@intel.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> CC: "Roger Pau Monné" <roger.pau@citrix.com>
> 
> v5:
> * Have assisted_xapic_available solely depend on
>   cpu_has_vmx_virtualize_apic_accesses and assisted_x2apic_available
>   depend on cpu_has_vmx_virtualize_x2apic_mode and
>   cpu_has_vmx_apic_reg_virt OR cpu_has_vmx_virtual_intr_delivery.
> 
> v4:
>  * Fallback to the original v2/v1 conditions for setting
>    assisted_xapic_available and assisted_x2apic_available so that in
>    the future APIC virtualization can be exposed on AMD hardware
>    since fine-graining of "AVIC" is not supported, i.e., AMD solely
>    uses "AVIC Enable". This also means that sysctl mimics what's
>    exposed in CPUID.
> 
> v3:
>  * Define XEN_SYSCTL_PHYSCAP_ARCH_MAX for ABI checking and actually
>    set "arch_capbilities", via a call to c_bitmap_to_ocaml_list()
>  * Have assisted_x2apic_available only depend on
>    cpu_has_vmx_virtualize_x2apic_mode
> 
> v2:
>  * Use one macro LIBXL_HAVE_PHYSINFO_ASSISTED_APIC instead of two
>  * Pass xcpyshinfo as a pointer in libxl__arch_get_physinfo
>  * Set assisted_x{2}apic_available to be conditional upon "bsp" and
>    annotate it with __ro_after_init
>  * Change XEN_SYSCTL_PHYSCAP_ARCH_ASSISTED_X{2}APIC to
>    _X86_ASSISTED_X{2}APIC
>  * Keep XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X{2}APIC contained within
>    sysctl.h
>  * Fix padding introduced in struct xen_sysctl_physinfo and bump
>    XEN_SYSCTL_INTERFACE_VERSION
> ---
>  tools/golang/xenlight/helpers.gen.go |  4 ++++
>  tools/golang/xenlight/types.gen.go   |  2 ++
>  tools/include/libxl.h                |  7 +++++++
>  tools/libs/light/libxl.c             |  3 +++
>  tools/libs/light/libxl_arch.h        |  4 ++++
>  tools/libs/light/libxl_arm.c         |  5 +++++
>  tools/libs/light/libxl_types.idl     |  2 ++
>  tools/libs/light/libxl_x86.c         | 11 +++++++++++
>  tools/ocaml/libs/xc/xenctrl.ml       |  5 +++++
>  tools/ocaml/libs/xc/xenctrl.mli      |  5 +++++
>  tools/ocaml/libs/xc/xenctrl_stubs.c  | 14 +++++++++++---
>  tools/xl/xl_info.c                   |  6 ++++--
>  xen/arch/x86/hvm/vmx/vmcs.c          |  9 +++++++++
>  xen/arch/x86/include/asm/domain.h    |  3 +++
>  xen/arch/x86/sysctl.c                |  7 +++++++
>  xen/include/public/sysctl.h          | 11 ++++++++++-
>  16 files changed, 92 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
> index b746ff1081..dd4e6c9f14 100644
> --- a/tools/golang/xenlight/helpers.gen.go
> +++ b/tools/golang/xenlight/helpers.gen.go
> @@ -3373,6 +3373,8 @@ x.CapVmtrace = bool(xc.cap_vmtrace)
>  x.CapVpmu = bool(xc.cap_vpmu)
>  x.CapGnttabV1 = bool(xc.cap_gnttab_v1)
>  x.CapGnttabV2 = bool(xc.cap_gnttab_v2)
> +x.CapAssistedXapic = bool(xc.cap_assisted_xapic)
> +x.CapAssistedX2Apic = bool(xc.cap_assisted_x2apic)
>  
>   return nil}
>  
> @@ -3407,6 +3409,8 @@ xc.cap_vmtrace = C.bool(x.CapVmtrace)
>  xc.cap_vpmu = C.bool(x.CapVpmu)
>  xc.cap_gnttab_v1 = C.bool(x.CapGnttabV1)
>  xc.cap_gnttab_v2 = C.bool(x.CapGnttabV2)
> +xc.cap_assisted_xapic = C.bool(x.CapAssistedXapic)
> +xc.cap_assisted_x2apic = C.bool(x.CapAssistedX2Apic)
>  
>   return nil
>   }
> diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
> index b1e84d5258..87be46c745 100644
> --- a/tools/golang/xenlight/types.gen.go
> +++ b/tools/golang/xenlight/types.gen.go
> @@ -1014,6 +1014,8 @@ CapVmtrace bool
>  CapVpmu bool
>  CapGnttabV1 bool
>  CapGnttabV2 bool
> +CapAssistedXapic bool
> +CapAssistedX2Apic bool
>  }
>  
>  type Connectorinfo struct {
> diff --git a/tools/include/libxl.h b/tools/include/libxl.h
> index 51a9b6cfac..94e6355822 100644
> --- a/tools/include/libxl.h
> +++ b/tools/include/libxl.h
> @@ -528,6 +528,13 @@
>  #define LIBXL_HAVE_MAX_GRANT_VERSION 1
>  
>  /*
> + * LIBXL_HAVE_PHYSINFO_ASSISTED_APIC indicates that libxl_physinfo has
> + * cap_assisted_xapic and cap_assisted_x2apic fields, which indicates
> + * the availability of x{2}APIC hardware assisted virtualization.
> + */
> +#define LIBXL_HAVE_PHYSINFO_ASSISTED_APIC 1
> +
> +/*
>   * libxl ABI compatibility
>   *
>   * The only guarantee which libxl makes regarding ABI compatibility
> diff --git a/tools/libs/light/libxl.c b/tools/libs/light/libxl.c
> index a0bf7d186f..6d699951e2 100644
> --- a/tools/libs/light/libxl.c
> +++ b/tools/libs/light/libxl.c
> @@ -15,6 +15,7 @@
>  #include "libxl_osdeps.h"
>  
>  #include "libxl_internal.h"
> +#include "libxl_arch.h"
>  
>  int libxl_ctx_alloc(libxl_ctx **pctx, int version,
>                      unsigned flags, xentoollog_logger * lg)
> @@ -410,6 +411,8 @@ int libxl_get_physinfo(libxl_ctx *ctx, libxl_physinfo *physinfo)
>      physinfo->cap_gnttab_v2 =
>          !!(xcphysinfo.capabilities & XEN_SYSCTL_PHYSCAP_gnttab_v2);
>  
> +    libxl__arch_get_physinfo(physinfo, &xcphysinfo);
> +
>      GC_FREE;
>      return 0;
>  }
> diff --git a/tools/libs/light/libxl_arch.h b/tools/libs/light/libxl_arch.h
> index 1522ecb97f..207ceac6a1 100644
> --- a/tools/libs/light/libxl_arch.h
> +++ b/tools/libs/light/libxl_arch.h
> @@ -86,6 +86,10 @@ int libxl__arch_extra_memory(libxl__gc *gc,
>                               uint64_t *out);
>  
>  _hidden
> +void libxl__arch_get_physinfo(libxl_physinfo *physinfo,
> +                              const xc_physinfo_t *xcphysinfo);
> +
> +_hidden
>  void libxl__arch_update_domain_config(libxl__gc *gc,
>                                        libxl_domain_config *dst,
>                                        const libxl_domain_config *src);
> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> index eef1de0939..39fdca1b49 100644
> --- a/tools/libs/light/libxl_arm.c
> +++ b/tools/libs/light/libxl_arm.c
> @@ -1431,6 +1431,11 @@ int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
>      return rc;
>  }
>  
> +void libxl__arch_get_physinfo(libxl_physinfo *physinfo,
> +                              const xc_physinfo_t *xcphysinfo)
> +{
> +}
> +
>  void libxl__arch_update_domain_config(libxl__gc *gc,
>                                        libxl_domain_config *dst,
>                                        const libxl_domain_config *src)
> diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
> index 2a42da2f7d..42ac6c357b 100644
> --- a/tools/libs/light/libxl_types.idl
> +++ b/tools/libs/light/libxl_types.idl
> @@ -1068,6 +1068,8 @@ libxl_physinfo = Struct("physinfo", [
>      ("cap_vpmu", bool),
>      ("cap_gnttab_v1", bool),
>      ("cap_gnttab_v2", bool),
> +    ("cap_assisted_xapic", bool),
> +    ("cap_assisted_x2apic", bool),
>      ], dir=DIR_OUT)
>  
>  libxl_connectorinfo = Struct("connectorinfo", [
> diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
> index 1feadebb18..e0a06ecfe3 100644
> --- a/tools/libs/light/libxl_x86.c
> +++ b/tools/libs/light/libxl_x86.c
> @@ -866,6 +866,17 @@ int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
>      return rc;
>  }
>  
> +void libxl__arch_get_physinfo(libxl_physinfo *physinfo,
> +                              const xc_physinfo_t *xcphysinfo)
> +{
> +    physinfo->cap_assisted_xapic =
> +        !!(xcphysinfo->arch_capabilities &
> +           XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC);
> +    physinfo->cap_assisted_x2apic =
> +        !!(xcphysinfo->arch_capabilities &
> +           XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC);
> +}
> +
>  void libxl__arch_update_domain_config(libxl__gc *gc,
>                                        libxl_domain_config *dst,
>                                        const libxl_domain_config *src)
> diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
> index 7503031d8f..21783d3622 100644
> --- a/tools/ocaml/libs/xc/xenctrl.ml
> +++ b/tools/ocaml/libs/xc/xenctrl.ml
> @@ -127,6 +127,10 @@ type physinfo_cap_flag =
>  	| CAP_Gnttab_v1
>  	| CAP_Gnttab_v2
>  
> +type physinfo_arch_cap_flag =
> +	| CAP_X86_ASSISTED_XAPIC
> +	| CAP_X86_ASSISTED_X2APIC
> +
>  type physinfo =
>  {
>  	threads_per_core : int;
> @@ -139,6 +143,7 @@ type physinfo =
>  	scrub_pages      : nativeint;
>  	(* XXX hw_cap *)
>  	capabilities     : physinfo_cap_flag list;
> +	arch_capabilities : physinfo_arch_cap_flag list;
>  	max_nr_cpus      : int;
>  }
>  
> diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
> index d1d9c9247a..af6ba3d1a0 100644
> --- a/tools/ocaml/libs/xc/xenctrl.mli
> +++ b/tools/ocaml/libs/xc/xenctrl.mli
> @@ -112,6 +112,10 @@ type physinfo_cap_flag =
>    | CAP_Gnttab_v1
>    | CAP_Gnttab_v2
>  
> +type physinfo_arch_cap_flag =
> +  | CAP_X86_ASSISTED_XAPIC
> +  | CAP_X86_ASSISTED_X2APIC
> +
>  type physinfo = {
>    threads_per_core : int;
>    cores_per_socket : int;
> @@ -122,6 +126,7 @@ type physinfo = {
>    free_pages       : nativeint;
>    scrub_pages      : nativeint;
>    capabilities     : physinfo_cap_flag list;
> +  arch_capabilities : physinfo_arch_cap_flag list;
>    max_nr_cpus      : int; (** compile-time max possible number of nr_cpus *)
>  }
>  type version = { major : int; minor : int; extra : string; }
> diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
> index 5b4fe72c8d..e0d49b18d2 100644
> --- a/tools/ocaml/libs/xc/xenctrl_stubs.c
> +++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
> @@ -712,7 +712,7 @@ CAMLprim value stub_xc_send_debug_keys(value xch, value keys)
>  CAMLprim value stub_xc_physinfo(value xch)
>  {
>  	CAMLparam1(xch);
> -	CAMLlocal2(physinfo, cap_list);
> +	CAMLlocal3(physinfo, cap_list, arch_cap_list);
>  	xc_physinfo_t c_physinfo;
>  	int r;
>  
> @@ -730,8 +730,15 @@ CAMLprim value stub_xc_physinfo(value xch)
>  		/* ! physinfo_cap_flag CAP_ lc */
>  		/* ! XEN_SYSCTL_PHYSCAP_ XEN_SYSCTL_PHYSCAP_MAX max */
>  		(c_physinfo.capabilities);
> +	/*
> +	 * arch_capabilities: physinfo_arch_cap_flag list;
> +	 */
> +	arch_cap_list = c_bitmap_to_ocaml_list
> +		/* ! physinfo_arch_cap_flag CAP_ none */
> +		/* ! XEN_SYSCTL_PHYSCAP_ XEN_SYSCTL_PHYSCAP_X86_MAX max */
> +		(c_physinfo.arch_capabilities);

Note the above is likely to need adjusting if Arm (or other arches)
start using arch_capabilities. Could we make this conditional to
CONFIG_X86 being enabled?

>  
> -	physinfo = caml_alloc_tuple(10);
> +	physinfo = caml_alloc_tuple(11);
>  	Store_field(physinfo, 0, Val_int(c_physinfo.threads_per_core));
>  	Store_field(physinfo, 1, Val_int(c_physinfo.cores_per_socket));
>  	Store_field(physinfo, 2, Val_int(c_physinfo.nr_cpus));
> @@ -741,7 +748,8 @@ CAMLprim value stub_xc_physinfo(value xch)
>  	Store_field(physinfo, 6, caml_copy_nativeint(c_physinfo.free_pages));
>  	Store_field(physinfo, 7, caml_copy_nativeint(c_physinfo.scrub_pages));
>  	Store_field(physinfo, 8, cap_list);
> -	Store_field(physinfo, 9, Val_int(c_physinfo.max_cpu_id + 1));
> +	Store_field(physinfo, 9, arch_cap_list);
> +	Store_field(physinfo, 10, Val_int(c_physinfo.max_cpu_id + 1));
>  
>  	CAMLreturn(physinfo);
>  }
> diff --git a/tools/xl/xl_info.c b/tools/xl/xl_info.c
> index 712b7638b0..3205270754 100644
> --- a/tools/xl/xl_info.c
> +++ b/tools/xl/xl_info.c
> @@ -210,7 +210,7 @@ static void output_physinfo(void)
>           info.hw_cap[4], info.hw_cap[5], info.hw_cap[6], info.hw_cap[7]
>          );
>  
> -    maybe_printf("virt_caps              :%s%s%s%s%s%s%s%s%s%s%s\n",
> +    maybe_printf("virt_caps              :%s%s%s%s%s%s%s%s%s%s%s%s%s\n",
>           info.cap_pv ? " pv" : "",
>           info.cap_hvm ? " hvm" : "",
>           info.cap_hvm && info.cap_hvm_directio ? " hvm_directio" : "",
> @@ -221,7 +221,9 @@ static void output_physinfo(void)
>           info.cap_vmtrace ? " vmtrace" : "",
>           info.cap_vpmu ? " vpmu" : "",
>           info.cap_gnttab_v1 ? " gnttab-v1" : "",
> -         info.cap_gnttab_v2 ? " gnttab-v2" : ""
> +         info.cap_gnttab_v2 ? " gnttab-v2" : "",
> +         info.cap_assisted_xapic ? " assisted_xapic" : "",
> +         info.cap_assisted_x2apic ? " assisted_x2apic" : ""
>          );
>  
>      vinfo = libxl_get_version_info(ctx);
> diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
> index e1e1fa14e6..06831099ed 100644
> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> @@ -343,6 +343,15 @@ static int vmx_init_vmcs_config(bool bsp)
>              MSR_IA32_VMX_PROCBASED_CTLS2, &mismatch);
>      }
>  
> +    /* Check whether hardware supports accelerated xapic and x2apic. */
> +    if ( bsp )
> +    {
> +        assisted_xapic_available = cpu_has_vmx_virtualize_apic_accesses;
> +        assisted_x2apic_available = (cpu_has_vmx_virtualize_x2apic_mode &&
> +                                     (cpu_has_vmx_apic_reg_virt ||
> +                                      cpu_has_vmx_virtual_intr_delivery));

There's no need for the outer parentheses.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
  2022-03-07 15:06 ` [PATCH v5 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC Jane Malalane
@ 2022-03-08 11:38   ` Roger Pau Monné
  2022-03-08 12:24     ` Jan Beulich
  2022-03-08 15:44     ` Jane Malalane
  2022-03-08 17:36   ` [PATCH v6 " Jane Malalane
  1 sibling, 2 replies; 27+ messages in thread
From: Roger Pau Monné @ 2022-03-08 11:38 UTC (permalink / raw)
  To: Jane Malalane
  Cc: Xen-devel, Wei Liu, Anthony PERARD, Juergen Gross, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Christian Lindig, David Scott, Volodymyr Babchuk

On Mon, Mar 07, 2022 at 03:06:09PM +0000, Jane Malalane wrote:
> Introduce a new per-domain creation x86 specific flag to
> select whether hardware assisted virtualization should be used for
> x{2}APIC.
> 
> A per-domain option is added to xl in order to select the usage of
> x{2}APIC hardware assisted virtualization, as well as a global
> configuration option.
> 
> Having all APIC interaction exit to Xen for emulation is slow and can
> induce much overhead. Hardware can speed up x{2}APIC by decoding the
> APIC access and providing a VM exit with a more specific exit reason
> than a regular EPT fault or by altogether avoiding a VM exit.
> 
> On the other hand, being able to disable x{2}APIC hardware assisted
> virtualization can be useful for testing and debugging purposes.
> 
> Note: vmx_install_vlapic_mapping doesn't require modifications
> regardless of whether the guest has "Virtualize APIC accesses" enabled
> or not, i.e., setting the APIC_ACCESS_ADDR VMCS field is fine so long
> as virtualize_apic_accesses is supported by the CPU.
> 
> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jane Malalane <jane.malalane@citrix.com>
> ---
> CC: Wei Liu <wl@xen.org>
> CC: Anthony PERARD <anthony.perard@citrix.com>
> CC: Juergen Gross <jgross@suse.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: George Dunlap <george.dunlap@citrix.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Julien Grall <julien@xen.org>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Christian Lindig <christian.lindig@citrix.com>
> CC: David Scott <dave@recoil.org>
> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
> CC: "Roger Pau Monné" <roger.pau@citrix.com>
> 
> v5:
> * Revert v4 changes in vmx_vlapic_msr_changed(), preserving the use of
>   the has_assisted_x{2}apic macros
> * Following changes in assisted_x{2}apic_available definitions in
>   patch 1, retighten conditionals for setting
>   XEN_HVM_CPUID_APIC_ACCESS_VIRT and XEN_HVM_CPUID_X2APIC_VIRT in
>   cpuid_hypervisor_leaves()
> 
> v4:
>  * Add has_assisted_x{2}apic macros and use them where appropriate
>  * Replace CPU checks with per-domain assisted_x{2}apic control
>    options in vmx_vlapic_msr_changed() and cpuid_hypervisor_leaves(),
>    following edits to assisted_x{2}apic_available definitions in
>    patch 1
>    Note: new assisted_x{2}apic_available definitions make later
>    cpu_has_vmx_apic_reg_virt and cpu_has_vmx_virtual_intr_delivery
>    checks redundant in vmx_vlapic_msr_changed()
> 
> v3:
>  * Change info in xl.cfg to better express reality and fix
>    capitalization of x{2}apic
>  * Move "physinfo" variable definition to the beggining of
>    libxl__domain_build_info_setdefault()
>  * Reposition brackets in if statement to match libxl coding style
>  * Shorten logic in libxl__arch_domain_build_info_setdefault()
>  * Correct dprintk message in arch_sanitise_domain_config()
>  * Make appropriate changes in vmx_vlapic_msr_changed() and
>    cpuid_hypervisor_leaves() for amended "assisted_x2apic" bit
>  * Remove unneeded parantheses
> 
> v2:
>  * Add a LIBXL_HAVE_ASSISTED_APIC macro
>  * Pass xcpyshinfo as a pointer in libxl__arch_get_physinfo
>  * Add a return statement in now "int"
>    libxl__arch_domain_build_info_setdefault()
>  * Preserve libxl__arch_domain_build_info_setdefault 's location in
>    libxl_create.c
>  * Correct x{2}apic default setting logic in
>    libxl__arch_domain_prepare_config()
>  * Correct logic for parsing assisted_x{2}apic host/guest options in
>    xl_parse.c and initialize them to -1 in xl.c
>  * Use guest options directly in vmx_vlapic_msr_changed
>  * Fix indentation of bool assisted_x{2}apic in struct hvm_domain
>  * Add a change in xenctrl_stubs.c to pass xenctrl ABI checks
> ---
>  docs/man/xl.cfg.5.pod.in                | 19 +++++++++++++++++++
>  docs/man/xl.conf.5.pod.in               | 12 ++++++++++++
>  tools/golang/xenlight/helpers.gen.go    | 12 ++++++++++++
>  tools/golang/xenlight/types.gen.go      |  2 ++
>  tools/include/libxl.h                   |  7 +++++++
>  tools/libs/light/libxl_arch.h           |  5 +++--
>  tools/libs/light/libxl_arm.c            |  7 +++++--
>  tools/libs/light/libxl_create.c         | 22 +++++++++++++---------
>  tools/libs/light/libxl_types.idl        |  2 ++
>  tools/libs/light/libxl_x86.c            | 28 ++++++++++++++++++++++++++--
>  tools/ocaml/libs/xc/xenctrl.ml          |  2 ++
>  tools/ocaml/libs/xc/xenctrl.mli         |  2 ++
>  tools/ocaml/libs/xc/xenctrl_stubs.c     |  2 +-
>  tools/xl/xl.c                           |  8 ++++++++
>  tools/xl/xl.h                           |  2 ++
>  tools/xl/xl_parse.c                     | 16 ++++++++++++++++
>  xen/arch/x86/domain.c                   | 28 +++++++++++++++++++++++++++-
>  xen/arch/x86/hvm/vmx/vmcs.c             |  4 ++++
>  xen/arch/x86/hvm/vmx/vmx.c              | 13 ++++---------
>  xen/arch/x86/include/asm/hvm/domain.h   |  6 ++++++
>  xen/arch/x86/include/asm/hvm/vmx/vmcs.h |  3 +++
>  xen/arch/x86/traps.c                    |  9 +++++----
>  xen/include/public/arch-x86/xen.h       |  2 ++
>  23 files changed, 183 insertions(+), 30 deletions(-)
> 
> diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
> index b98d161398..dcca564a23 100644
> --- a/docs/man/xl.cfg.5.pod.in
> +++ b/docs/man/xl.cfg.5.pod.in
> @@ -1862,6 +1862,25 @@ firmware tables when using certain older guest Operating
>  Systems. These tables have been superseded by newer constructs within
>  the ACPI tables.
>  
> +=item B<assisted_xapic=BOOLEAN>
> +
> +B<(x86 only)> Enables or disables hardware assisted virtualization for
> +xAPIC. With this option enabled, a memory-mapped APIC access will be
> +decoded by hardware and either issue a more specific VM exit than just
> +an EPT fault, or altogether avoid a VM exit. Notice full
> +virtualization for xAPIC can only be achieved if hardware supports
> +“APIC-register virtualization” and “virtual-interrupt delivery”.

You shouldn't mention “APIC-register virtualization” or
“virtual-interrupt delivery”, as those are Intel specific options. I
would just remove that sentence (same below).

> The
> +default is settable via L<xl.conf(5)>.
> +
> +=item B<assisted_x2apic=BOOLEAN>
> +
> +B<(x86 only)> Enables or disables hardware assisted virtualization for
> +x2APIC. With this option enabled, an MSR-Based APIC access will
> +either issue a VM exit or altogether avoid one.

"With this option enabled, certain accesses to MSR APIC registers will
avoid a VM exit into the hypervisor."

> Notice full
> +virtualization for x2APIC can only be achieved if hardware supports
> +“APIC-register virtualization” and “virtual-interrupt delivery”. The
> +default is settable via L<xl.conf(5)>.
> +
>  =item B<nx=BOOLEAN>
>  
>  B<(x86 only)> Hides or exposes the No-eXecute capability. This allows a guest
> diff --git a/docs/man/xl.conf.5.pod.in b/docs/man/xl.conf.5.pod.in
> index df20c08137..95d136d1ea 100644
> --- a/docs/man/xl.conf.5.pod.in
> +++ b/docs/man/xl.conf.5.pod.in
> @@ -107,6 +107,18 @@ Sets the default value for the C<max_grant_version> domain config value.
>  
>  Default: maximum grant version supported by the hypervisor.
>  
> +=item B<assisted_xapic=BOOLEAN>
> +
> +If enabled, domains will use xAPIC hardware assisted virtualization by default.
> +
> +Default: enabled if supported.
> +
> +=item B<assisted_x2apic=BOOLEAN>
> +
> +If enabled, domains will use x2APIC hardware assisted virtualization by default.
> +
> +Default: enabled if supported.
> +
>  =item B<vif.default.script="PATH">
>  
>  Configures the default hotplug script used by virtual network devices.
> diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
> index dd4e6c9f14..dece545ee0 100644
> --- a/tools/golang/xenlight/helpers.gen.go
> +++ b/tools/golang/xenlight/helpers.gen.go
> @@ -1120,6 +1120,12 @@ x.ArchArm.Vuart = VuartType(xc.arch_arm.vuart)
>  if err := x.ArchX86.MsrRelaxed.fromC(&xc.arch_x86.msr_relaxed);err != nil {
>  return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
>  }
> +if err := x.ArchX86.AssistedXapic.fromC(&xc.arch_x86.assisted_xapic);err != nil {
> +return fmt.Errorf("converting field ArchX86.AssistedXapic: %v", err)
> +}
> +if err := x.ArchX86.AssistedX2Apic.fromC(&xc.arch_x86.assisted_x2apic);err != nil {
> +return fmt.Errorf("converting field ArchX86.AssistedX2Apic: %v", err)
> +}
>  x.Altp2M = Altp2MMode(xc.altp2m)
>  x.VmtraceBufKb = int(xc.vmtrace_buf_kb)
>  if err := x.Vpmu.fromC(&xc.vpmu);err != nil {
> @@ -1605,6 +1611,12 @@ xc.arch_arm.vuart = C.libxl_vuart_type(x.ArchArm.Vuart)
>  if err := x.ArchX86.MsrRelaxed.toC(&xc.arch_x86.msr_relaxed); err != nil {
>  return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
>  }
> +if err := x.ArchX86.AssistedXapic.toC(&xc.arch_x86.assisted_xapic); err != nil {
> +return fmt.Errorf("converting field ArchX86.AssistedXapic: %v", err)
> +}
> +if err := x.ArchX86.AssistedX2Apic.toC(&xc.arch_x86.assisted_x2apic); err != nil {
> +return fmt.Errorf("converting field ArchX86.AssistedX2Apic: %v", err)
> +}
>  xc.altp2m = C.libxl_altp2m_mode(x.Altp2M)
>  xc.vmtrace_buf_kb = C.int(x.VmtraceBufKb)
>  if err := x.Vpmu.toC(&xc.vpmu); err != nil {
> diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
> index 87be46c745..253c9ad93d 100644
> --- a/tools/golang/xenlight/types.gen.go
> +++ b/tools/golang/xenlight/types.gen.go
> @@ -520,6 +520,8 @@ Vuart VuartType
>  }
>  ArchX86 struct {
>  MsrRelaxed Defbool
> +AssistedXapic Defbool
> +AssistedX2Apic Defbool
>  }
>  Altp2M Altp2MMode
>  VmtraceBufKb int
> diff --git a/tools/include/libxl.h b/tools/include/libxl.h
> index 94e6355822..cdcccd6d01 100644
> --- a/tools/include/libxl.h
> +++ b/tools/include/libxl.h
> @@ -535,6 +535,13 @@
>  #define LIBXL_HAVE_PHYSINFO_ASSISTED_APIC 1
>  
>  /*
> + * LIBXL_HAVE_ASSISTED_APIC indicates that libxl_domain_build_info has
> + * assisted_xapic and assisted_x2apic fields for enabling hardware
> + * assisted virtualization for x{2}apic per domain.
> + */
> +#define LIBXL_HAVE_ASSISTED_APIC 1
> +
> +/*
>   * libxl ABI compatibility
>   *
>   * The only guarantee which libxl makes regarding ABI compatibility
> diff --git a/tools/libs/light/libxl_arch.h b/tools/libs/light/libxl_arch.h
> index 207ceac6a1..03b89929e6 100644
> --- a/tools/libs/light/libxl_arch.h
> +++ b/tools/libs/light/libxl_arch.h
> @@ -71,8 +71,9 @@ void libxl__arch_domain_create_info_setdefault(libxl__gc *gc,
>                                                 libxl_domain_create_info *c_info);
>  
>  _hidden
> -void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
> -                                              libxl_domain_build_info *b_info);
> +int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
> +                                             libxl_domain_build_info *b_info,
> +                                             const libxl_physinfo *physinfo);
>  
>  _hidden
>  int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> index 39fdca1b49..ba5b8f433f 100644
> --- a/tools/libs/light/libxl_arm.c
> +++ b/tools/libs/light/libxl_arm.c
> @@ -1384,8 +1384,9 @@ void libxl__arch_domain_create_info_setdefault(libxl__gc *gc,
>      }
>  }
>  
> -void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
> -                                              libxl_domain_build_info *b_info)
> +int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
> +                                             libxl_domain_build_info *b_info,
> +                                             const libxl_physinfo *physinfo)
>  {
>      /* ACPI is disabled by default */
>      libxl_defbool_setdefault(&b_info->acpi, false);
> @@ -1399,6 +1400,8 @@ void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
>      memset(&b_info->u, '\0', sizeof(b_info->u));
>      b_info->type = LIBXL_DOMAIN_TYPE_INVALID;
>      libxl_domain_build_info_init_type(b_info, LIBXL_DOMAIN_TYPE_PVH);
> +
> +    return 0;
>  }
>  
>  int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
> diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
> index 15ed021f41..88d08d7277 100644
> --- a/tools/libs/light/libxl_create.c
> +++ b/tools/libs/light/libxl_create.c
> @@ -75,6 +75,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
>                                          libxl_domain_build_info *b_info)
>  {
>      int i, rc;
> +    libxl_physinfo info;
>  
>      if (b_info->type != LIBXL_DOMAIN_TYPE_HVM &&
>          b_info->type != LIBXL_DOMAIN_TYPE_PV &&
> @@ -264,7 +265,18 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
>      if (!b_info->event_channels)
>          b_info->event_channels = 1023;
>  
> -    libxl__arch_domain_build_info_setdefault(gc, b_info);
> +    rc = libxl_get_physinfo(CTX, &info);
> +    if (rc) {
> +        LOG(ERROR, "failed to get hypervisor info");
> +        return rc;
> +    }
> +
> +    rc = libxl__arch_domain_build_info_setdefault(gc, b_info, &info);
> +    if (rc) {
> +        LOG(ERROR, "unable to set domain arch build info defaults");
> +        return rc;
> +    }
> +
>      libxl_defbool_setdefault(&b_info->dm_restrict, false);
>  
>      if (b_info->iommu_memkb == LIBXL_MEMKB_DEFAULT)
> @@ -457,14 +469,6 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
>      }
>  
>      if (b_info->max_grant_version == LIBXL_MAX_GRANT_DEFAULT) {
> -        libxl_physinfo info;
> -
> -        rc = libxl_get_physinfo(CTX, &info);
> -        if (rc) {
> -            LOG(ERROR, "failed to get hypervisor info");
> -            return rc;
> -        }
> -
>          if (info.cap_gnttab_v2)
>              b_info->max_grant_version = 2;
>          else if (info.cap_gnttab_v1)
> diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
> index 42ac6c357b..db5eb0a0b3 100644
> --- a/tools/libs/light/libxl_types.idl
> +++ b/tools/libs/light/libxl_types.idl
> @@ -648,6 +648,8 @@ libxl_domain_build_info = Struct("domain_build_info",[
>                                 ("vuart", libxl_vuart_type),
>                                ])),
>      ("arch_x86", Struct(None, [("msr_relaxed", libxl_defbool),
> +                               ("assisted_xapic", libxl_defbool),
> +                               ("assisted_x2apic", libxl_defbool),
>                                ])),
>      # Alternate p2m is not bound to any architecture or guest type, as it is
>      # supported by x86 HVM and ARM support is planned.
> diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
> index e0a06ecfe3..c377d13b19 100644
> --- a/tools/libs/light/libxl_x86.c
> +++ b/tools/libs/light/libxl_x86.c
> @@ -23,6 +23,14 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>      if (libxl_defbool_val(d_config->b_info.arch_x86.msr_relaxed))
>          config->arch.misc_flags |= XEN_X86_MSR_RELAXED;
>  
> +    if (d_config->c_info.type != LIBXL_DOMAIN_TYPE_PV)
> +    {
> +        if (libxl_defbool_val(d_config->b_info.arch_x86.assisted_xapic))
> +            config->arch.misc_flags |= XEN_X86_ASSISTED_XAPIC;
> +
> +        if (libxl_defbool_val(d_config->b_info.arch_x86.assisted_x2apic))
> +            config->arch.misc_flags |= XEN_X86_ASSISTED_X2APIC;
> +    }
>      return 0;
>  }
>  
> @@ -819,11 +827,27 @@ void libxl__arch_domain_create_info_setdefault(libxl__gc *gc,
>  {
>  }
>  
> -void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
> -                                              libxl_domain_build_info *b_info)
> +int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
> +                                             libxl_domain_build_info *b_info,
> +                                             const libxl_physinfo *physinfo)
>  {
>      libxl_defbool_setdefault(&b_info->acpi, true);
>      libxl_defbool_setdefault(&b_info->arch_x86.msr_relaxed, false);
> +
> +    if (b_info->type != LIBXL_DOMAIN_TYPE_PV) {
> +        libxl_defbool_setdefault(&b_info->arch_x86.assisted_xapic,
> +                             physinfo->cap_assisted_xapic);
> +        libxl_defbool_setdefault(&b_info->arch_x86.assisted_x2apic,
> +                             physinfo->cap_assisted_x2apic);
> +    }
> +

Extra newline? 'else if' should be one space after the closing
bracket.

> +    else if (!libxl_defbool_is_default(b_info->arch_x86.assisted_xapic) ||
> +             !libxl_defbool_is_default(b_info->arch_x86.assisted_x2apic)) {
> +        LOG(ERROR, "Interrupt Controller Virtualization not supported for PV");
> +        return ERROR_INVAL;
> +    }
> +
> +    return 0;
>  }
>  
>  int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
> diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
> index 21783d3622..672a11ceb6 100644
> --- a/tools/ocaml/libs/xc/xenctrl.ml
> +++ b/tools/ocaml/libs/xc/xenctrl.ml
> @@ -50,6 +50,8 @@ type x86_arch_emulation_flags =
>  
>  type x86_arch_misc_flags =
>  	| X86_MSR_RELAXED
> +	| X86_ASSISTED_XAPIC
> +	| X86_ASSISTED_X2APIC
>  
>  type xen_x86_arch_domainconfig =
>  {
> diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
> index af6ba3d1a0..f9a6aa3a0f 100644
> --- a/tools/ocaml/libs/xc/xenctrl.mli
> +++ b/tools/ocaml/libs/xc/xenctrl.mli
> @@ -44,6 +44,8 @@ type x86_arch_emulation_flags =
>  
>  type x86_arch_misc_flags =
>    | X86_MSR_RELAXED
> +  | X86_ASSISTED_XAPIC
> +  | X86_ASSISTED_X2APIC
>  
>  type xen_x86_arch_domainconfig = {
>    emulation_flags: x86_arch_emulation_flags list;
> diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
> index e0d49b18d2..ecfc7125d5 100644
> --- a/tools/ocaml/libs/xc/xenctrl_stubs.c
> +++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
> @@ -239,7 +239,7 @@ CAMLprim value stub_xc_domain_create(value xch, value wanted_domid, value config
>  
>  		cfg.arch.misc_flags = ocaml_list_to_c_bitmap
>  			/* ! x86_arch_misc_flags X86_ none */
> -			/* ! XEN_X86_ XEN_X86_MSR_RELAXED all */
> +			/* ! XEN_X86_ XEN_X86_ASSISTED_X2APIC max */
>  			(VAL_MISC_FLAGS);
>  
>  #undef VAL_MISC_FLAGS
> diff --git a/tools/xl/xl.c b/tools/xl/xl.c
> index 2d1ec18ea3..31eb223309 100644
> --- a/tools/xl/xl.c
> +++ b/tools/xl/xl.c
> @@ -57,6 +57,8 @@ int max_grant_frames = -1;
>  int max_maptrack_frames = -1;
>  int max_grant_version = LIBXL_MAX_GRANT_DEFAULT;
>  libxl_domid domid_policy = INVALID_DOMID;
> +int assisted_xapic = -1;
> +int assisted_x2apic = -1;
>  
>  xentoollog_level minmsglevel = minmsglevel_default;
>  
> @@ -201,6 +203,12 @@ static void parse_global_config(const char *configfile,
>      if (!xlu_cfg_get_long (config, "claim_mode", &l, 0))
>          claim_mode = l;
>  
> +    if (!xlu_cfg_get_long (config, "assisted_xapic", &l, 0))
> +        assisted_xapic = l;
> +
> +    if (!xlu_cfg_get_long (config, "assisted_x2apic", &l, 0))
> +        assisted_x2apic = l;
> +
>      xlu_cfg_replace_string (config, "remus.default.netbufscript",
>          &default_remus_netbufscript, 0);
>      xlu_cfg_replace_string (config, "colo.default.proxyscript",
> diff --git a/tools/xl/xl.h b/tools/xl/xl.h
> index c5c4bedbdd..528deb3feb 100644
> --- a/tools/xl/xl.h
> +++ b/tools/xl/xl.h
> @@ -286,6 +286,8 @@ extern libxl_bitmap global_vm_affinity_mask;
>  extern libxl_bitmap global_hvm_affinity_mask;
>  extern libxl_bitmap global_pv_affinity_mask;
>  extern libxl_domid domid_policy;
> +extern int assisted_xapic;
> +extern int assisted_x2apic;
>  
>  enum output_format {
>      OUTPUT_FORMAT_JSON,
> diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
> index 117fcdcb2b..0ab9b145fe 100644
> --- a/tools/xl/xl_parse.c
> +++ b/tools/xl/xl_parse.c
> @@ -1681,6 +1681,22 @@ void parse_config_data(const char *config_source,
>          xlu_cfg_get_defbool(config, "vpt_align", &b_info->u.hvm.vpt_align, 0);
>          xlu_cfg_get_defbool(config, "apic", &b_info->apic, 0);
>  
> +        e = xlu_cfg_get_long(config, "assisted_xapic", &l , 0);
> +        if ((e == ESRCH && assisted_xapic != -1)) /* use global default if present */
> +            libxl_defbool_set(&b_info->arch_x86.assisted_xapic, assisted_xapic);
> +        else if (!e)
> +            libxl_defbool_set(&b_info->arch_x86.assisted_xapic, l);
> +        else
> +            exit(1);
> +
> +        e = xlu_cfg_get_long(config, "assisted_x2apic", &l, 0);
> +        if ((e == ESRCH && assisted_x2apic != -1)) /* use global default if present */
> +            libxl_defbool_set(&b_info->arch_x86.assisted_x2apic, assisted_x2apic);
> +        else if (!e)
> +            libxl_defbool_set(&b_info->arch_x86.assisted_x2apic, l);
> +        else
> +            exit(1);
> +
>          switch (xlu_cfg_get_list(config, "viridian",
>                                   &viridian, &num_viridian, 1))
>          {
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index a5048ed654..bcca0dc900 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -619,6 +619,8 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
>      bool hvm = config->flags & XEN_DOMCTL_CDF_hvm;
>      bool hap = config->flags & XEN_DOMCTL_CDF_hap;
>      bool nested_virt = config->flags & XEN_DOMCTL_CDF_nested_virt;
> +    bool assisted_xapic = config->arch.misc_flags & XEN_X86_ASSISTED_XAPIC;
> +    bool assisted_x2apic = config->arch.misc_flags & XEN_X86_ASSISTED_X2APIC;
>      unsigned int max_vcpus;
>  
>      if ( hvm ? !hvm_enabled : !IS_ENABLED(CONFIG_PV) )
> @@ -685,13 +687,31 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
>          }
>      }
>  
> -    if ( config->arch.misc_flags & ~XEN_X86_MSR_RELAXED )
> +    if ( config->arch.misc_flags & ~(XEN_X86_MSR_RELAXED |
> +                                     XEN_X86_ASSISTED_XAPIC |
> +                                     XEN_X86_ASSISTED_X2APIC) )
>      {
>          dprintk(XENLOG_INFO, "Invalid arch misc flags %#x\n",
>                  config->arch.misc_flags);
>          return -EINVAL;
>      }
>  
> +    if ( (assisted_xapic || assisted_x2apic) && !hvm )
> +    {
> +        dprintk(XENLOG_INFO,
> +                "Interrupt Controller Virtualization not supported for PV\n");
> +        return -EINVAL;
> +    }
> +
> +    if ( (assisted_xapic && !assisted_xapic_available) ||
> +         (assisted_x2apic && !assisted_x2apic_available) )
> +    {
> +        dprintk(XENLOG_INFO,
> +                "Hardware assisted x%sAPIC requested but not available\n",
> +                assisted_xapic && !assisted_xapic_available ? "" : "2");
> +        return -EINVAL;

I think for those two you could return -ENODEV if others agree.

> +    }
> +
>      return 0;
>  }
>  
> @@ -864,6 +884,12 @@ int arch_domain_create(struct domain *d,
>  
>      d->arch.msr_relaxed = config->arch.misc_flags & XEN_X86_MSR_RELAXED;
>  
> +    d->arch.hvm.assisted_xapic =
> +        config->arch.misc_flags & XEN_X86_ASSISTED_XAPIC;
> +
> +    d->arch.hvm.assisted_x2apic =
> +        config->arch.misc_flags & XEN_X86_ASSISTED_X2APIC;
> +
>      return 0;
>  
>   fail:
> diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
> index 06831099ed..e4503a02a7 100644
> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> @@ -1157,6 +1157,10 @@ static int construct_vmcs(struct vcpu *v)
>          __vmwrite(PLE_WINDOW, ple_window);
>      }
>  
> +    if ( !has_assisted_xapic(v->domain) )
> +        v->arch.hvm.vmx.secondary_exec_control &=
> +            ~SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
> +
>      if ( cpu_has_vmx_secondary_exec_control )
>          __vmwrite(SECONDARY_VM_EXEC_CONTROL,
>                    v->arch.hvm.vmx.secondary_exec_control);
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index c075370f64..949ddd684c 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -3344,16 +3344,11 @@ static void vmx_install_vlapic_mapping(struct vcpu *v)
>  
>  void vmx_vlapic_msr_changed(struct vcpu *v)
>  {
> -    int virtualize_x2apic_mode;
>      struct vlapic *vlapic = vcpu_vlapic(v);
>      unsigned int msr;
>  
> -    virtualize_x2apic_mode = ( (cpu_has_vmx_apic_reg_virt ||
> -                                cpu_has_vmx_virtual_intr_delivery) &&
> -                               cpu_has_vmx_virtualize_x2apic_mode );
> -
> -    if ( !cpu_has_vmx_virtualize_apic_accesses &&
> -         !virtualize_x2apic_mode )
> +    if ( !has_assisted_xapic(v->domain) &&
> +         !has_assisted_x2apic(v->domain) )
>          return;
>  
>      vmx_vmcs_enter(v);
> @@ -3363,7 +3358,7 @@ void vmx_vlapic_msr_changed(struct vcpu *v)
>      if ( !vlapic_hw_disabled(vlapic) &&
>           (vlapic_base_address(vlapic) == APIC_DEFAULT_PHYS_BASE) )
>      {
> -        if ( virtualize_x2apic_mode && vlapic_x2apic_mode(vlapic) )
> +        if ( has_assisted_x2apic(v->domain) && vlapic_x2apic_mode(vlapic) )
>          {
>              v->arch.hvm.vmx.secondary_exec_control |=
>                  SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE;
> @@ -3384,7 +3379,7 @@ void vmx_vlapic_msr_changed(struct vcpu *v)
>                  vmx_clear_msr_intercept(v, MSR_X2APIC_SELF, VMX_MSR_W);
>              }
>          }
> -        else
> +        else if ( has_assisted_xapic(v->domain) )
>              v->arch.hvm.vmx.secondary_exec_control |=
>                  SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
>      }
> diff --git a/xen/arch/x86/include/asm/hvm/domain.h b/xen/arch/x86/include/asm/hvm/domain.h
> index 698455444e..92bf53483c 100644
> --- a/xen/arch/x86/include/asm/hvm/domain.h
> +++ b/xen/arch/x86/include/asm/hvm/domain.h
> @@ -117,6 +117,12 @@ struct hvm_domain {
>  
>      bool                   is_s3_suspended;
>  
> +    /* xAPIC hardware assisted virtualization. */
> +    bool                   assisted_xapic;
> +
> +    /* x2APIC hardware assisted virtualization. */
> +    bool                   assisted_x2apic;
> +
>      /* hypervisor intercepted msix table */
>      struct list_head       msixtbl_list;
>  
> diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
> index 9119aa8536..5b7d662ed7 100644
> --- a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
> +++ b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
> @@ -220,6 +220,9 @@ void vmx_vmcs_reload(struct vcpu *v);
>  #define CPU_BASED_ACTIVATE_SECONDARY_CONTROLS 0x80000000
>  extern u32 vmx_cpu_based_exec_control;
>  
> +#define has_assisted_xapic(d)   ((d)->arch.hvm.assisted_xapic)
> +#define has_assisted_x2apic(d)  ((d)->arch.hvm.assisted_x2apic)

Those macros should not be in an Intel specific header,
arch/x86/include/asm/hvm/domain.h is likely a better place.

> +
>  #define PIN_BASED_EXT_INTR_MASK         0x00000001
>  #define PIN_BASED_NMI_EXITING           0x00000008
>  #define PIN_BASED_VIRTUAL_NMIS          0x00000020
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index a2278d9499..a0c6b89a88 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1121,7 +1121,8 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
>          if ( !is_hvm_domain(d) || subleaf != 0 )
>              break;
>  
> -        if ( cpu_has_vmx_apic_reg_virt )
> +        if ( cpu_has_vmx_apic_reg_virt &&
> +             has_assisted_xapic(d) )
>              res->a |= XEN_HVM_CPUID_APIC_ACCESS_VIRT;
>  
>          /*
> @@ -1130,9 +1131,9 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
>           * and wrmsr in the guest will run without VMEXITs (see
>           * vmx_vlapic_msr_changed()).
>           */
> -        if ( cpu_has_vmx_virtualize_x2apic_mode &&
> -             cpu_has_vmx_apic_reg_virt &&
> -             cpu_has_vmx_virtual_intr_delivery )
> +        if ( cpu_has_vmx_apic_reg_virt &&
> +             cpu_has_vmx_virtual_intr_delivery &&
> +             has_assisted_x2apic(d) )

This will result in less code changes if you just replace
cpu_has_vmx_virtualize_x2apic_mode with has_assisted_x2apic(d).

Thanks, Roger.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
  2022-03-08 11:38   ` Roger Pau Monné
@ 2022-03-08 12:24     ` Jan Beulich
  2022-03-08 12:33       ` Roger Pau Monné
  2022-03-08 15:44     ` Jane Malalane
  1 sibling, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2022-03-08 12:24 UTC (permalink / raw)
  To: Roger Pau Monné, Jane Malalane
  Cc: Xen-devel, Wei Liu, Anthony PERARD, Juergen Gross, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini,
	Christian Lindig, David Scott, Volodymyr Babchuk

On 08.03.2022 12:38, Roger Pau Monné wrote:
> On Mon, Mar 07, 2022 at 03:06:09PM +0000, Jane Malalane wrote:
>> @@ -685,13 +687,31 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
>>          }
>>      }
>>  
>> -    if ( config->arch.misc_flags & ~XEN_X86_MSR_RELAXED )
>> +    if ( config->arch.misc_flags & ~(XEN_X86_MSR_RELAXED |
>> +                                     XEN_X86_ASSISTED_XAPIC |
>> +                                     XEN_X86_ASSISTED_X2APIC) )
>>      {
>>          dprintk(XENLOG_INFO, "Invalid arch misc flags %#x\n",
>>                  config->arch.misc_flags);
>>          return -EINVAL;
>>      }
>>  
>> +    if ( (assisted_xapic || assisted_x2apic) && !hvm )
>> +    {
>> +        dprintk(XENLOG_INFO,
>> +                "Interrupt Controller Virtualization not supported for PV\n");
>> +        return -EINVAL;
>> +    }
>> +
>> +    if ( (assisted_xapic && !assisted_xapic_available) ||
>> +         (assisted_x2apic && !assisted_x2apic_available) )
>> +    {
>> +        dprintk(XENLOG_INFO,
>> +                "Hardware assisted x%sAPIC requested but not available\n",
>> +                assisted_xapic && !assisted_xapic_available ? "" : "2");
>> +        return -EINVAL;
> 
> I think for those two you could return -ENODEV if others agree.

If by "two" you mean the xAPIC and x2APIC aspects here (and not e.g. this
and the earlier if()), then I agree. I'm always in favor of using distinct
error codes when possible and at least halfway sensible.

Jan



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
  2022-03-08 12:24     ` Jan Beulich
@ 2022-03-08 12:33       ` Roger Pau Monné
  2022-03-08 14:31         ` Jane Malalane
  0 siblings, 1 reply; 27+ messages in thread
From: Roger Pau Monné @ 2022-03-08 12:33 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Jane Malalane, Xen-devel, Wei Liu, Anthony PERARD, Juergen Gross,
	Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Christian Lindig, David Scott, Volodymyr Babchuk

On Tue, Mar 08, 2022 at 01:24:23PM +0100, Jan Beulich wrote:
> On 08.03.2022 12:38, Roger Pau Monné wrote:
> > On Mon, Mar 07, 2022 at 03:06:09PM +0000, Jane Malalane wrote:
> >> @@ -685,13 +687,31 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
> >>          }
> >>      }
> >>  
> >> -    if ( config->arch.misc_flags & ~XEN_X86_MSR_RELAXED )
> >> +    if ( config->arch.misc_flags & ~(XEN_X86_MSR_RELAXED |
> >> +                                     XEN_X86_ASSISTED_XAPIC |
> >> +                                     XEN_X86_ASSISTED_X2APIC) )
> >>      {
> >>          dprintk(XENLOG_INFO, "Invalid arch misc flags %#x\n",
> >>                  config->arch.misc_flags);
> >>          return -EINVAL;
> >>      }
> >>  
> >> +    if ( (assisted_xapic || assisted_x2apic) && !hvm )
> >> +    {
> >> +        dprintk(XENLOG_INFO,
> >> +                "Interrupt Controller Virtualization not supported for PV\n");
> >> +        return -EINVAL;
> >> +    }
> >> +
> >> +    if ( (assisted_xapic && !assisted_xapic_available) ||
> >> +         (assisted_x2apic && !assisted_x2apic_available) )
> >> +    {
> >> +        dprintk(XENLOG_INFO,
> >> +                "Hardware assisted x%sAPIC requested but not available\n",
> >> +                assisted_xapic && !assisted_xapic_available ? "" : "2");
> >> +        return -EINVAL;
> > 
> > I think for those two you could return -ENODEV if others agree.
> 
> If by "two" you mean the xAPIC and x2APIC aspects here (and not e.g. this
> and the earlier if()), then I agree. I'm always in favor of using distinct
> error codes when possible and at least halfway sensible.

I would be fine by using it for the !hvm if also. IMO it makes sense
as PV doesn't have an APIC 'device' at all, so ENODEV would seem
fitting. EINVAL is also fine as the caller shouldn't even attempt that
in the first place.

So let's use it for the last if only.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
  2022-03-08 12:33       ` Roger Pau Monné
@ 2022-03-08 14:31         ` Jane Malalane
  2022-03-08 14:41           ` Jan Beulich
  0 siblings, 1 reply; 27+ messages in thread
From: Jane Malalane @ 2022-03-08 14:31 UTC (permalink / raw)
  To: Roger Pau Monne, Jan Beulich
  Cc: Xen-devel, Wei Liu, Anthony Perard, Juergen Gross, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini,
	Christian Lindig, David Scott, Volodymyr Babchuk

On 08/03/2022 12:33, Roger Pau Monné wrote:
> On Tue, Mar 08, 2022 at 01:24:23PM +0100, Jan Beulich wrote:
>> On 08.03.2022 12:38, Roger Pau Monné wrote:
>>> On Mon, Mar 07, 2022 at 03:06:09PM +0000, Jane Malalane wrote:
>>>> @@ -685,13 +687,31 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
>>>>           }
>>>>       }
>>>>   
>>>> -    if ( config->arch.misc_flags & ~XEN_X86_MSR_RELAXED )
>>>> +    if ( config->arch.misc_flags & ~(XEN_X86_MSR_RELAXED |
>>>> +                                     XEN_X86_ASSISTED_XAPIC |
>>>> +                                     XEN_X86_ASSISTED_X2APIC) )
>>>>       {
>>>>           dprintk(XENLOG_INFO, "Invalid arch misc flags %#x\n",
>>>>                   config->arch.misc_flags);
>>>>           return -EINVAL;
>>>>       }
>>>>   
>>>> +    if ( (assisted_xapic || assisted_x2apic) && !hvm )
>>>> +    {
>>>> +        dprintk(XENLOG_INFO,
>>>> +                "Interrupt Controller Virtualization not supported for PV\n");
>>>> +        return -EINVAL;
>>>> +    }
>>>> +
>>>> +    if ( (assisted_xapic && !assisted_xapic_available) ||
>>>> +         (assisted_x2apic && !assisted_x2apic_available) )
>>>> +    {
>>>> +        dprintk(XENLOG_INFO,
>>>> +                "Hardware assisted x%sAPIC requested but not available\n",
>>>> +                assisted_xapic && !assisted_xapic_available ? "" : "2");
>>>> +        return -EINVAL;
>>>
>>> I think for those two you could return -ENODEV if others agree.
>>
>> If by "two" you mean the xAPIC and x2APIC aspects here (and not e.g. this
>> and the earlier if()), then I agree. I'm always in favor of using distinct
>> error codes when possible and at least halfway sensible.
> 
> I would be fine by using it for the !hvm if also. IMO it makes sense
> as PV doesn't have an APIC 'device' at all, so ENODEV would seem
> fitting. EINVAL is also fine as the caller shouldn't even attempt that
> in the first place.
> 
> So let's use it for the last if only.
Wouldn't it make more sense to use -ENODEV particularly for the first? I 
agree that -ENODEV should be reported in the first case because it 
doesn't make sense to request acceleration of something that doesn't 
exist and I should have put that. But having a look at the hap code 
(since it resembles the second case), it returns -EINVAL when it is not 
available, unless you deem this to be different or, in retrospective, 
that the hap code should too have been coded to return -ENODEV.

if ( hap && !hvm_hap_supported() )
     {
         dprintk(XENLOG_INFO, "HAP requested but not available\n");
         return -EINVAL;
     }

I agree with all the other comments and have made the approp changes for 
v6. Thanks a lot.

Jane.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
  2022-03-08 14:31         ` Jane Malalane
@ 2022-03-08 14:41           ` Jan Beulich
  2022-03-09 15:56             ` Jane Malalane
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2022-03-08 14:41 UTC (permalink / raw)
  To: Jane Malalane
  Cc: Xen-devel, Wei Liu, Anthony Perard, Juergen Gross, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini,
	Christian Lindig, David Scott, Volodymyr Babchuk,
	Roger Pau Monne

On 08.03.2022 15:31, Jane Malalane wrote:
> On 08/03/2022 12:33, Roger Pau Monné wrote:
>> On Tue, Mar 08, 2022 at 01:24:23PM +0100, Jan Beulich wrote:
>>> On 08.03.2022 12:38, Roger Pau Monné wrote:
>>>> On Mon, Mar 07, 2022 at 03:06:09PM +0000, Jane Malalane wrote:
>>>>> @@ -685,13 +687,31 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
>>>>>           }
>>>>>       }
>>>>>   
>>>>> -    if ( config->arch.misc_flags & ~XEN_X86_MSR_RELAXED )
>>>>> +    if ( config->arch.misc_flags & ~(XEN_X86_MSR_RELAXED |
>>>>> +                                     XEN_X86_ASSISTED_XAPIC |
>>>>> +                                     XEN_X86_ASSISTED_X2APIC) )
>>>>>       {
>>>>>           dprintk(XENLOG_INFO, "Invalid arch misc flags %#x\n",
>>>>>                   config->arch.misc_flags);
>>>>>           return -EINVAL;
>>>>>       }
>>>>>   
>>>>> +    if ( (assisted_xapic || assisted_x2apic) && !hvm )
>>>>> +    {
>>>>> +        dprintk(XENLOG_INFO,
>>>>> +                "Interrupt Controller Virtualization not supported for PV\n");
>>>>> +        return -EINVAL;
>>>>> +    }
>>>>> +
>>>>> +    if ( (assisted_xapic && !assisted_xapic_available) ||
>>>>> +         (assisted_x2apic && !assisted_x2apic_available) )
>>>>> +    {
>>>>> +        dprintk(XENLOG_INFO,
>>>>> +                "Hardware assisted x%sAPIC requested but not available\n",
>>>>> +                assisted_xapic && !assisted_xapic_available ? "" : "2");
>>>>> +        return -EINVAL;
>>>>
>>>> I think for those two you could return -ENODEV if others agree.
>>>
>>> If by "two" you mean the xAPIC and x2APIC aspects here (and not e.g. this
>>> and the earlier if()), then I agree. I'm always in favor of using distinct
>>> error codes when possible and at least halfway sensible.
>>
>> I would be fine by using it for the !hvm if also. IMO it makes sense
>> as PV doesn't have an APIC 'device' at all, so ENODEV would seem
>> fitting. EINVAL is also fine as the caller shouldn't even attempt that
>> in the first place.
>>
>> So let's use it for the last if only.
> Wouldn't it make more sense to use -ENODEV particularly for the first? I 
> agree that -ENODEV should be reported in the first case because it 
> doesn't make sense to request acceleration of something that doesn't 
> exist and I should have put that. But having a look at the hap code 
> (since it resembles the second case), it returns -EINVAL when it is not 
> available, unless you deem this to be different or, in retrospective, 
> that the hap code should too have been coded to return -ENODEV.
> 
> if ( hap && !hvm_hap_supported() )
>      {
>          dprintk(XENLOG_INFO, "HAP requested but not available\n");
>          return -EINVAL;
>      }

This is just one of the examples where using -ENODEV as you suggest
would introduce an inconsistency. We use -EINVAL also for other
purely guest-type dependent checks.

Jan



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
  2022-03-08 11:38   ` Roger Pau Monné
  2022-03-08 12:24     ` Jan Beulich
@ 2022-03-08 15:44     ` Jane Malalane
  2022-03-08 16:02       ` Roger Pau Monné
  1 sibling, 1 reply; 27+ messages in thread
From: Jane Malalane @ 2022-03-08 15:44 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Xen-devel, Wei Liu, Anthony Perard, Juergen Gross, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Christian Lindig, David Scott, Volodymyr Babchuk

On 08/03/2022 11:38, Roger Pau Monné wrote:
> On Mon, Mar 07, 2022 at 03:06:09PM +0000, Jane Malalane wrote:
>> Introduce a new per-domain creation x86 specific flag to
>> select whether hardware assisted virtualization should be used for
>> x{2}APIC.
>>
>> A per-domain option is added to xl in order to select the usage of
>> x{2}APIC hardware assisted virtualization, as well as a global
>> configuration option.
>>
>> Having all APIC interaction exit to Xen for emulation is slow and can
>> induce much overhead. Hardware can speed up x{2}APIC by decoding the
>> APIC access and providing a VM exit with a more specific exit reason
>> than a regular EPT fault or by altogether avoiding a VM exit.
>>
>> On the other hand, being able to disable x{2}APIC hardware assisted
>> virtualization can be useful for testing and debugging purposes.
>>
>> Note: vmx_install_vlapic_mapping doesn't require modifications
>> regardless of whether the guest has "Virtualize APIC accesses" enabled
>> or not, i.e., setting the APIC_ACCESS_ADDR VMCS field is fine so long
>> as virtualize_apic_accesses is supported by the CPU.
>>
>> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> Signed-off-by: Jane Malalane <jane.malalane@citrix.com>
>> ---
>> CC: Wei Liu <wl@xen.org>
>> CC: Anthony PERARD <anthony.perard@citrix.com>
>> CC: Juergen Gross <jgross@suse.com>
>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: George Dunlap <george.dunlap@citrix.com>
>> CC: Jan Beulich <jbeulich@suse.com>
>> CC: Julien Grall <julien@xen.org>
>> CC: Stefano Stabellini <sstabellini@kernel.org>
>> CC: Christian Lindig <christian.lindig@citrix.com>
>> CC: David Scott <dave@recoil.org>
>> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
>> CC: "Roger Pau Monné" <roger.pau@citrix.com>
>>
>> v5:
>> * Revert v4 changes in vmx_vlapic_msr_changed(), preserving the use of
>>    the has_assisted_x{2}apic macros
>> * Following changes in assisted_x{2}apic_available definitions in
>>    patch 1, retighten conditionals for setting
>>    XEN_HVM_CPUID_APIC_ACCESS_VIRT and XEN_HVM_CPUID_X2APIC_VIRT in
>>    cpuid_hypervisor_leaves()
>>
>> v4:
>>   * Add has_assisted_x{2}apic macros and use them where appropriate
>>   * Replace CPU checks with per-domain assisted_x{2}apic control
>>     options in vmx_vlapic_msr_changed() and cpuid_hypervisor_leaves(),
>>     following edits to assisted_x{2}apic_available definitions in
>>     patch 1
>>     Note: new assisted_x{2}apic_available definitions make later
>>     cpu_has_vmx_apic_reg_virt and cpu_has_vmx_virtual_intr_delivery
>>     checks redundant in vmx_vlapic_msr_changed()
>>
>> v3:
>>   * Change info in xl.cfg to better express reality and fix
>>     capitalization of x{2}apic
>>   * Move "physinfo" variable definition to the beggining of
>>     libxl__domain_build_info_setdefault()
>>   * Reposition brackets in if statement to match libxl coding style
>>   * Shorten logic in libxl__arch_domain_build_info_setdefault()
>>   * Correct dprintk message in arch_sanitise_domain_config()
>>   * Make appropriate changes in vmx_vlapic_msr_changed() and
>>     cpuid_hypervisor_leaves() for amended "assisted_x2apic" bit
>>   * Remove unneeded parantheses
>>
>> v2:
>>   * Add a LIBXL_HAVE_ASSISTED_APIC macro
>>   * Pass xcpyshinfo as a pointer in libxl__arch_get_physinfo
>>   * Add a return statement in now "int"
>>     libxl__arch_domain_build_info_setdefault()
>>   * Preserve libxl__arch_domain_build_info_setdefault 's location in
>>     libxl_create.c
>>   * Correct x{2}apic default setting logic in
>>     libxl__arch_domain_prepare_config()
>>   * Correct logic for parsing assisted_x{2}apic host/guest options in
>>     xl_parse.c and initialize them to -1 in xl.c
>>   * Use guest options directly in vmx_vlapic_msr_changed
>>   * Fix indentation of bool assisted_x{2}apic in struct hvm_domain
>>   * Add a change in xenctrl_stubs.c to pass xenctrl ABI checks
>> ---
>>   docs/man/xl.cfg.5.pod.in                | 19 +++++++++++++++++++
>>   docs/man/xl.conf.5.pod.in               | 12 ++++++++++++
>>   tools/golang/xenlight/helpers.gen.go    | 12 ++++++++++++
>>   tools/golang/xenlight/types.gen.go      |  2 ++
>>   tools/include/libxl.h                   |  7 +++++++
>>   tools/libs/light/libxl_arch.h           |  5 +++--
>>   tools/libs/light/libxl_arm.c            |  7 +++++--
>>   tools/libs/light/libxl_create.c         | 22 +++++++++++++---------
>>   tools/libs/light/libxl_types.idl        |  2 ++
>>   tools/libs/light/libxl_x86.c            | 28 ++++++++++++++++++++++++++--
>>   tools/ocaml/libs/xc/xenctrl.ml          |  2 ++
>>   tools/ocaml/libs/xc/xenctrl.mli         |  2 ++
>>   tools/ocaml/libs/xc/xenctrl_stubs.c     |  2 +-
>>   tools/xl/xl.c                           |  8 ++++++++
>>   tools/xl/xl.h                           |  2 ++
>>   tools/xl/xl_parse.c                     | 16 ++++++++++++++++
>>   xen/arch/x86/domain.c                   | 28 +++++++++++++++++++++++++++-
>>   xen/arch/x86/hvm/vmx/vmcs.c             |  4 ++++
>>   xen/arch/x86/hvm/vmx/vmx.c              | 13 ++++---------
>>   xen/arch/x86/include/asm/hvm/domain.h   |  6 ++++++
>>   xen/arch/x86/include/asm/hvm/vmx/vmcs.h |  3 +++
>>   xen/arch/x86/traps.c                    |  9 +++++----
>>   xen/include/public/arch-x86/xen.h       |  2 ++
>>   23 files changed, 183 insertions(+), 30 deletions(-)
>>
>> diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
>> index b98d161398..dcca564a23 100644
>> --- a/docs/man/xl.cfg.5.pod.in
>> +++ b/docs/man/xl.cfg.5.pod.in
>> @@ -1862,6 +1862,25 @@ firmware tables when using certain older guest Operating
>>   Systems. These tables have been superseded by newer constructs within
>>   the ACPI tables.
>>   
>> +=item B<assisted_xapic=BOOLEAN>
>> +
>> +B<(x86 only)> Enables or disables hardware assisted virtualization for
>> +xAPIC. With this option enabled, a memory-mapped APIC access will be
>> +decoded by hardware and either issue a more specific VM exit than just
>> +an EPT fault, or altogether avoid a VM exit. Notice full
>> +virtualization for xAPIC can only be achieved if hardware supports
>> +“APIC-register virtualization” and “virtual-interrupt delivery”.
> 
> You shouldn't mention “APIC-register virtualization” or
> “virtual-interrupt delivery”, as those are Intel specific options. I
> would just remove that sentence (same below).
> 
>> The
>> +default is settable via L<xl.conf(5)>.
>> +
>> +=item B<assisted_x2apic=BOOLEAN>
>> +
>> +B<(x86 only)> Enables or disables hardware assisted virtualization for
>> +x2APIC. With this option enabled, an MSR-Based APIC access will
>> +either issue a VM exit or altogether avoid one.
> 
> "With this option enabled, certain accesses to MSR APIC registers will
> avoid a VM exit into the hypervisor."
> 
>> Notice full
>> +virtualization for x2APIC can only be achieved if hardware supports
>> +“APIC-register virtualization” and “virtual-interrupt delivery”. The
>> +default is settable via L<xl.conf(5)>.
>> +
>>   =item B<nx=BOOLEAN>
>>   
>>   B<(x86 only)> Hides or exposes the No-eXecute capability. This allows a guest
>> diff --git a/docs/man/xl.conf.5.pod.in b/docs/man/xl.conf.5.pod.in
>> index df20c08137..95d136d1ea 100644
>> --- a/docs/man/xl.conf.5.pod.in
>> +++ b/docs/man/xl.conf.5.pod.in
>> @@ -107,6 +107,18 @@ Sets the default value for the C<max_grant_version> domain config value.
>>   
>>   Default: maximum grant version supported by the hypervisor.
>>   
>> +=item B<assisted_xapic=BOOLEAN>
>> +
>> +If enabled, domains will use xAPIC hardware assisted virtualization by default.
>> +
>> +Default: enabled if supported.
>> +
>> +=item B<assisted_x2apic=BOOLEAN>
>> +
>> +If enabled, domains will use x2APIC hardware assisted virtualization by default.
>> +
>> +Default: enabled if supported.
>> +
>>   =item B<vif.default.script="PATH">
>>   
>>   Configures the default hotplug script used by virtual network devices.
>> diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
>> index dd4e6c9f14..dece545ee0 100644
>> --- a/tools/golang/xenlight/helpers.gen.go
>> +++ b/tools/golang/xenlight/helpers.gen.go
>> @@ -1120,6 +1120,12 @@ x.ArchArm.Vuart = VuartType(xc.arch_arm.vuart)
>>   if err := x.ArchX86.MsrRelaxed.fromC(&xc.arch_x86.msr_relaxed);err != nil {
>>   return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
>>   }
>> +if err := x.ArchX86.AssistedXapic.fromC(&xc.arch_x86.assisted_xapic);err != nil {
>> +return fmt.Errorf("converting field ArchX86.AssistedXapic: %v", err)
>> +}
>> +if err := x.ArchX86.AssistedX2Apic.fromC(&xc.arch_x86.assisted_x2apic);err != nil {
>> +return fmt.Errorf("converting field ArchX86.AssistedX2Apic: %v", err)
>> +}
>>   x.Altp2M = Altp2MMode(xc.altp2m)
>>   x.VmtraceBufKb = int(xc.vmtrace_buf_kb)
>>   if err := x.Vpmu.fromC(&xc.vpmu);err != nil {
>> @@ -1605,6 +1611,12 @@ xc.arch_arm.vuart = C.libxl_vuart_type(x.ArchArm.Vuart)
>>   if err := x.ArchX86.MsrRelaxed.toC(&xc.arch_x86.msr_relaxed); err != nil {
>>   return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
>>   }
>> +if err := x.ArchX86.AssistedXapic.toC(&xc.arch_x86.assisted_xapic); err != nil {
>> +return fmt.Errorf("converting field ArchX86.AssistedXapic: %v", err)
>> +}
>> +if err := x.ArchX86.AssistedX2Apic.toC(&xc.arch_x86.assisted_x2apic); err != nil {
>> +return fmt.Errorf("converting field ArchX86.AssistedX2Apic: %v", err)
>> +}
>>   xc.altp2m = C.libxl_altp2m_mode(x.Altp2M)
>>   xc.vmtrace_buf_kb = C.int(x.VmtraceBufKb)
>>   if err := x.Vpmu.toC(&xc.vpmu); err != nil {
>> diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
>> index 87be46c745..253c9ad93d 100644
>> --- a/tools/golang/xenlight/types.gen.go
>> +++ b/tools/golang/xenlight/types.gen.go
>> @@ -520,6 +520,8 @@ Vuart VuartType
>>   }
>>   ArchX86 struct {
>>   MsrRelaxed Defbool
>> +AssistedXapic Defbool
>> +AssistedX2Apic Defbool
>>   }
>>   Altp2M Altp2MMode
>>   VmtraceBufKb int
>> diff --git a/tools/include/libxl.h b/tools/include/libxl.h
>> index 94e6355822..cdcccd6d01 100644
>> --- a/tools/include/libxl.h
>> +++ b/tools/include/libxl.h
>> @@ -535,6 +535,13 @@
>>   #define LIBXL_HAVE_PHYSINFO_ASSISTED_APIC 1
>>   
>>   /*
>> + * LIBXL_HAVE_ASSISTED_APIC indicates that libxl_domain_build_info has
>> + * assisted_xapic and assisted_x2apic fields for enabling hardware
>> + * assisted virtualization for x{2}apic per domain.
>> + */
>> +#define LIBXL_HAVE_ASSISTED_APIC 1
>> +
>> +/*
>>    * libxl ABI compatibility
>>    *
>>    * The only guarantee which libxl makes regarding ABI compatibility
>> diff --git a/tools/libs/light/libxl_arch.h b/tools/libs/light/libxl_arch.h
>> index 207ceac6a1..03b89929e6 100644
>> --- a/tools/libs/light/libxl_arch.h
>> +++ b/tools/libs/light/libxl_arch.h
>> @@ -71,8 +71,9 @@ void libxl__arch_domain_create_info_setdefault(libxl__gc *gc,
>>                                                  libxl_domain_create_info *c_info);
>>   
>>   _hidden
>> -void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
>> -                                              libxl_domain_build_info *b_info);
>> +int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
>> +                                             libxl_domain_build_info *b_info,
>> +                                             const libxl_physinfo *physinfo);
>>   
>>   _hidden
>>   int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
>> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
>> index 39fdca1b49..ba5b8f433f 100644
>> --- a/tools/libs/light/libxl_arm.c
>> +++ b/tools/libs/light/libxl_arm.c
>> @@ -1384,8 +1384,9 @@ void libxl__arch_domain_create_info_setdefault(libxl__gc *gc,
>>       }
>>   }
>>   
>> -void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
>> -                                              libxl_domain_build_info *b_info)
>> +int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
>> +                                             libxl_domain_build_info *b_info,
>> +                                             const libxl_physinfo *physinfo)
>>   {
>>       /* ACPI is disabled by default */
>>       libxl_defbool_setdefault(&b_info->acpi, false);
>> @@ -1399,6 +1400,8 @@ void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
>>       memset(&b_info->u, '\0', sizeof(b_info->u));
>>       b_info->type = LIBXL_DOMAIN_TYPE_INVALID;
>>       libxl_domain_build_info_init_type(b_info, LIBXL_DOMAIN_TYPE_PVH);
>> +
>> +    return 0;
>>   }
>>   
>>   int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
>> diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
>> index 15ed021f41..88d08d7277 100644
>> --- a/tools/libs/light/libxl_create.c
>> +++ b/tools/libs/light/libxl_create.c
>> @@ -75,6 +75,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
>>                                           libxl_domain_build_info *b_info)
>>   {
>>       int i, rc;
>> +    libxl_physinfo info;
>>   
>>       if (b_info->type != LIBXL_DOMAIN_TYPE_HVM &&
>>           b_info->type != LIBXL_DOMAIN_TYPE_PV &&
>> @@ -264,7 +265,18 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
>>       if (!b_info->event_channels)
>>           b_info->event_channels = 1023;
>>   
>> -    libxl__arch_domain_build_info_setdefault(gc, b_info);
>> +    rc = libxl_get_physinfo(CTX, &info);
>> +    if (rc) {
>> +        LOG(ERROR, "failed to get hypervisor info");
>> +        return rc;
>> +    }
>> +
>> +    rc = libxl__arch_domain_build_info_setdefault(gc, b_info, &info);
>> +    if (rc) {
>> +        LOG(ERROR, "unable to set domain arch build info defaults");
>> +        return rc;
>> +    }
>> +
>>       libxl_defbool_setdefault(&b_info->dm_restrict, false);
>>   
>>       if (b_info->iommu_memkb == LIBXL_MEMKB_DEFAULT)
>> @@ -457,14 +469,6 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
>>       }
>>   
>>       if (b_info->max_grant_version == LIBXL_MAX_GRANT_DEFAULT) {
>> -        libxl_physinfo info;
>> -
>> -        rc = libxl_get_physinfo(CTX, &info);
>> -        if (rc) {
>> -            LOG(ERROR, "failed to get hypervisor info");
>> -            return rc;
>> -        }
>> -
>>           if (info.cap_gnttab_v2)
>>               b_info->max_grant_version = 2;
>>           else if (info.cap_gnttab_v1)
>> diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
>> index 42ac6c357b..db5eb0a0b3 100644
>> --- a/tools/libs/light/libxl_types.idl
>> +++ b/tools/libs/light/libxl_types.idl
>> @@ -648,6 +648,8 @@ libxl_domain_build_info = Struct("domain_build_info",[
>>                                  ("vuart", libxl_vuart_type),
>>                                 ])),
>>       ("arch_x86", Struct(None, [("msr_relaxed", libxl_defbool),
>> +                               ("assisted_xapic", libxl_defbool),
>> +                               ("assisted_x2apic", libxl_defbool),
>>                                 ])),
>>       # Alternate p2m is not bound to any architecture or guest type, as it is
>>       # supported by x86 HVM and ARM support is planned.
>> diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
>> index e0a06ecfe3..c377d13b19 100644
>> --- a/tools/libs/light/libxl_x86.c
>> +++ b/tools/libs/light/libxl_x86.c
>> @@ -23,6 +23,14 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>>       if (libxl_defbool_val(d_config->b_info.arch_x86.msr_relaxed))
>>           config->arch.misc_flags |= XEN_X86_MSR_RELAXED;
>>   
>> +    if (d_config->c_info.type != LIBXL_DOMAIN_TYPE_PV)
>> +    {
>> +        if (libxl_defbool_val(d_config->b_info.arch_x86.assisted_xapic))
>> +            config->arch.misc_flags |= XEN_X86_ASSISTED_XAPIC;
>> +
>> +        if (libxl_defbool_val(d_config->b_info.arch_x86.assisted_x2apic))
>> +            config->arch.misc_flags |= XEN_X86_ASSISTED_X2APIC;
>> +    }
>>       return 0;
>>   }
>>   
>> @@ -819,11 +827,27 @@ void libxl__arch_domain_create_info_setdefault(libxl__gc *gc,
>>   {
>>   }
>>   
>> -void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
>> -                                              libxl_domain_build_info *b_info)
>> +int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
>> +                                             libxl_domain_build_info *b_info,
>> +                                             const libxl_physinfo *physinfo)
>>   {
>>       libxl_defbool_setdefault(&b_info->acpi, true);
>>       libxl_defbool_setdefault(&b_info->arch_x86.msr_relaxed, false);
>> +
>> +    if (b_info->type != LIBXL_DOMAIN_TYPE_PV) {
>> +        libxl_defbool_setdefault(&b_info->arch_x86.assisted_xapic,
>> +                             physinfo->cap_assisted_xapic);
>> +        libxl_defbool_setdefault(&b_info->arch_x86.assisted_x2apic,
>> +                             physinfo->cap_assisted_x2apic);
>> +    }
>> +
> 
> Extra newline? 'else if' should be one space after the closing
> bracket.
> 
>> +    else if (!libxl_defbool_is_default(b_info->arch_x86.assisted_xapic) ||
>> +             !libxl_defbool_is_default(b_info->arch_x86.assisted_x2apic)) {
>> +        LOG(ERROR, "Interrupt Controller Virtualization not supported for PV");
>> +        return ERROR_INVAL;
>> +    }
>> +
>> +    return 0;
>>   }
>>   
>>   int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
>> diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
>> index 21783d3622..672a11ceb6 100644
>> --- a/tools/ocaml/libs/xc/xenctrl.ml
>> +++ b/tools/ocaml/libs/xc/xenctrl.ml
>> @@ -50,6 +50,8 @@ type x86_arch_emulation_flags =
>>   
>>   type x86_arch_misc_flags =
>>   	| X86_MSR_RELAXED
>> +	| X86_ASSISTED_XAPIC
>> +	| X86_ASSISTED_X2APIC
>>   
>>   type xen_x86_arch_domainconfig =
>>   {
>> diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
>> index af6ba3d1a0..f9a6aa3a0f 100644
>> --- a/tools/ocaml/libs/xc/xenctrl.mli
>> +++ b/tools/ocaml/libs/xc/xenctrl.mli
>> @@ -44,6 +44,8 @@ type x86_arch_emulation_flags =
>>   
>>   type x86_arch_misc_flags =
>>     | X86_MSR_RELAXED
>> +  | X86_ASSISTED_XAPIC
>> +  | X86_ASSISTED_X2APIC
>>   
>>   type xen_x86_arch_domainconfig = {
>>     emulation_flags: x86_arch_emulation_flags list;
>> diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
>> index e0d49b18d2..ecfc7125d5 100644
>> --- a/tools/ocaml/libs/xc/xenctrl_stubs.c
>> +++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
>> @@ -239,7 +239,7 @@ CAMLprim value stub_xc_domain_create(value xch, value wanted_domid, value config
>>   
>>   		cfg.arch.misc_flags = ocaml_list_to_c_bitmap
>>   			/* ! x86_arch_misc_flags X86_ none */
>> -			/* ! XEN_X86_ XEN_X86_MSR_RELAXED all */
>> +			/* ! XEN_X86_ XEN_X86_ASSISTED_X2APIC max */
>>   			(VAL_MISC_FLAGS);
>>   
>>   #undef VAL_MISC_FLAGS
>> diff --git a/tools/xl/xl.c b/tools/xl/xl.c
>> index 2d1ec18ea3..31eb223309 100644
>> --- a/tools/xl/xl.c
>> +++ b/tools/xl/xl.c
>> @@ -57,6 +57,8 @@ int max_grant_frames = -1;
>>   int max_maptrack_frames = -1;
>>   int max_grant_version = LIBXL_MAX_GRANT_DEFAULT;
>>   libxl_domid domid_policy = INVALID_DOMID;
>> +int assisted_xapic = -1;
>> +int assisted_x2apic = -1;
>>   
>>   xentoollog_level minmsglevel = minmsglevel_default;
>>   
>> @@ -201,6 +203,12 @@ static void parse_global_config(const char *configfile,
>>       if (!xlu_cfg_get_long (config, "claim_mode", &l, 0))
>>           claim_mode = l;
>>   
>> +    if (!xlu_cfg_get_long (config, "assisted_xapic", &l, 0))
>> +        assisted_xapic = l;
>> +
>> +    if (!xlu_cfg_get_long (config, "assisted_x2apic", &l, 0))
>> +        assisted_x2apic = l;
>> +
>>       xlu_cfg_replace_string (config, "remus.default.netbufscript",
>>           &default_remus_netbufscript, 0);
>>       xlu_cfg_replace_string (config, "colo.default.proxyscript",
>> diff --git a/tools/xl/xl.h b/tools/xl/xl.h
>> index c5c4bedbdd..528deb3feb 100644
>> --- a/tools/xl/xl.h
>> +++ b/tools/xl/xl.h
>> @@ -286,6 +286,8 @@ extern libxl_bitmap global_vm_affinity_mask;
>>   extern libxl_bitmap global_hvm_affinity_mask;
>>   extern libxl_bitmap global_pv_affinity_mask;
>>   extern libxl_domid domid_policy;
>> +extern int assisted_xapic;
>> +extern int assisted_x2apic;
>>   
>>   enum output_format {
>>       OUTPUT_FORMAT_JSON,
>> diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
>> index 117fcdcb2b..0ab9b145fe 100644
>> --- a/tools/xl/xl_parse.c
>> +++ b/tools/xl/xl_parse.c
>> @@ -1681,6 +1681,22 @@ void parse_config_data(const char *config_source,
>>           xlu_cfg_get_defbool(config, "vpt_align", &b_info->u.hvm.vpt_align, 0);
>>           xlu_cfg_get_defbool(config, "apic", &b_info->apic, 0);
>>   
>> +        e = xlu_cfg_get_long(config, "assisted_xapic", &l , 0);
>> +        if ((e == ESRCH && assisted_xapic != -1)) /* use global default if present */
>> +            libxl_defbool_set(&b_info->arch_x86.assisted_xapic, assisted_xapic);
>> +        else if (!e)
>> +            libxl_defbool_set(&b_info->arch_x86.assisted_xapic, l);
>> +        else
>> +            exit(1);
>> +
>> +        e = xlu_cfg_get_long(config, "assisted_x2apic", &l, 0);
>> +        if ((e == ESRCH && assisted_x2apic != -1)) /* use global default if present */
>> +            libxl_defbool_set(&b_info->arch_x86.assisted_x2apic, assisted_x2apic);
>> +        else if (!e)
>> +            libxl_defbool_set(&b_info->arch_x86.assisted_x2apic, l);
>> +        else
>> +            exit(1);
>> +
>>           switch (xlu_cfg_get_list(config, "viridian",
>>                                    &viridian, &num_viridian, 1))
>>           {
>> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
>> index a5048ed654..bcca0dc900 100644
>> --- a/xen/arch/x86/domain.c
>> +++ b/xen/arch/x86/domain.c
>> @@ -619,6 +619,8 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
>>       bool hvm = config->flags & XEN_DOMCTL_CDF_hvm;
>>       bool hap = config->flags & XEN_DOMCTL_CDF_hap;
>>       bool nested_virt = config->flags & XEN_DOMCTL_CDF_nested_virt;
>> +    bool assisted_xapic = config->arch.misc_flags & XEN_X86_ASSISTED_XAPIC;
>> +    bool assisted_x2apic = config->arch.misc_flags & XEN_X86_ASSISTED_X2APIC;
>>       unsigned int max_vcpus;
>>   
>>       if ( hvm ? !hvm_enabled : !IS_ENABLED(CONFIG_PV) )
>> @@ -685,13 +687,31 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
>>           }
>>       }
>>   
>> -    if ( config->arch.misc_flags & ~XEN_X86_MSR_RELAXED )
>> +    if ( config->arch.misc_flags & ~(XEN_X86_MSR_RELAXED |
>> +                                     XEN_X86_ASSISTED_XAPIC |
>> +                                     XEN_X86_ASSISTED_X2APIC) )
>>       {
>>           dprintk(XENLOG_INFO, "Invalid arch misc flags %#x\n",
>>                   config->arch.misc_flags);
>>           return -EINVAL;
>>       }
>>   
>> +    if ( (assisted_xapic || assisted_x2apic) && !hvm )
>> +    {
>> +        dprintk(XENLOG_INFO,
>> +                "Interrupt Controller Virtualization not supported for PV\n");
>> +        return -EINVAL;
>> +    }
>> +
>> +    if ( (assisted_xapic && !assisted_xapic_available) ||
>> +         (assisted_x2apic && !assisted_x2apic_available) )
>> +    {
>> +        dprintk(XENLOG_INFO,
>> +                "Hardware assisted x%sAPIC requested but not available\n",
>> +                assisted_xapic && !assisted_xapic_available ? "" : "2");
>> +        return -EINVAL;
> 
> I think for those two you could return -ENODEV if others agree.
> 
>> +    }
>> +
>>       return 0;
>>   }
>>   
>> @@ -864,6 +884,12 @@ int arch_domain_create(struct domain *d,
>>   
>>       d->arch.msr_relaxed = config->arch.misc_flags & XEN_X86_MSR_RELAXED;
>>   
>> +    d->arch.hvm.assisted_xapic =
>> +        config->arch.misc_flags & XEN_X86_ASSISTED_XAPIC;
>> +
>> +    d->arch.hvm.assisted_x2apic =
>> +        config->arch.misc_flags & XEN_X86_ASSISTED_X2APIC;
>> +
>>       return 0;
>>   
>>    fail:
>> diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
>> index 06831099ed..e4503a02a7 100644
>> --- a/xen/arch/x86/hvm/vmx/vmcs.c
>> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
>> @@ -1157,6 +1157,10 @@ static int construct_vmcs(struct vcpu *v)
>>           __vmwrite(PLE_WINDOW, ple_window);
>>       }
>>   
>> +    if ( !has_assisted_xapic(v->domain) )
>> +        v->arch.hvm.vmx.secondary_exec_control &=
>> +            ~SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
>> +
>>       if ( cpu_has_vmx_secondary_exec_control )
>>           __vmwrite(SECONDARY_VM_EXEC_CONTROL,
>>                     v->arch.hvm.vmx.secondary_exec_control);
>> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
>> index c075370f64..949ddd684c 100644
>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> @@ -3344,16 +3344,11 @@ static void vmx_install_vlapic_mapping(struct vcpu *v)
>>   
>>   void vmx_vlapic_msr_changed(struct vcpu *v)
>>   {
>> -    int virtualize_x2apic_mode;
>>       struct vlapic *vlapic = vcpu_vlapic(v);
>>       unsigned int msr;
>>   
>> -    virtualize_x2apic_mode = ( (cpu_has_vmx_apic_reg_virt ||
>> -                                cpu_has_vmx_virtual_intr_delivery) &&
>> -                               cpu_has_vmx_virtualize_x2apic_mode );
>> -
>> -    if ( !cpu_has_vmx_virtualize_apic_accesses &&
>> -         !virtualize_x2apic_mode )
>> +    if ( !has_assisted_xapic(v->domain) &&
>> +         !has_assisted_x2apic(v->domain) )
>>           return;
>>   
>>       vmx_vmcs_enter(v);
>> @@ -3363,7 +3358,7 @@ void vmx_vlapic_msr_changed(struct vcpu *v)
>>       if ( !vlapic_hw_disabled(vlapic) &&
>>            (vlapic_base_address(vlapic) == APIC_DEFAULT_PHYS_BASE) )
>>       {
>> -        if ( virtualize_x2apic_mode && vlapic_x2apic_mode(vlapic) )
>> +        if ( has_assisted_x2apic(v->domain) && vlapic_x2apic_mode(vlapic) )
>>           {
>>               v->arch.hvm.vmx.secondary_exec_control |=
>>                   SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE;
>> @@ -3384,7 +3379,7 @@ void vmx_vlapic_msr_changed(struct vcpu *v)
>>                   vmx_clear_msr_intercept(v, MSR_X2APIC_SELF, VMX_MSR_W);
>>               }
>>           }
>> -        else
>> +        else if ( has_assisted_xapic(v->domain) )
>>               v->arch.hvm.vmx.secondary_exec_control |=
>>                   SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
>>       }
>> diff --git a/xen/arch/x86/include/asm/hvm/domain.h b/xen/arch/x86/include/asm/hvm/domain.h
>> index 698455444e..92bf53483c 100644
>> --- a/xen/arch/x86/include/asm/hvm/domain.h
>> +++ b/xen/arch/x86/include/asm/hvm/domain.h
>> @@ -117,6 +117,12 @@ struct hvm_domain {
>>   
>>       bool                   is_s3_suspended;
>>   
>> +    /* xAPIC hardware assisted virtualization. */
>> +    bool                   assisted_xapic;
>> +
>> +    /* x2APIC hardware assisted virtualization. */
>> +    bool                   assisted_x2apic;
>> +
>>       /* hypervisor intercepted msix table */
>>       struct list_head       msixtbl_list;
>>   
>> diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
>> index 9119aa8536..5b7d662ed7 100644
>> --- a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
>> +++ b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
>> @@ -220,6 +220,9 @@ void vmx_vmcs_reload(struct vcpu *v);
>>   #define CPU_BASED_ACTIVATE_SECONDARY_CONTROLS 0x80000000
>>   extern u32 vmx_cpu_based_exec_control;
>>   
>> +#define has_assisted_xapic(d)   ((d)->arch.hvm.assisted_xapic)
>> +#define has_assisted_x2apic(d)  ((d)->arch.hvm.assisted_x2apic)
> 
> Those macros should not be in an Intel specific header,
> arch/x86/include/asm/hvm/domain.h is likely a better place.

Actually do you think hvm.h could be better?

diff --git a/xen/arch/x86/include/asm/hvm/hvm.h 
b/xen/arch/x86/include/asm/hvm/hvm.h
index 5b7ec0cf69..65a978f670 100644
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -373,6 +373,12 @@ int hvm_get_param(struct domain *d, uint32_t index, 
uint64_t *value);
  #define hvm_tsc_scaling_ratio(d) \
      ((d)->arch.hvm.tsc_scaling_ratio)

+#define has_assisted_xapic(d) \
+    ((d)->arch.hvm.assisted_xapic)
+
+#define has_assisted_x2apic(d) \
+    ((d)->arch.hvm.assisted_x2apic)
+
  #define hvm_get_guest_time(v) hvm_get_guest_time_fixed(v, 0)

  #define hvm_paging_enabled(v) \

Thank you,

Jane.

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
  2022-03-08 15:44     ` Jane Malalane
@ 2022-03-08 16:02       ` Roger Pau Monné
  2022-03-08 16:13         ` Jane Malalane
  2022-03-08 16:16         ` Jane Malalane
  0 siblings, 2 replies; 27+ messages in thread
From: Roger Pau Monné @ 2022-03-08 16:02 UTC (permalink / raw)
  To: Jane Malalane
  Cc: Xen-devel, Wei Liu, Anthony Perard, Juergen Gross, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Christian Lindig, David Scott, Volodymyr Babchuk

On Tue, Mar 08, 2022 at 03:44:18PM +0000, Jane Malalane wrote:
> On 08/03/2022 11:38, Roger Pau Monné wrote:
> > On Mon, Mar 07, 2022 at 03:06:09PM +0000, Jane Malalane wrote:
> >> diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
> >> index 9119aa8536..5b7d662ed7 100644
> >> --- a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
> >> +++ b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
> >> @@ -220,6 +220,9 @@ void vmx_vmcs_reload(struct vcpu *v);
> >>   #define CPU_BASED_ACTIVATE_SECONDARY_CONTROLS 0x80000000
> >>   extern u32 vmx_cpu_based_exec_control;
> >>   
> >> +#define has_assisted_xapic(d)   ((d)->arch.hvm.assisted_xapic)
> >> +#define has_assisted_x2apic(d)  ((d)->arch.hvm.assisted_x2apic)
> > 
> > Those macros should not be in an Intel specific header,
> > arch/x86/include/asm/hvm/domain.h is likely a better place.
> 
> Actually do you think hvm.h could be better?

I guess that's also fine, I did suggest hvm/domain.h because that's
where the fields get declared. I guess you prefer hvm.h because there
are other HVM related helpers in there?

Thanks, Roger.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
  2022-03-08 16:02       ` Roger Pau Monné
@ 2022-03-08 16:13         ` Jane Malalane
  2022-03-08 16:16         ` Jane Malalane
  1 sibling, 0 replies; 27+ messages in thread
From: Jane Malalane @ 2022-03-08 16:13 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Xen-devel, Wei Liu, Anthony Perard, Juergen Gross, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Christian Lindig, David Scott, Volodymyr Babchuk

On 08/03/2022 16:02, Roger Pau Monné wrote:
> On Tue, Mar 08, 2022 at 03:44:18PM +0000, Jane Malalane wrote:
>> On 08/03/2022 11:38, Roger Pau Monné wrote:
>>> On Mon, Mar 07, 2022 at 03:06:09PM +0000, Jane Malalane wrote:
>>>> diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
>>>> index 9119aa8536..5b7d662ed7 100644
>>>> --- a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
>>>> +++ b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
>>>> @@ -220,6 +220,9 @@ void vmx_vmcs_reload(struct vcpu *v);
>>>>    #define CPU_BASED_ACTIVATE_SECONDARY_CONTROLS 0x80000000
>>>>    extern u32 vmx_cpu_based_exec_control;
>>>>    
>>>> +#define has_assisted_xapic(d)   ((d)->arch.hvm.assisted_xapic)
>>>> +#define has_assisted_x2apic(d)  ((d)->arch.hvm.assisted_x2apic)
>>>
>>> Those macros should not be in an Intel specific header,
>>> arch/x86/include/asm/hvm/domain.h is likely a better place.
>>
>> Actually do you think hvm.h could be better?
> 
> I guess that's also fine, I did suggest hvm/domain.h because that's
> where the fields get declared. I guess you prefer hvm.h because there
> are other HVM related helpers in there?

Yeah, that is why - tsc_scaling_ratio also gets defined in domain.h, for 
e.g.
Thank you for pointing this out,

Jane.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
  2022-03-08 16:02       ` Roger Pau Monné
  2022-03-08 16:13         ` Jane Malalane
@ 2022-03-08 16:16         ` Jane Malalane
  2022-03-08 16:22           ` Roger Pau Monné
  1 sibling, 1 reply; 27+ messages in thread
From: Jane Malalane @ 2022-03-08 16:16 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Xen-devel, Wei Liu, Anthony Perard, Juergen Gross, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Christian Lindig, David Scott, Volodymyr Babchuk

On 08/03/2022 16:02, Roger Pau Monné wrote:
> On Tue, Mar 08, 2022 at 03:44:18PM +0000, Jane Malalane wrote:
>> On 08/03/2022 11:38, Roger Pau Monné wrote:
>>> On Mon, Mar 07, 2022 at 03:06:09PM +0000, Jane Malalane wrote:
>>>> diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
>>>> index 9119aa8536..5b7d662ed7 100644
>>>> --- a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
>>>> +++ b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
>>>> @@ -220,6 +220,9 @@ void vmx_vmcs_reload(struct vcpu *v);
>>>>    #define CPU_BASED_ACTIVATE_SECONDARY_CONTROLS 0x80000000
>>>>    extern u32 vmx_cpu_based_exec_control;
>>>>    
>>>> +#define has_assisted_xapic(d)   ((d)->arch.hvm.assisted_xapic)
>>>> +#define has_assisted_x2apic(d)  ((d)->arch.hvm.assisted_x2apic)
>>>
>>> Those macros should not be in an Intel specific header,
>>> arch/x86/include/asm/hvm/domain.h is likely a better place.
>>
>> Actually do you think hvm.h could be better?
> 
> I guess that's also fine, I did suggest hvm/domain.h because that's
> where the fields get declared. I guess you prefer hvm.h because there
> are other HVM related helpers in there?

Yeah, that is why - tsc_scaling_ratio also gets defined in domain.h for e.g.

Thanks again,

Jane.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
  2022-03-08 16:16         ` Jane Malalane
@ 2022-03-08 16:22           ` Roger Pau Monné
  0 siblings, 0 replies; 27+ messages in thread
From: Roger Pau Monné @ 2022-03-08 16:22 UTC (permalink / raw)
  To: Jane Malalane
  Cc: Xen-devel, Wei Liu, Anthony Perard, Juergen Gross, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Christian Lindig, David Scott, Volodymyr Babchuk

On Tue, Mar 08, 2022 at 04:16:21PM +0000, Jane Malalane wrote:
> On 08/03/2022 16:02, Roger Pau Monné wrote:
> > On Tue, Mar 08, 2022 at 03:44:18PM +0000, Jane Malalane wrote:
> >> On 08/03/2022 11:38, Roger Pau Monné wrote:
> >>> On Mon, Mar 07, 2022 at 03:06:09PM +0000, Jane Malalane wrote:
> >>>> diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
> >>>> index 9119aa8536..5b7d662ed7 100644
> >>>> --- a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
> >>>> +++ b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
> >>>> @@ -220,6 +220,9 @@ void vmx_vmcs_reload(struct vcpu *v);
> >>>>    #define CPU_BASED_ACTIVATE_SECONDARY_CONTROLS 0x80000000
> >>>>    extern u32 vmx_cpu_based_exec_control;
> >>>>    
> >>>> +#define has_assisted_xapic(d)   ((d)->arch.hvm.assisted_xapic)
> >>>> +#define has_assisted_x2apic(d)  ((d)->arch.hvm.assisted_x2apic)
> >>>
> >>> Those macros should not be in an Intel specific header,
> >>> arch/x86/include/asm/hvm/domain.h is likely a better place.
> >>
> >> Actually do you think hvm.h could be better?
> > 
> > I guess that's also fine, I did suggest hvm/domain.h because that's
> > where the fields get declared. I guess you prefer hvm.h because there
> > are other HVM related helpers in there?
> 
> Yeah, that is why - tsc_scaling_ratio also gets defined in domain.h for e.g.

I'm fine with placing it in hvm.h.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v6 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86
  2022-03-07 15:06 ` [PATCH v5 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86 Jane Malalane
  2022-03-08 10:39   ` Roger Pau Monné
@ 2022-03-08 17:31   ` Jane Malalane
  2022-03-09 10:36     ` Roger Pau Monné
  2022-03-11 15:04     ` [PATCH v7 " Jane Malalane
  2022-03-14  6:40   ` [PATCH v5 " Tian, Kevin
  2 siblings, 2 replies; 27+ messages in thread
From: Jane Malalane @ 2022-03-08 17:31 UTC (permalink / raw)
  To: Xen-devel
  Cc: Jane Malalane, Wei Liu, Anthony PERARD, Juergen Gross,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Volodymyr Babchuk, Bertrand Marquis,
	Jun Nakajima, Kevin Tian, Roger Pau Monné

Add XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC and
XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC to report accelerated xapic
and x2apic, on x86 hardware.
No such features are currently implemented on AMD hardware.

HW assisted xAPIC virtualization will be reported if HW, at the
minimum, supports virtualize_apic_accesses as this feature alone means
that an access to the APIC page will cause an APIC-access VM exit. An
APIC-access VM exit provides a VMM with information about the access
causing the VM exit, unlike a regular EPT fault, thus simplifying some
internal handling.

HW assisted x2APIC virtualization will be reported if HW supports
virtualize_x2apic_mode and, at least, either apic_reg_virt or
virtual_intr_delivery. This also means that
sysctl follows the conditionals in vmx_vlapic_msr_changed().

For that purpose, also add an arch-specific "capabilities" parameter
to struct xen_sysctl_physinfo.

Note that this interface is intended to be compatible with AMD so that
AVIC support can be introduced in a future patch. Unlike Intel that
has multiple controls for APIC Virtualization, AMD has one global
'AVIC Enable' control bit, so fine-graining of APIC virtualization
control cannot be done on a common interface.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jane Malalane <jane.malalane@citrix.com>
---
CC: Wei Liu <wl@xen.org>
CC: Anthony PERARD <anthony.perard@citrix.com>
CC: Juergen Gross <jgross@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: George Dunlap <george.dunlap@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Julien Grall <julien@xen.org>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Bertrand Marquis <bertrand.marquis@arm.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: "Roger Pau Monné" <roger.pau@citrix.com>

v6:
 * Limit abi check to x86
 * Fix coding style issue

v5:
 * Have assisted_xapic_available solely depend on
   cpu_has_vmx_virtualize_apic_accesses and assisted_x2apic_available
   depend on cpu_has_vmx_virtualize_x2apic_mode and
   cpu_has_vmx_apic_reg_virt OR cpu_has_vmx_virtual_intr_delivery

v4:
 * Fallback to the original v2/v1 conditions for setting
   assisted_xapic_available and assisted_x2apic_available so that in
   the future APIC virtualization can be exposed on AMD hardware
   since fine-graining of "AVIC" is not supported, i.e., AMD solely
   uses "AVIC Enable". This also means that sysctl mimics what's
   exposed in CPUID

v3:
 * Define XEN_SYSCTL_PHYSCAP_ARCH_MAX for ABI checking and actually
   set "arch_capbilities", via a call to c_bitmap_to_ocaml_list()
 * Have assisted_x2apic_available only depend on
   cpu_has_vmx_virtualize_x2apic_mode

v2:
 * Use one macro LIBXL_HAVE_PHYSINFO_ASSISTED_APIC instead of two
 * Pass xcpyshinfo as a pointer in libxl__arch_get_physinfo
 * Set assisted_x{2}apic_available to be conditional upon "bsp" and
   annotate it with __ro_after_init
 * Change XEN_SYSCTL_PHYSCAP_ARCH_ASSISTED_X{2}APIC to
   _X86_ASSISTED_X{2}APIC
 * Keep XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X{2}APIC contained within
   sysctl.h
 * Fix padding introduced in struct xen_sysctl_physinfo and bump
   XEN_SYSCTL_INTERFACE_VERSION
---
 tools/golang/xenlight/helpers.gen.go |  4 ++++
 tools/golang/xenlight/types.gen.go   |  2 ++
 tools/include/libxl.h                |  7 +++++++
 tools/libs/light/libxl.c             |  3 +++
 tools/libs/light/libxl_arch.h        |  4 ++++
 tools/libs/light/libxl_arm.c         |  5 +++++
 tools/libs/light/libxl_types.idl     |  2 ++
 tools/libs/light/libxl_x86.c         | 11 +++++++++++
 tools/ocaml/libs/xc/xenctrl.ml       |  5 +++++
 tools/ocaml/libs/xc/xenctrl.mli      |  5 +++++
 tools/ocaml/libs/xc/xenctrl_stubs.c  | 15 +++++++++++++--
 tools/xl/xl_info.c                   |  6 ++++--
 xen/arch/x86/hvm/vmx/vmcs.c          |  9 +++++++++
 xen/arch/x86/include/asm/domain.h    |  3 +++
 xen/arch/x86/sysctl.c                |  7 +++++++
 xen/include/public/sysctl.h          | 11 ++++++++++-
 16 files changed, 94 insertions(+), 5 deletions(-)

diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
index b746ff1081..dd4e6c9f14 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -3373,6 +3373,8 @@ x.CapVmtrace = bool(xc.cap_vmtrace)
 x.CapVpmu = bool(xc.cap_vpmu)
 x.CapGnttabV1 = bool(xc.cap_gnttab_v1)
 x.CapGnttabV2 = bool(xc.cap_gnttab_v2)
+x.CapAssistedXapic = bool(xc.cap_assisted_xapic)
+x.CapAssistedX2Apic = bool(xc.cap_assisted_x2apic)
 
  return nil}
 
@@ -3407,6 +3409,8 @@ xc.cap_vmtrace = C.bool(x.CapVmtrace)
 xc.cap_vpmu = C.bool(x.CapVpmu)
 xc.cap_gnttab_v1 = C.bool(x.CapGnttabV1)
 xc.cap_gnttab_v2 = C.bool(x.CapGnttabV2)
+xc.cap_assisted_xapic = C.bool(x.CapAssistedXapic)
+xc.cap_assisted_x2apic = C.bool(x.CapAssistedX2Apic)
 
  return nil
  }
diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
index b1e84d5258..87be46c745 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -1014,6 +1014,8 @@ CapVmtrace bool
 CapVpmu bool
 CapGnttabV1 bool
 CapGnttabV2 bool
+CapAssistedXapic bool
+CapAssistedX2Apic bool
 }
 
 type Connectorinfo struct {
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 51a9b6cfac..94e6355822 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -528,6 +528,13 @@
 #define LIBXL_HAVE_MAX_GRANT_VERSION 1
 
 /*
+ * LIBXL_HAVE_PHYSINFO_ASSISTED_APIC indicates that libxl_physinfo has
+ * cap_assisted_xapic and cap_assisted_x2apic fields, which indicates
+ * the availability of x{2}APIC hardware assisted virtualization.
+ */
+#define LIBXL_HAVE_PHYSINFO_ASSISTED_APIC 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
diff --git a/tools/libs/light/libxl.c b/tools/libs/light/libxl.c
index a0bf7d186f..6d699951e2 100644
--- a/tools/libs/light/libxl.c
+++ b/tools/libs/light/libxl.c
@@ -15,6 +15,7 @@
 #include "libxl_osdeps.h"
 
 #include "libxl_internal.h"
+#include "libxl_arch.h"
 
 int libxl_ctx_alloc(libxl_ctx **pctx, int version,
                     unsigned flags, xentoollog_logger * lg)
@@ -410,6 +411,8 @@ int libxl_get_physinfo(libxl_ctx *ctx, libxl_physinfo *physinfo)
     physinfo->cap_gnttab_v2 =
         !!(xcphysinfo.capabilities & XEN_SYSCTL_PHYSCAP_gnttab_v2);
 
+    libxl__arch_get_physinfo(physinfo, &xcphysinfo);
+
     GC_FREE;
     return 0;
 }
diff --git a/tools/libs/light/libxl_arch.h b/tools/libs/light/libxl_arch.h
index 1522ecb97f..207ceac6a1 100644
--- a/tools/libs/light/libxl_arch.h
+++ b/tools/libs/light/libxl_arch.h
@@ -86,6 +86,10 @@ int libxl__arch_extra_memory(libxl__gc *gc,
                              uint64_t *out);
 
 _hidden
+void libxl__arch_get_physinfo(libxl_physinfo *physinfo,
+                              const xc_physinfo_t *xcphysinfo);
+
+_hidden
 void libxl__arch_update_domain_config(libxl__gc *gc,
                                       libxl_domain_config *dst,
                                       const libxl_domain_config *src);
diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index eef1de0939..39fdca1b49 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -1431,6 +1431,11 @@ int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
     return rc;
 }
 
+void libxl__arch_get_physinfo(libxl_physinfo *physinfo,
+                              const xc_physinfo_t *xcphysinfo)
+{
+}
+
 void libxl__arch_update_domain_config(libxl__gc *gc,
                                       libxl_domain_config *dst,
                                       const libxl_domain_config *src)
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 2a42da2f7d..42ac6c357b 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -1068,6 +1068,8 @@ libxl_physinfo = Struct("physinfo", [
     ("cap_vpmu", bool),
     ("cap_gnttab_v1", bool),
     ("cap_gnttab_v2", bool),
+    ("cap_assisted_xapic", bool),
+    ("cap_assisted_x2apic", bool),
     ], dir=DIR_OUT)
 
 libxl_connectorinfo = Struct("connectorinfo", [
diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
index 1feadebb18..e0a06ecfe3 100644
--- a/tools/libs/light/libxl_x86.c
+++ b/tools/libs/light/libxl_x86.c
@@ -866,6 +866,17 @@ int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
     return rc;
 }
 
+void libxl__arch_get_physinfo(libxl_physinfo *physinfo,
+                              const xc_physinfo_t *xcphysinfo)
+{
+    physinfo->cap_assisted_xapic =
+        !!(xcphysinfo->arch_capabilities &
+           XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC);
+    physinfo->cap_assisted_x2apic =
+        !!(xcphysinfo->arch_capabilities &
+           XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC);
+}
+
 void libxl__arch_update_domain_config(libxl__gc *gc,
                                       libxl_domain_config *dst,
                                       const libxl_domain_config *src)
diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
index 7503031d8f..712456e098 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -127,6 +127,10 @@ type physinfo_cap_flag =
 	| CAP_Gnttab_v1
 	| CAP_Gnttab_v2
 
+type physinfo_arch_cap_flag =
+	| CAP_X86_ASSISTED_XAPIC
+	| CAP_X86_ASSISTED_X2APIC
+
 type physinfo =
 {
 	threads_per_core : int;
@@ -140,6 +144,7 @@ type physinfo =
 	(* XXX hw_cap *)
 	capabilities     : physinfo_cap_flag list;
 	max_nr_cpus      : int;
+	arch_capabilities : physinfo_arch_cap_flag list;
 }
 
 type version =
diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index d1d9c9247a..b034434f68 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -112,6 +112,10 @@ type physinfo_cap_flag =
   | CAP_Gnttab_v1
   | CAP_Gnttab_v2
 
+type physinfo_arch_cap_flag =
+  | CAP_X86_ASSISTED_XAPIC
+  | CAP_X86_ASSISTED_X2APIC
+
 type physinfo = {
   threads_per_core : int;
   cores_per_socket : int;
@@ -123,6 +127,7 @@ type physinfo = {
   scrub_pages      : nativeint;
   capabilities     : physinfo_cap_flag list;
   max_nr_cpus      : int; (** compile-time max possible number of nr_cpus *)
+  arch_capabilities : physinfo_arch_cap_flag list;
 }
 type version = { major : int; minor : int; extra : string; }
 type compile_info = {
diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
index 5b4fe72c8d..7e9c32ad1b 100644
--- a/tools/ocaml/libs/xc/xenctrl_stubs.c
+++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
@@ -712,7 +712,7 @@ CAMLprim value stub_xc_send_debug_keys(value xch, value keys)
 CAMLprim value stub_xc_physinfo(value xch)
 {
 	CAMLparam1(xch);
-	CAMLlocal2(physinfo, cap_list);
+	CAMLlocal3(physinfo, cap_list, arch_cap_list);
 	xc_physinfo_t c_physinfo;
 	int r;
 
@@ -731,7 +731,7 @@ CAMLprim value stub_xc_physinfo(value xch)
 		/* ! XEN_SYSCTL_PHYSCAP_ XEN_SYSCTL_PHYSCAP_MAX max */
 		(c_physinfo.capabilities);
 
-	physinfo = caml_alloc_tuple(10);
+	physinfo = caml_alloc_tuple(11);
 	Store_field(physinfo, 0, Val_int(c_physinfo.threads_per_core));
 	Store_field(physinfo, 1, Val_int(c_physinfo.cores_per_socket));
 	Store_field(physinfo, 2, Val_int(c_physinfo.nr_cpus));
@@ -743,6 +743,17 @@ CAMLprim value stub_xc_physinfo(value xch)
 	Store_field(physinfo, 8, cap_list);
 	Store_field(physinfo, 9, Val_int(c_physinfo.max_cpu_id + 1));
 
+#if defined(__i386__) || defined(__x86_64__)
+	/*
+	 * arch_capabilities: physinfo_arch_cap_flag list;
+	 */
+	arch_cap_list = c_bitmap_to_ocaml_list
+		/* ! physinfo_arch_cap_flag CAP_ none */
+		/* ! XEN_SYSCTL_PHYSCAP_ XEN_SYSCTL_PHYSCAP_X86_MAX max */
+		(c_physinfo.arch_capabilities);
+	Store_field(physinfo, 10, arch_cap_list);
+#endif
+
 	CAMLreturn(physinfo);
 }
 
diff --git a/tools/xl/xl_info.c b/tools/xl/xl_info.c
index 712b7638b0..3205270754 100644
--- a/tools/xl/xl_info.c
+++ b/tools/xl/xl_info.c
@@ -210,7 +210,7 @@ static void output_physinfo(void)
          info.hw_cap[4], info.hw_cap[5], info.hw_cap[6], info.hw_cap[7]
         );
 
-    maybe_printf("virt_caps              :%s%s%s%s%s%s%s%s%s%s%s\n",
+    maybe_printf("virt_caps              :%s%s%s%s%s%s%s%s%s%s%s%s%s\n",
          info.cap_pv ? " pv" : "",
          info.cap_hvm ? " hvm" : "",
          info.cap_hvm && info.cap_hvm_directio ? " hvm_directio" : "",
@@ -221,7 +221,9 @@ static void output_physinfo(void)
          info.cap_vmtrace ? " vmtrace" : "",
          info.cap_vpmu ? " vpmu" : "",
          info.cap_gnttab_v1 ? " gnttab-v1" : "",
-         info.cap_gnttab_v2 ? " gnttab-v2" : ""
+         info.cap_gnttab_v2 ? " gnttab-v2" : "",
+         info.cap_assisted_xapic ? " assisted_xapic" : "",
+         info.cap_assisted_x2apic ? " assisted_x2apic" : ""
         );
 
     vinfo = libxl_get_version_info(ctx);
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index e1e1fa14e6..77ce0b2121 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -343,6 +343,15 @@ static int vmx_init_vmcs_config(bool bsp)
             MSR_IA32_VMX_PROCBASED_CTLS2, &mismatch);
     }
 
+    /* Check whether hardware supports accelerated xapic and x2apic. */
+    if ( bsp )
+    {
+        assisted_xapic_available = cpu_has_vmx_virtualize_apic_accesses;
+        assisted_x2apic_available = cpu_has_vmx_virtualize_x2apic_mode &&
+                                    (cpu_has_vmx_apic_reg_virt ||
+                                     cpu_has_vmx_virtual_intr_delivery);
+    }
+
     /* The IA32_VMX_EPT_VPID_CAP MSR exists only when EPT or VPID available */
     if ( _vmx_secondary_exec_control & (SECONDARY_EXEC_ENABLE_EPT |
                                         SECONDARY_EXEC_ENABLE_VPID) )
diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h
index e62e109598..72431df26d 100644
--- a/xen/arch/x86/include/asm/domain.h
+++ b/xen/arch/x86/include/asm/domain.h
@@ -756,6 +756,9 @@ static inline void pv_inject_sw_interrupt(unsigned int vector)
                       : is_pv_32bit_domain(d) ? PV32_VM_ASSIST_MASK \
                                               : PV64_VM_ASSIST_MASK)
 
+extern bool assisted_xapic_available;
+extern bool assisted_x2apic_available;
+
 #endif /* __ASM_DOMAIN_H__ */
 
 /*
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index f82abc2488..ad95c86aef 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -69,6 +69,9 @@ struct l3_cache_info {
     unsigned long size;
 };
 
+bool __ro_after_init assisted_xapic_available;
+bool __ro_after_init assisted_x2apic_available;
+
 static void cf_check l3_cache_get(void *arg)
 {
     struct cpuid4_info info;
@@ -135,6 +138,10 @@ void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
         pi->capabilities |= XEN_SYSCTL_PHYSCAP_hap;
     if ( IS_ENABLED(CONFIG_SHADOW_PAGING) )
         pi->capabilities |= XEN_SYSCTL_PHYSCAP_shadow;
+    if ( assisted_xapic_available )
+        pi->arch_capabilities |= XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC;
+    if ( assisted_x2apic_available )
+        pi->arch_capabilities |= XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC;
 }
 
 long arch_do_sysctl(
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 55252e97f2..7fe05be0c9 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -35,7 +35,7 @@
 #include "domctl.h"
 #include "physdev.h"
 
-#define XEN_SYSCTL_INTERFACE_VERSION 0x00000014
+#define XEN_SYSCTL_INTERFACE_VERSION 0x00000015
 
 /*
  * Read console content from Xen buffer ring.
@@ -111,6 +111,13 @@ struct xen_sysctl_tbuf_op {
 /* Max XEN_SYSCTL_PHYSCAP_* constant.  Used for ABI checking. */
 #define XEN_SYSCTL_PHYSCAP_MAX XEN_SYSCTL_PHYSCAP_gnttab_v2
 
+/* The platform supports x{2}apic hardware assisted emulation. */
+#define XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC  (1u << 0)
+#define XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC (1u << 1)
+
+/* Max XEN_SYSCTL_PHYSCAP_X86__* constant. Used for ABI checking. */
+#define XEN_SYSCTL_PHYSCAP_X86_MAX XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC
+
 struct xen_sysctl_physinfo {
     uint32_t threads_per_core;
     uint32_t cores_per_socket;
@@ -120,6 +127,8 @@ struct xen_sysctl_physinfo {
     uint32_t max_node_id; /* Largest possible node ID on this host */
     uint32_t cpu_khz;
     uint32_t capabilities;/* XEN_SYSCTL_PHYSCAP_??? */
+    uint32_t arch_capabilities;/* XEN_SYSCTL_PHYSCAP_X86{ARM}_??? */
+    uint32_t pad;
     uint64_aligned_t total_pages;
     uint64_aligned_t free_pages;
     uint64_aligned_t scrub_pages;
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
  2022-03-07 15:06 ` [PATCH v5 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC Jane Malalane
  2022-03-08 11:38   ` Roger Pau Monné
@ 2022-03-08 17:36   ` Jane Malalane
  2022-03-09 11:48     ` Roger Pau Monné
                       ` (3 more replies)
  1 sibling, 4 replies; 27+ messages in thread
From: Jane Malalane @ 2022-03-08 17:36 UTC (permalink / raw)
  To: Xen-devel
  Cc: Jane Malalane, Wei Liu, Anthony PERARD, Juergen Gross,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Christian Lindig, David Scott,
	Volodymyr Babchuk, Roger Pau Monné

Introduce a new per-domain creation x86 specific flag to
select whether hardware assisted virtualization should be used for
x{2}APIC.

A per-domain option is added to xl in order to select the usage of
x{2}APIC hardware assisted virtualization, as well as a global
configuration option.

Having all APIC interaction exit to Xen for emulation is slow and can
induce much overhead. Hardware can speed up x{2}APIC by decoding the
APIC access and providing a VM exit with a more specific exit reason
than a regular EPT fault or by altogether avoiding a VM exit.

On the other hand, being able to disable x{2}APIC hardware assisted
virtualization can be useful for testing and debugging purposes.

Note: vmx_install_vlapic_mapping doesn't require modifications
regardless of whether the guest has "Virtualize APIC accesses" enabled
or not, i.e., setting the APIC_ACCESS_ADDR VMCS field is fine so long
as virtualize_apic_accesses is supported by the CPU.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jane Malalane <jane.malalane@citrix.com>
---
CC: Wei Liu <wl@xen.org>
CC: Anthony PERARD <anthony.perard@citrix.com>
CC: Juergen Gross <jgross@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: George Dunlap <george.dunlap@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Julien Grall <julien@xen.org>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Christian Lindig <christian.lindig@citrix.com>
CC: David Scott <dave@recoil.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: "Roger Pau Monné" <roger.pau@citrix.com>

v6:
 * Use ENODEV instead of EINVAL when rejecting assisted_x{2}apic
   for PV guests
 * Move has_assisted_x{2}apic macros out of an Intel specific header
 * Remove references to Intel specific features in documentation

v5:
 * Revert v4 changes in vmx_vlapic_msr_changed(), preserving the use of
   the has_assisted_x{2}apic macros
 * Following changes in assisted_x{2}apic_available definitions in
   patch 1, retighten conditionals for setting
   XEN_HVM_CPUID_APIC_ACCESS_VIRT and XEN_HVM_CPUID_X2APIC_VIRT in
   cpuid_hypervisor_leaves()

v4:
 * Add has_assisted_x{2}apic macros and use them where appropriate
 * Replace CPU checks with per-domain assisted_x{2}apic control
   options in vmx_vlapic_msr_changed() and cpuid_hypervisor_leaves(),
   following edits to assisted_x{2}apic_available definitions in
   patch 1
   Note: new assisted_x{2}apic_available definitions make later
   cpu_has_vmx_apic_reg_virt and cpu_has_vmx_virtual_intr_delivery
   checks redundant in vmx_vlapic_msr_changed()

v3:
 * Change info in xl.cfg to better express reality and fix
   capitalization of x{2}apic
 * Move "physinfo" variable definition to the beggining of
   libxl__domain_build_info_setdefault()
 * Reposition brackets in if statement to match libxl coding style
 * Shorten logic in libxl__arch_domain_build_info_setdefault()
 * Correct dprintk message in arch_sanitise_domain_config()
 * Make appropriate changes in vmx_vlapic_msr_changed() and
   cpuid_hypervisor_leaves() for amended "assisted_x2apic" bit
 * Remove unneeded parantheses

v2:
 * Add a LIBXL_HAVE_ASSISTED_APIC macro
 * Pass xcpyshinfo as a pointer in libxl__arch_get_physinfo
 * Add a return statement in now "int"
   libxl__arch_domain_build_info_setdefault()
 * Preserve libxl__arch_domain_build_info_setdefault 's location in
   libxl_create.c
 * Correct x{2}apic default setting logic in
   libxl__arch_domain_prepare_config()
 * Correct logic for parsing assisted_x{2}apic host/guest options in
   xl_parse.c and initialize them to -1 in xl.c
 * Use guest options directly in vmx_vlapic_msr_changed
 * Fix indentation of bool assisted_x{2}apic in struct hvm_domain
 * Add a change in xenctrl_stubs.c to pass xenctrl ABI checks
---
 docs/man/xl.cfg.5.pod.in              | 15 +++++++++++++++
 docs/man/xl.conf.5.pod.in             | 12 ++++++++++++
 tools/golang/xenlight/helpers.gen.go  | 12 ++++++++++++
 tools/golang/xenlight/types.gen.go    |  2 ++
 tools/include/libxl.h                 |  7 +++++++
 tools/libs/light/libxl_arch.h         |  5 +++--
 tools/libs/light/libxl_arm.c          |  7 +++++--
 tools/libs/light/libxl_create.c       | 22 +++++++++++++---------
 tools/libs/light/libxl_types.idl      |  2 ++
 tools/libs/light/libxl_x86.c          | 27 +++++++++++++++++++++++++--
 tools/ocaml/libs/xc/xenctrl.ml        |  2 ++
 tools/ocaml/libs/xc/xenctrl.mli       |  2 ++
 tools/ocaml/libs/xc/xenctrl_stubs.c   |  2 +-
 tools/xl/xl.c                         |  8 ++++++++
 tools/xl/xl.h                         |  2 ++
 tools/xl/xl_parse.c                   | 16 ++++++++++++++++
 xen/arch/x86/domain.c                 | 28 +++++++++++++++++++++++++++-
 xen/arch/x86/hvm/vmx/vmcs.c           |  4 ++++
 xen/arch/x86/hvm/vmx/vmx.c            | 13 ++++---------
 xen/arch/x86/include/asm/hvm/domain.h |  6 ++++++
 xen/arch/x86/include/asm/hvm/hvm.h    |  6 ++++++
 xen/arch/x86/traps.c                  |  5 +++--
 xen/include/public/arch-x86/xen.h     |  2 ++
 23 files changed, 179 insertions(+), 28 deletions(-)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index b98d161398..b4239fcc5e 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -1862,6 +1862,21 @@ firmware tables when using certain older guest Operating
 Systems. These tables have been superseded by newer constructs within
 the ACPI tables.
 
+=item B<assisted_xapic=BOOLEAN>
+
+B<(x86 only)> Enables or disables hardware assisted virtualization for
+xAPIC. With this option enabled, a memory-mapped APIC access will be
+decoded by hardware and either issue a more specific VM exit than just
+an EPT fault, or altogether avoid a VM exit. The
+default is settable via L<xl.conf(5)>.
+
+=item B<assisted_x2apic=BOOLEAN>
+
+B<(x86 only)> Enables or disables hardware assisted virtualization for
+x2APIC. With this option enabled, certain accesses to MSR APIC
+registers will avoid a VM exit into the hypervisor. The default is
+settable via L<xl.conf(5)>.
+
 =item B<nx=BOOLEAN>
 
 B<(x86 only)> Hides or exposes the No-eXecute capability. This allows a guest
diff --git a/docs/man/xl.conf.5.pod.in b/docs/man/xl.conf.5.pod.in
index df20c08137..95d136d1ea 100644
--- a/docs/man/xl.conf.5.pod.in
+++ b/docs/man/xl.conf.5.pod.in
@@ -107,6 +107,18 @@ Sets the default value for the C<max_grant_version> domain config value.
 
 Default: maximum grant version supported by the hypervisor.
 
+=item B<assisted_xapic=BOOLEAN>
+
+If enabled, domains will use xAPIC hardware assisted virtualization by default.
+
+Default: enabled if supported.
+
+=item B<assisted_x2apic=BOOLEAN>
+
+If enabled, domains will use x2APIC hardware assisted virtualization by default.
+
+Default: enabled if supported.
+
 =item B<vif.default.script="PATH">
 
 Configures the default hotplug script used by virtual network devices.
diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
index dd4e6c9f14..dece545ee0 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -1120,6 +1120,12 @@ x.ArchArm.Vuart = VuartType(xc.arch_arm.vuart)
 if err := x.ArchX86.MsrRelaxed.fromC(&xc.arch_x86.msr_relaxed);err != nil {
 return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
 }
+if err := x.ArchX86.AssistedXapic.fromC(&xc.arch_x86.assisted_xapic);err != nil {
+return fmt.Errorf("converting field ArchX86.AssistedXapic: %v", err)
+}
+if err := x.ArchX86.AssistedX2Apic.fromC(&xc.arch_x86.assisted_x2apic);err != nil {
+return fmt.Errorf("converting field ArchX86.AssistedX2Apic: %v", err)
+}
 x.Altp2M = Altp2MMode(xc.altp2m)
 x.VmtraceBufKb = int(xc.vmtrace_buf_kb)
 if err := x.Vpmu.fromC(&xc.vpmu);err != nil {
@@ -1605,6 +1611,12 @@ xc.arch_arm.vuart = C.libxl_vuart_type(x.ArchArm.Vuart)
 if err := x.ArchX86.MsrRelaxed.toC(&xc.arch_x86.msr_relaxed); err != nil {
 return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
 }
+if err := x.ArchX86.AssistedXapic.toC(&xc.arch_x86.assisted_xapic); err != nil {
+return fmt.Errorf("converting field ArchX86.AssistedXapic: %v", err)
+}
+if err := x.ArchX86.AssistedX2Apic.toC(&xc.arch_x86.assisted_x2apic); err != nil {
+return fmt.Errorf("converting field ArchX86.AssistedX2Apic: %v", err)
+}
 xc.altp2m = C.libxl_altp2m_mode(x.Altp2M)
 xc.vmtrace_buf_kb = C.int(x.VmtraceBufKb)
 if err := x.Vpmu.toC(&xc.vpmu); err != nil {
diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
index 87be46c745..253c9ad93d 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -520,6 +520,8 @@ Vuart VuartType
 }
 ArchX86 struct {
 MsrRelaxed Defbool
+AssistedXapic Defbool
+AssistedX2Apic Defbool
 }
 Altp2M Altp2MMode
 VmtraceBufKb int
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 94e6355822..cdcccd6d01 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -535,6 +535,13 @@
 #define LIBXL_HAVE_PHYSINFO_ASSISTED_APIC 1
 
 /*
+ * LIBXL_HAVE_ASSISTED_APIC indicates that libxl_domain_build_info has
+ * assisted_xapic and assisted_x2apic fields for enabling hardware
+ * assisted virtualization for x{2}apic per domain.
+ */
+#define LIBXL_HAVE_ASSISTED_APIC 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
diff --git a/tools/libs/light/libxl_arch.h b/tools/libs/light/libxl_arch.h
index 207ceac6a1..03b89929e6 100644
--- a/tools/libs/light/libxl_arch.h
+++ b/tools/libs/light/libxl_arch.h
@@ -71,8 +71,9 @@ void libxl__arch_domain_create_info_setdefault(libxl__gc *gc,
                                                libxl_domain_create_info *c_info);
 
 _hidden
-void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
-                                              libxl_domain_build_info *b_info);
+int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
+                                             libxl_domain_build_info *b_info,
+                                             const libxl_physinfo *physinfo);
 
 _hidden
 int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index 39fdca1b49..ba5b8f433f 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -1384,8 +1384,9 @@ void libxl__arch_domain_create_info_setdefault(libxl__gc *gc,
     }
 }
 
-void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
-                                              libxl_domain_build_info *b_info)
+int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
+                                             libxl_domain_build_info *b_info,
+                                             const libxl_physinfo *physinfo)
 {
     /* ACPI is disabled by default */
     libxl_defbool_setdefault(&b_info->acpi, false);
@@ -1399,6 +1400,8 @@ void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
     memset(&b_info->u, '\0', sizeof(b_info->u));
     b_info->type = LIBXL_DOMAIN_TYPE_INVALID;
     libxl_domain_build_info_init_type(b_info, LIBXL_DOMAIN_TYPE_PVH);
+
+    return 0;
 }
 
 int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index 15ed021f41..88d08d7277 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -75,6 +75,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
                                         libxl_domain_build_info *b_info)
 {
     int i, rc;
+    libxl_physinfo info;
 
     if (b_info->type != LIBXL_DOMAIN_TYPE_HVM &&
         b_info->type != LIBXL_DOMAIN_TYPE_PV &&
@@ -264,7 +265,18 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
     if (!b_info->event_channels)
         b_info->event_channels = 1023;
 
-    libxl__arch_domain_build_info_setdefault(gc, b_info);
+    rc = libxl_get_physinfo(CTX, &info);
+    if (rc) {
+        LOG(ERROR, "failed to get hypervisor info");
+        return rc;
+    }
+
+    rc = libxl__arch_domain_build_info_setdefault(gc, b_info, &info);
+    if (rc) {
+        LOG(ERROR, "unable to set domain arch build info defaults");
+        return rc;
+    }
+
     libxl_defbool_setdefault(&b_info->dm_restrict, false);
 
     if (b_info->iommu_memkb == LIBXL_MEMKB_DEFAULT)
@@ -457,14 +469,6 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
     }
 
     if (b_info->max_grant_version == LIBXL_MAX_GRANT_DEFAULT) {
-        libxl_physinfo info;
-
-        rc = libxl_get_physinfo(CTX, &info);
-        if (rc) {
-            LOG(ERROR, "failed to get hypervisor info");
-            return rc;
-        }
-
         if (info.cap_gnttab_v2)
             b_info->max_grant_version = 2;
         else if (info.cap_gnttab_v1)
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 42ac6c357b..db5eb0a0b3 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -648,6 +648,8 @@ libxl_domain_build_info = Struct("domain_build_info",[
                                ("vuart", libxl_vuart_type),
                               ])),
     ("arch_x86", Struct(None, [("msr_relaxed", libxl_defbool),
+                               ("assisted_xapic", libxl_defbool),
+                               ("assisted_x2apic", libxl_defbool),
                               ])),
     # Alternate p2m is not bound to any architecture or guest type, as it is
     # supported by x86 HVM and ARM support is planned.
diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
index e0a06ecfe3..c7f9f3774f 100644
--- a/tools/libs/light/libxl_x86.c
+++ b/tools/libs/light/libxl_x86.c
@@ -23,6 +23,14 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
     if (libxl_defbool_val(d_config->b_info.arch_x86.msr_relaxed))
         config->arch.misc_flags |= XEN_X86_MSR_RELAXED;
 
+    if (d_config->c_info.type != LIBXL_DOMAIN_TYPE_PV)
+    {
+        if (libxl_defbool_val(d_config->b_info.arch_x86.assisted_xapic))
+            config->arch.misc_flags |= XEN_X86_ASSISTED_XAPIC;
+
+        if (libxl_defbool_val(d_config->b_info.arch_x86.assisted_x2apic))
+            config->arch.misc_flags |= XEN_X86_ASSISTED_X2APIC;
+    }
     return 0;
 }
 
@@ -819,11 +827,26 @@ void libxl__arch_domain_create_info_setdefault(libxl__gc *gc,
 {
 }
 
-void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
-                                              libxl_domain_build_info *b_info)
+int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
+                                             libxl_domain_build_info *b_info,
+                                             const libxl_physinfo *physinfo)
 {
     libxl_defbool_setdefault(&b_info->acpi, true);
     libxl_defbool_setdefault(&b_info->arch_x86.msr_relaxed, false);
+
+    if (b_info->type != LIBXL_DOMAIN_TYPE_PV) {
+        libxl_defbool_setdefault(&b_info->arch_x86.assisted_xapic,
+                             physinfo->cap_assisted_xapic);
+        libxl_defbool_setdefault(&b_info->arch_x86.assisted_x2apic,
+                             physinfo->cap_assisted_x2apic);
+    }
+    else if (!libxl_defbool_is_default(b_info->arch_x86.assisted_xapic) ||
+             !libxl_defbool_is_default(b_info->arch_x86.assisted_x2apic)) {
+        LOG(ERROR, "Interrupt Controller Virtualization not supported for PV");
+        return ERROR_INVAL;
+    }
+
+    return 0;
 }
 
 int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
index 712456e098..32f3028828 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -50,6 +50,8 @@ type x86_arch_emulation_flags =
 
 type x86_arch_misc_flags =
 	| X86_MSR_RELAXED
+	| X86_ASSISTED_XAPIC
+	| X86_ASSISTED_X2APIC
 
 type xen_x86_arch_domainconfig =
 {
diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index b034434f68..d0fcbc8866 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -44,6 +44,8 @@ type x86_arch_emulation_flags =
 
 type x86_arch_misc_flags =
   | X86_MSR_RELAXED
+  | X86_ASSISTED_XAPIC
+  | X86_ASSISTED_X2APIC
 
 type xen_x86_arch_domainconfig = {
   emulation_flags: x86_arch_emulation_flags list;
diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
index 7e9c32ad1b..5df8aaa58f 100644
--- a/tools/ocaml/libs/xc/xenctrl_stubs.c
+++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
@@ -239,7 +239,7 @@ CAMLprim value stub_xc_domain_create(value xch, value wanted_domid, value config
 
 		cfg.arch.misc_flags = ocaml_list_to_c_bitmap
 			/* ! x86_arch_misc_flags X86_ none */
-			/* ! XEN_X86_ XEN_X86_MSR_RELAXED all */
+			/* ! XEN_X86_ XEN_X86_ASSISTED_X2APIC max */
 			(VAL_MISC_FLAGS);
 
 #undef VAL_MISC_FLAGS
diff --git a/tools/xl/xl.c b/tools/xl/xl.c
index 2d1ec18ea3..31eb223309 100644
--- a/tools/xl/xl.c
+++ b/tools/xl/xl.c
@@ -57,6 +57,8 @@ int max_grant_frames = -1;
 int max_maptrack_frames = -1;
 int max_grant_version = LIBXL_MAX_GRANT_DEFAULT;
 libxl_domid domid_policy = INVALID_DOMID;
+int assisted_xapic = -1;
+int assisted_x2apic = -1;
 
 xentoollog_level minmsglevel = minmsglevel_default;
 
@@ -201,6 +203,12 @@ static void parse_global_config(const char *configfile,
     if (!xlu_cfg_get_long (config, "claim_mode", &l, 0))
         claim_mode = l;
 
+    if (!xlu_cfg_get_long (config, "assisted_xapic", &l, 0))
+        assisted_xapic = l;
+
+    if (!xlu_cfg_get_long (config, "assisted_x2apic", &l, 0))
+        assisted_x2apic = l;
+
     xlu_cfg_replace_string (config, "remus.default.netbufscript",
         &default_remus_netbufscript, 0);
     xlu_cfg_replace_string (config, "colo.default.proxyscript",
diff --git a/tools/xl/xl.h b/tools/xl/xl.h
index c5c4bedbdd..528deb3feb 100644
--- a/tools/xl/xl.h
+++ b/tools/xl/xl.h
@@ -286,6 +286,8 @@ extern libxl_bitmap global_vm_affinity_mask;
 extern libxl_bitmap global_hvm_affinity_mask;
 extern libxl_bitmap global_pv_affinity_mask;
 extern libxl_domid domid_policy;
+extern int assisted_xapic;
+extern int assisted_x2apic;
 
 enum output_format {
     OUTPUT_FORMAT_JSON,
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index 117fcdcb2b..0ab9b145fe 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -1681,6 +1681,22 @@ void parse_config_data(const char *config_source,
         xlu_cfg_get_defbool(config, "vpt_align", &b_info->u.hvm.vpt_align, 0);
         xlu_cfg_get_defbool(config, "apic", &b_info->apic, 0);
 
+        e = xlu_cfg_get_long(config, "assisted_xapic", &l , 0);
+        if ((e == ESRCH && assisted_xapic != -1)) /* use global default if present */
+            libxl_defbool_set(&b_info->arch_x86.assisted_xapic, assisted_xapic);
+        else if (!e)
+            libxl_defbool_set(&b_info->arch_x86.assisted_xapic, l);
+        else
+            exit(1);
+
+        e = xlu_cfg_get_long(config, "assisted_x2apic", &l, 0);
+        if ((e == ESRCH && assisted_x2apic != -1)) /* use global default if present */
+            libxl_defbool_set(&b_info->arch_x86.assisted_x2apic, assisted_x2apic);
+        else if (!e)
+            libxl_defbool_set(&b_info->arch_x86.assisted_x2apic, l);
+        else
+            exit(1);
+
         switch (xlu_cfg_get_list(config, "viridian",
                                  &viridian, &num_viridian, 1))
         {
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index a5048ed654..b18d5c9b52 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -619,6 +619,8 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
     bool hvm = config->flags & XEN_DOMCTL_CDF_hvm;
     bool hap = config->flags & XEN_DOMCTL_CDF_hap;
     bool nested_virt = config->flags & XEN_DOMCTL_CDF_nested_virt;
+    bool assisted_xapic = config->arch.misc_flags & XEN_X86_ASSISTED_XAPIC;
+    bool assisted_x2apic = config->arch.misc_flags & XEN_X86_ASSISTED_X2APIC;
     unsigned int max_vcpus;
 
     if ( hvm ? !hvm_enabled : !IS_ENABLED(CONFIG_PV) )
@@ -685,13 +687,31 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
         }
     }
 
-    if ( config->arch.misc_flags & ~XEN_X86_MSR_RELAXED )
+    if ( config->arch.misc_flags & ~(XEN_X86_MSR_RELAXED |
+                                     XEN_X86_ASSISTED_XAPIC |
+                                     XEN_X86_ASSISTED_X2APIC) )
     {
         dprintk(XENLOG_INFO, "Invalid arch misc flags %#x\n",
                 config->arch.misc_flags);
         return -EINVAL;
     }
 
+    if ( (assisted_xapic || assisted_x2apic) && !hvm )
+    {
+        dprintk(XENLOG_INFO,
+                "Interrupt Controller Virtualization not supported for PV\n");
+        return -ENODEV;
+    }
+
+    if ( (assisted_xapic && !assisted_xapic_available) ||
+         (assisted_x2apic && !assisted_x2apic_available) )
+    {
+        dprintk(XENLOG_INFO,
+                "Hardware assisted x%sAPIC requested but not available\n",
+                assisted_xapic && !assisted_xapic_available ? "" : "2");
+        return -EINVAL;
+    }
+
     return 0;
 }
 
@@ -864,6 +884,12 @@ int arch_domain_create(struct domain *d,
 
     d->arch.msr_relaxed = config->arch.misc_flags & XEN_X86_MSR_RELAXED;
 
+    d->arch.hvm.assisted_xapic =
+        config->arch.misc_flags & XEN_X86_ASSISTED_XAPIC;
+
+    d->arch.hvm.assisted_x2apic =
+        config->arch.misc_flags & XEN_X86_ASSISTED_X2APIC;
+
     return 0;
 
  fail:
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 77ce0b2121..7adb043ab7 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1157,6 +1157,10 @@ static int construct_vmcs(struct vcpu *v)
         __vmwrite(PLE_WINDOW, ple_window);
     }
 
+    if ( !has_assisted_xapic(v->domain) )
+        v->arch.hvm.vmx.secondary_exec_control &=
+            ~SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
+
     if ( cpu_has_vmx_secondary_exec_control )
         __vmwrite(SECONDARY_VM_EXEC_CONTROL,
                   v->arch.hvm.vmx.secondary_exec_control);
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index c075370f64..949ddd684c 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3344,16 +3344,11 @@ static void vmx_install_vlapic_mapping(struct vcpu *v)
 
 void vmx_vlapic_msr_changed(struct vcpu *v)
 {
-    int virtualize_x2apic_mode;
     struct vlapic *vlapic = vcpu_vlapic(v);
     unsigned int msr;
 
-    virtualize_x2apic_mode = ( (cpu_has_vmx_apic_reg_virt ||
-                                cpu_has_vmx_virtual_intr_delivery) &&
-                               cpu_has_vmx_virtualize_x2apic_mode );
-
-    if ( !cpu_has_vmx_virtualize_apic_accesses &&
-         !virtualize_x2apic_mode )
+    if ( !has_assisted_xapic(v->domain) &&
+         !has_assisted_x2apic(v->domain) )
         return;
 
     vmx_vmcs_enter(v);
@@ -3363,7 +3358,7 @@ void vmx_vlapic_msr_changed(struct vcpu *v)
     if ( !vlapic_hw_disabled(vlapic) &&
          (vlapic_base_address(vlapic) == APIC_DEFAULT_PHYS_BASE) )
     {
-        if ( virtualize_x2apic_mode && vlapic_x2apic_mode(vlapic) )
+        if ( has_assisted_x2apic(v->domain) && vlapic_x2apic_mode(vlapic) )
         {
             v->arch.hvm.vmx.secondary_exec_control |=
                 SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE;
@@ -3384,7 +3379,7 @@ void vmx_vlapic_msr_changed(struct vcpu *v)
                 vmx_clear_msr_intercept(v, MSR_X2APIC_SELF, VMX_MSR_W);
             }
         }
-        else
+        else if ( has_assisted_xapic(v->domain) )
             v->arch.hvm.vmx.secondary_exec_control |=
                 SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
     }
diff --git a/xen/arch/x86/include/asm/hvm/domain.h b/xen/arch/x86/include/asm/hvm/domain.h
index 698455444e..92bf53483c 100644
--- a/xen/arch/x86/include/asm/hvm/domain.h
+++ b/xen/arch/x86/include/asm/hvm/domain.h
@@ -117,6 +117,12 @@ struct hvm_domain {
 
     bool                   is_s3_suspended;
 
+    /* xAPIC hardware assisted virtualization. */
+    bool                   assisted_xapic;
+
+    /* x2APIC hardware assisted virtualization. */
+    bool                   assisted_x2apic;
+
     /* hypervisor intercepted msix table */
     struct list_head       msixtbl_list;
 
diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h
index 5b7ec0cf69..65a978f670 100644
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -373,6 +373,12 @@ int hvm_get_param(struct domain *d, uint32_t index, uint64_t *value);
 #define hvm_tsc_scaling_ratio(d) \
     ((d)->arch.hvm.tsc_scaling_ratio)
 
+#define has_assisted_xapic(d) \
+    ((d)->arch.hvm.assisted_xapic)
+
+#define has_assisted_x2apic(d) \
+    ((d)->arch.hvm.assisted_x2apic)
+
 #define hvm_get_guest_time(v) hvm_get_guest_time_fixed(v, 0)
 
 #define hvm_paging_enabled(v) \
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index a2278d9499..a8dba88916 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1121,7 +1121,8 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
         if ( !is_hvm_domain(d) || subleaf != 0 )
             break;
 
-        if ( cpu_has_vmx_apic_reg_virt )
+        if ( cpu_has_vmx_apic_reg_virt &&
+             has_assisted_xapic(d) )
             res->a |= XEN_HVM_CPUID_APIC_ACCESS_VIRT;
 
         /*
@@ -1130,7 +1131,7 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
          * and wrmsr in the guest will run without VMEXITs (see
          * vmx_vlapic_msr_changed()).
          */
-        if ( cpu_has_vmx_virtualize_x2apic_mode &&
+        if ( has_assisted_x2apic(d) &&
              cpu_has_vmx_apic_reg_virt &&
              cpu_has_vmx_virtual_intr_delivery )
             res->a |= XEN_HVM_CPUID_X2APIC_VIRT;
diff --git a/xen/include/public/arch-x86/xen.h b/xen/include/public/arch-x86/xen.h
index 7acd94c8eb..9da32c6239 100644
--- a/xen/include/public/arch-x86/xen.h
+++ b/xen/include/public/arch-x86/xen.h
@@ -317,6 +317,8 @@ struct xen_arch_domainconfig {
  * doesn't allow the guest to read or write to the underlying MSR.
  */
 #define XEN_X86_MSR_RELAXED (1u << 0)
+#define XEN_X86_ASSISTED_XAPIC (1u << 1)
+#define XEN_X86_ASSISTED_X2APIC (1u << 2)
     uint32_t misc_flags;
 };
 
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86
  2022-03-08 17:31   ` [PATCH v6 " Jane Malalane
@ 2022-03-09 10:36     ` Roger Pau Monné
  2022-03-11 15:09       ` Jane Malalane
  2022-03-11 15:04     ` [PATCH v7 " Jane Malalane
  1 sibling, 1 reply; 27+ messages in thread
From: Roger Pau Monné @ 2022-03-09 10:36 UTC (permalink / raw)
  To: Jane Malalane
  Cc: Xen-devel, Wei Liu, Anthony PERARD, Juergen Gross, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Volodymyr Babchuk, Bertrand Marquis, Jun Nakajima, Kevin Tian

On Tue, Mar 08, 2022 at 05:31:17PM +0000, Jane Malalane wrote:
> Add XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC and
> XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC to report accelerated xapic
> and x2apic, on x86 hardware.
> No such features are currently implemented on AMD hardware.
> 
> HW assisted xAPIC virtualization will be reported if HW, at the
> minimum, supports virtualize_apic_accesses as this feature alone means
> that an access to the APIC page will cause an APIC-access VM exit. An
> APIC-access VM exit provides a VMM with information about the access
> causing the VM exit, unlike a regular EPT fault, thus simplifying some
> internal handling.
> 
> HW assisted x2APIC virtualization will be reported if HW supports
> virtualize_x2apic_mode and, at least, either apic_reg_virt or
> virtual_intr_delivery. This also means that
> sysctl follows the conditionals in vmx_vlapic_msr_changed().
> 
> For that purpose, also add an arch-specific "capabilities" parameter
> to struct xen_sysctl_physinfo.
> 
> Note that this interface is intended to be compatible with AMD so that
> AVIC support can be introduced in a future patch. Unlike Intel that
> has multiple controls for APIC Virtualization, AMD has one global
> 'AVIC Enable' control bit, so fine-graining of APIC virtualization
> control cannot be done on a common interface.
> 
> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jane Malalane <jane.malalane@citrix.com>

Overall LGTM, just one question and one nit.

> diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
> index 5b4fe72c8d..7e9c32ad1b 100644
> --- a/tools/ocaml/libs/xc/xenctrl_stubs.c
> +++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
> @@ -712,7 +712,7 @@ CAMLprim value stub_xc_send_debug_keys(value xch, value keys)
>  CAMLprim value stub_xc_physinfo(value xch)
>  {
>  	CAMLparam1(xch);
> -	CAMLlocal2(physinfo, cap_list);
> +	CAMLlocal3(physinfo, cap_list, arch_cap_list);
>  	xc_physinfo_t c_physinfo;
>  	int r;
>  
> @@ -731,7 +731,7 @@ CAMLprim value stub_xc_physinfo(value xch)
>  		/* ! XEN_SYSCTL_PHYSCAP_ XEN_SYSCTL_PHYSCAP_MAX max */
>  		(c_physinfo.capabilities);
>  
> -	physinfo = caml_alloc_tuple(10);
> +	physinfo = caml_alloc_tuple(11);
>  	Store_field(physinfo, 0, Val_int(c_physinfo.threads_per_core));
>  	Store_field(physinfo, 1, Val_int(c_physinfo.cores_per_socket));
>  	Store_field(physinfo, 2, Val_int(c_physinfo.nr_cpus));
> @@ -743,6 +743,17 @@ CAMLprim value stub_xc_physinfo(value xch)
>  	Store_field(physinfo, 8, cap_list);
>  	Store_field(physinfo, 9, Val_int(c_physinfo.max_cpu_id + 1));
>  
> +#if defined(__i386__) || defined(__x86_64__)
> +	/*
> +	 * arch_capabilities: physinfo_arch_cap_flag list;
> +	 */
> +	arch_cap_list = c_bitmap_to_ocaml_list
> +		/* ! physinfo_arch_cap_flag CAP_ none */
> +		/* ! XEN_SYSCTL_PHYSCAP_ XEN_SYSCTL_PHYSCAP_X86_MAX max */
> +		(c_physinfo.arch_capabilities);
> +	Store_field(physinfo, 10, arch_cap_list);
> +#endif

Have you tried to build this on Arm? I wonder whether the compiler
will complain about arch_cap_list being unused there?

> +
>  	CAMLreturn(physinfo);
>  }
>  
> diff --git a/tools/xl/xl_info.c b/tools/xl/xl_info.c
> index 712b7638b0..3205270754 100644
> --- a/tools/xl/xl_info.c
> +++ b/tools/xl/xl_info.c
> @@ -210,7 +210,7 @@ static void output_physinfo(void)
>           info.hw_cap[4], info.hw_cap[5], info.hw_cap[6], info.hw_cap[7]
>          );
>  
> -    maybe_printf("virt_caps              :%s%s%s%s%s%s%s%s%s%s%s\n",
> +    maybe_printf("virt_caps              :%s%s%s%s%s%s%s%s%s%s%s%s%s\n",
>           info.cap_pv ? " pv" : "",
>           info.cap_hvm ? " hvm" : "",
>           info.cap_hvm && info.cap_hvm_directio ? " hvm_directio" : "",
> @@ -221,7 +221,9 @@ static void output_physinfo(void)
>           info.cap_vmtrace ? " vmtrace" : "",
>           info.cap_vpmu ? " vpmu" : "",
>           info.cap_gnttab_v1 ? " gnttab-v1" : "",
> -         info.cap_gnttab_v2 ? " gnttab-v2" : ""
> +         info.cap_gnttab_v2 ? " gnttab-v2" : "",
> +         info.cap_assisted_xapic ? " assisted_xapic" : "",
> +         info.cap_assisted_x2apic ? " assisted_x2apic" : ""
>          );
>  
>      vinfo = libxl_get_version_info(ctx);
> diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
> index e1e1fa14e6..77ce0b2121 100644
> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> @@ -343,6 +343,15 @@ static int vmx_init_vmcs_config(bool bsp)
>              MSR_IA32_VMX_PROCBASED_CTLS2, &mismatch);
>      }
>  
> +    /* Check whether hardware supports accelerated xapic and x2apic. */
> +    if ( bsp )
> +    {
> +        assisted_xapic_available = cpu_has_vmx_virtualize_apic_accesses;
> +        assisted_x2apic_available = cpu_has_vmx_virtualize_x2apic_mode &&
> +                                    (cpu_has_vmx_apic_reg_virt ||
> +                                     cpu_has_vmx_virtual_intr_delivery);
> +    }
> +
>      /* The IA32_VMX_EPT_VPID_CAP MSR exists only when EPT or VPID available */
>      if ( _vmx_secondary_exec_control & (SECONDARY_EXEC_ENABLE_EPT |
>                                          SECONDARY_EXEC_ENABLE_VPID) )
> diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h
> index e62e109598..72431df26d 100644
> --- a/xen/arch/x86/include/asm/domain.h
> +++ b/xen/arch/x86/include/asm/domain.h
> @@ -756,6 +756,9 @@ static inline void pv_inject_sw_interrupt(unsigned int vector)
>                        : is_pv_32bit_domain(d) ? PV32_VM_ASSIST_MASK \
>                                                : PV64_VM_ASSIST_MASK)
>  
> +extern bool assisted_xapic_available;
> +extern bool assisted_x2apic_available;
> +
>  #endif /* __ASM_DOMAIN_H__ */
>  
>  /*
> diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
> index f82abc2488..ad95c86aef 100644
> --- a/xen/arch/x86/sysctl.c
> +++ b/xen/arch/x86/sysctl.c
> @@ -69,6 +69,9 @@ struct l3_cache_info {
>      unsigned long size;
>  };
>  
> +bool __ro_after_init assisted_xapic_available;
> +bool __ro_after_init assisted_x2apic_available;
> +
>  static void cf_check l3_cache_get(void *arg)
>  {
>      struct cpuid4_info info;
> @@ -135,6 +138,10 @@ void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
>          pi->capabilities |= XEN_SYSCTL_PHYSCAP_hap;
>      if ( IS_ENABLED(CONFIG_SHADOW_PAGING) )
>          pi->capabilities |= XEN_SYSCTL_PHYSCAP_shadow;
> +    if ( assisted_xapic_available )
> +        pi->arch_capabilities |= XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC;
> +    if ( assisted_x2apic_available )
> +        pi->arch_capabilities |= XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC;
>  }
>  
>  long arch_do_sysctl(
> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
> index 55252e97f2..7fe05be0c9 100644
> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -35,7 +35,7 @@
>  #include "domctl.h"
>  #include "physdev.h"
>  
> -#define XEN_SYSCTL_INTERFACE_VERSION 0x00000014
> +#define XEN_SYSCTL_INTERFACE_VERSION 0x00000015
>  
>  /*
>   * Read console content from Xen buffer ring.
> @@ -111,6 +111,13 @@ struct xen_sysctl_tbuf_op {
>  /* Max XEN_SYSCTL_PHYSCAP_* constant.  Used for ABI checking. */
>  #define XEN_SYSCTL_PHYSCAP_MAX XEN_SYSCTL_PHYSCAP_gnttab_v2
>  
> +/* The platform supports x{2}apic hardware assisted emulation. */
> +#define XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC  (1u << 0)
> +#define XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC (1u << 1)
> +
> +/* Max XEN_SYSCTL_PHYSCAP_X86__* constant. Used for ABI checking. */
> +#define XEN_SYSCTL_PHYSCAP_X86_MAX XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC
> +
>  struct xen_sysctl_physinfo {
>      uint32_t threads_per_core;
>      uint32_t cores_per_socket;
> @@ -120,6 +127,8 @@ struct xen_sysctl_physinfo {
>      uint32_t max_node_id; /* Largest possible node ID on this host */
>      uint32_t cpu_khz;
>      uint32_t capabilities;/* XEN_SYSCTL_PHYSCAP_??? */
> +    uint32_t arch_capabilities;/* XEN_SYSCTL_PHYSCAP_X86{ARM}_??? */

Nit: comment should likely be:

XEN_SYSCTL_PHYSCAP_{X86,ARM,...}_???

Thanks, Roger.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
  2022-03-08 17:36   ` [PATCH v6 " Jane Malalane
@ 2022-03-09 11:48     ` Roger Pau Monné
  2022-03-09 13:23     ` Jan Beulich
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 27+ messages in thread
From: Roger Pau Monné @ 2022-03-09 11:48 UTC (permalink / raw)
  To: Jane Malalane
  Cc: Xen-devel, Wei Liu, Anthony PERARD, Juergen Gross, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Christian Lindig, David Scott, Volodymyr Babchuk

On Tue, Mar 08, 2022 at 05:36:43PM +0000, Jane Malalane wrote:
> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> index 39fdca1b49..ba5b8f433f 100644
> --- a/tools/libs/light/libxl_arm.c
> +++ b/tools/libs/light/libxl_arm.c
> @@ -1384,8 +1384,9 @@ void libxl__arch_domain_create_info_setdefault(libxl__gc *gc,
>      }
>  }
>  
> -void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
> -                                              libxl_domain_build_info *b_info)
> +int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
> +                                             libxl_domain_build_info *b_info,
> +                                             const libxl_physinfo *physinfo)
>  {
>      /* ACPI is disabled by default */
>      libxl_defbool_setdefault(&b_info->acpi, false);
> @@ -1399,6 +1400,8 @@ void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
>      memset(&b_info->u, '\0', sizeof(b_info->u));
>      b_info->type = LIBXL_DOMAIN_TYPE_INVALID;
>      libxl_domain_build_info_init_type(b_info, LIBXL_DOMAIN_TYPE_PVH);
> +
> +    return 0;

There's a void return above the memset (out of context in the diff)
that you also need to patch to 'return 0;'.


> diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
> index 77ce0b2121..7adb043ab7 100644
> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> @@ -1157,6 +1157,10 @@ static int construct_vmcs(struct vcpu *v)
>          __vmwrite(PLE_WINDOW, ple_window);
>      }
>  
> +    if ( !has_assisted_xapic(v->domain) )

Nit: you already have a local 'd' variable here that's v->domain, so
please use that.

> diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h
> index 5b7ec0cf69..65a978f670 100644
> --- a/xen/arch/x86/include/asm/hvm/hvm.h
> +++ b/xen/arch/x86/include/asm/hvm/hvm.h
> @@ -373,6 +373,12 @@ int hvm_get_param(struct domain *d, uint32_t index, uint64_t *value);
>  #define hvm_tsc_scaling_ratio(d) \
>      ((d)->arch.hvm.tsc_scaling_ratio)
>  
> +#define has_assisted_xapic(d) \
> +    ((d)->arch.hvm.assisted_xapic)
> +
> +#define has_assisted_x2apic(d) \
> +    ((d)->arch.hvm.assisted_x2apic)

Nit: I think there's no need to split those into two lines, ie:

#define has_assisted_xapic(d) ((d)->arch.hvm.assisted_xapic)

Is well below the 80 column limit.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
  2022-03-08 17:36   ` [PATCH v6 " Jane Malalane
  2022-03-09 11:48     ` Roger Pau Monné
@ 2022-03-09 13:23     ` Jan Beulich
  2022-03-09 13:51     ` Roger Pau Monné
  2022-03-11 15:08     ` [PATCH v7 " Jane Malalane
  3 siblings, 0 replies; 27+ messages in thread
From: Jan Beulich @ 2022-03-09 13:23 UTC (permalink / raw)
  To: Jane Malalane
  Cc: Wei Liu, Anthony PERARD, Juergen Gross, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini,
	Christian Lindig, David Scott, Volodymyr Babchuk,
	Roger Pau Monné,
	Xen-devel

On 08.03.2022 18:36, Jane Malalane wrote:
> @@ -685,13 +687,31 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
>          }
>      }
>  
> -    if ( config->arch.misc_flags & ~XEN_X86_MSR_RELAXED )
> +    if ( config->arch.misc_flags & ~(XEN_X86_MSR_RELAXED |
> +                                     XEN_X86_ASSISTED_XAPIC |
> +                                     XEN_X86_ASSISTED_X2APIC) )
>      {
>          dprintk(XENLOG_INFO, "Invalid arch misc flags %#x\n",
>                  config->arch.misc_flags);
>          return -EINVAL;
>      }
>  
> +    if ( (assisted_xapic || assisted_x2apic) && !hvm )
> +    {
> +        dprintk(XENLOG_INFO,
> +                "Interrupt Controller Virtualization not supported for PV\n");
> +        return -ENODEV;
> +    }
> +
> +    if ( (assisted_xapic && !assisted_xapic_available) ||
> +         (assisted_x2apic && !assisted_x2apic_available) )
> +    {
> +        dprintk(XENLOG_INFO,
> +                "Hardware assisted x%sAPIC requested but not available\n",
> +                assisted_xapic && !assisted_xapic_available ? "" : "2");
> +        return -EINVAL;
> +    }

My understanding of the outcome of the prior discussion was the opposite
use of EINVAL and ENODEV respectively.

Jan



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
  2022-03-08 17:36   ` [PATCH v6 " Jane Malalane
  2022-03-09 11:48     ` Roger Pau Monné
  2022-03-09 13:23     ` Jan Beulich
@ 2022-03-09 13:51     ` Roger Pau Monné
  2022-03-11 15:08     ` [PATCH v7 " Jane Malalane
  3 siblings, 0 replies; 27+ messages in thread
From: Roger Pau Monné @ 2022-03-09 13:51 UTC (permalink / raw)
  To: Jane Malalane
  Cc: Xen-devel, Wei Liu, Anthony PERARD, Juergen Gross, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Christian Lindig, David Scott, Volodymyr Babchuk

On Tue, Mar 08, 2022 at 05:36:43PM +0000, Jane Malalane wrote:
> Introduce a new per-domain creation x86 specific flag to
> select whether hardware assisted virtualization should be used for
> x{2}APIC.
> 
> A per-domain option is added to xl in order to select the usage of
> x{2}APIC hardware assisted virtualization, as well as a global
> configuration option.
> 
> Having all APIC interaction exit to Xen for emulation is slow and can
> induce much overhead. Hardware can speed up x{2}APIC by decoding the
> APIC access and providing a VM exit with a more specific exit reason
> than a regular EPT fault or by altogether avoiding a VM exit.
> 
> On the other hand, being able to disable x{2}APIC hardware assisted
> virtualization can be useful for testing and debugging purposes.
> 
> Note: vmx_install_vlapic_mapping doesn't require modifications
> regardless of whether the guest has "Virtualize APIC accesses" enabled
> or not, i.e., setting the APIC_ACCESS_ADDR VMCS field is fine so long
> as virtualize_apic_accesses is supported by the CPU.
> 
> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jane Malalane <jane.malalane@citrix.com>
> ---
> CC: Wei Liu <wl@xen.org>
> CC: Anthony PERARD <anthony.perard@citrix.com>
> CC: Juergen Gross <jgross@suse.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: George Dunlap <george.dunlap@citrix.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Julien Grall <julien@xen.org>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Christian Lindig <christian.lindig@citrix.com>
> CC: David Scott <dave@recoil.org>
> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
> CC: "Roger Pau Monné" <roger.pau@citrix.com>
> 
> v6:
>  * Use ENODEV instead of EINVAL when rejecting assisted_x{2}apic
>    for PV guests
>  * Move has_assisted_x{2}apic macros out of an Intel specific header
>  * Remove references to Intel specific features in documentation
> 
> v5:
>  * Revert v4 changes in vmx_vlapic_msr_changed(), preserving the use of
>    the has_assisted_x{2}apic macros
>  * Following changes in assisted_x{2}apic_available definitions in
>    patch 1, retighten conditionals for setting
>    XEN_HVM_CPUID_APIC_ACCESS_VIRT and XEN_HVM_CPUID_X2APIC_VIRT in
>    cpuid_hypervisor_leaves()
> 
> v4:
>  * Add has_assisted_x{2}apic macros and use them where appropriate
>  * Replace CPU checks with per-domain assisted_x{2}apic control
>    options in vmx_vlapic_msr_changed() and cpuid_hypervisor_leaves(),
>    following edits to assisted_x{2}apic_available definitions in
>    patch 1
>    Note: new assisted_x{2}apic_available definitions make later
>    cpu_has_vmx_apic_reg_virt and cpu_has_vmx_virtual_intr_delivery
>    checks redundant in vmx_vlapic_msr_changed()
> 
> v3:
>  * Change info in xl.cfg to better express reality and fix
>    capitalization of x{2}apic
>  * Move "physinfo" variable definition to the beggining of
>    libxl__domain_build_info_setdefault()
>  * Reposition brackets in if statement to match libxl coding style
>  * Shorten logic in libxl__arch_domain_build_info_setdefault()
>  * Correct dprintk message in arch_sanitise_domain_config()
>  * Make appropriate changes in vmx_vlapic_msr_changed() and
>    cpuid_hypervisor_leaves() for amended "assisted_x2apic" bit
>  * Remove unneeded parantheses
> 
> v2:
>  * Add a LIBXL_HAVE_ASSISTED_APIC macro
>  * Pass xcpyshinfo as a pointer in libxl__arch_get_physinfo
>  * Add a return statement in now "int"
>    libxl__arch_domain_build_info_setdefault()
>  * Preserve libxl__arch_domain_build_info_setdefault 's location in
>    libxl_create.c
>  * Correct x{2}apic default setting logic in
>    libxl__arch_domain_prepare_config()
>  * Correct logic for parsing assisted_x{2}apic host/guest options in
>    xl_parse.c and initialize them to -1 in xl.c
>  * Use guest options directly in vmx_vlapic_msr_changed
>  * Fix indentation of bool assisted_x{2}apic in struct hvm_domain
>  * Add a change in xenctrl_stubs.c to pass xenctrl ABI checks
> ---
>  docs/man/xl.cfg.5.pod.in              | 15 +++++++++++++++
>  docs/man/xl.conf.5.pod.in             | 12 ++++++++++++
>  tools/golang/xenlight/helpers.gen.go  | 12 ++++++++++++
>  tools/golang/xenlight/types.gen.go    |  2 ++
>  tools/include/libxl.h                 |  7 +++++++
>  tools/libs/light/libxl_arch.h         |  5 +++--
>  tools/libs/light/libxl_arm.c          |  7 +++++--
>  tools/libs/light/libxl_create.c       | 22 +++++++++++++---------
>  tools/libs/light/libxl_types.idl      |  2 ++
>  tools/libs/light/libxl_x86.c          | 27 +++++++++++++++++++++++++--
>  tools/ocaml/libs/xc/xenctrl.ml        |  2 ++
>  tools/ocaml/libs/xc/xenctrl.mli       |  2 ++
>  tools/ocaml/libs/xc/xenctrl_stubs.c   |  2 +-
>  tools/xl/xl.c                         |  8 ++++++++
>  tools/xl/xl.h                         |  2 ++
>  tools/xl/xl_parse.c                   | 16 ++++++++++++++++
>  xen/arch/x86/domain.c                 | 28 +++++++++++++++++++++++++++-
>  xen/arch/x86/hvm/vmx/vmcs.c           |  4 ++++
>  xen/arch/x86/hvm/vmx/vmx.c            | 13 ++++---------
>  xen/arch/x86/include/asm/hvm/domain.h |  6 ++++++
>  xen/arch/x86/include/asm/hvm/hvm.h    |  6 ++++++
>  xen/arch/x86/traps.c                  |  5 +++--
>  xen/include/public/arch-x86/xen.h     |  2 ++
>  23 files changed, 179 insertions(+), 28 deletions(-)
> 
> diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
> index b98d161398..b4239fcc5e 100644
> --- a/docs/man/xl.cfg.5.pod.in
> +++ b/docs/man/xl.cfg.5.pod.in
> @@ -1862,6 +1862,21 @@ firmware tables when using certain older guest Operating
>  Systems. These tables have been superseded by newer constructs within
>  the ACPI tables.
>  
> +=item B<assisted_xapic=BOOLEAN>
> +
> +B<(x86 only)> Enables or disables hardware assisted virtualization for
> +xAPIC. With this option enabled, a memory-mapped APIC access will be
> +decoded by hardware and either issue a more specific VM exit than just
> +an EPT fault, or altogether avoid a VM exit. The
> +default is settable via L<xl.conf(5)>.

Sorry, replied to a v2 version of the patch instead of this one. Could
we use 'p2m fault' or maybe 'guest physmap fault' intead of EPT (as
that's Intel specific).

Thanks, Roger.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
  2022-03-08 14:41           ` Jan Beulich
@ 2022-03-09 15:56             ` Jane Malalane
  2022-03-09 16:51               ` Jan Beulich
  0 siblings, 1 reply; 27+ messages in thread
From: Jane Malalane @ 2022-03-09 15:56 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Xen-devel, Wei Liu, Anthony Perard, Juergen Gross, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini,
	Christian Lindig, David Scott, Volodymyr Babchuk,
	Roger Pau Monne

On 08/03/2022 14:41, Jan Beulich wrote:
> [CAUTION - EXTERNAL EMAIL] DO NOT reply, click links, or open attachments unless you have verified the sender and know the content is safe.
> 
> On 08.03.2022 15:31, Jane Malalane wrote:
>> On 08/03/2022 12:33, Roger Pau Monné wrote:
>>> On Tue, Mar 08, 2022 at 01:24:23PM +0100, Jan Beulich wrote:
>>>> On 08.03.2022 12:38, Roger Pau Monné wrote:
>>>>> On Mon, Mar 07, 2022 at 03:06:09PM +0000, Jane Malalane wrote:
>>>>>> @@ -685,13 +687,31 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
>>>>>>            }
>>>>>>        }
>>>>>>    
>>>>>> -    if ( config->arch.misc_flags & ~XEN_X86_MSR_RELAXED )
>>>>>> +    if ( config->arch.misc_flags & ~(XEN_X86_MSR_RELAXED |
>>>>>> +                                     XEN_X86_ASSISTED_XAPIC |
>>>>>> +                                     XEN_X86_ASSISTED_X2APIC) )
>>>>>>        {
>>>>>>            dprintk(XENLOG_INFO, "Invalid arch misc flags %#x\n",
>>>>>>                    config->arch.misc_flags);
>>>>>>            return -EINVAL;
>>>>>>        }
>>>>>>    
>>>>>> +    if ( (assisted_xapic || assisted_x2apic) && !hvm )
>>>>>> +    {
>>>>>> +        dprintk(XENLOG_INFO,
>>>>>> +                "Interrupt Controller Virtualization not supported for PV\n");
>>>>>> +        return -EINVAL;
>>>>>> +    }
>>>>>> +
>>>>>> +    if ( (assisted_xapic && !assisted_xapic_available) ||
>>>>>> +         (assisted_x2apic && !assisted_x2apic_available) )
>>>>>> +    {
>>>>>> +        dprintk(XENLOG_INFO,
>>>>>> +                "Hardware assisted x%sAPIC requested but not available\n",
>>>>>> +                assisted_xapic && !assisted_xapic_available ? "" : "2");
>>>>>> +        return -EINVAL;
>>>>>
>>>>> I think for those two you could return -ENODEV if others agree.
>>>>
>>>> If by "two" you mean the xAPIC and x2APIC aspects here (and not e.g. this
>>>> and the earlier if()), then I agree. I'm always in favor of using distinct
>>>> error codes when possible and at least halfway sensible.
>>>
>>> I would be fine by using it for the !hvm if also. IMO it makes sense
>>> as PV doesn't have an APIC 'device' at all, so ENODEV would seem
>>> fitting. EINVAL is also fine as the caller shouldn't even attempt that
>>> in the first place.
>>>
>>> So let's use it for the last if only.
>> Wouldn't it make more sense to use -ENODEV particularly for the first? I
>> agree that -ENODEV should be reported in the first case because it
>> doesn't make sense to request acceleration of something that doesn't
>> exist and I should have put that. But having a look at the hap code
>> (since it resembles the second case), it returns -EINVAL when it is not
>> available, unless you deem this to be different or, in retrospective,
>> that the hap code should too have been coded to return -ENODEV.
>>
>> if ( hap && !hvm_hap_supported() )
>>       {
>>           dprintk(XENLOG_INFO, "HAP requested but not available\n");
>>           return -EINVAL;
>>       }
> 
> This is just one of the examples where using -ENODEV as you suggest
> would introduce an inconsistency. We use -EINVAL also for other
> purely guest-type dependent checks.
> 
> Jan
Hi Jan, so here I was comparing the hap implementation with the second 
case, i.e.

if ( (assisted_xapic && !assisted_xapic_available) ||
      (assisted_x2apic && !assisted_x2apic_available) )

and you seem to agree that using -ENODEV would be inconsistent? Have I 
misinterpreted this?

Thanks,

Jane.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
  2022-03-09 15:56             ` Jane Malalane
@ 2022-03-09 16:51               ` Jan Beulich
  2022-03-09 17:02                 ` Jane Malalane
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2022-03-09 16:51 UTC (permalink / raw)
  To: Jane Malalane
  Cc: Xen-devel, Wei Liu, Anthony Perard, Juergen Gross, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini,
	Christian Lindig, David Scott, Volodymyr Babchuk,
	Roger Pau Monne

On 09.03.2022 16:56, Jane Malalane wrote:
> On 08/03/2022 14:41, Jan Beulich wrote:
>> [CAUTION - EXTERNAL EMAIL] DO NOT reply, click links, or open attachments unless you have verified the sender and know the content is safe.
>>
>> On 08.03.2022 15:31, Jane Malalane wrote:
>>> On 08/03/2022 12:33, Roger Pau Monné wrote:
>>>> On Tue, Mar 08, 2022 at 01:24:23PM +0100, Jan Beulich wrote:
>>>>> On 08.03.2022 12:38, Roger Pau Monné wrote:
>>>>>> On Mon, Mar 07, 2022 at 03:06:09PM +0000, Jane Malalane wrote:
>>>>>>> @@ -685,13 +687,31 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
>>>>>>>            }
>>>>>>>        }
>>>>>>>    
>>>>>>> -    if ( config->arch.misc_flags & ~XEN_X86_MSR_RELAXED )
>>>>>>> +    if ( config->arch.misc_flags & ~(XEN_X86_MSR_RELAXED |
>>>>>>> +                                     XEN_X86_ASSISTED_XAPIC |
>>>>>>> +                                     XEN_X86_ASSISTED_X2APIC) )
>>>>>>>        {
>>>>>>>            dprintk(XENLOG_INFO, "Invalid arch misc flags %#x\n",
>>>>>>>                    config->arch.misc_flags);
>>>>>>>            return -EINVAL;
>>>>>>>        }
>>>>>>>    
>>>>>>> +    if ( (assisted_xapic || assisted_x2apic) && !hvm )
>>>>>>> +    {
>>>>>>> +        dprintk(XENLOG_INFO,
>>>>>>> +                "Interrupt Controller Virtualization not supported for PV\n");
>>>>>>> +        return -EINVAL;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    if ( (assisted_xapic && !assisted_xapic_available) ||
>>>>>>> +         (assisted_x2apic && !assisted_x2apic_available) )
>>>>>>> +    {
>>>>>>> +        dprintk(XENLOG_INFO,
>>>>>>> +                "Hardware assisted x%sAPIC requested but not available\n",
>>>>>>> +                assisted_xapic && !assisted_xapic_available ? "" : "2");
>>>>>>> +        return -EINVAL;
>>>>>>
>>>>>> I think for those two you could return -ENODEV if others agree.
>>>>>
>>>>> If by "two" you mean the xAPIC and x2APIC aspects here (and not e.g. this
>>>>> and the earlier if()), then I agree. I'm always in favor of using distinct
>>>>> error codes when possible and at least halfway sensible.
>>>>
>>>> I would be fine by using it for the !hvm if also. IMO it makes sense
>>>> as PV doesn't have an APIC 'device' at all, so ENODEV would seem
>>>> fitting. EINVAL is also fine as the caller shouldn't even attempt that
>>>> in the first place.
>>>>
>>>> So let's use it for the last if only.
>>> Wouldn't it make more sense to use -ENODEV particularly for the first? I
>>> agree that -ENODEV should be reported in the first case because it
>>> doesn't make sense to request acceleration of something that doesn't
>>> exist and I should have put that. But having a look at the hap code
>>> (since it resembles the second case), it returns -EINVAL when it is not
>>> available, unless you deem this to be different or, in retrospective,
>>> that the hap code should too have been coded to return -ENODEV.
>>>
>>> if ( hap && !hvm_hap_supported() )
>>>       {
>>>           dprintk(XENLOG_INFO, "HAP requested but not available\n");
>>>           return -EINVAL;
>>>       }
>>
>> This is just one of the examples where using -ENODEV as you suggest
>> would introduce an inconsistency. We use -EINVAL also for other
>> purely guest-type dependent checks.
>>
>> Jan
> Hi Jan, so here I was comparing the hap implementation with the second 
> case, i.e.
> 
> if ( (assisted_xapic && !assisted_xapic_available) ||
>       (assisted_x2apic && !assisted_x2apic_available) )
> 
> and you seem to agree that using -ENODEV would be inconsistent? Have I 
> misinterpreted this?

Not exactly. I'm comparing existing hap / hvm / !hap / !hvm uses with
what you add.

Jan



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v5 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
  2022-03-09 16:51               ` Jan Beulich
@ 2022-03-09 17:02                 ` Jane Malalane
  0 siblings, 0 replies; 27+ messages in thread
From: Jane Malalane @ 2022-03-09 17:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Xen-devel, Wei Liu, Anthony Perard, Juergen Gross, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini,
	Christian Lindig, David Scott, Volodymyr Babchuk,
	Roger Pau Monne

On 09/03/2022 16:51, Jan Beulich wrote:
> [CAUTION - EXTERNAL EMAIL] DO NOT reply, click links, or open attachments unless you have verified the sender and know the content is safe.
> 
> On 09.03.2022 16:56, Jane Malalane wrote:
>> On 08/03/2022 14:41, Jan Beulich wrote:
>>> [CAUTION - EXTERNAL EMAIL] DO NOT reply, click links, or open attachments unless you have verified the sender and know the content is safe.
>>>
>>> On 08.03.2022 15:31, Jane Malalane wrote:
>>>> On 08/03/2022 12:33, Roger Pau Monné wrote:
>>>>> On Tue, Mar 08, 2022 at 01:24:23PM +0100, Jan Beulich wrote:
>>>>>> On 08.03.2022 12:38, Roger Pau Monné wrote:
>>>>>>> On Mon, Mar 07, 2022 at 03:06:09PM +0000, Jane Malalane wrote:
>>>>>>>> @@ -685,13 +687,31 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
>>>>>>>>             }
>>>>>>>>         }
>>>>>>>>     
>>>>>>>> -    if ( config->arch.misc_flags & ~XEN_X86_MSR_RELAXED )
>>>>>>>> +    if ( config->arch.misc_flags & ~(XEN_X86_MSR_RELAXED |
>>>>>>>> +                                     XEN_X86_ASSISTED_XAPIC |
>>>>>>>> +                                     XEN_X86_ASSISTED_X2APIC) )
>>>>>>>>         {
>>>>>>>>             dprintk(XENLOG_INFO, "Invalid arch misc flags %#x\n",
>>>>>>>>                     config->arch.misc_flags);
>>>>>>>>             return -EINVAL;
>>>>>>>>         }
>>>>>>>>     
>>>>>>>> +    if ( (assisted_xapic || assisted_x2apic) && !hvm )
>>>>>>>> +    {
>>>>>>>> +        dprintk(XENLOG_INFO,
>>>>>>>> +                "Interrupt Controller Virtualization not supported for PV\n");
>>>>>>>> +        return -EINVAL;
>>>>>>>> +    }
>>>>>>>> +
>>>>>>>> +    if ( (assisted_xapic && !assisted_xapic_available) ||
>>>>>>>> +         (assisted_x2apic && !assisted_x2apic_available) )
>>>>>>>> +    {
>>>>>>>> +        dprintk(XENLOG_INFO,
>>>>>>>> +                "Hardware assisted x%sAPIC requested but not available\n",
>>>>>>>> +                assisted_xapic && !assisted_xapic_available ? "" : "2");
>>>>>>>> +        return -EINVAL;
>>>>>>>
>>>>>>> I think for those two you could return -ENODEV if others agree.
>>>>>>
>>>>>> If by "two" you mean the xAPIC and x2APIC aspects here (and not e.g. this
>>>>>> and the earlier if()), then I agree. I'm always in favor of using distinct
>>>>>> error codes when possible and at least halfway sensible.
>>>>>
>>>>> I would be fine by using it for the !hvm if also. IMO it makes sense
>>>>> as PV doesn't have an APIC 'device' at all, so ENODEV would seem
>>>>> fitting. EINVAL is also fine as the caller shouldn't even attempt that
>>>>> in the first place.
>>>>>
>>>>> So let's use it for the last if only.
>>>> Wouldn't it make more sense to use -ENODEV particularly for the first? I
>>>> agree that -ENODEV should be reported in the first case because it
>>>> doesn't make sense to request acceleration of something that doesn't
>>>> exist and I should have put that. But having a look at the hap code
>>>> (since it resembles the second case), it returns -EINVAL when it is not
>>>> available, unless you deem this to be different or, in retrospective,
>>>> that the hap code should too have been coded to return -ENODEV.
>>>>
>>>> if ( hap && !hvm_hap_supported() )
>>>>        {
>>>>            dprintk(XENLOG_INFO, "HAP requested but not available\n");
>>>>            return -EINVAL;
>>>>        }
>>>
>>> This is just one of the examples where using -ENODEV as you suggest
>>> would introduce an inconsistency. We use -EINVAL also for other
>>> purely guest-type dependent checks.
>>>
>>> Jan
>> Hi Jan, so here I was comparing the hap implementation with the second
>> case, i.e.
>>
>> if ( (assisted_xapic && !assisted_xapic_available) ||
>>        (assisted_x2apic && !assisted_x2apic_available) )
>>
>> and you seem to agree that using -ENODEV would be inconsistent? Have I
>> misinterpreted this?
> 
> Not exactly. I'm comparing existing hap / hvm / !hap / !hvm uses with
> what you add.
> 
> Jan
> 
Okay, I wil swap the error codes then, thank you.

Jane.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v7 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86
  2022-03-08 17:31   ` [PATCH v6 " Jane Malalane
  2022-03-09 10:36     ` Roger Pau Monné
@ 2022-03-11 15:04     ` Jane Malalane
  1 sibling, 0 replies; 27+ messages in thread
From: Jane Malalane @ 2022-03-11 15:04 UTC (permalink / raw)
  To: Xen-devel
  Cc: Jane Malalane, Wei Liu, Anthony PERARD, Juergen Gross,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Volodymyr Babchuk, Bertrand Marquis,
	Jun Nakajima, Kevin Tian, Roger Pau Monné

Add XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC and
XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC to report accelerated xapic
and x2apic, on x86 hardware.
No such features are currently implemented on AMD hardware.

HW assisted xAPIC virtualization will be reported if HW, at the
minimum, supports virtualize_apic_accesses as this feature alone means
that an access to the APIC page will cause an APIC-access VM exit. An
APIC-access VM exit provides a VMM with information about the access
causing the VM exit, unlike a regular EPT fault, thus simplifying some
internal handling.

HW assisted x2APIC virtualization will be reported if HW supports
virtualize_x2apic_mode and, at least, either apic_reg_virt or
virtual_intr_delivery. This also means that
sysctl follows the conditionals in vmx_vlapic_msr_changed().

For that purpose, also add an arch-specific "capabilities" parameter
to struct xen_sysctl_physinfo.

Note that this interface is intended to be compatible with AMD so that
AVIC support can be introduced in a future patch. Unlike Intel that
has multiple controls for APIC Virtualization, AMD has one global
'AVIC Enable' control bit, so fine-graining of APIC virtualization
control cannot be done on a common interface.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jane Malalane <jane.malalane@citrix.com>
---
CC: Wei Liu <wl@xen.org>
CC: Anthony PERARD <anthony.perard@citrix.com>
CC: Juergen Gross <jgross@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: George Dunlap <george.dunlap@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Julien Grall <julien@xen.org>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Bertrand Marquis <bertrand.marquis@arm.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: "Roger Pau Monné" <roger.pau@citrix.com>

v7:
 * Make sure assisted_x{2}apic_available evaluates to false, to ensure
   Xen builds, when !CONFIG_HVM
 * Fix coding style issues

v6:
 * Limit abi check to x86
 * Fix coding style issue

v5:
 * Have assisted_xapic_available solely depend on
   cpu_has_vmx_virtualize_apic_accesses and assisted_x2apic_available
   depend on cpu_has_vmx_virtualize_x2apic_mode and
   cpu_has_vmx_apic_reg_virt OR cpu_has_vmx_virtual_intr_delivery

v4:
 * Fallback to the original v2/v1 conditions for setting
   assisted_xapic_available and assisted_x2apic_available so that in
   the future APIC virtualization can be exposed on AMD hardware
   since fine-graining of "AVIC" is not supported, i.e., AMD solely
   uses "AVIC Enable". This also means that sysctl mimics what's
   exposed in CPUID

v3:
 * Define XEN_SYSCTL_PHYSCAP_ARCH_MAX for ABI checking and actually
   set "arch_capbilities", via a call to c_bitmap_to_ocaml_list()
 * Have assisted_x2apic_available only depend on
   cpu_has_vmx_virtualize_x2apic_mode

v2:
 * Use one macro LIBXL_HAVE_PHYSINFO_ASSISTED_APIC instead of two
 * Pass xcpyshinfo as a pointer in libxl__arch_get_physinfo
 * Set assisted_x{2}apic_available to be conditional upon "bsp" and
   annotate it with __ro_after_init
 * Change XEN_SYSCTL_PHYSCAP_ARCH_ASSISTED_X{2}APIC to
   _X86_ASSISTED_X{2}APIC
 * Keep XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X{2}APIC contained within
   sysctl.h
 * Fix padding introduced in struct xen_sysctl_physinfo and bump
   XEN_SYSCTL_INTERFACE_VERSION
---
 tools/golang/xenlight/helpers.gen.go |  4 ++++
 tools/golang/xenlight/types.gen.go   |  2 ++
 tools/include/libxl.h                |  7 +++++++
 tools/libs/light/libxl.c             |  3 +++
 tools/libs/light/libxl_arch.h        |  4 ++++
 tools/libs/light/libxl_arm.c         |  5 +++++
 tools/libs/light/libxl_types.idl     |  2 ++
 tools/libs/light/libxl_x86.c         | 11 +++++++++++
 tools/ocaml/libs/xc/xenctrl.ml       |  5 +++++
 tools/ocaml/libs/xc/xenctrl.mli      |  5 +++++
 tools/ocaml/libs/xc/xenctrl_stubs.c  | 15 +++++++++++++--
 tools/xl/xl_info.c                   |  6 ++++--
 xen/arch/x86/hvm/hvm.c               |  3 +++
 xen/arch/x86/hvm/vmx/vmcs.c          |  9 +++++++++
 xen/arch/x86/include/asm/hvm/hvm.h   |  5 +++++
 xen/arch/x86/sysctl.c                |  4 ++++
 xen/include/public/sysctl.h          | 11 ++++++++++-
 17 files changed, 96 insertions(+), 5 deletions(-)

diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
index b746ff1081..dd4e6c9f14 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -3373,6 +3373,8 @@ x.CapVmtrace = bool(xc.cap_vmtrace)
 x.CapVpmu = bool(xc.cap_vpmu)
 x.CapGnttabV1 = bool(xc.cap_gnttab_v1)
 x.CapGnttabV2 = bool(xc.cap_gnttab_v2)
+x.CapAssistedXapic = bool(xc.cap_assisted_xapic)
+x.CapAssistedX2Apic = bool(xc.cap_assisted_x2apic)
 
  return nil}
 
@@ -3407,6 +3409,8 @@ xc.cap_vmtrace = C.bool(x.CapVmtrace)
 xc.cap_vpmu = C.bool(x.CapVpmu)
 xc.cap_gnttab_v1 = C.bool(x.CapGnttabV1)
 xc.cap_gnttab_v2 = C.bool(x.CapGnttabV2)
+xc.cap_assisted_xapic = C.bool(x.CapAssistedXapic)
+xc.cap_assisted_x2apic = C.bool(x.CapAssistedX2Apic)
 
  return nil
  }
diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
index b1e84d5258..87be46c745 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -1014,6 +1014,8 @@ CapVmtrace bool
 CapVpmu bool
 CapGnttabV1 bool
 CapGnttabV2 bool
+CapAssistedXapic bool
+CapAssistedX2Apic bool
 }
 
 type Connectorinfo struct {
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 51a9b6cfac..94e6355822 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -528,6 +528,13 @@
 #define LIBXL_HAVE_MAX_GRANT_VERSION 1
 
 /*
+ * LIBXL_HAVE_PHYSINFO_ASSISTED_APIC indicates that libxl_physinfo has
+ * cap_assisted_xapic and cap_assisted_x2apic fields, which indicates
+ * the availability of x{2}APIC hardware assisted virtualization.
+ */
+#define LIBXL_HAVE_PHYSINFO_ASSISTED_APIC 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
diff --git a/tools/libs/light/libxl.c b/tools/libs/light/libxl.c
index a0bf7d186f..6d699951e2 100644
--- a/tools/libs/light/libxl.c
+++ b/tools/libs/light/libxl.c
@@ -15,6 +15,7 @@
 #include "libxl_osdeps.h"
 
 #include "libxl_internal.h"
+#include "libxl_arch.h"
 
 int libxl_ctx_alloc(libxl_ctx **pctx, int version,
                     unsigned flags, xentoollog_logger * lg)
@@ -410,6 +411,8 @@ int libxl_get_physinfo(libxl_ctx *ctx, libxl_physinfo *physinfo)
     physinfo->cap_gnttab_v2 =
         !!(xcphysinfo.capabilities & XEN_SYSCTL_PHYSCAP_gnttab_v2);
 
+    libxl__arch_get_physinfo(physinfo, &xcphysinfo);
+
     GC_FREE;
     return 0;
 }
diff --git a/tools/libs/light/libxl_arch.h b/tools/libs/light/libxl_arch.h
index 1522ecb97f..207ceac6a1 100644
--- a/tools/libs/light/libxl_arch.h
+++ b/tools/libs/light/libxl_arch.h
@@ -86,6 +86,10 @@ int libxl__arch_extra_memory(libxl__gc *gc,
                              uint64_t *out);
 
 _hidden
+void libxl__arch_get_physinfo(libxl_physinfo *physinfo,
+                              const xc_physinfo_t *xcphysinfo);
+
+_hidden
 void libxl__arch_update_domain_config(libxl__gc *gc,
                                       libxl_domain_config *dst,
                                       const libxl_domain_config *src);
diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index eef1de0939..39fdca1b49 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -1431,6 +1431,11 @@ int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
     return rc;
 }
 
+void libxl__arch_get_physinfo(libxl_physinfo *physinfo,
+                              const xc_physinfo_t *xcphysinfo)
+{
+}
+
 void libxl__arch_update_domain_config(libxl__gc *gc,
                                       libxl_domain_config *dst,
                                       const libxl_domain_config *src)
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 2a42da2f7d..42ac6c357b 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -1068,6 +1068,8 @@ libxl_physinfo = Struct("physinfo", [
     ("cap_vpmu", bool),
     ("cap_gnttab_v1", bool),
     ("cap_gnttab_v2", bool),
+    ("cap_assisted_xapic", bool),
+    ("cap_assisted_x2apic", bool),
     ], dir=DIR_OUT)
 
 libxl_connectorinfo = Struct("connectorinfo", [
diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
index 1feadebb18..e0a06ecfe3 100644
--- a/tools/libs/light/libxl_x86.c
+++ b/tools/libs/light/libxl_x86.c
@@ -866,6 +866,17 @@ int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
     return rc;
 }
 
+void libxl__arch_get_physinfo(libxl_physinfo *physinfo,
+                              const xc_physinfo_t *xcphysinfo)
+{
+    physinfo->cap_assisted_xapic =
+        !!(xcphysinfo->arch_capabilities &
+           XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC);
+    physinfo->cap_assisted_x2apic =
+        !!(xcphysinfo->arch_capabilities &
+           XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC);
+}
+
 void libxl__arch_update_domain_config(libxl__gc *gc,
                                       libxl_domain_config *dst,
                                       const libxl_domain_config *src)
diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
index 7503031d8f..712456e098 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -127,6 +127,10 @@ type physinfo_cap_flag =
 	| CAP_Gnttab_v1
 	| CAP_Gnttab_v2
 
+type physinfo_arch_cap_flag =
+	| CAP_X86_ASSISTED_XAPIC
+	| CAP_X86_ASSISTED_X2APIC
+
 type physinfo =
 {
 	threads_per_core : int;
@@ -140,6 +144,7 @@ type physinfo =
 	(* XXX hw_cap *)
 	capabilities     : physinfo_cap_flag list;
 	max_nr_cpus      : int;
+	arch_capabilities : physinfo_arch_cap_flag list;
 }
 
 type version =
diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index d1d9c9247a..b034434f68 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -112,6 +112,10 @@ type physinfo_cap_flag =
   | CAP_Gnttab_v1
   | CAP_Gnttab_v2
 
+type physinfo_arch_cap_flag =
+  | CAP_X86_ASSISTED_XAPIC
+  | CAP_X86_ASSISTED_X2APIC
+
 type physinfo = {
   threads_per_core : int;
   cores_per_socket : int;
@@ -123,6 +127,7 @@ type physinfo = {
   scrub_pages      : nativeint;
   capabilities     : physinfo_cap_flag list;
   max_nr_cpus      : int; (** compile-time max possible number of nr_cpus *)
+  arch_capabilities : physinfo_arch_cap_flag list;
 }
 type version = { major : int; minor : int; extra : string; }
 type compile_info = {
diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
index 5b4fe72c8d..7e9c32ad1b 100644
--- a/tools/ocaml/libs/xc/xenctrl_stubs.c
+++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
@@ -712,7 +712,7 @@ CAMLprim value stub_xc_send_debug_keys(value xch, value keys)
 CAMLprim value stub_xc_physinfo(value xch)
 {
 	CAMLparam1(xch);
-	CAMLlocal2(physinfo, cap_list);
+	CAMLlocal3(physinfo, cap_list, arch_cap_list);
 	xc_physinfo_t c_physinfo;
 	int r;
 
@@ -731,7 +731,7 @@ CAMLprim value stub_xc_physinfo(value xch)
 		/* ! XEN_SYSCTL_PHYSCAP_ XEN_SYSCTL_PHYSCAP_MAX max */
 		(c_physinfo.capabilities);
 
-	physinfo = caml_alloc_tuple(10);
+	physinfo = caml_alloc_tuple(11);
 	Store_field(physinfo, 0, Val_int(c_physinfo.threads_per_core));
 	Store_field(physinfo, 1, Val_int(c_physinfo.cores_per_socket));
 	Store_field(physinfo, 2, Val_int(c_physinfo.nr_cpus));
@@ -743,6 +743,17 @@ CAMLprim value stub_xc_physinfo(value xch)
 	Store_field(physinfo, 8, cap_list);
 	Store_field(physinfo, 9, Val_int(c_physinfo.max_cpu_id + 1));
 
+#if defined(__i386__) || defined(__x86_64__)
+	/*
+	 * arch_capabilities: physinfo_arch_cap_flag list;
+	 */
+	arch_cap_list = c_bitmap_to_ocaml_list
+		/* ! physinfo_arch_cap_flag CAP_ none */
+		/* ! XEN_SYSCTL_PHYSCAP_ XEN_SYSCTL_PHYSCAP_X86_MAX max */
+		(c_physinfo.arch_capabilities);
+	Store_field(physinfo, 10, arch_cap_list);
+#endif
+
 	CAMLreturn(physinfo);
 }
 
diff --git a/tools/xl/xl_info.c b/tools/xl/xl_info.c
index 712b7638b0..3205270754 100644
--- a/tools/xl/xl_info.c
+++ b/tools/xl/xl_info.c
@@ -210,7 +210,7 @@ static void output_physinfo(void)
          info.hw_cap[4], info.hw_cap[5], info.hw_cap[6], info.hw_cap[7]
         );
 
-    maybe_printf("virt_caps              :%s%s%s%s%s%s%s%s%s%s%s\n",
+    maybe_printf("virt_caps              :%s%s%s%s%s%s%s%s%s%s%s%s%s\n",
          info.cap_pv ? " pv" : "",
          info.cap_hvm ? " hvm" : "",
          info.cap_hvm && info.cap_hvm_directio ? " hvm_directio" : "",
@@ -221,7 +221,9 @@ static void output_physinfo(void)
          info.cap_vmtrace ? " vmtrace" : "",
          info.cap_vpmu ? " vpmu" : "",
          info.cap_gnttab_v1 ? " gnttab-v1" : "",
-         info.cap_gnttab_v2 ? " gnttab-v2" : ""
+         info.cap_gnttab_v2 ? " gnttab-v2" : "",
+         info.cap_assisted_xapic ? " assisted_xapic" : "",
+         info.cap_assisted_x2apic ? " assisted_x2apic" : ""
         );
 
     vinfo = libxl_get_version_info(ctx);
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 709a4191ef..e5dde9f8ce 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -117,6 +117,9 @@ static const char __initconst warning_hvm_fep[] =
 static bool_t __initdata opt_altp2m_enabled = 0;
 boolean_param("altp2m", opt_altp2m_enabled);
 
+bool __ro_after_init assisted_xapic_available;
+bool __ro_after_init assisted_x2apic_available;
+
 static int cf_check cpu_callback(
     struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index e1e1fa14e6..77ce0b2121 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -343,6 +343,15 @@ static int vmx_init_vmcs_config(bool bsp)
             MSR_IA32_VMX_PROCBASED_CTLS2, &mismatch);
     }
 
+    /* Check whether hardware supports accelerated xapic and x2apic. */
+    if ( bsp )
+    {
+        assisted_xapic_available = cpu_has_vmx_virtualize_apic_accesses;
+        assisted_x2apic_available = cpu_has_vmx_virtualize_x2apic_mode &&
+                                    (cpu_has_vmx_apic_reg_virt ||
+                                     cpu_has_vmx_virtual_intr_delivery);
+    }
+
     /* The IA32_VMX_EPT_VPID_CAP MSR exists only when EPT or VPID available */
     if ( _vmx_secondary_exec_control & (SECONDARY_EXEC_ENABLE_EPT |
                                         SECONDARY_EXEC_ENABLE_VPID) )
diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h
index 5b7ec0cf69..e0d9348878 100644
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -373,6 +373,9 @@ int hvm_get_param(struct domain *d, uint32_t index, uint64_t *value);
 #define hvm_tsc_scaling_ratio(d) \
     ((d)->arch.hvm.tsc_scaling_ratio)
 
+extern bool assisted_xapic_available;
+extern bool assisted_x2apic_available;
+
 #define hvm_get_guest_time(v) hvm_get_guest_time_fixed(v, 0)
 
 #define hvm_paging_enabled(v) \
@@ -872,6 +875,8 @@ static inline void hvm_set_reg(struct vcpu *v, unsigned int reg, uint64_t val)
 #define hvm_tsc_scaling_supported false
 #define hap_has_1gb false
 #define hap_has_2mb false
+#define assisted_xapic_available false
+#define assisted_x2apic_available false
 
 #define hvm_paging_enabled(v) ((void)(v), false)
 #define hvm_wp_enabled(v) ((void)(v), false)
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index f82abc2488..716525f72f 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -135,6 +135,10 @@ void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
         pi->capabilities |= XEN_SYSCTL_PHYSCAP_hap;
     if ( IS_ENABLED(CONFIG_SHADOW_PAGING) )
         pi->capabilities |= XEN_SYSCTL_PHYSCAP_shadow;
+    if ( assisted_xapic_available )
+        pi->arch_capabilities |= XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC;
+    if ( assisted_x2apic_available )
+        pi->arch_capabilities |= XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC;
 }
 
 long arch_do_sysctl(
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 55252e97f2..fbb9912067 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -35,7 +35,7 @@
 #include "domctl.h"
 #include "physdev.h"
 
-#define XEN_SYSCTL_INTERFACE_VERSION 0x00000014
+#define XEN_SYSCTL_INTERFACE_VERSION 0x00000015
 
 /*
  * Read console content from Xen buffer ring.
@@ -111,6 +111,13 @@ struct xen_sysctl_tbuf_op {
 /* Max XEN_SYSCTL_PHYSCAP_* constant.  Used for ABI checking. */
 #define XEN_SYSCTL_PHYSCAP_MAX XEN_SYSCTL_PHYSCAP_gnttab_v2
 
+/* The platform supports x{2}apic hardware assisted emulation. */
+#define XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC  (1u << 0)
+#define XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC (1u << 1)
+
+/* Max XEN_SYSCTL_PHYSCAP_X86__* constant. Used for ABI checking. */
+#define XEN_SYSCTL_PHYSCAP_X86_MAX XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC
+
 struct xen_sysctl_physinfo {
     uint32_t threads_per_core;
     uint32_t cores_per_socket;
@@ -120,6 +127,8 @@ struct xen_sysctl_physinfo {
     uint32_t max_node_id; /* Largest possible node ID on this host */
     uint32_t cpu_khz;
     uint32_t capabilities;/* XEN_SYSCTL_PHYSCAP_??? */
+    uint32_t arch_capabilities;/* XEN_SYSCTL_PHYSCAP_{X86,ARM,...}_??? */
+    uint32_t pad;
     uint64_aligned_t total_pages;
     uint64_aligned_t free_pages;
     uint64_aligned_t scrub_pages;
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v7 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC
  2022-03-08 17:36   ` [PATCH v6 " Jane Malalane
                       ` (2 preceding siblings ...)
  2022-03-09 13:51     ` Roger Pau Monné
@ 2022-03-11 15:08     ` Jane Malalane
  3 siblings, 0 replies; 27+ messages in thread
From: Jane Malalane @ 2022-03-11 15:08 UTC (permalink / raw)
  To: Xen-devel
  Cc: Jane Malalane, Wei Liu, Anthony PERARD, Juergen Gross,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Christian Lindig, David Scott,
	Volodymyr Babchuk, Roger Pau Monné

Introduce a new per-domain creation x86 specific flag to
select whether hardware assisted virtualization should be used for
x{2}APIC.

A per-domain option is added to xl in order to select the usage of
x{2}APIC hardware assisted virtualization, as well as a global
configuration option.

Having all APIC interaction exit to Xen for emulation is slow and can
induce much overhead. Hardware can speed up x{2}APIC by decoding the
APIC access and providing a VM exit with a more specific exit reason
than a regular EPT fault or by altogether avoiding a VM exit.

On the other hand, being able to disable x{2}APIC hardware assisted
virtualization can be useful for testing and debugging purposes.

Note: vmx_install_vlapic_mapping doesn't require modifications
regardless of whether the guest has "Virtualize APIC accesses" enabled
or not, i.e., setting the APIC_ACCESS_ADDR VMCS field is fine so long
as virtualize_apic_accesses is supported by the CPU.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jane Malalane <jane.malalane@citrix.com>
---
CC: Wei Liu <wl@xen.org>
CC: Anthony PERARD <anthony.perard@citrix.com>
CC: Juergen Gross <jgross@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: George Dunlap <george.dunlap@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Julien Grall <julien@xen.org>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Christian Lindig <christian.lindig@citrix.com>
CC: David Scott <dave@recoil.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: "Roger Pau Monné" <roger.pau@citrix.com>

v7:
 * Fix void return in libxl__arch_domain_build_info_setdefault
 * Fix style issues
 * Use EINVAL when rejecting assisted_x{2}apic for PV guests and
   ENODEV otherwise, when assisted_x{2}apic isn't supported
 * Define has_assisted_x{2}apic macros for when !CONFIG_HVM
 * Replace "EPT" fault reference with "p2m" fault since the former is
   Intel-specific

v6:
 * Use ENODEV instead of EINVAL when rejecting assisted_x{2}apic
   for PV guests
 * Move has_assisted_x{2}apic macros out of an Intel specific header
 * Remove references to Intel specific features in documentation

v5:
 * Revert v4 changes in vmx_vlapic_msr_changed(), preserving the use of
   the has_assisted_x{2}apic macros
 * Following changes in assisted_x{2}apic_available definitions in
   patch 1, retighten conditionals for setting
   XEN_HVM_CPUID_APIC_ACCESS_VIRT and XEN_HVM_CPUID_X2APIC_VIRT in
   cpuid_hypervisor_leaves()

v4:
 * Add has_assisted_x{2}apic macros and use them where appropriate
 * Replace CPU checks with per-domain assisted_x{2}apic control
   options in vmx_vlapic_msr_changed() and cpuid_hypervisor_leaves(),
   following edits to assisted_x{2}apic_available definitions in
   patch 1
   Note: new assisted_x{2}apic_available definitions make later
   cpu_has_vmx_apic_reg_virt and cpu_has_vmx_virtual_intr_delivery
   checks redundant in vmx_vlapic_msr_changed()

v3:
 * Change info in xl.cfg to better express reality and fix
   capitalization of x{2}apic
 * Move "physinfo" variable definition to the beggining of
   libxl__domain_build_info_setdefault()
 * Reposition brackets in if statement to match libxl coding style
 * Shorten logic in libxl__arch_domain_build_info_setdefault()
 * Correct dprintk message in arch_sanitise_domain_config()
 * Make appropriate changes in vmx_vlapic_msr_changed() and
   cpuid_hypervisor_leaves() for amended "assisted_x2apic" bit
 * Remove unneeded parantheses

v2:
 * Add a LIBXL_HAVE_ASSISTED_APIC macro
 * Pass xcpyshinfo as a pointer in libxl__arch_get_physinfo
 * Add a return statement in now "int"
   libxl__arch_domain_build_info_setdefault()
 * Preserve libxl__arch_domain_build_info_setdefault 's location in
   libxl_create.c
 * Correct x{2}apic default setting logic in
   libxl__arch_domain_prepare_config()
 * Correct logic for parsing assisted_x{2}apic host/guest options in
   xl_parse.c and initialize them to -1 in xl.c
 * Use guest options directly in vmx_vlapic_msr_changed
 * Fix indentation of bool assisted_x{2}apic in struct hvm_domain
 * Add a change in xenctrl_stubs.c to pass xenctrl ABI checks
---
 docs/man/xl.cfg.5.pod.in              | 15 +++++++++++++++
 docs/man/xl.conf.5.pod.in             | 12 ++++++++++++
 tools/golang/xenlight/helpers.gen.go  | 12 ++++++++++++
 tools/golang/xenlight/types.gen.go    |  2 ++
 tools/include/libxl.h                 |  7 +++++++
 tools/libs/light/libxl_arch.h         |  5 +++--
 tools/libs/light/libxl_arm.c          |  9 ++++++---
 tools/libs/light/libxl_create.c       | 22 +++++++++++++---------
 tools/libs/light/libxl_types.idl      |  2 ++
 tools/libs/light/libxl_x86.c          | 28 ++++++++++++++++++++++++++--
 tools/ocaml/libs/xc/xenctrl.ml        |  2 ++
 tools/ocaml/libs/xc/xenctrl.mli       |  2 ++
 tools/ocaml/libs/xc/xenctrl_stubs.c   |  2 +-
 tools/xl/xl.c                         |  8 ++++++++
 tools/xl/xl.h                         |  2 ++
 tools/xl/xl_parse.c                   | 16 ++++++++++++++++
 xen/arch/x86/domain.c                 | 29 ++++++++++++++++++++++++++++-
 xen/arch/x86/hvm/vmx/vmcs.c           |  4 ++++
 xen/arch/x86/hvm/vmx/vmx.c            | 13 ++++---------
 xen/arch/x86/include/asm/hvm/domain.h |  6 ++++++
 xen/arch/x86/include/asm/hvm/hvm.h    |  5 +++++
 xen/arch/x86/traps.c                  |  5 +++--
 xen/include/public/arch-x86/xen.h     |  2 ++
 23 files changed, 181 insertions(+), 29 deletions(-)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index b98d161398..6d98d73d76 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -1862,6 +1862,21 @@ firmware tables when using certain older guest Operating
 Systems. These tables have been superseded by newer constructs within
 the ACPI tables.
 
+=item B<assisted_xapic=BOOLEAN>
+
+B<(x86 only)> Enables or disables hardware assisted virtualization for
+xAPIC. With this option enabled, a memory-mapped APIC access will be
+decoded by hardware and either issue a more specific VM exit than just
+a p2m fault, or altogether avoid a VM exit. The
+default is settable via L<xl.conf(5)>.
+
+=item B<assisted_x2apic=BOOLEAN>
+
+B<(x86 only)> Enables or disables hardware assisted virtualization for
+x2APIC. With this option enabled, certain accesses to MSR APIC
+registers will avoid a VM exit into the hypervisor. The default is
+settable via L<xl.conf(5)>.
+
 =item B<nx=BOOLEAN>
 
 B<(x86 only)> Hides or exposes the No-eXecute capability. This allows a guest
diff --git a/docs/man/xl.conf.5.pod.in b/docs/man/xl.conf.5.pod.in
index df20c08137..95d136d1ea 100644
--- a/docs/man/xl.conf.5.pod.in
+++ b/docs/man/xl.conf.5.pod.in
@@ -107,6 +107,18 @@ Sets the default value for the C<max_grant_version> domain config value.
 
 Default: maximum grant version supported by the hypervisor.
 
+=item B<assisted_xapic=BOOLEAN>
+
+If enabled, domains will use xAPIC hardware assisted virtualization by default.
+
+Default: enabled if supported.
+
+=item B<assisted_x2apic=BOOLEAN>
+
+If enabled, domains will use x2APIC hardware assisted virtualization by default.
+
+Default: enabled if supported.
+
 =item B<vif.default.script="PATH">
 
 Configures the default hotplug script used by virtual network devices.
diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
index dd4e6c9f14..dece545ee0 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -1120,6 +1120,12 @@ x.ArchArm.Vuart = VuartType(xc.arch_arm.vuart)
 if err := x.ArchX86.MsrRelaxed.fromC(&xc.arch_x86.msr_relaxed);err != nil {
 return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
 }
+if err := x.ArchX86.AssistedXapic.fromC(&xc.arch_x86.assisted_xapic);err != nil {
+return fmt.Errorf("converting field ArchX86.AssistedXapic: %v", err)
+}
+if err := x.ArchX86.AssistedX2Apic.fromC(&xc.arch_x86.assisted_x2apic);err != nil {
+return fmt.Errorf("converting field ArchX86.AssistedX2Apic: %v", err)
+}
 x.Altp2M = Altp2MMode(xc.altp2m)
 x.VmtraceBufKb = int(xc.vmtrace_buf_kb)
 if err := x.Vpmu.fromC(&xc.vpmu);err != nil {
@@ -1605,6 +1611,12 @@ xc.arch_arm.vuart = C.libxl_vuart_type(x.ArchArm.Vuart)
 if err := x.ArchX86.MsrRelaxed.toC(&xc.arch_x86.msr_relaxed); err != nil {
 return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
 }
+if err := x.ArchX86.AssistedXapic.toC(&xc.arch_x86.assisted_xapic); err != nil {
+return fmt.Errorf("converting field ArchX86.AssistedXapic: %v", err)
+}
+if err := x.ArchX86.AssistedX2Apic.toC(&xc.arch_x86.assisted_x2apic); err != nil {
+return fmt.Errorf("converting field ArchX86.AssistedX2Apic: %v", err)
+}
 xc.altp2m = C.libxl_altp2m_mode(x.Altp2M)
 xc.vmtrace_buf_kb = C.int(x.VmtraceBufKb)
 if err := x.Vpmu.toC(&xc.vpmu); err != nil {
diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
index 87be46c745..253c9ad93d 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -520,6 +520,8 @@ Vuart VuartType
 }
 ArchX86 struct {
 MsrRelaxed Defbool
+AssistedXapic Defbool
+AssistedX2Apic Defbool
 }
 Altp2M Altp2MMode
 VmtraceBufKb int
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 94e6355822..cdcccd6d01 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -535,6 +535,13 @@
 #define LIBXL_HAVE_PHYSINFO_ASSISTED_APIC 1
 
 /*
+ * LIBXL_HAVE_ASSISTED_APIC indicates that libxl_domain_build_info has
+ * assisted_xapic and assisted_x2apic fields for enabling hardware
+ * assisted virtualization for x{2}apic per domain.
+ */
+#define LIBXL_HAVE_ASSISTED_APIC 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
diff --git a/tools/libs/light/libxl_arch.h b/tools/libs/light/libxl_arch.h
index 207ceac6a1..03b89929e6 100644
--- a/tools/libs/light/libxl_arch.h
+++ b/tools/libs/light/libxl_arch.h
@@ -71,8 +71,9 @@ void libxl__arch_domain_create_info_setdefault(libxl__gc *gc,
                                                libxl_domain_create_info *c_info);
 
 _hidden
-void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
-                                              libxl_domain_build_info *b_info);
+int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
+                                             libxl_domain_build_info *b_info,
+                                             const libxl_physinfo *physinfo);
 
 _hidden
 int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index 39fdca1b49..7dee2afd4b 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -1384,14 +1384,15 @@ void libxl__arch_domain_create_info_setdefault(libxl__gc *gc,
     }
 }
 
-void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
-                                              libxl_domain_build_info *b_info)
+int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
+                                             libxl_domain_build_info *b_info,
+                                             const libxl_physinfo *physinfo)
 {
     /* ACPI is disabled by default */
     libxl_defbool_setdefault(&b_info->acpi, false);
 
     if (b_info->type != LIBXL_DOMAIN_TYPE_PV)
-        return;
+        return 0;
 
     LOG(DEBUG, "Converting build_info to PVH");
 
@@ -1399,6 +1400,8 @@ void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
     memset(&b_info->u, '\0', sizeof(b_info->u));
     b_info->type = LIBXL_DOMAIN_TYPE_INVALID;
     libxl_domain_build_info_init_type(b_info, LIBXL_DOMAIN_TYPE_PVH);
+
+    return 0;
 }
 
 int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index 15ed021f41..88d08d7277 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -75,6 +75,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
                                         libxl_domain_build_info *b_info)
 {
     int i, rc;
+    libxl_physinfo info;
 
     if (b_info->type != LIBXL_DOMAIN_TYPE_HVM &&
         b_info->type != LIBXL_DOMAIN_TYPE_PV &&
@@ -264,7 +265,18 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
     if (!b_info->event_channels)
         b_info->event_channels = 1023;
 
-    libxl__arch_domain_build_info_setdefault(gc, b_info);
+    rc = libxl_get_physinfo(CTX, &info);
+    if (rc) {
+        LOG(ERROR, "failed to get hypervisor info");
+        return rc;
+    }
+
+    rc = libxl__arch_domain_build_info_setdefault(gc, b_info, &info);
+    if (rc) {
+        LOG(ERROR, "unable to set domain arch build info defaults");
+        return rc;
+    }
+
     libxl_defbool_setdefault(&b_info->dm_restrict, false);
 
     if (b_info->iommu_memkb == LIBXL_MEMKB_DEFAULT)
@@ -457,14 +469,6 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
     }
 
     if (b_info->max_grant_version == LIBXL_MAX_GRANT_DEFAULT) {
-        libxl_physinfo info;
-
-        rc = libxl_get_physinfo(CTX, &info);
-        if (rc) {
-            LOG(ERROR, "failed to get hypervisor info");
-            return rc;
-        }
-
         if (info.cap_gnttab_v2)
             b_info->max_grant_version = 2;
         else if (info.cap_gnttab_v1)
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 42ac6c357b..db5eb0a0b3 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -648,6 +648,8 @@ libxl_domain_build_info = Struct("domain_build_info",[
                                ("vuart", libxl_vuart_type),
                               ])),
     ("arch_x86", Struct(None, [("msr_relaxed", libxl_defbool),
+                               ("assisted_xapic", libxl_defbool),
+                               ("assisted_x2apic", libxl_defbool),
                               ])),
     # Alternate p2m is not bound to any architecture or guest type, as it is
     # supported by x86 HVM and ARM support is planned.
diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
index e0a06ecfe3..46d4de22d1 100644
--- a/tools/libs/light/libxl_x86.c
+++ b/tools/libs/light/libxl_x86.c
@@ -23,6 +23,15 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
     if (libxl_defbool_val(d_config->b_info.arch_x86.msr_relaxed))
         config->arch.misc_flags |= XEN_X86_MSR_RELAXED;
 
+    if (d_config->c_info.type != LIBXL_DOMAIN_TYPE_PV)
+    {
+        if (libxl_defbool_val(d_config->b_info.arch_x86.assisted_xapic))
+            config->arch.misc_flags |= XEN_X86_ASSISTED_XAPIC;
+
+        if (libxl_defbool_val(d_config->b_info.arch_x86.assisted_x2apic))
+            config->arch.misc_flags |= XEN_X86_ASSISTED_X2APIC;
+    }
+
     return 0;
 }
 
@@ -819,11 +828,26 @@ void libxl__arch_domain_create_info_setdefault(libxl__gc *gc,
 {
 }
 
-void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
-                                              libxl_domain_build_info *b_info)
+int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
+                                             libxl_domain_build_info *b_info,
+                                             const libxl_physinfo *physinfo)
 {
     libxl_defbool_setdefault(&b_info->acpi, true);
     libxl_defbool_setdefault(&b_info->arch_x86.msr_relaxed, false);
+
+    if (b_info->type != LIBXL_DOMAIN_TYPE_PV) {
+        libxl_defbool_setdefault(&b_info->arch_x86.assisted_xapic,
+                             physinfo->cap_assisted_xapic);
+        libxl_defbool_setdefault(&b_info->arch_x86.assisted_x2apic,
+                             physinfo->cap_assisted_x2apic);
+    }
+    else if (!libxl_defbool_is_default(b_info->arch_x86.assisted_xapic) ||
+             !libxl_defbool_is_default(b_info->arch_x86.assisted_x2apic)) {
+        LOG(ERROR, "Interrupt Controller Virtualization not supported for PV");
+        return ERROR_INVAL;
+    }
+
+    return 0;
 }
 
 int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
index 712456e098..32f3028828 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -50,6 +50,8 @@ type x86_arch_emulation_flags =
 
 type x86_arch_misc_flags =
 	| X86_MSR_RELAXED
+	| X86_ASSISTED_XAPIC
+	| X86_ASSISTED_X2APIC
 
 type xen_x86_arch_domainconfig =
 {
diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index b034434f68..d0fcbc8866 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -44,6 +44,8 @@ type x86_arch_emulation_flags =
 
 type x86_arch_misc_flags =
   | X86_MSR_RELAXED
+  | X86_ASSISTED_XAPIC
+  | X86_ASSISTED_X2APIC
 
 type xen_x86_arch_domainconfig = {
   emulation_flags: x86_arch_emulation_flags list;
diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
index 7e9c32ad1b..5df8aaa58f 100644
--- a/tools/ocaml/libs/xc/xenctrl_stubs.c
+++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
@@ -239,7 +239,7 @@ CAMLprim value stub_xc_domain_create(value xch, value wanted_domid, value config
 
 		cfg.arch.misc_flags = ocaml_list_to_c_bitmap
 			/* ! x86_arch_misc_flags X86_ none */
-			/* ! XEN_X86_ XEN_X86_MSR_RELAXED all */
+			/* ! XEN_X86_ XEN_X86_ASSISTED_X2APIC max */
 			(VAL_MISC_FLAGS);
 
 #undef VAL_MISC_FLAGS
diff --git a/tools/xl/xl.c b/tools/xl/xl.c
index 2d1ec18ea3..31eb223309 100644
--- a/tools/xl/xl.c
+++ b/tools/xl/xl.c
@@ -57,6 +57,8 @@ int max_grant_frames = -1;
 int max_maptrack_frames = -1;
 int max_grant_version = LIBXL_MAX_GRANT_DEFAULT;
 libxl_domid domid_policy = INVALID_DOMID;
+int assisted_xapic = -1;
+int assisted_x2apic = -1;
 
 xentoollog_level minmsglevel = minmsglevel_default;
 
@@ -201,6 +203,12 @@ static void parse_global_config(const char *configfile,
     if (!xlu_cfg_get_long (config, "claim_mode", &l, 0))
         claim_mode = l;
 
+    if (!xlu_cfg_get_long (config, "assisted_xapic", &l, 0))
+        assisted_xapic = l;
+
+    if (!xlu_cfg_get_long (config, "assisted_x2apic", &l, 0))
+        assisted_x2apic = l;
+
     xlu_cfg_replace_string (config, "remus.default.netbufscript",
         &default_remus_netbufscript, 0);
     xlu_cfg_replace_string (config, "colo.default.proxyscript",
diff --git a/tools/xl/xl.h b/tools/xl/xl.h
index c5c4bedbdd..528deb3feb 100644
--- a/tools/xl/xl.h
+++ b/tools/xl/xl.h
@@ -286,6 +286,8 @@ extern libxl_bitmap global_vm_affinity_mask;
 extern libxl_bitmap global_hvm_affinity_mask;
 extern libxl_bitmap global_pv_affinity_mask;
 extern libxl_domid domid_policy;
+extern int assisted_xapic;
+extern int assisted_x2apic;
 
 enum output_format {
     OUTPUT_FORMAT_JSON,
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index 117fcdcb2b..0ab9b145fe 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -1681,6 +1681,22 @@ void parse_config_data(const char *config_source,
         xlu_cfg_get_defbool(config, "vpt_align", &b_info->u.hvm.vpt_align, 0);
         xlu_cfg_get_defbool(config, "apic", &b_info->apic, 0);
 
+        e = xlu_cfg_get_long(config, "assisted_xapic", &l , 0);
+        if ((e == ESRCH && assisted_xapic != -1)) /* use global default if present */
+            libxl_defbool_set(&b_info->arch_x86.assisted_xapic, assisted_xapic);
+        else if (!e)
+            libxl_defbool_set(&b_info->arch_x86.assisted_xapic, l);
+        else
+            exit(1);
+
+        e = xlu_cfg_get_long(config, "assisted_x2apic", &l, 0);
+        if ((e == ESRCH && assisted_x2apic != -1)) /* use global default if present */
+            libxl_defbool_set(&b_info->arch_x86.assisted_x2apic, assisted_x2apic);
+        else if (!e)
+            libxl_defbool_set(&b_info->arch_x86.assisted_x2apic, l);
+        else
+            exit(1);
+
         switch (xlu_cfg_get_list(config, "viridian",
                                  &viridian, &num_viridian, 1))
         {
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index a5048ed654..279936a016 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -50,6 +50,7 @@
 #include <asm/cpuidle.h>
 #include <asm/mpspec.h>
 #include <asm/ldt.h>
+#include <asm/hvm/domain.h>
 #include <asm/hvm/hvm.h>
 #include <asm/hvm/nestedhvm.h>
 #include <asm/hvm/support.h>
@@ -619,6 +620,8 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
     bool hvm = config->flags & XEN_DOMCTL_CDF_hvm;
     bool hap = config->flags & XEN_DOMCTL_CDF_hap;
     bool nested_virt = config->flags & XEN_DOMCTL_CDF_nested_virt;
+    bool assisted_xapic = config->arch.misc_flags & XEN_X86_ASSISTED_XAPIC;
+    bool assisted_x2apic = config->arch.misc_flags & XEN_X86_ASSISTED_X2APIC;
     unsigned int max_vcpus;
 
     if ( hvm ? !hvm_enabled : !IS_ENABLED(CONFIG_PV) )
@@ -685,13 +688,31 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
         }
     }
 
-    if ( config->arch.misc_flags & ~XEN_X86_MSR_RELAXED )
+    if ( config->arch.misc_flags & ~(XEN_X86_MSR_RELAXED |
+                                     XEN_X86_ASSISTED_XAPIC |
+                                     XEN_X86_ASSISTED_X2APIC) )
     {
         dprintk(XENLOG_INFO, "Invalid arch misc flags %#x\n",
                 config->arch.misc_flags);
         return -EINVAL;
     }
 
+    if ( (assisted_xapic || assisted_x2apic) && !hvm )
+    {
+        dprintk(XENLOG_INFO,
+                "Interrupt Controller Virtualization not supported for PV\n");
+        return -EINVAL;
+    }
+
+    if ( (assisted_xapic && !assisted_xapic_available) ||
+         (assisted_x2apic && !assisted_x2apic_available) )
+    {
+        dprintk(XENLOG_INFO,
+                "Hardware assisted x%sAPIC requested but not available\n",
+                assisted_xapic && !assisted_xapic_available ? "" : "2");
+        return -ENODEV;
+    }
+
     return 0;
 }
 
@@ -864,6 +885,12 @@ int arch_domain_create(struct domain *d,
 
     d->arch.msr_relaxed = config->arch.misc_flags & XEN_X86_MSR_RELAXED;
 
+    d->arch.hvm.assisted_xapic =
+        config->arch.misc_flags & XEN_X86_ASSISTED_XAPIC;
+
+    d->arch.hvm.assisted_x2apic =
+        config->arch.misc_flags & XEN_X86_ASSISTED_X2APIC;
+
     return 0;
 
  fail:
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 77ce0b2121..47c27740d3 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1157,6 +1157,10 @@ static int construct_vmcs(struct vcpu *v)
         __vmwrite(PLE_WINDOW, ple_window);
     }
 
+    if ( !has_assisted_xapic(d) )
+        v->arch.hvm.vmx.secondary_exec_control &=
+            ~SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
+
     if ( cpu_has_vmx_secondary_exec_control )
         __vmwrite(SECONDARY_VM_EXEC_CONTROL,
                   v->arch.hvm.vmx.secondary_exec_control);
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index c075370f64..949ddd684c 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3344,16 +3344,11 @@ static void vmx_install_vlapic_mapping(struct vcpu *v)
 
 void vmx_vlapic_msr_changed(struct vcpu *v)
 {
-    int virtualize_x2apic_mode;
     struct vlapic *vlapic = vcpu_vlapic(v);
     unsigned int msr;
 
-    virtualize_x2apic_mode = ( (cpu_has_vmx_apic_reg_virt ||
-                                cpu_has_vmx_virtual_intr_delivery) &&
-                               cpu_has_vmx_virtualize_x2apic_mode );
-
-    if ( !cpu_has_vmx_virtualize_apic_accesses &&
-         !virtualize_x2apic_mode )
+    if ( !has_assisted_xapic(v->domain) &&
+         !has_assisted_x2apic(v->domain) )
         return;
 
     vmx_vmcs_enter(v);
@@ -3363,7 +3358,7 @@ void vmx_vlapic_msr_changed(struct vcpu *v)
     if ( !vlapic_hw_disabled(vlapic) &&
          (vlapic_base_address(vlapic) == APIC_DEFAULT_PHYS_BASE) )
     {
-        if ( virtualize_x2apic_mode && vlapic_x2apic_mode(vlapic) )
+        if ( has_assisted_x2apic(v->domain) && vlapic_x2apic_mode(vlapic) )
         {
             v->arch.hvm.vmx.secondary_exec_control |=
                 SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE;
@@ -3384,7 +3379,7 @@ void vmx_vlapic_msr_changed(struct vcpu *v)
                 vmx_clear_msr_intercept(v, MSR_X2APIC_SELF, VMX_MSR_W);
             }
         }
-        else
+        else if ( has_assisted_xapic(v->domain) )
             v->arch.hvm.vmx.secondary_exec_control |=
                 SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
     }
diff --git a/xen/arch/x86/include/asm/hvm/domain.h b/xen/arch/x86/include/asm/hvm/domain.h
index 698455444e..92bf53483c 100644
--- a/xen/arch/x86/include/asm/hvm/domain.h
+++ b/xen/arch/x86/include/asm/hvm/domain.h
@@ -117,6 +117,12 @@ struct hvm_domain {
 
     bool                   is_s3_suspended;
 
+    /* xAPIC hardware assisted virtualization. */
+    bool                   assisted_xapic;
+
+    /* x2APIC hardware assisted virtualization. */
+    bool                   assisted_x2apic;
+
     /* hypervisor intercepted msix table */
     struct list_head       msixtbl_list;
 
diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h
index e0d9348878..6ecbe22cc9 100644
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -376,6 +376,9 @@ int hvm_get_param(struct domain *d, uint32_t index, uint64_t *value);
 extern bool assisted_xapic_available;
 extern bool assisted_x2apic_available;
 
+#define has_assisted_xapic(d) ((d)->arch.hvm.assisted_xapic)
+#define has_assisted_x2apic(d) ((d)->arch.hvm.assisted_x2apic)
+
 #define hvm_get_guest_time(v) hvm_get_guest_time_fixed(v, 0)
 
 #define hvm_paging_enabled(v) \
@@ -878,6 +881,8 @@ static inline void hvm_set_reg(struct vcpu *v, unsigned int reg, uint64_t val)
 #define assisted_xapic_available false
 #define assisted_x2apic_available false
 
+#define has_assisted_xapic(d) ((void)(d), false)
+#define has_assisted_x2apic(d) ((void)(d), false)
 #define hvm_paging_enabled(v) ((void)(v), false)
 #define hvm_wp_enabled(v) ((void)(v), false)
 #define hvm_pcid_enabled(v) ((void)(v), false)
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index a2278d9499..a8dba88916 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1121,7 +1121,8 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
         if ( !is_hvm_domain(d) || subleaf != 0 )
             break;
 
-        if ( cpu_has_vmx_apic_reg_virt )
+        if ( cpu_has_vmx_apic_reg_virt &&
+             has_assisted_xapic(d) )
             res->a |= XEN_HVM_CPUID_APIC_ACCESS_VIRT;
 
         /*
@@ -1130,7 +1131,7 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
          * and wrmsr in the guest will run without VMEXITs (see
          * vmx_vlapic_msr_changed()).
          */
-        if ( cpu_has_vmx_virtualize_x2apic_mode &&
+        if ( has_assisted_x2apic(d) &&
              cpu_has_vmx_apic_reg_virt &&
              cpu_has_vmx_virtual_intr_delivery )
             res->a |= XEN_HVM_CPUID_X2APIC_VIRT;
diff --git a/xen/include/public/arch-x86/xen.h b/xen/include/public/arch-x86/xen.h
index 7acd94c8eb..9da32c6239 100644
--- a/xen/include/public/arch-x86/xen.h
+++ b/xen/include/public/arch-x86/xen.h
@@ -317,6 +317,8 @@ struct xen_arch_domainconfig {
  * doesn't allow the guest to read or write to the underlying MSR.
  */
 #define XEN_X86_MSR_RELAXED (1u << 0)
+#define XEN_X86_ASSISTED_XAPIC (1u << 1)
+#define XEN_X86_ASSISTED_X2APIC (1u << 2)
     uint32_t misc_flags;
 };
 
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86
  2022-03-09 10:36     ` Roger Pau Monné
@ 2022-03-11 15:09       ` Jane Malalane
  0 siblings, 0 replies; 27+ messages in thread
From: Jane Malalane @ 2022-03-11 15:09 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Xen-devel, Wei Liu, Anthony Perard, Juergen Gross, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Volodymyr Babchuk, Bertrand Marquis, Jun Nakajima, Kevin Tian

On 09/03/2022 10:36, Roger Pau Monné wrote:
> On Tue, Mar 08, 2022 at 05:31:17PM +0000, Jane Malalane wrote:
>> Add XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC and
>> XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC to report accelerated xapic
>> and x2apic, on x86 hardware.
>> No such features are currently implemented on AMD hardware.
>>
>> HW assisted xAPIC virtualization will be reported if HW, at the
>> minimum, supports virtualize_apic_accesses as this feature alone means
>> that an access to the APIC page will cause an APIC-access VM exit. An
>> APIC-access VM exit provides a VMM with information about the access
>> causing the VM exit, unlike a regular EPT fault, thus simplifying some
>> internal handling.
>>
>> HW assisted x2APIC virtualization will be reported if HW supports
>> virtualize_x2apic_mode and, at least, either apic_reg_virt or
>> virtual_intr_delivery. This also means that
>> sysctl follows the conditionals in vmx_vlapic_msr_changed().
>>
>> For that purpose, also add an arch-specific "capabilities" parameter
>> to struct xen_sysctl_physinfo.
>>
>> Note that this interface is intended to be compatible with AMD so that
>> AVIC support can be introduced in a future patch. Unlike Intel that
>> has multiple controls for APIC Virtualization, AMD has one global
>> 'AVIC Enable' control bit, so fine-graining of APIC virtualization
>> control cannot be done on a common interface.
>>
>> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> Signed-off-by: Jane Malalane <jane.malalane@citrix.com>
> 
> Overall LGTM, just one question and one nit.
> 
>> diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
>> index 5b4fe72c8d..7e9c32ad1b 100644
>> --- a/tools/ocaml/libs/xc/xenctrl_stubs.c
>> +++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
>> @@ -712,7 +712,7 @@ CAMLprim value stub_xc_send_debug_keys(value xch, value keys)
>>   CAMLprim value stub_xc_physinfo(value xch)
>>   {
>>   	CAMLparam1(xch);
>> -	CAMLlocal2(physinfo, cap_list);
>> +	CAMLlocal3(physinfo, cap_list, arch_cap_list);
>>   	xc_physinfo_t c_physinfo;
>>   	int r;
>>   
>> @@ -731,7 +731,7 @@ CAMLprim value stub_xc_physinfo(value xch)
>>   		/* ! XEN_SYSCTL_PHYSCAP_ XEN_SYSCTL_PHYSCAP_MAX max */
>>   		(c_physinfo.capabilities);
>>   
>> -	physinfo = caml_alloc_tuple(10);
>> +	physinfo = caml_alloc_tuple(11);
>>   	Store_field(physinfo, 0, Val_int(c_physinfo.threads_per_core));
>>   	Store_field(physinfo, 1, Val_int(c_physinfo.cores_per_socket));
>>   	Store_field(physinfo, 2, Val_int(c_physinfo.nr_cpus));
>> @@ -743,6 +743,17 @@ CAMLprim value stub_xc_physinfo(value xch)
>>   	Store_field(physinfo, 8, cap_list);
>>   	Store_field(physinfo, 9, Val_int(c_physinfo.max_cpu_id + 1));
>>   
>> +#if defined(__i386__) || defined(__x86_64__)
>> +	/*
>> +	 * arch_capabilities: physinfo_arch_cap_flag list;
>> +	 */
>> +	arch_cap_list = c_bitmap_to_ocaml_list
>> +		/* ! physinfo_arch_cap_flag CAP_ none */
>> +		/* ! XEN_SYSCTL_PHYSCAP_ XEN_SYSCTL_PHYSCAP_X86_MAX max */
>> +		(c_physinfo.arch_capabilities);
>> +	Store_field(physinfo, 10, arch_cap_list);
>> +#endif
> 
> Have you tried to build this on Arm? I wonder whether the compiler
> will complain about arch_cap_list being unused there?
Built it - no error.

Thank you,

Jane.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v5 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86
  2022-03-07 15:06 ` [PATCH v5 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86 Jane Malalane
  2022-03-08 10:39   ` Roger Pau Monné
  2022-03-08 17:31   ` [PATCH v6 " Jane Malalane
@ 2022-03-14  6:40   ` Tian, Kevin
  2 siblings, 0 replies; 27+ messages in thread
From: Tian, Kevin @ 2022-03-14  6:40 UTC (permalink / raw)
  To: Jane Malalane, Xen-devel
  Cc: Wei Liu, Anthony PERARD, Gross, Jurgen, Cooper, Andrew,
	George Dunlap, Beulich, Jan, Julien Grall, Stefano Stabellini,
	Volodymyr Babchuk, Bertrand Marquis, Nakajima, Jun,
	Pau Monné,
	Roger

> From: Jane Malalane <jane.malalane@citrix.com>
> Sent: Monday, March 7, 2022 11:06 PM
> 
> Add XEN_SYSCTL_PHYSCAP_ARCH_ASSISTED_xapic and
> XEN_SYSCTL_PHYSCAP_ARCH_ASSISTED_x2apic to report accelerated xapic
> and x2apic, on x86 hardware.
> No such features are currently implemented on AMD hardware.
> 
> HW assisted xAPIC virtualization will be reported if HW, at the
> minimum, supports virtualize_apic_accesses as this feature alone means
> that an access to the APIC page will cause an APIC-access VM exit. An
> APIC-access VM exit provides a VMM with information about the access
> causing the VM exit, unlike a regular EPT fault, thus simplifying some
> internal handling.
> 
> HW assisted x2APIC virtualization will be reported if HW supports
> virtualize_x2apic_mode and, at least, either apic_reg_virt or
> virtual_intr_delivery. This is due to apic_reg_virt and
> virtual_intr_delivery preventing a VM exit from occuring or at least
> replacing a regular EPT fault VM-exit with an APIC-access VM-exit on
> read and write APIC accesses, respectively. This also means that
> sysctl follows the conditionals in vmx_vlapic_msr_changed().
> 
> For that purpose, also add an arch-specific "capabilities" parameter
> to struct xen_sysctl_physinfo.
> 
> Note that this interface is intended to be compatible with AMD so that
> AVIC support can be introduced in a future patch. Unlike Intel that
> has multiple controls for APIC Virtualization, AMD has one global
> 'AVIC Enable' control bit, so fine-graining of APIC virtualization
> control cannot be done on a common interface.
> 
> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jane Malalane <jane.malalane@citrix.com>

Can you add some explanation why this capability needs to be
reported to toolstack?

Thanks
Kevin

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2022-03-14  6:40 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-07 15:06 [PATCH v5 0/2] xen: Report and use hardware APIC virtualization capabilities Jane Malalane
2022-03-07 15:06 ` [PATCH v5 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86 Jane Malalane
2022-03-08 10:39   ` Roger Pau Monné
2022-03-08 17:31   ` [PATCH v6 " Jane Malalane
2022-03-09 10:36     ` Roger Pau Monné
2022-03-11 15:09       ` Jane Malalane
2022-03-11 15:04     ` [PATCH v7 " Jane Malalane
2022-03-14  6:40   ` [PATCH v5 " Tian, Kevin
2022-03-07 15:06 ` [PATCH v5 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC Jane Malalane
2022-03-08 11:38   ` Roger Pau Monné
2022-03-08 12:24     ` Jan Beulich
2022-03-08 12:33       ` Roger Pau Monné
2022-03-08 14:31         ` Jane Malalane
2022-03-08 14:41           ` Jan Beulich
2022-03-09 15:56             ` Jane Malalane
2022-03-09 16:51               ` Jan Beulich
2022-03-09 17:02                 ` Jane Malalane
2022-03-08 15:44     ` Jane Malalane
2022-03-08 16:02       ` Roger Pau Monné
2022-03-08 16:13         ` Jane Malalane
2022-03-08 16:16         ` Jane Malalane
2022-03-08 16:22           ` Roger Pau Monné
2022-03-08 17:36   ` [PATCH v6 " Jane Malalane
2022-03-09 11:48     ` Roger Pau Monné
2022-03-09 13:23     ` Jan Beulich
2022-03-09 13:51     ` Roger Pau Monné
2022-03-11 15:08     ` [PATCH v7 " Jane Malalane

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.