All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/10] SVE feature for arm guests
@ 2023-02-02 11:08 Luca Fancellu
  2023-02-02 11:08 ` [PATCH 01/10] xen/arm: enable SVE extension for Xen Luca Fancellu
                   ` (9 more replies)
  0 siblings, 10 replies; 18+ messages in thread
From: Luca Fancellu @ 2023-02-02 11:08 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Jan Beulich,
	Wei Liu, Nick Rosbrook, Anthony PERARD, Juergen Gross,
	Christian Lindig, David Scott, Marek Marczykowski-Górecki

This serie is introducing the possibility for Dom0 and DomU guests to use
sve/sve2 instructions.

SVE feature introduces new instruction and registers to improve performances on
floating point operations.

The SVE feature is advertised using the ID_AA64PFR0_EL1 register, SVE field, and
when available the ID_AA64ZFR0_EL1 register provides additional information
about the implemented version and other SVE feature.

New registers added by the SVE feature are Z0-Z31, P0-P15, FFR, ZCR_ELx.

Z0-Z31 are scalable vector register whose size is implementation defined and
goes from 128 bits to maximum 2048, the term vector length will be used to refer
to this quantity.
P0-P15 are predicate registers and the size is the vector length divided by 8,
same size is the FFR (First Fault Register).
ZCR_ELx is a register that can control and restrict the maximum vector length
used by the <x> exception level and all the lower exception levels, so for
example EL3 can restrict the vector length usable by EL3,2,1,0.

The platform has a maximum implemented vector length, so for every value
written in ZCR register, if this value is above the implemented length, then the
lower value will be used. The RDVL instruction can be used to check what vector
length is the HW using after setting ZCR.

For an SVE guest, the V0-V31 registers are part of the Z0-Z31, so there is no
need to save them separately, saving Z0-Z31 will save implicitly also V0-V31.

SVE usage can be trapped using a flag in CPTR_EL2, hence in this serie the
register is added to the domain state, to be able to trap only the guests that
are not allowed to use SVE.

This serie is introducing a command line parameter to enable Dom0 to use SVE and
to set its maximum vector length that by default is 0 which means the guest is
not allowed to use SVE. Values from 128 to 2048 mean the guest can use SVE with
the selected value used as maximum allowed vector length (which could be lower
if the implemented one is lower).
For DomUs, an XL parameter with the same way of use is introduced and a dom0less
DTB binding is created.

The context switch is the most critical part because there can be big registers
to be saved, in this serie an easy approach is used and the context is
saved/restored every time for the guests that are allowed to use SVE.

Luca Fancellu (10):
  xen/arm: enable SVE extension for Xen
  xen/arm: add sve_vl_bits field to domain
  xen/arm: Expose SVE feature to the guest
  xen/arm: add SVE exception class handling
  arm/sve: save/restore SVE context switch
  xen/arm: enable Dom0 to use SVE feature
  xen/physinfo: add arm SVE arch capability and vector length
  tools: add SVE capability and SVE vector length for physinfo
  xen/tools: add sve parameter in XL configuration
  xen/arm: add sve property for dom0less domUs

 docs/man/xl.cfg.5.pod.in                 |  11 ++
 docs/misc/arm/device-tree/booting.txt    |   9 ++
 docs/misc/xen-command-line.pandoc        |  13 ++
 tools/golang/xenlight/helpers.gen.go     |   6 +
 tools/golang/xenlight/types.gen.go       |   3 +
 tools/include/libxl.h                    |  13 ++
 tools/libs/light/libxl.c                 |   3 +
 tools/libs/light/libxl_arm.c             |   2 +
 tools/libs/light/libxl_types.idl         |   3 +
 tools/ocaml/libs/xc/xenctrl.ml           |   4 +-
 tools/ocaml/libs/xc/xenctrl.mli          |   4 +-
 tools/ocaml/libs/xc/xenctrl_stubs.c      |  17 +-
 tools/python/xen/lowlevel/xc/xc.c        |  17 +-
 tools/xl/xl_info.c                       |  10 ++
 tools/xl/xl_parse.c                      |  22 ++-
 xen/arch/arm/Kconfig                     |  10 +-
 xen/arch/arm/arm64/Makefile              |   1 +
 xen/arch/arm/arm64/cpufeature.c          |   7 +-
 xen/arch/arm/arm64/sve.c                 | 118 ++++++++++++++
 xen/arch/arm/arm64/sve_asm.S             | 189 +++++++++++++++++++++++
 xen/arch/arm/arm64/vfp.c                 |  79 ++++++----
 xen/arch/arm/arm64/vsysreg.c             |  39 ++++-
 xen/arch/arm/cpufeature.c                |   6 +-
 xen/arch/arm/domain.c                    |  48 +++++-
 xen/arch/arm/domain_build.c              |  11 ++
 xen/arch/arm/include/asm/arm64/sve.h     |  72 +++++++++
 xen/arch/arm/include/asm/arm64/sysregs.h |   4 +
 xen/arch/arm/include/asm/arm64/vfp.h     |  10 ++
 xen/arch/arm/include/asm/cpufeature.h    |  14 ++
 xen/arch/arm/include/asm/domain.h        |   8 +
 xen/arch/arm/include/asm/processor.h     |   3 +
 xen/arch/arm/setup.c                     |   5 +-
 xen/arch/arm/sysctl.c                    |   7 +
 xen/arch/arm/traps.c                     |  40 +++--
 xen/include/public/arch-arm.h            |   3 +
 xen/include/public/domctl.h              |   2 +-
 xen/include/public/sysctl.h              |  11 +-
 37 files changed, 751 insertions(+), 73 deletions(-)
 create mode 100644 xen/arch/arm/arm64/sve.c
 create mode 100644 xen/arch/arm/arm64/sve_asm.S
 create mode 100644 xen/arch/arm/include/asm/arm64/sve.h

-- 
2.25.1



^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 01/10] xen/arm: enable SVE extension for Xen
  2023-02-02 11:08 [PATCH 00/10] SVE feature for arm guests Luca Fancellu
@ 2023-02-02 11:08 ` Luca Fancellu
  2023-02-02 11:08 ` [PATCH 02/10] xen/arm: add sve_vl_bits field to domain Luca Fancellu
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Luca Fancellu @ 2023-02-02 11:08 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk

Enable Xen to handle the SVE extension, add code in cpufeature module
to handle ZCR SVE register, disable trapping SVE feature on system
boot only when SVE resources are accessed.
While there, correct coding style for the comment on coprocessor
trapping.

Now cptr_el2 is part of the domain context and it will be restored
on context switch, this is a preparation for saving the SVE context
which will be part of VFP operations, so restore it before the call
to save VFP registers.
To save an additional isb barrier, restore cptr_el2 before an
existing isb barrier and move the call for saving VFP context after
that barrier.

Change the KConfig entry to make ARM64_SVE symbol selectable, by
default it will be not selected.

Create sve module and sve_asm.S that contains assembly routines for
the SVE feature, this code is inspired from linux and it uses
instruction encoding to be compatible with compilers that does not
support SVE.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
Changes from RFC:
 - Moved restoring of cptr before an existing barrier (Julien)
 - Marked the feature as unsupported for now (Julien)
 - Trap and un-trap only when using SVE resources in
   compute_max_zcr() (Julien)
---
 xen/arch/arm/Kconfig                     | 10 +++--
 xen/arch/arm/arm64/Makefile              |  1 +
 xen/arch/arm/arm64/cpufeature.c          |  7 ++--
 xen/arch/arm/arm64/sve.c                 | 49 ++++++++++++++++++++++++
 xen/arch/arm/arm64/sve_asm.S             | 48 +++++++++++++++++++++++
 xen/arch/arm/cpufeature.c                |  6 ++-
 xen/arch/arm/domain.c                    |  9 +++--
 xen/arch/arm/include/asm/arm64/sve.h     | 43 +++++++++++++++++++++
 xen/arch/arm/include/asm/arm64/sysregs.h |  1 +
 xen/arch/arm/include/asm/cpufeature.h    | 14 +++++++
 xen/arch/arm/include/asm/domain.h        |  1 +
 xen/arch/arm/include/asm/processor.h     |  2 +
 xen/arch/arm/setup.c                     |  5 ++-
 xen/arch/arm/traps.c                     | 28 ++++++++------
 14 files changed, 200 insertions(+), 24 deletions(-)
 create mode 100644 xen/arch/arm/arm64/sve.c
 create mode 100644 xen/arch/arm/arm64/sve_asm.S
 create mode 100644 xen/arch/arm/include/asm/arm64/sve.h

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 239d3aed3c7f..41f45d8d1203 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -112,11 +112,15 @@ config ARM64_PTR_AUTH
 	  This feature is not supported in Xen.
 
 config ARM64_SVE
-	def_bool n
+	bool "Enable Scalar Vector Extension support (UNSUPPORTED)" if UNSUPPORTED
 	depends on ARM_64
 	help
-	  Scalar Vector Extension support.
-	  This feature is not supported in Xen.
+	  Scalar Vector Extension (SVE/SVE2) support for guests.
+
+	  Please be aware that currently, enabling this feature will add latency on
+	  VM context switch between SVE enabled guests, between not-enabled SVE
+	  guests and SVE enabled guests and viceversa, compared to the time
+	  required to switch between not-enabled SVE guests.
 
 config ARM64_MTE
 	def_bool n
diff --git a/xen/arch/arm/arm64/Makefile b/xen/arch/arm/arm64/Makefile
index 6d507da0d44d..1d59c3b0ec89 100644
--- a/xen/arch/arm/arm64/Makefile
+++ b/xen/arch/arm/arm64/Makefile
@@ -12,6 +12,7 @@ obj-y += insn.o
 obj-$(CONFIG_LIVEPATCH) += livepatch.o
 obj-y += smc.o
 obj-y += smpboot.o
+obj-$(CONFIG_ARM64_SVE) += sve.o sve_asm.o
 obj-y += traps.o
 obj-y += vfp.o
 obj-y += vsysreg.o
diff --git a/xen/arch/arm/arm64/cpufeature.c b/xen/arch/arm/arm64/cpufeature.c
index d9039d37b2d1..b4656ff4d80f 100644
--- a/xen/arch/arm/arm64/cpufeature.c
+++ b/xen/arch/arm/arm64/cpufeature.c
@@ -455,15 +455,11 @@ static const struct arm64_ftr_bits ftr_id_dfr1[] = {
 	ARM64_FTR_END,
 };
 
-#if 0
-/* TODO: use this to sanitize SVE once we support it */
-
 static const struct arm64_ftr_bits ftr_zcr[] = {
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE,
 		ZCR_ELx_LEN_SHIFT, ZCR_ELx_LEN_SIZE, 0),	/* LEN */
 	ARM64_FTR_END,
 };
-#endif
 
 /*
  * Common ftr bits for a 32bit register with all hidden, strict
@@ -603,6 +599,9 @@ void update_system_features(const struct cpuinfo_arm *new)
 
 	SANITIZE_ID_REG(zfr64, 0, aa64zfr0);
 
+	if ( cpu_has_sve )
+		SANITIZE_REG(zcr64, 0, zcr);
+
 	/*
 	 * Comment from Linux:
 	 * Userspace may perform DC ZVA instructions. Mismatched block sizes
diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
new file mode 100644
index 000000000000..1c8024b44cd8
--- /dev/null
+++ b/xen/arch/arm/arm64/sve.c
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Arm SVE feature code
+ *
+ * Copyright (C) 2022 ARM Ltd.
+ */
+
+#include <xen/types.h>
+#include <asm/arm64/sve.h>
+#include <asm/arm64/sysregs.h>
+#include <asm/processor.h>
+#include <asm/system.h>
+
+extern unsigned int sve_get_hw_vl(void);
+
+register_t compute_max_zcr(void)
+{
+    register_t cptr_bits = get_default_cptr_flags();
+    register_t zcr = vl_to_zcr(SVE_VL_MAX_BITS);
+    unsigned int hw_vl;
+
+    /* Remove trap for SVE resources */
+    WRITE_SYSREG(cptr_bits & ~HCPTR_CP(8), CPTR_EL2);
+    isb();
+
+    /*
+     * Set the maximum SVE vector length, doing that we will know the VL
+     * supported by the platform, calling sve_get_hw_vl()
+     */
+    WRITE_SYSREG(zcr, ZCR_EL2);
+
+    /*
+     * Read the maximum VL, which could be lower than what we imposed before,
+     * hw_vl contains VL in bytes, multiply it by 8 to use vl_to_zcr() later
+     */
+    hw_vl = sve_get_hw_vl() * 8U;
+
+    /* Restore CPTR_EL2 */
+    WRITE_SYSREG(cptr_bits, CPTR_EL2);
+    isb();
+
+    return vl_to_zcr(hw_vl);
+}
+
+/* Takes a vector length in bits and returns the ZCR_ELx encoding */
+register_t vl_to_zcr(uint16_t vl)
+{
+    return ((vl / SVE_VL_MULTIPLE_VAL) - 1U) & ZCR_ELx_LEN_MASK;
+}
diff --git a/xen/arch/arm/arm64/sve_asm.S b/xen/arch/arm/arm64/sve_asm.S
new file mode 100644
index 000000000000..4d1549344733
--- /dev/null
+++ b/xen/arch/arm/arm64/sve_asm.S
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Arm SVE assembly routines
+ *
+ * Copyright (C) 2022 ARM Ltd.
+ *
+ * Some macros and instruction encoding in this file are taken from linux 6.1.1,
+ * file arch/arm64/include/asm/fpsimdmacros.h, some of them are a modified
+ * version.
+ */
+
+/* Sanity-check macros to help avoid encoding garbage instructions */
+
+.macro _check_general_reg nr
+    .if (\nr) < 0 || (\nr) > 30
+        .error "Bad register number \nr."
+    .endif
+.endm
+
+.macro _check_num n, min, max
+    .if (\n) < (\min) || (\n) > (\max)
+        .error "Number \n out of range [\min,\max]"
+    .endif
+.endm
+
+/* SVE instruction encodings for non-SVE-capable assemblers */
+/* (pre binutils 2.28, all kernel capable clang versions support SVE) */
+
+/* RDVL X\nx, #\imm */
+.macro _sve_rdvl nx, imm
+    _check_general_reg \nx
+    _check_num (\imm), -0x20, 0x1f
+    .inst 0x04bf5000                \
+        | (\nx)                     \
+        | (((\imm) & 0x3f) << 5)
+.endm
+
+/* Gets the current vector register size in bytes */
+GLOBAL(sve_get_hw_vl)
+    _sve_rdvl 0, 1
+    ret
+
+/*
+ * Local variables:
+ * mode: ASM
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
index c4ec38bb2554..83b84368f6d5 100644
--- a/xen/arch/arm/cpufeature.c
+++ b/xen/arch/arm/cpufeature.c
@@ -9,6 +9,7 @@
 #include <xen/init.h>
 #include <xen/smp.h>
 #include <xen/stop_machine.h>
+#include <asm/arm64/sve.h>
 #include <asm/cpufeature.h>
 
 DECLARE_BITMAP(cpu_hwcaps, ARM_NCAPS);
@@ -143,6 +144,9 @@ void identify_cpu(struct cpuinfo_arm *c)
 
     c->zfr64.bits[0] = READ_SYSREG(ID_AA64ZFR0_EL1);
 
+    if ( cpu_has_sve )
+        c->zcr64.bits[0] = compute_max_zcr();
+
     c->dczid.bits[0] = READ_SYSREG(DCZID_EL0);
 
     c->ctr.bits[0] = READ_SYSREG(CTR_EL0);
@@ -199,7 +203,7 @@ static int __init create_guest_cpuinfo(void)
     guest_cpuinfo.pfr64.mpam = 0;
     guest_cpuinfo.pfr64.mpam_frac = 0;
 
-    /* Hide SVE as Xen does not support it */
+    /* Hide SVE by default to the guests */
     guest_cpuinfo.pfr64.sve = 0;
     guest_cpuinfo.zfr64.bits[0] = 0;
 
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 99577adb6c69..adb6ace2e24d 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -181,9 +181,6 @@ static void ctxt_switch_to(struct vcpu *n)
     /* VGIC */
     gic_restore_state(n);
 
-    /* VFP */
-    vfp_restore_state(n);
-
     /* XXX MPU */
 
     /* Fault Status */
@@ -234,6 +231,7 @@ static void ctxt_switch_to(struct vcpu *n)
     p2m_restore_state(n);
 
     /* Control Registers */
+    WRITE_SYSREG(n->arch.cptr_el2, CPTR_EL2);
     WRITE_SYSREG(n->arch.cpacr, CPACR_EL1);
 
     /*
@@ -258,6 +256,9 @@ static void ctxt_switch_to(struct vcpu *n)
 #endif
     isb();
 
+    /* VFP */
+    vfp_restore_state(n);
+
     /* CP 15 */
     WRITE_SYSREG(n->arch.csselr, CSSELR_EL1);
 
@@ -548,6 +549,8 @@ int arch_vcpu_create(struct vcpu *v)
 
     v->arch.vmpidr = MPIDR_SMP | vcpuid_to_vaffinity(v->vcpu_id);
 
+    v->arch.cptr_el2 = get_default_cptr_flags();
+
     v->arch.hcr_el2 = get_default_hcr_flags();
 
     v->arch.mdcr_el2 = HDCR_TDRA | HDCR_TDOSA | HDCR_TDA;
diff --git a/xen/arch/arm/include/asm/arm64/sve.h b/xen/arch/arm/include/asm/arm64/sve.h
new file mode 100644
index 000000000000..bd56e2f24230
--- /dev/null
+++ b/xen/arch/arm/include/asm/arm64/sve.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Arm SVE feature code
+ *
+ * Copyright (C) 2022 ARM Ltd.
+ */
+
+#ifndef _ARM_ARM64_SVE_H
+#define _ARM_ARM64_SVE_H
+
+#define SVE_VL_MAX_BITS (2048U)
+
+/* Vector length must be multiple of 128 */
+#define SVE_VL_MULTIPLE_VAL (128U)
+
+#ifdef CONFIG_ARM64_SVE
+
+register_t compute_max_zcr(void);
+register_t vl_to_zcr(uint16_t vl);
+
+#else /* !CONFIG_ARM64_SVE */
+
+static inline register_t compute_max_zcr(void)
+{
+    return 0;
+}
+
+static inline register_t vl_to_zcr(uint16_t vl)
+{
+    return 0;
+}
+
+#endif
+
+#endif /* _ARM_ARM64_SVE_H */
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/include/asm/arm64/sysregs.h b/xen/arch/arm/include/asm/arm64/sysregs.h
index 463899951414..4cabb9eb4d5e 100644
--- a/xen/arch/arm/include/asm/arm64/sysregs.h
+++ b/xen/arch/arm/include/asm/arm64/sysregs.h
@@ -24,6 +24,7 @@
 #define ICH_EISR_EL2              S3_4_C12_C11_3
 #define ICH_ELSR_EL2              S3_4_C12_C11_5
 #define ICH_VMCR_EL2              S3_4_C12_C11_7
+#define ZCR_EL2                   S3_4_C1_C2_0
 
 #define __LR0_EL2(x)              S3_4_C12_C12_ ## x
 #define __LR8_EL2(x)              S3_4_C12_C13_ ## x
diff --git a/xen/arch/arm/include/asm/cpufeature.h b/xen/arch/arm/include/asm/cpufeature.h
index c62cf6293fd6..6d703e051906 100644
--- a/xen/arch/arm/include/asm/cpufeature.h
+++ b/xen/arch/arm/include/asm/cpufeature.h
@@ -32,6 +32,12 @@
 #define cpu_has_thumbee   (boot_cpu_feature32(thumbee) == 1)
 #define cpu_has_aarch32   (cpu_has_arm || cpu_has_thumb)
 
+#ifdef CONFIG_ARM64_SVE
+#define cpu_has_sve       (boot_cpu_feature64(sve) == 1)
+#else
+#define cpu_has_sve       (0)
+#endif
+
 #ifdef CONFIG_ARM_32
 #define cpu_has_gicv3     (boot_cpu_feature32(gic) >= 1)
 #define cpu_has_gentimer  (boot_cpu_feature32(gentimer) == 1)
@@ -323,6 +329,14 @@ struct cpuinfo_arm {
         };
     } isa64;
 
+    union {
+        register_t bits[1];
+        struct {
+            unsigned long len:4;
+            unsigned long __res0:60;
+        };
+    } zcr64;
+
     struct {
         register_t bits[1];
     } zfr64;
diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
index 0e310601e846..42eb5df320a7 100644
--- a/xen/arch/arm/include/asm/domain.h
+++ b/xen/arch/arm/include/asm/domain.h
@@ -190,6 +190,7 @@ struct arch_vcpu
     register_t tpidrro_el0;
 
     /* HYP configuration */
+    register_t cptr_el2;
     register_t hcr_el2;
     register_t mdcr_el2;
 
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index 1dd81d7d528f..0e38926b94db 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -583,6 +583,8 @@ void do_trap_guest_serror(struct cpu_user_regs *regs);
 
 register_t get_default_hcr_flags(void);
 
+register_t get_default_cptr_flags(void);
+
 /*
  * Synchronize SError unless the feature is selected.
  * This is relying on the SErrors are currently unmasked.
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 1f26f67b90e3..5459cc4f5e62 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -135,10 +135,11 @@ static void __init processor_id(void)
            cpu_has_el2_32 ? "64+32" : cpu_has_el2_64 ? "64" : "No",
            cpu_has_el1_32 ? "64+32" : cpu_has_el1_64 ? "64" : "No",
            cpu_has_el0_32 ? "64+32" : cpu_has_el0_64 ? "64" : "No");
-    printk("    Extensions:%s%s%s\n",
+    printk("    Extensions:%s%s%s%s\n",
            cpu_has_fp ? " FloatingPoint" : "",
            cpu_has_simd ? " AdvancedSIMD" : "",
-           cpu_has_gicv3 ? " GICv3-SysReg" : "");
+           cpu_has_gicv3 ? " GICv3-SysReg" : "",
+           cpu_has_sve ? " SVE" : "");
 
     /* Warn user if we find unknown floating-point features */
     if ( cpu_has_fp && (boot_cpu_feature64(fp) >= 2) )
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 061c92acbd68..a78a99ddadd0 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -93,6 +93,21 @@ register_t get_default_hcr_flags(void)
              HCR_TID3|HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB|HCR_TSW);
 }
 
+register_t get_default_cptr_flags(void)
+{
+    /*
+     * Trap all coprocessor registers (0-13) except cp10 and
+     * cp11 for VFP.
+     *
+     * /!\ All coprocessors except cp10 and cp11 cannot be used in Xen.
+     *
+     * On ARM64 the TCPx bits which we set here (0..9,12,13) are all
+     * RES1, i.e. they would trap whether we did this write or not.
+     */
+    return  ((HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) |
+             HCPTR_TTA | HCPTR_TAM);
+}
+
 static enum {
     SERRORS_DIVERSE,
     SERRORS_PANIC,
@@ -122,6 +137,7 @@ __initcall(update_serrors_cpu_caps);
 
 void init_traps(void)
 {
+    register_t cptr_bits = get_default_cptr_flags();
     /*
      * Setup Hyp vector base. Note they might get updated with the
      * branch predictor hardening.
@@ -135,17 +151,7 @@ void init_traps(void)
     /* Trap CP15 c15 used for implementation defined registers */
     WRITE_SYSREG(HSTR_T(15), HSTR_EL2);
 
-    /* Trap all coprocessor registers (0-13) except cp10 and
-     * cp11 for VFP.
-     *
-     * /!\ All coprocessors except cp10 and cp11 cannot be used in Xen.
-     *
-     * On ARM64 the TCPx bits which we set here (0..9,12,13) are all
-     * RES1, i.e. they would trap whether we did this write or not.
-     */
-    WRITE_SYSREG((HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) |
-                 HCPTR_TTA | HCPTR_TAM,
-                 CPTR_EL2);
+    WRITE_SYSREG(cptr_bits, CPTR_EL2);
 
     /*
      * Configure HCR_EL2 with the bare minimum to run Xen until a guest
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 02/10] xen/arm: add sve_vl_bits field to domain
  2023-02-02 11:08 [PATCH 00/10] SVE feature for arm guests Luca Fancellu
  2023-02-02 11:08 ` [PATCH 01/10] xen/arm: enable SVE extension for Xen Luca Fancellu
@ 2023-02-02 11:08 ` Luca Fancellu
  2023-02-02 11:08 ` [PATCH 03/10] xen/arm: Expose SVE feature to the guest Luca Fancellu
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Luca Fancellu @ 2023-02-02 11:08 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Jan Beulich,
	Wei Liu

Add sve_vl_bits field to arch_domain and xen_arch_domainconfig
structure, to allow the domain to have an information about the
SVE feature and the number of SVE register bits that are allowed
for this domain.

The field is used also to allow or forbid a domain to use SVE,
because a value equal to zero means the guest is not allowed to
use the feature.

Check that the requested vector length is lower or equal to the
platform supported vector length, otherwise fail on domain
creation.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
Changes from RFC:
 - restore zcr_el2 in sve_restore_state, that will be introduced
   later in this serie, so remove zcr_el2 related code from this
   patch and move everything to the later patch (Julien)
 - add explicit padding into struct xen_arch_domainconfig (Julien)
 - Don't lower down the vector length, just fail to create the
   domain. (Julien)
---
 xen/arch/arm/arm64/sve.c             |  9 ++++++++
 xen/arch/arm/domain.c                | 32 ++++++++++++++++++++++++++++
 xen/arch/arm/include/asm/arm64/sve.h | 12 +++++++++++
 xen/arch/arm/include/asm/domain.h    |  5 +++++
 xen/include/public/arch-arm.h        |  3 +++
 xen/include/public/domctl.h          |  2 +-
 6 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
index 1c8024b44cd8..eee39caa9efa 100644
--- a/xen/arch/arm/arm64/sve.c
+++ b/xen/arch/arm/arm64/sve.c
@@ -6,6 +6,7 @@
  */
 
 #include <xen/types.h>
+#include <asm/cpufeature.h>
 #include <asm/arm64/sve.h>
 #include <asm/arm64/sysregs.h>
 #include <asm/processor.h>
@@ -47,3 +48,11 @@ register_t vl_to_zcr(uint16_t vl)
 {
     return ((vl / SVE_VL_MULTIPLE_VAL) - 1U) & ZCR_ELx_LEN_MASK;
 }
+
+/* Get the system sanitized value for VL in bits */
+uint16_t get_sys_vl_len(void)
+{
+    /* ZCR_ELx len field is ((len+1) * 128) = vector bits length */
+    return ((system_cpuinfo.zcr64.bits[0] & ZCR_ELx_LEN_MASK) + 1U) *
+            SVE_VL_MULTIPLE_VAL;
+}
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index adb6ace2e24d..8d64e3c08edf 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -13,6 +13,7 @@
 #include <xen/wait.h>
 
 #include <asm/alternative.h>
+#include <asm/arm64/sve.h>
 #include <asm/cpuerrata.h>
 #include <asm/cpufeature.h>
 #include <asm/current.h>
@@ -550,6 +551,8 @@ int arch_vcpu_create(struct vcpu *v)
     v->arch.vmpidr = MPIDR_SMP | vcpuid_to_vaffinity(v->vcpu_id);
 
     v->arch.cptr_el2 = get_default_cptr_flags();
+    if ( is_sve_domain(v->domain) )
+        v->arch.cptr_el2 &= ~HCPTR_CP(8);
 
     v->arch.hcr_el2 = get_default_hcr_flags();
 
@@ -594,6 +597,7 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
     unsigned int max_vcpus;
     unsigned int flags_required = (XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap);
     unsigned int flags_optional = (XEN_DOMCTL_CDF_iommu | XEN_DOMCTL_CDF_vpmu);
+    unsigned int sve_vl_bits = config->arch.sve_vl_bits;
 
     if ( (config->flags & ~flags_optional) != flags_required )
     {
@@ -602,6 +606,31 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
         return -EINVAL;
     }
 
+    /* Check feature flags */
+    if ( sve_vl_bits > 0 ) {
+        unsigned int zcr_max_bits = get_sys_vl_len();
+
+        if ( !cpu_has_sve )
+        {
+            dprintk(XENLOG_INFO, "SVE is unsupported on this machine.\n");
+            return -EINVAL;
+        }
+        else if ( !is_vl_valid(sve_vl_bits) )
+        {
+            dprintk(XENLOG_INFO, "Unsupported SVE vector length (%u)\n",
+                    sve_vl_bits);
+            return -EINVAL;
+        }
+        else if ( sve_vl_bits > zcr_max_bits )
+        {
+            dprintk(XENLOG_INFO,
+                    "The requested SVE vector length (%u) must be lower or \n"
+                    "equal to the platform supported vector length (%u)\n",
+                    sve_vl_bits, zcr_max_bits);
+            return -EINVAL;
+        }
+    }
+
     /* The P2M table must always be shared between the CPU and the IOMMU */
     if ( config->iommu_opts & XEN_DOMCTL_IOMMU_no_sharept )
     {
@@ -744,6 +773,9 @@ int arch_domain_create(struct domain *d,
     if ( (rc = domain_vpci_init(d)) != 0 )
         goto fail;
 
+    /* Copy sve_vl_bits to the domain configuration */
+    d->arch.sve_vl_bits = config->arch.sve_vl_bits;
+
     return 0;
 
 fail:
diff --git a/xen/arch/arm/include/asm/arm64/sve.h b/xen/arch/arm/include/asm/arm64/sve.h
index bd56e2f24230..f4a660e402ca 100644
--- a/xen/arch/arm/include/asm/arm64/sve.h
+++ b/xen/arch/arm/include/asm/arm64/sve.h
@@ -13,10 +13,17 @@
 /* Vector length must be multiple of 128 */
 #define SVE_VL_MULTIPLE_VAL (128U)
 
+static inline bool is_vl_valid(uint16_t vl)
+{
+    /* SVE vector length is multiple of 128 and maximum 2048 */
+    return ((vl % SVE_VL_MULTIPLE_VAL) == 0) && (vl <= SVE_VL_MAX_BITS);
+}
+
 #ifdef CONFIG_ARM64_SVE
 
 register_t compute_max_zcr(void);
 register_t vl_to_zcr(uint16_t vl);
+uint16_t get_sys_vl_len(void);
 
 #else /* !CONFIG_ARM64_SVE */
 
@@ -30,6 +37,11 @@ static inline register_t vl_to_zcr(uint16_t vl)
     return 0;
 }
 
+static inline uint16_t get_sys_vl_len(void)
+{
+    return 0;
+}
+
 #endif
 
 #endif /* _ARM_ARM64_SVE_H */
diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
index 42eb5df320a7..1bd669e0a5c1 100644
--- a/xen/arch/arm/include/asm/domain.h
+++ b/xen/arch/arm/include/asm/domain.h
@@ -31,6 +31,8 @@ enum domain_type {
 
 #define is_domain_direct_mapped(d) ((d)->cdf & CDF_directmap)
 
+#define is_sve_domain(d) ((d)->arch.sve_vl_bits > 0)
+
 /*
  * Is the domain using the host memory layout?
  *
@@ -114,6 +116,9 @@ struct arch_domain
     void *tee;
 #endif
 
+    /* max SVE vector length in bits */
+    uint16_t sve_vl_bits;
+
 }  __cacheline_aligned;
 
 struct arch_vcpu
diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
index 1528ced5097a..3094aeb57990 100644
--- a/xen/include/public/arch-arm.h
+++ b/xen/include/public/arch-arm.h
@@ -304,6 +304,9 @@ struct xen_arch_domainconfig {
     uint16_t tee_type;
     /* IN */
     uint32_t nr_spis;
+    /* IN */
+    uint16_t sve_vl_bits;
+    uint16_t _pad1;
     /*
      * OUT
      * Based on the property clock-frequency in the DT timer node.
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 51be28c3de7c..616d7a1c070d 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -21,7 +21,7 @@
 #include "hvm/save.h"
 #include "memory.h"
 
-#define XEN_DOMCTL_INTERFACE_VERSION 0x00000015
+#define XEN_DOMCTL_INTERFACE_VERSION 0x00000016
 
 /*
  * NB. xen_domctl.domain is an IN/OUT parameter for this operation.
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 03/10] xen/arm: Expose SVE feature to the guest
  2023-02-02 11:08 [PATCH 00/10] SVE feature for arm guests Luca Fancellu
  2023-02-02 11:08 ` [PATCH 01/10] xen/arm: enable SVE extension for Xen Luca Fancellu
  2023-02-02 11:08 ` [PATCH 02/10] xen/arm: add sve_vl_bits field to domain Luca Fancellu
@ 2023-02-02 11:08 ` Luca Fancellu
  2023-02-02 11:08 ` [PATCH 04/10] xen/arm: add SVE exception class handling Luca Fancellu
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Luca Fancellu @ 2023-02-02 11:08 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk

When a guest is allowed to use SVE, expose the SVE features through
the identification registers.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
Changes from RFC:
 - No changes
---
 xen/arch/arm/arm64/vsysreg.c | 39 ++++++++++++++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
index 758750983c11..10048bb4d221 100644
--- a/xen/arch/arm/arm64/vsysreg.c
+++ b/xen/arch/arm/arm64/vsysreg.c
@@ -18,6 +18,7 @@
 
 #include <xen/sched.h>
 
+#include <asm/arm64/cpufeature.h>
 #include <asm/current.h>
 #include <asm/regs.h>
 #include <asm/traps.h>
@@ -295,7 +296,28 @@ void do_sysreg(struct cpu_user_regs *regs,
     GENERATE_TID3_INFO(MVFR0_EL1, mvfr, 0)
     GENERATE_TID3_INFO(MVFR1_EL1, mvfr, 1)
     GENERATE_TID3_INFO(MVFR2_EL1, mvfr, 2)
-    GENERATE_TID3_INFO(ID_AA64PFR0_EL1, pfr64, 0)
+
+    case HSR_SYSREG_ID_AA64PFR0_EL1:
+    {
+        register_t guest_reg_value = guest_cpuinfo.pfr64.bits[0];
+
+        if ( is_sve_domain(v->domain) )
+        {
+            /* 4 is the SVE field width in id_aa64pfr0_el1 */
+            uint64_t mask = GENMASK(ID_AA64PFR0_SVE_SHIFT + 4 - 1,
+                                    ID_AA64PFR0_SVE_SHIFT);
+            /* sysval is the sve field on the system */
+            uint64_t sysval = cpuid_feature_extract_unsigned_field_width(
+                                system_cpuinfo.pfr64.bits[0],
+                                ID_AA64PFR0_SVE_SHIFT, 4);
+            guest_reg_value &= ~mask;
+            guest_reg_value |= (sysval << ID_AA64PFR0_SVE_SHIFT) & mask;
+        }
+
+        return handle_ro_read_val(regs, regidx, hsr.sysreg.read, hsr, 1,
+                                  guest_reg_value);
+    }
+
     GENERATE_TID3_INFO(ID_AA64PFR1_EL1, pfr64, 1)
     GENERATE_TID3_INFO(ID_AA64DFR0_EL1, dbg64, 0)
     GENERATE_TID3_INFO(ID_AA64DFR1_EL1, dbg64, 1)
@@ -306,7 +328,20 @@ void do_sysreg(struct cpu_user_regs *regs,
     GENERATE_TID3_INFO(ID_AA64MMFR2_EL1, mm64, 2)
     GENERATE_TID3_INFO(ID_AA64AFR0_EL1, aux64, 0)
     GENERATE_TID3_INFO(ID_AA64AFR1_EL1, aux64, 1)
-    GENERATE_TID3_INFO(ID_AA64ZFR0_EL1, zfr64, 0)
+
+    case HSR_SYSREG_ID_AA64ZFR0_EL1:
+    {
+        /*
+         * When the guest has the SVE feature enabled, the whole id_aa64zfr0_el1
+         * needs to be exposed.
+         */
+        register_t guest_reg_value = guest_cpuinfo.zfr64.bits[0];
+        if ( is_sve_domain(v->domain) )
+            guest_reg_value = system_cpuinfo.zfr64.bits[0];
+
+        return handle_ro_read_val(regs, regidx, hsr.sysreg.read, hsr, 1,
+                                  guest_reg_value);
+    }
 
     /*
      * Those cases are catching all Reserved registers trapped by TID3 which
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 04/10] xen/arm: add SVE exception class handling
  2023-02-02 11:08 [PATCH 00/10] SVE feature for arm guests Luca Fancellu
                   ` (2 preceding siblings ...)
  2023-02-02 11:08 ` [PATCH 03/10] xen/arm: Expose SVE feature to the guest Luca Fancellu
@ 2023-02-02 11:08 ` Luca Fancellu
  2023-02-02 11:08 ` [PATCH 05/10] arm/sve: save/restore SVE context switch Luca Fancellu
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Luca Fancellu @ 2023-02-02 11:08 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk

SVE has a new exception class with code 0x19, introduce the new code
and handle the exception.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
Changes from RFC:
 - No changes
---
 xen/arch/arm/include/asm/processor.h |  1 +
 xen/arch/arm/traps.c                 | 12 ++++++++++++
 2 files changed, 13 insertions(+)

diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index 0e38926b94db..625c2bd0cd6c 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -426,6 +426,7 @@
 #define HSR_EC_HVC64                0x16
 #define HSR_EC_SMC64                0x17
 #define HSR_EC_SYSREG               0x18
+#define HSR_EC_SVE                  0x19
 #endif
 #define HSR_EC_INSTR_ABORT_LOWER_EL 0x20
 #define HSR_EC_INSTR_ABORT_CURR_EL  0x21
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index a78a99ddadd0..c2e30feafd5a 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2160,6 +2160,13 @@ void do_trap_guest_sync(struct cpu_user_regs *regs)
         perfc_incr(trap_sysreg);
         do_sysreg(regs, hsr);
         break;
+    case HSR_EC_SVE:
+        GUEST_BUG_ON(regs_mode_is_32bit(regs));
+        gprintk(XENLOG_WARNING,
+                "Domain id %d tried to use SVE while not allowed\n",
+                current->domain->domain_id);
+        inject_undef_exception(regs, hsr);
+        break;
 #endif
 
     case HSR_EC_INSTR_ABORT_LOWER_EL:
@@ -2189,6 +2196,11 @@ void do_trap_hyp_sync(struct cpu_user_regs *regs)
     case HSR_EC_BRK:
         do_trap_brk(regs, hsr);
         break;
+    case HSR_EC_SVE:
+        /* An SVE exception is a bug somewhere in hypervisor code */
+        printk("SVE trap at EL2.\n");
+        do_unexpected_trap("Hypervisor", regs);
+        break;
 #endif
     case HSR_EC_DATA_ABORT_CURR_EL:
     case HSR_EC_INSTR_ABORT_CURR_EL:
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 05/10] arm/sve: save/restore SVE context switch
  2023-02-02 11:08 [PATCH 00/10] SVE feature for arm guests Luca Fancellu
                   ` (3 preceding siblings ...)
  2023-02-02 11:08 ` [PATCH 04/10] xen/arm: add SVE exception class handling Luca Fancellu
@ 2023-02-02 11:08 ` Luca Fancellu
  2023-02-02 11:08 ` [PATCH 06/10] xen/arm: enable Dom0 to use SVE feature Luca Fancellu
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Luca Fancellu @ 2023-02-02 11:08 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk

Save/restore context switch for SVE, allocate memory to contain
the Z0-31 registers whose length is maximum 2048 bits each and
FFR who can be maximum 256 bits, the allocated memory depends on
how many bits is the vector length for the domain and how many bits
are supported by the platform.

Save P0-15 whose length is maximum 256 bits each, in this case the
memory used is from the fpregs field in struct vfp_state,
because V0-31 are part of Z0-31 and this space would have been
unused for SVE domain otherwise.

Create zcr_el{1,2} fields in arch_vcpu, initialise zcr_el2 on vcpu
creation given the requested vector length and restore it on
context switch, save/restore ZCR_EL1 value as well.

Remove headers from sve.c that are already included using
xen/sched.h.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
Changes from RFC:
 - Moved zcr_el2 field introduction in this patch, restore its
   content inside sve_restore_state function. (Julien)
---
 xen/arch/arm/arm64/sve.c                 |  65 ++++++++++-
 xen/arch/arm/arm64/sve_asm.S             | 141 +++++++++++++++++++++++
 xen/arch/arm/arm64/vfp.c                 |  79 +++++++------
 xen/arch/arm/domain.c                    |   7 ++
 xen/arch/arm/include/asm/arm64/sve.h     |  13 +++
 xen/arch/arm/include/asm/arm64/sysregs.h |   3 +
 xen/arch/arm/include/asm/arm64/vfp.h     |  10 ++
 xen/arch/arm/include/asm/domain.h        |   2 +
 8 files changed, 281 insertions(+), 39 deletions(-)

diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
index eee39caa9efa..bb3229ead005 100644
--- a/xen/arch/arm/arm64/sve.c
+++ b/xen/arch/arm/arm64/sve.c
@@ -5,14 +5,29 @@
  * Copyright (C) 2022 ARM Ltd.
  */
 
-#include <xen/types.h>
-#include <asm/cpufeature.h>
+#include <xen/sched.h>
+#include <xen/sizes.h>
 #include <asm/arm64/sve.h>
-#include <asm/arm64/sysregs.h>
-#include <asm/processor.h>
-#include <asm/system.h>
 
 extern unsigned int sve_get_hw_vl(void);
+extern void sve_save_ctx(uint64_t *sve_ctx, uint64_t *pregs, int save_ffr);
+extern void sve_load_ctx(uint64_t const *sve_ctx, uint64_t const *pregs,
+                         int restore_ffr);
+
+static inline uint16_t sve_zreg_ctx_size(uint16_t vl)
+{
+    /*
+     * Z0-31 registers size in bytes is computed from VL that is in bits, so VL
+     * in bytes is VL/8.
+     */
+    return (vl / 8U) * 32U;
+}
+
+static inline uint16_t sve_ffrreg_ctx_size(uint16_t vl)
+{
+    /* FFR register size is VL/8, which is in bytes (VL/8)/8 */
+    return (vl / 64U);
+}
 
 register_t compute_max_zcr(void)
 {
@@ -56,3 +71,43 @@ uint16_t get_sys_vl_len(void)
     return ((system_cpuinfo.zcr64.bits[0] & ZCR_ELx_LEN_MASK) + 1U) *
             SVE_VL_MULTIPLE_VAL;
 }
+
+int sve_context_init(struct vcpu *v)
+{
+    uint64_t *ctx = _xzalloc(sve_zreg_ctx_size(v->domain->arch.sve_vl_bits) +
+                             sve_ffrreg_ctx_size(v->domain->arch.sve_vl_bits),
+                             L1_CACHE_BYTES);
+
+    if ( !ctx )
+        return -ENOMEM;
+
+    v->arch.vfp.sve_context = ctx;
+
+    return 0;
+}
+
+void sve_context_free(struct vcpu *v)
+{
+    xfree(v->arch.vfp.sve_context);
+}
+
+void sve_save_state(struct vcpu *v)
+{
+    uint64_t *sve_ctx_zreg_end = v->arch.vfp.sve_context +
+            (sve_zreg_ctx_size(v->domain->arch.sve_vl_bits) / sizeof(uint64_t));
+
+    v->arch.zcr_el1 = READ_SYSREG(ZCR_EL1);
+
+    sve_save_ctx(sve_ctx_zreg_end, v->arch.vfp.fpregs, 1);
+}
+
+void sve_restore_state(struct vcpu *v)
+{
+    uint64_t *sve_ctx_zreg_end = v->arch.vfp.sve_context +
+            (sve_zreg_ctx_size(v->domain->arch.sve_vl_bits) / sizeof(uint64_t));
+
+    WRITE_SYSREG(v->arch.zcr_el1, ZCR_EL1);
+    WRITE_SYSREG(v->arch.zcr_el2, ZCR_EL2);
+
+    sve_load_ctx(sve_ctx_zreg_end, v->arch.vfp.fpregs, 1);
+}
diff --git a/xen/arch/arm/arm64/sve_asm.S b/xen/arch/arm/arm64/sve_asm.S
index 4d1549344733..8c37d7bc95d5 100644
--- a/xen/arch/arm/arm64/sve_asm.S
+++ b/xen/arch/arm/arm64/sve_asm.S
@@ -17,6 +17,18 @@
     .endif
 .endm
 
+.macro _sve_check_zreg znr
+    .if (\znr) < 0 || (\znr) > 31
+        .error "Bad Scalable Vector Extension vector register number \znr."
+    .endif
+.endm
+
+.macro _sve_check_preg pnr
+    .if (\pnr) < 0 || (\pnr) > 15
+        .error "Bad Scalable Vector Extension predicate register number \pnr."
+    .endif
+.endm
+
 .macro _check_num n, min, max
     .if (\n) < (\min) || (\n) > (\max)
         .error "Number \n out of range [\min,\max]"
@@ -26,6 +38,54 @@
 /* SVE instruction encodings for non-SVE-capable assemblers */
 /* (pre binutils 2.28, all kernel capable clang versions support SVE) */
 
+/* STR (vector): STR Z\nz, [X\nxbase, #\offset, MUL VL] */
+.macro _sve_str_v nz, nxbase, offset=0
+    _sve_check_zreg \nz
+    _check_general_reg \nxbase
+    _check_num (\offset), -0x100, 0xff
+    .inst 0xe5804000                \
+        | (\nz)                     \
+        | ((\nxbase) << 5)          \
+        | (((\offset) & 7) << 10)   \
+        | (((\offset) & 0x1f8) << 13)
+.endm
+
+/* LDR (vector): LDR Z\nz, [X\nxbase, #\offset, MUL VL] */
+.macro _sve_ldr_v nz, nxbase, offset=0
+    _sve_check_zreg \nz
+    _check_general_reg \nxbase
+    _check_num (\offset), -0x100, 0xff
+    .inst 0x85804000                \
+        | (\nz)                     \
+        | ((\nxbase) << 5)          \
+        | (((\offset) & 7) << 10)   \
+        | (((\offset) & 0x1f8) << 13)
+.endm
+
+/* STR (predicate): STR P\np, [X\nxbase, #\offset, MUL VL] */
+.macro _sve_str_p np, nxbase, offset=0
+    _sve_check_preg \np
+    _check_general_reg \nxbase
+    _check_num (\offset), -0x100, 0xff
+    .inst 0xe5800000                \
+        | (\np)                     \
+        | ((\nxbase) << 5)          \
+        | (((\offset) & 7) << 10)   \
+        | (((\offset) & 0x1f8) << 13)
+.endm
+
+/* LDR (predicate): LDR P\np, [X\nxbase, #\offset, MUL VL] */
+.macro _sve_ldr_p np, nxbase, offset=0
+    _sve_check_preg \np
+    _check_general_reg \nxbase
+    _check_num (\offset), -0x100, 0xff
+    .inst 0x85800000                \
+        | (\np)                     \
+        | ((\nxbase) << 5)          \
+        | (((\offset) & 7) << 10)   \
+        | (((\offset) & 0x1f8) << 13)
+.endm
+
 /* RDVL X\nx, #\imm */
 .macro _sve_rdvl nx, imm
     _check_general_reg \nx
@@ -35,11 +95,92 @@
         | (((\imm) & 0x3f) << 5)
 .endm
 
+/* RDFFR (unpredicated): RDFFR P\np.B */
+.macro _sve_rdffr np
+    _sve_check_preg \np
+    .inst 0x2519f000                \
+        | (\np)
+.endm
+
+/* WRFFR P\np.B */
+.macro _sve_wrffr np
+    _sve_check_preg \np
+    .inst 0x25289000                \
+        | ((\np) << 5)
+.endm
+
+.macro __for from:req, to:req
+    .if (\from) == (\to)
+        _for__body %\from
+    .else
+        __for %\from, %((\from) + ((\to) - (\from)) / 2)
+        __for %((\from) + ((\to) - (\from)) / 2 + 1), %\to
+    .endif
+.endm
+
+.macro _for var:req, from:req, to:req, insn:vararg
+    .macro _for__body \var:req
+        .noaltmacro
+        \insn
+        .altmacro
+    .endm
+
+    .altmacro
+    __for \from, \to
+    .noaltmacro
+
+    .purgem _for__body
+.endm
+
+.macro sve_save nxzffrctx, nxpctx, save_ffr
+    _for n, 0, 31, _sve_str_v \n, \nxzffrctx, \n - 32
+    _for n, 0, 15, _sve_str_p \n, \nxpctx, \n
+        cbz \save_ffr, 1f
+        _sve_rdffr 0
+        _sve_str_p 0, \nxzffrctx
+        _sve_ldr_p 0, \nxpctx
+        b 2f
+1:
+        str xzr, [x\nxzffrctx]      // Zero out FFR
+2:
+.endm
+
+.macro sve_load nxzffrctx, nxpctx, restore_ffr
+    _for n, 0, 31, _sve_ldr_v \n, \nxzffrctx, \n - 32
+        cbz \restore_ffr, 1f
+        _sve_ldr_p 0, \nxzffrctx
+        _sve_wrffr 0
+1:
+    _for n, 0, 15, _sve_ldr_p \n, \nxpctx, \n
+.endm
+
 /* Gets the current vector register size in bytes */
 GLOBAL(sve_get_hw_vl)
     _sve_rdvl 0, 1
     ret
 
+/*
+ * Save the SVE context
+ *
+ * x0 - pointer to buffer for Z0-31 + FFR
+ * x1 - pointer to buffer for P0-15
+ * x2 - Save FFR if non-zero
+ */
+GLOBAL(sve_save_ctx)
+    sve_save 0, 1, x2
+    ret
+
+/*
+ * Load the SVE context
+ *
+ * x0 - pointer to buffer for Z0-31 + FFR
+ * x1 - pointer to buffer for P0-15
+ * x2 - Restore FFR if non-zero
+ */
+GLOBAL(sve_load_ctx)
+    sve_load 0, 1, x2
+    ret
+
 /*
  * Local variables:
  * mode: ASM
diff --git a/xen/arch/arm/arm64/vfp.c b/xen/arch/arm/arm64/vfp.c
index 47885e76baae..2d0d7c2e6ddb 100644
--- a/xen/arch/arm/arm64/vfp.c
+++ b/xen/arch/arm/arm64/vfp.c
@@ -2,29 +2,35 @@
 #include <asm/processor.h>
 #include <asm/cpufeature.h>
 #include <asm/vfp.h>
+#include <asm/arm64/sve.h>
 
 void vfp_save_state(struct vcpu *v)
 {
     if ( !cpu_has_fp )
         return;
 
-    asm volatile("stp q0, q1, [%1, #16 * 0]\n\t"
-                 "stp q2, q3, [%1, #16 * 2]\n\t"
-                 "stp q4, q5, [%1, #16 * 4]\n\t"
-                 "stp q6, q7, [%1, #16 * 6]\n\t"
-                 "stp q8, q9, [%1, #16 * 8]\n\t"
-                 "stp q10, q11, [%1, #16 * 10]\n\t"
-                 "stp q12, q13, [%1, #16 * 12]\n\t"
-                 "stp q14, q15, [%1, #16 * 14]\n\t"
-                 "stp q16, q17, [%1, #16 * 16]\n\t"
-                 "stp q18, q19, [%1, #16 * 18]\n\t"
-                 "stp q20, q21, [%1, #16 * 20]\n\t"
-                 "stp q22, q23, [%1, #16 * 22]\n\t"
-                 "stp q24, q25, [%1, #16 * 24]\n\t"
-                 "stp q26, q27, [%1, #16 * 26]\n\t"
-                 "stp q28, q29, [%1, #16 * 28]\n\t"
-                 "stp q30, q31, [%1, #16 * 30]\n\t"
-                 : "=Q" (*v->arch.vfp.fpregs) : "r" (v->arch.vfp.fpregs));
+    if ( is_sve_domain(v->domain) )
+        sve_save_state(v);
+    else
+    {
+        asm volatile("stp q0, q1, [%1, #16 * 0]\n\t"
+                     "stp q2, q3, [%1, #16 * 2]\n\t"
+                     "stp q4, q5, [%1, #16 * 4]\n\t"
+                     "stp q6, q7, [%1, #16 * 6]\n\t"
+                     "stp q8, q9, [%1, #16 * 8]\n\t"
+                     "stp q10, q11, [%1, #16 * 10]\n\t"
+                     "stp q12, q13, [%1, #16 * 12]\n\t"
+                     "stp q14, q15, [%1, #16 * 14]\n\t"
+                     "stp q16, q17, [%1, #16 * 16]\n\t"
+                     "stp q18, q19, [%1, #16 * 18]\n\t"
+                     "stp q20, q21, [%1, #16 * 20]\n\t"
+                     "stp q22, q23, [%1, #16 * 22]\n\t"
+                     "stp q24, q25, [%1, #16 * 24]\n\t"
+                     "stp q26, q27, [%1, #16 * 26]\n\t"
+                     "stp q28, q29, [%1, #16 * 28]\n\t"
+                     "stp q30, q31, [%1, #16 * 30]\n\t"
+                     : "=Q" (*v->arch.vfp.fpregs) : "r" (v->arch.vfp.fpregs));
+    }
 
     v->arch.vfp.fpsr = READ_SYSREG(FPSR);
     v->arch.vfp.fpcr = READ_SYSREG(FPCR);
@@ -37,23 +43,28 @@ void vfp_restore_state(struct vcpu *v)
     if ( !cpu_has_fp )
         return;
 
-    asm volatile("ldp q0, q1, [%1, #16 * 0]\n\t"
-                 "ldp q2, q3, [%1, #16 * 2]\n\t"
-                 "ldp q4, q5, [%1, #16 * 4]\n\t"
-                 "ldp q6, q7, [%1, #16 * 6]\n\t"
-                 "ldp q8, q9, [%1, #16 * 8]\n\t"
-                 "ldp q10, q11, [%1, #16 * 10]\n\t"
-                 "ldp q12, q13, [%1, #16 * 12]\n\t"
-                 "ldp q14, q15, [%1, #16 * 14]\n\t"
-                 "ldp q16, q17, [%1, #16 * 16]\n\t"
-                 "ldp q18, q19, [%1, #16 * 18]\n\t"
-                 "ldp q20, q21, [%1, #16 * 20]\n\t"
-                 "ldp q22, q23, [%1, #16 * 22]\n\t"
-                 "ldp q24, q25, [%1, #16 * 24]\n\t"
-                 "ldp q26, q27, [%1, #16 * 26]\n\t"
-                 "ldp q28, q29, [%1, #16 * 28]\n\t"
-                 "ldp q30, q31, [%1, #16 * 30]\n\t"
-                 : : "Q" (*v->arch.vfp.fpregs), "r" (v->arch.vfp.fpregs));
+    if ( is_sve_domain(v->domain) )
+        sve_restore_state(v);
+    else
+    {
+        asm volatile("ldp q0, q1, [%1, #16 * 0]\n\t"
+                     "ldp q2, q3, [%1, #16 * 2]\n\t"
+                     "ldp q4, q5, [%1, #16 * 4]\n\t"
+                     "ldp q6, q7, [%1, #16 * 6]\n\t"
+                     "ldp q8, q9, [%1, #16 * 8]\n\t"
+                     "ldp q10, q11, [%1, #16 * 10]\n\t"
+                     "ldp q12, q13, [%1, #16 * 12]\n\t"
+                     "ldp q14, q15, [%1, #16 * 14]\n\t"
+                     "ldp q16, q17, [%1, #16 * 16]\n\t"
+                     "ldp q18, q19, [%1, #16 * 18]\n\t"
+                     "ldp q20, q21, [%1, #16 * 20]\n\t"
+                     "ldp q22, q23, [%1, #16 * 22]\n\t"
+                     "ldp q24, q25, [%1, #16 * 24]\n\t"
+                     "ldp q26, q27, [%1, #16 * 26]\n\t"
+                     "ldp q28, q29, [%1, #16 * 28]\n\t"
+                     "ldp q30, q31, [%1, #16 * 30]\n\t"
+                     : : "Q" (*v->arch.vfp.fpregs), "r" (v->arch.vfp.fpregs));
+    }
 
     WRITE_SYSREG(v->arch.vfp.fpsr, FPSR);
     WRITE_SYSREG(v->arch.vfp.fpcr, FPCR);
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 8d64e3c08edf..4a9a19189dd2 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -552,7 +552,12 @@ int arch_vcpu_create(struct vcpu *v)
 
     v->arch.cptr_el2 = get_default_cptr_flags();
     if ( is_sve_domain(v->domain) )
+    {
+        if ( (rc = sve_context_init(v)) != 0 )
+            goto fail;
         v->arch.cptr_el2 &= ~HCPTR_CP(8);
+        v->arch.zcr_el2 = vl_to_zcr(v->domain->arch.sve_vl_bits);
+    }
 
     v->arch.hcr_el2 = get_default_hcr_flags();
 
@@ -582,6 +587,8 @@ fail:
 
 void arch_vcpu_destroy(struct vcpu *v)
 {
+    if ( is_sve_domain(v->domain) )
+        sve_context_free(v);
     vcpu_timer_destroy(v);
     vcpu_vgic_free(v);
     free_xenheap_pages(v->arch.stack, STACK_ORDER);
diff --git a/xen/arch/arm/include/asm/arm64/sve.h b/xen/arch/arm/include/asm/arm64/sve.h
index f4a660e402ca..28c31b329233 100644
--- a/xen/arch/arm/include/asm/arm64/sve.h
+++ b/xen/arch/arm/include/asm/arm64/sve.h
@@ -24,6 +24,10 @@ static inline bool is_vl_valid(uint16_t vl)
 register_t compute_max_zcr(void);
 register_t vl_to_zcr(uint16_t vl);
 uint16_t get_sys_vl_len(void);
+int sve_context_init(struct vcpu *v);
+void sve_context_free(struct vcpu *v);
+void sve_save_state(struct vcpu *v);
+void sve_restore_state(struct vcpu *v);
 
 #else /* !CONFIG_ARM64_SVE */
 
@@ -42,6 +46,15 @@ static inline uint16_t get_sys_vl_len(void)
     return 0;
 }
 
+static inline int sve_context_init(struct vcpu *v)
+{
+    return 0;
+}
+
+static inline void sve_context_free(struct vcpu *v) {}
+static inline void sve_save_state(struct vcpu *v) {}
+static inline void sve_restore_state(struct vcpu *v) {}
+
 #endif
 
 #endif /* _ARM_ARM64_SVE_H */
diff --git a/xen/arch/arm/include/asm/arm64/sysregs.h b/xen/arch/arm/include/asm/arm64/sysregs.h
index 4cabb9eb4d5e..3fdeb9d8cdef 100644
--- a/xen/arch/arm/include/asm/arm64/sysregs.h
+++ b/xen/arch/arm/include/asm/arm64/sysregs.h
@@ -88,6 +88,9 @@
 #ifndef ID_AA64ISAR2_EL1
 #define ID_AA64ISAR2_EL1            S3_0_C0_C6_2
 #endif
+#ifndef ZCR_EL1
+#define ZCR_EL1                     S3_0_C1_C2_0
+#endif
 
 /* ID registers (imported from arm64/include/asm/sysreg.h in Linux) */
 
diff --git a/xen/arch/arm/include/asm/arm64/vfp.h b/xen/arch/arm/include/asm/arm64/vfp.h
index e6e8c363bc16..8af714cb8ecc 100644
--- a/xen/arch/arm/include/asm/arm64/vfp.h
+++ b/xen/arch/arm/include/asm/arm64/vfp.h
@@ -6,7 +6,17 @@
 
 struct vfp_state
 {
+    /*
+     * When SVE is enabled for the guest, fpregs memory will be used to
+     * save/restore P0-P15 registers, otherwise it will be used for the V0-V31
+     * registers.
+     */
     uint64_t fpregs[64] __vfp_aligned;
+    /*
+     * When SVE is enabled for the guest, sve_context contains memory to
+     * save/restore Z0-Z31 registers and FFR.
+     */
+    uint64_t *sve_context;
     register_t fpcr;
     register_t fpexc32_el2;
     register_t fpsr;
diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
index 1bd669e0a5c1..4d1066750a9b 100644
--- a/xen/arch/arm/include/asm/domain.h
+++ b/xen/arch/arm/include/asm/domain.h
@@ -195,6 +195,8 @@ struct arch_vcpu
     register_t tpidrro_el0;
 
     /* HYP configuration */
+    register_t zcr_el1;
+    register_t zcr_el2;
     register_t cptr_el2;
     register_t hcr_el2;
     register_t mdcr_el2;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 06/10] xen/arm: enable Dom0 to use SVE feature
  2023-02-02 11:08 [PATCH 00/10] SVE feature for arm guests Luca Fancellu
                   ` (4 preceding siblings ...)
  2023-02-02 11:08 ` [PATCH 05/10] arm/sve: save/restore SVE context switch Luca Fancellu
@ 2023-02-02 11:08 ` Luca Fancellu
  2023-02-02 11:08 ` [PATCH 07/10] xen/physinfo: add arm SVE arch capability and vector length Luca Fancellu
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Luca Fancellu @ 2023-02-02 11:08 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Andrew Cooper, George Dunlap,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu,
	Volodymyr Babchuk

Add a command line parameter to allow Dom0 the use of SVE resources,
the command line parameter dom0_sve controls the feature on this
domain and sets the maximum SVE vector length for Dom0.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
Changes from RFC:
 - Changed docs to explain that the domain won't be created if the
   requested vector length is above the supported one from the
   platform.
---
 docs/misc/xen-command-line.pandoc    | 13 +++++++++++++
 xen/arch/arm/arm64/sve.c             |  5 +++++
 xen/arch/arm/domain_build.c          |  4 ++++
 xen/arch/arm/include/asm/arm64/sve.h |  4 ++++
 4 files changed, 26 insertions(+)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 923910f553c5..5ccda7037f5a 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -995,6 +995,19 @@ restrictions set up here. Note that the values to be specified here are
 ACPI PXM ones, not Xen internal node numbers. `relaxed` sets up vCPU
 affinities to prefer but be not limited to the specified node(s).
 
+### dom0_sve (arm)
+> `= <integer>`
+
+> Default: `0`
+
+Enable arm SVE usage for Dom0 domain and sets the maximum SVE vector length.
+Values above 0 means feature is enabled for Dom0, otherwise feature is disabled.
+Possible values are from 0 to maximum 2048, being multiple of 128, that will be
+the maximum vector length.
+Please note that the platform can supports a lower value, if the requested value
+is above the supported one, the domain creation will fail and the system will
+stop.
+
 ### dom0_vcpus_pin
 > `= <boolean>`
 
diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
index bb3229ead005..6f6bf30b5c4c 100644
--- a/xen/arch/arm/arm64/sve.c
+++ b/xen/arch/arm/arm64/sve.c
@@ -5,10 +5,15 @@
  * Copyright (C) 2022 ARM Ltd.
  */
 
+#include <xen/param.h>
 #include <xen/sched.h>
 #include <xen/sizes.h>
 #include <asm/arm64/sve.h>
 
+/* opt_dom0_sve: allow Dom0 to use SVE and set maximum vector length. */
+unsigned int __initdata opt_dom0_sve;
+integer_param("dom0_sve", opt_dom0_sve);
+
 extern unsigned int sve_get_hw_vl(void);
 extern void sve_save_ctx(uint64_t *sve_ctx, uint64_t *pregs, int save_ffr);
 extern void sve_load_ctx(uint64_t const *sve_ctx, uint64_t const *pregs,
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index c2b97fa21e20..fbfdf3417cef 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -26,6 +26,7 @@
 #include <asm/platform.h>
 #include <asm/psci.h>
 #include <asm/setup.h>
+#include <asm/arm64/sve.h>
 #include <asm/cpufeature.h>
 #include <asm/domain_build.h>
 #include <xen/event.h>
@@ -4076,6 +4077,9 @@ void __init create_dom0(void)
     if ( iommu_enabled )
         dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
 
+    if ( opt_dom0_sve > 0 )
+        dom0_cfg.arch.sve_vl_bits = opt_dom0_sve;
+
     dom0 = domain_create(0, &dom0_cfg, CDF_privileged | CDF_directmap);
     if ( IS_ERR(dom0) || (alloc_dom0_vcpu0(dom0) == NULL) )
         panic("Error creating domain 0\n");
diff --git a/xen/arch/arm/include/asm/arm64/sve.h b/xen/arch/arm/include/asm/arm64/sve.h
index 28c31b329233..dc6e747cec9e 100644
--- a/xen/arch/arm/include/asm/arm64/sve.h
+++ b/xen/arch/arm/include/asm/arm64/sve.h
@@ -21,6 +21,8 @@ static inline bool is_vl_valid(uint16_t vl)
 
 #ifdef CONFIG_ARM64_SVE
 
+extern unsigned int opt_dom0_sve;
+
 register_t compute_max_zcr(void);
 register_t vl_to_zcr(uint16_t vl);
 uint16_t get_sys_vl_len(void);
@@ -31,6 +33,8 @@ void sve_restore_state(struct vcpu *v);
 
 #else /* !CONFIG_ARM64_SVE */
 
+#define opt_dom0_sve (0)
+
 static inline register_t compute_max_zcr(void)
 {
     return 0;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 07/10] xen/physinfo: add arm SVE arch capability and vector length
  2023-02-02 11:08 [PATCH 00/10] SVE feature for arm guests Luca Fancellu
                   ` (5 preceding siblings ...)
  2023-02-02 11:08 ` [PATCH 06/10] xen/arm: enable Dom0 to use SVE feature Luca Fancellu
@ 2023-02-02 11:08 ` Luca Fancellu
  2023-02-02 12:05   ` Jan Beulich
  2023-02-02 11:08 ` [PATCH 08/10] tools: add SVE capability and SVE vector length for physinfo Luca Fancellu
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 18+ messages in thread
From: Luca Fancellu @ 2023-02-02 11:08 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Jan Beulich,
	Wei Liu

When the arm platform supports SVE, advertise the feature by a new
flag for the arch_capabilities in struct xen_sysctl_physinfo and add
a new field "arm_sve_vl_bits" where on arm there can be stored the
maximum SVE vector length in bits.

Update the padding.

Bump XEN_SYSCTL_INTERFACE_VERSION for the changes.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
Changes from RFC:
 - new patch
---
Here I need an opinion from arm and x86 maintainer, I see there is no arch
specific structure in struct xen_sysctl_physinfo, hw_cap is used only by x86
and now arch_capabilities and the new arm_sve_vl_bits will be used only by arm.
So how should we proceed here? Shall we create a struct arch for each
architecture and for example move arch_capabilities inside it (renaming to
capabilities) and every arch specific field as well?
(is hw_cap only for x86?)
---
 xen/arch/arm/sysctl.c       |  7 +++++++
 xen/include/public/sysctl.h | 11 +++++++++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/sysctl.c b/xen/arch/arm/sysctl.c
index b0a78a8b10d0..d65f8be498f4 100644
--- a/xen/arch/arm/sysctl.c
+++ b/xen/arch/arm/sysctl.c
@@ -11,11 +11,18 @@
 #include <xen/lib.h>
 #include <xen/errno.h>
 #include <xen/hypercall.h>
+#include <asm/arm64/sve.h>
 #include <public/sysctl.h>
 
 void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
 {
     pi->capabilities |= XEN_SYSCTL_PHYSCAP_hvm | XEN_SYSCTL_PHYSCAP_hap;
+
+    if ( cpu_has_sve )
+    {
+        pi->arch_capabilities |= XEN_SYSCTL_PHYSCAP_ARM_sve;
+        pi->arm_sve_vl_bits = get_sys_vl_len();
+    }
 }
 
 long arch_do_sysctl(struct xen_sysctl *sysctl,
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 001a4de27375..5acbb41256bc 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -18,7 +18,7 @@
 #include "domctl.h"
 #include "physdev.h"
 
-#define XEN_SYSCTL_INTERFACE_VERSION 0x00000015
+#define XEN_SYSCTL_INTERFACE_VERSION 0x00000016
 
 /*
  * Read console content from Xen buffer ring.
@@ -94,6 +94,12 @@ struct xen_sysctl_tbuf_op {
 /* Max XEN_SYSCTL_PHYSCAP_* constant.  Used for ABI checking. */
 #define XEN_SYSCTL_PHYSCAP_MAX XEN_SYSCTL_PHYSCAP_gnttab_v2
 
+/* Arm platform is SVE capable */
+#define XEN_SYSCTL_PHYSCAP_ARM_sve (1u << 0)
+
+/* Max XEN_SYSCTL_PHYSCAP_ARM_* constant.  Used for ABI checking. */
+#define XEN_SYSCTL_PHYSCAP_ARM_MAX XEN_SYSCTL_PHYSCAP_ARM_sve
+
 struct xen_sysctl_physinfo {
     uint32_t threads_per_core;
     uint32_t cores_per_socket;
@@ -104,7 +110,8 @@ struct xen_sysctl_physinfo {
     uint32_t cpu_khz;
     uint32_t capabilities;/* XEN_SYSCTL_PHYSCAP_??? */
     uint32_t arch_capabilities;/* XEN_SYSCTL_PHYSCAP_{X86,ARM,...}_??? */
-    uint32_t pad;
+    uint16_t arm_sve_vl_bits;
+    uint16_t pad;
     uint64_aligned_t total_pages;
     uint64_aligned_t free_pages;
     uint64_aligned_t scrub_pages;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 08/10] tools: add SVE capability and SVE vector length for physinfo
  2023-02-02 11:08 [PATCH 00/10] SVE feature for arm guests Luca Fancellu
                   ` (6 preceding siblings ...)
  2023-02-02 11:08 ` [PATCH 07/10] xen/physinfo: add arm SVE arch capability and vector length Luca Fancellu
@ 2023-02-02 11:08 ` Luca Fancellu
  2023-02-02 12:38   ` Marek Marczykowski-Górecki
  2023-02-02 14:12   ` Andrew Cooper
  2023-02-02 11:08 ` [PATCH 09/10] xen/tools: add sve parameter in XL configuration Luca Fancellu
  2023-02-02 11:08 ` [PATCH 10/10] xen/arm: add sve property for dom0less domUs Luca Fancellu
  9 siblings, 2 replies; 18+ messages in thread
From: Luca Fancellu @ 2023-02-02 11:08 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, George Dunlap, Nick Rosbrook,
	Wei Liu, Anthony PERARD, Juergen Gross, Christian Lindig,
	David Scott, Marek Marczykowski-Górecki

Recent changes added in struct xen_sysctl_physinfo a new flag for
arch_capabilities to understand when the platform is SVE capable,
another field having the maximum SVE vector length in bits was
added as well, so update the tools to handle these new fields.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
Changes from RFC:
 - new patch
---
This patch is mostly dependent on the decisions we make in the previous, anyway
I touched some part of the toolstack where my knowledge is limited (ocaml) so
I might need a feedback for something I may have done wrong.
---
 tools/golang/xenlight/helpers.gen.go |  4 ++++
 tools/golang/xenlight/types.gen.go   |  2 ++
 tools/include/libxl.h                |  8 ++++++++
 tools/libs/light/libxl.c             |  3 +++
 tools/libs/light/libxl_types.idl     |  2 ++
 tools/ocaml/libs/xc/xenctrl.ml       |  4 +++-
 tools/ocaml/libs/xc/xenctrl.mli      |  4 +++-
 tools/ocaml/libs/xc/xenctrl_stubs.c  | 17 +++++++++++++----
 tools/python/xen/lowlevel/xc/xc.c    | 17 +++++++++++++++--
 tools/xl/xl_info.c                   | 10 ++++++++++
 10 files changed, 63 insertions(+), 8 deletions(-)

diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
index 3ac4938858f2..0e2c1b3acc74 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -3455,6 +3455,8 @@ x.CapVmtrace = bool(xc.cap_vmtrace)
 x.CapVpmu = bool(xc.cap_vpmu)
 x.CapGnttabV1 = bool(xc.cap_gnttab_v1)
 x.CapGnttabV2 = bool(xc.cap_gnttab_v2)
+x.CapArmSve = bool(xc.cap_arm_sve)
+x.ArmSveVlBits = uint16(xc.arm_sve_vl_bits)
 
  return nil}
 
@@ -3489,6 +3491,8 @@ xc.cap_vmtrace = C.bool(x.CapVmtrace)
 xc.cap_vpmu = C.bool(x.CapVpmu)
 xc.cap_gnttab_v1 = C.bool(x.CapGnttabV1)
 xc.cap_gnttab_v2 = C.bool(x.CapGnttabV2)
+xc.cap_arm_sve = C.bool(x.CapArmSve)
+xc.arm_sve_vl_bits = C.uint16(x.ArmSveVlBits)
 
  return nil
  }
diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
index 16ce879e3fb7..d4b67c97bb54 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -1051,6 +1051,8 @@ CapVmtrace bool
 CapVpmu bool
 CapGnttabV1 bool
 CapGnttabV2 bool
+CapArmSve bool
+ArmSveVlBits uint16
 }
 
 type Connectorinfo struct {
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index d652895075a0..ce2997486953 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -520,6 +520,14 @@
  */
 #define LIBXL_HAVE_PHYSINFO_CAP_GNTTAB 1
 
+/*
+ * LIBXL_HAVE_PHYSINFO_CAP_ARM_SVE indicates that libxl_physinfo has a
+ * cap_arm_sve field, which indicates the availability of Arm SVE feature for
+ * guests and has arm_sve_vl_bits fields which indicates the supported maximum
+ * vector length for the platform.
+ */
+#define LIBXL_HAVE_PHYSINFO_CAP_ARM_SVE 1
+
 /*
  * LIBXL_HAVE_MAX_GRANT_VERSION indicates libxl_domain_build_info has a
  * max_grant_version field for setting the max grant table version per
diff --git a/tools/libs/light/libxl.c b/tools/libs/light/libxl.c
index a0bf7d186f69..ed9bd0ff520b 100644
--- a/tools/libs/light/libxl.c
+++ b/tools/libs/light/libxl.c
@@ -409,6 +409,9 @@ int libxl_get_physinfo(libxl_ctx *ctx, libxl_physinfo *physinfo)
         !!(xcphysinfo.capabilities & XEN_SYSCTL_PHYSCAP_gnttab_v1);
     physinfo->cap_gnttab_v2 =
         !!(xcphysinfo.capabilities & XEN_SYSCTL_PHYSCAP_gnttab_v2);
+    physinfo->cap_arm_sve =
+        !!(xcphysinfo.arch_capabilities & XEN_SYSCTL_PHYSCAP_ARM_sve);
+    physinfo->arm_sve_vl_bits = xcphysinfo.arm_sve_vl_bits;
 
     GC_FREE;
     return 0;
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 0cfad8508dbd..5bd44b37c74d 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -1106,6 +1106,8 @@ libxl_physinfo = Struct("physinfo", [
     ("cap_vpmu", bool),
     ("cap_gnttab_v1", bool),
     ("cap_gnttab_v2", bool),
+    ("cap_arm_sve", bool),
+    ("arm_sve_vl_bits", uint16),
     ], dir=DIR_OUT)
 
 libxl_connectorinfo = Struct("connectorinfo", [
diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
index 7442bbbfc5e0..ce8ce2d1a663 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -128,7 +128,8 @@ type physinfo_cap_flag =
 	| CAP_Gnttab_v1
 	| CAP_Gnttab_v2
 
-type arm_physinfo_cap_flag
+type arm_physinfo_cap_flag =
+	| CAP_sve
 
 type x86_physinfo_cap_flag
 
@@ -150,6 +151,7 @@ type physinfo =
 	capabilities     : physinfo_cap_flag list;
 	max_nr_cpus      : int;
 	arch_capabilities : arch_physinfo_cap_flags;
+	arm_sve_vl_bits : int;
 }
 
 type version =
diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index 82def5a17cc2..ac479f44ddb2 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -113,7 +113,8 @@ type physinfo_cap_flag =
   | CAP_Gnttab_v1
   | CAP_Gnttab_v2
 
-type arm_physinfo_cap_flag
+type arm_physinfo_cap_flag =
+  | CAP_sve
 
 type x86_physinfo_cap_flag
 
@@ -133,6 +134,7 @@ type physinfo = {
   capabilities     : physinfo_cap_flag list;
   max_nr_cpus      : int; (** compile-time max possible number of nr_cpus *)
   arch_capabilities : arch_physinfo_cap_flags;
+  arm_sve_vl_bits : int;
 }
 type version = { major : int; minor : int; extra : string; }
 type compile_info = {
diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
index 2fba9c5e94d6..d85e4a20e6d0 100644
--- a/tools/ocaml/libs/xc/xenctrl_stubs.c
+++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
@@ -810,7 +810,7 @@ CAMLprim value stub_xc_physinfo(value xch)
 		/* ! XEN_SYSCTL_PHYSCAP_ XEN_SYSCTL_PHYSCAP_MAX max */
 		(c_physinfo.capabilities);
 
-	physinfo = caml_alloc_tuple(11);
+	physinfo = caml_alloc_tuple(12);
 	Store_field(physinfo, 0, Val_int(c_physinfo.threads_per_core));
 	Store_field(physinfo, 1, Val_int(c_physinfo.cores_per_socket));
 	Store_field(physinfo, 2, Val_int(c_physinfo.nr_cpus));
@@ -825,13 +825,22 @@ CAMLprim value stub_xc_physinfo(value xch)
 #if defined(__i386__) || defined(__x86_64__)
 	arch_cap_list = Tag_cons;
 
-	arch_cap_flags_tag = 1; /* tag x86 */
+	arch_cap_flags = caml_alloc_small(1, 1);
+	Store_field(arch_cap_flags, 0, arch_cap_list);
+	Store_field(physinfo, 11, Tag_cons);
+#elif defined (__arm__) || defined(__aarch64__)
+	/*
+	 * capabilities: arm_physinfo_cap_flag list;
+	 */
+	arch_cap_flags = c_bitmap_to_ocaml_list
+		/* ! arm_physinfo_cap_flag CAP_ lc */
+		/* ! XEN_SYSCTL_PHYSCAP_ARM_ XEN_SYSCTL_PHYSCAP_ARM_MAX max */
+		(c_physinfo.arch_capabilities);
+	Store_field(physinfo, 11, Val_int(c_physinfo.arm_sve_vl_bits));
 #else
 	caml_failwith("Unhandled architecture");
 #endif
 
-	arch_cap_flags = caml_alloc_small(1, arch_cap_flags_tag);
-	Store_field(arch_cap_flags, 0, arch_cap_list);
 	Store_field(physinfo, 10, arch_cap_flags);
 
 	CAMLreturn(physinfo);
diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
index fd008610329b..516fa57161a6 100644
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -870,6 +870,11 @@ static PyObject *pyxc_physinfo(XcObject *self)
     const char *virtcap_names[] = { "hvm", "pv" };
     const unsigned virtcaps_bits[] = { XEN_SYSCTL_PHYSCAP_hvm,
                                        XEN_SYSCTL_PHYSCAP_pv };
+#if defined(__aarch64__)
+    const char py_buildval[] = "{s:i,s:i,s:i,s:i,s:l,s:l,s:l,s:i,s:s,s:s,s:i}";
+#else
+    const char py_buildval[] = "{s:i,s:i,s:i,s:i,s:l,s:l,s:l,s:i,s:s,s:s}";
+#endif
 
     if ( xc_physinfo(self->xc_handle, &pinfo) != 0 )
         return pyxc_error_to_exception(self->xc_handle);
@@ -893,10 +898,14 @@ static PyObject *pyxc_physinfo(XcObject *self)
         for ( i = 0; i < ARRAY_SIZE(virtcaps_bits); i++ )
             if ( pinfo.capabilities & virtcaps_bits[i] )
               p += sprintf(p, "%s_directio ", virtcap_names[i]);
+#if defined(__aarch64__)
+    if ( pinfo.arch_capabilities & XEN_SYSCTL_PHYSCAP_ARM_sve )
+        p += sprintf(p, "%s ", "arm_sve");
+#endif
     if ( p != virt_caps )
       *(p-1) = '\0';
 
-    return Py_BuildValue("{s:i,s:i,s:i,s:i,s:l,s:l,s:l,s:i,s:s,s:s}",
+    return Py_BuildValue(py_buildval,
                             "nr_nodes",         pinfo.nr_nodes,
                             "threads_per_core", pinfo.threads_per_core,
                             "cores_per_socket", pinfo.cores_per_socket,
@@ -906,7 +915,11 @@ static PyObject *pyxc_physinfo(XcObject *self)
                             "scrub_memory",     pages_to_kib(pinfo.scrub_pages),
                             "cpu_khz",          pinfo.cpu_khz,
                             "hw_caps",          cpu_cap,
-                            "virt_caps",        virt_caps);
+                            "virt_caps",        virt_caps
+#if defined(__aarch64__)
+                            , "arm_sve_vl_bits",  pinfo.arm_sve_vl_bits
+#endif
+                            );
 }
 
 static PyObject *pyxc_getcpuinfo(XcObject *self, PyObject *args, PyObject *kwds)
diff --git a/tools/xl/xl_info.c b/tools/xl/xl_info.c
index 712b7638b013..44345d405873 100644
--- a/tools/xl/xl_info.c
+++ b/tools/xl/xl_info.c
@@ -210,7 +210,11 @@ static void output_physinfo(void)
          info.hw_cap[4], info.hw_cap[5], info.hw_cap[6], info.hw_cap[7]
         );
 
+#if defined(__aarch64__)
+    maybe_printf("virt_caps              :%s%s%s%s%s%s%s%s%s%s%s%s\n",
+#else
     maybe_printf("virt_caps              :%s%s%s%s%s%s%s%s%s%s%s\n",
+#endif
          info.cap_pv ? " pv" : "",
          info.cap_hvm ? " hvm" : "",
          info.cap_hvm && info.cap_hvm_directio ? " hvm_directio" : "",
@@ -222,6 +226,9 @@ static void output_physinfo(void)
          info.cap_vpmu ? " vpmu" : "",
          info.cap_gnttab_v1 ? " gnttab-v1" : "",
          info.cap_gnttab_v2 ? " gnttab-v2" : ""
+#if defined(__aarch64__)
+         , info.cap_arm_sve ? " arm_sve" : ""
+#endif
         );
 
     vinfo = libxl_get_version_info(ctx);
@@ -240,6 +247,9 @@ static void output_physinfo(void)
         maybe_printf("free_cpus              : %d\n", n);
         free(cpumap.map);
     }
+#if defined(__aarch64__)
+    maybe_printf("arm_sve_max_vl_bits    : %d\n", info.arm_sve_vl_bits);
+#endif
     libxl_physinfo_dispose(&info);
     return;
 }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 09/10] xen/tools: add sve parameter in XL configuration
  2023-02-02 11:08 [PATCH 00/10] SVE feature for arm guests Luca Fancellu
                   ` (7 preceding siblings ...)
  2023-02-02 11:08 ` [PATCH 08/10] tools: add SVE capability and SVE vector length for physinfo Luca Fancellu
@ 2023-02-02 11:08 ` Luca Fancellu
  2023-02-02 11:08 ` [PATCH 10/10] xen/arm: add sve property for dom0less domUs Luca Fancellu
  9 siblings, 0 replies; 18+ messages in thread
From: Luca Fancellu @ 2023-02-02 11:08 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Wei Liu, Anthony PERARD,
	George Dunlap, Nick Rosbrook, Juergen Gross

Add sve parameter in XL configuration to allow guests to use
SVE feature.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
Changes from RFC:
 - changed libxl_types.idl sve field to uint16
 - now toolstack uses info from physinfo to check against the
   sve XL value
 - Changed documentation
---
 docs/man/xl.cfg.5.pod.in             | 11 +++++++++++
 tools/golang/xenlight/helpers.gen.go |  2 ++
 tools/golang/xenlight/types.gen.go   |  1 +
 tools/include/libxl.h                |  5 +++++
 tools/libs/light/libxl_arm.c         |  2 ++
 tools/libs/light/libxl_types.idl     |  1 +
 tools/xl/xl_parse.c                  | 22 ++++++++++++++++++++--
 7 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index 024bceeb61b2..bb138322e290 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -2903,6 +2903,17 @@ Currently, only the "sbsa_uart" model is supported for ARM.
 
 =back
 
+=item B<sve="NUMBER">
+
+To enable SVE, user must specify a number different from zero, maximum 2048 and
+multiple of 128. That value will be the maximum number of SVE registers bits
+that the hypervisor will impose to this guest. If the platform has a lower
+supported bits value, then the domain creation will fail.
+A value equal to zero is the default and it means this guest is not allowed to
+use SVE.
+
+=back
+
 =head3 x86
 
 =over 4
diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
index 0e2c1b3acc74..a58b5f1ea268 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -1117,6 +1117,7 @@ default:
 return fmt.Errorf("invalid union key '%v'", x.Type)}
 x.ArchArm.GicVersion = GicVersion(xc.arch_arm.gic_version)
 x.ArchArm.Vuart = VuartType(xc.arch_arm.vuart)
+x.ArchArm.Sve = int(xc.arch_arm.sve)
 if err := x.ArchX86.MsrRelaxed.fromC(&xc.arch_x86.msr_relaxed);err != nil {
 return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
 }
@@ -1602,6 +1603,7 @@ default:
 return fmt.Errorf("invalid union key '%v'", x.Type)}
 xc.arch_arm.gic_version = C.libxl_gic_version(x.ArchArm.GicVersion)
 xc.arch_arm.vuart = C.libxl_vuart_type(x.ArchArm.Vuart)
+xc.arch_arm.sve = C.int(x.Sve)
 if err := x.ArchX86.MsrRelaxed.toC(&xc.arch_x86.msr_relaxed); err != nil {
 return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
 }
diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
index d4b67c97bb54..f1ba391d2d3e 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -537,6 +537,7 @@ TypeUnion DomainBuildInfoTypeUnion
 ArchArm struct {
 GicVersion GicVersion
 Vuart VuartType
+Sve uint32
 }
 ArchX86 struct {
 MsrRelaxed Defbool
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index ce2997486953..6385db38bf8c 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -278,6 +278,11 @@
  */
 #define LIBXL_HAVE_BUILDINFO_ARCH_ARM_TEE 1
 
+/*
+ * libxl_domain_build_info has the arch_arm.sve field.
+ */
+#define LIBXL_HAVE_BUILDINFO_ARCH_ARM_SVE 1
+
 /*
  * LIBXL_HAVE_SOFT_RESET indicates that libxl supports performing
  * 'soft reset' for domains and there is 'soft_reset' shutdown reason
diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index ddc7b2a15975..31f30e054bf4 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -211,6 +211,8 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
         return ERROR_FAIL;
     }
 
+    config->arch.sve_vl_bits = d_config->b_info.arch_arm.sve;
+
     return 0;
 }
 
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 5bd44b37c74d..29a1c86ce3b5 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -663,6 +663,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
 
     ("arch_arm", Struct(None, [("gic_version", libxl_gic_version),
                                ("vuart", libxl_vuart_type),
+                               ("sve", uint16),
                               ])),
     ("arch_x86", Struct(None, [("msr_relaxed", libxl_defbool),
                               ])),
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index 853e9f357a1a..25af5d5dfafb 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -1316,8 +1316,6 @@ void parse_config_data(const char *config_source,
         exit(EXIT_FAILURE);
     }
 
-    libxl_physinfo_dispose(&physinfo);
-
     config= xlu_cfg_init(stderr, config_source);
     if (!config) {
         fprintf(stderr, "Failed to allocate for configuration\n");
@@ -2828,6 +2826,26 @@ skip_usbdev:
         }
     }
 
+    if (!xlu_cfg_get_long (config, "sve", &l, 0)) {
+        if (!physinfo.cap_arm_sve) {
+            fprintf(stderr, "SVE is not supported by the platform\n");
+            exit(-ERROR_FAIL);
+        } else if (((l % 128) != 0) || (l > 2048)) {
+            fprintf(stderr,
+                    "Invalid sve value: %ld. Needs to be <= 2048 and multiple"
+                    " of 128\n", l);
+            exit(-ERROR_FAIL);
+        } else if (l > physinfo.arm_sve_vl_bits) {
+            fprintf(stderr,
+                    "Invalid sve value: %ld. Platform supports up to %u bits\n",
+                    l, physinfo.arm_sve_vl_bits);
+            exit(-ERROR_FAIL);
+        }
+        b_info->arch_arm.sve = l;
+    }
+
+    libxl_physinfo_dispose(&physinfo);
+
     parse_vkb_list(config, d_config);
 
     d_config->virtios = NULL;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 10/10] xen/arm: add sve property for dom0less domUs
  2023-02-02 11:08 [PATCH 00/10] SVE feature for arm guests Luca Fancellu
                   ` (8 preceding siblings ...)
  2023-02-02 11:08 ` [PATCH 09/10] xen/tools: add sve parameter in XL configuration Luca Fancellu
@ 2023-02-02 11:08 ` Luca Fancellu
  9 siblings, 0 replies; 18+ messages in thread
From: Luca Fancellu @ 2023-02-02 11:08 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk

Add a device tree property in the dom0less domU configuration
to enable the guest to use SVE.

Update documentation.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
Changes from RFC:
 - Changed documentation
---
 docs/misc/arm/device-tree/booting.txt | 9 +++++++++
 xen/arch/arm/domain_build.c           | 7 +++++++
 2 files changed, 16 insertions(+)

diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt
index 3879340b5e0a..d74bf9ab1c8b 100644
--- a/docs/misc/arm/device-tree/booting.txt
+++ b/docs/misc/arm/device-tree/booting.txt
@@ -193,6 +193,15 @@ with the following properties:
     Optional. Handle to a xen,cpupool device tree node that identifies the
     cpupool where the guest will be started at boot.
 
+- sve
+
+    Optional. A number that, when above 0, enables SVE for this guest and sets
+    its maximum SVE vector length. The default value is 0, that means this
+    guest is not allowed to use SVE, the maximum value allowed is 2048, any
+    other value must be multiple of 128.
+    Please note that if the platform supports a lower value of bits, then the
+    domain creation will fail.
+
 - xen,enhanced
 
     A string property. Possible property values are:
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index fbfdf3417cef..3752e2c7dc16 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -3960,6 +3960,13 @@ void __init create_domUs(void)
             d_cfg.max_maptrack_frames = val;
         }
 
+        if ( dt_property_read_u32(node, "sve", &val) )
+        {
+            if ( val > UINT16_MAX )
+                panic("sve property value (%"PRIu32") overflow\n", val);
+            d_cfg.arch.sve_vl_bits = val;
+        }
+
         /*
          * The variable max_init_domid is initialized with zero, so here it's
          * very important to use the pre-increment operator to call
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 07/10] xen/physinfo: add arm SVE arch capability and vector length
  2023-02-02 11:08 ` [PATCH 07/10] xen/physinfo: add arm SVE arch capability and vector length Luca Fancellu
@ 2023-02-02 12:05   ` Jan Beulich
  2023-02-10 15:54     ` Luca Fancellu
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2023-02-02 12:05 UTC (permalink / raw)
  To: Luca Fancellu
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	xen-devel

On 02.02.2023 12:08, Luca Fancellu wrote:
> When the arm platform supports SVE, advertise the feature by a new
> flag for the arch_capabilities in struct xen_sysctl_physinfo and add
> a new field "arm_sve_vl_bits" where on arm there can be stored the
> maximum SVE vector length in bits.
> 
> Update the padding.
> 
> Bump XEN_SYSCTL_INTERFACE_VERSION for the changes.
> 
> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
> ---
> Changes from RFC:
>  - new patch
> ---
> Here I need an opinion from arm and x86 maintainer, I see there is no arch
> specific structure in struct xen_sysctl_physinfo, hw_cap is used only by x86
> and now arch_capabilities and the new arm_sve_vl_bits will be used only by arm.
> So how should we proceed here? Shall we create a struct arch for each
> architecture and for example move arch_capabilities inside it (renaming to
> capabilities) and every arch specific field as well?

Counter question: Why don't you use (part of) arch_capabilities (and not
just a single bit)? It looks to be entirely unused at present. Vector
length being zero would sufficiently indicate absence of the feature
without a separate boolean.


> (is hw_cap only for x86?)

I suppose it is, but I also expect it would better go away than be moved.
It doesn't hold a complete set of information, and it has been superseded.

Question is (and I think I did raise this before, perhaps in different
context) whether Arm wouldn't want to follow x86 in how hardware as well
as hypervisor derived / used ones are exposed to the tool stack
(XEN_SYSCTL_get_cpu_featureset). I guess there's nothing really precluding
that data to consist of more than just boolean fields.

> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -18,7 +18,7 @@
>  #include "domctl.h"
>  #include "physdev.h"
>  
> -#define XEN_SYSCTL_INTERFACE_VERSION 0x00000015
> +#define XEN_SYSCTL_INTERFACE_VERSION 0x00000016

Why? You ...

> @@ -104,7 +110,8 @@ struct xen_sysctl_physinfo {
>      uint32_t cpu_khz;
>      uint32_t capabilities;/* XEN_SYSCTL_PHYSCAP_??? */
>      uint32_t arch_capabilities;/* XEN_SYSCTL_PHYSCAP_{X86,ARM,...}_??? */
> -    uint32_t pad;
> +    uint16_t arm_sve_vl_bits;
> +    uint16_t pad;
>      uint64_aligned_t total_pages;
>      uint64_aligned_t free_pages;
>      uint64_aligned_t scrub_pages;

... add no new fields, and the only producer of the data zero-fills the
struct first thing.

Jan


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 08/10] tools: add SVE capability and SVE vector length for physinfo
  2023-02-02 11:08 ` [PATCH 08/10] tools: add SVE capability and SVE vector length for physinfo Luca Fancellu
@ 2023-02-02 12:38   ` Marek Marczykowski-Górecki
  2023-02-02 14:12   ` Andrew Cooper
  1 sibling, 0 replies; 18+ messages in thread
From: Marek Marczykowski-Górecki @ 2023-02-02 12:38 UTC (permalink / raw)
  To: Luca Fancellu
  Cc: xen-devel, bertrand.marquis, wei.chen, George Dunlap,
	Nick Rosbrook, Wei Liu, Anthony PERARD, Juergen Gross,
	Christian Lindig, David Scott

[-- Attachment #1: Type: text/plain, Size: 3999 bytes --]

On Thu, Feb 02, 2023 at 11:08:14AM +0000, Luca Fancellu wrote:
> Recent changes added in struct xen_sysctl_physinfo a new flag for
> arch_capabilities to understand when the platform is SVE capable,
> another field having the maximum SVE vector length in bits was
> added as well, so update the tools to handle these new fields.
> 
> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
> ---
> Changes from RFC:
>  - new patch
> ---
> This patch is mostly dependent on the decisions we make in the previous, anyway
> I touched some part of the toolstack where my knowledge is limited (ocaml) so
> I might need a feedback for something I may have done wrong.
> ---
>  tools/golang/xenlight/helpers.gen.go |  4 ++++
>  tools/golang/xenlight/types.gen.go   |  2 ++
>  tools/include/libxl.h                |  8 ++++++++
>  tools/libs/light/libxl.c             |  3 +++
>  tools/libs/light/libxl_types.idl     |  2 ++
>  tools/ocaml/libs/xc/xenctrl.ml       |  4 +++-
>  tools/ocaml/libs/xc/xenctrl.mli      |  4 +++-
>  tools/ocaml/libs/xc/xenctrl_stubs.c  | 17 +++++++++++++----
>  tools/python/xen/lowlevel/xc/xc.c    | 17 +++++++++++++++--
>  tools/xl/xl_info.c                   | 10 ++++++++++
>  10 files changed, 63 insertions(+), 8 deletions(-)

For the python part:

> diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
> index fd008610329b..516fa57161a6 100644
> --- a/tools/python/xen/lowlevel/xc/xc.c
> +++ b/tools/python/xen/lowlevel/xc/xc.c
> @@ -870,6 +870,11 @@ static PyObject *pyxc_physinfo(XcObject *self)
>      const char *virtcap_names[] = { "hvm", "pv" };
>      const unsigned virtcaps_bits[] = { XEN_SYSCTL_PHYSCAP_hvm,
>                                         XEN_SYSCTL_PHYSCAP_pv };
> +#if defined(__aarch64__)
> +    const char py_buildval[] = "{s:i,s:i,s:i,s:i,s:l,s:l,s:l,s:i,s:s,s:s,s:i}";
> +#else
> +    const char py_buildval[] = "{s:i,s:i,s:i,s:i,s:l,s:l,s:l,s:i,s:s,s:s}";
> +#endif

I don't like this #if for a different return format depending on the
architecture, especially when the underlying structure is in fact the
same (just not all fields are used for every arch). It would make adding
further fields unnecessarily error-prone.

Instead, you can do common stuff with Py_BuildValue and then add
architecture-specific stuff with PyDict_SetItemString:
https://docs.python.org/3/c-api/dict.html

>      if ( xc_physinfo(self->xc_handle, &pinfo) != 0 )
>          return pyxc_error_to_exception(self->xc_handle);
> @@ -893,10 +898,14 @@ static PyObject *pyxc_physinfo(XcObject *self)
>          for ( i = 0; i < ARRAY_SIZE(virtcaps_bits); i++ )
>              if ( pinfo.capabilities & virtcaps_bits[i] )
>                p += sprintf(p, "%s_directio ", virtcap_names[i]);
> +#if defined(__aarch64__)
> +    if ( pinfo.arch_capabilities & XEN_SYSCTL_PHYSCAP_ARM_sve )
> +        p += sprintf(p, "%s ", "arm_sve");
> +#endif
>      if ( p != virt_caps )
>        *(p-1) = '\0';
>  
> -    return Py_BuildValue("{s:i,s:i,s:i,s:i,s:l,s:l,s:l,s:i,s:s,s:s}",
> +    return Py_BuildValue(py_buildval,
>                              "nr_nodes",         pinfo.nr_nodes,
>                              "threads_per_core", pinfo.threads_per_core,
>                              "cores_per_socket", pinfo.cores_per_socket,
> @@ -906,7 +915,11 @@ static PyObject *pyxc_physinfo(XcObject *self)
>                              "scrub_memory",     pages_to_kib(pinfo.scrub_pages),
>                              "cpu_khz",          pinfo.cpu_khz,
>                              "hw_caps",          cpu_cap,
> -                            "virt_caps",        virt_caps);
> +                            "virt_caps",        virt_caps
> +#if defined(__aarch64__)
> +                            , "arm_sve_vl_bits",  pinfo.arm_sve_vl_bits
> +#endif
> +                            );
>  }
>  

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 08/10] tools: add SVE capability and SVE vector length for physinfo
  2023-02-02 11:08 ` [PATCH 08/10] tools: add SVE capability and SVE vector length for physinfo Luca Fancellu
  2023-02-02 12:38   ` Marek Marczykowski-Górecki
@ 2023-02-02 14:12   ` Andrew Cooper
  1 sibling, 0 replies; 18+ messages in thread
From: Andrew Cooper @ 2023-02-02 14:12 UTC (permalink / raw)
  To: Luca Fancellu, xen-devel
  Cc: bertrand.marquis, wei.chen, George Dunlap, Nick Rosbrook,
	Wei Liu, Anthony Perard, Juergen Gross, Christian Lindig,
	David Scott, Marek Marczykowski-Górecki

On 02/02/2023 11:08 am, Luca Fancellu wrote:
> Recent changes added in struct xen_sysctl_physinfo a new flag for
> arch_capabilities to understand when the platform is SVE capable,
> another field having the maximum SVE vector length in bits was
> added as well, so update the tools to handle these new fields.
>
> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
> ---
> Changes from RFC:
>  - new patch
> ---
> This patch is mostly dependent on the decisions we make in the previous, anyway
> I touched some part of the toolstack where my knowledge is limited (ocaml) so
> I might need a feedback for something I may have done wrong.

There's a lot of ifdefary here which is going to make the code
impossible to maintain, especially in python and xl where you're playing
games with the position of the comma.

But taking a step back.  First, split this series into two (even if it's
first half / second half of a single series that you email out).  How to
report the presence of a capability is different from how to configure
the capability for a guest, and should be done separately.

But even still.  SVE is a single bit feature.  physinfo is the wrong
place to report it, and I dont see any need for it to be in the domain
construction hypercall.  Both are critically space-limited structures.

ARM needs to gain something similar to x86's cpu policy architecture,
both in terms of sysctls reporting system capabilities, and domctls to
get/set domain configuration.

Looking at your cover letter, SVE (in terms of what it means for Xen)
looks quite similar to x86's AMX and Arch-LBR insofar as it's a large
amount of new register state that we don't want to allocate for guests
not making use of it.

On x86, what we're planning to do (It's not done yet, because I keep on
getting preempted with security work) is to require the toolstack to
make a set_cpu_policy hypercall between createdomain and maxcpus to
configure the size of XSAVE state, after which vcpu_create() in the
hypervisor has the info it needs to construct backing state correctly.


I know I'm telling you to do a lot of work, but bodging SVE in like this
isn't ok.  Someone *is* going to have to do proper feature configuration
on ARM, and I guarantee you that it will be less effort to do it
properly now, than to retrofit it in the future.  x86 has been a
nightmare to untangle 15 years of bad interface design in a way that
doesn't regress behaviour.

~Andrew

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 07/10] xen/physinfo: add arm SVE arch capability and vector length
  2023-02-02 12:05   ` Jan Beulich
@ 2023-02-10 15:54     ` Luca Fancellu
  2023-02-13  8:36       ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Luca Fancellu @ 2023-02-10 15:54 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Bertrand Marquis, Wei Chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	xen-devel



> On 2 Feb 2023, at 12:05, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 02.02.2023 12:08, Luca Fancellu wrote:
>> When the arm platform supports SVE, advertise the feature by a new
>> flag for the arch_capabilities in struct xen_sysctl_physinfo and add
>> a new field "arm_sve_vl_bits" where on arm there can be stored the
>> maximum SVE vector length in bits.
>> 
>> Update the padding.
>> 
>> Bump XEN_SYSCTL_INTERFACE_VERSION for the changes.
>> 
>> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
>> ---
>> Changes from RFC:
>> - new patch
>> ---

Hi Jan,

Thanks for your review,

>> Here I need an opinion from arm and x86 maintainer, I see there is no arch
>> specific structure in struct xen_sysctl_physinfo, hw_cap is used only by x86
>> and now arch_capabilities and the new arm_sve_vl_bits will be used only by arm.
>> So how should we proceed here? Shall we create a struct arch for each
>> architecture and for example move arch_capabilities inside it (renaming to
>> capabilities) and every arch specific field as well?
> 
> Counter question: Why don't you use (part of) arch_capabilities (and not
> just a single bit)? It looks to be entirely unused at present. Vector
> length being zero would sufficiently indicate absence of the feature
> without a separate boolean.

Yes I could have used just the value to determine if the platform is SVE capable
or not, but since this field was there (even if with no user) I was unsure about
renaming it and use it for this purpose.
In the end I did what was more logic to me at the moment, and I reserved a flag
in it for SVE.

> 
> 
>> (is hw_cap only for x86?)
> 
> I suppose it is, but I also expect it would better go away than be moved.
> It doesn't hold a complete set of information, and it has been superseded.
> 
> Question is (and I think I did raise this before, perhaps in different
> context) whether Arm wouldn't want to follow x86 in how hardware as well
> as hypervisor derived / used ones are exposed to the tool stack
> (XEN_SYSCTL_get_cpu_featureset). I guess there's nothing really precluding
> that data to consist of more than just boolean fields.

Yes I guess that infrastructure could work, however I don’t have the bandwidth to
put it in place at the moment, so I would like the Arm maintainers to give me a
suggestion on how I can expose the vector length to XL if putting its value here
needs to be avoided

> 
>> --- a/xen/include/public/sysctl.h
>> +++ b/xen/include/public/sysctl.h
>> @@ -18,7 +18,7 @@
>> #include "domctl.h"
>> #include "physdev.h"
>> 
>> -#define XEN_SYSCTL_INTERFACE_VERSION 0x00000015
>> +#define XEN_SYSCTL_INTERFACE_VERSION 0x00000016
> 
> Why? You ...
> 
>> @@ -104,7 +110,8 @@ struct xen_sysctl_physinfo {
>>     uint32_t cpu_khz;
>>     uint32_t capabilities;/* XEN_SYSCTL_PHYSCAP_??? */
>>     uint32_t arch_capabilities;/* XEN_SYSCTL_PHYSCAP_{X86,ARM,...}_??? */
>> -    uint32_t pad;
>> +    uint16_t arm_sve_vl_bits;
>> +    uint16_t pad;
>>     uint64_aligned_t total_pages;
>>     uint64_aligned_t free_pages;
>>     uint64_aligned_t scrub_pages;
> 
> ... add no new fields, and the only producer of the data zero-fills the
> struct first thing.

Yes that is right, I will wait to understand how I can proceed here:

1) rename arch_capabilities to arm_sve_vl_bits and put vector length there.
2) leave arch_capabilities untouched, no flag creation/setting, create uint32_t arm_sve_vl_bits field removing “pad”,
    Use its value to determine if SVE is present or not.
3) ??

Thank you

Cheers,
Luca

> 
> Jan



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 07/10] xen/physinfo: add arm SVE arch capability and vector length
  2023-02-10 15:54     ` Luca Fancellu
@ 2023-02-13  8:36       ` Jan Beulich
  2023-02-23 14:00         ` Bertrand Marquis
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2023-02-13  8:36 UTC (permalink / raw)
  To: Luca Fancellu
  Cc: Bertrand Marquis, Wei Chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	xen-devel

On 10.02.2023 16:54, Luca Fancellu wrote:
>> On 2 Feb 2023, at 12:05, Jan Beulich <jbeulich@suse.com> wrote:
>> On 02.02.2023 12:08, Luca Fancellu wrote:
>>> (is hw_cap only for x86?)
>>
>> I suppose it is, but I also expect it would better go away than be moved.
>> It doesn't hold a complete set of information, and it has been superseded.
>>
>> Question is (and I think I did raise this before, perhaps in different
>> context) whether Arm wouldn't want to follow x86 in how hardware as well
>> as hypervisor derived / used ones are exposed to the tool stack
>> (XEN_SYSCTL_get_cpu_featureset). I guess there's nothing really precluding
>> that data to consist of more than just boolean fields.
> 
> Yes I guess that infrastructure could work, however I don’t have the bandwidth to
> put it in place at the moment, so I would like the Arm maintainers to give me a
> suggestion on how I can expose the vector length to XL if putting its value here
> needs to be avoided

Since you've got a reply from Andrew boiling down to the same suggestion
(or should I even say request), I guess it wants seriously considering
to introduce abstract base infrastructure first. As Andrew says, time not
invested now will very likely mean yet more time to be invested later.

>>> --- a/xen/include/public/sysctl.h
>>> +++ b/xen/include/public/sysctl.h
>>> @@ -18,7 +18,7 @@
>>> #include "domctl.h"
>>> #include "physdev.h"
>>>
>>> -#define XEN_SYSCTL_INTERFACE_VERSION 0x00000015
>>> +#define XEN_SYSCTL_INTERFACE_VERSION 0x00000016
>>
>> Why? You ...
>>
>>> @@ -104,7 +110,8 @@ struct xen_sysctl_physinfo {
>>>     uint32_t cpu_khz;
>>>     uint32_t capabilities;/* XEN_SYSCTL_PHYSCAP_??? */
>>>     uint32_t arch_capabilities;/* XEN_SYSCTL_PHYSCAP_{X86,ARM,...}_??? */
>>> -    uint32_t pad;
>>> +    uint16_t arm_sve_vl_bits;
>>> +    uint16_t pad;
>>>     uint64_aligned_t total_pages;
>>>     uint64_aligned_t free_pages;
>>>     uint64_aligned_t scrub_pages;
>>
>> ... add no new fields, and the only producer of the data zero-fills the
>> struct first thing.
> 
> Yes that is right, I will wait to understand how I can proceed here:
> 
> 1) rename arch_capabilities to arm_sve_vl_bits and put vector length there.
> 2) leave arch_capabilities untouched, no flag creation/setting, create uint32_t arm_sve_vl_bits field removing “pad”,
>     Use its value to determine if SVE is present or not.
> 3) ??

3) Introduce suitable #define(s) to use part of arch_capabilities for your
purpose, without renaming the field. (But that's of course only if Arm
maintainers agree with you on _not_ going the proper feature handling route
right away.)

Jan


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 07/10] xen/physinfo: add arm SVE arch capability and vector length
  2023-02-13  8:36       ` Jan Beulich
@ 2023-02-23 14:00         ` Bertrand Marquis
  2023-03-02  2:57           ` Stefano Stabellini
  0 siblings, 1 reply; 18+ messages in thread
From: Bertrand Marquis @ 2023-02-23 14:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Luca Fancellu, Wei Chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	xen-devel

Hi Jan,

> On 13 Feb 2023, at 09:36, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 10.02.2023 16:54, Luca Fancellu wrote:
>>> On 2 Feb 2023, at 12:05, Jan Beulich <jbeulich@suse.com> wrote:
>>> On 02.02.2023 12:08, Luca Fancellu wrote:
>>>> (is hw_cap only for x86?)
>>> 
>>> I suppose it is, but I also expect it would better go away than be moved.
>>> It doesn't hold a complete set of information, and it has been superseded.
>>> 
>>> Question is (and I think I did raise this before, perhaps in different
>>> context) whether Arm wouldn't want to follow x86 in how hardware as well
>>> as hypervisor derived / used ones are exposed to the tool stack
>>> (XEN_SYSCTL_get_cpu_featureset). I guess there's nothing really precluding
>>> that data to consist of more than just boolean fields.
>> 
>> Yes I guess that infrastructure could work, however I don’t have the bandwidth to
>> put it in place at the moment, so I would like the Arm maintainers to give me a
>> suggestion on how I can expose the vector length to XL if putting its value here
>> needs to be avoided
> 
> Since you've got a reply from Andrew boiling down to the same suggestion
> (or should I even say request), I guess it wants seriously considering
> to introduce abstract base infrastructure first. As Andrew says, time not
> invested now will very likely mean yet more time to be invested later.
> 
>>>> --- a/xen/include/public/sysctl.h
>>>> +++ b/xen/include/public/sysctl.h
>>>> @@ -18,7 +18,7 @@
>>>> #include "domctl.h"
>>>> #include "physdev.h"
>>>> 
>>>> -#define XEN_SYSCTL_INTERFACE_VERSION 0x00000015
>>>> +#define XEN_SYSCTL_INTERFACE_VERSION 0x00000016
>>> 
>>> Why? You ...
>>> 
>>>> @@ -104,7 +110,8 @@ struct xen_sysctl_physinfo {
>>>>    uint32_t cpu_khz;
>>>>    uint32_t capabilities;/* XEN_SYSCTL_PHYSCAP_??? */
>>>>    uint32_t arch_capabilities;/* XEN_SYSCTL_PHYSCAP_{X86,ARM,...}_??? */
>>>> -    uint32_t pad;
>>>> +    uint16_t arm_sve_vl_bits;
>>>> +    uint16_t pad;
>>>>    uint64_aligned_t total_pages;
>>>>    uint64_aligned_t free_pages;
>>>>    uint64_aligned_t scrub_pages;
>>> 
>>> ... add no new fields, and the only producer of the data zero-fills the
>>> struct first thing.
>> 
>> Yes that is right, I will wait to understand how I can proceed here:
>> 
>> 1) rename arch_capabilities to arm_sve_vl_bits and put vector length there.
>> 2) leave arch_capabilities untouched, no flag creation/setting, create uint32_t arm_sve_vl_bits field removing “pad”,
>>    Use its value to determine if SVE is present or not.
>> 3) ??
> 
> 3) Introduce suitable #define(s) to use part of arch_capabilities for your
> purpose, without renaming the field. (But that's of course only if Arm
> maintainers agree with you on _not_ going the proper feature handling route
> right away.)

As Luca said, he does not have the required bandwidth to do this so I think it is ok for him to go with your solution 3.

@Julien/Stefano: any thoughts here ?

Bertrand

> 
> Jan


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 07/10] xen/physinfo: add arm SVE arch capability and vector length
  2023-02-23 14:00         ` Bertrand Marquis
@ 2023-03-02  2:57           ` Stefano Stabellini
  0 siblings, 0 replies; 18+ messages in thread
From: Stefano Stabellini @ 2023-03-02  2:57 UTC (permalink / raw)
  To: Bertrand Marquis
  Cc: Jan Beulich, Luca Fancellu, Wei Chen, Stefano Stabellini,
	Julien Grall, Volodymyr Babchuk, Andrew Cooper, George Dunlap,
	Wei Liu, xen-devel

[-- Attachment #1: Type: text/plain, Size: 3146 bytes --]

On Thu, 23 Feb 2023, Bertrand Marquis wrote:
> Hi Jan,
> 
> > On 13 Feb 2023, at 09:36, Jan Beulich <jbeulich@suse.com> wrote:
> > 
> > On 10.02.2023 16:54, Luca Fancellu wrote:
> >>> On 2 Feb 2023, at 12:05, Jan Beulich <jbeulich@suse.com> wrote:
> >>> On 02.02.2023 12:08, Luca Fancellu wrote:
> >>>> (is hw_cap only for x86?)
> >>> 
> >>> I suppose it is, but I also expect it would better go away than be moved.
> >>> It doesn't hold a complete set of information, and it has been superseded.
> >>> 
> >>> Question is (and I think I did raise this before, perhaps in different
> >>> context) whether Arm wouldn't want to follow x86 in how hardware as well
> >>> as hypervisor derived / used ones are exposed to the tool stack
> >>> (XEN_SYSCTL_get_cpu_featureset). I guess there's nothing really precluding
> >>> that data to consist of more than just boolean fields.
> >> 
> >> Yes I guess that infrastructure could work, however I don’t have the bandwidth to
> >> put it in place at the moment, so I would like the Arm maintainers to give me a
> >> suggestion on how I can expose the vector length to XL if putting its value here
> >> needs to be avoided
> > 
> > Since you've got a reply from Andrew boiling down to the same suggestion
> > (or should I even say request), I guess it wants seriously considering
> > to introduce abstract base infrastructure first. As Andrew says, time not
> > invested now will very likely mean yet more time to be invested later.
> > 
> >>>> --- a/xen/include/public/sysctl.h
> >>>> +++ b/xen/include/public/sysctl.h
> >>>> @@ -18,7 +18,7 @@
> >>>> #include "domctl.h"
> >>>> #include "physdev.h"
> >>>> 
> >>>> -#define XEN_SYSCTL_INTERFACE_VERSION 0x00000015
> >>>> +#define XEN_SYSCTL_INTERFACE_VERSION 0x00000016
> >>> 
> >>> Why? You ...
> >>> 
> >>>> @@ -104,7 +110,8 @@ struct xen_sysctl_physinfo {
> >>>>    uint32_t cpu_khz;
> >>>>    uint32_t capabilities;/* XEN_SYSCTL_PHYSCAP_??? */
> >>>>    uint32_t arch_capabilities;/* XEN_SYSCTL_PHYSCAP_{X86,ARM,...}_??? */
> >>>> -    uint32_t pad;
> >>>> +    uint16_t arm_sve_vl_bits;
> >>>> +    uint16_t pad;
> >>>>    uint64_aligned_t total_pages;
> >>>>    uint64_aligned_t free_pages;
> >>>>    uint64_aligned_t scrub_pages;
> >>> 
> >>> ... add no new fields, and the only producer of the data zero-fills the
> >>> struct first thing.
> >> 
> >> Yes that is right, I will wait to understand how I can proceed here:
> >> 
> >> 1) rename arch_capabilities to arm_sve_vl_bits and put vector length there.
> >> 2) leave arch_capabilities untouched, no flag creation/setting, create uint32_t arm_sve_vl_bits field removing “pad”,
> >>    Use its value to determine if SVE is present or not.
> >> 3) ??
> > 
> > 3) Introduce suitable #define(s) to use part of arch_capabilities for your
> > purpose, without renaming the field. (But that's of course only if Arm
> > maintainers agree with you on _not_ going the proper feature handling route
> > right away.)
> 
> As Luca said, he does not have the required bandwidth to do this so I think it is ok for him to go with your solution 3.
> 
> @Julien/Stefano: any thoughts here ?

I am OK with that

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2023-03-02  2:58 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-02 11:08 [PATCH 00/10] SVE feature for arm guests Luca Fancellu
2023-02-02 11:08 ` [PATCH 01/10] xen/arm: enable SVE extension for Xen Luca Fancellu
2023-02-02 11:08 ` [PATCH 02/10] xen/arm: add sve_vl_bits field to domain Luca Fancellu
2023-02-02 11:08 ` [PATCH 03/10] xen/arm: Expose SVE feature to the guest Luca Fancellu
2023-02-02 11:08 ` [PATCH 04/10] xen/arm: add SVE exception class handling Luca Fancellu
2023-02-02 11:08 ` [PATCH 05/10] arm/sve: save/restore SVE context switch Luca Fancellu
2023-02-02 11:08 ` [PATCH 06/10] xen/arm: enable Dom0 to use SVE feature Luca Fancellu
2023-02-02 11:08 ` [PATCH 07/10] xen/physinfo: add arm SVE arch capability and vector length Luca Fancellu
2023-02-02 12:05   ` Jan Beulich
2023-02-10 15:54     ` Luca Fancellu
2023-02-13  8:36       ` Jan Beulich
2023-02-23 14:00         ` Bertrand Marquis
2023-03-02  2:57           ` Stefano Stabellini
2023-02-02 11:08 ` [PATCH 08/10] tools: add SVE capability and SVE vector length for physinfo Luca Fancellu
2023-02-02 12:38   ` Marek Marczykowski-Górecki
2023-02-02 14:12   ` Andrew Cooper
2023-02-02 11:08 ` [PATCH 09/10] xen/tools: add sve parameter in XL configuration Luca Fancellu
2023-02-02 11:08 ` [PATCH 10/10] xen/arm: add sve property for dom0less domUs Luca Fancellu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.