* [PATCH v6 00/12] SVE feature for arm guests
@ 2023-04-24  6:02 Luca Fancellu
  2023-04-24  6:02 ` [PATCH v6 01/12] xen/arm: enable SVE extension for Xen Luca Fancellu
                   ` (11 more replies)
  0 siblings, 12 replies; 56+ messages in thread
From: Luca Fancellu @ 2023-04-24  6:02 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Jan Beulich,
	Wei Liu, Roger Pau Monné,
	Nick Rosbrook, Anthony PERARD, Juergen Gross, Christian Lindig,
	David Scott, Marek Marczykowski-Górecki, Henry Wang,
	Community Manager

This series introduces the possibility for Dom0 and DomU guests to use
SVE/SVE2 instructions.

The SVE feature introduces new instructions and registers to improve the
performance of floating point operations.

The SVE feature is advertised via the SVE field of the ID_AA64PFR0_EL1
register; when available, the ID_AA64ZFR0_EL1 register provides additional
information about the implemented version and other SVE features.
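As an illustrative sketch of how that advertisement can be checked (the field
placement is per the Arm architecture; the helper name below is made up, not
code from this series), the SVE field sits in bits [35:32] of ID_AA64PFR0_EL1:

```c
#include <assert.h>
#include <stdint.h>

/*
 * ID_AA64PFR0_EL1.SVE occupies bits [35:32]; a non-zero value means SVE
 * is implemented. Helper name is illustrative only.
 */
#define ID_AA64PFR0_SVE_SHIFT 32
#define ID_AA64PFR0_SVE_MASK  0xfULL

static inline unsigned int pfr0_sve_field(uint64_t pfr0)
{
    return (unsigned int)((pfr0 >> ID_AA64PFR0_SVE_SHIFT) &
                          ID_AA64PFR0_SVE_MASK);
}
```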

New registers added by the SVE feature are Z0-Z31, P0-P15, FFR, ZCR_ELx.

Z0-Z31 are scalable vector registers whose size is implementation defined,
ranging from 128 bits up to a maximum of 2048; the term vector length (VL)
will be used to refer to this quantity.
P0-P15 are predicate registers whose size is the vector length divided by 8;
the FFR (First Fault Register) has the same size.
ZCR_ELx is a register that can control and restrict the maximum vector length
used by exception level <x> and all lower exception levels, so for example
EL3 can restrict the vector length usable by EL3, EL2, EL1 and EL0.
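The size relationships above reduce to simple arithmetic; a minimal sketch
(byte sizes for a given vector length, illustrative helpers only):

```c
#include <assert.h>

/* Z registers hold VL bits; P registers and the FFR hold VL/8 bits each. */
static unsigned int z_reg_bytes(unsigned int vl_bits)
{
    return vl_bits / 8U;
}

static unsigned int p_reg_bytes(unsigned int vl_bits)
{
    return (vl_bits / 8U) / 8U;   /* VL/8 bits, converted to bytes */
}
```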

The platform has a maximum implemented vector length, so for every value
written to the ZCR register, if that value is above the implemented length,
the lower value will be used instead. The RDVL instruction can be used to
check which vector length the HW is actually using after setting ZCR.
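In ZCR_ELx the LEN field encodes (VL/128) - 1, and a request above the
implemented maximum is silently capped; a sketch of both rules (the first
helper mirrors the series' vl_to_zcr(), the clamping helper is illustrative):

```c
#include <assert.h>

#define SVE_VL_MULTIPLE_VAL 128U
#define ZCR_ELx_LEN_MASK    0xfU

/* Takes a vector length in bits and returns the ZCR_ELx LEN encoding */
static unsigned int vl_to_zcr_len(unsigned int vl)
{
    return ((vl / SVE_VL_MULTIPLE_VAL) - 1U) & ZCR_ELx_LEN_MASK;
}

/* A request above the implemented maximum results in the lower value */
static unsigned int effective_vl(unsigned int requested, unsigned int hw_max)
{
    return requested < hw_max ? requested : hw_max;
}
```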

For an SVE guest, the V0-V31 registers are the low 128 bits of Z0-Z31, so
there is no need to save them separately: saving Z0-Z31 implicitly saves
V0-V31 as well.

SVE usage can be trapped using a flag in CPTR_EL2, hence in this series the
register is added to the domain state, making it possible to trap only the
guests that are not allowed to use SVE.

This series introduces a command line parameter to enable Dom0 to use SVE
and to set its maximum vector length, which by default is 0, meaning Dom0 is
not allowed to use SVE. Values from 128 to 2048 mean the guest can use SVE,
with the selected value as the maximum allowed vector length (the effective
length may be lower if the implemented one is lower).
For DomUs, an XL parameter used in the same way is introduced, and a
dom0less DTB binding is created.
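Assuming the option spellings defined later in the series (the exact key
names and accepted values are specified in the respective patches; treat the
lines below as an illustrative guess, not authoritative syntax), usage might
look like:

```
# Xen command line, enabling SVE for Dom0 with a 256-bit maximum VL:
dom0=sve=256

# xl guest configuration, same meaning for a DomU:
sve = 256
```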

The context switch is the most critical part because there can be large
registers to save; in this series a simple approach is used, and the context
is saved/restored every time for the guests that are allowed to use SVE.
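The cost asymmetry can be sketched with the register sizes given earlier
(illustrative arithmetic, not code from this series): a non-SVE guest saves
only the 32 V registers, while an SVE guest saves the full Z/P/FFR state on
every switch.

```c
#include <assert.h>
#include <stdbool.h>

/* Bytes of FP/vector state saved per context switch (illustrative). */
static unsigned int ctx_bytes(bool sve_enabled, unsigned int vl_bits)
{
    if ( !sve_enabled )
        return 32U * 16U;                   /* V0-V31, 128 bits each */

    /* Z0-Z31 (VL bits each) + P0-P15 and FFR (VL/8 bits each) */
    return 32U * (vl_bits / 8U) + 17U * (vl_bits / 64U);
}
```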

Luca Fancellu (12):
  xen/arm: enable SVE extension for Xen
  xen/arm: add SVE vector length field to the domain
  xen/arm: Expose SVE feature to the guest
  xen/arm: add SVE exception class handling
  arm/sve: save/restore SVE context switch
  xen/common: add dom0 xen command line argument for Arm
  xen: enable Dom0 to use SVE feature
  xen/physinfo: encode Arm SVE vector length in arch_capabilities
  tools: add physinfo arch_capabilities handling for Arm
  xen/tools: add sve parameter in XL configuration
  xen/arm: add sve property for dom0less domUs
  xen/changelog: Add SVE and "dom0" options to the changelog for Arm

 CHANGELOG.md                                  |   3 +
 SUPPORT.md                                    |   6 +
 docs/man/xl.cfg.5.pod.in                      |  16 ++
 docs/misc/arm/device-tree/booting.txt         |  16 ++
 docs/misc/xen-command-line.pandoc             |  20 +-
 tools/golang/xenlight/helpers.gen.go          |   4 +
 tools/golang/xenlight/types.gen.go            |  24 +++
 tools/include/libxl.h                         |  11 +
 .../include/xen-tools/arm-arch-capabilities.h |  28 +++
 tools/include/xen-tools/common-macros.h       |   2 +
 tools/libs/light/libxl.c                      |   1 +
 tools/libs/light/libxl_arm.c                  |  28 +++
 tools/libs/light/libxl_internal.h             |   1 -
 tools/libs/light/libxl_types.idl              |  23 +++
 tools/ocaml/libs/xc/xenctrl.ml                |   4 +-
 tools/ocaml/libs/xc/xenctrl.mli               |   4 +-
 tools/ocaml/libs/xc/xenctrl_stubs.c           |   8 +-
 tools/python/xen/lowlevel/xc/xc.c             |   8 +-
 tools/xl/xl_info.c                            |   8 +
 tools/xl/xl_parse.c                           |   8 +
 xen/arch/arm/Kconfig                          |  10 +-
 xen/arch/arm/arm64/Makefile                   |   1 +
 xen/arch/arm/arm64/cpufeature.c               |   7 +-
 xen/arch/arm/arm64/domctl.c                   |   4 +
 xen/arch/arm/arm64/sve-asm.S                  | 189 ++++++++++++++++++
 xen/arch/arm/arm64/sve.c                      | 145 ++++++++++++++
 xen/arch/arm/arm64/vfp.c                      |  79 ++++----
 xen/arch/arm/arm64/vsysreg.c                  |  40 +++-
 xen/arch/arm/cpufeature.c                     |   6 +-
 xen/arch/arm/domain.c                         |  47 ++++-
 xen/arch/arm/domain_build.c                   |  61 ++++++
 xen/arch/arm/include/asm/arm64/sve.h          |  86 ++++++++
 xen/arch/arm/include/asm/arm64/sysregs.h      |   4 +
 xen/arch/arm/include/asm/arm64/vfp.h          |  12 ++
 xen/arch/arm/include/asm/cpufeature.h         |  14 ++
 xen/arch/arm/include/asm/domain.h             |   8 +
 xen/arch/arm/include/asm/processor.h          |   3 +
 xen/arch/arm/setup.c                          |   5 +-
 xen/arch/arm/sysctl.c                         |   4 +
 xen/arch/arm/traps.c                          |  37 +++-
 xen/arch/x86/dom0_build.c                     |  48 ++---
 xen/common/domain.c                           |  23 +++
 xen/common/kernel.c                           |  25 +++
 xen/include/public/arch-arm.h                 |   2 +
 xen/include/public/domctl.h                   |   2 +-
 xen/include/public/sysctl.h                   |   4 +
 xen/include/xen/domain.h                      |   1 +
 xen/include/xen/lib.h                         |  10 +
 48 files changed, 995 insertions(+), 105 deletions(-)
 create mode 100644 tools/include/xen-tools/arm-arch-capabilities.h
 create mode 100644 xen/arch/arm/arm64/sve-asm.S
 create mode 100644 xen/arch/arm/arm64/sve.c
 create mode 100644 xen/arch/arm/include/asm/arm64/sve.h

-- 
2.34.1




* [PATCH v6 01/12] xen/arm: enable SVE extension for Xen
  2023-04-24  6:02 [PATCH v6 00/12] SVE feature for arm guests Luca Fancellu
@ 2023-04-24  6:02 ` Luca Fancellu
  2023-05-18  9:35   ` Julien Grall
  2023-04-24  6:02 ` [PATCH v6 02/12] xen/arm: add SVE vector length field to the domain Luca Fancellu
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-04-24  6:02 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk

Enable Xen to handle the SVE extension: add code in the cpufeature
module to handle the ZCR SVE register, and disable trapping of the
SVE feature on system boot only while SVE resources are being
accessed.
While there, correct the coding style of the comment on coprocessor
trapping.

cptr_el2 is now part of the domain context and will be restored on
context switch; this is a preparation for saving the SVE context,
which will be part of the VFP operations, so restore it before the
call that saves the VFP registers.
To save an additional isb barrier, restore cptr_el2 before an
existing isb barrier and move the call that saves the VFP context
to after that barrier.

Change the Kconfig entry to make the ARM64_SVE symbol selectable; it
will not be selected by default.

Create the sve module and sve-asm.S, which contains assembly routines
for the SVE feature; this code is inspired by Linux and uses explicit
instruction encodings to stay compatible with assemblers that do not
support SVE.
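The hand-encoding scheme used in sve-asm.S below can be reproduced in C to
see what the macro emits (the helper is illustrative; the opcode constant and
field layout come from the _sve_rdvl macro in this patch):

```c
#include <assert.h>
#include <stdint.h>

/* RDVL X<nx>, #<imm>: opcode base 0x04bf5000, Rd in [4:0], imm in [10:5] */
static uint32_t encode_rdvl(unsigned int nx, int imm)
{
    return 0x04bf5000U | nx | (((uint32_t)imm & 0x3f) << 5);
}
```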

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
---
Changes from v5:
 - Add R-by Bertrand
Changes from v4:
 - don't use fixed types in vl_to_zcr, forgot to address that in
   v3, by mistake I changed that in patch 2, fixing now (Jan)
Changes from v3:
 - no changes
Changes from v2:
 - renamed sve_asm.S to sve-asm.S, new files should not contain
   underscores in the name (Jan)
Changes from v1:
 - Add assert to vl_to_zcr, it is never called with vl==0, but just
   to be sure it won't in the future.
Changes from RFC:
 - Moved restoring of cptr before an existing barrier (Julien)
 - Marked the feature as unsupported for now (Julien)
 - Trap and un-trap only when using SVE resources in
   compute_max_zcr() (Julien)
---
 xen/arch/arm/Kconfig                     | 10 +++--
 xen/arch/arm/arm64/Makefile              |  1 +
 xen/arch/arm/arm64/cpufeature.c          |  7 ++--
 xen/arch/arm/arm64/sve-asm.S             | 48 +++++++++++++++++++++++
 xen/arch/arm/arm64/sve.c                 | 50 ++++++++++++++++++++++++
 xen/arch/arm/cpufeature.c                |  6 ++-
 xen/arch/arm/domain.c                    |  9 +++--
 xen/arch/arm/include/asm/arm64/sve.h     | 43 ++++++++++++++++++++
 xen/arch/arm/include/asm/arm64/sysregs.h |  1 +
 xen/arch/arm/include/asm/cpufeature.h    | 14 +++++++
 xen/arch/arm/include/asm/domain.h        |  1 +
 xen/arch/arm/include/asm/processor.h     |  2 +
 xen/arch/arm/setup.c                     |  5 ++-
 xen/arch/arm/traps.c                     | 28 +++++++------
 14 files changed, 201 insertions(+), 24 deletions(-)
 create mode 100644 xen/arch/arm/arm64/sve-asm.S
 create mode 100644 xen/arch/arm/arm64/sve.c
 create mode 100644 xen/arch/arm/include/asm/arm64/sve.h

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 239d3aed3c7f..41f45d8d1203 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -112,11 +112,15 @@ config ARM64_PTR_AUTH
 	  This feature is not supported in Xen.
 
 config ARM64_SVE
-	def_bool n
+	bool "Enable Scalable Vector Extension support (UNSUPPORTED)" if UNSUPPORTED
 	depends on ARM_64
 	help
-	  Scalar Vector Extension support.
-	  This feature is not supported in Xen.
+	  Scalable Vector Extension (SVE/SVE2) support for guests.
+
+	  Please be aware that currently, enabling this feature adds latency to
+	  VM context switches that involve SVE-enabled guests (whether between
+	  two SVE-enabled guests, or between an SVE-enabled guest and a non-SVE
+	  guest), compared to switching between non-SVE guests.
 
 config ARM64_MTE
 	def_bool n
diff --git a/xen/arch/arm/arm64/Makefile b/xen/arch/arm/arm64/Makefile
index 28481393e98f..54ad55c75cda 100644
--- a/xen/arch/arm/arm64/Makefile
+++ b/xen/arch/arm/arm64/Makefile
@@ -13,6 +13,7 @@ obj-$(CONFIG_LIVEPATCH) += livepatch.o
 obj-y += mm.o
 obj-y += smc.o
 obj-y += smpboot.o
+obj-$(CONFIG_ARM64_SVE) += sve.o sve-asm.o
 obj-y += traps.o
 obj-y += vfp.o
 obj-y += vsysreg.o
diff --git a/xen/arch/arm/arm64/cpufeature.c b/xen/arch/arm/arm64/cpufeature.c
index d9039d37b2d1..b4656ff4d80f 100644
--- a/xen/arch/arm/arm64/cpufeature.c
+++ b/xen/arch/arm/arm64/cpufeature.c
@@ -455,15 +455,11 @@ static const struct arm64_ftr_bits ftr_id_dfr1[] = {
 	ARM64_FTR_END,
 };
 
-#if 0
-/* TODO: use this to sanitize SVE once we support it */
-
 static const struct arm64_ftr_bits ftr_zcr[] = {
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE,
 		ZCR_ELx_LEN_SHIFT, ZCR_ELx_LEN_SIZE, 0),	/* LEN */
 	ARM64_FTR_END,
 };
-#endif
 
 /*
  * Common ftr bits for a 32bit register with all hidden, strict
@@ -603,6 +599,9 @@ void update_system_features(const struct cpuinfo_arm *new)
 
 	SANITIZE_ID_REG(zfr64, 0, aa64zfr0);
 
+	if ( cpu_has_sve )
+		SANITIZE_REG(zcr64, 0, zcr);
+
 	/*
 	 * Comment from Linux:
 	 * Userspace may perform DC ZVA instructions. Mismatched block sizes
diff --git a/xen/arch/arm/arm64/sve-asm.S b/xen/arch/arm/arm64/sve-asm.S
new file mode 100644
index 000000000000..4d1549344733
--- /dev/null
+++ b/xen/arch/arm/arm64/sve-asm.S
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Arm SVE assembly routines
+ *
+ * Copyright (C) 2022 ARM Ltd.
+ *
+ * Some macros and instruction encoding in this file are taken from linux 6.1.1,
+ * file arch/arm64/include/asm/fpsimdmacros.h, some of them are a modified
+ * version.
+ */
+
+/* Sanity-check macros to help avoid encoding garbage instructions */
+
+.macro _check_general_reg nr
+    .if (\nr) < 0 || (\nr) > 30
+        .error "Bad register number \nr."
+    .endif
+.endm
+
+.macro _check_num n, min, max
+    .if (\n) < (\min) || (\n) > (\max)
+        .error "Number \n out of range [\min,\max]"
+    .endif
+.endm
+
+/* SVE instruction encodings for non-SVE-capable assemblers */
+/* (pre binutils 2.28, all kernel capable clang versions support SVE) */
+
+/* RDVL X\nx, #\imm */
+.macro _sve_rdvl nx, imm
+    _check_general_reg \nx
+    _check_num (\imm), -0x20, 0x1f
+    .inst 0x04bf5000                \
+        | (\nx)                     \
+        | (((\imm) & 0x3f) << 5)
+.endm
+
+/* Gets the current vector register size in bytes */
+GLOBAL(sve_get_hw_vl)
+    _sve_rdvl 0, 1
+    ret
+
+/*
+ * Local variables:
+ * mode: ASM
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
new file mode 100644
index 000000000000..6f3fb368c59b
--- /dev/null
+++ b/xen/arch/arm/arm64/sve.c
@@ -0,0 +1,50 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Arm SVE feature code
+ *
+ * Copyright (C) 2022 ARM Ltd.
+ */
+
+#include <xen/types.h>
+#include <asm/arm64/sve.h>
+#include <asm/arm64/sysregs.h>
+#include <asm/processor.h>
+#include <asm/system.h>
+
+extern unsigned int sve_get_hw_vl(void);
+
+register_t compute_max_zcr(void)
+{
+    register_t cptr_bits = get_default_cptr_flags();
+    register_t zcr = vl_to_zcr(SVE_VL_MAX_BITS);
+    unsigned int hw_vl;
+
+    /* Remove trap for SVE resources */
+    WRITE_SYSREG(cptr_bits & ~HCPTR_CP(8), CPTR_EL2);
+    isb();
+
+    /*
+     * Set the maximum SVE vector length, doing that we will know the VL
+     * supported by the platform, calling sve_get_hw_vl()
+     */
+    WRITE_SYSREG(zcr, ZCR_EL2);
+
+    /*
+     * Read the maximum VL, which could be lower than what we imposed before,
+     * hw_vl contains VL in bytes, multiply it by 8 to use vl_to_zcr() later
+     */
+    hw_vl = sve_get_hw_vl() * 8U;
+
+    /* Restore CPTR_EL2 */
+    WRITE_SYSREG(cptr_bits, CPTR_EL2);
+    isb();
+
+    return vl_to_zcr(hw_vl);
+}
+
+/* Takes a vector length in bits and returns the ZCR_ELx encoding */
+register_t vl_to_zcr(unsigned int vl)
+{
+    ASSERT(vl > 0);
+    return ((vl / SVE_VL_MULTIPLE_VAL) - 1U) & ZCR_ELx_LEN_MASK;
+}
diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
index c4ec38bb2554..83b84368f6d5 100644
--- a/xen/arch/arm/cpufeature.c
+++ b/xen/arch/arm/cpufeature.c
@@ -9,6 +9,7 @@
 #include <xen/init.h>
 #include <xen/smp.h>
 #include <xen/stop_machine.h>
+#include <asm/arm64/sve.h>
 #include <asm/cpufeature.h>
 
 DECLARE_BITMAP(cpu_hwcaps, ARM_NCAPS);
@@ -143,6 +144,9 @@ void identify_cpu(struct cpuinfo_arm *c)
 
     c->zfr64.bits[0] = READ_SYSREG(ID_AA64ZFR0_EL1);
 
+    if ( cpu_has_sve )
+        c->zcr64.bits[0] = compute_max_zcr();
+
     c->dczid.bits[0] = READ_SYSREG(DCZID_EL0);
 
     c->ctr.bits[0] = READ_SYSREG(CTR_EL0);
@@ -199,7 +203,7 @@ static int __init create_guest_cpuinfo(void)
     guest_cpuinfo.pfr64.mpam = 0;
     guest_cpuinfo.pfr64.mpam_frac = 0;
 
-    /* Hide SVE as Xen does not support it */
+    /* Hide SVE by default to the guests */
     guest_cpuinfo.pfr64.sve = 0;
     guest_cpuinfo.zfr64.bits[0] = 0;
 
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index d8ef6501ff8e..0350d8c61ed8 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -181,9 +181,6 @@ static void ctxt_switch_to(struct vcpu *n)
     /* VGIC */
     gic_restore_state(n);
 
-    /* VFP */
-    vfp_restore_state(n);
-
     /* XXX MPU */
 
     /* Fault Status */
@@ -234,6 +231,7 @@ static void ctxt_switch_to(struct vcpu *n)
     p2m_restore_state(n);
 
     /* Control Registers */
+    WRITE_SYSREG(n->arch.cptr_el2, CPTR_EL2);
     WRITE_SYSREG(n->arch.cpacr, CPACR_EL1);
 
     /*
@@ -258,6 +256,9 @@ static void ctxt_switch_to(struct vcpu *n)
 #endif
     isb();
 
+    /* VFP */
+    vfp_restore_state(n);
+
     /* CP 15 */
     WRITE_SYSREG(n->arch.csselr, CSSELR_EL1);
 
@@ -548,6 +549,8 @@ int arch_vcpu_create(struct vcpu *v)
 
     v->arch.vmpidr = MPIDR_SMP | vcpuid_to_vaffinity(v->vcpu_id);
 
+    v->arch.cptr_el2 = get_default_cptr_flags();
+
     v->arch.hcr_el2 = get_default_hcr_flags();
 
     v->arch.mdcr_el2 = HDCR_TDRA | HDCR_TDOSA | HDCR_TDA;
diff --git a/xen/arch/arm/include/asm/arm64/sve.h b/xen/arch/arm/include/asm/arm64/sve.h
new file mode 100644
index 000000000000..144d2b1cc485
--- /dev/null
+++ b/xen/arch/arm/include/asm/arm64/sve.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Arm SVE feature code
+ *
+ * Copyright (C) 2022 ARM Ltd.
+ */
+
+#ifndef _ARM_ARM64_SVE_H
+#define _ARM_ARM64_SVE_H
+
+#define SVE_VL_MAX_BITS (2048U)
+
+/* Vector length must be multiple of 128 */
+#define SVE_VL_MULTIPLE_VAL (128U)
+
+#ifdef CONFIG_ARM64_SVE
+
+register_t compute_max_zcr(void);
+register_t vl_to_zcr(unsigned int vl);
+
+#else /* !CONFIG_ARM64_SVE */
+
+static inline register_t compute_max_zcr(void)
+{
+    return 0;
+}
+
+static inline register_t vl_to_zcr(unsigned int vl)
+{
+    return 0;
+}
+
+#endif /* CONFIG_ARM64_SVE */
+
+#endif /* _ARM_ARM64_SVE_H */
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/include/asm/arm64/sysregs.h b/xen/arch/arm/include/asm/arm64/sysregs.h
index 463899951414..4cabb9eb4d5e 100644
--- a/xen/arch/arm/include/asm/arm64/sysregs.h
+++ b/xen/arch/arm/include/asm/arm64/sysregs.h
@@ -24,6 +24,7 @@
 #define ICH_EISR_EL2              S3_4_C12_C11_3
 #define ICH_ELSR_EL2              S3_4_C12_C11_5
 #define ICH_VMCR_EL2              S3_4_C12_C11_7
+#define ZCR_EL2                   S3_4_C1_C2_0
 
 #define __LR0_EL2(x)              S3_4_C12_C12_ ## x
 #define __LR8_EL2(x)              S3_4_C12_C13_ ## x
diff --git a/xen/arch/arm/include/asm/cpufeature.h b/xen/arch/arm/include/asm/cpufeature.h
index c62cf6293fd6..6d703e051906 100644
--- a/xen/arch/arm/include/asm/cpufeature.h
+++ b/xen/arch/arm/include/asm/cpufeature.h
@@ -32,6 +32,12 @@
 #define cpu_has_thumbee   (boot_cpu_feature32(thumbee) == 1)
 #define cpu_has_aarch32   (cpu_has_arm || cpu_has_thumb)
 
+#ifdef CONFIG_ARM64_SVE
+#define cpu_has_sve       (boot_cpu_feature64(sve) == 1)
+#else
+#define cpu_has_sve       (0)
+#endif
+
 #ifdef CONFIG_ARM_32
 #define cpu_has_gicv3     (boot_cpu_feature32(gic) >= 1)
 #define cpu_has_gentimer  (boot_cpu_feature32(gentimer) == 1)
@@ -323,6 +329,14 @@ struct cpuinfo_arm {
         };
     } isa64;
 
+    union {
+        register_t bits[1];
+        struct {
+            unsigned long len:4;
+            unsigned long __res0:60;
+        };
+    } zcr64;
+
     struct {
         register_t bits[1];
     } zfr64;
diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
index 2a51f0ca688e..e776ee704b7d 100644
--- a/xen/arch/arm/include/asm/domain.h
+++ b/xen/arch/arm/include/asm/domain.h
@@ -190,6 +190,7 @@ struct arch_vcpu
     register_t tpidrro_el0;
 
     /* HYP configuration */
+    register_t cptr_el2;
     register_t hcr_el2;
     register_t mdcr_el2;
 
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index 54f253087718..bc683334125c 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -582,6 +582,8 @@ void do_trap_guest_serror(struct cpu_user_regs *regs);
 
 register_t get_default_hcr_flags(void);
 
+register_t get_default_cptr_flags(void);
+
 /*
  * Synchronize SError unless the feature is selected.
  * This is relying on the SErrors are currently unmasked.
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 6f9f4d8c8a15..4191a766767a 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -135,10 +135,11 @@ static void __init processor_id(void)
            cpu_has_el2_32 ? "64+32" : cpu_has_el2_64 ? "64" : "No",
            cpu_has_el1_32 ? "64+32" : cpu_has_el1_64 ? "64" : "No",
            cpu_has_el0_32 ? "64+32" : cpu_has_el0_64 ? "64" : "No");
-    printk("    Extensions:%s%s%s\n",
+    printk("    Extensions:%s%s%s%s\n",
            cpu_has_fp ? " FloatingPoint" : "",
            cpu_has_simd ? " AdvancedSIMD" : "",
-           cpu_has_gicv3 ? " GICv3-SysReg" : "");
+           cpu_has_gicv3 ? " GICv3-SysReg" : "",
+           cpu_has_sve ? " SVE" : "");
 
     /* Warn user if we find unknown floating-point features */
     if ( cpu_has_fp && (boot_cpu_feature64(fp) >= 2) )
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index d40c331a4e9c..c0611c2ef6a5 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -93,6 +93,21 @@ register_t get_default_hcr_flags(void)
              HCR_TID3|HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB|HCR_TSW);
 }
 
+register_t get_default_cptr_flags(void)
+{
+    /*
+     * Trap all coprocessor registers (0-13) except cp10 and
+     * cp11 for VFP.
+     *
+     * /!\ All coprocessors except cp10 and cp11 cannot be used in Xen.
+     *
+     * On ARM64 the TCPx bits which we set here (0..9,12,13) are all
+     * RES1, i.e. they would trap whether we did this write or not.
+     */
+    return  ((HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) |
+             HCPTR_TTA | HCPTR_TAM);
+}
+
 static enum {
     SERRORS_DIVERSE,
     SERRORS_PANIC,
@@ -122,6 +137,7 @@ __initcall(update_serrors_cpu_caps);
 
 void init_traps(void)
 {
+    register_t cptr_bits = get_default_cptr_flags();
     /*
      * Setup Hyp vector base. Note they might get updated with the
      * branch predictor hardening.
@@ -135,17 +151,7 @@ void init_traps(void)
     /* Trap CP15 c15 used for implementation defined registers */
     WRITE_SYSREG(HSTR_T(15), HSTR_EL2);
 
-    /* Trap all coprocessor registers (0-13) except cp10 and
-     * cp11 for VFP.
-     *
-     * /!\ All coprocessors except cp10 and cp11 cannot be used in Xen.
-     *
-     * On ARM64 the TCPx bits which we set here (0..9,12,13) are all
-     * RES1, i.e. they would trap whether we did this write or not.
-     */
-    WRITE_SYSREG((HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) |
-                 HCPTR_TTA | HCPTR_TAM,
-                 CPTR_EL2);
+    WRITE_SYSREG(cptr_bits, CPTR_EL2);
 
     /*
      * Configure HCR_EL2 with the bare minimum to run Xen until a guest
-- 
2.34.1




* [PATCH v6 02/12] xen/arm: add SVE vector length field to the domain
  2023-04-24  6:02 [PATCH v6 00/12] SVE feature for arm guests Luca Fancellu
  2023-04-24  6:02 ` [PATCH v6 01/12] xen/arm: enable SVE extension for Xen Luca Fancellu
@ 2023-04-24  6:02 ` Luca Fancellu
  2023-05-18  9:48   ` Julien Grall
  2023-04-24  6:02 ` [PATCH v6 03/12] xen/arm: Expose SVE feature to the guest Luca Fancellu
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-04-24  6:02 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Jan Beulich,
	Wei Liu

Add an sve_vl field to the arch_domain and xen_arch_domainconfig structs,
to give the domain information about the SVE feature and the number of
SVE register bits that are allowed for this domain.

The sve_vl field is the vector length in bits divided by 128; this allows
the structures to use less space.

The field is also used to allow or forbid a domain to use SVE, because a
value equal to zero means the guest is not allowed to use the feature.

Check that the requested vector length is lower than or equal to the
platform supported vector length, otherwise fail on domain creation.

Check that only 64-bit domains have SVE enabled, otherwise fail.

Bump the XEN_DOMCTL_INTERFACE_VERSION because of the new field
in struct xen_arch_domainconfig.
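The encoding and checks described above can be sketched together (the decode
helper mirrors this patch's sve_decode_vl(); the check_sve_vl wrapper name is
illustrative, condensing the arch_sanitise_domain_config() hunk below):

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SVE_VL_MULTIPLE_VAL 128U

/* sve_vl stores VL/128, so a uint8_t covers the whole 128..2048 range */
static unsigned int sve_decode_vl(unsigned int sve_vl)
{
    return sve_vl * SVE_VL_MULTIPLE_VAL;
}

/* Fail domain creation if the requested VL exceeds the platform's */
static int check_sve_vl(uint8_t sve_vl, unsigned int platform_vl_bits)
{
    unsigned int vl_bits = sve_decode_vl(sve_vl);

    if ( vl_bits == 0 )
        return 0;             /* zero: SVE not allowed, always valid */
    if ( !platform_vl_bits || vl_bits > platform_vl_bits )
        return -EINVAL;
    return 0;
}
```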

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
Changes from v5:
 - Update commit message stating the interface ver. bump (Bertrand)
 - in struct arch_domain, protect sve_vl with CONFIG_ARM64_SVE,
   given the change, move also is_sve_domain() where it's protected
   inside sve.h and create a stub when the macro is not defined,
   protect the usage of sve_vl where needed.
   (Julien)
 - Add a check for 32 bit guest running on top of 64 bit host that
   have sve parameter enabled to stop the domain creation, added in
   construct_domain() of domain_build.c and subarch_do_domctl of
   domctl.c. (Julien)
Changes from v4:
 - Return 0 in get_sys_vl_len() if sve is not supported, code style fix,
   removed else if since the conditions can't fallthrough, removed not
   needed condition checking for VL bits validity because it's already
   covered, so delete is_vl_valid() function. (Jan)
Changes from v3:
 - don't use fixed types when not needed, use encoded value also in
   arch_domain so rename sve_vl_bits in sve_vl. (Jan)
 - rename domainconfig_decode_vl to sve_decode_vl because it will now
   be used also to decode from arch_domain value
 - change sve_vl from uint16_t to uint8_t and move it after "type" field
   to optimize space.
Changes from v2:
 - rename field in xen_arch_domainconfig from "sve_vl_bits" to
   "sve_vl" and use the implicit padding after gic_version to
   store it, now this field is the VL/128. (Jan)
 - Created domainconfig_decode_vl() function to decode the sve_vl
   field and use it as plain bits value inside arch_domain.
 - Changed commit message reflecting the changes
Changes from v1:
 - no changes
Changes from RFC:
 - restore zcr_el2 in sve_restore_state, which will be introduced
   later in this series, so remove zcr_el2 related code from this
   patch and move everything to the later patch (Julien)
 - add explicit padding into struct xen_arch_domainconfig (Julien)
 - Don't lower down the vector length, just fail to create the
   domain. (Julien)
---
 xen/arch/arm/arm64/domctl.c          |  4 ++++
 xen/arch/arm/arm64/sve.c             | 12 ++++++++++++
 xen/arch/arm/domain.c                | 29 ++++++++++++++++++++++++++++
 xen/arch/arm/domain_build.c          |  7 +++++++
 xen/arch/arm/include/asm/arm64/sve.h | 16 +++++++++++++++
 xen/arch/arm/include/asm/domain.h    |  5 +++++
 xen/include/public/arch-arm.h        |  2 ++
 xen/include/public/domctl.h          |  2 +-
 8 files changed, 76 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/arm64/domctl.c b/xen/arch/arm/arm64/domctl.c
index 0de89b42c448..14fc622e9956 100644
--- a/xen/arch/arm/arm64/domctl.c
+++ b/xen/arch/arm/arm64/domctl.c
@@ -10,6 +10,7 @@
 #include <xen/sched.h>
 #include <xen/hypercall.h>
 #include <public/domctl.h>
+#include <asm/arm64/sve.h>
 #include <asm/cpufeature.h>
 
 static long switch_mode(struct domain *d, enum domain_type type)
@@ -43,6 +44,9 @@ long subarch_do_domctl(struct xen_domctl *domctl, struct domain *d,
         case 32:
             if ( !cpu_has_el1_32 )
                 return -EINVAL;
+            /* SVE is not supported for 32 bit domain */
+            if ( is_sve_domain(d) )
+                return -EINVAL;
             return switch_mode(d, DOMAIN_32BIT);
         case 64:
             return switch_mode(d, DOMAIN_64BIT);
diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
index 6f3fb368c59b..86a5e617bfca 100644
--- a/xen/arch/arm/arm64/sve.c
+++ b/xen/arch/arm/arm64/sve.c
@@ -8,6 +8,7 @@
 #include <xen/types.h>
 #include <asm/arm64/sve.h>
 #include <asm/arm64/sysregs.h>
+#include <asm/cpufeature.h>
 #include <asm/processor.h>
 #include <asm/system.h>
 
@@ -48,3 +49,14 @@ register_t vl_to_zcr(unsigned int vl)
     ASSERT(vl > 0);
     return ((vl / SVE_VL_MULTIPLE_VAL) - 1U) & ZCR_ELx_LEN_MASK;
 }
+
+/* Get the system sanitized value for VL in bits */
+unsigned int get_sys_vl_len(void)
+{
+    if ( !cpu_has_sve )
+        return 0;
+
+    /* ZCR_ELx len field is ((len+1) * 128) = vector bits length */
+    return ((system_cpuinfo.zcr64.bits[0] & ZCR_ELx_LEN_MASK) + 1U) *
+            SVE_VL_MULTIPLE_VAL;
+}
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 0350d8c61ed8..143359d0f313 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -13,6 +13,7 @@
 #include <xen/wait.h>
 
 #include <asm/alternative.h>
+#include <asm/arm64/sve.h>
 #include <asm/cpuerrata.h>
 #include <asm/cpufeature.h>
 #include <asm/current.h>
@@ -550,6 +551,8 @@ int arch_vcpu_create(struct vcpu *v)
     v->arch.vmpidr = MPIDR_SMP | vcpuid_to_vaffinity(v->vcpu_id);
 
     v->arch.cptr_el2 = get_default_cptr_flags();
+    if ( is_sve_domain(v->domain) )
+        v->arch.cptr_el2 &= ~HCPTR_CP(8);
 
     v->arch.hcr_el2 = get_default_hcr_flags();
 
@@ -594,6 +597,7 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
     unsigned int max_vcpus;
     unsigned int flags_required = (XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap);
     unsigned int flags_optional = (XEN_DOMCTL_CDF_iommu | XEN_DOMCTL_CDF_vpmu);
+    unsigned int sve_vl_bits = sve_decode_vl(config->arch.sve_vl);
 
     if ( (config->flags & ~flags_optional) != flags_required )
     {
@@ -602,6 +606,26 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
         return -EINVAL;
     }
 
+    /* Check feature flags */
+    if ( sve_vl_bits > 0 )
+    {
+        unsigned int zcr_max_bits = get_sys_vl_len();
+
+        if ( !zcr_max_bits )
+        {
+            dprintk(XENLOG_INFO, "SVE is unsupported on this machine.\n");
+            return -EINVAL;
+        }
+
+        if ( sve_vl_bits > zcr_max_bits )
+        {
+            dprintk(XENLOG_INFO,
+                    "Requested SVE vector length (%u) > supported length (%u)\n",
+                    sve_vl_bits, zcr_max_bits);
+            return -EINVAL;
+        }
+    }
+
     /* The P2M table must always be shared between the CPU and the IOMMU */
     if ( config->iommu_opts & XEN_DOMCTL_IOMMU_no_sharept )
     {
@@ -744,6 +768,11 @@ int arch_domain_create(struct domain *d,
     if ( (rc = domain_vpci_init(d)) != 0 )
         goto fail;
 
+#ifdef CONFIG_ARM64_SVE
+    /* Copy the encoded vector length sve_vl from the domain configuration */
+    d->arch.sve_vl = config->arch.sve_vl;
+#endif
+
     return 0;
 
 fail:
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index f80fdd1af206..ffabe567ac3f 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -26,6 +26,7 @@
 #include <asm/platform.h>
 #include <asm/psci.h>
 #include <asm/setup.h>
+#include <asm/arm64/sve.h>
 #include <asm/cpufeature.h>
 #include <asm/domain_build.h>
 #include <xen/event.h>
@@ -3674,6 +3675,12 @@ static int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
         return -EINVAL;
     }
 
+    if ( is_sve_domain(d) && (kinfo->type == DOMAIN_32BIT) )
+    {
+        printk("SVE is not available for 32-bit domain\n");
+        return -EINVAL;
+    }
+
     if ( is_64bit_domain(d) )
         vcpu_switch_to_aarch64_mode(v);
 
diff --git a/xen/arch/arm/include/asm/arm64/sve.h b/xen/arch/arm/include/asm/arm64/sve.h
index 144d2b1cc485..730c3fb5a9c8 100644
--- a/xen/arch/arm/include/asm/arm64/sve.h
+++ b/xen/arch/arm/include/asm/arm64/sve.h
@@ -13,13 +13,24 @@
 /* Vector length must be multiple of 128 */
 #define SVE_VL_MULTIPLE_VAL (128U)
 
+static inline unsigned int sve_decode_vl(unsigned int sve_vl)
+{
+    /* SVE vector length is stored as VL/128 in xen_arch_domainconfig */
+    return sve_vl * SVE_VL_MULTIPLE_VAL;
+}
+
 #ifdef CONFIG_ARM64_SVE
 
+#define is_sve_domain(d) ((d)->arch.sve_vl > 0)
+
 register_t compute_max_zcr(void);
 register_t vl_to_zcr(unsigned int vl);
+unsigned int get_sys_vl_len(void);
 
 #else /* !CONFIG_ARM64_SVE */
 
+#define is_sve_domain(d) (0)
+
 static inline register_t compute_max_zcr(void)
 {
     return 0;
@@ -30,6 +41,11 @@ static inline register_t vl_to_zcr(unsigned int vl)
     return 0;
 }
 
+static inline unsigned int get_sys_vl_len(void)
+{
+    return 0;
+}
+
 #endif /* CONFIG_ARM64_SVE */
 
 #endif /* _ARM_ARM64_SVE_H */
diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
index e776ee704b7d..331da0f3bcc3 100644
--- a/xen/arch/arm/include/asm/domain.h
+++ b/xen/arch/arm/include/asm/domain.h
@@ -67,6 +67,11 @@ struct arch_domain
     enum domain_type type;
 #endif
 
+#ifdef CONFIG_ARM64_SVE
+    /* max SVE encoded vector length */
+    uint8_t sve_vl;
+#endif
+
     /* Virtual MMU */
     struct p2m_domain p2m;
 
diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
index 1528ced5097a..38311f559581 100644
--- a/xen/include/public/arch-arm.h
+++ b/xen/include/public/arch-arm.h
@@ -300,6 +300,8 @@ DEFINE_XEN_GUEST_HANDLE(vcpu_guest_context_t);
 struct xen_arch_domainconfig {
     /* IN/OUT */
     uint8_t gic_version;
+    /* IN - Contains SVE vector length divided by 128 */
+    uint8_t sve_vl;
     /* IN */
     uint16_t tee_type;
     /* IN */
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 529801c89ba3..e2e22cb534d6 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -21,7 +21,7 @@
 #include "hvm/save.h"
 #include "memory.h"
 
-#define XEN_DOMCTL_INTERFACE_VERSION 0x00000015
+#define XEN_DOMCTL_INTERFACE_VERSION 0x00000016
 
 /*
  * NB. xen_domctl.domain is an IN/OUT parameter for this operation.
-- 
2.34.1
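Side note on the encoding introduced by this patch: xen_arch_domainconfig.sve_vl stores the vector length in bits divided by 128 (SVE_VL_MULTIPLE_VAL), so the maximum architectural VL of 2048 fits in a uint8_t. A minimal standalone sketch of the encode/decode pair — sve_encode_vl() is a hypothetical helper added here for illustration; only sve_decode_vl() exists in the series:

```c
#include <assert.h>

/* Vector length must be a multiple of 128, as in the Xen header */
#define SVE_VL_MULTIPLE_VAL 128U

/* Hypothetical inverse of sve_decode_vl(), for illustration only */
static inline unsigned int sve_encode_vl(unsigned int vl_bits)
{
    return vl_bits / SVE_VL_MULTIPLE_VAL;
}

/* Mirrors the helper added in asm/arm64/sve.h */
static inline unsigned int sve_decode_vl(unsigned int sve_vl)
{
    /* SVE vector length is stored as VL/128 in xen_arch_domainconfig */
    return sve_vl * SVE_VL_MULTIPLE_VAL;
}
```

With this encoding, sve_vl == 0 naturally means "SVE disabled", which is what the is_sve_domain() macro tests.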



^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v6 03/12] xen/arm: Expose SVE feature to the guest
  2023-04-24  6:02 [PATCH v6 00/12] SVE feature for arm guests Luca Fancellu
  2023-04-24  6:02 ` [PATCH v6 01/12] xen/arm: enable SVE extension for Xen Luca Fancellu
  2023-04-24  6:02 ` [PATCH v6 02/12] xen/arm: add SVE vector length field to the domain Luca Fancellu
@ 2023-04-24  6:02 ` Luca Fancellu
  2023-05-18  9:51   ` Julien Grall
  2023-04-24  6:02 ` [PATCH v6 04/12] xen/arm: add SVE exception class handling Luca Fancellu
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-04-24  6:02 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk

When a guest is allowed to use SVE, expose the SVE features through
the identification registers.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
Changes from v5:
 - given the move of is_sve_domain() in asm/arm64/sve.h, add the
   header to vsysreg.c
 - dropping Bertrand's R-by because of the change
Changes from v4:
 - no changes
Changes from v3:
 - no changes
Changes from v2:
 - no changes
Changes from v1:
 - No changes
Changes from RFC:
 - No changes
---
 xen/arch/arm/arm64/vsysreg.c | 40 ++++++++++++++++++++++++++++++++++--
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
index 758750983c11..e1ef927b3347 100644
--- a/xen/arch/arm/arm64/vsysreg.c
+++ b/xen/arch/arm/arm64/vsysreg.c
@@ -18,6 +18,8 @@
 
 #include <xen/sched.h>
 
+#include <asm/arm64/cpufeature.h>
+#include <asm/arm64/sve.h>
 #include <asm/current.h>
 #include <asm/regs.h>
 #include <asm/traps.h>
@@ -295,7 +297,28 @@ void do_sysreg(struct cpu_user_regs *regs,
     GENERATE_TID3_INFO(MVFR0_EL1, mvfr, 0)
     GENERATE_TID3_INFO(MVFR1_EL1, mvfr, 1)
     GENERATE_TID3_INFO(MVFR2_EL1, mvfr, 2)
-    GENERATE_TID3_INFO(ID_AA64PFR0_EL1, pfr64, 0)
+
+    case HSR_SYSREG_ID_AA64PFR0_EL1:
+    {
+        register_t guest_reg_value = guest_cpuinfo.pfr64.bits[0];
+
+        if ( is_sve_domain(v->domain) )
+        {
+            /* 4 is the SVE field width in id_aa64pfr0_el1 */
+            uint64_t mask = GENMASK(ID_AA64PFR0_SVE_SHIFT + 4 - 1,
+                                    ID_AA64PFR0_SVE_SHIFT);
+            /* sysval is the sve field on the system */
+            uint64_t sysval = cpuid_feature_extract_unsigned_field_width(
+                                system_cpuinfo.pfr64.bits[0],
+                                ID_AA64PFR0_SVE_SHIFT, 4);
+            guest_reg_value &= ~mask;
+            guest_reg_value |= (sysval << ID_AA64PFR0_SVE_SHIFT) & mask;
+        }
+
+        return handle_ro_read_val(regs, regidx, hsr.sysreg.read, hsr, 1,
+                                  guest_reg_value);
+    }
+
     GENERATE_TID3_INFO(ID_AA64PFR1_EL1, pfr64, 1)
     GENERATE_TID3_INFO(ID_AA64DFR0_EL1, dbg64, 0)
     GENERATE_TID3_INFO(ID_AA64DFR1_EL1, dbg64, 1)
@@ -306,7 +329,20 @@ void do_sysreg(struct cpu_user_regs *regs,
     GENERATE_TID3_INFO(ID_AA64MMFR2_EL1, mm64, 2)
     GENERATE_TID3_INFO(ID_AA64AFR0_EL1, aux64, 0)
     GENERATE_TID3_INFO(ID_AA64AFR1_EL1, aux64, 1)
-    GENERATE_TID3_INFO(ID_AA64ZFR0_EL1, zfr64, 0)
+
+    case HSR_SYSREG_ID_AA64ZFR0_EL1:
+    {
+        /*
+         * When the guest has the SVE feature enabled, the whole id_aa64zfr0_el1
+         * needs to be exposed.
+         */
+        register_t guest_reg_value = guest_cpuinfo.zfr64.bits[0];
+        if ( is_sve_domain(v->domain) )
+            guest_reg_value = system_cpuinfo.zfr64.bits[0];
+
+        return handle_ro_read_val(regs, regidx, hsr.sysreg.read, hsr, 1,
+                                  guest_reg_value);
+    }
 
     /*
      * Those cases are catching all Reserved registers trapped by TID3 which
-- 
2.34.1
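The ID_AA64PFR0_EL1 handling in this patch is a plain read-modify-write of the 4-bit SVE field. A standalone sketch of the same operation, with GENMASK and the architectural field position (the SVE field sits at bits [35:32], i.e. shift 32) redefined locally so the snippet compiles outside Xen:

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the Xen definitions, assumed equivalent */
#define GENMASK(h, l) (((~0ULL) << (l)) & (~0ULL >> (63 - (h))))
#define ID_AA64PFR0_SVE_SHIFT 32 /* architectural position of the SVE field */

/* Replace the 4-bit SVE field of the guest view with the system value */
static uint64_t set_sve_field(uint64_t guest_reg, uint64_t sysval)
{
    uint64_t mask = GENMASK(ID_AA64PFR0_SVE_SHIFT + 4 - 1,
                            ID_AA64PFR0_SVE_SHIFT);

    guest_reg &= ~mask;
    guest_reg |= (sysval << ID_AA64PFR0_SVE_SHIFT) & mask;
    return guest_reg;
}
```

All other fields of the guest register value pass through unchanged, which is why the patch only special-cases SVE domains.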




* [PATCH v6 04/12] xen/arm: add SVE exception class handling
  2023-04-24  6:02 [PATCH v6 00/12] SVE feature for arm guests Luca Fancellu
                   ` (2 preceding siblings ...)
  2023-04-24  6:02 ` [PATCH v6 03/12] xen/arm: Expose SVE feature to the guest Luca Fancellu
@ 2023-04-24  6:02 ` Luca Fancellu
  2023-05-18  9:55   ` Julien Grall
  2023-04-24  6:02 ` [PATCH v6 05/12] arm/sve: save/restore SVE context switch Luca Fancellu
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-04-24  6:02 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk

SVE introduces a new exception class with code 0x19; add the new code
and handle the exception.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
---
Changes from v5:
 - modified error messages (Julien)
 - add R-by Bertrand
Changes from v4:
 - No changes
Changes from v3:
 - No changes
Changes from v2:
 - No changes
Changes from v1:
 - No changes
Changes from RFC:
 - No changes
---
 xen/arch/arm/include/asm/processor.h | 1 +
 xen/arch/arm/traps.c                 | 9 +++++++++
 2 files changed, 10 insertions(+)

diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index bc683334125c..7e42ff8811fc 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -426,6 +426,7 @@
 #define HSR_EC_HVC64                0x16
 #define HSR_EC_SMC64                0x17
 #define HSR_EC_SYSREG               0x18
+#define HSR_EC_SVE                  0x19
 #endif
 #define HSR_EC_INSTR_ABORT_LOWER_EL 0x20
 #define HSR_EC_INSTR_ABORT_CURR_EL  0x21
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index c0611c2ef6a5..d672d2c694ef 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2173,6 +2173,11 @@ void do_trap_guest_sync(struct cpu_user_regs *regs)
         perfc_incr(trap_sysreg);
         do_sysreg(regs, hsr);
         break;
+    case HSR_EC_SVE:
+        GUEST_BUG_ON(regs_mode_is_32bit(regs));
+        gprintk(XENLOG_WARNING, "Domain tried to use SVE while not allowed\n");
+        inject_undef_exception(regs, hsr);
+        break;
 #endif
 
     case HSR_EC_INSTR_ABORT_LOWER_EL:
@@ -2202,6 +2207,10 @@ void do_trap_hyp_sync(struct cpu_user_regs *regs)
     case HSR_EC_BRK:
         do_trap_brk(regs, hsr);
         break;
+    case HSR_EC_SVE:
+        /* An SVE exception is a bug somewhere in hypervisor code */
+        do_unexpected_trap("SVE trap at EL2", regs);
+        break;
 #endif
     case HSR_EC_DATA_ABORT_CURR_EL:
     case HSR_EC_INSTR_ABORT_CURR_EL:
-- 
2.34.1
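For context on where the value switched on above comes from: on AArch64 the exception class is bits [31:26] of ESR_EL2 (Xen's hsr), and Xen decodes it through a bitfield in its hsr union. A minimal sketch of the extraction — hsr_ec() is a hypothetical helper, not the form Xen uses:

```c
#include <assert.h>
#include <stdint.h>

/* Exception class added by this patch */
#define HSR_EC_SVE 0x19

/* EC is the 6-bit field at bits [31:26] of the syndrome register */
static unsigned int hsr_ec(uint32_t hsr)
{
    return (hsr >> 26) & 0x3f;
}
```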




* [PATCH v6 05/12] arm/sve: save/restore SVE context switch
  2023-04-24  6:02 [PATCH v6 00/12] SVE feature for arm guests Luca Fancellu
                   ` (3 preceding siblings ...)
  2023-04-24  6:02 ` [PATCH v6 04/12] xen/arm: add SVE exception class handling Luca Fancellu
@ 2023-04-24  6:02 ` Luca Fancellu
  2023-05-18 18:27   ` Julien Grall
  2023-05-18 18:30   ` Julien Grall
  2023-04-24  6:02 ` [PATCH v6 06/12] xen/common: add dom0 xen command line argument for Arm Luca Fancellu
                   ` (6 subsequent siblings)
  11 siblings, 2 replies; 56+ messages in thread
From: Luca Fancellu @ 2023-04-24  6:02 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk

Save/restore the SVE context on context switch: allocate memory to
hold the Z0-Z31 registers, whose length is at most 2048 bits each, and
FFR, which can be at most 256 bits; the amount of memory allocated
depends on the vector length configured for the domain and on how many
bits the platform supports.

Save P0-P15, whose length is at most 256 bits each; in this case the
memory used is the fpregs field in struct vfp_state, because V0-V31
are part of Z0-Z31 and this space would otherwise be unused for an SVE
domain.

Create zcr_el{1,2} fields in arch_vcpu, initialise zcr_el2 on vcpu
creation from the requested vector length and restore it on context
switch; save/restore the ZCR_EL1 value as well.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
Changes from v5:
 - use XFREE instead of xfree, keep the headers (Julien)
 - Avoid math computation for every save/restore, store the computation
   in struct vfp_state once (Bertrand)
 - protect access to v->domain->arch.sve_vl inside arch_vcpu_create now
   that sve_vl is available only on arm64
Changes from v4:
 - No changes
Changes from v3:
 - don't use fixed len types when not needed (Jan)
 - now VL is an encoded value, decode it before using.
Changes from v2:
 - No changes
Changes from v1:
 - No changes
Changes from RFC:
 - Moved zcr_el2 field introduction in this patch, restore its
   content inside sve_restore_state function. (Julien)
---
 xen/arch/arm/arm64/sve-asm.S             | 141 +++++++++++++++++++++++
 xen/arch/arm/arm64/sve.c                 |  63 ++++++++++
 xen/arch/arm/arm64/vfp.c                 |  79 +++++++------
 xen/arch/arm/domain.c                    |   9 ++
 xen/arch/arm/include/asm/arm64/sve.h     |  13 +++
 xen/arch/arm/include/asm/arm64/sysregs.h |   3 +
 xen/arch/arm/include/asm/arm64/vfp.h     |  12 ++
 xen/arch/arm/include/asm/domain.h        |   2 +
 8 files changed, 288 insertions(+), 34 deletions(-)

diff --git a/xen/arch/arm/arm64/sve-asm.S b/xen/arch/arm/arm64/sve-asm.S
index 4d1549344733..8c37d7bc95d5 100644
--- a/xen/arch/arm/arm64/sve-asm.S
+++ b/xen/arch/arm/arm64/sve-asm.S
@@ -17,6 +17,18 @@
     .endif
 .endm
 
+.macro _sve_check_zreg znr
+    .if (\znr) < 0 || (\znr) > 31
+        .error "Bad Scalable Vector Extension vector register number \znr."
+    .endif
+.endm
+
+.macro _sve_check_preg pnr
+    .if (\pnr) < 0 || (\pnr) > 15
+        .error "Bad Scalable Vector Extension predicate register number \pnr."
+    .endif
+.endm
+
 .macro _check_num n, min, max
     .if (\n) < (\min) || (\n) > (\max)
         .error "Number \n out of range [\min,\max]"
@@ -26,6 +38,54 @@
 /* SVE instruction encodings for non-SVE-capable assemblers */
 /* (pre binutils 2.28, all kernel capable clang versions support SVE) */
 
+/* STR (vector): STR Z\nz, [X\nxbase, #\offset, MUL VL] */
+.macro _sve_str_v nz, nxbase, offset=0
+    _sve_check_zreg \nz
+    _check_general_reg \nxbase
+    _check_num (\offset), -0x100, 0xff
+    .inst 0xe5804000                \
+        | (\nz)                     \
+        | ((\nxbase) << 5)          \
+        | (((\offset) & 7) << 10)   \
+        | (((\offset) & 0x1f8) << 13)
+.endm
+
+/* LDR (vector): LDR Z\nz, [X\nxbase, #\offset, MUL VL] */
+.macro _sve_ldr_v nz, nxbase, offset=0
+    _sve_check_zreg \nz
+    _check_general_reg \nxbase
+    _check_num (\offset), -0x100, 0xff
+    .inst 0x85804000                \
+        | (\nz)                     \
+        | ((\nxbase) << 5)          \
+        | (((\offset) & 7) << 10)   \
+        | (((\offset) & 0x1f8) << 13)
+.endm
+
+/* STR (predicate): STR P\np, [X\nxbase, #\offset, MUL VL] */
+.macro _sve_str_p np, nxbase, offset=0
+    _sve_check_preg \np
+    _check_general_reg \nxbase
+    _check_num (\offset), -0x100, 0xff
+    .inst 0xe5800000                \
+        | (\np)                     \
+        | ((\nxbase) << 5)          \
+        | (((\offset) & 7) << 10)   \
+        | (((\offset) & 0x1f8) << 13)
+.endm
+
+/* LDR (predicate): LDR P\np, [X\nxbase, #\offset, MUL VL] */
+.macro _sve_ldr_p np, nxbase, offset=0
+    _sve_check_preg \np
+    _check_general_reg \nxbase
+    _check_num (\offset), -0x100, 0xff
+    .inst 0x85800000                \
+        | (\np)                     \
+        | ((\nxbase) << 5)          \
+        | (((\offset) & 7) << 10)   \
+        | (((\offset) & 0x1f8) << 13)
+.endm
+
 /* RDVL X\nx, #\imm */
 .macro _sve_rdvl nx, imm
     _check_general_reg \nx
@@ -35,11 +95,92 @@
         | (((\imm) & 0x3f) << 5)
 .endm
 
+/* RDFFR (unpredicated): RDFFR P\np.B */
+.macro _sve_rdffr np
+    _sve_check_preg \np
+    .inst 0x2519f000                \
+        | (\np)
+.endm
+
+/* WRFFR P\np.B */
+.macro _sve_wrffr np
+    _sve_check_preg \np
+    .inst 0x25289000                \
+        | ((\np) << 5)
+.endm
+
+.macro __for from:req, to:req
+    .if (\from) == (\to)
+        _for__body %\from
+    .else
+        __for %\from, %((\from) + ((\to) - (\from)) / 2)
+        __for %((\from) + ((\to) - (\from)) / 2 + 1), %\to
+    .endif
+.endm
+
+.macro _for var:req, from:req, to:req, insn:vararg
+    .macro _for__body \var:req
+        .noaltmacro
+        \insn
+        .altmacro
+    .endm
+
+    .altmacro
+    __for \from, \to
+    .noaltmacro
+
+    .purgem _for__body
+.endm
+
+.macro sve_save nxzffrctx, nxpctx, save_ffr
+    _for n, 0, 31, _sve_str_v \n, \nxzffrctx, \n - 32
+    _for n, 0, 15, _sve_str_p \n, \nxpctx, \n
+        cbz \save_ffr, 1f
+        _sve_rdffr 0
+        _sve_str_p 0, \nxzffrctx
+        _sve_ldr_p 0, \nxpctx
+        b 2f
+1:
+        str xzr, [x\nxzffrctx]      // Zero out FFR
+2:
+.endm
+
+.macro sve_load nxzffrctx, nxpctx, restore_ffr
+    _for n, 0, 31, _sve_ldr_v \n, \nxzffrctx, \n - 32
+        cbz \restore_ffr, 1f
+        _sve_ldr_p 0, \nxzffrctx
+        _sve_wrffr 0
+1:
+    _for n, 0, 15, _sve_ldr_p \n, \nxpctx, \n
+.endm
+
 /* Gets the current vector register size in bytes */
 GLOBAL(sve_get_hw_vl)
     _sve_rdvl 0, 1
     ret
 
+/*
+ * Save the SVE context
+ *
+ * x0 - pointer to buffer for Z0-31 + FFR
+ * x1 - pointer to buffer for P0-15
+ * x2 - Save FFR if non-zero
+ */
+GLOBAL(sve_save_ctx)
+    sve_save 0, 1, x2
+    ret
+
+/*
+ * Load the SVE context
+ *
+ * x0 - pointer to buffer for Z0-31 + FFR
+ * x1 - pointer to buffer for P0-15
+ * x2 - Restore FFR if non-zero
+ */
+GLOBAL(sve_load_ctx)
+    sve_load 0, 1, x2
+    ret
+
 /*
  * Local variables:
  * mode: ASM
diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
index 86a5e617bfca..064832b450ff 100644
--- a/xen/arch/arm/arm64/sve.c
+++ b/xen/arch/arm/arm64/sve.c
@@ -5,6 +5,8 @@
  * Copyright (C) 2022 ARM Ltd.
  */
 
+#include <xen/sched.h>
+#include <xen/sizes.h>
 #include <xen/types.h>
 #include <asm/arm64/sve.h>
 #include <asm/arm64/sysregs.h>
@@ -13,6 +15,24 @@
 #include <asm/system.h>
 
 extern unsigned int sve_get_hw_vl(void);
+extern void sve_save_ctx(uint64_t *sve_ctx, uint64_t *pregs, int save_ffr);
+extern void sve_load_ctx(uint64_t const *sve_ctx, uint64_t const *pregs,
+                         int restore_ffr);
+
+static inline unsigned int sve_zreg_ctx_size(unsigned int vl)
+{
+    /*
+     * Z0-31 registers size in bytes is computed from VL that is in bits, so VL
+     * in bytes is VL/8.
+     */
+    return (vl / 8U) * 32U;
+}
+
+static inline unsigned int sve_ffrreg_ctx_size(unsigned int vl)
+{
+    /* FFR register size is VL/8, which is in bytes (VL/8)/8 */
+    return (vl / 64U);
+}
 
 register_t compute_max_zcr(void)
 {
@@ -60,3 +80,46 @@ unsigned int get_sys_vl_len(void)
     return ((system_cpuinfo.zcr64.bits[0] & ZCR_ELx_LEN_MASK) + 1U) *
             SVE_VL_MULTIPLE_VAL;
 }
+
+int sve_context_init(struct vcpu *v)
+{
+    unsigned int sve_vl_bits = sve_decode_vl(v->domain->arch.sve_vl);
+    uint64_t *ctx = _xzalloc(sve_zreg_ctx_size(sve_vl_bits) +
+                             sve_ffrreg_ctx_size(sve_vl_bits),
+                             L1_CACHE_BYTES);
+
+    if ( !ctx )
+        return -ENOMEM;
+
+    /* Point to the end of Z0-Z31 memory, just before FFR memory */
+    v->arch.vfp.sve_zreg_ctx_end = ctx +
+        (sve_zreg_ctx_size(sve_vl_bits) / sizeof(uint64_t));
+
+    return 0;
+}
+
+void sve_context_free(struct vcpu *v)
+{
+    unsigned int sve_vl_bits = sve_decode_vl(v->domain->arch.sve_vl);
+
+    /* Point back to the beginning of Z0-Z31 + FFR memory */
+    v->arch.vfp.sve_zreg_ctx_end -=
+        (sve_zreg_ctx_size(sve_vl_bits) / sizeof(uint64_t));
+
+    XFREE(v->arch.vfp.sve_zreg_ctx_end);
+}
+
+void sve_save_state(struct vcpu *v)
+{
+    v->arch.zcr_el1 = READ_SYSREG(ZCR_EL1);
+
+    sve_save_ctx(v->arch.vfp.sve_zreg_ctx_end, v->arch.vfp.fpregs, 1);
+}
+
+void sve_restore_state(struct vcpu *v)
+{
+    WRITE_SYSREG(v->arch.zcr_el1, ZCR_EL1);
+    WRITE_SYSREG(v->arch.zcr_el2, ZCR_EL2);
+
+    sve_load_ctx(v->arch.vfp.sve_zreg_ctx_end, v->arch.vfp.fpregs, 1);
+}
diff --git a/xen/arch/arm/arm64/vfp.c b/xen/arch/arm/arm64/vfp.c
index 47885e76baae..2d0d7c2e6ddb 100644
--- a/xen/arch/arm/arm64/vfp.c
+++ b/xen/arch/arm/arm64/vfp.c
@@ -2,29 +2,35 @@
 #include <asm/processor.h>
 #include <asm/cpufeature.h>
 #include <asm/vfp.h>
+#include <asm/arm64/sve.h>
 
 void vfp_save_state(struct vcpu *v)
 {
     if ( !cpu_has_fp )
         return;
 
-    asm volatile("stp q0, q1, [%1, #16 * 0]\n\t"
-                 "stp q2, q3, [%1, #16 * 2]\n\t"
-                 "stp q4, q5, [%1, #16 * 4]\n\t"
-                 "stp q6, q7, [%1, #16 * 6]\n\t"
-                 "stp q8, q9, [%1, #16 * 8]\n\t"
-                 "stp q10, q11, [%1, #16 * 10]\n\t"
-                 "stp q12, q13, [%1, #16 * 12]\n\t"
-                 "stp q14, q15, [%1, #16 * 14]\n\t"
-                 "stp q16, q17, [%1, #16 * 16]\n\t"
-                 "stp q18, q19, [%1, #16 * 18]\n\t"
-                 "stp q20, q21, [%1, #16 * 20]\n\t"
-                 "stp q22, q23, [%1, #16 * 22]\n\t"
-                 "stp q24, q25, [%1, #16 * 24]\n\t"
-                 "stp q26, q27, [%1, #16 * 26]\n\t"
-                 "stp q28, q29, [%1, #16 * 28]\n\t"
-                 "stp q30, q31, [%1, #16 * 30]\n\t"
-                 : "=Q" (*v->arch.vfp.fpregs) : "r" (v->arch.vfp.fpregs));
+    if ( is_sve_domain(v->domain) )
+        sve_save_state(v);
+    else
+    {
+        asm volatile("stp q0, q1, [%1, #16 * 0]\n\t"
+                     "stp q2, q3, [%1, #16 * 2]\n\t"
+                     "stp q4, q5, [%1, #16 * 4]\n\t"
+                     "stp q6, q7, [%1, #16 * 6]\n\t"
+                     "stp q8, q9, [%1, #16 * 8]\n\t"
+                     "stp q10, q11, [%1, #16 * 10]\n\t"
+                     "stp q12, q13, [%1, #16 * 12]\n\t"
+                     "stp q14, q15, [%1, #16 * 14]\n\t"
+                     "stp q16, q17, [%1, #16 * 16]\n\t"
+                     "stp q18, q19, [%1, #16 * 18]\n\t"
+                     "stp q20, q21, [%1, #16 * 20]\n\t"
+                     "stp q22, q23, [%1, #16 * 22]\n\t"
+                     "stp q24, q25, [%1, #16 * 24]\n\t"
+                     "stp q26, q27, [%1, #16 * 26]\n\t"
+                     "stp q28, q29, [%1, #16 * 28]\n\t"
+                     "stp q30, q31, [%1, #16 * 30]\n\t"
+                     : "=Q" (*v->arch.vfp.fpregs) : "r" (v->arch.vfp.fpregs));
+    }
 
     v->arch.vfp.fpsr = READ_SYSREG(FPSR);
     v->arch.vfp.fpcr = READ_SYSREG(FPCR);
@@ -37,23 +43,28 @@ void vfp_restore_state(struct vcpu *v)
     if ( !cpu_has_fp )
         return;
 
-    asm volatile("ldp q0, q1, [%1, #16 * 0]\n\t"
-                 "ldp q2, q3, [%1, #16 * 2]\n\t"
-                 "ldp q4, q5, [%1, #16 * 4]\n\t"
-                 "ldp q6, q7, [%1, #16 * 6]\n\t"
-                 "ldp q8, q9, [%1, #16 * 8]\n\t"
-                 "ldp q10, q11, [%1, #16 * 10]\n\t"
-                 "ldp q12, q13, [%1, #16 * 12]\n\t"
-                 "ldp q14, q15, [%1, #16 * 14]\n\t"
-                 "ldp q16, q17, [%1, #16 * 16]\n\t"
-                 "ldp q18, q19, [%1, #16 * 18]\n\t"
-                 "ldp q20, q21, [%1, #16 * 20]\n\t"
-                 "ldp q22, q23, [%1, #16 * 22]\n\t"
-                 "ldp q24, q25, [%1, #16 * 24]\n\t"
-                 "ldp q26, q27, [%1, #16 * 26]\n\t"
-                 "ldp q28, q29, [%1, #16 * 28]\n\t"
-                 "ldp q30, q31, [%1, #16 * 30]\n\t"
-                 : : "Q" (*v->arch.vfp.fpregs), "r" (v->arch.vfp.fpregs));
+    if ( is_sve_domain(v->domain) )
+        sve_restore_state(v);
+    else
+    {
+        asm volatile("ldp q0, q1, [%1, #16 * 0]\n\t"
+                     "ldp q2, q3, [%1, #16 * 2]\n\t"
+                     "ldp q4, q5, [%1, #16 * 4]\n\t"
+                     "ldp q6, q7, [%1, #16 * 6]\n\t"
+                     "ldp q8, q9, [%1, #16 * 8]\n\t"
+                     "ldp q10, q11, [%1, #16 * 10]\n\t"
+                     "ldp q12, q13, [%1, #16 * 12]\n\t"
+                     "ldp q14, q15, [%1, #16 * 14]\n\t"
+                     "ldp q16, q17, [%1, #16 * 16]\n\t"
+                     "ldp q18, q19, [%1, #16 * 18]\n\t"
+                     "ldp q20, q21, [%1, #16 * 20]\n\t"
+                     "ldp q22, q23, [%1, #16 * 22]\n\t"
+                     "ldp q24, q25, [%1, #16 * 24]\n\t"
+                     "ldp q26, q27, [%1, #16 * 26]\n\t"
+                     "ldp q28, q29, [%1, #16 * 28]\n\t"
+                     "ldp q30, q31, [%1, #16 * 30]\n\t"
+                     : : "Q" (*v->arch.vfp.fpregs), "r" (v->arch.vfp.fpregs));
+    }
 
     WRITE_SYSREG(v->arch.vfp.fpsr, FPSR);
     WRITE_SYSREG(v->arch.vfp.fpcr, FPCR);
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 143359d0f313..24c722a4a11e 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -552,7 +552,14 @@ int arch_vcpu_create(struct vcpu *v)
 
     v->arch.cptr_el2 = get_default_cptr_flags();
     if ( is_sve_domain(v->domain) )
+    {
+        if ( (rc = sve_context_init(v)) != 0 )
+            goto fail;
         v->arch.cptr_el2 &= ~HCPTR_CP(8);
+#ifdef CONFIG_ARM64_SVE
+        v->arch.zcr_el2 = vl_to_zcr(sve_decode_vl(v->domain->arch.sve_vl));
+#endif
+    }
 
     v->arch.hcr_el2 = get_default_hcr_flags();
 
@@ -582,6 +589,8 @@ fail:
 
 void arch_vcpu_destroy(struct vcpu *v)
 {
+    if ( is_sve_domain(v->domain) )
+        sve_context_free(v);
     vcpu_timer_destroy(v);
     vcpu_vgic_free(v);
     free_xenheap_pages(v->arch.stack, STACK_ORDER);
diff --git a/xen/arch/arm/include/asm/arm64/sve.h b/xen/arch/arm/include/asm/arm64/sve.h
index 730c3fb5a9c8..582405dfdf6a 100644
--- a/xen/arch/arm/include/asm/arm64/sve.h
+++ b/xen/arch/arm/include/asm/arm64/sve.h
@@ -26,6 +26,10 @@ static inline unsigned int sve_decode_vl(unsigned int sve_vl)
 register_t compute_max_zcr(void);
 register_t vl_to_zcr(unsigned int vl);
 unsigned int get_sys_vl_len(void);
+int sve_context_init(struct vcpu *v);
+void sve_context_free(struct vcpu *v);
+void sve_save_state(struct vcpu *v);
+void sve_restore_state(struct vcpu *v);
 
 #else /* !CONFIG_ARM64_SVE */
 
@@ -46,6 +50,15 @@ static inline unsigned int get_sys_vl_len(void)
     return 0;
 }
 
+static inline int sve_context_init(struct vcpu *v)
+{
+    return 0;
+}
+
+static inline void sve_context_free(struct vcpu *v) {}
+static inline void sve_save_state(struct vcpu *v) {}
+static inline void sve_restore_state(struct vcpu *v) {}
+
 #endif /* CONFIG_ARM64_SVE */
 
 #endif /* _ARM_ARM64_SVE_H */
diff --git a/xen/arch/arm/include/asm/arm64/sysregs.h b/xen/arch/arm/include/asm/arm64/sysregs.h
index 4cabb9eb4d5e..3fdeb9d8cdef 100644
--- a/xen/arch/arm/include/asm/arm64/sysregs.h
+++ b/xen/arch/arm/include/asm/arm64/sysregs.h
@@ -88,6 +88,9 @@
 #ifndef ID_AA64ISAR2_EL1
 #define ID_AA64ISAR2_EL1            S3_0_C0_C6_2
 #endif
+#ifndef ZCR_EL1
+#define ZCR_EL1                     S3_0_C1_C2_0
+#endif
 
 /* ID registers (imported from arm64/include/asm/sysreg.h in Linux) */
 
diff --git a/xen/arch/arm/include/asm/arm64/vfp.h b/xen/arch/arm/include/asm/arm64/vfp.h
index e6e8c363bc16..4aa371e85d26 100644
--- a/xen/arch/arm/include/asm/arm64/vfp.h
+++ b/xen/arch/arm/include/asm/arm64/vfp.h
@@ -6,7 +6,19 @@
 
 struct vfp_state
 {
+    /*
+     * When SVE is enabled for the guest, fpregs memory will be used to
+     * save/restore P0-P15 registers, otherwise it will be used for the V0-V31
+     * registers.
+     */
     uint64_t fpregs[64] __vfp_aligned;
+    /*
+     * When SVE is enabled for the guest, sve_zreg_ctx_end points to memory
+     * where Z0-Z31 registers and FFR can be saved/restored, it points at the
+     * end of the Z0-Z31 space and at the beginning of the FFR space, it's done
+     * like that to ease the save/restore assembly operations.
+     */
+    uint64_t *sve_zreg_ctx_end;
     register_t fpcr;
     register_t fpexc32_el2;
     register_t fpsr;
diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
index 331da0f3bcc3..814652d92568 100644
--- a/xen/arch/arm/include/asm/domain.h
+++ b/xen/arch/arm/include/asm/domain.h
@@ -195,6 +195,8 @@ struct arch_vcpu
     register_t tpidrro_el0;
 
     /* HYP configuration */
+    register_t zcr_el1;
+    register_t zcr_el2;
     register_t cptr_el2;
     register_t hcr_el2;
     register_t mdcr_el2;
-- 
2.34.1
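The sizing logic used by sve_context_init() can be checked in isolation: the buffer holds 32 Z registers of VL bits each, followed by one FFR of VL/8 bits, and sve_zreg_ctx_end points at the boundary between the two regions. A standalone copy of the two size helpers from the patch:

```c
#include <assert.h>

/* VL is in bits; each of the 32 Z registers takes VL/8 bytes */
static unsigned int sve_zreg_ctx_size(unsigned int vl)
{
    return (vl / 8U) * 32U;
}

/* FFR is VL/8 bits wide, i.e. VL/64 bytes */
static unsigned int sve_ffrreg_ctx_size(unsigned int vl)
{
    return vl / 64U;
}
```

So for the minimum VL of 128 bits the context is 512 + 2 bytes, and for the maximum of 2048 bits it is 8192 + 32 bytes, which is why the allocation size is computed per-domain rather than fixed.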




* [PATCH v6 06/12] xen/common: add dom0 xen command line argument for Arm
  2023-04-24  6:02 [PATCH v6 00/12] SVE feature for arm guests Luca Fancellu
                   ` (4 preceding siblings ...)
  2023-04-24  6:02 ` [PATCH v6 05/12] arm/sve: save/restore SVE context switch Luca Fancellu
@ 2023-04-24  6:02 ` Luca Fancellu
  2023-04-24  6:02 ` [PATCH v6 07/12] xen: enable Dom0 to use SVE feature Luca Fancellu
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 56+ messages in thread
From: Luca Fancellu @ 2023-04-24  6:02 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Jan Beulich,
	Wei Liu, Roger Pau Monné

Currently x86 defines a Xen command line argument dom0=<list> in
which dom0 controlling sub-options can be specified. To make it usable
on Arm as well, move the code that loops through the list of arguments
from x86 to common code and, from there, call architecture specific
functions to handle the comma-separated sub-options.

No functional changes are intended.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
---
Changes from v5:
 - Add Bertrand R-by
Changes from v4:
 - return EINVAL in Arm implementation of parse_arch_dom0_param,
   shorten variable name in the funtion from str_begin, str_end to
   s, e. Removed variable rc from x86 parse_arch_dom0_param
   implementation. (Jan)
 - Add R-By Jan
Changes from v3:
 - new patch
---
 xen/arch/arm/domain_build.c |  5 ++++
 xen/arch/x86/dom0_build.c   | 48 ++++++++++++++-----------------------
 xen/common/domain.c         | 23 ++++++++++++++++++
 xen/include/xen/domain.h    |  1 +
 4 files changed, 47 insertions(+), 30 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index ffabe567ac3f..d9450416f665 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -60,6 +60,11 @@ static int __init parse_dom0_mem(const char *s)
 }
 custom_param("dom0_mem", parse_dom0_mem);
 
+int __init parse_arch_dom0_param(const char *s, const char *e)
+{
+    return -EINVAL;
+}
+
 /* Override macros from asm/page.h to make them work with mfn_t */
 #undef virt_to_mfn
 #define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index 79234f18ff01..9f5300a3efbb 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -266,42 +266,30 @@ bool __initdata opt_dom0_pvh = !IS_ENABLED(CONFIG_PV);
 bool __initdata opt_dom0_verbose = IS_ENABLED(CONFIG_VERBOSE_DEBUG);
 bool __initdata opt_dom0_msr_relaxed;
 
-static int __init cf_check parse_dom0_param(const char *s)
+int __init parse_arch_dom0_param(const char *s, const char *e)
 {
-    const char *ss;
-    int rc = 0;
+    int val;
 
-    do {
-        int val;
-
-        ss = strchr(s, ',');
-        if ( !ss )
-            ss = strchr(s, '\0');
-
-        if ( IS_ENABLED(CONFIG_PV) && !cmdline_strcmp(s, "pv") )
-            opt_dom0_pvh = false;
-        else if ( IS_ENABLED(CONFIG_HVM) && !cmdline_strcmp(s, "pvh") )
-            opt_dom0_pvh = true;
+    if ( IS_ENABLED(CONFIG_PV) && !cmdline_strcmp(s, "pv") )
+        opt_dom0_pvh = false;
+    else if ( IS_ENABLED(CONFIG_HVM) && !cmdline_strcmp(s, "pvh") )
+        opt_dom0_pvh = true;
 #ifdef CONFIG_SHADOW_PAGING
-        else if ( (val = parse_boolean("shadow", s, ss)) >= 0 )
-            opt_dom0_shadow = val;
+    else if ( (val = parse_boolean("shadow", s, e)) >= 0 )
+        opt_dom0_shadow = val;
 #endif
-        else if ( (val = parse_boolean("verbose", s, ss)) >= 0 )
-            opt_dom0_verbose = val;
-        else if ( IS_ENABLED(CONFIG_PV) &&
-                  (val = parse_boolean("cpuid-faulting", s, ss)) >= 0 )
-            opt_dom0_cpuid_faulting = val;
-        else if ( (val = parse_boolean("msr-relaxed", s, ss)) >= 0 )
-            opt_dom0_msr_relaxed = val;
-        else
-            rc = -EINVAL;
-
-        s = ss + 1;
-    } while ( *ss );
+    else if ( (val = parse_boolean("verbose", s, e)) >= 0 )
+        opt_dom0_verbose = val;
+    else if ( IS_ENABLED(CONFIG_PV) &&
+              (val = parse_boolean("cpuid-faulting", s, e)) >= 0 )
+        opt_dom0_cpuid_faulting = val;
+    else if ( (val = parse_boolean("msr-relaxed", s, e)) >= 0 )
+        opt_dom0_msr_relaxed = val;
+    else
+        return -EINVAL;
 
-    return rc;
+    return 0;
 }
-custom_param("dom0", parse_dom0_param);
 
 static char __initdata opt_dom0_ioports_disable[200] = "";
 string_param("dom0_ioports_disable", opt_dom0_ioports_disable);
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 626debbae095..7779ba088675 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -364,6 +364,29 @@ static int __init cf_check parse_extra_guest_irqs(const char *s)
 }
 custom_param("extra_guest_irqs", parse_extra_guest_irqs);
 
+static int __init cf_check parse_dom0_param(const char *s)
+{
+    const char *ss;
+    int rc = 0;
+
+    do {
+        int ret;
+
+        ss = strchr(s, ',');
+        if ( !ss )
+            ss = strchr(s, '\0');
+
+        ret = parse_arch_dom0_param(s, ss);
+        if ( ret && !rc )
+            rc = ret;
+
+        s = ss + 1;
+    } while ( *ss );
+
+    return rc;
+}
+custom_param("dom0", parse_dom0_param);
+
 /*
  * Release resources held by a domain.  There may or may not be live
  * references to the domain, and it may or may not be fully constructed.
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index 26f9c4f6dd5b..1df8f933d076 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -16,6 +16,7 @@ typedef union {
 struct vcpu *vcpu_create(struct domain *d, unsigned int vcpu_id);
 
 unsigned int dom0_max_vcpus(void);
+int parse_arch_dom0_param(const char *s, const char *e);
 struct vcpu *alloc_dom0_vcpu0(struct domain *dom0);
 
 int vcpu_reset(struct vcpu *);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v6 07/12] xen: enable Dom0 to use SVE feature
  2023-04-24  6:02 [PATCH v6 00/12] SVE feature for arm guests Luca Fancellu
                   ` (5 preceding siblings ...)
  2023-04-24  6:02 ` [PATCH v6 06/12] xen/common: add dom0 xen command line argument for Arm Luca Fancellu
@ 2023-04-24  6:02 ` Luca Fancellu
  2023-04-24 11:34   ` Jan Beulich
  2023-04-24  6:02 ` [PATCH v6 08/12] xen/physinfo: encode Arm SVE vector length in arch_capabilities Luca Fancellu
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-04-24  6:02 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Andrew Cooper, George Dunlap,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu,
	Volodymyr Babchuk

Add a command line parameter to allow Dom0 to use SVE resources.
The parameter sve=<integer>, a sub-argument of dom0=, controls the
feature for this domain and sets the maximum SVE vector length for
Dom0.

Add a new function, parse_signed_integer(), to parse an integer
command line argument.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
Changes from v5:
 - stop the domain if VL error occurs (Julien, Bertrand)
 - update the documentation
 - Rename sve_sanitize_vl_param to sve_domctl_vl_param to
   mark the fact that we are sanitizing a parameter coming from
   the user before encoding it into sve_vl in domctl structure.
   (suggestion from Bertrand in a separate discussion)
 - update comment in parse_signed_integer, return boolean in
   sve_domctl_vl_param (Jan).
Changes from v4:
 - Negative values as user param means max supported HW VL (Jan)
 - update documentation, make use of no_config_param(), rename
   parse_integer into parse_signed_integer and take long long *,
   also put a comment on the -2 return condition, update
   declaration comment to reflect the modifications (Jan)
Changes from v3:
 - Don't use fixed len types when not needed (Jan)
 - renamed domainconfig_encode_vl to sve_encode_vl
 - Use a sub argument of dom0= to enable the feature (Jan)
 - Add parse_integer() function
Changes from v2:
 - xen_domctl_createdomain field has changed into sve_vl and its
   value now is the VL / 128; create a helper function for that.
Changes from v1:
 - No changes
Changes from RFC:
 - Changed docs to explain that the domain won't be created if the
   requested vector length is above the supported one from the
   platform.
---
 docs/misc/xen-command-line.pandoc    | 20 ++++++++++++++++++--
 xen/arch/arm/arm64/sve.c             | 20 ++++++++++++++++++++
 xen/arch/arm/domain_build.c          | 25 +++++++++++++++++++++++++
 xen/arch/arm/include/asm/arm64/sve.h | 14 ++++++++++++++
 xen/common/kernel.c                  | 25 +++++++++++++++++++++++++
 xen/include/xen/lib.h                | 10 ++++++++++
 6 files changed, 112 insertions(+), 2 deletions(-)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index e0b89b7d3319..47e5b4eb6199 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -777,9 +777,9 @@ Specify the bit width of the DMA heap.
 
 ### dom0
     = List of [ pv | pvh, shadow=<bool>, verbose=<bool>,
-                cpuid-faulting=<bool>, msr-relaxed=<bool> ]
+                cpuid-faulting=<bool>, msr-relaxed=<bool> ] (x86)
 
-    Applicability: x86
+    = List of [ sve=<integer> ] (Arm)
 
 Controls for how dom0 is constructed on x86 systems.
 
@@ -838,6 +838,22 @@ Controls for how dom0 is constructed on x86 systems.
 
     If using this option is necessary to fix an issue, please report a bug.
 
+Enables features on dom0 on Arm systems.
+
+*   The `sve` integer parameter enables Arm SVE usage for the Dom0 domain and
+    sets the maximum SVE vector length; the option is applicable only to
+    AArch64 guests.
+    A value equal to 0 disables the feature; this is the default value.
+    A value below 0 means the feature uses the maximum SVE vector length
+    supported by the hardware, if SVE is supported.
+    A value above 0 explicitly sets the maximum SVE vector length for Dom0;
+    allowed values range from 128 to 2048 and must be a multiple of 128.
+    Please note that when the value is explicitly specified, if that value
+    is above the maximum SVE vector length supported by the hardware, the
+    domain creation will fail and the system will stop. The same will occur
+    if the option is provided with a non-zero value but the platform doesn't
+    support SVE.
+
 ### dom0-cpuid
     = List of comma separated booleans
 
diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
index 064832b450ff..4d964f2b56b4 100644
--- a/xen/arch/arm/arm64/sve.c
+++ b/xen/arch/arm/arm64/sve.c
@@ -14,6 +14,9 @@
 #include <asm/processor.h>
 #include <asm/system.h>
 
+/* opt_dom0_sve: allow Dom0 to use SVE and set maximum vector length. */
+int __initdata opt_dom0_sve;
+
 extern unsigned int sve_get_hw_vl(void);
 extern void sve_save_ctx(uint64_t *sve_ctx, uint64_t *pregs, int save_ffr);
 extern void sve_load_ctx(uint64_t const *sve_ctx, uint64_t const *pregs,
@@ -123,3 +126,20 @@ void sve_restore_state(struct vcpu *v)
 
     sve_load_ctx(v->arch.vfp.sve_zreg_ctx_end, v->arch.vfp.fpregs, 1);
 }
+
+bool __init sve_domctl_vl_param(int val, unsigned int *out)
+{
+    /*
+     * Negative SVE parameter value means to use the maximum supported
+     * vector length, otherwise if a positive value is provided, check if the
+     * vector length is a multiple of 128
+     */
+    if ( val < 0 )
+        *out = get_sys_vl_len();
+    else if ( (val % SVE_VL_MULTIPLE_VAL) == 0 )
+        *out = val;
+    else
+        return false;
+
+    return true;
+}
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index d9450416f665..4a6b73348594 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -62,6 +62,21 @@ custom_param("dom0_mem", parse_dom0_mem);
 
 int __init parse_arch_dom0_param(const char *s, const char *e)
 {
+    long long val;
+
+    if ( !parse_signed_integer("sve", s, e, &val) )
+    {
+#ifdef CONFIG_ARM64_SVE
+        if ( (val >= INT_MIN) && (val <= INT_MAX) )
+            opt_dom0_sve = val;
+        else
+            printk(XENLOG_INFO "'sve=%lld' value out of range!\n", val);
+#else
+        no_config_param("ARM64_SVE", "sve", s, e);
+#endif
+        return 0;
+    }
+
     return -EINVAL;
 }
 
@@ -4117,6 +4132,16 @@ void __init create_dom0(void)
     if ( iommu_enabled )
         dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
 
+    if ( opt_dom0_sve )
+    {
+        unsigned int vl;
+
+        if ( sve_domctl_vl_param(opt_dom0_sve, &vl) )
+            dom0_cfg.arch.sve_vl = sve_encode_vl(vl);
+        else
+            panic("SVE vector length error\n");
+    }
+
     dom0 = domain_create(0, &dom0_cfg, CDF_privileged | CDF_directmap);
     if ( IS_ERR(dom0) )
         panic("Error creating domain 0 (rc = %ld)\n", PTR_ERR(dom0));
diff --git a/xen/arch/arm/include/asm/arm64/sve.h b/xen/arch/arm/include/asm/arm64/sve.h
index 582405dfdf6a..71bddb41f19c 100644
--- a/xen/arch/arm/include/asm/arm64/sve.h
+++ b/xen/arch/arm/include/asm/arm64/sve.h
@@ -19,8 +19,15 @@ static inline unsigned int sve_decode_vl(unsigned int sve_vl)
     return sve_vl * SVE_VL_MULTIPLE_VAL;
 }
 
+static inline unsigned int sve_encode_vl(unsigned int sve_vl_bits)
+{
+    return sve_vl_bits / SVE_VL_MULTIPLE_VAL;
+}
+
 #ifdef CONFIG_ARM64_SVE
 
+extern int opt_dom0_sve;
+
 #define is_sve_domain(d) ((d)->arch.sve_vl > 0)
 
 register_t compute_max_zcr(void);
@@ -30,9 +37,11 @@ int sve_context_init(struct vcpu *v);
 void sve_context_free(struct vcpu *v);
 void sve_save_state(struct vcpu *v);
 void sve_restore_state(struct vcpu *v);
+bool sve_domctl_vl_param(int val, unsigned int *out);
 
 #else /* !CONFIG_ARM64_SVE */
 
+#define opt_dom0_sve     (0)
 #define is_sve_domain(d) (0)
 
 static inline register_t compute_max_zcr(void)
@@ -59,6 +68,11 @@ static inline void sve_context_free(struct vcpu *v) {}
 static inline void sve_save_state(struct vcpu *v) {}
 static inline void sve_restore_state(struct vcpu *v) {}
 
+static inline bool sve_domctl_vl_param(int val, unsigned int *out)
+{
+    return false;
+}
+
 #endif /* CONFIG_ARM64_SVE */
 
 #endif /* _ARM_ARM64_SVE_H */
diff --git a/xen/common/kernel.c b/xen/common/kernel.c
index f7b1f65f373c..b67d9056fec3 100644
--- a/xen/common/kernel.c
+++ b/xen/common/kernel.c
@@ -314,6 +314,31 @@ int parse_boolean(const char *name, const char *s, const char *e)
     return -1;
 }
 
+int __init parse_signed_integer(const char *name, const char *s, const char *e,
+                                long long *val)
+{
+    size_t slen, nlen;
+    const char *str;
+    long long pval;
+
+    slen = e ? ({ ASSERT(e >= s); e - s; }) : strlen(s);
+    nlen = strlen(name);
+
+    /* Check that this is the name we're looking for and a value was provided */
+    if ( (slen <= nlen) || strncmp(s, name, nlen) || (s[nlen] != '=') )
+        return -1;
+
+    pval = simple_strtoll(&s[nlen + 1], &str, 0);
+
+    /* Number not recognised */
+    if ( str != e )
+        return -2;
+
+    *val = pval;
+
+    return 0;
+}
+
 int cmdline_strcmp(const char *frag, const char *name)
 {
     for ( ; ; frag++, name++ )
diff --git a/xen/include/xen/lib.h b/xen/include/xen/lib.h
index e914ccade095..5343ee7a944a 100644
--- a/xen/include/xen/lib.h
+++ b/xen/include/xen/lib.h
@@ -94,6 +94,16 @@ int parse_bool(const char *s, const char *e);
  */
 int parse_boolean(const char *name, const char *s, const char *e);
 
+/**
+ * Given a specific name, parses a string of the form:
+ *   $NAME=<integer number>
+ * returning 0 and a value in val, for a recognised integer.
+ * Returns -1 if the name is not found or on general errors, or -2 if the
+ * name is found but the number is not recognised.
+ */
+int parse_signed_integer(const char *name, const char *s, const char *e,
+                         long long *val);
+
 /**
  * Very similar to strcmp(), but will declare a match if the NUL in 'name'
  * lines up with comma, colon, semicolon or equals in 'frag'.  Designed for
-- 
2.34.1




* [PATCH v6 08/12] xen/physinfo: encode Arm SVE vector length in arch_capabilities
  2023-04-24  6:02 [PATCH v6 00/12] SVE feature for arm guests Luca Fancellu
                   ` (6 preceding siblings ...)
  2023-04-24  6:02 ` [PATCH v6 07/12] xen: enable Dom0 to use SVE feature Luca Fancellu
@ 2023-04-24  6:02 ` Luca Fancellu
  2023-04-24  6:02 ` [PATCH v6 09/12] tools: add physinfo arch_capabilities handling for Arm Luca Fancellu
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 56+ messages in thread
From: Luca Fancellu @ 2023-04-24  6:02 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Jan Beulich,
	Wei Liu

When the Arm platform supports SVE, advertise the feature in the
arch_capabilities field of struct xen_sysctl_physinfo by encoding
the SVE vector length in it.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
---
Changes from v5:
 - Add R-by from Bertrand
Changes from v4:
 - Write arch_capabilities from arch_do_physinfo instead of using
   stub functions (Jan)
Changes from v3:
 - domainconfig_encode_vl is now named sve_encode_vl
Changes from v2:
 - Remove XEN_SYSCTL_PHYSCAP_ARM_SVE_SHFT, use MASK_INSR and
   protect with ifdef XEN_SYSCTL_PHYSCAP_ARM_SVE_MASK (Jan)
 - Use the helper function sve_arch_cap_physinfo to encode
   the VL into physinfo arch_capabilities field.
Changes from v1:
 - Use only arch_capabilities and some defines to encode SVE VL
   (Bertrand, Stefano, Jan)
Changes from RFC:
 - new patch
---
 xen/arch/arm/sysctl.c       | 4 ++++
 xen/include/public/sysctl.h | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/xen/arch/arm/sysctl.c b/xen/arch/arm/sysctl.c
index b0a78a8b10d0..e9a0661146e4 100644
--- a/xen/arch/arm/sysctl.c
+++ b/xen/arch/arm/sysctl.c
@@ -11,11 +11,15 @@
 #include <xen/lib.h>
 #include <xen/errno.h>
 #include <xen/hypercall.h>
+#include <asm/arm64/sve.h>
 #include <public/sysctl.h>
 
 void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
 {
     pi->capabilities |= XEN_SYSCTL_PHYSCAP_hvm | XEN_SYSCTL_PHYSCAP_hap;
+
+    pi->arch_capabilities |= MASK_INSR(sve_encode_vl(get_sys_vl_len()),
+                                       XEN_SYSCTL_PHYSCAP_ARM_SVE_MASK);
 }
 
 long arch_do_sysctl(struct xen_sysctl *sysctl,
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 2b24d6bfd00e..9d06e92d0f6a 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -94,6 +94,10 @@ struct xen_sysctl_tbuf_op {
 /* Max XEN_SYSCTL_PHYSCAP_* constant.  Used for ABI checking. */
 #define XEN_SYSCTL_PHYSCAP_MAX XEN_SYSCTL_PHYSCAP_gnttab_v2
 
+#if defined(__arm__) || defined(__aarch64__)
+#define XEN_SYSCTL_PHYSCAP_ARM_SVE_MASK  (0x1FU)
+#endif
+
 struct xen_sysctl_physinfo {
     uint32_t threads_per_core;
     uint32_t cores_per_socket;
-- 
2.34.1




* [PATCH v6 09/12] tools: add physinfo arch_capabilities handling for Arm
  2023-04-24  6:02 [PATCH v6 00/12] SVE feature for arm guests Luca Fancellu
                   ` (7 preceding siblings ...)
  2023-04-24  6:02 ` [PATCH v6 08/12] xen/physinfo: encode Arm SVE vector length in arch_capabilities Luca Fancellu
@ 2023-04-24  6:02 ` Luca Fancellu
  2023-05-02 16:13   ` Anthony PERARD
  2023-04-24  6:02 ` [PATCH v6 10/12] xen/tools: add sve parameter in XL configuration Luca Fancellu
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-04-24  6:02 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, George Dunlap, Nick Rosbrook,
	Wei Liu, Anthony PERARD, Juergen Gross, Christian Lindig,
	David Scott, Marek Marczykowski-Górecki, Christian Lindig

On Arm, the SVE vector length is encoded in the arch_capabilities field
of struct xen_sysctl_physinfo; make use of this field in the tools
when building for Arm.

Create the header arm-arch-capabilities.h to handle the arch_capabilities
field of physinfo for Arm.

Remove the include of xen-tools/common-macros.h in
python/xen/lowlevel/xc/xc.c because it is already included by the
arm-arch-capabilities.h header.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Christian Lindig <christian.lindig@cloud.com>
---
Changes from v4:
 - Move arm-arch-capabilities.h into xen-tools/, add LIBXL_HAVE_,
   fixed python return type to I instead of i. (Anthony)
Changes from v3:
 - add Ack-by for the Golang bits (George)
 - add Ack-by for the OCaml tools (Christian)
 - now xen-tools/libs.h is named xen-tools/common-macros.h
 - changed commit message to explain why the header modification
   in python/xen/lowlevel/xc/xc.c
Changes from v2:
 - rename arm_arch_capabilities.h in arm-arch-capabilities.h, use
   MASK_EXTR.
 - Now arm-arch-capabilities.h needs MASK_EXTR macro, but it is
   defined in libxl_internal.h, it doesn't feel right to include
   that header so move MASK_EXTR into xen-tools/libs.h that is also
   included in libxl_internal.h
Changes from v1:
 - now SVE VL is encoded in arch_capabilities on Arm
Changes from RFC:
 - new patch
---
 tools/golang/xenlight/helpers.gen.go          |  2 ++
 tools/golang/xenlight/types.gen.go            |  1 +
 tools/include/libxl.h                         |  6 ++++
 .../include/xen-tools/arm-arch-capabilities.h | 28 +++++++++++++++++++
 tools/include/xen-tools/common-macros.h       |  2 ++
 tools/libs/light/libxl.c                      |  1 +
 tools/libs/light/libxl_internal.h             |  1 -
 tools/libs/light/libxl_types.idl              |  1 +
 tools/ocaml/libs/xc/xenctrl.ml                |  4 +--
 tools/ocaml/libs/xc/xenctrl.mli               |  4 +--
 tools/ocaml/libs/xc/xenctrl_stubs.c           |  8 ++++--
 tools/python/xen/lowlevel/xc/xc.c             |  8 ++++--
 tools/xl/xl_info.c                            |  8 ++++++
 13 files changed, 62 insertions(+), 12 deletions(-)
 create mode 100644 tools/include/xen-tools/arm-arch-capabilities.h

diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
index 0a203d22321f..35397be2f9e2 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -3506,6 +3506,7 @@ x.CapVmtrace = bool(xc.cap_vmtrace)
 x.CapVpmu = bool(xc.cap_vpmu)
 x.CapGnttabV1 = bool(xc.cap_gnttab_v1)
 x.CapGnttabV2 = bool(xc.cap_gnttab_v2)
+x.ArchCapabilities = uint32(xc.arch_capabilities)
 
  return nil}
 
@@ -3540,6 +3541,7 @@ xc.cap_vmtrace = C.bool(x.CapVmtrace)
 xc.cap_vpmu = C.bool(x.CapVpmu)
 xc.cap_gnttab_v1 = C.bool(x.CapGnttabV1)
 xc.cap_gnttab_v2 = C.bool(x.CapGnttabV2)
+xc.arch_capabilities = C.uint32_t(x.ArchCapabilities)
 
  return nil
  }
diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
index a7c17699f80e..3d968a496744 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -1079,6 +1079,7 @@ CapVmtrace bool
 CapVpmu bool
 CapGnttabV1 bool
 CapGnttabV2 bool
+ArchCapabilities uint32
 }
 
 type Connectorinfo struct {
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index cfa1a191318c..4fa09ff7635a 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -525,6 +525,12 @@
  */
 #define LIBXL_HAVE_PHYSINFO_CAP_GNTTAB 1
 
+/*
+ * LIBXL_HAVE_PHYSINFO_ARCH_CAPABILITIES indicates that libxl_physinfo has a
+ * arch_capabilities field.
+ */
+#define LIBXL_HAVE_PHYSINFO_ARCH_CAPABILITIES 1
+
 /*
  * LIBXL_HAVE_MAX_GRANT_VERSION indicates libxl_domain_build_info has a
  * max_grant_version field for setting the max grant table version per
diff --git a/tools/include/xen-tools/arm-arch-capabilities.h b/tools/include/xen-tools/arm-arch-capabilities.h
new file mode 100644
index 000000000000..ac44c8b14344
--- /dev/null
+++ b/tools/include/xen-tools/arm-arch-capabilities.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2023 ARM Ltd.
+ */
+
+#ifndef ARM_ARCH_CAPABILITIES_H
+#define ARM_ARCH_CAPABILITIES_H
+
+#include <stdint.h>
+#include <xen/sysctl.h>
+
+#include <xen-tools/common-macros.h>
+
+static inline
+unsigned int arch_capabilities_arm_sve(unsigned int arch_capabilities)
+{
+#if defined(__aarch64__)
+    unsigned int sve_vl = MASK_EXTR(arch_capabilities,
+                                    XEN_SYSCTL_PHYSCAP_ARM_SVE_MASK);
+
+    /* Vector length is divided by 128 before storing it in arch_capabilities */
+    return sve_vl * 128U;
+#else
+    return 0;
+#endif
+}
+
+#endif /* ARM_ARCH_CAPABILITIES_H */
diff --git a/tools/include/xen-tools/common-macros.h b/tools/include/xen-tools/common-macros.h
index 76b55bf62085..d53b88182560 100644
--- a/tools/include/xen-tools/common-macros.h
+++ b/tools/include/xen-tools/common-macros.h
@@ -72,6 +72,8 @@
 #define ROUNDUP(_x,_w) (((unsigned long)(_x)+(1UL<<(_w))-1) & ~((1UL<<(_w))-1))
 #endif
 
+#define MASK_EXTR(v, m) (((v) & (m)) / ((m) & -(m)))
+
 #ifndef __must_check
 #define __must_check __attribute__((__warn_unused_result__))
 #endif
diff --git a/tools/libs/light/libxl.c b/tools/libs/light/libxl.c
index a0bf7d186f69..175d6dde0b80 100644
--- a/tools/libs/light/libxl.c
+++ b/tools/libs/light/libxl.c
@@ -409,6 +409,7 @@ int libxl_get_physinfo(libxl_ctx *ctx, libxl_physinfo *physinfo)
         !!(xcphysinfo.capabilities & XEN_SYSCTL_PHYSCAP_gnttab_v1);
     physinfo->cap_gnttab_v2 =
         !!(xcphysinfo.capabilities & XEN_SYSCTL_PHYSCAP_gnttab_v2);
+    physinfo->arch_capabilities = xcphysinfo.arch_capabilities;
 
     GC_FREE;
     return 0;
diff --git a/tools/libs/light/libxl_internal.h b/tools/libs/light/libxl_internal.h
index 5244fde6239a..8aba3e138909 100644
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -132,7 +132,6 @@
 
 #define DIV_ROUNDUP(n, d) (((n) + (d) - 1) / (d))
 
-#define MASK_EXTR(v, m) (((v) & (m)) / ((m) & -(m)))
 #define MASK_INSR(v, m) (((v) * ((m) & -(m))) & (m))
 
 #define LIBXL__LOGGING_ENABLED
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index c10292e0d7e3..fd31dacf7d5a 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -1133,6 +1133,7 @@ libxl_physinfo = Struct("physinfo", [
     ("cap_vpmu", bool),
     ("cap_gnttab_v1", bool),
     ("cap_gnttab_v2", bool),
+    ("arch_capabilities", uint32),
     ], dir=DIR_OUT)
 
 libxl_connectorinfo = Struct("connectorinfo", [
diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
index e4096bf92c1d..bf23ca50bb15 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -128,12 +128,10 @@ type physinfo_cap_flag =
   | CAP_Gnttab_v1
   | CAP_Gnttab_v2
 
-type arm_physinfo_cap_flag
-
 type x86_physinfo_cap_flag
 
 type arch_physinfo_cap_flags =
-  | ARM of arm_physinfo_cap_flag list
+  | ARM of int
   | X86 of x86_physinfo_cap_flag list
 
 type physinfo =
diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index ef2254537430..ed1e28ea30a0 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -113,12 +113,10 @@ type physinfo_cap_flag =
   | CAP_Gnttab_v1
   | CAP_Gnttab_v2
 
-type arm_physinfo_cap_flag
-
 type x86_physinfo_cap_flag
 
 type arch_physinfo_cap_flags =
-  | ARM of arm_physinfo_cap_flag list
+  | ARM of int
   | X86 of x86_physinfo_cap_flag list
 
 type physinfo = {
diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
index 6ec9ed6d1e6f..526a3610fa42 100644
--- a/tools/ocaml/libs/xc/xenctrl_stubs.c
+++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
@@ -853,13 +853,15 @@ CAMLprim value stub_xc_physinfo(value xch_val)
 	arch_cap_list = Tag_cons;
 
 	arch_cap_flags_tag = 1; /* tag x86 */
-#else
-	caml_failwith("Unhandled architecture");
-#endif
 
 	arch_cap_flags = caml_alloc_small(1, arch_cap_flags_tag);
 	Store_field(arch_cap_flags, 0, arch_cap_list);
 	Store_field(physinfo, 10, arch_cap_flags);
+#elif defined(__aarch64__)
+	Store_field(physinfo, 10, Val_int(c_physinfo.arch_capabilities));
+#else
+	caml_failwith("Unhandled architecture");
+#endif
 
 	CAMLreturn(physinfo);
 }
diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
index 35901c2d63b6..94b0354cf52f 100644
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -22,6 +22,7 @@
 #include <xen/hvm/hvm_info_table.h>
 #include <xen/hvm/params.h>
 
+#include <xen-tools/arm-arch-capabilities.h>
 #include <xen-tools/common-macros.h>
 
 /* Needed for Python versions earlier than 2.3. */
@@ -897,7 +898,7 @@ static PyObject *pyxc_physinfo(XcObject *self)
     if ( p != virt_caps )
       *(p-1) = '\0';
 
-    return Py_BuildValue("{s:i,s:i,s:i,s:i,s:l,s:l,s:l,s:i,s:s,s:s}",
+    return Py_BuildValue("{s:i,s:i,s:i,s:i,s:l,s:l,s:l,s:i,s:s,s:s,s:I}",
                             "nr_nodes",         pinfo.nr_nodes,
                             "threads_per_core", pinfo.threads_per_core,
                             "cores_per_socket", pinfo.cores_per_socket,
@@ -907,7 +908,10 @@ static PyObject *pyxc_physinfo(XcObject *self)
                             "scrub_memory",     pages_to_kib(pinfo.scrub_pages),
                             "cpu_khz",          pinfo.cpu_khz,
                             "hw_caps",          cpu_cap,
-                            "virt_caps",        virt_caps);
+                            "virt_caps",        virt_caps,
+                            "arm_sve_vl",
+                              arch_capabilities_arm_sve(pinfo.arch_capabilities)
+                        );
 }
 
 static PyObject *pyxc_getcpuinfo(XcObject *self, PyObject *args, PyObject *kwds)
diff --git a/tools/xl/xl_info.c b/tools/xl/xl_info.c
index 712b7638b013..ddc42f96b979 100644
--- a/tools/xl/xl_info.c
+++ b/tools/xl/xl_info.c
@@ -27,6 +27,7 @@
 #include <libxl_json.h>
 #include <libxl_utils.h>
 #include <libxlutil.h>
+#include <xen-tools/arm-arch-capabilities.h>
 
 #include "xl.h"
 #include "xl_utils.h"
@@ -224,6 +225,13 @@ static void output_physinfo(void)
          info.cap_gnttab_v2 ? " gnttab-v2" : ""
         );
 
+    /* Print arm SVE vector length only on ARM platforms */
+#if defined(__aarch64__)
+    maybe_printf("arm_sve_vector_length  : %u\n",
+         arch_capabilities_arm_sve(info.arch_capabilities)
+        );
+#endif
+
     vinfo = libxl_get_version_info(ctx);
     if (vinfo) {
         i = (1 << 20) / vinfo->pagesize;
-- 
2.34.1




* [PATCH v6 10/12] xen/tools: add sve parameter in XL configuration
  2023-04-24  6:02 [PATCH v6 00/12] SVE feature for arm guests Luca Fancellu
                   ` (8 preceding siblings ...)
  2023-04-24  6:02 ` [PATCH v6 09/12] tools: add physinfo arch_capabilities handling for Arm Luca Fancellu
@ 2023-04-24  6:02 ` Luca Fancellu
  2023-05-02 17:06   ` Anthony PERARD
  2023-04-24  6:02 ` [PATCH v6 11/12] xen/arm: add sve property for dom0less domUs Luca Fancellu
  2023-04-24  6:02 ` [PATCH v6 12/12] xen/changelog: Add SVE and "dom0" options to the changelog for Arm Luca Fancellu
  11 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-04-24  6:02 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Wei Liu, Anthony PERARD,
	George Dunlap, Nick Rosbrook, Juergen Gross

Add the sve parameter to the XL configuration to allow guests to use
the SVE feature.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
Changes from v5:
 - Update documentation
 - re-generated golang files
Changes from v4:
 - Rename sve field to sve_vl (Anthony), changed type to
   libxl_sve_type
 - Sanity check of sve field in libxl instead of xl, update docs
   (Anthony)
 - drop Ack-by from George because of the changes in the Golang bits
Changes from v3:
 - no changes
Changes from v2:
 - domain configuration field name has changed to sve_vl,
   also its value now is VL/128.
 - Add Ack-by George for the Golang bits
Changes from v1:
 - updated to use arch_capabilities field for vector length
Changes from RFC:
 - changed libxl_types.idl sve field to uint16
 - now toolstack uses info from physinfo to check against the
   sve XL value
 - Changed documentation
---
 docs/man/xl.cfg.5.pod.in             | 16 ++++++++++++++++
 tools/golang/xenlight/helpers.gen.go |  2 ++
 tools/golang/xenlight/types.gen.go   | 23 +++++++++++++++++++++++
 tools/include/libxl.h                |  5 +++++
 tools/libs/light/libxl_arm.c         | 28 ++++++++++++++++++++++++++++
 tools/libs/light/libxl_types.idl     | 22 ++++++++++++++++++++++
 tools/xl/xl_parse.c                  |  8 ++++++++
 7 files changed, 104 insertions(+)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index 10f37990be57..2ea996caa256 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -2952,6 +2952,22 @@ Currently, only the "sbsa_uart" model is supported for ARM.
 
 =back
 
+=item B<sve="vl">
+
+The `sve` parameter enables Arm Scalable Vector Extension (SVE) usage for the
+guest and sets the maximum SVE vector length; the option is applicable only
+to AArch64 guests.
+A value equal to "disabled" disables the feature; this is the default value.
+Allowed values are "disabled", "128", "256", "384", "512", "640", "768", "896",
+"1024", "1152", "1280", "1408", "1536", "1664", "1792", "1920", "2048", "hw".
+Specifying "hw" means that the maximum vector length supported by the platform
+will be used.
+Please be aware that if a specific vector length is passed and its value is
+above the maximum vector length supported by the platform, an error will be
+raised.
+
+=back
+
 =head3 x86
 
 =over 4
diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
index 35397be2f9e2..cd1a16e32eac 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -1149,6 +1149,7 @@ default:
 return fmt.Errorf("invalid union key '%v'", x.Type)}
 x.ArchArm.GicVersion = GicVersion(xc.arch_arm.gic_version)
 x.ArchArm.Vuart = VuartType(xc.arch_arm.vuart)
+x.ArchArm.SveVl = SveType(xc.arch_arm.sve_vl)
 if err := x.ArchX86.MsrRelaxed.fromC(&xc.arch_x86.msr_relaxed);err != nil {
 return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
 }
@@ -1653,6 +1654,7 @@ default:
 return fmt.Errorf("invalid union key '%v'", x.Type)}
 xc.arch_arm.gic_version = C.libxl_gic_version(x.ArchArm.GicVersion)
 xc.arch_arm.vuart = C.libxl_vuart_type(x.ArchArm.Vuart)
+xc.arch_arm.sve_vl = C.libxl_sve_type(x.ArchArm.SveVl)
 if err := x.ArchX86.MsrRelaxed.toC(&xc.arch_x86.msr_relaxed); err != nil {
 return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
 }
diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
index 3d968a496744..b131a7eedc9d 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -490,6 +490,28 @@ TeeTypeNone TeeType = 0
 TeeTypeOptee TeeType = 1
 )
 
+type SveType int
+const(
+SveTypeHw SveType = -1
+SveTypeDisabled SveType = 0
+SveType128 SveType = 128
+SveType256 SveType = 256
+SveType384 SveType = 384
+SveType512 SveType = 512
+SveType640 SveType = 640
+SveType768 SveType = 768
+SveType896 SveType = 896
+SveType1024 SveType = 1024
+SveType1152 SveType = 1152
+SveType1280 SveType = 1280
+SveType1408 SveType = 1408
+SveType1536 SveType = 1536
+SveType1664 SveType = 1664
+SveType1792 SveType = 1792
+SveType1920 SveType = 1920
+SveType2048 SveType = 2048
+)
+
 type RdmReserve struct {
 Strategy RdmReserveStrategy
 Policy RdmReservePolicy
@@ -564,6 +586,7 @@ TypeUnion DomainBuildInfoTypeUnion
 ArchArm struct {
 GicVersion GicVersion
 Vuart VuartType
+SveVl SveType
 }
 ArchX86 struct {
 MsrRelaxed Defbool
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 4fa09ff7635a..cac641a7eba2 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -283,6 +283,11 @@
  */
 #define LIBXL_HAVE_BUILDINFO_ARCH_ARM_TEE 1
 
+/*
+ * libxl_domain_build_info has the arch_arm.sve_vl field.
+ */
+#define LIBXL_HAVE_BUILDINFO_ARCH_ARM_SVE_VL 1
+
 /*
  * LIBXL_HAVE_SOFT_RESET indicates that libxl supports performing
  * 'soft reset' for domains and there is 'soft_reset' shutdown reason
diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index ddc7b2a15975..1e69dac2c4fa 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -3,6 +3,8 @@
 #include "libxl_libfdt_compat.h"
 #include "libxl_arm.h"
 
+#include <xen-tools/arm-arch-capabilities.h>
+
 #include <stdbool.h>
 #include <libfdt.h>
 #include <assert.h>
@@ -211,6 +213,12 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
         return ERROR_FAIL;
     }
 
+    /* Parameter is sanitised in libxl__arch_domain_build_info_setdefault */
+    if (d_config->b_info.arch_arm.sve_vl) {
+        /* Vector length is divided by 128 in struct xen_domctl_createdomain */
+        config->arch.sve_vl = d_config->b_info.arch_arm.sve_vl / 128U;
+    }
+
     return 0;
 }
 
@@ -1681,6 +1689,26 @@ int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
     /* ACPI is disabled by default */
     libxl_defbool_setdefault(&b_info->acpi, false);
 
+    /* Sanitise SVE parameter */
+    if (b_info->arch_arm.sve_vl) {
+        unsigned int max_sve_vl =
+            arch_capabilities_arm_sve(physinfo->arch_capabilities);
+
+        if (!max_sve_vl) {
+            LOG(ERROR, "SVE is unsupported on this machine.");
+            return ERROR_FAIL;
+        }
+
+        if (LIBXL_SVE_TYPE_HW == b_info->arch_arm.sve_vl) {
+            b_info->arch_arm.sve_vl = max_sve_vl;
+        } else if (b_info->arch_arm.sve_vl > max_sve_vl) {
+            LOG(ERROR,
+                "Invalid sve value: %d. Platform supports up to %u bits",
+                b_info->arch_arm.sve_vl, max_sve_vl);
+            return ERROR_FAIL;
+        }
+    }
+
     if (b_info->type != LIBXL_DOMAIN_TYPE_PV)
         return 0;
 
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index fd31dacf7d5a..9e48bb772646 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -523,6 +523,27 @@ libxl_tee_type = Enumeration("tee_type", [
     (1, "optee")
     ], init_val = "LIBXL_TEE_TYPE_NONE")
 
+libxl_sve_type = Enumeration("sve_type", [
+    (-1, "hw"),
+    (0, "disabled"),
+    (128, "128"),
+    (256, "256"),
+    (384, "384"),
+    (512, "512"),
+    (640, "640"),
+    (768, "768"),
+    (896, "896"),
+    (1024, "1024"),
+    (1152, "1152"),
+    (1280, "1280"),
+    (1408, "1408"),
+    (1536, "1536"),
+    (1664, "1664"),
+    (1792, "1792"),
+    (1920, "1920"),
+    (2048, "2048")
+    ], init_val = "LIBXL_SVE_TYPE_DISABLED")
+
 libxl_rdm_reserve = Struct("rdm_reserve", [
     ("strategy",    libxl_rdm_reserve_strategy),
     ("policy",      libxl_rdm_reserve_policy),
@@ -690,6 +711,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
 
     ("arch_arm", Struct(None, [("gic_version", libxl_gic_version),
                                ("vuart", libxl_vuart_type),
+                               ("sve_vl", libxl_sve_type),
                               ])),
     ("arch_x86", Struct(None, [("msr_relaxed", libxl_defbool),
                               ])),
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index 1f6f47daf4e1..f036e56fc239 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -2887,6 +2887,14 @@ skip_usbdev:
         }
     }
 
+    if (!xlu_cfg_get_string (config, "sve", &buf, 1)) {
+        e = libxl_sve_type_from_string(buf, &b_info->arch_arm.sve_vl);
+        if (e) {
+            fprintf(stderr, "Unknown sve \"%s\" specified\n", buf);
+            exit(EXIT_FAILURE);
+        }
+    }
+
     parse_vkb_list(config, d_config);
 
     d_config->virtios = NULL;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 56+ messages in thread
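The libxl hunk above stores the vector length divided by 128 when filling struct xen_domctl_createdomain. A minimal sketch of that encoding follows; the encode helper's name mirrors the series' sve_encode_vl, while the decode counterpart is an assumption added purely for illustration:

```c
#include <assert.h>

/*
 * Sketch of the domctl encoding used above: xen_domctl_createdomain
 * stores the SVE vector length (in bits) divided by 128.
 * sve_encode_vl matches the helper name used in the series;
 * sve_decode_vl is an assumed counterpart for illustration only.
 */
static inline unsigned int sve_encode_vl(unsigned int sve_vl_bits)
{
    return sve_vl_bits / 128U;
}

static inline unsigned int sve_decode_vl(unsigned int sve_vl_encoded)
{
    return sve_vl_encoded * 128U;
}
```

With this encoding, even the maximum VL of 2048 bits encodes to 16, so the value fits comfortably in a small domctl field.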

* [PATCH v6 11/12] xen/arm: add sve property for dom0less domUs
  2023-04-24  6:02 [PATCH v6 00/12] SVE feature for arm guests Luca Fancellu
                   ` (9 preceding siblings ...)
  2023-04-24  6:02 ` [PATCH v6 10/12] xen/tools: add sve parameter in XL configuration Luca Fancellu
@ 2023-04-24  6:02 ` Luca Fancellu
  2023-04-24  6:02 ` [PATCH v6 12/12] xen/changelog: Add SVE and "dom0" options to the changelog for Arm Luca Fancellu
  11 siblings, 0 replies; 56+ messages in thread
From: Luca Fancellu @ 2023-04-24  6:02 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk

Add a device tree property in the dom0less domU configuration
to enable the guest to use SVE.

Update documentation.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
Changes from v5:
 - Stop the domain creation if SVE not supported or SVE VL
   errors (Julien, Bertrand)
 - sve_sanitize_vl_param is now renamed to sve_domctl_vl_param
   and returns a boolean; the affected code is changed accordingly.
 - Reworded documentation.
Changes from v4:
 - Now it is possible to specify the property "sve" for dom0less
   device tree node without any value, that means the platform
   supported VL will be used.
Changes from v3:
 - Now domainconfig_encode_vl is named sve_encode_vl
Changes from v2:
 - xen_domctl_createdomain field name has changed into sve_vl
   and its value is the VL/128, use domainconfig_encode_vl
   to encode a plain VL in bits.
Changes from v1:
 - No changes
Changes from RFC:
 - Changed documentation
---
 docs/misc/arm/device-tree/booting.txt | 16 ++++++++++++++++
 xen/arch/arm/domain_build.c           | 24 ++++++++++++++++++++++++
 2 files changed, 40 insertions(+)

diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt
index 3879340b5e0a..32a0e508c471 100644
--- a/docs/misc/arm/device-tree/booting.txt
+++ b/docs/misc/arm/device-tree/booting.txt
@@ -193,6 +193,22 @@ with the following properties:
     Optional. Handle to a xen,cpupool device tree node that identifies the
     cpupool where the guest will be started at boot.
 
+- sve
+
+    Optional. The `sve` property enables Arm SVE usage for the domain and sets
+    the maximum SVE vector length. The option is applicable only to AArch64
+    guests.
+    A value of 0 disables the feature; this is the default.
+    Specifying the property with no value sets the SVE vector length to the
+    maximum vector length supported by the platform.
+    Values above 0 explicitly set the maximum SVE vector length for the
+    domain; allowed values range from 128 to 2048 and must be multiples of
+    128.
+    Please note that if the user explicitly specifies a value above the
+    maximum SVE vector length supported by the hardware, domain creation will
+    fail and the system will stop. The same occurs if the option is given a
+    non-zero value but the platform does not support SVE.
+
 - xen,enhanced
 
     A string property. Possible property values are:
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 4a6b73348594..b61cc35b8524 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -4011,6 +4011,30 @@ void __init create_domUs(void)
             d_cfg.max_maptrack_frames = val;
         }
 
+        if ( dt_get_property(node, "sve", &val) )
+        {
+            unsigned int sve_vl_bits;
+            bool ret = false;
+
+            if ( !val )
+            {
+                /* Property found with no value, means max HW VL supported */
+                ret = sve_domctl_vl_param(-1, &sve_vl_bits);
+            }
+            else
+            {
+                if ( dt_property_read_u32(node, "sve", &val) )
+                    ret = sve_domctl_vl_param(val, &sve_vl_bits);
+                else
+                    panic("Error reading 'sve' property");
+            }
+
+            if ( ret )
+                d_cfg.arch.sve_vl = sve_encode_vl(sve_vl_bits);
+            else
+                panic("SVE vector length error\n");
+        }
+
         /*
          * The variable max_init_domid is initialized with zero, so here it's
          * very important to use the pre-increment operator to call
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 56+ messages in thread
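To show how the documented property is consumed, here is a hedged dom0less fragment matching the semantics described in the booting.txt hunk above; the memory size, kernel load address, and node names are placeholders for illustration, not values taken from the series:

```dts
domU1 {
    compatible = "xen,domain";
    #address-cells = <1>;
    #size-cells = <1>;
    cpus = <1>;
    memory = <0x0 0x20000>;

    /* Cap the domain's SVE vector length at 256 bits. An empty
     * "sve;" would instead request the platform maximum, and
     * omitting the property leaves SVE disabled. */
    sve = <256>;

    module@42000000 {
        compatible = "multiboot,kernel";
        reg = <0x42000000 0x1000000>;
    };
};
```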

* [PATCH v6 12/12] xen/changelog: Add SVE and "dom0" options to the changelog for Arm
  2023-04-24  6:02 [PATCH v6 00/12] SVE feature for arm guests Luca Fancellu
                   ` (10 preceding siblings ...)
  2023-04-24  6:02 ` [PATCH v6 11/12] xen/arm: add sve property for dom0less domUs Luca Fancellu
@ 2023-04-24  6:02 ` Luca Fancellu
  2023-04-24  7:22   ` Henry Wang
  11 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-04-24  6:02 UTC (permalink / raw)
  To: xen-devel
  Cc: bertrand.marquis, wei.chen, Henry Wang, Community Manager,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Arm can now use the "dom0=" Xen command line option, and support for
guests running SVE instructions has been added; record both entries in
the changelog.

Mention the "Tech Preview" status and add an entry to SUPPORT.md.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
Changes from v5:
 - Add Tech Preview status and add entry in SUPPORT.md (Bertrand)
Changes from v4:
 - No changes
Change from v3:
 - new patch
---
 CHANGELOG.md | 3 +++
 SUPPORT.md   | 6 ++++++
 2 files changed, 9 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 5dbf8b06d72c..c82a03afd2cf 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -9,6 +9,8 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
 ### Changed
  - Repurpose command line gnttab_max_{maptrack_,}frames options so they don't
    cap toolstack provided values.
+ - The "dom0" option is now supported on Arm, and the "sve=" sub-option can
+   be used to allow dom0 to use SVE/SVE2 instructions.
 
 ### Added
  - On x86, support for features new in Intel Sapphire Rapids CPUs:
@@ -18,6 +20,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
    - Bus-lock detection, used by Xen to mitigate (by rate-limiting) the system
      wide impact of a guest misusing atomic instructions.
  - xl/libxl can customize SMBIOS strings for HVM guests.
+ - On Arm, Xen supports guests running SVE/SVE2 instructions. (Tech Preview)
 
 ## [4.17.0](https://xenbits.xen.org/gitweb/?p=xen.git;a=shortlog;h=RELEASE-4.17.0) - 2022-12-12
 
diff --git a/SUPPORT.md b/SUPPORT.md
index aa1940e55f09..3711fc83b48a 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -89,6 +89,12 @@ Extension to the GICv3 interrupt controller to support MSI.
 
     Status: Experimental
 
+### ARM Scalable Vector Extension (SVE/SVE2)
+
+AArch64 guests can use the Scalable Vector Extension (SVE/SVE2).
+
+    Status: Tech Preview
+
 ## Guest Type
 
 ### x86/PV
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 56+ messages in thread

* RE: [PATCH v6 12/12] xen/changelog: Add SVE and "dom0" options to the changelog for Arm
  2023-04-24  6:02 ` [PATCH v6 12/12] xen/changelog: Add SVE and "dom0" options to the changelog for Arm Luca Fancellu
@ 2023-04-24  7:22   ` Henry Wang
  0 siblings, 0 replies; 56+ messages in thread
From: Henry Wang @ 2023-04-24  7:22 UTC (permalink / raw)
  To: Luca Fancellu, xen-devel
  Cc: Bertrand Marquis, Wei Chen, Community Manager, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Wei Liu

Hi Luca,

> -----Original Message-----
> From: Luca Fancellu <luca.fancellu@arm.com>
> Subject: [PATCH v6 12/12] xen/changelog: Add SVE and "dom0" options to the
> changelog for Arm
> 
> Arm now can use the "dom0=" Xen command line option and the support
> for guests running SVE instructions is added, put entries in the
> changelog.
> 
> Mention the "Tech Preview" status and add an entry in SUPPORT.md
> 
> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>

Acked-by: Henry Wang <Henry.Wang@arm.com> # CHANGELOG

Kind regards,
Henry


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v6 07/12] xen: enable Dom0 to use SVE feature
  2023-04-24  6:02 ` [PATCH v6 07/12] xen: enable Dom0 to use SVE feature Luca Fancellu
@ 2023-04-24 11:34   ` Jan Beulich
  2023-04-24 14:00     ` Luca Fancellu
  0 siblings, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2023-04-24 11:34 UTC (permalink / raw)
  To: Luca Fancellu
  Cc: bertrand.marquis, wei.chen, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, Volodymyr Babchuk,
	xen-devel

On 24.04.2023 08:02, Luca Fancellu wrote:
> @@ -30,9 +37,11 @@ int sve_context_init(struct vcpu *v);
>  void sve_context_free(struct vcpu *v);
>  void sve_save_state(struct vcpu *v);
>  void sve_restore_state(struct vcpu *v);
> +bool sve_domctl_vl_param(int val, unsigned int *out);
>  
>  #else /* !CONFIG_ARM64_SVE */
>  
> +#define opt_dom0_sve     (0)
>  #define is_sve_domain(d) (0)
>  
>  static inline register_t compute_max_zcr(void)
> @@ -59,6 +68,11 @@ static inline void sve_context_free(struct vcpu *v) {}
>  static inline void sve_save_state(struct vcpu *v) {}
>  static inline void sve_restore_state(struct vcpu *v) {}
>  
> +static inline bool sve_domctl_vl_param(int val, unsigned int *out)
> +{
> +    return false;
> +}

Once again I don't see the need for this stub: opt_dom0_sve is #define-d
to plain zero when !ARM64_SVE, so the only call site merely requires a
visible declaration, and DCE will take care of eliminating the actual call.

> --- a/xen/common/kernel.c
> +++ b/xen/common/kernel.c
> @@ -314,6 +314,31 @@ int parse_boolean(const char *name, const char *s, const char *e)
>      return -1;
>  }
>  
> +int __init parse_signed_integer(const char *name, const char *s, const char *e,
> +                                long long *val)
> +{
> +    size_t slen, nlen;
> +    const char *str;
> +    long long pval;
> +
> +    slen = e ? ({ ASSERT(e >= s); e - s; }) : strlen(s);

As per this "e" may come in as NULL, meaning that ...

> +    nlen = strlen(name);
> +
> +    /* Check that this is the name we're looking for and a value was provided */
> +    if ( (slen <= nlen) || strncmp(s, name, nlen) || (s[nlen] != '=') )
> +        return -1;
> +
> +    pval = simple_strtoll(&s[nlen + 1], &str, 0);
> +
> +    /* Number not recognised */
> +    if ( str != e )
> +        return -2;

... this is always going to lead to failure in that case. (I guess I could
have spotted this earlier, sorry.)

As a nit, I'd also appreciate if style here (parenthesization in particular)
could match that of parse_boolean(), which doesn't put parentheses around
the operands of comparison operators (a few lines up from here). With the
other function in mind, I'm then not going to pick on the seemingly
redundant (with the subsequent strncmp()) "slen <= nlen", which has an
equivalent there as well.

Jan


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v6 07/12] xen: enable Dom0 to use SVE feature
  2023-04-24 11:34   ` Jan Beulich
@ 2023-04-24 14:00     ` Luca Fancellu
  2023-04-24 14:05       ` Jan Beulich
  0 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-04-24 14:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Bertrand Marquis, Wei Chen, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, Volodymyr Babchuk,
	xen-devel

Hi Jan,

> On 24 Apr 2023, at 12:34, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 24.04.2023 08:02, Luca Fancellu wrote:
>> @@ -30,9 +37,11 @@ int sve_context_init(struct vcpu *v);
>> void sve_context_free(struct vcpu *v);
>> void sve_save_state(struct vcpu *v);
>> void sve_restore_state(struct vcpu *v);
>> +bool sve_domctl_vl_param(int val, unsigned int *out);
>> 
>> #else /* !CONFIG_ARM64_SVE */
>> 
>> +#define opt_dom0_sve     (0)
>> #define is_sve_domain(d) (0)
>> 
>> static inline register_t compute_max_zcr(void)
>> @@ -59,6 +68,11 @@ static inline void sve_context_free(struct vcpu *v) {}
>> static inline void sve_save_state(struct vcpu *v) {}
>> static inline void sve_restore_state(struct vcpu *v) {}
>> 
>> +static inline bool sve_domctl_vl_param(int val, unsigned int *out)
>> +{
>> +    return false;
>> +}
> 
> Once again I don't see the need for this stub: opt_dom0_sve is #define-d
> to plain zero when !ARM64_SVE, so the only call site merely requires a
> visible declaration, and DCE will take care of eliminating the actual call.

I’ve tried to do that: I put the declaration outside the ifdef so that it was always included
and removed the stub, but I got compilation errors because of an undefined function.
For that reason I left that change out.

> 
>> --- a/xen/common/kernel.c
>> +++ b/xen/common/kernel.c
>> @@ -314,6 +314,31 @@ int parse_boolean(const char *name, const char *s, const char *e)
>>     return -1;
>> }
>> 
>> +int __init parse_signed_integer(const char *name, const char *s, const char *e,
>> +                                long long *val)
>> +{
>> +    size_t slen, nlen;
>> +    const char *str;
>> +    long long pval;
>> +
>> +    slen = e ? ({ ASSERT(e >= s); e - s; }) : strlen(s);
> 
> As per this "e" may come in as NULL, meaning that ...
> 
>> +    nlen = strlen(name);
>> +
>> +    /* Check that this is the name we're looking for and a value was provided */
>> +    if ( (slen <= nlen) || strncmp(s, name, nlen) || (s[nlen] != '=') )
>> +        return -1;
>> +
>> +    pval = simple_strtoll(&s[nlen + 1], &str, 0);
>> +
>> +    /* Number not recognised */
>> +    if ( str != e )
>> +        return -2;
> 
> ... this is always going to lead to failure in that case. (I guess I could
> have spotted this earlier, sorry.)
> 
> As a nit, I'd also appreciate if style here (parenthesization in particular)
> could match that of parse_boolean(), which doesn't put parentheses around
> the operands of comparison operators (a few lines up from here). With the
> other function in mind, I'm then not going to pick on the seemingly
> redundant (with the subsequent strncmp()) "slen <= nlen", which has an
> equivalent there as well.

You are right; do you think this will be ok:

diff --git a/xen/common/kernel.c b/xen/common/kernel.c
index b67d9056fec3..7cd00a4c999a 100644
--- a/xen/common/kernel.c
+++ b/xen/common/kernel.c
@@ -324,11 +324,14 @@ int __init parse_signed_integer(const char *name, const char *s, const char *e,
     slen = e ? ({ ASSERT(e >= s); e - s; }) : strlen(s);
     nlen = strlen(name);
 
+    if ( !e )
+        e = s + slen;
+
     /* Check that this is the name we're looking for and a value was provided */
-    if ( (slen <= nlen) || strncmp(s, name, nlen) || (s[nlen] != '=') )
+    if ( slen <= nlen || strncmp(s, name, nlen) || s[nlen] != '=' )
         return -1;
 
-    pval = simple_strtoll(&s[nlen + 1], &str, 0);
+    pval = simple_strtoll(&s[nlen + 1], &str, 10);
 
     /* Number not recognised */
     if ( str != e )


Please note that I’ve also included your comment about the base, which I forgot to add, apologies for that.

slen <= nlen doesn’t seem redundant to me; I have it because I’m accessing s[nlen] and I would like
the string s to be longer than nlen.



> 
> Jan


^ permalink raw reply related	[flat|nested] 56+ messages in thread
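Putting the pieces of the reply above together, a self-contained sketch of the parser with the NULL-'e' handling folded in might look as follows. Note the assumptions: the standard strtoll() stands in for Xen's simple_strtoll(), the Xen-specific ASSERT is dropped, and a 0-on-success return convention is assumed, so this is an illustration rather than the actual Xen code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Illustrative version of parse_signed_integer() with the fix from the
 * reply above: when 'e' comes in as NULL it is normalised to point at
 * the string's terminating nul, so the end-of-number check works in
 * both cases. strtoll() stands in for Xen's simple_strtoll(), and the
 * 0-on-success return value is an assumption of this sketch.
 */
static int parse_signed_integer(const char *name, const char *s,
                                const char *e, long long *val)
{
    size_t slen, nlen;
    const char *str;
    long long pval;

    slen = e ? (size_t)(e - s) : strlen(s);
    nlen = strlen(name);

    if ( !e )
        e = s + slen;

    /* Check that this is the name we're looking for and a value was provided */
    if ( slen <= nlen || strncmp(s, name, nlen) || s[nlen] != '=' )
        return -1;

    pval = strtoll(&s[nlen + 1], (char **)&str, 10);

    /* Number not recognised */
    if ( str != e )
        return -2;

    *val = pval;
    return 0;
}
```

A quick trace of "sve=256" with e == NULL: slen becomes 7, e is normalised to s + 7, and strtoll() leaves str pointing at s + 7, so the str != e check now passes instead of always failing.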

* Re: [PATCH v6 07/12] xen: enable Dom0 to use SVE feature
  2023-04-24 14:00     ` Luca Fancellu
@ 2023-04-24 14:05       ` Jan Beulich
  2023-04-24 14:57         ` Luca Fancellu
  0 siblings, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2023-04-24 14:05 UTC (permalink / raw)
  To: Luca Fancellu
  Cc: Bertrand Marquis, Wei Chen, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, Volodymyr Babchuk,
	xen-devel

On 24.04.2023 16:00, Luca Fancellu wrote:
>> On 24 Apr 2023, at 12:34, Jan Beulich <jbeulich@suse.com> wrote:
>> On 24.04.2023 08:02, Luca Fancellu wrote:
>>> @@ -30,9 +37,11 @@ int sve_context_init(struct vcpu *v);
>>> void sve_context_free(struct vcpu *v);
>>> void sve_save_state(struct vcpu *v);
>>> void sve_restore_state(struct vcpu *v);
>>> +bool sve_domctl_vl_param(int val, unsigned int *out);
>>>
>>> #else /* !CONFIG_ARM64_SVE */
>>>
>>> +#define opt_dom0_sve     (0)
>>> #define is_sve_domain(d) (0)
>>>
>>> static inline register_t compute_max_zcr(void)
>>> @@ -59,6 +68,11 @@ static inline void sve_context_free(struct vcpu *v) {}
>>> static inline void sve_save_state(struct vcpu *v) {}
>>> static inline void sve_restore_state(struct vcpu *v) {}
>>>
>>> +static inline bool sve_domctl_vl_param(int val, unsigned int *out)
>>> +{
>>> +    return false;
>>> +}
>>
>> Once again I don't see the need for this stub: opt_dom0_sve is #define-d
>> to plain zero when !ARM64_SVE, so the only call site merely requires a
>> visible declaration, and DCE will take care of eliminating the actual call.
> 
> I’ve tried to do that, I’ve put the declaration outside the ifdef so that it was always included
> and I removed the stub, but I got errors on compilation because of undefined function.
> For that reason  I left that change out.

Interesting. I don't see where the reference would be coming from.

>>> --- a/xen/common/kernel.c
>>> +++ b/xen/common/kernel.c
>>> @@ -314,6 +314,31 @@ int parse_boolean(const char *name, const char *s, const char *e)
>>>     return -1;
>>> }
>>>
>>> +int __init parse_signed_integer(const char *name, const char *s, const char *e,
>>> +                                long long *val)
>>> +{
>>> +    size_t slen, nlen;
>>> +    const char *str;
>>> +    long long pval;
>>> +
>>> +    slen = e ? ({ ASSERT(e >= s); e - s; }) : strlen(s);
>>
>> As per this "e" may come in as NULL, meaning that ...
>>
>>> +    nlen = strlen(name);
>>> +
>>> +    /* Check that this is the name we're looking for and a value was provided */
>>> +    if ( (slen <= nlen) || strncmp(s, name, nlen) || (s[nlen] != '=') )
>>> +        return -1;
>>> +
>>> +    pval = simple_strtoll(&s[nlen + 1], &str, 0);
>>> +
>>> +    /* Number not recognised */
>>> +    if ( str != e )
>>> +        return -2;
>>
>> ... this is always going to lead to failure in that case. (I guess I could
>> have spotted this earlier, sorry.)
>>
>> As a nit, I'd also appreciate if style here (parenthesization in particular)
>> could match that of parse_boolean(), which doesn't put parentheses around
>> the operands of comparison operators (a few lines up from here). With the
>> other function in mind, I'm then not going to pick on the seemingly
>> redundant (with the subsequent strncmp()) "slen <= nlen", which has an
>> equivalent there as well.
> 
> You are right, do you think this will be ok:

It'll do, I guess.

> --- a/xen/common/kernel.c
> +++ b/xen/common/kernel.c
> @@ -324,11 +324,14 @@ int __init parse_signed_integer(const char *name, const char *s, const char *e,
>      slen = e ? ({ ASSERT(e >= s); e - s; }) : strlen(s);
>      nlen = strlen(name);
>  
> +    if ( !e )
> +        e = s + slen;
> +
>      /* Check that this is the name we're looking for and a value was provided */
> -    if ( (slen <= nlen) || strncmp(s, name, nlen) || (s[nlen] != '=') )
> +    if ( slen <= nlen || strncmp(s, name, nlen) || s[nlen] != '=' )
>          return -1;
>  
> -    pval = simple_strtoll(&s[nlen + 1], &str, 0);
> +    pval = simple_strtoll(&s[nlen + 1], &str, 10);
>  
>      /* Number not recognised */
>      if ( str != e )
> 
> 
> Please note that I’ve also included your comment about the base, which I forgot to add, apologies for that.
> 
> slen <= nlen doesn’t seems redundant to me, I have that because I’m accessing s[nlen] and I would like
> the string s to be at least > nlen

Right, but doesn't strncmp() guarantee that already?

Jan


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v6 07/12] xen: enable Dom0 to use SVE feature
  2023-04-24 14:05       ` Jan Beulich
@ 2023-04-24 14:57         ` Luca Fancellu
  2023-04-24 15:06           ` Jan Beulich
  0 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-04-24 14:57 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Bertrand Marquis, Wei Chen, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, Volodymyr Babchuk,
	xen-devel



> On 24 Apr 2023, at 15:05, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 24.04.2023 16:00, Luca Fancellu wrote:
>>> On 24 Apr 2023, at 12:34, Jan Beulich <jbeulich@suse.com> wrote:
>>> On 24.04.2023 08:02, Luca Fancellu wrote:
>>>> @@ -30,9 +37,11 @@ int sve_context_init(struct vcpu *v);
>>>> void sve_context_free(struct vcpu *v);
>>>> void sve_save_state(struct vcpu *v);
>>>> void sve_restore_state(struct vcpu *v);
>>>> +bool sve_domctl_vl_param(int val, unsigned int *out);
>>>> 
>>>> #else /* !CONFIG_ARM64_SVE */
>>>> 
>>>> +#define opt_dom0_sve     (0)
>>>> #define is_sve_domain(d) (0)
>>>> 
>>>> static inline register_t compute_max_zcr(void)
>>>> @@ -59,6 +68,11 @@ static inline void sve_context_free(struct vcpu *v) {}
>>>> static inline void sve_save_state(struct vcpu *v) {}
>>>> static inline void sve_restore_state(struct vcpu *v) {}
>>>> 
>>>> +static inline bool sve_domctl_vl_param(int val, unsigned int *out)
>>>> +{
>>>> +    return false;
>>>> +}
>>> 
>>> Once again I don't see the need for this stub: opt_dom0_sve is #define-d
>>> to plain zero when !ARM64_SVE, so the only call site merely requires a
>>> visible declaration, and DCE will take care of eliminating the actual call.
>> 
>> I’ve tried to do that, I’ve put the declaration outside the ifdef so that it was always included
>> and I removed the stub, but I got errors on compilation because of undefined function.
>> For that reason  I left that change out.
> 
> Interesting. I don't see where the reference would be coming from.

Could it be because the declaration is visible outside the ifdef, but the definition is not compiled in?

>>>> --- a/xen/common/kernel.c
>>>> +++ b/xen/common/kernel.c
>>>> @@ -314,6 +314,31 @@ int parse_boolean(const char *name, const char *s, const char *e)
>>>>    return -1;
>>>> }
>>>> 
>>>> +int __init parse_signed_integer(const char *name, const char *s, const char *e,
>>>> +                                long long *val)
>>>> +{
>>>> +    size_t slen, nlen;
>>>> +    const char *str;
>>>> +    long long pval;
>>>> +
>>>> +    slen = e ? ({ ASSERT(e >= s); e - s; }) : strlen(s);
>>> 
>>> As per this "e" may come in as NULL, meaning that ...
>>> 
>>>> +    nlen = strlen(name);
>>>> +
>>>> +    /* Check that this is the name we're looking for and a value was provided */
>>>> +    if ( (slen <= nlen) || strncmp(s, name, nlen) || (s[nlen] != '=') )
>>>> +        return -1;
>>>> +
>>>> +    pval = simple_strtoll(&s[nlen + 1], &str, 0);
>>>> +
>>>> +    /* Number not recognised */
>>>> +    if ( str != e )
>>>> +        return -2;
>>> 
>>> ... this is always going to lead to failure in that case. (I guess I could
>>> have spotted this earlier, sorry.)
>>> 
>>> As a nit, I'd also appreciate if style here (parenthesization in particular)
>>> could match that of parse_boolean(), which doesn't put parentheses around
>>> the operands of comparison operators (a few lines up from here). With the
>>> other function in mind, I'm then not going to pick on the seemingly
>>> redundant (with the subsequent strncmp()) "slen <= nlen", which has an
>>> equivalent there as well.
>> 
>> You are right, do you think this will be ok:
> 
> It'll do, I guess.
> 
>> --- a/xen/common/kernel.c
>> +++ b/xen/common/kernel.c
>> @@ -324,11 +324,14 @@ int __init parse_signed_integer(const char *name, const char *s, const char *e,
>>     slen = e ? ({ ASSERT(e >= s); e - s; }) : strlen(s);
>>     nlen = strlen(name);
>> 
>> +    if ( !e )
>> +        e = s + slen;
>> +
>>     /* Check that this is the name we're looking for and a value was provided */
>> -    if ( (slen <= nlen) || strncmp(s, name, nlen) || (s[nlen] != '=') )
>> +    if ( slen <= nlen || strncmp(s, name, nlen) || s[nlen] != '=' )
>>         return -1;
>> 
>> -    pval = simple_strtoll(&s[nlen + 1], &str, 0);
>> +    pval = simple_strtoll(&s[nlen + 1], &str, 10);
>> 
>>     /* Number not recognised */
>>     if ( str != e )
>> 
>> 
>> Please note that I’ve also included your comment about the base, which I forgot to add, apologies for that.
>> 
>> slen <= nlen doesn’t seems redundant to me, I have that because I’m accessing s[nlen] and I would like
>> the string s to be at least > nlen
> 
> Right, but doesn't strncmp() guarantee that already?

I thought strncmp() guarantees that s contains at least nlen chars, meaning indices 0 to nlen-1; is my understanding wrong?

> 
> Jan



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v6 07/12] xen: enable Dom0 to use SVE feature
  2023-04-24 14:57         ` Luca Fancellu
@ 2023-04-24 15:06           ` Jan Beulich
  2023-04-24 15:18             ` Luca Fancellu
  0 siblings, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2023-04-24 15:06 UTC (permalink / raw)
  To: Luca Fancellu
  Cc: Bertrand Marquis, Wei Chen, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, Volodymyr Babchuk,
	xen-devel

On 24.04.2023 16:57, Luca Fancellu wrote:
>> On 24 Apr 2023, at 15:05, Jan Beulich <jbeulich@suse.com> wrote:
>> On 24.04.2023 16:00, Luca Fancellu wrote:
>>>> On 24 Apr 2023, at 12:34, Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 24.04.2023 08:02, Luca Fancellu wrote:
>>>>> @@ -30,9 +37,11 @@ int sve_context_init(struct vcpu *v);
>>>>> void sve_context_free(struct vcpu *v);
>>>>> void sve_save_state(struct vcpu *v);
>>>>> void sve_restore_state(struct vcpu *v);
>>>>> +bool sve_domctl_vl_param(int val, unsigned int *out);
>>>>>
>>>>> #else /* !CONFIG_ARM64_SVE */
>>>>>
>>>>> +#define opt_dom0_sve     (0)
>>>>> #define is_sve_domain(d) (0)
>>>>>
>>>>> static inline register_t compute_max_zcr(void)
>>>>> @@ -59,6 +68,11 @@ static inline void sve_context_free(struct vcpu *v) {}
>>>>> static inline void sve_save_state(struct vcpu *v) {}
>>>>> static inline void sve_restore_state(struct vcpu *v) {}
>>>>>
>>>>> +static inline bool sve_domctl_vl_param(int val, unsigned int *out)
>>>>> +{
>>>>> +    return false;
>>>>> +}
>>>>
>>>> Once again I don't see the need for this stub: opt_dom0_sve is #define-d
>>>> to plain zero when !ARM64_SVE, so the only call site merely requires a
>>>> visible declaration, and DCE will take care of eliminating the actual call.
>>>
>>> I’ve tried to do that, I’ve put the declaration outside the ifdef so that it was always included
>>> and I removed the stub, but I got errors on compilation because of undefined function.
>>> For that reason  I left that change out.
>>
>> Interesting. I don't see where the reference would be coming from.
> 
> Could it be because the declaration is visible, outside the ifdef, but the definition is not compiled in? 

Well, yes, likely. But the question isn't that but "Why did the reference
not get removed, when it's inside an if(0) block?"

>>> --- a/xen/common/kernel.c
>>> +++ b/xen/common/kernel.c
>>> @@ -324,11 +324,14 @@ int __init parse_signed_integer(const char *name, const char *s, const char *e,
>>>     slen = e ? ({ ASSERT(e >= s); e - s; }) : strlen(s);
>>>     nlen = strlen(name);
>>>
>>> +    if ( !e )
>>> +        e = s + slen;
>>> +
>>>     /* Check that this is the name we're looking for and a value was provided */
>>> -    if ( (slen <= nlen) || strncmp(s, name, nlen) || (s[nlen] != '=') )
>>> +    if ( slen <= nlen || strncmp(s, name, nlen) || s[nlen] != '=' )
>>>         return -1;
>>>
>>> -    pval = simple_strtoll(&s[nlen + 1], &str, 0);
>>> +    pval = simple_strtoll(&s[nlen + 1], &str, 10);
>>>
>>>     /* Number not recognised */
>>>     if ( str != e )
>>>
>>>
>>> Please note that I’ve also included your comment about the base, which I forgot to add, apologies for that.
>>>
>>> slen <= nlen doesn’t seems redundant to me, I have that because I’m accessing s[nlen] and I would like
>>> the string s to be at least > nlen
>>
>> Right, but doesn't strncmp() guarantee that already?
> 
> I thought strncmp() guarantees s contains at least nlen chars, meaning from 0 to nlen-1, is my understanding wrong?

That's my understanding too. Translated to C this means "slen >= nlen",
i.e. the "slen < nlen" case is covered. The "slen == nlen" case is then
covered by "s[nlen] != '='", which - due to the earlier guarantee - is
going to be in bounds. That's because even when e is non-NULL and points
at non-nul, it still points into a valid nul-terminated string. (But yes,
I see now that the "slen == nlen" case is a little hairy, so perhaps
indeed best to keep the check as you have it.)

Jan


^ permalink raw reply	[flat|nested] 56+ messages in thread
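The guarantee discussed above can be checked with a few lines: strncmp() stops at the first mismatch or nul, so a zero result against a nul-free name of length nlen implies s[0..nlen-1] are non-nul, which makes the subsequent s[nlen] access in bounds. A small self-contained demonstration (name_matches() is a hypothetical helper written for this check, not a function from the series):

```c
#include <assert.h>
#include <string.h>

/*
 * Demonstration of the point above: if strncmp(s, name, nlen) == 0 and
 * name has no nul among its first nlen characters (nlen == strlen(name)),
 * then s[0..nlen-1] equal those characters and hence are non-nul, so
 * accessing s[nlen] is in bounds (at worst it reads the terminating nul).
 * name_matches() is a hypothetical helper for this demonstration only.
 */
static int name_matches(const char *s, const char *name)
{
    size_t nlen = strlen(name);

    return strncmp(s, name, nlen) == 0 && s[nlen] == '=';
}
```

Each branch of the discussion maps to a case: a full match followed by '=', a match whose s[nlen] is the terminating nul, and a shorter s where strncmp() itself stops early at the nul.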

* Re: [PATCH v6 07/12] xen: enable Dom0 to use SVE feature
  2023-04-24 15:06           ` Jan Beulich
@ 2023-04-24 15:18             ` Luca Fancellu
  2023-04-24 15:25               ` Jan Beulich
  0 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-04-24 15:18 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Bertrand Marquis, Wei Chen, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, Volodymyr Babchuk,
	xen-devel



> On 24 Apr 2023, at 16:06, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 24.04.2023 16:57, Luca Fancellu wrote:
>>> On 24 Apr 2023, at 15:05, Jan Beulich <jbeulich@suse.com> wrote:
>>> On 24.04.2023 16:00, Luca Fancellu wrote:
>>>>> On 24 Apr 2023, at 12:34, Jan Beulich <jbeulich@suse.com> wrote:
>>>>> On 24.04.2023 08:02, Luca Fancellu wrote:
>>>>>> @@ -30,9 +37,11 @@ int sve_context_init(struct vcpu *v);
>>>>>> void sve_context_free(struct vcpu *v);
>>>>>> void sve_save_state(struct vcpu *v);
>>>>>> void sve_restore_state(struct vcpu *v);
>>>>>> +bool sve_domctl_vl_param(int val, unsigned int *out);
>>>>>> 
>>>>>> #else /* !CONFIG_ARM64_SVE */
>>>>>> 
>>>>>> +#define opt_dom0_sve     (0)
>>>>>> #define is_sve_domain(d) (0)
>>>>>> 
>>>>>> static inline register_t compute_max_zcr(void)
>>>>>> @@ -59,6 +68,11 @@ static inline void sve_context_free(struct vcpu *v) {}
>>>>>> static inline void sve_save_state(struct vcpu *v) {}
>>>>>> static inline void sve_restore_state(struct vcpu *v) {}
>>>>>> 
>>>>>> +static inline bool sve_domctl_vl_param(int val, unsigned int *out)
>>>>>> +{
>>>>>> +    return false;
>>>>>> +}
>>>>> 
>>>>> Once again I don't see the need for this stub: opt_dom0_sve is #define-d
>>>>> to plain zero when !ARM64_SVE, so the only call site merely requires a
>>>>> visible declaration, and DCE will take care of eliminating the actual call.
>>>> 
>>>> I’ve tried to do that, I’ve put the declaration outside the ifdef so that it was always included
>>>> and I removed the stub, but I got errors on compilation because of undefined function.
>>>> For that reason  I left that change out.
>>> 
>>> Interesting. I don't see where the reference would be coming from.
>> 
>> Could it be because the declaration is visible, outside the ifdef, but the definition is not compiled in? 
> 
> Well, yes, likely. But the question isn't that but "Why did the reference
> not get removed, when it's inside an if(0) block?"

Oh ok, I don’t know; here is what I get if, for example, I build arm32:

arm-linux-gnueabihf-ld -EL -T arch/arm/xen.lds -N prelink.o \
./common/symbols-dummy.o -o ./.xen-syms.0
arm-linux-gnueabihf-ld: prelink.o: in function `create_domUs':
(.init.text+0x13464): undefined reference to `sve_domctl_vl_param'
arm-linux-gnueabihf-ld: (.init.text+0x136b4): undefined reference to `sve_domctl_vl_param'
arm-linux-gnueabihf-ld: ./.xen-syms.0: hidden symbol `sve_domctl_vl_param' isn't defined
arm-linux-gnueabihf-ld: final link failed: bad value
make[3]: *** [/data_sdc/lucfan01/kirkstone_xen/xen/xen/arch/arm/Makefile:95: xen-syms] Error 1
make[2]: *** [/data_sdc/lucfan01/kirkstone_xen/xen/xen/./build.mk:90: xen] Error 2
make[1]: *** [/data_sdc/lucfan01/kirkstone_xen/xen/xen/Makefile:590: xen] Error 2
make[1]: Leaving directory '/data_sdc/lucfan01/kirkstone_xen/build/xen-qemu-arm32'
make: *** [Makefile:181: __sub-make] Error 2
make: Leaving directory '/data_sdc/lucfan01/kirkstone_xen/xen/xen'

These are the modifications I’ve made:

diff --git a/xen/arch/arm/include/asm/arm64/sve.h b/xen/arch/arm/include/asm/arm64/sve.h
index 71bddb41f19c..330c47ea8864 100644
--- a/xen/arch/arm/include/asm/arm64/sve.h
+++ b/xen/arch/arm/include/asm/arm64/sve.h
@@ -24,6 +24,8 @@ static inline unsigned int sve_encode_vl(unsigned int sve_vl_bits)
     return sve_vl_bits / SVE_VL_MULTIPLE_VAL;
 }
 
+bool sve_domctl_vl_param(int val, unsigned int *out);
+
 #ifdef CONFIG_ARM64_SVE
 
 extern int opt_dom0_sve;
@@ -37,7 +39,6 @@ int sve_context_init(struct vcpu *v);
 void sve_context_free(struct vcpu *v);
 void sve_save_state(struct vcpu *v);
 void sve_restore_state(struct vcpu *v);
-bool sve_domctl_vl_param(int val, unsigned int *out);
 
 #else /* !CONFIG_ARM64_SVE */
 
@@ -68,11 +69,6 @@ static inline void sve_context_free(struct vcpu *v) {}
 static inline void sve_save_state(struct vcpu *v) {}
 static inline void sve_restore_state(struct vcpu *v) {}
 
-static inline bool sve_domctl_vl_param(int val, unsigned int *out)
-{
-    return false;
-}
-
 #endif /* CONFIG_ARM64_SVE */
 
 #endif /* _ARM_ARM64_SVE_H */


> 
>>>> --- a/xen/common/kernel.c
>>>> +++ b/xen/common/kernel.c
>>>> @@ -324,11 +324,14 @@ int __init parse_signed_integer(const char *name, const char *s, const char *e,
>>>>    slen = e ? ({ ASSERT(e >= s); e - s; }) : strlen(s);
>>>>    nlen = strlen(name);
>>>> 
>>>> +    if ( !e )
>>>> +        e = s + slen;
>>>> +
>>>>    /* Check that this is the name we're looking for and a value was provided */
>>>> -    if ( (slen <= nlen) || strncmp(s, name, nlen) || (s[nlen] != '=') )
>>>> +    if ( slen <= nlen || strncmp(s, name, nlen) || s[nlen] != '=' )
>>>>        return -1;
>>>> 
>>>> -    pval = simple_strtoll(&s[nlen + 1], &str, 0);
>>>> +    pval = simple_strtoll(&s[nlen + 1], &str, 10);
>>>> 
>>>>    /* Number not recognised */
>>>>    if ( str != e )
>>>> 
>>>> 
>>>> Please note that I’ve also included your comment about the base, which I forgot to add, apologies for that.
>>>> 
>>>> slen <= nlen doesn’t seems redundant to me, I have that because I’m accessing s[nlen] and I would like
>>>> the string s to be at least > nlen
>>> 
>>> Right, but doesn't strncmp() guarantee that already?
>> 
>> I thought strncmp() guarantees s contains at least nlen chars, meaning from 0 to nlen-1, is my understanding wrong?
> 
> That's my understanding too. Translated to C this means "slen >= nlen",
> i.e. the "slen < nlen" case is covered. The "slen == nlen" case is then
> covered by "s[nlen] != '='", which - due to the earlier guarantee - is
> going to be in bounds. That's because even when e is non-NULL and points
> at non-nul, it still points into a valid nul-terminated string. (But yes,
> I see now that the "slen == nlen" case is a little hairy, so perhaps
> indeed best to keep the check as you have it.)
> 
> Jan




* Re: [PATCH v6 07/12] xen: enable Dom0 to use SVE feature
  2023-04-24 15:18             ` Luca Fancellu
@ 2023-04-24 15:25               ` Jan Beulich
  2023-04-24 15:34                 ` Luca Fancellu
  0 siblings, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2023-04-24 15:25 UTC (permalink / raw)
  To: Luca Fancellu
  Cc: Bertrand Marquis, Wei Chen, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, Volodymyr Babchuk,
	xen-devel

On 24.04.2023 17:18, Luca Fancellu wrote:
>> On 24 Apr 2023, at 16:06, Jan Beulich <jbeulich@suse.com> wrote:
>> On 24.04.2023 16:57, Luca Fancellu wrote:
>>>> On 24 Apr 2023, at 15:05, Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 24.04.2023 16:00, Luca Fancellu wrote:
>>>>>> On 24 Apr 2023, at 12:34, Jan Beulich <jbeulich@suse.com> wrote:
>>>>>> On 24.04.2023 08:02, Luca Fancellu wrote:
>>>>>>> @@ -30,9 +37,11 @@ int sve_context_init(struct vcpu *v);
>>>>>>> void sve_context_free(struct vcpu *v);
>>>>>>> void sve_save_state(struct vcpu *v);
>>>>>>> void sve_restore_state(struct vcpu *v);
>>>>>>> +bool sve_domctl_vl_param(int val, unsigned int *out);
>>>>>>>
>>>>>>> #else /* !CONFIG_ARM64_SVE */
>>>>>>>
>>>>>>> +#define opt_dom0_sve     (0)
>>>>>>> #define is_sve_domain(d) (0)
>>>>>>>
>>>>>>> static inline register_t compute_max_zcr(void)
>>>>>>> @@ -59,6 +68,11 @@ static inline void sve_context_free(struct vcpu *v) {}
>>>>>>> static inline void sve_save_state(struct vcpu *v) {}
>>>>>>> static inline void sve_restore_state(struct vcpu *v) {}
>>>>>>>
>>>>>>> +static inline bool sve_domctl_vl_param(int val, unsigned int *out)
>>>>>>> +{
>>>>>>> +    return false;
>>>>>>> +}
>>>>>>
>>>>>> Once again I don't see the need for this stub: opt_dom0_sve is #define-d
>>>>>> to plain zero when !ARM64_SVE, so the only call site merely requires a
>>>>>> visible declaration, and DCE will take care of eliminating the actual call.
>>>>>
>>>>> I’ve tried to do that, I’ve put the declaration outside the ifdef so that it was always included
>>>>> and I removed the stub, but I got errors on compilation because of undefined function.
>>>>> For that reason  I left that change out.
>>>>
>>>> Interesting. I don't see where the reference would be coming from.
>>>
>>> Could it be because the declaration is visible, outside the ifdef, but the definition is not compiled in? 
>>
>> Well, yes, likely. But the question isn't that but "Why did the reference
>> not get removed, when it's inside an if(0) block?"
> 
> Oh ok, I don’t know, here what I get if for example I build arm32:
> 
> arm-linux-gnueabihf-ld -EL -T arch/arm/xen.lds -N prelink.o \
> ./common/symbols-dummy.o -o ./.xen-syms.0
> arm-linux-gnueabihf-ld: prelink.o: in function `create_domUs':
> (.init.text+0x13464): undefined reference to `sve_domctl_vl_param'

In particular with seeing this: What you copied here is a build with the
series applied only up to this patch? I ask because the patch here adds a
call only out of create_dom0().

Jan

> arm-linux-gnueabihf-ld: (.init.text+0x136b4): undefined reference to `sve_domctl_vl_param'
> arm-linux-gnueabihf-ld: ./.xen-syms.0: hidden symbol `sve_domctl_vl_param' isn't defined
> arm-linux-gnueabihf-ld: final link failed: bad value
> make[3]: *** [/data_sdc/lucfan01/kirkstone_xen/xen/xen/arch/arm/Makefile:95: xen-syms] Error 1
> make[2]: *** [/data_sdc/lucfan01/kirkstone_xen/xen/xen/./build.mk:90: xen] Error 2
> make[1]: *** [/data_sdc/lucfan01/kirkstone_xen/xen/xen/Makefile:590: xen] Error 2
> make[1]: Leaving directory '/data_sdc/lucfan01/kirkstone_xen/build/xen-qemu-arm32'
> make: *** [Makefile:181: __sub-make] Error 2
> make: Leaving directory '/data_sdc/lucfan01/kirkstone_xen/xen/xen’
> 
> These are the modification I’ve done:
> 
> diff --git a/xen/arch/arm/include/asm/arm64/sve.h b/xen/arch/arm/include/asm/arm64/sve.h
> index 71bddb41f19c..330c47ea8864 100644
> --- a/xen/arch/arm/include/asm/arm64/sve.h
> +++ b/xen/arch/arm/include/asm/arm64/sve.h
> @@ -24,6 +24,8 @@ static inline unsigned int sve_encode_vl(unsigned int sve_vl_bits)
>      return sve_vl_bits / SVE_VL_MULTIPLE_VAL;
>  }
>  
> +bool sve_domctl_vl_param(int val, unsigned int *out);
> +
>  #ifdef CONFIG_ARM64_SVE
>  
>  extern int opt_dom0_sve;
> @@ -37,7 +39,6 @@ int sve_context_init(struct vcpu *v);
>  void sve_context_free(struct vcpu *v);
>  void sve_save_state(struct vcpu *v);
>  void sve_restore_state(struct vcpu *v);
> -bool sve_domctl_vl_param(int val, unsigned int *out);
>  
>  #else /* !CONFIG_ARM64_SVE */
>  
> @@ -68,11 +69,6 @@ static inline void sve_context_free(struct vcpu *v) {}
>  static inline void sve_save_state(struct vcpu *v) {}
>  static inline void sve_restore_state(struct vcpu *v) {}
>  
> -static inline bool sve_domctl_vl_param(int val, unsigned int *out)
> -{
> -    return false;
> -}
> -
>  #endif /* CONFIG_ARM64_SVE */
>  
>  #endif /* _ARM_ARM64_SVE_H */




* Re: [PATCH v6 07/12] xen: enable Dom0 to use SVE feature
  2023-04-24 15:25               ` Jan Beulich
@ 2023-04-24 15:34                 ` Luca Fancellu
  2023-04-24 15:41                   ` Jan Beulich
  0 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-04-24 15:34 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Bertrand Marquis, Wei Chen, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, Volodymyr Babchuk,
	xen-devel



> On 24 Apr 2023, at 16:25, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 24.04.2023 17:18, Luca Fancellu wrote:
>>> On 24 Apr 2023, at 16:06, Jan Beulich <jbeulich@suse.com> wrote:
>>> On 24.04.2023 16:57, Luca Fancellu wrote:
>>>>> On 24 Apr 2023, at 15:05, Jan Beulich <jbeulich@suse.com> wrote:
>>>>> On 24.04.2023 16:00, Luca Fancellu wrote:
>>>>>>> On 24 Apr 2023, at 12:34, Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>> On 24.04.2023 08:02, Luca Fancellu wrote:
>>>>>>>> @@ -30,9 +37,11 @@ int sve_context_init(struct vcpu *v);
>>>>>>>> void sve_context_free(struct vcpu *v);
>>>>>>>> void sve_save_state(struct vcpu *v);
>>>>>>>> void sve_restore_state(struct vcpu *v);
>>>>>>>> +bool sve_domctl_vl_param(int val, unsigned int *out);
>>>>>>>> 
>>>>>>>> #else /* !CONFIG_ARM64_SVE */
>>>>>>>> 
>>>>>>>> +#define opt_dom0_sve     (0)
>>>>>>>> #define is_sve_domain(d) (0)
>>>>>>>> 
>>>>>>>> static inline register_t compute_max_zcr(void)
>>>>>>>> @@ -59,6 +68,11 @@ static inline void sve_context_free(struct vcpu *v) {}
>>>>>>>> static inline void sve_save_state(struct vcpu *v) {}
>>>>>>>> static inline void sve_restore_state(struct vcpu *v) {}
>>>>>>>> 
>>>>>>>> +static inline bool sve_domctl_vl_param(int val, unsigned int *out)
>>>>>>>> +{
>>>>>>>> +    return false;
>>>>>>>> +}
>>>>>>> 
>>>>>>> Once again I don't see the need for this stub: opt_dom0_sve is #define-d
>>>>>>> to plain zero when !ARM64_SVE, so the only call site merely requires a
>>>>>>> visible declaration, and DCE will take care of eliminating the actual call.
>>>>>> 
>>>>>> I’ve tried to do that, I’ve put the declaration outside the ifdef so that it was always included
>>>>>> and I removed the stub, but I got errors on compilation because of undefined function.
>>>>>> For that reason  I left that change out.
>>>>> 
>>>>> Interesting. I don't see where the reference would be coming from.
>>>> 
>>>> Could it be because the declaration is visible, outside the ifdef, but the definition is not compiled in? 
>>> 
>>> Well, yes, likely. But the question isn't that but "Why did the reference
>>> not get removed, when it's inside an if(0) block?"
>> 
>> Oh ok, I don’t know, here what I get if for example I build arm32:
>> 
>> arm-linux-gnueabihf-ld -EL -T arch/arm/xen.lds -N prelink.o \
>> ./common/symbols-dummy.o -o ./.xen-syms.0
>> arm-linux-gnueabihf-ld: prelink.o: in function `create_domUs':
>> (.init.text+0x13464): undefined reference to `sve_domctl_vl_param'
> 
> In particular with seeing this: What you copied here is a build with the
> series applied only up to this patch? I ask because the patch here adds a
> call only out of create_dom0().

No, I made the changes on top of the series. I’ve tried it now with the series applied only up to this patch, and it builds correctly;
it was my mistake not to read the error output carefully.

Anyway, I guess this change is not applicable, because we don’t have a symbol that is plain 0 for domUs
that could be used inside create_domUs.

> 
> Jan
> 
>> arm-linux-gnueabihf-ld: (.init.text+0x136b4): undefined reference to `sve_domctl_vl_param'
>> arm-linux-gnueabihf-ld: ./.xen-syms.0: hidden symbol `sve_domctl_vl_param' isn't defined
>> arm-linux-gnueabihf-ld: final link failed: bad value
>> make[3]: *** [/data_sdc/lucfan01/kirkstone_xen/xen/xen/arch/arm/Makefile:95: xen-syms] Error 1
>> make[2]: *** [/data_sdc/lucfan01/kirkstone_xen/xen/xen/./build.mk:90: xen] Error 2
>> make[1]: *** [/data_sdc/lucfan01/kirkstone_xen/xen/xen/Makefile:590: xen] Error 2
>> make[1]: Leaving directory '/data_sdc/lucfan01/kirkstone_xen/build/xen-qemu-arm32'
>> make: *** [Makefile:181: __sub-make] Error 2
>> make: Leaving directory '/data_sdc/lucfan01/kirkstone_xen/xen/xen’
>> 
>> These are the modification I’ve done:
>> 
>> diff --git a/xen/arch/arm/include/asm/arm64/sve.h b/xen/arch/arm/include/asm/arm64/sve.h
>> index 71bddb41f19c..330c47ea8864 100644
>> --- a/xen/arch/arm/include/asm/arm64/sve.h
>> +++ b/xen/arch/arm/include/asm/arm64/sve.h
>> @@ -24,6 +24,8 @@ static inline unsigned int sve_encode_vl(unsigned int sve_vl_bits)
>>     return sve_vl_bits / SVE_VL_MULTIPLE_VAL;
>> }
>> 
>> +bool sve_domctl_vl_param(int val, unsigned int *out);
>> +
>> #ifdef CONFIG_ARM64_SVE
>> 
>> extern int opt_dom0_sve;
>> @@ -37,7 +39,6 @@ int sve_context_init(struct vcpu *v);
>> void sve_context_free(struct vcpu *v);
>> void sve_save_state(struct vcpu *v);
>> void sve_restore_state(struct vcpu *v);
>> -bool sve_domctl_vl_param(int val, unsigned int *out);
>> 
>> #else /* !CONFIG_ARM64_SVE */
>> 
>> @@ -68,11 +69,6 @@ static inline void sve_context_free(struct vcpu *v) {}
>> static inline void sve_save_state(struct vcpu *v) {}
>> static inline void sve_restore_state(struct vcpu *v) {}
>> 
>> -static inline bool sve_domctl_vl_param(int val, unsigned int *out)
>> -{
>> -    return false;
>> -}
>> -
>> #endif /* CONFIG_ARM64_SVE */
>> 
>> #endif /* _ARM_ARM64_SVE_H */




* Re: [PATCH v6 07/12] xen: enable Dom0 to use SVE feature
  2023-04-24 15:34                 ` Luca Fancellu
@ 2023-04-24 15:41                   ` Jan Beulich
  2023-04-24 15:43                     ` Luca Fancellu
  0 siblings, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2023-04-24 15:41 UTC (permalink / raw)
  To: Luca Fancellu
  Cc: Bertrand Marquis, Wei Chen, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, Volodymyr Babchuk,
	xen-devel

On 24.04.2023 17:34, Luca Fancellu wrote:
>> On 24 Apr 2023, at 16:25, Jan Beulich <jbeulich@suse.com> wrote:
>> On 24.04.2023 17:18, Luca Fancellu wrote:
>>>> On 24 Apr 2023, at 16:06, Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 24.04.2023 16:57, Luca Fancellu wrote:
>>>>>> On 24 Apr 2023, at 15:05, Jan Beulich <jbeulich@suse.com> wrote:
>>>>>> On 24.04.2023 16:00, Luca Fancellu wrote:
>>>>>>>> On 24 Apr 2023, at 12:34, Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>>> On 24.04.2023 08:02, Luca Fancellu wrote:
>>>>>>>>> @@ -30,9 +37,11 @@ int sve_context_init(struct vcpu *v);
>>>>>>>>> void sve_context_free(struct vcpu *v);
>>>>>>>>> void sve_save_state(struct vcpu *v);
>>>>>>>>> void sve_restore_state(struct vcpu *v);
>>>>>>>>> +bool sve_domctl_vl_param(int val, unsigned int *out);
>>>>>>>>>
>>>>>>>>> #else /* !CONFIG_ARM64_SVE */
>>>>>>>>>
>>>>>>>>> +#define opt_dom0_sve     (0)
>>>>>>>>> #define is_sve_domain(d) (0)
>>>>>>>>>
>>>>>>>>> static inline register_t compute_max_zcr(void)
>>>>>>>>> @@ -59,6 +68,11 @@ static inline void sve_context_free(struct vcpu *v) {}
>>>>>>>>> static inline void sve_save_state(struct vcpu *v) {}
>>>>>>>>> static inline void sve_restore_state(struct vcpu *v) {}
>>>>>>>>>
>>>>>>>>> +static inline bool sve_domctl_vl_param(int val, unsigned int *out)
>>>>>>>>> +{
>>>>>>>>> +    return false;
>>>>>>>>> +}
>>>>>>>>
>>>>>>>> Once again I don't see the need for this stub: opt_dom0_sve is #define-d
>>>>>>>> to plain zero when !ARM64_SVE, so the only call site merely requires a
>>>>>>>> visible declaration, and DCE will take care of eliminating the actual call.
>>>>>>>
>>>>>>> I’ve tried to do that, I’ve put the declaration outside the ifdef so that it was always included
>>>>>>> and I removed the stub, but I got errors on compilation because of undefined function.
>>>>>>> For that reason  I left that change out.
>>>>>>
>>>>>> Interesting. I don't see where the reference would be coming from.
>>>>>
>>>>> Could it be because the declaration is visible, outside the ifdef, but the definition is not compiled in? 
>>>>
>>>> Well, yes, likely. But the question isn't that but "Why did the reference
>>>> not get removed, when it's inside an if(0) block?"
>>>
>>> Oh ok, I don’t know, here what I get if for example I build arm32:
>>>
>>> arm-linux-gnueabihf-ld -EL -T arch/arm/xen.lds -N prelink.o \
>>> ./common/symbols-dummy.o -o ./.xen-syms.0
>>> arm-linux-gnueabihf-ld: prelink.o: in function `create_domUs':
>>> (.init.text+0x13464): undefined reference to `sve_domctl_vl_param'
>>
>> In particular with seeing this: What you copied here is a build with the
>> series applied only up to this patch? I ask because the patch here adds a
>> call only out of create_dom0().
> 
> No I’ve do the changes on top of the serie, I’ve tried it now, only to this patch and it builds correctly,
> It was my mistake to don’t read carefully the error output.
> 
> Anyway I guess this change is not applicable because we don’t have a symbol that is plain 0 for domUs
> to be placed inside create_domUs.

Possible, but would you mind first telling me in which other patch(es) the
further reference(s) are being introduced, so I could take a look without
(again) digging through the entire series?

Jan



* Re: [PATCH v6 07/12] xen: enable Dom0 to use SVE feature
  2023-04-24 15:41                   ` Jan Beulich
@ 2023-04-24 15:43                     ` Luca Fancellu
  2023-04-24 16:10                       ` Jan Beulich
  0 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-04-24 15:43 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Bertrand Marquis, Wei Chen, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, Volodymyr Babchuk,
	xen-devel



> On 24 Apr 2023, at 16:41, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 24.04.2023 17:34, Luca Fancellu wrote:
>>> On 24 Apr 2023, at 16:25, Jan Beulich <jbeulich@suse.com> wrote:
>>> On 24.04.2023 17:18, Luca Fancellu wrote:
>>>>> On 24 Apr 2023, at 16:06, Jan Beulich <jbeulich@suse.com> wrote:
>>>>> On 24.04.2023 16:57, Luca Fancellu wrote:
>>>>>>> On 24 Apr 2023, at 15:05, Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>> On 24.04.2023 16:00, Luca Fancellu wrote:
>>>>>>>>> On 24 Apr 2023, at 12:34, Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>>>> On 24.04.2023 08:02, Luca Fancellu wrote:
>>>>>>>>>> @@ -30,9 +37,11 @@ int sve_context_init(struct vcpu *v);
>>>>>>>>>> void sve_context_free(struct vcpu *v);
>>>>>>>>>> void sve_save_state(struct vcpu *v);
>>>>>>>>>> void sve_restore_state(struct vcpu *v);
>>>>>>>>>> +bool sve_domctl_vl_param(int val, unsigned int *out);
>>>>>>>>>> 
>>>>>>>>>> #else /* !CONFIG_ARM64_SVE */
>>>>>>>>>> 
>>>>>>>>>> +#define opt_dom0_sve     (0)
>>>>>>>>>> #define is_sve_domain(d) (0)
>>>>>>>>>> 
>>>>>>>>>> static inline register_t compute_max_zcr(void)
>>>>>>>>>> @@ -59,6 +68,11 @@ static inline void sve_context_free(struct vcpu *v) {}
>>>>>>>>>> static inline void sve_save_state(struct vcpu *v) {}
>>>>>>>>>> static inline void sve_restore_state(struct vcpu *v) {}
>>>>>>>>>> 
>>>>>>>>>> +static inline bool sve_domctl_vl_param(int val, unsigned int *out)
>>>>>>>>>> +{
>>>>>>>>>> +    return false;
>>>>>>>>>> +}
>>>>>>>>> 
>>>>>>>>> Once again I don't see the need for this stub: opt_dom0_sve is #define-d
>>>>>>>>> to plain zero when !ARM64_SVE, so the only call site merely requires a
>>>>>>>>> visible declaration, and DCE will take care of eliminating the actual call.
>>>>>>>> 
>>>>>>>> I’ve tried to do that, I’ve put the declaration outside the ifdef so that it was always included
>>>>>>>> and I removed the stub, but I got errors on compilation because of undefined function.
>>>>>>>> For that reason  I left that change out.
>>>>>>> 
>>>>>>> Interesting. I don't see where the reference would be coming from.
>>>>>> 
>>>>>> Could it be because the declaration is visible, outside the ifdef, but the definition is not compiled in? 
>>>>> 
>>>>> Well, yes, likely. But the question isn't that but "Why did the reference
>>>>> not get removed, when it's inside an if(0) block?"
>>>> 
>>>> Oh ok, I don’t know, here what I get if for example I build arm32:
>>>> 
>>>> arm-linux-gnueabihf-ld -EL -T arch/arm/xen.lds -N prelink.o \
>>>> ./common/symbols-dummy.o -o ./.xen-syms.0
>>>> arm-linux-gnueabihf-ld: prelink.o: in function `create_domUs':
>>>> (.init.text+0x13464): undefined reference to `sve_domctl_vl_param'
>>> 
>>> In particular with seeing this: What you copied here is a build with the
>>> series applied only up to this patch? I ask because the patch here adds a
>>> call only out of create_dom0().
>> 
>> No I’ve do the changes on top of the serie, I’ve tried it now, only to this patch and it builds correctly,
>> It was my mistake to don’t read carefully the error output.
>> 
>> Anyway I guess this change is not applicable because we don’t have a symbol that is plain 0 for domUs
>> to be placed inside create_domUs.
> 
> Possible, but would you mind first telling me in which other patch(es) the
> further reference(s) are being introduced, so I could take a look without
> (again) digging through the entire series?

Sure, the other references to the function are introduced in patch 11, “xen/arm: add sve property for dom0less domUs”.

> 
> Jan




* Re: [PATCH v6 07/12] xen: enable Dom0 to use SVE feature
  2023-04-24 15:43                     ` Luca Fancellu
@ 2023-04-24 16:10                       ` Jan Beulich
  2023-04-25  6:04                         ` Luca Fancellu
  0 siblings, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2023-04-24 16:10 UTC (permalink / raw)
  To: Luca Fancellu
  Cc: Bertrand Marquis, Wei Chen, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, Volodymyr Babchuk,
	xen-devel

On 24.04.2023 17:43, Luca Fancellu wrote:
>> On 24 Apr 2023, at 16:41, Jan Beulich <jbeulich@suse.com> wrote:
>> On 24.04.2023 17:34, Luca Fancellu wrote:
>>>> On 24 Apr 2023, at 16:25, Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 24.04.2023 17:18, Luca Fancellu wrote:
>>>>> Oh ok, I don’t know, here what I get if for example I build arm32:
>>>>>
>>>>> arm-linux-gnueabihf-ld -EL -T arch/arm/xen.lds -N prelink.o \
>>>>> ./common/symbols-dummy.o -o ./.xen-syms.0
>>>>> arm-linux-gnueabihf-ld: prelink.o: in function `create_domUs':
>>>>> (.init.text+0x13464): undefined reference to `sve_domctl_vl_param'
>>>>
>>>> In particular with seeing this: What you copied here is a build with the
>>>> series applied only up to this patch? I ask because the patch here adds a
>>>> call only out of create_dom0().
>>>
>>> No I’ve do the changes on top of the serie, I’ve tried it now, only to this patch and it builds correctly,
>>> It was my mistake to don’t read carefully the error output.
>>>
>>> Anyway I guess this change is not applicable because we don’t have a symbol that is plain 0 for domUs
>>> to be placed inside create_domUs.
>>
>> Possible, but would you mind first telling me in which other patch(es) the
>> further reference(s) are being introduced, so I could take a look without
>> (again) digging through the entire series?
> 
> Sure, the other references to the function are introduced in "xen/arm: add sve property for dom0less domUs” patch 11

Personally I'm inclined to suggest adding "#ifdef CONFIG_ARM64_SVE" there.
But I guess that may again go against your desire to not ignore inapplicable
options. Still, I can't resist at least asking how an "sve" node on Arm32 is
different from an entirely unknown one.

Jan



* Re: [PATCH v6 07/12] xen: enable Dom0 to use SVE feature
  2023-04-24 16:10                       ` Jan Beulich
@ 2023-04-25  6:04                         ` Luca Fancellu
  2023-05-18 18:39                           ` Julien Grall
  0 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-04-25  6:04 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Bertrand Marquis, Wei Chen, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, Volodymyr Babchuk,
	xen-devel



> On 24 Apr 2023, at 17:10, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 24.04.2023 17:43, Luca Fancellu wrote:
>>> On 24 Apr 2023, at 16:41, Jan Beulich <jbeulich@suse.com> wrote:
>>> On 24.04.2023 17:34, Luca Fancellu wrote:
>>>>> On 24 Apr 2023, at 16:25, Jan Beulich <jbeulich@suse.com> wrote:
>>>>> On 24.04.2023 17:18, Luca Fancellu wrote:
>>>>>> Oh ok, I don’t know, here what I get if for example I build arm32:
>>>>>> 
>>>>>> arm-linux-gnueabihf-ld -EL -T arch/arm/xen.lds -N prelink.o \
>>>>>> ./common/symbols-dummy.o -o ./.xen-syms.0
>>>>>> arm-linux-gnueabihf-ld: prelink.o: in function `create_domUs':
>>>>>> (.init.text+0x13464): undefined reference to `sve_domctl_vl_param'
>>>>> 
>>>>> In particular with seeing this: What you copied here is a build with the
>>>>> series applied only up to this patch? I ask because the patch here adds a
>>>>> call only out of create_dom0().
>>>> 
>>>> No I’ve do the changes on top of the serie, I’ve tried it now, only to this patch and it builds correctly,
>>>> It was my mistake to don’t read carefully the error output.
>>>> 
>>>> Anyway I guess this change is not applicable because we don’t have a symbol that is plain 0 for domUs
>>>> to be placed inside create_domUs.
>>> 
>>> Possible, but would you mind first telling me in which other patch(es) the
>>> further reference(s) are being introduced, so I could take a look without
>>> (again) digging through the entire series?
>> 
>> Sure, the other references to the function are introduced in "xen/arm: add sve property for dom0less domUs” patch 11
> 
> Personally I'm inclined to suggest adding "#ifdef CONFIG_ARM64_SVE" there.
> But I guess that may again go against your desire to not ignore inapplicable
> options. Still I can't resist to at least ask how an "sve" node on Arm32 is
> different from an entirely unknown one.

It would be ok for me to use #ifdef CONFIG_ARM64_SVE and fail in the #else branch,
but I had the feeling in the past that the Arm maintainers are not very happy with #ifdefs. I might
be wrong, so I’ll wait for them to give an opinion and then I will be happy to follow it.

> 
> Jan




* Re: [PATCH v6 09/12] tools: add physinfo arch_capabilities handling for Arm
  2023-04-24  6:02 ` [PATCH v6 09/12] tools: add physinfo arch_capabilities handling for Arm Luca Fancellu
@ 2023-05-02 16:13   ` Anthony PERARD
  2023-05-03  9:23     ` Luca Fancellu
  0 siblings, 1 reply; 56+ messages in thread
From: Anthony PERARD @ 2023-05-02 16:13 UTC (permalink / raw)
  To: Luca Fancellu
  Cc: xen-devel, bertrand.marquis, wei.chen, George Dunlap,
	Nick Rosbrook, Wei Liu, Juergen Gross, Christian Lindig,
	David Scott, Marek Marczykowski-Górecki, Christian Lindig

On Mon, Apr 24, 2023 at 07:02:45AM +0100, Luca Fancellu wrote:
> diff --git a/tools/include/xen-tools/arm-arch-capabilities.h b/tools/include/xen-tools/arm-arch-capabilities.h
> new file mode 100644
> index 000000000000..ac44c8b14344
> --- /dev/null
> +++ b/tools/include/xen-tools/arm-arch-capabilities.h
> @@ -0,0 +1,28 @@
> +/* SPDX-License-Identifier: GPL-2.0 */

Do you mean GPL-2.0-only ?

GPL-2.0 is deprecated by the SPDX project.

https://spdx.org/licenses/GPL-2.0.html


Besides that, patch looks fine:
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>

Thanks,

-- 
Anthony PERARD



* Re: [PATCH v6 10/12] xen/tools: add sve parameter in XL configuration
  2023-04-24  6:02 ` [PATCH v6 10/12] xen/tools: add sve parameter in XL configuration Luca Fancellu
@ 2023-05-02 17:06   ` Anthony PERARD
  2023-05-02 19:54     ` Luca Fancellu
  0 siblings, 1 reply; 56+ messages in thread
From: Anthony PERARD @ 2023-05-02 17:06 UTC (permalink / raw)
  To: Luca Fancellu
  Cc: xen-devel, bertrand.marquis, wei.chen, Wei Liu, George Dunlap,
	Nick Rosbrook, Juergen Gross

On Mon, Apr 24, 2023 at 07:02:46AM +0100, Luca Fancellu wrote:
> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> index ddc7b2a15975..1e69dac2c4fa 100644
> --- a/tools/libs/light/libxl_arm.c
> +++ b/tools/libs/light/libxl_arm.c
> @@ -211,6 +213,12 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>          return ERROR_FAIL;
>      }
>  
> +    /* Parameter is sanitised in libxl__arch_domain_build_info_setdefault */
> +    if (d_config->b_info.arch_arm.sve_vl) {
> +        /* Vector length is divided by 128 in struct xen_domctl_createdomain */
> +        config->arch.sve_vl = d_config->b_info.arch_arm.sve_vl / 128U;
> +    }
> +
>      return 0;
>  }
>  
> @@ -1681,6 +1689,26 @@ int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
>      /* ACPI is disabled by default */
>      libxl_defbool_setdefault(&b_info->acpi, false);
>  
> +    /* Sanitise SVE parameter */
> +    if (b_info->arch_arm.sve_vl) {
> +        unsigned int max_sve_vl =
> +            arch_capabilities_arm_sve(physinfo->arch_capabilities);
> +
> +        if (!max_sve_vl) {
> +            LOG(ERROR, "SVE is unsupported on this machine.");
> +            return ERROR_FAIL;
> +        }
> +
> +        if (LIBXL_SVE_TYPE_HW == b_info->arch_arm.sve_vl) {
> +            b_info->arch_arm.sve_vl = max_sve_vl;
> +        } else if (b_info->arch_arm.sve_vl > max_sve_vl) {
> +            LOG(ERROR,
> +                "Invalid sve value: %d. Platform supports up to %u bits",
> +                b_info->arch_arm.sve_vl, max_sve_vl);
> +            return ERROR_FAIL;
> +        }

You still need to check that sve_vl is one of the values from the enum,
or that the value is divisible by 128.

> +    }
> +
>      if (b_info->type != LIBXL_DOMAIN_TYPE_PV)
>          return 0;
>  
> diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
> index fd31dacf7d5a..9e48bb772646 100644
> --- a/tools/libs/light/libxl_types.idl
> +++ b/tools/libs/light/libxl_types.idl
> @@ -523,6 +523,27 @@ libxl_tee_type = Enumeration("tee_type", [
>      (1, "optee")
>      ], init_val = "LIBXL_TEE_TYPE_NONE")
>  
> +libxl_sve_type = Enumeration("sve_type", [
> +    (-1, "hw"),
> +    (0, "disabled"),
> +    (128, "128"),
> +    (256, "256"),
> +    (384, "384"),
> +    (512, "512"),
> +    (640, "640"),
> +    (768, "768"),
> +    (896, "896"),
> +    (1024, "1024"),
> +    (1152, "1152"),
> +    (1280, "1280"),
> +    (1408, "1408"),
> +    (1536, "1536"),
> +    (1664, "1664"),
> +    (1792, "1792"),
> +    (1920, "1920"),
> +    (2048, "2048")
> +    ], init_val = "LIBXL_SVE_TYPE_DISABLED")

I'm not sure if I like that or not. Is there a reason to stop at 2048?
Is it possible that there will be more values available in the future?

Also this means that users of libxl (like libvirt) would be expected to
use e.g. LIBXL_SVE_TYPE_1024, or use libxl_sve_type_from_string().

Also, it feels weird to me to mostly use the numerical value of the enum
rather than the enum itself.

Anyway, hopefully that enum will work fine.
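As a side note on the encoding the patch relies on: libxl carries the vector
length in bits while struct xen_domctl_createdomain stores it divided by 128,
so the round trip can be sketched as plain helpers (illustrative only, the
helper names are made up and not part of the series):

```c
/* Sketch of the conversion in libxl__arch_domain_prepare_config():
 * libxl stores the SVE vector length in bits, while the domctl
 * interface stores it divided by 128. */
static unsigned int sve_vl_bits_to_domctl(unsigned int vl_bits)
{
    return vl_bits / 128U;
}

static unsigned int sve_vl_domctl_to_bits(unsigned int encoded)
{
    return encoded * 128U;
}
```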

>  libxl_rdm_reserve = Struct("rdm_reserve", [
>      ("strategy",    libxl_rdm_reserve_strategy),
>      ("policy",      libxl_rdm_reserve_policy),

Thanks,

-- 
Anthony PERARD


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v6 10/12] xen/tools: add sve parameter in XL configuration
  2023-05-02 17:06   ` Anthony PERARD
@ 2023-05-02 19:54     ` Luca Fancellu
  2023-05-05 16:23       ` Anthony PERARD
  0 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-05-02 19:54 UTC (permalink / raw)
  To: Anthony PERARD
  Cc: Xen-devel, Bertrand Marquis, Wei Chen, Wei Liu, George Dunlap,
	Nick Rosbrook, Juergen Gross

Hi Anthony,

Thank you for your review.

> On 2 May 2023, at 18:06, Anthony PERARD <anthony.perard@citrix.com> wrote:
> 
> On Mon, Apr 24, 2023 at 07:02:46AM +0100, Luca Fancellu wrote:
>> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
>> index ddc7b2a15975..1e69dac2c4fa 100644
>> --- a/tools/libs/light/libxl_arm.c
>> +++ b/tools/libs/light/libxl_arm.c
>> @@ -211,6 +213,12 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>>         return ERROR_FAIL;
>>     }
>> 
>> +    /* Parameter is sanitised in libxl__arch_domain_build_info_setdefault */
>> +    if (d_config->b_info.arch_arm.sve_vl) {
>> +        /* Vector length is divided by 128 in struct xen_domctl_createdomain */
>> +        config->arch.sve_vl = d_config->b_info.arch_arm.sve_vl / 128U;
>> +    }
>> +
>>     return 0;
>> }
>> 
>> @@ -1681,6 +1689,26 @@ int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
>>     /* ACPI is disabled by default */
>>     libxl_defbool_setdefault(&b_info->acpi, false);
>> 
>> +    /* Sanitise SVE parameter */
>> +    if (b_info->arch_arm.sve_vl) {
>> +        unsigned int max_sve_vl =
>> +            arch_capabilities_arm_sve(physinfo->arch_capabilities);
>> +
>> +        if (!max_sve_vl) {
>> +            LOG(ERROR, "SVE is unsupported on this machine.");
>> +            return ERROR_FAIL;
>> +        }
>> +
>> +        if (LIBXL_SVE_TYPE_HW == b_info->arch_arm.sve_vl) {
>> +            b_info->arch_arm.sve_vl = max_sve_vl;
>> +        } else if (b_info->arch_arm.sve_vl > max_sve_vl) {
>> +            LOG(ERROR,
>> +                "Invalid sve value: %d. Platform supports up to %u bits",
>> +                b_info->arch_arm.sve_vl, max_sve_vl);
>> +            return ERROR_FAIL;
>> +        }
> 
> > You still need to check that sve_vl is one of the values from the enum,
> > or that the value is divisible by 128.

I have probably missed something: I thought that, by specifying the input the
way below, I got for free that the value is 0 or divisible by 128. Is that
not the case? Who can write to b_info->arch_arm.sve_vl a value different
from the enum we specified in the .idl?

> 
>> +    }
>> +
>>     if (b_info->type != LIBXL_DOMAIN_TYPE_PV)
>>         return 0;
>> 
>> diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
>> index fd31dacf7d5a..9e48bb772646 100644
>> --- a/tools/libs/light/libxl_types.idl
>> +++ b/tools/libs/light/libxl_types.idl
>> @@ -523,6 +523,27 @@ libxl_tee_type = Enumeration("tee_type", [
>>     (1, "optee")
>>     ], init_val = "LIBXL_TEE_TYPE_NONE")
>> 
>> +libxl_sve_type = Enumeration("sve_type", [
>> +    (-1, "hw"),
>> +    (0, "disabled"),
>> +    (128, "128"),
>> +    (256, "256"),
>> +    (384, "384"),
>> +    (512, "512"),
>> +    (640, "640"),
>> +    (768, "768"),
>> +    (896, "896"),
>> +    (1024, "1024"),
>> +    (1152, "1152"),
>> +    (1280, "1280"),
>> +    (1408, "1408"),
>> +    (1536, "1536"),
>> +    (1664, "1664"),
>> +    (1792, "1792"),
>> +    (1920, "1920"),
>> +    (2048, "2048")
>> +    ], init_val = "LIBXL_SVE_TYPE_DISABLED")
> 
> I'm not sure if I like that or not. Is there a reason to stop at 2048?
> It is possible that there will be more value available in the future?

Uhm... possibly there might be some extension. I thought that when that
becomes the case, the only thing to do would be to add another entry; I also
used this approach to get for free the checks on divisibility by 128 and the 2048 maximum.

> 
> Also this mean that users of libxl (like libvirt) would be supposed to
> use LIBXL_SVE_TYPE_1024 for e.g., or use libxl_sve_type_from_string().
> 
> Also, it feels weird to me to mostly use numerical value of the enum
> rather than the enum itself.
> 
> Anyway, hopefully that enum will work fine.
> 
>> libxl_rdm_reserve = Struct("rdm_reserve", [
>>     ("strategy",    libxl_rdm_reserve_strategy),
>>     ("policy",      libxl_rdm_reserve_policy),
> 
> Thanks,
> 
> -- 
> Anthony PERARD



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v6 09/12] tools: add physinfo arch_capabilities handling for Arm
  2023-05-02 16:13   ` Anthony PERARD
@ 2023-05-03  9:23     ` Luca Fancellu
  2023-05-05 16:44       ` Anthony PERARD
  0 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-05-03  9:23 UTC (permalink / raw)
  To: Anthony PERARD
  Cc: Xen-devel, Bertrand Marquis, Wei Chen, George Dunlap,
	Nick Rosbrook, Wei Liu, Juergen Gross, Christian Lindig,
	David Scott, Marek Marczykowski-Górecki, Christian Lindig



> On 2 May 2023, at 17:13, Anthony PERARD <anthony.perard@citrix.com> wrote:
> 
> On Mon, Apr 24, 2023 at 07:02:45AM +0100, Luca Fancellu wrote:
>> diff --git a/tools/include/xen-tools/arm-arch-capabilities.h b/tools/include/xen-tools/arm-arch-capabilities.h
>> new file mode 100644
>> index 000000000000..ac44c8b14344
>> --- /dev/null
>> +++ b/tools/include/xen-tools/arm-arch-capabilities.h
>> @@ -0,0 +1,28 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
> 
> Do you mean GPL-2.0-only ?
> 
> GPL-2.0 is deprecated by the SPDX project.
> 
> https://spdx.org/licenses/GPL-2.0.html
> 
> 
> Besides that, patch looks fine:
> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>

Thanks, I’ll fix in the next push and I’ll add your R-by

> 
> Thanks,
> 
> -- 
> Anthony PERARD


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v6 10/12] xen/tools: add sve parameter in XL configuration
  2023-05-02 19:54     ` Luca Fancellu
@ 2023-05-05 16:23       ` Anthony PERARD
  2023-05-05 16:36         ` Luca Fancellu
  0 siblings, 1 reply; 56+ messages in thread
From: Anthony PERARD @ 2023-05-05 16:23 UTC (permalink / raw)
  To: Luca Fancellu
  Cc: Xen-devel, Bertrand Marquis, Wei Chen, Wei Liu, George Dunlap,
	Nick Rosbrook, Juergen Gross

On Tue, May 02, 2023 at 07:54:19PM +0000, Luca Fancellu wrote:
> > On 2 May 2023, at 18:06, Anthony PERARD <anthony.perard@citrix.com> wrote:
> > On Mon, Apr 24, 2023 at 07:02:46AM +0100, Luca Fancellu wrote:
> >> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> >> index ddc7b2a15975..1e69dac2c4fa 100644
> >> --- a/tools/libs/light/libxl_arm.c
> >> +++ b/tools/libs/light/libxl_arm.c
> >> @@ -211,6 +213,12 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
> >>         return ERROR_FAIL;
> >>     }
> >> 
> >> +    /* Parameter is sanitised in libxl__arch_domain_build_info_setdefault */
> >> +    if (d_config->b_info.arch_arm.sve_vl) {
> >> +        /* Vector length is divided by 128 in struct xen_domctl_createdomain */
> >> +        config->arch.sve_vl = d_config->b_info.arch_arm.sve_vl / 128U;
> >> +    }
> >> +
> >>     return 0;
> >> }
> >> 
> >> @@ -1681,6 +1689,26 @@ int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
> >>     /* ACPI is disabled by default */
> >>     libxl_defbool_setdefault(&b_info->acpi, false);
> >> 
> >> +    /* Sanitise SVE parameter */
> >> +    if (b_info->arch_arm.sve_vl) {
> >> +        unsigned int max_sve_vl =
> >> +            arch_capabilities_arm_sve(physinfo->arch_capabilities);
> >> +
> >> +        if (!max_sve_vl) {
> >> +            LOG(ERROR, "SVE is unsupported on this machine.");
> >> +            return ERROR_FAIL;
> >> +        }
> >> +
> >> +        if (LIBXL_SVE_TYPE_HW == b_info->arch_arm.sve_vl) {
> >> +            b_info->arch_arm.sve_vl = max_sve_vl;
> >> +        } else if (b_info->arch_arm.sve_vl > max_sve_vl) {
> >> +            LOG(ERROR,
> >> +                "Invalid sve value: %d. Platform supports up to %u bits",
> >> +                b_info->arch_arm.sve_vl, max_sve_vl);
> >> +            return ERROR_FAIL;
> >> +        }
> > 
> > You still need to check that sve_vl is one of the values from the enum,
> > or that the value is divisible by 128.
> 
> I have probably missed something: I thought that, by specifying the input the
> way below, I got for free that the value is 0 or divisible by 128. Is that
> not the case? Who can write to b_info->arch_arm.sve_vl a value different
> from the enum we specified in the .idl?

`xl` isn't the only user of `libxl`. There's `libvirt` as well. We also
have libxl bindings for several languages. There's nothing stopping a
developer from writing a number into `sve_vl` instead of choosing one of the
values from the enum. I think we should probably sanitize any input that
comes from outside of libxl; that is probably already the case, but I'm not sure.

So if valid values for `sve_vl` are only numbers divisible by 128, and
some other discrete numbers, then we should check that they are, or
check that the value is one defined by the enum.
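The extra check being asked for here could look roughly like this: a sketch
under the assumption that valid values are 0, the HW sentinel -1, or a
multiple of 128 not exceeding the platform maximum (the helper name and
standalone constants are invented for illustration, mirroring the patch):

```c
#include <stdbool.h>

#define SVE_TYPE_HW     (-1)  /* mirrors LIBXL_SVE_TYPE_HW */
#define SVE_VL_MULTIPLE 128

/* Accept sve_vl only when it is the HW sentinel, disabled (0), or a
 * multiple of 128 within the platform maximum. */
static bool sve_vl_is_valid(int sve_vl, unsigned int max_sve_vl)
{
    if (sve_vl == SVE_TYPE_HW || sve_vl == 0)
        return true;
    return sve_vl > 0 &&
           sve_vl % SVE_VL_MULTIPLE == 0 &&
           (unsigned int)sve_vl <= max_sve_vl;
}
```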

Cheers,

-- 
Anthony PERARD


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v6 10/12] xen/tools: add sve parameter in XL configuration
  2023-05-05 16:23       ` Anthony PERARD
@ 2023-05-05 16:36         ` Luca Fancellu
  0 siblings, 0 replies; 56+ messages in thread
From: Luca Fancellu @ 2023-05-05 16:36 UTC (permalink / raw)
  To: Anthony PERARD
  Cc: Xen-devel, Bertrand Marquis, Wei Chen, Wei Liu, George Dunlap,
	Nick Rosbrook, Juergen Gross



> On 5 May 2023, at 17:23, Anthony PERARD <anthony.perard@citrix.com> wrote:
> 
> On Tue, May 02, 2023 at 07:54:19PM +0000, Luca Fancellu wrote:
>>> On 2 May 2023, at 18:06, Anthony PERARD <anthony.perard@citrix.com> wrote:
>>> On Mon, Apr 24, 2023 at 07:02:46AM +0100, Luca Fancellu wrote:
>>>> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
>>>> index ddc7b2a15975..1e69dac2c4fa 100644
>>>> --- a/tools/libs/light/libxl_arm.c
>>>> +++ b/tools/libs/light/libxl_arm.c
>>>> @@ -211,6 +213,12 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>>>>        return ERROR_FAIL;
>>>>    }
>>>> 
>>>> +    /* Parameter is sanitised in libxl__arch_domain_build_info_setdefault */
>>>> +    if (d_config->b_info.arch_arm.sve_vl) {
>>>> +        /* Vector length is divided by 128 in struct xen_domctl_createdomain */
>>>> +        config->arch.sve_vl = d_config->b_info.arch_arm.sve_vl / 128U;
>>>> +    }
>>>> +
>>>>    return 0;
>>>> }
>>>> 
>>>> @@ -1681,6 +1689,26 @@ int libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
>>>>    /* ACPI is disabled by default */
>>>>    libxl_defbool_setdefault(&b_info->acpi, false);
>>>> 
>>>> +    /* Sanitise SVE parameter */
>>>> +    if (b_info->arch_arm.sve_vl) {
>>>> +        unsigned int max_sve_vl =
>>>> +            arch_capabilities_arm_sve(physinfo->arch_capabilities);
>>>> +
>>>> +        if (!max_sve_vl) {
>>>> +            LOG(ERROR, "SVE is unsupported on this machine.");
>>>> +            return ERROR_FAIL;
>>>> +        }
>>>> +
>>>> +        if (LIBXL_SVE_TYPE_HW == b_info->arch_arm.sve_vl) {
>>>> +            b_info->arch_arm.sve_vl = max_sve_vl;
>>>> +        } else if (b_info->arch_arm.sve_vl > max_sve_vl) {
>>>> +            LOG(ERROR,
>>>> +                "Invalid sve value: %d. Platform supports up to %u bits",
>>>> +                b_info->arch_arm.sve_vl, max_sve_vl);
>>>> +            return ERROR_FAIL;
>>>> +        }
>>> 
>>> You still need to check that sve_vl is one of the values from the enum,
>>> or that the value is divisible by 128.
>> 
>> I have probably missed something: I thought that, by specifying the input the
>> way below, I got for free that the value is 0 or divisible by 128. Is that
>> not the case? Who can write to b_info->arch_arm.sve_vl a value different
>> from the enum we specified in the .idl?
> 
> `xl` isn't the only user of `libxl`. There's `libvirt` as well. We also
> have libxl bindings for several languages.

Right, this point wasn't clear to me; I will add the check there. Thank you
for the explanation.

> There's nothing stopping a
> developer from writing a number into `sve_vl` instead of choosing one of the
> values from the enum. I think we should probably sanitize any input that
> comes from outside of libxl; that is probably already the case, but I'm not sure.
> 
> So if valid values for `sve_vl` are only numbers divisible by 128, and
> some other discrete numbers, then we should check that they are, or
> check that the value is one defined by the enum.
> 
> Cheers,
> 
> -- 
> Anthony PERARD


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v6 09/12] tools: add physinfo arch_capabilities handling for Arm
  2023-05-03  9:23     ` Luca Fancellu
@ 2023-05-05 16:44       ` Anthony PERARD
  2023-05-05 16:56         ` Luca Fancellu
  0 siblings, 1 reply; 56+ messages in thread
From: Anthony PERARD @ 2023-05-05 16:44 UTC (permalink / raw)
  To: Luca Fancellu
  Cc: Xen-devel, Bertrand Marquis, Wei Chen, George Dunlap,
	Nick Rosbrook, Wei Liu, Juergen Gross, Christian Lindig,
	David Scott, Marek Marczykowski-Górecki, Christian Lindig

On Wed, May 03, 2023 at 09:23:19AM +0000, Luca Fancellu wrote:
> 
> 
> > On 2 May 2023, at 17:13, Anthony PERARD <anthony.perard@citrix.com> wrote:
> > 
> > On Mon, Apr 24, 2023 at 07:02:45AM +0100, Luca Fancellu wrote:
> >> diff --git a/tools/include/xen-tools/arm-arch-capabilities.h b/tools/include/xen-tools/arm-arch-capabilities.h
> >> new file mode 100644
> >> index 000000000000..ac44c8b14344
> >> --- /dev/null
> >> +++ b/tools/include/xen-tools/arm-arch-capabilities.h
> >> @@ -0,0 +1,28 @@
> >> +/* SPDX-License-Identifier: GPL-2.0 */
> > 
> > Do you mean GPL-2.0-only ?
> > 
> > GPL-2.0 is deprecated by the SPDX project.
> > 
> > https://spdx.org/licenses/GPL-2.0.html
> > 
> > 
> > Besides that, patch looks fine:
> > Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
> 
> Thanks, I’ll fix in the next push and I’ll add your R-by

Actually, could you use LGPL-2.1-only instead? As this code is to be
included in libxl, and libxl is supposed to be LGPL-2.1-only, it might
be better to be on the safe side and use LGPL for this new file.

As I understand it (from a recent discussion about libacpi, and a quick
search only), mixing GPL and LGPL code might mean the result is GPL. So,
just to be on the safe side, having this file be LGPL might be better. And it
seems that it would still be fine to include that file in GPL projects.

Would that be ok with you?

Cheers,

-- 
Anthony PERARD


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v6 09/12] tools: add physinfo arch_capabilities handling for Arm
  2023-05-05 16:44       ` Anthony PERARD
@ 2023-05-05 16:56         ` Luca Fancellu
  0 siblings, 0 replies; 56+ messages in thread
From: Luca Fancellu @ 2023-05-05 16:56 UTC (permalink / raw)
  To: Anthony PERARD
  Cc: Xen-devel, Bertrand Marquis, Wei Chen, George Dunlap,
	Nick Rosbrook, Wei Liu, Juergen Gross, Christian Lindig,
	David Scott, Marek Marczykowski-Górecki, Christian Lindig



> On 5 May 2023, at 17:44, Anthony PERARD <anthony.perard@citrix.com> wrote:
> 
> On Wed, May 03, 2023 at 09:23:19AM +0000, Luca Fancellu wrote:
>> 
>> 
>>> On 2 May 2023, at 17:13, Anthony PERARD <anthony.perard@citrix.com> wrote:
>>> 
>>> On Mon, Apr 24, 2023 at 07:02:45AM +0100, Luca Fancellu wrote:
>>>> diff --git a/tools/include/xen-tools/arm-arch-capabilities.h b/tools/include/xen-tools/arm-arch-capabilities.h
>>>> new file mode 100644
>>>> index 000000000000..ac44c8b14344
>>>> --- /dev/null
>>>> +++ b/tools/include/xen-tools/arm-arch-capabilities.h
>>>> @@ -0,0 +1,28 @@
>>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>> 
>>> Do you mean GPL-2.0-only ?
>>> 
>>> GPL-2.0 is deprecated by the SPDX project.
>>> 
>>> https://spdx.org/licenses/GPL-2.0.html
>>> 
>>> 
>>> Besides that, patch looks fine:
>>> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
>> 
>> Thanks, I’ll fix in the next push and I’ll add your R-by
> 
> Actually, could you use LGPL-2.1-only instead? As this code is to be
> included in libxl, and libxl is supposed to be LGPL-2.1-only, it might
> be better to be on the safe side and use LGPL for this new file.
> 
> As I understand it (from a recent discussion about libacpi, and a quick
> search only), mixing GPL and LGPL code might mean the result is GPL. So,
> just to be on the safe side, having this file be LGPL might be better. And it
> seems that it would still be fine to include that file in GPL projects.
> 
> Would that be ok with you?

Yes sure, I will use LGPL-2.1-only instead, no problems

> 
> Cheers,
> 
> -- 
> Anthony PERARD



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v6 01/12] xen/arm: enable SVE extension for Xen
  2023-04-24  6:02 ` [PATCH v6 01/12] xen/arm: enable SVE extension for Xen Luca Fancellu
@ 2023-05-18  9:35   ` Julien Grall
  2023-05-19 14:26     ` Luca Fancellu
  0 siblings, 1 reply; 56+ messages in thread
From: Julien Grall @ 2023-05-18  9:35 UTC (permalink / raw)
  To: Luca Fancellu, xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Volodymyr Babchuk

Hi Luca,

Sorry for jumping late in the review.

On 24/04/2023 07:02, Luca Fancellu wrote:
> Enable Xen to handle the SVE extension, add code in cpufeature module
> to handle ZCR SVE register, disable trapping SVE feature on system
> boot only when SVE resources are accessed.
> While there, correct coding style for the comment on coprocessor
> trapping.
> 
> Now cptr_el2 is part of the domain context and it will be restored
> on context switch, this is a preparation for saving the SVE context
> which will be part of VFP operations, so restore it before the call
> to save VFP registers.
> To save an additional isb barrier, restore cptr_el2 before an
> existing isb barrier and move the call for saving VFP context after
> that barrier.
> 
> Change the KConfig entry to make ARM64_SVE symbol selectable, by
> default it will be not selected.
> 
> Create sve module and sve_asm.S that contains assembly routines for
> the SVE feature; this code is inspired by Linux and it uses
> instruction encoding to be compatible with compilers that do not
> support SVE.
> 
> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
> ---
> Changes from v5:
>   - Add R-by Bertrand
> Changes from v4:
>   - don't use fixed types in vl_to_zcr, forgot to address that in
>     v3, by mistake I changed that in patch 2, fixing now (Jan)
> Changes from v3:
>   - no changes
> Changes from v2:
>   - renamed sve_asm.S in sve-asm.S, new files should not contain
>     underscore in the name (Jan)
> Changes from v1:
>   - Add assert to vl_to_zcr, it is never called with vl==0, but just
>     to be sure it won't in the future.
> Changes from RFC:
>   - Moved restoring of cptr before an existing barrier (Julien)
>   - Marked the feature as unsupported for now (Julien)
>   - Trap and un-trap only when using SVE resources in
>     compute_max_zcr() (Julien)
> ---
>   xen/arch/arm/Kconfig                     | 10 +++--
>   xen/arch/arm/arm64/Makefile              |  1 +
>   xen/arch/arm/arm64/cpufeature.c          |  7 ++--
>   xen/arch/arm/arm64/sve-asm.S             | 48 +++++++++++++++++++++++
>   xen/arch/arm/arm64/sve.c                 | 50 ++++++++++++++++++++++++
>   xen/arch/arm/cpufeature.c                |  6 ++-
>   xen/arch/arm/domain.c                    |  9 +++--
>   xen/arch/arm/include/asm/arm64/sve.h     | 43 ++++++++++++++++++++
>   xen/arch/arm/include/asm/arm64/sysregs.h |  1 +
>   xen/arch/arm/include/asm/cpufeature.h    | 14 +++++++
>   xen/arch/arm/include/asm/domain.h        |  1 +
>   xen/arch/arm/include/asm/processor.h     |  2 +
>   xen/arch/arm/setup.c                     |  5 ++-
>   xen/arch/arm/traps.c                     | 28 +++++++------
>   14 files changed, 201 insertions(+), 24 deletions(-)
>   create mode 100644 xen/arch/arm/arm64/sve-asm.S
>   create mode 100644 xen/arch/arm/arm64/sve.c
>   create mode 100644 xen/arch/arm/include/asm/arm64/sve.h
> 
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 239d3aed3c7f..41f45d8d1203 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -112,11 +112,15 @@ config ARM64_PTR_AUTH
>   	  This feature is not supported in Xen.
>   
>   config ARM64_SVE
> -	def_bool n
> +	bool "Enable Scalar Vector Extension support (UNSUPPORTED)" if UNSUPPORTED
>   	depends on ARM_64
>   	help
> -	  Scalar Vector Extension support.
> -	  This feature is not supported in Xen.
> +	  Scalar Vector Extension (SVE/SVE2) support for guests.
> +
> +	  Please be aware that currently, enabling this feature will add latency on
> +	  VM context switch between SVE enabled guests, between not-enabled SVE
> +	  guests and SVE enabled guests and viceversa, compared to the time
> +	  required to switch between not-enabled SVE guests.
>   
>   config ARM64_MTE
>   	def_bool n
> diff --git a/xen/arch/arm/arm64/Makefile b/xen/arch/arm/arm64/Makefile
> index 28481393e98f..54ad55c75cda 100644
> --- a/xen/arch/arm/arm64/Makefile
> +++ b/xen/arch/arm/arm64/Makefile
> @@ -13,6 +13,7 @@ obj-$(CONFIG_LIVEPATCH) += livepatch.o
>   obj-y += mm.o
>   obj-y += smc.o
>   obj-y += smpboot.o
> +obj-$(CONFIG_ARM64_SVE) += sve.o sve-asm.o
>   obj-y += traps.o
>   obj-y += vfp.o
>   obj-y += vsysreg.o
> diff --git a/xen/arch/arm/arm64/cpufeature.c b/xen/arch/arm/arm64/cpufeature.c
> index d9039d37b2d1..b4656ff4d80f 100644
> --- a/xen/arch/arm/arm64/cpufeature.c
> +++ b/xen/arch/arm/arm64/cpufeature.c
> @@ -455,15 +455,11 @@ static const struct arm64_ftr_bits ftr_id_dfr1[] = {
>   	ARM64_FTR_END,
>   };
>   
> -#if 0
> -/* TODO: use this to sanitize SVE once we support it */
> -
>   static const struct arm64_ftr_bits ftr_zcr[] = {
>   	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE,
>   		ZCR_ELx_LEN_SHIFT, ZCR_ELx_LEN_SIZE, 0),	/* LEN */
>   	ARM64_FTR_END,
>   };
> -#endif
>   
>   /*
>    * Common ftr bits for a 32bit register with all hidden, strict
> @@ -603,6 +599,9 @@ void update_system_features(const struct cpuinfo_arm *new)
>   
>   	SANITIZE_ID_REG(zfr64, 0, aa64zfr0);
>   
> +	if ( cpu_has_sve )
> +		SANITIZE_REG(zcr64, 0, zcr);
> +
>   	/*
>   	 * Comment from Linux:
>   	 * Userspace may perform DC ZVA instructions. Mismatched block sizes
> diff --git a/xen/arch/arm/arm64/sve-asm.S b/xen/arch/arm/arm64/sve-asm.S
> new file mode 100644
> index 000000000000..4d1549344733
> --- /dev/null
> +++ b/xen/arch/arm/arm64/sve-asm.S
> @@ -0,0 +1,48 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Arm SVE assembly routines
> + *
> + * Copyright (C) 2022 ARM Ltd.
> + *
> + * Some macros and instruction encoding in this file are taken from linux 6.1.1,
> + * file arch/arm64/include/asm/fpsimdmacros.h, some of them are a modified
> + * version.
AFAICT, the only modified version is _sve_rdvl, but it is not clear to
me why we would want to have a modified version.

I am asking because, without an explanation, it would be difficult
to know how to re-sync the code with Linux.
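For what it's worth, the encoding emitted by the quoted `_sve_rdvl` macro
(RDVL Xn, #imm) can be modelled as a small C helper to see what the `.inst`
arithmetic produces (an illustrative host-side sketch, not part of the patch):

```c
#include <stdint.h>

/* Model of the .inst arithmetic in _sve_rdvl: the RDVL Xn, #imm
 * encoding is the base opcode 0x04bf5000 with the destination
 * register number in bits [4:0] and the 6-bit signed immediate
 * in bits [10:5]. */
static uint32_t encode_sve_rdvl(unsigned int xn, int imm)
{
    /* Callers are expected to keep xn in [0, 30] and imm in
     * [-0x20, 0x1f], matching the assembler-time sanity checks. */
    return 0x04bf5000u | xn | (((uint32_t)imm & 0x3fu) << 5);
}
```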

> + */
> +
> +/* Sanity-check macros to help avoid encoding garbage instructions */
> +
> +.macro _check_general_reg nr
> +    .if (\nr) < 0 || (\nr) > 30
> +        .error "Bad register number \nr."
> +    .endif
> +.endm
> +
> +.macro _check_num n, min, max
> +    .if (\n) < (\min) || (\n) > (\max)
> +        .error "Number \n out of range [\min,\max]"
> +    .endif
> +.endm
> +
> +/* SVE instruction encodings for non-SVE-capable assemblers */
> +/* (pre binutils 2.28, all kernel capable clang versions support SVE) */
> +
> +/* RDVL X\nx, #\imm */
> +.macro _sve_rdvl nx, imm
> +    _check_general_reg \nx
> +    _check_num (\imm), -0x20, 0x1f
> +    .inst 0x04bf5000                \
> +        | (\nx)                     \
> +        | (((\imm) & 0x3f) << 5)
> +.endm
> +
> +/* Gets the current vector register size in bytes */
> +GLOBAL(sve_get_hw_vl)
> +    _sve_rdvl 0, 1
> +    ret
> +
> +/*
> + * Local variables:
> + * mode: ASM
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
> new file mode 100644
> index 000000000000..6f3fb368c59b
> --- /dev/null
> +++ b/xen/arch/arm/arm64/sve.c
> @@ -0,0 +1,50 @@
> +/* SPDX-License-Identifier: GPL-2.0 */

Above, you are using GPL-2.0-only, but here GPL-2.0. We favor the former
now. Happy to deal with it on commit if there is nothing else to address.

> +/*
> + * Arm SVE feature code
> + *
> + * Copyright (C) 2022 ARM Ltd.
> + */
> +
> +#include <xen/types.h>
> +#include <asm/arm64/sve.h>
> +#include <asm/arm64/sysregs.h>
> +#include <asm/processor.h>
> +#include <asm/system.h>
> +
> +extern unsigned int sve_get_hw_vl(void);
> +
> +register_t compute_max_zcr(void)
> +{
> +    register_t cptr_bits = get_default_cptr_flags();
> +    register_t zcr = vl_to_zcr(SVE_VL_MAX_BITS);
> +    unsigned int hw_vl;
> +
> +    /* Remove trap for SVE resources */
> +    WRITE_SYSREG(cptr_bits & ~HCPTR_CP(8), CPTR_EL2);
> +    isb();
> +
> +    /*
> +     * Set the maximum SVE vector length, doing that we will know the VL
> +     * supported by the platform, calling sve_get_hw_vl()
> +     */
> +    WRITE_SYSREG(zcr, ZCR_EL2);

From my reading of the Arm ARM (D19-6331, ARM DDI 0487J.a), a direct write
to a system register needs to be followed by a context
synchronization event (e.g. isb()) before the software can rely on the
value.

In this situation, AFAICT, the instruction in sve_get_hw_vl() will use
the content of ZCR_EL2. So don't we need an ISB() here?

> +
> +    /*
> +     * Read the maximum VL, which could be lower than what we imposed before,
> +     * hw_vl contains VL in bytes, multiply it by 8 to use vl_to_zcr() later
> +     */
> +    hw_vl = sve_get_hw_vl() * 8U;
> +
> +    /* Restore CPTR_EL2 */
> +    WRITE_SYSREG(cptr_bits, CPTR_EL2);
> +    isb();
> +
> +    return vl_to_zcr(hw_vl);
> +}
> +
> +/* Takes a vector length in bits and returns the ZCR_ELx encoding */
> +register_t vl_to_zcr(unsigned int vl)
> +{
> +    ASSERT(vl > 0);
> +    return ((vl / SVE_VL_MULTIPLE_VAL) - 1U) & ZCR_ELx_LEN_MASK;
> +}

Missing the emacs magic blocks at the end.
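Incidentally, the vl_to_zcr() mapping quoted above is easy to check
standalone: the ZCR_ELx LEN field is (vl / 128) - 1, masked to the low four
bits. Constants are copied from the patch; this is just a host-side model:

```c
#include <assert.h>

#define SVE_VL_MULTIPLE_VAL 128U
#define ZCR_ELx_LEN_MASK    0xfU

/* Host-side model of vl_to_zcr(): maps a vector length in bits to
 * the ZCR_ELx LEN field encoding, as in the quoted patch. */
static unsigned long vl_to_zcr_model(unsigned int vl)
{
    assert(vl > 0);
    return ((vl / SVE_VL_MULTIPLE_VAL) - 1U) & ZCR_ELx_LEN_MASK;
}
```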

> diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
> index c4ec38bb2554..83b84368f6d5 100644
> --- a/xen/arch/arm/cpufeature.c
> +++ b/xen/arch/arm/cpufeature.c
> @@ -9,6 +9,7 @@
>   #include <xen/init.h>
>   #include <xen/smp.h>
>   #include <xen/stop_machine.h>
> +#include <asm/arm64/sve.h>
>   #include <asm/cpufeature.h>
>   
>   DECLARE_BITMAP(cpu_hwcaps, ARM_NCAPS);
> @@ -143,6 +144,9 @@ void identify_cpu(struct cpuinfo_arm *c)
>   
>       c->zfr64.bits[0] = READ_SYSREG(ID_AA64ZFR0_EL1);
>   
> +    if ( cpu_has_sve )
> +        c->zcr64.bits[0] = compute_max_zcr();
> +
>       c->dczid.bits[0] = READ_SYSREG(DCZID_EL0);
>   
>       c->ctr.bits[0] = READ_SYSREG(CTR_EL0);
> @@ -199,7 +203,7 @@ static int __init create_guest_cpuinfo(void)
>       guest_cpuinfo.pfr64.mpam = 0;
>       guest_cpuinfo.pfr64.mpam_frac = 0;
>   
> -    /* Hide SVE as Xen does not support it */
> +    /* Hide SVE by default to the guests */
>       guest_cpuinfo.pfr64.sve = 0;
>       guest_cpuinfo.zfr64.bits[0] = 0;
>   
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index d8ef6501ff8e..0350d8c61ed8 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -181,9 +181,6 @@ static void ctxt_switch_to(struct vcpu *n)
>       /* VGIC */
>       gic_restore_state(n);
>   
> -    /* VFP */
> -    vfp_restore_state(n);
> -

At the moment ctxt_switch_to() is (mostly?) the reverse of
ctxt_switch_from(), but with this change you are going to break that.

I would really prefer if the existing convention stayed, because it
helps to confirm that we didn't miss bits in the restore code.

So if you want to move vfp_restore_state() later, then please move
vfp_save_state() earlier in ctxt_switch_from().


>       /* XXX MPU */
>   
>       /* Fault Status */
> @@ -234,6 +231,7 @@ static void ctxt_switch_to(struct vcpu *n)
>       p2m_restore_state(n);
>   
>       /* Control Registers */
> +    WRITE_SYSREG(n->arch.cptr_el2, CPTR_EL2);

I would prefer if this was called closer to vfp_restore_state(), so
the dependency between the two is easier to spot.

>       WRITE_SYSREG(n->arch.cpacr, CPACR_EL1);
>   
>       /*
> @@ -258,6 +256,9 @@ static void ctxt_switch_to(struct vcpu *n)
>   #endif
>       isb();
>   
> +    /* VFP */

Please document in the code that vfp_restore_state() has to be called
after the CPTR_EL2 write plus a synchronization event.

Similar documentation on top of at least CPTR_EL2 and possibly isb().
This would help if we need to re-order the code in the future.


> +    vfp_restore_state(n);
> +
>       /* CP 15 */
>       WRITE_SYSREG(n->arch.csselr, CSSELR_EL1);
>   
> @@ -548,6 +549,8 @@ int arch_vcpu_create(struct vcpu *v)
>   
>       v->arch.vmpidr = MPIDR_SMP | vcpuid_to_vaffinity(v->vcpu_id);
>   
> +    v->arch.cptr_el2 = get_default_cptr_flags();
> +
>       v->arch.hcr_el2 = get_default_hcr_flags();
>   
>       v->arch.mdcr_el2 = HDCR_TDRA | HDCR_TDOSA | HDCR_TDA;
> diff --git a/xen/arch/arm/include/asm/arm64/sve.h b/xen/arch/arm/include/asm/arm64/sve.h
> new file mode 100644
> index 000000000000..144d2b1cc485
> --- /dev/null
> +++ b/xen/arch/arm/include/asm/arm64/sve.h
> @@ -0,0 +1,43 @@
> +/* SPDX-License-Identifier: GPL-2.0 */

Use GPL-2.0-only.

> +/*
> + * Arm SVE feature code
> + *
> + * Copyright (C) 2022 ARM Ltd.
> + */
> +
> +#ifndef _ARM_ARM64_SVE_H
> +#define _ARM_ARM64_SVE_H
> +
> +#define SVE_VL_MAX_BITS (2048U)

NIT: The parentheses are unnecessary and we don't tend to add them in Xen.

> +
> +/* Vector length must be multiple of 128 */
> +#define SVE_VL_MULTIPLE_VAL (128U)

NIT: The parentheses are unnecessary

> +
> +#ifdef CONFIG_ARM64_SVE
> +
> +register_t compute_max_zcr(void);
> +register_t vl_to_zcr(unsigned int vl);
> +
> +#else /* !CONFIG_ARM64_SVE */
> +
> +static inline register_t compute_max_zcr(void)
> +{

Is this meant to be called when SVE is not enabled? If not, then please 
add ASSERT_UNREACHABLE().

> +    return 0;
> +}
> +
> +static inline register_t vl_to_zcr(unsigned int vl)
> +{

Is this meant to be called when SVE is not enabled? If not, then please 
add ASSERT_UNREACHABLE().

> +    return 0;
> +}
> +
> +#endif /* CONFIG_ARM64_SVE */
> +
> +#endif /* _ARM_ARM64_SVE_H */
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/include/asm/arm64/sysregs.h b/xen/arch/arm/include/asm/arm64/sysregs.h
> index 463899951414..4cabb9eb4d5e 100644
> --- a/xen/arch/arm/include/asm/arm64/sysregs.h
> +++ b/xen/arch/arm/include/asm/arm64/sysregs.h
> @@ -24,6 +24,7 @@
>   #define ICH_EISR_EL2              S3_4_C12_C11_3
>   #define ICH_ELSR_EL2              S3_4_C12_C11_5
>   #define ICH_VMCR_EL2              S3_4_C12_C11_7
> +#define ZCR_EL2                   S3_4_C1_C2_0
>   
>   #define __LR0_EL2(x)              S3_4_C12_C12_ ## x
>   #define __LR8_EL2(x)              S3_4_C12_C13_ ## x
> diff --git a/xen/arch/arm/include/asm/cpufeature.h b/xen/arch/arm/include/asm/cpufeature.h
> index c62cf6293fd6..6d703e051906 100644
> --- a/xen/arch/arm/include/asm/cpufeature.h
> +++ b/xen/arch/arm/include/asm/cpufeature.h
> @@ -32,6 +32,12 @@
>   #define cpu_has_thumbee   (boot_cpu_feature32(thumbee) == 1)
>   #define cpu_has_aarch32   (cpu_has_arm || cpu_has_thumb)
>   
> +#ifdef CONFIG_ARM64_SVE
> +#define cpu_has_sve       (boot_cpu_feature64(sve) == 1)
> +#else
> +#define cpu_has_sve       (0)

NIT: The parentheses are unnecessary

> +#endif
> +
>   #ifdef CONFIG_ARM_32
>   #define cpu_has_gicv3     (boot_cpu_feature32(gic) >= 1)
>   #define cpu_has_gentimer  (boot_cpu_feature32(gentimer) == 1)
> @@ -323,6 +329,14 @@ struct cpuinfo_arm {
>           };
>       } isa64;
>   
> +    union {
> +        register_t bits[1];
> +        struct {
> +            unsigned long len:4;
> +            unsigned long __res0:60;
> +        };
> +    } zcr64;
> +
>       struct {
>           register_t bits[1];
>       } zfr64;
> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
> index 2a51f0ca688e..e776ee704b7d 100644
> --- a/xen/arch/arm/include/asm/domain.h
> +++ b/xen/arch/arm/include/asm/domain.h
> @@ -190,6 +190,7 @@ struct arch_vcpu
>       register_t tpidrro_el0;
>   
>       /* HYP configuration */
> +    register_t cptr_el2;
>       register_t hcr_el2;
>       register_t mdcr_el2;
>   
> diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
> index 54f253087718..bc683334125c 100644
> --- a/xen/arch/arm/include/asm/processor.h
> +++ b/xen/arch/arm/include/asm/processor.h
> @@ -582,6 +582,8 @@ void do_trap_guest_serror(struct cpu_user_regs *regs);
>   
>   register_t get_default_hcr_flags(void);
>   
> +register_t get_default_cptr_flags(void);
> +
>   /*
>    * Synchronize SError unless the feature is selected.
>    * This is relying on the SErrors are currently unmasked.
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index 6f9f4d8c8a15..4191a766767a 100644
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -135,10 +135,11 @@ static void __init processor_id(void)
>              cpu_has_el2_32 ? "64+32" : cpu_has_el2_64 ? "64" : "No",
>              cpu_has_el1_32 ? "64+32" : cpu_has_el1_64 ? "64" : "No",
>              cpu_has_el0_32 ? "64+32" : cpu_has_el0_64 ? "64" : "No");
> -    printk("    Extensions:%s%s%s\n",
> +    printk("    Extensions:%s%s%s%s\n",
>              cpu_has_fp ? " FloatingPoint" : "",
>              cpu_has_simd ? " AdvancedSIMD" : "",
> -           cpu_has_gicv3 ? " GICv3-SysReg" : "");
> +           cpu_has_gicv3 ? " GICv3-SysReg" : "",
> +           cpu_has_sve ? " SVE" : "");
>   
>       /* Warn user if we find unknown floating-point features */
>       if ( cpu_has_fp && (boot_cpu_feature64(fp) >= 2) )
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index d40c331a4e9c..c0611c2ef6a5 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -93,6 +93,21 @@ register_t get_default_hcr_flags(void)
>                HCR_TID3|HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB|HCR_TSW);
>   }
>   
> +register_t get_default_cptr_flags(void)
> +{
> +    /*
> +     * Trap all coprocessor registers (0-13) except cp10 and
> +     * cp11 for VFP.
> +     *
> +     * /!\ All coprocessors except cp10 and cp11 cannot be used in Xen.
> +     *
> +     * On ARM64 the TCPx bits which we set here (0..9,12,13) are all
> +     * RES1, i.e. they would trap whether we did this write or not.
> +     */
> +    return  ((HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) |
> +             HCPTR_TTA | HCPTR_TAM);
> +}
> +
>   static enum {
>       SERRORS_DIVERSE,
>       SERRORS_PANIC,
> @@ -122,6 +137,7 @@ __initcall(update_serrors_cpu_caps);
>   
>   void init_traps(void)
>   {
> +    register_t cptr_bits = get_default_cptr_flags();

Coding style: Please add a newline after the declaration. That said...

>       /*
>        * Setup Hyp vector base. Note they might get updated with the
>        * branch predictor hardening.
> @@ -135,17 +151,7 @@ void init_traps(void)
>       /* Trap CP15 c15 used for implementation defined registers */
>       WRITE_SYSREG(HSTR_T(15), HSTR_EL2);
>   
> -    /* Trap all coprocessor registers (0-13) except cp10 and
> -     * cp11 for VFP.
> -     *
> -     * /!\ All coprocessors except cp10 and cp11 cannot be used in Xen.
> -     *
> -     * On ARM64 the TCPx bits which we set here (0..9,12,13) are all
> -     * RES1, i.e. they would trap whether we did this write or not.
> -     */
> -    WRITE_SYSREG((HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) |
> -                 HCPTR_TTA | HCPTR_TAM,
> -                 CPTR_EL2);
> +    WRITE_SYSREG(cptr_bits, CPTR_EL2);

... I would combine the two lines as the variable seems unnecessary.

>   
>       /*
>        * Configure HCR_EL2 with the bare minimum to run Xen until a guest

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v6 02/12] xen/arm: add SVE vector length field to the domain
  2023-04-24  6:02 ` [PATCH v6 02/12] xen/arm: add SVE vector length field to the domain Luca Fancellu
@ 2023-05-18  9:48   ` Julien Grall
  0 siblings, 0 replies; 56+ messages in thread
From: Julien Grall @ 2023-05-18  9:48 UTC (permalink / raw)
  To: Luca Fancellu, xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Jan Beulich,
	Wei Liu

Hi Luca,

On 24/04/2023 07:02, Luca Fancellu wrote:
> Add sve_vl field to arch_domain and xen_arch_domainconfig struct,
> to allow the domain to have an information about the SVE feature
> and the number of SVE register bits that are allowed for this
> domain.
> 
> sve_vl field is the vector length in bits divided by 128, this
> allows to use less space in the structures.
> 
> The field is used also to allow or forbid a domain to use SVE,
> because a value equal to zero means the guest is not allowed to
> use the feature.
> 
> Check that the requested vector length is lower or equal to the
> platform supported vector length, otherwise fail on domain
> creation.
> 
> Check that only 64 bit domains have SVE enabled, otherwise fail.
> 
> Bump the XEN_DOMCTL_INTERFACE_VERSION because of the new field
> in struct xen_arch_domainconfig.

The domctl interface version was bumped this week (see bdb1184d4f 
"domctl: bump interface version"). So this will not be necessary.

> 
> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
> ---
> Changes from v5:
>   - Update commit message stating the interface ver. bump (Bertrand)
>   - in struct arch_domain, protect sve_vl with CONFIG_ARM64_SVE,
>     given the change, move also is_sve_domain() where it's protected
>     inside sve.h and create a stub when the macro is not defined,
>     protect the usage of sve_vl where needed.
>     (Julien)
>   - Add a check for 32 bit guest running on top of 64 bit host that
>     have sve parameter enabled to stop the domain creation, added in
>     construct_domain() of domain_build.c and subarch_do_domctl of
>     domctl.c. (Julien)
> Changes from v4:
>   - Return 0 in get_sys_vl_len() if sve is not supported, code style fix,
>     removed else if since the conditions can't fallthrough, removed not
>     needed condition checking for VL bits validity because it's already
>     covered, so delete is_vl_valid() function. (Jan)
> Changes from v3:
>   - don't use fixed types when not needed, use encoded value also in
>     arch_domain so rename sve_vl_bits in sve_vl. (Jan)
>   - rename domainconfig_decode_vl to sve_decode_vl because it will now
>     be used also to decode from arch_domain value
>   - change sve_vl from uint16_t to uint8_t and move it after "type" field
>     to optimize space.
> Changes from v2:
>   - rename field in xen_arch_domainconfig from "sve_vl_bits" to
>     "sve_vl" and use the implicit padding after gic_version to
>     store it, now this field is the VL/128. (Jan)
>   - Created domainconfig_decode_vl() function to decode the sve_vl
>     field and use it as plain bits value inside arch_domain.
>   - Changed commit message reflecting the changes
> Changes from v1:
>   - no changes
> Changes from RFC:
>   - restore zcr_el2 in sve_restore_state, that will be introduced
>     later in this serie, so remove zcr_el2 related code from this
>     patch and move everything to the later patch (Julien)
>   - add explicit padding into struct xen_arch_domainconfig (Julien)
>   - Don't lower down the vector length, just fail to create the
>     domain. (Julien)
> ---
>   xen/arch/arm/arm64/domctl.c          |  4 ++++
>   xen/arch/arm/arm64/sve.c             | 12 ++++++++++++
>   xen/arch/arm/domain.c                | 29 ++++++++++++++++++++++++++++
>   xen/arch/arm/domain_build.c          |  7 +++++++
>   xen/arch/arm/include/asm/arm64/sve.h | 16 +++++++++++++++
>   xen/arch/arm/include/asm/domain.h    |  5 +++++
>   xen/include/public/arch-arm.h        |  2 ++
>   xen/include/public/domctl.h          |  2 +-
>   8 files changed, 76 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/arm64/domctl.c b/xen/arch/arm/arm64/domctl.c
> index 0de89b42c448..14fc622e9956 100644
> --- a/xen/arch/arm/arm64/domctl.c
> +++ b/xen/arch/arm/arm64/domctl.c
> @@ -10,6 +10,7 @@
>   #include <xen/sched.h>
>   #include <xen/hypercall.h>
>   #include <public/domctl.h>
> +#include <asm/arm64/sve.h>
>   #include <asm/cpufeature.h>
>   
>   static long switch_mode(struct domain *d, enum domain_type type)
> @@ -43,6 +44,9 @@ long subarch_do_domctl(struct xen_domctl *domctl, struct domain *d,
>           case 32:
>               if ( !cpu_has_el1_32 )
>                   return -EINVAL;
> +            /* SVE is not supported for 32 bit domain */
> +            if ( is_sve_domain(d) )
> +                return -EINVAL;
>               return switch_mode(d, DOMAIN_32BIT);
>           case 64:
>               return switch_mode(d, DOMAIN_64BIT);
> diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
> index 6f3fb368c59b..86a5e617bfca 100644
> --- a/xen/arch/arm/arm64/sve.c
> +++ b/xen/arch/arm/arm64/sve.c
> @@ -8,6 +8,7 @@
>   #include <xen/types.h>
>   #include <asm/arm64/sve.h>
>   #include <asm/arm64/sysregs.h>
> +#include <asm/cpufeature.h>
>   #include <asm/processor.h>
>   #include <asm/system.h>
>   
> @@ -48,3 +49,14 @@ register_t vl_to_zcr(unsigned int vl)
>       ASSERT(vl > 0);
>       return ((vl / SVE_VL_MULTIPLE_VAL) - 1U) & ZCR_ELx_LEN_MASK;
>   }
> +
> +/* Get the system sanitized value for VL in bits */
> +unsigned int get_sys_vl_len(void)
> +{
> +    if ( !cpu_has_sve )
> +        return 0;
> +
> +    /* ZCR_ELx len field is ((len+1) * 128) = vector bits length */

NIT: Please add a space before and after '+'.

> +    return ((system_cpuinfo.zcr64.bits[0] & ZCR_ELx_LEN_MASK) + 1U) *
> +            SVE_VL_MULTIPLE_VAL;
> +}
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 0350d8c61ed8..143359d0f313 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -13,6 +13,7 @@
>   #include <xen/wait.h>
>   
>   #include <asm/alternative.h>
> +#include <asm/arm64/sve.h>
>   #include <asm/cpuerrata.h>
>   #include <asm/cpufeature.h>
>   #include <asm/current.h>
> @@ -550,6 +551,8 @@ int arch_vcpu_create(struct vcpu *v)
>       v->arch.vmpidr = MPIDR_SMP | vcpuid_to_vaffinity(v->vcpu_id);
>   
>       v->arch.cptr_el2 = get_default_cptr_flags();
> +    if ( is_sve_domain(v->domain) )
> +        v->arch.cptr_el2 &= ~HCPTR_CP(8);
>   
>       v->arch.hcr_el2 = get_default_hcr_flags();
>   
> @@ -594,6 +597,7 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
>       unsigned int max_vcpus;
>       unsigned int flags_required = (XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap);
>       unsigned int flags_optional = (XEN_DOMCTL_CDF_iommu | XEN_DOMCTL_CDF_vpmu);
> +    unsigned int sve_vl_bits = sve_decode_vl(config->arch.sve_vl);
>   
>       if ( (config->flags & ~flags_optional) != flags_required )
>       {
> @@ -602,6 +606,26 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
>           return -EINVAL;
>       }
>   
> +    /* Check feature flags */
> +    if ( sve_vl_bits > 0 )
> +    {
> +        unsigned int zcr_max_bits = get_sys_vl_len();
> +
> +        if ( !zcr_max_bits )
> +        {
> +            dprintk(XENLOG_INFO, "SVE is unsupported on this machine.\n");
> +            return -EINVAL;
> +        }
> +
> +        if ( sve_vl_bits > zcr_max_bits )
> +        {
> +            dprintk(XENLOG_INFO,
> +                    "Requested SVE vector length (%u) > supported length (%u)\n",
> +                    sve_vl_bits, zcr_max_bits);
> +            return -EINVAL;
> +        }
> +    }
> +
>       /* The P2M table must always be shared between the CPU and the IOMMU */
>       if ( config->iommu_opts & XEN_DOMCTL_IOMMU_no_sharept )
>       {
> @@ -744,6 +768,11 @@ int arch_domain_create(struct domain *d,
>       if ( (rc = domain_vpci_init(d)) != 0 )
>           goto fail;
>   
> +#ifdef CONFIG_ARM64_SVE
> +    /* Copy the encoded vector length sve_vl from the domain configuration */
> +    d->arch.sve_vl = config->arch.sve_vl;
> +#endif
> +
>       return 0;
>   
>   fail:
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index f80fdd1af206..ffabe567ac3f 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -26,6 +26,7 @@
>   #include <asm/platform.h>
>   #include <asm/psci.h>
>   #include <asm/setup.h>
> +#include <asm/arm64/sve.h>
>   #include <asm/cpufeature.h>
>   #include <asm/domain_build.h>
>   #include <xen/event.h>
> @@ -3674,6 +3675,12 @@ static int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
>           return -EINVAL;
>       }
>   
> +    if ( is_sve_domain(d) && (kinfo->type == DOMAIN_32BIT) )
> +    {
> +        printk("SVE is not available for 32-bit domain\n");
> +        return -EINVAL;
> +    }
> +
>       if ( is_64bit_domain(d) )
>           vcpu_switch_to_aarch64_mode(v);
>   
> diff --git a/xen/arch/arm/include/asm/arm64/sve.h b/xen/arch/arm/include/asm/arm64/sve.h
> index 144d2b1cc485..730c3fb5a9c8 100644
> --- a/xen/arch/arm/include/asm/arm64/sve.h
> +++ b/xen/arch/arm/include/asm/arm64/sve.h
> @@ -13,13 +13,24 @@
>   /* Vector length must be multiple of 128 */
>   #define SVE_VL_MULTIPLE_VAL (128U)
>   
> +static inline unsigned int sve_decode_vl(unsigned int sve_vl)
> +{
> +    /* SVE vector length is stored as VL/128 in xen_arch_domainconfig */
> +    return sve_vl * SVE_VL_MULTIPLE_VAL;
> +}
> +
>   #ifdef CONFIG_ARM64_SVE
>   
> +#define is_sve_domain(d) ((d)->arch.sve_vl > 0)
> +
>   register_t compute_max_zcr(void);
>   register_t vl_to_zcr(unsigned int vl);
> +unsigned int get_sys_vl_len(void);
>   
>   #else /* !CONFIG_ARM64_SVE */
>   
> +#define is_sve_domain(d) (0)

You want to use (d, 0) so 'd' is still evaluated when SVE is not 
enabled. An alternative is to provide a static inline helper.

> +
>   static inline register_t compute_max_zcr(void)
>   {
>       return 0;
> @@ -30,6 +41,11 @@ static inline register_t vl_to_zcr(unsigned int vl)
>       return 0;
>   }
>   
> +static inline unsigned int get_sys_vl_len(void)
> +{
> +    return 0;
> +}
> +
>   #endif /* CONFIG_ARM64_SVE */
>   
>   #endif /* _ARM_ARM64_SVE_H */
> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
> index e776ee704b7d..331da0f3bcc3 100644
> --- a/xen/arch/arm/include/asm/domain.h
> +++ b/xen/arch/arm/include/asm/domain.h
> @@ -67,6 +67,11 @@ struct arch_domain
>       enum domain_type type;
>   #endif
>   
> +#ifdef CONFIG_ARM64_SVE
> +    /* max SVE encoded vector length */
> +    uint8_t sve_vl;
> +#endif
> +
>       /* Virtual MMU */
>       struct p2m_domain p2m;
>   
> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
> index 1528ced5097a..38311f559581 100644
> --- a/xen/include/public/arch-arm.h
> +++ b/xen/include/public/arch-arm.h
> @@ -300,6 +300,8 @@ DEFINE_XEN_GUEST_HANDLE(vcpu_guest_context_t);
>   struct xen_arch_domainconfig {
>       /* IN/OUT */
>       uint8_t gic_version;
> +    /* IN - Contains SVE vector length divided by 128 */
> +    uint8_t sve_vl;
>       /* IN */
>       uint16_t tee_type;
>       /* IN */
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index 529801c89ba3..e2e22cb534d6 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -21,7 +21,7 @@
>   #include "hvm/save.h"
>   #include "memory.h"
>   
> -#define XEN_DOMCTL_INTERFACE_VERSION 0x00000015
> +#define XEN_DOMCTL_INTERFACE_VERSION 0x00000016
>   
>   /*
>    * NB. xen_domctl.domain is an IN/OUT parameter for this operation.

Cheers,

-- 
Julien Grall



* Re: [PATCH v6 03/12] xen/arm: Expose SVE feature to the guest
  2023-04-24  6:02 ` [PATCH v6 03/12] xen/arm: Expose SVE feature to the guest Luca Fancellu
@ 2023-05-18  9:51   ` Julien Grall
  0 siblings, 0 replies; 56+ messages in thread
From: Julien Grall @ 2023-05-18  9:51 UTC (permalink / raw)
  To: Luca Fancellu, xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Volodymyr Babchuk

Hi,

On 24/04/2023 07:02, Luca Fancellu wrote:
> When a guest is allowed to use SVE, expose the SVE features through
> the identification registers.
> 
> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>

With one remark below:

Acked-by: Julien Grall <jgrall@amazon.com>

> +    case HSR_SYSREG_ID_AA64ZFR0_EL1:
> +    {
> +        /*
> +         * When the guest has the SVE feature enabled, the whole id_aa64zfr0_el1
> +         * needs to be exposed.
> +         */
> +        register_t guest_reg_value = guest_cpuinfo.zfr64.bits[0];

Coding style: Add a newline after the declaration.

> +        if ( is_sve_domain(v->domain) )
> +            guest_reg_value = system_cpuinfo.zfr64.bits[0];
> +
> +        return handle_ro_read_val(regs, regidx, hsr.sysreg.read, hsr, 1,
> +                                  guest_reg_value);
> +    }
>   
>       /*
>        * Those cases are catching all Reserved registers trapped by TID3 which

Cheers,

-- 
Julien Grall



* Re: [PATCH v6 04/12] xen/arm: add SVE exception class handling
  2023-04-24  6:02 ` [PATCH v6 04/12] xen/arm: add SVE exception class handling Luca Fancellu
@ 2023-05-18  9:55   ` Julien Grall
  0 siblings, 0 replies; 56+ messages in thread
From: Julien Grall @ 2023-05-18  9:55 UTC (permalink / raw)
  To: Luca Fancellu, xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Volodymyr Babchuk

Hi Luca,

On 24/04/2023 07:02, Luca Fancellu wrote:
> SVE has a new exception class with code 0x19, introduce the new code
> and handle the exception.
> 
> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall



* Re: [PATCH v6 05/12] arm/sve: save/restore SVE context switch
  2023-04-24  6:02 ` [PATCH v6 05/12] arm/sve: save/restore SVE context switch Luca Fancellu
@ 2023-05-18 18:27   ` Julien Grall
  2023-05-19 17:35     ` Luca Fancellu
  2023-05-18 18:30   ` Julien Grall
  1 sibling, 1 reply; 56+ messages in thread
From: Julien Grall @ 2023-05-18 18:27 UTC (permalink / raw)
  To: Luca Fancellu, xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Volodymyr Babchuk

Hi Luca,

On 24/04/2023 07:02, Luca Fancellu wrote:
> Save/restore context switch for SVE, allocate memory to contain
> the Z0-31 registers whose length is maximum 2048 bits each and
> FFR who can be maximum 256 bits, the allocated memory depends on
> how many bits is the vector length for the domain and how many bits
> are supported by the platform.
> 
> Save P0-15 whose length is maximum 256 bits each, in this case the
> memory used is from the fpregs field in struct vfp_state,
> because V0-31 are part of Z0-31 and this space would have been
> unused for SVE domain otherwise.
> 
> Create zcr_el{1,2} fields in arch_vcpu, initialise zcr_el2 on vcpu
> creation given the requested vector length and restore it on
> context switch, save/restore ZCR_EL1 value as well.
> 
> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
> ---
> Changes from v5:
>   - use XFREE instead of xfree, keep the headers (Julien)
>   - Avoid math computation for every save/restore, store the computation
>     in struct vfp_state once (Bertrand)
>   - protect access to v->domain->arch.sve_vl inside arch_vcpu_create now
>     that sve_vl is available only on arm64
> Changes from v4:
>   - No changes
> Changes from v3:
>   - don't use fixed len types when not needed (Jan)
>   - now VL is an encoded value, decode it before using.
> Changes from v2:
>   - No changes
> Changes from v1:
>   - No changes
> Changes from RFC:
>   - Moved zcr_el2 field introduction in this patch, restore its
>     content inside sve_restore_state function. (Julien)
> ---
>   xen/arch/arm/arm64/sve-asm.S             | 141 +++++++++++++++++++++++
>   xen/arch/arm/arm64/sve.c                 |  63 ++++++++++
>   xen/arch/arm/arm64/vfp.c                 |  79 +++++++------
>   xen/arch/arm/domain.c                    |   9 ++
>   xen/arch/arm/include/asm/arm64/sve.h     |  13 +++
>   xen/arch/arm/include/asm/arm64/sysregs.h |   3 +
>   xen/arch/arm/include/asm/arm64/vfp.h     |  12 ++
>   xen/arch/arm/include/asm/domain.h        |   2 +
>   8 files changed, 288 insertions(+), 34 deletions(-)
> 
> diff --git a/xen/arch/arm/arm64/sve-asm.S b/xen/arch/arm/arm64/sve-asm.S
> index 4d1549344733..8c37d7bc95d5 100644
> --- a/xen/arch/arm/arm64/sve-asm.S
> +++ b/xen/arch/arm/arm64/sve-asm.S

Are all the new helpers added in this patch taken from Linux? If so, it 
would be good to clarify this (again) in the commit message as it helps 
with the review (I can diff with Linux rather than properly reviewing them).

> diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
> index 86a5e617bfca..064832b450ff 100644
> --- a/xen/arch/arm/arm64/sve.c
> +++ b/xen/arch/arm/arm64/sve.c
> @@ -5,6 +5,8 @@
>    * Copyright (C) 2022 ARM Ltd.
>    */
>   
> +#include <xen/sched.h>
> +#include <xen/sizes.h>
>   #include <xen/types.h>
>   #include <asm/arm64/sve.h>
>   #include <asm/arm64/sysregs.h>
> @@ -13,6 +15,24 @@
>   #include <asm/system.h>
>   
>   extern unsigned int sve_get_hw_vl(void);
> +extern void sve_save_ctx(uint64_t *sve_ctx, uint64_t *pregs, int save_ffr);
> +extern void sve_load_ctx(uint64_t const *sve_ctx, uint64_t const *pregs,
> +                         int restore_ffr);

 From the use, it is not entirely clear what restore_ffr/save_ffr are 
meant to be. Are they bool? If so, maybe use bool. At minimum, they 
probably want to be unsigned int.

> +
> +static inline unsigned int sve_zreg_ctx_size(unsigned int vl)
> +{
> +    /*
> +     * Z0-31 registers size in bytes is computed from VL that is in bits, so VL
> +     * in bytes is VL/8.
> +     */
> +    return (vl / 8U) * 32U;
> +}
> +
> +static inline unsigned int sve_ffrreg_ctx_size(unsigned int vl)
> +{
> +    /* FFR register size is VL/8, which is in bytes (VL/8)/8 */
> +    return (vl / 64U);
> +}
>   
>   register_t compute_max_zcr(void)
>   {
> @@ -60,3 +80,46 @@ unsigned int get_sys_vl_len(void)
>       return ((system_cpuinfo.zcr64.bits[0] & ZCR_ELx_LEN_MASK) + 1U) *
>               SVE_VL_MULTIPLE_VAL;
>   }
> +
> +int sve_context_init(struct vcpu *v)
> +{
> +    unsigned int sve_vl_bits = sve_decode_vl(v->domain->arch.sve_vl);
> +    uint64_t *ctx = _xzalloc(sve_zreg_ctx_size(sve_vl_bits) +
> +                             sve_ffrreg_ctx_size(sve_vl_bits),
> +                             L1_CACHE_BYTES);
> +
> +    if ( !ctx )
> +        return -ENOMEM;
> +
> +    /* Point to the end of Z0-Z31 memory, just before FFR memory */

NIT: I would add that the logic should be kept in sync with 
sve_context_free(). Same...

> +    v->arch.vfp.sve_zreg_ctx_end = ctx +
> +        (sve_zreg_ctx_size(sve_vl_bits) / sizeof(uint64_t));
> +
> +    return 0;
> +}
> +
> +void sve_context_free(struct vcpu *v)
> +{
> +    unsigned int sve_vl_bits = sve_decode_vl(v->domain->arch.sve_vl);
> +
> +    /* Point back to the beginning of Z0-Z31 + FFR memory */

... here (but with sve_context_init()). So it is clearer that if the 
logic changes in one place then it needs to be changed in the other.

> +    v->arch.vfp.sve_zreg_ctx_end -=
> +        (sve_zreg_ctx_size(sve_vl_bits) / sizeof(uint64_t));

 From my understanding, sve_context_free() could be called with 
sve_zreg_ctx_end equal to NULL (i.e. because sve_context_init() 
failed). So wouldn't we end up subtracting the value from NULL and 
therefore...

> +
> +    XFREE(v->arch.vfp.sve_zreg_ctx_end);

... free a random pointer?

> +}
> +
> +void sve_save_state(struct vcpu *v)
> +{
> +    v->arch.zcr_el1 = READ_SYSREG(ZCR_EL1);
> +
> +    sve_save_ctx(v->arch.vfp.sve_zreg_ctx_end, v->arch.vfp.fpregs, 1);
> +}
> +
> +void sve_restore_state(struct vcpu *v)
> +{
> +    WRITE_SYSREG(v->arch.zcr_el1, ZCR_EL1);
> +    WRITE_SYSREG(v->arch.zcr_el2, ZCR_EL2);

AFAIU, this value will be used for the restore below. So don't we need 
an isb()?

> +
> +    sve_load_ctx(v->arch.vfp.sve_zreg_ctx_end, v->arch.vfp.fpregs, 1);
> +}
> diff --git a/xen/arch/arm/arm64/vfp.c b/xen/arch/arm/arm64/vfp.c
> index 47885e76baae..2d0d7c2e6ddb 100644
> --- a/xen/arch/arm/arm64/vfp.c
> +++ b/xen/arch/arm/arm64/vfp.c
> @@ -2,29 +2,35 @@
>   #include <asm/processor.h>
>   #include <asm/cpufeature.h>
>   #include <asm/vfp.h>
> +#include <asm/arm64/sve.h>
>   
>   void vfp_save_state(struct vcpu *v)
>   {
>       if ( !cpu_has_fp )
>           return;
>   
> -    asm volatile("stp q0, q1, [%1, #16 * 0]\n\t"
> -                 "stp q2, q3, [%1, #16 * 2]\n\t"
> -                 "stp q4, q5, [%1, #16 * 4]\n\t"
> -                 "stp q6, q7, [%1, #16 * 6]\n\t"
> -                 "stp q8, q9, [%1, #16 * 8]\n\t"
> -                 "stp q10, q11, [%1, #16 * 10]\n\t"
> -                 "stp q12, q13, [%1, #16 * 12]\n\t"
> -                 "stp q14, q15, [%1, #16 * 14]\n\t"
> -                 "stp q16, q17, [%1, #16 * 16]\n\t"
> -                 "stp q18, q19, [%1, #16 * 18]\n\t"
> -                 "stp q20, q21, [%1, #16 * 20]\n\t"
> -                 "stp q22, q23, [%1, #16 * 22]\n\t"
> -                 "stp q24, q25, [%1, #16 * 24]\n\t"
> -                 "stp q26, q27, [%1, #16 * 26]\n\t"
> -                 "stp q28, q29, [%1, #16 * 28]\n\t"
> -                 "stp q30, q31, [%1, #16 * 30]\n\t"
> -                 : "=Q" (*v->arch.vfp.fpregs) : "r" (v->arch.vfp.fpregs));
> +    if ( is_sve_domain(v->domain) )
> +        sve_save_state(v);
> +    else
> +    {
> +        asm volatile("stp q0, q1, [%1, #16 * 0]\n\t"
> +                     "stp q2, q3, [%1, #16 * 2]\n\t"
> +                     "stp q4, q5, [%1, #16 * 4]\n\t"
> +                     "stp q6, q7, [%1, #16 * 6]\n\t"
> +                     "stp q8, q9, [%1, #16 * 8]\n\t"
> +                     "stp q10, q11, [%1, #16 * 10]\n\t"
> +                     "stp q12, q13, [%1, #16 * 12]\n\t"
> +                     "stp q14, q15, [%1, #16 * 14]\n\t"
> +                     "stp q16, q17, [%1, #16 * 16]\n\t"
> +                     "stp q18, q19, [%1, #16 * 18]\n\t"
> +                     "stp q20, q21, [%1, #16 * 20]\n\t"
> +                     "stp q22, q23, [%1, #16 * 22]\n\t"
> +                     "stp q24, q25, [%1, #16 * 24]\n\t"
> +                     "stp q26, q27, [%1, #16 * 26]\n\t"
> +                     "stp q28, q29, [%1, #16 * 28]\n\t"
> +                     "stp q30, q31, [%1, #16 * 30]\n\t"
> +                     : "=Q" (*v->arch.vfp.fpregs) : "r" (v->arch.vfp.fpregs));
> +    }
>   
>       v->arch.vfp.fpsr = READ_SYSREG(FPSR);
>       v->arch.vfp.fpcr = READ_SYSREG(FPCR);
> @@ -37,23 +43,28 @@ void vfp_restore_state(struct vcpu *v)
>       if ( !cpu_has_fp )
>           return;
>   
> -    asm volatile("ldp q0, q1, [%1, #16 * 0]\n\t"
> -                 "ldp q2, q3, [%1, #16 * 2]\n\t"
> -                 "ldp q4, q5, [%1, #16 * 4]\n\t"
> -                 "ldp q6, q7, [%1, #16 * 6]\n\t"
> -                 "ldp q8, q9, [%1, #16 * 8]\n\t"
> -                 "ldp q10, q11, [%1, #16 * 10]\n\t"
> -                 "ldp q12, q13, [%1, #16 * 12]\n\t"
> -                 "ldp q14, q15, [%1, #16 * 14]\n\t"
> -                 "ldp q16, q17, [%1, #16 * 16]\n\t"
> -                 "ldp q18, q19, [%1, #16 * 18]\n\t"
> -                 "ldp q20, q21, [%1, #16 * 20]\n\t"
> -                 "ldp q22, q23, [%1, #16 * 22]\n\t"
> -                 "ldp q24, q25, [%1, #16 * 24]\n\t"
> -                 "ldp q26, q27, [%1, #16 * 26]\n\t"
> -                 "ldp q28, q29, [%1, #16 * 28]\n\t"
> -                 "ldp q30, q31, [%1, #16 * 30]\n\t"
> -                 : : "Q" (*v->arch.vfp.fpregs), "r" (v->arch.vfp.fpregs));
> +    if ( is_sve_domain(v->domain) )
> +        sve_restore_state(v);
> +    else
> +    {
> +        asm volatile("ldp q0, q1, [%1, #16 * 0]\n\t"
> +                     "ldp q2, q3, [%1, #16 * 2]\n\t"
> +                     "ldp q4, q5, [%1, #16 * 4]\n\t"
> +                     "ldp q6, q7, [%1, #16 * 6]\n\t"
> +                     "ldp q8, q9, [%1, #16 * 8]\n\t"
> +                     "ldp q10, q11, [%1, #16 * 10]\n\t"
> +                     "ldp q12, q13, [%1, #16 * 12]\n\t"
> +                     "ldp q14, q15, [%1, #16 * 14]\n\t"
> +                     "ldp q16, q17, [%1, #16 * 16]\n\t"
> +                     "ldp q18, q19, [%1, #16 * 18]\n\t"
> +                     "ldp q20, q21, [%1, #16 * 20]\n\t"
> +                     "ldp q22, q23, [%1, #16 * 22]\n\t"
> +                     "ldp q24, q25, [%1, #16 * 24]\n\t"
> +                     "ldp q26, q27, [%1, #16 * 26]\n\t"
> +                     "ldp q28, q29, [%1, #16 * 28]\n\t"
> +                     "ldp q30, q31, [%1, #16 * 30]\n\t"
> +                     : : "Q" (*v->arch.vfp.fpregs), "r" (v->arch.vfp.fpregs));
> +    }
>   
>       WRITE_SYSREG(v->arch.vfp.fpsr, FPSR);
>       WRITE_SYSREG(v->arch.vfp.fpcr, FPCR);
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 143359d0f313..24c722a4a11e 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -552,7 +552,14 @@ int arch_vcpu_create(struct vcpu *v)
>   
>       v->arch.cptr_el2 = get_default_cptr_flags();
>       if ( is_sve_domain(v->domain) )
> +    {
> +        if ( (rc = sve_context_init(v)) != 0 )
> +            goto fail;
>           v->arch.cptr_el2 &= ~HCPTR_CP(8);
> +#ifdef CONFIG_ARM64_SVE

This #ifdef reads a bit odd to me because you are protecting 
v->arch.zcr_el2 but not the rest. This is one of the cases where I would 
surround the full if with the #ifdef, because it makes it clearer that 
there is no way the rest of the code can be reached if !CONFIG_ARM64_SVE.

That said, I would actually prefer if...

> +        v->arch.zcr_el2 = vl_to_zcr(sve_decode_vl(v->domain->arch.sve_vl));

... this line is moved in sve_context_init() because this is related to 
the SVE context.

> +#endif
> +    }
>   
>       v->arch.hcr_el2 = get_default_hcr_flags();
>   
> @@ -582,6 +589,8 @@ fail:
>   
>   void arch_vcpu_destroy(struct vcpu *v)
>   {
> +    if ( is_sve_domain(v->domain) )
> +        sve_context_free(v);
>       vcpu_timer_destroy(v);
>       vcpu_vgic_free(v);
>       free_xenheap_pages(v->arch.stack, STACK_ORDER);
> diff --git a/xen/arch/arm/include/asm/arm64/sve.h b/xen/arch/arm/include/asm/arm64/sve.h
> index 730c3fb5a9c8..582405dfdf6a 100644
> --- a/xen/arch/arm/include/asm/arm64/sve.h
> +++ b/xen/arch/arm/include/asm/arm64/sve.h
> @@ -26,6 +26,10 @@ static inline unsigned int sve_decode_vl(unsigned int sve_vl)
>   register_t compute_max_zcr(void);
>   register_t vl_to_zcr(unsigned int vl);
>   unsigned int get_sys_vl_len(void);
> +int sve_context_init(struct vcpu *v);
> +void sve_context_free(struct vcpu *v);
> +void sve_save_state(struct vcpu *v);
> +void sve_restore_state(struct vcpu *v);
>   
>   #else /* !CONFIG_ARM64_SVE */
>   
> @@ -46,6 +50,15 @@ static inline unsigned int get_sys_vl_len(void)
>       return 0;
>   }
>   
> +static inline int sve_context_init(struct vcpu *v)
> +{
> +    return 0;
> +}
> +
> +static inline void sve_context_free(struct vcpu *v) {}
> +static inline void sve_save_state(struct vcpu *v) {}
> +static inline void sve_restore_state(struct vcpu *v) {}
> +
>   #endif /* CONFIG_ARM64_SVE */
>   
>   #endif /* _ARM_ARM64_SVE_H */
> diff --git a/xen/arch/arm/include/asm/arm64/sysregs.h b/xen/arch/arm/include/asm/arm64/sysregs.h
> index 4cabb9eb4d5e..3fdeb9d8cdef 100644
> --- a/xen/arch/arm/include/asm/arm64/sysregs.h
> +++ b/xen/arch/arm/include/asm/arm64/sysregs.h
> @@ -88,6 +88,9 @@
>   #ifndef ID_AA64ISAR2_EL1
>   #define ID_AA64ISAR2_EL1            S3_0_C0_C6_2
>   #endif
> +#ifndef ZCR_EL1
> +#define ZCR_EL1                     S3_0_C1_C2_0
> +#endif
>   
>   /* ID registers (imported from arm64/include/asm/sysreg.h in Linux) */
>   
> diff --git a/xen/arch/arm/include/asm/arm64/vfp.h b/xen/arch/arm/include/asm/arm64/vfp.h
> index e6e8c363bc16..4aa371e85d26 100644
> --- a/xen/arch/arm/include/asm/arm64/vfp.h
> +++ b/xen/arch/arm/include/asm/arm64/vfp.h
> @@ -6,7 +6,19 @@
>   
>   struct vfp_state
>   {
> +    /*
> +     * When SVE is enabled for the guest, fpregs memory will be used to
> +     * save/restore P0-P15 registers, otherwise it will be used for the V0-V31
> +     * registers.
> +     */
>       uint64_t fpregs[64] __vfp_aligned;
> +    /*
> +     * When SVE is enabled for the guest, sve_zreg_ctx_end points to memory
> +     * where Z0-Z31 registers and FFR can be saved/restored, it points at the
> +     * end of the Z0-Z31 space and at the beginning of the FFR space, it's done
> +     * like that to ease the save/restore assembly operations.
> +     */
> +    uint64_t *sve_zreg_ctx_end;
>       register_t fpcr;
>       register_t fpexc32_el2;
>       register_t fpsr;
> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
> index 331da0f3bcc3..814652d92568 100644
> --- a/xen/arch/arm/include/asm/domain.h
> +++ b/xen/arch/arm/include/asm/domain.h
> @@ -195,6 +195,8 @@ struct arch_vcpu
>       register_t tpidrro_el0;
>   
>       /* HYP configuration */
> +    register_t zcr_el1;
> +    register_t zcr_el2;
>       register_t cptr_el2;
>       register_t hcr_el2;
>       register_t mdcr_el2;

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v6 05/12] arm/sve: save/restore SVE context switch
  2023-04-24  6:02 ` [PATCH v6 05/12] arm/sve: save/restore SVE context switch Luca Fancellu
  2023-05-18 18:27   ` Julien Grall
@ 2023-05-18 18:30   ` Julien Grall
  2023-05-22 10:20     ` Luca Fancellu
  1 sibling, 1 reply; 56+ messages in thread
From: Julien Grall @ 2023-05-18 18:30 UTC (permalink / raw)
  To: Luca Fancellu, xen-devel
  Cc: bertrand.marquis, wei.chen, Stefano Stabellini, Volodymyr Babchuk

Hi Luca,

One more remark.

On 24/04/2023 07:02, Luca Fancellu wrote:
>   #else /* !CONFIG_ARM64_SVE */
>   
> @@ -46,6 +50,15 @@ static inline unsigned int get_sys_vl_len(void)
>       return 0;
>   }
>   
> +static inline int sve_context_init(struct vcpu *v)
> +{
> +    return 0;

The call is protected by is_sve_domain(). So I think we want to return 
an error just in case someone calls it outside of its intended use.

> +}
> +
> +static inline void sve_context_free(struct vcpu *v) {}
> +static inline void sve_save_state(struct vcpu *v) {}
> +static inline void sve_restore_state(struct vcpu *v) {}
> +

-- 
Julien Grall



* Re: [PATCH v6 07/12] xen: enable Dom0 to use SVE feature
  2023-04-25  6:04                         ` Luca Fancellu
@ 2023-05-18 18:39                           ` Julien Grall
  0 siblings, 0 replies; 56+ messages in thread
From: Julien Grall @ 2023-05-18 18:39 UTC (permalink / raw)
  To: Luca Fancellu, Jan Beulich
  Cc: Bertrand Marquis, Wei Chen, Andrew Cooper, George Dunlap,
	Stefano Stabellini, Wei Liu, Volodymyr Babchuk, xen-devel

Hi Luca,

Sorry for the late reply.

On 25/04/2023 07:04, Luca Fancellu wrote:
> 
> 
>> On 24 Apr 2023, at 17:10, Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 24.04.2023 17:43, Luca Fancellu wrote:
>>>> On 24 Apr 2023, at 16:41, Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 24.04.2023 17:34, Luca Fancellu wrote:
>>>>>> On 24 Apr 2023, at 16:25, Jan Beulich <jbeulich@suse.com> wrote:
>>>>>> On 24.04.2023 17:18, Luca Fancellu wrote:
>>>>>>> Oh ok, I don’t know; here is what I get if, for example, I build arm32:
>>>>>>>
>>>>>>> arm-linux-gnueabihf-ld -EL -T arch/arm/xen.lds -N prelink.o \
>>>>>>> ./common/symbols-dummy.o -o ./.xen-syms.0
>>>>>>> arm-linux-gnueabihf-ld: prelink.o: in function `create_domUs':
>>>>>>> (.init.text+0x13464): undefined reference to `sve_domctl_vl_param'
>>>>>>
>>>>>> In particular with seeing this: What you copied here is a build with the
>>>>>> series applied only up to this patch? I ask because the patch here adds a
>>>>>> call only out of create_dom0().
>>>>>
>>>>> No, I’ve done the changes on top of the series. I’ve tried it now with only up to this patch applied and it builds correctly;
>>>>> it was my mistake not to read the error output carefully.
>>>>>
>>>>> Anyway I guess this change is not applicable because we don’t have a symbol that is plain 0 for domUs
>>>>> to be placed inside create_domUs.
>>>>
>>>> Possible, but would you mind first telling me in which other patch(es) the
>>>> further reference(s) are being introduced, so I could take a look without
>>>> (again) digging through the entire series?
>>>
>>> Sure, the other references to the function are introduced in "xen/arm: add sve property for dom0less domUs” patch 11
>>
>> Personally I'm inclined to suggest adding "#ifdef CONFIG_ARM64_SVE" there.
>> But I guess that may again go against your desire to not ignore inapplicable
>> options. Still, I can't resist at least asking how an "sve" node on Arm32 is
>> different from an entirely unknown one.
> 
> It would be ok for me to use #ifdef CONFIG_ARM64_SVE and fail in the #else branch,
> but I had the feeling in the past that the Arm maintainers are not very happy with #ifdefs. I might
> be wrong, so I’ll wait for them to give an opinion, and then I will be happy to follow it.

IIRC, your suggestion is for patch #11. In this case, my preference is 
the #ifdef + throwing an error in the #else branch. This would avoid 
silently ignoring the property if SVE is not enabled (both Bertrand and I 
agreed this should not be ignored, see [1]).

Cheers,

[1] 
https://lore.kernel.org/all/7614AE25-F59D-430A-9C3E-30B1CE0E1580@arm.com/

-- 
Julien Grall



* Re: [PATCH v6 01/12] xen/arm: enable SVE extension for Xen
  2023-05-18  9:35   ` Julien Grall
@ 2023-05-19 14:26     ` Luca Fancellu
  2023-05-19 14:46       ` Julien Grall
  0 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-05-19 14:26 UTC (permalink / raw)
  To: Julien Grall
  Cc: Xen-devel, Bertrand Marquis, Wei Chen, Stefano Stabellini,
	Volodymyr Babchuk



> On 18 May 2023, at 10:35, Julien Grall <julien@xen.org> wrote:
> 
> Hi Luca,
> 
> Sorry for jumping late in the review.

Hi Julien,

Thank you for taking the time to review,

>> 
>>   /*
>>    * Comment from Linux:
>>    * Userspace may perform DC ZVA instructions. Mismatched block sizes
>> diff --git a/xen/arch/arm/arm64/sve-asm.S b/xen/arch/arm/arm64/sve-asm.S
>> new file mode 100644
>> index 000000000000..4d1549344733
>> --- /dev/null
>> +++ b/xen/arch/arm/arm64/sve-asm.S
>> @@ -0,0 +1,48 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Arm SVE assembly routines
>> + *
>> + * Copyright (C) 2022 ARM Ltd.
>> + *
>> + * Some macros and instruction encoding in this file are taken from linux 6.1.1,
>> + * file arch/arm64/include/asm/fpsimdmacros.h, some of them are a modified
>> + * version.
> AFAICT, the only modified version is _sve_rdvl, but it is not clear to me why we would want to have a modified version?
> 
> I am asking this because without an explanation, it would be difficult to know how to re-sync the code with Linux.

In this patch the macros are exactly equal to Linux, apart from the coding style that uses spaces instead of tabs.
I was not expecting to keep them in sync, as they seem unlikely to change soon; let me know if I should
also use tabs and be 100% identical to Linux.

The macros coming in patch 5 are equal apart from sve_save/sve_load, which are different because
the storage buffers are constructed differently here and in Linux; if you want, I can put a comment on them
in patch 5 to explain this difference.

>> 
>> diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
>> new file mode 100644
>> index 000000000000..6f3fb368c59b
>> --- /dev/null
>> +++ b/xen/arch/arm/arm64/sve.c
>> @@ -0,0 +1,50 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
> 
> Above, you are using GPL-2.0-only, but here GPL-2.0. We favor the former now. Happy to deal with it on commit if there is nothing else to address.

No problem, I will fix in the next push

> 
>> +/*
>> + * Arm SVE feature code
>> + *
>> + * Copyright (C) 2022 ARM Ltd.
>> + */
>> +
>> +#include <xen/types.h>
>> +#include <asm/arm64/sve.h>
>> +#include <asm/arm64/sysregs.h>
>> +#include <asm/processor.h>
>> +#include <asm/system.h>
>> +
>> +extern unsigned int sve_get_hw_vl(void);
>> +
>> +register_t compute_max_zcr(void)
>> +{
>> +    register_t cptr_bits = get_default_cptr_flags();
>> +    register_t zcr = vl_to_zcr(SVE_VL_MAX_BITS);
>> +    unsigned int hw_vl;
>> +
>> +    /* Remove trap for SVE resources */
>> +    WRITE_SYSREG(cptr_bits & ~HCPTR_CP(8), CPTR_EL2);
>> +    isb();
>> +
>> +    /*
>> +     * Set the maximum SVE vector length, doing that we will know the VL
>> +     * supported by the platform, calling sve_get_hw_vl()
>> +     */
>> +    WRITE_SYSREG(zcr, ZCR_EL2);
> 
>  From my reading of the Arm ARM (D19-6331, ARM DDI 0487J.a), a direct write to a system register would need to be followed by a context synchronization event (e.g. isb()) before the software can rely on the value.
> 
> In this situation, AFAICT, the instruction in sve_get_hw_vl() will use the content of ZCR_EL2. So don't we need an ISB() here?

From what I’ve read in the manual for ZCR_ELx:

An indirect read of ZCR_EL2.LEN appears to occur in program order relative to a direct write of
the same register, without the need for explicit synchronization

I’ve interpreted it as “there is no need for explicit synchronization after the write”, and I’ve looked into Linux:
no synchronisation mechanism appears after a write to that register. But if I am wrong, I can of course
add an isb if you prefer.

> 
>> +
>> +    /*
>> +     * Read the maximum VL, which could be lower than what we imposed before,
>> +     * hw_vl contains VL in bytes, multiply it by 8 to use vl_to_zcr() later
>> +     */
>> +    hw_vl = sve_get_hw_vl() * 8U;
>> +
>> +    /* Restore CPTR_EL2 */
>> +    WRITE_SYSREG(cptr_bits, CPTR_EL2);
>> +    isb();
>> +
>> +    return vl_to_zcr(hw_vl);
>> +}
>> +
>> +/* Takes a vector length in bits and returns the ZCR_ELx encoding */
>> +register_t vl_to_zcr(unsigned int vl)
>> +{
>> +    ASSERT(vl > 0);
>> +    return ((vl / SVE_VL_MULTIPLE_VAL) - 1U) & ZCR_ELx_LEN_MASK;
>> +}
> 
> Missing the emacs magic blocks at the end.

I’ll add

> 
>> diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
>> index c4ec38bb2554..83b84368f6d5 100644
>> --- a/xen/arch/arm/cpufeature.c
>> +++ b/xen/arch/arm/cpufeature.c
>> @@ -9,6 +9,7 @@
>>  #include <xen/init.h>
>>  #include <xen/smp.h>
>>  #include <xen/stop_machine.h>
>> +#include <asm/arm64/sve.h>
>>  #include <asm/cpufeature.h>
>>    DECLARE_BITMAP(cpu_hwcaps, ARM_NCAPS);
>> @@ -143,6 +144,9 @@ void identify_cpu(struct cpuinfo_arm *c)
>>        c->zfr64.bits[0] = READ_SYSREG(ID_AA64ZFR0_EL1);
>>  +    if ( cpu_has_sve )
>> +        c->zcr64.bits[0] = compute_max_zcr();
>> +
>>      c->dczid.bits[0] = READ_SYSREG(DCZID_EL0);
>>        c->ctr.bits[0] = READ_SYSREG(CTR_EL0);
>> @@ -199,7 +203,7 @@ static int __init create_guest_cpuinfo(void)
>>      guest_cpuinfo.pfr64.mpam = 0;
>>      guest_cpuinfo.pfr64.mpam_frac = 0;
>>  -    /* Hide SVE as Xen does not support it */
>> +    /* Hide SVE by default to the guests */
>>      guest_cpuinfo.pfr64.sve = 0;
>>      guest_cpuinfo.zfr64.bits[0] = 0;
>>  diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>> index d8ef6501ff8e..0350d8c61ed8 100644
>> --- a/xen/arch/arm/domain.c
>> +++ b/xen/arch/arm/domain.c
>> @@ -181,9 +181,6 @@ static void ctxt_switch_to(struct vcpu *n)
>>      /* VGIC */
>>      gic_restore_state(n);
>>  -    /* VFP */
>> -    vfp_restore_state(n);
>> -
> 
> At the moment ctxt_switch_to() is (mostly?) the reverse of ctxt_switch_from(). But with this change, you are going to break it.
> 
> I would really prefer if the existing convention stays because it helps to confirm that we didn't miss bits in the restore code.
> 
> So if you want to move vfp_restore_state() later, then please move vfp_save_state() earlier in ctxt_switch_from().

Ok I will move vfp_save_state earlier, and ...

> 
> 
>>      /* XXX MPU */
>>        /* Fault Status */
>> @@ -234,6 +231,7 @@ static void ctxt_switch_to(struct vcpu *n)
>>      p2m_restore_state(n);
>>        /* Control Registers */
>> +    WRITE_SYSREG(n->arch.cptr_el2, CPTR_EL2);
> 
> I would prefer if this was called closer to vfp_restore_state(), so the dependency between the two is easier to spot.
> 
>>      WRITE_SYSREG(n->arch.cpacr, CPACR_EL1);
>>        /*
>> @@ -258,6 +256,9 @@ static void ctxt_switch_to(struct vcpu *n)
>>  #endif
>>      isb();
>>  +    /* VFP */
> 
> Please document in the code that vfp_restore_state() has to be called after the CPTR_EL2 write + a synchronization event.
> 
> Similar documentation on top of at least CPTR_EL2 and possibly isb(). This would help if we need to re-order the code in the future.

I will put comments on top of CPTR_EL2 and vfp_restore_state to explain the sequence and the synchronisation.

> 
> 
>> +    vfp_restore_state(n);
>> +
>>      /* CP 15 */
>>      WRITE_SYSREG(n->arch.csselr, CSSELR_EL1);
>>  @@ -548,6 +549,8 @@ int arch_vcpu_create(struct vcpu *v)
>>        v->arch.vmpidr = MPIDR_SMP | vcpuid_to_vaffinity(v->vcpu_id);
>>  +    v->arch.cptr_el2 = get_default_cptr_flags();
>> +
>>      v->arch.hcr_el2 = get_default_hcr_flags();
>>        v->arch.mdcr_el2 = HDCR_TDRA | HDCR_TDOSA | HDCR_TDA;
>> diff --git a/xen/arch/arm/include/asm/arm64/sve.h b/xen/arch/arm/include/asm/arm64/sve.h
>> new file mode 100644
>> index 000000000000..144d2b1cc485
>> --- /dev/null
>> +++ b/xen/arch/arm/include/asm/arm64/sve.h
>> @@ -0,0 +1,43 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
> 
> Use GPL-2.0-only.

Ok

> 
>> +/*
>> + * Arm SVE feature code
>> + *
>> + * Copyright (C) 2022 ARM Ltd.
>> + */
>> +
>> +#ifndef _ARM_ARM64_SVE_H
>> +#define _ARM_ARM64_SVE_H
>> +
>> +#define SVE_VL_MAX_BITS (2048U)
> 
> NIT: The parentheses are unnecessary and we don't tend to add them in Xen.

Ok

> 
>> +
>> +/* Vector length must be multiple of 128 */
>> +#define SVE_VL_MULTIPLE_VAL (128U)
> 
> NIT: The parentheses are unnecessary

Ok

> 
>> +
>> +#ifdef CONFIG_ARM64_SVE
>> +
>> +register_t compute_max_zcr(void);
>> +register_t vl_to_zcr(unsigned int vl);
>> +
>> +#else /* !CONFIG_ARM64_SVE */
>> +
>> +static inline register_t compute_max_zcr(void)
>> +{
> 
> Is this meant to be called when SVE is not enabled? If not, then please add ASSERT_UNREACHABLE().

I’ll add

> 
>> +    return 0;
>> +}
>> +
>> +static inline register_t vl_to_zcr(unsigned int vl)
>> +{
> 
> Is this meant to be called when SVE is not enabled? If not, then please add ASSERT_UNREACHABLE().

It seems this was unneeded for this patch; maybe some change since v1 left it behind. I will remove it.

> 
>> +    return 0;
>> +}
>> +
>> +#endif /* CONFIG_ARM64_SVE */
>> +
>> +#endif /* _ARM_ARM64_SVE_H */
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */
>> diff --git a/xen/arch/arm/include/asm/arm64/sysregs.h b/xen/arch/arm/include/asm/arm64/sysregs.h
>> index 463899951414..4cabb9eb4d5e 100644
>> --- a/xen/arch/arm/include/asm/arm64/sysregs.h
>> +++ b/xen/arch/arm/include/asm/arm64/sysregs.h
>> @@ -24,6 +24,7 @@
>>  #define ICH_EISR_EL2              S3_4_C12_C11_3
>>  #define ICH_ELSR_EL2              S3_4_C12_C11_5
>>  #define ICH_VMCR_EL2              S3_4_C12_C11_7
>> +#define ZCR_EL2                   S3_4_C1_C2_0
>>    #define __LR0_EL2(x)              S3_4_C12_C12_ ## x
>>  #define __LR8_EL2(x)              S3_4_C12_C13_ ## x
>> diff --git a/xen/arch/arm/include/asm/cpufeature.h b/xen/arch/arm/include/asm/cpufeature.h
>> index c62cf6293fd6..6d703e051906 100644
>> --- a/xen/arch/arm/include/asm/cpufeature.h
>> +++ b/xen/arch/arm/include/asm/cpufeature.h
>> @@ -32,6 +32,12 @@
>>  #define cpu_has_thumbee   (boot_cpu_feature32(thumbee) == 1)
>>  #define cpu_has_aarch32   (cpu_has_arm || cpu_has_thumb)
>>  +#ifdef CONFIG_ARM64_SVE
>> +#define cpu_has_sve       (boot_cpu_feature64(sve) == 1)
>> +#else
>> +#define cpu_has_sve       (0)
> 
> NIT: The parentheses are unnecessary

Ok

> 
>> +#endif
>> +
>>  #ifdef CONFIG_ARM_32
>>  #define cpu_has_gicv3     (boot_cpu_feature32(gic) >= 1)
>>  #define cpu_has_gentimer  (boot_cpu_feature32(gentimer) == 1)
>> @@ -323,6 +329,14 @@ struct cpuinfo_arm {
>>          };
>>      } isa64;
>>  +    union {
>> +        register_t bits[1];
>> +        struct {
>> +            unsigned long len:4;
>> +            unsigned long __res0:60;
>> +        };
>> +    } zcr64;
>> +
>>      struct {
>>          register_t bits[1];
>>      } zfr64;
>> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
>> index 2a51f0ca688e..e776ee704b7d 100644
>> --- a/xen/arch/arm/include/asm/domain.h
>> +++ b/xen/arch/arm/include/asm/domain.h
>> @@ -190,6 +190,7 @@ struct arch_vcpu
>>      register_t tpidrro_el0;
>>        /* HYP configuration */
>> +    register_t cptr_el2;
>>      register_t hcr_el2;
>>      register_t mdcr_el2;
>>  diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
>> index 54f253087718..bc683334125c 100644
>> --- a/xen/arch/arm/include/asm/processor.h
>> +++ b/xen/arch/arm/include/asm/processor.h
>> @@ -582,6 +582,8 @@ void do_trap_guest_serror(struct cpu_user_regs *regs);
>>    register_t get_default_hcr_flags(void);
>>  +register_t get_default_cptr_flags(void);
>> +
>>  /*
>>   * Synchronize SError unless the feature is selected.
>>   * This is relying on the SErrors are currently unmasked.
>> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
>> index 6f9f4d8c8a15..4191a766767a 100644
>> --- a/xen/arch/arm/setup.c
>> +++ b/xen/arch/arm/setup.c
>> @@ -135,10 +135,11 @@ static void __init processor_id(void)
>>             cpu_has_el2_32 ? "64+32" : cpu_has_el2_64 ? "64" : "No",
>>             cpu_has_el1_32 ? "64+32" : cpu_has_el1_64 ? "64" : "No",
>>             cpu_has_el0_32 ? "64+32" : cpu_has_el0_64 ? "64" : "No");
>> -    printk("    Extensions:%s%s%s\n",
>> +    printk("    Extensions:%s%s%s%s\n",
>>             cpu_has_fp ? " FloatingPoint" : "",
>>             cpu_has_simd ? " AdvancedSIMD" : "",
>> -           cpu_has_gicv3 ? " GICv3-SysReg" : "");
>> +           cpu_has_gicv3 ? " GICv3-SysReg" : "",
>> +           cpu_has_sve ? " SVE" : "");
>>        /* Warn user if we find unknown floating-point features */
>>      if ( cpu_has_fp && (boot_cpu_feature64(fp) >= 2) )
>> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
>> index d40c331a4e9c..c0611c2ef6a5 100644
>> --- a/xen/arch/arm/traps.c
>> +++ b/xen/arch/arm/traps.c
>> @@ -93,6 +93,21 @@ register_t get_default_hcr_flags(void)
>>               HCR_TID3|HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB|HCR_TSW);
>>  }
>>  +register_t get_default_cptr_flags(void)
>> +{
>> +    /*
>> +     * Trap all coprocessor registers (0-13) except cp10 and
>> +     * cp11 for VFP.
>> +     *
>> +     * /!\ All coprocessors except cp10 and cp11 cannot be used in Xen.
>> +     *
>> +     * On ARM64 the TCPx bits which we set here (0..9,12,13) are all
>> +     * RES1, i.e. they would trap whether we did this write or not.
>> +     */
>> +    return  ((HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) |
>> +             HCPTR_TTA | HCPTR_TAM);
>> +}
>> +
>>  static enum {
>>      SERRORS_DIVERSE,
>>      SERRORS_PANIC,
>> @@ -122,6 +137,7 @@ __initcall(update_serrors_cpu_caps);
>>    void init_traps(void)
>>  {
>> +    register_t cptr_bits = get_default_cptr_flags();
> 
> Coding style: Please add a newline after the declaration. That said...
> 
>>      /*
>>       * Setup Hyp vector base. Note they might get updated with the
>>       * branch predictor hardening.
>> @@ -135,17 +151,7 @@ void init_traps(void)
>>      /* Trap CP15 c15 used for implementation defined registers */
>>      WRITE_SYSREG(HSTR_T(15), HSTR_EL2);
>>  -    /* Trap all coprocessor registers (0-13) except cp10 and
>> -     * cp11 for VFP.
>> -     *
>> -     * /!\ All coprocessors except cp10 and cp11 cannot be used in Xen.
>> -     *
>> -     * On ARM64 the TCPx bits which we set here (0..9,12,13) are all
>> -     * RES1, i.e. they would trap whether we did this write or not.
>> -     */
>> -    WRITE_SYSREG((HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) |
>> -                 HCPTR_TTA | HCPTR_TAM,
>> -                 CPTR_EL2);
>> +    WRITE_SYSREG(cptr_bits, CPTR_EL2);
> 
> ... I would combine the two lines as the variable seems unnecessary.

I will combine

> 
>>        /*
>>       * Configure HCR_EL2 with the bare minimum to run Xen until a guest
> 
> Cheers,
> 
> -- 
> Julien Grall




* Re: [PATCH v6 01/12] xen/arm: enable SVE extension for Xen
  2023-05-19 14:26     ` Luca Fancellu
@ 2023-05-19 14:46       ` Julien Grall
  2023-05-19 14:51         ` Luca Fancellu
  2023-05-22  7:50         ` Jan Beulich
  0 siblings, 2 replies; 56+ messages in thread
From: Julien Grall @ 2023-05-19 14:46 UTC (permalink / raw)
  To: Luca Fancellu
  Cc: Xen-devel, Bertrand Marquis, Wei Chen, Stefano Stabellini,
	Volodymyr Babchuk

Hi Luca,

On 19/05/2023 15:26, Luca Fancellu wrote:
>> On 18 May 2023, at 10:35, Julien Grall <julien@xen.org> wrote:
>>>    /*
>>>     * Comment from Linux:
>>>     * Userspace may perform DC ZVA instructions. Mismatched block sizes
>>> diff --git a/xen/arch/arm/arm64/sve-asm.S b/xen/arch/arm/arm64/sve-asm.S
>>> new file mode 100644
>>> index 000000000000..4d1549344733
>>> --- /dev/null
>>> +++ b/xen/arch/arm/arm64/sve-asm.S
>>> @@ -0,0 +1,48 @@
>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>> +/*
>>> + * Arm SVE assembly routines
>>> + *
>>> + * Copyright (C) 2022 ARM Ltd.
>>> + *
>>> + * Some macros and instruction encoding in this file are taken from linux 6.1.1,
>>> + * file arch/arm64/include/asm/fpsimdmacros.h, some of them are a modified
>>> + * version.
>> AFAICT, the only modified version is _sve_rdvl, but it is not clear to me why we would want to have a modified version?
>>
>> I am asking this because without an explanation, it would be difficult to know how to re-sync the code with Linux.
> 
> In this patch the macros are exactly equal to Linux, apart from the coding style that uses spaces instead of tabs.
> I was not expecting to keep them in sync, as they seem unlikely to change soon; let me know if I should
> also use tabs and be 100% identical to Linux.

The file is small enough, so I think it would be OK if this is converted 
to the Xen coding style.

> 
> The macros coming in patch 5 are equal apart from sve_save/sve_load, which are different because
> the storage buffers are constructed differently here and in Linux; if you want, I can put a comment on them
> in patch 5 to explain this difference.

That would be good. Also, can you update 
arch/arm/README.LinuxPrimitives? The file lists the primitives imported 
from Linux and when.

> 
>>>
>>> diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
>>> new file mode 100644
>>> index 000000000000..6f3fb368c59b
>>> --- /dev/null
>>> +++ b/xen/arch/arm/arm64/sve.c
>>> @@ -0,0 +1,50 @@
>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>
>> Above, you are using GPL-2.0-only, but here GPL-2.0. We favor the former now. Happy to deal with it on commit if there is nothing else to address.
> 
> No problem, I will fix in the next push
> 
>>
>>> +/*
>>> + * Arm SVE feature code
>>> + *
>>> + * Copyright (C) 2022 ARM Ltd.
>>> + */
>>> +
>>> +#include <xen/types.h>
>>> +#include <asm/arm64/sve.h>
>>> +#include <asm/arm64/sysregs.h>
>>> +#include <asm/processor.h>
>>> +#include <asm/system.h>
>>> +
>>> +extern unsigned int sve_get_hw_vl(void);
>>> +
>>> +register_t compute_max_zcr(void)
>>> +{
>>> +    register_t cptr_bits = get_default_cptr_flags();
>>> +    register_t zcr = vl_to_zcr(SVE_VL_MAX_BITS);
>>> +    unsigned int hw_vl;
>>> +
>>> +    /* Remove trap for SVE resources */
>>> +    WRITE_SYSREG(cptr_bits & ~HCPTR_CP(8), CPTR_EL2);
>>> +    isb();
>>> +
>>> +    /*
>>> +     * Set the maximum SVE vector length, doing that we will know the VL
>>> +     * supported by the platform, calling sve_get_hw_vl()
>>> +     */
>>> +    WRITE_SYSREG(zcr, ZCR_EL2);
>>
>>  From my reading of the Arm ARM (D19-6331, ARM DDI 0487J.a), a direct write to a system register would need to be followed by a context synchronization event (e.g. isb()) before the software can rely on the value.
>>
>> In this situation, AFAICT, the instruction in sve_get_hw_vl() will use the content of ZCR_EL2. So don't we need an ISB() here?
> 
>  From what I’ve read in the manual for ZCR_ELx:
> 
> An indirect read of ZCR_EL2.LEN appears to occur in program order relative to a direct write of
> the same register, without the need for explicit synchronization
> 
> I’ve interpreted it as “there is no need for explicit synchronization after the write”, and I’ve looked into Linux:
> no synchronisation mechanism appears after a write to that register. But if I am wrong, I can of course
> add an isb if you prefer.

Ah, I was reading the generic section about synchronization and didn't 
realize there was a paragraph in the ZCR_EL2 section as well.

Reading the new section, I agree with your understanding. The isb() is 
not necessary.

So please ignore this comment :).

>>>       /* XXX MPU */
>>>         /* Fault Status */
>>> @@ -234,6 +231,7 @@ static void ctxt_switch_to(struct vcpu *n)
>>>       p2m_restore_state(n);
>>>         /* Control Registers */
>>> +    WRITE_SYSREG(n->arch.cptr_el2, CPTR_EL2);
>>
>> I would prefer if this called closer to vfp_restore_state(). So the dependency between the two is easier to spot.
>>
>>>       WRITE_SYSREG(n->arch.cpacr, CPACR_EL1);
>>>         /*
>>> @@ -258,6 +256,9 @@ static void ctxt_switch_to(struct vcpu *n)
>>>   #endif
>>>       isb();
>>>   +    /* VFP */
>>
>> Please document in the code that vfp_restore_state() have to be called after CPTR_EL2() + a synchronization event.
>>
>> Similar docoumentation on top of at least CPTR_EL2 and possibly isb(). This would help if we need to re-order the code in the future.
> 
> I will put comments on top of CPTR_EL2 and vfp_restore_state to explain the sequence and the synchronisation.

Just to clarify, does this mean you will keep CPTR_EL2 where it 
currently is? (See my comment just above in the previous e-mail)

Cheers,

-- 
Julien Grall



* Re: [PATCH v6 01/12] xen/arm: enable SVE extension for Xen
  2023-05-19 14:46       ` Julien Grall
@ 2023-05-19 14:51         ` Luca Fancellu
  2023-05-19 15:00           ` Julien Grall
  2023-05-22  7:50         ` Jan Beulich
  1 sibling, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-05-19 14:51 UTC (permalink / raw)
  To: Julien Grall
  Cc: Xen-devel, Bertrand Marquis, Wei Chen, Stefano Stabellini,
	Volodymyr Babchuk



> On 19 May 2023, at 15:46, Julien Grall <julien@xen.org> wrote:
> 
> Hi Luca,
> 
> On 19/05/2023 15:26, Luca Fancellu wrote:
>>> On 18 May 2023, at 10:35, Julien Grall <julien@xen.org> wrote:
>>>>   /*
>>>>    * Comment from Linux:
>>>>    * Userspace may perform DC ZVA instructions. Mismatched block sizes
>>>> diff --git a/xen/arch/arm/arm64/sve-asm.S b/xen/arch/arm/arm64/sve-asm.S
>>>> new file mode 100644
>>>> index 000000000000..4d1549344733
>>>> --- /dev/null
>>>> +++ b/xen/arch/arm/arm64/sve-asm.S
>>>> @@ -0,0 +1,48 @@
>>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>>> +/*
>>>> + * Arm SVE assembly routines
>>>> + *
>>>> + * Copyright (C) 2022 ARM Ltd.
>>>> + *
>>>> + * Some macros and instruction encoding in this file are taken from linux 6.1.1,
>>>> + * file arch/arm64/include/asm/fpsimdmacros.h, some of them are a modified
>>>> + * version.
>>> AFAICT, the only modified version is _sve_rdvl, but it is not clear to me why we would want to have a modified version?
>>> 
>>> I am asking this because without an explanation, it would be difficult to know how to re-sync the code with Linux.
>> In this patch the macros are exactly equal to Linux, apart from the coding style that uses spaces instead of tabs.
>> I was not expecting to keep them in sync, as they seem unlikely to change soon; let me know if I should
>> also use tabs and be 100% identical to Linux.
> 
> The file is small enough, so I think it would be OK if this is converted to the Xen coding style.
> 
>> The macros coming in patch 5 are equal apart from sve_save/sve_load, which are different because
>> the storage buffers are constructed differently here and in Linux; if you want, I can put a comment on them
>> in patch 5 to explain this difference.
> 
> That would be good. Also, can you update arch/arm/README.LinuxPrimitives? The file is listing primitives imported from Linux and when.

Sure I will

> 
>>>> 
>>>> diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
>>>> new file mode 100644
>>>> index 000000000000..6f3fb368c59b
>>>> --- /dev/null
>>>> +++ b/xen/arch/arm/arm64/sve.c
>>>> @@ -0,0 +1,50 @@
>>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>> 
>>> Above, you are using GPL-2.0-only, but here GPL-2.0. We favor the former now. Happy to deal with it on commit if there is nothing else to address.
>> No problem, I will fix in the next push
>>> 
>>>> +/*
>>>> + * Arm SVE feature code
>>>> + *
>>>> + * Copyright (C) 2022 ARM Ltd.
>>>> + */
>>>> +
>>>> +#include <xen/types.h>
>>>> +#include <asm/arm64/sve.h>
>>>> +#include <asm/arm64/sysregs.h>
>>>> +#include <asm/processor.h>
>>>> +#include <asm/system.h>
>>>> +
>>>> +extern unsigned int sve_get_hw_vl(void);
>>>> +
>>>> +register_t compute_max_zcr(void)
>>>> +{
>>>> +    register_t cptr_bits = get_default_cptr_flags();
>>>> +    register_t zcr = vl_to_zcr(SVE_VL_MAX_BITS);
>>>> +    unsigned int hw_vl;
>>>> +
>>>> +    /* Remove trap for SVE resources */
>>>> +    WRITE_SYSREG(cptr_bits & ~HCPTR_CP(8), CPTR_EL2);
>>>> +    isb();
>>>> +
>>>> +    /*
>>>> +     * Set the maximum SVE vector length, doing that we will know the VL
>>>> +     * supported by the platform, calling sve_get_hw_vl()
>>>> +     */
>>>> +    WRITE_SYSREG(zcr, ZCR_EL2);
>>> 
>>> From my reading of the Arm ARM (D19-6331, ARM DDI 0487J.a), a direct write to a system register would need to be followed by a context synchronization event (e.g. isb()) before the software can rely on the value.
>>> 
>>> In this situation, AFAICT, the instruction in sve_get_hw_vl() will use the content of ZCR_EL2. So don't we need an ISB() here?
>> From what I’ve read in the manual for ZCR_ELx:
>> “An indirect read of ZCR_EL2.LEN appears to occur in program order relative to a direct write of
>> the same register, without the need for explicit synchronization”
>> I’ve interpreted it as “there is no need to sync after the write”, and looking into Linux there does not
>> appear to be any synchronisation mechanism after a write to that register either, but if I am wrong I can
>> certainly add an isb if you prefer.
> 
> Ah, I was reading the generic section about synchronization and didn't realize there was a paragraph in the ZCR_EL2 section as well.
> 
> Reading the new section, I agree with your understanding. The isb() is not necessary.
> 
> So please ignore this comment :).
> 
>>>>      /* XXX MPU */
>>>>        /* Fault Status */
>>>> @@ -234,6 +231,7 @@ static void ctxt_switch_to(struct vcpu *n)
>>>>      p2m_restore_state(n);
>>>>        /* Control Registers */
>>>> +    WRITE_SYSREG(n->arch.cptr_el2, CPTR_EL2);
>>> 
>>> I would prefer if this was called closer to vfp_restore_state(), so the dependency between the two is easier to spot.
>>> 
>>>>      WRITE_SYSREG(n->arch.cpacr, CPACR_EL1);
>>>>        /*
>>>> @@ -258,6 +256,9 @@ static void ctxt_switch_to(struct vcpu *n)
>>>>  #endif
>>>>      isb();
>>>>  +    /* VFP */
>>> 
>>> Please document in the code that vfp_restore_state() has to be called after the write to CPTR_EL2 + a synchronization event.
>>> 
>>> Similar documentation on top of at least CPTR_EL2 and possibly isb(). This would help if we need to re-order the code in the future.
>> I will put comments on top of CPTR_EL2 and vfp_restore_state to explain the sequence and the synchronisation.
> 
> Just to clarify, does this mean you will keep CPTR_EL2 where it currently is? (See my comment just above in the previous e-mail)

This is how I changed the code:

/* Control Registers */
/*
* CPTR_EL2 needs to be written before calling vfp_restore_state, a
* synchronization instruction is expected after the write (isb)
*/
WRITE_SYSREG(n->arch.cptr_el2, CPTR_EL2);
WRITE_SYSREG(n->arch.cpacr, CPACR_EL1);

/*
* This write to sysreg CONTEXTIDR_EL1 ensures we don't hit erratum
* #852523 (Cortex-A57) or #853709 (Cortex-A72).
* I.e DACR32_EL2 is not correctly synchronized.
*/
WRITE_SYSREG(n->arch.contextidr, CONTEXTIDR_EL1);
WRITE_SYSREG(n->arch.tpidr_el0, TPIDR_EL0);
WRITE_SYSREG(n->arch.tpidrro_el0, TPIDRRO_EL0);
WRITE_SYSREG(n->arch.tpidr_el1, TPIDR_EL1);

if ( is_32bit_domain(n->domain) && cpu_has_thumbee )
{
WRITE_SYSREG(n->arch.teecr, TEECR32_EL1);
WRITE_SYSREG(n->arch.teehbr, TEEHBR32_EL1);
}

#ifdef CONFIG_ARM_32
WRITE_CP32(n->arch.joscr, JOSCR);
WRITE_CP32(n->arch.jmcr, JMCR);
#endif
isb();

/* VFP - call vfp_restore_state after writing on CPTR_EL2 + isb */
vfp_restore_state(n);

Maybe I misunderstood your preference, do you want me to put the write to CPTR_EL2
right before the isb() that precedes vfp_restore_state?


> 
> Cheers,
> 
> -- 
> Julien Grall


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v6 01/12] xen/arm: enable SVE extension for Xen
  2023-05-19 14:51         ` Luca Fancellu
@ 2023-05-19 15:00           ` Julien Grall
  2023-05-19 15:13             ` Luca Fancellu
  0 siblings, 1 reply; 56+ messages in thread
From: Julien Grall @ 2023-05-19 15:00 UTC (permalink / raw)
  To: Luca Fancellu
  Cc: Xen-devel, Bertrand Marquis, Wei Chen, Stefano Stabellini,
	Volodymyr Babchuk



On 19/05/2023 15:51, Luca Fancellu wrote:
> /* Control Registers */
> /*
> * CPTR_EL2 needs to be written before calling vfp_restore_state, a
> * synchronization instruction is expected after the write (isb)
> */
> WRITE_SYSREG(n->arch.cptr_el2, CPTR_EL2);
> WRITE_SYSREG(n->arch.cpacr, CPACR_EL1);
> 
> /*
> * This write to sysreg CONTEXTIDR_EL1 ensures we don't hit erratum
> * #852523 (Cortex-A57) or #853709 (Cortex-A72).
> * I.e DACR32_EL2 is not correctly synchronized.
> */
> WRITE_SYSREG(n->arch.contextidr, CONTEXTIDR_EL1);
> WRITE_SYSREG(n->arch.tpidr_el0, TPIDR_EL0);
> WRITE_SYSREG(n->arch.tpidrro_el0, TPIDRRO_EL0);
> WRITE_SYSREG(n->arch.tpidr_el1, TPIDR_EL1);
> 
> if ( is_32bit_domain(n->domain) && cpu_has_thumbee )
> {
> WRITE_SYSREG(n->arch.teecr, TEECR32_EL1);
> WRITE_SYSREG(n->arch.teehbr, TEEHBR32_EL1);
> }
> 
> #ifdef CONFIG_ARM_32
> WRITE_CP32(n->arch.joscr, JOSCR);
> WRITE_CP32(n->arch.jmcr, JMCR);
> #endif
> isb();
> 
> /* VFP - call vfp_restore_state after writing on CPTR_EL2 + isb */
> vfp_restore_state(n);
> 
> Maybe I misunderstood your preference, do you want me to put the write to CPTR_EL2
> right before the isb() that precedes vfp_restore_state?

Yes please. Unless there is a reason to keep it "far away". The comments 
look good to me.

Cheers,

-- 
Julien Grall



* Re: [PATCH v6 01/12] xen/arm: enable SVE extension for Xen
  2023-05-19 15:00           ` Julien Grall
@ 2023-05-19 15:13             ` Luca Fancellu
  2023-05-19 15:17               ` Julien Grall
  0 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-05-19 15:13 UTC (permalink / raw)
  To: Julien Grall
  Cc: Xen-devel, Bertrand Marquis, Wei Chen, Stefano Stabellini,
	Volodymyr Babchuk



> On 19 May 2023, at 16:00, Julien Grall <julien@xen.org> wrote:
> 
> 
> 
> On 19/05/2023 15:51, Luca Fancellu wrote:
>> /* Control Registers */
>> /*
>> * CPTR_EL2 needs to be written before calling vfp_restore_state, a
>> * synchronization instruction is expected after the write (isb)
>> */
>> WRITE_SYSREG(n->arch.cptr_el2, CPTR_EL2);
>> WRITE_SYSREG(n->arch.cpacr, CPACR_EL1);
>> /*
>> * This write to sysreg CONTEXTIDR_EL1 ensures we don't hit erratum
>> * #852523 (Cortex-A57) or #853709 (Cortex-A72).
>> * I.e DACR32_EL2 is not correctly synchronized.
>> */
>> WRITE_SYSREG(n->arch.contextidr, CONTEXTIDR_EL1);
>> WRITE_SYSREG(n->arch.tpidr_el0, TPIDR_EL0);
>> WRITE_SYSREG(n->arch.tpidrro_el0, TPIDRRO_EL0);
>> WRITE_SYSREG(n->arch.tpidr_el1, TPIDR_EL1);
>> if ( is_32bit_domain(n->domain) && cpu_has_thumbee )
>> {
>> WRITE_SYSREG(n->arch.teecr, TEECR32_EL1);
>> WRITE_SYSREG(n->arch.teehbr, TEEHBR32_EL1);
>> }
>> #ifdef CONFIG_ARM_32
>> WRITE_CP32(n->arch.joscr, JOSCR);
>> WRITE_CP32(n->arch.jmcr, JMCR);
>> #endif
>> isb();
>> /* VFP - call vfp_restore_state after writing on CPTR_EL2 + isb */
>> vfp_restore_state(n);
>> Maybe I misunderstood your preference, do you want me to put the write to CPTR_EL2
>> right before the isb() that precedes vfp_restore_state?
> 
> Yes please. Unless there is a reason to keep it "far away". The comments look good to me.

Ok, a question regarding README.LinuxPrimitives: is it a file consumed by some automated tool?
Because I see there is some kind of structure; how can I know if my syntax is correct?

> 
> Cheers,
> 
> -- 
> Julien Grall





* Re: [PATCH v6 01/12] xen/arm: enable SVE extension for Xen
  2023-05-19 15:13             ` Luca Fancellu
@ 2023-05-19 15:17               ` Julien Grall
  0 siblings, 0 replies; 56+ messages in thread
From: Julien Grall @ 2023-05-19 15:17 UTC (permalink / raw)
  To: Luca Fancellu
  Cc: Xen-devel, Bertrand Marquis, Wei Chen, Stefano Stabellini,
	Volodymyr Babchuk

Hi,

On 19/05/2023 16:13, Luca Fancellu wrote:
>> On 19/05/2023 15:51, Luca Fancellu wrote:
>>> /* Control Registers */
>>> /*
>>> * CPTR_EL2 needs to be written before calling vfp_restore_state, a
>>> * synchronization instruction is expected after the write (isb)
>>> */
>>> WRITE_SYSREG(n->arch.cptr_el2, CPTR_EL2);
>>> WRITE_SYSREG(n->arch.cpacr, CPACR_EL1);
>>> /*
>>> * This write to sysreg CONTEXTIDR_EL1 ensures we don't hit erratum
>>> * #852523 (Cortex-A57) or #853709 (Cortex-A72).
>>> * I.e DACR32_EL2 is not correctly synchronized.
>>> */
>>> WRITE_SYSREG(n->arch.contextidr, CONTEXTIDR_EL1);
>>> WRITE_SYSREG(n->arch.tpidr_el0, TPIDR_EL0);
>>> WRITE_SYSREG(n->arch.tpidrro_el0, TPIDRRO_EL0);
>>> WRITE_SYSREG(n->arch.tpidr_el1, TPIDR_EL1);
>>> if ( is_32bit_domain(n->domain) && cpu_has_thumbee )
>>> {
>>> WRITE_SYSREG(n->arch.teecr, TEECR32_EL1);
>>> WRITE_SYSREG(n->arch.teehbr, TEEHBR32_EL1);
>>> }
>>> #ifdef CONFIG_ARM_32
>>> WRITE_CP32(n->arch.joscr, JOSCR);
>>> WRITE_CP32(n->arch.jmcr, JMCR);
>>> #endif
>>> isb();
>>> /* VFP - call vfp_restore_state after writing on CPTR_EL2 + isb */
>>> vfp_restore_state(n);
>>> Maybe I misunderstood your preference, do you want me to put the write to CPTR_EL2
>>> right before the isb() that precedes vfp_restore_state?
>>
>> Yes please. Unless there is a reason to keep it "far away". The comments look good to me.
> 
> Ok, a question regarding README.LinuxPrimitives: is it a file consumed by some automated tool?

I am not aware of any automated tools using it. All the re-syncs I have 
seen recently were manual.

> Because I see there is some kind of structure; how can I know if my syntax is correct?

There are some commands to help syncing when the file is very similar to 
Linux. In your case, I would follow what we did for the atomics as only 
some functions are the same.
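(For illustration only, a hypothetical entry in the style of the existing ones — the version, commit hash and exact wording below are placeholders, not taken from this series:)

```text
SVE helpers: last sync @ v6.1.1 (last commit: <commit hash>)

linux/arch/arm64/include/asm/fpsimdmacros.h  xen/arch/arm/arm64/sve-asm.S
```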

Cheers,

-- 
Julien Grall



* Re: [PATCH v6 05/12] arm/sve: save/restore SVE context switch
  2023-05-18 18:27   ` Julien Grall
@ 2023-05-19 17:35     ` Luca Fancellu
  2023-05-19 17:52       ` Julien Grall
  0 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-05-19 17:35 UTC (permalink / raw)
  To: Julien Grall
  Cc: Xen-devel, Bertrand Marquis, Wei Chen, Stefano Stabellini,
	Volodymyr Babchuk



> On 18 May 2023, at 19:27, Julien Grall <julien@xen.org> wrote:
> 
> Hi Luca,
> 
> On 24/04/2023 07:02, Luca Fancellu wrote:
>> Save/restore context switch for SVE, allocate memory to contain
>> the Z0-31 registers whose length is maximum 2048 bits each and
>> FFR who can be maximum 256 bits, the allocated memory depends on
>> how many bits is the vector length for the domain and how many bits
>> are supported by the platform.
>> Save P0-15 whose length is maximum 256 bits each, in this case the
>> memory used is from the fpregs field in struct vfp_state,
>> because V0-31 are part of Z0-31 and this space would have been
>> unused for SVE domain otherwise.
>> Create zcr_el{1,2} fields in arch_vcpu, initialise zcr_el2 on vcpu
>> creation given the requested vector length and restore it on
>> context switch, save/restore ZCR_EL1 value as well.
>> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
>> ---
>> Changes from v5:
>>  - use XFREE instead of xfree, keep the headers (Julien)
>>  - Avoid math computation for every save/restore, store the computation
>>    in struct vfp_state once (Bertrand)
>>  - protect access to v->domain->arch.sve_vl inside arch_vcpu_create now
>>    that sve_vl is available only on arm64
>> Changes from v4:
>>  - No changes
>> Changes from v3:
>>  - don't use fixed len types when not needed (Jan)
>>  - now VL is an encoded value, decode it before using.
>> Changes from v2:
>>  - No changes
>> Changes from v1:
>>  - No changes
>> Changes from RFC:
>>  - Moved zcr_el2 field introduction in this patch, restore its
>>    content inside sve_restore_state function. (Julien)
>> ---
>>  xen/arch/arm/arm64/sve-asm.S             | 141 +++++++++++++++++++++++
>>  xen/arch/arm/arm64/sve.c                 |  63 ++++++++++
>>  xen/arch/arm/arm64/vfp.c                 |  79 +++++++------
>>  xen/arch/arm/domain.c                    |   9 ++
>>  xen/arch/arm/include/asm/arm64/sve.h     |  13 +++
>>  xen/arch/arm/include/asm/arm64/sysregs.h |   3 +
>>  xen/arch/arm/include/asm/arm64/vfp.h     |  12 ++
>>  xen/arch/arm/include/asm/domain.h        |   2 +
>>  8 files changed, 288 insertions(+), 34 deletions(-)
>> diff --git a/xen/arch/arm/arm64/sve-asm.S b/xen/arch/arm/arm64/sve-asm.S
>> index 4d1549344733..8c37d7bc95d5 100644
>> --- a/xen/arch/arm/arm64/sve-asm.S
>> +++ b/xen/arch/arm/arm64/sve-asm.S
> 
> Are all the new helpers added in this patch taken from Linux? If so, it would be good to clarify this (again) in the commit message as it helps for the review (I can diff with Linux rather than properly reviewing them).
> 
>> diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
>> index 86a5e617bfca..064832b450ff 100644
>> --- a/xen/arch/arm/arm64/sve.c
>> +++ b/xen/arch/arm/arm64/sve.c
>> @@ -5,6 +5,8 @@
>>   * Copyright (C) 2022 ARM Ltd.
>>   */
>>  +#include <xen/sched.h>
>> +#include <xen/sizes.h>
>>  #include <xen/types.h>
>>  #include <asm/arm64/sve.h>
>>  #include <asm/arm64/sysregs.h>
>> @@ -13,6 +15,24 @@
>>  #include <asm/system.h>
>>    extern unsigned int sve_get_hw_vl(void);
>> +extern void sve_save_ctx(uint64_t *sve_ctx, uint64_t *pregs, int save_ffr);
>> +extern void sve_load_ctx(uint64_t const *sve_ctx, uint64_t const *pregs,
>> +                         int restore_ffr);
> 
> From the use, it is not entirely clear what restore_ffr/save_ffr is meant to be. Are they bool? If so, maybe use bool? At minimum, they probably want to be unsigned int.

I have to say that I trusted the Linux implementation here, in arch/arm64/include/asm/fpsimd.h, which uses int:

extern void sve_save_state(void *state, u32 *pfpsr, int save_ffr);
extern void sve_load_state(void const *state, u32 const *pfpsr,
int restore_ffr);

But if you prefer I can put unsigned int instead.

> 
>> +
>> +static inline unsigned int sve_zreg_ctx_size(unsigned int vl)
>> +{
>> +    /*
>> +     * Z0-31 registers size in bytes is computed from VL that is in bits, so VL
>> +     * in bytes is VL/8.
>> +     */
>> +    return (vl / 8U) * 32U;
>> +}
>> +
>> +static inline unsigned int sve_ffrreg_ctx_size(unsigned int vl)
>> +{
>> +    /* FFR register size is VL/8, which is in bytes (VL/8)/8 */
>> +    return (vl / 64U);
>> +}
>>    register_t compute_max_zcr(void)
>>  {
>> @@ -60,3 +80,46 @@ unsigned int get_sys_vl_len(void)
>>      return ((system_cpuinfo.zcr64.bits[0] & ZCR_ELx_LEN_MASK) + 1U) *
>>              SVE_VL_MULTIPLE_VAL;
>>  }
>> +
>> +int sve_context_init(struct vcpu *v)
>> +{
>> +    unsigned int sve_vl_bits = sve_decode_vl(v->domain->arch.sve_vl);
>> +    uint64_t *ctx = _xzalloc(sve_zreg_ctx_size(sve_vl_bits) +
>> +                             sve_ffrreg_ctx_size(sve_vl_bits),
>> +                             L1_CACHE_BYTES);
>> +
>> +    if ( !ctx )
>> +        return -ENOMEM;
>> +
>> +    /* Point to the end of Z0-Z31 memory, just before FFR memory */
> 
> NIT: I would add that the logic should be kept in sync with sve_context_free(). Same...
> 
>> +    v->arch.vfp.sve_zreg_ctx_end = ctx +
>> +        (sve_zreg_ctx_size(sve_vl_bits) / sizeof(uint64_t));
>> +
>> +    return 0;
>> +}
>> +
>> +void sve_context_free(struct vcpu *v)
>> +{
>> +    unsigned int sve_vl_bits = sve_decode_vl(v->domain->arch.sve_vl);
>> +
>> +    /* Point back to the beginning of Z0-Z31 + FFR memory */
> 
> ... here (but with sve_context_init()). So it is clearer that if the logic change in one place then it needs to be changed in the other.

Sure I will

> 
>> +    v->arch.vfp.sve_zreg_ctx_end -=
>> +        (sve_zreg_ctx_size(sve_vl_bits) / sizeof(uint64_t));
> 
> From my understanding, sve_context_free() could be called with sve_zreg_ctx_end equal to NULL (i.e. because sve_context_init() failed). So wouldn't we end up subtracting the value from NULL and therefore...
> 
>> +
>> +    XFREE(v->arch.vfp.sve_zreg_ctx_end);
> 
> ... free a random pointer?

Thank you for spotting this; I will surround the operations in sve_context_free with:

if ( v->arch.vfp.sve_zreg_ctx_end )

I’m assuming the memory for the vfp structure is zero-initialised; please
correct me if I’m wrong.

> 
>> +}
>> +
>> +void sve_save_state(struct vcpu *v)
>> +{
>> +    v->arch.zcr_el1 = READ_SYSREG(ZCR_EL1);
>> +
>> +    sve_save_ctx(v->arch.vfp.sve_zreg_ctx_end, v->arch.vfp.fpregs, 1);
>> +}
>> +
>> +void sve_restore_state(struct vcpu *v)
>> +{
>> +    WRITE_SYSREG(v->arch.zcr_el1, ZCR_EL1);
>> +    WRITE_SYSREG(v->arch.zcr_el2, ZCR_EL2);
> 
> AFAIU, this value will be used for the restore below. So don't we need an isb()?

We reached an agreement on this in patch 1.

> 
>> +
>> +    sve_load_ctx(v->arch.vfp.sve_zreg_ctx_end, v->arch.vfp.fpregs, 1);
>> +}
>> diff --git a/xen/arch/arm/arm64/vfp.c b/xen/arch/arm/arm64/vfp.c
>> index 47885e76baae..2d0d7c2e6ddb 100644
>> --- a/xen/arch/arm/arm64/vfp.c
>> +++ b/xen/arch/arm/arm64/vfp.c
>> @@ -2,29 +2,35 @@
>>  #include <asm/processor.h>
>>  #include <asm/cpufeature.h>
>>  #include <asm/vfp.h>
>> +#include <asm/arm64/sve.h>
>>    void vfp_save_state(struct vcpu *v)
>>  {
>>      if ( !cpu_has_fp )
>>          return;
>>  -    asm volatile("stp q0, q1, [%1, #16 * 0]\n\t"
>> -                 "stp q2, q3, [%1, #16 * 2]\n\t"
>> -                 "stp q4, q5, [%1, #16 * 4]\n\t"
>> -                 "stp q6, q7, [%1, #16 * 6]\n\t"
>> -                 "stp q8, q9, [%1, #16 * 8]\n\t"
>> -                 "stp q10, q11, [%1, #16 * 10]\n\t"
>> -                 "stp q12, q13, [%1, #16 * 12]\n\t"
>> -                 "stp q14, q15, [%1, #16 * 14]\n\t"
>> -                 "stp q16, q17, [%1, #16 * 16]\n\t"
>> -                 "stp q18, q19, [%1, #16 * 18]\n\t"
>> -                 "stp q20, q21, [%1, #16 * 20]\n\t"
>> -                 "stp q22, q23, [%1, #16 * 22]\n\t"
>> -                 "stp q24, q25, [%1, #16 * 24]\n\t"
>> -                 "stp q26, q27, [%1, #16 * 26]\n\t"
>> -                 "stp q28, q29, [%1, #16 * 28]\n\t"
>> -                 "stp q30, q31, [%1, #16 * 30]\n\t"
>> -                 : "=Q" (*v->arch.vfp.fpregs) : "r" (v->arch.vfp.fpregs));
>> +    if ( is_sve_domain(v->domain) )
>> +        sve_save_state(v);
>> +    else
>> +    {
>> +        asm volatile("stp q0, q1, [%1, #16 * 0]\n\t"
>> +                     "stp q2, q3, [%1, #16 * 2]\n\t"
>> +                     "stp q4, q5, [%1, #16 * 4]\n\t"
>> +                     "stp q6, q7, [%1, #16 * 6]\n\t"
>> +                     "stp q8, q9, [%1, #16 * 8]\n\t"
>> +                     "stp q10, q11, [%1, #16 * 10]\n\t"
>> +                     "stp q12, q13, [%1, #16 * 12]\n\t"
>> +                     "stp q14, q15, [%1, #16 * 14]\n\t"
>> +                     "stp q16, q17, [%1, #16 * 16]\n\t"
>> +                     "stp q18, q19, [%1, #16 * 18]\n\t"
>> +                     "stp q20, q21, [%1, #16 * 20]\n\t"
>> +                     "stp q22, q23, [%1, #16 * 22]\n\t"
>> +                     "stp q24, q25, [%1, #16 * 24]\n\t"
>> +                     "stp q26, q27, [%1, #16 * 26]\n\t"
>> +                     "stp q28, q29, [%1, #16 * 28]\n\t"
>> +                     "stp q30, q31, [%1, #16 * 30]\n\t"
>> +                     : "=Q" (*v->arch.vfp.fpregs) : "r" (v->arch.vfp.fpregs));
>> +    }
>>        v->arch.vfp.fpsr = READ_SYSREG(FPSR);
>>      v->arch.vfp.fpcr = READ_SYSREG(FPCR);
>> @@ -37,23 +43,28 @@ void vfp_restore_state(struct vcpu *v)
>>      if ( !cpu_has_fp )
>>          return;
>>  -    asm volatile("ldp q0, q1, [%1, #16 * 0]\n\t"
>> -                 "ldp q2, q3, [%1, #16 * 2]\n\t"
>> -                 "ldp q4, q5, [%1, #16 * 4]\n\t"
>> -                 "ldp q6, q7, [%1, #16 * 6]\n\t"
>> -                 "ldp q8, q9, [%1, #16 * 8]\n\t"
>> -                 "ldp q10, q11, [%1, #16 * 10]\n\t"
>> -                 "ldp q12, q13, [%1, #16 * 12]\n\t"
>> -                 "ldp q14, q15, [%1, #16 * 14]\n\t"
>> -                 "ldp q16, q17, [%1, #16 * 16]\n\t"
>> -                 "ldp q18, q19, [%1, #16 * 18]\n\t"
>> -                 "ldp q20, q21, [%1, #16 * 20]\n\t"
>> -                 "ldp q22, q23, [%1, #16 * 22]\n\t"
>> -                 "ldp q24, q25, [%1, #16 * 24]\n\t"
>> -                 "ldp q26, q27, [%1, #16 * 26]\n\t"
>> -                 "ldp q28, q29, [%1, #16 * 28]\n\t"
>> -                 "ldp q30, q31, [%1, #16 * 30]\n\t"
>> -                 : : "Q" (*v->arch.vfp.fpregs), "r" (v->arch.vfp.fpregs));
>> +    if ( is_sve_domain(v->domain) )
>> +        sve_restore_state(v);
>> +    else
>> +    {
>> +        asm volatile("ldp q0, q1, [%1, #16 * 0]\n\t"
>> +                     "ldp q2, q3, [%1, #16 * 2]\n\t"
>> +                     "ldp q4, q5, [%1, #16 * 4]\n\t"
>> +                     "ldp q6, q7, [%1, #16 * 6]\n\t"
>> +                     "ldp q8, q9, [%1, #16 * 8]\n\t"
>> +                     "ldp q10, q11, [%1, #16 * 10]\n\t"
>> +                     "ldp q12, q13, [%1, #16 * 12]\n\t"
>> +                     "ldp q14, q15, [%1, #16 * 14]\n\t"
>> +                     "ldp q16, q17, [%1, #16 * 16]\n\t"
>> +                     "ldp q18, q19, [%1, #16 * 18]\n\t"
>> +                     "ldp q20, q21, [%1, #16 * 20]\n\t"
>> +                     "ldp q22, q23, [%1, #16 * 22]\n\t"
>> +                     "ldp q24, q25, [%1, #16 * 24]\n\t"
>> +                     "ldp q26, q27, [%1, #16 * 26]\n\t"
>> +                     "ldp q28, q29, [%1, #16 * 28]\n\t"
>> +                     "ldp q30, q31, [%1, #16 * 30]\n\t"
>> +                     : : "Q" (*v->arch.vfp.fpregs), "r" (v->arch.vfp.fpregs));
>> +    }
>>        WRITE_SYSREG(v->arch.vfp.fpsr, FPSR);
>>      WRITE_SYSREG(v->arch.vfp.fpcr, FPCR);
>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>> index 143359d0f313..24c722a4a11e 100644
>> --- a/xen/arch/arm/domain.c
>> +++ b/xen/arch/arm/domain.c
>> @@ -552,7 +552,14 @@ int arch_vcpu_create(struct vcpu *v)
>>        v->arch.cptr_el2 = get_default_cptr_flags();
>>      if ( is_sve_domain(v->domain) )
>> +    {
>> +        if ( (rc = sve_context_init(v)) != 0 )
>> +            goto fail;
>>          v->arch.cptr_el2 &= ~HCPTR_CP(8);
>> +#ifdef CONFIG_ARM64_SVE
> 
> This #ifdef reads a bit odd to me because you are protecting v->arch.zcr_el2 but not the rest. This is one of the cases where I would surround the full if with the #ifdef because it makes it clearer that there is no way the rest of the code can be reached if !CONFIG_ARM64_SVE.
> 
> That said, I would actually prefer if...
> 
>> +        v->arch.zcr_el2 = vl_to_zcr(sve_decode_vl(v->domain->arch.sve_vl));
> 
> ... this line is moved in sve_context_init() because this is related to the SVE context.

Sure I will do that, so if I’ve understood correctly, you want me to keep this:


v->arch.cptr_el2 = get_default_cptr_flags();
if ( is_sve_domain(v->domain) )
{
    if ( (rc = sve_context_init(v)) != 0 )
        goto fail;
    v->arch.cptr_el2 &= ~HCPTR_CP(8);
}

Without #ifdef CONFIG_ARM64_SVE

> 
>> +#endif
>> +    }
>>        v->arch.hcr_el2 = get_default_hcr_flags();
>>  @@ -582,6 +589,8 @@ fail:
>>    void arch_vcpu_destroy(struct vcpu *v)
>>  {
>> +    if ( is_sve_domain(v->domain) )
>> +        sve_context_free(v);
>>      vcpu_timer_destroy(v);
>>      vcpu_vgic_free(v);
>>      free_xenheap_pages(v->arch.stack, STACK_ORDER);
>> diff --git a/xen/arch/arm/include/asm/arm64/sve.h b/xen/arch/arm/include/asm/arm64/sve.h
>> index 730c3fb5a9c8..582405dfdf6a 100644
>> --- a/xen/arch/arm/include/asm/arm64/sve.h
>> +++ b/xen/arch/arm/include/asm/arm64/sve.h
>> @@ -26,6 +26,10 @@ static inline unsigned int sve_decode_vl(unsigned int sve_vl)
>>  register_t compute_max_zcr(void);
>>  register_t vl_to_zcr(unsigned int vl);
>>  unsigned int get_sys_vl_len(void);
>> +int sve_context_init(struct vcpu *v);
>> +void sve_context_free(struct vcpu *v);
>> +void sve_save_state(struct vcpu *v);
>> +void sve_restore_state(struct vcpu *v);
>>    #else /* !CONFIG_ARM64_SVE */
>>  @@ -46,6 +50,15 @@ static inline unsigned int get_sys_vl_len(void)
>>      return 0;
>>  }
>>  +static inline int sve_context_init(struct vcpu *v)
>> +{
>> +    return 0;
>> +}
>> +
>> +static inline void sve_context_free(struct vcpu *v) {}
>> +static inline void sve_save_state(struct vcpu *v) {}
>> +static inline void sve_restore_state(struct vcpu *v) {}
>> +
>>  #endif /* CONFIG_ARM64_SVE */
>>    #endif /* _ARM_ARM64_SVE_H */
>> diff --git a/xen/arch/arm/include/asm/arm64/sysregs.h b/xen/arch/arm/include/asm/arm64/sysregs.h
>> index 4cabb9eb4d5e..3fdeb9d8cdef 100644
>> --- a/xen/arch/arm/include/asm/arm64/sysregs.h
>> +++ b/xen/arch/arm/include/asm/arm64/sysregs.h
>> @@ -88,6 +88,9 @@
>>  #ifndef ID_AA64ISAR2_EL1
>>  #define ID_AA64ISAR2_EL1            S3_0_C0_C6_2
>>  #endif
>> +#ifndef ZCR_EL1
>> +#define ZCR_EL1                     S3_0_C1_C2_0
>> +#endif
>>    /* ID registers (imported from arm64/include/asm/sysreg.h in Linux) */
>>  diff --git a/xen/arch/arm/include/asm/arm64/vfp.h b/xen/arch/arm/include/asm/arm64/vfp.h
>> index e6e8c363bc16..4aa371e85d26 100644
>> --- a/xen/arch/arm/include/asm/arm64/vfp.h
>> +++ b/xen/arch/arm/include/asm/arm64/vfp.h
>> @@ -6,7 +6,19 @@
>>    struct vfp_state
>>  {
>> +    /*
>> +     * When SVE is enabled for the guest, fpregs memory will be used to
>> +     * save/restore P0-P15 registers, otherwise it will be used for the V0-V31
>> +     * registers.
>> +     */
>>      uint64_t fpregs[64] __vfp_aligned;
>> +    /*
>> +     * When SVE is enabled for the guest, sve_zreg_ctx_end points to memory
>> +     * where Z0-Z31 registers and FFR can be saved/restored, it points at the
>> +     * end of the Z0-Z31 space and at the beginning of the FFR space, it's done
>> +     * like that to ease the save/restore assembly operations.
>> +     */
>> +    uint64_t *sve_zreg_ctx_end;
>>      register_t fpcr;
>>      register_t fpexc32_el2;
>>      register_t fpsr;
>> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
>> index 331da0f3bcc3..814652d92568 100644
>> --- a/xen/arch/arm/include/asm/domain.h
>> +++ b/xen/arch/arm/include/asm/domain.h
>> @@ -195,6 +195,8 @@ struct arch_vcpu
>>      register_t tpidrro_el0;
>>        /* HYP configuration */
>> +    register_t zcr_el1;
>> +    register_t zcr_el2;
>>      register_t cptr_el2;
>>      register_t hcr_el2;
>>      register_t mdcr_el2;
> 
> Cheers,
> 
> -- 
> Julien Grall




* Re: [PATCH v6 05/12] arm/sve: save/restore SVE context switch
  2023-05-19 17:35     ` Luca Fancellu
@ 2023-05-19 17:52       ` Julien Grall
  0 siblings, 0 replies; 56+ messages in thread
From: Julien Grall @ 2023-05-19 17:52 UTC (permalink / raw)
  To: Luca Fancellu
  Cc: Xen-devel, Bertrand Marquis, Wei Chen, Stefano Stabellini,
	Volodymyr Babchuk

Hi,

On 19/05/2023 18:35, Luca Fancellu wrote:
> 
> 
>> On 18 May 2023, at 19:27, Julien Grall <julien@xen.org> wrote:
>>
>> Hi Luca,
>>
>> On 24/04/2023 07:02, Luca Fancellu wrote:
>>> Save/restore context switch for SVE, allocate memory to contain
>>> the Z0-31 registers whose length is maximum 2048 bits each and
>>> FFR who can be maximum 256 bits, the allocated memory depends on
>>> how many bits is the vector length for the domain and how many bits
>>> are supported by the platform.
>>> Save P0-15 whose length is maximum 256 bits each, in this case the
>>> memory used is from the fpregs field in struct vfp_state,
>>> because V0-31 are part of Z0-31 and this space would have been
>>> unused for SVE domain otherwise.
>>> Create zcr_el{1,2} fields in arch_vcpu, initialise zcr_el2 on vcpu
>>> creation given the requested vector length and restore it on
>>> context switch, save/restore ZCR_EL1 value as well.
>>> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
>>> ---
>>> Changes from v5:
>>>   - use XFREE instead of xfree, keep the headers (Julien)
>>>   - Avoid math computation for every save/restore, store the computation
>>>     in struct vfp_state once (Bertrand)
>>>   - protect access to v->domain->arch.sve_vl inside arch_vcpu_create now
>>>     that sve_vl is available only on arm64
>>> Changes from v4:
>>>   - No changes
>>> Changes from v3:
>>>   - don't use fixed len types when not needed (Jan)
>>>   - now VL is an encoded value, decode it before using.
>>> Changes from v2:
>>>   - No changes
>>> Changes from v1:
>>>   - No changes
>>> Changes from RFC:
>>>   - Moved zcr_el2 field introduction in this patch, restore its
>>>     content inside sve_restore_state function. (Julien)
>>> ---
>>>   xen/arch/arm/arm64/sve-asm.S             | 141 +++++++++++++++++++++++
>>>   xen/arch/arm/arm64/sve.c                 |  63 ++++++++++
>>>   xen/arch/arm/arm64/vfp.c                 |  79 +++++++------
>>>   xen/arch/arm/domain.c                    |   9 ++
>>>   xen/arch/arm/include/asm/arm64/sve.h     |  13 +++
>>>   xen/arch/arm/include/asm/arm64/sysregs.h |   3 +
>>>   xen/arch/arm/include/asm/arm64/vfp.h     |  12 ++
>>>   xen/arch/arm/include/asm/domain.h        |   2 +
>>>   8 files changed, 288 insertions(+), 34 deletions(-)
>>> diff --git a/xen/arch/arm/arm64/sve-asm.S b/xen/arch/arm/arm64/sve-asm.S
>>> index 4d1549344733..8c37d7bc95d5 100644
>>> --- a/xen/arch/arm/arm64/sve-asm.S
>>> +++ b/xen/arch/arm/arm64/sve-asm.S
>>
>> Are all the new helpers added in this patch taken from Linux? If so, it would be good to clarify this (again) in the commit message as it helps for the review (I can diff with Linux rather than properly reviewing them).
>>
>>> diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
>>> index 86a5e617bfca..064832b450ff 100644
>>> --- a/xen/arch/arm/arm64/sve.c
>>> +++ b/xen/arch/arm/arm64/sve.c
>>> @@ -5,6 +5,8 @@
>>>    * Copyright (C) 2022 ARM Ltd.
>>>    */
>>>   +#include <xen/sched.h>
>>> +#include <xen/sizes.h>
>>>   #include <xen/types.h>
>>>   #include <asm/arm64/sve.h>
>>>   #include <asm/arm64/sysregs.h>
>>> @@ -13,6 +15,24 @@
>>>   #include <asm/system.h>
>>>     extern unsigned int sve_get_hw_vl(void);
>>> +extern void sve_save_ctx(uint64_t *sve_ctx, uint64_t *pregs, int save_ffr);
>>> +extern void sve_load_ctx(uint64_t const *sve_ctx, uint64_t const *pregs,
>>> +                         int restore_ffr);
>>
>>  From the use, it is not entirely clear what restore_ffr/save_ffr is meant to be. Are they bool? If so, maybe use bool? At minimum, they probably want to be unsigned int.
> 
> I have to say that I trusted the Linux implementation here, in arch/arm64/include/asm/fpsimd.h, which uses int:

Ah, so this is a verbatim copy of the Linux code? If so...

> 
> extern void sve_save_state(void *state, u32 *pfpsr, int save_ffr);
> extern void sve_load_state(void const *state, u32 const *pfpsr,
> int restore_ffr);
> 
> But if you prefer I can put unsigned int instead.

... keep it as-is (Linux seems to like using 'int' for bool) but I would 
suggest documenting the expected values.

> 
>>
>>> +
>>> +static inline unsigned int sve_zreg_ctx_size(unsigned int vl)
>>> +{
>>> +    /*
>>> +     * Z0-31 registers size in bytes is computed from VL that is in bits, so VL
>>> +     * in bytes is VL/8.
>>> +     */
>>> +    return (vl / 8U) * 32U;
>>> +}
>>> +
>>> +static inline unsigned int sve_ffrreg_ctx_size(unsigned int vl)
>>> +{
>>> +    /* FFR register size is VL/8, which is in bytes (VL/8)/8 */
>>> +    return (vl / 64U);
>>> +}
>>>     register_t compute_max_zcr(void)
>>>   {
>>> @@ -60,3 +80,46 @@ unsigned int get_sys_vl_len(void)
>>>       return ((system_cpuinfo.zcr64.bits[0] & ZCR_ELx_LEN_MASK) + 1U) *
>>>               SVE_VL_MULTIPLE_VAL;
>>>   }
>>> +
>>> +int sve_context_init(struct vcpu *v)
>>> +{
>>> +    unsigned int sve_vl_bits = sve_decode_vl(v->domain->arch.sve_vl);
>>> +    uint64_t *ctx = _xzalloc(sve_zreg_ctx_size(sve_vl_bits) +
>>> +                             sve_ffrreg_ctx_size(sve_vl_bits),
>>> +                             L1_CACHE_BYTES);
>>> +
>>> +    if ( !ctx )
>>> +        return -ENOMEM;
>>> +
>>> +    /* Point to the end of Z0-Z31 memory, just before FFR memory */
>>
>> NIT: I would add that the logic should be kept in sync with sve_context_free(). Same...
>>
>>> +    v->arch.vfp.sve_zreg_ctx_end = ctx +
>>> +        (sve_zreg_ctx_size(sve_vl_bits) / sizeof(uint64_t));
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +void sve_context_free(struct vcpu *v)
>>> +{
>>> +    unsigned int sve_vl_bits = sve_decode_vl(v->domain->arch.sve_vl);
>>> +
>>> +    /* Point back to the beginning of Z0-Z31 + FFR memory */
>>
>> ... here (but with sve_context_init()). So it is clearer that if the logic change in one place then it needs to be changed in the other.
> 
> Sure I will
> 
>>
>>> +    v->arch.vfp.sve_zreg_ctx_end -=
>>> +        (sve_zreg_ctx_size(sve_vl_bits) / sizeof(uint64_t));
>>
>>  From my understanding, sve_context_free() could be called with sve_zreg_ctxt_end equal to NULL (i.e. because sve_context_init() failed). So wouldn't we end up to substract the value to NULL and therefore...
>>
>>> +
>>> +    XFREE(v->arch.vfp.sve_zreg_ctx_end);
>>
>> ... free a random pointer?
> 
> Thank you for spotting this, I will surround the operations in sve_context_free by:
> 
> if ( v->arch.vfp.sve_zreg_ctx_end )

Rather than surrounding, how about adding:

if ( !v->arch.vfp...)
   return;

This would avoid an extra indentation.

> 
> I’m assuming the memory should be zero initialised for the vfp structure, please
> correct me if I’m wrong.

This is part of the struct vcpu. So yes (see alloc_vcpu_struct()).

[...]

>>> index 143359d0f313..24c722a4a11e 100644
>>> --- a/xen/arch/arm/domain.c
>>> +++ b/xen/arch/arm/domain.c
>>> @@ -552,7 +552,14 @@ int arch_vcpu_create(struct vcpu *v)
>>>         v->arch.cptr_el2 = get_default_cptr_flags();
>>>       if ( is_sve_domain(v->domain) )
>>> +    {
>>> +        if ( (rc = sve_context_init(v)) != 0 )
>>> +            goto fail;
>>>           v->arch.cptr_el2 &= ~HCPTR_CP(8);
>>> +#ifdef CONFIG_ARM64_SVE
>>
>> This #ifdef reads a bit odd to me because you are protecting v->arch.zcr_el2 but not the rest. This is one of the case where I would surround the full if with the #ifdef because it makes clearer that there is no way the rest of the code can be reached if !CONFIG_ARM64_SVE.
>>
>> That said, I would actually prefer if...
>>
>>> +        v->arch.zcr_el2 = vl_to_zcr(sve_decode_vl(v->domain->arch.sve_vl));
>>
>> ... this line is moved in sve_context_init() because this is related to the SVE context.
> 
> Sure I will do that, so if I’ve understood correctly, you want me to keep this:
> 
> 
> v->arch.cptr_el2 = get_default_cptr_flags();
> if ( is_sve_domain(v->domain) )
> {
>      if ( (rc = sve_context_init(v)) != 0 )
>          goto fail;
>      v->arch.cptr_el2 &= ~HCPTR_CP(8);
> }
> 
> Without #ifdef CONFIG_ARM64_SVE

Yes please.

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v6 01/12] xen/arm: enable SVE extension for Xen
  2023-05-19 14:46       ` Julien Grall
  2023-05-19 14:51         ` Luca Fancellu
@ 2023-05-22  7:50         ` Jan Beulich
  2023-05-22  8:43           ` Luca Fancellu
  1 sibling, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2023-05-22  7:50 UTC (permalink / raw)
  To: Julien Grall, Luca Fancellu
  Cc: Xen-devel, Bertrand Marquis, Wei Chen, Stefano Stabellini,
	Volodymyr Babchuk

On 19.05.2023 16:46, Julien Grall wrote:
> On 19/05/2023 15:26, Luca Fancellu wrote:
>>> On 18 May 2023, at 10:35, Julien Grall <julien@xen.org> wrote:
>>>> +/*
>>>> + * Arm SVE feature code
>>>> + *
>>>> + * Copyright (C) 2022 ARM Ltd.
>>>> + */
>>>> +
>>>> +#include <xen/types.h>
>>>> +#include <asm/arm64/sve.h>
>>>> +#include <asm/arm64/sysregs.h>
>>>> +#include <asm/processor.h>
>>>> +#include <asm/system.h>
>>>> +
>>>> +extern unsigned int sve_get_hw_vl(void);
>>>> +
>>>> +register_t compute_max_zcr(void)
>>>> +{
>>>> +    register_t cptr_bits = get_default_cptr_flags();
>>>> +    register_t zcr = vl_to_zcr(SVE_VL_MAX_BITS);
>>>> +    unsigned int hw_vl;
>>>> +
>>>> +    /* Remove trap for SVE resources */
>>>> +    WRITE_SYSREG(cptr_bits & ~HCPTR_CP(8), CPTR_EL2);
>>>> +    isb();
>>>> +
>>>> +    /*
>>>> +     * Set the maximum SVE vector length, doing that we will know the VL
>>>> +     * supported by the platform, calling sve_get_hw_vl()
>>>> +     */
>>>> +    WRITE_SYSREG(zcr, ZCR_EL2);
>>>
>>>  From my reading of the Arm Arm (D19-6331, ARM DDI 0487J.a), a direct write to a system register would need to be followed by a context synchronization event (e.g. isb()) before the software can rely on the value.
>>>
>>> In this situation, AFAICT, the instruction in sve_get_hw_vl() will use the content of ZCR_EL2. So don't we need an ISB() here?
>>
>>  From what I’ve read in the manual for ZCR_ELx:
>>
>> An indirect read of ZCR_EL2.LEN appears to occur in program order relative to a direct write of
>> the same register, without the need for explicit synchronization
>>
>> I’ve interpreted it as “there is no need to sync before write” and I’ve looked into Linux, and there does
>> not appear to be any synchronisation mechanism after a write to that register, but if I am wrong I can for
>> sure add an isb if you prefer.
> 
> Ah, I was reading the generic section about synchronization and didn't 
> realize there was a paragraph in the ZCR_EL2 section as well.
> 
> Reading the new section, I agree with your understanding. The isb() is 
> not necessary.

And RDVL counts as an "indirect read"? I'm pretty sure "normal" SVE insn
use falls into that category, but RDVL might also be viewed as more
similar to MRS in this regard. While the construct CurrentVL is used in
either case, I'm still not sure this goes without saying.

Jan



* Re: [PATCH v6 01/12] xen/arm: enable SVE extension for Xen
  2023-05-22  7:50         ` Jan Beulich
@ 2023-05-22  8:43           ` Luca Fancellu
  2023-05-22  9:30             ` Julien Grall
  0 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-05-22  8:43 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Julien Grall, Xen-devel, Bertrand Marquis, Wei Chen,
	Stefano Stabellini, Volodymyr Babchuk



> On 22 May 2023, at 08:50, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 19.05.2023 16:46, Julien Grall wrote:
>> On 19/05/2023 15:26, Luca Fancellu wrote:
>>>> On 18 May 2023, at 10:35, Julien Grall <julien@xen.org> wrote:
>>>>> +/*
>>>>> + * Arm SVE feature code
>>>>> + *
>>>>> + * Copyright (C) 2022 ARM Ltd.
>>>>> + */
>>>>> +
>>>>> +#include <xen/types.h>
>>>>> +#include <asm/arm64/sve.h>
>>>>> +#include <asm/arm64/sysregs.h>
>>>>> +#include <asm/processor.h>
>>>>> +#include <asm/system.h>
>>>>> +
>>>>> +extern unsigned int sve_get_hw_vl(void);
>>>>> +
>>>>> +register_t compute_max_zcr(void)
>>>>> +{
>>>>> +    register_t cptr_bits = get_default_cptr_flags();
>>>>> +    register_t zcr = vl_to_zcr(SVE_VL_MAX_BITS);
>>>>> +    unsigned int hw_vl;
>>>>> +
>>>>> +    /* Remove trap for SVE resources */
>>>>> +    WRITE_SYSREG(cptr_bits & ~HCPTR_CP(8), CPTR_EL2);
>>>>> +    isb();
>>>>> +
>>>>> +    /*
>>>>> +     * Set the maximum SVE vector length, doing that we will know the VL
>>>>> +     * supported by the platform, calling sve_get_hw_vl()
>>>>> +     */
>>>>> +    WRITE_SYSREG(zcr, ZCR_EL2);
>>>> 
>>>> From my reading of the Arm (D19-6331, ARM DDI 0487J.a), a direct write to a system register would need to be followed by an context synchronization event (e.g. isb()) before the software can rely on the value.
>>>> 
>>>> In this situation, AFAICT, the instruciton in sve_get_hw_vl() will use the content of ZCR_EL2. So don't we need an ISB() here?
>>> 
>>> From what I’ve read in the manual for ZCR_ELx:
>>> 
>>> An indirect read of ZCR_EL2.LEN appears to occur in program order relative to a direct write of
>>> the same register, without the need for explicit synchronization
>>> 
>>> I’ve interpreted it as “there is no need to sync before write” and I’ve looked into Linux and it does not
>>> Appear any synchronisation mechanism after a write to that register, but if I am wrong I can for sure
>>> add an isb if you prefer.
>> 
>> Ah, I was reading the generic section about synchronization and didn't 
>> realize there was a paragraph in the ZCR_EL2 section as well.
>> 
>> Reading the new section, I agree with your understanding. The isb() is 
>> not necessary.
> 
> And RDVL counts as an "indirect read"? I'm pretty sure "normal" SVE insn
> use is falling in that category, but RDVL might also be viewed as more
> similar to MRS in this regard? While the construct CurrentVL is used in
> either case, I'm still not sure this goes without saying.

Hi Jan,

Looking into the Linux code, in function vec_probe_vqs(...) in arch/arm64/kernel/fpsimd.c,
ZCR_EL1 is written, without synchronisation, and afterwards RDVL is used.

I think ZCR_EL2 has the same behaviour.

Cheers,
Luca

> 
> Jan



* Re: [PATCH v6 01/12] xen/arm: enable SVE extension for Xen
  2023-05-22  8:43           ` Luca Fancellu
@ 2023-05-22  9:30             ` Julien Grall
  2023-05-22  9:35               ` Luca Fancellu
  0 siblings, 1 reply; 56+ messages in thread
From: Julien Grall @ 2023-05-22  9:30 UTC (permalink / raw)
  To: Luca Fancellu, Jan Beulich
  Cc: Xen-devel, Bertrand Marquis, Wei Chen, Stefano Stabellini,
	Volodymyr Babchuk

Hi,

On 22/05/2023 09:43, Luca Fancellu wrote:
>> On 22 May 2023, at 08:50, Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 19.05.2023 16:46, Julien Grall wrote:
>>> On 19/05/2023 15:26, Luca Fancellu wrote:
>>>>> On 18 May 2023, at 10:35, Julien Grall <julien@xen.org> wrote:
>>>>>> +/*
>>>>>> + * Arm SVE feature code
>>>>>> + *
>>>>>> + * Copyright (C) 2022 ARM Ltd.
>>>>>> + */
>>>>>> +
>>>>>> +#include <xen/types.h>
>>>>>> +#include <asm/arm64/sve.h>
>>>>>> +#include <asm/arm64/sysregs.h>
>>>>>> +#include <asm/processor.h>
>>>>>> +#include <asm/system.h>
>>>>>> +
>>>>>> +extern unsigned int sve_get_hw_vl(void);
>>>>>> +
>>>>>> +register_t compute_max_zcr(void)
>>>>>> +{
>>>>>> +    register_t cptr_bits = get_default_cptr_flags();
>>>>>> +    register_t zcr = vl_to_zcr(SVE_VL_MAX_BITS);
>>>>>> +    unsigned int hw_vl;
>>>>>> +
>>>>>> +    /* Remove trap for SVE resources */
>>>>>> +    WRITE_SYSREG(cptr_bits & ~HCPTR_CP(8), CPTR_EL2);
>>>>>> +    isb();
>>>>>> +
>>>>>> +    /*
>>>>>> +     * Set the maximum SVE vector length, doing that we will know the VL
>>>>>> +     * supported by the platform, calling sve_get_hw_vl()
>>>>>> +     */
>>>>>> +    WRITE_SYSREG(zcr, ZCR_EL2);
>>>>>
>>>>>  From my reading of the Arm (D19-6331, ARM DDI 0487J.a), a direct write to a system register would need to be followed by an context synchronization event (e.g. isb()) before the software can rely on the value.
>>>>>
>>>>> In this situation, AFAICT, the instruciton in sve_get_hw_vl() will use the content of ZCR_EL2. So don't we need an ISB() here?
>>>>
>>>>  From what I’ve read in the manual for ZCR_ELx:
>>>>
>>>> An indirect read of ZCR_EL2.LEN appears to occur in program order relative to a direct write of
>>>> the same register, without the need for explicit synchronization
>>>>
>>>> I’ve interpreted it as “there is no need to sync before write” and I’ve looked into Linux and it does not
>>>> Appear any synchronisation mechanism after a write to that register, but if I am wrong I can for sure
>>>> add an isb if you prefer.
>>>
>>> Ah, I was reading the generic section about synchronization and didn't
>>> realize there was a paragraph in the ZCR_EL2 section as well.
>>>
>>> Reading the new section, I agree with your understanding. The isb() is
>>> not necessary.
>>
>> And RDVL counts as an "indirect read"? I'm pretty sure "normal" SVE insn
>> use is falling in that category, but RDVL might also be viewed as more
>> similar to MRS in this regard? While the construct CurrentVL is used in
>> either case, I'm still not sure this goes without saying.
> 
> Hi Jan,
> 
> Looking into the Linux code, in function vec_probe_vqs(...) in arch/arm64/kernel/fpsimd.c,
> ZCR_EL1 is written, without synchronisation, and afterwards RDVL is used.

You are making the assumption that the Linux code is correct. It is 
most likely the case, but in general it is best to justify barriers 
based on the Arm Arm because it is authoritative.

In this case, the Arm Arm is pretty clear on the difference between 
indirect read and direct read (see D19-6333, ARM DDI 0487J.a). The 
latter only refers to the use of the MRS instruction. RDVL is its own 
instruction and therefore this is an indirect read.

Cheers,

-- 
Julien Grall



* Re: [PATCH v6 01/12] xen/arm: enable SVE extension for Xen
  2023-05-22  9:30             ` Julien Grall
@ 2023-05-22  9:35               ` Luca Fancellu
  0 siblings, 0 replies; 56+ messages in thread
From: Luca Fancellu @ 2023-05-22  9:35 UTC (permalink / raw)
  To: Julien Grall
  Cc: Jan Beulich, Xen-devel, Bertrand Marquis, Wei Chen,
	Stefano Stabellini, Volodymyr Babchuk



> On 22 May 2023, at 10:30, Julien Grall <julien@xen.org> wrote:
> 
> Hi,
> 
> On 22/05/2023 09:43, Luca Fancellu wrote:
>>> On 22 May 2023, at 08:50, Jan Beulich <jbeulich@suse.com> wrote:
>>> 
>>> On 19.05.2023 16:46, Julien Grall wrote:
>>>> On 19/05/2023 15:26, Luca Fancellu wrote:
>>>>>> On 18 May 2023, at 10:35, Julien Grall <julien@xen.org> wrote:
>>>>>>> +/*
>>>>>>> + * Arm SVE feature code
>>>>>>> + *
>>>>>>> + * Copyright (C) 2022 ARM Ltd.
>>>>>>> + */
>>>>>>> +
>>>>>>> +#include <xen/types.h>
>>>>>>> +#include <asm/arm64/sve.h>
>>>>>>> +#include <asm/arm64/sysregs.h>
>>>>>>> +#include <asm/processor.h>
>>>>>>> +#include <asm/system.h>
>>>>>>> +
>>>>>>> +extern unsigned int sve_get_hw_vl(void);
>>>>>>> +
>>>>>>> +register_t compute_max_zcr(void)
>>>>>>> +{
>>>>>>> +    register_t cptr_bits = get_default_cptr_flags();
>>>>>>> +    register_t zcr = vl_to_zcr(SVE_VL_MAX_BITS);
>>>>>>> +    unsigned int hw_vl;
>>>>>>> +
>>>>>>> +    /* Remove trap for SVE resources */
>>>>>>> +    WRITE_SYSREG(cptr_bits & ~HCPTR_CP(8), CPTR_EL2);
>>>>>>> +    isb();
>>>>>>> +
>>>>>>> +    /*
>>>>>>> +     * Set the maximum SVE vector length, doing that we will know the VL
>>>>>>> +     * supported by the platform, calling sve_get_hw_vl()
>>>>>>> +     */
>>>>>>> +    WRITE_SYSREG(zcr, ZCR_EL2);
>>>>>> 
>>>>>> From my reading of the Arm (D19-6331, ARM DDI 0487J.a), a direct write to a system register would need to be followed by an context synchronization event (e.g. isb()) before the software can rely on the value.
>>>>>> 
>>>>>> In this situation, AFAICT, the instruciton in sve_get_hw_vl() will use the content of ZCR_EL2. So don't we need an ISB() here?
>>>>> 
>>>>> From what I’ve read in the manual for ZCR_ELx:
>>>>> 
>>>>> An indirect read of ZCR_EL2.LEN appears to occur in program order relative to a direct write of
>>>>> the same register, without the need for explicit synchronization
>>>>> 
>>>>> I’ve interpreted it as “there is no need to sync before write” and I’ve looked into Linux and it does not
>>>>> Appear any synchronisation mechanism after a write to that register, but if I am wrong I can for sure
>>>>> add an isb if you prefer.
>>>> 
>>>> Ah, I was reading the generic section about synchronization and didn't
>>>> realize there was a paragraph in the ZCR_EL2 section as well.
>>>> 
>>>> Reading the new section, I agree with your understanding. The isb() is
>>>> not necessary.
>>> 
>>> And RDVL counts as an "indirect read"? I'm pretty sure "normal" SVE insn
>>> use is falling in that category, but RDVL might also be viewed as more
>>> similar to MRS in this regard? While the construct CurrentVL is used in
>>> either case, I'm still not sure this goes without saying.
>> Hi Jan,
>> Looking into the Linux code, in function vec_probe_vqs(...) in arch/arm64/kernel/fpsimd.c,
>> ZCR_EL1 is written, without synchronisation, and afterwards RDVL is used.
> 
> You are making the assumption that the Linux code is correct. It is mostly likely the case, but in general it is best to justify barriers based on the Arm Arm because it is authoritative.
> 
> In this case, the Arm Arm is pretty clear on the difference between indirect read and direct read (see D19-63333 ARM DDI 0487J.A). The latter only refers to use of the instruction of MRS. RDVL is its own instruction and therefore this is an indirect read.

Yes you are right

> 
> Cheers,
> 
> -- 
> Julien Grall




* Re: [PATCH v6 05/12] arm/sve: save/restore SVE context switch
  2023-05-18 18:30   ` Julien Grall
@ 2023-05-22 10:20     ` Luca Fancellu
  2023-05-22 12:41       ` Jan Beulich
  0 siblings, 1 reply; 56+ messages in thread
From: Luca Fancellu @ 2023-05-22 10:20 UTC (permalink / raw)
  To: Julien Grall
  Cc: Xen-devel, Bertrand Marquis, Wei Chen, Stefano Stabellini,
	Volodymyr Babchuk



> On 18 May 2023, at 19:30, Julien Grall <julien@xen.org> wrote:
> 
> Hi Luca,
> 
> One more remark.
> 
> On 24/04/2023 07:02, Luca Fancellu wrote:
>>  #else /* !CONFIG_ARM64_SVE */
>>  @@ -46,6 +50,15 @@ static inline unsigned int get_sys_vl_len(void)
>>      return 0;
>>  }
>>  +static inline int sve_context_init(struct vcpu *v)
>> +{
>> +    return 0;
> 
> The call is protected by is_sve_domain(). So I think we want to return an error just in case someone is calling it outside of its intended use.

Regarding this one, since it should not be called when SVE is not enabled, are you ok if I’ll do this:

static inline int sve_context_init(struct vcpu *v)
{
    ASSERT_UNREACHABLE();
    return 0;
}


> 
>> +}
>> +
>> +static inline void sve_context_free(struct vcpu *v) {}
>> +static inline void sve_save_state(struct vcpu *v) {}
>> +static inline void sve_restore_state(struct vcpu *v) {}
>> +
> 
> -- 
> Julien Grall



* Re: [PATCH v6 05/12] arm/sve: save/restore SVE context switch
  2023-05-22 10:20     ` Luca Fancellu
@ 2023-05-22 12:41       ` Jan Beulich
  2023-05-22 12:43         ` Luca Fancellu
  0 siblings, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2023-05-22 12:41 UTC (permalink / raw)
  To: Luca Fancellu
  Cc: Xen-devel, Bertrand Marquis, Wei Chen, Stefano Stabellini,
	Volodymyr Babchuk, Julien Grall

On 22.05.2023 12:20, Luca Fancellu wrote:
> 
> 
>> On 18 May 2023, at 19:30, Julien Grall <julien@xen.org> wrote:
>>
>> Hi Luca,
>>
>> One more remark.
>>
>> On 24/04/2023 07:02, Luca Fancellu wrote:
>>>  #else /* !CONFIG_ARM64_SVE */
>>>  @@ -46,6 +50,15 @@ static inline unsigned int get_sys_vl_len(void)
>>>      return 0;
>>>  }
>>>  +static inline int sve_context_init(struct vcpu *v)
>>> +{
>>> +    return 0;
>>
>> The call is protected by is_domain_sve(). So I think we want to return an error just in case someone is calling it outside of its intended use.
> 
> Regarding this one, since it should not be called when SVE is not enabled, are you ok if I’ll do this:
> 
> static inline int sve_context_init(struct vcpu *v)
> {
>     ASSERT_UNREACHABLE();
>     return 0;
> }

Do you need such a stub in the first place? I.e. can't you arrange for
DCE to take care of unreachable function calls, thus letting you get
away with just an always-visible declaration (and no definition when
!ARM64_SVE)?

Jan



* Re: [PATCH v6 05/12] arm/sve: save/restore SVE context switch
  2023-05-22 12:41       ` Jan Beulich
@ 2023-05-22 12:43         ` Luca Fancellu
  0 siblings, 0 replies; 56+ messages in thread
From: Luca Fancellu @ 2023-05-22 12:43 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Xen-devel, Bertrand Marquis, Wei Chen, Stefano Stabellini,
	Volodymyr Babchuk, Julien Grall



> On 22 May 2023, at 13:41, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 22.05.2023 12:20, Luca Fancellu wrote:
>> 
>> 
>>> On 18 May 2023, at 19:30, Julien Grall <julien@xen.org> wrote:
>>> 
>>> Hi Luca,
>>> 
>>> One more remark.
>>> 
>>> On 24/04/2023 07:02, Luca Fancellu wrote:
>>>> #else /* !CONFIG_ARM64_SVE */
>>>> @@ -46,6 +50,15 @@ static inline unsigned int get_sys_vl_len(void)
>>>>     return 0;
>>>> }
>>>> +static inline int sve_context_init(struct vcpu *v)
>>>> +{
>>>> +    return 0;
>>> 
>>> The call is protected by is_domain_sve(). So I think we want to return an error just in case someone is calling it outside of its intended use.
>> 
>> Regarding this one, since it should not be called when SVE is not enabled, are you ok if I’ll do this:
>> 
>> static inline int sve_context_init(struct vcpu *v)
>> {
>> ASSERT_UNREACHABLE();
>> return 0;
>> }
> 
> Do you need such a stub in the first place? I.e. can't you arrange for
> DCE to take care of unreachable function calls, thus letting you get
> away with just an always-visible declaration (and no definition when
> !ARM64_SVE)?
> 

Right, I always forgot about this approach, I’ll try that

> Jan




end of thread, other threads:[~2023-05-22 12:44 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-24  6:02 [PATCH v6 00/12] SVE feature for arm guests Luca Fancellu
2023-04-24  6:02 ` [PATCH v6 01/12] xen/arm: enable SVE extension for Xen Luca Fancellu
2023-05-18  9:35   ` Julien Grall
2023-05-19 14:26     ` Luca Fancellu
2023-05-19 14:46       ` Julien Grall
2023-05-19 14:51         ` Luca Fancellu
2023-05-19 15:00           ` Julien Grall
2023-05-19 15:13             ` Luca Fancellu
2023-05-19 15:17               ` Julien Grall
2023-05-22  7:50         ` Jan Beulich
2023-05-22  8:43           ` Luca Fancellu
2023-05-22  9:30             ` Julien Grall
2023-05-22  9:35               ` Luca Fancellu
2023-04-24  6:02 ` [PATCH v6 02/12] xen/arm: add SVE vector length field to the domain Luca Fancellu
2023-05-18  9:48   ` Julien Grall
2023-04-24  6:02 ` [PATCH v6 03/12] xen/arm: Expose SVE feature to the guest Luca Fancellu
2023-05-18  9:51   ` Julien Grall
2023-04-24  6:02 ` [PATCH v6 04/12] xen/arm: add SVE exception class handling Luca Fancellu
2023-05-18  9:55   ` Julien Grall
2023-04-24  6:02 ` [PATCH v6 05/12] arm/sve: save/restore SVE context switch Luca Fancellu
2023-05-18 18:27   ` Julien Grall
2023-05-19 17:35     ` Luca Fancellu
2023-05-19 17:52       ` Julien Grall
2023-05-18 18:30   ` Julien Grall
2023-05-22 10:20     ` Luca Fancellu
2023-05-22 12:41       ` Jan Beulich
2023-05-22 12:43         ` Luca Fancellu
2023-04-24  6:02 ` [PATCH v6 06/12] xen/common: add dom0 xen command line argument for Arm Luca Fancellu
2023-04-24  6:02 ` [PATCH v6 07/12] xen: enable Dom0 to use SVE feature Luca Fancellu
2023-04-24 11:34   ` Jan Beulich
2023-04-24 14:00     ` Luca Fancellu
2023-04-24 14:05       ` Jan Beulich
2023-04-24 14:57         ` Luca Fancellu
2023-04-24 15:06           ` Jan Beulich
2023-04-24 15:18             ` Luca Fancellu
2023-04-24 15:25               ` Jan Beulich
2023-04-24 15:34                 ` Luca Fancellu
2023-04-24 15:41                   ` Jan Beulich
2023-04-24 15:43                     ` Luca Fancellu
2023-04-24 16:10                       ` Jan Beulich
2023-04-25  6:04                         ` Luca Fancellu
2023-05-18 18:39                           ` Julien Grall
2023-04-24  6:02 ` [PATCH v6 08/12] xen/physinfo: encode Arm SVE vector length in arch_capabilities Luca Fancellu
2023-04-24  6:02 ` [PATCH v6 09/12] tools: add physinfo arch_capabilities handling for Arm Luca Fancellu
2023-05-02 16:13   ` Anthony PERARD
2023-05-03  9:23     ` Luca Fancellu
2023-05-05 16:44       ` Anthony PERARD
2023-05-05 16:56         ` Luca Fancellu
2023-04-24  6:02 ` [PATCH v6 10/12] xen/tools: add sve parameter in XL configuration Luca Fancellu
2023-05-02 17:06   ` Anthony PERARD
2023-05-02 19:54     ` Luca Fancellu
2023-05-05 16:23       ` Anthony PERARD
2023-05-05 16:36         ` Luca Fancellu
2023-04-24  6:02 ` [PATCH v6 11/12] xen/arm: add sve property for dom0less domUs Luca Fancellu
2023-04-24  6:02 ` [PATCH v6 12/12] xen/changelog: Add SVE and "dom0" options to the changelog for Arm Luca Fancellu
2023-04-24  7:22   ` Henry Wang
