linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v4 0/8] arm64/sve: Clean up KVM integration and optimise syscalls
@ 2022-10-21 23:03 Mark Brown
  2022-10-21 23:03 ` [PATCH v4 1/8] KVM: arm64: Discard any SVE state when entering KVM guests Mark Brown
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Mark Brown @ 2022-10-21 23:03 UTC
  To: Catalin Marinas, Will Deacon
  Cc: Marc Zyngier, Zhang Lei, James Morse, Alexandru Elisei,
	Andre Przywara, kvmarm, linux-arm-kernel, Mark Brown

This patch series attempts to clarify the tracking of which set of
floating point registers we save on systems supporting SVE, particularly
with reference to KVM, and then uses the results of this clarification
to improve the performance of simple syscalls where we return directly
to userspace in cases where userspace is using SVE.

At present we track which register state is active using the TIF_SVE
flag for the current task, which also controls whether userspace is able
to use SVE. This is reasonably straightforward, if limiting, but for KVM
it gets a bit hairy since we may have guest state loaded in the
registers. This results in KVM modifying TIF_SVE for the VMM task while
the guest is running, which doesn't make things easy to follow. To make
things clearer the series changes things so that in addition to TIF_SVE
we explicitly track both the type of registers that are currently saved
in the task struct and the type of registers that we should save when we
do so. TIF_SVE then solely controls whether userspace can use SVE
without trapping; it has no function for KVM guests and we can remove
the code for managing it from KVM.

The refactoring to add the separate tracking is initially done by adding
the new state together with checks that the state corresponds to
expectations when we look at it. Subsequent patches then make use of the
separated state. The goal is both to split out the more repetitive bits
of the change and to make it easier to debug any problems that might
arise.

With the state tracked separately we then start to optimise the
performance of syscalls when the process is using SVE. Currently every
syscall disables SVE for userspace which means that we need to trap to
EL1 again on the next SVE instruction, flush the SVE registers, and
reenable SVE for EL0, creating overhead for tasks that mix SVE and
syscalls. We build on the above refactoring to eliminate this overhead
for simple syscalls which return directly to userspace by keeping SVE
enabled unless we need to reload the state from memory, meaning that if
syscalls do not block we avoid the overhead of trapping to EL1 again on
next use of SVE.
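
As a rough illustration of the overhead described above, the following
toy userspace model counts SVE access traps for a loop that mixes SVE
instructions with simple syscalls. It is a sketch only, not kernel code:
model_task, model_sve_insn(), model_syscall() and model_run() are
hypothetical names, and the "blocked" flag is an assumed stand-in for any
syscall after which the state has to be reloaded from memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one task; not a kernel structure. */
struct model_task {
	bool sve_enabled;	/* stands in for TIF_SVE */
	unsigned int traps;	/* SVE access traps taken to EL1 */
};

/* Userspace executes an SVE instruction: trap if SVE is disabled for EL0. */
static void model_sve_insn(struct model_task *t)
{
	if (!t->sve_enabled) {
		t->traps++;
		t->sve_enabled = true;
	}
}

/*
 * A syscall. Without the optimisation SVE is always disabled on entry;
 * with it, SVE is only disabled when the syscall blocked.
 */
static void model_syscall(struct model_task *t, bool blocked, bool optimised)
{
	if (!optimised || blocked)
		t->sve_enabled = false;
}

/* Run n iterations of "SVE instruction, then non-blocking syscall". */
static unsigned int model_run(unsigned int n, bool optimised)
{
	struct model_task t = { .sve_enabled = false, .traps = 0 };
	unsigned int i;

	for (i = 0; i < n; i++) {
		model_sve_insn(&t);
		model_syscall(&t, false, optimised);
	}

	/* There can never be more traps than SVE instructions. */
	assert(t.traps <= n);
	return t.traps;
}
```

In this model the unoptimised case takes one trap per iteration, while
the optimised case only takes the initial trap as long as no syscall
blocks.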

The series also includes a tangentially related patch which simplifies
the interface to fpsimd_bind_state_to_cpu(), reducing the very large
number of arguments that the function takes. This is already an issue
regardless of this series but is further amplified by it. If this
approach is OK for people we could potentially build on this to use the
struct in more places. In order to avoid the user visible improvements
getting held up behind code cleanups this patch is placed last.

v4:
 - Rebase onto v6.1-rc1.
 - Only call fpsimd_kvm_prepare() on systems supporting FPSIMD.
 - Reorder field in kvm_vcpu_arch for pahole.
 - Rename the enum fp_state to fp_type, we still use a single type for
   both the saved state and target state since naming two very similar
   closely related types with their constants clearly and concisely gets
   tricky.
 - Reword a comment in fpsimd_save().
 - Add KVM specific comment about FPSIMD vs SVE states.
 - Further clarifications and expansion in several commit messages and
   comments.
 - Add a patch on the end improving the API for fpsimd_bind_state_to_cpu()
v3:
 - Rebase onto my series "arm64/sme: SME related fixes" since there is a
   direct dependency on the signal fix and testing is much easier with
   the bug fixes rolled in.
 - s/type/fp_type/ in struct fpsimd_last_state_struct.
 - Add comment about the V register storage being ignored when data is
   stored in SVE format.
 - Move dropping of special casing for FPSIMD register state in SME
   into a separate patch later in the series.
 - Simplify logic in task_fpsimd_load().
 - Remove support for leaving the SVE state not shared with FPSIMD
   untouched, keep the unconditional flush.
v2:
 - Rebase onto v5.19-rc3.
 - Don't warn when restoring streaming mode SVE without TIF_SVE.

Mark Brown (8):
  KVM: arm64: Discard any SVE state when entering KVM guests
  arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE
  arm64/fpsimd: Have KVM explicitly say which FP registers to save
  arm64/fpsimd: Stop using TIF_SVE to manage register saving in KVM
  arm64/fpsimd: Load FP state based on recorded data type
  arm64/fpsimd: SME no longer requires SVE register state
  arm64/sve: Leave SVE enabled on syscall if we don't context switch
  arm64/fp: Use a struct to pass data to fpsimd_bind_state_to_cpu()

 arch/arm64/include/asm/fpsimd.h    |  17 ++-
 arch/arm64/include/asm/kvm_host.h  |  12 ++-
 arch/arm64/include/asm/processor.h |   7 ++
 arch/arm64/kernel/fpsimd.c         | 165 ++++++++++++++++++++---------
 arch/arm64/kernel/process.c        |   2 +
 arch/arm64/kernel/ptrace.c         |   5 +-
 arch/arm64/kernel/signal.c         |   7 +-
 arch/arm64/kernel/syscall.c        |  19 +---
 arch/arm64/kvm/fpsimd.c            |  26 +++--
 9 files changed, 180 insertions(+), 80 deletions(-)


base-commit: 9abf2313adc1ca1b6180c508c25f22f9395cc780
-- 
2.30.2


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* [PATCH v4 1/8] KVM: arm64: Discard any SVE state when entering KVM guests
  2022-10-21 23:03 [PATCH v4 0/8] arm64/sve: Clean up KVM integration and optimise syscalls Mark Brown
@ 2022-10-21 23:03 ` Mark Brown
  2022-11-13 22:00   ` Catalin Marinas
  2022-10-21 23:03 ` [PATCH v4 2/8] arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE Mark Brown
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Mark Brown @ 2022-10-21 23:03 UTC
  To: Catalin Marinas, Will Deacon
  Cc: Marc Zyngier, Zhang Lei, James Morse, Alexandru Elisei,
	Andre Przywara, kvmarm, linux-arm-kernel, Mark Brown

Since 8383741ab2e773a99 (KVM: arm64: Get rid of host SVE tracking/saving)
KVM has not tracked the host SVE state, relying on the fact that we
currently disable SVE whenever we perform a syscall. This may not be true
in future since performance optimisation may result in us keeping SVE
enabled in order to avoid needing to take access traps to reenable it.
Handle this by clearing TIF_SVE and converting the stored task state to
FPSIMD format when preparing to run the guest.  This is done with a new
call fpsimd_kvm_prepare() to keep the direct state manipulation
functions internal to fpsimd.c.
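
For readers less familiar with the two register views, here is a small
userspace sketch of what "converting the stored task state to FPSIMD
format" amounts to. It is a simplified model under stated assumptions,
not the kernel implementation: the model_* names and the fixed 256 bit
vector length are hypothetical, and the low 128 bits of each Z register
are assumed to be the first 16 bytes of its storage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MODEL_NREGS		32	/* Z0-Z31 and V0-V31 */
#define MODEL_VREG_BYTES	16	/* a V register is 128 bits wide */
#define MODEL_VL_BYTES		32	/* example vector length: 256 bits */

static_assert(MODEL_VL_BYTES >= MODEL_VREG_BYTES,
	      "a Z register must cover a V register");

/* Hypothetical stand-in for the relevant parts of a task's saved state. */
struct model_thread {
	bool tif_sve;
	uint8_t zregs[MODEL_NREGS][MODEL_VL_BYTES];	/* SVE format */
	uint8_t vregs[MODEL_NREGS][MODEL_VREG_BYTES];	/* FPSIMD format */
};

/* Keep only the low 128 bits of each Z register, as the V view does. */
static void model_sve_to_fpsimd(struct model_thread *t)
{
	for (unsigned int i = 0; i < MODEL_NREGS; i++)
		memcpy(t->vregs[i], t->zregs[i], MODEL_VREG_BYTES);
}

/* Model of the preparation step: clear the flag, keep the FPSIMD subset. */
static void model_kvm_prepare(struct model_thread *t)
{
	if (t->tif_sve) {
		t->tif_sve = false;
		model_sve_to_fpsimd(t);
	}
}

/* Returns true if the flag was cleared and V1 holds the low bits of Z1. */
static bool model_demo(void)
{
	struct model_thread t = { .tif_sve = true };

	memset(t.zregs[1], 0xab, MODEL_VL_BYTES);
	model_kvm_prepare(&t);

	return !t.tif_sve &&
	       t.vregs[1][0] == 0xab &&
	       t.vregs[1][MODEL_VREG_BYTES - 1] == 0xab &&
	       t.vregs[0][0] == 0;
}
```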

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/fpsimd.h |  1 +
 arch/arm64/kernel/fpsimd.c      | 23 +++++++++++++++++++++++
 arch/arm64/kvm/fpsimd.c         |  3 ++-
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 6f86b7ab6c28..c07e4abaca3d 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -56,6 +56,7 @@ extern void fpsimd_signal_preserve_current_state(void);
 extern void fpsimd_preserve_current_state(void);
 extern void fpsimd_restore_current_state(void);
 extern void fpsimd_update_current_state(struct user_fpsimd_state const *state);
+extern void fpsimd_kvm_prepare(void);
 
 extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state,
 				     void *sve_state, unsigned int sve_vl,
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 23834d96d1e7..549e11645e0f 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1627,6 +1627,29 @@ void fpsimd_signal_preserve_current_state(void)
 		sve_to_fpsimd(current);
 }
 
+/*
+ * Called by KVM when entering the guest.
+ */
+void fpsimd_kvm_prepare(void)
+{
+	if (!system_supports_sve())
+		return;
+
+	/*
+	 * KVM does not save host SVE state since we can only enter
+	 * the guest from a syscall so the ABI means that only the
+	 * non-saved SVE state needs to be saved.  If we have left
+	 * SVE enabled for performance reasons then update the task
+	 * state to be FPSIMD only.
+	 */
+	get_cpu_fpsimd_context();
+
+	if (test_and_clear_thread_flag(TIF_SVE))
+		sve_to_fpsimd(current);
+
+	put_cpu_fpsimd_context();
+}
+
 /*
  * Associate current's FPSIMD context with this cpu
  * The caller must have ownership of the cpu FPSIMD context before calling
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index ec8e4494873d..51ca78b31b95 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -75,11 +75,12 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
 void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
 {
 	BUG_ON(!current->mm);
-	BUG_ON(test_thread_flag(TIF_SVE));
 
 	if (!system_supports_fpsimd())
 		return;
 
+	fpsimd_kvm_prepare();
+
 	vcpu->arch.fp_state = FP_STATE_HOST_OWNED;
 
 	vcpu_clear_flag(vcpu, HOST_SVE_ENABLED);
-- 
2.30.2



* [PATCH v4 2/8] arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE
  2022-10-21 23:03 [PATCH v4 0/8] arm64/sve: Clean up KVM integration and optimise syscalls Mark Brown
  2022-10-21 23:03 ` [PATCH v4 1/8] KVM: arm64: Discard any SVE state when entering KVM guests Mark Brown
@ 2022-10-21 23:03 ` Mark Brown
  2022-11-13 22:12   ` Catalin Marinas
  2022-10-21 23:03 ` [PATCH v4 3/8] arm64/fpsimd: Have KVM explicitly say which FP registers to save Mark Brown
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Mark Brown @ 2022-10-21 23:03 UTC
  To: Catalin Marinas, Will Deacon
  Cc: Marc Zyngier, Zhang Lei, James Morse, Alexandru Elisei,
	Andre Przywara, kvmarm, linux-arm-kernel, Mark Brown

When we save the state for the floating point registers this can be done
in the form visible through either the FPSIMD V registers or the SVE Z and
P registers. At present we track which format is currently used based on
TIF_SVE and the SME streaming mode state but particularly in the SVE case
this limits our options for optimising things, especially around syscalls.
Introduce a new enum which we place together with saved floating point
state in both thread_struct and the KVM guest state which explicitly
states which format is active and keep it up to date when we change it.

At present we do not use this state except to verify that it has the
expected value when loading the state; future patches will introduce
functional changes.
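
The idea can be sketched in a few lines of userspace C. This is a
reduced model rather than the kernel code: the model_* names are
hypothetical, SME streaming mode is ignored, and the boolean result of
model_load_check() stands in for the WARN_ON_ONCE() checks, returning
true when no warning would fire.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the two formats named in the patch; the MODEL_ prefix is ours. */
enum model_fp_type {
	MODEL_FP_STATE_FPSIMD,
	MODEL_FP_STATE_SVE,
};

/* Hypothetical, heavily reduced stand-in for thread_struct. */
struct model_thread {
	bool tif_sve;
	enum model_fp_type fp_type;
};

/* Saving records which format was written next to the saved data. */
static void model_save(struct model_thread *t)
{
	assert(t != NULL);
	t->fp_type = t->tif_sve ? MODEL_FP_STATE_SVE : MODEL_FP_STATE_FPSIMD;
}

/*
 * Loading still picks the format from TIF_SVE at this stage; the recorded
 * type is only compared against it. Returns true when the two agree.
 */
static bool model_load_check(const struct model_thread *t)
{
	enum model_fp_type expected = t->tif_sve ? MODEL_FP_STATE_SVE
						 : MODEL_FP_STATE_FPSIMD;

	return t->fp_type == expected;
}

/* Save with one TIF_SVE value, optionally flip the flag, then check. */
static bool model_roundtrip(bool tif_sve, bool flip_flag_without_update)
{
	struct model_thread t = { .tif_sve = tif_sve };

	model_save(&t);
	if (flip_flag_without_update)
		t.tif_sve = !t.tif_sve;

	return model_load_check(&t);
}
```

Changing the flag without updating the recorded type is the kind of
inconsistency the new checks are meant to catch, which is why every
site in the diff that touches TIF_SVE also sets fp_type.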

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/fpsimd.h    |  2 +-
 arch/arm64/include/asm/kvm_host.h  | 12 ++++++-
 arch/arm64/include/asm/processor.h |  6 ++++
 arch/arm64/kernel/fpsimd.c         | 58 ++++++++++++++++++++++--------
 arch/arm64/kernel/process.c        |  2 ++
 arch/arm64/kernel/ptrace.c         |  3 ++
 arch/arm64/kernel/signal.c         |  7 +++-
 arch/arm64/kvm/fpsimd.c            |  3 +-
 8 files changed, 74 insertions(+), 19 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index c07e4abaca3d..341705fcb7bb 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -61,7 +61,7 @@ extern void fpsimd_kvm_prepare(void);
 extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state,
 				     void *sve_state, unsigned int sve_vl,
 				     void *za_state, unsigned int sme_vl,
-				     u64 *svcr);
+				     u64 *svcr, enum fp_type *type);
 
 extern void fpsimd_flush_task_state(struct task_struct *target);
 extern void fpsimd_save_and_flush_cpu_state(void);
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 45e2136322ba..fd34ab155d0b 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -306,8 +306,18 @@ struct vcpu_reset_state {
 struct kvm_vcpu_arch {
 	struct kvm_cpu_context ctxt;
 
-	/* Guest floating point state */
+	/*
+	 * Guest floating point state
+	 *
+	 * The architecture has two main floating point extensions,
+	 * the original FPSIMD and SVE.  These have overlapping
+	 * register views, with the FPSIMD V registers occupying the
+	 * low 128 bits of the SVE Z registers.  When the core
+	 * floating point code saves the register state of a task it
+	 * records which view it saved in fp_type.
+	 */
 	void *sve_state;
+	enum fp_type fp_type;
 	unsigned int sve_max_vl;
 	u64 svcr;
 
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 445aa3af3b76..3cce0a4c4e8d 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -122,6 +122,11 @@ enum vec_type {
 	ARM64_VEC_MAX,
 };
 
+enum fp_type {
+	FP_STATE_FPSIMD,
+	FP_STATE_SVE,
+};
+
 struct cpu_context {
 	unsigned long x19;
 	unsigned long x20;
@@ -152,6 +157,7 @@ struct thread_struct {
 		struct user_fpsimd_state fpsimd_state;
 	} uw;
 
+	enum fp_type		fp_type;	/* registers FPSIMD or SVE? */
 	unsigned int		fpsimd_cpu;
 	void			*sve_state;	/* SVE registers, if any */
 	void			*za_state;	/* ZA register, if any */
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 549e11645e0f..aacd8e356084 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -125,6 +125,7 @@ struct fpsimd_last_state_struct {
 	u64 *svcr;
 	unsigned int sve_vl;
 	unsigned int sme_vl;
+	enum fp_type *fp_type;
 };
 
 static DEFINE_PER_CPU(struct fpsimd_last_state_struct, fpsimd_last_state);
@@ -330,15 +331,6 @@ void task_set_vl_onexec(struct task_struct *task, enum vec_type type,
  *    The task can execute SVE instructions while in userspace without
  *    trapping to the kernel.
  *
- *    When stored, Z0-Z31 (incorporating Vn in bits[127:0] or the
- *    corresponding Zn), P0-P15 and FFR are encoded in
- *    task->thread.sve_state, formatted appropriately for vector
- *    length task->thread.sve_vl or, if SVCR.SM is set,
- *    task->thread.sme_vl.
- *
- *    task->thread.sve_state must point to a valid buffer at least
- *    sve_state_size(task) bytes in size.
- *
  *    During any syscall, the kernel may optionally clear TIF_SVE and
  *    discard the vector state except for the FPSIMD subset.
  *
@@ -348,7 +340,15 @@ void task_set_vl_onexec(struct task_struct *task, enum vec_type type,
  *    do_sve_acc() to be called, which does some preparation and then
  *    sets TIF_SVE.
  *
- *    When stored, FPSIMD registers V0-V31 are encoded in
+ * During any syscall, the kernel may optionally clear TIF_SVE and
+ * discard the vector state except for the FPSIMD subset.
+ *
+ * The data will be stored in one of two formats:
+ *
+ *  * FPSIMD only - FP_STATE_FPSIMD:
+ *
+ *    When only the FPSIMD state is stored task->thread.fp_type is set to
+ *    FP_STATE_FPSIMD, the FPSIMD registers V0-V31 are encoded in
  *    task->thread.uw.fpsimd_state; bits [max : 128] for each of Z0-Z31 are
  *    logically zero but not stored anywhere; P0-P15 and FFR are not
  *    stored and have unspecified values from userspace's point of
@@ -358,6 +358,19 @@ void task_set_vl_onexec(struct task_struct *task, enum vec_type type,
  *    task->thread.sve_state does not need to be non-NULL, valid or any
  *    particular size: it must not be dereferenced.
  *
+ *  * SVE state - FP_STATE_SVE:
+ *
+ *    When the full SVE state is stored task->thread.fp_type is set to
+ *    FP_STATE_SVE and Z0-Z31 (incorporating Vn in bits[127:0] or the
+ *    corresponding Zn), P0-P15 and FFR are encoded in
+ *    task->thread.sve_state, formatted appropriately for vector
+ *    length task->thread.sve_vl or, if SVCR.SM is set,
+ *    task->thread.sme_vl. The storage for the vector registers in
+ *    task->thread.uw.fpsimd_state should be ignored.
+ *
+ *    task->thread.sve_state must point to a valid buffer at least
+ *    sve_state_size(task) bytes in size.
+ *
  *  * FPSR and FPCR are always stored in task->thread.uw.fpsimd_state
  *    irrespective of whether TIF_SVE is clear or set, since these are
  *    not vector length dependent.
@@ -404,12 +417,15 @@ static void task_fpsimd_load(void)
 		}
 	}
 
-	if (restore_sve_regs)
+	if (restore_sve_regs) {
+		WARN_ON_ONCE(current->thread.fp_type != FP_STATE_SVE);
 		sve_load_state(sve_pffr(&current->thread),
 			       &current->thread.uw.fpsimd_state.fpsr,
 			       restore_ffr);
-	else
+	} else {
+		WARN_ON_ONCE(current->thread.fp_type != FP_STATE_FPSIMD);
 		fpsimd_load_state(&current->thread.uw.fpsimd_state);
+	}
 }
 
 /*
@@ -474,8 +490,10 @@ static void fpsimd_save(void)
 		sve_save_state((char *)last->sve_state +
 					sve_ffr_offset(vl),
 			       &last->st->fpsr, save_ffr);
+		*last->fp_type = FP_STATE_SVE;
 	} else {
 		fpsimd_save_state(last->st);
+		*last->fp_type = FP_STATE_FPSIMD;
 	}
 }
 
@@ -848,8 +866,10 @@ int vec_set_vector_length(struct task_struct *task, enum vec_type type,
 
 	fpsimd_flush_task_state(task);
 	if (test_and_clear_tsk_thread_flag(task, TIF_SVE) ||
-	    thread_sm_enabled(&task->thread))
+	    thread_sm_enabled(&task->thread)) {
 		sve_to_fpsimd(task);
+		task->thread.fp_type = FP_STATE_FPSIMD;
+	}
 
 	if (system_supports_sme() && type == ARM64_VEC_SME) {
 		task->thread.svcr &= ~(SVCR_SM_MASK |
@@ -1368,6 +1388,7 @@ static void sve_init_regs(void)
 		fpsimd_bind_task_to_cpu();
 	} else {
 		fpsimd_to_sve(current);
+		current->thread.fp_type = FP_STATE_SVE;
 	}
 }
 
@@ -1596,6 +1617,8 @@ void fpsimd_flush_thread(void)
 		current->thread.svcr = 0;
 	}
 
+	current->thread.fp_type = FP_STATE_FPSIMD;
+
 	put_cpu_fpsimd_context();
 	kfree(sve_state);
 	kfree(za_state);
@@ -1644,8 +1667,10 @@ void fpsimd_kvm_prepare(void)
 	 */
 	get_cpu_fpsimd_context();
 
-	if (test_and_clear_thread_flag(TIF_SVE))
+	if (test_and_clear_thread_flag(TIF_SVE)) {
 		sve_to_fpsimd(current);
+		current->thread.fp_type = FP_STATE_FPSIMD;
+	}
 
 	put_cpu_fpsimd_context();
 }
@@ -1667,6 +1692,7 @@ static void fpsimd_bind_task_to_cpu(void)
 	last->sve_vl = task_get_sve_vl(current);
 	last->sme_vl = task_get_sme_vl(current);
 	last->svcr = &current->thread.svcr;
+	last->fp_type = &current->thread.fp_type;
 	current->thread.fpsimd_cpu = smp_processor_id();
 
 	/*
@@ -1690,7 +1716,8 @@ static void fpsimd_bind_task_to_cpu(void)
 
 void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state,
 			      unsigned int sve_vl, void *za_state,
-			      unsigned int sme_vl, u64 *svcr)
+			      unsigned int sme_vl, u64 *svcr,
+			      enum fp_type *type)
 {
 	struct fpsimd_last_state_struct *last =
 		this_cpu_ptr(&fpsimd_last_state);
@@ -1704,6 +1731,7 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state,
 	last->za_state = za_state;
 	last->sve_vl = sve_vl;
 	last->sme_vl = sme_vl;
+	last->fp_type = type;
 }
 
 /*
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 044a7d7f1f6a..19cd05eea3f0 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -331,6 +331,8 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 		clear_tsk_thread_flag(dst, TIF_SME);
 	}
 
+	dst->thread.fp_type = FP_STATE_FPSIMD;
+
 	/* clear any pending asynchronous tag fault raised by the parent */
 	clear_tsk_thread_flag(dst, TIF_MTE_ASYNC_FAULT);
 
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index c2fb5755bbec..8a7c91791c16 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -909,6 +909,7 @@ static int sve_set_common(struct task_struct *target,
 		clear_tsk_thread_flag(target, TIF_SVE);
 		if (type == ARM64_VEC_SME)
 			fpsimd_force_sync_to_sve(target);
+		target->thread.fp_type = FP_STATE_FPSIMD;
 		goto out;
 	}
 
@@ -931,6 +932,7 @@ static int sve_set_common(struct task_struct *target,
 	if (!target->thread.sve_state) {
 		ret = -ENOMEM;
 		clear_tsk_thread_flag(target, TIF_SVE);
+		target->thread.fp_type = FP_STATE_FPSIMD;
 		goto out;
 	}
 
@@ -942,6 +944,7 @@ static int sve_set_common(struct task_struct *target,
 	 */
 	fpsimd_sync_to_sve(target);
 	set_tsk_thread_flag(target, TIF_SVE);
+	target->thread.fp_type = FP_STATE_SVE;
 
 	BUILD_BUG_ON(SVE_PT_SVE_OFFSET != sizeof(header));
 	start = SVE_PT_SVE_OFFSET;
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 9ad911f1647c..e0d09bf5b01b 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -207,6 +207,7 @@ static int restore_fpsimd_context(struct fpsimd_context __user *ctx)
 	__get_user_error(fpsimd.fpcr, &ctx->fpcr, err);
 
 	clear_thread_flag(TIF_SVE);
+	current->thread.fp_type = FP_STATE_FPSIMD;
 
 	/* load the hardware registers from the fpsimd_state structure */
 	if (!err)
@@ -292,6 +293,7 @@ static int restore_sve_fpsimd_context(struct user_ctxs *user)
 	if (sve.head.size <= sizeof(*user->sve)) {
 		clear_thread_flag(TIF_SVE);
 		current->thread.svcr &= ~SVCR_SM_MASK;
+		current->thread.fp_type = FP_STATE_FPSIMD;
 		goto fpsimd_only;
 	}
 
@@ -327,6 +329,7 @@ static int restore_sve_fpsimd_context(struct user_ctxs *user)
 		current->thread.svcr |= SVCR_SM_MASK;
 	else
 		set_thread_flag(TIF_SVE);
+	current->thread.fp_type = FP_STATE_SVE;
 
 fpsimd_only:
 	/* copy the FP and status/control registers */
@@ -932,9 +935,11 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
 		 * FPSIMD register state - flush the saved FPSIMD
 		 * register state in case it gets loaded.
 		 */
-		if (current->thread.svcr & SVCR_SM_MASK)
+		if (current->thread.svcr & SVCR_SM_MASK) {
 			memset(&current->thread.uw.fpsimd_state, 0,
 			       sizeof(current->thread.uw.fpsimd_state));
+			current->thread.fp_type = FP_STATE_FPSIMD;
+		}
 
 		current->thread.svcr &= ~(SVCR_ZA_MASK |
 					  SVCR_SM_MASK);
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 51ca78b31b95..a4b4502ad850 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -140,7 +140,8 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
 		fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.fp_regs,
 					 vcpu->arch.sve_state,
 					 vcpu->arch.sve_max_vl,
-					 NULL, 0, &vcpu->arch.svcr);
+					 NULL, 0, &vcpu->arch.svcr,
+					 &vcpu->arch.fp_type);
 
 		clear_thread_flag(TIF_FOREIGN_FPSTATE);
 		update_thread_flag(TIF_SVE, vcpu_has_sve(vcpu));
-- 
2.30.2



* [PATCH v4 3/8] arm64/fpsimd: Have KVM explicitly say which FP registers to save
  2022-10-21 23:03 [PATCH v4 0/8] arm64/sve: Clean up KVM integration and optimise syscalls Mark Brown
  2022-10-21 23:03 ` [PATCH v4 1/8] KVM: arm64: Discard any SVE state when entering KVM guests Mark Brown
  2022-10-21 23:03 ` [PATCH v4 2/8] arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE Mark Brown
@ 2022-10-21 23:03 ` Mark Brown
  2022-11-13 22:27   ` Catalin Marinas
  2022-10-21 23:03 ` [PATCH v4 4/8] arm64/fpsimd: Stop using TIF_SVE to manage register saving in KVM Mark Brown
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Mark Brown @ 2022-10-21 23:03 UTC
  To: Catalin Marinas, Will Deacon
  Cc: Marc Zyngier, Zhang Lei, James Morse, Alexandru Elisei,
	Andre Przywara, kvmarm, linux-arm-kernel, Mark Brown

In order to avoid needlessly saving and restoring the guest registers KVM
relies on the host FPSIMD code to save the guest registers when we context
switch away from the guest. This is done by binding the KVM guest state to
the CPU on top of the task state that was originally there, then carefully
managing the TIF_SVE flag for the task to cause the host to save the full
SVE state when needed regardless of the needs of the host task. This works
well enough but isn't terribly direct about what is going on and makes it
much more complicated to try to optimise what we're doing with the SVE
register state.

Let's instead have KVM pass in the register state it wants saving when it
binds to the CPU. We introduce a new FP_STATE_TASK for use during normal
task binding to indicate that we should base our decisions on the current
task. This should not be used when actually saving. Ideally we might
want to use a separate enum for the type to save but this enum and the
enum values would then need to be named which has problems with clarity
and ambiguity.

In order to ease any future debugging that might be required this patch
does not actually update any of the decision making about what to save;
it merely starts tracking the new information and warns if the requested
state is not what we would otherwise have decided to save.
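
A minimal userspace sketch of the consistency check described above,
assuming hypothetical model_* names; the boolean result stands in for
the WARN_ON_ONCE() calls, with true meaning no warning would fire:

```c
#include <assert.h>
#include <stdbool.h>

enum model_fp_type {
	MODEL_FP_STATE_TASK,	/* decide from the current task */
	MODEL_FP_STATE_FPSIMD,
	MODEL_FP_STATE_SVE,
};

/* What KVM asks for when binding guest state, per the description above. */
static enum model_fp_type model_kvm_to_save(bool vcpu_has_sve)
{
	return vcpu_has_sve ? MODEL_FP_STATE_SVE : MODEL_FP_STATE_FPSIMD;
}

/*
 * Returns true when the request agrees with the decision already made
 * from the task flags, i.e. when the warning would stay quiet.
 */
static bool model_to_save_consistent(enum model_fp_type to_save,
				     bool save_sve_regs)
{
	switch (to_save) {
	case MODEL_FP_STATE_TASK:
		return true;
	case MODEL_FP_STATE_FPSIMD:
		return !save_sve_regs;
	case MODEL_FP_STATE_SVE:
		return save_sve_regs;
	}

	assert(!"unknown fp_type");
	return false;
}
```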

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/fpsimd.h    |  3 ++-
 arch/arm64/include/asm/processor.h |  1 +
 arch/arm64/kernel/fpsimd.c         | 27 ++++++++++++++++++++++++---
 arch/arm64/kvm/fpsimd.c            |  9 ++++++++-
 4 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 341705fcb7bb..0ad683dab1e2 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -61,7 +61,8 @@ extern void fpsimd_kvm_prepare(void);
 extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state,
 				     void *sve_state, unsigned int sve_vl,
 				     void *za_state, unsigned int sme_vl,
-				     u64 *svcr, enum fp_type *type);
+				     u64 *svcr, enum fp_type *type,
+				     enum fp_type to_save);
 
 extern void fpsimd_flush_task_state(struct task_struct *target);
 extern void fpsimd_save_and_flush_cpu_state(void);
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 3cce0a4c4e8d..1d48398cb6dc 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -123,6 +123,7 @@ enum vec_type {
 };
 
 enum fp_type {
+	FP_STATE_TASK,		/* Save based on current task state. */
 	FP_STATE_FPSIMD,
 	FP_STATE_SVE,
 };
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index aacd8e356084..f819f79e69cc 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -126,6 +126,7 @@ struct fpsimd_last_state_struct {
 	unsigned int sve_vl;
 	unsigned int sme_vl;
 	enum fp_type *fp_type;
+	enum fp_type to_save;
 };
 
 static DEFINE_PER_CPU(struct fpsimd_last_state_struct, fpsimd_last_state);
@@ -356,7 +357,8 @@ void task_set_vl_onexec(struct task_struct *task, enum vec_type type,
  *    but userspace is discouraged from relying on this.
  *
  *    task->thread.sve_state does not need to be non-NULL, valid or any
- *    particular size: it must not be dereferenced.
+ *    particular size: it must not be dereferenced and any data stored
+ *    there should be considered stale and not referenced.
  *
  *  * SVE state - FP_STATE_SVE:
  *
@@ -369,7 +371,9 @@ void task_set_vl_onexec(struct task_struct *task, enum vec_type type,
  *    task->thread.uw.fpsimd_state should be ignored.
  *
  *    task->thread.sve_state must point to a valid buffer at least
- *    sve_state_size(task) bytes in size.
+ *    sve_state_size(task) bytes in size. The data stored in
+ *    task->thread.uw.fpsimd_state.vregs should be considered stale
+ *    and not referenced.
  *
  *  * FPSR and FPCR are always stored in task->thread.uw.fpsimd_state
  *    irrespective of whether TIF_SVE is clear or set, since these are
@@ -459,6 +463,21 @@ static void fpsimd_save(void)
 		vl = last->sve_vl;
 	}
 
+	/*
+	 * Validate that an explicitly specified state to save is
+	 * consistent with the task state.
+	 */
+	switch (last->to_save) {
+	case FP_STATE_TASK:
+		break;
+	case FP_STATE_FPSIMD:
+		WARN_ON_ONCE(save_sve_regs);
+		break;
+	case FP_STATE_SVE:
+		WARN_ON_ONCE(!save_sve_regs);
+		break;
+	}
+
 	if (system_supports_sme()) {
 		u64 *svcr = last->svcr;
 
@@ -1693,6 +1712,7 @@ static void fpsimd_bind_task_to_cpu(void)
 	last->sme_vl = task_get_sme_vl(current);
 	last->svcr = &current->thread.svcr;
 	last->fp_type = &current->thread.fp_type;
+	last->to_save = FP_STATE_TASK;
 	current->thread.fpsimd_cpu = smp_processor_id();
 
 	/*
@@ -1717,7 +1737,7 @@ static void fpsimd_bind_task_to_cpu(void)
 void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state,
 			      unsigned int sve_vl, void *za_state,
 			      unsigned int sme_vl, u64 *svcr,
-			      enum fp_type *type)
+			      enum fp_type *type, enum fp_type to_save)
 {
 	struct fpsimd_last_state_struct *last =
 		this_cpu_ptr(&fpsimd_last_state);
@@ -1732,6 +1752,7 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state,
 	last->sve_vl = sve_vl;
 	last->sme_vl = sme_vl;
 	last->fp_type = type;
+	last->to_save = to_save;
 }
 
 /*
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index a4b4502ad850..89c02ce797b8 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -130,9 +130,16 @@ void kvm_arch_vcpu_ctxflush_fp(struct kvm_vcpu *vcpu)
  */
 void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
 {
+	enum fp_type fp_type;
+
 	WARN_ON_ONCE(!irqs_disabled());
 
 	if (vcpu->arch.fp_state == FP_STATE_GUEST_OWNED) {
+		if (vcpu_has_sve(vcpu))
+			fp_type = FP_STATE_SVE;
+		else
+			fp_type = FP_STATE_FPSIMD;
+
 		/*
 		 * Currently we do not support SME guests so SVCR is
 		 * always 0 and we just need a variable to point to.
@@ -141,7 +148,7 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
 					 vcpu->arch.sve_state,
 					 vcpu->arch.sve_max_vl,
 					 NULL, 0, &vcpu->arch.svcr,
-					 &vcpu->arch.fp_type);
+					 &vcpu->arch.fp_type, fp_type);
 
 		clear_thread_flag(TIF_FOREIGN_FPSTATE);
 		update_thread_flag(TIF_SVE, vcpu_has_sve(vcpu));
-- 
2.30.2



* [PATCH v4 4/8] arm64/fpsimd: Stop using TIF_SVE to manage register saving in KVM
  2022-10-21 23:03 [PATCH v4 0/8] arm64/sve: Clean up KVM integration and optimise syscalls Mark Brown
                   ` (2 preceding siblings ...)
  2022-10-21 23:03 ` [PATCH v4 3/8] arm64/fpsimd: Have KVM explicitly say which FP registers to save Mark Brown
@ 2022-10-21 23:03 ` Mark Brown
  2022-11-13 22:30   ` Catalin Marinas
  2022-10-21 23:03 ` [PATCH v4 5/8] arm64/fpsimd: Load FP state based on recorded data type Mark Brown
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Mark Brown @ 2022-10-21 23:03 UTC
  To: Catalin Marinas, Will Deacon
  Cc: Marc Zyngier, Zhang Lei, James Morse, Alexandru Elisei,
	Andre Przywara, kvmarm, linux-arm-kernel, Mark Brown

Now that we are explicitly telling the host FP code which register state
it needs to save we can remove the manipulation of TIF_SVE from the KVM
code, simplifying it and allowing us to optimise our handling of normal
tasks. Remove the manipulation of TIF_SVE from KVM and instead rely on
to_save to ensure we save the correct data for it.

There should be no functional or performance impact from this change.
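
The resulting rule for whether the SVE view gets saved can be written as
a small pure function. This is a userspace sketch with hypothetical
model_* names; only the boolean expression itself is taken from the
change to fpsimd_save() in this patch.

```c
#include <assert.h>
#include <stdbool.h>

enum model_fp_type {
	MODEL_FP_STATE_TASK,	/* decide from the current task */
	MODEL_FP_STATE_FPSIMD,
	MODEL_FP_STATE_SVE,
};

/*
 * Should the SVE register view be saved? A normal task binding uses
 * MODEL_FP_STATE_TASK and falls back to TIF_SVE; an explicit request
 * (as made by KVM) is honoured regardless of the flag.
 */
static bool model_save_sve_regs(enum model_fp_type to_save, bool tif_sve)
{
	/* Reject values outside the enum. */
	assert(to_save <= MODEL_FP_STATE_SVE);

	return (to_save == MODEL_FP_STATE_TASK && tif_sve) ||
	       to_save == MODEL_FP_STATE_SVE;
}
```

In this model an SVE guest has its full state saved even when the VMM
task has TIF_SVE clear, which is what allows KVM to stop touching the
flag.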

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/kernel/fpsimd.c | 22 ++++------------------
 arch/arm64/kvm/fpsimd.c    |  3 ---
 2 files changed, 4 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index f819f79e69cc..65ea8163c7f6 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -439,8 +439,8 @@ static void task_fpsimd_load(void)
  * last, if KVM is involved this may be the guest VM context rather
  * than the host thread for the VM pointed to by current. This means
  * that we must always reference the state storage via last rather
- * than via current, other than the TIF_ flags which KVM will
- * carefully maintain for us.
+ * than via current, if we are saving KVM state then it will have
+ * ensured that the type of registers to save is set in last->to_save.
  */
 static void fpsimd_save(void)
 {
@@ -457,27 +457,13 @@ static void fpsimd_save(void)
 	if (test_thread_flag(TIF_FOREIGN_FPSTATE))
 		return;
 
-	if (test_thread_flag(TIF_SVE)) {
+	if ((last->to_save == FP_STATE_TASK && test_thread_flag(TIF_SVE)) ||
+	    last->to_save == FP_STATE_SVE) {
 		save_sve_regs = true;
 		save_ffr = true;
 		vl = last->sve_vl;
 	}
 
-	/*
-	 * Validate that an explicitly specified state to save is
-	 * consistent with the task state.
-	 */
-	switch (last->to_save) {
-	case FP_STATE_TASK:
-		break;
-	case FP_STATE_FPSIMD:
-		WARN_ON_ONCE(save_sve_regs);
-		break;
-	case FP_STATE_SVE:
-		WARN_ON_ONCE(!save_sve_regs);
-		break;
-	}
-
 	if (system_supports_sme()) {
 		u64 *svcr = last->svcr;
 
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 89c02ce797b8..ec82d0191f76 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -151,7 +151,6 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
 					 &vcpu->arch.fp_type, fp_type);
 
 		clear_thread_flag(TIF_FOREIGN_FPSTATE);
-		update_thread_flag(TIF_SVE, vcpu_has_sve(vcpu));
 	}
 }
 
@@ -208,7 +207,5 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
 			sysreg_clear_set(CPACR_EL1, CPACR_EL1_ZEN_EL0EN, 0);
 	}
 
-	update_thread_flag(TIF_SVE, 0);
-
 	local_irq_restore(flags);
 }
-- 
2.30.2



* [PATCH v4 5/8] arm64/fpsimd: Load FP state based on recorded data type
  2022-10-21 23:03 [PATCH v4 0/8] arm64/sve: Clean up KVM integration and optimise syscalls Mark Brown
                   ` (3 preceding siblings ...)
  2022-10-21 23:03 ` [PATCH v4 4/8] arm64/fpsimd: Stop using TIF_SVE to manage register saving in KVM Mark Brown
@ 2022-10-21 23:03 ` Mark Brown
  2022-11-14  9:24   ` Catalin Marinas
  2022-10-21 23:03 ` [PATCH v4 6/8] arm64/fpsimd: SME no longer requires SVE register state Mark Brown
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Mark Brown @ 2022-10-21 23:03 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Marc Zyngier, Zhang Lei, James Morse, Alexandru Elisei,
	Andre Przywara, kvmarm, linux-arm-kernel, Mark Brown

Now that we are recording the type of floating point register state we
are saving when we write the register state out to memory we can use
that information when we load from memory to decide which format to
load, bringing TIF_SVE into line with what we saved rather than relying
on TIF_SVE to determine what to load.

The SME state details are already recorded directly in the saved
SVCR and handled based on the information there.

Since we are not changing any of the save paths there should be no
functional change from this patch, further patches will make use of this
to optimise and clarify the code.
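
As a rough illustration, the switch added to task_fpsimd_load() can be
modelled in userspace C. This is a sketch under assumptions, not the
kernel code: struct load_plan and plan_load() are hypothetical names, the
boolean arguments stand in for TIF_SVE and thread_sm_enabled(), and the
later SME handling in the function is not modelled.

```c
#include <stdbool.h>

/* Userspace model only; this is not the kernel implementation. */
enum fp_type { FP_STATE_TASK, FP_STATE_FPSIMD, FP_STATE_SVE };

struct load_plan {
	bool tif_sve;          /* value TIF_SVE holds after the load */
	bool restore_sve_regs; /* load the SVE format rather than FPSIMD */
	bool warn;             /* a WARN_ON_ONCE() would have fired */
};

/* Hypothetical helper mirroring the switch on the recorded state type. */
static struct load_plan plan_load(enum fp_type saved, bool tif_sve,
				  bool streaming_mode)
{
	struct load_plan plan = { .tif_sve = tif_sve };

	switch (saved) {
	case FP_STATE_FPSIMD:
		/* Stop tracking SVE until userspace next uses it. */
		plan.tif_sve = false;
		break;
	case FP_STATE_SVE:
		if (!streaming_mode) {
			/* SVE data saved with TIF_SVE clear is unexpected here. */
			plan.warn = !tif_sve;
			plan.tif_sve = true;
		}
		plan.restore_sve_regs = true;
		break;
	default:
		/* No explicit format recorded: warn and fall back to FPSIMD. */
		plan.warn = true;
		plan.tif_sve = false;
		break;
	}

	return plan;
}
```

The point of the model is that the recorded type drives both the format
that is loaded and the resulting value of TIF_SVE, rather than TIF_SVE
driving the load.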

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/kernel/fpsimd.c | 40 ++++++++++++++++++++++++++++++--------
 1 file changed, 32 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 65ea8163c7f6..6544ecb297a5 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -395,11 +395,37 @@ static void task_fpsimd_load(void)
 	WARN_ON(!system_supports_fpsimd());
 	WARN_ON(!have_cpu_fpsimd_context());
 
-	/* Check if we should restore SVE first */
-	if (IS_ENABLED(CONFIG_ARM64_SVE) && test_thread_flag(TIF_SVE)) {
-		sve_set_vq(sve_vq_from_vl(task_get_sve_vl(current)) - 1);
-		restore_sve_regs = true;
-		restore_ffr = true;
+	if (system_supports_sve()) {
+		switch (current->thread.fp_type) {
+		case FP_STATE_FPSIMD:
+			/* Stop tracking SVE for this task until next use. */
+			if (test_and_clear_thread_flag(TIF_SVE))
+				sve_user_disable();
+			break;
+		case FP_STATE_SVE:
+			if (!thread_sm_enabled(&current->thread) &&
+			    !WARN_ON_ONCE(!test_and_set_thread_flag(TIF_SVE)))
+				sve_user_enable();
+
+			if (test_thread_flag(TIF_SVE))
+				sve_set_vq(sve_vq_from_vl(task_get_sve_vl(current)) - 1);
+
+			restore_sve_regs = true;
+			restore_ffr = true;
+			break;
+		default:
+			/*
+			 * This indicates either a bug in
+			 * fpsimd_save() or memory corruption, we
+			 * should always record an explicit format
+			 * when we save. We always at least have the
+			 * memory allocated for FPSIMD registers so
+			 * try that and hope for the best.
+			 */
+			WARN_ON_ONCE(1);
+			clear_thread_flag(TIF_SVE);
+			break;
+		}
 	}
 
 	/* Restore SME, override SVE register configuration if needed */
@@ -415,10 +441,8 @@ static void task_fpsimd_load(void)
 		if (thread_za_enabled(&current->thread))
 			za_load_state(current->thread.za_state);
 
-		if (thread_sm_enabled(&current->thread)) {
-			restore_sve_regs = true;
+		if (thread_sm_enabled(&current->thread))
 			restore_ffr = system_supports_fa64();
-		}
 	}
 
 	if (restore_sve_regs) {
-- 
2.30.2



* [PATCH v4 6/8] arm64/fpsimd: SME no longer requires SVE register state
  2022-10-21 23:03 [PATCH v4 0/8] arm64/sve: Clean up KVM integration and optimise syscalls Mark Brown
                   ` (4 preceding siblings ...)
  2022-10-21 23:03 ` [PATCH v4 5/8] arm64/fpsimd: Load FP state based on recorded data type Mark Brown
@ 2022-10-21 23:03 ` Mark Brown
  2022-11-14 10:18   ` Catalin Marinas
  2022-10-21 23:03 ` [PATCH v4 7/8] arm64/sve: Leave SVE enabled on syscall if we don't context switch Mark Brown
  2022-10-21 23:03 ` [PATCH v4 8/8] arm64/fp: Use a struct to pass data to fpsimd_bind_state_to_cpu() Mark Brown
  7 siblings, 1 reply; 19+ messages in thread
From: Mark Brown @ 2022-10-21 23:03 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Marc Zyngier, Zhang Lei, James Morse, Alexandru Elisei,
	Andre Przywara, kvmarm, linux-arm-kernel, Mark Brown

Now that we track the type of register state stored separately to
tracking what is active in the task it is valid to have FPSIMD register
state stored while in streaming mode so remove the special case handling
for SME when setting FPSIMD register state.

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/kernel/fpsimd.c | 3 +--
 arch/arm64/kernel/ptrace.c | 2 --
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 6544ecb297a5..73a540a3a11c 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -815,8 +815,7 @@ void fpsimd_sync_to_sve(struct task_struct *task)
  */
 void sve_sync_to_fpsimd(struct task_struct *task)
 {
-	if (test_tsk_thread_flag(task, TIF_SVE) ||
-	    thread_sm_enabled(&task->thread))
+	if (task->thread.fp_type == FP_STATE_SVE)
 		sve_to_fpsimd(task);
 }
 
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 8a7c91791c16..979dbdc36d52 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -907,8 +907,6 @@ static int sve_set_common(struct task_struct *target,
 		ret = __fpr_set(target, regset, pos, count, kbuf, ubuf,
 				SVE_PT_FPSIMD_OFFSET);
 		clear_tsk_thread_flag(target, TIF_SVE);
-		if (type == ARM64_VEC_SME)
-			fpsimd_force_sync_to_sve(target);
 		target->thread.fp_type = FP_STATE_FPSIMD;
 		goto out;
 	}
-- 
2.30.2



* [PATCH v4 7/8] arm64/sve: Leave SVE enabled on syscall if we don't context switch
  2022-10-21 23:03 [PATCH v4 0/8] arm64/sve: Clean up KVM integration and optimise syscalls Mark Brown
                   ` (5 preceding siblings ...)
  2022-10-21 23:03 ` [PATCH v4 6/8] arm64/fpsimd: SME no longer requires SVE register state Mark Brown
@ 2022-10-21 23:03 ` Mark Brown
  2022-11-14 10:31   ` Catalin Marinas
  2022-10-21 23:03 ` [PATCH v4 8/8] arm64/fp: Use a struct to pass data to fpsimd_bind_state_to_cpu() Mark Brown
  7 siblings, 1 reply; 19+ messages in thread
From: Mark Brown @ 2022-10-21 23:03 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Marc Zyngier, Zhang Lei, James Morse, Alexandru Elisei,
	Andre Przywara, kvmarm, linux-arm-kernel, Mark Brown

The syscall ABI says that the SVE register state not shared with FPSIMD
may not be preserved on syscall, and this is the only mechanism we have
in the ABI to stop tracking the extra SVE state for a process. Currently
we do this unconditionally by means of disabling SVE for the process on
syscall, causing userspace to take a trap to EL1 if it uses SVE again.
These extra traps result in a noticeable overhead for using SVE instead
of FPSIMD in some workloads, especially for simple syscalls where we can
return directly to userspace and would not otherwise need to update the
floating point registers. Tests with fp-pidbench show an approximately
70% overhead on a range of implementations when SVE is in use - while
this is an extreme and entirely artificial benchmark it is clear that
there is some useful room for improvement here.

Now that we have the ability to track the decision about what to save
separately to TIF_SVE we can improve things by leaving TIF_SVE enabled on
syscall but only saving the FPSIMD registers if we are in a syscall.
This means that if we need to restore the register state from memory
(e.g. after a context switch or kernel mode NEON) we will drop TIF_SVE
and reenable traps for userspace, but if we can just return to userspace
then traps will remain disabled.

Since our current implementation and hence ABI has the effect of zeroing
all the SVE register state not shared with FPSIMD on syscall we replace
the disabling of TIF_SVE with a flush of the non-shared register state,
this means that there is still some overhead for syscalls when SVE is in
use but it is very much reduced.
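
To make the two halves of this change concrete, here is a userspace
sketch in plain C. It rests on assumptions and is not the kernel code:
struct sve_regs_model, should_save_sve() and flush_non_shared() are
hypothetical names, the in_syscall argument stands in for
in_syscall(current_pt_regs()), and the register file is modelled as byte
arrays in which the low 128 bits of each Z register are the part shared
with FPSIMD.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace model only; sizes and names are illustrative assumptions. */
enum fp_type { FP_STATE_TASK, FP_STATE_FPSIMD, FP_STATE_SVE };

#define FPSIMD_BYTES 16   /* low 128 bits of each Z register alias V0-V31 */
#define MAX_VL_BYTES 256  /* architectural maximum SVE vector length */

struct sve_regs_model {
	unsigned char z[32][MAX_VL_BYTES];     /* Z0-Z31 */
	unsigned char p[16][MAX_VL_BYTES / 8]; /* P0-P15, one bit per Z byte */
	unsigned char ffr[MAX_VL_BYTES / 8];   /* first fault register */
};

/* Save decision with the syscall case added. */
static bool should_save_sve(enum fp_type to_save, bool tif_sve,
			    bool in_syscall)
{
	return (to_save == FP_STATE_TASK && tif_sve && !in_syscall) ||
	       to_save == FP_STATE_SVE;
}

/*
 * Zero everything that is not shared with FPSIMD for a vector length of
 * vl_bytes, which must be between FPSIMD_BYTES and MAX_VL_BYTES.
 */
static void flush_non_shared(struct sve_regs_model *r, size_t vl_bytes)
{
	size_t i;

	for (i = 0; i < 32; i++)
		memset(r->z[i] + FPSIMD_BYTES, 0, vl_bytes - FPSIMD_BYTES);
	for (i = 0; i < 16; i++)
		memset(r->p[i], 0, vl_bytes / 8);
	memset(r->ffr, 0, vl_bytes / 8);
}
```

In the model a task in a syscall with TIF_SVE set still has its
non-shared state zeroed, as the existing ABI behaviour requires, but only
the FPSIMD view needs saving if a context switch happens before the
return to userspace.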

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/kernel/fpsimd.c  |  8 +++++++-
 arch/arm64/kernel/syscall.c | 19 +++++--------------
 2 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 73a540a3a11c..984ea997e7ca 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -481,7 +481,13 @@ static void fpsimd_save(void)
 	if (test_thread_flag(TIF_FOREIGN_FPSTATE))
 		return;
 
-	if ((last->to_save == FP_STATE_TASK && test_thread_flag(TIF_SVE)) ||
+	/*
+	 * If a task is in a syscall the ABI allows us to only
+	 * preserve the state shared with FPSIMD so don't bother
+	 * saving the full SVE state in that case.
+	 */
+	if ((last->to_save == FP_STATE_TASK && test_thread_flag(TIF_SVE) &&
+	     !in_syscall(current_pt_regs())) ||
 	    last->to_save == FP_STATE_SVE) {
 		save_sve_regs = true;
 		save_ffr = true;
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index d72e8f23422d..a5de47e3df2b 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -183,21 +183,12 @@ static inline void fp_user_discard(void)
 	if (!system_supports_sve())
 		return;
 
-	/*
-	 * If SME is not active then disable SVE, the registers will
-	 * be cleared when userspace next attempts to access them and
-	 * we do not need to track the SVE register state until then.
-	 */
-	clear_thread_flag(TIF_SVE);
+	if (test_thread_flag(TIF_SVE)) {
+		unsigned int sve_vq_minus_one;
 
-	/*
-	 * task_fpsimd_load() won't be called to update CPACR_EL1 in
-	 * ret_to_user unless TIF_FOREIGN_FPSTATE is still set, which only
-	 * happens if a context switch or kernel_neon_begin() or context
-	 * modification (sigreturn, ptrace) intervenes.
-	 * So, ensure that CPACR_EL1 is already correct for the fast-path case.
-	 */
-	sve_user_disable();
+		sve_vq_minus_one = sve_vq_from_vl(task_get_sve_vl(current)) - 1;
+		sve_flush_live(true, sve_vq_minus_one);
+	}
 }
 
 void do_el0_svc(struct pt_regs *regs)
-- 
2.30.2



* [PATCH v4 8/8] arm64/fp: Use a struct to pass data to fpsimd_bind_state_to_cpu()
  2022-10-21 23:03 [PATCH v4 0/8] arm64/sve: Clean up KVM integration and optimise syscalls Mark Brown
                   ` (6 preceding siblings ...)
  2022-10-21 23:03 ` [PATCH v4 7/8] arm64/sve: Leave SVE enabled on syscall if we don't context switch Mark Brown
@ 2022-10-21 23:03 ` Mark Brown
  2022-11-14 10:33   ` Catalin Marinas
  7 siblings, 1 reply; 19+ messages in thread
From: Mark Brown @ 2022-10-21 23:03 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Marc Zyngier, Zhang Lei, James Morse, Alexandru Elisei,
	Andre Przywara, kvmarm, linux-arm-kernel, Mark Brown

For reasons that are unclear to this reader fpsimd_bind_state_to_cpu()
populates the struct fpsimd_last_state_struct that it uses to store the
active floating point state for KVM guests by passing an argument for
each member of the structure. As the richness of the architecture increases
this is resulting in a function with a rather large number of arguments
which isn't ideal.

Simplify the interface by using the struct directly as the single argument
for the function, renaming it as we lift the definition into the header.
This could be built on further to reduce the work we do adding storage for
new FP state in various places but for now it just simplifies this one
interface.
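
The interface shape can be sketched in userspace C as below. This is an
illustration under assumptions, not the kernel code: the _model suffixes,
bind_state() and guest_state() are hypothetical, a plain static variable
stands in for the per-CPU fpsimd_last_state, and void pointers replace
the real state types.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only; field names follow the struct in the patch. */
enum fp_type { FP_STATE_TASK, FP_STATE_FPSIMD, FP_STATE_SVE };

struct cpu_fp_state_model {
	void *st;
	void *sve_state;
	void *za_state;
	uint64_t *svcr;
	unsigned int sve_vl;
	unsigned int sme_vl;
	enum fp_type *fp_type;
	enum fp_type to_save;
};

/* Stands in for the per-CPU fpsimd_last_state variable. */
static struct cpu_fp_state_model last_state;

/* One struct argument replaces one argument per member. */
static void bind_state(const struct cpu_fp_state_model *state)
{
	last_state = *state;
}

/*
 * Hypothetical caller-side helper. A designated initializer zero-fills
 * every member that is not named, so za_state and sme_vl start out as
 * NULL and 0 without being listed.
 */
static struct cpu_fp_state_model guest_state(uint64_t *svcr,
					     enum fp_type *recorded,
					     unsigned int sve_vl,
					     bool has_sve)
{
	struct cpu_fp_state_model s = {
		.svcr = svcr,
		.fp_type = recorded,
		.sve_vl = sve_vl,
		.to_save = has_sve ? FP_STATE_SVE : FP_STATE_FPSIMD,
	};

	return s;
}
```

A new piece of FP state then only needs a new struct member and an
assignment at the call sites that care about it, rather than a signature
change for every caller.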

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/fpsimd.h | 17 ++++++++++++-----
 arch/arm64/kernel/fpsimd.c      | 34 ++++++---------------------------
 arch/arm64/kvm/fpsimd.c         | 24 +++++++++++++----------
 3 files changed, 32 insertions(+), 43 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 0ad683dab1e2..e6fa1e2982c8 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -58,11 +58,18 @@ extern void fpsimd_restore_current_state(void);
 extern void fpsimd_update_current_state(struct user_fpsimd_state const *state);
 extern void fpsimd_kvm_prepare(void);
 
-extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state,
-				     void *sve_state, unsigned int sve_vl,
-				     void *za_state, unsigned int sme_vl,
-				     u64 *svcr, enum fp_type *type,
-				     enum fp_type to_save);
+struct cpu_fp_state {
+	struct user_fpsimd_state *st;
+	void *sve_state;
+	void *za_state;
+	u64 *svcr;
+	unsigned int sve_vl;
+	unsigned int sme_vl;
+	enum fp_type *fp_type;
+	enum fp_type to_save;
+};
+
+extern void fpsimd_bind_state_to_cpu(struct cpu_fp_state *fp_state);
 
 extern void fpsimd_flush_task_state(struct task_struct *target);
 extern void fpsimd_save_and_flush_cpu_state(void);
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 984ea997e7ca..8b1f9b44d831 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -118,18 +118,8 @@
  *   returned from the 2nd syscall yet, TIF_FOREIGN_FPSTATE is still set so
  *   whatever is in the FPSIMD registers is not saved to memory, but discarded.
  */
-struct fpsimd_last_state_struct {
-	struct user_fpsimd_state *st;
-	void *sve_state;
-	void *za_state;
-	u64 *svcr;
-	unsigned int sve_vl;
-	unsigned int sme_vl;
-	enum fp_type *fp_type;
-	enum fp_type to_save;
-};
 
-static DEFINE_PER_CPU(struct fpsimd_last_state_struct, fpsimd_last_state);
+static DEFINE_PER_CPU(struct cpu_fp_state, fpsimd_last_state);
 
 __ro_after_init struct vl_info vl_info[ARM64_VEC_MAX] = {
 #ifdef CONFIG_ARM64_SVE
@@ -468,7 +458,7 @@ static void task_fpsimd_load(void)
  */
 static void fpsimd_save(void)
 {
-	struct fpsimd_last_state_struct const *last =
+	struct cpu_fp_state const *last =
 		this_cpu_ptr(&fpsimd_last_state);
 	/* set by fpsimd_bind_task_to_cpu() or fpsimd_bind_state_to_cpu() */
 	bool save_sve_regs = false;
@@ -1716,8 +1706,7 @@ void fpsimd_kvm_prepare(void)
  */
 static void fpsimd_bind_task_to_cpu(void)
 {
-	struct fpsimd_last_state_struct *last =
-		this_cpu_ptr(&fpsimd_last_state);
+	struct cpu_fp_state *last = this_cpu_ptr(&fpsimd_last_state);
 
 	WARN_ON(!system_supports_fpsimd());
 	last->st = &current->thread.uw.fpsimd_state;
@@ -1749,25 +1738,14 @@ static void fpsimd_bind_task_to_cpu(void)
 	}
 }
 
-void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state,
-			      unsigned int sve_vl, void *za_state,
-			      unsigned int sme_vl, u64 *svcr,
-			      enum fp_type *type, enum fp_type to_save)
+void fpsimd_bind_state_to_cpu(struct cpu_fp_state *state)
 {
-	struct fpsimd_last_state_struct *last =
-		this_cpu_ptr(&fpsimd_last_state);
+	struct cpu_fp_state *last = this_cpu_ptr(&fpsimd_last_state);
 
 	WARN_ON(!system_supports_fpsimd());
 	WARN_ON(!in_softirq() && !irqs_disabled());
 
-	last->st = st;
-	last->svcr = svcr;
-	last->sve_state = sve_state;
-	last->za_state = za_state;
-	last->sve_vl = sve_vl;
-	last->sme_vl = sme_vl;
-	last->fp_type = type;
-	last->to_save = to_save;
+	*last = *state;
 }
 
 /*
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index ec82d0191f76..02dd7e9ebd39 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -130,25 +130,29 @@ void kvm_arch_vcpu_ctxflush_fp(struct kvm_vcpu *vcpu)
  */
 void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
 {
-	enum fp_type fp_type;
+	struct cpu_fp_state fp_state;
 
 	WARN_ON_ONCE(!irqs_disabled());
 
 	if (vcpu->arch.fp_state == FP_STATE_GUEST_OWNED) {
-		if (vcpu_has_sve(vcpu))
-			fp_type = FP_STATE_SVE;
-		else
-			fp_type = FP_STATE_FPSIMD;
 
 		/*
 		 * Currently we do not support SME guests so SVCR is
 		 * always 0 and we just need a variable to point to.
 		 */
-		fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.fp_regs,
-					 vcpu->arch.sve_state,
-					 vcpu->arch.sve_max_vl,
-					 NULL, 0, &vcpu->arch.svcr,
-					 &vcpu->arch.fp_type, fp_type);
+		fp_state.st = &vcpu->arch.ctxt.fp_regs;
+		fp_state.sve_state = vcpu->arch.sve_state;
+		fp_state.sve_vl = vcpu->arch.sve_max_vl;
+		fp_state.za_state = NULL;
+		fp_state.svcr = &vcpu->arch.svcr;
+		fp_state.fp_type = &vcpu->arch.fp_type;
+
+		if (vcpu_has_sve(vcpu))
+			fp_state.to_save = FP_STATE_SVE;
+		else
+			fp_state.to_save = FP_STATE_FPSIMD;
+
+		fpsimd_bind_state_to_cpu(&fp_state);
 
 		clear_thread_flag(TIF_FOREIGN_FPSTATE);
 	}
-- 
2.30.2



* Re: [PATCH v4 1/8] KVM: arm64: Discard any SVE state when entering KVM guests
  2022-10-21 23:03 ` [PATCH v4 1/8] KVM: arm64: Discard any SVE state when entering KVM guests Mark Brown
@ 2022-11-13 22:00   ` Catalin Marinas
  0 siblings, 0 replies; 19+ messages in thread
From: Catalin Marinas @ 2022-11-13 22:00 UTC (permalink / raw)
  To: Mark Brown
  Cc: Will Deacon, Marc Zyngier, Zhang Lei, James Morse,
	Alexandru Elisei, Andre Przywara, kvmarm, linux-arm-kernel

On Sat, Oct 22, 2022 at 12:03:14AM +0100, Mark Brown wrote:
> Since 8383741ab2e773a99 (KVM: arm64: Get rid of host SVE tracking/saving)
> KVM has not tracked the host SVE state, relying on the fact that we
> currently disable SVE whenever we perform a syscall. This may not be true
> in future since performance optimisation may result in us keeping SVE
> enabled in order to avoid needing to take access traps to reenable it.
> Handle this by clearing TIF_SVE and converting the stored task state to
> FPSIMD format when preparing to run the guest.  This is done with a new
> call fpsimd_kvm_prepare() to keep the direct state manipulation
> functions internal to fpsimd.c.
> 
> Signed-off-by: Mark Brown <broonie@kernel.org>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>


* Re: [PATCH v4 2/8] arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE
  2022-10-21 23:03 ` [PATCH v4 2/8] arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE Mark Brown
@ 2022-11-13 22:12   ` Catalin Marinas
  2022-11-14 11:10     ` Mark Brown
  0 siblings, 1 reply; 19+ messages in thread
From: Catalin Marinas @ 2022-11-13 22:12 UTC (permalink / raw)
  To: Mark Brown
  Cc: Will Deacon, Marc Zyngier, Zhang Lei, James Morse,
	Alexandru Elisei, Andre Przywara, kvmarm, linux-arm-kernel

On Sat, Oct 22, 2022 at 12:03:15AM +0100, Mark Brown wrote:
> diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> index 445aa3af3b76..3cce0a4c4e8d 100644
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -122,6 +122,11 @@ enum vec_type {
>  	ARM64_VEC_MAX,
>  };
>  
> +enum fp_type {
> +	FP_STATE_FPSIMD,
> +	FP_STATE_SVE,
> +};

More of a nitpick, I still find the type/state naming confusing but I
don't have a better suggestion. Maybe call them FP_TYPE_* or rename the
enum to fp_state_type (up to you really, feel free to leave them as they
are).

Either way, I think having an explicit type of what state is about is a
significant improvement in clarity.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>


* Re: [PATCH v4 3/8] arm64/fpsimd: Have KVM explicitly say which FP registers to save
  2022-10-21 23:03 ` [PATCH v4 3/8] arm64/fpsimd: Have KVM explicitly say which FP registers to save Mark Brown
@ 2022-11-13 22:27   ` Catalin Marinas
  0 siblings, 0 replies; 19+ messages in thread
From: Catalin Marinas @ 2022-11-13 22:27 UTC (permalink / raw)
  To: Mark Brown
  Cc: Will Deacon, Marc Zyngier, Zhang Lei, James Morse,
	Alexandru Elisei, Andre Przywara, kvmarm, linux-arm-kernel

On Sat, Oct 22, 2022 at 12:03:16AM +0100, Mark Brown wrote:
> In order to avoid needlessly saving and restoring the guest registers KVM
> relies on the host FPSIMD code to save the guest registers when we context
> switch away from the guest. This is done by binding the KVM guest state to
> the CPU on top of the task state that was originally there, then carefully
> managing the TIF_SVE flag for the task to cause the host to save the full
> SVE state when needed regardless of the needs of the host task. This works
> well enough but isn't terribly direct about what is going on and makes it
> much more complicated to try to optimise what we're doing with the SVE
> register state.
> 
> Let's instead have KVM pass in the register state it wants saving when it
> binds to the CPU. We introduce a new FP_STATE_TASK for use during normal
> task binding to indicate that we should base our decisions on the current
> task. This should not ben used when actually saving. Ideally we might

s/ben/be/

> diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> index 3cce0a4c4e8d..1d48398cb6dc 100644
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -123,6 +123,7 @@ enum vec_type {
>  };
>  
>  enum fp_type {
> +	FP_STATE_TASK,		/* Save based on current task state. */

Nitpick: FP_STATE_CURRENT instead?

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>


* Re: [PATCH v4 4/8] arm64/fpsimd: Stop using TIF_SVE to manage register saving in KVM
  2022-10-21 23:03 ` [PATCH v4 4/8] arm64/fpsimd: Stop using TIF_SVE to manage register saving in KVM Mark Brown
@ 2022-11-13 22:30   ` Catalin Marinas
  0 siblings, 0 replies; 19+ messages in thread
From: Catalin Marinas @ 2022-11-13 22:30 UTC (permalink / raw)
  To: Mark Brown
  Cc: Will Deacon, Marc Zyngier, Zhang Lei, James Morse,
	Alexandru Elisei, Andre Przywara, kvmarm, linux-arm-kernel

On Sat, Oct 22, 2022 at 12:03:17AM +0100, Mark Brown wrote:
> Now that we are explicitly telling the host FP code which register state
> it needs to save we can remove the manipulation of TIF_SVE from the KVM
> code, simplifying it and allowing us to optimise our handling of normal
> tasks. Remove the manipulation of TIF_SVE from KVM and instead rely on
> to_save to ensure we save the correct data for it.
> 
> There should be no functional or performance impact from this change.
> 
> Signed-off-by: Mark Brown <broonie@kernel.org>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>


* Re: [PATCH v4 5/8] arm64/fpsimd: Load FP state based on recorded data type
  2022-10-21 23:03 ` [PATCH v4 5/8] arm64/fpsimd: Load FP state based on recorded data type Mark Brown
@ 2022-11-14  9:24   ` Catalin Marinas
  0 siblings, 0 replies; 19+ messages in thread
From: Catalin Marinas @ 2022-11-14  9:24 UTC (permalink / raw)
  To: Mark Brown
  Cc: Will Deacon, Marc Zyngier, Zhang Lei, James Morse,
	Alexandru Elisei, Andre Przywara, kvmarm, linux-arm-kernel

On Sat, Oct 22, 2022 at 12:03:18AM +0100, Mark Brown wrote:
> Now that we are recording the type of floating point register state we
> are saving when we write the register state out to memory we can use
> that information when we load from memory to decide which format to
> load, bringing TIF_SVE into line with what we saved rather than relying
> on TIF_SVE to determine what to load.
> 
> The SME state details are already recorded directly in the saved
> SVCR and handled based on the information there.
> 
> Since we are not changing any of the save paths there should be no
> functional change from this patch, further patches will make use of this
> to optimise and clarify the code.
> 
> Signed-off-by: Mark Brown <broonie@kernel.org>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>


* Re: [PATCH v4 6/8] arm64/fpsimd: SME no longer requires SVE register state
  2022-10-21 23:03 ` [PATCH v4 6/8] arm64/fpsimd: SME no longer requires SVE register state Mark Brown
@ 2022-11-14 10:18   ` Catalin Marinas
  0 siblings, 0 replies; 19+ messages in thread
From: Catalin Marinas @ 2022-11-14 10:18 UTC (permalink / raw)
  To: Mark Brown
  Cc: Will Deacon, Marc Zyngier, Zhang Lei, James Morse,
	Alexandru Elisei, Andre Przywara, kvmarm, linux-arm-kernel

On Sat, Oct 22, 2022 at 12:03:19AM +0100, Mark Brown wrote:
> Now that we track the type of register state stored separately to
> tracking what is active in the task it is valid to have FPSIMD register
> state stored while in streaming mode so remove the special case handling
> for SME when setting FPSIMD register state.

I think this needs some punctuation as it is pretty hard to read.
Slightly adjusted, hopefully it keeps the same message:

  Now that we track the type of the stored register state separately to
  what is active in the task, it is valid to have the FPSIMD register
  state stored while in streaming mode. Remove the special case handling
  for SME when setting FPSIMD register state.

Otherwise,

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>


* Re: [PATCH v4 7/8] arm64/sve: Leave SVE enabled on syscall if we don't context switch
  2022-10-21 23:03 ` [PATCH v4 7/8] arm64/sve: Leave SVE enabled on syscall if we don't context switch Mark Brown
@ 2022-11-14 10:31   ` Catalin Marinas
  2022-11-14 11:25     ` Mark Brown
  0 siblings, 1 reply; 19+ messages in thread
From: Catalin Marinas @ 2022-11-14 10:31 UTC (permalink / raw)
  To: Mark Brown
  Cc: Will Deacon, Marc Zyngier, Zhang Lei, James Morse,
	Alexandru Elisei, Andre Przywara, kvmarm, linux-arm-kernel

On Sat, Oct 22, 2022 at 12:03:20AM +0100, Mark Brown wrote:
> The syscall ABI says that the SVE register state not shared with FPSIMD
> may not be preserved on syscall, and this is the only mechanism we have
> in the ABI to stop tracking the extra SVE state for a process. Currently
> we do this unconditionally by means of disabling SVE for the process on
> syscall, causing userspace to take a trap to EL1 if it uses SVE again.
> These extra traps result in a noticeable overhead for using SVE instead
> of FPSIMD in some workloads, especially for simple syscalls where we can
> return directly to userspace and would not otherwise need to update the
> floating point registers. Tests with fp-pidbench show an approximately
> 70% overhead on a range of implementations when SVE is in use - while
> this is an extreme and entirely artificial benchmark it is clear that
> there is some useful room for improvement here.

If SVE is no longer in use, does the explicit SVE regs flushing cause
any noticeable overhead? I guess even if there's a small overhead, it's
only temporary until a context switch clears TIF_SVE again.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>


* Re: [PATCH v4 8/8] arm64/fp: Use a struct to pass data to fpsimd_bind_state_to_cpu()
  2022-10-21 23:03 ` [PATCH v4 8/8] arm64/fp: Use a struct to pass data to fpsimd_bind_state_to_cpu() Mark Brown
@ 2022-11-14 10:33   ` Catalin Marinas
  0 siblings, 0 replies; 19+ messages in thread
From: Catalin Marinas @ 2022-11-14 10:33 UTC (permalink / raw)
  To: Mark Brown
  Cc: Will Deacon, Marc Zyngier, Zhang Lei, James Morse,
	Alexandru Elisei, Andre Przywara, kvmarm, linux-arm-kernel

On Sat, Oct 22, 2022 at 12:03:21AM +0100, Mark Brown wrote:
> For reasons that are unclear to this reader fpsimd_bind_state_to_cpu()
> populates the struct fpsimd_last_state_struct that it uses to store the
> active floating point state for KVM guests by passing an argument for
> each member of the structure. As the richness of the architecture increases
> this is resulting in a function with a rather large number of arguments
> which isn't ideal.
> 
> Simplify the interface by using the struct directly as the single argument
> for the function, renaming it as we lift the definition into the header.
> This could be built on further to reduce the work we do adding storage for
> new FP state in various places but for now it just simplifies this one
> interface.
> 
> Signed-off-by: Mark Brown <broonie@kernel.org>

I think this looks cleaner.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
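
For readers following along, the shape of the interface change described
in the commit message can be sketched in plain userspace C. This is only
an illustration under stated assumptions: struct fp_bind_state, its
members, last_state, bind_state_by_args() and bind_state() are
hypothetical stand-ins, not the kernel's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the record of the FP state bound to a CPU. */
struct fp_bind_state {
	void *fpsimd_regs;	/* FPSIMD register storage */
	void *sve_regs;		/* SVE register storage, may be NULL */
	unsigned int sve_vl;	/* SVE vector length in bytes */
	int type;		/* which register format is live */
};

/* Hypothetical stand-in for the per-CPU "last bound state". */
static struct fp_bind_state last_state;

/* Before: one parameter per member, so every new member grows the list. */
static void bind_state_by_args(void *fpsimd_regs, void *sve_regs,
			       unsigned int sve_vl, int type)
{
	last_state.fpsimd_regs = fpsimd_regs;
	last_state.sve_regs = sve_regs;
	last_state.sve_vl = sve_vl;
	last_state.type = type;
}

/* After: the caller fills in the struct and passes it as one argument. */
static void bind_state(const struct fp_bind_state *state)
{
	assert(state != NULL);
	last_state = *state;
}
```

With the second form, adding storage for a new kind of FP state means
adding a member and initialising it at the call sites, without touching
the function signature.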

* Re: [PATCH v4 2/8] arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE
  2022-11-13 22:12   ` Catalin Marinas
@ 2022-11-14 11:10     ` Mark Brown
  0 siblings, 0 replies; 19+ messages in thread
From: Mark Brown @ 2022-11-14 11:10 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, Marc Zyngier, Zhang Lei, James Morse,
	Alexandru Elisei, Andre Przywara, kvmarm, linux-arm-kernel



On Sun, Nov 13, 2022 at 10:12:12PM +0000, Catalin Marinas wrote:
> On Sat, Oct 22, 2022 at 12:03:15AM +0100, Mark Brown wrote:

> > +enum fp_type {
> > +	FP_STATE_FPSIMD,
> > +	FP_STATE_SVE,
> > +};

> More of a nitpick, I still find the type/state naming confusing but I
> don't have a better suggestion. Maybe call them FP_TYPE_* or rename the
> enum to fp_state_type (up to you really, feel free to leave them as they
> are).

Yeah, the naming is kind of awkward and I'm not a massive fan
either.  I think whatever name is chosen will end up being
awkward in some circumstance.  No problem with renaming if
someone has a firm thought that some specific thing would be
better though.

> Either way, I think having an explicit type of what state is about is a
> significant improvement in clarity.

> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

Thanks.
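
As a rough illustration of why an explicit type helps, the sketch below
records the format of the saved registers separately from the flag that
says whether userspace may use SVE. Only enum fp_type and its two values
come from the quoted patch; struct task_fp, its members, save_state() and
format_to_load() are hypothetical names for a userspace model, not the
kernel's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Taken from the quoted patch hunk. */
enum fp_type {
	FP_STATE_FPSIMD,
	FP_STATE_SVE,
};

/* Hypothetical per-task FP bookkeeping for this sketch only. */
struct task_fp {
	bool sve_enabled;		/* stand-in for TIF_SVE */
	enum fp_type saved_type;	/* format of the saved registers */
};

/* Record which format was written when the registers were saved. */
static void save_state(struct task_fp *t, enum fp_type to_save)
{
	t->saved_type = to_save;
}

/*
 * Choose the format to reload from what was actually saved, rather
 * than inferring it from whether userspace may currently use SVE.
 */
static enum fp_type format_to_load(const struct task_fp *t)
{
	return t->saved_type;
}
```

In this model the permission flag can change after a save without
altering how the saved data is interpreted on the next load.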



* Re: [PATCH v4 7/8] arm64/sve: Leave SVE enabled on syscall if we don't context switch
  2022-11-14 10:31   ` Catalin Marinas
@ 2022-11-14 11:25     ` Mark Brown
  0 siblings, 0 replies; 19+ messages in thread
From: Mark Brown @ 2022-11-14 11:25 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, Marc Zyngier, Zhang Lei, James Morse,
	Alexandru Elisei, Andre Przywara, kvmarm, linux-arm-kernel



On Mon, Nov 14, 2022 at 10:31:17AM +0000, Catalin Marinas wrote:
> On Sat, Oct 22, 2022 at 12:03:20AM +0100, Mark Brown wrote:

> > The syscall ABI says that the SVE register state not shared with FPSIMD
> > may not be preserved on syscall, and this is the only mechanism we have
> > in the ABI to stop tracking the extra SVE state for a process. Currently
> > we do this unconditionally by means of disabling SVE for the process on
> > syscall, causing userspace to take a trap to EL1 if it uses SVE again.
> > These extra traps result in a noticeable overhead for using SVE instead
> > of FPSIMD in some workloads, especially for simple syscalls where we can
> > return directly to userspace and would not otherwise need to update the
> > floating point registers. Tests with fp-pidbench show an approximately
> > 70% overhead on a range of implementations when SVE is in use - while
> > this is an extreme and entirely artificial benchmark it is clear that
> > there is some useful room for improvement here.

> If SVE is no longer in use, does the explicit SVE regs flushing cause
> any noticeable overhead? I guess even if there's a small overhead, it's
> only temporary until a context switch clears TIF_SVE again.

The overhead of the flushes is measurable; IIRC it was about 2-3%
on fp-pidbench for vector lengths of 256 bits and above, which have
additional overhead due to needing to flush the V/Z registers.
OTOH that's an entirely artificial benchmark and, as you say, as
soon as the task gets context switched it'll stop incurring that
overhead.  Given that the improvement we get in the case where
the task is continuing to use SVE is more than an order of
magnitude greater, it seems like a sensible tradeoff.

> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

Thanks.
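
The tradeoff discussed above can be modelled with a toy userspace
simulation that counts events rather than measuring time. It assumes a
task that alternates one SVE use with one syscall; struct sim, the policy
names and every function here are hypothetical, and the model says
nothing about the relative cost of a trap versus a flush.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counters for one simulated task. */
struct sim {
	bool sve_enabled;	/* stand-in for TIF_SVE */
	unsigned long traps;	/* SVE access traps taken */
	unsigned long flushes;	/* explicit flushes of non-shared SVE state */
};

enum policy {
	DISABLE_ON_SYSCALL,	/* behaviour before the patch */
	FLUSH_ON_SYSCALL,	/* behaviour the patch describes */
};

/* Using SVE while it is disabled costs one trap, which re-enables it. */
static void use_sve(struct sim *s)
{
	if (!s->sve_enabled) {
		s->traps++;
		s->sve_enabled = true;
	}
}

/* On syscall either drop SVE access or flush and keep it enabled. */
static void do_syscall(struct sim *s, enum policy p)
{
	if (!s->sve_enabled)
		return;
	if (p == DISABLE_ON_SYSCALL)
		s->sve_enabled = false;
	else
		s->flushes++;
}

/* A context switch stops tracking SVE state under either policy. */
static void context_switch(struct sim *s)
{
	s->sve_enabled = false;
}

/* Alternate SVE use and a simple syscall with no context switches. */
static struct sim run_loop(enum policy p, unsigned long iterations)
{
	struct sim s = { false, 0, 0 };
	unsigned long i;

	for (i = 0; i < iterations; i++) {
		use_sve(&s);
		do_syscall(&s, p);
	}
	return s;
}
```

In this model a loop of N iterations takes N traps under the old policy
and a single trap plus N flushes under the new one. A task that stops
using SVE keeps paying for flushes only until context_switch() runs.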

Thread overview: 19+ messages
2022-10-21 23:03 [PATCH v4 0/8] arm64/sve: Clean up KVM integration and optimise syscalls Mark Brown
2022-10-21 23:03 ` [PATCH v4 1/8] KVM: arm64: Discard any SVE state when entering KVM guests Mark Brown
2022-11-13 22:00   ` Catalin Marinas
2022-10-21 23:03 ` [PATCH v4 2/8] arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE Mark Brown
2022-11-13 22:12   ` Catalin Marinas
2022-11-14 11:10     ` Mark Brown
2022-10-21 23:03 ` [PATCH v4 3/8] arm64/fpsimd: Have KVM explicitly say which FP registers to save Mark Brown
2022-11-13 22:27   ` Catalin Marinas
2022-10-21 23:03 ` [PATCH v4 4/8] arm64/fpsimd: Stop using TIF_SVE to manage register saving in KVM Mark Brown
2022-11-13 22:30   ` Catalin Marinas
2022-10-21 23:03 ` [PATCH v4 5/8] arm64/fpsimd: Load FP state based on recorded data type Mark Brown
2022-11-14  9:24   ` Catalin Marinas
2022-10-21 23:03 ` [PATCH v4 6/8] arm64/fpsimd: SME no longer requires SVE register state Mark Brown
2022-11-14 10:18   ` Catalin Marinas
2022-10-21 23:03 ` [PATCH v4 7/8] arm64/sve: Leave SVE enabled on syscall if we don't context switch Mark Brown
2022-11-14 10:31   ` Catalin Marinas
2022-11-14 11:25     ` Mark Brown
2022-10-21 23:03 ` [PATCH v4 8/8] arm64/fp: Use a struct to pass data to fpsimd_bind_state_to_cpu() Mark Brown
2022-11-14 10:33   ` Catalin Marinas
