* [PATCH v3 1/3] arm64/sve: Split _sve_flush macro into separate Z and predicate flushes
From: Mark Brown @ 2021-05-12 15:11 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon; +Cc: Dave Martin, linux-arm-kernel, Mark Brown
Trivial refactoring to support further work; no change to generated code.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
---
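As an illustration only, and not part of the patch, the split can be modelled in C. The sketch below assumes a software copy of the register file; `struct sve_model`, `model_flush_z()` and `model_flush_p_ffr()` are hypothetical names that do not exist in the kernel. It shows which state each of the two new macros is responsible for: the Z flush keeps the low 128 bits of every Z register (the part shared with the V registers) and clears the rest, while the predicate flush clears P0-P15 and FFR entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_VL_BYTES 256u	/* largest architected SVE vector: 2048 bits */
#define NUM_Z        32		/* Z0..Z31 */
#define NUM_P        16		/* P0..P15 */
#define V_BYTES      16u	/* low 128 bits of each Z register alias V */

/* Hypothetical software model of the SVE register file. */
struct sve_model {
	size_t vl_bytes;				/* current vector length */
	unsigned char z[NUM_Z][MAX_VL_BYTES];
	unsigned char p[NUM_P][MAX_VL_BYTES / 8];	/* one bit per vector byte */
	unsigned char ffr[MAX_VL_BYTES / 8];		/* same size as a predicate */
};

/* Models sve_flush_z: clear every Z register above its low 128 bits. */
static void model_flush_z(struct sve_model *m)
{
	assert(m->vl_bytes >= V_BYTES && m->vl_bytes <= MAX_VL_BYTES);
	for (int n = 0; n < NUM_Z; n++)
		memset(&m->z[n][V_BYTES], 0, m->vl_bytes - V_BYTES);
}

/* Models sve_flush_p_ffr: clear all predicate registers and FFR. */
static void model_flush_p_ffr(struct sve_model *m)
{
	size_t p_bytes = m->vl_bytes / 8;

	for (int n = 0; n < NUM_P; n++)
		memset(m->p[n], 0, p_bytes);
	memset(m->ffr, 0, p_bytes);
}
```

Calling `model_flush_z()` followed by `model_flush_p_ffr()` corresponds to what the old single macro did in one step. With a 16 byte vector length the Z loop clears zero bytes per register.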
arch/arm64/include/asm/fpsimdmacros.h | 4 +++-
arch/arm64/kernel/entry-fpsimd.S | 3 ++-
2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index a2563992d2dc..059204477ce6 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -213,8 +213,10 @@
mov v\nz\().16b, v\nz\().16b
.endm
-.macro sve_flush
+.macro sve_flush_z
_for n, 0, 31, _sve_flush_z \n
+.endm
+.macro sve_flush_p_ffr
_for n, 0, 15, _sve_pfalse \n
_sve_wrffr 0
.endm
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 3ecec60d3295..7921d58427c2 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -72,7 +72,8 @@ SYM_FUNC_END(sve_load_from_fpsimd_state)
/* Zero all SVE registers but the first 128-bits of each vector */
SYM_FUNC_START(sve_flush_live)
- sve_flush
+ sve_flush_z
+ sve_flush_p_ffr
ret
SYM_FUNC_END(sve_flush_live)
--
2.20.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* [PATCH v3 2/3] arm64/sve: Use the sve_flush macros in sve_load_from_fpsimd_state()
From: Mark Brown @ 2021-05-12 15:11 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon; +Cc: Dave Martin, linux-arm-kernel, Mark Brown
This makes the code a bit clearer and, as a result, lets us bring the
indentation in line with the rest of the file. There is no change to the
generated code.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
---
arch/arm64/kernel/entry-fpsimd.S | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 7921d58427c2..dd8382e5ce82 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -63,11 +63,10 @@ SYM_FUNC_END(sve_set_vq)
* and the rest zeroed. All the other SVE registers will be zeroed.
*/
SYM_FUNC_START(sve_load_from_fpsimd_state)
- sve_load_vq x1, x2, x3
- fpsimd_restore x0, 8
- _for n, 0, 15, _sve_pfalse \n
- _sve_wrffr 0
- ret
+ sve_load_vq x1, x2, x3
+ fpsimd_restore x0, 8
+ sve_flush_p_ffr
+ ret
SYM_FUNC_END(sve_load_from_fpsimd_state)
/* Zero all SVE registers but the first 128-bits of each vector */
--
2.20.1
* [PATCH v3 3/3] arm64/sve: Skip flushing Z registers with 128 bit vectors
From: Mark Brown @ 2021-05-12 15:11 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon; +Cc: Dave Martin, linux-arm-kernel, Mark Brown
When the SVE vector length is 128 bits, every bit of the Z registers is
shared with the V registers, so we can skip the Z registers when zeroing
state that is not shared with FPSIMD. This results in a minor performance
improvement.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
---
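As an illustration only, and not part of the patch, the arithmetic behind the new `cbz x0, 1f` check can be sketched in C. The helper names below are hypothetical and are not kernel functions. The sketch assumes the vector length is given in bytes and is a multiple of 16, so that VQ counts 128-bit quadwords.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bytes in one 128-bit quadword, the unit in which VQ is counted. */
#define QUADWORD_BYTES 16u

/* Hypothetical helper: vector length in bytes -> number of quadwords (VQ). */
static unsigned long vq_from_vl_bytes(unsigned long vl_bytes)
{
	assert(vl_bytes >= QUADWORD_BYTES && vl_bytes % QUADWORD_BYTES == 0);
	return vl_bytes / QUADWORD_BYTES;
}

/* Bytes of each Z register that lie above the low 128 bits shared with V. */
static size_t z_bytes_not_shared_with_v(unsigned long vq_minus_1)
{
	return (size_t)vq_minus_1 * QUADWORD_BYTES;
}

/*
 * Mirrors the cbz test in sve_flush_live: a VQ - 1 of zero means 128 bit
 * vectors, so there is no Z state beyond what FPSIMD already covers.
 */
static bool z_flush_needed(unsigned long vq_minus_1)
{
	return vq_minus_1 != 0;
}
```

For a 16 byte vector length, `vq_from_vl_bytes(16) - 1` is 0, so `z_flush_needed()` returns false and only the predicate and FFR flush remains. For any longer vector the Z flush still runs.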
arch/arm64/include/asm/fpsimd.h | 2 +-
arch/arm64/kernel/entry-fpsimd.S | 12 ++++++++++--
arch/arm64/kernel/fpsimd.c | 6 ++++--
3 files changed, 15 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 2599504674b5..c072161d5c65 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -69,7 +69,7 @@ static inline void *sve_pffr(struct thread_struct *thread)
extern void sve_save_state(void *state, u32 *pfpsr);
extern void sve_load_state(void const *state, u32 const *pfpsr,
unsigned long vq_minus_1);
-extern void sve_flush_live(void);
+extern void sve_flush_live(unsigned long vq_minus_1);
extern void sve_load_from_fpsimd_state(struct user_fpsimd_state const *state,
unsigned long vq_minus_1);
extern unsigned int sve_get_vl(void);
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index dd8382e5ce82..0a7a64753878 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -69,10 +69,18 @@ SYM_FUNC_START(sve_load_from_fpsimd_state)
ret
SYM_FUNC_END(sve_load_from_fpsimd_state)
-/* Zero all SVE registers but the first 128-bits of each vector */
+/*
+ * Zero all SVE registers but the first 128-bits of each vector
+ *
+ * VQ must already be configured by caller, any further updates of VQ
+ * will need to ensure that the register state remains valid.
+ *
+ * x0 = VQ - 1
+ */
SYM_FUNC_START(sve_flush_live)
+ cbz x0, 1f // A VQ-1 of 0 is 128 bits so no extra Z state
sve_flush_z
- sve_flush_p_ffr
+1: sve_flush_p_ffr
ret
SYM_FUNC_END(sve_flush_live)
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index ad3dd34a83cf..e57b23f95284 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -957,8 +957,10 @@ void do_sve_acc(unsigned int esr, struct pt_regs *regs)
* disabling the trap, otherwise update our in-memory copy.
*/
if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
- sve_set_vq(sve_vq_from_vl(current->thread.sve_vl) - 1);
- sve_flush_live();
+ unsigned long vq_minus_one =
+ sve_vq_from_vl(current->thread.sve_vl) - 1;
+ sve_set_vq(vq_minus_one);
+ sve_flush_live(vq_minus_one);
fpsimd_bind_task_to_cpu();
} else {
fpsimd_to_sve(current);
--
2.20.1
* Re: [PATCH v3 0/3] arm64/sve: Trivial optimisation for 128 bit SVE vectors
From: Catalin Marinas @ 2021-05-14 11:03 UTC (permalink / raw)
To: Mark Brown; +Cc: Will Deacon, Dave Martin, linux-arm-kernel
On Wed, May 12, 2021 at 04:11:28PM +0100, Mark Brown wrote:
> This series is a combination of factoring out some duplicated code and a
> very minor optimisation to the performance of handling converting FPSIMD
> state to SVE in the live registers for 128 bit SVE vectors.
>
> v3:
> - Tweak comment.
> v2:
> - Combine P and FFR flushing into a single macro.
>
> Mark Brown (3):
> arm64/sve: Split _sve_flush macro into separate Z and predicate
> flushes
> arm64/sve: Use the sve_flush macros in sve_load_from_fpsimd_state()
> arm64/sve: Skip flushing Z registers with 128 bit vectors
I acked v2, hadn't noticed v3 was out. So here it is again:
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
* Re: [PATCH v3 0/3] arm64/sve: Trivial optimisation for 128 bit SVE vectors
From: Will Deacon @ 2021-05-26 22:15 UTC (permalink / raw)
To: Mark Brown, Catalin Marinas
Cc: kernel-team, Will Deacon, linux-arm-kernel, Dave Martin
On Wed, 12 May 2021 16:11:28 +0100, Mark Brown wrote:
> This series is a combination of factoring out some duplicated code and a
> very minor optimisation to the performance of handling converting FPSIMD
> state to SVE in the live registers for 128 bit SVE vectors.
>
> v3:
> - Tweak comment.
> v2:
> - Combine P and FFR flushing into a single macro.
>
> [...]
Applied to arm64 (for-next/sve), thanks!
[1/3] arm64/sve: Split _sve_flush macro into separate Z and predicate flushes
https://git.kernel.org/arm64/c/483dbf6a3590
[2/3] arm64/sve: Use the sve_flush macros in sve_load_from_fpsimd_state()
https://git.kernel.org/arm64/c/c9f6890bca11
[3/3] arm64/sve: Skip flushing Z registers with 128 bit vectors
https://git.kernel.org/arm64/c/ad4711f962e0
Cheers,
--
Will
https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev