All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] arm64/sve: Rework SVE access trap to convert state in registers
@ 2021-03-12 19:03 Mark Brown
  2021-04-07 11:45 ` Catalin Marinas
  2021-04-08 18:00 ` Catalin Marinas
  0 siblings, 2 replies; 6+ messages in thread
From: Mark Brown @ 2021-03-12 19:03 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Julien Grall, Zhang Lei, Dave Martin, Daniel Kiss,
	linux-arm-kernel, Mark Brown

When we enable SVE usage in userspace after taking a SVE access trap we
need to ensure that the portions of the register state that are not
shared with the FPSIMD registers are zeroed. Currently we do this by
forcing the FPSIMD registers to be saved to the task struct and converting
them there. This is wasteful in the common case where the task state is
loaded into the registers and we will immediately return to userspace
since we can initialise the SVE state directly in registers instead of
accessing multiple copies of the register state in memory.

Instead in that common case do the conversion in the registers and
update the task metadata so that we can return to userspace without
spilling the register state to memory unless there is some other reason
to do so.

Signed-off-by: Mark Brown <broonie@kernel.org>
---

This is a greatly cut down approach to what the changes in "arm64/sve:
Improve performance when handling SVE access traps" [1] are aiming for,
it should show a similar improvement in almost all cases to that series
but is a very much less invasive change.

[1] https://lore.kernel.org/linux-arm-kernel/20210303201117.24777-1-broonie@kernel.org/

 arch/arm64/include/asm/fpsimd.h  |  1 +
 arch/arm64/kernel/entry-fpsimd.S |  5 +++++
 arch/arm64/kernel/fpsimd.c       | 26 +++++++++++++++++---------
 3 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index bec5f14b622a..ebb263b2d3b1 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -73,6 +73,7 @@ extern void sve_flush_live(void);
 extern void sve_load_from_fpsimd_state(struct user_fpsimd_state const *state,
 				       unsigned long vq_minus_1);
 extern unsigned int sve_get_vl(void);
+extern void sve_set_vq(unsigned long vq_minus_1);
 
 struct arm64_cpu_capabilities;
 extern void sve_kernel_enable(const struct arm64_cpu_capabilities *__unused);
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 2ca395c25448..3ecec60d3295 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -48,6 +48,11 @@ SYM_FUNC_START(sve_get_vl)
 	ret
 SYM_FUNC_END(sve_get_vl)
 
+SYM_FUNC_START(sve_set_vq)
+	sve_load_vq x0, x1, x2
+	ret
+SYM_FUNC_END(sve_set_vq)
+
 /*
  * Load SVE state from FPSIMD state.
  *
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 062b21f30f94..0f58e45bd3d1 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -926,9 +926,8 @@ void fpsimd_release_task(struct task_struct *dead_task)
  * Trapped SVE access
  *
  * Storage is allocated for the full SVE state, the current FPSIMD
- * register contents are migrated across, and TIF_SVE is set so that
- * the SVE access trap will be disabled the next time this task
- * reaches ret_to_user.
+ * register contents are migrated across, and the access trap is
+ * disabled.
  *
  * TIF_SVE should be clear on entry: otherwise, fpsimd_restore_current_state()
  * would have disabled the SVE access trap for userspace during
@@ -946,15 +945,24 @@ void do_sve_acc(unsigned int esr, struct pt_regs *regs)
 
 	get_cpu_fpsimd_context();
 
-	fpsimd_save();
-
-	/* Force ret_to_user to reload the registers: */
-	fpsimd_flush_task_state(current);
-
-	fpsimd_to_sve(current);
 	if (test_and_set_thread_flag(TIF_SVE))
 		WARN_ON(1); /* SVE access shouldn't have trapped */
 
+	/*
+	 * Convert the FPSIMD state to SVE, zeroing all the state that
+	 * is not shared with FPSIMD. If (as is likely) the current
+	 * state is live in the registers then do this there and
+	 * update our metadata for the current task including
+	 * disabling the trap, otherwise update our in-memory copy.
+	 */
+	if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
+		sve_set_vq(sve_vq_from_vl(current->thread.sve_vl) - 1);
+		sve_flush_live();
+		fpsimd_bind_task_to_cpu();
+	} else {
+		fpsimd_to_sve(current);
+	}
+
 	put_cpu_fpsimd_context();
 }
 
-- 
2.20.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] arm64/sve: Rework SVE access trap to convert state in registers
  2021-03-12 19:03 [PATCH] arm64/sve: Rework SVE access trap to convert state in registers Mark Brown
@ 2021-04-07 11:45 ` Catalin Marinas
  2021-04-07 12:48   ` Mark Brown
  2021-04-08 18:00 ` Catalin Marinas
  1 sibling, 1 reply; 6+ messages in thread
From: Catalin Marinas @ 2021-04-07 11:45 UTC (permalink / raw)
  To: Mark Brown
  Cc: Will Deacon, Julien Grall, Zhang Lei, Dave Martin, Daniel Kiss,
	linux-arm-kernel

On Fri, Mar 12, 2021 at 07:03:13PM +0000, Mark Brown wrote:
> diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
> index bec5f14b622a..ebb263b2d3b1 100644
> --- a/arch/arm64/include/asm/fpsimd.h
> +++ b/arch/arm64/include/asm/fpsimd.h
> @@ -73,6 +73,7 @@ extern void sve_flush_live(void);
>  extern void sve_load_from_fpsimd_state(struct user_fpsimd_state const *state,
>  				       unsigned long vq_minus_1);
>  extern unsigned int sve_get_vl(void);
> +extern void sve_set_vq(unsigned long vq_minus_1);
>  
>  struct arm64_cpu_capabilities;
>  extern void sve_kernel_enable(const struct arm64_cpu_capabilities *__unused);
> diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
> index 2ca395c25448..3ecec60d3295 100644
> --- a/arch/arm64/kernel/entry-fpsimd.S
> +++ b/arch/arm64/kernel/entry-fpsimd.S
> @@ -48,6 +48,11 @@ SYM_FUNC_START(sve_get_vl)
>  	ret
>  SYM_FUNC_END(sve_get_vl)
>  
> +SYM_FUNC_START(sve_set_vq)
> +	sve_load_vq x0, x1, x2
> +	ret
> +SYM_FUNC_END(sve_set_vq)
> +
>  /*
>   * Load SVE state from FPSIMD state.
>   *
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 062b21f30f94..0f58e45bd3d1 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -926,9 +926,8 @@ void fpsimd_release_task(struct task_struct *dead_task)
>   * Trapped SVE access
>   *
>   * Storage is allocated for the full SVE state, the current FPSIMD
> - * register contents are migrated across, and TIF_SVE is set so that
> - * the SVE access trap will be disabled the next time this task
> - * reaches ret_to_user.
> + * register contents are migrated across, and the access trap is
> + * disabled.
>   *
>   * TIF_SVE should be clear on entry: otherwise, fpsimd_restore_current_state()
>   * would have disabled the SVE access trap for userspace during
> @@ -946,15 +945,24 @@ void do_sve_acc(unsigned int esr, struct pt_regs *regs)
>  
>  	get_cpu_fpsimd_context();
>  
> -	fpsimd_save();
> -
> -	/* Force ret_to_user to reload the registers: */
> -	fpsimd_flush_task_state(current);
> -
> -	fpsimd_to_sve(current);
>  	if (test_and_set_thread_flag(TIF_SVE))
>  		WARN_ON(1); /* SVE access shouldn't have trapped */
>  
> +	/*
> +	 * Convert the FPSIMD state to SVE, zeroing all the state that
> +	 * is not shared with FPSIMD. If (as is likely) the current
> +	 * state is live in the registers then do this there and
> +	 * update our metadata for the current task including
> +	 * disabling the trap, otherwise update our in-memory copy.
> +	 */
> +	if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
> +		sve_set_vq(sve_vq_from_vl(current->thread.sve_vl) - 1);
> +		sve_flush_live();
> +		fpsimd_bind_task_to_cpu();
> +	} else {
> +		fpsimd_to_sve(current);
> +	}
> +
>  	put_cpu_fpsimd_context();
>  }

Just to make sure I understand: in the current code, if
TIF_FOREIGN_FPSTATE is cleared and we trapped (TIF_SVE is also cleared),
the FPSIMD state is transferred to the SVE one in memory and both
TIF_SVE and TIF_FOREIGN_FPSTATE get set. The (potentially stale) SVE
state is subsequently restored via do_notify_resume(). I couldn't find
where we actually zero the in-memory SVE state or save the latest one
via do_el0_svc(). So it looks like we may restore some old SVE state
after a syscall (maybe I'm missing something but it would be nice to
follow zero or preserved approach).

With your patch in the above scenario, we no longer restore the SVE
state from memory. We just flush the live SVE registers and disable
trapping. The assumption here is that (1) when TIF_SVE is cleared, we no
longer care about any of the SVE state (memory or regs) and (2) we only
trap if TIF_SVE was cleared (i.e. we don't have a different scenario of
lazy restoring where TIF_SVE but we still trap). That's fine by me and
matches the code paths but it wasn't entirely clear in the TIF_SVE
description higher up in this file.

The other case is TIF_FOREIGN_FPSTATE being set in do_sve_acc(). Since
we never return to user with this flag set, when can we actually hit the
'else' path in your patch?

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] arm64/sve: Rework SVE access trap to convert state in registers
  2021-04-07 11:45 ` Catalin Marinas
@ 2021-04-07 12:48   ` Mark Brown
  2021-04-07 15:55     ` Mark Brown
  2021-04-08 16:43     ` Catalin Marinas
  0 siblings, 2 replies; 6+ messages in thread
From: Mark Brown @ 2021-04-07 12:48 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, Julien Grall, Zhang Lei, Dave Martin, Daniel Kiss,
	linux-arm-kernel


[-- Attachment #1.1: Type: text/plain, Size: 2932 bytes --]

On Wed, Apr 07, 2021 at 12:45:12PM +0100, Catalin Marinas wrote:
> On Fri, Mar 12, 2021 at 07:03:13PM +0000, Mark Brown wrote:

> > -	/* Force ret_to_user to reload the registers: */
> > -	fpsimd_flush_task_state(current);

> Just to make sure I understand: in the current code, if
> TIF_FOREIGN_FPSTATE is cleared and we trapped (TIF_SVE is also cleared),
> the FPSIMD state is transferred to the SVE one in memory and both
> TIF_SVE and TIF_FOREIGN_FPSTATE get set. The (potentially stale) SVE

Right, TIF_FOREIGN_FPSTATE is unconditionally set in the current code.

> state is subsequently restored via do_notify_resume(). I couldn't find
> where we actually zero the in-memory SVE state or save the latest one
> via do_el0_svc(). So it looks like we may restore some old SVE state
> after a syscall (maybe I'm missing something but it would be nice to
> follow zero or preserved approach).

The state is currently converted via the fpsimd_to_sve() call in the
trap handler, that *should* be dealing with anything required for
conversion.  You're right that this seems to miss zeroing anything
outside of the values it immediately writes - I might be missing
something though.  It's only going to leak information from the current
task and it is within the documented behaviour so it's not a bug but as
such but it does seem cleaner, ought to have a memset() in there
somewhere.

> With your patch in the above scenario, we no longer restore the SVE
> state from memory. We just flush the live SVE registers and disable
> trapping. The assumption here is that (1) when TIF_SVE is cleared, we no
> longer care about any of the SVE state (memory or regs) and (2) we only
> trap if TIF_SVE was cleared (i.e. we don't have a different scenario of
> lazy restoring where TIF_SVE but we still trap). That's fine by me and

Indeed.

> matches the code paths but it wasn't entirely clear in the TIF_SVE
> description higher up in this file.

I'm not seeing what's missing from the discussion there?  It explicitly
says that if TIF_SVE is set we don't trap and if it's clear then we do
trap and which registers are valid in each state which I think makes the
behaviour here clear but perhaps I've been looking at this too much, or
you were looking for this in a slightly different place.

> The other case is TIF_FOREIGN_FPSTATE being set in do_sve_acc(). Since
> we never return to user with this flag set, when can we actually hit the
> 'else' path in your patch?

We'd have to have been preempted between entering the kernel and
actually handling the access trap, I can't think of a scenario where
that would happen but it seemed easier to have the code to handle it
than to make absolutely sure that there was no possible case that I was
missing.  I think we should have the detection code either way in case
it does end up happening somehow but it could be changed to use
WARN_ON_ONCE() or something instead of silently handling it?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 176 bytes --]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] arm64/sve: Rework SVE access trap to convert state in registers
  2021-04-07 12:48   ` Mark Brown
@ 2021-04-07 15:55     ` Mark Brown
  2021-04-08 16:43     ` Catalin Marinas
  1 sibling, 0 replies; 6+ messages in thread
From: Mark Brown @ 2021-04-07 15:55 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, Julien Grall, Zhang Lei, Dave Martin, Daniel Kiss,
	linux-arm-kernel


[-- Attachment #1.1: Type: text/plain, Size: 849 bytes --]

On Wed, Apr 07, 2021 at 01:48:06PM +0100, Mark Brown wrote:
> On Wed, Apr 07, 2021 at 12:45:12PM +0100, Catalin Marinas wrote:

> > via do_el0_svc(). So it looks like we may restore some old SVE state
> > after a syscall (maybe I'm missing something but it would be nice to
> > follow zero or preserved approach).
> 
> The state is currently converted via the fpsimd_to_sve() call in the
> trap handler, that *should* be dealing with anything required for
> conversion.  You're right that this seems to miss zeroing anything
> outside of the values it immediately writes - I might be missing
> something though.  It's only going to leak information from the current

Actually I found that we do have the zeroing - it's done in sve_alloc()
where we either kzalloc() a new state or memset() an existing state so
that's covered already

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 176 bytes --]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] arm64/sve: Rework SVE access trap to convert state in registers
  2021-04-07 12:48   ` Mark Brown
  2021-04-07 15:55     ` Mark Brown
@ 2021-04-08 16:43     ` Catalin Marinas
  1 sibling, 0 replies; 6+ messages in thread
From: Catalin Marinas @ 2021-04-08 16:43 UTC (permalink / raw)
  To: Mark Brown
  Cc: Will Deacon, Julien Grall, Zhang Lei, Dave Martin, Daniel Kiss,
	linux-arm-kernel

On Wed, Apr 07, 2021 at 01:48:06PM +0100, Mark Brown wrote:
> On Wed, Apr 07, 2021 at 12:45:12PM +0100, Catalin Marinas wrote:
> > The other case is TIF_FOREIGN_FPSTATE being set in do_sve_acc(). Since
> > we never return to user with this flag set, when can we actually hit the
> > 'else' path in your patch?
> 
> We'd have to have been preempted between entering the kernel and
> actually handling the access trap, I can't think of a scenario where
> that would happen but it seemed easier to have the code to handle it
> than to make absolutely sure that there was no possible case that I was
> missing.  I think we should have the detection code either way in case
> it does end up happening somehow but it could be changed to use
> WARN_ON_ONCE() or something instead of silently handling it?

If we get preempted before do_sve_acc() (and that's possible as the
interrupts are enabled), if the interrupted thread wakes up on another
CPU it will have TIF_FOREIGN_FPSTATE set so that it retrieves the new
state.

So the patch looks fine to me.

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] arm64/sve: Rework SVE access trap to convert state in registers
  2021-03-12 19:03 [PATCH] arm64/sve: Rework SVE access trap to convert state in registers Mark Brown
  2021-04-07 11:45 ` Catalin Marinas
@ 2021-04-08 18:00 ` Catalin Marinas
  1 sibling, 0 replies; 6+ messages in thread
From: Catalin Marinas @ 2021-04-08 18:00 UTC (permalink / raw)
  To: Will Deacon, Mark Brown
  Cc: linux-arm-kernel, Daniel Kiss, Dave Martin, Zhang Lei, Julien Grall

On Fri, 12 Mar 2021 19:03:13 +0000, Mark Brown wrote:
> When we enable SVE usage in userspace after taking a SVE access trap we
> need to ensure that the portions of the register state that are not
> shared with the FPSIMD registers are zeroed. Currently we do this by
> forcing the FPSIMD registers to be saved to the task struct and converting
> them there. This is wasteful in the common case where the task state is
> loaded into the registers and we will immediately return to userspace
> since we can initialise the SVE state directly in registers instead of
> accessing multiple copies of the register state in memory.
> 
> [...]

Applied to arm64 (for-next/misc), thanks!

[1/1] arm64/sve: Rework SVE access trap to convert state in registers
      https://git.kernel.org/arm64/c/cccb78ce89c4

-- 
Catalin


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-04-08 18:01 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-12 19:03 [PATCH] arm64/sve: Rework SVE access trap to convert state in registers Mark Brown
2021-04-07 11:45 ` Catalin Marinas
2021-04-07 12:48   ` Mark Brown
2021-04-07 15:55     ` Mark Brown
2021-04-08 16:43     ` Catalin Marinas
2021-04-08 18:00 ` Catalin Marinas

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.