From: Mark Brown <broonie@kernel.org>
To: Catalin Marinas <catalin.marinas@arm.com>, Will Deacon <will@kernel.org>
Cc: Julien Grall <julien@xen.com>, Julien Grall <julien@xen.org>,
	Zhang Lei <zhang.lei@jp.fujitsu.com>,
	Mark Brown <broonie@kernel.org>,
	Dave Martin <Dave.Martin@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	Daniel Kiss <Daniel.Kiss@arm.com>
Subject: [PATCH v7 2/2] arm64/sve: Rework SVE trap access to minimise memory access
Date: Mon,  1 Feb 2021 12:29:01 +0000
Message-ID: <20210201122901.11331-3-broonie@kernel.org>
In-Reply-To: <20210201122901.11331-1-broonie@kernel.org>

When we take an SVE access trap, only the subset of the SVE Z0-Z31
registers shared with the FPSIMD V0-V31 registers is valid; the rest
of the bits in the SVE registers must be cleared before returning to
userspace.  Currently we do this by saving the current FPSIMD register
state to the task struct and then using that to initialize the copy of
the SVE registers in the task struct so they can be loaded from there
back into the registers.  This requires far more memory access than is
needed.
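
For reference, the current trap path (the lines removed in the diff
below) does roughly the following; this is a simplified sketch of the
existing flow rather than the exact kernel code:

	/* Simplified sketch of the current do_sve_acc() behaviour. */
	get_cpu_fpsimd_context();

	fpsimd_save();				/* write the live FPSIMD state to the task struct */
	fpsimd_flush_task_state(current);	/* force ret_to_user to reload from memory */
	fpsimd_to_sve(current);			/* expand the saved FPSIMD state into the SVE buffer */

	set_thread_flag(TIF_SVE_EXEC);		/* SVE traps are disabled at the next ret_to_user */
	set_thread_flag(TIF_SVE_FULL_REGS);

	put_cpu_fpsimd_context();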

The newly added TIF_SVE_FULL_REGS can be used to reduce this overhead:
instead of doing the conversion immediately we can set only TIF_SVE_EXEC
and not TIF_SVE_FULL_REGS.  This means that until we return to userspace
we only need to store the FPSIMD registers, and if (as should be the
common case) the hardware still holds the task state and does not need
it to be reloaded from the task struct we can do the initialization of
the SVE state entirely in registers.  In the event that we do need to
reload the registers from the task struct, only the FPSIMD subset needs
to be loaded from memory.
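
A minimal sketch of the intended ret_to_user handling when only
TIF_SVE_EXEC is set, assuming the fpsimd_restore_current_state()
changes from patch 1/2; the flush helper named here is purely
illustrative, not a real kernel function:

	/* Illustrative pseudo-code only, not the actual patch 1/2 implementation. */
	if (test_thread_flag(TIF_SVE_EXEC) &&
	    !test_thread_flag(TIF_SVE_FULL_REGS)) {
		if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
			/*
			 * Common case: the state is still live in the
			 * registers, so just zero the non-shared SVE bits
			 * in place.
			 */
			flush_sve_high_bits_in_registers();	/* hypothetical helper */
		} else {
			/*
			 * Otherwise load only the FPSIMD subset from memory;
			 * the load zeroes the remaining SVE bits.
			 */
			sve_load_from_fpsimd_state(&current->thread.uw.fpsimd_state,
						   sve_vq_from_vl(current->thread.sve_vl) - 1);
		}
		set_thread_flag(TIF_SVE_FULL_REGS);
	}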

If the FPSIMD state is loaded then we need to set the vector length.
This is because the vector length is only set when loading from memory,
but the expectation is that the vector length is set whenever
TIF_SVE_EXEC is set.  We also need to rebind the task to the CPU so the
newly allocated SVE state is used when the task is saved.
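
As a worked example of the vq_minus_1 argument, assuming the usual SVE
definition that a vector quadword is 16 bytes:

	/*
	 * Example only: for a 32-byte (256-bit) vector length,
	 * sve_vq_from_vl(32) == 32 / 16 == 2 quadwords, so the
	 * ZCR_EL1.LEN value passed to sve_set_vq() is 2 - 1 == 1.
	 */
	sve_set_vq(sve_vq_from_vl(current->thread.sve_vl) - 1);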

This is based on earlier work by Julien Grall implementing a similar idea.

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/fpsimd.h  |  2 ++
 arch/arm64/kernel/entry-fpsimd.S |  5 +++++
 arch/arm64/kernel/fpsimd.c       | 35 +++++++++++++++++++++-----------
 3 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index bec5f14b622a..e60aa4ebb351 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -74,6 +74,8 @@ extern void sve_load_from_fpsimd_state(struct user_fpsimd_state const *state,
 				       unsigned long vq_minus_1);
 extern unsigned int sve_get_vl(void);
 
+extern void sve_set_vq(unsigned long vq_minus_1);
+
 struct arm64_cpu_capabilities;
 extern void sve_kernel_enable(const struct arm64_cpu_capabilities *__unused);
 
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 2ca395c25448..3ecec60d3295 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -48,6 +48,11 @@ SYM_FUNC_START(sve_get_vl)
 	ret
 SYM_FUNC_END(sve_get_vl)
 
+SYM_FUNC_START(sve_set_vq)
+	sve_load_vq x0, x1, x2
+	ret
+SYM_FUNC_END(sve_set_vq)
+
 /*
  * Load SVE state from FPSIMD state.
  *
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 58c749ef04c4..05caf207e2ce 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -994,10 +994,10 @@ void fpsimd_release_task(struct task_struct *dead_task)
 /*
  * Trapped SVE access
  *
- * Storage is allocated for the full SVE state, the current FPSIMD
- * register contents are migrated across, and TIF_SVE_EXEC is set so that
- * the SVE access trap will be disabled the next time this task
- * reaches ret_to_user.
+ * Storage is allocated for the full SVE state so that the code
+ * running subsequently has somewhere to save the SVE registers to. We
+ * then rely on ret_to_user to actually convert the FPSIMD registers
+ * to SVE state by flushing as required.
  *
  * TIF_SVE_EXEC should be clear on entry: otherwise,
  * fpsimd_restore_current_state() would have disabled the SVE access
@@ -1016,15 +1016,26 @@ void do_sve_acc(unsigned int esr, struct pt_regs *regs)
 
 	get_cpu_fpsimd_context();
 
-	fpsimd_save();
-
-	/* Force ret_to_user to reload the registers: */
-	fpsimd_flush_task_state(current);
-
-	fpsimd_to_sve(current);
+	/*
+	 * We should not trap if SVE instructions can already be
+	 * executed, and in that case there should be no SVE state.
+	 */
 	if (test_and_set_thread_flag(TIF_SVE_EXEC))
-		WARN_ON(1); /* SVE access shouldn't have trapped */
-	set_thread_flag(TIF_SVE_FULL_REGS);
+		WARN_ON(1);
+	if (test_and_clear_thread_flag(TIF_SVE_FULL_REGS))
+		WARN_ON(1);
+
+	/*
+	 * When the FPSIMD state is loaded:
+	 *      - The return path (see fpsimd_restore_current_state) requires
+	 *        the vector length to be loaded beforehand.
+	 *      - We need to rebind the task to the CPU so the newly allocated
+	 *        SVE state is used when the task is saved.
+	 */
+	if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
+		sve_set_vq(sve_vq_from_vl(current->thread.sve_vl) - 1);
+		fpsimd_bind_task_to_cpu();
+	}
 
 	put_cpu_fpsimd_context();
 }
-- 
2.20.1

