All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Martin <Dave.Martin@arm.com>
To: kvmarm@lists.cs.columbia.edu
Cc: Christoffer Dall <cdall@kernel.org>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Marc Zyngier <marc.zyngier@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	linux-arm-kernel@lists.infradead.org
Subject: [PATCH v11 08/19] arm64: fpsimd: Eliminate task->mm checks
Date: Thu, 24 May 2018 17:56:37 +0100	[thread overview]
Message-ID: <1527181008-13549-9-git-send-email-Dave.Martin@arm.com> (raw)
In-Reply-To: <1527181008-13549-1-git-send-email-Dave.Martin@arm.com>

Currently the FPSIMD handling code uses the condition task->mm ==
NULL as a hint that task has no FPSIMD register context.

The ->mm check is only there to filter out tasks that cannot
possibly have FPSIMD context loaded, for optimisation purposes.
Also, TIF_FOREIGN_FPSTATE must always be checked anyway before
saving FPSIMD context back to memory.  For these reasons, the ->mm
checks are not useful, providing that TIF_FOREIGN_FPSTATE is
maintained in a consistent way for all threads.

The context switch logic is already deliberately optimised to defer
reloads of the regs until ret_to_user (or sigreturn as a special
case), and save them only if they have been previously loaded.
These paths are the only places where the wrong_task and wrong_cpu
conditions can be made false, by calling fpsimd_bind_task_to_cpu().
Kernel threads by definition never reach these paths.  As a result,
the wrong_task and wrong_cpu tests in fpsimd_thread_switch() will
always yield true for kernel threads.

This patch removes the redundant checks and special-case code,                  ensuring that TIF_FOREIGN_FPSTATE is set whenever a kernel thread               is scheduled in, and ensures that this flag is set for the init
task.  The fpsimd_flush_task_state() call already present in
copy_thread() ensures the same for any new task.

With TIF_FOREIGN_FPSTATE always set for kernel threads, this patch
ensures that no extra context save work is added for kernel
threads, and eliminates the redundant context saving that may
currently occur for kernel threads that have acquired an mm via
use_mm().

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>

---

Changes since v10:

 * The INIT_THREAD flag change is split out into the prior
   patch, since it is in principle a fix rather than simply a
   tidy-up.

Requested by Christoffer Dall / Catalin Marinas:

 * Reworded commit message to explain the change more clearly,
   and remove confusing claims about things being true by
   construction.

 * Added a comment to the code explaining that wrong_cpu and
   wrong_task will always be true for kernel threads.

 * Ensure .fpsimd_cpu = NR_CPUS for the init task.

   This does not seem to be a bug, because the wrong_task check in
   fpsimd_thread_switch() should still always be true for the init
   task; but it is nonetheless an inconsistency compared with what
   copy_thread() does.

   So fix it to avoid future surprises.
---
 arch/arm64/include/asm/processor.h |  4 +++-
 arch/arm64/kernel/fpsimd.c         | 40 +++++++++++++++-----------------------
 2 files changed, 19 insertions(+), 25 deletions(-)

diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 7675989..36d64f8 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -156,7 +156,9 @@ static inline void arch_thread_struct_whitelist(unsigned long *offset,
 /* Sync TPIDR_EL0 back to thread_struct for current */
 void tls_preserve_current_state(void);
 
-#define INIT_THREAD  {	}
+#define INIT_THREAD {				\
+	.fpsimd_cpu = NR_CPUS,			\
+}
 
 static inline void start_thread_common(struct pt_regs *regs, unsigned long pc)
 {
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 2d9a9e8..d736b6c 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -892,31 +892,25 @@ asmlinkage void do_fpsimd_exc(unsigned int esr, struct pt_regs *regs)
 
 void fpsimd_thread_switch(struct task_struct *next)
 {
+	bool wrong_task, wrong_cpu;
+
 	if (!system_supports_fpsimd())
 		return;
+
+	/* Save unsaved fpsimd state, if any: */
+	fpsimd_save();
+
 	/*
-	 * Save the current FPSIMD state to memory, but only if whatever is in
-	 * the registers is in fact the most recent userland FPSIMD state of
-	 * 'current'.
+	 * Fix up TIF_FOREIGN_FPSTATE to correctly describe next's
+	 * state.  For kernel threads, FPSIMD registers are never loaded
+	 * and wrong_task and wrong_cpu will always be true.
 	 */
-	if (current->mm)
-		fpsimd_save();
-
-	if (next->mm) {
-		/*
-		 * If we are switching to a task whose most recent userland
-		 * FPSIMD state is already in the registers of *this* cpu,
-		 * we can skip loading the state from memory. Otherwise, set
-		 * the TIF_FOREIGN_FPSTATE flag so the state will be loaded
-		 * upon the next return to userland.
-		 */
-		bool wrong_task = __this_cpu_read(fpsimd_last_state.st) !=
+	wrong_task = __this_cpu_read(fpsimd_last_state.st) !=
 					&next->thread.uw.fpsimd_state;
-		bool wrong_cpu = next->thread.fpsimd_cpu != smp_processor_id();
+	wrong_cpu = next->thread.fpsimd_cpu != smp_processor_id();
 
-		update_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE,
-				       wrong_task || wrong_cpu);
-	}
+	update_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE,
+			       wrong_task || wrong_cpu);
 }
 
 void fpsimd_flush_thread(void)
@@ -1121,9 +1115,8 @@ void kernel_neon_begin(void)
 
 	__this_cpu_write(kernel_neon_busy, true);
 
-	/* Save unsaved task fpsimd state, if any: */
-	if (current->mm)
-		fpsimd_save();
+	/* Save unsaved fpsimd state, if any: */
+	fpsimd_save();
 
 	/* Invalidate any task state remaining in the fpsimd regs: */
 	fpsimd_flush_cpu_state();
@@ -1245,8 +1238,7 @@ static int fpsimd_cpu_pm_notifier(struct notifier_block *self,
 {
 	switch (cmd) {
 	case CPU_PM_ENTER:
-		if (current->mm)
-			fpsimd_save();
+		fpsimd_save();
 		fpsimd_flush_cpu_state();
 		break;
 	case CPU_PM_EXIT:
-- 
2.1.4


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

WARNING: multiple messages have this Message-ID (diff)
From: Dave.Martin@arm.com (Dave Martin)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v11 08/19] arm64: fpsimd: Eliminate task->mm checks
Date: Thu, 24 May 2018 17:56:37 +0100	[thread overview]
Message-ID: <1527181008-13549-9-git-send-email-Dave.Martin@arm.com> (raw)
In-Reply-To: <1527181008-13549-1-git-send-email-Dave.Martin@arm.com>

Currently the FPSIMD handling code uses the condition task->mm ==
NULL as a hint that task has no FPSIMD register context.

The ->mm check is only there to filter out tasks that cannot
possibly have FPSIMD context loaded, for optimisation purposes.
Also, TIF_FOREIGN_FPSTATE must always be checked anyway before
saving FPSIMD context back to memory.  For these reasons, the ->mm
checks are not useful, providing that TIF_FOREIGN_FPSTATE is
maintained in a consistent way for all threads.

The context switch logic is already deliberately optimised to defer
reloads of the regs until ret_to_user (or sigreturn as a special
case), and save them only if they have been previously loaded.
These paths are the only places where the wrong_task and wrong_cpu
conditions can be made false, by calling fpsimd_bind_task_to_cpu().
Kernel threads by definition never reach these paths.  As a result,
the wrong_task and wrong_cpu tests in fpsimd_thread_switch() will
always yield true for kernel threads.

This patch removes the redundant checks and special-case code,                  ensuring that TIF_FOREIGN_FPSTATE is set whenever a kernel thread               is scheduled in, and ensures that this flag is set for the init
task.  The fpsimd_flush_task_state() call already present in
copy_thread() ensures the same for any new task.

With TIF_FOREIGN_FPSTATE always set for kernel threads, this patch
ensures that no extra context save work is added for kernel
threads, and eliminates the redundant context saving that may
currently occur for kernel threads that have acquired an mm via
use_mm().

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Alex Benn?e <alex.bennee@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>

---

Changes since v10:

 * The INIT_THREAD flag change is split out into the prior
   patch, since it is in principle a fix rather than simply a
   tidy-up.

Requested by Christoffer Dall / Catalin Marinas:

 * Reworded commit message to explain the change more clearly,
   and remove confusing claims about things being true by
   construction.

 * Added a comment to the code explaining that wrong_cpu and
   wrong_task will always be true for kernel threads.

 * Ensure .fpsimd_cpu = NR_CPUS for the init task.

   This does not seem to be a bug, because the wrong_task check in
   fpsimd_thread_switch() should still always be true for the init
   task; but it is nonetheless an inconsistency compared with what
   copy_thread() does.

   So fix it to avoid future surprises.
---
 arch/arm64/include/asm/processor.h |  4 +++-
 arch/arm64/kernel/fpsimd.c         | 40 +++++++++++++++-----------------------
 2 files changed, 19 insertions(+), 25 deletions(-)

diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 7675989..36d64f8 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -156,7 +156,9 @@ static inline void arch_thread_struct_whitelist(unsigned long *offset,
 /* Sync TPIDR_EL0 back to thread_struct for current */
 void tls_preserve_current_state(void);
 
-#define INIT_THREAD  {	}
+#define INIT_THREAD {				\
+	.fpsimd_cpu = NR_CPUS,			\
+}
 
 static inline void start_thread_common(struct pt_regs *regs, unsigned long pc)
 {
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 2d9a9e8..d736b6c 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -892,31 +892,25 @@ asmlinkage void do_fpsimd_exc(unsigned int esr, struct pt_regs *regs)
 
 void fpsimd_thread_switch(struct task_struct *next)
 {
+	bool wrong_task, wrong_cpu;
+
 	if (!system_supports_fpsimd())
 		return;
+
+	/* Save unsaved fpsimd state, if any: */
+	fpsimd_save();
+
 	/*
-	 * Save the current FPSIMD state to memory, but only if whatever is in
-	 * the registers is in fact the most recent userland FPSIMD state of
-	 * 'current'.
+	 * Fix up TIF_FOREIGN_FPSTATE to correctly describe next's
+	 * state.  For kernel threads, FPSIMD registers are never loaded
+	 * and wrong_task and wrong_cpu will always be true.
 	 */
-	if (current->mm)
-		fpsimd_save();
-
-	if (next->mm) {
-		/*
-		 * If we are switching to a task whose most recent userland
-		 * FPSIMD state is already in the registers of *this* cpu,
-		 * we can skip loading the state from memory. Otherwise, set
-		 * the TIF_FOREIGN_FPSTATE flag so the state will be loaded
-		 * upon the next return to userland.
-		 */
-		bool wrong_task = __this_cpu_read(fpsimd_last_state.st) !=
+	wrong_task = __this_cpu_read(fpsimd_last_state.st) !=
 					&next->thread.uw.fpsimd_state;
-		bool wrong_cpu = next->thread.fpsimd_cpu != smp_processor_id();
+	wrong_cpu = next->thread.fpsimd_cpu != smp_processor_id();
 
-		update_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE,
-				       wrong_task || wrong_cpu);
-	}
+	update_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE,
+			       wrong_task || wrong_cpu);
 }
 
 void fpsimd_flush_thread(void)
@@ -1121,9 +1115,8 @@ void kernel_neon_begin(void)
 
 	__this_cpu_write(kernel_neon_busy, true);
 
-	/* Save unsaved task fpsimd state, if any: */
-	if (current->mm)
-		fpsimd_save();
+	/* Save unsaved fpsimd state, if any: */
+	fpsimd_save();
 
 	/* Invalidate any task state remaining in the fpsimd regs: */
 	fpsimd_flush_cpu_state();
@@ -1245,8 +1238,7 @@ static int fpsimd_cpu_pm_notifier(struct notifier_block *self,
 {
 	switch (cmd) {
 	case CPU_PM_ENTER:
-		if (current->mm)
-			fpsimd_save();
+		fpsimd_save();
 		fpsimd_flush_cpu_state();
 		break;
 	case CPU_PM_EXIT:
-- 
2.1.4

  parent reply	other threads:[~2018-05-24 16:56 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-24 16:56 [PATCH v11 00/19] KVM: arm64: Optimise FPSIMD context switching Dave Martin
2018-05-24 16:56 ` Dave Martin
2018-05-24 16:56 ` [PATCH v11 01/19] arm64: fpsimd: Fix TIF_FOREIGN_FPSTATE after invalidating cpu regs Dave Martin
2018-05-24 16:56   ` Dave Martin
2018-05-24 16:56 ` [PATCH v11 02/19] thread_info: Add update_thread_flag() helpers Dave Martin
2018-05-24 16:56   ` Dave Martin
2018-05-24 17:02   ` Peter Zijlstra
2018-05-24 17:02     ` Peter Zijlstra
2018-05-24 16:56 ` [PATCH v11 03/19] arm64: Use update{,_tsk}_thread_flag() Dave Martin
2018-05-24 16:56   ` Dave Martin
2018-05-24 16:56 ` [PATCH v11 04/19] KVM: arm/arm64: Introduce kvm_arch_vcpu_run_pid_change Dave Martin
2018-05-24 16:56   ` Dave Martin
2018-05-24 16:56 ` [PATCH v11 05/19] KVM: arm64: Convert lazy FPSIMD context switch trap to C Dave Martin
2018-05-24 16:56   ` Dave Martin
2018-05-24 16:56 ` [PATCH v11 06/19] arm64: fpsimd: Generalise context saving for non-task contexts Dave Martin
2018-05-24 16:56   ` Dave Martin
2018-05-24 16:56 ` [PATCH v11 07/19] arm64: fpsimd: Avoid FPSIMD context leakage for the init task Dave Martin
2018-05-24 16:56   ` Dave Martin
2018-05-25 10:01   ` Alex Bennée
2018-05-25 10:01     ` Alex Bennée
2018-05-24 16:56 ` Dave Martin [this message]
2018-05-24 16:56   ` [PATCH v11 08/19] arm64: fpsimd: Eliminate task->mm checks Dave Martin
2018-05-25  9:02   ` Christoffer Dall
2018-05-25  9:02     ` Christoffer Dall
2018-05-25  9:52     ` Dave Martin
2018-05-25  9:52       ` Dave Martin
2018-05-25 10:04   ` Alex Bennée
2018-05-25 10:04     ` Alex Bennée
2018-05-25 10:48     ` Dave Martin
2018-05-25 10:48       ` Dave Martin
2018-05-24 16:56 ` [PATCH v11 09/19] arm64/sve: Refactor user SVE trap maintenance for external use Dave Martin
2018-05-24 16:56   ` Dave Martin
2018-05-24 16:56 ` [PATCH v11 10/19] KVM: arm64: Repurpose vcpu_arch.debug_flags for general-purpose flags Dave Martin
2018-05-24 16:56   ` Dave Martin
2018-05-24 16:56 ` [PATCH v11 11/19] KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing Dave Martin
2018-05-24 16:56   ` Dave Martin
2018-05-24 16:56 ` [PATCH v11 12/19] arm64/sve: Move read_zcr_features() out of cpufeature.h Dave Martin
2018-05-24 16:56   ` Dave Martin
2018-05-24 16:56 ` [PATCH v11 13/19] arm64/sve: Switch sve_pffr() argument from task to thread Dave Martin
2018-05-24 16:56   ` Dave Martin
2018-05-24 16:56 ` [PATCH v11 14/19] arm64/sve: Move sve_pffr() to fpsimd.h and make inline Dave Martin
2018-05-24 16:56   ` Dave Martin
2018-05-24 16:56 ` [PATCH v11 15/19] KVM: arm64: Save host SVE context as appropriate Dave Martin
2018-05-24 16:56   ` Dave Martin
2018-05-24 16:56 ` [PATCH v11 16/19] KVM: arm64: Remove eager host SVE state saving Dave Martin
2018-05-24 16:56   ` Dave Martin
2018-05-24 16:56 ` [PATCH v11 17/19] KVM: arm64: Remove redundant *exit_code changes in fpsimd_guest_exit() Dave Martin
2018-05-24 16:56   ` Dave Martin
2018-05-24 16:56 ` [PATCH v11 18/19] KVM: arm64: Fold redundant exit code checks out of fixup_guest_exit() Dave Martin
2018-05-24 16:56   ` Dave Martin
2018-05-24 16:56 ` [PATCH v11 19/19] KVM: arm64: Invoke FPSIMD context switch trap from C Dave Martin
2018-05-24 16:56   ` Dave Martin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1527181008-13549-9-git-send-email-Dave.Martin@arm.com \
    --to=dave.martin@arm.com \
    --cc=ard.biesheuvel@linaro.org \
    --cc=catalin.marinas@arm.com \
    --cc=cdall@kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=marc.zyngier@arm.com \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.