* [PATCH v6 00/15] KVM: arm64: Optimise FPSIMD context switching
From: Dave Martin @ 2018-05-08 16:44 UTC
  To: kvmarm
  Cc: linux-arm-kernel, Christoffer Dall, Marc Zyngier, Ard Biesheuvel,
	Catalin Marinas, linux-arch, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Steven Rostedt, Oleg Nesterov

Note: Most of these patches are Arm-specific.  People not Cc'd on the
whole series can find it in the linux-arm-kernel archive [3].

This series aims to improve the way FPSIMD context is handled by KVM.
Only minor changes have been made since the previous v5 [1].
The changes are summarised in the individual patches.

Patch 6 is new, and does some simple refactoring to enable new bits to
be added to an existing vcpu_arch flags field.

Patches 1, 2, 6, 7 and 14 are still missing Reviewed-bys/Acked-bys, so
reviewer attention on these patches would be much appreciated.


Some testing has been done on Juno and the Arm Fast Model (arm64),
including (on earlier versions of the patches) combinations of SVE and
non-SVE, VHE and non-VHE configurations.

Cheers
---Dave

[1] [PATCH v5 00/14] KVM: arm64: Optimise FPSIMD context switching
http://lists.infradead.org/pipermail/linux-arm-kernel/2018-May/575805.html

[2] [RFC PATCH 0/6] Simplify setting thread flags to a particular value
https://lkml.org/lkml/2018/4/19/225

[3] linux-arm-kernel archive
http://lists.infradead.org/pipermail/linux-arm-kernel/2018-May/thread.html


Christoffer Dall (1):
  KVM: arm/arm64: Introduce kvm_arch_vcpu_run_pid_change

Dave Martin (14):
  thread_info: Add update_thread_flag() helpers
  arm64: Use update{,_tsk}_thread_flag()
  KVM: arm64: Convert lazy FPSIMD context switch trap to C
  arm64: fpsimd: Generalise context saving for non-task contexts
  KVM: arm64: Repurpose vcpu_arch.debug_flags for general-purpose flags
  KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing
  arm64/sve: Move read_zcr_features() out of cpufeature.h
  arm64/sve: Switch sve_pffr() argument from task to thread
  arm64/sve: Move sve_pffr() to fpsimd.h and make inline
  KVM: arm64: Save host SVE context as appropriate
  KVM: arm64: Remove eager host SVE state saving
  KVM: arm64: Remove redundant *exit_code changes in fpsimd_guest_exit()
  KVM: arm64: Fold redundant exit code checks out of fixup_guest_exit()
  KVM: arm64: Invoke FPSIMD context switch trap from C

 arch/arm/include/asm/kvm_host.h     |   9 ++-
 arch/arm64/Kconfig                  |   7 ++
 arch/arm64/include/asm/cpufeature.h |  29 ---------
 arch/arm64/include/asm/fpsimd.h     |  20 ++++++
 arch/arm64/include/asm/kvm_asm.h    |   3 +
 arch/arm64/include/asm/kvm_host.h   |  24 ++++---
 arch/arm64/include/asm/processor.h  |   2 +
 arch/arm64/kernel/fpsimd.c          | 120 ++++++++++++++++++-----------------
 arch/arm64/kernel/ptrace.c          |   1 +
 arch/arm64/kvm/Kconfig              |   1 +
 arch/arm64/kvm/Makefile             |   2 +-
 arch/arm64/kvm/debug.c              |   8 +--
 arch/arm64/kvm/fpsimd.c             | 110 ++++++++++++++++++++++++++++++++
 arch/arm64/kvm/hyp/debug-sr.c       |   6 +-
 arch/arm64/kvm/hyp/entry.S          |  43 -------------
 arch/arm64/kvm/hyp/hyp-entry.S      |  19 ------
 arch/arm64/kvm/hyp/switch.c         | 123 +++++++++++++++++++++++++-----------
 arch/arm64/kvm/hyp/sysreg-sr.c      |   4 +-
 arch/arm64/kvm/sys_regs.c           |   8 +--
 include/linux/kvm_host.h            |   9 +++
 include/linux/sched.h               |   6 ++
 include/linux/thread_info.h         |  11 ++++
 virt/kvm/Kconfig                    |   3 +
 virt/kvm/arm/arm.c                  |  25 +++++++-
 virt/kvm/kvm_main.c                 |   7 +-
 25 files changed, 383 insertions(+), 217 deletions(-)
 create mode 100644 arch/arm64/kvm/fpsimd.c

-- 
2.1.4

* [PATCH v6 01/15] thread_info: Add update_thread_flag() helpers
From: Dave Martin @ 2018-05-08 16:44 UTC
  To: kvmarm
  Cc: Christoffer Dall, Ard Biesheuvel, Marc Zyngier, Catalin Marinas,
	Oleg Nesterov, Steven Rostedt, Peter Zijlstra, Ingo Molnar,
	linux-arm-kernel

There are a number of bits of code sprinkled around the kernel to
set a thread flag if a certain condition is true, and clear it
otherwise.

To help make those call sites terser and less cumbersome, this
patch adds a new family of thread flag manipulators

	update*_thread_flag([...,] flag, cond)

which do the equivalent of:

	if (cond)
		set*_thread_flag([...,] flag);
	else
		clear*_thread_flag([...,] flag);
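
For example, the arm64 conversion later in this series turns

	if (flags & PR_SVE_VL_INHERIT)
		set_tsk_thread_flag(task, TIF_SVE_VL_INHERIT);
	else
		clear_tsk_thread_flag(task, TIF_SVE_VL_INHERIT);

into the single call

	update_tsk_thread_flag(task, TIF_SVE_VL_INHERIT,
			       flags & PR_SVE_VL_INHERIT);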

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/sched.h       |  6 ++++++
 include/linux/thread_info.h | 11 +++++++++++
 2 files changed, 17 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index b3d697f..c2c3051 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1578,6 +1578,12 @@ static inline void clear_tsk_thread_flag(struct task_struct *tsk, int flag)
 	clear_ti_thread_flag(task_thread_info(tsk), flag);
 }
 
+static inline void update_tsk_thread_flag(struct task_struct *tsk, int flag,
+					  bool value)
+{
+	update_ti_thread_flag(task_thread_info(tsk), flag, value);
+}
+
 static inline int test_and_set_tsk_thread_flag(struct task_struct *tsk, int flag)
 {
 	return test_and_set_ti_thread_flag(task_thread_info(tsk), flag);
diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index cf2862b..8d8821b 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -60,6 +60,15 @@ static inline void clear_ti_thread_flag(struct thread_info *ti, int flag)
 	clear_bit(flag, (unsigned long *)&ti->flags);
 }
 
+static inline void update_ti_thread_flag(struct thread_info *ti, int flag,
+					 bool value)
+{
+	if (value)
+		set_ti_thread_flag(ti, flag);
+	else
+		clear_ti_thread_flag(ti, flag);
+}
+
 static inline int test_and_set_ti_thread_flag(struct thread_info *ti, int flag)
 {
 	return test_and_set_bit(flag, (unsigned long *)&ti->flags);
@@ -79,6 +88,8 @@ static inline int test_ti_thread_flag(struct thread_info *ti, int flag)
 	set_ti_thread_flag(current_thread_info(), flag)
 #define clear_thread_flag(flag) \
 	clear_ti_thread_flag(current_thread_info(), flag)
+#define update_thread_flag(flag, value) \
+	update_ti_thread_flag(current_thread_info(), flag, value)
 #define test_and_set_thread_flag(flag) \
 	test_and_set_ti_thread_flag(current_thread_info(), flag)
 #define test_and_clear_thread_flag(flag) \
-- 
2.1.4

* [PATCH v6 02/15] arm64: Use update{,_tsk}_thread_flag()
From: Dave Martin @ 2018-05-08 16:44 UTC
  To: kvmarm
  Cc: Christoffer Dall, Ard Biesheuvel, Marc Zyngier, Catalin Marinas,
	Will Deacon, linux-arm-kernel

This patch uses the new update_thread_flag() helpers to simplify a
couple of if () set; else clear; constructs.

No functional change.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/kernel/fpsimd.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 87a3536..0c4e7e0 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -618,10 +618,8 @@ int sve_set_vector_length(struct task_struct *task,
 	task->thread.sve_vl = vl;
 
 out:
-	if (flags & PR_SVE_VL_INHERIT)
-		set_tsk_thread_flag(task, TIF_SVE_VL_INHERIT);
-	else
-		clear_tsk_thread_flag(task, TIF_SVE_VL_INHERIT);
+	update_tsk_thread_flag(task, TIF_SVE_VL_INHERIT,
+			       flags & PR_SVE_VL_INHERIT);
 
 	return 0;
 }
@@ -902,7 +900,7 @@ void fpsimd_thread_switch(struct task_struct *next)
 	if (current->mm)
 		task_fpsimd_save();
 
-	if (next->mm) {
+	if (next->mm)
 		/*
 		 * If we are switching to a task whose most recent userland
 		 * FPSIMD state is already in the registers of *this* cpu,
@@ -910,13 +908,10 @@ void fpsimd_thread_switch(struct task_struct *next)
 		 * the TIF_FOREIGN_FPSTATE flag so the state will be loaded
 		 * upon the next return to userland.
 		 */
-		if (__this_cpu_read(fpsimd_last_state.st) ==
-			&next->thread.uw.fpsimd_state
-		    && next->thread.fpsimd_cpu == smp_processor_id())
-			clear_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE);
-		else
-			set_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE);
-	}
+		update_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE,
+			__this_cpu_read(fpsimd_last_state.st) !=
+				&next->thread.uw.fpsimd_state ||
+			next->thread.fpsimd_cpu != smp_processor_id());
 }
 
 void fpsimd_flush_thread(void)
-- 
2.1.4

* [PATCH v6 03/15] KVM: arm/arm64: Introduce kvm_arch_vcpu_run_pid_change
From: Dave Martin @ 2018-05-08 16:44 UTC
  To: kvmarm
  Cc: Christoffer Dall, Ard Biesheuvel, Marc Zyngier, Catalin Marinas,
	linux-arm-kernel, Christoffer Dall

From: Christoffer Dall <christoffer.dall@linaro.org>

KVM/ARM differs from other architectures in having to maintain an
additional virtual address space, separate from those of the host and
the guest, because we split the execution of KVM across both EL1 and
EL2.

This results in a need to explicitly map data structures into EL2
(hyp) which are accessed from the hyp code.  As we are about to be
more clever with our FPSIMD handling on arm64, which stores data in
the task struct and uses thread_info flags, we will have to map
parts of the currently executing task struct into the EL2 virtual
address space.

However, we don't want to do this on every KVM_RUN, because it is a
fairly expensive operation to walk the page tables, and the common
execution mode is to map a single thread to a VCPU.  By introducing
a hook that architectures can select with
HAVE_KVM_VCPU_RUN_PID_CHANGE, we do not introduce overhead for
other architectures, but have a simple way to only map the data we
need when required for arm64.

This patch introduces the framework only, and wires it up in the
arm/arm64 KVM common code.

No functional change.
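
For illustration, an architecture opts in by selecting the new Kconfig
symbol and providing the hook.  A minimal sketch (the hook body is a
placeholder for whatever per-thread setup the architecture needs):

	config KVM
		select HAVE_KVM_VCPU_RUN_PID_CHANGE

	int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
	{
		/* e.g. map current's data into the hyp address space */
		return 0;	/* 0 on success, -errno to abort KVM_RUN */
	}

A later patch in this series does this for arm64.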

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
---
 include/linux/kvm_host.h | 9 +++++++++
 virt/kvm/Kconfig         | 3 +++
 virt/kvm/kvm_main.c      | 7 ++++++-
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 6930c63..4268ace 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1276,4 +1276,13 @@ static inline long kvm_arch_vcpu_async_ioctl(struct file *filp,
 void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
 		unsigned long start, unsigned long end);
 
+#ifdef CONFIG_HAVE_KVM_VCPU_RUN_PID_CHANGE
+int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu);
+#else
+static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
+{
+	return 0;
+}
+#endif /* CONFIG_HAVE_KVM_VCPU_RUN_PID_CHANGE */
+
 #endif
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index cca7e06..72143cf 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -54,3 +54,6 @@ config HAVE_KVM_IRQ_BYPASS
 
 config HAVE_KVM_VCPU_ASYNC_IOCTL
        bool
+
+config HAVE_KVM_VCPU_RUN_PID_CHANGE
+       bool
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c7b2e92..c32e240 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2550,8 +2550,13 @@ static long kvm_vcpu_ioctl(struct file *filp,
 		oldpid = rcu_access_pointer(vcpu->pid);
 		if (unlikely(oldpid != current->pids[PIDTYPE_PID].pid)) {
 			/* The thread running this VCPU changed. */
-			struct pid *newpid = get_task_pid(current, PIDTYPE_PID);
+			struct pid *newpid;
 
+			r = kvm_arch_vcpu_run_pid_change(vcpu);
+			if (r)
+				break;
+
+			newpid = get_task_pid(current, PIDTYPE_PID);
 			rcu_assign_pointer(vcpu->pid, newpid);
 			if (oldpid)
 				synchronize_rcu();
-- 
2.1.4

* [PATCH v6 04/15] KVM: arm64: Convert lazy FPSIMD context switch trap to C
From: Dave Martin @ 2018-05-08 16:44 UTC
  To: kvmarm
  Cc: Marc Zyngier, Catalin Marinas, Christoffer Dall,
	linux-arm-kernel, Ard Biesheuvel

To make the lazy FPSIMD context switch trap code easier to hack on,
this patch converts it to C.

This is not amazingly efficient, but the trap should typically only
be taken once per host context switch.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/hyp/entry.S  | 57 +++++++++++++++++----------------------------
 arch/arm64/kvm/hyp/switch.c | 24 +++++++++++++++++++
 2 files changed, 46 insertions(+), 35 deletions(-)

diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
index e41a161..40f349b 100644
--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -172,40 +172,27 @@ ENTRY(__fpsimd_guest_restore)
 	// x1: vcpu
 	// x2-x29,lr: vcpu regs
 	// vcpu x0-x1 on the stack
-	stp	x2, x3, [sp, #-16]!
-	stp	x4, lr, [sp, #-16]!
-
-alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
-	mrs	x2, cptr_el2
-	bic	x2, x2, #CPTR_EL2_TFP
-	msr	cptr_el2, x2
-alternative_else
-	mrs	x2, cpacr_el1
-	orr	x2, x2, #CPACR_EL1_FPEN
-	msr	cpacr_el1, x2
-alternative_endif
-	isb
-
-	mov	x3, x1
-
-	ldr	x0, [x3, #VCPU_HOST_CONTEXT]
-	kern_hyp_va x0
-	add	x0, x0, #CPU_GP_REG_OFFSET(CPU_FP_REGS)
-	bl	__fpsimd_save_state
-
-	add	x2, x3, #VCPU_CONTEXT
-	add	x0, x2, #CPU_GP_REG_OFFSET(CPU_FP_REGS)
-	bl	__fpsimd_restore_state
-
-	// Skip restoring fpexc32 for AArch64 guests
-	mrs	x1, hcr_el2
-	tbnz	x1, #HCR_RW_SHIFT, 1f
-	ldr	x4, [x3, #VCPU_FPEXC32_EL2]
-	msr	fpexc32_el2, x4
-1:
-	ldp	x4, lr, [sp], #16
-	ldp	x2, x3, [sp], #16
-	ldp	x0, x1, [sp], #16
-
+	stp	x2, x3, [sp, #-144]!
+	stp	x4, x5, [sp, #16]
+	stp	x6, x7, [sp, #32]
+	stp	x8, x9, [sp, #48]
+	stp	x10, x11, [sp, #64]
+	stp	x12, x13, [sp, #80]
+	stp	x14, x15, [sp, #96]
+	stp	x16, x17, [sp, #112]
+	stp	x18, lr, [sp, #128]
+
+	bl	__hyp_switch_fpsimd
+
+	ldp	x4, x5, [sp, #16]
+	ldp	x6, x7, [sp, #32]
+	ldp	x8, x9, [sp, #48]
+	ldp	x10, x11, [sp, #64]
+	ldp	x12, x13, [sp, #80]
+	ldp	x14, x15, [sp, #96]
+	ldp	x16, x17, [sp, #112]
+	ldp	x18, lr, [sp, #128]
+	ldp	x0, x1, [sp, #144]
+	ldp	x2, x3, [sp], #160
 	eret
 ENDPROC(__fpsimd_guest_restore)
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index d964523..c0796c4 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -318,6 +318,30 @@ static bool __hyp_text __skip_instr(struct kvm_vcpu *vcpu)
 	}
 }
 
+void __hyp_text __hyp_switch_fpsimd(u64 esr __always_unused,
+				    struct kvm_vcpu *vcpu)
+{
+	kvm_cpu_context_t *host_ctxt;
+
+	if (has_vhe())
+		write_sysreg(read_sysreg(cpacr_el1) | CPACR_EL1_FPEN,
+			     cpacr_el1);
+	else
+		write_sysreg(read_sysreg(cptr_el2) & ~(u64)CPTR_EL2_TFP,
+			     cptr_el2);
+
+	isb();
+
+	host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context);
+	__fpsimd_save_state(&host_ctxt->gp_regs.fp_regs);
+	__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
+
+	/* Skip restoring fpexc32 for AArch64 guests */
+	if (!(read_sysreg(hcr_el2) & HCR_RW))
+		write_sysreg(vcpu->arch.ctxt.sys_regs[FPEXC32_EL2],
+			     fpexc32_el2);
+}
+
 /*
  * Return true when we were able to fixup the guest exit and should return to
  * the guest, false when we should restore the host state and return to the
-- 
2.1.4

* [PATCH v6 05/15] arm64: fpsimd: Generalise context saving for non-task contexts
From: Dave Martin @ 2018-05-08 16:44 UTC
  To: kvmarm
  Cc: Marc Zyngier, Catalin Marinas, Christoffer Dall,
	linux-arm-kernel, Ard Biesheuvel

In preparation for allowing non-task (i.e., KVM vcpu) FPSIMD
contexts to be handled by the fpsimd common code, this patch adapts
task_fpsimd_save() to save back the currently loaded context,
removing the explicit dependency on current.

The relevant storage to write back to in memory is now found by
examining the fpsimd_last_state percpu struct.

fpsimd_save() does nothing unless TIF_FOREIGN_FPSTATE is clear, and
fpsimd_last_state is updated under local_bh_disable() or
local_irq_disable() everywhere that TIF_FOREIGN_FPSTATE is cleared:
thus, fpsimd_save() will write back to the correct storage for the
loaded context.

No functional change.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kernel/fpsimd.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 0c4e7e0..5fc0595 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -270,13 +270,15 @@ static void task_fpsimd_load(void)
 }
 
 /*
- * Ensure current's FPSIMD/SVE storage in thread_struct is up to date
- * with respect to the CPU registers.
+ * Ensure FPSIMD/SVE storage in memory for the loaded context is up to
+ * date with respect to the CPU registers.
  *
  * Softirqs (and preemption) must be disabled.
  */
-static void task_fpsimd_save(void)
+static void fpsimd_save(void)
 {
+	struct user_fpsimd_state *st = __this_cpu_read(fpsimd_last_state.st);
+
 	WARN_ON(!in_softirq() && !irqs_disabled());
 
 	if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
@@ -291,10 +293,9 @@ static void task_fpsimd_save(void)
 				return;
 			}
 
-			sve_save_state(sve_pffr(current),
-				       &current->thread.uw.fpsimd_state.fpsr);
+			sve_save_state(sve_pffr(current), &st->fpsr);
 		} else
-			fpsimd_save_state(&current->thread.uw.fpsimd_state);
+			fpsimd_save_state(st);
 	}
 }
 
@@ -598,7 +599,7 @@ int sve_set_vector_length(struct task_struct *task,
 	if (task == current) {
 		local_bh_disable();
 
-		task_fpsimd_save();
+		fpsimd_save();
 		set_thread_flag(TIF_FOREIGN_FPSTATE);
 	}
 
@@ -837,7 +838,7 @@ asmlinkage void do_sve_acc(unsigned int esr, struct pt_regs *regs)
 
 	local_bh_disable();
 
-	task_fpsimd_save();
+	fpsimd_save();
 	fpsimd_to_sve(current);
 
 	/* Force ret_to_user to reload the registers: */
@@ -898,7 +899,7 @@ void fpsimd_thread_switch(struct task_struct *next)
 	 * 'current'.
 	 */
 	if (current->mm)
-		task_fpsimd_save();
+		fpsimd_save();
 
 	if (next->mm)
 		/*
@@ -977,7 +978,7 @@ void fpsimd_preserve_current_state(void)
 		return;
 
 	local_bh_disable();
-	task_fpsimd_save();
+	fpsimd_save();
 	local_bh_enable();
 }
 
@@ -1117,7 +1118,7 @@ void kernel_neon_begin(void)
 
 	/* Save unsaved task fpsimd state, if any: */
 	if (current->mm) {
-		task_fpsimd_save();
+		fpsimd_save();
 		set_thread_flag(TIF_FOREIGN_FPSTATE);
 	}
 
@@ -1242,7 +1243,7 @@ static int fpsimd_cpu_pm_notifier(struct notifier_block *self,
 	switch (cmd) {
 	case CPU_PM_ENTER:
 		if (current->mm)
-			task_fpsimd_save();
+			fpsimd_save();
 		fpsimd_flush_cpu_state();
 		break;
 	case CPU_PM_EXIT:
-- 
2.1.4

* [PATCH v6 06/15] KVM: arm64: Repurpose vcpu_arch.debug_flags for general-purpose flags
From: Dave Martin @ 2018-05-08 16:44 UTC
  To: kvmarm
  Cc: Marc Zyngier, Catalin Marinas, Christoffer Dall,
	linux-arm-kernel, Ard Biesheuvel

In struct vcpu_arch, the debug_flags field is used to store debug-
related flags about the vcpu state.

Since we are about to add some more flags related to FPSIMD and
SVE, it makes sense to add them to the existing flags field rather
than adding new fields.  Since there is only one debug_flags flag
defined so far, there is plenty of free space for expansion.

In preparation for adding more flags, this patch renames the
debug_flags field to simply "flags", and updates comments
appropriately.

No functional change.
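
For example, a later patch in this series adds FPSIMD-related flags
alongside the existing debug flag:

	/* vcpu_arch flags field values: */
	#define KVM_ARM64_DEBUG_DIRTY		(1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
	#define KVM_ARM64_FP_ENABLED		(1 << 1) /* guest FP regs loaded */
	#define KVM_ARM64_HOST_SVE_IN_USE	(1 << 2) /* backup for host TIF_SVE */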

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/include/asm/kvm_asm.h  | 1 +
 arch/arm64/include/asm/kvm_host.h | 4 ++--
 arch/arm64/kvm/debug.c            | 8 ++++----
 arch/arm64/kvm/hyp/debug-sr.c     | 6 +++---
 arch/arm64/kvm/hyp/sysreg-sr.c    | 4 ++--
 arch/arm64/kvm/sys_regs.c         | 8 ++++----
 6 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index f6648a3..9699c13 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -30,6 +30,7 @@
 /* The hyp-stub will return this for any kvm_call_hyp() call */
 #define ARM_EXCEPTION_HYP_GONE	  HVC_STUB_ERR
 
+/* vcpu_arch flags field values: */
 #define KVM_ARM64_DEBUG_DIRTY_SHIFT	0
 #define KVM_ARM64_DEBUG_DIRTY		(1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 469de8a..a5ed95c 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -216,8 +216,8 @@ struct kvm_vcpu_arch {
 	/* Exception Information */
 	struct kvm_vcpu_fault_info fault;
 
-	/* Guest debug state */
-	u64 debug_flags;
+	/* Miscellaneous vcpu state flags: see <asm/kvm_asm.h> */
+	u64 flags;
 
 	/*
 	 * We maintain more than a single set of debug registers to support
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
index a1f4ebd..00d4223 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -103,7 +103,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)
  *
  * Additionally, KVM only traps guest accesses to the debug registers if
  * the guest is not actively using them (see the KVM_ARM64_DEBUG_DIRTY
- * flag on vcpu->arch.debug_flags).  Since the guest must not interfere
+ * flag on vcpu->arch.flags).  Since the guest must not interfere
  * with the hardware state when debugging the guest, we must ensure that
  * trapping is enabled whenever we are debugging the guest using the
  * debug registers.
@@ -111,7 +111,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)
 
 void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
 {
-	bool trap_debug = !(vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY);
+	bool trap_debug = !(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY);
 	unsigned long mdscr;
 
 	trace_kvm_arm_setup_debug(vcpu, vcpu->guest_debug);
@@ -184,7 +184,7 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
 			vcpu_write_sys_reg(vcpu, mdscr, MDSCR_EL1);
 
 			vcpu->arch.debug_ptr = &vcpu->arch.external_debug_state;
-			vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
+			vcpu->arch.flags |= KVM_ARM64_DEBUG_DIRTY;
 			trap_debug = true;
 
 			trace_kvm_arm_set_regset("BKPTS", get_num_brps(),
@@ -206,7 +206,7 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
 
 	/* If KDE or MDE are set, perform a full save/restore cycle. */
 	if (vcpu_read_sys_reg(vcpu, MDSCR_EL1) & (DBG_MDSCR_KDE | DBG_MDSCR_MDE))
-		vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
+		vcpu->arch.flags |= KVM_ARM64_DEBUG_DIRTY;
 
 	trace_kvm_arm_set_dreg32("MDCR_EL2", vcpu->arch.mdcr_el2);
 	trace_kvm_arm_set_dreg32("MDSCR_EL1", vcpu_read_sys_reg(vcpu, MDSCR_EL1));
diff --git a/arch/arm64/kvm/hyp/debug-sr.c b/arch/arm64/kvm/hyp/debug-sr.c
index 3e717f6..5000976 100644
--- a/arch/arm64/kvm/hyp/debug-sr.c
+++ b/arch/arm64/kvm/hyp/debug-sr.c
@@ -163,7 +163,7 @@ void __hyp_text __debug_switch_to_guest(struct kvm_vcpu *vcpu)
 	if (!has_vhe())
 		__debug_save_spe_nvhe(&vcpu->arch.host_debug_state.pmscr_el1);
 
-	if (!(vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY))
+	if (!(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY))
 		return;
 
 	host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context);
@@ -185,7 +185,7 @@ void __hyp_text __debug_switch_to_host(struct kvm_vcpu *vcpu)
 	if (!has_vhe())
 		__debug_restore_spe_nvhe(vcpu->arch.host_debug_state.pmscr_el1);
 
-	if (!(vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY))
+	if (!(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY))
 		return;
 
 	host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context);
@@ -196,7 +196,7 @@ void __hyp_text __debug_switch_to_host(struct kvm_vcpu *vcpu)
 	__debug_save_state(vcpu, guest_dbg, guest_ctxt);
 	__debug_restore_state(vcpu, host_dbg, host_ctxt);
 
-	vcpu->arch.debug_flags &= ~KVM_ARM64_DEBUG_DIRTY;
+	vcpu->arch.flags &= ~KVM_ARM64_DEBUG_DIRTY;
 }
 
 u32 __hyp_text __kvm_get_mdcr_el2(void)
diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
index b3894df..35bc168 100644
--- a/arch/arm64/kvm/hyp/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/sysreg-sr.c
@@ -196,7 +196,7 @@ void __hyp_text __sysreg32_save_state(struct kvm_vcpu *vcpu)
 	sysreg[DACR32_EL2] = read_sysreg(dacr32_el2);
 	sysreg[IFSR32_EL2] = read_sysreg(ifsr32_el2);
 
-	if (has_vhe() || vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY)
+	if (has_vhe() || vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY)
 		sysreg[DBGVCR32_EL2] = read_sysreg(dbgvcr32_el2);
 }
 
@@ -218,7 +218,7 @@ void __hyp_text __sysreg32_restore_state(struct kvm_vcpu *vcpu)
 	write_sysreg(sysreg[DACR32_EL2], dacr32_el2);
 	write_sysreg(sysreg[IFSR32_EL2], ifsr32_el2);
 
-	if (has_vhe() || vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY)
+	if (has_vhe() || vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY)
 		write_sysreg(sysreg[DBGVCR32_EL2], dbgvcr32_el2);
 }
 
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 6e3b969..17daf5d 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -338,7 +338,7 @@ static bool trap_debug_regs(struct kvm_vcpu *vcpu,
 {
 	if (p->is_write) {
 		vcpu_write_sys_reg(vcpu, p->regval, r->reg);
-		vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
+		vcpu->arch.flags |= KVM_ARM64_DEBUG_DIRTY;
 	} else {
 		p->regval = vcpu_read_sys_reg(vcpu, r->reg);
 	}
@@ -369,7 +369,7 @@ static void reg_to_dbg(struct kvm_vcpu *vcpu,
 	}
 
 	*dbg_reg = val;
-	vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
+	vcpu->arch.flags |= KVM_ARM64_DEBUG_DIRTY;
 }
 
 static void dbg_to_reg(struct kvm_vcpu *vcpu,
@@ -1441,7 +1441,7 @@ static bool trap_debug32(struct kvm_vcpu *vcpu,
 {
 	if (p->is_write) {
 		vcpu_cp14(vcpu, r->reg) = p->regval;
-		vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
+		vcpu->arch.flags |= KVM_ARM64_DEBUG_DIRTY;
 	} else {
 		p->regval = vcpu_cp14(vcpu, r->reg);
 	}
@@ -1473,7 +1473,7 @@ static bool trap_xvr(struct kvm_vcpu *vcpu,
 		val |= p->regval << 32;
 		*dbg_reg = val;
 
-		vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
+		vcpu->arch.flags |= KVM_ARM64_DEBUG_DIRTY;
 	} else {
 		p->regval = *dbg_reg >> 32;
 	}
-- 
2.1.4

* [PATCH v6 07/15] KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing
From: Dave Martin @ 2018-05-08 16:44 UTC
  To: kvmarm
  Cc: Marc Zyngier, Catalin Marinas, Christoffer Dall,
	linux-arm-kernel, Ard Biesheuvel

This patch refactors KVM to align the host and guest FPSIMD
save/restore logic with each other for arm64.  This reduces the
number of redundant save/restore operations that must occur, and
reduces the common-case IRQ blackout time during guest exit storms
by saving the host state lazily and optimising away the need to
restore the host state before returning to the run loop.

Four hooks are defined in order to enable this:

 * kvm_arch_vcpu_run_map_fp():
   Called on PID change to map necessary bits of current to Hyp.

 * kvm_arch_vcpu_load_fp():
   Set up FP/SIMD for entering the KVM run loop (parse as
   "vcpu_load fp").

 * kvm_arch_vcpu_ctxsync_fp():
   Get FP/SIMD into a safe state for re-enabling interrupts after a
   guest exit back to the run loop.

   For arm64 specifically, this involves updating the host kernel's
   FPSIMD context tracking metadata so that kernel-mode NEON use
   will cause the vcpu's FPSIMD state to be saved back correctly
   into the vcpu struct.  This must be done before re-enabling
   interrupts because kernel-mode NEON may be used by softirqs.

 * kvm_arch_vcpu_put_fp():
   Save guest FP/SIMD state back to memory and dissociate from the
   CPU ("vcpu_put fp").

Also, the arm64 FPSIMD context switch code is updated to enable it
to save back FPSIMD state for a vcpu, not just current.  A few
helpers drive this:

 * fpsimd_bind_state_to_cpu(struct user_fpsimd_state *fp):
   mark this CPU as having context fp (which may belong to a vcpu)
   currently loaded in its registers.  This is the non-task
   equivalent of the static function fpsimd_bind_to_cpu() in
   fpsimd.c.

 * fpsimd_save():
   exported to allow KVM to save the guest's FPSIMD state back to
   memory on exit from the run loop.

 * fpsimd_flush_cpu_state():
   invalidate any context's FPSIMD state that is currently loaded.
   Used to disassociate the vcpu from the CPU regs on run loop exit.

These changes allow the run loop to enable interrupts (and thus
softirqs that may use kernel-mode NEON) without having to save the
guest's FPSIMD state eagerly.

Some new vcpu_arch fields are added to make all this work.  Because
host FPSIMD state can now be saved back directly into current's
thread_struct as appropriate, host_cpu_context is no longer used
for preserving the FPSIMD state.  However, it is still needed for
preserving other things such as the host's system registers.  To
avoid ABI churn, the redundant storage space in host_cpu_context is
not removed for now.

arch/arm is not addressed by this patch and continues to use its
current save/restore logic.  It could provide implementations of
the helpers later if desired.
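
For orientation, a sketch of where the hooks sit in the common
arm/arm64 run loop (simplified, not the literal code):

	kvm_arch_vcpu_load()			/* vcpu_load path */
		kvm_arch_vcpu_load_fp(vcpu);

	kvm_arch_vcpu_ioctl_run()
		for (;;) {
			/* ... enter guest, take exit ... */
			kvm_arch_vcpu_ctxsync_fp(vcpu);
			/* only now is it safe to re-enable interrupts */
		}

	kvm_arch_vcpu_put()			/* vcpu_put path */
		kvm_arch_vcpu_put_fp(vcpu);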

Signed-off-by: Dave Martin <Dave.Martin@arm.com>

---

Changes since v5:

Requested by Marc Zyngier:

 * Migrate from adding new bool flags in vcpu_arch to adding bits in
   vcpu_arch.flags.
---
 arch/arm/include/asm/kvm_host.h   |   8 +++
 arch/arm64/include/asm/fpsimd.h   |   5 ++
 arch/arm64/include/asm/kvm_asm.h  |   2 +
 arch/arm64/include/asm/kvm_host.h |  16 ++++++
 arch/arm64/kernel/fpsimd.c        |  15 +++++-
 arch/arm64/kvm/Kconfig            |   1 +
 arch/arm64/kvm/Makefile           |   2 +-
 arch/arm64/kvm/fpsimd.c           | 111 ++++++++++++++++++++++++++++++++++++++
 arch/arm64/kvm/hyp/switch.c       |  50 +++++++++--------
 virt/kvm/arm/arm.c                |   4 ++
 10 files changed, 185 insertions(+), 29 deletions(-)
 create mode 100644 arch/arm64/kvm/fpsimd.c

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index c7c28c8..4cac8d1 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -303,6 +303,14 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
 int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
 			       struct kvm_device_attr *attr);
 
+/*
+ * VFP/NEON switching is all done by the hyp switch code, so no need to
+ * coordinate with host context handling for this state:
+ */
+static inline void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu) {}
+static inline void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu) {}
+static inline void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) {}
+
 /* All host FP/SIMD state is restored on guest exit, so nothing to save: */
 static inline void kvm_fpsimd_flush_cpu_state(void) {}
 
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index aa7162a..aa60895 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -41,6 +41,8 @@ struct task_struct;
 extern void fpsimd_save_state(struct user_fpsimd_state *state);
 extern void fpsimd_load_state(struct user_fpsimd_state *state);
 
+extern void fpsimd_save(void);
+
 extern void fpsimd_thread_switch(struct task_struct *next);
 extern void fpsimd_flush_thread(void);
 
@@ -49,7 +51,10 @@ extern void fpsimd_preserve_current_state(void);
 extern void fpsimd_restore_current_state(void);
 extern void fpsimd_update_current_state(struct user_fpsimd_state const *state);
 
+extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state);
+
 extern void fpsimd_flush_task_state(struct task_struct *target);
+extern void fpsimd_flush_cpu_state(void);
 extern void sve_flush_cpu_state(void);
 
 /* Maximum VL that SVE VL-agnostic software can transparently support */
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 9699c13..6429eb6 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -33,6 +33,8 @@
 /* vcpu_arch flags field values: */
 #define KVM_ARM64_DEBUG_DIRTY_SHIFT	0
 #define KVM_ARM64_DEBUG_DIRTY		(1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
+#define KVM_ARM64_FP_ENABLED		(1 << 1) /* guest FP regs loaded */
+#define KVM_ARM64_HOST_SVE_IN_USE	(1 << 2) /* backup for host TIF_SVE */
 
 /* Translate a kernel address of @sym into its equivalent linear mapping */
 #define kvm_ksym_ref(sym)						\
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index a5ed95c..e4bdc75 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -30,6 +30,7 @@
 #include <asm/kvm.h>
 #include <asm/kvm_asm.h>
 #include <asm/kvm_mmio.h>
+#include <asm/thread_info.h>
 
 #define __KVM_HAVE_ARCH_INTC_INITIALIZED
 
@@ -238,6 +239,10 @@ struct kvm_vcpu_arch {
 
 	/* Pointer to host CPU context */
 	kvm_cpu_context_t *host_cpu_context;
+
+	struct thread_info *host_thread_info;	/* hyp VA */
+	struct user_fpsimd_state *host_fpsimd_state;	/* hyp VA */
+
 	struct {
 		/* {Break,watch}point registers */
 		struct kvm_guest_debug_arch regs;
@@ -420,6 +425,17 @@ static inline void __cpu_init_stage2(void)
 		  "PARange is %d bits, unsupported configuration!", parange);
 }
 
+/* Guest/host FPSIMD coordination helpers */
+int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
+void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
+void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu);
+void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
+
+static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
+{
+	return kvm_arch_vcpu_run_map_fp(vcpu);
+}
+
 /*
  * All host FP/SIMD state is restored on guest exit, so nothing needs
  * doing here except in the SVE case:
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 5fc0595..e7349b5 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -275,7 +275,7 @@ static void task_fpsimd_load(void)
  *
  * Softirqs (and preemption) must be disabled.
  */
-static void fpsimd_save(void)
+void fpsimd_save(void)
 {
 	struct user_fpsimd_state *st = __this_cpu_read(fpsimd_last_state.st);
 
@@ -1008,6 +1008,17 @@ static void fpsimd_bind_to_cpu(void)
 	current->thread.fpsimd_cpu = smp_processor_id();
 }
 
+void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st)
+{
+	struct fpsimd_last_state_struct *last =
+		this_cpu_ptr(&fpsimd_last_state);
+
+	WARN_ON(!in_softirq() && !irqs_disabled());
+
+	last->st = st;
+	last->sve_in_use = false;
+}
+
 /*
  * Load the userland FPSIMD state of 'current' from memory, but only if the
  * FPSIMD state already held in the registers is /not/ the most recent FPSIMD
@@ -1060,7 +1071,7 @@ void fpsimd_flush_task_state(struct task_struct *t)
 	t->thread.fpsimd_cpu = NR_CPUS;
 }
 
-static inline void fpsimd_flush_cpu_state(void)
+void fpsimd_flush_cpu_state(void)
 {
 	__this_cpu_write(fpsimd_last_state.st, NULL);
 }
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index a2e3a5a..47b23bf 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -39,6 +39,7 @@ config KVM
 	select HAVE_KVM_IRQ_ROUTING
 	select IRQ_BYPASS_MANAGER
 	select HAVE_KVM_IRQ_BYPASS
+	select HAVE_KVM_VCPU_RUN_PID_CHANGE
 	---help---
 	  Support hosting virtualized guest machines.
 	  We don't support KVM with 16K page tables yet, due to the multiple
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 93afff9..0f2a135 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -19,7 +19,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/psci.o $(KVM)/arm/perf.o
 kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o va_layout.o
 kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
 kvm-$(CONFIG_KVM_ARM_HOST) += guest.o debug.o reset.o sys_regs.o sys_regs_generic_v8.o
-kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o
+kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o fpsimd.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/aarch32.o
 
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic.o
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
new file mode 100644
index 0000000..7059275
--- /dev/null
+++ b/arch/arm64/kvm/fpsimd.c
@@ -0,0 +1,111 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * arch/arm64/kvm/fpsimd.c: Guest/host FPSIMD context coordination helpers
+ *
+ * Copyright 2018 Arm Limited
+ * Author: Dave Martin <Dave.Martin@arm.com>
+ */
+#include <linux/bottom_half.h>
+#include <linux/sched.h>
+#include <linux/thread_info.h>
+#include <linux/kvm_host.h>
+#include <asm/kvm_asm.h>
+#include <asm/kvm_host.h>
+#include <asm/kvm_mmu.h>
+
+/*
+ * Called on entry to KVM_RUN unless this vcpu previously ran at least
+ * once and the most recent prior KVM_RUN for this vcpu was called from
+ * the same task as current (highly likely).
+ *
+ * This is guaranteed to execute before kvm_arch_vcpu_load_fp(vcpu),
+ * such that on entering hyp the relevant parts of current are already
+ * mapped.
+ */
+int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
+{
+	int ret;
+
+	struct thread_info *ti = &current->thread_info;
+	struct user_fpsimd_state *fpsimd = &current->thread.uw.fpsimd_state;
+
+	/*
+	 * Make sure the host task thread flags and fpsimd state are
+	 * visible to hyp:
+	 */
+	ret = create_hyp_mappings(ti, ti + 1, PAGE_HYP);
+	if (ret)
+		goto error;
+
+	ret = create_hyp_mappings(fpsimd, fpsimd + 1, PAGE_HYP);
+	if (ret)
+		goto error;
+
+	vcpu->arch.host_thread_info = kern_hyp_va(ti);
+	vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);
+error:
+	return ret;
+}
+
+/*
+ * Prepare vcpu for saving the host's FPSIMD state and loading the guest's.
+ * The actual loading is done by the FPSIMD access trap taken to hyp.
+ *
+ * Here, we just set the correct metadata to indicate that the FPSIMD
+ * state in the cpu regs (if any) belongs to current, and where to write
+ * it back to if/when a FPSIMD access trap is taken.
+ *
+ * TIF_SVE is backed up here, since it may get clobbered with guest state.
+ * This flag is restored by kvm_arch_vcpu_put_fp(vcpu).
+ */
+void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
+{
+	BUG_ON(system_supports_sve());
+	BUG_ON(!current->mm);
+
+	vcpu->arch.flags &= ~(KVM_ARM64_FP_ENABLED | KVM_ARM64_HOST_SVE_IN_USE);
+	if (test_thread_flag(TIF_SVE))
+		vcpu->arch.flags |= KVM_ARM64_HOST_SVE_IN_USE;
+
+	vcpu->arch.host_fpsimd_state =
+		kern_hyp_va(&current->thread.uw.fpsimd_state);
+}
+
+/*
+ * If the guest FPSIMD state was loaded, update the host's context
+ * tracking data to mark the CPU FPSIMD regs as dirty for the vcpu,
+ * so that they will be written back if the kernel clobbers them due
+ * to kernel-mode NEON before re-entry into the guest.
+ */
+void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
+{
+	WARN_ON_ONCE(!irqs_disabled());
+
+	if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
+		fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.gp_regs.fp_regs);
+		clear_thread_flag(TIF_FOREIGN_FPSTATE);
+		clear_thread_flag(TIF_SVE);
+	}
+}
+
+/*
+ * Write back the vcpu FPSIMD regs if they are dirty, and invalidate the
+ * cpu FPSIMD regs so that they can't be spuriously reused if this vcpu
+ * disappears and another task or vcpu appears that recycles the same
+ * struct fpsimd_state.
+ */
+void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
+{
+	local_bh_disable();
+
+	if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
+		fpsimd_save();
+		fpsimd_flush_cpu_state();
+		set_thread_flag(TIF_FOREIGN_FPSTATE);
+	}
+
+	update_thread_flag(TIF_SVE,
+			   vcpu->arch.flags & KVM_ARM64_HOST_SVE_IN_USE);
+
+	local_bh_enable();
+}
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index c0796c4..75c3b65 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -27,15 +27,16 @@
 #include <asm/kvm_mmu.h>
 #include <asm/fpsimd.h>
 #include <asm/debug-monitors.h>
+#include <asm/thread_info.h>
 
-static bool __hyp_text __fpsimd_enabled_nvhe(void)
+static bool __hyp_text update_fp_enabled(struct kvm_vcpu *vcpu)
 {
-	return !(read_sysreg(cptr_el2) & CPTR_EL2_TFP);
-}
+	if (vcpu->arch.host_thread_info->flags & _TIF_FOREIGN_FPSTATE) {
+		vcpu->arch.host_fpsimd_state = NULL;
+		vcpu->arch.flags &= ~KVM_ARM64_FP_ENABLED;
+	}
 
-static bool fpsimd_enabled_vhe(void)
-{
-	return !!(read_sysreg(cpacr_el1) & CPACR_EL1_FPEN);
+	return !!(vcpu->arch.flags & KVM_ARM64_FP_ENABLED);
 }
 
 /* Save the 32-bit only FPSIMD system register state */
@@ -92,7 +93,10 @@ static void activate_traps_vhe(struct kvm_vcpu *vcpu)
 
 	val = read_sysreg(cpacr_el1);
 	val |= CPACR_EL1_TTA;
-	val &= ~(CPACR_EL1_FPEN | CPACR_EL1_ZEN);
+	val &= ~CPACR_EL1_ZEN;
+	if (!update_fp_enabled(vcpu))
+		val &= ~CPACR_EL1_FPEN;
+
 	write_sysreg(val, cpacr_el1);
 
 	write_sysreg(kvm_get_hyp_vector(), vbar_el1);
@@ -105,7 +109,10 @@ static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu)
 	__activate_traps_common(vcpu);
 
 	val = CPTR_EL2_DEFAULT;
-	val |= CPTR_EL2_TTA | CPTR_EL2_TFP | CPTR_EL2_TZ;
+	val |= CPTR_EL2_TTA | CPTR_EL2_TZ;
+	if (!update_fp_enabled(vcpu))
+		val |= CPTR_EL2_TFP;
+
 	write_sysreg(val, cptr_el2);
 }
 
@@ -321,8 +328,6 @@ static bool __hyp_text __skip_instr(struct kvm_vcpu *vcpu)
 void __hyp_text __hyp_switch_fpsimd(u64 esr __always_unused,
 				    struct kvm_vcpu *vcpu)
 {
-	kvm_cpu_context_t *host_ctxt;
-
 	if (has_vhe())
 		write_sysreg(read_sysreg(cpacr_el1) | CPACR_EL1_FPEN,
 			     cpacr_el1);
@@ -332,14 +337,19 @@ void __hyp_text __hyp_switch_fpsimd(u64 esr __always_unused,
 
 	isb();
 
-	host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context);
-	__fpsimd_save_state(&host_ctxt->gp_regs.fp_regs);
+	if (vcpu->arch.host_fpsimd_state) {
+		__fpsimd_save_state(vcpu->arch.host_fpsimd_state);
+		vcpu->arch.host_fpsimd_state = NULL;
+	}
+
 	__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
 
 	/* Skip restoring fpexc32 for AArch64 guests */
 	if (!(read_sysreg(hcr_el2) & HCR_RW))
 		write_sysreg(vcpu->arch.ctxt.sys_regs[FPEXC32_EL2],
 			     fpexc32_el2);
+
+	vcpu->arch.flags |= KVM_ARM64_FP_ENABLED;
 }
 
 /*
@@ -418,7 +428,6 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpu_context *host_ctxt;
 	struct kvm_cpu_context *guest_ctxt;
-	bool fp_enabled;
 	u64 exit_code;
 
 	host_ctxt = vcpu->arch.host_cpu_context;
@@ -440,19 +449,14 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
 		/* And we're baaack! */
 	} while (fixup_guest_exit(vcpu, &exit_code));
 
-	fp_enabled = fpsimd_enabled_vhe();
-
 	sysreg_save_guest_state_vhe(guest_ctxt);
 
 	__deactivate_traps(vcpu);
 
 	sysreg_restore_host_state_vhe(host_ctxt);
 
-	if (fp_enabled) {
-		__fpsimd_save_state(&guest_ctxt->gp_regs.fp_regs);
-		__fpsimd_restore_state(&host_ctxt->gp_regs.fp_regs);
+	if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED)
 		__fpsimd_save_fpexc32(vcpu);
-	}
 
 	__debug_switch_to_host(vcpu);
 
@@ -464,7 +468,6 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpu_context *host_ctxt;
 	struct kvm_cpu_context *guest_ctxt;
-	bool fp_enabled;
 	u64 exit_code;
 
 	vcpu = kern_hyp_va(vcpu);
@@ -496,8 +499,6 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
 		/* And we're baaack! */
 	} while (fixup_guest_exit(vcpu, &exit_code));
 
-	fp_enabled = __fpsimd_enabled_nvhe();
-
 	__sysreg_save_state_nvhe(guest_ctxt);
 	__sysreg32_save_state(vcpu);
 	__timer_disable_traps(vcpu);
@@ -508,11 +509,8 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
 
 	__sysreg_restore_state_nvhe(host_ctxt);
 
-	if (fp_enabled) {
-		__fpsimd_save_state(&guest_ctxt->gp_regs.fp_regs);
-		__fpsimd_restore_state(&host_ctxt->gp_regs.fp_regs);
+	if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED)
 		__fpsimd_save_fpexc32(vcpu);
-	}
 
 	/*
 	 * This must come after restoring the host sysregs, since a non-VHE
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index a4c1b76..bee226c 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -363,10 +363,12 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	kvm_vgic_load(vcpu);
 	kvm_timer_vcpu_load(vcpu);
 	kvm_vcpu_load_sysregs(vcpu);
+	kvm_arch_vcpu_load_fp(vcpu);
 }
 
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 {
+	kvm_arch_vcpu_put_fp(vcpu);
 	kvm_vcpu_put_sysregs(vcpu);
 	kvm_timer_vcpu_put(vcpu);
 	kvm_vgic_put(vcpu);
@@ -778,6 +780,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		if (static_branch_unlikely(&userspace_irqchip_in_use))
 			kvm_timer_sync_hwstate(vcpu);
 
+		kvm_arch_vcpu_ctxsync_fp(vcpu);
+
 		/*
 		 * We may have taken a host interrupt in HYP mode (ie
 		 * while executing the guest). This interrupt is still
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v6 08/15] arm64/sve: Move read_zcr_features() out of cpufeature.h
  2018-05-08 16:44 ` Dave Martin
@ 2018-05-08 16:44   ` Dave Martin
  -1 siblings, 0 replies; 62+ messages in thread
From: Dave Martin @ 2018-05-08 16:44 UTC (permalink / raw)
  To: kvmarm
  Cc: Marc Zyngier, Catalin Marinas, Christoffer Dall,
	linux-arm-kernel, Ard Biesheuvel

Having read_zcr_features() inline in cpufeature.h results in that
header requiring #includes which make it hard to include
<asm/fpsimd.h> elsewhere without triggering header inclusion
cycles.

This is not a hot-path function and arguably should not be in
cpufeature.h in the first place, so this patch moves it to
fpsimd.c, compiled conditionally if CONFIG_ARM64_SVE=y.

This allows some SVE-related #includes to be dropped from
cpufeature.h, which will ease future maintenance.

A couple of missing #includes of <asm/fpsimd.h> are exposed by this
change under arch/arm64/.  This patch adds the missing #includes as
necessary.

No functional change.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/cpufeature.h | 29 -----------------------------
 arch/arm64/include/asm/fpsimd.h     |  2 ++
 arch/arm64/include/asm/processor.h  |  1 +
 arch/arm64/kernel/fpsimd.c          | 28 ++++++++++++++++++++++++++++
 arch/arm64/kernel/ptrace.c          |  1 +
 5 files changed, 32 insertions(+), 29 deletions(-)

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 09b0f2a..0a6b713 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -11,9 +11,7 @@
 
 #include <asm/cpucaps.h>
 #include <asm/cputype.h>
-#include <asm/fpsimd.h>
 #include <asm/hwcap.h>
-#include <asm/sigcontext.h>
 #include <asm/sysreg.h>
 
 /*
@@ -510,33 +508,6 @@ static inline bool system_supports_sve(void)
 		cpus_have_const_cap(ARM64_SVE);
 }
 
-/*
- * Read the pseudo-ZCR used by cpufeatures to identify the supported SVE
- * vector length.
- *
- * Use only if SVE is present.
- * This function clobbers the SVE vector length.
- */
-static inline u64 read_zcr_features(void)
-{
-	u64 zcr;
-	unsigned int vq_max;
-
-	/*
-	 * Set the maximum possible VL, and write zeroes to all other
-	 * bits to see if they stick.
-	 */
-	sve_kernel_enable(NULL);
-	write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1);
-
-	zcr = read_sysreg_s(SYS_ZCR_EL1);
-	zcr &= ~(u64)ZCR_ELx_LEN_MASK; /* find sticky 1s outside LEN field */
-	vq_max = sve_vq_from_vl(sve_get_vl());
-	zcr |= vq_max - 1; /* set LEN field to maximum effective value */
-
-	return zcr;
-}
-
 #endif /* __ASSEMBLY__ */
 
 #endif
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index aa60895..f4d80ae 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -68,6 +68,8 @@ extern unsigned int sve_get_vl(void);
 struct arm64_cpu_capabilities;
 extern void sve_kernel_enable(const struct arm64_cpu_capabilities *__unused);
 
+extern u64 read_zcr_features(void);
+
 extern int __ro_after_init sve_max_vl;
 
 #ifdef CONFIG_ARM64_SVE
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 7675989..f902b6d 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -40,6 +40,7 @@
 
 #include <asm/alternative.h>
 #include <asm/cpufeature.h>
+#include <asm/fpsimd.h>
 #include <asm/hw_breakpoint.h>
 #include <asm/lse.h>
 #include <asm/pgtable-hwdef.h>
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index e7349b5..e7f99f2 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -37,6 +37,7 @@
 #include <linux/sched/task_stack.h>
 #include <linux/signal.h>
 #include <linux/slab.h>
+#include <linux/stddef.h>
 #include <linux/sysctl.h>
 
 #include <asm/esr.h>
@@ -764,6 +765,33 @@ void sve_kernel_enable(const struct arm64_cpu_capabilities *__always_unused p)
 	isb();
 }
 
+/*
+ * Read the pseudo-ZCR used by cpufeatures to identify the supported SVE
+ * vector length.
+ *
+ * Use only if SVE is present.
+ * This function clobbers the SVE vector length.
+ */
+u64 read_zcr_features(void)
+{
+	u64 zcr;
+	unsigned int vq_max;
+
+	/*
+	 * Set the maximum possible VL, and write zeroes to all other
+	 * bits to see if they stick.
+	 */
+	sve_kernel_enable(NULL);
+	write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1);
+
+	zcr = read_sysreg_s(SYS_ZCR_EL1);
+	zcr &= ~(u64)ZCR_ELx_LEN_MASK; /* find sticky 1s outside LEN field */
+	vq_max = sve_vq_from_vl(sve_get_vl());
+	zcr |= vq_max - 1; /* set LEN field to maximum effective value */
+
+	return zcr;
+}
+
 void __init sve_setup(void)
 {
 	u64 zcr;
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 7ff81fe..78889c4 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -44,6 +44,7 @@
 #include <asm/compat.h>
 #include <asm/cpufeature.h>
 #include <asm/debug-monitors.h>
+#include <asm/fpsimd.h>
 #include <asm/pgtable.h>
 #include <asm/stacktrace.h>
 #include <asm/syscall.h>
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v6 09/15] arm64/sve: Switch sve_pffr() argument from task to thread
  2018-05-08 16:44 ` Dave Martin
@ 2018-05-08 16:44   ` Dave Martin
  -1 siblings, 0 replies; 62+ messages in thread
From: Dave Martin @ 2018-05-08 16:44 UTC (permalink / raw)
  To: kvmarm
  Cc: Marc Zyngier, Catalin Marinas, Christoffer Dall,
	linux-arm-kernel, Ard Biesheuvel

sve_pffr(), which is used to derive the base address used for
low-level SVE save/restore routines, currently takes the relevant
task_struct as an argument.

The only accessed fields are actually part of thread_struct, so
this patch changes the argument type accordingly.  This is done in
preparation for moving this function to a header, where we do not
want to have to include <linux/sched.h> due to the consequent
circular #include problems.

No functional change.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/kernel/fpsimd.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index e7f99f2..d811679 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -44,6 +44,7 @@
 #include <asm/fpsimd.h>
 #include <asm/cpufeature.h>
 #include <asm/cputype.h>
+#include <asm/processor.h>
 #include <asm/simd.h>
 #include <asm/sigcontext.h>
 #include <asm/sysreg.h>
@@ -167,10 +168,9 @@ static size_t sve_ffr_offset(int vl)
 	return SVE_SIG_FFR_OFFSET(sve_vq_from_vl(vl)) - SVE_SIG_REGS_OFFSET;
 }
 
-static void *sve_pffr(struct task_struct *task)
+static void *sve_pffr(struct thread_struct *thread)
 {
-	return (char *)task->thread.sve_state +
-		sve_ffr_offset(task->thread.sve_vl);
+	return (char *)thread->sve_state + sve_ffr_offset(thread->sve_vl);
 }
 
 static void change_cpacr(u64 val, u64 mask)
@@ -253,7 +253,7 @@ static void task_fpsimd_load(void)
 	WARN_ON(!in_softirq() && !irqs_disabled());
 
 	if (system_supports_sve() && test_thread_flag(TIF_SVE))
-		sve_load_state(sve_pffr(current),
+		sve_load_state(sve_pffr(&current->thread),
 			       &current->thread.uw.fpsimd_state.fpsr,
 			       sve_vq_from_vl(current->thread.sve_vl) - 1);
 	else
@@ -294,7 +294,7 @@ void fpsimd_save(void)
 				return;
 			}
 
-			sve_save_state(sve_pffr(current), &st->fpsr);
+			sve_save_state(sve_pffr(&current->thread), &st->fpsr);
 		} else
 			fpsimd_save_state(st);
 	}
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v6 10/15] arm64/sve: Move sve_pffr() to fpsimd.h and make inline
  2018-05-08 16:44 ` Dave Martin
@ 2018-05-08 16:44   ` Dave Martin
  -1 siblings, 0 replies; 62+ messages in thread
From: Dave Martin @ 2018-05-08 16:44 UTC (permalink / raw)
  To: kvmarm
  Cc: Marc Zyngier, Catalin Marinas, Christoffer Dall,
	linux-arm-kernel, Ard Biesheuvel

In order to make sve_save_state()/sve_load_state() more easily
reusable and to get rid of a potential branch on context switch
critical paths, this patch makes sve_pffr() inline and moves it to
fpsimd.h.

<asm/processor.h> must be included in fpsimd.h in order to make
this work, and this creates an #include cycle that is tricky to
avoid without modifying core code, due to the way the PR_SVE_*()
prctl helpers are included in the core prctl implementation.

Instead of breaking the cycle, this patch defers inclusion of
<asm/fpsimd.h> in <asm/processor.h> until the point where it is
actually needed: i.e., immediately before the prctl definitions.

No functional change.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/fpsimd.h    | 13 +++++++++++++
 arch/arm64/include/asm/processor.h |  3 ++-
 arch/arm64/kernel/fpsimd.c         | 12 ------------
 3 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index f4d80ae..0026a92 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -18,6 +18,8 @@
 
 #include <asm/ptrace.h>
 #include <asm/errno.h>
+#include <asm/processor.h>
+#include <asm/sigcontext.h>
 
 #ifndef __ASSEMBLY__
 
@@ -60,6 +62,17 @@ extern void sve_flush_cpu_state(void);
 /* Maximum VL that SVE VL-agnostic software can transparently support */
 #define SVE_VL_ARCH_MAX 0x100
 
+/* Offset of FFR in the SVE register dump */
+static inline size_t sve_ffr_offset(int vl)
+{
+	return SVE_SIG_FFR_OFFSET(sve_vq_from_vl(vl)) - SVE_SIG_REGS_OFFSET;
+}
+
+static inline void *sve_pffr(struct thread_struct *thread)
+{
+	return (char *)thread->sve_state + sve_ffr_offset(thread->sve_vl);
+}
+
 extern void sve_save_state(void *state, u32 *pfpsr);
 extern void sve_load_state(void const *state, u32 const *pfpsr,
 			   unsigned long vq_minus_1);
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index f902b6d..ebaadb1 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -40,7 +40,6 @@
 
 #include <asm/alternative.h>
 #include <asm/cpufeature.h>
-#include <asm/fpsimd.h>
 #include <asm/hw_breakpoint.h>
 #include <asm/lse.h>
 #include <asm/pgtable-hwdef.h>
@@ -245,6 +244,8 @@ void cpu_enable_pan(const struct arm64_cpu_capabilities *__unused);
 void cpu_enable_cache_maint_trap(const struct arm64_cpu_capabilities *__unused);
 void cpu_clear_disr(const struct arm64_cpu_capabilities *__unused);
 
+#include <asm/fpsimd.h>
+
 /* Userspace interface for PR_SVE_{SET,GET}_VL prctl()s: */
 #define SVE_SET_VL(arg)	sve_set_current_vl(arg)
 #define SVE_GET_VL()	sve_get_current_vl()
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index d811679..152d834 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -161,18 +161,6 @@ static void sve_free(struct task_struct *task)
 	__sve_free(task);
 }
 
-
-/* Offset of FFR in the SVE register dump */
-static size_t sve_ffr_offset(int vl)
-{
-	return SVE_SIG_FFR_OFFSET(sve_vq_from_vl(vl)) - SVE_SIG_REGS_OFFSET;
-}
-
-static void *sve_pffr(struct thread_struct *thread)
-{
-	return (char *)thread->sve_state + sve_ffr_offset(thread->sve_vl);
-}
-
 static void change_cpacr(u64 val, u64 mask)
 {
 	u64 cpacr = read_sysreg(CPACR_EL1);
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v6 11/15] KVM: arm64: Save host SVE context as appropriate
  2018-05-08 16:44 ` Dave Martin
@ 2018-05-08 16:44   ` Dave Martin
  -1 siblings, 0 replies; 62+ messages in thread
From: Dave Martin @ 2018-05-08 16:44 UTC (permalink / raw)
  To: kvmarm
  Cc: Marc Zyngier, Catalin Marinas, Christoffer Dall,
	linux-arm-kernel, Ard Biesheuvel

This patch adds SVE context saving to the hyp FPSIMD context switch
path.  This means that it is no longer necessary to save the host
SVE state in advance of entering the guest when SVE is in use.

In order to avoid adding pointless complexity to the code, VHE is
assumed if SVE is in use.  VHE is an architectural prerequisite for
SVE, so there is no good reason to turn CONFIG_ARM64_VHE off in
kernels that support both SVE and KVM.

Historically, software models exist that can expose the
architecturally invalid configuration of SVE without VHE, so if this
situation is detected this patch warns and refuses to initialise KVM.
Doing this check once in kvm_arch_init(), after cpufeatures setup has
completed, avoids race issues between KVM and SVE initialisation.
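
In outline, the host-save path taken on the first guest FP/SIMD trap
becomes the following (a condensed sketch of the switch.c hunk below,
not additional code):

	struct user_fpsimd_state *host_fpsimd = vcpu->arch.host_fpsimd_state;

	if (host_fpsimd) {	/* host state not saved yet */
		if (system_supports_sve() &&
		    (vcpu->arch.flags & KVM_ARM64_HOST_SVE_IN_USE)) {
			/*
			 * host_fpsimd points into a thread_struct, so the
			 * enclosing thread (and its sve_state) can be
			 * recovered with container_of():
			 */
			struct thread_struct *thread = container_of(
				host_fpsimd,
				struct thread_struct, uw.fpsimd_state);

			sve_save_state(sve_pffr(thread), &host_fpsimd->fpsr);
		} else {
			__fpsimd_save_state(host_fpsimd);
		}

		vcpu->arch.host_fpsimd_state = NULL;	/* save at most once */
	}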

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>

---

Changes since v5:

Requested by Marc Zyngier:

 * Migrate to flag bits in vcpu_arch.flags instead of adding bools in
   vcpu_arch.

 * Move system_supports_sve() && !has_vhe() sanity check to
   kvm_arch_init() instead of doing it each time a VM is created.
   cpufeatures initialisation should be complete before any
   module_init() call runs.
---
 arch/arm64/Kconfig          |  7 +++++++
 arch/arm64/kvm/fpsimd.c     |  1 -
 arch/arm64/kvm/hyp/switch.c | 22 ++++++++++++++++++++--
 virt/kvm/arm/arm.c          | 18 ++++++++++++++++++
 4 files changed, 45 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index eb2cf49..b0d3820 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1130,6 +1130,7 @@ endmenu
 config ARM64_SVE
 	bool "ARM Scalable Vector Extension support"
 	default y
+	depends on !KVM || ARM64_VHE
 	help
 	  The Scalable Vector Extension (SVE) is an extension to the AArch64
 	  execution state which complements and extends the SIMD functionality
@@ -1155,6 +1156,12 @@ config ARM64_SVE
 	  booting the kernel.  If unsure and you are not observing these
 	  symptoms, you should assume that it is safe to say Y.
 
+	  CPUs that support SVE are architecturally required to support the
+	  Virtualization Host Extensions (VHE), so the kernel makes no
+	  provision for supporting SVE alongside KVM without VHE enabled.
+	  Thus, you will need to enable CONFIG_ARM64_VHE if you want to support
+	  KVM in the same kernel image.
+
 config ARM64_MODULE_PLTS
 	bool
 	select HAVE_MOD_ARCH_SPECIFIC
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 7059275..4c47b55 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -60,7 +60,6 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
  */
 void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
 {
-	BUG_ON(system_supports_sve());
 	BUG_ON(!current->mm);
 
 	vcpu->arch.flags &= ~(KVM_ARM64_FP_ENABLED | KVM_ARM64_HOST_SVE_IN_USE);
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 75c3b65..6826f2d 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -21,12 +21,14 @@
 
 #include <kvm/arm_psci.h>
 
+#include <asm/cpufeature.h>
 #include <asm/kvm_asm.h>
 #include <asm/kvm_emulate.h>
 #include <asm/kvm_hyp.h>
 #include <asm/kvm_mmu.h>
 #include <asm/fpsimd.h>
 #include <asm/debug-monitors.h>
+#include <asm/processor.h>
 #include <asm/thread_info.h>
 
 static bool __hyp_text update_fp_enabled(struct kvm_vcpu *vcpu)
@@ -328,6 +330,8 @@ static bool __hyp_text __skip_instr(struct kvm_vcpu *vcpu)
 void __hyp_text __hyp_switch_fpsimd(u64 esr __always_unused,
 				    struct kvm_vcpu *vcpu)
 {
+	struct user_fpsimd_state *host_fpsimd = vcpu->arch.host_fpsimd_state;
+
 	if (has_vhe())
 		write_sysreg(read_sysreg(cpacr_el1) | CPACR_EL1_FPEN,
 			     cpacr_el1);
@@ -337,8 +341,22 @@ void __hyp_text __hyp_switch_fpsimd(u64 esr __always_unused,
 
 	isb();
 
-	if (vcpu->arch.host_fpsimd_state) {
-		__fpsimd_save_state(vcpu->arch.host_fpsimd_state);
+	if (host_fpsimd) {
+		/*
+		 * In the SVE case, VHE is assumed: it is enforced by
+		 * Kconfig and kvm_arch_init().
+		 */
+		if (system_supports_sve() &&
+		    (vcpu->arch.flags & KVM_ARM64_HOST_SVE_IN_USE)) {
+			struct thread_struct *thread = container_of(
+				host_fpsimd,
+				struct thread_struct, uw.fpsimd_state);
+
+			sve_save_state(sve_pffr(thread), &host_fpsimd->fpsr);
+		} else {
+			__fpsimd_save_state(vcpu->arch.host_fpsimd_state);
+		}
+
 		vcpu->arch.host_fpsimd_state = NULL;
 	}
 
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index bee226c..379e8a9 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -16,6 +16,7 @@
  * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
  */
 
+#include <linux/bug.h>
 #include <linux/cpu_pm.h>
 #include <linux/errno.h>
 #include <linux/err.h>
@@ -41,6 +42,7 @@
 #include <asm/mman.h>
 #include <asm/tlbflush.h>
 #include <asm/cacheflush.h>
+#include <asm/cpufeature.h>
 #include <asm/virt.h>
 #include <asm/kvm_arm.h>
 #include <asm/kvm_asm.h>
@@ -1582,6 +1584,22 @@ int kvm_arch_init(void *opaque)
 		}
 	}
 
+	/*
+	 * VHE is a prerequisite for SVE in the Arm architecture, and
+	 * Kconfig ensures that if system_supports_sve() here then
+	 * CONFIG_ARM64_VHE is enabled, so if VHE support wasn't already
+	 * detected and enabled, the CPU is architecturally
+	 * noncompliant.
+	 *
+	 * Just in case this mismatch is seen, detect it, warn and give
+	 * up.  Supporting this forbidden configuration in Hyp would be
+	 * pointless.
+	 */
+	if (system_supports_sve() && !has_vhe()) {
+		kvm_pr_unimpl("SVE system without VHE unsupported.  Broken cpu?");
+		return -ENODEV;
+	}
+
 	err = init_common_resources();
 	if (err)
 		return err;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v6 12/15] KVM: arm64: Remove eager host SVE state saving
  2018-05-08 16:44 ` Dave Martin
@ 2018-05-08 16:44   ` Dave Martin
  -1 siblings, 0 replies; 62+ messages in thread
From: Dave Martin @ 2018-05-08 16:44 UTC (permalink / raw)
  To: kvmarm
  Cc: Marc Zyngier, Catalin Marinas, Christoffer Dall,
	linux-arm-kernel, Ard Biesheuvel

Now that the host SVE context can be saved on demand from Hyp,
there is no longer any need to save this state in advance before
entering the guest.

This patch removes the relevant call to
kvm_fpsimd_flush_cpu_state().

Since the problem that function was intended to solve no longer
exists, the function and its dependencies are also deleted.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
---
 arch/arm/include/asm/kvm_host.h   |  3 ---
 arch/arm64/include/asm/kvm_host.h | 10 ----------
 arch/arm64/kernel/fpsimd.c        | 21 ---------------------
 virt/kvm/arm/arm.c                |  3 ---
 4 files changed, 37 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 4cac8d1..39f051f 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -311,9 +311,6 @@ static inline void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_vcpu_park_fp(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) {}
 
-/* All host FP/SIMD state is restored on guest exit, so nothing to save: */
-static inline void kvm_fpsimd_flush_cpu_state(void) {}
-
 static inline void kvm_arm_vhe_guest_enter(void) {}
 static inline void kvm_arm_vhe_guest_exit(void) {}
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e4bdc75..239097c 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -436,16 +436,6 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
 	return kvm_arch_vcpu_run_map_fp(vcpu);
 }
 
-/*
- * All host FP/SIMD state is restored on guest exit, so nothing needs
- * doing here except in the SVE case:
-*/
-static inline void kvm_fpsimd_flush_cpu_state(void)
-{
-	if (system_supports_sve())
-		sve_flush_cpu_state();
-}
-
 static inline void kvm_arm_vhe_guest_enter(void)
 {
 	local_daif_mask();
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 152d834..3fc3449 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -120,7 +120,6 @@
  */
 struct fpsimd_last_state_struct {
 	struct user_fpsimd_state *st;
-	bool sve_in_use;
 };
 
 static DEFINE_PER_CPU(struct fpsimd_last_state_struct, fpsimd_last_state);
@@ -1020,7 +1019,6 @@ static void fpsimd_bind_to_cpu(void)
 		this_cpu_ptr(&fpsimd_last_state);
 
 	last->st = &current->thread.uw.fpsimd_state;
-	last->sve_in_use = test_thread_flag(TIF_SVE);
 	current->thread.fpsimd_cpu = smp_processor_id();
 }
 
@@ -1032,7 +1030,6 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st)
 	WARN_ON(!in_softirq() && !irqs_disabled());
 
 	last->st = st;
-	last->sve_in_use = false;
 }
 
 /*
@@ -1092,24 +1089,6 @@ void fpsimd_flush_cpu_state(void)
 	__this_cpu_write(fpsimd_last_state.st, NULL);
 }
 
-/*
- * Invalidate any task SVE state currently held in this CPU's regs.
- *
- * This is used to prevent the kernel from trying to reuse SVE register data
- * that is detroyed by KVM guest enter/exit.  This function should go away when
- * KVM SVE support is implemented.  Don't use it for anything else.
- */
-#ifdef CONFIG_ARM64_SVE
-void sve_flush_cpu_state(void)
-{
-	struct fpsimd_last_state_struct const *last =
-		this_cpu_ptr(&fpsimd_last_state);
-
-	if (last->st && last->sve_in_use)
-		fpsimd_flush_cpu_state();
-}
-#endif /* CONFIG_ARM64_SVE */
-
 #ifdef CONFIG_KERNEL_MODE_NEON
 
 DEFINE_PER_CPU(bool, kernel_neon_busy);
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 379e8a9..9cef525 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -682,9 +682,6 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		 */
 		preempt_disable();
 
-		/* Flush FP/SIMD state that can't survive guest entry/exit */
-		kvm_fpsimd_flush_cpu_state();
-
 		kvm_pmu_flush_hwstate(vcpu);
 
 		local_irq_disable();
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v6 13/15] KVM: arm64: Remove redundant *exit_code changes in fpsimd_guest_exit()
  2018-05-08 16:44 ` Dave Martin
@ 2018-05-08 16:44   ` Dave Martin
  -1 siblings, 0 replies; 62+ messages in thread
From: Dave Martin @ 2018-05-08 16:44 UTC (permalink / raw)
  To: kvmarm
  Cc: Marc Zyngier, Catalin Marinas, Christoffer Dall,
	linux-arm-kernel, Ard Biesheuvel

In fixup_guest_exit(), there are a couple of cases where after
checking what the exit code was, we assign it explicitly with the
value it already had.

Assuming this is not indicative of a bug, these assignments are not
needed.

This patch removes the redundant assignments and simplifies some
if-nesting that becomes trivial as a result.

No functional change.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/hyp/switch.c | 16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 6826f2d..21e5543 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -402,12 +402,8 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 		if (valid) {
 			int ret = __vgic_v2_perform_cpuif_access(vcpu);
 
-			if (ret == 1) {
-				if (__skip_instr(vcpu))
-					return true;
-				else
-					*exit_code = ARM_EXCEPTION_TRAP;
-			}
+			if (ret == 1 && __skip_instr(vcpu))
+				return true;
 
 			if (ret == -1) {
 				/* Promote an illegal access to an
@@ -429,12 +425,8 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 	     kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_CP15_32)) {
 		int ret = __vgic_v3_perform_cpuif_access(vcpu);
 
-		if (ret == 1) {
-			if (__skip_instr(vcpu))
-				return true;
-			else
-				*exit_code = ARM_EXCEPTION_TRAP;
-		}
+		if (ret == 1 && __skip_instr(vcpu))
+			return true;
 	}
 
 	/* Return to the host kernel and handle the exit */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v6 14/15] KVM: arm64: Fold redundant exit code checks out of fixup_guest_exit()
  2018-05-08 16:44 ` Dave Martin
@ 2018-05-08 16:44   ` Dave Martin
  -1 siblings, 0 replies; 62+ messages in thread
From: Dave Martin @ 2018-05-08 16:44 UTC (permalink / raw)
  To: kvmarm
  Cc: Marc Zyngier, Catalin Marinas, Christoffer Dall,
	linux-arm-kernel, Ard Biesheuvel

The entire tail of fixup_guest_exit() is contained in if statements
of the form if (x && *exit_code == ARM_EXCEPTION_TRAP).  As a result,
we can check just once and bail out of the function early, allowing
the remaining if conditions to be simplified.

The only awkward case is where *exit_code is changed to
ARM_EXCEPTION_EL1_SERROR on an illegal GICv2 CPU interface
access: in that case, the GICv3 trap handling code is
skipped using a goto.  This avoids pointlessly evaluating the
static branch check for the GICv3 case, even though we can't have
vgic_v2_cpuif_trap and vgic_v3_cpuif_trap true simultaneously
unless we have a GICv3 and GICv2 on the host: that sounds stupid,
but I haven't satisfied myself that it can't happen.

No functional change.
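
After the change, the overall shape of fixup_guest_exit() is roughly
as follows (outline only; the diff below has the real code):

	if (*exit_code != ARM_EXCEPTION_TRAP)
		goto exit;

	if (!__populate_fault_info(vcpu))
		return true;

	/* GICv2 cpuif trap handling: the data abort path ends in goto exit */
	...
	/* GICv3 cpuif trap handling (sysreg traps only) */
	...
exit:
	/* Return to the host kernel and handle the exit */
	return false;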

Signed-off-by: Dave Martin <Dave.Martin@arm.com>

---

Changes since v5:

Requested by Marc Zyngier:

 * Move "goto exit" to the end of the data abort handling code under
   gicv2 cpuif trap handling, since the gicv3-specific handling is only
   for sysreg traps.
---
 arch/arm64/kvm/hyp/switch.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 21e5543..5be472d 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -386,11 +386,13 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 	 * same PC once the SError has been injected, and replay the
 	 * trapping instruction.
 	 */
-	if (*exit_code == ARM_EXCEPTION_TRAP && !__populate_fault_info(vcpu))
+	if (*exit_code != ARM_EXCEPTION_TRAP)
+		goto exit;
+
+	if (!__populate_fault_info(vcpu))
 		return true;
 
-	if (static_branch_unlikely(&vgic_v2_cpuif_trap) &&
-	    *exit_code == ARM_EXCEPTION_TRAP) {
+	if (static_branch_unlikely(&vgic_v2_cpuif_trap)) {
 		bool valid;
 
 		valid = kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_DABT_LOW &&
@@ -416,11 +418,12 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 					*vcpu_cpsr(vcpu) &= ~DBG_SPSR_SS;
 				*exit_code = ARM_EXCEPTION_EL1_SERROR;
 			}
+
+			goto exit;
 		}
 	}
 
 	if (static_branch_unlikely(&vgic_v3_cpuif_trap) &&
-	    *exit_code == ARM_EXCEPTION_TRAP &&
 	    (kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SYS64 ||
 	     kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_CP15_32)) {
 		int ret = __vgic_v3_perform_cpuif_access(vcpu);
@@ -429,6 +432,7 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 			return true;
 	}
 
+exit:
 	/* Return to the host kernel and handle the exit */
 	return false;
 }
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v6 15/15] KVM: arm64: Invoke FPSIMD context switch trap from C
  2018-05-08 16:44 ` Dave Martin
@ 2018-05-08 16:45   ` Dave Martin
  -1 siblings, 0 replies; 62+ messages in thread
From: Dave Martin @ 2018-05-08 16:45 UTC (permalink / raw)
  To: kvmarm
  Cc: Marc Zyngier, Catalin Marinas, Christoffer Dall,
	linux-arm-kernel, Ard Biesheuvel

The conversion of the FPSIMD context switch trap code to C has added
some overhead to calling it, due to the need to save registers that
the procedure call standard defines as caller-saved.

So, perhaps it is no longer worth invoking this trap handler quite
so early.

Instead, we can invoke it from fixup_guest_exit(), with little
likelihood of increasing the overhead much further.

As a convenience, this patch gives __hyp_switch_fpsimd() the same
return semantics as fixup_guest_exit().  For now there is no
possibility of a spurious FPSIMD trap, so the function always
returns true, but this allows it to be tail-called with a single
return statement.
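
Concretely, the old el1_trap assembly fast path (b.eq
__fpsimd_guest_restore) is replaced by an early check in
fixup_guest_exit(), of the shape (sketch of the hunk below):

	if (system_supports_fpsimd() &&
	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
		return __hyp_switch_fpsimd(vcpu);	/* always true for now */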

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/hyp/entry.S     | 30 ------------------------------
 arch/arm64/kvm/hyp/hyp-entry.S | 19 -------------------
 arch/arm64/kvm/hyp/switch.c    | 15 +++++++++++++--
 3 files changed, 13 insertions(+), 51 deletions(-)

diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
index 40f349b..fad1e16 100644
--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -166,33 +166,3 @@ abort_guest_exit_end:
 	orr	x0, x0, x5
 1:	ret
 ENDPROC(__guest_exit)
-
-ENTRY(__fpsimd_guest_restore)
-	// x0: esr
-	// x1: vcpu
-	// x2-x29,lr: vcpu regs
-	// vcpu x0-x1 on the stack
-	stp	x2, x3, [sp, #-144]!
-	stp	x4, x5, [sp, #16]
-	stp	x6, x7, [sp, #32]
-	stp	x8, x9, [sp, #48]
-	stp	x10, x11, [sp, #64]
-	stp	x12, x13, [sp, #80]
-	stp	x14, x15, [sp, #96]
-	stp	x16, x17, [sp, #112]
-	stp	x18, lr, [sp, #128]
-
-	bl	__hyp_switch_fpsimd
-
-	ldp	x4, x5, [sp, #16]
-	ldp	x6, x7, [sp, #32]
-	ldp	x8, x9, [sp, #48]
-	ldp	x10, x11, [sp, #64]
-	ldp	x12, x13, [sp, #80]
-	ldp	x14, x15, [sp, #96]
-	ldp	x16, x17, [sp, #112]
-	ldp	x18, lr, [sp, #128]
-	ldp	x0, x1, [sp, #144]
-	ldp	x2, x3, [sp], #160
-	eret
-ENDPROC(__fpsimd_guest_restore)
diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S
index bffece2..753b9d2 100644
--- a/arch/arm64/kvm/hyp/hyp-entry.S
+++ b/arch/arm64/kvm/hyp/hyp-entry.S
@@ -113,25 +113,6 @@ el1_hvc_guest:
 
 el1_trap:
 	get_vcpu_ptr	x1, x0
-
-	mrs		x0, esr_el2
-	lsr		x0, x0, #ESR_ELx_EC_SHIFT
-	/*
-	 * x0: ESR_EC
-	 * x1: vcpu pointer
-	 */
-
-	/*
-	 * We trap the first access to the FP/SIMD to save the host context
-	 * and restore the guest context lazily.
-	 * If FP/SIMD is not implemented, handle the trap and inject an
-	 * undefined instruction exception to the guest.
-	 */
-alternative_if_not ARM64_HAS_NO_FPSIMD
-	cmp	x0, #ESR_ELx_EC_FP_ASIMD
-	b.eq	__fpsimd_guest_restore
-alternative_else_nop_endif
-
 	mov	x0, #ARM_EXCEPTION_TRAP
 	b	__guest_exit
 
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 5be472d..6eeda72 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -327,8 +327,7 @@ static bool __hyp_text __skip_instr(struct kvm_vcpu *vcpu)
 	}
 }
 
-void __hyp_text __hyp_switch_fpsimd(u64 esr __always_unused,
-				    struct kvm_vcpu *vcpu)
+static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
 {
 	struct user_fpsimd_state *host_fpsimd = vcpu->arch.host_fpsimd_state;
 
@@ -368,6 +367,8 @@ void __hyp_text __hyp_switch_fpsimd(u64 esr __always_unused,
 			     fpexc32_el2);
 
 	vcpu->arch.flags |= KVM_ARM64_FP_ENABLED;
+
+	return true;
 }
 
 /*
@@ -389,6 +390,16 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 	if (*exit_code != ARM_EXCEPTION_TRAP)
 		goto exit;
 
+	/*
+	 * We trap the first access to the FP/SIMD to save the host context
+	 * and restore the guest context lazily.
+	 * If FP/SIMD is not implemented, handle the trap and inject an
+	 * undefined instruction exception to the guest.
+	 */
+	if (system_supports_fpsimd() &&
+	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
+		return __hyp_switch_fpsimd(vcpu);
+
 	if (!__populate_fault_info(vcpu))
 		return true;
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v6 01/15] thread_info: Add update_thread_flag() helpers
  2018-05-08 16:44   ` Dave Martin
@ 2018-05-08 17:08     ` Steven Rostedt
  -1 siblings, 0 replies; 62+ messages in thread
From: Steven Rostedt @ 2018-05-08 17:08 UTC (permalink / raw)
  To: Dave Martin
  Cc: Christoffer Dall, Ard Biesheuvel, Marc Zyngier, Catalin Marinas,
	Oleg Nesterov, Peter Zijlstra, Ingo Molnar, kvmarm,
	linux-arm-kernel

On Tue,  8 May 2018 17:44:46 +0100
Dave Martin <Dave.Martin@arm.com> wrote:

> There are a number of bits of code sprinkled around the kernel to
> set a thread flag if a certain condition is true, and clear it
> otherwise.
> 
> To help make those call sites terser and less cumbersome, this
> patch adds a new family of thread flag manipulators
> 
> 	update*_thread_flag([...,] flag, cond)
> 
> which do the equivalent of:
> 
> 	if (cond)
> 		set*_thread_flag([...,] flag);
> 	else
> 		clear*_thread_flag([...,] flag);
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>

Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

-- Steve

> Cc: Oleg Nesterov <oleg@redhat.com>
> ---
>  include/linux/sched.h       |  6 ++++++
>  include/linux/thread_info.h | 11 +++++++++++
>  2 files changed, 17 insertions(+)
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index b3d697f..c2c3051 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1578,6 +1578,12 @@ static inline void clear_tsk_thread_flag(struct task_struct *tsk, int flag)
>  	clear_ti_thread_flag(task_thread_info(tsk), flag);
>  }
>  
> +static inline void update_tsk_thread_flag(struct task_struct *tsk, int flag,
> +					  bool value)
> +{
> +	update_ti_thread_flag(task_thread_info(tsk), flag, value);
> +}
> +
>  static inline int test_and_set_tsk_thread_flag(struct task_struct *tsk, int flag)
>  {
>  	return test_and_set_ti_thread_flag(task_thread_info(tsk), flag);
> diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
> index cf2862b..8d8821b 100644
> --- a/include/linux/thread_info.h
> +++ b/include/linux/thread_info.h
> @@ -60,6 +60,15 @@ static inline void clear_ti_thread_flag(struct thread_info *ti, int flag)
>  	clear_bit(flag, (unsigned long *)&ti->flags);
>  }
>  
> +static inline void update_ti_thread_flag(struct thread_info *ti, int flag,
> +					 bool value)
> +{
> +	if (value)
> +		set_ti_thread_flag(ti, flag);
> +	else
> +		clear_ti_thread_flag(ti, flag);
> +}
> +
>  static inline int test_and_set_ti_thread_flag(struct thread_info *ti, int flag)
>  {
>  	return test_and_set_bit(flag, (unsigned long *)&ti->flags);
> @@ -79,6 +88,8 @@ static inline int test_ti_thread_flag(struct thread_info *ti, int flag)
>  	set_ti_thread_flag(current_thread_info(), flag)
>  #define clear_thread_flag(flag) \
>  	clear_ti_thread_flag(current_thread_info(), flag)
> +#define update_thread_flag(flag, value) \
> +	update_ti_thread_flag(current_thread_info(), flag, value)
>  #define test_and_set_thread_flag(flag) \
>  	test_and_set_ti_thread_flag(current_thread_info(), flag)
>  #define test_and_clear_thread_flag(flag) \

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6 07/15] KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing
  2018-05-08 16:44   ` Dave Martin
@ 2018-05-09  8:24     ` Marc Zyngier
  -1 siblings, 0 replies; 62+ messages in thread
From: Marc Zyngier @ 2018-05-09  8:24 UTC (permalink / raw)
  To: Dave Martin, kvmarm
  Cc: Catalin Marinas, Christoffer Dall, linux-arm-kernel, Ard Biesheuvel

On 08/05/18 17:44, Dave Martin wrote:
> This patch refactors KVM to align the host and guest FPSIMD
> save/restore logic with each other for arm64.  This reduces the
> number of redundant save/restore operations that must occur, and
> reduces the common-case IRQ blackout time during guest exit storms
> by saving the host state lazily and optimising away the need to
> restore the host state before returning to the run loop.
> 
> Four hooks are defined in order to enable this:
> 
>  * kvm_arch_vcpu_run_map_fp():
>    Called on PID change to map necessary bits of current to Hyp.
> 
>  * kvm_arch_vcpu_load_fp():
>    Set up FP/SIMD for entering the KVM run loop (parse as
>    "vcpu_load fp").
> 
>  * kvm_arch_vcpu_ctxsync_fp():
>    Get FP/SIMD into a safe state for re-enabling interrupts after a
>    guest exit back to the run loop.
> 
>    For arm64 specifically, this involves updating the host kernel's
>    FPSIMD context tracking metadata so that kernel-mode NEON use
>    will cause the vcpu's FPSIMD state to be saved back correctly
>    into the vcpu struct.  This must be done before re-enabling
>    interrupts because kernel-mode NEON may be used by softirqs.
> 
>  * kvm_arch_vcpu_put_fp():
>    Save guest FP/SIMD state back to memory and dissociate from the
>    CPU ("vcpu_put fp").
> 
> Also, the arm64 FPSIMD context switch code is updated to enable it
> to save back FPSIMD state for a vcpu, not just current.  A few
> helpers drive this:
> 
>  * fpsimd_bind_state_to_cpu(struct user_fpsimd_state *fp):
>    mark this CPU as having context fp (which may belong to a vcpu)
>    currently loaded in its registers.  This is the non-task
>    equivalent of the static function fpsimd_bind_to_cpu() in
>    fpsimd.c.
> 
>  * fpsimd_save():
>    exported to allow KVM to save the guest's FPSIMD state back to
>    memory on exit from the run loop.
> 
>  * fpsimd_flush_cpu_state():
>    invalidate any context's FPSIMD state that is currently loaded.
>    Used to disassociate the vcpu from the CPU regs on run loop exit.
> 
> These changes allow the run loop to enable interrupts (and thus
> softirqs that may use kernel-mode NEON) without having to save the
> guest's FPSIMD state eagerly.
> 
> Some new vcpu_arch fields are added to make all this work.  Because
> host FPSIMD state can now be saved back directly into current's
> thread_struct as appropriate, host_cpu_context is no longer used
> for preserving the FPSIMD state.  However, it is still needed for
> preserving other things such as the host's system registers.  To
> avoid ABI churn, the redundant storage space in host_cpu_context is
> not removed for now.
> 
> arch/arm is not addressed by this patch and continues to use its
> current save/restore logic.  It could provide implementations of
> the helpers later if desired.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> 
> ---
> 
> Changes since v5:
> 
> Requested by Marc Zyngier:
> 
>  * Migrate from adding new bool flags in vcpu_arch to adding bits in
>    vcpu_arch.flags.
> ---
>  arch/arm/include/asm/kvm_host.h   |   8 +++
>  arch/arm64/include/asm/fpsimd.h   |   5 ++
>  arch/arm64/include/asm/kvm_asm.h  |   2 +
>  arch/arm64/include/asm/kvm_host.h |  16 ++++++
>  arch/arm64/kernel/fpsimd.c        |  15 +++++-
>  arch/arm64/kvm/Kconfig            |   1 +
>  arch/arm64/kvm/Makefile           |   2 +-
>  arch/arm64/kvm/fpsimd.c           | 111 ++++++++++++++++++++++++++++++++++++++
>  arch/arm64/kvm/hyp/switch.c       |  50 +++++++++--------
>  virt/kvm/arm/arm.c                |   4 ++
>  10 files changed, 185 insertions(+), 29 deletions(-)
>  create mode 100644 arch/arm64/kvm/fpsimd.c
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index c7c28c8..4cac8d1 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -303,6 +303,14 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
>  int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
>  			       struct kvm_device_attr *attr);
>  
> +/*
> + * VFP/NEON switching is all done by the hyp switch code, so no need to
> + * coordinate with host context handling for this state:
> + */
> +static inline void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu) {}
> +static inline void kvm_arch_vcpu_park_fp(struct kvm_vcpu *vcpu) {}
> +static inline void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) {}
> +
>  /* All host FP/SIMD state is restored on guest exit, so nothing to save: */
>  static inline void kvm_fpsimd_flush_cpu_state(void) {}
>  
> diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
> index aa7162a..aa60895 100644
> --- a/arch/arm64/include/asm/fpsimd.h
> +++ b/arch/arm64/include/asm/fpsimd.h
> @@ -41,6 +41,8 @@ struct task_struct;
>  extern void fpsimd_save_state(struct user_fpsimd_state *state);
>  extern void fpsimd_load_state(struct user_fpsimd_state *state);
>  
> +extern void fpsimd_save(void);
> +
>  extern void fpsimd_thread_switch(struct task_struct *next);
>  extern void fpsimd_flush_thread(void);
>  
> @@ -49,7 +51,10 @@ extern void fpsimd_preserve_current_state(void);
>  extern void fpsimd_restore_current_state(void);
>  extern void fpsimd_update_current_state(struct user_fpsimd_state const *state);
>  
> +extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state);
> +
>  extern void fpsimd_flush_task_state(struct task_struct *target);
> +extern void fpsimd_flush_cpu_state(void);
>  extern void sve_flush_cpu_state(void);
>  
>  /* Maximum VL that SVE VL-agnostic software can transparently support */
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index 9699c13..6429eb6 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -33,6 +33,8 @@
>  /* vcpu_arch flags field values: */
>  #define KVM_ARM64_DEBUG_DIRTY_SHIFT	0
>  #define KVM_ARM64_DEBUG_DIRTY		(1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)

At some point, we could remove KVM_ARM64_DEBUG_DIRTY_SHIFT as it is not
used anymore (since 1ea66d27e7b0). Do not respin on this account though.

> +#define KVM_ARM64_FP_ENABLED		(1 << 1) /* guest FP regs loaded */
> +#define KVM_ARM64_HOST_SVE_IN_USE	(1 << 2) /* backup for host TIF_SVE */
>  
>  /* Translate a kernel address of @sym into its equivalent linear mapping */
>  #define kvm_ksym_ref(sym)						\
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index a5ed95c..e4bdc75 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -30,6 +30,7 @@
>  #include <asm/kvm.h>
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_mmio.h>
> +#include <asm/thread_info.h>
>  
>  #define __KVM_HAVE_ARCH_INTC_INITIALIZED
>  
> @@ -238,6 +239,10 @@ struct kvm_vcpu_arch {
>  
>  	/* Pointer to host CPU context */
>  	kvm_cpu_context_t *host_cpu_context;
> +
> +	struct thread_info *host_thread_info;	/* hyp VA */
> +	struct user_fpsimd_state *host_fpsimd_state;	/* hyp VA */
> +
>  	struct {
>  		/* {Break,watch}point registers */
>  		struct kvm_guest_debug_arch regs;
> @@ -420,6 +425,17 @@ static inline void __cpu_init_stage2(void)
>  		  "PARange is %d bits, unsupported configuration!", parange);
>  }
>  
> +/* Guest/host FPSIMD coordination helpers */
> +int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
> +void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
> +void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu);
> +void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
> +
> +static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
> +{
> +	return kvm_arch_vcpu_run_map_fp(vcpu);
> +}
> +
>  /*
>   * All host FP/SIMD state is restored on guest exit, so nothing needs
>   * doing here except in the SVE case:
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 5fc0595..e7349b5 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -275,7 +275,7 @@ static void task_fpsimd_load(void)
>   *
>   * Softirqs (and preemption) must be disabled.
>   */
> -static void fpsimd_save(void)
> +void fpsimd_save(void)
>  {
>  	struct user_fpsimd_state *st = __this_cpu_read(fpsimd_last_state.st);
>  
> @@ -1008,6 +1008,17 @@ static void fpsimd_bind_to_cpu(void)
>  	current->thread.fpsimd_cpu = smp_processor_id();
>  }
>  
> +void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st)
> +{
> +	struct fpsimd_last_state_struct *last =
> +		this_cpu_ptr(&fpsimd_last_state);
> +
> +	WARN_ON(!in_softirq() && !irqs_disabled());
> +
> +	last->st = st;
> +	last->sve_in_use = false;
> +}
> +
>  /*
>   * Load the userland FPSIMD state of 'current' from memory, but only if the
>   * FPSIMD state already held in the registers is /not/ the most recent FPSIMD
> @@ -1060,7 +1071,7 @@ void fpsimd_flush_task_state(struct task_struct *t)
>  	t->thread.fpsimd_cpu = NR_CPUS;
>  }
>  
> -static inline void fpsimd_flush_cpu_state(void)
> +void fpsimd_flush_cpu_state(void)
>  {
>  	__this_cpu_write(fpsimd_last_state.st, NULL);
>  }
> diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
> index a2e3a5a..47b23bf 100644
> --- a/arch/arm64/kvm/Kconfig
> +++ b/arch/arm64/kvm/Kconfig
> @@ -39,6 +39,7 @@ config KVM
>  	select HAVE_KVM_IRQ_ROUTING
>  	select IRQ_BYPASS_MANAGER
>  	select HAVE_KVM_IRQ_BYPASS
> +	select HAVE_KVM_VCPU_RUN_PID_CHANGE
>  	---help---
>  	  Support hosting virtualized guest machines.
>  	  We don't support KVM with 16K page tables yet, due to the multiple
> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
> index 93afff9..0f2a135 100644
> --- a/arch/arm64/kvm/Makefile
> +++ b/arch/arm64/kvm/Makefile
> @@ -19,7 +19,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/psci.o $(KVM)/arm/perf.o
>  kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o va_layout.o
>  kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
>  kvm-$(CONFIG_KVM_ARM_HOST) += guest.o debug.o reset.o sys_regs.o sys_regs_generic_v8.o
> -kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o
> +kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o fpsimd.o
>  kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/aarch32.o
>  
>  kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic.o
> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> new file mode 100644
> index 0000000..7059275
> --- /dev/null
> +++ b/arch/arm64/kvm/fpsimd.c
> @@ -0,0 +1,111 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * arch/arm64/kvm/fpsimd.c: Guest/host FPSIMD context coordination helpers
> + *
> + * Copyright 2018 Arm Limited
> + * Author: Dave Martin <Dave.Martin@arm.com>
> + */
> +#include <linux/bottom_half.h>
> +#include <linux/sched.h>
> +#include <linux/thread_info.h>
> +#include <linux/kvm_host.h>
> +#include <asm/kvm_asm.h>
> +#include <asm/kvm_host.h>
> +#include <asm/kvm_mmu.h>
> +
> +/*
> + * Called on entry to KVM_RUN unless this vcpu previously ran at least
> + * once and the most recent prior KVM_RUN for this vcpu was called from
> + * the same task as current (highly likely).
> + *
> + * This is guaranteed to execute before kvm_arch_vcpu_load_fp(vcpu),
> + * such that on entering hyp the relevant parts of current are already
> + * mapped.
> + */
> +int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
> +{
> +	int ret;
> +
> +	struct thread_info *ti = &current->thread_info;
> +	struct user_fpsimd_state *fpsimd = &current->thread.uw.fpsimd_state;
> +
> +	/*
> +	 * Make sure the host task thread flags and fpsimd state are
> +	 * visible to hyp:
> +	 */
> +	ret = create_hyp_mappings(ti, ti + 1, PAGE_HYP);
> +	if (ret)
> +		goto error;
> +
> +	ret = create_hyp_mappings(fpsimd, fpsimd + 1, PAGE_HYP);
> +	if (ret)
> +		goto error;
> +
> +	vcpu->arch.host_thread_info = kern_hyp_va(ti);
> +	vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);

Since we assign host_fpsimd_state here, and that "map" is guaranteed to
execute before "load"...

> +error:
> +	return ret;
> +}
> +
> +/*
> + * Prepare vcpu for saving the host's FPSIMD state and loading the guest's.
> + * The actual loading is done by the FPSIMD access trap taken to hyp.
> + *
> + * Here, we just set the correct metadata to indicate that the FPSIMD
> + * state in the cpu regs (if any) belongs to current, and where to write
> + * it back to if/when an FPSIMD access trap is taken.
> + *
> + * TIF_SVE is backed up here, since it may get clobbered with guest state.
> + * This flag is restored by kvm_arch_vcpu_put_fp(vcpu).
> + */
> +void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
> +{
> +	BUG_ON(system_supports_sve());
> +	BUG_ON(!current->mm);
> +
> +	vcpu->arch.flags &= ~(KVM_ARM64_FP_ENABLED | KVM_ARM64_HOST_SVE_IN_USE);
> +	if (test_thread_flag(TIF_SVE))
> +		vcpu->arch.flags |= KVM_ARM64_HOST_SVE_IN_USE;
> +
> +	vcpu->arch.host_fpsimd_state =
> +		kern_hyp_va(&current->thread.uw.fpsimd_state);

... what is the purpose of this assignment? I don't think it is harmful
in any way, but I'm just wondering whether I'm missing something in the
actual flow.

> +}
> +
> +/*
> + * If the guest FPSIMD state was loaded, update the host's context
> + * tracking data to mark the CPU FPSIMD regs as dirty for the vcpu, so
> + * that they will be written back if the kernel clobbers them due to
> + * kernel-mode NEON before re-entry into the guest.
> + */
> +void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
> +{
> +	WARN_ON_ONCE(!irqs_disabled());
> +
> +	if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
> +		fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.gp_regs.fp_regs);
> +		clear_thread_flag(TIF_FOREIGN_FPSTATE);
> +		clear_thread_flag(TIF_SVE);
> +	}
> +}
> +
> +/*
> + * Write back the vcpu FPSIMD regs if they are dirty, and invalidate the
> + * cpu FPSIMD regs so that they can't be spuriously reused if this vcpu
> + * disappears and another task or vcpu appears that recycles the same
> + * struct fpsimd_state.
> + */
> +void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
> +{
> +	local_bh_disable();
> +
> +	if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
> +		fpsimd_save();
> +		fpsimd_flush_cpu_state();
> +		set_thread_flag(TIF_FOREIGN_FPSTATE);
> +	}
> +
> +	update_thread_flag(TIF_SVE,
> +			   vcpu->arch.flags & KVM_ARM64_HOST_SVE_IN_USE);

Does this indicate anything about the validity of the SVE registers? Or
is it for the sole purpose of userspace not having to trap to re-enable SVE?

> +
> +	local_bh_enable();
> +}
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index c0796c4..75c3b65 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -27,15 +27,16 @@
>  #include <asm/kvm_mmu.h>
>  #include <asm/fpsimd.h>
>  #include <asm/debug-monitors.h>
> +#include <asm/thread_info.h>
>  
> -static bool __hyp_text __fpsimd_enabled_nvhe(void)
> +static bool __hyp_text update_fp_enabled(struct kvm_vcpu *vcpu)
>  {
> -	return !(read_sysreg(cptr_el2) & CPTR_EL2_TFP);
> -}
> +	if (vcpu->arch.host_thread_info->flags & _TIF_FOREIGN_FPSTATE) {
> +		vcpu->arch.host_fpsimd_state = NULL;
> +		vcpu->arch.flags &= ~KVM_ARM64_FP_ENABLED;
> +	}

I must confess I don't get this bit. If we started off with an invalid
host state (which is how I interpret TIF_FOREIGN_FPSTATE), why didn't we
simply zero the pointer at load time? The thread flag cannot change in
the meantime, right? Or did I get that wrong?

>  
> -static bool fpsimd_enabled_vhe(void)
> -{
> -	return !!(read_sysreg(cpacr_el1) & CPACR_EL1_FPEN);
> +	return !!(vcpu->arch.flags & KVM_ARM64_FP_ENABLED);
>  }
>  
>  /* Save the 32-bit only FPSIMD system register state */
> @@ -92,7 +93,10 @@ static void activate_traps_vhe(struct kvm_vcpu *vcpu)
>  
>  	val = read_sysreg(cpacr_el1);
>  	val |= CPACR_EL1_TTA;
> -	val &= ~(CPACR_EL1_FPEN | CPACR_EL1_ZEN);
> +	val &= ~CPACR_EL1_ZEN;
> +	if (!update_fp_enabled(vcpu))
> +		val &= ~CPACR_EL1_FPEN;
> +
>  	write_sysreg(val, cpacr_el1);
>  
>  	write_sysreg(kvm_get_hyp_vector(), vbar_el1);
> @@ -105,7 +109,10 @@ static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu)
>  	__activate_traps_common(vcpu);
>  
>  	val = CPTR_EL2_DEFAULT;
> -	val |= CPTR_EL2_TTA | CPTR_EL2_TFP | CPTR_EL2_TZ;
> +	val |= CPTR_EL2_TTA | CPTR_EL2_TZ;
> +	if (!update_fp_enabled(vcpu))
> +		val |= CPTR_EL2_TFP;
> +
>  	write_sysreg(val, cptr_el2);
>  }
>  
> @@ -321,8 +328,6 @@ static bool __hyp_text __skip_instr(struct kvm_vcpu *vcpu)
>  void __hyp_text __hyp_switch_fpsimd(u64 esr __always_unused,
>  				    struct kvm_vcpu *vcpu)
>  {
> -	kvm_cpu_context_t *host_ctxt;
> -
>  	if (has_vhe())
>  		write_sysreg(read_sysreg(cpacr_el1) | CPACR_EL1_FPEN,
>  			     cpacr_el1);
> @@ -332,14 +337,19 @@ void __hyp_text __hyp_switch_fpsimd(u64 esr __always_unused,
>  
>  	isb();
>  
> -	host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context);
> -	__fpsimd_save_state(&host_ctxt->gp_regs.fp_regs);
> +	if (vcpu->arch.host_fpsimd_state) {
> +		__fpsimd_save_state(vcpu->arch.host_fpsimd_state);
> +		vcpu->arch.host_fpsimd_state = NULL;
> +	}
> +
>  	__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
>  
>  	/* Skip restoring fpexc32 for AArch64 guests */
>  	if (!(read_sysreg(hcr_el2) & HCR_RW))
>  		write_sysreg(vcpu->arch.ctxt.sys_regs[FPEXC32_EL2],
>  			     fpexc32_el2);
> +
> +	vcpu->arch.flags |= KVM_ARM64_FP_ENABLED;
>  }
>  
>  /*
> @@ -418,7 +428,6 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
>  {
>  	struct kvm_cpu_context *host_ctxt;
>  	struct kvm_cpu_context *guest_ctxt;
> -	bool fp_enabled;
>  	u64 exit_code;
>  
>  	host_ctxt = vcpu->arch.host_cpu_context;
> @@ -440,19 +449,14 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
>  		/* And we're baaack! */
>  	} while (fixup_guest_exit(vcpu, &exit_code));
>  
> -	fp_enabled = fpsimd_enabled_vhe();
> -
>  	sysreg_save_guest_state_vhe(guest_ctxt);
>  
>  	__deactivate_traps(vcpu);
>  
>  	sysreg_restore_host_state_vhe(host_ctxt);
>  
> -	if (fp_enabled) {
> -		__fpsimd_save_state(&guest_ctxt->gp_regs.fp_regs);
> -		__fpsimd_restore_state(&host_ctxt->gp_regs.fp_regs);
> +	if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED)
>  		__fpsimd_save_fpexc32(vcpu);
> -	}
>  
>  	__debug_switch_to_host(vcpu);
>  
> @@ -464,7 +468,6 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
>  {
>  	struct kvm_cpu_context *host_ctxt;
>  	struct kvm_cpu_context *guest_ctxt;
> -	bool fp_enabled;
>  	u64 exit_code;
>  
>  	vcpu = kern_hyp_va(vcpu);
> @@ -496,8 +499,6 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
>  		/* And we're baaack! */
>  	} while (fixup_guest_exit(vcpu, &exit_code));
>  
> -	fp_enabled = __fpsimd_enabled_nvhe();
> -
>  	__sysreg_save_state_nvhe(guest_ctxt);
>  	__sysreg32_save_state(vcpu);
>  	__timer_disable_traps(vcpu);
> @@ -508,11 +509,8 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
>  
>  	__sysreg_restore_state_nvhe(host_ctxt);
>  
> -	if (fp_enabled) {
> -		__fpsimd_save_state(&guest_ctxt->gp_regs.fp_regs);
> -		__fpsimd_restore_state(&host_ctxt->gp_regs.fp_regs);
> +	if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED)
>  		__fpsimd_save_fpexc32(vcpu);
> -	}
>  
>  	/*
>  	 * This must come after restoring the host sysregs, since a non-VHE
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index a4c1b76..bee226c 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -363,10 +363,12 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  	kvm_vgic_load(vcpu);
>  	kvm_timer_vcpu_load(vcpu);
>  	kvm_vcpu_load_sysregs(vcpu);
> +	kvm_arch_vcpu_load_fp(vcpu);
>  }
>  
>  void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>  {
> +	kvm_arch_vcpu_put_fp(vcpu);
>  	kvm_vcpu_put_sysregs(vcpu);
>  	kvm_timer_vcpu_put(vcpu);
>  	kvm_vgic_put(vcpu);
> @@ -778,6 +780,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  		if (static_branch_unlikely(&userspace_irqchip_in_use))
>  			kvm_timer_sync_hwstate(vcpu);
>  
> +		kvm_arch_vcpu_ctxsync_fp(vcpu);
> +
>  		/*
>  		 * We may have taken a host interrupt in HYP mode (ie
>  		 * while executing the guest). This interrupt is still
> 

It otherwise looks good, and I don't think the above nits are
problematic on their own.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6 06/15] KVM: arm64: Repurpose vcpu_arch.debug_flags for general-purpose flags
  2018-05-08 16:44   ` Dave Martin
@ 2018-05-09  8:27     ` Marc Zyngier
  -1 siblings, 0 replies; 62+ messages in thread
From: Marc Zyngier @ 2018-05-09  8:27 UTC (permalink / raw)
  To: Dave Martin, kvmarm
  Cc: Catalin Marinas, Christoffer Dall, linux-arm-kernel, Ard Biesheuvel

On 08/05/18 17:44, Dave Martin wrote:
> In struct vcpu_arch, the debug_flags field is used to store debug-
> related flags about the vcpu state.
> 
> Since we are about to add some more flags related to FPSIMD and
> SVE, it makes sense to add them to the existing flags field rather
> than adding new fields.  Since there is only one debug_flags flag
> defined so far, there is plenty of free space for expansion.
> 
> In preparation for adding more flags, this patch renames the
> debug_flags field to simply "flags", and updates comments
> appropriately.
> 
> No functional change.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
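
For reference, patch 7 in this series goes on to allocate new bits next to
the existing debug bit; the resulting layout, shown purely for
illustration (DEBUG_DIRTY is actually defined via its _SHIFT, which is
equivalent):

	/* vcpu_arch flags field values: */
	#define KVM_ARM64_DEBUG_DIRTY		(1 << 0)
	#define KVM_ARM64_FP_ENABLED		(1 << 1) /* guest FP regs loaded */
	#define KVM_ARM64_HOST_SVE_IN_USE	(1 << 2) /* backup for host TIF_SVE */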

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6 08/15] arm64/sve: Move read_zcr_features() out of cpufeature.h
  2018-05-08 16:44   ` Dave Martin
@ 2018-05-09  8:28     ` Marc Zyngier
  -1 siblings, 0 replies; 62+ messages in thread
From: Marc Zyngier @ 2018-05-09  8:28 UTC (permalink / raw)
  To: Dave Martin, kvmarm
  Cc: Catalin Marinas, Christoffer Dall, linux-arm-kernel, Ard Biesheuvel

On 08/05/18 17:44, Dave Martin wrote:
> Having read_zcr_features() inline in cpufeature.h results in that
> header requiring #includes which make it hard to include
> <asm/fpsimd.h> elsewhere without triggering header inclusion
> cycles.
> 
> This is not a hot-path function and arguably should not be in
> cpufeature.h in the first place, so this patch moves it to
> fpsimd.c, compiled conditionally if CONFIG_ARM64_SVE=y.
> 
> This allows some SVE-related #includes to be dropped from
> cpufeature.h, which will ease future maintenance.
> 
> A couple of missing #includes of <asm/fpsimd.h> are exposed by this
> change under arch/arm64/.  This patch adds the missing #includes as
> necessary.
> 
> No functional change.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
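
A minimal sketch of the shape of the change (illustrative only; the header
that ends up carrying the declaration is assumed here to be
<asm/fpsimd.h>):

	/* previously an inline definition in <asm/cpufeature.h>;
	 * header consumers now see only a prototype: */
	u64 read_zcr_features(void);

	/* ... with the definition moved out of line into fpsimd.c,
	 * compiled only when CONFIG_ARM64_SVE=y. */
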
Acked-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6 09/15] arm64/sve: Switch sve_pffr() argument from task to thread
  2018-05-08 16:44   ` Dave Martin
@ 2018-05-09  8:44     ` Marc Zyngier
  -1 siblings, 0 replies; 62+ messages in thread
From: Marc Zyngier @ 2018-05-09  8:44 UTC (permalink / raw)
  To: Dave Martin, kvmarm
  Cc: Catalin Marinas, Christoffer Dall, linux-arm-kernel, Ard Biesheuvel

On 08/05/18 17:44, Dave Martin wrote:
> sve_pffr(), which is used to derive the base address used for
> low-level SVE save/restore routines, currently takes the relevant
> task_struct as an argument.
> 
> The only accessed fields are actually part of thread_struct, so
> this patch changes the argument type accordingly.  This is done in
> preparation for moving this function to a header, where we do not
> want to have to include <linux/sched.h> due to the consequent
> circular #include problems.
> 
> No functional change.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
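
The change amounts to the following (reconstructed here for illustration,
not quoted from the patch):

-static void *sve_pffr(struct task_struct *task)
+static void *sve_pffr(struct thread_struct *thread)
 {
-	return (char *)task->thread.sve_state +
-		sve_ffr_offset(task->thread.sve_vl);
+	return (char *)thread->sve_state +
+		sve_ffr_offset(thread->sve_vl);
 }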

Acked-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6 10/15] arm64/sve: Move sve_pffr() to fpsimd.h and make inline
  2018-05-08 16:44   ` Dave Martin
@ 2018-05-09  8:46     ` Marc Zyngier
  -1 siblings, 0 replies; 62+ messages in thread
From: Marc Zyngier @ 2018-05-09  8:46 UTC (permalink / raw)
  To: Dave Martin, kvmarm
  Cc: Catalin Marinas, Christoffer Dall, linux-arm-kernel, Ard Biesheuvel

On 08/05/18 17:44, Dave Martin wrote:
> In order to make sve_save_state()/sve_load_state() more easily
> reusable and to get rid of a potential branch on context switch
> critical paths, this patch makes sve_pffr() inline and moves it to
> fpsimd.h.
> 
> <asm/processor.h> must be included in fpsimd.h in order to make
> this work, and this creates an #include cycle that is tricky to
> avoid without modifying core code, due to the way the PR_SVE_*()
> prctl helpers are included in the core prctl implementation.
> 
> Instead of breaking the cycle, this patch defers inclusion of
> <asm/fpsimd.h> in <asm/processor.h> until the point where it is
> actually needed: i.e., immediately before the prctl definitions.
> 
> No functional change.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
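
The resulting arrangement, roughly (a sketch; the SVE_SET_VL()/SVE_GET_VL()
names come from the existing prctl plumbing):

	/* at the bottom of <asm/processor.h>, immediately before the
	 * prctl definitions rather than at the top of the file: */
	#include <asm/fpsimd.h>

	#define SVE_SET_VL(arg)	sve_set_current_vl(arg)
	#define SVE_GET_VL()	sve_get_current_vl()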

Acked-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6 11/15] KVM: arm64: Save host SVE context as appropriate
  2018-05-08 16:44   ` Dave Martin
@ 2018-05-09  8:50     ` Marc Zyngier
  -1 siblings, 0 replies; 62+ messages in thread
From: Marc Zyngier @ 2018-05-09  8:50 UTC (permalink / raw)
  To: Dave Martin, kvmarm
  Cc: Catalin Marinas, Christoffer Dall, linux-arm-kernel, Ard Biesheuvel

On 08/05/18 17:44, Dave Martin wrote:
> This patch adds SVE context saving to the hyp FPSIMD context switch
> path.  This means that it is no longer necessary to save the host
> SVE state in advance of entering the guest, when in use.
> 
> In order to avoid adding pointless complexity to the code, VHE is
> assumed if SVE is in use.  VHE is an architectural prerequisite for
> SVE, so there is no good reason to turn CONFIG_ARM64_VHE off in
> kernels that support both SVE and KVM.
> 
> Historically, software models exist that can expose the
> architecturally invalid configuration of SVE without VHE, so if
> this situation is detected this patch warns and refuses to create a
> VM.  Doing this check at VM creation time avoids race issues
> between KVM and SVE initialisation.

I don't think the commit log now reflects the code, since we bail out at
init time and disable KVM altogether.

> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
> 
> ---
> 
> Changes since v5:
> 
> Requested by Marc Zyngier:
> 
>  * Migrate to flag bits in vcpu_arch.flags instead of adding bools in
>    vcpu_arch.
> 
>  * Move system_supports_sve() && !has_vhe() sanity check to
>    kvm_arch_init() instead of doing it each time a VM is created.
>    cpufeatures initialisation should be complete before any
>    module_init() call runs.
> ---
>  arch/arm64/Kconfig          |  7 +++++++
>  arch/arm64/kvm/fpsimd.c     |  1 -
>  arch/arm64/kvm/hyp/switch.c | 22 ++++++++++++++++++++--
>  virt/kvm/arm/arm.c          | 18 ++++++++++++++++++
>  4 files changed, 45 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index eb2cf49..b0d3820 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1130,6 +1130,7 @@ endmenu
>  config ARM64_SVE
>  	bool "ARM Scalable Vector Extension support"
>  	default y
> +	depends on !KVM || ARM64_VHE
>  	help
>  	  The Scalable Vector Extension (SVE) is an extension to the AArch64
>  	  execution state which complements and extends the SIMD functionality
> @@ -1155,6 +1156,12 @@ config ARM64_SVE
>  	  booting the kernel.  If unsure and you are not observing these
>  	  symptoms, you should assume that it is safe to say Y.
>  
> +	  CPUs that support SVE are architecturally required to support the
> +	  Virtualization Host Extensions (VHE), so the kernel makes no
> +	  provision for supporting SVE alongside KVM without VHE enabled.
> +	  Thus, you will need to enable CONFIG_ARM64_VHE if you want to support
> +	  KVM in the same kernel image.
> +
>  config ARM64_MODULE_PLTS
>  	bool
>  	select HAVE_MOD_ARCH_SPECIFIC
> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> index 7059275..4c47b55 100644
> --- a/arch/arm64/kvm/fpsimd.c
> +++ b/arch/arm64/kvm/fpsimd.c
> @@ -60,7 +60,6 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
>   */
>  void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
>  {
> -	BUG_ON(system_supports_sve());
>  	BUG_ON(!current->mm);
>  
>  	vcpu->arch.flags &= ~(KVM_ARM64_FP_ENABLED | KVM_ARM64_HOST_SVE_IN_USE);
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index 75c3b65..6826f2d 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -21,12 +21,14 @@
>  
>  #include <kvm/arm_psci.h>
>  
> +#include <asm/cpufeature.h>
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_emulate.h>
>  #include <asm/kvm_hyp.h>
>  #include <asm/kvm_mmu.h>
>  #include <asm/fpsimd.h>
>  #include <asm/debug-monitors.h>
> +#include <asm/processor.h>
>  #include <asm/thread_info.h>
>  
>  static bool __hyp_text update_fp_enabled(struct kvm_vcpu *vcpu)
> @@ -328,6 +330,8 @@ static bool __hyp_text __skip_instr(struct kvm_vcpu *vcpu)
>  void __hyp_text __hyp_switch_fpsimd(u64 esr __always_unused,
>  				    struct kvm_vcpu *vcpu)
>  {
> +	struct user_fpsimd_state *host_fpsimd = vcpu->arch.host_fpsimd_state;
> +
>  	if (has_vhe())
>  		write_sysreg(read_sysreg(cpacr_el1) | CPACR_EL1_FPEN,
>  			     cpacr_el1);
> @@ -337,8 +341,22 @@ void __hyp_text __hyp_switch_fpsimd(u64 esr __always_unused,
>  
>  	isb();
>  
> -	if (vcpu->arch.host_fpsimd_state) {
> -		__fpsimd_save_state(vcpu->arch.host_fpsimd_state);
> +	if (host_fpsimd) {
> +		/*
> +		 * In the SVE case, VHE is assumed: it is enforced by
> +		 * Kconfig and kvm_arch_init_vm().

Nit: this is now kvm_arch_init().

> +		 */
> +		if (system_supports_sve() &&
> +		    (vcpu->arch.flags & KVM_ARM64_HOST_SVE_IN_USE)) {
> +			struct thread_struct *thread = container_of(
> +				host_fpsimd,
> +				struct thread_struct, uw.fpsimd_state);
> +
> +			sve_save_state(sve_pffr(thread), &host_fpsimd->fpsr);
> +		} else {
> +			__fpsimd_save_state(vcpu->arch.host_fpsimd_state);
> +		}
> +
>  		vcpu->arch.host_fpsimd_state = NULL;
>  	}
>  
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index bee226c..379e8a9 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -16,6 +16,7 @@
>   * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
>   */
>  
> +#include <linux/bug.h>
>  #include <linux/cpu_pm.h>
>  #include <linux/errno.h>
>  #include <linux/err.h>
> @@ -41,6 +42,7 @@
>  #include <asm/mman.h>
>  #include <asm/tlbflush.h>
>  #include <asm/cacheflush.h>
> +#include <asm/cpufeature.h>
>  #include <asm/virt.h>
>  #include <asm/kvm_arm.h>
>  #include <asm/kvm_asm.h>
> @@ -1582,6 +1584,22 @@ int kvm_arch_init(void *opaque)
>  		}
>  	}
>  
> +	/*
> +	 * VHE is a prerequisite for SVE in the Arm architecture, and
> +	 * Kconfig ensures that if system_supports_sve() here then
> +	 * CONFIG_ARM64_VHE is enabled, so if VHE support wasn't already
> +	 * detected and enabled, the CPU is architecturally
> +	 * noncompliant.
> +	 *
> +	 * Just in case this mismatch is seen, detect it, warn and give
> +	 * up.  Supporting this forbidden configuration in Hyp would be
> +	 * pointless.
> +	 */
> +	if (system_supports_sve() && !has_vhe()) {
> +		kvm_pr_unimpl("SVE system without VHE unsupported.  Broken cpu?");
> +		return -ENODEV;
> +	}
> +
>  	err = init_common_resources();
>  	if (err)
>  		return err;
> 

Otherwise:

Acked-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6 12/15] KVM: arm64: Remove eager host SVE state saving
  2018-05-08 16:44   ` Dave Martin
@ 2018-05-09  8:56     ` Marc Zyngier
  -1 siblings, 0 replies; 62+ messages in thread
From: Marc Zyngier @ 2018-05-09  8:56 UTC (permalink / raw)
  To: Dave Martin, kvmarm
  Cc: Catalin Marinas, Christoffer Dall, linux-arm-kernel, Ard Biesheuvel

On 08/05/18 17:44, Dave Martin wrote:
> Now that the host SVE context can be saved on demand from Hyp,
> there is no longer any need to save this state in advance before
> entering the guest.
> 
> This patch removes the relevant call to
> kvm_fpsimd_flush_cpu_state().
> 
> Since the problem that function was intended to solve now no longer
> exists, the function and its dependencies are also deleted.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Acked-by: Christoffer Dall <christoffer.dall@arm.com>

Acked-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6 14/15] KVM: arm64: Fold redundant exit code checks out of fixup_guest_exit()
  2018-05-08 16:44   ` Dave Martin
@ 2018-05-09  9:05     ` Marc Zyngier
  -1 siblings, 0 replies; 62+ messages in thread
From: Marc Zyngier @ 2018-05-09  9:05 UTC (permalink / raw)
  To: Dave Martin, kvmarm
  Cc: Catalin Marinas, Christoffer Dall, linux-arm-kernel, Ard Biesheuvel

On 08/05/18 17:44, Dave Martin wrote:
> The entire tail of fixup_guest_exit() is contained in if statements
> of the form if (x && *exit_code == ARM_EXCEPTION_TRAP).  As a result,
> we can check just once and bail out of the function early, allowing
> the remaining if conditions to be simplified.
> 
> The only awkward case is where *exit_code is changed to
> ARM_EXCEPTION_EL1_SERROR in the case of an illegal GICv2 CPU
> interface access: in that case, the GICv3 trap handling code is
> skipped using a goto.  This avoids pointlessly evaluating the
> static branch check for the GICv3 case, even though we can't have
> vgic_v2_cpuif_trap and vgic_v3_cpuif_trap true simultaneously
> unless we have a GICv3 and GICv2 on the host: that sounds stupid,
> but I haven't satisfied myself that it can't happen.
> 
> No functional change.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
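
Schematically, the rewrite looks like this (illustrative pseudocode; the
cond_*/handle_*() names are placeholders, not the literal code):

	/* before: every tail check repeats the exit-code test */
	if (cond_a && *exit_code == ARM_EXCEPTION_TRAP)
		handle_a(vcpu);
	if (cond_b && *exit_code == ARM_EXCEPTION_TRAP)
		handle_b(vcpu);

	/* after: test once, bail out early, simplify the rest */
	if (*exit_code != ARM_EXCEPTION_TRAP)
		goto exit;

	if (cond_a)
		handle_a(vcpu);
	if (cond_b)
		handle_b(vcpu);
exit:
	return true;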

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6 01/15] thread_info: Add update_thread_flag() helpers
  2018-05-08 16:44   ` Dave Martin
@ 2018-05-09  9:07     ` Marc Zyngier
  -1 siblings, 0 replies; 62+ messages in thread
From: Marc Zyngier @ 2018-05-09  9:07 UTC (permalink / raw)
  To: Dave Martin, kvmarm
  Cc: Christoffer Dall, Ard Biesheuvel, Peter Zijlstra,
	Catalin Marinas, Oleg Nesterov, Steven Rostedt, Ingo Molnar,
	linux-arm-kernel

On 08/05/18 17:44, Dave Martin wrote:
> There are a number of bits of code sprinkled around the kernel to
> set a thread flag if a certain condition is true, and clear it
> otherwise.
> 
> To help make those call sites terser and less cumbersome, this
> patch adds a new family of thread flag manipulators
> 
> 	update*_thread_flag([...,] flag, cond)
> 
> which do the equivalent of:
> 
> 	if (cond)
> 		set*_thread_flag([...,] flag);
> 	else
> 		clear*_thread_flag([...,] flag);
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Oleg Nesterov <oleg@redhat.com>
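
For reference, a minimal implementation matching the semantics described above could look like this (a sketch; the task-targeting variant follows the same pattern):

	static inline void update_thread_flag(int flag, bool value)
	{
		if (value)
			set_thread_flag(flag);
		else
			clear_thread_flag(flag);
	}

	static inline void update_tsk_thread_flag(struct task_struct *tsk,
						  int flag, bool value)
	{
		if (value)
			set_tsk_thread_flag(tsk, flag);
		else
			clear_tsk_thread_flag(tsk, flag);
	}
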
Acked-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

* Re: [PATCH v6 02/15] arm64: Use update{,_tsk}_thread_flag()
  2018-05-08 16:44   ` Dave Martin
@ 2018-05-09  9:09     ` Marc Zyngier
  -1 siblings, 0 replies; 62+ messages in thread
From: Marc Zyngier @ 2018-05-09  9:09 UTC
  To: Dave Martin, kvmarm
  Cc: Catalin Marinas, Will Deacon, Christoffer Dall, linux-arm-kernel,
	Ard Biesheuvel

On 08/05/18 17:44, Dave Martin wrote:
> This patch uses the new update_thread_flag() helpers to simplify a
> couple of if () set; else clear; constructs.
> 
> No functional change.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
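
As a concrete example, a call site of the shape this patch targets becomes (the flag and condition shown are illustrative, not a specific hunk from the patch):

	/* before */
	if (wrong_task || wrong_cpu)
		set_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE);
	else
		clear_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE);

	/* after */
	update_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE,
			       wrong_task || wrong_cpu);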

Acked-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

* Re: [PATCH v6 07/15] KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing
  2018-05-09  8:24     ` Marc Zyngier
@ 2018-05-09  9:17       ` Dave Martin
  -1 siblings, 0 replies; 62+ messages in thread
From: Dave Martin @ 2018-05-09  9:17 UTC
  To: Marc Zyngier
  Cc: Catalin Marinas, Ard Biesheuvel, kvmarm, linux-arm-kernel,
	Christoffer Dall

On Wed, May 09, 2018 at 09:24:36AM +0100, Marc Zyngier wrote:
> On 08/05/18 17:44, Dave Martin wrote:
> > This patch refactors KVM to align the host and guest FPSIMD
> > save/restore logic with each other for arm64.  This reduces the

[...]

> > diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> > index 9699c13..6429eb6 100644
> > --- a/arch/arm64/include/asm/kvm_asm.h
> > +++ b/arch/arm64/include/asm/kvm_asm.h
> > @@ -33,6 +33,8 @@
> >  /* vcpu_arch flags field values: */
> >  #define KVM_ARM64_DEBUG_DIRTY_SHIFT	0
> >  #define KVM_ARM64_DEBUG_DIRTY		(1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
> 
> At some point, we could remove KVM_ARM64_DEBUG_DIRTY_SHIFT as it is not
> used anymore (since 1ea66d27e7b0). Do not respin on this account though.

I wondered about this...  I presume this was used from assembly once
upon a time.

Since I need to repost for something else anyway, shall I delete that
and move the defines to kvm_host.h?  There's no longer an obvious reason
for them to be in a different header.

[...]

> > diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> > new file mode 100644
> > index 0000000..7059275
> > --- /dev/null
> > +++ b/arch/arm64/kvm/fpsimd.c
> > @@ -0,0 +1,111 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * arch/arm64/kvm/fpsimd.c: Guest/host FPSIMD context coordination helpers
> > + *
> > + * Copyright 2018 Arm Limited
> > + * Author: Dave Martin <Dave.Martin@arm.com>
> > + */
> > +#include <linux/bottom_half.h>
> > +#include <linux/sched.h>
> > +#include <linux/thread_info.h>
> > +#include <linux/kvm_host.h>
> > +#include <asm/kvm_asm.h>
> > +#include <asm/kvm_host.h>
> > +#include <asm/kvm_mmu.h>
> > +
> > +/*
> > + * Called on entry to KVM_RUN unless this vcpu previously ran at least
> > + * once and the most recent prior KVM_RUN for this vcpu was called from
> > + * the same task as current (highly likely).
> > + *
> > + * This is guaranteed to execute before kvm_arch_vcpu_load_fp(vcpu),
> > + * such that on entering hyp the relevant parts of current are already
> > + * mapped.
> > + */
> > +int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
> > +{
> > +	int ret;
> > +
> > +	struct thread_info *ti = &current->thread_info;
> > +	struct user_fpsimd_state *fpsimd = &current->thread.uw.fpsimd_state;
> > +
> > +	/*
> > +	 * Make sure the host task thread flags and fpsimd state are
> > +	 * visible to hyp:
> > +	 */
> > +	ret = create_hyp_mappings(ti, ti + 1, PAGE_HYP);
> > +	if (ret)
> > +		goto error;
> > +
> > +	ret = create_hyp_mappings(fpsimd, fpsimd + 1, PAGE_HYP);
> > +	if (ret)
> > +		goto error;
> > +
> > +	vcpu->arch.host_thread_info = kern_hyp_va(ti);
> > +	vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);
> 
> Since we assign host_fpsimd_state here, and that "map" is guaranteed to
> execute before "load"...
> 
> > +error:
> > +	return ret;
> > +}
> > +
> > +/*
> > + * Prepare vcpu for saving the host's FPSIMD state and loading the guest's.
> > + * The actual loading is done by the FPSIMD access trap taken to hyp.
> > + *
> > + * Here, we just set the correct metadata to indicate that the FPSIMD
> > + * state in the cpu regs (if any) belongs to current, and where to write
> > + * it back to if/when a FPSIMD access trap is taken.
> > + *
> > + * TIF_SVE is backed up here, since it may get clobbered with guest state.
> > + * This flag is restored by kvm_arch_vcpu_put_fp(vcpu).
> > + */
> > +void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
> > +{
> > +	BUG_ON(system_supports_sve());
> > +	BUG_ON(!current->mm);
> > +
> > +	vcpu->arch.flags &= ~(KVM_ARM64_FP_ENABLED | KVM_ARM64_HOST_SVE_IN_USE);
> > +	if (test_thread_flag(TIF_SVE))
> > +		vcpu->arch.flags |= KVM_ARM64_HOST_SVE_IN_USE;
> > +
> > +	vcpu->arch.host_fpsimd_state =
> > +		kern_hyp_va(&current->thread.uw.fpsimd_state);
> 
> ... what is the purpose of this assignment? I don't think it is harmful
> in any way, but I'm just wondering if I'm missing something in the actual flow.

The pointer will be NULLed by __hyp_switch_fpsimd() when loading the
guest state in, to indicate that there is no longer any host state to
save.  Perhaps it would have been clearer to have a separate flag to
represent this change of state.

As things stand I think you're right: the assignments are redundant.  
We certainly need the pointer to reflect the current task, which is
the motivation for doing it here -- but in fact we need to do it
more often, at every vcpu_load().

I think the most appropriate thing is to delete the assignment from
_map_fp().

The alternative would be to add a flag for the "no host state to save"
case, and assign host_fpsimd_state only in the pid_change hook.

[...]

> > +/*
> > + * Write back the vcpu FPSIMD regs if they are dirty, and invalidate the
> > + * cpu FPSIMD regs so that they can't be spuriously reused if this vcpu
> > + * disappears and another task or vcpu appears that recycles the same
> > + * struct fpsimd_state.
> > + */
> > +void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
> > +{
> > +	local_bh_disable();
> > +
> > +	if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
> > +		fpsimd_save();
> > +		fpsimd_flush_cpu_state();
> > +		set_thread_flag(TIF_FOREIGN_FPSTATE);
> > +	}
> > +
> > +	update_thread_flag(TIF_SVE,
> > +			   vcpu->arch.flags & KVM_ARM64_HOST_SVE_IN_USE);
> 
> Does this indicate anything about the validity of the SVE registers? Or
> is it for the sole purpose of userspace not having to trap to re-enable SVE?

This is the host task's "SVE registers exist" flag.  It's essential to
restore this correctly before re-enabling preemption, because it
controls where the regs are saved/restored from:
current->thread.uw.fpsimd_state, or the dynamically allocated buffer
current->thread.sve_state (TIF_SVE set indicates the latter).

TIF_FOREIGN_FPSTATE is orthogonal to this, and indicates whether the
regs are actually loaded in the hardware.  Obviously they aren't if
we had loaded the guest's regs meanwhile.
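
Concretely, the host save path picks its destination based on TIF_SVE along these lines (a simplified sketch of the fpsimd_save() logic, not verbatim kernel code):

	if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
		if (system_supports_sve() && test_thread_flag(TIF_SVE))
			/* full SVE state, to the sve_state buffer */
			sve_save_state(sve_pffr(&current->thread),
				       &current->thread.uw.fpsimd_state.fpsr);
		else
			/* plain FPSIMD state */
			fpsimd_save_state(&current->thread.uw.fpsimd_state);
	}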

[...]

> > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> > index c0796c4..75c3b65 100644
> > --- a/arch/arm64/kvm/hyp/switch.c
> > +++ b/arch/arm64/kvm/hyp/switch.c
> > @@ -27,15 +27,16 @@
> >  #include <asm/kvm_mmu.h>
> >  #include <asm/fpsimd.h>
> >  #include <asm/debug-monitors.h>
> > +#include <asm/thread_info.h>
> >  
> > -static bool __hyp_text __fpsimd_enabled_nvhe(void)
> > +static bool __hyp_text update_fp_enabled(struct kvm_vcpu *vcpu)
> >  {
> > -	return !(read_sysreg(cptr_el2) & CPTR_EL2_TFP);
> > -}
> > +	if (vcpu->arch.host_thread_info->flags & _TIF_FOREIGN_FPSTATE) {
> > +		vcpu->arch.host_fpsimd_state = NULL;
> > +		vcpu->arch.flags &= ~KVM_ARM64_FP_ENABLED;
> > +	}
> 
> I must confess I don't get this bit. If we started off with an invalid
> host state (which is how I interpret TIF_FOREIGN_FPSTATE), why didn't we
> simply zero the pointer at load time? The thread flag cannot change in
> the meantime, right? Or did I get that wrong?

The host kernel can take interrupts in the run loop while outside the
guest, causing this flag to get set.  So we really do need to re-check
on each entry to the guest.

I can add a one-line comment documenting the purpose of this function,
which should help understanding here, something like:

/* Check whether the FP regs were dirtied while in the host-side run loop */
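
With that comment in place, the function as a whole would read something like this (a sketch: the return path is inferred, since the quoted diff above is truncated):

	/* Check whether the FP regs were dirtied while in the host-side run loop */
	static bool __hyp_text update_fp_enabled(struct kvm_vcpu *vcpu)
	{
		if (vcpu->arch.host_thread_info->flags & _TIF_FOREIGN_FPSTATE) {
			vcpu->arch.host_fpsimd_state = NULL;
			vcpu->arch.flags &= ~KVM_ARM64_FP_ENABLED;
		}

		return !!(vcpu->arch.flags & KVM_ARM64_FP_ENABLED);
	}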

[...]

> It otherwise looks good, and I don't think the above nits are
> problematic on their own.
> 
> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>

Thanks.

I will probably address the nits since I need to fix another build break
issue with CONFIG_KVM=n.  That will be a pretty trivial respin.

Happy to keep your Reviewed-by with the discussed changes?

Cheers
---Dave

* Re: [PATCH v6 11/15] KVM: arm64: Save host SVE context as appropriate
  2018-05-09  8:50     ` Marc Zyngier
@ 2018-05-09  9:30       ` Dave Martin
  -1 siblings, 0 replies; 62+ messages in thread
From: Dave Martin @ 2018-05-09  9:30 UTC
  To: Marc Zyngier
  Cc: Catalin Marinas, Ard Biesheuvel, kvmarm, linux-arm-kernel,
	Christoffer Dall

On Wed, May 09, 2018 at 09:50:26AM +0100, Marc Zyngier wrote:
> On 08/05/18 17:44, Dave Martin wrote:
> > This patch adds SVE context saving to the hyp FPSIMD context switch
> > path.  This means that it is no longer necessary to save the host
> > SVE state in advance of entering the guest, when in use.
> > 
> > In order to avoid adding pointless complexity to the code, VHE is
> > assumed if SVE is in use.  VHE is an architectural prerequisite for
> > SVE, so there is no good reason to turn CONFIG_ARM64_VHE off in
> > kernels that support both SVE and KVM.
> > 
> > Historically, software models exist that can expose the
> > architecturally invalid configuration of SVE without VHE, so if
> > this situation is detected this patch warns and refuses to create a
> > VM.  Doing this check at VM creation time avoids race issues
> > between KVM and SVE initialisation.
> 
> I don't think the commit log now reflects the code, since we bail out at
> init time and disable KVM altogether.

Will fix to say that we simply disable KVM completely at kvm_init() time
in this case.

[...]

> > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c

[...]

> > @@ -337,8 +341,22 @@ void __hyp_text __hyp_switch_fpsimd(u64 esr __always_unused,
> >  
> >  	isb();
> >  
> > -	if (vcpu->arch.host_fpsimd_state) {
> > -		__fpsimd_save_state(vcpu->arch.host_fpsimd_state);
> > +	if (host_fpsimd) {
> > +		/*
> > +		 * In the SVE case, VHE is assumed: it is enforced by
> > +		 * Kconfig and kvm_arch_init_vm().
> 
> Nit: this is now kvm_arch_init().

Will fix.
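
For context, the hyp-side host save being discussed takes roughly this shape (a sketch based on the quoted hunk; host_fpsimd stands for the vcpu's saved host_fpsimd_state pointer):

	if (host_fpsimd) {
		/*
		 * In the SVE case, VHE is assumed: it is enforced by
		 * Kconfig and kvm_arch_init().
		 */
		if (system_supports_sve() &&
		    (vcpu->arch.flags & KVM_ARM64_HOST_SVE_IN_USE)) {
			struct thread_struct *thread = container_of(
				host_fpsimd,
				struct thread_struct, uw.fpsimd_state);

			sve_save_state(sve_pffr(thread), &host_fpsimd->fpsr);
		} else {
			__fpsimd_save_state(host_fpsimd);
		}

		vcpu->arch.flags &= ~KVM_ARM64_FP_ENABLED;
	}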

[...]

> Otherwise:
> 
> Acked-by: Marc Zyngier <marc.zyngier@arm.com>

Thanks
---Dave

* Re: [PATCH v6 07/15] KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing
  2018-05-09  9:17       ` Dave Martin
@ 2018-05-09  9:46         ` Marc Zyngier
  -1 siblings, 0 replies; 62+ messages in thread
From: Marc Zyngier @ 2018-05-09  9:46 UTC
  To: Dave Martin
  Cc: Catalin Marinas, Ard Biesheuvel, kvmarm, linux-arm-kernel,
	Christoffer Dall

On 09/05/18 10:17, Dave Martin wrote:
> On Wed, May 09, 2018 at 09:24:36AM +0100, Marc Zyngier wrote:
>> On 08/05/18 17:44, Dave Martin wrote:
>>> This patch refactors KVM to align the host and guest FPSIMD
>>> save/restore logic with each other for arm64.  This reduces the
> 
> [...]
> 
>>> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
>>> index 9699c13..6429eb6 100644
>>> --- a/arch/arm64/include/asm/kvm_asm.h
>>> +++ b/arch/arm64/include/asm/kvm_asm.h
>>> @@ -33,6 +33,8 @@
>>>  /* vcpu_arch flags field values: */
>>>  #define KVM_ARM64_DEBUG_DIRTY_SHIFT	0
>>>  #define KVM_ARM64_DEBUG_DIRTY		(1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
>>
>> At some point, we could remove KVM_ARM64_DEBUG_DIRTY_SHIFT as it is not
>> used anymore (since 1ea66d27e7b0). Do not respin on this account though.
> 
> I wondered about this...  I presume this was used from assembly once
> upon a time.
> 
> Since I need to repost for something else anyway, shall I delete that
> and move the defines to kvm_host.h?  There's no longer an obvious reason
> for them to be in a different header.

Yes, please.

> [...]
> 
>>> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
>>> new file mode 100644
>>> index 0000000..7059275
>>> --- /dev/null
>>> +++ b/arch/arm64/kvm/fpsimd.c
>>> @@ -0,0 +1,111 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +/*
>>> + * arch/arm64/kvm/fpsimd.c: Guest/host FPSIMD context coordination helpers
>>> + *
>>> + * Copyright 2018 Arm Limited
>>> + * Author: Dave Martin <Dave.Martin@arm.com>
>>> + */
>>> +#include <linux/bottom_half.h>
>>> +#include <linux/sched.h>
>>> +#include <linux/thread_info.h>
>>> +#include <linux/kvm_host.h>
>>> +#include <asm/kvm_asm.h>
>>> +#include <asm/kvm_host.h>
>>> +#include <asm/kvm_mmu.h>
>>> +
>>> +/*
>>> + * Called on entry to KVM_RUN unless this vcpu previously ran at least
>>> + * once and the most recent prior KVM_RUN for this vcpu was called from
>>> + * the same task as current (highly likely).
>>> + *
>>> + * This is guaranteed to execute before kvm_arch_vcpu_load_fp(vcpu),
>>> + * such that on entering hyp the relevant parts of current are already
>>> + * mapped.
>>> + */
>>> +int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
>>> +{
>>> +	int ret;
>>> +
>>> +	struct thread_info *ti = &current->thread_info;
>>> +	struct user_fpsimd_state *fpsimd = &current->thread.uw.fpsimd_state;
>>> +
>>> +	/*
>>> +	 * Make sure the host task thread flags and fpsimd state are
>>> +	 * visible to hyp:
>>> +	 */
>>> +	ret = create_hyp_mappings(ti, ti + 1, PAGE_HYP);
>>> +	if (ret)
>>> +		goto error;
>>> +
>>> +	ret = create_hyp_mappings(fpsimd, fpsimd + 1, PAGE_HYP);
>>> +	if (ret)
>>> +		goto error;
>>> +
>>> +	vcpu->arch.host_thread_info = kern_hyp_va(ti);
>>> +	vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);
>>
>> Since we assign host_fpsimd_state here, and that "map" is guaranteed to
>> execute before "load"...
>>
>>> +error:
>>> +	return ret;
>>> +}
>>> +
>>> +/*
>>> + * Prepare vcpu for saving the host's FPSIMD state and loading the guest's.
>>> + * The actual loading is done by the FPSIMD access trap taken to hyp.
>>> + *
>>> + * Here, we just set the correct metadata to indicate that the FPSIMD
>>> + * state in the cpu regs (if any) belongs to current, and where to write
>>> + * it back to if/when a FPSIMD access trap is taken.
>>> + *
>>> + * TIF_SVE is backed up here, since it may get clobbered with guest state.
>>> + * This flag is restored by kvm_arch_vcpu_put_fp(vcpu).
>>> + */
>>> +void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
>>> +{
>>> +	BUG_ON(system_supports_sve());
>>> +	BUG_ON(!current->mm);
>>> +
>>> +	vcpu->arch.flags &= ~(KVM_ARM64_FP_ENABLED | KVM_ARM64_HOST_SVE_IN_USE);
>>> +	if (test_thread_flag(TIF_SVE))
>>> +		vcpu->arch.flags |= KVM_ARM64_HOST_SVE_IN_USE;
>>> +
>>> +	vcpu->arch.host_fpsimd_state =
>>> +		kern_hyp_va(&current->thread.uw.fpsimd_state);
>>
>> ... what is the purpose of this assignment? I don't think it is harmful
>> in any way, but I'm just wondering if I'm missing something in the actual flow.
> 
> The pointer will be NULLed by __hyp_switch_fpsimd() when loading the
> guest state in, to indicate that there is no longer any host state to
> save.  Perhaps it would have been clearer to have a separate flag to
> represent this change of state.
> 
> As things stand I think you're right: the assignments are redundant.  
> We certainly need the pointer to reflect the current task, which is
> the motivation for doing it here -- but in fact we need to do it
> more often, at every vcpu_load().
> 
> I think the most appropriate thing is to delete the assignment from
> _map_fp().
> 
> The alternative would be to add a flag for the "no host state to save"
> case, and assign host_fpsimd_state only in the pid_change hook.

I think I'd prefer the latter. It makes the whole pointer stuff a lot
easier to understand (the linkages are basically static), and using
flags is consistent with the way the rest of the kernel works. I'm not
sure whether we need to add a new flag or simply rely on the existing
thread flag, but that's for you to decide.
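
A sketch of what the flag-based variant could look like (KVM_ARM64_FP_HOST is a hypothetical name for the new flag; nothing here is final):

	/* kvm_arch_vcpu_load_fp(): host state will need saving */
	vcpu->arch.flags |= KVM_ARM64_FP_HOST;

	/* __hyp_switch_fpsimd(): save the host regs once, before
	 * loading the guest's, then mark them as saved: */
	if (vcpu->arch.flags & KVM_ARM64_FP_HOST) {
		__fpsimd_save_state(vcpu->arch.host_fpsimd_state);
		vcpu->arch.flags &= ~KVM_ARM64_FP_HOST;
	}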

> 
> [...]
> 
>>> +/*
>>> + * Write back the vcpu FPSIMD regs if they are dirty, and invalidate the
>>> + * cpu FPSIMD regs so that they can't be spuriously reused if this vcpu
>>> + * disappears and another task or vcpu appears that recycles the same
>>> + * struct fpsimd_state.
>>> + */
>>> +void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
>>> +{
>>> +	local_bh_disable();
>>> +
>>> +	if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
>>> +		fpsimd_save();
>>> +		fpsimd_flush_cpu_state();
>>> +		set_thread_flag(TIF_FOREIGN_FPSTATE);
>>> +	}
>>> +
>>> +	update_thread_flag(TIF_SVE,
>>> +			   vcpu->arch.flags & KVM_ARM64_HOST_SVE_IN_USE);
>>
>> Does this indicate anything about the validity of the SVE registers? Or
>> is it for the sole purpose of userspace not having to trap to re-enable SVE?
> 
> This is the host task's "SVE registers exist" flag.  It's essential to
> restore this correctly before re-enabling preemption, because it
> controls where the regs are saved/restored from:
> current->thread.uw.fpsimd_state, or the dynamically allocated buffer
> current->thread.sve_state (TIF_SVE set indicates the latter).
> 
> TIF_FOREIGN_FPSTATE is orthogonal to this, and indicates whether the
> regs are actually loaded in the hardware.  Obviously they aren't if
> we had loaded the guest's regs meanwhile.

OK, that makes sense now.

> [...]
> 
>>> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
>>> index c0796c4..75c3b65 100644
>>> --- a/arch/arm64/kvm/hyp/switch.c
>>> +++ b/arch/arm64/kvm/hyp/switch.c
>>> @@ -27,15 +27,16 @@
>>>  #include <asm/kvm_mmu.h>
>>>  #include <asm/fpsimd.h>
>>>  #include <asm/debug-monitors.h>
>>> +#include <asm/thread_info.h>
>>>  
>>> -static bool __hyp_text __fpsimd_enabled_nvhe(void)
>>> +static bool __hyp_text update_fp_enabled(struct kvm_vcpu *vcpu)
>>>  {
>>> -	return !(read_sysreg(cptr_el2) & CPTR_EL2_TFP);
>>> -}
>>> +	if (vcpu->arch.host_thread_info->flags & _TIF_FOREIGN_FPSTATE) {
>>> +		vcpu->arch.host_fpsimd_state = NULL;
>>> +		vcpu->arch.flags &= ~KVM_ARM64_FP_ENABLED;
>>> +	}
>>
>> I must confess I don't get this bit. If we started off with an invalid
>> host state (which is how I interpret TIF_FOREIGN_FPSTATE), why didn't we
>> simply zero the pointer at load time? The thread flag cannot change in
>> the meantime, right? Or did I get that wrong?
> 
> The host kernel can take interrupts in the run loop while outside the
> guest, causing this flag to get set.  So we really do need to re-check
> on each entry to the guest.
> 
> I can add a one-line comment documenting the purpose of this function,
> which should help understanding here, something like:
> 
> /* Check whether the FP regs were dirtied while in the host-side run loop */

Yup, looks good.

> [...]
> 
>> It otherwise looks good, and I don't think the above nits are
>> problematic on their own.
>>
>> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
> 
> Thanks.
> 
> I will probably address the nits since I need to fix another build break
> issue with CONFIG_KVM=n.  That will be a pretty trivial respin.
> 
> Happy to keep your Reviewed-by with the discussed changes?
Yes, no problem. I'll eyeball the changes and let you know if I spot
anything.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

* Re: [PATCH v6 07/15] KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing
  2018-05-09  9:46         ` Marc Zyngier
@ 2018-05-09 10:00           ` Dave Martin
  -1 siblings, 0 replies; 62+ messages in thread
From: Dave Martin @ 2018-05-09 10:00 UTC
  To: Marc Zyngier
  Cc: Catalin Marinas, Christoffer Dall, kvmarm, linux-arm-kernel,
	Ard Biesheuvel

On Wed, May 09, 2018 at 10:46:11AM +0100, Marc Zyngier wrote:
> On 09/05/18 10:17, Dave Martin wrote:
> > On Wed, May 09, 2018 at 09:24:36AM +0100, Marc Zyngier wrote:
> >> On 08/05/18 17:44, Dave Martin wrote:
> >>> This patch refactors KVM to align the host and guest FPSIMD
> >>> save/restore logic with each other for arm64.  This reduces the
> > 
> > [...]
> > 
> >>> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> >>> index 9699c13..6429eb6 100644
> >>> --- a/arch/arm64/include/asm/kvm_asm.h
> >>> +++ b/arch/arm64/include/asm/kvm_asm.h
> >>> @@ -33,6 +33,8 @@
> >>>  /* vcpu_arch flags field values: */
> >>>  #define KVM_ARM64_DEBUG_DIRTY_SHIFT	0
> >>>  #define KVM_ARM64_DEBUG_DIRTY		(1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
> >>
> >> At some point, we could remove KVM_ARM64_DEBUG_DIRTY_SHIFT as it is not
> >> used anymore (since 1ea66d27e7b0). Do not respin on this account though.
> > 
> > I wondered about this...  I presume this was used from assembly once
> > upon a time.
> > 
> > Since I need to repost for something else anyway, shall I delete that
> > and move the defines to kvm_host.h?  There's no longer an obvious reason
> > for them to be in a different header.
> 
> Yes, please.

OK, will do.

> >>> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c

[...]

> >>> +void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
> >>> +{
> >>> +	BUG_ON(system_supports_sve());
> >>> +	BUG_ON(!current->mm);
> >>> +
> >>> +	vcpu->arch.flags &= ~(KVM_ARM64_FP_ENABLED | KVM_ARM64_HOST_SVE_IN_USE);
> >>> +	if (test_thread_flag(TIF_SVE))
> >>> +		vcpu->arch.flags |= KVM_ARM64_HOST_SVE_IN_USE;
> >>> +
> >>> +	vcpu->arch.host_fpsimd_state =
> >>> +		kern_hyp_va(&current->thread.uw.fpsimd_state);
> >>
> >> ... what is the purpose of this assignment? I don't think it is harmful
> >> in any way, but I'm just wondering if I'm missing something in the actual flow.
> > 
> > The pointer will be NULLed by __hyp_switch_fpsimd() when loading the
> > guest state in, to indicate that there is no longer any host state to
> > save.  Perhaps it would have been clearer to have a separate flag to
> > represent this change of state.
> > 
> > As things stand I think you're right: the assignments are redundant.  
> > We certainly need the pointer to reflect the current task, which is
> > the motivation for doing it here -- but in fact we need to do it
> > more often, at every vcpu_load().
> > 
> > I think the most appropriate thing is to delete the assignment from
> > _map_fp().
> > 
> > The alternative would be to add a flag for the "no host state to save"
> > case, and assign host_fpsimd_state only in the pid_change hook.
> 
> I think I'd prefer the latter. It makes the whole pointer stuff a lot
> easier to understand (the linkages are basically static), and using
> flags is consistent with the way the rest of the kernel works. I'm not
> sure whether we need to add a new flag or simply rely on the existing
> thread flag, but that's for you to decide.

OK, agreed.  It should definitely help clarity.

[...]

> > Happy to keep your Reviewed-by with the discussed changes?
> Yes, no problem. I'll eyeball the changes and let you know if I spot
> anything.

OK, thanks.

I'll note the changes under the tearoff.

Cheers
---Dave

end of thread, other threads:[~2018-05-09 10:00 UTC | newest]

Thread overview: 62+ messages
2018-05-08 16:44 [PATCH v6 00/15] KVM: arm64: Optimise FPSIMD context switching Dave Martin
2018-05-08 16:44 ` [PATCH v6 01/15] thread_info: Add update_thread_flag() helpers Dave Martin
2018-05-08 17:08   ` Steven Rostedt
2018-05-09  9:07   ` Marc Zyngier
2018-05-08 16:44 ` [PATCH v6 02/15] arm64: Use update{,_tsk}_thread_flag() Dave Martin
2018-05-09  9:09   ` Marc Zyngier
2018-05-08 16:44 ` [PATCH v6 03/15] KVM: arm/arm64: Introduce kvm_arch_vcpu_run_pid_change Dave Martin
2018-05-08 16:44 ` [PATCH v6 04/15] KVM: arm64: Convert lazy FPSIMD context switch trap to C Dave Martin
2018-05-08 16:44 ` [PATCH v6 05/15] arm64: fpsimd: Generalise context saving for non-task contexts Dave Martin
2018-05-08 16:44 ` [PATCH v6 06/15] KVM: arm64: Repurpose vcpu_arch.debug_flags for general-purpose flags Dave Martin
2018-05-09  8:27   ` Marc Zyngier
2018-05-08 16:44 ` [PATCH v6 07/15] KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing Dave Martin
2018-05-09  8:24   ` Marc Zyngier
2018-05-09  9:17     ` Dave Martin
2018-05-09  9:46       ` Marc Zyngier
2018-05-09 10:00         ` Dave Martin
2018-05-08 16:44 ` [PATCH v6 08/15] arm64/sve: Move read_zcr_features() out of cpufeature.h Dave Martin
2018-05-09  8:28   ` Marc Zyngier
2018-05-08 16:44 ` [PATCH v6 09/15] arm64/sve: Switch sve_pffr() argument from task to thread Dave Martin
2018-05-09  8:44   ` Marc Zyngier
2018-05-08 16:44 ` [PATCH v6 10/15] arm64/sve: Move sve_pffr() to fpsimd.h and make inline Dave Martin
2018-05-09  8:46   ` Marc Zyngier
2018-05-08 16:44 ` [PATCH v6 11/15] KVM: arm64: Save host SVE context as appropriate Dave Martin
2018-05-09  8:50   ` Marc Zyngier
2018-05-09  9:30     ` Dave Martin
2018-05-08 16:44 ` [PATCH v6 12/15] KVM: arm64: Remove eager host SVE state saving Dave Martin
2018-05-09  8:56   ` Marc Zyngier
2018-05-08 16:44 ` [PATCH v6 13/15] KVM: arm64: Remove redundant *exit_code changes in fpsimd_guest_exit() Dave Martin
2018-05-08 16:44 ` [PATCH v6 14/15] KVM: arm64: Fold redundant exit code checks out of fixup_guest_exit() Dave Martin
2018-05-09  9:05   ` Marc Zyngier
2018-05-08 16:45 ` [PATCH v6 15/15] KVM: arm64: Invoke FPSIMD context switch trap from C Dave Martin
