[RFC PATCH 00/16] KVM: arm64: Initial support for SVE guests
From: Dave Martin @ 2018-06-21 14:57 UTC
  To: kvmarm
  Cc: Peter Maydell, Okamoto Takayuki, Christoffer Dall,
	Ard Biesheuvel, Marc Zyngier, Catalin Marinas, Will Deacon,
	Alex Bennée, linux-arm-kernel

This series implements basic support for allowing KVM guests to use the
Arm Scalable Vector Extension (SVE).

The patches are based on torvalds/master f5b7769e (Revert "debugfs:
inode: debugfs_create_dir uses mode permission from parent") plus the
patches from [1].

Issues / missing features:

 * No way for userspace to determine or control the set of vector
   lengths exposed to the guest.  This needs to be fixed for
   snapshotting/migration of guests to work reliably.

   An ioctl needs to be added for this (dropped from this series
   because I consider it lower-priority than the core support).

 * No documentation update yet (I'd like to get the interfaces
   finalised before I put too much effort into writing that...)

 * Patch 14 (SVE register ioctl() access core) may be too complicated
   in its handling of backwards compatibility for the FPSIMD regs:
   It might be simpler to use vcpu->arch.ctxt.gp_regs.fp_regs.vregs[]
   as a bounce buffer rather than trying to redirect uaccess directly
   to the appropriate locations in vcpu->sve_state.

   I'd be interested in people's comments on this; see the sketch
   after this list.

 * kvmtool/qemu updates are needed to enable creation of SVE-enabled
   guests (to be discussed separately).
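
As a rough sketch of the bounce-buffer alternative mentioned above
(purely illustrative: get_fpsimd_vreg() and __sync_sve_to_fpsimd()
are hypothetical names, not code from this series):

	/*
	 * Illustrative only: read the FPSIMD view of Vn through the
	 * existing vregs[] storage instead of redirecting uaccess
	 * into vcpu->sve_state.  __sync_sve_to_fpsimd() stands in
	 * for whatever would copy the shared low 128 bits of each
	 * Zn back into vregs[] first.
	 */
	static int get_fpsimd_vreg(struct kvm_vcpu *vcpu, unsigned int n,
				   void __user *uptr)
	{
		__uint128_t buf;

		__sync_sve_to_fpsimd(vcpu);	/* hypothetical helper */
		buf = vcpu->arch.ctxt.gp_regs.fp_regs.vregs[n];

		return copy_to_user(uptr, &buf, sizeof(buf)) ? -EFAULT : 0;
	}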


Brief notes on the patches:

 * Patches 1-4 are miscellaneous preliminary cleanups.

 * Patch 5 adds new arch vcpu init/teardown hooks due to the lack of
   an obvious place to free a vcpu's SVE state buffer.  This may want
   a rethink, because the allocation part has now ended up in
   kvm_reset_vcpu() instead of the new init hook.

 * Patches 6-12 implement the core SVE support for guests (of which
   patches 8-9 refactor the sysregs code to support conditional hiding
   of the SVE-related registers).

 * Patches 13-15 implement ioctl() access for the new SVE registers.

 * Patch 16 exposes the new functionality to userspace, allowing
   SVE-enabled vcpus to be created.


This series is somewhat tested on Arm Juno r0 and the Arm Fast Model
(with/without SVE support).  arch/arm builds, but I've not booted
it -- only some trivial refactoring in this series affects arch/arm.

Cheers
---Dave


[1] [PATCH v2 0/4] KVM: arm64: FPSIMD/SVE fixes for 4.17 [sic]
http://lists.infradead.org/pipermail/linux-arm-kernel/2018-June/584281.html

Dave Martin (16):
  arm64: fpsimd: Always set TIF_FOREIGN_FPSTATE on task state flush
  KVM: arm64: Delete orphaned declaration for __fpsimd_enabled()
  KVM: arm64: Refactor kvm_arm_num_regs() for easier maintenance
  KVM: arm64: Add missing #include of <linux/bitmap.h> to kvm_host.h
  KVM: arm: Add arch init/uninit hooks
  arm64/sve: Determine virtualisation-friendly vector lengths
  arm64/sve: Enable SVE state tracking for non-task contexts
  KVM: arm64: Support dynamically hideable system registers
  KVM: arm64: Allow ID registers to by dynamically read-as-zero
  KVM: arm64: Add a vcpu flag to control SVE visibility for the guest
  KVM: arm64/sve: System register context switch and access support
  KVM: arm64/sve: Context switch the SVE registers
  KVM: Allow 2048-bit register access via KVM_{GET,SET}_ONE_REG
  KVM: arm64/sve: Add SVE support to register access ioctl interface
  KVM: arm64: Enumerate SVE register indices for KVM_GET_REG_LIST
  KVM: arm64/sve: Report and enable SVE API extensions for userspace

 arch/arm/include/asm/kvm_host.h   |   4 +-
 arch/arm64/include/asm/fpsimd.h   |   4 +-
 arch/arm64/include/asm/kvm_host.h |  18 ++-
 arch/arm64/include/asm/kvm_hyp.h  |   1 -
 arch/arm64/include/asm/sysreg.h   |   3 +
 arch/arm64/include/uapi/asm/kvm.h |  11 ++
 arch/arm64/kernel/cpufeature.c    |   2 +-
 arch/arm64/kernel/fpsimd.c        | 131 +++++++++++++---
 arch/arm64/kernel/signal.c        |   5 -
 arch/arm64/kvm/fpsimd.c           |   7 +-
 arch/arm64/kvm/guest.c            | 321 +++++++++++++++++++++++++++++++++++---
 arch/arm64/kvm/hyp/switch.c       |  43 +++--
 arch/arm64/kvm/hyp/sysreg-sr.c    |   5 +
 arch/arm64/kvm/reset.c            |  14 ++
 arch/arm64/kvm/sys_regs.c         |  73 ++++++---
 arch/arm64/kvm/sys_regs.h         |  22 +++
 include/uapi/linux/kvm.h          |   1 +
 virt/kvm/arm/arm.c                |  13 +-
 18 files changed, 587 insertions(+), 91 deletions(-)

-- 
2.1.4


[RFC PATCH 01/16] arm64: fpsimd: Always set TIF_FOREIGN_FPSTATE on task state flush
From: Dave Martin @ 2018-06-21 14:57 UTC
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

This patch updates fpsimd_flush_task_state() to mirror the new
semantics of fpsimd_flush_cpu_state(): both functions now
implicitly set TIF_FOREIGN_FPSTATE to indicate that the task's
FPSIMD state is not loaded into the cpu.

As a side-effect, fpsimd_flush_task_state() now sets
TIF_FOREIGN_FPSTATE even for non-running tasks.  In the case of
non-running tasks this is not useful but also harmless, because the
flag is live only while the corresponding task is running.  This
function is not called from fast paths, so special-casing this for
the task == current case is not really worth it.

Compiler barriers previously present in restore_sve_fpsimd_context()
are pulled into fpsimd_flush_task_state() so that it can be safely
called with preemption enabled if necessary.

Explicit calls to set TIF_FOREIGN_FPSTATE that accompany
fpsimd_flush_task_state() calls are now redundant and are removed
as appropriate.

fpsimd_flush_task_state() is used to get exclusive access to the
representation of the task's state via task_struct, for the purpose
of replacing the state.  Thus, the call to this function should
happen before manipulating fpsimd_state or sve_state etc. in
task_struct.  Anomalous cases are reordered appropriately in order
to make the code more consistent, although there should be no
functional difference since these cases are protected by
local_bh_disable() anyway.
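
To illustrate, the intended calling pattern now looks like this (a
minimal sketch, mirroring the fpsimd_flush_thread() hunk below):

	fpsimd_flush_task_state(current); /* also sets TIF_FOREIGN_FPSTATE */

	/* Only now is it safe to replace the saved state: */
	memset(&current->thread.uw.fpsimd_state, 0,
	       sizeof(current->thread.uw.fpsimd_state));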

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/kernel/fpsimd.c | 25 +++++++++++++++++++------
 arch/arm64/kernel/signal.c |  5 -----
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 84c68b1..6b1ddae 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -569,7 +569,6 @@ int sve_set_vector_length(struct task_struct *task,
 		local_bh_disable();
 
 		fpsimd_save();
-		set_thread_flag(TIF_FOREIGN_FPSTATE);
 	}
 
 	fpsimd_flush_task_state(task);
@@ -835,12 +834,11 @@ asmlinkage void do_sve_acc(unsigned int esr, struct pt_regs *regs)
 	local_bh_disable();
 
 	fpsimd_save();
-	fpsimd_to_sve(current);
 
 	/* Force ret_to_user to reload the registers: */
 	fpsimd_flush_task_state(current);
-	set_thread_flag(TIF_FOREIGN_FPSTATE);
 
+	fpsimd_to_sve(current);
 	if (test_and_set_thread_flag(TIF_SVE))
 		WARN_ON(1); /* SVE access shouldn't have trapped */
 
@@ -917,9 +915,9 @@ void fpsimd_flush_thread(void)
 
 	local_bh_disable();
 
+	fpsimd_flush_task_state(current);
 	memset(&current->thread.uw.fpsimd_state, 0,
 	       sizeof(current->thread.uw.fpsimd_state));
-	fpsimd_flush_task_state(current);
 
 	if (system_supports_sve()) {
 		clear_thread_flag(TIF_SVE);
@@ -956,8 +954,6 @@ void fpsimd_flush_thread(void)
 			current->thread.sve_vl_onexec = 0;
 	}
 
-	set_thread_flag(TIF_FOREIGN_FPSTATE);
-
 	local_bh_enable();
 }
 
@@ -1066,12 +1062,29 @@ void fpsimd_update_current_state(struct user_fpsimd_state const *state)
 
 /*
  * Invalidate live CPU copies of task t's FPSIMD state
+ *
+ * This function may be called with preemption enabled.  The barrier()
+ * ensures that the assignment to fpsimd_cpu is visible to any
+ * preemption/softirq that could race with set_tsk_thread_flag(), so
+ * that TIF_FOREIGN_FPSTATE cannot be spuriously re-cleared.
+ *
+ * The final barrier ensures that TIF_FOREIGN_FPSTATE is seen set by any
+ * subsequent code.
  */
 void fpsimd_flush_task_state(struct task_struct *t)
 {
 	t->thread.fpsimd_cpu = NR_CPUS;
+
+	barrier();
+	set_tsk_thread_flag(t, TIF_FOREIGN_FPSTATE);
+
+	barrier();
 }
 
+/*
+ * Invalidate any task's FPSIMD state that is present on this cpu.
+ * This function must be called with softirqs disabled.
+ */
 void fpsimd_flush_cpu_state(void)
 {
 	__this_cpu_write(fpsimd_last_state.st, NULL);
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 511af13..7636965 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -296,11 +296,6 @@ static int restore_sve_fpsimd_context(struct user_ctxs *user)
 	 */
 
 	fpsimd_flush_task_state(current);
-	barrier();
-	/* From now, fpsimd_thread_switch() won't clear TIF_FOREIGN_FPSTATE */
-
-	set_thread_flag(TIF_FOREIGN_FPSTATE);
-	barrier();
 	/* From now, fpsimd_thread_switch() won't touch thread.sve_state */
 
 	sve_alloc(current);
-- 
2.1.4


[RFC PATCH 02/16] KVM: arm64: Delete orphaned declaration for __fpsimd_enabled()
From: Dave Martin @ 2018-06-21 14:57 UTC
  To: kvmarm
  Cc: Peter Maydell, Okamoto Takayuki, Christoffer Dall,
	Ard Biesheuvel, Marc Zyngier, Catalin Marinas, Will Deacon,
	Alex Bennée, linux-arm-kernel

__fpsimd_enabled() no longer exists, but a dangling declaration has
survived in kvm_hyp.h.

This patch gets rid of it.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/include/asm/kvm_hyp.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 384c343..9cbbd03 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -147,7 +147,6 @@ void __debug_switch_to_host(struct kvm_vcpu *vcpu);
 
 void __fpsimd_save_state(struct user_fpsimd_state *fp_regs);
 void __fpsimd_restore_state(struct user_fpsimd_state *fp_regs);
-bool __fpsimd_enabled(void);
 
 void activate_traps_vhe_load(struct kvm_vcpu *vcpu);
 void deactivate_traps_vhe_put(void);
-- 
2.1.4


[RFC PATCH 03/16] KVM: arm64: Refactor kvm_arm_num_regs() for easier maintenance
From: Dave Martin @ 2018-06-21 14:57 UTC
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

kvm_arm_num_regs() adds together various partial register counts in
a freeform sum expression, which makes it harder than necessary to
read diffs that add, modify or remove a single term in the sum
(which is expected to be the common case under maintenance).

This patch refactors the code to add the terms one per line, for
maximum readability.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/kvm/guest.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 56a0260..4a9d77c 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -205,8 +205,14 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
  */
 unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
 {
-	return num_core_regs() + kvm_arm_num_sys_reg_descs(vcpu)
-		+ kvm_arm_get_fw_num_regs(vcpu)	+ NUM_TIMER_REGS;
+	unsigned long res = 0;
+
+	res += num_core_regs();
+	res += kvm_arm_num_sys_reg_descs(vcpu);
+	res += kvm_arm_get_fw_num_regs(vcpu);
+	res += NUM_TIMER_REGS;
+
+	return res;
 }
 
 /**
-- 
2.1.4


[RFC PATCH 04/16] KVM: arm64: Add missing #include of <linux/bitmap.h> to kvm_host.h
From: Dave Martin @ 2018-06-21 14:57 UTC
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

kvm_host.h uses DECLARE_BITMAP() to declare the features member of
struct vcpu_arch, but the corresponding #include for this is
missing.

This patch adds a suitable #include for <linux/bitmap.h>.  Although
the header builds without it today, this should help to avoid
future surprises.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/include/asm/kvm_host.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index fe8777b..92d6e88 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -22,6 +22,7 @@
 #ifndef __ARM64_KVM_HOST_H__
 #define __ARM64_KVM_HOST_H__
 
+#include <linux/bitmap.h>
 #include <linux/types.h>
 #include <linux/kvm_types.h>
 #include <asm/cpufeature.h>
-- 
2.1.4


[RFC PATCH 05/16] KVM: arm: Add arch init/uninit hooks
From: Dave Martin @ 2018-06-21 14:57 UTC
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

In preparation for adding support for SVE in guests on arm64, hooks
for allocating and freeing additional per-vcpu memory are needed.

kvm_arch_vcpu_setup() could be used for allocation, but this
function is not clearly balanced by an "unsetup" function, making
it unclear where memory allocated in this function should be freed.

To keep things simple, this patch defines backend hooks
kvm_arm_arch_vcpu_{,un}init(), and plumbs them in appropriately.
The existing kvm_arch_vcpu_init() function now calls
kvm_arm_arch_vcpu_init(), while an explicit kvm_arch_vcpu_uninit()
is added which currently does nothing except call
kvm_arm_arch_vcpu_uninit().

The backend functions are currently defined to do nothing.

No functional change.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm/include/asm/kvm_host.h   |  4 +++-
 arch/arm64/include/asm/kvm_host.h |  4 +++-
 virt/kvm/arm/arm.c                | 13 ++++++++++++-
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 1f1fe410..9b902b8 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -284,10 +284,12 @@ struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
 static inline bool kvm_arch_check_sve_has_vhe(void) { return true; }
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
-static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
 
+static inline int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu) { return 0; }
+static inline void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
+
 static inline void kvm_arm_init_debug(void) {}
 static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 92d6e88..9671ddd 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -425,10 +425,12 @@ static inline bool kvm_arch_check_sve_has_vhe(void)
 
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
-static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
 
+static inline int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu) { return 0; }
+static inline void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
+
 void kvm_arm_init_debug(void);
 void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 04e554c..66f15cc 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -345,6 +345,8 @@ void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
 
 int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 {
+	int ret;
+
 	/* Force users to call KVM_ARM_VCPU_INIT */
 	vcpu->arch.target = -1;
 	bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES);
@@ -354,7 +356,16 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 
 	kvm_arm_reset_debug_ptr(vcpu);
 
-	return kvm_vgic_vcpu_init(vcpu);
+	ret = kvm_vgic_vcpu_init(vcpu);
+	if (ret)
+		return ret;
+
+	return kvm_arm_arch_vcpu_init(vcpu);
+}
+
+void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
+{
+	kvm_arm_arch_vcpu_uninit(vcpu);
 }
 
 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
-- 
2.1.4


[RFC PATCH 06/16] arm64/sve: Determine virtualisation-friendly vector lengths
From: Dave Martin @ 2018-06-21 14:57 UTC
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

Software at EL1 is permitted to assume that when it programs
ZCR_EL1.LEN, the resulting effective vector length is nonetheless a
vector length supported by the CPU.  Thus, software may rely on the
effective vector length being different from that programmed via
ZCR_EL1.LEN in some situations.

However, KVM does not tightly bind vcpus to individual underlying
physical CPUs.  As a result, vcpus can migrate from one CPU to
another.  This means that in order to preserve the guarantee
described in the previous paragraph, the set of supported vector
lengths must appear to be the same for the vcpu at all times,
irrespective of which physical CPU the vcpu is currently running
on.

The Arm SVE architecture allows the maximum vector length visible
to EL1 to be restricted by programming ZCR_EL2.LEN.  This provides
a means to hide from guests any vector lengths that are not
supported by every physical CPU in the system.  However, there is
no way to hide a particular vector length while some greater vector
length is exposed to EL1.

This patch determines the maximum vector length
(sve_max_virtualisable_vl) for which the set of supported vector
lengths not exceeding it is identical for all CPUs.  When KVM is
available, the set of vector lengths supported by each late
secondary CPU is verified to be consistent with those of the early
CPUs, in order to ensure that the value chosen for
sve_max_virtualisable_vl remains globally valid, and ensure that
all created vcpus continue to behave correctly.
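
As an illustrative example, suppose one CPU supports vector lengths
of 16, 32, 48 and 64 bytes while another supports only 16, 32 and
64 bytes.  The sets of supported lengths not exceeding 32 bytes are
identical on both CPUs, but the sets not exceeding 48 or 64 bytes
are not, so sve_max_virtualisable_vl is 32 bytes here: limiting
ZCR_EL2.LEN accordingly hides the mismatched lengths, whereas
exposing 64 bytes would unavoidably expose the 48-byte mismatch.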

sve_secondary_vq_map is used as scratch space for these
computations, rendering its name misleading.  This patch renames
this bitmap to sve_tmp_vq_map in order to make its purpose clearer.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/include/asm/fpsimd.h |  1 +
 arch/arm64/kernel/cpufeature.c  |  2 +-
 arch/arm64/kernel/fpsimd.c      | 86 ++++++++++++++++++++++++++++++++++-------
 3 files changed, 75 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index fa92747..3ad4607 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -85,6 +85,7 @@ extern void sve_kernel_enable(const struct arm64_cpu_capabilities *__unused);
 extern u64 read_zcr_features(void);
 
 extern int __ro_after_init sve_max_vl;
+extern int __ro_after_init sve_max_virtualisable_vl;
 
 #ifdef CONFIG_ARM64_SVE
 
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index d2856b1..f493a2f 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1511,7 +1511,7 @@ static void verify_sve_features(void)
 	unsigned int len = zcr & ZCR_ELx_LEN_MASK;
 
 	if (len < safe_len || sve_verify_vq_map()) {
-		pr_crit("CPU%d: SVE: required vector length(s) missing\n",
+		pr_crit("CPU%d: SVE: vector length support mismatch\n",
 			smp_processor_id());
 		cpu_die_early();
 	}
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 6b1ddae..390afb4 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -18,6 +18,7 @@
  */
 
 #include <linux/bitmap.h>
+#include <linux/bitops.h>
 #include <linux/bottom_half.h>
 #include <linux/bug.h>
 #include <linux/cache.h>
@@ -48,6 +49,7 @@
 #include <asm/sigcontext.h>
 #include <asm/sysreg.h>
 #include <asm/traps.h>
+#include <asm/virt.h>
 
 #define FPEXC_IOF	(1 << 0)
 #define FPEXC_DZF	(1 << 1)
@@ -130,14 +132,18 @@ static int sve_default_vl = -1;
 
 /* Maximum supported vector length across all CPUs (initially poisoned) */
 int __ro_after_init sve_max_vl = SVE_VL_MIN;
+int __ro_after_init sve_max_virtualisable_vl = SVE_VL_MIN;
 /* Set of available vector lengths, as vq_to_bit(vq): */
 static __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
+/* Set of vector lengths present on at least one cpu: */
+static __ro_after_init DECLARE_BITMAP(sve_vq_partial_map, SVE_VQ_MAX);
 static void __percpu *efi_sve_state;
 
 #else /* ! CONFIG_ARM64_SVE */
 
 /* Dummy declaration for code that will be optimised out: */
 extern __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
+extern __ro_after_init DECLARE_BITMAP(sve_vq_partial_map, SVE_VQ_MAX);
 extern void __percpu *efi_sve_state;
 
 #endif /* ! CONFIG_ARM64_SVE */
@@ -642,11 +648,8 @@ int sve_get_current_vl(void)
 	return sve_prctl_status(0);
 }
 
-/*
- * Bitmap for temporary storage of the per-CPU set of supported vector lengths
- * during secondary boot.
- */
-static DECLARE_BITMAP(sve_secondary_vq_map, SVE_VQ_MAX);
+/* Bitmaps for temporary storage during manipulation of vector length sets */
+static DECLARE_BITMAP(sve_tmp_vq_map, SVE_VQ_MAX);
 
 static void sve_probe_vqs(DECLARE_BITMAP(map, SVE_VQ_MAX))
 {
@@ -669,6 +672,7 @@ static void sve_probe_vqs(DECLARE_BITMAP(map, SVE_VQ_MAX))
 void __init sve_init_vq_map(void)
 {
 	sve_probe_vqs(sve_vq_map);
+	bitmap_copy(sve_vq_partial_map, sve_vq_map, SVE_VQ_MAX);
 }
 
 /*
@@ -677,24 +681,60 @@ void __init sve_init_vq_map(void)
  */
 void sve_update_vq_map(void)
 {
-	sve_probe_vqs(sve_secondary_vq_map);
-	bitmap_and(sve_vq_map, sve_vq_map, sve_secondary_vq_map, SVE_VQ_MAX);
+	sve_probe_vqs(sve_tmp_vq_map);
+	bitmap_and(sve_vq_map, sve_vq_map, sve_tmp_vq_map,
+		   SVE_VQ_MAX);
+	bitmap_or(sve_vq_partial_map, sve_vq_partial_map, sve_tmp_vq_map,
+		  SVE_VQ_MAX);
 }
 
 /* Check whether the current CPU supports all VQs in the committed set */
 int sve_verify_vq_map(void)
 {
-	int ret = 0;
+	int ret = -EINVAL;
+	unsigned long b;
 
-	sve_probe_vqs(sve_secondary_vq_map);
-	bitmap_andnot(sve_secondary_vq_map, sve_vq_map, sve_secondary_vq_map,
-		      SVE_VQ_MAX);
-	if (!bitmap_empty(sve_secondary_vq_map, SVE_VQ_MAX)) {
+	sve_probe_vqs(sve_tmp_vq_map);
+
+	bitmap_complement(sve_tmp_vq_map, sve_tmp_vq_map, SVE_VQ_MAX);
+	if (bitmap_intersects(sve_tmp_vq_map, sve_vq_map, SVE_VQ_MAX)) {
 		pr_warn("SVE: cpu%d: Required vector length(s) missing\n",
 			smp_processor_id());
-		ret = -EINVAL;
+		goto error;
 	}
 
+	if (!IS_ENABLED(CONFIG_KVM) || !is_hyp_mode_available())
+		goto ok;
+
+	/*
+	 * For KVM, it is necessary to ensure that this CPU doesn't
+	 * support any vector length that guests may have probed as
+	 * unsupported.
+	 */
+
+	/* Recover the set of supported VQs: */
+	bitmap_complement(sve_tmp_vq_map, sve_tmp_vq_map, SVE_VQ_MAX);
+	/* Find VQs supported that are not globally supported: */
+	bitmap_andnot(sve_tmp_vq_map, sve_tmp_vq_map, sve_vq_map, SVE_VQ_MAX);
+
+	/* Find the lowest such VQ, if any: */
+	b = find_last_bit(sve_tmp_vq_map, SVE_VQ_MAX);
+	if (b >= SVE_VQ_MAX)
+		goto ok; /* no mismatches */
+
+	/*
+	 * Mismatches above sve_max_virtualisable_vl are fine, since
+	 * no guest is allowed to configure ZCR_EL2.LEN to exceed this:
+	 */
+	if (sve_vl_from_vq(bit_to_vq(b)) <= sve_max_virtualisable_vl) {
+		pr_warn("SVE: cpu%d: Unsupported vector length(s) present\n",
+			smp_processor_id());
+		goto error;
+	}
+
+ok:
+	ret = 0;
+error:
 	return ret;
 }
 
@@ -762,6 +802,7 @@ u64 read_zcr_features(void)
 void __init sve_setup(void)
 {
 	u64 zcr;
+	unsigned long b;
 
 	if (!system_supports_sve())
 		return;
@@ -790,10 +831,29 @@ void __init sve_setup(void)
 	 */
 	sve_default_vl = find_supported_vector_length(64);
 
+	bitmap_andnot(sve_tmp_vq_map, sve_vq_partial_map, sve_vq_map,
+		      SVE_VQ_MAX);
+
+	b = find_last_bit(sve_tmp_vq_map, SVE_VQ_MAX);
+	if (b >= SVE_VQ_MAX)
+		/* No non-virtualisable VLs found */
+		sve_max_virtualisable_vl = SVE_VQ_MAX;
+	else if (WARN_ON(b == SVE_VQ_MAX - 1))
+		/* No virtualisable VLs?  This is architecturally forbidden. */
+		sve_max_virtualisable_vl = SVE_VQ_MIN;
+	else /* b + 1 < SVE_VQ_MAX */
+		sve_max_virtualisable_vl = sve_vl_from_vq(bit_to_vq(b + 1));
+
+	if (sve_max_virtualisable_vl > sve_max_vl)
+		sve_max_virtualisable_vl = sve_max_vl;
+
 	pr_info("SVE: maximum available vector length %u bytes per vector\n",
 		sve_max_vl);
 	pr_info("SVE: default vector length %u bytes per vector\n",
 		sve_default_vl);
+	if (sve_max_virtualisable_vl < sve_max_vl)
+		pr_info("SVE: vector lengths greater than %u bytes not virtualisable\n",
+			sve_max_virtualisable_vl);
 
 	sve_efi_setup();
 }
-- 
2.1.4


[RFC PATCH 07/16] arm64/sve: Enable SVE state tracking for non-task contexts
From: Dave Martin @ 2018-06-21 14:57 UTC
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

The current FPSIMD/SVE context handling support for non-task (i.e.,
KVM vcpu) contexts does not take SVE into account.  This means that
only task contexts can safely use SVE at present.

In preparation for enabling KVM guests to use SVE, it is necessary
to keep track of SVE state for non-task contexts too.

This patch adds the necessary support, removing assumptions from
the context switch code about the location of the SVE context
storage.

When binding a vcpu context, its vector length is arbitrarily
specified as sve_max_vl for now.  In any case, because TIF_SVE is
presently cleared at vcpu context bind time, the specified vector
length will not be used for anything yet.  In later patches TIF_SVE
will be set here as appropriate, and the appropriate maximum vector
length for the vcpu will be passed when binding.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/include/asm/fpsimd.h |  3 ++-
 arch/arm64/kernel/fpsimd.c      | 20 +++++++++++++++-----
 arch/arm64/kvm/fpsimd.c         |  4 +++-
 3 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 3ad4607..d575e59 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -54,7 +54,8 @@ extern void fpsimd_restore_current_state(void);
 extern void fpsimd_update_current_state(struct user_fpsimd_state const *state);
 
 extern void fpsimd_bind_task_to_cpu(void);
-extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state);
+extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state,
+				     void *sve_state, unsigned int sve_vl);
 
 extern void fpsimd_flush_task_state(struct task_struct *target);
 extern void fpsimd_flush_cpu_state(void);
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 390afb4..8afc518 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -121,6 +121,8 @@
  */
 struct fpsimd_last_state_struct {
 	struct user_fpsimd_state *st;
+	void *sve_state;
+	unsigned int sve_vl;
 };
 
 static DEFINE_PER_CPU(struct fpsimd_last_state_struct, fpsimd_last_state);
@@ -260,14 +262,15 @@ static void task_fpsimd_load(void)
  */
 void fpsimd_save(void)
 {
-	struct user_fpsimd_state *st = __this_cpu_read(fpsimd_last_state.st);
+	struct fpsimd_last_state_struct const *last =
+		this_cpu_ptr(&fpsimd_last_state);
 	/* set by fpsimd_bind_task_to_cpu() or fpsimd_bind_state_to_cpu() */
 
 	WARN_ON(!in_softirq() && !irqs_disabled());
 
 	if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
 		if (system_supports_sve() && test_thread_flag(TIF_SVE)) {
-			if (WARN_ON(sve_get_vl() != current->thread.sve_vl)) {
+			if (WARN_ON(sve_get_vl() != last->sve_vl)) {
 				/*
 				 * Can't save the user regs, so current would
 				 * re-enter user with corrupt state.
@@ -277,9 +280,11 @@ void fpsimd_save(void)
 				return;
 			}
 
-			sve_save_state(sve_pffr(&current->thread), &st->fpsr);
+			sve_save_state((char *)last->sve_state +
+						sve_ffr_offset(last->sve_vl),
+				       &last->st->fpsr);
 		} else
-			fpsimd_save_state(st);
+			fpsimd_save_state(last->st);
 	}
 }
 
@@ -1053,6 +1058,8 @@ void fpsimd_bind_task_to_cpu(void)
 		this_cpu_ptr(&fpsimd_last_state);
 
 	last->st = &current->thread.uw.fpsimd_state;
+	last->sve_state = current->thread.sve_state;
+	last->sve_vl = current->thread.sve_vl;
 	current->thread.fpsimd_cpu = smp_processor_id();
 
 	if (system_supports_sve()) {
@@ -1066,7 +1073,8 @@ void fpsimd_bind_task_to_cpu(void)
 	}
 }
 
-void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st)
+void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state,
+			      unsigned int sve_vl)
 {
 	struct fpsimd_last_state_struct *last =
 		this_cpu_ptr(&fpsimd_last_state);
@@ -1074,6 +1082,8 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st)
 	WARN_ON(!in_softirq() && !irqs_disabled());
 
 	last->st = st;
+	last->sve_state = sve_state;
+	last->sve_vl = sve_vl;
 }
 
 /*
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 4aaf78e..872008c 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -85,7 +85,9 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
 	WARN_ON_ONCE(!irqs_disabled());
 
 	if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
-		fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.gp_regs.fp_regs);
+		fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.gp_regs.fp_regs,
+					 NULL, sve_max_vl);
+
 		clear_thread_flag(TIF_FOREIGN_FPSTATE);
 		clear_thread_flag(TIF_SVE);
 	}
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [RFC PATCH 08/16] KVM: arm64: Support dynamically hideable system registers
  2018-06-21 14:57 ` Dave Martin
@ 2018-06-21 14:57   ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-06-21 14:57 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

Some system registers may or may not logically exist for a vcpu
depending on whether certain architectural features are enabled for
the vcpu.

In order to avoid spuriously emulating access to these registers
when they should not exist, or allowing the registers to be
spuriously enumerated or saved/restored through the ioctl
interface, a means is needed to allow registers to be hidden
depending on the vcpu configuration.

In order to support this in a flexible way, this patch adds a
check_present() method to struct sys_reg_desc, and updates the
generic system register access and enumeration code to be aware of
it:  if check_present() returns false, the code behaves as if the
register did not exist.

For convenience, the complete check is wrapped up in a new helper
sys_reg_present().

An attempt has been made to hook the new check into the generic
accessors for trapped system registers.  This should reduce the
potential for future surprises, although the redundant check will
add a small cost.  No system register depends on this functionality
yet, and some paths needing the check may also need attention.

Naturally, this facility makes sense only for registers that are
trapped.
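
For illustration, a register gated on some vcpu feature would be
declared roughly as follows.  This is a sketch only: the predicate,
register and accessor names here are made up, not part of this
patch.

	static bool feat_x_check_present(const struct kvm_vcpu *vcpu,
					 const struct sys_reg_desc *rd)
	{
		/* Hide the register unless the vcpu has feature X: */
		return vcpu_has_feat_x(vcpu);	/* hypothetical helper */
	}

	/* In sys_reg_descs[]: */
	{ SYS_DESC(SYS_FEAT_X_EL1), access_feat_x, reset_val, FEAT_X_EL1, 0,
	  .check_present = feat_x_check_present },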

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/kvm/sys_regs.c | 20 +++++++++++++++-----
 arch/arm64/kvm/sys_regs.h | 11 +++++++++++
 2 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index a436373..31a351a 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1840,7 +1840,7 @@ static int emulate_cp(struct kvm_vcpu *vcpu,
 
 	r = find_reg(params, table, num);
 
-	if (r) {
+	if (likely(r) && sys_reg_present(vcpu, r)) {
 		perform_access(vcpu, params, r);
 		return 0;
 	}
@@ -2016,7 +2016,7 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
 	if (!r)
 		r = find_reg(params, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
 
-	if (likely(r)) {
+	if (likely(r) && sys_reg_present(vcpu, r)) {
 		perform_access(vcpu, params, r);
 	} else {
 		kvm_err("Unsupported guest sys_reg access at: %lx\n",
@@ -2313,6 +2313,9 @@ int kvm_arm_sys_reg_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
 	if (!r)
 		return get_invariant_sys_reg(reg->id, uaddr);
 
+	if (!sys_reg_present(vcpu, r))
+		return -ENOENT;
+
 	if (r->get_user)
 		return (r->get_user)(vcpu, r, reg, uaddr);
 
@@ -2334,6 +2337,9 @@ int kvm_arm_sys_reg_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
 	if (!r)
 		return set_invariant_sys_reg(reg->id, uaddr);
 
+	if (!sys_reg_present(vcpu, r))
+		return -ENOENT;
+
 	if (r->set_user)
 		return (r->set_user)(vcpu, r, reg, uaddr);
 
@@ -2390,7 +2396,8 @@ static bool copy_reg_to_user(const struct sys_reg_desc *reg, u64 __user **uind)
 	return true;
 }
 
-static int walk_one_sys_reg(const struct sys_reg_desc *rd,
+static int walk_one_sys_reg(struct kvm_vcpu *vcpu,
+			    const struct sys_reg_desc *rd,
 			    u64 __user **uind,
 			    unsigned int *total)
 {
@@ -2401,6 +2408,9 @@ static int walk_one_sys_reg(const struct sys_reg_desc *rd,
 	if (!(rd->reg || rd->get_user))
 		return 0;
 
+	if (!sys_reg_present(vcpu, rd))
+		return 0;
+
 	if (!copy_reg_to_user(rd, uind))
 		return -EFAULT;
 
@@ -2429,9 +2439,9 @@ static int walk_sys_regs(struct kvm_vcpu *vcpu, u64 __user *uind)
 		int cmp = cmp_sys_reg(i1, i2);
 		/* target-specific overrides generic entry. */
 		if (cmp <= 0)
-			err = walk_one_sys_reg(i1, &uind, &total);
+			err = walk_one_sys_reg(vcpu, i1, &uind, &total);
 		else
-			err = walk_one_sys_reg(i2, &uind, &total);
+			err = walk_one_sys_reg(vcpu, i2, &uind, &total);
 
 		if (err)
 			return err;
diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
index cd710f8..dfbb342 100644
--- a/arch/arm64/kvm/sys_regs.h
+++ b/arch/arm64/kvm/sys_regs.h
@@ -22,6 +22,9 @@
 #ifndef __ARM64_KVM_SYS_REGS_LOCAL_H__
 #define __ARM64_KVM_SYS_REGS_LOCAL_H__
 
+#include <linux/compiler.h>
+#include <linux/types.h>
+
 struct sys_reg_params {
 	u8	Op0;
 	u8	Op1;
@@ -61,8 +64,16 @@ struct sys_reg_desc {
 			const struct kvm_one_reg *reg, void __user *uaddr);
 	int (*set_user)(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
 			const struct kvm_one_reg *reg, void __user *uaddr);
+	bool (*check_present)(const struct kvm_vcpu *vcpu,
+			      const struct sys_reg_desc *rd);
 };
 
+static inline bool sys_reg_present(const struct kvm_vcpu *vcpu,
+				   const struct sys_reg_desc *rd)
+{
+	return likely(!rd->check_present) || rd->check_present(vcpu, rd);
+}
+
 static inline void print_sys_reg_instr(const struct sys_reg_params *p)
 {
 	/* Look, we even formatted it for you to paste into the table! */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [RFC PATCH 09/16] KVM: arm64: Allow ID registers to be dynamically read-as-zero
  2018-06-21 14:57 ` Dave Martin
@ 2018-06-21 14:57   ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-06-21 14:57 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

When a feature-dependent ID register is hidden from the guest, it
needs to exhibit read-as-zero behaviour as defined by the Arm
architecture, rather than appearing to be entirely absent.

This patch updates the ID register emulation logic to make use of
the new check_present() method to determine whether the register
should read as zero instead of yielding the host's sanitised
value.  Because currently a false result from this method truncates
the trap call chain before the sysreg's emulation method is called,
a flag is added to distinguish this special case, and helpers are
refactored appropriately.

This involves some trivial updates to pass the vcpu pointer down
into the ID register emulation/access functions.

A new ID_SANITISED_IF() macro is defined for declaring
conditionally visible ID registers.
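
For example, a later patch in this series uses this to declare the
SVE ID register (shown here as a usage sketch):

	ID_SANITISED_IF(ID_AA64ZFR0_EL1, sve_check_present),

When the check fails, the register remains enumerable through
KVM_GET_REG_LIST but reads as zero, by virtue of the
SR_RAZ_IF_ABSENT flag that the macro sets.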

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/kvm/sys_regs.c | 51 ++++++++++++++++++++++++++++++-----------------
 arch/arm64/kvm/sys_regs.h | 11 ++++++++++
 2 files changed, 44 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 31a351a..87d2468 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -987,11 +987,17 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu,
 }
 
 /* Read a sanitised cpufeature ID register by sys_reg_desc */
-static u64 read_id_reg(struct sys_reg_desc const *r, bool raz)
+static u64 read_id_reg(const struct kvm_vcpu *vcpu,
+		       struct sys_reg_desc const *r, bool raz)
 {
 	u32 id = sys_reg((u32)r->Op0, (u32)r->Op1,
 			 (u32)r->CRn, (u32)r->CRm, (u32)r->Op2);
-	u64 val = raz ? 0 : read_sanitised_ftr_reg(id);
+	u64 val;
+
+	if (raz || !sys_reg_present(vcpu, r))
+		val = 0;
+	else
+		val = read_sanitised_ftr_reg(id);
 
 	if (id == SYS_ID_AA64PFR0_EL1) {
 		if (val & (0xfUL << ID_AA64PFR0_SVE_SHIFT))
@@ -1018,7 +1024,7 @@ static bool __access_id_reg(struct kvm_vcpu *vcpu,
 	if (p->is_write)
 		return write_to_read_only(vcpu, p, r);
 
-	p->regval = read_id_reg(r, raz);
+	p->regval = read_id_reg(vcpu, r, raz);
 	return true;
 }
 
@@ -1047,16 +1053,18 @@ static u64 sys_reg_to_index(const struct sys_reg_desc *reg);
  * are stored, and for set_id_reg() we don't allow the effective value
  * to be changed.
  */
-static int __get_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
+static int __get_id_reg(const struct kvm_vcpu *vcpu,
+			const struct sys_reg_desc *rd, void __user *uaddr,
 			bool raz)
 {
 	const u64 id = sys_reg_to_index(rd);
-	const u64 val = read_id_reg(rd, raz);
+	const u64 val = read_id_reg(vcpu, rd, raz);
 
 	return reg_to_user(uaddr, &val, id);
 }
 
-static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
+static int __set_id_reg(const struct kvm_vcpu *vcpu,
+			const struct sys_reg_desc *rd, void __user *uaddr,
 			bool raz)
 {
 	const u64 id = sys_reg_to_index(rd);
@@ -1068,7 +1076,7 @@ static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
 		return err;
 
 	/* This is what we mean by invariant: you can't change it. */
-	if (val != read_id_reg(rd, raz))
+	if (val != read_id_reg(vcpu, rd, raz))
 		return -EINVAL;
 
 	return 0;
@@ -1077,33 +1085,40 @@ static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
 static int get_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
 		      const struct kvm_one_reg *reg, void __user *uaddr)
 {
-	return __get_id_reg(rd, uaddr, false);
+	return __get_id_reg(vcpu, rd, uaddr, false);
 }
 
 static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
 		      const struct kvm_one_reg *reg, void __user *uaddr)
 {
-	return __set_id_reg(rd, uaddr, false);
+	return __set_id_reg(vcpu, rd, uaddr, false);
 }
 
 static int get_raz_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
 			  const struct kvm_one_reg *reg, void __user *uaddr)
 {
-	return __get_id_reg(rd, uaddr, true);
+	return __get_id_reg(vcpu, rd, uaddr, true);
 }
 
 static int set_raz_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
 			  const struct kvm_one_reg *reg, void __user *uaddr)
 {
-	return __set_id_reg(rd, uaddr, true);
+	return __set_id_reg(vcpu, rd, uaddr, true);
 }
 
 /* sys_reg_desc initialiser for known cpufeature ID registers */
-#define ID_SANITISED(name) {			\
+#define __ID_SANITISED(name)			\
 	SYS_DESC(SYS_##name),			\
 	.access	= access_id_reg,		\
 	.get_user = get_id_reg,			\
-	.set_user = set_id_reg,			\
+	.set_user = set_id_reg
+
+#define ID_SANITISED(name) { __ID_SANITISED(name) }
+
+#define ID_SANITISED_IF(name, check) {		\
+	__ID_SANITISED(name),			\
+	.check_present = check,			\
+	.flags = SR_RAZ_IF_ABSENT,		\
 }
 
 /*
@@ -1840,7 +1855,7 @@ static int emulate_cp(struct kvm_vcpu *vcpu,
 
 	r = find_reg(params, table, num);
 
-	if (likely(r) && sys_reg_present(vcpu, r)) {
+	if (likely(r) && sys_reg_present_or_raz(vcpu, r)) {
 		perform_access(vcpu, params, r);
 		return 0;
 	}
@@ -2016,7 +2031,7 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
 	if (!r)
 		r = find_reg(params, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
 
-	if (likely(r) && sys_reg_present(vcpu, r)) {
+	if (likely(r) && sys_reg_present_or_raz(vcpu, r)) {
 		perform_access(vcpu, params, r);
 	} else {
 		kvm_err("Unsupported guest sys_reg access at: %lx\n",
@@ -2313,7 +2328,7 @@ int kvm_arm_sys_reg_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
 	if (!r)
 		return get_invariant_sys_reg(reg->id, uaddr);
 
-	if (!sys_reg_present(vcpu, r))
+	if (!sys_reg_present_or_raz(vcpu, r))
 		return -ENOENT;
 
 	if (r->get_user)
@@ -2337,7 +2352,7 @@ int kvm_arm_sys_reg_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
 	if (!r)
 		return set_invariant_sys_reg(reg->id, uaddr);
 
-	if (!sys_reg_present(vcpu, r))
+	if (!sys_reg_present_or_raz(vcpu, r))
 		return -ENOENT;
 
 	if (r->set_user)
@@ -2408,7 +2423,7 @@ static int walk_one_sys_reg(struct kvm_vcpu *vcpu,
 	if (!(rd->reg || rd->get_user))
 		return 0;
 
-	if (!sys_reg_present(vcpu, rd))
+	if (!sys_reg_present_or_raz(vcpu, rd))
 		return 0;
 
 	if (!copy_reg_to_user(rd, uind))
diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
index dfbb342..304928f 100644
--- a/arch/arm64/kvm/sys_regs.h
+++ b/arch/arm64/kvm/sys_regs.h
@@ -66,14 +66,25 @@ struct sys_reg_desc {
 			const struct kvm_one_reg *reg, void __user *uaddr);
 	bool (*check_present)(const struct kvm_vcpu *vcpu,
 			      const struct sys_reg_desc *rd);
+
+	/* OR of SR_* flags */
+	unsigned int flags;
 };
 
+#define SR_RAZ_IF_ABSENT	(1 << 0)
+
 static inline bool sys_reg_present(const struct kvm_vcpu *vcpu,
 				   const struct sys_reg_desc *rd)
 {
 	return likely(!rd->check_present) || rd->check_present(vcpu, rd);
 }
 
+static inline bool sys_reg_present_or_raz(const struct kvm_vcpu *vcpu,
+					  const struct sys_reg_desc *rd)
+{
+	return sys_reg_present(vcpu, rd) || (rd->flags & SR_RAZ_IF_ABSENT);
+}
+
 static inline void print_sys_reg_instr(const struct sys_reg_params *p)
 {
 	/* Look, we even formatted it for you to paste into the table! */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [RFC PATCH 10/16] KVM: arm64: Add a vcpu flag to control SVE visibility for the guest
  2018-06-21 14:57 ` Dave Martin
@ 2018-06-21 14:57   ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-06-21 14:57 UTC (permalink / raw)
  To: kvmarm
  Cc: Peter Maydell, Okamoto Takayuki, Christoffer Dall,
	Ard Biesheuvel, Marc Zyngier, Catalin Marinas, Will Deacon,
	Alex Bennée, linux-arm-kernel

Since SVE will be enabled or disabled on a per-vcpu basis, a flag
is needed in order to track which vcpus have it enabled.

This patch adds a suitable flag and a helper for checking it.
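
For illustration only (the code that actually sets the flag arrives
in a later patch, once userspace can opt in):

	/* At vcpu initialisation, if SVE was requested for this vcpu: */
	vcpu->arch.flags |= KVM_ARM64_GUEST_HAS_SVE;

	/* Elsewhere, SVE-specific handling is gated on the helper: */
	if (vcpu_has_sve(&vcpu->arch)) {
		/* ... context switch ZCR_EL1, SVE regs, etc. ... */
	}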

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/include/asm/kvm_host.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 9671ddd..609d08b 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -308,6 +308,13 @@ struct kvm_vcpu_arch {
 #define KVM_ARM64_FP_HOST		(1 << 2) /* host FP regs loaded */
 #define KVM_ARM64_HOST_SVE_IN_USE	(1 << 3) /* backup for host TIF_SVE */
 #define KVM_ARM64_HOST_SVE_ENABLED	(1 << 4) /* SVE enabled for EL0 */
+#define KVM_ARM64_GUEST_HAS_SVE		(1 << 5) /* SVE exposed to guest */
+
+static inline bool vcpu_has_sve(struct kvm_vcpu_arch const *vcpu_arch)
+{
+	return system_supports_sve() &&
+		(vcpu_arch->flags & KVM_ARM64_GUEST_HAS_SVE);
+}
 
 #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.gp_regs)
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [RFC PATCH 11/16] KVM: arm64/sve: System register context switch and access support
  2018-06-21 14:57 ` Dave Martin
@ 2018-06-21 14:57   ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-06-21 14:57 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

This patch adds the necessary support for context switching ZCR_EL1
for each vcpu.

The ID_AA64PFR0_EL1 emulation code is updated to expose the
presence of SVE to the guest if appropriate, and ioctl() access to
ZCR_EL1 is also added.

In the context switch code itself, ZCR_EL1 is context switched if
the host is SVE-capable, irrespective for now of whether SVE is
exposed to the guest.  Adding a dynamic vcpu_has_sve() check may
cost as much performance as it would save in this simple case.
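
From the userspace side, ZCR_EL1 then becomes accessible like any
other 64-bit system register.  A sketch, assuming the existing
ARM64_SYS_REG() index macro from the UAPI headers and an open vcpu
fd:

	__u64 val;
	struct kvm_one_reg reg = {
		.id   = ARM64_SYS_REG(3, 0, 1, 2, 0),	/* ZCR_EL1 */
		.addr = (__u64)(unsigned long)&val,
	};

	/* Fails with ENOENT if SVE is not enabled for this vcpu: */
	if (ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg) < 0)
		perror("KVM_GET_ONE_REG");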

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/include/asm/kvm_host.h |  1 +
 arch/arm64/include/asm/sysreg.h   |  3 +++
 arch/arm64/kvm/hyp/sysreg-sr.c    |  5 +++++
 arch/arm64/kvm/sys_regs.c         | 14 +++++++++-----
 4 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 609d08b..f331abf 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -111,6 +111,7 @@ enum vcpu_sysreg {
 	SCTLR_EL1,	/* System Control Register */
 	ACTLR_EL1,	/* Auxiliary Control Register */
 	CPACR_EL1,	/* Coprocessor Access Control */
+	ZCR_EL1,	/* SVE Control */
 	TTBR0_EL1,	/* Translation Table Base Register 0 */
 	TTBR1_EL1,	/* Translation Table Base Register 1 */
 	TCR_EL1,	/* Translation Control Register */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index a8f8481..6476dbd 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -416,6 +416,9 @@
 #define SYS_ICH_LR14_EL2		__SYS__LR8_EL2(6)
 #define SYS_ICH_LR15_EL2		__SYS__LR8_EL2(7)
 
+/* VHE encodings for architectural EL0/1 system registers */
+#define SYS_ZCR_EL12			sys_reg(3, 5, 1, 2, 0)
+
 /* Common SCTLR_ELx flags. */
 #define SCTLR_ELx_EE    (1 << 25)
 #define SCTLR_ELx_IESB	(1 << 21)
diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
index 35bc168..0f4046a 100644
--- a/arch/arm64/kvm/hyp/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/sysreg-sr.c
@@ -21,6 +21,7 @@
 #include <asm/kvm_asm.h>
 #include <asm/kvm_emulate.h>
 #include <asm/kvm_hyp.h>
+#include <asm/sysreg.h>
 
 /*
  * Non-VHE: Both host and guest must save everything.
@@ -57,6 +58,8 @@ static void __hyp_text __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
 	ctxt->sys_regs[SCTLR_EL1]	= read_sysreg_el1(sctlr);
 	ctxt->sys_regs[ACTLR_EL1]	= read_sysreg(actlr_el1);
 	ctxt->sys_regs[CPACR_EL1]	= read_sysreg_el1(cpacr);
+	if (system_supports_sve()) /* implies has_vhe() */
+		ctxt->sys_regs[ZCR_EL1]	= read_sysreg_s(SYS_ZCR_EL12);
 	ctxt->sys_regs[TTBR0_EL1]	= read_sysreg_el1(ttbr0);
 	ctxt->sys_regs[TTBR1_EL1]	= read_sysreg_el1(ttbr1);
 	ctxt->sys_regs[TCR_EL1]		= read_sysreg_el1(tcr);
@@ -129,6 +132,8 @@ static void __hyp_text __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
 	write_sysreg_el1(ctxt->sys_regs[SCTLR_EL1],	sctlr);
 	write_sysreg(ctxt->sys_regs[ACTLR_EL1],	  	actlr_el1);
 	write_sysreg_el1(ctxt->sys_regs[CPACR_EL1],	cpacr);
+	if (system_supports_sve()) /* implies has_vhe() */
+		write_sysreg_s(ctxt->sys_regs[ZCR_EL1],	SYS_ZCR_EL12);
 	write_sysreg_el1(ctxt->sys_regs[TTBR0_EL1],	ttbr0);
 	write_sysreg_el1(ctxt->sys_regs[TTBR1_EL1],	ttbr1);
 	write_sysreg_el1(ctxt->sys_regs[TCR_EL1],	tcr);
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 87d2468..dcaf6e5 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -986,6 +986,12 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu,
 	return true;
 }
 
+static bool sve_check_present(const struct kvm_vcpu *vcpu,
+			      const struct sys_reg_desc *rd)
+{
+	return vcpu_has_sve(&vcpu->arch);
+}
+
 /* Read a sanitised cpufeature ID register by sys_reg_desc */
 static u64 read_id_reg(const struct kvm_vcpu *vcpu,
 		       struct sys_reg_desc const *r, bool raz)
@@ -999,10 +1005,7 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu,
 	else
 		val = read_sanitised_ftr_reg(id);
 
-	if (id == SYS_ID_AA64PFR0_EL1) {
-		if (val & (0xfUL << ID_AA64PFR0_SVE_SHIFT))
-			kvm_debug("SVE unsupported for guests, suppressing\n");
-
+	if (id == SYS_ID_AA64PFR0_EL1 && !vcpu_has_sve(&vcpu->arch)) {
 		val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT);
 	} else if (id == SYS_ID_AA64MMFR1_EL1) {
 		if (val & (0xfUL << ID_AA64MMFR1_LOR_SHIFT))
@@ -1240,7 +1243,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	ID_SANITISED(ID_AA64PFR1_EL1),
 	ID_UNALLOCATED(4,2),
 	ID_UNALLOCATED(4,3),
-	ID_UNALLOCATED(4,4),
+	ID_SANITISED_IF(ID_AA64ZFR0_EL1, sve_check_present),
 	ID_UNALLOCATED(4,5),
 	ID_UNALLOCATED(4,6),
 	ID_UNALLOCATED(4,7),
@@ -1277,6 +1280,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 
 	{ SYS_DESC(SYS_SCTLR_EL1), access_vm_reg, reset_val, SCTLR_EL1, 0x00C50078 },
 	{ SYS_DESC(SYS_CPACR_EL1), NULL, reset_val, CPACR_EL1, 0 },
+	{ SYS_DESC(SYS_ZCR_EL1), NULL, reset_val, ZCR_EL1, 0, .check_present = sve_check_present },
 	{ SYS_DESC(SYS_TTBR0_EL1), access_vm_reg, reset_unknown, TTBR0_EL1 },
 	{ SYS_DESC(SYS_TTBR1_EL1), access_vm_reg, reset_unknown, TTBR1_EL1 },
 	{ SYS_DESC(SYS_TCR_EL1), access_vm_reg, reset_val, TCR_EL1, 0 },
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [RFC PATCH 12/16] KVM: arm64/sve: Context switch the SVE registers
  2018-06-21 14:57 ` Dave Martin
@ 2018-06-21 14:57   ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-06-21 14:57 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

In order to give each vcpu its own view of the SVE registers, this
patch adds context storage via a new sve_state pointer in struct
vcpu_arch.  An additional member sve_max_vl is also added for each
vcpu, to determine the maximum vector length visible to the guest
and thus the value to be configured in ZCR_EL2.LEN while the vcpu
is active.  This also determines the layout and size of the storage in
sve_state, which is read and written by the same backend functions
that are used for context-switching the SVE state for host tasks.

On SVE-enabled vcpus, SVE access traps are now handled by switching
in the vcpu's SVE context and disabling the trap before returning
to the guest.  On other vcpus, the trap is not handled and an exit
back to the host occurs, where the handle_sve() fallback path
reflects an undefined instruction exception back to the guest,
consistent with the behaviour of non-SVE-capable hardware (as was
done unconditionally prior to this patch).

No SVE handling is added on non-VHE-only paths, since VHE is an
architectural and Kconfig prerequisite of SVE.
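
Since the layout of sve_state follows the host task format, its
size is fully determined by sve_max_vl.  A minimal allocation
sketch (the real allocation lands elsewhere in this series; the use
of the signal-frame layout macro here is an assumption):

	unsigned int vq = sve_vq_from_vl(vcpu->arch.sve_max_vl);

	/* Space for Z0-Z31, P0-P15 and FFR at vector length sve_max_vl: */
	vcpu->arch.sve_state = kzalloc(SVE_SIG_REGS_SIZE(vq), GFP_KERNEL);
	if (!vcpu->arch.sve_state)
		return -ENOMEM;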

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/include/asm/kvm_host.h |  2 ++
 arch/arm64/kvm/fpsimd.c           |  5 +++--
 arch/arm64/kvm/hyp/switch.c       | 43 ++++++++++++++++++++++++++++++---------
 3 files changed, 38 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index f331abf..d2084ae 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -211,6 +211,8 @@ typedef struct kvm_cpu_context kvm_cpu_context_t;
 
 struct kvm_vcpu_arch {
 	struct kvm_cpu_context ctxt;
+	void *sve_state;
+	unsigned int sve_max_vl;
 
 	/* HYP configuration */
 	u64 hcr_el2;
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 872008c..44cf783 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -86,10 +86,11 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
 
 	if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
 		fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.gp_regs.fp_regs,
-					 NULL, sve_max_vl);
+					 vcpu->arch.sve_state,
+					 vcpu->arch.sve_max_vl);
 
 		clear_thread_flag(TIF_FOREIGN_FPSTATE);
-		clear_thread_flag(TIF_SVE);
+		update_thread_flag(TIF_SVE, vcpu_has_sve(&vcpu->arch));
 	}
 }
 
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index d496ef5..98df5c1 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -98,8 +98,13 @@ static void activate_traps_vhe(struct kvm_vcpu *vcpu)
 	val = read_sysreg(cpacr_el1);
 	val |= CPACR_EL1_TTA;
 	val &= ~CPACR_EL1_ZEN;
-	if (!update_fp_enabled(vcpu))
+
+	if (update_fp_enabled(vcpu)) {
+		if (vcpu_has_sve(&vcpu->arch))
+			val |= CPACR_EL1_ZEN;
+	} else {
 		val &= ~CPACR_EL1_FPEN;
+	}
 
 	write_sysreg(val, cpacr_el1);
 
@@ -114,6 +119,7 @@ static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu)
 
 	val = CPTR_EL2_DEFAULT;
 	val |= CPTR_EL2_TTA | CPTR_EL2_TZ;
+
 	if (!update_fp_enabled(vcpu))
 		val |= CPTR_EL2_TFP;
 
@@ -329,16 +335,22 @@ static bool __hyp_text __skip_instr(struct kvm_vcpu *vcpu)
 	}
 }
 
-static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
+static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu,
+					   bool guest_has_sve)
 {
 	struct user_fpsimd_state *host_fpsimd = vcpu->arch.host_fpsimd_state;
 
-	if (has_vhe())
-		write_sysreg(read_sysreg(cpacr_el1) | CPACR_EL1_FPEN,
-			     cpacr_el1);
-	else
+	if (has_vhe()) {
+		u64 reg = read_sysreg(cpacr_el1) | CPACR_EL1_FPEN;
+
+		if (system_supports_sve() && guest_has_sve)
+			reg |= CPACR_EL1_ZEN;
+
+		write_sysreg(reg, cpacr_el1);
+	} else {
 		write_sysreg(read_sysreg(cptr_el2) & ~(u64)CPTR_EL2_TFP,
 			     cptr_el2);
+	}
 
 	isb();
 
@@ -361,7 +373,13 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
 		vcpu->arch.flags &= ~KVM_ARM64_FP_HOST;
 	}
 
-	__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
+	if (system_supports_sve() && guest_has_sve)
+		sve_load_state((char *)vcpu->arch.sve_state +
+					sve_ffr_offset(vcpu->arch.sve_max_vl),
+			       &vcpu->arch.ctxt.gp_regs.fp_regs.fpsr,
+			       sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1);
+	else
+		__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
 
 	/* Skip restoring fpexc32 for AArch64 guests */
 	if (!(read_sysreg(hcr_el2) & HCR_RW))
@@ -380,6 +398,8 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
  */
 static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 {
+	bool guest_has_sve;
+
 	if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ)
 		vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
 
@@ -397,10 +417,13 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 	 * and restore the guest context lazily.
 	 * If FP/SIMD is not implemented, handle the trap and inject an
 	 * undefined instruction exception to the guest.
+	 * Similarly for trapped SVE accesses.
 	 */
-	if (system_supports_fpsimd() &&
-	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
-		return __hyp_switch_fpsimd(vcpu);
+	guest_has_sve = vcpu_has_sve(&vcpu->arch);
+	if ((system_supports_fpsimd() &&
+	     kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD) ||
+	    (guest_has_sve && kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SVE))
+		return __hyp_switch_fpsimd(vcpu, guest_has_sve);
 
 	if (!__populate_fault_info(vcpu))
 		return true;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [RFC PATCH 13/16] KVM: Allow 2048-bit register access via KVM_{GET, SET}_ONE_REG
  2018-06-21 14:57 ` Dave Martin
@ 2018-06-21 14:57   ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-06-21 14:57 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

The Arm SVE architecture defines registers that are up to 2048 bits
in size (with some possibility of further future expansion).

In order to avoid the need for an excessively large number of
ioctls when saving and restoring a vcpu's registers, this patch
adds a #define to make support for individual 2048-bit registers
through the KVM_{GET,SET}_ONE_REG ioctl interface official.  This
will allow each SVE register to be accessed in a single call.

There are sufficient spare bits in the register id size field for
this change, so there is no ABI impact providing that
KVM_GET_REG_LIST does not enumerate any 2048-bit register unless
userspace explicitly opts in to the relevant architecture-specific
features.
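
For reference, the size field occupies bits [55:52] of the register
id and encodes log2 of the register size in bytes, so the existing
decoder in the arch UAPI headers covers the new value without
further changes:

	#define KVM_REG_SIZE(id)						\
		(1U << (((id) & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT))

Here KVM_REG_SIZE_U2048 >> KVM_REG_SIZE_SHIFT == 8, giving
1 << 8 == 256 bytes == 2048 bits.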

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 include/uapi/linux/kvm.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index b6270a3..345be88 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1106,6 +1106,7 @@ struct kvm_dirty_tlb {
 #define KVM_REG_SIZE_U256	0x0050000000000000ULL
 #define KVM_REG_SIZE_U512	0x0060000000000000ULL
 #define KVM_REG_SIZE_U1024	0x0070000000000000ULL
+#define KVM_REG_SIZE_U2048	0x0080000000000000ULL
 
 struct kvm_reg_list {
 	__u64 n; /* number of regs */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [RFC PATCH 14/16] KVM: arm64/sve: Add SVE support to register access ioctl interface
  2018-06-21 14:57 ` Dave Martin
@ 2018-06-21 14:57   ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-06-21 14:57 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

This patch adds the following registers for access via the
KVM_{GET,SET}_ONE_REG interface:

 * KVM_REG_ARM64_SVE_ZREG(n, i) (n = 0..31) (in 2048-bit slices)
 * KVM_REG_ARM64_SVE_PREG(n, i) (n = 0..15) (in 256-bit slices)
 * KVM_REG_ARM64_SVE_FFR(i) (in 256-bit slices)

In order to adapt gracefully to future architectural extensions,
the registers are divided up into slices as noted above:  the i
parameter denotes the slice index.

For simplicity, bits or slices that exceed the maximum vector
length supported for the vcpu are ignored for KVM_SET_ONE_REG, and
read as zero for KVM_GET_ONE_REG.

For the current architecture, only slice i = 0 is significant.  The
interface design allows i to increase to up to 31 in the future if
required by future architectural amendments.

The registers are only visible for vcpus that have SVE enabled.
They are not enumerated by KVM_GET_REG_LIST on vcpus that do not
have SVE.  In all cases, surplus slices are not enumerated by
KVM_GET_REG_LIST.

Accesses to the FPSIMD registers via KVM_REG_ARM_CORE are
redirected to access the underlying vcpu SVE register storage as
appropriate.  In order to make this more straightforward, register
accesses that straddle register boundaries are no longer guaranteed
to succeed.  (Support for such use was never deliberate, and
userspace does not currently seem to be relying on it.)
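
As a rough illustration of the resulting interface, this is how
userspace might read slice 0 of Z0 (a sketch only: vcpu_fd is assumed
to be an already-initialised, SVE-enabled vcpu file descriptor, and
error handling is elided):

	#include <stdint.h>
	#include <sys/ioctl.h>
	#include <linux/kvm.h>
	#include <asm/kvm.h>

	/* vcpu_fd: an already-initialised, SVE-enabled vcpu descriptor */
	static int read_z0(int vcpu_fd, uint8_t buf[256])
	{
		struct kvm_one_reg reg = {
			.id   = KVM_REG_ARM64_SVE_ZREG(0, 0),	/* Z0, slice 0 */
			.addr = (uint64_t)(unsigned long)buf,
		};

		/* Bytes beyond the vcpu's maximum VL are read back as zero */
		return ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);
	}

For example, with a vcpu maximum vector length of 64 bytes (512 bits),
a ZREG access touches 64 bytes of real state; the remaining 192 bytes
of the 256-byte slice read back as zero and are ignored on write.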

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/include/uapi/asm/kvm.h |  10 ++
 arch/arm64/kvm/guest.c            | 219 +++++++++++++++++++++++++++++++++++---
 2 files changed, 216 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 4e76630..f54a9b0 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -213,6 +213,16 @@ struct kvm_arch_memory_slot {
 					 KVM_REG_ARM_FW | ((r) & 0xffff))
 #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
 
+/* SVE registers */
+#define KVM_REG_ARM64_SVE		(0x15 << KVM_REG_ARM_COPROC_SHIFT)
+#define KVM_REG_ARM64_SVE_ZREG(n, i)	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
+					 KVM_REG_SIZE_U2048 |		\
+					 ((n) << 5) | (i))
+#define KVM_REG_ARM64_SVE_PREG(n, i)	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
+					 KVM_REG_SIZE_U256 |		\
+					 ((n) << 5) | (i) | 0x400)
+#define KVM_REG_ARM64_SVE_FFR(i)	KVM_REG_ARM64_SVE_PREG(16, i)
+
 /* Device Control API: ARM VGIC */
 #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
 #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS	1
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 4a9d77c..005394b 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -23,14 +23,19 @@
 #include <linux/err.h>
 #include <linux/kvm_host.h>
 #include <linux/module.h>
+#include <linux/uaccess.h>
 #include <linux/vmalloc.h>
 #include <linux/fs.h>
+#include <linux/stddef.h>
 #include <kvm/arm_psci.h>
 #include <asm/cputype.h>
 #include <linux/uaccess.h>
+#include <asm/fpsimd.h>
 #include <asm/kvm.h>
 #include <asm/kvm_emulate.h>
 #include <asm/kvm_coproc.h>
+#include <asm/kvm_host.h>
+#include <asm/sigcontext.h>
 
 #include "trace.h"
 
@@ -57,6 +62,106 @@ static u64 core_reg_offset_from_id(u64 id)
 	return id & ~(KVM_REG_ARCH_MASK | KVM_REG_SIZE_MASK | KVM_REG_ARM_CORE);
 }
 
+static bool is_zreg(const struct kvm_one_reg *reg)
+{
+	return	reg->id >= KVM_REG_ARM64_SVE_ZREG(0, 0) &&
+		reg->id <= KVM_REG_ARM64_SVE_ZREG(SVE_NUM_ZREGS, 0x1f);
+}
+
+static bool is_preg(const struct kvm_one_reg *reg)
+{
+	return	reg->id >= KVM_REG_ARM64_SVE_PREG(0, 0) &&
+		reg->id <= KVM_REG_ARM64_SVE_FFR(0x1f);
+}
+
+static unsigned int sve_reg_num(const struct kvm_one_reg *reg)
+{
+	return (reg->id >> 5) & 0x1f;
+}
+
+static unsigned int sve_reg_index(const struct kvm_one_reg *reg)
+{
+	return reg->id & 0x1f;
+}
+
+struct reg_bounds_struct {
+	char *kptr;
+	size_t start_offset;
+	size_t copy_count;
+	size_t flush_count;
+};
+
+static int copy_bounded_reg_to_user(void __user *uptr,
+				    const struct reg_bounds_struct *b)
+{
+	if (copy_to_user(uptr, b->kptr, b->copy_count) ||
+	    clear_user((char __user *)uptr + b->copy_count, b->flush_count))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int copy_bounded_reg_from_user(const struct reg_bounds_struct *b,
+				      const void __user *uptr)
+{
+	if (copy_from_user(b->kptr, uptr, b->copy_count))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int fpsimd_vreg_bounds(struct reg_bounds_struct *b,
+			      struct kvm_vcpu *vcpu,
+			      const struct kvm_one_reg *reg)
+{
+	const size_t stride = KVM_REG_ARM_CORE_REG(fp_regs.vregs[1]) -
+				KVM_REG_ARM_CORE_REG(fp_regs.vregs[0]);
+	const size_t start = KVM_REG_ARM_CORE_REG(fp_regs.vregs[0]);
+	const size_t limit = KVM_REG_ARM_CORE_REG(fp_regs.vregs[32]);
+
+	const u64 uoffset = core_reg_offset_from_id(reg->id);
+	size_t usize = KVM_REG_SIZE(reg->id);
+	size_t start_vreg, end_vreg;
+
+	if (WARN_ON((reg->id & KVM_REG_ARM_COPROC_MASK) != KVM_REG_ARM_CORE))
+		return -ENOENT;
+
+	if (usize % sizeof(u32))
+		return -EINVAL;
+
+	usize /= sizeof(u32);
+
+	if ((uoffset <= start && usize <= start - uoffset) ||
+	    uoffset >= limit)
+		return -ENOENT;	/* not a vreg */
+
+	BUILD_BUG_ON(uoffset > limit);
+	if (uoffset < start || usize > limit - uoffset)
+		return -EINVAL;	/* overlaps vregs[] bounds */
+
+	start_vreg = (uoffset - start) / stride;
+	end_vreg = ((uoffset - start) + usize - 1) / stride;
+	if (start_vreg != end_vreg)
+		return -EINVAL;	/* spans multiple vregs */
+
+	b->start_offset = ((uoffset - start) % stride) * sizeof(u32);
+	b->copy_count = usize * sizeof(u32);
+	b->flush_count = 0;
+
+	if (vcpu_has_sve(&vcpu->arch)) {
+		const unsigned int vq = sve_vq_from_vl(vcpu->arch.sve_max_vl);
+
+		b->kptr = vcpu->arch.sve_state;
+		b->kptr += (SVE_SIG_ZREG_OFFSET(vq, start_vreg) -
+			    SVE_SIG_REGS_OFFSET);
+	} else {
+		b->kptr = (char *)&vcpu_gp_regs(vcpu)->fp_regs.vregs[
+				start_vreg];
+	}
+
+	return 0;
+}
+
 static int get_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 {
 	/*
@@ -65,11 +170,20 @@ static int get_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	 * array. Hence below, nr_regs is the number of entries, and
 	 * off the index in the "array".
 	 */
+	int err;
+	struct reg_bounds_struct b;
 	__u32 __user *uaddr = (__u32 __user *)(unsigned long)reg->addr;
 	struct kvm_regs *regs = vcpu_gp_regs(vcpu);
 	int nr_regs = sizeof(*regs) / sizeof(__u32);
 	u32 off;
 
+	err = fpsimd_vreg_bounds(&b, vcpu, reg);
+	switch (err) {
+	case 0:		return copy_bounded_reg_to_user(uaddr, &b);
+	case -ENOENT:	break;	/* not an FPSIMD vreg */
+	default:	return err;
+	}
+
 	/* Our ID is an index into the kvm_regs struct. */
 	off = core_reg_offset_from_id(reg->id);
 	if (off >= nr_regs ||
@@ -84,14 +198,23 @@ static int get_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 
 static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 {
+	int err;
+	struct reg_bounds_struct b;
 	__u32 __user *uaddr = (__u32 __user *)(unsigned long)reg->addr;
 	struct kvm_regs *regs = vcpu_gp_regs(vcpu);
 	int nr_regs = sizeof(*regs) / sizeof(__u32);
 	__uint128_t tmp;
 	void *valp = &tmp;
 	u64 off;
-	int err = 0;
 
+	err = fpsimd_vreg_bounds(&b, vcpu, reg);
+	switch (err) {
+	case 0:		return copy_bounded_reg_from_user(&b, uaddr);
+	case -ENOENT:	break;	/* not an FPSIMD vreg */
+	default:	return err;
+	}
+
+	err = 0;
 	/* Our ID is an index into the kvm_regs struct. */
 	off = core_reg_offset_from_id(reg->id);
 	if (off >= nr_regs ||
@@ -130,6 +253,78 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	return err;
 }
 
+static int sve_reg_bounds(struct reg_bounds_struct *b,
+			  const struct kvm_vcpu *vcpu,
+			  const struct kvm_one_reg *reg)
+{
+	unsigned int n = sve_reg_num(reg);
+	unsigned int i = sve_reg_index(reg);
+	unsigned int vl = vcpu->arch.sve_max_vl;
+	unsigned int vq = sve_vq_from_vl(vl);
+	unsigned int start, copy_limit, limit;
+
+	b->kptr = vcpu->arch.sve_state;
+	if (is_zreg(reg)) {
+		b->kptr += SVE_SIG_ZREG_OFFSET(vq, n) - SVE_SIG_REGS_OFFSET;
+		start = i * 0x100;
+		limit = start + 0x100;
+		copy_limit = vl;
+	} else if (is_preg(reg)) {
+		b->kptr += SVE_SIG_PREG_OFFSET(vq, n) - SVE_SIG_REGS_OFFSET;
+		start = i * 0x20;
+		limit = start + 0x20;
+		copy_limit = vl / 8;
+	} else {
+		WARN_ON(1);
+		start = 0;
+		copy_limit = limit = 0;
+	}
+
+	b->kptr += start;
+
+	if (copy_limit < start)
+		copy_limit = start;
+	else if (copy_limit > limit)
+		copy_limit = limit;
+
+	b->copy_count = copy_limit - start;
+	b->flush_count = limit - copy_limit;
+
+	return 0;
+}
+
+static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
+{
+	int ret;
+	struct reg_bounds_struct b;
+	char __user *uptr = (char __user *)reg->addr;
+
+	if (!vcpu_has_sve(&vcpu->arch))
+		return -ENOENT;
+
+	ret = sve_reg_bounds(&b, vcpu, reg);
+	if (ret)
+		return ret;
+
+	return copy_bounded_reg_to_user(uptr, &b);
+}
+
+static int set_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
+{
+	int ret;
+	struct reg_bounds_struct b;
+	char __user *uptr = (char __user *)reg->addr;
+
+	if (!vcpu_has_sve(&vcpu->arch))
+		return -ENOENT;
+
+	ret = sve_reg_bounds(&b, vcpu, reg);
+	if (ret)
+		return ret;
+
+	return copy_bounded_reg_from_user(&b, uptr);
+}
+
 int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 {
 	return -EINVAL;
@@ -251,12 +446,11 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
 		return -EINVAL;
 
-	/* Register group 16 means we want a core register. */
-	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
-		return get_core_reg(vcpu, reg);
-
-	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
-		return kvm_arm_get_fw_reg(vcpu, reg);
+	switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
+	case KVM_REG_ARM_CORE:	return get_core_reg(vcpu, reg);
+	case KVM_REG_ARM_FW:	return kvm_arm_get_fw_reg(vcpu, reg);
+	case KVM_REG_ARM64_SVE:	return get_sve_reg(vcpu, reg);
+	}
 
 	if (is_timer_reg(reg->id))
 		return get_timer_reg(vcpu, reg);
@@ -270,12 +464,11 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
 		return -EINVAL;
 
-	/* Register group 16 means we set a core register. */
-	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
-		return set_core_reg(vcpu, reg);
-
-	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
-		return kvm_arm_set_fw_reg(vcpu, reg);
+	switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
+	case KVM_REG_ARM_CORE:	return set_core_reg(vcpu, reg);
+	case KVM_REG_ARM_FW:	return kvm_arm_set_fw_reg(vcpu, reg);
+	case KVM_REG_ARM64_SVE:	return set_sve_reg(vcpu, reg);
+	}
 
 	if (is_timer_reg(reg->id))
 		return set_timer_reg(vcpu, reg);
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [RFC PATCH 15/16] KVM: arm64: Enumerate SVE register indices for KVM_GET_REG_LIST
  2018-06-21 14:57 ` Dave Martin
@ 2018-06-21 14:57   ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-06-21 14:57 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

This patch includes the SVE register IDs in the list returned by
KVM_GET_REG_LIST, as appropriate.

On a non-SVE-enabled vcpu, no extra IDs are added.

On an SVE-enabled vcpu, the appropriate number of slice IDs is
enumerated for each SVE register, depending on the maximum vector
length for the vcpu.
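
As a rough worked example: the patch computes a single slice count,
DIV_ROUND_UP(sve_max_vl, 256) (the 2048-bit ZREG slice size in
bytes), which is 1 for every vector length the current architecture
permits, and enumerates that many IDs per register.  An SVE-enabled
vcpu therefore gains 32 ZREG IDs, 16 PREG IDs and 1 FFR ID today,
i.e. 49 extra list entries.  Note also that enumerate_sve_regs()
doubles as the counting helper: with uind == NULL it only tallies
the IDs.  A sketch of how userspace might count the SVE IDs in the
KVM_GET_REG_LIST output (list is assumed to have already been filled
in by the ioctl):

	#include <linux/kvm.h>
	#include <asm/kvm.h>

	/* list: a kvm_reg_list already populated by KVM_GET_REG_LIST */
	static unsigned int count_sve_ids(const struct kvm_reg_list *list)
	{
		unsigned int count = 0;
		__u64 i;

		for (i = 0; i < list->n; ++i)
			if ((list->reg[i] & KVM_REG_ARM_COPROC_MASK) ==
			    KVM_REG_ARM64_SVE)
				++count;

		return count;
	}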

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/kvm/guest.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 005394b..5152362 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -21,6 +21,7 @@
 
 #include <linux/errno.h>
 #include <linux/err.h>
+#include <linux/kernel.h>
 #include <linux/kvm_host.h>
 #include <linux/module.h>
 #include <linux/uaccess.h>
@@ -253,6 +254,73 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	return err;
 }
 
+static void copy_reg_index_to_user(u64 __user **uind, int *total, int *cerr,
+				   u64 id)
+{
+	int err;
+
+	if (*cerr)
+		return;
+
+	if (uind) {
+		err = put_user(id, *uind);
+		if (err) {
+			*cerr = err;
+			return;
+		}
+	}
+
+	++*total;
+	if (uind)
+		++*uind;
+}
+
+static int enumerate_sve_regs(const struct kvm_vcpu *vcpu, u64 __user **uind)
+{
+	unsigned int n, i;
+	int err = 0;
+	int total = 0;
+	unsigned int slices;
+
+	if (!vcpu_has_sve(&vcpu->arch))
+		return 0;
+
+	slices = DIV_ROUND_UP(vcpu->arch.sve_max_vl,
+			      KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));
+
+	for (n = 0; n < SVE_NUM_ZREGS; ++n)
+		for (i = 0; i < slices; ++i)
+			copy_reg_index_to_user(uind, &total, &err,
+					       KVM_REG_ARM64_SVE_ZREG(n, i));
+
+	for (n = 0; n < SVE_NUM_PREGS; ++n)
+		for (i = 0; i < slices; ++i)
+			copy_reg_index_to_user(uind, &total, &err,
+					       KVM_REG_ARM64_SVE_PREG(n, i));
+
+	for (i = 0; i < slices; ++i)
+		copy_reg_index_to_user(uind, &total, &err,
+				       KVM_REG_ARM64_SVE_FFR(i));
+
+	if (err)
+		return -EFAULT;
+
+	return total;
+}
+
+static unsigned long num_sve_regs(const struct kvm_vcpu *vcpu)
+{
+	return enumerate_sve_regs(vcpu, NULL);
+}
+
+static int copy_sve_reg_indices(const struct kvm_vcpu *vcpu, u64 __user **uind)
+{
+	int err;
+
+	err = enumerate_sve_regs(vcpu, uind);
+	return err < 0 ? err : 0;
+}
+
 static int sve_reg_bounds(struct reg_bounds_struct *b,
 			  const struct kvm_vcpu *vcpu,
 			  const struct kvm_one_reg *reg)
@@ -403,6 +471,7 @@ unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
 	unsigned long res = 0;
 
 	res += num_core_regs();
+	res += num_sve_regs(vcpu);
 	res += kvm_arm_num_sys_reg_descs(vcpu);
 	res += kvm_arm_get_fw_num_regs(vcpu);
 	res += NUM_TIMER_REGS;
@@ -427,6 +496,10 @@ int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
 		uindices++;
 	}
 
+	ret = copy_sve_reg_indices(vcpu, &uindices);
+	if (ret)
+		return ret;
+
 	ret = kvm_arm_copy_fw_reg_indices(vcpu, uindices);
 	if (ret)
 		return ret;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [RFC PATCH 16/16] KVM: arm64/sve: Report and enable SVE API extensions for userspace
  2018-06-21 14:57 ` Dave Martin
@ 2018-06-21 14:57   ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-06-21 14:57 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

This patch reports the availability of KVM SVE support to userspace
via a new vcpu feature flag KVM_ARM_VCPU_SVE.  This flag is
reported via the KVM_ARM_PREFERRED_TARGET ioctl.

Userspace can enable the feature by setting the flag for
KVM_ARM_VCPU_INIT.  Without this flag set, SVE-related ioctls and
register access extensions are hidden, and SVE remains disabled
unconditionally for the guest.  This ensures that non-SVE-aware KVM
userspace does not receive a vcpu that it does not understand how
to snapshot or restore correctly.

Storage is allocated for the SVE register state at vcpu init time,
sufficient for the maximum vector length to be exposed to the vcpu.
No attempt is made to allocate the storage lazily for now.  Also,
no attempt is made to resize the storage dynamically, since the
effective vector length of the vcpu can change at each EL0/EL1
transition.  The storage is freed at the vcpu uninit hook.

No particular attempt is made to prevent userspace from creating a
mix of vcpus some of which have SVE enabled and some of which have
it disabled.  This may or may not be useful, but it reflects the
underlying architectural behaviour.
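
A rough sketch of the resulting userspace flow (vm_fd and vcpu_fd are
assumed to be valid VM and vcpu descriptors; error handling is mostly
elided):

	#include <sys/ioctl.h>
	#include <linux/kvm.h>
	#include <asm/kvm.h>

	static int vcpu_init_with_sve(int vm_fd, int vcpu_fd)
	{
		struct kvm_vcpu_init init;

		/* The preferred target now advertises KVM_ARM_VCPU_SVE */
		if (ioctl(vm_fd, KVM_ARM_PREFERRED_TARGET, &init))
			return -1;

		/* Opt in only to the features this userspace understands */
		if (init.features[0] & (1U << KVM_ARM_VCPU_SVE))
			init.features[0] = 1U << KVM_ARM_VCPU_SVE;
		else
			init.features[0] = 0;

		return ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &init);
	}

If the flag is not set at KVM_ARM_VCPU_INIT time, the vcpu behaves
exactly as it did before this series.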

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/include/asm/kvm_host.h |  6 +++---
 arch/arm64/include/uapi/asm/kvm.h |  1 +
 arch/arm64/kvm/guest.c            | 19 +++++++++++++------
 arch/arm64/kvm/reset.c            | 14 ++++++++++++++
 4 files changed, 31 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index d2084ae..d956cf2 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -44,7 +44,7 @@
 
 #define KVM_MAX_VCPUS VGIC_V3_MAX_CPUS
 
-#define KVM_VCPU_MAX_FEATURES 4
+#define KVM_VCPU_MAX_FEATURES 5
 
 #define KVM_REQ_SLEEP \
 	KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
@@ -439,8 +439,8 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
 
-static inline int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu) { return 0; }
-static inline void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
+int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu);
+void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu);
 
 void kvm_arm_init_debug(void);
 void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index f54a9b0..6acf276 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -101,6 +101,7 @@ struct kvm_regs {
 #define KVM_ARM_VCPU_EL1_32BIT		1 /* CPU running a 32bit VM */
 #define KVM_ARM_VCPU_PSCI_0_2		2 /* CPU uses PSCI v0.2 */
 #define KVM_ARM_VCPU_PMU_V3		3 /* Support guest PMUv3 */
+#define KVM_ARM_VCPU_SVE		4 /* Allow SVE for guest */
 
 struct kvm_vcpu_init {
 	__u32 target;
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 5152362..fb7f6aa 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -58,6 +58,16 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu)
+{
+	return 0;
+}
+
+void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
+{
+	kfree(vcpu->arch.sve_state);
+}
+
 static u64 core_reg_offset_from_id(u64 id)
 {
 	return id & ~(KVM_REG_ARCH_MASK | KVM_REG_SIZE_MASK | KVM_REG_ARM_CORE);
@@ -600,12 +610,9 @@ int kvm_vcpu_preferred_target(struct kvm_vcpu_init *init)
 
 	memset(init, 0, sizeof(*init));
 
-	/*
-	 * For now, we don't return any features.
-	 * In future, we might use features to return target
-	 * specific features available for the preferred
-	 * target type.
-	 */
+	/* KVM_ARM_VCPU_SVE understood by KVM_ARM_VCPU_INIT */
+	init->features[0] = 1 << KVM_ARM_VCPU_SVE;
+
 	init->target = (__u32)target;
 
 	return 0;
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index a74311b..f63a791 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -110,6 +110,20 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 			cpu_reset = &default_regs_reset;
 		}
 
+		if (system_supports_sve() &&
+		    test_bit(KVM_ARM_VCPU_SVE, vcpu->arch.features)) {
+			vcpu->arch.flags |= KVM_ARM64_GUEST_HAS_SVE;
+
+			vcpu->arch.sve_max_vl = sve_max_virtualisable_vl;
+
+			vcpu->arch.sve_state = kzalloc(
+				SVE_SIG_REGS_SIZE(
+					sve_vq_from_vl(vcpu->arch.sve_max_vl)),
+				GFP_KERNEL);
+			if (!vcpu->arch.sve_state)
+				return -ENOMEM;
+		}
+
 		break;
 	}
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 00/16] KVM: arm64: Initial support for SVE guests
  2018-06-21 14:57 ` Dave Martin
@ 2018-07-06  8:22   ` Alex Bennée
  -1 siblings, 0 replies; 178+ messages in thread
From: Alex Bennée @ 2018-07-06  8:22 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

<snip>
>
> This series is somewhat tested on Arm Juno r0 and the Arm Fast Model
> (with/without SVE support).  arch/arm builds, but I've not booted
> it -- only some trivial refactoring in this series affects arch/arm.

Now that QEMU linux-user SVE support is pretty much complete, we've also
got preliminary patches for system emulation mode. However, we currently
don't have VHE implemented, so I guess we need to do that first before we
can test under QEMU.

>
> Cheers
> ---Dave
>
>
> [1] [PATCH v2 0/4] KVM: arm64: FPSIMD/SVE fixes for 4.17 [sic]
> http://lists.infradead.org/pipermail/linux-arm-kernel/2018-June/584281.html
>
> Dave Martin (16):
>   arm64: fpsimd: Always set TIF_FOREIGN_FPSTATE on task state flush
>   KVM: arm64: Delete orphaned declaration for __fpsimd_enabled()
>   KVM: arm64: Refactor kvm_arm_num_regs() for easier maintenance
>   KVM: arm64: Add missing #include of <linux/bitmap.h> to kvm_host.h
>   KVM: arm: Add arch init/uninit hooks
>   arm64/sve: Determine virtualisation-friendly vector lengths
>   arm64/sve: Enable SVE state tracking for non-task contexts
>   KVM: arm64: Support dynamically hideable system registers
>   KVM: arm64: Allow ID registers to be dynamically read-as-zero
>   KVM: arm64: Add a vcpu flag to control SVE visibility for the guest
>   KVM: arm64/sve: System register context switch and access support
>   KVM: arm64/sve: Context switch the SVE registers
>   KVM: Allow 2048-bit register access via KVM_{GET,SET}_ONE_REG
>   KVM: arm64/sve: Add SVE support to register access ioctl interface
>   KVM: arm64: Enumerate SVE register indices for KVM_GET_REG_LIST
>   KVM: arm64/sve: Report and enable SVE API extensions for userspace
>
>  arch/arm/include/asm/kvm_host.h   |   4 +-
>  arch/arm64/include/asm/fpsimd.h   |   4 +-
>  arch/arm64/include/asm/kvm_host.h |  18 ++-
>  arch/arm64/include/asm/kvm_hyp.h  |   1 -
>  arch/arm64/include/asm/sysreg.h   |   3 +
>  arch/arm64/include/uapi/asm/kvm.h |  11 ++
>  arch/arm64/kernel/cpufeature.c    |   2 +-
>  arch/arm64/kernel/fpsimd.c        | 131 +++++++++++++---
>  arch/arm64/kernel/signal.c        |   5 -
>  arch/arm64/kvm/fpsimd.c           |   7 +-
>  arch/arm64/kvm/guest.c            | 321 +++++++++++++++++++++++++++++++++++---
>  arch/arm64/kvm/hyp/switch.c       |  43 +++--
>  arch/arm64/kvm/hyp/sysreg-sr.c    |   5 +
>  arch/arm64/kvm/reset.c            |  14 ++
>  arch/arm64/kvm/sys_regs.c         |  73 ++++++---
>  arch/arm64/kvm/sys_regs.h         |  22 +++
>  include/uapi/linux/kvm.h          |   1 +
>  virt/kvm/arm/arm.c                |  13 +-
>  18 files changed, 587 insertions(+), 91 deletions(-)


--
Alex Bennée
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 00/16] KVM: arm64: Initial support for SVE guests
  2018-07-06  8:22   ` Alex Bennée
@ 2018-07-06  9:05     ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-07-06  9:05 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Fri, Jul 06, 2018 at 09:22:47AM +0100, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> <snip>
> >
> > This series is somewhat tested on Arm Juno r0 and the Arm Fast Model
> > (with/without SVE support).  arch/arm builds, but I've not booted
> > it -- only some trivial refactoring in this series affects arch/arm.
> 
> Now that QEMU linux-user SVE support is pretty much complete we've also
> got preliminary patches for system emulation mode. However we currently
> don't have VHE implemented so I guess we need to do that first before we
> can test under QEMU.

Qemu can use this as a KVM client without invasive changes, right?

For kvmtool, it's just a question of checking/setting a feature flag
in KVM_ARM_PREFERRED_TARGET and KVM_ARM_VCPU_INIT, and (eventually)
doing an ioctl() to set the set of permitted vector lengths (not yet,
this will come in a follow-up series).

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 01/16] arm64: fpsimd: Always set TIF_FOREIGN_FPSTATE on task state flush
  2018-06-21 14:57   ` Dave Martin
@ 2018-07-06  9:07     ` Alex Bennée
  -1 siblings, 0 replies; 178+ messages in thread
From: Alex Bennée @ 2018-07-06  9:07 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> This patch updates fpsimd_flush_task_state() to mirror the new
> semantics of fpsimd_flush_cpu_state(): both functions now
> implicitly set TIF_FOREIGN_FPSTATE to indicate that the task's
> FPSIMD state is not loaded into the cpu.
>
> As a side-effect, fpsimd_flush_task_state() now sets
> TIF_FOREIGN_FPSTATE even for non-running tasks.  In the case of
> non-running tasks this is not useful but also harmless, because the
> flag is live only while the corresponding task is running.  This
> function is not called from fast paths, so special-casing this for
> the task == current case is not really worth it.
>
> Compiler barriers previously present in restore_sve_fpsimd_context()
> are pulled into fpsimd_flush_task_state() so that it can be safely
> called with preemption enabled if necessary.
>
> Explicit calls to set TIF_FOREIGN_FPSTATE that accompany
> fpsimd_flush_task_state() calls and are now redundant are removed
> as appropriate.
>
> fpsimd_flush_task_state() is used to get exclusive access to the
> representation of the task's state via task_struct, for the purpose
> of replacing the state.  Thus, the call to this function should
> happen before manipulating fpsimd_state or sve_state etc. in
> task_struct.  Anomalous cases are reordered appropriately in order
> to make the code more consistent, although there should be no
> functional difference since these cases are protected by
> local_bh_disable() anyway.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  arch/arm64/kernel/fpsimd.c | 25 +++++++++++++++++++------
>  arch/arm64/kernel/signal.c |  5 -----
>  2 files changed, 19 insertions(+), 11 deletions(-)
>
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 84c68b1..6b1ddae 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -569,7 +569,6 @@ int sve_set_vector_length(struct task_struct *task,
>  		local_bh_disable();
>
>  		fpsimd_save();
> -		set_thread_flag(TIF_FOREIGN_FPSTATE);
>  	}
>
>  	fpsimd_flush_task_state(task);
> @@ -835,12 +834,11 @@ asmlinkage void do_sve_acc(unsigned int esr, struct pt_regs *regs)
>  	local_bh_disable();
>
>  	fpsimd_save();
> -	fpsimd_to_sve(current);
>
>  	/* Force ret_to_user to reload the registers: */
>  	fpsimd_flush_task_state(current);
> -	set_thread_flag(TIF_FOREIGN_FPSTATE);
>
> +	fpsimd_to_sve(current);
>  	if (test_and_set_thread_flag(TIF_SVE))
>  		WARN_ON(1); /* SVE access shouldn't have trapped */
>
> @@ -917,9 +915,9 @@ void fpsimd_flush_thread(void)
>
>  	local_bh_disable();
>
> +	fpsimd_flush_task_state(current);
>  	memset(&current->thread.uw.fpsimd_state, 0,
>  	       sizeof(current->thread.uw.fpsimd_state));
> -	fpsimd_flush_task_state(current);
>
>  	if (system_supports_sve()) {
>  		clear_thread_flag(TIF_SVE);
> @@ -956,8 +954,6 @@ void fpsimd_flush_thread(void)
>  			current->thread.sve_vl_onexec = 0;
>  	}
>
> -	set_thread_flag(TIF_FOREIGN_FPSTATE);
> -
>  	local_bh_enable();
>  }
>
> @@ -1066,12 +1062,29 @@ void fpsimd_update_current_state(struct user_fpsimd_state const *state)
>
>  /*
>   * Invalidate live CPU copies of task t's FPSIMD state
> + *
> + * This function may be called with preemption enabled.  The barrier()
> + * ensures that the assignment to fpsimd_cpu is visible to any
> + * preemption/softirq that could race with set_tsk_thread_flag(), so
> + * that TIF_FOREIGN_FPSTATE cannot be spuriously re-cleared.
> + *
> + * The final barrier ensures that TIF_FOREIGN_FPSTATE is seen set by any
> + * subsequent code.
>   */
>  void fpsimd_flush_task_state(struct task_struct *t)
>  {
>  	t->thread.fpsimd_cpu = NR_CPUS;
> +
> +	barrier();
> +	set_tsk_thread_flag(t, TIF_FOREIGN_FPSTATE);
> +
> +	barrier();
>  }
>
> +/*
> + * Invalidate any task's FPSIMD state that is present on this cpu.
> + * This function must be called with softirqs disabled.
> + */
>  void fpsimd_flush_cpu_state(void)
>  {
>  	__this_cpu_write(fpsimd_last_state.st, NULL);
> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> index 511af13..7636965 100644
> --- a/arch/arm64/kernel/signal.c
> +++ b/arch/arm64/kernel/signal.c
> @@ -296,11 +296,6 @@ static int restore_sve_fpsimd_context(struct user_ctxs *user)
>  	 */
>
>  	fpsimd_flush_task_state(current);
> -	barrier();
> -	/* From now, fpsimd_thread_switch() won't clear TIF_FOREIGN_FPSTATE */
> -
> -	set_thread_flag(TIF_FOREIGN_FPSTATE);
> -	barrier();
>  	/* From now, fpsimd_thread_switch() won't touch thread.sve_state */
>
>  	sve_alloc(current);


--
Alex Bennée
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 02/16] KVM: arm64: Delete orphaned declaration for __fpsimd_enabled()
  2018-06-21 14:57   ` Dave Martin
@ 2018-07-06  9:08     ` Alex Bennée
  -1 siblings, 0 replies; 178+ messages in thread
From: Alex Bennée @ 2018-07-06  9:08 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> __fpsimd_enabled() no longer exists, but a dangling declaration has
> survived in kvm_hyp.h.
>
> This patch gets rid of it.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  arch/arm64/include/asm/kvm_hyp.h | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> index 384c343..9cbbd03 100644
> --- a/arch/arm64/include/asm/kvm_hyp.h
> +++ b/arch/arm64/include/asm/kvm_hyp.h
> @@ -147,7 +147,6 @@ void __debug_switch_to_host(struct kvm_vcpu *vcpu);
>
>  void __fpsimd_save_state(struct user_fpsimd_state *fp_regs);
>  void __fpsimd_restore_state(struct user_fpsimd_state *fp_regs);
> -bool __fpsimd_enabled(void);
>
>  void activate_traps_vhe_load(struct kvm_vcpu *vcpu);
>  void deactivate_traps_vhe_put(void);


--
Alex Bennée
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 00/16] KVM: arm64: Initial support for SVE guests
  2018-07-06  9:05     ` Dave Martin
@ 2018-07-06  9:20       ` Alex Bennée
  -1 siblings, 0 replies; 178+ messages in thread
From: Alex Bennée @ 2018-07-06  9:20 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> On Fri, Jul 06, 2018 at 09:22:47AM +0100, Alex Bennée wrote:
>>
>> Dave Martin <Dave.Martin@arm.com> writes:
>>
>> <snip>
>> >
>> > This series is somewhat tested on Arm Juno r0 and the Arm Fast Model
>> > (with/without SVE support).  arch/arm builds, but I've not booted
>> > it -- only some trivial refactoring in this series affects arch/arm.
>>
>> Now that QEMU linux-user SVE support is pretty much complete we've also
>> got preliminary patches for system emulation mode. However we currently
>> don't have VHE implemented so I guess we need to do that first before we
>> can test under QEMU.
>
> Qemu can use this as a KVM client without invasive changes, right?

Yeah for KVM use there aren't really any changes.

> For kvmtool, it's just a question of checking/setting a feature flag
> in KVM_ARM_PREFERRED_TARGET and KVM_ARM_VCPU_INIT, and (eventually)
> doing an ioctl() to set the set of permitted vector lengths (not yet,
> this will come in a follow-up series).

For the most part in QEMU with KVM we just treat the list of registers
we get from the OS as an opaque blob of things we save to memory and
pass back.

The CPU features code will probably need a bit of tweaking but it will
be minor. I'm not sure what the current status of cross-host CPU
migration is at the moment - that was mostly Christoffer's headache to
track ;-)

For things like gdbstub we need to add additional handling. I suspect as
we currently don't explicitly handle the SVE registers they won't show
up - I believe there are protocol updates for better handling these
registers coming.

I think migration should work out of the box but I'd need to double
check. Certainly it's a strong argument for getting VHE done in TCG mode
as we can exercise a lot of the common code.

>
> Cheers
> ---Dave


--
Alex Bennée
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 178+ messages in thread
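
As a point of reference, the opaque-blob pattern described above boils
down to something like the following userspace sketch (error handling
elided; the types and ioctls are from the existing KVM UAPI, and the
64-bit-value assumption is exactly what SVE will break):

	struct kvm_reg_list probe = { .n = 0 }, *list;
	__u64 i;

	ioctl(vcpu_fd, KVM_GET_REG_LIST, &probe);  /* fails with E2BIG, sets n */
	list = calloc(1, sizeof(*list) + probe.n * sizeof(__u64));
	list->n = probe.n;
	ioctl(vcpu_fd, KVM_GET_REG_LIST, list);

	for (i = 0; i < list->n; i++) {
		__u64 val;	/* assumes no register is wider than 64 bits */
		struct kvm_one_reg reg = {
			.id   = list->reg[i],
			.addr = (__u64)(unsigned long)&val,
		};

		ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);	/* save */
		/* ... later: KVM_SET_ONE_REG with the same id/addr ... */
	}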

* Re: [RFC PATCH 03/16] KVM: arm64: Refactor kvm_arm_num_regs() for easier maintenance
  2018-06-21 14:57   ` Dave Martin
@ 2018-07-06  9:20     ` Alex Bennée
  -1 siblings, 0 replies; 178+ messages in thread
From: Alex Bennée @ 2018-07-06  9:20 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> kvm_arm_num_regs() adds together various partial register counts in
> a freeform sum expression, which makes it harder than necessary to
> read diffs that add, modify or remove a single term in the sum
> (which is expected to be the common case under maintenance).
>
> This patch refactors the code to add the terms one per line, for
> maximum readability.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  arch/arm64/kvm/guest.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 56a0260..4a9d77c 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -205,8 +205,14 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>   */
>  unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
>  {
> -	return num_core_regs() + kvm_arm_num_sys_reg_descs(vcpu)
> -		+ kvm_arm_get_fw_num_regs(vcpu)	+ NUM_TIMER_REGS;
> +	unsigned long res = 0;
> +
> +	res += num_core_regs();
> +	res += kvm_arm_num_sys_reg_descs(vcpu);
> +	res += kvm_arm_get_fw_num_regs(vcpu);
> +	res += NUM_TIMER_REGS;
> +
> +	return res;
>  }
>
>  /**


--
Alex Bennée
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 04/16] KVM: arm64: Add missing #include of <linux/bitmap.h> to kvm_host.h
  2018-06-21 14:57   ` Dave Martin
@ 2018-07-06  9:21     ` Alex Bennée
  -1 siblings, 0 replies; 178+ messages in thread
From: Alex Bennée @ 2018-07-06  9:21 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> kvm_host.h uses DECLARE_BITMAP() to declare the features member of
> struct vcpu_arch, but the corresponding #include for this is
> missing.
>
> This patch adds a suitable #include for <linux/bitmap.h>.  Although
> the header builds without it today, this should help to avoid
> future surprises.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  arch/arm64/include/asm/kvm_host.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index fe8777b..92d6e88 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -22,6 +22,7 @@
>  #ifndef __ARM64_KVM_HOST_H__
>  #define __ARM64_KVM_HOST_H__
>
> +#include <linux/bitmap.h>
>  #include <linux/types.h>
>  #include <linux/kvm_types.h>
>  #include <asm/cpufeature.h>


--
Alex Bennée
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 00/16] KVM: arm64: Initial support for SVE guests
  2018-07-06  9:20       ` Alex Bennée
@ 2018-07-06  9:23         ` Peter Maydell
  -1 siblings, 0 replies; 178+ messages in thread
From: Peter Maydell @ 2018-07-06  9:23 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, Dave Martin, arm-mail-list

On 6 July 2018 at 10:20, Alex Bennée <alex.bennee@linaro.org> wrote:
> For the most part in QEMU with KVM we just treat the list of registers
> we get from the OS as an opaque blob of things we save to memory and
> pass back.

This is specifically not the case for the SVE registers -- we
will need extra code to handle them. (The current code only
allows for the possibility of 64-bit registers, so if we'd
allowed the kernel to hand it the larger SVE registers it would
just have fallen over. This is why QEMU needs to specifically
enable SVE via the VCPU_INIT call.)

thanks
-- PMM
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 178+ messages in thread
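
In userspace terms, the opt-in Peter describes amounts to setting a
feature bit at vcpu creation, roughly as sketched below (the
KVM_ARM_VCPU_SVE name is assumed here -- the actual flag is whatever
patch 16 of this series defines, and is not final ABI):

	struct kvm_vcpu_init init;

	memset(&init, 0, sizeof(init));
	ioctl(vm_fd, KVM_ARM_PREFERRED_TARGET, &init);	/* VM-level ioctl */
	init.features[0] |= 1U << KVM_ARM_VCPU_SVE;	/* assumed bit name */
	if (ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &init))
		err(1, "KVM_ARM_VCPU_INIT");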

* Re: [RFC PATCH 05/16] KVM: arm: Add arch init/uninit hooks
  2018-06-21 14:57   ` Dave Martin
@ 2018-07-06 10:02     ` Alex Bennée
  -1 siblings, 0 replies; 178+ messages in thread
From: Alex Bennée @ 2018-07-06 10:02 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> In preparation for adding support for SVE in guests on arm64, hooks
> for allocating and freeing additional per-vcpu memory are needed.
>
> kvm_arch_vcpu_setup() could be used for allocation, but this
> function is not clearly balanced by un "unsetup" function, making

Isn't that a double negative there? Surely it would be balanced by a
kvm_arch_vcpu_unsetup() or possibly better named function.

> it unclear where memory allocated in this function should be freed.
>
> To keep things simple, this patch defines backend hooks
> kvm_arm_arch_vcpu_{,un}unint(), and plumbs them in appropriately.

Is {,un} a notation for dropping un? This might be why I'm confused. I
would have written it as kvm_arm_arch_vcpu_[un]init() or even
kvm_arm_arch_vcpu_[init|uninit].

> The exusting kvm_arch_vcpu_init() function now calls

/existing/

> kvm_arm_arch_vcpu_init(), while an explicit kvm_arch_vcpu_uninit()
> is added which current does nothing except to call
> kvm_arm_arch_vcpu_uninit().

OK I'm a little confused by this. It seems to me that KVM already has
the provision for an init/uninit. What does the extra level of
indirection buy you that keeping the static inline
kvm_arm_arch_vcpu_uninit in arm/kvm_host.h and a concrete implementation
in arm64/kvm/guest.c doesn't?

>
> The backend functions are currently defined to do nothing.
>
> No functional change.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm/include/asm/kvm_host.h   |  4 +++-
>  arch/arm64/include/asm/kvm_host.h |  4 +++-
>  virt/kvm/arm/arm.c                | 13 ++++++++++++-
>  3 files changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 1f1fe410..9b902b8 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -284,10 +284,12 @@ struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
>  static inline bool kvm_arch_check_sve_has_vhe(void) { return true; }
>  static inline void kvm_arch_hardware_unsetup(void) {}
>  static inline void kvm_arch_sync_events(struct kvm *kvm) {}
> -static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
>  static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
>
> +static inline int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu) { return 0; }
> +static inline void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
> +
>  static inline void kvm_arm_init_debug(void) {}
>  static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 92d6e88..9671ddd 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -425,10 +425,12 @@ static inline bool kvm_arch_check_sve_has_vhe(void)
>
>  static inline void kvm_arch_hardware_unsetup(void) {}
>  static inline void kvm_arch_sync_events(struct kvm *kvm) {}
> -static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
>  static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
>
> +static inline int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu) { return 0; }
> +static inline void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
> +
>  void kvm_arm_init_debug(void);
>  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
>  void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index 04e554c..66f15cc 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -345,6 +345,8 @@ void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
>
>  int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
>  {
> +	int ret;
> +
>  	/* Force users to call KVM_ARM_VCPU_INIT */
>  	vcpu->arch.target = -1;
>  	bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES);
> @@ -354,7 +356,16 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
>
>  	kvm_arm_reset_debug_ptr(vcpu);
>
> -	return kvm_vgic_vcpu_init(vcpu);
> +	ret = kvm_vgic_vcpu_init(vcpu);
> +	if (ret)
> +		return ret;
> +
> +	return kvm_arm_arch_vcpu_init(vcpu);
> +}
> +
> +void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
> +{
> +	kvm_arm_arch_vcpu_uninit(vcpu);
>  }
>
>  void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)


--
Alex Bennée
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 00/16] KVM: arm64: Initial support for SVE guests
  2018-07-06  9:23         ` Peter Maydell
@ 2018-07-06 10:11           ` Alex Bennée
  -1 siblings, 0 replies; 178+ messages in thread
From: Alex Bennée @ 2018-07-06 10:11 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, Dave Martin, arm-mail-list


Peter Maydell <peter.maydell@linaro.org> writes:

> On 6 July 2018 at 10:20, Alex Bennée <alex.bennee@linaro.org> wrote:
>> For the most part in QEMU with KVM we just treat the list of registers
>> we get from the OS as an opaque blob of things we save to memory and
>> pass back.
>
> This is specifically not the case for the SVE registers -- we
> will need extra code to handle them. (The current code only
> allows for the possibility of 64-bit registers, so if we'd
> allowed the kernel to hand it the larger SVE registers it would
> just have fallen over. This is why QEMU needs to specifically
> enable SVE via the VCPU_INIT call.)

Ahh right. So currently both the KVM_GET_REG_LIST and the core registers
use KVM_GET/SET_ONE_REG which certainly won't do the trick. So I guess
we need another IOCTL (or to enhance the current one) and a KVM
capability bit as well?

>
> thanks
> -- PMM


--
Alex Bennée
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 00/16] KVM: arm64: Initial support for SVE guests
  2018-07-06 10:11           ` Alex Bennée
@ 2018-07-06 10:14             ` Peter Maydell
  -1 siblings, 0 replies; 178+ messages in thread
From: Peter Maydell @ 2018-07-06 10:14 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, Dave Martin, arm-mail-list

On 6 July 2018 at 11:11, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> Peter Maydell <peter.maydell@linaro.org> writes:
>
>> On 6 July 2018 at 10:20, Alex Bennée <alex.bennee@linaro.org> wrote:
>>> For the most part in QEMU with KVM we just treat the list of registers
>>> we get from the OS as an opaque blob of things we save to memory and
>>> pass back.
>>
>> This is specifically not the case for the SVE registers -- we
>> will need extra code to handle them. (The current code only
>> allows for the possibility of 64-bit registers, so if we'd
>> allowed the kernel to hand it the larger SVE registers it would
>> just have fallen over. This is why QEMU needs to specifically
>> enable SVE via the VCPU_INIT call.)
>
> Ahh right. So currently both the KVM_GET_REG_LIST and the core registers
> use KVM_GET/SET_ONE_REG which certainly won't do the trick. So I guess
> we need another IOCTL (or to enhance the current one) and a KVM
> capability bit as well?

It is still GET/SET_ONE_REG, but the size of the register
(which is encoded in its ID number) is not something the
current code will cope with (see the switches on
regidx & KVM_REG_SIZE_MASK in write_kvmstate_to_list()
and write_list_to_kvmstate()).

thanks
-- PMM
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 178+ messages in thread
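
For reference, the size Peter mentions lives in bits [55:52] of the
register ID, encoded as log2 of the size in bytes, so per-register
buffer sizing is a one-liner using the existing UAPI macros:

	#include <linux/kvm.h>

	static size_t reg_size_bytes(__u64 id)
	{
		return 1UL << ((id & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT);
	}

	/* e.g. a KVM_REG_SIZE_U2048 register (patch 13) yields 256 bytes */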

* Re: [RFC PATCH 06/16] arm64/sve: Determine virtualisation-friendly vector lengths
  2018-06-21 14:57   ` Dave Martin
@ 2018-07-06 13:20     ` Marc Zyngier
  -1 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2018-07-06 13:20 UTC (permalink / raw)
  To: Dave Martin, kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel,
	Catalin Marinas, Will Deacon, linux-arm-kernel

Hi Dave,

On 21/06/18 15:57, Dave Martin wrote:
> Software at EL1 is permitted to assume that when programming
> ZCR_EL1.LEN, the effective vector length is nonetheless a vector
> length supported by the CPU.  Thus, software may rely on the
> effective vector length being different from that programmed via
> ZCR_EL1.LEN in some situations.
> 
> However, KVM does not tightly bind vcpus to individual underlying
> physical CPUs.  As a result, vcpus can migrate from one CPU to
> another.  This means that in order to preserve the guarantee
> described in the previous paragraph, the set of supported vector
> lengths must appear to be the same for the vcpu at all times,
> irrespective of which physical CPU the vcpu is currently running
> on.
> 
> The Arm SVE architecture allows the maximum vector length visible
> to EL1 to be restricted by programming ZCR_EL2.LEN.  This provides
> a means to hide from guests any vector lengths that are not
> supported by every physical CPU in the system.  However, there is
> no way to hide a particular vector length while some greater vector
> length is exposed to EL1.
> 
> This patch determines the maximum vector length
> (sve_max_virtualisable_vl) for which the set of supported vector
> lengths not exceeding it is identical for all CPUs.  When KVM is
> available, the set of vector lengths supported by each late
> secondary CPU is verified to be consistent with those of the early
> CPUs, in order to ensure that the value chosen for
> sve_max_virtualisable_vl remains globally valid, and ensure that
> all created vcpus continue to behave correctly.
> 
> sve_secondary_vq_map is used as scratch space for these
> computations, rendering its name misleading.  This patch renames
> this bitmap to sve_tmp_vq_map in order to make its purpose clearer.

I'm slightly put off by this patch.

While it does a great job making sure we're always in a situation where
we can offer SVE to a guest, no matter how utterly broken the system is,
I wonder if there is a real value in jumping through all these hoops the
first place.

It is (sort of) reasonable to support a system that has different max
VLs (big-little) and cap it at the minimum of the max VLs (just like we
do for userspace). But I don't think it is reasonable to consider
systems that have different supported VLs in that range, and I don't
think any system exhibit that behaviour.

To put it another way, if a "small" CPU's supported VLs are not a strict
prefix of the "big" ones', we just disable SVE support in KVM. I'd be
tempted to do the same thing for userspace too, but that's a separate
discussion.

Can we turn this patch into one that checks the above condition instead
of trying to cope with an unacceptable level of braindeadness?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 178+ messages in thread
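
A sketch of the check Marc is proposing (helper name assumed; vq
bitmaps are indexed by vq - 1, as elsewhere in fpsimd.c): a secondary
CPU is acceptable only if its first disagreement with the boot CPU's
set lies above the capped maximum, i.e. its supported set is a strict
prefix of the boot set:

	static bool vq_set_is_prefix_compatible(const unsigned long *boot_map,
						const unsigned long *cpu_map,
						unsigned int max_vq)
	{
		DECLARE_BITMAP(tmp, SVE_VQ_MAX);

		bitmap_xor(tmp, boot_map, cpu_map, SVE_VQ_MAX);
		/* any mismatch at a vq <= max_vq breaks the EL1 guarantee */
		return find_first_bit(tmp, SVE_VQ_MAX) >= max_vq;
	}

On a mismatch, KVM's SVE support would simply be disabled rather than
trying to compute a globally virtualisable subset.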

* Re: [RFC PATCH 05/16] KVM: arm: Add arch init/uninit hooks
  2018-07-06 10:02     ` Alex Bennée
@ 2018-07-09 15:15       ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-07-09 15:15 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Fri, Jul 06, 2018 at 11:02:20AM +0100, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > In preparation for adding support for SVE in guests on arm64, hooks
> > for allocating and freeing additional per-vcpu memory are needed.
> >
> > kvm_arch_vcpu_setup() could be used for allocation, but this
> > function is not clearly balanced by un "unsetup" function, making
> 
> Isn't that a double negative there? Surely it would be balanced by a
> kvm_arch_vcpu_unsetup() or possibly better named function.

Yes, but there is no such function, and it wasn't clear what the
semantics of the existing hooks are supposed to be... so I didn't
feel comfortable adding an _unsetup().

I was trying to be minimally invasive while I got things working...

> > it unclear where memory allocated in this function should be freed.
> >
> > To keep things simple, this patch defines backend hooks
> > kvm_arm_arch_vcpu_{,un}unint(), and plumbs them in appropriately.
> 
> Is {,un} a notation for dropping un? This might be why I'm confused. I
> would have written it as kvm_arm_arch_vcpu_[un]init() or even
> kvm_arm_arch_vcpu_[init|uninit].

That should be kvm_arm_arch_vcpu_{,un}init().

Whether this is readily understood by people is another question.
This is the bash brace-expansion syntax which I'm in the habit of
using, partly because it looks nothing like C syntax, thus
"reducing" confusion.

Personally I make heavy use of this in the shell, like

	mv .config{,.old}

etc.  But that's me.  Maybe other people don't.

Too obscure?  I have a number of patches that follow this convention
upstream.

> > The exusting kvm_arch_vcpu_init() function now calls
> 
> /existing/
> 
> > kvm_arm_arch_vcpu_init(), while an explicit kvm_arch_vcpu_uninit()
> > is added which current does nothing except to call
> > kvm_arm_arch_vcpu_uninit().
> 
> OK I'm a little confused by this. It seems to me that KVM already has
> the provision for an init/uninit. What does the extra level on
> indirection buy you that keeping the static inline
> kvm_arm_arch_vcpu_uninit in arm/kvm_host.h and a concrete implementation
> in arm64/kvm/guest.c doesn't?

There isn't an intentional extra level of indirection, but the existing
code feels strangely factored due to the somewhat random use of
prefixes in function names (i.e., kvm_arch_*() is sometimes a hook from
KVM core into virt/kvm/arm/, sometimes a hook from virt/kvm/arm/ into the
arm or arm64 backend, and sometimes a hook from KVM core directly into
the arm or arm64 backend).

So, I wasn't really sure where to put things initially.

This patch was always a bit of a bodge, and the series as posted
doesn't fully make use of it anyway... so I'll need to revisit.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 10/16] KVM: arm64: Add a vcpu flag to control SVE visibility for the guest
  2018-06-21 14:57   ` Dave Martin
@ 2018-07-19 11:08     ` Andrew Jones
  -1 siblings, 0 replies; 178+ messages in thread
From: Andrew Jones @ 2018-07-19 11:08 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jun 21, 2018 at 03:57:34PM +0100, Dave Martin wrote:
> Since SVE will be enabled or disabled on a per-vcpu basis, a flag
> is needed in order to track which vcpus have it enabled.
> 
> This patch adds a suitable flag and a helper for checking it.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm64/include/asm/kvm_host.h | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 9671ddd..609d08b 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -308,6 +308,14 @@ struct kvm_vcpu_arch {
>  #define KVM_ARM64_FP_HOST		(1 << 2) /* host FP regs loaded */
>  #define KVM_ARM64_HOST_SVE_IN_USE	(1 << 3) /* backup for host TIF_SVE */
>  #define KVM_ARM64_HOST_SVE_ENABLED	(1 << 4) /* SVE enabled for EL0 */
> +#define KVM_ARM64_GUEST_HAS_SVE		(1 << 5) /* SVE exposed to guest */
> +
> +static inline bool vcpu_has_sve(struct kvm_vcpu_arch const *vcpu_arch)
> +{
> +	return system_supports_sve() &&

system_supports_sve() checks cpus_have_const_cap(), not
this_cpu_has_cap(), so, iiuc, the result of this check won't
change, regardless of which cpu it's run on at the time.

> +		(vcpu_arch->flags & KVM_ARM64_GUEST_HAS_SVE);

Since this flag can only be set if system_supports_sve() is
true at vcpu init time, it isn't necessary to always check
system_supports_sve() in this function. Or, should
system_supports_sve() be changed to use this_cpu_has_cap()?

Thanks,
drew

> +
> +}
>  
>  #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.gp_regs)
>  
> -- 
> 2.1.4
> 
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 178+ messages in thread
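
On the cost question drew raises: cpus_have_const_cap() compiles to a
static branch once the capabilities are finalised, so the
system_supports_sve() test in vcpu_has_sve() is essentially free on
non-SVE systems.  Roughly, from arch/arm64/include/asm/cpufeature.h of
this era:

	static inline bool system_supports_sve(void)
	{
		return IS_ENABLED(CONFIG_ARM64_SVE) &&
			cpus_have_const_cap(ARM64_SVE);
	}

This also bears on the first observation: the result is deliberately
system-wide rather than per-CPU, since SVE is only usable at all when
every CPU supports it.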

* Re: [RFC PATCH 11/16] KVM: arm64/sve: System register context switch and access support
  2018-06-21 14:57   ` Dave Martin
@ 2018-07-19 11:11     ` Andrew Jones
  -1 siblings, 0 replies; 178+ messages in thread
From: Andrew Jones @ 2018-07-19 11:11 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jun 21, 2018 at 03:57:35PM +0100, Dave Martin wrote:
> This patch adds the necessary support for context switching ZCR_EL1
> for each vcpu.
> 
> The ID_AA64PFR0_EL1 emulation code is updated to expose the
> presence of SVE to the guest if appropriate, and ioctl() access to
> ZCR_EL1 is also added.
> 
> In the context switch code itself, ZCR_EL1 is context switched if
> the host is SVE-capable, irrespective for now of whether SVE is
> exposed to the guest or not.  Adding a dynamic vcpu_has_sve() check
> may lose as much performance as would be gained in this simple
> case.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm64/include/asm/kvm_host.h |  1 +
>  arch/arm64/include/asm/sysreg.h   |  3 +++
>  arch/arm64/kvm/hyp/sysreg-sr.c    |  5 +++++
>  arch/arm64/kvm/sys_regs.c         | 14 +++++++++-----
>  4 files changed, 18 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 609d08b..f331abf 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -111,6 +111,7 @@ enum vcpu_sysreg {
>  	SCTLR_EL1,	/* System Control Register */
>  	ACTLR_EL1,	/* Auxiliary Control Register */
>  	CPACR_EL1,	/* Coprocessor Access Control */
> +	ZCR_EL1,	/* SVE Control */
>  	TTBR0_EL1,	/* Translation Table Base Register 0 */
>  	TTBR1_EL1,	/* Translation Table Base Register 1 */
>  	TCR_EL1,	/* Translation Control Register */
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index a8f8481..6476dbd 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -416,6 +416,9 @@
>  #define SYS_ICH_LR14_EL2		__SYS__LR8_EL2(6)
>  #define SYS_ICH_LR15_EL2		__SYS__LR8_EL2(7)
>  
> +/* VHE encodings for architectural EL0/1 system registers */
> +#define SYS_ZCR_EL12			sys_reg(3, 5, 1, 2, 0)
> +
>  /* Common SCTLR_ELx flags. */
>  #define SCTLR_ELx_EE    (1 << 25)
>  #define SCTLR_ELx_IESB	(1 << 21)
> diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
> index 35bc168..0f4046a 100644
> --- a/arch/arm64/kvm/hyp/sysreg-sr.c
> +++ b/arch/arm64/kvm/hyp/sysreg-sr.c
> @@ -21,6 +21,7 @@
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_emulate.h>
>  #include <asm/kvm_hyp.h>
> +#include <asm/sysreg.h>
>  
>  /*
>   * Non-VHE: Both host and guest must save everything.
> @@ -57,6 +58,8 @@ static void __hyp_text __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
>  	ctxt->sys_regs[SCTLR_EL1]	= read_sysreg_el1(sctlr);
>  	ctxt->sys_regs[ACTLR_EL1]	= read_sysreg(actlr_el1);
>  	ctxt->sys_regs[CPACR_EL1]	= read_sysreg_el1(cpacr);
> +	if (system_supports_sve()) /* implies has_vhe() */
> +		ctxt->sys_regs[ZCR_EL1]	= read_sysreg_s(SYS_ZCR_EL12);
>  	ctxt->sys_regs[TTBR0_EL1]	= read_sysreg_el1(ttbr0);
>  	ctxt->sys_regs[TTBR1_EL1]	= read_sysreg_el1(ttbr1);
>  	ctxt->sys_regs[TCR_EL1]		= read_sysreg_el1(tcr);
> @@ -129,6 +132,8 @@ static void __hyp_text __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
>  	write_sysreg_el1(ctxt->sys_regs[SCTLR_EL1],	sctlr);
>  	write_sysreg(ctxt->sys_regs[ACTLR_EL1],	  	actlr_el1);
>  	write_sysreg_el1(ctxt->sys_regs[CPACR_EL1],	cpacr);
> +	if (system_supports_sve()) /* implies has_vhe() */
> +		write_sysreg_s(ctxt->sys_regs[ZCR_EL1],	SYS_ZCR_EL12);

I feel like the ZCR_EL12 save/restores are out of place, as these
functions are shared by non-VHE and VHE. Maybe they should be
moved to the VHE callers?

Thanks,
drew

^ permalink raw reply	[flat|nested] 178+ messages in thread
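
A sketch of the relocation drew suggests (helper names assumed): keep
__sysreg_save_el1_state()/__sysreg_restore_el1_state() shared and
VHE-agnostic, and do the ZCR access only from VHE-only paths, e.g.:

	/* called only from the sysreg_*_vhe() entry points, never non-VHE */
	static void __sysreg_save_zcr_vhe(struct kvm_cpu_context *ctxt)
	{
		if (system_supports_sve())
			ctxt->sys_regs[ZCR_EL1] = read_sysreg_s(SYS_ZCR_EL12);
	}

	static void __sysreg_restore_zcr_vhe(struct kvm_cpu_context *ctxt)
	{
		if (system_supports_sve())
			write_sysreg_s(ctxt->sys_regs[ZCR_EL1], SYS_ZCR_EL12);
	}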

* Re: [RFC PATCH 14/16] KVM: arm64/sve: Add SVE support to register access ioctl interface
  2018-06-21 14:57   ` Dave Martin
@ 2018-07-19 13:04     ` Andrew Jones
  -1 siblings, 0 replies; 178+ messages in thread
From: Andrew Jones @ 2018-07-19 13:04 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jun 21, 2018 at 03:57:38PM +0100, Dave Martin wrote:
> This patch adds the following registers for access via the
> KVM_{GET,SET}_ONE_REG interface:
> 
>  * KVM_REG_ARM64_SVE_ZREG(n, i) (n = 0..31) (in 2048-bit slices)
>  * KVM_REG_ARM64_SVE_PREG(n, i) (n = 0..15) (in 256-bit slices)
>  * KVM_REG_ARM64_SVE_FFR(i) (in 256-bit slices)
> 
> In order to adapt gracefully to future architectural extensions,
> the registers are divided up into slices as noted above:  the i
> parameter denotes the slice index.
> 
> For simplicity, bits or slices that exceed the maximum vector
> length supported for the vcpu are ignored for KVM_SET_ONE_REG, and
> read as zero for KVM_GET_ONE_REG.
> 
> For the current architecture, only slice i = 0 is significant.  The
> interface design allows i to increase to up to 31 in the future if
> required by future architectural amendments.
> 
> The registers are only visible for vcpus that have SVE enabled.
> They are not enumerated by KVM_GET_REG_LIST on vcpus that do not
> have SVE.  In all cases, surplus slices are not enumerated by
> KVM_GET_REG_LIST.
> 
> Accesses to the FPSIMD registers via KVM_REG_ARM_CORE are
> redirected to access the underlying vcpu SVE register storage as
> appropriate.  In order to make this more straightforward, register
> accesses that straddle register boundaries are no longer guaranteed
> to succeed.  (Support for such use was never deliberate, and
> userspace does not currently seem to be relying on it.)
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm64/include/uapi/asm/kvm.h |  10 ++
>  arch/arm64/kvm/guest.c            | 219 +++++++++++++++++++++++++++++++++++---
>  2 files changed, 216 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 4e76630..f54a9b0 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -213,6 +213,16 @@ struct kvm_arch_memory_slot {
>  					 KVM_REG_ARM_FW | ((r) & 0xffff))
>  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
>  
> +/* SVE registers */
> +#define KVM_REG_ARM64_SVE		(0x15 << KVM_REG_ARM_COPROC_SHIFT)
> +#define KVM_REG_ARM64_SVE_ZREG(n, i)	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
> +					 KVM_REG_SIZE_U2048 |		\
> +					 ((n) << 5) | (i))
> +#define KVM_REG_ARM64_SVE_PREG(n, i)	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
> +					 KVM_REG_SIZE_U256 |		\
> +					 ((n) << 5) | (i) | 0x400)
> +#define KVM_REG_ARM64_SVE_FFR(i)	KVM_REG_ARM64_SVE_PREG(16, i)
> +
>  /* Device Control API: ARM VGIC */
>  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
>  #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS	1
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 4a9d77c..005394b 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -23,14 +23,19 @@
>  #include <linux/err.h>
>  #include <linux/kvm_host.h>
>  #include <linux/module.h>
> +#include <linux/uaccess.h>
>  #include <linux/vmalloc.h>
>  #include <linux/fs.h>
> +#include <linux/stddef.h>
>  #include <kvm/arm_psci.h>
>  #include <asm/cputype.h>
>  #include <linux/uaccess.h>
> +#include <asm/fpsimd.h>
>  #include <asm/kvm.h>
>  #include <asm/kvm_emulate.h>
>  #include <asm/kvm_coproc.h>
> +#include <asm/kvm_host.h>
> +#include <asm/sigcontext.h>
>  
>  #include "trace.h"
>  
> @@ -57,6 +62,106 @@ static u64 core_reg_offset_from_id(u64 id)
>  	return id & ~(KVM_REG_ARCH_MASK | KVM_REG_SIZE_MASK | KVM_REG_ARM_CORE);
>  }
>  
> +static bool is_zreg(const struct kvm_one_reg *reg)
> +{
> +	return	reg->id >= KVM_REG_ARM64_SVE_ZREG(0, 0) &&
> +		reg->id <= KVM_REG_ARM64_SVE_ZREG(SVE_NUM_ZREGS, 0x1f);
> +}
> +
> +static bool is_preg(const struct kvm_one_reg *reg)
> +{
> +	return	reg->id >= KVM_REG_ARM64_SVE_PREG(0, 0) &&
> +		reg->id <= KVM_REG_ARM64_SVE_FFR(0x1f);
> +}
> +
> +static unsigned int sve_reg_num(const struct kvm_one_reg *reg)
> +{
> +	return (reg->id >> 5) & 0x1f;
> +}
> +
> +static unsigned int sve_reg_index(const struct kvm_one_reg *reg)
> +{
> +	return reg->id & 0x1f;
> +}
> +
> +struct reg_bounds_struct {
> +	char *kptr;
> +	size_t start_offset;

Maybe start_offset gets used in a later patch, but it doesn't seem
to be used here.

> +	size_t copy_count;
> +	size_t flush_count;
> +};
> +
> +static int copy_bounded_reg_to_user(void __user *uptr,
> +				    const struct reg_bounds_struct *b)
> +{
> +	if (copy_to_user(uptr, b->kptr, b->copy_count) ||
> +	    clear_user((char __user *)uptr + b->copy_count, b->flush_count))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +static int copy_bounded_reg_from_user(const struct reg_bounds_struct *b,
> +				      const void __user *uptr)
> +{
> +	if (copy_from_user(b->kptr, uptr, b->copy_count))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +static int fpsimd_vreg_bounds(struct reg_bounds_struct *b,
> +			      struct kvm_vcpu *vcpu,
> +			      const struct kvm_one_reg *reg)
> +{
> +	const size_t stride = KVM_REG_ARM_CORE_REG(fp_regs.vregs[1]) -
> +				KVM_REG_ARM_CORE_REG(fp_regs.vregs[0]);
> +	const size_t start = KVM_REG_ARM_CORE_REG(fp_regs.vregs[0]);
> +	const size_t limit = KVM_REG_ARM_CORE_REG(fp_regs.vregs[32]);
> +
> +	const u64 uoffset = core_reg_offset_from_id(reg->id);
> +	size_t usize = KVM_REG_SIZE(reg->id);
> +	size_t start_vreg, end_vreg;
> +
> +	if (WARN_ON((reg->id & KVM_REG_ARM_COPROC_MASK) != KVM_REG_ARM_CORE))
> +		return -ENOENT;

This warn-on can never fire, as the condition was already checked to even
get here, back in kvm_arm_set_reg(). If there's concern this function will
get called from the wrong place someday, then we should make it a bug-on.
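
i.e. (untested), since there's no error to propagate in that case:

  BUG_ON((reg->id & KVM_REG_ARM_COPROC_MASK) != KVM_REG_ARM_CORE);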

> +
> +	if (usize % sizeof(u32))
> +		return -EINVAL;

We should do the below "is a vreg" check first. Otherwise we may return
-EINVAL for a valid non-vreg. Actually I think we should check whether
the reg is a vreg in get/set_core_reg and only come here if it is, rather
than coming here unconditionally and then requiring the handling of -ENOENT.

> +
> +	usize /= sizeof(u32);
> +
> +	if ((uoffset <= start && usize <= start - uoffset) ||
> +	    uoffset >= limit)
> +		return -ENOENT;	/* not a vreg */
> +
> +	BUILD_BUG_ON(uoffset > limit);

Hmm, a build bug on uoffset can't be right, it's not a constant.
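
If the point is just to document the invariant that the
"uoffset >= limit" check above already guarantees, a runtime
assertion would do (untested):

  if (WARN_ON(uoffset > limit))
      return -EINVAL;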

> +	if (uoffset < start || usize > limit - uoffset)
> +		return -EINVAL;	/* overlaps vregs[] bounds */
> +
> +	start_vreg = (uoffset - start) / stride;
> +	end_vreg = ((uoffset - start) + usize - 1) / stride;
> +	if (start_vreg != end_vreg)
> +		return -EINVAL;	/* spans multiple vregs */

Aren't the above three lines equivalent to just (usize > stride)?

> +
> +	b->start_offset = ((uoffset - start) % stride) * sizeof(u32);
> +	b->copy_count = usize * sizeof(u32);
> +	b->flush_count = 0;
> +
> +	if (vcpu_has_sve(&vcpu->arch)) {
> +		const unsigned int vq = sve_vq_from_vl(vcpu->arch.sve_max_vl);
> +
> +		b->kptr = vcpu->arch.sve_state;
> +		b->kptr += (SVE_SIG_ZREG_OFFSET(vq, start_vreg) -
> +			    SVE_SIG_REGS_OFFSET);
> +	} else {
> +		b->kptr = (char *)&vcpu_gp_regs(vcpu)->fp_regs.vregs[
> +				start_vreg];
> +	}
> +
> +	return 0;
> +}
> +
>  static int get_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  {
>  	/*
> @@ -65,11 +170,20 @@ static int get_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	 * array. Hence below, nr_regs is the number of entries, and
>  	 * off the index in the "array".
>  	 */
> +	int err;
> +	struct reg_bounds_struct b;
>  	__u32 __user *uaddr = (__u32 __user *)(unsigned long)reg->addr;
>  	struct kvm_regs *regs = vcpu_gp_regs(vcpu);
>  	int nr_regs = sizeof(*regs) / sizeof(__u32);
>  	u32 off;
>  
> +	err = fpsimd_vreg_bounds(&b, vcpu, reg);
> +	switch (err) {
> +	case 0:		return copy_bounded_reg_to_user(uaddr, &b);
> +	case -ENOENT:	break;	/* not and FPSIMD vreg */

not an

> +	default:	return err;
> +	}
> +
>  	/* Our ID is an index into the kvm_regs struct. */
>  	off = core_reg_offset_from_id(reg->id);

How about instead of the above switch we just do this, with adjusted
sanity checks in fpsimd_vreg_bounds?

  if (off >= KVM_REG_ARM_CORE_REG(fp_regs.vregs[0])) {
      err = fpsimd_vreg_bounds(&b, vcpu, reg);
      if (!err)
          return copy_bounded_reg_to_user(uaddr, &b);
      return err;
  }

>  	if (off >= nr_regs ||
> @@ -84,14 +198,23 @@ static int get_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  
>  static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  {
> +	int err;
> +	struct reg_bounds_struct b;
>  	__u32 __user *uaddr = (__u32 __user *)(unsigned long)reg->addr;
>  	struct kvm_regs *regs = vcpu_gp_regs(vcpu);
>  	int nr_regs = sizeof(*regs) / sizeof(__u32);
>  	__uint128_t tmp;
>  	void *valp = &tmp;
>  	u64 off;
> -	int err = 0;
>  
> +	err = fpsimd_vreg_bounds(&b, vcpu, reg);
> +	switch (err) {
> +	case 0:		return copy_bounded_reg_from_user(&b, uaddr);
> +	case -ENOENT:	break;	/* not and FPSIMD vreg */

not an

and same comments as for get_core_reg

> +	default:	return err;
> +	}
> +
> +	err = 0;
>  	/* Our ID is an index into the kvm_regs struct. */
>  	off = core_reg_offset_from_id(reg->id);
>  	if (off >= nr_regs ||
> @@ -130,6 +253,78 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	return err;
>  }
>  
> +static int sve_reg_bounds(struct reg_bounds_struct *b,
> +			  const struct kvm_vcpu *vcpu,
> +			  const struct kvm_one_reg *reg)
> +{
> +	unsigned int n = sve_reg_num(reg);
> +	unsigned int i = sve_reg_index(reg);
> +	unsigned int vl = vcpu->arch.sve_max_vl;
> +	unsigned int vq = sve_vq_from_vl(vl);
> +	unsigned int start, copy_limit, limit;
> +
> +	b->kptr = vcpu->arch.sve_state;
> +	if (is_zreg(reg)) {
> +		b->kptr += SVE_SIG_ZREG_OFFSET(vq, n) - SVE_SIG_REGS_OFFSET;
> +		start = i * 0x100;
> +		limit = start + 0x100;
> +		copy_limit = vl;
> +	} else if (is_preg(reg)) {
> +		b->kptr += SVE_SIG_PREG_OFFSET(vq, n) - SVE_SIG_REGS_OFFSET;
> +		start = i * 0x20;
> +		limit = start + 0x20;
> +		copy_limit = vl / 8;
> +	} else {
> +		WARN_ON(1);
> +		start = 0;
> +		copy_limit = limit = 0;

Instead of WARN_ON, shouldn't this be a return -EINVAL that gets
propagated to the user?
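
i.e. (untested):

  } else {
      return -EINVAL;
  }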

> +	}
> +
> +	b->kptr += start;
> +
> +	if (copy_limit < start)
> +		copy_limit = start;
> +	else if (copy_limit > limit)
> +		copy_limit = limit;

 copy_limit = clamp(copy_limit, start, limit)

> +
> +	b->copy_count = copy_limit - start;
> +	b->flush_count = limit - copy_limit;

nit: might be nice (less error prone?) to set b->kptr once here with
the other bounds members, e.g.

  b->kptr = arch.sve_state + sve_reg_type_off + start;

> +
> +	return 0;
> +}
> +
> +static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> +{
> +	int ret;
> +	struct reg_bounds_struct b;
> +	char __user *uptr = (char __user *)reg->addr;
> +
> +	if (!vcpu_has_sve(&vcpu->arch))
> +		return -ENOENT;
> +
> +	ret = sve_reg_bounds(&b, vcpu, reg);
> +	if (ret)
> +		return ret;
> +
> +	return copy_bounded_reg_to_user(uptr, &b);
> +}
> +
> +static int set_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> +{
> +	int ret;
> +	struct reg_bounds_struct b;
> +	char __user *uptr = (char __user *)reg->addr;
> +
> +	if (!vcpu_has_sve(&vcpu->arch))
> +		return -ENOENT;
> +
> +	ret = sve_reg_bounds(&b, vcpu, reg);
> +	if (ret)
> +		return ret;
> +
> +	return copy_bounded_reg_from_user(&b, uptr);
> +}
> +
>  int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
>  {
>  	return -EINVAL;
> @@ -251,12 +446,11 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
>  		return -EINVAL;
>  
> -	/* Register group 16 means we want a core register. */
> -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> -		return get_core_reg(vcpu, reg);
> -
> -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> -		return kvm_arm_get_fw_reg(vcpu, reg);
> +	switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
> +	case KVM_REG_ARM_CORE:	return get_core_reg(vcpu, reg);
> +	case KVM_REG_ARM_FW:	return kvm_arm_get_fw_reg(vcpu, reg);
> +	case KVM_REG_ARM64_SVE:	return get_sve_reg(vcpu, reg);
> +	}
>  
>  	if (is_timer_reg(reg->id))
>  		return get_timer_reg(vcpu, reg);
> @@ -270,12 +464,11 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
>  		return -EINVAL;
>  
> -	/* Register group 16 means we set a core register. */
> -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> -		return set_core_reg(vcpu, reg);
> -
> -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> -		return kvm_arm_set_fw_reg(vcpu, reg);
> +	switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
> +	case KVM_REG_ARM_CORE:	return set_core_reg(vcpu, reg);
> +	case KVM_REG_ARM_FW:	return kvm_arm_set_fw_reg(vcpu, reg);
> +	case KVM_REG_ARM64_SVE:	return set_sve_reg(vcpu, reg);
> +	}
>  
>  	if (is_timer_reg(reg->id))
>  		return set_timer_reg(vcpu, reg);
> -- 
> 2.1.4

Thanks,
drew

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 12/16] KVM: arm64/sve: Context switch the SVE registers
  2018-06-21 14:57   ` Dave Martin
@ 2018-07-19 13:13     ` Andrew Jones
  -1 siblings, 0 replies; 178+ messages in thread
From: Andrew Jones @ 2018-07-19 13:13 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jun 21, 2018 at 03:57:36PM +0100, Dave Martin wrote:
> In order to give each vcpu its own view of the SVE registers, this
> patch adds context storage via a new sve_state pointer in struct
> vcpu_arch.  An additional member sve_max_vl is also added for each
> vcpu, to determine the maximum vector length visible to the guest
> and thus the value to be configured in ZCR_EL2.LEN while the vcpu is
> active.  This also determines the layout and size of the storage in
> sve_state, which is read and written by the same backend functions
> that are used for context-switching the SVE state for host tasks.
> 
> On SVE-enabled vcpus, SVE access traps are now handled by switching
> in the vcpu's SVE context and disabling the trap before returning
> to the guest.  On other vcpus, the trap is not handled and an exit
> back to the host occurs, where the handle_sve() fallback path
> reflects an undefined instruction exception back to the guest,
> consistent with the behaviour of non-SVE-capable hardware (as was
> done unconditionally prior to this patch).
> 
> No SVE handling is added on non-VHE-only paths, since VHE is an
> architectural and Kconfig prerequisite of SVE.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm64/include/asm/kvm_host.h |  2 ++
>  arch/arm64/kvm/fpsimd.c           |  5 +++--
>  arch/arm64/kvm/hyp/switch.c       | 43 ++++++++++++++++++++++++++++++---------
>  3 files changed, 38 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index f331abf..d2084ae 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -211,6 +211,8 @@ typedef struct kvm_cpu_context kvm_cpu_context_t;
>  
>  struct kvm_vcpu_arch {
>  	struct kvm_cpu_context ctxt;
> +	void *sve_state;
> +	unsigned int sve_max_vl;
>  
>  	/* HYP configuration */
>  	u64 hcr_el2;
> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> index 872008c..44cf783 100644
> --- a/arch/arm64/kvm/fpsimd.c
> +++ b/arch/arm64/kvm/fpsimd.c
> @@ -86,10 +86,11 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
>  
>  	if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
>  		fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.gp_regs.fp_regs,
> -					 NULL, sve_max_vl);
> +					 vcpu->arch.sve_state,
> +					 vcpu->arch.sve_max_vl);
>  
>  		clear_thread_flag(TIF_FOREIGN_FPSTATE);
> -		clear_thread_flag(TIF_SVE);
> +		update_thread_flag(TIF_SVE, vcpu_has_sve(&vcpu->arch));
>  	}
>  }
>  
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index d496ef5..98df5c1 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -98,8 +98,13 @@ static void activate_traps_vhe(struct kvm_vcpu *vcpu)
>  	val = read_sysreg(cpacr_el1);
>  	val |= CPACR_EL1_TTA;
>  	val &= ~CPACR_EL1_ZEN;
> -	if (!update_fp_enabled(vcpu))
> +
> +	if (update_fp_enabled(vcpu)) {
> +		if (vcpu_has_sve(&vcpu->arch))
> +			val |= CPACR_EL1_ZEN;
> +	} else {
>  		val &= ~CPACR_EL1_FPEN;
> +	}
>  
>  	write_sysreg(val, cpacr_el1);
>  
> @@ -114,6 +119,7 @@ static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu)
>  
>  	val = CPTR_EL2_DEFAULT;
>  	val |= CPTR_EL2_TTA | CPTR_EL2_TZ;
> +
>  	if (!update_fp_enabled(vcpu))
>  		val |= CPTR_EL2_TFP;
>  
> @@ -329,16 +335,22 @@ static bool __hyp_text __skip_instr(struct kvm_vcpu *vcpu)
>  	}
>  }
>  
> -static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
> +static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu,
> +					   bool guest_has_sve)
>  {
>  	struct user_fpsimd_state *host_fpsimd = vcpu->arch.host_fpsimd_state;
>  
> -	if (has_vhe())
> -		write_sysreg(read_sysreg(cpacr_el1) | CPACR_EL1_FPEN,
> -			     cpacr_el1);
> -	else
> +	if (has_vhe()) {
> +		u64 reg = read_sysreg(cpacr_el1) | CPACR_EL1_FPEN;
> +
> +		if (system_supports_sve() && guest_has_sve)

guest_has_sve is only true when vcpu_arch->flags & KVM_ARM64_GUEST_HAS_SVE
is true, which can only be true when system_supports_sve() is true. So
I don't think we need system_supports_sve() here. guest_has_sve should be
enough.
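
i.e. just (untested):

  if (guest_has_sve)
      reg |= CPACR_EL1_ZEN;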

> +			reg |= CPACR_EL1_ZEN;
> +
> +		write_sysreg(reg, cpacr_el1);
> +	} else {
>  		write_sysreg(read_sysreg(cptr_el2) & ~(u64)CPTR_EL2_TFP,
>  			     cptr_el2);
> +	}
>  
>  	isb();
>  
> @@ -361,7 +373,13 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
>  		vcpu->arch.flags &= ~KVM_ARM64_FP_HOST;
>  	}
>  
> -	__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
> +	if (system_supports_sve() && guest_has_sve)

here too

> +		sve_load_state((char *)vcpu->arch.sve_state +
> +					sve_ffr_offset(vcpu->arch.sve_max_vl),
> +			       &vcpu->arch.ctxt.gp_regs.fp_regs.fpsr,
> +			       sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1);
> +	else
> +		__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
>  
>  	/* Skip restoring fpexc32 for AArch64 guests */
>  	if (!(read_sysreg(hcr_el2) & HCR_RW))
> @@ -380,6 +398,8 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
>   */
>  static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
>  {
> +	bool guest_has_sve;
> +
>  	if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ)
>  		vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
>  
> @@ -397,10 +417,13 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
>  	 * and restore the guest context lazily.
>  	 * If FP/SIMD is not implemented, handle the trap and inject an
>  	 * undefined instruction exception to the guest.
> +	 * Similarly for trapped SVE accesses.
>  	 */
> -	if (system_supports_fpsimd() &&
> -	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
> -		return __hyp_switch_fpsimd(vcpu);
> +	guest_has_sve = vcpu_has_sve(&vcpu->arch);
> +	if ((system_supports_fpsimd() &&
> +	     kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD) ||
> +	    (guest_has_sve && kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SVE))
> +		return __hyp_switch_fpsimd(vcpu, guest_has_sve);
>  
>  	if (!__populate_fault_info(vcpu))
>  		return true;
> -- 
> 2.1.4

Thanks,
drew

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 15/16] KVM: arm64: Enumerate SVE register indices for KVM_GET_REG_LIST
  2018-06-21 14:57   ` Dave Martin
@ 2018-07-19 14:12     ` Andrew Jones
  -1 siblings, 0 replies; 178+ messages in thread
From: Andrew Jones @ 2018-07-19 14:12 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jun 21, 2018 at 03:57:39PM +0100, Dave Martin wrote:
> This patch includes the SVE register IDs in the list returned by
> KVM_GET_REG_LIST, as appropriate.
> 
> On a non-SVE-enabled vcpu, no extra IDs are added.
> 
> On an SVE-enabled vcpu, the appropriate number of slice IDs are
> enumerated for each SVE register, depending on the maximum vector
> length for the vcpu.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm64/kvm/guest.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 73 insertions(+)
> 
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 005394b..5152362 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -21,6 +21,7 @@
>  
>  #include <linux/errno.h>
>  #include <linux/err.h>
> +#include <linux/kernel.h>
>  #include <linux/kvm_host.h>
>  #include <linux/module.h>
>  #include <linux/uaccess.h>
> @@ -253,6 +254,73 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	return err;
>  }
>  
> +static void copy_reg_index_to_user(u64 __user **uind, int *total, int *cerr,
> +				   u64 id)
> +{
> +	int err;
> +
> +	if (*cerr)
> +		return;
> +
> +	if (uind) {
> +		err = put_user(id, *uind);
> +		if (err) {
> +			*cerr = err;
> +			return;
> +		}
> +	}
> +
> +	++*total;
> +	if (uind)
> +		++*uind;
> +}
> +
> +static int enumerate_sve_regs(const struct kvm_vcpu *vcpu, u64 __user **uind)
> +{
> +	unsigned int n, i;
> +	int err = 0;
> +	int total = 0;
> +	unsigned int slices;
> +
> +	if (!vcpu_has_sve(&vcpu->arch))
> +		return 0;
> +
> +	slices = DIV_ROUND_UP(vcpu->arch.sve_max_vl,
> +			      KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));
> +
> +	for (n = 0; n < SVE_NUM_ZREGS; ++n)
> +		for (i = 0; i < slices; ++i)
> +			copy_reg_index_to_user(uind, &total, &err,
> +					       KVM_REG_ARM64_SVE_ZREG(n, i));
> +
> +	for (n = 0; n < SVE_NUM_PREGS; ++n)
> +		for (i = 0; i < slices; ++i)
> +			copy_reg_index_to_user(uind, &total, &err,
> +					       KVM_REG_ARM64_SVE_PREG(n, i));
> +
> +	for (i = 0; i < slices; ++i)
> +		copy_reg_index_to_user(uind, &total, &err,
> +				       KVM_REG_ARM64_SVE_FFR(i));
> +
> +	if (err)
> +		return -EFAULT;
> +
> +	return total;
> +}
> +
> +static unsigned long num_sve_regs(const struct kvm_vcpu *vcpu)
> +{
> +	return enumerate_sve_regs(vcpu, NULL);
> +}
> +
> +static int copy_sve_reg_indices(const struct kvm_vcpu *vcpu, u64 __user **uind)
> +{
> +	int err;
> +
> +	err = enumerate_sve_regs(vcpu, uind);
> +	return err < 0 ? err : 0;
> +}

I see the above functions were inspired by walk_sys_regs(), but, IMHO,
they're a bit overcomplicated. How about this untested approach?

diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 56a0260ceb11..0188a8b30d46 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -130,6 +130,52 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	return err;
 }
 
+static int enumerate_sve_regs(const struct kvm_vcpu *vcpu, u64 __user *uind)
+{
+	unsigned int slices = DIV_ROUND_UP(vcpu->arch.sve_max_vl,
+				KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));
+	unsigned int n, i;
+
+	if (!vcpu_has_sve(&vcpu->arch))
+		return 0;
+
+	for (n = 0; n < SVE_NUM_ZREGS; ++n) {
+		for (i = 0; i < slices; ++i) {
+			if (put_user(KVM_REG_ARM64_SVE_ZREG(n, i), uind++))
+				return -EFAULT;
+		}
+	}
+
+	for (n = 0; n < SVE_NUM_PREGS; ++n) {
+		for (i = 0; i < slices; ++i) {
+			if (put_user(KVM_REG_ARM64_SVE_PREG(n, i), uind++))
+				return -EFAULT;
+		}
+	}
+
+	for (i = 0; i < slices; ++i) {
+		if (put_user(KVM_REG_ARM64_SVE_FFR(i), uind++))
+			return -EFAULT;
+	}
+
+	return 0;
+}
+
+static unsigned long num_sve_regs(const struct kvm_vcpu *vcpu)
+{
+	unsigned int slices = DIV_ROUND_UP(vcpu->arch.sve_max_vl,
+				KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));
+
+	if (vcpu_has_sve(&vcpu->arch))
+		return (SVE_NUM_ZREGS + SVE_NUM_PREGS + 1) * slices;
+
+	return 0;
+}
+
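
With enumerate_sve_regs() taking a plain pointer like this, the
copy_sve_reg_indices() wrapper would need to advance the caller's
cursor itself, e.g. (equally untested):

  static int copy_sve_reg_indices(const struct kvm_vcpu *vcpu, u64 __user **uind)
  {
      int err = enumerate_sve_regs(vcpu, *uind);

      if (err)
          return err;

      *uind += num_sve_regs(vcpu);
      return 0;
  }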

> +
>  static int sve_reg_bounds(struct reg_bounds_struct *b,
>  			  const struct kvm_vcpu *vcpu,
>  			  const struct kvm_one_reg *reg)
> @@ -403,6 +471,7 @@ unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
>  	unsigned long res = 0;
>  
>  	res += num_core_regs();
> +	res += num_sve_regs(vcpu);
>  	res += kvm_arm_num_sys_reg_descs(vcpu);
>  	res += kvm_arm_get_fw_num_regs(vcpu);
>  	res += NUM_TIMER_REGS;
> @@ -427,6 +496,10 @@ int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
>  		uindices++;
>  	}
>  
> +	ret = copy_sve_reg_indices(vcpu, &uindices);
> +	if (ret)
> +		return ret;
> +
>  	ret = kvm_arm_copy_fw_reg_indices(vcpu, uindices);
>  	if (ret)
>  		return ret;
> -- 
> 2.1.4
> 
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 16/16] KVM: arm64/sve: Report and enable SVE API extensions for userspace
  2018-06-21 14:57   ` Dave Martin
@ 2018-07-19 14:59     ` Andrew Jones
  -1 siblings, 0 replies; 178+ messages in thread
From: Andrew Jones @ 2018-07-19 14:59 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jun 21, 2018 at 03:57:40PM +0100, Dave Martin wrote:
> This patch reports the availability of KVM SVE support to userspace
> via a new vcpu feature flag KVM_ARM_VCPU_SVE.  This flag is
> reported via the KVM_ARM_PREFERRED_TARGET ioctl.
> 
> Userspace can enable the feature by setting the flag for
> KVM_ARM_VCPU_INIT.  Without this flag set, SVE-related ioctls and
> register access extensions are hidden, and SVE remains disabled
> unconditionally for the guest.  This ensures that non-SVE-aware KVM
> userspace does not receive a vcpu that it does not understand how
> to snapshot or restore correctly.
> 
> Storage is allocated for the SVE register state at vcpu init time,
> sufficient for the maximum vector length to be exposed to the vcpu.
> No attempt is made to allocate the storage lazily for now.  Also,
> no attempt is made to resize the storage dynamically, since the
> effective vector length of the vcpu can change at each EL0/EL1
> transition.  The storage is freed at the vcpu uninit hook.
> 
> No particular attempt is made to prevent userspace from creating a
> mix of vcpus some of which have SVE enabled and some of which have
> it disabled.  This may or may not be useful, but it reflects the
> underlying architectural behaviour.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm64/include/asm/kvm_host.h |  6 +++---
>  arch/arm64/include/uapi/asm/kvm.h |  1 +
>  arch/arm64/kvm/guest.c            | 19 +++++++++++++------
>  arch/arm64/kvm/reset.c            | 14 ++++++++++++++
>  4 files changed, 31 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index d2084ae..d956cf2 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -44,7 +44,7 @@
>  
>  #define KVM_MAX_VCPUS VGIC_V3_MAX_CPUS
>  
> -#define KVM_VCPU_MAX_FEATURES 4
> +#define KVM_VCPU_MAX_FEATURES 5
>  
>  #define KVM_REQ_SLEEP \
>  	KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
> @@ -439,8 +439,8 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>  static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
>  static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
>  
> -static inline int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu) { return 0; }
> -static inline void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
> +int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu);
> +void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu);
>  
>  void kvm_arm_init_debug(void);
>  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index f54a9b0..6acf276 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -101,6 +101,7 @@ struct kvm_regs {
>  #define KVM_ARM_VCPU_EL1_32BIT		1 /* CPU running a 32bit VM */
>  #define KVM_ARM_VCPU_PSCI_0_2		2 /* CPU uses PSCI v0.2 */
>  #define KVM_ARM_VCPU_PMU_V3		3 /* Support guest PMUv3 */
> +#define KVM_ARM_VCPU_SVE		4 /* Allow SVE for guest */
>  
>  struct kvm_vcpu_init {
>  	__u32 target;
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 5152362..fb7f6aa 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -58,6 +58,16 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
>  	return 0;
>  }
>  
> +int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu)
> +{
> +	return 0;
> +}

Unused, so could have just left the inline version.
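
I.e., in kvm_host.h, keep the init stub inline and only move the uninit
hook out of line:

	static inline int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu) { return 0; }
	void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu);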

> +
> +void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
> +{
> +	kfree(vcpu->arch.sve_state);
> +}
> +
>  static u64 core_reg_offset_from_id(u64 id)
>  {
>  	return id & ~(KVM_REG_ARCH_MASK | KVM_REG_SIZE_MASK | KVM_REG_ARM_CORE);
> @@ -600,12 +610,9 @@ int kvm_vcpu_preferred_target(struct kvm_vcpu_init *init)
>  
>  	memset(init, 0, sizeof(*init));
>  
> -	/*
> -	 * For now, we don't return any features.
> -	 * In future, we might use features to return target
> -	 * specific features available for the preferred
> -	 * target type.
> -	 */
> +	/* KVM_ARM_VCPU_SVE understood by KVM_VCPU_INIT */
> +	init->features[0] = 1 << KVM_ARM_VCPU_SVE;
> +

We shouldn't need to do this. The "preferred" target type isn't well
defined (that I know of), but IMO it should probably be the target that
best matches the host, minus optional features: the best base target.
We might use these feature bits to convey that the preferred target
should enable some optional feature if that feature is necessary to
work around a bug, i.e. using a "feature" bit as an erratum bit someday,
but that would be quite a debatable use, so maybe not even that. Most
likely we'll never need to add features here.

That said, I think defining the feature bit makes sense. ATM, I'm
feeling like we'll want to model the user interface for SVE on the
PMU's (using VCPU device ioctls).
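
Just to sketch what I'm thinking of (the group/attr names and numbers
below are invented for illustration, not part of this series):

	/* Hypothetical SVE analogue of KVM_ARM_VCPU_PMU_V3_CTRL */
	#define KVM_ARM_VCPU_SVE_CTRL		2
	#define   KVM_ARM_VCPU_SVE_INIT		0

	struct kvm_device_attr attr = {
		.group	= KVM_ARM_VCPU_SVE_CTRL,
		.attr	= KVM_ARM_VCPU_SVE_INIT,
	};

	/* userspace would configure/finalize SVE on the vcpu fd */
	if (ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr))
		err(1, "KVM_SET_DEVICE_ATTR");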


>  	init->target = (__u32)target;
>  
>  	return 0;
> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> index a74311b..f63a791 100644
> --- a/arch/arm64/kvm/reset.c
> +++ b/arch/arm64/kvm/reset.c
> @@ -110,6 +110,20 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>  			cpu_reset = &default_regs_reset;
>  		}
>  
> +		if (system_supports_sve() &&
> +		    test_bit(KVM_ARM_VCPU_SVE, vcpu->arch.features)) {
> +			vcpu->arch.flags |= KVM_ARM64_GUEST_HAS_SVE;
> +
> +			vcpu->arch.sve_max_vl = sve_max_virtualisable_vl;
> +
> +			vcpu->arch.sve_state = kzalloc(
> +				SVE_SIG_REGS_SIZE(
> +					sve_vq_from_vl(vcpu->arch.sve_max_vl)),

I guess sve_state can be pretty large. Should we allocate it like we
do the VM with kvm_arch_alloc_vm()? I.e. using vzalloc() on VHE machines?
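
E.g. (sketch only, untested; the kfree() in kvm_arm_arch_vcpu_uninit()
would then need to become vfree() or kvfree()):

	size_t size = SVE_SIG_REGS_SIZE(
		sve_vq_from_vl(vcpu->arch.sve_max_vl));

	/* SVE implies VHE, so hyp can use a vmalloc'd buffer directly */
	vcpu->arch.sve_state = vzalloc(size);
	if (!vcpu->arch.sve_state)
		return -ENOMEM;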

> +				GFP_KERNEL);
> +			if (!vcpu->arch.sve_state)
> +				return -ENOMEM;
> +		}
> +
>  		break;
>  	}
>  
> -- 
> 2.1.4

Thanks,
drew

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 10/16] KVM: arm64: Add a vcpu flag to control SVE visibility for the guest
  2018-06-21 14:57   ` Dave Martin
@ 2018-07-19 15:02     ` Andrew Jones
  -1 siblings, 0 replies; 178+ messages in thread
From: Andrew Jones @ 2018-07-19 15:02 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jun 21, 2018 at 03:57:34PM +0100, Dave Martin wrote:
> Since SVE will be enabled or disabled on a per-vcpu basis, a flag
> is needed in order to track which vcpus have it enabled.
> 
> This patch adds a suitable flag and a helper for checking it.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm64/include/asm/kvm_host.h | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 9671ddd..609d08b 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -308,6 +308,14 @@ struct kvm_vcpu_arch {
>  #define KVM_ARM64_FP_HOST		(1 << 2) /* host FP regs loaded */
>  #define KVM_ARM64_HOST_SVE_IN_USE	(1 << 3) /* backup for host TIF_SVE */
>  #define KVM_ARM64_HOST_SVE_ENABLED	(1 << 4) /* SVE enabled for EL0 */
> +#define KVM_ARM64_GUEST_HAS_SVE		(1 << 5) /* SVE exposed to guest */
> +
> +static inline bool vcpu_has_sve(struct kvm_vcpu_arch const *vcpu_arch)

Shouldn't this vcpu function take a vcpu instead of a vcpu_arch?
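
I.e. something like this, assuming struct kvm_vcpu is visible at this
point in the header:

	static inline bool vcpu_has_sve(const struct kvm_vcpu *vcpu)
	{
		return system_supports_sve() &&
			(vcpu->arch.flags & KVM_ARM64_GUEST_HAS_SVE);
	}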

Thanks,
drew

> +{
> +	return system_supports_sve() &&
> +		(vcpu_arch->flags & KVM_ARM64_GUEST_HAS_SVE);
> +
> +}
>  
>  #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.gp_regs)
>  
> -- 
> 2.1.4
> 
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 16/16] KVM: arm64/sve: Report and enable SVE API extensions for userspace
  2018-06-21 14:57   ` Dave Martin
@ 2018-07-19 15:24     ` Andrew Jones
  -1 siblings, 0 replies; 178+ messages in thread
From: Andrew Jones @ 2018-07-19 15:24 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jun 21, 2018 at 03:57:40PM +0100, Dave Martin wrote:
> This patch reports the availability of KVM SVE support to userspace
> via a new vcpu feature flag KVM_ARM_VCPU_SVE.  This flag is
> reported via the KVM_ARM_PREFERRED_TARGET ioctl.
> 
> Userspace can enable the feature by setting the flag for
> KVM_ARM_VCPU_INIT.  Without this flag set, SVE-related ioctls and
> register access extensions are hidden, and SVE remains disabled
> unconditionally for the guest.  This ensures that non-SVE-aware KVM
> userspace does not receive a vcpu that it does not understand how
> to snapshot or restore correctly.
> 
> Storage is allocated for the SVE register state at vcpu init time,
> sufficient for the maximum vector length to be exposed to the vcpu.
> No attempt is made to allocate the storage lazily for now.  Also,
> no attempt is made to resize the storage dynamically, since the
> effective vector length of the vcpu can change at each EL0/EL1
> transition.  The storage is freed at the vcpu uninit hook.
> 
> No particular attempt is made to prevent userspace from creating a
> mix of vcpus some of which have SVE enabled and some of which have
> it disabled.  This may or may not be useful, but it reflects the
> underlying architectural behaviour.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm64/include/asm/kvm_host.h |  6 +++---
>  arch/arm64/include/uapi/asm/kvm.h |  1 +
>  arch/arm64/kvm/guest.c            | 19 +++++++++++++------
>  arch/arm64/kvm/reset.c            | 14 ++++++++++++++
>  4 files changed, 31 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index d2084ae..d956cf2 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -44,7 +44,7 @@
>  
>  #define KVM_MAX_VCPUS VGIC_V3_MAX_CPUS
>  
> -#define KVM_VCPU_MAX_FEATURES 4
> +#define KVM_VCPU_MAX_FEATURES 5
>  
>  #define KVM_REQ_SLEEP \
>  	KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
> @@ -439,8 +439,8 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>  static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
>  static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
>  
> -static inline int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu) { return 0; }
> -static inline void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
> +int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu);
> +void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu);
>  
>  void kvm_arm_init_debug(void);
>  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index f54a9b0..6acf276 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -101,6 +101,7 @@ struct kvm_regs {
>  #define KVM_ARM_VCPU_EL1_32BIT		1 /* CPU running a 32bit VM */
>  #define KVM_ARM_VCPU_PSCI_0_2		2 /* CPU uses PSCI v0.2 */
>  #define KVM_ARM_VCPU_PMU_V3		3 /* Support guest PMUv3 */
> +#define KVM_ARM_VCPU_SVE		4 /* Allow SVE for guest */
>  
>  struct kvm_vcpu_init {
>  	__u32 target;
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 5152362..fb7f6aa 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -58,6 +58,16 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
>  	return 0;
>  }
>  
> +int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu)
> +{
> +	return 0;
> +}
> +
> +void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
> +{
> +	kfree(vcpu->arch.sve_state);
> +}
> +
>  static u64 core_reg_offset_from_id(u64 id)
>  {
>  	return id & ~(KVM_REG_ARCH_MASK | KVM_REG_SIZE_MASK | KVM_REG_ARM_CORE);
> @@ -600,12 +610,9 @@ int kvm_vcpu_preferred_target(struct kvm_vcpu_init *init)
>  
>  	memset(init, 0, sizeof(*init));
>  
> -	/*
> -	 * For now, we don't return any features.
> -	 * In future, we might use features to return target
> -	 * specific features available for the preferred
> -	 * target type.
> -	 */
> +	/* KVM_ARM_VCPU_SVE understood by KVM_VCPU_INIT */
> +	init->features[0] = 1 << KVM_ARM_VCPU_SVE;
> +
>  	init->target = (__u32)target;
>  
>  	return 0;
> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> index a74311b..f63a791 100644
> --- a/arch/arm64/kvm/reset.c
> +++ b/arch/arm64/kvm/reset.c
> @@ -110,6 +110,20 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>  			cpu_reset = &default_regs_reset;
>  		}
>  
> +		if (system_supports_sve() &&
> +		    test_bit(KVM_ARM_VCPU_SVE, vcpu->arch.features)) {
> +			vcpu->arch.flags |= KVM_ARM64_GUEST_HAS_SVE;
> +
> +			vcpu->arch.sve_max_vl = sve_max_virtualisable_vl;
> +

The allocation below needs to be guarded by an if (!vcpu->arch.sve_state),
otherwise every time the guest does a PSCI-off/PSCI-on cycle of the vcpu
we'll have a memory leak. Or, we need to move this allocation into the new
kvm_arm_arch_vcpu_init() function. Why did you opt for kvm_reset_vcpu()?
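
I.e. (sketch):

	if (!vcpu->arch.sve_state) {
		vcpu->arch.sve_state = kzalloc(
			SVE_SIG_REGS_SIZE(
				sve_vq_from_vl(vcpu->arch.sve_max_vl)),
			GFP_KERNEL);
		if (!vcpu->arch.sve_state)
			return -ENOMEM;
	}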

Thanks,
drew

> +			vcpu->arch.sve_state = kzalloc(
> +				SVE_SIG_REGS_SIZE(
> +					sve_vq_from_vl(vcpu->arch.sve_max_vl)),
> +				GFP_KERNEL);
> +			if (!vcpu->arch.sve_state)
> +				return -ENOMEM;
> +		}
> +
>  		break;
>  	}
>  
> -- 
> 2.1.4
> 
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 10/16] KVM: arm64: Add a vcpu flag to control SVE visibility for the guest
  2018-07-19 11:08     ` Andrew Jones
@ 2018-07-25 11:41       ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-07-25 11:41 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jul 19, 2018 at 01:08:10PM +0200, Andrew Jones wrote:
> On Thu, Jun 21, 2018 at 03:57:34PM +0100, Dave Martin wrote:
> > Since SVE will be enabled or disabled on a per-vcpu basis, a flag
> > is needed in order to track which vcpus have it enabled.
> > 
> > This patch adds a suitable flag and a helper for checking it.
> > 
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >  arch/arm64/include/asm/kvm_host.h | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 9671ddd..609d08b 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -308,6 +308,14 @@ struct kvm_vcpu_arch {
> >  #define KVM_ARM64_FP_HOST		(1 << 2) /* host FP regs loaded */
> >  #define KVM_ARM64_HOST_SVE_IN_USE	(1 << 3) /* backup for host TIF_SVE */
> >  #define KVM_ARM64_HOST_SVE_ENABLED	(1 << 4) /* SVE enabled for EL0 */
> > +#define KVM_ARM64_GUEST_HAS_SVE		(1 << 5) /* SVE exposed to guest */
> > +
> > +static inline bool vcpu_has_sve(struct kvm_vcpu_arch const *vcpu_arch)
> > +{
> > +	return system_supports_sve() &&
> 
> system_supports_sve() checks cpus_have_const_cap(), not
> this_cpu_has_cap(), so, iiuc, the result of this check won't
> change, regardless of which cpu it's run on at the time.

That's correct: this is intentional.

If any physical cpu doesn't have SVE, we treat it as absent from the
whole system, and we don't permit its use.  This ensures that any task
or vcpu can always be migrated to any physical cpu.

> 
> > +		(vcpu_arch->flags & KVM_ARM64_GUEST_HAS_SVE);
> 
> Since this flag can only be set if system_supports_sve() is
> true at vcpu init time, then it isn't necessary to always check
> system_supports_sve() in this function. Or, should
> system_supports_sve() be changed to use this_cpu_has_cap()?

The main purpose of system_supports_sve() here is to shadow the check on
vcpu_arch->flags with a static branch.  If the system doesn't support
SVE, we don't pay the runtime cost of the dynamic check on
vcpu_arch->flags.

If the kernel is built with CONFIG_ARM64_SVE=n, the dynamic check should
be entirely optimised away by the compiler.
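
For reference, system_supports_sve() is roughly the following in
<asm/cpufeature.h>, with cpus_have_const_cap() backed by a static key:

	static inline bool system_supports_sve(void)
	{
		return IS_ENABLED(CONFIG_ARM64_SVE) &&
			cpus_have_const_cap(ARM64_SVE);
	}

The IS_ENABLED() check is what lets the compiler discard the dynamic
check entirely when CONFIG_ARM64_SVE=n.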

I'd rather not add an explicit comment for this because the same
convention is followed elsewhere -- thus for consistency the comment
would need to be added in a lot of places.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 11/16] KVM: arm64/sve: System register context switch and access support
  2018-07-19 11:11     ` Andrew Jones
@ 2018-07-25 11:45       ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-07-25 11:45 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jul 19, 2018 at 01:11:17PM +0200, Andrew Jones wrote:
> On Thu, Jun 21, 2018 at 03:57:35PM +0100, Dave Martin wrote:
> > This patch adds the necessary support for context switching ZCR_EL1
> > for each vcpu.
> > 
> > The ID_AA64PFR0_EL1 emulation code is updated to expose the
> > presence of SVE to the guest if appropriate, and ioctl() access to
> > ZCR_EL1 is also added.
> > 
> > In the context switch code itself, ZCR_EL1 is context switched if
> > the host is SVE-capable, irrespectively for now of whether SVE is
> > exposed to the guest or not.  Adding a dynamic vcpu_has_sve() check
> > may lose as much performance as would be gained in this simple
> > case.
> > 
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>

[...]

> >  #define SCTLR_ELx_IESB	(1 << 21)
> > diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
> > index 35bc168..0f4046a 100644
> > --- a/arch/arm64/kvm/hyp/sysreg-sr.c
> > +++ b/arch/arm64/kvm/hyp/sysreg-sr.c
> > @@ -21,6 +21,7 @@
> >  #include <asm/kvm_asm.h>
> >  #include <asm/kvm_emulate.h>
> >  #include <asm/kvm_hyp.h>
> > +#include <asm/sysreg.h>
> >  
> >  /*
> >   * Non-VHE: Both host and guest must save everything.
> > @@ -57,6 +58,8 @@ static void __hyp_text __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
> >  	ctxt->sys_regs[SCTLR_EL1]	= read_sysreg_el1(sctlr);
> >  	ctxt->sys_regs[ACTLR_EL1]	= read_sysreg(actlr_el1);
> >  	ctxt->sys_regs[CPACR_EL1]	= read_sysreg_el1(cpacr);
> > +	if (system_supports_sve()) /* implies has_vhe() */
> > +		ctxt->sys_regs[ZCR_EL1]	= read_sysreg_s(SYS_ZCR_EL12);
> >  	ctxt->sys_regs[TTBR0_EL1]	= read_sysreg_el1(ttbr0);
> >  	ctxt->sys_regs[TTBR1_EL1]	= read_sysreg_el1(ttbr1);
> >  	ctxt->sys_regs[TCR_EL1]		= read_sysreg_el1(tcr);
> > @@ -129,6 +132,8 @@ static void __hyp_text __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
> >  	write_sysreg_el1(ctxt->sys_regs[SCTLR_EL1],	sctlr);
> >  	write_sysreg(ctxt->sys_regs[ACTLR_EL1],	  	actlr_el1);
> >  	write_sysreg_el1(ctxt->sys_regs[CPACR_EL1],	cpacr);
> > +	if (system_supports_sve()) /* implies has_vhe() */
> > +		write_sysreg_s(ctxt->sys_regs[ZCR_EL1],	SYS_ZCR_EL12);
> 
> I feel like the ZCR_EL12 save/restores are out of place, as these
> functions are shared by non-VHE and VHE. Maybe they should be
> moved to the VHE callers?

Hmmm, you're right -- it did look a bit odd at the time.

I think they should move to the vhe-specific paths as you suggest.

I'll change this for the next spin of the series.
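
Roughly, a VHE-only helper along these lines (sketch; the helper name
is invented and the exact call sites are TBD):

	/* VHE only: EL1 sysreg accesses are redirected, so use ZCR_EL12 */
	static void __sysreg_save_zcr_vhe(struct kvm_cpu_context *ctxt)
	{
		if (system_supports_sve())
			ctxt->sys_regs[ZCR_EL1] = read_sysreg_s(SYS_ZCR_EL12);
	}

with a matching restore helper, both called only from the VHE paths.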

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 10/16] KVM: arm64: Add a vcpu flag to control SVE visibility for the guest
  2018-07-19 15:02     ` Andrew Jones
@ 2018-07-25 11:48       ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-07-25 11:48 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jul 19, 2018 at 05:02:44PM +0200, Andrew Jones wrote:
> On Thu, Jun 21, 2018 at 03:57:34PM +0100, Dave Martin wrote:
> > Since SVE will be enabled or disabled on a per-vcpu basis, a flag
> > is needed in order to track which vcpus have it enabled.
> > 
> > This patch adds a suitable flag and a helper for checking it.
> > 
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >  arch/arm64/include/asm/kvm_host.h | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 9671ddd..609d08b 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -308,6 +308,14 @@ struct kvm_vcpu_arch {
> >  #define KVM_ARM64_FP_HOST		(1 << 2) /* host FP regs loaded */
> >  #define KVM_ARM64_HOST_SVE_IN_USE	(1 << 3) /* backup for host TIF_SVE */
> >  #define KVM_ARM64_HOST_SVE_ENABLED	(1 << 4) /* SVE enabled for EL0 */
> > +#define KVM_ARM64_GUEST_HAS_SVE		(1 << 5) /* SVE exposed to guest */
> > +
> > +static inline bool vcpu_has_sve(struct kvm_vcpu_arch const *vcpu_arch)
> 
> Shouldn't this vcpu function take a vcpu instead of a vcpu_arch?

Logically it could.  There was some circular include issue that made it
tricky to get the definition of struct kvm_vcpu here, but I may have
another go at it.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 12/16] KVM: arm64/sve: Context switch the SVE registers
  2018-07-19 13:13     ` Andrew Jones
@ 2018-07-25 11:50       ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-07-25 11:50 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jul 19, 2018 at 03:13:38PM +0200, Andrew Jones wrote:
> On Thu, Jun 21, 2018 at 03:57:36PM +0100, Dave Martin wrote:
> > In order to give each vcpu its own view of the SVE registers, this
> > patch adds context storage via a new sve_state pointer in struct
> > vcpu_arch.  An additional member sve_max_vl is also added for each
> > vcpu, to determine the maximum vector length visible to the guest
> > and thus the value to be configured in ZCR_EL2.LEN while the vcpu is
> > active.  This also determines the layout and size of the storage in
> > sve_state, which is read and written by the same backend functions
> > that are used for context-switching the SVE state for host tasks.
> > 
> > On SVE-enabled vcpus, SVE access traps are now handled by switching
> > in the vcpu's SVE context and disabling the trap before returning
> > to the guest.  On other vcpus, the trap is not handled and an exit
> > back to the host occurs, where the handle_sve() fallback path
> > reflects an undefined instruction exception back to the guest,
> > consistently with the behaviour of non-SVE-capable hardware (as was
> > done unconditionally prior to this patch).
> > 
> > No SVE handling is added on non-VHE-only paths, since VHE is an
> > architectural and Kconfig prerequisite of SVE.
> > 
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >  arch/arm64/include/asm/kvm_host.h |  2 ++
> >  arch/arm64/kvm/fpsimd.c           |  5 +++--
> >  arch/arm64/kvm/hyp/switch.c       | 43 ++++++++++++++++++++++++++++++---------
> >  3 files changed, 38 insertions(+), 12 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index f331abf..d2084ae 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -211,6 +211,8 @@ typedef struct kvm_cpu_context kvm_cpu_context_t;
> >  
> >  struct kvm_vcpu_arch {
> >  	struct kvm_cpu_context ctxt;
> > +	void *sve_state;
> > +	unsigned int sve_max_vl;
> >  
> >  	/* HYP configuration */
> >  	u64 hcr_el2;
> > diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> > index 872008c..44cf783 100644
> > --- a/arch/arm64/kvm/fpsimd.c
> > +++ b/arch/arm64/kvm/fpsimd.c
> > @@ -86,10 +86,11 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
> >  
> >  	if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
> >  		fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.gp_regs.fp_regs,
> > -					 NULL, sve_max_vl);
> > +					 vcpu->arch.sve_state,
> > +					 vcpu->arch.sve_max_vl);
> >  
> >  		clear_thread_flag(TIF_FOREIGN_FPSTATE);
> > -		clear_thread_flag(TIF_SVE);
> > +		update_thread_flag(TIF_SVE, vcpu_has_sve(&vcpu->arch));
> >  	}
> >  }
> >  
> > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> > index d496ef5..98df5c1 100644
> > --- a/arch/arm64/kvm/hyp/switch.c
> > +++ b/arch/arm64/kvm/hyp/switch.c
> > @@ -98,8 +98,13 @@ static void activate_traps_vhe(struct kvm_vcpu *vcpu)
> >  	val = read_sysreg(cpacr_el1);
> >  	val |= CPACR_EL1_TTA;
> >  	val &= ~CPACR_EL1_ZEN;
> > -	if (!update_fp_enabled(vcpu))
> > +
> > +	if (update_fp_enabled(vcpu)) {
> > +		if (vcpu_has_sve(&vcpu->arch))
> > +			val |= CPACR_EL1_ZEN;
> > +	} else {
> >  		val &= ~CPACR_EL1_FPEN;
> > +	}
> >  
> >  	write_sysreg(val, cpacr_el1);
> >  
> > @@ -114,6 +119,7 @@ static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu)
> >  
> >  	val = CPTR_EL2_DEFAULT;
> >  	val |= CPTR_EL2_TTA | CPTR_EL2_TZ;
> > +
> >  	if (!update_fp_enabled(vcpu))
> >  		val |= CPTR_EL2_TFP;
> >  
> > @@ -329,16 +335,22 @@ static bool __hyp_text __skip_instr(struct kvm_vcpu *vcpu)
> >  	}
> >  }
> >  
> > -static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
> > +static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu,
> > +					   bool guest_has_sve)
> >  {
> >  	struct user_fpsimd_state *host_fpsimd = vcpu->arch.host_fpsimd_state;
> >  
> > -	if (has_vhe())
> > -		write_sysreg(read_sysreg(cpacr_el1) | CPACR_EL1_FPEN,
> > -			     cpacr_el1);
> > -	else
> > +	if (has_vhe()) {
> > +		u64 reg = read_sysreg(cpacr_el1) | CPACR_EL1_FPEN;
> > +
> > +		if (system_supports_sve() && guest_has_sve)
> 
> guest_has_sve is only true when vcpu_arch->flags & KVM_ARM64_GUEST_HAS_SVE
> is true, which can only be true when system_supports_sve() is true. So
> I don't think we need system_supports_sve() here. guest_has_sve should be
> enough.
> 
> > +			reg |= CPACR_EL1_ZEN;
> > +
> > +		write_sysreg(reg, cpacr_el1);
> > +	} else {
> >  		write_sysreg(read_sysreg(cptr_el2) & ~(u64)CPTR_EL2_TFP,
> >  			     cptr_el2);
> > +	}
> >  
> >  	isb();
> >  
> > @@ -361,7 +373,13 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
> >  		vcpu->arch.flags &= ~KVM_ARM64_FP_HOST;
> >  	}
> >  
> > -	__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
> > +	if (system_supports_sve() && guest_has_sve)
> 
> here too

As elsewhere, the system_supports_sve() check uses a static key and
should be very cheap (or free in a CONFIG_ARM64_SVE=n kernel).

The aim here is to reduce wasted effort on non-SVE systems.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 10/16] KVM: arm64: Add a vcpu flag to control SVE visibility for the guest
  2018-07-25 11:41       ` Dave Martin
@ 2018-07-25 13:43         ` Andrew Jones
  -1 siblings, 0 replies; 178+ messages in thread
From: Andrew Jones @ 2018-07-25 13:43 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Jul 25, 2018 at 12:41:06PM +0100, Dave Martin wrote:
> On Thu, Jul 19, 2018 at 01:08:10PM +0200, Andrew Jones wrote:
> > On Thu, Jun 21, 2018 at 03:57:34PM +0100, Dave Martin wrote:
> > > Since SVE will be enabled or disabled on a per-vcpu basis, a flag
> > > is needed in order to track which vcpus have it enabled.
> > > 
> > > This patch adds a suitable flag and a helper for checking it.
> > > 
> > > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > > ---
> > >  arch/arm64/include/asm/kvm_host.h | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > > 
> > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > > index 9671ddd..609d08b 100644
> > > --- a/arch/arm64/include/asm/kvm_host.h
> > > +++ b/arch/arm64/include/asm/kvm_host.h
> > > @@ -308,6 +308,14 @@ struct kvm_vcpu_arch {
> > >  #define KVM_ARM64_FP_HOST		(1 << 2) /* host FP regs loaded */
> > >  #define KVM_ARM64_HOST_SVE_IN_USE	(1 << 3) /* backup for host TIF_SVE */
> > >  #define KVM_ARM64_HOST_SVE_ENABLED	(1 << 4) /* SVE enabled for EL0 */
> > > +#define KVM_ARM64_GUEST_HAS_SVE		(1 << 5) /* SVE exposed to guest */
> > > +
> > > +static inline bool vcpu_has_sve(struct kvm_vcpu_arch const *vcpu_arch)
> > > +{
> > > +	return system_supports_sve() &&
> > 
> > system_supports_sve() checks cpus_have_const_cap(), not
> > this_cpu_has_cap(), so, iiuc, the result of this check won't
> > change, regardless of which cpu it's run on at the time.
> 
> That's correct: this is intentional.
> 
> If any physical cpu doesn't have SVE, we treat it as absent from the
> whole system, and we don't permit its use.  This ensures that any task
> or vcpu can always be migrated to any physical cpu.
> > 
> > > +		(vcpu_arch->flags & KVM_ARM64_GUEST_HAS_SVE);
> > 
> > Since this flag can only be set if system_supports_sve() is
> > true at vcpu init time, then it isn't necessary to always check
> > system_supports_sve() in this function. Or, should
> > system_supports_sve() be changed to use this_cpu_has_cap()?
> 
> The main purpose of system_supports_sve() here is to shadow the check on
> vcpu_arch->flags with a static branch.  If the system doesn't support
> SVE, we don't pay the runtime cost of the dynamic check on
> vcpu_arch->flags.
> 
> If the kernel is built with CONFIG_ARM64_SVE=n, the dynamic check should
> be entirely optimised away by the compiler.

Ah, that makes sense. Thanks for clarifying it.

> 
> I'd rather not add an explicit comment for this because the same
> convention is followed elsewhere -- thus for consistency the comment
> would need to be added in a lot of places.

Agreed that we don't need a comment. A note in the commit message might
have been nice though.

Thanks,
drew

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 12/16] KVM: arm64/sve: Context switch the SVE registers
  2018-07-25 11:50       ` Dave Martin
@ 2018-07-25 13:57         ` Andrew Jones
  -1 siblings, 0 replies; 178+ messages in thread
From: Andrew Jones @ 2018-07-25 13:57 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Jul 25, 2018 at 12:50:45PM +0100, Dave Martin wrote:
> > > +	if (system_supports_sve() && guest_has_sve)
> 
> As elsewhere, the system_supports_sve() check uses a static key and
> should be very cheap (or free in a CONFIG_ARM64_SVE=n kernel).
>

Yup, I'm clear on that now. Thanks again for explaining. It might
be nice to have a small helper function in this case, to avoid
the 'system_supports_sve() &&' everywhere and the chance that the
order of the checks gets swapped during some code refactoring someday.
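
Something like this, say (name invented for illustration):

	/* wrap the check so the static key always comes first */
	static inline bool guest_can_use_sve(bool guest_has_sve)
	{
		return system_supports_sve() && guest_has_sve;
	}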

Thanks,
drew

* Re: [RFC PATCH 07/16] arm64/sve: Enable SVE state tracking for non-task contexts
  2018-06-21 14:57   ` Dave Martin
@ 2018-07-25 13:58     ` Alex Bennée
  -1 siblings, 0 replies; 178+ messages in thread
From: Alex Bennée @ 2018-07-25 13:58 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> The current FPSIMD/SVE context handling support for non-task (i.e.,
> KVM vcpu) contexts does not take SVE into account.  This means that
> only task contexts can safely use SVE at present.
>
> In preparation for enabling KVM guests to use SVE, it is necessary
> to keep track of SVE state for non-task contexts too.
>
> This patch adds the necessary support, removing assumptions from
> the context switch code about the location of the SVE context
> storage.
>
> When binding a vcpu context, its vector length is arbitrarily
> specified as sve_max_vl for now.  In any case, because TIF_SVE is
> presently cleared at vcpu context bind time, the specified vector
> length will not be used for anything yet.  In later patches TIF_SVE
> will be set here as appropriate, and the appropriate maximum vector
> length for the vcpu will be passed when binding.
<snip>
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -121,6 +121,8 @@
>   */
>  struct fpsimd_last_state_struct {
>  	struct user_fpsimd_state *st;
> +	void *sve_state;
> +	unsigned int sve_vl;
>  };

> -	struct user_fpsimd_state *st = __this_cpu_read(fpsimd_last_state.st);
> +	struct fpsimd_last_state_struct const *last =
> +		this_cpu_ptr(&fpsimd_last_state);
<snip>
> @@ -1074,6 +1082,8 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st)
>  	WARN_ON(!in_softirq() && !irqs_disabled());
>
>  	last->st = st;
> +	last->sve_state = sve_state;
> +	last->sve_vl = sve_vl;
>  }

I'm suffering a little cognitive dissonance with the use of last here
because isn't it really the state as it is now - as we bind to the cpu?

Anyway not super relevant to this patch as the name has already been
chosen so:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

--
Alex Bennée

* Re: [RFC PATCH 14/16] KVM: arm64/sve: Add SVE support to register access ioctl interface
  2018-07-19 13:04     ` Andrew Jones
@ 2018-07-25 14:06       ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-07-25 14:06 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jul 19, 2018 at 03:04:33PM +0200, Andrew Jones wrote:
> On Thu, Jun 21, 2018 at 03:57:38PM +0100, Dave Martin wrote:
> > This patch adds the following registers for access via the
> > KVM_{GET,SET}_ONE_REG interface:
> > 
> >  * KVM_REG_ARM64_SVE_ZREG(n, i) (n = 0..31) (in 2048-bit slices)
> >  * KVM_REG_ARM64_SVE_PREG(n, i) (n = 0..15) (in 256-bit slices)
> >  * KVM_REG_ARM64_SVE_FFR(i) (in 256-bit slices)
> > 
> > In order to adapt gracefully to future architectural extensions,
> > the registers are divided up into slices as noted above:  the i
> > parameter denotes the slice index.
> > 
> > For simplicity, bits or slices that exceed the maximum vector
> > length supported for the vcpu are ignored for KVM_SET_ONE_REG, and
> > read as zero for KVM_GET_ONE_REG.
> > 
> > For the current architecture, only slice i = 0 is significant.  The
> > interface design allows i to increase to up to 31 in the future if
> > required by future architectural amendments.
> > 
> > The registers are only visible for vcpus that have SVE enabled.
> > They are not enumerated by KVM_GET_REG_LIST on vcpus that do not
> > have SVE.  In all cases, surplus slices are not enumerated by
> > KVM_GET_REG_LIST.
> > 
> > Accesses to the FPSIMD registers via KVM_REG_ARM_CORE are
> > redirected to access the underlying vcpu SVE register storage as
> > appropriate.  In order to make this more straightforward, register
> > accesses that straddle register boundaries are no longer guaranteed
> > to succeed.  (Support for such use was never deliberate, and
> > userspace does not currently seem to be relying on it.)
> > 
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >  arch/arm64/include/uapi/asm/kvm.h |  10 ++
> >  arch/arm64/kvm/guest.c            | 219 +++++++++++++++++++++++++++++++++++---
> >  2 files changed, 216 insertions(+), 13 deletions(-)
> > 
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > index 4e76630..f54a9b0 100644
> > --- a/arch/arm64/include/uapi/asm/kvm.h
> > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > @@ -213,6 +213,16 @@ struct kvm_arch_memory_slot {
> >  					 KVM_REG_ARM_FW | ((r) & 0xffff))
> >  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> >  
> > +/* SVE registers */
> > +#define KVM_REG_ARM64_SVE		(0x15 << KVM_REG_ARM_COPROC_SHIFT)
> > +#define KVM_REG_ARM64_SVE_ZREG(n, i)	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
> > +					 KVM_REG_SIZE_U2048 |		\
> > +					 ((n) << 5) | (i))
> > +#define KVM_REG_ARM64_SVE_PREG(n, i)	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
> > +					 KVM_REG_SIZE_U256 |		\
> > +					 ((n) << 5) | (i) | 0x400)
> > +#define KVM_REG_ARM64_SVE_FFR(i)	KVM_REG_ARM64_SVE_PREG(16, i)
> > +
> >  /* Device Control API: ARM VGIC */
> >  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
> >  #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS	1
> > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > index 4a9d77c..005394b 100644
> > --- a/arch/arm64/kvm/guest.c
> > +++ b/arch/arm64/kvm/guest.c
> > @@ -23,14 +23,19 @@
> >  #include <linux/err.h>
> >  #include <linux/kvm_host.h>
> >  #include <linux/module.h>
> > +#include <linux/uaccess.h>
> >  #include <linux/vmalloc.h>
> >  #include <linux/fs.h>
> > +#include <linux/stddef.h>
> >  #include <kvm/arm_psci.h>
> >  #include <asm/cputype.h>
> >  #include <linux/uaccess.h>
> > +#include <asm/fpsimd.h>
> >  #include <asm/kvm.h>
> >  #include <asm/kvm_emulate.h>
> >  #include <asm/kvm_coproc.h>
> > +#include <asm/kvm_host.h>
> > +#include <asm/sigcontext.h>
> >  
> >  #include "trace.h"
> >  
> > @@ -57,6 +62,106 @@ static u64 core_reg_offset_from_id(u64 id)
> >  	return id & ~(KVM_REG_ARCH_MASK | KVM_REG_SIZE_MASK | KVM_REG_ARM_CORE);
> >  }
> >  
> > +static bool is_zreg(const struct kvm_one_reg *reg)
> > +{
> > +	return	reg->id >= KVM_REG_ARM64_SVE_ZREG(0, 0) &&
> > +		reg->id <= KVM_REG_ARM64_SVE_ZREG(SVE_NUM_ZREGS, 0x1f);
> > +}
> > +
> > +static bool is_preg(const struct kvm_one_reg *reg)
> > +{
> > +	return	reg->id >= KVM_REG_ARM64_SVE_PREG(0, 0) &&
> > +		reg->id <= KVM_REG_ARM64_SVE_FFR(0x1f);
> > +}
> > +
> > +static unsigned int sve_reg_num(const struct kvm_one_reg *reg)
> > +{
> > +	return (reg->id >> 5) & 0x1f;
> > +}
> > +
> > +static unsigned int sve_reg_index(const struct kvm_one_reg *reg)
> > +{
> > +	return reg->id & 0x1f;
> > +}
> > +
> > +struct reg_bounds_struct {
> > +	char *kptr;
> > +	size_t start_offset;
> 
> Maybe start_offset gets used in a later patch, but it doesn't seem
> to be used here.

Good spot.  It looks like I was originally going to have kptr point to
the base of the thing to be copied, rather than the exact start
location, with start_offset providing the necessary offset from the
base... then at some point I changed my mind.

I need to check through the code to see whether it's still consistent
now: fpsimd_vreg_bounds() assigns this member, but nothing uses it
subsequently :/

[...]

> > +static int fpsimd_vreg_bounds(struct reg_bounds_struct *b,
> > +			      struct kvm_vcpu *vcpu,
> > +			      const struct kvm_one_reg *reg)
> > +{
> > +	const size_t stride = KVM_REG_ARM_CORE_REG(fp_regs.vregs[1]) -
> > +				KVM_REG_ARM_CORE_REG(fp_regs.vregs[0]);
> > +	const size_t start = KVM_REG_ARM_CORE_REG(fp_regs.vregs[0]);
> > +	const size_t limit = KVM_REG_ARM_CORE_REG(fp_regs.vregs[32]);
> > +
> > +	const u64 uoffset = core_reg_offset_from_id(reg->id);
> > +	size_t usize = KVM_REG_SIZE(reg->id);
> > +	size_t start_vreg, end_vreg;
> > +
> > +	if (WARN_ON((reg->id & KVM_REG_ARM_COPROC_MASK) != KVM_REG_ARM_CORE))
> > +		return -ENOENT;
> 
> This warn-on can never fire, as the condition was already checked to even
> get here, back in kvm_arm_set_reg(). If there's concern this function will
> get called from the wrong place someday, then we should make it a bug-on.

After some past flamings, I tend to prefer WARN_ON() unless there is
no reasonable way to recover at all.

Here, we have a clear path for returning an error to the upstream
caller.  qemu/kvmtool will fail and we'll make some noise in dmesg,
but it's better than taking the whole system down (or so I thought).
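
I.e., the usual pattern of testing WARN_ON()'s return value and bailing
out gracefully:

	if (WARN_ON(broken_precondition))
		return -ENOENT;	/* noisy in dmesg, but recoverable */

rather than BUG_ON(), which offers no way back.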

> > +
> > +	if (usize % sizeof(u32))
> > +		return -EINVAL;
> 
> We should do the below is-a-vreg check first. Otherwise we may return EINVAL
> for a valid non-vreg. Actually I think we should check if the reg is a
> vreg in get/set_core_reg and only come here if it is, rather than coming
> unconditionally and then requiring the handling of ENOENT.

Possibly.

I was trying to reduce noise at the call site: in (e.g.) get_core_reg(),
the call to fpsimd_vreg_bounds() has the semantics "try to parse reg as
an FPSIMD vector register", where -EINVAL means reg is definitely not
valid, -ENOENT means reg is not an FPSIMD vector register but might
be valid as something else, and 0 means reg is a vector register.

Currently we don't enforce the register size to be a multiple of 32 bits,
but I'm trying to establish a stronger position.  Passing different
register sizes feels like an abuse of the API and there is no evidence
that qemu or kvmtool is relying on this so far.  The ability to pass
a misaligned register ID and/or slurp multiple vcpu registers (or parts
of registers) in one call really seems like something that works by
accident today rather than by design.  It also exposes kernel
implementation details, which is best avoided.

It would be better to make this a global check for usize % 32 == 0
though, rather than burying it in fpsimd_vreg_bounds().

Opinions?

> > +
> > +	usize /= sizeof(u32);
> > +
> > +	if ((uoffset <= start && usize <= start - uoffset) ||
> > +	    uoffset >= limit)
> > +		return -ENOENT;	/* not a vreg */
> > +
> > +	BUILD_BUG_ON(uoffset > limit);
> 
> Hmm, a build bug on uoffset can't be right, it's not a constant.
> 
> > +	if (uoffset < start || usize > limit - uoffset)
> > +		return -EINVAL;	/* overlaps vregs[] bounds */

uoffset is not compile-time constant, but (uoffset > limit) is compile-
time constant, because the previous if() returns from the function
otherwise.

gcc seems to do the right thing here: the code compiles as-is, but
if the prior if() is commented out then the BUILD_BUG_ON() fires
because (uoffset > limit) is no longer compile-time constant.


This is a defensively-coded bounds check, where

	if (A + B > C)

is transformed to

	if (C >= B && A > C - B)

The former is susceptible to overflow in (A + B), whereas the latter is
not.  We might be able to hide the risk with type casts, but that trades
one kind of fragility for another IMHO.

In this patch, the C >= B part is subsumed into the previous if(), but
because this is non-obvious I dropped the BUILD_BUG_ON() in as a hint
to maintainers that we really do depend on a property of the previous
check, so although it may look like the checks could be swapped over
with no ill effects, really that is not safe.
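
Concretely, putting the two checks side by side (with explanatory
comments added):

	if ((uoffset <= start && usize <= start - uoffset) ||
	    uoffset >= limit)
		return -ENOENT;	/* guarantees uoffset < limit ... */

	if (uoffset < start || usize > limit - uoffset)
		return -EINVAL;	/* ... so (limit - uoffset) cannot wrap */

With A = usize, B = uoffset and C = limit, the first if() supplies the
C >= B half, and the second tests A > C - B with no overflow.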


Maybe the BUILD_BUG_ON() is superfluous, but I would prefer at least
to keep a comment here.

What do you think?


OTOH, if we can show conclusively that we can avoid overflow here
then the code can be simplified.  But I would want to be confident
that this is really safe not just now but also under future maintenance.

> > +	start_vreg = (uoffset - start) / stride;
> > +	end_vreg = ((uoffset - start) + usize - 1) / stride;
> > +	if (start_vreg != end_vreg)
> > +		return -EINVAL;	/* spans multiple vregs */
> 
> Aren't the above three lines equivalent to just (usize > stride)?

No. usize could be < stride, but uoffset may be misaligned so that
the range uoffset .. uoffset + usize - 1 still crosses a register
boundary.

The above lines are trying to check that the whole range doesn't
cross any register boundary at all.
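
To illustrate, in u32 units (stride == 4 for vregs[]):

	uoffset - start == 3, usize == 2:

	start_vreg = 3 / 4           == 0
	end_vreg   = (3 + 2 - 1) / 4 == 1

so the access straddles vregs[0] and vregs[1] even though usize < stride.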

> 
> > +
> > +	b->start_offset = ((uoffset - start) % stride) * sizeof(u32);
> > +	b->copy_count = usize * sizeof(u32);
> > +	b->flush_count = 0;
> > +
> > +	if (vcpu_has_sve(&vcpu->arch)) {
> > +		const unsigned int vq = sve_vq_from_vl(vcpu->arch.sve_max_vl);
> > +
> > +		b->kptr = vcpu->arch.sve_state;
> > +		b->kptr += (SVE_SIG_ZREG_OFFSET(vq, start_vreg) -
> > +			    SVE_SIG_REGS_OFFSET);
> > +	} else {
> > +		b->kptr = (char *)&vcpu_gp_regs(vcpu)->fp_regs.vregs[
> > +				start_vreg];
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> >  static int get_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >  {
> >  	/*
> > @@ -65,11 +170,20 @@ static int get_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >  	 * array. Hence below, nr_regs is the number of entries, and
> >  	 * off the index in the "array".
> >  	 */
> > +	int err;
> > +	struct reg_bounds_struct b;
> >  	__u32 __user *uaddr = (__u32 __user *)(unsigned long)reg->addr;
> >  	struct kvm_regs *regs = vcpu_gp_regs(vcpu);
> >  	int nr_regs = sizeof(*regs) / sizeof(__u32);
> >  	u32 off;
> >  
> > +	err = fpsimd_vreg_bounds(&b, vcpu, reg);
> > +	switch (err) {
> > +	case 0:		return copy_bounded_reg_to_user(uaddr, &b);
> > +	case -ENOENT:	break;	/* not and FPSIMD vreg */
> 
> not an
> 
> > +	default:	return err;
> > +	}
> > +
> >  	/* Our ID is an index into the kvm_regs struct. */
> >  	off = core_reg_offset_from_id(reg->id);
> 
> How about instead of the above switch we just do this, with adjusted
> sanity checks in fpsimd_vreg_bounds?
> 
>   if (off >= KVM_REG_ARM_CORE_REG(fp_regs.vregs[0])) {
>       err = fpsimd_vreg_bounds(&b, vcpu, reg);

I would prefer to keep the check all inside or all outside.
Here we're doing half of it in the caller and half of it inside
fpsimd_vreg_bounds(), which feels clunky.

This also doesn't work with registers that are after vregs[] in the
struct (i.e., fpsr, fpcr -- possibly more stuff could be added in
future).

>       if (!err)
>           return copy_bounded_reg_to_user(uaddr, &b);
>       return err;
>   }
> 
> >  	if (off >= nr_regs ||
> > @@ -84,14 +198,23 @@ static int get_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >  
> >  static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >  {
> > +	int err;
> > +	struct reg_bounds_struct b;
> >  	__u32 __user *uaddr = (__u32 __user *)(unsigned long)reg->addr;
> >  	struct kvm_regs *regs = vcpu_gp_regs(vcpu);
> >  	int nr_regs = sizeof(*regs) / sizeof(__u32);
> >  	__uint128_t tmp;
> >  	void *valp = &tmp;
> >  	u64 off;
> > -	int err = 0;
> >  
> > +	err = fpsimd_vreg_bounds(&b, vcpu, reg);
> > +	switch (err) {
> > +	case 0:		return copy_bounded_reg_from_user(&b, uaddr);
> > +	case -ENOENT:	break;	/* not and FPSIMD vreg */
> 
> not an
> 
> and same comments as for get_core_reg
> 
> > +	default:	return err;
> > +	}
> > +
> > +	err = 0;
> >  	/* Our ID is an index into the kvm_regs struct. */
> >  	off = core_reg_offset_from_id(reg->id);
> >  	if (off >= nr_regs ||
> > @@ -130,6 +253,78 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >  	return err;
> >  }
> >  
> > +static int sve_reg_bounds(struct reg_bounds_struct *b,
> > +			  const struct kvm_vcpu *vcpu,
> > +			  const struct kvm_one_reg *reg)
> > +{
> > +	unsigned int n = sve_reg_num(reg);
> > +	unsigned int i = sve_reg_index(reg);
> > +	unsigned int vl = vcpu->arch.sve_max_vl;
> > +	unsigned int vq = sve_vq_from_vl(vl);
> > +	unsigned int start, copy_limit, limit;
> > +
> > +	b->kptr = vcpu->arch.sve_state;
> > +	if (is_zreg(reg)) {
> > +		b->kptr += SVE_SIG_ZREG_OFFSET(vq, n) - SVE_SIG_REGS_OFFSET;
> > +		start = i * 0x100;
> > +		limit = start + 0x100;
> > +		copy_limit = vl;
> > +	} else if (is_preg(reg)) {
> > +		b->kptr += SVE_SIG_PREG_OFFSET(vq, n) - SVE_SIG_REGS_OFFSET;
> > +		start = i * 0x20;
> > +		limit = start + 0x20;
> > +		copy_limit = vl / 8;
> > +	} else {
> > +		WARN_ON(1);
> > +		start = 0;
> > +		copy_limit = limit = 0;
> 
> Instead of WARN_ON, shouldn't this be a return -EINVAL that gets
> propagated to the user?

Hmmm, yes.

> > +	}
> > +
> > +	b->kptr += start;
> > +
> > +	if (copy_limit < start)
> > +		copy_limit = start;
> > +	else if (copy_limit > limit)
> > +		copy_limit = limit;
> 
>  copy_limit = clamp(copy_limit, start, limit)

Hmmm, again, yes.  I always forget about kernel.h....
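
i.e., simply:

	copy_limit = clamp(copy_limit, start, limit);

with clamp() from <linux/kernel.h>, folding the two comparisons above
into one line.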

> > +
> > +	b->copy_count = copy_limit - start;
> > +	b->flush_count = limit - copy_limit;
> 
> nit: might be nice (less error prone?) to set b->kptr once here with
> the other bounds members, e.g.
> 
>   b->kptr = arch.sve_state + sve_reg_type_off + start;

Yes, I guess that would be a little cleaner.

[...]

Cheers
---Dave

* Re: [RFC PATCH 12/16] KVM: arm64/sve: Context switch the SVE registers
  2018-07-25 13:57         ` Andrew Jones
@ 2018-07-25 14:12           ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-07-25 14:12 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Jul 25, 2018 at 03:57:33PM +0200, Andrew Jones wrote:
> On Wed, Jul 25, 2018 at 12:50:45PM +0100, Dave Martin wrote:
> > > > +	if (system_supports_sve() && guest_has_sve)
> > 
> > As elsewhere, the system_supports_sve() check uses a static key and
> > should be very cheap (or free in a CONFIG_ARM64_SVE=n kernel).
> >
> 
> Yup, I'm clear on that now. Thanks again for explaining. It might
> be nice to have a small helper function in this case, to avoid
> the 'system_supports_sve() &&' everywhere and the chance that the
> order of the checks gets swapped during some code refactoring someday.

This is what guest_has_sve() is for.

In hyp_switch_fpsimd() I wanted to avoid runtime checks wherever
possible, but it may be overkill to keep checking system_supports_sve()
like this.

It might take some benchmarking to figure out whether the extra checks
have any merit here...

Cheers
---Dave

* Re: [RFC PATCH 08/16] KVM: arm64: Support dynamically hideable system registers
  2018-06-21 14:57   ` Dave Martin
@ 2018-07-25 14:12     ` Alex Bennée
  -1 siblings, 0 replies; 178+ messages in thread
From: Alex Bennée @ 2018-07-25 14:12 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> Some system registers may or may not logically exist for a vcpu
> depending on whether certain architectural features are enabled for
> the vcpu.
>
> In order to avoid spuriously emulating access to these registers
> when they should not exist, or allowing the registers to be
> spuriously enumerated or saved/restored through the ioctl
> interface, a means is needed to allow registers to be hidden
> depending on the vcpu configuration.
>
> In order to support this in a flexible way, this patch adds a
> check_present() method to struct sys_reg_desc, and updates the
> generic system register access and enumeration code to be aware of
> it:  if check_present() returns false, the code behaves as if the
> register did not exist.
>
> For convenience, the complete check is wrapped up in a new helper
> sys_reg_present().
>
> An attempt has been made to hook the new check into the generic
> accessors for trapped system registers.  This should reduce the
> potential for future surprises, although the redundant check will
> add a small cost.  No system register depends on this functionality
> yet, and some paths needing the check may also need attention.
>
> Naturally, this facility makes sense only for registers that are
> trapped.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm64/kvm/sys_regs.c | 20 +++++++++++++++-----
>  arch/arm64/kvm/sys_regs.h | 11 +++++++++++
>  2 files changed, 26 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index a436373..31a351a 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -1840,7 +1840,7 @@ static int emulate_cp(struct kvm_vcpu *vcpu,
>
>  	r = find_reg(params, table, num);
>
> -	if (r) {
> +	if (likely(r) && sys_reg_present(vcpu, r)) {
>  		perform_access(vcpu, params, r);
>  		return 0;
>  	}
> @@ -2016,7 +2016,7 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
>  	if (!r)
>  		r = find_reg(params, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
>
> -	if (likely(r)) {
> +	if (likely(r) && sys_reg_present(vcpu, r)) {
>  		perform_access(vcpu, params, r);
>  	} else {
>  		kvm_err("Unsupported guest sys_reg access at: %lx\n",
> @@ -2313,6 +2313,9 @@ int kvm_arm_sys_reg_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
>  	if (!r)
>  		return get_invariant_sys_reg(reg->id, uaddr);
>
> +	if (!sys_reg_present(vcpu, r))
> +		return -ENOENT;
> +
>  	if (r->get_user)
>  		return (r->get_user)(vcpu, r, reg, uaddr);
>
> @@ -2334,6 +2337,9 @@ int kvm_arm_sys_reg_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
>  	if (!r)
>  		return set_invariant_sys_reg(reg->id, uaddr);
>
> +	if (!sys_reg_present(vcpu, r))
> +		return -ENOENT;
> +
>  	if (r->set_user)
>  		return (r->set_user)(vcpu, r, reg, uaddr);
>
> @@ -2390,7 +2396,8 @@ static bool copy_reg_to_user(const struct sys_reg_desc *reg, u64 __user **uind)
>  	return true;
>  }
>
> -static int walk_one_sys_reg(const struct sys_reg_desc *rd,
> +static int walk_one_sys_reg(struct kvm_vcpu *vcpu,
> +			    const struct sys_reg_desc *rd,
>  			    u64 __user **uind,
>  			    unsigned int *total)
>  {
> @@ -2401,6 +2408,9 @@ static int walk_one_sys_reg(const struct sys_reg_desc *rd,
>  	if (!(rd->reg || rd->get_user))
>  		return 0;
>
> +	if (!sys_reg_present(vcpu, rd))
> +		return 0;
> +
>  	if (!copy_reg_to_user(rd, uind))
>  		return -EFAULT;
>
> @@ -2429,9 +2439,9 @@ static int walk_sys_regs(struct kvm_vcpu *vcpu, u64 __user *uind)
>  		int cmp = cmp_sys_reg(i1, i2);
>  		/* target-specific overrides generic entry. */
>  		if (cmp <= 0)
> -			err = walk_one_sys_reg(i1, &uind, &total);
> +			err = walk_one_sys_reg(vcpu, i1, &uind, &total);
>  		else
> -			err = walk_one_sys_reg(i2, &uind, &total);
> +			err = walk_one_sys_reg(vcpu, i2, &uind, &total);
>
>  		if (err)
>  			return err;
> diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
> index cd710f8..dfbb342 100644
> --- a/arch/arm64/kvm/sys_regs.h
> +++ b/arch/arm64/kvm/sys_regs.h
> @@ -22,6 +22,9 @@
>  #ifndef __ARM64_KVM_SYS_REGS_LOCAL_H__
>  #define __ARM64_KVM_SYS_REGS_LOCAL_H__
>
> +#include <linux/compiler.h>
> +#include <linux/types.h>

I can see why you want compiler.h, but why types.h?

--
Alex Bennée

* Re: [RFC PATCH 08/16] KVM: arm64: Support dynamically hideable system registers
  2018-07-25 14:12     ` Alex Bennée
@ 2018-07-25 14:36       ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-07-25 14:36 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Jul 25, 2018 at 03:12:15PM +0100, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > Some system registers may or may not logically exist for a vcpu
> > depending on whether certain architectural features are enabled for
> > the vcpu.
> >
> > In order to avoid spuriously emulating access to these registers
> > when they should not exist, or allowing the registers to be
> > spuriously enumerated or saved/restored through the ioctl
> > interface, a means is needed to allow registers to be hidden
> > depending on the vcpu configuration.
> >
> > In order to support this in a flexible way, this patch adds a
> > check_present() method to struct sys_reg_desc, and updates the
> > generic system register access and enumeration code to be aware of
> > it:  if check_present() returns false, the code behaves as if the
> > register did not exist.
> >
> > For convenience, the complete check is wrapped up in a new helper
> > sys_reg_present().
> >
> > An attempt has been made to hook the new check into the generic
> > accessors for trapped system registers.  This should reduce the
> > potential for future surprises, although the redundant check will
> > add a small cost.  No system register depends on this functionality
> > yet, and some paths needing the check may also need attention.
> >
> > Naturally, this facility makes sense only for registers that are
> > trapped.
> >
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >  arch/arm64/kvm/sys_regs.c | 20 +++++++++++++++-----
> >  arch/arm64/kvm/sys_regs.h | 11 +++++++++++
> >  2 files changed, 26 insertions(+), 5 deletions(-)

[...]

> > diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
> > index cd710f8..dfbb342 100644
> > --- a/arch/arm64/kvm/sys_regs.h
> > +++ b/arch/arm64/kvm/sys_regs.h
> > @@ -22,6 +22,9 @@
> >  #ifndef __ARM64_KVM_SYS_REGS_LOCAL_H__
> >  #define __ARM64_KVM_SYS_REGS_LOCAL_H__
> >
> > +#include <linux/compiler.h>
> > +#include <linux/types.h>
> 
> I can see why you want compiler.h, but why types.h?

For bool (though it felt a bit pedantic).
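
The helper is just a bool-returning inline along these lines (a sketch;
the sys_regs.h hunk with the actual definition isn't quoted above):

	static inline bool sys_reg_present(const struct kvm_vcpu *vcpu,
					   const struct sys_reg_desc *rd)
	{
		return likely(!rd->check_present) ||
		       rd->check_present(vcpu, rd);
	}

hence <linux/types.h> for bool, and <linux/compiler.h> for likely().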

Cheers
---Dave

* Re: [RFC PATCH 07/16] arm64/sve: Enable SVE state tracking for non-task contexts
  2018-07-25 13:58     ` Alex Bennée
@ 2018-07-25 14:39       ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-07-25 14:39 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Jul 25, 2018 at 02:58:29PM +0100, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > The current FPSIMD/SVE context handling support for non-task (i.e.,
> > KVM vcpu) contexts does not take SVE into account.  This means that
> > only task contexts can safely use SVE at present.
> >
> > In preparation for enabling KVM guests to use SVE, it is necessary
> > to keep track of SVE state for non-task contexts too.
> >
> > This patch adds the necessary support, removing assumptions from
> > the context switch code about the location of the SVE context
> > storage.
> >
> > When binding a vcpu context, its vector length is arbitrarily
> > specified as sve_max_vl for now.  In any case, because TIF_SVE is
> > presently cleared at vcpu context bind time, the specified vector
> > length will not be used for anything yet.  In later patches TIF_SVE
> > will be set here as appropriate, and the appropriate maximum vector
> > length for the vcpu will be passed when binding.
> <snip>
> > --- a/arch/arm64/kernel/fpsimd.c
> > +++ b/arch/arm64/kernel/fpsimd.c
> > @@ -121,6 +121,8 @@
> >   */
> >  struct fpsimd_last_state_struct {
> >  	struct user_fpsimd_state *st;
> > +	void *sve_state;
> > +	unsigned int sve_vl;
> >  };
> 
> > -	struct user_fpsimd_state *st = __this_cpu_read(fpsimd_last_state.st);
> > +	struct fpsimd_last_state_struct const *last =
> > +		this_cpu_ptr(&fpsimd_last_state);
> <snip>
> > @@ -1074,6 +1082,8 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st)
> >  	WARN_ON(!in_softirq() && !irqs_disabled());
> >
> >  	last->st = st;
> > +	last->sve_state = sve_state;
> > +	last->sve_vl = sve_vl;
> >  }
> 
> I'm suffering a little cognitive dissonance with the use of last here
> because isn't it really the state as it is now - as we bind to the cpu?

Yes, but it _will_ be the last state ;)

It could have been named along the lines of "current", which would have
been less confusing in some respects.  Anyway, the name crept in a while
ago (pre-SVE) and I'm not sure it's really worth fixing.

> 
> Anyway not super relevant to this patch as the name has already been
> chosen so:
> 
> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

Thanks
---Dave

* Re: [RFC PATCH 10/16] KVM: arm64: Add a vcpu flag to control SVE visibility for the guest
  2018-07-25 13:43         ` Andrew Jones
@ 2018-07-25 14:41           ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-07-25 14:41 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Jul 25, 2018 at 03:43:59PM +0200, Andrew Jones wrote:
> On Wed, Jul 25, 2018 at 12:41:06PM +0100, Dave Martin wrote:

[...]

> > The main purpose of system_supports_sve() here is to shadow the check on
> > vcpu_arch->flags with a static branch.  If the system doesn't support
> > SVE, we don't pay the runtime cost of the dynamic check on
> > vcpu_arch->flags.
> > 
> > If the kernel is built with CONFIG_ARM64_SVE=n, the dynamic check should
> > be entirely optimised away by the compiler.
> 
> Ah, that makes sense. Thanks for clarifying it.
> 
> > 
> > I'd rather not add an explicit comment for this because the same
> > convention is followed elsewhere -- thus for consistency the comment
> > would need to be added in a lot of places.
> 
> Agreed that we don't need a comment. A note in the commit message might
> have been nice though.

Sure, I'll add something in the respin.
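
For reference, the check in question boils down to something like this
(a sketch of the pattern, using the flag name from this series):

  static inline bool vcpu_has_sve(const struct kvm_vcpu_arch *vcpu_arch)
  {
  	/*
  	 * system_supports_sve() is backed by a static key, so when
  	 * the system lacks SVE (or CONFIG_ARM64_SVE=n) the branch is
  	 * patched out and the flags word is never even loaded.
  	 */
  	return system_supports_sve() &&
  		(vcpu_arch->flags & KVM_ARM64_GUEST_HAS_SVE);
  }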

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 15/16] KVM: arm64: Enumerate SVE register indices for KVM_GET_REG_LIST
  2018-07-19 14:12     ` Andrew Jones
@ 2018-07-25 14:50       ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-07-25 14:50 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jul 19, 2018 at 04:12:32PM +0200, Andrew Jones wrote:
> On Thu, Jun 21, 2018 at 03:57:39PM +0100, Dave Martin wrote:
> > This patch includes the SVE register IDs in the list returned by
> > KVM_GET_REG_LIST, as appropriate.
> > 
> > On a non-SVE-enabled vcpu, no extra IDs are added.
> > 
> > On an SVE-enabled vcpu, the appropriate number of slice IDs is
> > enumerated for each SVE register, depending on the maximum vector
> > length for the vcpu.
> > 
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >  arch/arm64/kvm/guest.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 73 insertions(+)
> > 
> > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > index 005394b..5152362 100644
> > --- a/arch/arm64/kvm/guest.c
> > +++ b/arch/arm64/kvm/guest.c
> > @@ -21,6 +21,7 @@
> >  
> >  #include <linux/errno.h>
> >  #include <linux/err.h>
> > +#include <linux/kernel.h>
> >  #include <linux/kvm_host.h>
> >  #include <linux/module.h>
> >  #include <linux/uaccess.h>
> > @@ -253,6 +254,73 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >  	return err;
> >  }
> >  
> > +static void copy_reg_index_to_user(u64 __user **uind, int *total, int *cerr,
> > +				   u64 id)
> > +{
> > +	int err;
> > +
> > +	if (*cerr)
> > +		return;
> > +
> > +	if (uind) {
> > +		err = put_user(id, *uind);
> > +		if (err) {
> > +			*cerr = err;
> > +			return;
> > +		}
> > +	}
> > +
> > +	++*total;
> > +	if (uind)
> > +		++*uind;
> > +}
> > +
> > +static int enumerate_sve_regs(const struct kvm_vcpu *vcpu, u64 __user **uind)
> > +{
> > +	unsigned int n, i;
> > +	int err = 0;
> > +	int total = 0;
> > +	unsigned int slices;
> > +
> > +	if (!vcpu_has_sve(&vcpu->arch))
> > +		return 0;
> > +
> > +	slices = DIV_ROUND_UP(vcpu->arch.sve_max_vl,
> > +			      KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));
> > +
> > +	for (n = 0; n < SVE_NUM_ZREGS; ++n)
> > +		for (i = 0; i < slices; ++i)
> > +			copy_reg_index_to_user(uind, &total, &err,
> > +					       KVM_REG_ARM64_SVE_ZREG(n, i));
> > +
> > +	for (n = 0; n < SVE_NUM_PREGS; ++n)
> > +		for (i = 0; i < slices; ++i)
> > +			copy_reg_index_to_user(uind, &total, &err,
> > +					       KVM_REG_ARM64_SVE_PREG(n, i));
> > +
> > +	for (i = 0; i < slices; ++i)
> > +		copy_reg_index_to_user(uind, &total, &err,
> > +				       KVM_REG_ARM64_SVE_FFR(i));
> > +
> > +	if (err)
> > +		return -EFAULT;
> > +
> > +	return total;
> > +}
> > +
> > +static unsigned long num_sve_regs(const struct kvm_vcpu *vcpu)
> > +{
> > +	return enumerate_sve_regs(vcpu, NULL);
> > +}
> > +
> > +static int copy_sve_reg_indices(const struct kvm_vcpu *vcpu, u64 __user **uind)
> > +{
> > +	int err;
> > +
> > +	err = enumerate_sve_regs(vcpu, uind);
> > +	return err < 0 ? err : 0;
> > +}
> 
> I see the above functions were inspired by walk_sys_regs(), but, IMHO,
> they're a bit overcomplicated. How about this untested approach?
> 
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 56a0260ceb11..0188a8b30d46 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -130,6 +130,52 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	return err;
>  }
>  
> +static int enumerate_sve_regs(const struct kvm_vcpu *vcpu, u64 __user *uind)
> +{
> +	unsigned int slices = DIV_ROUND_UP(vcpu->arch.sve_max_vl,
> +				KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));
> +	unsigned int n, i;
> +
> +	if (!vcpu_has_sve(&vcpu->arch))
> +		return 0;
> +
> +	for (n = 0; n < SVE_NUM_ZREGS; ++n) {
> +		for (i = 0; i < slices; ++i) {
> +			if (put_user(KVM_REG_ARM64_SVE_ZREG(n, i), uind++))
> +				return -EFAULT;
> +		}
> +	}
> +
> +	for (n = 0; n < SVE_NUM_PREGS; ++n) {
> +		for (i = 0; i < slices; ++i) {
> +			if (put_user(KVM_REG_ARM64_SVE_PREG(n, i), uind++))
> +				return -EFAULT;
> +		}
> +	}
> +
> +	for (i = 0; i < slices; ++i) {
> +		if (put_user(KVM_REG_ARM64_SVE_FFR(i), uind++))
> +			return -EFAULT;
> +	}
> +
> +	return 0;
> +}
> +
> +static unsigned long num_sve_regs(const struct kvm_vcpu *vcpu)
> +{
> +	unsigned int slices = DIV_ROUND_UP(vcpu->arch.sve_max_vl,
> +				KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));
> +
> +	if (vcpu_has_sve(&vcpu->arch))
> +		return (SVE_NUM_ZREGS + SVE_NUM_PREGS + 1) * slices;
> +
> +	return 0;
> +}
> +

I sympathise with this, though this loses the nice property that
enumerate_sve_regs() and walk_sve_regs() match by construction.

Your version is simple enough that this is obvious by inspection
though, which is probably good enough.  I'll consider adopting it
when I respin.


In the sysregs case this would be much harder to achieve.


I would prefer to keep copy_reg_index_to_user() since it is
used in a few places -- but it is basically the same thing as
sys_regs.c:copy_reg_to_user(), so I will take a look at merging
them together.
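
As a worked example of the slice arithmetic, assuming the 2048-bit
slice size implied by patch 13 and the current architectural maximum
vector length (sve_max_vl = 256 bytes):

  /*
   * slices = DIV_ROUND_UP(256, 256) = 1
   * total  = (SVE_NUM_ZREGS + SVE_NUM_PREGS + 1) * slices
   *        = (32 + 16 + 1) * 1 = 49 register IDs
   *
   * A future, longer vector length would simply raise the slice
   * count; the enumeration logic is unchanged.
   */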

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 16/16] KVM: arm64/sve: Report and enable SVE API extensions for userspace
  2018-07-19 14:59     ` Andrew Jones
@ 2018-07-25 15:27       ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-07-25 15:27 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jul 19, 2018 at 04:59:21PM +0200, Andrew Jones wrote:
> On Thu, Jun 21, 2018 at 03:57:40PM +0100, Dave Martin wrote:
> > This patch reports the availability of KVM SVE support to userspace
> > via a new vcpu feature flag KVM_ARM_VCPU_SVE.  This flag is
> > reported via the KVM_ARM_PREFERRED_TARGET ioctl.
> > 
> > Userspace can enable the feature by setting the flag for
> > KVM_ARM_VCPU_INIT.  Without this flag set, SVE-related ioctls and
> > register access extensions are hidden, and SVE remains disabled
> > unconditionally for the guest.  This ensures that non-SVE-aware KVM
> > userspace does not receive a vcpu that it does not understand how
> > to snapshot or restore correctly.
> > 
> > Storage is allocated for the SVE register state at vcpu init time,
> > sufficient for the maximum vector length to be exposed to the vcpu.
> > No attempt is made to allocate the storage lazily for now.  Also,
> > no attempt is made to resize the storage dynamically, since the
> > effective vector length of the vcpu can change at each EL0/EL1
> > transition.  The storage is freed at the vcpu uninit hook.
> > 
> > No particular attempt is made to prevent userspace from creating a
> > mix of vcpus some of which have SVE enabled and some of which have
> > it disabled.  This may or may not be useful, but it reflects the
> > underlying architectural behaviour.
> > 
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >  arch/arm64/include/asm/kvm_host.h |  6 +++---
> >  arch/arm64/include/uapi/asm/kvm.h |  1 +
> >  arch/arm64/kvm/guest.c            | 19 +++++++++++++------
> >  arch/arm64/kvm/reset.c            | 14 ++++++++++++++
> >  4 files changed, 31 insertions(+), 9 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index d2084ae..d956cf2 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -44,7 +44,7 @@
> >  
> >  #define KVM_MAX_VCPUS VGIC_V3_MAX_CPUS
> >  
> > -#define KVM_VCPU_MAX_FEATURES 4
> > +#define KVM_VCPU_MAX_FEATURES 5
> >  
> >  #define KVM_REQ_SLEEP \
> >  	KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
> > @@ -439,8 +439,8 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {}
> >  static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
> >  static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
> >  
> > -static inline int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu) { return 0; }
> > -static inline void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
> > +int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu);
> > +void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu);
> >  
> >  void kvm_arm_init_debug(void);
> >  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > index f54a9b0..6acf276 100644
> > --- a/arch/arm64/include/uapi/asm/kvm.h
> > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > @@ -101,6 +101,7 @@ struct kvm_regs {
> >  #define KVM_ARM_VCPU_EL1_32BIT		1 /* CPU running a 32bit VM */
> >  #define KVM_ARM_VCPU_PSCI_0_2		2 /* CPU uses PSCI v0.2 */
> >  #define KVM_ARM_VCPU_PMU_V3		3 /* Support guest PMUv3 */
> > +#define KVM_ARM_VCPU_SVE		4 /* Allow SVE for guest */
> >  
> >  struct kvm_vcpu_init {
> >  	__u32 target;
> > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > index 5152362..fb7f6aa 100644
> > --- a/arch/arm64/kvm/guest.c
> > +++ b/arch/arm64/kvm/guest.c
> > @@ -58,6 +58,16 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
> >  	return 0;
> >  }
> >  
> > +int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu)
> > +{
> > +	return 0;
> > +}
> 
> Unused, so could have just left the inline version.
> 
> > +
> > +void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
> > +{
> > +	kfree(vcpu->arch.sve_state);
> > +}
> > +
> >  static u64 core_reg_offset_from_id(u64 id)
> >  {
> >  	return id & ~(KVM_REG_ARCH_MASK | KVM_REG_SIZE_MASK | KVM_REG_ARM_CORE);
> > @@ -600,12 +610,9 @@ int kvm_vcpu_preferred_target(struct kvm_vcpu_init *init)
> >  
> >  	memset(init, 0, sizeof(*init));
> >  
> > -	/*
> > -	 * For now, we don't return any features.
> > -	 * In future, we might use features to return target
> > -	 * specific features available for the preferred
> > -	 * target type.
> > -	 */
> > +	/* KVM_ARM_VCPU_SVE understood by KVM_VCPU_INIT */
> > +	init->features[0] = 1 << KVM_ARM_VCPU_SVE;
> > +
> 
> We shouldn't need to do this. The "preferred" target type isn't defined
> well (that I know of), but IMO it should probably be the target that
> best matches the host, minus optional features. The best base target. We
> may use these features to convey that the preferred target should enable
> some optional feature if that feature is necessary to workaround a bug,
> i.e. using the "feature" bit as an erratum bit someday, but that'd be
> quite a debatable use, so maybe not even that. Most likely we'll never
> need to add features here.

init->features[] has no semantics yet so we can define it how we like,
but I agree that the way I use it here is not necessarily the most
natural.

OTOH, we cannot use features[] for "mandatory" features like erratum
workarounds, because current userspace just ignores these bits.

Rather, these bits would be for features that are considered beneficial
but must be off by default (due to incompatibility risks across nodes,
or due to ABI impacts).  Just blindly using the preferred target
already risks configuring a vcpu that won't work across all nodes in
your cluster.

So I'm not convinced that there is any useful interpretation of
features[] unless we interpret it as suggested in this patch.

Can you elaborate why you think it should be used with a more
concrete example?

> That said, I think defining the feature bit makes sense. ATM, I'm feeling
> like we'll want to model the user interface for SVE like PMU (using VCPU
> device ioctls).

Some people expressed concerns about the ioctls becoming order-sensitive.

In the SVE case we don't want people enabling/disabling/reconfiguring
"silicon" features like SVE after the vcpu starts executing.

We will need an extra ioctl() for configuring the allowed SVE vector
lengths though.  I don't see a way around that.  So maybe we have to
solve the ordering problem anyway.


My current approach (not in this series) was to have VCPU_INIT return
-EINPROGRESS or similar if SVE is enabled in features[]: this indicates
that certain setup ioctls are required before the vcpu can run.

This may be overkill / not the best approach though.  I can look at
vcpu device ioctls as an alternative.
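
For illustration, the userspace flow with the interface as proposed in
this patch would be roughly (a sketch, error handling elided; vm_fd and
vcpu_fd are assumed to be the usual KVM file descriptors):

  struct kvm_vcpu_init init;

  /* Ask which target and feature bits the kernel understands. */
  ioctl(vm_fd, KVM_ARM_PREFERRED_TARGET, &init);

  if (init.features[0] & (1 << KVM_ARM_VCPU_SVE)) {
  	/* Opt in explicitly: SVE stays hidden unless requested. */
  	init.features[0] = 1 << KVM_ARM_VCPU_SVE;
  	ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &init);
  }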

> >  	init->target = (__u32)target;
> >  
> >  	return 0;
> > diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> > index a74311b..f63a791 100644
> > --- a/arch/arm64/kvm/reset.c
> > +++ b/arch/arm64/kvm/reset.c
> > @@ -110,6 +110,20 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
> >  			cpu_reset = &default_regs_reset;
> >  		}
> >  
> > +		if (system_supports_sve() &&
> > +		    test_bit(KVM_ARM_VCPU_SVE, vcpu->arch.features)) {
> > +			vcpu->arch.flags |= KVM_ARM64_GUEST_HAS_SVE;
> > +
> > +			vcpu->arch.sve_max_vl = sve_max_virtualisable_vl;
> > +
> > +			vcpu->arch.sve_state = kzalloc(
> > +				SVE_SIG_REGS_SIZE(
> > +					sve_vq_from_vl(vcpu->arch.sve_max_vl)),
> 
> I guess sve_state can be pretty large. Should we allocate it like we
> do the VM with kvm_arch_alloc_vm()? I.e. using vzalloc() on VHE machines?

Hmmm, dunno.

Historically (i.e., on 32-bit) vmalloc addresses could be a somewhat
scarce resource so I tend not to think of it by default, but allocations
of this kind of size would probably not pose a problem -- certainly
not on 64-bit.

Currently we allocate the sve_state storage for each host task with
kzalloc(), so it would be nice to stay consistent unless there's a
reason to deviate.


With vmalloc() we might waste half a page of memory per vcpu on a
typical system, though that probably isn't the end of the world.
It would be worse with 64K pages though.

The total size is not likely to be more than a few pages, so
it probably doesn't matter too much if we grab physically
contiguous memory for it.
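
For scale, a rough worked estimate at the current architectural
maximum (vl = 256 bytes, i.e. 2048 bits, so vq = sve_vq_from_vl(vl)
= 16):

  /*
   *   32 Z-regs * 256 bytes  = 8192 bytes
   *   16 P-regs *  32 bytes  =  512 bytes
   *    1 FFR    *  32 bytes  =   32 bytes
   *
   * SVE_SIG_REGS_SIZE(16) is thus a little over 8K per vcpu: just
   * over two 4K pages, or a fraction of a single 64K page.
   */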

The size of the sve state seems comparable to the size of struct kvm.

I'm not sure what the correct answer is here, to be honest.

Thoughts?

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 08/16] KVM: arm64: Support dynamically hideable system registers
  2018-07-25 14:36       ` Dave Martin
@ 2018-07-25 15:41         ` Alex Bennée
  -1 siblings, 0 replies; 178+ messages in thread
From: Alex Bennée @ 2018-07-25 15:41 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> On Wed, Jul 25, 2018 at 03:12:15PM +0100, Alex Bennée wrote:
>>
>> Dave Martin <Dave.Martin@arm.com> writes:
>>
>> > Some system registers may or may not logically exist for a vcpu
>> > depending on whether certain architectural features are enabled for
>> > the vcpu.
>> >
>> > In order to avoid spuriously emulating access to these registers
>> > when they should not exist, or allowing the registers to be
>> > spuriously enumerated or saved/restored through the ioctl
>> > interface, a means is needed to allow registers to be hidden
>> > depending on the vcpu configuration.
>> >
>> > In order to support this in a flexible way, this patch adds a
>> > check_present() method to struct sys_reg_desc, and updates the
>> > generic system register access and enumeration code to be aware of
>> > it:  if check_present() returns false, the code behaves as if the
>> > register did not exist.
>> >
>> > For convenience, the complete check is wrapped up in a new helper
>> > sys_reg_present().
>> >
>> > An attempt has been made to hook the new check into the generic
>> > accessors for trapped system registers.  This should reduce the
>> > potential for future surprises, although the redundant check will
>> > add a small cost.  No system register depends on this functionality
>> > yet, and some paths needing the check may also need attention.
>> >
>> > Naturally, this facility makes sense only for registers that are
>> > trapped.
>> >
>> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
>> > ---
>> >  arch/arm64/kvm/sys_regs.c | 20 +++++++++++++++-----
>> >  arch/arm64/kvm/sys_regs.h | 11 +++++++++++
>> >  2 files changed, 26 insertions(+), 5 deletions(-)
>
> [...]
>
>> > diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
>> > index cd710f8..dfbb342 100644
>> > --- a/arch/arm64/kvm/sys_regs.h
>> > +++ b/arch/arm64/kvm/sys_regs.h
>> > @@ -22,6 +22,9 @@
>> >  #ifndef __ARM64_KVM_SYS_REGS_LOCAL_H__
>> >  #define __ARM64_KVM_SYS_REGS_LOCAL_H__
>> >
>> > +#include <linux/compiler.h>
>> > +#include <linux/types.h>
>>
>> I can see why you want compiler.h, but why types.h?
>
> For bool (though it felt a bit pedantic).

It must be picked up elsewhere because it didn't fail when I rebuilt
without it - and the header has been happily using bool up to that
point.
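
For what it's worth, the kind of hook the patch enables would look
something like this (a sketch; vcpu_has_sve() is from later in the
series):

  /* Hide the SVE-related registers from vcpus created without SVE. */
  static bool sve_check_present(const struct kvm_vcpu *vcpu,
  			      const struct sys_reg_desc *rd)
  {
  	return vcpu_has_sve(&vcpu->arch);
  }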


>
> Cheers
> ---Dave


--
Alex Bennée

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 09/16] KVM: arm64: Allow ID registers to by dynamically read-as-zero
  2018-06-21 14:57   ` Dave Martin
@ 2018-07-25 15:46     ` Alex Bennée
  -1 siblings, 0 replies; 178+ messages in thread
From: Alex Bennée @ 2018-07-25 15:46 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> When a feature-dependent ID register is hidden from the guest, it
> needs to exhibit read-as-zero behaviour as defined by the Arm
> architecture, rather than appearing to be entirely absent.
>
> This patch updates the ID register emulation logic to make use of
> the new check_present() method to determine whether the register
> should read as zero instead of yielding the host's sanitised
> value.  Because currently a false result from this method truncates
> the trap call chain before the sysreg's emulate method() is called,
> a flag is added to distinguish this special case, and helpers are
> refactored appropriately.
>
> This involves some trivial updates to pass the vcpu pointer down
> into the ID register emulation/access functions.
>
> A new ID_SANITISED_IF() macro is defined for declaring
> conditionally visible ID registers.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
<snip>
> @@ -2337,7 +2352,7 @@ int kvm_arm_sys_reg_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
>  	if (!r)
>  		return set_invariant_sys_reg(reg->id, uaddr);
>
> -	if (!sys_reg_present(vcpu, r))
> +	if (!sys_reg_present_or_raz(vcpu, r))
>  		return -ENOENT;

It's all very well being RAZ, but shouldn't you catch this further down
and not attempt to write the register that doesn't exist?
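
Something along these lines is what I'd expect further down (a sketch
only; read_id_reg() here is a placeholder for however the present case
is handled):

  static bool access_raz_id_reg(struct kvm_vcpu *vcpu,
  			      struct sys_reg_params *p,
  			      const struct sys_reg_desc *rd)
  {
  	if (p->is_write)
  		return true;		/* ID regs ignore writes */

  	/* RAZ when hidden, otherwise the sanitised host value */
  	p->regval = sys_reg_present(vcpu, rd) ? read_id_reg(vcpu, rd) : 0;
  	return true;
  }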

>
>  	if (r->set_user)
> @@ -2408,7 +2423,7 @@ static int walk_one_sys_reg(struct kvm_vcpu *vcpu,
>  	if (!(rd->reg || rd->get_user))
>  		return 0;
>
> -	if (!sys_reg_present(vcpu, rd))
> +	if (!sys_reg_present_or_raz(vcpu, rd))
>  		return 0;
>
>  	if (!copy_reg_to_user(rd, uind))
> diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
> index dfbb342..304928f 100644
> --- a/arch/arm64/kvm/sys_regs.h
> +++ b/arch/arm64/kvm/sys_regs.h
> @@ -66,14 +66,25 @@ struct sys_reg_desc {
>  			const struct kvm_one_reg *reg, void __user *uaddr);
>  	bool (*check_present)(const struct kvm_vcpu *vcpu,
>  			      const struct sys_reg_desc *rd);
> +
> +	/* OR of SR_* flags */
> +	unsigned int flags;
>  };
>
> +#define SR_RAZ_IF_ABSENT	(1 << 0)
> +
>  static inline bool sys_reg_present(const struct kvm_vcpu *vcpu,
>  				   const struct sys_reg_desc *rd)
>  {
>  	return likely(!rd->check_present) || rd->check_present(vcpu, rd);
>  }
>
> +static inline bool sys_reg_present_or_raz(const struct kvm_vcpu *vcpu,
> +					  const struct sys_reg_desc *rd)
> +{
> +	return sys_reg_present(vcpu, rd) || (rd->flags & SR_RAZ_IF_ABSENT);
> +}
> +
>  static inline void print_sys_reg_instr(const struct sys_reg_params *p)
>  {
>  	/* Look, we even formatted it for you to paste into the table! */


--
Alex Bennée

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 13/16] KVM: Allow 2048-bit register access via KVM_{GET, SET}_ONE_REG
  2018-06-21 14:57   ` Dave Martin
@ 2018-07-25 15:58     ` Alex Bennée
  -1 siblings, 0 replies; 178+ messages in thread
From: Alex Bennée @ 2018-07-25 15:58 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> The Arm SVE architecture defines registers that are up to 2048 bits
> in size (with some possibility of further future expansion).
>
> In order to avoid the need for an excessively large number of
> ioctls when saving and restoring a vcpu's registers, this patch
> adds a #define to make support for individual 2048-bit registers
> through the KVM_{GET,SET}_ONE_REG ioctl interface official.  This
> will allow each SVE register to be accessed in a single call.
>
> There are sufficient spare bits in the register id size field for
> this change, so there is no ABI impact providing that
> KVM_GET_REG_LIST does not enumerate any 2048-bit register unless
> userspace explicitly opts in to the relevant architecture-specific
> features.

Does it? It's not in this patch and looking at the final tree:

  unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
  {
          unsigned long res = 0;

          res += num_core_regs();
          res += num_sve_regs(vcpu);
          res += kvm_arm_num_sys_reg_descs(vcpu);
          res += kvm_arm_get_fw_num_regs(vcpu);
          res += NUM_TIMER_REGS;

          return res;
  }


which leads to:

  static int enumerate_sve_regs(const struct kvm_vcpu *vcpu, u64 __user **uind)
  {
          unsigned int n, i;
          int err = 0;
          int total = 0;
          unsigned int slices;

          if (!vcpu_has_sve(&vcpu->arch))
                  return 0;

This enumerates the SVE regs if vcpu_has_sve(), which AFAICT is true
if the host supports it, not if the user has requested it.

I'll have to check, but given the indirection of kvm_one_reg I wonder
whether existing binaries might end up spamming a badly sized array
when run on a new SVE-supporting kernel?

>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  include/uapi/linux/kvm.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index b6270a3..345be88 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1106,6 +1106,7 @@ struct kvm_dirty_tlb {
>  #define KVM_REG_SIZE_U256	0x0050000000000000ULL
>  #define KVM_REG_SIZE_U512	0x0060000000000000ULL
>  #define KVM_REG_SIZE_U1024	0x0070000000000000ULL
> +#define KVM_REG_SIZE_U2048	0x0080000000000000ULL
>
>  struct kvm_reg_list {
>  	__u64 n; /* number of regs */
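
Decoding the size field, for reference (definitions as in the uapi
header):

  /*
   * KVM_REG_SIZE(id) = 1 << ((id & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT)
   *
   * For KVM_REG_SIZE_U2048 the size field is 0x8, so
   * KVM_REG_SIZE = 1 << 8 = 256 bytes = 2048 bits: one ONE_REG call
   * can move an entire Z register.
   */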


--
Alex Bennée

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 16/16] KVM: arm64/sve: Report and enable SVE API extensions for userspace
  2018-07-25 15:27       ` Dave Martin
@ 2018-07-25 16:52         ` Andrew Jones
  -1 siblings, 0 replies; 178+ messages in thread
From: Andrew Jones @ 2018-07-25 16:52 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Jul 25, 2018 at 04:27:49PM +0100, Dave Martin wrote:
> On Thu, Jul 19, 2018 at 04:59:21PM +0200, Andrew Jones wrote:
> > On Thu, Jun 21, 2018 at 03:57:40PM +0100, Dave Martin wrote:
> > > -	/*
> > > -	 * For now, we don't return any features.
> > > -	 * In future, we might use features to return target
> > > -	 * specific features available for the preferred
> > > -	 * target type.
> > > -	 */
> > > +	/* KVM_ARM_VCPU_SVE understood by KVM_VCPU_INIT */
> > > +	init->features[0] = 1 << KVM_ARM_VCPU_SVE;
> > > +
> > 
> > We shouldn't need to do this. The "preferred" target type isn't defined
> > well (that I know of), but IMO it should probably be the target that
> > best matches the host, minus optional features. The best base target. We
> > may use these features to convey that the preferred target should enable
> > some optional feature if that feature is necessary to workaround a bug,
> > i.e. using the "feature" bit as an erratum bit someday, but that'd be
> > quite a debatable use, so maybe not even that. Most likely we'll never
> > need to add features here.
> 
> init->features[] has no semantics yet so we can define it how we like,
> but I agree that the way I use it here is not necessarily the most
> natural.
> 
> OTOH, we cannot use features[] for "mandatory" features like erratum
> workarounds, because current userspace just ignores these bits.

It would have to learn to look here if that's how we started using it,
but it'd be better to invent something else that wouldn't appear as
abusive if we're going to teach userspace new stuff anyway.

> 
> Rather, these bits would be for features that are considered beneficial
> but must be off by default (due to incompatibility risks across nodes,
> or due to ABI impacts).  Just blindly using the preferred target
> already risks configuring a vcpu that won't work across all nodes in
> your cluster.

KVM usually advertises optional features through capabilities. A device
(vcpu device, in this case) ioctl can also be used to check for feature
availability.

> 
> So I'm not convinced that there is any useful interpretation of
> features[] unless we interpret it as suggested in this patch.
> 
> Can you elaborate why you think it should be used with a more
> concrete example?

I'm advocating that it *not* be used here. I think it should be used
like the PMU feature uses it - and the PMU feature doesn't set a bit
here.

> 
> > That said, I think defining the feature bit makes sense. ATM, I'm feeling
> > like we'll want to model the user interface for SVE like PMU (using VCPU
> > device ioctls).
> 
> Some people expressed concerns about the ioctls becoming order-sensitive.
> 
> In the SVE case we don't want people enabling/disabling/reconfiguring
> "silicon" features like SVE after the vcpu starts executing.
> 
> We will need an extra ioctl() for configuring the allowed SVE vector
> lengths though.  I don't see a way around that.  So maybe we have to
> solve the ordering problem anyway.

Yes, that's why I'm thinking that the vcpu device ioctls is probably the
right way to go. The SVE group can have its own "finalize" request that
allows all other SVE ioctls to be in any order prior to it.

> 
> 
> By current approach (not in this series) was to have VCPU_INIT return
> -EINPROGRESS or similar if SVE is enabled in features[]: this indicates
> that certain setup ioctls are required before the vcpu can run.
> 
> This may be overkill / not the best approach though.  I can look at
> vcpu device ioctls as an alternative.

With a "finalize" attribute if SVE isn't finalized by VCPU_INIT or
KVM_RUN time, then SVE just won't be enabled for that VCPU.
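
Roughly, from userspace, this would look like (a sketch -- the SVE
group and attribute names below are invented for illustration; only
the kvm_device_attr plumbing itself is existing KVM API):

  struct kvm_device_attr attr = {
  	.group = KVM_ARM_VCPU_SVE_CTRL,		/* hypothetical group */
  	.attr  = KVM_ARM_VCPU_SVE_FINALIZE,	/* hypothetical attribute */
  };

  /* Any number of SVE config attributes may be set first, in any
   * order; this request then locks the configuration in. */
  ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);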

Thanks,
drew

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [RFC PATCH 16/16] KVM: arm64/sve: Report and enable SVE API extensions for userspace
@ 2018-07-25 16:52         ` Andrew Jones
  0 siblings, 0 replies; 178+ messages in thread
From: Andrew Jones @ 2018-07-25 16:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 25, 2018 at 04:27:49PM +0100, Dave Martin wrote:
> On Thu, Jul 19, 2018 at 04:59:21PM +0200, Andrew Jones wrote:
> > On Thu, Jun 21, 2018 at 03:57:40PM +0100, Dave Martin wrote:
> > > -	/*
> > > -	 * For now, we don't return any features.
> > > -	 * In future, we might use features to return target
> > > -	 * specific features available for the preferred
> > > -	 * target type.
> > > -	 */
> > > +	/* KVM_ARM_VCPU_SVE understood by KVM_VCPU_INIT */
> > > +	init->features[0] = 1 << KVM_ARM_VCPU_SVE;
> > > +
> > 
> > We shouldn't need to do this. The "preferred" target type isn't defined
> > well (that I know of), but IMO it should probably be the target that
> > best matches the host, minus optional features. The best base target. We
> > may use these features to convey that the preferred target should enable
> > some optional feature if that feature is necessary to workaround a bug,
> > i.e. using the "feature" bit as an erratum bit someday, but that'd be
> > quite a debatable use, so maybe not even that. Most likely we'll never
> > need to add features here.
> 
> init->features[] has no semantics yet so we can define it how we like,
> but I agree that the way I use it here is not necessarily the most
> natural.
> 
> OTOH, we cannot use features[] for "mandatory" features like erratum
> workarounds, because current userspace just ignores these bits.

It would have to learn to look here if that's how we started using it,
but it'd be better to invent something else that wouldn't appear as
abusive if we're going to teach userspace new stuff anyway.

> 
> Rather, these bits would be for features that are considered beneficial
> but must be off by default (due to incompatibility risks across nodes,
> or due to ABI impacts).  Just blindly using the preferred target
> already risks configuring a vcpu that won't work across all nodes in
> your cluster.

KVM usually advertises optional features through capabilities. A device
(vcpu device, in this case) ioctl can also be used to check for feature
availability.

> 
> So I'm not convinced that there is any useful interpretation of
> features[] unless we interpret it as suggested in this patch.
> 
> Can you elaborate why you think it should be used with a more
> concrete example?

I'm advocating that it *not* be used here. I think it should be used
like the PMU feature uses it - and the PMU feature doesn't set a bit
here.

> 
> > That said, I think defining the feature bit makes sense. ATM, I'm feeling
> > like we'll want to model the user interface for SVE like PMU (using VCPU
> > device ioctls).
> 
> Some people expressed concerns about the ioctls becoming order-sensitive.
> 
> In the SVE case we don't want people enabling/disabling/reconfiguring
> "silicon" features like SVE after the vcpu starts executing.
> 
> We will need an extra ioctl() for configuring the allowed SVE vector
> lengths though.  I don't see a way around that.  So maybe we have to
> solve the ordering problem anyway.

Yes, that's why I'm thinking that the vcpu device ioctls is probably the
right way to go. The SVE group can have its own "finalize" request that
allows all other SVE ioctls to be in any order prior to it.

> 
> 
> By current approach (not in this series) was to have VCPU_INIT return
> -EINPROGRESS or similar if SVE is enabled in features[]: this indicates
> that certain setup ioctls are required before the vcpu can run.
> 
> This may be overkill / not the best approach though.  I can look at
> vcpu device ioctls as an alternative.

With a "finalize" attribute if SVE isn't finalized by VCPU_INIT or
KVM_RUN time, then SVE just won't be enabled for that VCPU.
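
For illustration, a minimal sketch of that flow from userspace, reusing
the existing KVM_SET_DEVICE_ATTR vcpu ioctl the way the PMU code does
(the SVE group and attribute names below are placeholders, not a
proposed ABI):

  struct kvm_device_attr attr = {
  	.group	= KVM_ARM_VCPU_SVE_CTRL,	/* hypothetical group */
  	.attr	= KVM_ARM_VCPU_SVE_VLS,		/* hypothetical attribute */
  	.addr	= (__u64)(unsigned long)&vl_bitmap,
  };

  /* SVE configuration ioctls can come in any order... */
  if (ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr))
  	err(1, "set SVE vector lengths");

  /* ...provided the group is finalized before the vcpu first runs. */
  attr.attr = KVM_ARM_VCPU_SVE_FINALIZE;	/* hypothetical attribute */
  attr.addr = 0;
  if (ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr))
  	err(1, "finalize SVE");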

Thanks,
drew

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 14/16] KVM: arm64/sve: Add SVE support to register access ioctl interface
  2018-07-25 14:06       ` Dave Martin
@ 2018-07-25 17:20         ` Andrew Jones
  -1 siblings, 0 replies; 178+ messages in thread
From: Andrew Jones @ 2018-07-25 17:20 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Jul 25, 2018 at 03:06:21PM +0100, Dave Martin wrote:
> On Thu, Jul 19, 2018 at 03:04:33PM +0200, Andrew Jones wrote:
> > On Thu, Jun 21, 2018 at 03:57:38PM +0100, Dave Martin wrote:
> > > +
> > > +	if (usize % sizeof(u32))
> > > +		return -EINVAL;
> > 
> 
> Currently we don't enforce the register size to be a multiple of 32 bits,
> but I'm trying to establish a stronger position.  Passing different
> register sizes feels like an abuse of the API and there is no evidence
> that qemu or kvmtool is relying on this so far.  The ability to pass
> a misaligned register ID and/or slurp multiple vcpu registers (or parts
> of registers) in one call really seems like it works by accident today
> and not by intentional design.  Rather, it exposes kernel
> implementation details, which is best avoided.
> 
> It would be better to make this a global check for usize % 32 == 0
> though, rather than burying it in fpsimd_vreg_bounds().
> 
> Opinions?

There's only one reason to not start enforcing it globally on arm/arm64,
and that's that it's not documented that way. Changing it would be an API
change, rather than just an API fix. It's probably a safe change, but...

> 
> > > +
> > > +	usize /= sizeof(u32);
> > > +
> > > +	if ((uoffset <= start && usize <= start - uoffset) ||
> > > +	    uoffset >= limit)
> > > +		return -ENOENT;	/* not a vreg */
> > > +
> > > +	BUILD_BUG_ON(uoffset > limit);
> > 
> > Hmm, a build bug on uoffset can't be right, it's not a constant.
> > 
> > > +	if (uoffset < start || usize > limit - uoffset)
> > > +		return -EINVAL;	/* overlaps vregs[] bounds */
> 
> uoffset is not compile-time constant, but (uoffset > limit) is compile-
> time constant, because the previous if() returns from the function
> otherwise.
> 
> gcc seems to do the right thing here: the code compiles as-is, but
> if the prior if() is commented out then the BUILD_BUG_ON() fires
> because (uoffset > limit) is no longer compile-time constant.

Oh, interesting.

> 
> 
> This is a defensively-coded bounds check, where
> 
> 	if (A + B > C)
> 
> is transformed to
> 
> 	if (C >= B && A > C - B)
> 
> The former is susceptible to overflow in (A + B), whereas the latter is
> not.  We might be able to hide the risk with type casts, but that trades
> one kind of fragility for another IMHO.
> 
> In this patch, the C >= B part is subsumed into the previous if(), but
> because this is non-obvious I dropped the BUILD_BUG_ON() in as a hint
> to maintainers that we really do depend on a property of the previous
> check, so although it may look like the checks could be swapped over
> with no ill effects, really that is not safe.

I'm glad our maintainers can pick up on hints like that :-) Maybe you can
add a comment for mortals like me though.

> 
> 
> Maybe the BUILD_BUG_ON() is superfluous, but I would prefer at least
> to keep a comment here.
> 
> What do you think?
>

Comment plus build-bug or just comment works for me.

> 
> OTOH, if we can show conclusively that we can avoid overflow here
> then the code can be simplified.  But I would want to be confident
> that this is really safe not just now but also under future maintenance.
> 

I agree with thoroughly checking user input. Maybe we can create/use
some helper functions to do it. Those helpers can then get reused
elsewhere, helping to keep ourselves sane the next time we need to
do similar sanity checks.
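
For instance, the overflow-safe form Dave describes above could be
wrapped up as (a hypothetical helper, all-unsigned arguments):

  /*
   * True iff [off, off + size) does not fit in [0, limit) -- the
   * "A + B > C" test, written so that the addition cannot wrap.
   */
  static bool ureg_out_of_bounds(u64 off, u64 size, u64 limit)
  {
  	return size > limit || off > limit - size;
  }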

Thanks,
drew

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 08/16] KVM: arm64: Support dynamically hideable system registers
  2018-07-25 15:41         ` Alex Bennée
@ 2018-07-26 12:53           ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-07-26 12:53 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Jul 25, 2018 at 04:41:44PM +0100, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > On Wed, Jul 25, 2018 at 03:12:15PM +0100, Alex Bennée wrote:
> >>
> >> Dave Martin <Dave.Martin@arm.com> writes:
> >>
> >> > Some system registers may or may not logically exist for a vcpu
> >> > depending on whether certain architectural features are enabled for
> >> > the vcpu.
> >> >
> >> > In order to avoid spuriously emulating access to these registers
> >> > when they should not exist, or allowing the registers to be
> >> > spuriously enumerated or saved/restored through the ioctl
> >> > interface, a means is needed to allow registers to be hidden
> >> > depending on the vcpu configuration.
> >> >
> >> > In order to support this in a flexible way, this patch adds a
> >> > check_present() method to struct sys_reg_desc, and updates the
> >> > generic system register access and enumeration code to be aware of
> >> > it:  if check_present() returns false, the code behaves as if the
> >> > register did not exist.
> >> >
> >> > For convenience, the complete check is wrapped up in a new helper
> >> > sys_reg_present().
> >> >
> >> > An attempt has been made to hook the new check into the generic
> >> > accessors for trapped system registers.  This should reduce the
> >> > potential for future surprises, although the redundant check will
> >> > add a small cost.  No system register depends on this functionality
> >> > yet, and some paths needing the check may also need attention.
> >> >
> >> > Naturally, this facility makes sense only for registers that are
> >> > trapped.
> >> >
> >> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> >> > ---
> >> >  arch/arm64/kvm/sys_regs.c | 20 +++++++++++++++-----
> >> >  arch/arm64/kvm/sys_regs.h | 11 +++++++++++
> >> >  2 files changed, 26 insertions(+), 5 deletions(-)
> >
> > [...]
> >
> >> > diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
> >> > index cd710f8..dfbb342 100644
> >> > --- a/arch/arm64/kvm/sys_regs.h
> >> > +++ b/arch/arm64/kvm/sys_regs.h
> >> > @@ -22,6 +22,9 @@
> >> >  #ifndef __ARM64_KVM_SYS_REGS_LOCAL_H__
> >> >  #define __ARM64_KVM_SYS_REGS_LOCAL_H__
> >> >
> >> > +#include <linux/compiler.h>
> >> > +#include <linux/types.h>
> >>
> >> I can see why you want compiler.h, but why types.h?
> >
> > For bool (though it felt a bit pedantic).
> 
> It must be picked up elsewhere because it didn't fail when I rebuilt
> without it - and the header has been happily using bool up to that
> point.

Sure, lots of headers include linux/types.h, so we get away with it all
over the place.

I was adding it for completeness really: I prefer not to rely on headers
being included by accident.
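
For readers following along, the check_present() mechanism described in
the quoted commit message boils down to something like this (a sketch
reconstructed from the description, not the patch itself):

  static inline bool sys_reg_present(const struct kvm_vcpu *vcpu,
  				     const struct sys_reg_desc *rd)
  {
  	/* Registers with no check_present() method always exist. */
  	return likely(!rd->check_present) || rd->check_present(vcpu, rd);
  }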

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 13/16] KVM: Allow 2048-bit register access via KVM_{GET, SET}_ONE_REG
  2018-07-25 15:58     ` Alex Bennée
@ 2018-07-26 12:58       ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-07-26 12:58 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Jul 25, 2018 at 04:58:30PM +0100, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > The Arm SVE architecture defines registers that are up to 2048 bits
> > in size (with some possibility of further future expansion).
> >
> > In order to avoid the need for an excessively large number of
> > ioctls when saving and restoring a vcpu's registers, this patch
> > adds a #define to make support for individual 2048-bit registers
> > through the KVM_{GET,SET}_ONE_REG ioctl interface official.  This
> > will allow each SVE register to be accessed in a single call.
> >
> > There are sufficient spare bits in the register id size field for
> > this change, so there is no ABI impact providing that
> > KVM_GET_REG_LIST does not enumerate any 2048-bit register unless
> > userspace explicitly opts in to the relevant architecture-specific
> > features.
> 
> Does it? It's not in this patch and looking at the final tree:
> 
>   unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
>   {
>           unsigned long res = 0;
> 
>           res += num_core_regs();
>           res += num_sve_regs(vcpu);
>           res += kvm_arm_num_sys_reg_descs(vcpu);
>           res += kvm_arm_get_fw_num_regs(vcpu);
>           res += NUM_TIMER_REGS;
> 
>           return res;
>   }
> 
> 
> which leads to:
> 
>   static int enumerate_sve_regs(const struct kvm_vcpu *vcpu, u64 __user **uind)
>   {
>           unsigned int n, i;
>           int err = 0;
>           int total = 0;
>           unsigned int slices;
> 
>           if (!vcpu_has_sve(&vcpu->arch))
>                   return 0;
> 
> Which enumerates the SVE regs if vcpu_has_sve() which AFAICT is true if
> the host supports it, not if the user has requested it.
> 
> I'll have to check what but given the indirection of kvm_one_reg I
> wonder if existing binaries might end up spamming a badly sized array
> when run on a new SVE supporting kernel?

That shouldn't be the case: vcpu_has_sve() checks for the
KVM_ARM64_GUEST_HAS_SVE flag, which should only be set if userspace asks
for it.
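
In other words, the gate is per-vcpu rather than per-host; roughly (the
flag name is from the series, but the bit value here is made up):

  #define KVM_ARM64_GUEST_HAS_SVE	(1 << 5)   /* illustrative value */

  static inline bool vcpu_has_sve(const struct kvm_vcpu_arch *arch)
  {
  	return !!(arch->flags & KVM_ARM64_GUEST_HAS_SVE);
  }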

Give me a shout if this doesn't seem to be the case...

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 14/16] KVM: arm64/sve: Add SVE support to register access ioctl interface
  2018-07-25 17:20         ` Andrew Jones
@ 2018-07-26 13:10           ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-07-26 13:10 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Jul 25, 2018 at 07:20:57PM +0200, Andrew Jones wrote:
> On Wed, Jul 25, 2018 at 03:06:21PM +0100, Dave Martin wrote:
> > On Thu, Jul 19, 2018 at 03:04:33PM +0200, Andrew Jones wrote:
> > > On Thu, Jun 21, 2018 at 03:57:38PM +0100, Dave Martin wrote:
> > > > +
> > > > +	if (usize % sizeof(u32))
> > > > +		return -EINVAL;
> > > 
> > 
> > Currently we don't enforce the register size to be a multiple of 32 bits,
> > but I'm trying to establish a stronger position.  Passing different
> > register sizes feels like an abuse of the API and there is no evidence
> > that qemu or kvmtool is relying on this so far.  The ability to pass
> > a misaligned register ID and/or slurp multiple vcpu registers (or parts
> > of registers) in one call really seems like it works by accident today
> > and not by intentional design.  Rather, it exposes kernel
> > implementation details, which is best avoided.
> > 
> > It would be better to make this a global check for usize % 32 == 0
> > though, rather than burying it in fpsimd_vreg_bounds().
> > 
> > Opinions?
> 
> There's only one reason to not start enforcing it globally on arm/arm64,
> and that's that it's not documented that way. Changing it would be an API
> change, rather than just an API fix. It's probably a safe change, but...

I agree, though there are few direct users of this API, and I couldn't
come up with a scenario where anyone in their right mind would access
the core regs struct with access sizes <= 16 bits, and I've seen no
evidence so far of the API being used in this way.

So it would be nice to close this hole before it springs a leak.

I'll keep it for now, but flag it up for attention in the repost.
I'm happy to drop it if people care strongly enough.

> > > > +
> > > > +	usize /= sizeof(u32);
> > > > +
> > > > +	if ((uoffset <= start && usize <= start - uoffset) ||
> > > > +	    uoffset >= limit)
> > > > +		return -ENOENT;	/* not a vreg */
> > > > +
> > > > +	BUILD_BUG_ON(uoffset > limit);
> > > 
> > > Hmm, a build bug on uoffset can't be right, it's not a constant.
> > > 
> > > > +	if (uoffset < start || usize > limit - uoffset)
> > > > +		return -EINVAL;	/* overlaps vregs[] bounds */
> > 
> > uoffset is not compile-time constant, but (uoffset > limit) is compile-
> > time constant, because the previous if() returns from the function
> > otherwise.
> > 
> > gcc seems to do the right thing here: the code compiles as-is, but
> > if the prior if() is commented out then the BUILD_BUG_ON() fires
> > because (uoffset > limit) is no longer compile-time constant.
> 
> Oh, interesting.
> 
> > 
> > 
> > This is a defensively-coded bounds check, where
> > 
> > 	if (A + B > C)
> > 
> > is transformed to
> > 
> > 	if (C >= B && A > C - B)
> > 
> > The former is susceptible to overflow in (A + B), whereas the latter is
> > not.  We might be able to hide the risk with type casts, but that trades
> > one kind of fragility for another IMHO.
> > 
> > In this patch, the C >= B part is subsumed into the previous if(), but
> > because this is non-obvious I dropped the BUILD_BUG_ON() in as a hint
> > to maintainers that we really do depend on a property of the previous
> > check, so although it may look like the checks could be swapped over
> > with no ill effects, really that is not safe.
> 
> I'm glad our maintainers can pick up on hints like that :-) Maybe you can
> add a comment for mortals like me though.

Hint taken...  I'll add a comment.  No doubt I'd eventually forget why 
the BUILD_BUG_ON() was there too.

> > Maybe the BUILD_BUG_ON() is superfluous, but I would prefer at least
> > to keep a comment here.
> > 
> > What do you think?
> >
> 
> Comment plus build-bug or just comment works for me.
> 
> > 
> > OTOH, if we can show conclusively that we can avoid overflow here
> > then the code can be simplified.  But I would want to be confident
> > that this is really safe not just now but also under future maintenance.
> > 
> 
> I agree with thoroughly checking user input. Maybe we can create/use
> some helper functions to do it. Those helpers can then get reused
> elsewhere, helping to keep ourselves sane the next time we need to
> do similar sanity checks.

It's a bit tricky to get right, because it all depends on the
combination of types being used in the expression.

I might have a think about how to do this, but for now I don't want to
introduce more churn.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 16/16] KVM: arm64/sve: Report and enable SVE API extensions for userspace
  2018-07-25 16:52         ` Andrew Jones
@ 2018-07-26 13:18           ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-07-26 13:18 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Jul 25, 2018 at 06:52:56PM +0200, Andrew Jones wrote:
> On Wed, Jul 25, 2018 at 04:27:49PM +0100, Dave Martin wrote:
> > On Thu, Jul 19, 2018 at 04:59:21PM +0200, Andrew Jones wrote:
> > > On Thu, Jun 21, 2018 at 03:57:40PM +0100, Dave Martin wrote:
> > > > -	/*
> > > > -	 * For now, we don't return any features.
> > > > -	 * In future, we might use features to return target
> > > > -	 * specific features available for the preferred
> > > > -	 * target type.
> > > > -	 */
> > > > +	/* KVM_ARM_VCPU_SVE understood by KVM_VCPU_INIT */
> > > > +	init->features[0] = 1 << KVM_ARM_VCPU_SVE;
> > > > +
> > > 
> > > We shouldn't need to do this. The "preferred" target type isn't defined
> > > well (that I know of), but IMO it should probably be the target that
> > > best matches the host, minus optional features. The best base target. We
> > > may use these features to convey that the preferred target should enable
> > > some optional feature if that feature is necessary to workaround a bug,
> > > i.e. using the "feature" bit as an erratum bit someday, but that'd be
> > > quite a debatable use, so maybe not even that. Most likely we'll never
> > > need to add features here.
> > 
> > init->features[] has no semantics yet so we can define it how we like,
> > but I agree that the way I use it here is not necessarily the most
> > natural.
> > 
> > OTOH, we cannot use features[] for "mandatory" features like erratum
> > workarounds, because current userspace just ignores these bits.
> 
> It would have to learn to look here if that's how we started using it,
> but it'd be better to invent something else that wouldn't appear as
> abusive if we're going to teach userspace new stuff anyway.
> 
> > 
> > Rather, these bits would be for features that are considered beneficial
> > but must be off by default (due to incompatibility risks across nodes,
> > or due to ABI impacts).  Just blindly using the preferred target
> > already risks configuring a vcpu that won't work across all nodes in
> > your cluster.
> 
> KVM usually advertises optional features through capabilities. A device
> (vcpu device, in this case) ioctl can also be used to check for feature
> availability.
> 
> > 
> > So I'm not convinced that there is any useful interpretation of
> > features[] unless we interpret it as suggested in this patch.
> > 
> > Can you elaborate why you think it should be used with a more
> > concrete example?
> 
> I'm advocating that it *not* be used here. I think it should be used
> like the PMU feature uses it - and the PMU feature doesn't set a bit
> here.
> 
> > 
> > > That said, I think defining the feature bit makes sense. ATM, I'm feeling
> > > like we'll want to model the user interface for SVE like PMU (using VCPU
> > > device ioctls).
> > 
> > Some people expressed concerns about the ioctls becoming order-sensitive.
> > 
> > In the SVE case we don't want people enabling/disabling/reconfiguring
> > "silicon" features like SVE after the vcpu starts executing.
> > 
> > We will need an extra ioctl() for configuring the allowed SVE vector
> > lengths though.  I don't see a way around that.  So maybe we have to
> > solve the ordering problem anyway.
> 
> Yes, that's why I'm thinking that the vcpu device ioctls is probably the
> right way to go. The SVE group can have its own "finalize" request that
> allows all other SVE ioctls to be in any order prior to it.
> 
> > 
> > 
> > My current approach (not in this series) was to have VCPU_INIT return
> > -EINPROGRESS or similar if SVE is enabled in features[]: this indicates
> > that certain setup ioctls are required before the vcpu can run.
> > 
> > This may be overkill / not the best approach though.  I can look at
> > vcpu device ioctls as an alternative.
> 
> With a "finalize" attribute if SVE isn't finalized by VCPU_INIT or
> KVM_RUN time, then SVE just won't be enabled for that VCPU.

So I suppose we could do something like this:

 * Advertise SVE availability through a vcpu device capability (I need
   to check how that works).

 * SVE-aware userspace that understands SVE can do the relevant
   vcpu device ioctls to configure SVE and turn it on: these are only
   permitted before the vcpu runs.  We might require an explicit
   "finish SVE setup" ioctl to be issued before the vcpu can run.

 * Finally, the vcpu is set running by userspace as normal.

Marc or Christoffer was objecting to me previously that this may be an
abuse of vcpu device ioctls, because SVE is a CPU feature rather than a
device.  I guess it depends on how you define "device" -- I'm not sure
where to draw the line.

The vcpu device approach might reduce the amount of weird special-case
API that needs to be invented, which is probably a good thing.
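
For the probe step, the natural thing would be KVM_HAS_DEVICE_ATTR on
the vcpu fd, as for the PMU (a sketch; the group/attribute names are
placeholders, not a proposed ABI):

  struct kvm_device_attr attr = {
  	.group	= KVM_ARM_VCPU_SVE_CTRL,	/* hypothetical */
  	.attr	= KVM_ARM_VCPU_SVE_FINALIZE,	/* hypothetical */
  };

  /* Returns 0 iff the attribute (and hence SVE support) exists. */
  bool have_sve = ioctl(vcpu_fd, KVM_HAS_DEVICE_ATTR, &attr) == 0;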

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 16/16] KVM: arm64/sve: Report and enable SVE API extensions for userspace
  2018-07-19 15:24     ` Andrew Jones
@ 2018-07-26 13:23       ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-07-26 13:23 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jul 19, 2018 at 05:24:20PM +0200, Andrew Jones wrote:
> On Thu, Jun 21, 2018 at 03:57:40PM +0100, Dave Martin wrote:
> > This patch reports the availability of KVM SVE support to userspace
> > via a new vcpu feature flag KVM_ARM_VCPU_SVE.  This flag is
> > reported via the KVM_ARM_PREFERRED_TARGET ioctl.
> > 
> > Userspace can enable the feature by setting the flag for
> > KVM_ARM_VCPU_INIT.  Without this flag set, SVE-related ioctls and
> > register access extensions are hidden, and SVE remains disabled
> > unconditionally for the guest.  This ensures that non-SVE-aware KVM
> > userspace does not receive a vcpu that it does not understand how
> > to snapshot or restore correctly.
> > 
> > Storage is allocated for the SVE register state at vcpu init time,
> > sufficient for the maximum vector length to be exposed to the vcpu.
> > No attempt is made to allocate the storage lazily for now.  Also,
> > no attempt is made to resize the storage dynamically, since the
> > effective vector length of the vcpu can change at each EL0/EL1
> > transition.  The storage is freed at the vcpu uninit hook.
> > 
> > No particular attempt is made to prevent userspace from creating a
> > mix of vcpus some of which have SVE enabled and some of which have
> > it disabled.  This may or may not be useful, but it reflects the
> > underlying architectural behaviour.
> > 
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >  arch/arm64/include/asm/kvm_host.h |  6 +++---
> >  arch/arm64/include/uapi/asm/kvm.h |  1 +
> >  arch/arm64/kvm/guest.c            | 19 +++++++++++++------
> >  arch/arm64/kvm/reset.c            | 14 ++++++++++++++
> >  4 files changed, 31 insertions(+), 9 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index d2084ae..d956cf2 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -44,7 +44,7 @@
> >  
> >  #define KVM_MAX_VCPUS VGIC_V3_MAX_CPUS
> >  
> > -#define KVM_VCPU_MAX_FEATURES 4
> > +#define KVM_VCPU_MAX_FEATURES 5
> >  
> >  #define KVM_REQ_SLEEP \
> >  	KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
> > @@ -439,8 +439,8 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {}
> >  static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
> >  static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
> >  
> > -static inline int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu) { return 0; }
> > -static inline void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
> > +int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu);
> > +void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu);
> >  
> >  void kvm_arm_init_debug(void);
> >  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > index f54a9b0..6acf276 100644
> > --- a/arch/arm64/include/uapi/asm/kvm.h
> > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > @@ -101,6 +101,7 @@ struct kvm_regs {
> >  #define KVM_ARM_VCPU_EL1_32BIT		1 /* CPU running a 32bit VM */
> >  #define KVM_ARM_VCPU_PSCI_0_2		2 /* CPU uses PSCI v0.2 */
> >  #define KVM_ARM_VCPU_PMU_V3		3 /* Support guest PMUv3 */
> > +#define KVM_ARM_VCPU_SVE		4 /* Allow SVE for guest */
> >  
> >  struct kvm_vcpu_init {
> >  	__u32 target;
> > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > index 5152362..fb7f6aa 100644
> > --- a/arch/arm64/kvm/guest.c
> > +++ b/arch/arm64/kvm/guest.c
> > @@ -58,6 +58,16 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
> >  	return 0;
> >  }
> >  
> > +int kvm_arm_arch_vcpu_init(struct kvm_vcpu *vcpu)
> > +{
> > +	return 0;
> > +}
> > +
> > +void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
> > +{
> > +	kfree(vcpu->arch.sve_state);
> > +}
> > +
> >  static u64 core_reg_offset_from_id(u64 id)
> >  {
> >  	return id & ~(KVM_REG_ARCH_MASK | KVM_REG_SIZE_MASK | KVM_REG_ARM_CORE);
> > @@ -600,12 +610,9 @@ int kvm_vcpu_preferred_target(struct kvm_vcpu_init *init)
> >  
> >  	memset(init, 0, sizeof(*init));
> >  
> > -	/*
> > -	 * For now, we don't return any features.
> > -	 * In future, we might use features to return target
> > -	 * specific features available for the preferred
> > -	 * target type.
> > -	 */
> > +	/* KVM_ARM_VCPU_SVE understood by KVM_VCPU_INIT */
> > +	init->features[0] = 1 << KVM_ARM_VCPU_SVE;
> > +
> >  	init->target = (__u32)target;
> >  
> >  	return 0;
> > diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> > index a74311b..f63a791 100644
> > --- a/arch/arm64/kvm/reset.c
> > +++ b/arch/arm64/kvm/reset.c
> > @@ -110,6 +110,20 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
> >  			cpu_reset = &default_regs_reset;
> >  		}
> >  
> > +		if (system_supports_sve() &&
> > +		    test_bit(KVM_ARM_VCPU_SVE, vcpu->arch.features)) {
> > +			vcpu->arch.flags |= KVM_ARM64_GUEST_HAS_SVE;
> > +
> > +			vcpu->arch.sve_max_vl = sve_max_virtualisable_vl;
> > +
> 
> The allocation below needs to be guarded by an if (!vcpu->arch.sve_state),
> otherwise every time the guest does a PSCI-off/PSCI-on cycle of the vcpu
> we'll have a memory leak. Or, we need to move this allocation into the new
> kvm_arm_arch_vcpu_init() function. Why did you opt for kvm_reset_vcpu()?

I think I failed to find another suitable init function that gets called
at the right time.  I may have got confused though.

Good spot on the PSCI interaction.  In light of that, I agree: the SVE
buffer allocation should not be done here.
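
Wherever the allocation ends up, the guard itself is simple (a sketch;
the size calculation here is illustrative rather than taken from the
series):

  /* kvm_reset_vcpu() is re-run on PSCI CPU_ON, so allocate only once. */
  if (!vcpu->arch.sve_state) {
  	size_t size = SVE_SIG_REGS_SIZE(
  		sve_vq_from_vl(vcpu->arch.sve_max_vl));

  	vcpu->arch.sve_state = kzalloc(size, GFP_KERNEL);
  	if (!vcpu->arch.sve_state)
  		return -ENOMEM;
  }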


I'll have a think about how to refactor this in light of the discussion.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 13/16] KVM: Allow 2048-bit register access via KVM_{GET, SET}_ONE_REG
  2018-07-26 12:58       ` Dave Martin
@ 2018-07-26 13:55         ` Alex Bennée
  -1 siblings, 0 replies; 178+ messages in thread
From: Alex Bennée @ 2018-07-26 13:55 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> On Wed, Jul 25, 2018 at 04:58:30PM +0100, Alex Bennée wrote:
>>
>> Dave Martin <Dave.Martin@arm.com> writes:
>>
>> > The Arm SVE architecture defines registers that are up to 2048 bits
>> > in size (with some possibility of further future expansion).
>> >
>> > In order to avoid the need for an excessively large number of
>> > ioctls when saving and restoring a vcpu's registers, this patch
>> > adds a #define to make support for individual 2048-bit registers
>> > through the KVM_{GET,SET}_ONE_REG ioctl interface official.  This
>> > will allow each SVE register to be accessed in a single call.
>> >
>> > There are sufficient spare bits in the register id size field for
>> > this change, so there is no ABI impact providing that
>> > KVM_GET_REG_LIST does not enumerate any 2048-bit register unless
>> > userspace explicitly opts in to the relevant architecture-specific
>> > features.
>>
>> Does it? It's not in this patch and looking at the final tree:
>>
>>   unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
>>   {
>>           unsigned long res = 0;
>>
>>           res += num_core_regs();
>>           res += num_sve_regs(vcpu);
>>           res += kvm_arm_num_sys_reg_descs(vcpu);
>>           res += kvm_arm_get_fw_num_regs(vcpu);
>>           res += NUM_TIMER_REGS;
>>
>>           return res;
>>   }
>>
>>
>> which leads to:
>>
>>   static int enumerate_sve_regs(const struct kvm_vcpu *vcpu, u64 __user **uind)
>>   {
>>           unsigned int n, i;
>>           int err = 0;
>>           int total = 0;
>>           unsigned int slices;
>>
>>           if (!vcpu_has_sve(&vcpu->arch))
>>                   return 0;
>>
>> Which enumerates the SVE regs if vcpu_has_sve() which AFAICT is true if
>> the host supports it, not if the user has requested it.
>>
>> I'll have to check what but given the indirection of kvm_one_reg I
>> wonder if existing binaries might end up spamming a badly sized array
>> when run on a new SVE supporting kernel?
>
> That shouldn't be the case: vcpu_has_sve() checks for the
> KVM_ARM64_GUEST_HAS_SVE flag, which should only be set if userspace asks
> for it.
>
> Give me a shout if this doesn't seem to be the case...

Ahh I missed it the first time:

		if (system_supports_sve() &&
		    test_bit(KVM_ARM_VCPU_SVE, vcpu->arch.features)) {
			vcpu->arch.flags |= KVM_ARM64_GUEST_HAS_SVE;

And vcpu->arch.features is set by the user. However it will be set
anyway, because unless you specify otherwise we use the result probed by:

  ret = ioctl(vmfd, KVM_ARM_PREFERRED_TARGET, init);

so the user will get it by definition when they first run on SVE capable
hardware.
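
That is, the common pattern (simplified):

  struct kvm_vcpu_init init;

  /* Ask the kernel for its recommended target... */
  ioctl(vm_fd, KVM_ARM_PREFERRED_TARGET, &init);

  /* ...and pass the answer straight back, features and all.  With
   * this patch init.features[0] has KVM_ARM_VCPU_SVE set, so an
   * unmodified VMM would silently enable SVE here. */
  ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &init);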


>
> Cheers
> ---Dave


--
Alex Bennée
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 13/16] KVM: Allow 2048-bit register access via KVM_{GET, SET}_ONE_REG
  2018-07-26 13:55         ` Alex Bennée
@ 2018-07-27  9:26           ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-07-27  9:26 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jul 26, 2018 at 02:55:44PM +0100, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > On Wed, Jul 25, 2018 at 04:58:30PM +0100, Alex Bennée wrote:
> >>
> >> Dave Martin <Dave.Martin@arm.com> writes:
> >>
> >> > The Arm SVE architecture defines registers that are up to 2048 bits
> >> > in size (with some possibility of further future expansion).
> >> >
> >> > In order to avoid the need for an excessively large number of
> >> > ioctls when saving and restoring a vcpu's registers, this patch
> >> > adds a #define to make support for individual 2048-bit registers
> >> > through the KVM_{GET,SET}_ONE_REG ioctl interface official.  This
> >> > will allow each SVE register to be accessed in a single call.
> >> >
> >> > There are sufficient spare bits in the register id size field for
> >> > this change, so there is no ABI impact providing that
> >> > KVM_GET_REG_LIST does not enumerate any 2048-bit register unless
> >> > userspace explicitly opts in to the relevant architecture-specific
> >> > features.
> >>
> >> Does it? It's not in this patch and looking at the final tree:
> >>
> >>   unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
> >>   {
> >>           unsigned long res = 0;
> >>
> >>           res += num_core_regs();
> >>           res += num_sve_regs(vcpu);
> >>           res += kvm_arm_num_sys_reg_descs(vcpu);
> >>           res += kvm_arm_get_fw_num_regs(vcpu);
> >>           res += NUM_TIMER_REGS;
> >>
> >>           return res;
> >>   }
> >>
> >>
> >> which leads to:
> >>
> >>   static int enumerate_sve_regs(const struct kvm_vcpu *vcpu, u64 __user **uind)
> >>   {
> >>           unsigned int n, i;
> >>           int err = 0;
> >>           int total = 0;
> >>           unsigned int slices;
> >>
> >>           if (!vcpu_has_sve(&vcpu->arch))
> >>                   return 0;
> >>
> >> This enumerates the SVE regs if vcpu_has_sve(), which AFAICT is true if
> >> the host supports it, not if the user has requested it.
> >>
> >> I'll have to check, but given the indirection of kvm_one_reg I wonder
> >> whether existing binaries might end up spamming a badly sized array
> >> when run on a new SVE-supporting kernel?
> >
> > That shouldn't be the case: vcpu_has_sve() checks for the
> > KVM_ARM64_GUEST_HAS_SVE flag, which should only be set if userspace asks
> > for it.
> >
> > Give me a shout if this doesn't seem to be the case...
> 
> Ahh I missed it the first time:
> 
> 		if (system_supports_sve() &&
> 		    test_bit(KVM_ARM_VCPU_SVE, vcpu->arch.features)) {
> 			vcpu->arch.flags |= KVM_ARM64_GUEST_HAS_SVE;
> 
> And vcpu->arch.features is set by the user. However it will be set
> anyway, because unless you specify otherwise we use the result of
> probing:
> 
>   ret = ioctl(vmfd, KVM_ARM_PREFERRED_TARGET, init);
> 
> so the user will get it by definition when they first run on SVE-capable
> hardware.

AFAIK qemu and kvmtool don't currently propagate the feature flags from
KVM_ARM_PREFERRED_TARGET to KVM_ARM_VCPU_INIT.  That would probably not be
the right thing to do, because if you don't know what a feature flag
means then you can't safely turn it on.

But in light of Andrew's comments I will need to review how this works.

The semantics of the feature bits are not well defined, so it may be
safer not to use them for enabling SVE.  I was hoping we could choose
a meaning for them since they weren't previously used for anything,
but that may be optimistic.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 14/16] KVM: arm64/sve: Add SVE support to register access ioctl interface
  2018-07-19 13:04     ` Andrew Jones
@ 2018-08-03 14:57       ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-08-03 14:57 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jul 19, 2018 at 03:04:33PM +0200, Andrew Jones wrote:
> On Thu, Jun 21, 2018 at 03:57:38PM +0100, Dave Martin wrote:
> > This patch adds the following registers for access via the
> > KVM_{GET,SET}_ONE_REG interface:
> > 
> >  * KVM_REG_ARM64_SVE_ZREG(n, i) (n = 0..31) (in 2048-bit slices)
> >  * KVM_REG_ARM64_SVE_PREG(n, i) (n = 0..15) (in 256-bit slices)
> >  * KVM_REG_ARM64_SVE_FFR(i) (in 256-bit slices)
> > 
> > In order to adapt gracefully to future architectural extensions,
> > the registers are divided up into slices as noted above:  the i
> > parameter denotes the slice index.
> > 
> > For simplicity, bits or slices that exceed the maximum vector
> > length supported for the vcpu are ignored for KVM_SET_ONE_REG, and
> > read as zero for KVM_GET_ONE_REG.
> > 
> > For the current architecture, only slice i = 0 is significant.  The
> > interface design allows i to increase to up to 31 in the future if
> > required by future architectural amendments.
> > 
> > The registers are only visible for vcpus that have SVE enabled.
> > They are not enumerated by KVM_GET_REG_LIST on vcpus that do not
> > have SVE.  In all cases, surplus slices are not enumerated by
> > KVM_GET_REG_LIST.
> > 
> > Accesses to the FPSIMD registers via KVM_REG_ARM_CORE are
> > redirected to access the underlying vcpu SVE register storage as
> > appropriate.  In order to make this more straightforward, register
> > accesses that straddle register boundaries are no longer guaranteed
> > to succeed.  (Support for such use was never deliberate, and
> > userspace does not currently seem to be relying on it.)
> > 
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>

[...]

> > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c

[...]

> > +static int sve_reg_bounds(struct reg_bounds_struct *b,
> > +			  const struct kvm_vcpu *vcpu,
> > +			  const struct kvm_one_reg *reg)
> > +{

[...]

> > +	b->kptr += start;
> > +
> > +	if (copy_limit < start)
> > +		copy_limit = start;
> > +	else if (copy_limit > limit)
> > +		copy_limit = limit;
> 
>  copy_limit = clamp(copy_limit, start, limit)

Hmmm, having looked in detail at the definition of clamp(), I'm not sure
I like it that much -- it can introduce type issues that are not readily
apparent to the reader.

gcc can warn about signed/unsigned comparisons, which is the only issue
where clamp() genuinely helps AFAICT, but this requires -Wsign-compare
(which is not enabled by default, nor with -Wall).  Great.

I can use clamp() if you feel strongly about it, but otherwise I tend to
prefer my subtleties to be in plain sight rather than buried inside a
macro, unless there is a serious verbosity impact from not using the
macro (here, I would say there isn't, since it's just a single
instance).
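
(Sketch of the sort of thing I mean, using this case: the kernel's
clamp() is built on min()/max(), whose typechecking makes mixed-type
arguments fail noisily at build time, while the open-coded version
compiles silently:

	u64 start, limit;
	int copy_limit;

	/* warns: "comparison of distinct pointer types lacks a cast" */
	copy_limit = clamp(copy_limit, start, limit);

whether that diagnostic counts as help or noise is exactly the
question.)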

[...]

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 14/16] KVM: arm64/sve: Add SVE support to register access ioctl interface
  2018-08-03 14:57       ` Dave Martin
@ 2018-08-03 15:11         ` Andrew Jones
  -1 siblings, 0 replies; 178+ messages in thread
From: Andrew Jones @ 2018-08-03 15:11 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Fri, Aug 03, 2018 at 03:57:59PM +0100, Dave Martin wrote:
> On Thu, Jul 19, 2018 at 03:04:33PM +0200, Andrew Jones wrote:
> > On Thu, Jun 21, 2018 at 03:57:38PM +0100, Dave Martin wrote:
> > > This patch adds the following registers for access via the
> > > KVM_{GET,SET}_ONE_REG interface:
> > > 
> > >  * KVM_REG_ARM64_SVE_ZREG(n, i) (n = 0..31) (in 2048-bit slices)
> > >  * KVM_REG_ARM64_SVE_PREG(n, i) (n = 0..15) (in 256-bit slices)
> > >  * KVM_REG_ARM64_SVE_FFR(i) (in 256-bit slices)
> > > 
> > > In order to adapt gracefully to future architectural extensions,
> > > the registers are divided up into slices as noted above:  the i
> > > parameter denotes the slice index.
> > > 
> > > For simplicity, bits or slices that exceed the maximum vector
> > > length supported for the vcpu are ignored for KVM_SET_ONE_REG, and
> > > read as zero for KVM_GET_ONE_REG.
> > > 
> > > For the current architecture, only slice i = 0 is significant.  The
> > > interface design allows i to increase to up to 31 in the future if
> > > required by future architectural amendments.
> > > 
> > > The registers are only visible for vcpus that have SVE enabled.
> > > They are not enumerated by KVM_GET_REG_LIST on vcpus that do not
> > > have SVE.  In all cases, surplus slices are not enumerated by
> > > KVM_GET_REG_LIST.
> > > 
> > > Accesses to the FPSIMD registers via KVM_REG_ARM_CORE are
> > > redirected to access the underlying vcpu SVE register storage as
> > > appropriate.  In order to make this more straightforward, register
> > > accesses that straddle register boundaries are no longer guaranteed
> > > to succeed.  (Support for such use was never deliberate, and
> > > userspace does not currently seem to be relying on it.)
> > > 
> > > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> 
> [...]
> 
> > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> 
> [...]
> 
> > > +static int sve_reg_bounds(struct reg_bounds_struct *b,
> > > +			  const struct kvm_vcpu *vcpu,
> > > +			  const struct kvm_one_reg *reg)
> > > +{
> 
> [...]
> 
> > > +	b->kptr += start;
> > > +
> > > +	if (copy_limit < start)
> > > +		copy_limit = start;
> > > +	else if (copy_limit > limit)
> > > +		copy_limit = limit;
> > 
> >  copy_limit = clamp(copy_limit, start, limit)
> 
> Hmmm, having looked in detail at the definition of clamp(), I'm not sure
> I like it that much -- it can introduce type issues that are not readily
> apparent to the reader.
> 
> gcc can warn about signed/unsigned comparisons, which is the only issue
> where clamp() genuinely helps AFAICT, but this requires -Wsign-compare
> (which is not enabled by default, nor with -Wall).  Great.
> 
> I can use clamp() if you feel strongly about it, but otherwise I tend to
> prefer my subtleties to be in plain sight rather than buried inside a
> macro, unless there is a serious verbosity impact from not using the
> macro (here, I would say there isn't, since it's just a single
> instance).
>

Would clamp_t, with an appropriate type, satisfy your concerns?

Thanks,
drew

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 14/16] KVM: arm64/sve: Add SVE support to register access ioctl interface
  2018-08-03 15:11         ` Andrew Jones
@ 2018-08-03 15:38           ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-08-03 15:38 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Fri, Aug 03, 2018 at 05:11:09PM +0200, Andrew Jones wrote:
> On Fri, Aug 03, 2018 at 03:57:59PM +0100, Dave Martin wrote:
> > On Thu, Jul 19, 2018 at 03:04:33PM +0200, Andrew Jones wrote:
> > > On Thu, Jun 21, 2018 at 03:57:38PM +0100, Dave Martin wrote:
> > > > This patch adds the following registers for access via the
> > > > KVM_{GET,SET}_ONE_REG interface:
> > > > 
> > > >  * KVM_REG_ARM64_SVE_ZREG(n, i) (n = 0..31) (in 2048-bit slices)
> > > >  * KVM_REG_ARM64_SVE_PREG(n, i) (n = 0..15) (in 256-bit slices)
> > > >  * KVM_REG_ARM64_SVE_FFR(i) (in 256-bit slices)
> > > > 
> > > > In order to adapt gracefully to future architectural extensions,
> > > > the registers are divided up into slices as noted above:  the i
> > > > parameter denotes the slice index.
> > > > 
> > > > For simplicity, bits or slices that exceed the maximum vector
> > > > length supported for the vcpu are ignored for KVM_SET_ONE_REG, and
> > > > read as zero for KVM_GET_ONE_REG.
> > > > 
> > > > For the current architecture, only slice i = 0 is significant.  The
> > > > interface design allows i to increase to up to 31 in the future if
> > > > required by future architectural amendments.
> > > > 
> > > > The registers are only visible for vcpus that have SVE enabled.
> > > > They are not enumerated by KVM_GET_REG_LIST on vcpus that do not
> > > > have SVE.  In all cases, surplus slices are not enumerated by
> > > > KVM_GET_REG_LIST.
> > > > 
> > > > Accesses to the FPSIMD registers via KVM_REG_ARM_CORE are
> > > > redirected to access the underlying vcpu SVE register storage as
> > > > appropriate.  In order to make this more straightforward, register
> > > > accesses that straddle register boundaries are no longer guaranteed
> > > > to succeed.  (Support for such use was never deliberate, and
> > > > userspace does not currently seem to be relying on it.)
> > > > 
> > > > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > 
> > [...]
> > 
> > > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > 
> > [...]
> > 
> > > > +static int sve_reg_bounds(struct reg_bounds_struct *b,
> > > > +			  const struct kvm_vcpu *vcpu,
> > > > +			  const struct kvm_one_reg *reg)
> > > > +{
> > 
> > [...]
> > 
> > > > +	b->kptr += start;
> > > > +
> > > > +	if (copy_limit < start)
> > > > +		copy_limit = start;
> > > > +	else if (copy_limit > limit)
> > > > +		copy_limit = limit;
> > > 
> > >  copy_limit = clamp(copy_limit, start, limit)
> > 
> > Hmmm, having looked in detail at the definition of clamp(), I'm not sure
> > I like it that much -- it can introduce type issues that are not readily
> > apparent to the reader.
> > 
> > gcc can warn about signed/unsigned comparisons, which is the only issue
> > where clamp() genuinely helps AFAICT, but this requires -Wsign-compare
> > (which is not enabled by default, nor with -Wall).  Great.
> > 
> > I can use clamp() if you feel strongly about it, but otherwise I tend to
> > prefer my subtleties to be in plain sight rather than buried inside a
> > macro, unless there is a serious verbosity impact from not using the
> > macro (here, I would say there isn't, since it's just a single
> > instance).
> >
> 
> Would clamp_t, with an appropriate type, satisfy your concerns?

clamp_t() seems worse actually, since it replaces the typechecking
that is the main benefit of clamp() with explicit, unsafe typecasts.
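
For illustration (a rough sketch of the expansion, going by the
definitions in include/linux/kernel.h at the time):

	copy_limit = clamp_t(u64, copy_limit, start, limit);
	/* expands to approximately: */
	copy_limit = min_t(u64, max_t(u64, copy_limit, start), limit);
	/* every operand is cast to u64 first, so any truncation or
	 * sign conversion happens silently, with no diagnostics */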


To save just a few lines of code, I wasn't sure it was really worth
opening this can of worms...

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 09/16] KVM: arm64: Allow ID registers to by dynamically read-as-zero
  2018-06-21 14:57   ` Dave Martin
@ 2018-08-06 13:03     ` Christoffer Dall
  -1 siblings, 0 replies; 178+ messages in thread
From: Christoffer Dall @ 2018-08-06 13:03 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

Hi Dave,

I think there's a typo in the subject: "to be" rather than "to by".

On Thu, Jun 21, 2018 at 03:57:33PM +0100, Dave Martin wrote:
> When a feature-dependent ID register is hidden from the guest, it
> needs to exhibit read-as-zero behaviour as defined by the Arm
> architecture, rather than appearing to be entirely absent.
> 
> This patch updates the ID register emulation logic to make use of
> the new check_present() method to determine whether the register
> should read as zero instead of yielding the host's sanitised
> value.  Because currently a false result from this method truncates
> the trap call chain before the sysreg's emulate method() is called,
> a flag is added to distinguish this special case, and helpers are
> refactored appropriately.

I don't understand this last sentence.

And I'm not really sure I understand the code either.

I can't seem to see any registers which are defined as !present && !raz,
which is what I thought this feature was all about.

In other words, what is the benefit of this more generic method as
opposed to having a wrapper around read_id_reg() for read_sve_id_reg()
which sets RAZ if there is no support for SVE in this context?
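
i.e. something like this (untested sketch; read_sve_id_reg() is a name
I'm making up, wrapping the existing read_id_reg()):

	static u64 read_sve_id_reg(const struct kvm_vcpu *vcpu,
				   struct sys_reg_desc const *r)
	{
		/* RAZ unless this vcpu actually has SVE */
		return read_id_reg(r, !vcpu_has_sve(&vcpu->arch));
	}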

Thanks,
-Christoffer

> 
> This involves some trivial updates to pass the vcpu pointer down
> into the ID register emulation/access functions.
> 
> A new ID_SANITISED_IF() macro is defined for declaring
> conditionally visible ID registers.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm64/kvm/sys_regs.c | 51 ++++++++++++++++++++++++++++++-----------------
>  arch/arm64/kvm/sys_regs.h | 11 ++++++++++
>  2 files changed, 44 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 31a351a..87d2468 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -987,11 +987,17 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu,
>  }
>  
>  /* Read a sanitised cpufeature ID register by sys_reg_desc */
> -static u64 read_id_reg(struct sys_reg_desc const *r, bool raz)
> +static u64 read_id_reg(const struct kvm_vcpu *vcpu,
> +		       struct sys_reg_desc const *r, bool raz)
>  {
>  	u32 id = sys_reg((u32)r->Op0, (u32)r->Op1,
>  			 (u32)r->CRn, (u32)r->CRm, (u32)r->Op2);
> -	u64 val = raz ? 0 : read_sanitised_ftr_reg(id);
> +	u64 val;
> +
> +	if (raz || !sys_reg_present(vcpu, r))
> +		val = 0;
> +	else
> +		val = read_sanitised_ftr_reg(id);
>  
>  	if (id == SYS_ID_AA64PFR0_EL1) {
>  		if (val & (0xfUL << ID_AA64PFR0_SVE_SHIFT))
> @@ -1018,7 +1024,7 @@ static bool __access_id_reg(struct kvm_vcpu *vcpu,
>  	if (p->is_write)
>  		return write_to_read_only(vcpu, p, r);
>  
> -	p->regval = read_id_reg(r, raz);
> +	p->regval = read_id_reg(vcpu, r, raz);
>  	return true;
>  }
>  
> @@ -1047,16 +1053,18 @@ static u64 sys_reg_to_index(const struct sys_reg_desc *reg);
>   * are stored, and for set_id_reg() we don't allow the effective value
>   * to be changed.
>   */
> -static int __get_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
> +static int __get_id_reg(const struct kvm_vcpu *vcpu,
> +			const struct sys_reg_desc *rd, void __user *uaddr,
>  			bool raz)
>  {
>  	const u64 id = sys_reg_to_index(rd);
> -	const u64 val = read_id_reg(rd, raz);
> +	const u64 val = read_id_reg(vcpu, rd, raz);
>  
>  	return reg_to_user(uaddr, &val, id);
>  }
>  
> -static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
> +static int __set_id_reg(const struct kvm_vcpu *vcpu,
> +			const struct sys_reg_desc *rd, void __user *uaddr,
>  			bool raz)
>  {
>  	const u64 id = sys_reg_to_index(rd);
> @@ -1068,7 +1076,7 @@ static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
>  		return err;
>  
>  	/* This is what we mean by invariant: you can't change it. */
> -	if (val != read_id_reg(rd, raz))
> +	if (val != read_id_reg(vcpu, rd, raz))
>  		return -EINVAL;
>  
>  	return 0;
> @@ -1077,33 +1085,40 @@ static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
>  static int get_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
>  		      const struct kvm_one_reg *reg, void __user *uaddr)
>  {
> -	return __get_id_reg(rd, uaddr, false);
> +	return __get_id_reg(vcpu, rd, uaddr, false);
>  }
>  
>  static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
>  		      const struct kvm_one_reg *reg, void __user *uaddr)
>  {
> -	return __set_id_reg(rd, uaddr, false);
> +	return __set_id_reg(vcpu, rd, uaddr, false);
>  }
>  
>  static int get_raz_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
>  			  const struct kvm_one_reg *reg, void __user *uaddr)
>  {
> -	return __get_id_reg(rd, uaddr, true);
> +	return __get_id_reg(vcpu, rd, uaddr, true);
>  }
>  
>  static int set_raz_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
>  			  const struct kvm_one_reg *reg, void __user *uaddr)
>  {
> -	return __set_id_reg(rd, uaddr, true);
> +	return __set_id_reg(vcpu, rd, uaddr, true);
>  }
>  
>  /* sys_reg_desc initialiser for known cpufeature ID registers */
> -#define ID_SANITISED(name) {			\
> +#define __ID_SANITISED(name)			\
>  	SYS_DESC(SYS_##name),			\
>  	.access	= access_id_reg,		\
>  	.get_user = get_id_reg,			\
> -	.set_user = set_id_reg,			\
> +	.set_user = set_id_reg
> +
> +#define ID_SANITISED(name) { __ID_SANITISED(name) }
> +
> +#define ID_SANITISED_IF(name, check) {		\
> +	__ID_SANITISED(name),			\
> +	.check_present = check,			\
> +	.flags = SR_RAZ_IF_ABSENT,		\
>  }
>  
>  /*
> @@ -1840,7 +1855,7 @@ static int emulate_cp(struct kvm_vcpu *vcpu,
>  
>  	r = find_reg(params, table, num);
>  
> -	if (likely(r) && sys_reg_present(vcpu, r)) {
> +	if (likely(r) && sys_reg_present_or_raz(vcpu, r)) {
>  		perform_access(vcpu, params, r);
>  		return 0;
>  	}
> @@ -2016,7 +2031,7 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
>  	if (!r)
>  		r = find_reg(params, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
>  
> -	if (likely(r) && sys_reg_present(vcpu, r)) {
> +	if (likely(r) && sys_reg_present_or_raz(vcpu, r)) {
>  		perform_access(vcpu, params, r);
>  	} else {
>  		kvm_err("Unsupported guest sys_reg access at: %lx\n",
> @@ -2313,7 +2328,7 @@ int kvm_arm_sys_reg_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
>  	if (!r)
>  		return get_invariant_sys_reg(reg->id, uaddr);
>  
> -	if (!sys_reg_present(vcpu, r))
> +	if (!sys_reg_present_or_raz(vcpu, r))
>  		return -ENOENT;
>  
>  	if (r->get_user)
> @@ -2337,7 +2352,7 @@ int kvm_arm_sys_reg_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
>  	if (!r)
>  		return set_invariant_sys_reg(reg->id, uaddr);
>  
> -	if (!sys_reg_present(vcpu, r))
> +	if (!sys_reg_present_or_raz(vcpu, r))
>  		return -ENOENT;
>  
>  	if (r->set_user)
> @@ -2408,7 +2423,7 @@ static int walk_one_sys_reg(struct kvm_vcpu *vcpu,
>  	if (!(rd->reg || rd->get_user))
>  		return 0;
>  
> -	if (!sys_reg_present(vcpu, rd))
> +	if (!sys_reg_present_or_raz(vcpu, rd))
>  		return 0;
>  
>  	if (!copy_reg_to_user(rd, uind))
> diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
> index dfbb342..304928f 100644
> --- a/arch/arm64/kvm/sys_regs.h
> +++ b/arch/arm64/kvm/sys_regs.h
> @@ -66,14 +66,25 @@ struct sys_reg_desc {
>  			const struct kvm_one_reg *reg, void __user *uaddr);
>  	bool (*check_present)(const struct kvm_vcpu *vpcu,
>  			      const struct sys_reg_desc *rd);
> +
> +	/* OR of SR_* flags */
> +	unsigned int flags;
>  };
>  
> +#define SR_RAZ_IF_ABSENT	(1 << 0)
> +
>  static inline bool sys_reg_present(const struct kvm_vcpu *vcpu,
>  				   const struct sys_reg_desc *rd)
>  {
>  	return likely(!rd->check_present) || rd->check_present(vcpu, rd);
>  }
>  
> +static inline bool sys_reg_present_or_raz(const struct kvm_vcpu *vcpu,
> +					  const struct sys_reg_desc *rd)
> +{
> +	return sys_reg_present(vcpu, rd) || (rd->flags & SR_RAZ_IF_ABSENT);
> +}
> +
>  static inline void print_sys_reg_instr(const struct sys_reg_params *p)
>  {
>  	/* Look, we even formatted it for you to paste into the table! */
> -- 
> 2.1.4
> 
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 00/16] KVM: arm64: Initial support for SVE guests
  2018-06-21 14:57 ` Dave Martin
@ 2018-08-06 13:05   ` Christoffer Dall
  -1 siblings, 0 replies; 178+ messages in thread
From: Christoffer Dall @ 2018-08-06 13:05 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jun 21, 2018 at 03:57:24PM +0100, Dave Martin wrote:
> This series implements basic support for allowing KVM guests to use the
> Arm Scalable Vector Extension (SVE).
> 
> The patches are based on torvalds/master f5b7769e (Revert "debugfs:
> inode: debugfs_create_dir uses mode permission from parent") plus the
> patches from [1].

Given the effort required to go fetch another patch set from the list to
apply to a specific commit, and the size of this patch set, it would be
helpful to have a pointer to a branch with everything in it.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 12/16] KVM: arm64/sve: Context switch the SVE registers
  2018-06-21 14:57   ` Dave Martin
@ 2018-08-06 13:19     ` Christoffer Dall
  -1 siblings, 0 replies; 178+ messages in thread
From: Christoffer Dall @ 2018-08-06 13:19 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jun 21, 2018 at 03:57:36PM +0100, Dave Martin wrote:
> In order to give each vcpu its own view of the SVE registers, this
> patch adds context storage via a new sve_state pointer in struct
> vcpu_arch.  An additional member sve_max_vl is also added for each
> vcpu, to determine the maximum vector length visible to the guest
> and thus the value to be configured in ZCR_EL2.LEN while the vcpu is
> active.  This also determines the layout and size of the storage in
> sve_state, which is read and written by the same backend functions
> that are used for context-switching the SVE state for host tasks.
> 
> On SVE-enabled vcpus, SVE access traps are now handled by switching
> in the vcpu's SVE context and disabling the trap before returning
> to the guest.  On other vcpus, the trap is not handled and an exit
> back to the host occurs, where the handle_sve() fallback path
> reflects an undefined instruction exception back to the guest,
> consistently with the behaviour of non-SVE-capable hardware (as was
> done unconditionally prior to this patch).
> 
> No SVE handling is added on the non-VHE paths, since VHE is an
> architectural and Kconfig prerequisite of SVE.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm64/include/asm/kvm_host.h |  2 ++
>  arch/arm64/kvm/fpsimd.c           |  5 +++--
>  arch/arm64/kvm/hyp/switch.c       | 43 ++++++++++++++++++++++++++++++---------
>  3 files changed, 38 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index f331abf..d2084ae 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -211,6 +211,8 @@ typedef struct kvm_cpu_context kvm_cpu_context_t;
>  
>  struct kvm_vcpu_arch {
>  	struct kvm_cpu_context ctxt;
> +	void *sve_state;
> +	unsigned int sve_max_vl;
>  
>  	/* HYP configuration */
>  	u64 hcr_el2;
> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> index 872008c..44cf783 100644
> --- a/arch/arm64/kvm/fpsimd.c
> +++ b/arch/arm64/kvm/fpsimd.c
> @@ -86,10 +86,11 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
>  
>  	if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
>  		fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.gp_regs.fp_regs,
> -					 NULL, sve_max_vl);
> +					 vcpu->arch.sve_state,
> +					 vcpu->arch.sve_max_vl);
>  
>  		clear_thread_flag(TIF_FOREIGN_FPSTATE);
> -		clear_thread_flag(TIF_SVE);
> +		update_thread_flag(TIF_SVE, vcpu_has_sve(&vcpu->arch));
>  	}
>  }
>  
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index d496ef5..98df5c1 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -98,8 +98,13 @@ static void activate_traps_vhe(struct kvm_vcpu *vcpu)
>  	val = read_sysreg(cpacr_el1);
>  	val |= CPACR_EL1_TTA;
>  	val &= ~CPACR_EL1_ZEN;
> -	if (!update_fp_enabled(vcpu))
> +
> +	if (update_fp_enabled(vcpu)) {
> +		if (vcpu_has_sve(&vcpu->arch))
> +			val |= CPACR_EL1_ZEN;
> +	} else {
>  		val &= ~CPACR_EL1_FPEN;
> +	}
>  
>  	write_sysreg(val, cpacr_el1);
>  
> @@ -114,6 +119,7 @@ static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu)
>  
>  	val = CPTR_EL2_DEFAULT;
>  	val |= CPTR_EL2_TTA | CPTR_EL2_TZ;
> +
>  	if (!update_fp_enabled(vcpu))
>  		val |= CPTR_EL2_TFP;
>  
> @@ -329,16 +335,22 @@ static bool __hyp_text __skip_instr(struct kvm_vcpu *vcpu)
>  	}
>  }
>  
> -static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
> +static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu,
> +					   bool guest_has_sve)
>  {
>  	struct user_fpsimd_state *host_fpsimd = vcpu->arch.host_fpsimd_state;
>  
> -	if (has_vhe())
> -		write_sysreg(read_sysreg(cpacr_el1) | CPACR_EL1_FPEN,
> -			     cpacr_el1);
> -	else
> +	if (has_vhe()) {
> +		u64 reg = read_sysreg(cpacr_el1) | CPACR_EL1_FPEN;
> +
> +		if (system_supports_sve() && guest_has_sve)
> +			reg |= CPACR_EL1_ZEN;
> +
> +		write_sysreg(reg, cpacr_el1);
> +	} else {
>  		write_sysreg(read_sysreg(cptr_el2) & ~(u64)CPTR_EL2_TFP,
>  			     cptr_el2);
> +	}
>  
>  	isb();
>  
> @@ -361,7 +373,13 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
>  		vcpu->arch.flags &= ~KVM_ARM64_FP_HOST;
>  	}
>  
> -	__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
> +	if (system_supports_sve() && guest_has_sve)
> +		sve_load_state((char *)vcpu->arch.sve_state +
> +					sve_ffr_offset(vcpu->arch.sve_max_vl),

nit: would it make sense to have a macro 'vcpu_get_sve_state_ptr(vcpu)'
to make this first argument prettier?
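
Something like this, say (name invented, purely illustrative):

	#define vcpu_get_sve_state_ptr(vcpu) \
		((char *)((vcpu)->arch.sve_state) + \
		 sve_ffr_offset((vcpu)->arch.sve_max_vl))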

> +			       &vcpu->arch.ctxt.gp_regs.fp_regs.fpsr,
> +			       sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1);
> +	else
> +		__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
>  
>  	/* Skip restoring fpexc32 for AArch64 guests */
>  	if (!(read_sysreg(hcr_el2) & HCR_RW))
> @@ -380,6 +398,8 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
>   */
>  static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
>  {
> +	bool guest_has_sve;
> +
>  	if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ)
>  		vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
>  
> @@ -397,10 +417,13 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
>  	 * and restore the guest context lazily.
>  	 * If FP/SIMD is not implemented, handle the trap and inject an
>  	 * undefined instruction exception to the guest.
> +	 * Similarly for trapped SVE accesses.
>  	 */
> -	if (system_supports_fpsimd() &&
> -	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
> -		return __hyp_switch_fpsimd(vcpu);
> +	guest_has_sve = vcpu_has_sve(&vcpu->arch);
> +	if ((system_supports_fpsimd() &&
> +	     kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD) ||
> +	    (guest_has_sve && kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SVE))

nit: this may also be folded nicely into a static bool
__trap_fpsimd_sve_access() check.
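
i.e. something along these lines (untested sketch):

	static bool __hyp_text __trap_fpsimd_sve_access(struct kvm_vcpu *vcpu,
							bool guest_has_sve)
	{
		u8 ec = kvm_vcpu_trap_get_class(vcpu);

		return (system_supports_fpsimd() && ec == ESR_ELx_EC_FP_ASIMD) ||
		       (guest_has_sve && ec == ESR_ELx_EC_SVE);
	}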

> +		return __hyp_switch_fpsimd(vcpu, guest_has_sve);
>  
>  	if (!__populate_fault_info(vcpu))
>  		return true;
> -- 
> 2.1.4
> 

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 14/16] KVM: arm64/sve: Add SVE support to register access ioctl interface
  2018-06-21 14:57   ` Dave Martin
@ 2018-08-06 13:25     ` Christoffer Dall
  -1 siblings, 0 replies; 178+ messages in thread
From: Christoffer Dall @ 2018-08-06 13:25 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jun 21, 2018 at 03:57:38PM +0100, Dave Martin wrote:
> This patch adds the following registers for access via the
> KVM_{GET,SET}_ONE_REG interface:
> 
>  * KVM_REG_ARM64_SVE_ZREG(n, i) (n = 0..31) (in 2048-bit slices)
>  * KVM_REG_ARM64_SVE_PREG(n, i) (n = 0..15) (in 256-bit slices)
>  * KVM_REG_ARM64_SVE_FFR(i) (in 256-bit slices)
> 
> In order to adapt gracefully to future architectural extensions,
> the registers are divided up into slices as noted above:  the i
> parameter denotes the slice index.
> 
> For simplicity, bits or slices that exceed the maximum vector
> length supported for the vcpu are ignored for KVM_SET_ONE_REG, and
> read as zero for KVM_GET_ONE_REG.
> 
> For the current architecture, only slice i = 0 is significant.  The
> interface design allows i to increase to up to 31 in the future if
> required by future architectural amendments.
> 
> The registers are only visible for vcpus that have SVE enabled.
> They are not enumerated by KVM_GET_REG_LIST on vcpus that do not
> have SVE.  In all cases, surplus slices are not enumerated by
> KVM_GET_REG_LIST.
> 
> Accesses to the FPSIMD registers via KVM_REG_ARM_CORE are
> redirected to access the underlying vcpu SVE register storage as
> appropriate.  In order to make this more straightforward, register
> accesses that straddle register boundaries are no longer guaranteed
> to succeed.  (Support for such use was never deliberate, and
> userspace does not currently seem to be relying on it.)

Could you add documentation to Documentation/virtual/kvm/api.txt for
this as well under the KVM_SET_ONE_REG definitions explaining the use
for arm64?
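
For instance -- an illustrative sketch assuming the uapi additions
above are applied, with minimal error handling -- reading the first
2048-bit slice of Z0 from userspace might look like:

#include <err.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static void read_z0(int vcpu_fd)
{
	__u64 z0[32];	/* one 2048-bit slice */
	struct kvm_one_reg r = {
		.id   = KVM_REG_ARM64_SVE_ZREG(0, 0),
		.addr = (__u64)(unsigned long)z0,
	};

	if (ioctl(vcpu_fd, KVM_GET_ONE_REG, &r) < 0)
		err(1, "KVM_GET_ONE_REG");
}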

Thanks,
-Christoffer

> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm64/include/uapi/asm/kvm.h |  10 ++
>  arch/arm64/kvm/guest.c            | 219 +++++++++++++++++++++++++++++++++++---
>  2 files changed, 216 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 4e76630..f54a9b0 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -213,6 +213,16 @@ struct kvm_arch_memory_slot {
>  					 KVM_REG_ARM_FW | ((r) & 0xffff))
>  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
>  
> +/* SVE registers */
> +#define KVM_REG_ARM64_SVE		(0x15 << KVM_REG_ARM_COPROC_SHIFT)
> +#define KVM_REG_ARM64_SVE_ZREG(n, i)	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
> +					 KVM_REG_SIZE_U2048 |		\
> +					 ((n) << 5) | (i))
> +#define KVM_REG_ARM64_SVE_PREG(n, i)	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
> +					 KVM_REG_SIZE_U256 |		\
> +					 ((n) << 5) | (i) | 0x400)
> +#define KVM_REG_ARM64_SVE_FFR(i)	KVM_REG_ARM64_SVE_PREG(16, i)
> +
>  /* Device Control API: ARM VGIC */
>  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
>  #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS	1
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 4a9d77c..005394b 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -23,14 +23,19 @@
>  #include <linux/err.h>
>  #include <linux/kvm_host.h>
>  #include <linux/module.h>
> +#include <linux/uaccess.h>
>  #include <linux/vmalloc.h>
>  #include <linux/fs.h>
> +#include <linux/stddef.h>
>  #include <kvm/arm_psci.h>
>  #include <asm/cputype.h>
>  #include <linux/uaccess.h>
> +#include <asm/fpsimd.h>
>  #include <asm/kvm.h>
>  #include <asm/kvm_emulate.h>
>  #include <asm/kvm_coproc.h>
> +#include <asm/kvm_host.h>
> +#include <asm/sigcontext.h>
>  
>  #include "trace.h"
>  
> @@ -57,6 +62,106 @@ static u64 core_reg_offset_from_id(u64 id)
>  	return id & ~(KVM_REG_ARCH_MASK | KVM_REG_SIZE_MASK | KVM_REG_ARM_CORE);
>  }
>  
> +static bool is_zreg(const struct kvm_one_reg *reg)
> +{
> +	return	reg->id >= KVM_REG_ARM64_SVE_ZREG(0, 0) &&
> +		reg->id <= KVM_REG_ARM64_SVE_ZREG(SVE_NUM_ZREGS, 0x1f);
> +}
> +
> +static bool is_preg(const struct kvm_one_reg *reg)
> +{
> +	return	reg->id >= KVM_REG_ARM64_SVE_PREG(0, 0) &&
> +		reg->id <= KVM_REG_ARM64_SVE_FFR(0x1f);
> +}
> +
> +static unsigned int sve_reg_num(const struct kvm_one_reg *reg)
> +{
> +	return (reg->id >> 5) & 0x1f;
> +}
> +
> +static unsigned int sve_reg_index(const struct kvm_one_reg *reg)
> +{
> +	return reg->id & 0x1f;
> +}
> +
> +struct reg_bounds_struct {
> +	char *kptr;
> +	size_t start_offset;
> +	size_t copy_count;
> +	size_t flush_count;
> +};
> +
> +static int copy_bounded_reg_to_user(void __user *uptr,
> +				    const struct reg_bounds_struct *b)
> +{
> +	if (copy_to_user(uptr, b->kptr, b->copy_count) ||
> +	    clear_user((char __user *)uptr + b->copy_count, b->flush_count))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +static int copy_bounded_reg_from_user(const struct reg_bounds_struct *b,
> +				      const void __user *uptr)
> +{
> +	if (copy_from_user(b->kptr, uptr, b->copy_count))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +static int fpsimd_vreg_bounds(struct reg_bounds_struct *b,
> +			      struct kvm_vcpu *vcpu,
> +			      const struct kvm_one_reg *reg)
> +{
> +	const size_t stride = KVM_REG_ARM_CORE_REG(fp_regs.vregs[1]) -
> +				KVM_REG_ARM_CORE_REG(fp_regs.vregs[0]);
> +	const size_t start = KVM_REG_ARM_CORE_REG(fp_regs.vregs[0]);
> +	const size_t limit = KVM_REG_ARM_CORE_REG(fp_regs.vregs[32]);
> +
> +	const u64 uoffset = core_reg_offset_from_id(reg->id);
> +	size_t usize = KVM_REG_SIZE(reg->id);
> +	size_t start_vreg, end_vreg;
> +
> +	if (WARN_ON((reg->id & KVM_REG_ARM_COPROC_MASK) != KVM_REG_ARM_CORE))
> +		return -ENOENT;
> +
> +	if (usize % sizeof(u32))
> +		return -EINVAL;
> +
> +	usize /= sizeof(u32);
> +
> +	if ((uoffset <= start && usize <= start - uoffset) ||
> +	    uoffset >= limit)
> +		return -ENOENT;	/* not a vreg */
> +
> +	BUILD_BUG_ON(uoffset > limit);
> +	if (uoffset < start || usize > limit - uoffset)
> +		return -EINVAL;	/* overlaps vregs[] bounds */
> +
> +	start_vreg = (uoffset - start) / stride;
> +	end_vreg = ((uoffset - start) + usize - 1) / stride;
> +	if (start_vreg != end_vreg)
> +		return -EINVAL;	/* spans multiple vregs */
> +
> +	b->start_offset = ((uoffset - start) % stride) * sizeof(u32);
> +	b->copy_count = usize * sizeof(u32);
> +	b->flush_count = 0;
> +
> +	if (vcpu_has_sve(&vcpu->arch)) {
> +		const unsigned int vq = sve_vq_from_vl(vcpu->arch.sve_max_vl);
> +
> +		b->kptr = vcpu->arch.sve_state;
> +		b->kptr += (SVE_SIG_ZREG_OFFSET(vq, start_vreg) -
> +			    SVE_SIG_REGS_OFFSET);
> +	} else {
> +		b->kptr = (char *)&vcpu_gp_regs(vcpu)->fp_regs.vregs[
> +				start_vreg];
> +	}
> +
> +	return 0;
> +}
> +
>  static int get_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  {
>  	/*
> @@ -65,11 +170,20 @@ static int get_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	 * array. Hence below, nr_regs is the number of entries, and
>  	 * off the index in the "array".
>  	 */
> +	int err;
> +	struct reg_bounds_struct b;
>  	__u32 __user *uaddr = (__u32 __user *)(unsigned long)reg->addr;
>  	struct kvm_regs *regs = vcpu_gp_regs(vcpu);
>  	int nr_regs = sizeof(*regs) / sizeof(__u32);
>  	u32 off;
>  
> +	err = fpsimd_vreg_bounds(&b, vcpu, reg);
> +	switch (err) {
> +	case 0:		return copy_bounded_reg_to_user(uaddr, &b);
> +	case -ENOENT:	break;	/* not an FPSIMD vreg */
> +	default:	return err;
> +	}
> +
>  	/* Our ID is an index into the kvm_regs struct. */
>  	off = core_reg_offset_from_id(reg->id);
>  	if (off >= nr_regs ||
> @@ -84,14 +198,23 @@ static int get_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  
>  static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  {
> +	int err;
> +	struct reg_bounds_struct b;
>  	__u32 __user *uaddr = (__u32 __user *)(unsigned long)reg->addr;
>  	struct kvm_regs *regs = vcpu_gp_regs(vcpu);
>  	int nr_regs = sizeof(*regs) / sizeof(__u32);
>  	__uint128_t tmp;
>  	void *valp = &tmp;
>  	u64 off;
> -	int err = 0;
>  
> +	err = fpsimd_vreg_bounds(&b, vcpu, reg);
> +	switch (err) {
> +	case 0:		return copy_bounded_reg_from_user(&b, uaddr);
> +	case -ENOENT:	break;	/* not an FPSIMD vreg */
> +	default:	return err;
> +	}
> +
> +	err = 0;
>  	/* Our ID is an index into the kvm_regs struct. */
>  	off = core_reg_offset_from_id(reg->id);
>  	if (off >= nr_regs ||
> @@ -130,6 +253,78 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	return err;
>  }
>  
> +static int sve_reg_bounds(struct reg_bounds_struct *b,
> +			  const struct kvm_vcpu *vcpu,
> +			  const struct kvm_one_reg *reg)
> +{
> +	unsigned int n = sve_reg_num(reg);
> +	unsigned int i = sve_reg_index(reg);
> +	unsigned int vl = vcpu->arch.sve_max_vl;
> +	unsigned int vq = sve_vq_from_vl(vl);
> +	unsigned int start, copy_limit, limit;
> +
> +	b->kptr = vcpu->arch.sve_state;
> +	if (is_zreg(reg)) {
> +		b->kptr += SVE_SIG_ZREG_OFFSET(vq, n) - SVE_SIG_REGS_OFFSET;
> +		start = i * 0x100;
> +		limit = start + 0x100;
> +		copy_limit = vl;
> +	} else if (is_preg(reg)) {
> +		b->kptr += SVE_SIG_PREG_OFFSET(vq, n) - SVE_SIG_REGS_OFFSET;
> +		start = i * 0x20;
> +		limit = start + 0x20;
> +		copy_limit = vl / 8;
> +	} else {
> +		WARN_ON(1);
> +		start = 0;
> +		copy_limit = limit = 0;
> +	}
> +
> +	b->kptr += start;
> +
> +	if (copy_limit < start)
> +		copy_limit = start;
> +	else if (copy_limit > limit)
> +		copy_limit = limit;
> +
> +	b->copy_count = copy_limit - start;
> +	b->flush_count = limit - copy_limit;
> +
> +	return 0;
> +}
> +
> +static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> +{
> +	int ret;
> +	struct reg_bounds_struct b;
> +	char __user *uptr = (char __user *)reg->addr;
> +
> +	if (!vcpu_has_sve(&vcpu->arch))
> +		return -ENOENT;
> +
> +	ret = sve_reg_bounds(&b, vcpu, reg);
> +	if (ret)
> +		return ret;
> +
> +	return copy_bounded_reg_to_user(uptr, &b);
> +}
> +
> +static int set_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> +{
> +	int ret;
> +	struct reg_bounds_struct b;
> +	char __user *uptr = (char __user *)reg->addr;
> +
> +	if (!vcpu_has_sve(&vcpu->arch))
> +		return -ENOENT;
> +
> +	ret = sve_reg_bounds(&b, vcpu, reg);
> +	if (ret)
> +		return ret;
> +
> +	return copy_bounded_reg_from_user(&b, uptr);
> +}
> +
>  int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
>  {
>  	return -EINVAL;
> @@ -251,12 +446,11 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
>  		return -EINVAL;
>  
> -	/* Register group 16 means we want a core register. */
> -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> -		return get_core_reg(vcpu, reg);
> -
> -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> -		return kvm_arm_get_fw_reg(vcpu, reg);
> +	switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
> +	case KVM_REG_ARM_CORE:	return get_core_reg(vcpu, reg);
> +	case KVM_REG_ARM_FW:	return kvm_arm_get_fw_reg(vcpu, reg);
> +	case KVM_REG_ARM64_SVE:	return get_sve_reg(vcpu, reg);
> +	}
>  
>  	if (is_timer_reg(reg->id))
>  		return get_timer_reg(vcpu, reg);
> @@ -270,12 +464,11 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
>  		return -EINVAL;
>  
> -	/* Register group 16 means we set a core register. */
> -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> -		return set_core_reg(vcpu, reg);
> -
> -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> -		return kvm_arm_set_fw_reg(vcpu, reg);
> +	switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
> +	case KVM_REG_ARM_CORE:	return set_core_reg(vcpu, reg);
> +	case KVM_REG_ARM_FW:	return kvm_arm_set_fw_reg(vcpu, reg);
> +	case KVM_REG_ARM64_SVE:	return set_sve_reg(vcpu, reg);
> +	}
>  
>  	if (is_timer_reg(reg->id))
>  		return set_timer_reg(vcpu, reg);
> -- 
> 2.1.4
> 
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 16/16] KVM: arm64/sve: Report and enable SVE API extensions for userspace
  2018-07-26 13:18           ` Dave Martin
@ 2018-08-06 13:41             ` Christoffer Dall
  -1 siblings, 0 replies; 178+ messages in thread
From: Christoffer Dall @ 2018-08-06 13:41 UTC (permalink / raw)
  To: Dave Martin
  Cc: Christoffer Dall, Ard Biesheuvel, Marc Zyngier, Catalin Marinas,
	Will Deacon, Okamoto Takayuki, kvmarm, linux-arm-kernel

On Thu, Jul 26, 2018 at 02:18:02PM +0100, Dave Martin wrote:
> On Wed, Jul 25, 2018 at 06:52:56PM +0200, Andrew Jones wrote:
> > On Wed, Jul 25, 2018 at 04:27:49PM +0100, Dave Martin wrote:
> > > On Thu, Jul 19, 2018 at 04:59:21PM +0200, Andrew Jones wrote:
> > > > On Thu, Jun 21, 2018 at 03:57:40PM +0100, Dave Martin wrote:
> > > > > -	/*
> > > > > -	 * For now, we don't return any features.
> > > > > -	 * In future, we might use features to return target
> > > > > -	 * specific features available for the preferred
> > > > > -	 * target type.
> > > > > -	 */
> > > > > +	/* KVM_ARM_VCPU_SVE understood by KVM_VCPU_INIT */
> > > > > +	init->features[0] = 1 << KVM_ARM_VCPU_SVE;
> > > > > +
> > > > 
> > > > We shouldn't need to do this. The "preferred" target type isn't defined
> > > > well (that I know of), but IMO it should probably be the target that
> > > > best matches the host, minus optional features. The best base target. We
> > > > may use these features to convey that the preferred target should enable
> > > > some optional feature if that feature is necessary to workaround a bug,
> > > > i.e. using the "feature" bit as an erratum bit someday, but that'd be
> > > > quite a debatable use, so maybe not even that. Most likely we'll never
> > > > need to add features here.
> > > 
> > > init->features[] has no semantics yet so we can define it how we like,
> > > but I agree that the way I use it here is not necessarily the most
> > > natural.
> > > 
> > > OTOH, we cannot use features[] for "mandatory" features like erratum
> > > workarounds, because current userspace just ignores these bits.
> > 
> > It would have to learn to look here if that's how we started using it,
> > but it'd be better to invent something else that wouldn't appear as
> > abusive if we're going to teach userspace new stuff anyway.
> > 
> > > 
> > > Rather, these bits would be for features that are considered beneficial
> > > but must be off by default (due to incompatibility risks across nodes,
> > > or due to ABI impacts).  Just blindly using the preferred target
> > > already risks configuring a vcpu that won't work across all nodes in
> > > your cluster.
> > 
> > KVM usually advertises optional features through capabilities. A device
> > (vcpu device, in this case) ioctl can also be used to check for feature
> > availability.
> > 
> > > 
> > > So I'm not convinced that there is any useful interpretation of
> > > features[] unless we interpret it as suggested in this patch.
> > > 
> > > Can you elaborate why you think it should be used with a more
> > > concrete example?
> > 
> > I'm advocating that it *not* be used here. I think it should be used
> > like the PMU feature uses it - and the PMU feature doesn't set a bit
> > here.
> > 
> > > 
> > > > That said, I think defining the feature bit makes sense. ATM, I'm feeling
> > > > like we'll want to model the user interface for SVE like PMU (using VCPU
> > > > device ioctls).
> > > 
> > > Some people expressed concerns about the ioctls becoming order-sensitive.
> > > 
> > > In the SVE case we don't want people enabling/disabling/reconfiguring
> > > "silicon" features like SVE after the vcpu starts executing.
> > > 
> > > We will need an extra ioctl() for configuring the allowed SVE vector
> > > lengths though.  I don't see a way around that.  So maybe we have to
> > > solve the ordering problem anyway.
> > 
> > Yes, that's why I'm thinking that vcpu device ioctls are probably the
> > right way to go. The SVE group can have its own "finalize" request that
> > allows all other SVE ioctls to be in any order prior to it.
> > 
> > > 
> > > 
> > > My current approach (not in this series) was to have VCPU_INIT return
> > > -EINPROGRESS or similar if SVE is enabled in features[]: this indicates
> > > that certain setup ioctls are required before the vcpu can run.
> > > 
> > > This may be overkill / not the best approach though.  I can look at
> > > vcpu device ioctls as an alternative.
> > 
> > With a "finalize" attribute if SVE isn't finalized by VCPU_INIT or
> > KVM_RUN time, then SVE just won't be enabled for that VCPU.
> 
> So I suppose we could do something like this:
> 
>  * Advertise SVE availability through a vcpu device capability (I need
>    to check how that works).
> 
>  * SVE-aware userspace that understands SVE can do the relevant
>    vcpu device ioctls to configure SVE and turn it on: these are only
>    permitted before the vcpu runs.  We might require an explicit
>    "finish SVE setup" ioctl to be issued before the vcpu can run.
> 
>  * Finally, the vcpu is set running by userspace as normal.
> 
> Marc or Christoffer previously objected that this may be an
> abuse of vcpu device ioctls, because SVE is a CPU feature rather than a
> device.  I guess it depends on how you define "device" -- I'm not sure
> where to draw the line.

I initially advocated for a VCPU device ioctl as well, because it's a
less crowded number space that gives you more flexibility.  Marc did
have a strong point that vcpu *devices* implies something else than
features though.

I think you (a) definitely want to announce SVE support via a
capability, and (b) only set the preferred target flag if enabling SVE
*generally* gives you a VM more like the real hardware with similar
performance on some system.

I'm personally fine with both feature flags and vcpu device ioctls.  If
using vcpu device ioctls gives you an obvious way to set attributes
relating to SVE, e.g. the vector length, then I think that's a strong
argument for that approach.
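
For concreteness, from userspace that could look roughly like the
sketch below.  This is purely hypothetical: KVM_{HAS,SET}_DEVICE_ATTR
exist today, but the SVE group and attribute names here are invented
for illustration.

__u64 max_vl = 256;	/* requested maximum vector length, in bytes */
struct kvm_device_attr attr = {
	.group	= KVM_ARM_VCPU_SVE_CTRL,	/* hypothetical */
	.attr	= KVM_ARM_VCPU_SVE_MAX_VL,	/* hypothetical */
	.addr	= (__u64)(unsigned long)&max_vl,
};

/* Probe for support, then configure before the vcpu first runs. */
if (!ioctl(vcpu_fd, KVM_HAS_DEVICE_ATTR, &attr))
	ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);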

Whatever you do, please document this in
Documentation/virtual/kvm/api.txt and related files, such as
Documentation/virtual/kvm/devices/vcpu.txt.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 09/16] KVM: arm64: Allow ID registers to by dynamically read-as-zero
  2018-08-06 13:03     ` Christoffer Dall
@ 2018-08-07 11:09       ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-08-07 11:09 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Mon, Aug 06, 2018 at 03:03:24PM +0200, Christoffer Dall wrote:
> Hi Dave,
> 
> I think there's a typo in the subject "to be" rather than "to by".
> 
> On Thu, Jun 21, 2018 at 03:57:33PM +0100, Dave Martin wrote:
> > When a feature-dependent ID register is hidden from the guest, it
> > needs to exhibit read-as-zero behaviour as defined by the Arm
> > architecture, rather than appearing to be entirely absent.
> > 
> > This patch updates the ID register emulation logic to make use of
> > the new check_present() method to determine whether the register
> > should read as zero instead of yielding the host's sanitised
> > value.  Because currently a false result from this method truncates
> > the trap call chain before the sysreg's emulate() method is called,
> > a flag is added to distinguish this special case, and helpers are
> > refactored appropriately.
> 
> I don't understand this last sentence.
> 
> And I'm not really sure I understand the code either.
> 
> I can't seem to see any registers which are defined as !present && !raz,
> which is what I thought this feature was all about.

!present && !raz is the default behaviour for everything that is not
ID-register-like.  This patch is adding the !present && raz case (though
that may not be a helpful way to describe it ... see below).

> In other words, what is the benefit of this more generic method as
> opposed to having a wrapper around read_id_reg() for read_sve_id_reg()
> which sets RAZ if there is no support for SVE in this context?

There may be other ways to factor this.  I can't now remember why I
went with this particular approach, except that I vaguely recall
hitting some obstacles when doing things another way.

Can you take a look at my attempted explanation below and then we
can reconsider this?

[...]

> 
> > 
> > This involves some trivial updates to pass the vcpu pointer down
> > into the ID register emulation/access functions.
> > 
> > A new ID_SANITISED_IF() macro is defined for declaring
> > conditionally visible ID registers.
> > 
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >  arch/arm64/kvm/sys_regs.c | 51 ++++++++++++++++++++++++++++++-----------------
> >  arch/arm64/kvm/sys_regs.h | 11 ++++++++++
> >  2 files changed, 44 insertions(+), 18 deletions(-)
> > 

[...]

> > @@ -1840,7 +1855,7 @@ static int emulate_cp(struct kvm_vcpu *vcpu,
> >  
> >  	r = find_reg(params, table, num);
> >  
> > -	if (likely(r) && sys_reg_present(vcpu, r)) {
> > +	if (likely(r) && sys_reg_present_or_raz(vcpu, r)) {
> >  		perform_access(vcpu, params, r);
> >  		return 0;
> >  	}
> > @@ -2016,7 +2031,7 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
> >  	if (!r)
> >  		r = find_reg(params, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
> >  
> > -	if (likely(r) && sys_reg_present(vcpu, r)) {
> > +	if (likely(r) && sys_reg_present_or_raz(vcpu, r)) {
> >  		perform_access(vcpu, params, r);
> >  	} else {
> >  		kvm_err("Unsupported guest sys_reg access at: %lx\n",
> > @@ -2313,7 +2328,7 @@ int kvm_arm_sys_reg_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
> >  	if (!r)
> >  		return get_invariant_sys_reg(reg->id, uaddr);
> >  
> > -	if (!sys_reg_present(vcpu, r))
> > +	if (!sys_reg_present_or_raz(vcpu, r))
> >  		return -ENOENT;
> >  
> >  	if (r->get_user)
> > @@ -2337,7 +2352,7 @@ int kvm_arm_sys_reg_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
> >  	if (!r)
> >  		return set_invariant_sys_reg(reg->id, uaddr);
> >  
> > -	if (!sys_reg_present(vcpu, r))
> > +	if (!sys_reg_present_or_raz(vcpu, r))
> >  		return -ENOENT;

On Wed, Jul 25, 2018 at 04:46:55PM +0100, Alex Bennée wrote:
> It's all very well being raz, but shouldn't you catch this further down
> and not attempt to write the register that doesn't exist?

To be clear, is this a question about factoring, or do you think there's
a bug here?


In response to both sets of comments, I think the way the code is
factored is causing some confusion.

The idea in my head was something like this:

System register encodings fall into two classes:

 a) encodings that we emulate in some way
 b) encodings that we unconditionally reflect back to the guest as an
    Undef.

Architecturally defined system registers fall into two classes:

 i) registers whose removal turns all accesses into an Undef
 ii) registers whose removal exhibits some other behaviour.

These two classifications overlap somewhat.


From an emulation perspective, (b), and (i) in the "register not
present" case, look the same: we trap the register and reflect an Undef
directly back to the guest with no further action required.

From an emulation perspective, (a) and (ii) are also somewhat the
same: we need to emulate something, although precisely what we need
to do depends on which register it is and on whether the register is
deemed present or not.

sys_reg_check_present_or_raz() thus means "falls under (a) or (ii)",
i.e., some emulation is required and we need to call sysreg-specific
methods to figure out precisely what we need to do.

Conversely !sys_reg_check_present_or_raz() means that we can just
Undef the guest with no further logic required.
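
Reduced to code, the test amounts to something like this illustrative
sketch (not the actual helpers from the series; sys_reg_raz() stands
in for the RAZ-flag check):

static bool sys_reg_present_or_raz(const struct kvm_vcpu *vcpu,
				   const struct sys_reg_desc *r)
{
	/*
	 * Cases (a) and (ii) need some emulation, so we go on to call
	 * the sysreg-specific methods; anything else -- (b), and (i)
	 * for an absent register -- can Undef the guest immediately.
	 */
	return sys_reg_present(vcpu, r) || sys_reg_raz(vcpu, r);
}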

Does this rationale make things clearer?  The naming is perhaps
unfortunate.

Cheers,
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 12/16] KVM: arm64/sve: Context switch the SVE registers
  2018-08-06 13:19     ` Christoffer Dall
@ 2018-08-07 11:15       ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-08-07 11:15 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Mon, Aug 06, 2018 at 03:19:10PM +0200, Christoffer Dall wrote:
> On Thu, Jun 21, 2018 at 03:57:36PM +0100, Dave Martin wrote:
> > In order to give each vcpu its own view of the SVE registers, this
> > patch adds context storage via a new sve_state pointer in struct
> > vcpu_arch.  An additional member sve_max_vl is also added for each
> > vcpu, to determine the maximum vector length visible to the guest
> > and thus the value to be configured in ZCR_EL2.LEN while the vcpu is
> > active.  This also determines the layout and size of the storage in
> > sve_state, which is read and written by the same backend functions
> > that are used for context-switching the SVE state for host tasks.
> > 
> > On SVE-enabled vcpus, SVE access traps are now handled by switching
> > in the vcpu's SVE context and disabling the trap before returning
> > to the guest.  On other vcpus, the trap is not handled and an exit
> > back to the host occurs, where the handle_sve() fallback path
> > reflects an undefined instruction exception back to the guest,
> > consistently with the behaviour of non-SVE-capable hardware (as was
> > done unconditionally prior to this patch).
> > 
> > No SVE handling is added on non-VHE-only paths, since VHE is an
> > architectural and Kconfig prerequisite of SVE.
> > 
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >  arch/arm64/include/asm/kvm_host.h |  2 ++
> >  arch/arm64/kvm/fpsimd.c           |  5 +++--
> >  arch/arm64/kvm/hyp/switch.c       | 43 ++++++++++++++++++++++++++++++---------
> >  3 files changed, 38 insertions(+), 12 deletions(-)

[...]

> > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c

[...]

> > @@ -361,7 +373,13 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
> >  		vcpu->arch.flags &= ~KVM_ARM64_FP_HOST;
> >  	}
> >  
> > -	__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
> > +	if (system_supports_sve() && guest_has_sve)
> > +		sve_load_state((char *)vcpu->arch.sve_state +
> > +					sve_ffr_offset(vcpu->arch.sve_max_vl),
> 
> nit: would it make sense to have a macro 'vcpu_get_sve_state_ptr(vcpu)'
> to make this first argument more pretty?

Could do, I guess.  I'll take a look.

> 
> > +			       &vcpu->arch.ctxt.gp_regs.fp_regs.fpsr,
> > +			       sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1);
> > +	else
> > +		__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
> >  
> >  	/* Skip restoring fpexc32 for AArch64 guests */
> >  	if (!(read_sysreg(hcr_el2) & HCR_RW))
> > @@ -380,6 +398,8 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
> >   */
> >  static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
> >  {
> > +	bool guest_has_sve;
> > +
> >  	if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ)
> >  		vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
> >  
> > @@ -397,10 +417,13 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
> >  	 * and restore the guest context lazily.
> >  	 * If FP/SIMD is not implemented, handle the trap and inject an
> >  	 * undefined instruction exception to the guest.
> > +	 * Similarly for trapped SVE accesses.
> >  	 */
> > -	if (system_supports_fpsimd() &&
> > -	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
> > -		return __hyp_switch_fpsimd(vcpu);
> > +	guest_has_sve = vcpu_has_sve(&vcpu->arch);
> > +	if ((system_supports_fpsimd() &&
> > +	     kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD) ||
> > +	    (guest_has_sve && kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SVE))
> 
> nit: this may also be folded nicely into a static bool
> __trap_fpsimd_sve_access() check.

It wouldn't hurt to make this look less fiddly, certainly.

Can you elaborate on precisely what you had in mind?

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 14/16] KVM: arm64/sve: Add SVE support to register access ioctl interface
  2018-08-06 13:25     ` Christoffer Dall
@ 2018-08-07 11:17       ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-08-07 11:17 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Mon, Aug 06, 2018 at 03:25:57PM +0200, Christoffer Dall wrote:
> On Thu, Jun 21, 2018 at 03:57:38PM +0100, Dave Martin wrote:
> > This patch adds the following registers for access via the
> > KVM_{GET,SET}_ONE_REG interface:
> > 
> >  * KVM_REG_ARM64_SVE_ZREG(n, i) (n = 0..31) (in 2048-bit slices)
> >  * KVM_REG_ARM64_SVE_PREG(n, i) (n = 0..15) (in 256-bit slices)
> >  * KVM_REG_ARM64_SVE_FFR(i) (in 256-bit slices)
> > 
> > In order to adapt gracefully to future architectural extensions,
> > the registers are divided up into slices as noted above:  the i
> > parameter denotes the slice index.
> > 
> > For simplicity, bits or slices that exceed the maximum vector
> > length supported for the vcpu are ignored for KVM_SET_ONE_REG, and
> > read as zero for KVM_GET_ONE_REG.
> > 
> > For the current architecture, only slice i = 0 is significant.  The
> > interface design allows i to increase to up to 31 in the future if
> > required by future architectural amendments.
> > 
> > The registers are only visible for vcpus that have SVE enabled.
> > They are not enumerated by KVM_GET_REG_LIST on vcpus that do not
> > have SVE.  In all cases, surplus slices are not enumerated by
> > KVM_GET_REG_LIST.
> > 
> > Accesses to the FPSIMD registers via KVM_REG_ARM_CORE are
> > redirected to access the underlying vcpu SVE register storage as
> > appropriate.  In order to make this more straightforward, register
> > accesses that straddle register boundaries are no longer guaranteed
> > to succeed.  (Support for such use was never deliberate, and
> > userspace does not currently seem to be relying on it.)
> 
> Could you add documentation to Documentation/virtual/kvm/api.txt for
> this as well under the KVM_SET_ONE_REG definitions explaining the use
> for arm64 ?

I plan to add documentation for all these additions, but didn't want
to do that while the API was still subject to change, to avoid having
to write it twice.

So, the documentation should be in the next spin if I consider it
mature enough.
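
In the meantime, a rough sketch of the intended userspace usage,
since it follows the standard one-reg pattern (vcpu_fd is a
placeholder here, and error handling is elided):

	__u64 z0[32];	/* one 2048-bit slice of Z0 */
	struct kvm_one_reg reg = {
		.id   = KVM_REG_ARM64_SVE_ZREG(0, 0),	/* Z0, slice 0 */
		.addr = (__u64)&z0,
	};

	if (ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg))
		/* expected to fail if the vcpu has no SVE */;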

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 00/16] KVM: arm64: Initial support for SVE guests
  2018-08-06 13:05   ` Christoffer Dall
@ 2018-08-07 11:18     ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-08-07 11:18 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Mon, Aug 06, 2018 at 03:05:00PM +0200, Christoffer Dall wrote:
> On Thu, Jun 21, 2018 at 03:57:24PM +0100, Dave Martin wrote:
> > This series implements basic support for allowing KVM guests to use the
> > Arm Scalable Vector Extension (SVE).
> > 
> > The patches are based on torvalds/master f5b7769e (Revert "debugfs:
> > inode: debugfs_create_dir uses mode permission from parent") plus the
> > patches from [1].
> 
> Given the effort required to go fetch another patch set from the list to
> apply to a specific commit, and the size of this patch set, it would be
> helpful to have a pointer to a branch with everything in it.

Fair enough.  This probably won't be an issue for the next spin, in
any case, but I can push a branch somewhere otherwise.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 16/16] KVM: arm64/sve: Report and enable SVE API extensions for userspace
  2018-08-06 13:41             ` Christoffer Dall
@ 2018-08-07 11:23               ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-08-07 11:23 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Mon, Aug 06, 2018 at 03:41:33PM +0200, Christoffer Dall wrote:
> On Thu, Jul 26, 2018 at 02:18:02PM +0100, Dave Martin wrote:
> > On Wed, Jul 25, 2018 at 06:52:56PM +0200, Andrew Jones wrote:
> > > On Wed, Jul 25, 2018 at 04:27:49PM +0100, Dave Martin wrote:
> > > > On Thu, Jul 19, 2018 at 04:59:21PM +0200, Andrew Jones wrote:
> > > > > On Thu, Jun 21, 2018 at 03:57:40PM +0100, Dave Martin wrote:
> > > > > > -	/*
> > > > > > -	 * For now, we don't return any features.
> > > > > > -	 * In future, we might use features to return target
> > > > > > -	 * specific features available for the preferred
> > > > > > -	 * target type.
> > > > > > -	 */
> > > > > > +	/* KVM_ARM_VCPU_SVE understood by KVM_VCPU_INIT */
> > > > > > +	init->features[0] = 1 << KVM_ARM_VCPU_SVE;
> > > > > > +
> > > > > 
> > > > > We shouldn't need to do this. The "preferred" target type isn't defined
> > > > > well (that I know of), but IMO it should probably be the target that
> > > > > best matches the host, minus optional features. The best base target. We
> > > > > may use these features to convey that the preferred target should enable
> > > > > some optional feature if that feature is necessary to workaround a bug,
> > > > > i.e. using the "feature" bit as an erratum bit someday, but that'd be
> > > > > quite a debatable use, so maybe not even that. Most likely we'll never
> > > > > need to add features here.
> > > > 
> > > > init->features[] has no semantics yet so we can define it how we like,
> > > > but I agree that the way I use it here is not necessarily the most
> > > > natural.
> > > > 
> > > > OTOH, we cannot use features[] for "mandatory" features like erratum
> > > > workarounds, because current userspace just ignores these bits.
> > > 
> > > It would have to learn to look here if that's how we started using it,
> > > but it'd be better to invent something else that wouldn't appear as
> > > abusive if we're going to teach userspace new stuff anyway.
> > > 
> > > > 
> > > > Rather, these bits would be for features that are considered beneficial
> > > > but must be off by default (due to incompatibility risks across nodes,
> > > > or due to ABI impacts).  Just blindly using the preferred target
> > > > already risks configuring a vcpu that won't work across all nodes in
> > > > your cluster.
> > > 
> > > KVM usually advertises optional features through capabilities. A device
> > > (vcpu device, in this case) ioctl can also be used to check for feature
> > > availability.
> > > 
> > > > 
> > > > So I'm not convinced that there is any useful interpretation of
> > > > features[] unless we interpret it as suggested in this patch.
> > > > 
> > > > Can you elaborate why you think it should be used with a more
> > > > concrete example?
> > > 
> > > I'm advocating that it *not* be used here. I think it should be used
> > > like the PMU feature uses it - and the PMU feature doesn't set a bit
> > > here.
> > > 
> > > > 
> > > > > That said, I think defining the feature bit makes sense. ATM, I'm feeling
> > > > > like we'll want to model the user interface for SVE like PMU (using VCPU
> > > > > device ioctls).
> > > > 
> > > > Some people expressed concerns about the ioctls becoming order-sensitive.
> > > > 
> > > > In the SVE case we don't want people enabling/disabling/reconfiguring
> > > > "silicon" features like SVE after the vcpu starts executing.
> > > > 
> > > > We will need an extra ioctl() for configuring the allowed SVE vector
> > > > lengths though.  I don't see a way around that.  So maybe we have to
> > > > solve the ordering problem anyway.
> > > 
> > > Yes, that's why I'm thinking that the vcpu device ioctls is probably the
> > > right way to go. The SVE group can have its own "finalize" request that
> > > allows all other SVE ioctls to be in any order prior to it.
> > > 
> > > > 
> > > > 
> > > > My current approach (not in this series) was to have VCPU_INIT return
> > > > -EINPROGRESS or similar if SVE is enabled in features[]: this indicates
> > > > that certain setup ioctls are required before the vcpu can run.
> > > > 
> > > > This may be overkill / not the best approach though.  I can look at
> > > > vcpu device ioctls as an alternative.
> > > 
> > > With a "finalize" attribute if SVE isn't finalized by VCPU_INIT or
> > > KVM_RUN time, then SVE just won't be enabled for that VCPU.
> > 
> > So I suppose we could do something like this:
> > 
> >  * Advertise SVE availability through a vcpu device capability (I need
> >    to check how that works).
> > 
> >  * SVE-aware userspace that understands SVE can do the relevant
> >    vcpu device ioctls to configure SVE and turn it on: these are only
> >    permitted before the vcpu runs.  We might require an explicit
> >    "finish SVE setup" ioctl to be issued before the vcpu can run.
> > 
> >  * Finally, the vcpu is set running by userspace as normal.
> > 
> > Marc or Christoffer was objecting to me previously that this may be an
> > abuse of vcpu device ioctls, because SVE is a CPU feature rather than a
> > device.  I guess it depends on how you define "device" -- I'm not sure
> > where to draw the line.
> 
> I initially advocated for a VCPU device ioctl as well, because it's a
> less crowded number space that gives you more flexibility.  Marc did
> have a strong point that vcpu *devices* implies something else than
> features though.
> 
> I think you (a) definitely want to announce SVE support via a
> capability, and (b) only set the preferred target flag if enabling SVE
> *generally* gives you a VM more like the real hardware with similar
> performance on some system.
> 
> I'm personally fine with both feature flags and vcpu device ioctls.  If
> using vcpu device ioctls gives you an obvious way to set attributes
> relating to SVE, e.g. the vector length, then I think that's a strong
> argument for that approach.

There is another option I'm tending towards, which is simply to have
a "set vector lengths" ioctl (whether presented as a vcpu device
ioctl or a random arch ioctl).

If that ioctl() fails then SVE support is not available.

If it succeeds, it will update its arguments to indicate which
vector lengths are enabled (if different).

Old userspace, or userspace that doesn't want to use SVE, would
not use this ioctl at all.

It would also do no harm additionally to advertise this as a
capability, though I wonder whether it's necessary to do so (?)
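
To illustrate the sort of shape I have in mind -- every name and
number below is invented for illustration, nothing is decided:

	/* hypothetical UAPI, not part of this series */
	struct kvm_sve_vls {
		__u64 vqs;	/* bit (vq - 1): vector length 16 * vq bytes,
				 * requested on entry, enabled on return */
	};

	struct kvm_sve_vls vls = { .vqs = 0x3 };	/* VLs of 16 and 32 bytes */

	if (ioctl(vcpu_fd, KVM_ARM_SVE_SET_VLS, &vls))
		/* SVE unavailable, or request unsatisfiable */;
	/* on success, vls.vqs now holds the set actually enabled */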

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 08/16] KVM: arm64: Support dynamically hideable system registers
  2018-06-21 14:57   ` Dave Martin
@ 2018-08-07 19:20     ` Christoffer Dall
  -1 siblings, 0 replies; 178+ messages in thread
From: Christoffer Dall @ 2018-08-07 19:20 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jun 21, 2018 at 03:57:32PM +0100, Dave Martin wrote:
> Some system registers may or may not logically exist for a vcpu
> depending on whether certain architectural features are enabled for
> the vcpu.
> 
> In order to avoid spuriously emulating access to these registers
> when they should not exist, or allowing the registers to be
> spuriously enumerated or saved/restored through the ioctl
> interface, a means is needed to allow registers to be hidden
> depending on the vcpu configuration.
> 
> In order to support this in a flexible way, this patch adds a
> check_present() method to struct sys_reg_desc, and updates the
> generic system register access and enumeration code to be aware of
> it:  if check_present() returns false, the code behaves as if the
> register did not exist.
> 
> For convenience, the complete check is wrapped up in a new helper
> sys_reg_present().
> 
> An attempt has been made to hook the new check into the generic
> accessors for trapped system registers.  This should reduce the
> potential for future surprises, although the redundant check will
> add a small cost.  No system register depends on this functionality
> yet, and some paths needing the check may also need attention.
> 
> Naturally, this facility makes sense only for registers that are
> trapped.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm64/kvm/sys_regs.c | 20 +++++++++++++++-----
>  arch/arm64/kvm/sys_regs.h | 11 +++++++++++
>  2 files changed, 26 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index a436373..31a351a 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -1840,7 +1840,7 @@ static int emulate_cp(struct kvm_vcpu *vcpu,
>  
>  	r = find_reg(params, table, num);
>  
> -	if (r) {
> +	if (likely(r) && sys_reg_present(vcpu, r)) {
>  		perform_access(vcpu, params, r);
>  		return 0;
>  	}
> @@ -2016,7 +2016,7 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
>  	if (!r)
>  		r = find_reg(params, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
>  
> -	if (likely(r)) {
> +	if (likely(r) && sys_reg_present(vcpu, r)) {
>  		perform_access(vcpu, params, r);
>  	} else {
>  		kvm_err("Unsupported guest sys_reg access at: %lx\n",

This looks a bit fishy, because it seems that now a guest can be
configured in such a way that it can access non-present emulated system
registers and get the host to tell the operator that the KVM instance
running on the system doesn't really support the hardware...
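
I'd expect the hidden case to be split out so that only genuinely
unknown encodings hit the kvm_err(), something like (untested):

	if (likely(r)) {
		if (!sys_reg_present(vcpu, r)) {
			/* hidden from this vcpu: quiet Undef, no host noise */
			kvm_inject_undefined(vcpu);
			return 1;
		}
		perform_access(vcpu, params, r);
	} else {
		kvm_err("Unsupported guest sys_reg access at: %lx\n",
			*vcpu_pc(vcpu));
		/* ... existing Undef injection ... */
	}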

Thanks,
-Christoffer

> @@ -2313,6 +2313,9 @@ int kvm_arm_sys_reg_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
>  	if (!r)
>  		return get_invariant_sys_reg(reg->id, uaddr);
>  
> +	if (!sys_reg_present(vcpu, r))
> +		return -ENOENT;
> +
>  	if (r->get_user)
>  		return (r->get_user)(vcpu, r, reg, uaddr);
>  
> @@ -2334,6 +2337,9 @@ int kvm_arm_sys_reg_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
>  	if (!r)
>  		return set_invariant_sys_reg(reg->id, uaddr);
>  
> +	if (!sys_reg_present(vcpu, r))
> +		return -ENOENT;
> +
>  	if (r->set_user)
>  		return (r->set_user)(vcpu, r, reg, uaddr);
>  
> @@ -2390,7 +2396,8 @@ static bool copy_reg_to_user(const struct sys_reg_desc *reg, u64 __user **uind)
>  	return true;
>  }
>  
> -static int walk_one_sys_reg(const struct sys_reg_desc *rd,
> +static int walk_one_sys_reg(struct kvm_vcpu *vcpu,
> +			    const struct sys_reg_desc *rd,
>  			    u64 __user **uind,
>  			    unsigned int *total)
>  {
> @@ -2401,6 +2408,9 @@ static int walk_one_sys_reg(const struct sys_reg_desc *rd,
>  	if (!(rd->reg || rd->get_user))
>  		return 0;
>  
> +	if (!sys_reg_present(vcpu, rd))
> +		return 0;
> +
>  	if (!copy_reg_to_user(rd, uind))
>  		return -EFAULT;
>  
> @@ -2429,9 +2439,9 @@ static int walk_sys_regs(struct kvm_vcpu *vcpu, u64 __user *uind)
>  		int cmp = cmp_sys_reg(i1, i2);
>  		/* target-specific overrides generic entry. */
>  		if (cmp <= 0)
> -			err = walk_one_sys_reg(i1, &uind, &total);
> +			err = walk_one_sys_reg(vcpu, i1, &uind, &total);
>  		else
> -			err = walk_one_sys_reg(i2, &uind, &total);
> +			err = walk_one_sys_reg(vcpu, i2, &uind, &total);
>  
>  		if (err)
>  			return err;
> diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
> index cd710f8..dfbb342 100644
> --- a/arch/arm64/kvm/sys_regs.h
> +++ b/arch/arm64/kvm/sys_regs.h
> @@ -22,6 +22,9 @@
>  #ifndef __ARM64_KVM_SYS_REGS_LOCAL_H__
>  #define __ARM64_KVM_SYS_REGS_LOCAL_H__
>  
> +#include <linux/compiler.h>
> +#include <linux/types.h>
> +
>  struct sys_reg_params {
>  	u8	Op0;
>  	u8	Op1;
> @@ -61,8 +64,16 @@ struct sys_reg_desc {
>  			const struct kvm_one_reg *reg, void __user *uaddr);
>  	int (*set_user)(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
>  			const struct kvm_one_reg *reg, void __user *uaddr);
> > +	bool (*check_present)(const struct kvm_vcpu *vcpu,
> +			      const struct sys_reg_desc *rd);
>  };
>  
> +static inline bool sys_reg_present(const struct kvm_vcpu *vcpu,
> +				   const struct sys_reg_desc *rd)
> +{
> +	return likely(!rd->check_present) || rd->check_present(vcpu, rd);
> +}
> +
>  static inline void print_sys_reg_instr(const struct sys_reg_params *p)
>  {
>  	/* Look, we even formatted it for you to paste into the table! */
> -- 
> 2.1.4
> 
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 09/16] KVM: arm64: Allow ID registers to by dynamically read-as-zero
  2018-08-07 11:09       ` Dave Martin
@ 2018-08-07 19:35         ` Christoffer Dall
  -1 siblings, 0 replies; 178+ messages in thread
From: Christoffer Dall @ 2018-08-07 19:35 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Aug 07, 2018 at 12:09:58PM +0100, Dave Martin wrote:
> On Mon, Aug 06, 2018 at 03:03:24PM +0200, Christoffer Dall wrote:
> > Hi Dave,
> > 
> > I think there's a typo in the subject "to be" rather than "to by".
> > 
> > On Thu, Jun 21, 2018 at 03:57:33PM +0100, Dave Martin wrote:
> > > When a feature-dependent ID register is hidden from the guest, it
> > > needs to exhibit read-as-zero behaviour as defined by the Arm
> > > architecture, rather than appearing to be entirely absent.
> > > 
> > > This patch updates the ID register emulation logic to make use of
> > > the new check_present() method to determine whether the register
> > > should read as zero instead of yielding the host's sanitised
> > > value.  Because currently a false result from this method truncates
> > > the trap call chain before the sysreg's emulate method() is called,
> > > a flag is added to distinguish this special case, and helpers are
> > > refactored appropriately.
> > 
> > I don't understand this last sentence.
> > 
> > And I'm not really sure I understand the code either.
> > 
> > I can't seem to see any registers which are defined as !present && !raz,
> > which is what I thought this feature was all about.
> 
> !present and !raz is the default behaviour for everything that is not
> ID-register-like.  This patch is adding the !present && raz case (though
> that may not be a helpful way to describe it ... see below).

Fair enough, but I don't really see why you need to classify a register
as !present && raz, because raz implies present AFAICT.

> 
> > In other words, what is the benefit of this more generic method as
> > opposed to having a wrapper around read_id_reg() for read_sve_id_reg()
> > which sets RAZ if there is no support for SVE in this context?
> 
> There may be other ways to factor this.  I can't now remember why I
> went with this particular approach, except that I vaguely recall
> hitting some obstacles when doing things another way.

What I don't much care for is that we now seem to be mixing the concept
of whether something is present and the value it returns if it is
present in the overall system register handling logic.  And I don't
understand why this is a requirement.

> 
> Can you take a look at my attempted explanation below and then we
> can reconsider this?

Sure, see my comments below.

> 
> [...]
> 
> > 
> > > 
> > > This involves some trivial updates to pass the vcpu pointer down
> > > into the ID register emulation/access functions.
> > > 
> > > A new ID_SANITISED_IF() macro is defined for declaring
> > > conditionally visible ID registers.
> > > 
> > > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > > ---
> > >  arch/arm64/kvm/sys_regs.c | 51 ++++++++++++++++++++++++++++++-----------------
> > >  arch/arm64/kvm/sys_regs.h | 11 ++++++++++
> > >  2 files changed, 44 insertions(+), 18 deletions(-)
> > > 
> 
> [...]
> 
> > > @@ -1840,7 +1855,7 @@ static int emulate_cp(struct kvm_vcpu *vcpu,
> > >  
> > >  	r = find_reg(params, table, num);
> > >  
> > > -	if (likely(r) && sys_reg_present(vcpu, r)) {
> > > +	if (likely(r) && sys_reg_present_or_raz(vcpu, r)) {
> > >  		perform_access(vcpu, params, r);
> > >  		return 0;
> > >  	}
> > > @@ -2016,7 +2031,7 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
> > >  	if (!r)
> > >  		r = find_reg(params, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
> > >  
> > > -	if (likely(r) && sys_reg_present(vcpu, r)) {
> > > +	if (likely(r) && sys_reg_present_or_raz(vcpu, r)) {
> > >  		perform_access(vcpu, params, r);
> > >  	} else {
> > >  		kvm_err("Unsupported guest sys_reg access at: %lx\n",
> > > @@ -2313,7 +2328,7 @@ int kvm_arm_sys_reg_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
> > >  	if (!r)
> > >  		return get_invariant_sys_reg(reg->id, uaddr);
> > >  
> > > -	if (!sys_reg_present(vcpu, r))
> > > +	if (!sys_reg_present_or_raz(vcpu, r))
> > >  		return -ENOENT;
> > >  
> > >  	if (r->get_user)
> > > @@ -2337,7 +2352,7 @@ int kvm_arm_sys_reg_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
> > >  	if (!r)
> > >  		return set_invariant_sys_reg(reg->id, uaddr);
> > >  
> > > -	if (!sys_reg_present(vcpu, r))
> > > +	if (!sys_reg_present_or_raz(vcpu, r))
> > >  		return -ENOENT;
> 
> On Wed, Jul 25, 2018 at 04:46:55PM +0100, Alex Bennée wrote:
> > It's all very well being raz, but shouldn't you catch this further down
> > and not attempt to write the register that doesn't exist?
> 
> To be clear, is this a question about factoring, or do you think there's
> a bug here?
> 
> 
> In response to both sets of comments, I think the way the code is
> factored is causing some confusion.
> 
> The idea in my head was something like this:
> 
> System register encodings fall into two classes:
> 
>  a) encodings that we emulate in some way

this is present, then

>  b) encodings that we unconditionally reflect back to the guest as an
>     Undef.

this is !present, then

The previous change made this a configurable thing as opposed to a
static compile time thing, right?
> 
> Architecturally defined system registers fall into two classes:
> 
>  i) registers whose removal turns all accesses into an Undef
>  ii) registers whose removal exhibits some other behaviour.

I'm not sure what you mean by 'removal' here, and which architectural
concept that relates to, which makes it hard for me to parse the rest
here...

> 
> These two classifications overlap somwehat.
> 
> 
> From an emulation perspective, (b), and (i) in the "register not
> present" case, look the same: we trap the register and reflect an Undef
> directly back to the guest with no further action required.
> 
> From an emulation perspective, (a) and (ii) are also somewhat the
> same: we need to emulate something, although precisely what we need
> to do depends on which register it is and on whether the register is
> deemed present or not.
> 
> sys_reg_check_present_or_raz() thus means "falls under (a) or (ii)",
> i.e., some emulation is required and we need to call sysreg-specific
> methods to figure out precisely what we need to do.

yes, but we've always had that without the "or_raz" stuff at the lookup
level.  What has changed?

> 
> Conversely !sys_reg_check_present_or_raz() means that we can just
> Undef the guest with no further logic required.

Yes, but that's the same as !present, because raz then implies present,
see above.

> 
> Does this rationale make things clearer?  The naming is perhaps
> unfortunate.
> 

Unfortunately not so much.  I have a strong feeling you want to move
anything relating to something being emulated as RAZ/RAO/something else
into sysreg specific functions.
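
Roughly this, say (untested, and reusing the helper your series
already assumes -- vcpu_has_sve() -- plus existing ones like
read_sanitised_ftr_reg()):

static bool access_id_aa64zfr0_el1(struct kvm_vcpu *vcpu,
				   struct sys_reg_params *p,
				   const struct sys_reg_desc *rd)
{
	if (p->is_write)
		return write_to_read_only(vcpu, p, rd);

	/* RAZ when the vcpu has no SVE, sanitised host value otherwise */
	p->regval = vcpu_has_sve(&vcpu->arch) ?
		read_sanitised_ftr_reg(SYS_ID_AA64ZFR0_EL1) : 0;
	return true;
}

That way the generic lookup path never needs a present-but-RAZ state
at all.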

Thanks,
-Christoffer
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 12/16] KVM: arm64/sve: Context switch the SVE registers
  2018-08-07 11:15       ` Dave Martin
@ 2018-08-07 19:43         ` Christoffer Dall
  -1 siblings, 0 replies; 178+ messages in thread
From: Christoffer Dall @ 2018-08-07 19:43 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Aug 07, 2018 at 12:15:26PM +0100, Dave Martin wrote:
> On Mon, Aug 06, 2018 at 03:19:10PM +0200, Christoffer Dall wrote:
> > On Thu, Jun 21, 2018 at 03:57:36PM +0100, Dave Martin wrote:
> > > In order to give each vcpu its own view of the SVE registers, this
> > > patch adds context storage via a new sve_state pointer in struct
> > > vcpu_arch.  An additional member sve_max_vl is also added for each
> > > vcpu, to determine the maximum vector length visible to the guest
> > > and thus the value to be configured in ZCR_EL2.LEN while the vcpu is
> > > active.  This also determines the layout and size of the storage in
> > > sve_state, which is read and written by the same backend functions
> > > that are used for context-switching the SVE state for host tasks.
> > > 
> > > On SVE-enabled vcpus, SVE access traps are now handled by switching
> > > in the vcpu's SVE context and disabling the trap before returning
> > > to the guest.  On other vcpus, the trap is not handled and an exit
> > > back to the host occurs, where the handle_sve() fallback path
> > > reflects an undefined instruction exception back to the guest,
> > > consistently with the behaviour of non-SVE-capable hardware (as was
> > > done unconditionally prior to this patch).
> > > 
> > > No SVE handling is added on non-VHE-only paths, since VHE is an
> > > architectural and Kconfig prerequisite of SVE.
> > > 
> > > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > > ---
> > >  arch/arm64/include/asm/kvm_host.h |  2 ++
> > >  arch/arm64/kvm/fpsimd.c           |  5 +++--
> > >  arch/arm64/kvm/hyp/switch.c       | 43 ++++++++++++++++++++++++++++++---------
> > >  3 files changed, 38 insertions(+), 12 deletions(-)
> 
> [...]
> 
> > > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> 
> [...]
> 
> > > @@ -361,7 +373,13 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
> > >  		vcpu->arch.flags &= ~KVM_ARM64_FP_HOST;
> > >  	}
> > >  
> > > -	__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
> > > +	if (system_supports_sve() && guest_has_sve)
> > > +		sve_load_state((char *)vcpu->arch.sve_state +
> > > +					sve_ffr_offset(vcpu->arch.sve_max_vl),
> > 
> > nit: would it make sense to have a macro 'vcpu_get_sve_state_ptr(vcpu)'
> > to make this first argument more pretty?
> 
> Could do, I guess.  I'll take a look.
> 
> > 
> > > +			       &vcpu->arch.ctxt.gp_regs.fp_regs.fpsr,
> > > +			       sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1);
> > > +	else
> > > +		__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
> > >  
> > >  	/* Skip restoring fpexc32 for AArch64 guests */
> > >  	if (!(read_sysreg(hcr_el2) & HCR_RW))
> > > @@ -380,6 +398,8 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
> > >   */
> > >  static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
> > >  {
> > > +	bool guest_has_sve;
> > > +
> > >  	if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ)
> > >  		vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
> > >  
> > > @@ -397,10 +417,13 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
> > >  	 * and restore the guest context lazily.
> > >  	 * If FP/SIMD is not implemented, handle the trap and inject an
> > >  	 * undefined instruction exception to the guest.
> > > +	 * Similarly for trapped SVE accesses.
> > >  	 */
> > > -	if (system_supports_fpsimd() &&
> > > -	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
> > > -		return __hyp_switch_fpsimd(vcpu);
> > > +	guest_has_sve = vcpu_has_sve(&vcpu->arch);
> > > +	if ((system_supports_fpsimd() &&
> > > +	     kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD) ||
> > > +	    (guest_has_sve && kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SVE))
> > 
> > nit: this may also be folded nicely into a static bool
> > __trap_fpsimd_sve_access() check.
> 
> It wouldn't hurt to make this look less fiddly, certainly.
> 
> Can you elaborate on precisely what you had in mind?

sure:

static bool __hyp_text __trap_is_fpsimd_sve_access(struct kvm_vcpu *vcpu,
						   bool guest_has_sve)
{
	/*
	 * Can we support SVE without FPSIMD? If not, this can be
	 * simplified by reversing the condition.
	 */
	if (system_supports_fpsimd() &&
	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
		return true;

	if (guest_has_sve && kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SVE)
		return true;

	return false;
}


static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
{
	bool guest_has_sve = vcpu_has_sve(&vcpu->arch);

	[...]
	if (__trap_is_fpsimd_sve_access(vcpu, guest_has_sve))
		return __hyp_switch_fpsimd(vcpu, guest_has_sve);
	[...]
}

Of course not even compile-tested or anything like that.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 16/16] KVM: arm64/sve: Report and enable SVE API extensions for userspace
  2018-08-07 11:23               ` Dave Martin
@ 2018-08-07 20:08                 ` Christoffer Dall
  -1 siblings, 0 replies; 178+ messages in thread
From: Christoffer Dall @ 2018-08-07 20:08 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Aug 07, 2018 at 12:23:45PM +0100, Dave Martin wrote:
> On Mon, Aug 06, 2018 at 03:41:33PM +0200, Christoffer Dall wrote:
> > On Thu, Jul 26, 2018 at 02:18:02PM +0100, Dave Martin wrote:
> > > On Wed, Jul 25, 2018 at 06:52:56PM +0200, Andrew Jones wrote:
> > > > On Wed, Jul 25, 2018 at 04:27:49PM +0100, Dave Martin wrote:
> > > > > On Thu, Jul 19, 2018 at 04:59:21PM +0200, Andrew Jones wrote:
> > > > > > On Thu, Jun 21, 2018 at 03:57:40PM +0100, Dave Martin wrote:
> > > > > > > -	/*
> > > > > > > -	 * For now, we don't return any features.
> > > > > > > -	 * In future, we might use features to return target
> > > > > > > -	 * specific features available for the preferred
> > > > > > > -	 * target type.
> > > > > > > -	 */
> > > > > > > +	/* KVM_ARM_VCPU_SVE understood by KVM_VCPU_INIT */
> > > > > > > +	init->features[0] = 1 << KVM_ARM_VCPU_SVE;
> > > > > > > +
> > > > > > 
> > > > > > We shouldn't need to do this. The "preferred" target type isn't defined
> > > > > > well (that I know of), but IMO it should probably be the target that
> > > > > > best matches the host, minus optional features. The best base target. We
> > > > > > may use these features to convey that the preferred target should enable
> > > > > > some optional feature if that feature is necessary to work around a bug,
> > > > > > i.e. using the "feature" bit as an erratum bit someday, but that'd be
> > > > > > quite a debatable use, so maybe not even that. Most likely we'll never
> > > > > > need to add features here.
> > > > > 
> > > > > init->features[] has no semantics yet so we can define it how we like,
> > > > > but I agree that the way I use it here is not necessarily the most
> > > > > natural.
> > > > > 
> > > > > OTOH, we cannot use features[] for "mandatory" features like erratum
> > > > > workarounds, because current userspace just ignores these bits.
> > > > 
> > > > It would have to learn to look here if that's how we started using it,
> > > > but it'd be better to invent something else that wouldn't appear as
> > > > abusive if we're going to teach userspace new stuff anyway.
> > > > 
> > > > > 
> > > > > Rather, these bits would be for features that are considered beneficial
> > > > > but must be off by default (due to incompatibility risks across nodes,
> > > > > or due to ABI impacts).  Just blindly using the preferred target
> > > > > already risks configuring a vcpu that won't work across all nodes in
> > > > > your cluster.
> > > > 
> > > > KVM usually advertises optional features through capabilities. A device
> > > > (vcpu device, in this case) ioctl can also be used to check for feature
> > > > availability.
> > > > 
> > > > > 
> > > > > So I'm not convinced that there is any useful interpretation of
> > > > > features[] unless we interpret it as suggested in this patch.
> > > > > 
> > > > > Can you elaborate why you think it should be used with a more
> > > > > concrete example?
> > > > 
> > > > I'm advocating that it *not* be used here. I think it should be used
> > > > like the PMU feature uses it - and the PMU feature doesn't set a bit
> > > > here.
> > > > 
> > > > > 
> > > > > > That said, I think defining the feature bit makes sense. ATM, I'm feeling
> > > > > > like we'll want to model the user interface for SVE like PMU (using VCPU
> > > > > > device ioctls).
> > > > > 
> > > > > Some people expressed concerns about the ioctls becoming order-sensitive.
> > > > > 
> > > > > In the SVE case we don't want people enabling/disabling/reconfiguring
> > > > > "silicon" features like SVE after the vcpu starts executing.
> > > > > 
> > > > > We will need an extra ioctl() for configuring the allowed SVE vector
> > > > > lengths though.  I don't see a way around that.  So maybe we have to
> > > > > solve the ordering problem anyway.
> > > > 
> > > > Yes, that's why I'm thinking that the vcpu device ioctls is probably the
> > > > right way to go. The SVE group can have its own "finalize" request that
> > > > allows all other SVE ioctls to be in any order prior to it.
> > > > 
> > > > > 
> > > > > 
> > > > > My current approach (not in this series) was to have VCPU_INIT return
> > > > > -EINPROGRESS or similar if SVE is enabled in features[]: this indicates
> > > > > that certain setup ioctls are required before the vcpu can run.
> > > > > 
> > > > > This may be overkill / not the best approach though.  I can look at
> > > > > vcpu device ioctls as an alternative.
> > > > 
> > > > With a "finalize" attribute if SVE isn't finalized by VCPU_INIT or
> > > > KVM_RUN time, then SVE just won't be enabled for that VCPU.
> > > 
> > > So I suppose we could do something like this:
> > > 
> > >  * Advertise SVE availability through a vcpu device capability (I need
> > >    to check how that works).
> > > 
> > >  * SVE-aware userspace that understands SVE can do the relevant
> > >    vcpu device ioctls to configure SVE and turn it on: these are only
> > >    permitted before the vcpu runs.  We might require an explicit
> > >    "finish SVE setup" ioctl to be issued before the vcpu can run.
> > > 
> > >  * Finally, the vcpu is set running by userspace as normal.
> > > 
> > > Marc or Christoffer was objecting to me previously that this may be an
> > > abuse of vcpu device ioctls, because SVE is a CPU feature rather than a
> > > device.  I guess it depends on how you define "device" -- I'm not sure
> > > where to draw the line.
> > 
> > I initially advocated for a VCPU device ioctl as well, because it's a
> > less crowded number space that gives you more flexibility.  Marc did
> > have a strong point that vcpu *devices* implies something else than
> > features though.
> > 
> > I think you (a) definitely want to announce SVE support via a
> > capability, and (b) only set the preferred target flag if enabling SVE
> > *generally* gives you a VM more like the real hardware with similar
> > performance on some system.
> > 
> > I'm personally fine with both feature flags and vcpu device ioctls.  If
> > using vcpu device ioctls gives you an obvious way to set attributes
> > relating to SVE, e.g. the vector length, then I think that's a strong
> > argument for that approach.
> 
> There is another option I'm tending towards, which is simply to have
> a "set vector lengths" ioctl (whether presented as a vcpu device
> ioctl or a random arch ioctl).

Someone complained once about adding too many arch ioctls because there
is a limited number space for doing so, but I'm not sure whether that
was, and still is, a valid concern.

> 
> If that ioctl() fails then SVE support is not available.
> 
> If it succeeds, it will update its arguments to indicate which
> vector lengths are enabled (if different).
> 
> Old userspace, or userspace that doesn't want to use SVE, would
> not use this ioctl at all.
> 
> It would also do no harm additionally to advertise this as a
> capability, though I wonder whether it's necessary to do so (?)
> 

It is customary to expose features via capabilities.  I have a vague
recollection that tools like libvirt negotiate capabilities across
systems and would need more plumbing to discover features by probing an
ioctl instead.
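
(I.e. the usual pattern: assuming some KVM_CAP_ARM_SVE existed -- it
doesn't yet, so the name here is made up -- userspace would just do
something like

	if (ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_ARM_SVE) > 0)
		sve_available = true;	/* safe to use the SVE config ioctls */

rather than having to issue a configuration ioctl speculatively and
interpret its failure mode.)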

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 12/16] KVM: arm64/sve: Context switch the SVE registers
  2018-08-07 19:43         ` Christoffer Dall
@ 2018-08-08  8:23           ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-08-08  8:23 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Aug 07, 2018 at 09:43:38PM +0200, Christoffer Dall wrote:
> On Tue, Aug 07, 2018 at 12:15:26PM +0100, Dave Martin wrote:
> > On Mon, Aug 06, 2018 at 03:19:10PM +0200, Christoffer Dall wrote:

[...]

> > > nit: this may also be folded nicely into a static bool
> > > __trap_fpsimd_sve_access() check.
> > 
> > It wouldn't hurt to make this look less fiddly, certainly.
> > 
> > Can you elaborate on precisely what you had in mind?
> 
> sure:
> 
> static bool __hyp_text __trap_is_fpsimd_sve_access(struct kvm_vcpu *vcpu)
> {
> 	/*
> 	 * Can we support SVE without FPSIMD? If not, this can be
> 	 * simplified by reversing the condition.
> 	 */
> 	if (system_supports_fpsimd() &&
> 	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
> 		return true;
> 
> 	if (guest_has_sve && kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SVE)
> 		return true;
> 
> 	return false;
> }
> 
> 
> static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
> {
> 	[...]
> 	if (__trap_is_fpsimd_sve_access(vcpu))
> 		return __hyp_switch_fpsimd(vcpu, guest_has_sve);
> 	[...]
> }
> 
> Of course not even compile-tested or anything like that.

Sure, I can do something along these lines.  The conditions are indeed a
bit unwieldy today.
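
One nit on the sketch: guest_has_sve isn't in scope inside the helper,
so I'd either pass it in or derive it there, i.e. something like
(equally untested):

static bool __hyp_text __trap_is_fpsimd_sve_access(struct kvm_vcpu *vcpu)
{
	/* Derive this here rather than capturing it from the caller: */
	bool guest_has_sve = vcpu_has_sve(&vcpu->arch);

	if (system_supports_fpsimd() &&
	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
		return true;

	return guest_has_sve &&
		kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SVE;
}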

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 16/16] KVM: arm64/sve: Report and enable SVE API extensions for userspace
  2018-08-07 20:08                 ` Christoffer Dall
@ 2018-08-08  8:30                   ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-08-08  8:30 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Aug 07, 2018 at 10:08:28PM +0200, Christoffer Dall wrote:
> On Tue, Aug 07, 2018 at 12:23:45PM +0100, Dave Martin wrote:
> > On Mon, Aug 06, 2018 at 03:41:33PM +0200, Christoffer Dall wrote:

[...]

> > > I'm personally fine with both feature flags and vcpu device ioctls.  If
> > > using vcpu device ioctls gives you an obvious way to set attributes
> > > relating to SVE, e.g. the vector length, then I think that's a strong
> > > argument for that approach.
> > 
> > There is another option I'm tending towards, which is simply to have
> > a "set vector lengths" ioctl (whether presented as a vcpu device
> > ioctl or a random arch ioctl).
> 
> Someone complained once about adding too many arch ioctls because there
> is a limited number space for doing so, but I'm not sure if that was and
> still a valid concern.

I have no strong opinion on this.  I may stick with the separate ioctl
approach for the next spin just to reduce the amount of churn, but I'm
not overly committed to it.
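
For concreteness, the argument I have in mind is roughly the following
(hypothetical -- the struct name, ioctl number and exact encoding are
all still up for discussion):

struct kvm_sve_vls {
	/* bitmap: bit (vq - 1) set => vector length of 128 * vq bits OK */
	__u64 vqs[SVE_VQ_MAX / 64];
};

Userspace fills in the set of vector lengths it wants; the kernel
clears any bits it cannot (or will not) offer before returning, so the
caller can see exactly what it got.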

> > 
> > If that ioctl() fails then SVE support is not available.
> > 
> > If it succeeds, it will update its arguments to indicate which
> > vector lengths are enabled (if different).
> > 
> > Old userspace, or userspace that doesn't want to use SVE, would
> > not use this ioctl at all.
> > 
> > It would also do no harm additionally to advertise this as a
> > capability, though I wonder whether it's necessary to do so (?)
> > 
> 
> It is customary to expose features via capabilities.  I have a vague
> recollection that tools like libvirt negotiate capabilities across
> systems and would need more plumbing to discover features by probing an
> ioctl instead.

OK, fair enough.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 08/16] KVM: arm64: Support dynamically hideable system registers
  2018-08-07 19:20     ` Christoffer Dall
@ 2018-08-08  8:33       ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-08-08  8:33 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Aug 07, 2018 at 09:20:10PM +0200, Christoffer Dall wrote:

[...]

> > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> > index a436373..31a351a 100644
> > --- a/arch/arm64/kvm/sys_regs.c
> > +++ b/arch/arm64/kvm/sys_regs.c
> > @@ -1840,7 +1840,7 @@ static int emulate_cp(struct kvm_vcpu *vcpu,
> >  
> >  	r = find_reg(params, table, num);
> >  
> > -	if (r) {
> > +	if (likely(r) && sys_reg_present(vcpu, r)) {
> >  		perform_access(vcpu, params, r);
> >  		return 0;
> >  	}
> > @@ -2016,7 +2016,7 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
> >  	if (!r)
> >  		r = find_reg(params, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
> >  
> > -	if (likely(r)) {
> > +	if (likely(r) && sys_reg_present(vcpu, r)) {
> >  		perform_access(vcpu, params, r);
> >  	} else {
> >  		kvm_err("Unsupported guest sys_reg access at: %lx\n",
> 
> This looks a bit fishy, because it seems that now a guest can be
> configured in such a way that it can access non-present emulated system
> registers and get the host to tell the operator that the KVM instance
> running on the system doesn't really support the hardware...

Hmmm, looks like I just blindly adapted the if () condition without
looking at the context here.

I'll take a look at it.
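
Probably the right shape is to keep the "register known but hidden"
case inside the if () and inject the Undef explicitly, reserving the
kvm_err() for encodings we genuinely know nothing about -- along the
lines of (untested):

	if (likely(r)) {
		if (sys_reg_present(vcpu, r))
			perform_access(vcpu, params, r);
		else
			kvm_inject_undefined(vcpu);
	} else {
		kvm_err("Unsupported guest sys_reg access at: %lx\n",
			*vcpu_pc(vcpu));
		kvm_inject_undefined(vcpu);
	}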

[...]

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 09/16] KVM: arm64: Allow ID registers to by dynamically read-as-zero
  2018-08-07 19:35         ` Christoffer Dall
@ 2018-08-08  9:11           ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-08-08  9:11 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Aug 07, 2018 at 09:35:12PM +0200, Christoffer Dall wrote:
> On Tue, Aug 07, 2018 at 12:09:58PM +0100, Dave Martin wrote:
> > On Mon, Aug 06, 2018 at 03:03:24PM +0200, Christoffer Dall wrote:
> > > Hi Dave,
> > > 
> > > I think there's a typo in the subject "to be" rather than "to by".
> > > 
> > > On Thu, Jun 21, 2018 at 03:57:33PM +0100, Dave Martin wrote:
> > > > When a feature-dependent ID register is hidden from the guest, it
> > > > needs to exhibit read-as-zero behaviour as defined by the Arm
> > > > architecture, rather than appearing to be entirely absent.
> > > > 
> > > > This patch updates the ID register emulation logic to make use of
> > > > the new check_present() method to determine whether the register
> > > > should read as zero instead of yielding the host's sanitised
> > > > value.  Because currently a false result from this method truncates
> > > > the trap call chain before the sysreg's emulate method() is called,
> > > > a flag is added to distinguish this special case, and helpers are
> > > > refactored appropriately.
> > > 
> > > I don't understand this last sentence.
> > > 
> > > And I'm not really sure I understand the code either.
> > > 
> > > I can't seem to see any registers which are defined as !present && !raz,
> > > which is what I thought this feature was all about.
> > 
> > !present and !raz is the default behaviour for everything that is not
> > ID-register-like.  This patch is adding the !present && raz case (though
> > that may not be a helpful way to describe it ... see below).
> 
> Fair enough, but I don't really see why you need to classify a register
> as !present && raz, because raz implies present AFAICT.
> 
> > 
> > > In other words, what is the benefit of this more generic method as
> > > opposed to having a wrapper around read_id_reg() for read_sve_id_reg()
> > > which sets RAZ if there is no support for SVE in this context?
> > 
> > There may be other ways to factor this.  I can't now remember why I
> > went with this particular approach, except that I vaguely recall
> > hitting some obstacles when doing things another way.
> 
> What I don't much care for is that we now seem to be mixing the concept
> of whether something is present and the value it returns if it is
> present in the overall system register handling logic.  And I don't
> understand why this is a requirement.
> 
> > 
> > Can you take a look at my attempted explanation below and then we
> > can reconsider this?
> 
> Sure, see my comments below.
> 
> > 
> > [...]
> > 
> > > 
> > > > 
> > > > This involves some trivial updates to pass the vcpu pointer down
> > > > into the ID register emulation/access functions.
> > > > 
> > > > A new ID_SANITISED_IF() macro is defined for declaring
> > > > conditionally visible ID registers.
> > > > 
> > > > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > > > ---
> > > >  arch/arm64/kvm/sys_regs.c | 51 ++++++++++++++++++++++++++++++-----------------
> > > >  arch/arm64/kvm/sys_regs.h | 11 ++++++++++
> > > >  2 files changed, 44 insertions(+), 18 deletions(-)
> > > > 
> > 
> > [...]
> > 
> > > > @@ -1840,7 +1855,7 @@ static int emulate_cp(struct kvm_vcpu *vcpu,
> > > >  
> > > >  	r = find_reg(params, table, num);
> > > >  
> > > > -	if (likely(r) && sys_reg_present(vcpu, r)) {
> > > > +	if (likely(r) && sys_reg_present_or_raz(vcpu, r)) {
> > > >  		perform_access(vcpu, params, r);
> > > >  		return 0;
> > > >  	}
> > > > @@ -2016,7 +2031,7 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
> > > >  	if (!r)
> > > >  		r = find_reg(params, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
> > > >  
> > > > -	if (likely(r) && sys_reg_present(vcpu, r)) {
> > > > +	if (likely(r) && sys_reg_present_or_raz(vcpu, r)) {
> > > >  		perform_access(vcpu, params, r);
> > > >  	} else {
> > > >  		kvm_err("Unsupported guest sys_reg access at: %lx\n",
> > > > @@ -2313,7 +2328,7 @@ int kvm_arm_sys_reg_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
> > > >  	if (!r)
> > > >  		return get_invariant_sys_reg(reg->id, uaddr);
> > > >  
> > > > -	if (!sys_reg_present(vcpu, r))
> > > > +	if (!sys_reg_present_or_raz(vcpu, r))
> > > >  		return -ENOENT;
> > > >  
> > > >  	if (r->get_user)
> > > > @@ -2337,7 +2352,7 @@ int kvm_arm_sys_reg_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
> > > >  	if (!r)
> > > >  		return set_invariant_sys_reg(reg->id, uaddr);
> > > >  
> > > > -	if (!sys_reg_present(vcpu, r))
> > > > +	if (!sys_reg_present_or_raz(vcpu, r))
> > > >  		return -ENOENT;
> > 
> > On Wed, Jul 25, 2018 at 04:46:55PM +0100, Alex Bennée wrote:
> > > It's all very well being raz, but shouldn't you catch this further down
> > > and not attempt to write the register that doesn't exist?
> > 
> > To be clear, is this a question about factoring, or do you think there's
> > a bug here?
> > 
> > 
> > In response to both sets of comments, I think the way the code is
> > factored is causing some confusion.
> > 
> > The idea in my head was something like this:
> > 
> > System register encodings fall into two classes:
> > 
> >  a) encodings that we emulate in some way
> 
> this is present, then
> 
> >  b) encodings that we unconditionally reflect back to the guest as an
> >     Undef.
> 
> this is !present, then
> 
> The previous change made this a configurable thing as opposed to a
> static compile time thing, right?
> > 
> > Architecturally defined system registers fall into two classes:
> > 
> >  i) registers whose removal turns all accesses into an Undef
> >  ii) registers whose removal exhibits some other behaviour.
> 
> I'm not sure what you mean by 'removal' here, and which architectural
> concept that relates to, which makes it hard for me to parse the rest
> here...
> 
> > 
> > These two classifications overlap somewhat.
> > 
> > 
> > From an emulation perspective, (b), and (i) in the "register not
> > present" case, look the same: we trap the register and reflect an Undef
> > directly back to the guest with no further action required.
> > 
> > From an emulation perspective, (a) and (ii) are also somewhat the
> > same: we need to emulate something, although precisely what we need
> > to do depends on which register it is and on whether the register is
> > deemed present or not.
> > 
> > sys_reg_check_present_or_raz() thus means "falls under (a) or (ii)",
> > i.e., some emulation is required and we need to call sysreg-specific
> > methods to figure out precisely what we need to do.
> 
> yes, but we've always had that without the "or_raz" stuff at the lookup
> level.  What has changed?
> 
> > 
> > Conversely !sys_reg_check_present_or_raz() means that we can just
> > Undef the guest with no further logic required.
> 
> Yes, but that's the same as !present, because raz then implies present,
> see above.
> 
> > 
> > Does this rationale make things clearer?  The naming is perhaps
> > unfortunate.
> > 
> 
> Unfortunately not so much.  I have a strong feeling you want to move
> anything relating to something being emulated as RAZ/RAO/something else
> into sysreg specific functions.


The way I integrated this seemed natural at the time, but your
reaction suggests that it may not be the right approach...


At its heart, I'm trying to abstract out the special behaviour of
all unallocated ID registers, so that we can decide at runtime which
ones to hide from the guest: within the ID register block, each
unallocated register becomes RAZ, not UNDEFINED as would be the case
for other system registers, so we need to capture both behaviours.


If we want a generic handler for all the ID registers in sys_regs.c,
then we need a flag to tell us whether to pass the ID register through
from cpufeatures or to make it appear as zero.
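
In other words, the generic ID register read would boil down to
something like the following (sketch only; signature approximate):

static u64 read_id_reg(const struct sys_reg_desc *r, bool raz)
{
	u32 id = sys_reg((u32)r->Op0, (u32)r->Op1,
			 (u32)r->CRn, (u32)r->CRm, (u32)r->Op2);

	/* A hidden ID register reads as zero; otherwise pass through */
	return raz ? 0 : read_sanitised_ftr_reg(id);
}

with raz decided per-vcpu at runtime.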

For ZCR_EL1 on the other hand, we really want attempts to access that
to reflect an Undef to the guest if we are pretending that SVE is not
implemented.  Again if we want to filter out some sysregs in a runtime-
controlled way, we need a flag to tell us whether to filter out a
particular register.

So, we have two specific ways of rolling a feature that is really
implemented in the hardware back to the ARMv8-A behaviour (RAZ for ID
registers and Undef for anything else).

I tried to group these under a single concept of presence/absence,
which is what check_present() is intended to check.  However, we
don't really want ID registers to Undef when !check_present(): this
is bodged around with the additional SR_RAZ_IF_ABSENT flag so that
the decision about whether to make the register Undef or not can
be made generic.


It seems that this attempt at generalisation is creating more confusion
than it solves, so I may abandon it and just handle ID_AA64PFR0_EL1 and
ID_AA64ZFR0_EL1 specially.

When/if we've done that a few times for different features, it may
become clearer what any generic framework for doing it should look
like...

Thoughts?

---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 09/16] KVM: arm64: Allow ID registers to by dynamically read-as-zero
  2018-08-08  9:11           ` Dave Martin
@ 2018-08-08  9:58             ` Christoffer Dall
  -1 siblings, 0 replies; 178+ messages in thread
From: Christoffer Dall @ 2018-08-08  9:58 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Aug 08, 2018 at 10:11:11AM +0100, Dave Martin wrote:
> On Tue, Aug 07, 2018 at 09:35:12PM +0200, Christoffer Dall wrote:
> > On Tue, Aug 07, 2018 at 12:09:58PM +0100, Dave Martin wrote:
> > > On Mon, Aug 06, 2018 at 03:03:24PM +0200, Christoffer Dall wrote:
> > > > Hi Dave,
> > > > 
> > > > I think there's a typo in the subject "to be" rather than "to by".
> > > > 
> > > > On Thu, Jun 21, 2018 at 03:57:33PM +0100, Dave Martin wrote:
> > > > > When a feature-dependent ID register is hidden from the guest, it
> > > > > needs to exhibit read-as-zero behaviour as defined by the Arm
> > > > > architecture, rather than appearing to be entirely absent.
> > > > > 
> > > > > This patch updates the ID register emulation logic to make use of
> > > > > the new check_present() method to determine whether the register
> > > > > should read as zero instead of yielding the host's sanitised
> > > > > value.  Because currently a false result from this method truncates
> > > > > the trap call chain before the sysreg's emulate method() is called,
> > > > > a flag is added to distinguish this special case, and helpers are
> > > > > refactored appropriately.
> > > > 
> > > > I don't understand this last sentence.
> > > > 
> > > > And I'm not really sure I understand the code either.
> > > > 
> > > > I can't seem to see any registers which are defined as !present && !raz,
> > > > which is what I thought this feature was all about.
> > > 
> > > !present and !raz is the default behaviour for everything that is not
> > > ID-register-like.  This patch is adding the !present && raz case (though
> > > that may not be a helpful way to describe it ... see below).
> > 
> > Fair enough, but I don't really see why you need to classify a register
> > as !present && raz, because raz implies present AFAICT.
> > 
> > > 
> > > > In other words, what is the benefit of this more generic method as
> > > > opposed to having a wrapper around read_id_reg() for read_sve_id_reg()
> > > > which sets RAZ if there is no support for SVE in this context?
> > > 
> > > There may be other ways to factor this.  I can't now remember why I
> > > went with this particular approach, except that I vaguely recall
> > > hitting some obstacles when doing things another way.
> > 
> > What I don't much care for is that we now seem to be mixing the concept
> > of whether something is present and the value it returns if it is
> > present in the overall system register handling logic.  And I don't
> > understand why this is a requirement.
> > 
> > > 
> > > Can you take a look at my attempted explanation below and then we
> > > can reconsider this?
> > 
> > Sure, see my comments below.
> > 
> > > 
> > > [...]
> > > 
> > > > 
> > > > > 
> > > > > This involves some trivial updates to pass the vcpu pointer down
> > > > > into the ID register emulation/access functions.
> > > > > 
> > > > > A new ID_SANITISED_IF() macro is defined for declaring
> > > > > conditionally visible ID registers.
> > > > > 
> > > > > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > > > > ---
> > > > >  arch/arm64/kvm/sys_regs.c | 51 ++++++++++++++++++++++++++++++-----------------
> > > > >  arch/arm64/kvm/sys_regs.h | 11 ++++++++++
> > > > >  2 files changed, 44 insertions(+), 18 deletions(-)
> > > > > 
> > > 
> > > [...]
> > > 
> > > > > @@ -1840,7 +1855,7 @@ static int emulate_cp(struct kvm_vcpu *vcpu,
> > > > >  
> > > > >  	r = find_reg(params, table, num);
> > > > >  
> > > > > -	if (likely(r) && sys_reg_present(vcpu, r)) {
> > > > > +	if (likely(r) && sys_reg_present_or_raz(vcpu, r)) {
> > > > >  		perform_access(vcpu, params, r);
> > > > >  		return 0;
> > > > >  	}
> > > > > @@ -2016,7 +2031,7 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
> > > > >  	if (!r)
> > > > >  		r = find_reg(params, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
> > > > >  
> > > > > -	if (likely(r) && sys_reg_present(vcpu, r)) {
> > > > > +	if (likely(r) && sys_reg_present_or_raz(vcpu, r)) {
> > > > >  		perform_access(vcpu, params, r);
> > > > >  	} else {
> > > > >  		kvm_err("Unsupported guest sys_reg access at: %lx\n",
> > > > > @@ -2313,7 +2328,7 @@ int kvm_arm_sys_reg_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
> > > > >  	if (!r)
> > > > >  		return get_invariant_sys_reg(reg->id, uaddr);
> > > > >  
> > > > > -	if (!sys_reg_present(vcpu, r))
> > > > > +	if (!sys_reg_present_or_raz(vcpu, r))
> > > > >  		return -ENOENT;
> > > > >  
> > > > >  	if (r->get_user)
> > > > > @@ -2337,7 +2352,7 @@ int kvm_arm_sys_reg_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
> > > > >  	if (!r)
> > > > >  		return set_invariant_sys_reg(reg->id, uaddr);
> > > > >  
> > > > > -	if (!sys_reg_present(vcpu, r))
> > > > > +	if (!sys_reg_present_or_raz(vcpu, r))
> > > > >  		return -ENOENT;
> > > 
> > > On Wed, Jul 25, 2018 at 04:46:55PM +0100, Alex Bennée wrote:
> > > > It's all very well being raz, but shouldn't you catch this further down
> > > > and not attempt to write the register that doesn't exist?
> > > 
> > > To be clear, is this a question about factoring, or do you think there's
> > > a bug here?
> > > 
> > > 
> > > In response to both sets of comments, I think the way the code is
> > > factored is causing some confusion.
> > > 
> > > The idea in my head was something like this:
> > > 
> > > System register encodings fall into two classes:
> > > 
> > >  a) encodings that we emulate in some way
> > 
> > this is present, then
> > 
> > >  b) encodings that we unconditionally reflect back to the guest as an
> > >     Undef.
> > 
> > this is !present, then
> > 
> > The previous change made this a configurable thing as opposed to a
> > static compile time thing, right?
> > > 
> > > Architecturally defined system registers fall into two classes:
> > > 
> > >  i) registers whose removal turns all accesses into an Undef
> > >  ii) registers whose removal exhibits some other behaviour.
> > 
> > I'm not sure what you mean by 'removal' here, and which architectural
> > concept that relates to, which makes it hard for me to parse the rest
> > here...
> > 
> > > 
> > > These two classifications overlap somewhat.
> > > 
> > > 
> > > From an emulation perspective, (b), and (i) in the "register not
> > > present" case, look the same: we trap the register and reflect an Undef
> > > directly back to the guest with no further action required.
> > > 
> > > From an emulation perspective, (a) and (ii) are also somewhat the
> > > same: we need to emulate something, although precisely what we need
> > > to do depends on which register it is and on whether the register is
> > > deemed present or not.
> > > 
> > > sys_reg_check_present_or_raz() thus means "falls under (a) or (ii)",
> > > i.e., some emulation is required and we need to call sysreg-specific
> > > methods to figure out precisely what we need to do.
> > 
> > yes, but we've always had that without the "or_raz" stuff at the lookup
> > level.  What has changed?
> > 
> > > 
> > > Conversely !sys_reg_present_or_raz() means that we can just
> > > Undef the guest with no further logic required.
> > 
> > Yes, but that's the same as !present, because raz then implies present,
> > see above.
> > 
> > > 
> > > Does this rationale make things clearer?  The naming is perhaps
> > > unfortunate.
> > > 
> > 
> > Unfortunately not so much.  I have a strong feeling you want to move
> > anything relating to something being emulated as RAZ/RAO/something else
> > into sysreg specific functions.
> 
> 
> The way I integrated this seemed natural at the time, but your
> reaction suggests that it may not be the right approach...
> 
> 
> At its heart, I'm trying to abstract out the special behaviour of
> all unallocated ID registers, so that we can decide at runtime which
> ones to hide from the guest: within the ID register block, each
> unallocated register becomes RAZ, not UNDEFINED as would be the case
> for other system registers, so we need to capture both behaviours.
> 
> 
> If we want a generic handler for all the ID registers in sys_regs.c,
> then we need a flag to tell us whether to pass the ID register through
> from cpufeatures or to make it appear as zero.
> 
> For ZCR_EL1 on the other hand, we really want attempts to access that
> to reflect an Undef to the guest if we are pretending that SVE is not
> implemented.  Again if we want to filter out some sysregs in a runtime-
> controlled way, we need a flag to tell us whether to filter out a
> particular register.
> 
> So, we have two specific ways of rolling a feature that is really
> implemented in the hardware back to the ARMv8-A behaviour (RAZ for ID
> registers and Undef for anything else).
> 
> I tried to group these under a single concept of presence/absence,
> which is what check_present() is intended to check.  However, we
> don't really want ID registers to Undef when !check_present(): this
> is bodged around with the additional SR_RAZ_IF_ABSENT flag so that
> the decision about whether to make the register Undef or not can
> be made generic.
> 
> 
> It seems that this attempt at generalisation is creating more confusion
> than it solves, so I may abandon it and just handle ID_AA64PFR0_EL1 and
> ID_AA64ZFR0_EL1 specially.
> 
> When/if we've done that a few times for different features, it may
> become clearer what any generic framework for doing it should look
> like...
> 

I think there's already a fair amount of complexity in sys_regs.c,
and the naming and concepts aren't clear from reading the code as it
stands now.

I think it would probably help to flag things as ID registers as opposed
to 'RAZ' behavior, because it's clear we're then trying to support an
architectural concept.

However, I think of the generic infrastructure in sys_regs.c as being
concerned with KVM specifics, and of the implementation of the
emulate/access functions as being concerned with architectural
concepts.  That's just how I've always thought about this code.

Therefore, I would suggest routing all ID register accesses through
common ID register access functions (read/write) which handle the RAZ
case, along the lines of the sketch below.
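
Something like this, say -- only a rough, untested sketch; the SVE
check is purely illustrative, and vcpu_has_sve() stands in for
whatever predicate the earlier patches in the series end up providing:

static u64 read_id_reg(const struct kvm_vcpu *vcpu,
		       const struct sys_reg_desc *r)
{
	u32 id = sys_reg((u32)r->Op0, (u32)r->Op1,
			 (u32)r->CRn, (u32)r->CRm, (u32)r->Op2);
	bool raz = false;

	/* Each feature-conditional ID register states its policy here: */
	if (id == SYS_ID_AA64ZFR0_EL1 && !vcpu_has_sve(vcpu))
		raz = true;

	return raz ? 0 : read_sanitised_ftr_reg(id);
}

The write side would then be the obvious mirror image, rejecting
writes that don't match what read_id_reg() would return.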

At least, I'd like to see if that becomes too horrible before taking
this route.

Thanks,
-Christoffer
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 09/16] KVM: arm64: Allow ID registers to by dynamically read-as-zero
  2018-08-08  9:11           ` Dave Martin
@ 2018-08-08 14:03             ` Peter Maydell
  -1 siblings, 0 replies; 178+ messages in thread
From: Peter Maydell @ 2018-08-08 14:03 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, arm-mail-list

On 8 August 2018 at 10:11, Dave Martin <Dave.Martin@arm.com> wrote:
> At its heart, I'm trying to abstract out the special behaviour of
> all unallocated ID registers, so that we can decide at runtime which
> ones to hide from the guest: within the ID register block, each
> unallocated register becomes RAZ, not UNDEFINED as would be the case
> for other system registers, so we need to capture both behaviours.

I think a better way to think of the ID register block is
that all the registers in it *are* allocated. It's just that
some of them are specified as RAZ/WI (because no bits in
them have been given meaning yet). Then you retain the
straightforward "unallocated == UNDEF". (In the Arm ARM
the gaps in the ID register block are documented as
"reserved, RAZ", not "unallocated".)

thanks
-- PMM

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [RFC PATCH 09/16] KVM: arm64: Allow ID registers to by dynamically read-as-zero
  2018-08-08 14:03             ` Peter Maydell
@ 2018-08-09 10:19               ` Dave Martin
  -1 siblings, 0 replies; 178+ messages in thread
From: Dave Martin @ 2018-08-09 10:19 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, arm-mail-list

On Wed, Aug 08, 2018 at 03:03:31PM +0100, Peter Maydell wrote:
> On 8 August 2018 at 10:11, Dave Martin <Dave.Martin@arm.com> wrote:
> > At its heart, I'm trying to abstract out the special behaviour of
> > all unallocated ID registers, so that we can decide at runtime which
> > ones to hide from the guest: within the ID register block, each
> > unallocated register becomes RAZ, not UNDEFINED as would be the case
> > for other system registers, so we need to capture both behaviours.
> 
> I think a better way to think of the ID register block is
> that all the registers in it *are* allocated. It's just that
> some of them are specified as RAZ/WI (because no bits in
> them have been given meaning yet). Then you retain the
> straightforward "unallocated == UNDEF". (In the Arm ARM
> the gaps in the ID register block are documented as
> "reserved, RAZ", not "unallocated".)

Sure, I'm not arguing against that.  Your viewpoint does mean that
"enabling"/"disabling" a system register needs to do different things
depending on whether it's an ID register or not: disabling a non-ID
register makes it behave as unallocated, while disabling an ID
register instead makes it read as zero.
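
In code terms the trap path would then fork, something like this
(sketch only: sys_reg_enabled() and is_id_reg() are invented names,
and only the read case is shown):

	if (!sys_reg_enabled(vcpu, r)) {
		if (is_id_reg(r)) {
			/* a disabled ID register reads as zero */
			p->regval = 0;
			return true;
		}

		/* a disabled non-ID register behaves as unallocated */
		kvm_inject_undefined(vcpu);
		return false;
	}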

I think my attempt to conflate the two behaviours was not helpful.

The way the existing code was structured was not helpful for solving
this either, which is one reason I ended up with my approach, but I will
take another look and see if I can come up with something a bit more
sane.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 178+ messages in thread

end of thread, other threads:[~2018-08-09 10:19 UTC | newest]

Thread overview: 178+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-21 14:57 [RFC PATCH 00/16] KVM: arm64: Initial support for SVE guests Dave Martin
2018-06-21 14:57 ` Dave Martin
2018-06-21 14:57 ` [RFC PATCH 01/16] arm64: fpsimd: Always set TIF_FOREIGN_FPSTATE on task state flush Dave Martin
2018-06-21 14:57   ` Dave Martin
2018-07-06  9:07   ` Alex Bennée
2018-07-06  9:07     ` Alex Bennée
2018-06-21 14:57 ` [RFC PATCH 02/16] KVM: arm64: Delete orphaned declaration for __fpsimd_enabled() Dave Martin
2018-06-21 14:57   ` Dave Martin
2018-07-06  9:08   ` Alex Bennée
2018-07-06  9:08     ` Alex Bennée
2018-06-21 14:57 ` [RFC PATCH 03/16] KVM: arm64: Refactor kvm_arm_num_regs() for easier maintenance Dave Martin
2018-06-21 14:57   ` Dave Martin
2018-07-06  9:20   ` Alex Bennée
2018-07-06  9:20     ` Alex Bennée
2018-06-21 14:57 ` [RFC PATCH 04/16] KVM: arm64: Add missing #include of <linux/bitmap.h> to kvm_host.h Dave Martin
2018-06-21 14:57   ` Dave Martin
2018-07-06  9:21   ` Alex Bennée
2018-07-06  9:21     ` Alex Bennée
2018-06-21 14:57 ` [RFC PATCH 05/16] KVM: arm: Add arch init/uninit hooks Dave Martin
2018-06-21 14:57   ` Dave Martin
2018-07-06 10:02   ` Alex Bennée
2018-07-06 10:02     ` Alex Bennée
2018-07-09 15:15     ` Dave Martin
2018-07-09 15:15       ` Dave Martin
2018-06-21 14:57 ` [RFC PATCH 06/16] arm64/sve: Determine virtualisation-friendly vector lengths Dave Martin
2018-06-21 14:57   ` Dave Martin
2018-07-06 13:20   ` Marc Zyngier
2018-07-06 13:20     ` Marc Zyngier
2018-06-21 14:57 ` [RFC PATCH 07/16] arm64/sve: Enable SVE state tracking for non-task contexts Dave Martin
2018-06-21 14:57   ` Dave Martin
2018-07-25 13:58   ` Alex Bennée
2018-07-25 13:58     ` Alex Bennée
2018-07-25 14:39     ` Dave Martin
2018-07-25 14:39       ` Dave Martin
2018-06-21 14:57 ` [RFC PATCH 08/16] KVM: arm64: Support dynamically hideable system registers Dave Martin
2018-06-21 14:57   ` Dave Martin
2018-07-25 14:12   ` Alex Bennée
2018-07-25 14:12     ` Alex Bennée
2018-07-25 14:36     ` Dave Martin
2018-07-25 14:36       ` Dave Martin
2018-07-25 15:41       ` Alex Bennée
2018-07-25 15:41         ` Alex Bennée
2018-07-26 12:53         ` Dave Martin
2018-07-26 12:53           ` Dave Martin
2018-08-07 19:20   ` Christoffer Dall
2018-08-07 19:20     ` Christoffer Dall
2018-08-08  8:33     ` Dave Martin
2018-08-08  8:33       ` Dave Martin
2018-06-21 14:57 ` [RFC PATCH 09/16] KVM: arm64: Allow ID registers to by dynamically read-as-zero Dave Martin
2018-06-21 14:57   ` Dave Martin
2018-07-25 15:46   ` Alex Bennée
2018-07-25 15:46     ` Alex Bennée
2018-08-06 13:03   ` Christoffer Dall
2018-08-06 13:03     ` Christoffer Dall
2018-08-07 11:09     ` Dave Martin
2018-08-07 11:09       ` Dave Martin
2018-08-07 19:35       ` Christoffer Dall
2018-08-07 19:35         ` Christoffer Dall
2018-08-08  9:11         ` Dave Martin
2018-08-08  9:11           ` Dave Martin
2018-08-08  9:58           ` Christoffer Dall
2018-08-08  9:58             ` Christoffer Dall
2018-08-08 14:03           ` Peter Maydell
2018-08-08 14:03             ` Peter Maydell
2018-08-09 10:19             ` Dave Martin
2018-08-09 10:19               ` Dave Martin
2018-06-21 14:57 ` [RFC PATCH 10/16] KVM: arm64: Add a vcpu flag to control SVE visibility for the guest Dave Martin
2018-06-21 14:57   ` Dave Martin
2018-07-19 11:08   ` Andrew Jones
2018-07-19 11:08     ` Andrew Jones
2018-07-25 11:41     ` Dave Martin
2018-07-25 11:41       ` Dave Martin
2018-07-25 13:43       ` Andrew Jones
2018-07-25 13:43         ` Andrew Jones
2018-07-25 14:41         ` Dave Martin
2018-07-25 14:41           ` Dave Martin
2018-07-19 15:02   ` Andrew Jones
2018-07-19 15:02     ` Andrew Jones
2018-07-25 11:48     ` Dave Martin
2018-07-25 11:48       ` Dave Martin
2018-06-21 14:57 ` [RFC PATCH 11/16] KVM: arm64/sve: System register context switch and access support Dave Martin
2018-06-21 14:57   ` Dave Martin
2018-07-19 11:11   ` Andrew Jones
2018-07-19 11:11     ` Andrew Jones
2018-07-25 11:45     ` Dave Martin
2018-07-25 11:45       ` Dave Martin
2018-06-21 14:57 ` [RFC PATCH 12/16] KVM: arm64/sve: Context switch the SVE registers Dave Martin
2018-06-21 14:57   ` Dave Martin
2018-07-19 13:13   ` Andrew Jones
2018-07-19 13:13     ` Andrew Jones
2018-07-25 11:50     ` Dave Martin
2018-07-25 11:50       ` Dave Martin
2018-07-25 13:57       ` Andrew Jones
2018-07-25 13:57         ` Andrew Jones
2018-07-25 14:12         ` Dave Martin
2018-07-25 14:12           ` Dave Martin
2018-08-06 13:19   ` Christoffer Dall
2018-08-06 13:19     ` Christoffer Dall
2018-08-07 11:15     ` Dave Martin
2018-08-07 11:15       ` Dave Martin
2018-08-07 19:43       ` Christoffer Dall
2018-08-07 19:43         ` Christoffer Dall
2018-08-08  8:23         ` Dave Martin
2018-08-08  8:23           ` Dave Martin
2018-06-21 14:57 ` [RFC PATCH 13/16] KVM: Allow 2048-bit register access via KVM_{GET, SET}_ONE_REG Dave Martin
2018-06-21 14:57   ` Dave Martin
2018-07-25 15:58   ` Alex Bennée
2018-07-25 15:58     ` Alex Bennée
2018-07-26 12:58     ` Dave Martin
2018-07-26 12:58       ` Dave Martin
2018-07-26 13:55       ` Alex Bennée
2018-07-26 13:55         ` Alex Bennée
2018-07-27  9:26         ` Dave Martin
2018-07-27  9:26           ` Dave Martin
2018-06-21 14:57 ` [RFC PATCH 14/16] KVM: arm64/sve: Add SVE support to register access ioctl interface Dave Martin
2018-06-21 14:57   ` Dave Martin
2018-07-19 13:04   ` Andrew Jones
2018-07-19 13:04     ` Andrew Jones
2018-07-25 14:06     ` Dave Martin
2018-07-25 14:06       ` Dave Martin
2018-07-25 17:20       ` Andrew Jones
2018-07-25 17:20         ` Andrew Jones
2018-07-26 13:10         ` Dave Martin
2018-07-26 13:10           ` Dave Martin
2018-08-03 14:57     ` Dave Martin
2018-08-03 14:57       ` Dave Martin
2018-08-03 15:11       ` Andrew Jones
2018-08-03 15:11         ` Andrew Jones
2018-08-03 15:38         ` Dave Martin
2018-08-03 15:38           ` Dave Martin
2018-08-06 13:25   ` Christoffer Dall
2018-08-06 13:25     ` Christoffer Dall
2018-08-07 11:17     ` Dave Martin
2018-08-07 11:17       ` Dave Martin
2018-06-21 14:57 ` [RFC PATCH 15/16] KVM: arm64: Enumerate SVE register indices for KVM_GET_REG_LIST Dave Martin
2018-06-21 14:57   ` Dave Martin
2018-07-19 14:12   ` Andrew Jones
2018-07-19 14:12     ` Andrew Jones
2018-07-25 14:50     ` Dave Martin
2018-07-25 14:50       ` Dave Martin
2018-06-21 14:57 ` [RFC PATCH 16/16] KVM: arm64/sve: Report and enable SVE API extensions for userspace Dave Martin
2018-06-21 14:57   ` Dave Martin
2018-07-19 14:59   ` Andrew Jones
2018-07-19 14:59     ` Andrew Jones
2018-07-25 15:27     ` Dave Martin
2018-07-25 15:27       ` Dave Martin
2018-07-25 16:52       ` Andrew Jones
2018-07-25 16:52         ` Andrew Jones
2018-07-26 13:18         ` Dave Martin
2018-07-26 13:18           ` Dave Martin
2018-08-06 13:41           ` Christoffer Dall
2018-08-06 13:41             ` Christoffer Dall
2018-08-07 11:23             ` Dave Martin
2018-08-07 11:23               ` Dave Martin
2018-08-07 20:08               ` Christoffer Dall
2018-08-07 20:08                 ` Christoffer Dall
2018-08-08  8:30                 ` Dave Martin
2018-08-08  8:30                   ` Dave Martin
2018-07-19 15:24   ` Andrew Jones
2018-07-19 15:24     ` Andrew Jones
2018-07-26 13:23     ` Dave Martin
2018-07-26 13:23       ` Dave Martin
2018-07-06  8:22 ` [RFC PATCH 00/16] KVM: arm64: Initial support for SVE guests Alex Bennée
2018-07-06  8:22   ` Alex Bennée
2018-07-06  9:05   ` Dave Martin
2018-07-06  9:05     ` Dave Martin
2018-07-06  9:20     ` Alex Bennée
2018-07-06  9:20       ` Alex Bennée
2018-07-06  9:23       ` Peter Maydell
2018-07-06  9:23         ` Peter Maydell
2018-07-06 10:11         ` Alex Bennée
2018-07-06 10:11           ` Alex Bennée
2018-07-06 10:14           ` Peter Maydell
2018-07-06 10:14             ` Peter Maydell
2018-08-06 13:05 ` Christoffer Dall
2018-08-06 13:05   ` Christoffer Dall
2018-08-07 11:18   ` Dave Martin
2018-08-07 11:18     ` Dave Martin
