* [RFC PATCH v2 00/23] KVM: arm64: Initial support for SVE guests
@ 2018-09-28 13:39 ` Dave Martin
  0 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

This series implements basic support for allowing KVM guests to use the
Arm Scalable Vector Extension (SVE).

The patches are based on v4.19-rc5.

The patches are also available on a branch for reviewer convenience. [1]

This is a significant overhaul of the previous preliminary series [2],
with the major changes outlined below, and additional minor updates in
response to review feedback (see the individual patches for those).

In the interest of getting this series out for review, it is
**completely untested**.

Reviewers should focus on the proposed API (but any other comments are
of course welcome!)


Major changes:

 * Reworked on top of the new KVM FPSIMD context switching model.

 * Migrated away from use of feature bits in KVM_ARM_PREFERRED_TARGET /
   KVM_VCPU_INIT to detect and enable SVE support.

   Availability is now reported via KVM_CAP_ARM_SVE, and SVE support
   is enabled and configured per-vcpu with a new ioctl,
   KVM_ARM_SVE_CONFIG (see the sketch after this list).

 * KVM_ARM_SVE_CONFIG also adds the ability to detect and configure
   the set of SVE vector lengths provided to the guest.

 * The ioctl register access model has been simplified.  The new SVE
   interface must now be used to access the content of the FPSIMD
   V-registers on SVE-enabled vcpus.  The kernel no longer tries to
   emulate the KVM_REG_ARM_CORE view for these registers in this case.
   (For non-SVE-enabled vcpus, the KVM_REG_ARM_CORE interface works as
   normal.)

 * Draft documentation for the SVE extensions has been added to KVM's
   api.txt.
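
For illustration, a userspace probe for the new capability might look
like this (a sketch only: KVM_CHECK_EXTENSION is the standard KVM
capability check, but the argument structure taken by
KVM_ARM_SVE_CONFIG is defined by the patches and the api.txt draft,
and is not reproduced here):

	if (ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_ARM_SVE) > 0) {
		/*
		 * SVE is available: enable and configure it per-vcpu,
		 * e.g. ioctl(vcpu_fd, KVM_ARM_SVE_CONFIG, &config),
		 * where config selects the guest's vector length set.
		 */
	}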

Known issues:

 * KVM_GET_REG_LIST enumerates the FPSIMD V-registers for SVE-enabled
   vcpus.  It shouldn't, because attempts to access these will now fail!

 * kvmtool/qemu updates are needed to enable creation of SVE-enabled
   guests (to be discussed separately).

 * Due to heavy rework, the series is currently a bit of a mess: the
   patches are not presented in a very coherent order, and one or two
   patches may even now be redundant or irrelevant.  If you spot any
   obvious inconsistencies, please shout!


[1]
http://linux-arm.org/git?p=linux-dm.git;a=shortlog;h=refs/heads/sve-kvm/rfcv2
git://linux-arm.org/linux-dm.git sve-kvm/rfcv2

[2] [RFC PATCH 00/16] KVM: arm64: Initial support for SVE guests
http://lists.infradead.org/pipermail/linux-arm-kernel/2018-June/585467.html


Dave Martin (23):
  arm64: fpsimd: Always set TIF_FOREIGN_FPSTATE on task state flush
  KVM: arm64: Delete orphaned declaration for __fpsimd_enabled()
  KVM: arm64: Refactor kvm_arm_num_regs() for easier maintenance
  KVM: arm64: Add missing #include of <linux/bitmap.h> to kvm_host.h
  KVM: arm: Add arch vcpu uninit hook
  arm64/sve: Check SVE virtualisability
  arm64/sve: Enable SVE state tracking for non-task contexts
  KVM: arm64: Add a vcpu flag to control SVE visibility for the guest
  KVM: arm64: Propagate vcpu into read_id_reg()
  KVM: arm64: Extend reset_unknown() to handle mixed RES0/UNKNOWN
    registers
  KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST
  KVM: arm64/sve: System register context switch and access support
  KVM: arm64/sve: Context switch the SVE registers
  KVM: Allow 2048-bit register access via ioctl interface
  KVM: arm64/sve: Add SVE support to register access ioctl interface
  KVM: arm64: Enumerate SVE register indices for KVM_GET_REG_LIST
  arm64/sve: In-kernel vector length availability query interface
  KVM: arm64: Add arch vcpu ioctl hook
  KVM: arm64/sve: Report and enable SVE API extensions for userspace
  KVM: arm64: Add arch vm ioctl hook
  KVM: arm64/sve: allow KVM_ARM_SVE_CONFIG_QUERY on vm fd
  KVM: Documentation: Document arm64 core registers in detail
  KVM: arm64/sve: Document KVM API extensions for SVE

 Documentation/virtual/kvm/api.txt | 160 ++++++++++++++++
 arch/arm/include/asm/kvm_host.h   |  16 +-
 arch/arm64/include/asm/fpsimd.h   |  33 +++-
 arch/arm64/include/asm/kvm_host.h |  25 ++-
 arch/arm64/include/asm/kvm_hyp.h  |   1 -
 arch/arm64/include/asm/sysreg.h   |   3 +
 arch/arm64/include/uapi/asm/kvm.h |  24 +++
 arch/arm64/kernel/cpufeature.c    |   2 +-
 arch/arm64/kernel/fpsimd.c        | 163 +++++++++++-----
 arch/arm64/kernel/signal.c        |   5 -
 arch/arm64/kvm/fpsimd.c           |  16 +-
 arch/arm64/kvm/guest.c            | 391 ++++++++++++++++++++++++++++++++++++--
 arch/arm64/kvm/hyp/switch.c       |  69 +++++--
 arch/arm64/kvm/reset.c            |  50 +++++
 arch/arm64/kvm/sys_regs.c         | 144 ++++++++++++--
 arch/arm64/kvm/sys_regs.h         |   8 +-
 include/uapi/linux/kvm.h          |   5 +
 virt/kvm/arm/arm.c                |   9 +-
 18 files changed, 1016 insertions(+), 108 deletions(-)

-- 
2.1.4


* [RFC PATCH v2 01/23] arm64: fpsimd: Always set TIF_FOREIGN_FPSTATE on task state flush
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

This patch updates fpsimd_flush_task_state() to mirror the new
semantics of fpsimd_flush_cpu_state(): both functions now
implicitly set TIF_FOREIGN_FPSTATE to indicate that the task's
FPSIMD state is not loaded into the cpu.

As a side-effect, fpsimd_flush_task_state() now sets
TIF_FOREIGN_FPSTATE even for non-running tasks.  In the case of
non-running tasks this is not useful but also harmless, because the
flag is live only while the corresponding task is running.  This
function is not called from fast paths, so special-casing this for
the task == current case is not really worth it.

Compiler barriers previously present in restore_sve_fpsimd_context()
are pulled into fpsimd_flush_task_state() so that it can be safely
called with preemption enabled if necessary.

Explicit calls to set TIF_FOREIGN_FPSTATE that accompany
fpsimd_flush_task_state() calls and are now redundant are removed
as appropriate.

fpsimd_flush_task_state() is used to get exclusive access to the
representation of the task's state via task_struct, for the purpose
of replacing the state.  Thus, the call to this function should
happen before manipulating fpsimd_state or sve_state etc. in
task_struct.  Anomalous cases are reordered appropriately in order
to make the code more consistent, although there should be no
functional difference since these cases are protected by
local_bh_disable() anyway.
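
An illustrative caller pattern under the new semantics (a sketch for
review purposes, not part of the patch):

	local_bh_disable();
	fpsimd_save();                    /* commit any live register state */
	fpsimd_flush_task_state(current); /* now also sets TIF_FOREIGN_FPSTATE */
	/* ...now safe to rewrite thread.uw.fpsimd_state / thread.sve_state... */
	local_bh_enable();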

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
---
 arch/arm64/kernel/fpsimd.c | 25 +++++++++++++++++++------
 arch/arm64/kernel/signal.c |  5 -----
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 58c53bc..42aa154 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -550,7 +550,6 @@ int sve_set_vector_length(struct task_struct *task,
 		local_bh_disable();
 
 		fpsimd_save();
-		set_thread_flag(TIF_FOREIGN_FPSTATE);
 	}
 
 	fpsimd_flush_task_state(task);
@@ -816,12 +815,11 @@ asmlinkage void do_sve_acc(unsigned int esr, struct pt_regs *regs)
 	local_bh_disable();
 
 	fpsimd_save();
-	fpsimd_to_sve(current);
 
 	/* Force ret_to_user to reload the registers: */
 	fpsimd_flush_task_state(current);
-	set_thread_flag(TIF_FOREIGN_FPSTATE);
 
+	fpsimd_to_sve(current);
 	if (test_and_set_thread_flag(TIF_SVE))
 		WARN_ON(1); /* SVE access shouldn't have trapped */
 
@@ -898,9 +896,9 @@ void fpsimd_flush_thread(void)
 
 	local_bh_disable();
 
+	fpsimd_flush_task_state(current);
 	memset(&current->thread.uw.fpsimd_state, 0,
 	       sizeof(current->thread.uw.fpsimd_state));
-	fpsimd_flush_task_state(current);
 
 	if (system_supports_sve()) {
 		clear_thread_flag(TIF_SVE);
@@ -937,8 +935,6 @@ void fpsimd_flush_thread(void)
 			current->thread.sve_vl_onexec = 0;
 	}
 
-	set_thread_flag(TIF_FOREIGN_FPSTATE);
-
 	local_bh_enable();
 }
 
@@ -1047,12 +1043,29 @@ void fpsimd_update_current_state(struct user_fpsimd_state const *state)
 
 /*
  * Invalidate live CPU copies of task t's FPSIMD state
+ *
+ * This function may be called with preemption enabled.  The barrier()
+ * ensures that the assignment to fpsimd_cpu is visible to any
+ * preemption/softirq that could race with set_tsk_thread_flag(), so
+ * that TIF_FOREIGN_FPSTATE cannot be spuriously re-cleared.
+ *
+ * The final barrier ensures that TIF_FOREIGN_FPSTATE is seen set by any
+ * subsequent code.
  */
 void fpsimd_flush_task_state(struct task_struct *t)
 {
 	t->thread.fpsimd_cpu = NR_CPUS;
+
+	barrier();
+	set_tsk_thread_flag(t, TIF_FOREIGN_FPSTATE);
+
+	barrier();
 }
 
+/*
+ * Invalidate any task's FPSIMD state that is present on this cpu.
+ * This function must be called with softirqs disabled.
+ */
 void fpsimd_flush_cpu_state(void)
 {
 	__this_cpu_write(fpsimd_last_state.st, NULL);
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 5dcc942..7dcf0f1 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -296,11 +296,6 @@ static int restore_sve_fpsimd_context(struct user_ctxs *user)
 	 */
 
 	fpsimd_flush_task_state(current);
-	barrier();
-	/* From now, fpsimd_thread_switch() won't clear TIF_FOREIGN_FPSTATE */
-
-	set_thread_flag(TIF_FOREIGN_FPSTATE);
-	barrier();
 	/* From now, fpsimd_thread_switch() won't touch thread.sve_state */
 
 	sve_alloc(current);
-- 
2.1.4


* [RFC PATCH v2 02/23] KVM: arm64: Delete orphaned declaration for __fpsimd_enabled()
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

__fpsimd_enabled() no longer exists, but a dangling declaration has
survived in kvm_hyp.h.

This patch gets rid of it.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
---
 arch/arm64/include/asm/kvm_hyp.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 384c343..9cbbd03 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -147,7 +147,6 @@ void __debug_switch_to_host(struct kvm_vcpu *vcpu);
 
 void __fpsimd_save_state(struct user_fpsimd_state *fp_regs);
 void __fpsimd_restore_state(struct user_fpsimd_state *fp_regs);
-bool __fpsimd_enabled(void);
 
 void activate_traps_vhe_load(struct kvm_vcpu *vcpu);
 void deactivate_traps_vhe_put(void);
-- 
2.1.4


* [RFC PATCH v2 03/23] KVM: arm64: Refactor kvm_arm_num_regs() for easier maintenance
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

kvm_arm_num_regs() adds together various partial register counts in
a freeform sum expression, which makes it harder than necessary to
read diffs that add, modify or remove a single term in the sum
(which is expected to be the common case under maintenance).

This patch refactors the code to add the terms one per line, for
maximum readability.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
---
 arch/arm64/kvm/guest.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 07256b0..953a5c9 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -205,8 +205,14 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
  */
 unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
 {
-	return num_core_regs() + kvm_arm_num_sys_reg_descs(vcpu)
-		+ kvm_arm_get_fw_num_regs(vcpu)	+ NUM_TIMER_REGS;
+	unsigned long res = 0;
+
+	res += num_core_regs();
+	res += kvm_arm_num_sys_reg_descs(vcpu);
+	res += kvm_arm_get_fw_num_regs(vcpu);
+	res += NUM_TIMER_REGS;
+
+	return res;
 }
 
 /**
-- 
2.1.4


* [RFC PATCH v2 04/23] KVM: arm64: Add missing #include of <linux/bitmap.h> to kvm_host.h
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

kvm_host.h uses DECLARE_BITMAP() to declare the features member of
struct vcpu_arch, but the corresponding #include for this is
missing.

This patch adds a suitable #include for <linux/bitmap.h>.  Although
the header builds without it today, this should help to avoid
future surprises.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
---
 arch/arm64/include/asm/kvm_host.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 3d6d733..6316a57 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -22,6 +22,7 @@
 #ifndef __ARM64_KVM_HOST_H__
 #define __ARM64_KVM_HOST_H__
 
+#include <linux/bitmap.h>
 #include <linux/types.h>
 #include <linux/kvm_types.h>
 #include <asm/cpufeature.h>
-- 
2.1.4


* [RFC PATCH v2 05/23] KVM: arm: Add arch vcpu uninit hook
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

In preparation for adding support for SVE in guests on arm64, a
hook is needed for freeing additional per-vcpu memory when a vcpu
is freed.

x86 already uses the kvm_arch_vcpu_uninit() hook for a similar
purpose, so this patch populates the same hook for arm.  Since SVE
is specific to arm64, a subsidiary hook kvm_arm_arch_vcpu_uninit()
is added (with trivial implementations for now) to enable separate
specialisation for arm and arm64.

No functional change.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---

Changes since RFCv1:

 * The vcpu _init_ hook that was added by the former version of this
   patch was never used for anything, so it is gone from this version.
---
 arch/arm/include/asm/kvm_host.h   | 3 ++-
 arch/arm64/include/asm/kvm_host.h | 3 ++-
 virt/kvm/arm/arm.c                | 5 +++++
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 3ad482d..c36760b 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -288,10 +288,11 @@ struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
 static inline bool kvm_arch_check_sve_has_vhe(void) { return true; }
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
-static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
 
+static inline void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
+
 static inline void kvm_arm_init_debug(void) {}
 static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 6316a57..d4b65414 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -425,10 +425,11 @@ static inline bool kvm_arch_check_sve_has_vhe(void)
 
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
-static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
 
+static inline void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
+
 void kvm_arm_init_debug(void);
 void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index c92053b..1418af9 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -358,6 +358,11 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 	return kvm_vgic_vcpu_init(vcpu);
 }
 
+void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
+{
+	kvm_arm_arch_vcpu_uninit(vcpu);
+}
+
 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
 	int *last_ran;
-- 
2.1.4


* [RFC PATCH v2 06/23] arm64/sve: Check SVE virtualisability
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

Due to the way the effective SVE vector length is controlled and
trapped at different exception levels, certain mismatches in the
sets of vector lengths supported by different physical CPUs in the
system may prevent straightforward virtualisation of SVE at parity
with the host.

This patch analyses the extent to which SVE can be virtualised
safely without interfering with migration of vcpus between physical
CPUs, and rejects late secondary CPUs that would erode the
situation further.

It is left up to KVM to decide what to do with this information.
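
As a worked example (illustrative numbers, not taken from the patch):
suppose CPU0 supports the vector lengths corresponding to VQ {1, 2, 3, 4}
while CPU1 supports only VQ {1, 2, 4}.  Then:

	sve_vq_map         = {1, 2, 4}     /* intersection: usable everywhere */
	sve_vq_partial_map = {1, 2, 3, 4}  /* union: present on some CPU */

The lowest VQ supported on some but not all CPUs is 3, so
sve_max_virtualisable_vl is capped at VQ 2 (a 256-bit vector length):
guests confined to vector lengths of at most 256 bits can migrate
between these CPUs safely, whereas VQ 3 and above cannot be offered to
guests without risking inconsistent probing results after migration.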

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---

Changes since RFCv1:

 * The analysis done by this patch is the same as in the previous
   version, but the commit message, printks, etc. have been reworded
   to avoid the suggestion that KVM is expected to work on a system with
   mismatched SVE implementations.
---
 arch/arm64/include/asm/fpsimd.h |  1 +
 arch/arm64/kernel/cpufeature.c  |  2 +-
 arch/arm64/kernel/fpsimd.c      | 87 +++++++++++++++++++++++++++++++++++------
 3 files changed, 76 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index dd1ad39..964adc9 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -87,6 +87,7 @@ extern void sve_kernel_enable(const struct arm64_cpu_capabilities *__unused);
 extern u64 read_zcr_features(void);
 
 extern int __ro_after_init sve_max_vl;
+extern int __ro_after_init sve_max_virtualisable_vl;
 
 #ifdef CONFIG_ARM64_SVE
 
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index e238b79..aa1a55b 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1531,7 +1531,7 @@ static void verify_sve_features(void)
 	unsigned int len = zcr & ZCR_ELx_LEN_MASK;
 
 	if (len < safe_len || sve_verify_vq_map()) {
-		pr_crit("CPU%d: SVE: required vector length(s) missing\n",
+		pr_crit("CPU%d: SVE: vector length support mismatch\n",
 			smp_processor_id());
 		cpu_die_early();
 	}
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 42aa154..d28042b 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -18,6 +18,7 @@
  */
 
 #include <linux/bitmap.h>
+#include <linux/bitops.h>
 #include <linux/bottom_half.h>
 #include <linux/bug.h>
 #include <linux/cache.h>
@@ -48,6 +49,7 @@
 #include <asm/sigcontext.h>
 #include <asm/sysreg.h>
 #include <asm/traps.h>
+#include <asm/virt.h>
 
 #define FPEXC_IOF	(1 << 0)
 #define FPEXC_DZF	(1 << 1)
@@ -130,14 +132,18 @@ static int sve_default_vl = -1;
 
 /* Maximum supported vector length across all CPUs (initially poisoned) */
 int __ro_after_init sve_max_vl = SVE_VL_MIN;
+int __ro_after_init sve_max_virtualisable_vl = SVE_VL_MIN;
 /* Set of available vector lengths, as vq_to_bit(vq): */
 static __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
+/* Set of vector lengths present on at least one cpu: */
+static __ro_after_init DECLARE_BITMAP(sve_vq_partial_map, SVE_VQ_MAX);
 static void __percpu *efi_sve_state;
 
 #else /* ! CONFIG_ARM64_SVE */
 
 /* Dummy declaration for code that will be optimised out: */
 extern __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
+extern __ro_after_init DECLARE_BITMAP(sve_vq_partial_map, SVE_VQ_MAX);
 extern void __percpu *efi_sve_state;
 
 #endif /* ! CONFIG_ARM64_SVE */
@@ -623,11 +629,8 @@ int sve_get_current_vl(void)
 	return sve_prctl_status(0);
 }
 
-/*
- * Bitmap for temporary storage of the per-CPU set of supported vector lengths
- * during secondary boot.
- */
-static DECLARE_BITMAP(sve_secondary_vq_map, SVE_VQ_MAX);
+/* Bitmaps for temporary storage during manipulation of vector length sets */
+static DECLARE_BITMAP(sve_tmp_vq_map, SVE_VQ_MAX);
 
 static void sve_probe_vqs(DECLARE_BITMAP(map, SVE_VQ_MAX))
 {
@@ -650,6 +653,7 @@ static void sve_probe_vqs(DECLARE_BITMAP(map, SVE_VQ_MAX))
 void __init sve_init_vq_map(void)
 {
 	sve_probe_vqs(sve_vq_map);
+	bitmap_copy(sve_vq_partial_map, sve_vq_map, SVE_VQ_MAX);
 }
 
 /*
@@ -658,24 +662,60 @@ void __init sve_init_vq_map(void)
  */
 void sve_update_vq_map(void)
 {
-	sve_probe_vqs(sve_secondary_vq_map);
-	bitmap_and(sve_vq_map, sve_vq_map, sve_secondary_vq_map, SVE_VQ_MAX);
+	sve_probe_vqs(sve_tmp_vq_map);
+	bitmap_and(sve_vq_map, sve_vq_map, sve_tmp_vq_map,
+		   SVE_VQ_MAX);
+	bitmap_or(sve_vq_partial_map, sve_vq_partial_map, sve_tmp_vq_map,
+		  SVE_VQ_MAX);
 }
 
 /* Check whether the current CPU supports all VQs in the committed set */
 int sve_verify_vq_map(void)
 {
-	int ret = 0;
+	int ret = -EINVAL;
+	unsigned long b;
 
-	sve_probe_vqs(sve_secondary_vq_map);
-	bitmap_andnot(sve_secondary_vq_map, sve_vq_map, sve_secondary_vq_map,
-		      SVE_VQ_MAX);
-	if (!bitmap_empty(sve_secondary_vq_map, SVE_VQ_MAX)) {
+	sve_probe_vqs(sve_tmp_vq_map);
+
+	bitmap_complement(sve_tmp_vq_map, sve_tmp_vq_map, SVE_VQ_MAX);
+	if (bitmap_intersects(sve_tmp_vq_map, sve_vq_map, SVE_VQ_MAX)) {
 		pr_warn("SVE: cpu%d: Required vector length(s) missing\n",
 			smp_processor_id());
-		ret = -EINVAL;
+		goto error;
+	}
+
+	if (!IS_ENABLED(CONFIG_KVM) || !is_hyp_mode_available())
+		goto ok;
+
+	/*
+	 * For KVM, it is necessary to ensure that this CPU doesn't
+	 * support any vector length that guests may have probed as
+	 * unsupported.
+	 */
+
+	/* Recover the set of supported VQs: */
+	bitmap_complement(sve_tmp_vq_map, sve_tmp_vq_map, SVE_VQ_MAX);
+	/* Find VQs supported that are not globally supported: */
+	bitmap_andnot(sve_tmp_vq_map, sve_tmp_vq_map, sve_vq_map, SVE_VQ_MAX);
+
+	/* Find the lowest such VQ, if any: */
+	b = find_last_bit(sve_tmp_vq_map, SVE_VQ_MAX);
+	if (b >= SVE_VQ_MAX)
+		goto ok; /* no mismatches */
+
+	/*
+	 * Mismatches above sve_max_virtualisable_vl are fine, since
+	 * no guest is allowed to configure ZCR_EL2.LEN to exceed this:
+	 */
+	if (sve_vl_from_vq(bit_to_vq(b)) <= sve_max_virtualisable_vl) {
+		pr_warn("SVE: cpu%d: Unsupported vector length(s) present\n",
+			smp_processor_id());
+		goto error;
 	}
 
+ok:
+	ret = 0;
+error:
 	return ret;
 }
 
@@ -743,6 +783,7 @@ u64 read_zcr_features(void)
 void __init sve_setup(void)
 {
 	u64 zcr;
+	unsigned long b;
 
 	if (!system_supports_sve())
 		return;
@@ -771,11 +812,31 @@ void __init sve_setup(void)
 	 */
 	sve_default_vl = find_supported_vector_length(64);
 
+	bitmap_andnot(sve_tmp_vq_map, sve_vq_partial_map, sve_vq_map,
+		      SVE_VQ_MAX);
+
+	b = find_last_bit(sve_tmp_vq_map, SVE_VQ_MAX);
+	if (b >= SVE_VQ_MAX)
+		/* No non-virtualisable VLs found */
+		sve_max_virtualisable_vl = SVE_VQ_MAX;
+	else if (WARN_ON(b == SVE_VQ_MAX - 1))
+		/* No virtualisable VLs?  This is architecturally forbidden. */
+		sve_max_virtualisable_vl = SVE_VQ_MIN;
+	else /* b + 1 < SVE_VQ_MAX */
+		sve_max_virtualisable_vl = sve_vl_from_vq(bit_to_vq(b + 1));
+
+	if (sve_max_virtualisable_vl > sve_max_vl)
+		sve_max_virtualisable_vl = sve_max_vl;
+
 	pr_info("SVE: maximum available vector length %u bytes per vector\n",
 		sve_max_vl);
 	pr_info("SVE: default vector length %u bytes per vector\n",
 		sve_default_vl);
 
+	/* KVM decides whether to support mismatched systems. Just warn here: */
+	if (sve_max_virtualisable_vl < sve_max_vl)
+		pr_info("SVE: unvirtualisable vector lengths present\n");
+
 	sve_efi_setup();
 }
 
-- 
2.1.4


* [RFC PATCH v2 07/23] arm64/sve: Enable SVE state tracking for non-task contexts
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

The current FPSIMD/SVE context handling support for non-task (i.e.,
KVM vcpu) contexts does not take SVE into account.  This means that
only task contexts can safely use SVE at present.

In preparation for enabling KVM guests to use SVE, it is necessary
to keep track of SVE state for non-task contexts too.

This patch adds the necessary support, removing assumptions from
the context switch code about the location of the SVE context
storage.

When binding a vcpu context, its vector length is arbitrarily
specified as sve_max_vl for now.  In any case, because TIF_SVE is
presently cleared at vcpu context bind time, the specified vector
length will not be used for anything yet.  In later patches TIF_SVE
will be set here as appropriate, and the appropriate maximum vector
length for the vcpu will be passed when binding.
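
To make the direction concrete, here is a hypothetical sketch of what
the bind call might eventually look like once per-vcpu SVE state
exists (vcpu_sve_state() and vcpu_sve_vl() are made-up placeholder
names, not helpers from this series):

	/* hypothetical: bind the vcpu's own SVE buffer and vector length */
	fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.gp_regs.fp_regs,
				 vcpu_sve_state(vcpu), vcpu_sve_vl(vcpu));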

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
---
 arch/arm64/include/asm/fpsimd.h |  3 ++-
 arch/arm64/kernel/fpsimd.c      | 20 +++++++++++++++-----
 arch/arm64/kvm/fpsimd.c         |  4 +++-
 3 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 964adc9..df7a143 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -56,7 +56,8 @@ extern void fpsimd_restore_current_state(void);
 extern void fpsimd_update_current_state(struct user_fpsimd_state const *state);
 
 extern void fpsimd_bind_task_to_cpu(void);
-extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state);
+extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state,
+				     void *sve_state, unsigned int sve_vl);
 
 extern void fpsimd_flush_task_state(struct task_struct *target);
 extern void fpsimd_flush_cpu_state(void);
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index d28042b..60c5e28 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -121,6 +121,8 @@
  */
 struct fpsimd_last_state_struct {
 	struct user_fpsimd_state *st;
+	void *sve_state;
+	unsigned int sve_vl;
 };
 
 static DEFINE_PER_CPU(struct fpsimd_last_state_struct, fpsimd_last_state);
@@ -241,14 +243,15 @@ static void task_fpsimd_load(void)
  */
 void fpsimd_save(void)
 {
-	struct user_fpsimd_state *st = __this_cpu_read(fpsimd_last_state.st);
+	struct fpsimd_last_state_struct const *last =
+		this_cpu_ptr(&fpsimd_last_state);
 	/* set by fpsimd_bind_task_to_cpu() or fpsimd_bind_state_to_cpu() */
 
 	WARN_ON(!in_softirq() && !irqs_disabled());
 
 	if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
 		if (system_supports_sve() && test_thread_flag(TIF_SVE)) {
-			if (WARN_ON(sve_get_vl() != current->thread.sve_vl)) {
+			if (WARN_ON(sve_get_vl() != last->sve_vl)) {
 				/*
 				 * Can't save the user regs, so current would
 				 * re-enter user with corrupt state.
@@ -258,9 +261,11 @@ void fpsimd_save(void)
 				return;
 			}
 
-			sve_save_state(sve_pffr(&current->thread), &st->fpsr);
+			sve_save_state((char *)last->sve_state +
+						sve_ffr_offset(last->sve_vl),
+				       &last->st->fpsr);
 		} else
-			fpsimd_save_state(st);
+			fpsimd_save_state(last->st);
 	}
 }
 
@@ -1035,6 +1040,8 @@ void fpsimd_bind_task_to_cpu(void)
 		this_cpu_ptr(&fpsimd_last_state);
 
 	last->st = &current->thread.uw.fpsimd_state;
+	last->sve_state = current->thread.sve_state;
+	last->sve_vl = current->thread.sve_vl;
 	current->thread.fpsimd_cpu = smp_processor_id();
 
 	if (system_supports_sve()) {
@@ -1048,7 +1055,8 @@ void fpsimd_bind_task_to_cpu(void)
 	}
 }
 
-void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st)
+void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state,
+			      unsigned int sve_vl)
 {
 	struct fpsimd_last_state_struct *last =
 		this_cpu_ptr(&fpsimd_last_state);
@@ -1056,6 +1064,8 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st)
 	WARN_ON(!in_softirq() && !irqs_disabled());
 
 	last->st = st;
+	last->sve_state = sve_state;
+	last->sve_vl = sve_vl;
 }
 
 /*
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index aac7808..55654cb 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -85,7 +85,9 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
 	WARN_ON_ONCE(!irqs_disabled());
 
 	if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
-		fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.gp_regs.fp_regs);
+		fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.gp_regs.fp_regs,
+					 NULL, sve_max_vl);
+
 		clear_thread_flag(TIF_FOREIGN_FPSTATE);
 		clear_thread_flag(TIF_SVE);
 	}
-- 
2.1.4

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 08/23] KVM: arm64: Add a vcpu flag to control SVE visibility for the guest
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

Since SVE will be enabled or disabled on a per-vcpu basis, a flag
is needed to track which vcpus have it enabled.

This patch adds a suitable flag and a helper for checking it.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---

Changes since RFCv1:

 * Convert vcpu_has_sve() to a macro so that it can operate on a vcpu
   without circular header dependency problems.

   This avoids the helper requiring a vcpu_arch argument, which was
   a little ugly.
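
 * For illustration, a minimal sketch of how the flag is meant to be
   used (the set-up line below is hypothetical; the real configuration
   ioctl arrives later in the series):

	/* hypothetical set-up, not part of this patch: */
	vcpu->arch.flags |= KVM_ARM64_GUEST_HAS_SVE;

	/* callers then gate SVE-only work on the helper: */
	if (vcpu_has_sve(vcpu)) {
		/* ... SVE-specific handling ... */
	}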
---
 arch/arm64/include/asm/kvm_host.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index d4b65414..20baf4a 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -307,6 +307,10 @@ struct kvm_vcpu_arch {
 #define KVM_ARM64_FP_HOST		(1 << 2) /* host FP regs loaded */
 #define KVM_ARM64_HOST_SVE_IN_USE	(1 << 3) /* backup for host TIF_SVE */
 #define KVM_ARM64_HOST_SVE_ENABLED	(1 << 4) /* SVE enabled for EL0 */
+#define KVM_ARM64_GUEST_HAS_SVE		(1 << 5) /* SVE exposed to guest */
+
+#define vcpu_has_sve(vcpu) (system_supports_sve() && \
+			    ((vcpu)->arch.flags & KVM_ARM64_GUEST_HAS_SVE))
 
 #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.gp_regs)
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 09/23] KVM: arm64: Propagate vcpu into read_id_reg()
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

Architecture features that are conditionally visible to the guest
will require run-time checks in the ID register accessor functions.
In particular, read_id_reg() will need to perform checks in order
to generate the correct emulated value for certain ID register
fields, such as ID_AA64PFR0_EL1.SVE.

This patch propagates vcpu into read_id_reg() so that future
patches can add run-time checks on the guest configuration here.

For now, there is no functional change.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
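
Note: as an illustration of the kind of check this enables, a later
patch in this series makes read_id_reg() suppress the SVE field for
non-SVE-enabled vcpus:

	if (id == SYS_ID_AA64PFR0_EL1 && !vcpu_has_sve(vcpu))
		val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT);

---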
 arch/arm64/kvm/sys_regs.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 22fbbdb..0dfd064 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1029,7 +1029,8 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu,
 }
 
 /* Read a sanitised cpufeature ID register by sys_reg_desc */
-static u64 read_id_reg(struct sys_reg_desc const *r, bool raz)
+static u64 read_id_reg(const struct kvm_vcpu *vcpu,
+		struct sys_reg_desc const *r, bool raz)
 {
 	u32 id = sys_reg((u32)r->Op0, (u32)r->Op1,
 			 (u32)r->CRn, (u32)r->CRm, (u32)r->Op2);
@@ -1060,7 +1061,7 @@ static bool __access_id_reg(struct kvm_vcpu *vcpu,
 	if (p->is_write)
 		return write_to_read_only(vcpu, p, r);
 
-	p->regval = read_id_reg(r, raz);
+	p->regval = read_id_reg(vcpu, r, raz);
 	return true;
 }
 
@@ -1089,16 +1090,18 @@ static u64 sys_reg_to_index(const struct sys_reg_desc *reg);
  * are stored, and for set_id_reg() we don't allow the effective value
  * to be changed.
  */
-static int __get_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
+static int __get_id_reg(const struct kvm_vcpu *vcpu,
+			const struct sys_reg_desc *rd, void __user *uaddr,
 			bool raz)
 {
 	const u64 id = sys_reg_to_index(rd);
-	const u64 val = read_id_reg(rd, raz);
+	const u64 val = read_id_reg(vcpu, rd, raz);
 
 	return reg_to_user(uaddr, &val, id);
 }
 
-static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
+static int __set_id_reg(const struct kvm_vcpu *vcpu,
+			const struct sys_reg_desc *rd, void __user *uaddr,
 			bool raz)
 {
 	const u64 id = sys_reg_to_index(rd);
@@ -1110,7 +1113,7 @@ static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
 		return err;
 
 	/* This is what we mean by invariant: you can't change it. */
-	if (val != read_id_reg(rd, raz))
+	if (val != read_id_reg(vcpu, rd, raz))
 		return -EINVAL;
 
 	return 0;
@@ -1119,25 +1122,25 @@ static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
 static int get_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
 		      const struct kvm_one_reg *reg, void __user *uaddr)
 {
-	return __get_id_reg(rd, uaddr, false);
+	return __get_id_reg(vcpu, rd, uaddr, false);
 }
 
 static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
 		      const struct kvm_one_reg *reg, void __user *uaddr)
 {
-	return __set_id_reg(rd, uaddr, false);
+	return __set_id_reg(vcpu, rd, uaddr, false);
 }
 
 static int get_raz_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
 			  const struct kvm_one_reg *reg, void __user *uaddr)
 {
-	return __get_id_reg(rd, uaddr, true);
+	return __get_id_reg(vcpu, rd, uaddr, true);
 }
 
 static int set_raz_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
 			  const struct kvm_one_reg *reg, void __user *uaddr)
 {
-	return __set_id_reg(rd, uaddr, true);
+	return __set_id_reg(vcpu, rd, uaddr, true);
 }
 
 /* sys_reg_desc initialiser for known cpufeature ID registers */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 10/23] KVM: arm64: Extend reset_unknown() to handle mixed RES0/UNKNOWN registers
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

The reset_unknown() system register helper initialises a guest
register to a distinctive junk value on vcpu reset, to help expose
and debug deficient register initialisation within the guest.

Some registers such as the SVE control register ZCR_EL1 contain a
mixture of UNKNOWN fields and RES0 bits.  For these,
reset_unknown() does not work at present, since it sets all bits to
junk values instead of just the wanted bits.

There is no need to craft another special helper just for that,
since reset_unknown() almost does the appropriate thing anyway.
This patch takes advantage of the unused val field in struct
sys_reg_desc to specify a mask of bits that should be initialised
to zero instead of junk.

No existing user of reset_unknown() defines a value for val (nor
should they), so val is implicitly zero, resulting in all bits
being made UNKNOWN by this function: thus,
this patch makes no functional change for currently defined
registers.

Future patches will make use of non-zero val.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
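
Note: for illustration, a later patch in this series uses this for
ZCR_EL1, whose bits [63:4] are RES0, by setting val to the RES0 mask
in the register's descriptor (other fields elided here):

	/* bits [63:4] RES0 are zeroed; the LEN field [3:0] stays junk */
	{ SYS_DESC(SYS_ZCR_EL1), ..., reset_unknown, ZCR_EL1, ~0xfUL, ... },

so reset_unknown() computes 0x1de7ec7edbadc0deULL & 0xf = 0xe as the
initial register value.

---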
 arch/arm64/kvm/sys_regs.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
index cd710f8..24bac06 100644
--- a/arch/arm64/kvm/sys_regs.h
+++ b/arch/arm64/kvm/sys_regs.h
@@ -89,7 +89,9 @@ static inline void reset_unknown(struct kvm_vcpu *vcpu,
 {
 	BUG_ON(!r->reg);
 	BUG_ON(r->reg >= NR_SYS_REGS);
-	__vcpu_sys_reg(vcpu, r->reg) = 0x1de7ec7edbadc0deULL;
+
+	/* If non-zero, r->val specifies which register bits are RES0: */
+	__vcpu_sys_reg(vcpu, r->reg) = 0x1de7ec7edbadc0deULL & ~r->val;
 }
 
 static inline void reset_val(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

KVM_GET_REG_LIST should only enumerate registers that are actually
accessible, so it is necessary to filter out any register that is
not exposed to the guest.  For features that are configured at
runtime, this will require a dynamic check.

For example, ZCR_EL1 and ID_AA64ZFR0_EL1 would need to be hidden
if SVE is not enabled for the guest.

Special-casing walk_one_sys_reg() for specific registers would make
the code unnecessarily messy, so this patch adds a new sysreg
method check_present() that, if defined, indicates whether the
sysreg should be enumerated.  If the guest runtime configuration
may require a particular system register to be hidden,
check_present() should point to a function that returns true or
false to enable or disable enumeration of that register,
respectively.

Currently check_present() is not used for any other purpose, but it
may be a useful foundation for abstracting other parts of the code
to handle conditionally-present sysregs, if required.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
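
Note: as an illustration, a conditionally-present register's
descriptor gains a hook of the following shape (a later patch in this
series adds exactly this for the SVE registers):

	static bool sve_check_present(const struct kvm_vcpu *vcpu,
				      const struct sys_reg_desc *rd)
	{
		return vcpu_has_sve(vcpu);
	}

---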
 arch/arm64/kvm/sys_regs.c | 10 +++++++---
 arch/arm64/kvm/sys_regs.h |  4 ++++
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 0dfd064..adb6cbd 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2437,7 +2437,8 @@ static bool copy_reg_to_user(const struct sys_reg_desc *reg, u64 __user **uind)
 	return true;
 }
 
-static int walk_one_sys_reg(const struct sys_reg_desc *rd,
+static int walk_one_sys_reg(const struct kvm_vcpu *vcpu,
+			    const struct sys_reg_desc *rd,
 			    u64 __user **uind,
 			    unsigned int *total)
 {
@@ -2448,6 +2449,9 @@ static int walk_one_sys_reg(const struct sys_reg_desc *rd,
 	if (!(rd->reg || rd->get_user))
 		return 0;
 
+	if (rd->check_present && !rd->check_present(vcpu, rd))
+		return 0;
+
 	if (!copy_reg_to_user(rd, uind))
 		return -EFAULT;
 
@@ -2476,9 +2480,9 @@ static int walk_sys_regs(struct kvm_vcpu *vcpu, u64 __user *uind)
 		int cmp = cmp_sys_reg(i1, i2);
 		/* target-specific overrides generic entry. */
 		if (cmp <= 0)
-			err = walk_one_sys_reg(i1, &uind, &total);
+			err = walk_one_sys_reg(vcpu, i1, &uind, &total);
 		else
-			err = walk_one_sys_reg(i2, &uind, &total);
+			err = walk_one_sys_reg(vcpu, i2, &uind, &total);
 
 		if (err)
 			return err;
diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
index 24bac06..cffb31e 100644
--- a/arch/arm64/kvm/sys_regs.h
+++ b/arch/arm64/kvm/sys_regs.h
@@ -61,6 +61,10 @@ struct sys_reg_desc {
 			const struct kvm_one_reg *reg, void __user *uaddr);
 	int (*set_user)(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
 			const struct kvm_one_reg *reg, void __user *uaddr);
+
+	/* Return true iff the register exists; assume present if NULL */
+	bool (*check_present)(const struct kvm_vcpu *vcpu,
+			      const struct sys_reg_desc *rd);
 };
 
 static inline void print_sys_reg_instr(const struct sys_reg_params *p)
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 12/23] KVM: arm64/sve: System register context switch and access support
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

This patch adds the necessary support for context switching ZCR_EL1
for each vcpu.

ZCR_EL1 is trapped alongside the FPSIMD/SVE registers, so it makes
sense for it to be handled as part of the guest FPSIMD/SVE context
for context switch purposes instead of handling it as a general
system register.  This means that it can be switched in lazily at
the appropriate time.  No effort is made to track host context for
this register, since SVE requires VHE: thus the host's value for
this register lives permanently in ZCR_EL2 and does not alias the
guest's value at any time.

The Hyp switch and fpsimd context handling code is extended
appropriately.

Accessors are added in sys_regs.c to expose the SVE system
registers and ID register fields.  Because these need to be
conditionally visible based on the guest configuration, they are
implemented separately for now rather than by use of the generic
system register helpers.  This may be abstracted better later on
when/if there are more features requiring this model.

ID_AA64ZFR0_EL1 is RO-RAZ for MRS/MSR when SVE is disabled for the
guest, but for compatibility with non-SVE aware KVM implementations
the register should not be enumerated at all for KVM_GET_REG_LIST
in this case.  For consistency we also reject ioctl access to the
register.  This ensures that a non-SVE-enabled guest looks the same
to userspace, irrespective of whether the kernel KVM implementation
supports SVE.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---

Changes since RFCv1:

 * The conditional visibility logic in sys_regs.c has been
   simplified.

 * The guest's ZCR_EL1 is now treated as part of the FPSIMD/SVE state
   for switching purposes.  Any access to this register before it is
   switched in generates an SVE trap, so we have a change to switch it
   along with the vector registers.

   Because SVE is only available with VHE there is no need ever to
   restore the host's version of this register (which instead lives
   permanently in ZCR_EL2).
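
 * For reference, the save/restore pair reduces to accesses via the
   VHE ZCR_EL12 alias, which redirects to the guest's ZCR_EL1 while
   running at EL2 (condensed from the code in this patch):

	/* on vcpu put, if the guest's FP/SVE regs are live: */
	vcpu->arch.ctxt.sys_regs[ZCR_EL1] = read_sysreg_s(SYS_ZCR_EL12);

	/* lazily, on the first FPSIMD/SVE trap from the guest: */
	write_sysreg_s(vcpu->arch.ctxt.sys_regs[ZCR_EL1], SYS_ZCR_EL12);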
---
 arch/arm64/include/asm/kvm_host.h |   1 +
 arch/arm64/include/asm/sysreg.h   |   3 ++
 arch/arm64/kvm/fpsimd.c           |   9 +++-
 arch/arm64/kvm/hyp/switch.c       |   4 ++
 arch/arm64/kvm/sys_regs.c         | 111 ++++++++++++++++++++++++++++++++++++--
 5 files changed, 123 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 20baf4a..76cbb95e 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -110,6 +110,7 @@ enum vcpu_sysreg {
 	SCTLR_EL1,	/* System Control Register */
 	ACTLR_EL1,	/* Auxiliary Control Register */
 	CPACR_EL1,	/* Coprocessor Access Control */
+	ZCR_EL1,	/* SVE Control */
 	TTBR0_EL1,	/* Translation Table Base Register 0 */
 	TTBR1_EL1,	/* Translation Table Base Register 1 */
 	TCR_EL1,	/* Translation Control Register */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index c147093..dbac42f 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -418,6 +418,9 @@
 #define SYS_ICH_LR14_EL2		__SYS__LR8_EL2(6)
 #define SYS_ICH_LR15_EL2		__SYS__LR8_EL2(7)
 
+/* VHE encodings for architectural EL0/1 system registers */
+#define SYS_ZCR_EL12			sys_reg(3, 5, 1, 2, 0)
+
 /* Common SCTLR_ELx flags. */
 #define SCTLR_ELx_EE    (1 << 25)
 #define SCTLR_ELx_IESB	(1 << 21)
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 55654cb..29e5585 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -102,6 +102,9 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
 void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
 {
 	unsigned long flags;
+	bool host_has_sve = system_supports_sve();
+	bool guest_has_sve =
+		host_has_sve && (vcpu->arch.flags & KVM_ARM64_FP_ENABLED);
 
 	local_irq_save(flags);
 
@@ -109,7 +112,11 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
 		/* Clean guest FP state to memory and invalidate cpu view */
 		fpsimd_save();
 		fpsimd_flush_cpu_state();
-	} else if (system_supports_sve()) {
+
+		if (guest_has_sve)
+			vcpu->arch.ctxt.sys_regs[ZCR_EL1] =
+				read_sysreg_s(SYS_ZCR_EL12);
+	} else if (host_has_sve) {
 		/*
 		 * The FPSIMD/SVE state in the CPU has not been touched, and we
 		 * have SVE (and VHE): CPACR_EL1 (alias CPTR_EL2) has been
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index ca46153..085ed06 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -366,6 +366,10 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
 
 	__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
 
+	if (system_supports_sve() &&
+	    vcpu->arch.flags & KVM_ARM64_GUEST_HAS_SVE)
+		write_sysreg_s(vcpu->arch.ctxt.sys_regs[ZCR_EL1], SYS_ZCR_EL12);
+
 	/* Skip restoring fpexc32 for AArch64 guests */
 	if (!(read_sysreg(hcr_el2) & HCR_RW))
 		write_sysreg(vcpu->arch.ctxt.sys_regs[FPEXC32_EL2],
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index adb6cbd..6f03211 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1036,10 +1036,7 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu,
 			 (u32)r->CRn, (u32)r->CRm, (u32)r->Op2);
 	u64 val = raz ? 0 : read_sanitised_ftr_reg(id);
 
-	if (id == SYS_ID_AA64PFR0_EL1) {
-		if (val & (0xfUL << ID_AA64PFR0_SVE_SHIFT))
-			kvm_debug("SVE unsupported for guests, suppressing\n");
-
+	if (id == SYS_ID_AA64PFR0_EL1 && !vcpu_has_sve(vcpu)) {
 		val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT);
 	} else if (id == SYS_ID_AA64MMFR1_EL1) {
 		if (val & (0xfUL << ID_AA64MMFR1_LOR_SHIFT))
@@ -1083,6 +1080,105 @@ static int reg_from_user(u64 *val, const void __user *uaddr, u64 id);
 static int reg_to_user(void __user *uaddr, const u64 *val, u64 id);
 static u64 sys_reg_to_index(const struct sys_reg_desc *reg);
 
+#ifdef CONFIG_ARM64_SVE
+static bool sve_check_present(const struct kvm_vcpu *vcpu,
+			      const struct sys_reg_desc *rd)
+{
+	return vcpu_has_sve(vcpu);
+}
+
+static bool access_zcr_el1(struct kvm_vcpu *vcpu,
+			   struct sys_reg_params *p,
+			   const struct sys_reg_desc *rd)
+{
+	/*
+	 * ZCR_EL1 access is handled directly in Hyp as part of the FPSIMD/SVE
+	 * context, so we should only arrive here for non-SVE guests:
+	 */
+	WARN_ON(vcpu_has_sve(vcpu));
+
+	kvm_inject_undefined(vcpu);
+	return false;
+}
+
+static int get_zcr_el1(struct kvm_vcpu *vcpu,
+		       const struct sys_reg_desc *rd,
+		       const struct kvm_one_reg *reg, void __user *uaddr)
+{
+	if (!vcpu_has_sve(vcpu))
+		return -ENOENT;
+
+	return reg_to_user(uaddr, &vcpu->arch.ctxt.sys_regs[ZCR_EL1],
+			   reg->id);
+}
+
+static int set_zcr_el1(struct kvm_vcpu *vcpu,
+		       const struct sys_reg_desc *rd,
+		       const struct kvm_one_reg *reg, void __user *uaddr)
+{
+	if (!vcpu_has_sve(vcpu))
+		return -ENOENT;
+
+	return reg_from_user(&vcpu->arch.ctxt.sys_regs[ZCR_EL1], uaddr,
+			     reg->id);
+}
+
+/* Generate the emulated ID_AA64ZFR0_EL1 value exposed to the guest */
+static u64 guest_id_aa64zfr0_el1(const struct kvm_vcpu *vcpu)
+{
+	if (!vcpu_has_sve(vcpu))
+		return 0;
+
+	return read_sanitised_ftr_reg(SYS_ID_AA64ZFR0_EL1);
+}
+
+static bool access_id_aa64zfr0_el1(struct kvm_vcpu *vcpu,
+				   struct sys_reg_params *p,
+				   const struct sys_reg_desc *rd)
+{
+	if (p->is_write)
+		return write_to_read_only(vcpu, p, rd);
+
+	p->regval = guest_id_aa64zfr0_el1(vcpu);
+	return true;
+}
+
+static int get_id_aa64zfr0_el1(struct kvm_vcpu *vcpu,
+		const struct sys_reg_desc *rd,
+		const struct kvm_one_reg *reg, void __user *uaddr)
+{
+	u64 val;
+
+	if (!vcpu_has_sve(vcpu))
+		return -ENOENT;
+
+	val = guest_id_aa64zfr0_el1(vcpu);
+	return reg_to_user(uaddr, &val, reg->id);
+}
+
+static int set_id_aa64zfr0_el1(struct kvm_vcpu *vcpu,
+		const struct sys_reg_desc *rd,
+		const struct kvm_one_reg *reg, void __user *uaddr)
+{
+	const u64 id = sys_reg_to_index(rd);
+	int err;
+	u64 val;
+
+	if (!vcpu_has_sve(vcpu))
+		return -ENOENT;
+
+	err = reg_from_user(&val, uaddr, id);
+	if (err)
+		return err;
+
+	/* This is what we mean by invariant: you can't change it. */
+	if (val != guest_id_aa64zfr0_el1(vcpu))
+		return -EINVAL;
+
+	return 0;
+}
+#endif /* CONFIG_ARM64_SVE */
+
 /*
  * cpufeature ID register user accessors
  *
@@ -1270,7 +1366,11 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	ID_SANITISED(ID_AA64PFR1_EL1),
 	ID_UNALLOCATED(4,2),
 	ID_UNALLOCATED(4,3),
+#ifdef CONFIG_ARM64_SVE
+	{ SYS_DESC(SYS_ID_AA64ZFR0_EL1), access_id_aa64zfr0_el1, .get_user = get_id_aa64zfr0_el1, .set_user = set_id_aa64zfr0_el1, .check_present = sve_check_present },
+#else
 	ID_UNALLOCATED(4,4),
+#endif
 	ID_UNALLOCATED(4,5),
 	ID_UNALLOCATED(4,6),
 	ID_UNALLOCATED(4,7),
@@ -1307,6 +1407,9 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 
 	{ SYS_DESC(SYS_SCTLR_EL1), access_vm_reg, reset_val, SCTLR_EL1, 0x00C50078 },
 	{ SYS_DESC(SYS_CPACR_EL1), NULL, reset_val, CPACR_EL1, 0 },
+#ifdef CONFIG_ARM64_SVE
+	{ SYS_DESC(SYS_ZCR_EL1), access_zcr_el1, reset_unknown, ZCR_EL1, ~0xfUL, .get_user = get_zcr_el1, .set_user = set_zcr_el1, .check_present = sve_check_present },
+#endif
 	{ SYS_DESC(SYS_TTBR0_EL1), access_vm_reg, reset_unknown, TTBR0_EL1 },
 	{ SYS_DESC(SYS_TTBR1_EL1), access_vm_reg, reset_unknown, TTBR1_EL1 },
 	{ SYS_DESC(SYS_TCR_EL1), access_vm_reg, reset_val, TCR_EL1, 0 },
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 13/23] KVM: arm64/sve: Context switch the SVE registers
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

In order to give each vcpu its own view of the SVE registers, this
patch adds context storage via a new sve_state pointer in struct
vcpu_arch.  An additional member sve_max_vl is also added for each
vcpu, to determine the maximum vector length visible to the guest
and thus the value to be configured in ZCR_EL2.LEN while the vcpu
is active.  This also determines the layout and size of the storage in
sve_state, which is read and written by the same backend functions
that are used for context-switching the SVE state for host tasks.

On SVE-enabled vcpus, SVE access traps are now handled by switching
in the vcpu's SVE context and disabling the trap before returning
to the guest.  On other vcpus, the trap is not handled and an exit
back to the host occurs, where the handle_sve() fallback path
reflects an undefined instruction exception back to the guest,
consistent with the behaviour of non-SVE-capable hardware (as was
done unconditionally prior to this patch).

No SVE handling is added to the non-VHE path, since VHE is an
architectural and Kconfig prerequisite of SVE.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---

Changes since RFCv1:

 * Add an if_sve() helper macro to efficiently skip or optimise out
   SVE conditional support code for the SVE-unsupported case.  This
   reduces the verbose boilerplate at the affected sites.

 * In the style of sve_pffr(), a vcpu_sve_pffr() helper is added to
   provide the FFR anchor pointer for sve_load_state() in the hyp switch
   code.  This helps avoid some open-coded pointer munging that is not
   very readable.

 * The condition for calling __hyp_switch_fpsimd() is abstracted for
   better readability.
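
 * The sve_state buffer is not allocated by this patch.  As a rough
   sketch, later in the series it is expected to be sized from the
   vcpu's maximum vector length using the shared signal-frame layout
   macros (the call site below is illustrative only):

	unsigned int vl = vcpu->arch.sve_max_vl;

	/* one SVE register block, laid out as in the signal frame */
	vcpu->arch.sve_state =
		kzalloc(SVE_SIG_REGS_SIZE(sve_vq_from_vl(vl)), GFP_KERNEL);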
---
 arch/arm64/include/asm/kvm_host.h |  6 ++++
 arch/arm64/kvm/fpsimd.c           |  5 +--
 arch/arm64/kvm/hyp/switch.c       | 71 ++++++++++++++++++++++++++++++---------
 3 files changed, 65 insertions(+), 17 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 76cbb95e..8e9cd43 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -210,6 +210,8 @@ typedef struct kvm_cpu_context kvm_cpu_context_t;
 
 struct kvm_vcpu_arch {
 	struct kvm_cpu_context ctxt;
+	void *sve_state;
+	unsigned int sve_max_vl;
 
 	/* HYP configuration */
 	u64 hcr_el2;
@@ -302,6 +304,10 @@ struct kvm_vcpu_arch {
 	bool sysregs_loaded_on_cpu;
 };
 
+/* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
+#define vcpu_sve_pffr(vcpu) ((void *)((char *)((vcpu)->arch.sve_state) + \
+				      sve_ffr_offset((vcpu)->arch.sve_max_vl)))
+
 /* vcpu_arch flags field values: */
 #define KVM_ARM64_DEBUG_DIRTY		(1 << 0)
 #define KVM_ARM64_FP_ENABLED		(1 << 1) /* guest FP regs loaded */
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 29e5585..3474388 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -86,10 +86,11 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
 
 	if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
 		fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.gp_regs.fp_regs,
-					 NULL, sve_max_vl);
+					 vcpu->arch.sve_state,
+					 vcpu->arch.sve_max_vl);
 
 		clear_thread_flag(TIF_FOREIGN_FPSTATE);
-		clear_thread_flag(TIF_SVE);
+		update_thread_flag(TIF_SVE, vcpu_has_sve(vcpu));
 	}
 }
 
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 085ed06..9941349 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -98,7 +98,10 @@ static void activate_traps_vhe(struct kvm_vcpu *vcpu)
 	val = read_sysreg(cpacr_el1);
 	val |= CPACR_EL1_TTA;
 	val &= ~CPACR_EL1_ZEN;
-	if (!update_fp_enabled(vcpu)) {
+	if (update_fp_enabled(vcpu)) {
+		if (vcpu_has_sve(vcpu))
+			val |= CPACR_EL1_ZEN;
+	} else {
 		val &= ~CPACR_EL1_FPEN;
 		__activate_traps_fpsimd32(vcpu);
 	}
@@ -332,16 +335,29 @@ static bool __hyp_text __skip_instr(struct kvm_vcpu *vcpu)
 	}
 }
 
-static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
+/*
+ * if () with a gating check for SVE support to minimise branch
+ * mispredictions in non-SVE systems.
+ * (system_supports_sve() is resolved at build time or via a static key.)
+ */
+#define if_sve(cond) if (system_supports_sve() && (cond))
+
+static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu,
+					   bool guest_has_sve)
 {
 	struct user_fpsimd_state *host_fpsimd = vcpu->arch.host_fpsimd_state;
 
-	if (has_vhe())
-		write_sysreg(read_sysreg(cpacr_el1) | CPACR_EL1_FPEN,
-			     cpacr_el1);
-	else
+	if (has_vhe()) {
+		u64 reg = read_sysreg(cpacr_el1) | CPACR_EL1_FPEN;
+
+		if_sve (guest_has_sve)
+			reg |= CPACR_EL1_ZEN;
+
+		write_sysreg(reg, cpacr_el1);
+	} else {
 		write_sysreg(read_sysreg(cptr_el2) & ~(u64)CPTR_EL2_TFP,
 			     cptr_el2);
+	}
 
 	isb();
 
@@ -350,8 +366,7 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
 		 * In the SVE case, VHE is assumed: it is enforced by
 		 * Kconfig and kvm_arch_init().
 		 */
-		if (system_supports_sve() &&
-		    (vcpu->arch.flags & KVM_ARM64_HOST_SVE_IN_USE)) {
+		if_sve (vcpu->arch.flags & KVM_ARM64_HOST_SVE_IN_USE) {
 			struct thread_struct *thread = container_of(
 				host_fpsimd,
 				struct thread_struct, uw.fpsimd_state);
@@ -364,11 +379,14 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
 		vcpu->arch.flags &= ~KVM_ARM64_FP_HOST;
 	}
 
-	__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
-
-	if (system_supports_sve() &&
-	    vcpu->arch.flags & KVM_ARM64_GUEST_HAS_SVE)
+	if_sve (guest_has_sve) {
+		sve_load_state(vcpu_sve_pffr(vcpu),
+			       &vcpu->arch.ctxt.gp_regs.fp_regs.fpsr,
+			       sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1);
 		write_sysreg_s(vcpu->arch.ctxt.sys_regs[ZCR_EL1], SYS_ZCR_EL12);
+	} else {
+		__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
+	}
 
 	/* Skip restoring fpexc32 for AArch64 guests */
 	if (!(read_sysreg(hcr_el2) & HCR_RW))
@@ -380,6 +398,26 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
 	return true;
 }
 
+static inline bool __hyp_text __hyp_trap_is_fpsimd(struct kvm_vcpu *vcpu,
+						   bool guest_has_sve)
+{
+
+	u8 trap_class;
+
+	if (!system_supports_fpsimd())
+		return false;
+
+	trap_class = kvm_vcpu_trap_get_class(vcpu);
+
+	if (trap_class == ESR_ELx_EC_FP_ASIMD)
+		return true;
+
+	if_sve (guest_has_sve && trap_class == ESR_ELx_EC_SVE)
+		return true;
+
+	return false;
+}
+
 /*
  * Return true when we were able to fixup the guest exit and should return to
  * the guest, false when we should restore the host state and return to the
@@ -387,6 +425,8 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
  */
 static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 {
+	bool guest_has_sve;
+
 	if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ)
 		vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
 
@@ -404,10 +444,11 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 	 * and restore the guest context lazily.
 	 * If FP/SIMD is not implemented, handle the trap and inject an
 	 * undefined instruction exception to the guest.
+	 * Similarly for trapped SVE accesses.
 	 */
-	if (system_supports_fpsimd() &&
-	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
-		return __hyp_switch_fpsimd(vcpu);
+	guest_has_sve = vcpu_has_sve(vcpu);
+	if (__hyp_trap_is_fpsimd(vcpu, guest_has_sve))
+		return __hyp_switch_fpsimd(vcpu, guest_has_sve);
 
 	if (!__populate_fault_info(vcpu))
 		return true;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 154+ messages in thread

 
@@ -404,10 +444,11 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 	 * and restore the guest context lazily.
 	 * If FP/SIMD is not implemented, handle the trap and inject an
 	 * undefined instruction exception to the guest.
+	 * Similarly for trapped SVE accesses.
 	 */
-	if (system_supports_fpsimd() &&
-	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
-		return __hyp_switch_fpsimd(vcpu);
+	guest_has_sve = vcpu_has_sve(vcpu);
+	if (__hyp_trap_is_fpsimd(vcpu, guest_has_sve))
+		return __hyp_switch_fpsimd(vcpu, guest_has_sve);
 
 	if (!__populate_fault_info(vcpu))
 		return true;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 14/23] KVM: Allow 2048-bit register access via ioctl interface
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

The Arm SVE architecture defines registers that are up to 2048 bits
in size (with some possibility of further future expansion).

In order to avoid the need for an excessively large number of
ioctls when saving and restoring a vcpu's registers, this patch
adds a #define to make support for individual 2048-bit registers
through the KVM_{GET,SET}_ONE_REG ioctl interface official.  This
will allow each SVE register to be accessed in a single call.

There are sufficient spare bits in the register id size field for
this change, so there is no ABI impact providing that
KVM_GET_REG_LIST does not enumerate any 2048-bit register unless
userspace explicitly opts in to the relevant architecture-specific
features.
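
For reference, the size field encodes log2 of the register size in
bytes, so the new value corresponds to 256 bytes (2048 bits).  A
sketch of the decoding, using the existing KVM_REG_SIZE_MASK and
KVM_REG_SIZE_SHIFT definitions:

    u64 id = KVM_REG_SIZE_U2048;    /* size field == 8 */
    size_t bytes = 1UL << ((id & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT);
    /* bytes == 256, i.e. 2048 bits */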

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 include/uapi/linux/kvm.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 251be35..7c3c5cc 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1110,6 +1110,7 @@ struct kvm_dirty_tlb {
 #define KVM_REG_SIZE_U256	0x0050000000000000ULL
 #define KVM_REG_SIZE_U512	0x0060000000000000ULL
 #define KVM_REG_SIZE_U1024	0x0070000000000000ULL
+#define KVM_REG_SIZE_U2048	0x0080000000000000ULL
 
 struct kvm_reg_list {
 	__u64 n; /* number of regs */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 15/23] KVM: arm64/sve: Add SVE support to register access ioctl interface
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

This patch adds the following registers for access via the
KVM_{GET,SET}_ONE_REG interface:

 * KVM_REG_ARM64_SVE_ZREG(n, i) (n = 0..31) (in 2048-bit slices)
 * KVM_REG_ARM64_SVE_PREG(n, i) (n = 0..15) (in 256-bit slices)
 * KVM_REG_ARM64_SVE_FFR(i) (in 256-bit slices)

In order to adapt gracefully to future architectural extensions,
the registers are divided up into slices as noted above:  the i
parameter denotes the slice index.

For simplicity, bits or slices that exceed the maximum vector
length supported for the vcpu are ignored for KVM_SET_ONE_REG, and
read as zero for KVM_GET_ONE_REG.

For the current architecture, only slice i = 0 is significant.  The
interface design allows i to increase to up to 31 in the future if
required by future architectural amendments.

The registers are only visible for vcpus that have SVE enabled.
They are not enumerated by KVM_GET_REG_LIST on vcpus that do not
have SVE.  In all cases, surplus slices are not enumerated by
KVM_GET_REG_LIST.

Access to the FPSIMD registers via KVM_REG_ARM_CORE is not
allowed for SVE-enabled vcpus: SVE-aware userspace can use the
KVM_REG_ARM64_SVE_ZREG() interface instead to access the same
register state.  This avoids some complex and pointless emulation
in the kernel.
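
For illustration (matching the macros in the hunk below: bits [4:0]
carry the slice index, bits [9:5] the register number, and bit 10
selects the predicate block), an ID can be decoded as:

    u64 id = KVM_REG_ARM64_SVE_ZREG(3, 0);    /* slice 0 of Z3 */
    unsigned int n = (id >> 5) & 0x1f;        /* register number: 3 */
    unsigned int i = id & 0x1f;               /* slice index: 0 */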

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---

Changes since RFCv1:

 * Refactored to remove emulation of FPSIMD registers with the SVE
   register view and vice-versa.  This simplifies the code a fair bit.

 * Fixed a couple of range errors.

 * Inlined various trivial helpers that now have only one call site.

 * Use KVM_REG_SIZE() as a symbolic way of getting SVE register slice
   sizes.
---
 arch/arm64/include/uapi/asm/kvm.h |  10 +++
 arch/arm64/kvm/guest.c            | 147 ++++++++++++++++++++++++++++++++++----
 2 files changed, 145 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 97c3478..1ff68fa 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -226,6 +226,16 @@ struct kvm_vcpu_events {
 					 KVM_REG_ARM_FW | ((r) & 0xffff))
 #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
 
+/* SVE registers */
+#define KVM_REG_ARM64_SVE		(0x15 << KVM_REG_ARM_COPROC_SHIFT)
+#define KVM_REG_ARM64_SVE_ZREG(n, i)	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
+					 KVM_REG_SIZE_U2048 |		\
+					 ((n) << 5) | (i))
+#define KVM_REG_ARM64_SVE_PREG(n, i)	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
+					 KVM_REG_SIZE_U256 |		\
+					 ((n) << 5) | (i) | 0x400)
+#define KVM_REG_ARM64_SVE_FFR(i)	KVM_REG_ARM64_SVE_PREG(16, i)
+
 /* Device Control API: ARM VGIC */
 #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
 #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS	1
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 953a5c9..320db0f 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -21,6 +21,7 @@
 
 #include <linux/errno.h>
 #include <linux/err.h>
+#include <linux/kernel.h>
 #include <linux/kvm_host.h>
 #include <linux/module.h>
 #include <linux/vmalloc.h>
@@ -28,9 +29,12 @@
 #include <kvm/arm_psci.h>
 #include <asm/cputype.h>
 #include <linux/uaccess.h>
+#include <asm/fpsimd.h>
 #include <asm/kvm.h>
 #include <asm/kvm_emulate.h>
 #include <asm/kvm_coproc.h>
+#include <asm/kvm_host.h>
+#include <asm/sigcontext.h>
 
 #include "trace.h"
 
@@ -57,6 +61,12 @@ static u64 core_reg_offset_from_id(u64 id)
 	return id & ~(KVM_REG_ARCH_MASK | KVM_REG_SIZE_MASK | KVM_REG_ARM_CORE);
 }
 
+static bool core_reg_offset_is_vreg(u64 off)
+{
+	return off >= KVM_REG_ARM_CORE_REG(fp_regs.vregs) &&
+		off < KVM_REG_ARM_CORE_REG(fp_regs.fpsr);
+}
+
 static int get_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 {
 	/*
@@ -76,6 +86,13 @@ static int get_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	    (off + (KVM_REG_SIZE(reg->id) / sizeof(__u32))) >= nr_regs)
 		return -ENOENT;
 
+	/*
+	 * For SVE-enabled vcpus, access to the FPSIMD V-regs must use
+	 * KVM_REG_ARM64_SVE instead:
+	 */
+	if (vcpu_has_sve(vcpu) && core_reg_offset_is_vreg(off))
+		return -EINVAL;
+
 	if (copy_to_user(uaddr, ((u32 *)regs) + off, KVM_REG_SIZE(reg->id)))
 		return -EFAULT;
 
@@ -98,6 +115,13 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	    (off + (KVM_REG_SIZE(reg->id) / sizeof(__u32))) >= nr_regs)
 		return -ENOENT;
 
+	/*
+	 * For SVE-enabled vcpus, access to the FPSIMD V-regs must use
+	 * KVM_REG_ARM64_SVE instead:
+	 */
+	if (vcpu_has_sve(vcpu) && core_reg_offset_is_vreg(off))
+		return -EINVAL;
+
 	if (KVM_REG_SIZE(reg->id) > sizeof(tmp))
 		return -EINVAL;
 
@@ -130,6 +154,107 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	return err;
 }
 
+struct kreg_region {
+	char *kptr;
+	size_t size;
+	size_t zeropad;
+};
+
+#define SVE_REG_SLICE_SHIFT	0
+#define SVE_REG_SLICE_BITS	5
+#define SVE_REG_ID_SHIFT	(SVE_REG_SLICE_SHIFT + SVE_REG_SLICE_BITS)
+#define SVE_REG_ID_BITS		5
+
+#define SVE_REG_SLICE_MASK \
+	(GENMASK(SVE_REG_SLICE_BITS - 1, 0) << SVE_REG_SLICE_SHIFT)
+#define SVE_REG_ID_MASK	\
+	(GENMASK(SVE_REG_ID_BITS - 1, 0) << SVE_REG_ID_SHIFT)
+
+#define SVE_NUM_SLICES (1 << SVE_REG_SLICE_BITS)
+
+static int sve_reg_region(struct kreg_region *b,
+			  const struct kvm_vcpu *vcpu,
+			  const struct kvm_one_reg *reg)
+{
+	const unsigned int vl = vcpu->arch.sve_max_vl;
+	const unsigned int vq = sve_vq_from_vl(vl);
+
+	const unsigned int reg_num =
+		(reg->id & SVE_REG_ID_MASK) >> SVE_REG_ID_SHIFT;
+	const unsigned int slice_num =
+		(reg->id & SVE_REG_SLICE_MASK) >> SVE_REG_SLICE_SHIFT;
+
+	unsigned int slice_size, offset, limit;
+
+	if (reg->id >= KVM_REG_ARM64_SVE_ZREG(0, 0) &&
+	    reg->id <= KVM_REG_ARM64_SVE_ZREG(SVE_NUM_ZREGS - 1,
+					      SVE_NUM_SLICES - 1)) {
+		slice_size = KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0));
+
+		/* Compute start and end of the register: */
+		offset = SVE_SIG_ZREG_OFFSET(vq, reg_num) - SVE_SIG_REGS_OFFSET;
+		limit = offset + SVE_SIG_ZREG_SIZE(vq);
+
+		offset += slice_size * slice_num; /* start of requested slice */
+
+	} else if (reg->id >= KVM_REG_ARM64_SVE_PREG(0, 0) &&
+		   reg->id <= KVM_REG_ARM64_SVE_FFR(SVE_NUM_SLICES - 1)) {
+		/* (FFR is P16 for our purposes) */
+
+		slice_size = KVM_REG_SIZE(KVM_REG_ARM64_SVE_PREG(0, 0));
+
+		/* Compute start and end of the register: */
+		offset = SVE_SIG_PREG_OFFSET(vq, reg_num) - SVE_SIG_REGS_OFFSET;
+		limit = offset + SVE_SIG_PREG_SIZE(vq);
+
+		offset += slice_size * slice_num; /* start of requested slice */
+
+	} else {
+		return -ENOENT;
+	}
+
+	b->kptr = (char *)vcpu->arch.sve_state + offset;
+
+	/*
+	 * If the slice starts after the end of the reg, just pad.
+	 * Otherwise, copy as much as possible up to slice_size and pad
+	 * the remainder:
+	 */
+	b->size = offset >= limit ? 0 : min(limit - offset, slice_size);
+	b->zeropad = slice_size - b->size;
+
+	return 0;
+}
+
+static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
+{
+	struct kreg_region kreg;
+	char __user *uptr = (char __user *)reg->addr;
+
+	if (!vcpu_has_sve(vcpu) || sve_reg_region(&kreg, vcpu, reg))
+		return -ENOENT;
+
+	if (copy_to_user(uptr, kreg.kptr, kreg.size) ||
+	    clear_user(uptr + kreg.size, kreg.zeropad))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int set_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
+{
+	struct kreg_region kreg;
+	char __user *uptr = (char __user *)reg->addr;
+
+	if (!vcpu_has_sve(vcpu) || sve_reg_region(&kreg, vcpu, reg))
+		return -ENOENT;
+
+	if (copy_from_user(kreg.kptr, uptr, kreg.size))
+		return -EFAULT;
+
+	return 0;
+}
+
 int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 {
 	return -EINVAL;
@@ -251,12 +376,11 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
 		return -EINVAL;
 
-	/* Register group 16 means we want a core register. */
-	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
-		return get_core_reg(vcpu, reg);
-
-	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
-		return kvm_arm_get_fw_reg(vcpu, reg);
+	switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
+	case KVM_REG_ARM_CORE:	return get_core_reg(vcpu, reg);
+	case KVM_REG_ARM_FW:	return kvm_arm_get_fw_reg(vcpu, reg);
+	case KVM_REG_ARM64_SVE:	return get_sve_reg(vcpu, reg);
+	}
 
 	if (is_timer_reg(reg->id))
 		return get_timer_reg(vcpu, reg);
@@ -270,12 +394,11 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
 		return -EINVAL;
 
-	/* Register group 16 means we set a core register. */
-	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
-		return set_core_reg(vcpu, reg);
-
-	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
-		return kvm_arm_set_fw_reg(vcpu, reg);
+	switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
+	case KVM_REG_ARM_CORE:	return set_core_reg(vcpu, reg);
+	case KVM_REG_ARM_FW:	return kvm_arm_set_fw_reg(vcpu, reg);
+	case KVM_REG_ARM64_SVE:	return set_sve_reg(vcpu, reg);
+	}
 
 	if (is_timer_reg(reg->id))
 		return set_timer_reg(vcpu, reg);
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 16/23] KVM: arm64: Enumerate SVE register indices for KVM_GET_REG_LIST
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

This patch includes the SVE register IDs in the list returned by
KVM_GET_REG_LIST, as appropriate.

On a non-SVE-enabled vcpu, no extra IDs are added.

On an SVE-enabled vcpu, the appropriate number of slice IDs are
enumerated for each SVE register, depending on the maximum vector
length for the vcpu.
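
Illustratively (not part of the patch): since a 2048-bit slice covers
the whole of each Z-register for all currently architected vector
lengths, slices == 1 and an SVE-enabled vcpu adds 49 IDs to the list:

    /* Sketch: slices == 1 while sve_max_vl <= 2048 bits (256 bytes) */
    unsigned int slices = DIV_ROUND_UP(sve_max_vl, 2048 / 8);
    unsigned int n_ids = slices * (SVE_NUM_ZREGS + SVE_NUM_PREGS + 1);
    /* n_ids == 49: Z0-Z31, P0-P15 and FFR */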

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---

Changes since RFCv1:

 * Simplify enumerate_sve_regs() based on Andrew Jones' approach.

 * Reg copying loops are inverted for brevity, since the order we
   spit out the regs in doesn't really matter.

(I tried to keep part of my approach to avoid the duplicate logic
between num_sve_regs() and copy_sve_reg_indices(), but although
it works in principle, gcc fails to fully collapse the num_regs()
case... so I gave up.  The two functions need to be manually kept
consistent, but hopefully that's fairly straightforward.)
---
 arch/arm64/kvm/guest.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 320db0f..89eab68 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -323,6 +323,46 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	return copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)) ? -EFAULT : 0;
 }
 
+static unsigned long num_sve_regs(const struct kvm_vcpu *vcpu)
+{
+	const unsigned int slices = DIV_ROUND_UP(
+		vcpu->arch.sve_max_vl,
+		KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));
+
+	if (!vcpu_has_sve(vcpu))
+		return 0;
+
+	return slices * (SVE_NUM_PREGS + SVE_NUM_ZREGS + 1 /* FFR */);
+}
+
+static int copy_sve_reg_indices(const struct kvm_vcpu *vcpu, u64 __user **uind)
+{
+	const unsigned int slices = DIV_ROUND_UP(
+		vcpu->arch.sve_max_vl,
+		KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));
+	unsigned int i, n;
+
+	if (!vcpu_has_sve(vcpu))
+		return 0;
+
+	for (i = 0; i < slices; i++) {
+		for (n = 0; n < SVE_NUM_ZREGS; n++) {
+			if (put_user(KVM_REG_ARM64_SVE_ZREG(n, i), (*uind)++))
+				return -EFAULT;
+		}
+
+		for (n = 0; n < SVE_NUM_PREGS; n++) {
+			if (put_user(KVM_REG_ARM64_SVE_PREG(n, i), (*uind)++))
+				return -EFAULT;
+		}
+
+		if (put_user(KVM_REG_ARM64_SVE_FFR(i), (*uind)++))
+			return -EFAULT;
+	}
+
+	return 0;
+}
+
 /**
  * kvm_arm_num_regs - how many registers do we present via KVM_GET_ONE_REG
  *
@@ -333,6 +373,7 @@ unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
 	unsigned long res = 0;
 
 	res += num_core_regs();
+	res += num_sve_regs(vcpu);
 	res += kvm_arm_num_sys_reg_descs(vcpu);
 	res += kvm_arm_get_fw_num_regs(vcpu);
 	res += NUM_TIMER_REGS;
@@ -357,6 +398,10 @@ int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
 		uindices++;
 	}
 
+	ret = copy_sve_reg_indices(vcpu, &uindices);
+	if (ret)
+		return ret;
+
 	ret = kvm_arm_copy_fw_reg_indices(vcpu, uindices);
 	if (ret)
 		return ret;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 17/23] arm64/sve: In-kernel vector length availability query interface
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

KVM will need to interrogate the set of SVE vector lengths
available on the system.

This patch exposes the relevant bits to the kernel, along with a
sve_vq_available() helper to check whether a particular vector
length is supported.

vq_to_bit() and bit_to_vq() are not intended for use outside these
functions, so they are given a __ prefix to warn people not to use
them unless they really know what they are doing.
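
As a worked example (illustrative only): with SVE_VQ_MAX == 512, the
mapping is inverted so that larger vector lengths occupy lower bit
indices, e.g. __vq_to_bit(16) == 496 (VL 2048) and __vq_to_bit(1) ==
511 (VL 128).  find_next_bit() returns the lowest set bit at or after
the start index, i.e. the largest available VQ not exceeding vq:

    bit = find_next_bit(sve_vq_map, SVE_VQ_MAX, __vq_to_bit(vq));
    vq = __bit_to_vq(bit);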

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/include/asm/fpsimd.h | 29 +++++++++++++++++++++++++++++
 arch/arm64/kernel/fpsimd.c      | 35 ++++++++---------------------------
 2 files changed, 37 insertions(+), 27 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index df7a143..ad6d2e4 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -24,10 +24,13 @@
 
 #ifndef __ASSEMBLY__
 
+#include <linux/bitmap.h>
 #include <linux/build_bug.h>
+#include <linux/bug.h>
 #include <linux/cache.h>
 #include <linux/init.h>
 #include <linux/stddef.h>
+#include <linux/types.h>
 
 #if defined(__KERNEL__) && defined(CONFIG_COMPAT)
 /* Masks for extracting the FPSR and FPCR from the FPSCR */
@@ -89,6 +92,32 @@ extern u64 read_zcr_features(void);
 
 extern int __ro_after_init sve_max_vl;
 extern int __ro_after_init sve_max_virtualisable_vl;
+/* Set of available vector lengths, as vq_to_bit(vq): */
+extern __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
+
+/*
+ * Helpers to translate bit indices in sve_vq_map to VQ values (and
+ * vice versa).  This allows find_next_bit() to be used to find the
+ * _maximum_ VQ not exceeding a certain value.
+ */
+static inline unsigned int __vq_to_bit(unsigned int vq)
+{
+	return SVE_VQ_MAX - vq;
+}
+
+static inline unsigned int __bit_to_vq(unsigned int bit)
+{
+	if (WARN_ON(bit >= SVE_VQ_MAX))
+		bit = SVE_VQ_MAX - 1;
+
+	return SVE_VQ_MAX - bit;
+}
+
+/* Ensure vq >= SVE_VQ_MIN && vq <= SVE_VQ_MAX before calling this function */
+static inline bool sve_vq_available(unsigned int vq)
+{
+	return test_bit(__vq_to_bit(vq), sve_vq_map);
+}
 
 #ifdef CONFIG_ARM64_SVE
 
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 60c5e28..cc5a495 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -136,7 +136,7 @@ static int sve_default_vl = -1;
 int __ro_after_init sve_max_vl = SVE_VL_MIN;
 int __ro_after_init sve_max_virtualisable_vl = SVE_VL_MIN;
 /* Set of available vector lengths, as vq_to_bit(vq): */
-static __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
+__ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
 /* Set of vector lengths present on at least one cpu: */
 static __ro_after_init DECLARE_BITMAP(sve_vq_partial_map, SVE_VQ_MAX);
 static void __percpu *efi_sve_state;
@@ -270,25 +270,6 @@ void fpsimd_save(void)
 }
 
 /*
- * Helpers to translate bit indices in sve_vq_map to VQ values (and
- * vice versa).  This allows find_next_bit() to be used to find the
- * _maximum_ VQ not exceeding a certain value.
- */
-
-static unsigned int vq_to_bit(unsigned int vq)
-{
-	return SVE_VQ_MAX - vq;
-}
-
-static unsigned int bit_to_vq(unsigned int bit)
-{
-	if (WARN_ON(bit >= SVE_VQ_MAX))
-		bit = SVE_VQ_MAX - 1;
-
-	return SVE_VQ_MAX - bit;
-}
-
-/*
  * All vector length selection from userspace comes through here.
  * We're on a slow path, so some sanity-checks are included.
  * If things go wrong there's a bug somewhere, but try to fall back to a
@@ -309,8 +290,8 @@ static unsigned int find_supported_vector_length(unsigned int vl)
 		vl = max_vl;
 
 	bit = find_next_bit(sve_vq_map, SVE_VQ_MAX,
-			    vq_to_bit(sve_vq_from_vl(vl)));
-	return sve_vl_from_vq(bit_to_vq(bit));
+			    __vq_to_bit(sve_vq_from_vl(vl)));
+	return sve_vl_from_vq(__bit_to_vq(bit));
 }
 
 #ifdef CONFIG_SYSCTL
@@ -651,7 +632,7 @@ static void sve_probe_vqs(DECLARE_BITMAP(map, SVE_VQ_MAX))
 		write_sysreg_s(zcr | (vq - 1), SYS_ZCR_EL1); /* self-syncing */
 		vl = sve_get_vl();
 		vq = sve_vq_from_vl(vl); /* skip intervening lengths */
-		set_bit(vq_to_bit(vq), map);
+		set_bit(__vq_to_bit(vq), map);
 	}
 }
 
@@ -712,7 +693,7 @@ int sve_verify_vq_map(void)
 	 * Mismatches above sve_max_virtualisable_vl are fine, since
 	 * no guest is allowed to configure ZCR_EL2.LEN to exceed this:
 	 */
-	if (sve_vl_from_vq(bit_to_vq(b)) <= sve_max_virtualisable_vl) {
+	if (sve_vl_from_vq(__bit_to_vq(b)) <= sve_max_virtualisable_vl) {
 		pr_warn("SVE: cpu%d: Unsupported vector length(s) present\n",
 			smp_processor_id());
 		goto error;
@@ -798,8 +779,8 @@ void __init sve_setup(void)
 	 * so sve_vq_map must have at least SVE_VQ_MIN set.
 	 * If something went wrong, at least try to patch it up:
 	 */
-	if (WARN_ON(!test_bit(vq_to_bit(SVE_VQ_MIN), sve_vq_map)))
-		set_bit(vq_to_bit(SVE_VQ_MIN), sve_vq_map);
+	if (WARN_ON(!test_bit(__vq_to_bit(SVE_VQ_MIN), sve_vq_map)))
+		set_bit(__vq_to_bit(SVE_VQ_MIN), sve_vq_map);
 
 	zcr = read_sanitised_ftr_reg(SYS_ZCR_EL1);
 	sve_max_vl = sve_vl_from_vq((zcr & ZCR_ELx_LEN_MASK) + 1);
@@ -828,7 +809,7 @@ void __init sve_setup(void)
 		/* No virtualisable VLs?  This is architecturally forbidden. */
 		sve_max_virtualisable_vl = SVE_VQ_MIN;
 	else /* b + 1 < SVE_VQ_MAX */
-		sve_max_virtualisable_vl = sve_vl_from_vq(bit_to_vq(b + 1));
+		sve_max_virtualisable_vl = sve_vl_from_vq(__bit_to_vq(b + 1));
 
 	if (sve_max_virtualisable_vl > sve_max_vl)
 		sve_max_virtualisable_vl = sve_max_vl;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 18/23] KVM: arm64: Add arch vcpu ioctl hook
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

To enable arm64-specific vcpu ioctls to be added cleanly, this
patch adds a kvm_arm_arch_vcpu_ioctl() hook so that these don't
pollute the common code.

No functional change: the -EINVAL return for unknown ioctls is
retained, though it may or may not be intentional (KVM returns
-ENXIO in various other similar contexts).
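
To illustrate the intent (a sketch only, mirroring what a later patch
in this series does; KVM_ARM_SVE_CONFIG and kvm_vcpu_sve_config() are
introduced in patch 19), an arch-specific ioctl can then be routed
without touching virt/kvm/arm/arm.c again:

    int kvm_arm_arch_vcpu_ioctl(struct kvm_vcpu *vcpu,
                                unsigned int ioctl, unsigned long arg)
    {
        switch (ioctl) {
        case KVM_ARM_SVE_CONFIG:
            return kvm_vcpu_sve_config(vcpu, (void __user *)arg);
        default:
            return -EINVAL;
        }
    }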

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm/include/asm/kvm_host.h   | 7 +++++++
 arch/arm64/include/asm/kvm_host.h | 2 ++
 arch/arm64/kvm/guest.c            | 6 ++++++
 virt/kvm/arm/arm.c                | 2 +-
 4 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index c36760b..df2659d 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -19,6 +19,7 @@
 #ifndef __ARM_KVM_HOST_H__
 #define __ARM_KVM_HOST_H__
 
+#include <linux/errno.h>
 #include <linux/types.h>
 #include <linux/kvm_types.h>
 #include <asm/cputype.h>
@@ -278,6 +279,12 @@ static inline int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
 	return 0;
 }
 
+static inline int kvm_arm_arch_vcpu_ioctl(struct kvm_vcpu *vcpu,
+	unsigned int ioctl, unsigned long arg)
+{
+	return -EINVAL;
+}
+
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 8e9cd43..bbde597 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -55,6 +55,8 @@ DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
 int __attribute_const__ kvm_target_cpu(void);
 int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
 int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext);
+int kvm_arm_arch_vcpu_ioctl(struct kvm_vcpu *vcpu,
+			    unsigned int ioctl, unsigned long arg);
 void __extended_idmap_trampoline(phys_addr_t boot_pgd, phys_addr_t idmap_start);
 
 struct kvm_arch {
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 89eab68..331b85e 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -546,6 +546,12 @@ int kvm_vcpu_preferred_target(struct kvm_vcpu_init *init)
 	return 0;
 }
 
+int kvm_arm_arch_vcpu_ioctl(struct kvm_vcpu *vcpu,
+			    unsigned int ioctl, unsigned long arg)
+{
+	return -EINVAL;
+}
+
 int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
 {
 	return -EINVAL;
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 1418af9..6e894a8 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -1181,7 +1181,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		return kvm_arm_vcpu_set_events(vcpu, &events);
 	}
 	default:
-		r = -EINVAL;
+		r = kvm_arm_arch_vcpu_ioctl(vcpu, ioctl, arg);
 	}
 
 	return r;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 19/23] KVM: arm64/sve: Report and enable SVE API extensions for userspace
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

This patch adds the necessary API extensions to allow userspace to
detect SVE support for guests and enable it.

A new capability KVM_CAP_ARM_SVE is defined to allow userspace to
detect the availability of the KVM SVE API extensions in the usual
way.

Userspace needs to enable SVE explicitly per vcpu and configure the
set of SVE vector lengths available to the guest before the vcpu is
allowed to run.  For these purposes, a new arm64-specific vcpu
ioctl KVM_ARM_SVE_CONFIG is added, with the following subcommands
(in rough order of expected use):

KVM_ARM_SVE_CONFIG_QUERY: report the set of vector lengths
    supported by this host.

    The resulting set can be supplied directly to
    KVM_ARM_SVE_CONFIG_SET in order to obtain the maximal possible
    set, or used to inform userspace's decision on the appropriate
    set of vector lengths (possibly taking into account the
    configuration of other nodes in the cluster so that the VM can
    migrate freely).

KVM_ARM_SVE_CONFIG_SET: enable SVE for this vcpu and configure the
    set of vector lengths it offers to the guest.

    This can only be done once, before the vcpu is run.

KVM_ARM_SVE_CONFIG_GET: report the set of vector lengths available
    to the guest on this vcpu (for use when snapshotting or
    migrating a VM).
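
A sketch of the expected userspace flow (illustrative only: error
handling is omitted, vcpu_fd comes from KVM_CREATE_VCPU, and the
KVM_ARM_SVE_CONFIG ioctl number itself is defined elsewhere in this
patch):

    struct kvm_sve_vls vls;

    memset(&vls, 0, sizeof(vls));
    vls.cmd = KVM_ARM_SVE_CONFIG_QUERY;
    ioctl(vcpu_fd, KVM_ARM_SVE_CONFIG, &vls);  /* what can the host offer? */

    /* Optionally trim vls.max_vq/required_vqs here (e.g. to match the
     * other hosts in a migration pool), then enable SVE before the
     * vcpu first runs:
     */
    vls.cmd = KVM_ARM_SVE_CONFIG_SET;
    ioctl(vcpu_fd, KVM_ARM_SVE_CONFIG, &vls);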

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---

Changes since RFCv1:

 * The new feature bit for PREFERRED_TARGET / VCPU_INIT is gone in
   favour of a capability and a new ioctl to enable/configure SVE.

   Perhaps the SVE configuration could be done via device attributes,
   but it still has to be done early, so crowbarring support for this
   behind a generic API may cause more trouble than it solves.

   This is still up for discussion if anybody feels strongly about it.

 * An ioctl KVM_ARM_SVE_CONFIG has been added to report the set of
   vector lengths available and configure SVE for a vcpu.

   To reduce ioctl namespace pollution the new operations are grouped
   as subcommands under a single ioctl, since they use the same
   argument format anyway.
---
 arch/arm64/include/asm/kvm_host.h |   8 +-
 arch/arm64/include/uapi/asm/kvm.h |  14 ++++
 arch/arm64/kvm/guest.c            | 164 +++++++++++++++++++++++++++++++++++++-
 arch/arm64/kvm/reset.c            |  50 ++++++++++++
 include/uapi/linux/kvm.h          |   4 +
 5 files changed, 238 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index bbde597..5225485 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -52,6 +52,12 @@
 
 DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
 
+#ifdef CONFIG_ARM64_SVE
+bool kvm_sve_supported(void);
+#else
+static inline bool kvm_sve_supported(void) { return false; }
+#endif
+
 int __attribute_const__ kvm_target_cpu(void);
 int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
 int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext);
@@ -441,7 +447,7 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
 
-static inline void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
+void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu);
 
 void kvm_arm_init_debug(void);
 void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 1ff68fa..94f6932 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -32,6 +32,7 @@
 #define KVM_NR_SPSR	5
 
 #ifndef __ASSEMBLY__
+#include <linux/kernel.h>
 #include <linux/psci.h>
 #include <linux/types.h>
 #include <asm/ptrace.h>
@@ -108,6 +109,19 @@ struct kvm_vcpu_init {
 	__u32 features[7];
 };
 
+/* Vector length set for KVM_ARM_SVE_CONFIG */
+struct kvm_sve_vls {
+	__u16 cmd;
+	__u16 max_vq;
+	__u16 _reserved[2];
+	__u64 required_vqs[__KERNEL_DIV_ROUND_UP(SVE_VQ_MAX - SVE_VQ_MIN + 1, 64)];
+};
+
+/* values for cmd: */
+#define KVM_ARM_SVE_CONFIG_QUERY	0 /* query what the host can support */
+#define KVM_ARM_SVE_CONFIG_SET		1 /* enable SVE for vcpu and set VLs */
+#define KVM_ARM_SVE_CONFIG_GET		2 /* read the set of VLs for a vcpu */
+
 struct kvm_sregs {
 };
 
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 331b85e..d96145a 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -26,6 +26,9 @@
 #include <linux/module.h>
 #include <linux/vmalloc.h>
 #include <linux/fs.h>
+#include <linux/slab.h>
+#include <linux/string.h>
+#include <linux/types.h>
 #include <kvm/arm_psci.h>
 #include <asm/cputype.h>
 #include <linux/uaccess.h>
@@ -56,6 +59,11 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
+{
+	kfree(vcpu->arch.sve_state);
+}
+
 static u64 core_reg_offset_from_id(u64 id)
 {
 	return id & ~(KVM_REG_ARCH_MASK | KVM_REG_SIZE_MASK | KVM_REG_ARM_CORE);
@@ -546,10 +554,164 @@ int kvm_vcpu_preferred_target(struct kvm_vcpu_init *init)
 	return 0;
 }
 
+#define VQS_PER_U64 64
+#define vq_word(vqs, vq) (&(vqs)[((vq) - SVE_VQ_MIN) / VQS_PER_U64])
+#define vq_mask(vq) ((u64)1 << (((vq) - SVE_VQ_MIN) % VQS_PER_U64))
+
+static void set_vq(u64 *vqs, unsigned int vq)
+{
+	*vq_word(vqs, vq) |= vq_mask(vq);
+}
+
+static bool vq_set(const u64 *vqs, unsigned int vq)
+{
+	return *vq_word(vqs, vq) & vq_mask(vq);
+}
+
+static int kvm_vcpu_set_sve_vls(struct kvm_vcpu *vcpu, struct kvm_sve_vls *vls,
+		struct kvm_sve_vls __user *userp)
+{
+	unsigned int vq, max_vq;
+	int ret;
+
+	if (vcpu->arch.has_run_once || vcpu_has_sve(vcpu))
+		return -EBADFD; /* too late, or already configured */
+
+	BUG_ON(vcpu->arch.sve_max_vl || vcpu->arch.sve_state);
+
+	if (vls->max_vq < SVE_VQ_MIN || vls->max_vq > SVE_VQ_MAX)
+		return -EINVAL;
+
+	max_vq = 0;
+	for (vq = SVE_VQ_MIN; vq <= vls->max_vq; ++vq) {
+		bool available = sve_vq_available(vq);
+		bool required = vq_set(vls->required_vqs, vq);
+
+		if (required != available)
+			break;
+
+		if (required)
+			max_vq = vq;
+	}
+
+	if (max_vq < SVE_VQ_MIN)
+		return -EINVAL;
+
+	vls->max_vq = max_vq;
+	ret = put_user(vls->max_vq, &userp->max_vq);
+	if (ret)
+		return ret;
+
+	/*
+	 * kvm_reset_vcpu() may already have run in KVM_VCPU_INIT, so we
+	 * rely on kzalloc() being sufficient to reset the guest SVE
+	 * state here for a new vcpu.
+	 *
+	 * Subsequent resets after vcpu initialisation are handled by
+	 * kvm_reset_sve().
+	 */
+	vcpu->arch.sve_state = kzalloc(SVE_SIG_REGS_SIZE(vls->max_vq),
+				       GFP_KERNEL);
+	if (!vcpu->arch.sve_state)
+		return -ENOMEM;
+
+	vcpu->arch.flags |= KVM_ARM64_GUEST_HAS_SVE;
+	vcpu->arch.sve_max_vl = sve_vl_from_vq(vls->max_vq);
+
+	return 0;
+}
+
+static int __kvm_vcpu_query_sve_vls(struct kvm_sve_vls *vls,
+		unsigned int max_vq, struct kvm_sve_vls __user *userp)
+{
+	unsigned int vq, max_available_vq;
+
+	memset(&vls->required_vqs, 0, sizeof(vls->required_vqs));
+
+	BUG_ON(max_vq < SVE_VQ_MIN || max_vq > SVE_VQ_MAX);
+
+	max_available_vq = 0;
+	for (vq = SVE_VQ_MIN; vq <= max_vq; ++vq)
+		if (sve_vq_available(vq)) {
+			set_vq(vls->required_vqs, vq);
+			max_available_vq = vq;
+		}
+
+	if (WARN_ON(max_available_vq < SVE_VQ_MIN))
+		return -EIO;
+
+	vls->max_vq = max_available_vq;
+	if (copy_to_user(userp, vls, sizeof(*vls)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int kvm_vcpu_query_sve_vls(struct kvm_vcpu *vcpu, struct kvm_sve_vls *vls,
+		struct kvm_sve_vls __user *userp)
+{
+	BUG_ON(!sve_vl_valid(sve_max_vl));
+
+	return __kvm_vcpu_query_sve_vls(vls,
+			sve_vq_from_vl(sve_max_vl), userp);
+}
+
+static int kvm_vcpu_get_sve_vls(struct kvm_vcpu *vcpu, struct kvm_sve_vls *vls,
+		struct kvm_sve_vls __user *userp)
+{
+	if (!vcpu_has_sve(vcpu))
+		return -EBADFD; /* not configured yet */
+
+	BUG_ON(!sve_vl_valid(vcpu->arch.sve_max_vl));
+
+	return __kvm_vcpu_query_sve_vls(vls,
+			sve_vq_from_vl(vcpu->arch.sve_max_vl), userp);
+}
+
+static int kvm_vcpu_sve_config(struct kvm_vcpu *vcpu,
+			       struct kvm_sve_vls __user *userp)
+{
+	struct kvm_sve_vls vls;
+
+	if (!kvm_sve_supported())
+		return -EINVAL;
+
+	if (copy_from_user(&vls, userp, sizeof(vls)))
+		return -EFAULT;
+
+	/*
+	 * For forwards compatibility, flush any set bits in _reserved[]
+	 * to tell userspace that we didn't look at them:
+	 */
+	memset(&vls._reserved, 0, sizeof vls._reserved);
+
+	switch (vls.cmd) {
+	case KVM_ARM_SVE_CONFIG_QUERY:
+		return kvm_vcpu_query_sve_vls(vcpu, &vls, userp);
+
+	case KVM_ARM_SVE_CONFIG_SET:
+		return kvm_vcpu_set_sve_vls(vcpu, &vls, userp);
+
+	case KVM_ARM_SVE_CONFIG_GET:
+		return kvm_vcpu_get_sve_vls(vcpu, &vls, userp);
+
+	default:
+		return -EINVAL;
+	}
+}
+
 int kvm_arm_arch_vcpu_ioctl(struct kvm_vcpu *vcpu,
 			    unsigned int ioctl, unsigned long arg)
 {
-	return -EINVAL;
+	void __user *userp = (void __user *)arg;
+
+	switch (ioctl) {
+	case KVM_ARM_SVE_CONFIG:
+		return kvm_vcpu_sve_config(vcpu, userp);
+
+	default:
+		return -EINVAL;
+	}
 }
 
 int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index e37c78b..c2edcde 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -19,10 +19,12 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <linux/atomic.h>
 #include <linux/errno.h>
 #include <linux/kvm_host.h>
 #include <linux/kvm.h>
 #include <linux/hw_breakpoint.h>
+#include <linux/string.h>
 
 #include <kvm/arm_arch_timer.h>
 
@@ -54,6 +56,31 @@ static bool cpu_has_32bit_el1(void)
 	return !!(pfr0 & 0x20);
 }
 
+#ifdef CONFIG_ARM64_SVE
+bool kvm_sve_supported(void)
+{
+	static bool warn_printed = false;
+
+	if (!system_supports_sve())
+		return false;
+
+	/*
+	 * For now, consider the hardware broken if implementation
+	 * differences between CPUs in the system result in the set of
+	 * vector lengths safely virtualisable for guests being less
+	 * than the set provided to userspace:
+	 */
+	if (sve_max_virtualisable_vl != sve_max_vl) {
+		if (!xchg(&warn_printed, true))
+			kvm_err("Hardware SVE implementations mismatched: suppressing SVE for guests.");
+
+		return false;
+	}
+
+	return true;
+}
+#endif
+
 /**
  * kvm_arch_dev_ioctl_check_extension
  *
@@ -85,6 +112,9 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_VCPU_EVENTS:
 		r = 1;
 		break;
+	case KVM_CAP_ARM_SVE:
+		r = kvm_sve_supported();
+		break;
 	default:
 		r = 0;
 	}
@@ -92,6 +122,21 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
 	return r;
 }
 
+int kvm_reset_sve(struct kvm_vcpu *vcpu)
+{
+	if (!vcpu_has_sve(vcpu))
+		return 0;
+
+	if (WARN_ON(!vcpu->arch.sve_state ||
+		    !sve_vl_valid(vcpu->arch.sve_max_vl)))
+		return -EIO;
+
+	memset(vcpu->arch.sve_state, 0,
+	       SVE_SIG_REGS_SIZE(sve_vq_from_vl(vcpu->arch.sve_max_vl)));
+
+	return 0;
+}
+
 /**
  * kvm_reset_vcpu - sets core registers and sys_regs to reset value
  * @vcpu: The VCPU pointer
@@ -103,6 +148,7 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
 int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 {
 	const struct kvm_regs *cpu_reset;
+	int ret;
 
 	switch (vcpu->arch.target) {
 	default:
@@ -120,6 +166,10 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 	/* Reset core registers */
 	memcpy(vcpu_gp_regs(vcpu), cpu_reset, sizeof(*cpu_reset));
 
+	ret = kvm_reset_sve(vcpu);
+	if (ret)
+		return ret;
+
 	/* Reset system registers */
 	kvm_reset_sys_regs(vcpu);
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 7c3c5cc..488ca56 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -953,6 +953,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_NESTED_STATE 157
 #define KVM_CAP_ARM_INJECT_SERROR_ESR 158
 #define KVM_CAP_MSR_PLATFORM_INFO 159
+#define KVM_CAP_ARM_SVE 160
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1400,6 +1401,9 @@ struct kvm_enc_region {
 #define KVM_GET_NESTED_STATE         _IOWR(KVMIO, 0xbe, struct kvm_nested_state)
 #define KVM_SET_NESTED_STATE         _IOW(KVMIO,  0xbf, struct kvm_nested_state)
 
+/* Available with KVM_CAP_ARM_SVE */
+#define KVM_ARM_SVE_CONFIG	  _IOWR(KVMIO,  0xc0, struct kvm_sve_vls)
+
 /* Secure Encrypted Virtualization command */
 enum sev_cmd_id {
 	/* Guest initialization commands */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 20/23] KVM: arm64: Add arch vm ioctl hook
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

To enable arm64-specific vm ioctls to be added cleanly, this patch
adds a kvm_arm_arch_vm_ioctl() hook so that these don't pollute the
common code.

No functional change.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm/include/asm/kvm_host.h   | 6 ++++++
 arch/arm64/include/asm/kvm_host.h | 2 ++
 arch/arm64/kvm/guest.c            | 6 ++++++
 virt/kvm/arm/arm.c                | 2 +-
 4 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index df2659d..0850fcd 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -285,6 +285,12 @@ static inline int kvm_arm_arch_vcpu_ioctl(struct vcpu *vcpu,
 	return -EINVAL;
 }
 
+static inline int kvm_arm_arch_vm_ioctl(struct kvm *kvm,
+	unsigned int ioctl, unsigned long arg)
+{
+	return -EINVAL;
+}
+
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 5225485..ae25f14 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -63,6 +63,8 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
 int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext);
 int kvm_arm_arch_vcpu_ioctl(struct kvm_vcpu *vcpu,
 			    unsigned int ioctl, unsigned long arg);
+int kvm_arm_arch_vm_ioctl(struct kvm *kvm,
+			  unsigned int ioctl, unsigned long arg);
 void __extended_idmap_trampoline(phys_addr_t boot_pgd, phys_addr_t idmap_start);
 
 struct kvm_arch {
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index d96145a..f066b17 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -714,6 +714,12 @@ int kvm_arm_arch_vcpu_ioctl(struct kvm_vcpu *vcpu,
 	}
 }
 
+int kvm_arm_arch_vm_ioctl(struct kvm *kvm,
+			  unsigned int ioctl, unsigned long arg)
+{
+	return -EINVAL;
+}
+
 int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
 {
 	return -EINVAL;
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 6e894a8..6582a38 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -1279,7 +1279,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		return 0;
 	}
 	default:
-		return -EINVAL;
+		return kvm_arm_arch_vm_ioctl(kvm, ioctl, arg);
 	}
 }
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 21/23] KVM: arm64/sve: allow KVM_ARM_SVE_CONFIG_QUERY on vm fd
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

Since userspace may need to decide on the set of vector lengths for
the guest before setting up a vm, it is onerous to require a vcpu
fd to be available first.  KVM_ARM_SVE_CONFIG_QUERY is not
vcpu-dependent anyway, so this patch wires up KVM_ARM_SVE_CONFIG to
be usable on a vm fd where appropriate.

Subcommands that are vcpu-dependent (currently
KVM_ARM_SVE_CONFIG_SET, KVM_ARM_SVE_CONFIG_GET) will return -EINVAL
if invoked on a vm fd.
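
An illustrative sketch of the resulting vm-fd usage (not part of the
patch; vm_fd is assumed to be a vm file descriptor returned by
KVM_CREATE_VM):

	struct kvm_sve_vls vls;

	memset(&vls, 0, sizeof vls);         /* zero cmd and _reserved[] */
	vls.cmd = KVM_ARM_SVE_CONFIG_QUERY;
	if (ioctl(vm_fd, KVM_ARM_SVE_CONFIG, &vls))
		err(1, "KVM_ARM_SVE_CONFIG");
	/* vls now describes the full set the host can support */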

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/kvm/guest.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index f066b17..2313c22 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -574,6 +574,9 @@ static int kvm_vcpu_set_sve_vls(struct kvm_vcpu *vcpu, struct kvm_sve_vls *vls,
 	unsigned int vq, max_vq;
 	int ret;
 
+	if (!vcpu)
+		return -EINVAL; /* per-vcpu operation on vm fd */
+
 	if (vcpu->arch.has_run_once || vcpu_has_sve(vcpu))
 		return -EBADFD; /* too late, or already configured */
 
@@ -659,6 +662,9 @@ static int kvm_vcpu_query_sve_vls(struct kvm_vcpu *vcpu, struct kvm_sve_vls *vls
 static int kvm_vcpu_get_sve_vls(struct kvm_vcpu *vcpu, struct kvm_sve_vls *vls,
 		struct kvm_sve_vls __user *userp)
 {
+	if (!vcpu)
+		return -EINVAL; /* per-vcpu operation on vm fd */
+
 	if (!vcpu_has_sve(vcpu))
 		return -EBADFD; /* not configured yet */
 
@@ -668,6 +674,7 @@ static int kvm_vcpu_get_sve_vls(struct kvm_vcpu *vcpu, struct kvm_sve_vls *vls,
 			sve_vq_from_vl(vcpu->arch.sve_max_vl), userp);
 }
 
+/* vcpu may be NULL if this is called via a vm fd */
 static int kvm_vcpu_sve_config(struct kvm_vcpu *vcpu,
 			       struct kvm_sve_vls __user *userp)
 {
@@ -717,7 +724,15 @@ int kvm_arm_arch_vcpu_ioctl(struct kvm_vcpu *vcpu,
 int kvm_arm_arch_vm_ioctl(struct kvm *kvm,
 			  unsigned int ioctl, unsigned long arg)
 {
-	return -EINVAL;
+	void __user *userp = (void __user *)arg;
+
+	switch (ioctl) {
+	case KVM_ARM_SVE_CONFIG:
+		return kvm_vcpu_sve_config(NULL, userp);
+
+	default:
+		return -EINVAL;
+	}
 }
 
 int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 22/23] KVM: Documentation: Document arm64 core registers in detail
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

Since the sizes of the members of the core arm64 registers vary, the
list of register encodings that make sense is not a simple linear
sequence.

To clarify which encodings to use, this patch adds a brief list
to the documentation.
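
To make the arithmetic concrete, here is an illustrative helper (not
part of the patch): the index is the member's offset into struct
kvm_regs counted in 32-bit words, so an ID for a 64-bit core register
can be built as follows:

	#include <linux/kvm.h>  /* KVM_REG_ARM64 etc.; pulls in asm/kvm.h */
	#include <stddef.h>     /* offsetof() */

	/* ID for a 64-bit member of struct kvm_regs at byte offset off */
	static __u64 core_reg_id_u64(size_t off)
	{
		return KVM_REG_ARM64 | KVM_REG_SIZE_U64 | KVM_REG_ARM_CORE |
			(off / sizeof(__u32));
	}

	/*
	 * core_reg_id_u64(offsetof(struct kvm_regs, regs.regs[1]))
	 * yields 0x6030000000100002, matching the X1 row below.
	 */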

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---

Draft only -- encodings not checked yet.
---
 Documentation/virtual/kvm/api.txt | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 647f941..a58067b 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2037,6 +2037,30 @@ contains elements ranging from 32 to 128 bits. The index is a 32bit
 value in the kvm_regs structure seen as a 32bit array.
   0x60x0 0000 0010 <index into the kvm_regs struct:16>
 
+Specifically:
+    Encoding            Register  Bits  kvm_regs member
+----------------------------------------------------------------
+  0x6030 0000 0010 0000 X0          64  regs.regs[0]
+  0x6030 0000 0010 0002 X1          64  regs.regs[1]
+    ...
+  0x6030 0000 0010 003c X30         64  regs.regs[30]
+  0x6030 0000 0010 003e SP          64  regs.sp
+  0x6030 0000 0010 0040 PC          64  regs.pc
+  0x6030 0000 0010 0042 PSTATE      64  regs.pstate
+  0x6030 0000 0010 0044 SP_EL1      64  sp_el1
+  0x6030 0000 0010 0046 ELR_EL1     64  elr_el1
+  0x6030 0000 0010 0048 SPSR_EL1    64  spsr[KVM_SPSR_EL1] (alias SPSR_SVC)
+  0x6030 0000 0010 004a SPSR_ABT    64  spsr[KVM_SPSR_ABT]
+  0x6030 0000 0010 004c SPSR_UND    64  spsr[KVM_SPSR_UND]
+  0x6030 0000 0010 004e SPSR_IRQ    64  spsr[KVM_SPSR_IRQ]
+  0x6030 0000 0010 0050 SPSR_FIQ    64  spsr[KVM_SPSR_FIQ]
+  0x6040 0000 0010 0054 V0         128  fp_regs.vregs[0]
+  0x6040 0000 0010 0058 V1         128  fp_regs.vregs[1]
+    ...
+  0x6040 0000 0010 00d0 V31        128  fp_regs.vregs[31]
+  0x6020 0000 0010 00d4 FPSR        32  fp_regs.fpsr
+  0x6020 0000 0010 00d5 FPCR        32  fp_regs.fpcr
+
 arm64 CCSIDR registers are demultiplexed by CSSELR value:
   0x6020 0000 0011 00 <csselr:8>
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 23/23] KVM: arm64/sve: Document KVM API extensions for SVE
  2018-09-28 13:39 ` Dave Martin
@ 2018-09-28 13:39   ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-09-28 13:39 UTC (permalink / raw)
  To: kvmarm
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, linux-arm-kernel

This patch adds sections to the KVM API documentation describing
the extensions for supporting the Scalable Vector Extension (SVE)
in guests.
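
As a worked illustration of the vector length bitmap encoding described
in the new text (a sketch, not part of the patch): vector length
(a * 64 + b + 1) * 128 bits corresponds to bit b of required_vqs[a], so
testing whether a given VQ is in a returned set might look like:

	/* Does the set in vls contain vector length vq * 128 bits? */
	static int sve_vls_test_vq(const struct kvm_sve_vls *vls,
				   unsigned int vq)
	{
		if (vq < 1 || vq > vls->max_vq)
			return 0;       /* treated as not in the set */

		return !!(vls->required_vqs[(vq - 1) / 64] &
			  ((__u64)1 << ((vq - 1) % 64)));
	}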

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 Documentation/virtual/kvm/api.txt | 142 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 139 insertions(+), 3 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index a58067b..b8257d4 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2054,13 +2054,21 @@ Specifically:
   0x6030 0000 0010 004c SPSR_UND    64  spsr[KVM_SPSR_UND]
   0x6030 0000 0010 004e SPSR_IRQ    64  spsr[KVM_SPSR_IRQ]
   0x6030 0000 0010 0050 SPSR_FIQ    64  spsr[KVM_SPSR_FIQ]
-  0x6040 0000 0010 0054 V0         128  fp_regs.vregs[0]
-  0x6040 0000 0010 0058 V1         128  fp_regs.vregs[1]
+  0x6040 0000 0010 0054 V0         128  fp_regs.vregs[0]    (*)
+  0x6040 0000 0010 0058 V1         128  fp_regs.vregs[1]    (*)
     ...
-  0x6040 0000 0010 00d0 V31        128  fp_regs.vregs[31]
+  0x6040 0000 0010 00d0 V31        128  fp_regs.vregs[31]   (*)
   0x6020 0000 0010 00d4 FPSR        32  fp_regs.fpsr
   0x6020 0000 0010 00d5 FPCR        32  fp_regs.fpcr
 
+(*) These encodings are not accepted for SVE-enabled vcpus.  See
+    KVM_ARM_SVE_CONFIG for details of how SVE support is configured for
+    a vcpu.
+
+    The equivalent register content can be accessed via bits [2047:0] of
+    the corresponding SVE Zn registers instead for vcpus that have SVE
+    enabled (see below).
+
 arm64 CCSIDR registers are demultiplexed by CSSELR value:
   0x6020 0000 0011 00 <csselr:8>
 
@@ -2070,6 +2078,14 @@ arm64 system registers have the following id bit patterns:
 arm64 firmware pseudo-registers have the following bit pattern:
   0x6030 0000 0014 <regno:16>
 
+arm64 SVE registers have the following bit patterns:
+  0x6080 0000 0015 00 <n:5> <slice:5>   Zn bits[2048*slice + 2047 : 2048*slice]
+  0x6050 0000 0015 04 <n:4> <slice:5>   Pn bits[256*slice + 255 : 256*slice]
+  0x6050 0000 0015 060 <slice:5>        FFR bits[256*slice + 255 : 256*slice]
+
+  These registers are only accessible on SVE-enabled vcpus.  See
+  KVM_ARM_SVE_CONFIG for details.
+
 
 MIPS registers are mapped using the lower 32 bits.  The upper 16 of that is
 the register group type:
@@ -3700,6 +3716,126 @@ Returns: 0 on success, -1 on error
 This copies the vcpu's kvm_nested_state struct from userspace to the kernel.  For
 the definition of struct kvm_nested_state, see KVM_GET_NESTED_STATE.
 
+4.116 KVM_ARM_SVE_CONFIG
+
+Capability: KVM_CAP_ARM_SVE
+Architectures: arm64
+Type: vm and vcpu ioctl
+Parameters: struct kvm_sve_vls (in/out)
+Returns: 0 on success
+Errors:
+  EINVAL:    Unrecognised subcommand or bad arguments
+  EBADFD:    vcpu in wrong state for request
+             (KVM_ARM_SVE_CONFIG_SET, KVM_ARM_SVE_CONFIG_GET)
+  ENOMEM:    Out of memory
+  EFAULT:    Bad user address
+
+struct kvm_sve_vls {
+	__u16 cmd;
+	__u16 max_vq;
+	__u16 _reserved[2];
+	__u64 required_vqs[8];
+};
+
+General:
+
+cmd: This ioctl supports a few different subcommands, selected by the
+value of cmd (described in detail in the following sections).
+
+_reserved[]: these fields may be meaningful to later kernels.  For
+forward compatibility, they must be zeroed before invoking this ioctl
+for the first time on a given struct kvm_sve_vls object.  (So, memset()
+it to zero before first use, or allocate with calloc() for example.)
+
+max_vq, required_vqs[]: encode a set of SVE vector lengths.  The set is
+encoded as follows:
+
+If (a * 64 + b + 1) <= max_vq, then the bit represented by
+
+    required_vqs[a] & ((__u64)1 << b)
+
+(where a is in the range 0..7 and b is in the range 0..63)
+indicates that the vector length (a * 64 + b + 1) * 128 bits is
+supported (KVM_ARM_SVE_CONFIG_QUERY, KVM_ARM_SVE_CONFIG_GET) or required
+(KVM_ARM_SVE_CONFIG_SET).
+
+If (a * 64 + b + 1) > max_vq, then the vector length
+(a * 64 + b + 1) * 128 bits is unsupported or prohibited respectively.
+In other words, only the first max_vq bits in required_vqs[] are
+significant; remaining bits are implicitly treated as if they were zero.
+
+max_vq must be in the range SVE_VQ_MIN (1) to SVE_VQ_MAX (512).
+
+See Documentation/arm64/sve.txt for an explanation of vector lengths and
+the meaning associated with "VQ".
+
+Subcommands:
+
+/* values for cmd: */
+#define KVM_ARM_SVE_CONFIG_QUERY	0 /* query what the host can support */
+#define KVM_ARM_SVE_CONFIG_SET		1 /* enable SVE for vcpu and set VLs */
+#define KVM_ARM_SVE_CONFIG_GET		2 /* read the set of VLs for a vcpu */
+
+Subcommand details:
+
+4.116.1 KVM_ARM_SVE_CONFIG_QUERY
+Type: vm and vcpu
+
+Retrieve the full set of SVE vector lengths available for use by KVM
+guests on this host.  The result is independent of which vcpu this
+command is invoked on.  As a convenience, it may also be invoked on a
+vm file descriptor, eliminating the need to create a vcpu first.
+
+4.116.2 KVM_ARM_SVE_CONFIG_SET
+Type: vcpu only
+
+Enables SVE for the vcpu and sets the set of SVE vector lengths that
+will be visible to the guest.
+
+This is the only way to enable SVE for a vcpu: if this command is not
+invoked for a vcpu then SVE will not be available to the guest on this
+vcpu.
+
+This subcommand is only permitted once per vcpu, before KVM_RUN has been
+invoked for the vcpu for the first time.  Otherwise, the command fails
+with -EBADFD and the state of the vcpu is not modified.
+
+In typical use, the user should call KVM_ARM_SVE_CONFIG_QUERY first to
+populate a struct kvm_sve_vls with the full set of vector lengths
+available on the host, then set cmd = KVM_ARM_SVE_CONFIG_SET and
+re-issue the KVM_ARM_SVE_CONFIG ioctl on the desired vcpu.  This will
+configure the best set of vector lengths available.  When following this
+approach, the maximum available vector length can also be restricted by
+reducing the value of max_vq before invoking KVM_ARM_SVE_CONFIG_SET.
+
+Every requested vector length in the struct kvm_sve_vls argument must be
+supported by the hardware.  In addition, except for vector lengths
+greater than the maximum requested vector length, every vector length
+not requested must *not* be supported by the hardware.  (The latter
+restriction may be relaxed in the future.)  If the requested set of
+vector lengths is not supportable, the command fails with -EINVAL and
+the state of the vcpu is not modified.
+
+Different vcpus of a vm may be configured with different sets of vector
+lengths.  Equally, some vcpus may have SVE enabled and some not.
+However, such configurations are not recommended except for testing and
+experimentation purposes.  Architecturally compliant guest OSes will
+work, but may or may not make effective use of the resulting
+configuration.
+
+After a successful KVM_ARM_SVE_CONFIG_SET, KVM_ARM_SVE_CONFIG_GET can be
+used to retrieve the configured set of vector lengths.
+
+4.116.3 KVM_ARM_SVE_CONFIG_GET
+Type: vcpu only
+
+This subcommand returns the set of vector lengths enabled for the vcpu.
+SVE must have been enabled and configured for this vcpu by a successful
+prior KVM_ARM_SVE_CONFIG_SET call.  Otherwise, -EBADFD is returned.
+
+The state of the vcpu is unchanged.
+
+
 5. The kvm_run structure
 ------------------------
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 05/23] KVM: arm: Add arch vcpu uninit hook
  2018-09-28 13:39   ` Dave Martin
@ 2018-11-02  8:05     ` Christoffer Dall
  -1 siblings, 0 replies; 154+ messages in thread
From: Christoffer Dall @ 2018-11-02  8:05 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Fri, Sep 28, 2018 at 02:39:09PM +0100, Dave Martin wrote:
> In preparation for adding support for SVE in guests on arm64, a
> hook is needed for freeing additional per-vcpu memory when a vcpu
> is freed.

Can this commit motivate why we can't do the work in kvm_arch_vcpu_free,
which we use for freeing other data structures?

(Presumably, uninit is needed when you need to do something at the very
last step, after releasing the struct pid.)


Thanks,

    Christoffer

> 
> x86 already uses the kvm_arch_vcpu_uninit() hook for a similar
> purpose, so this patch populates the same hook for arm.  Since SVE
> is specific to arm64, a subsidiary hook kvm_arm_arch_vcpu_uninit()
> is added (with trivial implementations for now) to enable separate
> specialisation for arm and arm64.
> 
> No functional change.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
> 
> Changes since RFCv1:
> 
>  * The vcpu _init_ hook that was added by the former version of this
>    patch was never used for anything, so it is gone from this version.
> ---
>  arch/arm/include/asm/kvm_host.h   | 3 ++-
>  arch/arm64/include/asm/kvm_host.h | 3 ++-
>  virt/kvm/arm/arm.c                | 5 +++++
>  3 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 3ad482d..c36760b 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -288,10 +288,11 @@ struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
>  static inline bool kvm_arch_check_sve_has_vhe(void) { return true; }
>  static inline void kvm_arch_hardware_unsetup(void) {}
>  static inline void kvm_arch_sync_events(struct kvm *kvm) {}
> -static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
>  static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
>  
> +static inline void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
> +
>  static inline void kvm_arm_init_debug(void) {}
>  static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 6316a57..d4b65414 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -425,10 +425,11 @@ static inline bool kvm_arch_check_sve_has_vhe(void)
>  
>  static inline void kvm_arch_hardware_unsetup(void) {}
>  static inline void kvm_arch_sync_events(struct kvm *kvm) {}
> -static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
>  static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
>  
> +static inline void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
> +
>  void kvm_arm_init_debug(void);
>  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
>  void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index c92053b..1418af9 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -358,6 +358,11 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
>  	return kvm_vgic_vcpu_init(vcpu);
>  }
>  
> +void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
> +{
> +	kvm_arm_arch_vcpu_uninit(vcpu);
> +}
> +
>  void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  {
>  	int *last_ran;
> -- 
> 2.1.4
> 
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 10/23] KVM: arm64: Extend reset_unknown() to handle mixed RES0/UNKNOWN registers
  2018-09-28 13:39   ` Dave Martin
@ 2018-11-02  8:11     ` Christoffer Dall
  -1 siblings, 0 replies; 154+ messages in thread
From: Christoffer Dall @ 2018-11-02  8:11 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Fri, Sep 28, 2018 at 02:39:14PM +0100, Dave Martin wrote:
> The reset_unknown() system register helper initialises a guest
> register to a distinctive junk value on vcpu reset, to help expose
> and debug deficient register initialisation within the guest.
> 
> Some registers such as the SVE control register ZCR_EL1 contain a
> mixture of UNKNOWN fields and RES0 bits.  For these,
> reset_unknown() does not work at present, since it sets all bits to
> junk values instead of just the wanted bits.
> 
> There is no need to craft another special helper just for that,
> since reset_unknown() almost does the appropriate thing anyway.
> This patch takes advantage of the unused val field in struct
> sys_reg_desc to specify a mask of bits that should be initialised
> to zero instead of junk.
> 
> None of the existing users of reset_unknown() defines (or should
> define) a value for val, so val is implicitly zero for them,
> resulting in all bits being made UNKNOWN by this function: thus,
> this patch makes no functional change for currently defined
> registers.
> 
> Future patches will make use of non-zero val.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm64/kvm/sys_regs.h | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
> index cd710f8..24bac06 100644
> --- a/arch/arm64/kvm/sys_regs.h
> +++ b/arch/arm64/kvm/sys_regs.h
> @@ -89,7 +89,9 @@ static inline void reset_unknown(struct kvm_vcpu *vcpu,
>  {
>  	BUG_ON(!r->reg);
>  	BUG_ON(r->reg >= NR_SYS_REGS);
> -	__vcpu_sys_reg(vcpu, r->reg) = 0x1de7ec7edbadc0deULL;
> +
> +	/* If non-zero, r->val specifies which register bits are RES0: */
> +	__vcpu_sys_reg(vcpu, r->reg) = 0x1de7ec7edbadc0deULL & ~r->val;

nit: it would be nice to document this feature on the val field in the
sys_reg_desc structure above as well.
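
For illustration, an entry exploiting the new semantics might look like
the following (a sketch only: the accessor name is hypothetical, the
ZCR_EL1 slot in the vcpu sysreg array is assumed, and the RES0 mask is
assumed to be everything outside the LEN field):

	/* Sketch: ZCR_EL1, with all bits outside LEN treated as RES0 */
	{ SYS_DESC(SYS_ZCR_EL1), access_zcr_el1, reset_unknown, ZCR_EL1,
	  .val = ~(u64)ZCR_ELx_LEN_MASK },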


>  }
>  
>  static inline void reset_val(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
> -- 
> 2.1.4
> 

Thanks,

    Christoffer

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST
  2018-09-28 13:39   ` Dave Martin
@ 2018-11-02  8:16     ` Christoffer Dall
  -1 siblings, 0 replies; 154+ messages in thread
From: Christoffer Dall @ 2018-11-02  8:16 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Fri, Sep 28, 2018 at 02:39:15PM +0100, Dave Martin wrote:
> KVM_GET_REG_LIST should only enumerate registers that are actually
> accessible, so it is necessary to filter out any register that is
> not exposed to the guest.  For features that are configured at
> runtime, this will require a dynamic check.
> 
> For example, ZCR_EL1 and ID_AA64ZFR0_EL1 would need to be hidden
> if SVE is not enabled for the guest.

This implies that userspace can never access this interface for a vcpu
before having decided whether such features are enabled for the guest or
not, since otherwise userspace will see different states for a VCPU
depending on sequencing of the API, which sounds fragile to me.

That should probably be documented somewhere, and I hope the
enable/disable API for SVE in guests already takes that into account.

Not sure if there's an action to take here, but it was the best place I
could raise this concern.
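
For reference, the callback this enables can be almost trivial; a
minimal sketch for the SVE registers (the function name is
hypothetical; vcpu_has_sve() is added elsewhere in the series):

	static bool sve_check_present(const struct kvm_vcpu *vcpu,
				      const struct sys_reg_desc *rd)
	{
		return vcpu_has_sve(vcpu);
	}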

Thanks,

    Christoffer

> 
> Special-casing walk_one_sys_reg() for specific registers will make
> the code unnecessarily messy, so this patch adds a new sysreg
> method check_present() that, if defined, indicates whether the
> sysreg should be enumerated.  If the guest runtime configuration
> may require a particular system register to be hidden,
> check_present should point to a function that returns true or false
> to enable or disable enumeration of that register respectively.
> 
> Currently check_present() is not used for any other purpose, but it
> may be a useful foundation for abstracting other parts of the code
> to handle conditionally-present sysregs, if required.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm64/kvm/sys_regs.c | 10 +++++++---
>  arch/arm64/kvm/sys_regs.h |  4 ++++
>  2 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 0dfd064..adb6cbd 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -2437,7 +2437,8 @@ static bool copy_reg_to_user(const struct sys_reg_desc *reg, u64 __user **uind)
>  	return true;
>  }
>  
> -static int walk_one_sys_reg(const struct sys_reg_desc *rd,
> +static int walk_one_sys_reg(const struct kvm_vcpu *vcpu,
> +			    const struct sys_reg_desc *rd,
>  			    u64 __user **uind,
>  			    unsigned int *total)
>  {
> @@ -2448,6 +2449,9 @@ static int walk_one_sys_reg(const struct sys_reg_desc *rd,
>  	if (!(rd->reg || rd->get_user))
>  		return 0;
>  
> +	if (rd->check_present && !rd->check_present(vcpu, rd))
> +		return 0;
> +
>  	if (!copy_reg_to_user(rd, uind))
>  		return -EFAULT;
>  
> @@ -2476,9 +2480,9 @@ static int walk_sys_regs(struct kvm_vcpu *vcpu, u64 __user *uind)
>  		int cmp = cmp_sys_reg(i1, i2);
>  		/* target-specific overrides generic entry. */
>  		if (cmp <= 0)
> -			err = walk_one_sys_reg(i1, &uind, &total);
> +			err = walk_one_sys_reg(vcpu, i1, &uind, &total);
>  		else
> -			err = walk_one_sys_reg(i2, &uind, &total);
> +			err = walk_one_sys_reg(vcpu, i2, &uind, &total);
>  
>  		if (err)
>  			return err;
> diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
> index 24bac06..cffb31e 100644
> --- a/arch/arm64/kvm/sys_regs.h
> +++ b/arch/arm64/kvm/sys_regs.h
> @@ -61,6 +61,10 @@ struct sys_reg_desc {
>  			const struct kvm_one_reg *reg, void __user *uaddr);
>  	int (*set_user)(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
>  			const struct kvm_one_reg *reg, void __user *uaddr);
> +
> +	/* Return true iff the register exists; assume present if NULL */
> +	bool (*check_present)(const struct kvm_vcpu *vcpu,
> +			      const struct sys_reg_desc *rd);
>  };
>  
>  static inline void print_sys_reg_instr(const struct sys_reg_params *p)
> -- 
> 2.1.4
> 
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 18/23] KVM: arm64: Add arch vcpu ioctl hook
  2018-09-28 13:39   ` Dave Martin
@ 2018-11-02  8:30     ` Christoffer Dall
  -1 siblings, 0 replies; 154+ messages in thread
From: Christoffer Dall @ 2018-11-02  8:30 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Fri, Sep 28, 2018 at 02:39:22PM +0100, Dave Martin wrote:
> To enable arm64-specific vcpu ioctls to be added cleanly, this
> patch adds a kvm_arm_arch_vcpu_ioctl() hook so that these don't
> pollute the common code.
> 
> No functional change: the -EINVAL return for unknown ioctls is
> retained, though it may or may not be intentional (KVM returns
> -ENXIO in various other similar contexts).
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm/include/asm/kvm_host.h   | 7 +++++++
>  arch/arm64/include/asm/kvm_host.h | 2 ++
>  arch/arm64/kvm/guest.c            | 6 ++++++
>  virt/kvm/arm/arm.c                | 2 +-
>  4 files changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index c36760b..df2659d 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -19,6 +19,7 @@
>  #ifndef __ARM_KVM_HOST_H__
>  #define __ARM_KVM_HOST_H__
>  
> +#include <linux/errno.h>
>  #include <linux/types.h>
>  #include <linux/kvm_types.h>
>  #include <asm/cputype.h>
> @@ -278,6 +279,12 @@ static inline int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
>  	return 0;
>  }
>  
> +static inline int kvm_arm_arch_vcpu_ioctl(struct kvm_vcpu *vcpu,
> +	unsigned int ioctl, unsigned long arg)
> +{
> +	return -EINVAL;
> +}
> +
>  int kvm_perf_init(void);
>  int kvm_perf_teardown(void);
>  
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 8e9cd43..bbde597 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -55,6 +55,8 @@ DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
>  int __attribute_const__ kvm_target_cpu(void);
>  int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
>  int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext);
> +int kvm_arm_arch_vcpu_ioctl(struct kvm_vcpu *vcpu,
> +			    unsigned int ioctl, unsigned long arg);
>  void __extended_idmap_trampoline(phys_addr_t boot_pgd, phys_addr_t idmap_start);
>  
>  struct kvm_arch {
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 89eab68..331b85e 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -546,6 +546,12 @@ int kvm_vcpu_preferred_target(struct kvm_vcpu_init *init)
>  	return 0;
>  }
>  
> +int kvm_arm_arch_vcpu_ioctl(struct kvm_vcpu *vcpu,
> +			    unsigned int ioctl, unsigned long arg)
> +{
> +	return -EINVAL;
> +}
> +
>  int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
>  {
>  	return -EINVAL;
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index 1418af9..6e894a8 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -1181,7 +1181,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>  		return kvm_arm_vcpu_set_events(vcpu, &events);
>  	}
>  	default:
> -		r = -EINVAL;
> +		r = kvm_arm_arch_vcpu_ioctl(vcpu, ioctl, arg);

I don't like this additional indirection.  Is it just to avoid defining
the SVE ioctl value on 32-bit ARM?

I think you should just handle the ioctl here and return an error on the
32-bit side, like we do for other things.

Am I missing something?
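
Concretely, the suggested shape would put the new ioctl straight into
this switch, with a stub on the 32-bit side (a sketch:
kvm_vcpu_sve_config() is a hypothetical handler, which the arm stub
would implement as a plain return of -EINVAL):

	case KVM_ARM_SVE_CONFIG:
		/* Hypothetical handler; the arm stub returns -EINVAL */
		return kvm_vcpu_sve_config(vcpu, (void __user *)arg);
	default:
		r = -EINVAL;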


Thanks,

    Christoffer

>  	}
>  
>  	return r;
> -- 
> 2.1.4
> 
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 20/23] KVM: arm64: Add arch vm ioctl hook
  2018-09-28 13:39   ` Dave Martin
@ 2018-11-02  8:32     ` Christoffer Dall
  -1 siblings, 0 replies; 154+ messages in thread
From: Christoffer Dall @ 2018-11-02  8:32 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Fri, Sep 28, 2018 at 02:39:24PM +0100, Dave Martin wrote:
> To enable arm64-specific vm ioctls to be added cleanly, this patch
> adds a kvm_arm_arch_vm_ioctl() hook so that these don't pollute the
> common code.

Hmmm, I don't really see the strength of that argument, and have the
same concern as before.  I'd like to avoid the additional indirection
and instead just follow the existing pattern with a dummy implementation
on the 32-bit side that returns an error.


Thanks,

    Christoffer

> 
> No functional change.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm/include/asm/kvm_host.h   | 6 ++++++
>  arch/arm64/include/asm/kvm_host.h | 2 ++
>  arch/arm64/kvm/guest.c            | 6 ++++++
>  virt/kvm/arm/arm.c                | 2 +-
>  4 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index df2659d..0850fcd 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -285,6 +285,12 @@ static inline int kvm_arm_arch_vcpu_ioctl(struct kvm_vcpu *vcpu,
>  	return -EINVAL;
>  }
>  
> +static inline int kvm_arm_arch_vm_ioctl(struct kvm *kvm,
> +	unsigned int ioctl, unsigned long arg)
> +{
> +	return -EINVAL;
> +}
> +
>  int kvm_perf_init(void);
>  int kvm_perf_teardown(void);
>  
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 5225485..ae25f14 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -63,6 +63,8 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
>  int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext);
>  int kvm_arm_arch_vcpu_ioctl(struct kvm_vcpu *vcpu,
>  			    unsigned int ioctl, unsigned long arg);
> +int kvm_arm_arch_vm_ioctl(struct kvm *kvm,
> +			  unsigned int ioctl, unsigned long arg);
>  void __extended_idmap_trampoline(phys_addr_t boot_pgd, phys_addr_t idmap_start);
>  
>  struct kvm_arch {
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index d96145a..f066b17 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -714,6 +714,12 @@ int kvm_arm_arch_vcpu_ioctl(struct kvm_vcpu *vcpu,
>  	}
>  }
>  
> +int kvm_arm_arch_vm_ioctl(struct kvm *kvm,
> +			  unsigned int ioctl, unsigned long arg)
> +{
> +	return -EINVAL;
> +}
> +
>  int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
>  {
>  	return -EINVAL;
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index 6e894a8..6582a38 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -1279,7 +1279,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
>  		return 0;
>  	}
>  	default:
> -		return -EINVAL;
> +		return kvm_arm_arch_vm_ioctl(kvm, ioctl, arg);
>  	}
>  }
>  
> -- 
> 2.1.4
> 
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 06/23] arm64/sve: Check SVE virtualisability
  2018-09-28 13:39   ` Dave Martin
@ 2018-11-15 15:39     ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-15 15:39 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> Due to the way the effective SVE vector length is controlled and
> trapped at different exception levels, certain mismatches in the
> sets of vector lengths supported by different physical CPUs in the
> system may prevent straightforward virtualisation of SVE at parity
> with the host.
>
> This patch analyses the extent to which SVE can be virtualised
> safely without interfering with migration of vcpus between physical
> CPUs, and rejects late secondary CPUs that would erode the
> situation further.
>
> It is left up to KVM to decide what to do with this information.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>
> Changes since RFCv1:
>
>  * The analysis done by this patch is the same as in the previous
>    version, but the commit message, the printks etc. have been reworded
>    to avoid the suggestion that KVM is expected to work on a system with
>    mismatched SVE implementations.
> ---
>  arch/arm64/include/asm/fpsimd.h |  1 +
>  arch/arm64/kernel/cpufeature.c  |  2 +-
>  arch/arm64/kernel/fpsimd.c      | 87 +++++++++++++++++++++++++++++++++++------
>  3 files changed, 76 insertions(+), 14 deletions(-)
>
> diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
> index dd1ad39..964adc9 100644
> --- a/arch/arm64/include/asm/fpsimd.h
> +++ b/arch/arm64/include/asm/fpsimd.h
> @@ -87,6 +87,7 @@ extern void sve_kernel_enable(const struct arm64_cpu_capabilities *__unused);
>  extern u64 read_zcr_features(void);
>
>  extern int __ro_after_init sve_max_vl;
> +extern int __ro_after_init sve_max_virtualisable_vl;
>
>  #ifdef CONFIG_ARM64_SVE
>
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index e238b79..aa1a55b 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1531,7 +1531,7 @@ static void verify_sve_features(void)
>  	unsigned int len = zcr & ZCR_ELx_LEN_MASK;
>
>  	if (len < safe_len || sve_verify_vq_map()) {
> -		pr_crit("CPU%d: SVE: required vector length(s) missing\n",
> +		pr_crit("CPU%d: SVE: vector length support mismatch\n",
>  			smp_processor_id());
>  		cpu_die_early();
>  	}
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 42aa154..d28042b 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -18,6 +18,7 @@
>   */
>
>  #include <linux/bitmap.h>
> +#include <linux/bitops.h>
>  #include <linux/bottom_half.h>
>  #include <linux/bug.h>
>  #include <linux/cache.h>
> @@ -48,6 +49,7 @@
>  #include <asm/sigcontext.h>
>  #include <asm/sysreg.h>
>  #include <asm/traps.h>
> +#include <asm/virt.h>
>
>  #define FPEXC_IOF	(1 << 0)
>  #define FPEXC_DZF	(1 << 1)
> @@ -130,14 +132,18 @@ static int sve_default_vl = -1;
>
>  /* Maximum supported vector length across all CPUs (initially poisoned) */
>  int __ro_after_init sve_max_vl = SVE_VL_MIN;
> +int __ro_after_init sve_max_virtualisable_vl = SVE_VL_MIN;
>  /* Set of available vector lengths, as vq_to_bit(vq): */
>  static __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
> +/* Set of vector lengths present on at least one cpu: */
> +static __ro_after_init DECLARE_BITMAP(sve_vq_partial_map, SVE_VQ_MAX);
>  static void __percpu *efi_sve_state;
>
>  #else /* ! CONFIG_ARM64_SVE */
>
>  /* Dummy declaration for code that will be optimised out: */
>  extern __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
> +extern __ro_after_init DECLARE_BITMAP(sve_vq_partial_map, SVE_VQ_MAX);
>  extern void __percpu *efi_sve_state;
>
>  #endif /* ! CONFIG_ARM64_SVE */
> @@ -623,11 +629,8 @@ int sve_get_current_vl(void)
>  	return sve_prctl_status(0);
>  }
>
> -/*
> - * Bitmap for temporary storage of the per-CPU set of supported vector lengths
> - * during secondary boot.
> - */
> -static DECLARE_BITMAP(sve_secondary_vq_map, SVE_VQ_MAX);
> +/* Bitmaps for temporary storage during manipulation of vector length sets */
> +static DECLARE_BITMAP(sve_tmp_vq_map, SVE_VQ_MAX);

This seems odd as a file-scope global; why not declare it locally where it is used?

>
>  static void sve_probe_vqs(DECLARE_BITMAP(map, SVE_VQ_MAX))
>  {
> @@ -650,6 +653,7 @@ static void sve_probe_vqs(DECLARE_BITMAP(map, SVE_VQ_MAX))
>  void __init sve_init_vq_map(void)
>  {
>  	sve_probe_vqs(sve_vq_map);
> +	bitmap_copy(sve_vq_partial_map, sve_vq_map, SVE_VQ_MAX);
>  }
>
>  /*
> @@ -658,24 +662,60 @@ void __init sve_init_vq_map(void)
>   */
>  void sve_update_vq_map(void)
>  {
> -	sve_probe_vqs(sve_secondary_vq_map);
> -	bitmap_and(sve_vq_map, sve_vq_map, sve_secondary_vq_map, SVE_VQ_MAX);
> +	sve_probe_vqs(sve_tmp_vq_map);
> +	bitmap_and(sve_vq_map, sve_vq_map, sve_tmp_vq_map,
> +		   SVE_VQ_MAX);
> +	bitmap_or(sve_vq_partial_map, sve_vq_partial_map, sve_tmp_vq_map,
> +		  SVE_VQ_MAX);
>  }

I'm not quite following what's going on here. This is tracking both the
vector lengths available on all CPUs and the ones available on at least
one CPU? This raises some questions:

  - do such franken-machines exist, or are they expected to?
  - how do we ensure this is always up to date?
  - what happens when we hotplug a new CPU with fewer available VQs?
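
To make the set semantics concrete, here is a worked example of what
the two bitmaps would end up holding (system configuration entirely
hypothetical):

	/*
	 * Hypothetical system:
	 *   CPU0 supports VQs {1, 2, 4}
	 *   CPU1 supports VQs {1, 2}
	 *   CPU2 supports VQs {1, 2, 4, 8}
	 *
	 *   sve_vq_map         = {1, 2}        (AND: usable on every CPU)
	 *   sve_vq_partial_map = {1, 2, 4, 8}  (OR: seen on some CPU)
	 *
	 * A VQ in the union but not in the intersection cannot be offered
	 * to guests: a vcpu migrated to a CPU lacking it would see the
	 * available vector lengths change underneath it.
	 */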

>
>  /* Check whether the current CPU supports all VQs in the committed set */
>  int sve_verify_vq_map(void)
>  {
> -	int ret = 0;
> +	int ret = -EINVAL;
> +	unsigned long b;
>
> -	sve_probe_vqs(sve_secondary_vq_map);
> -	bitmap_andnot(sve_secondary_vq_map, sve_vq_map, sve_secondary_vq_map,
> -		      SVE_VQ_MAX);
> -	if (!bitmap_empty(sve_secondary_vq_map, SVE_VQ_MAX)) {
> +	sve_probe_vqs(sve_tmp_vq_map);
> +
> +	bitmap_complement(sve_tmp_vq_map, sve_tmp_vq_map, SVE_VQ_MAX);
> +	if (bitmap_intersects(sve_tmp_vq_map, sve_vq_map, SVE_VQ_MAX)) {
>  		pr_warn("SVE: cpu%d: Required vector length(s) missing\n",
>  			smp_processor_id());
> -		ret = -EINVAL;
> +		goto error;

The use of goto seems a little premature considering we don't have any
clean-up to do.
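
For comparison, the early-return form being hinted at would read (a
sketch of the same check with the goto dropped):

	sve_probe_vqs(sve_tmp_vq_map);

	bitmap_complement(sve_tmp_vq_map, sve_tmp_vq_map, SVE_VQ_MAX);
	if (bitmap_intersects(sve_tmp_vq_map, sve_vq_map, SVE_VQ_MAX)) {
		pr_warn("SVE: cpu%d: Required vector length(s) missing\n",
			smp_processor_id());
		return -EINVAL;
	}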

> +	}
> +
> +	if (!IS_ENABLED(CONFIG_KVM) || !is_hyp_mode_available())
> +		goto ok;
> +
> +	/*
> +	 * For KVM, it is necessary to ensure that this CPU doesn't
> +	 * support any vector length that guests may have probed as
> +	 * unsupported.
> +	 */
> +
> +	/* Recover the set of supported VQs: */
> +	bitmap_complement(sve_tmp_vq_map, sve_tmp_vq_map, SVE_VQ_MAX);
> +	/* Find VQs supported that are not globally supported: */
> +	bitmap_andnot(sve_tmp_vq_map, sve_tmp_vq_map, sve_vq_map, SVE_VQ_MAX);
> +
> +	/* Find the lowest such VQ, if any: */
> +	b = find_last_bit(sve_tmp_vq_map, SVE_VQ_MAX);
> +	if (b >= SVE_VQ_MAX)
> +		goto ok; /* no mismatches */
> +
> +	/*
> +	 * Mismatches above sve_max_virtualisable_vl are fine, since
> +	 * no guest is allowed to configure ZCR_EL2.LEN to exceed this:
> +	 */
> +	if (sve_vl_from_vq(bit_to_vq(b)) <= sve_max_virtualisable_vl) {
> +		pr_warn("SVE: cpu%d: Unsupported vector length(s) present\n",
> +			smp_processor_id());
> +		goto error;
>  	}
>
> +ok:
> +	ret = 0;
> +error:
>  	return ret;
>  }
>
> @@ -743,6 +783,7 @@ u64 read_zcr_features(void)
>  void __init sve_setup(void)
>  {
>  	u64 zcr;
> +	unsigned long b;
>
>  	if (!system_supports_sve())
>  		return;
> @@ -771,11 +812,31 @@ void __init sve_setup(void)
>  	 */
>  	sve_default_vl = find_supported_vector_length(64);
>
> +	bitmap_andnot(sve_tmp_vq_map, sve_vq_partial_map, sve_vq_map,
> +		      SVE_VQ_MAX);
> +
> +	b = find_last_bit(sve_tmp_vq_map, SVE_VQ_MAX);
> +	if (b >= SVE_VQ_MAX)
> +		/* No non-virtualisable VLs found */
> +		sve_max_virtualisable_vl = SVE_VQ_MAX;
> +	else if (WARN_ON(b == SVE_VQ_MAX - 1))
> +		/* No virtualisable VLs?  This is architecturally forbidden. */
> +		sve_max_virtualisable_vl = SVE_VQ_MIN;
> +	else /* b + 1 < SVE_VQ_MAX */
> +		sve_max_virtualisable_vl = sve_vl_from_vq(bit_to_vq(b + 1));
> +
> +	if (sve_max_virtualisable_vl > sve_max_vl)
> +		sve_max_virtualisable_vl = sve_max_vl;
> +
>  	pr_info("SVE: maximum available vector length %u bytes per vector\n",
>  		sve_max_vl);
>  	pr_info("SVE: default vector length %u bytes per vector\n",
>  		sve_default_vl);
>
> +	/* KVM decides whether to support mismatched systems. Just warn here: */
> +	if (sve_max_virtualisable_vl < sve_max_vl)
> +		pr_info("SVE: unvirtualisable vector lengths present\n");
> +
>  	sve_efi_setup();
>  }


--
Alex Bennée
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 08/23] KVM: arm64: Add a vcpu flag to control SVE visibility for the guest
  2018-09-28 13:39   ` Dave Martin
@ 2018-11-15 15:44     ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-15 15:44 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> Since SVE will be enabled or disabled on a per-vcpu basis, a flag
> is needed in order to track which vcpus have it enabled.
>
> This patch adds a suitable flag and a helper for checking it.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>
> Changes since RFCv1:
>
>  * Convert vcpu_has_sve() to a macro so that it can operate on a vcpu
>    without circular header dependency problems.
>
>    This avoids the helper requiring a vcpu_arch argument, which was
>    a little ugly.
> ---
>  arch/arm64/include/asm/kvm_host.h | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index d4b65414..20baf4a 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -307,6 +307,10 @@ struct kvm_vcpu_arch {
>  #define KVM_ARM64_FP_HOST		(1 << 2) /* host FP regs loaded */
>  #define KVM_ARM64_HOST_SVE_IN_USE	(1 << 3) /* backup for host TIF_SVE */
>  #define KVM_ARM64_HOST_SVE_ENABLED	(1 << 4) /* SVE enabled for EL0 */
> +#define KVM_ARM64_GUEST_HAS_SVE		(1 << 5) /* SVE exposed to guest */
> +
> +#define vcpu_has_sve(vcpu) (system_supports_sve() && \
> +			    ((vcpu)->arch.flags & KVM_ARM64_GUEST_HAS_SVE))
>
>  #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.gp_regs)
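
In use, the flag would be set once when SVE is configured for the vcpu
and then tested on the hot paths, along these lines (a sketch: the
configuration point is an assumption based on the cover letter's
KVM_ARM_SVE_CONFIG, and the helper name is hypothetical):

	/* Sketch: set once while handling the KVM_ARM_SVE_CONFIG ioctl */
	vcpu->arch.flags |= KVM_ARM64_GUEST_HAS_SVE;

	/* ...then checked wherever behaviour is conditional on SVE: */
	if (vcpu_has_sve(vcpu))
		sve_context_switch(vcpu);	/* hypothetical helper */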


--
Alex Bennée
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 09/23] KVM: arm64: Propagate vcpu into read_id_reg()
  2018-09-28 13:39   ` Dave Martin
@ 2018-11-15 15:56     ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-15 15:56 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> Architecture features that are conditionally visible to the guest
> will require run-time checks in the ID register accessor functions.
> In particular, read_id_reg() will need to perform checks in order
> to generate the correct emulated value for certain ID register
> fields such as ID_AA64PFR0_EL1.SVE for example.
>
> This patch propagates vcpu into read_id_reg() so that future
> patches can add run-time checks on the guest configuration here.
>
> For now, there is no functional change.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  arch/arm64/kvm/sys_regs.c | 23 +++++++++++++----------
>  1 file changed, 13 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 22fbbdb..0dfd064 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -1029,7 +1029,8 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu,
>  }
>
>  /* Read a sanitised cpufeature ID register by sys_reg_desc */
> -static u64 read_id_reg(struct sys_reg_desc const *r, bool raz)
> +static u64 read_id_reg(const struct kvm_vcpu *vcpu,
> +		struct sys_reg_desc const *r, bool raz)
>  {
>  	u32 id = sys_reg((u32)r->Op0, (u32)r->Op1,
>  			 (u32)r->CRn, (u32)r->CRm, (u32)r->Op2);
> @@ -1060,7 +1061,7 @@ static bool __access_id_reg(struct kvm_vcpu *vcpu,
>  	if (p->is_write)
>  		return write_to_read_only(vcpu, p, r);
>
> -	p->regval = read_id_reg(r, raz);
> +	p->regval = read_id_reg(vcpu, r, raz);
>  	return true;
>  }
>
> @@ -1089,16 +1090,18 @@ static u64 sys_reg_to_index(const struct sys_reg_desc *reg);
>   * are stored, and for set_id_reg() we don't allow the effective value
>   * to be changed.
>   */
> -static int __get_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
> +static int __get_id_reg(const struct kvm_vcpu *vcpu,
> +			const struct sys_reg_desc *rd, void __user *uaddr,
>  			bool raz)
>  {
>  	const u64 id = sys_reg_to_index(rd);
> -	const u64 val = read_id_reg(rd, raz);
> +	const u64 val = read_id_reg(vcpu, rd, raz);
>
>  	return reg_to_user(uaddr, &val, id);
>  }
>
> -static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
> +static int __set_id_reg(const struct kvm_vcpu *vcpu,
> +			const struct sys_reg_desc *rd, void __user *uaddr,
>  			bool raz)
>  {
>  	const u64 id = sys_reg_to_index(rd);
> @@ -1110,7 +1113,7 @@ static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
>  		return err;
>
>  	/* This is what we mean by invariant: you can't change it. */
> -	if (val != read_id_reg(rd, raz))
> +	if (val != read_id_reg(vcpu, rd, raz))
>  		return -EINVAL;
>
>  	return 0;
> @@ -1119,25 +1122,25 @@ static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
>  static int get_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
>  		      const struct kvm_one_reg *reg, void __user *uaddr)
>  {
> -	return __get_id_reg(rd, uaddr, false);
> +	return __get_id_reg(vcpu, rd, uaddr, false);
>  }
>
>  static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
>  		      const struct kvm_one_reg *reg, void __user *uaddr)
>  {
> -	return __set_id_reg(rd, uaddr, false);
> +	return __set_id_reg(vcpu, rd, uaddr, false);
>  }
>
>  static int get_raz_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
>  			  const struct kvm_one_reg *reg, void __user *uaddr)
>  {
> -	return __get_id_reg(rd, uaddr, true);
> +	return __get_id_reg(vcpu, rd, uaddr, true);
>  }
>
>  static int set_raz_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
>  			  const struct kvm_one_reg *reg, void __user *uaddr)
>  {
> -	return __set_id_reg(rd, uaddr, true);
> +	return __set_id_reg(vcpu, rd, uaddr, true);
>  }
>
>  /* sys_reg_desc initialiser for known cpufeature ID registers */
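
The kind of run-time check this is preparing for might eventually slot
into read_id_reg() along these lines (a sketch; the exact masking of
ID_AA64PFR0_EL1.SVE is an assumption about a later patch, and val is
taken to hold the sanitised register value computed in that function):

	/* Sketch: hide the SVE field when SVE is not exposed to the guest */
	if (id == SYS_ID_AA64PFR0_EL1 && !vcpu_has_sve(vcpu))
		val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT);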


--
Alex Bennée
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 12/23] KVM: arm64/sve: System register context switch and access support
  2018-09-28 13:39   ` Dave Martin
@ 2018-11-15 16:37     ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-15 16:37 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> This patch adds the necessary support for context switching ZCR_EL1
> for each vcpu.
>
> ZCR_EL1 is trapped alongside the FPSIMD/SVE registers, so it makes
> sense for it to be handled as part of the guest FPSIMD/SVE context
> for context switch purposes instead of handling it as a general
> system register.  This means that it can be switched in lazily at
> the appropriate time.  No effort is made to track host context for
> this register, since SVE requires VHE: thus the host's value for
> this register lives permanently in ZCR_EL2 and does not alias the
> guest's value at any time.
>
> The Hyp switch and fpsimd context handling code is extended
> appropriately.
>
> Accessors are added in sys_regs.c to expose the SVE system
> registers and ID register fields.  Because these need to be
> conditionally visible based on the guest configuration, they are
> implemented separately for now rather than by use of the generic
> system register helpers.  This may be abstracted better later on
> when/if there are more features requiring this model.
>
> ID_AA64ZFR0_EL1 is RO-RAZ for MRS/MSR when SVE is disabled for the
> guest, but for compatibility with non-SVE aware KVM implementations
> the register should not be enumerated at all for KVM_GET_REG_LIST
> in this case.  For consistency we also reject ioctl access to the
> register.  This ensures that a non-SVE-enabled guest looks the same
> to userspace, irrespective of whether the kernel KVM implementation
> supports SVE.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>
> Changes since RFCv1:
>
>  * The conditional visibility logic in sys_regs.c has been
>    simplified.
>
>  * The guest's ZCR_EL1 is now treated as part of the FPSIMD/SVE state
>    for switching purposes.  Any access to this register before it is
>    switched in generates an SVE trap, so we have a chance to switch it
>    along with the vector registers.
>
>    Because SVE is only available with VHE there is no need ever to
>    restore the host's version of this register (which instead lives
>    permanently in ZCR_EL2).
> ---
>  arch/arm64/include/asm/kvm_host.h |   1 +
>  arch/arm64/include/asm/sysreg.h   |   3 ++
>  arch/arm64/kvm/fpsimd.c           |   9 +++-
>  arch/arm64/kvm/hyp/switch.c       |   4 ++
>  arch/arm64/kvm/sys_regs.c         | 111 ++++++++++++++++++++++++++++++++++++--
>  5 files changed, 123 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 20baf4a..76cbb95e 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -110,6 +110,7 @@ enum vcpu_sysreg {
>  	SCTLR_EL1,	/* System Control Register */
>  	ACTLR_EL1,	/* Auxiliary Control Register */
>  	CPACR_EL1,	/* Coprocessor Access Control */
> +	ZCR_EL1,	/* SVE Control */
>  	TTBR0_EL1,	/* Translation Table Base Register 0 */
>  	TTBR1_EL1,	/* Translation Table Base Register 1 */
>  	TCR_EL1,	/* Translation Control Register */
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index c147093..dbac42f 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -418,6 +418,9 @@
>  #define SYS_ICH_LR14_EL2		__SYS__LR8_EL2(6)
>  #define SYS_ICH_LR15_EL2		__SYS__LR8_EL2(7)
>
> +/* VHE encodings for architectural EL0/1 system registers */
> +#define SYS_ZCR_EL12			sys_reg(3, 5, 1, 2, 0)
> +
>  /* Common SCTLR_ELx flags. */
>  #define SCTLR_ELx_EE    (1 << 25)
>  #define SCTLR_ELx_IESB	(1 << 21)
> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> index 55654cb..29e5585 100644
> --- a/arch/arm64/kvm/fpsimd.c
> +++ b/arch/arm64/kvm/fpsimd.c
> @@ -102,6 +102,9 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
>  void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
>  {
>  	unsigned long flags;
> +	bool host_has_sve = system_supports_sve();
> +	bool guest_has_sve =
> +		host_has_sve && (vcpu->arch.flags & KVM_ARM64_FP_ENABLED);

erm... didn't you create a KVM_ARM64_GUEST_HAS_SVE and vcpu_has_sve() for this?

>
>  	local_irq_save(flags);
>
> @@ -109,7 +112,11 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
>  		/* Clean guest FP state to memory and invalidate cpu view */
>  		fpsimd_save();
>  		fpsimd_flush_cpu_state();
> -	} else if (system_supports_sve()) {
> +
> +		if (guest_has_sve)
> +			vcpu->arch.ctxt.sys_regs[ZCR_EL1] =
> +				read_sysreg_s(SYS_ZCR_EL12);
> +	} else if (host_has_sve) {
>  		/*
>  		 * The FPSIMD/SVE state in the CPU has not been touched, and we
>  		 * have SVE (and VHE): CPACR_EL1 (alias CPTR_EL2) has been
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index ca46153..085ed06 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -366,6 +366,10 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
>
>  	__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
>
> +	if (system_supports_sve() &&
> +	    vcpu->arch.flags & KVM_ARM64_GUEST_HAS_SVE)

vcpu_has_sve(vcpu)

> +		write_sysreg_s(vcpu->arch.ctxt.sys_regs[ZCR_EL1], SYS_ZCR_EL12);
> +
>  	/* Skip restoring fpexc32 for AArch64 guests */
>  	if (!(read_sysreg(hcr_el2) & HCR_RW))
>  		write_sysreg(vcpu->arch.ctxt.sys_regs[FPEXC32_EL2],
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index adb6cbd..6f03211 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -1036,10 +1036,7 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu,
>  			 (u32)r->CRn, (u32)r->CRm, (u32)r->Op2);
>  	u64 val = raz ? 0 : read_sanitised_ftr_reg(id);
>
> -	if (id == SYS_ID_AA64PFR0_EL1) {
> -		if (val & (0xfUL << ID_AA64PFR0_SVE_SHIFT))
> -			kvm_debug("SVE unsupported for guests, suppressing\n");
> -
> +	if (id == SYS_ID_AA64PFR0_EL1 && !vcpu_has_sve(vcpu)) {
>  		val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT);
>  	} else if (id == SYS_ID_AA64MMFR1_EL1) {
>  		if (val & (0xfUL << ID_AA64MMFR1_LOR_SHIFT))
> @@ -1083,6 +1080,105 @@ static int reg_from_user(u64 *val, const void __user *uaddr, u64 id);
>  static int reg_to_user(void __user *uaddr, const u64 *val, u64 id);
>  static u64 sys_reg_to_index(const struct sys_reg_desc *reg);
>
> +#ifdef CONFIG_ARM64_SVE
> +static bool sve_check_present(const struct kvm_vcpu *vcpu,
> +			      const struct sys_reg_desc *rd)
> +{
> +	return vcpu_has_sve(vcpu);
> +}
> +
> +static bool access_zcr_el1(struct kvm_vcpu *vcpu,
> +			   struct sys_reg_params *p,
> +			   const struct sys_reg_desc *rd)
> +{
> +	/*
> +	 * ZCR_EL1 access is handled directly in Hyp as part of the FPSIMD/SVE
> +	 * context, so we should only arrive here for non-SVE guests:
> +	 */
> +	WARN_ON(vcpu_has_sve(vcpu));
> +
> +	kvm_inject_undefined(vcpu);
> +	return false;
> +}
> +
> +static int get_zcr_el1(struct kvm_vcpu *vcpu,
> +		       const struct sys_reg_desc *rd,
> +		       const struct kvm_one_reg *reg, void __user *uaddr)
> +{
> +	if (!vcpu_has_sve(vcpu))
> +		return -ENOENT;
> +
> +	return reg_to_user(uaddr, &vcpu->arch.ctxt.sys_regs[ZCR_EL1],
> +			   reg->id);
> +}
> +
> +static int set_zcr_el1(struct kvm_vcpu *vcpu,
> +		       const struct sys_reg_desc *rd,
> +		       const struct kvm_one_reg *reg, void __user *uaddr)
> +{
> +	if (!vcpu_has_sve(vcpu))
> +		return -ENOENT;
> +
> +	return reg_from_user(&vcpu->arch.ctxt.sys_regs[ZCR_EL1], uaddr,
> +			     reg->id);
> +}
> +
> +/* Generate the emulated ID_AA64ZFR0_EL1 value exposed to the guest */
> +static u64 guest_id_aa64zfr0_el1(const struct kvm_vcpu *vcpu)
> +{
> +	if (!vcpu_has_sve(vcpu))
> +		return 0;
> +
> +	return read_sanitised_ftr_reg(SYS_ID_AA64ZFR0_EL1);
> +}
> +
> +static bool access_id_aa64zfr0_el1(struct kvm_vcpu *vcpu,
> +				   struct sys_reg_params *p,
> +				   const struct sys_reg_desc *rd)
> +{
> +	if (p->is_write)
> +		return write_to_read_only(vcpu, p, rd);
> +
> +	p->regval = guest_id_aa64zfr0_el1(vcpu);
> +	return true;
> +}
> +
> +static int get_id_aa64zfr0_el1(struct kvm_vcpu *vcpu,
> +		const struct sys_reg_desc *rd,
> +		const struct kvm_one_reg *reg, void __user *uaddr)
> +{
> +	u64 val;
> +
> +	if (!vcpu_has_sve(vcpu))
> +		return -ENOENT;
> +
> +	val = guest_id_aa64zfr0_el1(vcpu);
> +	return reg_to_user(uaddr, &val, reg->id);
> +}
> +
> +static int set_id_aa64zfr0_el1(struct kvm_vcpu *vcpu,
> +		const struct sys_reg_desc *rd,
> +		const struct kvm_one_reg *reg, void __user *uaddr)
> +{
> +	const u64 id = sys_reg_to_index(rd);
> +	int err;
> +	u64 val;
> +
> +	if (!vcpu_has_sve(vcpu))
> +		return -ENOENT;
> +
> +	err = reg_from_user(&val, uaddr, id);
> +	if (err)
> +		return err;
> +
> +	/* This is what we mean by invariant: you can't change it. */
> +	if (val != guest_id_aa64zfr0_el1(vcpu))
> +		return -EINVAL;
> +
> +	return 0;
> +}
> +#endif /* CONFIG_ARM64_SVE */
> +
>  /*
>   * cpufeature ID register user accessors
>   *
> @@ -1270,7 +1366,11 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>  	ID_SANITISED(ID_AA64PFR1_EL1),
>  	ID_UNALLOCATED(4,2),
>  	ID_UNALLOCATED(4,3),
> +#ifdef CONFIG_ARM64_SVE
> +	{ SYS_DESC(SYS_ID_AA64ZFR0_EL1), access_id_aa64zfr0_el1, .get_user = get_id_aa64zfr0_el1, .set_user = set_id_aa64zfr0_el1, .check_present = sve_check_present },
> +#else
>  	ID_UNALLOCATED(4,4),
> +#endif
>  	ID_UNALLOCATED(4,5),
>  	ID_UNALLOCATED(4,6),
>  	ID_UNALLOCATED(4,7),
> @@ -1307,6 +1407,9 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>
>  	{ SYS_DESC(SYS_SCTLR_EL1), access_vm_reg, reset_val, SCTLR_EL1, 0x00C50078 },
>  	{ SYS_DESC(SYS_CPACR_EL1), NULL, reset_val, CPACR_EL1, 0 },
> +#ifdef CONFIG_ARM64_SVE
> +	{ SYS_DESC(SYS_ZCR_EL1), access_zcr_el1, reset_unknown, ZCR_EL1, ~0xfUL, .get_user = get_zcr_el1, .set_user = set_zcr_el1, .check_present = sve_check_present },
> +#endif
>  	{ SYS_DESC(SYS_TTBR0_EL1), access_vm_reg, reset_unknown, TTBR0_EL1 },
>  	{ SYS_DESC(SYS_TTBR1_EL1), access_vm_reg, reset_unknown, TTBR1_EL1 },
>  	{ SYS_DESC(SYS_TCR_EL1), access_vm_reg, reset_val, TCR_EL1, 0 },

Overlong lines.


--
Alex Bennée

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 05/23] KVM: arm: Add arch vcpu uninit hook
  2018-11-02  8:05     ` Christoffer Dall
@ 2018-11-15 16:40       ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-15 16:40 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Fri, Nov 02, 2018 at 09:05:36AM +0100, Christoffer Dall wrote:
> On Fri, Sep 28, 2018 at 02:39:09PM +0100, Dave Martin wrote:
> > In preparation for adding support for SVE in guests on arm64, a
> > hook is needed for freeing additional per-vcpu memory when a vcpu
> > is freed.
> 
> Can this commit motivate why we can't do the work in kvm_arch_vcpu_free,
> which we use for freeing other data structures?
> 
> (Presumably, uninit is needed when you need to do something at the very
> last step after releasing the struct pid.)

It wasn't to do with that.

Rather, the division of responsibility between the vcpu_uninit and
vcpu_free paths is not very clear.

In the earlier version of the series, I think SVE state may have been
allocated rather early and we may have needed to free it in the failure
path of kvm_arch_vcpu_create() (which just calls kvm_vcpu_uninit()).
(Alternatively, I may just have been wrong.)

Now, the vcpu must be fully created before the KVM_ARM_SVE_CONFIG ioctl
on it (which is what allocates sve_state) can succeed anyway.

So the distinction between these two teardown phases is probably no
longer important.

I'll see whether I can get rid of this hook and free the SVE state in
kvm_arch_vcpu_free() instead.
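
For illustration, roughly what I have in mind (sketch only; assumes
sve_state is the per-vcpu buffer allocated by KVM_ARM_SVE_CONFIG, and
that the existing teardown stays as-is):

	void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
	{
		/* ... existing teardown calls ... */
		kfree(vcpu->arch.sve_state);	/* kfree(NULL) is a no-op */
		kmem_cache_free(kvm_vcpu_cache, vcpu);
	}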

Does that make sense?

Cheers
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 06/23] arm64/sve: Check SVE virtualisability
  2018-11-15 15:39     ` Alex Bennée
@ 2018-11-15 17:09       ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-15 17:09 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Nov 15, 2018 at 03:39:01PM +0000, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > Due to the way the effective SVE vector length is controlled and
> > trapped at different exception levels, certain mismatches in the
> > sets of vector lengths supported by different physical CPUs in the
> > system may prevent straightforward virtualisation of SVE at parity
> > with the host.
> >
> > This patch analyses the extent to which SVE can be virtualised
> > safely without interfering with migration of vcpus between physical
> > CPUs, and rejects late secondary CPUs that would erode the
> > situation further.
> >
> > It is left up to KVM to decide what to do with this information.
> >
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >
> > Changes since RFCv1:
> >
> >  * The analysis done by this patch is the same as in the previous
> >    version, but the commit message the printks etc. have been reworded
> >    to avoid the suggestion that KVM is expected to work on a system with
> >    mismatched SVE implementations.
> > ---
> >  arch/arm64/include/asm/fpsimd.h |  1 +
> >  arch/arm64/kernel/cpufeature.c  |  2 +-
> >  arch/arm64/kernel/fpsimd.c      | 87 +++++++++++++++++++++++++++++++++++------
> >  3 files changed, 76 insertions(+), 14 deletions(-)
> >

[...]

> > diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c

[...]

> > @@ -623,11 +629,8 @@ int sve_get_current_vl(void)

[...]

> > +/* Bitmaps for temporary storage during manipulation of vector length sets */
> > +static DECLARE_BITMAP(sve_tmp_vq_map, SVE_VQ_MAX);
> 
> This seems odd as a local global, why not declared locally when used?

Could do.

My original concern was that this is "big" and therefore it's impolite
to allocate it on the stack.

But on reflection, 64 bytes of stack is no big deal for a 64-bit
architecture.  The affected functions probably spill more than that
already, and these functions are called on well-defined paths which
shouldn't have super-deep stacks.
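
Something like this, say (sketch):

	void sve_update_vq_map(void)
	{
		DECLARE_BITMAP(tmp_map, SVE_VQ_MAX);	/* 64 bytes on-stack */

		sve_probe_vqs(tmp_map);
		bitmap_and(sve_vq_map, sve_vq_map, tmp_map, SVE_VQ_MAX);
		bitmap_or(sve_vq_partial_map, sve_vq_partial_map, tmp_map,
			  SVE_VQ_MAX);
	}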

[...]

> > @@ -658,24 +662,60 @@ void __init sve_init_vq_map(void)
> >   */
> >  void sve_update_vq_map(void)
> >  {
> > -	sve_probe_vqs(sve_secondary_vq_map);
> > -	bitmap_and(sve_vq_map, sve_vq_map, sve_secondary_vq_map, SVE_VQ_MAX);
> > +	sve_probe_vqs(sve_tmp_vq_map);
> > +	bitmap_and(sve_vq_map, sve_vq_map, sve_tmp_vq_map,
> > +		   SVE_VQ_MAX);
> > +	bitmap_or(sve_vq_partial_map, sve_vq_partial_map, sve_tmp_vq_map,
> > +		  SVE_VQ_MAX);
> >  }
> 
> I'm not quite following what's going on here. This is tracking both the
> vector lengths available on all CPUs and the ones available on at least
> one CPU? This raises some questions:
> 
>   - do such franken-machines exist or are they expected to exist?

no, and yes respectively (Linux does not endorse the latter for now,
since it results in a non-SMP system: we hide the asymmetries where
possible by clamping the set of available vector lengths, but for
KVM it's too hard and we don't aim to support it at all).

Even if we don't recommend deploying a general-purpose OS on such a
system, people will eventually try it.  So it's better to fail safe
rather than silently doing the wrong thing.
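
To spell out what the two maps are for (schematic only; "unsafe" here
is just a scratch bitmap for illustration):

	/* sve_vq_map:         VQs supported by every CPU probed so far */
	/* sve_vq_partial_map: VQs supported by at least one CPU */

	/* VQs usable on only a subset of CPUs; KVM must not offer these: */
	bitmap_andnot(unsafe, sve_vq_partial_map, sve_vq_map, SVE_VQ_MAX);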

>   - how do we ensure this is always up to date?

This gets updated for each early secondary CPU that comes up.  (Early
secondaries' boot is serialised, so we shouldn't have to worry about
races here.)

The configuration is frozen by the time we enter userspace (hence
__ro_after_init).

Once all the early secondaries have come up, we commit to the best
possible set of vector lengths for the CPUs that we know about, and we
don't call this path any more: instead, each late secondary goes into
sve_verify_vq_map() to check that those CPUs are compatible
with the configuration we committed to.

For context, take a look at
arch/arm64/kernel/cpufeature.c:check_local_cpu_capabilities(), which is
the common entry point for all secondary CPUs: that splits into
update_cpu_capabilities() and verify_local_cpu_capabilities() paths for
the two cases described above, calling down into sve_update_vq_map()
and sve_verify_vq_map() as appropriate.
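
In outline (paraphrased, not the exact code):

	void check_local_cpu_capabilities(void)
	{
		if (!sys_caps_initialised)
			update_cpu_capabilities(...);	/* early secondary */
		else
			verify_local_cpu_capabilities();	/* late secondary */
	}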

>   - what happens when we hotplug a new CPU with less available VQ?

We reject the CPU and throw it back to the firmware (see
cpufeature.c:verify_sve_features()).

This follows the precedent already set in verify_local_cpu_capabilities()
etc.

> 
> >
> >  /* Check whether the current CPU supports all VQs in the committed set */
> >  int sve_verify_vq_map(void)
> >  {
> > -	int ret = 0;
> > +	int ret = -EINVAL;
> > +	unsigned long b;
> >
> > -	sve_probe_vqs(sve_secondary_vq_map);
> > -	bitmap_andnot(sve_secondary_vq_map, sve_vq_map, sve_secondary_vq_map,
> > -		      SVE_VQ_MAX);
> > -	if (!bitmap_empty(sve_secondary_vq_map, SVE_VQ_MAX)) {
> > +	sve_probe_vqs(sve_tmp_vq_map);
> > +
> > +	bitmap_complement(sve_tmp_vq_map, sve_tmp_vq_map, SVE_VQ_MAX);
> > +	if (bitmap_intersects(sve_tmp_vq_map, sve_vq_map, SVE_VQ_MAX)) {
> >  		pr_warn("SVE: cpu%d: Required vector length(s) missing\n",
> >  			smp_processor_id());
> > -		ret = -EINVAL;
> > +		goto error;
> 
> The use of goto seems a little premature considering we don't have any
> clean-up to do.

Hmm, this does look a little overengineered.  I think it may have been
more complex during development (making the gotos less redundant), but
to be honest I don't remember now.

I'm happy to get rid of the rather pointless ret variable and replace
all the gotos with returns if that works for you.
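
i.e., roughly (sketch):

	if (bitmap_intersects(sve_tmp_vq_map, sve_vq_map, SVE_VQ_MAX)) {
		pr_warn("SVE: cpu%d: Required vector length(s) missing\n",
			smp_processor_id());
		return -EINVAL;
	}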

What do you think?

[...]

Cheers
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 10/23] KVM: arm64: Extend reset_unknown() to handle mixed RES0/UNKNOWN registers
  2018-11-02  8:11     ` Christoffer Dall
@ 2018-11-15 17:11       ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-15 17:11 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Fri, Nov 02, 2018 at 09:11:19AM +0100, Christoffer Dall wrote:
> On Fri, Sep 28, 2018 at 02:39:14PM +0100, Dave Martin wrote:
> > The reset_unknown() system register helper initialises a guest
> > register to a distinctive junk value on vcpu reset, to help expose
> > and debug deficient register initialisation within the guest.
> > 
> > Some registers such as the SVE control register ZCR_EL1 contain a
> > mixture of UNKNOWN fields and RES0 bits.  For these,
> > reset_unknown() does not work at present, since it sets all bits to
> > junk values instead of just the wanted bits.
> > 
> > There is no need to craft another special helper just for that,
> > since reset_unknown() almost does the appropriate thing anyway.
> > This patch takes advantage of the unused val field in struct
> > sys_reg_desc to specify a mask of bits that should be initialised
> > to zero instead of junk.
> > 
> > All existing users of reset_unknown() do not (and should not)
> > define a value for val, so they will implicitly set it to zero,
> > resulting in all bits being made UNKNOWN by this function: thus,
> > this patch makes no functional change for currently defined
> > registers.
> > 
> > Future patches will make use of non-zero val.
> > 
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >  arch/arm64/kvm/sys_regs.h | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
> > index cd710f8..24bac06 100644
> > --- a/arch/arm64/kvm/sys_regs.h
> > +++ b/arch/arm64/kvm/sys_regs.h
> > @@ -89,7 +89,9 @@ static inline void reset_unknown(struct kvm_vcpu *vcpu,
> >  {
> >  	BUG_ON(!r->reg);
> >  	BUG_ON(r->reg >= NR_SYS_REGS);
> > -	__vcpu_sys_reg(vcpu, r->reg) = 0x1de7ec7edbadc0deULL;
> > +
> > +	/* If non-zero, r->val specifies which register bits are RES0: */
> > +	__vcpu_sys_reg(vcpu, r->reg) = 0x1de7ec7edbadc0deULL & ~r->val;
> 
> nit: it would be nice to document this feature on the val field in the
> sys_reg_desc structure above as well.

Sure thing, I missed that, but we _do_ want this clearly documented.

I'll add that when I respin.
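
Something like this next to the field, perhaps (sketch):

	/*
	 * Value (usually the reset value), or, when used with
	 * reset_unknown(), a mask of register bits to force to zero (RES0):
	 */
	u64 val;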

[...]

Cheers
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST
  2018-11-02  8:16     ` Christoffer Dall
@ 2018-11-15 17:27       ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-15 17:27 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Fri, Nov 02, 2018 at 09:16:25AM +0100, Christoffer Dall wrote:
> On Fri, Sep 28, 2018 at 02:39:15PM +0100, Dave Martin wrote:
> > KVM_GET_REG_LIST should only enumerate registers that are actually
> > accessible, so it is necessary to filter out any register that is
> > not exposed to the guest.  For features that are configured at
> > runtime, this will require a dynamic check.
> > 
> > For example, ZCR_EL1 and ID_AA64ZFR0_EL1 would need to be hidden
> > if SVE is not enabled for the guest.
> 
> This implies that userspace can never access this interface for a vcpu
> before having decided whether such features are enabled for the guest or
> not, since otherwise userspace will see different states for a VCPU
> depending on sequencing of the API, which sounds fragile to me.
> 
> That should probably be documented somewhere, and I hope the
> enable/disable API for SVE in guests already takes that into account.
> 
> Not sure if there's an action to take here, but it was the best place I
> could raise this concern.

Fair point.  I struggled to come up with something better that solves
all problems.

My expectation is that KVM_ARM_SVE_CONFIG_SET is considered part of
creating the vcpu, so that if issued at all for a vcpu, it is issued
very soon after KVM_VCPU_INIT.
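
That is, userspace would do something like (illustrative only):

	vcpu_fd = ioctl(vm_fd, KVM_CREATE_VCPU, idx);
	ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &init);
	/* proposed: configure SVE before any KVM_RUN / reg access */
	ioctl(vcpu_fd, KVM_ARM_SVE_CONFIG, &sve_config);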

I think this worked OK with the current structure of kvmtool and I
seem to remember discussing this with Peter Maydell re qemu -- but
it sounds like I should double-check.

Either way, you're right, this needs to be clearly documented.


If we want to be more robust, maybe we should add a capability too,
so that userspace that enables this capability promises to call
KVM_ARM_SVE_CONFIG_SET for each vcpu, and affected ioctls (KVM_RUN,
KVM_GET_REG_LIST etc.) are forbidden until that is done?

That should help avoid accidents.
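
Schematically, in each affected ioctl (sketch; both field names are
invented for illustration):

	/* hypothetical state set by the new capability / new ioctl */
	if (vcpu->kvm->arch.sve_cap_enabled && !vcpu->arch.sve_configured)
		return -EBUSY;	/* KVM_ARM_SVE_CONFIG_SET not yet issued */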

I could add a special meaning for an empty kvm_sve_vls, such that
it doesn't enable SVE on the affected vcpu.  That retains the ability
to create heterogeneous guests while still following the above flow.

Thoughts?

[...]

Cheers
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 12/23] KVM: arm64/sve: System register context switch and access support
  2018-11-15 16:37     ` Alex Bennée
@ 2018-11-15 17:59       ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-15 17:59 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Nov 15, 2018 at 04:37:59PM +0000, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > This patch adds the necessary support for context switching ZCR_EL1
> > for each vcpu.
> >
> > ZCR_EL1 is trapped alongside the FPSIMD/SVE registers, so it makes
> > sense for it to be handled as part of the guest FPSIMD/SVE context
> > for context switch purposes instead of handling it as a general
> > system register.  This means that it can be switched in lazily at
> > the appropriate time.  No effort is made to track host context for
> > this register, since SVE requires VHE: thus the host's value for
> > this register lives permanently in ZCR_EL2 and does not alias the
> > guest's value at any time.
> >
> > The Hyp switch and fpsimd context handling code is extended
> > appropriately.
> >
> > Accessors are added in sys_regs.c to expose the SVE system
> > registers and ID register fields.  Because these need to be
> > conditionally visible based on the guest configuration, they are
> > implemented separately for now rather than by use of the generic
> > system register helpers.  This may be abstracted better later on
> > when/if there are more features requiring this model.
> >
> > ID_AA64ZFR0_EL1 is RO-RAZ for MRS/MSR when SVE is disabled for the
> > guest, but for compatibility with non-SVE aware KVM implementations
> > the register should not be enumerated at all for KVM_GET_REG_LIST
> > in this case.  For consistency we also reject ioctl access to the
> > register.  This ensures that a non-SVE-enabled guest looks the same
> > to userspace, irrespective of whether the kernel KVM implementation
> > supports SVE.
> >
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >
> > Changes since RFCv1:
> >
> >  * The conditional visibility logic in sys_regs.c has been
> >    simplified.
> >
> >  * The guest's ZCR_EL1 is now treated as part of the FPSIMD/SVE state
> >    for switching purposes.  Any access to this register before it is
> >    switched in generates an SVE trap, so we have a chance to switch it
> >    along with the vector registers.
> >
> >    Because SVE is only available with VHE there is no need ever to
> >    restore the host's version of this register (which instead lives
> >    permanently in ZCR_EL2).
> > ---
> >  arch/arm64/include/asm/kvm_host.h |   1 +
> >  arch/arm64/include/asm/sysreg.h   |   3 ++
> >  arch/arm64/kvm/fpsimd.c           |   9 +++-
> >  arch/arm64/kvm/hyp/switch.c       |   4 ++
> >  arch/arm64/kvm/sys_regs.c         | 111 ++++++++++++++++++++++++++++++++++++--
> >  5 files changed, 123 insertions(+), 5 deletions(-)

[...]

> > diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> > index 55654cb..29e5585 100644
> > --- a/arch/arm64/kvm/fpsimd.c
> > +++ b/arch/arm64/kvm/fpsimd.c
> > @@ -102,6 +102,9 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
> >  void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
> >  {
> >  	unsigned long flags;
> > +	bool host_has_sve = system_supports_sve();
> > +	bool guest_has_sve =
> > +		host_has_sve && (vcpu->arch.flags & KVM_ARM64_FP_ENABLED);
> 
> erm... didn't you create a KVM_ARM64_GUEST_HAS_SVE and vcpu_has_sve() for this?

Hmmm, I think this should indeed say KVM_ARM64_GUEST_HAS_SVE.
(Otherwise it would be redundant with the if() conditions that follow.)

I'll use vcpu_has_sve() if possible.  There may have been some reason
why I didn't use it here, but I'd need to go over the code again.
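
i.e., something like (sketch):

	bool guest_has_sve = vcpu_has_sve(vcpu);

	...

	if (guest_has_sve)
		vcpu->arch.ctxt.sys_regs[ZCR_EL1] =
			read_sysreg_s(SYS_ZCR_EL12);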

> >
> >  	local_irq_save(flags);
> >
> > @@ -109,7 +112,11 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
> >  		/* Clean guest FP state to memory and invalidate cpu view */
> >  		fpsimd_save();
> >  		fpsimd_flush_cpu_state();
> > -	} else if (system_supports_sve()) {
> > +
> > +		if (guest_has_sve)
> > +			vcpu->arch.ctxt.sys_regs[ZCR_EL1] =
> > +				read_sysreg_s(SYS_ZCR_EL12);
> > +	} else if (host_has_sve) {
> >  		/*
> >  		 * The FPSIMD/SVE state in the CPU has not been touched, and we
> >  		 * have SVE (and VHE): CPACR_EL1 (alias CPTR_EL2) has been

[...]

> > @@ -1270,7 +1366,11 @@ static const struct sys_reg_desc sys_reg_descs[] = {
> >  	ID_SANITISED(ID_AA64PFR1_EL1),
> >  	ID_UNALLOCATED(4,2),
> >  	ID_UNALLOCATED(4,3),
> > +#ifdef CONFIG_ARM64_SVE
> > +	{ SYS_DESC(SYS_ID_AA64ZFR0_EL1), access_id_aa64zfr0_el1, .get_user = get_id_aa64zfr0_el1, .set_user = set_id_aa64zfr0_el1, .check_present = sve_check_present },
> > +#else
> >  	ID_UNALLOCATED(4,4),
> > +#endif
> >  	ID_UNALLOCATED(4,5),
> >  	ID_UNALLOCATED(4,6),
> >  	ID_UNALLOCATED(4,7),
> > @@ -1307,6 +1407,9 @@ static const struct sys_reg_desc sys_reg_descs[] = {
> >
> >  	{ SYS_DESC(SYS_SCTLR_EL1), access_vm_reg, reset_val, SCTLR_EL1, 0x00C50078 },
> >  	{ SYS_DESC(SYS_CPACR_EL1), NULL, reset_val, CPACR_EL1, 0 },
> > +#ifdef CONFIG_ARM64_SVE
> > +	{ SYS_DESC(SYS_ZCR_EL1), access_zcr_el1, reset_unknown, ZCR_EL1, ~0xfUL, .get_user = get_zcr_el1, .set_user = set_zcr_el1, .check_present = sve_check_present },
> > +#endif
> >  	{ SYS_DESC(SYS_TTBR0_EL1), access_vm_reg, reset_unknown, TTBR0_EL1 },
> >  	{ SYS_DESC(SYS_TTBR1_EL1), access_vm_reg, reset_unknown, TTBR1_EL1 },
> >  	{ SYS_DESC(SYS_TCR_EL1), access_vm_reg, reset_val, TCR_EL1, 0 },
> 
> Overlong lines.

Fair point, but I'll defer to the maintainers on this.  sys_regs.c
already has overlong lines to some extent (for a suitable definition of
"overlong") but there seems to be a preference of keeping to one entry
per line in these tables.

I'm happy to wrap these or not, as people prefer.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 20/23] KVM: arm64: Add arch vm ioctl hook
  2018-11-02  8:32     ` Christoffer Dall
@ 2018-11-15 18:04       ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-15 18:04 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Fri, Nov 02, 2018 at 09:32:27AM +0100, Christoffer Dall wrote:
> On Fri, Sep 28, 2018 at 02:39:24PM +0100, Dave Martin wrote:
> > To enable arm64-specific vm ioctls to be added cleanly, this patch
> > adds a kvm_arm_arch_vm_ioctl() hook so that these don't pollute the
> > common code.
> 
> Hmmm, I don't really see the strength of that argument, and have the
> same concern as before.  I'd like to avoid the additional indirection
> and instead just follow the existing pattern with a dummy implementation
> on the 32-bit side that returns an error.

So for this and the similar comment on patch 18, this was premature (or
at least, overzealous) factoring on my part.

I'm happy to merge this back together for arm and arm64 as you prefer.

Do we have a nice way of writing the arch check, e.g.

	case KVM_ARM_SVE_CONFIG:
		if (!IS_ENABLED(CONFIG_ARM64))
			return -EINVAL;
		else
			return kvm_vcpu_sve_config(NULL, userp);

should work, but looks a bit strange.  Maybe I'm just being fussy.

Is there a better way that I'm missing?
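
For comparison, the dummy-stub pattern suggested above might look
roughly like this (a sketch only; the helper's exact home and signature
are still to be decided):

	/* 32-bit arm side: SVE does not exist here, so always reject */
	static inline int kvm_vcpu_sve_config(struct kvm_vcpu *vcpu,
					      void __user *userp)
	{
		return -EINVAL;
	}

The common ioctl code could then call kvm_vcpu_sve_config()
unconditionally, with no arch check at the call site.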

Cheers
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 06/23] arm64/sve: Check SVE virtualisability
  2018-11-15 17:09       ` Dave Martin
@ 2018-11-16 12:32         ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-16 12:32 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> On Thu, Nov 15, 2018 at 03:39:01PM +0000, Alex Bennée wrote:
>>
>> Dave Martin <Dave.Martin@arm.com> writes:
>>
>> > Due to the way the effective SVE vector length is controlled and
>> > trapped at different exception levels, certain mismatches in the
>> > sets of vector lengths supported by different physical CPUs in the
>> > system may prevent straightforward virtualisation of SVE at parity
>> > with the host.
>> >
>> > This patch analyses the extent to which SVE can be virtualised
>> > safely without interfering with migration of vcpus between physical
>> > CPUs, and rejects late secondary CPUs that would erode the
>> > situation further.
>> >
>> > It is left up to KVM to decide what to do with this information.
>> >
>> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
>> > ---
>> >
>> > Changes since RFCv1:
>> >
>> >  * The analysis done by this patch is the same as in the previous
>> >    version, but the commit message, the printks etc. have been reworded
>> >    to avoid the suggestion that KVM is expected to work on a system with
>> >    mismatched SVE implementations.
>> > ---
>> >  arch/arm64/include/asm/fpsimd.h |  1 +
>> >  arch/arm64/kernel/cpufeature.c  |  2 +-
>> >  arch/arm64/kernel/fpsimd.c      | 87 +++++++++++++++++++++++++++++++++++------
>> >  3 files changed, 76 insertions(+), 14 deletions(-)
>> >
>
> [...]
>
>> > diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
>
> [...]
>
>> > @@ -623,11 +629,8 @@ int sve_get_current_vl(void)
>
> [...]
>
>> > +/* Bitmaps for temporary storage during manipulation of vector length sets */
>> > +static DECLARE_BITMAP(sve_tmp_vq_map, SVE_VQ_MAX);
>>
>> This seems odd as a local global, why not declare it locally when used?
>
> Could do.
>
> My original concern was that this is "big" and therefore it's impolite
> to allocate it on the stack.
>
> But on reflection, 64 bytes of stack is no big deal for a 64-bit
> architecture.  The affected functions probably spill more than that
> already, and these functions are called on well-defined paths which
> shouldn't have super-deep stacks already.
>
> [...]
>
>> > @@ -658,24 +662,60 @@ void __init sve_init_vq_map(void)
>> >   */
>> >  void sve_update_vq_map(void)
>> >  {
>> > -	sve_probe_vqs(sve_secondary_vq_map);
>> > -	bitmap_and(sve_vq_map, sve_vq_map, sve_secondary_vq_map, SVE_VQ_MAX);
>> > +	sve_probe_vqs(sve_tmp_vq_map);
>> > +	bitmap_and(sve_vq_map, sve_vq_map, sve_tmp_vq_map,
>> > +		   SVE_VQ_MAX);
>> > +	bitmap_or(sve_vq_partial_map, sve_vq_partial_map, sve_tmp_vq_map,
>> > +		  SVE_VQ_MAX);
>> >  }
>>
>> I'm not quite following what's going on here. This is tracking both the
>> vector lengths available on all CPUs and the ones available on at least
>> one CPU? This raises some questions:
>>
>>   - do such franken-machines exist, or are they expected to exist?
>
> no, and yes respectively (Linux does not endorse the latter for now,
> since it results in a non-SMP system: we hide the asymmetries where
> possible by clamping the set of available vector lengths, but for
> KVM it's too hard and we don't aim to support it at all).
>
> Even if we don't recommend deploying a general-purpose OS on such a
> system, people will eventually try it.  So it's better to fail safe
> rather than silently doing the wrong thing.
>
>>   - how do we ensure this is always up to date?
>
> This gets updated for each early secondary CPU that comes up.  (Early
> secondaries' boot is serialised, so we shouldn't have to worry about
> races here.)
>
> The configuration is frozen by the time we enter userspace (hence
> __ro_after_init).
>
> Once all the early secondaries have come up, we commit to the best
> possible set of vector lengths for the CPUs that we know about, and we
> don't call this path any more: instead, each late secondary goes into
> sve_verify_vq_map() instead to check that those CPUs are compatible
> with the configuration we committed to.
>
> For context, take a look at
> arch/arm64/kernel/cpufeature.c:check_local_cpu_capabilities(), which is
> the common entry point for all secondary CPUs: that splits into
> update_cpu_capabilities() and verify_local_cpu_capabilities() paths for
> the two cases described above, calling down into sve_update_vq_map()
> and sve_verify_vq_map() as appropriate.
>
>>   - what happens when we hotplug a new CPU with less available VQ?
>
> We reject the CPU and throw it back to the firmware (see
> cpufeature.c:verify_sve_features()).
>
> This follows the precedent already set in verify_local_cpu_capabilities()
> etc.

I think a few words to that effect in the function comments would be
helpful:

  /*
   * sve_update_vq_map only cares about CPUs at boot time and is called
   * serially for each one. Any CPUs added later via hotplug will fail
   * at sve_verify_vq_map if they don't match what is detected here.
   */

>
>>
>> >
>> >  /* Check whether the current CPU supports all VQs in the committed set */
>> >  int sve_verify_vq_map(void)
>> >  {
>> > -	int ret = 0;
>> > +	int ret = -EINVAL;
>> > +	unsigned long b;
>> >
>> > -	sve_probe_vqs(sve_secondary_vq_map);
>> > -	bitmap_andnot(sve_secondary_vq_map, sve_vq_map, sve_secondary_vq_map,
>> > -		      SVE_VQ_MAX);
>> > -	if (!bitmap_empty(sve_secondary_vq_map, SVE_VQ_MAX)) {
>> > +	sve_probe_vqs(sve_tmp_vq_map);
>> > +
>> > +	bitmap_complement(sve_tmp_vq_map, sve_tmp_vq_map, SVE_VQ_MAX);
>> > +	if (bitmap_intersects(sve_tmp_vq_map, sve_vq_map, SVE_VQ_MAX)) {
>> >  		pr_warn("SVE: cpu%d: Required vector length(s) missing\n",
>> >  			smp_processor_id());
>> > -		ret = -EINVAL;
>> > +		goto error;
>>
>> The use of goto seems a little premature considering we don't have any
>> clean-up to do.
>
> Hmm, this does look a little overengineered.  I think it may have been
> more complex during development (making the gotos less redundant), but
> to be honest I don't remember now.
>
> I'm happy to get rid of the rather pointless ret variable and replace
> all the gotos with returns if that works for you.
>
> What do you think?

Yes please, that would be cleaner.
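
Reconstructed from the hunks quoted above, it might end up looking
something like this (a sketch only; the rest of the function is not
shown in the diff):

	int sve_verify_vq_map(void)
	{
		DECLARE_BITMAP(tmp_map, SVE_VQ_MAX);	/* 64 bytes on-stack */

		sve_probe_vqs(tmp_map);

		bitmap_complement(tmp_map, tmp_map, SVE_VQ_MAX);
		if (bitmap_intersects(tmp_map, sve_vq_map, SVE_VQ_MAX)) {
			pr_warn("SVE: cpu%d: Required vector length(s) missing\n",
				smp_processor_id());
			return -EINVAL;
		}

		/* ... remaining checks likewise return -EINVAL directly ... */
		return 0;
	}

This also folds in the earlier suggestion of declaring the temporary
bitmap on the stack.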

--
Alex Bennée

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 06/23] arm64/sve: Check SVE virtualisability
  2018-11-16 12:32         ` Alex Bennée
@ 2018-11-16 15:09           ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-16 15:09 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Fri, Nov 16, 2018 at 12:32:18PM +0000, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > On Thu, Nov 15, 2018 at 03:39:01PM +0000, Alex Bennée wrote:
> >>
> >> Dave Martin <Dave.Martin@arm.com> writes:
> >>
> >> > Due to the way the effective SVE vector length is controlled and
> >> > trapped at different exception levels, certain mismatches in the
> >> > sets of vector lengths supported by different physical CPUs in the
> >> > system may prevent straightforward virtualisation of SVE at parity
> >> > with the host.
> >> >
> >> > This patch analyses the extent to which SVE can be virtualised
> >> > safely without interfering with migration of vcpus between physical
> >> > CPUs, and rejects late secondary CPUs that would erode the
> >> > situation further.
> >> >
> >> > It is left up to KVM to decide what to do with this information.
> >> >
> >> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> >> > ---
> >> >
> >> > Changes since RFCv1:
> >> >
> >> >  * The analysis done by this patch is the same as in the previous
> >> >    version, but the commit message, the printks etc. have been reworded
> >> >    to avoid the suggestion that KVM is expected to work on a system with
> >> >    mismatched SVE implementations.
> >> > ---
> >> >  arch/arm64/include/asm/fpsimd.h |  1 +
> >> >  arch/arm64/kernel/cpufeature.c  |  2 +-
> >> >  arch/arm64/kernel/fpsimd.c      | 87 +++++++++++++++++++++++++++++++++++------
> >> >  3 files changed, 76 insertions(+), 14 deletions(-)
> >> >
> >
> > [...]
> >
> >> > diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c

[...]

> >> > @@ -658,24 +662,60 @@ void __init sve_init_vq_map(void)
> >> >   */
> >> >  void sve_update_vq_map(void)
> >> >  {
> >> > -	sve_probe_vqs(sve_secondary_vq_map);
> >> > -	bitmap_and(sve_vq_map, sve_vq_map, sve_secondary_vq_map, SVE_VQ_MAX);
> >> > +	sve_probe_vqs(sve_tmp_vq_map);
> >> > +	bitmap_and(sve_vq_map, sve_vq_map, sve_tmp_vq_map,
> >> > +		   SVE_VQ_MAX);
> >> > +	bitmap_or(sve_vq_partial_map, sve_vq_partial_map, sve_tmp_vq_map,
> >> > +		  SVE_VQ_MAX);
> >> >  }
> >>
> >> I'm not quite following what's going on here. This is tracking both the
> >> vector lengths available on all CPUs and the ones available on at least
> >> one CPU? This raises some questions:
> >>
> >>   - do such franken-machines exist, or are they expected to exist?
> >
> > no, and yes respectively (Linux does not endorse the latter for now,
> > since it results in a non-SMP system: we hide the asymmetries where
> > possible by clamping the set of available vector lengths, but for
> > KVM it's too hard and we don't aim to support it at all).
> >
> > Even if we don't recommend deploying a general-purpose OS on such a
> > system, people will eventually try it.  So it's better to fail safe
> > rather than silently doing the wrong thing.
> >
> >>   - how do we ensure this is always up to date?
> >
> > This gets updated for each early secondary CPU that comes up.  (Early
> > secondaries' boot is serialised, so we shouldn't have to worry about
> > races here.)
> >
> > The configuration is frozen by the time we enter userspace (hence
> > __ro_after_init).
> >
> > Once all the early secondaries have come up, we commit to the best
> > possible set of vector lengths for the CPUs that we know about, and we
> > don't call this path any more: instead, each late secondary goes into
> > sve_verify_vq_map() instead to check that those CPUs are compatible
> > with the configuration we committed to.
> >
> > For context, take a look at
> > arch/arm64/kernel/cpufeature.c:check_local_cpu_capabilities(), which is
> > the common entry point for all secondary CPUs: that splits into
> > update_cpu_capabilities() and verify_local_cpu_capabilities() paths for
> > the two cases described above, calling down into sve_update_vq_map()
> > and sve_verify_vq_map() as appropriate.
> >
> >>   - what happens when we hotplug a new CPU with less available VQ?
> >
> > We reject the CPU and throw it back to the firmware (see
> > cpufeature.c:verify_sve_features()).
> >
> > This follows the precedent already set in verify_local_cpu_capabilities()
> > etc.
> 
> I think a few words to that effect in the function comments would be
> helpful:
> 
>   /*
>    * sve_update_vq_map only cares about CPUs at boot time and is called
>    * serially for each one. Any CPUs added later via hotplug will fail
>    * at sve_verify_vq_map if they don't match what is detected here.
>    */

Ack.  I might tweak the wording, but adding brief comments explaining
when these functions are called is a good idea.
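
One possible (not final) wording, capturing the boot-time vs. hotplug
split described above:

	/*
	 * sve_update_vq_map() is called serially for each early secondary
	 * CPU during boot, while the set of supported vector lengths is
	 * still being established.  Late (hotplugged) CPUs are instead
	 * checked against the frozen configuration in sve_verify_vq_map(),
	 * and rejected if they don't support it.
	 */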

> 
> >
> >>
> >> >
> >> >  /* Check whether the current CPU supports all VQs in the committed set */
> >> >  int sve_verify_vq_map(void)
> >> >  {
> >> > -	int ret = 0;
> >> > +	int ret = -EINVAL;
> >> > +	unsigned long b;
> >> >
> >> > -	sve_probe_vqs(sve_secondary_vq_map);
> >> > -	bitmap_andnot(sve_secondary_vq_map, sve_vq_map, sve_secondary_vq_map,
> >> > -		      SVE_VQ_MAX);
> >> > -	if (!bitmap_empty(sve_secondary_vq_map, SVE_VQ_MAX)) {
> >> > +	sve_probe_vqs(sve_tmp_vq_map);
> >> > +
> >> > +	bitmap_complement(sve_tmp_vq_map, sve_tmp_vq_map, SVE_VQ_MAX);
> >> > +	if (bitmap_intersects(sve_tmp_vq_map, sve_vq_map, SVE_VQ_MAX)) {
> >> >  		pr_warn("SVE: cpu%d: Required vector length(s) missing\n",
> >> >  			smp_processor_id());
> >> > -		ret = -EINVAL;
> >> > +		goto error;
> >>
> >> The use of goto seems a little premature considering we don't have any
> >> clean-up to do.
> >
> > Hmm, this does look a little overengineered.  I think it may have been
> > more complex during development (making the gotos less redundant), but
> > to be honest I don't remember now.
> >
> > I'm happy to get rid of the rather pointless ret variable and replace
> > all the gotos with returns if that works for you.
> >
> > What do you think?
> 
> Yes please, that would be cleaner.

OK, good.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 13/23] KVM: arm64/sve: Context switch the SVE registers
  2018-09-28 13:39   ` Dave Martin
@ 2018-11-19 16:36     ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-19 16:36 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> In order to give each vcpu its own view of the SVE registers, this
> patch adds context storage via a new sve_state pointer in struct
> vcpu_arch.  An additional member sve_max_vl is also added for each
> vcpu, to determine the maximum vector length visible to the guest
> and thus the value to be configured in ZCR_EL2.LEN while the vcpu is
> active.  This also determines the layout and size of the storage in
> sve_state, which is read and written by the same backend functions
> that are used for context-switching the SVE state for host tasks.
>
> On SVE-enabled vcpus, SVE access traps are now handled by switching
> in the vcpu's SVE context and disabling the trap before returning
> to the guest.  On other vcpus, the trap is not handled and an exit
> back to the host occurs, where the handle_sve() fallback path
> reflects an undefined instruction exception back to the guest,
> consistently with the behaviour of non-SVE-capable hardware (as was
> done unconditionally prior to this patch).
>
> No SVE handling is added on non-VHE-only paths, since VHE is an
> architectural and Kconfig prerequisite of SVE.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>
> Changes since RFCv1:
>
>  * Add an if_sve() helper macro to efficiently skip or optimise out
>    SVE conditional support code for the SVE-unsupported case.  This
>    reduces the verbose boilerplate at the affected sites.
>
>  * In the style of sve_pffr(), a vcpu_sve_pffr() helper is added to
>    provide the FFR anchor pointer for sve_load_state() in the hyp switch
>    code.  This helps avoid some open-coded pointer munging which is not
>    very readable.
>
>  * The condition for calling __hyp_switch_fpsimd() is abstracted for
>    better readability.
> ---
>  arch/arm64/include/asm/kvm_host.h |  6 ++++
>  arch/arm64/kvm/fpsimd.c           |  5 +--
>  arch/arm64/kvm/hyp/switch.c       | 71 ++++++++++++++++++++++++++++++---------
>  3 files changed, 65 insertions(+), 17 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 76cbb95e..8e9cd43 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -210,6 +210,8 @@ typedef struct kvm_cpu_context kvm_cpu_context_t;
>
>  struct kvm_vcpu_arch {
>  	struct kvm_cpu_context ctxt;
> +	void *sve_state;
> +	unsigned int sve_max_vl;
>
>  	/* HYP configuration */
>  	u64 hcr_el2;
> @@ -302,6 +304,10 @@ struct kvm_vcpu_arch {
>  	bool sysregs_loaded_on_cpu;
>  };
>
> +/* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
> +#define vcpu_sve_pffr(vcpu) ((void *)((char *)((vcpu)->arch.sve_state) + \
> +				      sve_ffr_offset((vcpu)->arch.sve_max_vl)))
> +
>  /* vcpu_arch flags field values: */
>  #define KVM_ARM64_DEBUG_DIRTY		(1 << 0)
>  #define KVM_ARM64_FP_ENABLED		(1 << 1) /* guest FP regs loaded */
> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> index 29e5585..3474388 100644
> --- a/arch/arm64/kvm/fpsimd.c
> +++ b/arch/arm64/kvm/fpsimd.c
> @@ -86,10 +86,11 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
>
>  	if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
>  		fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.gp_regs.fp_regs,
> -					 NULL, sve_max_vl);
> +					 vcpu->arch.sve_state,
> +					 vcpu->arch.sve_max_vl);
>
>  		clear_thread_flag(TIF_FOREIGN_FPSTATE);
> -		clear_thread_flag(TIF_SVE);
> +		update_thread_flag(TIF_SVE, vcpu_has_sve(vcpu));
>  	}
>  }
>
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index 085ed06..9941349 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -98,7 +98,10 @@ static void activate_traps_vhe(struct kvm_vcpu *vcpu)
>  	val = read_sysreg(cpacr_el1);
>  	val |= CPACR_EL1_TTA;
>  	val &= ~CPACR_EL1_ZEN;
> -	if (!update_fp_enabled(vcpu)) {
> +	if (update_fp_enabled(vcpu)) {
> +		if (vcpu_has_sve(vcpu))
> +			val |= CPACR_EL1_ZEN;
> +	} else {
>  		val &= ~CPACR_EL1_FPEN;
>  		__activate_traps_fpsimd32(vcpu);
>  	}
> @@ -332,16 +335,29 @@ static bool __hyp_text __skip_instr(struct kvm_vcpu *vcpu)
>  	}
>  }
>
> -static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
> +/*
> + * if () with a gating check for SVE support to minimise branch
> + * mispredictions in non-SVE systems.
> + * (system_supports_sve() is resolved at build time or via a static key.)
> + */
> +#define if_sve(cond) if (system_supports_sve() && (cond))
> +
> +static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu,
> +					   bool guest_has_sve)
>  {
>  	struct user_fpsimd_state *host_fpsimd = vcpu->arch.host_fpsimd_state;
>
> -	if (has_vhe())
> -		write_sysreg(read_sysreg(cpacr_el1) | CPACR_EL1_FPEN,
> -			     cpacr_el1);
> -	else
> +	if (has_vhe()) {
> +		u64 reg = read_sysreg(cpacr_el1) | CPACR_EL1_FPEN;
> +
> +		if_sve (guest_has_sve)
> +			reg |= CPACR_EL1_ZEN;
> +
> +		write_sysreg(reg, cpacr_el1);
> +	} else {
>  		write_sysreg(read_sysreg(cptr_el2) & ~(u64)CPTR_EL2_TFP,
>  			     cptr_el2);
> +	}
>
>  	isb();
>
> @@ -350,8 +366,7 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
>  		 * In the SVE case, VHE is assumed: it is enforced by
>  		 * Kconfig and kvm_arch_init().
>  		 */
> -		if (system_supports_sve() &&
> -		    (vcpu->arch.flags & KVM_ARM64_HOST_SVE_IN_USE)) {
> +		if_sve (vcpu->arch.flags & KVM_ARM64_HOST_SVE_IN_USE) {
>  			struct thread_struct *thread = container_of(
>  				host_fpsimd,
>  				struct thread_struct, uw.fpsimd_state);
> @@ -364,11 +379,14 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
>  		vcpu->arch.flags &= ~KVM_ARM64_FP_HOST;
>  	}
>
> -	__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
> -
> -	if (system_supports_sve() &&
> -	    vcpu->arch.flags & KVM_ARM64_GUEST_HAS_SVE)
> +	if_sve (guest_has_sve) {
> +		sve_load_state(vcpu_sve_pffr(vcpu),
> +			       &vcpu->arch.ctxt.gp_regs.fp_regs.fpsr,
> +			       sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1);
>  		write_sysreg_s(vcpu->arch.ctxt.sys_regs[ZCR_EL1], SYS_ZCR_EL12);
> +	} else {
> +		__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
> +	}
>
>  	/* Skip restoring fpexc32 for AArch64 guests */
>  	if (!(read_sysreg(hcr_el2) & HCR_RW))
> @@ -380,6 +398,26 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
>  	return true;
>  }
>
> +static inline bool __hyp_text __hyp_trap_is_fpsimd(struct kvm_vcpu *vcpu,
> +						   bool guest_has_sve)
> +{
> +
> +	u8 trap_class;
> +
> +	if (!system_supports_fpsimd())
> +		return false;
> +
> +	trap_class = kvm_vcpu_trap_get_class(vcpu);
> +
> +	if (trap_class == ESR_ELx_EC_FP_ASIMD)
> +		return true;
> +
> +	if_sve (guest_has_sve && trap_class == ESR_ELx_EC_SVE)
> +		return true;

Do we really need to check the guest has SVE before believing what the
hardware is telling us? According to the ARM ARM:

For ESR_ELx_EC_FP_ASIMD

  Excludes exceptions resulting from CPACR_EL1 when the value of HCR_EL2.TGE is
  1, or because SVE or Advanced SIMD and floating-point are not implemented. These
  are reported with EC value 0b000000

But also for ESR_ELx_EC_SVE

  Access to SVE functionality trapped as a result of CPACR_EL1.ZEN,
  CPTR_EL2.ZEN, CPTR_EL2.TZ, or CPTR_EL3.EZ, that is not reported using EC
  0b000000. This EC is defined only if SVE is implemented

Given I got confused maybe we need a comment for clarity?

  /* Catch guests without SVE enabled running on SVE capable hardware */

> +
> +	return false;
> +}
> +
>  /*
>   * Return true when we were able to fixup the guest exit and should return to
>   * the guest, false when we should restore the host state and return to the
> @@ -387,6 +425,8 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
>   */
>  static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
>  {
> +	bool guest_has_sve;
> +
>  	if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ)
>  		vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
>
> @@ -404,10 +444,11 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
>  	 * and restore the guest context lazily.
>  	 * If FP/SIMD is not implemented, handle the trap and inject an
>  	 * undefined instruction exception to the guest.
> +	 * Similarly for trapped SVE accesses.
>  	 */
> -	if (system_supports_fpsimd() &&
> -	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
> -		return __hyp_switch_fpsimd(vcpu);
> +	guest_has_sve = vcpu_has_sve(vcpu);

I'm not sure if it's worth fishing this out here given you are already
passing vcpu down the chain.

> +	if (__hyp_trap_is_fpsimd(vcpu, guest_has_sve))
> +		return __hyp_switch_fpsimd(vcpu, guest_has_sve);
>
>  	if (!__populate_fault_info(vcpu))
>  		return true;

Otherwise:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

--
Alex Bennée

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 14/23] KVM: Allow 2048-bit register access via ioctl interface
  2018-09-28 13:39   ` Dave Martin
@ 2018-11-19 16:48     ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-19 16:48 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> The Arm SVE architecture defines registers that are up to 2048 bits
> in size (with some possibility of further future expansion).
>
> In order to avoid the need for an excessively large number of
> ioctls when saving and restoring a vcpu's registers, this patch
> adds a #define to make support for individual 2048-bit registers
> through the KVM_{GET,SET}_ONE_REG ioctl interface official.  This
> will allow each SVE register to be accessed in a single call.
>
> There are sufficient spare bits in the register id size field for
> this change, so there is no ABI impact providing that
> KVM_GET_REG_LIST does not enumerate any 2048-bit register unless
> userspace explicitly opts in to the relevant architecture-specific
> features.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  include/uapi/linux/kvm.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 251be35..7c3c5cc 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1110,6 +1110,7 @@ struct kvm_dirty_tlb {
>  #define KVM_REG_SIZE_U256	0x0050000000000000ULL
>  #define KVM_REG_SIZE_U512	0x0060000000000000ULL
>  #define KVM_REG_SIZE_U1024	0x0070000000000000ULL
> +#define KVM_REG_SIZE_U2048	0x0080000000000000ULL

Yeah, OK, I guess - but it does make me question whether
KVM_REG_SIZE_MASK is part of the ABI, because although there is space
for a few more size values, this one uses the last bit available
without changing the mask.
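
For reference, the size field occupies bits [55:52] of the register id
and encodes log2 of the register size in bytes, so the new value
decodes as follows (a worked example using the existing uapi constants):

	/* (KVM_REG_SIZE_U2048 & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT == 8 */
	u64 size_bytes = 1ULL << ((id & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT);
	/* for KVM_REG_SIZE_U2048: 1 << 8 == 256 bytes == 2048 bits */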

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

>
>  struct kvm_reg_list {
>  	__u64 n; /* number of regs */


--
Alex Bennée

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 13/23] KVM: arm64/sve: Context switch the SVE registers
  2018-11-19 16:36     ` Alex Bennée
@ 2018-11-19 17:03       ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-19 17:03 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Mon, Nov 19, 2018 at 04:36:01PM +0000, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > In order to give each vcpu its own view of the SVE registers, this
> > patch adds context storage via a new sve_state pointer in struct
> > vcpu_arch.  An additional member sve_max_vl is also added for each
> > vcpu, to determine the maximum vector length visible to the guest
> > and thus the value to be configured in ZCR_EL2.LEN while the vcpu is
> > active.  This also determines the layout and size of the storage in
> > sve_state, which is read and written by the same backend functions
> > that are used for context-switching the SVE state for host tasks.
> >
> > On SVE-enabled vcpus, SVE access traps are now handled by switching
> > in the vcpu's SVE context and disabling the trap before returning
> > to the guest.  On other vcpus, the trap is not handled and an exit
> > back to the host occurs, where the handle_sve() fallback path
> > reflects an undefined instruction exception back to the guest,
> > consistently with the behaviour of non-SVE-capable hardware (as was
> > done unconditionally prior to this patch).
> >
> > No SVE handling is added on non-VHE-only paths, since VHE is an
> > architectural and Kconfig prerequisite of SVE.
> >
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>

[...]

> > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> > index 085ed06..9941349 100644
> > --- a/arch/arm64/kvm/hyp/switch.c
> > +++ b/arch/arm64/kvm/hyp/switch.c

[...]

> > @@ -380,6 +398,26 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
> >  	return true;
> >  }
> >
> > +static inline bool __hyp_text __hyp_trap_is_fpsimd(struct kvm_vcpu *vcpu,
> > +						   bool guest_has_sve)
> > +{
> > +
> > +	u8 trap_class;
> > +
> > +	if (!system_supports_fpsimd())
> > +		return false;
> > +
> > +	trap_class = kvm_vcpu_trap_get_class(vcpu);
> > +
> > +	if (trap_class == ESR_ELx_EC_FP_ASIMD)
> > +		return true;
> > +
> > +	if_sve (guest_has_sve && trap_class == ESR_ELx_EC_SVE)
> > +		return true;
> 
> Do we really need to check the guest has SVE before believing what the
> hardware is telling us? According to the ARM ARM:
> 
> For ESR_ELx_EC_FP_ASIMD
> 
>   Excludes exceptions resulting from CPACR_EL1 when the value of HCR_EL2.TGE is
>   1, or because SVE or Advanced SIMD and floating-point are not implemented. These
>   are reported with EC value 0b000000
> 
> But also for ESR_ELx_EC_SVE
> 
>   Access to SVE functionality trapped as a result of CPACR_EL1.ZEN,
>   CPTR_EL2.ZEN, CPTR_EL2.TZ, or CPTR_EL3.EZ, that is not reported using EC
>   0b000000. This EC is defined only if SVE is implemented
> 
> Given I got confused maybe we need a comment for clarity?

This is not about not trusting the value ESR_ELx_EC_SVE on older
hardware: in effect it is retrospectively reserved for this purpose on
all older arch versions, so there is no ambiguity about what it means.
It should never be observed on hardware that doesn't have SVE.

Rather, how we handle this trap differs depending on whether the guest
is SVE-enabled or not.  If not, then this trap is handled by the generic
fallback path for unhandled guest traps, so we don't check for this
particular EC value explicitly in that case.

>   /* Catch guests without SVE enabled running on SVE capable hardware */

I might write something like:

	/*
	 * For sve-enmabled guests only, handle SVE access via FPSIMD
	 * context handling code.
	 */

Does that make sense?  I may have misunderstood your concern here.

[...]

> > @@ -387,6 +425,8 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
> >   */
> >  static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
> >  {
> > +	bool guest_has_sve;
> > +
> >  	if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ)
> >  		vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
> >
> > @@ -404,10 +444,11 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
> >  	 * and restore the guest context lazily.
> >  	 * If FP/SIMD is not implemented, handle the trap and inject an
> >  	 * undefined instruction exception to the guest.
> > +	 * Similarly for trapped SVE accesses.
> >  	 */
> > -	if (system_supports_fpsimd() &&
> > -	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
> > -		return __hyp_switch_fpsimd(vcpu);
> > +	guest_has_sve = vcpu_has_sve(vcpu);
> 
> I'm not sure if it's worth fishing this out here given you are already
> passing vcpu down the chain.

I wanted to discourage GCC from recomputing this.  If you're in a
position to do so, can you look at the disassembly with/without this
factored out and see whether it makes a difference?

> 
> > +	if (__hyp_trap_is_fpsimd(vcpu, guest_has_sve))
> > +		return __hyp_switch_fpsimd(vcpu, guest_has_sve);
> >
> >  	if (!__populate_fault_info(vcpu))
> >  		return true;
> 
> Otherwise:
> 
> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

Thanks
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 14/23] KVM: Allow 2048-bit register access via ioctl interface
  2018-11-19 16:48     ` Alex Bennée
@ 2018-11-19 17:07       ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-19 17:07 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Mon, Nov 19, 2018 at 04:48:36PM +0000, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > The Arm SVE architecture defines registers that are up to 2048 bits
> > in size (with some possibility of further future expansion).
> >
> > In order to avoid the need for an excessively large number of
> > ioctls when saving and restoring a vcpu's registers, this patch
> > adds a #define to make support for individual 2048-bit registers
> > through the KVM_{GET,SET}_ONE_REG ioctl interface official.  This
> > will allow each SVE register to be accessed in a single call.
> >
> > There are sufficient spare bits in the register id size field for
> > this change, so there is no ABI impact providing that
> > KVM_GET_REG_LIST does not enumerate any 2048-bit register unless
> > userspace explicitly opts in to the relevant architecture-specific
> > features.
> >
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >  include/uapi/linux/kvm.h | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> > index 251be35..7c3c5cc 100644
> > --- a/include/uapi/linux/kvm.h
> > +++ b/include/uapi/linux/kvm.h
> > @@ -1110,6 +1110,7 @@ struct kvm_dirty_tlb {
> >  #define KVM_REG_SIZE_U256	0x0050000000000000ULL
> >  #define KVM_REG_SIZE_U512	0x0060000000000000ULL
> >  #define KVM_REG_SIZE_U1024	0x0070000000000000ULL
> > +#define KVM_REG_SIZE_U2048	0x0080000000000000ULL
> 
> Yeah OK I guess - but it does make me question if KVM_REG_SIZE_MASK is
> part of the ABI because although we have space for another few bits that
> is the last one without changing the mask.

Debatable, but KVM_REG_SIZE_MASK is UAPI and suggests a clear intent not
to recycle bit 55 for another purpose.  This allows for reg sizes up to
262144 bits which is hopefully more than enough for the foreseeable
future.

Even if bits 56-59 are currently always 0, KVM_REG_ARCH_MASK suggests
that these bits aren't going to be used for size field bits.


Or am I missing something?
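
To spell out the arithmetic, a standalone sketch (the constants are
mirrored from the UAPI header, and the computation is the same one the
KVM_REG_SIZE() helper performs):

	#include <stdio.h>

	/* Mirrored from include/uapi/linux/kvm.h */
	#define KVM_REG_SIZE_MASK	0x00f0000000000000ULL
	#define KVM_REG_SIZE_SHIFT	52
	#define KVM_REG_SIZE_U2048	0x0080000000000000ULL

	int main(void)
	{
		/* Size in bytes, as KVM_REG_SIZE() computes it */
		unsigned int bytes =
			1U << ((KVM_REG_SIZE_U2048 & KVM_REG_SIZE_MASK)
			       >> KVM_REG_SIZE_SHIFT);

		printf("U2048 -> %u bytes (%u bits)\n", bytes, bytes * 8);

		/* The size field can hold values up to 0xf, i.e.
		 * 1 << 15 == 32768 bytes == 262144 bits, so U2048 (field
		 * value 8) is only the last size currently *defined*, not
		 * the last one the mask can express. */
		return 0;
	}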

> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

Thanks
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 05/23] KVM: arm: Add arch vcpu uninit hook
  2018-11-15 16:40       ` Dave Martin
@ 2018-11-20 10:56         ` Christoffer Dall
  -1 siblings, 0 replies; 154+ messages in thread
From: Christoffer Dall @ 2018-11-20 10:56 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Nov 15, 2018 at 04:40:31PM +0000, Dave Martin wrote:
> On Fri, Nov 02, 2018 at 09:05:36AM +0100, Christoffer Dall wrote:
> > On Fri, Sep 28, 2018 at 02:39:09PM +0100, Dave Martin wrote:
> > > In preparation for adding support for SVE in guests on arm64, a
> > > hook is needed for freeing additional per-vcpu memory when a vcpu
> > > is freed.
> > 
> > Can this commit motivate why we can't do the work in kvm_arch_vcpu_free,
> > which we use for freeing other data structures?
> > 
> > (Presumably, uninit is needed when you need to do something at the very
> > last step after releasing the struct pid.
> 
> It wasn't to do with that.
> 
> Rather, the division of responsibility between the vcpu_uninit and
> vcpu_free paths is not very clear.
> 
> In the earlier version of the series, I think SVE state may have been
> allocated rather early and we may have needed to free it in the failure
> path of kvm_arch_vcpu_create() (which just calls kvm_vcpu_uninit()).
> (Alternatively, I may just have been wrong.)
> 
> Now, the vcpu must be fully created before the KVM_ARM_SVE_CONFIG ioctl
> on it (which is what allocates sve_state) can succeed anyway.
> 
> So the distinction between these two teardown phases is probably no
> longer important.
> 
> I'll see whether I can get rid of this hook and free the SVE state in
> kvm_arch_vcpu_free() instead.
> 
> Does that make sense?
> 

Yes, thanks.

    Christoffer
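
(For illustration, a minimal sketch of what folding the teardown into
kvm_arch_vcpu_free() might look like; the sve_state field is the one
added by this series, but the rest of the function body is assumed, not
taken from the patches:)

	void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
	{
		/* ... existing timer/PMU/MMU teardown elided ... */

		/* kfree(NULL) is a no-op, so this is safe even for vcpus
		 * on which KVM_ARM_SVE_CONFIG never allocated anything */
		kfree(vcpu->arch.sve_state);

		kvm_vcpu_uninit(vcpu);
		kmem_cache_free(kvm_vcpu_cache, vcpu);
	}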

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 20/23] KVM: arm64: Add arch vm ioctl hook
  2018-11-15 18:04       ` Dave Martin
@ 2018-11-20 10:58         ` Christoffer Dall
  -1 siblings, 0 replies; 154+ messages in thread
From: Christoffer Dall @ 2018-11-20 10:58 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Nov 15, 2018 at 06:04:22PM +0000, Dave Martin wrote:
> On Fri, Nov 02, 2018 at 09:32:27AM +0100, Christoffer Dall wrote:
> > On Fri, Sep 28, 2018 at 02:39:24PM +0100, Dave Martin wrote:
> > > To enable arm64-specific vm ioctls to be added cleanly, this patch
> > > adds a kvm_arm_arch_vm_ioctl() hook so that these don't pollute the
> > > common code.
> > 
> > Hmmm, I don't really see the strength of that argument, and have the
> > same concern as before.  I'd like to avoid the additional indirection
> > and instead just follow the existing pattern with a dummy implementation
> > on the 32-bit side that returns an error.
> 
> So for this and the similar comment on patch 18, this was premature (or
> at least, overzealous) factoring on my part.
> 
> I'm happy to merge this back together for arm and arm64 as you prefer.
> 
> Do we have a nice way of writing the arch check, e.g.
> 
> 	case KVM_ARM_SVE_CONFIG:
> 	if (!IS_ENABLED(CONFIG_ARM64))
> 			return -EINVAL;
> 		else
> 			return kvm_vcpu_sve_config(NULL, userp);
> 
> should work, but looks a bit strange.  Maybe I'm just being fussy.

I prefer just doing:

	case KVM_ARM_SVE_CONFIG:
		return kvm_vcpu_sve_config(NULL, userp);


And having this in arch/arm/include/asm/kvm_foo.h:

static inline int kvm_vcpu_sve_config(...)
{
	return -EINVAL;
}
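
Spelled out, that stub might look like this (the signature is assumed to
mirror the arm64 side, i.e. a vcpu pointer plus a user pointer, per the
kvm_vcpu_sve_config(NULL, userp) call above; purely illustrative):

	/* 32-bit arm stub, so that the common ioctl code can call it
	 * unconditionally; SVE does not exist on 32-bit arm */
	static inline int kvm_vcpu_sve_config(struct kvm_vcpu *vcpu,
					      void __user *userp)
	{
		return -EINVAL;
	}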


Thanks,

    Christoffer

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 14/23] KVM: Allow 2048-bit register access via ioctl interface
  2018-11-19 17:07       ` Dave Martin
@ 2018-11-20 11:20         ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-20 11:20 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> On Mon, Nov 19, 2018 at 04:48:36PM +0000, Alex Bennée wrote:
>>
>> Dave Martin <Dave.Martin@arm.com> writes:
>>
>> > The Arm SVE architecture defines registers that are up to 2048 bits
>> > in size (with some possibility of further future expansion).
>> >
>> > In order to avoid the need for an excessively large number of
>> > ioctls when saving and restoring a vcpu's registers, this patch
>> > adds a #define to make support for individual 2048-bit registers
>> > through the KVM_{GET,SET}_ONE_REG ioctl interface official.  This
>> > will allow each SVE register to be accessed in a single call.
>> >
>> > There are sufficient spare bits in the register id size field for
>> > this change, so there is no ABI impact providing that
>> > KVM_GET_REG_LIST does not enumerate any 2048-bit register unless
>> > userspace explicitly opts in to the relevant architecture-specific
>> > features.
>> >
>> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
>> > ---
>> >  include/uapi/linux/kvm.h | 1 +
>> >  1 file changed, 1 insertion(+)
>> >
>> > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> > index 251be35..7c3c5cc 100644
>> > --- a/include/uapi/linux/kvm.h
>> > +++ b/include/uapi/linux/kvm.h
>> > @@ -1110,6 +1110,7 @@ struct kvm_dirty_tlb {
>> >  #define KVM_REG_SIZE_U256	0x0050000000000000ULL
>> >  #define KVM_REG_SIZE_U512	0x0060000000000000ULL
>> >  #define KVM_REG_SIZE_U1024	0x0070000000000000ULL
>> > +#define KVM_REG_SIZE_U2048	0x0080000000000000ULL
>>
>> Yeah OK I guess - but it does make me question if KVM_REG_SIZE_MASK is
>> part of the ABI because although we have space for another few bits that
>> is the last one without changing the mask.
>
> Debatable, but KVM_REG_SIZE_MASK is UAPI and suggests a clear intent not
> to recycle bit 55 for another purpose.  This allows for reg sizes up to
> 262144 bits which is hopefully more than enough for the foreseeable
> future.
>
> Even if bits 56-59 are currently always 0, KVM_REG_ARCH_MASK suggests
> that these bits aren't going to be used for size field bits.
>
>
> Or am I missing something?

No you are quite right - I thought I was watching an incrementing bit
position not an incrementing number. Too much staring at defines, carry
on ;-)

>
>> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
>
> Thanks
> ---Dave


--
Alex Bennée

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 13/23] KVM: arm64/sve: Context switch the SVE registers
  2018-11-19 17:03       ` Dave Martin
@ 2018-11-20 12:25         ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-20 12:25 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> On Mon, Nov 19, 2018 at 04:36:01PM +0000, Alex Bennée wrote:
>>
>> Dave Martin <Dave.Martin@arm.com> writes:
>>
>> > In order to give each vcpu its own view of the SVE registers, this
>> > patch adds context storage via a new sve_state pointer in struct
>> > vcpu_arch.  An additional member sve_max_vl is also added for each
>> > vcpu, to determine the maximum vector length visible to the guest
>> > and thus the value to be configured in ZCR_EL2.LEN while the vcpu is
>> > active.  This also determines the layout and size of the storage in
>> > sve_state, which is read and written by the same backend functions
>> > that are used for context-switching the SVE state for host tasks.
>> >
>> > On SVE-enabled vcpus, SVE access traps are now handled by switching
>> > in the vcpu's SVE context and disabling the trap before returning
>> > to the guest.  On other vcpus, the trap is not handled and an exit
>> > back to the host occurs, where the handle_sve() fallback path
>> > reflects an undefined instruction exception back to the guest,
>> > consistently with the behaviour of non-SVE-capable hardware (as was
>> > done unconditionally prior to this patch).
>> >
>> > No SVE handling is added on non-VHE-only paths, since VHE is an
>> > architectural and Kconfig prerequisite of SVE.
>> >
>> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
>
> [...]
>
>> > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
>> > index 085ed06..9941349 100644
>> > --- a/arch/arm64/kvm/hyp/switch.c
>> > +++ b/arch/arm64/kvm/hyp/switch.c
>
> [...]
>
>> > @@ -380,6 +398,26 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
>> >  	return true;
>> >  }
>> >
>> > +static inline bool __hyp_text __hyp_trap_is_fpsimd(struct kvm_vcpu *vcpu,
>> > +						   bool guest_has_sve)
>> > +{
>> > +
>> > +	u8 trap_class;
>> > +
>> > +	if (!system_supports_fpsimd())
>> > +		return false;
>> > +
>> > +	trap_class = kvm_vcpu_trap_get_class(vcpu);
>> > +
>> > +	if (trap_class == ESR_ELx_EC_FP_ASIMD)
>> > +		return true;
>> > +
>> > +	if_sve (guest_has_sve && trap_class == ESR_ELx_EC_SVE)
>> > +		return true;
>>
>> Do we really need to check the guest has SVE before believing what the
>> hardware is telling us? According to the ARM ARM:
>>
>> For ESR_ELx_EC_FP_ASIMD
>>
>>   Excludes exceptions resulting from CPACR_EL1 when the value of HCR_EL2.TGE is
>>   1, or because SVE or Advanced SIMD and floating-point are not implemented. These
>>   are reported with EC value 0b000000
>>
>> But also for ESR_ELx_EC_SVE
>>
>>   Access to SVE functionality trapped as a result of CPACR_EL1.ZEN,
>>   CPTR_EL2.ZEN, CPTR_EL2.TZ, or CPTR_EL3.EZ, that is not reported using EC
>>   0b000000. This EC is defined only if SVE is implemented
>>
>> Given I got confused maybe we need a comment for clarity?
>
> This is not about not trusting the value ESR_ELx_EC_SVE on older
> hardware: in effect it is retrospectively reserved for this purpose on
> all older arch versions, so there is no ambiguity about what it means.
> It should never be observed on hardware that doesn't have SVE.
>
> Rather, how we handle this trap differs depending on whether the guest
> is SVE-enabled or not.  If not, then this trap is handled by the generic
> fallback path for unhandled guest traps, so we don't check for this
> particular EC value explicitly in that case.
>
>>   /* Catch guests without SVE enabled running on SVE capable hardware */
>
> I might write something like:
>
> 	/*
> 	 * For sve-enmabled guests only, handle SVE access via FPSIMD
> 	 * context handling code.
> 	 */
>
> Does that make sense?  I may have misunderstood your concern here.

s/enmabled/enabled/ but yeah that's fine.

>
> [...]
>
>> > @@ -387,6 +425,8 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
>> >   */
>> >  static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
>> >  {
>> > +	bool guest_has_sve;
>> > +
>> >  	if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ)
>> >  		vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
>> >
>> > @@ -404,10 +444,11 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
>> >  	 * and restore the guest context lazily.
>> >  	 * If FP/SIMD is not implemented, handle the trap and inject an
>> >  	 * undefined instruction exception to the guest.
>> > +	 * Similarly for trapped SVE accesses.
>> >  	 */
>> > -	if (system_supports_fpsimd() &&
>> > -	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
>> > -		return __hyp_switch_fpsimd(vcpu);
>> > +	guest_has_sve = vcpu_has_sve(vcpu);
>>
>> I'm not sure if it's worth fishing this out here given you are already
>> passing vcpu down the chain.
>
> I wanted to discourage GCC from recomputing this.  If you're in a
> position to do so, can you look at the disassembly with/without this
> factored out and see whether it makes a difference?

Hmm it is hard to tell. There is code motion but for some reason I'm
seeing the static jump code unrolled, for example (original on left):

__hyp_switch_fpsimd():                                                                  __hyp_switch_fpsimd():
/home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:382                      | /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:381
                                                                                      >  ----:  tst     w0, #0x400000
                                                                                      >  ----:  b.eq    22c <fixup_guest_exit+0x1a4>  // b.none
                                                                                      > arch_static_branch_jump():
                                                                                      > /home/alex/lsrc/kvm/linux.git/arch/arm64/include/asm/jump_label.h:45
                                                                                      >  ----:  b       38c <fixup_guest_exit+0x304>
                                                                                      > arch_static_branch():
                                                                                      > /home/alex/lsrc/kvm/linux.git/arch/arm64/include/asm/jump_label.h:31
                                                                                      >  ----:  nop
                                                                                      >  ----:  b       22c <fixup_guest_exit+0x1a4>
                                                                                      > test_bit():
                                                                                      > /home/alex/lsrc/kvm/linux.git/include/asm-generic/bitops/non-atomic.h:106
                                                                                      >  ----:  adrp    x0, 0 <cpu_hwcaps>
                                                                                      >  ----:  ldr     x0, [x0]
                                                                                      > __hyp_switch_fpsimd():
                                                                                      > /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:381
 ----:  tst     w0, #0x400000                                                            ----:  tst     w0, #0x400000
 ----:  b.eq    238 <fixup_guest_exit+0x1b0>  // b.none                               |  ----:  b.eq    22c <fixup_guest_exit+0x1a4>  // b.none
 ----:  cbz     w21, 238 <fixup_guest_exit+0x1b0>                                     |  ----:  tbz     w2, #5, 22c <fixup_guest_exit+0x1a4>
/home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:383                      | /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:382
 ----:  ldr     w2, [x19, #2040]                                                      |  ----:  ldr     w2, [x20, #2040]
 ----:  add     x1, x19, #0x4b0                                                       |  ----:  add     x1, x20, #0x4b0
 ----:  ldr     x0, [x19, #2032]                                                      |  ----:  ldr     x0, [x20, #2032]
sve_ffr_offset():                                                                       sve_ffr_offset():

Putting the guest_has_sve calculation at the top of __hyp_switch_fpsimd
makes most of that go away and just moves things around a little bit. So I
guess it could make sense for the fast(ish) path, although I'd be
interested in knowing if it made any real difference to the numbers.
After all the first read should be well cached and moving it through the
stack is just additional memory and register pressure.

>>
>> > +	if (__hyp_trap_is_fpsimd(vcpu, guest_has_sve))
>> > +		return __hyp_switch_fpsimd(vcpu, guest_has_sve);
>> >
>> >  	if (!__populate_fault_info(vcpu))
>> >  		return true;
>>
>> Otherwise:
>>
>> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
>
> Thanks
> ---Dave


--
Alex Bennée

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 13/23] KVM: arm64/sve: Context switch the SVE registers
  2018-11-20 12:25         ` Alex Bennée
@ 2018-11-20 14:17           ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-20 14:17 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Nov 20, 2018 at 12:25:12PM +0000, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > On Mon, Nov 19, 2018 at 04:36:01PM +0000, Alex Bennée wrote:
> >>
> >> Dave Martin <Dave.Martin@arm.com> writes:
> >>
> >> > In order to give each vcpu its own view of the SVE registers, this
> >> > patch adds context storage via a new sve_state pointer in struct
> >> > vcpu_arch.  An additional member sve_max_vl is also added for each
> >> > vcpu, to determine the maximum vector length visible to the guest
> >> > and thus the value to be configured in ZCR_EL2.LEN while the vcpu is
> >> > active.  This also determines the layout and size of the storage in
> >> > sve_state, which is read and written by the same backend functions
> >> > that are used for context-switching the SVE state for host tasks.
> >> >
> >> > On SVE-enabled vcpus, SVE access traps are now handled by switching
> >> > in the vcpu's SVE context and disabling the trap before returning
> >> > to the guest.  On other vcpus, the trap is not handled and an exit
> >> > back to the host occurs, where the handle_sve() fallback path
> >> > reflects an undefined instruction exception back to the guest,
> >> > consistently with the behaviour of non-SVE-capable hardware (as was
> >> > done unconditionally prior to this patch).
> >> >
> >> > No SVE handling is added on non-VHE-only paths, since VHE is an
> >> > architectural and Kconfig prerequisite of SVE.
> >> >
> >> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> >
> > [...]
> >
> >> > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> >> > index 085ed06..9941349 100644
> >> > --- a/arch/arm64/kvm/hyp/switch.c
> >> > +++ b/arch/arm64/kvm/hyp/switch.c
> >
> > [...]
> >
> >> > @@ -380,6 +398,26 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
> >> >  	return true;
> >> >  }
> >> >
> >> > +static inline bool __hyp_text __hyp_trap_is_fpsimd(struct kvm_vcpu *vcpu,
> >> > +						   bool guest_has_sve)
> >> > +{
> >> > +
> >> > +	u8 trap_class;
> >> > +
> >> > +	if (!system_supports_fpsimd())
> >> > +		return false;
> >> > +
> >> > +	trap_class = kvm_vcpu_trap_get_class(vcpu);
> >> > +
> >> > +	if (trap_class == ESR_ELx_EC_FP_ASIMD)
> >> > +		return true;
> >> > +
> >> > +	if_sve (guest_has_sve && trap_class == ESR_ELx_EC_SVE)
> >> > +		return true;
> >>
> >> Do we really need to check the guest has SVE before believing what the
> >> hardware is telling us? According to the ARM ARM:
> >>
> >> For ESR_ELx_EC_FP_ASIMD
> >>
> >>   Excludes exceptions resulting from CPACR_EL1 when the value of HCR_EL2.TGE is
> >>   1, or because SVE or Advanced SIMD and floating-point are not implemented. These
> >>   are reported with EC value 0b000000
> >>
> >> But also for ESR_ELx_EC_SVE
> >>
> >>   Access to SVE functionality trapped as a result of CPACR_EL1.ZEN,
> >>   CPTR_EL2.ZEN, CPTR_EL2.TZ, or CPTR_EL3.EZ, that is not reported using EC
> >>   0b000000. This EC is defined only if SVE is implemented
> >>
> >> Given I got confused maybe we need a comment for clarity?
> >
> > This is not about not trusting the value ESR_ELx_EC_SVE on older
> > hardware: in effect it is retrospectively reserved for this purpose on
> > all older arch versions, so there is no ambiguity about what it means.
> > It should never be observed on hardware that doesn't have SVE.
> >
> > Rather, how we handle this trap differs depending on whether the guest
> > is SVE-enabled or not.  If not, then this trap is handled by the generic
> > fallback path for unhandled guest traps, so we don't check for this
> > particular EC value explicitly in that case.
> >
> >>   /* Catch guests without SVE enabled running on SVE capable hardware */
> >
> > I might write something like:
> >
> > 	/*
> > 	 * For sve-enmabled guests only, handle SVE access via FPSIMD
> > 	 * context handling code.
> > 	 */
> >
> > Does that make sense?  I may have misunderstood your concern here.
> 
> s/enmabled/enabled/ but yeah that's fine.

Well spotted... I guess I was in a hurry.

> > [...]
> >
> >> > @@ -387,6 +425,8 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
> >> >   */
> >> >  static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
> >> >  {
> >> > +	bool guest_has_sve;
> >> > +
> >> >  	if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ)
> >> >  		vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
> >> >
> >> > @@ -404,10 +444,11 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
> >> >  	 * and restore the guest context lazily.
> >> >  	 * If FP/SIMD is not implemented, handle the trap and inject an
> >> >  	 * undefined instruction exception to the guest.
> >> > +	 * Similarly for trapped SVE accesses.
> >> >  	 */
> >> > -	if (system_supports_fpsimd() &&
> >> > -	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
> >> > -		return __hyp_switch_fpsimd(vcpu);
> >> > +	guest_has_sve = vcpu_has_sve(vcpu);
> >>
> >> I'm not sure if it's worth fishing this out here given you are already
> >> passing vcpu down the chain.
> >
> > I wanted to discourage GCC from recomputing this.  If you're in a
> > position to do so, can you look at the disassembly with/without this
> > factored out and see whether it makes a difference?
> 
> Hmm it is hard to tell. There is code motion but for some reason I'm
> seeing the static jump code unrolled, for example (original on left):
> 
> __hyp_switch_fpsimd():                                                                  __hyp_switch_fpsimd():
> /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:382                      | /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:381
>                                                                                       >  ----:  tst     w0, #0x400000
>                                                                                       >  ----:  b.eq    22c <fixup_guest_exit+0x1a4>  // b.none
>                                                                                       > arch_static_branch_jump():
>                                                                                       > /home/alex/lsrc/kvm/linux.git/arch/arm64/include/asm/jump_label.h:45
>                                                                                       >  ----:  b       38c <fixup_guest_exit+0x304>
>                                                                                       > arch_static_branch():
>                                                                                       > /home/alex/lsrc/kvm/linux.git/arch/arm64/include/asm/jump_label.h:31
>                                                                                       >  ----:  nop
>                                                                                       >  ----:  b       22c <fixup_guest_exit+0x1a4>
>                                                                                       > test_bit():
>                                                                                       > /home/alex/lsrc/kvm/linux.git/include/asm-generic/bitops/non-atomic.h:106
>                                                                                       >  ----:  adrp    x0, 0 <cpu_hwcaps>
>                                                                                       >  ----:  ldr     x0, [x0]
>                                                                                       > __hyp_switch_fpsimd():
>                                                                                       > /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:381
>  ----:  tst     w0, #0x400000                                                            ----:  tst     w0, #0x400000
>  ----:  b.eq    238 <fixup_guest_exit+0x1b0>  // b.none                               |  ----:  b.eq    22c <fixup_guest_exit+0x1a4>  // b.none
>  ----:  cbz     w21, 238 <fixup_guest_exit+0x1b0>                                     |  ----:  tbz     w2, #5, 22c <fixup_guest_exit+0x1a4>
> /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:383                      | /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:382
>  ----:  ldr     w2, [x19, #2040]                                                      |  ----:  ldr     w2, [x20, #2040]
>  ----:  add     x1, x19, #0x4b0                                                       |  ----:  add     x1, x20, #0x4b0
>  ----:  ldr     x0, [x19, #2032]                                                      |  ----:  ldr     x0, [x20, #2032]
> sve_ffr_offset():                                                                       sve_ffr_offset():
> 
> Put calculating guest_has_sve at the top of __hyp_switch_fpsimd make
> most of that go away and just moves things around a little bit. So I
> guess it could makes sense for the fast(ish) path although I'd be
> interested in knowing if it made any real difference to the numbers.
> After all the first read should be well cached and moving it through the
> stack is just additional memory and register pressure.

Hmmm, I will have a think about this when I respin.

Explicitly caching guest_has_sve() does reduce the compiler's freedom to
optimise.

We might be able to mark it as __pure or __attribute_const__ to enable
the compiler to decide whether to cache the result, but this may not be
100% safe.
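
For reference, the annotation might look like the sketch below
(illustrative only: the flag name is an assumption, and whether
vcpu_has_sve() really satisfies the attribute's contract is exactly the
"not 100% safe" question):

	/* __attribute_const__ tells GCC the result depends only on the
	 * argument values and reads no memory, licensing it to reuse a
	 * previous result instead of re-calling.  That promise is too
	 * strong here, since the result is loaded through the vcpu
	 * pointer; __pure (may read, but not write, global state) is the
	 * nearer fit, but even that assumes the flag cannot change
	 * between calls. */
	static inline bool __attribute_const__ vcpu_has_sve(const struct kvm_vcpu *vcpu)
	{
		return vcpu->arch.flags & KVM_ARM64_GUEST_HAS_SVE;	/* flag name assumed */
	}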

Part of me would prefer to leave things as they are to avoid the risk of
breaking the code again...

Cheers
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

> >> >  	 * and restore the guest context lazily.
> >> >  	 * If FP/SIMD is not implemented, handle the trap and inject an
> >> >  	 * undefined instruction exception to the guest.
> >> > +	 * Similarly for trapped SVE accesses.
> >> >  	 */
> >> > -	if (system_supports_fpsimd() &&
> >> > -	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
> >> > -		return __hyp_switch_fpsimd(vcpu);
> >> > +	guest_has_sve = vcpu_has_sve(vcpu);
> >>
> >> I'm not sure if it's worth fishing this out here given you are already
> >> passing vcpu down the chain.
> >
> > I wanted to discourage GCC from recomputing this.  If you're in a
> > position to do so, can you look at the disassembly with/without this
> > factored out and see whether it makes a difference?
> 
> Hmm it is hard to tell. There is code motion but for some reason I'm
> seeing the static jump code unrolled, for example (original on left):
> 
> __hyp_switch_fpsimd():                                                                  __hyp_switch_fpsimd():
> /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:382                      | /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:381
>                                                                                       >  ----:  tst     w0, #0x400000
>                                                                                       >  ----:  b.eq    22c <fixup_guest_exit+0x1a4>  // b.none
>                                                                                       > arch_static_branch_jump():
>                                                                                       > /home/alex/lsrc/kvm/linux.git/arch/arm64/include/asm/jump_label.h:45
>                                                                                       >  ----:  b       38c <fixup_guest_exit+0x304>
>                                                                                       > arch_static_branch():
>                                                                                       > /home/alex/lsrc/kvm/linux.git/arch/arm64/include/asm/jump_label.h:31
>                                                                                       >  ----:  nop
>                                                                                       >  ----:  b       22c <fixup_guest_exit+0x1a4>
>                                                                                       > test_bit():
>                                                                                       > /home/alex/lsrc/kvm/linux.git/include/asm-generic/bitops/non-atomic.h:106
>                                                                                       >  ----:  adrp    x0, 0 <cpu_hwcaps>
>                                                                                       >  ----:  ldr     x0, [x0]
>                                                                                       > __hyp_switch_fpsimd():
>                                                                                       > /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:381
>  ----:  tst     w0, #0x400000                                                            ----:  tst     w0, #0x400000
>  ----:  b.eq    238 <fixup_guest_exit+0x1b0>  // b.none                               |  ----:  b.eq    22c <fixup_guest_exit+0x1a4>  // b.none
>  ----:  cbz     w21, 238 <fixup_guest_exit+0x1b0>                                     |  ----:  tbz     w2, #5, 22c <fixup_guest_exit+0x1a4>
> /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:383                      | /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:382
>  ----:  ldr     w2, [x19, #2040]                                                      |  ----:  ldr     w2, [x20, #2040]
>  ----:  add     x1, x19, #0x4b0                                                       |  ----:  add     x1, x20, #0x4b0
>  ----:  ldr     x0, [x19, #2032]                                                      |  ----:  ldr     x0, [x20, #2032]
> sve_ffr_offset():                                                                       sve_ffr_offset():
> 
> Putting the calculation of guest_has_sve at the top of __hyp_switch_fpsimd
> makes most of that go away and just moves things around a little bit. So I
> guess it could make sense for the fast(ish) path, although I'd be
> interested in knowing if it made any real difference to the numbers.
> After all, the first read should be well cached and moving it through the
> stack is just additional memory and register pressure.

Hmmm, I will have a think about this when I respin.

Explicitly caching guest_has_sve() does reduce the compiler's freedom to
optimise.

We might be able to mark it as __pure or __attribute_const__ to enable
the compiler to decide whether to cache the result, but this may not be
100% safe.

Part of me would prefer to leave things as they are to avoid the risk of
breaking the code again...

Cheers
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 20/23] KVM: arm64: Add arch vm ioctl hook
  2018-11-20 10:58         ` Christoffer Dall
@ 2018-11-20 14:19           ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-20 14:19 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Nov 20, 2018 at 11:58:52AM +0100, Christoffer Dall wrote:
> On Thu, Nov 15, 2018 at 06:04:22PM +0000, Dave Martin wrote:
> > On Fri, Nov 02, 2018 at 09:32:27AM +0100, Christoffer Dall wrote:
> > > On Fri, Sep 28, 2018 at 02:39:24PM +0100, Dave Martin wrote:
> > > > To enable arm64-specific vm ioctls to be added cleanly, this patch
> > > > adds a kvm_arm_arch_vm_ioctl() hook so that these don't pollute the
> > > > common code.
> > > 
> > > Hmmm, I don't really see the strength of that argument, and have the
> > > same concern as before.  I'd like to avoid the additional indirection
> > > and instead just follow the existing pattern with a dummy implementation
> > > on the 32-bit side that returns an error.
> > 
> > So for this and the similar comment on patch 18, this was premature (or
> > at least, overzealous) factoring on my part.
> > 
> > I'm happy to merge this back together for arm and arm64 as you prefer.
> > 
> > Do we have a nice way of writing the arch check, e.g.
> > 
> > 	case KVM_ARM_SVE_CONFIG:
> > 	if (!IS_ENABLED(CONFIG_ARM64))
> > 			return -EINVAL;
> > 		else
> > 			return kvm_vcpu_sve_config(NULL, userp);
> > 
> > should work, but looks a bit strange.  Maybe I'm just being fussy.
> 
> I prefer just doing:
> 
> 	case KVM_ARM_SVE_CONFIG:
> 		return kvm_vcpu_sve_config(NULL, userp);
> 
> 
> And having this in arch/arm/include/asm/kvm_foo.h:
> 
> static inline int kvm_vcpu_sve_config(...)
> {
> 	return -EINVAL;
> }

Sure, I can do that if you prefer.  I was a little uneasy about
littering arm64 junk all over the arch/arm headers, but we already have
precedent for this and it keeps the call sites clean.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 20/23] KVM: arm64: Add arch vm ioctl hook
@ 2018-11-20 14:19           ` Dave Martin
  0 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-20 14:19 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 20, 2018 at 11:58:52AM +0100, Christoffer Dall wrote:
> On Thu, Nov 15, 2018 at 06:04:22PM +0000, Dave Martin wrote:
> > On Fri, Nov 02, 2018 at 09:32:27AM +0100, Christoffer Dall wrote:
> > > On Fri, Sep 28, 2018 at 02:39:24PM +0100, Dave Martin wrote:
> > > > To enable arm64-specific vm ioctls to be added cleanly, this patch
> > > > adds a kvm_arm_arch_vm_ioctl() hook so that these don't pollute the
> > > > common code.
> > > 
> > > Hmmm, I don't really see the strength of that argument, and have the
> > > same concern as before.  I'd like to avoid the additional indirection
> > > and instead just follow the existing pattern with a dummy implementation
> > > on the 32-bit side that returns an error.
> > 
> > So for this and the similar comment on patch 18, this was premature (or
> > at least, overzealous) factoring on my part.
> > 
> > I'm happy to merge this back together for arm and arm64 as you prefer.
> > 
> > Do we have a nice way of writing the arch check, e.g.
> > 
> > 	case KVM_ARM_SVE_CONFIG:
> > 	if (!IS_ENABLED(CONFIG_ARM64))
> > 			return -EINVAL;
> > 		else
> > 			return kvm_vcpu_sve_config(NULL, userp);
> > 
> > should work, but looks a bit strange.  Maybe I'm just being fussy.
> 
> I prefer just doing:
> 
> 	case KVM_ARM_SVE_CONFIG:
> 		return kvm_vcpu_sve_config(NULL, userp);
> 
> 
> And having this in arch/arm/include/asm/kvm_foo.h:
> 
> static inline int kvm_vcpu_sve_config(...)
> {
> 	return -EINVAL;
> }

Sure, I can do that if you prefer.  I was a little uneasy about
littering arm64 junk all over the arch/arm headers, but we already have
precedent for this and it keeps the call sites clean.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 13/23] KVM: arm64/sve: Context switch the SVE registers
  2018-11-20 14:17           ` Dave Martin
@ 2018-11-20 15:30             ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-20 15:30 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

<snip>
>> >> > @@ -404,10 +444,11 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
>> >> >  	 * and restore the guest context lazily.
>> >> >  	 * If FP/SIMD is not implemented, handle the trap and inject an
>> >> >  	 * undefined instruction exception to the guest.
>> >> > +	 * Similarly for trapped SVE accesses.
>> >> >  	 */
>> >> > -	if (system_supports_fpsimd() &&
>> >> > -	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
>> >> > -		return __hyp_switch_fpsimd(vcpu);
>> >> > +	guest_has_sve = vcpu_has_sve(vcpu);
>> >>
>> >> I'm not sure if it's worth fishing this out here given you are already
>> >> passing vcpu down the chain.
>> >
>> > I wanted to discourage GCC from recomputing this.  If you're in a
>> > position to do so, can you look at the disassembly with/without this
>> > factored out and see whether it makes a difference?
>>
>> Hmm it is hard to tell. There is code motion but for some reason I'm
>> seeing the static jump code unrolled, for example (original on left):
>>
>> __hyp_switch_fpsimd():                                                                  __hyp_switch_fpsimd():
>> /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:382                      | /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:381
>>                                                                                       >  ----:  tst     w0, #0x400000
>>                                                                                       >  ----:  b.eq    22c <fixup_guest_exit+0x1a4>  // b.none
>>                                                                                       > arch_static_branch_jump():
>>                                                                                       > /home/alex/lsrc/kvm/linux.git/arch/arm64/include/asm/jump_label.h:45
>>                                                                                       >  ----:  b       38c <fixup_guest_exit+0x304>
>>                                                                                       > arch_static_branch():
>>                                                                                       > /home/alex/lsrc/kvm/linux.git/arch/arm64/include/asm/jump_label.h:31
>>                                                                                       >  ----:  nop
>>                                                                                       >  ----:  b       22c <fixup_guest_exit+0x1a4>
>>                                                                                       > test_bit():
>>                                                                                       > /home/alex/lsrc/kvm/linux.git/include/asm-generic/bitops/non-atomic.h:106
>>                                                                                       >  ----:  adrp    x0, 0 <cpu_hwcaps>
>>                                                                                       >  ----:  ldr     x0, [x0]
>>                                                                                       > __hyp_switch_fpsimd():
>>                                                                                       > /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:381
>>  ----:  tst     w0, #0x400000                                                            ----:  tst     w0, #0x400000
>>  ----:  b.eq    238 <fixup_guest_exit+0x1b0>  // b.none                               |  ----:  b.eq    22c <fixup_guest_exit+0x1a4>  // b.none
>>  ----:  cbz     w21, 238 <fixup_guest_exit+0x1b0>                                     |  ----:  tbz     w2, #5, 22c <fixup_guest_exit+0x1a4>
>> /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:383                      | /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:382
>>  ----:  ldr     w2, [x19, #2040]                                                      |  ----:  ldr     w2, [x20, #2040]
>>  ----:  add     x1, x19, #0x4b0                                                       |  ----:  add     x1, x20, #0x4b0
>>  ----:  ldr     x0, [x19, #2032]                                                      |  ----:  ldr     x0, [x20, #2032]
>> sve_ffr_offset():                                                                       sve_ffr_offset():
>>
>> Putting the calculation of guest_has_sve at the top of __hyp_switch_fpsimd
>> makes most of that go away and just moves things around a little bit. So I
>> guess it could make sense for the fast(ish) path, although I'd be
>> interested in knowing if it made any real difference to the numbers.
>> After all, the first read should be well cached and moving it through the
>> stack is just additional memory and register pressure.
>
> Hmmm, I will have a think about this when I respin.
>
> Explicitly caching guest_has_sve() does reduce the compiler's freedom to
> optimise.
>
> We might be able to mark it as __pure or __attribute_const__ to enable
> the compiler to decide whether to cache the result, but this may not be
> 100% safe.
>
> Part of me would prefer to leave things as they are to avoid the risk of
> breaking the code again...

Given that the only place you call __hyp_switch_fpsimd is here, you could
just roll it into __hyp_trap_is_fpsimd and have:

	if (__hyp_trap_is_fpsimd(vcpu))
		return true;

--
Alex Bennée

^ permalink raw reply	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 13/23] KVM: arm64/sve: Context switch the SVE registers
@ 2018-11-20 15:30             ` Alex Bennée
  0 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-20 15:30 UTC (permalink / raw)
  To: linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

<snip>
>> >> > @@ -404,10 +444,11 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
>> >> >  	 * and restore the guest context lazily.
>> >> >  	 * If FP/SIMD is not implemented, handle the trap and inject an
>> >> >  	 * undefined instruction exception to the guest.
>> >> > +	 * Similarly for trapped SVE accesses.
>> >> >  	 */
>> >> > -	if (system_supports_fpsimd() &&
>> >> > -	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
>> >> > -		return __hyp_switch_fpsimd(vcpu);
>> >> > +	guest_has_sve = vcpu_has_sve(vcpu);
>> >>
>> >> I'm not sure if it's worth fishing this out here given you are already
>> >> passing vcpu down the chain.
>> >
>> > I wanted to discourage GCC from recomputing this.  If you're in a
>> > position to do so, can you look at the disassembly with/without this
>> > factored out and see whether it makes a difference?
>>
>> Hmm it is hard to tell. There is code motion but for some reason I'm
>> seeing the static jump code unrolled, for example (original on left):
>>
>> __hyp_switch_fpsimd():                                                                  __hyp_switch_fpsimd():
>> /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:382                      | /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:381
>>                                                                                       >  ----:  tst     w0, #0x400000
>>                                                                                       >  ----:  b.eq    22c <fixup_guest_exit+0x1a4>  // b.none
>>                                                                                       > arch_static_branch_jump():
>>                                                                                       > /home/alex/lsrc/kvm/linux.git/arch/arm64/include/asm/jump_label.h:45
>>                                                                                       >  ----:  b       38c <fixup_guest_exit+0x304>
>>                                                                                       > arch_static_branch():
>>                                                                                       > /home/alex/lsrc/kvm/linux.git/arch/arm64/include/asm/jump_label.h:31
>>                                                                                       >  ----:  nop
>>                                                                                       >  ----:  b       22c <fixup_guest_exit+0x1a4>
>>                                                                                       > test_bit():
>>                                                                                       > /home/alex/lsrc/kvm/linux.git/include/asm-generic/bitops/non-atomic.h:106
>>                                                                                       >  ----:  adrp    x0, 0 <cpu_hwcaps>
>>                                                                                       >  ----:  ldr     x0, [x0]
>>                                                                                       > __hyp_switch_fpsimd():
>>                                                                                       > /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:381
>>  ----:  tst     w0, #0x400000                                                            ----:  tst     w0, #0x400000
>>  ----:  b.eq    238 <fixup_guest_exit+0x1b0>  // b.none                               |  ----:  b.eq    22c <fixup_guest_exit+0x1a4>  // b.none
>>  ----:  cbz     w21, 238 <fixup_guest_exit+0x1b0>                                     |  ----:  tbz     w2, #5, 22c <fixup_guest_exit+0x1a4>
>> /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:383                      | /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:382
>>  ----:  ldr     w2, [x19, #2040]                                                      |  ----:  ldr     w2, [x20, #2040]
>>  ----:  add     x1, x19, #0x4b0                                                       |  ----:  add     x1, x20, #0x4b0
>>  ----:  ldr     x0, [x19, #2032]                                                      |  ----:  ldr     x0, [x20, #2032]
>> sve_ffr_offset():                                                                       sve_ffr_offset():
>>
>> Putting the calculation of guest_has_sve at the top of __hyp_switch_fpsimd
>> makes most of that go away and just moves things around a little bit. So I
>> guess it could make sense for the fast(ish) path, although I'd be
>> interested in knowing if it made any real difference to the numbers.
>> After all, the first read should be well cached and moving it through the
>> stack is just additional memory and register pressure.
>
> Hmmm, I will have a think about this when I respin.
>
> Explicitly caching guest_has_sve() does reduce the compiler's freedom to
> optimise.
>
> We might be able to mark it as __pure or __attribute_const__ to enable
> the compiler to decide whether to cache the result, but this may not be
> 100% safe.
>
> Part of me would prefer to leave things as they are to avoid the risk of
> breaking the code again...

Given that the only place you call __hyp_switch_fpsimd is here, you could
just roll it into __hyp_trap_is_fpsimd and have:

	if (__hyp_trap_is_fpsimd(vcpu))
		return true;

--
Alex Bennée

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 13/23] KVM: arm64/sve: Context switch the SVE registers
  2018-11-20 15:30             ` Alex Bennée
@ 2018-11-20 17:18               ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-20 17:18 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Tue, Nov 20, 2018 at 03:30:29PM +0000, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:

[...]

> >> Putting the calculation of guest_has_sve at the top of __hyp_switch_fpsimd
> >> makes most of that go away and just moves things around a little bit. So I
> >> guess it could make sense for the fast(ish) path, although I'd be
> >> interested in knowing if it made any real difference to the numbers.
> >> After all, the first read should be well cached and moving it through the
> >> stack is just additional memory and register pressure.
> >
> > Hmmm, I will have a think about this when I respin.
> >
> > Explicitly caching guest_has_sve() does reduce the compiler's freedom to
> > optimise.
> >
> > We might be able to mark it as __pure or __attribute_const__ to enable
> > the compiler to decide whether to cache the result, but this may not be
> > 100% safe.
> >
> > Part of me would prefer to leave things as they are to avoid the risk of
> > breaking the code again...
> 
> Given that the only place you call __hyp_switch_fpsimd is here, you could
> just roll it into __hyp_trap_is_fpsimd and have:
> 
> 	if (__hyp_trap_is_fpsimd(vcpu))
> 		return true;

Possibly, though the function should be renamed in this case, something
like __hyp_handle_fpsimd_trap() I guess.
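
Roughly this sort of thing, to be clear (a sketch only, reusing the
checks from the hunks quoted earlier; the naming is still up for grabs):

	static bool __hyp_text __hyp_handle_fpsimd_trap(struct kvm_vcpu *vcpu,
							bool guest_has_sve)
	{
		u8 trap_class;

		if (!system_supports_fpsimd())
			return false;

		trap_class = kvm_vcpu_trap_get_class(vcpu);

		/* FPSIMD traps are handled for all vcpus: */
		if (trap_class == ESR_ELx_EC_FP_ASIMD)
			return __hyp_switch_fpsimd(vcpu);

		/* SVE traps are only handled for SVE-enabled guests: */
		if (guest_has_sve && trap_class == ESR_ELx_EC_SVE)
			return __hyp_switch_fpsimd(vcpu);

		return false;
	}

with fixup_guest_exit() then just doing:

	if (__hyp_handle_fpsimd_trap(vcpu, guest_has_sve))
		return true;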

Cheers
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 13/23] KVM: arm64/sve: Context switch the SVE registers
@ 2018-11-20 17:18               ` Dave Martin
  0 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-20 17:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 20, 2018 at 03:30:29PM +0000, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:

[...]

> >> Putting the calculation of guest_has_sve at the top of __hyp_switch_fpsimd
> >> makes most of that go away and just moves things around a little bit. So I
> >> guess it could make sense for the fast(ish) path, although I'd be
> >> interested in knowing if it made any real difference to the numbers.
> >> After all, the first read should be well cached and moving it through the
> >> stack is just additional memory and register pressure.
> >
> > Hmmm, I will have a think about this when I respin.
> >
> > Explicitly caching guest_has_sve() does reduce the compiler's freedom to
> > optimise.
> >
> > We might be able to mark it as __pure or __attribute_const__ to enable
> > the compiler to decide whether to cache the result, but this may not be
> > 100% safe.
> >
> > Part of me would prefer to leave things as they are to avoid the risk of
> > breaking the code again...
> 
> Given that the only place you call __hyp_switch_fpsimd is here, you could
> just roll it into __hyp_trap_is_fpsimd and have:
> 
> 	if (__hyp_trap_is_fpsimd(vcpu))
> 		return true;

Possibly, though the function should be renamed in this case, something
like __hyp_handle_fpsimd_trap() I guess.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 15/23] KVM: arm64/sve: Add SVE support to register access ioctl interface
  2018-09-28 13:39   ` Dave Martin
@ 2018-11-21 15:20     ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-21 15:20 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> This patch adds the following registers for access via the
> KVM_{GET,SET}_ONE_REG interface:
>
>  * KVM_REG_ARM64_SVE_ZREG(n, i) (n = 0..31) (in 2048-bit slices)
>  * KVM_REG_ARM64_SVE_PREG(n, i) (n = 0..15) (in 256-bit slices)
>  * KVM_REG_ARM64_SVE_FFR(i) (in 256-bit slices)
>
> In order to adapt gracefully to future architectural extensions,
> the registers are divided up into slices as noted above:  the i
> parameter denotes the slice index.
>
> For simplicity, bits or slices that exceed the maximum vector
> length supported for the vcpu are ignored for KVM_SET_ONE_REG, and
> read as zero for KVM_GET_ONE_REG.
>
> For the current architecture, only slice i = 0 is significant.  The
> interface design allows i to increase to up to 31 in the future if
> required by future architectural amendments.
>
> The registers are only visible for vcpus that have SVE enabled.
> They are not enumerated by KVM_GET_REG_LIST on vcpus that do not
> have SVE.  In all cases, surplus slices are not enumerated by
> KVM_GET_REG_LIST.
>
> Accesses to the FPSIMD registers via KVM_REG_ARM_CORE are not
> allowed for SVE-enabled vcpus: SVE-aware userspace can use the
> KVM_REG_ARM64_SVE_ZREG() interface instead to access the same
> register state.  This avoids some complex and pointless emulation
> in the kernel.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>
> Changes since RFCv1:
>
>  * Refactored to remove emulation of FPSIMD registers with the SVE
>    register view and vice-versa.  This simplifies the code a fair bit.
>
>  * Fixed a couple of range errors.
>
>  * Inlined various trivial helpers that now have only one call site.
>
>  * Use KVM_REG_SIZE() as a symbolic way of getting SVE register slice
>    sizes.
> ---
>  arch/arm64/include/uapi/asm/kvm.h |  10 +++
>  arch/arm64/kvm/guest.c            | 147 ++++++++++++++++++++++++++++++++++----
>  2 files changed, 145 insertions(+), 12 deletions(-)
>
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 97c3478..1ff68fa 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -226,6 +226,16 @@ struct kvm_vcpu_events {
>  					 KVM_REG_ARM_FW | ((r) & 0xffff))
>  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
>
> +/* SVE registers */
> +#define KVM_REG_ARM64_SVE		(0x15 << KVM_REG_ARM_COPROC_SHIFT)
> +#define KVM_REG_ARM64_SVE_ZREG(n, i)	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
> +					 KVM_REG_SIZE_U2048 |		\
> +					 ((n) << 5) | (i))
> +#define KVM_REG_ARM64_SVE_PREG(n, i)	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
> +					 KVM_REG_SIZE_U256 |		\
> +					 ((n) << 5) | (i) | 0x400)

What's the 0x400 for? Aren't PREGs already unique by being 256 bit vs
the Z regs' 2048 bit size?

> +#define KVM_REG_ARM64_SVE_FFR(i)	KVM_REG_ARM64_SVE_PREG(16, i)
> +
>  /* Device Control API: ARM VGIC */
>  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
>  #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS	1
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 953a5c9..320db0f 100644
<snip>
>
> @@ -130,6 +154,107 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	return err;
>  }
>
> +struct kreg_region {
> +	char *kptr;
> +	size_t size;
> +	size_t zeropad;
> +};
> +
> +#define SVE_REG_SLICE_SHIFT	0
> +#define SVE_REG_SLICE_BITS	5
> +#define SVE_REG_ID_SHIFT	(SVE_REG_SLICE_SHIFT + SVE_REG_SLICE_BITS)
> +#define SVE_REG_ID_BITS		5
> +
> +#define SVE_REG_SLICE_MASK \
> +	(GENMASK(SVE_REG_SLICE_BITS - 1, 0) << SVE_REG_SLICE_SHIFT)
> +#define SVE_REG_ID_MASK	\
> +	(GENMASK(SVE_REG_ID_BITS - 1, 0) << SVE_REG_ID_SHIFT)
> +

I guess this all comes out in the wash once the constants are folded, but
GENMASK does seem to be designed for arbitrary bit positions:

  #define SVE_REG_SLICE_MASK \
     GENMASK(SVE_REG_SLICE_BITS + SVE_REG_SLICE_SHIFT - 1, SVE_REG_SLICE_SHIFT)

Hmm, I guess that might be even harder to follow...
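
(Decoded, reg->id bits [4:0] are the slice index and bits [9:5] the
register number, which matches the ((n) << 5) | (i) packing in the
uapi macros above.)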

> +#define SVE_NUM_SLICES (1 << SVE_REG_SLICE_BITS)
> +
> +static int sve_reg_region(struct kreg_region *b,
> +			  const struct kvm_vcpu *vcpu,
> +			  const struct kvm_one_reg *reg)
> +{
> +	const unsigned int vl = vcpu->arch.sve_max_vl;
> +	const unsigned int vq = sve_vq_from_vl(vl);
> +
> +	const unsigned int reg_num =
> +		(reg->id & SVE_REG_ID_MASK) >> SVE_REG_ID_SHIFT;
> +	const unsigned int slice_num =
> +		(reg->id & SVE_REG_SLICE_MASK) >> SVE_REG_SLICE_SHIFT;
> +
> +	unsigned int slice_size, offset, limit;
> +
> +	if (reg->id >= KVM_REG_ARM64_SVE_ZREG(0, 0) &&
> +	    reg->id <= KVM_REG_ARM64_SVE_ZREG(SVE_NUM_ZREGS - 1,
> +					      SVE_NUM_SLICES - 1)) {
> +		slice_size = KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0));
> +
> +		/* Compute start and end of the register: */
> +		offset = SVE_SIG_ZREG_OFFSET(vq, reg_num) - SVE_SIG_REGS_OFFSET;
> +		limit = offset + SVE_SIG_ZREG_SIZE(vq);
> +
> +		offset += slice_size * slice_num; /* start of requested slice */
> +
> +	} else if (reg->id >= KVM_REG_ARM64_SVE_PREG(0, 0) &&
> +		   reg->id <= KVM_REG_ARM64_SVE_FFR(SVE_NUM_SLICES - 1)) {
> +		/* (FFR is P16 for our purposes) */
> +
> +		slice_size = KVM_REG_SIZE(KVM_REG_ARM64_SVE_PREG(0, 0));
> +
> +		/* Compute start and end of the register: */
> +		offset = SVE_SIG_PREG_OFFSET(vq, reg_num) - SVE_SIG_REGS_OFFSET;
> +		limit = offset + SVE_SIG_PREG_SIZE(vq);
> +
> +		offset += slice_size * slice_num; /* start of requested slice */
> +
> +	} else {
> +		return -ENOENT;
> +	}
> +
> +	b->kptr = (char *)vcpu->arch.sve_state + offset;
> +
> +	/*
> +	 * If the slice starts after the end of the reg, just pad.
> +	 * Otherwise, copy as much as possible up to slice_size and pad
> +	 * the remainder:
> +	 */
> +	b->size = offset >= limit ? 0 : min(limit - offset, slice_size);
> +	b->zeropad = slice_size - b->size;
> +
> +	return 0;
> +}
> +
> +static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> +{
> +	struct kreg_region kreg;
> +	char __user *uptr = (char __user *)reg->addr;
> +
> +	if (!vcpu_has_sve(vcpu) || sve_reg_region(&kreg, vcpu, reg))
> +		return -ENOENT;
> +
> +	if (copy_to_user(uptr, kreg.kptr, kreg.size) ||
> +	    clear_user(uptr + kreg.size, kreg.zeropad))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +static int set_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> +{
> +	struct kreg_region kreg;
> +	char __user *uptr = (char __user *)reg->addr;
> +
> +	if (!vcpu_has_sve(vcpu) || sve_reg_region(&kreg, vcpu, reg))
> +		return -ENOENT;
> +
> +	if (copy_from_user(kreg.kptr, uptr, kreg.size))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
>  int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
>  {
>  	return -EINVAL;
> @@ -251,12 +376,11 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
>  		return -EINVAL;
>
> -	/* Register group 16 means we want a core register. */
> -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> -		return get_core_reg(vcpu, reg);
> -
> -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> -		return kvm_arm_get_fw_reg(vcpu, reg);
> +	switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
> +	case KVM_REG_ARM_CORE:	return get_core_reg(vcpu, reg);
> +	case KVM_REG_ARM_FW:	return kvm_arm_get_fw_reg(vcpu, reg);
> +	case KVM_REG_ARM64_SVE:	return get_sve_reg(vcpu, reg);
> +	}
>
>  	if (is_timer_reg(reg->id))
>  		return get_timer_reg(vcpu, reg);
> @@ -270,12 +394,11 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
>  		return -EINVAL;
>
> -	/* Register group 16 means we set a core register. */
> -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> -		return set_core_reg(vcpu, reg);
> -
> -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> -		return kvm_arm_set_fw_reg(vcpu, reg);
> +	switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
> +	case KVM_REG_ARM_CORE:	return set_core_reg(vcpu, reg);
> +	case KVM_REG_ARM_FW:	return kvm_arm_set_fw_reg(vcpu, reg);
> +	case KVM_REG_ARM64_SVE:	return set_sve_reg(vcpu, reg);
> +	}
>
>  	if (is_timer_reg(reg->id))
>  		return set_timer_reg(vcpu, reg);

The kernel coding-style.rst seems mute on the subject of default
handling in switch, but it's probably worth having a:

  default: break; /* falls through */

to be explicit.

It's out of scope for this review, but I did get a bit confused as the
KVM_REG_ARM_COPROC_SHIFT registers seem to be fairly spread out across
the files. We have demux_c15_get/set in sys_regs, but it doesn't look as
though it touches the rest of the emulation logic, and we have
kvm_arm_get/set_fw_reg which are "special" PSCI registers. I guess this
is because COPROC_SHIFT has been used for a bunch of disparate core,
non-core and special registers.
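
For reference, this is roughly how I'd expect userspace to exercise the
new interface (a sketch with the usual <linux/kvm.h> definitions; vcpu
fd setup and error handling are elided):

	struct kvm_one_reg reg;
	unsigned char zbuf[2048 / 8];	/* one 2048-bit Z-register slice */

	reg.id   = KVM_REG_ARM64_SVE_ZREG(0, 0);	/* Z0, slice 0 */
	reg.addr = (__u64)(unsigned long)zbuf;

	if (ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg))
		perror("KVM_GET_ONE_REG");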

--
Alex Bennée

^ permalink raw reply	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 15/23] KVM: arm64/sve: Add SVE support to register access ioctl interface
@ 2018-11-21 15:20     ` Alex Bennée
  0 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-21 15:20 UTC (permalink / raw)
  To: linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> This patch adds the following registers for access via the
> KVM_{GET,SET}_ONE_REG interface:
>
>  * KVM_REG_ARM64_SVE_ZREG(n, i) (n = 0..31) (in 2048-bit slices)
>  * KVM_REG_ARM64_SVE_PREG(n, i) (n = 0..15) (in 256-bit slices)
>  * KVM_REG_ARM64_SVE_FFR(i) (in 256-bit slices)
>
> In order to adapt gracefully to future architectural extensions,
> the registers are divided up into slices as noted above:  the i
> parameter denotes the slice index.
>
> For simplicity, bits or slices that exceed the maximum vector
> length supported for the vcpu are ignored for KVM_SET_ONE_REG, and
> read as zero for KVM_GET_ONE_REG.
>
> For the current architecture, only slice i = 0 is significant.  The
> interface design allows i to increase to up to 31 in the future if
> required by future architectural amendments.
>
> The registers are only visible for vcpus that have SVE enabled.
> They are not enumerated by KVM_GET_REG_LIST on vcpus that do not
> have SVE.  In all cases, surplus slices are not enumerated by
> KVM_GET_REG_LIST.
>
> Accesses to the FPSIMD registers via KVM_REG_ARM_CORE are not
> allowed for SVE-enabled vcpus: SVE-aware userspace can use the
> KVM_REG_ARM64_SVE_ZREG() interface instead to access the same
> register state.  This avoids some complex and pointless emulation
> in the kernel.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>
> Changes since RFCv1:
>
>  * Refactored to remove emulation of FPSIMD registers with the SVE
>    register view and vice-versa.  This simplifies the code a fair bit.
>
>  * Fixed a couple of range errors.
>
>  * Inlined various trivial helpers that now have only one call site.
>
>  * Use KVM_REG_SIZE() as a symbolic way of getting SVE register slice
>    sizes.
> ---
>  arch/arm64/include/uapi/asm/kvm.h |  10 +++
>  arch/arm64/kvm/guest.c            | 147 ++++++++++++++++++++++++++++++++++----
>  2 files changed, 145 insertions(+), 12 deletions(-)
>
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 97c3478..1ff68fa 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -226,6 +226,16 @@ struct kvm_vcpu_events {
>  					 KVM_REG_ARM_FW | ((r) & 0xffff))
>  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
>
> +/* SVE registers */
> +#define KVM_REG_ARM64_SVE		(0x15 << KVM_REG_ARM_COPROC_SHIFT)
> +#define KVM_REG_ARM64_SVE_ZREG(n, i)	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
> +					 KVM_REG_SIZE_U2048 |		\
> +					 ((n) << 5) | (i))
> +#define KVM_REG_ARM64_SVE_PREG(n, i)	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
> +					 KVM_REG_SIZE_U256 |		\
> +					 ((n) << 5) | (i) | 0x400)

What's the 0x400 for? Aren't PREGs already unique by being 256 bit vs
the Z regs' 2048 bit size?

> +#define KVM_REG_ARM64_SVE_FFR(i)	KVM_REG_ARM64_SVE_PREG(16, i)
> +
>  /* Device Control API: ARM VGIC */
>  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
>  #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS	1
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 953a5c9..320db0f 100644
<snip>
>
> @@ -130,6 +154,107 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	return err;
>  }
>
> +struct kreg_region {
> +	char *kptr;
> +	size_t size;
> +	size_t zeropad;
> +};
> +
> +#define SVE_REG_SLICE_SHIFT	0
> +#define SVE_REG_SLICE_BITS	5
> +#define SVE_REG_ID_SHIFT	(SVE_REG_SLICE_SHIFT + SVE_REG_SLICE_BITS)
> +#define SVE_REG_ID_BITS		5
> +
> +#define SVE_REG_SLICE_MASK \
> +	(GENMASK(SVE_REG_SLICE_BITS - 1, 0) << SVE_REG_SLICE_SHIFT)
> +#define SVE_REG_ID_MASK	\
> +	(GENMASK(SVE_REG_ID_BITS - 1, 0) << SVE_REG_ID_SHIFT)
> +

I guess this all comes out in the wash once the constants are folded, but
GENMASK does seem to be designed for arbitrary bit positions:

  #define SVE_REG_SLICE_MASK \
     GENMASK(SVE_REG_SLICE_BITS + SVE_REG_SLICE_SHIFT - 1, SVE_REG_SLICE_SHIFT)

Hmm, I guess that might be even harder to follow...

> +#define SVE_NUM_SLICES (1 << SVE_REG_SLICE_BITS)
> +
> +static int sve_reg_region(struct kreg_region *b,
> +			  const struct kvm_vcpu *vcpu,
> +			  const struct kvm_one_reg *reg)
> +{
> +	const unsigned int vl = vcpu->arch.sve_max_vl;
> +	const unsigned int vq = sve_vq_from_vl(vl);
> +
> +	const unsigned int reg_num =
> +		(reg->id & SVE_REG_ID_MASK) >> SVE_REG_ID_SHIFT;
> +	const unsigned int slice_num =
> +		(reg->id & SVE_REG_SLICE_MASK) >> SVE_REG_SLICE_SHIFT;
> +
> +	unsigned int slice_size, offset, limit;
> +
> +	if (reg->id >= KVM_REG_ARM64_SVE_ZREG(0, 0) &&
> +	    reg->id <= KVM_REG_ARM64_SVE_ZREG(SVE_NUM_ZREGS - 1,
> +					      SVE_NUM_SLICES - 1)) {
> +		slice_size = KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0));
> +
> +		/* Compute start and end of the register: */
> +		offset = SVE_SIG_ZREG_OFFSET(vq, reg_num) - SVE_SIG_REGS_OFFSET;
> +		limit = offset + SVE_SIG_ZREG_SIZE(vq);
> +
> +		offset += slice_size * slice_num; /* start of requested slice */
> +
> +	} else if (reg->id >= KVM_REG_ARM64_SVE_PREG(0, 0) &&
> +		   reg->id <= KVM_REG_ARM64_SVE_FFR(SVE_NUM_SLICES - 1)) {
> +		/* (FFR is P16 for our purposes) */
> +
> +		slice_size = KVM_REG_SIZE(KVM_REG_ARM64_SVE_PREG(0, 0));
> +
> +		/* Compute start and end of the register: */
> +		offset = SVE_SIG_PREG_OFFSET(vq, reg_num) - SVE_SIG_REGS_OFFSET;
> +		limit = offset + SVE_SIG_PREG_SIZE(vq);
> +
> +		offset += slice_size * slice_num; /* start of requested slice */
> +
> +	} else {
> +		return -ENOENT;
> +	}
> +
> +	b->kptr = (char *)vcpu->arch.sve_state + offset;
> +
> +	/*
> +	 * If the slice starts after the end of the reg, just pad.
> +	 * Otherwise, copy as much as possible up to slice_size and pad
> +	 * the remainder:
> +	 */
> +	b->size = offset >= limit ? 0 : min(limit - offset, slice_size);
> +	b->zeropad = slice_size - b->size;
> +
> +	return 0;
> +}
> +
> +static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> +{
> +	struct kreg_region kreg;
> +	char __user *uptr = (char __user *)reg->addr;
> +
> +	if (!vcpu_has_sve(vcpu) || sve_reg_region(&kreg, vcpu, reg))
> +		return -ENOENT;
> +
> +	if (copy_to_user(uptr, kreg.kptr, kreg.size) ||
> +	    clear_user(uptr + kreg.size, kreg.zeropad))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +static int set_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> +{
> +	struct kreg_region kreg;
> +	char __user *uptr = (char __user *)reg->addr;
> +
> +	if (!vcpu_has_sve(vcpu) || sve_reg_region(&kreg, vcpu, reg))
> +		return -ENOENT;
> +
> +	if (copy_from_user(kreg.kptr, uptr, kreg.size))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
>  int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
>  {
>  	return -EINVAL;
> @@ -251,12 +376,11 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
>  		return -EINVAL;
>
> -	/* Register group 16 means we want a core register. */
> -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> -		return get_core_reg(vcpu, reg);
> -
> -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> -		return kvm_arm_get_fw_reg(vcpu, reg);
> +	switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
> +	case KVM_REG_ARM_CORE:	return get_core_reg(vcpu, reg);
> +	case KVM_REG_ARM_FW:	return kvm_arm_get_fw_reg(vcpu, reg);
> +	case KVM_REG_ARM64_SVE:	return get_sve_reg(vcpu, reg);
> +	}
>
>  	if (is_timer_reg(reg->id))
>  		return get_timer_reg(vcpu, reg);
> @@ -270,12 +394,11 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
>  		return -EINVAL;
>
> -	/* Register group 16 means we set a core register. */
> -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> -		return set_core_reg(vcpu, reg);
> -
> -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> -		return kvm_arm_set_fw_reg(vcpu, reg);
> +	switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
> +	case KVM_REG_ARM_CORE:	return set_core_reg(vcpu, reg);
> +	case KVM_REG_ARM_FW:	return kvm_arm_set_fw_reg(vcpu, reg);
> +	case KVM_REG_ARM64_SVE:	return set_sve_reg(vcpu, reg);
> +	}
>
>  	if (is_timer_reg(reg->id))
>  		return set_timer_reg(vcpu, reg);

The kernel coding-style.rst seems mute on the subject of default
handling in switch, but it's probably worth having a:

  default: break; /* falls through */

to be explicit.

It's out of scope for this review, but I did get a bit confused as the
KVM_REG_ARM_COPROC_SHIFT registers seem to be fairly spread out across
the files. We have demux_c15_get/set in sys_regs, but it doesn't look as
though it touches the rest of the emulation logic, and we have
kvm_arm_get/set_fw_reg which are "special" PSCI registers. I guess this
is because COPROC_SHIFT has been used for a bunch of disparate core,
non-core and special registers.

--
Alex Bennée

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 16/23] KVM: arm64: Enumerate SVE register indices for KVM_GET_REG_LIST
  2018-09-28 13:39   ` Dave Martin
@ 2018-11-21 16:09     ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-21 16:09 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> This patch includes the SVE register IDs in the list returned by
> KVM_GET_REG_LIST, as appropriate.
>
> On a non-SVE-enabled vcpu, no extra IDs are added.
>
> On an SVE-enabled vcpu, the appropriate number of slice IDs are
> enumerated for each SVE register, depending on the maximum vector
> length for the vcpu.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>
> Changes since RFCv1:
>
>  * Simplify enumerate_sve_regs() based on Andrew Jones' approach.
>
>  * Reg copying loops are inverted for brevity, since the order we
>    spit out the regs in doesn't really matter.
>
> (I tried to keep part of my approach to avoid the duplicate logic
> between num_sve_regs() and copy_sve_reg_indices(), but although
> it works in principle, gcc fails to fully collapse the num_regs()
> case... so I gave up.  The two functions need to be manually kept
> consistent, but hopefully that's fairly straightforward.)
> ---
>  arch/arm64/kvm/guest.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 45 insertions(+)
>
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 320db0f..89eab68 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -323,6 +323,46 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	return copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)) ? -EFAULT : 0;
>  }
>
> +static unsigned long num_sve_regs(const struct kvm_vcpu *vcpu)
> +{
> +	const unsigned int slices = DIV_ROUND_UP(
> +		vcpu->arch.sve_max_vl,
> +		KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));

Having seen this formulation come up several times now, I wonder if there
should be a kernel-private define, KVM_SVE_ZREG/PREG_SIZE, to avoid this
clumsiness.

You could still use the KVM_REG_SIZE to extract it as I guess this is to
make changes simpler if/when the SVE reg size gets bumped up.
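
i.e. something like this, defined once next to the other SVE reg
helpers (names hypothetical):

	#define KVM_SVE_ZREG_SIZE	KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0))
	#define KVM_SVE_PREG_SIZE	KVM_REG_SIZE(KVM_REG_ARM64_SVE_PREG(0, 0))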

> +
> +	if (!vcpu_has_sve(vcpu))
> +		return 0;
> +
> +	return slices * (SVE_NUM_PREGS + SVE_NUM_ZREGS + 1 /* FFR */);
> +}
> +
> +static int copy_sve_reg_indices(const struct kvm_vcpu *vcpu, u64 __user **uind)
> +{
> +	const unsigned int slices = DIV_ROUND_UP(
> +		vcpu->arch.sve_max_vl,
> +		KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));
> +	unsigned int i, n;
> +
> +	if (!vcpu_has_sve(vcpu))
> +		return 0;
> +
> +	for (i = 0; i < slices; i++) {
> +		for (n = 0; n < SVE_NUM_ZREGS; n++) {
> +			if (put_user(KVM_REG_ARM64_SVE_ZREG(n, i), (*uind)++))
> +				return -EFAULT;
> +		}
> +
> +		for (n = 0; n < SVE_NUM_PREGS; n++) {
> +			if (put_user(KVM_REG_ARM64_SVE_PREG(n, i), (*uind)++))
> +				return -EFAULT;
> +		}
> +
> +		if (put_user(KVM_REG_ARM64_SVE_FFR(i), (*uind)++))
> +			return -EFAULT;
> +	}
> +
> +	return 0;
> +}
> +
>  /**
>   * kvm_arm_num_regs - how many registers do we present via KVM_GET_ONE_REG
>   *
> @@ -333,6 +373,7 @@ unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
>  	unsigned long res = 0;
>
>  	res += num_core_regs();
> +	res += num_sve_regs(vcpu);
>  	res += kvm_arm_num_sys_reg_descs(vcpu);
>  	res += kvm_arm_get_fw_num_regs(vcpu);
>  	res += NUM_TIMER_REGS;
> @@ -357,6 +398,10 @@ int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
>  		uindices++;
>  	}
>
> +	ret = copy_sve_reg_indices(vcpu, &uindices);
> +	if (ret)
> +		return ret;
> +
>  	ret = kvm_arm_copy_fw_reg_indices(vcpu, uindices);
>  	if (ret)
>  		return ret;

Otherwise:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

--
Alex Bennée

^ permalink raw reply	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 16/23] KVM: arm64: Enumerate SVE register indices for KVM_GET_REG_LIST
@ 2018-11-21 16:09     ` Alex Bennée
  0 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-21 16:09 UTC (permalink / raw)
  To: linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> This patch includes the SVE register IDs in the list returned by
> KVM_GET_REG_LIST, as appropriate.
>
> On a non-SVE-enabled vcpu, no extra IDs are added.
>
> On an SVE-enabled vcpu, the appropriate number of slice IDs are
> enumerated for each SVE register, depending on the maximum vector
> length for the vcpu.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>
> Changes since RFCv1:
>
>  * Simplify enumerate_sve_regs() based on Andrew Jones' approach.
>
>  * Reg copying loops are inverted for brevity, since the order we
>    spit out the regs in doesn't really matter.
>
> (I tried to keep part of my approach to avoid the duplicate logic
> between num_sve_regs() and copy_sve_reg_indices(), but although
> it works in principle, gcc fails to fully collapse the num_regs()
> case... so I gave up.  The two functions need to be manually kept
> consistent, but hopefully that's fairly straightforward.)
> ---
>  arch/arm64/kvm/guest.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 45 insertions(+)
>
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 320db0f..89eab68 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -323,6 +323,46 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	return copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)) ? -EFAULT : 0;
>  }
>
> +static unsigned long num_sve_regs(const struct kvm_vcpu *vcpu)
> +{
> +	const unsigned int slices = DIV_ROUND_UP(
> +		vcpu->arch.sve_max_vl,
> +		KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));

Having seen this formulation come up several times now, I wonder if there
should be a kernel-private define, KVM_SVE_ZREG/PREG_SIZE, to avoid this
clumsiness.

You could still use the KVM_REG_SIZE to extract it as I guess this is to
make changes simpler if/when the SVE reg size gets bumped up.

> +
> +	if (!vcpu_has_sve(vcpu))
> +		return 0;
> +
> +	return slices * (SVE_NUM_PREGS + SVE_NUM_ZREGS + 1 /* FFR */);
> +}
> +
> +static int copy_sve_reg_indices(const struct kvm_vcpu *vcpu, u64 __user **uind)
> +{
> +	const unsigned int slices = DIV_ROUND_UP(
> +		vcpu->arch.sve_max_vl,
> +		KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));
> +	unsigned int i, n;
> +
> +	if (!vcpu_has_sve(vcpu))
> +		return 0;
> +
> +	for (i = 0; i < slices; i++) {
> +		for (n = 0; n < SVE_NUM_ZREGS; n++) {
> +			if (put_user(KVM_REG_ARM64_SVE_ZREG(n, i), (*uind)++))
> +				return -EFAULT;
> +		}
> +
> +		for (n = 0; n < SVE_NUM_PREGS; n++) {
> +			if (put_user(KVM_REG_ARM64_SVE_PREG(n, i), (*uind)++))
> +				return -EFAULT;
> +		}
> +
> +		if (put_user(KVM_REG_ARM64_SVE_FFR(i), (*uind)++))
> +			return -EFAULT;
> +	}
> +
> +	return 0;
> +}
> +
>  /**
>   * kvm_arm_num_regs - how many registers do we present via KVM_GET_ONE_REG
>   *
> @@ -333,6 +373,7 @@ unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
>  	unsigned long res = 0;
>
>  	res += num_core_regs();
> +	res += num_sve_regs(vcpu);
>  	res += kvm_arm_num_sys_reg_descs(vcpu);
>  	res += kvm_arm_get_fw_num_regs(vcpu);
>  	res += NUM_TIMER_REGS;
> @@ -357,6 +398,10 @@ int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
>  		uindices++;
>  	}
>
> +	ret = copy_sve_reg_indices(vcpu, &uindices);
> +	if (ret)
> +		return ret;
> +
>  	ret = kvm_arm_copy_fw_reg_indices(vcpu, uindices);
>  	if (ret)
>  		return ret;

Otherwise:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

--
Alex Bennée

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 17/23] arm64/sve: In-kernel vector length availability query interface
  2018-09-28 13:39   ` Dave Martin
@ 2018-11-21 16:16     ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-21 16:16 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> KVM will need to interrogate the set of SVE vector lengths
> available on the system.
>
> This patch exposes the relevant bits to the kernel, along with a
> sve_vq_available() helper to check whether a particular vector
> length is supported.
>
> vq_to_bit() and bit_to_vq() are not intended for use outside these
> functions, so they are given a __ prefix to warn people not to use
> them unless they really know what they are doing.

Personally I wouldn't have bothered with the __ but whatever:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm64/include/asm/fpsimd.h | 29 +++++++++++++++++++++++++++++
>  arch/arm64/kernel/fpsimd.c      | 35 ++++++++---------------------------
>  2 files changed, 37 insertions(+), 27 deletions(-)
>
> diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
> index df7a143..ad6d2e4 100644
> --- a/arch/arm64/include/asm/fpsimd.h
> +++ b/arch/arm64/include/asm/fpsimd.h
> @@ -24,10 +24,13 @@
>
>  #ifndef __ASSEMBLY__
>
> +#include <linux/bitmap.h>
>  #include <linux/build_bug.h>
> +#include <linux/bug.h>
>  #include <linux/cache.h>
>  #include <linux/init.h>
>  #include <linux/stddef.h>
> +#include <linux/types.h>
>
>  #if defined(__KERNEL__) && defined(CONFIG_COMPAT)
>  /* Masks for extracting the FPSR and FPCR from the FPSCR */
> @@ -89,6 +92,32 @@ extern u64 read_zcr_features(void);
>
>  extern int __ro_after_init sve_max_vl;
>  extern int __ro_after_init sve_max_virtualisable_vl;
> +/* Set of available vector lengths, as vq_to_bit(vq): */
> +extern __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
> +
> +/*
> + * Helpers to translate bit indices in sve_vq_map to VQ values (and
> + * vice versa).  This allows find_next_bit() to be used to find the
> + * _maximum_ VQ not exceeding a certain value.
> + */
> +static inline unsigned int __vq_to_bit(unsigned int vq)
> +{
> +	return SVE_VQ_MAX - vq;
> +}
> +
> +static inline unsigned int __bit_to_vq(unsigned int bit)
> +{
> +	if (WARN_ON(bit >= SVE_VQ_MAX))
> +		bit = SVE_VQ_MAX - 1;
> +
> +	return SVE_VQ_MAX - bit;
> +}
> +
> +/* Ensure vq >= SVE_VQ_MIN && vq <= SVE_VQ_MAX before calling this function */
> +static inline bool sve_vq_available(unsigned int vq)
> +{
> +	return test_bit(__vq_to_bit(vq), sve_vq_map);
> +}
>
>  #ifdef CONFIG_ARM64_SVE
>
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 60c5e28..cc5a495 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -136,7 +136,7 @@ static int sve_default_vl = -1;
>  int __ro_after_init sve_max_vl = SVE_VL_MIN;
>  int __ro_after_init sve_max_virtualisable_vl = SVE_VL_MIN;
>  /* Set of available vector lengths, as vq_to_bit(vq): */
> -static __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
> +__ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
>  /* Set of vector lengths present on at least one cpu: */
>  static __ro_after_init DECLARE_BITMAP(sve_vq_partial_map, SVE_VQ_MAX);
>  static void __percpu *efi_sve_state;
> @@ -270,25 +270,6 @@ void fpsimd_save(void)
>  }
>
>  /*
> - * Helpers to translate bit indices in sve_vq_map to VQ values (and
> - * vice versa).  This allows find_next_bit() to be used to find the
> - * _maximum_ VQ not exceeding a certain value.
> - */
> -
> -static unsigned int vq_to_bit(unsigned int vq)
> -{
> -	return SVE_VQ_MAX - vq;
> -}
> -
> -static unsigned int bit_to_vq(unsigned int bit)
> -{
> -	if (WARN_ON(bit >= SVE_VQ_MAX))
> -		bit = SVE_VQ_MAX - 1;
> -
> -	return SVE_VQ_MAX - bit;
> -}
> -
> -/*
>   * All vector length selection from userspace comes through here.
>   * We're on a slow path, so some sanity-checks are included.
>   * If things go wrong there's a bug somewhere, but try to fall back to a
> @@ -309,8 +290,8 @@ static unsigned int find_supported_vector_length(unsigned int vl)
>  		vl = max_vl;
>
>  	bit = find_next_bit(sve_vq_map, SVE_VQ_MAX,
> -			    vq_to_bit(sve_vq_from_vl(vl)));
> -	return sve_vl_from_vq(bit_to_vq(bit));
> +			    __vq_to_bit(sve_vq_from_vl(vl)));
> +	return sve_vl_from_vq(__bit_to_vq(bit));
>  }
>
>  #ifdef CONFIG_SYSCTL
> @@ -651,7 +632,7 @@ static void sve_probe_vqs(DECLARE_BITMAP(map, SVE_VQ_MAX))
>  		write_sysreg_s(zcr | (vq - 1), SYS_ZCR_EL1); /* self-syncing */
>  		vl = sve_get_vl();
>  		vq = sve_vq_from_vl(vl); /* skip intervening lengths */
> -		set_bit(vq_to_bit(vq), map);
> +		set_bit(__vq_to_bit(vq), map);
>  	}
>  }
>
> @@ -712,7 +693,7 @@ int sve_verify_vq_map(void)
>  	 * Mismatches above sve_max_virtualisable_vl are fine, since
>  	 * no guest is allowed to configure ZCR_EL2.LEN to exceed this:
>  	 */
> -	if (sve_vl_from_vq(bit_to_vq(b)) <= sve_max_virtualisable_vl) {
> +	if (sve_vl_from_vq(__bit_to_vq(b)) <= sve_max_virtualisable_vl) {
>  		pr_warn("SVE: cpu%d: Unsupported vector length(s) present\n",
>  			smp_processor_id());
>  		goto error;
> @@ -798,8 +779,8 @@ void __init sve_setup(void)
>  	 * so sve_vq_map must have at least SVE_VQ_MIN set.
>  	 * If something went wrong, at least try to patch it up:
>  	 */
> -	if (WARN_ON(!test_bit(vq_to_bit(SVE_VQ_MIN), sve_vq_map)))
> -		set_bit(vq_to_bit(SVE_VQ_MIN), sve_vq_map);
> +	if (WARN_ON(!test_bit(__vq_to_bit(SVE_VQ_MIN), sve_vq_map)))
> +		set_bit(__vq_to_bit(SVE_VQ_MIN), sve_vq_map);
>
>  	zcr = read_sanitised_ftr_reg(SYS_ZCR_EL1);
>  	sve_max_vl = sve_vl_from_vq((zcr & ZCR_ELx_LEN_MASK) + 1);
> @@ -828,7 +809,7 @@ void __init sve_setup(void)
>  		/* No virtualisable VLs?  This is architecturally forbidden. */
>  		sve_max_virtualisable_vl = SVE_VQ_MIN;
>  	else /* b + 1 < SVE_VQ_MAX */
> -		sve_max_virtualisable_vl = sve_vl_from_vq(bit_to_vq(b + 1));
> +		sve_max_virtualisable_vl = sve_vl_from_vq(__bit_to_vq(b + 1));
>
>  	if (sve_max_virtualisable_vl > sve_max_vl)
>  		sve_max_virtualisable_vl = sve_max_vl;


--
Alex Bennée

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 16/23] KVM: arm64: Enumerate SVE register indices for KVM_GET_REG_LIST
  2018-11-21 16:09     ` Alex Bennée
@ 2018-11-21 16:32       ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-21 16:32 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Nov 21, 2018 at 04:09:03PM +0000, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > This patch includes the SVE register IDs in the list returned by
> > KVM_GET_REG_LIST, as appropriate.
> >
> > On a non-SVE-enabled vcpu, no extra IDs are added.
> >
> > On an SVE-enabled vcpu, the appropriate number of slice IDs are
> > enumerated for each SVE register, depending on the maximum vector
> > length for the vcpu.
> >
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >
> > Changes since RFCv1:
> >
> >  * Simplify enumerate_sve_regs() based on Andrew Jones' approach.
> >
> >  * Reg copying loops are inverted for brevity, since the order we
> >    spit out the regs in doesn't really matter.
> >
> > (I tried to keep part of my approach to avoid the duplicate logic
> > between num_sve_regs() and copy_sve_reg_indices(), but although
> > it works in principle, gcc fails to fully collapse the num_regs()
> > case... so I gave up.  The two functions need to be manually kept
> > consistent, but hopefully that's fairly straightforward.)
> > ---
> >  arch/arm64/kvm/guest.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 45 insertions(+)
> >
> > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > index 320db0f..89eab68 100644
> > --- a/arch/arm64/kvm/guest.c
> > +++ b/arch/arm64/kvm/guest.c
> > @@ -323,6 +323,46 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >  	return copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)) ? -EFAULT : 0;
> >  }
> >
> > +static unsigned long num_sve_regs(const struct kvm_vcpu *vcpu)
> > +{
> > +	const unsigned int slices = DIV_ROUND_UP(
> > +		vcpu->arch.sve_max_vl,
> > +		KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));
> 
> Having seen this formulation come up several times now I wonder if there
> should be a kernel private define, KVM_SVE_ZREG/PREG_SIZE to avoid this
> clumsiness.

I agree it's a bit awkward.  Previously I spelled this "0x100", which
was terse but more sensitive to typos and other screwups than I
liked.

> You could still use the KVM_REG_SIZE to extract it as I guess this is to
> make changes simpler if/when the SVE reg size gets bumped up.

That might be more challenging to determine at compile time.

I'm not sure how good GCC is at doing const-propagation between related
(but different) expressions, so I preferred to go for something that
is clearly a compile-time constant rather than extracting it from the
register ID that came from userspace.

So, I'd prefer not to use KVM_REG_SIZE() for this, but I'm happy to add
a private #define to hide this cumbersome construct.  That would
certainly make the code more readable.

(Of course, the actual runtime cost is trivial either way, but I felt
it was easier to reason about correctness if this is really a constant.)
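
Concretely, something like the following (a sketch only; the name and
placement aren't final):

	/* Private to arch/arm64/kvm/guest.c: */
	#define KVM_SVE_ZREG_SIZE	KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0))
	#define KVM_SVE_PREG_SIZE	KVM_REG_SIZE(KVM_REG_ARM64_SVE_PREG(0, 0))

	const unsigned int slices = DIV_ROUND_UP(vcpu->arch.sve_max_vl,
						 KVM_SVE_ZREG_SIZE);

Both macros still fold to compile-time constants, so the constness
reasoning above is unaffected.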


Sound OK?

 > 
> > +
> > +	if (!vcpu_has_sve(vcpu))
> > +		return 0;
> > +
> > +	return slices * (SVE_NUM_PREGS + SVE_NUM_ZREGS + 1 /* FFR */);
> > +}
> > +
> > +static int copy_sve_reg_indices(const struct kvm_vcpu *vcpu, u64 __user **uind)
> > +{
> > +	const unsigned int slices = DIV_ROUND_UP(
> > +		vcpu->arch.sve_max_vl,
> > +		KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));
> > +	unsigned int i, n;
> > +
> > +	if (!vcpu_has_sve(vcpu))
> > +		return 0;
> > +
> > +	for (i = 0; i < slices; i++) {
> > +		for (n = 0; n < SVE_NUM_ZREGS; n++) {
> > +			if (put_user(KVM_REG_ARM64_SVE_ZREG(n, i), (*uind)++))
> > +				return -EFAULT;
> > +		}
> > +
> > +		for (n = 0; n < SVE_NUM_PREGS; n++) {
> > +			if (put_user(KVM_REG_ARM64_SVE_PREG(n, i), (*uind)++))
> > +				return -EFAULT;
> > +		}
> > +
> > +		if (put_user(KVM_REG_ARM64_SVE_FFR(i), (*uind)++))
> > +			return -EFAULT;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> >  /**
> >   * kvm_arm_num_regs - how many registers do we present via KVM_GET_ONE_REG
> >   *
> > @@ -333,6 +373,7 @@ unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
> >  	unsigned long res = 0;
> >
> >  	res += num_core_regs();
> > +	res += num_sve_regs(vcpu);
> >  	res += kvm_arm_num_sys_reg_descs(vcpu);
> >  	res += kvm_arm_get_fw_num_regs(vcpu);
> >  	res += NUM_TIMER_REGS;
> > @@ -357,6 +398,10 @@ int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
> >  		uindices++;
> >  	}
> >
> > +	ret = copy_sve_reg_indices(vcpu, &uindices);
> > +	if (ret)
> > +		return ret;
> > +
> >  	ret = kvm_arm_copy_fw_reg_indices(vcpu, uindices);
> >  	if (ret)
> >  		return ret;
> 
> Otherwise:
> 
> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

Thanks
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 17/23] arm64/sve: In-kernel vector length availability query interface
  2018-11-21 16:16     ` Alex Bennée
@ 2018-11-21 16:35       ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-21 16:35 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Nov 21, 2018 at 04:16:42PM +0000, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > KVM will need to interrogate the set of SVE vector lengths
> > available on the system.
> >
> > This patch exposes the relevant bits to the kernel, along with a
> > sve_vq_available() helper to check whether a particular vector
> > length is supported.
> >
> > vq_to_bit() and bit_to_vq() are not intended for use outside these
> > functions, so they are given a __ prefix to warn people not to use
> > them unless they really know what they are doing.
> 
> Personally I wouldn't have bothered with the __ but whatever:
> 
> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

OK, thanks

I'll probably keep the __ unless somebody else objects, but if you feel
strongly I could get rid of it.

Perhaps I simply shouldn't have called attention to it in the commit
message ;)

Cheers
---Dave

> 
> >
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >  arch/arm64/include/asm/fpsimd.h | 29 +++++++++++++++++++++++++++++
> >  arch/arm64/kernel/fpsimd.c      | 35 ++++++++---------------------------
> >  2 files changed, 37 insertions(+), 27 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
> > index df7a143..ad6d2e4 100644
> > --- a/arch/arm64/include/asm/fpsimd.h
> > +++ b/arch/arm64/include/asm/fpsimd.h
> > @@ -24,10 +24,13 @@
> >
> >  #ifndef __ASSEMBLY__
> >
> > +#include <linux/bitmap.h>
> >  #include <linux/build_bug.h>
> > +#include <linux/bug.h>
> >  #include <linux/cache.h>
> >  #include <linux/init.h>
> >  #include <linux/stddef.h>
> > +#include <linux/types.h>
> >
> >  #if defined(__KERNEL__) && defined(CONFIG_COMPAT)
> >  /* Masks for extracting the FPSR and FPCR from the FPSCR */
> > @@ -89,6 +92,32 @@ extern u64 read_zcr_features(void);
> >
> >  extern int __ro_after_init sve_max_vl;
> >  extern int __ro_after_init sve_max_virtualisable_vl;
> > +/* Set of available vector lengths, as vq_to_bit(vq): */
> > +extern __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
> > +
> > +/*
> > + * Helpers to translate bit indices in sve_vq_map to VQ values (and
> > + * vice versa).  This allows find_next_bit() to be used to find the
> > + * _maximum_ VQ not exceeding a certain value.
> > + */
> > +static inline unsigned int __vq_to_bit(unsigned int vq)
> > +{
> > +	return SVE_VQ_MAX - vq;
> > +}
> > +
> > +static inline unsigned int __bit_to_vq(unsigned int bit)
> > +{

[...]

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 17/23] arm64/sve: In-kernel vector length availability query interface
  2018-11-21 16:35       ` Dave Martin
@ 2018-11-21 16:46         ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-21 16:46 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> On Wed, Nov 21, 2018 at 04:16:42PM +0000, Alex Bennée wrote:
>>
>> Dave Martin <Dave.Martin@arm.com> writes:
>>
>> > KVM will need to interrogate the set of SVE vector lengths
>> > available on the system.
>> >
>> > This patch exposes the relevant bits to the kernel, along with a
>> > sve_vq_available() helper to check whether a particular vector
>> > length is supported.
>> >
>> > vq_to_bit() and bit_to_vq() are not intended for use outside these
>> > functions, so they are given a __ prefix to warn people not to use
>> > them unless they really know what they are doing.
>>
>> Personally I wouldn't have bothered with the __ but whatever:
>>
>> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
>
> OK, thanks
>
> I'll probably keep the __ unless somebody else objects, but if you feel
> strongly I could get rid of it.

nah - it's just a personal opinion...

> Perhaps I simply shouldn't have called attention to it in the commit
> message ;)

Psychological priming ;-)

>
> Cheers
> ---Dave
>
>>
>> >
>> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
>> > ---
>> >  arch/arm64/include/asm/fpsimd.h | 29 +++++++++++++++++++++++++++++
>> >  arch/arm64/kernel/fpsimd.c      | 35 ++++++++---------------------------
>> >  2 files changed, 37 insertions(+), 27 deletions(-)
>> >
>> > diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
>> > index df7a143..ad6d2e4 100644
>> > --- a/arch/arm64/include/asm/fpsimd.h
>> > +++ b/arch/arm64/include/asm/fpsimd.h
>> > @@ -24,10 +24,13 @@
>> >
>> >  #ifndef __ASSEMBLY__
>> >
>> > +#include <linux/bitmap.h>
>> >  #include <linux/build_bug.h>
>> > +#include <linux/bug.h>
>> >  #include <linux/cache.h>
>> >  #include <linux/init.h>
>> >  #include <linux/stddef.h>
>> > +#include <linux/types.h>
>> >
>> >  #if defined(__KERNEL__) && defined(CONFIG_COMPAT)
>> >  /* Masks for extracting the FPSR and FPCR from the FPSCR */
>> > @@ -89,6 +92,32 @@ extern u64 read_zcr_features(void);
>> >
>> >  extern int __ro_after_init sve_max_vl;
>> >  extern int __ro_after_init sve_max_virtualisable_vl;
>> > +/* Set of available vector lengths, as vq_to_bit(vq): */
>> > +extern __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
>> > +
>> > +/*
>> > + * Helpers to translate bit indices in sve_vq_map to VQ values (and
>> > + * vice versa).  This allows find_next_bit() to be used to find the
>> > + * _maximum_ VQ not exceeding a certain value.
>> > + */
>> > +static inline unsigned int __vq_to_bit(unsigned int vq)
>> > +{
>> > +	return SVE_VQ_MAX - vq;
>> > +}
>> > +
>> > +static inline unsigned int __bit_to_vq(unsigned int bit)
>> > +{
>
> [...]


--
Alex Bennée

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 16/23] KVM: arm64: Enumerate SVE register indices for KVM_GET_REG_LIST
  2018-11-21 16:32       ` Dave Martin
@ 2018-11-21 16:49         ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-21 16:49 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> On Wed, Nov 21, 2018 at 04:09:03PM +0000, Alex Bennée wrote:
>>
>> Dave Martin <Dave.Martin@arm.com> writes:
>>
>> > This patch includes the SVE register IDs in the list returned by
>> > KVM_GET_REG_LIST, as appropriate.
>> >
>> > On a non-SVE-enabled vcpu, no extra IDs are added.
>> >
>> > On an SVE-enabled vcpu, the appropriate number of slice IDs are
>> > enumerated for each SVE register, depending on the maximum vector
>> > length for the vcpu.
>> >
>> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
>> > ---
>> >
>> > Changes since RFCv1:
>> >
>> >  * Simplify enumerate_sve_regs() based on Andrew Jones' approach.
>> >
>> >  * Reg copying loops are inverted for brevity, since the order we
>> >    spit out the regs in doesn't really matter.
>> >
>> > (I tried to keep part of my approach to avoid the duplicate logic
>> > between num_sve_regs() and copy_sve_reg_indices(), but although
>> > it works in principle, gcc fails to fully collapse the num_regs()
>> > case... so I gave up.  The two functions need to be manually kept
>> > consistent, but hopefully that's fairly straightforward.)
>> > ---
>> >  arch/arm64/kvm/guest.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
>> >  1 file changed, 45 insertions(+)
>> >
>> > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
>> > index 320db0f..89eab68 100644
>> > --- a/arch/arm64/kvm/guest.c
>> > +++ b/arch/arm64/kvm/guest.c
>> > @@ -323,6 +323,46 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>> >  	return copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)) ? -EFAULT : 0;
>> >  }
>> >
>> > +static unsigned long num_sve_regs(const struct kvm_vcpu *vcpu)
>> > +{
>> > +	const unsigned int slices = DIV_ROUND_UP(
>> > +		vcpu->arch.sve_max_vl,
>> > +		KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));
>>
>> Having seen this formulation come up several times now I wonder if there
>> should be a kernel private define, KVM_SVE_ZREG/PREG_SIZE to avoid this
>> clumsiness.
>
> I agree it's a bit awkward.  Previously I spelled this "0x100", which
> was terse but more sensitive to typos and other screwups than I
> liked.
>
>> You could still use the KVM_REG_SIZE to extract it as I guess this is to
>> make changes simpler if/when the SVE reg size gets bumped up.
>
> That might be more challenging to determine at compile time.
>
> I'm not sure how good GCC is at doing const-propagation between related
> (but different) expressions, so I preferred to go for something that
> is clearly a compile-time constant rather than extracting it from the
> register ID that came from userspace.
>
> So, I'd prefer not to use KVM_REG_SIZE() for this, but I'm happy to add
> a private #define to hide this cumbersome construct.  That would
> certainly make the code more readable.
>
> (Of course, the actual runtime cost is trivial either way, but I felt
> it was easier to reason about correctness if this is really a constant.)
>
>
> Sound OK?

Yes.

I'd almost suggested why not just use KVM_REG_SIZE(KVM_REG_SIZE_U2048)
earlier, until I realised this might be forward looking.
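
For reference, KVM_REG_SIZE() just decodes the size field (log2 of the
byte width), so either spelling names the same 256-byte slice. As a
sketch, not from the patch:

	KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0))	/* 2048 bits == 256 bytes */
	KVM_REG_SIZE(KVM_REG_SIZE_U2048)		/* likewise 256 bytes */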

--
Alex Bennée

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 16/23] KVM: arm64: Enumerate SVE register indices for KVM_GET_REG_LIST
  2018-11-21 16:49         ` Alex Bennée
@ 2018-11-21 17:46           ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-21 17:46 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Nov 21, 2018 at 04:49:59PM +0000, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > On Wed, Nov 21, 2018 at 04:09:03PM +0000, Alex Bennée wrote:
> >>
> >> Dave Martin <Dave.Martin@arm.com> writes:

[...]

> >> > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> >> > index 320db0f..89eab68 100644
> >> > --- a/arch/arm64/kvm/guest.c
> >> > +++ b/arch/arm64/kvm/guest.c
> >> > @@ -323,6 +323,46 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >> >  	return copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)) ? -EFAULT : 0;
> >> >  }
> >> >
> >> > +static unsigned long num_sve_regs(const struct kvm_vcpu *vcpu)
> >> > +{
> >> > +	const unsigned int slices = DIV_ROUND_UP(
> >> > +		vcpu->arch.sve_max_vl,
> >> > +		KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));
> >>
> >> Having seen this formulation come up several times now I wonder if there
> >> should be a kernel private define, KVM_SVE_ZREG/PREG_SIZE to avoid this
> >> clumsiness.
> >
> > I agree it's a bit awkward.  Previously I spelled this "0x100", which
> > was terse but more sensitive to typos and other screwups than I
> > liked.
> >
> >> You could still use the KVM_REG_SIZE to extract it as I guess this is to
> >> make changes simpler if/when the SVE reg size gets bumped up.
> >
> > That might be more challenging to determine at compile time.
> >
> > I'm not sure how good GCC is at doing const-propagation between related
> > (but different) expressions, so I preferred to go for something that
> > is clearly a compile-time constant rather than extracting it from the
> > register ID that came from userspace.
> >
> > So, I'd prefer not to use KVM_REG_SIZE() for this, but I'm happy to add
> > a private #define to hide this cumbersome construct.  That would
> > certainly make the code more readable.
> >
> > (Of course, the actual runtime cost is trivial either way, but I felt
> > it was easier to reason about correctness if this is really a constant.)
> >
> >
> > Sound OK?
> 
> Yes.
> 
> I'd almost suggested why not just use KVM_REG_SIZE(KVM_REG_SIZE_U2048)
> earlier, until I realised this might be forward looking.

Having the slice size as 2048 bits is a property of the ABI and isn't
intended to change in the future.

So, I guess we could write the above, but I prefer to have the size
determined in the fewest number of places possible.


Cheers
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 15/23] KVM: arm64/sve: Add SVE support to register access ioctl interface
  2018-11-21 15:20     ` Alex Bennée
@ 2018-11-21 18:05       ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-21 18:05 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Wed, Nov 21, 2018 at 03:20:15PM +0000, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > This patch adds the following registers for access via the
> > KVM_{GET,SET}_ONE_REG interface:
> >
> >  * KVM_REG_ARM64_SVE_ZREG(n, i) (n = 0..31) (in 2048-bit slices)
> >  * KVM_REG_ARM64_SVE_PREG(n, i) (n = 0..15) (in 256-bit slices)
> >  * KVM_REG_ARM64_SVE_FFR(i) (in 256-bit slices)
> >
> > In order to adapt gracefully to future architectural extensions,
> > the registers are divided up into slices as noted above:  the i
> > parameter denotes the slice index.
> >
> > For simplicity, bits or slices that exceed the maximum vector
> > length supported for the vcpu are ignored for KVM_SET_ONE_REG, and
> > read as zero for KVM_GET_ONE_REG.
> >
> > For the current architecture, only slice i = 0 is significant.  The
> > interface design allows i to increase to up to 31 in the future if
> > required by future architectural amendments.
> >
> > The registers are only visible for vcpus that have SVE enabled.
> > They are not enumerated by KVM_GET_REG_LIST on vcpus that do not
> > have SVE.  In all cases, surplus slices are not enumerated by
> > KVM_GET_REG_LIST.
> >
> > Accesses to the FPSIMD registers via KVM_REG_ARM_CORE is not
> > allowed for SVE-enabled vcpus: SVE-aware userspace can use the
> > KVM_REG_ARM64_SVE_ZREG() interface instead to access the same
> > register state.  This avoids some complex and pointless emulation
> > in the kernel.
> >
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >
> > Changes since RFCv1:
> >
> >  * Refactored to remove emulation of FPSIMD registers with the SVE
> >    register view and vice-versa.  This simplifies the code a fair bit.
> >
> >  * Fixed a couple of range errors.
> >
> >  * Inlined various trivial helpers that now have only one call site.
> >
> >  * Use KVM_REG_SIZE() as a symbolic way of getting SVE register slice
> >    sizes.
> > ---
> >  arch/arm64/include/uapi/asm/kvm.h |  10 +++
> >  arch/arm64/kvm/guest.c            | 147 ++++++++++++++++++++++++++++++++++----
> >  2 files changed, 145 insertions(+), 12 deletions(-)
> >
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > index 97c3478..1ff68fa 100644
> > --- a/arch/arm64/include/uapi/asm/kvm.h
> > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > @@ -226,6 +226,16 @@ struct kvm_vcpu_events {
> >  					 KVM_REG_ARM_FW | ((r) & 0xffff))
> >  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> >
> > +/* SVE registers */
> > +#define KVM_REG_ARM64_SVE		(0x15 << KVM_REG_ARM_COPROC_SHIFT)
> > +#define KVM_REG_ARM64_SVE_ZREG(n, i)	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
> > +					 KVM_REG_SIZE_U2048 |		\
> > +					 ((n) << 5) | (i))
> > +#define KVM_REG_ARM64_SVE_PREG(n, i)	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
> > +					 KVM_REG_SIZE_U256 |		\
> > +					 ((n) << 5) | (i) | 0x400)
> 
> What's the 0x400 for? Aren't PREGs already unique by being 256-bit vs
> the Z regs' 2048-bit size?

I was treating the reg size field as metadata rather than being part
of the ID, so the IDs remain unique even with the size field masked
out.

For the core regs, we explicitly allow access to the same underlying
regs using a mix of access sizes (whether or not that is a good idea is
another question).

For the rest of the regs perhaps we can rely on the size field to
disambiguate different regs in practice, but I didn't feel comfortable
making that assumption...
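
In other words (illustrative only, not something in the patch): treating
the size field as metadata means the IDs must stay distinct once it is
masked out, which is what the 0x400 flag guarantees:

	/* ZREG(0, 0) and PREG(0, 0) differ only by the 0x400 flag
	 * once the size field is masked out: */
	BUILD_BUG_ON((KVM_REG_ARM64_SVE_ZREG(0, 0) & ~KVM_REG_SIZE_MASK) ==
		     (KVM_REG_ARM64_SVE_PREG(0, 0) & ~KVM_REG_SIZE_MASK));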

> 
> > +#define KVM_REG_ARM64_SVE_FFR(i)	KVM_REG_ARM64_SVE_PREG(16, i)
> > +
> >  /* Device Control API: ARM VGIC */
> >  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
> >  #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS	1
> > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > index 953a5c9..320db0f 100644
> <snip>
> >
> > @@ -130,6 +154,107 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >  	return err;
> >  }
> >
> > +struct kreg_region {
> > +	char *kptr;
> > +	size_t size;
> > +	size_t zeropad;
> > +};
> > +
> > +#define SVE_REG_SLICE_SHIFT	0
> > +#define SVE_REG_SLICE_BITS	5
> > +#define SVE_REG_ID_SHIFT	(SVE_REG_SLICE_SHIFT + SVE_REG_SLICE_BITS)
> > +#define SVE_REG_ID_BITS		5
> > +
> > +#define SVE_REG_SLICE_MASK \
> > +	(GENMASK(SVE_REG_SLICE_BITS - 1, 0) << SVE_REG_SLICE_SHIFT)
> > +#define SVE_REG_ID_MASK	\
> > +	(GENMASK(SVE_REG_ID_BITS - 1, 0) << SVE_REG_ID_SHIFT)
> > +
> 
> I guess this all comes out in the wash once the constants are folded but
> GENMASK does seem to be designed for arbitrary bit positions:
> 
>   #define SVE_REG_SLICE_MASK \
>      GENMASK(SVE_REG_SLICE_BITS + SVE_REG_SLICE_SHIFT - 1, SVE_REG_SLICE_SHIFT)
> 
> Hmm I guess that might be even harder to follow...

Swings and roundabouts...

I'm not sure I prefer your version, but I agree it's a more natural use
of GENMASK() than my version.  I'm happy to change it.
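
i.e. roughly this shape (a sketch of the agreed form):

	#define SVE_REG_SLICE_MASK \
		GENMASK(SVE_REG_SLICE_SHIFT + SVE_REG_SLICE_BITS - 1, \
			SVE_REG_SLICE_SHIFT)
	#define SVE_REG_ID_MASK \
		GENMASK(SVE_REG_ID_SHIFT + SVE_REG_ID_BITS - 1, \
			SVE_REG_ID_SHIFT)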

> 
> > +#define SVE_NUM_SLICES (1 << SVE_REG_SLICE_BITS)
> > +
> > +static int sve_reg_region(struct kreg_region *b,
> > +			  const struct kvm_vcpu *vcpu,
> > +			  const struct kvm_one_reg *reg)
> > +{
> > +	const unsigned int vl = vcpu->arch.sve_max_vl;
> > +	const unsigned int vq = sve_vq_from_vl(vl);
> > +
> > +	const unsigned int reg_num =
> > +		(reg->id & SVE_REG_ID_MASK) >> SVE_REG_ID_SHIFT;
> > +	const unsigned int slice_num =
> > +		(reg->id & SVE_REG_SLICE_MASK) >> SVE_REG_SLICE_SHIFT;
> > +
> > +	unsigned int slice_size, offset, limit;
> > +
> > +	if (reg->id >= KVM_REG_ARM64_SVE_ZREG(0, 0) &&
> > +	    reg->id <= KVM_REG_ARM64_SVE_ZREG(SVE_NUM_ZREGS - 1,
> > +					      SVE_NUM_SLICES - 1)) {
> > +		slice_size = KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0));
> > +
> > +		/* Compute start and end of the register: */
> > +		offset = SVE_SIG_ZREG_OFFSET(vq, reg_num) - SVE_SIG_REGS_OFFSET;
> > +		limit = offset + SVE_SIG_ZREG_SIZE(vq);
> > +
> > +		offset += slice_size * slice_num; /* start of requested slice */
> > +
> > +	} else if (reg->id >= KVM_REG_ARM64_SVE_PREG(0, 0) &&
> > +		   reg->id <= KVM_REG_ARM64_SVE_FFR(SVE_NUM_SLICES - 1)) {
> > +		/* (FFR is P16 for our purposes) */
> > +
> > +		slice_size = KVM_REG_SIZE(KVM_REG_ARM64_SVE_PREG(0, 0));
> > +
> > +		/* Compute start and end of the register: */
> > +		offset = SVE_SIG_PREG_OFFSET(vq, reg_num) - SVE_SIG_REGS_OFFSET;
> > +		limit = offset + SVE_SIG_PREG_SIZE(vq);
> > +
> > +		offset += slice_size * slice_num; /* start of requested slice */
> > +
> > +	} else {
> > +		return -ENOENT;
> > +	}
> > +
> > +	b->kptr = (char *)vcpu->arch.sve_state + offset;
> > +
> > +	/*
> > +	 * If the slice starts after the end of the reg, just pad.
> > +	 * Otherwise, copy as much as possible up to slice_size and pad
> > +	 * the remainder:
> > +	 */
> > +	b->size = offset >= limit ? 0 : min(limit - offset, slice_size);
> > +	b->zeropad = slice_size - b->size;
> > +
> > +	return 0;
> > +}
> > +
> > +static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> > +{
> > +	struct kreg_region kreg;
> > +	char __user *uptr = (char __user *)reg->addr;
> > +
> > +	if (!vcpu_has_sve(vcpu) || sve_reg_region(&kreg, vcpu, reg))
> > +		return -ENOENT;
> > +
> > +	if (copy_to_user(uptr, kreg.kptr, kreg.size) ||
> > +	    clear_user(uptr + kreg.size, kreg.zeropad))
> > +		return -EFAULT;
> > +
> > +	return 0;
> > +}
> > +
> > +static int set_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> > +{
> > +	struct kreg_region kreg;
> > +	char __user *uptr = (char __user *)reg->addr;
> > +
> > +	if (!vcpu_has_sve(vcpu) || sve_reg_region(&kreg, vcpu, reg))
> > +		return -ENOENT;
> > +
> > +	if (copy_from_user(kreg.kptr, uptr, kreg.size))
> > +		return -EFAULT;
> > +
> > +	return 0;
> > +}
> > +
> >  int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
> >  {
> >  	return -EINVAL;
> > @@ -251,12 +376,11 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >  	if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
> >  		return -EINVAL;
> >
> > -	/* Register group 16 means we want a core register. */
> > -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> > -		return get_core_reg(vcpu, reg);
> > -
> > -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> > -		return kvm_arm_get_fw_reg(vcpu, reg);
> > +	switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
> > +	case KVM_REG_ARM_CORE:	return get_core_reg(vcpu, reg);
> > +	case KVM_REG_ARM_FW:	return kvm_arm_get_fw_reg(vcpu, reg);
> > +	case KVM_REG_ARM64_SVE:	return get_sve_reg(vcpu, reg);
> > +	}
> >
> >  	if (is_timer_reg(reg->id))
> >  		return get_timer_reg(vcpu, reg);
> > @@ -270,12 +394,11 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >  	if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
> >  		return -EINVAL;
> >
> > -	/* Register group 16 means we set a core register. */
> > -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> > -		return set_core_reg(vcpu, reg);
> > -
> > -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> > -		return kvm_arm_set_fw_reg(vcpu, reg);
> > +	switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
> > +	case KVM_REG_ARM_CORE:	return set_core_reg(vcpu, reg);
> > +	case KVM_REG_ARM_FW:	return kvm_arm_set_fw_reg(vcpu, reg);
> > +	case KVM_REG_ARM64_SVE:	return set_sve_reg(vcpu, reg);
> > +	}
> >
> >  	if (is_timer_reg(reg->id))
> >  		return set_timer_reg(vcpu, reg);
> 
> The kernel coding-style.rst seems mute on the subject of default
> handling in switch but it's probably worth having a:
> 
>   default: break; /* falls through */
> 
> to be explicit.

I can add that.  I thought it was reasonably clear given the pattern
being followed here but it may still be a trap for the future.
There's no harm in being explicit, so I will follow your suggestion in
the respin.  
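
Something like this, then (a sketch):

	switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
	case KVM_REG_ARM_CORE:	return get_core_reg(vcpu, reg);
	case KVM_REG_ARM_FW:	return kvm_arm_get_fw_reg(vcpu, reg);
	case KVM_REG_ARM64_SVE:	return get_sve_reg(vcpu, reg);
	default:	break;	/* fall through to the timer regs below */
	}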

> It's out of scope for this review but I did get a bit confused as the
> KVM_REG_ARM_COPROC_SHIFT registers seem to be fairly spread out across
> the files. We have demux_c15_get/set in sys_regs but doesn't look as
> though it touches the rest of the emulation logic and we have
> kvm_arm_get/set_fw_reg which are "special" PCSI registers. I guess this
> is because COPROC_SHIFT has been used for a bunch of disparate core and
> non-core and special registers.

Not sure I quite get your point, except that yes, handling of different
registers is somewhat spread around the place.  I tried not to make
things worse here than they already are, at least.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

* [RFC PATCH v2 15/23] KVM: arm64/sve: Add SVE support to register access ioctl interface
@ 2018-11-21 18:05       ` Dave Martin
  0 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-21 18:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 21, 2018 at 03:20:15PM +0000, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > This patch adds the following registers for access via the
> > KVM_{GET,SET}_ONE_REG interface:
> >
> >  * KVM_REG_ARM64_SVE_ZREG(n, i) (n = 0..31) (in 2048-bit slices)
> >  * KVM_REG_ARM64_SVE_PREG(n, i) (n = 0..15) (in 256-bit slices)
> >  * KVM_REG_ARM64_SVE_FFR(i) (in 256-bit slices)
> >
> > In order to adapt gracefully to future architectural extensions,
> > the registers are divided up into slices as noted above:  the i
> > parameter denotes the slice index.
> >
> > For simplicity, bits or slices that exceed the maximum vector
> > length supported for the vcpu are ignored for KVM_SET_ONE_REG, and
> > read as zero for KVM_GET_ONE_REG.
> >
> > For the current architecture, only slice i = 0 is significant.  The
> > interface design allows i to increase to up to 31 in the future if
> > required by future architectural amendments.
> >
> > The registers are only visible for vcpus that have SVE enabled.
> > They are not enumerated by KVM_GET_REG_LIST on vcpus that do not
> > have SVE.  In all cases, surplus slices are not enumerated by
> > KVM_GET_REG_LIST.
> >
> > Accesses to the FPSIMD registers via KVM_REG_ARM_CORE is not
> > allowed for SVE-enabled vcpus: SVE-aware userspace can use the
> > KVM_REG_ARM64_SVE_ZREG() interface instead to access the same
> > register state.  This avoids some complex and pointless emluation
> > in the kernel.
> >
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >
> > Changes since RFCv1:
> >
> >  * Refactored to remove emulation of FPSIMD registers with the SVE
> >    register view and vice-versa.  This simplifies the code a fair bit.
> >
> >  * Fixed a couple of range errors.
> >
> >  * Inlined various trivial helpers that now have only one call site.
> >
> >  * Use KVM_REG_SIZE() as a symbolic way of getting SVE register slice
> >    sizes.
> > ---
> >  arch/arm64/include/uapi/asm/kvm.h |  10 +++
> >  arch/arm64/kvm/guest.c            | 147 ++++++++++++++++++++++++++++++++++----
> >  2 files changed, 145 insertions(+), 12 deletions(-)
> >
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > index 97c3478..1ff68fa 100644
> > --- a/arch/arm64/include/uapi/asm/kvm.h
> > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > @@ -226,6 +226,16 @@ struct kvm_vcpu_events {
> >  					 KVM_REG_ARM_FW | ((r) & 0xffff))
> >  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> >
> > +/* SVE registers */
> > +#define KVM_REG_ARM64_SVE		(0x15 << KVM_REG_ARM_COPROC_SHIFT)
> > +#define KVM_REG_ARM64_SVE_ZREG(n, i)	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
> > +					 KVM_REG_SIZE_U2048 |		\
> > +					 ((n) << 5) | (i))
> > +#define KVM_REG_ARM64_SVE_PREG(n, i)	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
> > +					 KVM_REG_SIZE_U256 |		\
> > +					 ((n) << 5) | (i) | 0x400)
> 
> What's the 0x400 for? Aren't PREG's already unique by being 256 bit vs
> the Z regs 2048 bit size?

I was treating the reg size field as metadata rather than being part
of the ID, so the IDs remain unique even with the size field masked
out.

For the core regs, we explicitly allow access to the same underlying
regs using a mix of access sizes (whether or not that is a good idea is
another question).

For the rest of the regs perhaps we can rely on the size field to
disambiguate different regs in practice, but I didn't reel comfortable
making that assumption...

> 
> > +#define KVM_REG_ARM64_SVE_FFR(i)	KVM_REG_ARM64_SVE_PREG(16, i)
> > +
> >  /* Device Control API: ARM VGIC */
> >  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
> >  #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS	1
> > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > index 953a5c9..320db0f 100644
> <snip>
> >
> > @@ -130,6 +154,107 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >  	return err;
> >  }
> >
> > +struct kreg_region {
> > +	char *kptr;
> > +	size_t size;
> > +	size_t zeropad;
> > +};
> > +
> > +#define SVE_REG_SLICE_SHIFT	0
> > +#define SVE_REG_SLICE_BITS	5
> > +#define SVE_REG_ID_SHIFT	(SVE_REG_SLICE_SHIFT + SVE_REG_SLICE_BITS)
> > +#define SVE_REG_ID_BITS		5
> > +
> > +#define SVE_REG_SLICE_MASK \
> > +	(GENMASK(SVE_REG_SLICE_BITS - 1, 0) << SVE_REG_SLICE_SHIFT)
> > +#define SVE_REG_ID_MASK	\
> > +	(GENMASK(SVE_REG_ID_BITS - 1, 0) << SVE_REG_ID_SHIFT)
> > +
> 
> I guess this all comes out in the wash once the constants are folded but
> GENMASK does seem to be designed for arbitrary bit positions:
> 
>   #define SVE_REG_SLICE_MASK \
>      GEN_MASK(SVE_REG_SLICE_BITS + SVE_REG_SLICE_SHIFT - 1, SVE_REG_SLICE_SHIFT)
> 
> Hmm I guess that might be even harder to follow...

Swings and roundabouts...

I'm not sure I prefer your version, but I agree it's a more natural use
of GENMASK() than my version.  I'm happy to change it.

> 
> > +#define SVE_NUM_SLICES (1 << SVE_REG_SLICE_BITS)
> > +
> > +static int sve_reg_region(struct kreg_region *b,
> > +			  const struct kvm_vcpu *vcpu,
> > +			  const struct kvm_one_reg *reg)
> > +{
> > +	const unsigned int vl = vcpu->arch.sve_max_vl;
> > +	const unsigned int vq = sve_vq_from_vl(vl);
> > +
> > +	const unsigned int reg_num =
> > +		(reg->id & SVE_REG_ID_MASK) >> SVE_REG_ID_SHIFT;
> > +	const unsigned int slice_num =
> > +		(reg->id & SVE_REG_SLICE_MASK) >> SVE_REG_SLICE_SHIFT;
> > +
> > +	unsigned int slice_size, offset, limit;
> > +
> > +	if (reg->id >= KVM_REG_ARM64_SVE_ZREG(0, 0) &&
> > +	    reg->id <= KVM_REG_ARM64_SVE_ZREG(SVE_NUM_ZREGS - 1,
> > +					      SVE_NUM_SLICES - 1)) {
> > +		slice_size = KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0));
> > +
> > +		/* Compute start and end of the register: */
> > +		offset = SVE_SIG_ZREG_OFFSET(vq, reg_num) - SVE_SIG_REGS_OFFSET;
> > +		limit = offset + SVE_SIG_ZREG_SIZE(vq);
> > +
> > +		offset += slice_size * slice_num; /* start of requested slice */
> > +
> > +	} else if (reg->id >= KVM_REG_ARM64_SVE_PREG(0, 0) &&
> > +		   reg->id <= KVM_REG_ARM64_SVE_FFR(SVE_NUM_SLICES - 1)) {
> > +		/* (FFR is P16 for our purposes) */
> > +
> > +		slice_size = KVM_REG_SIZE(KVM_REG_ARM64_SVE_PREG(0, 0));
> > +
> > +		/* Compute start and end of the register: */
> > +		offset = SVE_SIG_PREG_OFFSET(vq, reg_num) - SVE_SIG_REGS_OFFSET;
> > +		limit = offset + SVE_SIG_PREG_SIZE(vq);
> > +
> > +		offset += slice_size * slice_num; /* start of requested slice */
> > +
> > +	} else {
> > +		return -ENOENT;
> > +	}
> > +
> > +	b->kptr = (char *)vcpu->arch.sve_state + offset;
> > +
> > +	/*
> > +	 * If the slice starts after the end of the reg, just pad.
> > +	 * Otherwise, copy as much as possible up to slice_size and pad
> > +	 * the remainder:
> > +	 */
> > +	b->size = offset >= limit ? 0 : min(limit - offset, slice_size);
> > +	b->zeropad = slice_size - b->size;
> > +
> > +	return 0;
> > +}
> > +
> > +static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> > +{
> > +	struct kreg_region kreg;
> > +	char __user *uptr = (char __user *)reg->addr;
> > +
> > +	if (!vcpu_has_sve(vcpu) || sve_reg_region(&kreg, vcpu, reg))
> > +		return -ENOENT;
> > +
> > +	if (copy_to_user(uptr, kreg.kptr, kreg.size) ||
> > +	    clear_user(uptr + kreg.size, kreg.zeropad))
> > +		return -EFAULT;
> > +
> > +	return 0;
> > +}
> > +
> > +static int set_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> > +{
> > +	struct kreg_region kreg;
> > +	char __user *uptr = (char __user *)reg->addr;
> > +
> > +	if (!vcpu_has_sve(vcpu) || sve_reg_region(&kreg, vcpu, reg))
> > +		return -ENOENT;
> > +
> > +	if (copy_from_user(kreg.kptr, uptr, kreg.size))
> > +		return -EFAULT;
> > +
> > +	return 0;
> > +}
> > +
> >  int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
> >  {
> >  	return -EINVAL;
> > @@ -251,12 +376,11 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >  	if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
> >  		return -EINVAL;
> >
> > -	/* Register group 16 means we want a core register. */
> > -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> > -		return get_core_reg(vcpu, reg);
> > -
> > -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> > -		return kvm_arm_get_fw_reg(vcpu, reg);
> > +	switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
> > +	case KVM_REG_ARM_CORE:	return get_core_reg(vcpu, reg);
> > +	case KVM_REG_ARM_FW:	return kvm_arm_get_fw_reg(vcpu, reg);
> > +	case KVM_REG_ARM64_SVE:	return get_sve_reg(vcpu, reg);
> > +	}
> >
> >  	if (is_timer_reg(reg->id))
> >  		return get_timer_reg(vcpu, reg);
> > @@ -270,12 +394,11 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >  	if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
> >  		return -EINVAL;
> >
> > -	/* Register group 16 means we set a core register. */
> > -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> > -		return set_core_reg(vcpu, reg);
> > -
> > -	if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> > -		return kvm_arm_set_fw_reg(vcpu, reg);
> > +	switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
> > +	case KVM_REG_ARM_CORE:	return set_core_reg(vcpu, reg);
> > +	case KVM_REG_ARM_FW:	return kvm_arm_set_fw_reg(vcpu, reg);
> > +	case KVM_REG_ARM64_SVE:	return set_sve_reg(vcpu, reg);
> > +	}
> >
> >  	if (is_timer_reg(reg->id))
> >  		return set_timer_reg(vcpu, reg);
> 
> The kernel coding-style.rst seems mute on the subject of default
> handling in switch but it's probably worth having a:
> 
>   default: break; /* falls through */
> 
> to be explicit.

I can add that.  I thought it was reasonably clear given the pattern
being followed here but it may still be a trap for the future.
There's no harm in being explicit, so I will follow your suggestion in
the respin.  

> It's out of scope for this review but I did get a bit confused as the
> KVM_REG_ARM_COPROC_SHIFT registers seems to be fairly spread out across
> the files. We have demux_c15_get/set in sys_regs but doesn't look as
> though it touches the rest of the emulation logic and we have
> kvm_arm_get/set_fw_reg which are "special" PCSI registers. I guess this
> is because COPROC_SHIFT has been used for a bunch of disparate core and
> non-core and special registers.

Not sure I quite get your point, except that yes, handling of different
registers is somewhat spread around the place.  I tried not to make
things worse here than they already are, at least.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST
  2018-11-15 17:27       ` Dave Martin
@ 2018-11-22 10:53         ` Christoffer Dall
  -1 siblings, 0 replies; 154+ messages in thread
From: Christoffer Dall @ 2018-11-22 10:53 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

[Adding Peter and Alex for their view on the QEMU side]

On Thu, Nov 15, 2018 at 05:27:11PM +0000, Dave Martin wrote:
> On Fri, Nov 02, 2018 at 09:16:25AM +0100, Christoffer Dall wrote:
> > On Fri, Sep 28, 2018 at 02:39:15PM +0100, Dave Martin wrote:
> > > KVM_GET_REG_LIST should only enumerate registers that are actually
> > > accessible, so it is necessary to filter out any register that is
> > > not exposed to the guest.  For features that are configured at
> > > runtime, this will require a dynamic check.
> > > 
> > > For example, ZCR_EL1 and ID_AA64ZFR0_EL1 would need to be hidden
> > > if SVE is not enabled for the guest.
> > 
> > This implies that userspace can never access this interface for a vcpu
> > before having decided whether such features are enabled for the guest or
> > not, since otherwise userspace will see different states for a VCPU
> > depending on sequencing of the API, which sounds fragile to me.
> > 
> > That should probably be documented somewhere, and I hope the
> > enable/disable API for SVE in guests already takes that into account.
> > 
> > Not sure if there's an action to take here, but it was the best place I
> > could raise this concern.
> 
> Fair point.  I struggled to come up with something better that solves
> all problems.
> 
> My expectation is that KVM_ARM_SVE_CONFIG_SET is considered part of
> creating the vcpu, so that if issued at all for a vcpu, it is issued
> very soon after KVM_VCPU_INIT.
> 
> I think this worked OK with the current structure of kvmtool and I
> seem to remember discussing this with Peter Maydell re qemu -- but
> it sounds like I should double-check.

QEMU does something around enumerating all the system registers exposed
by KVM and saving/restoring them as part of its startup, but I don't
remember the exact sequence.

> 
> Either way, you're right, this needs to be clearly documented.
> 
> 
> If we want to be more robust, maybe we should add a capability too,
> so that userspace that enables this capability promises to call
> KVM_ARM_SVE_CONFIG_SET for each vcpu, and affected ioctls (KVM_RUN,
> KVM_GET_REG_LIST etc.) are forbidden until that is done?
> 
> That should help avoid accidents.
> 
> I could add a special meaning for an empty kvm_sve_vls, such that
> it doesn't enable SVE on the affected vcpu.  That retains the ability
> to create heterogeneous guests while still following the above flow.
> 
I think making sure that userspace can only ever see the same list of
available system registers is going to cause us less pain going forward.

If the separate ioctl and capability check is the easiest way of doing
that, then I think that sounds good.  (I would have liked us to just
pass some data with KVM_CREATE_VCPU, but that doesn't seem to be an
option.)
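
(To be concrete, the kind of gate I would picture on the KVM side is
something like the sketch below -- the helper and the flag it tests are
hypothetical, purely to illustrate the handshake:)

	/* Sketch only: refuse affected vcpu ioctls until SVE is configured */
	static bool vcpu_sve_config_pending(struct kvm_vcpu *vcpu)
	{
		/* userspace enabled the (hypothetical) capability flag but
		 * has not issued KVM_ARM_SVE_CONFIG_SET on this vcpu yet */
		return vcpu->kvm->arch.sve_cap_enabled && !vcpu_has_sve(vcpu);
	}

	/* ...checked at the top of KVM_RUN, KVM_GET_REG_LIST, etc.: */
	if (vcpu_sve_config_pending(vcpu))
		return -EBADFD;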


Thanks,

    Christoffer

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST
  2018-11-22 10:53         ` Christoffer Dall
@ 2018-11-22 11:13           ` Peter Maydell
  -1 siblings, 0 replies; 154+ messages in thread
From: Peter Maydell @ 2018-11-22 11:13 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, Dave Martin, arm-mail-list

On 22 November 2018 at 10:53, Christoffer Dall <christoffer.dall@arm.com> wrote:
> [Adding Peter and Alex for their view on the QEMU side]
>
> On Thu, Nov 15, 2018 at 05:27:11PM +0000, Dave Martin wrote:
>> My expectation is that KVM_ARM_SVE_CONFIG_SET is considered part of
>> creating the vcpu, so that if issued at all for a vcpu, it is issued
>> very soon after KVM_VCPU_INIT.
>>
>> I think this worked OK with the current structure of kvmtool and I
>> seem to remember discussing this with Peter Maydell re qemu -- but
>> it sounds like I should double-check.
>
> QEMU does something around enumerating all the system registers exposed
> by KVM and saving/restoring them as part of its startup, but I don't
> remember the exact sequence.

This all happens in kvm_arch_init_vcpu(), which does:
 * KVM_ARM_VCPU_INIT ioctl (with the appropriate kvm_init_features set)
 * read the guest MPIDR with GET_ONE_REG so we know what KVM
   is doing with MPIDR assignment across CPUs
 * check for interesting extensions like KVM_CAP_SET_GUEST_DEBUG
 * get and cache a list of what system registers the vcpu has,
   using KVM_GET_REG_LIST. This is where we do the "size must
   be U32 or U64" sanity check.

So if there's something we can't do by setting kvm_init_features
for KVM_ARM_VCPU_INIT but have to do immediately afterwards,
that is straightforward.

The major requirement for QEMU is that if we don't specifically
enable SVE in the VCPU then we must not see any registers
in the KVM_GET_REG_LIST that are not u32 or u64 -- otherwise
QEMU will refuse to start.
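
(For reference, that check is essentially the following shape --
illustrative only, not the literal QEMU code:)

	/* reject any exposed register that isn't 32 or 64 bits wide */
	static int check_reg_sizes(const struct kvm_reg_list *rl)
	{
		__u64 i, size;

		for (i = 0; i < rl->n; i++) {
			size = rl->reg[i] & KVM_REG_SIZE_MASK;
			if (size != KVM_REG_SIZE_U32 &&
			    size != KVM_REG_SIZE_U64)
				return -EINVAL;	/* QEMU refuses to start */
		}
		return 0;
	}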

thanks
-- PMM

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST
  2018-11-22 10:53         ` Christoffer Dall
@ 2018-11-22 11:27           ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-22 11:27 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, Dave Martin,
	linux-arm-kernel


Christoffer Dall <christoffer.dall@arm.com> writes:

> [Adding Peter and Alex for their view on the QEMU side]
>
> On Thu, Nov 15, 2018 at 05:27:11PM +0000, Dave Martin wrote:
>> On Fri, Nov 02, 2018 at 09:16:25AM +0100, Christoffer Dall wrote:
>> > On Fri, Sep 28, 2018 at 02:39:15PM +0100, Dave Martin wrote:
>> > > KVM_GET_REG_LIST should only enumerate registers that are actually
>> > > accessible, so it is necessary to filter out any register that is
>> > > not exposed to the guest.  For features that are configured at
>> > > runtime, this will require a dynamic check.
>> > >
>> > > For example, ZCR_EL1 and ID_AA64ZFR0_EL1 would need to be hidden
>> > > if SVE is not enabled for the guest.
>> >
>> > This implies that userspace can never access this interface for a vcpu
>> > before having decided whether such features are enabled for the guest or
>> > not, since otherwise userspace will see different states for a VCPU
>> > depending on sequencing of the API, which sounds fragile to me.
>> >
>> > That should probably be documented somewhere, and I hope the
>> > enable/disable API for SVE in guests already takes that into account.
>> >
>> > Not sure if there's an action to take here, but it was the best place I
>> > could raise this concern.
>>
>> Fair point.  I struggled to come up with something better that solves
>> all problems.
>>
>> My expectation is that KVM_ARM_SVE_CONFIG_SET is considered part of
>> creating the vcpu, so that if issued at all for a vcpu, it is issued
>> very soon after KVM_VCPU_INIT.
>>
>> I think this worked OK with the current structure of kvmtool and I
>> seem to remember discussing this with Peter Maydell re qemu -- but
>> it sounds like I should double-check.
>
> QEMU does something around enumerating all the system registers exposed
> by KVM and saving/restoring them as part of its startup, but I don't
> remember the exact sequence.

QEMU does this for each vCPU as part of its start-up sequence:

  kvm_init_vcpu
    kvm_get_cpu (-> KVM_CREATE_VCPU)
    KVM_GET_VCPU_MMAP_SIZE
    kvm_arch_init_vcpu
      kvm_arm_vcpu_init (-> KVM_ARM_VCPU_INIT)
      kvm_get_one_reg(ARM_CPU_ID_MPIDR)
      kvm_arm_init_debug (chk for KVM_CAP SET_GUEST_DEBUG/GUEST_DEBUG_HW_WPS/BPS)
      kvm_arm_init_serror_injection (chk KVM_CAP_ARM_INJECT_SERROR_ESR)
      kvm_arm_init_cpreg_list (KVM_GET_REG_LIST)

At this point we have the register list we need for
kvm_arch_get_registers which is what we call every time we want to
synchronise state. We only really do this for debug events, crashes and
at some point when migrating.

>
>>
>> Either way, you're right, this needs to be clearly documented.
>>
>>
>> If we want to be more robust, maybe we should add a capability too,
>> so that userspace that enables this capability promises to call
>> KVM_ARM_SVE_CONFIG_SET for each vcpu, and affected ioctls (KVM_RUN,
>> KVM_GET_REG_LIST etc.) are forbidden until that is done?
>>
>> That should help avoid accidents.
>>
>> I could add a special meaning for an empty kvm_sve_vls, such that
>> it doesn't enable SVE on the affected vcpu.  That retains the ability
>> to create heterogeneous guests while still following the above flow.
>>
> I think making sure that userspace can only ever see the same list of
> available system registers is going to cause us less pain going forward.
>
> If the separate ioctl and capability check is the easiest way of doing
> that, then I think that sounds good.  (I would have liked us to just
> pass some data with KVM_CREATE_VCPU, but that doesn't seem to be an
> option.)
>
>
> Thanks,
>
>     Christoffer


--
Alex Bennée

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST
  2018-11-22 11:27           ` Alex Bennée
@ 2018-11-22 12:32             ` Dave P Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave P Martin @ 2018-11-22 12:32 UTC (permalink / raw)
  To: Alex Bennée
  Cc: tokamoto, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Nov 22, 2018 at 11:27:53AM +0000, Alex Bennée wrote:
>
> Christoffer Dall <christoffer.dall@arm.com> writes:
>
> > [Adding Peter and Alex for their view on the QEMU side]
> >
> > On Thu, Nov 15, 2018 at 05:27:11PM +0000, Dave Martin wrote:
> >> On Fri, Nov 02, 2018 at 09:16:25AM +0100, Christoffer Dall wrote:
> >> > On Fri, Sep 28, 2018 at 02:39:15PM +0100, Dave Martin wrote:
> >> > > KVM_GET_REG_LIST should only enumerate registers that are actually
> >> > > accessible, so it is necessary to filter out any register that is
> >> > > not exposed to the guest.  For features that are configured at
> >> > > runtime, this will require a dynamic check.
> >> > >
> >> > > For example, ZCR_EL1 and ID_AA64ZFR0_EL1 would need to be hidden
> >> > > if SVE is not enabled for the guest.
> >> >
> >> > This implies that userspace can never access this interface for a vcpu
> >> > before having decided whether such features are enabled for the guest or
> >> > not, since otherwise userspace will see different states for a VCPU
> >> > depending on sequencing of the API, which sounds fragile to me.
> >> >
> >> > That should probably be documented somewhere, and I hope the
> >> > enable/disable API for SVE in guests already takes that into account.
> >> >
> >> > Not sure if there's an action to take here, but it was the best place I
> >> > could raise this concern.
> >>
> >> Fair point.  I struggled to come up with something better that solves
> >> all problems.
> >>
> >> My expectation is that KVM_ARM_SVE_CONFIG_SET is considered part of
> >> creating the vcpu, so that if issued at all for a vcpu, it is issued
> >> very soon after KVM_VCPU_INIT.
> >>
> >> I think this worked OK with the current structure of kvmtool and I
> >> seem to remember discussing this with Peter Maydell re qemu -- but
> >> it sounds like I should double-check.
> >
> > QEMU does something around enumerating all the system registers exposed
> > by KVM and saving/restoring them as part of its startup, but I don't
> > remember the exact sequence.
>
> QEMU does this for each vCPU as part of its start-up sequence:
>
>   kvm_init_vcpu
>     kvm_get_cpu (-> KVM_CREATE_VCPU)
>     KVM_GET_VCPU_MMAP_SIZE
>     kvm_arch_init_vcpu
>       kvm_arm_vcpu_init (-> KVM_ARM_VCPU_INIT)
>       kvm_get_one_reg(ARM_CPU_ID_MPIDR)
>       kvm_arm_init_debug (chk for KVM_CAP SET_GUEST_DEBUG/GUEST_DEBUG_HW_WPS/BPS)
>       kvm_arm_init_serror_injection (chk KVM_CAP_ARM_INJECT_SERROR_ESR)
>       kvm_arm_init_cpreg_list (KVM_GET_REG_LIST)
>
> At this point we have the register list we need for
> kvm_arch_get_registers which is what we call every time we want to
> synchronise state. We only really do this for debug events, crashes and
> at some point when migrating.

So we would need to insert KVM_ARM_SVE_CONFIG_SET into this sequence,
meaning that the new capability is not strictly necessary.

I sympathise with Christoffer's view though that without the capability
mechanism it may be too easy for software to make mistakes: code
refactoring might swap the KVM_GET_REG_LIST and KVM_ARM_SVE_CONFIG ioctls
over and then things would go wrong with no immediate error indication.

In effect, the SVE regs would be missing from the list yielded by
KVM_GET_REG_LIST, possibly leading to silent migration failures.
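
(For clarity, the ordering userspace has to preserve is roughly:

	ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &init);
	ioctl(vcpu_fd, KVM_ARM_SVE_CONFIG, &vls);    /* cmd = KVM_ARM_SVE_CONFIG_SET */
	ioctl(vcpu_fd, KVM_GET_REG_LIST, &reg_list); /* now includes the SVE regs */

with nothing but convention currently enforcing that.)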

I'm a bit uneasy about that.  Am I being too paranoid now?

Cheers
---Dave

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST
  2018-11-22 11:13           ` Peter Maydell
@ 2018-11-22 12:34             ` Christoffer Dall
  -1 siblings, 0 replies; 154+ messages in thread
From: Christoffer Dall @ 2018-11-22 12:34 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, Dave Martin, arm-mail-list

On Thu, Nov 22, 2018 at 11:13:51AM +0000, Peter Maydell wrote:
> On 22 November 2018 at 10:53, Christoffer Dall <christoffer.dall@arm.com> wrote:
> > [Adding Peter and Alex for their view on the QEMU side]
> >
> > On Thu, Nov 15, 2018 at 05:27:11PM +0000, Dave Martin wrote:
> >> My expectation is that KVM_ARM_SVE_CONFIG_SET is considered part of
> >> creating the vcpu, so that if issued at all for a vcpu, it is issued
> >> very soon after KVM_VCPU_INIT.
> >>
> >> I think this worked OK with the current structure of kvmtool and I
> >> seem to remember discussing this with Peter Maydell re qemu -- but
> >> it sounds like I should double-check.
> >
> > QEMU does some thing around enumerating all the system registers exposed
> > by KVM and saving/restoring them as part of its startup, but I don't
> > remember the exact sequence.
> 
> This all happens in kvm_arch_init_vcpu(), which does:
>  * KVM_ARM_VCPU_INIT ioctl (with the appropriate kvm_init_features set)
>  * read the guest MPIDR with GET_ONE_REG so we know what KVM
>    is doing with MPIDR assignment across CPUs
>  * check for interesting extensions like KVM_CAP_SET_GUEST_DEBUG
>  * get and cache a list of what system registers the vcpu has,
>    using KVM_GET_REG_LIST. This is where we do the "size must
>    be U32 or U64" sanity check.
> 
> So if there's something we can't do by setting kvm_init_features
> for KVM_ARM_VCPU_INIT but have to do immediately afterwards,
> that is straightforward.
> 
> The major requirement for QEMU is that if we don't specifically
> enable SVE in the VCPU then we must not see any registers
> in the KVM_GET_REG_LIST that are not u32 or u64 -- otherwise
> QEMU will refuse to start.
> 

So on migration, will you have the required information for
KVM_ARM_VCPU_INIT before setting the registers from the migration
stream?

(I assume so, because presumably this comes from a command-line switch
or from the machine definition, which must match the source.)

Therefore, I don't think there's an issue with this patch, but from
bitter experience I think we should enforce ordering if possible.


Thanks,

    Christoffer

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST
  2018-11-22 12:34             ` Christoffer Dall
@ 2018-11-22 12:59               ` Peter Maydell
  -1 siblings, 0 replies; 154+ messages in thread
From: Peter Maydell @ 2018-11-22 12:59 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, Dave Martin, arm-mail-list

On 22 November 2018 at 12:34, Christoffer Dall <christoffer.dall@arm.com> wrote:
> So on migration, will you have the required information for
> KVM_ARM_VCPU_INIT before setting the registers from the migration
> stream?
>
> (I assume so, because presumably this comes from a command-line switch
> or from the machine definition, which must match the source.)

Yes. QEMU always sets up the VCPU completely before doing any
inbound migration.

> Therefore, I don't think there's an issue with this patch, but from
> bitter experience I think we should enforce ordering if possible.

Yes, if there are semantic ordering constraints on the various
calls it would be nice to have the kernel enforce them.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST
  2018-11-22 12:32             ` Dave P Martin
@ 2018-11-22 13:07               ` Christoffer Dall
  -1 siblings, 0 replies; 154+ messages in thread
From: Christoffer Dall @ 2018-11-22 13:07 UTC (permalink / raw)
  To: Dave P Martin
  Cc: tokamoto, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Nov 22, 2018 at 01:32:37PM +0100, Dave P Martin wrote:
> On Thu, Nov 22, 2018 at 11:27:53AM +0000, Alex Bennée wrote:
> > 
> > Christoffer Dall <christoffer.dall@arm.com> writes:
> > 
> > > [Adding Peter and Alex for their view on the QEMU side]
> > >
> > > On Thu, Nov 15, 2018 at 05:27:11PM +0000, Dave Martin wrote:
> > >> On Fri, Nov 02, 2018 at 09:16:25AM +0100, Christoffer Dall wrote:
> > >> > On Fri, Sep 28, 2018 at 02:39:15PM +0100, Dave Martin wrote:
> > >> > > KVM_GET_REG_LIST should only enumerate registers that are actually
> > >> > > accessible, so it is necessary to filter out any register that is
> > >> > > not exposed to the guest.  For features that are configured at
> > >> > > runtime, this will require a dynamic check.
> > >> > >
> > >> > > For example, ZCR_EL1 and ID_AA64ZFR0_EL1 would need to be hidden
> > >> > > if SVE is not enabled for the guest.
> > >> >
> > >> > This implies that userspace can never access this interface for a vcpu
> > >> > before having decided whether such features are enabled for the guest or
> > >> > not, since otherwise userspace will see different states for a VCPU
> > >> > depending on sequencing of the API, which sounds fragile to me.
> > >> >
> > >> > That should probably be documented somewhere, and I hope the
> > >> > enable/disable API for SVE in guests already takes that into account.
> > >> >
> > >> > Not sure if there's an action to take here, but it was the best place I
> > >> > could raise this concern.
> > >>
> > >> Fair point.  I struggled to come up with something better that solves
> > >> all problems.
> > >>
> > >> My expectation is that KVM_ARM_SVE_CONFIG_SET is considered part of
> > >> creating the vcpu, so that if issued at all for a vcpu, it is issued
> > >> very soon after KVM_VCPU_INIT.
> > >>
> > >> I think this worked OK with the current structure of kvmtool and I
> > >> seem to remember discussing this with Peter Maydell re qemu -- but
> > >> it sounds like I should double-check.
> > >
> > > QEMU does something around enumerating all the system registers exposed
> > > by KVM and saving/restoring them as part of its startup, but I don't
> > > remember the exact sequence.
> > 
> > QEMU does this for each vCPU as part of its start-up sequence:
> > 
> >   kvm_init_vcpu
> >     kvm_get_cpu (-> KVM_CREATE_VCPU)
> >     KVM_GET_VCPU_MMAP_SIZE
> >     kvm_arch_init_vcpu
> >       kvm_arm_vcpu_init (-> KVM_ARM_VCPU_INIT)
> >       kvm_get_one_reg(ARM_CPU_ID_MPIDR)
> >       kvm_arm_init_debug (chk for KVM_CAP SET_GUEST_DEBUG/GUEST_DEBUG_HW_WPS/BPS)
> >       kvm_arm_init_serror_injection (chk KVM_CAP_ARM_INJECT_SERROR_ESR)
> >       kvm_arm_init_cpreg_list (KVM_GET_REG_LIST)
> > 
> > At this point we have the register list we need for
> > kvm_arch_get_registers which is what we call every time we want to
> > synchronise state. We only really do this for debug events, crashes and
> > at some point when migrating.
> 
> So we would need to insert KVM_ARM_SVE_CONFIG_SET into this sequence,
> meaning that the new capability is not strictly necessary.
> 
> I sympathise with Christoffer's view though that without the capability
> mechanism it may be too easy for software to make mistakes: code
> refactoring might swap the KVM_GET_REG_LIST and KVM_ARM_SVE_CONFIG ioctls
> over and then things would go wrong with no immediate error indication.
> 
> In effect, the SVE regs would be missing from the list yielded by
> KVM_GET_REG_LIST, possibly leading to silent migration failures.
> 
> I'm a bit uneasy about that.  Am I being too paranoid now?
> 

No, we've made decisions in the past where we didn't enforce ordering,
which ended up being a huge pain (vgic lazy init, as a clear example of
something really bad).  Of course, it's a tradeoff.  If it's a huge pain
to implement, maybe things will be ok, but if it's just a read/write
capability handshake, I think it's worth doing.
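
(Roughly the usual capability pattern, i.e. nothing exotic -- a sketch
only, using the names proposed in this series:)

	if (ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_ARM_SVE) > 0) {
		struct kvm_enable_cap cap = { .cap = KVM_CAP_ARM_SVE };

		/* promise to configure SVE on each vcpu... */
		ioctl(vcpu_fd, KVM_ENABLE_CAP, &cap);
		/* ...and KVM may refuse KVM_RUN etc. until SVE_CONFIG_SET */
	}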


Thanks,

    Christoffer

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 19/23] KVM: arm64/sve: Report and enable SVE API extensions for userspace
  2018-09-28 13:39   ` Dave Martin
@ 2018-11-22 15:23     ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-22 15:23 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> This patch adds the necessary API extensions to allow userspace to
> detect SVE support for guests and enable it.
>
> A new capability KVM_CAP_ARM_SVE is defined to allow userspace to
> detect the availability of the KVM SVE API extensions in the usual
> way.
>
> Userspace needs to enable SVE explicitly per vcpu and configure the
> set of SVE vector lengths available to the guest before the vcpu is
> allowed to run.  For these purposes, a new arm64-specific vcpu
> ioctl KVM_ARM_SVE_CONFIG is added, with the following subcommands
> (in rough order of expected use):
>
> KVM_ARM_SVE_CONFIG_QUERY: report the set of vector lengths
>     supported by this host.
>
>     The resulting set can be supplied directly to
>     KVM_ARM_SVE_CONFIG_SET in order to obtain the maximal possible
>     set, or used to inform userspace's decision on the appropriate
>     set of vector lengths (possibly taking into account the
>     configuration of other nodes in the cluster so that the VM can
>     migrate freely).
>
> KVM_ARM_SVE_CONFIG_SET: enable SVE for this vcpu and configure the
>     set of vector lengths it offers to the guest.
>
>     This can only be done once, before the vcpu is run.
>
> KVM_ARM_SVE_CONFIG_GET: report the set of vector lengths available
>     to the guest on this vcpu (for use when snapshotting or
>     migrating a VM).
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>
> Changes since RFCv1:
>
>  * The new feature bit for PREFERRED_TARGET / VCPU_INIT is gone in
>    favour of a capability and a new ioctl to enable/configure SVE.
>
>    Perhaps the SVE configuration could be done via device attributes,
>    but it still has to be done early, so crowbarring support for this
>    behind a generic API may cause more trouble than it solves.
>
>    This is still up for discussion if anybody feels strongly about it.
>
>  * An ioctl KVM_ARM_SVE_CONFIG has been added to report the set of
>    vector lengths available and configure SVE for a vcpu.
>
>    To reduce ioctl namespace pollution the new operations are grouped
>    as subcommands under a single ioctl, since they use the same
>    argument format anyway.
> ---
>  arch/arm64/include/asm/kvm_host.h |   8 +-
>  arch/arm64/include/uapi/asm/kvm.h |  14 ++++
>  arch/arm64/kvm/guest.c            | 164 +++++++++++++++++++++++++++++++++++++-
>  arch/arm64/kvm/reset.c            |  50 ++++++++++++
>  include/uapi/linux/kvm.h          |   4 +
>  5 files changed, 238 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index bbde597..5225485 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -52,6 +52,12 @@
>
>  DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
>
> +#ifdef CONFIG_ARM64_SVE
> +bool kvm_sve_supported(void);
> +#else
> +static inline bool kvm_sve_supported(void) { return false; }
> +#endif
> +
>  int __attribute_const__ kvm_target_cpu(void);
>  int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
>  int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext);
> @@ -441,7 +447,7 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>  static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
>  static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
>
> -static inline void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
> +void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu);
>
>  void kvm_arm_init_debug(void);
>  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 1ff68fa..94f6932 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -32,6 +32,7 @@
>  #define KVM_NR_SPSR	5
>
>  #ifndef __ASSEMBLY__
> +#include <linux/kernel.h>
>  #include <linux/psci.h>
>  #include <linux/types.h>
>  #include <asm/ptrace.h>
> @@ -108,6 +109,19 @@ struct kvm_vcpu_init {
>  	__u32 features[7];
>  };
>
> +/* Vector length set for KVM_ARM_SVE_CONFIG */
> +struct kvm_sve_vls {
> +	__u16 cmd;
> +	__u16 max_vq;
> +	__u16 _reserved[2];
> +	__u64 required_vqs[__KERNEL_DIV_ROUND_UP(SVE_VQ_MAX - SVE_VQ_MIN + 1, 64)];
> +};
> +
> +/* values for cmd: */
> +#define KVM_ARM_SVE_CONFIG_QUERY	0 /* query what the host can support */
> +#define KVM_ARM_SVE_CONFIG_SET		1 /* enable SVE for vcpu and set VLs */
> +#define KVM_ARM_SVE_CONFIG_GET		2 /* read the set of VLs for a vcpu */
> +
>  struct kvm_sregs {
>  };
>
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 331b85e..d96145a 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -26,6 +26,9 @@
>  #include <linux/module.h>
>  #include <linux/vmalloc.h>
>  #include <linux/fs.h>
> +#include <linux/slab.h>
> +#include <linux/string.h>
> +#include <linux/types.h>
>  #include <kvm/arm_psci.h>
>  #include <asm/cputype.h>
>  #include <linux/uaccess.h>
> @@ -56,6 +59,11 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
>  	return 0;
>  }
>
> +void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
> +{
> +	kfree(vcpu->arch.sve_state);
> +}
> +
>  static u64 core_reg_offset_from_id(u64 id)
>  {
>  	return id & ~(KVM_REG_ARCH_MASK | KVM_REG_SIZE_MASK | KVM_REG_ARM_CORE);
> @@ -546,10 +554,164 @@ int kvm_vcpu_preferred_target(struct kvm_vcpu_init *init)
>  	return 0;
>  }
>
> +#define VQS_PER_U64 64
> +#define vq_word(vqs, vq) (&(vqs)[((vq) - SVE_VQ_MIN) / VQS_PER_U64])
> +#define vq_mask(vq) ((u64)1 << (((vq) - SVE_VQ_MIN) % VQS_PER_U64))
> +
> +static void set_vq(u64 *vqs, unsigned int vq)
> +{
> +	*vq_word(vqs, vq) |= vq_mask(vq);
> +}
> +
> +static bool vq_set(const u64 *vqs, unsigned int vq)
> +{
> +	return *vq_word(vqs, vq) & vq_mask(vq);
> +}
> +
> +static int kvm_vcpu_set_sve_vls(struct kvm_vcpu *vcpu, struct kvm_sve_vls *vls,
> +		struct kvm_sve_vls __user *userp)
> +{
> +	unsigned int vq, max_vq;
> +	int ret;
> +
> +	if (vcpu->arch.has_run_once || vcpu_has_sve(vcpu))
> +		return -EBADFD; /* too late, or already configured */
> +
> +	BUG_ON(vcpu->arch.sve_max_vl || vcpu->arch.sve_state);
> +
> +	if (vls->max_vq < SVE_VQ_MIN || vls->max_vq > SVE_VQ_MAX)
> +		return -EINVAL;
> +
> +	max_vq = 0;
> +	for (vq = SVE_VQ_MIN; vq <= vls->max_vq; ++vq) {
> +		bool available = sve_vq_available(vq);
> +		bool required = vq_set(vls->required_vqs, vq);
> +
> +		if (required != available)
> +			break;
> +
> +		if (required)
> +			max_vq = vq;
> +	}
> +
> +	if (max_vq < SVE_VQ_MIN)
> +		return -EINVAL;
> +
> +	vls->max_vq = max_vq;
> +	ret = put_user(vls->max_vq, &userp->max_vq);
> +	if (ret)
> +		return ret;
> +
> +	/*
> +	 * kvm_reset_vcpu() may already have run in KVM_VCPU_INIT, so we
> +	 * rely on kzalloc() being sufficient to reset the guest SVE
> +	 * state here for a new vcpu.
> +	 *
> +	 * Subsequent resets after vcpu initialisation are handled by
> +	 * kvm_reset_sve().
> +	 */
> +	vcpu->arch.sve_state = kzalloc(SVE_SIG_REGS_SIZE(vls->max_vq),
> +				       GFP_KERNEL);
> +	if (!vcpu->arch.sve_state)
> +		return -ENOMEM;
> +
> +	vcpu->arch.flags |= KVM_ARM64_GUEST_HAS_SVE;
> +	vcpu->arch.sve_max_vl = sve_vl_from_vq(vls->max_vq);
> +
> +	return 0;
> +}
> +
> +static int __kvm_vcpu_query_sve_vls(struct kvm_sve_vls *vls,
> +		unsigned int max_vq, struct kvm_sve_vls __user *userp)
> +{
> +	unsigned int vq, max_available_vq;
> +
> +	memset(&vls->required_vqs, 0, sizeof(vls->required_vqs));
> +
> +	BUG_ON(max_vq < SVE_VQ_MIN || max_vq > SVE_VQ_MAX);
> +
> +	max_available_vq = 0;
> +	for (vq = SVE_VQ_MIN; vq <= max_vq; ++vq)
> +		if (sve_vq_available(vq)) {
> +			set_vq(vls->required_vqs, vq);
> +			max_available_vq = vq;
> +		}
> +
> +	if (WARN_ON(max_available_vq < SVE_VQ_MIN))
> +		return -EIO;
> +
> +	vls->max_vq = max_available_vq;
> +	if (copy_to_user(userp, vls, sizeof(*vls)))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +static int kvm_vcpu_query_sve_vls(struct kvm_vcpu *vcpu, struct kvm_sve_vls *vls,
> +		struct kvm_sve_vls __user *userp)
> +{
> +	BUG_ON(!sve_vl_valid(sve_max_vl));
> +
> +	return __kvm_vcpu_query_sve_vls(vls,
> +			sve_vq_from_vl(sve_max_vl), userp);
> +}
> +
> +static int kvm_vcpu_get_sve_vls(struct kvm_vcpu *vcpu, struct kvm_sve_vls *vls,
> +		struct kvm_sve_vls __user *userp)
> +{
> +	if (!vcpu_has_sve(vcpu))
> +		return -EBADFD; /* not configured yet */
> +
> +	BUG_ON(!sve_vl_valid(vcpu->arch.sve_max_vl));
> +
> +	return __kvm_vcpu_query_sve_vls(vls,
> +			sve_vq_from_vl(vcpu->arch.sve_max_vl), userp);
> +}
> +
> +static int kvm_vcpu_sve_config(struct kvm_vcpu *vcpu,
> +			       struct kvm_sve_vls __user *userp)
> +{
> +	struct kvm_sve_vls vls;
> +
> +	if (!kvm_sve_supported())
> +		return -EINVAL;
> +
> +	if (copy_from_user(&vls, userp, sizeof(vls)))
> +		return -EFAULT;
> +
> +	/*
> +	 * For forwards compatibility, flush any set bits in _reserved[]
> +	 * to tell userspace that we didn't look at them:
> +	 */
> +	memset(&vls._reserved, 0, sizeof vls._reserved);
> +
> +	switch (vls.cmd) {
> +	case KVM_ARM_SVE_CONFIG_QUERY:
> +		return kvm_vcpu_query_sve_vls(vcpu, &vls, userp);
> +
> +	case KVM_ARM_SVE_CONFIG_SET:
> +		return kvm_vcpu_set_sve_vls(vcpu, &vls, userp);
> +
> +	case KVM_ARM_SVE_CONFIG_GET:
> +		return kvm_vcpu_get_sve_vls(vcpu, &vls, userp);
> +
> +	default:
> +		return -EINVAL;
> +	}
> +}
> +
>  int kvm_arm_arch_vcpu_ioctl(struct kvm_vcpu *vcpu,
>  			    unsigned int ioctl, unsigned long arg)
>  {
> -	return -EINVAL;
> +	void __user *userp = (void __user *)arg;
> +
> +	switch (ioctl) {
> +	case KVM_ARM_SVE_CONFIG:
> +		return kvm_vcpu_sve_config(vcpu, userp);
> +
> +	default:
> +		return -EINVAL;
> +	}
>  }
>
>  int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> index e37c78b..c2edcde 100644
> --- a/arch/arm64/kvm/reset.c
> +++ b/arch/arm64/kvm/reset.c
> @@ -19,10 +19,12 @@
>   * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>   */
>
> +#include <linux/atomic.h>
>  #include <linux/errno.h>
>  #include <linux/kvm_host.h>
>  #include <linux/kvm.h>
>  #include <linux/hw_breakpoint.h>
> +#include <linux/string.h>
>
>  #include <kvm/arm_arch_timer.h>
>
> @@ -54,6 +56,31 @@ static bool cpu_has_32bit_el1(void)
>  	return !!(pfr0 & 0x20);
>  }
>
> +#ifdef CONFIG_ARM64_SVE
> +bool kvm_sve_supported(void)
> +{
> +	static bool warn_printed = false;
> +
> +	if (!system_supports_sve())
> +		return false;
> +
> +	/*
> +	 * For now, consider the hardware broken if implementation
> +	 * differences between CPUs in the system result in the set of
> +	 * vector lengths safely virtualisable for guests being less
> +	 * than the set provided to userspace:
> +	 */
> +	if (sve_max_virtualisable_vl != sve_max_vl) {
> +		if (!xchg(&warn_printed, true))
> +			kvm_err("Hardware SVE implementations mismatched: suppressing SVE for guests.");

This seems like you are re-inventing WARN_ONCE for the sake of having
"kvm [%i]: " in your printk string.
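
i.e. something like the following sketch, assuming the stock
printk_once() helper and open-coding the "kvm [%i]: " prefix that
kvm_err() normally adds (kvm_err() itself has no _once variant):

    if (sve_max_virtualisable_vl != sve_max_vl) {
        /* printk_once() keeps its own internal guard, no static flag */
        printk_once(KERN_ERR "kvm [%i]: Hardware SVE implementations mismatched: suppressing SVE for guests.\n",
                    task_pid_nr(current));
        return false;
    }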

> +
> +		return false;
> +	}
> +
> +	return true;
> +}
> +#endif
> +
>  /**
>   * kvm_arch_dev_ioctl_check_extension
>   *
> @@ -85,6 +112,9 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_VCPU_EVENTS:
>  		r = 1;
>  		break;
> +	case KVM_CAP_ARM_SVE:
> +		r = kvm_sve_supported();
> +		break;

For debugging we actually use the return value to indicate how many
WPs/BPs we have. We could do the same here for the max number of VQs,
but I guess KVM_ARM_SVE_CONFIG_QUERY reports a much richer set of
information. However, this does raise the question of how useful all
this extra information really is to the userspace program?

A dumber implementation would be:

QEMU                 |        Kernel

KVM_CAP_ARM_SVE  --------->
                              Max VQ=n
           VQ/0  <---------

We want n < max VQ

KVM_ARM_SVE_CONFIG(n) ---->
                              Unsupported VQ
           EINVAL <--------

Weird HW can't support our choice of n.
Give up or try another value.

KVM_ARM_SVE_CONFIG(n-1) --->
                              That's OK
           0 (OK) <---------

It imposes more heavy lifting on the userspace side of things, but I
would expect that in the "normal" case sane hardware supports all VLs
from max VQ down to 1. And for the cases where it doesn't, iterating
through several KVM_ARM_SVE_CONFIG steps is a start-up cost, not a
runtime one.

This would mean one capability and one SVE_CONFIG subcommand with a
single parameter. We could always extend the interface later, but I
wonder if we are gold-plating the API too early here?
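
For illustration, the userspace side of that probe-and-retry loop might
look roughly like this (just a sketch; kvm_arm_sve_config() is a
hypothetical wrapper around the single-parameter ioctl proposed above):

    int vq;

    /* Walk down from the max VQ reported by the capability probe */
    for (vq = max_vq; vq >= 1; vq--) {
        if (kvm_arm_sve_config(vcpu_fd, vq) == 0)
            break;  /* the kernel accepted this VQ */
        /* -EINVAL: this VQ is unsupported here, try the next one down */
    }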

What do the maintainers think?


>  	default:
>  		r = 0;
>  	}
> @@ -92,6 +122,21 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
>  	return r;
>  }
>
> +int kvm_reset_sve(struct kvm_vcpu *vcpu)
> +{
> +	if (!vcpu_has_sve(vcpu))
> +		return 0;
> +
> +	if (WARN_ON(!vcpu->arch.sve_state ||
> +		    !sve_vl_valid(vcpu->arch.sve_max_vl)))
> +		return -EIO;

Using WARN_ON for its side effects seems sketchy for some reason, but
while BUG_ON can compile away to nothing, WARN_ON has been designed to
always give you the result of the condition, so never mind...

> +
> +	memset(vcpu->arch.sve_state, 0,
> +	       SVE_SIG_REGS_SIZE(sve_vq_from_vl(vcpu->arch.sve_max_vl)));
> +
> +	return 0;
> +}
> +
>  /**
>   * kvm_reset_vcpu - sets core registers and sys_regs to reset value
>   * @vcpu: The VCPU pointer
> @@ -103,6 +148,7 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
>  int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>  {
>  	const struct kvm_regs *cpu_reset;
> +	int ret;
>
>  	switch (vcpu->arch.target) {
>  	default:
> @@ -120,6 +166,10 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>  	/* Reset core registers */
>  	memcpy(vcpu_gp_regs(vcpu), cpu_reset, sizeof(*cpu_reset));
>
> +	ret = kvm_reset_sve(vcpu);
> +	if (ret)
> +		return ret;
> +
>  	/* Reset system registers */
>  	kvm_reset_sys_regs(vcpu);
>
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 7c3c5cc..488ca56 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -953,6 +953,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_NESTED_STATE 157
>  #define KVM_CAP_ARM_INJECT_SERROR_ESR 158
>  #define KVM_CAP_MSR_PLATFORM_INFO 159
> +#define KVM_CAP_ARM_SVE 160
>
>  #ifdef KVM_CAP_IRQ_ROUTING
>
> @@ -1400,6 +1401,9 @@ struct kvm_enc_region {
>  #define KVM_GET_NESTED_STATE         _IOWR(KVMIO, 0xbe, struct kvm_nested_state)
>  #define KVM_SET_NESTED_STATE         _IOW(KVMIO,  0xbf, struct kvm_nested_state)
>
> +/* Available with KVM_CAP_ARM_SVE */
> +#define KVM_ARM_SVE_CONFIG	  _IOWR(KVMIO,  0xc0, struct kvm_sve_vls)
> +
>  /* Secure Encrypted Virtualization command */
>  enum sev_cmd_id {
>  	/* Guest initialization commands */


--
Alex Bennée
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 21/23] KVM: arm64/sve: allow KVM_ARM_SVE_CONFIG_QUERY on vm fd
  2018-09-28 13:39   ` Dave Martin
@ 2018-11-22 15:29     ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-22 15:29 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> Since userspace may need to decide on the set of vector lengths for
> the guest before setting up a vm, it is onerous to require a vcpu
> fd to be available first.  KVM_ARM_SVE_CONFIG_QUERY is not
> vcpu-dependent anyway, so this patch wires up KVM_ARM_SVE_CONFIG to
> be usable on a vm fd where appropriate.
>
> Subcommands that are vcpu-dependent (currently
> KVM_ARM_SVE_CONFIG_SET, KVM_ARM_SVE_CONFIG_GET) will return -EINVAL
> if invoked on a vm fd.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>

Apropos comments on last patch, this could go away if we went with just
reporting the max VL via the SVE capability probe which works on the
vmfd.
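
For what it's worth, the vm-fd path this patch adds would let userspace
do the query before any vcpu exists, along these lines (a sketch with
hypothetical fd variables; struct kvm_sve_vls and the cmd values are as
defined earlier in the series):

    struct kvm_sve_vls vls = { .cmd = KVM_ARM_SVE_CONFIG_QUERY };

    /* No vcpu fd needed: query the host's vector lengths on the vm fd */
    if (ioctl(vm_fd, KVM_ARM_SVE_CONFIG, &vls) == 0)
        printf("host supports up to VQ %u\n", vls.max_vq);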

> ---
>  arch/arm64/kvm/guest.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index f066b17..2313c22 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -574,6 +574,9 @@ static int kvm_vcpu_set_sve_vls(struct kvm_vcpu *vcpu, struct kvm_sve_vls *vls,
>  	unsigned int vq, max_vq;
>  	int ret;
>
> +	if (!vcpu)
> +		return -EINVAL; /* per-vcpu operation on vm fd */
> +
>  	if (vcpu->arch.has_run_once || vcpu_has_sve(vcpu))
>  		return -EBADFD; /* too late, or already configured */
>
> @@ -659,6 +662,9 @@ static int kvm_vcpu_query_sve_vls(struct kvm_vcpu *vcpu, struct kvm_sve_vls *vls
>  static int kvm_vcpu_get_sve_vls(struct kvm_vcpu *vcpu, struct kvm_sve_vls *vls,
>  		struct kvm_sve_vls __user *userp)
>  {
> +	if (!vcpu)
> +		return -EINVAL; /* per-vcpu operation on vm fd */
> +
>  	if (!vcpu_has_sve(vcpu))
>  		return -EBADFD; /* not configured yet */
>
> @@ -668,6 +674,7 @@ static int kvm_vcpu_get_sve_vls(struct kvm_vcpu *vcpu, struct kvm_sve_vls *vls,
>  			sve_vq_from_vl(vcpu->arch.sve_max_vl), userp);
>  }
>
> +/* vcpu may be NULL if this is called via a vm fd */
>  static int kvm_vcpu_sve_config(struct kvm_vcpu *vcpu,
>  			       struct kvm_sve_vls __user *userp)
>  {
> @@ -717,7 +724,15 @@ int kvm_arm_arch_vcpu_ioctl(struct kvm_vcpu *vcpu,
>  int kvm_arm_arch_vm_ioctl(struct kvm *kvm,
>  			  unsigned int ioctl, unsigned long arg)
>  {
> -	return -EINVAL;
> +	void __user *userp = (void __user *)arg;
> +
> +	switch (ioctl) {
> +	case KVM_ARM_SVE_CONFIG:
> +		return kvm_vcpu_sve_config(NULL, userp);
> +
> +	default:
> +		return -EINVAL;
> +	}
>  }
>
>  int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)


--
Alex Bennée
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 23/23] KVM: arm64/sve: Document KVM API extensions for SVE
  2018-09-28 13:39   ` Dave Martin
@ 2018-11-22 15:31     ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-22 15:31 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> This patch adds sections to the KVM API documentation describing
> the extensions for supporting the Scalable Vector Extension (SVE)
> in guests.
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  Documentation/virtual/kvm/api.txt | 142 +++++++++++++++++++++++++++++++++++++-
>  1 file changed, 139 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index a58067b..b8257d4 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -2054,13 +2054,21 @@ Specifically:
>    0x6030 0000 0010 004c SPSR_UND    64  spsr[KVM_SPSR_UND]
>    0x6030 0000 0010 004e SPSR_IRQ    64  spsr[KVM_SPSR_IRQ]
>    0x6060 0000 0010 0050 SPSR_FIQ    64  spsr[KVM_SPSR_FIQ]
> -  0x6040 0000 0010 0054 V0         128  fp_regs.vregs[0]
> -  0x6040 0000 0010 0058 V1         128  fp_regs.vregs[1]
> +  0x6040 0000 0010 0054 V0         128  fp_regs.vregs[0]    (*)
> +  0x6040 0000 0010 0058 V1         128  fp_regs.vregs[1]    (*)
>      ...
> -  0x6040 0000 0010 00d0 V31        128  fp_regs.vregs[31]
> +  0x6040 0000 0010 00d0 V31        128  fp_regs.vregs[31]   (*)
>    0x6020 0000 0010 00d4 FPSR        32  fp_regs.fpsr
>    0x6020 0000 0010 00d5 FPCR        32  fp_regs.fpcr
>
> +(*) These encodings are not accepted for SVE-enabled vcpus.  See
> +    KVM_ARM_SVE_CONFIG for details of how SVE support is configured for
> +    a vcpu.
> +
> +    The equivalent register content can be accessed via bits [2047:0] of

You mean [127:0] I think.

> +    the corresponding SVE Zn registers instead for vcpus that have SVE
> +    enabled (see below).
> +
>  arm64 CCSIDR registers are demultiplexed by CSSELR value:
>    0x6020 0000 0011 00 <csselr:8>
>
> @@ -2070,6 +2078,14 @@ arm64 system registers have the following id bit patterns:
>  arm64 firmware pseudo-registers have the following bit pattern:
>    0x6030 0000 0014 <regno:16>
>
> +arm64 SVE registers have the following bit patterns:
> +  0x6080 0000 0015 00 <n:5> <slice:5>   Zn bits[2048*slice + 2047 : 2048*slice]
> +  0x6050 0000 0015 04 <n:4> <slice:5>   Pn bits[256*slice + 255 : 256*slice]
> +  0x6050 0000 0015 060 <slice:5>        FFR bits[256*slice + 255 : 256*slice]
> +
> +  These registers are only accessible on SVE-enabled vcpus.  See
> +  KVM_ARM_SVE_CONFIG for details.
> +
>
>  MIPS registers are mapped using the lower 32 bits.  The upper 16 of that is
>  the register group type:
> @@ -3700,6 +3716,126 @@ Returns: 0 on success, -1 on error
>  This copies the vcpu's kvm_nested_state struct from userspace to the kernel.  For
>  the definition of struct kvm_nested_state, see KVM_GET_NESTED_STATE.
>
> +4.116 KVM_ARM_SVE_CONFIG
> +
> +Capability: KVM_CAP_ARM_SVE
> +Architectures: arm64
> +Type: vm and vcpu ioctl
> +Parameters: struct kvm_sve_vls (in/out)
> +Returns: 0 on success
> +Errors:
> +  EINVAL:    Unrecognised subcommand or bad arguments
> +  EBADFD:    vcpu in wrong state for request
> +             (KVM_ARM_SVE_CONFIG_SET, KVM_ARM_SVE_CONFIG_GET)
> +  ENOMEM:    Out of memory
> +  EFAULT:    Bad user address
> +
> +struct kvm_sve_vls {
> +	__u16 cmd;
> +	__u16 max_vq;
> +	__u16 _reserved[2];
> +	__u64 required_vqs[8];
> +};
> +
> +General:
> +
> +cmd: This ioctl supports a few different subcommands, selected by the
> +value of cmd (described in detail in the following sections).
> +
> +_reserved[]: these fields may be meaningful to later kernels.  For
> +forward compatibility, they must be zeroed before invoking this ioctl
> +for the first time on a given struct kvm_sve_vls object.  (So, memset()
> +it to zero before first use, or allocate with calloc() for example.)
> +
> +max_vq, required_vqs[]: encode a set of SVE vector lengths.  The set is
> +encoded as follows:
> +
> +If (a * 64 + b + 1) <= max_vq, then the bit represented by
> +
> +    required_vqs[a] & ((__u64)1 << b)
> +
> +(where a is in the range 0..7 and b is in the range 0..63)
> +indicates that the vector length (a * 64 + b + 1) * 128 bits is
> +supported (KVM_ARM_SVE_CONFIG_QUERY, KVM_ARM_SVE_CONFIG_GET) or required
> +(KVM_ARM_SVE_CONFIG_SET).
> +
> +If (a * 64 + b + 1) > max_vq, then the vector length
> +(a * 64 + b + 1) * 128 bits is unsupported or prohibited respectively.
> +In other words, only the first max_vq bits in required_vqs[] are
> +significant; remaining bits are implicitly treated as if they were zero.
> +
> +max_vq must be in the range SVE_VQ_MIN (1) to SVE_VQ_MAX (512).
> +
> +See Documentation/arm64/sve.txt for an explanation of vector lengths and
> +the meaning associated with "VQ".
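
As a worked example of the encoding above (just a sketch to check my
understanding, assuming vls was zeroed first): for a 512-bit vector
length, vq = 512 / 128 = 4, so a = (4 - 1) / 64 = 0 and
b = (4 - 1) % 64 = 3, giving:

    struct kvm_sve_vls vls = { 0 };

    vls.max_vq = 4;                          /* VL = 4 * 128 = 512 bits */
    vls.required_vqs[0] |= (__u64)1 << 3;    /* mark VQ 4 as required */
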
> +
> +Subcommands:
> +
> +/* values for cmd: */
> +#define KVM_ARM_SVE_CONFIG_QUERY	0 /* query what the host can support */
> +#define KVM_ARM_SVE_CONFIG_SET		1 /* enable SVE for vcpu and set VLs */
> +#define KVM_ARM_SVE_CONFIG_GET		2 /* read the set of VLs for a vcpu */
> +
> +Subcommand details:
> +
> +4.116.1 KVM_ARM_SVE_CONFIG_QUERY
> +Type: vm and vcpu
> +
> +Retrieve the full set of SVE vector lengths available for use by KVM
> +guests on this host.  The result is independent of which vcpu this
> +command is invoked on.  As a convenience, it may also be invoked on a
> +vm file descriptor, eliminating the need to create a vcpu first.
> +
> +4.116.2 KVM_ARM_SVE_CONFIG_SET
> +Type: vcpu only
> +
> +Enables SVE for the vcpu and sets the set of SVE vector lengths that
> +will be visible to the guest.
> +
> +This is the only way to enable SVE for a vcpu: if this command is not
> +invoked for a vcpu then SVE will not be available to the guest on this
> +vcpu.
> +
> +This subcommand is only permitted once per vcpu, before KVM_RUN has been
> +invoked for the vcpu for the first time.  Otherwise, the command fails
> +with -EBADFD and the state of the vcpu is not modified.
> +
> +In typical use, the user should call KVM_ARM_SVE_CONFIG_QUERY first to
> +populate a struct kvm_sve_vls with the full set of vector lengths
> +available on the host, then set cmd = KVM_ARM_SVE_CONFIG_SET and
> +re-issue the KVM_ARM_SVE_CONFIG ioctl on the desired vcpu.  This will
> +configure the best set of vector lengths available.  When following this
> +approach, the maximum available vector length can also be restricted by
> +reducing the value of max_vq before invoking KVM_ARM_SVE_CONFIG_SET.
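
That typical-use sequence, roughly, in code (a sketch with hypothetical
vm_fd/vcpu_fd variables):

    struct kvm_sve_vls vls = { .cmd = KVM_ARM_SVE_CONFIG_QUERY };

    ioctl(vm_fd, KVM_ARM_SVE_CONFIG, &vls);   /* full host set */
    vls.cmd = KVM_ARM_SVE_CONFIG_SET;         /* reduce vls.max_vq here if desired */
    ioctl(vcpu_fd, KVM_ARM_SVE_CONFIG, &vls); /* enable SVE on this vcpu */
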
> +
> +Every requested vector length in the struct kvm_sve_vls argument must be
> +supported by the hardware.  In addition, except for vector lengths
> +greater than the maximum requested vector length, every vector length
> +not requested must *not* be supported by the hardware.  (The latter
> +restriction may be relaxed in the future.)  If the requested set of
> +vector lengths is not supportable, the command fails with -EINVAL and
> +the state of the vcpu is not modified.
> +
> +Different vcpus of a vm may be configured with different sets of vector
> +lengths.  Equally, some vcpus may have SVE enabled and some not.
> +However, such configurations are not recommended except for testing and
> +experimentation purposes.  Architecturally compliant guest OSes will
> +work, but may or may not make effective use of the resulting
> +configuration.
> +
> +After a successful KVM_ARM_SVE_CONFIG_SET, KVM_ARM_SVE_CONFIG_GET can be
> +used to retrieve the configured set of vector lengths.
> +
> +4.116.3 KVM_ARM_SVE_CONFIG_GET
> +Type: vcpu only
> +
> +This subcommand returns the set of vector lengths enabled for the vcpu.
> +SVE must have been enabled and configured for this vcpu by a successful
> +prior KVM_ARM_SVE_CONFIG_SET call.  Otherwise, -EBADFD is returned.
> +
> +The state of the vcpu is unchanged.
> +
> +
>  5. The kvm_run structure
>  ------------------------


--
Alex Bennée
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 00/23] KVM: arm64: Initial support for SVE guests
  2018-09-28 13:39 ` Dave Martin
@ 2018-11-22 15:34   ` Alex Bennée
  -1 siblings, 0 replies; 154+ messages in thread
From: Alex Bennée @ 2018-11-22 15:34 UTC (permalink / raw)
  To: Dave Martin
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel


Dave Martin <Dave.Martin@arm.com> writes:

> This series implements basic support for allowing KVM guests to use the
> Arm Scalable Vector Extension (SVE).
>
> The patches are based on v4.19-rc5.
>
> The patches are also available on a branch for reviewer convenience. [1]
>
> This is a significant overhaul of the previous preliminary series [2],
> with the major changes outlined below, and additional minor updates in
> response to review feedback (see the individual patches for those).
>
> In the interest of getting this series out for review,
> This series is **completely untested**.

Richard is currently working on VHE support for QEMU and we already have
SVE system emulation support as of 3.1 so hopefully QEMU will be able to
test this soon (and probably shake out a few of our own bugs ;-).

> Reviewers should focus on the proposed API (but any other comments are
> of course welcome!)
<snip>

I've finished my pass for this revision. Sorry it took so long to get to
it.

--
Alex Bennée
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST
  2018-11-22 13:07               ` Christoffer Dall
@ 2018-11-23 17:42                 ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-11-23 17:42 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: tokamoto, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Nov 22, 2018 at 02:07:18PM +0100, Christoffer Dall wrote:
> On Thu, Nov 22, 2018 at 01:32:37PM +0100, Dave P Martin wrote:
> > On Thu, Nov 22, 2018 at 11:27:53AM +0000, Alex Bennée wrote:
> > > 
> > > Christoffer Dall <christoffer.dall@arm.com> writes:
> > > 
> > > > [Adding Peter and Alex for their view on the QEMU side]
> > > >
> > > > On Thu, Nov 15, 2018 at 05:27:11PM +0000, Dave Martin wrote:
> > > >> On Fri, Nov 02, 2018 at 09:16:25AM +0100, Christoffer Dall wrote:
> > > >> > On Fri, Sep 28, 2018 at 02:39:15PM +0100, Dave Martin wrote:
> > > >> > > KVM_GET_REG_LIST should only enumerate registers that are actually
> > > >> > > accessible, so it is necessary to filter out any register that is
> > > >> > > not exposed to the guest.  For features that are configured at
> > > >> > > runtime, this will require a dynamic check.
> > > >> > >
> > > >> > > For example, ZCR_EL1 and ID_AA64ZFR0_EL1 would need to be hidden
> > > >> > > if SVE is not enabled for the guest.
> > > >> >
> > > >> > This implies that userspace can never access this interface for a vcpu
> > > >> > before having decided whether such features are enabled for the guest or
> > > >> > not, since otherwise userspace will see different states for a VCPU
> > > >> > depending on sequencing of the API, which sounds fragile to me.
> > > >> >
> > > >> > That should probably be documented somewhere, and I hope the
> > > >> > enable/disable API for SVE in guests already takes that into account.
> > > >> >
> > > >> > Not sure if there's an action to take here, but it was the best place I
> > > >> > could raise this concern.
> > > >>
> > > >> Fair point.  I struggled to come up with something better that solves
> > > >> all problems.
> > > >>
> > > >> My expectation is that KVM_ARM_SVE_CONFIG_SET is considered part of
> > > >> creating the vcpu, so that if issued at all for a vcpu, it is issued
> > > >> very soon after KVM_VCPU_INIT.
> > > >>
> > > >> I think this worked OK with the current structure of kvmtool and I
> > > >> seem to remember discussing this with Peter Maydell re qemu -- but
> > > >> it sounds like I should double-check.
> > > >
> > > > QEMU does some thing around enumerating all the system registers exposed
> > > > by KVM and saving/restoring them as part of its startup, but I don't
> > > > remember the exact sequence.
> > > 
> > > QEMU does this for each vCPU as part of it's start-up sequence:
> > > 
> > >   kvm_init_vcpu
> > >     kvm_get_cpu (-> KVM_CREATE_VCPU)
> > >     KVM_GET_VCPU_MMAP_SIZE
> > >     kvm_arch_init_vcpu
> > >       kvm_arm_vcpu_init (-> KVM_ARM_VCPU_INIT)
> > >       kvm_get_one_reg(ARM_CPU_ID_MPIDR)
> > >       kvm_arm_init_debug (chk for KVM_CAP SET_GUEST_DEBUG/GUEST_DEBUG_HW_WPS/BPS)
> > >       kvm_arm_init_serror_injection (chk KVM_CAP_ARM_INJECT_SERROR_ESR)
> > >       kvm_arm_init_cpreg_list (KVM_GET_REG_LIST)
> > > 
> > > At this point we have the register list we need for
> > > kvm_arch_get_registers which is what we call every time we want to
> > > synchronise state. We only really do this for debug events, crashes and
> > > at some point when migrating.
> > 
> > So we would need to insert KVM_ARM_SVE_CONFIG_SET into this sequence,
> > meaning that the new capability is not strictly necessary.
> > 
> > I sympathise with Christoffer's view though that without the capability
> > mechanism it may be too easy for software to make mistakes: code
> > refactoring might swap the KVM_GET_REG_LIST and KVM_ARM_SVE_CONFIG ioctls
> > over and then things would go wrong with no immediate error indication.
> > 
> > In effect, the SVE regs would be missing from the list yielded by
> > KVM_GET_REG_LIST, possibly leading to silent migration failures.
> > 
> > I'm a bit uneasy about that.  Am I being too paranoid now?
> > 
> 
> No, we've made decisions in the past where we didn't enforce ordering
> which ended up being a huge pain (vgic lazy init, as a clear example of
> something really bad).  Of course, it's a tradeoff.  If it's a huge pain
> to implement, maybe things will be ok, but if it's just a read/write
> capability handshake, I think it's worth doing.

OK, I'll add the capability and enforcement in the respin.

We never came up with a way to extend KVM_VCPU_INIT that felt 100% right
for this case, and a capability is straightforward to reason about,
even if it's a bit clunky.
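
To make the ordering concrete, the intended userspace sequence would be
something like this (a hypothetical sketch only -- the sve_config
argument stands in for whatever structure KVM_ARM_SVE_CONFIG ends up
taking, and is not final ABI):

#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Returns 0 on success, -1 on error (errno set by ioctl()). */
static int vcpu_setup(int kvm_fd, int vcpu_fd, struct kvm_vcpu_init *init,
		      void *sve_config, struct kvm_reg_list *reg_list)
{
	if (ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, init))
		return -1;

	/* SVE must be configured before the reg list is queried: */
	if (ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_ARM_SVE) > 0 &&
	    ioctl(vcpu_fd, KVM_ARM_SVE_CONFIG, sve_config))
		return -1;

	/* Only now is KVM_GET_REG_LIST's output stable: */
	return ioctl(vcpu_fd, KVM_GET_REG_LIST, reg_list);
}

With the capability handshake enforced, getting this order wrong would
fail loudly instead of silently dropping the SVE regs from the list.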

Cheers
---Dave


* Re: [RFC PATCH v2 00/23] KVM: arm64: Initial support for SVE guests
  2018-11-22 15:34   ` Alex Bennée
@ 2018-12-04 15:50     ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-12-04 15:50 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Nov 22, 2018 at 03:34:16PM +0000, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > This series implements basic support for allowing KVM guests to use the
> > Arm Scalable Vector Extension (SVE).
> >
> > The patches are based on v4.19-rc5.
> >
> > The patches are also available on a branch for reviewer convenience. [1]
> >
> > This is a significant overhaul of the previous preliminary series [2],
> > with the major changes outlined below, and additional minor updates in
> > response to review feedback (see the individual patches for those).
> >
> > In the interest of getting this series out for review,
> > This series is **completely untested**.
> 
> Richard is currently working on VHE support for QEMU and we already have
> SVE system emulation support as of 3.1, so hopefully QEMU will be able to
> test this soon (and probably shake out a few of our own bugs ;-).

Awesome :)

Of course, there are no bugs on the kernel side... but if you find any
bugs that aren't there I'd like to know!

> > Reviewers should focus on the proposed API (but any other comments are
> > of course welcome!)
> <snip>
> 
> I've finished my pass for this revision. Sorry it took so long to get to
> it.

Much appreciated, thanks!

(I missed this mail myself when going over responses.)

Cheers
---Dave


* Re: [RFC PATCH v2 23/23] KVM: arm64/sve: Document KVM API extensions for SVE
  2018-11-22 15:31     ` Alex Bennée
@ 2018-12-05 17:59       ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-12-05 17:59 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Nov 22, 2018 at 03:31:51PM +0000, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > This patch adds sections to the KVM API documentation describing
> > the extensions for supporting the Scalable Vector Extension (SVE)
> > in guests.
> >
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >  Documentation/virtual/kvm/api.txt | 142 +++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 139 insertions(+), 3 deletions(-)
> >
> > diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> > index a58067b..b8257d4 100644
> > --- a/Documentation/virtual/kvm/api.txt
> > +++ b/Documentation/virtual/kvm/api.txt
> > @@ -2054,13 +2054,21 @@ Specifically:
> >    0x6030 0000 0010 004c SPSR_UND    64  spsr[KVM_SPSR_UND]
> >    0x6030 0000 0010 004e SPSR_IRQ    64  spsr[KVM_SPSR_IRQ]
> >    0x6060 0000 0010 0050 SPSR_FIQ    64  spsr[KVM_SPSR_FIQ]
> > -  0x6040 0000 0010 0054 V0         128  fp_regs.vregs[0]
> > -  0x6040 0000 0010 0058 V1         128  fp_regs.vregs[1]
> > +  0x6040 0000 0010 0054 V0         128  fp_regs.vregs[0]    (*)
> > +  0x6040 0000 0010 0058 V1         128  fp_regs.vregs[1]    (*)
> >      ...
> > -  0x6040 0000 0010 00d0 V31        128  fp_regs.vregs[31]
> > +  0x6040 0000 0010 00d0 V31        128  fp_regs.vregs[31]   (*)
> >    0x6020 0000 0010 00d4 FPSR        32  fp_regs.fpsr
> >    0x6020 0000 0010 00d5 FPCR        32  fp_regs.fpcr
> >
> > +(*) These encodings are not accepted for SVE-enabled vcpus.  See
> > +    KVM_ARM_SVE_CONFIG for details of how SVE support is configured for
> > +    a vcpu.
> > +
> > +    The equivalent register content can be accessed via bits [2047:0] of
> 
> You mean [127:0] I think.

Good spot.  Yes.

Possibly I meant "through the first slice of the corresponding Zn",
though [127:0] should make it clear both which slice and which bits
within the slice to look at.
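
(For reference, the architecture fixes Vn to alias the low 128 bits of
the corresponding Zn -- illustration only:

	Zn: [2047 ......................... 128 | 127 ........ 0]
	                                          `---- Vn ----'

so the [127:0] slice is exactly the old FPSIMD view of the register.)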

[...]

Cheers
---Dave


* Re: [RFC PATCH v2 19/23] KVM: arm64/sve: Report and enable SVE API extensions for userspace
  2018-11-22 15:23     ` Alex Bennée
@ 2018-12-05 18:22       ` Dave Martin
  -1 siblings, 0 replies; 154+ messages in thread
From: Dave Martin @ 2018-12-05 18:22 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Okamoto Takayuki, Christoffer Dall, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Nov 22, 2018 at 03:23:13PM +0000, Alex Bennée wrote:
> 
> Dave Martin <Dave.Martin@arm.com> writes:
> 
> > This patch adds the necessary API extensions to allow userspace to
> > detect SVE support for guests and enable it.
> >
> > A new capability KVM_CAP_ARM_SVE is defined to allow userspace to
> > detect the availability of the KVM SVE API extensions in the usual
> > way.
> >
> > Userspace needs to enable SVE explicitly per vcpu and configure the
> > set of SVE vector lengths available to the guest before the vcpu is
> > allowed to run.  For these purposes, a new arm64-specific vcpu
> > ioctl KVM_ARM_SVE_CONFIG is added, with the following subcommands
> > (in rough order of expected use):
> >
> > KVM_ARM_SVE_CONFIG_QUERY: report the set of vector lengths
> >     supported by this host.
> >
> >     The resulting set can be supplied directly to
> >     KVM_ARM_SVE_CONFIG_SET in order to obtain the maximal possible
> >     set, or used to inform userspace's decision on the appropriate
> >     set of vector lengths (possibly taking into account the
> >     configuration of other nodes in the cluster so that the VM can
> >     migrate freely).
> >
> > KVM_ARM_SVE_CONFIG_SET: enable SVE for this vcpu and configure the
> >     set of vector lengths it offers to the guest.
> >
> >     This can only be done once, before the vcpu is run.
> >
> > KVM_ARM_SVE_CONFIG_GET: report the set of vector lengths available
> >     to the guest on this vcpu (for use when snapshotting or
> >     migrating a VM).
> >
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >
> > Changes since RFCv1:
> >
> >  * The new feature bit for PREFERRED_TARGET / VCPU_INIT is gone in
> >    favour of a capability and a new ioctl to enable/configure SVE.
> >
> >    Perhaps the SVE configuration could be done via device attributes,
> >    but it still has to be done early, so crowbarring support for this
> >    behind a generic API may cause more trouble than it solves.
> >
> >    This is still up for discussion if anybody feels strongly about it.
> >
> >  * An ioctl KVM_ARM_SVE_CONFIG has been added to report the set of
> >    vector lengths available and configure SVE for a vcpu.
> >
> >    To reduce ioctl namespace pollution the new operations are grouped
> >    as subcommands under a single ioctl, since they use the same
> >    argument format anyway.
> > ---
> >  arch/arm64/include/asm/kvm_host.h |   8 +-
> >  arch/arm64/include/uapi/asm/kvm.h |  14 ++++
> >  arch/arm64/kvm/guest.c            | 164 +++++++++++++++++++++++++++++++++++++-
> >  arch/arm64/kvm/reset.c            |  50 ++++++++++++
> >  include/uapi/linux/kvm.h          |   4 +
> >  5 files changed, 238 insertions(+), 2 deletions(-)
> >

[...]

> > diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> > index e37c78b..c2edcde 100644
> > --- a/arch/arm64/kvm/reset.c
> > +++ b/arch/arm64/kvm/reset.c
> > @@ -19,10 +19,12 @@
> >   * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> >   */
> >
> > +#include <linux/atomic.h>
> >  #include <linux/errno.h>
> >  #include <linux/kvm_host.h>
> >  #include <linux/kvm.h>
> >  #include <linux/hw_breakpoint.h>
> > +#include <linux/string.h>
> >
> >  #include <kvm/arm_arch_timer.h>
> >
> > @@ -54,6 +56,31 @@ static bool cpu_has_32bit_el1(void)
> >  	return !!(pfr0 & 0x20);
> >  }
> >
> > +#ifdef CONFIG_ARM64_SVE
> > +bool kvm_sve_supported(void)
> > +{
> > +	static bool warn_printed = false;
> > +
> > +	if (!system_supports_sve())
> > +		return false;
> > +
> > +	/*
> > +	 * For now, consider the hardware broken if implementation
> > +	 * differences between CPUs in the system result in the set of
> > +	 * vector lengths safely virtualisable for guests being less
> > +	 * than the set provided to userspace:
> > +	 */
> > +	if (sve_max_virtualisable_vl != sve_max_vl) {
> > +		if (!xchg(&warn_printed, true))
> > +			kvm_err("Hardware SVE implementations mismatched: suppressing SVE for guests.");
> 
> This seems like you are re-inventing WARN_ONCE for the sake of having
> "kvm [%i]: " in your printk string.

Yes... adding a kvm_err_once() just for this seemed overkill.

Perhaps we don't really need to print the PID here anyway: this is a
system issue, not something that specific KVM instance did wrong, so
it's misleading to confine the warning to a specific PID.

So maybe I should just do a bare printk_once(KERN_ERR "kvm: " ...)
instead?
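
i.e., roughly (sketch):

	if (sve_max_virtualisable_vl != sve_max_vl) {
		printk_once(KERN_ERR
			    "kvm: Hardware SVE implementations mismatched: suppressing SVE for guests.\n");
		return false;
	}

which would also get rid of the warn_printed flag and the xchg().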

> 
> > +
> > +		return false;
> > +	}
> > +
> > +	return true;
> > +}
> > +#endif
> > +
> >  /**
> >   * kvm_arch_dev_ioctl_check_extension
> >   *
> > @@ -85,6 +112,9 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
> >  	case KVM_CAP_VCPU_EVENTS:
> >  		r = 1;
> >  		break;
> > +	case KVM_CAP_ARM_SVE:
> > +		r = kvm_sve_supported();
> > +		break;
> 
> For debugging we actually use the return value to indicate how many
> WP/BPs we have. We could do the same here for max number of VQs but I
> guess KVM_ARM_SVE_CONFIG_QUERY reports a much richer set of information.
> However, this does raise the question of how useful all this extra
> information is to the guest program.

As elaborated below, while I agree that the max VQ is potentially useful
to return here, I don't think it's enough.  There's a risk people will
get lazy and just guess which VQs <= the max are supported from the
return here.

So I prefer to return nothing, to avoid giving a false comfort to the
caller.

> 
> A dumber implementation would be:
> 
> QEMU                 |        Kernel
> 
> KVM_CAP_ARM_SVE  --------->
>                               Max VQ=n
>            VQ/0  <---------
> 
> We want n < max VQ
> 
> KVM_ARM_SVE_CONFIG(n) ---->
>                               Unsupported VQ
>            EINVAL <--------
> 
> Weird HW can't support our choice of n.
> Give up or try another value.
> 
> KVM_ARM_SVE_CONFIG(n-1) --->
>                               That's OK
>            0 (OK) <---------
> 
> It imposes more heavy lifting on the userspace side of things, but I
> would expect the "normal" case to be that sane hardware supports all
> VLs from Max VQ down to 1. And for cases where it doesn't, iterating
> through several KVM_ARM_SVE_CONFIG steps is a start-up cost, not a
> runtime one.

The architecture only mandates power-of-two vector lengths: an
implementation that supports, say, 128-, 256- and 512-bit vectors but
not 384-bit ones is perfectly valid, and I would not be surprised to
see hardware that takes advantage of that.

> This would mean one capability and one SVE_CONFIG sub-command with a single
> parameter. We could always extend the interface later, but I wonder if
> we are gold-plating the API too early here?

I agree this is simpler and would typically work, but I'm not confident
that max VQ is sufficient alone to avoid silently letting
incompatibilities through.

That's why I designed the interface to specify the set of VQs explicitly
rather than just specifying the maximum and accepting whatever other VQs
come along with it.

The problem is that if you create a VM on a node that only supports
power-of-two vector lengths (say), and then migrate to a node that
supports the same or greater max VQ but additional non-power-of-two
vector lengths below that, then weird things are going to happen in the
guest -- yet we would silently allow that here.

Of course, we could throw this into the "you are supposed to know what
you are doing and build a sane compute cluster" bucket, but I'd prefer
to be able to flag up obvious incompatibilities explicitly.
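
To illustrate the kind of check that explicit sets make possible (a
hypothetical sketch -- the names and the bitmap encoding are mine, not
the proposed ABI):

#include <stdbool.h>
#include <stdint.h>

/*
 * VQ sets as bitmaps: bit (vq - 1) is set when vector quantum vq
 * (i.e., a vq * 128 -bit vector length) is available.
 * Assumes guest_vqs != 0.
 */
static bool can_migrate(uint64_t guest_vqs, uint64_t dst_vqs)
{
	/* Mask covering every VQ up to and including the guest's largest: */
	int top = 63 - __builtin_clzll(guest_vqs);
	uint64_t upto_max = (top == 63) ? ~0ULL : ((1ULL << (top + 1)) - 1);

	/*
	 * Below its maximum the guest sees whatever the hardware
	 * implements, so the destination's set, truncated at the guest's
	 * maximum, must match the configured set exactly.  Comparing max
	 * VQ alone would miss mismatches further down.
	 */
	return (dst_vqs & upto_max) == guest_vqs;
}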

> What do the maintainers think?
> 
> 
> >  	default:
> >  		r = 0;
> >  	}
> > @@ -92,6 +122,21 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
> >  	return r;
> >  }
> >
> > +int kvm_reset_sve(struct kvm_vcpu *vcpu)
> > +{
> > +	if (!vcpu_has_sve(vcpu))
> > +		return 0;
> > +
> > +	if (WARN_ON(!vcpu->arch.sve_state ||
> > +		    !sve_vl_valid(vcpu->arch.sve_max_vl)))
> > +		return -EIO;
> 
> For some reason using WARN_ON for side effects seems sketchy, but while
> BUG_ON can compile away to nothing, it seems WARN_ON has been designed to
> always give you the result of the condition, so never mind...

I think this is a common idiom in the kernel.

I would definitely agree that WARN_ON(expr) (and especially
BUG_ON(expr)) is poor practice if expr has side-effects, but as you say,
WARN_ON() seems to have been designed explicitly to let the check
expression show through.

If nothing else, this avoids the possibility of typoing the expression
between the WARN_ON() and the accompanying if ().
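
That is (illustration):

	if (WARN_ON(!vcpu->arch.sve_state))	/* check and warn in one place */
		return -EIO;

rather than:

	WARN_ON(!vcpu->arch.sve_state);
	if (!vcpu->arch.sve_state)		/* same expression, typed twice */
		return -EIO;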

That's just my opinion, but I'll probably stick with it in its current
form unless somebody shouts.

Cheers
---Dave



Thread overview:
2018-09-28 13:39 [RFC PATCH v2 00/23] KVM: arm64: Initial support for SVE guests Dave Martin
2018-09-28 13:39 ` [RFC PATCH v2 01/23] arm64: fpsimd: Always set TIF_FOREIGN_FPSTATE on task state flush Dave Martin
2018-09-28 13:39 ` [RFC PATCH v2 02/23] KVM: arm64: Delete orphaned declaration for __fpsimd_enabled() Dave Martin
2018-09-28 13:39 ` [RFC PATCH v2 03/23] KVM: arm64: Refactor kvm_arm_num_regs() for easier maintenance Dave Martin
2018-09-28 13:39 ` [RFC PATCH v2 04/23] KVM: arm64: Add missing #include of <linux/bitmap.h> to kvm_host.h Dave Martin
2018-09-28 13:39 ` [RFC PATCH v2 05/23] KVM: arm: Add arch vcpu uninit hook Dave Martin
2018-11-02  8:05   ` Christoffer Dall
2018-11-15 16:40     ` Dave Martin
2018-11-20 10:56       ` Christoffer Dall
2018-09-28 13:39 ` [RFC PATCH v2 06/23] arm64/sve: Check SVE virtualisability Dave Martin
2018-11-15 15:39   ` Alex Bennée
2018-11-15 17:09     ` Dave Martin
2018-11-16 12:32       ` Alex Bennée
2018-11-16 15:09         ` Dave Martin
2018-09-28 13:39 ` [RFC PATCH v2 07/23] arm64/sve: Enable SVE state tracking for non-task contexts Dave Martin
2018-09-28 13:39 ` [RFC PATCH v2 08/23] KVM: arm64: Add a vcpu flag to control SVE visibility for the guest Dave Martin
2018-11-15 15:44   ` Alex Bennée
2018-09-28 13:39 ` [RFC PATCH v2 09/23] KVM: arm64: Propagate vcpu into read_id_reg() Dave Martin
2018-11-15 15:56   ` Alex Bennée
2018-09-28 13:39 ` [RFC PATCH v2 10/23] KVM: arm64: Extend reset_unknown() to handle mixed RES0/UNKNOWN registers Dave Martin
2018-11-02  8:11   ` Christoffer Dall
2018-11-15 17:11     ` Dave Martin
2018-09-28 13:39 ` [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST Dave Martin
2018-11-02  8:16   ` Christoffer Dall
2018-11-15 17:27     ` Dave Martin
2018-11-22 10:53       ` Christoffer Dall
2018-11-22 11:13         ` Peter Maydell
2018-11-22 12:34           ` Christoffer Dall
2018-11-22 12:59             ` Peter Maydell
2018-11-22 11:27         ` Alex Bennée
2018-11-22 12:32           ` Dave P Martin
2018-11-22 13:07             ` Christoffer Dall
2018-11-23 17:42               ` Dave Martin
2018-09-28 13:39 ` [RFC PATCH v2 12/23] KVM: arm64/sve: System register context switch and access support Dave Martin
2018-11-15 16:37   ` Alex Bennée
2018-11-15 17:59     ` Dave Martin
2018-09-28 13:39 ` [RFC PATCH v2 13/23] KVM: arm64/sve: Context switch the SVE registers Dave Martin
2018-11-19 16:36   ` Alex Bennée
2018-11-19 17:03     ` Dave Martin
2018-11-20 12:25       ` Alex Bennée
2018-11-20 14:17         ` Dave Martin
2018-11-20 15:30           ` Alex Bennée
2018-11-20 17:18             ` Dave Martin
2018-09-28 13:39 ` [RFC PATCH v2 14/23] KVM: Allow 2048-bit register access via ioctl interface Dave Martin
2018-11-19 16:48   ` Alex Bennée
2018-11-19 17:07     ` Dave Martin
2018-11-20 11:20       ` Alex Bennée
2018-09-28 13:39 ` [RFC PATCH v2 15/23] KVM: arm64/sve: Add SVE support to register access " Dave Martin
2018-11-21 15:20   ` Alex Bennée
2018-11-21 18:05     ` Dave Martin
2018-09-28 13:39 ` [RFC PATCH v2 16/23] KVM: arm64: Enumerate SVE register indices for KVM_GET_REG_LIST Dave Martin
2018-11-21 16:09   ` Alex Bennée
2018-11-21 16:32     ` Dave Martin
2018-11-21 16:49       ` Alex Bennée
2018-11-21 17:46         ` Dave Martin
2018-09-28 13:39 ` [RFC PATCH v2 17/23] arm64/sve: In-kernel vector length availability query interface Dave Martin
2018-11-21 16:16   ` Alex Bennée
2018-11-21 16:35     ` Dave Martin
2018-11-21 16:46       ` Alex Bennée
2018-09-28 13:39 ` [RFC PATCH v2 18/23] KVM: arm64: Add arch vcpu ioctl hook Dave Martin
2018-11-02  8:30   ` Christoffer Dall
2018-09-28 13:39 ` [RFC PATCH v2 19/23] KVM: arm64/sve: Report and enable SVE API extensions for userspace Dave Martin
2018-11-22 15:23   ` Alex Bennée
2018-12-05 18:22     ` Dave Martin
2018-09-28 13:39 ` [RFC PATCH v2 20/23] KVM: arm64: Add arch vm ioctl hook Dave Martin
2018-11-02  8:32   ` Christoffer Dall
2018-11-15 18:04     ` Dave Martin
2018-11-20 10:58       ` Christoffer Dall
2018-11-20 14:19         ` Dave Martin
2018-09-28 13:39 ` [RFC PATCH v2 21/23] KVM: arm64/sve: allow KVM_ARM_SVE_CONFIG_QUERY on vm fd Dave Martin
2018-11-22 15:29   ` Alex Bennée
2018-09-28 13:39 ` [RFC PATCH v2 22/23] KVM: Documentation: Document arm64 core registers in detail Dave Martin
2018-09-28 13:39 ` [RFC PATCH v2 23/23] KVM: arm64/sve: Document KVM API extensions for SVE Dave Martin
2018-11-22 15:31   ` Alex Bennée
2018-12-05 17:59     ` Dave Martin
2018-11-22 15:34 ` [RFC PATCH v2 00/23] KVM: arm64: Initial support for SVE guests Alex Bennée
2018-12-04 15:50   ` Dave Martin
