KVM ARM Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/2] Improve handling of stage 2 aborts without instruction decode
@ 2019-09-09 12:13 Christoffer Dall
  2019-09-09 12:13 ` [PATCH 1/2] KVM: arm/arm64: Allow reporting non-ISV data aborts to userspace Christoffer Dall
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Christoffer Dall @ 2019-09-09 12:13 UTC (permalink / raw)
  To: kvmarm
  Cc: Daniel P. Berrangé,
	Marc Zyngier, Stefan Hajnoczi, Heinrich Schuchardt,
	linux-arm-kernel

When a guest accesses memory outside the memory slots, KVM usually
bounces the access back to userspace with KVM_EXIT_MMIO.  However, on
arm/arm64 systems, certain load/store instructions did not provide
decoding info for the hypervisor to emulate the instruction, and in this
case KVM has rather rudely returned -ENOSYS and printed a not overly
helpful error message:

  load/store instruction decoding not implemented

This patch series improves the error message and allows userspace to be
notified of this event instead of receiving -ENOSYS, and also allows
userspace to ask KVM to inject an external abort to the guest, which it
can use for any memory access that it either cannot handle.

One remaining case which this patch set does not address is if the guest
accesses an in-kernel emulated device, such as the VGIC, but using a
load/store instruction which doesn't provide decode info.  With these
patches, this will return to userspace for it to handle, but there's no
way for userspace to return the decoding information to KVM and have KVM
complete the access to the in-kernel emulated device.  I have no plans
to address this limitation.

Christoffer Dall (2):
  KVM: arm/arm64: Allow reporting non-ISV data aborts to userspace
  KVM: arm/arm64: Allow user injection of external data aborts

 Documentation/virt/kvm/api.txt       | 44 +++++++++++++++++++++++++++-
 arch/arm/include/asm/kvm_arm.h       |  2 ++
 arch/arm/include/asm/kvm_emulate.h   |  5 ++++
 arch/arm/include/asm/kvm_host.h      |  8 +++++
 arch/arm/include/uapi/asm/kvm.h      |  3 +-
 arch/arm/kvm/guest.c                 |  3 ++
 arch/arm64/include/asm/kvm_emulate.h |  5 ++++
 arch/arm64/include/asm/kvm_host.h    |  8 +++++
 arch/arm64/include/uapi/asm/kvm.h    |  3 +-
 arch/arm64/kvm/guest.c               |  3 ++
 arch/arm64/kvm/inject_fault.c        |  4 +--
 include/uapi/linux/kvm.h             |  8 +++++
 virt/kvm/arm/arm.c                   | 22 ++++++++++++++
 virt/kvm/arm/mmio.c                  | 11 +++++--
 14 files changed, 122 insertions(+), 7 deletions(-)

-- 
2.17.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] KVM: arm/arm64: Allow reporting non-ISV data aborts to userspace
  2019-09-09 12:13 [PATCH 0/2] Improve handling of stage 2 aborts without instruction decode Christoffer Dall
@ 2019-09-09 12:13 ` Christoffer Dall
  2019-09-09 12:13 ` [PATCH 2/2] KVM: arm/arm64: Allow user injection of external data aborts Christoffer Dall
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Christoffer Dall @ 2019-09-09 12:13 UTC (permalink / raw)
  To: kvmarm
  Cc: Daniel P. Berrangé,
	Marc Zyngier, Stefan Hajnoczi, Heinrich Schuchardt,
	linux-arm-kernel

For a long time, if a guest accessed memory outside of a memslot using
any of the load/store instructions in the architecture which doesn't
supply decoding information in the ESR_EL2 (the ISV bit is not set), the
kernel would print the following message and terminate the VM as a
result of returning -ENOSYS to userspace:

  load/store instruction decoding not implemented

The reason behind this message is that KVM assumes that all accesses
outside a memslot is an MMIO access which should be handled by
userspace, and we originally expected to eventually implement some sort
of decoding of load/store instructions where the ISV bit was not set.

However, it turns out that many of the instructions which don't provide
decoding information on abort are not safe to use for MMIO accesses, and
the remaining few that would potentially make sense to use on MMIO
accesses, such as those with register writeback, are not used in
practice.  It also turns out that fetching an instruction from guest
memory can be a pretty horrible affair, involving stopping all CPUs on
SMP systems, handling multiple corner cases of address translation in
software, and more.  It doesn't appear likely that we'll ever implement
this in the kernel.

What is much more common is that a user has misconfigured his/her guest
and is actually not accessing an MMIO region, but just hitting some
random hole in the IPA space.  In this scenario, the error message above
is almost misleading and has led to a great deal of confusion over the
years.

It is, nevertheless, ABI to userspace, and we therefore need to
introduce a new capability that userspace explicitly enables to change
behavior.

This patch introduces KVM_CAP_ARM_NISV_TO_USER (NISV meaning Non-ISV)
which does exactly that, and introduces a new exit reason to report the
event to userspace.  User space can then emulate an exception to the
guest, restart the guest, suspend the guest, or take any other
appropriate action as per the policy of the running system.

Reported-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
---
 Documentation/virt/kvm/api.txt       | 29 ++++++++++++++++++++++++++++
 arch/arm/include/asm/kvm_arm.h       |  2 ++
 arch/arm/include/asm/kvm_emulate.h   |  5 +++++
 arch/arm/include/asm/kvm_host.h      |  8 ++++++++
 arch/arm64/include/asm/kvm_emulate.h |  5 +++++
 arch/arm64/include/asm/kvm_host.h    |  8 ++++++++
 include/uapi/linux/kvm.h             |  7 +++++++
 virt/kvm/arm/arm.c                   | 21 ++++++++++++++++++++
 virt/kvm/arm/mmio.c                  | 11 +++++++++--
 9 files changed, 94 insertions(+), 2 deletions(-)

diff --git a/Documentation/virt/kvm/api.txt b/Documentation/virt/kvm/api.txt
index 2d067767b617..02501333f746 100644
--- a/Documentation/virt/kvm/api.txt
+++ b/Documentation/virt/kvm/api.txt
@@ -4453,6 +4453,35 @@ Hyper-V SynIC state change. Notification is used to remap SynIC
 event/message pages and to enable/disable SynIC messages/events processing
 in userspace.
 
+		/* KVM_EXIT_ARM_NISV */
+		struct {
+			__u64 esr_iss;
+			__u64 fault_ipa;
+		} arm_nisv;
+
+Used on arm and arm64 systems. If a guest accesses memory not in a memslot,
+KVM will typically return to userspace and ask it to do MMIO emulation on its
+behalf. However, for certain classes of instructions, no instruction decode
+(direction, length of memory access) is provided, and fetching and decoding
+the instruction from the VM is overly complicated to live in the kernel.
+
+Historically, when this situation occurred, KVM would print a warning and kill
+the VM. KVM assumed that if the guest accessed non-memslot memory, it was
+trying to do I/O, which just couldn't be emulated, and the warning message was
+phrased accordingly. However, what happened more often was that a guest bug
+caused access outside the guest memory areas which should lead to a more
+mearningful warning message and an external abort in the guest, if the access
+did not fall within an I/O window.
+
+Userspace implementations can query for KVM_CAP_ARM_NISV_TO_USER, and enable
+this capability at VM creation. Once this is done, these types of errors will
+instead return to userspace with KVM_EXIT_ARM_NISV, with the valid bits from
+the HSR (arm) and ESR_EL2 (arm64) in the esr_iss field, and the faulting IPA
+in the fault_ipa field. Userspace can either fix up the access if it's
+actually an I/O access by decoding the instruction from guest memory (if it's
+very brave) and continue executing the guest, or it can decide to suspend,
+dump, or restart the guest.
+
 		/* Fix the size of the union. */
 		char padding[256];
 	};
diff --git a/arch/arm/include/asm/kvm_arm.h b/arch/arm/include/asm/kvm_arm.h
index 0125aa059d5b..ce61b3b0058d 100644
--- a/arch/arm/include/asm/kvm_arm.h
+++ b/arch/arm/include/asm/kvm_arm.h
@@ -162,6 +162,8 @@
 #define HSR_ISV		(_AC(1, UL) << HSR_ISV_SHIFT)
 #define HSR_SRT_SHIFT	(16)
 #define HSR_SRT_MASK	(0xf << HSR_SRT_SHIFT)
+#define HSR_CM		(1 << 8)
+#define HSR_WNR		(1 << 6)
 #define HSR_FSC		(0x3f)
 #define HSR_FSC_TYPE	(0x3c)
 #define HSR_SSE		(1 << 21)
diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h
index 40002416efec..e8ef349c04b4 100644
--- a/arch/arm/include/asm/kvm_emulate.h
+++ b/arch/arm/include/asm/kvm_emulate.h
@@ -167,6 +167,11 @@ static inline bool kvm_vcpu_dabt_isvalid(struct kvm_vcpu *vcpu)
 	return kvm_vcpu_get_hsr(vcpu) & HSR_ISV;
 }
 
+static inline unsigned long kvm_vcpu_dabt_iss_nisv_sanitized(const struct kvm_vcpu *vcpu)
+{
+	return kvm_vcpu_get_hsr(vcpu) & (HSR_CM | HSR_WNR | HSR_FSC);
+}
+
 static inline bool kvm_vcpu_dabt_iswrite(struct kvm_vcpu *vcpu)
 {
 	return kvm_vcpu_get_hsr(vcpu) & HSR_WNR;
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 8a37c8e89777..19a92c49039c 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -76,6 +76,14 @@ struct kvm_arch {
 
 	/* Mandated version of PSCI */
 	u32 psci_version;
+
+	/*
+	 * If we encounter a data abort without valid instruction syndrome
+	 * information, report this to user space.  User space can (and
+	 * should) opt in to this feature if KVM_CAP_ARM_NISV_TO_USER is
+	 * supported.
+	 */
+	bool return_nisv_io_abort_to_user;
 };
 
 #define KVM_NR_MEM_OBJS     40
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index d69c1efc63e7..a3c967988e1d 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -258,6 +258,11 @@ static inline bool kvm_vcpu_dabt_isvalid(const struct kvm_vcpu *vcpu)
 	return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_ISV);
 }
 
+static inline unsigned long kvm_vcpu_dabt_iss_nisv_sanitized(const struct kvm_vcpu *vcpu)
+{
+	return kvm_vcpu_get_hsr(vcpu) & (ESR_ELx_CM | ESR_ELx_WNR | ESR_ELx_FSC);
+}
+
 static inline bool kvm_vcpu_dabt_issext(const struct kvm_vcpu *vcpu)
 {
 	return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_SSE);
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index f656169db8c3..019bc560edc1 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -83,6 +83,14 @@ struct kvm_arch {
 
 	/* Mandated version of PSCI */
 	u32 psci_version;
+
+	/*
+	 * If we encounter a data abort without valid instruction syndrome
+	 * information, report this to user space.  User space can (and
+	 * should) opt in to this feature if KVM_CAP_ARM_NISV_TO_USER is
+	 * supported.
+	 */
+	bool return_nisv_io_abort_to_user;
 };
 
 #define KVM_NR_MEM_OBJS     40
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 5e3f12d5359e..dd79235b6435 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -235,6 +235,7 @@ struct kvm_hyperv_exit {
 #define KVM_EXIT_S390_STSI        25
 #define KVM_EXIT_IOAPIC_EOI       26
 #define KVM_EXIT_HYPERV           27
+#define KVM_EXIT_ARM_NISV         28
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -392,6 +393,11 @@ struct kvm_run {
 		} eoi;
 		/* KVM_EXIT_HYPERV */
 		struct kvm_hyperv_exit hyperv;
+		/* KVM_EXIT_ARM_NISV */
+		struct {
+			__u64 esr_iss;
+			__u64 fault_ipa;
+		} arm_nisv;
 		/* Fix the size of the union. */
 		char padding[256];
 	};
@@ -996,6 +1002,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_ARM_PTRAUTH_ADDRESS 171
 #define KVM_CAP_ARM_PTRAUTH_GENERIC 172
 #define KVM_CAP_PMU_EVENT_FILTER 173
+#define KVM_CAP_ARM_NISV_TO_USER 174
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 35a069815baf..7153504bb106 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -98,6 +98,26 @@ int kvm_arch_check_processor_compat(void)
 	return 0;
 }
 
+int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
+			    struct kvm_enable_cap *cap)
+{
+	int r;
+
+	if (cap->flags)
+		return -EINVAL;
+
+	switch (cap->cap) {
+	case KVM_CAP_ARM_NISV_TO_USER:
+		r = 0;
+		kvm->arch.return_nisv_io_abort_to_user = true;
+		break;
+	default:
+		r = -EINVAL;
+		break;
+	}
+
+	return r;
+}
 
 /**
  * kvm_arch_init_vm - initializes a VM data structure
@@ -196,6 +216,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_MP_STATE:
 	case KVM_CAP_IMMEDIATE_EXIT:
 	case KVM_CAP_VCPU_EVENTS:
+	case KVM_CAP_ARM_NISV_TO_USER:
 		r = 1;
 		break;
 	case KVM_CAP_ARM_SET_DEVICE_ADDR:
diff --git a/virt/kvm/arm/mmio.c b/virt/kvm/arm/mmio.c
index 6af5c91337f2..7b92e2744fa7 100644
--- a/virt/kvm/arm/mmio.c
+++ b/virt/kvm/arm/mmio.c
@@ -167,8 +167,15 @@ int io_mem_abort(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		if (ret)
 			return ret;
 	} else {
-		kvm_err("load/store instruction decoding not implemented\n");
-		return -ENOSYS;
+		if (vcpu->kvm->arch.return_nisv_io_abort_to_user) {
+			run->exit_reason = KVM_EXIT_ARM_NISV;
+			run->arm_nisv.esr_iss = kvm_vcpu_dabt_iss_nisv_sanitized(vcpu);
+			run->arm_nisv.fault_ipa = fault_ipa;
+			return 0;
+		} else {
+			kvm_info("Encountered data abort outside memslots with no valid syndrome info\n");
+			return -ENOSYS;
+		}
 	}
 
 	rt = vcpu->arch.mmio_decode.rt;
-- 
2.17.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 2/2] KVM: arm/arm64: Allow user injection of external data aborts
  2019-09-09 12:13 [PATCH 0/2] Improve handling of stage 2 aborts without instruction decode Christoffer Dall
  2019-09-09 12:13 ` [PATCH 1/2] KVM: arm/arm64: Allow reporting non-ISV data aborts to userspace Christoffer Dall
@ 2019-09-09 12:13 ` Christoffer Dall
  2019-09-09 12:32   ` Peter Maydell
  2019-09-09 12:13 ` [kvmtool PATCH 3/5] update headers: Update the KVM headers for new Arm fault reporting features Christoffer Dall
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Christoffer Dall @ 2019-09-09 12:13 UTC (permalink / raw)
  To: kvmarm
  Cc: Daniel P. Berrangé,
	Marc Zyngier, Stefan Hajnoczi, Heinrich Schuchardt,
	linux-arm-kernel

In some scenarios, such as buggy guest or incorrect configuration of the
VMM and firmware description data, userspace will detect a memory access
to a portion of the IPA, which is not mapped to any MMIO region.

For this purpose, the appropriate action is to inject an external abort
to the guest.  The kernel already has functionality to inject an
external abort, but we need to wire up a signal from user space that
lets user space tell the kernel to do this.

It turns out, we already have the set event functionality which we can
perfectly reuse for this.

Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
---
 Documentation/virt/kvm/api.txt    | 15 ++++++++++++++-
 arch/arm/include/uapi/asm/kvm.h   |  3 ++-
 arch/arm/kvm/guest.c              |  3 +++
 arch/arm64/include/uapi/asm/kvm.h |  3 ++-
 arch/arm64/kvm/guest.c            |  3 +++
 arch/arm64/kvm/inject_fault.c     |  4 ++--
 include/uapi/linux/kvm.h          |  1 +
 virt/kvm/arm/arm.c                |  1 +
 8 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/Documentation/virt/kvm/api.txt b/Documentation/virt/kvm/api.txt
index 02501333f746..edd6cdc470ca 100644
--- a/Documentation/virt/kvm/api.txt
+++ b/Documentation/virt/kvm/api.txt
@@ -955,6 +955,8 @@ The following bits are defined in the flags field:
 
 ARM/ARM64:
 
+User space may need to inject several types of events to the guest.
+
 If the guest accesses a device that is being emulated by the host kernel in
 such a way that a real device would generate a physical SError, KVM may make
 a virtual SError pending for that VCPU. This system error interrupt remains
@@ -989,12 +991,23 @@ Specifying exception.has_esr on a system that does not support it will return
 -EINVAL. Setting anything other than the lower 24bits of exception.serror_esr
 will return -EINVAL.
 
+If the guest performed an access to I/O memory which could not be handled by
+user space, for example because of missing instruction syndrome decode
+information or because there is no device mapped at the accessed IPA, then
+user space can ask the kernel to inject an external abort using the address
+from the exiting fault on the VCPU. It is a programming error to set
+ext_dabt_pending at the same time as any of the serror fields, or to set
+ext_dabt_pending on an exit which was not either KVM_EXIT_MMIO or
+KVM_EXIT_ARM_NISV. This feature is only available if the system supports
+KVM_CAP_ARM_INJECT_EXT_DABT;
+
 struct kvm_vcpu_events {
 	struct {
 		__u8 serror_pending;
 		__u8 serror_has_esr;
+		__u8 ext_dabt_pending;
 		/* Align it to 8 bytes */
-		__u8 pad[6];
+		__u8 pad[5];
 		__u64 serror_esr;
 	} exception;
 	__u32 reserved[12];
diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
index a4217c1a5d01..d2449a5bf8d5 100644
--- a/arch/arm/include/uapi/asm/kvm.h
+++ b/arch/arm/include/uapi/asm/kvm.h
@@ -131,8 +131,9 @@ struct kvm_vcpu_events {
 	struct {
 		__u8 serror_pending;
 		__u8 serror_has_esr;
+		__u8 ext_dabt_pending;
 		/* Align it to 8 bytes */
-		__u8 pad[6];
+		__u8 pad[5];
 		__u64 serror_esr;
 	} exception;
 	__u32 reserved[12];
diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
index 684cf64b4033..4154c5589501 100644
--- a/arch/arm/kvm/guest.c
+++ b/arch/arm/kvm/guest.c
@@ -263,11 +263,14 @@ int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
 {
 	bool serror_pending = events->exception.serror_pending;
 	bool has_esr = events->exception.serror_has_esr;
+	bool has_ext_dabt_pending = events->exception.ext_dabt_pending;
 
 	if (serror_pending && has_esr)
 		return -EINVAL;
 	else if (serror_pending)
 		kvm_inject_vabt(vcpu);
+	else if (has_ext_dabt_pending)
+		kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
 
 	return 0;
 }
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 9a507716ae2f..7729efdb1c0c 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -164,8 +164,9 @@ struct kvm_vcpu_events {
 	struct {
 		__u8 serror_pending;
 		__u8 serror_has_esr;
+		__u8 ext_dabt_pending;
 		/* Align it to 8 bytes */
-		__u8 pad[6];
+		__u8 pad[5];
 		__u64 serror_esr;
 	} exception;
 	__u32 reserved[12];
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index dfd626447482..10e6e2144dca 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -720,6 +720,7 @@ int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
 {
 	bool serror_pending = events->exception.serror_pending;
 	bool has_esr = events->exception.serror_has_esr;
+	bool has_ext_dabt_pending = events->exception.ext_dabt_pending;
 
 	if (serror_pending && has_esr) {
 		if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN))
@@ -731,6 +732,8 @@ int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
 			return -EINVAL;
 	} else if (serror_pending) {
 		kvm_inject_vabt(vcpu);
+	} else if (has_ext_dabt_pending) {
+		kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
 	}
 
 	return 0;
diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
index a9d25a305af5..ccdb6a051ab2 100644
--- a/arch/arm64/kvm/inject_fault.c
+++ b/arch/arm64/kvm/inject_fault.c
@@ -109,7 +109,7 @@ static void inject_undef64(struct kvm_vcpu *vcpu)
 
 /**
  * kvm_inject_dabt - inject a data abort into the guest
- * @vcpu: The VCPU to receive the undefined exception
+ * @vcpu: The VCPU to receive the data abort
  * @addr: The address to report in the DFAR
  *
  * It is assumed that this code is called from the VCPU thread and that the
@@ -125,7 +125,7 @@ void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr)
 
 /**
  * kvm_inject_pabt - inject a prefetch abort into the guest
- * @vcpu: The VCPU to receive the undefined exception
+ * @vcpu: The VCPU to receive the prefetch abort
  * @addr: The address to report in the DFAR
  *
  * It is assumed that this code is called from the VCPU thread and that the
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index dd79235b6435..a80ee820e700 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1003,6 +1003,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_ARM_PTRAUTH_GENERIC 172
 #define KVM_CAP_PMU_EVENT_FILTER 173
 #define KVM_CAP_ARM_NISV_TO_USER 174
+#define KVM_CAP_ARM_INJECT_EXT_DABT 175
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 7153504bb106..56a97dd9b292 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -217,6 +217,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_IMMEDIATE_EXIT:
 	case KVM_CAP_VCPU_EVENTS:
 	case KVM_CAP_ARM_NISV_TO_USER:
+	case KVM_CAP_ARM_INJECT_EXT_DABT:
 		r = 1;
 		break;
 	case KVM_CAP_ARM_SET_DEVICE_ADDR:
-- 
2.17.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [kvmtool PATCH 3/5] update headers: Update the KVM headers for new Arm fault reporting features
  2019-09-09 12:13 [PATCH 0/2] Improve handling of stage 2 aborts without instruction decode Christoffer Dall
  2019-09-09 12:13 ` [PATCH 1/2] KVM: arm/arm64: Allow reporting non-ISV data aborts to userspace Christoffer Dall
  2019-09-09 12:13 ` [PATCH 2/2] KVM: arm/arm64: Allow user injection of external data aborts Christoffer Dall
@ 2019-09-09 12:13 ` Christoffer Dall
  2019-09-09 12:13 ` [kvmtool PATCH 4/5] arm: Handle exits from undecoded load/store instructions Christoffer Dall
  2019-09-09 12:13 ` [kvmtool PATCH 5/5] arm: Inject external data aborts when accessing holes in the memory map Christoffer Dall
  4 siblings, 0 replies; 10+ messages in thread
From: Christoffer Dall @ 2019-09-09 12:13 UTC (permalink / raw)
  To: kvmarm
  Cc: Daniel P. Berrangé,
	Marc Zyngier, Stefan Hajnoczi, Heinrich Schuchardt,
	linux-arm-kernel

In preparation for improving our handling of guest aborts with missing
decode info or outside any mapped resource, sync updated Linux header
files.

NOTE: This is a development update and these headers are not yet in an
upstream tree.  DO NOT MERGE.

Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
---
 arm/aarch32/include/asm/kvm.h | 3 ++-
 arm/aarch64/include/asm/kvm.h | 3 ++-
 include/linux/kvm.h           | 8 ++++++++
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arm/aarch32/include/asm/kvm.h b/arm/aarch32/include/asm/kvm.h
index 4602464..b450900 100644
--- a/arm/aarch32/include/asm/kvm.h
+++ b/arm/aarch32/include/asm/kvm.h
@@ -131,8 +131,9 @@ struct kvm_vcpu_events {
 	struct {
 		__u8 serror_pending;
 		__u8 serror_has_esr;
+		__u8 ext_dabt_pending;
 		/* Align it to 8 bytes */
-		__u8 pad[6];
+		__u8 pad[5];
 		__u64 serror_esr;
 	} exception;
 	__u32 reserved[12];
diff --git a/arm/aarch64/include/asm/kvm.h b/arm/aarch64/include/asm/kvm.h
index 97c3478..e4cf9bd 100644
--- a/arm/aarch64/include/asm/kvm.h
+++ b/arm/aarch64/include/asm/kvm.h
@@ -160,8 +160,9 @@ struct kvm_vcpu_events {
 	struct {
 		__u8 serror_pending;
 		__u8 serror_has_esr;
+		__u8 ext_dabt_pending;
 		/* Align it to 8 bytes */
-		__u8 pad[6];
+		__u8 pad[5];
 		__u64 serror_esr;
 	} exception;
 	__u32 reserved[12];
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 6d4ea4b..6e2d2df 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -235,6 +235,7 @@ struct kvm_hyperv_exit {
 #define KVM_EXIT_S390_STSI        25
 #define KVM_EXIT_IOAPIC_EOI       26
 #define KVM_EXIT_HYPERV           27
+#define KVM_EXIT_ARM_NISV         28
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -392,6 +393,11 @@ struct kvm_run {
 		} eoi;
 		/* KVM_EXIT_HYPERV */
 		struct kvm_hyperv_exit hyperv;
+		/* KVM_EXIT_ARM_NISV */
+		struct {
+			__u64 esr_iss;
+			__u64 fault_ipa;
+		} arm_nisv;
 		/* Fix the size of the union. */
 		char padding[256];
 	};
@@ -988,6 +994,8 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_ARM_VM_IPA_SIZE 165
 #define KVM_CAP_MANUAL_DIRTY_LOG_PROTECT 166
 #define KVM_CAP_HYPERV_CPUID 167
+#define KVM_CAP_ARM_NISV_TO_USER 174
+#define KVM_CAP_ARM_INJECT_EXT_DABT 175
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.17.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [kvmtool PATCH 4/5] arm: Handle exits from undecoded load/store instructions
  2019-09-09 12:13 [PATCH 0/2] Improve handling of stage 2 aborts without instruction decode Christoffer Dall
                   ` (2 preceding siblings ...)
  2019-09-09 12:13 ` [kvmtool PATCH 3/5] update headers: Update the KVM headers for new Arm fault reporting features Christoffer Dall
@ 2019-09-09 12:13 ` Christoffer Dall
  2019-09-09 12:13 ` [kvmtool PATCH 5/5] arm: Inject external data aborts when accessing holes in the memory map Christoffer Dall
  4 siblings, 0 replies; 10+ messages in thread
From: Christoffer Dall @ 2019-09-09 12:13 UTC (permalink / raw)
  To: kvmarm
  Cc: Daniel P. Berrangé,
	Marc Zyngier, Stefan Hajnoczi, Heinrich Schuchardt,
	linux-arm-kernel

KVM occasionally encounters guests that attempt to access memory outside
the registered RAM memory slots using instructions that don't provide
decoding information in the ESR_EL2 (the ISV bit is not set), and
historically this has led to the kernel printing a confusing error
message in dmesg and returning -ENOYSYS from KVM_RUN.

KVM/Arm now has KVM_CAP_ARM_NISV_TO_USER, which can be enabled from
userspace, and which allows us to handle this with a little bit more
helpful information to the user.  For example, we can at least tell the
user if the guest just hit a hole in the guest's memory map, or if this
appeared to be an attempt at doing MMIO.

Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
---
 arm/kvm-cpu.c     | 20 +++++++++++++++++++-
 arm/kvm.c         |  8 ++++++++
 include/kvm/kvm.h |  1 +
 kvm.c             |  1 +
 mmio.c            | 11 +++++++++++
 5 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/arm/kvm-cpu.c b/arm/kvm-cpu.c
index 7780251..25bd3ed 100644
--- a/arm/kvm-cpu.c
+++ b/arm/kvm-cpu.c
@@ -136,7 +136,25 @@ void kvm_cpu__delete(struct kvm_cpu *vcpu)
 
 bool kvm_cpu__handle_exit(struct kvm_cpu *vcpu)
 {
-	return false;
+	switch (vcpu->kvm_run->exit_reason) {
+	case KVM_EXIT_ARM_NISV: {
+		u64 phys_addr = vcpu->kvm_run->arm_nisv.fault_ipa;
+
+		if (!arm_addr_in_ioport_region(phys_addr) &&
+		    !kvm__mmio_exists(vcpu, phys_addr))
+			die("Guest accessed memory outside RAM and IO ranges");
+
+		/*
+		 * We cannot fetch and decode instructions from a KVM guest,
+		 * which used a load/store instruction that doesn't get
+		 * decoded in the ESR towards an I/O device, so we have no
+		 * choice but to exit to the user with an error.
+		 */
+		die("Guest accessed I/O device with unsupported load/store instruction");
+	}
+	default:
+		return false;
+	}
 }
 
 void kvm_cpu__show_page_tables(struct kvm_cpu *vcpu)
diff --git a/arm/kvm.c b/arm/kvm.c
index 1f85fc6..2572ac2 100644
--- a/arm/kvm.c
+++ b/arm/kvm.c
@@ -59,6 +59,8 @@ void kvm__arch_set_cmdline(char *cmdline, bool video)
 
 void kvm__arch_init(struct kvm *kvm, const char *hugetlbfs_path, u64 ram_size)
 {
+	struct kvm_enable_cap enable_cap = { .flags = 0 };
+
 	/*
 	 * Allocate guest memory. We must align our buffer to 64K to
 	 * correlate with the maximum guest page size for virtio-mmio.
@@ -83,6 +85,12 @@ void kvm__arch_init(struct kvm *kvm, const char *hugetlbfs_path, u64 ram_size)
 	madvise(kvm->arch.ram_alloc_start, kvm->arch.ram_alloc_size,
 		MADV_HUGEPAGE);
 
+	if (kvm__supports_extension(kvm, KVM_CAP_ARM_NISV_TO_USER)) {
+		enable_cap.cap = KVM_CAP_ARM_NISV_TO_USER;
+		if (ioctl(kvm->vm_fd, KVM_ENABLE_CAP, &enable_cap) < 0)
+			die("unable to enable NISV_TO_USER capability");
+	}
+
 	/* Create the virtual GIC. */
 	if (gic__create(kvm, kvm->cfg.arch.irqchip))
 		die("Failed to create virtual GIC");
diff --git a/include/kvm/kvm.h b/include/kvm/kvm.h
index 7a73818..05d90ee 100644
--- a/include/kvm/kvm.h
+++ b/include/kvm/kvm.h
@@ -107,6 +107,7 @@ bool kvm__emulate_io(struct kvm_cpu *vcpu, u16 port, void *data, int direction,
 bool kvm__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8 *data, u32 len, u8 is_write);
 int kvm__register_mem(struct kvm *kvm, u64 guest_phys, u64 size, void *userspace_addr,
 		      enum kvm_mem_type type);
+bool kvm__mmio_exists(struct kvm_cpu *vcpu, u64 phys_addr);
 static inline int kvm__register_ram(struct kvm *kvm, u64 guest_phys, u64 size,
 				    void *userspace_addr)
 {
diff --git a/kvm.c b/kvm.c
index 57c4ff9..03ec43f 100644
--- a/kvm.c
+++ b/kvm.c
@@ -55,6 +55,7 @@ const char *kvm_exit_reasons[] = {
 #ifdef CONFIG_PPC64
 	DEFINE_KVM_EXIT_REASON(KVM_EXIT_PAPR_HCALL),
 #endif
+	DEFINE_KVM_EXIT_REASON(KVM_EXIT_ARM_NISV),
 };
 
 static int pause_event;
diff --git a/mmio.c b/mmio.c
index 61e1d47..2ab7fa7 100644
--- a/mmio.c
+++ b/mmio.c
@@ -139,3 +139,14 @@ bool kvm__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8 *data, u32 len, u
 
 	return true;
 }
+
+bool kvm__mmio_exists(struct kvm_cpu *vcpu, u64 phys_addr)
+{
+	struct mmio_mapping *mmio;
+
+	br_read_lock(vcpu->kvm);
+	mmio = mmio_search(&mmio_tree, phys_addr, 1);
+	br_read_unlock(vcpu->kvm);
+
+	return mmio != NULL;
+}
-- 
2.17.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [kvmtool PATCH 5/5] arm: Inject external data aborts when accessing holes in the memory map
  2019-09-09 12:13 [PATCH 0/2] Improve handling of stage 2 aborts without instruction decode Christoffer Dall
                   ` (3 preceding siblings ...)
  2019-09-09 12:13 ` [kvmtool PATCH 4/5] arm: Handle exits from undecoded load/store instructions Christoffer Dall
@ 2019-09-09 12:13 ` Christoffer Dall
  4 siblings, 0 replies; 10+ messages in thread
From: Christoffer Dall @ 2019-09-09 12:13 UTC (permalink / raw)
  To: kvmarm
  Cc: Daniel P. Berrangé,
	Marc Zyngier, Stefan Hajnoczi, Heinrich Schuchardt,
	linux-arm-kernel

Occasionally guests will attempt to access parts of the guest memory map
where there is... nothing at all.  Until now, we've handled this by
either forcefully killing the guest, or silently (unless a debug option
was enabled) ignoring the access.  Neither is very helpful to a user,
who is most likely running either a broken or misconfigured guest.

A more appropriate action is to inject an external abort to the guest.
Luckily, with KVM_CAP_ARM_INJECT_EXT_DABT, we can use the set event
mechanism and ask KVM to do this for us.

So we add an architecture specific hook to handle accesses to MMIO
regions which cannot be found, and allow them to return if the invalid
access was handled or not.

Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
---
 arm/include/arm-common/kvm-cpu-arch.h | 16 ++++++++++++++++
 arm/kvm-cpu.c                         |  2 +-
 mips/include/kvm/kvm-cpu-arch.h       |  5 +++++
 mmio.c                                |  3 ++-
 powerpc/include/kvm/kvm-cpu-arch.h    |  5 +++++
 x86/include/kvm/kvm-cpu-arch.h        |  5 +++++
 6 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/arm/include/arm-common/kvm-cpu-arch.h b/arm/include/arm-common/kvm-cpu-arch.h
index 923d2c4..33defa2 100644
--- a/arm/include/arm-common/kvm-cpu-arch.h
+++ b/arm/include/arm-common/kvm-cpu-arch.h
@@ -57,6 +57,22 @@ static inline bool kvm_cpu__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr,
 	return kvm__emulate_mmio(vcpu, phys_addr, data, len, is_write);
 }
 
+static inline bool kvm_cpu__mmio_not_found(struct kvm_cpu *vcpu, u64 phys_addr)
+{
+	struct kvm_vcpu_events events = {
+		.exception.ext_dabt_pending = 1,
+	};
+	int err;
+
+	if (!kvm__supports_extension(vcpu->kvm, KVM_CAP_ARM_INJECT_EXT_DABT))
+		return false;
+
+	err = ioctl(vcpu->vcpu_fd, KVM_SET_VCPU_EVENTS, &events);
+	if (err)
+		die("failed to inject external abort");
+	return true;
+}
+
 unsigned long kvm_cpu__get_vcpu_mpidr(struct kvm_cpu *vcpu);
 
 #endif /* ARM_COMMON__KVM_CPU_ARCH_H */
diff --git a/arm/kvm-cpu.c b/arm/kvm-cpu.c
index 25bd3ed..321a3e4 100644
--- a/arm/kvm-cpu.c
+++ b/arm/kvm-cpu.c
@@ -142,7 +142,7 @@ bool kvm_cpu__handle_exit(struct kvm_cpu *vcpu)
 
 		if (!arm_addr_in_ioport_region(phys_addr) &&
 		    !kvm__mmio_exists(vcpu, phys_addr))
-			die("Guest accessed memory outside RAM and IO ranges");
+			return kvm_cpu__mmio_not_found(vcpu, phys_addr);
 
 		/*
 		 * We cannot fetch and decode instructions from a KVM guest,
diff --git a/mips/include/kvm/kvm-cpu-arch.h b/mips/include/kvm/kvm-cpu-arch.h
index 45e69f6..512ab34 100644
--- a/mips/include/kvm/kvm-cpu-arch.h
+++ b/mips/include/kvm/kvm-cpu-arch.h
@@ -40,4 +40,9 @@ static inline bool kvm_cpu__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8
 	return kvm__emulate_mmio(vcpu, phys_addr, data, len, is_write);
 }
 
+static inline bool kvm_cpu__mmio_not_found(struct kvm_cpu *vcpu, u64 phys_addr)
+{
+	return false;
+}
+
 #endif /* KVM__KVM_CPU_ARCH_H */
diff --git a/mmio.c b/mmio.c
index 2ab7fa7..d6df303 100644
--- a/mmio.c
+++ b/mmio.c
@@ -130,7 +130,8 @@ bool kvm__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8 *data, u32 len, u
 	if (mmio)
 		mmio->mmio_fn(vcpu, phys_addr, data, len, is_write, mmio->ptr);
 	else {
-		if (vcpu->kvm->cfg.mmio_debug)
+		if (!kvm_cpu__mmio_not_found(vcpu, phys_addr) &&
+		    vcpu->kvm->cfg.mmio_debug)
 			fprintf(stderr,	"Warning: Ignoring MMIO %s at %016llx (length %u)\n",
 				to_direction(is_write),
 				(unsigned long long)phys_addr, len);
diff --git a/powerpc/include/kvm/kvm-cpu-arch.h b/powerpc/include/kvm/kvm-cpu-arch.h
index a69e0cc..64b69b1 100644
--- a/powerpc/include/kvm/kvm-cpu-arch.h
+++ b/powerpc/include/kvm/kvm-cpu-arch.h
@@ -76,4 +76,9 @@ static inline bool kvm_cpu__emulate_io(struct kvm_cpu *vcpu, u16 port, void *dat
 
 bool kvm_cpu__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8 *data, u32 len, u8 is_write);
 
+static inline bool kvm_cpu__mmio_not_found(struct kvm_cpu *vcpu, u64 phys_addr)
+{
+	return false;
+}
+
 #endif /* KVM__KVM_CPU_ARCH_H */
diff --git a/x86/include/kvm/kvm-cpu-arch.h b/x86/include/kvm/kvm-cpu-arch.h
index 05e5bb6..10cbe6e 100644
--- a/x86/include/kvm/kvm-cpu-arch.h
+++ b/x86/include/kvm/kvm-cpu-arch.h
@@ -47,4 +47,9 @@ static inline bool kvm_cpu__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8
 	return kvm__emulate_mmio(vcpu, phys_addr, data, len, is_write);
 }
 
+static inline bool kvm_cpu__mmio_not_found(struct kvm_cpu *vcpu, u64 phys_addr)
+{
+	return false;
+}
+
 #endif /* KVM__KVM_CPU_ARCH_H */
-- 
2.17.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] KVM: arm/arm64: Allow user injection of external data aborts
  2019-09-09 12:13 ` [PATCH 2/2] KVM: arm/arm64: Allow user injection of external data aborts Christoffer Dall
@ 2019-09-09 12:32   ` Peter Maydell
  2019-09-09 15:16     ` Christoffer Dall
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Maydell @ 2019-09-09 12:32 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Daniel P. Berrangé,
	Marc Zyngier, Stefan Hajnoczi, Heinrich Schuchardt, kvmarm,
	arm-mail-list

On Mon, 9 Sep 2019 at 13:13, Christoffer Dall <christoffer.dall@arm.com> wrote:
>
> In some scenarios, such as buggy guest or incorrect configuration of the
> VMM and firmware description data, userspace will detect a memory access
> to a portion of the IPA, which is not mapped to any MMIO region.
>
> For this purpose, the appropriate action is to inject an external abort
> to the guest.  The kernel already has functionality to inject an
> external abort, but we need to wire up a signal from user space that
> lets user space tell the kernel to do this.
>
> It turns out, we already have the set event functionality which we can
> perfectly reuse for this.
>
> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
> ---
>  Documentation/virt/kvm/api.txt    | 15 ++++++++++++++-
>  arch/arm/include/uapi/asm/kvm.h   |  3 ++-
>  arch/arm/kvm/guest.c              |  3 +++
>  arch/arm64/include/uapi/asm/kvm.h |  3 ++-
>  arch/arm64/kvm/guest.c            |  3 +++
>  arch/arm64/kvm/inject_fault.c     |  4 ++--
>  include/uapi/linux/kvm.h          |  1 +
>  virt/kvm/arm/arm.c                |  1 +
>  8 files changed, 28 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.txt b/Documentation/virt/kvm/api.txt
> index 02501333f746..edd6cdc470ca 100644
> --- a/Documentation/virt/kvm/api.txt
> +++ b/Documentation/virt/kvm/api.txt
> @@ -955,6 +955,8 @@ The following bits are defined in the flags field:
>
>  ARM/ARM64:
>
> +User space may need to inject several types of events to the guest.
> +
>  If the guest accesses a device that is being emulated by the host kernel in
>  such a way that a real device would generate a physical SError, KVM may make
>  a virtual SError pending for that VCPU. This system error interrupt remains
> @@ -989,12 +991,23 @@ Specifying exception.has_esr on a system that does not support it will return
>  -EINVAL. Setting anything other than the lower 24bits of exception.serror_esr
>  will return -EINVAL.
>
> +If the guest performed an access to I/O memory which could not be handled by
> +user space, for example because of missing instruction syndrome decode
> +information or because there is no device mapped at the accessed IPA, then
> +user space can ask the kernel to inject an external abort using the address
> +from the exiting fault on the VCPU. It is a programming error to set
> +ext_dabt_pending at the same time as any of the serror fields, or to set
> +ext_dabt_pending on an exit which was not either KVM_EXIT_MMIO or
> +KVM_EXIT_ARM_NISV. This feature is only available if the system supports
> +KVM_CAP_ARM_INJECT_EXT_DABT;
> +
>  struct kvm_vcpu_events {
>         struct {
>                 __u8 serror_pending;
>                 __u8 serror_has_esr;
> +               __u8 ext_dabt_pending;
>                 /* Align it to 8 bytes */
> -               __u8 pad[6];
> +               __u8 pad[5];
>                 __u64 serror_esr;
>         } exception;
>         __u32 reserved[12];

This API seems to be missing support for userspace to specify
whether the ESR_ELx for the guest should have the EA bit set
(and more generally other syndrome/fault status bits). I think
if we have an API for "KVM_EXIT_MMIO but the access failed"
then it should either (a) be architecture agnostic, since
pretty much any architecture might have a concept of "access
gave some bus-error-type failure" and it would be nice if userspace
didn't have to special case them all in arch-specific code,
or (b) have the same flexibility for specifying exactly what
kind of fault as the architecture does. This sort of seems to
fall between two stools. (My ideal for KVM_EXIT_MMIO faults
would be a generic API which included space for optional
arch-specific info, which for Arm would pretty much just be
the EA bit.)

As and when we support nested virtualization, any suggestions
on how this API would extend to support userspace saying
"deliver fault to guest EL1" vs "deliver fault to guest EL2" ?

thanks
-- PMM
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] KVM: arm/arm64: Allow user injection of external data aborts
  2019-09-09 12:32   ` Peter Maydell
@ 2019-09-09 15:16     ` Christoffer Dall
  2019-09-09 15:56       ` Peter Maydell
  0 siblings, 1 reply; 10+ messages in thread
From: Christoffer Dall @ 2019-09-09 15:16 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Daniel P. Berrangé,
	Marc Zyngier, Stefan Hajnoczi, Heinrich Schuchardt, kvmarm,
	arm-mail-list

On Mon, Sep 09, 2019 at 01:32:46PM +0100, Peter Maydell wrote:
> On Mon, 9 Sep 2019 at 13:13, Christoffer Dall <christoffer.dall@arm.com> wrote:
> >
> > In some scenarios, such as buggy guest or incorrect configuration of the
> > VMM and firmware description data, userspace will detect a memory access
> > to a portion of the IPA, which is not mapped to any MMIO region.
> >
> > For this purpose, the appropriate action is to inject an external abort
> > to the guest.  The kernel already has functionality to inject an
> > external abort, but we need to wire up a signal from user space that
> > lets user space tell the kernel to do this.
> >
> > It turns out, we already have the set event functionality which we can
> > perfectly reuse for this.
> >
> > Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
> > ---
> >  Documentation/virt/kvm/api.txt    | 15 ++++++++++++++-
> >  arch/arm/include/uapi/asm/kvm.h   |  3 ++-
> >  arch/arm/kvm/guest.c              |  3 +++
> >  arch/arm64/include/uapi/asm/kvm.h |  3 ++-
> >  arch/arm64/kvm/guest.c            |  3 +++
> >  arch/arm64/kvm/inject_fault.c     |  4 ++--
> >  include/uapi/linux/kvm.h          |  1 +
> >  virt/kvm/arm/arm.c                |  1 +
> >  8 files changed, 28 insertions(+), 5 deletions(-)
> >
> > diff --git a/Documentation/virt/kvm/api.txt b/Documentation/virt/kvm/api.txt
> > index 02501333f746..edd6cdc470ca 100644
> > --- a/Documentation/virt/kvm/api.txt
> > +++ b/Documentation/virt/kvm/api.txt
> > @@ -955,6 +955,8 @@ The following bits are defined in the flags field:
> >
> >  ARM/ARM64:
> >
> > +User space may need to inject several types of events to the guest.
> > +
> >  If the guest accesses a device that is being emulated by the host kernel in
> >  such a way that a real device would generate a physical SError, KVM may make
> >  a virtual SError pending for that VCPU. This system error interrupt remains
> > @@ -989,12 +991,23 @@ Specifying exception.has_esr on a system that does not support it will return
> >  -EINVAL. Setting anything other than the lower 24bits of exception.serror_esr
> >  will return -EINVAL.
> >
> > +If the guest performed an access to I/O memory which could not be handled by
> > +user space, for example because of missing instruction syndrome decode
> > +information or because there is no device mapped at the accessed IPA, then
> > +user space can ask the kernel to inject an external abort using the address
> > +from the exiting fault on the VCPU. It is a programming error to set
> > +ext_dabt_pending at the same time as any of the serror fields, or to set
> > +ext_dabt_pending on an exit which was not either KVM_EXIT_MMIO or
> > +KVM_EXIT_ARM_NISV. This feature is only available if the system supports
> > +KVM_CAP_ARM_INJECT_EXT_DABT;
> > +
> >  struct kvm_vcpu_events {
> >         struct {
> >                 __u8 serror_pending;
> >                 __u8 serror_has_esr;
> > +               __u8 ext_dabt_pending;
> >                 /* Align it to 8 bytes */
> > -               __u8 pad[6];
> > +               __u8 pad[5];
> >                 __u64 serror_esr;
> >         } exception;
> >         __u32 reserved[12];
> 
> This API seems to be missing support for userspace to specify
> whether the ESR_ELx for the guest should have the EA bit set
> (and more generally other syndrome/fault status bits). I think
> if we have an API for "KVM_EXIT_MMIO but the access failed"
> then it should either (a) be architecture agnostic, since
> pretty much any architecture might have a concept of "access
> gave some bus-error-type failure" and it would be nice if userspace
> didn't have to special case them all in arch-specific code,
> or (b) have the same flexibility for specifying exactly what
> kind of fault as the architecture does. This sort of seems to
> fall between two stools. (My ideal for KVM_EXIT_MMIO faults
> would be a generic API which included space for optional
> arch-specific info, which for Arm would pretty much just be
> the EA bit.)

I'm not sure I understand exactly what would be improved by making this
either more architecture speific or more architecture generic.  The
EA bit will always be set, that's why the field is called
'ext_dabt_pending'.

I thought as per the previous discussion, that we were specifically
trying to avoid userspace emulating the exception in detail, so I
designed this to provide the minimal effort API for userspace.

Since we already have an architecture specific ioctl, kvm_vcpu_events, I
don't think we're painting ourselves into a corner by using that.  Is a
natural consequence of what you're saying not that we should try to make
that whole call architecture generic?

Unless we already have specific examples of how other architectures
would want to use something like this, and given the impact of this
patch, I'm not sure it's worth trying to speculate about that.

> 
> As and when we support nested virtualization, any suggestions
> on how this API would extend to support userspace saying
> "deliver fault to guest EL1" vs "deliver fault to guest EL2" ?
> 

If we took one of the supported exits from a VM with nested virt
support, it means that you either had a fault from the guest hypervisor,
or a fault from a nested guest where the guest hypervisor has set up a
virtual stage 2 mapping to a hole in the VM's IPA space.  In the former
case, the exception would be delivered back to guest hypervisor, and in
the latter case the target depends on the guest hypervisor's
configuration of the virtual HCR_EL2(.TEA), which the kernel should
respect when handling the KVM_SET_VCPU_EVENTS ioctl.


Thanks,

    Christoffer
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] KVM: arm/arm64: Allow user injection of external data aborts
  2019-09-09 15:16     ` Christoffer Dall
@ 2019-09-09 15:56       ` Peter Maydell
  2019-09-09 17:36         ` Christoffer Dall
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Maydell @ 2019-09-09 15:56 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Daniel P. Berrangé,
	Marc Zyngier, Stefan Hajnoczi, Heinrich Schuchardt, kvmarm,
	arm-mail-list

On Mon, 9 Sep 2019 at 16:16, Christoffer Dall <christoffer.dall@arm.com> wrote:
>
> On Mon, Sep 09, 2019 at 01:32:46PM +0100, Peter Maydell wrote:
> > This API seems to be missing support for userspace to specify
> > whether the ESR_ELx for the guest should have the EA bit set
> > (and more generally other syndrome/fault status bits). I think
> > if we have an API for "KVM_EXIT_MMIO but the access failed"
> > then it should either (a) be architecture agnostic, since
> > pretty much any architecture might have a concept of "access
> > gave some bus-error-type failure" and it would be nice if userspace
> > didn't have to special case them all in arch-specific code,
> > or (b) have the same flexibility for specifying exactly what
> > kind of fault as the architecture does. This sort of seems to
> > fall between two stools. (My ideal for KVM_EXIT_MMIO faults
> > would be a generic API which included space for optional
> > arch-specific info, which for Arm would pretty much just be
> > the EA bit.)
>
> I'm not sure I understand exactly what would be improved by making this
> either more architecture speific or more architecture generic.  The
> EA bit will always be set, that's why the field is called
> 'ext_dabt_pending'.

ESR_EL1.EA doesn't mean "this is an external abort". It means
"given that this is an external abort as indicated by ESR_EL1.DFSC,
specify the external abort type". Traditionally this is 0 for
an AXI bus Decode error ("interconnect says there's nothing there")
and 1 for a Slave error ("there's something there but it told us
to go away"), though architecturally it's specified as impdef
because not everybody uses AXI. In QEMU we track the difference
between these two things and for TCG will raise external aborts
with the correct EA bit value.

> I thought as per the previous discussion, that we were specifically
> trying to avoid userspace emulating the exception in detail, so I
> designed this to provide the minimal effort API for userspace.
>
> Since we already have an architecture specific ioctl, kvm_vcpu_events, I
> don't think we're painting ourselves into a corner by using that.  Is a
> natural consequence of what you're saying not that we should try to make
> that whole call architecture generic?
>
> Unless we already have specific examples of how other architectures
> would want to use something like this, and given the impact of this
> patch, I'm not sure it's worth trying to speculate about that.

In QEMU, use of a generic API would look something like
this in kvm-all.c:

        case KVM_EXIT_MMIO:
            DPRINTF("handle_mmio\n");
            /* Called outside BQL */
            MemTxResult res;

            res = address_space_rw(&address_space_memory,
                                   run->mmio.phys_addr, attrs,
                                   run->mmio.data,
                                   run->mmio.len,
                                   run->mmio.is_write);
            if (res != MEMTX_OK) {
                /* tell the kernel the access failed, eg
                 * by updating the kvm_run struct to say so
                 */
            } else {
                /* access passed, we have updated the kvm_run
                 * struct's mmio subfield, proceed as usual
                 */
            }
            ret = 0;
            break;

[this is exactly the current QEMU code except that today
we throw away the 'res' that tells us if the transaction
succeeded because we have no way to report it to KVM and
effectively always RAZ/WI the access.]

This is nice because you don't need anything here that has to do
"bail out to architecture specific handling of anything",
you just say "nope, the access failed", and let the kernel handle
that however the CPU would handle it. It just immediately works
for all architectures on the userspace side (assuming the kernel
defaults to not actually trying to report an abort to the guest
if nobody's implemented that on the kernel side, which is exactly
what happens today where there's no way to report the error for
any architecture).
The downside is that you lose the ability to be more specific about
architecture-specific fine distinctions like decode errors vs slave
errors, though.

Or you could have an arm-specific API that does care about
fine details like the EA bit (and maybe also other ESR_ELx
fields); that has the downside that userspace needs to
make the handling of error returns from "handle this MMIO
access" architecture specific, but you get architecture-specific
benefits as a result. (Preferably the architecture-specific
APIs should at least be basically the same, eg same ioctl
or same bit of the kvm_run struct being updated with some parts
being arch-specific data, rather than 3 different mechanisms.)

Having an API that is architecture specific but doesn't actually
let you define any of the architecture-specific aspects of
what the abort might imply seems like the worst of both worlds.
If all we can say is "this aborted" then we might as well have
the API be generic.

thanks
-- PMM
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] KVM: arm/arm64: Allow user injection of external data aborts
  2019-09-09 15:56       ` Peter Maydell
@ 2019-09-09 17:36         ` Christoffer Dall
  0 siblings, 0 replies; 10+ messages in thread
From: Christoffer Dall @ 2019-09-09 17:36 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Daniel P. Berrangé,
	Marc Zyngier, Stefan Hajnoczi, Heinrich Schuchardt, kvmarm,
	arm-mail-list

On Mon, Sep 09, 2019 at 04:56:23PM +0100, Peter Maydell wrote:
> On Mon, 9 Sep 2019 at 16:16, Christoffer Dall <christoffer.dall@arm.com> wrote:
> >
> > On Mon, Sep 09, 2019 at 01:32:46PM +0100, Peter Maydell wrote:
> > > This API seems to be missing support for userspace to specify
> > > whether the ESR_ELx for the guest should have the EA bit set
> > > (and more generally other syndrome/fault status bits). I think
> > > if we have an API for "KVM_EXIT_MMIO but the access failed"
> > > then it should either (a) be architecture agnostic, since
> > > pretty much any architecture might have a concept of "access
> > > gave some bus-error-type failure" and it would be nice if userspace
> > > didn't have to special case them all in arch-specific code,
> > > or (b) have the same flexibility for specifying exactly what
> > > kind of fault as the architecture does. This sort of seems to
> > > fall between two stools. (My ideal for KVM_EXIT_MMIO faults
> > > would be a generic API which included space for optional
> > > arch-specific info, which for Arm would pretty much just be
> > > the EA bit.)
> >
> > I'm not sure I understand exactly what would be improved by making this
> > either more architecture speific or more architecture generic.  The
> > EA bit will always be set, that's why the field is called
> > 'ext_dabt_pending'.
> 
> ESR_EL1.EA doesn't mean "this is an external abort". It means
> "given that this is an external abort as indicated by ESR_EL1.DFSC,
> specify the external abort type". Traditionally this is 0 for
> an AXI bus Decode error ("interconnect says there's nothing there")
> and 1 for a Slave error ("there's something there but it told us
> to go away"), though architecturally it's specified as impdef
> because not everybody uses AXI. In QEMU we track the difference
> between these two things and for TCG will raise external aborts
> with the correct EA bit value.
> 

Ah, I missed that.  I don't think we want to allow userspace to supply
any implementation defined values for the VM, though.

> > I thought as per the previous discussion, that we were specifically
> > trying to avoid userspace emulating the exception in detail, so I
> > designed this to provide the minimal effort API for userspace.
> >
> > Since we already have an architecture specific ioctl, kvm_vcpu_events, I
> > don't think we're painting ourselves into a corner by using that.  Is a
> > natural consequence of what you're saying not that we should try to make
> > that whole call architecture generic?
> >
> > Unless we already have specific examples of how other architectures
> > would want to use something like this, and given the impact of this
> > patch, I'm not sure it's worth trying to speculate about that.
> 
> In QEMU, use of a generic API would look something like
> this in kvm-all.c:
> 
>         case KVM_EXIT_MMIO:
>             DPRINTF("handle_mmio\n");
>             /* Called outside BQL */
>             MemTxResult res;
> 
>             res = address_space_rw(&address_space_memory,
>                                    run->mmio.phys_addr, attrs,
>                                    run->mmio.data,
>                                    run->mmio.len,
>                                    run->mmio.is_write);
>             if (res != MEMTX_OK) {
>                 /* tell the kernel the access failed, eg
>                  * by updating the kvm_run struct to say so
>                  */
>             } else {
>                 /* access passed, we have updated the kvm_run
>                  * struct's mmio subfield, proceed as usual
>                  */
>             }
>             ret = 0;
>             break;
> 
> [this is exactly the current QEMU code except that today
> we throw away the 'res' that tells us if the transaction
> succeeded because we have no way to report it to KVM and
> effectively always RAZ/WI the access.]
> 
> This is nice because you don't need anything here that has to do
> "bail out to architecture specific handling of anything",
> you just say "nope, the access failed", and let the kernel handle
> that however the CPU would handle it. It just immediately works
> for all architectures on the userspace side (assuming the kernel
> defaults to not actually trying to report an abort to the guest
> if nobody's implemented that on the kernel side, which is exactly
> what happens today where there's no way to report the error for
> any architecture).
> The downside is that you lose the ability to be more specific about
> architecture-specific fine distinctions like decode errors vs slave
> errors, though.

I understand that it's convenient to avoid having to write an
architecture hook, but I simply don't know if it makes sense to do this
on other architectures, and while it can be more code to have to write
the architecture hooks in QEMU, it's hardly a strong argument against
using an existing architecture-specific mechanism to inject an event to
a guest.

Note that I looked at using a an appropriate field in the kvm_run
structure, but nothing elegant came to mind.

Do you have a concrete example of how you would like to change the
kvm_run structure?

> 
> Or you could have an arm-specific API that does care about
> fine details like the EA bit (and maybe also other ESR_ELx
> fields); that has the downside that userspace needs to
> make the handling of error returns from "handle this MMIO
> access" architecture specific, but you get architecture-specific
> benefits as a result. (Preferably the architecture-specific
> APIs should at least be basically the same, eg same ioctl
> or same bit of the kvm_run struct being updated with some parts
> being arch-specific data, rather than 3 different mechanisms.)

Are there other bits of the ESR than the EA that you think we should be
able to specify?

Can we decide if we need to allow userspace to provide additional
information or not, and then decide on the mechanism, instead of
conflating the two questions?

I think we should either expose the minimal mechanism to user space, or
just leave it to user space to emulate the whole thing.


Thanks,

    Christoffer
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, back to index

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-09 12:13 [PATCH 0/2] Improve handling of stage 2 aborts without instruction decode Christoffer Dall
2019-09-09 12:13 ` [PATCH 1/2] KVM: arm/arm64: Allow reporting non-ISV data aborts to userspace Christoffer Dall
2019-09-09 12:13 ` [PATCH 2/2] KVM: arm/arm64: Allow user injection of external data aborts Christoffer Dall
2019-09-09 12:32   ` Peter Maydell
2019-09-09 15:16     ` Christoffer Dall
2019-09-09 15:56       ` Peter Maydell
2019-09-09 17:36         ` Christoffer Dall
2019-09-09 12:13 ` [kvmtool PATCH 3/5] update headers: Update the KVM headers for new Arm fault reporting features Christoffer Dall
2019-09-09 12:13 ` [kvmtool PATCH 4/5] arm: Handle exits from undecoded load/store instructions Christoffer Dall
2019-09-09 12:13 ` [kvmtool PATCH 5/5] arm: Inject external data aborts when accessing holes in the memory map Christoffer Dall

KVM ARM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/kvmarm/0 kvmarm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 kvmarm kvmarm/ https://lore.kernel.org/kvmarm \
		kvmarm@lists.cs.columbia.edu kvmarm@archiver.kernel.org
	public-inbox-index kvmarm


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/edu.columbia.cs.lists.kvmarm


AGPL code for this site: git clone https://public-inbox.org/ public-inbox