All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/2] target/arm: kvm: Support for KVM DABT without valid ISS
@ 2020-01-29 20:24 ` Beata Michalska
  0 siblings, 0 replies; 20+ messages in thread
From: Beata Michalska @ 2020-01-29 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, qemu-arm, kvmarm, Christoffer.Dall, pbonzini

Some of the ARMv7 & ARMv8 load/store instructions might trigger a data abort
exception with no valid ISS info to be decoded. The lack of decode info
makes it at least tricky to emulate the instruction which is one of the
(many) reasons why KVM will not even try to do so.

So far, if a guest made an attempt to access memory outside the memory slot,
KVM reported vague ENOSYS. As a result QEMU exited with no useful information
being provided or even a clue on what has just happened.

ARM KVM introduced support for notifying guest of an attempt to execute
an instruction that resulted in dabt with no valid ISS decoding info.
This still leaves QEMU to handle the case, but at least now, it can enable
further debugging of the encountered issue by being more verbose
in a (hopefully) useful way.

v2:
- Improving/re-phrasing messaging
- Dropping messing around with forced sync (@see [PATCH v2 1/2])
  and PC alignment


Beata Michalska (2):
  target/arm: kvm: Inject events at the last stage of sync
  target/arm: kvm: Handle DABT with no valid ISS

 target/arm/cpu.h     |  2 ++
 target/arm/kvm.c     | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/kvm32.c   | 23 +++++++------
 target/arm/kvm64.c   | 23 +++++++------
 target/arm/kvm_arm.h | 19 +++++++++++
 5 files changed, 143 insertions(+), 20 deletions(-)

-- 
2.7.4



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 0/2] target/arm: kvm: Support for KVM DABT without valid ISS
@ 2020-01-29 20:24 ` Beata Michalska
  0 siblings, 0 replies; 20+ messages in thread
From: Beata Michalska @ 2020-01-29 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, kvmarm, pbonzini

Some of the ARMv7 & ARMv8 load/store instructions might trigger a data abort
exception with no valid ISS info to be decoded. The lack of decode info
makes it at least tricky to emulate the instruction which is one of the
(many) reasons why KVM will not even try to do so.

So far, if a guest made an attempt to access memory outside the memory slot,
KVM reported vague ENOSYS. As a result QEMU exited with no useful information
being provided or even a clue on what has just happened.

ARM KVM introduced support for notifying guest of an attempt to execute
an instruction that resulted in dabt with no valid ISS decoding info.
This still leaves QEMU to handle the case, but at least now, it can enable
further debugging of the encountered issue by being more verbose
in a (hopefully) useful way.

v2:
- Improving/re-phrasing messaging
- Dropping messing around with forced sync (@see [PATCH v2 1/2])
  and PC alignment


Beata Michalska (2):
  target/arm: kvm: Inject events at the last stage of sync
  target/arm: kvm: Handle DABT with no valid ISS

 target/arm/cpu.h     |  2 ++
 target/arm/kvm.c     | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/kvm32.c   | 23 +++++++------
 target/arm/kvm64.c   | 23 +++++++------
 target/arm/kvm_arm.h | 19 +++++++++++
 5 files changed, 143 insertions(+), 20 deletions(-)

-- 
2.7.4

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 1/2] target/arm: kvm: Inject events at the last stage of sync
  2020-01-29 20:24 ` Beata Michalska
@ 2020-01-29 20:24   ` Beata Michalska
  -1 siblings, 0 replies; 20+ messages in thread
From: Beata Michalska @ 2020-01-29 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, qemu-arm, kvmarm, Christoffer.Dall, pbonzini

KVM_SET_VCPU_EVENTS might actually lead to vcpu registers being modified.
As such this should be the last step of sync to avoid potential overwriting
of whatever changes KVM might have done.

Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
---
 target/arm/kvm32.c | 20 ++++++++++----------
 target/arm/kvm64.c | 20 ++++++++++----------
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
index 32bf8d6..cf2b47f 100644
--- a/target/arm/kvm32.c
+++ b/target/arm/kvm32.c
@@ -386,17 +386,17 @@ int kvm_arch_put_registers(CPUState *cs, int level)
         return ret;
     }
 
-    ret = kvm_put_vcpu_events(cpu);
-    if (ret) {
-        return ret;
-    }
-
     write_cpustate_to_list(cpu, true);
 
     if (!write_list_to_kvmstate(cpu, level)) {
         return EINVAL;
     }
 
+    ret = kvm_put_vcpu_events(cpu);
+    if (ret) {
+        return ret;
+    }
+
     kvm_arm_sync_mpstate_to_kvm(cpu);
 
     return ret;
@@ -462,11 +462,6 @@ int kvm_arch_get_registers(CPUState *cs)
     }
     vfp_set_fpscr(env, fpscr);
 
-    ret = kvm_get_vcpu_events(cpu);
-    if (ret) {
-        return ret;
-    }
-
     if (!write_kvmstate_to_list(cpu)) {
         return EINVAL;
     }
@@ -475,6 +470,11 @@ int kvm_arch_get_registers(CPUState *cs)
      */
     write_list_to_cpustate(cpu);
 
+    ret = kvm_get_vcpu_events(cpu);
+    if (ret) {
+        return ret;
+    }
+
     kvm_arm_sync_mpstate_to_qemu(cpu);
 
     return 0;
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index 6344113..d06fd32 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -1043,17 +1043,17 @@ int kvm_arch_put_registers(CPUState *cs, int level)
         return ret;
     }
 
-    ret = kvm_put_vcpu_events(cpu);
-    if (ret) {
-        return ret;
-    }
-
     write_cpustate_to_list(cpu, true);
 
     if (!write_list_to_kvmstate(cpu, level)) {
         return -EINVAL;
     }
 
+    ret = kvm_put_vcpu_events(cpu);
+    if (ret) {
+        return ret;
+    }
+
     kvm_arm_sync_mpstate_to_kvm(cpu);
 
     return ret;
@@ -1251,11 +1251,6 @@ int kvm_arch_get_registers(CPUState *cs)
     }
     vfp_set_fpcr(env, fpr);
 
-    ret = kvm_get_vcpu_events(cpu);
-    if (ret) {
-        return ret;
-    }
-
     if (!write_kvmstate_to_list(cpu)) {
         return -EINVAL;
     }
@@ -1264,6 +1259,11 @@ int kvm_arch_get_registers(CPUState *cs)
      */
     write_list_to_cpustate(cpu);
 
+    ret = kvm_get_vcpu_events(cpu);
+    if (ret) {
+        return ret;
+    }
+
     kvm_arm_sync_mpstate_to_qemu(cpu);
 
     /* TODO: other registers */
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 1/2] target/arm: kvm: Inject events at the last stage of sync
@ 2020-01-29 20:24   ` Beata Michalska
  0 siblings, 0 replies; 20+ messages in thread
From: Beata Michalska @ 2020-01-29 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, kvmarm, pbonzini

KVM_SET_VCPU_EVENTS might actually lead to vcpu registers being modified.
As such this should be the last step of sync to avoid potential overwriting
of whatever changes KVM might have done.

Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
---
 target/arm/kvm32.c | 20 ++++++++++----------
 target/arm/kvm64.c | 20 ++++++++++----------
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
index 32bf8d6..cf2b47f 100644
--- a/target/arm/kvm32.c
+++ b/target/arm/kvm32.c
@@ -386,17 +386,17 @@ int kvm_arch_put_registers(CPUState *cs, int level)
         return ret;
     }
 
-    ret = kvm_put_vcpu_events(cpu);
-    if (ret) {
-        return ret;
-    }
-
     write_cpustate_to_list(cpu, true);
 
     if (!write_list_to_kvmstate(cpu, level)) {
         return EINVAL;
     }
 
+    ret = kvm_put_vcpu_events(cpu);
+    if (ret) {
+        return ret;
+    }
+
     kvm_arm_sync_mpstate_to_kvm(cpu);
 
     return ret;
@@ -462,11 +462,6 @@ int kvm_arch_get_registers(CPUState *cs)
     }
     vfp_set_fpscr(env, fpscr);
 
-    ret = kvm_get_vcpu_events(cpu);
-    if (ret) {
-        return ret;
-    }
-
     if (!write_kvmstate_to_list(cpu)) {
         return EINVAL;
     }
@@ -475,6 +470,11 @@ int kvm_arch_get_registers(CPUState *cs)
      */
     write_list_to_cpustate(cpu);
 
+    ret = kvm_get_vcpu_events(cpu);
+    if (ret) {
+        return ret;
+    }
+
     kvm_arm_sync_mpstate_to_qemu(cpu);
 
     return 0;
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index 6344113..d06fd32 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -1043,17 +1043,17 @@ int kvm_arch_put_registers(CPUState *cs, int level)
         return ret;
     }
 
-    ret = kvm_put_vcpu_events(cpu);
-    if (ret) {
-        return ret;
-    }
-
     write_cpustate_to_list(cpu, true);
 
     if (!write_list_to_kvmstate(cpu, level)) {
         return -EINVAL;
     }
 
+    ret = kvm_put_vcpu_events(cpu);
+    if (ret) {
+        return ret;
+    }
+
     kvm_arm_sync_mpstate_to_kvm(cpu);
 
     return ret;
@@ -1251,11 +1251,6 @@ int kvm_arch_get_registers(CPUState *cs)
     }
     vfp_set_fpcr(env, fpr);
 
-    ret = kvm_get_vcpu_events(cpu);
-    if (ret) {
-        return ret;
-    }
-
     if (!write_kvmstate_to_list(cpu)) {
         return -EINVAL;
     }
@@ -1264,6 +1259,11 @@ int kvm_arch_get_registers(CPUState *cs)
      */
     write_list_to_cpustate(cpu);
 
+    ret = kvm_get_vcpu_events(cpu);
+    if (ret) {
+        return ret;
+    }
+
     kvm_arm_sync_mpstate_to_qemu(cpu);
 
     /* TODO: other registers */
-- 
2.7.4

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 2/2] target/arm: kvm: Handle DABT with no valid ISS
  2020-01-29 20:24 ` Beata Michalska
@ 2020-01-29 20:24   ` Beata Michalska
  -1 siblings, 0 replies; 20+ messages in thread
From: Beata Michalska @ 2020-01-29 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, qemu-arm, kvmarm, Christoffer.Dall, pbonzini

On ARMv7 & ARMv8 some load/store instructions might trigger a data abort
exception with no valid ISS info to be decoded. The lack of decode info
makes it at least tricky to emulate those instruction which is one of the
(many) reasons why KVM will not even try to do so.

Add support for handling those by requesting KVM to inject external
dabt into the quest.

Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
---
 target/arm/cpu.h     |  2 ++
 target/arm/kvm.c     | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/kvm32.c   |  3 ++
 target/arm/kvm64.c   |  3 ++
 target/arm/kvm_arm.h | 19 +++++++++++
 5 files changed, 123 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index c1aedbe..e04a8d3 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -558,6 +558,8 @@ typedef struct CPUARMState {
         uint8_t has_esr;
         uint64_t esr;
     } serror;
+    /* Status field for pending external dabt */
+    uint8_t ext_dabt_pending;
 
     /* State of our input IRQ/FIQ/VIRQ/VFIQ lines */
     uint32_t irq_line_state;
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 8d82889..e7bc9b7 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -37,6 +37,7 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
 
 static bool cap_has_mp_state;
 static bool cap_has_inject_serror_esr;
+static bool cap_has_inject_ext_dabt; /* KVM_CAP_ARM_INJECT_EXT_DABT */
 
 static ARMHostCPUFeatures arm_host_cpu_features;
 
@@ -62,6 +63,12 @@ void kvm_arm_init_serror_injection(CPUState *cs)
                                     KVM_CAP_ARM_INJECT_SERROR_ESR);
 }
 
+void kvm_arm_init_ext_dabt_injection(CPUState *cs)
+{
+    cap_has_inject_ext_dabt = kvm_check_extension(cs->kvm_state,
+                                    KVM_CAP_ARM_INJECT_EXT_DABT);
+}
+
 bool kvm_arm_create_scratch_host_vcpu(const uint32_t *cpus_to_try,
                                       int *fdarray,
                                       struct kvm_vcpu_init *init)
@@ -216,6 +223,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
         ret = -EINVAL;
     }
 
+    if (kvm_check_extension(s, KVM_CAP_ARM_NISV_TO_USER))
+        if (kvm_vm_enable_cap(s, KVM_CAP_ARM_NISV_TO_USER, 0)) {
+            warn_report("Failed to enable DABT NISV cap");
+        }
+
     return ret;
 }
 
@@ -598,6 +610,10 @@ int kvm_put_vcpu_events(ARMCPU *cpu)
         events.exception.serror_esr = env->serror.esr;
     }
 
+    if (cap_has_inject_ext_dabt) {
+        events.exception.ext_dabt_pending = env->ext_dabt_pending;
+    }
+
     ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
     if (ret) {
         error_report("failed to put vcpu events");
@@ -627,6 +643,8 @@ int kvm_get_vcpu_events(ARMCPU *cpu)
     env->serror.has_esr = events.exception.serror_has_esr;
     env->serror.esr = events.exception.serror_esr;
 
+    env->ext_dabt_pending = events.exception.ext_dabt_pending;
+
     return 0;
 }
 
@@ -634,6 +652,7 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
 {
 }
 
+
 MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
 {
     ARMCPU *cpu;
@@ -699,6 +718,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
             ret = EXCP_DEBUG;
         } /* otherwise return to guest */
         break;
+    case KVM_EXIT_ARM_NISV:
+        /* External DABT with no valid iss to decode */
+        ret = kvm_arm_handle_dabt_nisv(cs, run->arm_nisv.esr_iss,
+                                       run->arm_nisv.fault_ipa);
+        break;
     default:
         qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n",
                       __func__, run->exit_reason);
@@ -833,3 +857,75 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
 {
     return (data - 32) & 0xffff;
 }
+
+int kvm_arm_handle_dabt_nisv(CPUState *cs, uint64_t esr_iss,
+                             uint64_t fault_ipa)
+{
+    ARMCPU *cpu = ARM_CPU(cs);
+    CPUARMState *env = &cpu->env;
+    uint32_t ins, ins_fetched;
+
+    /*
+     * Hacky workaround for kernels that for aarch32 guests, instead of expected
+     * external data abort, inject the IMPLEMENTATION DEFINED exception with the
+     * lock-down. This is actually handled by the guest which results in
+     * re-running the faulting instruction.
+     * This intends to break the vicious cycle.
+     */
+    if (!is_a64(env)) {
+        static uint8_t setback;
+
+        /*
+         * The state has not been synchronized yet, so if this is re-occurrence
+         * of the same abort triggered by guest, the status for pending external
+         * abort should not get cleared yet
+         */
+        if (unlikely(env->ext_dabt_pending)) {
+            if (setback) {
+                error_report("Most probably triggered kernel issue with"
+                             " injecting external data abort.");
+                error_printf("Giving up trying ...\n");
+                /* Here is where the fun comes to an end */
+                return -EINVAL;
+            }
+        }
+        setback = env->ext_dabt_pending;
+    }
+
+    kvm_cpu_synchronize_state(cs);
+   /*
+    * ISS [23:14] is invalid so there is a limited info
+    * on what has just happened so the only *useful* thing that can
+    * be retrieved from ISS is WnR & DFSC (though in some cases WnR
+    * might be less of a value as well)
+    */
+
+    /*
+     * Get current PC before it will get updated to exception vector entry
+     */
+    target_ulong ins_addr = is_a64(env) ? env->pc : env->regs[15];
+
+    /*
+     * Get the faulting instruction
+     * Instructions have a fixed length of 32 bits
+     * and are always little-endian
+     */
+    ins_fetched = !cpu_memory_rw_debug(cs, ins_addr, (uint8_t *) &ins,
+                                       sizeof(uint32_t), 0);
+
+    error_report("Data abort exception with no valid ISS generated by "
+                 "guest memory access at physical address: 0x" TARGET_FMT_lx,
+                 (target_ulong)fault_ipa);
+
+    error_printf(ins_fetched ? "%s : 0x%" PRIx32 " (ins encoded)\n"  : "%s\n",
+                 "KVM unable to emulate faulting instruction",
+                 (!ins_fetched ? 0 : ldl_le_p(&ins)));
+    error_printf("Injecting a data abort exception into guest.\n");
+    /*
+     * Set pending ext dabt and trigger SET_EVENTS so that
+     * KVM can inject the abort
+     */
+    env->ext_dabt_pending = 1;
+
+    return 0;
+}
diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
index cf2b47f..177402e 100644
--- a/target/arm/kvm32.c
+++ b/target/arm/kvm32.c
@@ -240,6 +240,9 @@ int kvm_arch_init_vcpu(CPUState *cs)
     /* Check whether userspace can specify guest syndrome value */
     kvm_arm_init_serror_injection(cs);
 
+    /* Set status for supporting the external dabt injection */
+    kvm_arm_init_ext_dabt_injection(cs);
+
     return kvm_arm_init_cpreg_list(cpu);
 }
 
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index d06fd32..a474369 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -791,6 +791,9 @@ int kvm_arch_init_vcpu(CPUState *cs)
     /* Check whether user space can specify guest syndrome value */
     kvm_arm_init_serror_injection(cs);
 
+    /* Set status for supporting the external dabt injection */
+    kvm_arm_init_ext_dabt_injection(cs);
+
     return kvm_arm_init_cpreg_list(cpu);
 }
 
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index 8e14d40..fe019f2 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -144,6 +144,14 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu);
 void kvm_arm_init_serror_injection(CPUState *cs);
 
 /**
+ * kvm_arm_init_ext_dabt_injection
+ * @cs: CPUState
+ *
+ * Set status for KVM support for Ext DABT injection
+ */
+void kvm_arm_init_ext_dabt_injection(CPUState *cs);
+
+/**
  * kvm_get_vcpu_events:
  * @cpu: ARMCPU
  *
@@ -372,6 +380,17 @@ static inline const char *gicv3_class_name(void)
 bool kvm_arm_handle_debug(CPUState *cs, struct kvm_debug_exit_arch *debug_exit);
 
 /**
+ * kvm_arm_handle_dabt_nisv
+ * @cs: CPUState
+ * @esr_iss: ISS encoding (limited) for the exception from Data Abort
+ *           ISV bit set to '0b0' -> no valid instruction syndrome
+ * @fault_ipa: faulting address for the synch data abort
+ *
+ * Returns: 0 if the exception has been handled
+ */
+int kvm_arm_handle_dabt_nisv(CPUState *cs, uint64_t esr_iss,
+                             uint64_t fault_ipa);
+/**
  * kvm_arm_hw_debug_active:
  * @cs: CPU State
  *
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 2/2] target/arm: kvm: Handle DABT with no valid ISS
@ 2020-01-29 20:24   ` Beata Michalska
  0 siblings, 0 replies; 20+ messages in thread
From: Beata Michalska @ 2020-01-29 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, kvmarm, pbonzini

On ARMv7 & ARMv8 some load/store instructions might trigger a data abort
exception with no valid ISS info to be decoded. The lack of decode info
makes it at least tricky to emulate those instruction which is one of the
(many) reasons why KVM will not even try to do so.

Add support for handling those by requesting KVM to inject external
dabt into the quest.

Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
---
 target/arm/cpu.h     |  2 ++
 target/arm/kvm.c     | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/kvm32.c   |  3 ++
 target/arm/kvm64.c   |  3 ++
 target/arm/kvm_arm.h | 19 +++++++++++
 5 files changed, 123 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index c1aedbe..e04a8d3 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -558,6 +558,8 @@ typedef struct CPUARMState {
         uint8_t has_esr;
         uint64_t esr;
     } serror;
+    /* Status field for pending external dabt */
+    uint8_t ext_dabt_pending;
 
     /* State of our input IRQ/FIQ/VIRQ/VFIQ lines */
     uint32_t irq_line_state;
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 8d82889..e7bc9b7 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -37,6 +37,7 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
 
 static bool cap_has_mp_state;
 static bool cap_has_inject_serror_esr;
+static bool cap_has_inject_ext_dabt; /* KVM_CAP_ARM_INJECT_EXT_DABT */
 
 static ARMHostCPUFeatures arm_host_cpu_features;
 
@@ -62,6 +63,12 @@ void kvm_arm_init_serror_injection(CPUState *cs)
                                     KVM_CAP_ARM_INJECT_SERROR_ESR);
 }
 
+void kvm_arm_init_ext_dabt_injection(CPUState *cs)
+{
+    cap_has_inject_ext_dabt = kvm_check_extension(cs->kvm_state,
+                                    KVM_CAP_ARM_INJECT_EXT_DABT);
+}
+
 bool kvm_arm_create_scratch_host_vcpu(const uint32_t *cpus_to_try,
                                       int *fdarray,
                                       struct kvm_vcpu_init *init)
@@ -216,6 +223,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
         ret = -EINVAL;
     }
 
+    if (kvm_check_extension(s, KVM_CAP_ARM_NISV_TO_USER))
+        if (kvm_vm_enable_cap(s, KVM_CAP_ARM_NISV_TO_USER, 0)) {
+            warn_report("Failed to enable DABT NISV cap");
+        }
+
     return ret;
 }
 
@@ -598,6 +610,10 @@ int kvm_put_vcpu_events(ARMCPU *cpu)
         events.exception.serror_esr = env->serror.esr;
     }
 
+    if (cap_has_inject_ext_dabt) {
+        events.exception.ext_dabt_pending = env->ext_dabt_pending;
+    }
+
     ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
     if (ret) {
         error_report("failed to put vcpu events");
@@ -627,6 +643,8 @@ int kvm_get_vcpu_events(ARMCPU *cpu)
     env->serror.has_esr = events.exception.serror_has_esr;
     env->serror.esr = events.exception.serror_esr;
 
+    env->ext_dabt_pending = events.exception.ext_dabt_pending;
+
     return 0;
 }
 
@@ -634,6 +652,7 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
 {
 }
 
+
 MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
 {
     ARMCPU *cpu;
@@ -699,6 +718,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
             ret = EXCP_DEBUG;
         } /* otherwise return to guest */
         break;
+    case KVM_EXIT_ARM_NISV:
+        /* External DABT with no valid iss to decode */
+        ret = kvm_arm_handle_dabt_nisv(cs, run->arm_nisv.esr_iss,
+                                       run->arm_nisv.fault_ipa);
+        break;
     default:
         qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n",
                       __func__, run->exit_reason);
@@ -833,3 +857,75 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
 {
     return (data - 32) & 0xffff;
 }
+
+int kvm_arm_handle_dabt_nisv(CPUState *cs, uint64_t esr_iss,
+                             uint64_t fault_ipa)
+{
+    ARMCPU *cpu = ARM_CPU(cs);
+    CPUARMState *env = &cpu->env;
+    uint32_t ins, ins_fetched;
+
+    /*
+     * Hacky workaround for kernels that for aarch32 guests, instead of expected
+     * external data abort, inject the IMPLEMENTATION DEFINED exception with the
+     * lock-down. This is actually handled by the guest which results in
+     * re-running the faulting instruction.
+     * This intends to break the vicious cycle.
+     */
+    if (!is_a64(env)) {
+        static uint8_t setback;
+
+        /*
+         * The state has not been synchronized yet, so if this is re-occurrence
+         * of the same abort triggered by guest, the status for pending external
+         * abort should not get cleared yet
+         */
+        if (unlikely(env->ext_dabt_pending)) {
+            if (setback) {
+                error_report("Most probably triggered kernel issue with"
+                             " injecting external data abort.");
+                error_printf("Giving up trying ...\n");
+                /* Here is where the fun comes to an end */
+                return -EINVAL;
+            }
+        }
+        setback = env->ext_dabt_pending;
+    }
+
+    kvm_cpu_synchronize_state(cs);
+   /*
+    * ISS [23:14] is invalid so there is a limited info
+    * on what has just happened so the only *useful* thing that can
+    * be retrieved from ISS is WnR & DFSC (though in some cases WnR
+    * might be less of a value as well)
+    */
+
+    /*
+     * Get current PC before it will get updated to exception vector entry
+     */
+    target_ulong ins_addr = is_a64(env) ? env->pc : env->regs[15];
+
+    /*
+     * Get the faulting instruction
+     * Instructions have a fixed length of 32 bits
+     * and are always little-endian
+     */
+    ins_fetched = !cpu_memory_rw_debug(cs, ins_addr, (uint8_t *) &ins,
+                                       sizeof(uint32_t), 0);
+
+    error_report("Data abort exception with no valid ISS generated by "
+                 "guest memory access at physical address: 0x" TARGET_FMT_lx,
+                 (target_ulong)fault_ipa);
+
+    error_printf(ins_fetched ? "%s : 0x%" PRIx32 " (ins encoded)\n"  : "%s\n",
+                 "KVM unable to emulate faulting instruction",
+                 (!ins_fetched ? 0 : ldl_le_p(&ins)));
+    error_printf("Injecting a data abort exception into guest.\n");
+    /*
+     * Set pending ext dabt and trigger SET_EVENTS so that
+     * KVM can inject the abort
+     */
+    env->ext_dabt_pending = 1;
+
+    return 0;
+}
diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
index cf2b47f..177402e 100644
--- a/target/arm/kvm32.c
+++ b/target/arm/kvm32.c
@@ -240,6 +240,9 @@ int kvm_arch_init_vcpu(CPUState *cs)
     /* Check whether userspace can specify guest syndrome value */
     kvm_arm_init_serror_injection(cs);
 
+    /* Set status for supporting the external dabt injection */
+    kvm_arm_init_ext_dabt_injection(cs);
+
     return kvm_arm_init_cpreg_list(cpu);
 }
 
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index d06fd32..a474369 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -791,6 +791,9 @@ int kvm_arch_init_vcpu(CPUState *cs)
     /* Check whether user space can specify guest syndrome value */
     kvm_arm_init_serror_injection(cs);
 
+    /* Set status for supporting the external dabt injection */
+    kvm_arm_init_ext_dabt_injection(cs);
+
     return kvm_arm_init_cpreg_list(cpu);
 }
 
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index 8e14d40..fe019f2 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -144,6 +144,14 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu);
 void kvm_arm_init_serror_injection(CPUState *cs);
 
 /**
+ * kvm_arm_init_ext_dabt_injection
+ * @cs: CPUState
+ *
+ * Set status for KVM support for Ext DABT injection
+ */
+void kvm_arm_init_ext_dabt_injection(CPUState *cs);
+
+/**
  * kvm_get_vcpu_events:
  * @cpu: ARMCPU
  *
@@ -372,6 +380,17 @@ static inline const char *gicv3_class_name(void)
 bool kvm_arm_handle_debug(CPUState *cs, struct kvm_debug_exit_arch *debug_exit);
 
 /**
+ * kvm_arm_handle_dabt_nisv
+ * @cs: CPUState
+ * @esr_iss: ISS encoding (limited) for the exception from Data Abort
+ *           ISV bit set to '0b0' -> no valid instruction syndrome
+ * @fault_ipa: faulting address for the synch data abort
+ *
+ * Returns: 0 if the exception has been handled
+ */
+int kvm_arm_handle_dabt_nisv(CPUState *cs, uint64_t esr_iss,
+                             uint64_t fault_ipa);
+/**
  * kvm_arm_hw_debug_active:
  * @cs: CPU State
  *
-- 
2.7.4

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] target/arm: kvm: Inject events at the last stage of sync
  2020-01-29 20:24   ` Beata Michalska
@ 2020-02-04 10:34     ` Andrew Jones
  -1 siblings, 0 replies; 20+ messages in thread
From: Andrew Jones @ 2020-02-04 10:34 UTC (permalink / raw)
  To: Beata Michalska
  Cc: peter.maydell, Christoffer.Dall, qemu-devel, qemu-arm, pbonzini, kvmarm

On Wed, Jan 29, 2020 at 08:24:40PM +0000, Beata Michalska wrote:
> KVM_SET_VCPU_EVENTS might actually lead to vcpu registers being modified.
> As such this should be the last step of sync to avoid potential overwriting
> of whatever changes KVM might have done.
> 
> Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> ---
>  target/arm/kvm32.c | 20 ++++++++++----------
>  target/arm/kvm64.c | 20 ++++++++++----------
>  2 files changed, 20 insertions(+), 20 deletions(-)
> 
> diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
> index 32bf8d6..cf2b47f 100644
> --- a/target/arm/kvm32.c
> +++ b/target/arm/kvm32.c
> @@ -386,17 +386,17 @@ int kvm_arch_put_registers(CPUState *cs, int level)
>          return ret;
>      }
>  
> -    ret = kvm_put_vcpu_events(cpu);
> -    if (ret) {
> -        return ret;
> -    }
> -
>      write_cpustate_to_list(cpu, true);
>  
>      if (!write_list_to_kvmstate(cpu, level)) {
>          return EINVAL;
>      }
>  
> +    ret = kvm_put_vcpu_events(cpu);
> +    if (ret) {
> +        return ret;
> +    }
> +

I think we should put a comment above this that says basically the same
thing as the commit message in order to explain why kvm_put_vcpu_events()
*must* be after write_list_to_kvmstate().

>      kvm_arm_sync_mpstate_to_kvm(cpu);
>  
>      return ret;
> @@ -462,11 +462,6 @@ int kvm_arch_get_registers(CPUState *cs)
>      }
>      vfp_set_fpscr(env, fpscr);
>  
> -    ret = kvm_get_vcpu_events(cpu);
> -    if (ret) {
> -        return ret;
> -    }
> -
>      if (!write_kvmstate_to_list(cpu)) {
>          return EINVAL;
>      }
> @@ -475,6 +470,11 @@ int kvm_arch_get_registers(CPUState *cs)
>       */
>      write_list_to_cpustate(cpu);
>  
> +    ret = kvm_get_vcpu_events(cpu);
> +    if (ret) {
> +        return ret;
> +    }
> +

Why are we moving kvm_get_vcpu_events()?

>      kvm_arm_sync_mpstate_to_qemu(cpu);
>  
>      return 0;
> diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
> index 6344113..d06fd32 100644
> --- a/target/arm/kvm64.c
> +++ b/target/arm/kvm64.c
> @@ -1043,17 +1043,17 @@ int kvm_arch_put_registers(CPUState *cs, int level)
>          return ret;
>      }
>  
> -    ret = kvm_put_vcpu_events(cpu);
> -    if (ret) {
> -        return ret;
> -    }
> -
>      write_cpustate_to_list(cpu, true);
>  
>      if (!write_list_to_kvmstate(cpu, level)) {
>          return -EINVAL;
>      }
>  
> +    ret = kvm_put_vcpu_events(cpu);
> +    if (ret) {
> +        return ret;
> +    }
> +
>      kvm_arm_sync_mpstate_to_kvm(cpu);
>  
>      return ret;
> @@ -1251,11 +1251,6 @@ int kvm_arch_get_registers(CPUState *cs)
>      }
>      vfp_set_fpcr(env, fpr);
>  
> -    ret = kvm_get_vcpu_events(cpu);
> -    if (ret) {
> -        return ret;
> -    }
> -
>      if (!write_kvmstate_to_list(cpu)) {
>          return -EINVAL;
>      }
> @@ -1264,6 +1259,11 @@ int kvm_arch_get_registers(CPUState *cs)
>       */
>      write_list_to_cpustate(cpu);
>  
> +    ret = kvm_get_vcpu_events(cpu);
> +    if (ret) {
> +        return ret;
> +    }
> +
>      kvm_arm_sync_mpstate_to_qemu(cpu);
>  
>      /* TODO: other registers */
> -- 
> 2.7.4
> 
> 

Same comments for kvm64.c as for kvm32.c

Thanks,
drew



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] target/arm: kvm: Inject events at the last stage of sync
@ 2020-02-04 10:34     ` Andrew Jones
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Jones @ 2020-02-04 10:34 UTC (permalink / raw)
  To: Beata Michalska; +Cc: qemu-devel, qemu-arm, pbonzini, kvmarm

On Wed, Jan 29, 2020 at 08:24:40PM +0000, Beata Michalska wrote:
> KVM_SET_VCPU_EVENTS might actually lead to vcpu registers being modified.
> As such this should be the last step of sync to avoid potential overwriting
> of whatever changes KVM might have done.
> 
> Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> ---
>  target/arm/kvm32.c | 20 ++++++++++----------
>  target/arm/kvm64.c | 20 ++++++++++----------
>  2 files changed, 20 insertions(+), 20 deletions(-)
> 
> diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
> index 32bf8d6..cf2b47f 100644
> --- a/target/arm/kvm32.c
> +++ b/target/arm/kvm32.c
> @@ -386,17 +386,17 @@ int kvm_arch_put_registers(CPUState *cs, int level)
>          return ret;
>      }
>  
> -    ret = kvm_put_vcpu_events(cpu);
> -    if (ret) {
> -        return ret;
> -    }
> -
>      write_cpustate_to_list(cpu, true);
>  
>      if (!write_list_to_kvmstate(cpu, level)) {
>          return EINVAL;
>      }
>  
> +    ret = kvm_put_vcpu_events(cpu);
> +    if (ret) {
> +        return ret;
> +    }
> +

I think we should put a comment above this that says basically the same
thing as the commit message in order to explain why kvm_put_vcpu_events()
*must* be after write_list_to_kvmstate().

>      kvm_arm_sync_mpstate_to_kvm(cpu);
>  
>      return ret;
> @@ -462,11 +462,6 @@ int kvm_arch_get_registers(CPUState *cs)
>      }
>      vfp_set_fpscr(env, fpscr);
>  
> -    ret = kvm_get_vcpu_events(cpu);
> -    if (ret) {
> -        return ret;
> -    }
> -
>      if (!write_kvmstate_to_list(cpu)) {
>          return EINVAL;
>      }
> @@ -475,6 +470,11 @@ int kvm_arch_get_registers(CPUState *cs)
>       */
>      write_list_to_cpustate(cpu);
>  
> +    ret = kvm_get_vcpu_events(cpu);
> +    if (ret) {
> +        return ret;
> +    }
> +

Why are we moving kvm_get_vcpu_events()?

>      kvm_arm_sync_mpstate_to_qemu(cpu);
>  
>      return 0;
> diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
> index 6344113..d06fd32 100644
> --- a/target/arm/kvm64.c
> +++ b/target/arm/kvm64.c
> @@ -1043,17 +1043,17 @@ int kvm_arch_put_registers(CPUState *cs, int level)
>          return ret;
>      }
>  
> -    ret = kvm_put_vcpu_events(cpu);
> -    if (ret) {
> -        return ret;
> -    }
> -
>      write_cpustate_to_list(cpu, true);
>  
>      if (!write_list_to_kvmstate(cpu, level)) {
>          return -EINVAL;
>      }
>  
> +    ret = kvm_put_vcpu_events(cpu);
> +    if (ret) {
> +        return ret;
> +    }
> +
>      kvm_arm_sync_mpstate_to_kvm(cpu);
>  
>      return ret;
> @@ -1251,11 +1251,6 @@ int kvm_arch_get_registers(CPUState *cs)
>      }
>      vfp_set_fpcr(env, fpr);
>  
> -    ret = kvm_get_vcpu_events(cpu);
> -    if (ret) {
> -        return ret;
> -    }
> -
>      if (!write_kvmstate_to_list(cpu)) {
>          return -EINVAL;
>      }
> @@ -1264,6 +1259,11 @@ int kvm_arch_get_registers(CPUState *cs)
>       */
>      write_list_to_cpustate(cpu);
>  
> +    ret = kvm_get_vcpu_events(cpu);
> +    if (ret) {
> +        return ret;
> +    }
> +
>      kvm_arm_sync_mpstate_to_qemu(cpu);
>  
>      /* TODO: other registers */
> -- 
> 2.7.4
> 
> 

Same comments for kvm64.c as for kvm32.c

Thanks,
drew

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/2] target/arm: kvm: Handle DABT with no valid ISS
  2020-01-29 20:24   ` Beata Michalska
@ 2020-02-05 16:57     ` Andrew Jones
  -1 siblings, 0 replies; 20+ messages in thread
From: Andrew Jones @ 2020-02-05 16:57 UTC (permalink / raw)
  To: Beata Michalska
  Cc: peter.maydell, Christoffer.Dall, qemu-devel, qemu-arm, pbonzini, kvmarm

On Wed, Jan 29, 2020 at 08:24:41PM +0000, Beata Michalska wrote:
> On ARMv7 & ARMv8 some load/store instructions might trigger a data abort
> exception with no valid ISS info to be decoded. The lack of decode info
> makes it at least tricky to emulate those instruction which is one of the
> (many) reasons why KVM will not even try to do so.
> 
> Add support for handling those by requesting KVM to inject external
> dabt into the quest.
> 
> Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> ---
>  target/arm/cpu.h     |  2 ++
>  target/arm/kvm.c     | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  target/arm/kvm32.c   |  3 ++
>  target/arm/kvm64.c   |  3 ++
>  target/arm/kvm_arm.h | 19 +++++++++++
>  5 files changed, 123 insertions(+)
> 
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index c1aedbe..e04a8d3 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -558,6 +558,8 @@ typedef struct CPUARMState {
>          uint8_t has_esr;
>          uint64_t esr;
>      } serror;
> +    /* Status field for pending external dabt */
> +    uint8_t ext_dabt_pending;
>  
>      /* State of our input IRQ/FIQ/VIRQ/VFIQ lines */
>      uint32_t irq_line_state;
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index 8d82889..e7bc9b7 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -37,6 +37,7 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
>  
>  static bool cap_has_mp_state;
>  static bool cap_has_inject_serror_esr;
> +static bool cap_has_inject_ext_dabt; /* KVM_CAP_ARM_INJECT_EXT_DABT */

nit: the KVM_CAP_ARM_INJECT_EXT_DABT comment is unnecessary

>  
>  static ARMHostCPUFeatures arm_host_cpu_features;
>  
> @@ -62,6 +63,12 @@ void kvm_arm_init_serror_injection(CPUState *cs)
>                                      KVM_CAP_ARM_INJECT_SERROR_ESR);
>  }
>  
> +void kvm_arm_init_ext_dabt_injection(CPUState *cs)
> +{
> +    cap_has_inject_ext_dabt = kvm_check_extension(cs->kvm_state,
> +                                    KVM_CAP_ARM_INJECT_EXT_DABT);
> +}
> +
>  bool kvm_arm_create_scratch_host_vcpu(const uint32_t *cpus_to_try,
>                                        int *fdarray,
>                                        struct kvm_vcpu_init *init)
> @@ -216,6 +223,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>          ret = -EINVAL;
>      }
>  
> +    if (kvm_check_extension(s, KVM_CAP_ARM_NISV_TO_USER))
> +        if (kvm_vm_enable_cap(s, KVM_CAP_ARM_NISV_TO_USER, 0)) {
> +            warn_report("Failed to enable DABT NISV cap");
> +        }
> +

Missing {} around the outer block.

As KVM_CAP_ARM_INJECT_EXT_DABT is a VM capability then I think we should
set cap_has_inject_ext_dabt here, like cap_has_mp_state gets set. I see
you've followed the pattern used for cap_has_inject_serror_esr, but that
looks wrong too since KVM_CAP_ARM_INJECT_SERROR_ESR is also a VM
capability. The way it is now we just keep setting
cap_has_inject_serror_esr to the same value, NR_VCPUS times.

>      return ret;
>  }
>  
> @@ -598,6 +610,10 @@ int kvm_put_vcpu_events(ARMCPU *cpu)
>          events.exception.serror_esr = env->serror.esr;
>      }
>  
> +    if (cap_has_inject_ext_dabt) {
> +        events.exception.ext_dabt_pending = env->ext_dabt_pending;
> +    }
> +
>      ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
>      if (ret) {
>          error_report("failed to put vcpu events");
> @@ -627,6 +643,8 @@ int kvm_get_vcpu_events(ARMCPU *cpu)
>      env->serror.has_esr = events.exception.serror_has_esr;
>      env->serror.esr = events.exception.serror_esr;
>  
> +    env->ext_dabt_pending = events.exception.ext_dabt_pending;
> +

afaict from Documentation/virt/kvm/api.txt and the KVM code you cannot
get this state. Therefore the above line (and extra stray blank line)
should be dropped.

>      return 0;
>  }
>  
> @@ -634,6 +652,7 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
>  {
>  }
>  
> +

stray blank line

>  MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
>  {
>      ARMCPU *cpu;
> @@ -699,6 +718,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>              ret = EXCP_DEBUG;
>          } /* otherwise return to guest */
>          break;
> +    case KVM_EXIT_ARM_NISV:
> +        /* External DABT with no valid iss to decode */
> +        ret = kvm_arm_handle_dabt_nisv(cs, run->arm_nisv.esr_iss,
> +                                       run->arm_nisv.fault_ipa);
> +        break;
>      default:
>          qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n",
>                        __func__, run->exit_reason);
> @@ -833,3 +857,75 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
>  {
>      return (data - 32) & 0xffff;
>  }
> +
> +int kvm_arm_handle_dabt_nisv(CPUState *cs, uint64_t esr_iss,
> +                             uint64_t fault_ipa)
> +{
> +    ARMCPU *cpu = ARM_CPU(cs);
> +    CPUARMState *env = &cpu->env;
> +    uint32_t ins, ins_fetched;

ins_fetched is a bool

> +
> +    /*
> +     * Hacky workaround for kernels that for aarch32 guests, instead of expected
> +     * external data abort, inject the IMPLEMENTATION DEFINED exception with the
> +     * lock-down. This is actually handled by the guest which results in
> +     * re-running the faulting instruction.
> +     * This intends to break the vicious cycle.
> +     */

This doesn't seem like the right thing to do. What happens on real
hardware in this case? If an OS would get into a "vicious cycle" on
real hardware then it should get into one on KVM too.

> +    if (!is_a64(env)) {
> +        static uint8_t setback;
> +
> +        /*
> +         * The state has not been synchronized yet, so if this is re-occurrence
> +         * of the same abort triggered by guest, the status for pending external
> +         * abort should not get cleared yet
> +         */
> +        if (unlikely(env->ext_dabt_pending)) {
> +            if (setback) {
> +                error_report("Most probably triggered kernel issue with"
> +                             " injecting external data abort.");
> +                error_printf("Giving up trying ...\n");
> +                /* Here is where the fun comes to an end */
> +                return -EINVAL;
> +            }
> +        }
> +        setback = env->ext_dabt_pending;
> +    }
> +
> +    kvm_cpu_synchronize_state(cs);
> +   /*
> +    * ISS [23:14] is invalid so there is a limited info
> +    * on what has just happened so the only *useful* thing that can
> +    * be retrieved from ISS is WnR & DFSC (though in some cases WnR
> +    * might be less of a value as well)
> +    */
> +
> +    /*
> +     * Get current PC before it will get updated to exception vector entry
> +     */
> +    target_ulong ins_addr = is_a64(env) ? env->pc : env->regs[15];

ins_addr should be declared above

But what are we doing? pc is a guest virtual address. Oh, I see...

> +
> +    /*
> +     * Get the faulting instruction
> +     * Instructions have a fixed length of 32 bits
> +     * and are always little-endian
> +     */
> +    ins_fetched = !cpu_memory_rw_debug(cs, ins_addr, (uint8_t *) &ins,
> +                                       sizeof(uint32_t), 0);

... we're trying to actual walk the KVM guest's page tables. That
seems a bit odd, and the "_debug" function name used for it makes
it seem even more odd.

> +
> +    error_report("Data abort exception with no valid ISS generated by "
> +                 "guest memory access at physical address: 0x" TARGET_FMT_lx,
> +                 (target_ulong)fault_ipa);
> +
> +    error_printf(ins_fetched ? "%s : 0x%" PRIx32 " (ins encoded)\n"  : "%s\n",
> +                 "KVM unable to emulate faulting instruction",
> +                 (!ins_fetched ? 0 : ldl_le_p(&ins)));
> +    error_printf("Injecting a data abort exception into guest.\n");

The guest shouldn't be able to generate three lines of errors on the host
whenever it wants. That's a security bug. One trace point here seems like
the most we should do. Or, after these three lines we should kill the
guest.

Actually, I don't think we should attempt to get the instruction at all.
We should do what the KVM documenation suggests we do when we get
this exit. We should either do a user selected action: one of suspend,
dump, restart, or inject a dabt (as is done below). The last choice hopes
that the guest will handle it in some graceful way, or that it'll crash
with all the information needed for a post-mortem crash investigation.

And I don't think we should do the "very brave" option of trying to
emulate the instruction, even if we identify it as a valid MMIO address.
That just sounds like a huge can of worms.

Does QEMU already have a way for users to select a
don't-know-how-to-handle-guest-exception behavior?

> +    /*
> +     * Set pending ext dabt and trigger SET_EVENTS so that
> +     * KVM can inject the abort
> +     */
> +    env->ext_dabt_pending = 1;
> +
> +    return 0;
> +}

Thanks,
drew



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/2] target/arm: kvm: Handle DABT with no valid ISS
@ 2020-02-05 16:57     ` Andrew Jones
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Jones @ 2020-02-05 16:57 UTC (permalink / raw)
  To: Beata Michalska; +Cc: qemu-devel, qemu-arm, pbonzini, kvmarm

On Wed, Jan 29, 2020 at 08:24:41PM +0000, Beata Michalska wrote:
> On ARMv7 & ARMv8 some load/store instructions might trigger a data abort
> exception with no valid ISS info to be decoded. The lack of decode info
> makes it at least tricky to emulate those instruction which is one of the
> (many) reasons why KVM will not even try to do so.
> 
> Add support for handling those by requesting KVM to inject external
> dabt into the quest.
> 
> Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> ---
>  target/arm/cpu.h     |  2 ++
>  target/arm/kvm.c     | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  target/arm/kvm32.c   |  3 ++
>  target/arm/kvm64.c   |  3 ++
>  target/arm/kvm_arm.h | 19 +++++++++++
>  5 files changed, 123 insertions(+)
> 
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index c1aedbe..e04a8d3 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -558,6 +558,8 @@ typedef struct CPUARMState {
>          uint8_t has_esr;
>          uint64_t esr;
>      } serror;
> +    /* Status field for pending external dabt */
> +    uint8_t ext_dabt_pending;
>  
>      /* State of our input IRQ/FIQ/VIRQ/VFIQ lines */
>      uint32_t irq_line_state;
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index 8d82889..e7bc9b7 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -37,6 +37,7 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
>  
>  static bool cap_has_mp_state;
>  static bool cap_has_inject_serror_esr;
> +static bool cap_has_inject_ext_dabt; /* KVM_CAP_ARM_INJECT_EXT_DABT */

nit: the KVM_CAP_ARM_INJECT_EXT_DABT comment is unnecessary

>  
>  static ARMHostCPUFeatures arm_host_cpu_features;
>  
> @@ -62,6 +63,12 @@ void kvm_arm_init_serror_injection(CPUState *cs)
>                                      KVM_CAP_ARM_INJECT_SERROR_ESR);
>  }
>  
> +void kvm_arm_init_ext_dabt_injection(CPUState *cs)
> +{
> +    cap_has_inject_ext_dabt = kvm_check_extension(cs->kvm_state,
> +                                    KVM_CAP_ARM_INJECT_EXT_DABT);
> +}
> +
>  bool kvm_arm_create_scratch_host_vcpu(const uint32_t *cpus_to_try,
>                                        int *fdarray,
>                                        struct kvm_vcpu_init *init)
> @@ -216,6 +223,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>          ret = -EINVAL;
>      }
>  
> +    if (kvm_check_extension(s, KVM_CAP_ARM_NISV_TO_USER))
> +        if (kvm_vm_enable_cap(s, KVM_CAP_ARM_NISV_TO_USER, 0)) {
> +            warn_report("Failed to enable DABT NISV cap");
> +        }
> +

Missing {} around the outer block.

As KVM_CAP_ARM_INJECT_EXT_DABT is a VM capability then I think we should
set cap_has_inject_ext_dabt here, like cap_has_mp_state gets set. I see
you've followed the pattern used for cap_has_inject_serror_esr, but that
looks wrong too since KVM_CAP_ARM_INJECT_SERROR_ESR is also a VM
capability. The way it is now we just keep setting
cap_has_inject_serror_esr to the same value, NR_VCPUS times.

>      return ret;
>  }
>  
> @@ -598,6 +610,10 @@ int kvm_put_vcpu_events(ARMCPU *cpu)
>          events.exception.serror_esr = env->serror.esr;
>      }
>  
> +    if (cap_has_inject_ext_dabt) {
> +        events.exception.ext_dabt_pending = env->ext_dabt_pending;
> +    }
> +
>      ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
>      if (ret) {
>          error_report("failed to put vcpu events");
> @@ -627,6 +643,8 @@ int kvm_get_vcpu_events(ARMCPU *cpu)
>      env->serror.has_esr = events.exception.serror_has_esr;
>      env->serror.esr = events.exception.serror_esr;
>  
> +    env->ext_dabt_pending = events.exception.ext_dabt_pending;
> +

afaict from Documentation/virt/kvm/api.txt and the KVM code you cannot
get this state. Therefore the above line (and extra stray blank line)
should be dropped.

>      return 0;
>  }
>  
> @@ -634,6 +652,7 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
>  {
>  }
>  
> +

stray blank line

>  MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
>  {
>      ARMCPU *cpu;
> @@ -699,6 +718,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>              ret = EXCP_DEBUG;
>          } /* otherwise return to guest */
>          break;
> +    case KVM_EXIT_ARM_NISV:
> +        /* External DABT with no valid iss to decode */
> +        ret = kvm_arm_handle_dabt_nisv(cs, run->arm_nisv.esr_iss,
> +                                       run->arm_nisv.fault_ipa);
> +        break;
>      default:
>          qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n",
>                        __func__, run->exit_reason);
> @@ -833,3 +857,75 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
>  {
>      return (data - 32) & 0xffff;
>  }
> +
> +int kvm_arm_handle_dabt_nisv(CPUState *cs, uint64_t esr_iss,
> +                             uint64_t fault_ipa)
> +{
> +    ARMCPU *cpu = ARM_CPU(cs);
> +    CPUARMState *env = &cpu->env;
> +    uint32_t ins, ins_fetched;

ins_fetched is a bool

> +
> +    /*
> +     * Hacky workaround for kernels that for aarch32 guests, instead of expected
> +     * external data abort, inject the IMPLEMENTATION DEFINED exception with the
> +     * lock-down. This is actually handled by the guest which results in
> +     * re-running the faulting instruction.
> +     * This intends to break the vicious cycle.
> +     */

This doesn't seem like the right thing to do. What happens on real
hardware in this case? If an OS would get into a "vicious cycle" on
real hardware then it should get into one on KVM too.

> +    if (!is_a64(env)) {
> +        static uint8_t setback;
> +
> +        /*
> +         * The state has not been synchronized yet, so if this is re-occurrence
> +         * of the same abort triggered by guest, the status for pending external
> +         * abort should not get cleared yet
> +         */
> +        if (unlikely(env->ext_dabt_pending)) {
> +            if (setback) {
> +                error_report("Most probably triggered kernel issue with"
> +                             " injecting external data abort.");
> +                error_printf("Giving up trying ...\n");
> +                /* Here is where the fun comes to an end */
> +                return -EINVAL;
> +            }
> +        }
> +        setback = env->ext_dabt_pending;
> +    }
> +
> +    kvm_cpu_synchronize_state(cs);
> +   /*
> +    * ISS [23:14] is invalid so there is a limited info
> +    * on what has just happened so the only *useful* thing that can
> +    * be retrieved from ISS is WnR & DFSC (though in some cases WnR
> +    * might be less of a value as well)
> +    */
> +
> +    /*
> +     * Get current PC before it will get updated to exception vector entry
> +     */
> +    target_ulong ins_addr = is_a64(env) ? env->pc : env->regs[15];

ins_addr should be declared above

But what are we doing? pc is a guest virtual address. Oh, I see...

> +
> +    /*
> +     * Get the faulting instruction
> +     * Instructions have a fixed length of 32 bits
> +     * and are always little-endian
> +     */
> +    ins_fetched = !cpu_memory_rw_debug(cs, ins_addr, (uint8_t *) &ins,
> +                                       sizeof(uint32_t), 0);

... we're trying to actual walk the KVM guest's page tables. That
seems a bit odd, and the "_debug" function name used for it makes
it seem even more odd.

> +
> +    error_report("Data abort exception with no valid ISS generated by "
> +                 "guest memory access at physical address: 0x" TARGET_FMT_lx,
> +                 (target_ulong)fault_ipa);
> +
> +    error_printf(ins_fetched ? "%s : 0x%" PRIx32 " (ins encoded)\n"  : "%s\n",
> +                 "KVM unable to emulate faulting instruction",
> +                 (!ins_fetched ? 0 : ldl_le_p(&ins)));
> +    error_printf("Injecting a data abort exception into guest.\n");

The guest shouldn't be able to generate three lines of errors on the host
whenever it wants. That's a security bug. One trace point here seems like
the most we should do. Or, after these three lines we should kill the
guest.

Actually, I don't think we should attempt to get the instruction at all.
We should do what the KVM documenation suggests we do when we get
this exit. We should either do a user selected action: one of suspend,
dump, restart, or inject a dabt (as is done below). The last choice hopes
that the guest will handle it in some graceful way, or that it'll crash
with all the information needed for a post-mortem crash investigation.

And I don't think we should do the "very brave" option of trying to
emulate the instruction, even if we identify it as a valid MMIO address.
That just sounds like a huge can of worms.

Does QEMU already have a way for users to select a
don't-know-how-to-handle-guest-exception behavior?

> +    /*
> +     * Set pending ext dabt and trigger SET_EVENTS so that
> +     * KVM can inject the abort
> +     */
> +    env->ext_dabt_pending = 1;
> +
> +    return 0;
> +}

Thanks,
drew

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] target/arm: kvm: Inject events at the last stage of sync
  2020-02-04 10:34     ` Andrew Jones
@ 2020-02-06 21:41       ` Beata Michalska
  -1 siblings, 0 replies; 20+ messages in thread
From: Beata Michalska @ 2020-02-06 21:41 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Peter Maydell, Christoffer Dall, QEMU Developers, qemu-arm,
	Paolo Bonzini, kvmarm

On Tue, 4 Feb 2020 at 10:34, Andrew Jones <drjones@redhat.com> wrote:
>
> On Wed, Jan 29, 2020 at 08:24:40PM +0000, Beata Michalska wrote:
> > KVM_SET_VCPU_EVENTS might actually lead to vcpu registers being modified.
> > As such this should be the last step of sync to avoid potential overwriting
> > of whatever changes KVM might have done.
> >
> > Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> > ---
> >  target/arm/kvm32.c | 20 ++++++++++----------
> >  target/arm/kvm64.c | 20 ++++++++++----------
> >  2 files changed, 20 insertions(+), 20 deletions(-)
> >
> > diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
> > index 32bf8d6..cf2b47f 100644
> > --- a/target/arm/kvm32.c
> > +++ b/target/arm/kvm32.c
> > @@ -386,17 +386,17 @@ int kvm_arch_put_registers(CPUState *cs, int level)
> >          return ret;
> >      }
> >
> > -    ret = kvm_put_vcpu_events(cpu);
> > -    if (ret) {
> > -        return ret;
> > -    }
> > -
> >      write_cpustate_to_list(cpu, true);
> >
> >      if (!write_list_to_kvmstate(cpu, level)) {
> >          return EINVAL;
> >      }
> >
> > +    ret = kvm_put_vcpu_events(cpu);
> > +    if (ret) {
> > +        return ret;
> > +    }
> > +
>
> I think we should put a comment above this that says basically the same
> thing as the commit message in order to explain why kvm_put_vcpu_events()
> *must* be after write_list_to_kvmstate().
>
Will do that.

> >      kvm_arm_sync_mpstate_to_kvm(cpu);
> >
> >      return ret;
> > @@ -462,11 +462,6 @@ int kvm_arch_get_registers(CPUState *cs)
> >      }
> >      vfp_set_fpscr(env, fpscr);
> >
> > -    ret = kvm_get_vcpu_events(cpu);
> > -    if (ret) {
> > -        return ret;
> > -    }
> > -
> >      if (!write_kvmstate_to_list(cpu)) {
> >          return EINVAL;
> >      }
> > @@ -475,6 +470,11 @@ int kvm_arch_get_registers(CPUState *cs)
> >       */
> >      write_list_to_cpustate(cpu);
> >
> > +    ret = kvm_get_vcpu_events(cpu);
> > +    if (ret) {
> > +        return ret;
> > +    }
> > +
>
> Why are we moving kvm_get_vcpu_events()?

This is only to make things consistent with put_registeres.
There is no functional change per se.

BR

Beata

> >      kvm_arm_sync_mpstate_to_qemu(cpu);
> >
> >      return 0;
> > diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
> > index 6344113..d06fd32 100644
> > --- a/target/arm/kvm64.c
> > +++ b/target/arm/kvm64.c
> > @@ -1043,17 +1043,17 @@ int kvm_arch_put_registers(CPUState *cs, int level)
> >          return ret;
> >      }
> >
> > -    ret = kvm_put_vcpu_events(cpu);
> > -    if (ret) {
> > -        return ret;
> > -    }
> > -
> >      write_cpustate_to_list(cpu, true);
> >
> >      if (!write_list_to_kvmstate(cpu, level)) {
> >          return -EINVAL;
> >      }
> >
> > +    ret = kvm_put_vcpu_events(cpu);
> > +    if (ret) {
> > +        return ret;
> > +    }
> > +
> >      kvm_arm_sync_mpstate_to_kvm(cpu);
> >
> >      return ret;
> > @@ -1251,11 +1251,6 @@ int kvm_arch_get_registers(CPUState *cs)
> >      }
> >      vfp_set_fpcr(env, fpr);
> >
> > -    ret = kvm_get_vcpu_events(cpu);
> > -    if (ret) {
> > -        return ret;
> > -    }
> > -
> >      if (!write_kvmstate_to_list(cpu)) {
> >          return -EINVAL;
> >      }
> > @@ -1264,6 +1259,11 @@ int kvm_arch_get_registers(CPUState *cs)
> >       */
> >      write_list_to_cpustate(cpu);
> >
> > +    ret = kvm_get_vcpu_events(cpu);
> > +    if (ret) {
> > +        return ret;
> > +    }
> > +
> >      kvm_arm_sync_mpstate_to_qemu(cpu);
> >
> >      /* TODO: other registers */
> > --
> > 2.7.4
> >
> >
>
> Same comments for kvm64.c as for kvm32.c
>
> Thanks,
> drew
>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] target/arm: kvm: Inject events at the last stage of sync
@ 2020-02-06 21:41       ` Beata Michalska
  0 siblings, 0 replies; 20+ messages in thread
From: Beata Michalska @ 2020-02-06 21:41 UTC (permalink / raw)
  To: Andrew Jones; +Cc: QEMU Developers, qemu-arm, Paolo Bonzini, kvmarm

On Tue, 4 Feb 2020 at 10:34, Andrew Jones <drjones@redhat.com> wrote:
>
> On Wed, Jan 29, 2020 at 08:24:40PM +0000, Beata Michalska wrote:
> > KVM_SET_VCPU_EVENTS might actually lead to vcpu registers being modified.
> > As such this should be the last step of sync to avoid potential overwriting
> > of whatever changes KVM might have done.
> >
> > Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> > ---
> >  target/arm/kvm32.c | 20 ++++++++++----------
> >  target/arm/kvm64.c | 20 ++++++++++----------
> >  2 files changed, 20 insertions(+), 20 deletions(-)
> >
> > diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
> > index 32bf8d6..cf2b47f 100644
> > --- a/target/arm/kvm32.c
> > +++ b/target/arm/kvm32.c
> > @@ -386,17 +386,17 @@ int kvm_arch_put_registers(CPUState *cs, int level)
> >          return ret;
> >      }
> >
> > -    ret = kvm_put_vcpu_events(cpu);
> > -    if (ret) {
> > -        return ret;
> > -    }
> > -
> >      write_cpustate_to_list(cpu, true);
> >
> >      if (!write_list_to_kvmstate(cpu, level)) {
> >          return EINVAL;
> >      }
> >
> > +    ret = kvm_put_vcpu_events(cpu);
> > +    if (ret) {
> > +        return ret;
> > +    }
> > +
>
> I think we should put a comment above this that says basically the same
> thing as the commit message in order to explain why kvm_put_vcpu_events()
> *must* be after write_list_to_kvmstate().
>
Will do that.

> >      kvm_arm_sync_mpstate_to_kvm(cpu);
> >
> >      return ret;
> > @@ -462,11 +462,6 @@ int kvm_arch_get_registers(CPUState *cs)
> >      }
> >      vfp_set_fpscr(env, fpscr);
> >
> > -    ret = kvm_get_vcpu_events(cpu);
> > -    if (ret) {
> > -        return ret;
> > -    }
> > -
> >      if (!write_kvmstate_to_list(cpu)) {
> >          return EINVAL;
> >      }
> > @@ -475,6 +470,11 @@ int kvm_arch_get_registers(CPUState *cs)
> >       */
> >      write_list_to_cpustate(cpu);
> >
> > +    ret = kvm_get_vcpu_events(cpu);
> > +    if (ret) {
> > +        return ret;
> > +    }
> > +
>
> Why are we moving kvm_get_vcpu_events()?

This is only to make things consistent with put_registeres.
There is no functional change per se.

BR

Beata

> >      kvm_arm_sync_mpstate_to_qemu(cpu);
> >
> >      return 0;
> > diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
> > index 6344113..d06fd32 100644
> > --- a/target/arm/kvm64.c
> > +++ b/target/arm/kvm64.c
> > @@ -1043,17 +1043,17 @@ int kvm_arch_put_registers(CPUState *cs, int level)
> >          return ret;
> >      }
> >
> > -    ret = kvm_put_vcpu_events(cpu);
> > -    if (ret) {
> > -        return ret;
> > -    }
> > -
> >      write_cpustate_to_list(cpu, true);
> >
> >      if (!write_list_to_kvmstate(cpu, level)) {
> >          return -EINVAL;
> >      }
> >
> > +    ret = kvm_put_vcpu_events(cpu);
> > +    if (ret) {
> > +        return ret;
> > +    }
> > +
> >      kvm_arm_sync_mpstate_to_kvm(cpu);
> >
> >      return ret;
> > @@ -1251,11 +1251,6 @@ int kvm_arch_get_registers(CPUState *cs)
> >      }
> >      vfp_set_fpcr(env, fpr);
> >
> > -    ret = kvm_get_vcpu_events(cpu);
> > -    if (ret) {
> > -        return ret;
> > -    }
> > -
> >      if (!write_kvmstate_to_list(cpu)) {
> >          return -EINVAL;
> >      }
> > @@ -1264,6 +1259,11 @@ int kvm_arch_get_registers(CPUState *cs)
> >       */
> >      write_list_to_cpustate(cpu);
> >
> > +    ret = kvm_get_vcpu_events(cpu);
> > +    if (ret) {
> > +        return ret;
> > +    }
> > +
> >      kvm_arm_sync_mpstate_to_qemu(cpu);
> >
> >      /* TODO: other registers */
> > --
> > 2.7.4
> >
> >
>
> Same comments for kvm64.c as for kvm32.c
>
> Thanks,
> drew
>
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/2] target/arm: kvm: Handle DABT with no valid ISS
  2020-02-05 16:57     ` Andrew Jones
@ 2020-02-06 21:48       ` Beata Michalska
  -1 siblings, 0 replies; 20+ messages in thread
From: Beata Michalska @ 2020-02-06 21:48 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Peter Maydell, Christoffer Dall, QEMU Developers, qemu-arm,
	Paolo Bonzini, kvmarm

On Wed, 5 Feb 2020 at 16:57, Andrew Jones <drjones@redhat.com> wrote:
>
> On Wed, Jan 29, 2020 at 08:24:41PM +0000, Beata Michalska wrote:
> > On ARMv7 & ARMv8 some load/store instructions might trigger a data abort
> > exception with no valid ISS info to be decoded. The lack of decode info
> > makes it at least tricky to emulate those instruction which is one of the
> > (many) reasons why KVM will not even try to do so.
> >
> > Add support for handling those by requesting KVM to inject external
> > dabt into the quest.
> >
> > Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> > ---
> >  target/arm/cpu.h     |  2 ++
> >  target/arm/kvm.c     | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  target/arm/kvm32.c   |  3 ++
> >  target/arm/kvm64.c   |  3 ++
> >  target/arm/kvm_arm.h | 19 +++++++++++
> >  5 files changed, 123 insertions(+)
> >
> > diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> > index c1aedbe..e04a8d3 100644
> > --- a/target/arm/cpu.h
> > +++ b/target/arm/cpu.h
> > @@ -558,6 +558,8 @@ typedef struct CPUARMState {
> >          uint8_t has_esr;
> >          uint64_t esr;
> >      } serror;
> > +    /* Status field for pending external dabt */
> > +    uint8_t ext_dabt_pending;
> >
> >      /* State of our input IRQ/FIQ/VIRQ/VFIQ lines */
> >      uint32_t irq_line_state;
> > diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> > index 8d82889..e7bc9b7 100644
> > --- a/target/arm/kvm.c
> > +++ b/target/arm/kvm.c
> > @@ -37,6 +37,7 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
> >
> >  static bool cap_has_mp_state;
> >  static bool cap_has_inject_serror_esr;
> > +static bool cap_has_inject_ext_dabt; /* KVM_CAP_ARM_INJECT_EXT_DABT */
>
> nit: the KVM_CAP_ARM_INJECT_EXT_DABT comment is unnecessary

Might be - I just find it handy when looking for  related details.
I will remove that one though.

>
> >
> >  static ARMHostCPUFeatures arm_host_cpu_features;
> >
> > @@ -62,6 +63,12 @@ void kvm_arm_init_serror_injection(CPUState *cs)
> >                                      KVM_CAP_ARM_INJECT_SERROR_ESR);
> >  }
> >
> > +void kvm_arm_init_ext_dabt_injection(CPUState *cs)
> > +{
> > +    cap_has_inject_ext_dabt = kvm_check_extension(cs->kvm_state,
> > +                                    KVM_CAP_ARM_INJECT_EXT_DABT);
> > +}
> > +
> >  bool kvm_arm_create_scratch_host_vcpu(const uint32_t *cpus_to_try,
> >                                        int *fdarray,
> >                                        struct kvm_vcpu_init *init)
> > @@ -216,6 +223,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
> >          ret = -EINVAL;
> >      }
> >
> > +    if (kvm_check_extension(s, KVM_CAP_ARM_NISV_TO_USER))
> > +        if (kvm_vm_enable_cap(s, KVM_CAP_ARM_NISV_TO_USER, 0)) {
> > +            warn_report("Failed to enable DABT NISV cap");
> > +        }
> > +
>
> Missing {} around the outer block.

Checkpatch didn't complain ...
Will fix that.

>
> As KVM_CAP_ARM_INJECT_EXT_DABT is a VM capability then I think we should
> set cap_has_inject_ext_dabt here, like cap_has_mp_state gets set. I see
> you've followed the pattern used for cap_has_inject_serror_esr, but that
> looks wrong too since KVM_CAP_ARM_INJECT_SERROR_ESR is also a VM
> capability. The way it is now we just keep setting
> cap_has_inject_serror_esr to the same value, NR_VCPUS times.
>
You are totally right - I have completely missed that point! Thanks.

> >      return ret;
> >  }
> >
> > @@ -598,6 +610,10 @@ int kvm_put_vcpu_events(ARMCPU *cpu)
> >          events.exception.serror_esr = env->serror.esr;
> >      }
> >
> > +    if (cap_has_inject_ext_dabt) {
> > +        events.exception.ext_dabt_pending = env->ext_dabt_pending;
> > +    }
> > +
> >      ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
> >      if (ret) {
> >          error_report("failed to put vcpu events");
> > @@ -627,6 +643,8 @@ int kvm_get_vcpu_events(ARMCPU *cpu)
> >      env->serror.has_esr = events.exception.serror_has_esr;
> >      env->serror.esr = events.exception.serror_esr;
> >
> > +    env->ext_dabt_pending = events.exception.ext_dabt_pending;
> > +
>
> afaict from Documentation/virt/kvm/api.txt and the KVM code you cannot
> get this state. Therefore the above line (and extra stray blank line)
> should be dropped.
>
That's true, though this is a lightweight way of resetting the vcpu state.
We would have to do that otherwise to mark that this case has been handled
and that the abort is no longer pending.

> >      return 0;
> >  }
> >
> > @@ -634,6 +652,7 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
> >  {
> >  }
> >
> > +
>
> stray blank line
>
> >  MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
> >  {
> >      ARMCPU *cpu;
> > @@ -699,6 +718,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
> >              ret = EXCP_DEBUG;
> >          } /* otherwise return to guest */
> >          break;
> > +    case KVM_EXIT_ARM_NISV:
> > +        /* External DABT with no valid iss to decode */
> > +        ret = kvm_arm_handle_dabt_nisv(cs, run->arm_nisv.esr_iss,
> > +                                       run->arm_nisv.fault_ipa);
> > +        break;
> >      default:
> >          qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n",
> >                        __func__, run->exit_reason);
> > @@ -833,3 +857,75 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
> >  {
> >      return (data - 32) & 0xffff;
> >  }
> > +
> > +int kvm_arm_handle_dabt_nisv(CPUState *cs, uint64_t esr_iss,
> > +                             uint64_t fault_ipa)
> > +{
> > +    ARMCPU *cpu = ARM_CPU(cs);
> > +    CPUARMState *env = &cpu->env;
> > +    uint32_t ins, ins_fetched;
>
> ins_fetched is a bool

Actually that's int (as per cpu_memory_rw_debug)
>
> > +
> > +    /*
> > +     * Hacky workaround for kernels that for aarch32 guests, instead of expected
> > +     * external data abort, inject the IMPLEMENTATION DEFINED exception with the
> > +     * lock-down. This is actually handled by the guest which results in
> > +     * re-running the faulting instruction.
> > +     * This intends to break the vicious cycle.
> > +     */
>
> This doesn't seem like the right thing to do. What happens on real
> hardware in this case? If an OS would get into a "vicious cycle" on
> real hardware then it should get into one on KVM too.
>
That's the problem. that would not happen on a real hardware.
We might end up here due to a KVM bug (which has been fixed recently)
when the injected
abort is not the one expected (as of not the one that would be
triggered by hw in this
particular case).
Details in : https://patchwork.kernel.org/patch/11358083/

> > +    if (!is_a64(env)) {
> > +        static uint8_t setback;
> > +
> > +        /*
> > +         * The state has not been synchronized yet, so if this is re-occurrence
> > +         * of the same abort triggered by guest, the status for pending external
> > +         * abort should not get cleared yet
> > +         */
> > +        if (unlikely(env->ext_dabt_pending)) {
> > +            if (setback) {
> > +                error_report("Most probably triggered kernel issue with"
> > +                             " injecting external data abort.");
> > +                error_printf("Giving up trying ...\n");
> > +                /* Here is where the fun comes to an end */
> > +                return -EINVAL;
> > +            }
> > +        }
> > +        setback = env->ext_dabt_pending;
> > +    }
> > +
> > +    kvm_cpu_synchronize_state(cs);
> > +   /*
> > +    * ISS [23:14] is invalid so there is a limited info
> > +    * on what has just happened so the only *useful* thing that can
> > +    * be retrieved from ISS is WnR & DFSC (though in some cases WnR
> > +    * might be less of a value as well)
> > +    */
> > +
> > +    /*
> > +     * Get current PC before it will get updated to exception vector entry
> > +     */
> > +    target_ulong ins_addr = is_a64(env) ? env->pc : env->regs[15];
>
> ins_addr should be declared above
>
> But what are we doing? pc is a guest virtual address. Oh, I see...
>
> > +
> > +    /*
> > +     * Get the faulting instruction
> > +     * Instructions have a fixed length of 32 bits
> > +     * and are always little-endian
> > +     */
> > +    ins_fetched = !cpu_memory_rw_debug(cs, ins_addr, (uint8_t *) &ins,
> > +                                       sizeof(uint32_t), 0);
>
> ... we're trying to actual walk the KVM guest's page tables. That
> seems a bit odd, and the "_debug" function name used for it makes
> it seem even more odd.
>
> > +
> > +    error_report("Data abort exception with no valid ISS generated by "
> > +                 "guest memory access at physical address: 0x" TARGET_FMT_lx,
> > +                 (target_ulong)fault_ipa);
> > +
> > +    error_printf(ins_fetched ? "%s : 0x%" PRIx32 " (ins encoded)\n"  : "%s\n",
> > +                 "KVM unable to emulate faulting instruction",
> > +                 (!ins_fetched ? 0 : ldl_le_p(&ins)));
> > +    error_printf("Injecting a data abort exception into guest.\n");
>
> The guest shouldn't be able to generate three lines of errors on the host
> whenever it wants. That's a security bug. One trace point here seems like
> the most we should do. Or, after these three lines we should kill the
> guest.
>
You have a point here, this can indeed be exploited by the malicious
guest. Not sure if killing it is necessary here.
I could limit the logging though that gets tricky for aborts triggered by
user-space processes which would probably leave the guest running
without much of an issue there.
I can y get rid of logging the additional info and keep the single
statement of an instruction not being emulated but that still leaves an
open door for potential exploits...

> Actually, I don't think we should attempt to get the instruction at all.
> We should do what the KVM documenation suggests we do when we get
> this exit. We should either do a user selected action: one of suspend,
> dump, restart, or inject a dabt (as is done below). The last choice hopes
> that the guest will handle it in some graceful way, or that it'll crash
> with all the information needed for a post-mortem crash investigation.
>
> And I don't think we should do the "very brave" option of trying to
> emulate the instruction, even if we identify it as a valid MMIO address.
> That just sounds like a huge can of worms.
>
> Does QEMU already have a way for users to select a
> don't-know-how-to-handle-guest-exception behavior?
>

The function is attempting to inject the external data abort. The rest is for
the aftermath analysis to easy figuring out what has happened
which will not be so easy in case of user-space process triggering
the DABT with no valid ISS. There is no attempt of emulating the instruction
just simply saying the guest tried to access this address with this instruction
which couldn't be handled. Might be excessive in some cases
(like misbehaving kernel) but it also might be handy on those rare
occasions when there is not much to analyze.
But as per the issue you have raised above I will rework that.

Not sure if there is a mechanism to specify the 'preferred' behavior though.

BR


Beata


> > +    /*
> > +     * Set pending ext dabt and trigger SET_EVENTS so that
> > +     * KVM can inject the abort
> > +     */
> > +    env->ext_dabt_pending = 1;
> > +
> > +    return 0;
> > +}
>
> Thanks,
> drew
>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/2] target/arm: kvm: Handle DABT with no valid ISS
@ 2020-02-06 21:48       ` Beata Michalska
  0 siblings, 0 replies; 20+ messages in thread
From: Beata Michalska @ 2020-02-06 21:48 UTC (permalink / raw)
  To: Andrew Jones; +Cc: QEMU Developers, qemu-arm, Paolo Bonzini, kvmarm

On Wed, 5 Feb 2020 at 16:57, Andrew Jones <drjones@redhat.com> wrote:
>
> On Wed, Jan 29, 2020 at 08:24:41PM +0000, Beata Michalska wrote:
> > On ARMv7 & ARMv8 some load/store instructions might trigger a data abort
> > exception with no valid ISS info to be decoded. The lack of decode info
> > makes it at least tricky to emulate those instruction which is one of the
> > (many) reasons why KVM will not even try to do so.
> >
> > Add support for handling those by requesting KVM to inject external
> > dabt into the quest.
> >
> > Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> > ---
> >  target/arm/cpu.h     |  2 ++
> >  target/arm/kvm.c     | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  target/arm/kvm32.c   |  3 ++
> >  target/arm/kvm64.c   |  3 ++
> >  target/arm/kvm_arm.h | 19 +++++++++++
> >  5 files changed, 123 insertions(+)
> >
> > diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> > index c1aedbe..e04a8d3 100644
> > --- a/target/arm/cpu.h
> > +++ b/target/arm/cpu.h
> > @@ -558,6 +558,8 @@ typedef struct CPUARMState {
> >          uint8_t has_esr;
> >          uint64_t esr;
> >      } serror;
> > +    /* Status field for pending external dabt */
> > +    uint8_t ext_dabt_pending;
> >
> >      /* State of our input IRQ/FIQ/VIRQ/VFIQ lines */
> >      uint32_t irq_line_state;
> > diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> > index 8d82889..e7bc9b7 100644
> > --- a/target/arm/kvm.c
> > +++ b/target/arm/kvm.c
> > @@ -37,6 +37,7 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
> >
> >  static bool cap_has_mp_state;
> >  static bool cap_has_inject_serror_esr;
> > +static bool cap_has_inject_ext_dabt; /* KVM_CAP_ARM_INJECT_EXT_DABT */
>
> nit: the KVM_CAP_ARM_INJECT_EXT_DABT comment is unnecessary

Might be - I just find it handy when looking for  related details.
I will remove that one though.

>
> >
> >  static ARMHostCPUFeatures arm_host_cpu_features;
> >
> > @@ -62,6 +63,12 @@ void kvm_arm_init_serror_injection(CPUState *cs)
> >                                      KVM_CAP_ARM_INJECT_SERROR_ESR);
> >  }
> >
> > +void kvm_arm_init_ext_dabt_injection(CPUState *cs)
> > +{
> > +    cap_has_inject_ext_dabt = kvm_check_extension(cs->kvm_state,
> > +                                    KVM_CAP_ARM_INJECT_EXT_DABT);
> > +}
> > +
> >  bool kvm_arm_create_scratch_host_vcpu(const uint32_t *cpus_to_try,
> >                                        int *fdarray,
> >                                        struct kvm_vcpu_init *init)
> > @@ -216,6 +223,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
> >          ret = -EINVAL;
> >      }
> >
> > +    if (kvm_check_extension(s, KVM_CAP_ARM_NISV_TO_USER))
> > +        if (kvm_vm_enable_cap(s, KVM_CAP_ARM_NISV_TO_USER, 0)) {
> > +            warn_report("Failed to enable DABT NISV cap");
> > +        }
> > +
>
> Missing {} around the outer block.

Checkpatch didn't complain ...
Will fix that.

>
> As KVM_CAP_ARM_INJECT_EXT_DABT is a VM capability then I think we should
> set cap_has_inject_ext_dabt here, like cap_has_mp_state gets set. I see
> you've followed the pattern used for cap_has_inject_serror_esr, but that
> looks wrong too since KVM_CAP_ARM_INJECT_SERROR_ESR is also a VM
> capability. The way it is now we just keep setting
> cap_has_inject_serror_esr to the same value, NR_VCPUS times.
>
You are totally right - I have completely missed that point! Thanks.

> >      return ret;
> >  }
> >
> > @@ -598,6 +610,10 @@ int kvm_put_vcpu_events(ARMCPU *cpu)
> >          events.exception.serror_esr = env->serror.esr;
> >      }
> >
> > +    if (cap_has_inject_ext_dabt) {
> > +        events.exception.ext_dabt_pending = env->ext_dabt_pending;
> > +    }
> > +
> >      ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
> >      if (ret) {
> >          error_report("failed to put vcpu events");
> > @@ -627,6 +643,8 @@ int kvm_get_vcpu_events(ARMCPU *cpu)
> >      env->serror.has_esr = events.exception.serror_has_esr;
> >      env->serror.esr = events.exception.serror_esr;
> >
> > +    env->ext_dabt_pending = events.exception.ext_dabt_pending;
> > +
>
> afaict from Documentation/virt/kvm/api.txt and the KVM code you cannot
> get this state. Therefore the above line (and extra stray blank line)
> should be dropped.
>
That's true, though this is a lightweight way of resetting the vcpu state.
We would have to do that otherwise to mark that this case has been handled
and that the abort is no longer pending.

> >      return 0;
> >  }
> >
> > @@ -634,6 +652,7 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
> >  {
> >  }
> >
> > +
>
> stray blank line
>
> >  MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
> >  {
> >      ARMCPU *cpu;
> > @@ -699,6 +718,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
> >              ret = EXCP_DEBUG;
> >          } /* otherwise return to guest */
> >          break;
> > +    case KVM_EXIT_ARM_NISV:
> > +        /* External DABT with no valid iss to decode */
> > +        ret = kvm_arm_handle_dabt_nisv(cs, run->arm_nisv.esr_iss,
> > +                                       run->arm_nisv.fault_ipa);
> > +        break;
> >      default:
> >          qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n",
> >                        __func__, run->exit_reason);
> > @@ -833,3 +857,75 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
> >  {
> >      return (data - 32) & 0xffff;
> >  }
> > +
> > +int kvm_arm_handle_dabt_nisv(CPUState *cs, uint64_t esr_iss,
> > +                             uint64_t fault_ipa)
> > +{
> > +    ARMCPU *cpu = ARM_CPU(cs);
> > +    CPUARMState *env = &cpu->env;
> > +    uint32_t ins, ins_fetched;
>
> ins_fetched is a bool

Actually that's int (as per cpu_memory_rw_debug)
>
> > +
> > +    /*
> > +     * Hacky workaround for kernels that for aarch32 guests, instead of expected
> > +     * external data abort, inject the IMPLEMENTATION DEFINED exception with the
> > +     * lock-down. This is actually handled by the guest which results in
> > +     * re-running the faulting instruction.
> > +     * This intends to break the vicious cycle.
> > +     */
>
> This doesn't seem like the right thing to do. What happens on real
> hardware in this case? If an OS would get into a "vicious cycle" on
> real hardware then it should get into one on KVM too.
>
That's the problem. that would not happen on a real hardware.
We might end up here due to a KVM bug (which has been fixed recently)
when the injected
abort is not the one expected (as of not the one that would be
triggered by hw in this
particular case).
Details in : https://patchwork.kernel.org/patch/11358083/

> > +    if (!is_a64(env)) {
> > +        static uint8_t setback;
> > +
> > +        /*
> > +         * The state has not been synchronized yet, so if this is re-occurrence
> > +         * of the same abort triggered by guest, the status for pending external
> > +         * abort should not get cleared yet
> > +         */
> > +        if (unlikely(env->ext_dabt_pending)) {
> > +            if (setback) {
> > +                error_report("Most probably triggered kernel issue with"
> > +                             " injecting external data abort.");
> > +                error_printf("Giving up trying ...\n");
> > +                /* Here is where the fun comes to an end */
> > +                return -EINVAL;
> > +            }
> > +        }
> > +        setback = env->ext_dabt_pending;
> > +    }
> > +
> > +    kvm_cpu_synchronize_state(cs);
> > +   /*
> > +    * ISS [23:14] is invalid so there is a limited info
> > +    * on what has just happened so the only *useful* thing that can
> > +    * be retrieved from ISS is WnR & DFSC (though in some cases WnR
> > +    * might be less of a value as well)
> > +    */
> > +
> > +    /*
> > +     * Get current PC before it will get updated to exception vector entry
> > +     */
> > +    target_ulong ins_addr = is_a64(env) ? env->pc : env->regs[15];
>
> ins_addr should be declared above
>
> But what are we doing? pc is a guest virtual address. Oh, I see...
>
> > +
> > +    /*
> > +     * Get the faulting instruction
> > +     * Instructions have a fixed length of 32 bits
> > +     * and are always little-endian
> > +     */
> > +    ins_fetched = !cpu_memory_rw_debug(cs, ins_addr, (uint8_t *) &ins,
> > +                                       sizeof(uint32_t), 0);
>
> ... we're trying to actual walk the KVM guest's page tables. That
> seems a bit odd, and the "_debug" function name used for it makes
> it seem even more odd.
>
> > +
> > +    error_report("Data abort exception with no valid ISS generated by "
> > +                 "guest memory access at physical address: 0x" TARGET_FMT_lx,
> > +                 (target_ulong)fault_ipa);
> > +
> > +    error_printf(ins_fetched ? "%s : 0x%" PRIx32 " (ins encoded)\n"  : "%s\n",
> > +                 "KVM unable to emulate faulting instruction",
> > +                 (!ins_fetched ? 0 : ldl_le_p(&ins)));
> > +    error_printf("Injecting a data abort exception into guest.\n");
>
> The guest shouldn't be able to generate three lines of errors on the host
> whenever it wants. That's a security bug. One trace point here seems like
> the most we should do. Or, after these three lines we should kill the
> guest.
>
You have a point here, this can indeed be exploited by the malicious
guest. Not sure if killing it is necessary here.
I could limit the logging though that gets tricky for aborts triggered by
user-space processes which would probably leave the guest running
without much of an issue there.
I can y get rid of logging the additional info and keep the single
statement of an instruction not being emulated but that still leaves an
open door for potential exploits...

> Actually, I don't think we should attempt to get the instruction at all.
> We should do what the KVM documenation suggests we do when we get
> this exit. We should either do a user selected action: one of suspend,
> dump, restart, or inject a dabt (as is done below). The last choice hopes
> that the guest will handle it in some graceful way, or that it'll crash
> with all the information needed for a post-mortem crash investigation.
>
> And I don't think we should do the "very brave" option of trying to
> emulate the instruction, even if we identify it as a valid MMIO address.
> That just sounds like a huge can of worms.
>
> Does QEMU already have a way for users to select a
> don't-know-how-to-handle-guest-exception behavior?
>

The function is attempting to inject the external data abort. The rest is for
the aftermath analysis to easy figuring out what has happened
which will not be so easy in case of user-space process triggering
the DABT with no valid ISS. There is no attempt of emulating the instruction
just simply saying the guest tried to access this address with this instruction
which couldn't be handled. Might be excessive in some cases
(like misbehaving kernel) but it also might be handy on those rare
occasions when there is not much to analyze.
But as per the issue you have raised above I will rework that.

Not sure if there is a mechanism to specify the 'preferred' behavior though.

BR


Beata


> > +    /*
> > +     * Set pending ext dabt and trigger SET_EVENTS so that
> > +     * KVM can inject the abort
> > +     */
> > +    env->ext_dabt_pending = 1;
> > +
> > +    return 0;
> > +}
>
> Thanks,
> drew
>
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] target/arm: kvm: Inject events at the last stage of sync
  2020-02-06 21:41       ` Beata Michalska
@ 2020-02-07  7:41         ` Andrew Jones
  -1 siblings, 0 replies; 20+ messages in thread
From: Andrew Jones @ 2020-02-07  7:41 UTC (permalink / raw)
  To: Beata Michalska
  Cc: Peter Maydell, Christoffer Dall, QEMU Developers, qemu-arm,
	Paolo Bonzini, kvmarm

On Thu, Feb 06, 2020 at 09:41:10PM +0000, Beata Michalska wrote:
> On Tue, 4 Feb 2020 at 10:34, Andrew Jones <drjones@redhat.com> wrote:
> >
> > On Wed, Jan 29, 2020 at 08:24:40PM +0000, Beata Michalska wrote:
> > > KVM_SET_VCPU_EVENTS might actually lead to vcpu registers being modified.
> > > As such this should be the last step of sync to avoid potential overwriting
> > > of whatever changes KVM might have done.
> > >
> > > Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> > > ---
> > >  target/arm/kvm32.c | 20 ++++++++++----------
> > >  target/arm/kvm64.c | 20 ++++++++++----------
> > >  2 files changed, 20 insertions(+), 20 deletions(-)
> > >
> > > diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
> > > index 32bf8d6..cf2b47f 100644
> > > --- a/target/arm/kvm32.c
> > > +++ b/target/arm/kvm32.c
> > > @@ -386,17 +386,17 @@ int kvm_arch_put_registers(CPUState *cs, int level)
> > >          return ret;
> > >      }
> > >
> > > -    ret = kvm_put_vcpu_events(cpu);
> > > -    if (ret) {
> > > -        return ret;
> > > -    }
> > > -
> > >      write_cpustate_to_list(cpu, true);
> > >
> > >      if (!write_list_to_kvmstate(cpu, level)) {
> > >          return EINVAL;
> > >      }
> > >
> > > +    ret = kvm_put_vcpu_events(cpu);
> > > +    if (ret) {
> > > +        return ret;
> > > +    }
> > > +
> >
> > I think we should put a comment above this that says basically the same
> > thing as the commit message in order to explain why kvm_put_vcpu_events()
> > *must* be after write_list_to_kvmstate().
> >
> Will do that.
> 
> > >      kvm_arm_sync_mpstate_to_kvm(cpu);
> > >
> > >      return ret;
> > > @@ -462,11 +462,6 @@ int kvm_arch_get_registers(CPUState *cs)
> > >      }
> > >      vfp_set_fpscr(env, fpscr);
> > >
> > > -    ret = kvm_get_vcpu_events(cpu);
> > > -    if (ret) {
> > > -        return ret;
> > > -    }
> > > -
> > >      if (!write_kvmstate_to_list(cpu)) {
> > >          return EINVAL;
> > >      }
> > > @@ -475,6 +470,11 @@ int kvm_arch_get_registers(CPUState *cs)
> > >       */
> > >      write_list_to_cpustate(cpu);
> > >
> > > +    ret = kvm_get_vcpu_events(cpu);
> > > +    if (ret) {
> > > +        return ret;
> > > +    }
> > > +
> >
> > Why are we moving kvm_get_vcpu_events()?
> 
> This is only to make things consistent with put_registeres.
> There is no functional change per se.

Without a functional change I wouldn't move it. It's much
more appealing to have the final state writes at the bottom
of this function.

Thanks,
drew



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/2] target/arm: kvm: Inject events at the last stage of sync
@ 2020-02-07  7:41         ` Andrew Jones
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Jones @ 2020-02-07  7:41 UTC (permalink / raw)
  To: Beata Michalska; +Cc: QEMU Developers, qemu-arm, Paolo Bonzini, kvmarm

On Thu, Feb 06, 2020 at 09:41:10PM +0000, Beata Michalska wrote:
> On Tue, 4 Feb 2020 at 10:34, Andrew Jones <drjones@redhat.com> wrote:
> >
> > On Wed, Jan 29, 2020 at 08:24:40PM +0000, Beata Michalska wrote:
> > > KVM_SET_VCPU_EVENTS might actually lead to vcpu registers being modified.
> > > As such this should be the last step of sync to avoid potential overwriting
> > > of whatever changes KVM might have done.
> > >
> > > Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> > > ---
> > >  target/arm/kvm32.c | 20 ++++++++++----------
> > >  target/arm/kvm64.c | 20 ++++++++++----------
> > >  2 files changed, 20 insertions(+), 20 deletions(-)
> > >
> > > diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
> > > index 32bf8d6..cf2b47f 100644
> > > --- a/target/arm/kvm32.c
> > > +++ b/target/arm/kvm32.c
> > > @@ -386,17 +386,17 @@ int kvm_arch_put_registers(CPUState *cs, int level)
> > >          return ret;
> > >      }
> > >
> > > -    ret = kvm_put_vcpu_events(cpu);
> > > -    if (ret) {
> > > -        return ret;
> > > -    }
> > > -
> > >      write_cpustate_to_list(cpu, true);
> > >
> > >      if (!write_list_to_kvmstate(cpu, level)) {
> > >          return EINVAL;
> > >      }
> > >
> > > +    ret = kvm_put_vcpu_events(cpu);
> > > +    if (ret) {
> > > +        return ret;
> > > +    }
> > > +
> >
> > I think we should put a comment above this that says basically the same
> > thing as the commit message in order to explain why kvm_put_vcpu_events()
> > *must* be after write_list_to_kvmstate().
> >
> Will do that.
> 
> > >      kvm_arm_sync_mpstate_to_kvm(cpu);
> > >
> > >      return ret;
> > > @@ -462,11 +462,6 @@ int kvm_arch_get_registers(CPUState *cs)
> > >      }
> > >      vfp_set_fpscr(env, fpscr);
> > >
> > > -    ret = kvm_get_vcpu_events(cpu);
> > > -    if (ret) {
> > > -        return ret;
> > > -    }
> > > -
> > >      if (!write_kvmstate_to_list(cpu)) {
> > >          return EINVAL;
> > >      }
> > > @@ -475,6 +470,11 @@ int kvm_arch_get_registers(CPUState *cs)
> > >       */
> > >      write_list_to_cpustate(cpu);
> > >
> > > +    ret = kvm_get_vcpu_events(cpu);
> > > +    if (ret) {
> > > +        return ret;
> > > +    }
> > > +
> >
> > Why are we moving kvm_get_vcpu_events()?
> 
> This is only to make things consistent with put_registeres.
> There is no functional change per se.

Without a functional change I wouldn't move it. It's much
more appealing to have the final state writes at the bottom
of this function.

Thanks,
drew

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/2] target/arm: kvm: Handle DABT with no valid ISS
  2020-02-06 21:48       ` Beata Michalska
@ 2020-02-07  8:19         ` Andrew Jones
  -1 siblings, 0 replies; 20+ messages in thread
From: Andrew Jones @ 2020-02-07  8:19 UTC (permalink / raw)
  To: Beata Michalska
  Cc: Peter Maydell, Christoffer Dall, QEMU Developers, qemu-arm,
	Paolo Bonzini, kvmarm

On Thu, Feb 06, 2020 at 09:48:05PM +0000, Beata Michalska wrote:
> On Wed, 5 Feb 2020 at 16:57, Andrew Jones <drjones@redhat.com> wrote:
> >
> > On Wed, Jan 29, 2020 at 08:24:41PM +0000, Beata Michalska wrote:
> > > On ARMv7 & ARMv8 some load/store instructions might trigger a data abort
> > > exception with no valid ISS info to be decoded. The lack of decode info
> > > makes it at least tricky to emulate those instruction which is one of the
> > > (many) reasons why KVM will not even try to do so.
> > >
> > > Add support for handling those by requesting KVM to inject external
> > > dabt into the quest.
> > >
> > > Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> > > ---
> > >  target/arm/cpu.h     |  2 ++
> > >  target/arm/kvm.c     | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  target/arm/kvm32.c   |  3 ++
> > >  target/arm/kvm64.c   |  3 ++
> > >  target/arm/kvm_arm.h | 19 +++++++++++
> > >  5 files changed, 123 insertions(+)
> > >
> > > diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> > > index c1aedbe..e04a8d3 100644
> > > --- a/target/arm/cpu.h
> > > +++ b/target/arm/cpu.h
> > > @@ -558,6 +558,8 @@ typedef struct CPUARMState {
> > >          uint8_t has_esr;
> > >          uint64_t esr;
> > >      } serror;
> > > +    /* Status field for pending external dabt */
> > > +    uint8_t ext_dabt_pending;
> > >
> > >      /* State of our input IRQ/FIQ/VIRQ/VFIQ lines */
> > >      uint32_t irq_line_state;
> > > diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> > > index 8d82889..e7bc9b7 100644
> > > --- a/target/arm/kvm.c
> > > +++ b/target/arm/kvm.c
> > > @@ -37,6 +37,7 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
> > >
> > >  static bool cap_has_mp_state;
> > >  static bool cap_has_inject_serror_esr;
> > > +static bool cap_has_inject_ext_dabt; /* KVM_CAP_ARM_INJECT_EXT_DABT */
> >
> > nit: the KVM_CAP_ARM_INJECT_EXT_DABT comment is unnecessary
> 
> Might be - I just find it handy when looking for  related details.
> I will remove that one though.
> 
> >
> > >
> > >  static ARMHostCPUFeatures arm_host_cpu_features;
> > >
> > > @@ -62,6 +63,12 @@ void kvm_arm_init_serror_injection(CPUState *cs)
> > >                                      KVM_CAP_ARM_INJECT_SERROR_ESR);
> > >  }
> > >
> > > +void kvm_arm_init_ext_dabt_injection(CPUState *cs)
> > > +{
> > > +    cap_has_inject_ext_dabt = kvm_check_extension(cs->kvm_state,
> > > +                                    KVM_CAP_ARM_INJECT_EXT_DABT);
> > > +}
> > > +
> > >  bool kvm_arm_create_scratch_host_vcpu(const uint32_t *cpus_to_try,
> > >                                        int *fdarray,
> > >                                        struct kvm_vcpu_init *init)
> > > @@ -216,6 +223,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
> > >          ret = -EINVAL;
> > >      }
> > >
> > > +    if (kvm_check_extension(s, KVM_CAP_ARM_NISV_TO_USER))
> > > +        if (kvm_vm_enable_cap(s, KVM_CAP_ARM_NISV_TO_USER, 0)) {
> > > +            warn_report("Failed to enable DABT NISV cap");
> > > +        }
> > > +
> >
> > Missing {} around the outer block.
> 
> Checkpatch didn't complain ...
> Will fix that.
> 
> >
> > As KVM_CAP_ARM_INJECT_EXT_DABT is a VM capability then I think we should
> > set cap_has_inject_ext_dabt here, like cap_has_mp_state gets set. I see
> > you've followed the pattern used for cap_has_inject_serror_esr, but that
> > looks wrong too since KVM_CAP_ARM_INJECT_SERROR_ESR is also a VM
> > capability. The way it is now we just keep setting
> > cap_has_inject_serror_esr to the same value, NR_VCPUS times.
> >
> You are totally right - I have completely missed that point! Thanks.
> 
> > >      return ret;
> > >  }
> > >
> > > @@ -598,6 +610,10 @@ int kvm_put_vcpu_events(ARMCPU *cpu)
> > >          events.exception.serror_esr = env->serror.esr;
> > >      }
> > >
> > > +    if (cap_has_inject_ext_dabt) {
> > > +        events.exception.ext_dabt_pending = env->ext_dabt_pending;
> > > +    }
> > > +
> > >      ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
> > >      if (ret) {
> > >          error_report("failed to put vcpu events");
> > > @@ -627,6 +643,8 @@ int kvm_get_vcpu_events(ARMCPU *cpu)
> > >      env->serror.has_esr = events.exception.serror_has_esr;
> > >      env->serror.esr = events.exception.serror_esr;
> > >
> > > +    env->ext_dabt_pending = events.exception.ext_dabt_pending;
> > > +
> >
> > afaict from Documentation/virt/kvm/api.txt and the KVM code you cannot
> > get this state. Therefore the above line (and extra stray blank line)
> > should be dropped.
> >
> That's true, though this is a lightweight way of resetting the vcpu state.
> We would have to do that otherwise to mark that this case has been handled
> and that the abort is no longer pending.

There's no reason to wait until get-vcpu-events time to reset this.
According to the KVM documentation (and the code) the event is
immediately delivered at vcpu_ioctl(KVM_SET_VCPU_EVENTS) time.
So I think the kvm_put_vcpu_events() patch should be

 if (env->ext_dabt_pending) {
     events.exception.ext_dabt_pending = env->ext_dabt_pending;
     env->ext_dabt_pending = 0;
 }

(env->ext_dabt_pending will only be non-zero if we have the capability)

> 
> > >      return 0;
> > >  }
> > >
> > > @@ -634,6 +652,7 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
> > >  {
> > >  }
> > >
> > > +
> >
> > stray blank line
> >
> > >  MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
> > >  {
> > >      ARMCPU *cpu;
> > > @@ -699,6 +718,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
> > >              ret = EXCP_DEBUG;
> > >          } /* otherwise return to guest */
> > >          break;
> > > +    case KVM_EXIT_ARM_NISV:
> > > +        /* External DABT with no valid iss to decode */
> > > +        ret = kvm_arm_handle_dabt_nisv(cs, run->arm_nisv.esr_iss,
> > > +                                       run->arm_nisv.fault_ipa);
> > > +        break;
> > >      default:
> > >          qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n",
> > >                        __func__, run->exit_reason);
> > > @@ -833,3 +857,75 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
> > >  {
> > >      return (data - 32) & 0xffff;
> > >  }
> > > +
> > > +int kvm_arm_handle_dabt_nisv(CPUState *cs, uint64_t esr_iss,
> > > +                             uint64_t fault_ipa)
> > > +{
> > > +    ARMCPU *cpu = ARM_CPU(cs);
> > > +    CPUARMState *env = &cpu->env;
> > > +    uint32_t ins, ins_fetched;
> >
> > ins_fetched is a bool
> 
> Actually that's int (as per cpu_memory_rw_debug)

But that's not the way you're using it here. It you really want to
ensure the return value is converted to a bool, then you could do

 ins_fetched = !!cpu_memory_rw_debug()

> >
> > > +
> > > +    /*
> > > +     * Hacky workaround for kernels that for aarch32 guests, instead of expected
> > > +     * external data abort, inject the IMPLEMENTATION DEFINED exception with the
> > > +     * lock-down. This is actually handled by the guest which results in
> > > +     * re-running the faulting instruction.
> > > +     * This intends to break the vicious cycle.
> > > +     */
> >
> > This doesn't seem like the right thing to do. What happens on real
> > hardware in this case? If an OS would get into a "vicious cycle" on
> > real hardware then it should get into one on KVM too.
> >
> That's the problem. that would not happen on a real hardware.
> We might end up here due to a KVM bug (which has been fixed recently)
> when the injected
> abort is not the one expected (as of not the one that would be
> triggered by hw in this
> particular case).
> Details in : https://patchwork.kernel.org/patch/11358083/

If KVM can be fixed (and in this case already is fixed - 21aecdbd7f3a),
then doesn't it make more sense to just fix KVM, than to add a workaround
to QEMU?

> 
> > > +    if (!is_a64(env)) {
> > > +        static uint8_t setback;
> > > +
> > > +        /*
> > > +         * The state has not been synchronized yet, so if this is re-occurrence
> > > +         * of the same abort triggered by guest, the status for pending external
> > > +         * abort should not get cleared yet
> > > +         */
> > > +        if (unlikely(env->ext_dabt_pending)) {
> > > +            if (setback) {
> > > +                error_report("Most probably triggered kernel issue with"
> > > +                             " injecting external data abort.");
> > > +                error_printf("Giving up trying ...\n");
> > > +                /* Here is where the fun comes to an end */
> > > +                return -EINVAL;
> > > +            }
> > > +        }
> > > +        setback = env->ext_dabt_pending;
> > > +    }
> > > +
> > > +    kvm_cpu_synchronize_state(cs);
> > > +   /*
> > > +    * ISS [23:14] is invalid so there is a limited info
> > > +    * on what has just happened so the only *useful* thing that can
> > > +    * be retrieved from ISS is WnR & DFSC (though in some cases WnR
> > > +    * might be less of a value as well)
> > > +    */
> > > +
> > > +    /*
> > > +     * Get current PC before it will get updated to exception vector entry
> > > +     */
> > > +    target_ulong ins_addr = is_a64(env) ? env->pc : env->regs[15];
> >
> > ins_addr should be declared above
> >
> > But what are we doing? pc is a guest virtual address. Oh, I see...
> >
> > > +
> > > +    /*
> > > +     * Get the faulting instruction
> > > +     * Instructions have a fixed length of 32 bits
> > > +     * and are always little-endian
> > > +     */
> > > +    ins_fetched = !cpu_memory_rw_debug(cs, ins_addr, (uint8_t *) &ins,
> > > +                                       sizeof(uint32_t), 0);
> >
> > ... we're trying to actual walk the KVM guest's page tables. That
> > seems a bit odd, and the "_debug" function name used for it makes
> > it seem even more odd.
> >
> > > +
> > > +    error_report("Data abort exception with no valid ISS generated by "
> > > +                 "guest memory access at physical address: 0x" TARGET_FMT_lx,
> > > +                 (target_ulong)fault_ipa);
> > > +
> > > +    error_printf(ins_fetched ? "%s : 0x%" PRIx32 " (ins encoded)\n"  : "%s\n",
> > > +                 "KVM unable to emulate faulting instruction",
> > > +                 (!ins_fetched ? 0 : ldl_le_p(&ins)));
> > > +    error_printf("Injecting a data abort exception into guest.\n");
> >
> > The guest shouldn't be able to generate three lines of errors on the host
> > whenever it wants. That's a security bug. One trace point here seems like
> > the most we should do. Or, after these three lines we should kill the
> > guest.
> >
> You have a point here, this can indeed be exploited by the malicious
> guest. Not sure if killing it is necessary here.
> I could limit the logging though that gets tricky for aborts triggered by
> user-space processes which would probably leave the guest running
> without much of an issue there.
> I can y get rid of logging the additional info and keep the single
> statement of an instruction not being emulated but that still leaves an
> open door for potential exploits...
> 
> > Actually, I don't think we should attempt to get the instruction at all.
> > We should do what the KVM documenation suggests we do when we get
> > this exit. We should either do a user selected action: one of suspend,
> > dump, restart, or inject a dabt (as is done below). The last choice hopes
> > that the guest will handle it in some graceful way, or that it'll crash
> > with all the information needed for a post-mortem crash investigation.
> >
> > And I don't think we should do the "very brave" option of trying to
> > emulate the instruction, even if we identify it as a valid MMIO address.
> > That just sounds like a huge can of worms.
> >
> > Does QEMU already have a way for users to select a
> > don't-know-how-to-handle-guest-exception behavior?
> >
> 
> The function is attempting to inject the external data abort. The rest is for
> the aftermath analysis to easy figuring out what has happened
> which will not be so easy in case of user-space process triggering
> the DABT with no valid ISS. There is no attempt of emulating the instruction
> just simply saying the guest tried to access this address with this instruction
> which couldn't be handled. Might be excessive in some cases
> (like misbehaving kernel) but it also might be handy on those rare
> occasions when there is not much to analyze.
> But as per the issue you have raised above I will rework that.
> 
> Not sure if there is a mechanism to specify the 'preferred' behavior though.

I think a mechanism for a preferred behavior would be nice, because
all options make sense, depending on the use case. I'm not sure what
the right default is, though. Maybe this dabt injection, but there's
a good chance it'll just loop here. So maybe QEMU should just abort.
Aborting should be the safest default (from the host's PoV).

Thanks,
drew


> 
> BR
> 
> 
> Beata
> 
> 
> > > +    /*
> > > +     * Set pending ext dabt and trigger SET_EVENTS so that
> > > +     * KVM can inject the abort
> > > +     */
> > > +    env->ext_dabt_pending = 1;
> > > +
> > > +    return 0;
> > > +}
> >
> > Thanks,
> > drew
> >
> 



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/2] target/arm: kvm: Handle DABT with no valid ISS
@ 2020-02-07  8:19         ` Andrew Jones
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Jones @ 2020-02-07  8:19 UTC (permalink / raw)
  To: Beata Michalska; +Cc: QEMU Developers, qemu-arm, Paolo Bonzini, kvmarm

On Thu, Feb 06, 2020 at 09:48:05PM +0000, Beata Michalska wrote:
> On Wed, 5 Feb 2020 at 16:57, Andrew Jones <drjones@redhat.com> wrote:
> >
> > On Wed, Jan 29, 2020 at 08:24:41PM +0000, Beata Michalska wrote:
> > > On ARMv7 & ARMv8 some load/store instructions might trigger a data abort
> > > exception with no valid ISS info to be decoded. The lack of decode info
> > > makes it at least tricky to emulate those instruction which is one of the
> > > (many) reasons why KVM will not even try to do so.
> > >
> > > Add support for handling those by requesting KVM to inject external
> > > dabt into the quest.
> > >
> > > Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> > > ---
> > >  target/arm/cpu.h     |  2 ++
> > >  target/arm/kvm.c     | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  target/arm/kvm32.c   |  3 ++
> > >  target/arm/kvm64.c   |  3 ++
> > >  target/arm/kvm_arm.h | 19 +++++++++++
> > >  5 files changed, 123 insertions(+)
> > >
> > > diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> > > index c1aedbe..e04a8d3 100644
> > > --- a/target/arm/cpu.h
> > > +++ b/target/arm/cpu.h
> > > @@ -558,6 +558,8 @@ typedef struct CPUARMState {
> > >          uint8_t has_esr;
> > >          uint64_t esr;
> > >      } serror;
> > > +    /* Status field for pending external dabt */
> > > +    uint8_t ext_dabt_pending;
> > >
> > >      /* State of our input IRQ/FIQ/VIRQ/VFIQ lines */
> > >      uint32_t irq_line_state;
> > > diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> > > index 8d82889..e7bc9b7 100644
> > > --- a/target/arm/kvm.c
> > > +++ b/target/arm/kvm.c
> > > @@ -37,6 +37,7 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
> > >
> > >  static bool cap_has_mp_state;
> > >  static bool cap_has_inject_serror_esr;
> > > +static bool cap_has_inject_ext_dabt; /* KVM_CAP_ARM_INJECT_EXT_DABT */
> >
> > nit: the KVM_CAP_ARM_INJECT_EXT_DABT comment is unnecessary
> 
> Might be - I just find it handy when looking for  related details.
> I will remove that one though.
> 
> >
> > >
> > >  static ARMHostCPUFeatures arm_host_cpu_features;
> > >
> > > @@ -62,6 +63,12 @@ void kvm_arm_init_serror_injection(CPUState *cs)
> > >                                      KVM_CAP_ARM_INJECT_SERROR_ESR);
> > >  }
> > >
> > > +void kvm_arm_init_ext_dabt_injection(CPUState *cs)
> > > +{
> > > +    cap_has_inject_ext_dabt = kvm_check_extension(cs->kvm_state,
> > > +                                    KVM_CAP_ARM_INJECT_EXT_DABT);
> > > +}
> > > +
> > >  bool kvm_arm_create_scratch_host_vcpu(const uint32_t *cpus_to_try,
> > >                                        int *fdarray,
> > >                                        struct kvm_vcpu_init *init)
> > > @@ -216,6 +223,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
> > >          ret = -EINVAL;
> > >      }
> > >
> > > +    if (kvm_check_extension(s, KVM_CAP_ARM_NISV_TO_USER))
> > > +        if (kvm_vm_enable_cap(s, KVM_CAP_ARM_NISV_TO_USER, 0)) {
> > > +            warn_report("Failed to enable DABT NISV cap");
> > > +        }
> > > +
> >
> > Missing {} around the outer block.
> 
> Checkpatch didn't complain ...
> Will fix that.
> 
> >
> > As KVM_CAP_ARM_INJECT_EXT_DABT is a VM capability then I think we should
> > set cap_has_inject_ext_dabt here, like cap_has_mp_state gets set. I see
> > you've followed the pattern used for cap_has_inject_serror_esr, but that
> > looks wrong too since KVM_CAP_ARM_INJECT_SERROR_ESR is also a VM
> > capability. The way it is now we just keep setting
> > cap_has_inject_serror_esr to the same value, NR_VCPUS times.
> >
> You are totally right - I have completely missed that point! Thanks.
> 
> > >      return ret;
> > >  }
> > >
> > > @@ -598,6 +610,10 @@ int kvm_put_vcpu_events(ARMCPU *cpu)
> > >          events.exception.serror_esr = env->serror.esr;
> > >      }
> > >
> > > +    if (cap_has_inject_ext_dabt) {
> > > +        events.exception.ext_dabt_pending = env->ext_dabt_pending;
> > > +    }
> > > +
> > >      ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
> > >      if (ret) {
> > >          error_report("failed to put vcpu events");
> > > @@ -627,6 +643,8 @@ int kvm_get_vcpu_events(ARMCPU *cpu)
> > >      env->serror.has_esr = events.exception.serror_has_esr;
> > >      env->serror.esr = events.exception.serror_esr;
> > >
> > > +    env->ext_dabt_pending = events.exception.ext_dabt_pending;
> > > +
> >
> > afaict from Documentation/virt/kvm/api.txt and the KVM code you cannot
> > get this state. Therefore the above line (and extra stray blank line)
> > should be dropped.
> >
> That's true, though this is a lightweight way of resetting the vcpu state.
> We would have to do that otherwise to mark that this case has been handled
> and that the abort is no longer pending.

There's no reason to wait until get-vcpu-events time to reset this.
According to the KVM documentation (and the code) the event is
immediately delivered at vcpu_ioctl(KVM_SET_VCPU_EVENTS) time.
So I think the kvm_put_vcpu_events() patch should be

 if (env->ext_dabt_pending) {
     events.exception.ext_dabt_pending = env->ext_dabt_pending;
     env->ext_dabt_pending = 0;
 }

(env->ext_dabt_pending will only be non-zero if we have the capability)

> 
> > >      return 0;
> > >  }
> > >
> > > @@ -634,6 +652,7 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
> > >  {
> > >  }
> > >
> > > +
> >
> > stray blank line
> >
> > >  MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
> > >  {
> > >      ARMCPU *cpu;
> > > @@ -699,6 +718,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
> > >              ret = EXCP_DEBUG;
> > >          } /* otherwise return to guest */
> > >          break;
> > > +    case KVM_EXIT_ARM_NISV:
> > > +        /* External DABT with no valid iss to decode */
> > > +        ret = kvm_arm_handle_dabt_nisv(cs, run->arm_nisv.esr_iss,
> > > +                                       run->arm_nisv.fault_ipa);
> > > +        break;
> > >      default:
> > >          qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n",
> > >                        __func__, run->exit_reason);
> > > @@ -833,3 +857,75 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
> > >  {
> > >      return (data - 32) & 0xffff;
> > >  }
> > > +
> > > +int kvm_arm_handle_dabt_nisv(CPUState *cs, uint64_t esr_iss,
> > > +                             uint64_t fault_ipa)
> > > +{
> > > +    ARMCPU *cpu = ARM_CPU(cs);
> > > +    CPUARMState *env = &cpu->env;
> > > +    uint32_t ins, ins_fetched;
> >
> > ins_fetched is a bool
> 
> Actually that's int (as per cpu_memory_rw_debug)

But that's not the way you're using it here. It you really want to
ensure the return value is converted to a bool, then you could do

 ins_fetched = !!cpu_memory_rw_debug()

> >
> > > +
> > > +    /*
> > > +     * Hacky workaround for kernels that for aarch32 guests, instead of expected
> > > +     * external data abort, inject the IMPLEMENTATION DEFINED exception with the
> > > +     * lock-down. This is actually handled by the guest which results in
> > > +     * re-running the faulting instruction.
> > > +     * This intends to break the vicious cycle.
> > > +     */
> >
> > This doesn't seem like the right thing to do. What happens on real
> > hardware in this case? If an OS would get into a "vicious cycle" on
> > real hardware then it should get into one on KVM too.
> >
> That's the problem. that would not happen on a real hardware.
> We might end up here due to a KVM bug (which has been fixed recently)
> when the injected
> abort is not the one expected (as of not the one that would be
> triggered by hw in this
> particular case).
> Details in : https://patchwork.kernel.org/patch/11358083/

If KVM can be fixed (and in this case already is fixed - 21aecdbd7f3a),
then doesn't it make more sense to just fix KVM, than to add a workaround
to QEMU?

> 
> > > +    if (!is_a64(env)) {
> > > +        static uint8_t setback;
> > > +
> > > +        /*
> > > +         * The state has not been synchronized yet, so if this is re-occurrence
> > > +         * of the same abort triggered by guest, the status for pending external
> > > +         * abort should not get cleared yet
> > > +         */
> > > +        if (unlikely(env->ext_dabt_pending)) {
> > > +            if (setback) {
> > > +                error_report("Most probably triggered kernel issue with"
> > > +                             " injecting external data abort.");
> > > +                error_printf("Giving up trying ...\n");
> > > +                /* Here is where the fun comes to an end */
> > > +                return -EINVAL;
> > > +            }
> > > +        }
> > > +        setback = env->ext_dabt_pending;
> > > +    }
> > > +
> > > +    kvm_cpu_synchronize_state(cs);
> > > +   /*
> > > +    * ISS [23:14] is invalid so there is a limited info
> > > +    * on what has just happened so the only *useful* thing that can
> > > +    * be retrieved from ISS is WnR & DFSC (though in some cases WnR
> > > +    * might be less of a value as well)
> > > +    */
> > > +
> > > +    /*
> > > +     * Get current PC before it will get updated to exception vector entry
> > > +     */
> > > +    target_ulong ins_addr = is_a64(env) ? env->pc : env->regs[15];
> >
> > ins_addr should be declared above
> >
> > But what are we doing? pc is a guest virtual address. Oh, I see...
> >
> > > +
> > > +    /*
> > > +     * Get the faulting instruction
> > > +     * Instructions have a fixed length of 32 bits
> > > +     * and are always little-endian
> > > +     */
> > > +    ins_fetched = !cpu_memory_rw_debug(cs, ins_addr, (uint8_t *) &ins,
> > > +                                       sizeof(uint32_t), 0);
> >
> > ... we're trying to actual walk the KVM guest's page tables. That
> > seems a bit odd, and the "_debug" function name used for it makes
> > it seem even more odd.
> >
> > > +
> > > +    error_report("Data abort exception with no valid ISS generated by "
> > > +                 "guest memory access at physical address: 0x" TARGET_FMT_lx,
> > > +                 (target_ulong)fault_ipa);
> > > +
> > > +    error_printf(ins_fetched ? "%s : 0x%" PRIx32 " (ins encoded)\n"  : "%s\n",
> > > +                 "KVM unable to emulate faulting instruction",
> > > +                 (!ins_fetched ? 0 : ldl_le_p(&ins)));
> > > +    error_printf("Injecting a data abort exception into guest.\n");
> >
> > The guest shouldn't be able to generate three lines of errors on the host
> > whenever it wants. That's a security bug. One trace point here seems like
> > the most we should do. Or, after these three lines we should kill the
> > guest.
> >
> You have a point here, this can indeed be exploited by the malicious
> guest. Not sure if killing it is necessary here.
> I could limit the logging though that gets tricky for aborts triggered by
> user-space processes which would probably leave the guest running
> without much of an issue there.
> I can y get rid of logging the additional info and keep the single
> statement of an instruction not being emulated but that still leaves an
> open door for potential exploits...
> 
> > Actually, I don't think we should attempt to get the instruction at all.
> > We should do what the KVM documenation suggests we do when we get
> > this exit. We should either do a user selected action: one of suspend,
> > dump, restart, or inject a dabt (as is done below). The last choice hopes
> > that the guest will handle it in some graceful way, or that it'll crash
> > with all the information needed for a post-mortem crash investigation.
> >
> > And I don't think we should do the "very brave" option of trying to
> > emulate the instruction, even if we identify it as a valid MMIO address.
> > That just sounds like a huge can of worms.
> >
> > Does QEMU already have a way for users to select a
> > don't-know-how-to-handle-guest-exception behavior?
> >
> 
> The function is attempting to inject the external data abort. The rest is for
> the aftermath analysis to easy figuring out what has happened
> which will not be so easy in case of user-space process triggering
> the DABT with no valid ISS. There is no attempt of emulating the instruction
> just simply saying the guest tried to access this address with this instruction
> which couldn't be handled. Might be excessive in some cases
> (like misbehaving kernel) but it also might be handy on those rare
> occasions when there is not much to analyze.
> But as per the issue you have raised above I will rework that.
> 
> Not sure if there is a mechanism to specify the 'preferred' behavior though.

I think a mechanism for a preferred behavior would be nice, because
all options make sense, depending on the use case. I'm not sure what
the right default is, though. Maybe this dabt injection, but there's
a good chance it'll just loop here. So maybe QEMU should just abort.
Aborting should be the safest default (from the host's PoV).

Thanks,
drew


> 
> BR
> 
> 
> Beata
> 
> 
> > > +    /*
> > > +     * Set pending ext dabt and trigger SET_EVENTS so that
> > > +     * KVM can inject the abort
> > > +     */
> > > +    env->ext_dabt_pending = 1;
> > > +
> > > +    return 0;
> > > +}
> >
> > Thanks,
> > drew
> >
> 

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/2] target/arm: kvm: Handle DABT with no valid ISS
  2020-02-07  8:19         ` Andrew Jones
@ 2020-02-11 23:10           ` Beata Michalska
  -1 siblings, 0 replies; 20+ messages in thread
From: Beata Michalska @ 2020-02-11 23:10 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Peter Maydell, Christoffer Dall, QEMU Developers, qemu-arm,
	Paolo Bonzini, kvmarm

On Fri, 7 Feb 2020 at 08:20, Andrew Jones <drjones@redhat.com> wrote:
>
> On Thu, Feb 06, 2020 at 09:48:05PM +0000, Beata Michalska wrote:
> > On Wed, 5 Feb 2020 at 16:57, Andrew Jones <drjones@redhat.com> wrote:
> > >
> > > On Wed, Jan 29, 2020 at 08:24:41PM +0000, Beata Michalska wrote:
> > > > On ARMv7 & ARMv8 some load/store instructions might trigger a data abort
> > > > exception with no valid ISS info to be decoded. The lack of decode info
> > > > makes it at least tricky to emulate those instruction which is one of the
> > > > (many) reasons why KVM will not even try to do so.
> > > >
> > > > Add support for handling those by requesting KVM to inject external
> > > > dabt into the quest.
> > > >
> > > > Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> > > > ---
> > > >  target/arm/cpu.h     |  2 ++
> > > >  target/arm/kvm.c     | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > >  target/arm/kvm32.c   |  3 ++
> > > >  target/arm/kvm64.c   |  3 ++
> > > >  target/arm/kvm_arm.h | 19 +++++++++++
> > > >  5 files changed, 123 insertions(+)
> > > >
> > > > diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> > > > index c1aedbe..e04a8d3 100644
> > > > --- a/target/arm/cpu.h
> > > > +++ b/target/arm/cpu.h
> > > > @@ -558,6 +558,8 @@ typedef struct CPUARMState {
> > > >          uint8_t has_esr;
> > > >          uint64_t esr;
> > > >      } serror;
> > > > +    /* Status field for pending external dabt */
> > > > +    uint8_t ext_dabt_pending;
> > > >
> > > >      /* State of our input IRQ/FIQ/VIRQ/VFIQ lines */
> > > >      uint32_t irq_line_state;
> > > > diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> > > > index 8d82889..e7bc9b7 100644
> > > > --- a/target/arm/kvm.c
> > > > +++ b/target/arm/kvm.c
> > > > @@ -37,6 +37,7 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
> > > >
> > > >  static bool cap_has_mp_state;
> > > >  static bool cap_has_inject_serror_esr;
> > > > +static bool cap_has_inject_ext_dabt; /* KVM_CAP_ARM_INJECT_EXT_DABT */
> > >
> > > nit: the KVM_CAP_ARM_INJECT_EXT_DABT comment is unnecessary
> >
> > Might be - I just find it handy when looking for  related details.
> > I will remove that one though.
> >
> > >
> > > >
> > > >  static ARMHostCPUFeatures arm_host_cpu_features;
> > > >
> > > > @@ -62,6 +63,12 @@ void kvm_arm_init_serror_injection(CPUState *cs)
> > > >                                      KVM_CAP_ARM_INJECT_SERROR_ESR);
> > > >  }
> > > >
> > > > +void kvm_arm_init_ext_dabt_injection(CPUState *cs)
> > > > +{
> > > > +    cap_has_inject_ext_dabt = kvm_check_extension(cs->kvm_state,
> > > > +                                    KVM_CAP_ARM_INJECT_EXT_DABT);
> > > > +}
> > > > +
> > > >  bool kvm_arm_create_scratch_host_vcpu(const uint32_t *cpus_to_try,
> > > >                                        int *fdarray,
> > > >                                        struct kvm_vcpu_init *init)
> > > > @@ -216,6 +223,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
> > > >          ret = -EINVAL;
> > > >      }
> > > >
> > > > +    if (kvm_check_extension(s, KVM_CAP_ARM_NISV_TO_USER))
> > > > +        if (kvm_vm_enable_cap(s, KVM_CAP_ARM_NISV_TO_USER, 0)) {
> > > > +            warn_report("Failed to enable DABT NISV cap");
> > > > +        }
> > > > +
> > >
> > > Missing {} around the outer block.
> >
> > Checkpatch didn't complain ...
> > Will fix that.
> >
> > >
> > > As KVM_CAP_ARM_INJECT_EXT_DABT is a VM capability then I think we should
> > > set cap_has_inject_ext_dabt here, like cap_has_mp_state gets set. I see
> > > you've followed the pattern used for cap_has_inject_serror_esr, but that
> > > looks wrong too since KVM_CAP_ARM_INJECT_SERROR_ESR is also a VM
> > > capability. The way it is now we just keep setting
> > > cap_has_inject_serror_esr to the same value, NR_VCPUS times.
> > >
> > You are totally right - I have completely missed that point! Thanks.
> >
> > > >      return ret;
> > > >  }
> > > >
> > > > @@ -598,6 +610,10 @@ int kvm_put_vcpu_events(ARMCPU *cpu)
> > > >          events.exception.serror_esr = env->serror.esr;
> > > >      }
> > > >
> > > > +    if (cap_has_inject_ext_dabt) {
> > > > +        events.exception.ext_dabt_pending = env->ext_dabt_pending;
> > > > +    }
> > > > +
> > > >      ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
> > > >      if (ret) {
> > > >          error_report("failed to put vcpu events");
> > > > @@ -627,6 +643,8 @@ int kvm_get_vcpu_events(ARMCPU *cpu)
> > > >      env->serror.has_esr = events.exception.serror_has_esr;
> > > >      env->serror.esr = events.exception.serror_esr;
> > > >
> > > > +    env->ext_dabt_pending = events.exception.ext_dabt_pending;
> > > > +
> > >
> > > afaict from Documentation/virt/kvm/api.txt and the KVM code you cannot
> > > get this state. Therefore the above line (and extra stray blank line)
> > > should be dropped.
> > >
> > That's true, though this is a lightweight way of resetting the vcpu state.
> > We would have to do that otherwise to mark that this case has been handled
> > and that the abort is no longer pending.
>
> There's no reason to wait until get-vcpu-events time to reset this.
> According to the KVM documentation (and the code) the event is
> immediately delivered at vcpu_ioctl(KVM_SET_VCPU_EVENTS) time.
> So I think the kvm_put_vcpu_events() patch should be
>
>  if (env->ext_dabt_pending) {
>      events.exception.ext_dabt_pending = env->ext_dabt_pending;
>      env->ext_dabt_pending = 0;
>  }
>
> (env->ext_dabt_pending will only be non-zero if we have the capability)
>
I would  expect this then to be reset only after successful ioctl call.
Will update it.

> >
> > > >      return 0;
> > > >  }
> > > >
> > > > @@ -634,6 +652,7 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
> > > >  {
> > > >  }
> > > >
> > > > +
> > >
> > > stray blank line
> > >
> > > >  MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
> > > >  {
> > > >      ARMCPU *cpu;
> > > > @@ -699,6 +718,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
> > > >              ret = EXCP_DEBUG;
> > > >          } /* otherwise return to guest */
> > > >          break;
> > > > +    case KVM_EXIT_ARM_NISV:
> > > > +        /* External DABT with no valid iss to decode */
> > > > +        ret = kvm_arm_handle_dabt_nisv(cs, run->arm_nisv.esr_iss,
> > > > +                                       run->arm_nisv.fault_ipa);
> > > > +        break;
> > > >      default:
> > > >          qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n",
> > > >                        __func__, run->exit_reason);
> > > > @@ -833,3 +857,75 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
> > > >  {
> > > >      return (data - 32) & 0xffff;
> > > >  }
> > > > +
> > > > +int kvm_arm_handle_dabt_nisv(CPUState *cs, uint64_t esr_iss,
> > > > +                             uint64_t fault_ipa)
> > > > +{
> > > > +    ARMCPU *cpu = ARM_CPU(cs);
> > > > +    CPUARMState *env = &cpu->env;
> > > > +    uint32_t ins, ins_fetched;
> > >
> > > ins_fetched is a bool
> >
> > Actually that's int (as per cpu_memory_rw_debug)
>
> But that's not the way you're using it here. It you really want to
> ensure the return value is converted to a bool, then you could do
>
>  ins_fetched = !!cpu_memory_rw_debug()
>
> > >
> > > > +
> > > > +    /*
> > > > +     * Hacky workaround for kernels that for aarch32 guests, instead of expected
> > > > +     * external data abort, inject the IMPLEMENTATION DEFINED exception with the
> > > > +     * lock-down. This is actually handled by the guest which results in
> > > > +     * re-running the faulting instruction.
> > > > +     * This intends to break the vicious cycle.
> > > > +     */
> > >
> > > This doesn't seem like the right thing to do. What happens on real
> > > hardware in this case? If an OS would get into a "vicious cycle" on
> > > real hardware then it should get into one on KVM too.
> > >
> > That's the problem. that would not happen on a real hardware.
> > We might end up here due to a KVM bug (which has been fixed recently)
> > when the injected
> > abort is not the one expected (as of not the one that would be
> > triggered by hw in this
> > particular case).
> > Details in : https://patchwork.kernel.org/patch/11358083/
>
> If KVM can be fixed (and in this case already is fixed - 21aecdbd7f3a),
> then doesn't it make more sense to just fix KVM, than to add a workaround
> to QEMU?
>
This is to cover the cases (hopefully rare but still) when the host kernel
won't get updated, leaving the issue exposed to the guest. Not sure what
or if there is any policy regarding that in Qemu (?) There is also no mechanism
within KVM to allow the user space to verify whether the fix is available
or not. Capabilities are not designed for that (if I got things right) unless
user-space interface is broken ... which is not entirely the case
here.
> >
> > > > +    if (!is_a64(env)) {
> > > > +        static uint8_t setback;
> > > > +
> > > > +        /*
> > > > +         * The state has not been synchronized yet, so if this is re-occurrence
> > > > +         * of the same abort triggered by guest, the status for pending external
> > > > +         * abort should not get cleared yet
> > > > +         */
> > > > +        if (unlikely(env->ext_dabt_pending)) {
> > > > +            if (setback) {
> > > > +                error_report("Most probably triggered kernel issue with"
> > > > +                             " injecting external data abort.");
> > > > +                error_printf("Giving up trying ...\n");
> > > > +                /* Here is where the fun comes to an end */
> > > > +                return -EINVAL;
> > > > +            }
> > > > +        }
> > > > +        setback = env->ext_dabt_pending;
> > > > +    }
> > > > +
> > > > +    kvm_cpu_synchronize_state(cs);
> > > > +   /*
> > > > +    * ISS [23:14] is invalid so there is a limited info
> > > > +    * on what has just happened so the only *useful* thing that can
> > > > +    * be retrieved from ISS is WnR & DFSC (though in some cases WnR
> > > > +    * might be less of a value as well)
> > > > +    */
> > > > +
> > > > +    /*
> > > > +     * Get current PC before it will get updated to exception vector entry
> > > > +     */
> > > > +    target_ulong ins_addr = is_a64(env) ? env->pc : env->regs[15];
> > >
> > > ins_addr should be declared above
> > >
> > > But what are we doing? pc is a guest virtual address. Oh, I see...
> > >
> > > > +
> > > > +    /*
> > > > +     * Get the faulting instruction
> > > > +     * Instructions have a fixed length of 32 bits
> > > > +     * and are always little-endian
> > > > +     */
> > > > +    ins_fetched = !cpu_memory_rw_debug(cs, ins_addr, (uint8_t *) &ins,
> > > > +                                       sizeof(uint32_t), 0);
> > >
> > > ... we're trying to actual walk the KVM guest's page tables. That
> > > seems a bit odd, and the "_debug" function name used for it makes
> > > it seem even more odd.
> > >
> > > > +
> > > > +    error_report("Data abort exception with no valid ISS generated by "
> > > > +                 "guest memory access at physical address: 0x" TARGET_FMT_lx,
> > > > +                 (target_ulong)fault_ipa);
> > > > +
> > > > +    error_printf(ins_fetched ? "%s : 0x%" PRIx32 " (ins encoded)\n"  : "%s\n",
> > > > +                 "KVM unable to emulate faulting instruction",
> > > > +                 (!ins_fetched ? 0 : ldl_le_p(&ins)));
> > > > +    error_printf("Injecting a data abort exception into guest.\n");
> > >
> > > The guest shouldn't be able to generate three lines of errors on the host
> > > whenever it wants. That's a security bug. One trace point here seems like
> > > the most we should do. Or, after these three lines we should kill the
> > > guest.
> > >
> > You have a point here, this can indeed be exploited by the malicious
> > guest. Not sure if killing it is necessary here.
> > I could limit the logging though that gets tricky for aborts triggered by
> > user-space processes which would probably leave the guest running
> > without much of an issue there.
> > I can y get rid of logging the additional info and keep the single
> > statement of an instruction not being emulated but that still leaves an
> > open door for potential exploits...
> >
> > > Actually, I don't think we should attempt to get the instruction at all.
> > > We should do what the KVM documenation suggests we do when we get
> > > this exit. We should either do a user selected action: one of suspend,
> > > dump, restart, or inject a dabt (as is done below). The last choice hopes
> > > that the guest will handle it in some graceful way, or that it'll crash
> > > with all the information needed for a post-mortem crash investigation.
> > >
> > > And I don't think we should do the "very brave" option of trying to
> > > emulate the instruction, even if we identify it as a valid MMIO address.
> > > That just sounds like a huge can of worms.
> > >
> > > Does QEMU already have a way for users to select a
> > > don't-know-how-to-handle-guest-exception behavior?
> > >
> >
> > The function is attempting to inject the external data abort. The rest is for
> > the aftermath analysis to easy figuring out what has happened
> > which will not be so easy in case of user-space process triggering
> > the DABT with no valid ISS. There is no attempt of emulating the instruction
> > just simply saying the guest tried to access this address with this instruction
> > which couldn't be handled. Might be excessive in some cases
> > (like misbehaving kernel) but it also might be handy on those rare
> > occasions when there is not much to analyze.
> > But as per the issue you have raised above I will rework that.
> >
> > Not sure if there is a mechanism to specify the 'preferred' behavior though.
>
> I think a mechanism for a preferred behavior would be nice, because
> all options make sense, depending on the use case. I'm not sure what
> the right default is, though. Maybe this dabt injection, but there's
> a good chance it'll just loop here. So maybe QEMU should just abort.
> Aborting should be the safest default (from the host's PoV).

I can still play around with this for a bit to make it more .. sensible,
keeping in mind the potential risk with the current solution you have
pointed out. I would still like to keep the abort injection as that is the
main reason behind having this change. It might be a bit tricky,
(time-wise)  to add the 'def behaviour' interface though.
 Is there anything else that could potentially benefit from it ?

BR
Beata
>
> Thanks,
> drew
>
>
> >
> > BR
> >
> >
> > Beata
> >
> >
> > > > +    /*
> > > > +     * Set pending ext dabt and trigger SET_EVENTS so that
> > > > +     * KVM can inject the abort
> > > > +     */
> > > > +    env->ext_dabt_pending = 1;
> > > > +
> > > > +    return 0;
> > > > +}
> > >
> > > Thanks,
> > > drew
> > >
> >
>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/2] target/arm: kvm: Handle DABT with no valid ISS
@ 2020-02-11 23:10           ` Beata Michalska
  0 siblings, 0 replies; 20+ messages in thread
From: Beata Michalska @ 2020-02-11 23:10 UTC (permalink / raw)
  To: Andrew Jones; +Cc: QEMU Developers, qemu-arm, Paolo Bonzini, kvmarm

On Fri, 7 Feb 2020 at 08:20, Andrew Jones <drjones@redhat.com> wrote:
>
> On Thu, Feb 06, 2020 at 09:48:05PM +0000, Beata Michalska wrote:
> > On Wed, 5 Feb 2020 at 16:57, Andrew Jones <drjones@redhat.com> wrote:
> > >
> > > On Wed, Jan 29, 2020 at 08:24:41PM +0000, Beata Michalska wrote:
> > > > On ARMv7 & ARMv8 some load/store instructions might trigger a data abort
> > > > exception with no valid ISS info to be decoded. The lack of decode info
> > > > makes it at least tricky to emulate those instruction which is one of the
> > > > (many) reasons why KVM will not even try to do so.
> > > >
> > > > Add support for handling those by requesting KVM to inject external
> > > > dabt into the quest.
> > > >
> > > > Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> > > > ---
> > > >  target/arm/cpu.h     |  2 ++
> > > >  target/arm/kvm.c     | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > >  target/arm/kvm32.c   |  3 ++
> > > >  target/arm/kvm64.c   |  3 ++
> > > >  target/arm/kvm_arm.h | 19 +++++++++++
> > > >  5 files changed, 123 insertions(+)
> > > >
> > > > diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> > > > index c1aedbe..e04a8d3 100644
> > > > --- a/target/arm/cpu.h
> > > > +++ b/target/arm/cpu.h
> > > > @@ -558,6 +558,8 @@ typedef struct CPUARMState {
> > > >          uint8_t has_esr;
> > > >          uint64_t esr;
> > > >      } serror;
> > > > +    /* Status field for pending external dabt */
> > > > +    uint8_t ext_dabt_pending;
> > > >
> > > >      /* State of our input IRQ/FIQ/VIRQ/VFIQ lines */
> > > >      uint32_t irq_line_state;
> > > > diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> > > > index 8d82889..e7bc9b7 100644
> > > > --- a/target/arm/kvm.c
> > > > +++ b/target/arm/kvm.c
> > > > @@ -37,6 +37,7 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
> > > >
> > > >  static bool cap_has_mp_state;
> > > >  static bool cap_has_inject_serror_esr;
> > > > +static bool cap_has_inject_ext_dabt; /* KVM_CAP_ARM_INJECT_EXT_DABT */
> > >
> > > nit: the KVM_CAP_ARM_INJECT_EXT_DABT comment is unnecessary
> >
> > Might be - I just find it handy when looking for  related details.
> > I will remove that one though.
> >
> > >
> > > >
> > > >  static ARMHostCPUFeatures arm_host_cpu_features;
> > > >
> > > > @@ -62,6 +63,12 @@ void kvm_arm_init_serror_injection(CPUState *cs)
> > > >                                      KVM_CAP_ARM_INJECT_SERROR_ESR);
> > > >  }
> > > >
> > > > +void kvm_arm_init_ext_dabt_injection(CPUState *cs)
> > > > +{
> > > > +    cap_has_inject_ext_dabt = kvm_check_extension(cs->kvm_state,
> > > > +                                    KVM_CAP_ARM_INJECT_EXT_DABT);
> > > > +}
> > > > +
> > > >  bool kvm_arm_create_scratch_host_vcpu(const uint32_t *cpus_to_try,
> > > >                                        int *fdarray,
> > > >                                        struct kvm_vcpu_init *init)
> > > > @@ -216,6 +223,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
> > > >          ret = -EINVAL;
> > > >      }
> > > >
> > > > +    if (kvm_check_extension(s, KVM_CAP_ARM_NISV_TO_USER))
> > > > +        if (kvm_vm_enable_cap(s, KVM_CAP_ARM_NISV_TO_USER, 0)) {
> > > > +            warn_report("Failed to enable DABT NISV cap");
> > > > +        }
> > > > +
> > >
> > > Missing {} around the outer block.
> >
> > Checkpatch didn't complain ...
> > Will fix that.
> >
> > >
> > > As KVM_CAP_ARM_INJECT_EXT_DABT is a VM capability then I think we should
> > > set cap_has_inject_ext_dabt here, like cap_has_mp_state gets set. I see
> > > you've followed the pattern used for cap_has_inject_serror_esr, but that
> > > looks wrong too since KVM_CAP_ARM_INJECT_SERROR_ESR is also a VM
> > > capability. The way it is now we just keep setting
> > > cap_has_inject_serror_esr to the same value, NR_VCPUS times.
> > >
> > You are totally right - I have completely missed that point! Thanks.
> >
> > > >      return ret;
> > > >  }
> > > >
> > > > @@ -598,6 +610,10 @@ int kvm_put_vcpu_events(ARMCPU *cpu)
> > > >          events.exception.serror_esr = env->serror.esr;
> > > >      }
> > > >
> > > > +    if (cap_has_inject_ext_dabt) {
> > > > +        events.exception.ext_dabt_pending = env->ext_dabt_pending;
> > > > +    }
> > > > +
> > > >      ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
> > > >      if (ret) {
> > > >          error_report("failed to put vcpu events");
> > > > @@ -627,6 +643,8 @@ int kvm_get_vcpu_events(ARMCPU *cpu)
> > > >      env->serror.has_esr = events.exception.serror_has_esr;
> > > >      env->serror.esr = events.exception.serror_esr;
> > > >
> > > > +    env->ext_dabt_pending = events.exception.ext_dabt_pending;
> > > > +
> > >
> > > afaict from Documentation/virt/kvm/api.txt and the KVM code you cannot
> > > get this state. Therefore the above line (and extra stray blank line)
> > > should be dropped.
> > >
> > That's true, though this is a lightweight way of resetting the vcpu state.
> > We would have to do that otherwise to mark that this case has been handled
> > and that the abort is no longer pending.
>
> There's no reason to wait until get-vcpu-events time to reset this.
> According to the KVM documentation (and the code) the event is
> immediately delivered at vcpu_ioctl(KVM_SET_VCPU_EVENTS) time.
> So I think the kvm_put_vcpu_events() patch should be
>
>  if (env->ext_dabt_pending) {
>      events.exception.ext_dabt_pending = env->ext_dabt_pending;
>      env->ext_dabt_pending = 0;
>  }
>
> (env->ext_dabt_pending will only be non-zero if we have the capability)
>
I would  expect this then to be reset only after successful ioctl call.
Will update it.

> >
> > > >      return 0;
> > > >  }
> > > >
> > > > @@ -634,6 +652,7 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
> > > >  {
> > > >  }
> > > >
> > > > +
> > >
> > > stray blank line
> > >
> > > >  MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
> > > >  {
> > > >      ARMCPU *cpu;
> > > > @@ -699,6 +718,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
> > > >              ret = EXCP_DEBUG;
> > > >          } /* otherwise return to guest */
> > > >          break;
> > > > +    case KVM_EXIT_ARM_NISV:
> > > > +        /* External DABT with no valid iss to decode */
> > > > +        ret = kvm_arm_handle_dabt_nisv(cs, run->arm_nisv.esr_iss,
> > > > +                                       run->arm_nisv.fault_ipa);
> > > > +        break;
> > > >      default:
> > > >          qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n",
> > > >                        __func__, run->exit_reason);
> > > > @@ -833,3 +857,75 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
> > > >  {
> > > >      return (data - 32) & 0xffff;
> > > >  }
> > > > +
> > > > +int kvm_arm_handle_dabt_nisv(CPUState *cs, uint64_t esr_iss,
> > > > +                             uint64_t fault_ipa)
> > > > +{
> > > > +    ARMCPU *cpu = ARM_CPU(cs);
> > > > +    CPUARMState *env = &cpu->env;
> > > > +    uint32_t ins, ins_fetched;
> > >
> > > ins_fetched is a bool
> >
> > Actually that's int (as per cpu_memory_rw_debug)
>
> But that's not the way you're using it here. It you really want to
> ensure the return value is converted to a bool, then you could do
>
>  ins_fetched = !!cpu_memory_rw_debug()
>
> > >
> > > > +
> > > > +    /*
> > > > +     * Hacky workaround for kernels that for aarch32 guests, instead of expected
> > > > +     * external data abort, inject the IMPLEMENTATION DEFINED exception with the
> > > > +     * lock-down. This is actually handled by the guest which results in
> > > > +     * re-running the faulting instruction.
> > > > +     * This intends to break the vicious cycle.
> > > > +     */
> > >
> > > This doesn't seem like the right thing to do. What happens on real
> > > hardware in this case? If an OS would get into a "vicious cycle" on
> > > real hardware then it should get into one on KVM too.
> > >
> > That's the problem. that would not happen on a real hardware.
> > We might end up here due to a KVM bug (which has been fixed recently)
> > when the injected
> > abort is not the one expected (as of not the one that would be
> > triggered by hw in this
> > particular case).
> > Details in : https://patchwork.kernel.org/patch/11358083/
>
> If KVM can be fixed (and in this case already is fixed - 21aecdbd7f3a),
> then doesn't it make more sense to just fix KVM, than to add a workaround
> to QEMU?
>
This is to cover the cases (hopefully rare but still) when the host kernel
won't get updated, leaving the issue exposed to the guest. Not sure what
or if there is any policy regarding that in Qemu (?) There is also no mechanism
within KVM to allow the user space to verify whether the fix is available
or not. Capabilities are not designed for that (if I got things right) unless
user-space interface is broken ... which is not entirely the case
here.
> >
> > > > +    if (!is_a64(env)) {
> > > > +        static uint8_t setback;
> > > > +
> > > > +        /*
> > > > +         * The state has not been synchronized yet, so if this is re-occurrence
> > > > +         * of the same abort triggered by guest, the status for pending external
> > > > +         * abort should not get cleared yet
> > > > +         */
> > > > +        if (unlikely(env->ext_dabt_pending)) {
> > > > +            if (setback) {
> > > > +                error_report("Most probably triggered kernel issue with"
> > > > +                             " injecting external data abort.");
> > > > +                error_printf("Giving up trying ...\n");
> > > > +                /* Here is where the fun comes to an end */
> > > > +                return -EINVAL;
> > > > +            }
> > > > +        }
> > > > +        setback = env->ext_dabt_pending;
> > > > +    }
> > > > +
> > > > +    kvm_cpu_synchronize_state(cs);
> > > > +   /*
> > > > +    * ISS [23:14] is invalid so there is a limited info
> > > > +    * on what has just happened so the only *useful* thing that can
> > > > +    * be retrieved from ISS is WnR & DFSC (though in some cases WnR
> > > > +    * might be less of a value as well)
> > > > +    */
> > > > +
> > > > +    /*
> > > > +     * Get current PC before it will get updated to exception vector entry
> > > > +     */
> > > > +    target_ulong ins_addr = is_a64(env) ? env->pc : env->regs[15];
> > >
> > > ins_addr should be declared above
> > >
> > > But what are we doing? pc is a guest virtual address. Oh, I see...
> > >
> > > > +
> > > > +    /*
> > > > +     * Get the faulting instruction
> > > > +     * Instructions have a fixed length of 32 bits
> > > > +     * and are always little-endian
> > > > +     */
> > > > +    ins_fetched = !cpu_memory_rw_debug(cs, ins_addr, (uint8_t *) &ins,
> > > > +                                       sizeof(uint32_t), 0);
> > >
> > > ... we're trying to actual walk the KVM guest's page tables. That
> > > seems a bit odd, and the "_debug" function name used for it makes
> > > it seem even more odd.
> > >
> > > > +
> > > > +    error_report("Data abort exception with no valid ISS generated by "
> > > > +                 "guest memory access at physical address: 0x" TARGET_FMT_lx,
> > > > +                 (target_ulong)fault_ipa);
> > > > +
> > > > +    error_printf(ins_fetched ? "%s : 0x%" PRIx32 " (ins encoded)\n"  : "%s\n",
> > > > +                 "KVM unable to emulate faulting instruction",
> > > > +                 (!ins_fetched ? 0 : ldl_le_p(&ins)));
> > > > +    error_printf("Injecting a data abort exception into guest.\n");
> > >
> > > The guest shouldn't be able to generate three lines of errors on the host
> > > whenever it wants. That's a security bug. One trace point here seems like
> > > the most we should do. Or, after these three lines we should kill the
> > > guest.
> > >
> > You have a point here, this can indeed be exploited by the malicious
> > guest. Not sure if killing it is necessary here.
> > I could limit the logging though that gets tricky for aborts triggered by
> > user-space processes which would probably leave the guest running
> > without much of an issue there.
> > I can y get rid of logging the additional info and keep the single
> > statement of an instruction not being emulated but that still leaves an
> > open door for potential exploits...
> >
> > > Actually, I don't think we should attempt to get the instruction at all.
> > > We should do what the KVM documenation suggests we do when we get
> > > this exit. We should either do a user selected action: one of suspend,
> > > dump, restart, or inject a dabt (as is done below). The last choice hopes
> > > that the guest will handle it in some graceful way, or that it'll crash
> > > with all the information needed for a post-mortem crash investigation.
> > >
> > > And I don't think we should do the "very brave" option of trying to
> > > emulate the instruction, even if we identify it as a valid MMIO address.
> > > That just sounds like a huge can of worms.
> > >
> > > Does QEMU already have a way for users to select a
> > > don't-know-how-to-handle-guest-exception behavior?
> > >
> >
> > The function is attempting to inject the external data abort. The rest is for
> > the aftermath analysis to easy figuring out what has happened
> > which will not be so easy in case of user-space process triggering
> > the DABT with no valid ISS. There is no attempt of emulating the instruction
> > just simply saying the guest tried to access this address with this instruction
> > which couldn't be handled. Might be excessive in some cases
> > (like misbehaving kernel) but it also might be handy on those rare
> > occasions when there is not much to analyze.
> > But as per the issue you have raised above I will rework that.
> >
> > Not sure if there is a mechanism to specify the 'preferred' behavior though.
>
> I think a mechanism for a preferred behavior would be nice, because
> all options make sense, depending on the use case. I'm not sure what
> the right default is, though. Maybe this dabt injection, but there's
> a good chance it'll just loop here. So maybe QEMU should just abort.
> Aborting should be the safest default (from the host's PoV).

I can still play around with this for a bit to make it more .. sensible,
keeping in mind the potential risk with the current solution you have
pointed out. I would still like to keep the abort injection as that is the
main reason behind having this change. It might be a bit tricky,
(time-wise)  to add the 'def behaviour' interface though.
 Is there anything else that could potentially benefit from it ?

BR
Beata
>
> Thanks,
> drew
>
>
> >
> > BR
> >
> >
> > Beata
> >
> >
> > > > +    /*
> > > > +     * Set pending ext dabt and trigger SET_EVENTS so that
> > > > +     * KVM can inject the abort
> > > > +     */
> > > > +    env->ext_dabt_pending = 1;
> > > > +
> > > > +    return 0;
> > > > +}
> > >
> > > Thanks,
> > > drew
> > >
> >
>
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2020-02-11 23:12 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-29 20:24 [PATCH v2 0/2] target/arm: kvm: Support for KVM DABT without valid ISS Beata Michalska
2020-01-29 20:24 ` Beata Michalska
2020-01-29 20:24 ` [PATCH v2 1/2] target/arm: kvm: Inject events at the last stage of sync Beata Michalska
2020-01-29 20:24   ` Beata Michalska
2020-02-04 10:34   ` Andrew Jones
2020-02-04 10:34     ` Andrew Jones
2020-02-06 21:41     ` Beata Michalska
2020-02-06 21:41       ` Beata Michalska
2020-02-07  7:41       ` Andrew Jones
2020-02-07  7:41         ` Andrew Jones
2020-01-29 20:24 ` [PATCH v2 2/2] target/arm: kvm: Handle DABT with no valid ISS Beata Michalska
2020-01-29 20:24   ` Beata Michalska
2020-02-05 16:57   ` Andrew Jones
2020-02-05 16:57     ` Andrew Jones
2020-02-06 21:48     ` Beata Michalska
2020-02-06 21:48       ` Beata Michalska
2020-02-07  8:19       ` Andrew Jones
2020-02-07  8:19         ` Andrew Jones
2020-02-11 23:10         ` Beata Michalska
2020-02-11 23:10           ` Beata Michalska

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.