kvmarm.lists.cs.columbia.edu archive mirror
 help / color / mirror / Atom feed
From: Andrew Jones <drjones@redhat.com>
To: Beata Michalska <beata.michalska@linaro.org>
Cc: qemu-devel@nongnu.org, qemu-arm@nongnu.org, pbonzini@redhat.com,
	kvmarm@lists.cs.columbia.edu
Subject: Re: [PATCH v2 2/2] target/arm: kvm: Handle DABT with no valid ISS
Date: Wed, 5 Feb 2020 17:57:39 +0100	[thread overview]
Message-ID: <20200205165739.2jkklbpmy3yrdq3q@kamzik.brq.redhat.com> (raw)
In-Reply-To: <20200129202441.12745-3-beata.michalska@linaro.org>

On Wed, Jan 29, 2020 at 08:24:41PM +0000, Beata Michalska wrote:
> On ARMv7 & ARMv8 some load/store instructions might trigger a data abort
> exception with no valid ISS info to be decoded. The lack of decode info
> makes it at least tricky to emulate those instruction which is one of the
> (many) reasons why KVM will not even try to do so.
> 
> Add support for handling those by requesting KVM to inject external
> dabt into the quest.
> 
> Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> ---
>  target/arm/cpu.h     |  2 ++
>  target/arm/kvm.c     | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  target/arm/kvm32.c   |  3 ++
>  target/arm/kvm64.c   |  3 ++
>  target/arm/kvm_arm.h | 19 +++++++++++
>  5 files changed, 123 insertions(+)
> 
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index c1aedbe..e04a8d3 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -558,6 +558,8 @@ typedef struct CPUARMState {
>          uint8_t has_esr;
>          uint64_t esr;
>      } serror;
> +    /* Status field for pending external dabt */
> +    uint8_t ext_dabt_pending;
>  
>      /* State of our input IRQ/FIQ/VIRQ/VFIQ lines */
>      uint32_t irq_line_state;
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index 8d82889..e7bc9b7 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -37,6 +37,7 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
>  
>  static bool cap_has_mp_state;
>  static bool cap_has_inject_serror_esr;
> +static bool cap_has_inject_ext_dabt; /* KVM_CAP_ARM_INJECT_EXT_DABT */

nit: the KVM_CAP_ARM_INJECT_EXT_DABT comment is unnecessary

>  
>  static ARMHostCPUFeatures arm_host_cpu_features;
>  
> @@ -62,6 +63,12 @@ void kvm_arm_init_serror_injection(CPUState *cs)
>                                      KVM_CAP_ARM_INJECT_SERROR_ESR);
>  }
>  
> +void kvm_arm_init_ext_dabt_injection(CPUState *cs)
> +{
> +    cap_has_inject_ext_dabt = kvm_check_extension(cs->kvm_state,
> +                                    KVM_CAP_ARM_INJECT_EXT_DABT);
> +}
> +
>  bool kvm_arm_create_scratch_host_vcpu(const uint32_t *cpus_to_try,
>                                        int *fdarray,
>                                        struct kvm_vcpu_init *init)
> @@ -216,6 +223,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>          ret = -EINVAL;
>      }
>  
> +    if (kvm_check_extension(s, KVM_CAP_ARM_NISV_TO_USER))
> +        if (kvm_vm_enable_cap(s, KVM_CAP_ARM_NISV_TO_USER, 0)) {
> +            warn_report("Failed to enable DABT NISV cap");
> +        }
> +

Missing {} around the outer block.

As KVM_CAP_ARM_INJECT_EXT_DABT is a VM capability then I think we should
set cap_has_inject_ext_dabt here, like cap_has_mp_state gets set. I see
you've followed the pattern used for cap_has_inject_serror_esr, but that
looks wrong too since KVM_CAP_ARM_INJECT_SERROR_ESR is also a VM
capability. The way it is now we just keep setting
cap_has_inject_serror_esr to the same value, NR_VCPUS times.

>      return ret;
>  }
>  
> @@ -598,6 +610,10 @@ int kvm_put_vcpu_events(ARMCPU *cpu)
>          events.exception.serror_esr = env->serror.esr;
>      }
>  
> +    if (cap_has_inject_ext_dabt) {
> +        events.exception.ext_dabt_pending = env->ext_dabt_pending;
> +    }
> +
>      ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
>      if (ret) {
>          error_report("failed to put vcpu events");
> @@ -627,6 +643,8 @@ int kvm_get_vcpu_events(ARMCPU *cpu)
>      env->serror.has_esr = events.exception.serror_has_esr;
>      env->serror.esr = events.exception.serror_esr;
>  
> +    env->ext_dabt_pending = events.exception.ext_dabt_pending;
> +

afaict from Documentation/virt/kvm/api.txt and the KVM code you cannot
get this state. Therefore the above line (and extra stray blank line)
should be dropped.

>      return 0;
>  }
>  
> @@ -634,6 +652,7 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
>  {
>  }
>  
> +

stray blank line

>  MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
>  {
>      ARMCPU *cpu;
> @@ -699,6 +718,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>              ret = EXCP_DEBUG;
>          } /* otherwise return to guest */
>          break;
> +    case KVM_EXIT_ARM_NISV:
> +        /* External DABT with no valid iss to decode */
> +        ret = kvm_arm_handle_dabt_nisv(cs, run->arm_nisv.esr_iss,
> +                                       run->arm_nisv.fault_ipa);
> +        break;
>      default:
>          qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n",
>                        __func__, run->exit_reason);
> @@ -833,3 +857,75 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
>  {
>      return (data - 32) & 0xffff;
>  }
> +
> +int kvm_arm_handle_dabt_nisv(CPUState *cs, uint64_t esr_iss,
> +                             uint64_t fault_ipa)
> +{
> +    ARMCPU *cpu = ARM_CPU(cs);
> +    CPUARMState *env = &cpu->env;
> +    uint32_t ins, ins_fetched;

ins_fetched is a bool

> +
> +    /*
> +     * Hacky workaround for kernels that for aarch32 guests, instead of expected
> +     * external data abort, inject the IMPLEMENTATION DEFINED exception with the
> +     * lock-down. This is actually handled by the guest which results in
> +     * re-running the faulting instruction.
> +     * This intends to break the vicious cycle.
> +     */

This doesn't seem like the right thing to do. What happens on real
hardware in this case? If an OS would get into a "vicious cycle" on
real hardware then it should get into one on KVM too.

> +    if (!is_a64(env)) {
> +        static uint8_t setback;
> +
> +        /*
> +         * The state has not been synchronized yet, so if this is re-occurrence
> +         * of the same abort triggered by guest, the status for pending external
> +         * abort should not get cleared yet
> +         */
> +        if (unlikely(env->ext_dabt_pending)) {
> +            if (setback) {
> +                error_report("Most probably triggered kernel issue with"
> +                             " injecting external data abort.");
> +                error_printf("Giving up trying ...\n");
> +                /* Here is where the fun comes to an end */
> +                return -EINVAL;
> +            }
> +        }
> +        setback = env->ext_dabt_pending;
> +    }
> +
> +    kvm_cpu_synchronize_state(cs);
> +   /*
> +    * ISS [23:14] is invalid so there is a limited info
> +    * on what has just happened so the only *useful* thing that can
> +    * be retrieved from ISS is WnR & DFSC (though in some cases WnR
> +    * might be less of a value as well)
> +    */
> +
> +    /*
> +     * Get current PC before it will get updated to exception vector entry
> +     */
> +    target_ulong ins_addr = is_a64(env) ? env->pc : env->regs[15];

ins_addr should be declared above

But what are we doing? pc is a guest virtual address. Oh, I see...

> +
> +    /*
> +     * Get the faulting instruction
> +     * Instructions have a fixed length of 32 bits
> +     * and are always little-endian
> +     */
> +    ins_fetched = !cpu_memory_rw_debug(cs, ins_addr, (uint8_t *) &ins,
> +                                       sizeof(uint32_t), 0);

... we're trying to actual walk the KVM guest's page tables. That
seems a bit odd, and the "_debug" function name used for it makes
it seem even more odd.

> +
> +    error_report("Data abort exception with no valid ISS generated by "
> +                 "guest memory access at physical address: 0x" TARGET_FMT_lx,
> +                 (target_ulong)fault_ipa);
> +
> +    error_printf(ins_fetched ? "%s : 0x%" PRIx32 " (ins encoded)\n"  : "%s\n",
> +                 "KVM unable to emulate faulting instruction",
> +                 (!ins_fetched ? 0 : ldl_le_p(&ins)));
> +    error_printf("Injecting a data abort exception into guest.\n");

The guest shouldn't be able to generate three lines of errors on the host
whenever it wants. That's a security bug. One trace point here seems like
the most we should do. Or, after these three lines we should kill the
guest.

Actually, I don't think we should attempt to get the instruction at all.
We should do what the KVM documenation suggests we do when we get
this exit. We should either do a user selected action: one of suspend,
dump, restart, or inject a dabt (as is done below). The last choice hopes
that the guest will handle it in some graceful way, or that it'll crash
with all the information needed for a post-mortem crash investigation.

And I don't think we should do the "very brave" option of trying to
emulate the instruction, even if we identify it as a valid MMIO address.
That just sounds like a huge can of worms.

Does QEMU already have a way for users to select a
don't-know-how-to-handle-guest-exception behavior?

> +    /*
> +     * Set pending ext dabt and trigger SET_EVENTS so that
> +     * KVM can inject the abort
> +     */
> +    env->ext_dabt_pending = 1;
> +
> +    return 0;
> +}

Thanks,
drew

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

  reply	other threads:[~2020-02-05 16:57 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-29 20:24 [PATCH v2 0/2] target/arm: kvm: Support for KVM DABT without valid ISS Beata Michalska
2020-01-29 20:24 ` [PATCH v2 1/2] target/arm: kvm: Inject events at the last stage of sync Beata Michalska
2020-02-04 10:34   ` Andrew Jones
2020-02-06 21:41     ` Beata Michalska
2020-02-07  7:41       ` Andrew Jones
2020-01-29 20:24 ` [PATCH v2 2/2] target/arm: kvm: Handle DABT with no valid ISS Beata Michalska
2020-02-05 16:57   ` Andrew Jones [this message]
2020-02-06 21:48     ` Beata Michalska
2020-02-07  8:19       ` Andrew Jones
2020-02-11 23:10         ` Beata Michalska

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200205165739.2jkklbpmy3yrdq3q@kamzik.brq.redhat.com \
    --to=drjones@redhat.com \
    --cc=beata.michalska@linaro.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=pbonzini@redhat.com \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).