All of lore.kernel.org
 help / color / mirror / Atom feed
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
To: Sean Christopherson <seanjc@google.com>
Cc: David Hildenbrand <david@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Maxim Levitsky <mlevitsk@redhat.com>,
	kvm@vger.kernel.org, Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	linux-kernel@vger.kernel.org, Like Xu <like.xu.linux@gmail.com>
Subject: Re: [RFC PATCH 0/9] kvm: implement atomic memslot updates
Date: Wed, 28 Sep 2022 11:11:52 +0200	[thread overview]
Message-ID: <b3dc9505-9a5c-a631-065a-85bf86b1d071@redhat.com> (raw)
In-Reply-To: <YzMdjSkKaJ8HyWXh@google.com>



Am 27/09/2022 um 17:58 schrieb Sean Christopherson:
> On Tue, Sep 27, 2022, Emanuele Giuseppe Esposito wrote:
>>
>> Am 26/09/2022 um 23:28 schrieb Sean Christopherson:
>>> On Mon, Sep 26, 2022, David Hildenbrand wrote:
>>>> As Sean said "This is an awful lot of a complexity to take on for something
>>>> that appears to be solvable in userspace."
>>>
>>> And if the userspace solution is unpalatable for whatever reason, I'd like to
>>> understand exactly what KVM behavior is problematic for userspace.  E.g. the
>>> above RHBZ bug should no longer be an issue as the buggy commit has since been
>>> reverted.
>>
>> It still is because I can reproduce the bug, as also pointed out in
>> multiple comments below.
> 
> You can reproduce _a_ bug, but it's obviously not the original bug, because the
> last comment says:
> 
>   Second, indeed the patch was reverted and somehow accepted without generating
>   too much noise:
> 
>   ...
> 
>   The underlying issue of course as we both know is still there.
> 
>   You might have luck reproducing it with this bug
> 
>   https://bugzilla.redhat.com/show_bug.cgi?id=1855298
> 
>   But for me it looks like it is 'working' as well, so you might have
>   to write a unit test to trigger the issue.
> 
>>> If the issue is KVM doing something nonsensical on a code fetch to MMIO, then I'd
>>> much rather fix _that_ bug and improve KVM's user exit ABI to let userspace handle
>>> the race _if_ userspace chooses not to pause vCPUs.
>>>
>>
>> Also on the BZ they all seem (Paolo included) to agree that the issue is
>> non-atomic memslots update.
> 
> Yes, non-atomic memslot likely results in the guest fetching from a GPA without a
> memslot.  I'm asking for an explanation of exactly what happens when that occurs,
> because it should be possible to adjust KVM and/or QEMU to play nice with the
> fetch, e.g. to resume the guest until the new memslot is installed, in which case
> an atomic update isn't needed.
> 
> I assume the issue is that KVM exits with KVM_EXIT_INTERNAL_ERROR because the
> guest is running at CPL=0, and QEMU kills the guest in response.  If that's correct,
> then that problem can be solved by exiting to userspace with KVM_EXIT_MMIO instead
> of KVM_EXIT_INTERNAL_ERROR so that userspace can do something sane in response to
> the MMIO code fetch.
> 
> I'm pretty sure this patch will Just Work for QEMU, because QEMU simply resumes
> the vCPU if mmio.len==0.  It's a bit of a hack, but I don't think it violates KVM's
> ABI in any way, and it can even become "official" behavior since KVM x86 doesn't
> otherwise exit with mmio.len==0.
> 
> Compile tested only...

So basically you are just making KVM catch the failed
kvm_vcpu_read_guest_page() by retuning mmio.len = 0 to QEMU which
basically ends up in doing nothing and retry again executing the
instruction?

I wonder if there are some performance implications in this, but it's
definitely simpler than what I did.

Tested on the same failing machine used for the BZ, fixes the bug.

Do you want me to re-send the patch on your behalf (and add probably a
small documentation on Documentation/virt/kvm/api.rst)?

Emanuele
> 
> ---
> From: Sean Christopherson <seanjc@google.com>
> Date: Tue, 27 Sep 2022 08:16:03 -0700
> Subject: [PATCH] KVM: x86: Exit to userspace with zero-length MMIO "read" on
>  MMIO fetch
> 
> Exit to userspace with KVM_EXIT_MMIO if emulation fails due to not being
> able to fetch instruction bytes, e.g. if the resolved GPA isn't backed by
> a memslot.  If userspace is manipulating memslots without pausing vCPUs,
> e.g. to emulate BIOS relocation, then a vCPU may fetch while there is no
> valid memslot installed.  Depending on guest context, KVM will either
> exit to userspace with KVM_EXIT_INTERNAL_ERROR (L1, CPL=0) or simply
> resume the guest (L2 or CPL>0), neither of which is desirable as exiting
> with "emulation error" effectively kills the VM, and resuming the guest
> doesn't provide userspace an opportunity to react the to fetch.
> 
> Use "mmio.len == 0" to indicate "fetch".  This is a bit of a hack, but
> there is no other way to communicate "fetch" to userspace without
> defining an entirely new exit reason, e.g. "mmio.is_write" is a boolean
> and not a flag, and there is no known use case for actually supporting
> code fetches from MMIO, i.e. there's no need to allow userspace to fill
> in the instruction bytes.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/emulate.c     | 2 ++
>  arch/x86/kvm/kvm_emulate.h | 1 +
>  arch/x86/kvm/x86.c         | 9 ++++++++-
>  3 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index f092c54d1a2f..e141238d93b0 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -5353,6 +5353,8 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len, int
>  done:
>  	if (rc == X86EMUL_PROPAGATE_FAULT)
>  		ctxt->have_exception = true;
> +	if (rc == X86EMUL_IO_NEEDED)
> +		return EMULATION_IO_FETCH;
>  	return (rc != X86EMUL_CONTINUE) ? EMULATION_FAILED : EMULATION_OK;
>  }
>  
> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
> index 89246446d6aa..3cb2e321fcd2 100644
> --- a/arch/x86/kvm/kvm_emulate.h
> +++ b/arch/x86/kvm/kvm_emulate.h
> @@ -516,6 +516,7 @@ bool x86_page_table_writing_insn(struct x86_emulate_ctxt *ctxt);
>  #define EMULATION_OK 0
>  #define EMULATION_RESTART 1
>  #define EMULATION_INTERCEPTED 2
> +#define EMULATION_IO_FETCH 3
>  void init_decode_cache(struct x86_emulate_ctxt *ctxt);
>  int x86_emulate_insn(struct x86_emulate_ctxt *ctxt);
>  int emulator_task_switch(struct x86_emulate_ctxt *ctxt,
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index aa5ab0c620de..7eb72694c601 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7129,8 +7129,13 @@ static int kvm_fetch_guest_virt(struct x86_emulate_ctxt *ctxt,
>  		bytes = (unsigned)PAGE_SIZE - offset;
>  	ret = kvm_vcpu_read_guest_page(vcpu, gpa >> PAGE_SHIFT, val,
>  				       offset, bytes);
> -	if (unlikely(ret < 0))
> +	if (unlikely(ret < 0)) {
> +		vcpu->run->mmio.phys_addr = gpa;
> +		vcpu->run->mmio.len = 0;
> +		vcpu->run->mmio.is_write = 0;
> +		vcpu->run->exit_reason = KVM_EXIT_MMIO;
>  		return X86EMUL_IO_NEEDED;
> +	}
>  
>  	return X86EMUL_CONTINUE;
>  }
> @@ -8665,6 +8670,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
>  		r = x86_decode_emulated_instruction(vcpu, emulation_type,
>  						    insn, insn_len);
>  		if (r != EMULATION_OK)  {
> +			if (r == EMULATION_IO_FETCH)
> +				return 0;
>  			if ((emulation_type & EMULTYPE_TRAP_UD) ||
>  			    (emulation_type & EMULTYPE_TRAP_UD_FORCED)) {
>  				kvm_queue_exception(vcpu, UD_VECTOR);
> 
> base-commit: 39d9b48cc777bdf6d67d01ed24f1f89b13f5fbb2
> 


  reply	other threads:[~2022-09-28  9:13 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-09 10:44 [RFC PATCH 0/9] kvm: implement atomic memslot updates Emanuele Giuseppe Esposito
2022-09-09 10:44 ` [RFC PATCH 1/9] kvm_main.c: move slot check in kvm_set_memory_region Emanuele Giuseppe Esposito
2022-09-28 16:41   ` Paolo Bonzini
2022-09-09 10:44 ` [RFC PATCH 2/9] kvm.h: introduce KVM_SET_USER_MEMORY_REGION_LIST ioctl Emanuele Giuseppe Esposito
2022-09-28 16:42   ` Paolo Bonzini
2022-09-09 10:45 ` [RFC PATCH 3/9] kvm_main.c: introduce kvm_internal_memory_region_list Emanuele Giuseppe Esposito
2022-09-28 16:48   ` Paolo Bonzini
2022-09-09 10:45 ` [RFC PATCH 4/9] kvm_main.c: split logic in kvm_set_memslots Emanuele Giuseppe Esposito
2022-09-28 17:04   ` Paolo Bonzini
2022-09-09 10:45 ` [RFC PATCH 5/9] kvm_main.c: split __kvm_set_memory_region logic in kvm_check_mem and kvm_prepare_batch Emanuele Giuseppe Esposito
2022-09-13  2:56   ` Yang, Weijiang
2022-09-18 16:22     ` Emanuele Giuseppe Esposito
2022-09-28 17:11   ` Paolo Bonzini
2022-09-09 10:45 ` [RFC PATCH 6/9] kvm_main.c: simplify change-specific callbacks Emanuele Giuseppe Esposito
2022-09-09 10:45 ` [RFC PATCH 7/9] kvm_main.c: duplicate invalid memslot also in inactive list Emanuele Giuseppe Esposito
2022-09-28 17:18   ` Paolo Bonzini
2022-09-09 10:45 ` [RFC PATCH 8/9] kvm_main.c: find memslots from the inactive memslot list Emanuele Giuseppe Esposito
2022-09-09 10:45 ` [RFC PATCH 9/9] kvm_main.c: handle atomic memslot update Emanuele Giuseppe Esposito
2022-09-13  2:30   ` Yang, Weijiang
2022-09-18 16:18     ` Emanuele Giuseppe Esposito
2022-09-27  7:46   ` David Hildenbrand
2022-09-27  8:35     ` Emanuele Giuseppe Esposito
2022-09-27  9:22       ` David Hildenbrand
2022-09-27  9:32         ` Emanuele Giuseppe Esposito
2022-09-27 14:52           ` David Hildenbrand
2022-09-28 17:29   ` Paolo Bonzini
2022-09-09 14:30 ` [RFC PATCH 0/9] kvm: implement atomic memslot updates Sean Christopherson
2022-09-18 16:13   ` Emanuele Giuseppe Esposito
2022-09-19  7:38     ` Like Xu
2022-09-19  7:53     ` David Hildenbrand
2022-09-19 17:30       ` David Hildenbrand
2022-09-23 13:10         ` Emanuele Giuseppe Esposito
2022-09-23 13:21           ` David Hildenbrand
2022-09-23 13:38             ` Emanuele Giuseppe Esposito
2022-09-26  9:03               ` David Hildenbrand
2022-09-26 21:28                 ` Sean Christopherson
2022-09-27  7:38                   ` Emanuele Giuseppe Esposito
2022-09-27 15:58                     ` Sean Christopherson
2022-09-28  9:11                       ` Emanuele Giuseppe Esposito [this message]
2022-09-28 11:14                         ` Maxim Levitsky
2022-09-28 12:52                           ` David Hildenbrand
2022-09-28 15:07                       ` Paolo Bonzini
2022-09-28 15:33                         ` David Hildenbrand
2022-09-28 15:58                         ` Sean Christopherson
2022-09-28 16:38                           ` Paolo Bonzini
2022-09-28 20:41                             ` Sean Christopherson
2022-09-29  8:05                               ` Emanuele Giuseppe Esposito
2022-09-29  8:24                                 ` David Hildenbrand
2022-09-29 15:18                                 ` Sean Christopherson
2022-09-29 15:41                                   ` Paolo Bonzini
2022-09-29 15:28                               ` Paolo Bonzini
2022-09-29 15:40                                 ` Maxim Levitsky
2022-09-29 16:00                                 ` David Hildenbrand
2022-09-29 21:39                                 ` Sean Christopherson
2022-10-13  7:43                                   ` Emanuele Giuseppe Esposito
2022-10-13  8:44                                     ` David Hildenbrand
2022-10-13 11:12                                       ` Emanuele Giuseppe Esposito
2022-10-13 14:45                                         ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b3dc9505-9a5c-a631-065a-85bf86b1d071@redhat.com \
    --to=eesposit@redhat.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=hpa@zytor.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=like.xu.linux@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=mlevitsk@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.