All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jim Mattson <jmattson@google.com>
To: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Cc: "kvm list" <kvm@vger.kernel.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Radim Krčmář" <rkrcmar@redhat.com>
Subject: Re: [PATCH] nVMX: Defer error from VM-entry MSR-load area to until after hardware verifies VMCS guest state-area
Date: Wed, 2 Oct 2019 10:32:36 -0700	[thread overview]
Message-ID: <CALMp9eTr3Hqe-d7avFksTm+FkTOLv+g30CZEfQazu5z_zw7Jmw@mail.gmail.com> (raw)
In-Reply-To: <9b881861-260d-fbf8-c740-20ca93aa2ae9@oracle.com>

On Wed, Oct 2, 2019 at 10:16 AM Krish Sadhukhan
<krish.sadhukhan@oracle.com> wrote:
>
>
> On 10/1/19 3:02 PM, Jim Mattson wrote:
> > On Tue, Oct 1, 2019 at 2:23 PM Krish Sadhukhan
> > <krish.sadhukhan@oracle.com>  wrote:
> >>
> >> On 10/01/2019 10:09 AM, Jim Mattson wrote:
> >>> On Mon, Sep 30, 2019 at 5:12 PM Krish Sadhukhan
> >>> <krish.sadhukhan@oracle.com>  wrote:
> >>>> According to section “VM Entries” in Intel SDM vol 3C, VM-entry checks are
> >>>> performed in a certain order. Checks on MSRs that are loaded on VM-entry
> >>>> from VM-entry MSR-load area, should be done after verifying VMCS controls,
> >>>> host-state area and guest-state area. As KVM relies on CPU hardware to
> >>>> perform some of these checks, we need to defer VM-exit due to invalid
> >>>> VM-entry MSR-load area to until after CPU hardware completes the earlier
> >>>> checks and is ready to do VMLAUNCH/VMRESUME.
> >>>>
> >>>> In order to defer errors arising from invalid VM-entry MSR-load area in
> >>>> vmcs12, we set up a single invalid entry, which is illegal according to
> >>>> section "Loading MSRs in Intel SDM vol 3C, in VM-entry MSR-load area of
> >>>> vmcs02. This will cause the CPU hardware to VM-exit with "VM-entry failure
> >>>> due to MSR loading" after it completes checks on VMCS controls, host-state
> >>>> area and guest-state area. We reflect a synthesized Exit Qualification to
> >>>> our guest.
> >>> This change addresses the priority inversion, but it still potentially
> >>> leaves guest MSRs incorrectly loaded with values from the VMCS12
> >>> VM-entry MSR-load area when a higher priority error condition would
> >>> have precluded any processing of the VM-entry MSR-load area.
> >> May be, we should not load any guest MSR until we have checked the
> >> entire vmcs12 MSR-load area in nested_vmx_load_msr()  ?
> > That's not sufficient. We shouldn't load any guest MSR until all of
> > the checks on host state, controls, and guest state have passed. Since
> > we defer the guest state checks to hardware, that makes the proper
> > ordering a bit problematic. Instead, you can try to arrange to undo
> > the guest MSR loads if a higher priority error is discovered. That
> > won't be trivial.
>
>
> We discussed about undoing guest MSR loads in an earlier thread. You
> said that some MSR updates couldn't be rolled back. If we can't rollback
> some MSR updates and if that leads to an undefined guest configuration,
> may be we should return to L0 and mark it as a hardware failure, like
> what we currently do for some VM-entry failures in vmx_handle_exit():

The MSR loads that can't be rolled back are mainly the ones with
side-effects, like IA32_PRED_CMD. I think these are mostly benign
(except, perhaps, from a performance perspective). I think it's worth
investigating rollback as a partial solution to this problem.

>
>      if (exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) {
>                  dump_vmcs();
>                  vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> vcpu->run->fail_entry.hardware_entry_failure_reason
>                          = exit_reason;
>                  return 0;
>          }
>
>
> >>>> Suggested-by: Jim Mattson<jmattson@google.com>
> >>>> Signed-off-by: Krish Sadhukhan<krish.sadhukhan@oracle.com>
> >>>> Reviewed-by: Mihai Carabas<mihai.carabas@oracle.com>
> >>>> Reviewed-by: Liran Alon<liran.alon@oracle.com>
> >>>> ---
> >>>>    arch/x86/kvm/vmx/nested.c | 34 +++++++++++++++++++++++++++++++---
> >>>>    arch/x86/kvm/vmx/nested.h | 14 ++++++++++++--
> >>>>    arch/x86/kvm/vmx/vmcs.h   |  6 ++++++
> >>>>    3 files changed, 49 insertions(+), 5 deletions(-)
> >>>>
> >>>> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> >>>> index ced9fba32598..b74491c04090 100644
> >>>> --- a/arch/x86/kvm/vmx/nested.c
> >>>> +++ b/arch/x86/kvm/vmx/nested.c
> >>>> @@ -3054,12 +3054,40 @@ int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, bool from_vmentry)
> >>>>                   goto vmentry_fail_vmexit_guest_mode;
> >>>>
> >>>>           if (from_vmentry) {
> >>>> -               exit_reason = EXIT_REASON_MSR_LOAD_FAIL;
> >>>>                   exit_qual = nested_vmx_load_msr(vcpu,
> >>>>                                                   vmcs12->vm_entry_msr_load_addr,
> >>>>                                                   vmcs12->vm_entry_msr_load_count);
> >>>> -               if (exit_qual)
> >>>> -                       goto vmentry_fail_vmexit_guest_mode;
> >>>> +               if (exit_qual) {
> >>>> +                       /*
> >>>> +                        * According to section “VM Entries” in Intel SDM
> >>>> +                        * vol 3C, VM-entry checks are performed in a certain
> >>>> +                        * order. Checks on MSRs that are loaded on VM-entry
> >>>> +                        * from VM-entry MSR-load area, should be done after
> >>>> +                        * verifying VMCS controls, host-state area and
> >>>> +                        * guest-state area. As KVM relies on CPU hardware to
> >>>> +                        * perform some of these checks, we need to defer
> >>>> +                        * VM-exit due to invalid VM-entry MSR-load area to
> >>>> +                        * until after CPU hardware completes the earlier
> >>>> +                        * checks and is ready to do VMLAUNCH/VMRESUME.
> >>>> +                        *
> >>>> +                        * In order to defer errors arising from invalid
> >>>> +                        * VM-entry MSR-load area in vmcs12, we set up a
> >>>> +                        * single invalid entry, which is illegal according
> >>>> +                        * to section "Loading MSRs in Intel SDM vol 3C, in
> >>>> +                        * VM-entry MSR-load area of vmcs02. This will cause
> >>>> +                        * the CPU hardware to VM-exit with "VM-entry
> >>>> +                        * failure due to MSR loading" after it completes
> >>>> +                        * checks on VMCS controls, host-state area and
> >>>> +                        * guest-state area.
> >>>> +                        */
> >>>> +                       vmx->loaded_vmcs->invalid_msr_load_area.index =
> >>>> +                           MSR_FS_BASE;
> >>> Can this field be statically populated during initialization?
> >> Yes.
> >>
> >>>> +                       vmx->loaded_vmcs->invalid_msr_load_area.value =
> >>>> +                           exit_qual;
> >>> This seems awkward. Why not save 16 bytes per loaded_vmcs by
> >>> allocating one invalid_msr_load_area system-wide and just add a 4 byte
> >>> field to struct nested_vmx to store this value?
> >> OK.
> >>
> >>>> +                       vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, 1);
> >>>> +                       vmcs_write64(VM_ENTRY_MSR_LOAD_ADDR,
> >>>> +                           __pa(&(vmx->loaded_vmcs->invalid_msr_load_area)));
> >>>> +               }
> >>> Do you need to set vmx->nested.dirty_vmcs12 to ensure that
> >>> prepare_vmcs02_constant_state() will be called at the next emulated
> >>> VM-entry, to undo this change to VM_ENTRY_MSR_LOAD_ADDR?
> >> Yes.
> >>
> >>>>           } else {
> >>>>                   /*
> >>>>                    * The MMU is not initialized to point at the right entities yet and
> >>>> diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
> >>>> index 187d39bf0bf1..f3a384235b68 100644
> >>>> --- a/arch/x86/kvm/vmx/nested.h
> >>>> +++ b/arch/x86/kvm/vmx/nested.h
> >>>> @@ -64,7 +64,9 @@ static inline bool nested_ept_ad_enabled(struct kvm_vcpu *vcpu)
> >>>>    static inline int nested_vmx_reflect_vmexit(struct kvm_vcpu *vcpu,
> >>>>                                               u32 exit_reason)
> >>>>    {
> >>>> +       u32 exit_qual;
> >>>>           u32 exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
> >>>> +       struct vcpu_vmx *vmx = to_vmx(vcpu);
> >>>>
> >>>>           /*
> >>>>            * At this point, the exit interruption info in exit_intr_info
> >>>> @@ -81,8 +83,16 @@ static inline int nested_vmx_reflect_vmexit(struct kvm_vcpu *vcpu,
> >>>>                           vmcs_read32(VM_EXIT_INTR_ERROR_CODE);
> >>>>           }
> >>>>
> >>>> -       nested_vmx_vmexit(vcpu, exit_reason, exit_intr_info,
> >>>> -                         vmcs_readl(EXIT_QUALIFICATION));
> >>>> +       exit_qual = vmcs_readl(EXIT_QUALIFICATION);
> >>>> +
> >>>> +       if (vmx->loaded_vmcs->invalid_msr_load_area.index == MSR_FS_BASE &&
> >>>> +           (exit_reason == (VMX_EXIT_REASONS_FAILED_VMENTRY |
> >>>> +                           EXIT_REASON_MSR_LOAD_FAIL))) {
> >>> Is the second conjunct sufficient? i.e. Isn't there a bug in kvm if
> >>> the second conjunct is true and the first is not?
> >> If the first conjunct isn't true and the second one is, it means it's a
> >> hardware-detected MSR-load error. Right ? So won't that be handled the
> >> same way it is handled currently in KVM ?
> > I believe that KVM will currently forward such an error to L1, which
> > is wrong, since L1 has no control over vmcs02's VM-entry MSR-load
> > area.
>
>
> You are right. I hadn't noticed it until you mentioned.
>
>
> >   The assumption is that this will never happen, because L0 is too
> > careful.
> >>>> +               exit_qual = vmx->loaded_vmcs->invalid_msr_load_area.value;
> >>>> +       }
> >>>> +
> >>>> +       nested_vmx_vmexit(vcpu, exit_reason, exit_intr_info, exit_qual);
> >>>> +
> >>>>           return 1;
> >>>>    }
> >>>>
> >>>> diff --git a/arch/x86/kvm/vmx/vmcs.h b/arch/x86/kvm/vmx/vmcs.h
> >>>> index 481ad879197b..e272788bd4b8 100644
> >>>> --- a/arch/x86/kvm/vmx/vmcs.h
> >>>> +++ b/arch/x86/kvm/vmx/vmcs.h
> >>>> @@ -70,6 +70,12 @@ struct loaded_vmcs {
> >>>>           struct list_head loaded_vmcss_on_cpu_link;
> >>>>           struct vmcs_host_state host_state;
> >>>>           struct vmcs_controls_shadow controls_shadow;
> >>>> +       /*
> >>>> +        * This field is used to set up an invalid VM-entry MSR-load area
> >>>> +        * for vmcs02 if an error is detected while processing the entries
> >>>> +        * in VM-entry MSR-load area of vmcs12.
> >>>> +        */
> >>>> +       struct vmx_msr_entry invalid_msr_load_area;
> >>>>    };
> >>> I'd suggest allocating just one invalid_msr_load_area system-wide, as
> >>> mentioned above.
> >>>
> >>>>    static inline bool is_exception_n(u32 intr_info, u8 vector)
> >>>> --
> >>>> 2.20.1
> >>>>

      reply	other threads:[~2019-10-02 17:32 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-30 23:36 [PATCH] KVM: nVMX: Defer error from VM-entry MSR-load area to until after hardware verifies VMCS guest state-area Krish Sadhukhan
2019-09-30 23:36 ` [PATCH] " Krish Sadhukhan
2019-10-01 17:09   ` Jim Mattson
2019-10-01 21:21     ` Krish Sadhukhan
2019-10-01 22:02       ` Jim Mattson
2019-10-02 17:16         ` Krish Sadhukhan
2019-10-02 17:32           ` Jim Mattson [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALMp9eTr3Hqe-d7avFksTm+FkTOLv+g30CZEfQazu5z_zw7Jmw@mail.gmail.com \
    --to=jmattson@google.com \
    --cc=krish.sadhukhan@oracle.com \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=rkrcmar@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.