All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Nadav Har'El" <nyh@math.technion.ac.il>
To: Avi Kivity <avi@redhat.com>
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH 18/24] Exiting from L2 to L1
Date: Sun, 12 Sep 2010 19:05:03 +0200	[thread overview]
Message-ID: <20100912170503.GA7828@fermat.math.technion.ac.il> (raw)
In-Reply-To: <4C8CE3E2.7060708@redhat.com>

On Sun, Sep 12, 2010, Avi Kivity wrote about "Re: [PATCH 18/24] Exiting from L2 to L1":
> Great.  Hopefully you can commit some time to it or you'll spend a lot 
> of cycles just catching up.

Right. I will.

> Joerg just merged nested npt; as far as I can tell it is 100% in line 
> with nested ept, but please take a look as well.

Indeed. Making nested EPT work based on the nested NPT work is one of the
first things I plan to do after the basic nested VMX patches are accepted.
As you know, we already have a version of nested EPT working in our testing
code, but I'll need to tweak it a bit to use the common nested NPT code.

> >>>+	vmcs12->guest_ia32_debugctl = vmcs_read64(GUEST_IA32_DEBUGCTL);
> >>>
> >>Without msr bitmaps, cannot change.
> >I added a TODO before this (and a couple of others) for future 
> >optimization.
> >I'm not even convinced how much quicker it is to check the MSR bitmap 
> >before
> >doing vmcs_read64, vs just to going ahead and vmreading it in any case.
> 
> IIRC we don't support msr bitmaps now, so no need to check anything.

I do think we support msr bitmaps... E.g., we have
nested_vmx_exit_handled_msr() to check whether L1 requires an exit for a
certain MSR access.  Where don't we support them? (but I'm not denying the
possiblity that this support still has holes or bugs).

> In general, avoid vmcs reads as much as possible.  Just think of your 
> code running in a guest!

Yes. On the other hand, I don't want to be sorry in the future when I want
to support some feature, but because I wanted to shave off 1% of the L2->L1
switching time, and 0.01% of the total run time (and I'm just making
numbers up...), I now need to find a dozen places where things need to change
to support this feature. On the other hand, this will likely happen anyway ;-)

> 
> >>>+    vmcs12->vmcs_link_pointer = vmcs_read64(VMCS_LINK_POINTER);
> >>Can this change?
>..
> If it changes in the future, it can only be under the influence of some 
> control or at least guest-discoverable capability.  Since we don't 
> expose such control or capability, there's no need to copy it.

You convinced me. Removed it.

> >>>+	vmcs12->vm_entry_intr_info_field =
> >>>+		vmcs_read32(VM_ENTRY_INTR_INFO_FIELD);
> >>>
> >>Autocleared, no need to read.
> >Well, we need to clear the "valid bit" on exit, so we don't mistakenly 
> >inject
> >the same interrupt twice.
> 
> I don't think so.  We write this field as part of guest entry (that is, 
> after the decision re which vmcs to use, yes?), so guest entry will 
> always follow writing this field.  Since guest entry clears the field, 
> reading it after an exit will necessarily return 0.

Well, you obviously know the KVM code much better than I do, but from what
I saw, I thought (maybe I misunderstood) that in normal (non-nested) KVM,
this field only gets written on injection, not on every entry, so the code
relies on the fact that the processor turns off the "valid" bit during exit,
to avoid the same event being injected when the same field value is used for
another entry. I can only find code which resets this field in vmx_vcpu_reset(),
but that doesn't get called on every entry, right? Or am I missing something?

> What can happen is that the contents of the field is transferred to the 
> IDT_VECTORING_INFO field or VM_EXIT_INTR_INFO field.
> 
> (question: on a failed vmentry, is this field cleared?)

I don't know the answer :-)

> >There were two ways to do it: 1. clear it ourselves,
> >or 2. copy the value from vmcs02 where the processor already cleared it.
> >There are pros and cons for each approach, but I'll change like you 
> >suggest,
> >to clear it ourselves:
> >
> >	vmcs12->vm_entry_intr_info_field&= ~INTR_INFO_VALID_MASK;
> 
> That's really a temporary variable, I don't think you need to touch it.

But we need to emulate the correct VMX behavior. According to the spec, the
"valid" bit on this field needs to be cleared on vmexit, so we need to do it
also on emulated exits from L2 to L1. If we're sure that we already cleared it
earlier, then fine, but if not (and like I said, I failed to find this code),
we need to do it now, on exit - either by explictly clearing the bit or by
copying a value where the processor cleared this bit (arguably, the former is
more correct emulation).

> I didn't mean register independent helper; one function for cr0 and one 
> function for cr4 so the reader can just see the name and pretend to 
> understand what it does, instead of seeing a bunch of incomprehensible 
> bitwise operations.

Ok, done:

/*
 * On a nested exit from L2 to L1, vmcs12.guest_cr0 might not be up-to-date
 * because L2 may have changed some cr0 bits directly (see CRO_GUEST_HOST_MASK)
 * without L0 trapping the change and updating vmcs12.
 * This function returns the value we should put in vmcs12.guest_cr0. It's not
 * enough to just return the current (vmcs02) GUEST_CR0. This may not be the
 * guest cr0 that L1 thought it was giving its L2 guest - it is possible that
 * L1 wished to allow its guest to set a cr0 bit directly, but we (L0) asked
 * to trap this change and instead set just the read shadow. If this is the
 * case, we need to copy these read-shadow bits back to vmcs12.guest_cr0, where
 * L1 believes they already are.
 */
static inline unsigned long
vmcs12_guest_cr0(struct kvm_vcpu *vcpu, struct vmcs_fields *vmcs12){
	unsigned long guest_cr0_bits =
		vcpu->arch.cr0_guest_owned_bits | vmcs12->cr0_guest_host_mask;
	return (vmcs_readl(GUEST_CR0) & guest_cr0_bits) |
		(vmcs_readl(CR0_READ_SHADOW) & ~guest_cr0_bits);
}

and the call becomes just:

	vmcs12->guest_cr0 = vmcs12_guest_cr0(vcpu, vmcs12);

which is easy to glance over (but doesn't say much about what it is doing).


-- 
Nadav Har'El                        |       Sunday, Sep 12 2010, 4 Tishri 5771
nyh@math.technion.ac.il             |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 |Sign above a shop selling burglar alarms:
http://nadav.harel.org.il           |"For the man who has everything"

  reply	other threads:[~2010-09-12 17:05 UTC|newest]

Thread overview: 147+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-06-13 12:22 [PATCH 0/24] Nested VMX, v5 Nadav Har'El
2010-06-13 12:23 ` [PATCH 1/24] Move nested option from svm.c to x86.c Nadav Har'El
2010-06-14  8:11   ` Avi Kivity
2010-06-15 14:27     ` Nadav Har'El
2010-06-13 12:23 ` [PATCH 2/24] Add VMX and SVM to list of supported cpuid features Nadav Har'El
2010-06-14  8:13   ` Avi Kivity
2010-06-15 14:31     ` Nadav Har'El
2010-06-13 12:24 ` [PATCH 3/24] Implement VMXON and VMXOFF Nadav Har'El
2010-06-14  8:21   ` Avi Kivity
2010-06-16 11:14     ` Nadav Har'El
2010-06-16 11:26       ` Avi Kivity
2010-06-15 20:18   ` Marcelo Tosatti
2010-06-16  7:50     ` Nadav Har'El
2010-06-13 12:24 ` [PATCH 4/24] Allow setting the VMXE bit in CR4 Nadav Har'El
2010-06-15 11:09   ` Gleb Natapov
2010-06-15 14:44     ` Nadav Har'El
2010-06-13 12:25 ` [PATCH 5/24] Introduce vmcs12: a VMCS structure for L1 Nadav Har'El
2010-06-14  8:33   ` Avi Kivity
2010-06-14  8:49     ` Nadav Har'El
2010-06-14 12:35       ` Avi Kivity
2010-06-16 12:24     ` Nadav Har'El
2010-06-16 13:10       ` Avi Kivity
2010-06-22 14:54     ` Nadav Har'El
2010-06-22 16:53       ` Nadav Har'El
2010-06-23  8:07         ` Avi Kivity
2010-08-08 15:09           ` Nadav Har'El
2010-08-10  3:24             ` Avi Kivity
2010-06-23  7:57       ` Avi Kivity
2010-06-23  9:15         ` Alexander Graf
2010-06-23  9:24           ` Avi Kivity
2010-06-23 12:07         ` Nadav Har'El
2010-06-23 12:13           ` Avi Kivity
2010-06-13 12:25 ` [PATCH 6/24] Implement reading and writing of VMX MSRs Nadav Har'El
2010-06-14  8:42   ` Avi Kivity
2010-06-23  8:13     ` Nadav Har'El
2010-06-23  8:24       ` Avi Kivity
2010-06-13 12:26 ` [PATCH 7/24] Understanding guest pointers to vmcs12 structures Nadav Har'El
2010-06-14  8:48   ` Avi Kivity
2010-08-02 12:25     ` Nadav Har'El
2010-08-02 13:38       ` Avi Kivity
2010-06-15 12:14   ` Gleb Natapov
2010-08-01 15:16     ` Nadav Har'El
2010-08-01 15:25       ` Gleb Natapov
2010-08-02  8:57         ` Nadav Har'El
2010-06-13 12:26 ` [PATCH 8/24] Hold a vmcs02 for each vmcs12 Nadav Har'El
2010-06-14  8:57   ` Avi Kivity
2010-07-06  9:50   ` Dong, Eddie
2010-08-02 13:38     ` Nadav Har'El
2010-06-13 12:27 ` [PATCH 9/24] Implement VMCLEAR Nadav Har'El
2010-06-14  9:03   ` Avi Kivity
2010-06-15 13:47   ` Gleb Natapov
2010-06-15 13:50     ` Avi Kivity
2010-06-15 13:54       ` Gleb Natapov
2010-08-05 11:50         ` Nadav Har'El
2010-08-05 11:53           ` Gleb Natapov
2010-08-05 12:01             ` Nadav Har'El
2010-08-05 12:05               ` Avi Kivity
2010-08-05 12:10                 ` Nadav Har'El
2010-08-05 12:13                   ` Avi Kivity
2010-08-05 12:29                     ` Nadav Har'El
2010-08-05 12:03           ` Avi Kivity
2010-07-06  2:56   ` Dong, Eddie
2010-08-03 12:12     ` Nadav Har'El
2010-06-13 12:27 ` [PATCH 10/24] Implement VMPTRLD Nadav Har'El
2010-06-14  9:07   ` Avi Kivity
2010-08-05 11:13     ` Nadav Har'El
2010-06-16 13:36   ` Gleb Natapov
2010-07-06  3:09   ` Dong, Eddie
2010-08-05 11:35     ` Nadav Har'El
2010-06-13 12:28 ` [PATCH 11/24] Implement VMPTRST Nadav Har'El
2010-06-14  9:15   ` Avi Kivity
2010-06-16 13:53     ` Gleb Natapov
2010-06-16 15:33       ` Nadav Har'El
2010-06-13 12:28 ` [PATCH 12/24] Add VMCS fields to the vmcs12 Nadav Har'El
2010-06-14  9:24   ` Avi Kivity
2010-06-16 14:18   ` Gleb Natapov
2010-06-13 12:29 ` [PATCH 13/24] Implement VMREAD and VMWRITE Nadav Har'El
2010-06-14  9:36   ` Avi Kivity
2010-06-16 14:48     ` Gleb Natapov
2010-08-04 13:42       ` Nadav Har'El
2010-08-04 16:09     ` Nadav Har'El
2010-08-04 16:41       ` Avi Kivity
2010-06-16 15:03   ` Gleb Natapov
2010-08-04 11:46     ` Nadav Har'El
2010-06-13 12:29 ` [PATCH 14/24] Prepare vmcs02 from vmcs01 and vmcs12 Nadav Har'El
2010-06-14 11:11   ` Avi Kivity
2010-06-17  8:50   ` Gleb Natapov
2010-07-06  6:25   ` Dong, Eddie
2010-06-13 12:30 ` [PATCH 15/24] Move register-syncing to a function Nadav Har'El
2010-06-13 12:30 ` [PATCH 16/24] Implement VMLAUNCH and VMRESUME Nadav Har'El
2010-06-14 11:41   ` Avi Kivity
2010-09-26 11:14     ` Nadav Har'El
2010-09-26 12:56       ` Avi Kivity
2010-09-26 13:06         ` Nadav Har'El
2010-09-26 13:51           ` Avi Kivity
2010-06-17 10:59   ` Gleb Natapov
2010-09-16 16:06     ` Nadav Har'El
2010-06-13 12:31 ` [PATCH 17/24] No need for handle_vmx_insn function any more Nadav Har'El
2010-06-13 12:31 ` [PATCH 18/24] Exiting from L2 to L1 Nadav Har'El
2010-06-14 12:04   ` Avi Kivity
2010-09-12 14:05     ` Nadav Har'El
2010-09-12 14:29       ` Avi Kivity
2010-09-12 17:05         ` Nadav Har'El [this message]
2010-09-12 17:21           ` Avi Kivity
2010-09-12 19:51             ` Nadav Har'El
2010-09-13  8:48               ` Avi Kivity
2010-09-13  5:53             ` Sheng Yang
2010-09-13  8:52               ` Avi Kivity
2010-09-13  9:01                 ` Nadav Har'El
2010-09-13  9:34                   ` Avi Kivity
2010-09-14 13:07     ` Nadav Har'El
2010-06-13 12:32 ` [PATCH 19/24] Deciding if L0 or L1 should handle an L2 exit Nadav Har'El
2010-06-14 12:24   ` Avi Kivity
2010-09-16 14:42     ` Nadav Har'El
2010-06-13 12:32 ` [PATCH 20/24] Correct handling of interrupt injection Nadav Har'El
2010-06-14 12:29   ` Avi Kivity
2010-06-14 12:48     ` Avi Kivity
2010-09-16 15:25     ` Nadav Har'El
2010-06-13 12:33 ` [PATCH 21/24] Correct handling of exception injection Nadav Har'El
2010-06-13 12:33 ` [PATCH 22/24] Correct handling of idt vectoring info Nadav Har'El
2010-06-17 11:58   ` Gleb Natapov
2010-09-20  6:37     ` Nadav Har'El
2010-09-20  9:34       ` Gleb Natapov
2010-09-20 10:03         ` Nadav Har'El
2010-09-20 10:11           ` Avi Kivity
2010-09-22 23:15             ` Nadav Har'El
2010-09-26 15:14               ` Avi Kivity
2010-09-26 15:18                 ` Gleb Natapov
2010-09-20 10:20           ` Gleb Natapov
2010-06-13 12:34 ` [PATCH 23/24] Handling of CR0.TS and #NM for Lazy FPU loading Nadav Har'El
2010-06-13 12:34 ` [PATCH 24/24] Miscellenous small corrections Nadav Har'El
2010-06-14 12:34 ` [PATCH 0/24] Nested VMX, v5 Avi Kivity
2010-06-14 13:03   ` Nadav Har'El
2010-06-15 10:00     ` Avi Kivity
2010-10-17 12:03       ` Nadav Har'El
2010-10-17 12:10         ` Avi Kivity
2010-10-17 12:39           ` Nadav Har'El
2010-10-17 13:35             ` Avi Kivity
2010-07-09  8:59 ` Dong, Eddie
2010-07-11  8:27   ` Nadav Har'El
2010-07-11 11:05     ` Alexander Graf
2010-07-11 12:49       ` Nadav Har'El
2010-07-11 13:12         ` Avi Kivity
2010-07-11 15:39           ` Nadav Har'El
2010-07-11 15:45             ` Avi Kivity
2010-07-11 13:20     ` Avi Kivity
2010-07-15  3:27 ` Sheng Yang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100912170503.GA7828@fermat.math.technion.ac.il \
    --to=nyh@math.technion.ac.il \
    --cc=avi@redhat.com \
    --cc=kvm@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.