* Fwd: [PATCH 0/18] Nested Virtualization: Overview
From: Christoph Egger @ 2010-04-15 13:20 UTC
  To: xen-devel


Resending without the documents; the mail does not seem to get through to the list.


----------  Forwarded Message  ----------

Subject: [PATCH 0/18] Nested Virtualization: Overview
Date: Thursday 15 April 2010
From: Christoph Egger <Christoph.Egger@amd.com>
To: xen-devel@lists.xensource.com


Hi!

This patch series brings Nested Virtualization to Xen.
I have attached two documents, Nested_Virtualization.pdf and XenNestedHVM.pdf.

The former describes how nested virtualization works in general, and the latter
describes the Xen implementation in detail.

The patch series:

patch 01: Add the nestedhvm guest config option to the tools.
          This is the only patch touching the tools.
patch 02: Move the Viridian MSRs into the Viridian header, for use in
          another file added in patch 09.
patch 03: Change local_event_delivery_* to take a vcpu argument.
          This prevents spurious Xen crashes on guest shutdown/destroy
          with nestedhvm enabled.
patch 04: Obsolete gfn_to_mfn_current and remove it.
          gfn_to_mfn_current is redundant with gfn_to_mfn(current->domain, ...)
          (see the sketch after this list). This patch reduces the size of
          patch 17.
patch 05: hvm_set_cr0: Allow the guest to switch into paged real mode.
          This makes hvmloader boot when we run Xen in Xen.
patch 06: Move phys_table from struct domain to struct p2m_domain.
          Combined with patches 17 and 18, this allows running a nested
          guest with HAP.
patch 07: Add data structures for nested virtualization.
patch 08: Add the nestedhvm function hooks, described in XenNestedHVM.pdf.
patch 09: The heart of nested virtualization.
patch 10: Allow the guest to enable SVM in EFER.
patch 11: Propagate the SVM CPUID feature bits to the guest.
patch 12: Emulate the MSRs needed for nested virtualization.
patch 13: Handle interrupts (generic part).
patch 14: SVM-specific implementation of nested virtualization.
patch 15: Handle interrupts (SVM-specific part).
patch 16: The piece of code that effectively turns nested virtualization on.
patch 17: Change the p2m infrastructure to operate per-p2m instead of
          per-domain. Combined with patches 06 and 18, this allows running
          a nested guest with HAP.
patch 18: Handle nested page faults to enable hap-on-hap.
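
For illustration, here is a rough sketch of what patch 04 boils down to; the
signatures follow the Xen 4.0-era p2m API and should be read as an assumption,
not as the literal patch:

    /* gfn_to_mfn_current(gfn, t) is just a shortcut for a lookup in the
     * current domain's p2m; conceptually it is nothing more than this
     * wrapper, so patch 04 converts all callers and deletes it. */
    static inline mfn_t gfn_to_mfn_current(unsigned long gfn, p2m_type_t *t)
    {
        return gfn_to_mfn(current->domain, gfn, t);
    }

Once every caller spells out gfn_to_mfn(current->domain, ...), the per-p2m
rework in patch 17 has one less interface to touch.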


-- 
---to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Karl-Hammerschmidt-Str. 34, 85609 Dornach b. Muenchen
Geschaeftsfuehrer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632

-------------------------------------------------------


* Re: Fwd: [PATCH 0/18] Nested Virtualization: Overview
From: Vincent Hanquez @ 2010-04-15 13:39 UTC
  To: Christoph Egger; +Cc: xen-devel

On 15/04/10 14:20, Christoph Egger wrote:
> Resending without the documents; the mail does not seem to get through to the list.
>
>    
Any chance of putting the documents somewhere else (HTTP)?

I've been waiting for such a feature for a while, and it's really cool
to see it happening.
Kudos!

-- 
Vincent


* Re: Fwd: [PATCH 0/18] Nested Virtualization: Overview
From: Christoph Egger @ 2010-04-15 14:50 UTC
  To: xen-devel


On Thursday 15 April 2010 15:39:37 Vincent Hanquez wrote:
> On 15/04/10 14:20, Christoph Egger wrote:
> > Resending without the documents; the mail does not seem to get through to the list.
>
> Any chance to put the documents somewhere else (http ?).
>
> I've been waiting for such a feature for a while, and that's really cool
> to see it happening.
> kudos !


I'll try to send them out in two mails. This one carries Nested_Virtualization.pdf
(gzip'd).

Christoph




[-- Attachment #2: Nested_Virtualization.pdf.gz --]
[-- Type: application/x-gzip, Size: 253223 bytes --]


* Re: Fwd: [PATCH 0/18] Nested Virtualization: Overview
From: Christoph Egger @ 2010-04-15 14:51 UTC
  To: xen-devel


On Thursday 15 April 2010 15:39:37 Vincent Hanquez wrote:
> On 15/04/10 14:20, Christoph Egger wrote:
> > Resending without the documents; the mail does not seem to get through to the list.
>
> Any chance to put the documents somewhere else (http ?).
>
> I've been waiting for such a feature for a while, and that's really cool
> to see it happening.
> kudos !

This one carries XenNestedHVM.pdf (gzip'd).

Christoph



[-- Attachment #2: XenNestedHVM.pdf.gz --]
[-- Type: application/x-gzip, Size: 247637 bytes --]


* Re: Fwd: [PATCH 0/18] Nested Virtualization: Overview
From: Keir Fraser @ 2010-04-15 14:57 UTC
  To: Christoph Egger, xen-devel

On 15/04/2010 14:20, "Christoph Egger" <Christoph.Egger@amd.com> wrote:

> patch 03: change local_event_delivery_* to take vcpu argument.
>                   This prevents spurious xen crashes on guest shutdown/destroy
>                   with nestedhvm enabled.

Can you give an example of how this bug manifests? I don't really see how
nestedhvm would interact so unexpectedly with this rather pv-oriented
subsystem.

> patch 04: obsolete gfn_to_mfn_current and remove it.
>                   gfn_to_mfn_current is redundant to
> gfn_to_mfn(current->domain, ...)
>                   This patch reduces the size of patch 17.

This one (at least -- there may be others) needs an ack from Tim.

> patch 05: hvm_set_cr0: Allow guest to switch into paged real mode.
>                   This makes hvmloader boot when we use xen in xen.

What if we are not running a nestedhvm guest, or otherwise on a system not
supporting paged real mode? Is it wise to remove the check in that case?
Even where we *do* support nestedhvm, should all guest writes to CR0 be
allowed to bypass that check (Isn't paged real mode architecturally only
allowed to be entered via VMRUN)?

More generally, I will allow these patches to sit for a week or two to give
time for potential reviewers to digest them.

 Thanks,
 Keir


* Re: Fwd: [PATCH 0/18] Nested Virtualization: Overview
From: Christoph Egger @ 2010-04-15 15:17 UTC
  To: Keir Fraser; +Cc: xen-devel

On Thursday 15 April 2010 16:57:40 Keir Fraser wrote:
> On 15/04/2010 14:20, "Christoph Egger" <Christoph.Egger@amd.com> wrote:
> > patch 03: change local_event_delivery_* to take vcpu argument.
> >                   This prevents spurious xen crashes on guest
> > shutdown/destroy with nestedhvm enabled.
>
> Can you give an example of how this bug manifests? I don't really see how
> nestedhvm would interact so unexpectedly with this rather pv-oriented
> subsystem.

On guest shutdown/destroy, 'current' does not always point to the expected
virtual CPU. When current != v, Xen crashes like this:

(XEN) ----[ Xen-4.0.0-rc6  x86_64  debug=y  Tainted:    C ]----
(XEN) CPU:    2
(XEN) RIP:    e008:[<ffff82c4801a5257>] nestedsvm_vcpu_stgi+0xa0/0xaf
(XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor
(XEN) rax: 0000000000000001   rbx: ffff83008faf0000   rcx: 0000000000000001
(XEN) rdx: 0000000000000000   rsi: ffff82c480270b20   rdi: ffff8301495c0000
(XEN) rbp: ffff830167e27dd0   rsp: ffff830167e27dc0   r8:  0000000000000001
(XEN) r9:  ffff82c4803937e0   r10: 0000ffff0000ffff   r11: 00ff00ff00ff00ff
(XEN) r12: 0000000000000000   r13: 0000000000000000   r14: ffff82c48026cb00
(XEN) r15: ffffffffffffffff   cr0: 000000008005003b   cr4: 00000000000006f0
(XEN) cr3: 0000000160e28000   cr2: 0000000000000001
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) Xen stack trace from rsp=ffff830167e27dc0:
(XEN)    ffff830144fd08d0 ffff83008faf0000 ffff830167e27df0 ffff82c4801a5583
(XEN)    ffff83008faf0000 ffff83008faf0000 ffff830167e27e10 ffff82c48019cf13
(XEN)    ffff82c48011e68c ffff83008faf0000 ffff830167e27e30 ffff82c48014d17d
(XEN)    ffff830167e27e40 ffff8301495c0000 ffff830167e27e60 ffff82c480105d8c
(XEN)    ffff830167e27e90 ffff82c4802701a0 0000000000000000 0000000000000000
(XEN)    ffff830167e27e90 ffff82c480124113 ffff82c48038a980 ffff82c48038aa80
(XEN)    ffff82c48038a980 ffff830167e27f28 ffff830167e27ed0 ffff82c48011e1f2
(XEN)    ffff83008faf80f8 ffff830167e27f28 ffff82c480243040 ffff830167e27f28
(XEN)    ffff82c48026cb00 ffff82c480243ab8 ffff830167e27ee0 ffff82c48011e211
(XEN)    ffff830167e27f20 ffff82c48014d363 ffffffff8057c580 ffff83008fe68000
(XEN)    ffff83008faf8000 0000000000004000 ffff82c480270080 0000000000000002
(XEN)    ffff830167e27dc8 0000000000000000 ffffffff8057d1c0 ffffffff8057c580
(XEN)    ffffffffffffffff 0000000000631918 0000000000000000 0000000000000246
(XEN)    0000000000000000 00000397cc124096 0000000000000000 0000000000000000
(XEN)    ffffffff802083aa 00000000deadbeef 00000000deadbeef 00000000deadbeef
(XEN)    0000010000000000 ffffffff802083aa 000000000000e033 0000000000000246
(XEN)    ffffffff80553f10 000000000000e02b 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000002 ffff83008fe68000
(XEN) Xen call trace:
(XEN)    [<ffff82c4801a5257>] nestedsvm_vcpu_stgi+0xa0/0xaf
(XEN)    [<ffff82c4801a5583>] nestedsvm_vcpu_destroy+0x48/0x6d
(XEN)    [<ffff82c48019cf13>] hvm_vcpu_destroy+0x11/0x67
(XEN)    [<ffff82c48014d17d>] vcpu_destroy+0x33/0x3a
(XEN)    [<ffff82c480105d8c>] complete_domain_destroy+0x39/0x103
(XEN)    [<ffff82c480124113>] rcu_process_callbacks+0x17e/0x1dc
(XEN)    [<ffff82c48011e1f2>] __do_softirq+0x74/0x85
(XEN)    [<ffff82c48011e211>] do_softirq+0xe/0x10
(XEN)    [<ffff82c48014d363>] idle_loop+0x8d/0x8f
(XEN)
(XEN) Pagetable walk from 0000000000000001:
(XEN)  L4[0x000] = 00000001619e5067 000000000001e849
(XEN)  L3[0x000] = 0000000160cbc067 000000000001dd72
(XEN)  L2[0x000] = 0000000000000000 ffffffffffffffff
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 2:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=0002]
(XEN) Faulting linear address: 0000000000000001
(XEN) ****************************************

nestedsvm_vcpu_stgi() enables events for PV drivers.
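
To illustrate what patch 03 does (the helper bodies below are a simplified
sketch based on the 4.0-era event code; the new function name is my shorthand,
not necessarily the one used in the patch):

    /* Before: the helper implicitly acts on 'current', which during the
     * RCU-driven domain teardown is not the vcpu being destroyed. */
    static inline void local_event_delivery_enable(void)
    {
        vcpu_info(current, evtchn_upcall_mask) = 0;
    }

    /* After: take the vcpu explicitly, so nestedsvm_vcpu_stgi(v) can
     * unmask event delivery for the vcpu actually being torn down. */
    static inline void vcpu_event_delivery_enable(struct vcpu *v)
    {
        vcpu_info(v, evtchn_upcall_mask) = 0;
    }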

>
> > patch 04: obsolete gfn_to_mfn_current and remove it.
> >                   gfn_to_mfn_current is redundant to
> > gfn_to_mfn(current->domain, ...)
> >                   This patch reduces the size of patch 17.
>
> This one (at least -- there may be others) needs an ack from Tim.

I would imagine all patches touching the p2m need an ack from Tim.

>
> > patch 05: hvm_set_cr0: Allow guest to switch into paged real mode.
> >                   This makes hvmloader boot when we use xen in xen.
>
> What if we are not running a nestedhvm guest, or otherwise on a system not
> supporting paged real mode?

hvmloader itself switches to real mode right before invoking the BIOS
via a trampoline. Under virtualization, the CPU always uses paged real mode;
otherwise it cannot fetch the instructions.

> Is it wise to remove the check in that case? 

No. Otherwise the MOV-to-CR0 instruction fails with a #GP. The guest
doesn't expect the #GP to happen; it just retries the same instruction,
which escalates into a double fault and then a triple fault.

> Even where we *do* support nestedhvm, should all guest writes to CR0 be
> allowed to bypass that check (Isn't paged real mode architecturally only
> allowed to be entered via VMRUN)?

Are you talking about the VMRUN instruction or the VMRUN emulation?
Note, there is a difference. With nestedhvm, the guest believes it can
run the VMRUN instruction. In reality, VMRUN is intercepted and emulated by
the host. The papers (PDF documents) describe how the VMRUN emulation works.

> More generally, I will allow these patches to sit for a week or two to give
> time for potential reviewers to digest them.

Ack.

Christoph

>
>  Thanks,
>  Keir




* Re: Fwd: [PATCH 0/18] Nested Virtualization: Overview
From: Tim Deegan @ 2010-04-15 15:25 UTC
  To: Keir Fraser; +Cc: Christoph Egger, xen-devel

At 15:57 +0100 on 15 Apr (1271347060), Keir Fraser wrote:
> > patch 04: obsolete gfn_to_mfn_current and remove it.
> >                   gfn_to_mfn_current is redundant to
> > gfn_to_mfn(current->domain, ...)
> >                   This patch reduces the size of patch 17.
> 
> This one (at least -- there may be others) needs an ack from Tim.

I've already asked for some measurement to show the effect of removing
gfn_to_mfn_current() on shadow pagetable performance.

The other patches that I was CC'd on look mostly OK, except for
introducing some clunky (and wide) y_to_z(x_to_y(foo_to_x(foo)))
patterns that I'm sure could be done a bit more neatly.

I'll read the PDFs tomorrow and have a proper look at the patches then.

Cheers,

Tim.

> > patch 05: hvm_set_cr0: Allow guest to switch into paged real mode.
> >                   This makes hvmloader boot when we use xen in xen.
> 
> What if we are not running a nestedhvm guest, or otherwise on a system not
> supporting paged real mode? Is it wise to remove the check in that case?
> Even where we *do* support nestedhvm, should all guest writes to CR0 be
> allowed to bypass that check (Isn't paged real mode architecturally only
> allowed to be entered via VMRUN)?
> 
> More generally, I will allow these patches to sit for a week or two to give
> time for potential reviewers to digest them.
> 
>  Thanks,
>  Keir
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, XenServer Engineering
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)


* Re: Fwd: [PATCH 0/18] Nested Virtualization: Overview
From: Keir Fraser @ 2010-04-15 16:26 UTC
  To: Christoph Egger; +Cc: xen-devel

On 15/04/2010 16:17, "Christoph Egger" <Christoph.Egger@amd.com> wrote:

> (XEN)    [<ffff82c4801a5257>] nestedsvm_vcpu_stgi+0xa0/0xaf
> (XEN)    [<ffff82c4801a5583>] nestedsvm_vcpu_destroy+0x48/0x6d

Well, I haven't found which of the 18 patches actually implements the above
functions. But it seems odd to me that a function on the vcpu_destroy path
-- which is concerned with final teardown and freeing of vcpu structures
after a domain's death -- would call a function relating to STGI. The latter
being the kind of operation that doesn't seem to make sense outside the
context of a currently-running or at least runnable-in-future guest? You
haven't convinced me yet. ;-) It's not a barrier to the other patches though
-- worst case the rest go in and we argue the merits of this patch versus
other possible approaches as a second step.

>> What if we are not running a nestedhvm guest, or otherwise on a system not
>> supporting paged real mode?
> 
> The hvmloader itself switches to real mode right before invoking the BIOS
> via a trampoline. In virtualization, the cpu always uses paged real mode or
> it can't fetch the instructions, otherwise.

Well, no, hvmloader thinks it is setting CR0.PE *and* CR0.PG both to zero.
The fact that this is either emulated or turned into paged real mode under the
hood is an implementation concern. Fact is afaik it is not valid in general
to set CR0.PE=0,CR0.PG=1 via an ordinary MOV-to-CR0 instruction, for
example. In my version of the AMD docs, Vol.2 has a section called Paged
Real Mode which only discusses this mode in the context of the VMRUN and RSM
instructions.

>> Is it wise to remove the check in that case?
> 
> No. The mov-to-cr0 instruction fails with a #GP, otherwise. The guest
> doesn't expect the #GP to happen. The guest just retries the same instruction
> summing up to a double-fault and triple-fault.

It is a guest-visible change in behaviour however. I don't want regressions
in our emulation of x86 semantics as a simple matter of principle, even in
cases like this where indeed it may well not really matter all that much.

In this case, my main objection is that hvm_set_cr0() looks like it emulates
ordinary CR0 semantics, and it will now look like a bug that this check is
missing. You even risk it getting added back in down the road by someone
ignorant of the requirements of nested SVM.

>> Even where we *do* support nestedhvm, should all guest writes to CR0 be
>> allowed to bypass that check (Isn't paged real mode architecturally only
>> allowed to be entered via VMRUN)?
> 
> Are you talking about the VMRUN instruction or the VMRUN emulation ?
> Note, this is a difference. With nestedhvm, the guest believes to be able
> to run VMRUN instruction. In reality, VMRUN is intercepted and emulated in
> the host. The papers (pdf documents) describe how VMRUN emulation works.

Yeah, right, and the emulated VMRUN needs to be able to set
CR0.PG=1,CR0.PE=0. I can see that. In short, I suggest adding a flags
parameter to hvm_set_cr0() and defining a flag to specify that this specific
check be bypassed. That's all nice and explicit and it's clear-as-day then
that we are emulating CR0 semantics as correctly as possible in all cases. The
downside that we've made hvm_set_cr0() a bit more clunky is worth it imo.
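
Something along these lines; this is only a sketch of the idea, and the flag
name and plumbing are mine rather than anything in the posted series:

    #define HVM_CR0_GUEST_VMENTRY (1u << 0) /* hypothetical flag name */

    int hvm_set_cr0(unsigned long value, unsigned int flags)
    {
        /* Ordinary MOV-to-CR0 emulation keeps the architectural check;
         * only the emulated VMRUN/RSM paths pass the flag to skip it. */
        if ( !(flags & HVM_CR0_GUEST_VMENTRY) &&
             (value & X86_CR0_PG) && !(value & X86_CR0_PE) )
        {
            hvm_inject_exception(TRAP_gp_fault, 0, 0);
            return X86EMUL_EXCEPTION;
        }

        /* ... rest of the existing CR0 handling ... */
        return X86EMUL_OKAY;
    }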

 -- Keir


* Re: Fwd: [PATCH 0/18] Nested Virtualization: Overview
From: Qing He @ 2010-04-16  9:07 UTC
  To: Christoph Egger; +Cc: xen-devel

On Thu, 2010-04-15 at 21:20 +0800, Christoph Egger wrote:
> This patch series brings Nested Virtualization to Xen.
> I have attached two documents Nested_Virtualization.pdf and XenNestedHVM.pdf.
> 

This code is NOT generic and does not fit VMX, although it is placed in the
hvm directory.

For example, the structure v->arch.hvm_vcpu.nestedhvm is filled with
SVM-specific registers and concepts, and so is arch/x86/hvm/nestedhvm.c;
that file even includes vmcb_struct, as well as things like CPUIDs and EFER.
A VMX adaptation of this nestedhvm would be far too difficult, if not
impossible.

These files should go into arch/x86/hvm/svm instead, since they are really
SVM-specific; the same goes for struct nestedhvm, which fits better in
v->arch.hvm_svm.


In fact, I even doubt whether a generic nested architecture should be made for
VMX and SVM. When the HVM abstraction was made, it was largely because the
targets were the same: the x86 VM. However, the emulation target of nested
virtualization is either VMX or SVM, and the two differ much more, including
in capability reporting, access to the control structures, control registers,
interrupt delivery (IDT vectoring), optimization options, the guest format of
HAP, and so on.

Not much is common except the similarity of control flow (which is not
abstracted even in HVM: vmx_do_resume and svm_do_resume). Given the
complexity of the two HVM extensions, an abstraction may create more
overhead and quirks than benefit.

Moreover, an L2 guest is not context-complete by itself; some of its context
logically belongs to the L1 guest. This makes things more complicated in that
VMEXIT handling is interleaved. I notice that the patchset uses a separate
flow path rather than hvm/{vmx,svm}/entry.S; this means that, at least on the
VMX side, many things already there need to be duplicated into the new path,
which is both error-prone and makes maintenance more difficult.


I have a nested VMX implementation in hand that supports the major features
(shadow-on-shadow, shadow and EPT on EPT, x86_64, SMP); however, several weeks
are still needed for some additional cleanup before I can send it to the list
as an RFC. That patch, in contrast, lies almost entirely in arch/x86/hvm/vmx,
and only adds two members to v->arch.hvm_vcpu: nesting_avail and in_nesting.
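
To make the layout point concrete, a rough sketch (apart from nesting_avail
and in_nesting, which I named above, the identifiers are only illustrative):

    /* Posted series: the common per-vcpu HVM state carries an SVM-flavoured
     * blob (it wraps a vmcb_struct, nested EFER, and so on). */
    struct hvm_vcpu_as_posted {
        /* ... existing common fields ... */
        struct nestedhvm nestedhvm;
    };

    /* My preference: the vendor state stays in v->arch.hvm_svm / hvm_vmx,
     * and the common structure gains only two generic flags. */
    struct hvm_vcpu_as_proposed {
        /* ... existing common fields ... */
        bool_t nesting_avail;   /* nested virtualization offered to the guest */
        bool_t in_nesting;      /* vcpu is currently running an L2 guest */
    };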

Thanks,
Qing


* Re: Fwd: [PATCH 0/18] Nested Virtualization: Overview
From: Christoph Egger @ 2010-04-16  9:32 UTC
  To: xen-devel; +Cc: Qing He

On Friday 16 April 2010 11:07:08 Qing He wrote:
> On Thu, 2010-04-15 at 21:20 +0800, Christoph Egger wrote:
> > This patch series brings Nested Virtualization to Xen.
> > I have attached two documents Nested_Virtualization.pdf and
> > XenNestedHVM.pdf.
>
> This code is NOT generic and doesn't fit for vmx, although it is placed in
> hvm directory.
>
> For example, the structure v->arch.hvm_vcpu.nestedhvm is filled with
> svm specific registers and concepts, so is the arch/x86/hvm/nestedhvm.c,
> this file even includes vmcb_struct, as well as things like cpuids and
> efer. A vmx adaption to this nestedhvm would be way too difficult if not
> impossible.
>
> These files should go to arch/x86/hvm/svm instead since they are really svm
> specific, so is the struct nestedhvm, v->arch.hvm_svm fits better.

Well, that is exactly what I expected from no one other than Intel :-)

Please read the XenNestedHVM.pdf paper, particularly the section "Software
Architecture". It describes how this is made generic and what needs
to be done to adapt it to Intel.

I estimate it would take you one or two weeks to have NestedHVM on Intel
based on my patchset.


The way you suggest would make the Xen source tree and the Xen binary
a lot larger than necessary.

Christoph



* Re: Fwd: [PATCH 0/18] Nested Virtualization: Overview
From: Christoph Egger @ 2010-04-16 10:01 UTC
  To: Tim Deegan; +Cc: xen-devel, Keir Fraser

On Thursday 15 April 2010 17:25:42 Tim Deegan wrote:
> At 15:57 +0100 on 15 Apr (1271347060), Keir Fraser wrote:
> > > patch 04: obsolete gfn_to_mfn_current and remove it.
> > >                   gfn_to_mfn_current is redundant to
> > > gfn_to_mfn(current->domain, ...)
> > >                   This patch reduces the size of patch 17.
> >
> > This one (at least -- there may be others) needs an ack from Tim.
>
> I've already asked for some measurement to show the effect of removing
> gfn_to_mfn_current() on shadow pagetable performance.

Yes; I couldn't do that yet, since the tools were broken and I couldn't even
start a guest. This has been fixed since c/s 21187.

> The other patches that I was CC'd on look mostly OK, except for
> introducing some clunky (and wide) y_to_z(x_to_y(foo_to_x(foo)))
> patterns that I'm sure could be done a bit more neatly.

Yes, you see that pattern in patch 06/18. In patch 17/18 that pattern is
reworked again into a better shape.

BTW: I think I forgot to CC you on patch 18/18.

> I'll read the PDFs tomorrow and have a proper look at the patches then.

Thanks.

Christoph


> Cheers,
>
> Tim.
>
> > > patch 05: hvm_set_cr0: Allow guest to switch into paged real mode.
> > >                   This makes hvmloader boot when we use xen in xen.
> >
> > What if we are not running a nestedhvm guest, or otherwise on a system
> > not supporting paged real mode? Is it wise to remove the check in that
> > case? Even where we *do* support nestedhvm, should all guest writes to
> > CR0 be allowed to bypass that check (Isn't paged real mode
> > architecturally only allowed to be entered via VMRUN)?
> >
> > More generally, I will allow these patches to sit for a week or two to
> > give time for potential reviewers to digest them.
> >
> >  Thanks,
> >  Keir
> >
> >
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xensource.com
> > http://lists.xensource.com/xen-devel




* Re: Fwd: [PATCH 0/18] Nested Virtualization: Overview
From: Tim Deegan @ 2010-04-16 10:27 UTC
  To: Christoph Egger; +Cc: xen-devel, Qing He

At 10:32 +0100 on 16 Apr (1271413945), Christoph Egger wrote:
> > For example, the structure v->arch.hvm_vcpu.nestedhvm is filled with
> > svm specific registers and concepts, so is the arch/x86/hvm/nestedhvm.c,
> > this file even includes vmcb_struct, as well as things like cpuids and
> > efer. A vmx adaption to this nestedhvm would be way too difficult if not
> > impossible.
> >
> > These files should go to arch/x86/hvm/svm instead since they are really svm
> > specific, so is the struct nestedhvm, v->arch.hvm_svm fits better.
> 
> Well, that is what I expected from noone else than Intel :-)
> 
> Please read the XenNestedHVM.pdf paper, particularly the section "Software 
> Architecture". This describes how this is made to be generic and what needs
> to be done to adapt to Intel.

Your PDFs suggest that even on Intel CPUs, the nested hypervisor should
always see SVM, not VMX.  You shouldn't be surprised or offended if that
isn't popular with Intel. :)

I would like to see some reasonable discussion of exactly how much of
the nested virtualization logic can be shared.  With HVM it has turned
out to be quite a lot, but it's taken years of reshuffling to pull code
out into common HVM routines (and we're not there yet).  I don't think
either of the suggested approaches (common code == SVM, or common code
== nothing) will be the right one.

Cheers,

Tim.



* Re: Fwd: [PATCH 0/18] Nested Virtualization: Overview
From: Keir Fraser @ 2010-04-16 17:50 UTC
  To: Tim Deegan, Christoph Egger; +Cc: xen-devel, Qing He

On 16/04/2010 11:27, "Tim Deegan" <Tim.Deegan@citrix.com> wrote:

>> Please read the XenNestedHVM.pdf paper, particularly the section "Software
>> Architecture". This describes how this is made to be generic and what needs
>> to be done to adapt to Intel.
> 
> Your PDFs suggest that even on Intel CPUs, the nested hypervisor should
> always see SVM, not VMX.  You shouldn't be surprised or offended if that
> isn't popular with Intel. :)

I don't see any good argument for it either. I.e., I don't think we care
about migrating between AMD and Intel hosts with nestedhvm enabled, which I
think would be the only argument for it. I know we added support for
cross-emulating SYSENTER and SYSCALL, but that's needed for cross-migration
of any 64-bit guest running compat-mode apps (i.e., really needed to make
cross-migration possible at all). I'm sceptical enough of the utility of
cross-vendor migration *at all*, let alone supporting it in tandem with
advanced features also of dubious utility (at least in enterprise space),
like nestedhvm.

 -- Keir


* Re: Fwd: [PATCH 0/18] Nested Virtualization: Overview
From: Joerg Roedel @ 2010-04-17 11:43 UTC
  To: Tim Deegan; +Cc: Christoph Egger, xen-devel, Qing He

On Fri, Apr 16, 2010 at 11:27:11AM +0100, Tim Deegan wrote:
> At 10:32 +0100 on 16 Apr (1271413945), Christoph Egger wrote:

> Your PDFs suggest that even on Intel CPUs, the nested hypervisor should
> always see SVM, not VMX.  You shouldn't be surprised or offended if that
> isn't popular with Intel. :)

Well, it would make sense for Intel to virtualize SVM too, because it
doesn't have the performance issues with lots and lots of emulated
vmread/vmwrite instructions causing vmexits in the nested case. The
bigger problem with SVM on VMX is that it could never be complete,
because afaik VMX has fewer intercepts than SVM.

	Joerg


* Re: Fwd: [PATCH 0/18] Nested Virtualization: Overview
From: Keir Fraser @ 2010-04-18 17:52 UTC
  To: Joerg Roedel, Tim Deegan; +Cc: Christoph Egger, xen-devel, Qing He

On 17/04/2010 12:43, "Joerg Roedel" <joro@8bytes.org> wrote:

>> Your PDFs suggest that even on Intel CPUs, the nested hypervisor should
>> always see SVM, not VMX.  You shouldn't be surprised or offended if that
>> isn't popular with Intel. :)
> 
> Well, it would make sense for Intel too virtualize SVM because it
> doesn't has the performance issues with lots and lots of emulated
> vmread/vmwrite instructions that cause vmexits in the nested case. The
> bigger problem with SVM on VMX is that it could never be complete
> because afaik VMX has fewer intercepts than SVM.

I don't think either VMX-on-SVM or SVM-on-VMX should be an aim. I mean, we'd
have to completely emulate the underlying Intel processor, say, as AMD, to
ensure SVM code paths get taken in the guest kernel/hypervisor. It's not
really on.

 -- Keir


* RE: Fwd: [PATCH 0/18] Nested Virtualization: Overview
From: Dong, Eddie @ 2010-04-20  2:07 UTC
  To: Keir Fraser, Tim Deegan, Christoph Egger; +Cc: xen-devel, Dong, Eddie, He, Qing

Keir Fraser wrote:
> On 16/04/2010 11:27, "Tim Deegan" <Tim.Deegan@citrix.com> wrote:
> 
>>> Please read the XenNestedHVM.pdf paper, particularly the section
>>> "Software Architecture". This describes how this is made to be
>>> generic and what needs to be done to adapt to Intel.
>> 
>> Your PDFs suggest that even on Intel CPUs, the nested hypervisor
>> should always see SVM, not VMX.  You shouldn't be surprised or
>> offended if that isn't popular with Intel. :)
> 
> I don't see any good argument for it either. I.e., I don't think we
> care about migrating between AMD and Intel hosts with nestedhvm
> enabled, which I think would be the only argument for it. I know we
> added support for cross-emulating SYSENTER and SYSCALL, but that's
> needed for cross-migration of any 64-bit guest running compat-mode
> apps (i.e., really need to make cross-migration possible at all). I'm
> sceptical enough of the utility of cross-vendor migration *at all*,
> let alone supporting in tandem with advanced features also of dubious
> utility (at least in enterprise space), like nestedhvm.
> 

Although SVM on VMX is possible in theory, I doubt its feasibility, given that
there are many semantic differences between the VMCB and the VMCS which would
eventually have to be emulated with extra complexity.

Qing will post his native VMX-on-VMX patch this week, based on his Xen Summit
talk at http://www.xen.org/files/xensummit_intel09/xensummit-nested-virt.pdf :)
We can have the next level of discussion/comparison then.

BTW, if somebody has implemented an SVM-on-VMX solution already, we can do a
performance comparison between the two approaches to assist the discussion.

Thx, Eddie
