From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: andrew.cooper3@citrix.com, kevin.tian@intel.com,
wim.coekaerts@oracle.com, jun.nakajima@intel.com,
xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6
Date: Fri, 15 Jan 2016 16:39:58 -0500 [thread overview]
Message-ID: <20160115213958.GA16118@char.us.oracle.com> (raw)
In-Reply-To: <5694D3CB02000078000C5D00@prv-mh.provo.novell.com>
On Tue, Jan 12, 2016 at 02:22:03AM -0700, Jan Beulich wrote:
> >>> On 12.01.16 at 04:38, <konrad.wilk@oracle.com> wrote:
> > (XEN) Assertion 'vapic_pg && !p2m_is_paging(p2mt)' failed at vvmx.c:698
> > (XEN) ----[ Xen-4.6.0 x86_64 debug=y Tainted: C ]----
> > (XEN) CPU: 39
> > (XEN) RIP: e008:[<ffff82d0801ed053>] virtual_vmentry+0x487/0xac9
> > (XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor (d1v3)
> > (XEN) rax: 0000000000000000 rbx: ffff83007786c000 rcx: 0000000000000000
> > (XEN) rdx: 0000000000000e00 rsi: 000fffffffffffff rdi: ffff83407f81e010
> > (XEN) rbp: ffff834008a47ea8 rsp: ffff834008a47e38 r8: 0000000000000000
> > (XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
> > (XEN) r12: 0000000000000000 r13: ffff82c000341000 r14: ffff834008a47f18
> > (XEN) r15: ffff83407f7c4000 cr0: 0000000080050033 cr4: 00000000001526e0
> > (XEN) cr3: 000000407fb22000 cr2: 0000000000000000
> > (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
> > (XEN) Xen stack trace from rsp=ffff834008a47e38:
> > (XEN) ffff834008a47e68 ffff82d0801d2cde ffff834008a47e68 0000000000000d00
> > (XEN) 0000000000000000 0000000000000000 ffff834008a47e88 00000004801cc30e
> > (XEN) ffff83007786c000 ffff83007786c000 ffff834008a40000 0000000000000000
> > (XEN) ffff834008a47f18 0000000000000000 ffff834008a47f08 ffff82d0801edf94
> > (XEN) ffff834008a47ef8 0000000000000000 ffff834008f62000 ffff834008a47f18
> > (XEN) 000000ae8c99eb8d ffff83007786c000 0000000000000000 0000000000000000
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 ffff82d0801ee2ab
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > (XEN) 00000000078bfbff 0000000000000000 0000000000000000 0000beef0000beef
> > (XEN) fffffffffc4b3440 000000bf0000beef 0000000000040046 fffffffffc607f00
> > (XEN) 000000000000beef 000000000000beef 000000000000beef 000000000000beef
> > (XEN) 000000000000beef 0000000000000027 ffff83007786c000 0000006f88716300
> > (XEN) 0000000000000000
> > (XEN) Xen call trace:
> > (XEN) [<ffff82d0801ed053>] virtual_vmentry+0x487/0xac9
> > (XEN) [<ffff82d0801edf94>] nvmx_switch_guest+0x8ff/0x915
> > (XEN) [<ffff82d0801ee2ab>] vmx_asm_vmexit_handler+0x4b/0xc0
> > (XEN)
> > (XEN)
> > (XEN) ****************************************
> > (XEN) Panic on CPU 39:
> > (XEN) Assertion 'vapic_pg && !p2m_is_paging(p2mt)' failed at vvmx.c:698
> > (XEN) ****************************************
> > (XEN)
> >
> > ..and then to my surprise the hypervisor stopped hitting this.
>
> Since we can (I hope) pretty much exclude a paging type, the
> ASSERT() must have triggered because of vapic_pg being NULL.
> That might be verifiable without extra printk()s, just by checking
> the disassembly (assuming the value sits in a register). In which
> case vapic_gpfn would be of interest too.
The vapic_gpfn is 0xffffffffffff.
To be exact:
nvmx_update_virtual_apic_address:vCPU0 0xffffffffffffffff(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe
Based on this:
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index cb6f9b8..8a0abfc 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -695,7 +695,15 @@ static void nvmx_update_virtual_apic_address(struct vcpu *v)
vapic_gpfn = __get_vvmcs(nvcpu->nv_vvmcx, VIRTUAL_APIC_PAGE_ADDR) >> PAGE_SHIFT;
vapic_pg = get_page_from_gfn(v->domain, vapic_gpfn, &p2mt, P2M_ALLOC);
- ASSERT(vapic_pg && !p2m_is_paging(p2mt));
+ if ( !vapic_pg ) {
+ printk("%s:vCPU%d 0x%lx(vAPIC) 0x%lx(APIC), 0x%lx(TPR) ctrl=%x\n", __func__,v->vcpu_id,
+ __get_vvmcs(nvcpu->nv_vvmcx, VIRTUAL_APIC_PAGE_ADDR),
+ __get_vvmcs(nvcpu->nv_vvmcx, APIC_ACCESS_ADDR),
+ __get_vvmcs(nvcpu->nv_vvmcx, TPR_THRESHOLD),
+ ctrl);
+ }
+ ASSERT(vapic_pg);
+ ASSERT(vapic_pg && !p2m_is_paging(p2mt));
__vmwrite(VIRTUAL_APIC_PAGE_ADDR, page_to_maddr(vapic_pg));
put_page(vapic_pg);
}
>
> What looks odd to me is the connection between
> CPU_BASED_TPR_SHADOW being set and the use of a (valid)
> virtual APIC page: Wouldn't this rather need to depend on
> SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES, just like in
> nvmx_update_apic_access_address()?
Could be. I added in an read for the secondary control:
nvmx_update_virtual_apic_address:vCPU2 0xffffffffffffffff(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe sec=0
So trying your recommendation:
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index cb6f9b8..d291c91 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -686,8 +686,8 @@ static void nvmx_update_virtual_apic_address(struct vcpu *v)
struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
u32 ctrl;
- ctrl = __n2_exec_control(v);
- if ( ctrl & CPU_BASED_TPR_SHADOW )
+ ctrl = __n2_secondary_exec_control(v);
+ if ( ctrl & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES )
{
p2m_type_t p2mt;
unsigned long vapic_gpfn;
Got me:
(XEN) stdvga.c:151:d1v0 leaving stdvga mode
(XEN) stdvga.c:147:d1v0 entering stdvga and caching modes
(XEN) stdvga.c:520:d1v0 leaving caching mode
(XEN) vvmx.c:2491:d1v0 Unknown nested vmexit reason 80000021.
(XEN) Failed vm entry (exit reason 0x80000021) caused by invalid guest state (0).
(XEN) ************* VMCS Area **************
(XEN) *** Guest State ***
(XEN) CR0: actual=0x0000000000000030, shadow=0x0000000000000000, gh_mask=ffffffffffffffff
(XEN) CR4: actual=0x0000000000002050, shadow=0x0000000000000000, gh_mask=ffffffffffffffff
(XEN) CR3 = 0x00000000800ed000
(XEN) RSP = 0x0000000000000000 (0x0000000000000000) RIP = 0x0000000000000000 (0x0000000000000000)
(XEN) RFLAGS=0x00000002 (0x00000002) DR7 = 0x0000000000000400
(XEN) Sysenter RSP=0000000000000000 CS:RIP=0000:0000000000000000
(XEN) sel attr limit base
(XEN) CS: 0000 00000 00000000 0000000000000000
(XEN) DS: 0000 00000 00000000 0000000000000000
(XEN) SS: 0000 00000 00000000 0000000000000000
(XEN) ES: 0000 00000 00000000 0000000000000000
(XEN) FS: 0000 00000 00000000 0000000000000000
(XEN) GS: 0000 00000 00000000 0000000000000000
(XEN) GDTR: 00000000 0000000000000000
(XEN) LDTR: 0000 00000 00000000 0000000000000000
(XEN) IDTR: 00000000 0000000000000000
(XEN) TR: 0000 00000 00000000 0000000000000000
(XEN) EFER = 0x0000000000000800 PAT = 0x0000000000000000
(XEN) PreemptionTimer = 0x00000000 SM Base = 0x00000000
(XEN) DebugCtl = 0x0000000000000000 DebugExceptions = 0x0000000000000000
(XEN) Interruptibility = 00000000 ActivityState = 00000000
(XEN) *** Host State ***
(XEN) RIP = 0xffff82d0801ee3a0 (vmx_asm_vmexit_handler) RSP = 0xffff8340077d7f90
(XEN) CS=e008 SS=0000 DS=0000 ES=0000 FS=0000 GS=0000 TR=e040
(XEN) FSBase=0000000000000000 GSBase=0000000000000000 TRBase=ffff8340077dfc00
(XEN) GDTBase=ffff8340077d0000 IDTBase=ffff8340077dc000
(XEN) CR0=0000000080050033 CR3=000000400076c000 CR4=00000000001526e0
(XEN) Sysenter RSP=ffff8340077d7fc0 CS:RIP=e008:ffff82d080238870
(XEN) EFER = 0x0000000000000000 PAT = 0x0000050100070406
(XEN) *** Control State ***
(XEN) PinBased=0000003f CPUBased=b5b9effe SecondaryExec=000054eb
(XEN) EntryControls=000011fb ExitControls=001fefff
(XEN) ExceptionBitmap=00062042 PFECmask=00000000 PFECmatch=ffffffff
(XEN) VMEntry: intr_info=00000000 errcode=00000000 ilen=00000000
(XEN) VMExit: intr_info=00000000 errcode=00000000 ilen=00000006
(XEN) reason=80000021 qualification=0000000000000000
(XEN) IDTVectoring: info=00000000 errcode=00000000
(XEN) TSC Offset = 0xfffd34adb2c3a149
(XEN) TPR Threshold = 0x00 PostedIntrVec = 0x00
(XEN) EPT pointer = 0x000000400079a01e EPTP index = 0x0000
(XEN) PLE Gap=00000080 Window=00001000
(XEN) Virtual processor ID = 0x004e VMfunc controls = 0000000000000000
(XEN) **************************************
(XEN) domain_crash called from vmx.c:2729
(XEN) Domain 1 (vcpu#0) crashed on cpu#21:
(XEN) ----[ Xen-4.6.0 x86_64 debug=y Tainted: C ]----
(XEN) CPU: 21
(XEN) RIP: 0000:[<0000000000000000>]
(XEN) RFLAGS: 0000000000000002 CONTEXT: hvm guest (d1v0)
(XEN) rax: 0000000000000000 rbx: 0000000000000000 rcx: 0000000000000000
(XEN) rdx: 00000000078bfbff rsi: 0000000000000000 rdi: 0000000000000000
(XEN) rbp: 0000000000000000 rsp: 0000000000000000 r8: 0000000000000000
(XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
(XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
(XEN) r15: 0000000000000000 cr0: 0000000000000010 cr4: 0000000000000000
(XEN) cr3: 00000000800ed000 cr2: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: 0000
..
>
> Anyway, the writing of the respective VMCS field to zero in the
> alternative worries me a little: Aren't we risking MFN zero to be
> wrongly accessed due to this?
>
> Furthermore, nvmx_update_apic_access_address() having a
> similar ASSERT() seems entirely wrong: The APIC access
> page doesn't really need to match up with any actual page
> belonging to the guest - a guest could choose to point this
> into no-where (note that we've been at least considering this
> option recently for our own purposes, in the context of
> http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg02191.html).
>
> > Instead I started getting an even more bizzare crash:
Ignore this part please.
.. snip..
> this doesn't match the call stack. Something's pretty fishy here.
Yes. The hypervisor was modified alongside me and I hadn't connected
the dots...
>
> Jan
next prev parent reply other threads:[~2016-01-15 21:40 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-01-12 3:38 Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6 Konrad Rzeszutek Wilk
2016-01-12 9:22 ` Jan Beulich
2016-01-15 21:39 ` Konrad Rzeszutek Wilk [this message]
2016-01-18 9:41 ` Jan Beulich
2016-02-02 22:05 ` Konrad Rzeszutek Wilk
2016-02-03 9:34 ` Jan Beulich
2016-02-03 15:07 ` Konrad Rzeszutek Wilk
2016-02-04 18:36 ` Konrad Rzeszutek Wilk
2016-02-05 10:33 ` Jan Beulich
2016-11-03 1:41 ` Konrad Rzeszutek Wilk
2016-11-03 14:36 ` Konrad Rzeszutek Wilk
2016-02-04 5:52 ` Tian, Kevin
2016-02-17 2:54 ` Tian, Kevin
2016-01-12 14:18 ` Alvin Starr
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160115213958.GA16118@char.us.oracle.com \
--to=konrad.wilk@oracle.com \
--cc=JBeulich@suse.com \
--cc=andrew.cooper3@citrix.com \
--cc=jun.nakajima@intel.com \
--cc=kevin.tian@intel.com \
--cc=wim.coekaerts@oracle.com \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.