From mboxrd@z Thu Jan 1 00:00:00 1970
From: Matthias
Subject: Re: Please revert / review 077fc1c04d70ef1748ac2daa6622b3320a1a004c
Date: Thu, 19 Jun 2014 03:07:12 +0200
References: <5399828D02000078000B63B7@mail.emea.novell.com> <539EC228020000780001A7FC@mail.emea.novell.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============3310722060230980161=="
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: "Zhang, Yang Z"
Cc: "tim@xen.org", Jan Beulich, "xen-devel@lists.xen.org"
List-Id: xen-devel@lists.xenproject.org

--===============3310722060230980161==
Content-Type: multipart/alternative; boundary=f46d04428cf466069604fc260072

--f46d04428cf466069604fc260072
Content-Type: text/plain; charset=UTF-8

I did some further testing with a much more recent commit (6b4d71d0, which
Jan suggested earlier, from 05-28-14) and with your patch. In the first run
everything seemed fine and the domU came up. In the second run, however, I
got this:

(XEN) svm.c:1439:d1v0 SVM violation gpa 0x000000f2088004, mfn 0xfe5f7, type 5
(XEN) domain_crash called from svm.c:1440
(XEN) Domain 1 (vcpu#0) crashed on cpu#5:
(XEN) ----[ Xen-4.5-unstable  x86_64  debug=y  Not tainted ]----
(XEN) CPU:    5
(XEN) RIP:    0010:[<fffff8000461e2a6>]
(XEN) RFLAGS: 0000000000000086   CONTEXT: hvm guest
(XEN) rax: ffffffffffd07000   rbx: ffffffffffd07000   rcx: 0000000000000046
(XEN) rdx: fffff6ffffffe838   rsi: 0000000000000000   rdi: 0000000000000000
(XEN) rbp: 0000000000008086   rsp: fffff80000b9cdc0   r8:  ffffffffffd07000
(XEN) r9:  0000000000000000   r10: ffffffffffd08000   r11: 0000000fffffffff
(XEN) r12: 0000000000000007   r13: 0000000000000000   r14: 0000000000000007
(XEN) r15: 0000000000000000   cr0: 0000000080050031   cr4: 00000000000006a0
(XEN) cr3: 0000000000187000   cr2: 0000000000000000
(XEN) ds: 002b   es: 002b   fs: 0053   gs: 002b   ss: 0000   cs: 0010

And the domU died.
I know this behaviour from the time when I simply reverted your 077 commit,
and I could trace it back to a series of commits by Jan:

2014-05-02 Jan Beulich    x86/NPT: don't walk entire page tables when globally...
2014-05-02 Jan Beulich    x86/NPT: don't walk page tables when changing types...
2014-05-02 Jan Beulich    x86/EPT: don't walk page tables when changing types...
2014-05-02 Jan Beulich    x86/EPT: don't walk entire page tables when globally...

which seem to introduce this behaviour. But since the first one mentions
something about log dirty, I assumed that this is just a cross dependency
on your log dirty change and would be resolved once my issue with your
commit is resolved. Since it happened again, I thought it was worth
mentioning. It also seems that this issue only occurs when I pass my USB
hosts to the domU in addition to the VGA. If I only pass through my VGA, it
works, but performance seems to be slower (judged only from the time
Windows needs for login / boot, no dedicated benchmarking).

Maybe it helps..

2014-06-19 0:21 GMT+02:00 Matthias:
> Yes, I'm only seeing the BSOD since 077fc1c04d. 0e251a837 is still fine
> and I can boot my win7 domU.
>
> My bisection process is pretty basic. I have a script which checks out the
> git staging tree, does a hard reset to the git commit I want to test,
> applies some custom patches (only changes in vif-nat and mkdeb to put some
> git build info into the package description so I can use dpkg -I to see
> what commit the package is on) and does a make world and make debball:
>
> git clone -b staging git://xenbits.xen.org/xen.git xen-unstable-staging
> cd xen-unstable-staging
> git reset --hard 077fc1c04d70ef1748ac2daa6622b3320a1a004c
> # add custom patches
> ./configure --disable-kernels --disable-stubdom --disable-docs
> make -j4 world
> make -j4 debball
>
> Then I save the created .deb into a folder for storage / later testing and
> install it if I want.
> And with that, I did the usual bisection: use a
> previous commit if something goes wrong and a later commit if everything
> works, until I arrived at your commit and wrote the mail..
>
> > Also, the original problem I am trying to fix only related to EPT and
> > VT-d page table sharing. So have you tried to not share them?
>
> Sorry, can you explain this a little more? I don't know how to influence
> VT-d page table sharing since I don't know much about the deeper mechanics
> of Xen.
>
> But I am very grateful for your help and therefore would like to help with
> the testing of your patches.
>
> For my last test I once again used your 077fc1c commit and applied both
> your first (printing out if log dirty mode is enabled) and second (the
> latest) patch and it actually worked: no BSOD and the domU came up fine and
> was usable. Also the logs seem fine and there were no VT-d page faults. I
> attached the qemu log and xl dmesg log nevertheless.
>
> Hope this helps!
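
Side note on the procedure quoted above: the manual search between a good
and a bad commit could also be driven by git bisect. This is only a rough
sketch, assuming it is run from inside the xen-unstable-staging clone. The
helper names (short_id, build_deb, start_bisect, mark) are made up for
illustration and are not part of any Xen tooling; the two commit ids are the
ones named in this thread.

```shell
#!/bin/sh
# Sketch only: automate the good/bad search described above with `git bisect`.
# GOOD and BAD are the two commits named in this thread; all function names
# below are hypothetical helpers, not existing tools.

GOOD=0e251a837
BAD=077fc1c04d70ef1748ac2daa6622b3320a1a004c

# Abbreviate a commit id to at most 10 characters, e.g. for naming the
# folder in which the built .deb is stored.
short_id() {
    printf '%s\n' "$1" | cut -c1-10
}

# Build the checked-out tree with the same steps as in the quoted mail.
# Custom patches would have to be applied before calling this.
build_deb() {
    ./configure --disable-kernels --disable-stubdom --disable-docs &&
    make -j4 world &&
    make -j4 debball
}

# Begin a bisection between the known-bad and known-good commits.
start_bisect() {
    git bisect start &&
    git bisect bad "$BAD" &&
    git bisect good "$GOOD"
}

# After installing the package and booting the domU by hand, record the
# result: "good" if the domU came up, "bad" on a BSOD or crash.
# git then checks out the next commit to test.
mark() {
    case "$1" in
        good|bad) git bisect "$1" ;;
        *) echo "usage: mark good|bad" >&2; return 1 ;;
    esac
}
```

Typical use would be start_bisect, then build_deb, install the package and
boot the domU, then mark good or mark bad, repeated until git names the
first bad commit. Custom patches probably need to be reverted (for example
with git reset --hard) before marking, since the checkout of the next
candidate may otherwise fail; git bisect reset returns to the original
branch afterwards.
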



--f46d04428cf466069604fc260072--

--===============3310722060230980161==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

--===============3310722060230980161==--