* xm save + restore crashes Windows 2008 32-bit (4.0.2-rc2-pre)
@ 2011-01-25  4:20 James Harper
  2011-01-25  9:24 ` Tim Deegan
  0 siblings, 1 reply; 16+ messages in thread
From: James Harper @ 2011-01-25  4:20 UTC (permalink / raw)
  To: xen-devel

Under the latest xen-4.0-testing, xm save + xm restore of Windows 2008
causes a BSoD on restore. This is without any PV drivers loaded. I
haven't tested any other versions of Windows.

According to the debugger when I did have PV drivers loaded, the crash
happens on the return to userspace. I haven't looked at the debugger
with no PV drivers loaded.

The bug check is 0x7F (0xD, 0, 0, 0) which is an unspecified exception.

Is anyone else able to reproduce?

Thanks

James


* Re: xm save + restore crashes Windows 2008 32-bit (4.0.2-rc2-pre)
  2011-01-25  4:20 xm save + restore crashes Windows 2008 32-bit (4.0.2-rc2-pre) James Harper
@ 2011-01-25  9:24 ` Tim Deegan
  2011-01-25  9:28   ` James Harper
  0 siblings, 1 reply; 16+ messages in thread
From: Tim Deegan @ 2011-01-25  9:24 UTC (permalink / raw)
  To: James Harper; +Cc: xen-devel

At 04:20 +0000 on 25 Jan (1295929205), James Harper wrote:
> Under the latest xen-4.0-testing, xm save + xm restore of Windows 2008
> causes a BSoD on restore. This is without any PV drivers loaded. I
> haven't tested any other versions of windows.
> 
> According to the debugger when I did have PV drivers loaded, the crash
> happens on the return to userspace. I haven't looked at the debugger
> with no PV drivers loaded.
> 
> The bug check is 0x7F (0xD, 0, 0, 0) which is an unspecified exception.
> 
> Is anyone else able to reproduce?

I saw a failure like this last month but when I came to look at it again
it seemed to be working.   Are you testing on Intel or AMD hardware?

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)


* RE: xm save + restore crashes Windows 2008 32-bit (4.0.2-rc2-pre)
  2011-01-25  9:24 ` Tim Deegan
@ 2011-01-25  9:28   ` James Harper
  2011-01-25 10:39     ` Tim Deegan
  0 siblings, 1 reply; 16+ messages in thread
From: James Harper @ 2011-01-25  9:28 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel

> At 04:20 +0000 on 25 Jan (1295929205), James Harper wrote:
> > Under the latest xen-4.0-testing, xm save + xm restore of Windows 2008
> > causes a BSoD on restore. This is without any PV drivers loaded. I
> > haven't tested any other versions of windows.
> >
> > According to the debugger when I did have PV drivers loaded, the crash
> > happens on the return to userspace. I haven't looked at the debugger
> > with no PV drivers loaded.
> >
> > The bug check is 0x7F (0xD, 0, 0, 0) which is an unspecified exception.
> >
> > Is anyone else able to reproduce?
> 
> I saw a failure like this last month but when I came to look at it again
> it seemed to be working.   Are you testing on Intel or AMD hardware?
> 

AMD (cpuinfo below). I've just spent the last few hours updating my
Intel box to test on that too. Save/resume was working but was only
3.4.1 so isn't really a comparison.

Both are now same xen, tools, and kernel.

I should know if the problem exists on intel very soon. I'm guessing not
or I wouldn't be the first one posting about it... either that or there
is something else going on.

James

(cpuinfo from Dom0)

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 67
model name      : Dual-Core AMD Opteron(tm) Processor 1210
stepping        : 3
cpu MHz         : 1799.999
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu de tsc msr pae mce cx8 apic mtrr mca cmov pat
clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext
3dnow up rep_good extd_apicid pni cx16 hypervisor lahf_lm cmp_legacy
extapic cr8_legacy
bogomips        : 3599.99
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc


* Re: xm save + restore crashes Windows 2008 32-bit (4.0.2-rc2-pre)
  2011-01-25  9:28   ` James Harper
@ 2011-01-25 10:39     ` Tim Deegan
  2011-01-25 10:43       ` James Harper
  0 siblings, 1 reply; 16+ messages in thread
From: Tim Deegan @ 2011-01-25 10:39 UTC (permalink / raw)
  To: James Harper; +Cc: xen-devel

At 09:28 +0000 on 25 Jan (1295947707), James Harper wrote:
> > > The bug check is 0x7F (0xD, 0, 0, 0) which is an unspecified exception.

It's a GPF, not that that helps a lot. 
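
For reference, bug check 0x7F is UNEXPECTED_KERNEL_MODE_TRAP and its first
parameter is the x86 trap number, so 0xD decodes to a general protection
fault. A minimal lookup sketch follows; the table lists only a handful of
vectors and the function name is illustrative.

```python
# x86 exception vectors that can appear as the first parameter of
# Windows bug check 0x7F (UNEXPECTED_KERNEL_MODE_TRAP). Partial table.
X86_TRAPS = {
    0x00: "#DE divide error",
    0x06: "#UD invalid opcode",
    0x08: "#DF double fault",
    0x0C: "#SS stack-segment fault",
    0x0D: "#GP general protection fault",
    0x0E: "#PF page fault",
}


def describe_trap(param1):
    """Return a readable name for bug check 0x7F's first parameter."""
    return X86_TRAPS.get(param1, "vector 0x%02x (not in this table)" % param1)


print(describe_trap(0xD))
```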

> > > Is anyone else able to reproduce?
> > 
> > I saw a failure like this last month but when I came to look at it again
> > it seemed to be working.   Are you testing on Intel or AMD hardware?
> > 
> 
> AMD (cpuinfo below). I've just spent the last few hours updating my
> Intel box to test on that too. Save/resume was working but was only
> 3.4.1 so isn't really a comparison.
> 
> Both are now same xen, tools, and kernel.
> 
> I should know if the problem exists on intel very soon. I'm guessing not
> or I wouldn't be the first one posting about it... either that or there
> is something else going on.

It was on AMD that I saw it too.  If I read them correctly, Ian's
regression tests are passing HVM save/restore for at least some Windows
versions on AMD too, so it may be very specific.

Cheers,

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)


* RE: xm save + restore crashes Windows 2008 32-bit (4.0.2-rc2-pre)
  2011-01-25 10:39     ` Tim Deegan
@ 2011-01-25 10:43       ` James Harper
  2011-01-25 10:53         ` Tim Deegan
  0 siblings, 1 reply; 16+ messages in thread
From: James Harper @ 2011-01-25 10:43 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel

> > I should know if the problem exists on intel very soon. I'm guessing not
> > or I wouldn't be the first one posting about it... either that or there
> > is something else going on.
> 
> It was on AMD that I saw it too.  If I read them correctly, Ian's
> regression tests are passing HVM save/restore for at least some Windows
> versions on AMD too, so it may be very specific.
> 

Definitely AMD specific. Works fine on my Intel system.

My guess is that the save and/or restore of some critical part of the
domain's CPU state is missing, which causes an immediate crash on
return to user mode. Any suggestions as to where to start looking?

James


* Re: xm save + restore crashes Windows 2008 32-bit (4.0.2-rc2-pre)
  2011-01-25 10:43       ` James Harper
@ 2011-01-25 10:53         ` Tim Deegan
  2011-01-25 11:01           ` James Harper
                             ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Tim Deegan @ 2011-01-25 10:53 UTC (permalink / raw)
  To: James Harper; +Cc: xen-devel

At 10:43 +0000 on 25 Jan (1295952215), James Harper wrote:
> > > I should know if the problem exists on intel very soon. I'm guessing not
> > > or I wouldn't be the first one posting about it... either that or there
> > > is something else going on.
> > 
> > It was on AMD that I saw it too.  If I read them correctly, Ian's
> > regression tests are passing HVM save/restore for at least some Windows
> > versions on AMD too, so it may be very specific.
> > 
> 
> Definitely AMD specific. Works fine on my Intel system.
> 
> I'm guessing it's missing the save and/or restore of some critical part
> of the CPU state for the domain that causes an immediate crash when
> return to user mode. Any suggestions as to where to start looking?

I'm trying to set it up here as well but I'm away from the office and
getting the VGA console as far as my screen is proving tricky.

Can you try:
 - xl pause <domid>
 - xen-hvmctx <domid> >before
 - xl save <domid> save-file
 - xl restore -p save-file
 - xl list
 - xen-hvmctx <new-domid> >after
 - diff -u before after

There should be a few differences to do with timers and TSCs but there
might be some other smoking gun.  Of course it's possible that some
piece of state got added that didn't get into the save/restore code at
all.  It's also possible that some vital piece of memory isn't getting
saved properly but that's less likely to be AMD-specific. 
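
The comparison step can be partly automated. The sketch below is not part
of the Xen tools: it takes the text of the two xen-hvmctx dumps, compares
them with Python's difflib, and hides changed lines whose fields are
expected to move across a save/restore. The list of expected prefixes is
an assumption based on the description above (timers and TSCs) plus the
domain id header line, and the function name is illustrative.

```python
import difflib

# Assumption: changed lines that start with one of these strings (after
# stripping leading whitespace) are expected to differ after a restore.
# "isr " is meant to catch the HPET "isr 0 counter ..." line.
EXPECTED_PREFIXES = ("HVM save record for domain", "tsc ", "RTC:", "isr ")


def unexpected_changes(before_text, after_text):
    """Return changed lines ('- ...' or '+ ...') not on the expected list."""
    suspicious = []
    for line in difflib.ndiff(before_text.splitlines(), after_text.splitlines()):
        # ndiff marks removed lines with "- " and added lines with "+ ";
        # unchanged lines ("  ") and hint lines ("? ") are skipped.
        if line[:2] not in ("- ", "+ "):
            continue
        content = line[2:].strip()
        if not content.startswith(EXPECTED_PREFIXES):
            suspicious.append(line[0] + " " + content)
    return suspicious
```

Calling `unexpected_changes(open("before").read(), open("after").read())`
on the two dumps would then list only the lines that deserve a closer look.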

Cheers,

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)


* RE: xm save + restore crashes Windows 2008 32-bit (4.0.2-rc2-pre)
  2011-01-25 10:53         ` Tim Deegan
@ 2011-01-25 11:01           ` James Harper
  2011-01-25 11:12             ` Tim Deegan
  2011-01-25 11:24           ` James Harper
  2011-01-25 11:37           ` James Harper
  2 siblings, 1 reply; 16+ messages in thread
From: James Harper @ 2011-01-25 11:01 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel

> 
> At 10:43 +0000 on 25 Jan (1295952215), James Harper wrote:
> > > > I should know if the problem exists on intel very soon. I'm guessing not
> > > > or I wouldn't be the first one posting about it... either that or there
> > > > is something else going on.
> > >
> > > It was on AMD that I saw it too.  If I read them correctly, Ian's
> > > regression tests are passing HVM save/restore for at least some Windows
> > > versions on AMD too, so it may be very specific.
> > >
> >
> > Definitely AMD specific. Works fine on my Intel system.
> >
> > I'm guessing it's missing the save and/or restore of some critical part
> > of the CPU state for the domain that causes an immediate crash when
> > return to user mode. Any suggestions as to where to start looking?
> 
> I'm trying to set it up here as well but I'm away from the office and
> getting the VGA console as far as my screen is proving tricky.
> 
> Can you try:
>  - xl pause <domid>
>  - xen-hvmctx <domid> >before
>  - xl save <domid> save-file
>  - xl restore -p save-file
>  - xl list
>  - xen-hvmctx <new-domid> >after
>  - diff -u before after
> 
> There should be a few differences to do with timers and TSCs but there
> might be some other smoking gun.  Of course it's possible that some
> piece of state got added that didn't get into the save/restore code at
> all.  It's also possible that some vital piece of memory isn't getting
> saved properly but that's less likely to be AMD-specific.
> 

I'll give it a go. I tried using some of the xl tools the other day but
it wasn't working. xl save seemed to go okay but xl restore just gave
errors. That might be a clue in itself though... is xl considered stable
for 4.0.2?

James


* Re: xm save + restore crashes Windows 2008 32-bit (4.0.2-rc2-pre)
  2011-01-25 11:01           ` James Harper
@ 2011-01-25 11:12             ` Tim Deegan
  0 siblings, 0 replies; 16+ messages in thread
From: Tim Deegan @ 2011-01-25 11:12 UTC (permalink / raw)
  To: James Harper; +Cc: xen-devel

At 11:01 +0000 on 25 Jan (1295953315), James Harper wrote:
> > 
> > At 10:43 +0000 on 25 Jan (1295952215), James Harper wrote:
> > > > > I should know if the problem exists on intel very soon. I'm guessing not
> > > > > or I wouldn't be the first one posting about it... either that or there
> > > > > is something else going on.
> > > >
> > > > It was on AMD that I saw it too.  If I read them correctly, Ian's
> > > > regression tests are passing HVM save/restore for at least some Windows
> > > > versions on AMD too, so it may be very specific.
> > > >
> > >
> > > Definitely AMD specific. Works fine on my Intel system.
> > >
> > > I'm guessing it's missing the save and/or restore of some critical part
> > > of the CPU state for the domain that causes an immediate crash when
> > > return to user mode. Any suggestions as to where to start looking?
> > 
> > I'm trying to set it up here as well but I'm away from the office and
> > getting the VGA console as far as my screen is proving tricky.
> > 
> > Can you try:
> >  - xl pause <domid>
> >  - xen-hvmctx <domid> >before
> >  - xl save <domid> save-file
> >  - xl restore -p save-file
> >  - xl list
> >  - xen-hvmctx <new-domid> >after
> >  - diff -u before after
> > 
> > There should be a few differences to do with timers and TSCs but there
> > might be some other smoking gun.  Of course it's possible that some
> > piece of state got added that didn't get into the save/restore code at
> > all.  It's also possible that some vital piece of memory isn't getting
> > saved properly but that's less likely to be AMD-specific.
> > 
> 
> I'll give it a go. I tried using some of the xl tools the other day but
> it wasn't working. xl save seemed to go okay but xl restore just gave
> errors. That might be a clue in itself though... is xl considered stable
> for 4.0.2?

For 4.0.x, I think you can substitute 'xm' for 'xl' throughout. :)

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)


* RE: xm save + restore crashes Windows 2008 32-bit (4.0.2-rc2-pre)
  2011-01-25 10:53         ` Tim Deegan
  2011-01-25 11:01           ` James Harper
@ 2011-01-25 11:24           ` James Harper
  2011-01-25 11:37           ` James Harper
  2 siblings, 0 replies; 16+ messages in thread
From: James Harper @ 2011-01-25 11:24 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel

> Can you try:
>  - xl pause <domid>
>  - xen-hvmctx <domid> >before
>  - xl save <domid> save-file
>  - xl restore -p save-file
>  - xl list
>  - xen-hvmctx <new-domid> >after
>  - diff -u before after
> 
> There should be a few differences to do with timers and TSCs but there
> might be some other smoking gun.  Of course it's possible that some
> piece of state got added that didn't get into the save/restore code at
> all.  It's also possible that some vital piece of memory isn't getting
> saved properly but that's less likely to be AMD-specific.
> 

xl just isn't working for me. xl create doesn't seem to work with drbd,
and even when I use the /dev/ path of the disk image and make it primary
myself, xl create gives me a domain that won't boot (bug check 0x5C
HAL_INITIALIZATION_FAILED). And even when I start the domain with xm, xl
save works but xl restore won't work. First you need to specify the
config file or it complains about a missing userdata-d-<guid?>.xl file
in /var/lib/xen, and even when you specify the config file it says:

Failed allocation for dom 50: 1024 extents of order 0
ERROR Internal error: Failed to allocate memory for batch.!

xm save won't work if the domU is paused. I'll see if that's just an
artificial limitation I can remove...

James


* RE: xm save + restore crashes Windows 2008 32-bit (4.0.2-rc2-pre)
  2011-01-25 10:53         ` Tim Deegan
  2011-01-25 11:01           ` James Harper
  2011-01-25 11:24           ` James Harper
@ 2011-01-25 11:37           ` James Harper
  2011-01-25 11:52             ` xm save + restore crashes Windows 2008 32-bit(4.0.2-rc2-pre) James Harper
  2 siblings, 1 reply; 16+ messages in thread
From: James Harper @ 2011-01-25 11:37 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel

> I'm trying to set it up here as well but I'm away from the office and
> getting the VGA console as far as my screen is proving tricky.
> 
> Can you try:
>  - xl pause <domid>
>  - xen-hvmctx <domid> >before
>  - xl save <domid> save-file
>  - xl restore -p save-file
>  - xl list
>  - xen-hvmctx <new-domid> >after
>  - diff -u before after
> 
> There should be a few differences to do with timers and TSCs but there
> might be some other smoking gun.  Of course it's possible that some
> piece of state got added that didn't get into the save/restore code at
> all.  It's also possible that some vital piece of memory isn't getting
> saved properly but that's less likely to be AMD-specific.
> 

I was able to remove the 'is domain running?' check from xend and
complete your request using xm.

# diff -u before after
--- before      2011-01-25 22:27:51.064451527 +1100
+++ after       2011-01-25 22:33:25.724619490 +1100
@@ -1,4 +1,4 @@
-HVM save record for domain 53
+HVM save record for domain 54
 Entry 0: type 1 instance 0, length 24
      Header: magic 0x54381286, version 1
              Xen changeset 0
@@ -22,11 +22,11 @@
              cs 0x0000001b (0x0000000000000000 + 0xffffffff / 0x00cfb)
              ds 0x00000023 (0x0000000000000000 + 0xffffffff / 0x00cf3)
              es 0x00000023 (0x0000000000000000 + 0xffffffff / 0x00cf3)
-             fs 0x0000003b (0x000000007ffdc000 + 0x00000fff / 0x004f3)
-             gs 0x00000000 (0x0000000000000000 + 0xffffffff / 0x00000)
+             fs 0x00000000 (0x00007f18bcbc6700 + 0xffffffff / 0x00000)
+             gs 0x00000000 (0xffff880028038000 + 0xffffffff / 0x00000)
              ss 0x00000023 (0x0000000000000000 + 0xffffffff / 0x00cf3)
-             tr 0x00000028 (0x0000000080157000 + 0x000020ab / 0x0008b)
-           ldtr 0x00000000 (0x0000000000000000 + 0x00000000 / 0x00000)
+             tr 0x0000e040 (0xffff82c480263a80 + 0x00000067 / 0x0008b)
+           ldtr 0x00000000 (0x0000000000000000 + 0x0000ffff / 0x00000)
            itdr            (0x0000000081fff400 + 0x000007ff)
            gdtr            (0x0000000081fff000 + 0x000003ff)
     sysenter cs 0x00000000  eip 0x0000000000000000  esp 0x0000000000000000
@@ -34,7 +34,7 @@
       MSR flags 0xffffffffffffffff  lstar 0x0000000000000000
            star 0x0000000000000000  cstar 0x0000000000000000
          sfmask 0x0000000000000000   efer 0x0000000000000800
-            tsc 0x00000018cad69045
+            tsc 0x0000008fe39fad26
           event 0x00000000 error 0x00000000
     FPU:    fcw 0x037f fsw 0x0000
             ftw 0x00 (0x00) fop 0x0000
@@ -71,7 +71,7 @@
                (0x00000000000000000000000000000000)
                (0x00000000000000000000000000000000)
 Entry 2: type 3 instance 0, length 8
-    PIC: IRQ base 0x30, irr 0, imr 0xff, isr 0
+    PIC: IRQ base 0x30, irr 0x2, imr 0xff, isr 0
          init_state 0, priority_add 0, readsel_isr 0, poll 0
          auto_eoi 0, rotate_on_auto_eoi 0
          special_fully_nested_mode 0, special_mask_mode 0
@@ -153,8 +153,8 @@
           0x01c0: 0x0000000000000004   0x01d0: 0x0000000000000000
           0x01e0: 0x0000000000000000   0x01f0: 0x0000000000000000
           0x0200: 0x0000000000000000   0x0210: 0x0000000000000000
-          0x0220: 0x0000000000000000   0x0230: 0x0000000000040000
-          0x0240: 0x0000000000000004   0x0250: 0x0000000000000000
+          0x0220: 0x0000000000000000   0x0230: 0x0000000000060000
+          0x0240: 0x0000000000000006   0x0250: 0x0000000000000000
           0x0260: 0x0000000000000000   0x0270: 0x0000000000000000
           0x0280: 0x0000000000000000   0x0290: 0x0000000000000000
           0x02a0: 0x0000000000000000   0x02b0: 0x0000000000000000
@@ -171,7 +171,7 @@
 Entry 7: type 7 instance 0, length 16
     PCI IRQs: 0x00000000000100800000000000000000
 Entry 8: type 8 instance 0, length 8
-    ISA IRQs: 0x0001
+    ISA IRQs: 0x0003
 Entry 9: type 9 instance 0, length 8
     PCI LINK: 0 0 0 0
 Entry 10: type 10 instance 0, length 56
@@ -185,11 +185,11 @@
                rd_state 0, wr_state 0, wr_latch 0, rw_mode 0
                mode 0xff, bcd 0, gate 0x1
 Entry 11: type 11 instance 0, length 16
-    RTC: regs 0x48 0x00 0x27 0x00 0x22 0x00 0x02 0x25
+    RTC: regs 0x23 0x00 0x33 0x00 0x22 0x00 0x02 0x25
               0x01 0x11 0x2a 0x42 0x00 0x80, index 0x0c
 Entry 12: type 12 instance 0, length 1048
     HPET: capability 0xf424008086a201 config 0
-          isr 0 counter 0x1308081db
+          isr 0 counter 0x55323282f
           timer0 config 0xf0000000000030 cmp 0
           timer0 period 0 fsb 0
           timer1 config 0xf0000000000030 cmp 0
@@ -200,8 +200,8 @@
     ACPI PM: TMR_VAL 0xd9f446d, PM1a_STS 0x0, PM1a_EN 0x321
 Entry 14: type 14 instance 0, length 240
     MTRR: PAT 0x7010600070106, cap 0x508, default 0xc06
-          var 0 0x00000000f0000000 0x000000fff8000800
-          var 1 0x00000000f8000000 0x000000fffc000800
+          var 0 0x00000000f0000000 0x0000000000000000
+          var 1 0x00000000f8000000 0x0000000000000000
           var 2 0x0000000000000000 0x0000000000000000
           var 3 0x0000000000000000 0x0000000000000000
           var 4 0x0000000000000000 0x0000000000000000

I don't understand much of what I'm looking at, but those CPU
registers shouldn't be different, should they? It almost looks like the
CPU was allowed to run for a few cycles at some point even though it was
paused.
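
One way to read the segment lines in the diff above: a 32-bit guest cannot
hold a segment base wider than 32 bits, so the fs, gs and tr bases in the
"after" dump (for example 0xffff880028038000) cannot be values the guest
loaded. Addresses of that shape look more like 64-bit host state, though
that is an inference from the numbers and not something established at
this point in the thread. The sketch below parses segment lines in the
format shown above and flags such bases. The regular expression and the
function name are illustrative, written against this output and not taken
from xen-hvmctx itself.

```python
import re

# Matches e.g. "fs 0x00000000 (0x00007f18bcbc6700 + 0xffffffff / 0x00000)"
# Groups: register name, selector, base, limit, attributes.
SEG_RE = re.compile(
    r"^\s*(\w+)\s+0x([0-9a-f]+)\s+"
    r"\(0x([0-9a-f]+)\s+\+\s+0x([0-9a-f]+)\s+/\s+0x([0-9a-f]+)\)",
    re.IGNORECASE,
)


def wide_bases(dump_text):
    """Return {register: base} for segment bases that do not fit in 32 bits."""
    flagged = {}
    for line in dump_text.splitlines():
        match = SEG_RE.match(line)
        if match:
            name = match.group(1)
            base = int(match.group(3), 16)
            if base > 0xFFFFFFFF:
                flagged[name] = base
    return flagged
```

Run against the plain "after" dump (not the diff, whose lines carry a
leading "+"), this would flag fs, gs and tr and leave ldtr alone.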

I'll try the same on the intel box and see what happens.

Thanks

James


* RE: xm save + restore crashes Windows 2008 32-bit(4.0.2-rc2-pre)
  2011-01-25 11:37           ` James Harper
@ 2011-01-25 11:52             ` James Harper
  2011-01-25 13:35               ` xm save + restore crashes Windows 200832-bit(4.0.2-rc2-pre) (AMD only) James Harper
  0 siblings, 1 reply; 16+ messages in thread
From: James Harper @ 2011-01-25 11:52 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel

On Intel we get a much more believable result:

# diff -u before after
--- before      2011-01-25 22:40:32.861270619 +1100
+++ after       2011-01-25 22:43:31.665271154 +1100
@@ -1,4 +1,4 @@
-HVM save record for domain 24
+HVM save record for domain 25
 Entry 0: type 1 instance 0, length 24
      Header: magic 0x54381286, version 1
              Xen changeset 0
@@ -34,7 +34,7 @@
       MSR flags 0x0000000000000000  lstar 0x0000000000000000
            star 0x0000000000000000  cstar 0x0000000000000000
          sfmask 0x0000000000000000   efer 0x0000000000000800
-            tsc 0x0000002a2056c07e
+            tsc 0x0000007e31c3b3f3
           event 0x00000000 error 0x00000000
     FPU:    fcw 0x027f fsw 0x0000
             ftw 0x00 (0x00) fop 0x0000
@@ -185,11 +185,11 @@
                rd_state 0, wr_state 0, wr_latch 0, rw_mode 0
                mode 0xff, bcd 0, gate 0x1
 Entry 11: type 11 instance 0, length 16
-    RTC: regs 0x32 0x00 0x40 0x00 0x22 0x00 0x02 0x25
+    RTC: regs 0x29 0x00 0x43 0x00 0x22 0x00 0x02 0x25
               0x01 0x11 0x2a 0x42 0x00 0x80, index 0x0c
 Entry 12: type 12 instance 0, length 1048
     HPET: capability 0xf424008086a201 config 0
-          isr 0 counter 0x19b07289b
+          isr 0 counter 0x394b04f80
           timer0 config 0xf0000000000030 cmp 0
           timer0 period 0 fsb 0
           timer1 config 0xf0000000000030 cmp 0

Just a few counters changed.

I retried under AMD and got the same result as last time so it's
definitely broken.

Are there any tools to analyse the save file? If I can see what numbers
are in there, I should be able to tell whether it's the save or the
restore that's broken...

Thanks

James



* RE: xm save + restore crashes Windows 200832-bit(4.0.2-rc2-pre) (AMD only)
  2011-01-25 11:52             ` xm save + restore crashes Windows 2008 32-bit(4.0.2-rc2-pre) James Harper
@ 2011-01-25 13:35               ` James Harper
  2011-01-25 14:37                 ` Tim Deegan
  0 siblings, 1 reply; 16+ messages in thread
From: James Harper @ 2011-01-25 13:35 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel

I put some printfs around the restore of registers in
hvm_load_cpu_ctxt: one before, announcing what the register was about to
be set to, then the set as normal, then a read of the register to print
what it contains (which should be what it was set to). The values don't
match for fs, gs, tr, and ldtr. The values written do match what
xen-hvmctx tells me before the save is done, so the save is working,
just not the restore.
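
The check described above is a write-then-read-back comparison. A sketch of
the pattern follows, in Python and not in the hypervisor's C. `FakeVcpu`,
its methods, and the simulated fault are all hypothetical stand-ins for
illustration and are not Xen interfaces.

```python
class FakeVcpu:
    """Hypothetical register file; writes to names in `broken` are lost."""

    def __init__(self, broken=()):
        self._regs = {}
        self._broken = set(broken)

    def set_segment(self, name, value):
        # Simulate a faulty restore path by dropping some writes.
        if name not in self._broken:
            self._regs[name] = value

    def get_segment(self, name):
        return self._regs.get(name, 0)


def restore_and_verify(vcpu, saved):
    """Write each saved value, read it back, and return the names that differ."""
    mismatched = []
    for name, value in saved.items():
        print("setting %s to 0x%x" % (name, value))
        vcpu.set_segment(name, value)
        readback = vcpu.get_segment(name)
        print("%s now reads 0x%x" % (name, readback))
        if readback != value:
            mismatched.append(name)
    return mismatched
```

The value of the pattern is that it separates the two halves of the
problem: if the value announced before the write matches the pre-save
dump, the save side is cleared and only the restore path remains suspect.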

So the problem is somewhere past hvm_set_segment_register and, because
it's AMD-only, probably in or beyond svm_set_segment_register. The first
thing I notice in that routine is that there is a case for those 4
registers, although all it seems to do is svm_sync_vmcb before and
svm_vmload after setting them. I don't know what those two do, though.

I'll investigate further tomorrow, assuming nobody fixes it while I'm
asleep :)

James


> -----Original Message-----
> From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-
> bounces@lists.xensource.com] On Behalf Of James Harper
> Sent: Tuesday, 25 January 2011 22:52
> To: Tim Deegan
> Cc: xen-devel@lists.xensource.com
> Subject: RE: [Xen-devel] xm save + restore crashes Windows
200832-bit(4.0.2-
> rc2-pre)
> 
> On intel we get a much more believable result:
> 
> # diff -u before after
> --- before      2011-01-25 22:40:32.861270619 +1100
> +++ after       2011-01-25 22:43:31.665271154 +1100
> @@ -1,4 +1,4 @@
> -HVM save record for domain 24
> +HVM save record for domain 25
>  Entry 0: type 1 instance 0, length 24
>       Header: magic 0x54381286, version 1
>               Xen changeset 0
> @@ -34,7 +34,7 @@
>        MSR flags 0x0000000000000000  lstar 0x0000000000000000
>             star 0x0000000000000000  cstar 0x0000000000000000
>           sfmask 0x0000000000000000   efer 0x0000000000000800
> -            tsc 0x0000002a2056c07e
> +            tsc 0x0000007e31c3b3f3
>            event 0x00000000 error 0x00000000
>      FPU:    fcw 0x027f fsw 0x0000
>              ftw 0x00 (0x00) fop 0x0000
> @@ -185,11 +185,11 @@
>                 rd_state 0, wr_state 0, wr_latch 0, rw_mode 0
>                 mode 0xff, bcd 0, gate 0x1
>  Entry 11: type 11 instance 0, length 16
> -    RTC: regs 0x32 0x00 0x40 0x00 0x22 0x00 0x02 0x25
> +    RTC: regs 0x29 0x00 0x43 0x00 0x22 0x00 0x02 0x25
>                0x01 0x11 0x2a 0x42 0x00 0x80, index 0x0c
>  Entry 12: type 12 instance 0, length 1048
>      HPET: capability 0xf424008086a201 config 0
> -          isr 0 counter 0x19b07289b
> +          isr 0 counter 0x394b04f80
>            timer0 config 0xf0000000000030 cmp 0
>            timer0 period 0 fsb 0
>            timer1 config 0xf0000000000030 cmp 0
> 
> just a few counters changed.
> 
> I retried under AMD and got the same result as last time so it's
> definitely broken.
> 
> Are there any tools to analyse the save file? If I can see what numbers
> are in there I should be able to tell if it's the save or the restore
> that's broken...
> 
> Thanks
> 
> James
> 
> > -----Original Message-----
> > From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-
> > bounces@lists.xensource.com] On Behalf Of James Harper
> > Sent: Tuesday, 25 January 2011 22:38
> > To: Tim Deegan
> > Cc: xen-devel@lists.xensource.com
> > Subject: RE: [Xen-devel] xm save + restore crashes Windows 2008 32-bit (4.0.2-rc2-pre)
> >
> > > I'm trying to set it up here as well but I'm away from the office and
> > > getting the VGA console as far as my screen is proving tricky.
> > >
> > > Can you try:
> > >  - xl pause <domid>
> > >  - xen-hvmctx <domid> >before
> > >  - xl save <domid> save-file
> > >  - xl restore -p save-file
> > >  - xl list
> > >  - xen-hvmctx <new-domid> >after
> > >  - diff -u before after
> > >
> > > There should be a few differences to do with timers and TSCs but there
> > > might be some other smoking gun.  Of course it's possible that some
> > > piece of state got added that didn't get into the save/restore code at
> > > all.  It's also possible that some vital piece of memory isn't getting
> > > saved properly but that's less likely to be AMD-specific.
> > >
> >
> > I was able to remove the 'is domain running?' check from xend and
> > complete your request using xm.
> >
> > # diff -u before after
> > --- before      2011-01-25 22:27:51.064451527 +1100
> > +++ after       2011-01-25 22:33:25.724619490 +1100
> > @@ -1,4 +1,4 @@
> > -HVM save record for domain 53
> > +HVM save record for domain 54
> >  Entry 0: type 1 instance 0, length 24
> >       Header: magic 0x54381286, version 1
> >               Xen changeset 0
> > @@ -22,11 +22,11 @@
> >               cs 0x0000001b (0x0000000000000000 + 0xffffffff / 0x00cfb)
> >               ds 0x00000023 (0x0000000000000000 + 0xffffffff / 0x00cf3)
> >               es 0x00000023 (0x0000000000000000 + 0xffffffff / 0x00cf3)
> > -             fs 0x0000003b (0x000000007ffdc000 + 0x00000fff / 0x004f3)
> > -             gs 0x00000000 (0x0000000000000000 + 0xffffffff / 0x00000)
> > +             fs 0x00000000 (0x00007f18bcbc6700 + 0xffffffff / 0x00000)
> > +             gs 0x00000000 (0xffff880028038000 + 0xffffffff / 0x00000)
> >               ss 0x00000023 (0x0000000000000000 + 0xffffffff / 0x00cf3)
> > -             tr 0x00000028 (0x0000000080157000 + 0x000020ab / 0x0008b)
> > -           ldtr 0x00000000 (0x0000000000000000 + 0x00000000 / 0x00000)
> > +             tr 0x0000e040 (0xffff82c480263a80 + 0x00000067 / 0x0008b)
> > +           ldtr 0x00000000 (0x0000000000000000 + 0x0000ffff / 0x00000)
> >             itdr            (0x0000000081fff400 + 0x000007ff)
> >             gdtr            (0x0000000081fff000 + 0x000003ff)
> >      sysenter cs 0x00000000  eip 0x0000000000000000  esp 0x0000000000000000
> > @@ -34,7 +34,7 @@
> >        MSR flags 0xffffffffffffffff  lstar 0x0000000000000000
> >             star 0x0000000000000000  cstar 0x0000000000000000
> >           sfmask 0x0000000000000000   efer 0x0000000000000800
> > -            tsc 0x00000018cad69045
> > +            tsc 0x0000008fe39fad26
> >            event 0x00000000 error 0x00000000
> >      FPU:    fcw 0x037f fsw 0x0000
> >              ftw 0x00 (0x00) fop 0x0000
> > @@ -71,7 +71,7 @@
> >                 (0x00000000000000000000000000000000)
> >                 (0x00000000000000000000000000000000)
> >  Entry 2: type 3 instance 0, length 8
> > -    PIC: IRQ base 0x30, irr 0, imr 0xff, isr 0
> > +    PIC: IRQ base 0x30, irr 0x2, imr 0xff, isr 0
> >           init_state 0, priority_add 0, readsel_isr 0, poll 0
> >           auto_eoi 0, rotate_on_auto_eoi 0
> >           special_fully_nested_mode 0, special_mask_mode 0
> > @@ -153,8 +153,8 @@
> >            0x01c0: 0x0000000000000004   0x01d0: 0x0000000000000000
> >            0x01e0: 0x0000000000000000   0x01f0: 0x0000000000000000
> >            0x0200: 0x0000000000000000   0x0210: 0x0000000000000000
> > -          0x0220: 0x0000000000000000   0x0230: 0x0000000000040000
> > -          0x0240: 0x0000000000000004   0x0250: 0x0000000000000000
> > +          0x0220: 0x0000000000000000   0x0230: 0x0000000000060000
> > +          0x0240: 0x0000000000000006   0x0250: 0x0000000000000000
> >            0x0260: 0x0000000000000000   0x0270: 0x0000000000000000
> >            0x0280: 0x0000000000000000   0x0290: 0x0000000000000000
> >            0x02a0: 0x0000000000000000   0x02b0: 0x0000000000000000
> > @@ -171,7 +171,7 @@
> >  Entry 7: type 7 instance 0, length 16
> >      PCI IRQs: 0x00000000000100800000000000000000
> >  Entry 8: type 8 instance 0, length 8
> > -    ISA IRQs: 0x0001
> > +    ISA IRQs: 0x0003
> >  Entry 9: type 9 instance 0, length 8
> >      PCI LINK: 0 0 0 0
> >  Entry 10: type 10 instance 0, length 56
> > @@ -185,11 +185,11 @@
> >                 rd_state 0, wr_state 0, wr_latch 0, rw_mode 0
> >                 mode 0xff, bcd 0, gate 0x1
> >  Entry 11: type 11 instance 0, length 16
> > -    RTC: regs 0x48 0x00 0x27 0x00 0x22 0x00 0x02 0x25
> > +    RTC: regs 0x23 0x00 0x33 0x00 0x22 0x00 0x02 0x25
> >                0x01 0x11 0x2a 0x42 0x00 0x80, index 0x0c
> >  Entry 12: type 12 instance 0, length 1048
> >      HPET: capability 0xf424008086a201 config 0
> > -          isr 0 counter 0x1308081db
> > +          isr 0 counter 0x55323282f
> >            timer0 config 0xf0000000000030 cmp 0
> >            timer0 period 0 fsb 0
> >            timer1 config 0xf0000000000030 cmp 0
> > @@ -200,8 +200,8 @@
> >      ACPI PM: TMR_VAL 0xd9f446d, PM1a_STS 0x0, PM1a_EN 0x321
> >  Entry 14: type 14 instance 0, length 240
> >      MTRR: PAT 0x7010600070106, cap 0x508, default 0xc06
> > -          var 0 0x00000000f0000000 0x000000fff8000800
> > -          var 1 0x00000000f8000000 0x000000fffc000800
> > +          var 0 0x00000000f0000000 0x0000000000000000
> > +          var 1 0x00000000f8000000 0x0000000000000000
> >            var 2 0x0000000000000000 0x0000000000000000
> >            var 3 0x0000000000000000 0x0000000000000000
> >            var 4 0x0000000000000000 0x0000000000000000
> >
> > I don't really know a lot of what I'm looking at, but those cpu
> > registers shouldn't be different should they??? It almost looks like the
> > cpu was allowed to run for a few cycles at some point even though it was
> > paused.
> >
> > I'll try the same on the intel box and see what happens.
> >
> > Thanks
> >
> > James
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xensource.com
> > http://lists.xensource.com/xen-devel
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: xm save + restore crashes Windows 200832-bit(4.0.2-rc2-pre) (AMD only)
  2011-01-25 13:35               ` xm save + restore crashes Windows 200832-bit(4.0.2-rc2-pre) (AMD only) James Harper
@ 2011-01-25 14:37                 ` Tim Deegan
  2011-01-25 22:11                   ` James Harper
  0 siblings, 1 reply; 16+ messages in thread
From: Tim Deegan @ 2011-01-25 14:37 UTC (permalink / raw)
  To: James Harper; +Cc: xen-devel

At 13:35 +0000 on 25 Jan (1295962540), James Harper wrote:
> So the problem is somewhere past hvm_set_segment_register, and because
> it's amd only, probably in or beyond svm_set_segment_register. The first
> thing I notice in that routine is that there is a case for those 4
> registers... although all it seems to do is svm_sync_vmcb before and
> svm_vmload after setting. I don't know what those two do though.

Hmm; I suspect the bug here is actually in the save side -- the syncing
of the vmcb in the save routine is not conditional on v == current, and
the "already synced" bit that it would otherwise gate on isn't properly
initialized.

Try the attached patch; I'm sorry to say that I suspect it will fix the
odd output of xen-hvmctx but probably won't fix the BSOD. :(

Cheers,

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

[-- Attachment #2: vmcb-sync --]
[-- Type: text/plain, Size: 353 bytes --]

diff -r 9b453f96dd46 xen/arch/x86/hvm/svm/vmcb.c
--- a/xen/arch/x86/hvm/svm/vmcb.c	Fri Jan 21 16:03:04 2011 +0000
+++ b/xen/arch/x86/hvm/svm/vmcb.c	Tue Jan 25 14:36:32 2011 +0000
@@ -280,6 +280,7 @@ int svm_create_vmcb(struct vcpu *v)
     }
 
     arch_svm->vmcb_pa = virt_to_maddr(arch_svm->vmcb);
+    arch_svm->vmcb_in_sync = 1;
 
     return 0;
 }

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: xm save + restore crashes Windows 200832-bit(4.0.2-rc2-pre) (AMD only)
  2011-01-25 14:37                 ` Tim Deegan
@ 2011-01-25 22:11                   ` James Harper
  2011-01-25 22:21                     ` Tim Deegan
  0 siblings, 1 reply; 16+ messages in thread
From: James Harper @ 2011-01-25 22:11 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel

> 
> At 13:35 +0000 on 25 Jan (1295962540), James Harper wrote:
> > So the problem is somewhere past hvm_set_segment_register, and because
> > it's amd only, probably in or beyond svm_set_segment_register. The first
> > thing I notice in that routine is that there is a case for those 4
> > registers... although all it seems to do is svm_sync_vmcb before and
> > svm_vmload after setting. I don't know what those two do though.
> 
> Hmm; I suspect the bug here is actually in the save side -- the syncing
> of the vmcb in the save routine is not conditional on v == current, and
> the "already synced" bit that it would otherwise gate on isn't properly
> initialized.
> 
> Try the attached patch; I'm sorry to say that I suspect it will fix the
> odd output of xen-hvmctx but probably won't fix the BSOD. :(
> 

Just to clarify, in the restore path I print the values to be saved to
the segment registers, then I read the segment registers and print the
values that are in them. They aren't the same. Doesn't that sound like a
problem on the restore side?

James

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: xm save + restore crashes Windows 200832-bit(4.0.2-rc2-pre) (AMD only)
  2011-01-25 22:11                   ` James Harper
@ 2011-01-25 22:21                     ` Tim Deegan
  2011-01-25 22:25                       ` James Harper
  0 siblings, 1 reply; 16+ messages in thread
From: Tim Deegan @ 2011-01-25 22:21 UTC (permalink / raw)
  To: James Harper; +Cc: xen-devel

At 22:11 +0000 on 25 Jan (1295993487), James Harper wrote:
> > 
> > At 13:35 +0000 on 25 Jan (1295962540), James Harper wrote:
> > > So the problem is somewhere past hvm_set_segment_register, and because
> > > it's amd only, probably in or beyond svm_set_segment_register. The first
> > > thing I notice in that routine is that there is a case for those 4
> > > registers... although all it seems to do is svm_sync_vmcb before and
> > > svm_vmload after setting. I don't know what those two do though.
> > 
> > Hmm; I suspect the bug here is actually in the save side -- the syncing
> > of the vmcb in the save routine is not conditional on v == current, and
> > the "already synced" bit that it would otherwise gate on isn't properly
> > initialized.
> > 
> > Try the attached patch; I'm sorry to say that I suspect it will fix the
> > odd output of xen-hvmctx but probably won't fix the BSOD. :(
> > 
> 
> Just to clarify, in the restore path I print the values to be saved to
> the segment registers, then I read the segment registers and print the
> values that are in them. They aren't the same. Doesn't that sound like a
> problem on the restore side?

That would depend on how you read the values after the restore - the
patch is for a bug that I think is causing svm_get_segment_register() to
corrupt the vmcb if it's called before the vcpu is first scheduled (and
to return the corrupted values).

Cheers,

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: xm save + restore crashes Windows 200832-bit(4.0.2-rc2-pre) (AMD only)
  2011-01-25 22:21                     ` Tim Deegan
@ 2011-01-25 22:25                       ` James Harper
  0 siblings, 0 replies; 16+ messages in thread
From: James Harper @ 2011-01-25 22:25 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel

> >
> > Just to clarify, in the restore path I print the values to be saved to
> > the segment registers, then I read the segment registers and print the
> > values that are in them. They aren't the same. Doesn't that sound like a
> > problem on the restore side?
> 
> That would depend on how you read the values after the restore - the
> patch is for a bug that I think is causing svm_get_segment_register() to
> corrupt the vmcb if it's called before the vcpu is first scheduled (and
> to return the corrupted values).
> 

I see. I just tested, and while I still get the crash, all the segment
registers are now correct after applying your patch.

The only thing I can see that's different now is the MTRRs.

James

--- before      2011-01-26 09:16:19.030666000 +1100
+++ after       2011-01-26 09:21:13.374664075 +1100
@@ -1,4 +1,4 @@
-HVM save record for domain 4
+HVM save record for domain 6
 Entry 0: type 1 instance 0, length 24
      Header: magic 0x54381286, version 1
              Xen changeset 0
@@ -34,7 +34,7 @@
       MSR flags 0xffffffffffffffff  lstar 0x0000000000000000
            star 0x0000000000000000  cstar 0x0000000000000000
          sfmask 0x0000000000000000   efer 0x0000000000000800
-            tsc 0x000000172cbec19e
+            tsc 0x0000005866dd3e1f
           event 0x00000000 error 0x00000000
     FPU:    fcw 0x027f fsw 0x0000
             ftw 0x00 (0x00) fop 0x0000
@@ -185,11 +185,11 @@
                rd_state 0, wr_state 0, wr_latch 0, rw_mode 0
                mode 0xff, bcd 0, gate 0x1
 Entry 11: type 11 instance 0, length 16
-    RTC: regs 0x16 0x00 0x16 0x00 0x09 0x00 0x03 0x26
+    RTC: regs 0x12 0x00 0x21 0x00 0x09 0x00 0x03 0x26
               0x01 0x11 0x2a 0x42 0x00 0x80, index 0x0c
 Entry 12: type 12 instance 0, length 1048
     HPET: capability 0xf424008086a201 config 0
-          isr 0 counter 0x1f65b81fc
+          isr 0 counter 0x43a264ae0
           timer0 config 0xf0000000000030 cmp 0
           timer0 period 0 fsb 0
           timer1 config 0xf0000000000030 cmp 0
@@ -200,8 +200,8 @@
     ACPI PM: TMR_VAL 0x19b239a8, PM1a_STS 0x0, PM1a_EN 0x321
 Entry 14: type 14 instance 0, length 240
     MTRR: PAT 0x7010600070106, cap 0x508, default 0xc06
-          var 0 0x00000000f0000000 0x000000fff8000800
-          var 1 0x00000000f8000000 0x000000fffc000800
+          var 0 0x00000000f0000000 0x0000000000000000
+          var 1 0x00000000f8000000 0x0000000000000000
           var 2 0x0000000000000000 0x0000000000000000
           var 3 0x0000000000000000 0x0000000000000000
           var 4 0x0000000000000000 0x0000000000000000

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2011-01-25 22:25 UTC | newest]

Thread overview: 16+ messages
2011-01-25  4:20 xm save + restore crashes Windows 2008 32-bit (4.0.2-rc2-pre) James Harper
2011-01-25  9:24 ` Tim Deegan
2011-01-25  9:28   ` James Harper
2011-01-25 10:39     ` Tim Deegan
2011-01-25 10:43       ` James Harper
2011-01-25 10:53         ` Tim Deegan
2011-01-25 11:01           ` James Harper
2011-01-25 11:12             ` Tim Deegan
2011-01-25 11:24           ` James Harper
2011-01-25 11:37           ` James Harper
2011-01-25 11:52             ` xm save + restore crashes Windows 2008 32-bit(4.0.2-rc2-pre) James Harper
2011-01-25 13:35               ` xm save + restore crashes Windows 200832-bit(4.0.2-rc2-pre) (AMD only) James Harper
2011-01-25 14:37                 ` Tim Deegan
2011-01-25 22:11                   ` James Harper
2011-01-25 22:21                     ` Tim Deegan
2011-01-25 22:25                       ` James Harper
