From mboxrd@z Thu Jan 1 00:00:00 1970
From: "James Harper"
Subject: RE: xm save + restore crashes Windows 2008 32-bit(4.0.2-rc2-pre)
Date: Tue, 25 Jan 2011 22:52:04 +1100
References: <20110125092440.GA13241@whitby.uk.xensource.com> <20110125103938.GB13241@whitby.uk.xensource.com> <20110125105313.GC13241@whitby.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Content-class: urn:content-classes:message
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Tim Deegan
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

On Intel we get a much more believable result:

# diff -u before after
--- before  2011-01-25 22:40:32.861270619 +1100
+++ after   2011-01-25 22:43:31.665271154 +1100
@@ -1,4 +1,4 @@
-HVM save record for domain 24
+HVM save record for domain 25
 Entry 0: type 1 instance 0, length 24
     Header: magic 0x54381286, version 1
     Xen changeset 0
@@ -34,7 +34,7 @@
     MSR flags 0x0000000000000000 lstar 0x0000000000000000
     star 0x0000000000000000 cstar 0x0000000000000000
     sfmask 0x0000000000000000 efer 0x0000000000000800
-    tsc 0x0000002a2056c07e
+    tsc 0x0000007e31c3b3f3
     event 0x00000000 error 0x00000000
     FPU: fcw 0x027f fsw 0x0000
     ftw 0x00 (0x00) fop 0x0000
@@ -185,11 +185,11 @@
     rd_state 0, wr_state 0, wr_latch 0, rw_mode 0
     mode 0xff, bcd 0, gate 0x1
 Entry 11: type 11 instance 0, length 16
-    RTC: regs 0x32 0x00 0x40 0x00 0x22 0x00 0x02 0x25
+    RTC: regs 0x29 0x00 0x43 0x00 0x22 0x00 0x02 0x25
     0x01 0x11 0x2a 0x42 0x00 0x80, index 0x0c
 Entry 12: type 12 instance 0, length 1048
     HPET: capability 0xf424008086a201 config 0
-    isr 0 counter 0x19b07289b
+    isr 0 counter 0x394b04f80
     timer0 config 0xf0000000000030 cmp 0
     timer0 period 0 fsb 0
     timer1 config 0xf0000000000030 cmp 0

Just a few counters changed.

I retried under AMD and got the same result as last time, so it's
definitely broken. Are there any tools to analyse the save file? If I can
see what numbers are in there I should be able to tell whether it's the
save or the restore that's broken...

Thanks

James

> -----Original Message-----
> From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of James Harper
> Sent: Tuesday, 25 January 2011 22:38
> To: Tim Deegan
> Cc: xen-devel@lists.xensource.com
> Subject: RE: [Xen-devel] xm save + restore crashes Windows 2008 32-bit(4.0.2-rc2-pre)
>
> > I'm trying to set it up here as well but I'm away from the office and
> > getting the VGA console as far as my screen is proving tricky.
> >
> > Can you try:
> > - xl pause
> > - xen-hvmctx >before
> > - xl save save-file
> > - xl restore -p save-file
> > - xl list
> > - xen-hvmctx >after
> > - diff -u before after
> >
> > There should be a few differences to do with timers and TSCs but there
> > might be some other smoking gun. Of course it's possible that some
> > piece of state got added that didn't get into the save/restore code at
> > all. It's also possible that some vital piece of memory isn't getting
> > saved properly but that's less likely to be AMD-specific.
> >
>
> I was able to remove the 'is domain running?' check from xend and
> complete your request using xm.
>
> # diff -u before after
> --- before  2011-01-25 22:27:51.064451527 +1100
> +++ after   2011-01-25 22:33:25.724619490 +1100
> @@ -1,4 +1,4 @@
> -HVM save record for domain 53
> +HVM save record for domain 54
>  Entry 0: type 1 instance 0, length 24
>      Header: magic 0x54381286, version 1
>      Xen changeset 0
> @@ -22,11 +22,11 @@
>      cs 0x0000001b (0x0000000000000000 + 0xffffffff / 0x00cfb)
>      ds 0x00000023 (0x0000000000000000 + 0xffffffff / 0x00cf3)
>      es 0x00000023 (0x0000000000000000 + 0xffffffff / 0x00cf3)
> -    fs 0x0000003b (0x000000007ffdc000 + 0x00000fff / 0x004f3)
> -    gs 0x00000000 (0x0000000000000000 + 0xffffffff / 0x00000)
> +    fs 0x00000000 (0x00007f18bcbc6700 + 0xffffffff / 0x00000)
> +    gs 0x00000000 (0xffff880028038000 + 0xffffffff / 0x00000)
>      ss 0x00000023 (0x0000000000000000 + 0xffffffff / 0x00cf3)
> -    tr 0x00000028 (0x0000000080157000 + 0x000020ab / 0x0008b)
> -    ldtr 0x00000000 (0x0000000000000000 + 0x00000000 / 0x00000)
> +    tr 0x0000e040 (0xffff82c480263a80 + 0x00000067 / 0x0008b)
> +    ldtr 0x00000000 (0x0000000000000000 + 0x0000ffff / 0x00000)
>      itdr (0x0000000081fff400 + 0x000007ff)
>      gdtr (0x0000000081fff000 + 0x000003ff)
>      sysenter cs 0x00000000 eip 0x0000000000000000 esp 0x0000000000000000
> @@ -34,7 +34,7 @@
>      MSR flags 0xffffffffffffffff lstar 0x0000000000000000
>      star 0x0000000000000000 cstar 0x0000000000000000
>      sfmask 0x0000000000000000 efer 0x0000000000000800
> -    tsc 0x00000018cad69045
> +    tsc 0x0000008fe39fad26
>      event 0x00000000 error 0x00000000
>      FPU: fcw 0x037f fsw 0x0000
>      ftw 0x00 (0x00) fop 0x0000
> @@ -71,7 +71,7 @@
>      (0x00000000000000000000000000000000)
>      (0x00000000000000000000000000000000)
>  Entry 2: type 3 instance 0, length 8
> -    PIC: IRQ base 0x30, irr 0, imr 0xff, isr 0
> +    PIC: IRQ base 0x30, irr 0x2, imr 0xff, isr 0
>      init_state 0, priority_add 0, readsel_isr 0, poll 0
>      auto_eoi 0, rotate_on_auto_eoi 0
>      special_fully_nested_mode 0, special_mask_mode 0
> @@ -153,8 +153,8 @@
>      0x01c0: 0x0000000000000004 0x01d0: 0x0000000000000000
>      0x01e0: 0x0000000000000000 0x01f0: 0x0000000000000000
>      0x0200: 0x0000000000000000 0x0210: 0x0000000000000000
> -    0x0220: 0x0000000000000000 0x0230: 0x0000000000040000
> -    0x0240: 0x0000000000000004 0x0250: 0x0000000000000000
> +    0x0220: 0x0000000000000000 0x0230: 0x0000000000060000
> +    0x0240: 0x0000000000000006 0x0250: 0x0000000000000000
>      0x0260: 0x0000000000000000 0x0270: 0x0000000000000000
>      0x0280: 0x0000000000000000 0x0290: 0x0000000000000000
>      0x02a0: 0x0000000000000000 0x02b0: 0x0000000000000000
> @@ -171,7 +171,7 @@
>  Entry 7: type 7 instance 0, length 16
>      PCI IRQs: 0x00000000000100800000000000000000
>  Entry 8: type 8 instance 0, length 8
> -    ISA IRQs: 0x0001
> +    ISA IRQs: 0x0003
>  Entry 9: type 9 instance 0, length 8
>      PCI LINK: 0 0 0 0
>  Entry 10: type 10 instance 0, length 56
> @@ -185,11 +185,11 @@
>      rd_state 0, wr_state 0, wr_latch 0, rw_mode 0
>      mode 0xff, bcd 0, gate 0x1
>  Entry 11: type 11 instance 0, length 16
> -    RTC: regs 0x48 0x00 0x27 0x00 0x22 0x00 0x02 0x25
> +    RTC: regs 0x23 0x00 0x33 0x00 0x22 0x00 0x02 0x25
>      0x01 0x11 0x2a 0x42 0x00 0x80, index 0x0c
>  Entry 12: type 12 instance 0, length 1048
>      HPET: capability 0xf424008086a201 config 0
> -    isr 0 counter 0x1308081db
> +    isr 0 counter 0x55323282f
>      timer0 config 0xf0000000000030 cmp 0
>      timer0 period 0 fsb 0
>      timer1 config 0xf0000000000030 cmp 0
> @@ -200,8 +200,8 @@
>      ACPI PM: TMR_VAL 0xd9f446d, PM1a_STS 0x0, PM1a_EN 0x321
>  Entry 14: type 14 instance 0, length 240
>      MTRR: PAT 0x7010600070106, cap 0x508, default 0xc06
> -    var 0 0x00000000f0000000 0x000000fff8000800
> -    var 1 0x00000000f8000000 0x000000fffc000800
> +    var 0 0x00000000f0000000 0x0000000000000000
> +    var 1 0x00000000f8000000 0x0000000000000000
>      var 2 0x0000000000000000 0x0000000000000000
>      var 3 0x0000000000000000 0x0000000000000000
>      var 4 0x0000000000000000 0x0000000000000000
>
> I don't really know a lot of what I'm looking at, but those cpu
> registers shouldn't be different should they???
> It almost looks like the cpu was allowed to run for a few cycles at
> some point even though it was paused.
>
> I'll try the same on the intel box and see what happens.
>
> Thanks
>
> James
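
On the question of telling the expected differences from the suspicious ones: the thread says that timers and TSCs should change across a save/restore, and the Intel run shows only the domain id, tsc, RTC registers and HPET counter moving. A minimal sketch of a filter for that, assuming the two xen-hvmctx dumps have been read as plain text. The function names and the list of "expected" patterns are hypothetical helpers written for illustration and are not part of the Xen tools; the pattern list is an assumption drawn only from the Intel diff above.

```python
import difflib
import re

# ASSUMPTION: these are the only fields treated as harmless noise, taken
# from the Intel diff in this thread (domain id, TSC, RTC regs, HPET counter).
EXPECTED_PATTERNS = [
    re.compile(r"^HVM save record for domain \d+"),
    re.compile(r"^tsc\b"),
    re.compile(r"^RTC: regs\b"),
    re.compile(r"^isr \d+ counter\b"),
]


def is_expected(line):
    """True if a dump line matches one of the fields expected to change."""
    stripped = line.strip()
    return any(p.search(stripped) for p in EXPECTED_PATTERNS)


def suspicious_changes(before, after):
    """Return changed lines ('-' old, '+' new) that are not expected noise.

    `before` and `after` are the full text of two xen-hvmctx dumps.
    """
    diff = difflib.unified_diff(
        before.splitlines(), after.splitlines(), lineterm="", n=0
    )
    result = []
    for i, line in enumerate(diff):
        # The first two lines are the '---' / '+++' file headers;
        # '@@' lines are hunk headers. Neither carries dump content.
        if i < 2 or line.startswith("@@"):
            continue
        if not is_expected(line[1:]):
            result.append(line)
    return result
```

Run against the AMD dumps, a filter like this would be expected to leave the fs, gs, tr, ldtr, PIC, ISA IRQs and MTRR lines for inspection while hiding the counters. It only narrows down what to look at; it does not say whether the save or the restore side is at fault.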