* XL: pv guests dont reboot after migration (xen4.1.2-rc2-pre)
@ 2011-09-20  9:41 Andreas Olsowski
  2011-09-20 19:23 ` Ian Campbell
  0 siblings, 1 reply; 10+ messages in thread
From: Andreas Olsowski @ 2011-09-20  9:41 UTC (permalink / raw)
  To: xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 527 bytes --]

A pv guest will not reboot after migration, the guest itself does 
everything right, including the shutdown, but xl does not recreate the 
guest, it just shuts it down.

This goes for 2.6.39 and 3.0.4 guest kernels; I haven't tried different 
ones. I also haven't tried different Xen versions.

I don't know if this would affect HVM, probably not since qemu leaves the 
guest running and does a "proper" restart.


I guess this behavior has always been that way and no one ever bothered 
to bring it up, well now I do :)


[-- Attachment #1.2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 6595 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: XL: pv guests dont reboot after migration (xen4.1.2-rc2-pre)
  2011-09-20  9:41 XL: pv guests dont reboot after migration (xen4.1.2-rc2-pre) Andreas Olsowski
@ 2011-09-20 19:23 ` Ian Campbell
  2011-09-23  7:40   ` Andreas Olsowski
  0 siblings, 1 reply; 10+ messages in thread
From: Ian Campbell @ 2011-09-20 19:23 UTC (permalink / raw)
  To: Andreas Olsowski; +Cc: xen-devel

On Tue, 2011-09-20 at 10:41 +0100, Andreas Olsowski wrote:
> A pv guest will not reboot after migration, the guest itself does 
> everything right, including the shutdown, but xl does not recreate the 
> guest, it just shuts it down.

After the migrate but before the shutdown is there an xl process
associated with the guest?

Please take a look at http://wiki.xen.org/xenwiki/ReportingBugs. It
includes a useful list of bits of info (logfiles etc.); including those
in your report will help us to help you.

For example:

What exact commands are you running on the host and in the guest?

What is in your guest cfg file?

What does /var/log/xen/*-$GUESTNAME* contain?

Ian.

> 
> This goes for 2.6.39 and 3.0.4 guest kernels; I haven't tried different 
> ones. I also haven't tried different Xen versions.
> 
> I don't know if this would affect HVM, probably not since qemu leaves the 
> guest running and does a "proper" restart.
> 
> 
> I guess this behavior has always been that way and no one ever bothered 
> to bring it up, well now I do :)
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: XL: pv guests dont reboot after migration (xen4.1.2-rc2-pre)
  2011-09-20 19:23 ` Ian Campbell
@ 2011-09-23  7:40   ` Andreas Olsowski
  2011-09-23  8:00     ` Ian Campbell
  0 siblings, 1 reply; 10+ messages in thread
From: Andreas Olsowski @ 2011-09-23  7:40 UTC (permalink / raw)
  To: xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 5899 bytes --]

On 09/20/2011 09:23 PM, Ian Campbell wrote:
 > On Tue, 2011-09-20 at 10:41 +0100, Andreas Olsowski wrote:
 >> A pv guest will not reboot after migration, the guest itself does
 >> everything right, including the shutdown, but xl does not recreate the
 >> guest, it just shuts it down.
 >

 > After the migrate but before the shutdown is there an xl process
 > associated with the guest?
Yes, xl migrate-receive is running, but check this out:

root@xenturio1:/var/log/xen# cat xl-thiswillfail.log
Waiting for domain thiswillfail (domid 7) to die [pid 7475]


root@xenturio1:/usr/src/linux-2.6-xen# xl -vvv migrate thiswillfail xenturio2
migration target: Ready to receive domain.
Saving to migration stream new xl format (info 0x0/0x0/380)
Loading new save file incoming migration stream (new xl fmt info 0x0/0x0/380)
  Savefile contains xl domain config
xc: detail: Had 0 unexplained entries in p2m table
xc: Saving memory: iter 0 (last sent 0 skipped 0): 133120/133120  100%
xc: detail: delta 9519ms, dom0 94%, target 1%, sent 449Mb/s, dirtied 1Mb/s 533 pages
xc: Saving memory: iter 1 (last sent 130565 skipped 507): 133120/133120  100%
xc: detail: delta 39ms, dom0 92%, target 2%, sent 447Mb/s, dirtied 28Mb/s 34 pages
xc: Saving memory: iter 2 (last sent 533 skipped 0): 133120/133120  100%
xc: detail: Start last iteration
libxl: debug: libxl_dom.c:384:libxl__domain_suspend_common_callback issuing PV suspend request via XenBus control node
libxl: debug: libxl_dom.c:389:libxl__domain_suspend_common_callback wait for the guest to acknowledge suspend request
libxl: debug: libxl_dom.c:434:libxl__domain_suspend_common_callback guest acknowledged suspend request
libxl: debug: libxl_dom.c:438:libxl__domain_suspend_common_callback wait for the guest to suspend
libxl: debug: libxl_dom.c:450:libxl__domain_suspend_common_callback guest has suspended
xc: detail: SUSPEND shinfo 0007fafc
xc: detail: delta 205ms, dom0 3%, target 0%, sent 5Mb/s, dirtied 25Mb/s 160 pages
xc: Saving memory: iter 3 (last sent 34 skipped 0): 133120/133120  100%
xc: detail: delta 3ms, dom0 0%, target 0%, sent 1747Mb/s, dirtied 1747Mb/s 160 pages
xc: detail: Total pages sent= 131292 (0.99x)
xc: detail: (of which 0 were fixups)
xc: detail: All memory is saved
xc: detail: Save exit rc=0
migration target: Transfer complete, requesting permission to start domain.
migration sender: Target has acknowledged transfer.
migration sender: Giving target permission to start.
migration target: Got permission, starting domain.
migration target: Domain started successsfully.
migration sender: Target reports successful startup.
Migration successful.


root@xenturio1:/var/log/xen# cat xl-thiswillfail.log
Waiting for domain thiswillfail (domid 7) to die [pid 7475]
Domain 7 is dead
Done. Exiting now

root@xenturio2:/var/log/xen# cat xl-thiswillfail--incoming.log
Waiting for domain thiswillfail--incoming (domid 10) to die [pid 5162]

root@xenturio2:/var/log/xen# ps auxww |grep -v grep |grep "migrate-rec"
root      5162  0.0  0.0  36128  1592 ?        Ssl  09:30   0:00 xl migrate-receive



root@xenturio2:/var/log/xen# xl console thiswillfail
PM: early restore of devices complete after 0.071 msecs
PM: restore of devices complete after 14.727 msecs
Setting capacity to 10485760
Setting capacity to 2097152


root@thiswillfail:~# init 6
INIT: Switching to runlevel: 6
INIT: Sending processes the TERM signal
Using makefile-style concurrent boot in runlevel 6.
Asking all remaining processes to terminate...done.
All processes ended within 1 seconds....done.
Stopping enhanced syslogd: rsyslogd.
Saving the system clock.
Cannot access the Hardware Clock via any known method.
Use the --debug option to see the details of our search for an access 
method.
Deconfiguring network interfaces...Internet Systems Consortium DHCP 
Client 4.1.1-P1
Copyright 2004-2010 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Listening on LPF/eth0/00:16:3e:7e:38:fb
Sending on   LPF/eth0/00:16:3e:7e:38:fb
Sending on   Socket/fallback
DHCPRELEASE on eth0 to 10.19.46.16 port 67
done.
Cleaning up ifupdown....
Deactivating swap...done.
Will now restart.
md: stopping all md devices.
Restarting system.


root@xenturio2:/var/log/xen# xl list
Name                                        ID   Mem VCPUs    State   Time(s)
Domain-0                                     0  4661     8     r-----  77471.3
root@xenturio2:/var/log/xen# ps auxww |grep -v grep |grep xl
root@xenturio2:/var/log/xen# cat xl-thiswillfail--incoming.log
Waiting for domain thiswillfail--incoming (domid 10) to die [pid 5162]
Domain 10 is dead
Action for shutdown reason code 1 is restart
Domain 10 needs to be cleaned up: destroying the domain
Done. Rebooting now
xc: error: 0-length read: Internal error
xc: error: read_exact_timed failed (read rc: 0, errno: 0): Internal error
xc: error: read: p2m_size (0 = Success): Internal error




######
# domU config
root@xenturio2:/var/log/xen# cat /mnt/vmctrl/xenconfig/thiswillfail.sxp
# generated using xen-tool
kernel = "/boot/vmlinuz-3.0-xenU"
ramdisk = "/boot/initrd.img-3.0-xenU"
name = "thiswillfail"
memory = "512"
vcpus = "2"
vif = [ 'bridge=vlanbr27','mac=fe:ff:00:1b:00:06,bridge=mgmtbr27' ]
disk = [ 
'phy:/dev/xen-data/thiswillfail-root,xvda1,w','phy:/dev/xen-data/thiswillfail-swap,xvda2,w' 
]
root = "/dev/xvda1"
extra = "xencons=hvc0 console=hvc0"


This again goes for 2.6.39-xenU and 3.0.4-xenU.




I guess the core of the problem is somewhere around this:
 >xc: error: 0-length read: Internal error
 >xc: error: read_exact_timed failed (read rc: 0, errno: 0): Internal error
 >xc: error: read: p2m_size (0 = Success): Internal error


with best regards



andreas




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: XL: pv guests dont reboot after migration (xen4.1.2-rc2-pre)
  2011-09-23  7:40   ` Andreas Olsowski
@ 2011-09-23  8:00     ` Ian Campbell
  2011-09-23  9:15       ` Andreas Olsowski
  0 siblings, 1 reply; 10+ messages in thread
From: Ian Campbell @ 2011-09-23  8:00 UTC (permalink / raw)
  To: Andreas Olsowski; +Cc: xen-devel

On Fri, 2011-09-23 at 08:38 +0100, Andreas Olsowski wrote:
> I guess the core of the problem is somewhere around this:
>  >xc: error: 0-length read: Internal error
>  >xc: error: read_exact_timed failed (read rc: 0, errno: 0): Internal error
>  >xc: error: read: p2m_size (0 = Success): Internal error
> 

It smells like on reboot it is trying to receive another incoming
migration, instead of restarting the domain it already has.

This (untested) might help:

diff -r d7b14b76f1eb tools/libxl/xl_cmdimpl.c
--- a/tools/libxl/xl_cmdimpl.c	Thu Sep 22 14:26:08 2011 +0100
+++ b/tools/libxl/xl_cmdimpl.c	Fri Sep 23 08:59:36 2011 +0100
@@ -1516,6 +1516,11 @@ start:
         ret = libxl_domain_create_restore(ctx, &d_config,
                                             cb, &child_console_pid,
                                             &domid, restore_fd);
+        /*
+         * On subsequent reboot etc we should create the domain, not
+         * restore/migrate-receive it again.
+         */
+        restore_file = NULL;
     }else{
         ret = libxl_domain_create_new(ctx, &d_config,
                                         cb, &child_console_pid, &domid);

Ian.
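
The control flow behind this fix can be sketched in isolation. What follows is a minimal C sketch, not the actual xl code: `count_restores` and its parameters are hypothetical names, and the two libxl calls appear only in comments. It models repeated passes through the `start:` label and shows that, unless `restore_file` is cleared after the first restore, every later reboot takes the restore path again.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of the create loop: each "boot" is one pass through
 * the start: label. Returns how many passes took the restore path.
 * clear_after_restore models the one-line fix (restore_file = NULL).
 */
static int count_restores(const char *restore_file, int boots,
                          int clear_after_restore)
{
    int restores = 0;

    assert(boots >= 0);
    for (int i = 0; i < boots; i++) {
        if (restore_file != NULL) {
            /* real code would call libxl_domain_create_restore() here */
            restores++;
            if (clear_after_restore)
                restore_file = NULL;   /* later passes create a new domain */
        } else {
            /* real code would call libxl_domain_create_new() here */
        }
    }
    return restores;
}
```

With the fix modelled, an initial migrate-receive followed by two reboots restores once; without it, all three passes try to restore.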

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: XL: pv guests dont reboot after migration (xen4.1.2-rc2-pre)
  2011-09-23  8:00     ` Ian Campbell
@ 2011-09-23  9:15       ` Andreas Olsowski
  2011-09-27 17:39         ` XL: pv guests dont reboot after migration (xen4.1.2-rc2-pre) [and 1 more messages] Ian Jackson
  0 siblings, 1 reply; 10+ messages in thread
From: Andreas Olsowski @ 2011-09-23  9:15 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel

On 09/23/2011 10:00 AM, Ian Campbell wrote:
> It smells like on reboot it is trying to receive another incoming
> migration, instead of restarting the domain it already has.
>
> This (untested) might help:
>
> diff -r d7b14b76f1eb tools/libxl/xl_cmdimpl.c
> --- a/tools/libxl/xl_cmdimpl.c	Thu Sep 22 14:26:08 2011 +0100
> +++ b/tools/libxl/xl_cmdimpl.c	Fri Sep 23 08:59:36 2011 +0100
> @@ -1516,6 +1516,11 @@ start:
>           ret = libxl_domain_create_restore(ctx,&d_config,
>                                               cb,&child_console_pid,
>                                               &domid, restore_fd);
> +        /*
> +         * On subsequent reboot etc we should create the domain, not
> +         * restore/migrate-receive it again.
> +         */
> +        restore_file = NULL;
>       }else{
>           ret = libxl_domain_create_new(ctx,&d_config,
>                                           cb,&child_console_pid,&domid);
>
> Ian.


Patching works.

root@xenturio2:/usr/src/xen-4.1-testing.hg# patch -p1 < 
../xl-migration-reboot.ian.patch
patching file tools/libxl/xl_cmdimpl.c
Hunk #1 succeeded at 1520 with fuzz 2 (offset 4 lines).

Compilation (clean/make/install) worked fine too.

The patch did what you intended for it to do, the guest reboots:

##############
root@xenturio2:/usr/src/xen-4.1-testing.hg# xl console thishopefullywontfail
PM: early restore of devices complete after 0.068 msecs
PM: restore of devices complete after 13.033 msecs
Setting capacity to 10485760
Setting capacity to 2097152

root@thishopefullywontfail:~# init 6
INIT: Switching to runlevel: 6
INIT: Sending processes the TERM signal
... usual shutdown ...
Restarting system.

root@xenturio2:/usr/src/xen-4.1-testing.hg# xl list
Name                                        ID   Mem VCPUs    State   Time(s)
Domain-0                                     0  4661     8     r-----  78258.3
thishopefullywontfail                       14   512     2     -b----      2.6
root@xenturio2:/usr/src/xen-4.1-testing.hg# xl console thishopefullywontfail
Linux version 3.0.4-xenU (root@xenturio1) (gcc version 4.4.5 (Debian 
4.4.5-8) ) #6 SMP Wed Aug 31 17:04:24 CEST 2011
... usual bootup ....

root@thishopefullywontfail:~#

#####################

Here is the output of the log:

root@xenturio2:/var/log/xen# cat xl-thishopefullywontfail--incoming.log
Waiting for domain thishopefullywontfail--incoming (domid 13) to die 
[pid 14668]
Domain 13 is dead
Action for shutdown reason code 1 is restart
Domain 13 needs to be cleaned up: destroying the domain
Done. Rebooting now
Waiting for domain thishopefullywontfail (domid 14) to die [pid 14668]



with best regards



andreas

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: XL: pv guests dont reboot after migration (xen4.1.2-rc2-pre) [and 1 more messages]
  2011-09-23  9:15       ` Andreas Olsowski
@ 2011-09-27 17:39         ` Ian Jackson
  2011-10-15  1:01           ` XL: pv guests dont reboot after migration (xen-4.1.2-rc3) libc-2.11.2 segfault Andreas Olsowski
  0 siblings, 1 reply; 10+ messages in thread
From: Ian Jackson @ 2011-09-27 17:39 UTC (permalink / raw)
  To: Andreas Olsowski, Ian Campbell; +Cc: xen-devel

Ian Campbell writes ("Re: [Xen-devel] XL: pv guests dont reboot after	migration (xen4.1.2-rc2-pre)"):
> It smells like on reboot it is trying to receive another incoming
> migration, instead of restarting the domain it already has.
> 
> This (untested) might help:

Thanks both, I've applied this patch.

Ian.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: XL: pv guests dont reboot after migration (xen-4.1.2-rc3) libc-2.11.2 segfault
  2011-09-27 17:39         ` XL: pv guests dont reboot after migration (xen4.1.2-rc2-pre) [and 1 more messages] Ian Jackson
@ 2011-10-15  1:01           ` Andreas Olsowski
  2011-10-15  5:45             ` Ian Campbell
  0 siblings, 1 reply; 10+ messages in thread
From: Andreas Olsowski @ 2011-10-15  1:01 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian.Jackson, Konrad Rzeszutek Wilk


[-- Attachment #1.1: Type: text/plain, Size: 2657 bytes --]

Hi all.

If you recall, I started a similar discussion last month, and working 
patches were posted and tested.


Now I finally got around to updating my systems to the latest testing 
release, and it seems something else is preventing a clean reboot now.

pv guests dont reboot after migration,
just when xl should reboot the machine syslog shows:


Oct 15 02:46:32 netcatarina kernel: xl[14986]: segfault at 7f0ec70a3008 
ip 00007f0ec7d517f9 sp 00007fff366cf100 error 4 in 
libc-2.11.2.so[7f0ec7cdb000+158000]



I am running Debian squeeze and haven't made any changes to it since
4.1.2-rc2-pre, which worked fine with the patches from my previous thread.


root@netcatarina:~# locate libc-2.
/lib/libc-2.11.2.so
/lib32/libc-2.11.2.so
root@netcatarina:~# dpkg -l |grep libc6
ii  libc6                               2.11.2-10 
Embedded GNU C Library: Shared libraries
ii  libc6-dev                           2.11.2-10 
Embedded GNU C Library: Development Libraries and Header Files
ii  libc6-i386                          2.11.2-10 
Embedded GNU C Library: 32-bit shared libraries for AMD64
root@netcatarina:~# ldd /usr/sbin/xl
	linux-vdso.so.1 =>  (0x00007fffa33c9000)
	libxlutil.so.1.0 => /usr/lib/libxlutil.so.1.0 (0x00007f6815a23000)
	libxenlight.so.1.0 => /usr/lib/libxenlight.so.1.0 (0x00007f68157fb000)
	libxenctrl.so.4.0 => /usr/lib/libxenctrl.so.4.0 (0x00007f68155d8000)
	libdl.so.2 => /lib/libdl.so.2 (0x00007f68153d4000)
	libxenguest.so.4.0 => /usr/lib/libxenguest.so.4.0 (0x00007f68151af000)
	libxenstore.so.3.0 => /usr/lib/libxenstore.so.3.0 (0x00007f6814fa5000)
	libblktapctl.so.1.0 => /usr/lib/libblktapctl.so.1.0 (0x00007f6814d9e000)
	libutil.so.1 => /lib/libutil.so.1 (0x00007f6814b9b000)
	libuuid.so.1 => /lib/libuuid.so.1 (0x00007f6814996000)
	libc.so.6 => /lib/libc.so.6 (0x00007f6814635000)
	libpthread.so.0 => /lib/libpthread.so.0 (0x00007f6814419000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f6815c32000)
	libz.so.1 => /usr/lib/libz.so.1 (0x00007f6814201000)

root@netcatarina:~# ls -la /lib/libc.so.6
lrwxrwxrwx 1 root root 14 May 23 13:04 /lib/libc.so.6 -> libc-2.11.2.so

So it's not the fact that I have an additional 32-bit version installed 
(this is a 64-bit system).



This patch is the main reason I even bothered to update the servers, so 
it would be nice if you could post a patch for this problem as well.

People reboot their servers more frequently than I would like.
And since I migrate them to a different server when I do maintenance, 
this problem pretty much affects all of my 60 virtual machines.



With best regards


---
Andreas



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: XL: pv guests dont reboot after migration (xen-4.1.2-rc3) libc-2.11.2 segfault
  2011-10-15  1:01           ` XL: pv guests dont reboot after migration (xen-4.1.2-rc3) libc-2.11.2 segfault Andreas Olsowski
@ 2011-10-15  5:45             ` Ian Campbell
  2011-10-15 10:47               ` Andreas Olsowski
  0 siblings, 1 reply; 10+ messages in thread
From: Ian Campbell @ 2011-10-15  5:45 UTC (permalink / raw)
  To: Andreas Olsowski; +Cc: Ian Jackson, xen-devel, Konrad Rzeszutek Wilk

On Sat, 2011-10-15 at 02:01 +0100, Andreas Olsowski wrote:

> pv guests dont reboot after migration,
> just when xl should reboot the machine syslog shows:
> 
> 
> Oct 15 02:46:32 netcatarina kernel: xl[14986]: segfault at 7f0ec70a3008 
> ip 00007f0ec7d517f9 sp 00007fff366cf100 error 4 in 
> libc-2.11.2.so[7f0ec7cdb000+158000]

Can you run under gdb and get a backtrace? Or perhaps core file is
dropped somewhere?

Ian.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: XL: pv guests dont reboot after migration (xen-4.1.2-rc3) libc-2.11.2 segfault
  2011-10-15  5:45             ` Ian Campbell
@ 2011-10-15 10:47               ` Andreas Olsowski
  2011-10-15 13:07                 ` Ian Campbell
  0 siblings, 1 reply; 10+ messages in thread
From: Andreas Olsowski @ 2011-10-15 10:47 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Ian Jackson, Konrad Rzeszutek Wilk


[-- Attachment #1.1: Type: text/plain, Size: 1256 bytes --]

On 10/15/2011 07:45 AM, Ian Campbell wrote:
> On Sat, 2011-10-15 at 02:01 +0100, Andreas Olsowski wrote:
>
>> pv guests dont reboot after migration,
>> just when xl should reboot the machine syslog shows:
>>
>>
>> Oct 15 02:46:32 netcatarina kernel: xl[14986]: segfault at 7f0ec70a3008
>> ip 00007f0ec7d517f9 sp 00007fff366cf100 error 4 in
>> libc-2.11.2.so[7f0ec7cdb000+158000]
>
> Can you run under gdb and get a backtrace? Or perhaps core file is
> dropped somewhere?
How? xl migrate-receive is not started by hand. Can you point me to the 
location within the code that calls it so I can put a "gdb" in front of it?


 > Or perhaps core file is dropped somewhere?
Wouldn't I have to run a debugging-enabled build of Xen for that?

I found this in the log dir:


root@netcatarina:/var/log/xen# cat xl-testmig--incoming.log
Waiting for domain testmig--incoming (domid 67) to die [pid 3429]
Domain 67 is dead
Action for shutdown reason code 1 is restart
Domain 67 needs to be cleaned up: destroying the domain
Done. Rebooting now
xc: error: 0-length read: Internal error
xc: error: read_exact_timed failed (read rc: 0, errno: 0): Internal error
xc: error: read: p2m_size (0 = Success): Internal error




-- 
Andreas



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: XL: pv guests dont reboot after migration (xen-4.1.2-rc3) libc-2.11.2 segfault
  2011-10-15 10:47               ` Andreas Olsowski
@ 2011-10-15 13:07                 ` Ian Campbell
  0 siblings, 0 replies; 10+ messages in thread
From: Ian Campbell @ 2011-10-15 13:07 UTC (permalink / raw)
  To: Andreas Olsowski; +Cc: xen-devel, Ian Jackson, Konrad Rzeszutek Wilk

On Sat, 2011-10-15 at 11:47 +0100, Andreas Olsowski wrote:
> On 10/15/2011 07:45 AM, Ian Campbell wrote:
> > On Sat, 2011-10-15 at 02:01 +0100, Andreas Olsowski wrote:
> >
> >> pv guests dont reboot after migration,
> >> just when xl should reboot the machine syslog shows:
> >>
> >>
> >> Oct 15 02:46:32 netcatarina kernel: xl[14986]: segfault at 7f0ec70a3008
> >> ip 00007f0ec7d517f9 sp 00007fff366cf100 error 4 in
> >> libc-2.11.2.so[7f0ec7cdb000+158000]
> >
> > Can you run under gdb and get a backtrace? Or perhaps core file is
> > dropped somewhere?
> How? xl migrate-receive is not started by hand. Can you point me to the 
> location within the code that calls it so I can put a "gdb" in front of it?

tools/libxl/xl_cmdimpl.c, main_migrate().

Or you can attach gdb to a running xl migrate-receive ("gdb -p
<pid> /path/xl"?). I think you can also control the remote command which
is run using the -e option to "xl migrate", maybe. Not so sure about
that last one.

>  > Or perhaps core file is dropped somewhere?
> Wouldn't I have to run a debugging-enabled build of Xen for that?
> 
> I found this in the log dir:
> 
> 
> root@netcatarina:/var/log/xen# cat xl-testmig--incoming.log
> Waiting for domain testmig--incoming (domid 67) to die [pid 3429]
> Domain 67 is dead
> Action for shutdown reason code 1 is restart
> Domain 67 needs to be cleaned up: destroying the domain
> Done. Rebooting now
> xc: error: 0-length read: Internal error

Interesting. That suggests we've gone back round to the migrate/restore
path, but all the uses after the start: label (where we go back to on
reboot) in create_domain seem to be gated on restore_file != NULL. I
must be missing something...
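
The "(read rc: 0, errno: 0)" and "(0 = Success)" parts of the log are consistent with a read that hit end-of-file: POSIX read() returns 0 at EOF and does not set errno. The sketch below illustrates that pattern; `read_exact_sketch` is a hypothetical helper written for this note, not the libxc implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical read-exactly helper (not the libxc code).
 * Returns 0 once len bytes have been read, -1 otherwise.
 * On EOF it sets errno to 0, so a caller printing strerror(errno)
 * would report "Success" even though the read failed.
 */
static int read_exact_sketch(int fd, void *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = read(fd, (char *)buf + done, len - done);
        if (n == 0) {            /* EOF: the "0-length read" case */
            errno = 0;
            return -1;
        }
        if (n < 0) {
            if (errno == EINTR)  /* interrupted, retry */
                continue;
            return -1;           /* genuine error, errno set by read() */
        }
        done += (size_t)n;
    }
    return 0;
}
```

Reading from a descriptor whose writer has already finished and closed its end, as a migration stream would be after the first transfer, fails in exactly this way.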

Adding some logging in create_domain wherever a *fd variable is used
might be interesting, perhaps on the exit paths too.

I notice that we don't appear to close restore_fd in the child process.
That probably isn't related to this problem but would be worth doing I
suspect.
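
Closing an inherited descriptor in a child could look roughly like the sketch below. It is a generic fork() illustration under stated assumptions: `fork_and_close_in_child` is a hypothetical helper, and which child process in xl is meant is not specified here. The child drops its copy of the descriptor while the parent's copy stays open.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork, and have the child close its inherited
 * copy of fd. The child exits 0 if the descriptor is really gone.
 * Returns the child's exit status, or -1 on fork/wait failure.
 */
static int fork_and_close_in_child(int fd)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        close(fd);                                /* child's copy only */
        _exit(fcntl(fd, F_GETFD) == -1 ? 0 : 1);  /* verify it is closed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

An alternative with a similar effect for children that exec is to set FD_CLOEXEC on the descriptor with fcntl().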

> xc: error: read_exact_timed failed (read rc: 0, errno: 0): Internal error
> xc: error: read: p2m_size (0 = Success): Internal error


> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2011-10-15 13:07 UTC | newest]

Thread overview: 10+ messages
2011-09-20  9:41 XL: pv guests dont reboot after migration (xen4.1.2-rc2-pre) Andreas Olsowski
2011-09-20 19:23 ` Ian Campbell
2011-09-23  7:40   ` Andreas Olsowski
2011-09-23  8:00     ` Ian Campbell
2011-09-23  9:15       ` Andreas Olsowski
2011-09-27 17:39         ` XL: pv guests dont reboot after migration (xen4.1.2-rc2-pre) [and 1 more messages] Ian Jackson
2011-10-15  1:01           ` XL: pv guests dont reboot after migration (xen-4.1.2-rc3) libc-2.11.2 segfault Andreas Olsowski
2011-10-15  5:45             ` Ian Campbell
2011-10-15 10:47               ` Andreas Olsowski
2011-10-15 13:07                 ` Ian Campbell
