From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pasi Kärkkäinen
Subject: Re: pv 2.6.31 (kernel.org) and save/migrate, domU BUG()
Date: Sun, 8 Nov 2009 16:17:43 +0200
Message-ID: <20091108141743.GK1434@reaktio.net>
References: <20091107110905.GB1434@reaktio.net> <373a54d1-3dec-4185-b1ca-0363e14329b4@default>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <373a54d1-3dec-4185-b1ca-0363e14329b4@default>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Dan Magenheimer
Cc: "Xen-Devel (E-mail)"
List-Id: xen-devel@lists.xenproject.org

On Sat, Nov 07, 2009 at 07:32:49AM -0800, Dan Magenheimer wrote:
> > > Well, first, I got 2.6.31.5 to boot in a PV guest in another
> > > machine and it fails to save also. Are you able to save
> > > 2.6.31{,.5} successfully? On latest xen-unstable?
> > > (NOTE: Yes, I do have CONFIG_XEN_SAVE_RESTORE=y... don't
> > > know if that is important.)
> >
> > I'll have to try it later today..
>
> Let me know.
>

Ok. I just tried with a Fedora 12 (rawhide) PV guest.

I was able to "xm save" and "xm restore" it without problems.
But I noticed there was a BUG printed on the guest console:

http://pasik.reaktio.net/xen/debug/dmesg-2.6.31.5-122.fc12.x86_64-saverestore.txt

BUG: sleeping function called from invalid context at kernel/mutex.c:94
in_atomic(): 0, irqs_disabled(): 1, pid: 1052, name: kstop/0
Pid: 1052, comm: kstop/0 Not tainted 2.6.31.5-122.fc12.x86_64 #1
Call Trace:
 [] __might_sleep+0xe6/0xe8
 [] mutex_lock+0x22/0x4e
 [] dpm_resume_noirq+0x21/0x11f
 [] xen_suspend+0xca/0xd1
 [] stop_cpu+0x8c/0xd2
 [] worker_thread+0x18a/0x224
 [] ? autoremove_wake_function+0x0/0x39
 [] ? _spin_unlock_irqrestore+0x19/0x1b
 [] ? worker_thread+0x0/0x224
 [] kthread+0x91/0x99
 [] child_rip+0xa/0x20
 [] ? int_ret_from_sys_call+0x7/0x1b
 [] ? retint_restore_args+0x5/0x6
 [] ? child_rip+0x0/0x20

More information about my setup:

Host/dom0: Fedora 12 (latest rawhide) with included Xen 3.4.1-5 and custom
2.6.31.5 x86_64 pv_ops dom0 kernel (a couple of days old).

Guest/domU: Fedora 12 (latest rawhide) with the included/default
2.6.31.5-122.fc12.x86_64 kernel.

> > > (On the machine I couldn't boot 2.6.31.5 as a PV guest, there
> > > was absolutely no console output. However, I think tools
> > > are out-of-date on that machine so ignore that.)
> >
> > Did you have "console=hvc0 earlyprintk=xen" in the domU kernel
> > parameters?
>
> No, but that didn't work either.
>

Ok.. then it crashes really early.

> > You might also change the xen guest cfgfile so that you have
> > on_crash=preserve and then when the PV guest is crashed run this:
> >
> > /usr/lib/xen/bin/xenctx -s System.map-domUkernelversion
> >
> > (if you have 64b host the xenctx binary might be under /usr/lib64/)
> >
> > to get a stack trace..
>
> Very interesting and useful! I was completely unaware of
> xenctx and could have used it many times in tmem development!
>
> The results explain why I can get it to run on
> one machine (an older laptop) and not run on another
> machine (a Nehalem system)... looks like this is maybe
> related to the cpuid-extended-topology-leaf bug that Jeremy
> sent a fix for upstream recently.
>

Did you try with that patch applied?

-- Pasi

> cs:eip: e019:c040342d xen_cpuid+0x46
> flags: 00001206 i nz p
> ss:esp: e021:c0779ee4
> eax: 00000001 ebx: 00000002 ecx: 00000100 edx: 00000001
> esi: c0779f1c edi: c0779f18 ebp: c0779f24
> ds: e021 es: e021 fs: 00d8 gs: 0000
> Code (instr addr c040342d)
> 24 04 8b 15 a4 02 7c c0 89 54 24 08 8b 0e 0f 0b 78 65 6e 0f a2 <89> 45 00 8b 04 24 89 18 89 0e 89
>
>
> Stack:
> c0779f20 ffffffff ffffffff c07c0360 c0779f18 c0779f1c c0779f20 c066fd0f
> c0779f18 c0779f24 00000002 16aee301 00000001 00000001 16aee301 00000002
> 0000000b c07c03cc c07c0360 c07c0360 c07c03d8 c0670ed8 c0779f58 00000001
> c07c0360 c0779f60 c066fe6a c0779f60 c0779f60 00000003 00000001 00000000
>
> Call Trace:
> [] xen_cpuid+0x46 <--
> [] detect_extended_topology+0xae
> [] init_intel+0x140
> [] init_scattered_cpuid_features+0x82
> [] identify_cpu+0x22d
> [] xen_force_evtchn_callback+0xc
> [] check_events+0x8
> [] identify_boot_cpu+0xa
> [] check_bugs+0x8
> [] start_kernel+0x2a0
> [] xen_start_kernel+0x340
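
For reference, the debugging setup suggested earlier in this thread (early
console output, plus preserving a crashed guest so xenctx can inspect it)
might look like the fragment below in an xm guest config file. This is only
a sketch: xm config files use Python syntax, and the two settings shown are
the ones named in the thread. Nothing else about the guest's config is
assumed here.

```python
# Fragment of an xm guest config file (Python syntax).
# A sketch only, not a complete guest config.

# Extra kernel command line for the PV guest: put the console on hvc0 and
# enable early printk through Xen, as suggested in the thread.
extra = "console=hvc0 earlyprintk=xen"

# Keep the domain around after a crash instead of destroying it, so that
# xenctx can still be run against it from dom0.
on_crash = "preserve"
```

With the crashed guest preserved, the xenctx command quoted above can then
be run from dom0 to print a stack trace like the one in this mail.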