From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1757719Ab1GKVKO (ORCPT );
	Mon, 11 Jul 2011 17:10:14 -0400
Received: from acsinet15.oracle.com ([141.146.126.227]:64776 "EHLO
	acsinet15.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752961Ab1GKVKM (ORCPT );
	Mon, 11 Jul 2011 17:10:12 -0400
Date: Mon, 11 Jul 2011 17:09:54 -0400
From: Konrad Rzeszutek Wilk
To: "Paul E. McKenney"
Cc: xen-devel@lists.xensource.com, julie Sullivan,
	linux-kernel@vger.kernel.org
Subject: Re: PROBLEM: 3.0-rc kernels unbootable since -rc3
Message-ID: <20110711210954.GA15745@dumpdata.com>
References: <20110710171626.GK6014@linux.vnet.ibm.com>
	<20110710173530.GA16954@linux.vnet.ibm.com>
	<20110710214639.GP6014@linux.vnet.ibm.com>
	<20110710231449.GQ6014@linux.vnet.ibm.com>
	<20110711162450.GA22913@dumpdata.com>
	<20110711171337.GK2245@linux.vnet.ibm.com>
	<20110711193021.GA2996@dumpdata.com>
	<20110711201508.GN2245@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20110711201508.GN2245@linux.vnet.ibm.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Source-IP: rtcsinet22.oracle.com [66.248.204.30]
X-CT-RefId: str=0001.0A090204.4E1B66B0.0030:SCFSTAT5015188,ss=1,re=-4.000,fgs=0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 11, 2011 at 01:15:08PM -0700, Paul E. McKenney wrote:
> On Mon, Jul 11, 2011 at 03:30:22PM -0400, Konrad Rzeszutek Wilk wrote:
> > >
> > > Hmmm... Does the stall repeat about every 3.5 minutes after the first stall?
> >
> > Starting Configure read-only root support...
> > [ 81.335070] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=60002 jiffies)
> > [ 81.335091] sending NMI to all CPUs:
> > [ 261.367071] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=240034 jiffies)
> > [ 261.367092] sending NMI to all CPUs:
> > [ 441.399066] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=420066 jiffies)
> > [ 441.399089] sending NMI to all CPUs:
>
> OK, then the likely cause is something hanging onto the CPU. Do the later
> stalls also show stack traces? If so, what shows up?

I don't really get any stack traces from the guest. Not sure why it does not
print them out (probably b/c the NMI functionality is not accessible somehow?).

I get the stack traces using a 'xenctx' tool and this is what I get from the
guest before the stall, and after the stall:

20:45:56 # 12 :/mnt/tmp/FC15-32/ /usr/lib64/xen/bin/xenctx 29 -s System.map-3.0.0-rc6-disabled-options+ -a 2
cs:eip: 0061:c042d0f5 task_waking_fair+0x14
flags: 00001286 i s nz p
ss:esp: 0069:e94cff0c
eax: c18dbed0 ebx: ffffffff ecx: fff00000 edx: c14a10c0
esi: 00000000 edi: 00000000 ebp: e94cff18
 ds: 007b  es: 007b  fs: 00d8  gs: 00e0
cr0: 8005003b
cr2: b7743000
cr3: 97348001
cr4: 00000660
dr0: 00000000
dr1: 00000000
dr2: 00000000
dr3: 00000000
dr6: ffff0ff0
dr7: 00000400
Code (instr addr c042d0f5)
c3 55 89 e5 57 56 53 3e 8d 74 26 00 8b 90 58 01 00 00 8b 7a 1c <8b> 72 20 8b 5a 18 8b 4a 14 39 f3

Stack:
 c18dbed0 00000003 00000002 e94cff38 c0439a45 c18d00c0 c18dc2c0 00000000
 e8bd1ec4 e8bd1ef8 00000003 e94cff40 c0439b0c e94cff64 c042d4db 00000000
 e8bd1f04 00000001 00000001 e8bd1f00 e8bd0200 e8bd1efc e94cff80 c042ea69
 00000000 00000000 e8bd1ef4 ea9c4918 c0a43a80 e94cff88 c0455e14 e94cffb4

Call Trace:
  [] task_waking_fair+0x14 <--
  [] try_to_wake_up+0xb2
  [] default_wake_function+0x10
  [] __wake_up_common+0x3b
  [] complete+0x3e
  [] wakeme_after_rcu+0x10
  [] __rcu_process_callbacks+0x172
  [] rcu_process_callbacks+0x1e
  [] __do_softirq+0xa2