From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Jan Beulich"
Subject: Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
Date: Fri, 05 Jul 2013 11:56:49 +0100
Message-ID: <51D6C29102000078000E2F57@nat28.tlf.novell.com>
References: <8511913.uMAmUdIO30@eistomin.edss.local> <20130517085923.GC14401@zion.uk.xensource.com> <51D57C1F.8070909@hunenet.nl> <20130704150137.GW7483@zion.uk.xensource.com> <51D6A282.4030703@hunenet.nl>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <51D6A282.4030703@hunenet.nl>
Content-Disposition: inline
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Dion Kant
Cc: Wei Liu , xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

>>> On 05.07.13 at 12:40, Dion Kant wrote:
> On 07/04/2013 05:01 PM, Wei Liu wrote:
>> --- a/drivers/xen/netfront/netfront.c
>> +++ b/drivers/xen/netfront/netfront.c
>> @@ -1306,6 +1306,7 @@ static RING_IDX xennet_fill_frags(struct netfront_info *np,
>> struct sk_buff *nskb;
>>
>> while ((nskb = __skb_dequeue(list))) {
>> + BUG_ON(nr_frags >= MAX_SKB_FRAGS);
>> struct netif_rx_response *rx =
>> RING_GET_RESPONSE(&np->rx, ++cons);
>>
>
> Integrated the patch. I obtained a crash dump and the log in it did not
> show this BUG_ON. Here is the relevant section from the log
>
> var/lib/xen/dump/domUA # crash /root/vmlinux-p1
> 2013-0705-1347.43-domUA.1.core
>
> [ 7.670132] Adding 4192252k swap on /dev/xvda1. Priority:-1 extents:1 across:4192252k SS
> [ 10.204340] NET: Registered protocol family 17
> [ 481.534979] netfront: Too many frags
> [ 487.543946] netfront: Too many frags
> [ 491.049458] netfront: Too many frags
> [ 491.491153] ------------[ cut here ]------------
> [ 491.491628] kernel BUG at drivers/xen/netfront/netfront.c:1295!

So if not the BUG_ON() from the patch above, what else does that line
have in your sources?

Jan

> [ 491.492056] invalid opcode: 0000 [#1] SMP
> [ 491.492056] Modules linked in: af_packet autofs4 xennet xenblk cdrom
> [ 491.492056] CPU 0
> [ 491.492056] Pid: 1471, comm: sshd Not tainted 3.7.10-1.16-dbg-p1-xen #8
> [ 491.492056] RIP: e030:[] []
> netif_poll+0xe4f/0xf90 [xennet]
> [ 491.492056] RSP: e02b:ffff8801f5803c60 EFLAGS: 00010202
> [ 491.492056] RAX: ffff8801f5803da0 RBX: ffff8801f1a082c0 RCX:
> 0000000180200010
> [ 491.492056] RDX: ffff8801f5803da0 RSI: ffff8801fe83ec80 RDI:
> ffff8801f03b2900
> [ 491.492056] RBP: ffff8801f5803e20 R08: 0000000000000001 R09:
> 0000000000000000
> [ 491.492056] R10: 0000000000000000 R11: 0000000000000000 R12:
> ffff8801f03b3400
> [ 491.492056] R13: 0000000000000011 R14: 000000000004327e R15:
> ffff8801f06009c0
> [ 491.492056] FS: 00007fc519f3d7c0(0000) GS:ffff8801f5800000(0000)
> knlGS:0000000000000000
> [ 491.492056] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 491.492056] CR2: 00007fc51410c400 CR3: 00000001f1430000 CR4:
> 0000000000002660
> [ 491.492056] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [ 491.492056] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [ 491.492056] Process sshd (pid: 1471, threadinfo ffff8801f1264000,
> task ffff8801f137bf00)
> [ 491.492056] Stack:
> [ 491.492056] ffff8801f5803d60 ffffffff8008503e ffff8801f0600a40
> ffff8801f0600000
> [ 491.492056] 0004328000000040 0000001200000000 ffff8801f5810570
> ffff8801f0600a78
> [ 491.492056] 0000000000000000 ffff8801f0601fb0 0004326e00000012
> ffff8801f5803d00
> [ 491.492056] Call Trace:
> [ 491.492056] [] net_rx_action+0xd5/0x250
> [ 491.492056] [] __do_softirq+0xe8/0x230
> [ 491.492056] [] call_softirq+0x1c/0x30
> [ 491.492056] [] do_softirq+0x75/0xd0
> [ 491.492056] [] irq_exit+0xb5/0xc0
> [ 491.492056] [] evtchn_do_upcall+0x295/0x2d0
> [ 491.492056] [] do_hypervisor_callback+0x1e/0x30
> [ 491.492056] [<00007fc519f97700>] 0x7fc519f976ff
> [ 491.492056] Code: ff 0f 1f 00 e8 a3 c1 40 e0 85 c0 90 75 69 44 89 ea
> 4c 89 f6 4c 89 ff e8 f0 cb ff ff c7 85 80 fe ff ff ea ff ff ff e9 7c f4
> ff ff <0f> 0b ba 12 00 00 00 48 01 d0 48 39 c1 0f 82 bd fc ff ff e9 e9
> [ 491.492056] RIP [] netif_poll+0xe4f/0xf90 [xennet]
> [ 491.492056] RSP
> [ 491.511975] ---[ end trace c9e37475f12e1aaf ]---
> [ 491.512877] Kernel panic - not syncing: Fatal exception in interrupt
>
> In the mean time Jan took the bug in bugzilla
> (https://bugzilla.novell.com/show_bug.cgi?id=826374) and created a first
> patch. I propose we continue the discussion there and post the
> conclusion in this list to finish this thread here as well.
>
>
> Dion