From: Eric Dumazet
Subject: Re: Bug, kernel panic, NULL dereference , cleanup_once / icmp_route_lookup.clone.19.clone / nat , 2.6.39-rc7-git11
Date: Wed, 18 May 2011 11:37:51 +0200
Message-ID: <1305711471.2983.27.camel@edumazet-laptop>
References: <54ec5cd14e5e5c76aa06c2e6899299ce@visp.net.lb> <41a1892fed59b411bb08d3ecb0d8cda5@visp.net.lb>
In-Reply-To: <41a1892fed59b411bb08d3ecb0d8cda5@visp.net.lb>
To: Denys Fedoryshchenko
Cc: netdev@vger.kernel.org

On Wednesday, 18 May 2011 at 12:27 +0300, Denys Fedoryshchenko wrote:
> On Wed, 18 May 2011 01:16:29 +0300, Denys Fedoryshchenko wrote:
> > Just got recently. 32Bit, PPPoE NAS, shapers, firewall, NAT
> > Kernel i mention in subject, 2.6.39-rc7-git11
> > If required i can give more information
> >
> > sharanal (sorry for ugly name) is libpcap based traffic analyser,
> > sure userspace
> >
> Here is some info, i hope it will be a little useful
>
> (gdb) l *(cleanup_once + 0x49)
> 0xc02e85cc is in cleanup_once (include/linux/list.h:88).
> 83       * This is only for internal list manipulation where we know
> 84       * the prev/next entries already!
> 85       */
> 86      static inline void __list_del(struct list_head * prev, struct list_head * next)
> 87      {
> 88              next->prev = prev;
> 89              prev->next = next;
> 90      }
> 91
> 92      /**
>
> (gdb) l *(inet_getpeer + 0x2ab)
> 0xc02e8ae8 is in inet_getpeer (net/ipv4/inetpeer.c:530).
> 525             if (base->total >= inet_peer_threshold)
> 526                     /* Remove one less-recently-used entry. */
> 527                     cleanup_once(0, stack);
> 528
> 529             return p;
> 530     }
> 531
> 532     static int compute_total(void)
> 533     {
> 534             return v4_peers.total + v6_peers.total;
>

I am really beginning to think we have a bug here...

In previous reports, I suggested using slub_nomerge because I thought
corruption from another kernel layer was going on (inetpeer was using
64-byte objects).

But now that inetpeer objects are bigger and sit in another kmemcache,
it's bad news.

Could you try this, and possibly add some SLUB debugging as well?
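
One way to turn on the SLUB debugging mentioned above is through boot parameters. This is a minimal sketch, not taken from the original report, and it assumes the kernel was built with CONFIG_SLUB_DEBUG=y. The cache name inet_peer_cache is an assumption about what the inetpeer kmemcache is called; it should be checked against /proc/slabinfo on the affected machine before use.

```
# Kernel command line additions (sketch; pick one of the two forms)

# Debug every slab cache and keep caches unmerged:
#   F = sanity checks, Z = red zoning, P = poisoning, U = alloc/free owner tracking
slub_debug=FZPU slub_nomerge

# Or limit the overhead to a single cache (cache name is an assumption):
slub_debug=FZPU,inet_peer_cache
```

With these options set, SLUB should have a better chance of reporting a corrupted or already-freed object when it is detected, rather than the machine faulting later inside __list_del().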