From: Eric Dumazet <eric.dumazet@gmail.com>
To: Brad Campbell <brad@fnarfbargle.com>
Cc: Patrick McHardy <kaber@trash.net>,
	Bart De Schuymer <bdschuym@pandora.be>,
	kvm@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	netfilter-devel@vger.kernel.org
Subject: Re: KVM induced panic on 2.6.38[2367] & 2.6.39
Date: Wed, 08 Jun 2011 23:22:29 +0200
Message-ID: <1307568149.3980.3.camel@edumazet-laptop>
In-Reply-To: <4DEFAB15.2060905@fnarfbargle.com>

On Thursday 09 June 2011 at 01:02 +0800, Brad Campbell wrote:
> On 08/06/11 11:59, Eric Dumazet wrote:
> 
> > Well, a bisection definitely should help, but needs a lot of time in
> > your case.
> 
> Yes. compile, test, crash, walk out to the other building to press 
> reset, lather, rinse, repeat.
> 
> I need a reset button on the end of a 50M wire, or a hardware watchdog!
> 
> Actually it's not so bad. If I turn off slub debugging the kernel panics 
> and reboots itself.
> 
> This.. :
> [    2.913034] netconsole: remote ethernet address 00:16:cb:a7:dd:d1
> [    2.913066] netconsole: device eth0 not up yet, forcing it
> [    3.660062] Refined TSC clocksource calibration: 3213.422 MHz.
> [    3.660118] Switching to clocksource tsc
> [   63.200273] r8169 0000:03:00.0: eth0: unable to load firmware patch 
> rtl_nic/rtl8168e-1.fw (-2)
> [   63.223513] r8169 0000:03:00.0: eth0: link down
> [   63.223556] r8169 0000:03:00.0: eth0: link down
> 
> ..is slowing down reboots considerably. 3.0-rc does _not_ like some 
> timing hardware in my machine. Having said that, at least it does not 
> randomly panic on SCSI like 2.6.39 does.
> 
> Ok, I've ruled out TCPMSS. Found out where it was being set and neutered 
> it. I've replicated it with only the single DNAT rule.
> 
> 
> > Could you try following patch, because this is the 'usual suspect' I had
> > yesterday :
> >
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 46cbd28..9f548f9 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -792,6 +792,7 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
> >   		fastpath = atomic_read(&skb_shinfo(skb)->dataref) == delta;
> >   	}
> >
> > +#if 0
> >   	if (fastpath &&
> >   	    size + sizeof(struct skb_shared_info) <= ksize(skb->head)) {
> >   		memmove(skb->head + size, skb_shinfo(skb),
> > @@ -802,7 +803,7 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
> >   		off = nhead;
> >   		goto adjust_others;
> >   	}
> > -
> > +#endif
> >   	data = kmalloc(size + sizeof(struct skb_shared_info), gfp_mask);
> >   	if (!data)
> >   		goto nodata;
> >
> >
> >
> 
> Nope.. that's not it. <sigh> That might have changed the characteristic 
> of the fault slightly, but unfortunately I got caught with a couple of 
> fsck's, so I only got to test it 3 times tonight.
> 
> It's unfortunate that this is a production system, so I can only take it 
> down between about 9pm and 1am. That would normally be pretty 
> productive, except that an fsck of a 14TB ext4 can take 30 minutes if it 
> panics at the wrong time.
> 
> I'm out of time tonight, but I'll have a crack at some bisection 
> tomorrow night. Now I just have to go back far enough that it works, and 
> be near enough not to have to futz around with /proc /sys or drivers.
> 
> I really, really, really appreciate you guys helping me with this. It 
> has been driving me absolutely bonkers. If I'm ever in the same town as 
> any of you, dinner and drinks are on me.

Hmm, I wonder if kmemcheck could help you, but it's slow as hell, so not
appropriate for production :(
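
For anyone following the thread, the fast path that the quoted patch
compiles out with #if 0 can be sketched in userspace. This is only an
illustration under stated assumptions, not the kernel code: struct toy_buf,
toy_expand_head() and the capacity field (standing in for what ksize()
reports) are hypothetical names, and the trailer plays the role of
struct skb_shared_info.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an skb head: data followed by a trailer. */
struct toy_buf {
	unsigned char *head;  /* start of the allocation */
	size_t data_len;      /* headroom + payload + tailroom, before the trailer */
	size_t trailer_len;   /* trailer stored at head + data_len */
	size_t capacity;      /* usable bytes in the allocation (ksize() analogue) */
	bool unshared;        /* true when nobody else references head */
};

/*
 * Add nhead bytes in front and ntail bytes behind the data.
 * Returns 0 on success, -1 if allocation fails.
 * *took_fastpath reports which path was used.
 */
int toy_expand_head(struct toy_buf *b, size_t nhead, size_t ntail,
		    bool allow_fastpath, bool *took_fastpath)
{
	assert(b && b->head && took_fastpath);

	size_t size = nhead + b->data_len + ntail;

	/* Fast path: buffer is unshared and the allocation already has
	 * enough slack, so shuffle in place instead of reallocating. */
	if (allow_fastpath && b->unshared &&
	    size + b->trailer_len <= b->capacity) {
		/* Move the trailer to its new offset first (may overlap). */
		memmove(b->head + size, b->head + b->data_len, b->trailer_len);
		/* Then shift the data up to open nhead bytes in front. */
		memmove(b->head + nhead, b->head, b->data_len);
		b->data_len = size;
		*took_fastpath = true;
		return 0;
	}

	/* Slow path: allocate a new buffer and copy everything across. */
	unsigned char *data = malloc(size + b->trailer_len);
	if (!data)
		return -1;
	memcpy(data + nhead, b->head, b->data_len);
	memcpy(data + size, b->head + b->data_len, b->trailer_len);
	if (b->unshared)
		free(b->head);  /* a shared head belongs to someone else */
	b->head = data;
	b->data_len = size;
	b->capacity = size + b->trailer_len;
	b->unshared = true;
	*took_fastpath = false;
	return 0;
}
```

Passing allow_fastpath = false corresponds to the #if 0 in the patch: every
expansion then goes through a fresh allocation, which is slower but takes
the in-place memmove logic out of the picture while hunting the corruption.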




