linux-kernel.vger.kernel.org archive mirror
* Oops in 2.6.23-rc5
@ 2007-09-01 20:57 Christian Kujau
  2007-09-02  4:19 ` Herbert Xu
  0 siblings, 1 reply; 3+ messages in thread
From: Christian Kujau @ 2007-09-01 20:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: netdev

Hi,

today I switched from 2.6.22.3 to 2.6.23-rc5 (skipped quite a few -rc 
versions due to lack of time), and the box keeps panicking under certain 
circumstances. I suspected disk related problems, because: when the box 
is up, I usually resume ~10 bittorrent files. When doing this, each
file (~200MB...1GB) is checked and disk activity is pretty high (20MB/s
or so), and after 1 minute of doing so the box panics. Every time.

However, I could not reproduce it while generating disk-io with say tar 
or rsync to the same fs. It always panicked when the torrent client(s) 
start up. As the box would not log anything via remote-syslog before 
halting, I connected a vga display. As I don't have a digital camera, I 
tried to write down some stuff: http://ww.nerdbynature.de/bits/2.6.23-rc5/
(I'll try to put the full oops there, or what was still visible of it: 
the first few(?) lines were lost, since display scrollback was not 
working, only sysrq was).

The backtrace mentions do_page_fault, error_code, tcp_rtt_estimator, 
tcp_ack_saw_timestamp, tcp_ack, tcp_rcv_established, tcp_v4_do_rcv, 
tcp_v4_rcv, ip_local_delimiter, netif_receive_skb, process_backlog, 
net_rcv_activate, __do_softirq, do_softirq - in that order. As said, the 
correct addresses will be put on above's url (Q: do I really need *all* 
the numbers? Or just a few?). These snippets made me suspect network 
related issues, because: aside from disk-io, the bittorrent clients will 
establish quite a few (~50 in total) connections to all the peers.

The box is an amd-k7, 2 NICs (forcedeth, 3c59x), 2 GB RAM, ACPI 
disabled, gcc-4.1

Thanks for looking into this,
Christian.
-- 
BOFH excuse #335:

the AA battery in the wallclock sends magnetic interference


* Re: Oops in 2.6.23-rc5
  2007-09-01 20:57 Oops in 2.6.23-rc5 Christian Kujau
@ 2007-09-02  4:19 ` Herbert Xu
  2007-09-02 12:12   ` Christian Kujau
  0 siblings, 1 reply; 3+ messages in thread
From: Herbert Xu @ 2007-09-02  4:19 UTC (permalink / raw)
  To: Christian Kujau; +Cc: linux-kernel, netdev, torvalds, davem

Christian Kujau <lists@nerdbynature.de> wrote:
> 
> today I switched from 2.6.22.3 to 2.6.23-rc5 (skipped quite a few -rc 
> versions due to lack of time), and the box keeps panicking under certain 
> circumstances. I suspected disk related problems, because: when the box 
> is up, I usually resume ~10 bittorrent files. When doing this, each
> file (~200MB...1GB) is checked and disk activity is pretty high (20MB/s
> or so), and after 1 minute of doing so the box panics. Every time.

You want this patch (by davem).

Unfortunately people are travelling so I'm not sure when it'll
get picked up by Linus.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
From: davem@davemloft.net (David Miller)

> ip is at tcp_rto_min+0x20/0x40

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 1ee7212..bbad2cd 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -560,7 +560,7 @@ static u32 tcp_rto_min(struct sock *sk)
 	struct dst_entry *dst = __sk_dst_get(sk);
 	u32 rto_min = TCP_RTO_MIN;
 
-	if (dst_metric_locked(dst, RTAX_RTO_MIN))
+	if (dst && dst_metric_locked(dst, RTAX_RTO_MIN))
 		rto_min = dst->metrics[RTAX_RTO_MIN-1];
 	return rto_min;
 }
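
For readers who want to see why the one-line change matters: the oops at tcp_rto_min+0x20 is consistent with __sk_dst_get() returning NULL for a socket that has no cached route, after which dst_metric_locked() dereferences it. Below is a minimal user-space sketch of the patched logic. Everything in it is an assumption made for illustration: the *_model function names are hypothetical, the structs are stripped-down stand-ins rather than the kernel's definitions, and the constant values are placeholders.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int u32;

/* Illustrative stand-ins only; values are placeholders, not kernel ABI. */
enum { RTAX_LOCK = 1, RTAX_RTO_MIN = 13, RTAX_MAX = 16 };
#define TCP_RTO_MIN 200u /* hypothetical default */

struct dst_entry { u32 metrics[RTAX_MAX]; };
struct sock { struct dst_entry *sk_dst_cache; };

/* Models __sk_dst_get(): may legitimately return NULL. */
static struct dst_entry *sk_dst_get_model(const struct sock *sk)
{
	return sk->sk_dst_cache;
}

/* Metrics are stored 0-based but numbered from 1. */
static u32 dst_metric_model(const struct dst_entry *dst, int metric)
{
	assert(metric >= 1 && metric <= RTAX_MAX);
	return dst->metrics[metric - 1];
}

/* A metric is "locked" when its bit is set in the RTAX_LOCK metric.
 * Note that this dereferences dst unconditionally. */
static int dst_metric_locked_model(const struct dst_entry *dst, int metric)
{
	return (dst_metric_model(dst, RTAX_LOCK) & (1u << metric)) != 0;
}

/* Patched logic: test dst before handing it to the helper. */
static u32 tcp_rto_min_model(const struct sock *sk)
{
	const struct dst_entry *dst = sk_dst_get_model(sk);
	u32 rto_min = TCP_RTO_MIN;

	if (dst && dst_metric_locked_model(dst, RTAX_RTO_MIN))
		rto_min = dst->metrics[RTAX_RTO_MIN - 1];
	return rto_min;
}
```

With a struct sock whose sk_dst_cache is NULL, tcp_rto_min_model() falls back to TCP_RTO_MIN instead of faulting; dropping the "dst &&" test would make the same call dereference a NULL pointer inside dst_metric_model().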


* Re: Oops in 2.6.23-rc5
  2007-09-02  4:19 ` Herbert Xu
@ 2007-09-02 12:12   ` Christian Kujau
  0 siblings, 0 replies; 3+ messages in thread
From: Christian Kujau @ 2007-09-02 12:12 UTC (permalink / raw)
  To: Herbert Xu; +Cc: linux-kernel, netdev, torvalds, davem

On Sun, 2 Sep 2007, Herbert Xu wrote:
> You want this patch (by davem).

I applied the patch and the box has been up for 1hr now. Since I was 
able to reproduce the oops pretty reliably with this bittorrent thingy, 
I did the same a few times now, but the box did NOT crash :)

> Unfortunately people are travelling so I'm not sure when it'll
> get picked up by Linus.

I've seen this patch only in:
http://article.gmane.org/gmane.linux.network/70781

And, for the archives, a similar looking error report:
http://article.gmane.org/gmane.linux.network/70777

Thanks for the quick reply, Herbert!

Christian.
-- 
BOFH excuse #297:

Too many interrupts

