linux-kernel.vger.kernel.org archive mirror
* RE: skb allocation problems (More Brain damage!)
@ 2001-04-11 19:02 Imran.Patel
  0 siblings, 0 replies; 9+ messages in thread
From: Imran.Patel @ 2001-04-11 19:02 UTC (permalink / raw)
  To: ak, Imran.Patel; +Cc: netdev, linux-kernel

> What you can try is to turn on slab debugging. Set the  FORCED_DEBUG
> define in mm/slab.c to one and recompile. Does it change any pattern
> when you dump the data in the skbs or pings? 
> If yes someone is playing with already freed packets.

I think the dump that I got suggests something stranger than that. This
is what I can make of the dump:

this is the ip header (with src addr: 192.168.102.22 and dest addr:
192.168.102.29)
45 0 0 80 0 0 40 0 ff 1 2d f8 c0 a8 66 16 c0 a8 66 1d 

this is the icmp header (echo reply)
0 0 e4 48 11 d 0 0 

the regular ping data follows
14 5d d4 3a 63 1 a 0 8 9 a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c
1d 1e 1f 20 21 22 23

Now, the data should continue with 24, 25, 26, ... but instead the outer IP
and ICMP headers and the data (as above) follow again, this time with the
ICMP checksum field zeroed:
45 0 0 80 0 0 40 0 ff 1 2d f8 c0 a8 66 16 c0 a8 66 1d 0 0
0 0 11 d 0 0 14 5d d4 3a 63 1 a 0 8 9 a b c d e f 10 11 12 13 14 15 16 17 18
19 1a 1b 1c 1d 1e 1f 20 21 22 23
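
For readers who want to check the field-by-field reading of the dump above, here is a minimal sketch that decodes the first 28 bytes as a 20-byte IPv4 header followed by an 8-byte ICMP echo header. The helper names (`parse_hex`, `decode_headers`) are made up for this illustration, and it assumes the header carries no IP options, which the `45` in the first byte suggests.

```python
import struct

# Bytes copied from the dump above (hex values, leading zeros omitted).
DUMP = """
45 0 0 80 0 0 40 0 ff 1 2d f8 c0 a8 66 16 c0 a8 66 1d
0 0 e4 48 11 d 0 0
"""


def parse_hex(text):
    """Turn whitespace-separated hex values into a bytes object."""
    return bytes(int(tok, 16) for tok in text.split())


def decode_headers(raw):
    """Decode a 20-byte IPv4 header followed by an 8-byte ICMP echo header."""
    (ver_ihl, tos, total_len, ident, frag, ttl, proto, ip_csum,
     src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    icmp_type, icmp_code, icmp_csum, icmp_id, icmp_seq = struct.unpack(
        "!BBHHH", raw[20:28])
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0x0F) * 4,   # IHL is counted in 32-bit words
        "total_len": total_len,
        "ttl": ttl,
        "protocol": proto,                    # 1 means ICMP
        "src": ".".join(str(b) for b in src),
        "dst": ".".join(str(b) for b in dst),
        "icmp_type": icmp_type,               # 0 means echo reply
        "icmp_code": icmp_code,
        "icmp_checksum": icmp_csum,
        "icmp_id": icmp_id,
        "icmp_seq": icmp_seq,
    }


fields = decode_headers(parse_hex(DUMP))
for name, value in fields.items():
    print(f"{name:14} {value}")
```

Decoded this way, the address bytes `c0 a8 66 16` and `c0 a8 66 1d` come out as 192.168.102.22 and 192.168.102.29, and the total length field `0 80` is 128 bytes, which matches the 100(128) byte ping reported elsewhere in the thread.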

it is very hard to imagine the scenario which can lead to this...
I will try your suggestion..

> And what NIC are you using btw?
as i said earlier, Intel Ethernet Pro 100...

imran



* RE: skb allocation problems (More Brain damage!)
@ 2001-04-11 20:02 Manfred Spraul
  0 siblings, 0 replies; 9+ messages in thread
From: Manfred Spraul @ 2001-04-11 20:02 UTC (permalink / raw)
  To: Imran.Patel; +Cc: linux-kernel

> it is very hard to imagine the scenario which can lead to this...
> I will try your suggestion..

Perhaps a problem with the csum assembler implementations? Which cpu
type do you optimize for, and which cpu is installed?
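
One way to tell a checksum bug apart from plain data corruption is to recompute the checksums from the dumped bytes with a slow reference implementation. The following is only a sketch: it assumes the 128 bytes Imran posted at 18:22 are the complete packet, and the function names are invented for the example. It uses the usual property that a block containing a correct Internet checksum sums to 0xffff in one's-complement arithmetic.

```python
# The 128-byte packet as posted in the thread (hex, leading zeros omitted).
DUMP = """
45 0 0 80 0 0 40 0 ff 1 2d f8 c0 a8 66 16 c0 a8 66 1d 0 0 e4 48 11 d 0 0 14
5d d4 3a 63 1 a 0 8 9 a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d
1e 1f 20 21 22 23 45 0 0 80 0 0 40 0 ff 1 2d f8 c0 a8 66 16 c0 a8 66 1d 0 0
0 0 11 d 0 0 14 5d d4 3a 63 1 a 0 8 9 a b c d e f 10 11 12 13 14 15 16 17 18
19 1a 1b 1c 1d 1e 1f 20 21 22 23
"""


def parse_hex(text):
    """Turn whitespace-separated hex values into a bytes object."""
    return bytes(int(tok, 16) for tok in text.split())


def ones_complement_sum(data):
    """One's-complement sum of 16-bit big-endian words, folded to 16 bits."""
    if len(data) % 2:
        data += b"\x00"          # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:           # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return total


packet = parse_hex(DUMP)
ip_ok = ones_complement_sum(packet[:20]) == 0xFFFF    # IP header only
icmp_ok = ones_complement_sum(packet[20:]) == 0xFFFF  # ICMP header + data
print("IP header checksum valid:  ", ip_ok)
print("ICMP message checksum valid:", icmp_ok)
```

On the posted bytes both checks pass, including the ICMP checksum over the duplicated payload. That would be consistent with the checksum having been computed after the data was already wrong, though it does not by itself say which code path did the duplicating.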

Btw, are you overclocking anything?

--
    Manfred






* Re: skb allocation problems (More Brain damage!)
  2001-04-11 17:15 Imran.Patel
  2001-04-11 17:20 ` Dave Airlie
  2001-04-11 17:47 ` Bart Trojanowski
@ 2001-04-11 18:28 ` Andi Kleen
  2 siblings, 0 replies; 9+ messages in thread
From: Andi Kleen @ 2001-04-11 18:28 UTC (permalink / raw)
  To: Imran.Patel; +Cc: ak, netfilter-devel, netdev, linux-kernel

On Wed, Apr 11, 2001 at 08:15:49PM +0300, Imran.Patel@nokia.com wrote:
> And as I said earlier, only ping packets with size within certain range
> create this problem......Something is terribly wrong here!! But as I am not
> a Linux mm guru, i can't tell what is wrong here!

What you can try is to turn on slab debugging. Set the  FORCED_DEBUG
define in mm/slab.c to one and recompile. Does it change any pattern
when you dump the data in the skbs or pings? 
If yes someone is playing with already freed packets.
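
As a rough illustration of what "does it change any pattern" could mean in practice: slab debugging fills freed objects with a poison byte, so a dump taken from a buffer that is used after being freed would be expected to contain runs of that byte. The sketch below scans a dump for such runs. The value 0x5a is an assumption about the poison byte in 2.4-era mm/slab.c (check the definition in your own tree), and the sample buffer is entirely hypothetical, not data from this thread.

```python
POISON_BYTE = 0x5A  # ASSUMPTION: poison fill value; verify against mm/slab.c


def longest_run(data, value):
    """Return (start, length) of the longest run of `value` in `data`.

    Returns (-1, 0) when the value does not occur at all.
    """
    best_start, best_len = -1, 0
    run_start, run_len = 0, 0
    for i, b in enumerate(data):
        if b == value:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


# Hypothetical dump: four header bytes followed by poisoned memory.
sample = bytes([0x45, 0x00, 0x00, 0x80]) + bytes([POISON_BYTE]) * 12
start, length = longest_run(sample, POISON_BYTE)
print(f"longest poison run: {length} bytes at offset {start}")
```

A long run showing up where packet data should be would point towards someone touching an skb after it was freed; no such run would leave the question open.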

Furthermore you can instrument other parts with good old printk.

And what NIC are you using btw?


-Andi


* Re: skb allocation problems (More Brain damage!)
  2001-04-11 17:47 ` Bart Trojanowski
@ 2001-04-11 18:25   ` Andi Kleen
  0 siblings, 0 replies; 9+ messages in thread
From: Andi Kleen @ 2001-04-11 18:25 UTC (permalink / raw)
  To: Bart Trojanowski; +Cc: Imran.Patel, ak, netfilter-devel, netdev, LKML

On Wed, Apr 11, 2001 at 01:47:18PM -0400, Bart Trojanowski wrote:
> 
> Could the problem be in the NIC driver not in the alloc_skb?  I have used
> both 2.4.{1,3} for some time and never seen this corruption.  I use ping
> -f with various packet sizes for stress testing my IPSec boxes... these do
> quite a bit of extra skb creation as an IPSec header sometimes does not
> fit in the original skb.  No problems yet.
> 
> My gut tells me to blame the NIC driver or the NIC itself.

The NIC is not directly involved in alloc_skb() (except maybe if it corrupts
internal data structures of the allocator) 

-Andi


* RE: skb allocation problems (More Brain damage!)
@ 2001-04-11 18:22 Imran.Patel
  0 siblings, 0 replies; 9+ messages in thread
From: Imran.Patel @ 2001-04-11 18:22 UTC (permalink / raw)
  To: Imran.Patel; +Cc: netfilter-devel, netdev, linux-kernel

> Could the problem be in the NIC driver not in the alloc_skb?  
No, I don't think so... I got a dump of the packet at the local_out and
post routing hooks and found it already in bad shape there. Here is what it
looks like:

45 0 0 80 0 0 40 0 ff 1 2d f8 c0 a8 66 16 c0 a8 66 1d 0 0 e4 48 11 d 0 0 14
5d d4 3a 63 1 a 0 8 9 a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d
1e 1f 20 21 22 23 45 0 0 80 0 0 40 0 ff 1 2d f8 c0 a8 66 16 c0 a8 66 1d 0 0
0 0 11 d 0 0 14 5d d4 3a 63 1 a 0 8 9 a b c d e f 10 11 12 13 14 15 16 17 18
19 1a 1b 1c 1d 1e 1f 20 21 22 23
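
The structure of this dump is easier to see when the packet is split in two. A small sketch, assuming the bytes above are the complete 128-byte packet (the helper names are invented for the example): it compares the first 64 bytes with the last 64 and lists the offsets where they differ.

```python
# The packet dump from above (hex, leading zeros omitted).
DUMP = """
45 0 0 80 0 0 40 0 ff 1 2d f8 c0 a8 66 16 c0 a8 66 1d 0 0 e4 48 11 d 0 0 14
5d d4 3a 63 1 a 0 8 9 a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d
1e 1f 20 21 22 23 45 0 0 80 0 0 40 0 ff 1 2d f8 c0 a8 66 16 c0 a8 66 1d 0 0
0 0 11 d 0 0 14 5d d4 3a 63 1 a 0 8 9 a b c d e f 10 11 12 13 14 15 16 17 18
19 1a 1b 1c 1d 1e 1f 20 21 22 23
"""


def parse_hex(text):
    """Turn whitespace-separated hex values into a bytes object."""
    return bytes(int(tok, 16) for tok in text.split())


def differing_offsets(a, b):
    """Offsets at which two equal-length byte strings differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


packet = parse_hex(DUMP)
half = len(packet) // 2
first, second = packet[:half], packet[half:]
diff = differing_offsets(first, second)

print("packet length:", len(packet))
print("offsets where the two halves differ:", diff)
```

The two halves differ only at offsets 22 and 23, which is the ICMP checksum field (`e4 48` in the first half, `0 0` in the second). In other words, the last 64 bytes look like a copy of the first 64 taken before the ICMP checksum was filled in.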


> My gut tells me to blame the NIC driver or the NIC itself.
btw, the card is an Intel Ethernet Pro 100.

imran



* RE: skb allocation problems (More Brain damage!)
  2001-04-11 17:15 Imran.Patel
  2001-04-11 17:20 ` Dave Airlie
@ 2001-04-11 17:47 ` Bart Trojanowski
  2001-04-11 18:25   ` Andi Kleen
  2001-04-11 18:28 ` Andi Kleen
  2 siblings, 1 reply; 9+ messages in thread
From: Bart Trojanowski @ 2001-04-11 17:47 UTC (permalink / raw)
  To: Imran.Patel; +Cc: ak, netfilter-devel, netdev, LKML


Could the problem be in the NIC driver not in the alloc_skb?  I have used
both 2.4.{1,3} for some time and never seen this corruption.  I use ping
-f with various packet sizes for stress testing my IPSec boxes... these do
quite a bit of extra skb creation as an IPSec header sometimes does not
fit in the original skb.  No problems yet.

My gut tells me to blame the NIC driver or the NIC itself.

B.

On Wed, 11 Apr 2001, Imran.Patel@nokia.com wrote:

> > Well, I don't know then. You have to debug it. It's probably something
> > stupid (if fundamental services like alloc_skb/kfree_skb were completely
> > buggy someone surely would have noticed earlier)
>
> yep, at first i thought it was because of some stupidity in my module...but
> now it seems that actually it is not my code which is doing something
> stupid....just now i have found out that even simple ping faces similar
> problems ....here is the output that i get when i ping from the host
> 192.168.102.29 (runs 2.4.1) to 192.168.102.22 (runs 2.4.3) (Note: I don't
> insert any kernel modules of my own on these machines):
>
>
> PING 192.168.102.22 (192.168.102.22) from 192.168.102.29 : 100(128) bytes of
> data.
> 108 bytes from hobbes.sr.ntc.nokia.com (192.168.102.22): icmp_seq=0 ttl=255
> time=36.5 ms
> wrong data byte #36 should be 0x24 but was 0x45
> 	19 45 d4 3a e 7a a 0 8 9 a b c d e f 10 11 12 13 14 15 16 17 18 19
> 1a 1b 1c 1d 1e 1f
> 	20 21 22 23 45 0 0 80 0 0 40 0 ff 1 2d f8 c0 a8 66 16 c0 a8 66 1d 0
> 0 0 0 4 c 0 0
> 	19 45 d4 3a e 7a a 0 8 9 a b c d e f 10 11 12 13 14 15 16 17 18 19
> 1a 1b
>
> --- 192.168.102.22 ping statistics ---
> 1 packets transmitted, 1 packets received, 0% packet loss
> round-trip min/avg/max = 36.5/36.5/36.5 ms
>
>
> Note that the problem starts with byte #36 which goes on like " 45 0 0 80 0
> ......." which is in fact the outer IP header!! So certainly there are
> buffer overruns on the other end (host 192.168.102.22)....
>
> And as I said earlier, only ping packets with size within certain range
> create this problem......Something is terribly wrong here!! But as I am not
> a Linux mm guru, i can't tell what is wrong here!
>
>
> regards,
> imran
>

-- 
	WebSig: http://www.jukie.net/~bart/sig/




* RE: skb allocation problems (More Brain damage!)
@ 2001-04-11 17:24 Imran.Patel
  0 siblings, 0 replies; 9+ messages in thread
From: Imran.Patel @ 2001-04-11 17:24 UTC (permalink / raw)
  To: airlied, Imran.Patel; +Cc: netfilter-devel, netdev, linux-kernel


> -----Original Message-----
> From: ext Dave Airlie [mailto:airlied@csn.ul.ie]
> Sent: 11. April 2001 20:20
> To: Imran.Patel@nokia.com
> Cc: ak@suse.de; netfilter-devel@us5.samba.org; netdev@oss.sgi.com;
> linux-kernel@vger.kernel.org
> Subject: RE: skb allocation problems (More Brain damage!)
> 
> 
> 
> What compiler are you using to compile the kernel?

gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)


imran


* RE: skb allocation problems (More Brain damage!)
  2001-04-11 17:15 Imran.Patel
@ 2001-04-11 17:20 ` Dave Airlie
  2001-04-11 17:47 ` Bart Trojanowski
  2001-04-11 18:28 ` Andi Kleen
  2 siblings, 0 replies; 9+ messages in thread
From: Dave Airlie @ 2001-04-11 17:20 UTC (permalink / raw)
  To: Imran.Patel; +Cc: ak, netfilter-devel, netdev, linux-kernel


What compiler are you using to compile the kernel?

Dave.

On Wed, 11 Apr 2001 Imran.Patel@nokia.com wrote:

> > Well, I don't know then. You have to debug it. It's probably something
> > stupid (if fundamental services like alloc_skb/kfree_skb were completely
> > buggy someone surely would have noticed earlier)
>
> yep, at first i thought it was because of some stupidity in my module...but
> now it seems that actually it is not my code which is doing something
> stupid....just now i have found out that even simple ping faces similar
> problems ....here is the output that i get when i ping from the host
> 192.168.102.29 (runs 2.4.1) to 192.168.102.22 (runs 2.4.3) (Note: I don't
> insert any kernel modules of my own on these machines):
>
>
> PING 192.168.102.22 (192.168.102.22) from 192.168.102.29 : 100(128) bytes of
> data.
> 108 bytes from hobbes.sr.ntc.nokia.com (192.168.102.22): icmp_seq=0 ttl=255
> time=36.5 ms
> wrong data byte #36 should be 0x24 but was 0x45
> 	19 45 d4 3a e 7a a 0 8 9 a b c d e f 10 11 12 13 14 15 16 17 18 19
> 1a 1b 1c 1d 1e 1f
> 	20 21 22 23 45 0 0 80 0 0 40 0 ff 1 2d f8 c0 a8 66 16 c0 a8 66 1d 0
> 0 0 0 4 c 0 0
> 	19 45 d4 3a e 7a a 0 8 9 a b c d e f 10 11 12 13 14 15 16 17 18 19
> 1a 1b
>
> --- 192.168.102.22 ping statistics ---
> 1 packets transmitted, 1 packets received, 0% packet loss
> round-trip min/avg/max = 36.5/36.5/36.5 ms
>
>
> Note that the problem starts with byte #36 which goes on like " 45 0 0 80 0
> ......." which is in fact the outer IP header!! So certainly there are
> buffer overruns on the other end (host 192.168.102.22)....
>
> And as I said earlier, only ping packets with size within certain range
> create this problem......Something is terribly wrong here!! But as I am not
> a Linux mm guru, i can't tell what is wrong here!
>
>
> regards,
> imran
>
>

-- 
David Airlie, Software Engineer
http://www.skynet.ie/~airlied / airlied@skynet.ie
pam_smb / Linux DecStation / Linux VAX / ILUG person




* RE: skb allocation problems (More Brain damage!)
@ 2001-04-11 17:15 Imran.Patel
  2001-04-11 17:20 ` Dave Airlie
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Imran.Patel @ 2001-04-11 17:15 UTC (permalink / raw)
  To: ak, Imran.Patel; +Cc: netfilter-devel, netdev, linux-kernel

> Well, I don't know then. You have to debug it. It's probably something
> stupid (if fundamental services like alloc_skb/kfree_skb were completely
> buggy someone surely would have noticed earlier)

yep, at first i thought it was because of some stupidity in my module...but
now it seems that actually it is not my code which is doing something
stupid....just now i have found out that even simple ping faces similar
problems ....here is the output that i get when i ping from the host
192.168.102.29 (runs 2.4.1) to 192.168.102.22 (runs 2.4.3) (Note: I don't
insert any kernel modules of my own on these machines):


PING 192.168.102.22 (192.168.102.22) from 192.168.102.29 : 100(128) bytes of data.
108 bytes from hobbes.sr.ntc.nokia.com (192.168.102.22): icmp_seq=0 ttl=255 time=36.5 ms
wrong data byte #36 should be 0x24 but was 0x45
	19 45 d4 3a e 7a a 0 8 9 a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f
	20 21 22 23 45 0 0 80 0 0 40 0 ff 1 2d f8 c0 a8 66 16 c0 a8 66 1d 0 0 0 0 4 c 0 0
	19 45 d4 3a e 7a a 0 8 9 a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b

--- 192.168.102.22 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 36.5/36.5/36.5 ms


Note that the problem starts with byte #36 which goes on like " 45 0 0 80 0
......." which is in fact the outer IP header!! So certainly there are
buffer overruns on the other end (host 192.168.102.22)....
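
The "byte #36" that ping complains about can be reproduced from the dump. This is a minimal sketch under two assumptions read off the output above: the first 8 data bytes are ping's timestamp, and every later data byte at index i is expected to hold the value i. The constant and function names are invented for the example.

```python
# Data bytes as printed by ping above (hex, leading zeros omitted).
PING_DATA = """
19 45 d4 3a e 7a a 0 8 9 a b c d e f 10 11 12 13 14 15 16 17 18 19
1a 1b 1c 1d 1e 1f
20 21 22 23 45 0 0 80 0 0 40 0 ff 1 2d f8 c0 a8 66 16 c0 a8 66 1d 0
0 0 0 4 c 0 0
19 45 d4 3a e 7a a 0 8 9 a b c d e f 10 11 12 13 14 15 16 17 18 19
1a 1b
"""

TIMESTAMP_LEN = 8     # ASSUMPTION: timestamp occupies the first 8 data bytes
IP_HEADER_LEN = 20    # no IP options (first header byte is 0x45)
ICMP_HEADER_LEN = 8


def parse_hex(text):
    """Turn whitespace-separated hex values into a bytes object."""
    return bytes(int(tok, 16) for tok in text.split())


def first_bad_byte(data):
    """Index of the first data byte that breaks the 'byte i == i' pattern."""
    for i in range(TIMESTAMP_LEN, len(data)):
        if data[i] != (i & 0xFF):
            return i
    return None


data = parse_hex(PING_DATA)
bad = first_bad_byte(data)
packet_offset = IP_HEADER_LEN + ICMP_HEADER_LEN + bad

print(f"first bad data byte: #{bad} (got {data[bad]:#x}, expected {bad:#x})")
print(f"offset within the IP packet: {packet_offset}")
```

Data byte #36 sits at offset 64 of the 128-byte IP packet, so the corruption starts exactly halfway through the packet, and what follows from there begins with the bytes of the packet's own IP header.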

And as I said earlier, only ping packets with size within certain range
create this problem......Something is terribly wrong here!! But as I am not
a Linux mm guru, i can't tell what is wrong here!


regards,
imran


