linux-kernel.vger.kernel.org archive mirror
* Re: possible bug x86 2.4.2 SMP in IP receive stack
       [not found] <200102232259.OAA20943@frisbee.myri.com>
@ 2001-02-27 19:41 ` kuznet
  0 siblings, 0 replies; 4+ messages in thread
From: kuznet @ 2001-02-27 19:41 UTC (permalink / raw)
  To: Bob Felderman; +Cc: linux-kernel

Hello!

> Feb 23 12:42:30 rcc2 kernel: Warning: kfree_skb passed an skb still on a list (from c01f58dc).

BTW, here is a didactic example of a bug which results in similar behaviour.

Alexey


> From: andrewm@uow.EDU.AU (Andrew Morton)
> Subject: Re: Failed assertion
> Date: 27 Feb 2001 04:15:01 +0300
> 
> "David S. Miller" wrote:
> > 
> > Ralf Baechle writes:
> >  > No backtrace, the machine did continue as you'd suspect after a print.
> >  > The machine is a dual CPU Origin 200 with an IOC3 NIC.
> > 
> > What is your current kernel based upon, some older 2.4.x or
> > even 2.3.x variant?  Or is it sync'd to current?
> 
> Could this be a driver problem?  This code:
> 
>             netif_rx(skb);
> 
>             ip->rx_skbs[rx_entry] = NULL;   /* Poison  */
> 
>             new_skb = ioc3_alloc_skb(RX_BUF_ALLOC_SIZE, GFP_ATOMIC);
>             if (!new_skb) {
>                 /* Ouch, drop packet and just recycle packet
>                    to keep the ring filled.  */
>                 ip->stats.rx_dropped++;
>                 new_skb = skb;
>                 goto next;
>             }
> 
> looks scary.  We've passed an skb to the network stack,
> but we can continue to make it available to the device
> driver at the same time.
> 
> I'd suggest a printk() in there, plus perhaps do the
> alloc_skb _before_ the netif_rx().  Don't pass the skb
> to the stack if it is to be recycled.
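The reordering suggested above can be sketched in plain userspace C. This is a model only, not the IOC3 or Myrinet driver code: struct buf, struct ring, rx_one(), deliver_to_stack(), alloc_ok() and alloc_fail() are hypothetical stand-ins for the skb, the receive ring, the per-packet receive path, netif_rx() and the skb allocator. The point it illustrates is that the replacement buffer is allocated first, and the old buffer is handed to the stack only once the ring no longer refers to it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for an skb; NOT the kernel's struct sk_buff. */
struct buf {
    int on_stack_list;          /* nonzero once the "stack" has queued it */
};

#define RING_SIZE 4

/* Stand-in for a driver's receive ring plus two counters. */
struct ring {
    struct buf *slot[RING_SIZE];
    unsigned long rx_dropped;
    unsigned long delivered;
};

/* Hypothetical allocator hook, so that failure can be simulated. */
typedef struct buf *(*alloc_fn)(void);

struct buf *alloc_ok(void)   { return calloc(1, sizeof(struct buf)); }
struct buf *alloc_fail(void) { return NULL; }

/* Stand-in for netif_rx(): from here on the stack owns b. */
void deliver_to_stack(struct ring *r, struct buf *b)
{
    b->on_stack_list = 1;
    r->delivered++;
}

/*
 * Handle the received buffer in slot i.
 * Returns 1 if the packet was delivered, 0 if it was dropped.
 */
int rx_one(struct ring *r, size_t i, alloc_fn alloc)
{
    struct buf *old = r->slot[i];
    struct buf *fresh = alloc();        /* allocate BEFORE delivering */

    if (!fresh) {
        /* Drop the packet and keep the old buffer in the ring.
           The stack never saw it, so the driver is its only owner. */
        r->rx_dropped++;
        return 0;
    }

    r->slot[i] = fresh;                 /* ring no longer refers to old */
    deliver_to_stack(r, old);           /* exactly one owner: the stack */
    return 1;
}
```

With this ordering a buffer is either in the ring or on a stack queue, never both, which is the condition the kfree_skb warning is checking for.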


* Re: possible bug x86 2.4.2 SMP in IP receive stack
  2001-02-23 22:13 Bob Felderman
@ 2001-02-26  5:10 ` David S. Miller
  0 siblings, 0 replies; 4+ messages in thread
From: David S. Miller @ 2001-02-26  5:10 UTC (permalink / raw)
  To: Bob Felderman; +Cc: linux-kernel


Sounds like a bug wrt. SKB allocations in the Myrinet driver.

You're the author of most of that code, so I'm sure you're the
best one to audit it :-)

Later,
David S. Miller
davem@redhat.com


* re: possible bug x86 2.4.2 SMP in IP receive stack
@ 2001-02-23 23:03 Bob Felderman
  0 siblings, 0 replies; 4+ messages in thread
From: Bob Felderman @ 2001-02-23 23:03 UTC (permalink / raw)
  To: linux-kernel

=> From feldy Fri Feb 23 14:13:08 2001
=>
=> Feb 23 12:42:30 rcc2 kernel: Warning: kfree_skb passed an skb still on a list (from c01f58dc).
=>
=> I'm going to pop out one processor on the receiver
=> and see if that makes the problem go away.

Using a single processor on the receive side makes the problem go away.
I see no problems on the receiver with one cpu removed.

rcc2 29% netstat -i myri0 
Kernel Interface table
Iface   MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0   1500   0    37857      0      0      0    36404      0      0      0 BRU
lo    16192   0       46      0      0      0       46      0      0      0 LRU
myri0  9000   0 20564644      0      0      0      312      0      0      0 BRU





* possible bug x86 2.4.2 SMP in IP receive stack
@ 2001-02-23 22:13 Bob Felderman
  2001-02-26  5:10 ` David S. Miller
  0 siblings, 1 reply; 4+ messages in thread
From: Bob Felderman @ 2001-02-23 22:13 UTC (permalink / raw)
  To: linux-kernel; +Cc: feldy

With dual x86 processors running 2.4.2, if I blast a UDP
stream at the machine using netperf, I can easily
cause the kernel to panic with the message below.

Feb 23 12:42:30 rcc2 kernel: Warning: kfree_skb passed an skb still on a list (from c01f58dc).

I'm going to pop out one processor on the receiver
and see if that makes the problem go away.

Note that this is using a Myrinet network that is able to
get more than 1.5 Gigabit/sec UDP transfers on
single-processor x86 2.4.0 Linux. Perhaps this is reproducible
with good GigE cards with jumbo MTU turned on.

I'm also upping the socket limits:
echo "1048576" > /proc/sys/net/core/rmem_max
echo "1048576" > /proc/sys/net/core/wmem_max
echo "1048576" > /proc/sys/net/core/wmem_default
echo "1048576" > /proc/sys/net/core/rmem_default
echo "1048576" > /proc/sys/net/core/optmem_max



Feb 23 12:42:30 rcc2 kernel: Warning: kfree_skb passed an skb still on a list (from c01f58dc).

Looking up the "from c01f58dc" in the ksyms shows that
	ip_rcv
is the caller.


c01f3d38 ip_route_input_Rsmp_0a99f032
c01f44f8 ip_route_output_key_Rsmp_4ce6fe49
c01f5170 inet_add_protocol_Rsmp_a27098bd
c01f51f0 inet_del_protocol_Rsmp_0c8ae503
c01f5538 ip_rcv_Rsmp_587335e5
c01f58dc ERROR LOCATION (kfree_skb passed an skb still on a list (from c01f58dc))
c01f61dc ip_defrag_Rsmp_5532f3a2
c01f6b34 ip_options_compile_Rsmp_b8621391
c01f70ec ip_options_undo_Rsmp_9721f12f
c01f8650 ip_fragment_Rsmp_41bc67d3
c01f8bb0 ip_send_check_Rsmp_a37b7441
c01f8bf8 ip_finish_output_Rsmp_5b565e28
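The lookup above amounts to finding the symbol with the greatest address that is still less than or equal to the reported caller address. A minimal sketch in C, assuming a table sorted by ascending address; struct ksym, ksym_lookup() and sample_ksyms are hypothetical names, and the entries are copied from the listing above. Note that ksyms contains exported symbols only, so the result is the nearest preceding exported symbol; a static function located between ip_rcv and ip_defrag would not show up here.

```c
#include <stddef.h>

/* One line of a sorted ksyms listing: address and symbol name. */
struct ksym {
    unsigned long addr;
    const char *name;
};

/*
 * Return the entry with the greatest address <= pc, or NULL if pc lies
 * below the first entry. The table must be sorted by ascending address.
 */
const struct ksym *ksym_lookup(const struct ksym *table, size_t n,
                               unsigned long pc)
{
    const struct ksym *best = NULL;
    size_t lo = 0, hi = n;

    /* Binary search for the last entry whose address does not exceed pc. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (table[mid].addr <= pc) {
            best = &table[mid];
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return best;
}

/* A few entries from the listing above (version suffixes kept as-is). */
const struct ksym sample_ksyms[] = {
    { 0xc01f5170UL, "inet_add_protocol_Rsmp_a27098bd" },
    { 0xc01f51f0UL, "inet_del_protocol_Rsmp_0c8ae503" },
    { 0xc01f5538UL, "ip_rcv_Rsmp_587335e5" },
    { 0xc01f61dcUL, "ip_defrag_Rsmp_5532f3a2" },
};
```

Calling ksym_lookup(sample_ksyms, 4, 0xc01f58dcUL) selects the ip_rcv entry, and subtracting its address from the caller address gives the offset into that symbol.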



On a different machine I have seen this:

Feb 23 12:32:20 rcc kernel: KERNEL: assertion (del_timer(&qp->timer) == 0) failed at ip_fragment.c(163):ip_frag_destroy



CONFIG_X86=y
CONFIG_ISA=y
CONFIG_UID16=y
CONFIG_EXPERIMENTAL=y
CONFIG_MODULES=y
CONFIG_MODVERSIONS=y
CONFIG_KMOD=y
CONFIG_MPENTIUMIII=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_TSC=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_PGE=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_NOHIGHMEM=y
CONFIG_SMP=y
CONFIG_HAVE_DEC_LOCK=y
CONFIG_NET=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_PCI=y
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_NAMES=y
CONFIG_HOTPLUG=y
CONFIG_SYSVIPC=y
CONFIG_SYSCTL=y
CONFIG_KCORE_ELF=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=y
CONFIG_PM=y
CONFIG_ACPI=y
CONFIG_PNP=y
CONFIG_ISAPNP=y
CONFIG_BLK_DEV_FD=y
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_BLK_DEV_IDECD=y
CONFIG_BLK_DEV_CMD640=y
CONFIG_BLK_DEV_RZ1000=y
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
CONFIG_BLK_DEV_IDE_MODES=y
CONFIG_NETDEVICES=y
CONFIG_NET_ETHERNET=y
CONFIG_NET_PCI=y
CONFIG_EEPRO100=y
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_SERIAL=y
CONFIG_UNIX98_PTYS=y
CONFIG_MOUSE=y
CONFIG_PSMOUSE=y
CONFIG_DRM=y
CONFIG_DRM_TDFX=y
CONFIG_AUTOFS_FS=y
CONFIG_AUTOFS4_FS=y
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_ISO9660_FS=y
CONFIG_PROC_FS=y
CONFIG_DEVPTS_FS=y
CONFIG_EXT2_FS=y
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
CONFIG_SUNRPC=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_MSDOS_PARTITION=y
CONFIG_NLS=y
CONFIG_VGA_CONSOLE=y


------------------------------------------------------------------
Bob Felderman                                 (626) 821-5555
Director of Software Development              (626) 821-5316 fax
Myricom Inc.                                  feldy@myri.com
325 N. Santa Anita Ave.                       http://www.myri.com
Arcadia, CA 91006
------------------------------------------------------------------

