linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG] DEADLOCK when removing a bridge on 2.4.19-pre6
@ 2002-04-10  1:53 Mike Fedyk
  2002-04-10  3:57 ` Andreas Dilger
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Mike Fedyk @ 2002-04-10  1:53 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jeff Garzik, David S. Miller, Andrew Morton

Hi,

*read below for a problem report against the tulip network driver[1]

It looks like the bridging feature of the kernel has changed between 2.4.17
and 18.

I have a machine running 2.4.16 running as a bride without problem.  While
testing the network cards for another bridge I had to restart.
Unfortunately, it didn't complete.  Nothing would respond except for sysrq.
I was able to sync, unmount, and reboot successfully.

Debian does this for you automatically, but if you do it manually it does
the same thing.

o Setup a bridge like normal
o start it up
o ifconfig br0 down
o brctl delbr br0
*boom*

2.4.16,17 work just fine, but 2.4.18 does not.  Looking through the
changelog, 2.4.18-pre4, or 5 look suspect.  Also, it is not fixed in
2.4.19-pre6.  I can test the 2.4.18-pre kernels if you'd like, just let me
know.

[1] Also several times I was able to get my tulip network card to stop
forwarding packets after a soft-reboot from the lockup mentioned above.  I
was able to reproduce this in 2.4.17, 2.4.18, 2.4.19-pre3-ac4 (possibly
2.4.19-pre5-ac3 though didn't check to make sure but I used it several times
during my testing), and 2.4.19-pre6.  None of the other network cards in
this machine did this (3c509-isa, pcnet32-pci, eepro100-pci)

I don't have a serial cable ATM but I could buy one in a few days if a
ksymoopsed sysrq-t would help.

00:00.0 Host bridge: Intel Corp. 430HX - 82439HX TXC [Triton II] (rev 03)
00:01.0 ISA bridge: Intel Corp. 82371SB PIIX3 ISA [Natoma/Triton II] (rev 01)
00:01.1 IDE interface: Intel Corp. 82371SB PIIX3 IDE [Natoma/Triton II]
00:01.2 USB Controller: Intel Corp. 82371SB PIIX3 USB [Natoma/Triton II] (rev 01)
00:06.0 Ethernet controller: Lite-On Communications Inc LNE100TX (rev 20)
00:07.0 Ethernet controller: Intel Corp. 82557 [Ethernet Pro 100] (rev 02)
00:08.0 VGA compatible controller: S3 Inc. 86c764/765 [Trio32/64/64V+] (rev 54)
00:0b.0 Ethernet controller: Advanced Micro Devices [AMD] 79c970 [PCnet LANCE] (rev 16)

Mike

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] DEADLOCK when removing a bridge on 2.4.19-pre6
  2002-04-10  1:53 [BUG] DEADLOCK when removing a bridge on 2.4.19-pre6 Mike Fedyk
@ 2002-04-10  3:57 ` Andreas Dilger
  2002-04-10 15:00 ` Lennert Buytenhek
  2002-04-10 18:16 ` Mike Fedyk
  2 siblings, 0 replies; 10+ messages in thread
From: Andreas Dilger @ 2002-04-10  3:57 UTC (permalink / raw)
  To: linux-kernel

On Apr 09, 2002  18:53 -0700, Mike Fedyk wrote:
> I have a machine running 2.4.16 running as a bride without problem.

Well, I know some people get attached to their machines, but you will
find that your new bride will age a lot faster than you ;-).

You may also want to contact Lennert Buytenhek <buytenh@gnu.org>, the
bridge code maintainer.

Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] DEADLOCK when removing a bridge on 2.4.19-pre6
  2002-04-10  1:53 [BUG] DEADLOCK when removing a bridge on 2.4.19-pre6 Mike Fedyk
  2002-04-10  3:57 ` Andreas Dilger
@ 2002-04-10 15:00 ` Lennert Buytenhek
  2002-04-10 18:16 ` Mike Fedyk
  2 siblings, 0 replies; 10+ messages in thread
From: Lennert Buytenhek @ 2002-04-10 15:00 UTC (permalink / raw)
  To: Mike Fedyk
  Cc: linux-kernel, bridge, Jeff Garzik, David S. Miller, Andrew Morton


On Tue, Apr 09, 2002 at 06:53:11PM -0700, Mike Fedyk wrote:

> 2.4.16,17 work just fine, but 2.4.18 does not.  Looking through the
> changelog, 2.4.18-pre4, or 5 look suspect.  Also, it is not fixed in
> 2.4.19-pre6.  I can test the 2.4.18-pre kernels if you'd like, just let me
> know.

There's a detailed list of patches that went into 2.4 at:

	http://bridge.sourceforge.net/patchtracker.html

Can you try and find out which of the three patches that went
into 2.4.18 is causing your problems?


thanks,
Lennert

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] DEADLOCK when removing a bridge on 2.4.19-pre6
  2002-04-10  1:53 [BUG] DEADLOCK when removing a bridge on 2.4.19-pre6 Mike Fedyk
  2002-04-10  3:57 ` Andreas Dilger
  2002-04-10 15:00 ` Lennert Buytenhek
@ 2002-04-10 18:16 ` Mike Fedyk
  2002-04-11  0:49   ` Mike Fedyk
  2 siblings, 1 reply; 10+ messages in thread
From: Mike Fedyk @ 2002-04-10 18:16 UTC (permalink / raw)
  To: linux-kernel, Jeff Garzik, David S. Miller, Andrew Morton,
	Lennert Buytenhek

On Tue, Apr 09, 2002 at 06:53:11PM -0700, Mike Fedyk wrote:
> Hi,
> 
> *read below for a problem report against the tulip network driver[1]
> 
> It looks like the bridging feature of the kernel has changed between 2.4.17
> and 18.
> 
> I have a machine running 2.4.16 running as a bride without problem.  While

Oh, yes s/bride/bridge/ ... Nice one Andreas. Women age so much better
than computers, but are just as much harder to change also.  ;)

> testing the network cards for another bridge I had to restart.
> Unfortunately, it didn't complete.  Nothing would respond except for sysrq.
> I was able to sync, unmount, and reboot successfully.
> 
> Debian does this for you automatically, but if you do it manually it does
> the same thing.
> 
> o Setup a bridge like normal
> o start it up
> o ifconfig br0 down
> o brctl delbr br0
> *boom*
> 
> 2.4.16,17 work just fine, but 2.4.18 does not.  Looking through the
> changelog, 2.4.18-pre4, or 5 look suspect.  Also, it is not fixed in
> 2.4.19-pre6.  I can test the 2.4.18-pre kernels if you'd like, just let me
> know.
> 
> [1] Also several times I was able to get my tulip network card to stop
> forwarding packets after a soft-reboot from the lockup mentioned above.  I

I forgot to mention that on each occourance, if I turn the power off, wait a
few seconds, and turn it back on, all network cards work normally including
tulip.  It just looks like the chip isn't setup correctly[2] if it wasn't
already in a known state.

[2] I only say this because a power-down "fixes" the problem whereas a
sysctl-s-u-b soft-reset does not.  It could be the bios on this old machine
too.

As requested by Jeff Garzik, I'm going to copy the latest tulip driver from
2.4.19-pre6 to 2.4.17 and test it there.  Lennert, I'll test the three
bridgeing patches on top of that and let you know which one(s) cause the
problem with bridge removal. 

Mike

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] DEADLOCK when removing a bridge on 2.4.19-pre6
  2002-04-10 18:16 ` Mike Fedyk
@ 2002-04-11  0:49   ` Mike Fedyk
  2002-04-11  1:05     ` Mike Fedyk
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Fedyk @ 2002-04-11  0:49 UTC (permalink / raw)
  To: linux-kernel, Jeff Garzik, David S. Miller, Andrew Morton,
	Lennert Buytenhek

On Wed, Apr 10, 2002 at 11:16:06AM -0700, Mike Fedyk wrote:
> As requested by Jeff Garzik, I'm going to copy the latest tulip driver from
> 2.4.19-pre6 to 2.4.17 and test it there.  Lennert, I'll test the three
> bridgeing patches on top of that and let you know which one(s) cause the
> problem with bridge removal. 
> 

2.4.18_fix_port_state_handling.diff

Is causing the problem on 2.4.17-19p6tulip-br0fpsh.  I haven't tested the
other patches though.

I'm going to reverse this patch on 2.4.19-pre6 to see if it fixes it there
too.  Stay tuned.

Mike

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] DEADLOCK when removing a bridge on 2.4.19-pre6
  2002-04-11  0:49   ` Mike Fedyk
@ 2002-04-11  1:05     ` Mike Fedyk
  2002-04-11  1:40       ` Mike Fedyk
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Fedyk @ 2002-04-11  1:05 UTC (permalink / raw)
  To: linux-kernel, Jeff Garzik, David S. Miller, Andrew Morton,
	Lennert Buytenhek

On Wed, Apr 10, 2002 at 05:49:11PM -0700, Mike Fedyk wrote:
> 2.4.18_fix_port_state_handling.diff
> 
> Is causing the problem on 2.4.17-19p6tulip-br0fpsh.  I haven't tested the
> other patches though.
> 
> I'm going to reverse this patch on 2.4.19-pre6 to see if it fixes it there
> too.  Stay tuned.

2.4.18_enslave_bridge_dev_to_bridge_dev.diff

Is fine I didn't reproduce the trouble in 2.4.17-19p6tulip-br0ebdtbd (it was
already compiling when I tested the port_state kernel...).

2.4.19-pre6-nobr0fpsh building now...

Mike

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] DEADLOCK when removing a bridge on 2.4.19-pre6
  2002-04-11  1:05     ` Mike Fedyk
@ 2002-04-11  1:40       ` Mike Fedyk
  2002-04-11  2:44         ` David S. Miller
  2002-04-23 23:00         ` [PATCH] " Mike Fedyk
  0 siblings, 2 replies; 10+ messages in thread
From: Mike Fedyk @ 2002-04-11  1:40 UTC (permalink / raw)
  To: linux-kernel, Jeff Garzik, David S. Miller, Andrew Morton,
	Lennert Buytenhek

On Wed, Apr 10, 2002 at 06:05:15PM -0700, Mike Fedyk wrote:
> On Wed, Apr 10, 2002 at 05:49:11PM -0700, Mike Fedyk wrote:
> > 2.4.18_fix_port_state_handling.diff
> > 
> > Is causing the problem on 2.4.17-19p6tulip-br0fpsh.  I haven't tested the
> > other patches though.
> > 
> > I'm going to reverse this patch on 2.4.19-pre6 to see if it fixes it there
> > too.  Stay tuned.
> 
> 2.4.18_enslave_bridge_dev_to_bridge_dev.diff
> 
> Is fine I didn't reproduce the trouble in 2.4.17-19p6tulip-br0ebdtbd (it was
> already compiling when I tested the port_state kernel...).
> 
> 2.4.19-pre6-nobr0fpsh building now...

Yep, reversing 2.4.18_fix_port_state_handling.diff fixed it.

Here's a patch to reverse it for you:
--- 2.4.19-pre6/net/bridge/br_input.c	Mon Feb 25 11:38:14 2002
+++ 2.4.19-pre6-nobr0fpsh/net/bridge/br_input.c	Wed Apr 10 18:04:17 2002
@@ -5,7 +5,7 @@
  *	Authors:
  *	Lennert Buytenhek		<buytenh@gnu.org>
  *
- *	$Id: br_input.c,v 1.9.2.1 2001/12/24 04:50:05 davem Exp $
+ *	$Id: br_input.c,v 1.9 2001/08/14 22:05:57 davem Exp $
  *
  *	This program is free software; you can redistribute it and/or
  *	modify it under the terms of the GNU General Public License
@@ -46,7 +46,7 @@
 			br_pass_frame_up_finish);
 }
 
-static int br_handle_frame_finish(struct sk_buff *skb)
+static void __br_handle_frame(struct sk_buff *skb)
 {
 	struct net_bridge *br;
 	unsigned char *dest;
@@ -57,112 +57,103 @@
 	dest = skb->mac.ethernet->h_dest;
 
 	p = skb->dev->br_port;
-	if (p == NULL)
-		goto err_nolock;
-
 	br = p->br;
-	read_lock(&br->lock);
-	if (skb->dev->br_port == NULL)
-		goto err;
-
 	passedup = 0;
+
+	if (!(br->dev.flags & IFF_UP) ||
+	    p->state == BR_STATE_DISABLED)
+		goto freeandout;
+
 	if (br->dev.flags & IFF_PROMISC) {
 		struct sk_buff *skb2;
 
 		skb2 = skb_clone(skb, GFP_ATOMIC);
-		if (skb2 != NULL) {
+		if (skb2) {
 			passedup = 1;
 			br_pass_frame_up(br, skb2);
 		}
 	}
 
+	if (skb->mac.ethernet->h_source[0] & 1)
+		goto freeandout;
+
+	if (!passedup &&
+	    (dest[0] & 1) &&
+	    (br->dev.flags & IFF_ALLMULTI || br->dev.mc_list != NULL)) {
+		struct sk_buff *skb2;
+
+		skb2 = skb_clone(skb, GFP_ATOMIC);
+		if (skb2) {
+			passedup = 1;
+			br_pass_frame_up(br, skb2);
+		}
+	}
+
+	if (br->stp_enabled &&
+	    !memcmp(dest, bridge_ula, 5) &&
+	    !(dest[5] & 0xF0))
+		goto handle_special_frame;
+
+	if (p->state == BR_STATE_LEARNING ||
+	    p->state == BR_STATE_FORWARDING)
+		br_fdb_insert(br, p, skb->mac.ethernet->h_source, 0);
+
+	if (p->state != BR_STATE_FORWARDING)
+		goto freeandout;
+
 	if (dest[0] & 1) {
-		br_flood_forward(br, skb, !passedup);
+		br_flood_forward(br, skb, 1);
 		if (!passedup)
 			br_pass_frame_up(br, skb);
-		goto out;
+		else
+			kfree_skb(skb);
+		return;
 	}
 
 	dst = br_fdb_get(br, dest);
+
 	if (dst != NULL && dst->is_local) {
 		if (!passedup)
 			br_pass_frame_up(br, skb);
 		else
 			kfree_skb(skb);
 		br_fdb_put(dst);
-		goto out;
+		return;
 	}
 
 	if (dst != NULL) {
 		br_forward(dst->dst, skb);
 		br_fdb_put(dst);
-		goto out;
+		return;
 	}
 
 	br_flood_forward(br, skb, 0);
+	return;
 
-out:
-	read_unlock(&br->lock);
-	return 0;
+ handle_special_frame:
+	if (!dest[5]) {
+		br_stp_handle_bpdu(skb);
+		return;
+	}
 
-err:
-	read_unlock(&br->lock);
-err_nolock:
+ freeandout:
 	kfree_skb(skb);
-	return 0;
 }
 
-void br_handle_frame(struct sk_buff *skb)
+static int br_handle_frame_finish(struct sk_buff *skb)
 {
 	struct net_bridge *br;
-	unsigned char *dest;
-	struct net_bridge_port *p;
 
-	dest = skb->mac.ethernet->h_dest;
-
-	p = skb->dev->br_port;
-	if (p == NULL)
-		goto err_nolock;
-
-	br = p->br;
+	br = skb->dev->br_port->br;
 	read_lock(&br->lock);
-	if (skb->dev->br_port == NULL)
-		goto err;
-
-	if (!(br->dev.flags & IFF_UP) ||
-	    p->state == BR_STATE_DISABLED)
-		goto err;
-
-	if (skb->mac.ethernet->h_source[0] & 1)
-		goto err;
-
-	if (p->state == BR_STATE_LEARNING ||
-	    p->state == BR_STATE_FORWARDING)
-		br_fdb_insert(br, p, skb->mac.ethernet->h_source, 0);
-
-	if (br->stp_enabled &&
-	    !memcmp(dest, bridge_ula, 5) &&
-	    !(dest[5] & 0xF0))
-		goto handle_special_frame;
-
-	if (p->state == BR_STATE_FORWARDING) {
-		NF_HOOK(PF_BRIDGE, NF_BR_PRE_ROUTING, skb, skb->dev, NULL,
-			br_handle_frame_finish);
-		read_unlock(&br->lock);
-		return;
-	}
-
-err:
+	__br_handle_frame(skb);
 	read_unlock(&br->lock);
-err_nolock:
-	kfree_skb(skb);
-	return;
 
-handle_special_frame:
-	if (!dest[5]) {
-		br_stp_handle_bpdu(skb);
-		return;
-	}
+	return 0;
+}
 
-	kfree_skb(skb);
+void br_handle_frame(struct sk_buff *skb)
+{
+	NF_HOOK(PF_BRIDGE, NF_BR_PRE_ROUTING, skb, skb->dev, NULL,
+			br_handle_frame_finish);
 }

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] DEADLOCK when removing a bridge on 2.4.19-pre6
  2002-04-11  1:40       ` Mike Fedyk
@ 2002-04-11  2:44         ` David S. Miller
  2002-04-11 20:24           ` Mike Fedyk
  2002-04-23 23:00         ` [PATCH] " Mike Fedyk
  1 sibling, 1 reply; 10+ messages in thread
From: David S. Miller @ 2002-04-11  2:44 UTC (permalink / raw)
  To: mfedyk; +Cc: linux-kernel, jgarzik, akpm, buytenh

   From: Mike Fedyk <mfedyk@matchmail.com>
   Date: Wed, 10 Apr 2002 18:40:35 -0700

   On Wed, Apr 10, 2002 at 06:05:15PM -0700, Mike Fedyk wrote:
   > 2.4.19-pre6-nobr0fpsh building now...
   
   Yep, reversing 2.4.18_fix_port_state_handling.diff fixed it.

Thanks for tracking this down so precisely.  If every bug reporter
actually bothered to do this work, it'd take a lot less work to
fix most bugs :-)

Lennert, just let me know when you have a fix :-)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] DEADLOCK when removing a bridge on 2.4.19-pre6
  2002-04-11  2:44         ` David S. Miller
@ 2002-04-11 20:24           ` Mike Fedyk
  0 siblings, 0 replies; 10+ messages in thread
From: Mike Fedyk @ 2002-04-11 20:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, jgarzik, akpm, buytenh

On Wed, Apr 10, 2002 at 07:44:11PM -0700, David S. Miller wrote:
>    From: Mike Fedyk <mfedyk@matchmail.com>
>    Date: Wed, 10 Apr 2002 18:40:35 -0700
> 
>    On Wed, Apr 10, 2002 at 06:05:15PM -0700, Mike Fedyk wrote:
>    > 2.4.19-pre6-nobr0fpsh building now...
>    
>    Yep, reversing 2.4.18_fix_port_state_handling.diff fixed it.
> 
> Thanks for tracking this down so precisely.  If every bug reporter
> actually bothered to do this work, it'd take a lot less work to
> fix most bugs :-)
>

Thanks. :)

> Lennert, just let me know when you have a fix :-)
> 

Couldn't have done it if Lennert didn't keep those applied patches
available, and separated. :)

Mike

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] Re: [BUG] DEADLOCK when removing a bridge on 2.4.19-pre6
  2002-04-11  1:40       ` Mike Fedyk
  2002-04-11  2:44         ` David S. Miller
@ 2002-04-23 23:00         ` Mike Fedyk
  1 sibling, 0 replies; 10+ messages in thread
From: Mike Fedyk @ 2002-04-23 23:00 UTC (permalink / raw)
  To: linux-kernel, Jeff Garzik, David S. Miller, Andrew Morton,
	Lennert Buytenhek

On Wed, Apr 10, 2002 at 06:40:35PM -0700, Mike Fedyk wrote:
> On Wed, Apr 10, 2002 at 06:05:15PM -0700, Mike Fedyk wrote:
> > On Wed, Apr 10, 2002 at 05:49:11PM -0700, Mike Fedyk wrote:
> > > 2.4.18_fix_port_state_handling.diff
> > > 
> > > Is causing the problem on 2.4.17-19p6tulip-br0fpsh.  I haven't tested the
> > > other patches though.
> > > 
> > > I'm going to reverse this patch on 2.4.19-pre6 to see if it fixes it there
> > > too.  Stay tuned.
> > 
> > 2.4.18_enslave_bridge_dev_to_bridge_dev.diff
> > 
> > Is fine I didn't reproduce the trouble in 2.4.17-19p6tulip-br0ebdtbd (it was
> > already compiling when I tested the port_state kernel...).
> > 
> > 2.4.19-pre6-nobr0fpsh building now...
> 
> Yep, reversing 2.4.18_fix_port_state_handling.diff fixed it.
> 

Instead of reversing the patch, use this patch from Lennert Buytenhek instead:

Note: this problem should only show up on smp kernels(unvarified), it showed
up with a smp kernel on a UP system for me.

--- linux-2.4.18/net/bridge/br_input.c.orig	Thu Apr 18 11:50:16 2002
+++ linux-2.4.18/net/bridge/br_input.c	Thu Apr 18 11:50:26 2002
@@ -161,8 +161,10 @@
 handle_special_frame:
 	if (!dest[5]) {
 		br_stp_handle_bpdu(skb);
+		read_unlock(&br->lock);
 		return;
 	}
 
+	read_unlock(&br->lock);
 	kfree_skb(skb);
 }




^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2002-04-23 22:58 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-04-10  1:53 [BUG] DEADLOCK when removing a bridge on 2.4.19-pre6 Mike Fedyk
2002-04-10  3:57 ` Andreas Dilger
2002-04-10 15:00 ` Lennert Buytenhek
2002-04-10 18:16 ` Mike Fedyk
2002-04-11  0:49   ` Mike Fedyk
2002-04-11  1:05     ` Mike Fedyk
2002-04-11  1:40       ` Mike Fedyk
2002-04-11  2:44         ` David S. Miller
2002-04-11 20:24           ` Mike Fedyk
2002-04-23 23:00         ` [PATCH] " Mike Fedyk

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).