* PROBLEM: 2.4.37.9 destroying an Ethernet interface with permanent NUD leaves the kernels with undestroyable interfaces when ATM is compiled in
@ 2010-04-07 20:23 Sylvain Rochet
  2010-04-08 13:47 ` Sylvain Rochet
  0 siblings, 1 reply; 4+ messages in thread
From: Sylvain Rochet @ 2010-04-07 20:23 UTC (permalink / raw)
  To: linux-kernel


Hi,

When both ATM and Ethernet are compiled in, each creates its own
NEIGH/ARP table, and both tables are assigned to family AF_INET.


int neigh_add(....) {

 ...
        for (tbl=neigh_tables; tbl; tbl = tbl->next) {
                if (tbl->family != ndm->ndm_family)
                        continue;
  ...
}


As the ATM table is created before the Ethernet (main?) table, the
net/core/neighbour.c::neigh_add() function adds all permanent IP ARP
Ethernet NUD entries to the IP ATM table, which is wrong.
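
A minimal userspace sketch of the lookup above, not the kernel code: two
tables share one family, and the loop stops at the first match. The
struct layout, the model_ names and the numeric family value are
assumptions made for illustration only. Which table actually sits first
in the real list depends on registration order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_AF_INET 2   /* illustrative stand-in for AF_INET */

/* Simplified stand-in for struct neigh_table: only what the lookup needs. */
struct model_table {
    struct model_table *next;
    int family;
    const char *name;
};

/* Same shape as the loop in neigh_add(): first table whose family matches. */
static const struct model_table *model_pick(const struct model_table *head,
                                            int family)
{
    const struct model_table *tbl;

    for (tbl = head; tbl; tbl = tbl->next) {
        if (tbl->family != family)
            continue;
        return tbl;
    }
    return NULL;
}

/*
 * Builds a list where a CLIP-like table precedes an ARP-like table, both
 * with the same family, and returns the name of the table that a request
 * for `family` lands in ("none" if no table matches).
 */
static const char *model_demo(int family)
{
    struct model_table arp  = { NULL, MODEL_AF_INET, "arp"  };
    struct model_table clip = { &arp, MODEL_AF_INET, "clip" };
    const struct model_table *hit = model_pick(&clip, family);

    return hit ? hit->name : "none";
}
```

Comparing model_demo(MODEL_AF_INET) with strcmp() shows the request
ending up in "clip": the second table with the same family is never
reached by this loop.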

Therefore, when net/core/neighbour.c::neigh_ifdown() is called, the ARP
entries are not cleared, leaving dev->refcnt at a value that can never
reach 0.

So, when net/core/dev.c::unregister_netdevice() is called, it stalls
without being able to destroy the interface, leaving the system with no
working network tools.
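
The reference leak can be modelled in userspace as follows. This is a
sketch under two assumptions that are not spelled out above: each
neighbour entry holds one reference on its device, and the
interface-down path flushes only one table. All model_ names are
hypothetical and the counts are illustrative, not the kernel's real
values.

```c
#include <assert.h>

enum { MODEL_ARP, MODEL_CLIP, MODEL_NTABLES };

struct model_dev {
    int refcnt;                   /* plays the role of dev->refcnt */
    int entries[MODEL_NTABLES];   /* entries per table that point at the device */
};

/* Adding a neighbour entry takes one reference on the device. */
static void model_neigh_add(struct model_dev *dev, int table)
{
    dev->entries[table]++;
    dev->refcnt++;
}

/* Flushing one table drops only the references held by that table. */
static void model_ifdown(struct model_dev *dev, int table)
{
    dev->refcnt -= dev->entries[table];
    dev->entries[table] = 0;
}

/*
 * Adds one entry to table `stored_in`, flushes table `flushed`, then drops
 * the baseline reference. Returns the leftover count: 0 means the device
 * could be freed, anything else means unregistering would wait forever.
 */
static int model_leftover(int stored_in, int flushed)
{
    struct model_dev dev = { 1, { 0, 0 } };   /* 1 = baseline reference */

    model_neigh_add(&dev, stored_in);
    model_ifdown(&dev, flushed);
    dev.refcnt--;
    return dev.refcnt;
}
```

With the entry stored in the table that gets flushed,
model_leftover(MODEL_ARP, MODEL_ARP) reaches 0. With the entry stored
in the other table, model_leftover(MODEL_CLIP, MODEL_ARP) stays above 0.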


This is really easy to reproduce:

openvpn --mktun --dev tap10
ip addr add 10.20.30.20/24 dev tap10
ip link set up dev tap10
ip neighbour add 10.20.30.40 lladdr 01:02:03:04:05:06 nud permanent dev tap10
ip link set down dev tap10
openvpn --rmtun --dev tap10

and then the kernel log starts filling up with:

unregister_netdevice: waiting for tap10 to become free. Usage count = 2
unregister_netdevice: waiting for tap10 to become free. Usage count = 2
unregister_netdevice: waiting for tap10 to become free. Usage count = 2
unregister_netdevice: waiting for tap10 to become free. Usage count = 2


I changed the family of the ATM table to AF_ATMPVC. Of course this fixes
the issue, but I guess it is the wrong way to fix it.


Best regards,
Sylvain

[-- Attachment #1.2: wrongfamily-atm-2.5.36.6.patch --]
[-- Type: text/x-diff, Size: 695 bytes --]

diff -Nru linux-2.4.36.6.a/net/atm/clip.c linux-2.4.36.6.b/net/atm/clip.c
--- linux-2.4.36.6.a/net/atm/clip.c	2008-06-06 18:25:34.000000000 +0200
+++ linux-2.4.36.6.b/net/atm/clip.c	2010-04-07 21:33:38.000000000 +0200
@@ -277,7 +277,7 @@
 
 
 static struct neigh_ops clip_neigh_ops = {
-	family:			AF_INET,
+	family:			AF_ATMPVC,
 	destructor:		clip_neigh_destroy,
 	solicit:		clip_neigh_solicit,
 	error_report:		clip_neigh_error,
@@ -316,7 +316,7 @@
 
 static struct neigh_table clip_tbl = {
 	NULL,			/* next */
-	AF_INET,		/* family */
+	AF_ATMPVC,		/* family */
 	sizeof(struct neighbour)+sizeof(struct atmarp_entry), /* entry_size */
 	4,			/* key_len */
 	clip_hash,


* Re: PROBLEM: 2.4.37.9 destroying an Ethernet interface with permanent NUD leaves the kernels with undestroyable interfaces when ATM is compiled in
  2010-04-07 20:23 PROBLEM: 2.4.37.9 destroying an Ethernet interface with permanent NUD leaves the kernels with undestroyable interfaces when ATM is compiled in Sylvain Rochet
@ 2010-04-08 13:47 ` Sylvain Rochet
  2010-04-20  5:11   ` Willy Tarreau
  0 siblings, 1 reply; 4+ messages in thread
From: Sylvain Rochet @ 2010-04-08 13:47 UTC (permalink / raw)
  To: linux-kernel


Hi,

On Wed, Apr 07, 2010 at 10:23:39PM +0200, Sylvain Rochet wrote:
> Hi,
> 
> (...)
> 
> I changed the family of the ATM table to AF_ATMPVC. Of course this fixes
> the issue, but I guess it is the wrong way to fix it.

Finally made a patch that follows what Linux 2.6 does, which consists of 
having "netlink" and "no-netlink" tables.
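
A userspace sketch of that split, assuming the list is what the netlink
lookup walks: a "no-netlink" table is initialised but never linked into
the list, while a "netlink" table is linked and checked for a duplicate
family. The reg_ names and the `ready` field are hypothetical and only
model the idea; the real change is in the attached patch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct neigh_table. */
struct reg_table {
    struct reg_table *next;
    int family;
    int ready;                 /* set once the table has been initialised */
};

static struct reg_table *reg_tables;   /* list walked by family lookups */

/* Prepares the table but keeps it off the list, so lookups never see it. */
static void reg_init_no_netlink(struct reg_table *tbl)
{
    tbl->next = NULL;
    tbl->ready = 1;
}

/*
 * Prepares the table and links it at the head of the list.
 * Returns 1 if another table with the same family was already listed.
 */
static int reg_init(struct reg_table *tbl)
{
    struct reg_table *tmp;

    reg_init_no_netlink(tbl);
    for (tmp = reg_tables; tmp; tmp = tmp->next) {
        if (tmp->family == tbl->family)
            break;
    }
    tbl->next = reg_tables;
    reg_tables = tbl;
    return tmp != NULL;
}

/* Returns 1 if the table is reachable from the list, 0 otherwise. */
static int reg_listed(const struct reg_table *tbl)
{
    const struct reg_table *tmp;

    for (tmp = reg_tables; tmp; tmp = tmp->next) {
        if (tmp == tbl)
            return 1;
    }
    return 0;
}
```

Initialising a CLIP-like table with reg_init_no_netlink() and an
ARP-like table with reg_init() leaves only the latter reachable, so a
lookup by family can no longer pick the wrong table.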

Sylvain

[-- Attachment #1.2: neigh_no_netlink_atm_clip-2.4.37.9.patch --]
[-- Type: text/x-diff, Size: 2176 bytes --]

diff -Nru linux-2.4.36.6.a/include/net/neighbour.h linux-2.4.36.6.b/include/net/neighbour.h
--- linux-2.4.36.6.a/include/net/neighbour.h	2008-06-06 16:25:34.000000000 +0000
+++ linux-2.4.36.6.b/include/net/neighbour.h	2010-04-08 13:36:12.000000000 +0000
@@ -192,6 +192,7 @@
 };
 
 extern void			neigh_table_init(struct neigh_table *tbl);
+extern void			neigh_table_init_no_netlink(struct neigh_table *tbl);
 extern int			neigh_table_clear(struct neigh_table *tbl);
 extern struct neighbour *	neigh_lookup(struct neigh_table *tbl,
 					     const void *pkey,
diff -Nru linux-2.4.36.6.a/net/atm/clip.c linux-2.4.36.6.b/net/atm/clip.c
--- linux-2.4.36.6.a/net/atm/clip.c	2008-06-06 16:25:34.000000000 +0000
+++ linux-2.4.36.6.b/net/atm/clip.c	2010-04-08 13:35:09.000000000 +0000
@@ -752,7 +752,7 @@
 
 static int __init atm_clip_init(void)
 {
-	neigh_table_init(&clip_tbl);
+	neigh_table_init_no_netlink(&clip_tbl);
 
 	clip_tbl_hook = &clip_tbl;
 	atm_clip_ops_set(&__atm_clip_ops);
diff -Nru linux-2.4.36.6.a/net/core/neighbour.c linux-2.4.36.6.b/net/core/neighbour.c
--- linux-2.4.36.6.a/net/core/neighbour.c	2008-06-06 16:25:34.000000000 +0000
+++ linux-2.4.36.6.b/net/core/neighbour.c	2010-04-08 13:33:40.000000000 +0000
@@ -1248,7 +1248,7 @@
 }
 
 
-void neigh_table_init(struct neigh_table *tbl)
+void neigh_table_init_no_netlink(struct neigh_table *tbl)
 {
 	unsigned long now = jiffies;
 	unsigned long phsize;
@@ -1302,10 +1302,27 @@
 
 	tbl->last_flush = now;
 	tbl->last_rand = now + tbl->parms.reachable_time*20;
+}
+
+void neigh_table_init(struct neigh_table *tbl)
+{
+	struct neigh_table *tmp;
+
+	neigh_table_init_no_netlink(tbl);
 	write_lock(&neigh_tbl_lock);
-	tbl->next = neigh_tables;
-	neigh_tables = tbl;
+	for (tmp = neigh_tables; tmp; tmp = tmp->next) {
+		if (tmp->family == tbl->family)
+			break;
+	}
+	tbl->next       = neigh_tables;
+	neigh_tables    = tbl;
 	write_unlock(&neigh_tbl_lock);
+
+	if (unlikely(tmp)) {
+		printk(KERN_ERR "NEIGH: Registering multiple tables for "
+			"family %d\n", tbl->family);
+		dump_stack();
+	}
 }
 
 int neigh_table_clear(struct neigh_table *tbl)


* Re: PROBLEM: 2.4.37.9 destroying an Ethernet interface with permanent NUD leaves the kernels with undestroyable interfaces when ATM is compiled in
  2010-04-08 13:47 ` Sylvain Rochet
@ 2010-04-20  5:11   ` Willy Tarreau
  2010-04-21 14:07     ` Sylvain Rochet
  0 siblings, 1 reply; 4+ messages in thread
From: Willy Tarreau @ 2010-04-20  5:11 UTC (permalink / raw)
  To: Sylvain Rochet; +Cc: linux-kernel

Hi Sylvain,

indeed, you've hit a real bug. It reminds me of the sad days I
was forced to use IPoA over a USB modem to access the net. The
tiniest config error required a reboot to fix it :-/

Your fix looks right at first glance, but I'll review it deeper
before merging it, though it should be OK since 2.6 is similar.

BTW, is there any reason why you're stuck on 2.4 ? Are you using
some vendor-specific drivers which are not in 2.6, did you not
have the time to upgrade yet, or did you not find long enough
support for 2.6 releases ? Or anything else ?

I'm asking because whatever keeps users in 2.4 should be addressed
one way or another (probably via some doc to add in 2.4 BTW).

Regards,
Willy



* Re: PROBLEM: 2.4.37.9 destroying an Ethernet interface with permanent NUD leaves the kernels with undestroyable interfaces when ATM is compiled in
  2010-04-20  5:11   ` Willy Tarreau
@ 2010-04-21 14:07     ` Sylvain Rochet
  0 siblings, 0 replies; 4+ messages in thread
From: Sylvain Rochet @ 2010-04-21 14:07 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: linux-kernel

Hi Willy,

On Tue, Apr 20, 2010 at 07:11:25AM +0200, Willy Tarreau wrote:
> Hi Sylvain,
> 
> indeed, you've hit a real bug. It reminds me of the sad days I
> was forced to use IPoA over a USB modem to access the net. The
> tiniest config error required a reboot to fix it :-/
> 
> Your fix looks right at first glance, but I'll review it deeper
> before merging it, though it should be OK since 2.6 is similar.
> 
> BTW, is there any reason why you're stuck on 2.4 ? Are you using
> some vendor-specific drivers which are not in 2.6, did you not
> have the time to upgrade yet, or did you not find long enough
> support for 2.6 releases ? Or anything else ?
> 
> I'm asking because whatever keeps users in 2.4 should be addressed
> one way or another (probably via some doc to add in 2.4 BTW).

Well, if it were up to me, this would be a 2.6 kernel. Actually, one of
our new xDSL collect providers uses Linux routers on the operator
customer edge, and they are still using 2.4 kernels. This is going to
change soon, but well, once I discovered this bug, I could not leave
it uncorrected, even on the 2.4 kernel ;-)

I am not sure whether collect routers are also used outside France.
This is the server where PPP/L2TP tunnels or ATM VP/VC are terminated,
so that all operators use the same national networks and simply use PPP
tunnels from the xDSL customer to the Internet operator's router, using
RADIUS, PPPoE, PPPoA, L2TP, and PPP to authenticate, find, and reach
the endpoint.

By the way, the patch also fixes another issue: when an interface with
dynamic NUD entries is set to the link-down state, you have to wait for
those entries to expire before setting the interface back to the
link-up state. This is obviously because dynamic NUD entries are not
cleared when neigh_ifdown() is called.

Regards,
Sylvain
