* Re: [PATCH] panic during unregister_netdevice()
@ 2003-11-06 19:58 Krishna Kumar
2003-11-06 19:59 ` David S. Miller
0 siblings, 1 reply; 10+ messages in thread
From: Krishna Kumar @ 2003-11-06 19:58 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: davem, krkumar, netdev
Actually, even the original code looks good, and if that panic'd, the
following won't change
that (verified it also).
When unregister_netdev() is called by the driver, it first calls
unregister_netdevice() which
drops it's last ref to the dev, making it zero. unregister_netdev() then
calls rtnl_unlock() which
calls netdev_run_todo(), which calls netdev_wait_allrefs() and only after
that succeeds,
does the driver do a free_netdev(). So the dev should not be freed while
the wait_ref() is
executing, and the original code looks correct.
I don't know if it is some corruption on my system, some hardware problem ?
I will look
some more, also try to get a different machine.
- KK
|---------+---------------------------->
| | Stephen Hemminger|
| | <shemminger@osdl.|
| | org> |
| | Sent by: |
| | netdev-bounce@oss|
| | .sgi.com |
| | |
| | |
| | 11/05/2003 05:09 |
| | PM |
| | |
|---------+---------------------------->
>-----------------------------------------------------------------------------------------------------------------|
| |
| To: Krishna Kumar/Beaverton/IBM@IBMUS |
| cc: davem@redhat.com, krkumar@us.ltcfwd.linux.ibm.com, netdev@oss.sgi.com |
| Subject: Re: [PATCH] panic during unregister_netdevice() |
| |
>-----------------------------------------------------------------------------------------------------------------|
Try this. Instead of dropping the last reference in unregister, it does it
after all other references are gone (sort of like the old 2.4 code).
diff -Nru a/net/core/dev.c b/net/core/dev.c
--- a/net/core/dev.c Wed Nov 5 17:04:57 2003
+++ b/net/core/dev.c Wed Nov 5 17:04:57 2003
@@ -2743,7 +2743,7 @@
unsigned long rebroadcast_time, warning_time;
rebroadcast_time = warning_time = jiffies;
- while (atomic_read(&dev->refcnt) != 0) {
+ while (atomic_read(&dev->refcnt) > 1) {
if (time_after(jiffies, rebroadcast_time + 1 *
HZ)) {
rtnl_shlock();
rtnl_exlock();
@@ -2838,6 +2838,7 @@
dev->reg_state = NETREG_UNREGISTERED;
netdev_wait_allrefs(dev);
+ dev_put(dev);
/* paranoia */
BUG_ON(atomic_read(&dev->refcnt));
@@ -2974,7 +2975,6 @@
/* Finish processing unregister after unlock */
net_set_todo(dev);
- dev_put(dev);
return 0;
}
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] panic during unregister_netdevice()
2003-11-06 19:58 [PATCH] panic during unregister_netdevice() Krishna Kumar
@ 2003-11-06 19:59 ` David S. Miller
2003-11-06 21:07 ` Krishna Kumar
2003-11-07 19:01 ` [PATCH] Hang in downing interface with IPv6 PRIVACY Krishna Kumar
0 siblings, 2 replies; 10+ messages in thread
From: David S. Miller @ 2003-11-06 19:59 UTC (permalink / raw)
To: Krishna Kumar; +Cc: shemminger, krkumar, netdev
On Thu, 6 Nov 2003 11:58:24 -0800
Krishna Kumar <kumarkr@us.ibm.com> wrote:
> When unregister_netdev() is called by the driver, it first calls
> unregister_netdevice() which
> drops it's last ref to the dev, making it zero. unregister_netdev() then
> calls rtnl_unlock() which
> calls netdev_run_todo(), which calls netdev_wait_allrefs() and only after
> that succeeds,
> does the driver do a free_netdev(). So the dev should not be freed while
> the wait_ref() is
> executing, and the original code looks correct.
That's correct.
> I don't know if it is some corruption on my system, some hardware problem ?
> I will look
> some more, also try to get a different machine.
It could be some 'user after free' or similar issue.
Just an idea of something else to look for.
My earlier comments about "putting to zero multiple times" were
misguided, I forgot that these days dev_put() just decrements the
count and does not do anything special when the count reaches zero.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] panic during unregister_netdevice()
2003-11-06 19:59 ` David S. Miller
@ 2003-11-06 21:07 ` Krishna Kumar
2003-11-06 21:14 ` David S. Miller
2003-11-07 19:01 ` [PATCH] Hang in downing interface with IPv6 PRIVACY Krishna Kumar
1 sibling, 1 reply; 10+ messages in thread
From: Krishna Kumar @ 2003-11-06 21:07 UTC (permalink / raw)
To: David S. Miller; +Cc: Krishna Kumar, shemminger, netdev
I think I found the problem in the link event code. linkwatch_event()
calls rtnl_unlock() when it gets an event (UNREGISTER) for the device
going down. But this gets called before the device unregister gets called,
via the rmmod/pci_device_remove), and frees up the dev. Later the device
unregister code panics the system.
I also noticed that this panic happens for e100 but not for the 3com
driver. 3com doesn't generate events for up/down using the linkwatch.
I tested with the following patch, and the panic disappeared (the
device shutdown properly). Dave, any need for rtnl_unlock() in this
code ?
Thanks,
- KK
diff -ruN linux-2.6.0-test9-bk9/net/core/link_watch.c linux-2.6.0-test9-bk9.new/net/core/link_watch.c
--- linux-2.6.0-test9-bk9/net/core/link_watch.c 2003-11-06 12:26:30.000000000 -0800
+++ linux-2.6.0-test9-bk9.new/net/core/link_watch.c 2003-11-06 12:25:41.000000000 -0800
@@ -15,6 +15,7 @@
#include <linux/module.h>
#include <linux/netdevice.h>
#include <linux/if.h>
+#include <net/sock.h>
#include <linux/rtnetlink.h>
#include <linux/jiffies.h>
#include <linux/spinlock.h>
@@ -89,9 +90,11 @@
linkwatch_nextevent = jiffies + HZ;
clear_bit(LW_RUNNING, &linkwatch_flags);
- rtnl_lock();
+ rtnl_shlock();
+ rtnl_exlock();
linkwatch_run_queue();
- rtnl_unlock();
+ rtnl_exunlock();
+ rtnl_shunlock();
}
On Thu, 6 Nov 2003, David S. Miller wrote:
> On Thu, 6 Nov 2003 11:58:24 -0800
> Krishna Kumar <kumarkr@us.ibm.com> wrote:
>
> > When unregister_netdev() is called by the driver, it first calls
> > unregister_netdevice() which
> > drops it's last ref to the dev, making it zero. unregister_netdev() then
> > calls rtnl_unlock() which
> > calls netdev_run_todo(), which calls netdev_wait_allrefs() and only after
> > that succeeds,
> > does the driver do a free_netdev(). So the dev should not be freed while
> > the wait_ref() is
> > executing, and the original code looks correct.
>
> That's correct.
>
> > I don't know if it is some corruption on my system, some hardware problem ?
> > I will look
> > some more, also try to get a different machine.
>
> It could be some 'user after free' or similar issue.
> Just an idea of something else to look for.
>
> My earlier comments about "putting to zero multiple times" were
> misguided, I forgot that these days dev_put() just decrements the
> count and does not do anything special when the count reaches zero.
>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] panic during unregister_netdevice()
2003-11-06 21:07 ` Krishna Kumar
@ 2003-11-06 21:14 ` David S. Miller
0 siblings, 0 replies; 10+ messages in thread
From: David S. Miller @ 2003-11-06 21:14 UTC (permalink / raw)
To: Krishna Kumar; +Cc: kumarkr, shemminger, netdev
On Thu, 6 Nov 2003 13:07:57 -0800 (PST)
Krishna Kumar <krkumar@us.ibm.com> wrote:
> I tested with the following patch, and the panic disappeared (the
> device shutdown properly). Dave, any need for rtnl_unlock() in this
> code ?
This linkwatch code never generates new entries on the netdev todo
list so your patch looks fine.
I'm going to apply it, thanks a lot.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH] Hang in downing interface with IPv6 PRIVACY
2003-11-06 19:59 ` David S. Miller
2003-11-06 21:07 ` Krishna Kumar
@ 2003-11-07 19:01 ` Krishna Kumar
2003-11-09 6:30 ` David S. Miller
1 sibling, 1 reply; 10+ messages in thread
From: Krishna Kumar @ 2003-11-07 19:01 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, krkumar
While using PRIVACY extensions, I sometimes get a hang when I remove the
interface. But I can reproduce this every time using the test script at
the end of the mail (hang depends on the order of address deletion).
The bug is in ipv6_del_addr() where if a temp address is being deleted, it
does an __in6_ifa_put() of the main address from which it was derived
(basically the autoconf prefix address). So if the main address was
deleted first, it's ifp ref count would be 1 and it would 'wait' to be
freed till it's temp address was freed first. When the temp address is
deleted, the __put() routine drops the main address's ifp ref count to 0,
but not free it. unregister_netdevice() hangs giving message that ref
count is 1. Fix tested overnight.
Also, the code at the top of the routine is unnecessary, the same is being
done when the address is found a little later in that routine.
Thanks,
- KK
-------------------------- PATCH -----------------------------------------
diff -ruN linux-2.6.0-test9-bk9/net/ipv6/addrconf.c linux-2.6.0-test9-bk9.new/net/ipv6/addrconf.c
--- linux-2.6.0-test9-bk9/net/ipv6/addrconf.c 2003-11-07 10:56:42.000000000 -0800
+++ linux-2.6.0-test9-bk9.new/net/ipv6/addrconf.c 2003-11-07 10:56:50.000000000 -0800
@@ -571,15 +571,6 @@
ifp->dead = 1;
-#ifdef CONFIG_IPV6_PRIVACY
- spin_lock_bh(&ifp->lock);
- if (ifp->ifpub) {
- __in6_ifa_put(ifp->ifpub);
- ifp->ifpub = NULL;
- }
- spin_unlock_bh(&ifp->lock);
-#endif
-
write_lock_bh(&addrconf_hash_lock);
for (ifap = &inet6_addr_lst[hash]; (ifa=*ifap) != NULL;
ifap = &ifa->lst_next) {
@@ -600,7 +591,7 @@
if (ifa == ifp) {
*ifap = ifa->tmp_next;
if (ifp->ifpub) {
- __in6_ifa_put(ifp->ifpub);
+ in6_ifa_put(ifp->ifpub);
ifp->ifpub = NULL;
}
__in6_ifa_put(ifp);
----------------------- TEST PROGRAM ------------------------------------
insmod /lib/modules/2.6.0-test9-bk9/kernel/drivers/net/e100/e100.ko
ifup eth0
# enable privacy address creation
echo 1 > /proc/sys/net/ipv6/conf/eth0/use_tempaddr
# set router, get autoconf/privacy addresses
echo 1 > /proc/sys/net/ipv6/conf/all/forwarding
radvd
echo 0 > /proc/sys/net/ipv6/conf/all/forwarding
# wait for radvd to configure interface addresses
sleep 10
# kill radvd (paranoia)
kill `ps -ef|grep radvd|grep -v grep|awk '{print $2}'`
# delete last regular address first! (happens to be regular :-)
ifconfig eth0 del `ifconfig eth0 | grep Site | tail -1 | awk '{print $3}'`
# now delete all other addresses. bug happens here when the temp address
# is deleted, it doesn't free the regular addresses 'ifp'.
for i in `ifconfig eth0 | grep Site | awk '{print $3}'`
do
ifconfig eth0 del $i
done
ifdown eth0
rmmod e100
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Hang in downing interface with IPv6 PRIVACY
2003-11-07 19:01 ` [PATCH] Hang in downing interface with IPv6 PRIVACY Krishna Kumar
@ 2003-11-09 6:30 ` David S. Miller
2004-01-10 1:40 ` [PATCH] Bugs in xfrm Krishna Kumar
0 siblings, 1 reply; 10+ messages in thread
From: David S. Miller @ 2003-11-09 6:30 UTC (permalink / raw)
To: Krishna Kumar; +Cc: netdev, krkumar
On Fri, 7 Nov 2003 11:01:41 -0800 (PST)
Krishna Kumar <krkumar@us.ibm.com> wrote:
> While using PRIVACY extensions, I sometimes get a hang when I remove the
> interface. But I can reproduce this every time using the test script at
> the end of the mail (hang depends on the order of address deletion).
...
> The bug is in ipv6_del_addr() where if a temp address is being deleted
...
> Also, the code at the top of the routine is unnecessary, the same is being
> done when the address is found a little later in that routine.
Looks great, applied.
Thanks.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH] Bugs in xfrm
2003-11-09 6:30 ` David S. Miller
@ 2004-01-10 1:40 ` Krishna Kumar
2004-01-10 4:48 ` David S. Miller
0 siblings, 1 reply; 10+ messages in thread
From: Krishna Kumar @ 2004-01-10 1:40 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, KK, kumarkr
Hi Dave,
The following look to be bugs in xfrm related code.
1. In xfrm_lookup, a couple of bugs :
- the found or allocated xfrm_states are not passed correctly to
xfrm_bundle_create (and to the subsequent frees in case of create
failing) if the first xfrm_tmpl_resolve failed and the second one
succeeded.
- error handling is wrong.
2. In pfkey_get(), the xfrm_state is dereferenced after it is dropped,
which could lead to dereferencing freed memory.
3. ipcomp_tunnel_create and xfrm_state_construct don't set x->km.state
to XFRM_STATE_DEAD. This can lead to the BUG_TRAP in __xfrm_state_destroy
when xfrm_state_put() finds this is the last reference. This was reported
as one of the problems of [Bug 1754] some time back.
4. I am not sure of this one. I think dst_free() cannot be used when
a bundle of dst's are allocated and have to be freed. We need a new
dst_bundle_free() routine to free all linked dst's ?? Change is in
__xfrm[46]_bundle_create & xfrm_lookup(), the lookup one I am not very
sure. Why are we doing a dst_put(), shouldn't this be a free of all
dst's off the dst->child list since the routine marking 'dead' cleared all
entries off the dst->next list ?
These changes compile cleanly, but I couldn't test since these are
corner cases. Please let me know if this can be applied. I am sending
as one patch file for now instead of multiple files as they all small.
Thanks,
- KK
diff -ruN linux-2.6.0-rc2-bk6.org/include/net/dst.h linux-2.6.0-rc2-bk6/include/net/dst.h
--- linux-2.6.0-rc2-bk6.org/include/net/dst.h 2004-01-09 17:08:18.000000000 -0800
+++ linux-2.6.0-rc2-bk6/include/net/dst.h 2004-01-09 17:08:55.000000000 -0800
@@ -168,6 +168,7 @@
extern void * dst_alloc(struct dst_ops * ops);
extern void __dst_free(struct dst_entry * dst);
+extern void dst_bundle_free(struct dst_entry * dst);
extern struct dst_entry *dst_destroy(struct dst_entry * dst);
static inline void dst_free(struct dst_entry * dst)
diff -ruN linux-2.6.0-rc2-bk6.org/net/ipv4/ipcomp.c linux-2.6.0-rc2-bk6/net/ipv4/ipcomp.c
--- linux-2.6.0-rc2-bk6.org/net/ipv4/ipcomp.c 2004-01-05 13:43:50.000000000 -0800
+++ linux-2.6.0-rc2-bk6/net/ipv4/ipcomp.c 2004-01-09 13:00:22.000000000 -0800
@@ -294,6 +294,7 @@
return t;
error:
+ t->km.state = XFRM_STATE_DEAD;
xfrm_state_put(t);
t = NULL;
goto out;
diff -ruN linux-2.6.0-rc2-bk6.org/net/ipv4/xfrm4_policy.c linux-2.6.0-rc2-bk6/net/ipv4/xfrm4_policy.c
--- linux-2.6.0-rc2-bk6.org/net/ipv4/xfrm4_policy.c 2004-01-09 15:02:48.000000000 -0800
+++ linux-2.6.0-rc2-bk6/net/ipv4/xfrm4_policy.c 2004-01-09 16:11:57.000000000 -0800
@@ -162,7 +162,7 @@
error:
if (dst)
- dst_free(dst);
+ dst_bundle_free(dst);
return err;
}
diff -ruN linux-2.6.0-rc2-bk6.org/net/ipv6/xfrm6_policy.c linux-2.6.0-rc2-bk6/net/ipv6/xfrm6_policy.c
--- linux-2.6.0-rc2-bk6.org/net/ipv6/xfrm6_policy.c 2004-01-09 16:43:45.000000000 -0800
+++ linux-2.6.0-rc2-bk6/net/ipv6/xfrm6_policy.c 2004-01-09 16:44:03.000000000 -0800
@@ -184,7 +184,7 @@
error:
if (dst)
- dst_free(dst);
+ dst_bundle_free(dst);
return err;
}
diff -ruN linux-2.6.0-rc2-bk6.org/net/key/af_key.c linux-2.6.0-rc2-bk6/net/key/af_key.c
--- linux-2.6.0-rc2-bk6.org/net/key/af_key.c 2004-01-05 13:45:47.000000000 -0800
+++ linux-2.6.0-rc2-bk6/net/key/af_key.c 2004-01-09 12:41:30.000000000 -0800
@@ -1283,6 +1283,7 @@
static int pfkey_get(struct sock *sk, struct sk_buff *skb, struct sadb_msg *hdr, void **ext_hdrs)
{
+ __u8 proto;
struct sk_buff *out_skb;
struct sadb_msg *out_hdr;
struct xfrm_state *x;
@@ -1297,6 +1298,7 @@
return -ESRCH;
out_skb = pfkey_xfrm_state2msg(x, 1, 3);
+ proto = x->id.proto;
xfrm_state_put(x);
if (IS_ERR(out_skb))
return PTR_ERR(out_skb);
@@ -1304,7 +1306,7 @@
out_hdr = (struct sadb_msg *) out_skb->data;
out_hdr->sadb_msg_version = hdr->sadb_msg_version;
out_hdr->sadb_msg_type = SADB_DUMP;
- out_hdr->sadb_msg_satype = pfkey_proto2satype(x->id.proto);
+ out_hdr->sadb_msg_satype = pfkey_proto2satype(proto);
out_hdr->sadb_msg_errno = 0;
out_hdr->sadb_msg_reserved = 0;
out_hdr->sadb_msg_seq = hdr->sadb_msg_seq;
diff -ruN linux-2.6.0-rc2-bk6.org/net/xfrm/xfrm_policy.c linux-2.6.0-rc2-bk6/net/xfrm/xfrm_policy.c
--- linux-2.6.0-rc2-bk6.org/net/xfrm/xfrm_policy.c 2004-01-09 12:42:53.000000000 -0800
+++ linux-2.6.0-rc2-bk6/net/xfrm/xfrm_policy.c 2004-01-09 17:31:05.000000000 -0800
@@ -694,6 +694,16 @@
static int stale_bundle(struct dst_entry *dst);
+void dst_bundle_free(struct dst_entry *dst)
+{
+ struct dst_entry *next;
+
+ while (dst) {
+ next = dst->child;
+ dst_free(dst);
+ }
+}
+
/* Main function: finds/creates a bundle for given flow.
*
* At the moment we eat a raw IP route. Mostly to speed up lookups
@@ -799,9 +809,16 @@
goto restart;
}
}
- if (err)
+ if (err < 0)
goto error;
- } else if (nx == 0) {
+ /*
+ * Save number of xfrm_state's found/created for both
+ * the nx == 0 check below as well as to pass the
+ * right value to xfrm_bundle_create().
+ */
+ nx = err;
+ }
+ if (nx == 0) {
/* Flow passes not transformed. */
xfrm_pol_put(policy);
return 0;
@@ -827,8 +844,8 @@
write_unlock_bh(&policy->lock);
xfrm_pol_put(policy);
- if (dst)
- dst_free(dst);
+ if (dst) /* 'dead' freed dst->next list only */
+ dst_bundle_free(dst);
goto restart;
}
dst->next = policy->bundles;
diff -ruN linux-2.6.0-rc2-bk6.org/net/xfrm/xfrm_user.c linux-2.6.0-rc2-bk6/net/xfrm/xfrm_user.c
--- linux-2.6.0-rc2-bk6.org/net/xfrm/xfrm_user.c 2004-01-09 12:57:42.000000000 -0800
+++ linux-2.6.0-rc2-bk6/net/xfrm/xfrm_user.c 2004-01-09 12:59:00.000000000 -0800
@@ -241,6 +241,7 @@
return x;
error:
+ x->km.state = XFRM_STATE_DEAD;
xfrm_state_put(x);
error_no_put:
*errp = err;
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Bugs in xfrm
2004-01-10 1:40 ` [PATCH] Bugs in xfrm Krishna Kumar
@ 2004-01-10 4:48 ` David S. Miller
0 siblings, 0 replies; 10+ messages in thread
From: David S. Miller @ 2004-01-10 4:48 UTC (permalink / raw)
To: Krishna Kumar; +Cc: netdev, krkumar, kumarkr
On Fri, 9 Jan 2004 17:40:03 -0800 (PST)
Krishna Kumar <krkumar@us.ibm.com> wrote:
> These changes compile cleanly, but I couldn't test since these are
> corner cases. Please let me know if this can be applied. I am sending
> as one patch file for now instead of multiple files as they all small.
Maybe you should actually try to test these changes before I think
about applying them, for example:
> +void dst_bundle_free(struct dst_entry *dst)
> +{
> + struct dst_entry *next;
> +
> + while (dst) {
> + next = dst->child;
> + dst_free(dst);
> + }
> +}
Explain to me how that won't loop forever if given a non-NULL dst?
Next, this dst_bundle_free() thing is totally not needed as far as I can
tell. When dst_free() is made, the top-level of the bundle's dst gets added
to the garbage collection list, the garbage collection properly walks the
children to process the whole bundle.
Please redo this patch and please test it this time :)
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Bugs in xfrm
@ 2004-01-10 18:33 Krishna Kumar
2004-01-10 20:11 ` David S. Miller
0 siblings, 1 reply; 10+ messages in thread
From: Krishna Kumar @ 2004-01-10 18:33 UTC (permalink / raw)
To: David S. Miller; +Cc: krkumar, netdev
[-- Attachment #1: Type: text/plain, Size: 840 bytes --]
> Explain to me how that won't loop forever if given a non-NULL dst?
Oops, forgot the dst = next in the loop :-)
> Next, this dst_bundle_free() thing is totally not needed as far as I can
> tell. When dst_free() is made, the top-level of the bundle's dst gets
added
> to the garbage collection list, the garbage collection properly walks the
> children to process the whole bundle.
The garbage collector (called via __dst_free) takes dst_garbage_list and
goes through the dst->next pointer. But dst_destroy() seems to destroy
stuff
on the dst->child list (I missed the part that the first dst has a refcnt
of
zero and all others on child have refcnt of 1). So this is not needed.
> Please redo this patch and please test it this time :)
Are the other "bugs" correct :-) Or should I send separate patches this
time ?
thanks,
- KK
[-- Attachment #2: Type: text/html, Size: 1338 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2004-01-10 20:11 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-11-06 19:58 [PATCH] panic during unregister_netdevice() Krishna Kumar
2003-11-06 19:59 ` David S. Miller
2003-11-06 21:07 ` Krishna Kumar
2003-11-06 21:14 ` David S. Miller
2003-11-07 19:01 ` [PATCH] Hang in downing interface with IPv6 PRIVACY Krishna Kumar
2003-11-09 6:30 ` David S. Miller
2004-01-10 1:40 ` [PATCH] Bugs in xfrm Krishna Kumar
2004-01-10 4:48 ` David S. Miller
2004-01-10 18:33 Krishna Kumar
2004-01-10 20:11 ` David S. Miller
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.