* BUG: 2.4.23-pre3 + ifconfig
From: Abraham van der Merwe @ 2003-09-04 18:05 UTC
To: Linux Kernel Discussions
Hi!
I just installed 2.4.23-pre3 on one of our servers. If I up/down the
loopback device multiple times ifconfig hangs on the second down (as in
unkillable) and afterwards ifconfig stops functioning and I can't reboot the
machine, etc.
No oopses, kernel panics, messages or anything. The system is still alive,
it is just as if some system call is hung.
If anyone is interested, I can send my .config or any other relevant details.
--
Regards
Abraham
"Consequences, Schmonsequences, as long as I'm rich."
-- "Ali Baba Bunny" [1957, Chuck Jones]
___________________________________________________
Abraham vd Merwe - Frogfoot Networks CC
9 Kinnaird Court, 33 Main Street, Newlands, 7700
Phone: +27 21 686 1665 Cell: +27 82 565 4451
Http: http://www.frogfoot.net/ Email: abz@frogfoot.net
* Re: BUG: 2.4.23-pre3 + ifconfig
From: Abraham van der Merwe @ 2003-09-07 19:15 UTC
To: Fedor Karpelevitch; +Cc: Linux Kernel Discussions
Hi Fedor >@2003.09.07_21:17:02_+0200
> > I just installed 2.4.23-pre3 on one of our servers. If I up/down
> > the loopback device multiple times ifconfig hangs on the second
> > down (as in unkillable) and afterwards ifconfig stops functioning
> > and I can't reboot the machine, etc.
> >
> > No oopses, kernel panics, messages or anything. The system is still
> > alive, it is just as if some system call is hung.
> >
> > If anyone is interested, I can send my .config or any other
> > relevant details.
>
> I have the same problem. Did you find any solution?
No :P Not even sure if anyone on lkml noticed my bug report.
--
Regards
Abraham
Carmel, New York, has an ordinance forbidding men to wear coats and
trousers that don't match.
* Re: possibly bug in 8139cp? (WAS Re: BUG: 2.4.23-pre3 + ifconfig)
From: Abraham van der Merwe @ 2003-09-08 16:46 UTC
To: Fedor Karpelevitch; +Cc: Linux Kernel Discussions, Jeff Garzik
Hi Fedor >@2003.09.08_18:43:25_+0200
Not sure if you've stumbled onto the same bug as me.
My server has 2 Netgear cards and I'm using the National Semiconductor
dp8381x driver included with 2.4.23-pre3.
Also, my system doesn't lock up after `ifconfig lo down', ifconfig just
hangs and becomes unkillable and I can't reboot the machine, use ifconfig
anymore, etc.
> Actually for me this happens when I do "pump -i eth0"
> The system is frozen dead (even SysRq-B does not work)
> That's when I am using 8139cp driver (no problem in 2.4.22)
> I tried using 8139too (I believe it is supposed to work, right?) - I
> do not get this lockup, but instead it starts printing "too much work
> at interrupt " messages all the time. It could be connected to the
> latest changes in 8139 drivers...
>
> Fedor
>
> On Sunday 07 September 2003 12:15 pm, Abraham van der Merwe wrote:
> > Hi Fedor
> > >@2003.09.07_21:17:02_+0200
> >
> > > > I just installed 2.4.23-pre3 on one of our servers. If I
> > > > up/down the loopback device multiple times ifconfig hangs on
> > > > the second down (as in unkillable) and afterwards ifconfig
> > > > stops functioning and I can't reboot the machine, etc.
> > > >
> > > > No oopses, kernel panics, messages or anything. The system is
> > > > still alive, it is just as if some system call is hung.
> > > >
> > > > If anyone is interested, I can send my .config or any other
> > > > relevant details.
> > >
> > > I have the same problem. Did you find any solution?
> >
> > No :P Not even sure if anyone on lkml noticed my bug report.
>
--
Regards
Abraham
Moderation in all things.
-- Publius Terentius Afer [Terence]
* Re: possibly bug in 8139cp? (WAS Re: BUG: 2.4.23-pre3 + ifconfig)
From: Jeff Garzik @ 2003-09-08 17:26 UTC
To: Fedor Karpelevitch; +Cc: Abraham van der Merwe, Linux Kernel Discussions
On Mon, Sep 08, 2003 at 09:43:25AM -0700, Fedor Karpelevitch wrote:
> > > > I just installed 2.4.23-pre3 on one of our servers. If I
> > > > up/down the loopback device multiple times ifconfig hangs on
> > > > the second down (as in unkillable) and afterwards ifconfig
> > > > stops functioning and I can't reboot the machine, etc.
This sounds like the NAPI bug we're chasing.
Jeff
* Re: possibly bug in 8139cp? (WAS Re: BUG: 2.4.23-pre3 + ifconfig)
From: Andrew Morton @ 2003-09-08 20:32 UTC
To: Jeff Garzik; +Cc: fedor, abz, linux-kernel
Jeff Garzik <jgarzik@pobox.com> wrote:
>
> On Mon, Sep 08, 2003 at 09:43:25AM -0700, Fedor Karpelevitch wrote:
> > > > > I just installed 2.4.23-pre3 on one of our servers. If I
> > > > > up/down the loopback device multiple times ifconfig hangs on
> > > > > the second down (as in unkillable) and afterwards ifconfig
> > > > > stops functioning and I can't reboot the machine, etc.
>
> This sounds like the NAPI bug we're chasing.
>
Well it's the same as the bug which was recently added to 2.6:
- dev_close() used to do

	test_bit(__LINK_STATE_RX_SCHED, &dev->state);

- the netif_poll_disable() in tg3.c used to do

	test_and_set_bit(__LINK_STATE_RX_SCHED, &dev->state);

- someone copied the tg3 netif_poll_disable() into netdevice.h and then
  used it in dev_close(), thus changing and presumably breaking
  dev_close().
I haven't tested this yet, but it'll probably fix things for all NICs and
it might break tg3. In which case it's a net win ;)
diff -puN include/linux/netdevice.h~ifdown-lockup-fix include/linux/netdevice.h
--- 25/include/linux/netdevice.h~ifdown-lockup-fix Mon Sep 8 13:20:28 2003
+++ 25-akpm/include/linux/netdevice.h Mon Sep 8 13:20:34 2003
@@ -854,7 +854,7 @@ static inline void netif_rx_complete(str
 static inline void netif_poll_disable(struct net_device *dev)
 {
-	while (test_and_set_bit(__LINK_STATE_RX_SCHED, &dev->state)) {
+	while (test_bit(__LINK_STATE_RX_SCHED, &dev->state)) {
 		/* No hurry. */
 		current->state = TASK_INTERRUPTIBLE;
 		schedule_timeout(1);
_
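
The difference between the two loop conditions can be sketched in user
space. This is an illustrative model, not kernel code: fake_dev,
fake_dev_init(), poll_disable_model() and the max_ticks bound are all
hypothetical names, and C11 atomics stand in for the kernel's test_bit()
and test_and_set_bit(). It assumes nothing clears the flag between two
consecutive calls; whether that holds for a given driver is not
established in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the __LINK_STATE_RX_SCHED bit in dev->state. */
typedef struct {
    atomic_bool rx_sched;
} fake_dev;

static void fake_dev_init(fake_dev *d)
{
    atomic_init(&d->rx_sched, false);
}

/*
 * Models the wait loop in netif_poll_disable().
 *   and_set == true  : loop condition behaves like test_and_set_bit()
 *   and_set == false : loop condition behaves like test_bit()
 * Returns the tick at which the flag was seen clear, or -1 if it was
 * still set after max_ticks (where the real loop would keep sleeping).
 */
static int poll_disable_model(fake_dev *d, bool and_set, int max_ticks)
{
    assert(max_ticks > 0);
    for (int tick = 0; tick < max_ticks; tick++) {
        bool busy = and_set
            ? atomic_exchange(&d->rx_sched, true) /* read old value, leave it set */
            : atomic_load(&d->rx_sched);          /* read only */
        if (!busy)
            return tick;
        /* The kernel version calls schedule_timeout(1) here. */
    }
    return -1;
}
```

Under the stated assumption, two back-to-back calls with and_set == true
behave differently: the first returns at tick 0 and leaves the flag set,
so the second never sees it clear and the model reports -1. With
and_set == false both calls return at tick 0.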
* Re: possibly bug in 8139cp? (WAS Re: BUG: 2.4.23-pre3 + ifconfig)
From: Jeff Garzik @ 2003-09-08 23:59 UTC
To: Andrew Morton; +Cc: fedor, abz, linux-kernel
Andrew Morton wrote:
> diff -puN include/linux/netdevice.h~ifdown-lockup-fix include/linux/netdevice.h
> --- 25/include/linux/netdevice.h~ifdown-lockup-fix Mon Sep 8 13:20:28 2003
> +++ 25-akpm/include/linux/netdevice.h Mon Sep 8 13:20:34 2003
> @@ -854,7 +854,7 @@ static inline void netif_rx_complete(str
>
>  static inline void netif_poll_disable(struct net_device *dev)
>  {
> -	while (test_and_set_bit(__LINK_STATE_RX_SCHED, &dev->state)) {
> +	while (test_bit(__LINK_STATE_RX_SCHED, &dev->state)) {
>  		/* No hurry. */
>  		current->state = TASK_INTERRUPTIBLE;
>  		schedule_timeout(1);
>
no that breaks other things.
Jeff
* Re: possibly bug in 8139cp? (WAS Re: BUG: 2.4.23-pre3 + ifconfig)
From: Andrew Morton @ 2003-09-09 0:09 UTC
To: Jeff Garzik; +Cc: fedor, abz, linux-kernel
Jeff Garzik <jgarzik@pobox.com> wrote:
>
> Andrew Morton wrote:
> > diff -puN include/linux/netdevice.h~ifdown-lockup-fix include/linux/netdevice.h
> > --- 25/include/linux/netdevice.h~ifdown-lockup-fix Mon Sep 8 13:20:28 2003
> > +++ 25-akpm/include/linux/netdevice.h Mon Sep 8 13:20:34 2003
> > @@ -854,7 +854,7 @@ static inline void netif_rx_complete(str
> >
> >  static inline void netif_poll_disable(struct net_device *dev)
> >  {
> > -	while (test_and_set_bit(__LINK_STATE_RX_SCHED, &dev->state)) {
> > +	while (test_bit(__LINK_STATE_RX_SCHED, &dev->state)) {
> >  		/* No hurry. */
> >  		current->state = TASK_INTERRUPTIBLE;
> >  		schedule_timeout(1);
> >
>
>
> no that breaks other things.
>
The only thing it can break is tg3, which appears to be placing a competing
interpretation upon the handling of this flag.
Given that tg3_netif_stop() will set __LINK_STATE_RX_SCHED and dev_close()
will then loop on it getting cleared again there appears to be a risk that
a dev_close() against tg3 will lock up.
It's all very unclear. And uncommented, but that is experientially the
same thing :(
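
The lock-up risk described above can be modelled in user space. This is
a minimal sketch, assuming the driver side simply sets the flag and may
or may not clear it later; driver_stop_model(), driver_start_model(),
close_wait_model() and the release_at parameter are hypothetical names,
not tg3 or netdevice.h functions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the __LINK_STATE_RX_SCHED bit in dev->state. */
typedef struct {
    atomic_bool rx_sched;
} fake_dev;

static void fake_dev_init(fake_dev *d)
{
    atomic_init(&d->rx_sched, false);
}

/* Driver takes the flag, as tg3_netif_stop() is said to do above. */
static void driver_stop_model(fake_dev *d)
{
    atomic_store(&d->rx_sched, true);
}

/* Driver hands the flag back. */
static void driver_start_model(fake_dev *d)
{
    atomic_store(&d->rx_sched, false);
}

/*
 * Models a dev_close() that only reads the flag (the test_bit() variant)
 * and waits for it to clear. The simulated driver releases the flag at
 * tick release_at; a negative value means it never does.
 * Returns the tick at which the flag was seen clear, or -1 if it was
 * still set after max_ticks (where the real loop would keep waiting).
 */
static int close_wait_model(fake_dev *d, int release_at, int max_ticks)
{
    assert(max_ticks > 0);
    for (int tick = 0; tick < max_ticks; tick++) {
        if (tick == release_at)
            driver_start_model(d);
        if (!atomic_load(&d->rx_sched))
            return tick;
    }
    return -1;
}
```

If the driver releases the flag at tick 3 the waiter returns 3; if the
driver never releases it, the waiter never finishes, which the model
reports as -1.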
* Re: possibly bug in 8139cp? (WAS Re: BUG: 2.4.23-pre3 + ifconfig)
From: Jeff Garzik @ 2003-09-09 0:41 UTC
To: Andrew Morton; +Cc: fedor, abz, linux-kernel
Andrew Morton wrote:
> The only thing it can break is tg3, which appears to be placing a competing
> interpretation upon the handling of this flag.
Right, so, don't break tg3 :) The patch I posted doesn't do the
_and_set, which tg3 needs. netif_poll_{enable,disable} control whether
the net stack may call the dev->poll() function. tg3 asynchronously
disables polling, resets the phy and/or hardware, then enables polling
again.
I thought I had a check in there for when it was contending with
dev_close(), but I'll look again. The hardware/phy reset should
continue to occur regardless of dev_close() -- that's ok. During this
event, all rx/tx is already stopped anyway. So we can let ifdown/ifup
events occur in parallel... carefully. :)
Ideally we want to present a machine that's asynchronously managing its
power state and various functions. dev->open() and dev->close() events
become just two more "major events."
Jeff
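
The idea that netif_poll_{enable,disable} control whether the stack may
call dev->poll() can be sketched roughly in user space: whoever holds
the flag decides whether the poll routine may run. All names here
(fake_dev, poll_try_disable(), poll_enable_model(), stack_try_poll())
are hypothetical, and the non-blocking "try" form is a simplification so
the example cannot spin; the real netif_poll_disable() waits instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_bool rx_sched; /* stand-in for __LINK_STATE_RX_SCHED */
    int polls;            /* how many times the fake poll routine ran */
} fake_dev;

static void fake_dev_init(fake_dev *d)
{
    atomic_init(&d->rx_sched, false);
    d->polls = 0;
}

/* Driver side: take the flag if it is free. Returns true on success. */
static bool poll_try_disable(fake_dev *d)
{
    return !atomic_exchange(&d->rx_sched, true);
}

/* Driver side: give the flag back so polling may resume. */
static void poll_enable_model(fake_dev *d)
{
    assert(atomic_load(&d->rx_sched)); /* must currently be held */
    atomic_store(&d->rx_sched, false);
}

/*
 * Stack side: run the poll routine only if the flag can be taken,
 * then release it again. Returns false when polling is disabled.
 */
static bool stack_try_poll(fake_dev *d)
{
    if (atomic_exchange(&d->rx_sched, true))
        return false;
    d->polls++; /* stands in for the call to dev->poll() */
    atomic_store(&d->rx_sched, false);
    return true;
}
```

A reset sequence in this model is poll_try_disable(), the hardware/phy
reset, then poll_enable_model(); any stack_try_poll() in between returns
false without touching the device.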