linux-kernel.vger.kernel.org archive mirror
* BUG: 2.4.23-pre3 + ifconfig
@ 2003-09-04 18:05 Abraham van der Merwe
       [not found] ` <200309071217.03470.fedor@karpelevitch.net>
  0 siblings, 1 reply; 8+ messages in thread
From: Abraham van der Merwe @ 2003-09-04 18:05 UTC (permalink / raw)
  To: Linux Kernel Discussions

Hi!

I just installed 2.4.23-pre3 on one of our servers. If I up/down the
loopback device multiple times ifconfig hangs on the second down (as in
unkillable) and afterwards ifconfig stops functioning and I can't reboot the
machine, etc.

No oopses, kernel panics, messages or anything. The system is still alive,
it is just as if some system call is hung.

If anyone is interested, I can send my .config or any other relevant details.

-- 

Regards
 Abraham

"Consequences, Schmonsequences, as long as I'm rich."
		-- "Ali Baba Bunny" [1957, Chuck Jones]

___________________________________________________
 Abraham vd Merwe - Frogfoot Networks CC
 9 Kinnaird Court, 33 Main Street, Newlands, 7700
 Phone: +27 21 686 1665 Cell: +27 82 565 4451
 Http: http://www.frogfoot.net/ Email: abz@frogfoot.net



* Re: BUG: 2.4.23-pre3 + ifconfig
       [not found] ` <200309071217.03470.fedor@karpelevitch.net>
@ 2003-09-07 19:15   ` Abraham van der Merwe
       [not found]     ` <200309080943.26254.fedor@karpelevitch.net>
  0 siblings, 1 reply; 8+ messages in thread
From: Abraham van der Merwe @ 2003-09-07 19:15 UTC (permalink / raw)
  To: Fedor Karpelevitch; +Cc: Linux Kernel Discussions

Hi Fedor                                         >@2003.09.07_21:17:02_+0200

> > I just installed 2.4.23-pre3 on one of our servers. If I up/down
> > the loopback device multiple times ifconfig hangs on the second
> > down (as in unkillable) and afterwards ifconfig stops functioning
> > and I can't reboot the machine, etc.
> >
> > No oopses, kernel panics, messages or anything. The system is still
> > alive, it is just as if some system call is hung.
> >
> > If anyone is interested, I can send my .config or any other
> > relevant details.
> 
> I have the same problem. Did you find any solution?

No :P Not even sure if anyone on lkml noticed my bug report.

-- 

Regards
 Abraham

Carmel, New York, has an ordinance forbidding men to wear coats and
trousers that don't match.

___________________________________________________
 Abraham vd Merwe - Frogfoot Networks CC
 9 Kinnaird Court, 33 Main Street, Newlands, 7700
 Phone: +27 21 686 1665 Cell: +27 82 565 4451
 Http: http://www.frogfoot.net/ Email: abz@frogfoot.net



* Re: possibly bug in 8139cp? (WAS Re: BUG: 2.4.23-pre3 + ifconfig)
       [not found]     ` <200309080943.26254.fedor@karpelevitch.net>
@ 2003-09-08 16:46       ` Abraham van der Merwe
  2003-09-08 17:26       ` Jeff Garzik
  1 sibling, 0 replies; 8+ messages in thread
From: Abraham van der Merwe @ 2003-09-08 16:46 UTC (permalink / raw)
  To: Fedor Karpelevitch; +Cc: Linux Kernel Discussions, Jeff Garzik

Hi Fedor                                         >@2003.09.08_18:43:25_+0200

Not sure if you've stumbled onto the same bug as me.

My server has 2 Netgear cards and I'm using the National Semiconductor
dp8381x driver included with 2.4.23-pre3.

Also, my system doesn't lock up after `ifconfig lo down'. ifconfig just
hangs and becomes unkillable, and afterwards I can't reboot the machine,
use ifconfig anymore, etc.

> Actually for me this happens when I do "pump -i eth0"
> The system is frozen dead (even SysRq-B does not work)
> That's when I am using 8139cp driver (no problem in 2.4.22)
> I tried using 8139too (I believe it is supposed to work, right?) - I 
> do not get this lockup, but instead it starts printing "too much work 
> at interrupt " messages all the time. It could be connected to the 
> latest changes in 8139 drivers...
> 
> Fedor
> 
> On Sunday 07 September 2003 12:15 pm, Abraham van der Merwe wrote:
> > Hi Fedor                                         >@2003.09.07_21:17:02_+0200
> >
> > > > I just installed 2.4.23-pre3 on one of our servers. If I
> > > > up/down the loopback device multiple times ifconfig hangs on
> > > > the second down (as in unkillable) and afterwards ifconfig
> > > > stops functioning and I can't reboot the machine, etc.
> > > >
> > > > No oopses, kernel panics, messages or anything. The system is
> > > > still alive, it is just as if some system call is hung.
> > > >
> > > > If anyone is interested, I can send my .config or any other
> > > > relevant details.
> > >
> > > I have the same problem. Did you find any solution?
> >
> > No :P Not even sure if anyone on lkml noticed my bug report.
> 

-- 

Regards
 Abraham

Moderation in all things.
		-- Publius Terentius Afer [Terence]

___________________________________________________
 Abraham vd Merwe - Frogfoot Networks CC
 9 Kinnaird Court, 33 Main Street, Newlands, 7700
 Phone: +27 21 686 1665 Cell: +27 82 565 4451
 Http: http://www.frogfoot.net/ Email: abz@frogfoot.net



* Re: possibly bug in 8139cp? (WAS Re: BUG: 2.4.23-pre3 + ifconfig)
       [not found]     ` <200309080943.26254.fedor@karpelevitch.net>
  2003-09-08 16:46       ` possibly bug in 8139cp? (WAS Re: BUG: 2.4.23-pre3 + ifconfig) Abraham van der Merwe
@ 2003-09-08 17:26       ` Jeff Garzik
  2003-09-08 20:32         ` Andrew Morton
  1 sibling, 1 reply; 8+ messages in thread
From: Jeff Garzik @ 2003-09-08 17:26 UTC (permalink / raw)
  To: Fedor Karpelevitch; +Cc: Abraham van der Merwe, Linux Kernel Discussions

On Mon, Sep 08, 2003 at 09:43:25AM -0700, Fedor Karpelevitch wrote:
> > > > I just installed 2.4.23-pre3 on one of our servers. If I
> > > > up/down the loopback device multiple times ifconfig hangs on
> > > > the second down (as in unkillable) and afterwards ifconfig
> > > > stops functioning and I can't reboot the machine, etc.

This sounds like the NAPI bug we're chasing.

	Jeff


	


* Re: possibly bug in 8139cp? (WAS Re: BUG: 2.4.23-pre3 + ifconfig)
  2003-09-08 17:26       ` Jeff Garzik
@ 2003-09-08 20:32         ` Andrew Morton
  2003-09-08 23:59           ` Jeff Garzik
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2003-09-08 20:32 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: fedor, abz, linux-kernel

Jeff Garzik <jgarzik@pobox.com> wrote:
>
> On Mon, Sep 08, 2003 at 09:43:25AM -0700, Fedor Karpelevitch wrote:
> > > > > I just installed 2.4.23-pre3 on one of our servers. If I
> > > > > up/down the loopback device multiple times ifconfig hangs on
> > > > > the second down (as in unkillable) and afterwards ifconfig
> > > > > stops functioning and I can't reboot the machine, etc.
> 
> This sounds like the NAPI bug we're chasing.
> 

Well it's the same as the bug which was recently added to 2.6:

- dev_close() used to do

	test_bit(__LINK_STATE_RX_SCHED, &dev->state);

- the netif_poll_disable() in tg3.c used to do

	test_and_set_bit(__LINK_STATE_RX_SCHED, &dev->state);

- someone copied the tg3 netif_poll_disable() into netdevice.h and then
  used it in dev_close(), thus changing and presumably breaking
  dev_close().
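
For readers following the three points above, here is a minimal user-space
sketch of the difference between waiting with test_bit() and waiting with
test_and_set_bit(). The model_* helpers, RX_SCHED_BIT and the max_spins bound
are hypothetical stand-ins, not kernel code, and they are non-atomic and
single-threaded. The sketch only illustrates why a second wait could never
finish if nothing clears the bit in between; the thread does not confirm that
this is the exact path behind the reported hang.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the __LINK_STATE_RX_SCHED bit in dev->state. */
#define RX_SCHED_BIT 0

/* Non-atomic model of test_bit(): report the bit, leave the word unchanged. */
static bool model_test_bit(int nr, const unsigned long *word)
{
	assert(nr >= 0 && nr < 32);
	return (*word >> nr) & 1UL;
}

/* Non-atomic model of test_and_set_bit(): set the bit, return its old value. */
static bool model_test_and_set_bit(int nr, unsigned long *word)
{
	bool old = model_test_bit(nr, word);

	*word |= 1UL << nr;
	return old;
}

/*
 * Wait until the bit is clear without touching it.
 * Returns false if the bit is still set after max_spins retries
 * (the real loop would sleep and retry forever instead).
 */
static bool wait_only(unsigned long *state, int max_spins)
{
	while (model_test_bit(RX_SCHED_BIT, state)) {
		if (max_spins-- <= 0)
			return false;
	}
	return true;
}

/*
 * Wait until the bit is clear and claim it in the same step.
 * On success the bit is left SET, so someone has to clear it later.
 */
static bool wait_and_claim(unsigned long *state, int max_spins)
{
	while (model_test_and_set_bit(RX_SCHED_BIT, state)) {
		if (max_spins-- <= 0)
			return false;
	}
	return true;
}
```

In this model, calling wait_only() twice on a clear word succeeds both times,
while calling wait_and_claim() twice with no clear in between succeeds once
and then runs out of retries, because the first call left the bit set.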


I haven't tested this yet, but it'll probably fix things for all NICs and
it might break tg3.  In which case it's a net win ;)


diff -puN include/linux/netdevice.h~ifdown-lockup-fix include/linux/netdevice.h
--- 25/include/linux/netdevice.h~ifdown-lockup-fix	Mon Sep  8 13:20:28 2003
+++ 25-akpm/include/linux/netdevice.h	Mon Sep  8 13:20:34 2003
@@ -854,7 +854,7 @@ static inline void netif_rx_complete(str
 
 static inline void netif_poll_disable(struct net_device *dev)
 {
-	while (test_and_set_bit(__LINK_STATE_RX_SCHED, &dev->state)) {
+	while (test_bit(__LINK_STATE_RX_SCHED, &dev->state)) {
 		/* No hurry. */
 		current->state = TASK_INTERRUPTIBLE;
 		schedule_timeout(1);

_



* Re: possibly bug in 8139cp? (WAS Re: BUG: 2.4.23-pre3 + ifconfig)
  2003-09-08 20:32         ` Andrew Morton
@ 2003-09-08 23:59           ` Jeff Garzik
  2003-09-09  0:09             ` Andrew Morton
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff Garzik @ 2003-09-08 23:59 UTC (permalink / raw)
  To: Andrew Morton; +Cc: fedor, abz, linux-kernel

Andrew Morton wrote:
> diff -puN include/linux/netdevice.h~ifdown-lockup-fix include/linux/netdevice.h
> --- 25/include/linux/netdevice.h~ifdown-lockup-fix	Mon Sep  8 13:20:28 2003
> +++ 25-akpm/include/linux/netdevice.h	Mon Sep  8 13:20:34 2003
> @@ -854,7 +854,7 @@ static inline void netif_rx_complete(str
>  
>  static inline void netif_poll_disable(struct net_device *dev)
>  {
> -	while (test_and_set_bit(__LINK_STATE_RX_SCHED, &dev->state)) {
> +	while (test_bit(__LINK_STATE_RX_SCHED, &dev->state)) {
>  		/* No hurry. */
>  		current->state = TASK_INTERRUPTIBLE;
>  		schedule_timeout(1);
> 


no that breaks other things.

	Jeff





* Re: possibly bug in 8139cp? (WAS Re: BUG: 2.4.23-pre3 + ifconfig)
  2003-09-08 23:59           ` Jeff Garzik
@ 2003-09-09  0:09             ` Andrew Morton
  2003-09-09  0:41               ` Jeff Garzik
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2003-09-09  0:09 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: fedor, abz, linux-kernel

Jeff Garzik <jgarzik@pobox.com> wrote:
>
> Andrew Morton wrote:
> > diff -puN include/linux/netdevice.h~ifdown-lockup-fix include/linux/netdevice.h
> > --- 25/include/linux/netdevice.h~ifdown-lockup-fix	Mon Sep  8 13:20:28 2003
> > +++ 25-akpm/include/linux/netdevice.h	Mon Sep  8 13:20:34 2003
> > @@ -854,7 +854,7 @@ static inline void netif_rx_complete(str
> >  
> >  static inline void netif_poll_disable(struct net_device *dev)
> >  {
> > -	while (test_and_set_bit(__LINK_STATE_RX_SCHED, &dev->state)) {
> > +	while (test_bit(__LINK_STATE_RX_SCHED, &dev->state)) {
> >  		/* No hurry. */
> >  		current->state = TASK_INTERRUPTIBLE;
> >  		schedule_timeout(1);
> > 
> 
> 
> no that breaks other things.
> 

The only thing it can break is tg3, which appears to be placing a competing
interpretation upon the handling of this flag.

Given that tg3_netif_stop() will set __LINK_STATE_RX_SCHED and dev_close()
will then loop on it getting cleared again there appears to be a risk that
a dev_close() against tg3 will lock up.

It's all very unclear.  And uncommented, but that is experientially the
same thing :(



* Re: possibly bug in 8139cp? (WAS Re: BUG: 2.4.23-pre3 + ifconfig)
  2003-09-09  0:09             ` Andrew Morton
@ 2003-09-09  0:41               ` Jeff Garzik
  0 siblings, 0 replies; 8+ messages in thread
From: Jeff Garzik @ 2003-09-09  0:41 UTC (permalink / raw)
  To: Andrew Morton; +Cc: fedor, abz, linux-kernel

Andrew Morton wrote:
> The only thing it can break is tg3, which appears to be placing a competing
> interpretation upon the handling of this flag.


Right, so, don't break tg3 :)  The patch I posted doesn't do the 
_and_set, which tg3 needs.  netif_poll_{enable,disable} control whether 
the net stack may call the dev->poll() function.  tg3 asynchronously 
disables polling, resets the phy and/or hardware, then enables polling 
again.
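
The disable / reset / enable sequence described above can be sketched as a
small single-threaded model in which one flag decides whether the stack may
poll. Everything here (struct model_dev and the model_* functions) is a
hypothetical illustration, not the tg3 or netdevice.h code. In particular, the
real netif_poll_disable() sleeps and retries rather than returning a failure.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model; rx_sched plays the role of __LINK_STATE_RX_SCHED. */
struct model_dev {
	bool rx_sched;    /* true while polling is claimed or disabled */
	int polls_run;    /* number of completed poll calls */
	int resets_done;  /* number of completed resets */
};

/* Stack side: poll only if the flag can be claimed, then release it. */
static bool model_try_poll(struct model_dev *dev)
{
	if (dev->rx_sched)
		return false;          /* polling is disabled or already running */
	dev->rx_sched = true;
	dev->polls_run++;              /* stands in for calling dev->poll() */
	dev->rx_sched = false;
	return true;
}

/* Driver side: claim the flag so the stack cannot poll. */
static bool model_poll_disable(struct model_dev *dev)
{
	if (dev->rx_sched)
		return false;          /* the real code would sleep and retry here */
	dev->rx_sched = true;
	return true;
}

/* Driver side: release the flag; enabling without a prior disable is a bug. */
static void model_poll_enable(struct model_dev *dev)
{
	assert(dev->rx_sched);
	dev->rx_sched = false;
}

/* Disable polling, "reset the hardware", then enable polling again. */
static bool model_reset(struct model_dev *dev)
{
	if (!model_poll_disable(dev))
		return false;
	dev->resets_done++;            /* stands in for the phy/hardware reset */
	model_poll_enable(dev);
	return true;
}
```

The point of the model is that disable must leave the flag set for the reset
window to be protected, which is why a plain test would not be enough for a
caller that relies on this behaviour.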

I thought I had a check in there for when it was contending with 
dev_close(), but I'll look again.  The hardware/phy reset should 
continue to occur regardless of dev_close() -- that's ok.  During this 
event, all rx/tx is already stopped anyway.  So we can let ifdown/ifup 
events occur in parallel... carefully.  :)

Ideally we want to present a machine that's asynchronously managing its 
power state and various functions.  dev->open() and dev->close() events 
become just two more "major events."

	Jeff





