[-- Attachment #1: Type: text/plain, Size: 776 bytes --] Hi, I found this bug in dev.c It happens if register_netdevice is called before net_dev_init which can happen from init functions, when device drivers are compiled into the kernel. The dev->refcnt will go up to 2 ( it should be 1) unregister_netdevice will usually loop forever waiting for the refcnt to drop to 1 when an attempt to unregister is done. The printk in the bootmessages early initialization of device <blah> is deferred is evidence of this behaviour occuring. The small patch is below ,hope it is okay for you. (See attached file: dev.c.2.4.5.patch) D.J. Barrow Gnu/Linux for S/390 kernel developer eMail: djbarrow@de.ibm.com,barrow_dj@yahoo.com Phone: +49-(0)7031-16-2583 IBM Germany Lab, Schönaicherstr. 220, 71032 Böblingen [-- Attachment #2: dev.c.2.4.5.patch --] [-- Type: application/octet-stream, Size: 900 bytes --] --- net/core/dev.old.c Mon Jun 11 20:00:42 2001 +++ net/core/dev.c Mon Jun 11 19:51:55 2001 @@ -20,6 +20,9 @@ * Pekka Riikonen <priikone@poesidon.pspt.fi> * * Changes: + * D.J. Barrow : Fixed net_dev_init calling dev_hold & setting + * The dev->refcnt to 2 if register_netdev gets + * called first & initialization is deferred. * Alan Cox : device private ioctl copies fields back. * Alan Cox : Transmit queue code does relevant stunts to * keep the queue safe. @@ -2733,8 +2736,10 @@ #endif dev->xmit_lock_owner = -1; dev->iflink = -1; - dev_hold(dev); - + /* Can't do dev_hold here in case + * register_netdev gets called first. + */ + atomic_set(&dev->refcnt,1); /* * Allocate name. If the init() fails * the name will be reissued correctly.
DJBARROW@de.ibm.com writes: > The dev->refcnt will go up to 2 ( it should be 1) unregister_netdevice will > usually loop forever > waiting for the refcnt to drop to 1 when an attempt to unregister is done. When will devices built statically into the kernel ever be unregister'd? Later, David S. Miller davem@redhat.com
[-- Attachment #1: Type: text/plain, Size: 1565 bytes --] Hi Alan/Dave Here is a better version of the patch I think it cuts about 15 lines out of dev.c & I think logically it is more bulletproof than my previous attempt. (See attached file: dev.c.2.4.5.v2.patch) Hi Dave, On 390 it is very easy to hotplug devices under VM. We do it all the time using the channel device layer on s/390, so users can reconfigure devices if they misconfigure them. If it can be registered it should be able to be unregistered. Cornelia Huck in our group in boeblingen noticed the bug too. Please respond to "David S. Miller" <davem@redhat.com> To: Denis Joseph Barrow/Germany/Contr/IBM@IBMDE cc: linux-kernel@vger.kernel.org Subject: Re: bug in /net/core/dev.c DJBARROW@de.ibm.com writes: > The dev->refcnt will go up to 2 ( it should be 1) unregister_netdevice will > usually loop forever > waiting for the refcnt to drop to 1 when an attempt to unregister is done. When will devices built statically into the kernel ever be unregister'd? Later, David S. Miller davem@redhat.com Hi, I found this bug in dev.c It happens if register_netdevice is called before net_dev_init which can happen from init functions, when device drivers are compiled into the kernel. The dev->refcnt will go up to 2 ( it should be 1) unregister_netdevice will usually loop forever waiting for the refcnt to drop to 1 when an attempt to unregister is done. The printk in the bootmessages early initialization of device <blah> is deferred is evidence of this behaviour occuring. The small patch is below ,hope it is okay for you. [-- Attachment #2: dev.c.2.4.5.v2.patch --] [-- Type: application/octet-stream, Size: 2269 bytes --] --- net/core/dev.old.c Wed May 30 14:24:47 2001 +++ net/core/dev.c Tue Jun 12 15:37:58 2001 @@ -20,6 +20,10 @@ * Pekka Riikonen <priikone@poesidon.pspt.fi> * * Changes: + * D.J. Barrow : Fixed bug where dev->refcnt gets set to 2 + * if register_netdev gets called before + * net_dev_init & also removed a few lines + * of code in the process. * Alan Cox : device private ioctl copies fields back. * Alan Cox : Transmit queue code does relevant stunts to * keep the queue safe. @@ -2382,6 +2386,7 @@ * will not get the same name. */ +int net_dev_init(void); int register_netdevice(struct net_device *dev) { struct net_device *d, **dp; @@ -2396,48 +2401,8 @@ dev->fastpath_lock=RW_LOCK_UNLOCKED; #endif - if (dev_boot_phase) { -#ifdef CONFIG_NET_DIVERT - ret = alloc_divert_blk(dev); - if (ret) - return ret; -#endif /* CONFIG_NET_DIVERT */ - - /* This is NOT bug, but I am not sure, that all the - devices, initialized before netdev module is started - are sane. - - Now they are chained to device boot list - and probed later. If a module is initialized - before netdev, but assumes that dev->init - is really called by register_netdev(), it will fail. - - So that this message should be printed for a while. - */ - printk(KERN_INFO "early initialization of device %s is deferred\n", dev->name); - - /* Check for existence, and append to tail of chain */ - for (dp=&dev_base; (d=*dp) != NULL; dp=&d->next) { - if (d == dev || strcmp(d->name, dev->name) == 0) { - return -EEXIST; - } - } - dev->next = NULL; - write_lock_bh(&dev_base_lock); - *dp = dev; - dev_hold(dev); - write_unlock_bh(&dev_base_lock); - - /* - * Default initial state at registry is that the - * device is present. - */ - - set_bit(__LINK_STATE_PRESENT, &dev->state); - - return 0; - } - + if (dev_boot_phase) + net_dev_init(); #ifdef CONFIG_NET_DIVERT ret = alloc_divert_blk(dev); if (ret) @@ -2679,7 +2644,9 @@ { struct net_device *dev, **dp; int i; - + + if(!dev_boot_phase) + return 0; #ifdef CONFIG_NET_SCHED pktsched_init(); #endif
DJBARROW@de.ibm.com writes: > On 390 it is very easy to hotplug devices under VM. > > We do it all the time using the channel device layer on s/390, so users > can reconfigure devices if they misconfigure them. > > If it can be registered it should be able to be unregistered. > > Cornelia Huck in our group in boeblingen noticed the bug too. I still think it's bogus. Those "early init of device" messages should only be occurring for specific legacy ISA device drivers, and for nothing else. You need to find a way to link in the S390 network drivers after the net/ module so that the drivers run their init after net_dev_init() runs. Sure, this refcount thing is a bug, and I will look at your patch, but you need to make your normal net devices get init'd after net_dev_init() runs. Later, David S. Miller davem@redhat.com
On Tue, 12 Jun 2001 07:57:56 -0700 (PDT),
"David S. Miller" <davem@redhat.com> wrote:
>You need to find a way to link in the S390 network drivers after the
>net/ module so that the drivers run their init after net_dev_init()
>runs.
Tricky.
net_dev_init() is in net/core/dev.c which is linked into
net/core/core.o which is linked into net/network.o which is linked into
vmlinux as part of $(NETWORKS).
drivers/s390/net creates drivers/s390/net/s390-net.o which is linked
into drivers/s390/io.o which is part of $(CORE_FILES), not $(DRIVERS).
$(NETWORKS) is linked _after_ both $(CORE_FILES) and $(DRIVERS).
Have I mentioned recently how much I hate the "link order depends on
Makefile compile order, except where fudged by special case Makefiles"
method?
David, how do any network drivers that need net_dev_init() work when
all $(DRIVERS) come before $(NETWORKS)? This is a generic problem, not
s390 specific.
AFAICT net_dev_init() needs to be promoted ahead of $(DRIVERS), either
as minimal code or by moving all of $(NETWORKS). Then moving
drivers/s390/io.o to $(DRIVERS) instead of $(CORE_FILES) will fix the
problem.
Keith Owens writes: > David, how do any network drivers that need net_dev_init() work when > all $(DRIVERS) come before $(NETWORKS)? This is a generic problem, not > s390 specific. Then I don't understand why none of the statically built in drivers in any of my kernels (on any platform I use, sparc64, i386, etc.) make that "early initialization of device XXX deferred" message get spit out. Actually after some studying I do understand :-) net_dev_init() is not invoked via net/* initialization. My bad. It gets invoked from drivers/block/genhd.c:device_init() so that is the ordering dependency and why things work for non-s390 network devices now. So, if the s390 folks move their stuff into the appropriate spot it will work. In fact, I personally like to see the s390 net devices under drivers/net/s390 anyways. They'll get free maintenance from myself and Jeff Garzik in this way as I rarely look int drivers/${PLATFORM} type directories unless I'm doing a tree-wide grep. :-) Later, David S. Miller davem@redhat.com
On Tue, 12 Jun 2001 08:46:00 -0700 (PDT),
"David S. Miller" <davem@redhat.com> wrote:
>So, if the s390 folks move their stuff into the appropriate spot it
>will work. In fact, I personally like to see the s390 net devices
>under drivers/net/s390 anyways. They'll get free maintenance from
>myself and Jeff Garzik in this way as I rarely look int
>drivers/${PLATFORM} type directories unless I'm doing a tree-wide
>grep. :-)
Leave s390 separate for the moment, all the other architectures work
the same way.
Minimal (and totally untested) patch to compile s390/net as part of the
other network drivers follows - if it's good enough for acorn, it's
good enough for s390. The method is as ugly as hell but it is the
least possible change for 2.4, major redesign will have to wait for
2.5. Patch against 2.4.6-pre2.
Index: 6-pre2.1/drivers/s390/Makefile
--- 6-pre2.1/drivers/s390/Makefile Fri, 13 Apr 2001 12:02:38 +1000 kaos (linux-2.4/u/b/43_Makefile 1.3 644)
+++ 6-pre2.1(w)/drivers/s390/Makefile Wed, 13 Jun 2001 02:05:24 +1000 kaos (linux-2.4/u/b/43_Makefile 1.3 644)
@@ -4,7 +4,7 @@
O_TARGET := io.o
-subdir-y := block char misc net
+subdir-y := block char misc
subdir-m := $(subdir-y)
obj-y := $(foreach dir,$(subdir-y),$(dir)/s390-$(dir).o)
@@ -12,3 +12,5 @@ obj-y += s390io.o s390mach.o s390dyn.o i
export-objs += ccwcache.o idals.o s390dyn.o s390io.o
include $(TOPDIR)/Rules.make
+
+# the NET subdir is included from drivers/net now
Index: 6-pre2.1/drivers/net/Makefile
--- 6-pre2.1/drivers/net/Makefile Thu, 17 May 2001 10:25:35 +1000 kaos (linux-2.4/l/c/26_Makefile 1.1.1.1.3.3.1.1.1.2 644)
+++ 6-pre2.1(w)/drivers/net/Makefile Wed, 13 Jun 2001 02:03:09 +1000 kaos (linux-2.4/l/c/26_Makefile 1.1.1.1.3.3.1.1.1.2 644)
@@ -214,6 +214,12 @@ subdir-y += ../acorn/net
obj-y += ../acorn/net/acorn-net.o
endif
+ifeq ($(CONFIG_ARCH_S390),y)
+mod-subdirs += ../s390/net
+subdir-y += ../s390/net
+obj-y += ../s390/net/s390-net.o
+endif
+
#
# HIPPI adapters
#
[-- Attachment #1: Type: text/plain, Size: 908 bytes --] Hi Keith, This is a cure the syntom not the problem, build order shouldn't mess up something this simple. I've forwarded your patch to Ulrich & Martin ( the s390 maintainers ) & they may use it seeing as you & possibly others would prefer a /drivers/net/s390. David admitted it is a bug, if the patch is good hopefully it will be taken, it if better can be done hopefully someone will do better. D.J. Barrow Gnu/Linux for S/390 kernel developer eMail: djbarrow@de.ibm.com,barrow_dj@yahoo.com Phone: +49-(0)7031-16-2583 IBM Germany Lab, Schönaicherstr. 220, 71032 Böblingen Keith Owens <kaos@ocs.com.au> on 12.06.2001 18:11:26 Please respond to Keith Owens <kaos@ocs.com.au> To: "David S. Miller" <davem@redhat.com> cc: Denis Joseph Barrow/Germany/Contr/IBM@IBMDE, alan@lxorguk.ukuu.org.uk, linux-kernel@vger.kernel.org Subject: Re: bug in /net/core/dev.c [-- Attachment #2: Type: text/plain, Size: 2000 bytes --] On Tue, 12 Jun 2001 08:46:00 -0700 (PDT), "David S. Miller" <davem@redhat.com> wrote: >So, if the s390 folks move their stuff into the appropriate spot it >will work. In fact, I personally like to see the s390 net devices >under drivers/net/s390 anyways. They'll get free maintenance from >myself and Jeff Garzik in this way as I rarely look int >drivers/${PLATFORM} type directories unless I'm doing a tree-wide >grep. :-) Leave s390 separate for the moment, all the other architectures work the same way. Minimal (and totally untested) patch to compile s390/net as part of the other network drivers follows - if it's good enough for acorn, it's good enough for s390. The method is as ugly as hell but it is the least possible change for 2.4, major redesign will have to wait for 2.5. Patch against 2.4.6-pre2. Index: 6-pre2.1/drivers/s390/Makefile --- 6-pre2.1/drivers/s390/Makefile Fri, 13 Apr 2001 12:02:38 +1000 kaos (linux-2.4/u/b/43_Makefile 1.3 644) +++ 6-pre2.1(w)/drivers/s390/Makefile Wed, 13 Jun 2001 02:05:24 +1000 kaos (linux-2.4/u/b/43_Makefile 1.3 644) @@ -4,7 +4,7 @@ O_TARGET := io.o -subdir-y := block char misc net +subdir-y := block char misc subdir-m := $(subdir-y) obj-y := $(foreach dir,$(subdir-y),$(dir)/s390-$(dir).o) @@ -12,3 +12,5 @@ obj-y += s390io.o s390mach.o s390dyn.o i export-objs += ccwcache.o idals.o s390dyn.o s390io.o include $(TOPDIR)/Rules.make + +# the NET subdir is included from drivers/net now Index: 6-pre2.1/drivers/net/Makefile --- 6-pre2.1/drivers/net/Makefile Thu, 17 May 2001 10:25:35 +1000 kaos (linux-2.4/l/c/26_Makefile 1.1.1.1.3.3.1.1.1.2 644) +++ 6-pre2.1(w)/drivers/net/Makefile Wed, 13 Jun 2001 02:03:09 +1000 kaos (linux-2.4/l/c/26_Makefile 1.1.1.1.3.3.1.1.1.2 644) @@ -214,6 +214,12 @@ subdir-y += ../acorn/net obj-y += ../acorn/net/acorn-net.o endif +ifeq ($(CONFIG_ARCH_S390),y) +mod-subdirs += ../s390/net +subdir-y += ../s390/net +obj-y += ../s390/net/s390-net.o +endif + # # HIPPI adapters #
DJBARROW@de.ibm.com writes: > This is a cure the syntom not the problem, build order shouldn't mess up > something this simple. The refcounting issue you've described is a bug, and so is initializing a network device before net_dev_init() has been invoked. These two things are equally buggy. Build order does mess up things this simple, look at the comments at the top of drivers/scsi/Makefile if you don't believe me :-) Later, David S. Miller davem@redhat.com
On Tue, 12 Jun 2001 18:43:41 +0200, DJBARROW@de.ibm.com wrote: >Hi Keith, >This is a cure the syntom not the problem, build order shouldn't mess up >something this simple. I totally agree on one point. Having the automatic initialisers in the same source as the related code is a good idea, it is far better than the old Space.c method. Automatically executing the initialisers in the order they are linked into vmlinux is also a good idea, you must have some automatic order or you are back to the horrors of Space.c. But controlling the link order of the initialisers as a side effect of Makefile compile order sucks big rocks, especially when you have some Makefiles executing other Makefiles which are not their children. Link order should be independent of Makefile compile order and it should be explicitly specified, instead of occurring as a little known side effect of the convoluted and twisted way that the Makefile tree is processed. Alas Linus likes it this way :(. At least with my 2.5 Makefile rewrite[*] it is more obvious what is going on. I abhore the line order dependencies and controling link order at the level of entire directories when it should be at the level of individual sources. >I've forwarded your patch to Ulrich & Martin ( the s390 maintainers ) & >they may use it >seeing as you & possibly others would prefer a /drivers/net/s390. DaveM suggested a drivers/net/s390, I recommended against changing the diretory structure right now, wait for 2.5. My patch just changes the compilation order for drivers/s390/net while keeping the same name. [*] Extract from top level Makefile in my 2.5 rewrite. # Link order information to build vmlinux. link_subdirs(arch/$(ARCH)/boot) link_subdirs(init) link_subdirs($(arch_core_subdirs)) link_subdirs(kernel) link_subdirs(mm) link_subdirs(fs) link_subdirs(ipc) # FIXME: Built from the DRIVERS definitions in the old Makefile. Some of this # is just to link intermediate objects into the kernel, some of it is link order # specific but the old Makefile had no documentation. Preserve the old list, # even though most of it is unnecessary. The problem is, we do not know which # bits are necessary because they have link order requirements. Anybody feeling # brave? KAO link_subdirs(drivers/block) link_subdirs(drivers/char) link_subdirs(drivers/misc) link_subdirs(drivers/net) link_subdirs(drivers/media) ... link_subdirs(drivers/acpi) link_subdirs(drivers/md) link_subdirs($(arch_drivers_subdirs)) link_subdirs(net) link_subdirs($(arch_libs)) link_subdirs(lib)
Keith Owens wrote:
>Minimal (and totally untested) patch to compile s390/net as part of the
>other network drivers follows - if it's good enough for acorn, it's
>good enough for s390. The method is as ugly as hell but it is the
>least possible change for 2.4, major redesign will have to wait for
>2.5. Patch against 2.4.6-pre2.
Keith, it's probably much easier to just change arch/s390/Makefile
to link the S/390 drivers *after* all common drivers by moving
drivers/s390/io.o to $DRIVERS instead of $CORE_FILES.
This should fix this particular problem, and appears to be more
logical anyway.
We are just testing this change whether any other problem crop up;
if it's OK we'll send a patch ...
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand
Linux for S/390 Design & Development
IBM Deutschland Entwicklung GmbH, Schoenaicher Str. 220, 71032 Boeblingen
Phone: +49-7031/16-3727 --- Email: Ulrich.Weigand@de.ibm.com
On Tue, 12 Jun 2001 19:05:01 +0200,
Ulrich.Weigand@de.ibm.com wrote:
>Keith, it's probably much easier to just change arch/s390/Makefile
>to link the S/390 drivers *after* all common drivers by moving
>drivers/s390/io.o to $DRIVERS instead of $CORE_FILES.
That should work, although it would be yet another (and different) way
of doing things. Your method would not work for SCSI drivers, they
have to go in the middle of drivers/scsi/Makefile.
That last point is only relevant if s390 supports native SCSI. SCSI
disks emulating 33[89]0 does not count, the cpu does not issue SCSI
commands. Are there any native SCSI devices for s390 or do they all
emulate IBM devices? I can just see somebody trying to fit SCSI
commands into a CCW and IOB :).
[-- Attachment #1: Type: text/plain, Size: 608 bytes --] Why not ?, the patch is less than 15 lines & you've managed to send me & others about 10 emails expressing a formed opinion without even looking at the code. D.J. Barrow Gnu/Linux for S/390 kernel developer eMail: djbarrow@de.ibm.com,barrow_dj@yahoo.com Phone: +49-(0)7031-16-2583 IBM Germany Lab, Schönaicherstr. 220, 71032 Böblingen "David S. Miller" <davem@redhat.com> on 12.06.2001 20:28:46 Please respond to "David S. Miller" <davem@redhat.com> To: Denis Joseph Barrow/Germany/Contr/IBM@IBMDE cc: Keith Owens <kaos@ocs.com.au> Subject: Re: bug in /net/core/dev.c [-- Attachment #2: Type: text/plain, Size: 394 bytes --] DJBARROW@de.ibm.com writes: > I'm sure you both understand the problem & have a valid solution. > But out of curiousity have either of you looked at the dev.c.2.4.5.v2.patch > & have an opinon on it. Not yet, I simply don't have the time today. I'll look at it tomorrow or the next day. Besides, debugging PCI x86 bios code is so much fun. :-( Later, David S. Miller davem@redhat.com
DJBARROW@de.ibm.com writes: > Why not ?, > the patch is less than 15 lines & you've managed to send me & others about > 10 emails > expressing a formed opinion without even looking at the code. Because those 10 mails were passive replies, reading the patch requires real thinking. Later, David S. Miller davem@redhat.com
Hi, Here is the patch again for the benefit of those who are allergic to opening enclosures. I believe it will only take about 30 seconds of "real thinking" for those familiar with dev.c to make an evaluation of the patch. Those already familiar with the bug can skip this paragraph, The bug fixes a problem that occurs if register_netdev gets called before net_dev_init the dev->refcnt gets incremented to 2 after net_dev_init is called & unregister_netdev will loop forever. The following additional lines at the top of register_netdev will call net_dev_init if it hasn't already been called & by doing this I'm able to remove about 20 lines of special case code in this function. + if (dev_boot_phase) + net_dev_init(); The following additional lines in net_dev_init will do a quick exit if this routine as already been called ( as may be done from register_netdev ). + if(!dev_boot_phase) + return 0; --- net/core/dev.old.c Wed May 30 14:24:47 2001 +++ net/core/dev.c Tue Jun 12 15:37:58 2001 @@ -20,6 +20,10 @@ * Pekka Riikonen <priikone@poesidon.pspt.fi> * * Changes: + * D.J. Barrow : Fixed bug where dev->refcnt gets set to 2 + * if register_netdev gets called before + * net_dev_init & also removed a few lines + * of code in the process. * Alan Cox : device private ioctl copies fields back. * Alan Cox : Transmit queue code does relevant stunts to * keep the queue safe. @@ -2382,6 +2386,7 @@ * will not get the same name. */ +int net_dev_init(void); int register_netdevice(struct net_device *dev) { struct net_device *d, **dp; @@ -2396,48 +2401,8 @@ dev->fastpath_lock=RW_LOCK_UNLOCKED; #endif - if (dev_boot_phase) { -#ifdef CONFIG_NET_DIVERT - ret = alloc_divert_blk(dev); - if (ret) - return ret; -#endif /* CONFIG_NET_DIVERT */ - - /* This is NOT bug, but I am not sure, that all the - devices, initialized before netdev module is started - are sane. - - Now they are chained to device boot list - and probed later. If a module is initialized - before netdev, but assumes that dev->init - is really called by register_netdev(), it will fail. - - So that this message should be printed for a while. - */ - printk(KERN_INFO "early initialization of device %s is deferred\n", dev->name); - - /* Check for existence, and append to tail of chain */ - for (dp=&dev_base; (d=*dp) != NULL; dp=&d->next) { - if (d == dev || strcmp(d->name, dev->name) == 0) { - return -EEXIST; - } - } - dev->next = NULL; - write_lock_bh(&dev_base_lock); - *dp = dev; - dev_hold(dev); - write_unlock_bh(&dev_base_lock); - - /* - * Default initial state at registry is that the - * device is present. - */ - - set_bit(__LINK_STATE_PRESENT, &dev->state); - - return 0; - } - + if (dev_boot_phase) + net_dev_init(); #ifdef CONFIG_NET_DIVERT ret = alloc_divert_blk(dev); if (ret) @@ -2679,7 +2644,9 @@ { struct net_device *dev, **dp; int i; - + + if(!dev_boot_phase) + return 0; #ifdef CONFIG_NET_SCHED pktsched_init(); #endif D.J. Barrow Gnu/Linux for S/390 kernel developer eMail: djbarrow@de.ibm.com,barrow_dj@yahoo.com Phone: +49-(0)7031-16-2583 IBM Germany Lab, Schönaicherstr. 220, 71032 Böblingen
DJBARROW@de.ibm.com writes: > Here is the patch again for the benefit of those who are allergic to > opening enclosures. Yes, this one is for those who are not allergic to line-breaks messing up the patch, it seems :-) The only reservation I had was wrt. procfs, but it dawned on me that we init procfs explicitly in init/main.c before initcalls are made. So I've fixed up the line breaks and coding style issues, and applied your patch. Thanks. Later, David S. Miller davem@redhat.com