From: Greg KH <gregkh@suse.de>
To: Ingo Molnar <mingo@elte.hu>, Kay Sievers <kay.sievers@vrfy.org>
Cc: Adrian Bunk <bunk@stusta.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [bug] hung bootup in various drivers, was: "2.6.21-rc5: known regressions"
Date: Fri, 30 Mar 2007 07:16:40 -0700
Message-ID: <20070330141639.GA10387@suse.de>
In-Reply-To: <20070330120416.GA19373@elte.hu>

On Fri, Mar 30, 2007 at 02:04:16PM +0200, Ingo Molnar wrote:
> 
> i just found a new category of driver regressions in 2.6.21, doing 
> allyesconfig bzImage bootup tests: the init methods of various drivers 
> hang in driver_unregister().
> 
> It is caused by this problem: the semantics of driver_unregister() [also 
> implicitly called in pci_unregister_driver()] have apparently changed 
> recently. If a driver does:
> 
> 	pci_register_driver(&my_driver);
> 	...
> 	if (some_failure) {
> 		pci_unregister_driver(&my_driver);
> 		...
> 	}
> 
> it will hang the bootup in the following piece of code:
> 
>  drivers/base/driver.c:
> 
>   void driver_unregister(struct device_driver * drv)
>   {
>          bus_remove_driver(drv);
>          wait_for_completion(&drv->unloaded);
> 
> the completion is never done - because nobody removes the bus while the 
> init is still happening, obviously. (and bootup is serialized anyway)
> 
> now, the majority of drivers do the driver unregistration from their 
> module-cleanup function, so they are not affected by this problem. But if 
> you apply the debug patch attached further below, and do an allyesconfig 
> bzImage bootup, there are 3 hits already:
> 
>  BUG: at drivers/base/driver.c:187 driver_unregister()
>   [<c0105ff9>] show_trace_log_lvl+0x19/0x2e
>   [<c01063e2>] show_trace+0x12/0x14
>   [<c01063f8>] dump_stack+0x14/0x16
>   [<c063f7e6>] driver_unregister+0x3d/0x43
>   [<c0488048>] pci_unregister_driver+0x10/0x5f
>   [<c1b5f7c7>] slgt_init+0x9b/0x1ca
>   [<c1b31a2d>] init+0x15d/0x2bd
>   [<c0105bc3>] kernel_thread_helper+0x7/0x10
> 
>  BUG: at drivers/base/driver.c:187 driver_unregister()
>   [<c0105ff9>] show_trace_log_lvl+0x19/0x2e
>   [<c01063e2>] show_trace+0x12/0x14
>   [<c01063f8>] dump_stack+0x14/0x16
>   [<c063f7e6>] driver_unregister+0x3d/0x43
>   [<c0488048>] pci_unregister_driver+0x10/0x5f
>   [<c0619505>] init_ipmi_si+0x70a/0x738
>   [<c1b31a2d>] init+0x15d/0x2bd
>   [<c0105bc3>] kernel_thread_helper+0x7/0x10
> 
>  BUG: at drivers/base/driver.c:187 driver_unregister()
>   [<c0105ff9>] show_trace_log_lvl+0x19/0x2e
>   [<c01063e2>] show_trace+0x12/0x14
>   [<c01063f8>] dump_stack+0x14/0x16
>   [<c063f7e6>] driver_unregister+0x3d/0x43
>   [<c0488048>] pci_unregister_driver+0x10/0x5f
>   [<c1b6d2d8>] tlan_probe+0x2dd/0x30e
>   [<c1b31a2d>] init+0x15d/0x2bd
>   [<c0105bc3>] kernel_thread_helper+0x7/0x10
> 
> possibly more could trigger. Each of these 3 places caused an actual 
> bootup hang on my testbox, so these are real regressions and need to be 
> fixed.
> 
> because there are a good number of drivers that do 
> pci_unregister_driver() from their init function, and because i cannot 
> see anything obviously wrong in doing an unregister call after a 
> failure, i think it's driver_unregister() that needs to be fixed. Greg, 
> what do you think?

Yes, we should allow driver_unregister() to be called from within the
module_init function.

But I don't understand what is causing you to see this problem.  Who is
holding the reference on the struct device at this point in time?  Is it
the fact that userspace has some files open and it hasn't released them
yet?

I don't see anything implicit in the driver_unregister() path that
should not work from within the module_init() path.  Kay, am I missing
anything here?

(patch left below for Kay's benefit)

thanks,

greg k-h

> Index: linux/drivers/base/driver.c
> ===================================================================
> --- linux.orig/drivers/base/driver.c
> +++ linux/drivers/base/driver.c
> @@ -183,7 +183,8 @@ int driver_register(struct device_driver
>  void driver_unregister(struct device_driver * drv)
>  {
>  	bus_remove_driver(drv);
> -	wait_for_completion(&drv->unloaded);
> +	if (!drv->unloaded.done)
> +		WARN_ON(1);
>  }
>  
>  /**
