All of lore.kernel.org
 help / color / mirror / Atom feed
* [GIT PATCH] driver core patches for 2.6.31-rc2
@ 2009-04-17 19:01 Greg KH
  2009-04-17 19:04 ` [PATCH 01/10] sysfs: don't use global workqueue in sysfs_schedule_callback() Greg Kroah-Hartman
                   ` (10 more replies)
  0 siblings, 11 replies; 15+ messages in thread
From: Greg KH @ 2009-04-17 19:01 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton; +Cc: linux-kernel

Here are a few driver-core patches for your 2.6.31-rc2 tree.

The biggest patch here is the addition of the early platform driver api.
This patch has been in the -mm tree for a few major kernel releases now
(around 5 months), and there are a number of drivers pending in some of
the arch trees that depend on this api being present.  It doesn't affect
any x86 code, so the majority of the world will have no problems with
this, and it all gets thrown away after init starts up, so it has no
affect on memory sizes.

Other than that, it's just a few bugfixes and tweaks.


Please pull from:
	master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-core-2.6.git/

All of these patches have been in the linux-next and mm trees for a
while.

The patches will be sent as a follow-on to this message to lkml for people
to see.

thanks,

greg k-h

------------

 Documentation/driver-model/platform.txt |   59 ++++++++
 drivers/base/base.h                     |    2 +-
 drivers/base/core.c                     |    3 +
 drivers/base/platform.c                 |  239 +++++++++++++++++++++++++++++++
 drivers/uio/uio_cif.c                   |    1 +
 fs/proc/base.c                          |    4 +-
 fs/sysfs/file.c                         |   16 ++-
 include/linux/dynamic_debug.h           |    2 +-
 include/linux/init.h                    |    1 +
 include/linux/kernel.h                  |    9 ++
 include/linux/platform_device.h         |   42 ++++++
 init/main.c                             |    7 +-
 lib/kobject_uevent.c                    |    2 +-
 13 files changed, 378 insertions(+), 9 deletions(-)

---------------

Alex Chiang (1):
      sysfs: don't use global workqueue in sysfs_schedule_callback()

Greg Kroah-Hartman (1):
      driver core: prevent device_for_each_child from oopsing

Hans J. Koch (1):
      UIO: fix specific device driver missing statement for depmod

Jason Baron (1):
      Driver core: remove pr_fmt() from dynamic_dev_dbg() printk

KOSAKI Motohiro (2):
      sysfs: sysfs poll keep the poll rule of regular file.
      proc: mounts_poll() make consistent to mdstat_poll

Kay Sievers (1):
      driver core: allow non-root users to listen to uevents

Magnus Damm (1):
      Driver Core: early platform driver

Michael Ellerman (1):
      dynamic debug: resurrect old pr_debug() semantics as pr_devel()

Ming Lei (1):
      driver core: fix driver_match_device


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 01/10] sysfs: don't use global workqueue in sysfs_schedule_callback()
  2009-04-17 19:01 [GIT PATCH] driver core patches for 2.6.31-rc2 Greg KH
@ 2009-04-17 19:04 ` Greg Kroah-Hartman
  2009-04-22 20:14   ` Andrew Morton
  2009-04-17 19:04 ` [PATCH 02/10] driver core: fix driver_match_device Greg Kroah-Hartman
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 15+ messages in thread
From: Greg Kroah-Hartman @ 2009-04-17 19:04 UTC (permalink / raw)
  To: linux-kernel; +Cc: Alex Chiang, Greg Kroah-Hartman

From: Alex Chiang <achiang@hp.com>

A sysfs attribute using sysfs_schedule_callback() to commit suicide
may end up calling device_unregister(), which will eventually call
a driver's ->remove function.

Drivers may call flush_scheduled_work() in their shutdown routines,
in which case lockdep will complain with something like the following:

  =============================================
  [ INFO: possible recursive locking detected ]
  2.6.29-rc8-kk #1
  ---------------------------------------------
  events/4/56 is trying to acquire lock:
  (events){--..}, at: [<ffffffff80257fc0>] flush_workqueue+0x0/0xa0

  but task is already holding lock:
  (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230

  other info that might help us debug this:
  3 locks held by events/4/56:
  #0:  (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
  #1:  (&ss->work){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
  #2:  (pci_remove_rescan_mutex){--..}, at: [<ffffffff803c10d1>] remove_callback+0x21/0x40

  stack backtrace:
  Pid: 56, comm: events/4 Not tainted 2.6.29-rc8-kk #1
  Call Trace:
  [<ffffffff8026dfcd>] validate_chain+0xb7d/0x1260
  [<ffffffff8026eade>] __lock_acquire+0x42e/0xa40
  [<ffffffff8026f148>] lock_acquire+0x58/0x80
  [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
  [<ffffffff8025800d>] flush_workqueue+0x4d/0xa0
  [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
  [<ffffffff80258070>] flush_scheduled_work+0x10/0x20
  [<ffffffffa0144065>] e1000_remove+0x55/0xfe [e1000e]
  [<ffffffff8033ee30>] ? sysfs_schedule_callback_work+0x0/0x50
  [<ffffffff803bfeb2>] pci_device_remove+0x32/0x70
  [<ffffffff80441da9>] __device_release_driver+0x59/0x90
  [<ffffffff80441edb>] device_release_driver+0x2b/0x40
  [<ffffffff804419d6>] bus_remove_device+0xa6/0x120
  [<ffffffff8043e46b>] device_del+0x12b/0x190
  [<ffffffff8043e4f6>] device_unregister+0x26/0x70
  [<ffffffff803ba969>] pci_stop_dev+0x49/0x60
  [<ffffffff803baab0>] pci_remove_bus_device+0x40/0xc0
  [<ffffffff803c10d9>] remove_callback+0x29/0x40
  [<ffffffff8033ee4f>] sysfs_schedule_callback_work+0x1f/0x50
  [<ffffffff8025769a>] run_workqueue+0x15a/0x230
  [<ffffffff80257648>] ? run_workqueue+0x108/0x230
  [<ffffffff8025846f>] worker_thread+0x9f/0x100
  [<ffffffff8025bce0>] ? autoremove_wake_function+0x0/0x40
  [<ffffffff802583d0>] ? worker_thread+0x0/0x100
  [<ffffffff8025b89d>] kthread+0x4d/0x80
  [<ffffffff8020d4ba>] child_rip+0xa/0x20
  [<ffffffff8020cebc>] ? restore_args+0x0/0x30
  [<ffffffff8025b850>] ? kthread+0x0/0x80
  [<ffffffff8020d4b0>] ? child_rip+0x0/0x20

Although we know that the device_unregister path will never acquire
a lock that a driver might try to acquire in its ->remove, in general
we should never attempt to flush a workqueue from within the same
workqueue, and lockdep rightly complains.

So as long as sysfs attributes cannot commit suicide directly and we
are stuck with this callback mechanism, put the sysfs callbacks on
their own workqueue instead of the global one.

This has the side benefit that if a suicidal sysfs attribute kicks
off a long chain of ->remove callbacks, we no longer induce a long
delay on the global queue.

This also fixes a missing module_put in the error path introduced
by sysfs-only-allow-one-scheduled-removal-callback-per-kobj.patch.

We never destroy the workqueue, but I'm not sure that's a
problem.

Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 fs/sysfs/file.c |   12 +++++++++++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 289c43a..979e937 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -667,6 +667,7 @@ struct sysfs_schedule_callback_struct {
 	struct work_struct	work;
 };
 
+static struct workqueue_struct *sysfs_workqueue;
 static DEFINE_MUTEX(sysfs_workq_mutex);
 static LIST_HEAD(sysfs_workq);
 static void sysfs_schedule_callback_work(struct work_struct *work)
@@ -715,11 +716,20 @@ int sysfs_schedule_callback(struct kobject *kobj, void (*func)(void *),
 	mutex_lock(&sysfs_workq_mutex);
 	list_for_each_entry_safe(ss, tmp, &sysfs_workq, workq_list)
 		if (ss->kobj == kobj) {
+			module_put(owner);
 			mutex_unlock(&sysfs_workq_mutex);
 			return -EAGAIN;
 		}
 	mutex_unlock(&sysfs_workq_mutex);
 
+	if (sysfs_workqueue == NULL) {
+		sysfs_workqueue = create_workqueue("sysfsd");
+		if (sysfs_workqueue == NULL) {
+			module_put(owner);
+			return -ENOMEM;
+		}
+	}
+
 	ss = kmalloc(sizeof(*ss), GFP_KERNEL);
 	if (!ss) {
 		module_put(owner);
@@ -735,7 +745,7 @@ int sysfs_schedule_callback(struct kobject *kobj, void (*func)(void *),
 	mutex_lock(&sysfs_workq_mutex);
 	list_add_tail(&ss->workq_list, &sysfs_workq);
 	mutex_unlock(&sysfs_workq_mutex);
-	schedule_work(&ss->work);
+	queue_work(sysfs_workqueue, &ss->work);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(sysfs_schedule_callback);
-- 
1.6.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 02/10] driver core: fix driver_match_device
  2009-04-17 19:01 [GIT PATCH] driver core patches for 2.6.31-rc2 Greg KH
  2009-04-17 19:04 ` [PATCH 01/10] sysfs: don't use global workqueue in sysfs_schedule_callback() Greg Kroah-Hartman
@ 2009-04-17 19:04 ` Greg Kroah-Hartman
  2009-04-17 19:04 ` [PATCH 03/10] driver core: allow non-root users to listen to uevents Greg Kroah-Hartman
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Greg Kroah-Hartman @ 2009-04-17 19:04 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ming Lei, Greg Kroah-Hartman

From: Ming Lei <tom.leiming@gmail.com>

This patch fixes a bug introduced in commit
49b420a13ff95b449947181190b08367348e3e1b.

If a instance of bus_type doesn't have  .match method,
all .probe of drivers in the bus should be called, or else
the .probe have not a chance to be called.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Reported-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 drivers/base/base.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/base/base.h b/drivers/base/base.h
index ddc9749..b528145 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -115,7 +115,7 @@ extern int driver_probe_device(struct device_driver *drv, struct device *dev);
 static inline int driver_match_device(struct device_driver *drv,
 				      struct device *dev)
 {
-	return drv->bus->match && drv->bus->match(dev, drv);
+	return drv->bus->match ? drv->bus->match(dev, drv) : 1;
 }
 
 extern void sysdev_shutdown(void);
-- 
1.6.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 03/10] driver core: allow non-root users to listen to uevents
  2009-04-17 19:01 [GIT PATCH] driver core patches for 2.6.31-rc2 Greg KH
  2009-04-17 19:04 ` [PATCH 01/10] sysfs: don't use global workqueue in sysfs_schedule_callback() Greg Kroah-Hartman
  2009-04-17 19:04 ` [PATCH 02/10] driver core: fix driver_match_device Greg Kroah-Hartman
@ 2009-04-17 19:04 ` Greg Kroah-Hartman
  2009-04-17 19:04 ` [PATCH 04/10] sysfs: sysfs poll keep the poll rule of regular file Greg Kroah-Hartman
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Greg Kroah-Hartman @ 2009-04-17 19:04 UTC (permalink / raw)
  To: linux-kernel; +Cc: Kay Sievers, Greg Kroah-Hartman

From: Kay Sievers <kay.sievers@vrfy.org>

Users can read sysfs files, there is no reason they should not be
allowed to listen to uevents.  This lets xorg and other userspace
programs properly get these messages without having to be root.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 lib/kobject_uevent.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c
index dafeecf..920a3ca 100644
--- a/lib/kobject_uevent.c
+++ b/lib/kobject_uevent.c
@@ -328,7 +328,7 @@ static int __init kobject_uevent_init(void)
 		       "kobject_uevent: unable to create netlink socket!\n");
 		return -ENODEV;
 	}
-
+	netlink_set_nonroot(NETLINK_KOBJECT_UEVENT, NL_NONROOT_RECV);
 	return 0;
 }
 
-- 
1.6.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 04/10] sysfs: sysfs poll keep the poll rule of regular file.
  2009-04-17 19:01 [GIT PATCH] driver core patches for 2.6.31-rc2 Greg KH
                   ` (2 preceding siblings ...)
  2009-04-17 19:04 ` [PATCH 03/10] driver core: allow non-root users to listen to uevents Greg Kroah-Hartman
@ 2009-04-17 19:04 ` Greg Kroah-Hartman
  2009-04-17 19:04 ` [PATCH 05/10] proc: mounts_poll() make consistent to mdstat_poll Greg Kroah-Hartman
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Greg Kroah-Hartman @ 2009-04-17 19:04 UTC (permalink / raw)
  To: linux-kernel; +Cc: KOSAKI Motohiro, Neil Brown, Greg Kroah-Hartman

From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

Currently, following test programs don't finished.

% ruby -e '
Thread.new { sleep }
File.read("/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies")
'

strace expose the reason.

...
open("/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies", O_RDONLY|O_LARGEFILE) = 3
ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbf9fa6b8) = -1 ENOTTY (Inappropriate ioctl for device)
fstat64(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
_llseek(3, 0, [0], SEEK_CUR)            = 0
select(4, [3], NULL, NULL, NULL)        = 1 (in [3])
read(3, "1400000 1300000 1200000 1100000 1"..., 4096) = 62
select(4, [3], NULL, NULL, NULL


Because Ruby (the scripting language) VM assume select system-call
against regular file don't block.  it because SUSv3 says "Regular files
shall always poll TRUE for reading and writing".  see
http://www.opengroup.org/onlinepubs/009695399/functions/poll.html it
seems valid assumption.

But sysfs_poll() don't keep this rule although sysfs file can read and
write always.

This patch restore proper poll behavior to sysfs.
/sys/block/md*/md/sync_action polling application and another sysfs
updating sensitive application still can use POLLERR and POLLPRI.

Cc: Neil Brown <neilb@suse.de>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 fs/sysfs/file.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 979e937..b1606e0 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -446,11 +446,11 @@ static unsigned int sysfs_poll(struct file *filp, poll_table *wait)
 	if (buffer->event != atomic_read(&od->event))
 		goto trigger;
 
-	return 0;
+	return DEFAULT_POLLMASK;
 
  trigger:
 	buffer->needs_read_fill = 1;
-	return POLLERR|POLLPRI;
+	return DEFAULT_POLLMASK|POLLERR|POLLPRI;
 }
 
 void sysfs_notify_dirent(struct sysfs_dirent *sd)
-- 
1.6.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 05/10] proc: mounts_poll() make consistent to mdstat_poll
  2009-04-17 19:01 [GIT PATCH] driver core patches for 2.6.31-rc2 Greg KH
                   ` (3 preceding siblings ...)
  2009-04-17 19:04 ` [PATCH 04/10] sysfs: sysfs poll keep the poll rule of regular file Greg Kroah-Hartman
@ 2009-04-17 19:04 ` Greg Kroah-Hartman
  2009-04-17 19:04 ` [PATCH 06/10] Driver Core: early platform driver Greg Kroah-Hartman
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Greg Kroah-Hartman @ 2009-04-17 19:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: KOSAKI Motohiro, Ram Pai, Miklos Szeredi, Al Viro, Greg Kroah-Hartman

From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

In recently sysfs_poll discussion, Neil Brown pointed out /proc/mounts
also should be fixed.

SUSv3 says "Regular files shall always poll TRUE for reading and
writing".  see
http://www.opengroup.org/onlinepubs/009695399/functions/poll.html

Then, mounts_poll()'s default should be "POLLIN | POLLRDNORM".  it mean
always readable.

In addition, event trigger should use "POLLERR | POLLPRI" instead
POLLERR.  it makes consistent to mdstat_poll() and sysfs_poll(). and,
select(2) can handle POLLPRI easily.


Reported-by: Neil Brown <neilb@suse.de>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Miklos Szeredi <mszeredi@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 fs/proc/base.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index f715597..aa763ab 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -648,14 +648,14 @@ static unsigned mounts_poll(struct file *file, poll_table *wait)
 {
 	struct proc_mounts *p = file->private_data;
 	struct mnt_namespace *ns = p->ns;
-	unsigned res = 0;
+	unsigned res = POLLIN | POLLRDNORM;
 
 	poll_wait(file, &ns->poll, wait);
 
 	spin_lock(&vfsmount_lock);
 	if (p->event != ns->event) {
 		p->event = ns->event;
-		res = POLLERR;
+		res |= POLLERR | POLLPRI;
 	}
 	spin_unlock(&vfsmount_lock);
 
-- 
1.6.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 06/10] Driver Core: early platform driver
  2009-04-17 19:01 [GIT PATCH] driver core patches for 2.6.31-rc2 Greg KH
                   ` (4 preceding siblings ...)
  2009-04-17 19:04 ` [PATCH 05/10] proc: mounts_poll() make consistent to mdstat_poll Greg Kroah-Hartman
@ 2009-04-17 19:04 ` Greg Kroah-Hartman
  2009-04-17 19:04 ` [PATCH 07/10] dynamic debug: resurrect old pr_debug() semantics as pr_devel() Greg Kroah-Hartman
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Greg Kroah-Hartman @ 2009-04-17 19:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: Magnus Damm, Paul Mundt, Kay Sievers, David Brownell, Tejun Heo,
	Andrew Morton, Greg Kroah-Hartman

From: Magnus Damm <damm@igel.co.jp>

V3 of the early platform driver implementation.

Platform drivers are great for embedded platforms because we can separate
driver configuration from the actual driver.  So base addresses,
interrupts and other configuration can be kept with the processor or board
code, and the platform driver can be reused by many different platforms.

For early devices we have nothing today.  For instance, to configure early
timers and early serial ports we cannot use platform devices.  This
because the setup order during boot.  Timers are needed before the
platform driver core code is available.  The same goes for early printk
support.  Early in this case means before initcalls.

These early drivers today have their configuration either hard coded or
they receive it using some special configuration method.  This is working
quite well, but if we want to support both regular kernel modules and
early devices then we need to have two ways of configuring the same
driver.  A single way would be better.

The early platform driver patch is basically a set of functions that allow
drivers to register themselves and architecture code to locate them and
probe.  Registration happens through early_param().  The time for the
probe is decided by the architecture code.

See Documentation/driver-model/platform.txt for more details.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: David Brownell <david-b@pacbell.net>
Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 Documentation/driver-model/platform.txt |   59 ++++++++
 drivers/base/platform.c                 |  239 +++++++++++++++++++++++++++++++
 include/linux/init.h                    |    1 +
 include/linux/platform_device.h         |   42 ++++++
 init/main.c                             |    7 +-
 5 files changed, 347 insertions(+), 1 deletions(-)

diff --git a/Documentation/driver-model/platform.txt b/Documentation/driver-model/platform.txt
index 83009fd..2e2c2ea 100644
--- a/Documentation/driver-model/platform.txt
+++ b/Documentation/driver-model/platform.txt
@@ -169,3 +169,62 @@ three different ways to find such a match:
       be probed later if another device registers.  (Which is OK, since
       this interface is only for use with non-hotpluggable devices.)
 
+
+Early Platform Devices and Drivers
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The early platform interfaces provide platform data to platform device
+drivers early on during the system boot. The code is built on top of the
+early_param() command line parsing and can be executed very early on.
+
+Example: "earlyprintk" class early serial console in 6 steps
+
+1. Registering early platform device data
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The architecture code registers platform device data using the function
+early_platform_add_devices(). In the case of early serial console this
+should be hardware configuration for the serial port. Devices registered
+at this point will later on be matched against early platform drivers.
+
+2. Parsing kernel command line
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The architecture code calls parse_early_param() to parse the kernel
+command line. This will execute all matching early_param() callbacks.
+User specified early platform devices will be registered at this point.
+For the early serial console case the user can specify port on the
+kernel command line as "earlyprintk=serial.0" where "earlyprintk" is
+the class string, "serial" is the name of the platfrom driver and
+0 is the platform device id. If the id is -1 then the dot and the
+id can be omitted.
+
+3. Installing early platform drivers belonging to a certain class
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The architecture code may optionally force registration of all early
+platform drivers belonging to a certain class using the function
+early_platform_driver_register_all(). User specified devices from
+step 2 have priority over these. This step is omitted by the serial
+driver example since the early serial driver code should be disabled
+unless the user has specified port on the kernel command line.
+
+4. Early platform driver registration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Compiled-in platform drivers making use of early_platform_init() are
+automatically registered during step 2 or 3. The serial driver example
+should use early_platform_init("earlyprintk", &platform_driver).
+
+5. Probing of early platform drivers belonging to a certain class
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The architecture code calls early_platform_driver_probe() to match
+registered early platform devices associated with a certain class with
+registered early platform drivers. Matched devices will get probed().
+This step can be executed at any point during the early boot. As soon
+as possible may be good for the serial port case.
+
+6. Inside the early platform driver probe()
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The driver code needs to take special care during early boot, especially
+when it comes to memory allocation and interrupt registration. The code
+in the probe() function can use is_early_platform_device() to check if
+it is called at early platform device or at the regular platform device
+time. The early serial driver performs register_console() at this point.
+
+For further information, see <linux/platform_device.h>.
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index d2198f6..b5b6c97 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -990,6 +990,8 @@ int __init platform_bus_init(void)
 {
 	int error;
 
+	early_platform_cleanup();
+
 	error = device_register(&platform_bus);
 	if (error)
 		return error;
@@ -1020,3 +1022,240 @@ u64 dma_get_required_mask(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(dma_get_required_mask);
 #endif
+
+static __initdata LIST_HEAD(early_platform_driver_list);
+static __initdata LIST_HEAD(early_platform_device_list);
+
+/**
+ * early_platform_driver_register
+ * @edrv: early_platform driver structure
+ * @buf: string passed from early_param()
+ */
+int __init early_platform_driver_register(struct early_platform_driver *epdrv,
+					  char *buf)
+{
+	unsigned long index;
+	int n;
+
+	/* Simply add the driver to the end of the global list.
+	 * Drivers will by default be put on the list in compiled-in order.
+	 */
+	if (!epdrv->list.next) {
+		INIT_LIST_HEAD(&epdrv->list);
+		list_add_tail(&epdrv->list, &early_platform_driver_list);
+	}
+
+	/* If the user has specified device then make sure the driver
+	 * gets prioritized. The driver of the last device specified on
+	 * command line will be put first on the list.
+	 */
+	n = strlen(epdrv->pdrv->driver.name);
+	if (buf && !strncmp(buf, epdrv->pdrv->driver.name, n)) {
+		list_move(&epdrv->list, &early_platform_driver_list);
+
+		if (!strcmp(buf, epdrv->pdrv->driver.name))
+			epdrv->requested_id = -1;
+		else if (buf[n] == '.' && strict_strtoul(&buf[n + 1], 10,
+							 &index) == 0)
+			epdrv->requested_id = index;
+		else
+			epdrv->requested_id = EARLY_PLATFORM_ID_ERROR;
+	}
+
+	return 0;
+}
+
+/**
+ * early_platform_add_devices - add a numbers of early platform devices
+ * @devs: array of early platform devices to add
+ * @num: number of early platform devices in array
+ */
+void __init early_platform_add_devices(struct platform_device **devs, int num)
+{
+	struct device *dev;
+	int i;
+
+	/* simply add the devices to list */
+	for (i = 0; i < num; i++) {
+		dev = &devs[i]->dev;
+
+		if (!dev->devres_head.next) {
+			INIT_LIST_HEAD(&dev->devres_head);
+			list_add_tail(&dev->devres_head,
+				      &early_platform_device_list);
+		}
+	}
+}
+
+/**
+ * early_platform_driver_register_all
+ * @class_str: string to identify early platform driver class
+ */
+void __init early_platform_driver_register_all(char *class_str)
+{
+	/* The "class_str" parameter may or may not be present on the kernel
+	 * command line. If it is present then there may be more than one
+	 * matching parameter.
+	 *
+	 * Since we register our early platform drivers using early_param()
+	 * we need to make sure that they also get registered in the case
+	 * when the parameter is missing from the kernel command line.
+	 *
+	 * We use parse_early_options() to make sure the early_param() gets
+	 * called at least once. The early_param() may be called more than
+	 * once since the name of the preferred device may be specified on
+	 * the kernel command line. early_platform_driver_register() handles
+	 * this case for us.
+	 */
+	parse_early_options(class_str);
+}
+
+/**
+ * early_platform_match
+ * @edrv: early platform driver structure
+ * @id: id to match against
+ */
+static  __init struct platform_device *
+early_platform_match(struct early_platform_driver *epdrv, int id)
+{
+	struct platform_device *pd;
+
+	list_for_each_entry(pd, &early_platform_device_list, dev.devres_head)
+		if (platform_match(&pd->dev, &epdrv->pdrv->driver))
+			if (pd->id == id)
+				return pd;
+
+	return NULL;
+}
+
+/**
+ * early_platform_left
+ * @edrv: early platform driver structure
+ * @id: return true if id or above exists
+ */
+static  __init int early_platform_left(struct early_platform_driver *epdrv,
+				       int id)
+{
+	struct platform_device *pd;
+
+	list_for_each_entry(pd, &early_platform_device_list, dev.devres_head)
+		if (platform_match(&pd->dev, &epdrv->pdrv->driver))
+			if (pd->id >= id)
+				return 1;
+
+	return 0;
+}
+
+/**
+ * early_platform_driver_probe_id
+ * @class_str: string to identify early platform driver class
+ * @id: id to match against
+ * @nr_probe: number of platform devices to successfully probe before exiting
+ */
+static int __init early_platform_driver_probe_id(char *class_str,
+						 int id,
+						 int nr_probe)
+{
+	struct early_platform_driver *epdrv;
+	struct platform_device *match;
+	int match_id;
+	int n = 0;
+	int left = 0;
+
+	list_for_each_entry(epdrv, &early_platform_driver_list, list) {
+		/* only use drivers matching our class_str */
+		if (strcmp(class_str, epdrv->class_str))
+			continue;
+
+		if (id == -2) {
+			match_id = epdrv->requested_id;
+			left = 1;
+
+		} else {
+			match_id = id;
+			left += early_platform_left(epdrv, id);
+
+			/* skip requested id */
+			switch (epdrv->requested_id) {
+			case EARLY_PLATFORM_ID_ERROR:
+			case EARLY_PLATFORM_ID_UNSET:
+				break;
+			default:
+				if (epdrv->requested_id == id)
+					match_id = EARLY_PLATFORM_ID_UNSET;
+			}
+		}
+
+		switch (match_id) {
+		case EARLY_PLATFORM_ID_ERROR:
+			pr_warning("%s: unable to parse %s parameter\n",
+				   class_str, epdrv->pdrv->driver.name);
+			/* fall-through */
+		case EARLY_PLATFORM_ID_UNSET:
+			match = NULL;
+			break;
+		default:
+			match = early_platform_match(epdrv, match_id);
+		}
+
+		if (match) {
+			if (epdrv->pdrv->probe(match))
+				pr_warning("%s: unable to probe %s early.\n",
+					   class_str, match->name);
+			else
+				n++;
+		}
+
+		if (n >= nr_probe)
+			break;
+	}
+
+	if (left)
+		return n;
+	else
+		return -ENODEV;
+}
+
+/**
+ * early_platform_driver_probe
+ * @class_str: string to identify early platform driver class
+ * @nr_probe: number of platform devices to successfully probe before exiting
+ * @user_only: only probe user specified early platform devices
+ */
+int __init early_platform_driver_probe(char *class_str,
+				       int nr_probe,
+				       int user_only)
+{
+	int k, n, i;
+
+	n = 0;
+	for (i = -2; n < nr_probe; i++) {
+		k = early_platform_driver_probe_id(class_str, i, nr_probe - n);
+
+		if (k < 0)
+			break;
+
+		n += k;
+
+		if (user_only)
+			break;
+	}
+
+	return n;
+}
+
+/**
+ * early_platform_cleanup - clean up early platform code
+ */
+void __init early_platform_cleanup(void)
+{
+	struct platform_device *pd, *pd2;
+
+	/* clean up the devres list used to chain devices */
+	list_for_each_entry_safe(pd, pd2, &early_platform_device_list,
+				 dev.devres_head) {
+		list_del(&pd->dev.devres_head);
+		memset(&pd->dev.devres_head, 0, sizeof(pd->dev.devres_head));
+	}
+}
+
diff --git a/include/linux/init.h b/include/linux/init.h
index 68cb026..f121a7a 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -247,6 +247,7 @@ struct obs_kernel_param {
 
 /* Relies on boot_command_line being set */
 void __init parse_early_param(void);
+void __init parse_early_options(char *cmdline);
 #endif /* __ASSEMBLY__ */
 
 /**
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 76e470a..72736fd 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -77,4 +77,46 @@ extern int platform_driver_probe(struct platform_driver *driver,
 #define platform_get_drvdata(_dev)	dev_get_drvdata(&(_dev)->dev)
 #define platform_set_drvdata(_dev,data)	dev_set_drvdata(&(_dev)->dev, (data))
 
+/* early platform driver interface */
+struct early_platform_driver {
+	const char *class_str;
+	struct platform_driver *pdrv;
+	struct list_head list;
+	int requested_id;
+};
+
+#define EARLY_PLATFORM_ID_UNSET -2
+#define EARLY_PLATFORM_ID_ERROR -3
+
+extern int early_platform_driver_register(struct early_platform_driver *epdrv,
+					  char *buf);
+extern void early_platform_add_devices(struct platform_device **devs, int num);
+
+static inline int is_early_platform_device(struct platform_device *pdev)
+{
+	return !pdev->dev.driver;
+}
+
+extern void early_platform_driver_register_all(char *class_str);
+extern int early_platform_driver_probe(char *class_str,
+				       int nr_probe, int user_only);
+extern void early_platform_cleanup(void);
+
+
+#ifndef MODULE
+#define early_platform_init(class_string, platform_driver)		\
+static __initdata struct early_platform_driver early_driver = {		\
+	.class_str = class_string,					\
+	.pdrv = platform_driver,					\
+	.requested_id = EARLY_PLATFORM_ID_UNSET,			\
+};									\
+static int __init early_platform_driver_setup_func(char *buf)		\
+{									\
+	return early_platform_driver_register(&early_driver, buf);	\
+}									\
+early_param(class_string, early_platform_driver_setup_func)
+#else /* MODULE */
+#define early_platform_init(class_string, platform_driver)
+#endif /* MODULE */
+
 #endif /* _PLATFORM_DEVICE_H_ */
diff --git a/init/main.c b/init/main.c
index 3585f07..3bbf93b 100644
--- a/init/main.c
+++ b/init/main.c
@@ -492,6 +492,11 @@ static int __init do_early_param(char *param, char *val)
 	return 0;
 }
 
+void __init parse_early_options(char *cmdline)
+{
+	parse_args("early options", cmdline, NULL, 0, do_early_param);
+}
+
 /* Arch code calls this early on, or if not, just before other parsing. */
 void __init parse_early_param(void)
 {
@@ -503,7 +508,7 @@ void __init parse_early_param(void)
 
 	/* All fall through to do_early_param. */
 	strlcpy(tmp_cmdline, boot_command_line, COMMAND_LINE_SIZE);
-	parse_args("early options", tmp_cmdline, NULL, 0, do_early_param);
+	parse_early_options(tmp_cmdline);
 	done = 1;
 }
 
-- 
1.6.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 07/10] dynamic debug: resurrect old pr_debug() semantics as pr_devel()
  2009-04-17 19:01 [GIT PATCH] driver core patches for 2.6.31-rc2 Greg KH
                   ` (5 preceding siblings ...)
  2009-04-17 19:04 ` [PATCH 06/10] Driver Core: early platform driver Greg Kroah-Hartman
@ 2009-04-17 19:04 ` Greg Kroah-Hartman
  2009-04-17 19:04 ` [PATCH 08/10] driver core: prevent device_for_each_child from oopsing Greg Kroah-Hartman
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Greg Kroah-Hartman @ 2009-04-17 19:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: Michael Ellerman, Jason Baron, Greg Banks, Andrew Morton,
	Greg Kroah-Hartman

From: Michael Ellerman <michael@ellerman.id.au>

pr_debug() used to produce zero code unless DEBUG was #defined.  This is
now no longer the case in practice[1].

There are places where it's useful to have debugging printks, but we don't
want them to generate any code in production kernels.

So add a new macro, pr_devel(), for _devel_opment, to provide the old
semantics, ie.  if the programmer doesn't explicitly enable debugging, no
code is produced.

[1]: You can turn CONFIG_DYNAMIC_DEBUG off, but it's enabled in at least
     one distro kernel, so it's not really a solution.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Greg Banks <gnb@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 include/linux/kernel.h |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index d9e75ec..883cd44 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -377,6 +377,15 @@ static inline char *pack_hex_byte(char *buf, u8 byte)
 #define pr_cont(fmt, ...) \
 	printk(KERN_CONT fmt, ##__VA_ARGS__)
 
+/* pr_devel() should produce zero code unless DEBUG is defined */
+#ifdef DEBUG
+#define pr_devel(fmt, ...) \
+	printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
+#else
+#define pr_devel(fmt, ...) \
+	({ if (0) printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); 0; })
+#endif
+
 /* If you are writing a driver, please use dev_dbg instead */
 #if defined(DEBUG)
 #define pr_debug(fmt, ...) \
-- 
1.6.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 08/10] driver core: prevent device_for_each_child from oopsing
  2009-04-17 19:01 [GIT PATCH] driver core patches for 2.6.31-rc2 Greg KH
                   ` (6 preceding siblings ...)
  2009-04-17 19:04 ` [PATCH 07/10] dynamic debug: resurrect old pr_debug() semantics as pr_devel() Greg Kroah-Hartman
@ 2009-04-17 19:04 ` Greg Kroah-Hartman
  2009-04-17 19:04 ` [PATCH 09/10] Driver core: remove pr_fmt() from dynamic_dev_dbg() printk Greg Kroah-Hartman
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Greg Kroah-Hartman @ 2009-04-17 19:04 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman

David Vrabel noticed that the wireless usb stack likes to call
device_for_each_chile() with an empty bus.  This used to work fine, but
now oopses.  This patch fixes the oops and makes the code behave like it
used to.

Reported-by: David Vrabel <david.vrabel@csr.com>
Tested-by: David Vrabel <david.vrabel@csr.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 drivers/base/core.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index e73c92d..d230ff4 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1142,6 +1142,9 @@ int device_for_each_child(struct device *parent, void *data,
 	struct device *child;
 	int error = 0;
 
+	if (!parent->p)
+		return 0;
+
 	klist_iter_init(&parent->p->klist_children, &i);
 	while ((child = next_device(&i)) && !error)
 		error = fn(child, data);
-- 
1.6.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 09/10] Driver core: remove pr_fmt() from dynamic_dev_dbg() printk
  2009-04-17 19:01 [GIT PATCH] driver core patches for 2.6.31-rc2 Greg KH
                   ` (7 preceding siblings ...)
  2009-04-17 19:04 ` [PATCH 08/10] driver core: prevent device_for_each_child from oopsing Greg Kroah-Hartman
@ 2009-04-17 19:04 ` Greg Kroah-Hartman
  2009-04-17 19:04 ` [PATCH 10/10] UIO: fix specific device driver missing statement for depmod Greg Kroah-Hartman
  2009-04-18  3:57 ` [GIT PATCH] driver core patches for 2.6.31-rc2 KOSAKI Motohiro
  10 siblings, 0 replies; 15+ messages in thread
From: Greg Kroah-Hartman @ 2009-04-17 19:04 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jason Baron, Greg Kroah-Hartman

From: Jason Baron <jbaron@redhat.com>

When pr_fmt() was added to the pr_debug() code, we added it not only to the
dynamic_pr_debug() function, but also to the dynamic_dev_dbg() funciton.
However, dev_dbg() doesn't make use of pr_fmt(), so neither should
dynamic_dev_dbg().

Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 include/linux/dynamic_debug.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index baabf33..a0d9422 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -70,7 +70,7 @@ extern int ddebug_remove_module(char *mod_name);
 		DEBUG_HASH2, __LINE__, _DPRINTK_FLAGS_DEFAULT };	\
 	if (__dynamic_dbg_enabled(descriptor))				\
 			dev_printk(KERN_DEBUG, dev,			\
-					KBUILD_MODNAME ": " pr_fmt(fmt),\
+					KBUILD_MODNAME ": " fmt,	\
 					##__VA_ARGS__);			\
 	} while (0)
 
-- 
1.6.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 10/10] UIO: fix specific device driver missing statement for depmod
  2009-04-17 19:01 [GIT PATCH] driver core patches for 2.6.31-rc2 Greg KH
                   ` (8 preceding siblings ...)
  2009-04-17 19:04 ` [PATCH 09/10] Driver core: remove pr_fmt() from dynamic_dev_dbg() printk Greg Kroah-Hartman
@ 2009-04-17 19:04 ` Greg Kroah-Hartman
  2009-04-18  3:57 ` [GIT PATCH] driver core patches for 2.6.31-rc2 KOSAKI Motohiro
  10 siblings, 0 replies; 15+ messages in thread
From: Greg Kroah-Hartman @ 2009-04-17 19:04 UTC (permalink / raw)
  To: linux-kernel; +Cc: Hans J. Koch, Greg Kroah-Hartman

From: Hans J. Koch <hjk@linutronix.de>

On Fri, Apr 10, 2009 at 01:50:50PM -0700, Andrew Morton wrote:
> On Fri, 10 Apr 2009 13:32:01 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
>
> > http://bugzilla.kernel.org/show_bug.cgi?id=13059

drivers/uio/uio_cif.c misses a MODULE_DEVICE_TABLE, this fixes it.

Signed-off-by: Hans J. Koch <hjk@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 drivers/uio/uio_cif.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/uio/uio_cif.c b/drivers/uio/uio_cif.c
index c60b8fc..28034c8 100644
--- a/drivers/uio/uio_cif.c
+++ b/drivers/uio/uio_cif.c
@@ -147,5 +147,6 @@ static void __exit hilscher_exit_module(void)
 module_init(hilscher_init_module);
 module_exit(hilscher_exit_module);
 
+MODULE_DEVICE_TABLE(pci, hilscher_pci_ids);
 MODULE_LICENSE("GPL v2");
 MODULE_AUTHOR("Hans J. Koch, Benedikt Spranger");
-- 
1.6.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [GIT PATCH] driver core patches for 2.6.31-rc2
  2009-04-17 19:01 [GIT PATCH] driver core patches for 2.6.31-rc2 Greg KH
                   ` (9 preceding siblings ...)
  2009-04-17 19:04 ` [PATCH 10/10] UIO: fix specific device driver missing statement for depmod Greg Kroah-Hartman
@ 2009-04-18  3:57 ` KOSAKI Motohiro
  2009-04-18 16:50   ` Greg KH
  10 siblings, 1 reply; 15+ messages in thread
From: KOSAKI Motohiro @ 2009-04-18  3:57 UTC (permalink / raw)
  To: Greg KH; +Cc: Linus Torvalds, Andrew Morton, linux-kernel

2009/4/18 Greg KH <gregkh@suse.de>:
> Here are a few driver-core patches for your 2.6.31-rc2 tree.

2.6.30-rc2?

>
> The biggest patch here is the addition of the early platform driver api.
> This patch has been in the -mm tree for a few major kernel releases now
> (around 5 months), and there are a number of drivers pending in some of
> the arch trees that depend on this api being present.  It doesn't affect
> any x86 code, so the majority of the world will have no problems with
> this, and it all gets thrown away after init starts up, so it has no
> affect on memory sizes.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [GIT PATCH] driver core patches for 2.6.31-rc2
  2009-04-18  3:57 ` [GIT PATCH] driver core patches for 2.6.31-rc2 KOSAKI Motohiro
@ 2009-04-18 16:50   ` Greg KH
  0 siblings, 0 replies; 15+ messages in thread
From: Greg KH @ 2009-04-18 16:50 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: Linus Torvalds, Andrew Morton, linux-kernel

On Sat, Apr 18, 2009 at 12:57:23PM +0900, KOSAKI Motohiro wrote:
> 2009/4/18 Greg KH <gregkh@suse.de>:
> > Here are a few driver-core patches for your 2.6.31-rc2 tree.
> 
> 2.6.30-rc2?

Yeah, sorry, I'm numerically challenged at times :)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 01/10] sysfs: don't use global workqueue in sysfs_schedule_callback()
  2009-04-17 19:04 ` [PATCH 01/10] sysfs: don't use global workqueue in sysfs_schedule_callback() Greg Kroah-Hartman
@ 2009-04-22 20:14   ` Andrew Morton
  2009-04-28  5:15     ` Alex Chiang
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Morton @ 2009-04-22 20:14 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, achiang, gregkh, Lai Jiangshan

On Fri, 17 Apr 2009 12:04:25 -0700
Greg Kroah-Hartman <gregkh@suse.de> wrote:

> From: Alex Chiang <achiang@hp.com>
> 
> A sysfs attribute using sysfs_schedule_callback() to commit suicide
> may end up calling device_unregister(), which will eventually call
> a driver's ->remove function.
> 
> Drivers may call flush_scheduled_work() in their shutdown routines,
> in which case lockdep will complain with something like the following:
> 
>   =============================================
>   [ INFO: possible recursive locking detected ]
>   2.6.29-rc8-kk #1
>   ---------------------------------------------
>   events/4/56 is trying to acquire lock:
>   (events){--..}, at: [<ffffffff80257fc0>] flush_workqueue+0x0/0xa0
> 
>   but task is already holding lock:
>   (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> 
>   other info that might help us debug this:
>   3 locks held by events/4/56:
>   #0:  (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
>   #1:  (&ss->work){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
>   #2:  (pci_remove_rescan_mutex){--..}, at: [<ffffffff803c10d1>] remove_callback+0x21/0x40
> 
>   stack backtrace:
>   Pid: 56, comm: events/4 Not tainted 2.6.29-rc8-kk #1
>   Call Trace:
>   [<ffffffff8026dfcd>] validate_chain+0xb7d/0x1260
>   [<ffffffff8026eade>] __lock_acquire+0x42e/0xa40
>   [<ffffffff8026f148>] lock_acquire+0x58/0x80
>   [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
>   [<ffffffff8025800d>] flush_workqueue+0x4d/0xa0
>   [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
>   [<ffffffff80258070>] flush_scheduled_work+0x10/0x20
>   [<ffffffffa0144065>] e1000_remove+0x55/0xfe [e1000e]
>   [<ffffffff8033ee30>] ? sysfs_schedule_callback_work+0x0/0x50
>   [<ffffffff803bfeb2>] pci_device_remove+0x32/0x70
>   [<ffffffff80441da9>] __device_release_driver+0x59/0x90
>   [<ffffffff80441edb>] device_release_driver+0x2b/0x40
>   [<ffffffff804419d6>] bus_remove_device+0xa6/0x120
>   [<ffffffff8043e46b>] device_del+0x12b/0x190
>   [<ffffffff8043e4f6>] device_unregister+0x26/0x70
>   [<ffffffff803ba969>] pci_stop_dev+0x49/0x60
>   [<ffffffff803baab0>] pci_remove_bus_device+0x40/0xc0
>   [<ffffffff803c10d9>] remove_callback+0x29/0x40
>   [<ffffffff8033ee4f>] sysfs_schedule_callback_work+0x1f/0x50
>   [<ffffffff8025769a>] run_workqueue+0x15a/0x230
>   [<ffffffff80257648>] ? run_workqueue+0x108/0x230
>   [<ffffffff8025846f>] worker_thread+0x9f/0x100
>   [<ffffffff8025bce0>] ? autoremove_wake_function+0x0/0x40
>   [<ffffffff802583d0>] ? worker_thread+0x0/0x100
>   [<ffffffff8025b89d>] kthread+0x4d/0x80
>   [<ffffffff8020d4ba>] child_rip+0xa/0x20
>   [<ffffffff8020cebc>] ? restore_args+0x0/0x30
>   [<ffffffff8025b850>] ? kthread+0x0/0x80
>   [<ffffffff8020d4b0>] ? child_rip+0x0/0x20
> 
> Although we know that the device_unregister path will never acquire
> a lock that a driver might try to acquire in its ->remove, in general
> we should never attempt to flush a workqueue from within the same
> workqueue, and lockdep rightly complains.
> 
> So as long as sysfs attributes cannot commit suicide directly and we
> are stuck with this callback mechanism, put the sysfs callbacks on
> their own workqueue instead of the global one.
> 
> This has the side benefit that if a suicidal sysfs attribute kicks
> off a long chain of ->remove callbacks, we no longer induce a long
> delay on the global queue.

I still don't know why I merged 

: commit 2355b70fd59cb5be7de2052a9edeee7afb7ff099
: Author: Lai Jiangshan <laijs@cn.fujitsu.com>
: Date:   Thu Apr 2 16:58:24 2009 -0700
:
:    workqueue: avoid recursion in run_workqueue()

there was nothing wrong with permitting limited recursion into
run_workqueue().  It never deadlocked and the three-deep-recursion
warning never triggered.

> +	if (sysfs_workqueue == NULL) {
> +		sysfs_workqueue = create_workqueue("sysfsd");
> +		if (sysfs_workqueue == NULL) {
> +			module_put(owner);
> +			return -ENOMEM;
> +		}
> +	}

This will create a kernel thread per CPU.  Surely
create_singlethread_workqueue() will suffice?


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 01/10] sysfs: don't use global workqueue in sysfs_schedule_callback()
  2009-04-22 20:14   ` Andrew Morton
@ 2009-04-28  5:15     ` Alex Chiang
  0 siblings, 0 replies; 15+ messages in thread
From: Alex Chiang @ 2009-04-28  5:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Greg Kroah-Hartman, linux-kernel, Lai Jiangshan, a.p.zijlstra

* Andrew Morton <akpm@linux-foundation.org>:
> On Fri, 17 Apr 2009 12:04:25 -0700
> Greg Kroah-Hartman <gregkh@suse.de> wrote:
> 
> > From: Alex Chiang <achiang@hp.com>
> > 
> > A sysfs attribute using sysfs_schedule_callback() to commit suicide
> > may end up calling device_unregister(), which will eventually call
> > a driver's ->remove function.
> > 
> > Drivers may call flush_scheduled_work() in their shutdown routines,
> > in which case lockdep will complain with something like the following:
> > 
> >   =============================================
> >   [ INFO: possible recursive locking detected ]
> >   2.6.29-rc8-kk #1
> >   ---------------------------------------------
> >   events/4/56 is trying to acquire lock:
> >   (events){--..}, at: [<ffffffff80257fc0>] flush_workqueue+0x0/0xa0
> > 
> >   but task is already holding lock:
> >   (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> > 
> >   other info that might help us debug this:
> >   3 locks held by events/4/56:
> >   #0:  (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> >   #1:  (&ss->work){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> >   #2:  (pci_remove_rescan_mutex){--..}, at: [<ffffffff803c10d1>] remove_callback+0x21/0x40
> > 
> >   stack backtrace:
> >   Pid: 56, comm: events/4 Not tainted 2.6.29-rc8-kk #1
> >   Call Trace:
> >   [<ffffffff8026dfcd>] validate_chain+0xb7d/0x1260
> >   [<ffffffff8026eade>] __lock_acquire+0x42e/0xa40
> >   [<ffffffff8026f148>] lock_acquire+0x58/0x80
> >   [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
> >   [<ffffffff8025800d>] flush_workqueue+0x4d/0xa0
> >   [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
> >   [<ffffffff80258070>] flush_scheduled_work+0x10/0x20
> >   [<ffffffffa0144065>] e1000_remove+0x55/0xfe [e1000e]
> >   [<ffffffff8033ee30>] ? sysfs_schedule_callback_work+0x0/0x50
> >   [<ffffffff803bfeb2>] pci_device_remove+0x32/0x70
> >   [<ffffffff80441da9>] __device_release_driver+0x59/0x90
> >   [<ffffffff80441edb>] device_release_driver+0x2b/0x40
> >   [<ffffffff804419d6>] bus_remove_device+0xa6/0x120
> >   [<ffffffff8043e46b>] device_del+0x12b/0x190
> >   [<ffffffff8043e4f6>] device_unregister+0x26/0x70
> >   [<ffffffff803ba969>] pci_stop_dev+0x49/0x60
> >   [<ffffffff803baab0>] pci_remove_bus_device+0x40/0xc0
> >   [<ffffffff803c10d9>] remove_callback+0x29/0x40
> >   [<ffffffff8033ee4f>] sysfs_schedule_callback_work+0x1f/0x50
> >   [<ffffffff8025769a>] run_workqueue+0x15a/0x230
> >   [<ffffffff80257648>] ? run_workqueue+0x108/0x230
> >   [<ffffffff8025846f>] worker_thread+0x9f/0x100
> >   [<ffffffff8025bce0>] ? autoremove_wake_function+0x0/0x40
> >   [<ffffffff802583d0>] ? worker_thread+0x0/0x100
> >   [<ffffffff8025b89d>] kthread+0x4d/0x80
> >   [<ffffffff8020d4ba>] child_rip+0xa/0x20
> >   [<ffffffff8020cebc>] ? restore_args+0x0/0x30
> >   [<ffffffff8025b850>] ? kthread+0x0/0x80
> >   [<ffffffff8020d4b0>] ? child_rip+0x0/0x20
> > 
> > Although we know that the device_unregister path will never acquire
> > a lock that a driver might try to acquire in its ->remove, in general
> > we should never attempt to flush a workqueue from within the same
> > workqueue, and lockdep rightly complains.
> > 
> > So as long as sysfs attributes cannot commit suicide directly and we
> > are stuck with this callback mechanism, put the sysfs callbacks on
> > their own workqueue instead of the global one.
> > 
> > This has the side benefit that if a suicidal sysfs attribute kicks
> > off a long chain of ->remove callbacks, we no longer induce a long
> > delay on the global queue.
> 
> I still don't know why I merged 
> 
> : commit 2355b70fd59cb5be7de2052a9edeee7afb7ff099
> : Author: Lai Jiangshan <laijs@cn.fujitsu.com>
> : Date:   Thu Apr 2 16:58:24 2009 -0700
> :
> :    workqueue: avoid recursion in run_workqueue()
> 
> there was nothing wrong with permitting limited recursion into
> run_workqueue().  It never deadlocked and the three-deep-recursion
> warning never triggered.

According to Peter Zjilstra and Lai Jiangshan, there actually is
danger of deadlock when flushing a workqueue from the same
workqueue:

	http://thread.gmane.org/gmane.linux.kernel.pci/3713/focus=811703

And in any case, I think it's a good idea to avoid the global
workqueue, as removing a PCI bridge near the root of the
hierarchy may result in a longish series of operations as we
unregister drivers, etc.

> > +	if (sysfs_workqueue == NULL) {
> > +		sysfs_workqueue = create_workqueue("sysfsd");
> > +		if (sysfs_workqueue == NULL) {
> > +			module_put(owner);
> > +			return -ENOMEM;
> > +		}
> > +	}
> 
> This will create a kernel thread per CPU.  Surely
> create_singlethread_workqueue() will suffice?

Oh darn, you are right. We do not need a per-CPU thread. I will
queue up a patch to use create_singlethread_workqueue.

Thanks for the review.

/ac

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2009-04-28  5:18 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-04-17 19:01 [GIT PATCH] driver core patches for 2.6.31-rc2 Greg KH
2009-04-17 19:04 ` [PATCH 01/10] sysfs: don't use global workqueue in sysfs_schedule_callback() Greg Kroah-Hartman
2009-04-22 20:14   ` Andrew Morton
2009-04-28  5:15     ` Alex Chiang
2009-04-17 19:04 ` [PATCH 02/10] driver core: fix driver_match_device Greg Kroah-Hartman
2009-04-17 19:04 ` [PATCH 03/10] driver core: allow non-root users to listen to uevents Greg Kroah-Hartman
2009-04-17 19:04 ` [PATCH 04/10] sysfs: sysfs poll keep the poll rule of regular file Greg Kroah-Hartman
2009-04-17 19:04 ` [PATCH 05/10] proc: mounts_poll() make consistent to mdstat_poll Greg Kroah-Hartman
2009-04-17 19:04 ` [PATCH 06/10] Driver Core: early platform driver Greg Kroah-Hartman
2009-04-17 19:04 ` [PATCH 07/10] dynamic debug: resurrect old pr_debug() semantics as pr_devel() Greg Kroah-Hartman
2009-04-17 19:04 ` [PATCH 08/10] driver core: prevent device_for_each_child from oopsing Greg Kroah-Hartman
2009-04-17 19:04 ` [PATCH 09/10] Driver core: remove pr_fmt() from dynamic_dev_dbg() printk Greg Kroah-Hartman
2009-04-17 19:04 ` [PATCH 10/10] UIO: fix specific device driver missing statement for depmod Greg Kroah-Hartman
2009-04-18  3:57 ` [GIT PATCH] driver core patches for 2.6.31-rc2 KOSAKI Motohiro
2009-04-18 16:50   ` Greg KH

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.