linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RESEND][PATCH-2.6.24-rc8] Fix fakephp deadlock
@ 2008-01-22 14:28 Ian Abbott
  2008-01-23 17:46 ` Greg KH
  0 siblings, 1 reply; 4+ messages in thread
From: Ian Abbott @ 2008-01-22 14:28 UTC (permalink / raw)
  To: linux-kernel, linux-pci, pcihpd-discuss, gregkh, kristen.c.accardi

From: Ian Abbott <abbotti@mev.co.uk>

I last sent this 11 days ago and didn't get any replies, so my timeout
handler has kicked in. :)

The patch is same as last time except I've bumped the revision from rc7
to rc8 and have remembered to use diff's -p option this time!

---
From: Ian Abbott <abbotti@mev.co.uk>

If the fakephp driver is used to emulate removal of a PCI device by
writing text string "0" to the "power" sysfs attribute file, this causes
its parent directory and its contents (including the "power" file) to be
deleted before the write operation returns.  Unfortunately, it ends up
in a deadlock waiting for itself to complete.

The deadlock is as follows: sysfs_write_file calls flush_write_buffer
which calls sysfs_get_active_two before calling power_write_file in
pci_hotplug_core.c via the sysfs store operation. The power_write_file
function calls disable_slot in fakephp.c via the slot operation.  The
disable_slot function calls remove_slot which calls pci_hp_deregister
(back in pci_hotplug_core.c) which calls fs_remove_slot which calls
sysfs_remove_file to remove the "power" file. The sysfs_remove_file
function calls sysfs_hash_and_remove which calls sysfs_addrm_finish
which calls sysfs_deactivate. The sysfs_deactivate function sees that
something has an active reference on the sysfs_dirent (from the
previous call to sysfs_get_active_two back up the call stack somewhere)
so waits for the active reference to go away, which is of course
impossible.

The problem has been present since 2.6.21.

This patch breaks the deadlock by queuing a work queue item on a single-
threaded workqueue to do the removal from sysfs.  There is also some
protection against performing store operations on the "power" file of a
slot that has already been disabled.

Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
---

--- linux-2.6.24-rc8/drivers/pci/hotplug/fakephp.c.orig	2008-01-03 16:50:05.000000000 +0000
+++ linux-2.6.24-rc8/drivers/pci/hotplug/fakephp.c	2008-01-10 17:20:58.000000000 +0000
@@ -39,6 +39,7 @@
 #include <linux/init.h>
 #include <linux/string.h>
 #include <linux/slab.h>
+#include <linux/workqueue.h>
 #include "../pci.h"
 
 #if !defined(MODULE)
@@ -63,10 +64,13 @@ struct dummy_slot {
 	struct list_head node;
 	struct hotplug_slot *slot;
 	struct pci_dev *dev;
+	struct work_struct remove_work;
+	unsigned long removed;
 };
 
 static int debug;
 static LIST_HEAD(slot_list);
+static struct workqueue_struct *dummyphp_wq;
 
 static int enable_slot (struct hotplug_slot *slot);
 static int disable_slot (struct hotplug_slot *slot);
@@ -109,7 +113,7 @@ static int add_slot(struct pci_dev *dev)
 	slot->name = &dev->dev.bus_id[0];
 	dbg("slot->name = %s\n", slot->name);
 
-	dslot = kmalloc(sizeof(struct dummy_slot), GFP_KERNEL);
+	dslot = kzalloc(sizeof(struct dummy_slot), GFP_KERNEL);
 	if (!dslot)
 		goto error_info;
 
@@ -164,6 +168,13 @@ static void remove_slot(struct dummy_slo
 		err("Problem unregistering a slot %s\n", dslot->slot->name);
 }
 
+static void remove_slot_worker(struct work_struct *work)
+{
+	struct dummy_slot *dslot =
+		container_of(work, struct dummy_slot, remove_work);
+	remove_slot(dslot);
+}
+
 /**
  * pci_rescan_slot - Rescan slot
  * @temp: Device template. Should be set: bus and devfn.
@@ -271,6 +282,20 @@ static inline void pci_rescan(void) {
 static int enable_slot(struct hotplug_slot *hotplug_slot)
 {
 	/* mis-use enable_slot for rescanning of the pci bus */
+	struct dummy_slot *dslot;
+
+	if (hotplug_slot) {
+		/* make sure slot is not scheduled for removal
+		 * otherwise flush_workqueue() could deadlock */
+		dslot = hotplug_slot->private;
+		if (test_bit(0, &dslot->removed)) {
+			/* FIXME: -ENODEV is also returned when function
+			 * succeeds. */
+			return -ENODEV;
+		}
+	}
+	/* first, complete any pending removals */
+	flush_workqueue(dummyphp_wq);
 	pci_rescan();
 	return -ENODEV;
 }
@@ -306,6 +331,10 @@ static int disable_slot(struct hotplug_s
 		err("Can't remove PCI devices with other PCI devices behind it yet.\n");
 		return -ENODEV;
 	}
+	if (test_and_set_bit(0, &dslot->removed)) {
+		dbg("Slot already scheduled for removal\n");
+		return -ENODEV;
+	}
 	/* search for subfunctions and disable them first */
 	if (!(dslot->dev->devfn & 7)) {
 		for (func = 1; func < 8; func++) {
@@ -328,8 +357,9 @@ static int disable_slot(struct hotplug_s
 	/* remove the device from the pci core */
 	pci_remove_bus_device(dslot->dev);
 
-	/* blow away this sysfs entry and other parts. */
-	remove_slot(dslot);
+	/* queue work item to blow away this sysfs entry and other parts. */
+	INIT_WORK(&dslot->remove_work, remove_slot_worker);
+	queue_work(dummyphp_wq, &dslot->remove_work);
 
 	return 0;
 }
@@ -340,6 +370,7 @@ static void cleanup_slots (void)
 	struct list_head *next;
 	struct dummy_slot *dslot;
 
+	destroy_workqueue(dummyphp_wq);
 	list_for_each_safe (tmp, next, &slot_list) {
 		dslot = list_entry (tmp, struct dummy_slot, node);
 		remove_slot(dslot);
@@ -351,6 +382,10 @@ static int __init dummyphp_init(void)
 {
 	info(DRIVER_DESC "\n");
 
+	dummyphp_wq = create_singlethread_workqueue(MY_NAME);
+	if (!dummyphp_wq)
+		return -ENOMEM;
+
 	return pci_scan_buses();
 }
 

---
Note that the return values from the enable_slot function needs some
attention.  The patch causes this to return -ENODEV if the slot has
already been disabled, but that is also the normal return value of the
function.  The return values should be different.  Is there any reason
why the normal return value can't be 0?


-- 
-=( Ian Abbott @ MEV Ltd.    E-mail: <abbotti@mev.co.uk>        )=-
-=( Tel: +44 (0)161 477 1898   FAX: +44 (0)161 718 3587         )=-


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RESEND][PATCH-2.6.24-rc8] Fix fakephp deadlock
  2008-01-22 14:28 [RESEND][PATCH-2.6.24-rc8] Fix fakephp deadlock Ian Abbott
@ 2008-01-23 17:46 ` Greg KH
  2008-01-23 18:38   ` Ian Abbott
  0 siblings, 1 reply; 4+ messages in thread
From: Greg KH @ 2008-01-23 17:46 UTC (permalink / raw)
  To: Ian Abbott
  Cc: linux-kernel, linux-pci, pcihpd-discuss, gregkh, kristen.c.accardi

On Tue, Jan 22, 2008 at 02:28:08PM +0000, Ian Abbott wrote:
> #include <linux/init.h>
> #include <linux/string.h>
> #include <linux/slab.h>
> +#include <linux/workqueue.h>
> #include "../pci.h"
> #if !defined(MODULE)
> @@ -63,10 +64,13 @@ struct dummy_slot {
> 	struct list_head node;
> 	struct hotplug_slot *slot;
> 	struct pci_dev *dev;
> +	struct work_struct remove_work;
> +	unsigned long removed;

You are treating "removed" as an atomic value, so why not just make it
an atomic_t?

And what is protecting the fact that the flag could be set right after
it gets checked?  I don't see a lock here :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RESEND][PATCH-2.6.24-rc8] Fix fakephp deadlock
  2008-01-23 17:46 ` Greg KH
@ 2008-01-23 18:38   ` Ian Abbott
  2008-01-23 18:42     ` Greg KH
  0 siblings, 1 reply; 4+ messages in thread
From: Ian Abbott @ 2008-01-23 18:38 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-kernel, linux-pci, pcihpd-discuss, gregkh, kristen.c.accardi

On 23/01/08 17:46, Greg KH wrote:
> On Tue, Jan 22, 2008 at 02:28:08PM +0000, Ian Abbott wrote:
>> #include <linux/init.h>
>> #include <linux/string.h>
>> #include <linux/slab.h>
>> +#include <linux/workqueue.h>
>> #include "../pci.h"
>> #if !defined(MODULE)
>> @@ -63,10 +64,13 @@ struct dummy_slot {
>> 	struct list_head node;
>> 	struct hotplug_slot *slot;
>> 	struct pci_dev *dev;
>> +	struct work_struct remove_work;
>> +	unsigned long removed;
> 
> You are treating "removed" as an atomic value, so why not just make it
> an atomic_t?

Because I'm using it as a boolean?

> And what is protecting the fact that the flag could be set right after
> it gets checked?  I don't see a lock here :)

Okay, it looks like there might be a race condition between 
enable_slot() and disable_slot() if some other task calls disable_slot() 
while enable_slot() is between the test_bit() and flush_workqueue() 
calls.  I can fix that by avoiding the call to flush_workqueue() in 
enable_slot() and allocating and queueing a work queue item to defer the 
call to pci_rescan().  And enable_slot() won't then need to check if the 
slot was marked as removed - it can just go ahead and allocate and queue 
a work item.

-- 
-=( Ian Abbott @ MEV Ltd.    E-mail: <abbotti@mev.co.uk>        )=-
-=( Tel: +44 (0)161 477 1898   FAX: +44 (0)161 718 3587         )=-

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RESEND][PATCH-2.6.24-rc8] Fix fakephp deadlock
  2008-01-23 18:38   ` Ian Abbott
@ 2008-01-23 18:42     ` Greg KH
  0 siblings, 0 replies; 4+ messages in thread
From: Greg KH @ 2008-01-23 18:42 UTC (permalink / raw)
  To: Ian Abbott
  Cc: Greg KH, linux-kernel, linux-pci, pcihpd-discuss, kristen.c.accardi

On Wed, Jan 23, 2008 at 06:38:22PM +0000, Ian Abbott wrote:
> On 23/01/08 17:46, Greg KH wrote:
>> On Tue, Jan 22, 2008 at 02:28:08PM +0000, Ian Abbott wrote:
>>> #include <linux/init.h>
>>> #include <linux/string.h>
>>> #include <linux/slab.h>
>>> +#include <linux/workqueue.h>
>>> #include "../pci.h"
>>> #if !defined(MODULE)
>>> @@ -63,10 +64,13 @@ struct dummy_slot {
>>> 	struct list_head node;
>>> 	struct hotplug_slot *slot;
>>> 	struct pci_dev *dev;
>>> +	struct work_struct remove_work;
>>> +	unsigned long removed;
>> You are treating "removed" as an atomic value, so why not just make it
>> an atomic_t?
>
> Because I'm using it as a boolean?

Heh, an unsigned long as a boolean?  Come on... :)

>> And what is protecting the fact that the flag could be set right after
>> it gets checked?  I don't see a lock here :)
>
> Okay, it looks like there might be a race condition between enable_slot() 
> and disable_slot() if some other task calls disable_slot() while 
> enable_slot() is between the test_bit() and flush_workqueue() calls.  I can 
> fix that by avoiding the call to flush_workqueue() in enable_slot() and 
> allocating and queueing a work queue item to defer the call to 
> pci_rescan().  And enable_slot() won't then need to check if the slot was 
> marked as removed - it can just go ahead and allocate and queue a work 
> item.

That sounds reasonable.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2008-01-23 18:49 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-01-22 14:28 [RESEND][PATCH-2.6.24-rc8] Fix fakephp deadlock Ian Abbott
2008-01-23 17:46 ` Greg KH
2008-01-23 18:38   ` Ian Abbott
2008-01-23 18:42     ` Greg KH

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).