* [PATCH 0/5] firewire: fix crashes in workqueue jobs
@ 2008-02-24 17:56 Stefan Richter
  2008-02-24 17:57 ` [PATCH 1/5] firewire: invalid pointers used in fw_card_bm_work Stefan Richter
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Stefan Richter @ 2008-02-24 17:56 UTC
  To: linux1394-devel; +Cc: Kristian Hoegsberg, Jarod Wilson, linux-kernel

Here come 3 fixes + 2 cleanups.

 1/5 firewire: invalid pointers used in fw_card_bm_work
 2/5 firewire: fix crash in automatic module unloading
 3/5 firewire: remove superfluous reference counting
 4/5 firewire: fw-sbp2: fix reference counting
 5/5 firewire: refactor fw_unit reference counting

The main theme is that the firewire-core module has to stay loaded until
after all workqueue jobs of the core and of protocol drivers have been
finished.  This is accomplished by tracking the sum of references to
instances of fw_device for each card.  (As a side effect, we don't need
to count references to instances of fw_card anymore.)
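
To illustrate the counting scheme, here is a rough userspace sketch in C.
It is not the patch itself: the names card, device, device_create,
device_release and card_can_be_removed are hypothetical stand-ins, and
C11 atomics take the place of the kernel's atomic_t.

```c
/* Userspace sketch only; hypothetical names, not the firewire-core API. */
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct card {
	/* Devices initialized minus devices released on this card. */
	atomic_int device_count;
};

struct device {
	struct card *card;
};

/* Creating a device accounts for it on its card. */
static struct device *device_create(struct card *card)
{
	struct device *dev = malloc(sizeof(*dev));

	if (dev == NULL)
		return NULL;
	dev->card = card;
	atomic_fetch_add(&card->device_count, 1);
	return dev;
}

/*
 * Releasing a device drops the count as the very last step.  The card
 * pointer is saved first because dev is freed before the decrement.
 */
static void device_release(struct device *dev)
{
	struct card *card = dev->card;

	free(dev);
	atomic_fetch_sub(&card->device_count, 1);
}

/* Card removal may proceed only once no device is outstanding. */
static int card_can_be_removed(struct card *card)
{
	int n = atomic_load(&card->device_count);

	assert(n >= 0);
	return n == 0;
}
```

Since every job that touches a device holds a device reference, a count
of zero implies that no such job can still be pending for the card.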

 drivers/firewire/fw-card.c        |   99 ++++++++++++++++----------------------
 drivers/firewire/fw-device.c      |   21 ++------
 drivers/firewire/fw-device.h      |   35 ++++++++++---
 drivers/firewire/fw-ohci.c        |    8 +--
 drivers/firewire/fw-sbp2.c        |    9 ++-
 drivers/firewire/fw-topology.c    |    1
 drivers/firewire/fw-transaction.h |    6 --
 7 files changed, 89 insertions(+), 90 deletions(-)
-- 
Stefan Richter
-=====-==--- --=- ==---
http://arcgraph.de/sr/



* [PATCH 1/5] firewire: invalid pointers used in fw_card_bm_work
  2008-02-24 17:56 [PATCH 0/5] firewire: fix crashes in workqueue jobs Stefan Richter
@ 2008-02-24 17:57 ` Stefan Richter
  2008-02-24 17:59 ` [PATCH 2/5] firewire: fix crash in automatic module unloading Stefan Richter
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Stefan Richter @ 2008-02-24 17:57 UTC
  To: linux1394-devel; +Cc: Kristian Hoegsberg, Jarod Wilson, linux-kernel

The bus management workqueue job was in danger of dereferencing NULL
pointers.  Also, after card->lock had been temporarily released, a few
node pointers and a device pointer may have become invalid.

Add NULL pointer checks and get the necessary references.  Also, move
card->local_node out of fw_card_bm_work's sight during shutdown of the
card.
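
The pattern, reduced to a userspace sketch in C: check the pointer and
take the reference while the lock is still held, so the object stays
valid after the lock is dropped.  All names here (card, node,
card_get_local_node, card_hide_local_node, node_put) are hypothetical,
and a pthread mutex stands in for the card's spinlock.

```c
/* Userspace sketch only; hypothetical names, not the firewire-core API. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct node {
	int refs;		/* protected by card->lock in this sketch */
	int node_id;
};

struct card {
	pthread_mutex_t lock;
	struct node *local_node;	/* NULL once the card shuts down */
};

/* Return the local node with an extra reference, or NULL during shutdown. */
static struct node *card_get_local_node(struct card *card)
{
	struct node *n;

	pthread_mutex_lock(&card->lock);
	n = card->local_node;
	if (n != NULL)
		n->refs++;
	pthread_mutex_unlock(&card->lock);
	return n;
}

/* Shutdown path: move the node out of sight of later jobs. */
static void card_hide_local_node(struct card *card)
{
	pthread_mutex_lock(&card->lock);
	card->local_node = NULL;
	pthread_mutex_unlock(&card->lock);
}

/* Drop a reference; returns the number of references that remain. */
static int node_put(struct card *card, struct node *n)
{
	int remaining;

	pthread_mutex_lock(&card->lock);
	assert(n->refs > 0);
	remaining = --n->refs;
	pthread_mutex_unlock(&card->lock);
	return remaining;
}
```

A job that obtained the node before shutdown keeps a valid pointer until
its own node_put(); a job that starts after shutdown sees NULL and
returns early.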

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
 drivers/firewire/fw-card.c     |   51 ++++++++++++++++++++++-----------
 drivers/firewire/fw-topology.c |    1 
 2 files changed, 35 insertions(+), 17 deletions(-)

Index: linux/drivers/firewire/fw-card.c
===================================================================
--- linux.orig/drivers/firewire/fw-card.c
+++ linux/drivers/firewire/fw-card.c
@@ -214,17 +214,29 @@ static void
 fw_card_bm_work(struct work_struct *work)
 {
 	struct fw_card *card = container_of(work, struct fw_card, work.work);
-	struct fw_device *root;
+	struct fw_device *root_device;
+	struct fw_node *root_node, *local_node;
 	struct bm_data bmd;
 	unsigned long flags;
 	int root_id, new_root_id, irm_id, gap_count, generation, grace;
 	int do_reset = 0;
 
 	spin_lock_irqsave(&card->lock, flags);
+	local_node = card->local_node;
+	root_node  = card->root_node;
+
+	if (local_node == NULL) {
+		spin_unlock_irqrestore(&card->lock, flags);
+		return;
+	}
+	fw_node_get(local_node);
+	fw_node_get(root_node);
 
 	generation = card->generation;
-	root = card->root_node->data;
-	root_id = card->root_node->node_id;
+	root_device = root_node->data;
+	if (root_device)
+		fw_device_get(root_device);
+	root_id = root_node->node_id;
 	grace = time_after(jiffies, card->reset_jiffies + DIV_ROUND_UP(HZ, 10));
 
 	if (card->bm_generation + 1 == generation ||
@@ -243,14 +255,14 @@ fw_card_bm_work(struct work_struct *work
 
 		irm_id = card->irm_node->node_id;
 		if (!card->irm_node->link_on) {
-			new_root_id = card->local_node->node_id;
+			new_root_id = local_node->node_id;
 			fw_notify("IRM has link off, making local node (%02x) root.\n",
 				  new_root_id);
 			goto pick_me;
 		}
 
 		bmd.lock.arg = cpu_to_be32(0x3f);
-		bmd.lock.data = cpu_to_be32(card->local_node->node_id);
+		bmd.lock.data = cpu_to_be32(local_node->node_id);
 
 		spin_unlock_irqrestore(&card->lock, flags);
 
@@ -267,12 +279,12 @@ fw_card_bm_work(struct work_struct *work
 			 * Another bus reset happened. Just return,
 			 * the BM work has been rescheduled.
 			 */
-			return;
+			goto out;
 		}
 
 		if (bmd.rcode == RCODE_COMPLETE && bmd.old != 0x3f)
 			/* Somebody else is BM, let them do the work. */
-			return;
+			goto out;
 
 		spin_lock_irqsave(&card->lock, flags);
 		if (bmd.rcode != RCODE_COMPLETE) {
@@ -282,7 +294,7 @@ fw_card_bm_work(struct work_struct *work
 			 * do a bus reset and pick the local node as
 			 * root, and thus, IRM.
 			 */
-			new_root_id = card->local_node->node_id;
+			new_root_id = local_node->node_id;
 			fw_notify("BM lock failed, making local node (%02x) root.\n",
 				  new_root_id);
 			goto pick_me;
@@ -295,7 +307,7 @@ fw_card_bm_work(struct work_struct *work
 		 */
 		spin_unlock_irqrestore(&card->lock, flags);
 		schedule_delayed_work(&card->work, DIV_ROUND_UP(HZ, 10));
-		return;
+		goto out;
 	}
 
 	/*
@@ -305,20 +317,20 @@ fw_card_bm_work(struct work_struct *work
 	 */
 	card->bm_generation = generation;
 
-	if (root == NULL) {
+	if (root_device == NULL) {
 		/*
 		 * Either link_on is false, or we failed to read the
 		 * config rom.  In either case, pick another root.
 		 */
-		new_root_id = card->local_node->node_id;
-	} else if (atomic_read(&root->state) != FW_DEVICE_RUNNING) {
+		new_root_id = local_node->node_id;
+	} else if (atomic_read(&root_device->state) != FW_DEVICE_RUNNING) {
 		/*
 		 * If we haven't probed this device yet, bail out now
 		 * and let's try again once that's done.
 		 */
 		spin_unlock_irqrestore(&card->lock, flags);
-		return;
-	} else if (root->config_rom[2] & BIB_CMC) {
+		goto out;
+	} else if (root_device->config_rom[2] & BIB_CMC) {
 		/*
 		 * FIXME: I suppose we should set the cmstr bit in the
 		 * STATE_CLEAR register of this node, as described in
@@ -332,7 +344,7 @@ fw_card_bm_work(struct work_struct *work
 		 * successfully read the config rom, but it's not
 		 * cycle master capable.
 		 */
-		new_root_id = card->local_node->node_id;
+		new_root_id = local_node->node_id;
 	}
 
  pick_me:
@@ -341,8 +353,8 @@ fw_card_bm_work(struct work_struct *work
 	 * the typically much larger 1394b beta repeater delays though.
 	 */
 	if (!card->beta_repeaters_present &&
-	    card->root_node->max_hops < ARRAY_SIZE(gap_count_table))
-		gap_count = gap_count_table[card->root_node->max_hops];
+	    root_node->max_hops < ARRAY_SIZE(gap_count_table))
+		gap_count = gap_count_table[root_node->max_hops];
 	else
 		gap_count = 63;
 
@@ -364,6 +376,11 @@ fw_card_bm_work(struct work_struct *work
 		fw_send_phy_config(card, new_root_id, generation, gap_count);
 		fw_core_initiate_bus_reset(card, 1);
 	}
+ out:
+	if (root_device)
+		fw_device_put(root_device);
+	fw_node_put(root_node);
+	fw_node_put(local_node);
 }
 
 static void
Index: linux/drivers/firewire/fw-topology.c
===================================================================
--- linux.orig/drivers/firewire/fw-topology.c
+++ linux/drivers/firewire/fw-topology.c
@@ -383,6 +383,7 @@ void fw_destroy_nodes(struct fw_card *ca
 	card->color++;
 	if (card->local_node != NULL)
 		for_each_fw_node(card, card->local_node, report_lost_node);
+	card->local_node = NULL;
 	spin_unlock_irqrestore(&card->lock, flags);
 }
 

-- 
Stefan Richter
-=====-==--- --=- ==---
http://arcgraph.de/sr/



* [PATCH 2/5] firewire: fix crash in automatic module unloading
  2008-02-24 17:56 [PATCH 0/5] firewire: fix crashes in workqueue jobs Stefan Richter
  2008-02-24 17:57 ` [PATCH 1/5] firewire: invalid pointers used in fw_card_bm_work Stefan Richter
@ 2008-02-24 17:59 ` Stefan Richter
  2008-03-03 16:45   ` Kristian Høgsberg
  2008-02-24 17:59 ` [PATCH 3/5] firewire: remove superfluous reference counting Stefan Richter
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 10+ messages in thread
From: Stefan Richter @ 2008-02-24 17:59 UTC
  To: linux1394-devel; +Cc: Kristian Hoegsberg, Jarod Wilson, linux-kernel

"modprobe firewire-ohci; sleep .1; modprobe -r firewire-ohci" used to
result in crashes like this:

    BUG: unable to handle kernel paging request at ffffffff8807b455
    IP: [<ffffffff8807b455>]
    PGD 203067 PUD 207063 PMD 7c170067 PTE 0
    Oops: 0010 [1] PREEMPT SMP 
    CPU 0 
    Modules linked in: i915 drm cpufreq_ondemand acpi_cpufreq freq_table applesmc input_polldev led_class coretemp hwmon eeprom snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss button thermal processor sg snd_hda_intel snd_pcm snd_timer snd snd_page_alloc sky2 i2c_i801 rtc [last unloaded: crc_itu_t]
    Pid: 9, comm: events/0 Not tainted 2.6.25-rc2 #3
    RIP: 0010:[<ffffffff8807b455>]  [<ffffffff8807b455>]
    RSP: 0018:ffff81007dcdde88  EFLAGS: 00010246
    RAX: ffff81007dc95040 RBX: ffff81007dee5390 RCX: 0000000000005e13
    RDX: 0000000000008c8b RSI: 0000000000000001 RDI: ffff81007dee5388
    RBP: ffff81007dc5eb40 R08: 0000000000000002 R09: ffffffff8022d05c
    R10: ffffffff8023b34c R11: ffffffff8041a353 R12: ffff81007dee5388
    R13: ffffffff8807b455 R14: ffffffff80593bc0 R15: 0000000000000000
    FS:  0000000000000000(0000) GS:ffffffff8055a000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: ffffffff8807b455 CR3: 0000000000201000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process events/0 (pid: 9, threadinfo ffff81007dcdc000, task ffff81007dc95040)
    Stack:  ffffffff8023b396 ffffffff88082524 0000000000000000 ffffffff8807d9ae
    ffff81007dc5eb40 ffff81007dc9dce0 ffff81007dc5eb40 ffff81007dc5eb80
    ffff81007dc9dce0 ffffffffffffffff ffffffff8023be87 0000000000000000
    Call Trace:
    [<ffffffff8023b396>] ? run_workqueue+0xdf/0x1df
    [<ffffffff8023be87>] ? worker_thread+0xd8/0xe3
    [<ffffffff8023e917>] ? autoremove_wake_function+0x0/0x2e
    [<ffffffff8023bdaf>] ? worker_thread+0x0/0xe3
    [<ffffffff8023e813>] ? kthread+0x47/0x74
    [<ffffffff804198e0>] ? trace_hardirqs_on_thunk+0x35/0x3a
    [<ffffffff8020c008>] ? child_rip+0xa/0x12
    [<ffffffff8020b6e3>] ? restore_args+0x0/0x3d
    [<ffffffff8023e68a>] ? kthreadd+0x14c/0x171
    [<ffffffff8023e68a>] ? kthreadd+0x14c/0x171
    [<ffffffff8023e7cc>] ? kthread+0x0/0x74
    [<ffffffff8020bffe>] ? child_rip+0x0/0x12


    Code:  Bad RIP value.
    RIP  [<ffffffff8807b455>]
    RSP <ffff81007dcdde88>
    CR2: ffffffff8807b455
    ---[ end trace c7366c6657fe5bed ]---

Note that this crash happened _after_ firewire-core was unloaded.  The
shared workqueue tried to run firewire-core's device initialization jobs
or similar jobs.

The fix makes sure that firewire-ohci and hence firewire-core is not
unloaded before all device shutdown jobs have been completed.  This is
determined by the count of device initializations minus device releases.

Also skip useless retries in the node initialization job if the node is
to be shut down.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
 drivers/firewire/fw-card.c        |   10 +++++++++-
 drivers/firewire/fw-device.c      |   21 ++++++---------------
 drivers/firewire/fw-device.h      |   16 ++++++++++++++--
 drivers/firewire/fw-sbp2.c        |    4 ++++
 drivers/firewire/fw-transaction.h |    2 ++
 5 files changed, 35 insertions(+), 18 deletions(-)

Index: linux/drivers/firewire/fw-card.c
===================================================================
--- linux.orig/drivers/firewire/fw-card.c
+++ linux/drivers/firewire/fw-card.c
@@ -18,6 +18,7 @@
 
 #include <linux/module.h>
 #include <linux/errno.h>
+#include <linux/delay.h>
 #include <linux/device.h>
 #include <linux/mutex.h>
 #include <linux/crc-itu-t.h>
@@ -398,6 +399,7 @@ fw_card_initialize(struct fw_card *card,
 	static atomic_t index = ATOMIC_INIT(-1);
 
 	kref_init(&card->kref);
+	atomic_set(&card->device_count, 0);
 	card->index = atomic_inc_return(&index);
 	card->driver = driver;
 	card->device = device;
@@ -528,8 +530,14 @@ fw_core_remove_card(struct fw_card *card
 	card->driver = &dummy_driver;
 
 	fw_destroy_nodes(card);
-	flush_scheduled_work();
+	/*
+	 * Wait for all device workqueue jobs to finish.  Otherwise the
+	 * firewire-core module could be unloaded before the jobs ran.
+	 */
+	while (atomic_read(&card->device_count) > 0)
+		msleep(100);
 
+	cancel_delayed_work_sync(&card->work);
 	fw_flush_transactions(card);
 	del_timer_sync(&card->flush_timer);
 
Index: linux/drivers/firewire/fw-device.c
===================================================================
--- linux.orig/drivers/firewire/fw-device.c
+++ linux/drivers/firewire/fw-device.c
@@ -150,21 +150,10 @@ struct bus_type fw_bus_type = {
 };
 EXPORT_SYMBOL(fw_bus_type);
 
-struct fw_device *fw_device_get(struct fw_device *device)
-{
-	get_device(&device->device);
-
-	return device;
-}
-
-void fw_device_put(struct fw_device *device)
-{
-	put_device(&device->device);
-}
-
 static void fw_device_release(struct device *dev)
 {
 	struct fw_device *device = fw_device(dev);
+	struct fw_card *card = device->card;
 	unsigned long flags;
 
 	/*
@@ -176,9 +165,9 @@ static void fw_device_release(struct dev
 	spin_unlock_irqrestore(&device->card->lock, flags);
 
 	fw_node_put(device->node);
-	fw_card_put(device->card);
 	kfree(device->config_rom);
 	kfree(device);
+	atomic_dec(&card->device_count);
 }
 
 int fw_device_enable_phys_dma(struct fw_device *device)
@@ -668,7 +657,8 @@ static void fw_device_init(struct work_s
 	 */
 
 	if (read_bus_info_block(device, device->generation) < 0) {
-		if (device->config_rom_retries < MAX_RETRIES) {
+		if (device->config_rom_retries < MAX_RETRIES &&
+		    atomic_read(&device->state) == FW_DEVICE_INITIALIZING) {
 			device->config_rom_retries++;
 			schedule_delayed_work(&device->work, RETRY_DELAY);
 		} else {
@@ -805,7 +795,8 @@ void fw_node_event(struct fw_card *card,
 		 */
 		device_initialize(&device->device);
 		atomic_set(&device->state, FW_DEVICE_INITIALIZING);
-		device->card = fw_card_get(card);
+		atomic_inc(&card->device_count);
+		device->card = card;
 		device->node = fw_node_get(node);
 		device->node_id = node->node_id;
 		device->generation = card->generation;
Index: linux/drivers/firewire/fw-sbp2.c
===================================================================
--- linux.orig/drivers/firewire/fw-sbp2.c
+++ linux/drivers/firewire/fw-sbp2.c
@@ -757,6 +757,7 @@ static void sbp2_release_target(struct k
 	struct sbp2_logical_unit *lu, *next;
 	struct Scsi_Host *shost =
 		container_of((void *)tgt, struct Scsi_Host, hostdata[0]);
+	struct fw_device *device = fw_device(tgt->unit->device.parent);
 
 	/* prevent deadlocks */
 	sbp2_unblock(tgt);
@@ -778,6 +779,7 @@ static void sbp2_release_target(struct k
 
 	put_device(&tgt->unit->device);
 	scsi_host_put(shost);
+	fw_device_put(device);
 }
 
 static struct workqueue_struct *sbp2_wq;
@@ -1080,6 +1082,8 @@ static int sbp2_probe(struct device *dev
 	if (scsi_add_host(shost, &unit->device) < 0)
 		goto fail_shost_put;
 
+	fw_device_get(device);
+
 	/* Initialize to values that won't match anything in our table. */
 	firmware_revision = 0xff000000;
 	model = 0xff000000;
Index: linux/drivers/firewire/fw-transaction.h
===================================================================
--- linux.orig/drivers/firewire/fw-transaction.h
+++ linux/drivers/firewire/fw-transaction.h
@@ -26,6 +26,7 @@
 #include <linux/fs.h>
 #include <linux/dma-mapping.h>
 #include <linux/firewire-constants.h>
+#include <asm/atomic.h>
 
 #define TCODE_IS_READ_REQUEST(tcode)	(((tcode) & ~1) == 4)
 #define TCODE_IS_BLOCK_PACKET(tcode)	(((tcode) &  1) != 0)
@@ -219,6 +220,7 @@ extern struct bus_type fw_bus_type;
 struct fw_card {
 	const struct fw_card_driver *driver;
 	struct device *device;
+	atomic_t device_count;
 	struct kref kref;
 
 	int node_id;
Index: linux/drivers/firewire/fw-device.h
===================================================================
--- linux.orig/drivers/firewire/fw-device.h
+++ linux/drivers/firewire/fw-device.h
@@ -76,9 +76,21 @@ fw_device_is_shutdown(struct fw_device *
 	return atomic_read(&device->state) == FW_DEVICE_SHUTDOWN;
 }
 
-struct fw_device *fw_device_get(struct fw_device *device);
+static inline struct fw_device *
+fw_device_get(struct fw_device *device)
+{
+	get_device(&device->device);
+
+	return device;
+}
+
+static inline void
+fw_device_put(struct fw_device *device)
+{
+	put_device(&device->device);
+}
+
 struct fw_device *fw_device_get_by_devt(dev_t devt);
-void fw_device_put(struct fw_device *device);
 int fw_device_enable_phys_dma(struct fw_device *device);
 
 void fw_device_cdev_update(struct fw_device *device);

-- 
Stefan Richter
-=====-==--- --=- ==---
http://arcgraph.de/sr/



* [PATCH 3/5] firewire: remove superfluous reference counting
  2008-02-24 17:56 [PATCH 0/5] firewire: fix crashes in workqueue jobs Stefan Richter
  2008-02-24 17:57 ` [PATCH 1/5] firewire: invalid pointers used in fw_card_bm_work Stefan Richter
  2008-02-24 17:59 ` [PATCH 2/5] firewire: fix crash in automatic module unloading Stefan Richter
@ 2008-02-24 17:59 ` Stefan Richter
  2008-02-24 18:00 ` [PATCH 4/5] firewire: fw-sbp2: fix " Stefan Richter
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Stefan Richter @ 2008-02-24 17:59 UTC
  To: linux1394-devel; +Cc: Kristian Hoegsberg, Jarod Wilson, linux-kernel

The card->kref became obsolete when patch "firewire: fix crash in
automatic module unloading" added another counter of card users.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
 drivers/firewire/fw-card.c        |   38 ------------------------------
 drivers/firewire/fw-ohci.c        |    8 +++---
 drivers/firewire/fw-transaction.h |    4 ---
 3 files changed, 4 insertions(+), 46 deletions(-)

Index: linux/drivers/firewire/fw-card.c
===================================================================
--- linux.orig/drivers/firewire/fw-card.c
+++ linux/drivers/firewire/fw-card.c
@@ -398,7 +398,6 @@ fw_card_initialize(struct fw_card *card,
 {
 	static atomic_t index = ATOMIC_INIT(-1);
 
-	kref_init(&card->kref);
 	atomic_set(&card->device_count, 0);
 	card->index = atomic_inc_return(&index);
 	card->driver = driver;
@@ -429,12 +428,6 @@ fw_card_add(struct fw_card *card,
 	card->link_speed = link_speed;
 	card->guid = guid;
 
-	/*
-	 * The subsystem grabs a reference when the card is added and
-	 * drops it when the driver calls fw_core_remove_card.
-	 */
-	fw_card_get(card);
-
 	mutex_lock(&card_mutex);
 	config_rom = generate_config_rom(card, &length);
 	list_add_tail(&card->link, &card_list);
@@ -540,40 +533,9 @@ fw_core_remove_card(struct fw_card *card
 	cancel_delayed_work_sync(&card->work);
 	fw_flush_transactions(card);
 	del_timer_sync(&card->flush_timer);
-
-	fw_card_put(card);
 }
 EXPORT_SYMBOL(fw_core_remove_card);
 
-struct fw_card *
-fw_card_get(struct fw_card *card)
-{
-	kref_get(&card->kref);
-
-	return card;
-}
-EXPORT_SYMBOL(fw_card_get);
-
-static void
-release_card(struct kref *kref)
-{
-	struct fw_card *card = container_of(kref, struct fw_card, kref);
-
-	kfree(card);
-}
-
-/*
- * An assumption for fw_card_put() is that the card driver allocates
- * the fw_card struct with kalloc and that it has been shut down
- * before the last ref is dropped.
- */
-void
-fw_card_put(struct fw_card *card)
-{
-	kref_put(&card->kref, release_card);
-}
-EXPORT_SYMBOL(fw_card_put);
-
 int
 fw_core_initiate_bus_reset(struct fw_card *card, int short_reset)
 {
Index: linux/drivers/firewire/fw-ohci.c
===================================================================
--- linux.orig/drivers/firewire/fw-ohci.c
+++ linux/drivers/firewire/fw-ohci.c
@@ -2059,7 +2059,7 @@ pci_probe(struct pci_dev *dev, const str
 	err = pci_enable_device(dev);
 	if (err) {
 		fw_error("Failed to enable OHCI hardware.\n");
-		goto fail_put_card;
+		goto fail_free;
 	}
 
 	pci_set_master(dev);
@@ -2151,8 +2151,8 @@ pci_probe(struct pci_dev *dev, const str
 	pci_release_region(dev, 0);
  fail_disable:
 	pci_disable_device(dev);
- fail_put_card:
-	fw_card_put(&ohci->card);
+ fail_free:
+	kfree(&ohci->card);
 
 	return err;
 }
@@ -2180,7 +2180,7 @@ static void pci_remove(struct pci_dev *d
 	pci_iounmap(dev, ohci->registers);
 	pci_release_region(dev, 0);
 	pci_disable_device(dev);
-	fw_card_put(&ohci->card);
+	kfree(&ohci->card);
 
 	fw_notify("Removed fw-ohci device.\n");
 }
Index: linux/drivers/firewire/fw-transaction.h
===================================================================
--- linux.orig/drivers/firewire/fw-transaction.h
+++ linux/drivers/firewire/fw-transaction.h
@@ -221,7 +221,6 @@ struct fw_card {
 	const struct fw_card_driver *driver;
 	struct device *device;
 	atomic_t device_count;
-	struct kref kref;
 
 	int node_id;
 	int generation;
@@ -263,9 +262,6 @@ struct fw_card {
 	int bm_generation;
 };
 
-struct fw_card *fw_card_get(struct fw_card *card);
-void fw_card_put(struct fw_card *card);
-
 /*
  * The iso packet format allows for an immediate header/payload part
  * stored in 'header' immediately after the packet info plus an

-- 
Stefan Richter
-=====-==--- --=- ==---
http://arcgraph.de/sr/



* [PATCH 4/5] firewire: fw-sbp2: fix reference counting
  2008-02-24 17:56 [PATCH 0/5] firewire: fix crashes in workqueue jobs Stefan Richter
                   ` (2 preceding siblings ...)
  2008-02-24 17:59 ` [PATCH 3/5] firewire: remove superfluous reference counting Stefan Richter
@ 2008-02-24 18:00 ` Stefan Richter
  2008-02-24 18:01 ` [PATCH 5/5] firewire: refactor fw_unit " Stefan Richter
  2008-03-01  5:17 ` [PATCH 0/5] firewire: fix crashes in workqueue jobs Jarod Wilson
  5 siblings, 0 replies; 10+ messages in thread
From: Stefan Richter @ 2008-02-24 18:00 UTC
  To: linux1394-devel; +Cc: Kristian Hoegsberg, Jarod Wilson, linux-kernel

The reference count of the unit dropped too low in an error path in
sbp2_probe.  Fixed by moving the _get further up.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
 drivers/firewire/fw-sbp2.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Index: linux/drivers/firewire/fw-sbp2.c
===================================================================
--- linux.orig/drivers/firewire/fw-sbp2.c
+++ linux/drivers/firewire/fw-sbp2.c
@@ -1083,6 +1083,7 @@ static int sbp2_probe(struct device *dev
 		goto fail_shost_put;
 
 	fw_device_get(device);
+	get_device(&unit->device);
 
 	/* Initialize to values that won't match anything in our table. */
 	firmware_revision = 0xff000000;
@@ -1098,8 +1099,6 @@ static int sbp2_probe(struct device *dev
 
 	sbp2_init_workarounds(tgt, model, firmware_revision);
 
-	get_device(&unit->device);
-
 	/* Do the login in a workqueue so we can easily reschedule retries. */
 	list_for_each_entry(lu, &tgt->lu_list, link)
 		sbp2_queue_work(lu, 0);

-- 
Stefan Richter
-=====-==--- --=- ==---
http://arcgraph.de/sr/



* [PATCH 5/5] firewire: refactor fw_unit reference counting
  2008-02-24 17:56 [PATCH 0/5] firewire: fix crashes in workqueue jobs Stefan Richter
                   ` (3 preceding siblings ...)
  2008-02-24 18:00 ` [PATCH 4/5] firewire: fw-sbp2: fix " Stefan Richter
@ 2008-02-24 18:01 ` Stefan Richter
  2008-03-01  5:17 ` [PATCH 0/5] firewire: fix crashes in workqueue jobs Jarod Wilson
  5 siblings, 0 replies; 10+ messages in thread
From: Stefan Richter @ 2008-02-24 18:01 UTC
  To: linux1394-devel; +Cc: Kristian Hoegsberg, Jarod Wilson, linux-kernel

Add wrappers for getting and putting a unit.
Remove some line breaks.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
 drivers/firewire/fw-device.h |   27 +++++++++++++++++----------
 drivers/firewire/fw-sbp2.c   |    4 ++--
 2 files changed, 19 insertions(+), 12 deletions(-)

Index: linux/drivers/firewire/fw-device.h
===================================================================
--- linux.orig/drivers/firewire/fw-device.h
+++ linux/drivers/firewire/fw-device.h
@@ -64,28 +64,24 @@ struct fw_device {
 	struct fw_attribute_group attribute_group;
 };
 
-static inline struct fw_device *
-fw_device(struct device *dev)
+static inline struct fw_device *fw_device(struct device *dev)
 {
 	return container_of(dev, struct fw_device, device);
 }
 
-static inline int
-fw_device_is_shutdown(struct fw_device *device)
+static inline int fw_device_is_shutdown(struct fw_device *device)
 {
 	return atomic_read(&device->state) == FW_DEVICE_SHUTDOWN;
 }
 
-static inline struct fw_device *
-fw_device_get(struct fw_device *device)
+static inline struct fw_device *fw_device_get(struct fw_device *device)
 {
 	get_device(&device->device);
 
 	return device;
 }
 
-static inline void
-fw_device_put(struct fw_device *device)
+static inline void fw_device_put(struct fw_device *device)
 {
 	put_device(&device->device);
 }
@@ -104,12 +100,23 @@ struct fw_unit {
 	struct fw_attribute_group attribute_group;
 };
 
-static inline struct fw_unit *
-fw_unit(struct device *dev)
+static inline struct fw_unit *fw_unit(struct device *dev)
 {
 	return container_of(dev, struct fw_unit, device);
 }
 
+static inline struct fw_unit *fw_unit_get(struct fw_unit *unit)
+{
+	get_device(&unit->device);
+
+	return unit;
+}
+
+static inline void fw_unit_put(struct fw_unit *unit)
+{
+	put_device(&unit->device);
+}
+
 #define CSR_OFFSET	0x40
 #define CSR_LEAF	0x80
 #define CSR_DIRECTORY	0xc0
Index: linux/drivers/firewire/fw-sbp2.c
===================================================================
--- linux.orig/drivers/firewire/fw-sbp2.c
+++ linux/drivers/firewire/fw-sbp2.c
@@ -777,7 +777,7 @@ static void sbp2_release_target(struct k
 	scsi_remove_host(shost);
 	fw_notify("released %s\n", tgt->bus_id);
 
-	put_device(&tgt->unit->device);
+	fw_unit_put(tgt->unit);
 	scsi_host_put(shost);
 	fw_device_put(device);
 }
@@ -1083,7 +1083,7 @@ static int sbp2_probe(struct device *dev
 		goto fail_shost_put;
 
 	fw_device_get(device);
-	get_device(&unit->device);
+	fw_unit_get(unit);
 
 	/* Initialize to values that won't match anything in our table. */
 	firmware_revision = 0xff000000;

-- 
Stefan Richter
-=====-==--- --=- ==---
http://arcgraph.de/sr/



* Re: [PATCH 0/5] firewire: fix crashes in workqueue jobs
  2008-02-24 17:56 [PATCH 0/5] firewire: fix crashes in workqueue jobs Stefan Richter
                   ` (4 preceding siblings ...)
  2008-02-24 18:01 ` [PATCH 5/5] firewire: refactor fw_unit " Stefan Richter
@ 2008-03-01  5:17 ` Jarod Wilson
  5 siblings, 0 replies; 10+ messages in thread
From: Jarod Wilson @ 2008-03-01  5:17 UTC
  To: Stefan Richter; +Cc: linux1394-devel, Kristian Hoegsberg, linux-kernel

On Sunday 24 February 2008 12:56:10 pm Stefan Richter wrote:
> Here come 3 fixes + 2 cleanups.
>
>  1/5 firewire: invalid pointers used in fw_card_bm_work
>  2/5 firewire: fix crash in automatic module unloading
>  3/5 firewire: remove superfluous reference counting
>  4/5 firewire: fw-sbp2: fix reference counting
>  5/5 firewire: refactor fw_unit reference counting
>
> The main theme is that the firewire-core module has to stay loaded until
> after all workqueue jobs of the core and of protocol drivers have been
> finished.  This is accomplished by tracking the sum of references to
> instances of fw_device for each card.  (As a side effect, we don't need
> to count references to instances of fw_card anymore.)
>
>  drivers/firewire/fw-card.c        |   99 ++++++++++++++++----------------------
>  drivers/firewire/fw-device.c      |   21 ++------
>  drivers/firewire/fw-device.h      |   35 ++++++++++---
>  drivers/firewire/fw-ohci.c        |    8 +--
>  drivers/firewire/fw-sbp2.c        |    9 ++-
>  drivers/firewire/fw-topology.c    |    1
>  drivers/firewire/fw-transaction.h |    6 --
>  7 files changed, 89 insertions(+), 90 deletions(-)

Thumbs up for the whole series, with update 2 of the automatic module
unloading patch swapped in for the original.

Signed-off-by: Jarod Wilson <jwilson@redhat.com>


-- 
Jarod Wilson
jwilson@redhat.com


* Re: [PATCH 2/5] firewire: fix crash in automatic module unloading
  2008-02-24 17:59 ` [PATCH 2/5] firewire: fix crash in automatic module unloading Stefan Richter
@ 2008-03-03 16:45   ` Kristian Høgsberg
  2008-03-03 17:16     ` Stefan Richter
  0 siblings, 1 reply; 10+ messages in thread
From: Kristian Høgsberg @ 2008-03-03 16:45 UTC
  To: Stefan Richter; +Cc: linux1394-devel, Jarod Wilson, linux-kernel

On Sun, Feb 24, 2008 at 12:59 PM, Stefan Richter
<stefanr@s5r6.in-berlin.de> wrote:
...
>  Note that this crash happened _after_ firewire-core was unloaded.  The
>  shared workqueue tried to run firewire-core's device initialization jobs
>  or similar jobs.

Yeah, those pesky workqueue jobs :)

>  The fix makes sure that firewire-ohci and hence firewire-core is not
>  unloaded before all device shutdown jobs have been completed.  This is
>  determined by the count of device initializations minus device releases.

That's probably fine; I thought about this approach when I did the
dummy stuff, but I wanted something that would let me unload the
module immediately.  I guess in practice it doesn't make a big
difference, and this certainly is simpler.

I would want to use a kref and a completion for tracking this though
instead of the atomic.  Just use kref_get() instead of incrementing
the atomic and use kref_put() instead of decrementing it.  The release
function for kref_put() should complete the completion struct and
instead of the busy loop in fw_core_remove_card() we just wait for the
completion.  And I'm not sure I agree that it's a device_count, it
really just is a ref-count.  The core should also hold a reference to
the card and release it in fw_core_remove_card(), just before waiting
on the completion.  I do like how you moved the kfree() of the fw_ohci
struct back into fw-ohci.c where it balances the kzalloc() in
pci_probe().
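
A rough userspace sketch of that idea in C, under stated assumptions: a
C11 atomic counter stands in for the kref, a pthread mutex plus condition
variable stands in for the completion, and the names card_ref,
card_ref_get, card_ref_put and card_remove are made up for illustration.
The real thing would use kref_get(), kref_put() and a struct completion.

```c
/* Userspace sketch only; hypothetical names, not kernel code. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

struct card_ref {
	atomic_int refs;		/* stand-in for the kref */
	pthread_mutex_t lock;		/* lock + cond + done: the "completion" */
	pthread_cond_t done_cond;
	int done;
};

static void card_ref_init(struct card_ref *r)
{
	atomic_init(&r->refs, 1);	/* the core's own reference */
	pthread_mutex_init(&r->lock, NULL);
	pthread_cond_init(&r->done_cond, NULL);
	r->done = 0;
}

/* Taken wherever the atomic counter would have been incremented. */
static void card_ref_get(struct card_ref *r)
{
	int old = atomic_fetch_add(&r->refs, 1);

	assert(old > 0);
}

/* The last put plays the role of the release function: it completes. */
static void card_ref_put(struct card_ref *r)
{
	if (atomic_fetch_sub(&r->refs, 1) == 1) {
		pthread_mutex_lock(&r->lock);
		r->done = 1;
		pthread_cond_broadcast(&r->done_cond);
		pthread_mutex_unlock(&r->lock);
	}
}

/* Drop the core's reference, then sleep until every user is gone. */
static void card_remove(struct card_ref *r)
{
	card_ref_put(r);
	pthread_mutex_lock(&r->lock);
	while (!r->done)
		pthread_cond_wait(&r->done_cond, &r->lock);
	pthread_mutex_unlock(&r->lock);
}
```

Compared with polling the counter every 100 ms, the waiter here is woken
exactly once, when the last reference is dropped.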

>  Also skip useless retries in the node initialization job if the node is
>  to be shut down.
>
>  Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
>  ---
>   drivers/firewire/fw-card.c        |   10 +++++++++-
>   drivers/firewire/fw-device.c      |   21 ++++++---------------
>   drivers/firewire/fw-device.h      |   16 ++++++++++++++--
>   drivers/firewire/fw-sbp2.c        |    4 ++++
>   drivers/firewire/fw-transaction.h |    2 ++
>   5 files changed, 35 insertions(+), 18 deletions(-)
>
>  Index: linux/drivers/firewire/fw-card.c
>  ===================================================================
>  --- linux.orig/drivers/firewire/fw-card.c
>  +++ linux/drivers/firewire/fw-card.c
>  @@ -18,6 +18,7 @@
>
>   #include <linux/module.h>
>   #include <linux/errno.h>
>  +#include <linux/delay.h>
>   #include <linux/device.h>
>   #include <linux/mutex.h>
>   #include <linux/crc-itu-t.h>
>  @@ -398,6 +399,7 @@ fw_card_initialize(struct fw_card *card,
>         static atomic_t index = ATOMIC_INIT(-1);
>
>         kref_init(&card->kref);
>  +       atomic_set(&card->device_count, 0);
>         card->index = atomic_inc_return(&index);
>         card->driver = driver;
>         card->device = device;
>  @@ -528,8 +530,14 @@ fw_core_remove_card(struct fw_card *card
>         card->driver = &dummy_driver;
>
>         fw_destroy_nodes(card);
>  -       flush_scheduled_work();
>  +       /*
>  +        * Wait for all device workqueue jobs to finish.  Otherwise the
>  +        * firewire-core module could be unloaded before the jobs ran.
>  +        */
>  +       while (atomic_read(&card->device_count) > 0)
>  +               msleep(100);
>
>  +       cancel_delayed_work_sync(&card->work);
>         fw_flush_transactions(card);
>         del_timer_sync(&card->flush_timer);
>
>  Index: linux/drivers/firewire/fw-device.c
>  ===================================================================
>  --- linux.orig/drivers/firewire/fw-device.c
>  +++ linux/drivers/firewire/fw-device.c
>  @@ -150,21 +150,10 @@ struct bus_type fw_bus_type = {
>   };
>   EXPORT_SYMBOL(fw_bus_type);
>
>  -struct fw_device *fw_device_get(struct fw_device *device)
>  -{
>  -       get_device(&device->device);
>  -
>  -       return device;
>  -}
>  -
>  -void fw_device_put(struct fw_device *device)
>  -{
>  -       put_device(&device->device);
>  -}
>  -
>   static void fw_device_release(struct device *dev)
>   {
>         struct fw_device *device = fw_device(dev);
>  +       struct fw_card *card = device->card;
>         unsigned long flags;
>
>         /*
>  @@ -176,9 +165,9 @@ static void fw_device_release(struct dev
>         spin_unlock_irqrestore(&device->card->lock, flags);
>
>         fw_node_put(device->node);
>  -       fw_card_put(device->card);
>         kfree(device->config_rom);
>         kfree(device);
>  +       atomic_dec(&card->device_count);
>   }
>
>   int fw_device_enable_phys_dma(struct fw_device *device)
>  @@ -668,7 +657,8 @@ static void fw_device_init(struct work_s
>          */
>
>         if (read_bus_info_block(device, device->generation) < 0) {
>  -               if (device->config_rom_retries < MAX_RETRIES) {
>  +               if (device->config_rom_retries < MAX_RETRIES &&
>  +                   atomic_read(&device->state) == FW_DEVICE_INITIALIZING) {
>                         device->config_rom_retries++;
>                         schedule_delayed_work(&device->work, RETRY_DELAY);
>                 } else {
>  @@ -805,7 +795,8 @@ void fw_node_event(struct fw_card *card,
>                  */
>                 device_initialize(&device->device);
>                 atomic_set(&device->state, FW_DEVICE_INITIALIZING);
>  -               device->card = fw_card_get(card);
>  +               atomic_inc(&card->device_count);
>  +               device->card = card;
>                 device->node = fw_node_get(node);
>                 device->node_id = node->node_id;
>                 device->generation = card->generation;
>  Index: linux/drivers/firewire/fw-sbp2.c
>  ===================================================================
>  --- linux.orig/drivers/firewire/fw-sbp2.c
>  +++ linux/drivers/firewire/fw-sbp2.c
>  @@ -757,6 +757,7 @@ static void sbp2_release_target(struct k
>         struct sbp2_logical_unit *lu, *next;
>         struct Scsi_Host *shost =
>                 container_of((void *)tgt, struct Scsi_Host, hostdata[0]);
>  +       struct fw_device *device = fw_device(tgt->unit->device.parent);
>
>         /* prevent deadlocks */
>         sbp2_unblock(tgt);
>  @@ -778,6 +779,7 @@ static void sbp2_release_target(struct k
>
>         put_device(&tgt->unit->device);
>         scsi_host_put(shost);
>  +       fw_device_put(device);
>   }
>
>   static struct workqueue_struct *sbp2_wq;
>  @@ -1080,6 +1082,8 @@ static int sbp2_probe(struct device *dev
>         if (scsi_add_host(shost, &unit->device) < 0)
>                 goto fail_shost_put;
>
>  +       fw_device_get(device);
>  +
>         /* Initialize to values that won't match anything in our table. */
>         firmware_revision = 0xff000000;
>         model = 0xff000000;
>  Index: linux/drivers/firewire/fw-transaction.h
>  ===================================================================
>  --- linux.orig/drivers/firewire/fw-transaction.h
>  +++ linux/drivers/firewire/fw-transaction.h
>  @@ -26,6 +26,7 @@
>   #include <linux/fs.h>
>   #include <linux/dma-mapping.h>
>   #include <linux/firewire-constants.h>
>  +#include <asm/atomic.h>
>
>   #define TCODE_IS_READ_REQUEST(tcode)   (((tcode) & ~1) == 4)
>   #define TCODE_IS_BLOCK_PACKET(tcode)   (((tcode) &  1) != 0)
>  @@ -219,6 +220,7 @@ extern struct bus_type fw_bus_type;
>   struct fw_card {
>         const struct fw_card_driver *driver;
>         struct device *device;
>  +       atomic_t device_count;
>         struct kref kref;
>
>         int node_id;
>  Index: linux/drivers/firewire/fw-device.h
>  ===================================================================
>  --- linux.orig/drivers/firewire/fw-device.h
>  +++ linux/drivers/firewire/fw-device.h
>  @@ -76,9 +76,21 @@ fw_device_is_shutdown(struct fw_device *
>         return atomic_read(&device->state) == FW_DEVICE_SHUTDOWN;
>   }
>
>  -struct fw_device *fw_device_get(struct fw_device *device);
>  +static inline struct fw_device *
>  +fw_device_get(struct fw_device *device)
>  +{
>  +       get_device(&device->device);
>  +
>  +       return device;
>  +}
>  +
>  +static inline void
>  +fw_device_put(struct fw_device *device)
>  +{
>  +       put_device(&device->device);
>  +}
>  +
>   struct fw_device *fw_device_get_by_devt(dev_t devt);
>  -void fw_device_put(struct fw_device *device);
>   int fw_device_enable_phys_dma(struct fw_device *device);
>
>   void fw_device_cdev_update(struct fw_device *device);
>
>  --
>  Stefan Richter
>  -=====-==--- --=- ==---
>  http://arcgraph.de/sr/
>
>


* Re: [PATCH 2/5] firewire: fix crash in automatic module unloading
  2008-03-03 16:45   ` Kristian Høgsberg
@ 2008-03-03 17:16     ` Stefan Richter
  2008-03-03 17:37       ` Kristian Høgsberg
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Richter @ 2008-03-03 17:16 UTC (permalink / raw)
  To: Kristian Høgsberg; +Cc: linux1394-devel, Jarod Wilson, linux-kernel

Kristian Høgsberg wrote:
> I would want to use a kref and a completion for tracking this though
> instead of the atomic.  Just use kref_get() instead of incrementing
> the atomic and use kref_put() instead of decrementing it.  The release
> function for kref_put() should complete the completion struct and
> instead of the busy loop in fw_core_remove_card() we just wait for the
> completion.

Sounds like the way to go.  Since I already passed that patch upwards, I 
will do an incremental rework.  (But perhaps not before spending some 
time on ticket number 9617 at bugzilla.kernel.org's...)

> And I'm not sure I agree that it's a device_count, it
> really just is a ref-count.  The core should also hold a reference to
> the card and release it in fw_core_remove_card(), just before waiting
> on the completion.

Right; we just shouldn't mix fw-ohci's refcounting (which isn't really 
needed since the lifetime rules for the card are as simple as they can 
get for fw-ohci) and fw-core's refcounting.
-- 
Stefan Richter
-=====-==--- --== ---==
http://arcgraph.de/sr/


* Re: [PATCH 2/5] firewire: fix crash in automatic module unloading
  2008-03-03 17:16     ` Stefan Richter
@ 2008-03-03 17:37       ` Kristian Høgsberg
  0 siblings, 0 replies; 10+ messages in thread
From: Kristian Høgsberg @ 2008-03-03 17:37 UTC (permalink / raw)
  To: Stefan Richter; +Cc: linux1394-devel, Jarod Wilson, linux-kernel

On Mon, Mar 3, 2008 at 12:16 PM, Stefan Richter
<stefanr@s5r6.in-berlin.de> wrote:
> Kristian Høgsberg wrote:
>  > I would want to use a kref and a completion for tracking this though
>  > instead of the atomic.  Just use kref_get() instead of incrementing
>  > the atomic and use kref_put() instead of decrementing it.  The release
>  > function for kref_put() should complete the completion struct and
>  > instead of the busy loop in fw_core_remove_card() we just wait for the
>  > completion.
>
>  Sounds like the way to go.  Since I already passed that patch upwards, I
>  will do an incremental rework.  (But perhaps not before spending some
>  time on ticket number 9617 at bugzilla.kernel.org's...)
>
>
>  > And I'm not sure I agree that it's a device_count, it
>  > really just is a ref-count.  The core should also hold a reference to
>  > the card and release it in fw_core_remove_card(), just before waiting
>  > on the completion.
>
>  Right; we just shouldn't mix fw-ohci's refcounting (which isn't really
>  needed since the lifetime rules for the card are as simple as they can
>  get for fw-ohci) and fw-core's refcounting.

Yup, we should keep the fw-ohci side of things simple.  And they still
are: it calls add_card() to add a new card at which point the core
starts using the card.  On removal, it calls remove_card() and when
that returns the core no longer knows or cares about the card and
fw-ohci.c is free to tear it down however it wants.  When I mention
that the core should hold a ref-count, it's really just that somebody
needs to own the initial ref-count and make sure it's dropped on the
remove path.

Kristian


end of thread, other threads:[~2008-03-03 17:42 UTC | newest]

Thread overview: 10+ messages
2008-02-24 17:56 [PATCH 0/5] firewire: fix crashes in workqueue jobs Stefan Richter
2008-02-24 17:57 ` [PATCH 1/5] firewire: invalid pointers used in fw_card_bm_work Stefan Richter
2008-02-24 17:59 ` [PATCH 2/5] firewire: fix crash in automatic module unloading Stefan Richter
2008-03-03 16:45   ` Kristian Høgsberg
2008-03-03 17:16     ` Stefan Richter
2008-03-03 17:37       ` Kristian Høgsberg
2008-02-24 17:59 ` [PATCH 3/5] firewire: remove superfluous reference counting Stefan Richter
2008-02-24 18:00 ` [PATCH 4/5] firewire: fw-sbp2: fix " Stefan Richter
2008-02-24 18:01 ` [PATCH 5/5] firewire: refactor fw_unit " Stefan Richter
2008-03-01  5:17 ` [PATCH 0/5] firewire: fix crashes in workqueue jobs Jarod Wilson
