* [PATCH 0/4 v4] More device removal fixes
@ 2012-10-10 15:05 Bart Van Assche
  2012-10-10 15:07 ` [PATCH 1/4] block: Rename queue dead flag Bart Van Assche
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Bart Van Assche @ 2012-10-10 15:05 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Tejun Heo, Chanho Min

Fix a few race conditions that can be triggered by removing a device
(a sketch of the first race follows this list):
- Prevent request_fn() from being invoked on a dead queue.
- Prevent blk_cleanup_queue() from finishing while request_fn is still
   running.
- Fix a race between starved list processing and device removal.
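
A rough sketch of the first race (illustrative only; the exact call
chain varies per block driver):

/*
 *   CPU 0                              CPU 1
 *   -----                              -----
 *   blk_cleanup_queue(q)
 *     drain queue, then free the
 *     resources used by request_fn
 *                                      __blk_run_queue(q)
 *                                        q->request_fn(q)
 *                                          -> touches freed resources
 */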

These patches have been tested on top of commit 2474542f, a commit
between v3.6 and the upcoming v3.7-rc1.

Changes compared to v3:
- Fixed a race condition by setting QUEUE_FLAG_DEAD earlier.
- Added a patch for fixing a race between starved list processing
   and device removal to this series.

Changes compared to v2:
- Split second patch into two patches.
- Refined patch descriptions.

Changes compared to v1:
- Included a patch to rename QUEUE_FLAG_DEAD.
- Refined the descriptions of the __blk_run_queue_uncond() and
   blk_cleanup_queue() functions.

-- 
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



* [PATCH 1/4] block: Rename queue dead flag
  2012-10-10 15:05 [PATCH 0/4 v4] More device removal fixes Bart Van Assche
@ 2012-10-10 15:07 ` Bart Van Assche
  2012-10-16 23:31   ` Tejun Heo
  2012-10-10 15:08 ` [PATCH 2/4] block: Avoid that request_fn is invoked on a dead queue Bart Van Assche
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2012-10-10 15:07 UTC (permalink / raw)
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Tejun Heo, Chanho Min

QUEUE_FLAG_DEAD is used to indicate that queuing new requests must
stop. After this flag has been set queue draining starts. However,
during the queue draining phase it is still safe to invoke the
queue's request_fn, so QUEUE_FLAG_DYING is a better name for this
flag.

This patch is the result of running the command below and manually
fixing up indentation:

git grep -lE 'blk_queue_dead|QUEUE_FLAG_DEAD' |
  xargs sed -i.tmp -e 's/blk_queue_dead/blk_queue_dying/g' \
	-e 's/QUEUE_FLAG_DEAD/QUEUE_FLAG_DYING/g'

Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Chanho Min <chanho.min@lge.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-cgroup.c      |    2 +-
 block/blk-core.c        |   16 ++++++++--------
 block/blk-exec.c        |    2 +-
 block/blk-sysfs.c       |    4 ++--
 block/blk-throttle.c    |    2 +-
 block/blk.h             |    2 +-
 drivers/scsi/scsi_lib.c |    2 +-
 include/linux/blkdev.h  |    4 ++--
 8 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index cafcd74..396c319 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -231,7 +231,7 @@ struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
 	 * we shouldn't allow anything to go through for a bypassing queue.
 	 */
 	if (unlikely(blk_queue_bypass(q)))
-		return ERR_PTR(blk_queue_dead(q) ? -EINVAL : -EBUSY);
+		return ERR_PTR(blk_queue_dying(q) ? -EINVAL : -EBUSY);
 	return __blkg_lookup_create(blkcg, q, NULL);
 }
 EXPORT_SYMBOL_GPL(blkg_lookup_create);
diff --git a/block/blk-core.c b/block/blk-core.c
index d2da641..a319fbf 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -482,7 +482,7 @@ void blk_cleanup_queue(struct request_queue *q)
 
 	/* mark @q DEAD, no new request or merges will be allowed afterwards */
 	mutex_lock(&q->sysfs_lock);
-	queue_flag_set_unlocked(QUEUE_FLAG_DEAD, q);
+	queue_flag_set_unlocked(QUEUE_FLAG_DYING, q);
 	spin_lock_irq(lock);
 
 	/*
@@ -499,7 +499,7 @@ void blk_cleanup_queue(struct request_queue *q)
 
 	queue_flag_set(QUEUE_FLAG_NOMERGES, q);
 	queue_flag_set(QUEUE_FLAG_NOXMERGES, q);
-	queue_flag_set(QUEUE_FLAG_DEAD, q);
+	queue_flag_set(QUEUE_FLAG_DYING, q);
 	spin_unlock_irq(lock);
 	mutex_unlock(&q->sysfs_lock);
 
@@ -721,7 +721,7 @@ EXPORT_SYMBOL(blk_init_allocated_queue);
 
 bool blk_get_queue(struct request_queue *q)
 {
-	if (likely(!blk_queue_dead(q))) {
+	if (likely(!blk_queue_dying(q))) {
 		__blk_get_queue(q);
 		return true;
 	}
@@ -875,7 +875,7 @@ static struct request *__get_request(struct request_list *rl, int rw_flags,
 	const bool is_sync = rw_is_sync(rw_flags) != 0;
 	int may_queue;
 
-	if (unlikely(blk_queue_dead(q)))
+	if (unlikely(blk_queue_dying(q)))
 		return NULL;
 
 	may_queue = elv_may_queue(q, rw_flags);
@@ -1055,7 +1055,7 @@ retry:
 	if (rq)
 		return rq;
 
-	if (!(gfp_mask & __GFP_WAIT) || unlikely(blk_queue_dead(q))) {
+	if (!(gfp_mask & __GFP_WAIT) || unlikely(blk_queue_dying(q))) {
 		blk_put_rl(rl);
 		return NULL;
 	}
@@ -1907,7 +1907,7 @@ int blk_insert_cloned_request(struct request_queue *q, struct request *rq)
 		return -EIO;
 
 	spin_lock_irqsave(q->queue_lock, flags);
-	if (unlikely(blk_queue_dead(q))) {
+	if (unlikely(blk_queue_dying(q))) {
 		spin_unlock_irqrestore(q->queue_lock, flags);
 		return -ENODEV;
 	}
@@ -2889,7 +2889,7 @@ static void queue_unplugged(struct request_queue *q, unsigned int depth,
 	/*
 	 * Don't mess with dead queue.
 	 */
-	if (unlikely(blk_queue_dead(q))) {
+	if (unlikely(blk_queue_dying(q))) {
 		spin_unlock(q->queue_lock);
 		return;
 	}
@@ -2998,7 +2998,7 @@ void blk_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 		/*
 		 * Short-circuit if @q is dead
 		 */
-		if (unlikely(blk_queue_dead(q))) {
+		if (unlikely(blk_queue_dying(q))) {
 			__blk_end_request_all(rq, -ENODEV);
 			continue;
 		}
diff --git a/block/blk-exec.c b/block/blk-exec.c
index 8b6dc5b..4aec98d 100644
--- a/block/blk-exec.c
+++ b/block/blk-exec.c
@@ -60,7 +60,7 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
 
 	spin_lock_irq(q->queue_lock);
 
-	if (unlikely(blk_queue_dead(q))) {
+	if (unlikely(blk_queue_dying(q))) {
 		rq->errors = -ENXIO;
 		if (rq->end_io)
 			rq->end_io(rq, rq->errors);
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 9628b29..6898f17 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -432,7 +432,7 @@ queue_attr_show(struct kobject *kobj, struct attribute *attr, char *page)
 	if (!entry->show)
 		return -EIO;
 	mutex_lock(&q->sysfs_lock);
-	if (blk_queue_dead(q)) {
+	if (blk_queue_dying(q)) {
 		mutex_unlock(&q->sysfs_lock);
 		return -ENOENT;
 	}
@@ -454,7 +454,7 @@ queue_attr_store(struct kobject *kobj, struct attribute *attr,
 
 	q = container_of(kobj, struct request_queue, kobj);
 	mutex_lock(&q->sysfs_lock);
-	if (blk_queue_dead(q)) {
+	if (blk_queue_dying(q)) {
 		mutex_unlock(&q->sysfs_lock);
 		return -ENOENT;
 	}
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index a9664fa..3114622 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -302,7 +302,7 @@ static struct throtl_grp *throtl_lookup_create_tg(struct throtl_data *td,
 		/* if %NULL and @q is alive, fall back to root_tg */
 		if (!IS_ERR(blkg))
 			tg = blkg_to_tg(blkg);
-		else if (!blk_queue_dead(q))
+		else if (!blk_queue_dying(q))
 			tg = td_root_tg(td);
 	}
 
diff --git a/block/blk.h b/block/blk.h
index 2a0ea32..a066ceb 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -96,7 +96,7 @@ static inline struct request *__elv_next_request(struct request_queue *q)
 			q->flush_queue_delayed = 1;
 			return NULL;
 		}
-		if (unlikely(blk_queue_dead(q)) ||
+		if (unlikely(blk_queue_dying(q)) ||
 		    !q->elevator->type->ops.elevator_dispatch_fn(q, 0))
 			return NULL;
 	}
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index da36a3a..f29a1a9 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1406,7 +1406,7 @@ static int scsi_lld_busy(struct request_queue *q)
 	struct scsi_device *sdev = q->queuedata;
 	struct Scsi_Host *shost;
 
-	if (blk_queue_dead(q))
+	if (blk_queue_dying(q))
 		return 0;
 
 	shost = sdev->host;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 4a2ab7c..c6ab0db 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -436,7 +436,7 @@ struct request_queue {
 #define QUEUE_FLAG_STOPPED	2	/* queue is stopped */
 #define	QUEUE_FLAG_SYNCFULL	3	/* read queue has been filled */
 #define QUEUE_FLAG_ASYNCFULL	4	/* write queue has been filled */
-#define QUEUE_FLAG_DEAD		5	/* queue being torn down */
+#define QUEUE_FLAG_DYING	5	/* queue being torn down */
 #define QUEUE_FLAG_BYPASS	6	/* act as dumb FIFO queue */
 #define QUEUE_FLAG_BIDI		7	/* queue supports bidi requests */
 #define QUEUE_FLAG_NOMERGES     8	/* disable merge attempts */
@@ -520,7 +520,7 @@ static inline void queue_flag_clear(unsigned int flag, struct request_queue *q)
 
 #define blk_queue_tagged(q)	test_bit(QUEUE_FLAG_QUEUED, &(q)->queue_flags)
 #define blk_queue_stopped(q)	test_bit(QUEUE_FLAG_STOPPED, &(q)->queue_flags)
-#define blk_queue_dead(q)	test_bit(QUEUE_FLAG_DEAD, &(q)->queue_flags)
+#define blk_queue_dying(q)	test_bit(QUEUE_FLAG_DYING, &(q)->queue_flags)
 #define blk_queue_bypass(q)	test_bit(QUEUE_FLAG_BYPASS, &(q)->queue_flags)
 #define blk_queue_nomerges(q)	test_bit(QUEUE_FLAG_NOMERGES, &(q)->queue_flags)
 #define blk_queue_noxmerges(q)	\
-- 
1.7.10.4



* [PATCH 2/4] block: Avoid that request_fn is invoked on a dead queue
  2012-10-10 15:05 [PATCH 0/4 v4] More device removal fixes Bart Van Assche
  2012-10-10 15:07 ` [PATCH 1/4] block: Rename queue dead flag Bart Van Assche
@ 2012-10-10 15:08 ` Bart Van Assche
  2012-10-16 23:38   ` Tejun Heo
  2012-10-10 15:09 ` [PATCH 3/4] Make blk_cleanup_queue() wait until request_fn finished Bart Van Assche
  2012-10-10 15:10 ` [PATCH 4/4] Fix race between starved list processing and device removal Bart Van Assche
  3 siblings, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2012-10-10 15:08 UTC (permalink / raw)
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Tejun Heo, Chanho Min

A block driver may start cleaning up the resources needed by its
request_fn as soon as blk_cleanup_queue() has finished, so request_fn
must not be invoked after queue draining has finished.
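
In other words, the queue shutdown ordering this patch aims for is as
follows (illustrative sketch, not part of the patch itself):

/*
 *   blk_cleanup_queue(q)
 *     queue_flag_set(QUEUE_FLAG_DYING, q);  no new requests admitted
 *     blk_drain_queue(q, true);             request_fn may still run;
 *                                           sets QUEUE_FLAG_DEAD once
 *                                           the queue has been drained
 *
 *   After that, __blk_run_queue_uncond() returns without calling
 *   q->request_fn(), so the driver may free its request_fn resources.
 */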

Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Chanho Min <chanho.min@lge.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-core.c       |   24 +++++++++++++++++++++++-
 block/blk-exec.c       |    2 +-
 block/blk.h            |    2 ++
 include/linux/blkdev.h |    2 ++
 4 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index a319fbf..5e752ff 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -293,6 +293,25 @@ void blk_sync_queue(struct request_queue *q)
 EXPORT_SYMBOL(blk_sync_queue);
 
 /**
+ * __blk_run_queue_uncond - run a queue whether or not it has been stopped
+ * @q:	The queue to run
+ *
+ * Description:
+ *    Invoke request handling on a queue if there are any pending requests.
+ *    May be used to restart request handling after a request has completed.
+ *    This variant runs the queue whether or not the queue has been
+ *    stopped. Must be called with the queue lock held and interrupts
+ *    disabled. See also @blk_run_queue.
+ */
+void __blk_run_queue_uncond(struct request_queue *q)
+{
+	if (unlikely(blk_queue_dead(q)))
+		return;
+
+	q->request_fn(q);
+}
+
+/**
  * __blk_run_queue - run a single device queue
  * @q:	The queue to run
  *
@@ -305,7 +324,7 @@ void __blk_run_queue(struct request_queue *q)
 	if (unlikely(blk_queue_stopped(q)))
 		return;
 
-	q->request_fn(q);
+	__blk_run_queue_uncond(q);
 }
 EXPORT_SYMBOL(__blk_run_queue);
 
@@ -401,6 +420,9 @@ void blk_drain_queue(struct request_queue *q, bool drain_all)
 			}
 		}
 
+		if (!drain && blk_queue_dying(q))
+			queue_flag_set(QUEUE_FLAG_DEAD, q);
+
 		spin_unlock_irq(q->queue_lock);
 
 		if (!drain)
diff --git a/block/blk-exec.c b/block/blk-exec.c
index 4aec98d..1320e74 100644
--- a/block/blk-exec.c
+++ b/block/blk-exec.c
@@ -72,7 +72,7 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
 	__blk_run_queue(q);
 	/* the queue is stopped so it won't be run */
 	if (rq->cmd_type == REQ_TYPE_PM_RESUME)
-		q->request_fn(q);
+		__blk_run_queue_uncond(q);
 	spin_unlock_irq(q->queue_lock);
 }
 EXPORT_SYMBOL_GPL(blk_execute_rq_nowait);
diff --git a/block/blk.h b/block/blk.h
index a066ceb..3e94c14 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -145,6 +145,8 @@ int blk_try_merge(struct request *rq, struct bio *bio);
 
 void blk_queue_congestion_threshold(struct request_queue *q);
 
+void __blk_run_queue_uncond(struct request_queue *q);
+
 int blk_dev_init(void);
 
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index c6ab0db..9b9855f 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -451,6 +451,7 @@ struct request_queue {
 #define QUEUE_FLAG_ADD_RANDOM  16	/* Contributes to random pool */
 #define QUEUE_FLAG_SECDISCARD  17	/* supports SECDISCARD */
 #define QUEUE_FLAG_SAME_FORCE  18	/* force complete on same CPU */
+#define QUEUE_FLAG_DEAD        19	/* queue tear-down finished */
 
 #define QUEUE_FLAG_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
 				 (1 << QUEUE_FLAG_STACKABLE)	|	\
@@ -521,6 +522,7 @@ static inline void queue_flag_clear(unsigned int flag, struct request_queue *q)
 #define blk_queue_tagged(q)	test_bit(QUEUE_FLAG_QUEUED, &(q)->queue_flags)
 #define blk_queue_stopped(q)	test_bit(QUEUE_FLAG_STOPPED, &(q)->queue_flags)
 #define blk_queue_dying(q)	test_bit(QUEUE_FLAG_DYING, &(q)->queue_flags)
+#define blk_queue_dead(q)	test_bit(QUEUE_FLAG_DEAD, &(q)->queue_flags)
 #define blk_queue_bypass(q)	test_bit(QUEUE_FLAG_BYPASS, &(q)->queue_flags)
 #define blk_queue_nomerges(q)	test_bit(QUEUE_FLAG_NOMERGES, &(q)->queue_flags)
 #define blk_queue_noxmerges(q)	\
-- 
1.7.10.4



* [PATCH 3/4] Make blk_cleanup_queue() wait until request_fn finished
  2012-10-10 15:05 [PATCH 0/4 v4] More device removal fixes Bart Van Assche
  2012-10-10 15:07 ` [PATCH 1/4] block: Rename queue dead flag Bart Van Assche
  2012-10-10 15:08 ` [PATCH 2/4] block: Avoid that request_fn is invoked on a dead queue Bart Van Assche
@ 2012-10-10 15:09 ` Bart Van Assche
  2012-10-16 23:51   ` Tejun Heo
  2012-10-10 15:10 ` [PATCH 4/4] Fix race between starved list processing and device removal Bart Van Assche
  3 siblings, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2012-10-10 15:09 UTC (permalink / raw)
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Tejun Heo, Chanho Min

Some request_fn implementations, e.g. scsi_request_fn(), temporarily
release the queue lock. Make sure that blk_cleanup_queue() waits
until all active request_fn invocations have finished. This fixes a
potential use-after-free at the end of scsi_request_fn(). Also,
change the type of the 'drain' variable from bool to int so that the
high-order bits of the request counters are not ignored.
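
The use-after-free in question, roughly (illustrative sketch):

/*
 *   scsi_request_fn()                  blk_cleanup_queue()
 *   -----------------                  -------------------
 *   spin_unlock_irq(q->queue_lock);
 *                                      draining finishes; device
 *                                      teardown proceeds and may free
 *                                      the sdev and its queue
 *   put_device(&sdev->sdev_gendev);
 *   spin_lock_irq(q->queue_lock);      <- q may already be gone
 */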

Reported-by: Chanho Min <chanho.min@lge.com>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-core.c        |   11 +++++++----
 drivers/scsi/scsi_lib.c |   10 +---------
 include/linux/blkdev.h  |    6 ++++++
 3 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 5e752ff..ba75649 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -308,7 +308,9 @@ void __blk_run_queue_uncond(struct request_queue *q)
 	if (unlikely(blk_queue_dead(q)))
 		return;
 
+	q->driver_active++;
 	q->request_fn(q);
+	q->driver_active--;
 }
 
 /**
@@ -376,12 +378,12 @@ EXPORT_SYMBOL(blk_put_queue);
  * If not, only ELVPRIV requests are drained.  The caller is responsible
  * for ensuring that no new requests which need to be drained are queued.
  */
-void blk_drain_queue(struct request_queue *q, bool drain_all)
+static void blk_drain_queue(struct request_queue *q, bool drain_all)
 {
 	int i;
 
 	while (true) {
-		bool drain = false;
+		int drain = 0;
 
 		spin_lock_irq(q->queue_lock);
 
@@ -405,6 +407,7 @@ void blk_drain_queue(struct request_queue *q, bool drain_all)
 			__blk_run_queue(q);
 
 		drain |= q->nr_rqs_elvpriv;
+		drain |= q->driver_active;
 
 		/*
 		 * Unfortunately, requests are queued at and tracked from
@@ -495,8 +498,8 @@ EXPORT_SYMBOL_GPL(blk_queue_bypass_end);
  * blk_cleanup_queue - shutdown a request queue
  * @q: request queue to shutdown
  *
- * Mark @q DEAD, drain all pending requests, destroy and put it.  All
- * future requests will be failed immediately with -ENODEV.
+ * Mark @q as dying, drain all pending requests, mark @q as dead, destroy and
+ * put it.  All future requests will be failed immediately with -ENODEV.
  */
 void blk_cleanup_queue(struct request_queue *q)
 {
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f29a1a9..0e15374 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1517,10 +1517,6 @@ static void scsi_request_fn(struct request_queue *q)
 	struct scsi_cmnd *cmd;
 	struct request *req;
 
-	if(!get_device(&sdev->sdev_gendev))
-		/* We must be tearing the block queue down already */
-		return;
-
 	/*
 	 * To start with, we keep looping until the queue is empty, or until
 	 * the host is no longer able to accept any more requests.
@@ -1629,11 +1625,7 @@ out_delay:
 	if (sdev->device_busy == 0)
 		blk_delay_queue(q, SCSI_QUEUE_DELAY);
 out:
-	/* must be careful here...if we trigger the ->remove() function
-	 * we cannot be holding the q lock */
-	spin_unlock_irq(q->queue_lock);
-	put_device(&sdev->sdev_gendev);
-	spin_lock_irq(q->queue_lock);
+	;
 }
 
 u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 9b9855f..66ae538 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -377,6 +377,12 @@ struct request_queue {
 
 	unsigned int		nr_sorted;
 	unsigned int		in_flight[2];
+	/*
+	 * Number of active block driver functions for which blk_drain_queue()
+	 * must wait. Must be incremented around functions that unlock the
+	 * queue_lock internally, e.g. scsi_request_fn().
+	 */
+	unsigned int		driver_active;
 
 	unsigned int		rq_timeout;
 	struct timer_list	timeout;
-- 
1.7.10.4



* [PATCH 4/4] Fix race between starved list processing and device removal
  2012-10-10 15:05 [PATCH 0/4 v4] More device removal fixes Bart Van Assche
                   ` (2 preceding siblings ...)
  2012-10-10 15:09 ` [PATCH 3/4] Make blk_cleanup_queue() wait until request_fn finished Bart Van Assche
@ 2012-10-10 15:10 ` Bart Van Assche
  2012-10-16 23:59   ` Tejun Heo
  3 siblings, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2012-10-10 15:10 UTC (permalink / raw)
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe,
	Tejun Heo, Chanho Min

Prevent the sdev reference count from dropping to zero before the
queue is run by scsi_run_queue(), and also prevent it from dropping
to zero while that function invokes __blk_run_queue().
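
Schematically, the race window being closed (illustrative sketch):

/*
 *   scsi_run_queue()                       device removal path
 *   ----------------                       -------------------
 *   take sdev off shost->starved_list
 *   spin_unlock(shost->host_lock);
 *                                          last sdev reference dropped;
 *                                          sdev and its queue freed
 *   __blk_run_queue(sdev->request_queue);  <- use after free
 */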

Reported-and-tested-by: Chanho Min <chanho.min@lge.com>
Reference: http://lkml.org/lkml/2012/8/2/96
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_lib.c   |    5 +++++
 drivers/scsi/scsi_sysfs.c |    7 ++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0e15374..fb57e68 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -452,10 +452,15 @@ static void scsi_run_queue(struct request_queue *q)
 			continue;
 		}
 
+		get_device(&sdev->sdev_gendev);
 		spin_unlock(shost->host_lock);
+
 		spin_lock(sdev->request_queue->queue_lock);
 		__blk_run_queue(sdev->request_queue);
 		spin_unlock(sdev->request_queue->queue_lock);
+
+		put_device(&sdev->sdev_gendev);
+
 		spin_lock(shost->host_lock);
 	}
 	/* put any unprocessed entries back */
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index ce5224c..2661a957 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -348,7 +348,6 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	starget->reap_ref++;
 	list_del(&sdev->siblings);
 	list_del(&sdev->same_target_siblings);
-	list_del(&sdev->starved_entry);
 	spin_unlock_irqrestore(sdev->host->host_lock, flags);
 
 	cancel_work_sync(&sdev->event_work);
@@ -956,6 +955,8 @@ int scsi_sysfs_add_sdev(struct scsi_device *sdev)
 void __scsi_remove_device(struct scsi_device *sdev)
 {
 	struct device *dev = &sdev->sdev_gendev;
+	struct Scsi_Host *shost = sdev->host;
+	unsigned long flags;
 
 	if (sdev->is_visible) {
 		if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0)
@@ -977,6 +978,10 @@ void __scsi_remove_device(struct scsi_device *sdev)
 	blk_cleanup_queue(sdev->request_queue);
 	cancel_work_sync(&sdev->requeue_work);
 
+	spin_lock_irqsave(shost->host_lock, flags);
+	list_del(&sdev->starved_entry);
+	spin_unlock_irqrestore(shost->host_lock, flags);
+
 	if (sdev->host->hostt->slave_destroy)
 		sdev->host->hostt->slave_destroy(sdev);
 	transport_destroy_device(dev);
-- 
1.7.10.4



* Re: [PATCH 1/4] block: Rename queue dead flag
  2012-10-10 15:07 ` [PATCH 1/4] block: Rename queue dead flag Bart Van Assche
@ 2012-10-16 23:31   ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2012-10-16 23:31 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Chanho Min

On Wed, Oct 10, 2012 at 05:07:08PM +0200, Bart Van Assche wrote:
> QUEUE_FLAG_DEAD is used to indicate that queuing new requests must
> stop. After this flag has been set queue draining starts. However,
> during the queue draining phase it is still safe to invoke the
> queue's request_fn, so QUEUE_FLAG_DYING is a better name for this
> flag.
> 
> This patch is the result of running the command below and manually
> fixing up indentation:
> 
> git grep -lE 'blk_queue_dead|QUEUE_FLAG_DEAD' |
>   xargs sed -i.tmp -e 's/blk_queue_dead/blk_queue_dying/g' \
> 	-e 's/QUEUE_FLAG_DEAD/QUEUE_FLAG_DYING/g'
> 
> Cc: James Bottomley <JBottomley@Parallels.com>
> Cc: Mike Christie <michaelc@cs.wisc.edu>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Chanho Min <chanho.min@lge.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun


* Re: [PATCH 2/4] block: Avoid that request_fn is invoked on a dead queue
  2012-10-10 15:08 ` [PATCH 2/4] block: Avoid that request_fn is invoked on a dead queue Bart Van Assche
@ 2012-10-16 23:38   ` Tejun Heo
  2012-10-23 12:11     ` Bart Van Assche
  0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2012-10-16 23:38 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Chanho Min

Hello,

On Wed, Oct 10, 2012 at 05:08:01PM +0200, Bart Van Assche wrote:
> A block driver may start cleaning up the resources needed by its
> request_fn as soon as blk_cleanup_queue() has finished, so request_fn
> must not be invoked after queue draining has finished.

Can you please make the commit message more verbose, preferably with
an example crash trace?  It's difficult to tell what bug it's trying
to fix and how.

>  /**
> + * __blk_run_queue_uncond - run a queue whether or not it has been stopped
> + * @q:	The queue to run
> + *
> + * Description:
> + *    Invoke request handling on a queue if there are any pending requests.
> + *    May be used to restart request handling after a request has completed.
> + *    This variant runs the queue whether or not the queue has been
> + *    stopped. Must be called with the queue lock held and interrupts
> + *    disabled. See also @blk_run_queue.
> + */
> +void __blk_run_queue_uncond(struct request_queue *q)
> +{
> +	if (unlikely(blk_queue_dead(q)))
> +		return;
> +
> +	q->request_fn(q);
> +}
> +
> +/**
>   * __blk_run_queue - run a single device queue
>   * @q:	The queue to run
>   *
> @@ -305,7 +324,7 @@ void __blk_run_queue(struct request_queue *q)
>  	if (unlikely(blk_queue_stopped(q)))
>  		return;
>  
> -	q->request_fn(q);
> +	__blk_run_queue_uncond(q);

__blk_run_queue_uncond() is a cold path and I don't think adding a
test there matters, but I think it would be better if we avoided an
extra branch in __blk_run_queue() if possible.  Can't we merge the
blk_queue_stopped()/blk_queue_dead() tests?
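
Something like this, perhaps (untested sketch that reads both flags
in a single branch):

void __blk_run_queue(struct request_queue *q)
{
	if (unlikely(q->queue_flags & ((1UL << QUEUE_FLAG_STOPPED) |
				       (1UL << QUEUE_FLAG_DEAD))))
		return;

	q->request_fn(q);
}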

> @@ -401,6 +420,9 @@ void blk_drain_queue(struct request_queue *q, bool drain_all)
>  			}
>  		}
>  
> +		if (!drain && blk_queue_dying(q))
> +			queue_flag_set(QUEUE_FLAG_DEAD, q);
> +

Wouldn't doing this in blk_cleanup_queue() be better?  It may involve
extra queue locking but I don't think that matters at all in that
path, and doing things where they belong is far more important.
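
I.e. something like this (untested sketch):

void blk_cleanup_queue(struct request_queue *q)
{
	/* ... mark the queue dying and drain it, as before ... */
	blk_drain_queue(q, true);

	/* only now mark the queue dead, with the queue lock held */
	spin_lock_irq(q->queue_lock);
	queue_flag_set(QUEUE_FLAG_DEAD, q);
	spin_unlock_irq(q->queue_lock);

	/* ... release the queue, as before ... */
}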

Thanks.

-- 
tejun


* Re: [PATCH 3/4] Make blk_cleanup_queue() wait until request_fn finished
  2012-10-10 15:09 ` [PATCH 3/4] Make blk_cleanup_queue() wait until request_fn finished Bart Van Assche
@ 2012-10-16 23:51   ` Tejun Heo
  2012-10-23 12:16     ` Bart Van Assche
  0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2012-10-16 23:51 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Chanho Min

Hello,

On Wed, Oct 10, 2012 at 05:09:04PM +0200, Bart Van Assche wrote:
> Some request_fn implementations, e.g. scsi_request_fn(), temporarily
> release the queue lock. Make sure that blk_cleanup_queue() waits
> until all active request_fn invocations have finished. This fixes a
> potential use-after-free at the end of scsi_request_fn(). Also,
> change the type of the 'drain' variable from bool to int so that the
> high-order bits of the request counters are not ignored.

Similar comment.  It would be great if you separated more clearly
what's broken from how it's fixed.  As-is it's kinda difficult to
digest.

> @@ -308,7 +308,9 @@ void __blk_run_queue_uncond(struct request_queue *q)
>  	if (unlikely(blk_queue_dead(q)))
>  		return;
>  
> +	q->driver_active++;
>  	q->request_fn(q);
> +	q->driver_active--;

Maybe q->request_fn_active is a better name?

> -void blk_drain_queue(struct request_queue *q, bool drain_all)
> +static void blk_drain_queue(struct request_queue *q, bool drain_all)
>  {
>  	int i;
>  
>  	while (true) {
> -		bool drain = false;
> +		int drain = 0;

I don't think this is necessary.  bool conversion works fine
regardless of how high the bits are.  Isn't avoiding the
signed/unsigned autocast maze one of the reasons why we're using
bool to begin with?
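
A minimal userspace illustration of that point (standalone demo, not
kernel code):

#include <stdbool.h>
#include <stdio.h>

int main(void)
{
	bool drain = false;

	drain |= 256u;		/* drain = (bool)(0 | 256) -> true */
	printf("%d\n", drain);	/* prints 1: no bits are lost */
	return 0;
}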

> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index f29a1a9..0e15374 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1517,10 +1517,6 @@ static void scsi_request_fn(struct request_queue *q)
>  	struct scsi_cmnd *cmd;
>  	struct request *req;
>  
> -	if(!get_device(&sdev->sdev_gendev))
> -		/* We must be tearing the block queue down already */
> -		return;
> -
>  	/*
>  	 * To start with, we keep looping until the queue is empty, or until
>  	 * the host is no longer able to accept any more requests.
> @@ -1629,11 +1625,7 @@ out_delay:
>  	if (sdev->device_busy == 0)
>  		blk_delay_queue(q, SCSI_QUEUE_DELAY);
>  out:
> -	/* must be careful here...if we trigger the ->remove() function
> -	 * we cannot be holding the q lock */
> -	spin_unlock_irq(q->queue_lock);
> -	put_device(&sdev->sdev_gendev);
> -	spin_lock_irq(q->queue_lock);
> +	;

I think moving this out to a separate patch would be better.

Thanks.

-- 
tejun


* Re: [PATCH 4/4] Fix race between starved list processing and device removal
  2012-10-10 15:10 ` [PATCH 4/4] Fix race between starved list processing and device removal Bart Van Assche
@ 2012-10-16 23:59   ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2012-10-16 23:59 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Chanho Min

Hello,

On Wed, Oct 10, 2012 at 05:10:35PM +0200, Bart Van Assche wrote:
> Prevent the sdev reference count from dropping to zero before the
> queue is run by scsi_run_queue(), and also prevent it from dropping
> to zero while that function invokes __blk_run_queue().

I think this is correct but again I had to scratch my head quite a
bit.  It would be nice to start with a brief explanation of the
starved_list, especially the way it's processed per-host instead of
per-device, and then explain how it's broken (nothing guarantees that
a device stays alive while it is on the starved_list) and the fix.
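
For context, the mechanism in question, roughly (summary sketch; see
scsi_lib.c for the details):

/*
 * Each Scsi_Host keeps a starved_list of scsi_devices whose queues
 * could not be run earlier, e.g. because the host queue was full.
 * scsi_run_queue() walks this per-host list and runs each device's
 * queue, so an sdev remains reachable via starved_list even after
 * its removal has started - nothing on this path takes a reference.
 */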

Thanks.

-- 
tejun


* Re: [PATCH 2/4] block: Avoid that request_fn is invoked on a dead queue
  2012-10-16 23:38   ` Tejun Heo
@ 2012-10-23 12:11     ` Bart Van Assche
  2012-10-24 19:13       ` Tejun Heo
  0 siblings, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2012-10-23 12:11 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Chanho Min

On 10/17/12 01:38, Tejun Heo wrote:
>>   /**
>> + * __blk_run_queue_uncond - run a queue whether or not it has been stopped
>> + * @q:	The queue to run
>> + *
>> + * Description:
>> + *    Invoke request handling on a queue if there are any pending requests.
>> + *    May be used to restart request handling after a request has completed.
>> + *    This variant runs the queue whether or not the queue has been
>> + *    stopped. Must be called with the queue lock held and interrupts
>> + *    disabled. See also @blk_run_queue.
>> + */
>> +void __blk_run_queue_uncond(struct request_queue *q)
>> +{
>> +	if (unlikely(blk_queue_dead(q)))
>> +		return;
>> +
>> +	q->request_fn(q);
>> +}
>> +
>> +/**
>>    * __blk_run_queue - run a single device queue
>>    * @q:	The queue to run
>>    *
>> @@ -305,7 +324,7 @@ void __blk_run_queue(struct request_queue *q)
>>   	if (unlikely(blk_queue_stopped(q)))
>>   		return;
>>
>> -	q->request_fn(q);
>> +	__blk_run_queue_uncond(q);
>
> __blk_run_queue_uncond() is a cold path and I don't think adding a
> test there matters, but I think it would be better if we avoided an
> extra branch in __blk_run_queue() if possible.  Can't we merge the
> blk_queue_stopped()/blk_queue_dead() tests?

How about declaring the function __blk_run_queue_uncond() inline? That
should allow the compiler to combine the two tests into a single test.
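
I.e. move the definition into a header, roughly (untested sketch):

/* e.g. in block/blk.h */
static inline void __blk_run_queue_uncond(struct request_queue *q)
{
	if (unlikely(blk_queue_dead(q)))
		return;

	q->request_fn(q);
}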

Bart.



* Re: [PATCH 3/4] Make blk_cleanup_queue() wait until request_fn finished
  2012-10-16 23:51   ` Tejun Heo
@ 2012-10-23 12:16     ` Bart Van Assche
  2012-10-24 19:11       ` Tejun Heo
  0 siblings, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2012-10-23 12:16 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Chanho Min

On 10/17/12 01:51, Tejun Heo wrote:
>> -void blk_drain_queue(struct request_queue *q, bool drain_all)
>> +static void blk_drain_queue(struct request_queue *q, bool drain_all)
>>   {
>>   	int i;
>>
>>   	while (true) {
>> -		bool drain = false;
>> +		int drain = 0;
>
> I don't think this is necessary.  bool conversion works fine
> regardless of how high the bits are.  Isn't avoiding the
> signed/unsigned autocast maze one of the reasons why we're using
> bool to begin with?

My concern is about statements like "drain |= q->nr_rqs[i]". As far
as I know, nr_rqs can exceed 255?

Bart.


* Re: [PATCH 3/4] Make blk_cleanup_queue() wait until request_fn finished
  2012-10-23 12:16     ` Bart Van Assche
@ 2012-10-24 19:11       ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2012-10-24 19:11 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Chanho Min

Hello, Bart.

On Tue, Oct 23, 2012 at 02:16:12PM +0200, Bart Van Assche wrote:
> On 10/17/12 01:51, Tejun Heo wrote:
> >>-void blk_drain_queue(struct request_queue *q, bool drain_all)
> >>+static void blk_drain_queue(struct request_queue *q, bool drain_all)
> >>  {
> >>  	int i;
> >>
> >>  	while (true) {
> >>-		bool drain = false;
> >>+		int drain = 0;
> >
> >I don't think this is necessary.  bool conversion works fine
> >regardless of how high the bits are.  Isn't avoiding the
> >signed/unsigned autocast maze one of the reasons why we're using
> >bool to begin with?
> 
> My concern is about statements like "drain |= q->nr_rqs[i]". As far
> as I know, nr_rqs can exceed 255?

How would that be a problem?  Can you please elaborate (or better,
demonstrate) the problem case?  And if such a problem exists, please
make it a separate bug fix patch.

Thanks.

--
tejun


* Re: [PATCH 2/4] block: Avoid that request_fn is invoked on a dead queue
  2012-10-23 12:11     ` Bart Van Assche
@ 2012-10-24 19:13       ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2012-10-24 19:13 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, James Bottomley, Mike Christie, Jens Axboe, Chanho Min

Hello, Bart.

On Tue, Oct 23, 2012 at 02:11:29PM +0200, Bart Van Assche wrote:
> >__blk_run_queue_uncond() is a cold path and I don't think adding a
> >test there matters, but I think it would be better if we avoided an
> >extra branch in __blk_run_queue() if possible.  Can't we merge the
> >blk_queue_stopped()/blk_queue_dead() tests?
> 
> How about declaring the function __blk_run_queue_uncond() inline?
> That should allow the compiler to combine the two tests into a
> single test.

Let's leave it as-is for now.  Given the later patches, I no longer
think it would be better to merge the tests.

Thanks!

--
tejun

