linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH][SCSI] remove the queue unlock in scsi_requset_fn
@ 2012-08-10  3:22 Chanho Min
  2012-08-13 17:47 ` Bart Van Assche
  2012-08-14  9:48 ` [PATCH RESEND] " Chanho Min
  0 siblings, 2 replies; 9+ messages in thread
From: Chanho Min @ 2012-08-10  3:22 UTC (permalink / raw)
  To: James Bottomley
  Cc: Mike Christie, linux-scsi, linux-kernel, Jens Axboe, Tejun Heo,
	Bart Van Assche

We don't need to unlock the queue before put_device in scsi_request_fn()
If we trigger the ->remove() function, It occur a oops from the caller.
So sdev reference count should not be dropped to zero here.
Also It was added before scsi_device_dev_release() was moved
to user context, so it is outdated.

Signed-off-by: Chanho Min <chanho.min@lge.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_lib.c |    4 ----
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ffd7773..cb2185a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1626,11 +1626,7 @@ out_delay:
 	if (sdev->device_busy == 0)
 		blk_delay_queue(q, SCSI_QUEUE_DELAY);
 out:
-	/* must be careful here...if we trigger the ->remove() function
-	 * we cannot be holding the q lock */
-	spin_unlock_irq(q->queue_lock);
 	put_device(&sdev->sdev_gendev);
-	spin_lock_irq(q->queue_lock);
 }

 u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH][SCSI] remove the queue unlock in scsi_requset_fn
  2012-08-10  3:22 [PATCH][SCSI] remove the queue unlock in scsi_requset_fn Chanho Min
@ 2012-08-13 17:47 ` Bart Van Assche
  2012-08-14  9:48 ` [PATCH RESEND] " Chanho Min
  1 sibling, 0 replies; 9+ messages in thread
From: Bart Van Assche @ 2012-08-13 17:47 UTC (permalink / raw)
  To: Chanho Min
  Cc: James Bottomley, Mike Christie, linux-scsi, linux-kernel,
	Jens Axboe, Tejun Heo

On 08/10/12 03:22, Chanho Min wrote:
> We don't need to unlock the queue before put_device in scsi_request_fn()

It looks like there is a typo in the patch subject ? Also, you can omit
"[SCSI]" from the patch subject - AFAIK James has a script that inserts
that text automatically.

Bart.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH RESEND] remove the queue unlock in scsi_requset_fn
  2012-08-10  3:22 [PATCH][SCSI] remove the queue unlock in scsi_requset_fn Chanho Min
  2012-08-13 17:47 ` Bart Van Assche
@ 2012-08-14  9:48 ` Chanho Min
  2012-08-14 12:07   ` James Bottomley
  1 sibling, 1 reply; 9+ messages in thread
From: Chanho Min @ 2012-08-14  9:48 UTC (permalink / raw)
  To: James Bottomley
  Cc: Mike Christie, linux-scsi, linux-kernel, Jens Axboe, Tejun Heo,
	Bart Van Assche

We don't need to unlock the queue before put_device in scsi_request_fn()
If we trigger the ->remove() function, It occur a oops from the caller.
So sdev reference count should not be dropped to zero here.
Also It was added before scsi_device_dev_release() was moved
to user context, so it is outdated.

Signed-off-by: Chanho Min <chanho.min@lge.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_lib.c |    4 ----
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ffd7773..cb2185a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1626,11 +1626,7 @@ out_delay:
 	if (sdev->device_busy == 0)
 		blk_delay_queue(q, SCSI_QUEUE_DELAY);
 out:
-	/* must be careful here...if we trigger the ->remove() function
-	 * we cannot be holding the q lock */
-	spin_unlock_irq(q->queue_lock);
 	put_device(&sdev->sdev_gendev);
-	spin_lock_irq(q->queue_lock);
 }

 u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH RESEND] remove the queue unlock in scsi_requset_fn
  2012-08-14  9:48 ` [PATCH RESEND] " Chanho Min
@ 2012-08-14 12:07   ` James Bottomley
  2012-08-16  1:35     ` Chanho Min
  0 siblings, 1 reply; 9+ messages in thread
From: James Bottomley @ 2012-08-14 12:07 UTC (permalink / raw)
  To: Chanho Min
  Cc: Mike Christie, linux-scsi, linux-kernel, Jens Axboe, Tejun Heo,
	Bart Van Assche

On Tue, 2012-08-14 at 18:48 +0900, Chanho Min wrote:
> We don't need to unlock the queue before put_device in scsi_request_fn()
> If we trigger the ->remove() function, It occur a oops from the caller.
> So sdev reference count should not be dropped to zero here.
> Also It was added before scsi_device_dev_release() was moved
> to user context, so it is outdated.

None of this sounds to be correct.  The user context comment is
irrelevant because if we happen to be in user context, all the release
functions will occur in line.  I also don't see why the sdev reference
couldn't drop to zero here.

The reason I think we could remove the lock drop is because the queue
reference cannot drop to zero here because the block layer is holding a
reference to run the queue.  It's only the queue ->release function that
would take the queue lock and therefore we're safe to hold it.

James


> Signed-off-by: Chanho Min <chanho.min@lge.com>
> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/scsi/scsi_lib.c |    4 ----
>  1 files changed, 0 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index ffd7773..cb2185a 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1626,11 +1626,7 @@ out_delay:
>  	if (sdev->device_busy == 0)
>  		blk_delay_queue(q, SCSI_QUEUE_DELAY);
>  out:
> -	/* must be careful here...if we trigger the ->remove() function
> -	 * we cannot be holding the q lock */
> -	spin_unlock_irq(q->queue_lock);
>  	put_device(&sdev->sdev_gendev);
> -	spin_lock_irq(q->queue_lock);
>  }
> 
>  u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH RESEND] remove the queue unlock in scsi_requset_fn
  2012-08-14 12:07   ` James Bottomley
@ 2012-08-16  1:35     ` Chanho Min
  2012-08-16  7:52       ` Bart Van Assche
  2012-08-16  7:56       ` James Bottomley
  0 siblings, 2 replies; 9+ messages in thread
From: Chanho Min @ 2012-08-16  1:35 UTC (permalink / raw)
  To: James Bottomley
  Cc: Mike Christie, linux-scsi, linux-kernel, Jens Axboe, Tejun Heo,
	Bart Van Assche

> functions will occur in line.  I also don't see why the sdev reference
> couldn't drop to zero here.
scsi_request_fn is called under the lock of request_queue->queue_lock.
If we drop the sdev reference to zero here,
scsi_device_dev_release_usercontext is
invoked and make request_queue to NULL. When caller of scsi_request_fn try to
unlock request_queue->queue_lock, the oops is occurred.

Thanks, Chanho

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH RESEND] remove the queue unlock in scsi_requset_fn
  2012-08-16  1:35     ` Chanho Min
@ 2012-08-16  7:52       ` Bart Van Assche
  2012-08-16  8:10         ` James Bottomley
  2012-08-18 11:56         ` Bart Van Assche
  2012-08-16  7:56       ` James Bottomley
  1 sibling, 2 replies; 9+ messages in thread
From: Bart Van Assche @ 2012-08-16  7:52 UTC (permalink / raw)
  To: Chanho Min
  Cc: James Bottomley, Mike Christie, linux-scsi, linux-kernel,
	Jens Axboe, Tejun Heo

On 08/16/12 01:35, Chanho Min wrote:
>> functions will occur in line.  I also don't see why the sdev reference
>> couldn't drop to zero here.
> scsi_request_fn is called under the lock of request_queue->queue_lock.
> If we drop the sdev reference to zero here,
> scsi_device_dev_release_usercontext is
> invoked and make request_queue to NULL. When caller of scsi_request_fn try to
> unlock request_queue->queue_lock, the oops is occurred.

Whether or not your patch is applied, if the put_device() call in
scsi_request_fn() decreases the sdev reference count to zero, the
scsi_request_fn() caller will still try to unlock the queue lock after
scsi_request_fn() finished and hence will trigger a use-after-free. I'm
afraid the only real solution is to modify the SCSI and/or block layers
such that scsi_remove_device() can't finish while scsi_request_fn() is
in progress. And once that is guaranteed the get_device() / put_device()
pair can be left out from scsi_request_fn().

Bart.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH RESEND] remove the queue unlock in scsi_requset_fn
  2012-08-16  1:35     ` Chanho Min
  2012-08-16  7:52       ` Bart Van Assche
@ 2012-08-16  7:56       ` James Bottomley
  1 sibling, 0 replies; 9+ messages in thread
From: James Bottomley @ 2012-08-16  7:56 UTC (permalink / raw)
  To: Chanho Min
  Cc: Mike Christie, linux-scsi, linux-kernel, Jens Axboe, Tejun Heo,
	Bart Van Assche, Jens Axboe

On Thu, 2012-08-16 at 10:35 +0900, Chanho Min wrote:
> > functions will occur in line.  I also don't see why the sdev reference
> > couldn't drop to zero here.
> scsi_request_fn is called under the lock of request_queue->queue_lock.
> If we drop the sdev reference to zero here,
> scsi_device_dev_release_usercontext is
> invoked and make request_queue to NULL. When caller of scsi_request_fn try to
> unlock request_queue->queue_lock, the oops is occurred.

I don't understand this explanation.

sdev->request_queue goes to NULL if the sdev refcount goes to zero (and
blk.  We have a copy though in the q variable, which is what we unlock.
That q variable only goes invalid if the queue ref count goes to zero.
If that happens, the queue release function will try to take the lock to
free the elevator and your patch will cause a deadlock.

There are only two possibilities here:

     1. The queue refcount can never reach zero within a request
        function because block ensures that it can unlock the queue lock
        on exit.  We could then remove this lock drop and acquire on the
        grounds that it is superfluous.
     2. The queue refcount does indeed go to zero and the queue gets
        released.  This would mean that all our lock; request_fn; unlock
        patterns do a use after free (in the block layer).  Your
        proposed patch doesn't fix this (and indeed would cause a
        deadlock on the release path).

I've cc'd Jens, because I don't entirely see why our

lock; request_fn; unlock

is safe against a racing blk_cleanup_queue().

James



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH RESEND] remove the queue unlock in scsi_requset_fn
  2012-08-16  7:52       ` Bart Van Assche
@ 2012-08-16  8:10         ` James Bottomley
  2012-08-18 11:56         ` Bart Van Assche
  1 sibling, 0 replies; 9+ messages in thread
From: James Bottomley @ 2012-08-16  8:10 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Chanho Min, Mike Christie, linux-scsi, linux-kernel, Jens Axboe,
	Tejun Heo

On Thu, 2012-08-16 at 07:52 +0000, Bart Van Assche wrote:
> On 08/16/12 01:35, Chanho Min wrote:
> >> functions will occur in line.  I also don't see why the sdev reference
> >> couldn't drop to zero here.
> > scsi_request_fn is called under the lock of request_queue->queue_lock.
> > If we drop the sdev reference to zero here,
> > scsi_device_dev_release_usercontext is
> > invoked and make request_queue to NULL. When caller of scsi_request_fn try to
> > unlock request_queue->queue_lock, the oops is occurred.
> 
> Whether or not your patch is applied, if the put_device() call in
> scsi_request_fn() decreases the sdev reference count to zero, the
> scsi_request_fn() caller will still try to unlock the queue lock after
> scsi_request_fn() finished and hence will trigger a use-after-free. I'm
> afraid the only real solution is to modify the SCSI and/or block layers
> such that scsi_remove_device() can't finish while scsi_request_fn() is
> in progress. And once that is guaranteed the get_device() / put_device()
> pair can be left out from scsi_request_fn().

Well, no.  The only way to destroy a queue is with blk_cleanup_queue()
which does the final put.  blk_cleanup_queue has a rather clunky drain
check that looks at both queued and in-flight requests.  Even if we have
a scsi_remove_device() racing with the scsi_request_fn() and it gets to
blk_cleanup_queue(), that call gets held at the drain wait until the
scsi_request_fn() has exited.  The same is true for all other request
functions, so we're safe.

James



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH RESEND] remove the queue unlock in scsi_requset_fn
  2012-08-16  7:52       ` Bart Van Assche
  2012-08-16  8:10         ` James Bottomley
@ 2012-08-18 11:56         ` Bart Van Assche
  1 sibling, 0 replies; 9+ messages in thread
From: Bart Van Assche @ 2012-08-18 11:56 UTC (permalink / raw)
  To: Chanho Min
  Cc: James Bottomley, Mike Christie, linux-scsi, linux-kernel,
	Jens Axboe, Tejun Heo

On 08/16/12 07:52, Bart Van Assche wrote:
> On 08/16/12 01:35, Chanho Min wrote:
>>> functions will occur in line.  I also don't see why the sdev reference
>>> couldn't drop to zero here.
>> scsi_request_fn is called under the lock of request_queue->queue_lock.
>> If we drop the sdev reference to zero here,
>> scsi_device_dev_release_usercontext is
>> invoked and make request_queue to NULL. When caller of scsi_request_fn try to
>> unlock request_queue->queue_lock, the oops is occurred.
> 
> Whether or not your patch is applied, if the put_device() call in
> scsi_request_fn() decreases the sdev reference count to zero, the
> scsi_request_fn() caller will still try to unlock the queue lock after
> scsi_request_fn() finished and hence will trigger a use-after-free. I'm
> afraid the only real solution is to modify the SCSI and/or block layers
> such that scsi_remove_device() can't finish while scsi_request_fn() is
> in progress. And once that is guaranteed the get_device() / put_device()
> pair can be left out from scsi_request_fn().

(replying to my own e-mail)

How about the patch below ?

[PATCH] Fix device removal race

If the put_device() call in scsi_request_fn() drops the sdev refcount
to zero then the spin_lock_irq() call after the put_device() call
triggers a use-after-free. Avoid that by making sure that blk_cleanup_queue()
can only finish after all active scsi_request_fn() calls have returned.
---
 block/blk-core.c        |    1 +
 drivers/scsi/scsi_lib.c |   10 ++--------
 include/linux/blkdev.h  |    5 +++++
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 4b4dbdf..0e4da3b 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -388,6 +388,7 @@ void blk_drain_queue(struct request_queue *q, bool drain_all)
 			__blk_run_queue(q);
 
 		drain |= q->nr_rqs_elvpriv;
+		drain |= q->request_fn_active;
 
 		/*
 		 * Unfortunately, requests are queued at and tracked from
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ffd7773..10bb348 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1514,9 +1514,7 @@ static void scsi_request_fn(struct request_queue *q)
 	struct scsi_cmnd *cmd;
 	struct request *req;
 
-	if(!get_device(&sdev->sdev_gendev))
-		/* We must be tearing the block queue down already */
-		return;
+	q->request_fn_active++;
 
 	/*
 	 * To start with, we keep looping until the queue is empty, or until
@@ -1626,11 +1624,7 @@ out_delay:
 	if (sdev->device_busy == 0)
 		blk_delay_queue(q, SCSI_QUEUE_DELAY);
 out:
-	/* must be careful here...if we trigger the ->remove() function
-	 * we cannot be holding the q lock */
-	spin_unlock_irq(q->queue_lock);
-	put_device(&sdev->sdev_gendev);
-	spin_lock_irq(q->queue_lock);
+	q->request_fn_active--;
 }
 
 u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 4e72a9d..11c1987 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -377,6 +377,11 @@ struct request_queue {
 
 	unsigned int		nr_sorted;
 	unsigned int		in_flight[2];
+	/*
+	 * Number of active request_fn() calls for those request_fn()
+	 * implementations that unlock the queue_lock, e.g. scsi_request_fn().
+	 */
+	unsigned int		request_fn_active;
 
 	unsigned int		rq_timeout;
 	struct timer_list	timeout;
-- 
1.7.7



^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2012-08-18 11:56 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-10  3:22 [PATCH][SCSI] remove the queue unlock in scsi_requset_fn Chanho Min
2012-08-13 17:47 ` Bart Van Assche
2012-08-14  9:48 ` [PATCH RESEND] " Chanho Min
2012-08-14 12:07   ` James Bottomley
2012-08-16  1:35     ` Chanho Min
2012-08-16  7:52       ` Bart Van Assche
2012-08-16  8:10         ` James Bottomley
2012-08-18 11:56         ` Bart Van Assche
2012-08-16  7:56       ` James Bottomley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).