All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/2] Rework blk_mq_mark_tag_wait()
@ 2018-01-10 19:39 Bart Van Assche
  2018-01-10 19:39 ` [PATCH v2 1/2] blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait() Bart Van Assche
  2018-01-10 19:39 ` [PATCH v2 2/2] blk-mq: Fix a race condition " Bart Van Assche
  0 siblings, 2 replies; 9+ messages in thread
From: Bart Van Assche @ 2018-01-10 19:39 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Christoph Hellwig, Bart Van Assche

Hello Jens,

This patch series reworks the blk_mq_mark_tag_wait() implementation and also
fixes a race condition in that function. Please consider these two patches for
kernel v4.16.

Thanks,

Bart.

Changes compared to v1:
- Split a single patch into two patches to make reviewing easier.
- The race fix does no longer use prepare_to_wait() / finish_wait().

Bart Van Assche (2):
  blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait()
  blk-mq: Fix a race condition in blk_mq_mark_tag_wait()

 block/blk-mq.c | 49 ++++++++++++++++++++++++-------------------------
 1 file changed, 24 insertions(+), 25 deletions(-)

-- 
2.15.1

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2 1/2] blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait()
  2018-01-10 19:39 [PATCH v2 0/2] Rework blk_mq_mark_tag_wait() Bart Van Assche
@ 2018-01-10 19:39 ` Bart Van Assche
  2018-01-10 20:30   ` Jens Axboe
  2018-01-11  7:00   ` Hannes Reinecke
  2018-01-10 19:39 ` [PATCH v2 2/2] blk-mq: Fix a race condition " Bart Van Assche
  1 sibling, 2 replies; 9+ messages in thread
From: Bart Van Assche @ 2018-01-10 19:39 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Bart Van Assche, Omar Sandoval,
	Hannes Reinecke, Johannes Thumshirn

This patch does not change any functionality but makes the
blk_mq_mark_tag_wait() code slightly easier to read.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
---
 block/blk-mq.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1d705e25852e..e770e8814f60 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1189,6 +1189,16 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
 	if (!shared_tags) {
 		if (!test_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state))
 			set_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state);
+		/*
+		 * It's possible that a tag was freed in the window between the
+		 * allocation failure and adding the hardware queue to the wait
+		 * queue.
+		 */
+		ret = blk_mq_get_driver_tag(rq, hctx, false);
+		/*
+		 * Don't clear RESTART here, someone else could have set it.
+		 * At most this will cost an extra queue run.
+		 */
 	} else {
 		wait = &this_hctx->dispatch_wait;
 		if (!list_empty_careful(&wait->entry))
@@ -1202,22 +1212,12 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
 
 		ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
 		add_wait_queue(&ws->wait, wait);
-	}
-
-	/*
-	 * It's possible that a tag was freed in the window between the
-	 * allocation failure and adding the hardware queue to the wait
-	 * queue.
-	 */
-	ret = blk_mq_get_driver_tag(rq, hctx, false);
-
-	if (!shared_tags) {
 		/*
-		 * Don't clear RESTART here, someone else could have set it.
-		 * At most this will cost an extra queue run.
+		 * It's possible that a tag was freed in the window between the
+		 * allocation failure and adding the hardware queue to the wait
+		 * queue.
 		 */
-		return ret;
-	} else {
+		ret = blk_mq_get_driver_tag(rq, hctx, false);
 		if (!ret) {
 			spin_unlock(&this_hctx->lock);
 			return false;
@@ -1231,8 +1231,8 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
 		list_del_init(&wait->entry);
 		spin_unlock_irq(&ws->wait.lock);
 		spin_unlock(&this_hctx->lock);
-		return true;
 	}
+	return ret;
 }
 
 bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 2/2] blk-mq: Fix a race condition in blk_mq_mark_tag_wait()
  2018-01-10 19:39 [PATCH v2 0/2] Rework blk_mq_mark_tag_wait() Bart Van Assche
  2018-01-10 19:39 ` [PATCH v2 1/2] blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait() Bart Van Assche
@ 2018-01-10 19:39 ` Bart Van Assche
  2018-01-11  7:39   ` Hannes Reinecke
  2018-01-11 18:21   ` Omar Sandoval
  1 sibling, 2 replies; 9+ messages in thread
From: Bart Van Assche @ 2018-01-10 19:39 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Bart Van Assche, Omar Sandoval,
	Hannes Reinecke, Johannes Thumshirn

Both add_wait_queue() and blk_mq_dispatch_wake() protect wait queue
manipulations with the wait queue lock. Hence also protect the
!list_empty(&wait->entry) test with the wait queue lock instead of
the hctx lock.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
---
 block/blk-mq.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index e770e8814f60..d5313ce60836 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1184,7 +1184,7 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
 	bool shared_tags = (this_hctx->flags & BLK_MQ_F_TAG_SHARED) != 0;
 	struct sbq_wait_state *ws;
 	wait_queue_entry_t *wait;
-	bool ret;
+	bool on_wait_list, ret;
 
 	if (!shared_tags) {
 		if (!test_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state))
@@ -1204,13 +1204,15 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
 		if (!list_empty_careful(&wait->entry))
 			return false;
 
-		spin_lock(&this_hctx->lock);
-		if (!list_empty(&wait->entry)) {
-			spin_unlock(&this_hctx->lock);
+		ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
+
+		spin_lock_irq(&ws->wait.lock);
+		on_wait_list = !list_empty(&wait->entry);
+		spin_unlock_irq(&ws->wait.lock);
+
+		if (on_wait_list)
 			return false;
-		}
 
-		ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
 		add_wait_queue(&ws->wait, wait);
 		/*
 		 * It's possible that a tag was freed in the window between the
@@ -1218,10 +1220,8 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
 		 * queue.
 		 */
 		ret = blk_mq_get_driver_tag(rq, hctx, false);
-		if (!ret) {
-			spin_unlock(&this_hctx->lock);
+		if (!ret)
 			return false;
-		}
 
 		/*
 		 * We got a tag, remove ourselves from the wait queue to ensure
@@ -1230,7 +1230,6 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
 		spin_lock_irq(&ws->wait.lock);
 		list_del_init(&wait->entry);
 		spin_unlock_irq(&ws->wait.lock);
-		spin_unlock(&this_hctx->lock);
 	}
 	return ret;
 }
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/2] blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait()
  2018-01-10 19:39 ` [PATCH v2 1/2] blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait() Bart Van Assche
@ 2018-01-10 20:30   ` Jens Axboe
  2018-01-10 21:03     ` Bart Van Assche
  2018-01-11  7:00   ` Hannes Reinecke
  1 sibling, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2018-01-10 20:30 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-block, Christoph Hellwig, Omar Sandoval, Hannes Reinecke,
	Johannes Thumshirn

On 1/10/18 12:39 PM, Bart Van Assche wrote:
> This patch does not change any functionality but makes the
> blk_mq_mark_tag_wait() code slightly easier to read.

I agree it could do with a cleanup, but how about something like the
below? I think that's easier to read.


diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8000ba6db07d..afccd0848d6f 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1104,58 +1104,59 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
 				 struct request *rq)
 {
 	struct blk_mq_hw_ctx *this_hctx = *hctx;
-	bool shared_tags = (this_hctx->flags & BLK_MQ_F_TAG_SHARED) != 0;
 	struct sbq_wait_state *ws;
 	wait_queue_entry_t *wait;
 	bool ret;
 
-	if (!shared_tags) {
+	if (!(this_hctx->flags & BLK_MQ_F_TAG_SHARED)) {
 		if (!test_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state))
 			set_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state);
-	} else {
-		wait = &this_hctx->dispatch_wait;
-		if (!list_empty_careful(&wait->entry))
-			return false;
 
-		spin_lock(&this_hctx->lock);
-		if (!list_empty(&wait->entry)) {
-			spin_unlock(&this_hctx->lock);
-			return false;
-		}
+		/*
+		 * It's possible that a tag was freed in the window between the
+		 * allocation failure and adding the hardware queue to the wait
+		 * queue.
+		 *
+		 * Don't clear RESTART here, someone else could have set it.
+		 * At most this will cost an extra queue run.
+		 */
+		return blk_mq_get_driver_tag(rq, hctx, false);
+	}
+
+	wait = &this_hctx->dispatch_wait;
+	if (!list_empty_careful(&wait->entry))
+		return false;
 
-		ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
-		add_wait_queue(&ws->wait, wait);
+	spin_lock(&this_hctx->lock);
+	if (!list_empty(&wait->entry)) {
+		spin_unlock(&this_hctx->lock);
+		return false;
 	}
 
+	ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
+	add_wait_queue(&ws->wait, wait);
+
 	/*
 	 * It's possible that a tag was freed in the window between the
 	 * allocation failure and adding the hardware queue to the wait
 	 * queue.
 	 */
 	ret = blk_mq_get_driver_tag(rq, hctx, false);
-
-	if (!shared_tags) {
-		/*
-		 * Don't clear RESTART here, someone else could have set it.
-		 * At most this will cost an extra queue run.
-		 */
-		return ret;
-	} else {
-		if (!ret) {
-			spin_unlock(&this_hctx->lock);
-			return false;
-		}
-
-		/*
-		 * We got a tag, remove ourselves from the wait queue to ensure
-		 * someone else gets the wakeup.
-		 */
-		spin_lock_irq(&ws->wait.lock);
-		list_del_init(&wait->entry);
-		spin_unlock_irq(&ws->wait.lock);
+	if (!ret) {
 		spin_unlock(&this_hctx->lock);
-		return true;
+		return false;
 	}
+
+	/*
+	 * We got a tag, remove ourselves from the wait queue to ensure
+	 * someone else gets the wakeup.
+	 */
+	spin_lock_irq(&ws->wait.lock);
+	list_del_init(&wait->entry);
+	spin_unlock_irq(&ws->wait.lock);
+	spin_unlock(&this_hctx->lock);
+
+	return true;
 }
 
 bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,

-- 
Jens Axboe

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/2] blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait()
  2018-01-10 20:30   ` Jens Axboe
@ 2018-01-10 21:03     ` Bart Van Assche
  0 siblings, 0 replies; 9+ messages in thread
From: Bart Van Assche @ 2018-01-10 21:03 UTC (permalink / raw)
  To: axboe; +Cc: hch, jthumshirn, linux-block, osandov, hare

T24gV2VkLCAyMDE4LTAxLTEwIGF0IDEzOjMwIC0wNzAwLCBKZW5zIEF4Ym9lIHdyb3RlOg0KPiBP
biAxLzEwLzE4IDEyOjM5IFBNLCBCYXJ0IFZhbiBBc3NjaGUgd3JvdGU6DQo+ID4gVGhpcyBwYXRj
aCBkb2VzIG5vdCBjaGFuZ2UgYW55IGZ1bmN0aW9uYWxpdHkgYnV0IG1ha2VzIHRoZQ0KPiA+IGJs
a19tcV9tYXJrX3RhZ193YWl0KCkgY29kZSBzbGlnaHRseSBlYXNpZXIgdG8gcmVhZC4NCj4gDQo+
IEkgYWdyZWUgaXQgY291bGQgZG8gd2l0aCBhIGNsZWFudXAsIGJ1dCBob3cgYWJvdXQgc29tZXRo
aW5nIGxpa2UgdGhlDQo+IGJlbG93PyBJIHRoaW5rIHRoYXQncyBlYXNpZXIgdG8gcmVhZC4NCj4g
DQo+IFsgLi4uIF0NCg0KSGVsbG8gSmVucywNCg0KVGhhdCBzb3VuZHMgbGlrZSBhIGdvb2QgaWRl
YSB0byBtZS4gSSB3aWxsIHVwZGF0ZSB0aGUgcGF0Y2ggc2VyaWVzLCByZXRlc3QNCml0IGFuZCBy
ZXBvc3QuDQoNCkJhcnQu

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/2] blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait()
  2018-01-10 19:39 ` [PATCH v2 1/2] blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait() Bart Van Assche
  2018-01-10 20:30   ` Jens Axboe
@ 2018-01-11  7:00   ` Hannes Reinecke
  1 sibling, 0 replies; 9+ messages in thread
From: Hannes Reinecke @ 2018-01-11  7:00 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Omar Sandoval, Johannes Thumshirn

On 01/10/2018 08:39 PM, Bart Van Assche wrote:
> This patch does not change any functionality but makes the
> blk_mq_mark_tag_wait() code slightly easier to read.
> 
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Omar Sandoval <osandov@fb.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  block/blk-mq.c | 30 +++++++++++++++---------------
>  1 file changed, 15 insertions(+), 15 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/2] blk-mq: Fix a race condition in blk_mq_mark_tag_wait()
  2018-01-10 19:39 ` [PATCH v2 2/2] blk-mq: Fix a race condition " Bart Van Assche
@ 2018-01-11  7:39   ` Hannes Reinecke
  2018-01-11 17:27     ` Bart Van Assche
  2018-01-11 18:21   ` Omar Sandoval
  1 sibling, 1 reply; 9+ messages in thread
From: Hannes Reinecke @ 2018-01-11  7:39 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Omar Sandoval, Johannes Thumshirn

On 01/10/2018 08:39 PM, Bart Van Assche wrote:
> Both add_wait_queue() and blk_mq_dispatch_wake() protect wait queue
> manipulations with the wait queue lock. Hence also protect the
> !list_empty(&wait->entry) test with the wait queue lock instead of
> the hctx lock.
> 
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Omar Sandoval <osandov@fb.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  block/blk-mq.c | 19 +++++++++----------
>  1 file changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index e770e8814f60..d5313ce60836 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1184,7 +1184,7 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
>  	bool shared_tags = (this_hctx->flags & BLK_MQ_F_TAG_SHARED) != 0;
>  	struct sbq_wait_state *ws;
>  	wait_queue_entry_t *wait;
> -	bool ret;
> +	bool on_wait_list, ret;
>  
>  	if (!shared_tags) {
>  		if (!test_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state))
> @@ -1204,13 +1204,15 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
>  		if (!list_empty_careful(&wait->entry))
>  			return false;
>  
> -		spin_lock(&this_hctx->lock);
> -		if (!list_empty(&wait->entry)) {
> -			spin_unlock(&this_hctx->lock);
> +		ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
> +
> +		spin_lock_irq(&ws->wait.lock);
> +		on_wait_list = !list_empty(&wait->entry);
> +		spin_unlock_irq(&ws->wait.lock);
> +
> +		if (on_wait_list)
>  			return false;
> -		}
>  
> -		ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
>  		add_wait_queue(&ws->wait, wait);
>  		/*
>  		 * It's possible that a tag was freed in the window between the
I'm actually not that convinced with this change; originally we had been
checking if it's on the wait list, and only _then_ call bt_wait_ptr().
Now we call bt_wait_ptr() always, meaning we run a chance of increasing
the bitmap_tags wait pointer without actually using it.
Looking at the code I'm not sure this is the correct way of using it ...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/2] blk-mq: Fix a race condition in blk_mq_mark_tag_wait()
  2018-01-11  7:39   ` Hannes Reinecke
@ 2018-01-11 17:27     ` Bart Van Assche
  0 siblings, 0 replies; 9+ messages in thread
From: Bart Van Assche @ 2018-01-11 17:27 UTC (permalink / raw)
  To: hare, axboe; +Cc: hch, linux-block, osandov, jthumshirn

T24gVGh1LCAyMDE4LTAxLTExIGF0IDA4OjM5ICswMTAwLCBIYW5uZXMgUmVpbmVja2Ugd3JvdGU6
DQo+IEknbSBhY3R1YWxseSBub3QgdGhhdCBjb252aW5jZWQgd2l0aCB0aGlzIGNoYW5nZTsgb3Jp
Z2luYWxseSB3ZSBoYWQgYmVlbg0KPiBjaGVja2luZyBpZiBpdCdzIG9uIHRoZSB3YWl0IGxpc3Qs
IGFuZCBvbmx5IF90aGVuXyBjYWxsIGJ0X3dhaXRfcHRyKCkuDQo+IE5vdyB3ZSBjYWxsIGJ0X3dh
aXRfcHRyKCkgYWx3YXlzLCBtZWFuaW5nIHdlIHJ1biBhIGNoYW5jZSBvZiBpbmNyZWFzaW5nDQo+
IHRoZSBiaXRtYXBfdGFncyB3YWl0IHBvaW50ZXIgd2l0aG91dCBhY3R1YWxseSB1c2luZyBpdC4N
Cj4gTG9va2luZyBhdCB0aGUgY29kZSBJJ20gbm90IHN1cmUgdGhpcyBpcyB0aGUgY29ycmVjdCB3
YXkgb2YgdXNpbmcgaXQgLi4uDQoNCkhlbGxvIEhhbm5lcywNCg0KQXJlIHlvdSBwZXJoYXBzIHJl
ZmVycmluZyB0byBzYnFfaW5kZXhfYXRvbWljX2luYygpPyBJIHRoaW5rIGl0J3MgZmluZSB0bw0K
Y2FsbCBidF93YWl0X3B0cigpIGV2ZW4gaWYgdGhlIGNvcnJlc3BvbmRpbmcgd2FpdHF1ZXVlIHdv
bid0IGJlIHVzZWQuIE15DQp1bmRlcnN0YW5kaW5nIGlzIHRoYXQgdGhlIHB1cnBvc2Ugb2YgaGF2
aW5nIG11bHRpcGxlIHdhaXRxdWV1ZXMgYW5kIG9mDQp1c2luZyB0aGVzZSBpbiBhIHJvdW5kLXJv
YmluIGZhc2hpb24gaXMgdG8gYXZvaWQgdGhhdCBpZiBsZXNzIHRoYW4gb3IgZXF1YWwNCnRvIDgg
KFNCUV9XQUlUX1FVRVVFUykgdGhyZWFkcyBhcmUgd2FpdGluZyBmb3IgYSB0YWcgdGhhdCB0aGVz
ZSB3b24ndCB1c2UNCnRoZSBzYW1lIHdhaXQgcXVldWUuIFNvIGluIHRoZSB1bmxpa2VseSBldmVu
dCBvZiBhbiBlYXJseSByZXR1cm4gdGhlIHdvcnN0DQp0aGF0IGNhbiBoYXBwZW4gaXMgdGhhdCB0
aGVyZSBpcyBtb3JlIGNvbnRlbnRpb24gb24gb25lIG9mIHRoZSBlaWdodCB3YWl0DQpxdWV1ZXMu
IEJ1dCBzaW5jZSB0aGF0IGlzIGFuIHVubGlrZWx5IHNjZW5hcmlvIEkgZG9uJ3QgdGhpbmsgd2Ug
aGF2ZSB0bw0Kd29ycnkgYWJvdXQgdGhpcy4NCg0KQmFydC4=

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/2] blk-mq: Fix a race condition in blk_mq_mark_tag_wait()
  2018-01-10 19:39 ` [PATCH v2 2/2] blk-mq: Fix a race condition " Bart Van Assche
  2018-01-11  7:39   ` Hannes Reinecke
@ 2018-01-11 18:21   ` Omar Sandoval
  1 sibling, 0 replies; 9+ messages in thread
From: Omar Sandoval @ 2018-01-11 18:21 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Omar Sandoval,
	Hannes Reinecke, Johannes Thumshirn

On Wed, Jan 10, 2018 at 11:39:19AM -0800, Bart Van Assche wrote:
> Both add_wait_queue() and blk_mq_dispatch_wake() protect wait queue
> manipulations with the wait queue lock. Hence also protect the
> !list_empty(&wait->entry) test with the wait queue lock instead of
> the hctx lock.
> 
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Omar Sandoval <osandov@fb.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  block/blk-mq.c | 19 +++++++++----------
>  1 file changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index e770e8814f60..d5313ce60836 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1184,7 +1184,7 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
>  	bool shared_tags = (this_hctx->flags & BLK_MQ_F_TAG_SHARED) != 0;
>  	struct sbq_wait_state *ws;
>  	wait_queue_entry_t *wait;
> -	bool ret;
> +	bool on_wait_list, ret;
>  
>  	if (!shared_tags) {
>  		if (!test_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state))
> @@ -1204,13 +1204,15 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
>  		if (!list_empty_careful(&wait->entry))
>  			return false;
>  
> -		spin_lock(&this_hctx->lock);
> -		if (!list_empty(&wait->entry)) {
> -			spin_unlock(&this_hctx->lock);
> +		ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
> +
> +		spin_lock_irq(&ws->wait.lock);
> +		on_wait_list = !list_empty(&wait->entry);
> +		spin_unlock_irq(&ws->wait.lock);

This isn't quite right. There's no guarantee that the struct
sbq_wait_state returned by bt_wait_ptr() is the same one that the wait
entry is on, so the lock on the returned ws->wait isn't necessarily
protecting the wait entry. I think we should just be using
list_empty_careful() in this case.

> +
> +		if (on_wait_list)
>  			return false;
> -		}
>  
> -		ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
>  		add_wait_queue(&ws->wait, wait);
>  		/*
>  		 * It's possible that a tag was freed in the window between the
> @@ -1218,10 +1220,8 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
>  		 * queue.
>  		 */
>  		ret = blk_mq_get_driver_tag(rq, hctx, false);
> -		if (!ret) {
> -			spin_unlock(&this_hctx->lock);
> +		if (!ret)
>  			return false;
> -		}
>  
>  		/*
>  		 * We got a tag, remove ourselves from the wait queue to ensure
> @@ -1230,7 +1230,6 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
>  		spin_lock_irq(&ws->wait.lock);
>  		list_del_init(&wait->entry);
>  		spin_unlock_irq(&ws->wait.lock);
> -		spin_unlock(&this_hctx->lock);
>  	}
>  	return ret;
>  }
> -- 
> 2.15.1
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-01-11 18:21 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-10 19:39 [PATCH v2 0/2] Rework blk_mq_mark_tag_wait() Bart Van Assche
2018-01-10 19:39 ` [PATCH v2 1/2] blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait() Bart Van Assche
2018-01-10 20:30   ` Jens Axboe
2018-01-10 21:03     ` Bart Van Assche
2018-01-11  7:00   ` Hannes Reinecke
2018-01-10 19:39 ` [PATCH v2 2/2] blk-mq: Fix a race condition " Bart Van Assche
2018-01-11  7:39   ` Hannes Reinecke
2018-01-11 17:27     ` Bart Van Assche
2018-01-11 18:21   ` Omar Sandoval

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.