* [PATCH v2 0/2] Rework blk_mq_mark_tag_wait()
@ 2018-01-10 19:39 Bart Van Assche
2018-01-10 19:39 ` [PATCH v2 1/2] blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait() Bart Van Assche
2018-01-10 19:39 ` [PATCH v2 2/2] blk-mq: Fix a race condition " Bart Van Assche
0 siblings, 2 replies; 9+ messages in thread
From: Bart Van Assche @ 2018-01-10 19:39 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, Christoph Hellwig, Bart Van Assche
Hello Jens,
This patch series reworks the blk_mq_mark_tag_wait() implementation and also
fixes a race condition in that function. Please consider these two patches for
kernel v4.16.
Thanks,
Bart.
Changes compared to v1:
- Split a single patch into two patches to make reviewing easier.
- The race fix does no longer use prepare_to_wait() / finish_wait().
Bart Van Assche (2):
blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait()
blk-mq: Fix a race condition in blk_mq_mark_tag_wait()
block/blk-mq.c | 49 ++++++++++++++++++++++++-------------------------
1 file changed, 24 insertions(+), 25 deletions(-)
--
2.15.1
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v2 1/2] blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait()
2018-01-10 19:39 [PATCH v2 0/2] Rework blk_mq_mark_tag_wait() Bart Van Assche
@ 2018-01-10 19:39 ` Bart Van Assche
2018-01-10 20:30 ` Jens Axboe
2018-01-11 7:00 ` Hannes Reinecke
2018-01-10 19:39 ` [PATCH v2 2/2] blk-mq: Fix a race condition " Bart Van Assche
1 sibling, 2 replies; 9+ messages in thread
From: Bart Van Assche @ 2018-01-10 19:39 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Christoph Hellwig, Bart Van Assche, Omar Sandoval,
Hannes Reinecke, Johannes Thumshirn
This patch does not change any functionality but makes the
blk_mq_mark_tag_wait() code slightly easier to read.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
---
block/blk-mq.c | 30 +++++++++++++++---------------
1 file changed, 15 insertions(+), 15 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1d705e25852e..e770e8814f60 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1189,6 +1189,16 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
if (!shared_tags) {
if (!test_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state))
set_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state);
+ /*
+ * It's possible that a tag was freed in the window between the
+ * allocation failure and adding the hardware queue to the wait
+ * queue.
+ */
+ ret = blk_mq_get_driver_tag(rq, hctx, false);
+ /*
+ * Don't clear RESTART here, someone else could have set it.
+ * At most this will cost an extra queue run.
+ */
} else {
wait = &this_hctx->dispatch_wait;
if (!list_empty_careful(&wait->entry))
@@ -1202,22 +1212,12 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
add_wait_queue(&ws->wait, wait);
- }
-
- /*
- * It's possible that a tag was freed in the window between the
- * allocation failure and adding the hardware queue to the wait
- * queue.
- */
- ret = blk_mq_get_driver_tag(rq, hctx, false);
-
- if (!shared_tags) {
/*
- * Don't clear RESTART here, someone else could have set it.
- * At most this will cost an extra queue run.
+ * It's possible that a tag was freed in the window between the
+ * allocation failure and adding the hardware queue to the wait
+ * queue.
*/
- return ret;
- } else {
+ ret = blk_mq_get_driver_tag(rq, hctx, false);
if (!ret) {
spin_unlock(&this_hctx->lock);
return false;
@@ -1231,8 +1231,8 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
list_del_init(&wait->entry);
spin_unlock_irq(&ws->wait.lock);
spin_unlock(&this_hctx->lock);
- return true;
}
+ return ret;
}
bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
--
2.15.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v2 2/2] blk-mq: Fix a race condition in blk_mq_mark_tag_wait()
2018-01-10 19:39 [PATCH v2 0/2] Rework blk_mq_mark_tag_wait() Bart Van Assche
2018-01-10 19:39 ` [PATCH v2 1/2] blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait() Bart Van Assche
@ 2018-01-10 19:39 ` Bart Van Assche
2018-01-11 7:39 ` Hannes Reinecke
2018-01-11 18:21 ` Omar Sandoval
1 sibling, 2 replies; 9+ messages in thread
From: Bart Van Assche @ 2018-01-10 19:39 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Christoph Hellwig, Bart Van Assche, Omar Sandoval,
Hannes Reinecke, Johannes Thumshirn
Both add_wait_queue() and blk_mq_dispatch_wake() protect wait queue
manipulations with the wait queue lock. Hence also protect the
!list_empty(&wait->entry) test with the wait queue lock instead of
the hctx lock.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
---
block/blk-mq.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index e770e8814f60..d5313ce60836 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1184,7 +1184,7 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
bool shared_tags = (this_hctx->flags & BLK_MQ_F_TAG_SHARED) != 0;
struct sbq_wait_state *ws;
wait_queue_entry_t *wait;
- bool ret;
+ bool on_wait_list, ret;
if (!shared_tags) {
if (!test_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state))
@@ -1204,13 +1204,15 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
if (!list_empty_careful(&wait->entry))
return false;
- spin_lock(&this_hctx->lock);
- if (!list_empty(&wait->entry)) {
- spin_unlock(&this_hctx->lock);
+ ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
+
+ spin_lock_irq(&ws->wait.lock);
+ on_wait_list = !list_empty(&wait->entry);
+ spin_unlock_irq(&ws->wait.lock);
+
+ if (on_wait_list)
return false;
- }
- ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
add_wait_queue(&ws->wait, wait);
/*
* It's possible that a tag was freed in the window between the
@@ -1218,10 +1220,8 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
* queue.
*/
ret = blk_mq_get_driver_tag(rq, hctx, false);
- if (!ret) {
- spin_unlock(&this_hctx->lock);
+ if (!ret)
return false;
- }
/*
* We got a tag, remove ourselves from the wait queue to ensure
@@ -1230,7 +1230,6 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
spin_lock_irq(&ws->wait.lock);
list_del_init(&wait->entry);
spin_unlock_irq(&ws->wait.lock);
- spin_unlock(&this_hctx->lock);
}
return ret;
}
--
2.15.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/2] blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait()
2018-01-10 19:39 ` [PATCH v2 1/2] blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait() Bart Van Assche
@ 2018-01-10 20:30 ` Jens Axboe
2018-01-10 21:03 ` Bart Van Assche
2018-01-11 7:00 ` Hannes Reinecke
1 sibling, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2018-01-10 20:30 UTC (permalink / raw)
To: Bart Van Assche
Cc: linux-block, Christoph Hellwig, Omar Sandoval, Hannes Reinecke,
Johannes Thumshirn
On 1/10/18 12:39 PM, Bart Van Assche wrote:
> This patch does not change any functionality but makes the
> blk_mq_mark_tag_wait() code slightly easier to read.
I agree it could do with a cleanup, but how about something like the
below? I think that's easier to read.
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8000ba6db07d..afccd0848d6f 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1104,58 +1104,59 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
struct request *rq)
{
struct blk_mq_hw_ctx *this_hctx = *hctx;
- bool shared_tags = (this_hctx->flags & BLK_MQ_F_TAG_SHARED) != 0;
struct sbq_wait_state *ws;
wait_queue_entry_t *wait;
bool ret;
- if (!shared_tags) {
+ if (!(this_hctx->flags & BLK_MQ_F_TAG_SHARED)) {
if (!test_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state))
set_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state);
- } else {
- wait = &this_hctx->dispatch_wait;
- if (!list_empty_careful(&wait->entry))
- return false;
- spin_lock(&this_hctx->lock);
- if (!list_empty(&wait->entry)) {
- spin_unlock(&this_hctx->lock);
- return false;
- }
+ /*
+ * It's possible that a tag was freed in the window between the
+ * allocation failure and adding the hardware queue to the wait
+ * queue.
+ *
+ * Don't clear RESTART here, someone else could have set it.
+ * At most this will cost an extra queue run.
+ */
+ return blk_mq_get_driver_tag(rq, hctx, false);
+ }
+
+ wait = &this_hctx->dispatch_wait;
+ if (!list_empty_careful(&wait->entry))
+ return false;
- ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
- add_wait_queue(&ws->wait, wait);
+ spin_lock(&this_hctx->lock);
+ if (!list_empty(&wait->entry)) {
+ spin_unlock(&this_hctx->lock);
+ return false;
}
+ ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
+ add_wait_queue(&ws->wait, wait);
+
/*
* It's possible that a tag was freed in the window between the
* allocation failure and adding the hardware queue to the wait
* queue.
*/
ret = blk_mq_get_driver_tag(rq, hctx, false);
-
- if (!shared_tags) {
- /*
- * Don't clear RESTART here, someone else could have set it.
- * At most this will cost an extra queue run.
- */
- return ret;
- } else {
- if (!ret) {
- spin_unlock(&this_hctx->lock);
- return false;
- }
-
- /*
- * We got a tag, remove ourselves from the wait queue to ensure
- * someone else gets the wakeup.
- */
- spin_lock_irq(&ws->wait.lock);
- list_del_init(&wait->entry);
- spin_unlock_irq(&ws->wait.lock);
+ if (!ret) {
spin_unlock(&this_hctx->lock);
- return true;
+ return false;
}
+
+ /*
+ * We got a tag, remove ourselves from the wait queue to ensure
+ * someone else gets the wakeup.
+ */
+ spin_lock_irq(&ws->wait.lock);
+ list_del_init(&wait->entry);
+ spin_unlock_irq(&ws->wait.lock);
+ spin_unlock(&this_hctx->lock);
+
+ return true;
}
bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
--
Jens Axboe
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/2] blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait()
2018-01-10 20:30 ` Jens Axboe
@ 2018-01-10 21:03 ` Bart Van Assche
0 siblings, 0 replies; 9+ messages in thread
From: Bart Van Assche @ 2018-01-10 21:03 UTC (permalink / raw)
To: axboe; +Cc: hch, jthumshirn, linux-block, osandov, hare
T24gV2VkLCAyMDE4LTAxLTEwIGF0IDEzOjMwIC0wNzAwLCBKZW5zIEF4Ym9lIHdyb3RlOg0KPiBP
biAxLzEwLzE4IDEyOjM5IFBNLCBCYXJ0IFZhbiBBc3NjaGUgd3JvdGU6DQo+ID4gVGhpcyBwYXRj
aCBkb2VzIG5vdCBjaGFuZ2UgYW55IGZ1bmN0aW9uYWxpdHkgYnV0IG1ha2VzIHRoZQ0KPiA+IGJs
a19tcV9tYXJrX3RhZ193YWl0KCkgY29kZSBzbGlnaHRseSBlYXNpZXIgdG8gcmVhZC4NCj4gDQo+
IEkgYWdyZWUgaXQgY291bGQgZG8gd2l0aCBhIGNsZWFudXAsIGJ1dCBob3cgYWJvdXQgc29tZXRo
aW5nIGxpa2UgdGhlDQo+IGJlbG93PyBJIHRoaW5rIHRoYXQncyBlYXNpZXIgdG8gcmVhZC4NCj4g
DQo+IFsgLi4uIF0NCg0KSGVsbG8gSmVucywNCg0KVGhhdCBzb3VuZHMgbGlrZSBhIGdvb2QgaWRl
YSB0byBtZS4gSSB3aWxsIHVwZGF0ZSB0aGUgcGF0Y2ggc2VyaWVzLCByZXRlc3QNCml0IGFuZCBy
ZXBvc3QuDQoNCkJhcnQu
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/2] blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait()
2018-01-10 19:39 ` [PATCH v2 1/2] blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait() Bart Van Assche
2018-01-10 20:30 ` Jens Axboe
@ 2018-01-11 7:00 ` Hannes Reinecke
1 sibling, 0 replies; 9+ messages in thread
From: Hannes Reinecke @ 2018-01-11 7:00 UTC (permalink / raw)
To: Bart Van Assche, Jens Axboe
Cc: linux-block, Christoph Hellwig, Omar Sandoval, Johannes Thumshirn
On 01/10/2018 08:39 PM, Bart Van Assche wrote:
> This patch does not change any functionality but makes the
> blk_mq_mark_tag_wait() code slightly easier to read.
>
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Omar Sandoval <osandov@fb.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: Johannes Thumshirn <jthumshirn@suse.de>
> ---
> block/blk-mq.c | 30 +++++++++++++++---------------
> 1 file changed, 15 insertions(+), 15 deletions(-)
>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
hare@suse.de +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 2/2] blk-mq: Fix a race condition in blk_mq_mark_tag_wait()
2018-01-10 19:39 ` [PATCH v2 2/2] blk-mq: Fix a race condition " Bart Van Assche
@ 2018-01-11 7:39 ` Hannes Reinecke
2018-01-11 17:27 ` Bart Van Assche
2018-01-11 18:21 ` Omar Sandoval
1 sibling, 1 reply; 9+ messages in thread
From: Hannes Reinecke @ 2018-01-11 7:39 UTC (permalink / raw)
To: Bart Van Assche, Jens Axboe
Cc: linux-block, Christoph Hellwig, Omar Sandoval, Johannes Thumshirn
On 01/10/2018 08:39 PM, Bart Van Assche wrote:
> Both add_wait_queue() and blk_mq_dispatch_wake() protect wait queue
> manipulations with the wait queue lock. Hence also protect the
> !list_empty(&wait->entry) test with the wait queue lock instead of
> the hctx lock.
>
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Omar Sandoval <osandov@fb.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: Johannes Thumshirn <jthumshirn@suse.de>
> ---
> block/blk-mq.c | 19 +++++++++----------
> 1 file changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index e770e8814f60..d5313ce60836 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1184,7 +1184,7 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
> bool shared_tags = (this_hctx->flags & BLK_MQ_F_TAG_SHARED) != 0;
> struct sbq_wait_state *ws;
> wait_queue_entry_t *wait;
> - bool ret;
> + bool on_wait_list, ret;
>
> if (!shared_tags) {
> if (!test_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state))
> @@ -1204,13 +1204,15 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
> if (!list_empty_careful(&wait->entry))
> return false;
>
> - spin_lock(&this_hctx->lock);
> - if (!list_empty(&wait->entry)) {
> - spin_unlock(&this_hctx->lock);
> + ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
> +
> + spin_lock_irq(&ws->wait.lock);
> + on_wait_list = !list_empty(&wait->entry);
> + spin_unlock_irq(&ws->wait.lock);
> +
> + if (on_wait_list)
> return false;
> - }
>
> - ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
> add_wait_queue(&ws->wait, wait);
> /*
> * It's possible that a tag was freed in the window between the
I'm actually not that convinced with this change; originally we had been
checking if it's on the wait list, and only _then_ call bt_wait_ptr().
Now we call bt_wait_ptr() always, meaning we run a chance of increasing
the bitmap_tags wait pointer without actually using it.
Looking at the code I'm not sure this is the correct way of using it ...
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
hare@suse.de +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 2/2] blk-mq: Fix a race condition in blk_mq_mark_tag_wait()
2018-01-11 7:39 ` Hannes Reinecke
@ 2018-01-11 17:27 ` Bart Van Assche
0 siblings, 0 replies; 9+ messages in thread
From: Bart Van Assche @ 2018-01-11 17:27 UTC (permalink / raw)
To: hare, axboe; +Cc: hch, linux-block, osandov, jthumshirn
T24gVGh1LCAyMDE4LTAxLTExIGF0IDA4OjM5ICswMTAwLCBIYW5uZXMgUmVpbmVja2Ugd3JvdGU6
DQo+IEknbSBhY3R1YWxseSBub3QgdGhhdCBjb252aW5jZWQgd2l0aCB0aGlzIGNoYW5nZTsgb3Jp
Z2luYWxseSB3ZSBoYWQgYmVlbg0KPiBjaGVja2luZyBpZiBpdCdzIG9uIHRoZSB3YWl0IGxpc3Qs
IGFuZCBvbmx5IF90aGVuXyBjYWxsIGJ0X3dhaXRfcHRyKCkuDQo+IE5vdyB3ZSBjYWxsIGJ0X3dh
aXRfcHRyKCkgYWx3YXlzLCBtZWFuaW5nIHdlIHJ1biBhIGNoYW5jZSBvZiBpbmNyZWFzaW5nDQo+
IHRoZSBiaXRtYXBfdGFncyB3YWl0IHBvaW50ZXIgd2l0aG91dCBhY3R1YWxseSB1c2luZyBpdC4N
Cj4gTG9va2luZyBhdCB0aGUgY29kZSBJJ20gbm90IHN1cmUgdGhpcyBpcyB0aGUgY29ycmVjdCB3
YXkgb2YgdXNpbmcgaXQgLi4uDQoNCkhlbGxvIEhhbm5lcywNCg0KQXJlIHlvdSBwZXJoYXBzIHJl
ZmVycmluZyB0byBzYnFfaW5kZXhfYXRvbWljX2luYygpPyBJIHRoaW5rIGl0J3MgZmluZSB0bw0K
Y2FsbCBidF93YWl0X3B0cigpIGV2ZW4gaWYgdGhlIGNvcnJlc3BvbmRpbmcgd2FpdHF1ZXVlIHdv
bid0IGJlIHVzZWQuIE15DQp1bmRlcnN0YW5kaW5nIGlzIHRoYXQgdGhlIHB1cnBvc2Ugb2YgaGF2
aW5nIG11bHRpcGxlIHdhaXRxdWV1ZXMgYW5kIG9mDQp1c2luZyB0aGVzZSBpbiBhIHJvdW5kLXJv
YmluIGZhc2hpb24gaXMgdG8gYXZvaWQgdGhhdCBpZiBsZXNzIHRoYW4gb3IgZXF1YWwNCnRvIDgg
KFNCUV9XQUlUX1FVRVVFUykgdGhyZWFkcyBhcmUgd2FpdGluZyBmb3IgYSB0YWcgdGhhdCB0aGVz
ZSB3b24ndCB1c2UNCnRoZSBzYW1lIHdhaXQgcXVldWUuIFNvIGluIHRoZSB1bmxpa2VseSBldmVu
dCBvZiBhbiBlYXJseSByZXR1cm4gdGhlIHdvcnN0DQp0aGF0IGNhbiBoYXBwZW4gaXMgdGhhdCB0
aGVyZSBpcyBtb3JlIGNvbnRlbnRpb24gb24gb25lIG9mIHRoZSBlaWdodCB3YWl0DQpxdWV1ZXMu
IEJ1dCBzaW5jZSB0aGF0IGlzIGFuIHVubGlrZWx5IHNjZW5hcmlvIEkgZG9uJ3QgdGhpbmsgd2Ug
aGF2ZSB0bw0Kd29ycnkgYWJvdXQgdGhpcy4NCg0KQmFydC4=
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 2/2] blk-mq: Fix a race condition in blk_mq_mark_tag_wait()
2018-01-10 19:39 ` [PATCH v2 2/2] blk-mq: Fix a race condition " Bart Van Assche
2018-01-11 7:39 ` Hannes Reinecke
@ 2018-01-11 18:21 ` Omar Sandoval
1 sibling, 0 replies; 9+ messages in thread
From: Omar Sandoval @ 2018-01-11 18:21 UTC (permalink / raw)
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Christoph Hellwig, Omar Sandoval,
Hannes Reinecke, Johannes Thumshirn
On Wed, Jan 10, 2018 at 11:39:19AM -0800, Bart Van Assche wrote:
> Both add_wait_queue() and blk_mq_dispatch_wake() protect wait queue
> manipulations with the wait queue lock. Hence also protect the
> !list_empty(&wait->entry) test with the wait queue lock instead of
> the hctx lock.
>
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Omar Sandoval <osandov@fb.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: Johannes Thumshirn <jthumshirn@suse.de>
> ---
> block/blk-mq.c | 19 +++++++++----------
> 1 file changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index e770e8814f60..d5313ce60836 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1184,7 +1184,7 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
> bool shared_tags = (this_hctx->flags & BLK_MQ_F_TAG_SHARED) != 0;
> struct sbq_wait_state *ws;
> wait_queue_entry_t *wait;
> - bool ret;
> + bool on_wait_list, ret;
>
> if (!shared_tags) {
> if (!test_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state))
> @@ -1204,13 +1204,15 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
> if (!list_empty_careful(&wait->entry))
> return false;
>
> - spin_lock(&this_hctx->lock);
> - if (!list_empty(&wait->entry)) {
> - spin_unlock(&this_hctx->lock);
> + ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
> +
> + spin_lock_irq(&ws->wait.lock);
> + on_wait_list = !list_empty(&wait->entry);
> + spin_unlock_irq(&ws->wait.lock);
This isn't quite right. There's no guarantee that the struct
sbq_wait_state returned by bt_wait_ptr() is the same one that the wait
entry is on, so the lock on the returned ws->wait isn't necessarily
protecting the wait entry. I think we should just be using
list_empty_careful() in this case.
> +
> + if (on_wait_list)
> return false;
> - }
>
> - ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
> add_wait_queue(&ws->wait, wait);
> /*
> * It's possible that a tag was freed in the window between the
> @@ -1218,10 +1220,8 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
> * queue.
> */
> ret = blk_mq_get_driver_tag(rq, hctx, false);
> - if (!ret) {
> - spin_unlock(&this_hctx->lock);
> + if (!ret)
> return false;
> - }
>
> /*
> * We got a tag, remove ourselves from the wait queue to ensure
> @@ -1230,7 +1230,6 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
> spin_lock_irq(&ws->wait.lock);
> list_del_init(&wait->entry);
> spin_unlock_irq(&ws->wait.lock);
> - spin_unlock(&this_hctx->lock);
> }
> return ret;
> }
> --
> 2.15.1
>
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2018-01-11 18:21 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-10 19:39 [PATCH v2 0/2] Rework blk_mq_mark_tag_wait() Bart Van Assche
2018-01-10 19:39 ` [PATCH v2 1/2] blk-mq: Reduce the number of if-statements in blk_mq_mark_tag_wait() Bart Van Assche
2018-01-10 20:30 ` Jens Axboe
2018-01-10 21:03 ` Bart Van Assche
2018-01-11 7:00 ` Hannes Reinecke
2018-01-10 19:39 ` [PATCH v2 2/2] blk-mq: Fix a race condition " Bart Van Assche
2018-01-11 7:39 ` Hannes Reinecke
2018-01-11 17:27 ` Bart Van Assche
2018-01-11 18:21 ` Omar Sandoval
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.