linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4][v2] rq-qos memory barrier shenanigans
@ 2019-07-15 20:11 Josef Bacik
  2019-07-15 20:11 ` [PATCH 1/4] wait: add wq_has_single_sleeper helper Josef Bacik
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Josef Bacik @ 2019-07-15 20:11 UTC (permalink / raw)
  To: axboe, linux-block, kernel-team, peterz, oleg

This is the patch series to address the hang we saw in production because of
missed wakeups, and the other issues that Oleg noticed while reviewing the code.

v1->v2:
- rename wq_has_multiple_sleepers to wq_has_single_sleeper
- fix the check for has_sleepers in the missed wake-ups patch
- fix the barrier issues around got_token that Oleg noticed
- dropped the has_sleepers reset that Oleg noticed we didn't need

Thanks,

Josef


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/4] wait: add wq_has_single_sleeper helper
  2019-07-15 20:11 [PATCH 0/4][v2] rq-qos memory barrier shenanigans Josef Bacik
@ 2019-07-15 20:11 ` Josef Bacik
  2019-07-15 20:11 ` [PATCH 2/4] rq-qos: fix missed wake-ups in rq_qos_throttle Josef Bacik
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Josef Bacik @ 2019-07-15 20:11 UTC (permalink / raw)
  To: axboe, linux-block, kernel-team, peterz, oleg

rq-qos sits in the io path so we want to take locks as sparingly as
possible.  To accomplish this we try not to take the waitqueue head lock
unless we are sure we need to go to sleep, and we have an optimization
to make sure that we don't starve out existing waiters.  Since we check
if there are existing waiters locklessly we need to be able to update
our view of the waitqueue list after we've added ourselves to the
waitqueue.  Accomplish this by adding this helper to see if there is
more than just ourselves on the list.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 include/linux/wait.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/include/linux/wait.h b/include/linux/wait.h
index b6f77cf60dd7..30c515520fb2 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -126,6 +126,19 @@ static inline int waitqueue_active(struct wait_queue_head *wq_head)
 	return !list_empty(&wq_head->head);
 }
 
+/**
+ * wq_has_single_sleeper - check if there is only one sleeper
+ * @wq_head: wait queue head
+ *
+ * Returns true of wq_head has only one sleeper on the list.
+ *
+ * Please refer to the comment for waitqueue_active.
+ */
+static inline bool wq_has_single_sleeper(struct wait_queue_head *wq_head)
+{
+	return list_is_singular(&wq_head->head);
+}
+
 /**
  * wq_has_sleeper - check if there are any waiting processes
  * @wq_head: wait queue head
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/4] rq-qos: fix missed wake-ups in rq_qos_throttle
  2019-07-15 20:11 [PATCH 0/4][v2] rq-qos memory barrier shenanigans Josef Bacik
  2019-07-15 20:11 ` [PATCH 1/4] wait: add wq_has_single_sleeper helper Josef Bacik
@ 2019-07-15 20:11 ` Josef Bacik
  2019-07-15 20:11 ` [PATCH 3/4] rq-qos: use READ_ONCE/WRITE_ONCE for got_token Josef Bacik
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Josef Bacik @ 2019-07-15 20:11 UTC (permalink / raw)
  To: axboe, linux-block, kernel-team, peterz, oleg

We saw a hang in production with WBT where there was only one waiter in
the throttle path and no outstanding IO.  This is because of the
has_sleepers optimization that is used to make sure we don't steal an
inflight counter for new submitters when there are people already on the
list.

We can race with our check to see if the waitqueue has any waiters (this
is done locklessly) and the time we actually add ourselves to the
waitqueue.  If this happens we'll go to sleep and never be woken up
because nobody is doing IO to wake us up.

Fix this by checking if the waitqueue has a single sleeper on the list
after we add ourselves, that way we have an uptodate view of the list.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 block/blk-rq-qos.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
index 659ccb8b693f..67a0a4c07060 100644
--- a/block/blk-rq-qos.c
+++ b/block/blk-rq-qos.c
@@ -244,6 +244,7 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data,
 		return;
 
 	prepare_to_wait_exclusive(&rqw->wait, &data.wq, TASK_UNINTERRUPTIBLE);
+	has_sleeper = !wq_has_single_sleeper(&rqw->wait);
 	do {
 		if (data.got_token)
 			break;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/4] rq-qos: use READ_ONCE/WRITE_ONCE for got_token
  2019-07-15 20:11 [PATCH 0/4][v2] rq-qos memory barrier shenanigans Josef Bacik
  2019-07-15 20:11 ` [PATCH 1/4] wait: add wq_has_single_sleeper helper Josef Bacik
  2019-07-15 20:11 ` [PATCH 2/4] rq-qos: fix missed wake-ups in rq_qos_throttle Josef Bacik
@ 2019-07-15 20:11 ` Josef Bacik
  2019-07-16  8:29   ` Oleg Nesterov
  2019-07-15 20:11 ` [PATCH 4/4] rq-qos: don't reset has_sleepers on spurious wakeups Josef Bacik
  2019-07-16  8:39 ` [PATCH 0/4][v2] rq-qos memory barrier shenanigans Oleg Nesterov
  4 siblings, 1 reply; 8+ messages in thread
From: Josef Bacik @ 2019-07-15 20:11 UTC (permalink / raw)
  To: axboe, linux-block, kernel-team, peterz, oleg

Oleg noticed that our checking of data.got_token is unsafe in the
cleanup case, and should really use a memory barrier.  Use the
READ_ONCE/WRITE_ONCE helpers on got_token so we can be sure we're always
safe.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 block/blk-rq-qos.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
index 67a0a4c07060..f4aa7b818cf5 100644
--- a/block/blk-rq-qos.c
+++ b/block/blk-rq-qos.c
@@ -201,7 +201,7 @@ static int rq_qos_wake_function(struct wait_queue_entry *curr,
 	if (!data->cb(data->rqw, data->private_data))
 		return -1;
 
-	data->got_token = true;
+	WRITE_ONCE(data->got_token, true);
 	list_del_init(&curr->entry);
 	wake_up_process(data->task);
 	return 1;
@@ -246,7 +246,7 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data,
 	prepare_to_wait_exclusive(&rqw->wait, &data.wq, TASK_UNINTERRUPTIBLE);
 	has_sleeper = !wq_has_single_sleeper(&rqw->wait);
 	do {
-		if (data.got_token)
+		if (READ_ONCE(data.got_token))
 			break;
 		if (!has_sleeper && acquire_inflight_cb(rqw, private_data)) {
 			finish_wait(&rqw->wait, &data.wq);
@@ -256,7 +256,7 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data,
 			 * which means we now have two. Put our local token
 			 * and wake anyone else potentially waiting for one.
 			 */
-			if (data.got_token)
+			if (READ_ONCE(data.got_token))
 				cleanup_cb(rqw, private_data);
 			break;
 		}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 4/4] rq-qos: don't reset has_sleepers on spurious wakeups
  2019-07-15 20:11 [PATCH 0/4][v2] rq-qos memory barrier shenanigans Josef Bacik
                   ` (2 preceding siblings ...)
  2019-07-15 20:11 ` [PATCH 3/4] rq-qos: use READ_ONCE/WRITE_ONCE for got_token Josef Bacik
@ 2019-07-15 20:11 ` Josef Bacik
  2019-07-16  8:35   ` Oleg Nesterov
  2019-07-16  8:39 ` [PATCH 0/4][v2] rq-qos memory barrier shenanigans Oleg Nesterov
  4 siblings, 1 reply; 8+ messages in thread
From: Josef Bacik @ 2019-07-15 20:11 UTC (permalink / raw)
  To: axboe, linux-block, kernel-team, peterz, oleg

If we had multiple sleepers before we don't need to worry about never
being woken up.  If we're woken up randomly without having gotten our
inflight token then we need to go back to sleep, we won't be properly
woken up without being given our inflight token anyway so this resetting
isn't needed.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 block/blk-rq-qos.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
index f4aa7b818cf5..35bc6f54d088 100644
--- a/block/blk-rq-qos.c
+++ b/block/blk-rq-qos.c
@@ -261,7 +261,6 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data,
 			break;
 		}
 		io_schedule();
-		has_sleeper = false;
 	} while (1);
 	finish_wait(&rqw->wait, &data.wq);
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 3/4] rq-qos: use READ_ONCE/WRITE_ONCE for got_token
  2019-07-15 20:11 ` [PATCH 3/4] rq-qos: use READ_ONCE/WRITE_ONCE for got_token Josef Bacik
@ 2019-07-16  8:29   ` Oleg Nesterov
  0 siblings, 0 replies; 8+ messages in thread
From: Oleg Nesterov @ 2019-07-16  8:29 UTC (permalink / raw)
  To: Josef Bacik; +Cc: axboe, linux-block, kernel-team, peterz

On 07/15, Josef Bacik wrote:
>
> Oleg noticed that our checking of data.got_token is unsafe in the
> cleanup case, and should really use a memory barrier.  Use the
> READ_ONCE/WRITE_ONCE helpers on got_token so we can be sure we're always
> safe.

READ/WRITE_ONCE can't help, both are compiler barriers. You need smp_wmb/rmb.

Alternatively,

> @@ -246,7 +246,7 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data,
>  	prepare_to_wait_exclusive(&rqw->wait, &data.wq, TASK_UNINTERRUPTIBLE);
>  	has_sleeper = !wq_has_single_sleeper(&rqw->wait);
>  	do {
> -		if (data.got_token)
> +		if (READ_ONCE(data.got_token))
>  			break;
>  		if (!has_sleeper && acquire_inflight_cb(rqw, private_data)) {
>  			finish_wait(&rqw->wait, &data.wq);

You can use remove_wait_queue() which takes rqw->wait->lock unconditonally,
but then you will need to do __set_current_state(TASK_RUNNING) and use
"return" instead of break.

Oleg.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 4/4] rq-qos: don't reset has_sleepers on spurious wakeups
  2019-07-15 20:11 ` [PATCH 4/4] rq-qos: don't reset has_sleepers on spurious wakeups Josef Bacik
@ 2019-07-16  8:35   ` Oleg Nesterov
  0 siblings, 0 replies; 8+ messages in thread
From: Oleg Nesterov @ 2019-07-16  8:35 UTC (permalink / raw)
  To: Josef Bacik; +Cc: axboe, linux-block, kernel-team, peterz

On 07/15, Josef Bacik wrote:
>
> If we had multiple sleepers before we don't need to worry about never
> being woken up.  If we're woken up randomly without having gotten our
> inflight token then we need to go back to sleep, we won't be properly
> woken up without being given our inflight token anyway so this resetting
> isn't needed.

Yes, and this means that we can avoid acquire_inflight_cb() if e're woken
up randomly, so

> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  block/blk-rq-qos.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
> index f4aa7b818cf5..35bc6f54d088 100644
> --- a/block/blk-rq-qos.c
> +++ b/block/blk-rq-qos.c
> @@ -261,7 +261,6 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data,
>  			break;
>  		}
>  		io_schedule();
> -		has_sleeper = false;

I still think it should do

		has_sleeper = true;

Oleg.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/4][v2] rq-qos memory barrier shenanigans
  2019-07-15 20:11 [PATCH 0/4][v2] rq-qos memory barrier shenanigans Josef Bacik
                   ` (3 preceding siblings ...)
  2019-07-15 20:11 ` [PATCH 4/4] rq-qos: don't reset has_sleepers on spurious wakeups Josef Bacik
@ 2019-07-16  8:39 ` Oleg Nesterov
  4 siblings, 0 replies; 8+ messages in thread
From: Oleg Nesterov @ 2019-07-16  8:39 UTC (permalink / raw)
  To: Josef Bacik; +Cc: axboe, linux-block, kernel-team, peterz

On 07/15, Josef Bacik wrote:
>
> This is the patch series to address the hang we saw in production because of
> missed wakeups, and the other issues that Oleg noticed while reviewing the code.

You missed another problem. See
https://lore.kernel.org/linux-block/20190711114543.GA14901@redhat.com/

IOW, rq_qos_wait() needs set_current_state(TASK_UNINTERRUPTIBLE) after
io_schedule().

Oleg.


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-07-16  8:39 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-15 20:11 [PATCH 0/4][v2] rq-qos memory barrier shenanigans Josef Bacik
2019-07-15 20:11 ` [PATCH 1/4] wait: add wq_has_single_sleeper helper Josef Bacik
2019-07-15 20:11 ` [PATCH 2/4] rq-qos: fix missed wake-ups in rq_qos_throttle Josef Bacik
2019-07-15 20:11 ` [PATCH 3/4] rq-qos: use READ_ONCE/WRITE_ONCE for got_token Josef Bacik
2019-07-16  8:29   ` Oleg Nesterov
2019-07-15 20:11 ` [PATCH 4/4] rq-qos: don't reset has_sleepers on spurious wakeups Josef Bacik
2019-07-16  8:35   ` Oleg Nesterov
2019-07-16  8:39 ` [PATCH 0/4][v2] rq-qos memory barrier shenanigans Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).