linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH BUGFIX/IMPROVEMENT 0/6] block, bfq: third and last batch of fixes and improvements
@ 2021-01-26 10:50 Paolo Valente
  2021-01-26 10:50 ` [PATCH BUGFIX/IMPROVEMENT 1/6] block, bfq: always inject I/O of queues blocked by wakers Paolo Valente
                   ` (5 more replies)
  0 siblings, 6 replies; 20+ messages in thread
From: Paolo Valente @ 2021-01-26 10:50 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-kernel, Paolo Valente

Hi,
here's batch 3/3.

Thanks,
Paolo

Paolo Valente (6):
  block, bfq: always inject I/O of queues blocked by wakers
  block, bfq: put reqs of waker and woken in dispatch list
  block, bfq: make shared queues inherit wakers
  block, bfq: fix weight-raising resume with !low_latency
  block, bfq: keep shared queues out of the waker mechanism
  block, bfq: merge bursts of newly-created queues

 block/bfq-cgroup.c  |   2 +
 block/bfq-iosched.c | 362 +++++++++++++++++++++++++++++++++++++++++---
 block/bfq-iosched.h |  15 ++
 block/bfq-wf2q.c    |   8 +
 4 files changed, 365 insertions(+), 22 deletions(-)

--
2.20.1

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH BUGFIX/IMPROVEMENT 1/6] block, bfq: always inject I/O of queues blocked by wakers
  2021-01-26 10:50 [PATCH BUGFIX/IMPROVEMENT 0/6] block, bfq: third and last batch of fixes and improvements Paolo Valente
@ 2021-01-26 10:50 ` Paolo Valente
  2021-01-26 16:17   ` Jens Axboe
  2021-01-26 10:50 ` [PATCH BUGFIX/IMPROVEMENT 2/6] block, bfq: put reqs of waker and woken in dispatch list Paolo Valente
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: Paolo Valente @ 2021-01-26 10:50 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-kernel, Paolo Valente, Jan Kara

Suppose that I/O dispatch is plugged, to wait for new I/O for the
in-service bfq-queue, say bfqq.  Suppose then that there is a further
bfq_queue woken by bfqq, and that this woken queue has pending I/O. A
woken queue does not steal bandwidth from bfqq, because it remains
soon without I/O if bfqq is not served. So there is virtually no risk
of loss of bandwidth for bfqq if this woken queue has I/O dispatched
while bfqq is waiting for new I/O. In contrast, this extra I/O
injection boosts throughput. This commit performs this extra
injection.

Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
---
 block/bfq-iosched.c | 32 +++++++++++++++++++++++++++-----
 block/bfq-wf2q.c    |  8 ++++++++
 2 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 445cef9c0bb9..a83149407336 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -4487,9 +4487,15 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
 			bfq_bfqq_busy(bfqq->bic->bfqq[0]) &&
 			bfqq->bic->bfqq[0]->next_rq ?
 			bfqq->bic->bfqq[0] : NULL;
+		struct bfq_queue *blocked_bfqq =
+			!hlist_empty(&bfqq->woken_list) ?
+			container_of(bfqq->woken_list.first,
+				     struct bfq_queue,
+				     woken_list_node)
+			: NULL;
 
 		/*
-		 * The next three mutually-exclusive ifs decide
+		 * The next four mutually-exclusive ifs decide
 		 * whether to try injection, and choose the queue to
 		 * pick an I/O request from.
 		 *
@@ -4522,7 +4528,15 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
 		 * next bfqq's I/O is brought forward dramatically,
 		 * for it is not blocked for milliseconds.
 		 *
-		 * The third if checks whether bfqq is a queue for
+		 * The third if checks whether there is a queue woken
+		 * by bfqq, and currently with pending I/O. Such a
+		 * woken queue does not steal bandwidth from bfqq,
+		 * because it remains soon without I/O if bfqq is not
+		 * served. So there is virtually no risk of loss of
+		 * bandwidth for bfqq if this woken queue has I/O
+		 * dispatched while bfqq is waiting for new I/O.
+		 *
+		 * The fourth if checks whether bfqq is a queue for
 		 * which it is better to avoid injection. It is so if
 		 * bfqq delivers more throughput when served without
 		 * any further I/O from other queues in the middle, or
@@ -4542,11 +4556,11 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
 		 * bfq_update_has_short_ttime(), it is rather likely
 		 * that, if I/O is being plugged for bfqq and the
 		 * waker queue has pending I/O requests that are
-		 * blocking bfqq's I/O, then the third alternative
+		 * blocking bfqq's I/O, then the fourth alternative
 		 * above lets the waker queue get served before the
 		 * I/O-plugging timeout fires. So one may deem the
 		 * second alternative superfluous. It is not, because
-		 * the third alternative may be way less effective in
+		 * the fourth alternative may be way less effective in
 		 * case of a synchronization. For two main
 		 * reasons. First, throughput may be low because the
 		 * inject limit may be too low to guarantee the same
@@ -4555,7 +4569,7 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
 		 * guarantees (the second alternative unconditionally
 		 * injects a pending I/O request of the waker queue
 		 * for each bfq_dispatch_request()). Second, with the
-		 * third alternative, the duration of the plugging,
+		 * fourth alternative, the duration of the plugging,
 		 * i.e., the time before bfqq finally receives new I/O,
 		 * may not be minimized, because the waker queue may
 		 * happen to be served only after other queues.
@@ -4573,6 +4587,14 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
 			   bfq_bfqq_budget_left(bfqq->waker_bfqq)
 			)
 			bfqq = bfqq->waker_bfqq;
+		else if (blocked_bfqq &&
+			   bfq_bfqq_busy(blocked_bfqq) &&
+			   blocked_bfqq->next_rq &&
+			   bfq_serv_to_charge(blocked_bfqq->next_rq,
+					      blocked_bfqq) <=
+			   bfq_bfqq_budget_left(blocked_bfqq)
+			)
+			bfqq = blocked_bfqq;
 		else if (!idling_boosts_thr_without_issues(bfqd, bfqq) &&
 			 (bfqq->wr_coeff == 1 || bfqd->wr_busy_queues > 1 ||
 			  !bfq_bfqq_has_short_ttime(bfqq)))
diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
index 26776bdbdf36..02e59931d897 100644
--- a/block/bfq-wf2q.c
+++ b/block/bfq-wf2q.c
@@ -1709,4 +1709,12 @@ void bfq_add_bfqq_busy(struct bfq_data *bfqd, struct bfq_queue *bfqq)
 
 	if (bfqq->wr_coeff > 1)
 		bfqd->wr_busy_queues++;
+
+	/* Move bfqq to the head of the woken list of its waker */
+	if (!hlist_unhashed(&bfqq->woken_list_node) &&
+	    &bfqq->woken_list_node != bfqq->waker_bfqq->woken_list.first) {
+		hlist_del_init(&bfqq->woken_list_node);
+		hlist_add_head(&bfqq->woken_list_node,
+			       &bfqq->waker_bfqq->woken_list);
+	}
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH BUGFIX/IMPROVEMENT 2/6] block, bfq: put reqs of waker and woken in dispatch list
  2021-01-26 10:50 [PATCH BUGFIX/IMPROVEMENT 0/6] block, bfq: third and last batch of fixes and improvements Paolo Valente
  2021-01-26 10:50 ` [PATCH BUGFIX/IMPROVEMENT 1/6] block, bfq: always inject I/O of queues blocked by wakers Paolo Valente
@ 2021-01-26 10:50 ` Paolo Valente
  2021-01-26 16:18   ` Jens Axboe
  2021-01-26 10:50 ` [PATCH BUGFIX/IMPROVEMENT 3/6] block, bfq: make shared queues inherit wakers Paolo Valente
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: Paolo Valente @ 2021-01-26 10:50 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-kernel, Paolo Valente, Jan Kara

Consider a new I/O request that arrives for a bfq_queue bfqq. If, when
this happens, the only active bfq_queues are bfqq and either its waker
bfq_queue or one of its woken bfq_queues, then there is no point in
queueing this new I/O request in bfqq for service. In fact, the
in-service queue and bfqq agree on serving this new I/O request as
soon as possible. So this commit puts this new I/O request directly
into the dispatch list.

Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
---
 block/bfq-iosched.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index a83149407336..e5b83910fbe0 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -5640,7 +5640,22 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 
 	spin_lock_irq(&bfqd->lock);
 	bfqq = bfq_init_rq(rq);
-	if (!bfqq || at_head || blk_rq_is_passthrough(rq)) {
+
+	/*
+	 * Additional case for putting rq directly into the dispatch
+	 * queue: the only active bfq_queues are bfqq and either its
+	 * waker bfq_queue or one of its woken bfq_queues. In this
+	 * case, there is no point in queueing rq in bfqq for
+	 * service. In fact, the in-service queue and bfqq agree on
+	 * serving this new I/O request as soon as possible.
+	 */
+	if (!bfqq ||
+	    (bfqq != bfqd->in_service_queue &&
+	     bfqd->in_service_queue != NULL &&
+	     bfq_tot_busy_queues(bfqd) == 1 + bfq_bfqq_busy(bfqq) &&
+	     (bfqq->waker_bfqq == bfqd->in_service_queue ||
+	      bfqd->in_service_queue->waker_bfqq == bfqq)) ||
+	    at_head || blk_rq_is_passthrough(rq)) {
 		if (at_head)
 			list_add(&rq->queuelist, &bfqd->dispatch);
 		else
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH BUGFIX/IMPROVEMENT 3/6] block, bfq: make shared queues inherit wakers
  2021-01-26 10:50 [PATCH BUGFIX/IMPROVEMENT 0/6] block, bfq: third and last batch of fixes and improvements Paolo Valente
  2021-01-26 10:50 ` [PATCH BUGFIX/IMPROVEMENT 1/6] block, bfq: always inject I/O of queues blocked by wakers Paolo Valente
  2021-01-26 10:50 ` [PATCH BUGFIX/IMPROVEMENT 2/6] block, bfq: put reqs of waker and woken in dispatch list Paolo Valente
@ 2021-01-26 10:50 ` Paolo Valente
  2021-01-26 10:51 ` [PATCH BUGFIX/IMPROVEMENT 4/6] block, bfq: fix weight-raising resume with !low_latency Paolo Valente
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Paolo Valente @ 2021-01-26 10:50 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-kernel, Paolo Valente, Jan Kara

Consider a bfq_queue bfqq that is about to be merged with another
bfq_queue new_bfqq. The processes associated with bfqq are cooperators
of the processes associated with new_bfqq. So, if bfqq has a waker,
then it is reasonable (and beneficial for throughput) to assume that
all these processes will be happy to let bfqq's waker freely inject
I/O when they have no I/O. So this commit makes new_bfqq inherit
bfqq's waker.

Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
---
 block/bfq-iosched.c | 42 +++++++++++++++++++++++++++++++++++++++---
 1 file changed, 39 insertions(+), 3 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index e5b83910fbe0..c5bda33c0923 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -2819,6 +2819,29 @@ bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic,
 		bfq_mark_bfqq_IO_bound(new_bfqq);
 	bfq_clear_bfqq_IO_bound(bfqq);
 
+	/*
+	 * The processes associated with bfqq are cooperators of the
+	 * processes associated with new_bfqq. So, if bfqq has a
+	 * waker, then assume that all these processes will be happy
+	 * to let bfqq's waker freely inject I/O when they have no
+	 * I/O.
+	 */
+	if (bfqq->waker_bfqq && !new_bfqq->waker_bfqq &&
+	    bfqq->waker_bfqq != new_bfqq) {
+		new_bfqq->waker_bfqq = bfqq->waker_bfqq;
+		new_bfqq->tentative_waker_bfqq = NULL;
+
+		/*
+		 * If the waker queue disappears, then
+		 * new_bfqq->waker_bfqq must be reset. So insert
+		 * new_bfqq into the woken_list of the waker. See
+		 * bfq_check_waker for details.
+		 */
+		hlist_add_head(&new_bfqq->woken_list_node,
+			       &new_bfqq->waker_bfqq->woken_list);
+
+	}
+
 	/*
 	 * If bfqq is weight-raised, then let new_bfqq inherit
 	 * weight-raising. To reduce false positives, neglect the case
@@ -6276,7 +6299,7 @@ static struct bfq_queue *bfq_init_rq(struct request *rq)
 	if (likely(!new_queue)) {
 		/* If the queue was seeky for too long, break it apart. */
 		if (bfq_bfqq_coop(bfqq) && bfq_bfqq_split_coop(bfqq)) {
-			bfq_log_bfqq(bfqd, bfqq, "breaking apart bfqq");
+			struct bfq_queue *old_bfqq = bfqq;
 
 			/* Update bic before losing reference to bfqq */
 			if (bfq_bfqq_in_large_burst(bfqq))
@@ -6285,11 +6308,24 @@ static struct bfq_queue *bfq_init_rq(struct request *rq)
 			bfqq = bfq_split_bfqq(bic, bfqq);
 			split = true;
 
-			if (!bfqq)
+			if (!bfqq) {
 				bfqq = bfq_get_bfqq_handle_split(bfqd, bic, bio,
 								 true, is_sync,
 								 NULL);
-			else
+				bfqq->waker_bfqq = old_bfqq->waker_bfqq;
+				bfqq->tentative_waker_bfqq = NULL;
+
+				/*
+				 * If the waker queue disappears, then
+				 * new_bfqq->waker_bfqq must be
+				 * reset. So insert new_bfqq into the
+				 * woken_list of the waker. See
+				 * bfq_check_waker for details.
+				 */
+				if (bfqq->waker_bfqq)
+					hlist_add_head(&bfqq->woken_list_node,
+						       &bfqq->waker_bfqq->woken_list);
+			} else
 				bfqq_already_existing = true;
 		}
 	}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH BUGFIX/IMPROVEMENT 4/6] block, bfq: fix weight-raising resume with !low_latency
  2021-01-26 10:50 [PATCH BUGFIX/IMPROVEMENT 0/6] block, bfq: third and last batch of fixes and improvements Paolo Valente
                   ` (2 preceding siblings ...)
  2021-01-26 10:50 ` [PATCH BUGFIX/IMPROVEMENT 3/6] block, bfq: make shared queues inherit wakers Paolo Valente
@ 2021-01-26 10:51 ` Paolo Valente
  2021-01-26 10:51 ` [PATCH BUGFIX/IMPROVEMENT 5/6] block, bfq: keep shared queues out of the waker mechanism Paolo Valente
  2021-01-26 10:51 ` [PATCH BUGFIX/IMPROVEMENT 6/6] block, bfq: merge bursts of newly-created queues Paolo Valente
  5 siblings, 0 replies; 20+ messages in thread
From: Paolo Valente @ 2021-01-26 10:51 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-kernel, Paolo Valente, Jan Kara

When the io_latency heuristic is off, bfq_queues must not start to be
weight-raised. Unfortunately, by mistake, this may happen when the
state of a previously weight-raised bfq_queue is resumed after a queue
split. This commit fixes this error.

Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
---
 block/bfq-iosched.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index c5bda33c0923..0c7e203085f1 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -1010,7 +1010,7 @@ static void
 bfq_bfqq_resume_state(struct bfq_queue *bfqq, struct bfq_data *bfqd,
 		      struct bfq_io_cq *bic, bool bfq_already_existing)
 {
-	unsigned int old_wr_coeff = bfqq->wr_coeff;
+	unsigned int old_wr_coeff = 1;
 	bool busy = bfq_already_existing && bfq_bfqq_busy(bfqq);
 
 	if (bic->saved_has_short_ttime)
@@ -1031,7 +1031,13 @@ bfq_bfqq_resume_state(struct bfq_queue *bfqq, struct bfq_data *bfqd,
 	bfqq->ttime = bic->saved_ttime;
 	bfqq->io_start_time = bic->saved_io_start_time;
 	bfqq->tot_idle_time = bic->saved_tot_idle_time;
-	bfqq->wr_coeff = bic->saved_wr_coeff;
+	/*
+	 * Restore weight coefficient only if low_latency is on
+	 */
+	if (bfqd->low_latency) {
+		old_wr_coeff = bfqq->wr_coeff;
+		bfqq->wr_coeff = bic->saved_wr_coeff;
+	}
 	bfqq->service_from_wr = bic->saved_service_from_wr;
 	bfqq->wr_start_at_switch_to_srt = bic->saved_wr_start_at_switch_to_srt;
 	bfqq->last_wr_start_finish = bic->saved_last_wr_start_finish;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH BUGFIX/IMPROVEMENT 5/6] block, bfq: keep shared queues out of the waker mechanism
  2021-01-26 10:50 [PATCH BUGFIX/IMPROVEMENT 0/6] block, bfq: third and last batch of fixes and improvements Paolo Valente
                   ` (3 preceding siblings ...)
  2021-01-26 10:51 ` [PATCH BUGFIX/IMPROVEMENT 4/6] block, bfq: fix weight-raising resume with !low_latency Paolo Valente
@ 2021-01-26 10:51 ` Paolo Valente
  2021-02-03 11:48   ` Jan Kara
  2021-01-26 10:51 ` [PATCH BUGFIX/IMPROVEMENT 6/6] block, bfq: merge bursts of newly-created queues Paolo Valente
  5 siblings, 1 reply; 20+ messages in thread
From: Paolo Valente @ 2021-01-26 10:51 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-kernel, Paolo Valente, Jan Kara

Shared queues are likely to receive I/O at a high rate. This may
deceptively let them be considered as wakers of other queues. But a
false waker will unjustly steal bandwidth to its supposedly woken
queue. So considering also shared queues in the waking mechanism may
cause more control troubles than throughput benefits. This commit
keeps shared queues out of the waker-detection mechanism.

Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
---
 block/bfq-iosched.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 0c7e203085f1..23d0dd7bd90f 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -5825,7 +5825,17 @@ static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)
 			1UL<<(BFQ_RATE_SHIFT - 10))
 		bfq_update_rate_reset(bfqd, NULL);
 	bfqd->last_completion = now_ns;
-	bfqd->last_completed_rq_bfqq = bfqq;
+	/*
+	 * Shared queues are likely to receive I/O at a high
+	 * rate. This may deceptively let them be considered as wakers
+	 * of other queues. But a false waker will unjustly steal
+	 * bandwidth to its supposedly woken queue. So considering
+	 * also shared queues in the waking mechanism may cause more
+	 * control troubles than throughput benefits. Then do not set
+	 * last_completed_rq_bfqq to bfqq if bfqq is a shared queue.
+	 */
+	if (!bfq_bfqq_coop(bfqq))
+		bfqd->last_completed_rq_bfqq = bfqq;
 
 	/*
 	 * If we are waiting to discover whether the request pattern
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH BUGFIX/IMPROVEMENT 6/6] block, bfq: merge bursts of newly-created queues
  2021-01-26 10:50 [PATCH BUGFIX/IMPROVEMENT 0/6] block, bfq: third and last batch of fixes and improvements Paolo Valente
                   ` (4 preceding siblings ...)
  2021-01-26 10:51 ` [PATCH BUGFIX/IMPROVEMENT 5/6] block, bfq: keep shared queues out of the waker mechanism Paolo Valente
@ 2021-01-26 10:51 ` Paolo Valente
  2021-01-26 16:15   ` Jens Axboe
                     ` (2 more replies)
  5 siblings, 3 replies; 20+ messages in thread
From: Paolo Valente @ 2021-01-26 10:51 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-kernel, Paolo Valente, Jan Kara

Many throughput-sensitive workloads are made of several parallel I/O
flows, with all flows generated by the same application, or more
generically by the same task (e.g., system boot). The most
counterproductive action with these workloads is plugging I/O dispatch
when one of the bfq_queues associated with these flows remains
temporarily empty.

To avoid this plugging, BFQ has been using a burst-handling mechanism
for years now. This mechanism has proven effective for throughput, and
not detrimental for service guarantees. This commit pushes this
mechanism a little bit further, basing on the following two facts.

First, all the I/O flows of a the same application or task contribute
to the execution/completion of that common application or task. So the
performance figures that matter are total throughput of the flows and
task-wide I/O latency.  In particular, these flows do not need to be
protected from each other, in terms of individual bandwidth or
latency.

Second, the above fact holds regardless of the number of flows.

Putting these two facts together, this commits merges stably the
bfq_queues associated with these I/O flows, i.e., with the processes
that generate these IO/ flows, regardless of how many the involved
processes are.

To decide whether a set of bfq_queues is actually associated with the
I/O flows of a common application or task, and to merge these queues
stably, this commit operates as follows: given a bfq_queue, say Q2,
currently being created, and the last bfq_queue, say Q1, created
before Q2, Q2 is merged stably with Q1 if
- very little time has elapsed since when Q1 was created
- Q2 has the same ioprio as Q1
- Q2 belongs to the same group as Q1

Merging bfq_queues also reduces scheduling overhead. A fio test with
ten random readers on /dev/nullb shows a throughput boost of 40%, with
a quadcore. Since BFQ's execution time amounts to ~50% of the total
per-request processing time, the above throughput boost implies that
BFQ's overhead is reduced by more than 50%.

Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
---
 block/bfq-cgroup.c  |   2 +
 block/bfq-iosched.c | 249 ++++++++++++++++++++++++++++++++++++++++++--
 block/bfq-iosched.h |  15 +++
 3 files changed, 256 insertions(+), 10 deletions(-)

diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index b791e2041e49..e2f14508f2d6 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -547,6 +547,8 @@ static void bfq_pd_init(struct blkg_policy_data *pd)
 
 	entity->orig_weight = entity->weight = entity->new_weight = d->weight;
 	entity->my_sched_data = &bfqg->sched_data;
+	entity->last_bfqq_created = NULL;
+
 	bfqg->my_entity = entity; /*
 				   * the root_group's will be set to NULL
 				   * in bfq_init_queue()
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 23d0dd7bd90f..f4e23e0ced74 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -1073,7 +1073,7 @@ bfq_bfqq_resume_state(struct bfq_queue *bfqq, struct bfq_data *bfqd,
 static int bfqq_process_refs(struct bfq_queue *bfqq)
 {
 	return bfqq->ref - bfqq->allocated - bfqq->entity.on_st_or_in_serv -
-		(bfqq->weight_counter != NULL);
+		(bfqq->weight_counter != NULL) - bfqq->stable_ref;
 }
 
 /* Empty burst list and add just bfqq (see comments on bfq_handle_burst) */
@@ -2625,6 +2625,11 @@ static bool bfq_may_be_close_cooperator(struct bfq_queue *bfqq,
 	return true;
 }
 
+static bool idling_boosts_thr_without_issues(struct bfq_data *bfqd,
+					     struct bfq_queue *bfqq);
+
+static void bfq_put_stable_ref(struct bfq_queue *bfqq);
+
 /*
  * Attempt to schedule a merge of bfqq with the currently in-service
  * queue or with a close queue among the scheduled queues.  Return
@@ -2647,10 +2652,49 @@ static bool bfq_may_be_close_cooperator(struct bfq_queue *bfqq,
  */
 static struct bfq_queue *
 bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq,
-		     void *io_struct, bool request)
+		     void *io_struct, bool request, struct bfq_io_cq *bic)
 {
 	struct bfq_queue *in_service_bfqq, *new_bfqq;
 
+	/*
+	 * Check delayed stable merge for rotational or non-queueing
+	 * devs. For this branch to be executed, bfqq must not be
+	 * currently merged with some other queue (i.e., bfqq->bic
+	 * must be non null). If we considered also merged queues,
+	 * then we should also check whether bfqq has already been
+	 * merged with bic->stable_merge_bfqq. But this would be
+	 * costly and complicated.
+	 */
+	if (unlikely(!bfqd->nonrot_with_queueing)) {
+		if (bic->stable_merge_bfqq &&
+		    !bfq_bfqq_just_created(bfqq) &&
+		    time_is_after_jiffies(bfqq->split_time +
+					  msecs_to_jiffies(200))) {
+			struct bfq_queue *stable_merge_bfqq =
+				bic->stable_merge_bfqq;
+			int proc_ref = min(bfqq_process_refs(bfqq),
+					   bfqq_process_refs(stable_merge_bfqq));
+
+			/* deschedule stable merge, because done or aborted here */
+			bfq_put_stable_ref(stable_merge_bfqq);
+
+			bic->stable_merge_bfqq = NULL;
+
+			if (!idling_boosts_thr_without_issues(bfqd, bfqq) &&
+			    proc_ref > 0) {
+				/* next function will take at least one ref */
+				struct bfq_queue *new_bfqq =
+					bfq_setup_merge(bfqq, stable_merge_bfqq);
+
+				bic->stably_merged = true;
+				if (new_bfqq && new_bfqq->bic)
+					new_bfqq->bic->stably_merged = true;
+				return new_bfqq;
+			} else
+				return NULL;
+		}
+	}
+
 	/*
 	 * Do not perform queue merging if the device is non
 	 * rotational and performs internal queueing. In fact, such a
@@ -2809,6 +2853,12 @@ void bfq_release_process_ref(struct bfq_data *bfqd, struct bfq_queue *bfqq)
 	    bfqq != bfqd->in_service_queue)
 		bfq_del_bfqq_busy(bfqd, bfqq, false);
 
+	if (bfqq->entity.parent &&
+	    bfqq->entity.parent->last_bfqq_created == bfqq)
+		bfqq->entity.parent->last_bfqq_created = NULL;
+	else if (bfqq->bfqd && bfqq->bfqd->last_bfqq_created == bfqq)
+		bfqq->bfqd->last_bfqq_created = NULL;
+
 	bfq_put_queue(bfqq);
 }
 
@@ -2905,6 +2955,13 @@ bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic,
 	 */
 	new_bfqq->pid = -1;
 	bfqq->bic = NULL;
+
+	if (bfqq->entity.parent &&
+	    bfqq->entity.parent->last_bfqq_created == bfqq)
+		bfqq->entity.parent->last_bfqq_created = new_bfqq;
+	else if (bfqq->bfqd && bfqq->bfqd->last_bfqq_created == bfqq)
+		bfqq->bfqd->last_bfqq_created = new_bfqq;
+
 	bfq_release_process_ref(bfqd, bfqq);
 }
 
@@ -2932,7 +2989,7 @@ static bool bfq_allow_bio_merge(struct request_queue *q, struct request *rq,
 	 * We take advantage of this function to perform an early merge
 	 * of the queues of possible cooperating processes.
 	 */
-	new_bfqq = bfq_setup_cooperator(bfqd, bfqq, bio, false);
+	new_bfqq = bfq_setup_cooperator(bfqd, bfqq, bio, false, bfqd->bio_bic);
 	if (new_bfqq) {
 		/*
 		 * bic still points to bfqq, then it has not yet been
@@ -5033,6 +5090,12 @@ void bfq_put_queue(struct bfq_queue *bfqq)
 	bfqg_and_blkg_put(bfqg);
 }
 
+static void bfq_put_stable_ref(struct bfq_queue *bfqq)
+{
+	bfqq->stable_ref--;
+	bfq_put_queue(bfqq);
+}
+
 static void bfq_put_cooperator(struct bfq_queue *bfqq)
 {
 	struct bfq_queue *__bfqq, *next;
@@ -5089,6 +5152,17 @@ static void bfq_exit_icq(struct io_cq *icq)
 {
 	struct bfq_io_cq *bic = icq_to_bic(icq);
 
+	if (bic->stable_merge_bfqq) {
+		unsigned long flags;
+		struct bfq_data *bfqd = bic->stable_merge_bfqq->bfqd;
+
+		if (bfqd)
+			spin_lock_irqsave(&bfqd->lock, flags);
+		bfq_put_stable_ref(bic->stable_merge_bfqq);
+		if (bfqd)
+			spin_unlock_irqrestore(&bfqd->lock, flags);
+	}
+
 	bfq_exit_icq_bfqq(bic, true);
 	bfq_exit_icq_bfqq(bic, false);
 }
@@ -5149,7 +5223,8 @@ bfq_set_next_ioprio_data(struct bfq_queue *bfqq, struct bfq_io_cq *bic)
 
 static struct bfq_queue *bfq_get_queue(struct bfq_data *bfqd,
 				       struct bio *bio, bool is_sync,
-				       struct bfq_io_cq *bic);
+				       struct bfq_io_cq *bic,
+				       bool respawn);
 
 static void bfq_check_ioprio_change(struct bfq_io_cq *bic, struct bio *bio)
 {
@@ -5169,7 +5244,7 @@ static void bfq_check_ioprio_change(struct bfq_io_cq *bic, struct bio *bio)
 	bfqq = bic_to_bfqq(bic, false);
 	if (bfqq) {
 		bfq_release_process_ref(bfqd, bfqq);
-		bfqq = bfq_get_queue(bfqd, bio, BLK_RW_ASYNC, bic);
+		bfqq = bfq_get_queue(bfqd, bio, BLK_RW_ASYNC, bic, true);
 		bic_set_bfqq(bic, bfqq, false);
 	}
 
@@ -5212,6 +5287,8 @@ static void bfq_init_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 	/* set end request to minus infinity from now */
 	bfqq->ttime.last_end_request = now_ns + 1;
 
+	bfqq->creation_time = jiffies;
+
 	bfqq->io_start_time = now_ns;
 
 	bfq_mark_bfqq_IO_bound(bfqq);
@@ -5261,9 +5338,156 @@ static struct bfq_queue **bfq_async_queue_prio(struct bfq_data *bfqd,
 	}
 }
 
+struct bfq_queue *
+bfq_do_early_stable_merge(struct bfq_data *bfqd, struct bfq_queue *bfqq,
+			  struct bfq_io_cq *bic,
+			  struct bfq_queue *last_bfqq_created)
+{
+	struct bfq_queue *new_bfqq =
+		bfq_setup_merge(bfqq, last_bfqq_created);
+
+	if (!new_bfqq)
+		return bfqq;
+
+	if (new_bfqq->bic)
+		new_bfqq->bic->stably_merged = true;
+	bic->stably_merged = true;
+
+	/*
+	 * Reusing merge functions. This implies that
+	 * bfqq->bic must be set too, for
+	 * bfq_merge_bfqqs to correctly save bfqq's
+	 * state before killing it.
+	 */
+	bfqq->bic = bic;
+	bfq_merge_bfqqs(bfqd, bic, bfqq, new_bfqq);
+
+	return new_bfqq;
+}
+
+/*
+ * Many throughput-sensitive workloads are made of several parallel
+ * I/O flows, with all flows generated by the same application, or
+ * more generically by the same task (e.g., system boot). The most
+ * counterproductive action with these workloads is plugging I/O
+ * dispatch when one of the bfq_queues associated with these flows
+ * remains temporarily empty.
+ *
+ * To avoid this plugging, BFQ has been using a burst-handling
+ * mechanism for years now. This mechanism has proven effective for
+ * throughput, and not detrimental for service guarantees. The
+ * following function pushes this mechanism a little bit further,
+ * basing on the following two facts.
+ *
+ * First, all the I/O flows of a the same application or task
+ * contribute to the execution/completion of that common application
+ * or task. So the performance figures that matter are total
+ * throughput of the flows and task-wide I/O latency.  In particular,
+ * these flows do not need to be protected from each other, in terms
+ * of individual bandwidth or latency.
+ *
+ * Second, the above fact holds regardless of the number of flows.
+ *
+ * Putting these two facts together, this commits merges stably the
+ * bfq_queues associated with these I/O flows, i.e., with the
+ * processes that generate these IO/ flows, regardless of how many the
+ * involved processes are.
+ *
+ * To decide whether a set of bfq_queues is actually associated with
+ * the I/O flows of a common application or task, and to merge these
+ * queues stably, this function operates as follows: given a bfq_queue,
+ * say Q2, currently being created, and the last bfq_queue, say Q1,
+ * created before Q2, Q2 is merged stably with Q1 if
+ * - very little time has elapsed since when Q1 was created
+ * - Q2 has the same ioprio as Q1
+ * - Q2 belongs to the same group as Q1
+ *
+ * Merging bfq_queues also reduces scheduling overhead. A fio test
+ * with ten random readers on /dev/nullb shows a throughput boost of
+ * 40%, with a quadcore. Since BFQ's execution time amounts to ~50% of
+ * the total per-request processing time, the above throughput boost
+ * implies that BFQ's overhead is reduced by more than 50%.
+ *
+ * This new mechanism most certainly obsoletes the current
+ * burst-handling heuristics. We keep those heuristics for the moment.
+ */
+static struct bfq_queue *bfq_do_or_sched_stable_merge(struct bfq_data *bfqd,
+						      struct bfq_queue *bfqq,
+						      struct bfq_io_cq *bic)
+{
+	struct bfq_queue **source_bfqq = bfqq->entity.parent ?
+		&bfqq->entity.parent->last_bfqq_created :
+		&bfqd->last_bfqq_created;
+
+	struct bfq_queue *last_bfqq_created = *source_bfqq;
+
+	/*
+	 * If last_bfqq_created has not been set yet, then init it. If
+	 * it has been set already, but too long ago, then move it
+	 * forward to bfqq. Finally, move also if bfqq belongs to a
+	 * different group than last_bfqq_created, or if bfqq has a
+	 * different ioprio or ioprio_class. If none of these
+	 * conditions holds true, then try an early stable merge or
+	 * schedule a delayed stable merge.
+	 *
+	 * A delayed merge is scheduled (instead of performing an
+	 * early merge), in case bfqq might soon prove to be more
+	 * throughput-beneficial if not merged. Currently this is
+	 * possible only if bfqd is rotational with no queueing. For
+	 * such a drive, not merging bfqq is better for throughput if
+	 * bfqq happens to contain sequential I/O. So, we wait a
+	 * little bit for enough I/O to flow through bfqq. After that,
+	 * if such an I/O is sequential, then the merge is
+	 * canceled. Otherwise the merge is finally performed.
+	 */
+	if (!last_bfqq_created ||
+	    time_before(last_bfqq_created->creation_time +
+			bfqd->bfq_burst_interval,
+			bfqq->creation_time) ||
+		bfqq->entity.parent != last_bfqq_created->entity.parent ||
+		bfqq->ioprio != last_bfqq_created->ioprio ||
+		bfqq->ioprio_class != last_bfqq_created->ioprio_class)
+		*source_bfqq = bfqq;
+	else if (time_after_eq(last_bfqq_created->creation_time +
+				 bfqd->bfq_burst_interval,
+				 bfqq->creation_time)) {
+		if (likely(bfqd->nonrot_with_queueing))
+			/*
+			 * With this type of drive, leaving
+			 * bfqq alone may provide no
+			 * throughput benefits compared with
+			 * merging bfqq. So merge bfqq now.
+			 */
+			bfqq = bfq_do_early_stable_merge(bfqd, bfqq,
+							 bic,
+							 last_bfqq_created);
+		else { /* schedule tentative stable merge */
+			/*
+			 * get reference on last_bfqq_created,
+			 * to prevent it from being freed,
+			 * until we decide whether to merge
+			 */
+			last_bfqq_created->ref++;
+			/*
+			 * need to keep track of stable refs, to
+			 * compute process refs correctly
+			 */
+			last_bfqq_created->stable_ref++;
+			/*
+			 * Record the bfqq to merge to.
+			 */
+			bic->stable_merge_bfqq = last_bfqq_created;
+		}
+	}
+
+	return bfqq;
+}
+
+
 static struct bfq_queue *bfq_get_queue(struct bfq_data *bfqd,
 				       struct bio *bio, bool is_sync,
-				       struct bfq_io_cq *bic)
+				       struct bfq_io_cq *bic,
+				       bool respawn)
 {
 	const int ioprio = IOPRIO_PRIO_DATA(bic->ioprio);
 	const int ioprio_class = IOPRIO_PRIO_CLASS(bic->ioprio);
@@ -5321,7 +5545,10 @@ static struct bfq_queue *bfq_get_queue(struct bfq_data *bfqd,
 
 out:
 	bfqq->ref++; /* get a process reference to this queue */
-	bfq_log_bfqq(bfqd, bfqq, "get_queue, at end: %p, %d", bfqq, bfqq->ref);
+
+	if (bfqq != &bfqd->oom_bfqq && is_sync && !respawn)
+		bfqq = bfq_do_or_sched_stable_merge(bfqd, bfqq, bic);
+
 	rcu_read_unlock();
 	return bfqq;
 }
@@ -5563,7 +5790,8 @@ static void bfq_rq_enqueued(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 static bool __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
 {
 	struct bfq_queue *bfqq = RQ_BFQQ(rq),
-		*new_bfqq = bfq_setup_cooperator(bfqd, bfqq, rq, true);
+		*new_bfqq = bfq_setup_cooperator(bfqd, bfqq, rq, true,
+						 RQ_BIC(rq));
 	bool waiting, idle_timer_disabled = false;
 
 	if (new_bfqq) {
@@ -6193,7 +6421,7 @@ static struct bfq_queue *bfq_get_bfqq_handle_split(struct bfq_data *bfqd,
 
 	if (bfqq)
 		bfq_put_queue(bfqq);
-	bfqq = bfq_get_queue(bfqd, bio, is_sync, bic);
+	bfqq = bfq_get_queue(bfqd, bio, is_sync, bic, split);
 
 	bic_set_bfqq(bic, bfqq, is_sync);
 	if (split && is_sync) {
@@ -6314,7 +6542,8 @@ static struct bfq_queue *bfq_init_rq(struct request *rq)
 
 	if (likely(!new_queue)) {
 		/* If the queue was seeky for too long, break it apart. */
-		if (bfq_bfqq_coop(bfqq) && bfq_bfqq_split_coop(bfqq)) {
+		if (bfq_bfqq_coop(bfqq) && bfq_bfqq_split_coop(bfqq) &&
+			!bic->stably_merged) {
 			struct bfq_queue *old_bfqq = bfqq;
 
 			/* Update bic before losing reference to bfqq */
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index b8e793c34ff1..99c2a3cb081e 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -197,6 +197,9 @@ struct bfq_entity {
 
 	/* flag, set if the entity is counted in groups_with_pending_reqs */
 	bool in_groups_with_pending_reqs;
+
+	/* last child queue of entity created (for non-leaf entities) */
+	struct bfq_queue *last_bfqq_created;
 };
 
 struct bfq_group;
@@ -230,6 +233,8 @@ struct bfq_ttime {
 struct bfq_queue {
 	/* reference counter */
 	int ref;
+	/* counter of references from other queues for delayed stable merge */
+	int stable_ref;
 	/* parent bfq_data */
 	struct bfq_data *bfqd;
 
@@ -365,6 +370,8 @@ struct bfq_queue {
 
 	unsigned long first_IO_time; /* time of first I/O for this queue */
 
+	unsigned long creation_time; /* when this queue is created */
+
 	/* max service rate measured so far */
 	u32 max_service_rate;
 
@@ -454,6 +461,11 @@ struct bfq_io_cq {
 	u64 saved_last_serv_time_ns;
 	unsigned int saved_inject_limit;
 	unsigned long saved_decrease_time_jif;
+
+	/* candidate queue for a stable merge (due to close creation time) */
+	struct bfq_queue *stable_merge_bfqq;
+
+	bool stably_merged;	/* non splittable if true */
 };
 
 /**
@@ -578,6 +590,9 @@ struct bfq_data {
 	/* bfqq owning the last completed rq */
 	struct bfq_queue *last_completed_rq_bfqq;
 
+	/* last bfqq created, among those in the root group */
+	struct bfq_queue *last_bfqq_created;
+
 	/* time of last transition from empty to non-empty (ns) */
 	u64 last_empty_occupied_ns;
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH BUGFIX/IMPROVEMENT 6/6] block, bfq: merge bursts of newly-created queues
  2021-01-26 10:51 ` [PATCH BUGFIX/IMPROVEMENT 6/6] block, bfq: merge bursts of newly-created queues Paolo Valente
@ 2021-01-26 16:15   ` Jens Axboe
  2021-02-25 17:25     ` Paolo Valente
  2021-01-27  7:34   ` kernel test robot
  2021-01-27  9:52   ` kernel test robot
  2 siblings, 1 reply; 20+ messages in thread
From: Jens Axboe @ 2021-01-26 16:15 UTC (permalink / raw)
  To: Paolo Valente; +Cc: linux-block, linux-kernel, Jan Kara

On 1/26/21 3:51 AM, Paolo Valente wrote:
> @@ -2809,6 +2853,12 @@ void bfq_release_process_ref(struct bfq_data *bfqd, struct bfq_queue *bfqq)
>  	    bfqq != bfqd->in_service_queue)
>  		bfq_del_bfqq_busy(bfqd, bfqq, false);
>  
> +	if (bfqq->entity.parent &&
> +	    bfqq->entity.parent->last_bfqq_created == bfqq)
> +		bfqq->entity.parent->last_bfqq_created = NULL;
> +	else if (bfqq->bfqd && bfqq->bfqd->last_bfqq_created == bfqq)
> +		bfqq->bfqd->last_bfqq_created = NULL;
> +
>  	bfq_put_queue(bfqq);
>  }
>  
> @@ -2905,6 +2955,13 @@ bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic,
>  	 */
>  	new_bfqq->pid = -1;
>  	bfqq->bic = NULL;
> +
> +	if (bfqq->entity.parent &&
> +	    bfqq->entity.parent->last_bfqq_created == bfqq)
> +		bfqq->entity.parent->last_bfqq_created = new_bfqq;
> +	else if (bfqq->bfqd && bfqq->bfqd->last_bfqq_created == bfqq)
> +		bfqq->bfqd->last_bfqq_created = new_bfqq;
> +
>  	bfq_release_process_ref(bfqd, bfqq);
>  }

Almost identical code constructs makes it seem like this should have a
helper instead.

> @@ -5033,6 +5090,12 @@ void bfq_put_queue(struct bfq_queue *bfqq)
>  	bfqg_and_blkg_put(bfqg);
>  }
>  
> +static void bfq_put_stable_ref(struct bfq_queue *bfqq)
> +{
> +	bfqq->stable_ref--;
> +	bfq_put_queue(bfqq);
> +}
> +
>  static void bfq_put_cooperator(struct bfq_queue *bfqq)
>  {
>  	struct bfq_queue *__bfqq, *next;
> @@ -5089,6 +5152,17 @@ static void bfq_exit_icq(struct io_cq *icq)
>  {
>  	struct bfq_io_cq *bic = icq_to_bic(icq);
>  
> +	if (bic->stable_merge_bfqq) {
> +		unsigned long flags;
> +		struct bfq_data *bfqd = bic->stable_merge_bfqq->bfqd;
> +
> +		if (bfqd)
> +			spin_lock_irqsave(&bfqd->lock, flags);
> +		bfq_put_stable_ref(bic->stable_merge_bfqq);
> +		if (bfqd)
> +			spin_unlock_irqrestore(&bfqd->lock, flags);
> +	}
> +

Construct like this are really painful. Just do:

if (bfqd) {
	unsigned long flags;

	spin_lock_irqsave(&bfqd->lock, flags);
	bfq_put_stable_ref(bic->stable_merge_bfqq);
	spin_unlock_irqrestore(&bfqd->lock, flags);
} else {
	bfq_put_stable_ref(bic->stable_merge_bfqq);
}

which is also less likely to cause code analyzer false warnings. Outside
of that, it needs a comment on why it's ok NOT to grab the lock when
bfqd is zero, because that seems counter-intuitive and more a case of
"well we can't grab a lock for something we don't have". Maybe it's
because bfqd is no longer visible at this point, and it's ok, but it's
definitely not clear just looking at this patch. Even with that, is the
bfqq visible? Should the ref be atomic, and locking happen further down
instead?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH BUGFIX/IMPROVEMENT 1/6] block, bfq: always inject I/O of queues blocked by wakers
  2021-01-26 10:50 ` [PATCH BUGFIX/IMPROVEMENT 1/6] block, bfq: always inject I/O of queues blocked by wakers Paolo Valente
@ 2021-01-26 16:17   ` Jens Axboe
  2021-02-25 15:58     ` Paolo Valente
  0 siblings, 1 reply; 20+ messages in thread
From: Jens Axboe @ 2021-01-26 16:17 UTC (permalink / raw)
  To: Paolo Valente; +Cc: linux-block, linux-kernel, Jan Kara

On 1/26/21 3:50 AM, Paolo Valente wrote:
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index 445cef9c0bb9..a83149407336 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -4487,9 +4487,15 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
>  			bfq_bfqq_busy(bfqq->bic->bfqq[0]) &&
>  			bfqq->bic->bfqq[0]->next_rq ?
>  			bfqq->bic->bfqq[0] : NULL;
> +		struct bfq_queue *blocked_bfqq =
> +			!hlist_empty(&bfqq->woken_list) ?
> +			container_of(bfqq->woken_list.first,
> +				     struct bfq_queue,
> +				     woken_list_node)
> +			: NULL;

hlist_first_entry_or_null?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH BUGFIX/IMPROVEMENT 2/6] block, bfq: put reqs of waker and woken in dispatch list
  2021-01-26 10:50 ` [PATCH BUGFIX/IMPROVEMENT 2/6] block, bfq: put reqs of waker and woken in dispatch list Paolo Valente
@ 2021-01-26 16:18   ` Jens Axboe
  2021-01-28 17:54     ` Paolo Valente
  0 siblings, 1 reply; 20+ messages in thread
From: Jens Axboe @ 2021-01-26 16:18 UTC (permalink / raw)
  To: Paolo Valente; +Cc: linux-block, linux-kernel, Jan Kara

On 1/26/21 3:50 AM, Paolo Valente wrote:
> Consider a new I/O request that arrives for a bfq_queue bfqq. If, when
> this happens, the only active bfq_queues are bfqq and either its waker
> bfq_queue or one of its woken bfq_queues, then there is no point in
> queueing this new I/O request in bfqq for service. In fact, the
> in-service queue and bfqq agree on serving this new I/O request as
> soon as possible. So this commit puts this new I/O request directly
> into the dispatch list.
> 
> Tested-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
> ---
>  block/bfq-iosched.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index a83149407336..e5b83910fbe0 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -5640,7 +5640,22 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
>  
>  	spin_lock_irq(&bfqd->lock);
>  	bfqq = bfq_init_rq(rq);
> -	if (!bfqq || at_head || blk_rq_is_passthrough(rq)) {
> +
> +	/*
> +	 * Additional case for putting rq directly into the dispatch
> +	 * queue: the only active bfq_queues are bfqq and either its
> +	 * waker bfq_queue or one of its woken bfq_queues. In this
> +	 * case, there is no point in queueing rq in bfqq for
> +	 * service. In fact, the in-service queue and bfqq agree on
> +	 * serving this new I/O request as soon as possible.
> +	 */
> +	if (!bfqq ||
> +	    (bfqq != bfqd->in_service_queue &&
> +	     bfqd->in_service_queue != NULL &&
> +	     bfq_tot_busy_queues(bfqd) == 1 + bfq_bfqq_busy(bfqq) &&
> +	     (bfqq->waker_bfqq == bfqd->in_service_queue ||
> +	      bfqd->in_service_queue->waker_bfqq == bfqq)) ||
> +	    at_head || blk_rq_is_passthrough(rq)) {
>  		if (at_head)
>  			list_add(&rq->queuelist, &bfqd->dispatch);
>  		else
> 

This is unreadable... Just seems like you are piling heuristics in to
catch some case, and it's neither readable nor clean.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH BUGFIX/IMPROVEMENT 6/6] block, bfq: merge bursts of newly-created queues
  2021-01-26 10:51 ` [PATCH BUGFIX/IMPROVEMENT 6/6] block, bfq: merge bursts of newly-created queues Paolo Valente
  2021-01-26 16:15   ` Jens Axboe
@ 2021-01-27  7:34   ` kernel test robot
  2021-01-27  9:52   ` kernel test robot
  2 siblings, 0 replies; 20+ messages in thread
From: kernel test robot @ 2021-01-27  7:34 UTC (permalink / raw)
  To: Paolo Valente, Jens Axboe
  Cc: kbuild-all, linux-block, linux-kernel, Paolo Valente, Jan Kara

[-- Attachment #1: Type: text/plain, Size: 2691 bytes --]

Hi Paolo,

I love your patch! Perhaps something to improve:

[auto build test WARNING on block/for-next]
[cannot apply to v5.11-rc5 next-20210125]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Paolo-Valente/block-bfq-third-and-last-batch-of-fixes-and-improvements/20210127-090045
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
config: nds32-randconfig-r033-20210126 (attached as .config)
compiler: nds32le-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/2d32d2f7e624f94a180d6a6acfeea65c0c511fe1
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Paolo-Valente/block-bfq-third-and-last-batch-of-fixes-and-improvements/20210127-090045
        git checkout 2d32d2f7e624f94a180d6a6acfeea65c0c511fe1
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=nds32 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> block/bfq-iosched.c:5340:1: warning: no previous prototype for 'bfq_do_early_stable_merge' [-Wmissing-prototypes]
    5340 | bfq_do_early_stable_merge(struct bfq_data *bfqd, struct bfq_queue *bfqq,
         | ^~~~~~~~~~~~~~~~~~~~~~~~~


vim +/bfq_do_early_stable_merge +5340 block/bfq-iosched.c

  5338	
  5339	struct bfq_queue *
> 5340	bfq_do_early_stable_merge(struct bfq_data *bfqd, struct bfq_queue *bfqq,
  5341				  struct bfq_io_cq *bic,
  5342				  struct bfq_queue *last_bfqq_created)
  5343	{
  5344		struct bfq_queue *new_bfqq =
  5345			bfq_setup_merge(bfqq, last_bfqq_created);
  5346	
  5347		if (!new_bfqq)
  5348			return bfqq;
  5349	
  5350		if (new_bfqq->bic)
  5351			new_bfqq->bic->stably_merged = true;
  5352		bic->stably_merged = true;
  5353	
  5354		/*
  5355		 * Reusing merge functions. This implies that
  5356		 * bfqq->bic must be set too, for
  5357		 * bfq_merge_bfqqs to correctly save bfqq's
  5358		 * state before killing it.
  5359		 */
  5360		bfqq->bic = bic;
  5361		bfq_merge_bfqqs(bfqd, bic, bfqq, new_bfqq);
  5362	
  5363		return new_bfqq;
  5364	}
  5365	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 28502 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH BUGFIX/IMPROVEMENT 6/6] block, bfq: merge bursts of newly-created queues
  2021-01-26 10:51 ` [PATCH BUGFIX/IMPROVEMENT 6/6] block, bfq: merge bursts of newly-created queues Paolo Valente
  2021-01-26 16:15   ` Jens Axboe
  2021-01-27  7:34   ` kernel test robot
@ 2021-01-27  9:52   ` kernel test robot
  2 siblings, 0 replies; 20+ messages in thread
From: kernel test robot @ 2021-01-27  9:52 UTC (permalink / raw)
  To: Paolo Valente, Jens Axboe
  Cc: kbuild-all, clang-built-linux, linux-block, linux-kernel,
	Paolo Valente, Jan Kara

[-- Attachment #1: Type: text/plain, Size: 3035 bytes --]

Hi Paolo,

I love your patch! Perhaps something to improve:

[auto build test WARNING on block/for-next]
[cannot apply to v5.11-rc5 next-20210125]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Paolo-Valente/block-bfq-third-and-last-batch-of-fixes-and-improvements/20210127-090045
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
config: x86_64-randconfig-a003-20210126 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 74784a5aa47bb8967e5868831e359fa631abe465)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        # https://github.com/0day-ci/linux/commit/2d32d2f7e624f94a180d6a6acfeea65c0c511fe1
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Paolo-Valente/block-bfq-third-and-last-batch-of-fixes-and-improvements/20210127-090045
        git checkout 2d32d2f7e624f94a180d6a6acfeea65c0c511fe1
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> block/bfq-iosched.c:5340:1: warning: no previous prototype for function 'bfq_do_early_stable_merge' [-Wmissing-prototypes]
   bfq_do_early_stable_merge(struct bfq_data *bfqd, struct bfq_queue *bfqq,
   ^
   block/bfq-iosched.c:5339:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   struct bfq_queue *
   ^
   static 
   1 warning generated.


vim +/bfq_do_early_stable_merge +5340 block/bfq-iosched.c

  5338	
  5339	struct bfq_queue *
> 5340	bfq_do_early_stable_merge(struct bfq_data *bfqd, struct bfq_queue *bfqq,
  5341				  struct bfq_io_cq *bic,
  5342				  struct bfq_queue *last_bfqq_created)
  5343	{
  5344		struct bfq_queue *new_bfqq =
  5345			bfq_setup_merge(bfqq, last_bfqq_created);
  5346	
  5347		if (!new_bfqq)
  5348			return bfqq;
  5349	
  5350		if (new_bfqq->bic)
  5351			new_bfqq->bic->stably_merged = true;
  5352		bic->stably_merged = true;
  5353	
  5354		/*
  5355		 * Reusing merge functions. This implies that
  5356		 * bfqq->bic must be set too, for
  5357		 * bfq_merge_bfqqs to correctly save bfqq's
  5358		 * state before killing it.
  5359		 */
  5360		bfqq->bic = bic;
  5361		bfq_merge_bfqqs(bfqd, bic, bfqq, new_bfqq);
  5362	
  5363		return new_bfqq;
  5364	}
  5365	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 41797 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH BUGFIX/IMPROVEMENT 2/6] block, bfq: put reqs of waker and woken in dispatch list
  2021-01-26 16:18   ` Jens Axboe
@ 2021-01-28 17:54     ` Paolo Valente
  2021-02-03 11:01       ` Paolo Valente
  2021-02-03 11:43       ` Jan Kara
  0 siblings, 2 replies; 20+ messages in thread
From: Paolo Valente @ 2021-01-28 17:54 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Linux Kernel Mailing List, Jan Kara



> Il giorno 26 gen 2021, alle ore 17:18, Jens Axboe <axboe@kernel.dk> ha scritto:
> 
> On 1/26/21 3:50 AM, Paolo Valente wrote:
>> Consider a new I/O request that arrives for a bfq_queue bfqq. If, when
>> this happens, the only active bfq_queues are bfqq and either its waker
>> bfq_queue or one of its woken bfq_queues, then there is no point in
>> queueing this new I/O request in bfqq for service. In fact, the
>> in-service queue and bfqq agree on serving this new I/O request as
>> soon as possible. So this commit puts this new I/O request directly
>> into the dispatch list.
>> 
>> Tested-by: Jan Kara <jack@suse.cz>
>> Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
>> ---
>> block/bfq-iosched.c | 17 ++++++++++++++++-
>> 1 file changed, 16 insertions(+), 1 deletion(-)
>> 
>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>> index a83149407336..e5b83910fbe0 100644
>> --- a/block/bfq-iosched.c
>> +++ b/block/bfq-iosched.c
>> @@ -5640,7 +5640,22 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
>> 
>> 	spin_lock_irq(&bfqd->lock);
>> 	bfqq = bfq_init_rq(rq);
>> -	if (!bfqq || at_head || blk_rq_is_passthrough(rq)) {
>> +
>> +	/*
>> +	 * Additional case for putting rq directly into the dispatch
>> +	 * queue: the only active bfq_queues are bfqq and either its
>> +	 * waker bfq_queue or one of its woken bfq_queues. In this
>> +	 * case, there is no point in queueing rq in bfqq for
>> +	 * service. In fact, the in-service queue and bfqq agree on
>> +	 * serving this new I/O request as soon as possible.
>> +	 */
>> +	if (!bfqq ||
>> +	    (bfqq != bfqd->in_service_queue &&
>> +	     bfqd->in_service_queue != NULL &&
>> +	     bfq_tot_busy_queues(bfqd) == 1 + bfq_bfqq_busy(bfqq) &&
>> +	     (bfqq->waker_bfqq == bfqd->in_service_queue ||
>> +	      bfqd->in_service_queue->waker_bfqq == bfqq)) ||
>> +	    at_head || blk_rq_is_passthrough(rq)) {
>> 		if (at_head)
>> 			list_add(&rq->queuelist, &bfqd->dispatch);
>> 		else
>> 
> 
> This is unreadable... Just seems like you are piling heuristics in to
> catch some case, and it's neither readable nor clean.
> 

Yeah, these comments inappropriately assume that the reader knows the
waker mechanism in depth.  And they do not stress at all how important
this improvement is.

I'll do my best to improve these comments.

To try to do a better job, let me also explain the matter early here.
Maybe you or others can give me some early feedback (or just tell me
to proceed).

This change is one of the main improvements that boosted
throughput in Jan's tests.  Here is the rationale:
- consider a bfq_queue, say Q1, detected as a waker of another
  bfq_queue, say Q2
- by definition of a waker, Q1 blocks the I/O of Q2, i.e., some I/O of
  of Q1 needs to be completed for new I/O of Q1 to arrive.  A notable
  example is journald
- so, Q1 and Q2 are in any respect two cooperating processes: if the
  service of Q1's I/O is delayed, Q2 can only suffer from it.
  Conversely, if Q2's I/O is delayed, the purpose of Q1 is just defeated.
- as a consequence if some I/O of Q1/Q2 arrives while Q2/Q1 is the
  only queue in service, there is absolutely no point in delaying the
  service of such an I/O.  The only possible result is a throughput
  loss, detected by Jan's test
- so, when the above condition holds, the most effective and efficient
  action is to put the new I/O directly in the dispatch list
- as an additional restriction, Q1 and Q2 must be the only busy queues
  for this commit to put the I/O of Q2/Q1 in the dispatch list.  This is
  necessary, because, if also other queues are waiting for service, then
  putting new I/O directly in the dispatch list may evidently cause a
  violation of service guarantees for the other queues

If these comments make things clearer, then I'll put them in the
commit message and the code, and I'll proceed with a V2.

Thanks,
Paolo


> -- 
> Jens Axboe
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH BUGFIX/IMPROVEMENT 2/6] block, bfq: put reqs of waker and woken in dispatch list
  2021-01-28 17:54     ` Paolo Valente
@ 2021-02-03 11:01       ` Paolo Valente
  2021-02-03 11:43       ` Jan Kara
  1 sibling, 0 replies; 20+ messages in thread
From: Paolo Valente @ 2021-02-03 11:01 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Linux Kernel Mailing List, Jan Kara



> Il giorno 28 gen 2021, alle ore 18:54, Paolo Valente <paolo.valente@linaro.org> ha scritto:
> 
> 
> 
>> Il giorno 26 gen 2021, alle ore 17:18, Jens Axboe <axboe@kernel.dk> ha scritto:
>> 
>> On 1/26/21 3:50 AM, Paolo Valente wrote:
>>> Consider a new I/O request that arrives for a bfq_queue bfqq. If, when
>>> this happens, the only active bfq_queues are bfqq and either its waker
>>> bfq_queue or one of its woken bfq_queues, then there is no point in
>>> queueing this new I/O request in bfqq for service. In fact, the
>>> in-service queue and bfqq agree on serving this new I/O request as
>>> soon as possible. So this commit puts this new I/O request directly
>>> into the dispatch list.
>>> 
>>> Tested-by: Jan Kara <jack@suse.cz>
>>> Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
>>> ---
>>> block/bfq-iosched.c | 17 ++++++++++++++++-
>>> 1 file changed, 16 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>>> index a83149407336..e5b83910fbe0 100644
>>> --- a/block/bfq-iosched.c
>>> +++ b/block/bfq-iosched.c
>>> @@ -5640,7 +5640,22 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
>>> 
>>> 	spin_lock_irq(&bfqd->lock);
>>> 	bfqq = bfq_init_rq(rq);
>>> -	if (!bfqq || at_head || blk_rq_is_passthrough(rq)) {
>>> +
>>> +	/*
>>> +	 * Additional case for putting rq directly into the dispatch
>>> +	 * queue: the only active bfq_queues are bfqq and either its
>>> +	 * waker bfq_queue or one of its woken bfq_queues. In this
>>> +	 * case, there is no point in queueing rq in bfqq for
>>> +	 * service. In fact, the in-service queue and bfqq agree on
>>> +	 * serving this new I/O request as soon as possible.
>>> +	 */
>>> +	if (!bfqq ||
>>> +	    (bfqq != bfqd->in_service_queue &&
>>> +	     bfqd->in_service_queue != NULL &&
>>> +	     bfq_tot_busy_queues(bfqd) == 1 + bfq_bfqq_busy(bfqq) &&
>>> +	     (bfqq->waker_bfqq == bfqd->in_service_queue ||
>>> +	      bfqd->in_service_queue->waker_bfqq == bfqq)) ||
>>> +	    at_head || blk_rq_is_passthrough(rq)) {
>>> 		if (at_head)
>>> 			list_add(&rq->queuelist, &bfqd->dispatch);
>>> 		else
>>> 
>> 
>> This is unreadable... Just seems like you are piling heuristics in to
>> catch some case, and it's neither readable nor clean.
>> 
> 
> Yeah, these comments inappropriately assume that the reader knows the
> waker mechanism in depth.  And they do not stress at all how important
> this improvement is.
> 
> I'll do my best to improve these comments.
> 
> To try to do a better job, let me also explain the matter early here.
> Maybe you or others can give me some early feedback (or just tell me
> to proceed).
> 
> This change is one of the main improvements that boosted
> throughput in Jan's tests.  Here is the rationale:
> - consider a bfq_queue, say Q1, detected as a waker of another
>  bfq_queue, say Q2
> - by definition of a waker, Q1 blocks the I/O of Q2, i.e., some I/O of
>  of Q1 needs to be completed for new I/O of Q1 to arrive.  A notable
>  example is journald
> - so, Q1 and Q2 are in any respect two cooperating processes: if the
>  service of Q1's I/O is delayed, Q2 can only suffer from it.
>  Conversely, if Q2's I/O is delayed, the purpose of Q1 is just defeated.
> - as a consequence if some I/O of Q1/Q2 arrives while Q2/Q1 is the
>  only queue in service, there is absolutely no point in delaying the
>  service of such an I/O.  The only possible result is a throughput
>  loss, detected by Jan's test
> - so, when the above condition holds, the most effective and efficient
>  action is to put the new I/O directly in the dispatch list
> - as an additional restriction, Q1 and Q2 must be the only busy queues
>  for this commit to put the I/O of Q2/Q1 in the dispatch list.  This is
>  necessary, because, if also other queues are waiting for service, then
>  putting new I/O directly in the dispatch list may evidently cause a
>  violation of service guarantees for the other queues
> 
> If these comments make things clearer, then I'll put them in the
> commit message and the code, and I'll proceed with a V2.
> 

Hi Jens,
may I proceed with a V2?

Thanks,
Paolo

> Thanks,
> Paolo
> 
> 
>> -- 
>> Jens Axboe


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH BUGFIX/IMPROVEMENT 2/6] block, bfq: put reqs of waker and woken in dispatch list
  2021-01-28 17:54     ` Paolo Valente
  2021-02-03 11:01       ` Paolo Valente
@ 2021-02-03 11:43       ` Jan Kara
  2021-02-05 10:16         ` Paolo Valente
  1 sibling, 1 reply; 20+ messages in thread
From: Jan Kara @ 2021-02-03 11:43 UTC (permalink / raw)
  To: Paolo Valente
  Cc: Jens Axboe, linux-block, Linux Kernel Mailing List, Jan Kara

On Thu 28-01-21 18:54:05, Paolo Valente wrote:
> 
> 
> > Il giorno 26 gen 2021, alle ore 17:18, Jens Axboe <axboe@kernel.dk> ha scritto:
> > 
> > On 1/26/21 3:50 AM, Paolo Valente wrote:
> >> Consider a new I/O request that arrives for a bfq_queue bfqq. If, when
> >> this happens, the only active bfq_queues are bfqq and either its waker
> >> bfq_queue or one of its woken bfq_queues, then there is no point in
> >> queueing this new I/O request in bfqq for service. In fact, the
> >> in-service queue and bfqq agree on serving this new I/O request as
> >> soon as possible. So this commit puts this new I/O request directly
> >> into the dispatch list.
> >> 
> >> Tested-by: Jan Kara <jack@suse.cz>
> >> Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
> >> ---
> >> block/bfq-iosched.c | 17 ++++++++++++++++-
> >> 1 file changed, 16 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> >> index a83149407336..e5b83910fbe0 100644
> >> --- a/block/bfq-iosched.c
> >> +++ b/block/bfq-iosched.c
> >> @@ -5640,7 +5640,22 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
> >> 
> >> 	spin_lock_irq(&bfqd->lock);
> >> 	bfqq = bfq_init_rq(rq);
> >> -	if (!bfqq || at_head || blk_rq_is_passthrough(rq)) {
> >> +
> >> +	/*
> >> +	 * Additional case for putting rq directly into the dispatch
> >> +	 * queue: the only active bfq_queues are bfqq and either its
> >> +	 * waker bfq_queue or one of its woken bfq_queues. In this
> >> +	 * case, there is no point in queueing rq in bfqq for
> >> +	 * service. In fact, the in-service queue and bfqq agree on
> >> +	 * serving this new I/O request as soon as possible.
> >> +	 */
> >> +	if (!bfqq ||
> >> +	    (bfqq != bfqd->in_service_queue &&
> >> +	     bfqd->in_service_queue != NULL &&
> >> +	     bfq_tot_busy_queues(bfqd) == 1 + bfq_bfqq_busy(bfqq) &&
> >> +	     (bfqq->waker_bfqq == bfqd->in_service_queue ||
> >> +	      bfqd->in_service_queue->waker_bfqq == bfqq)) ||
> >> +	    at_head || blk_rq_is_passthrough(rq)) {
> >> 		if (at_head)
> >> 			list_add(&rq->queuelist, &bfqd->dispatch);
> >> 		else
> >> 
> > 
> > This is unreadable... Just seems like you are piling heuristics in to
> > catch some case, and it's neither readable nor clean.
> > 
> 
> Yeah, these comments inappropriately assume that the reader knows the
> waker mechanism in depth.  And they do not stress at all how important
> this improvement is.
> 
> I'll do my best to improve these comments.
> 
> To try to do a better job, let me also explain the matter early here.
> Maybe you or others can give me some early feedback (or just tell me
> to proceed).
> 
> This change is one of the main improvements that boosted
> throughput in Jan's tests.  Here is the rationale:
> - consider a bfq_queue, say Q1, detected as a waker of another
>   bfq_queue, say Q2
> - by definition of a waker, Q1 blocks the I/O of Q2, i.e., some I/O of
>   of Q1 needs to be completed for new I/O of Q1 to arrive.  A notable
					       ^^ Q2?

>   example is journald
> - so, Q1 and Q2 are in any respect two cooperating processes: if the
>   service of Q1's I/O is delayed, Q2 can only suffer from it.
>   Conversely, if Q2's I/O is delayed, the purpose of Q1 is just defeated.

What do you exactly mean by this last sentence?

> - as a consequence if some I/O of Q1/Q2 arrives while Q2/Q1 is the
>   only queue in service, there is absolutely no point in delaying the
>   service of such an I/O.  The only possible result is a throughput
>   loss, detected by Jan's test

If we are idling at that moment waiting for more IO from in service queue,
I agree. But that doesn't seem to be part of your condition above?

> - so, when the above condition holds, the most effective and efficient
>   action is to put the new I/O directly in the dispatch list
> - as an additional restriction, Q1 and Q2 must be the only busy queues
>   for this commit to put the I/O of Q2/Q1 in the dispatch list.  This is
>   necessary, because, if also other queues are waiting for service, then
>   putting new I/O directly in the dispatch list may evidently cause a
>   violation of service guarantees for the other queues

This last restriction is not ideal for cases like jbd2 thread since it may
still lead to pointless idling but I understand that without some
restriction like this several waking threads could just starve other ones.
So I guess it's fine for now.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH BUGFIX/IMPROVEMENT 5/6] block, bfq: keep shared queues out of the waker mechanism
  2021-01-26 10:51 ` [PATCH BUGFIX/IMPROVEMENT 5/6] block, bfq: keep shared queues out of the waker mechanism Paolo Valente
@ 2021-02-03 11:48   ` Jan Kara
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Kara @ 2021-02-03 11:48 UTC (permalink / raw)
  To: Paolo Valente; +Cc: Jens Axboe, linux-block, linux-kernel, Jan Kara

On Tue 26-01-21 11:51:01, Paolo Valente wrote:
> Shared queues are likely to receive I/O at a high rate. This may
> deceptively let them be considered as wakers of other queues. But a
> false waker will unjustly steal bandwidth to its supposedly woken
> queue. So considering also shared queues in the waking mechanism may
> cause more control troubles than throughput benefits. This commit
> keeps shared queues out of the waker-detection mechanism.
> 
> Tested-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Paolo Valente <paolo.valente@linaro.org>

Honestly this makes me somewhat nervous. It is just a band aid for a fact
that waker detection is unreliable? There's nothing which prevents
non-shared queue to submit high amounts of IO (e.g. when it uses AIO) as
well as there's nothing which says that shared queues have no wakers (e.g.
jbd2 thread can easily be a waker for a shared queue).

								Honza

> ---
>  block/bfq-iosched.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index 0c7e203085f1..23d0dd7bd90f 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -5825,7 +5825,17 @@ static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)
>  			1UL<<(BFQ_RATE_SHIFT - 10))
>  		bfq_update_rate_reset(bfqd, NULL);
>  	bfqd->last_completion = now_ns;
> -	bfqd->last_completed_rq_bfqq = bfqq;
> +	/*
> +	 * Shared queues are likely to receive I/O at a high
> +	 * rate. This may deceptively let them be considered as wakers
> +	 * of other queues. But a false waker will unjustly steal
> +	 * bandwidth to its supposedly woken queue. So considering
> +	 * also shared queues in the waking mechanism may cause more
> +	 * control troubles than throughput benefits. Then do not set
> +	 * last_completed_rq_bfqq to bfqq if bfqq is a shared queue.
> +	 */
> +	if (!bfq_bfqq_coop(bfqq))
> +		bfqd->last_completed_rq_bfqq = bfqq;
>  
>  	/*
>  	 * If we are waiting to discover whether the request pattern
> -- 
> 2.20.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH BUGFIX/IMPROVEMENT 2/6] block, bfq: put reqs of waker and woken in dispatch list
  2021-02-03 11:43       ` Jan Kara
@ 2021-02-05 10:16         ` Paolo Valente
  2021-02-09 17:09           ` Paolo Valente
  0 siblings, 1 reply; 20+ messages in thread
From: Paolo Valente @ 2021-02-05 10:16 UTC (permalink / raw)
  To: Jan Kara; +Cc: Jens Axboe, linux-block, Linux Kernel Mailing List



> Il giorno 3 feb 2021, alle ore 12:43, Jan Kara <jack@suse.cz> ha scritto:
> 
> On Thu 28-01-21 18:54:05, Paolo Valente wrote:
>> 
>> 
>>> Il giorno 26 gen 2021, alle ore 17:18, Jens Axboe <axboe@kernel.dk> ha scritto:
>>> 
>>> On 1/26/21 3:50 AM, Paolo Valente wrote:
>>>> Consider a new I/O request that arrives for a bfq_queue bfqq. If, when
>>>> this happens, the only active bfq_queues are bfqq and either its waker
>>>> bfq_queue or one of its woken bfq_queues, then there is no point in
>>>> queueing this new I/O request in bfqq for service. In fact, the
>>>> in-service queue and bfqq agree on serving this new I/O request as
>>>> soon as possible. So this commit puts this new I/O request directly
>>>> into the dispatch list.
>>>> 
>>>> Tested-by: Jan Kara <jack@suse.cz>
>>>> Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
>>>> ---
>>>> block/bfq-iosched.c | 17 ++++++++++++++++-
>>>> 1 file changed, 16 insertions(+), 1 deletion(-)
>>>> 
>>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>>>> index a83149407336..e5b83910fbe0 100644
>>>> --- a/block/bfq-iosched.c
>>>> +++ b/block/bfq-iosched.c
>>>> @@ -5640,7 +5640,22 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
>>>> 
>>>> 	spin_lock_irq(&bfqd->lock);
>>>> 	bfqq = bfq_init_rq(rq);
>>>> -	if (!bfqq || at_head || blk_rq_is_passthrough(rq)) {
>>>> +
>>>> +	/*
>>>> +	 * Additional case for putting rq directly into the dispatch
>>>> +	 * queue: the only active bfq_queues are bfqq and either its
>>>> +	 * waker bfq_queue or one of its woken bfq_queues. In this
>>>> +	 * case, there is no point in queueing rq in bfqq for
>>>> +	 * service. In fact, the in-service queue and bfqq agree on
>>>> +	 * serving this new I/O request as soon as possible.
>>>> +	 */
>>>> +	if (!bfqq ||
>>>> +	    (bfqq != bfqd->in_service_queue &&
>>>> +	     bfqd->in_service_queue != NULL &&
>>>> +	     bfq_tot_busy_queues(bfqd) == 1 + bfq_bfqq_busy(bfqq) &&
>>>> +	     (bfqq->waker_bfqq == bfqd->in_service_queue ||
>>>> +	      bfqd->in_service_queue->waker_bfqq == bfqq)) ||
>>>> +	    at_head || blk_rq_is_passthrough(rq)) {
>>>> 		if (at_head)
>>>> 			list_add(&rq->queuelist, &bfqd->dispatch);
>>>> 		else
>>>> 
>>> 
>>> This is unreadable... Just seems like you are piling heuristics in to
>>> catch some case, and it's neither readable nor clean.
>>> 
>> 
>> Yeah, these comments inappropriately assume that the reader knows the
>> waker mechanism in depth.  And they do not stress at all how important
>> this improvement is.
>> 
>> I'll do my best to improve these comments.
>> 
>> To try to do a better job, let me also explain the matter early here.
>> Maybe you or others can give me some early feedback (or just tell me
>> to proceed).
>> 
>> This change is one of the main improvements that boosted
>> throughput in Jan's tests.  Here is the rationale:
>> - consider a bfq_queue, say Q1, detected as a waker of another
>>  bfq_queue, say Q2
>> - by definition of a waker, Q1 blocks the I/O of Q2, i.e., some I/O of
>>  of Q1 needs to be completed for new I/O of Q1 to arrive.  A notable
> 					       ^^ Q2?
> 

Yes, thank you!

(after this interaction, I'll fix and improve all this description,
according to your comments)

>>  example is journald
>> - so, Q1 and Q2 are in any respect two cooperating processes: if the
>>  service of Q1's I/O is delayed, Q2 can only suffer from it.
>>  Conversely, if Q2's I/O is delayed, the purpose of Q1 is just defeated.
> 
> What do you exactly mean by this last sentence?

By definition of waker, the purpose of Q1's I/O is doing what needs to
be done, so that new Q2's I/O can finally be issued.  Delaying Q2's I/O
is the opposite of this goal.

> 
>> - as a consequence if some I/O of Q1/Q2 arrives while Q2/Q1 is the
>>  only queue in service, there is absolutely no point in delaying the
>>  service of such an I/O.  The only possible result is a throughput
>>  loss, detected by Jan's test
> 
> If we are idling at that moment waiting for more IO from in service queue,
> I agree.

And I agree too, if the drive has no internal queueing, has no
parallelism or pipeline, or is at least one order of magnitude slower
than the CPU is processing I/O.  In all other cases, serving the I/O
of only one queue at a time means throwing away throughput.  For
example, on a consumer SSD, moving from one to two I/O threads served
in parallel usually means doubling the throughput.

So, the best thing to do, if all the above conditions are met, is to
have this new I/O dispatched as soon as possible.

The most efficient way to attain this goal is to just put the new I/O
directly into the dispatch list.

> But that doesn't seem to be part of your condition above?
> 
>> - so, when the above condition holds, the most effective and efficient
>>  action is to put the new I/O directly in the dispatch list
>> - as an additional restriction, Q1 and Q2 must be the only busy queues
>>  for this commit to put the I/O of Q2/Q1 in the dispatch list.  This is
>>  necessary, because, if also other queues are waiting for service, then
>>  putting new I/O directly in the dispatch list may evidently cause a
>>  violation of service guarantees for the other queues
> 
> This last restriction is not ideal for cases like jbd2 thread since it may
> still lead to pointless idling but I understand that without some
> restriction like this several waking threads could just starve other ones.

Yeah, the goal here is to reduce a little bit false positives.

> So I guess it's fine for now.
> 

Yes, hopefully experience will lead us to even improvements or even
better solutions.

Thanks,
Paolo

> 								Honza
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH BUGFIX/IMPROVEMENT 2/6] block, bfq: put reqs of waker and woken in dispatch list
  2021-02-05 10:16         ` Paolo Valente
@ 2021-02-09 17:09           ` Paolo Valente
  0 siblings, 0 replies; 20+ messages in thread
From: Paolo Valente @ 2021-02-09 17:09 UTC (permalink / raw)
  To: Jan Kara; +Cc: Jens Axboe, linux-block, Linux Kernel Mailing List



> Il giorno 5 feb 2021, alle ore 11:16, Paolo Valente <paolo.valente@linaro.org> ha scritto:
> 
> 
> 
>> Il giorno 3 feb 2021, alle ore 12:43, Jan Kara <jack@suse.cz> ha scritto:
>> 
>> On Thu 28-01-21 18:54:05, Paolo Valente wrote:
>>> 
>>> 
>>>> Il giorno 26 gen 2021, alle ore 17:18, Jens Axboe <axboe@kernel.dk> ha scritto:
>>>> 
>>>> On 1/26/21 3:50 AM, Paolo Valente wrote:
>>>>> Consider a new I/O request that arrives for a bfq_queue bfqq. If, when
>>>>> this happens, the only active bfq_queues are bfqq and either its waker
>>>>> bfq_queue or one of its woken bfq_queues, then there is no point in
>>>>> queueing this new I/O request in bfqq for service. In fact, the
>>>>> in-service queue and bfqq agree on serving this new I/O request as
>>>>> soon as possible. So this commit puts this new I/O request directly
>>>>> into the dispatch list.
>>>>> 
>>>>> Tested-by: Jan Kara <jack@suse.cz>
>>>>> Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
>>>>> ---
>>>>> block/bfq-iosched.c | 17 ++++++++++++++++-
>>>>> 1 file changed, 16 insertions(+), 1 deletion(-)
>>>>> 
>>>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>>>>> index a83149407336..e5b83910fbe0 100644
>>>>> --- a/block/bfq-iosched.c
>>>>> +++ b/block/bfq-iosched.c
>>>>> @@ -5640,7 +5640,22 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
>>>>> 
>>>>> 	spin_lock_irq(&bfqd->lock);
>>>>> 	bfqq = bfq_init_rq(rq);
>>>>> -	if (!bfqq || at_head || blk_rq_is_passthrough(rq)) {
>>>>> +
>>>>> +	/*
>>>>> +	 * Additional case for putting rq directly into the dispatch
>>>>> +	 * queue: the only active bfq_queues are bfqq and either its
>>>>> +	 * waker bfq_queue or one of its woken bfq_queues. In this
>>>>> +	 * case, there is no point in queueing rq in bfqq for
>>>>> +	 * service. In fact, the in-service queue and bfqq agree on
>>>>> +	 * serving this new I/O request as soon as possible.
>>>>> +	 */
>>>>> +	if (!bfqq ||
>>>>> +	    (bfqq != bfqd->in_service_queue &&
>>>>> +	     bfqd->in_service_queue != NULL &&
>>>>> +	     bfq_tot_busy_queues(bfqd) == 1 + bfq_bfqq_busy(bfqq) &&
>>>>> +	     (bfqq->waker_bfqq == bfqd->in_service_queue ||
>>>>> +	      bfqd->in_service_queue->waker_bfqq == bfqq)) ||
>>>>> +	    at_head || blk_rq_is_passthrough(rq)) {
>>>>> 		if (at_head)
>>>>> 			list_add(&rq->queuelist, &bfqd->dispatch);
>>>>> 		else
>>>>> 
>>>> 
>>>> This is unreadable... Just seems like you are piling heuristics in to
>>>> catch some case, and it's neither readable nor clean.
>>>> 
>>> 
>>> Yeah, these comments inappropriately assume that the reader knows the
>>> waker mechanism in depth.  And they do not stress at all how important
>>> this improvement is.
>>> 
>>> I'll do my best to improve these comments.
>>> 
>>> To try to do a better job, let me also explain the matter early here.
>>> Maybe you or others can give me some early feedback (or just tell me
>>> to proceed).
>>> 
>>> This change is one of the main improvements that boosted
>>> throughput in Jan's tests.  Here is the rationale:
>>> - consider a bfq_queue, say Q1, detected as a waker of another
>>> bfq_queue, say Q2
>>> - by definition of a waker, Q1 blocks the I/O of Q2, i.e., some I/O of
>>> of Q1 needs to be completed for new I/O of Q1 to arrive.  A notable
>> 					       ^^ Q2?
>> 
> 
> Yes, thank you!
> 
> (after this interaction, I'll fix and improve all this description,
> according to your comments)
> 
>>> example is journald
>>> - so, Q1 and Q2 are in any respect two cooperating processes: if the
>>> service of Q1's I/O is delayed, Q2 can only suffer from it.
>>> Conversely, if Q2's I/O is delayed, the purpose of Q1 is just defeated.
>> 
>> What do you exactly mean by this last sentence?
> 
> By definition of waker, the purpose of Q1's I/O is doing what needs to
> be done, so that new Q2's I/O can finally be issued.  Delaying Q2's I/O
> is the opposite of this goal.
> 
>> 
>>> - as a consequence if some I/O of Q1/Q2 arrives while Q2/Q1 is the
>>> only queue in service, there is absolutely no point in delaying the
>>> service of such an I/O.  The only possible result is a throughput
>>> loss, detected by Jan's test
>> 
>> If we are idling at that moment waiting for more IO from in service queue,
>> I agree.
> 
> And I agree too, if the drive has no internal queueing, has no
> parallelism or pipeline, or is at least one order of magnitude slower
> than the CPU is processing I/O.  In all other cases, serving the I/O
> of only one queue at a time means throwing away throughput.  For
> example, on a consumer SSD, moving from one to two I/O threads served
> in parallel usually means doubling the throughput.
> 
> So, the best thing to do, if all the above conditions are met, is to
> have this new I/O dispatched as soon as possible.
> 
> The most efficient way to attain this goal is to just put the new I/O
> directly into the dispatch list.
> 
>> But that doesn't seem to be part of your condition above?
>> 
>>> - so, when the above condition holds, the most effective and efficient
>>> action is to put the new I/O directly in the dispatch list
>>> - as an additional restriction, Q1 and Q2 must be the only busy queues
>>> for this commit to put the I/O of Q2/Q1 in the dispatch list.  This is
>>> necessary, because, if also other queues are waiting for service, then
>>> putting new I/O directly in the dispatch list may evidently cause a
>>> violation of service guarantees for the other queues
>> 
>> This last restriction is not ideal for cases like jbd2 thread since it may
>> still lead to pointless idling but I understand that without some
>> restriction like this several waking threads could just starve other ones.
> 
> Yeah, the goal here is to reduce a little bit false positives.
> 
>> So I guess it's fine for now.
>> 
> 
> Yes, hopefully experience will lead us to even improvements or even
> better solutions.
> 

Hi Jens,
on a separate thread, Jan told me that my last reply, and therefore
also this patch are ok for him.  May I now proceed with a V2, in which
I'll report my extra comments? Or are there some other issues for you?

Thanks,
Paolo

> Thanks,
> Paolo
> 
>> 								Honza
>> -- 
>> Jan Kara <jack@suse.com>
>> SUSE Labs, CR


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH BUGFIX/IMPROVEMENT 1/6] block, bfq: always inject I/O of queues blocked by wakers
  2021-01-26 16:17   ` Jens Axboe
@ 2021-02-25 15:58     ` Paolo Valente
  0 siblings, 0 replies; 20+ messages in thread
From: Paolo Valente @ 2021-02-25 15:58 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Linux Kernel Mailing List, Jan Kara



> Il giorno 26 gen 2021, alle ore 17:17, Jens Axboe <axboe@kernel.dk> ha scritto:
> 
> On 1/26/21 3:50 AM, Paolo Valente wrote:
>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>> index 445cef9c0bb9..a83149407336 100644
>> --- a/block/bfq-iosched.c
>> +++ b/block/bfq-iosched.c
>> @@ -4487,9 +4487,15 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
>> 			bfq_bfqq_busy(bfqq->bic->bfqq[0]) &&
>> 			bfqq->bic->bfqq[0]->next_rq ?
>> 			bfqq->bic->bfqq[0] : NULL;
>> +		struct bfq_queue *blocked_bfqq =
>> +			!hlist_empty(&bfqq->woken_list) ?
>> +			container_of(bfqq->woken_list.first,
>> +				     struct bfq_queue,
>> +				     woken_list_node)
>> +			: NULL;
> 
> hlist_first_entry_or_null?
> 

I didn't find any such function.  There is a list_first_entry_or_null,
but it's for circular doubly linked lists.

I'll wait a little bit for your reply, then send a V2 with this patch
unchanged.

Thanks,
Paolo

> -- 
> Jens Axboe
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH BUGFIX/IMPROVEMENT 6/6] block, bfq: merge bursts of newly-created queues
  2021-01-26 16:15   ` Jens Axboe
@ 2021-02-25 17:25     ` Paolo Valente
  0 siblings, 0 replies; 20+ messages in thread
From: Paolo Valente @ 2021-02-25 17:25 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Linux Kernel Mailing List, Jan Kara



> Il giorno 26 gen 2021, alle ore 17:15, Jens Axboe <axboe@kernel.dk> ha scritto:
> 
> On 1/26/21 3:51 AM, Paolo Valente wrote:
>> @@ -2809,6 +2853,12 @@ void bfq_release_process_ref(struct bfq_data *bfqd, struct bfq_queue *bfqq)
>> 	    bfqq != bfqd->in_service_queue)
>> 		bfq_del_bfqq_busy(bfqd, bfqq, false);
>> 
>> +	if (bfqq->entity.parent &&
>> +	    bfqq->entity.parent->last_bfqq_created == bfqq)
>> +		bfqq->entity.parent->last_bfqq_created = NULL;
>> +	else if (bfqq->bfqd && bfqq->bfqd->last_bfqq_created == bfqq)
>> +		bfqq->bfqd->last_bfqq_created = NULL;
>> +
>> 	bfq_put_queue(bfqq);
>> }
>> 
>> @@ -2905,6 +2955,13 @@ bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic,
>> 	 */
>> 	new_bfqq->pid = -1;
>> 	bfqq->bic = NULL;
>> +
>> +	if (bfqq->entity.parent &&
>> +	    bfqq->entity.parent->last_bfqq_created == bfqq)
>> +		bfqq->entity.parent->last_bfqq_created = new_bfqq;
>> +	else if (bfqq->bfqd && bfqq->bfqd->last_bfqq_created == bfqq)
>> +		bfqq->bfqd->last_bfqq_created = new_bfqq;
>> +
>> 	bfq_release_process_ref(bfqd, bfqq);
>> }
> 
> Almost identical code constructs makes it seem like this should have a
> helper instead.
> 

Right, sorry. Improved in V2.

>> @@ -5033,6 +5090,12 @@ void bfq_put_queue(struct bfq_queue *bfqq)
>> 	bfqg_and_blkg_put(bfqg);
>> }
>> 
>> +static void bfq_put_stable_ref(struct bfq_queue *bfqq)
>> +{
>> +	bfqq->stable_ref--;
>> +	bfq_put_queue(bfqq);
>> +}
>> +
>> static void bfq_put_cooperator(struct bfq_queue *bfqq)
>> {
>> 	struct bfq_queue *__bfqq, *next;
>> @@ -5089,6 +5152,17 @@ static void bfq_exit_icq(struct io_cq *icq)
>> {
>> 	struct bfq_io_cq *bic = icq_to_bic(icq);
>> 
>> +	if (bic->stable_merge_bfqq) {
>> +		unsigned long flags;
>> +		struct bfq_data *bfqd = bic->stable_merge_bfqq->bfqd;
>> +
>> +		if (bfqd)
>> +			spin_lock_irqsave(&bfqd->lock, flags);
>> +		bfq_put_stable_ref(bic->stable_merge_bfqq);
>> +		if (bfqd)
>> +			spin_unlock_irqrestore(&bfqd->lock, flags);
>> +	}
>> +
> 
> Construct like this are really painful. Just do:
> 
> if (bfqd) {
> 	unsigned long flags;
> 
> 	spin_lock_irqsave(&bfqd->lock, flags);
> 	bfq_put_stable_ref(bic->stable_merge_bfqq);
> 	spin_unlock_irqrestore(&bfqd->lock, flags);
> } else {
> 	bfq_put_stable_ref(bic->stable_merge_bfqq);
> }
> 
> which is also less likely to cause code analyzer false warnings.

Done, thanks.

> Outside
> of that, it needs a comment on why it's ok NOT to grab the lock when
> bfqd is zero, because that seems counter-intuitive and more a case of
> "well we can't grab a lock for something we don't have". Maybe it's
> because bfqd is no longer visible at this point, and it's ok,

yes

> but it's
> definitely not clear just looking at this patch.

Right, the reason is already reported a few lines above, but not
repeated in this function.  I'll repeat it.


> Even with that, is the
> bfqq visible? Should the ref be atomic, and locking happen further down
> instead?
> 

Since the scheduler is gone, no pending I/O is expected to still
reference bfqq.  I'll write this too in V2.

As I stated in my reply to another comments of yours, I'll submit the
V2 soon, unless I receive a reply before.

Thanks.
Paolo

> -- 
> Jens Axboe
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-02-25 17:25 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-26 10:50 [PATCH BUGFIX/IMPROVEMENT 0/6] block, bfq: third and last batch of fixes and improvements Paolo Valente
2021-01-26 10:50 ` [PATCH BUGFIX/IMPROVEMENT 1/6] block, bfq: always inject I/O of queues blocked by wakers Paolo Valente
2021-01-26 16:17   ` Jens Axboe
2021-02-25 15:58     ` Paolo Valente
2021-01-26 10:50 ` [PATCH BUGFIX/IMPROVEMENT 2/6] block, bfq: put reqs of waker and woken in dispatch list Paolo Valente
2021-01-26 16:18   ` Jens Axboe
2021-01-28 17:54     ` Paolo Valente
2021-02-03 11:01       ` Paolo Valente
2021-02-03 11:43       ` Jan Kara
2021-02-05 10:16         ` Paolo Valente
2021-02-09 17:09           ` Paolo Valente
2021-01-26 10:50 ` [PATCH BUGFIX/IMPROVEMENT 3/6] block, bfq: make shared queues inherit wakers Paolo Valente
2021-01-26 10:51 ` [PATCH BUGFIX/IMPROVEMENT 4/6] block, bfq: fix weight-raising resume with !low_latency Paolo Valente
2021-01-26 10:51 ` [PATCH BUGFIX/IMPROVEMENT 5/6] block, bfq: keep shared queues out of the waker mechanism Paolo Valente
2021-02-03 11:48   ` Jan Kara
2021-01-26 10:51 ` [PATCH BUGFIX/IMPROVEMENT 6/6] block, bfq: merge bursts of newly-created queues Paolo Valente
2021-01-26 16:15   ` Jens Axboe
2021-02-25 17:25     ` Paolo Valente
2021-01-27  7:34   ` kernel test robot
2021-01-27  9:52   ` kernel test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).