* [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
@ 2022-06-23 15:53 Paolo Valente
  2022-06-23 15:53 ` [PATCH 1/8] block, bfq: split sync bfq_queues on a per-actuator basis Paolo Valente
                   ` (8 more replies)
  0 siblings, 9 replies; 21+ messages in thread
From: Paolo Valente @ 2022-06-23 15:53 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-kernel, jack, andrea.righi, glen.valante,
	arie.vanderhoeven, Paolo Valente

Hi,
this patch series extends BFQ so as to optimize I/O dispatch to
multi-actuator drives. In particular, this extension addresses the
following issue. Multi-actuator drives appear as a single device to
the I/O subsystem [1].  Yet they address commands to different
actuators internally, as a function of Logical Block Addresses
(LBAs). A given sector is reachable by only one of the actuators. For
example, Seagate’s Serial Advanced Technology Attachment (SATA)
version contains two actuators and maps the lower half of the SATA LBA
space to the lower actuator and the upper half to the upper actuator.
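To make the mapping concrete, here is a hypothetical sketch (not code from this series) of how an LBA could be resolved to an actuator on such a dual-actuator drive, assuming the simple lower-half/upper-half split described above; the function name and parameters are illustrative only:

```c
#include <assert.h>

/*
 * Illustrative only: map an LBA to an actuator index, assuming a
 * dual-actuator drive that assigns the lower half of the LBA space
 * to actuator 0 and the upper half to actuator 1.
 */
unsigned int actuator_for_lba(unsigned long long lba,
			      unsigned long long capacity_sectors)
{
	return lba < capacity_sectors / 2 ? 0 : 1;
}
```

Real drives export the actual per-actuator ranges as independent access ranges, which a later patch in this series retrieves from the request queue.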

Evidently, to fully utilize a multi-actuator drive, no actuator should
be left idle or underutilized while it has pending I/O. To reach this
goal, the block layer must somehow control the load of each actuator
individually. This series enriches BFQ with such a per-actuator
control, as a first step. Then it also adds a simple mechanism for
guaranteeing that actuators with pending I/O are never left idle.

See [1] for a more detailed overview of the problem and of the
solutions implemented in this patch series. There you will also find
some preliminary performance results.

Thanks,
Paolo

[1] https://www.linaro.org/blog/budget-fair-queueing-bfq-linux-io-scheduler-optimizations-for-multi-actuator-sata-hard-drives/

Davide Zini (3):
  block, bfq: split also async bfq_queues on a per-actuator basis
  block, bfq: inject I/O to underutilized actuators
  block, bfq: balance I/O injection among underutilized actuators

Federico Gavioli (1):
  block, bfq: retrieve independent access ranges from request queue

Paolo Valente (4):
  block, bfq: split sync bfq_queues on a per-actuator basis
  block, bfq: forbid stable merging of queues associated with different
    actuators
  block, bfq: turn scalar fields into arrays in bfq_io_cq
  block, bfq: turn BFQ_NUM_ACTUATORS into BFQ_MAX_ACTUATORS

 block/bfq-cgroup.c  |  97 +++++----
 block/bfq-iosched.c | 488 +++++++++++++++++++++++++++++---------------
 block/bfq-iosched.h | 149 ++++++++++----
 block/bfq-wf2q.c    |   2 +-
 4 files changed, 493 insertions(+), 243 deletions(-)

--
2.20.1

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/8] block, bfq: split sync bfq_queues on a per-actuator basis
  2022-06-23 15:53 [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives Paolo Valente
@ 2022-06-23 15:53 ` Paolo Valente
  2022-06-23 15:53 ` [PATCH 2/8] block, bfq: forbid stable merging of queues associated with different actuators Paolo Valente
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Paolo Valente @ 2022-06-23 15:53 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-kernel, jack, andrea.righi, glen.valante,
	arie.vanderhoeven, Paolo Valente, Gabriele Felici

Multi-actuator drives appear as a single device to the I/O subsystem [1].
Yet they address commands to different actuators internally, as a
function of Logical Block Addresses (LBAs). A given sector is
reachable by only one of the actuators. For example, Seagate’s Serial
Advanced Technology Attachment (SATA) version contains two actuators
and maps the lower half of the SATA LBA space to the lower actuator
and the upper half to the upper actuator.

Evidently, to fully utilize a multi-actuator drive, no actuator should
be left idle or underutilized while it has pending I/O. The block layer
must somehow control the load of each actuator individually. This
commit lays the ground for allowing BFQ to provide such a per-actuator
control.

BFQ associates a sync bfq_queue with each process doing synchronous
I/O, or with a group of processes in case of queue merging. Then BFQ
serves one bfq_queue at a time. While in service, a
bfq_queue is emptied in request-position order. Yet the same process,
or group of processes, may generate I/O for different actuators. In
this case, different streams of I/O (each for a different actuator)
are all inserted into the same sync bfq_queue. So there is basically
no individual control over when each stream is served, i.e., over when the
I/O requests of the stream are picked from the bfq_queue and
dispatched to the drive.

This commit enables BFQ to control the service of each actuator
individually for synchronous I/O, by simply splitting each sync
bfq_queue into N queues, one for each actuator. In other words, a sync
bfq_queue is now associated with a pair (process, actuator). As a
consequence of this split, the per-queue proportional-share policy
implemented by BFQ will guarantee that the sync I/O generated for each
actuator, by each process, receives its fair share of service.
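The core of the split is the bfqq lookup becoming two-dimensional, as the diff below shows for bic_to_bfqq(). A minimal stand-alone sketch of that indexing scheme (with a stub queue type in place of struct bfq_queue, purely for illustration) looks like this:

```c
#include <stddef.h>

#define NUM_ACTUATORS 2

struct bfqq_stub { int id; };	/* stand-in for struct bfq_queue */

/*
 * Sketch of the per-process queue matrix: row 0 holds async queues,
 * row 1 sync queues, one column per actuator (mirrors the new
 * bic->bfqq[2][BFQ_NUM_ACTUATORS] layout in bfq_io_cq).
 */
struct io_ctx {
	struct bfqq_stub *bfqq[2][NUM_ACTUATORS];
};

struct bfqq_stub *ctx_to_bfqq(struct io_ctx *ctx, int is_sync,
			      unsigned int actuator_idx)
{
	return ctx->bfqq[is_sync][actuator_idx];
}
```

With this layout, the proportional-share engine can schedule the (process, actuator) queues independently, which is exactly what per-actuator service control requires.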

This is just a preparatory patch. If the I/O of the same process
happens to be sent to different queues, then each of these queues may
undergo queue merging. To handle this event, the bfq_io_cq data
structure must be properly extended. In addition, stable merging must
be disabled to avoid loss of control on individual actuators. Finally,
also async queues must be split. These issues are described in detail
and addressed in the next commits. As for this commit, although multiple
per-process bfq_queues are provided, the I/O of each process or group
of processes is still sent to only one queue, regardless of the
actuator the I/O is for. The forwarding to distinct bfq_queues will be
enabled after addressing the above issues.

[1] https://www.linaro.org/blog/budget-fair-queueing-bfq-linux-io-scheduler-optimizations-for-multi-actuator-sata-hard-drives/

Signed-off-by: Gabriele Felici <felicigb@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
---
 block/bfq-cgroup.c  |  95 +++++++++++++++++--------------
 block/bfq-iosched.c | 135 +++++++++++++++++++++++++++-----------------
 block/bfq-iosched.h |  38 +++++++++----
 3 files changed, 164 insertions(+), 104 deletions(-)

diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index 09574af83566..6462e8de44db 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -705,6 +705,48 @@ void bfq_bfqq_move(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 	bfq_put_queue(bfqq);
 }
 
+static void bfq_sync_bfqq_move(struct bfq_data *bfqd,
+			       struct bfq_queue *sync_bfqq,
+			       struct bfq_io_cq *bic,
+			       struct bfq_group *bfqg,
+			       unsigned int act_idx)
+{
+	if (!sync_bfqq->new_bfqq && !bfq_bfqq_coop(sync_bfqq)) {
+		/* We are the only user of this bfqq, just move it */
+		if (sync_bfqq->entity.sched_data != &bfqg->sched_data)
+			bfq_bfqq_move(bfqd, sync_bfqq, bfqg);
+	} else {
+		struct bfq_queue *bfqq;
+
+		/*
+		 * The queue was merged to a different queue. Check
+		 * that the merge chain still belongs to the same
+		 * cgroup.
+		 */
+		for (bfqq = sync_bfqq; bfqq; bfqq = bfqq->new_bfqq)
+			if (bfqq->entity.sched_data !=
+			    &bfqg->sched_data)
+				break;
+		if (bfqq) {
+			/*
+			 * Some queue changed cgroup so the merge is
+			 * not valid anymore. We cannot easily just
+			 * cancel the merge (by clearing new_bfqq) as
+			 * there may be other processes using this
+			 * queue and holding refs to all queues below
+			 * sync_bfqq->new_bfqq. Similarly if the merge
+			 * already happened, we need to detach from
+			 * bfqq now so that we cannot merge bio to a
+			 * request from the old cgroup.
+			 */
+			bfq_put_cooperator(sync_bfqq);
+			bfq_release_process_ref(bfqd, sync_bfqq);
+			bic_set_bfqq(bic, NULL, 1, act_idx);
+		}
+	}
+}
+
+
 /**
  * __bfq_bic_change_cgroup - move @bic to @cgroup.
  * @bfqd: the queue descriptor.
@@ -719,53 +761,24 @@ static void *__bfq_bic_change_cgroup(struct bfq_data *bfqd,
 				     struct bfq_io_cq *bic,
 				     struct bfq_group *bfqg)
 {
-	struct bfq_queue *async_bfqq = bic_to_bfqq(bic, 0);
-	struct bfq_queue *sync_bfqq = bic_to_bfqq(bic, 1);
 	struct bfq_entity *entity;
+	unsigned int act_idx;
 
-	if (async_bfqq) {
-		entity = &async_bfqq->entity;
-
-		if (entity->sched_data != &bfqg->sched_data) {
-			bic_set_bfqq(bic, NULL, 0);
-			bfq_release_process_ref(bfqd, async_bfqq);
-		}
-	}
+	for (act_idx = 0; act_idx < BFQ_NUM_ACTUATORS; act_idx++) {
+		struct bfq_queue *async_bfqq = bic_to_bfqq(bic, 0, act_idx);
+		struct bfq_queue *sync_bfqq = bic_to_bfqq(bic, 1, act_idx);
 
-	if (sync_bfqq) {
-		if (!sync_bfqq->new_bfqq && !bfq_bfqq_coop(sync_bfqq)) {
-			/* We are the only user of this bfqq, just move it */
-			if (sync_bfqq->entity.sched_data != &bfqg->sched_data)
-				bfq_bfqq_move(bfqd, sync_bfqq, bfqg);
-		} else {
-			struct bfq_queue *bfqq;
+		if (async_bfqq) {
+			entity = &async_bfqq->entity;
 
-			/*
-			 * The queue was merged to a different queue. Check
-			 * that the merge chain still belongs to the same
-			 * cgroup.
-			 */
-			for (bfqq = sync_bfqq; bfqq; bfqq = bfqq->new_bfqq)
-				if (bfqq->entity.sched_data !=
-				    &bfqg->sched_data)
-					break;
-			if (bfqq) {
-				/*
-				 * Some queue changed cgroup so the merge is
-				 * not valid anymore. We cannot easily just
-				 * cancel the merge (by clearing new_bfqq) as
-				 * there may be other processes using this
-				 * queue and holding refs to all queues below
-				 * sync_bfqq->new_bfqq. Similarly if the merge
-				 * already happened, we need to detach from
-				 * bfqq now so that we cannot merge bio to a
-				 * request from the old cgroup.
-				 */
-				bfq_put_cooperator(sync_bfqq);
-				bfq_release_process_ref(bfqd, sync_bfqq);
-				bic_set_bfqq(bic, NULL, 1);
+			if (entity->sched_data != &bfqg->sched_data) {
+				bic_set_bfqq(bic, NULL, 0, act_idx);
+				bfq_release_process_ref(bfqd, async_bfqq);
 			}
 		}
+
+		if (sync_bfqq)
+			bfq_sync_bfqq_move(bfqd, sync_bfqq, bic, bfqg, act_idx);
 	}
 
 	return bfqg;
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index e6d7e6b01a05..98e0c454f5bc 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -377,14 +377,19 @@ static const unsigned long bfq_late_stable_merging = 600;
 #define RQ_BIC(rq)		((struct bfq_io_cq *)((rq)->elv.priv[0]))
 #define RQ_BFQQ(rq)		((rq)->elv.priv[1])
 
-struct bfq_queue *bic_to_bfqq(struct bfq_io_cq *bic, bool is_sync)
+struct bfq_queue *bic_to_bfqq(struct bfq_io_cq *bic,
+			      bool is_sync,
+			      unsigned int actuator_idx)
 {
-	return bic->bfqq[is_sync];
+	return bic->bfqq[is_sync][actuator_idx];
 }
 
 static void bfq_put_stable_ref(struct bfq_queue *bfqq);
 
-void bic_set_bfqq(struct bfq_io_cq *bic, struct bfq_queue *bfqq, bool is_sync)
+void bic_set_bfqq(struct bfq_io_cq *bic,
+		  struct bfq_queue *bfqq,
+		  bool is_sync,
+		  unsigned int actuator_idx)
 {
 	/*
 	 * If bfqq != NULL, then a non-stable queue merge between
@@ -399,7 +404,7 @@ void bic_set_bfqq(struct bfq_io_cq *bic, struct bfq_queue *bfqq, bool is_sync)
 	 * we cancel the stable merge if
 	 * bic->stable_merge_bfqq == bfqq.
 	 */
-	bic->bfqq[is_sync] = bfqq;
+	bic->bfqq[is_sync][actuator_idx] = bfqq;
 
 	if (bfqq && bic->stable_merge_bfqq == bfqq) {
 		/*
@@ -672,9 +677,9 @@ static void bfq_limit_depth(unsigned int op, struct blk_mq_alloc_data *data)
 {
 	struct bfq_data *bfqd = data->q->elevator->elevator_data;
 	struct bfq_io_cq *bic = bfq_bic_lookup(data->q);
-	struct bfq_queue *bfqq = bic ? bic_to_bfqq(bic, op_is_sync(op)) : NULL;
 	int depth;
 	unsigned limit = data->q->nr_requests;
+	unsigned int act_idx;
 
 	/* Sync reads have full depth available */
 	if (op_is_sync(op) && !op_is_write(op)) {
@@ -684,14 +689,21 @@ static void bfq_limit_depth(unsigned int op, struct blk_mq_alloc_data *data)
 		limit = (limit * depth) >> bfqd->full_depth_shift;
 	}
 
-	/*
-	 * Does queue (or any parent entity) exceed number of requests that
-	 * should be available to it? Heavily limit depth so that it cannot
-	 * consume more available requests and thus starve other entities.
-	 */
-	if (bfqq && bfqq_request_over_limit(bfqq, limit))
-		depth = 1;
+	for (act_idx = 0; act_idx < BFQ_NUM_ACTUATORS; act_idx++) {
+		struct bfq_queue *bfqq =
+			bic ? bic_to_bfqq(bic, op_is_sync(op), act_idx) : NULL;
 
+		/*
+		 * Does queue (or any parent entity) exceed number of
+		 * requests that should be available to it? Heavily
+		 * limit depth so that it cannot consume more
+		 * available requests and thus starve other entities.
+		 */
+		if (bfqq && bfqq_request_over_limit(bfqq, limit)) {
+			depth = 1;
+			break;
+		}
+	}
 	bfq_log(bfqd, "[%s] wr_busy %d sync %d depth %u",
 		__func__, bfqd->wr_busy_queues, op_is_sync(op), depth);
 	if (depth)
@@ -2142,7 +2154,7 @@ static void bfq_check_waker(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 	 * We reset waker detection logic also if too much time has passed
  	 * since the first detection. If wakeups are rare, pointless idling
 	 * doesn't hurt throughput that much. The condition below makes sure
-	 * we do not uselessly idle blocking waker in more than 1/64 cases. 
+	 * we do not uselessly idle blocking waker in more than 1/64 cases.
 	 */
 	if (bfqd->last_completed_rq_bfqq !=
 	    bfqq->tentative_waker_bfqq ||
@@ -2454,6 +2466,16 @@ static void bfq_remove_request(struct request_queue *q,
 
 }
 
+/* get the index of the actuator that will serve bio */
+static unsigned int bfq_actuator_index(struct bfq_data *bfqd, struct bio *bio)
+{
+	/*
+	 * Multi-actuator support not complete yet, so always return 0
+	 * for the moment.
+	 */
+	return 0;
+}
+
 static bool bfq_bio_merge(struct request_queue *q, struct bio *bio,
 		unsigned int nr_segs)
 {
@@ -2478,7 +2500,8 @@ static bool bfq_bio_merge(struct request_queue *q, struct bio *bio,
 		 */
 		bfq_bic_update_cgroup(bic, bio);
 
-		bfqd->bio_bfqq = bic_to_bfqq(bic, op_is_sync(bio->bi_opf));
+		bfqd->bio_bfqq = bic_to_bfqq(bic, op_is_sync(bio->bi_opf),
+					     bfq_actuator_index(bfqd, bio));
 	} else {
 		bfqd->bio_bfqq = NULL;
 	}
@@ -3174,7 +3197,7 @@ bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic,
 	/*
 	 * Merge queues (that is, let bic redirect its requests to new_bfqq)
 	 */
-	bic_set_bfqq(bic, new_bfqq, 1);
+	bic_set_bfqq(bic, new_bfqq, 1, bfqq->actuator_idx);
 	bfq_mark_bfqq_coop(new_bfqq);
 	/*
 	 * new_bfqq now belongs to at least two bics (it is a shared queue):
@@ -4808,11 +4831,12 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
 	 */
 	if (bfq_bfqq_wait_request(bfqq) ||
 	    (bfqq->dispatched != 0 && bfq_better_to_idle(bfqq))) {
+		unsigned int act_idx = bfqq->actuator_idx;
 		struct bfq_queue *async_bfqq =
-			bfqq->bic && bfqq->bic->bfqq[0] &&
-			bfq_bfqq_busy(bfqq->bic->bfqq[0]) &&
-			bfqq->bic->bfqq[0]->next_rq ?
-			bfqq->bic->bfqq[0] : NULL;
+			bfqq->bic && bfqq->bic->bfqq[0][act_idx] &&
+			bfq_bfqq_busy(bfqq->bic->bfqq[0][act_idx]) &&
+			bfqq->bic->bfqq[0][act_idx]->next_rq ?
+			bfqq->bic->bfqq[0][act_idx] : NULL;
 		struct bfq_queue *blocked_bfqq =
 			!hlist_empty(&bfqq->woken_list) ?
 			container_of(bfqq->woken_list.first,
@@ -4904,7 +4928,7 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
 		    icq_to_bic(async_bfqq->next_rq->elv.icq) == bfqq->bic &&
 		    bfq_serv_to_charge(async_bfqq->next_rq, async_bfqq) <=
 		    bfq_bfqq_budget_left(async_bfqq))
-			bfqq = bfqq->bic->bfqq[0];
+			bfqq = bfqq->bic->bfqq[0][act_idx];
 		else if (bfqq->waker_bfqq &&
 			   bfq_bfqq_busy(bfqq->waker_bfqq) &&
 			   bfqq->waker_bfqq->next_rq &&
@@ -5367,49 +5391,47 @@ static void bfq_exit_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq)
 	bfq_release_process_ref(bfqd, bfqq);
 }
 
-static void bfq_exit_icq_bfqq(struct bfq_io_cq *bic, bool is_sync)
+static void bfq_exit_icq_bfqq(struct bfq_io_cq *bic,
+			      bool is_sync,
+			      unsigned int actuator_idx)
 {
-	struct bfq_queue *bfqq = bic_to_bfqq(bic, is_sync);
+	struct bfq_queue *bfqq = bic_to_bfqq(bic, is_sync, actuator_idx);
 	struct bfq_data *bfqd;
 
 	if (bfqq)
 		bfqd = bfqq->bfqd; /* NULL if scheduler already exited */
 
 	if (bfqq && bfqd) {
-		unsigned long flags;
-
-		spin_lock_irqsave(&bfqd->lock, flags);
 		bfqq->bic = NULL;
 		bfq_exit_bfqq(bfqd, bfqq);
-		bic_set_bfqq(bic, NULL, is_sync);
-		spin_unlock_irqrestore(&bfqd->lock, flags);
+		bic_set_bfqq(bic, NULL, is_sync, actuator_idx);
 	}
 }
 
 static void bfq_exit_icq(struct io_cq *icq)
 {
 	struct bfq_io_cq *bic = icq_to_bic(icq);
+	struct bfq_data *bfqd = bic_to_bfqd(bic);
+	unsigned long flags;
+	unsigned int act_idx;
 
-	if (bic->stable_merge_bfqq) {
-		struct bfq_data *bfqd = bic->stable_merge_bfqq->bfqd;
-
-		/*
-		 * bfqd is NULL if scheduler already exited, and in
-		 * that case this is the last time bfqq is accessed.
-		 */
-		if (bfqd) {
-			unsigned long flags;
+	/*
+	 * bfqd is NULL if scheduler already exited, and in that case
+	 * this is the last time these queues are accessed.
+	 */
+	if (bfqd)
+		spin_lock_irqsave(&bfqd->lock, flags);
 
-			spin_lock_irqsave(&bfqd->lock, flags);
-			bfq_put_stable_ref(bic->stable_merge_bfqq);
-			spin_unlock_irqrestore(&bfqd->lock, flags);
-		} else {
+	for (act_idx = 0; act_idx < BFQ_NUM_ACTUATORS; act_idx++) {
+		if (bic->stable_merge_bfqq)
 			bfq_put_stable_ref(bic->stable_merge_bfqq);
-		}
+
+		bfq_exit_icq_bfqq(bic, true, act_idx);
+		bfq_exit_icq_bfqq(bic, false, act_idx);
 	}
 
-	bfq_exit_icq_bfqq(bic, true);
-	bfq_exit_icq_bfqq(bic, false);
+	if (bfqd)
+		spin_unlock_irqrestore(&bfqd->lock, flags);
 }
 
 /*
@@ -5486,23 +5508,25 @@ static void bfq_check_ioprio_change(struct bfq_io_cq *bic, struct bio *bio)
 
 	bic->ioprio = ioprio;
 
-	bfqq = bic_to_bfqq(bic, false);
+	bfqq = bic_to_bfqq(bic, false, bfq_actuator_index(bfqd, bio));
 	if (bfqq) {
 		bfq_release_process_ref(bfqd, bfqq);
 		bfqq = bfq_get_queue(bfqd, bio, false, bic, true);
-		bic_set_bfqq(bic, bfqq, false);
+		bic_set_bfqq(bic, bfqq, false, bfq_actuator_index(bfqd, bio));
 	}
 
-	bfqq = bic_to_bfqq(bic, true);
+	bfqq = bic_to_bfqq(bic, true, bfq_actuator_index(bfqd, bio));
 	if (bfqq)
 		bfq_set_next_ioprio_data(bfqq, bic);
 }
 
 static void bfq_init_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq,
-			  struct bfq_io_cq *bic, pid_t pid, int is_sync)
+			  struct bfq_io_cq *bic, pid_t pid, int is_sync,
+			  unsigned int act_idx)
 {
 	u64 now_ns = ktime_get_ns();
 
+	bfqq->actuator_idx = act_idx;
 	RB_CLEAR_NODE(&bfqq->entity.rb_node);
 	INIT_LIST_HEAD(&bfqq->fifo);
 	INIT_HLIST_NODE(&bfqq->burst_list_node);
@@ -5741,6 +5765,7 @@ static struct bfq_queue *bfq_get_queue(struct bfq_data *bfqd,
 	struct bfq_group *bfqg;
 
 	bfqg = bfq_bio_bfqg(bfqd, bio);
+
 	if (!is_sync) {
 		async_bfqq = bfq_async_queue_prio(bfqd, bfqg, ioprio_class,
 						  ioprio);
@@ -5755,7 +5780,7 @@ static struct bfq_queue *bfq_get_queue(struct bfq_data *bfqd,
 
 	if (bfqq) {
 		bfq_init_bfqq(bfqd, bfqq, bic, current->pid,
-			      is_sync);
+			      is_sync, bfq_actuator_index(bfqd, bio));
 		bfq_init_entity(&bfqq->entity, bfqg);
 		bfq_log_bfqq(bfqd, bfqq, "allocated");
 	} else {
@@ -6070,7 +6095,8 @@ static bool __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
 		 * then complete the merge and redirect it to
 		 * new_bfqq.
 		 */
-		if (bic_to_bfqq(RQ_BIC(rq), 1) == bfqq)
+		if (bic_to_bfqq(RQ_BIC(rq), 1,
+				bfq_actuator_index(bfqd, rq->bio)) == bfqq)
 			bfq_merge_bfqqs(bfqd, RQ_BIC(rq),
 					bfqq, new_bfqq);
 
@@ -6624,7 +6650,7 @@ bfq_split_bfqq(struct bfq_io_cq *bic, struct bfq_queue *bfqq)
 		return bfqq;
 	}
 
-	bic_set_bfqq(bic, NULL, 1);
+	bic_set_bfqq(bic, NULL, 1, bfqq->actuator_idx);
 
 	bfq_put_cooperator(bfqq);
 
@@ -6638,7 +6664,8 @@ static struct bfq_queue *bfq_get_bfqq_handle_split(struct bfq_data *bfqd,
 						   bool split, bool is_sync,
 						   bool *new_queue)
 {
-	struct bfq_queue *bfqq = bic_to_bfqq(bic, is_sync);
+	unsigned int act_idx = bfq_actuator_index(bfqd, bio);
+	struct bfq_queue *bfqq = bic_to_bfqq(bic, is_sync, act_idx);
 
 	if (likely(bfqq && bfqq != &bfqd->oom_bfqq))
 		return bfqq;
@@ -6650,7 +6677,7 @@ static struct bfq_queue *bfq_get_bfqq_handle_split(struct bfq_data *bfqd,
 		bfq_put_queue(bfqq);
 	bfqq = bfq_get_queue(bfqd, bio, is_sync, bic, split);
 
-	bic_set_bfqq(bic, bfqq, is_sync);
+	bic_set_bfqq(bic, bfqq, is_sync, act_idx);
 	if (split && is_sync) {
 		if ((bic->was_in_burst_list && bfqd->large_burst) ||
 		    bic->saved_in_large_burst)
@@ -7092,8 +7119,10 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
 	 * Our fallback bfqq if bfq_find_alloc_queue() runs into OOM issues.
 	 * Grab a permanent reference to it, so that the normal code flow
 	 * will not attempt to free it.
+	 * Set zero as actuator index: we will pretend that
+	 * all I/O requests are for the same actuator.
 	 */
-	bfq_init_bfqq(bfqd, &bfqd->oom_bfqq, NULL, 1, 0);
+	bfq_init_bfqq(bfqd, &bfqd->oom_bfqq, NULL, 1, 0, 0);
 	bfqd->oom_bfqq.ref++;
 	bfqd->oom_bfqq.new_ioprio = BFQ_DEFAULT_QUEUE_IOPRIO;
 	bfqd->oom_bfqq.new_ioprio_class = IOPRIO_CLASS_BE;
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index ca8177d7bf7c..c7b89dbe21ea 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -33,6 +33,8 @@
  */
 #define BFQ_SOFTRT_WEIGHT_FACTOR	100
 
+#define BFQ_NUM_ACTUATORS 2
+
 struct bfq_entity;
 
 /**
@@ -225,12 +227,14 @@ struct bfq_ttime {
  * struct bfq_queue - leaf schedulable entity.
  *
  * A bfq_queue is a leaf request queue; it can be associated with an
- * io_context or more, if it  is  async or shared  between  cooperating
- * processes. @cgroup holds a reference to the cgroup, to be sure that it
- * does not disappear while a bfqq still references it (mostly to avoid
- * races between request issuing and task migration followed by cgroup
- * destruction).
- * All the fields are protected by the queue lock of the containing bfqd.
+ * io_context or more, if it is async or shared between cooperating
+ * processes. Besides, it contains I/O requests for only one actuator
+ * (an io_context is associated with a different bfq_queue for each
+ * actuator it generates I/O for). @cgroup holds a reference to the
+ * cgroup, to be sure that it does not disappear while a bfqq still
+ * references it (mostly to avoid races between request issuing and
+ * task migration followed by cgroup destruction).  All the fields are
+ * protected by the queue lock of the containing bfqd.
  */
 struct bfq_queue {
 	/* reference counter */
@@ -399,6 +403,9 @@ struct bfq_queue {
 	 * the woken queues when this queue exits.
 	 */
 	struct hlist_head woken_list;
+
+	/* index of the actuator this queue is associated with */
+	unsigned int actuator_idx;
 };
 
 /**
@@ -407,8 +414,17 @@ struct bfq_queue {
 struct bfq_io_cq {
 	/* associated io_cq structure */
 	struct io_cq icq; /* must be the first member */
-	/* array of two process queues, the sync and the async */
-	struct bfq_queue *bfqq[2];
+	/*
+	 * Matrix of associated process queues: first row for async
+	 * queues, second row sync queues. Each row contains one
+	 * column for each actuator. An I/O request generated by the
+	 * process is inserted into the queue pointed by bfqq[i][j] if
+	 * the request is to be served by the j-th actuator of the
+	 * drive, where i==0 or i==1, depending on whether the request
+	 * is async or sync. So there is a distinct queue for each
+	 * actuator.
+	 */
+	struct bfq_queue *bfqq[2][BFQ_NUM_ACTUATORS];
 	/* per (request_queue, blkcg) ioprio */
 	int ioprio;
 #ifdef CONFIG_BFQ_GROUP_IOSCHED
@@ -968,8 +984,10 @@ struct bfq_group {
 
 extern const int bfq_timeout;
 
-struct bfq_queue *bic_to_bfqq(struct bfq_io_cq *bic, bool is_sync);
-void bic_set_bfqq(struct bfq_io_cq *bic, struct bfq_queue *bfqq, bool is_sync);
+struct bfq_queue *bic_to_bfqq(struct bfq_io_cq *bic, bool is_sync,
+				unsigned int actuator_idx);
+void bic_set_bfqq(struct bfq_io_cq *bic, struct bfq_queue *bfqq, bool is_sync,
+				unsigned int actuator_idx);
 struct bfq_data *bic_to_bfqd(struct bfq_io_cq *bic);
 void bfq_pos_tree_add_move(struct bfq_data *bfqd, struct bfq_queue *bfqq);
 void bfq_weights_tree_add(struct bfq_data *bfqd, struct bfq_queue *bfqq,
-- 
2.20.1



* [PATCH 2/8] block, bfq: forbid stable merging of queues associated with different actuators
  2022-06-23 15:53 [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives Paolo Valente
  2022-06-23 15:53 ` [PATCH 1/8] block, bfq: split sync bfq_queues on a per-actuator basis Paolo Valente
@ 2022-06-23 15:53 ` Paolo Valente
  2022-06-23 15:53 ` [PATCH 3/8] block, bfq: turn scalar fields into arrays in bfq_io_cq Paolo Valente
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Paolo Valente @ 2022-06-23 15:53 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-kernel, jack, andrea.righi, glen.valante,
	arie.vanderhoeven, Paolo Valente

If queues associated with different actuators are merged, then control
is lost on each actuator. Therefore some actuator may be
underutilized, and throughput may decrease. This problem cannot occur
with basic queue merging, because the latter is triggered by spatial
locality, and sectors for different actuators are not close to each
other. Yet it may happen with stable merging. To address this issue,
this commit prevents stable merging from occurring among queues
associated with different actuators.
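Stated as a predicate, the diff below simply adds actuator_idx to the set of attributes that must match before two queues are considered for a stable merge. An illustrative condensed version (with a stub attribute struct, not the real bfq_queue fields beyond the ones named here) is:

```c
#include <stdbool.h>

/* Stand-in for the bfq_queue fields compared before a stable merge */
struct merge_attrs {
	int ioprio;
	int ioprio_class;
	unsigned int actuator_idx;	/* added by this patch */
};

/*
 * A stable merge is only considered when all compared attributes
 * match; a differing actuator index now vetoes the merge.
 */
bool stable_merge_allowed(const struct merge_attrs *a,
			  const struct merge_attrs *b)
{
	return a->ioprio == b->ioprio &&
	       a->ioprio_class == b->ioprio_class &&
	       a->actuator_idx == b->actuator_idx;
}
```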

Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
---
 block/bfq-iosched.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 98e0c454f5bc..52fc3930b889 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -5695,9 +5695,13 @@ static struct bfq_queue *bfq_do_or_sched_stable_merge(struct bfq_data *bfqd,
 	 * it has been set already, but too long ago, then move it
 	 * forward to bfqq. Finally, move also if bfqq belongs to a
 	 * different group than last_bfqq_created, or if bfqq has a
-	 * different ioprio or ioprio_class. If none of these
-	 * conditions holds true, then try an early stable merge or
-	 * schedule a delayed stable merge.
+	 * different ioprio, ioprio_class or actuator_idx. If none of
+	 * these conditions holds true, then try an early stable merge
+	 * or schedule a delayed stable merge. As for the condition on
+	 * actuator_idx, the reason is that, if queues associated with
+	 * different actuators are merged, then control is lost on
+	 * each actuator. Therefore some actuator may be
+	 * underutilized, and throughput may decrease.
 	 *
 	 * A delayed merge is scheduled (instead of performing an
 	 * early merge), in case bfqq might soon prove to be more
@@ -5715,7 +5719,8 @@ static struct bfq_queue *bfq_do_or_sched_stable_merge(struct bfq_data *bfqd,
 			bfqq->creation_time) ||
 		bfqq->entity.parent != last_bfqq_created->entity.parent ||
 		bfqq->ioprio != last_bfqq_created->ioprio ||
-		bfqq->ioprio_class != last_bfqq_created->ioprio_class)
+		bfqq->ioprio_class != last_bfqq_created->ioprio_class ||
+		bfqq->actuator_idx != last_bfqq_created->actuator_idx)
 		*source_bfqq = bfqq;
 	else if (time_after_eq(last_bfqq_created->creation_time +
 				 bfqd->bfq_burst_interval,
-- 
2.20.1



* [PATCH 3/8] block, bfq: turn scalar fields into arrays in bfq_io_cq
  2022-06-23 15:53 [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives Paolo Valente
  2022-06-23 15:53 ` [PATCH 1/8] block, bfq: split sync bfq_queues on a per-actuator basis Paolo Valente
  2022-06-23 15:53 ` [PATCH 2/8] block, bfq: forbid stable merging of queues associated with different actuators Paolo Valente
@ 2022-06-23 15:53 ` Paolo Valente
  2022-06-23 15:53 ` [PATCH 4/8] block, bfq: split also async bfq_queues on a per-actuator basis Paolo Valente
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Paolo Valente @ 2022-06-23 15:53 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-kernel, jack, andrea.righi, glen.valante,
	arie.vanderhoeven, Paolo Valente, Gabriele Felici

When a bfq_queue Q is merged with another queue, several pieces of
information are saved about Q. These pieces are stored in the
bfq_io_cq data structure of the process associated with Q. In
particular, each such piece is represented by a scalar field in
bfq_io_cq.

Yet, with a multi-actuator drive, a process gets associated with
multiple bfq_queues: one queue for each of the N actuators. Each of
these queues may undergo a merge. So, the bfq_io_cq data structure
must be able to accommodate the above information for N queues.

This commit solves this problem by turning each scalar field into an
array of N elements (and by changing the code to handle these
arrays).

This solution is written under the assumption that bfq_queues
associated with different actuators cannot be cross-merged. This
assumption holds naturally with basic queue merging: the latter is
triggered by spatial locality, and sectors for different actuators are
not close to each other. As for stable cross-merging, the assumption
here is that it is disabled.
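The shape of the transformation can be sketched as follows; the struct and field names here are illustrative stand-ins for the saved_* fields of bfq_io_cq shown in the diff below, and NUM_ACTUATORS stands in for BFQ_NUM_ACTUATORS:

```c
#define NUM_ACTUATORS 2

/*
 * Before this patch each saved field was a scalar; afterwards each
 * has one slot per actuator, so state saved when the queue for
 * actuator j is merged can be restored independently from index j.
 */
struct saved_io_state {
	unsigned int saved_weight[NUM_ACTUATORS];
	unsigned long saved_ttime[NUM_ACTUATORS];
};

void save_state(struct saved_io_state *s, unsigned int act_idx,
		unsigned int weight, unsigned long ttime)
{
	s->saved_weight[act_idx] = weight;
	s->saved_ttime[act_idx] = ttime;
}
```

Because cross-actuator merges are excluded (see the previous patch), each array slot is only ever written and read for queues of its own actuator, so no slot can be clobbered by a merge on a different actuator.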

Signed-off-by: Gabriele Felici <felicigb@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
---
 block/bfq-iosched.c | 139 ++++++++++++++++++++++++--------------------
 block/bfq-iosched.h |  52 ++++++++++-------
 2 files changed, 105 insertions(+), 86 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 52fc3930b889..fa24f9074e6f 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -406,7 +406,7 @@ void bic_set_bfqq(struct bfq_io_cq *bic,
 	 */
 	bic->bfqq[is_sync][actuator_idx] = bfqq;
 
-	if (bfqq && bic->stable_merge_bfqq == bfqq) {
+	if (bfqq && bic->stable_merge_bfqq[actuator_idx] == bfqq) {
 		/*
 		 * Actually, these same instructions are executed also
 		 * in bfq_setup_cooperator, in case of abort or actual
@@ -415,9 +415,9 @@ void bic_set_bfqq(struct bfq_io_cq *bic,
 		 * did so, we would nest even more complexity in this
 		 * function.
 		 */
-		bfq_put_stable_ref(bic->stable_merge_bfqq);
+		bfq_put_stable_ref(bic->stable_merge_bfqq[actuator_idx]);
 
-		bic->stable_merge_bfqq = NULL;
+		bic->stable_merge_bfqq[actuator_idx] = NULL;
 	}
 }
 
@@ -1176,36 +1176,38 @@ bfq_bfqq_resume_state(struct bfq_queue *bfqq, struct bfq_data *bfqd,
 {
 	unsigned int old_wr_coeff = 1;
 	bool busy = bfq_already_existing && bfq_bfqq_busy(bfqq);
+	unsigned int a_idx = bfqq->actuator_idx;
 
-	if (bic->saved_has_short_ttime)
+	if (bic->saved_has_short_ttime[a_idx])
 		bfq_mark_bfqq_has_short_ttime(bfqq);
 	else
 		bfq_clear_bfqq_has_short_ttime(bfqq);
 
-	if (bic->saved_IO_bound)
+	if (bic->saved_IO_bound[a_idx])
 		bfq_mark_bfqq_IO_bound(bfqq);
 	else
 		bfq_clear_bfqq_IO_bound(bfqq);
 
-	bfqq->last_serv_time_ns = bic->saved_last_serv_time_ns;
-	bfqq->inject_limit = bic->saved_inject_limit;
-	bfqq->decrease_time_jif = bic->saved_decrease_time_jif;
+	bfqq->last_serv_time_ns = bic->saved_last_serv_time_ns[a_idx];
+	bfqq->inject_limit = bic->saved_inject_limit[a_idx];
+	bfqq->decrease_time_jif = bic->saved_decrease_time_jif[a_idx];
 
-	bfqq->entity.new_weight = bic->saved_weight;
-	bfqq->ttime = bic->saved_ttime;
-	bfqq->io_start_time = bic->saved_io_start_time;
-	bfqq->tot_idle_time = bic->saved_tot_idle_time;
+	bfqq->entity.new_weight = bic->saved_weight[a_idx];
+	bfqq->ttime = bic->saved_ttime[a_idx];
+	bfqq->io_start_time = bic->saved_io_start_time[a_idx];
+	bfqq->tot_idle_time = bic->saved_tot_idle_time[a_idx];
 	/*
 	 * Restore weight coefficient only if low_latency is on
 	 */
 	if (bfqd->low_latency) {
 		old_wr_coeff = bfqq->wr_coeff;
-		bfqq->wr_coeff = bic->saved_wr_coeff;
+		bfqq->wr_coeff = bic->saved_wr_coeff[a_idx];
 	}
-	bfqq->service_from_wr = bic->saved_service_from_wr;
-	bfqq->wr_start_at_switch_to_srt = bic->saved_wr_start_at_switch_to_srt;
-	bfqq->last_wr_start_finish = bic->saved_last_wr_start_finish;
-	bfqq->wr_cur_max_time = bic->saved_wr_cur_max_time;
+	bfqq->service_from_wr = bic->saved_service_from_wr[a_idx];
+	bfqq->wr_start_at_switch_to_srt =
+				bic->saved_wr_start_at_switch_to_srt[a_idx];
+	bfqq->last_wr_start_finish = bic->saved_last_wr_start_finish[a_idx];
+	bfqq->wr_cur_max_time = bic->saved_wr_cur_max_time[a_idx];
 
 	if (bfqq->wr_coeff > 1 && (bfq_bfqq_in_large_burst(bfqq) ||
 	    time_is_before_jiffies(bfqq->last_wr_start_finish +
@@ -1824,6 +1826,16 @@ static bool bfq_bfqq_higher_class_or_weight(struct bfq_queue *bfqq,
 	return bfqq_weight > in_serv_weight;
 }
 
+/* get the index of the actuator that will serve bio */
+static unsigned int bfq_actuator_index(struct bfq_data *bfqd, struct bio *bio)
+{
+	/*
+	 * Multi-actuator support not complete yet, so always return 0
+	 * for the moment.
+	 */
+	return 0;
+}
+
 static bool bfq_better_to_idle(struct bfq_queue *bfqq);
 
 static void bfq_bfqq_handle_idle_busy_switch(struct bfq_data *bfqd,
@@ -1878,7 +1890,9 @@ static void bfq_bfqq_handle_idle_busy_switch(struct bfq_data *bfqd,
 	wr_or_deserves_wr = bfqd->low_latency &&
 		(bfqq->wr_coeff > 1 ||
 		 (bfq_bfqq_sync(bfqq) &&
-		  (bfqq->bic || RQ_BIC(rq)->stably_merged) &&
+		  (bfqq->bic ||
+		   RQ_BIC(rq)->stably_merged
+		   [bfq_actuator_index(bfqd, rq->bio)]) &&
 		   (*interactive || soft_rt)));
 
 	/*
@@ -2466,16 +2480,6 @@ static void bfq_remove_request(struct request_queue *q,
 
 }
 
-/* get the index of the actuator that will serve bio */
-static unsigned int bfq_actuator_index(struct bfq_data *bfqd, struct bio *bio)
-{
-	/*
-	 * Multi-actuator support not complete yet, so always return 0
-	 * for the moment.
-	 */
-	return 0;
-}
-
 static bool bfq_bio_merge(struct request_queue *q, struct bio *bio,
 		unsigned int nr_segs)
 {
@@ -2902,6 +2906,7 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 		     void *io_struct, bool request, struct bfq_io_cq *bic)
 {
 	struct bfq_queue *in_service_bfqq, *new_bfqq;
+	unsigned int a_idx = bfqq->actuator_idx;
 
 	/* if a merge has already been setup, then proceed with that first */
 	if (bfqq->new_bfqq)
@@ -2923,21 +2928,21 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 		 * stable merging) also if bic is associated with a
 		 * sync queue, but this bfqq is async
 		 */
-		if (bfq_bfqq_sync(bfqq) && bic->stable_merge_bfqq &&
+		if (bfq_bfqq_sync(bfqq) && bic->stable_merge_bfqq[a_idx] &&
 		    !bfq_bfqq_just_created(bfqq) &&
 		    time_is_before_jiffies(bfqq->split_time +
 					  msecs_to_jiffies(bfq_late_stable_merging)) &&
 		    time_is_before_jiffies(bfqq->creation_time +
 					   msecs_to_jiffies(bfq_late_stable_merging))) {
 			struct bfq_queue *stable_merge_bfqq =
-				bic->stable_merge_bfqq;
+				bic->stable_merge_bfqq[a_idx];
 			int proc_ref = min(bfqq_process_refs(bfqq),
 					   bfqq_process_refs(stable_merge_bfqq));
 
 			/* deschedule stable merge, because done or aborted here */
 			bfq_put_stable_ref(stable_merge_bfqq);
 
-			bic->stable_merge_bfqq = NULL;
+			bic->stable_merge_bfqq[a_idx] = NULL;
 
 			if (!idling_boosts_thr_without_issues(bfqd, bfqq) &&
 			    proc_ref > 0) {
@@ -2946,9 +2951,10 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 					bfq_setup_merge(bfqq, stable_merge_bfqq);
 
 				if (new_bfqq) {
-					bic->stably_merged = true;
+					bic->stably_merged[a_idx] = true;
 					if (new_bfqq->bic)
-						new_bfqq->bic->stably_merged =
+						new_bfqq->bic->stably_merged
+						    [new_bfqq->actuator_idx] =
 									true;
 				}
 				return new_bfqq;
@@ -3048,6 +3054,8 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 static void bfq_bfqq_save_state(struct bfq_queue *bfqq)
 {
 	struct bfq_io_cq *bic = bfqq->bic;
+	/* State must be saved for the right queue index. */
+	unsigned int a_idx = bfqq->actuator_idx;
 
 	/*
 	 * If !bfqq->bic, the queue is already shared or its requests
@@ -3057,18 +3065,18 @@ static void bfq_bfqq_save_state(struct bfq_queue *bfqq)
 	if (!bic)
 		return;
 
-	bic->saved_last_serv_time_ns = bfqq->last_serv_time_ns;
-	bic->saved_inject_limit = bfqq->inject_limit;
-	bic->saved_decrease_time_jif = bfqq->decrease_time_jif;
-
-	bic->saved_weight = bfqq->entity.orig_weight;
-	bic->saved_ttime = bfqq->ttime;
-	bic->saved_has_short_ttime = bfq_bfqq_has_short_ttime(bfqq);
-	bic->saved_IO_bound = bfq_bfqq_IO_bound(bfqq);
-	bic->saved_io_start_time = bfqq->io_start_time;
-	bic->saved_tot_idle_time = bfqq->tot_idle_time;
-	bic->saved_in_large_burst = bfq_bfqq_in_large_burst(bfqq);
-	bic->was_in_burst_list = !hlist_unhashed(&bfqq->burst_list_node);
+	bic->saved_last_serv_time_ns[a_idx] = bfqq->last_serv_time_ns;
+	bic->saved_inject_limit[a_idx] = bfqq->inject_limit;
+	bic->saved_decrease_time_jif[a_idx] = bfqq->decrease_time_jif;
+
+	bic->saved_weight[a_idx] = bfqq->entity.orig_weight;
+	bic->saved_ttime[a_idx] = bfqq->ttime;
+	bic->saved_has_short_ttime[a_idx] = bfq_bfqq_has_short_ttime(bfqq);
+	bic->saved_IO_bound[a_idx] = bfq_bfqq_IO_bound(bfqq);
+	bic->saved_io_start_time[a_idx] = bfqq->io_start_time;
+	bic->saved_tot_idle_time[a_idx] = bfqq->tot_idle_time;
+	bic->saved_in_large_burst[a_idx] = bfq_bfqq_in_large_burst(bfqq);
+	bic->was_in_burst_list[a_idx] = !hlist_unhashed(&bfqq->burst_list_node);
 	if (unlikely(bfq_bfqq_just_created(bfqq) &&
 		     !bfq_bfqq_in_large_burst(bfqq) &&
 		     bfqq->bfqd->low_latency)) {
@@ -3081,17 +3089,17 @@ static void bfq_bfqq_save_state(struct bfq_queue *bfqq)
 		 * to bfqq, so that to avoid that bfqq unjustly fails
 		 * to enjoy weight raising if split soon.
 		 */
-		bic->saved_wr_coeff = bfqq->bfqd->bfq_wr_coeff;
-		bic->saved_wr_start_at_switch_to_srt = bfq_smallest_from_now();
-		bic->saved_wr_cur_max_time = bfq_wr_duration(bfqq->bfqd);
-		bic->saved_last_wr_start_finish = jiffies;
+		bic->saved_wr_coeff[a_idx] = bfqq->bfqd->bfq_wr_coeff;
+		bic->saved_wr_start_at_switch_to_srt[a_idx] = bfq_smallest_from_now();
+		bic->saved_wr_cur_max_time[a_idx] = bfq_wr_duration(bfqq->bfqd);
+		bic->saved_last_wr_start_finish[a_idx] = jiffies;
 	} else {
-		bic->saved_wr_coeff = bfqq->wr_coeff;
-		bic->saved_wr_start_at_switch_to_srt =
+		bic->saved_wr_coeff[a_idx] = bfqq->wr_coeff;
+		bic->saved_wr_start_at_switch_to_srt[a_idx] =
 			bfqq->wr_start_at_switch_to_srt;
-		bic->saved_service_from_wr = bfqq->service_from_wr;
-		bic->saved_last_wr_start_finish = bfqq->last_wr_start_finish;
-		bic->saved_wr_cur_max_time = bfqq->wr_cur_max_time;
+		bic->saved_service_from_wr[a_idx] = bfqq->service_from_wr;
+		bic->saved_last_wr_start_finish[a_idx] = bfqq->last_wr_start_finish;
+		bic->saved_wr_cur_max_time[a_idx] = bfqq->wr_cur_max_time;
 	}
 }
 
@@ -5423,8 +5431,8 @@ static void bfq_exit_icq(struct io_cq *icq)
 		spin_lock_irqsave(&bfqd->lock, flags);
 
 	for (act_idx = 0; act_idx < BFQ_NUM_ACTUATORS; act_idx++) {
-		if (bic->stable_merge_bfqq)
-			bfq_put_stable_ref(bic->stable_merge_bfqq);
+		if (bic->stable_merge_bfqq[act_idx])
+			bfq_put_stable_ref(bic->stable_merge_bfqq[act_idx]);
 
 		bfq_exit_icq_bfqq(bic, true, act_idx);
 		bfq_exit_icq_bfqq(bic, false, act_idx);
@@ -5612,6 +5620,7 @@ bfq_do_early_stable_merge(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 			  struct bfq_io_cq *bic,
 			  struct bfq_queue *last_bfqq_created)
 {
+	unsigned int a_idx = last_bfqq_created->actuator_idx;
 	struct bfq_queue *new_bfqq =
 		bfq_setup_merge(bfqq, last_bfqq_created);
 
@@ -5619,8 +5628,8 @@ bfq_do_early_stable_merge(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 		return bfqq;
 
 	if (new_bfqq->bic)
-		new_bfqq->bic->stably_merged = true;
-	bic->stably_merged = true;
+		new_bfqq->bic->stably_merged[a_idx] = true;
+	bic->stably_merged[a_idx] = true;
 
 	/*
 	 * Reusing merge functions. This implies that
@@ -5750,7 +5759,8 @@ static struct bfq_queue *bfq_do_or_sched_stable_merge(struct bfq_data *bfqd,
 			/*
 			 * Record the bfqq to merge to.
 			 */
-			bic->stable_merge_bfqq = last_bfqq_created;
+			bic->stable_merge_bfqq[last_bfqq_created->actuator_idx]
+							   = last_bfqq_created;
 		}
 	}
 
@@ -6684,12 +6694,12 @@ static struct bfq_queue *bfq_get_bfqq_handle_split(struct bfq_data *bfqd,
 
 	bic_set_bfqq(bic, bfqq, is_sync, act_idx);
 	if (split && is_sync) {
-		if ((bic->was_in_burst_list && bfqd->large_burst) ||
-		    bic->saved_in_large_burst)
+		if ((bic->was_in_burst_list[act_idx] && bfqd->large_burst) ||
+		    bic->saved_in_large_burst[act_idx])
 			bfq_mark_bfqq_in_large_burst(bfqq);
 		else {
 			bfq_clear_bfqq_in_large_burst(bfqq);
-			if (bic->was_in_burst_list)
+			if (bic->was_in_burst_list[act_idx])
 				/*
 				 * If bfqq was in the current
 				 * burst list before being
@@ -6778,6 +6788,7 @@ static struct bfq_queue *bfq_init_rq(struct request *rq)
 	struct bfq_queue *bfqq;
 	bool new_queue = false;
 	bool bfqq_already_existing = false, split = false;
+	unsigned int a_idx = bfq_actuator_index(bfqd, bio);
 
 	if (unlikely(!rq->elv.icq))
 		return NULL;
@@ -6804,12 +6815,12 @@ static struct bfq_queue *bfq_init_rq(struct request *rq)
 	if (likely(!new_queue)) {
 		/* If the queue was seeky for too long, break it apart. */
 		if (bfq_bfqq_coop(bfqq) && bfq_bfqq_split_coop(bfqq) &&
-			!bic->stably_merged) {
+			!bic->stably_merged[a_idx]) {
 			struct bfq_queue *old_bfqq = bfqq;
 
 			/* Update bic before losing reference to bfqq */
 			if (bfq_bfqq_in_large_burst(bfqq))
-				bic->saved_in_large_burst = true;
+				bic->saved_in_large_burst[a_idx] = true;
 
 			bfqq = bfq_split_bfqq(bic, bfqq);
 			split = true;
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index c7b89dbe21ea..e1ed18a9400d 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -431,30 +431,37 @@ struct bfq_io_cq {
 	uint64_t blkcg_serial_nr; /* the current blkcg serial */
 #endif
 	/*
-	 * Snapshot of the has_short_time flag before merging; taken
-	 * to remember its value while the queue is merged, so as to
-	 * be able to restore it in case of split.
+	 * Several fields follow, which are used to support
+	 * queue-merging operations. Each field is an array, because a
+	 * process may be associated with multiple bfq_queues (see the
+	 * field bfqq above). And each of these queues may undergo a
+	 * merge.
 	 */
-	bool saved_has_short_ttime;
+	/*
+	 * Snapshot of the has_short_time flags before merging; taken
+	 * to remember their values while a queue is merged, so as to
+	 * be able to restore them in case of split.
+	 */
+	bool saved_has_short_ttime[BFQ_NUM_ACTUATORS];
 	/*
 	 * Same purpose as the previous two fields for the I/O bound
 	 * classification of a queue.
 	 */
-	bool saved_IO_bound;
+	bool saved_IO_bound[BFQ_NUM_ACTUATORS];
 
-	u64 saved_io_start_time;
-	u64 saved_tot_idle_time;
+	u64 saved_io_start_time[BFQ_NUM_ACTUATORS];
+	u64 saved_tot_idle_time[BFQ_NUM_ACTUATORS];
 
 	/*
-	 * Same purpose as the previous fields for the value of the
+	 * Same purpose as the previous fields for the values of the
 	 * field keeping the queue's belonging to a large burst
 	 */
-	bool saved_in_large_burst;
+	bool saved_in_large_burst[BFQ_NUM_ACTUATORS];
 	/*
 	 * True if the queue belonged to a burst list before its merge
 	 * with another cooperating queue.
 	 */
-	bool was_in_burst_list;
+	bool was_in_burst_list[BFQ_NUM_ACTUATORS];
 
 	/*
 	 * Save the weight when a merge occurs, to be able
@@ -463,27 +470,28 @@ struct bfq_io_cq {
 	 * then the weight of the recycled queue could differ
 	 * from the weight of the original queue.
 	 */
-	unsigned int saved_weight;
+	unsigned int saved_weight[BFQ_NUM_ACTUATORS];
 
 	/*
 	 * Similar to previous fields: save wr information.
 	 */
-	unsigned long saved_wr_coeff;
-	unsigned long saved_last_wr_start_finish;
-	unsigned long saved_service_from_wr;
-	unsigned long saved_wr_start_at_switch_to_srt;
-	unsigned int saved_wr_cur_max_time;
-	struct bfq_ttime saved_ttime;
+	unsigned long saved_wr_coeff[BFQ_NUM_ACTUATORS];
+	unsigned long saved_last_wr_start_finish[BFQ_NUM_ACTUATORS];
+	unsigned long saved_service_from_wr[BFQ_NUM_ACTUATORS];
+	unsigned long saved_wr_start_at_switch_to_srt[BFQ_NUM_ACTUATORS];
+	unsigned int saved_wr_cur_max_time[BFQ_NUM_ACTUATORS];
+	struct bfq_ttime saved_ttime[BFQ_NUM_ACTUATORS];
 
 	/* Save also injection state */
-	u64 saved_last_serv_time_ns;
-	unsigned int saved_inject_limit;
-	unsigned long saved_decrease_time_jif;
+	u64 saved_last_serv_time_ns[BFQ_NUM_ACTUATORS];
+	unsigned int saved_inject_limit[BFQ_NUM_ACTUATORS];
+	unsigned long saved_decrease_time_jif[BFQ_NUM_ACTUATORS];
 
 	/* candidate queue for a stable merge (due to close creation time) */
-	struct bfq_queue *stable_merge_bfqq;
+	struct bfq_queue *stable_merge_bfqq[BFQ_NUM_ACTUATORS];
+
+	bool stably_merged[BFQ_NUM_ACTUATORS];	/* non splittable if true */
 
-	bool stably_merged;	/* non splittable if true */
 	unsigned int requests;	/* Number of requests this process has in flight */
 };
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 4/8] block, bfq: split also async bfq_queues on a per-actuator basis
  2022-06-23 15:53 [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives Paolo Valente
                   ` (2 preceding siblings ...)
  2022-06-23 15:53 ` [PATCH 3/8] block, bfq: turn scalar fields into arrays in bfq_io_cq Paolo Valente
@ 2022-06-23 15:53 ` Paolo Valente
  2022-06-23 15:53 ` [PATCH 5/8] block, bfq: turn BFQ_NUM_ACTUATORS into BFQ_MAX_ACTUATORS Paolo Valente
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Paolo Valente @ 2022-06-23 15:53 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-kernel, jack, andrea.righi, glen.valante,
	arie.vanderhoeven, Davide Zini, Paolo Valente

From: Davide Zini <davidezini2@gmail.com>

As with sync bfq_queues, async bfq_queues also need to be split on a
per-actuator basis.

Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Davide Zini <davidezini2@gmail.com>
---
 block/bfq-iosched.c | 41 +++++++++++++++++++++++------------------
 block/bfq-iosched.h |  8 ++++----
 2 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index fa24f9074e6f..acc2282cf196 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -2671,14 +2671,16 @@ static void bfq_bfqq_end_wr(struct bfq_queue *bfqq)
 void bfq_end_wr_async_queues(struct bfq_data *bfqd,
 			     struct bfq_group *bfqg)
 {
-	int i, j;
-
-	for (i = 0; i < 2; i++)
-		for (j = 0; j < IOPRIO_NR_LEVELS; j++)
-			if (bfqg->async_bfqq[i][j])
-				bfq_bfqq_end_wr(bfqg->async_bfqq[i][j]);
-	if (bfqg->async_idle_bfqq)
-		bfq_bfqq_end_wr(bfqg->async_idle_bfqq);
+	int i, j, k;
+
+	for (k = 0; k < BFQ_NUM_ACTUATORS; k++) {
+		for (i = 0; i < 2; i++)
+			for (j = 0; j < IOPRIO_NR_LEVELS; j++)
+				if (bfqg->async_bfqq[i][j][k])
+					bfq_bfqq_end_wr(bfqg->async_bfqq[i][j][k]);
+		if (bfqg->async_idle_bfqq[k])
+			bfq_bfqq_end_wr(bfqg->async_idle_bfqq[k]);
+	}
 }
 
 static void bfq_end_wr(struct bfq_data *bfqd)
@@ -5598,18 +5600,18 @@ static void bfq_init_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 
 static struct bfq_queue **bfq_async_queue_prio(struct bfq_data *bfqd,
 					       struct bfq_group *bfqg,
-					       int ioprio_class, int ioprio)
+					       int ioprio_class, int ioprio, int act_idx)
 {
 	switch (ioprio_class) {
 	case IOPRIO_CLASS_RT:
-		return &bfqg->async_bfqq[0][ioprio];
+		return &bfqg->async_bfqq[0][ioprio][act_idx];
 	case IOPRIO_CLASS_NONE:
 		ioprio = IOPRIO_BE_NORM;
 		fallthrough;
 	case IOPRIO_CLASS_BE:
-		return &bfqg->async_bfqq[1][ioprio];
+		return &bfqg->async_bfqq[1][ioprio][act_idx];
 	case IOPRIO_CLASS_IDLE:
-		return &bfqg->async_idle_bfqq;
+		return &bfqg->async_idle_bfqq[act_idx];
 	default:
 		return NULL;
 	}
@@ -5783,7 +5785,8 @@ static struct bfq_queue *bfq_get_queue(struct bfq_data *bfqd,
 
 	if (!is_sync) {
 		async_bfqq = bfq_async_queue_prio(bfqd, bfqg, ioprio_class,
-						  ioprio);
+						  ioprio,
+						  bfq_actuator_index(bfqd, bio));
 		bfqq = *async_bfqq;
 		if (bfqq)
 			goto out;
@@ -6998,13 +7001,15 @@ static void __bfq_put_async_bfqq(struct bfq_data *bfqd,
  */
 void bfq_put_async_queues(struct bfq_data *bfqd, struct bfq_group *bfqg)
 {
-	int i, j;
+	int i, j, k;
 
-	for (i = 0; i < 2; i++)
-		for (j = 0; j < IOPRIO_NR_LEVELS; j++)
-			__bfq_put_async_bfqq(bfqd, &bfqg->async_bfqq[i][j]);
+	for (k = 0; k < BFQ_NUM_ACTUATORS; k++) {
+		for (i = 0; i < 2; i++)
+			for (j = 0; j < IOPRIO_NR_LEVELS; j++)
+				__bfq_put_async_bfqq(bfqd, &bfqg->async_bfqq[i][j][k]);
 
-	__bfq_put_async_bfqq(bfqd, &bfqg->async_idle_bfqq);
+	__bfq_put_async_bfqq(bfqd, &bfqg->async_idle_bfqq[k]);
+	}
 }
 
 /*
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index e1ed18a9400d..8862c12e8031 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -961,8 +961,8 @@ struct bfq_group {
 
 	void *bfqd;
 
-	struct bfq_queue *async_bfqq[2][IOPRIO_NR_LEVELS];
-	struct bfq_queue *async_idle_bfqq;
+	struct bfq_queue *async_bfqq[2][IOPRIO_NR_LEVELS][BFQ_NUM_ACTUATORS];
+	struct bfq_queue *async_idle_bfqq[BFQ_NUM_ACTUATORS];
 
 	struct bfq_entity *my_entity;
 
@@ -978,8 +978,8 @@ struct bfq_group {
 	struct bfq_entity entity;
 	struct bfq_sched_data sched_data;
 
-	struct bfq_queue *async_bfqq[2][IOPRIO_NR_LEVELS];
-	struct bfq_queue *async_idle_bfqq;
+	struct bfq_queue *async_bfqq[2][IOPRIO_NR_LEVELS][BFQ_NUM_ACTUATORS];
+	struct bfq_queue *async_idle_bfqq[BFQ_NUM_ACTUATORS];
 
 	struct rb_root rq_pos_tree;
 };
-- 
2.20.1



* [PATCH 5/8] block, bfq: turn BFQ_NUM_ACTUATORS into BFQ_MAX_ACTUATORS
  2022-06-23 15:53 [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives Paolo Valente
                   ` (3 preceding siblings ...)
  2022-06-23 15:53 ` [PATCH 4/8] block, bfq: split also async bfq_queues on a per-actuator basis Paolo Valente
@ 2022-06-23 15:53 ` Paolo Valente
  2022-06-23 15:53 ` [PATCH 6/8] block, bfq: retrieve independent access ranges from request queue Paolo Valente
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Paolo Valente @ 2022-06-23 15:53 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-kernel, jack, andrea.righi, glen.valante,
	arie.vanderhoeven, Paolo Valente

This is a preparatory commit for the commit that will use independent
access ranges. The latter retrieves the number of ranges, that is,
the number of actuators. That number is assumed to be at most equal
to the constant BFQ_MAX_ACTUATORS, which is defined in this
preparatory commit. In particular, this commit uses BFQ_MAX_ACTUATORS
in place of the placeholder BFQ_NUM_ACTUATORS, which was introduced
in a previous commit.

The commit that actually uses independent access ranges will then
replace BFQ_MAX_ACTUATORS with the actual number of actuators, where
appropriate.

Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
---
 block/bfq-cgroup.c  |  2 +-
 block/bfq-iosched.c |  8 ++++----
 block/bfq-iosched.h | 48 ++++++++++++++++++++++-----------------------
 3 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index 6462e8de44db..c5866f57b3aa 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -764,7 +764,7 @@ static void *__bfq_bic_change_cgroup(struct bfq_data *bfqd,
 	struct bfq_entity *entity;
 	unsigned int act_idx;
 
-	for (act_idx = 0; act_idx < BFQ_NUM_ACTUATORS; act_idx++) {
+	for (act_idx = 0; act_idx < BFQ_MAX_ACTUATORS; act_idx++) {
 		struct bfq_queue *async_bfqq = bic_to_bfqq(bic, 0, act_idx);
 		struct bfq_queue *sync_bfqq = bic_to_bfqq(bic, 1, act_idx);
 
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index acc2282cf196..99c305df6cce 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -689,7 +689,7 @@ static void bfq_limit_depth(unsigned int op, struct blk_mq_alloc_data *data)
 		limit = (limit * depth) >> bfqd->full_depth_shift;
 	}
 
-	for (act_idx = 0; act_idx < BFQ_NUM_ACTUATORS; act_idx++) {
+	for (act_idx = 0; act_idx < BFQ_MAX_ACTUATORS; act_idx++) {
 		struct bfq_queue *bfqq =
 			bic ? bic_to_bfqq(bic, op_is_sync(op), act_idx) : NULL;
 
@@ -2673,7 +2673,7 @@ void bfq_end_wr_async_queues(struct bfq_data *bfqd,
 {
 	int i, j, k;
 
-	for (k = 0; k < BFQ_NUM_ACTUATORS; k++) {
+	for (k = 0; k < BFQ_MAX_ACTUATORS; k++) {
 		for (i = 0; i < 2; i++)
 			for (j = 0; j < IOPRIO_NR_LEVELS; j++)
 				if (bfqg->async_bfqq[i][j][k])
@@ -5432,7 +5432,7 @@ static void bfq_exit_icq(struct io_cq *icq)
 	if (bfqd)
 		spin_lock_irqsave(&bfqd->lock, flags);
 
-	for (act_idx = 0; act_idx < BFQ_NUM_ACTUATORS; act_idx++) {
+	for (act_idx = 0; act_idx < BFQ_MAX_ACTUATORS; act_idx++) {
 		if (bic->stable_merge_bfqq[act_idx])
 			bfq_put_stable_ref(bic->stable_merge_bfqq[act_idx]);
 
@@ -7003,7 +7003,7 @@ void bfq_put_async_queues(struct bfq_data *bfqd, struct bfq_group *bfqg)
 {
 	int i, j, k;
 
-	for (k = 0; k < BFQ_NUM_ACTUATORS; k++) {
+	for (k = 0; k < BFQ_MAX_ACTUATORS; k++) {
 		for (i = 0; i < 2; i++)
 			for (j = 0; j < IOPRIO_NR_LEVELS; j++)
 				__bfq_put_async_bfqq(bfqd, &bfqg->async_bfqq[i][j][k]);
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index 8862c12e8031..2687e9a85a41 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -33,7 +33,7 @@
  */
 #define BFQ_SOFTRT_WEIGHT_FACTOR	100
 
-#define BFQ_NUM_ACTUATORS 2
+#define BFQ_MAX_ACTUATORS 32
 
 struct bfq_entity;
 
@@ -424,7 +424,7 @@ struct bfq_io_cq {
 	 * is async or sync. So there is a distinct queue for each
 	 * actuator.
 	 */
-	struct bfq_queue *bfqq[2][BFQ_NUM_ACTUATORS];
+	struct bfq_queue *bfqq[2][BFQ_MAX_ACTUATORS];
 	/* per (request_queue, blkcg) ioprio */
 	int ioprio;
 #ifdef CONFIG_BFQ_GROUP_IOSCHED
@@ -442,26 +442,26 @@ struct bfq_io_cq {
 	 * to remember their values while a queue is merged, so as to
 	 * be able to restore them in case of split.
 	 */
-	bool saved_has_short_ttime[BFQ_NUM_ACTUATORS];
+	bool saved_has_short_ttime[BFQ_MAX_ACTUATORS];
 	/*
 	 * Same purpose as the previous two fields for the I/O bound
 	 * classification of a queue.
 	 */
-	bool saved_IO_bound[BFQ_NUM_ACTUATORS];
+	bool saved_IO_bound[BFQ_MAX_ACTUATORS];
 
-	u64 saved_io_start_time[BFQ_NUM_ACTUATORS];
-	u64 saved_tot_idle_time[BFQ_NUM_ACTUATORS];
+	u64 saved_io_start_time[BFQ_MAX_ACTUATORS];
+	u64 saved_tot_idle_time[BFQ_MAX_ACTUATORS];
 
 	/*
 	 * Same purpose as the previous fields for the values of the
 	 * field keeping the queue's belonging to a large burst
 	 */
-	bool saved_in_large_burst[BFQ_NUM_ACTUATORS];
+	bool saved_in_large_burst[BFQ_MAX_ACTUATORS];
 	/*
 	 * True if the queue belonged to a burst list before its merge
 	 * with another cooperating queue.
 	 */
-	bool was_in_burst_list[BFQ_NUM_ACTUATORS];
+	bool was_in_burst_list[BFQ_MAX_ACTUATORS];
 
 	/*
 	 * Save the weight when a merge occurs, to be able
@@ -470,27 +470,27 @@ struct bfq_io_cq {
 	 * then the weight of the recycled queue could differ
 	 * from the weight of the original queue.
 	 */
-	unsigned int saved_weight[BFQ_NUM_ACTUATORS];
+	unsigned int saved_weight[BFQ_MAX_ACTUATORS];
 
 	/*
 	 * Similar to previous fields: save wr information.
 	 */
-	unsigned long saved_wr_coeff[BFQ_NUM_ACTUATORS];
-	unsigned long saved_last_wr_start_finish[BFQ_NUM_ACTUATORS];
-	unsigned long saved_service_from_wr[BFQ_NUM_ACTUATORS];
-	unsigned long saved_wr_start_at_switch_to_srt[BFQ_NUM_ACTUATORS];
-	unsigned int saved_wr_cur_max_time[BFQ_NUM_ACTUATORS];
-	struct bfq_ttime saved_ttime[BFQ_NUM_ACTUATORS];
+	unsigned long saved_wr_coeff[BFQ_MAX_ACTUATORS];
+	unsigned long saved_last_wr_start_finish[BFQ_MAX_ACTUATORS];
+	unsigned long saved_service_from_wr[BFQ_MAX_ACTUATORS];
+	unsigned long saved_wr_start_at_switch_to_srt[BFQ_MAX_ACTUATORS];
+	unsigned int saved_wr_cur_max_time[BFQ_MAX_ACTUATORS];
+	struct bfq_ttime saved_ttime[BFQ_MAX_ACTUATORS];
 
 	/* Save also injection state */
-	u64 saved_last_serv_time_ns[BFQ_NUM_ACTUATORS];
-	unsigned int saved_inject_limit[BFQ_NUM_ACTUATORS];
-	unsigned long saved_decrease_time_jif[BFQ_NUM_ACTUATORS];
+	u64 saved_last_serv_time_ns[BFQ_MAX_ACTUATORS];
+	unsigned int saved_inject_limit[BFQ_MAX_ACTUATORS];
+	unsigned long saved_decrease_time_jif[BFQ_MAX_ACTUATORS];
 
 	/* candidate queue for a stable merge (due to close creation time) */
-	struct bfq_queue *stable_merge_bfqq[BFQ_NUM_ACTUATORS];
+	struct bfq_queue *stable_merge_bfqq[BFQ_MAX_ACTUATORS];
 
-	bool stably_merged[BFQ_NUM_ACTUATORS];	/* non splittable if true */
+	bool stably_merged[BFQ_MAX_ACTUATORS];	/* non splittable if true */
 
 	unsigned int requests;	/* Number of requests this process has in flight */
 };
@@ -961,8 +961,8 @@ struct bfq_group {
 
 	void *bfqd;
 
-	struct bfq_queue *async_bfqq[2][IOPRIO_NR_LEVELS][BFQ_NUM_ACTUATORS];
-	struct bfq_queue *async_idle_bfqq[BFQ_NUM_ACTUATORS];
+	struct bfq_queue *async_bfqq[2][IOPRIO_NR_LEVELS][BFQ_MAX_ACTUATORS];
+	struct bfq_queue *async_idle_bfqq[BFQ_MAX_ACTUATORS];
 
 	struct bfq_entity *my_entity;
 
@@ -978,8 +978,8 @@ struct bfq_group {
 	struct bfq_entity entity;
 	struct bfq_sched_data sched_data;
 
-	struct bfq_queue *async_bfqq[2][IOPRIO_NR_LEVELS][BFQ_NUM_ACTUATORS];
-	struct bfq_queue *async_idle_bfqq[BFQ_NUM_ACTUATORS];
+	struct bfq_queue *async_bfqq[2][IOPRIO_NR_LEVELS][BFQ_MAX_ACTUATORS];
+	struct bfq_queue *async_idle_bfqq[BFQ_MAX_ACTUATORS];
 
 	struct rb_root rq_pos_tree;
 };
-- 
2.20.1



* [PATCH 6/8] block, bfq: retrieve independent access ranges from request queue
  2022-06-23 15:53 [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives Paolo Valente
                   ` (4 preceding siblings ...)
  2022-06-23 15:53 ` [PATCH 5/8] block, bfq: turn BFQ_NUM_ACTUATORS into BFQ_MAX_ACTUATORS Paolo Valente
@ 2022-06-23 15:53 ` Paolo Valente
  2022-06-23 15:53 ` [PATCH 7/8] block, bfq: inject I/O to underutilized actuators Paolo Valente
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Paolo Valente @ 2022-06-23 15:53 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-kernel, jack, andrea.righi, glen.valante,
	arie.vanderhoeven, Federico Gavioli, Paolo Valente

From: Federico Gavioli <f.gavioli97@gmail.com>

This patch implements the code to gather the content of the
independent_access_ranges structure from the request_queue and copy
it into the queue's bfq_data. This copy is done at queue initialization.

We copy the access ranges into the bfq_data to avoid taking the queue
lock each time we access the ranges.

This implementation, however, puts a limit on the maximum number of
independent ranges supported by the scheduler. That limit is the
constant BFQ_MAX_ACTUATORS, and it was introduced to avoid dynamic
memory allocation.

Signed-off-by: Federico Gavioli <f.gavioli97@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
---
 block/bfq-cgroup.c  |  2 +-
 block/bfq-iosched.c | 57 ++++++++++++++++++++++++++++++++++++++-------
 block/bfq-iosched.h | 12 ++++++++++
 3 files changed, 61 insertions(+), 10 deletions(-)

diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index c5866f57b3aa..872036d059d7 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -764,7 +764,7 @@ static void *__bfq_bic_change_cgroup(struct bfq_data *bfqd,
 	struct bfq_entity *entity;
 	unsigned int act_idx;
 
-	for (act_idx = 0; act_idx < BFQ_MAX_ACTUATORS; act_idx++) {
+	for (act_idx = 0; act_idx < bfqd->num_ia_ranges; act_idx++) {
 		struct bfq_queue *async_bfqq = bic_to_bfqq(bic, 0, act_idx);
 		struct bfq_queue *sync_bfqq = bic_to_bfqq(bic, 1, act_idx);
 
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 99c305df6cce..24821334f685 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -689,7 +689,7 @@ static void bfq_limit_depth(unsigned int op, struct blk_mq_alloc_data *data)
 		limit = (limit * depth) >> bfqd->full_depth_shift;
 	}
 
-	for (act_idx = 0; act_idx < BFQ_MAX_ACTUATORS; act_idx++) {
+	for (act_idx = 0; act_idx < bfqd->num_ia_ranges; act_idx++) {
 		struct bfq_queue *bfqq =
 			bic ? bic_to_bfqq(bic, op_is_sync(op), act_idx) : NULL;
 
@@ -1829,10 +1829,24 @@ static bool bfq_bfqq_higher_class_or_weight(struct bfq_queue *bfqq,
 /* get the index of the actuator that will serve bio */
 static unsigned int bfq_actuator_index(struct bfq_data *bfqd, struct bio *bio)
 {
-	/*
-	 * Multi-actuator support not complete yet, so always return 0
-	 * for the moment.
-	 */
+	struct blk_independent_access_range *iar;
+	unsigned int i;
+	sector_t end;
+
+	if (bfqd->num_ia_ranges == 1)
+		return 0;
+
+	end = bio_end_sector(bio);
+
+	for (i = 0; i < bfqd->num_ia_ranges; i++) {
+		iar = &(bfqd->ia_ranges[i]);
+		if (end >= iar->sector && end < iar->sector + iar->nr_sectors)
+			return i;
+	}
+
+	WARN_ONCE(true,
+		  "bfq_actuator_index: bio sector out of ranges: end=%llu\n",
+		  end);
 	return 0;
 }
 
@@ -2477,7 +2491,6 @@ static void bfq_remove_request(struct request_queue *q,
 
 	if (rq->cmd_flags & REQ_META)
 		bfqq->meta_pending--;
-
 }
 
 static bool bfq_bio_merge(struct request_queue *q, struct bio *bio,
@@ -2673,7 +2686,7 @@ void bfq_end_wr_async_queues(struct bfq_data *bfqd,
 {
 	int i, j, k;
 
-	for (k = 0; k < BFQ_MAX_ACTUATORS; k++) {
+	for (k = 0; k < bfqd->num_ia_ranges; k++) {
 		for (i = 0; i < 2; i++)
 			for (j = 0; j < IOPRIO_NR_LEVELS; j++)
 				if (bfqg->async_bfqq[i][j][k])
@@ -5432,7 +5445,7 @@ static void bfq_exit_icq(struct io_cq *icq)
 	if (bfqd)
 		spin_lock_irqsave(&bfqd->lock, flags);
 
-	for (act_idx = 0; act_idx < BFQ_MAX_ACTUATORS; act_idx++) {
+	for (act_idx = 0; act_idx < bfqd->num_ia_ranges; act_idx++) {
 		if (bic->stable_merge_bfqq[act_idx])
 			bfq_put_stable_ref(bic->stable_merge_bfqq[act_idx]);
 
@@ -7003,7 +7016,7 @@ void bfq_put_async_queues(struct bfq_data *bfqd, struct bfq_group *bfqg)
 {
 	int i, j, k;
 
-	for (k = 0; k < BFQ_MAX_ACTUATORS; k++) {
+	for (k = 0; k < bfqd->num_ia_ranges; k++) {
 		for (i = 0; i < 2; i++)
 			for (j = 0; j < IOPRIO_NR_LEVELS; j++)
 				__bfq_put_async_bfqq(bfqd, &bfqg->async_bfqq[i][j][k]);
@@ -7120,6 +7133,7 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
 {
 	struct bfq_data *bfqd;
 	struct elevator_queue *eq;
+	unsigned int i;
 
 	eq = elevator_alloc(q, e);
 	if (!eq)
@@ -7162,6 +7176,31 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
 
 	bfqd->queue = q;
 
+	/*
+	 * If the disk supports multiple actuators, we copy the independent
+	 * access ranges from the request queue structure.
+	 */
+	spin_lock_irq(&q->queue_lock);
+	if (q->ia_ranges) {
+		/*
+		 * Check if the disk ia_ranges size exceeds the current bfq
+		 * actuator limit.
+		 */
+		if (q->ia_ranges->nr_ia_ranges > BFQ_MAX_ACTUATORS) {
+			pr_crit("nr_ia_ranges higher than act limit: iars=%d, max=%d.\n",
+					q->ia_ranges->nr_ia_ranges, BFQ_MAX_ACTUATORS);
+			pr_crit("Falling back to single actuator mode.\n");
+			bfqd->num_ia_ranges = 1;
+		} else {
+			bfqd->num_ia_ranges = q->ia_ranges->nr_ia_ranges;
+
+			for (i = 0; i < bfqd->num_ia_ranges; i++)
+				bfqd->ia_ranges[i] = q->ia_ranges->ia_range[i];
+		}
+	} else
+		bfqd->num_ia_ranges = 1;
+	spin_unlock_irq(&q->queue_lock);
+
 	INIT_LIST_HEAD(&bfqd->dispatch);
 
 	hrtimer_init(&bfqd->idle_slice_timer, CLOCK_MONOTONIC,
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index 2687e9a85a41..0c78feea15d7 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -796,6 +796,18 @@ struct bfq_data {
 	 */
 	unsigned int word_depths[2][2];
 	unsigned int full_depth_shift;
+
+	/*
+	 * Number of independent access ranges. This is equal to 1 in
+	 * case of single actuator drives.
+	 */
+	unsigned int num_ia_ranges;
+
+	/*
+	 * Disk independent access ranges for each actuator
+	 * in this device.
+	 */
+	struct blk_independent_access_range ia_ranges[BFQ_MAX_ACTUATORS];
 };
 
 enum bfqq_state_flags {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 7/8] block, bfq: inject I/O to underutilized actuators
  2022-06-23 15:53 [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives Paolo Valente
                   ` (5 preceding siblings ...)
  2022-06-23 15:53 ` [PATCH 6/8] block, bfq: retrieve independent access ranges from request queue Paolo Valente
@ 2022-06-23 15:53 ` Paolo Valente
  2022-06-23 15:53 ` [PATCH 8/8] block, bfq: balance I/O injection among " Paolo Valente
       [not found] ` <PH7PR20MB505849512979A89E8A66FB24F18E9@PH7PR20MB5058.namprd20.prod.outlook.com>
  8 siblings, 0 replies; 21+ messages in thread
From: Paolo Valente @ 2022-06-23 15:53 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-kernel, jack, andrea.righi, glen.valante,
	arie.vanderhoeven, Davide Zini, Paolo Valente

From: Davide Zini <davidezini2@gmail.com>

The main service scheme of BFQ for sync I/O is serving one sync
bfq_queue at a time, for a while. In particular, BFQ enforces this
scheme when it deems the latter necessary to boost throughput or
to preserve service guarantees. Unfortunately, when BFQ enforces
this policy, only one actuator at a time gets served for a while,
because each bfq_queue contains I/O only for one actuator. The
other actuators may remain underutilized.

Actually, BFQ may serve (inject) extra I/O, taken from other
bfq_queues, in parallel with that of the in-service queue. This
injection mechanism may provide a basis for dealing with the
above actuator-underutilization problem as well. Yet BFQ does not
take the actuator load into account when choosing which queue to
pick extra I/O from. In addition, BFQ may happen to inject extra
I/O only when the in-service queue is temporarily empty.

In view of these facts, this commit extends the
injection mechanism in such a way that the latter:
(1) takes into account also the actuator load;
(2) checks such a load on each dispatch, and injects I/O for an
    underutilized actuator, if there is one and there is I/O for it.

To perform the check in (2), this commit introduces a load
threshold, currently set to 4.  A linear scan of each actuator is
performed, until an actuator is found for which the following two
conditions hold: the load of the actuator is below the threshold,
and there is at least one non-in-service queue that contains I/O
for that actuator. If such a pair (actuator, queue) is found, then
the head request of that queue is returned for dispatch, instead
of the head request of the in-service queue.
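The scan just described can be modeled with a small stand-alone sketch (user-space C; the function and struct names here are hypothetical and the per-actuator state is simplified — the real kernel code operates on struct bfq_data and bfq_queue):

```c
#define ACTUATOR_LOAD_THRESHOLD 4 /* value chosen in this patch */

/* Hypothetical, simplified per-actuator state for illustration only. */
struct act_state {
	int rq_in_driver;   /* requests dispatched, not yet completed */
	int has_pending_io; /* nonzero if some queue has I/O for it */
};

/*
 * Linear scan: return the index of the first actuator whose in-flight
 * load is below the threshold and that has pending I/O, or -1 if no
 * such actuator exists.
 */
static int find_underutilized_actuator(const struct act_state *act, int n)
{
	int i;

	for (i = 0; i < n; i++)
		if (act[i].rq_in_driver < ACTUATOR_LOAD_THRESHOLD &&
		    act[i].has_pending_io)
			return i;
	return -1;
}
```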

We have set the threshold, empirically, to the minimum possible
value for which an actuator is fully utilized, or close to being
fully utilized. By doing so, injected I/O 'steals' as few
drive-queue slots as possible from the in-service queue. This
reduces as much as possible the probability that the service of
I/O from the in-service bfq_queue gets delayed because of slot
exhaustion, i.e., because all the slots of the drive queue are
filled with I/O injected from other queues (NCQ provides for 32
slots).

This new mechanism also counters actuator underutilization in the
case of asymmetric configurations of bfq_queues, namely when there
are few bfq_queues containing I/O for some actuators and many
bfq_queues containing I/O for other actuators, or when the
bfq_queues containing I/O for some actuators have lower weights
than the other bfq_queues.

Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Davide Zini <davidezini2@gmail.com>
---
 block/bfq-cgroup.c  |   2 +-
 block/bfq-iosched.c | 137 +++++++++++++++++++++++++++++++++-----------
 block/bfq-iosched.h |  39 ++++++++++++-
 block/bfq-wf2q.c    |   2 +-
 4 files changed, 141 insertions(+), 39 deletions(-)

diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index 872036d059d7..4b379f0cf8ed 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -699,7 +699,7 @@ void bfq_bfqq_move(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 		bfq_activate_bfqq(bfqd, bfqq);
 	}
 
-	if (!bfqd->in_service_queue && !bfqd->rq_in_driver)
+	if (!bfqd->in_service_queue && !bfqd->tot_rq_in_driver)
 		bfq_schedule_dispatch(bfqd);
 	/* release extra ref taken above, bfqq may happen to be freed now */
 	bfq_put_queue(bfqq);
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 24821334f685..80eb7d542b70 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -2248,6 +2248,7 @@ static void bfq_add_request(struct request *rq)
 
 	bfq_log_bfqq(bfqd, bfqq, "add_request %d", rq_is_sync(rq));
 	bfqq->queued[rq_is_sync(rq)]++;
+
 	/*
 	 * Updating of 'bfqd->queued' is protected by 'bfqd->lock', however, it
 	 * may be read without holding the lock in bfq_has_work().
@@ -2293,9 +2294,9 @@ static void bfq_add_request(struct request *rq)
 		 *   elapsed.
 		 */
 		if (bfqq == bfqd->in_service_queue &&
-		    (bfqd->rq_in_driver == 0 ||
+		    (bfqd->tot_rq_in_driver == 0 ||
 		     (bfqq->last_serv_time_ns > 0 &&
-		      bfqd->rqs_injected && bfqd->rq_in_driver > 0)) &&
+		      bfqd->rqs_injected && bfqd->tot_rq_in_driver > 0)) &&
 		    time_is_before_eq_jiffies(bfqq->decrease_time_jif +
 					      msecs_to_jiffies(10))) {
 			bfqd->last_empty_occupied_ns = ktime_get_ns();
@@ -2319,7 +2320,7 @@ static void bfq_add_request(struct request *rq)
 			 * will be set in case injection is performed
 			 * on bfqq before rq is completed).
 			 */
-			if (bfqd->rq_in_driver == 0)
+			if (bfqd->tot_rq_in_driver == 0)
 				bfqd->rqs_injected = false;
 		}
 	}
@@ -2417,15 +2418,18 @@ static sector_t get_sdist(sector_t last_pos, struct request *rq)
 static void bfq_activate_request(struct request_queue *q, struct request *rq)
 {
 	struct bfq_data *bfqd = q->elevator->elevator_data;
+	unsigned int act_idx = bfq_actuator_index(bfqd, rq->bio);
 
-	bfqd->rq_in_driver++;
+	bfqd->tot_rq_in_driver++;
+	bfqd->rq_in_driver[act_idx]++;
 }
 
 static void bfq_deactivate_request(struct request_queue *q, struct request *rq)
 {
 	struct bfq_data *bfqd = q->elevator->elevator_data;
 
-	bfqd->rq_in_driver--;
+	bfqd->tot_rq_in_driver--;
+	bfqd->rq_in_driver[bfq_actuator_index(bfqd, rq->bio)]--;
 }
 #endif
 
@@ -2699,11 +2703,14 @@ void bfq_end_wr_async_queues(struct bfq_data *bfqd,
 static void bfq_end_wr(struct bfq_data *bfqd)
 {
 	struct bfq_queue *bfqq;
+	int i;
 
 	spin_lock_irq(&bfqd->lock);
 
-	list_for_each_entry(bfqq, &bfqd->active_list, bfqq_list)
-		bfq_bfqq_end_wr(bfqq);
+	for (i = 0; i < bfqd->num_ia_ranges; i++) {
+		list_for_each_entry(bfqq, &bfqd->active_list[i], bfqq_list)
+			bfq_bfqq_end_wr(bfqq);
+	}
 	list_for_each_entry(bfqq, &bfqd->idle_list, bfqq_list)
 		bfq_bfqq_end_wr(bfqq);
 	bfq_end_wr_async(bfqd);
@@ -3638,13 +3645,13 @@ static void bfq_update_peak_rate(struct bfq_data *bfqd, struct request *rq)
 	 * - start a new observation interval with this dispatch
 	 */
 	if (now_ns - bfqd->last_dispatch > 100*NSEC_PER_MSEC &&
-	    bfqd->rq_in_driver == 0)
+	    bfqd->tot_rq_in_driver == 0)
 		goto update_rate_and_reset;
 
 	/* Update sampling information */
 	bfqd->peak_rate_samples++;
 
-	if ((bfqd->rq_in_driver > 0 ||
+	if ((bfqd->tot_rq_in_driver > 0 ||
 		now_ns - bfqd->last_completion < BFQ_MIN_TT)
 	    && !BFQ_RQ_SEEKY(bfqd, bfqd->last_position, rq))
 		bfqd->sequential_samples++;
@@ -3911,7 +3918,7 @@ static bool idling_needed_for_service_guarantees(struct bfq_data *bfqd,
 	return (bfqq->wr_coeff > 1 &&
 		(bfqd->wr_busy_queues <
 		 tot_busy_queues ||
-		 bfqd->rq_in_driver >=
+		 bfqd->tot_rq_in_driver >=
 		 bfqq->dispatched + 4)) ||
 		bfq_asymmetric_scenario(bfqd, bfqq) ||
 		tot_busy_queues == 1;
@@ -4683,6 +4690,7 @@ bfq_choose_bfqq_for_injection(struct bfq_data *bfqd)
 {
 	struct bfq_queue *bfqq, *in_serv_bfqq = bfqd->in_service_queue;
 	unsigned int limit = in_serv_bfqq->inject_limit;
+	int i;
 	/*
 	 * If
 	 * - bfqq is not weight-raised and therefore does not carry
@@ -4714,7 +4722,7 @@ bfq_choose_bfqq_for_injection(struct bfq_data *bfqd)
 		)
 		limit = 1;
 
-	if (bfqd->rq_in_driver >= limit)
+	if (bfqd->tot_rq_in_driver >= limit)
 		return NULL;
 
 	/*
@@ -4729,11 +4737,12 @@ bfq_choose_bfqq_for_injection(struct bfq_data *bfqd)
 	 *   (and re-added only if it gets new requests, but then it
 	 *   is assigned again enough budget for its new backlog).
 	 */
-	list_for_each_entry(bfqq, &bfqd->active_list, bfqq_list)
-		if (!RB_EMPTY_ROOT(&bfqq->sort_list) &&
-		    (in_serv_always_inject || bfqq->wr_coeff > 1) &&
-		    bfq_serv_to_charge(bfqq->next_rq, bfqq) <=
-		    bfq_bfqq_budget_left(bfqq)) {
+	for (i = 0; i < bfqd->num_ia_ranges; i++) {
+		list_for_each_entry(bfqq, &bfqd->active_list[i], bfqq_list)
+			if (!RB_EMPTY_ROOT(&bfqq->sort_list) &&
+				(in_serv_always_inject || bfqq->wr_coeff > 1) &&
+				bfq_serv_to_charge(bfqq->next_rq, bfqq) <=
+				bfq_bfqq_budget_left(bfqq)) {
 			/*
 			 * Allow for only one large in-flight request
 			 * on non-rotational devices, for the
@@ -4758,22 +4767,67 @@ bfq_choose_bfqq_for_injection(struct bfq_data *bfqd)
 			else
 				limit = in_serv_bfqq->inject_limit;
 
-			if (bfqd->rq_in_driver < limit) {
+			if (bfqd->tot_rq_in_driver < limit) {
 				bfqd->rqs_injected = true;
 				return bfqq;
 			}
 		}
+	}
+
+	return NULL;
+}
+
+struct bfq_queue *bfq_find_active_bfqq_for_actuator(struct bfq_data *bfqd,
+						    int idx)
+{
+	struct bfq_queue *bfqq = NULL;
+
+	if (bfqd->in_service_queue &&
+	    bfqd->in_service_queue->actuator_idx == idx)
+		return bfqd->in_service_queue;
+
+	list_for_each_entry(bfqq, &bfqd->active_list[idx], bfqq_list) {
+		if (!RB_EMPTY_ROOT(&bfqq->sort_list) &&
+			bfq_serv_to_charge(bfqq->next_rq, bfqq) <=
+				bfq_bfqq_budget_left(bfqq)) {
+			return bfqq;
+		}
+	}
+
+	return NULL;
+}
+
+/*
+ * Perform a linear scan of each actuator, until an actuator is found
+ * for which the following two conditions hold: the load of the
+ * actuator is below the threshold (see comments on actuator_load_threshold
+ * for details), and there is a queue that contains I/O for that
+ * actuator. On success, return that queue.
+ */
+struct bfq_queue *bfq_find_bfqq_for_underused_actuator(struct bfq_data *bfqd)
+{
+	int i;
+
+	for (i = 0 ; i < bfqd->num_ia_ranges; i++)
+		if (bfqd->rq_in_driver[i] < bfqd->actuator_load_threshold) {
+			struct bfq_queue *bfqq =
+				bfq_find_active_bfqq_for_actuator(bfqd, i);
+
+			if (bfqq)
+				return bfqq;
+		}
 
 	return NULL;
 }
 
+
 /*
  * Select a queue for service.  If we have a current queue in service,
  * check whether to continue servicing it, or retrieve and set a new one.
  */
 static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
 {
-	struct bfq_queue *bfqq;
+	struct bfq_queue *bfqq, *inject_bfqq;
 	struct request *next_rq;
 	enum bfqq_expiration reason = BFQQE_BUDGET_TIMEOUT;
 
@@ -4795,6 +4849,15 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
 		goto expire;
 
 check_queue:
+	/*
+	 *  If some actuator is underutilized, but the in-service
+	 *  queue does not contain I/O for that actuator, then try to
+	 *  inject I/O for that actuator.
+	 */
+	inject_bfqq = bfq_find_bfqq_for_underused_actuator(bfqd);
+	if (inject_bfqq && inject_bfqq != bfqq)
+		return inject_bfqq;
+
 	/*
 	 * This loop is rarely executed more than once. Even when it
 	 * happens, it is much more convenient to re-execute this loop
@@ -5150,11 +5213,11 @@ static struct request *__bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
 
 		/*
 		 * We exploit the bfq_finish_requeue_request hook to
-		 * decrement rq_in_driver, but
+		 * decrement tot_rq_in_driver, but
 		 * bfq_finish_requeue_request will not be invoked on
 		 * this request. So, to avoid unbalance, just start
-		 * this request, without incrementing rq_in_driver. As
-		 * a negative consequence, rq_in_driver is deceptively
+		 * this request, without incrementing tot_rq_in_driver. As
+		 * a negative consequence, tot_rq_in_driver is deceptively
 		 * lower than it should be while this request is in
 		 * service. This may cause bfq_schedule_dispatch to be
 		 * invoked uselessly.
@@ -5163,7 +5226,7 @@ static struct request *__bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
 		 * bfq_finish_requeue_request hook, if defined, is
 		 * probably invoked also on this request. So, by
 		 * exploiting this hook, we could 1) increment
-		 * rq_in_driver here, and 2) decrement it in
+		 * tot_rq_in_driver here, and 2) decrement it in
 		 * bfq_finish_requeue_request. Such a solution would
 		 * let the value of the counter be always accurate,
 		 * but it would entail using an extra interface
@@ -5192,7 +5255,7 @@ static struct request *__bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
 	 * Of course, serving one request at a time may cause loss of
 	 * throughput.
 	 */
-	if (bfqd->strict_guarantees && bfqd->rq_in_driver > 0)
+	if (bfqd->strict_guarantees && bfqd->tot_rq_in_driver > 0)
 		goto exit;
 
 	bfqq = bfq_select_queue(bfqd);
@@ -5203,7 +5266,8 @@ static struct request *__bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
 
 	if (rq) {
 inc_in_driver_start_rq:
-		bfqd->rq_in_driver++;
+		bfqd->rq_in_driver[bfqq->actuator_idx]++;
+		bfqd->tot_rq_in_driver++;
 start_rq:
 		rq->rq_flags |= RQF_STARTED;
 	}
@@ -6265,7 +6329,7 @@ static void bfq_update_hw_tag(struct bfq_data *bfqd)
 	struct bfq_queue *bfqq = bfqd->in_service_queue;
 
 	bfqd->max_rq_in_driver = max_t(int, bfqd->max_rq_in_driver,
-				       bfqd->rq_in_driver);
+				       bfqd->tot_rq_in_driver);
 
 	if (bfqd->hw_tag == 1)
 		return;
@@ -6276,7 +6340,7 @@ static void bfq_update_hw_tag(struct bfq_data *bfqd)
 	 * sum is not exact, as it's not taking into account deactivated
 	 * requests.
 	 */
-	if (bfqd->rq_in_driver + bfqd->queued <= BFQ_HW_QUEUE_THRESHOLD)
+	if (bfqd->tot_rq_in_driver + bfqd->queued <= BFQ_HW_QUEUE_THRESHOLD)
 		return;
 
 	/*
@@ -6287,7 +6351,7 @@ static void bfq_update_hw_tag(struct bfq_data *bfqd)
 	if (bfqq && bfq_bfqq_has_short_ttime(bfqq) &&
 	    bfqq->dispatched + bfqq->queued[0] + bfqq->queued[1] <
 	    BFQ_HW_QUEUE_THRESHOLD &&
-	    bfqd->rq_in_driver < BFQ_HW_QUEUE_THRESHOLD)
+	    bfqd->tot_rq_in_driver < BFQ_HW_QUEUE_THRESHOLD)
 		return;
 
 	if (bfqd->hw_tag_samples++ < BFQ_HW_QUEUE_SAMPLES)
@@ -6308,7 +6372,8 @@ static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)
 
 	bfq_update_hw_tag(bfqd);
 
-	bfqd->rq_in_driver--;
+	bfqd->rq_in_driver[bfqq->actuator_idx]--;
+	bfqd->tot_rq_in_driver--;
 	bfqq->dispatched--;
 
 	if (!bfqq->dispatched && !bfq_bfqq_busy(bfqq)) {
@@ -6427,7 +6492,7 @@ static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)
 					BFQQE_NO_MORE_REQUESTS);
 	}
 
-	if (!bfqd->rq_in_driver)
+	if (!bfqd->tot_rq_in_driver)
 		bfq_schedule_dispatch(bfqd);
 }
 
@@ -6558,13 +6623,13 @@ static void bfq_update_inject_limit(struct bfq_data *bfqd,
 	 * conditions to do it, or we can lower the last base value
 	 * computed.
 	 *
-	 * NOTE: (bfqd->rq_in_driver == 1) means that there is no I/O
+	 * NOTE: (bfqd->tot_rq_in_driver == 1) means that there is no I/O
 	 * request in flight, because this function is in the code
 	 * path that handles the completion of a request of bfqq, and,
 	 * in particular, this function is executed before
-	 * bfqd->rq_in_driver is decremented in such a code path.
+	 * bfqd->tot_rq_in_driver is decremented in such a code path.
 	 */
-	if ((bfqq->last_serv_time_ns == 0 && bfqd->rq_in_driver == 1) ||
+	if ((bfqq->last_serv_time_ns == 0 && bfqd->tot_rq_in_driver == 1) ||
 	    tot_time_ns < bfqq->last_serv_time_ns) {
 		if (bfqq->last_serv_time_ns == 0) {
 			/*
@@ -6574,7 +6639,7 @@ static void bfq_update_inject_limit(struct bfq_data *bfqd,
 			bfqq->inject_limit = max_t(unsigned int, 1, old_limit);
 		}
 		bfqq->last_serv_time_ns = tot_time_ns;
-	} else if (!bfqd->rqs_injected && bfqd->rq_in_driver == 1)
+	} else if (!bfqd->rqs_injected && bfqd->tot_rq_in_driver == 1)
 		/*
 		 * No I/O injected and no request still in service in
 		 * the drive: these are the exact conditions for
@@ -7210,7 +7275,8 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
 	bfqd->queue_weights_tree = RB_ROOT_CACHED;
 	bfqd->num_groups_with_pending_reqs = 0;
 
-	INIT_LIST_HEAD(&bfqd->active_list);
+	INIT_LIST_HEAD(&bfqd->active_list[0]);
+	INIT_LIST_HEAD(&bfqd->active_list[1]);
 	INIT_LIST_HEAD(&bfqd->idle_list);
 	INIT_HLIST_HEAD(&bfqd->burst_list);
 
@@ -7255,6 +7321,9 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
 		ref_wr_duration[blk_queue_nonrot(bfqd->queue)];
 	bfqd->peak_rate = ref_rate[blk_queue_nonrot(bfqd->queue)] * 2 / 3;
 
+	/* see comments on the definition of next field inside bfq_data */
+	bfqd->actuator_load_threshold = 4;
+
 	spin_lock_init(&bfqd->lock);
 
 	/*
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index 0c78feea15d7..10c2d83bcb34 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -578,7 +578,12 @@ struct bfq_data {
 	/* number of queued requests */
 	int queued;
 	/* number of requests dispatched and waiting for completion */
-	int rq_in_driver;
+	int tot_rq_in_driver;
+	/*
+	 * number of requests dispatched and waiting for completion
+	 * for each actuator
+	 */
+	int rq_in_driver[BFQ_MAX_ACTUATORS];
 
 	/* true if the device is non rotational and performs queueing */
 	bool nonrot_with_queueing;
@@ -672,8 +677,13 @@ struct bfq_data {
 	/* maximum budget allotted to a bfq_queue before rescheduling */
 	int bfq_max_budget;
 
-	/* list of all the bfq_queues active on the device */
-	struct list_head active_list;
+	/*
+	 * List of all the bfq_queues active for a specific actuator
+	 * on the device. Keeping active queues separate on a
+	 * per-actuator basis helps implementing per-actuator
+	 * injection more efficiently.
+	 */
+	struct list_head active_list[BFQ_MAX_ACTUATORS];
 	/* list of all the bfq_queues idle on the device */
 	struct list_head idle_list;
 
@@ -808,6 +818,29 @@ struct bfq_data {
 	 * in this device.
 	 */
 	struct blk_independent_access_range ia_ranges[BFQ_MAX_ACTUATORS];
+
+	/*
+	 * If the number of I/O requests queued in the device for a
+	 * given actuator is below next threshold, then the actuator
+	 * is deemed as underutilized. If this condition is found to
+	 * hold for some actuator upon a dispatch, but (i) the
+	 * in-service queue does not contain I/O for that actuator,
+	 * while (ii) some other queue does contain I/O for that
+	 * actuator, then the head I/O request of the latter queue is
+	 * returned (injected), instead of the head request of the
+	 * currently in-service queue.
+	 *
+	 * We set the threshold, empirically, to the minimum possible
+	 * value for which an actuator is fully utilized, or close to
+	 * being fully utilized. By doing so, injected I/O 'steals' as
+	 * few drive-queue slots as possible from the in-service
+	 * queue. This reduces as much as possible the probability
+	 * that the service of I/O from the in-service bfq_queue gets
+	 * delayed because of slot exhaustion, i.e., because all the
+	 * slots of the drive queue are filled with I/O injected from
+	 * other queues (NCQ provides for 32 slots).
+	 */
+	unsigned int actuator_load_threshold;
 };
 
 enum bfqq_state_flags {
diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
index f8eb340381cf..f142f28e2f4d 100644
--- a/block/bfq-wf2q.c
+++ b/block/bfq-wf2q.c
@@ -477,7 +477,7 @@ static void bfq_active_insert(struct bfq_service_tree *st,
 	bfqd = (struct bfq_data *)bfqg->bfqd;
 #endif
 	if (bfqq)
-		list_add(&bfqq->bfqq_list, &bfqq->bfqd->active_list);
+		list_add(&bfqq->bfqq_list, &bfqq->bfqd->active_list[bfqq->actuator_idx]);
 #ifdef CONFIG_BFQ_GROUP_IOSCHED
 	if (bfqg != bfqd->root_group)
 		bfqg->active_entities++;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 8/8] block, bfq: balance I/O injection among underutilized actuators
  2022-06-23 15:53 [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives Paolo Valente
                   ` (6 preceding siblings ...)
  2022-06-23 15:53 ` [PATCH 7/8] block, bfq: inject I/O to underutilized actuators Paolo Valente
@ 2022-06-23 15:53 ` Paolo Valente
       [not found] ` <PH7PR20MB505849512979A89E8A66FB24F18E9@PH7PR20MB5058.namprd20.prod.outlook.com>
  8 siblings, 0 replies; 21+ messages in thread
From: Paolo Valente @ 2022-06-23 15:53 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-kernel, jack, andrea.righi, glen.valante,
	arie.vanderhoeven, Davide Zini, Paolo Valente

From: Davide Zini <davidezini2@gmail.com>

Upon the invocation of its dispatch function, BFQ returns the next I/O
request of the in-service bfq_queue, unless some exception holds. One
such exception is that there is some underutilized actuator, different
from the actuator for which the in-service queue contains I/O, and
that some other bfq_queue happens to contain I/O for such an
actuator. In this case, the next I/O request of the latter bfq_queue,
and not of the in-service bfq_queue, is returned (I/O is injected from
that bfq_queue). To find such an actuator, a linear scan, in
increasing index order, is performed among actuators.

Performing a linear scan entails a prioritization among actuators: an
underutilized actuator may be considered for injection only if all
actuators with a lower index are currently fully utilized, or if there
is no pending I/O for any lower-index actuator that happens to be
underutilized.

This commit breaks this prioritization and tends to distribute
injection uniformly across actuators. This is obtained by adding the
following condition to the linear scan: even if an actuator A is
underutilized, A is however skipped if its load is higher than that of
the next actuator.
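The extra balancing condition can be sketched as follows (stand-alone C with hypothetical names, under the same assumed threshold of 4 as in patch 7; mirroring the added check, actuator i qualifies only if its load is strictly lower than that of actuator i+1, except for the last actuator):

```c
#define ACTUATOR_LOAD_THRESHOLD 4 /* assumed value, as in patch 7 */

/*
 * Balanced linear scan (illustrative sketch): actuator i is chosen
 * only if its in-flight load is below the threshold, is strictly
 * lower than the load of actuator i+1 (the last actuator has no next
 * one to compare against), and there is pending I/O for it. Returns
 * the chosen actuator index, or -1 if none qualifies.
 */
static int find_balanced_actuator(const int *rq_in_driver,
				  const int *has_pending, int n)
{
	int i;

	for (i = 0; i < n; i++)
		if (rq_in_driver[i] < ACTUATOR_LOAD_THRESHOLD &&
		    (i == n - 1 || rq_in_driver[i] < rq_in_driver[i + 1]) &&
		    has_pending[i])
			return i;
	return -1;
}
```

Note how, with equal loads on two underutilized actuators, the scan now falls through to the higher-index one, which is what breaks the strict lower-index priority of the plain scan.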

Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Davide Zini <davidezini2@gmail.com>
---
 block/bfq-iosched.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 80eb7d542b70..ee113d1b949f 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -4799,17 +4799,25 @@ struct bfq_queue *bfq_find_active_bfqq_for_actuator(struct bfq_data *bfqd,
 
 /*
  * Perform a linear scan of each actuator, until an actuator is found
- * for which the following two conditions hold: the load of the
- * actuator is below the threshold (see comments on actuator_load_threshold
- * for details), and there is a queue that contains I/O for that
- * actuator. On success, return that queue.
+ * for which the following three conditions hold: the load of the
+ * actuator is below the threshold (see comments on
+ * actuator_load_threshold for details) and lower than that of the
+ * next actuator (comments on this extra condition below), and there
+ * is a queue that contains I/O for that actuator. On success, return
+ * that queue.
+ *
+ * Performing a plain linear scan entails a prioritization among
+ * actuators. The extra condition above breaks this prioritization and
+ * tends to distribute injection uniformly across actuators.
  */
 struct bfq_queue *bfq_find_bfqq_for_underused_actuator(struct bfq_data *bfqd)
 {
 	int i;
 
 	for (i = 0 ; i < bfqd->num_ia_ranges; i++)
-		if (bfqd->rq_in_driver[i] < bfqd->actuator_load_threshold) {
+		if (bfqd->rq_in_driver[i] < bfqd->actuator_load_threshold &&
+		    (i == bfqd->num_ia_ranges - 1 ||
+		     bfqd->rq_in_driver[i] < bfqd->rq_in_driver[i+1])) {
 			struct bfq_queue *bfqq =
 				bfq_find_active_bfqq_for_actuator(bfqd, i);
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
       [not found]       ` <DM4PR20MB5061638AB3F389ECD0714029F1969@DM4PR20MB5061.namprd20.prod.outlook.com>
@ 2022-07-29 16:36         ` Arie van der Hoeven
  0 siblings, 0 replies; 21+ messages in thread
From: Arie van der Hoeven @ 2022-07-29 16:36 UTC (permalink / raw)
  To: Rory Chen, Paolo Valente
  Cc: linux-block, linux-kernel, jack, andrea.righi, glen.valante,
	axboe, Tyler Erickson, Muhammad Ahmad, Michael English,
	Andrew Ring, Varun Boddu

<Resending as plain text>

Adding Tyler Erickson who worked on the patch for concurrent positioning ranges that may be related.  Adding others at Seagate who worked on validating the patch as well.  (Muhammad, Michael, Andrew, Varun)

Thanks,
--Arie

From: Rory Chen <rory.c.chen@seagate.com>
Sent: Thursday, July 21, 2022 5:59 PM
To: Paolo Valente <paolo.valente@linaro.org>
Cc: linux-block@vger.kernel.org <linux-block@vger.kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; jack@suse.cz <jack@suse.cz>; andrea.righi@canonical.com <andrea.righi@canonical.com>; glen.valante@linaro.org <glen.valante@linaro.org>; Arie van der Hoeven <arie.vanderhoeven@seagate.com>; axboe@kernel.dk <axboe@kernel.dk>
Subject: RE: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives

Hi Paolo,

I’m from Seagate China and face a problem when evaluating the bfq patches. Could you please check? Thanks

Issue statement
When running performance test on bfq patch, I observed warning message “bfq_actuator_index: bio sector out of ranges: end=35156656128” and OS hung suddenly after some hours.
The warning message is reported from the function bfq_actuator_index, which determines which actuator index an I/O request maps to.  The bio_end_sector is 35156656128, but the max LBA for the drive is 35156656127, so it’s beyond the LBA range.  I captured the block trace and did not find a request for LBA 35156656128; the maximum request LBA found was 35156656127.
I’m not sure if this warning message is related to later OS hung.  /var/log/messages file is attached.


Problem environment
Kernel base is 5.18.9
Test HDD drive is Seagate ST18000NM0092 dual actuator SATA.
Actuator LBA mapping by reading VPD B9
Concurrent positioning ranges VPD page (SBC):
LBA range number: 0
number of storage elements: 1
starting LBA: 0x0
number of LBAs: 0x417c00000 [17578328064]
LBA range number: 1
number of storage elements: 1
starting LBA: 0x417c00000
number of LBAs: 0x417c00000 [17578328064]



Seagate Internal
From: Arie van der Hoeven <arie.vanderhoeven@seagate.com>
Sent: Wednesday, July 20, 2022 10:41 PM
To: Rory Chen <rory.c.chen@seagate.com>
Subject: Fw: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives



From: Paolo Valente <paolo.valente@linaro.org>
Sent: Thursday, June 23, 2022 8:53 AM
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org <linux-block@vger.kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; jack@suse.cz <jack@suse.cz>; andrea.righi@canonical.com <andrea.righi@canonical.com>; glen.valante@linaro.org <glen.valante@linaro.org>; Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Paolo Valente <paolo.valente@linaro.org>
Subject: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives


This message has originated from an External Source. Please use proper judgment and caution when opening attachments, clicking links, or responding to this email.


Hi,
this patch series extends BFQ so as to optimize I/O dispatch to
multi-actuator drives. In particular, this extension addresses the
following issue. Multi-actuator drives appear as a single device to
the I/O subsystem [1].  Yet they address commands to different
actuators internally, as a function of Logical Block Addressing
(LBAs). A given sector is reachable by only one of the actuators. For
example, Seagate’s Serial Advanced Technology Attachment (SATA)
version contains two actuators and maps the lower half of the SATA LBA
space to the lower actuator and the upper half to the upper actuator.

Evidently, to fully utilize actuators, no actuator must be left idle
or underutilized while there is pending I/O for it. To reach this
goal, the block layer must somehow control the load of each actuator
individually. This series enriches BFQ with such a per-actuator
control, as a first step. Then it also adds a simple mechanism for
guaranteeing that actuators with pending I/O are never left idle.

See [1] for a more detailed overview of the problem and of the
solutions implemented in this patch series. There you will also find
some preliminary performance results.

Thanks,
Paolo

[1] https://www.linaro.org/blog/budget-fair-queueing-bfq-linux-io-scheduler-optimizations-for-multi-actuator-sata-hard-drives/

Davide Zini (3):
  block, bfq: split also async bfq_queues on a per-actuator basis
  block, bfq: inject I/O to underutilized actuators
  block, bfq: balance I/O injection among underutilized actuators

Federico Gavioli (1):
  block, bfq: retrieve independent access ranges from request queue

Paolo Valente (4):
  block, bfq: split sync bfq_queues on a per-actuator basis
  block, bfq: forbid stable merging of queues associated with different
    actuators
  block, bfq: turn scalar fields into arrays in bfq_io_cq
  block, bfq: turn BFQ_NUM_ACTUATORS into BFQ_MAX_ACTUATORS

 block/bfq-cgroup.c  |  97 +++++----
 block/bfq-iosched.c | 488 +++++++++++++++++++++++++++++---------------
 block/bfq-iosched.h | 149 ++++++++++----
 block/bfq-wf2q.c    |   2 +-
 4 files changed, 493 insertions(+), 243 deletions(-)

--
2.20.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
       [not found] ` <PH7PR20MB505849512979A89E8A66FB24F18E9@PH7PR20MB5058.namprd20.prod.outlook.com>
       [not found]   ` <SJ0PR20MB44097AAA14A819BB2C7B4AB3A0909@SJ0PR20MB4409.namprd20.prod.outlook.com>
@ 2022-08-09  3:47   ` Rory Chen
  2022-08-10 10:22     ` Paolo Valente
  1 sibling, 1 reply; 21+ messages in thread
From: Rory Chen @ 2022-08-09  3:47 UTC (permalink / raw)
  To: Arie van der Hoeven, Muhammad Ahmad, linux-block, linux-kernel,
	jack, andrea.righi, glen.valante, axboe, Tyler Erickson,
	Michael English, Andrew Ring, Varun Boddu, Paolo Valente

Resending the mail as plain text, because the previous rich-text mail made a mess, and adding the others at Seagate who worked on validating the
patch as well (Muhammad, Michael, Andrew, Varun, Tyler).

Hi Paolo,

I am from Seagate China and face a problem when evaluating the bfq patches. Could you please check? Thanks

Issue statement
When running performance test on bfq patch, I observed warning message "bfq_actuator_index: bio sector out of ranges: end=35156656128" and OS hung suddenly after some hours.
The warning message is reported from the function bfq_actuator_index, which determines which actuator index an I/O request maps to.  The bio_end_sector is 35156656128, but the max LBA for the drive is 35156656127, so it’s beyond the LBA range.  I captured the block trace and did not find a request for LBA 35156656128; the maximum request LBA found was 35156656127.
I’m not sure if this warning message is related to later OS hung.


Problem environment
Kernel base is 5.18.9
Test HDD drive is Seagate ST18000NM0092 dual actuator SATA.
Actuator LBA mapping by reading VPD B9
Concurrent positioning ranges VPD page:
LBA range number:0
number of storage elements:1
starting LBA:0x0
number of LBAs:0x417c00000 [17578328064]
LBA range number:1
number of storage elements:1
starting LBA:0x417c00000
number of LBAs:0x417c00000 [17578328064]
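One possible source of the out-of-range value: the kernel's bio_end_sector() is exclusive, i.e. it returns the sector after the last one accessed, so a bio that touches the final LBA 35156656127 reports end = 35156656128 even though no request beyond the last LBA exists. A stand-alone sketch of the range classification under that assumption, using the two ranges reported above (hypothetical function name; assuming the drive's LBAs coincide with the kernel's 512-byte sectors):

```c
#include <stdint.h>

/* The two independent access ranges from the VPD page above. */
static const uint64_t range1_start = 0x417c00000ULL;     /* 17578328064 */
static const uint64_t capacity = 2 * 0x417c00000ULL;     /* 35156656128 */

/*
 * Map a bio to an actuator index from its exclusive end sector.
 * Since the end sector is one past the last sector accessed, a bio
 * ending at the device capacity is still valid; classifying by
 * end - 1 (the last sector actually accessed) keeps it in range,
 * whereas comparing the raw end sector against the last LBA would
 * spuriously flag it as out of range.
 */
static int actuator_index(uint64_t bio_end_sector)
{
	uint64_t last = bio_end_sector - 1;

	return last < range1_start ? 0 : 1;
}
```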





From: Paolo Valente <paolo.valente@linaro.org>
Sent: Thursday, June 23, 2022 8:53 AM
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org <linux-block@vger.kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; jack@suse.cz <jack@suse.cz>; andrea.righi@canonical.com <andrea.righi@canonical.com>; glen.valante@linaro.org <glen.valante@linaro.org>; Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Paolo Valente <paolo.valente@linaro.org>
Subject: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives


This message has originated from an External Source. Please use proper judgment and caution when opening attachments, clicking links, or responding to this email.


Hi,
this patch series extends BFQ so as to optimize I/O dispatch to
multi-actuator drives. In particular, this extension addresses the
following issue. Multi-actuator drives appear as a single device to
the I/O subsystem [1].  Yet they address commands to different
actuators internally, as a function of Logical Block Addressing
(LBAs). A given sector is reachable by only one of the actuators. For
example, Seagate’s Serial Advanced Technology Attachment (SATA)
version contains two actuators and maps the lower half of the SATA LBA
space to the lower actuator and the upper half to the upper actuator.

Evidently, to fully utilize actuators, no actuator must be left idle
or underutilized while there is pending I/O for it. To reach this
goal, the block layer must somehow control the load of each actuator
individually. This series enriches BFQ with such a per-actuator
control, as a first step. Then it also adds a simple mechanism for
guaranteeing that actuators with pending I/O are never left idle.

See [1] for a more detailed overview of the problem and of the
solutions implemented in this patch series. There you will also find
some preliminary performance results.

Thanks,
Paolo

[1] https://www.linaro.org/blog/budget-fair-queueing-bfq-linux-io-scheduler-optimizations-for-multi-actuator-sata-hard-drives/

Davide Zini (3):
  block, bfq: split also async bfq_queues on a per-actuator basis
  block, bfq: inject I/O to underutilized actuators
  block, bfq: balance I/O injection among underutilized actuators

Federico Gavioli (1):
  block, bfq: retrieve independent access ranges from request queue

Paolo Valente (4):
  block, bfq: split sync bfq_queues on a per-actuator basis
  block, bfq: forbid stable merging of queues associated with different
    actuators
  block, bfq: turn scalar fields into arrays in bfq_io_cq
  block, bfq: turn BFQ_NUM_ACTUATORS into BFQ_MAX_ACTUATORS

 block/bfq-cgroup.c  |  97 +++++----
 block/bfq-iosched.c | 488 +++++++++++++++++++++++++++++---------------
 block/bfq-iosched.h | 149 ++++++++++----
 block/bfq-wf2q.c    |   2 +-
 4 files changed, 493 insertions(+), 243 deletions(-)

--
2.20.1


Seagate Internal

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
  2022-08-09  3:47   ` Rory Chen
@ 2022-08-10 10:22     ` Paolo Valente
  2022-08-10 12:59       ` Rory Chen
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Valente @ 2022-08-10 10:22 UTC (permalink / raw)
  To: Rory Chen
  Cc: Arie van der Hoeven, Muhammad Ahmad, linux-block, linux-kernel,
	Jan Kara, andrea.righi, glen.valante, axboe, Tyler Erickson,
	Michael English, Andrew Ring, Varun Boddu



> Il giorno 9 ago 2022, alle ore 05:47, Rory Chen <rory.c.chen@seagate.com> ha scritto:
> 
> Resend the mail as plain text because previous mail with rich text makes some mess and forget to add others at Seagate who worked on validating the
> patch as well(Muhammad, Michael, Andrew, Varun,Tyler)
> 
> Hi Paolo,
> 

Hi

> I am from Seagate China and face a problem when I’m evaluating the bfq patches. Could you please check?
> Thanks
> 
> Issue statement
> When running performance test on bfq patch, I observed warning message "bfq_actuator_index: bio sector out of ranges: end=35156656128" and OS hung suddenly after some hours.
> The warning message is reported from function bfq_actuator_index which determines IO request is in which index of actuators.  The bio_end_sector is 35156656128 but the max LBA for the drive is 35156656127 so it’s beyond the LBA range.

Yep, this sanity check fails if the end sector of a new IO does not
belong to any sector range.

>  I captured the block trace and didn't find a request at LBA 35156656128; I only found a max request LBA of 35156656127.

Maybe in the trace you see only start sectors?  The failed check is
performed on end sectors instead.

At any rate, there seems to be an off-by-one error in the value(s)
stored in the sector field(s) of the blk_independent_access_range data
structure.

I guess we may need some help/feedback from people competent on this
stuff.

> I’m not sure if this warning message is related to later OS hung.
> 

Not easy to say.  At any rate, we can try with a development version
of bfq.  It can help us detect the possible cause of this hang.  But
let's see where we get with this sector error first.

Thank you for testing this extended version of bfq,
Paolo

> 
> Problem environment
> Kernel base is 5.18.9
> Test HDD drive is Seagate ST18000NM0092 dual actuator SATA.
> Actuator LBA mapping by reading VPD B9
> Concurrent positioning ranges VPD page:
> LBA range number:0
> number of storage elements:1
> starting LBA:0x0
> number of LBAs:0x417c00000 [17578328064]
> LBA range number:1
> number of storage elements:1
> starting LBA:0x417c00000
> number of LBAs:0x417c00000 [17578328064]
> 
> 
> 
> 
> 
> From: Paolo Valente <paolo.valente@linaro.org>
> Sent: Thursday, June 23, 2022 8:53 AM
> To: Jens Axboe <axboe@kernel.dk>
> Cc: linux-block@vger.kernel.org <linux-block@vger.kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; jack@suse.cz <jack@suse.cz>; andrea.righi@canonical.com <andrea.righi@canonical.com>; glen.valante@linaro.org <glen.valante@linaro.org>; Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Paolo Valente <paolo.valente@linaro.org>
> Subject: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
> 
> 
> This message has originated from an External Source. Please use proper judgment and caution when opening attachments, clicking links, or responding to this email.
> 
> 
> Hi,
> this patch series extends BFQ so as to optimize I/O dispatch to
> multi-actuator drives. In particular, this extension addresses the
> following issue. Multi-actuator drives appear as a single device to
> the I/O subsystem [1].  Yet they address commands to different
> actuators internally, as a function of Logical Block Addressing
> (LBAs). A given sector is reachable by only one of the actuators. For
> example, Seagate’s Serial Advanced Technology Attachment (SATA)
> version contains two actuators and maps the lower half of the SATA LBA
> space to the lower actuator and the upper half to the upper actuator.
> 
> Evidently, to fully utilize actuators, no actuator must be left idle
> or underutilized while there is pending I/O for it. To reach this
> goal, the block layer must somehow control the load of each actuator
> individually. This series enriches BFQ with such a per-actuator
> control, as a first step. Then it also adds a simple mechanism for
> guaranteeing that actuators with pending I/O are never left idle.
> 
> See [1] for a more detailed overview of the problem and of the
> solutions implemented in this patch series. There you will also find
> some preliminary performance results.
> 
> Thanks,
> Paolo
> 
> [1] https://secure-web.cisco.com/1hcxnN1C3h1nW7mby7S66_LE8szirQwbQI0fBpYePrA0GTWfyuQyl0GpZaOn32xMSkNT0BUQWloDHFzZ23aYDZdi8NfdrEFLY9pQDBblIvn08LRiTVoIOUC8zWSG_r2PCyLtx3ppZq5cWOib_8azxteRRcbKWGdbLPSqg9hfSJSqltth0ByLONHEoI3p3e9QNIn6nVAeQbsT3aOQe-F95XrQvaPrFJXx6RGL9kDXyfkbXIHcdcLBf895gYBFn5S2WjBDQq2kzDzZOlc1HekRUhg0qDQcFY6NydVfrqNfLbpAHAth6KyREscQhVTMVREEVa1b6bQByX6grF5pn3pTIo0lODyfX6yRmcbReSYNfOZ65ZPvp-nH530FQ-5nXoRxFf74WIKDrNTALs3xQvg03DH4jLez-T2M9xEu-sfEDAEdTGF7BcnmBW6vrPO4_p3k4/https%3A%2F%2Fwww.linaro.org%2Fblog%2Fbudget-fair-queueing-bfq-linux-io-scheduler-optimizations-for-multi-actuator-sata-hard-drives%2F
> 
> Davide Zini (3):
>  block, bfq: split also async bfq_queues on a per-actuator basis
>  block, bfq: inject I/O to underutilized actuators
>  block, bfq: balance I/O injection among underutilized actuators
> 
> Federico Gavioli (1):
>  block, bfq: retrieve independent access ranges from request queue
> 
> Paolo Valente (4):
>  block, bfq: split sync bfq_queues on a per-actuator basis
>  block, bfq: forbid stable merging of queues associated with different
>    actuators
>  block, bfq: turn scalar fields into arrays in bfq_io_cq
>  block, bfq: turn BFQ_NUM_ACTUATORS into BFQ_MAX_ACTUATORS
> 
> block/bfq-cgroup.c  |  97 +++++----
> block/bfq-iosched.c | 488 +++++++++++++++++++++++++++++---------------
> block/bfq-iosched.h | 149 ++++++++++----
> block/bfq-wf2q.c    |   2 +-
> 4 files changed, 493 insertions(+), 243 deletions(-)
> 
> --
> 2.20.1
> 
> 
> Seagate Internal
> 
> Seagate Internal


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
  2022-08-10 10:22     ` Paolo Valente
@ 2022-08-10 12:59       ` Rory Chen
  2022-08-18 15:40         ` Tyler Erickson
  0 siblings, 1 reply; 21+ messages in thread
From: Rory Chen @ 2022-08-10 12:59 UTC (permalink / raw)
  To: Paolo Valente
  Cc: Arie van der Hoeven, Muhammad Ahmad, linux-block, linux-kernel,
	Jan Kara, andrea.righi, glen.valante, axboe, Tyler Erickson,
	Michael English, Andrew Ring, Varun Boddu

The block trace shows the start sector is 35156656120 and the transfer length is 8 sectors, which is within the drive's max LBA of 35156656127. And according to the parsed slice of the block trace, this I/O completed successfully despite the warning message.
 8,64   7       13     0.039401337 19176  Q  RA 35156656120 + 8 [systemd-udevd]
  8,64   7       15     0.039403946 19176  P   N [systemd-udevd]
  8,64   7       16     0.039405132 19176  I  RA 35156656120 + 8 [systemd-udevd]
  8,64   7       18     0.039411554 19176  D  RA 35156656120 + 8 [systemd-udevd]
  8,64   0       40     0.039479055     0  C  RA 35156656120 + 8 [0]

We may need to find out where the "bio_end_sector" value is calculated as 35156656128. I have patched the libata and sd drivers for Dual Actuator.



From: Paolo Valente <paolo.valente@linaro.org>
Sent: Wednesday, August 10, 2022 6:22 PM
To: Rory Chen <rory.c.chen@seagate.com>
Cc: Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org <linux-block@vger.kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com <andrea.righi@canonical.com>; glen.valante@linaro.org <glen.valante@linaro.org>; axboe@kernel.dk <axboe@kernel.dk>; Tyler Erickson <tyler.erickson@seagate.com>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives


This message has originated from an External Source. Please use proper judgment and caution when opening attachments, clicking links, or responding to this email.


> Il giorno 9 ago 2022, alle ore 05:47, Rory Chen <rory.c.chen@seagate.com> ha scritto:
>
> Resend the mail as plain text because previous mail with rich text makes some mess and forget to add others at Seagate who worked on validating the
> patch as well(Muhammad, Michael, Andrew, Varun,Tyler)
>
> Hi Paolo,
>

Hi

> I am from Seagate China and face a problem when I’m evaluating the bfq patches. Could you please check?
> Thanks
>
> Issue statement
> When running performance test on bfq patch, I observed warning message "bfq_actuator_index: bio sector out of ranges: end=35156656128" and OS hung suddenly after some hours.
> The warning message is reported from function bfq_actuator_index which determines IO request is in which index of actuators.  The bio_end_sector is 35156656128 but the max LBA for the drive is 35156656127 so it’s beyond the LBA range.

Yep, this sanity check fails if the end sector of a new IO does not
belong to any sector range.

>  I captured the block trace and didn't find a request at LBA 35156656128; I only found a max request LBA of 35156656127.

Maybe in the trace you see only start sectors?  The failed check is
performed on end sectors instead.

At any rate, there seems to be an off-by-one error in the value(s)
stored in the sector field(s) of the blk_independent_access_range data
structure.

I guess we may need some help/feedback from people competent on this
stuff.

> I’m not sure if this warning message is related to later OS hung.
>

Not easy to say.  At any rate, we can try with a development version
of bfq.  It can help us detect the possible cause of this hang.  But
let's see where we get with this sector error first.

Thank you for testing this extended version of bfq,
Paolo

>
> Problem environment
> Kernel base is 5.18.9
> Test HDD drive is Seagate ST18000NM0092 dual actuator SATA.
> Actuator LBA mapping by reading VPD B9
> Concurrent positioning ranges VPD page:
> LBA range number:0
> number of storage elements:1
> starting LBA:0x0
> number of LBAs:0x417c00000 [17578328064]
> LBA range number:1
> number of storage elements:1
> starting LBA:0x417c00000
> number of LBAs:0x417c00000 [17578328064]
>
>
>
>
>
> From: Paolo Valente <paolo.valente@linaro.org>
> Sent: Thursday, June 23, 2022 8:53 AM
> To: Jens Axboe <axboe@kernel.dk>
> Cc: linux-block@vger.kernel.org <linux-block@vger.kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; jack@suse.cz <jack@suse.cz>; andrea.righi@canonical.com <andrea.righi@canonical.com>; glen.valante@linaro.org <glen.valante@linaro.org>; Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Paolo Valente <paolo.valente@linaro.org>
> Subject: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
>
>
> This message has originated from an External Source. Please use proper judgment and caution when opening attachments, clicking links, or responding to this email.
>
>
> Hi,
> this patch series extends BFQ so as to optimize I/O dispatch to
> multi-actuator drives. In particular, this extension addresses the
> following issue. Multi-actuator drives appear as a single device to
> the I/O subsystem [1].  Yet they address commands to different
> actuators internally, as a function of Logical Block Addressing
> (LBAs). A given sector is reachable by only one of the actuators. For
> example, Seagate’s Serial Advanced Technology Attachment (SATA)
> version contains two actuators and maps the lower half of the SATA LBA
> space to the lower actuator and the upper half to the upper actuator.
>
> Evidently, to fully utilize actuators, no actuator must be left idle
> or underutilized while there is pending I/O for it. To reach this
> goal, the block layer must somehow control the load of each actuator
> individually. This series enriches BFQ with such a per-actuator
> control, as a first step. Then it also adds a simple mechanism for
> guaranteeing that actuators with pending I/O are never left idle.
>
> See [1] for a more detailed overview of the problem and of the
> solutions implemented in this patch series. There you will also find
> some preliminary performance results.
>
> Thanks,
> Paolo
>
> [1] https://secure-web.cisco.com/1hcxnN1C3h1nW7mby7S66_LE8szirQwbQI0fBpYePrA0GTWfyuQyl0GpZaOn32xMSkNT0BUQWloDHFzZ23aYDZdi8NfdrEFLY9pQDBblIvn08LRiTVoIOUC8zWSG_r2PCyLtx3ppZq5cWOib_8azxteRRcbKWGdbLPSqg9hfSJSqltth0ByLONHEoI3p3e9QNIn6nVAeQbsT3aOQe-F95XrQvaPrFJXx6RGL9kDXyfkbXIHcdcLBf895gYBFn5S2WjBDQq2kzDzZOlc1HekRUhg0qDQcFY6NydVfrqNfLbpAHAth6KyREscQhVTMVREEVa1b6bQByX6grF5pn3pTIo0lODyfX6yRmcbReSYNfOZ65ZPvp-nH530FQ-5nXoRxFf74WIKDrNTALs3xQvg03DH4jLez-T2M9xEu-sfEDAEdTGF7BcnmBW6vrPO4_p3k4/https%3A%2F%2Fwww.linaro.org%2Fblog%2Fbudget-fair-queueing-bfq-linux-io-scheduler-optimizations-for-multi-actuator-sata-hard-drives%2F
>
> Davide Zini (3):
>  block, bfq: split also async bfq_queues on a per-actuator basis
>  block, bfq: inject I/O to underutilized actuators
>  block, bfq: balance I/O injection among underutilized actuators
>
> Federico Gavioli (1):
>  block, bfq: retrieve independent access ranges from request queue
>
> Paolo Valente (4):
>  block, bfq: split sync bfq_queues on a per-actuator basis
>  block, bfq: forbid stable merging of queues associated with different
>    actuators
>  block, bfq: turn scalar fields into arrays in bfq_io_cq
>  block, bfq: turn BFQ_NUM_ACTUATORS into BFQ_MAX_ACTUATORS
>
> block/bfq-cgroup.c  |  97 +++++----
> block/bfq-iosched.c | 488 +++++++++++++++++++++++++++++---------------
> block/bfq-iosched.h | 149 ++++++++++----
> block/bfq-wf2q.c    |   2 +-
> 4 files changed, 493 insertions(+), 243 deletions(-)
>
> --
> 2.20.1
>
>
> Seagate Internal
>
> Seagate Internal

Seagate Internal

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
  2022-08-10 12:59       ` Rory Chen
@ 2022-08-18 15:40         ` Tyler Erickson
  2022-08-25 14:45           ` Paolo Valente
  0 siblings, 1 reply; 21+ messages in thread
From: Tyler Erickson @ 2022-08-18 15:40 UTC (permalink / raw)
  To: Rory Chen, Paolo Valente
  Cc: Arie van der Hoeven, Muhammad Ahmad, linux-block, linux-kernel,
	Jan Kara, andrea.righi, glen.valante, axboe, Michael English,
	Andrew Ring, Varun Boddu

The libata layer is reporting correctly after the changes I submitted.
 
The drive reports the actuator ranges as a starting LBA and a count of LBAs for the range.
If the code reading the reported values simply computes startingLBA + range, it gets an incorrect ending LBA for that actuator, because LBAs are zero-indexed and the simple addition does not take that into account.
The proper formula for the ending LBA is startingLBA + range - 1; that yields the last LBA at which a final I/O read/write can be issued, accounting for LBA values starting at zero rather than one.
 
Here is an example from the output in SeaChest/openSeaChest:
====Concurrent Positioning Ranges====
 
Range#     #Elements       Lowest LBA         # of LBAs
   0            1                    0       17578328064
   1            1          17578328064       17578328064
 
If using the incorrect formula to get the final LBA for actuator 0, you would get 17578328064, but this is the starting LBA reported by the drive for actuator 1.
So to be consistent for all ranges, the final LBA for a given actuator should be calculated as starting LBA + range - 1.
 
I reached out to Seagate's T10 and T13 representatives for clarification and verification; what is most likely causing the error is a missing "- 1" somewhere after the information reported by the device is read. They agreed that the reporting from the drive and the SCSI-to-ATA translation are correct.
 
I'm not sure where this is being read and calculated, but it is not an error at the low-level libata or sd layers of the kernel. It may be in bfq, or in some other place above the sd layer. I know there were some additions to read this information and report it up the stack, but I did not think those were wrong, as they seemed to pass the drive-reported values up unchanged.

Tyler Erickson
Seagate Technology


Seagate Internal

-----Original Message-----
From: Rory Chen <rory.c.chen@seagate.com> 
Sent: Wednesday, August 10, 2022 6:59 AM
To: Paolo Valente <paolo.valente@linaro.org>
Cc: Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com; glen.valante@linaro.org; axboe@kernel.dk; Tyler Erickson <tyler.erickson@seagate.com>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives

The block trace shows the start sector is 35156656120 and transfer length is 8 sectors, which is within the max LBA 35156656127 of drive. And this IO is completed successfully from the slice of parsed block trace though reporting the warning message.
 8,64   7       13     0.039401337 19176  Q  RA 35156656120 + 8 [systemd-udevd]
  8,64   7       15     0.039403946 19176  P   N [systemd-udevd]
  8,64   7       16     0.039405132 19176  I  RA 35156656120 + 8 [systemd-udevd]
  8,64   7       18     0.039411554 19176  D  RA 35156656120 + 8 [systemd-udevd]
  8,64   0       40     0.039479055     0  C  RA 35156656120 + 8 [0]

It may need to know where calculate "bio_end_sector" value as 35156656128. I have patched libata and sd driver for Dual Actuator.



From: Paolo Valente <paolo.valente@linaro.org>
Sent: Wednesday, August 10, 2022 6:22 PM
To: Rory Chen <rory.c.chen@seagate.com>
Cc: Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org <linux-block@vger.kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com <andrea.righi@canonical.com>; glen.valante@linaro.org <glen.valante@linaro.org>; axboe@kernel.dk <axboe@kernel.dk>; Tyler Erickson <tyler.erickson@seagate.com>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives


This message has originated from an External Source. Please use proper judgment and caution when opening attachments, clicking links, or responding to this email.


> Il giorno 9 ago 2022, alle ore 05:47, Rory Chen <rory.c.chen@seagate.com> ha scritto:
>
> Resend the mail as plain text because previous mail with rich text 
> makes some mess and forget to add others at Seagate who worked on 
> validating the patch as well(Muhammad, Michael, Andrew, Varun,Tyler)
>
> Hi Paolo,
>

Hi

> I am from Seagate China and face a problem when I'm evaluating the bfq patches. Could you please check?
> Thanks
>
> Issue statement
> When running performance test on bfq patch, I observed warning message "bfq_actuator_index: bio sector out of ranges: end=35156656128" and OS hung suddenly after some hours.
> The warning message is reported from function bfq_actuator_index which determines IO request is in which index of actuators.  The bio_end_sector is 35156656128 but the max LBA for the drive is 35156656127 so it's beyond the LBA range.

Yep, this sanity check fails if the end sector of a new IO does not belong to any sector range.

>  I captured the block trace and didn't find a request at LBA 35156656128; I only found a max request LBA of 35156656127.

Maybe in the trace you see only start sectors?  The failed check is performed on end sectors instead.

At any rate, there seems to be an off-by-one error in the value(s) stored in the sector field(s) of the blk_independent_access_range data structure.

I guess we may need some help/feedback from people competent on this stuff.

> I'm not sure if this warning message is related to later OS hung.
>

Not easy to say.  At any rate, we can try with a development version of bfq.  It can help us detect the possible cause of this hang.  But let's see where we get with this sector error first.

Thank you for testing this extended version of bfq, Paolo

>
> Problem environment
> Kernel base is 5.18.9
> Test HDD drive is Seagate ST18000NM0092 dual actuator SATA.
> Actuator LBA mapping by reading VPD B9
> Concurrent positioning ranges VPD page:
> LBA range number:0
> number of storage elements:1
> starting LBA:0x0
> number of LBAs:0x417c00000 [17578328064]
> LBA range number:1
> number of storage elements:1
> starting LBA:0x417c00000
> number of LBAs:0x417c00000 [17578328064]
>
>
>
>
>
> From: Paolo Valente <paolo.valente@linaro.org>
> Sent: Thursday, June 23, 2022 8:53 AM
> To: Jens Axboe <axboe@kernel.dk>
> Cc: linux-block@vger.kernel.org <linux-block@vger.kernel.org>; 
> linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; 
> jack@suse.cz <jack@suse.cz>; andrea.righi@canonical.com 
> <andrea.righi@canonical.com>; glen.valante@linaro.org 
> <glen.valante@linaro.org>; Arie van der Hoeven 
> <arie.vanderhoeven@seagate.com>; Paolo Valente 
> <paolo.valente@linaro.org>
> Subject: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator 
> drives
>
>
> This message has originated from an External Source. Please use proper judgment and caution when opening attachments, clicking links, or responding to this email.
>
>
> Hi,
> this patch series extends BFQ so as to optimize I/O dispatch to 
> multi-actuator drives. In particular, this extension addresses the 
> following issue. Multi-actuator drives appear as a single device to 
> the I/O subsystem [1].  Yet they address commands to different 
> actuators internally, as a function of Logical Block Addressing 
> (LBAs). A given sector is reachable by only one of the actuators. For 
> example, Seagate's Serial Advanced Technology Attachment (SATA) 
> version contains two actuators and maps the lower half of the SATA LBA 
> space to the lower actuator and the upper half to the upper actuator.
>
> Evidently, to fully utilize actuators, no actuator must be left idle 
> or underutilized while there is pending I/O for it. To reach this 
> goal, the block layer must somehow control the load of each actuator 
> individually. This series enriches BFQ with such a per-actuator 
> control, as a first step. Then it also adds a simple mechanism for 
> guaranteeing that actuators with pending I/O are never left idle.
>
> See [1] for a more detailed overview of the problem and of the 
> solutions implemented in this patch series. There you will also find 
> some preliminary performance results.
>
> Thanks,
> Paolo
>
> [1] 
> https://secure-web.cisco.com/1hcxnN1C3h1nW7mby7S66_LE8szirQwbQI0fBpYeP
> rA0GTWfyuQyl0GpZaOn32xMSkNT0BUQWloDHFzZ23aYDZdi8NfdrEFLY9pQDBblIvn08LR
> iTVoIOUC8zWSG_r2PCyLtx3ppZq5cWOib_8azxteRRcbKWGdbLPSqg9hfSJSqltth0ByLO
> NHEoI3p3e9QNIn6nVAeQbsT3aOQe-F95XrQvaPrFJXx6RGL9kDXyfkbXIHcdcLBf895gYB
> Fn5S2WjBDQq2kzDzZOlc1HekRUhg0qDQcFY6NydVfrqNfLbpAHAth6KyREscQhVTMVREEV
> a1b6bQByX6grF5pn3pTIo0lODyfX6yRmcbReSYNfOZ65ZPvp-nH530FQ-5nXoRxFf74WIK
> DrNTALs3xQvg03DH4jLez-T2M9xEu-sfEDAEdTGF7BcnmBW6vrPO4_p3k4/https%3A%2F
> %2Fwww.linaro.org%2Fblog%2Fbudget-fair-queueing-bfq-linux-io-scheduler
> -optimizations-for-multi-actuator-sata-hard-drives%2F
>
> Davide Zini (3):
>  block, bfq: split also async bfq_queues on a per-actuator basis
>  block, bfq: inject I/O to underutilized actuators
>  block, bfq: balance I/O injection among underutilized actuators
>
> Federico Gavioli (1):
>  block, bfq: retrieve independent access ranges from request queue
>
> Paolo Valente (4):
>  block, bfq: split sync bfq_queues on a per-actuator basis
>  block, bfq: forbid stable merging of queues associated with different
>    actuators
>  block, bfq: turn scalar fields into arrays in bfq_io_cq
>  block, bfq: turn BFQ_NUM_ACTUATORS into BFQ_MAX_ACTUATORS
>
> block/bfq-cgroup.c  |  97 +++++----
> block/bfq-iosched.c | 488 +++++++++++++++++++++++++++++---------------
> block/bfq-iosched.h | 149 ++++++++++----
> block/bfq-wf2q.c    |   2 +-
> 4 files changed, 493 insertions(+), 243 deletions(-)
>
> --
> 2.20.1
>
>
> Seagate Internal
>
> Seagate Internal

Seagate Internal

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
  2022-08-18 15:40         ` Tyler Erickson
@ 2022-08-25 14:45           ` Paolo Valente
  2022-09-08  2:46             ` Rory Chen
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Valente @ 2022-08-25 14:45 UTC (permalink / raw)
  To: Tyler Erickson
  Cc: Rory Chen, Arie van der Hoeven, Muhammad Ahmad, linux-block,
	linux-kernel, Jan Kara, andrea.righi, glen.valante, axboe,
	Michael English, Andrew Ring, Varun Boddu

Hi

> Il giorno 18 ago 2022, alle ore 17:40, Tyler Erickson <tyler.erickson@seagate.com> ha scritto:
> 
> The libata layer is reporting correctly after the changes I submitted.
> 
> The drive reports the actuator ranges as a starting LBA and a count of LBAs for the range.
> If the code reading the reported values simply does startingLBA + range, this is an incorrect ending LBA for that actuator. This is because LBAs are zero indexed and this simple addition is not taking that into account.
> The proper way to get the endingLBA is startingLBA + range - 1 to get the last LBA value for where to issue a final IO read/write to account for LBA values starting at zero rather than one.
> 
> Here is an example from the output in SeaChest/openSeaChest:
> ====Concurrent Positioning Ranges====
> 
> Range#     #Elements            Lowest LBA          # of LBAs      
>   0            1                                               0           17578328064
>   1            1                         17578328064           17578328064
> 
> If using the incorrect formula to get the final LBA for actuator 0, you would get 17578328064, but this is the starting LBA reported by the drive for actuator 1.
> So to be consistent for all ranges, the final LBA for a given actuator should be calculated as starting LBA + range - 1.
> 

Ok

> I had reached out to Seagate's T10 and T13 representatives for clarification and verification and this is most likely what is causing the error is a missing - 1 somewhere after getting the information reported by the device. They agreed that the reporting from the drive and the SCSI to ATA translation is correct.
> 
> I'm not sure where this is being read and calculated, but it is not an error in the low-level libata or sd level of the kernel. It may be in bfq, or it may be in some other place after the sd layer.

The apparent mismatch involves the macro bio_end_sector (defined in
include/linux/bio.h), which evaluates to the bio's start sector plus its
length in sectors, i.e., an exclusive end, one past the last sector
accessed.  Jens, can you shed some light on this point?

Thanks,
Paolo

> I know there were some additions to read this and report it up the stack, but I did not think those were wrong as they seemed to pass the drive reported information up the stack.
> 
> Tyler Erickson
> Seagate Technology
> 
> 
> Seagate Internal
> 
> -----Original Message-----
> From: Rory Chen <rory.c.chen@seagate.com> 
> Sent: Wednesday, August 10, 2022 6:59 AM
> To: Paolo Valente <paolo.valente@linaro.org>
> Cc: Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com; glen.valante@linaro.org; axboe@kernel.dk; Tyler Erickson <tyler.erickson@seagate.com>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
> 
> The block trace shows the start sector is 35156656120 and transfer length is 8 sectors, which is within the max LBA 35156656127 of drive. And this IO is completed successfully from the slice of parsed block trace though reporting the warning message.
> 8,64   7       13     0.039401337 19176  Q  RA 35156656120 + 8 [systemd-udevd]
>  8,64   7       15     0.039403946 19176  P   N [systemd-udevd]
>  8,64   7       16     0.039405132 19176  I  RA 35156656120 + 8 [systemd-udevd]
>  8,64   7       18     0.039411554 19176  D  RA 35156656120 + 8 [systemd-udevd]
>  8,64   0       40     0.039479055     0  C  RA 35156656120 + 8 [0]
> 
> We may need to find out where the "bio_end_sector" value is calculated as 35156656128. I have patched the libata and sd drivers for Dual Actuator.
> 
> 
> 
> From: Paolo Valente <paolo.valente@linaro.org>
> Sent: Wednesday, August 10, 2022 6:22 PM
> To: Rory Chen <rory.c.chen@seagate.com>
> Cc: Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org <linux-block@vger.kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com <andrea.righi@canonical.com>; glen.valante@linaro.org <glen.valante@linaro.org>; axboe@kernel.dk <axboe@kernel.dk>; Tyler Erickson <tyler.erickson@seagate.com>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
> 
> 
> This message has originated from an External Source. Please use proper judgment and caution when opening attachments, clicking links, or responding to this email.
> 
> 
>> Il giorno 9 ago 2022, alle ore 05:47, Rory Chen <rory.c.chen@seagate.com> ha scritto:
>> 
>> Resending the mail as plain text, because the previous rich-text mail
>> made some mess, and adding the others at Seagate who worked on
>> validating the patch as well (Muhammad, Michael, Andrew, Varun, Tyler).
>> 
>> Hi Paolo,
>> 
> 
> Hi
> 
>> I am from Seagate China and ran into a problem while evaluating the bfq patches. Could you please check?
>> Thanks
>> 
>> Issue statement
>> When running a performance test on the bfq patch, I observed the warning message "bfq_actuator_index: bio sector out of ranges: end=35156656128", and the OS hung suddenly after some hours.
>> The warning message is reported from the function bfq_actuator_index, which determines which actuator index an IO request belongs to.  The bio_end_sector is 35156656128 but the max LBA for the drive is 35156656127, so it's beyond the LBA range.
> 
> Yep, this sanity check fails if the end sector of a new IO does not belong to any sector range.
> 
>> I captured the block trace and didn't find request LBA 35156656128; I only found max request LBA 35156656127.
> 
> Maybe in the trace you see only start sectors?  The failed check is performed on end sectors instead.
> 
> At any rate, there seems to be an off-by-one error in the value(s) stored in the sector field(s) of the blk_independent_access_range data structure.
> 
> I guess we may need some help/feedback from people competent on this stuff.
> 
>> I'm not sure if this warning message is related to the later OS hang.
>> 
> 
> Not easy to say.  At any rate, we can try with a development version of bfq.  It can help us detect the possible cause of this hang.  But let's see where we get with this sector error first.
> 
> Thank you for testing this extended version of bfq,
> Paolo
> 
>> 
>> Problem environment
>> Kernel base is 5.18.9
>> Test HDD drive is Seagate ST18000NM0092 dual actuator SATA.
>> Actuator LBA mapping by reading VPD B9 Concurrent positioning ranges 
>> VPD page:
>> LBA range number:0
>> number of storage elements:1
>> starting LBA:0x0
>> number of LBAs:0x417c00000 [17578328064]
>> LBA range number:1
>> number of storage elements:1
>> starting LBA:0x417c00000
>> number of LBAs:0x417c00000 [17578328064]
>> 
>> 
>> 
>> 
>> 
>> From: Paolo Valente <paolo.valente@linaro.org>
>> Sent: Thursday, June 23, 2022 8:53 AM
>> To: Jens Axboe <axboe@kernel.dk>
>> Cc: linux-block@vger.kernel.org <linux-block@vger.kernel.org>; 
>> linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; 
>> jack@suse.cz <jack@suse.cz>; andrea.righi@canonical.com 
>> <andrea.righi@canonical.com>; glen.valante@linaro.org 
>> <glen.valante@linaro.org>; Arie van der Hoeven 
>> <arie.vanderhoeven@seagate.com>; Paolo Valente 
>> <paolo.valente@linaro.org>
>> Subject: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator 
>> drives
>> 
>> 
>> 
>> Hi,
>> this patch series extends BFQ so as to optimize I/O dispatch to 
>> multi-actuator drives. In particular, this extension addresses the 
>> following issue. Multi-actuator drives appear as a single device to 
>> the I/O subsystem [1].  Yet they address commands to different 
>> actuators internally, as a function of Logical Block Addressing 
>> (LBAs). A given sector is reachable by only one of the actuators. For 
>> example, Seagate's Serial Advanced Technology Attachment (SATA) 
>> version contains two actuators and maps the lower half of the SATA LBA 
>> space to the lower actuator and the upper half to the upper actuator.
>> 
>> Evidently, to fully utilize actuators, no actuator must be left idle 
>> or underutilized while there is pending I/O for it. To reach this 
>> goal, the block layer must somehow control the load of each actuator 
>> individually. This series enriches BFQ with such a per-actuator 
>> control, as a first step. Then it also adds a simple mechanism for 
>> guaranteeing that actuators with pending I/O are never left idle.
>> 
>> See [1] for a more detailed overview of the problem and of the 
>> solutions implemented in this patch series. There you will also find 
>> some preliminary performance results.
>> 
>> Thanks,
>> Paolo
>> 
>> [1] 
>> https://www.linaro.org/blog/budget-fair-queueing-bfq-linux-io-scheduler-optimizations-for-multi-actuator-sata-hard-drives/
>> 
>> Davide Zini (3):
>>   block, bfq: split also async bfq_queues on a per-actuator basis
>>   block, bfq: inject I/O to underutilized actuators
>>   block, bfq: balance I/O injection among underutilized actuators
>> 
>> Federico Gavioli (1):
>> block, bfq: retrieve independent access ranges from request queue
>> 
>> Paolo Valente (4):
>>   block, bfq: split sync bfq_queues on a per-actuator basis
>>   block, bfq: forbid stable merging of queues associated with different
>>     actuators
>>   block, bfq: turn scalar fields into arrays in bfq_io_cq
>>   block, bfq: turn BFQ_NUM_ACTUATORS into BFQ_MAX_ACTUATORS
>> 
>> block/bfq-cgroup.c  |  97 +++++----
>> block/bfq-iosched.c | 488 +++++++++++++++++++++++++++++---------------
>> block/bfq-iosched.h | 149 ++++++++++----
>> block/bfq-wf2q.c    |   2 +-
>> 4 files changed, 493 insertions(+), 243 deletions(-)
>> 
>> --
>> 2.20.1
>> 
>> 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
  2022-08-25 14:45           ` Paolo Valente
@ 2022-09-08  2:46             ` Rory Chen
  2022-09-08 10:54               ` Paolo Valente
  0 siblings, 1 reply; 21+ messages in thread
From: Rory Chen @ 2022-09-08  2:46 UTC (permalink / raw)
  To: Paolo Valente, Tyler Erickson
  Cc: Arie van der Hoeven, Muhammad Ahmad, linux-block, linux-kernel,
	Jan Kara, andrea.righi, glen.valante, axboe, Michael English,
	Andrew Ring, Varun Boddu

I changed the comparison condition and it eliminates the warning.
<               if (end >= iar->sector + 1 && end < iar->sector + iar->nr_sectors + 1)
>               if (end >= iar->sector && end < iar->sector + iar->nr_sectors)

I don't know if this change is appropriate, but deducting 1 from bio_end_sector, as Tyler suggested, seems to make sense.

From: Paolo Valente <paolo.valente@linaro.org>
Sent: Thursday, August 25, 2022 10:45 PM
To: Tyler Erickson <tyler.erickson@seagate.com>
Cc: Rory Chen <rory.c.chen@seagate.com>; Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org <linux-block@vger.kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com <andrea.righi@canonical.com>; glen.valante@linaro.org <glen.valante@linaro.org>; axboe@kernel.dk <axboe@kernel.dk>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives



Hi

> Il giorno 18 ago 2022, alle ore 17:40, Tyler Erickson <tyler.erickson@seagate.com> ha scritto:
>
> The libata layer is reporting correctly after the changes I submitted.
>
> The drive reports the actuator ranges as a starting LBA and a count of LBAs for the range.
> If the code reading the reported values simply does startingLBA + range, this is an incorrect ending LBA for that actuator. This is because LBAs are zero indexed and this simple addition is not taking that into account.
> The proper way to get the endingLBA is startingLBA + range - 1 to get the last LBA value for where to issue a final IO read/write to account for LBA values starting at zero rather than one.
>
> Here is an example from the output in SeaChest/openSeaChest:
> ====Concurrent Positioning Ranges====
>
> Range#     #Elements            Lowest LBA          # of LBAs
>   0            1                                               0           17578328064
>   1            1                         17578328064           17578328064
>
> If using the incorrect formula to get the final LBA for actuator 0, you would get 17578328064, but this is the starting LBA reported by the drive for actuator 1.
> So to be consistent for all ranges, the final LBA for a given actuator should be calculated as starting LBA + range - 1.
>

Ok

> I had reached out to Seagate's T10 and T13 representatives for clarification and verification; the most likely cause of the error is a missing - 1 somewhere after getting the information reported by the device. They agreed that the reporting from the drive and the SCSI to ATA translation is correct.
>
> I'm not sure where this is being read and calculated, but it is not an error in the low-level libata or sd level of the kernel. It may be in bfq, or it may be in some other place after the sd layer.

This apparent mistake is in the macro bio_end_sector (defined in
include/linux/bio.h), which seems to expand to sector + size, i.e. the
first sector after the bio rather than the last sector it touches.
Jens, can you shed some light on this point?

Thanks,
Paolo

> I know there were some additions to read this and report it up the stack, but I did not think those were wrong as they seemed to pass the drive reported information up the stack.
>
> Tyler Erickson
> Seagate Technology
>
>
> -----Original Message-----
> From: Rory Chen <rory.c.chen@seagate.com>
> Sent: Wednesday, August 10, 2022 6:59 AM
> To: Paolo Valente <paolo.valente@linaro.org>
> Cc: Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com; glen.valante@linaro.org; axboe@kernel.dk; Tyler Erickson <tyler.erickson@seagate.com>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
>
> The block trace shows the start sector is 35156656120 and the transfer length is 8 sectors, which is within the max LBA 35156656127 of the drive. And, per the parsed block trace slice, this IO completed successfully despite the warning message being reported.
> 8,64   7       13     0.039401337 19176  Q  RA 35156656120 + 8 [systemd-udevd]
>  8,64   7       15     0.039403946 19176  P   N [systemd-udevd]
>  8,64   7       16     0.039405132 19176  I  RA 35156656120 + 8 [systemd-udevd]
>  8,64   7       18     0.039411554 19176  D  RA 35156656120 + 8 [systemd-udevd]
>  8,64   0       40     0.039479055     0  C  RA 35156656120 + 8 [0]
>
> We may need to find out where the "bio_end_sector" value is calculated as 35156656128. I have patched the libata and sd drivers for Dual Actuator.
>
>
>
> From: Paolo Valente <paolo.valente@linaro.org>
> Sent: Wednesday, August 10, 2022 6:22 PM
> To: Rory Chen <rory.c.chen@seagate.com>
> Cc: Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org <linux-block@vger.kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com <andrea.righi@canonical.com>; glen.valante@linaro.org <glen.valante@linaro.org>; axboe@kernel.dk <axboe@kernel.dk>; Tyler Erickson <tyler.erickson@seagate.com>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
>
>
>
>> Il giorno 9 ago 2022, alle ore 05:47, Rory Chen <rory.c.chen@seagate.com> ha scritto:
>>
>> Resending the mail as plain text, because the previous rich-text mail
>> made some mess, and adding the others at Seagate who worked on
>> validating the patch as well (Muhammad, Michael, Andrew, Varun, Tyler).
>>
>> Hi Paolo,
>>
>
> Hi
>
>> I am from Seagate China and ran into a problem while evaluating the bfq patches. Could you please check?
>> Thanks
>>
>> Issue statement
>> When running a performance test on the bfq patch, I observed the warning message "bfq_actuator_index: bio sector out of ranges: end=35156656128", and the OS hung suddenly after some hours.
>> The warning message is reported from the function bfq_actuator_index, which determines which actuator index an IO request belongs to.  The bio_end_sector is 35156656128 but the max LBA for the drive is 35156656127, so it's beyond the LBA range.
>
> Yep, this sanity check fails if the end sector of a new IO does not belong to any sector range.
>
>> I captured the block trace and didn't find request LBA 35156656128; I only found max request LBA 35156656127.
>
> Maybe in the trace you see only start sectors?  The failed check is performed on end sectors instead.
>
> At any rate, there seems to be an off-by-one error in the value(s) stored in the sector field(s) of the blk_independent_access_range data structure.
>
> I guess we may need some help/feedback from people competent on this stuff.
>
>> I'm not sure if this warning message is related to the later OS hang.
>>
>
> Not easy to say.  At any rate, we can try with a development version of bfq.  It can help us detect the possible cause of this hang.  But let's see where we get with this sector error first.
>
> Thank you for testing this extended version of bfq,
> Paolo
>
>>
>> Problem environment
>> Kernel base is 5.18.9
>> Test HDD drive is Seagate ST18000NM0092 dual actuator SATA.
>> Actuator LBA mapping by reading VPD B9 Concurrent positioning ranges
>> VPD page:
>> LBA range number:0
>> number of storage elements:1
>> starting LBA:0x0
>> number of LBAs:0x417c00000 [17578328064]
>> LBA range number:1
>> number of storage elements:1
>> starting LBA:0x417c00000
>> number of LBAs:0x417c00000 [17578328064]
>>
>>
>>
>>
>>
>> From: Paolo Valente <paolo.valente@linaro.org>
>> Sent: Thursday, June 23, 2022 8:53 AM
>> To: Jens Axboe <axboe@kernel.dk>
>> Cc: linux-block@vger.kernel.org <linux-block@vger.kernel.org>;
>> linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>;
>> jack@suse.cz <jack@suse.cz>; andrea.righi@canonical.com
>> <andrea.righi@canonical.com>; glen.valante@linaro.org
>> <glen.valante@linaro.org>; Arie van der Hoeven
>> <arie.vanderhoeven@seagate.com>; Paolo Valente
>> <paolo.valente@linaro.org>
>> Subject: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator
>> drives
>>
>>
>>
>> Hi,
>> this patch series extends BFQ so as to optimize I/O dispatch to
>> multi-actuator drives. In particular, this extension addresses the
>> following issue. Multi-actuator drives appear as a single device to
>> the I/O subsystem [1].  Yet they address commands to different
>> actuators internally, as a function of Logical Block Addressing
>> (LBAs). A given sector is reachable by only one of the actuators. For
>> example, Seagate's Serial Advanced Technology Attachment (SATA)
>> version contains two actuators and maps the lower half of the SATA LBA
>> space to the lower actuator and the upper half to the upper actuator.
>>
>> Evidently, to fully utilize actuators, no actuator must be left idle
>> or underutilized while there is pending I/O for it. To reach this
>> goal, the block layer must somehow control the load of each actuator
>> individually. This series enriches BFQ with such a per-actuator
>> control, as a first step. Then it also adds a simple mechanism for
>> guaranteeing that actuators with pending I/O are never left idle.
>>
>> See [1] for a more detailed overview of the problem and of the
>> solutions implemented in this patch series. There you will also find
>> some preliminary performance results.
>>
>> Thanks,
>> Paolo
>>
>> [1]
>> https://www.linaro.org/blog/budget-fair-queueing-bfq-linux-io-scheduler-optimizations-for-multi-actuator-sata-hard-drives/
>>
>> Davide Zini (3):
>>   block, bfq: split also async bfq_queues on a per-actuator basis
>>   block, bfq: inject I/O to underutilized actuators
>>   block, bfq: balance I/O injection among underutilized actuators
>>
>> Federico Gavioli (1):
>> block, bfq: retrieve independent access ranges from request queue
>>
>> Paolo Valente (4):
>>   block, bfq: split sync bfq_queues on a per-actuator basis
>>   block, bfq: forbid stable merging of queues associated with different
>>     actuators
>>   block, bfq: turn scalar fields into arrays in bfq_io_cq
>>   block, bfq: turn BFQ_NUM_ACTUATORS into BFQ_MAX_ACTUATORS
>>
>> block/bfq-cgroup.c  |  97 +++++----
>> block/bfq-iosched.c | 488 +++++++++++++++++++++++++++++---------------
>> block/bfq-iosched.h | 149 ++++++++++----
>> block/bfq-wf2q.c    |   2 +-
>> 4 files changed, 493 insertions(+), 243 deletions(-)
>>
>> --
>> 2.20.1
>>
>>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
  2022-09-08  2:46             ` Rory Chen
@ 2022-09-08 10:54               ` Paolo Valente
  2022-09-08 11:34                 ` Rory Chen
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Valente @ 2022-09-08 10:54 UTC (permalink / raw)
  To: Rory Chen
  Cc: Tyler Erickson, Arie van der Hoeven, Muhammad Ahmad, linux-block,
	linux-kernel, Jan Kara, andrea.righi, glen.valante, axboe,
	Michael English, Andrew Ring, Varun Boddu, Damien Le Moal



> Il giorno 8 set 2022, alle ore 04:46, Rory Chen <rory.c.chen@seagate.com> ha scritto:
> 
> I changed the comparison condition and it eliminates the warning.

Yep. Does the crash you reported also go away?

> <               if (end >= iar->sector + 1 && end < iar->sector + iar->nr_sectors + 1)
> >              if (end >= iar->sector && end < iar->sector + iar->nr_sectors)
> 
> I don't know if this change is appropriate

Unfortunately your change conflicts with the standard code, taken from
the original patches on access ranges [1].  I've CCed Damien, the
author of this patch series.

[1] https://lwn.net/ml/linux-block/20210909023545.1101672-2-damien.lemoal@wdc.com/

Thanks,
Paolo

> but deducting 1 from bio_end_sector, as Tyler suggested, seems to make sense.
> 
> From: Paolo Valente <paolo.valente@linaro.org>
> Sent: Thursday, August 25, 2022 10:45 PM
> To: Tyler Erickson <tyler.erickson@seagate.com>
> Cc: Rory Chen <rory.c.chen@seagate.com>; Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org <linux-block@vger.kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com <andrea.righi@canonical.com>; glen.valante@linaro.org <glen.valante@linaro.org>; axboe@kernel.dk <axboe@kernel.dk>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
> 
> 
> 
> Hi
> 
>> Il giorno 18 ago 2022, alle ore 17:40, Tyler Erickson <tyler.erickson@seagate.com> ha scritto:
>> 
>> The libata layer is reporting correctly after the changes I submitted.
>> 
>> The drive reports the actuator ranges as a starting LBA and a count of LBAs for the range.
>> If the code reading the reported values simply does startingLBA + range, this is an incorrect ending LBA for that actuator. This is because LBAs are zero indexed and this simple addition is not taking that into account.
>> The proper way to get the endingLBA is startingLBA + range - 1 to get the last LBA value for where to issue a final IO read/write to account for LBA values starting at zero rather than one.
>> 
>> Here is an example from the output in SeaChest/openSeaChest:
>> ====Concurrent Positioning Ranges====
>> 
>> Range#     #Elements            Lowest LBA          # of LBAs
>>  0            1                                               0           17578328064
>>  1            1                         17578328064           17578328064
>> 
>> If using the incorrect formula to get the final LBA for actuator 0, you would get 17578328064, but this is the starting LBA reported by the drive for actuator 1.
>> So to be consistent for all ranges, the final LBA for a given actuator should be calculated as starting LBA + range - 1.
>> 
> 
> Ok
> 
>> I had reached out to Seagate's T10 and T13 representatives for clarification and verification; the most likely cause of the error is a missing - 1 somewhere after getting the information reported by the device. They agreed that the reporting from the drive and the SCSI to ATA translation is correct.
>> 
>> I'm not sure where this is being read and calculated, but it is not an error in the low-level libata or sd level of the kernel. It may be in bfq, or it may be in some other place after the sd layer.
> 
> This apparent mistake is in the macro bio_end_sector (defined in
> include/linux/bio.h), which seems to expand to sector + size, i.e. the
> first sector after the bio rather than the last sector it touches.
> Jens, can you shed some light on this point?
> 
> Thanks,
> Paolo
> 
>> I know there were some additions to read this and report it up the stack, but I did not think those were wrong as they seemed to pass the drive reported information up the stack.
>> 
>> Tyler Erickson
>> Seagate Technology
>> 
>> 
>> -----Original Message-----
>> From: Rory Chen <rory.c.chen@seagate.com>
>> Sent: Wednesday, August 10, 2022 6:59 AM
>> To: Paolo Valente <paolo.valente@linaro.org>
>> Cc: Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com; glen.valante@linaro.org; axboe@kernel.dk; Tyler Erickson <tyler.erickson@seagate.com>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
>> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
>> 
>> The block trace shows the start sector is 35156656120 and the transfer length is 8 sectors, which is within the max LBA 35156656127 of the drive. And, per the parsed block trace slice, this IO completed successfully despite the warning message being reported.
>> 8,64   7       13     0.039401337 19176  Q  RA 35156656120 + 8 [systemd-udevd]
>> 8,64   7       15     0.039403946 19176  P   N [systemd-udevd]
>> 8,64   7       16     0.039405132 19176  I  RA 35156656120 + 8 [systemd-udevd]
>> 8,64   7       18     0.039411554 19176  D  RA 35156656120 + 8 [systemd-udevd]
>> 8,64   0       40     0.039479055     0  C  RA 35156656120 + 8 [0]
>> 
>> We may need to find out where the "bio_end_sector" value is calculated as 35156656128. I have patched the libata and sd drivers for Dual Actuator.
>> 
>> 
>> 
>> From: Paolo Valente <paolo.valente@linaro.org>
>> Sent: Wednesday, August 10, 2022 6:22 PM
>> To: Rory Chen <rory.c.chen@seagate.com>
>> Cc: Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org <linux-block@vger.kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com <andrea.righi@canonical.com>; glen.valante@linaro.org <glen.valante@linaro.org>; axboe@kernel.dk <axboe@kernel.dk>; Tyler Erickson <tyler.erickson@seagate.com>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
>> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
>> 
>> 
>> 
>>> Il giorno 9 ago 2022, alle ore 05:47, Rory Chen <rory.c.chen@seagate.com> ha scritto:
>>> 
>>> Resending the mail as plain text, because the previous rich-text mail
>>> made some mess, and adding the others at Seagate who worked on
>>> validating the patch as well (Muhammad, Michael, Andrew, Varun, Tyler).
>>> 
>>> Hi Paolo,
>>> 
>> 
>> Hi
>> 
>>> I am from Seagate China and ran into a problem while evaluating the bfq patches. Could you please check?
>>> Thanks
>>> 
>>> Issue statement
>>> When running a performance test on the bfq patch, I observed the warning message "bfq_actuator_index: bio sector out of ranges: end=35156656128", and the OS hung suddenly after some hours.
>>> The warning message is reported from the function bfq_actuator_index, which determines which actuator index an IO request belongs to.  The bio_end_sector is 35156656128 but the max LBA for the drive is 35156656127, so it's beyond the LBA range.
>> 
>> Yep, this sanity check fails if the end sector of a new IO does not belong to any sector range.
>> 
>>> I captured the block trace and didn't find request LBA 35156656128; I only found max request LBA 35156656127.
>> 
>> Maybe in the trace you see only start sectors?  The failed check is performed on end sectors instead.
>> 
>> At any rate, there seems to be an off-by-one error in the value(s) stored in the sector field(s) of the blk_independent_access_range data structure.
>> 
>> I guess we may need some help/feedback from people competent on this stuff.
>> 
>>> I'm not sure if this warning message is related to the later OS hang.
>>> 
>> 
>> Not easy to say.  At any rate, we can try with a development version of bfq.  It can help us detect the possible cause of this hang.  But let's see where we get with this sector error first.
>> 
>> Thank you for testing this extended version of bfq,
>> Paolo
>> 
>>> 
>>> Problem environment
>>> Kernel base is 5.18.9
>>> Test HDD drive is Seagate ST18000NM0092 dual actuator SATA.
>>> Actuator LBA mapping by reading VPD B9 Concurrent positioning ranges
>>> VPD page:
>>> LBA range number:0
>>> number of storage elements:1
>>> starting LBA:0x0
>>> number of LBAs:0x417c00000 [17578328064]
>>> LBA range number:1
>>> number of storage elements:1
>>> starting LBA:0x417c00000
>>> number of LBAs:0x417c00000 [17578328064]
>>> 
>>> 
>>> 
>>> 
>>> 
>>> From: Paolo Valente <paolo.valente@linaro.org>
>>> Sent: Thursday, June 23, 2022 8:53 AM
>>> To: Jens Axboe <axboe@kernel.dk>
>>> Cc: linux-block@vger.kernel.org <linux-block@vger.kernel.org>;
>>> linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>;
>>> jack@suse.cz <jack@suse.cz>; andrea.righi@canonical.com
>>> <andrea.righi@canonical.com>; glen.valante@linaro.org
>>> <glen.valante@linaro.org>; Arie van der Hoeven
>>> <arie.vanderhoeven@seagate.com>; Paolo Valente
>>> <paolo.valente@linaro.org>
>>> Subject: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator
>>> drives
>>> 
>>> 
>>> 
>>> Hi,
>>> this patch series extends BFQ so as to optimize I/O dispatch to
>>> multi-actuator drives. In particular, this extension addresses the
>>> following issue. Multi-actuator drives appear as a single device to
>>> the I/O subsystem [1].  Yet they address commands to different
>>> actuators internally, as a function of Logical Block Addressing
>>> (LBAs). A given sector is reachable by only one of the actuators. For
>>> example, Seagate's Serial Advanced Technology Attachment (SATA)
>>> version contains two actuators and maps the lower half of the SATA LBA
>>> space to the lower actuator and the upper half to the upper actuator.
>>> 
>>> Evidently, to fully utilize actuators, no actuator must be left idle
>>> or underutilized while there is pending I/O for it. To reach this
>>> goal, the block layer must somehow control the load of each actuator
>>> individually. This series enriches BFQ with such a per-actuator
>>> control, as a first step. Then it also adds a simple mechanism for
>>> guaranteeing that actuators with pending I/O are never left idle.
>>> 
>>> See [1] for a more detailed overview of the problem and of the
>>> solutions implemented in this patch series. There you will also find
>>> some preliminary performance results.
>>> 
>>> Thanks,
>>> Paolo
>>> 
>>> [1]
>>> https://www.linaro.org/blog/budget-fair-queueing-bfq-linux-io-scheduler-optimizations-for-multi-actuator-sata-hard-drives/
>>> 
>>> Davide Zini (3):
>>>   block, bfq: split also async bfq_queues on a per-actuator basis
>>>   block, bfq: inject I/O to underutilized actuators
>>>   block, bfq: balance I/O injection among underutilized actuators
>>> 
>>> Federico Gavioli (1):
>>> block, bfq: retrieve independent access ranges from request queue
>>> 
>>> Paolo Valente (4):
>>>   block, bfq: split sync bfq_queues on a per-actuator basis
>>>   block, bfq: forbid stable merging of queues associated with different actuators
>>>   block, bfq: turn scalar fields into arrays in bfq_io_cq
>>>   block, bfq: turn BFQ_NUM_ACTUATORS into BFQ_MAX_ACTUATORS
>>> 
>>> block/bfq-cgroup.c  |  97 +++++----
>>> block/bfq-iosched.c | 488 +++++++++++++++++++++++++++++---------------
>>> block/bfq-iosched.h | 149 ++++++++++----
>>> block/bfq-wf2q.c    |   2 +-
>>> 4 files changed, 493 insertions(+), 243 deletions(-)
>>> 
>>> --
>>> 2.20.1
>>> 
>>> 
>>> Seagate Internal


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
  2022-09-08 10:54               ` Paolo Valente
@ 2022-09-08 11:34                 ` Rory Chen
  2022-09-27 16:59                   ` Arie van der Hoeven
  0 siblings, 1 reply; 21+ messages in thread
From: Rory Chen @ 2022-09-08 11:34 UTC (permalink / raw)
  To: Paolo Valente
  Cc: Tyler Erickson, Arie van der Hoeven, Muhammad Ahmad, linux-block,
	linux-kernel, Jan Kara, andrea.righi, glen.valante, axboe,
	Michael English, Andrew Ring, Varun Boddu, Damien Le Moal

Oops, I attached the wrong code change. Here is the right one:

<               if (end >= iar->sector + 1 && end < iar->sector + iar->nr_sectors + 1)  //Changed code
>               if (end >= iar->sector && end < iar->sector + iar->nr_sectors) // Original code

Unfortunately, the crash still occurs, and I can't find any clue in /var/log/messages.



From: Paolo Valente <paolo.valente@linaro.org>
Sent: Thursday, September 8, 2022 6:54 PM
To: Rory Chen <rory.c.chen@seagate.com>
Cc: Tyler Erickson <tyler.erickson@seagate.com>; Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org <linux-block@vger.kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com <andrea.righi@canonical.com>; glen.valante@linaro.org <glen.valante@linaro.org>; axboe@kernel.dk <axboe@kernel.dk>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>; Damien Le Moal <damien.lemoal@wdc.com>
Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives


This message has originated from an External Source. Please use proper judgment and caution when opening attachments, clicking links, or responding to this email.


> Il giorno 8 set 2022, alle ore 04:46, Rory Chen <rory.c.chen@seagate.com> ha scritto:
>
> I changed the comparison condition, and it eliminates the warning.

Yep. The crash you reported also goes away?

> <               if (end >= iar->sector + 1 && end < iar->sector + iar->nr_sectors + 1)
>>              if (end >= iar->sector && end < iar->sector + iar->nr_sectors)
>
> I don't know if this change is appropriate

Unfortunately your change conflicts with the standard code, taken from
the original patches on access ranges [1].  I've CCed Damien, the
author of this patch series.

[1] https://lwn.net/ml/linux-block/20210909023545.1101672-2-damien.lemoal@wdc.com/

Thanks,
Paolo

> but deducting 1 from bio_end_sector, as Tyler said, seems to make sense.
>
> From: Paolo Valente <paolo.valente@linaro.org>
> Sent: Thursday, August 25, 2022 10:45 PM
> To: Tyler Erickson <tyler.erickson@seagate.com>
> Cc: Rory Chen <rory.c.chen@seagate.com>; Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org <linux-block@vger.kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com <andrea.righi@canonical.com>; glen.valante@linaro.org <glen.valante@linaro.org>; axboe@kernel.dk <axboe@kernel.dk>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
>
>
>
>
> Hi
>
>> Il giorno 18 ago 2022, alle ore 17:40, Tyler Erickson <tyler.erickson@seagate.com> ha scritto:
>>
>> The libata layer is reporting correctly after the changes I submitted.
>>
>> The drive reports the actuator ranges as a starting LBA and a count of LBAs for the range.
>> If the code reading the reported values simply computes startingLBA + range, it gets an incorrect ending LBA for that actuator, because LBAs are zero-indexed and this simple addition does not take that into account.
>> The proper way to get the ending LBA is startingLBA + range - 1, which gives the last LBA at which to issue a final IO read/write, accounting for LBA values starting at zero rather than one.
>>
>> Here is an example from the output in SeaChest/openSeaChest:
>> ====Concurrent Positioning Ranges====
>>
>> Range#   #Elements    Lowest LBA     # of LBAs
>>   0          1                  0   17578328064
>>   1          1        17578328064   17578328064
>>
>> If using the incorrect formula to get the final LBA for actuator 0, you would get 17578328064, but this is the starting LBA reported by the drive for actuator 1.
>> So to be consistent for all ranges, the final LBA for a given actuator should be calculated as starting LBA + range - 1.
>>
>
> Ok
>
>> I reached out to Seagate's T10 and T13 representatives for clarification and verification. What is most likely causing the error is a missing "- 1" somewhere after the information reported by the device is read. They agreed that the reporting from the drive and the SCSI to ATA translation is correct.
>>
>> I'm not sure where this is being read and calculated, but it is not an error in the low-level libata or sd level of the kernel. It may be in bfq, or it may be in some other place after the sd layer.
>
> This apparent mistake is in the macro bio_end_sector (defined in
> include/linux/bio.h), which expands to sector + size.
> Jens, can you shed some light on this point?
>
> Thanks,
> Paolo
>
>> I know there were some additions to read this and report it up the stack, but I did not think those were wrong as they seemed to pass the drive reported information up the stack.
>>
>> Tyler Erickson
>> Seagate Technology
>>
>>
>>
>> -----Original Message-----
>> From: Rory Chen <rory.c.chen@seagate.com>
>> Sent: Wednesday, August 10, 2022 6:59 AM
>> To: Paolo Valente <paolo.valente@linaro.org>
>> Cc: Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com; glen.valante@linaro.org; axboe@kernel.dk; Tyler Erickson <tyler.erickson@seagate.com>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
>> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
>>
>> The block trace shows the start sector is 35156656120 and the transfer length is 8 sectors, which is within the drive's max LBA of 35156656127. According to the parsed block trace slice, this IO completed successfully even though the warning message was reported.
>> 8,64   7       13     0.039401337 19176  Q  RA 35156656120 + 8 [systemd-udevd]
>> 8,64   7       15     0.039403946 19176  P   N [systemd-udevd]
>> 8,64   7       16     0.039405132 19176  I  RA 35156656120 + 8 [systemd-udevd]
>> 8,64   7       18     0.039411554 19176  D  RA 35156656120 + 8 [systemd-udevd]
>> 8,64   0       40     0.039479055     0  C  RA 35156656120 + 8 [0]
>>
>> We may need to find out where the "bio_end_sector" value is calculated as 35156656128. I have patched the libata and sd drivers for Dual Actuator.
>>
>>
>>
>> From: Paolo Valente <paolo.valente@linaro.org>
>> Sent: Wednesday, August 10, 2022 6:22 PM
>> To: Rory Chen <rory.c.chen@seagate.com>
>> Cc: Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org <linux-block@vger.kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com <andrea.righi@canonical.com>; glen.valante@linaro.org <glen.valante@linaro.org>; axboe@kernel.dk <axboe@kernel.dk>; Tyler Erickson <tyler.erickson@seagate.com>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
>> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
>>
>>
>>
>>
>>> Il giorno 9 ago 2022, alle ore 05:47, Rory Chen <rory.c.chen@seagate.com> ha scritto:
>>>
>>> Resending the mail as plain text, because the previous rich-text mail
>>> got mangled, and I forgot to add the others at Seagate who worked on
>>> validating the patch as well (Muhammad, Michael, Andrew, Varun, Tyler).
>>>
>>> Hi Paolo,
>>>
>>
>> Hi
>>
>>> I am from Seagate China, and I have run into a problem while evaluating the bfq patches. Could you please take a look?
>>> Thanks
>>>
>>> Issue statement
>>> While running a performance test on the bfq patches, I observed the warning message "bfq_actuator_index: bio sector out of ranges: end=35156656128", and the OS hung suddenly after some hours.
>>> The warning is reported from the function bfq_actuator_index, which determines which actuator index an IO request maps to.  The bio_end_sector is 35156656128, but the max LBA of the drive is 35156656127, so it is beyond the LBA range.
>>
>> Yep, this sanity check fails if the end sector of a new IO does not belong to any sector range.
>>
>>> I captured the block trace and didn't find any request at LBA 35156656128; the maximum request LBA I found was 35156656127.
>>
>> Maybe in the trace you see only start sectors?  The failed check is performed on end sectors instead.
>>
>> At any rate, there seems to be an off-by-one error in the value(s) stored in the sector field(s) of the blk_independent_access_range data structure.
>>
>> I guess we may need some help/feedback from people competent on this stuff.
>>
>>> I'm not sure whether this warning message is related to the later OS hang.
>>>
>>
>> Not easy to say.  At any rate, we can try with a development version of bfq.  It can help us detect the possible cause of this hang.  But let's see where we get with this sector error first.
>>
>> Thank you for testing this extended version of bfq,
>> Paolo
>>
>>>
>>> Problem environment
>>> Kernel base is 5.18.9
>>> Test HDD drive is Seagate ST18000NM0092 dual actuator SATA.
>>> Actuator LBA mapping by reading VPD B9 Concurrent positioning ranges
>>> VPD page:
>>> LBA range number:0
>>> number of storage elements:1
>>> starting LBA:0x0
>>> number of LBAs:0x417c00000 [17578328064]
>>> LBA range number:1
>>> number of storage elements:1
>>> starting LBA:0x417c00000
>>> number of LBAs:0x417c00000 [17578328064]
>>>
>>>
>>>
>>>
>>>
>>> From: Paolo Valente <paolo.valente@linaro.org>
>>> Sent: Thursday, June 23, 2022 8:53 AM
>>> To: Jens Axboe <axboe@kernel.dk>
>>> Cc: linux-block@vger.kernel.org <linux-block@vger.kernel.org>;
>>> linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>;
>>> jack@suse.cz <jack@suse.cz>; andrea.righi@canonical.com
>>> <andrea.righi@canonical.com>; glen.valante@linaro.org
>>> <glen.valante@linaro.org>; Arie van der Hoeven
>>> <arie.vanderhoeven@seagate.com>; Paolo Valente
>>> <paolo.valente@linaro.org>
>>> Subject: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator
>>> drives
>>>
>>>
>>>
>>>
>>> Hi,
>>> this patch series extends BFQ so as to optimize I/O dispatch to
>>> multi-actuator drives. In particular, this extension addresses the
>>> following issue. Multi-actuator drives appear as a single device to
>>> the I/O subsystem [1].  Yet they address commands to different
>>> actuators internally, as a function of Logical Block Addressing
>>> (LBAs). A given sector is reachable by only one of the actuators. For
>>> example, Seagate's Serial Advanced Technology Attachment (SATA)
>>> version contains two actuators and maps the lower half of the SATA LBA
>>> space to the lower actuator and the upper half to the upper actuator.
>>>
>>> Evidently, to fully utilize actuators, no actuator must be left idle
>>> or underutilized while there is pending I/O for it. To reach this
>>> goal, the block layer must somehow control the load of each actuator
>>> individually. This series enriches BFQ with such a per-actuator
>>> control, as a first step. Then it also adds a simple mechanism for
>>> guaranteeing that actuators with pending I/O are never left idle.
>>>
>>> See [1] for a more detailed overview of the problem and of the
>>> solutions implemented in this patch series. There you will also find
>>> some preliminary performance results.
>>>
>>> Thanks,
>>> Paolo
>>>
>>> [1] https://www.linaro.org/blog/budget-fair-queueing-bfq-linux-io-scheduler-optimizations-for-multi-actuator-sata-hard-drives/
>>>
>>> Davide Zini (3):
>>>   block, bfq: split also async bfq_queues on a per-actuator basis
>>>   block, bfq: inject I/O to underutilized actuators
>>>   block, bfq: balance I/O injection among underutilized actuators
>>>
>>> Federico Gavioli (1):
>>> block, bfq: retrieve independent access ranges from request queue
>>>
>>> Paolo Valente (4):
>>>   block, bfq: split sync bfq_queues on a per-actuator basis
>>>   block, bfq: forbid stable merging of queues associated with different actuators
>>>   block, bfq: turn scalar fields into arrays in bfq_io_cq
>>>   block, bfq: turn BFQ_NUM_ACTUATORS into BFQ_MAX_ACTUATORS
>>>
>>> block/bfq-cgroup.c  |  97 +++++----
>>> block/bfq-iosched.c | 488 +++++++++++++++++++++++++++++---------------
>>> block/bfq-iosched.h | 149 ++++++++++----
>>> block/bfq-wf2q.c    |   2 +-
>>> 4 files changed, 493 insertions(+), 243 deletions(-)
>>>
>>> --
>>> 2.20.1
>>>
>>>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
  2022-09-08 11:34                 ` Rory Chen
@ 2022-09-27 16:59                   ` Arie van der Hoeven
  2022-09-28  1:02                     ` Rory Chen
  0 siblings, 1 reply; 21+ messages in thread
From: Arie van der Hoeven @ 2022-09-27 16:59 UTC (permalink / raw)
  To: Rory Chen, Paolo Valente
  Cc: Tyler Erickson, Muhammad Ahmad, linux-block, linux-kernel,
	Jan Kara, andrea.righi, glen.valante, axboe, Michael English,
	Andrew Ring, Varun Boddu, Damien Le Moal

Chiming in as we have customers who are having very good results with these BFQ patches and are planning to pilot NAS solutions in early 2023. This bug is not a blocker for us, but we do need the BFQ patches included in Linux 6.0.

Rory, can you submit your changes, or is it the maintainer's responsibility?

Regards,  --Arie


From: Rory Chen <rory.c.chen@seagate.com>
Sent: Thursday, September 8, 2022 4:34 AM
To: Paolo Valente <paolo.valente@linaro.org>
Cc: Tyler Erickson <tyler.erickson@seagate.com>; Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org <linux-block@vger.kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com <andrea.righi@canonical.com>; glen.valante@linaro.org <glen.valante@linaro.org>; axboe@kernel.dk <axboe@kernel.dk>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>; Damien Le Moal <damien.lemoal@wdc.com>
Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives


Oops, I attached the wrong code change. Here is the right one:

<               if (end >= iar->sector + 1 && end < iar->sector + iar->nr_sectors + 1)  //Changed code
>               if (end >= iar->sector && end < iar->sector + iar->nr_sectors) // Original code

Unfortunately, the crash still occurs, and I can't find any clue in /var/log/messages.



From: Paolo Valente <paolo.valente@linaro.org>
Sent: Thursday, September 8, 2022 6:54 PM
To: Rory Chen <rory.c.chen@seagate.com>
Cc: Tyler Erickson <tyler.erickson@seagate.com>; Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org <linux-block@vger.kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com <andrea.righi@canonical.com>; glen.valante@linaro.org <glen.valante@linaro.org>; axboe@kernel.dk <axboe@kernel.dk>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>; Damien Le Moal <damien.lemoal@wdc.com>
Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives




> Il giorno 8 set 2022, alle ore 04:46, Rory Chen <rory.c.chen@seagate.com> ha scritto:
>
> I changed the comparison condition, and it eliminates the warning.

Yep. The crash you reported also goes away?

> <               if (end >= iar->sector + 1 && end < iar->sector + iar->nr_sectors + 1)
>>              if (end >= iar->sector && end < iar->sector + iar->nr_sectors)
>
> I don't know if this change is appropriate

Unfortunately your change conflicts with the standard code, taken from
the original patches on access ranges [1].  I've CCed Damien, the
author of this patch series.

[1] https://lwn.net/ml/linux-block/20210909023545.1101672-2-damien.lemoal@wdc.com/

Thanks,
Paolo

> but deducting 1 from bio_end_sector, as Tyler said, seems to make sense.
>
> From: Paolo Valente <paolo.valente@linaro.org>
> Sent: Thursday, August 25, 2022 10:45 PM
> To: Tyler Erickson <tyler.erickson@seagate.com>
> Cc: Rory Chen <rory.c.chen@seagate.com>; Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org <linux-block@vger.kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com <andrea.righi@canonical.com>; glen.valante@linaro.org <glen.valante@linaro.org>; axboe@kernel.dk <axboe@kernel.dk>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
>
>
>
>
> Hi
>
>> Il giorno 18 ago 2022, alle ore 17:40, Tyler Erickson <tyler.erickson@seagate.com> ha scritto:
>>
>> The libata layer is reporting correctly after the changes I submitted.
>>
>> The drive reports the actuator ranges as a starting LBA and a count of LBAs for the range.
>> If the code reading the reported values simply computes startingLBA + range, it gets an incorrect ending LBA for that actuator, because LBAs are zero-indexed and this simple addition does not take that into account.
>> The proper way to get the ending LBA is startingLBA + range - 1, which gives the last LBA at which to issue a final IO read/write, accounting for LBA values starting at zero rather than one.
>>
>> Here is an example from the output in SeaChest/openSeaChest:
>> ====Concurrent Positioning Ranges====
>>
>> Range#   #Elements    Lowest LBA     # of LBAs
>>   0          1                  0   17578328064
>>   1          1        17578328064   17578328064
>>
>> If using the incorrect formula to get the final LBA for actuator 0, you would get 17578328064, but this is the starting LBA reported by the drive for actuator 1.
>> So to be consistent for all ranges, the final LBA for a given actuator should be calculated as starting LBA + range - 1.
>>
>
> Ok
>
>> I reached out to Seagate's T10 and T13 representatives for clarification and verification. What is most likely causing the error is a missing "- 1" somewhere after the information reported by the device is read. They agreed that the reporting from the drive and the SCSI to ATA translation is correct.
>>
>> I'm not sure where this is being read and calculated, but it is not an error in the low-level libata or sd level of the kernel. It may be in bfq, or it may be in some other place after the sd layer.
>
> This apparent mistake is in the macro bio_end_sector (defined in
> include/linux/bio.h), which expands to sector + size.
> Jens, can you shed some light on this point?
>
> Thanks,
> Paolo
>
>> I know there were some additions to read this and report it up the stack, but I did not think those were wrong as they seemed to pass the drive reported information up the stack.
>>
>> Tyler Erickson
>> Seagate Technology
>>
>>
>>
>> -----Original Message-----
>> From: Rory Chen <rory.c.chen@seagate.com>
>> Sent: Wednesday, August 10, 2022 6:59 AM
>> To: Paolo Valente <paolo.valente@linaro.org>
>> Cc: Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com; glen.valante@linaro.org; axboe@kernel.dk; Tyler Erickson <tyler.erickson@seagate.com>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
>> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
>>
>> The block trace shows the start sector is 35156656120 and the transfer length is 8 sectors, which is within the drive's max LBA of 35156656127. According to the parsed block trace slice, this IO completed successfully even though the warning message was reported.
>> 8,64   7       13     0.039401337 19176  Q  RA 35156656120 + 8 [systemd-udevd]
>> 8,64   7       15     0.039403946 19176  P   N [systemd-udevd]
>> 8,64   7       16     0.039405132 19176  I  RA 35156656120 + 8 [systemd-udevd]
>> 8,64   7       18     0.039411554 19176  D  RA 35156656120 + 8 [systemd-udevd]
>> 8,64   0       40     0.039479055     0  C  RA 35156656120 + 8 [0]
>>
>> We may need to find out where the "bio_end_sector" value is calculated as 35156656128. I have patched the libata and sd drivers for Dual Actuator.
>>
>>
>>
>> From: Paolo Valente <paolo.valente@linaro.org>
>> Sent: Wednesday, August 10, 2022 6:22 PM
>> To: Rory Chen <rory.c.chen@seagate.com>
>> Cc: Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org <linux-block@vger.kernel.org>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com <andrea.righi@canonical.com>; glen.valante@linaro.org <glen.valante@linaro.org>; axboe@kernel.dk <axboe@kernel.dk>; Tyler Erickson <tyler.erickson@seagate.com>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
>> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
>>
>>
>>
>>
>>> Il giorno 9 ago 2022, alle ore 05:47, Rory Chen <rory.c.chen@seagate.com> ha scritto:
>>>
>>> Resending the mail as plain text, because the previous rich-text mail
>>> got mangled, and I forgot to add the others at Seagate who worked on
>>> validating the patch as well (Muhammad, Michael, Andrew, Varun, Tyler).
>>>
>>> Hi Paolo,
>>>
>>
>> Hi
>>
>>> I am from Seagate China, and I have run into a problem while evaluating the bfq patches. Could you please take a look?
>>> Thanks
>>>
>>> Issue statement
>>> While running a performance test on the bfq patches, I observed the warning message "bfq_actuator_index: bio sector out of ranges: end=35156656128", and the OS hung suddenly after some hours.
>>> The warning is reported from the function bfq_actuator_index, which determines which actuator index an IO request maps to.  The bio_end_sector is 35156656128, but the max LBA of the drive is 35156656127, so it is beyond the LBA range.
>>
>> Yep, this sanity check fails if the end sector of a new IO does not belong to any sector range.
>>
>>> I captured the block trace and didn't find any request at LBA 35156656128; the maximum request LBA I found was 35156656127.
>>
>> Maybe in the trace you see only start sectors?  The failed check is performed on end sectors instead.
>>
>> At any rate, there seems to be an off-by-one error in the value(s) stored in the sector field(s) of the blk_independent_access_range data structure.
>>
>> I guess we may need some help/feedback from people competent on this stuff.
>>
>>> I'm not sure whether this warning message is related to the later OS hang.
>>>
>>
>> Not easy to say.  At any rate, we can try with a development version of bfq.  It can help us detect the possible cause of this hang.  But let's see where we get with this sector error first.
>>
>> Thank you for testing this extended version of bfq,
>> Paolo
>>
>>>
>>> Problem environment
>>> Kernel base is 5.18.9
>>> Test HDD drive is Seagate ST18000NM0092 dual actuator SATA.
>>> Actuator LBA mapping by reading VPD B9 Concurrent positioning ranges
>>> VPD page:
>>> LBA range number:0
>>> number of storage elements:1
>>> starting LBA:0x0
>>> number of LBAs:0x417c00000 [17578328064]
>>> LBA range number:1
>>> number of storage elements:1
>>> starting LBA:0x417c00000
>>> number of LBAs:0x417c00000 [17578328064]
>>>
>>>
>>>
>>>
>>>
>>> From: Paolo Valente <paolo.valente@linaro.org>
>>> Sent: Thursday, June 23, 2022 8:53 AM
>>> To: Jens Axboe <axboe@kernel.dk>
>>> Cc: linux-block@vger.kernel.org <linux-block@vger.kernel.org>;
>>> linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>;
>>> jack@suse.cz <jack@suse.cz>; andrea.righi@canonical.com
>>> <andrea.righi@canonical.com>; glen.valante@linaro.org
>>> <glen.valante@linaro.org>; Arie van der Hoeven
>>> <arie.vanderhoeven@seagate.com>; Paolo Valente
>>> <paolo.valente@linaro.org>
>>> Subject: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator
>>> drives
>>>
>>>
>>>
>>>
>>> Hi,
>>> this patch series extends BFQ so as to optimize I/O dispatch to
>>> multi-actuator drives. In particular, this extension addresses the
>>> following issue. Multi-actuator drives appear as a single device to
>>> the I/O subsystem [1].  Yet they address commands to different
>>> actuators internally, as a function of Logical Block Addressing
>>> (LBAs). A given sector is reachable by only one of the actuators. For
>>> example, Seagate's Serial Advanced Technology Attachment (SATA)
>>> version contains two actuators and maps the lower half of the SATA LBA
>>> space to the lower actuator and the upper half to the upper actuator.
>>>
>>> Evidently, to fully utilize actuators, no actuator must be left idle
>>> or underutilized while there is pending I/O for it. To reach this
>>> goal, the block layer must somehow control the load of each actuator
>>> individually. This series enriches BFQ with such a per-actuator
>>> control, as a first step. Then it also adds a simple mechanism for
>>> guaranteeing that actuators with pending I/O are never left idle.
>>>
>>> See [1] for a more detailed overview of the problem and of the
>>> solutions implemented in this patch series. There you will also find
>>> some preliminary performance results.
>>>
>>> Thanks,
>>> Paolo
>>>
>>> [1] https://www.linaro.org/blog/budget-fair-queueing-bfq-linux-io-scheduler-optimizations-for-multi-actuator-sata-hard-drives/
>>>
>>> Davide Zini (3):
>>>   block, bfq: split also async bfq_queues on a per-actuator basis
>>>   block, bfq: inject I/O to underutilized actuators
>>>   block, bfq: balance I/O injection among underutilized actuators
>>>
>>> Federico Gavioli (1):
>>> block, bfq: retrieve independent access ranges from request queue
>>>
>>> Paolo Valente (4):
>>>   block, bfq: split sync bfq_queues on a per-actuator basis
>>>   block, bfq: forbid stable merging of queues associated with different actuators
>>>   block, bfq: turn scalar fields into arrays in bfq_io_cq
>>>   block, bfq: turn BFQ_NUM_ACTUATORS into BFQ_MAX_ACTUATORS
>>>
>>> block/bfq-cgroup.c  |  97 +++++----
>>> block/bfq-iosched.c | 488 +++++++++++++++++++++++++++++---------------
>>> block/bfq-iosched.h | 149 ++++++++++----
>>> block/bfq-wf2q.c    |   2 +-
>>> 4 files changed, 493 insertions(+), 243 deletions(-)
>>>
>>> --
>>> 2.20.1
>>>
>>>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
  2022-09-27 16:59                   ` Arie van der Hoeven
@ 2022-09-28  1:02                     ` Rory Chen
  2022-09-30 15:48                       ` Paolo Valente
  0 siblings, 1 reply; 21+ messages in thread
From: Rory Chen @ 2022-09-28  1:02 UTC (permalink / raw)
  To: Arie van der Hoeven, Paolo Valente
  Cc: Tyler Erickson, Muhammad Ahmad, linux-block, linux-kernel,
	Jan Kara, andrea.righi, glen.valante, axboe, Michael English,
	Andrew Ring, Varun Boddu, Damien Le Moal

Hi Arie,

I have no experience submitting changes to the kernel. Could Paolo or another maintainer help submit it?



-----Original Message-----
From: Arie van der Hoeven <arie.vanderhoeven@seagate.com> 
Sent: Wednesday, September 28, 2022 12:59 AM
To: Rory Chen <rory.c.chen@seagate.com>; Paolo Valente <paolo.valente@linaro.org>
Cc: Tyler Erickson <tyler.erickson@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com; glen.valante@linaro.org; axboe@kernel.dk; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>; Damien Le Moal <damien.lemoal@wdc.com>
Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives

Chiming in as we have customers who are having very good results with these BFQ patches and are planning to pilot NAS solutions in early 2023. This bug is not a blocker for us, but we do need the BFQ patches included in Linux 6.0.

Rory, can you submit your changes, or is it the maintainer's responsibility?

Regards,
--Arie


From: Rory Chen <rory.c.chen@seagate.com>
Sent: Thursday, September 8, 2022 4:34 AM
To: Paolo Valente <paolo.valente@linaro.org>
Cc: Tyler Erickson <tyler.erickson@seagate.com>; Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com; glen.valante@linaro.org; axboe@kernel.dk; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>; Damien Le Moal <damien.lemoal@wdc.com>
Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives


Oops, I attached the wrong code change. Here is the right change I made.

<               if (end >= iar->sector + 1 && end < iar->sector + iar->nr_sectors + 1)  // Changed code
>               if (end >= iar->sector && end < iar->sector + iar->nr_sectors)  // Original code

Unfortunately, the crash still occurs, and I can't find any clue in /var/log/messages.



From: Paolo Valente <paolo.valente@linaro.org>
Sent: Thursday, September 8, 2022 6:54 PM
To: Rory Chen <rory.c.chen@seagate.com>
Cc: Tyler Erickson <tyler.erickson@seagate.com>; Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com; glen.valante@linaro.org; axboe@kernel.dk; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>; Damien Le Moal <damien.lemoal@wdc.com>
Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives


This message has originated from an External Source. Please use proper judgment and caution when opening attachments, clicking links, or responding to this email.


> On 8 Sep 2022, at 04:46, Rory Chen <rory.c.chen@seagate.com> wrote:
>
> I change the comparison condition and it can eliminate the warning.

Yep. Does the crash you reported go away as well?

> <               if (end >= iar->sector + 1 && end < iar->sector + iar->nr_sectors + 1)
>>              if (end >= iar->sector && end < iar->sector + iar->nr_sectors)
>
> I don't know if this change is appropriate

Unfortunately your change conflicts with the standard code, taken from the original patches on access ranges [1].  I've CCed Damien, the author of this patch series.

[1] https://lwn.net/ml/linux-block/20210909023545.1101672-2-damien.lemoal@wdc.com/

Thanks,
Paolo

> but subtracting 1 from bio_end_sector, as Tyler suggested, seems to make sense.
>
> From: Paolo Valente <paolo.valente@linaro.org>
> Sent: Thursday, August 25, 2022 10:45 PM
> To: Tyler Erickson <tyler.erickson@seagate.com>
> Cc: Rory Chen <rory.c.chen@seagate.com>; Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com; glen.valante@linaro.org; axboe@kernel.dk; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
>
>
>
> Hi
>
>> On 18 Aug 2022, at 17:40, Tyler Erickson <tyler.erickson@seagate.com> wrote:
>>
>> The libata layer is reporting correctly after the changes I submitted.
>>
>> The drive reports the actuator ranges as a starting LBA and a count of LBAs for the range.
>> If the code reading the reported values simply does startingLBA + range, it gets an incorrect ending LBA for that actuator, because LBAs are zero-indexed and this simple addition does not take that into account.
>> The proper way to get the ending LBA is startingLBA + range - 1; that is the last LBA at which to issue a final I/O read/write, accounting for LBA values starting at zero rather than one.
>>
>> Here is an example from the output in SeaChest/openSeaChest:
>> ====Concurrent Positioning Ranges====
>>
>> Range#   #Elements     Lowest LBA     # of LBAs
>>   0          1                   0    17578328064
>>   1          1         17578328064    17578328064
>>
>> If using the incorrect formula to get the final LBA for actuator 0, you would get 17578328064, but this is the starting LBA reported by the drive for actuator 1.
>> So to be consistent for all ranges, the final LBA for a given actuator should be calculated as starting LBA + range - 1.
>>
>
> Ok
>
>> I reached out to Seagate's T10 and T13 representatives for clarification and verification, and what is most likely causing the error is a missing "- 1" somewhere after the information reported by the device is read. They agreed that the reporting from the drive and the SCSI-to-ATA translation are correct.
>>
>> I'm not sure where this is being read and calculated, but it is not an error in the low-level libata or sd layers of the kernel. It may be in bfq, or in some other place above the sd layer.
>
> This apparent mistake is in the macro bio_end_sector (defined in
> include/linux/bio.h), which seems to expand to sector + size.
> Jens, can you shed some light on this point?
>
> Thanks,
> Paolo
>
>> I know there were some additions to read this and report it up the stack, but I did not think those were wrong as they seemed to pass the drive reported information up the stack.
>>
>> Tyler Erickson
>> Seagate Technology
>>
>>
>> Seagate Internal
>>
>> -----Original Message-----
>> From: Rory Chen <rory.c.chen@seagate.com>
>> Sent: Wednesday, August 10, 2022 6:59 AM
>> To: Paolo Valente <paolo.valente@linaro.org>
>> Cc: Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com; glen.valante@linaro.org; axboe@kernel.dk; Tyler Erickson <tyler.erickson@seagate.com>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
>> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
>>
>> The block trace shows the start sector is 35156656120 and the transfer length is 8 sectors, which is within the drive's max LBA of 35156656127. And this I/O completes successfully according to the parsed block trace, despite the warning message.
>> 8,64   7       13     0.039401337 19176  Q  RA 35156656120 + 8 [systemd-udevd]
>> 8,64   7       15     0.039403946 19176  P   N [systemd-udevd]
>> 8,64   7       16     0.039405132 19176  I  RA 35156656120 + 8 [systemd-udevd]
>> 8,64   7       18     0.039411554 19176  D  RA 35156656120 + 8 [systemd-udevd]
>> 8,64   0       40     0.039479055     0  C  RA 35156656120 + 8 [0]
>>
>> We may need to find out where the "bio_end_sector" value of 35156656128 is calculated. I have patched the libata and sd drivers for Dual Actuator.
>>
>>
>>
>> From: Paolo Valente <paolo.valente@linaro.org>
>> Sent: Wednesday, August 10, 2022 6:22 PM
>> To: Rory Chen <rory.c.chen@seagate.com>
>> Cc: Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com; glen.valante@linaro.org; axboe@kernel.dk; Tyler Erickson <tyler.erickson@seagate.com>; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>
>> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
>> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support 
>> multi-actuator drives
>>
>>
>>
>>> On 9 Aug 2022, at 05:47, Rory Chen <rory.c.chen@seagate.com> wrote:
>>>
>>> Resending the mail as plain text, because the previous mail with rich
>>> text made some mess, and I forgot to add the others at Seagate who
>>> worked on validating the patch as well (Muhammad, Michael, Andrew, Varun, Tyler).
>>>
>>> Hi Paolo,
>>>
>>
>> Hi
>>
>>> I am from Seagate China, and I am facing a problem while evaluating the bfq patches. Could you please take a look?
>>> Thanks
>>>
>>> Issue statement
>>> When running a performance test on the bfq patches, I observed the warning message "bfq_actuator_index: bio sector out of ranges: end=35156656128", and the OS hung suddenly after some hours.
>>> The warning message is reported from the function bfq_actuator_index, which determines which actuator index an I/O request maps to.  The bio_end_sector is 35156656128 but the max LBA for the drive is 35156656127, so it is beyond the LBA range.
>>
>> Yep, this sanity check fails if the end sector of a new IO does not belong to any sector range.
>>
>>> I captured the block trace and didn't find request LBA 35156656128; I only found a max request LBA of 35156656127.
>>
>> Maybe in the trace you see only start sectors?  The failed check is performed on end sectors instead.
>>
>> At any rate, there seems to be an off-by-one error in the value(s) stored in the sector field(s) of the blk_independent_access_range data structure.
>>
>> I guess we may need some help/feedback from people competent on this stuff.
>>
>>> I'm not sure if this warning message is related to the later OS hang.
>>>
>>
>> Not easy to say.  At any rate, we can try with a development version of bfq.  It can help us detect the possible cause of this hang.  But let's see where we get with this sector error first.
>>
>> Thank you for testing this extended version of bfq,
>> Paolo
>>
>>>
>>> Problem environment
>>> Kernel base is 5.18.9.
>>> Test HDD drive is a Seagate ST18000NM0092 dual-actuator SATA drive.
>>> Actuator LBA mapping, read from the Concurrent Positioning Ranges VPD page (B9h):
>>> LBA range number: 0
>>>   number of storage elements: 1
>>>   starting LBA: 0x0
>>>   number of LBAs: 0x417c00000 [17578328064]
>>> LBA range number: 1
>>>   number of storage elements: 1
>>>   starting LBA: 0x417c00000
>>>   number of LBAs: 0x417c00000 [17578328064]
>>>
>>>
>>>
>>>
>>>
>>> From: Paolo Valente <paolo.valente@linaro.org>
>>> Sent: Thursday, June 23, 2022 8:53 AM
>>> To: Jens Axboe <axboe@kernel.dk>
>>> Cc: linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; jack@suse.cz; andrea.righi@canonical.com; glen.valante@linaro.org; Arie van der Hoeven <arie.vanderhoeven@seagate.com>; Paolo Valente <paolo.valente@linaro.org>
>>> Subject: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
>>>
>>>
>>>
>>> Hi,
>>> this patch series extends BFQ so as to optimize I/O dispatch to 
>>> multi-actuator drives. In particular, this extension addresses the 
>>> following issue. Multi-actuator drives appear as a single device to 
>>> the I/O subsystem [1].  Yet they address commands to different 
>>> actuators internally, as a function of Logical Block Addressing 
>>> (LBAs). A given sector is reachable by only one of the actuators. 
>>> For example, Seagate's Serial Advanced Technology Attachment (SATA) 
>>> version contains two actuators and maps the lower half of the SATA 
>>> LBA space to the lower actuator and the upper half to the upper actuator.
>>>
>>> Evidently, to fully utilize actuators, no actuator must be left idle 
>>> or underutilized while there is pending I/O for it. To reach this 
>>> goal, the block layer must somehow control the load of each actuator 
>>> individually. This series enriches BFQ with such a per-actuator 
>>> control, as a first step. Then it also adds a simple mechanism for 
>>> guaranteeing that actuators with pending I/O are never left idle.
>>>
>>> See [1] for a more detailed overview of the problem and of the 
>>> solutions implemented in this patch series. There you will also find 
>>> some preliminary performance results.
>>>
>>> Thanks,
>>> Paolo
>>>
>>> [1] https://www.linaro.org/blog/budget-fair-queueing-bfq-linux-io-scheduler-optimizations-for-multi-actuator-sata-hard-drives/
>>>
>>> Davide Zini (3):
>>> block, bfq: split also async bfq_queues on a per-actuator basis
>>> block, bfq: inject I/O to underutilized actuators
>>> block, bfq: balance I/O injection among underutilized actuators
>>>
>>> Federico Gavioli (1):
>>> block, bfq: retrieve independent access ranges from request queue
>>>
>>> Paolo Valente (4):
>>> block, bfq: split sync bfq_queues on a per-actuator basis
>>> block, bfq: forbid stable merging of queues associated with different actuators
>>> block, bfq: turn scalar fields into arrays in bfq_io_cq
>>> block, bfq: turn BFQ_NUM_ACTUATORS into BFQ_MAX_ACTUATORS
>>>
>>> block/bfq-cgroup.c  |  97 +++++----
>>> block/bfq-iosched.c | 488 +++++++++++++++++++++++++++++---------------
>>> block/bfq-iosched.h | 149 ++++++++++----
>>> block/bfq-wf2q.c    |   2 +-
>>> 4 files changed, 493 insertions(+), 243 deletions(-)
>>>
>>> --
>>> 2.20.1
>>>
>>>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
  2022-09-28  1:02                     ` Rory Chen
@ 2022-09-30 15:48                       ` Paolo Valente
  0 siblings, 0 replies; 21+ messages in thread
From: Paolo Valente @ 2022-09-30 15:48 UTC (permalink / raw)
  To: Rory Chen
  Cc: Arie van der Hoeven, Tyler Erickson, Muhammad Ahmad, linux-block,
	linux-kernel, Jan Kara, andrea.righi, glen.valante, axboe,
	Michael English, Andrew Ring, Varun Boddu, Damien Le Moal



> On 28 Sep 2022, at 03:02, Rory Chen <rory.c.chen@seagate.com> wrote:
> 
> Hi Arie,
> 
> I have no experience submitting changes to the kernel. Could Paolo or another maintainer help submit it?
> 
> 

Hi Rory,
the problem here is that your change simply switches off the warning
in BFQ, but the offending sector number is still produced by
bio_end_sector.  What we need here is some feedback from Damien or Jens.
Maybe they are not following this thread.  I'll try to get their
attention by creating a new thread on the topic.  Cross your fingers :)

Thanks,
Paolo

> Seagate Internal
> 
> -----Original Message-----
> From: Arie van der Hoeven <arie.vanderhoeven@seagate.com> 
> Sent: Wednesday, September 28, 2022 12:59 AM
> To: Rory Chen <rory.c.chen@seagate.com>; Paolo Valente <paolo.valente@linaro.org>
> Cc: Tyler Erickson <tyler.erickson@seagate.com>; Muhammad Ahmad <muhammad.ahmad@seagate.com>; linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Jan Kara <jack@suse.cz>; andrea.righi@canonical.com; glen.valante@linaro.org; axboe@kernel.dk; Michael English <michael.english@seagate.com>; Andrew Ring <andrew.ring@seagate.com>; Varun Boddu <varunreddy.boddu@seagate.com>; Damien Le Moal <damien.lemoal@wdc.com>
> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
> 
> Chiming in as we have customers who are having very good results with these BFQ patches and are planning to pilot NAS solutions in early 2023. This bug is not a blocker for us, but we do need the BFQ patches included in Linux 6.0.
> 
> Rory can you submit your changes or is it the maintainer's responsibility?
> 
> Regards,  --Arie
> 
> 
> From: Rory Chen <mailto:rory.c.chen@seagate.com>
> Sent: Thursday, September 8, 2022 4:34 AM
> To: Paolo Valente <mailto:paolo.valente@linaro.org>
> Cc: Tyler Erickson <mailto:tyler.erickson@seagate.com>; Arie van der Hoeven <mailto:arie.vanderhoeven@seagate.com>; Muhammad Ahmad <mailto:muhammad.ahmad@seagate.com>; mailto:linux-block@vger.kernel.org <mailto:linux-block@vger.kernel.org>; mailto:linux-kernel@vger.kernel.org <mailto:linux-kernel@vger.kernel.org>; Jan Kara <mailto:jack@suse.cz>; mailto:andrea.righi@canonical.com <mailto:andrea.righi@canonical.com>; mailto:glen.valante@linaro.org <mailto:glen.valante@linaro.org>; mailto:axboe@kernel.dk <mailto:axboe@kernel.dk>; Michael English <mailto:michael.english@seagate.com>; Andrew Ring <mailto:andrew.ring@seagate.com>; Varun Boddu <mailto:varunreddy.boddu@seagate.com>; Damien Le Moal <mailto:damien.lemoal@wdc.com>
> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
> 
> 
> Oops, I attach wrong code change. Here's the right change made by me.
> 
> <               if (end >= iar->sector + 1 && end < iar->sector + iar->nr_sectors + 1)  //Changed code
>>              if (end >= iar->sector && end < iar->sector + 
>> iar->nr_sectors) // Original code
> 
> Unfortunately, the crash is still existing and I can't find any clue from /var/log/messages
> 
> 
> 
> From: Paolo Valente <mailto:paolo.valente@linaro.org>
> Sent: Thursday, September 8, 2022 6:54 PM
> To: Rory Chen <mailto:rory.c.chen@seagate.com>
> Cc: Tyler Erickson <mailto:tyler.erickson@seagate.com>; Arie van der Hoeven <mailto:arie.vanderhoeven@seagate.com>; Muhammad Ahmad <mailto:muhammad.ahmad@seagate.com>; mailto:linux-block@vger.kernel.org <mailto:linux-block@vger.kernel.org>; mailto:linux-kernel@vger.kernel.org <mailto:linux-kernel@vger.kernel.org>; Jan Kara <mailto:jack@suse.cz>; mailto:andrea.righi@canonical.com <mailto:andrea.righi@canonical.com>; mailto:glen.valante@linaro.org <mailto:glen.valante@linaro.org>; mailto:axboe@kernel.dk <mailto:axboe@kernel.dk>; Michael English <mailto:michael.english@seagate.com>; Andrew Ring <mailto:andrew.ring@seagate.com>; Varun Boddu <mailto:varunreddy.boddu@seagate.com>; Damien Le Moal <mailto:damien.lemoal@wdc.com>
> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives
> 
> 
> This message has originated from an External Source. Please use proper judgment and caution when opening attachments, clicking links, or responding to this email.
> 
> 
>> Il giorno 8 set 2022, alle ore 04:46, Rory Chen <mailto:rory.c.chen@seagate.com> ha scritto:
>> 
>> I change the comparison condition and it can eliminate the warning.
> 
> Yep. The crash you reported also goes away?
> 
>> <               if (end >= iar->sector + 1 && end < iar->sector + iar->nr_sectors + 1)
>>>             if (end >= iar->sector && end < iar->sector + 
>>> iar->nr_sectors)
>> 
>> I don't know if this change is appropriate
> 
> Unfortunately your change conflicts with the standard code, taken from the original patches on access ranges [1].  I've CCed Damien, the author of this patch series.
> 
> [1] https://secure-web.cisco.com/12uvPqOwOjHJPiVGM6hJ7791jpWxxy8My3bFD1oA0pNh9m0W778f8IM7HPxjRUL8-94N0gKahHwtK-sEv1Tgk2Oo4H9GTAlLoml_uWF6BGktvDAlDp-zdNQUzCL7y1OCz_MJMaNlS5h0iwsE3q9m7tJsCFUWW0YEgcJE6LRTrZDQpFJhG3pGCLFgoPIuKa3o8B136dJoQvEtek7ZOQFKqesuZKbu4lvM4ds0HOLs5TIgJR_mSJ8UmhP5_M3a1CaDxdDzQ784H3EydkRN9a6v9-Oogo-wYUqS8fRq35rUyw1t2IblmgJzr6aoGazZsJHxBXPjpxA9DSEQqUtH7oT5RGM4qxLpEmYjgyzpJUZqhUCSXye7-lCTIQIB-SGzRuZDVbIqK5tZd3F_YK9LcAN0iVH_qfBM4zRe_4w4h5ikJdhc/https%3A%2F%2Flwn.net%2Fml%2Flinux-block%2F20210909023545.1101672-2-damien.lemoal%40wdc.com%2F
> 
> Thanks,
> Paolo
> 
>> but  bio_end_sector deducting 1 said by Tyler seems to make sense.
>> 
>> From: Paolo Valente <mailto:paolo.valente@linaro.org>
>> Sent: Thursday, August 25, 2022 10:45 PM
>> To: Tyler Erickson <mailto:tyler.erickson@seagate.com>
>> Cc: Rory Chen <mailto:rory.c.chen@seagate.com>; Arie van der Hoeven 
>> <mailto:arie.vanderhoeven@seagate.com>; Muhammad Ahmad 
>> <mailto:muhammad.ahmad@seagate.com>; mailto:linux-block@vger.kernel.org 
>> <mailto:linux-block@vger.kernel.org>; mailto:linux-kernel@vger.kernel.org 
>> <mailto:linux-kernel@vger.kernel.org>; Jan Kara <mailto:jack@suse.cz>; 
>> mailto:andrea.righi@canonical.com <mailto:andrea.righi@canonical.com>; 
>> mailto:glen.valante@linaro.org <mailto:glen.valante@linaro.org>; mailto:axboe@kernel.dk 
>> <mailto:axboe@kernel.dk>; Michael English <mailto:michael.english@seagate.com>; 
>> Andrew Ring <mailto:andrew.ring@seagate.com>; Varun Boddu 
>> <mailto:varunreddy.boddu@seagate.com>
>> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support 
>> multi-actuator drives
>> 
>> 
>> This message has originated from an External Source. Please use proper judgment and caution when opening attachments, clicking links, or responding to this email.
>> 
>> 
>> Hi
>> 
>>> Il giorno 18 ago 2022, alle ore 17:40, Tyler Erickson <mailto:tyler.erickson@seagate.com> ha scritto:
>>> 
>>> The libata layer is reporting correctly after the changes I submitted.
>>> 
>>> The drive reports the actuator ranges as a starting LBA and a count of LBAs for the range.
>>> If the code reading the reported values simply does startingLBA + range, this is an incorrect ending LBA for that actuator. This is because LBAs are zero indexed and this simple addition is not taking that into account.
>>> The proper way to get the endingLBA is startingLBA + range - 1 to get the last LBA value for where to issue a final IO read/write to account for LBA values starting at zero rather than one.
>>> 
>>> Here is an example from the output in SeaChest/openSeaChest:
>>> ====Concurrent Positioning Ranges====
>>> 
>>> Range#     #Elements            Lowest LBA          # of LBAs
>>> 0            1                                               0           17578328064
>>> 1            1                         17578328064           17578328064
>>> 
>>> If using the incorrect formula to get the final LBA for actuator 0, you would get 17578328064, but this is the starting LBA reported by the drive for actuator 1.
>>> So to be consistent for all ranges, the final LBA for a given actuator should be calculated as starting LBA + range - 1.
>>> 
>> 
>> Ok
>> 
>>> I had reached out to Seagate's T10 and T13 representatives for clarification and verification and this is most likely what is causing the error is a missing - 1 somewhere after getting the information reported by the device. They agreed that the reporting from the drive and the SCSI to ATA translation is correct.
>>> 
>>> I'm not sure where this is being read and calculated, but it is not an error in the low-level libata or sd level of the kernel. It may be in bfq, or it may be in some other place after the sd layer.
>> 
>> This apparent mistake is in the macro bio_end_sector (defined in 
>> include/linux/bio.h), which seems to be translated as sector+size.
>> Jens, can you shed a light on this point?
>> 
>> Thanks,
>> Paolo
>> 
>>> I know there were some additions to read this and report it up the stack, but I did not think those were wrong as they seemed to pass the drive reported information up the stack.
>>> 
>>> Tyler Erickson
>>> Seagate Technology
>>> 
>>> 
>>> Seagate Internal
>>> 
>>> -----Original Message-----
>>> From: Rory Chen <mailto:rory.c.chen@seagate.com>
>>> Sent: Wednesday, August 10, 2022 6:59 AM
>>> To: Paolo Valente <mailto:paolo.valente@linaro.org>
>>> Cc: Arie van der Hoeven <mailto:arie.vanderhoeven@seagate.com>; Muhammad 
>>> Ahmad <mailto:muhammad.ahmad@seagate.com>; mailto:linux-block@vger.kernel.org; 
>>> mailto:linux-kernel@vger.kernel.org; Jan Kara <mailto:jack@suse.cz>; 
>>> mailto:andrea.righi@canonical.com; mailto:glen.valante@linaro.org; mailto:axboe@kernel.dk; 
>>> Tyler Erickson <mailto:tyler.erickson@seagate.com>; Michael English 
>>> <mailto:michael.english@seagate.com>; Andrew Ring <mailto:andrew.ring@seagate.com>; 
>>> Varun Boddu <mailto:varunreddy.boddu@seagate.com>
>>> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support 
>>> multi-actuator drives
>>> 
>>> The block trace shows the start sector is 35156656120 and transfer length is 8 sectors, which is within the max LBA 35156656127 of drive. And this IO is completed successfully from the slice of parsed block trace though reporting the warning message.
>>> 8,64   7       13     0.039401337 19176  Q  RA 35156656120 + 8 [systemd-udevd]
>>> 8,64   7       15     0.039403946 19176  P   N [systemd-udevd]
>>> 8,64   7       16     0.039405132 19176  I  RA 35156656120 + 8 [systemd-udevd]
>>> 8,64   7       18     0.039411554 19176  D  RA 35156656120 + 8 [systemd-udevd]
>>> 8,64   0       40     0.039479055     0  C  RA 35156656120 + 8 [0]
>>> 
>>> It may need to know where calculate "bio_end_sector" value as 35156656128. I have patched libata and sd driver for Dual Actuator.
>>> 
>>> 
>>> 
>>> From: Paolo Valente <mailto:paolo.valente@linaro.org>
>>> Sent: Wednesday, August 10, 2022 6:22 PM
>>> To: Rory Chen <mailto:rory.c.chen@seagate.com>
>>> Cc: Arie van der Hoeven <mailto:arie.vanderhoeven@seagate.com>; Muhammad 
>>> Ahmad <mailto:muhammad.ahmad@seagate.com>; mailto:linux-block@vger.kernel.org 
>>> <mailto:linux-block@vger.kernel.org>; mailto:linux-kernel@vger.kernel.org 
>>> <mailto:linux-kernel@vger.kernel.org>; Jan Kara <mailto:jack@suse.cz>; 
>>> mailto:andrea.righi@canonical.com <mailto:andrea.righi@canonical.com>; 
>>> mailto:glen.valante@linaro.org <mailto:glen.valante@linaro.org>; mailto:axboe@kernel.dk 
>>> <mailto:axboe@kernel.dk>; Tyler Erickson <mailto:tyler.erickson@seagate.com>; 
>>> Michael English <mailto:michael.english@seagate.com>; Andrew Ring 
>>> <mailto:andrew.ring@seagate.com>; Varun Boddu <mailto:varunreddy.boddu@seagate.com>
>>> Subject: Re: [PATCH 0/8] block, bfq: extend bfq to support 
>>> multi-actuator drives
>>> 
>>> 
>>> This message has originated from an External Source. Please use proper judgment and caution when opening attachments, clicking links, or responding to this email.
>>> 
>>> 
>>>> Il giorno 9 ago 2022, alle ore 05:47, Rory Chen <mailto:rory.c.chen@seagate.com> ha scritto:
>>>> 
>>>> Resend the mail as plain text because previous mail with rich text 
>>>> makes some mess and forget to add others at Seagate who worked on 
>>>> validating the patch as well(Muhammad, Michael, Andrew, Varun,Tyler)
>>>> 
>>>> Hi Paolo,
>>>> 
>>> 
>>> Hi
>>> 
>>>> I am from Seagate China and face a problem when I'm evaluating the bfq patches. Could you please check?
>>>> Thanks
>>>> 
>>>> Issue statement
>>>> When running performance test on bfq patch, I observed warning message "bfq_actuator_index: bio sector out of ranges: end=35156656128" and OS hung suddenly after some hours.
>>>> The warning message is reported from function bfq_actuator_index which determines IO request is in which index of actuators.  The bio_end_sector is 35156656128 but the max LBA for the drive is 35156656127 so it's beyond the LBA range.
>>> 
>>> Yep, this sanity check fails if the end sector of a new IO does not belong to any sector range.
>>> 
>>>> I captured the block trace and didn't found request LBA 35156656128 instead only found max request LBA 35156656127.
>>> 
>>> Maybe in the trace you see only start sectors?  The failed check si performed on end sectors instead.
>>> 
>>> At any rate, there seems to be an off-by-one error in the value(s) stored in the sector field(s) of the blk_independent_access_range data structure.
>>> 
>>> I guess we may need some help/feedback from people competent on this stuff.
>>> 
>>>> I'm not sure if this warning message is related to later OS hung.
>>>> 
>>> 
>>> Not easy to say.  At any rate, we can try with a development version of bfq.  It can help us detect the possible cause of this hang.  But let's see where we get with this sector error first.
>>> 
>>> Thank you for testing this extended version of bfq, Paolo
>>> 
>>>> 
>>>> Problem environment
>>>> Kernel base is 5.18.9
>>>> Test HDD drive is Seagate ST18000NM0092 dual actuator SATA.
>>>> Actuator LBA mapping, read from the Concurrent Positioning Ranges
>>>> VPD page (B9h):
>>>> LBA range number: 0
>>>>   number of storage elements: 1
>>>>   starting LBA: 0x0
>>>>   number of LBAs: 0x417c00000 [17578328064]
>>>> LBA range number: 1
>>>>   number of storage elements: 1
>>>>   starting LBA: 0x417c00000
>>>>   number of LBAs: 0x417c00000 [17578328064]
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> From: Paolo Valente <paolo.valente@linaro.org>
>>>> Sent: Thursday, June 23, 2022 8:53 AM
>>>> To: Jens Axboe <axboe@kernel.dk>
>>>> Cc: linux-block@vger.kernel.org; linux-kernel@vger.kernel.org;
>>>> jack@suse.cz; andrea.righi@canonical.com; glen.valante@linaro.org;
>>>> Arie van der Hoeven <arie.vanderhoeven@seagate.com>;
>>>> Paolo Valente <paolo.valente@linaro.org>
>>>> Subject: [PATCH 0/8] block, bfq: extend bfq to support
>>>> multi-actuator drives
>>>> 
>>>> 
>>>> 
>>>> Hi,
>>>> this patch series extends BFQ so as to optimize I/O dispatch to 
>>>> multi-actuator drives. In particular, this extension addresses the 
>>>> following issue. Multi-actuator drives appear as a single device to 
>>>> the I/O subsystem [1].  Yet they address commands to different 
>>>> actuators internally, as a function of Logical Block Addressing 
>>>> (LBAs). A given sector is reachable by only one of the actuators. 
>>>> For example, Seagate's Serial Advanced Technology Attachment (SATA) 
>>>> version contains two actuators and maps the lower half of the SATA 
>>>> LBA space to the lower actuator and the upper half to the upper actuator.
>>>> 
>>>> Evidently, to fully utilize actuators, no actuator must be left idle 
>>>> or underutilized while there is pending I/O for it. To reach this 
>>>> goal, the block layer must somehow control the load of each actuator 
>>>> individually. This series enriches BFQ with such a per-actuator 
>>>> control, as a first step. Then it also adds a simple mechanism for 
>>>> guaranteeing that actuators with pending I/O are never left idle.
>>>> 
>>>> See [1] for a more detailed overview of the problem and of the 
>>>> solutions implemented in this patch series. There you will also find 
>>>> some preliminary performance results.
>>>> 
>>>> Thanks,
>>>> Paolo
>>>> 
>>>> [1] https://www.linaro.org/blog/budget-fair-queueing-bfq-linux-io-scheduler-optimizations-for-multi-actuator-sata-hard-drives/
>>>> 
>>>> Davide Zini (3):
>>>>   block, bfq: split also async bfq_queues on a per-actuator basis
>>>>   block, bfq: inject I/O to underutilized actuators
>>>>   block, bfq: balance I/O injection among underutilized actuators
>>>> 
>>>> Federico Gavioli (1):
>>>>   block, bfq: retrieve independent access ranges from request queue
>>>> 
>>>> Paolo Valente (4):
>>>>   block, bfq: split sync bfq_queues on a per-actuator basis
>>>>   block, bfq: forbid stable merging of queues associated with
>>>>     different actuators
>>>>   block, bfq: turn scalar fields into arrays in bfq_io_cq
>>>>   block, bfq: turn BFQ_NUM_ACTUATORS into BFQ_MAX_ACTUATORS
>>>> 
>>>> block/bfq-cgroup.c  |  97 +++++----
>>>> block/bfq-iosched.c | 488 +++++++++++++++++++++++++++++---------------
>>>> block/bfq-iosched.h | 149 ++++++++++----
>>>> block/bfq-wf2q.c    |   2 +-
>>>> 4 files changed, 493 insertions(+), 243 deletions(-)
>>>> 
>>>> --
>>>> 2.20.1
>>>> 
>>>> 
>>>> Seagate Internal


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2022-09-30 15:48 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-23 15:53 [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives Paolo Valente
2022-06-23 15:53 ` [PATCH 1/8] block, bfq: split sync bfq_queues on a per-actuator basis Paolo Valente
2022-06-23 15:53 ` [PATCH 2/8] block, bfq: forbid stable merging of queues associated with different actuators Paolo Valente
2022-06-23 15:53 ` [PATCH 3/8] block, bfq: turn scalar fields into arrays in bfq_io_cq Paolo Valente
2022-06-23 15:53 ` [PATCH 4/8] block, bfq: split also async bfq_queues on a per-actuator basis Paolo Valente
2022-06-23 15:53 ` [PATCH 5/8] block, bfq: turn BFQ_NUM_ACTUATORS into BFQ_MAX_ACTUATORS Paolo Valente
2022-06-23 15:53 ` [PATCH 6/8] block, bfq: retrieve independent access ranges from request queue Paolo Valente
2022-06-23 15:53 ` [PATCH 7/8] block, bfq: inject I/O to underutilized actuators Paolo Valente
2022-06-23 15:53 ` [PATCH 8/8] block, bfq: balance I/O injection among " Paolo Valente
     [not found] ` <PH7PR20MB505849512979A89E8A66FB24F18E9@PH7PR20MB5058.namprd20.prod.outlook.com>
     [not found]   ` <SJ0PR20MB44097AAA14A819BB2C7B4AB3A0909@SJ0PR20MB4409.namprd20.prod.outlook.com>
     [not found]     ` <DM4PR20MB50613F96089AC468679FB415F1969@DM4PR20MB5061.namprd20.prod.outlook.com>
     [not found]       ` <DM4PR20MB5061638AB3F389ECD0714029F1969@DM4PR20MB5061.namprd20.prod.outlook.com>
2022-07-29 16:36         ` [PATCH 0/8] block, bfq: extend bfq to support multi-actuator drives Arie van der Hoeven
2022-08-09  3:47   ` Rory Chen
2022-08-10 10:22     ` Paolo Valente
2022-08-10 12:59       ` Rory Chen
2022-08-18 15:40         ` Tyler Erickson
2022-08-25 14:45           ` Paolo Valente
2022-09-08  2:46             ` Rory Chen
2022-09-08 10:54               ` Paolo Valente
2022-09-08 11:34                 ` Rory Chen
2022-09-27 16:59                   ` Arie van der Hoeven
2022-09-28  1:02                     ` Rory Chen
2022-09-30 15:48                       ` Paolo Valente

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.