All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V2 0/7] block: improve iops limit throttle
@ 2022-02-09  9:14 Ming Lei
  2022-02-09  9:14 ` [PATCH V2 1/7] block: move submit_bio_checks() into submit_bio_noacct Ming Lei
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Ming Lei @ 2022-02-09  9:14 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Li Ning, Tejun Heo, Chunguang Xu, Ming Lei

Hello Guys,

Lining reported that iops limit throttle doesn't work on dm-thin, also
iops limit throttle works bad on plain disk in case of excessive split.

Commit 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios")
was for addressing this issue, but the taken approach is just to run
post-accounting, then current split bios won't be throttled actually,
so actual iops throttle result isn't good in case of excessive bio
splitting.

The 1st three patches are cleanup.

The 4th patches add one new local helper of submit_bio_noacct_nocheck() for
blk_throtl_dispatch_work_fn(), so that bios won't be throttled any more
when blk-throttle code dispatches throttled bios.

The 5th and 6th patch makes the real difference for throttling split bio wrt.
iops limit.

The last patch is to revert commit 4f1e9630afe6 ("blk-throtl: optimize IOPS
throttle for large IO scenarios").

Lining has verified that iops throttle is improved much on the posted
RFC V1 version.

V2:
	- remove RFC
	- don't add/export __submit_bio_noacct(), instead add one new local
	helper of submit_bio_noacct_nocheck() per Christoph's suggestion

Ming Lei (7):
  block: move submit_bio_checks() into submit_bio_noacct
  block: move blk_crypto_bio_prep() out of blk-mq.c
  block: don't declare submit_bio_checks in local header
  block: don't check bio in blk_throtl_dispatch_work_fn
  block: throttle split bio in case of iops limit
  block: don't try to throttle split bio if iops limit isn't set
  block: revert 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for
    large IO scenarios")

 block/blk-core.c     | 53 ++++++++++++++++++++------------------
 block/blk-merge.c    |  2 --
 block/blk-mq.c       |  3 ---
 block/blk-throttle.c | 61 ++++++++++++++++----------------------------
 block/blk-throttle.h | 16 +++++++-----
 block/blk.h          |  2 +-
 6 files changed, 61 insertions(+), 76 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH V2 1/7] block: move submit_bio_checks() into submit_bio_noacct
  2022-02-09  9:14 [PATCH V2 0/7] block: improve iops limit throttle Ming Lei
@ 2022-02-09  9:14 ` Ming Lei
  2022-02-09 14:01   ` Christoph Hellwig
  2022-02-09  9:14 ` [PATCH V2 2/7] block: move blk_crypto_bio_prep() out of blk-mq.c Ming Lei
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Ming Lei @ 2022-02-09  9:14 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Li Ning, Tejun Heo, Chunguang Xu, Ming Lei

It is more clean & readable to check bio when starting to submit it,
instead of just before calling ->submit_bio() or blk_mq_submit_bio().

Also it provides us chance to optimize bio submission without checking
bio.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index be8812f5489d..bf989b1b3bea 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -800,9 +800,6 @@ static void __submit_bio(struct bio *bio)
 {
 	struct gendisk *disk = bio->bi_bdev->bd_disk;
 
-	if (unlikely(!submit_bio_checks(bio)))
-		return;
-
 	if (!disk->fops->submit_bio)
 		blk_mq_submit_bio(bio);
 	else
@@ -896,6 +893,9 @@ static void __submit_bio_noacct_mq(struct bio *bio)
  */
 void submit_bio_noacct(struct bio *bio)
 {
+	if (unlikely(!submit_bio_checks(bio)))
+		return;
+
 	/*
 	 * We only want one ->submit_bio to be active at a time, else stack
 	 * usage with stacked devices could be a problem.  Use current->bio_list
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH V2 2/7] block: move blk_crypto_bio_prep() out of blk-mq.c
  2022-02-09  9:14 [PATCH V2 0/7] block: improve iops limit throttle Ming Lei
  2022-02-09  9:14 ` [PATCH V2 1/7] block: move submit_bio_checks() into submit_bio_noacct Ming Lei
@ 2022-02-09  9:14 ` Ming Lei
  2022-02-09 14:02   ` Christoph Hellwig
  2022-02-09  9:14 ` [PATCH V2 3/7] block: don't declare submit_bio_checks in local header Ming Lei
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Ming Lei @ 2022-02-09  9:14 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Li Ning, Tejun Heo, Chunguang Xu, Ming Lei

blk_crypto_bio_prep() is called for both bio based and blk-mq drivers,
so move it out of blk-mq.c, then we can unify this kind of handling.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-core.c | 18 ++++++------------
 block/blk-mq.c   |  3 ---
 2 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index bf989b1b3bea..1514dbab0bd8 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -786,26 +786,20 @@ noinline_for_stack bool submit_bio_checks(struct bio *bio)
 	return false;
 }
 
-static void __submit_bio_fops(struct gendisk *disk, struct bio *bio)
+static void __submit_bio(struct bio *bio)
 {
 	if (blk_crypto_bio_prep(&bio)) {
-		if (likely(bio_queue_enter(bio) == 0)) {
+		struct gendisk *disk = bio->bi_bdev->bd_disk;
+
+		if (!disk->fops->submit_bio) {
+			blk_mq_submit_bio(bio);
+		} else if (likely(bio_queue_enter(bio) == 0)) {
 			disk->fops->submit_bio(bio);
 			blk_queue_exit(disk->queue);
 		}
 	}
 }
 
-static void __submit_bio(struct bio *bio)
-{
-	struct gendisk *disk = bio->bi_bdev->bd_disk;
-
-	if (!disk->fops->submit_bio)
-		blk_mq_submit_bio(bio);
-	else
-		__submit_bio_fops(disk, bio);
-}
-
 /*
  * The loop in this function may be a bit non-obvious, and so deserves some
  * explanation:
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4b868e792ba4..fa1c38a05d5a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2786,9 +2786,6 @@ void blk_mq_submit_bio(struct bio *bio)
 	unsigned int nr_segs = 1;
 	blk_status_t ret;
 
-	if (unlikely(!blk_crypto_bio_prep(&bio)))
-		return;
-
 	blk_queue_bounce(q, &bio);
 	if (blk_may_split(q, bio))
 		__blk_queue_split(q, &bio, &nr_segs);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH V2 3/7] block: don't declare submit_bio_checks in local header
  2022-02-09  9:14 [PATCH V2 0/7] block: improve iops limit throttle Ming Lei
  2022-02-09  9:14 ` [PATCH V2 1/7] block: move submit_bio_checks() into submit_bio_noacct Ming Lei
  2022-02-09  9:14 ` [PATCH V2 2/7] block: move blk_crypto_bio_prep() out of blk-mq.c Ming Lei
@ 2022-02-09  9:14 ` Ming Lei
  2022-02-09 14:02   ` Christoph Hellwig
  2022-02-09  9:14 ` [PATCH V2 4/7] block: don't check bio in blk_throtl_dispatch_work_fn Ming Lei
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Ming Lei @ 2022-02-09  9:14 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Li Ning, Tejun Heo, Chunguang Xu, Ming Lei

submit_bio_checks() won't be called outside of block/blk-core.c any more
since commit 9d497e2941c3 ("block: don't protect submit_bio_checks by
q_usage_counter").

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/block/blk.h b/block/blk.h
index abb663a2a147..b2516cb4f98e 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -46,7 +46,6 @@ void blk_freeze_queue(struct request_queue *q);
 void __blk_mq_unfreeze_queue(struct request_queue *q, bool force_atomic);
 void blk_queue_start_drain(struct request_queue *q);
 int __bio_queue_enter(struct request_queue *q, struct bio *bio);
-bool submit_bio_checks(struct bio *bio);
 
 static inline bool blk_try_enter_queue(struct request_queue *q, bool pm)
 {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH V2 4/7] block: don't check bio in blk_throtl_dispatch_work_fn
  2022-02-09  9:14 [PATCH V2 0/7] block: improve iops limit throttle Ming Lei
                   ` (2 preceding siblings ...)
  2022-02-09  9:14 ` [PATCH V2 3/7] block: don't declare submit_bio_checks in local header Ming Lei
@ 2022-02-09  9:14 ` Ming Lei
  2022-02-09 14:03   ` Christoph Hellwig
  2022-02-09  9:14 ` [PATCH V2 5/7] block: throttle split bio in case of iops limit Ming Lei
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Ming Lei @ 2022-02-09  9:14 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Li Ning, Tejun Heo, Chunguang Xu, Ming Lei

The bio has been checked already before throttling, so no need to check
it again before dispatching it from throttle queue.

Add one local helper of submit_bio_noacct_nocheck() for this purpose.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-core.c     | 35 ++++++++++++++++++++++-------------
 block/blk-throttle.c |  2 +-
 block/blk.h          |  1 +
 3 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 1514dbab0bd8..4de9f391836b 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -876,20 +876,8 @@ static void __submit_bio_noacct_mq(struct bio *bio)
 	current->bio_list = NULL;
 }
 
-/**
- * submit_bio_noacct - re-submit a bio to the block device layer for I/O
- * @bio:  The bio describing the location in memory and on the device.
- *
- * This is a version of submit_bio() that shall only be used for I/O that is
- * resubmitted to lower level drivers by stacking block drivers.  All file
- * systems and other upper level users of the block layer should use
- * submit_bio() instead.
- */
-void submit_bio_noacct(struct bio *bio)
+static inline void __submit_bio_noacct_nocheck(struct bio *bio)
 {
-	if (unlikely(!submit_bio_checks(bio)))
-		return;
-
 	/*
 	 * We only want one ->submit_bio to be active at a time, else stack
 	 * usage with stacked devices could be a problem.  Use current->bio_list
@@ -903,8 +891,29 @@ void submit_bio_noacct(struct bio *bio)
 	else
 		__submit_bio_noacct(bio);
 }
+
+/**
+ * submit_bio_noacct - re-submit a bio to the block device layer for I/O
+ * @bio:  The bio describing the location in memory and on the device.
+ *
+ * This is a version of submit_bio() that shall only be used for I/O that is
+ * resubmitted to lower level drivers by stacking block drivers.  All file
+ * systems and other upper level users of the block layer should use
+ * submit_bio() instead.
+ */
+void submit_bio_noacct(struct bio *bio)
+{
+	if (unlikely(!submit_bio_checks(bio)))
+		return;
+	__submit_bio_noacct_nocheck(bio);
+}
 EXPORT_SYMBOL(submit_bio_noacct);
 
+void submit_bio_noacct_nocheck(struct bio *bio)
+{
+	__submit_bio_noacct_nocheck(bio);
+}
+
 /**
  * submit_bio - submit a bio to the block device layer for I/O
  * @bio: The &struct bio which describes the I/O
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 7c462c006b26..6cca1715c31e 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1219,7 +1219,7 @@ static void blk_throtl_dispatch_work_fn(struct work_struct *work)
 	if (!bio_list_empty(&bio_list_on_stack)) {
 		blk_start_plug(&plug);
 		while ((bio = bio_list_pop(&bio_list_on_stack)))
-			submit_bio_noacct(bio);
+			submit_bio_noacct_nocheck(bio);
 		blk_finish_plug(&plug);
 	}
 }
diff --git a/block/blk.h b/block/blk.h
index b2516cb4f98e..ebaa59ca46ca 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -46,6 +46,7 @@ void blk_freeze_queue(struct request_queue *q);
 void __blk_mq_unfreeze_queue(struct request_queue *q, bool force_atomic);
 void blk_queue_start_drain(struct request_queue *q);
 int __bio_queue_enter(struct request_queue *q, struct bio *bio);
+void submit_bio_noacct_nocheck(struct bio *bio);
 
 static inline bool blk_try_enter_queue(struct request_queue *q, bool pm)
 {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH V2 5/7] block: throttle split bio in case of iops limit
  2022-02-09  9:14 [PATCH V2 0/7] block: improve iops limit throttle Ming Lei
                   ` (3 preceding siblings ...)
  2022-02-09  9:14 ` [PATCH V2 4/7] block: don't check bio in blk_throtl_dispatch_work_fn Ming Lei
@ 2022-02-09  9:14 ` Ming Lei
  2022-02-14 20:22   ` Tejun Heo
  2022-02-15  8:17   ` Ning Li
  2022-02-09  9:14 ` [PATCH V2 6/7] block: don't try to throttle split bio if iops limit isn't set Ming Lei
                   ` (2 subsequent siblings)
  7 siblings, 2 replies; 19+ messages in thread
From: Ming Lei @ 2022-02-09  9:14 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Li Ning, Tejun Heo, Chunguang Xu, Ming Lei

Commit 111be8839817 ("block-throttle: avoid double charge") marks bio as
BIO_THROTTLED unconditionally if __blk_throtl_bio() is called on this bio,
then this bio won't be called into __blk_throtl_bio() any more. This way
is to avoid double charge in case of bio splitting. It is reasonable for
read/write throughput limit, but not reasonable for IOPS limit because
block layer provides io accounting against split bio.

Chunguang Xu has already observed this issue and fixed it in commit
4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios").
However, that patch only covers bio splitting in __blk_queue_split(), and
we have other kind of bio splitting, such as bio_split() &
submit_bio_noacct() and other ways.

This patch tries to fix the issue in one generic way by always charging
the bio for iops limit in blk_throtl_bio(). This way is reasonable:
re-submission & fast-cloned bio is charged if it is submitted to same
disk/queue, and BIO_THROTTLED will be cleared if bio->bi_bdev is changed.

This new approach can get much more smooth/stable iops limit compared with
commit 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO
scenarios") since that commit can't throttle current split bios actually.

Also this way won't cause new double bio iops charge in
blk_throtl_dispatch_work_fn() in which blk_throtl_bio() won't be called
any more.

Reported-by: Li Ning <lining2020x@163.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Chunguang Xu <brookxu@tencent.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-merge.c    |  2 --
 block/blk-throttle.c | 10 +++++++---
 block/blk-throttle.h |  2 --
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 4de34a332c9f..f5255991b773 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -368,8 +368,6 @@ void __blk_queue_split(struct request_queue *q, struct bio **bio,
 		trace_block_split(split, (*bio)->bi_iter.bi_sector);
 		submit_bio_noacct(*bio);
 		*bio = split;
-
-		blk_throtl_charge_bio_split(*bio);
 	}
 }
 
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 6cca1715c31e..6f2a8e7801fe 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -808,7 +808,8 @@ static bool tg_with_in_bps_limit(struct throtl_grp *tg, struct bio *bio,
 	unsigned long jiffy_elapsed, jiffy_wait, jiffy_elapsed_rnd;
 	unsigned int bio_size = throtl_bio_data_size(bio);
 
-	if (bps_limit == U64_MAX) {
+	/* no need to throttle if this bio's bytes have been accounted */
+	if (bps_limit == U64_MAX || bio_flagged(bio, BIO_THROTTLED)) {
 		if (wait)
 			*wait = 0;
 		return true;
@@ -920,9 +921,12 @@ static void throtl_charge_bio(struct throtl_grp *tg, struct bio *bio)
 	unsigned int bio_size = throtl_bio_data_size(bio);
 
 	/* Charge the bio to the group */
-	tg->bytes_disp[rw] += bio_size;
+	if (!bio_flagged(bio, BIO_THROTTLED)) {
+		tg->bytes_disp[rw] += bio_size;
+		tg->last_bytes_disp[rw] += bio_size;
+	}
+
 	tg->io_disp[rw]++;
-	tg->last_bytes_disp[rw] += bio_size;
 	tg->last_io_disp[rw]++;
 
 	/*
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index 175f03abd9e4..cb43f4417d6e 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -170,8 +170,6 @@ static inline bool blk_throtl_bio(struct bio *bio)
 {
 	struct throtl_grp *tg = blkg_to_tg(bio->bi_blkg);
 
-	if (bio_flagged(bio, BIO_THROTTLED))
-		return false;
 	if (!tg->has_rules[bio_data_dir(bio)])
 		return false;
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH V2 6/7] block: don't try to throttle split bio if iops limit isn't set
  2022-02-09  9:14 [PATCH V2 0/7] block: improve iops limit throttle Ming Lei
                   ` (4 preceding siblings ...)
  2022-02-09  9:14 ` [PATCH V2 5/7] block: throttle split bio in case of iops limit Ming Lei
@ 2022-02-09  9:14 ` Ming Lei
  2022-02-09  9:14 ` [PATCH V2 7/7] block: revert 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios") Ming Lei
  2022-02-13  8:34 ` [PATCH V2 0/7] block: improve iops limit throttle Ming Lei
  7 siblings, 0 replies; 19+ messages in thread
From: Ming Lei @ 2022-02-09  9:14 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Li Ning, Tejun Heo, Chunguang Xu, Ming Lei

We need to throttle split bio in case of IOPS limit even though the
split bio has been marked as BIO_THROTTLED since block layer
accounts split bio actually.

If only throughput throttle is setup, no need to throttle any more
if BIO_THROTTLED is set since we have accounted & considered the
whole bio bytes already.

Add one flag of THROTL_TG_HAS_IOPS_LIMIT for serving this purpose.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-throttle.c | 21 ++++++++++++++-------
 block/blk-throttle.h | 11 +++++++++++
 2 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 6f2a8e7801fe..7969d9a29830 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -42,11 +42,6 @@
 /* A workqueue to queue throttle related work */
 static struct workqueue_struct *kthrotld_workqueue;
 
-enum tg_state_flags {
-	THROTL_TG_PENDING	= 1 << 0,	/* on parent's pending tree */
-	THROTL_TG_WAS_EMPTY	= 1 << 1,	/* bio_lists[] became non-empty */
-};
-
 #define rb_entry_tg(node)	rb_entry((node), struct throtl_grp, rb_node)
 
 /* We measure latency for request size from <= 4k to >= 1M */
@@ -426,12 +421,24 @@ static void tg_update_has_rules(struct throtl_grp *tg)
 	struct throtl_grp *parent_tg = sq_to_tg(tg->service_queue.parent_sq);
 	struct throtl_data *td = tg->td;
 	int rw;
+	int has_iops_limit = 0;
+
+	for (rw = READ; rw <= WRITE; rw++) {
+		unsigned int iops_limit = tg_iops_limit(tg, rw);
 
-	for (rw = READ; rw <= WRITE; rw++)
 		tg->has_rules[rw] = (parent_tg && parent_tg->has_rules[rw]) ||
 			(td->limit_valid[td->limit_index] &&
 			 (tg_bps_limit(tg, rw) != U64_MAX ||
-			  tg_iops_limit(tg, rw) != UINT_MAX));
+			  iops_limit != UINT_MAX));
+
+		if (iops_limit != UINT_MAX)
+			has_iops_limit = 1;
+	}
+
+	if (has_iops_limit)
+		tg->flags |= THROTL_TG_HAS_IOPS_LIMIT;
+	else
+		tg->flags &= ~THROTL_TG_HAS_IOPS_LIMIT;
 }
 
 static void throtl_pd_online(struct blkg_policy_data *pd)
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index cb43f4417d6e..c996a15f290e 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -52,6 +52,12 @@ struct throtl_service_queue {
 	struct timer_list	pending_timer;	/* fires on first_pending_disptime */
 };
 
+enum tg_state_flags {
+	THROTL_TG_PENDING	= 1 << 0,	/* on parent's pending tree */
+	THROTL_TG_WAS_EMPTY	= 1 << 1,	/* bio_lists[] became non-empty */
+	THROTL_TG_HAS_IOPS_LIMIT = 1 << 2,	/* tg has iops limit */
+};
+
 enum {
 	LIMIT_LOW,
 	LIMIT_MAX,
@@ -170,6 +176,11 @@ static inline bool blk_throtl_bio(struct bio *bio)
 {
 	struct throtl_grp *tg = blkg_to_tg(bio->bi_blkg);
 
+	/* no need to throttle bps any more if the bio has been throttled */
+	if (bio_flagged(bio, BIO_THROTTLED) &&
+	    !(tg->flags & THROTL_TG_HAS_IOPS_LIMIT))
+		return false;
+
 	if (!tg->has_rules[bio_data_dir(bio)])
 		return false;
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH V2 7/7] block: revert 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios")
  2022-02-09  9:14 [PATCH V2 0/7] block: improve iops limit throttle Ming Lei
                   ` (5 preceding siblings ...)
  2022-02-09  9:14 ` [PATCH V2 6/7] block: don't try to throttle split bio if iops limit isn't set Ming Lei
@ 2022-02-09  9:14 ` Ming Lei
  2022-02-13  8:34 ` [PATCH V2 0/7] block: improve iops limit throttle Ming Lei
  7 siblings, 0 replies; 19+ messages in thread
From: Ming Lei @ 2022-02-09  9:14 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Li Ning, Tejun Heo, Chunguang Xu, Ming Lei

Revert commit 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large
IO scenarios") since we have another easier way to address this issue and
get better iops throttling result.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-throttle.c | 28 ----------------------------
 block/blk-throttle.h |  5 -----
 2 files changed, 33 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 7969d9a29830..6afe37bc76b9 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -641,8 +641,6 @@ static inline void throtl_start_new_slice_with_credit(struct throtl_grp *tg,
 	tg->bytes_disp[rw] = 0;
 	tg->io_disp[rw] = 0;
 
-	atomic_set(&tg->io_split_cnt[rw], 0);
-
 	/*
 	 * Previous slice has expired. We must have trimmed it after last
 	 * bio dispatch. That means since start of last slice, we never used
@@ -666,8 +664,6 @@ static inline void throtl_start_new_slice(struct throtl_grp *tg, bool rw)
 	tg->slice_start[rw] = jiffies;
 	tg->slice_end[rw] = jiffies + tg->td->throtl_slice;
 
-	atomic_set(&tg->io_split_cnt[rw], 0);
-
 	throtl_log(&tg->service_queue,
 		   "[%c] new slice start=%lu end=%lu jiffies=%lu",
 		   rw == READ ? 'R' : 'W', tg->slice_start[rw],
@@ -901,9 +897,6 @@ static bool tg_may_dispatch(struct throtl_grp *tg, struct bio *bio,
 				jiffies + tg->td->throtl_slice);
 	}
 
-	if (iops_limit != UINT_MAX)
-		tg->io_disp[rw] += atomic_xchg(&tg->io_split_cnt[rw], 0);
-
 	if (tg_with_in_bps_limit(tg, bio, bps_limit, &bps_wait) &&
 	    tg_with_in_iops_limit(tg, bio, iops_limit, &iops_wait)) {
 		if (wait)
@@ -1928,14 +1921,12 @@ static void throtl_downgrade_check(struct throtl_grp *tg)
 	}
 
 	if (tg->iops[READ][LIMIT_LOW]) {
-		tg->last_io_disp[READ] += atomic_xchg(&tg->last_io_split_cnt[READ], 0);
 		iops = tg->last_io_disp[READ] * HZ / elapsed_time;
 		if (iops >= tg->iops[READ][LIMIT_LOW])
 			tg->last_low_overflow_time[READ] = now;
 	}
 
 	if (tg->iops[WRITE][LIMIT_LOW]) {
-		tg->last_io_disp[WRITE] += atomic_xchg(&tg->last_io_split_cnt[WRITE], 0);
 		iops = tg->last_io_disp[WRITE] * HZ / elapsed_time;
 		if (iops >= tg->iops[WRITE][LIMIT_LOW])
 			tg->last_low_overflow_time[WRITE] = now;
@@ -2054,25 +2045,6 @@ static inline void throtl_update_latency_buckets(struct throtl_data *td)
 }
 #endif
 
-void blk_throtl_charge_bio_split(struct bio *bio)
-{
-	struct blkcg_gq *blkg = bio->bi_blkg;
-	struct throtl_grp *parent = blkg_to_tg(blkg);
-	struct throtl_service_queue *parent_sq;
-	bool rw = bio_data_dir(bio);
-
-	do {
-		if (!parent->has_rules[rw])
-			break;
-
-		atomic_inc(&parent->io_split_cnt[rw]);
-		atomic_inc(&parent->last_io_split_cnt[rw]);
-
-		parent_sq = parent->service_queue.parent_sq;
-		parent = sq_to_tg(parent_sq);
-	} while (parent);
-}
-
 bool __blk_throtl_bio(struct bio *bio)
 {
 	struct request_queue *q = bdev_get_queue(bio->bi_bdev);
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index c996a15f290e..b23a9f3abb82 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -138,9 +138,6 @@ struct throtl_grp {
 	unsigned int bad_bio_cnt; /* bios exceeding latency threshold */
 	unsigned long bio_cnt_reset_time;
 
-	atomic_t io_split_cnt[2];
-	atomic_t last_io_split_cnt[2];
-
 	struct blkg_rwstat stat_bytes;
 	struct blkg_rwstat stat_ios;
 };
@@ -164,13 +161,11 @@ static inline struct throtl_grp *blkg_to_tg(struct blkcg_gq *blkg)
 static inline int blk_throtl_init(struct request_queue *q) { return 0; }
 static inline void blk_throtl_exit(struct request_queue *q) { }
 static inline void blk_throtl_register_queue(struct request_queue *q) { }
-static inline void blk_throtl_charge_bio_split(struct bio *bio) { }
 static inline bool blk_throtl_bio(struct bio *bio) { return false; }
 #else /* CONFIG_BLK_DEV_THROTTLING */
 int blk_throtl_init(struct request_queue *q);
 void blk_throtl_exit(struct request_queue *q);
 void blk_throtl_register_queue(struct request_queue *q);
-void blk_throtl_charge_bio_split(struct bio *bio);
 bool __blk_throtl_bio(struct bio *bio);
 static inline bool blk_throtl_bio(struct bio *bio)
 {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH V2 1/7] block: move submit_bio_checks() into submit_bio_noacct
  2022-02-09  9:14 ` [PATCH V2 1/7] block: move submit_bio_checks() into submit_bio_noacct Ming Lei
@ 2022-02-09 14:01   ` Christoph Hellwig
  0 siblings, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2022-02-09 14:01 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Li Ning, Tejun Heo, Chunguang Xu

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V2 2/7] block: move blk_crypto_bio_prep() out of blk-mq.c
  2022-02-09  9:14 ` [PATCH V2 2/7] block: move blk_crypto_bio_prep() out of blk-mq.c Ming Lei
@ 2022-02-09 14:02   ` Christoph Hellwig
  0 siblings, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2022-02-09 14:02 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Li Ning, Tejun Heo, Chunguang Xu

>  	if (blk_crypto_bio_prep(&bio)) {

I'd just do an early return if tis fails, but otherwise it looks fine
to me:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V2 3/7] block: don't declare submit_bio_checks in local header
  2022-02-09  9:14 ` [PATCH V2 3/7] block: don't declare submit_bio_checks in local header Ming Lei
@ 2022-02-09 14:02   ` Christoph Hellwig
  0 siblings, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2022-02-09 14:02 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Li Ning, Tejun Heo, Chunguang Xu

On Wed, Feb 09, 2022 at 05:14:25PM +0800, Ming Lei wrote:
> submit_bio_checks() won't be called outside of block/blk-core.c any more
> since commit 9d497e2941c3 ("block: don't protect submit_bio_checks by
> q_usage_counter").

Please also mark it static while you're at it.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V2 4/7] block: don't check bio in blk_throtl_dispatch_work_fn
  2022-02-09  9:14 ` [PATCH V2 4/7] block: don't check bio in blk_throtl_dispatch_work_fn Ming Lei
@ 2022-02-09 14:03   ` Christoph Hellwig
  0 siblings, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2022-02-09 14:03 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Li Ning, Tejun Heo, Chunguang Xu

> +/**
> + * submit_bio_noacct - re-submit a bio to the block device layer for I/O
> + * @bio:  The bio describing the location in memory and on the device.
> + *
> + * This is a version of submit_bio() that shall only be used for I/O that is
> + * resubmitted to lower level drivers by stacking block drivers.  All file
> + * systems and other upper level users of the block layer should use
> + * submit_bio() instead.
> + */
> +void submit_bio_noacct(struct bio *bio)
> +{
> +	if (unlikely(!submit_bio_checks(bio)))
> +		return;
> +	__submit_bio_noacct_nocheck(bio);

Given that this is the only caller of submit_bio_checks I'd merge
the two.

Otherwise looks fine:

Reviewed-by: Christoph Hellwig <hch@lst.de>

> +}
>  EXPORT_SYMBOL(submit_bio_noacct);
>  
> +void submit_bio_noacct_nocheck(struct bio *bio)
> +{
> +	__submit_bio_noacct_nocheck(bio);
> +}
> +
>  /**
>   * submit_bio - submit a bio to the block device layer for I/O
>   * @bio: The &struct bio which describes the I/O
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 7c462c006b26..6cca1715c31e 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -1219,7 +1219,7 @@ static void blk_throtl_dispatch_work_fn(struct work_struct *work)
>  	if (!bio_list_empty(&bio_list_on_stack)) {
>  		blk_start_plug(&plug);
>  		while ((bio = bio_list_pop(&bio_list_on_stack)))
> -			submit_bio_noacct(bio);
> +			submit_bio_noacct_nocheck(bio);
>  		blk_finish_plug(&plug);
>  	}
>  }
> diff --git a/block/blk.h b/block/blk.h
> index b2516cb4f98e..ebaa59ca46ca 100644
> --- a/block/blk.h
> +++ b/block/blk.h
> @@ -46,6 +46,7 @@ void blk_freeze_queue(struct request_queue *q);
>  void __blk_mq_unfreeze_queue(struct request_queue *q, bool force_atomic);
>  void blk_queue_start_drain(struct request_queue *q);
>  int __bio_queue_enter(struct request_queue *q, struct bio *bio);
> +void submit_bio_noacct_nocheck(struct bio *bio);
>  
>  static inline bool blk_try_enter_queue(struct request_queue *q, bool pm)
>  {
> -- 
> 2.31.1
> 
---end quoted text---

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V2 0/7] block: improve iops limit throttle
  2022-02-09  9:14 [PATCH V2 0/7] block: improve iops limit throttle Ming Lei
                   ` (6 preceding siblings ...)
  2022-02-09  9:14 ` [PATCH V2 7/7] block: revert 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios") Ming Lei
@ 2022-02-13  8:34 ` Ming Lei
  7 siblings, 0 replies; 19+ messages in thread
From: Ming Lei @ 2022-02-13  8:34 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Li Ning, Tejun Heo, Chunguang Xu

On Wed, Feb 09, 2022 at 05:14:22PM +0800, Ming Lei wrote:
> Hello Guys,
> 
> Lining reported that iops limit throttle doesn't work on dm-thin, also
> iops limit throttle works bad on plain disk in case of excessive split.
> 
> Commit 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios")
> was for addressing this issue, but the taken approach is just to run
> post-accounting, then current split bios won't be throttled actually,
> so actual iops throttle result isn't good in case of excessive bio
> splitting.
> 
> The 1st three patches are cleanup.
> 
> The 4th patches add one new local helper of submit_bio_noacct_nocheck() for
> blk_throtl_dispatch_work_fn(), so that bios won't be throttled any more
> when blk-throttle code dispatches throttled bios.
> 
> The 5th and 6th patch makes the real difference for throttling split bio wrt.
> iops limit.
> 
> The last patch is to revert commit 4f1e9630afe6 ("blk-throtl: optimize IOPS
> throttle for large IO scenarios").
> 
> Lining has verified that iops throttle is improved much on the posted
> RFC V1 version.
> 
> V2:
> 	- remove RFC
> 	- don't add/export __submit_bio_noacct(), instead add one new local
> 	helper of submit_bio_noacct_nocheck() per Christoph's suggestion
> 
> Ming Lei (7):
>   block: move submit_bio_checks() into submit_bio_noacct
>   block: move blk_crypto_bio_prep() out of blk-mq.c
>   block: don't declare submit_bio_checks in local header
>   block: don't check bio in blk_throtl_dispatch_work_fn
>   block: throttle split bio in case of iops limit
>   block: don't try to throttle split bio if iops limit isn't set
>   block: revert 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for
>     large IO scenarios")

Hello Tejun, Chunguang and guys,

Can you give an review on this patchset? Especially the last 4 changes
on blk-throtl?


Thanks,
Ming


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V2 5/7] block: throttle split bio in case of iops limit
  2022-02-09  9:14 ` [PATCH V2 5/7] block: throttle split bio in case of iops limit Ming Lei
@ 2022-02-14 20:22   ` Tejun Heo
  2022-02-15  1:10     ` Ming Lei
  2022-02-15  8:17   ` Ning Li
  1 sibling, 1 reply; 19+ messages in thread
From: Tejun Heo @ 2022-02-14 20:22 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Li Ning, Chunguang Xu

Hello,

On Wed, Feb 09, 2022 at 05:14:27PM +0800, Ming Lei wrote:
> Chunguang Xu has already observed this issue and fixed it in commit
> 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios").
> However, that patch only covers bio splitting in __blk_queue_split(), and
> we have other kind of bio splitting, such as bio_split() &
> submit_bio_noacct() and other ways.

I see. So, we can go for adding split handling to all other places or try to
find a common spot.

> This patch tries to fix the issue in one generic way by always charging
> the bio for iops limit in blk_throtl_bio(). This way is reasonable:
> re-submission & fast-cloned bio is charged if it is submitted to same
> disk/queue, and BIO_THROTTLED will be cleared if bio->bi_bdev is changed.
> 
> This new approach can get much more smooth/stable iops limit compared with
> commit 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO
> scenarios") since that commit can't throttle current split bios actually.
> 
> Also this way won't cause new double bio iops charge in
> blk_throtl_dispatch_work_fn() in which blk_throtl_bio() won't be called
> any more.

But yeah, this should work too. This is simpler but more fragile given the
twisted history around BIO_THROTTLED. I think the root cause of the
convolution is that it's hooked at the wrong spot - it's sitting on top of
the queue trying to guess what actually happens to the bios it sent down.
Ideally, we probably wanna move this to rq-qos hooks which sit on the actual
issue-to-the-device path.

For now, I don't have a strong preference. This looks fine to me too. Please
feel free to add

 Acked-by: Tejun Heo <tj@kernel.org>

for the blk-throtl patches.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V2 5/7] block: throttle split bio in case of iops limit
  2022-02-14 20:22   ` Tejun Heo
@ 2022-02-15  1:10     ` Ming Lei
  2022-02-15  7:08       ` Christoph Hellwig
  0 siblings, 1 reply; 19+ messages in thread
From: Ming Lei @ 2022-02-15  1:10 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jens Axboe, linux-block, Li Ning, Chunguang Xu

On Mon, Feb 14, 2022 at 10:22:49AM -1000, Tejun Heo wrote:
> Hello,
> 
> On Wed, Feb 09, 2022 at 05:14:27PM +0800, Ming Lei wrote:
> > Chunguang Xu has already observed this issue and fixed it in commit
> > 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios").
> > However, that patch only covers bio splitting in __blk_queue_split(), and
> > we have other kind of bio splitting, such as bio_split() &
> > submit_bio_noacct() and other ways.
> 
> I see. So, we can go for adding split handling to all other places or try to
> find a common spot.
> 
> > This patch tries to fix the issue in one generic way by always charging
> > the bio for iops limit in blk_throtl_bio(). This way is reasonable:
> > re-submission & fast-cloned bio is charged if it is submitted to same
> > disk/queue, and BIO_THROTTLED will be cleared if bio->bi_bdev is changed.
> > 
> > This new approach can get much more smooth/stable iops limit compared with
> > commit 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO
> > scenarios") since that commit can't throttle current split bios actually.
> > 
> > Also this way won't cause new double bio iops charge in
> > blk_throtl_dispatch_work_fn() in which blk_throtl_bio() won't be called
> > any more.
> 
> But yeah, this should work too. This is simpler but more fragile given the
> twisted history around BIO_THROTTLED. I think the root cause of the
> convolution is that it's hooked at the wrong spot - it's sitting on top of
> the queue trying to guess what actually happens to the bios it sent down.
> Ideally, we probably wanna move this to rq-qos hooks which sit on the actual
> issue-to-the-device path.

The big problem is that rq-qos is only hooked for request based queue,
and we need to support cgroup/throttle for bio base queue.

> 
> For now, I don't have a strong preference. This looks fine to me too. Please
> feel free to add
> 
>  Acked-by: Tejun Heo <tj@kernel.org>
> 
> for the blk-throtl patches.

Thanks!

-- 
Ming


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V2 5/7] block: throttle split bio in case of iops limit
  2022-02-15  1:10     ` Ming Lei
@ 2022-02-15  7:08       ` Christoph Hellwig
  2022-02-15  8:02         ` Ming Lei
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2022-02-15  7:08 UTC (permalink / raw)
  To: Ming Lei; +Cc: Tejun Heo, Jens Axboe, linux-block, Li Ning, Chunguang Xu

On Tue, Feb 15, 2022 at 09:10:18AM +0800, Ming Lei wrote:
> The big problem is that rq-qos is only hooked for request based queue,
> and we need to support cgroup/throttle for bio base queue.

Which bio based driver do we care about?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V2 5/7] block: throttle split bio in case of iops limit
  2022-02-15  7:08       ` Christoph Hellwig
@ 2022-02-15  8:02         ` Ming Lei
  0 siblings, 0 replies; 19+ messages in thread
From: Ming Lei @ 2022-02-15  8:02 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Tejun Heo, Jens Axboe, linux-block, Li Ning, Chunguang Xu

On Mon, Feb 14, 2022 at 11:08:06PM -0800, Christoph Hellwig wrote:
> On Tue, Feb 15, 2022 at 09:10:18AM +0800, Ming Lei wrote:
> > The big problem is that rq-qos is only hooked for request based queue,
> > and we need to support cgroup/throttle for bio base queue.
> 
> Which bio based driver do we care about?

dm/md is usually the upper layer block device for mounting FS, and
userspace just setup bio throttle on these dm/md device, so we can't
break userspace by removing io throttle for dm/md and other bio based
devices.

Thanks,
Ming


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V2 5/7] block: throttle split bio in case of iops limit
  2022-02-09  9:14 ` [PATCH V2 5/7] block: throttle split bio in case of iops limit Ming Lei
  2022-02-14 20:22   ` Tejun Heo
@ 2022-02-15  8:17   ` Ning Li
  2022-02-15  8:20     ` Ning Li
  1 sibling, 1 reply; 19+ messages in thread
From: Ning Li @ 2022-02-15  8:17 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Tejun Heo, Chunguang Xu

If where will be one more version of this patchset, please update my sign with

`Ning Li <lining@2020x@163.com>`.  : )

Thanks 


> 2022年2月9日 下午5:14,Ming Lei <ming.lei@redhat.com> 写道:
> 
> Commit 111be8839817 ("block-throttle: avoid double charge") marks bio as
> BIO_THROTTLED unconditionally if __blk_throtl_bio() is called on this bio,
> then this bio won't be called into __blk_throtl_bio() any more. This way
> is to avoid double charge in case of bio splitting. It is reasonable for
> read/write throughput limit, but not reasonable for IOPS limit because
> block layer provides io accounting against split bio.
> 
> Chunguang Xu has already observed this issue and fixed it in commit
> 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios").
> However, that patch only covers bio splitting in __blk_queue_split(), and
> we have other kind of bio splitting, such as bio_split() &
> submit_bio_noacct() and other ways.
> 
> This patch tries to fix the issue in one generic way by always charging
> the bio for iops limit in blk_throtl_bio(). This way is reasonable:
> re-submission & fast-cloned bio is charged if it is submitted to same
> disk/queue, and BIO_THROTTLED will be cleared if bio->bi_bdev is changed.
> 
> This new approach can get much more smooth/stable iops limit compared with
> commit 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO
> scenarios") since that commit can't throttle current split bios actually.
> 
> Also this way won't cause new double bio iops charge in
> blk_throtl_dispatch_work_fn() in which blk_throtl_bio() won't be called
> any more.
> 
> Reported-by: Li Ning <lining2020x@163.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Chunguang Xu <brookxu@tencent.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
> block/blk-merge.c    |  2 --
> block/blk-throttle.c | 10 +++++++---
> block/blk-throttle.h |  2 --
> 3 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 4de34a332c9f..f5255991b773 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -368,8 +368,6 @@ void __blk_queue_split(struct request_queue *q, struct bio **bio,
> 		trace_block_split(split, (*bio)->bi_iter.bi_sector);
> 		submit_bio_noacct(*bio);
> 		*bio = split;
> -
> -		blk_throtl_charge_bio_split(*bio);
> 	}
> }
> 
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 6cca1715c31e..6f2a8e7801fe 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -808,7 +808,8 @@ static bool tg_with_in_bps_limit(struct throtl_grp *tg, struct bio *bio,
> 	unsigned long jiffy_elapsed, jiffy_wait, jiffy_elapsed_rnd;
> 	unsigned int bio_size = throtl_bio_data_size(bio);
> 
> -	if (bps_limit == U64_MAX) {
> +	/* no need to throttle if this bio's bytes have been accounted */
> +	if (bps_limit == U64_MAX || bio_flagged(bio, BIO_THROTTLED)) {
> 		if (wait)
> 			*wait = 0;
> 		return true;
> @@ -920,9 +921,12 @@ static void throtl_charge_bio(struct throtl_grp *tg, struct bio *bio)
> 	unsigned int bio_size = throtl_bio_data_size(bio);
> 
> 	/* Charge the bio to the group */
> -	tg->bytes_disp[rw] += bio_size;
> +	if (!bio_flagged(bio, BIO_THROTTLED)) {
> +		tg->bytes_disp[rw] += bio_size;
> +		tg->last_bytes_disp[rw] += bio_size;
> +	}
> +
> 	tg->io_disp[rw]++;
> -	tg->last_bytes_disp[rw] += bio_size;
> 	tg->last_io_disp[rw]++;
> 
> 	/*
> diff --git a/block/blk-throttle.h b/block/blk-throttle.h
> index 175f03abd9e4..cb43f4417d6e 100644
> --- a/block/blk-throttle.h
> +++ b/block/blk-throttle.h
> @@ -170,8 +170,6 @@ static inline bool blk_throtl_bio(struct bio *bio)
> {
> 	struct throtl_grp *tg = blkg_to_tg(bio->bi_blkg);
> 
> -	if (bio_flagged(bio, BIO_THROTTLED))
> -		return false;
> 	if (!tg->has_rules[bio_data_dir(bio)])
> 		return false;
> 
> -- 
> 2.31.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V2 5/7] block: throttle split bio in case of iops limit
  2022-02-15  8:17   ` Ning Li
@ 2022-02-15  8:20     ` Ning Li
  0 siblings, 0 replies; 19+ messages in thread
From: Ning Li @ 2022-02-15  8:20 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Tejun Heo, Chunguang Xu



> 2022年2月15日 下午4:17,Ning Li <lining2020x@163.com> 写道:
> 
> If where will be one more version of this patchset, please update my sign with
> 
> `Ning Li <lining@2020x@163.com>`.  : )

Sorry for a typo, it’s `Ning Li <lining2020x@163.com>`

> Thanks 
> 
> 
>> 2022年2月9日 下午5:14,Ming Lei <ming.lei@redhat.com> 写道:
>> 
>> Commit 111be8839817 ("block-throttle: avoid double charge") marks bio as
>> BIO_THROTTLED unconditionally if __blk_throtl_bio() is called on this bio,
>> then this bio won't be called into __blk_throtl_bio() any more. This way
>> is to avoid double charge in case of bio splitting. It is reasonable for
>> read/write throughput limit, but not reasonable for IOPS limit because
>> block layer provides io accounting against split bio.
>> 
>> Chunguang Xu has already observed this issue and fixed it in commit
>> 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios").
>> However, that patch only covers bio splitting in __blk_queue_split(), and
>> we have other kind of bio splitting, such as bio_split() &
>> submit_bio_noacct() and other ways.
>> 
>> This patch tries to fix the issue in one generic way by always charging
>> the bio for iops limit in blk_throtl_bio(). This way is reasonable:
>> re-submission & fast-cloned bio is charged if it is submitted to same
>> disk/queue, and BIO_THROTTLED will be cleared if bio->bi_bdev is changed.
>> 
>> This new approach can get much more smooth/stable iops limit compared with
>> commit 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO
>> scenarios") since that commit can't throttle current split bios actually.
>> 
>> Also this way won't cause new double bio iops charge in
>> blk_throtl_dispatch_work_fn() in which blk_throtl_bio() won't be called
>> any more.
>> 
>> Reported-by: Li Ning <lining2020x@163.com>
>> Cc: Tejun Heo <tj@kernel.org>
>> Cc: Chunguang Xu <brookxu@tencent.com>
>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>> ---
>> block/blk-merge.c    |  2 --
>> block/blk-throttle.c | 10 +++++++---
>> block/blk-throttle.h |  2 --
>> 3 files changed, 7 insertions(+), 7 deletions(-)
>> 
>> diff --git a/block/blk-merge.c b/block/blk-merge.c
>> index 4de34a332c9f..f5255991b773 100644
>> --- a/block/blk-merge.c
>> +++ b/block/blk-merge.c
>> @@ -368,8 +368,6 @@ void __blk_queue_split(struct request_queue *q, struct bio **bio,
>> 		trace_block_split(split, (*bio)->bi_iter.bi_sector);
>> 		submit_bio_noacct(*bio);
>> 		*bio = split;
>> -
>> -		blk_throtl_charge_bio_split(*bio);
>> 	}
>> }
>> 
>> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
>> index 6cca1715c31e..6f2a8e7801fe 100644
>> --- a/block/blk-throttle.c
>> +++ b/block/blk-throttle.c
>> @@ -808,7 +808,8 @@ static bool tg_with_in_bps_limit(struct throtl_grp *tg, struct bio *bio,
>> 	unsigned long jiffy_elapsed, jiffy_wait, jiffy_elapsed_rnd;
>> 	unsigned int bio_size = throtl_bio_data_size(bio);
>> 
>> -	if (bps_limit == U64_MAX) {
>> +	/* no need to throttle if this bio's bytes have been accounted */
>> +	if (bps_limit == U64_MAX || bio_flagged(bio, BIO_THROTTLED)) {
>> 		if (wait)
>> 			*wait = 0;
>> 		return true;
>> @@ -920,9 +921,12 @@ static void throtl_charge_bio(struct throtl_grp *tg, struct bio *bio)
>> 	unsigned int bio_size = throtl_bio_data_size(bio);
>> 
>> 	/* Charge the bio to the group */
>> -	tg->bytes_disp[rw] += bio_size;
>> +	if (!bio_flagged(bio, BIO_THROTTLED)) {
>> +		tg->bytes_disp[rw] += bio_size;
>> +		tg->last_bytes_disp[rw] += bio_size;
>> +	}
>> +
>> 	tg->io_disp[rw]++;
>> -	tg->last_bytes_disp[rw] += bio_size;
>> 	tg->last_io_disp[rw]++;
>> 
>> 	/*
>> diff --git a/block/blk-throttle.h b/block/blk-throttle.h
>> index 175f03abd9e4..cb43f4417d6e 100644
>> --- a/block/blk-throttle.h
>> +++ b/block/blk-throttle.h
>> @@ -170,8 +170,6 @@ static inline bool blk_throtl_bio(struct bio *bio)
>> {
>> 	struct throtl_grp *tg = blkg_to_tg(bio->bi_blkg);
>> 
>> -	if (bio_flagged(bio, BIO_THROTTLED))
>> -		return false;
>> 	if (!tg->has_rules[bio_data_dir(bio)])
>> 		return false;
>> 
>> -- 
>> 2.31.1
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-02-15  8:34 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-09  9:14 [PATCH V2 0/7] block: improve iops limit throttle Ming Lei
2022-02-09  9:14 ` [PATCH V2 1/7] block: move submit_bio_checks() into submit_bio_noacct Ming Lei
2022-02-09 14:01   ` Christoph Hellwig
2022-02-09  9:14 ` [PATCH V2 2/7] block: move blk_crypto_bio_prep() out of blk-mq.c Ming Lei
2022-02-09 14:02   ` Christoph Hellwig
2022-02-09  9:14 ` [PATCH V2 3/7] block: don't declare submit_bio_checks in local header Ming Lei
2022-02-09 14:02   ` Christoph Hellwig
2022-02-09  9:14 ` [PATCH V2 4/7] block: don't check bio in blk_throtl_dispatch_work_fn Ming Lei
2022-02-09 14:03   ` Christoph Hellwig
2022-02-09  9:14 ` [PATCH V2 5/7] block: throttle split bio in case of iops limit Ming Lei
2022-02-14 20:22   ` Tejun Heo
2022-02-15  1:10     ` Ming Lei
2022-02-15  7:08       ` Christoph Hellwig
2022-02-15  8:02         ` Ming Lei
2022-02-15  8:17   ` Ning Li
2022-02-15  8:20     ` Ning Li
2022-02-09  9:14 ` [PATCH V2 6/7] block: don't try to throttle split bio if iops limit isn't set Ming Lei
2022-02-09  9:14 ` [PATCH V2 7/7] block: revert 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios") Ming Lei
2022-02-13  8:34 ` [PATCH V2 0/7] block: improve iops limit throttle Ming Lei

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.