* [RFC PATCH 0/7] block: improve iops limit throttle
@ 2022-01-11 11:55 Ming Lei
  2022-01-11 11:55 ` [RFC PATCH 1/7] block: move submit_bio_checks() into submit_bio_noacct Ming Lei
                   ` (7 more replies)
  0 siblings, 8 replies; 14+ messages in thread
From: Ming Lei @ 2022-01-11 11:55 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Li Ning, Tejun Heo, Chunguang Xu, Ming Lei

Hello Guys,

Lining reported that the iops limit throttle doesn't work on dm-thin, and
that it also works badly on a plain disk in case of excessive bio splitting.

Commit 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios")
was meant to address this issue, but the approach taken is only
post-accounting: the current split bios are not actually throttled, so the
actual iops throttling result isn't good in case of excessive bio
splitting.

The first two patches are cleanups.

The 3rd and 4th patches add a new helper, __submit_bio_noacct(), for
blk_throtl_dispatch_work_fn(), so that bios won't be throttled again when
the blk-throttle code dispatches throttled bios.

The 5th and 6th patches make the real difference for throttling split bios
with respect to the iops limit.

The last patch is to revert commit 4f1e9630afe6 ("blk-throtl: optimize IOPS
throttle for large IO scenarios").

Lining, you should get exact IOPS throttling in your dm-thin test with
this patchset; please test and report back.
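
(For reference, the iops limit discussed here is the per-cgroup
blk-throttle limit, configured for example through the cgroup v2 io.max
interface. A minimal setup sketch follows; the device number, cgroup name
and limit value are illustrative only, not taken from Lining's setup.)

```
# enable the io controller and create a test cgroup
echo "+io" > /sys/fs/cgroup/cgroup.subtree_control
mkdir -p /sys/fs/cgroup/iotest

# limit writes on device 253:0 (e.g. a dm-thin LV) to 10 IOs per second
echo "253:0 wiops=10" > /sys/fs/cgroup/iotest/io.max

# move the current shell into the cgroup before running the workload
echo $$ > /sys/fs/cgroup/iotest/cgroup.procs
```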


Ming Lei (7):
  block: move submit_bio_checks() into submit_bio_noacct
  block: move blk_crypto_bio_prep() out of blk-mq.c
  block: allow to bypass bio check before submitting bio
  block: don't check bio in blk_throtl_dispatch_work_fn
  block: throttle split bio in case of iops limit
  block: don't try to throttle split bio if iops limit isn't set
  block: revert 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for
    large IO scenarios")

 block/blk-core.c       | 32 +++++++++-------------
 block/blk-merge.c      |  2 --
 block/blk-mq.c         |  3 ---
 block/blk-throttle.c   | 61 +++++++++++++++---------------------------
 block/blk-throttle.h   | 16 ++++++-----
 include/linux/blkdev.h |  7 ++++-
 6 files changed, 51 insertions(+), 70 deletions(-)

-- 
2.31.1



* [RFC PATCH 1/7] block: move submit_bio_checks() into submit_bio_noacct
  2022-01-11 11:55 [RFC PATCH 0/7] block: improve iops limit throttle Ming Lei
@ 2022-01-11 11:55 ` Ming Lei
  2022-01-17  8:20   ` Christoph Hellwig
  2022-01-11 11:55 ` [RFC PATCH 2/7] block: move blk_crypto_bio_prep() out of blk-mq.c Ming Lei
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2022-01-11 11:55 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Li Ning, Tejun Heo, Chunguang Xu, Ming Lei

It is cleaner and more readable to check the bio when starting to submit
it, instead of just before calling ->submit_bio() or blk_mq_submit_bio().

It also gives us a chance to optimize bio submission without checking the
bio.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index fd029c86d6ac..cca7fbe2a43b 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -799,9 +799,6 @@ static void __submit_bio(struct bio *bio)
 {
 	struct gendisk *disk = bio->bi_bdev->bd_disk;
 
-	if (unlikely(!submit_bio_checks(bio)))
-		return;
-
 	if (!disk->fops->submit_bio)
 		blk_mq_submit_bio(bio);
 	else
@@ -895,6 +892,9 @@ static void __submit_bio_noacct_mq(struct bio *bio)
  */
 void submit_bio_noacct(struct bio *bio)
 {
+	if (unlikely(!submit_bio_checks(bio)))
+		return;
+
 	/*
 	 * We only want one ->submit_bio to be active at a time, else stack
 	 * usage with stacked devices could be a problem.  Use current->bio_list
-- 
2.31.1



* [RFC PATCH 2/7] block: move blk_crypto_bio_prep() out of blk-mq.c
  2022-01-11 11:55 [RFC PATCH 0/7] block: improve iops limit throttle Ming Lei
  2022-01-11 11:55 ` [RFC PATCH 1/7] block: move submit_bio_checks() into submit_bio_noacct Ming Lei
@ 2022-01-11 11:55 ` Ming Lei
  2022-01-11 11:55 ` [RFC PATCH 3/7] block: allow to bypass bio check before submitting bio Ming Lei
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2022-01-11 11:55 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Li Ning, Tejun Heo, Chunguang Xu, Ming Lei

blk_crypto_bio_prep() is called for both bio-based and blk-mq drivers, so
move it out of blk-mq.c; then we can unify this kind of handling.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-core.c | 18 ++++++------------
 block/blk-mq.c   |  3 ---
 2 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index cca7fbe2a43b..471ffc834e3f 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -785,26 +785,20 @@ static noinline_for_stack bool submit_bio_checks(struct bio *bio)
 	return false;
 }
 
-static void __submit_bio_fops(struct gendisk *disk, struct bio *bio)
+static void __submit_bio(struct bio *bio)
 {
 	if (blk_crypto_bio_prep(&bio)) {
-		if (likely(bio_queue_enter(bio) == 0)) {
+		struct gendisk *disk = bio->bi_bdev->bd_disk;
+
+		if (!disk->fops->submit_bio) {
+			blk_mq_submit_bio(bio);
+		} else if (likely(bio_queue_enter(bio) == 0)) {
 			disk->fops->submit_bio(bio);
 			blk_queue_exit(disk->queue);
 		}
 	}
 }
 
-static void __submit_bio(struct bio *bio)
-{
-	struct gendisk *disk = bio->bi_bdev->bd_disk;
-
-	if (!disk->fops->submit_bio)
-		blk_mq_submit_bio(bio);
-	else
-		__submit_bio_fops(disk, bio);
-}
-
 /*
  * The loop in this function may be a bit non-obvious, and so deserves some
  * explanation:
diff --git a/block/blk-mq.c b/block/blk-mq.c
index a6d4780580fc..73c376e27c5a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2786,9 +2786,6 @@ void blk_mq_submit_bio(struct bio *bio)
 	unsigned int nr_segs = 1;
 	blk_status_t ret;
 
-	if (unlikely(!blk_crypto_bio_prep(&bio)))
-		return;
-
 	blk_queue_bounce(q, &bio);
 	if (blk_may_split(q, bio))
 		__blk_queue_split(q, &bio, &nr_segs);
-- 
2.31.1



* [RFC PATCH 3/7] block: allow to bypass bio check before submitting bio
  2022-01-11 11:55 [RFC PATCH 0/7] block: improve iops limit throttle Ming Lei
  2022-01-11 11:55 ` [RFC PATCH 1/7] block: move submit_bio_checks() into submit_bio_noacct Ming Lei
  2022-01-11 11:55 ` [RFC PATCH 2/7] block: move blk_crypto_bio_prep() out of blk-mq.c Ming Lei
@ 2022-01-11 11:55 ` Ming Lei
  2022-01-17  8:21   ` Christoph Hellwig
  2022-01-11 11:55 ` [RFC PATCH 4/7] block: don't check bio in blk_throtl_dispatch_work_fn Ming Lei
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2022-01-11 11:55 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Li Ning, Tejun Heo, Chunguang Xu, Ming Lei

We have several cases in which a bio is re-submitted to the same queue, so
checking the bio isn't needed; add a helper for this purpose.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-core.c       | 10 +++++-----
 include/linux/blkdev.h |  7 ++++++-
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 471ffc834e3f..1e6ebc16b889 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -818,7 +818,7 @@ static void __submit_bio(struct bio *bio)
  * bio_list_on_stack[1] contains bios that were submitted before the current
  *	->submit_bio_bio, but that haven't been processed yet.
  */
-static void __submit_bio_noacct(struct bio *bio)
+static void __submit_bio_noacct_generic(struct bio *bio)
 {
 	struct bio_list bio_list_on_stack[2];
 
@@ -884,9 +884,9 @@ static void __submit_bio_noacct_mq(struct bio *bio)
  * systems and other upper level users of the block layer should use
  * submit_bio() instead.
  */
-void submit_bio_noacct(struct bio *bio)
+void __submit_bio_noacct(struct bio *bio, bool check)
 {
-	if (unlikely(!submit_bio_checks(bio)))
+	if (unlikely(check && !submit_bio_checks(bio)))
 		return;
 
 	/*
@@ -900,9 +900,9 @@ void submit_bio_noacct(struct bio *bio)
 	else if (!bio->bi_bdev->bd_disk->fops->submit_bio)
 		__submit_bio_noacct_mq(bio);
 	else
-		__submit_bio_noacct(bio);
+		__submit_bio_noacct_generic(bio);
 }
-EXPORT_SYMBOL(submit_bio_noacct);
+EXPORT_SYMBOL(__submit_bio_noacct);
 
 /**
  * submit_bio - submit a bio to the block device layer for I/O
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 22746b2d6825..e02a73d7c277 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -599,7 +599,12 @@ static inline unsigned int blk_queue_depth(struct request_queue *q)
 
 extern int blk_register_queue(struct gendisk *disk);
 extern void blk_unregister_queue(struct gendisk *disk);
-void submit_bio_noacct(struct bio *bio);
+void __submit_bio_noacct(struct bio *bio, bool check);
+
+static inline void submit_bio_noacct(struct bio *bio)
+{
+	__submit_bio_noacct(bio, true);
+}
 
 extern int blk_lld_busy(struct request_queue *q);
 extern void blk_queue_split(struct bio **);
-- 
2.31.1



* [RFC PATCH 4/7] block: don't check bio in blk_throtl_dispatch_work_fn
  2022-01-11 11:55 [RFC PATCH 0/7] block: improve iops limit throttle Ming Lei
                   ` (2 preceding siblings ...)
  2022-01-11 11:55 ` [RFC PATCH 3/7] block: allow to bypass bio check before submitting bio Ming Lei
@ 2022-01-11 11:55 ` Ming Lei
  2022-01-11 11:55 ` [RFC PATCH 5/7] block: throttle split bio in case of iops limit Ming Lei
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2022-01-11 11:55 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Li Ning, Tejun Heo, Chunguang Xu, Ming Lei

The bio has already been checked before throttling, so there is no need to
check it again before dispatching it from the throttle queue.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-throttle.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 7c462c006b26..8ba08425db06 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1219,7 +1219,7 @@ static void blk_throtl_dispatch_work_fn(struct work_struct *work)
 	if (!bio_list_empty(&bio_list_on_stack)) {
 		blk_start_plug(&plug);
 		while ((bio = bio_list_pop(&bio_list_on_stack)))
-			submit_bio_noacct(bio);
+			__submit_bio_noacct(bio, false);
 		blk_finish_plug(&plug);
 	}
 }
-- 
2.31.1



* [RFC PATCH 5/7] block: throttle split bio in case of iops limit
  2022-01-11 11:55 [RFC PATCH 0/7] block: improve iops limit throttle Ming Lei
                   ` (3 preceding siblings ...)
  2022-01-11 11:55 ` [RFC PATCH 4/7] block: don't check bio in blk_throtl_dispatch_work_fn Ming Lei
@ 2022-01-11 11:55 ` Ming Lei
  2022-01-11 11:55 ` [RFC PATCH 6/7] block: don't try to throttle split bio if iops limit isn't set Ming Lei
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2022-01-11 11:55 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Li Ning, Tejun Heo, Chunguang Xu, Ming Lei

Commit 111be8839817 ("block-throttle: avoid double charge") marks a bio as
BIO_THROTTLED unconditionally once __blk_throtl_bio() has been called on
it, so that the bio won't go through __blk_throtl_bio() again. This avoids
double charging in case of bio splitting. That is reasonable for the
read/write throughput limit, but not for the IOPS limit, because the block
layer does IO accounting against split bios.

Chunguang Xu has already observed this issue and fixed it in commit
4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios").
However, that patch only covers bio splitting in __blk_queue_split(); we
have other kinds of bio splitting as well, such as bio_split() followed by
submit_bio_noacct().

This patch tries to fix the issue in a generic way by always charging the
bio against the iops limit in blk_throtl_bio(). This is reasonable: a
re-submitted or fast-cloned bio is charged if it is submitted to the same
disk/queue, and BIO_THROTTLED will be cleared if bio->bi_bdev is changed.

This new approach gives a much smoother/more stable iops limit than commit
4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO
scenarios"), since that commit can't actually throttle the current split
bios.

Also, this approach won't introduce a new double iops charge in
blk_throtl_dispatch_work_fn(), where blk_throtl_bio() won't be called any
more.

Reported-by: Li Ning <lining2020x@163.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Chunguang Xu <brookxu@tencent.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-merge.c    |  2 --
 block/blk-throttle.c | 10 +++++++---
 block/blk-throttle.h |  2 --
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 4de34a332c9f..f5255991b773 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -368,8 +368,6 @@ void __blk_queue_split(struct request_queue *q, struct bio **bio,
 		trace_block_split(split, (*bio)->bi_iter.bi_sector);
 		submit_bio_noacct(*bio);
 		*bio = split;
-
-		blk_throtl_charge_bio_split(*bio);
 	}
 }
 
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 8ba08425db06..67bb39d13751 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -808,7 +808,8 @@ static bool tg_with_in_bps_limit(struct throtl_grp *tg, struct bio *bio,
 	unsigned long jiffy_elapsed, jiffy_wait, jiffy_elapsed_rnd;
 	unsigned int bio_size = throtl_bio_data_size(bio);
 
-	if (bps_limit == U64_MAX) {
+	/* no need to throttle if this bio's bytes have been accounted */
+	if (bps_limit == U64_MAX || bio_flagged(bio, BIO_THROTTLED)) {
 		if (wait)
 			*wait = 0;
 		return true;
@@ -920,9 +921,12 @@ static void throtl_charge_bio(struct throtl_grp *tg, struct bio *bio)
 	unsigned int bio_size = throtl_bio_data_size(bio);
 
 	/* Charge the bio to the group */
-	tg->bytes_disp[rw] += bio_size;
+	if (!bio_flagged(bio, BIO_THROTTLED)) {
+		tg->bytes_disp[rw] += bio_size;
+		tg->last_bytes_disp[rw] += bio_size;
+	}
+
 	tg->io_disp[rw]++;
-	tg->last_bytes_disp[rw] += bio_size;
 	tg->last_io_disp[rw]++;
 
 	/*
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index 175f03abd9e4..cb43f4417d6e 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -170,8 +170,6 @@ static inline bool blk_throtl_bio(struct bio *bio)
 {
 	struct throtl_grp *tg = blkg_to_tg(bio->bi_blkg);
 
-	if (bio_flagged(bio, BIO_THROTTLED))
-		return false;
 	if (!tg->has_rules[bio_data_dir(bio)])
 		return false;
 
-- 
2.31.1



* [RFC PATCH 6/7] block: don't try to throttle split bio if iops limit isn't set
  2022-01-11 11:55 [RFC PATCH 0/7] block: improve iops limit throttle Ming Lei
                   ` (4 preceding siblings ...)
  2022-01-11 11:55 ` [RFC PATCH 5/7] block: throttle split bio in case of iops limit Ming Lei
@ 2022-01-11 11:55 ` Ming Lei
  2022-01-11 11:55 ` [RFC PATCH 7/7] block: revert 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios") Ming Lei
       [not found] ` <4666c796.5083.17e4d67bb88.Coremail.lining2020x@163.com>
  7 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2022-01-11 11:55 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Li Ning, Tejun Heo, Chunguang Xu, Ming Lei

We need to throttle split bios against the IOPS limit even though the
split bio has been marked as BIO_THROTTLED, since the block layer actually
accounts IO against split bios.

If only the throughput throttle is set up, there is no need to throttle
again once BIO_THROTTLED is set, since we have already accounted for the
whole bio's bytes.

Add a new flag, THROTL_TG_HAS_IOPS_LIMIT, to serve this purpose.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-throttle.c | 21 ++++++++++++++-------
 block/blk-throttle.h | 11 +++++++++++
 2 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 67bb39d13751..36c17fcb67ef 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -42,11 +42,6 @@
 /* A workqueue to queue throttle related work */
 static struct workqueue_struct *kthrotld_workqueue;
 
-enum tg_state_flags {
-	THROTL_TG_PENDING	= 1 << 0,	/* on parent's pending tree */
-	THROTL_TG_WAS_EMPTY	= 1 << 1,	/* bio_lists[] became non-empty */
-};
-
 #define rb_entry_tg(node)	rb_entry((node), struct throtl_grp, rb_node)
 
 /* We measure latency for request size from <= 4k to >= 1M */
@@ -426,12 +421,24 @@ static void tg_update_has_rules(struct throtl_grp *tg)
 	struct throtl_grp *parent_tg = sq_to_tg(tg->service_queue.parent_sq);
 	struct throtl_data *td = tg->td;
 	int rw;
+	int has_iops_limit = 0;
+
+	for (rw = READ; rw <= WRITE; rw++) {
+		unsigned int iops_limit = tg_iops_limit(tg, rw);
 
-	for (rw = READ; rw <= WRITE; rw++)
 		tg->has_rules[rw] = (parent_tg && parent_tg->has_rules[rw]) ||
 			(td->limit_valid[td->limit_index] &&
 			 (tg_bps_limit(tg, rw) != U64_MAX ||
-			  tg_iops_limit(tg, rw) != UINT_MAX));
+			  iops_limit != UINT_MAX));
+
+		if (iops_limit != UINT_MAX)
+			has_iops_limit = 1;
+	}
+
+	if (has_iops_limit)
+		tg->flags |= THROTL_TG_HAS_IOPS_LIMIT;
+	else
+		tg->flags &= ~THROTL_TG_HAS_IOPS_LIMIT;
 }
 
 static void throtl_pd_online(struct blkg_policy_data *pd)
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index cb43f4417d6e..c996a15f290e 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -52,6 +52,12 @@ struct throtl_service_queue {
 	struct timer_list	pending_timer;	/* fires on first_pending_disptime */
 };
 
+enum tg_state_flags {
+	THROTL_TG_PENDING	= 1 << 0,	/* on parent's pending tree */
+	THROTL_TG_WAS_EMPTY	= 1 << 1,	/* bio_lists[] became non-empty */
+	THROTL_TG_HAS_IOPS_LIMIT = 1 << 2,	/* tg has iops limit */
+};
+
 enum {
 	LIMIT_LOW,
 	LIMIT_MAX,
@@ -170,6 +176,11 @@ static inline bool blk_throtl_bio(struct bio *bio)
 {
 	struct throtl_grp *tg = blkg_to_tg(bio->bi_blkg);
 
+	/* no need to throttle bps any more if the bio has been throttled */
+	if (bio_flagged(bio, BIO_THROTTLED) &&
+	    !(tg->flags & THROTL_TG_HAS_IOPS_LIMIT))
+		return false;
+
 	if (!tg->has_rules[bio_data_dir(bio)])
 		return false;
 
-- 
2.31.1



* [RFC PATCH 7/7] block: revert 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios")
  2022-01-11 11:55 [RFC PATCH 0/7] block: improve iops limit throttle Ming Lei
                   ` (5 preceding siblings ...)
  2022-01-11 11:55 ` [RFC PATCH 6/7] block: don't try to throttle split bio if iops limit isn't set Ming Lei
@ 2022-01-11 11:55 ` Ming Lei
       [not found] ` <4666c796.5083.17e4d67bb88.Coremail.lining2020x@163.com>
  7 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2022-01-11 11:55 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Li Ning, Tejun Heo, Chunguang Xu, Ming Lei

Revert commit 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large
IO scenarios"), since we now have an easier way to address this issue and
get a better iops throttling result.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-throttle.c | 28 ----------------------------
 block/blk-throttle.h |  5 -----
 2 files changed, 33 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 36c17fcb67ef..cf7e20804f1b 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -641,8 +641,6 @@ static inline void throtl_start_new_slice_with_credit(struct throtl_grp *tg,
 	tg->bytes_disp[rw] = 0;
 	tg->io_disp[rw] = 0;
 
-	atomic_set(&tg->io_split_cnt[rw], 0);
-
 	/*
 	 * Previous slice has expired. We must have trimmed it after last
 	 * bio dispatch. That means since start of last slice, we never used
@@ -666,8 +664,6 @@ static inline void throtl_start_new_slice(struct throtl_grp *tg, bool rw)
 	tg->slice_start[rw] = jiffies;
 	tg->slice_end[rw] = jiffies + tg->td->throtl_slice;
 
-	atomic_set(&tg->io_split_cnt[rw], 0);
-
 	throtl_log(&tg->service_queue,
 		   "[%c] new slice start=%lu end=%lu jiffies=%lu",
 		   rw == READ ? 'R' : 'W', tg->slice_start[rw],
@@ -901,9 +897,6 @@ static bool tg_may_dispatch(struct throtl_grp *tg, struct bio *bio,
 				jiffies + tg->td->throtl_slice);
 	}
 
-	if (iops_limit != UINT_MAX)
-		tg->io_disp[rw] += atomic_xchg(&tg->io_split_cnt[rw], 0);
-
 	if (tg_with_in_bps_limit(tg, bio, bps_limit, &bps_wait) &&
 	    tg_with_in_iops_limit(tg, bio, iops_limit, &iops_wait)) {
 		if (wait)
@@ -1928,14 +1921,12 @@ static void throtl_downgrade_check(struct throtl_grp *tg)
 	}
 
 	if (tg->iops[READ][LIMIT_LOW]) {
-		tg->last_io_disp[READ] += atomic_xchg(&tg->last_io_split_cnt[READ], 0);
 		iops = tg->last_io_disp[READ] * HZ / elapsed_time;
 		if (iops >= tg->iops[READ][LIMIT_LOW])
 			tg->last_low_overflow_time[READ] = now;
 	}
 
 	if (tg->iops[WRITE][LIMIT_LOW]) {
-		tg->last_io_disp[WRITE] += atomic_xchg(&tg->last_io_split_cnt[WRITE], 0);
 		iops = tg->last_io_disp[WRITE] * HZ / elapsed_time;
 		if (iops >= tg->iops[WRITE][LIMIT_LOW])
 			tg->last_low_overflow_time[WRITE] = now;
@@ -2054,25 +2045,6 @@ static inline void throtl_update_latency_buckets(struct throtl_data *td)
 }
 #endif
 
-void blk_throtl_charge_bio_split(struct bio *bio)
-{
-	struct blkcg_gq *blkg = bio->bi_blkg;
-	struct throtl_grp *parent = blkg_to_tg(blkg);
-	struct throtl_service_queue *parent_sq;
-	bool rw = bio_data_dir(bio);
-
-	do {
-		if (!parent->has_rules[rw])
-			break;
-
-		atomic_inc(&parent->io_split_cnt[rw]);
-		atomic_inc(&parent->last_io_split_cnt[rw]);
-
-		parent_sq = parent->service_queue.parent_sq;
-		parent = sq_to_tg(parent_sq);
-	} while (parent);
-}
-
 bool __blk_throtl_bio(struct bio *bio)
 {
 	struct request_queue *q = bdev_get_queue(bio->bi_bdev);
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index c996a15f290e..b23a9f3abb82 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -138,9 +138,6 @@ struct throtl_grp {
 	unsigned int bad_bio_cnt; /* bios exceeding latency threshold */
 	unsigned long bio_cnt_reset_time;
 
-	atomic_t io_split_cnt[2];
-	atomic_t last_io_split_cnt[2];
-
 	struct blkg_rwstat stat_bytes;
 	struct blkg_rwstat stat_ios;
 };
@@ -164,13 +161,11 @@ static inline struct throtl_grp *blkg_to_tg(struct blkcg_gq *blkg)
 static inline int blk_throtl_init(struct request_queue *q) { return 0; }
 static inline void blk_throtl_exit(struct request_queue *q) { }
 static inline void blk_throtl_register_queue(struct request_queue *q) { }
-static inline void blk_throtl_charge_bio_split(struct bio *bio) { }
 static inline bool blk_throtl_bio(struct bio *bio) { return false; }
 #else /* CONFIG_BLK_DEV_THROTTLING */
 int blk_throtl_init(struct request_queue *q);
 void blk_throtl_exit(struct request_queue *q);
 void blk_throtl_register_queue(struct request_queue *q);
-void blk_throtl_charge_bio_split(struct bio *bio);
 bool __blk_throtl_bio(struct bio *bio);
 static inline bool blk_throtl_bio(struct bio *bio)
 {
-- 
2.31.1



* Re: [RFC PATCH 0/7] block: improve iops limit throttle
       [not found]   ` <d7c6f47.2d01.17e51b058cc.Coremail.lining2020x@163.com>
@ 2022-01-17  7:46     ` 163
  0 siblings, 0 replies; 14+ messages in thread
From: 163 @ 2022-01-17  7:46 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe
  Cc: linux-block, Tejun Heo, Chunguang Xu, weijia.liu, chao.yang, ning.a.li

(Sorry for sending this mail again; I am not sure you saw the last one,
since it was not in plain text format.)

Hi all,

In short, Ming’s patchset was tested against kernel master and works well.  Tested-by: lining <lining2020x@163.com> 


The iops limit test result on the dm-thin LV looks pretty much perfect, and IO was much smoother during the testing, very cool!

The original issue is #bz2027241, titled `cgroup2: wiops limit doesn't work when doing sequential write on a thin provisioned lv`;
there are more details about this issue and the patch work in the bugzilla ticket [1].

My test results and test script follow (iops was limited to 10 in my case):
- dm thin case test result: https://pastebin.com/raw/VgvB2TFY
- dm plain case test result: https://pastebin.com/raw/UVZMYjTp
- test script: https://pastebin.com/raw/YXEDH6Ss
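
(A rough sketch of this kind of test flow is included below in case the
pastebin links become unavailable. This is not the actual script from the
links above; the device path, cgroup name and fio parameters are
illustrative assumptions.)

```
#!/bin/bash
# Illustrative sketch only -- not the actual test script linked above.
# Assumes cgroup2 is mounted at /sys/fs/cgroup and the dm-thin LV under
# test is /dev/mapper/thin-lv (both paths are assumptions).
DEV=/dev/mapper/thin-lv
MAJMIN=$(lsblk -no MAJ:MIN "$DEV" | tr -d ' ')

# enable the io controller, create a cgroup and set a 10 wiops limit
echo "+io" > /sys/fs/cgroup/cgroup.subtree_control
mkdir -p /sys/fs/cgroup/iotest
echo "$MAJMIN wiops=10" > /sys/fs/cgroup/iotest/io.max
echo $$ > /sys/fs/cgroup/iotest/cgroup.procs

# sequential direct write; watch write iops with iostat in another terminal
fio --name=seqwrite --filename="$DEV" --rw=write --bs=1M --direct=1 \
    --ioengine=libaio --iodepth=8 --runtime=60 --time_based
```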
 
Here is my code tree info for testing:
```
root@ubuntu-r:~/backup/linux# git remote -v 
origin	https://github.com/torvalds/linux.git (fetch)
origin	https://github.com/torvalds/linux.git (push)

root@ubuntu-r:~/backup/linux# git branch -a
* master
  remotes/origin/master

root@ubuntu-r:~/backup/linux# git log --oneline -10
9b977519b97c (HEAD -> master) block: revert 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios")
36beefdf7492 block: don't try to throttle split bio if iops limit isn't set
722ff89df455 block: throttle split bio in case of iops limit
778b7c819d8c block: don't check bio in blk_throtl_dispatch_work_fn
c803548b4623 block: allow to bypass bio check before submitting bio
936bc02492c2 block: move blk_crypto_bio_prep() out of blk-mq.c
98e2c0e19ca6 block: move submit_bio_checks() into submit_bio_noacct
455e73a07f6e (origin/master) Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
d9b5941bb593 Merge tag 'leds-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds
4eb766f64d12 Merge tag 'devicetree-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
```


[1] bz2027241:  https://bugzilla.redhat.com/show_bug.cgi?id=2027241
 
Thanks,
Ning


* Re: [RFC PATCH 1/7] block: move submit_bio_checks() into submit_bio_noacct
  2022-01-11 11:55 ` [RFC PATCH 1/7] block: move submit_bio_checks() into submit_bio_noacct Ming Lei
@ 2022-01-17  8:20   ` Christoph Hellwig
  0 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2022-01-17  8:20 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Li Ning, Tejun Heo, Chunguang Xu

On Tue, Jan 11, 2022 at 07:55:26PM +0800, Ming Lei wrote:
> It is cleaner and more readable to check the bio when starting to submit
> it, instead of just before calling ->submit_bio() or blk_mq_submit_bio().
> 
> It also gives us a chance to optimize bio submission without checking the
> bio.

This looks ok, but I'd just remove submit_bio_checks entirely while
we're at it.


* Re: [RFC PATCH 3/7] block: allow to bypass bio check before submitting bio
  2022-01-11 11:55 ` [RFC PATCH 3/7] block: allow to bypass bio check before submitting bio Ming Lei
@ 2022-01-17  8:21   ` Christoph Hellwig
  2022-02-09  8:12     ` Ming Lei
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2022-01-17  8:21 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Li Ning, Tejun Heo, Chunguang Xu


>   * systems and other upper level users of the block layer should use
>   * submit_bio() instead.
>   */
> -void submit_bio_noacct(struct bio *bio)
> +void __submit_bio_noacct(struct bio *bio, bool check)
>  {
> -	if (unlikely(!submit_bio_checks(bio)))
> +	if (unlikely(check && !submit_bio_checks(bio)))
>  		return;

This doesn't make sense as an API - you can just move the checks
into the caller that pass check=true.

Also the lowlevel nocheck API really has no business being exported.


* Re: [RFC PATCH 3/7] block: allow to bypass bio check before submitting bio
  2022-01-17  8:21   ` Christoph Hellwig
@ 2022-02-09  8:12     ` Ming Lei
  2022-02-09  8:16       ` Christoph Hellwig
  0 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2022-02-09  8:12 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, Li Ning, Tejun Heo, Chunguang Xu

On Mon, Jan 17, 2022 at 12:21:37AM -0800, Christoph Hellwig wrote:
> 
> >   * systems and other upper level users of the block layer should use
> >   * submit_bio() instead.
> >   */
> > -void submit_bio_noacct(struct bio *bio)
> > +void __submit_bio_noacct(struct bio *bio, bool check)
> >  {
> > -	if (unlikely(!submit_bio_checks(bio)))
> > +	if (unlikely(check && !submit_bio_checks(bio)))
> >  		return;
> 
> This doesn't make sense as an API - you can just move the checks
> into the caller that pass check=true.

But submit_bio_checks() is a local helper, and it is hard to make it
public to drivers. Not to mention there are lots of callers of
submit_bio_noacct().


Thanks,
Ming



* Re: [RFC PATCH 3/7] block: allow to bypass bio check before submitting bio
  2022-02-09  8:12     ` Ming Lei
@ 2022-02-09  8:16       ` Christoph Hellwig
  2022-02-09  8:35         ` Ming Lei
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2022-02-09  8:16 UTC (permalink / raw)
  To: Ming Lei
  Cc: Christoph Hellwig, Jens Axboe, linux-block, Li Ning, Tejun Heo,
	Chunguang Xu

On Wed, Feb 09, 2022 at 04:12:43PM +0800, Ming Lei wrote:
> On Mon, Jan 17, 2022 at 12:21:37AM -0800, Christoph Hellwig wrote:
> > 
> > >   * systems and other upper level users of the block layer should use
> > >   * submit_bio() instead.
> > >   */
> > > -void submit_bio_noacct(struct bio *bio)
> > > +void __submit_bio_noacct(struct bio *bio, bool check)
> > >  {
> > > -	if (unlikely(!submit_bio_checks(bio)))
> > > +	if (unlikely(check && !submit_bio_checks(bio)))
> > >  		return;
> > 
> > This doesn't make sense as an API - you can just move the checks
> > into the caller that pass check=true.
> 
> But submit_bio_checks() is local helper, and it is hard to make it
> public to drivers. Not mention there are lots of callers to
> submit_bio_noacct().

What I mean is something like:

-void submit_bio_noacct(struct bio *bio)
+void submit_bio_noacct_nocheck(struct bio *bio)
 {
-	if (unlikely(!submit_bio_checks(bio)))
-		return

...

+void submit_bio_noacct(struct bio *bio)
+{
+	if (unlikely(!submit_bio_checks(bio)))
+		return
+	submit_bio_noacct_nocheck(bio);
+}



* Re: [RFC PATCH 3/7] block: allow to bypass bio check before submitting bio
  2022-02-09  8:16       ` Christoph Hellwig
@ 2022-02-09  8:35         ` Ming Lei
  0 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2022-02-09  8:35 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, Li Ning, Tejun Heo, Chunguang Xu

On Wed, Feb 09, 2022 at 12:16:16AM -0800, Christoph Hellwig wrote:
> On Wed, Feb 09, 2022 at 04:12:43PM +0800, Ming Lei wrote:
> > On Mon, Jan 17, 2022 at 12:21:37AM -0800, Christoph Hellwig wrote:
> > > 
> > > >   * systems and other upper level users of the block layer should use
> > > >   * submit_bio() instead.
> > > >   */
> > > > -void submit_bio_noacct(struct bio *bio)
> > > > +void __submit_bio_noacct(struct bio *bio, bool check)
> > > >  {
> > > > -	if (unlikely(!submit_bio_checks(bio)))
> > > > +	if (unlikely(check && !submit_bio_checks(bio)))
> > > >  		return;
> > > 
> > > This doesn't make sense as an API - you can just move the checks
> > > into the caller that pass check=true.
> > 
> > But submit_bio_checks() is a local helper, and it is hard to make it
> > public to drivers. Not to mention there are lots of callers of
> > submit_bio_noacct().
> 
> What I mean is something like:
> 
> -void submit_bio_noacct(struct bio *bio)
> +void submit_bio_noacct_nocheck(struct bio *bio)
>  {
> -	if (unlikely(!submit_bio_checks(bio)))
> -		return
> 
> ...
> 
> +void submit_bio_noacct(struct bio *bio)
> +{
> +	if (unlikely(!submit_bio_checks(bio)))
> +		return
> +	submit_bio_noacct_nocheck(bio);
> +}

OK, that is also very similar to my local version.


Thanks,
Ming


