From: Yu Kuai <yukuai1@huaweicloud.com>
To: tj@kernel.org, mkoutny@suse.com, axboe@kernel.dk, ming.lei@redhat.com
Cc: cgroups@vger.kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, yukuai3@huawei.com,
	yukuai1@huaweicloud.com, yi.zhang@huawei.com
Subject: [PATCH v7 1/9] blk-throttle: fix that io throttle can only work for single bio
Date: Tue,  2 Aug 2022 22:04:07 +0800
Message-ID: <20220802140415.2960284-2-yukuai1@huaweicloud.com>
In-Reply-To: <20220802140415.2960284-1-yukuai1@huaweicloud.com>

From: Yu Kuai <yukuai3@huawei.com>

Test scripts:
cd /sys/fs/cgroup/blkio/
echo "8:0 1024" > blkio.throttle.write_bps_device
echo $$ > cgroup.procs
dd if=/dev/zero of=/dev/sda bs=10k count=1 oflag=direct &
dd if=/dev/zero of=/dev/sda bs=10k count=1 oflag=direct &

Test result:
10240 bytes (10 kB, 10 KiB) copied, 10.0134 s, 1.0 kB/s
10240 bytes (10 kB, 10 KiB) copied, 10.0135 s, 1.0 kB/s

The problem is that the second bio finishes after 10s instead of the
expected 20s: at 1024 bytes/s, two 10240-byte writes need 20s in total.
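
As a rough check of the expected timing, the arithmetic can be sketched in
C. The helper name expected_finish_secs() is hypothetical and not part of
blk-throttle; it assumes equally sized bios dispatched strictly in FIFO
order under a single bytes-per-second limit:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not kernel code): earliest completion time, in
 * seconds, of the nth (1-based) bio when equally sized bios are
 * dispatched FIFO under a bytes-per-second limit.
 */
static uint64_t expected_finish_secs(uint64_t bio_size, uint64_t bps_limit,
				     unsigned int nth)
{
	uint64_t total;

	assert(bps_limit != 0);

	/* All bytes up to and including the nth bio must fit the budget. */
	total = bio_size * nth;

	/* Round up: a partially used second still has to elapse. */
	return (total + bps_limit - 1) / bps_limit;
}
```

With the values from the test script (bio_size = 10240, bps_limit = 1024),
the first bio is due at 10s and the second at 20s.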

Root cause:
1) The second bio will be flagged:

__blk_throtl_bio
 while (true) {
  ...
  if (sq->nr_queued[rw]) -> some bio is throttled already
   break
 };
 bio_set_flag(bio, BIO_THROTTLED); -> flag the bio

2) The flagged bio will be dispatched without waiting:

throtl_dispatch_tg
 tg_may_dispatch
  tg_with_in_bps_limit
   if (bps_limit == U64_MAX || bio_flagged(bio, BIO_THROTTLED))
    *wait = 0; -> wait time is zero
    return true;
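
The effect of that check can be illustrated with a simplified model. This
is only a sketch: model_bps_wait_secs() and its skip_flagged parameter are
hypothetical, time is counted in whole seconds rather than jiffies, and
throttle slices are ignored:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model (not kernel code) of the wait computed for the bps
 * limit. skip_flagged == true mimics the check before this patch,
 * skip_flagged == false mimics the check after it.
 */
static uint64_t model_bps_wait_secs(uint64_t bytes_disp, uint64_t bio_size,
				    uint64_t bps_limit, uint64_t elapsed_secs,
				    bool flagged, bool skip_flagged)
{
	uint64_t allowed, extra;

	assert(bps_limit != 0);

	/* No limit, or (old behaviour) bio already flagged: no wait. */
	if (bps_limit == UINT64_MAX || (skip_flagged && flagged))
		return 0;

	/* Bytes the group may have dispatched by now. */
	allowed = bps_limit * elapsed_secs;
	if (bytes_disp + bio_size <= allowed)
		return 0;

	/* Time needed to earn the missing budget, rounded up. */
	extra = bytes_disp + bio_size - allowed;
	return (extra + bps_limit - 1) / bps_limit;
}
```

For the second bio in the test above (10240 bytes already dispatched after
10s, limit 1024), the model returns 0 with skip_flagged set and 10 without
it.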

Commit 9f5ede3c01f9 ("block: throttle split bio in case of iops limit")
added support for counting split bios against the iops limit. To make
sure split bios are counted only once against the bps limit, it added a
check for flagged bios in tg_with_in_bps_limit(). However, this
introduced a new problem: io throttling does not work if multiple bios
are throttled.

To fix the problem, first stop skipping flagged bios in
tg_with_in_bps_limit(). On its own, this would break the rule that split
bios are counted only once against the bps limit. This patch therefore
avoids the over-accounting by decrementing the bio's bytes in
__blk_throtl_bio() first and then counting them again when the bio is
dispatched.
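
The decrement-then-recount idea can be sketched with a small user-space
model. The types and helpers below (tg_model, bio_model, model_enter(),
model_charge()) are hypothetical stand-ins, not the kernel structures, and
they track only a single byte counter:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the byte counter in struct throtl_grp. */
struct tg_model {
	uint64_t bytes_disp;
};

/* Hypothetical stand-in for a bio: size plus the BIO_THROTTLED flag. */
struct bio_model {
	unsigned int size;
	bool throttled;
};

/*
 * Models the entry path: a re-entered (already flagged) bio has its
 * bytes removed so that the later charge does not count them twice.
 * The subtraction is skipped if the counter is too small, e.g. because
 * a new slice has reset it.
 */
static void model_enter(struct tg_model *tg, struct bio_model *bio)
{
	assert(tg && bio);

	if (bio->throttled && tg->bytes_disp >= bio->size)
		tg->bytes_disp -= bio->size;
	bio->throttled = true;
}

/* Models the charge at dispatch: bytes are always added. */
static void model_charge(struct tg_model *tg, const struct bio_model *bio)
{
	assert(tg && bio);

	tg->bytes_disp += bio->size;
}
```

In this model, entering and charging a 4096-byte bio twice leaves
bytes_disp at 4096 rather than 8192. If the counter was reset to 0 between
the two passes, the subtraction is skipped and the second charge still
adds 4096, which corresponds to the case where compensation fails.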

Fixes: 9f5ede3c01f9 ("block: throttle split bio in case of iops limit")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-throttle.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 9f5fe62afff9..2957e2c643f4 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -811,7 +811,7 @@ static bool tg_with_in_bps_limit(struct throtl_grp *tg, struct bio *bio,
 	unsigned int bio_size = throtl_bio_data_size(bio);
 
 	/* no need to throttle if this bio's bytes have been accounted */
-	if (bps_limit == U64_MAX || bio_flagged(bio, BIO_THROTTLED)) {
+	if (bps_limit == U64_MAX) {
 		if (wait)
 			*wait = 0;
 		return true;
@@ -921,11 +921,8 @@ static void throtl_charge_bio(struct throtl_grp *tg, struct bio *bio)
 	unsigned int bio_size = throtl_bio_data_size(bio);
 
 	/* Charge the bio to the group */
-	if (!bio_flagged(bio, BIO_THROTTLED)) {
-		tg->bytes_disp[rw] += bio_size;
-		tg->last_bytes_disp[rw] += bio_size;
-	}
-
+	tg->bytes_disp[rw] += bio_size;
+	tg->last_bytes_disp[rw] += bio_size;
 	tg->io_disp[rw]++;
 	tg->last_io_disp[rw]++;
 
@@ -2121,6 +2118,23 @@ bool __blk_throtl_bio(struct bio *bio)
 			tg->last_low_overflow_time[rw] = jiffies;
 		throtl_downgrade_check(tg);
 		throtl_upgrade_check(tg);
+
+		/*
+		 * Splited bios can be re-entered because iops limit should be
+		 * counted again, however, bps limit should not. Since bps limit
+		 * will be counted again while dispatching it, compensate the
+		 * over-accounting here. Noted that compensation can fail if
+		 * new slice is started.
+		 */
+		if (bio_flagged(bio, BIO_THROTTLED)) {
+			unsigned int bio_size = throtl_bio_data_size(bio);
+
+			if (tg->bytes_disp[rw] >= bio_size)
+				tg->bytes_disp[rw] -= bio_size;
+			if (tg->last_bytes_disp[rw] >= bio_size)
+				tg->last_bytes_disp[rw] -= bio_size;
+		}
+
 		/* throtl is FIFO - if bios are already queued, should queue */
 		if (sq->nr_queued[rw])
 			break;
-- 
2.31.1

