From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Ming Lei <ming.lei@redhat.com>, Ning Li <lining2020x@163.com>,
	Tejun Heo <tj@kernel.org>, Chunguang Xu <brookxu@tencent.com>,
	Jens Axboe <axboe@kernel.dk>, Sasha Levin <sashal@kernel.org>,
	linux-block@vger.kernel.org, cgroups@vger.kernel.org
Subject: [PATCH AUTOSEL 5.17 09/43] block: throttle split bio in case of iops limit
Date: Mon, 28 Mar 2022 07:17:53 -0400
Message-ID: <20220328111828.1554086-9-sashal@kernel.org>
In-Reply-To: <20220328111828.1554086-1-sashal@kernel.org>

From: Ming Lei <ming.lei@redhat.com>

[ Upstream commit 9f5ede3c01f9951b0ae7d68b28762ad51d9bacc8 ]

Commit 111be8839817 ("block-throttle: avoid double charge") unconditionally
marks a bio as BIO_THROTTLED once __blk_throtl_bio() has been called on it,
so the bio never enters __blk_throtl_bio() again. This avoids a double
charge when the bio is split. That is reasonable for the read/write
throughput limit, but not for the IOPS limit, because the block layer
provides IO accounting against each split bio.

Chunguang Xu has already observed this issue and fixed it in commit
4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios").
However, that patch only covers bio splitting in __blk_queue_split(), and
there are other kinds of bio splitting, such as bio_split() followed by
submit_bio_noacct(), among others.

This patch tries to fix the issue in a generic way by always charging the
bio against the IOPS limit in blk_throtl_bio(). This is reasonable: a
re-submitted or fast-cloned bio is charged if it is submitted to the same
disk/queue, and BIO_THROTTLED is cleared if bio->bi_bdev is changed.

The new approach gives a much smoother and more stable IOPS limit than
commit 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO
scenarios"), since that commit cannot actually throttle the current split
bios.

This approach also does not introduce a new double IOPS charge in
blk_throtl_dispatch_work_fn(), because blk_throtl_bio() is no longer
called on that path.
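
The charging rule described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not
kernel code: struct model_bio, struct model_tg, model_charge_bio() and
model_skips_bps_check() are hypothetical stand-ins, the "throttled"
field models the BIO_THROTTLED flag, and setting the flag at the charge
point is a simplification of where the kernel actually sets it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct model_bio {
	uint32_t size;       /* bytes carried by this bio */
	bool throttled;      /* models the BIO_THROTTLED flag */
};

struct model_tg {
	uint64_t bytes_disp; /* bytes charged to the group */
	uint64_t io_disp;    /* IOs charged to the group */
};

/*
 * Models the bps early-exit added by the patch: a bio whose bytes have
 * already been accounted is never delayed by the throughput limit.
 */
static bool model_skips_bps_check(uint64_t bps_limit,
				  const struct model_bio *bio)
{
	return bps_limit == UINT64_MAX || bio->throttled;
}

/*
 * Models the patched charging rule: bytes are charged only the first
 * time a bio is seen, while the IO count is charged on every pass.
 * Setting the flag here is a simplification made for this model.
 */
static void model_charge_bio(struct model_tg *tg, struct model_bio *bio)
{
	if (!bio->throttled)
		tg->bytes_disp += bio->size;
	tg->io_disp++;
	bio->throttled = true;
}
```

In this model, charging the same bio twice (as happens when the
remainder of a split bio is re-submitted to the same queue) adds its
bytes once but counts two IOs, which is the behaviour the patch aims for.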

Reported-by: Ning Li <lining2020x@163.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Chunguang Xu <brookxu@tencent.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20220216044514.2903784-7-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 block/blk-merge.c    |  2 --
 block/blk-throttle.c | 10 +++++++---
 block/blk-throttle.h |  2 --
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 4de34a332c9f..f5255991b773 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -368,8 +368,6 @@ void __blk_queue_split(struct request_queue *q, struct bio **bio,
 		trace_block_split(split, (*bio)->bi_iter.bi_sector);
 		submit_bio_noacct(*bio);
 		*bio = split;
-
-		blk_throtl_charge_bio_split(*bio);
 	}
 }
 
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 7c462c006b26..87769b337fc5 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -808,7 +808,8 @@ static bool tg_with_in_bps_limit(struct throtl_grp *tg, struct bio *bio,
 	unsigned long jiffy_elapsed, jiffy_wait, jiffy_elapsed_rnd;
 	unsigned int bio_size = throtl_bio_data_size(bio);
 
-	if (bps_limit == U64_MAX) {
+	/* no need to throttle if this bio's bytes have been accounted */
+	if (bps_limit == U64_MAX || bio_flagged(bio, BIO_THROTTLED)) {
 		if (wait)
 			*wait = 0;
 		return true;
@@ -920,9 +921,12 @@ static void throtl_charge_bio(struct throtl_grp *tg, struct bio *bio)
 	unsigned int bio_size = throtl_bio_data_size(bio);
 
 	/* Charge the bio to the group */
-	tg->bytes_disp[rw] += bio_size;
+	if (!bio_flagged(bio, BIO_THROTTLED)) {
+		tg->bytes_disp[rw] += bio_size;
+		tg->last_bytes_disp[rw] += bio_size;
+	}
+
 	tg->io_disp[rw]++;
-	tg->last_bytes_disp[rw] += bio_size;
 	tg->last_io_disp[rw]++;
 
 	/*
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index 175f03abd9e4..cb43f4417d6e 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -170,8 +170,6 @@ static inline bool blk_throtl_bio(struct bio *bio)
 {
 	struct throtl_grp *tg = blkg_to_tg(bio->bi_blkg);
 
-	if (bio_flagged(bio, BIO_THROTTLED))
-		return false;
 	if (!tg->has_rules[bio_data_dir(bio)])
 		return false;
 
-- 
2.34.1

