linux-kernel.vger.kernel.org archive mirror
* [PATCH -next] blk-throttle: enable io throttle for root in cgroup v2
@ 2022-01-14  9:30 Yu Kuai
  2022-01-26 17:29 ` Tejun Heo
  2022-02-09  3:14 ` Ming Lei
  0 siblings, 2 replies; 11+ messages in thread
From: Yu Kuai @ 2022-01-14  9:30 UTC (permalink / raw)
  To: tj, axboe; +Cc: cgroups, linux-block, linux-kernel, yukuai3, yi.zhang

RFC patch: https://lkml.org/lkml/2021/9/9/1432

There is a performance problem in our environment:

A host can provide a remote device to different clients. If one client
is under high IO pressure, other clients might be affected.

Limiting the overall iops/bps (io.max) from the client side can fix the
problem; however, the config files do not currently exist in the root
cgroup, which makes this impossible.

This patch enables IO throttling for the root cgroup:
 - enable "io.max" and "io.low" in the root cgroup
 - don't skip the root group in tg_iops_limit() and tg_bps_limit()
 - don't skip the root group in tg_conf_updated()
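With the CFTYPE_NOT_ON_ROOT flags removed, the root-level knob could be
configured as in the following sketch (the 8:0 device numbers and the
100K iops value are illustrative, not from the patch):

```shell
# Assuming cgroup v2 is mounted at /sys/fs/cgroup and the shared
# remote disk is major:minor 8:0 (illustrative values).
echo "8:0 riops=100000 wiops=100000" > /sys/fs/cgroup/io.max

# Read the configured limit back:
cat /sys/fs/cgroup/io.max
```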

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-throttle.c | 13 ++-----------
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 7c462c006b26..ac25bfbbfe7f 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -156,9 +156,6 @@ static uint64_t tg_bps_limit(struct throtl_grp *tg, int rw)
 	struct throtl_data *td;
 	uint64_t ret;
 
-	if (cgroup_subsys_on_dfl(io_cgrp_subsys) && !blkg->parent)
-		return U64_MAX;
-
 	td = tg->td;
 	ret = tg->bps[rw][td->limit_index];
 	if (ret == 0 && td->limit_index == LIMIT_LOW) {
@@ -186,9 +183,6 @@ static unsigned int tg_iops_limit(struct throtl_grp *tg, int rw)
 	struct throtl_data *td;
 	unsigned int ret;
 
-	if (cgroup_subsys_on_dfl(io_cgrp_subsys) && !blkg->parent)
-		return UINT_MAX;
-
 	td = tg->td;
 	ret = tg->iops[rw][td->limit_index];
 	if (ret == 0 && tg->td->limit_index == LIMIT_LOW) {
@@ -1284,9 +1278,8 @@ static void tg_conf_updated(struct throtl_grp *tg, bool global)
 		struct throtl_grp *parent_tg;
 
 		tg_update_has_rules(this_tg);
-		/* ignore root/second level */
-		if (!cgroup_subsys_on_dfl(io_cgrp_subsys) || !blkg->parent ||
-		    !blkg->parent->parent)
+		/* ignore root level */
+		if (!cgroup_subsys_on_dfl(io_cgrp_subsys) || !blkg->parent)
 			continue;
 		parent_tg = blkg_to_tg(blkg->parent);
 		/*
@@ -1625,7 +1618,6 @@ static struct cftype throtl_files[] = {
 #ifdef CONFIG_BLK_DEV_THROTTLING_LOW
 	{
 		.name = "low",
-		.flags = CFTYPE_NOT_ON_ROOT,
 		.seq_show = tg_print_limit,
 		.write = tg_set_limit,
 		.private = LIMIT_LOW,
@@ -1633,7 +1625,6 @@ static struct cftype throtl_files[] = {
 #endif
 	{
 		.name = "max",
-		.flags = CFTYPE_NOT_ON_ROOT,
 		.seq_show = tg_print_limit,
 		.write = tg_set_limit,
 		.private = LIMIT_MAX,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH -next] blk-throttle: enable io throttle for root in cgroup v2
  2022-01-14  9:30 [PATCH -next] blk-throttle: enable io throttle for root in cgroup v2 Yu Kuai
@ 2022-01-26 17:29 ` Tejun Heo
  2022-01-27  2:36   ` yukuai (C)
  2022-02-09  3:14 ` Ming Lei
  1 sibling, 1 reply; 11+ messages in thread
From: Tejun Heo @ 2022-01-26 17:29 UTC (permalink / raw)
  To: Yu Kuai; +Cc: axboe, cgroups, linux-block, linux-kernel, yi.zhang

On Fri, Jan 14, 2022 at 05:30:00PM +0800, Yu Kuai wrote:
> RFC patch: https://lkml.org/lkml/2021/9/9/1432
> 
> There is a performance problem in our environment:
> 
> A host can provide a remote device to different clients. If one client
> is under high IO pressure, other clients might be affected.
> 
> Limiting the overall iops/bps (io.max) from the client side can fix the
> problem; however, the config files do not currently exist in the root
> cgroup, which makes this impossible.
> 
> This patch enables IO throttling for the root cgroup:
>  - enable "io.max" and "io.low" in the root cgroup
>  - don't skip the root group in tg_iops_limit() and tg_bps_limit()
>  - don't skip the root group in tg_conf_updated()
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>

Yeah, I'm kinda split. It's a simple change with some utility, but it's also
something which doesn't fit with the cgroup feature or interface. It's
regulating the whole system behavior. There's no reason for any of the
control "groups" to be involved here and semantically the interface would
fit a lot better under /proc, /sys or some other system-wide location. Here
are some points to consider:

* As a comparison, it'd be rather absurd to enable memory.max at the system
  root in terms of interface, and it would most likely break a whole lot of
  mm operations.

* Resource control knobs of a cgroup belong to the parent as the parent is
  responsible for divvying up the available resources to its children. Here
  too, the knobs are making sense because there's a higher level parent
  (whether that's hypervisor or some network server).

Is your use case VMs or network attached storage?

Thanks.

-- 
tejun


* Re: [PATCH -next] blk-throttle: enable io throttle for root in cgroup v2
  2022-01-26 17:29 ` Tejun Heo
@ 2022-01-27  2:36   ` yukuai (C)
  2022-02-01 17:20     ` Tejun Heo
  0 siblings, 1 reply; 11+ messages in thread
From: yukuai (C) @ 2022-01-27  2:36 UTC (permalink / raw)
  To: Tejun Heo; +Cc: axboe, cgroups, linux-block, linux-kernel, yi.zhang

On 2022/01/27 1:29, Tejun Heo wrote:
> On Fri, Jan 14, 2022 at 05:30:00PM +0800, Yu Kuai wrote:
>> RFC patch: https://lkml.org/lkml/2021/9/9/1432
>>
>> There is a performance problem in our environment:
>>
>> A host can provide a remote device to different clients. If one client
>> is under high IO pressure, other clients might be affected.
>>
>> Limiting the overall iops/bps (io.max) from the client side can fix the
>> problem; however, the config files do not currently exist in the root
>> cgroup, which makes this impossible.
>>
>> This patch enables IO throttling for the root cgroup:
>>   - enable "io.max" and "io.low" in the root cgroup
>>   - don't skip the root group in tg_iops_limit() and tg_bps_limit()
>>   - don't skip the root group in tg_conf_updated()
>>
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> 
> Yeah, I'm kinda split. It's a simple change with some utility, but it's also
> something which doesn't fit with the cgroup feature or interface. It's
> regulating the whole system behavior. There's no reason for any of the
> control "groups" to be involved here and semantically the interface would
> fit a lot better under /proc, /sys or some other system-wide location. Here
> are some points to consider:
> 
> * As a comparison, it'd be rather absurd to enable memory.max at the system
>    root in terms of interface, and it would most likely break a whole lot of
>    mm operations.
> 
> * Resource control knobs of a cgroup belong to the parent as the parent is
>    responsible for divvying up the available resources to its children. Here
>    too, the knobs are making sense because there's a higher level parent
>    (whether that's hypervisor or some network server).
> 
> Is your use case VMs or network attached storage?
> 
Hi,

In our case, the disk is provided by a server, and such a disk can be
shared by multiple clients. Thus, for the client side, the server is a
higher level parent.

Theoretically, limiting the IO from the server for each client is
feasible; however, the main reason we don't want to do this is the
following shortcoming:

clients can still send unlimited IO to the server; we can only limit the
amount of IO that completes on the server side, which might cause too
much pressure on the server.

Thanks,
Kuai


* Re: [PATCH -next] blk-throttle: enable io throttle for root in cgroup v2
  2022-01-27  2:36   ` yukuai (C)
@ 2022-02-01 17:20     ` Tejun Heo
  2022-02-08  1:38       ` yukuai (C)
  0 siblings, 1 reply; 11+ messages in thread
From: Tejun Heo @ 2022-02-01 17:20 UTC (permalink / raw)
  To: yukuai (C); +Cc: axboe, cgroups, linux-block, linux-kernel, yi.zhang

Hello,

On Thu, Jan 27, 2022 at 10:36:38AM +0800, yukuai (C) wrote:
> In our case, the disk is provided by a server, and such a disk can be
> shared by multiple clients. Thus, for the client side, the server is a
> higher level parent.
> 
> Theoretically, limiting the IO from the server for each client is
> feasible; however, the main reason we don't want to do this is the
> following shortcoming:
> 
> clients can still send unlimited IO to the server; we can only limit the
> amount of IO that completes on the server side, which might cause too
> much pressure on the server.

I don't quite follow the "send io to server unlimited" part. Doesn't that
get limited by the available number of requests? I.e. if the server
throttles, the in-flight requests will take longer to complete, which
exhausts the available requests and thus slows down the client. That's how
it's supposed to work on the local machine too.

Thanks.

-- 
tejun


* Re: [PATCH -next] blk-throttle: enable io throttle for root in cgroup v2
  2022-02-01 17:20     ` Tejun Heo
@ 2022-02-08  1:38       ` yukuai (C)
  2022-02-08 18:49         ` Tejun Heo
  0 siblings, 1 reply; 11+ messages in thread
From: yukuai (C) @ 2022-02-08  1:38 UTC (permalink / raw)
  To: Tejun Heo; +Cc: axboe, cgroups, linux-block, linux-kernel, yi.zhang

On 2022/02/02 1:20, Tejun Heo wrote:
> Hello,
> 
> On Thu, Jan 27, 2022 at 10:36:38AM +0800, yukuai (C) wrote:
>> In our case, the disk is provided by a server, and such a disk can be
>> shared by multiple clients. Thus, for the client side, the server is a
>> higher level parent.
>>
>> Theoretically, limiting the IO from the server for each client is
>> feasible; however, the main reason we don't want to do this is the
>> following shortcoming:
>>
>> clients can still send unlimited IO to the server; we can only limit the
>> amount of IO that completes on the server side, which might cause too
>> much pressure on the server.
> 
> I don't quite follow the "send io to server unlimited" part. Doesn't that
> get limited by the available number of requests?

Hi, Tejun

Here I mean that the IO is not limited through IO throttling on the client
side. Of course, the IO must be limited by the available number of requests.

> ie. if the server throttles,
> the in-flight requests will take longer to complete which exhausts the
> available requests and thus slows down the client.

For example, suppose we have 8 clients and the number of available
requests is 64.

1) If we don't limit iops, and each client sends 64 IOs to the server,
some clients might have performance issues.

2) If we limit iops to 8 on the client side, then the server can receive
at most 64 IOs at the same time; both client and server should work fine.

3) If we limit iops to 8 for each client on the server side, the clients
should work fine; however, the server can receive at most 64 x 8 = 512
IOs at the same time, which might cause too much pressure on the server
(maybe bps is more appropriate to explain the high pressure).

Thus I prefer to limit IO from the client side.
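The arithmetic behind the three scenarios above can be sketched as
follows (just an illustration using the client count and queue depth
from this example):

```python
clients = 8
queue_depth = 64       # available requests per client
client_iops_limit = 8  # per-client limit enforced on the client side

# 1) No limit: every client can keep its full queue in flight,
#    so the server may see all of them at once.
no_limit = clients * queue_depth

# 2) Limit enforced on the client side: at most 8 in-flight IOs
#    per client ever reach the server.
client_side = clients * client_iops_limit

# 3) Limit enforced on the server side: clients still submit up to
#    their full queue depth before the server-side throttle bites.
server_side = clients * queue_depth

print(no_limit, client_side, server_side)  # 512 64 512
```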

Thanks,
Kuai
> That's how it's supposed
> to work on the local machine too.
> 
> Thanks.
> 


* Re: [PATCH -next] blk-throttle: enable io throttle for root in cgroup v2
  2022-02-08  1:38       ` yukuai (C)
@ 2022-02-08 18:49         ` Tejun Heo
  2022-02-09  1:22           ` yukuai (C)
  0 siblings, 1 reply; 11+ messages in thread
From: Tejun Heo @ 2022-02-08 18:49 UTC (permalink / raw)
  To: yukuai (C); +Cc: axboe, cgroups, linux-block, linux-kernel, yi.zhang

Hello,

On Tue, Feb 08, 2022 at 09:38:33AM +0800, yukuai (C) wrote:
> 3) If we limit iops to 8 for each client on the server side, the clients
> should work fine; however, the server can receive at most 64 x 8 = 512
> IOs at the same time, which might cause too much pressure on the server
> (maybe bps is more appropriate to explain the high pressure).

I don't follow this part. Why can't the server control it however it
wants to?

Thanks.

-- 
tejun


* Re: [PATCH -next] blk-throttle: enable io throttle for root in cgroup v2
  2022-02-08 18:49         ` Tejun Heo
@ 2022-02-09  1:22           ` yukuai (C)
  2023-02-06 15:10             ` Michal Koutný
  0 siblings, 1 reply; 11+ messages in thread
From: yukuai (C) @ 2022-02-09  1:22 UTC (permalink / raw)
  To: Tejun Heo; +Cc: axboe, cgroups, linux-block, linux-kernel, yi.zhang

On 2022/02/09 2:49, Tejun Heo wrote:
> Hello,
> 
> On Tue, Feb 08, 2022 at 09:38:33AM +0800, yukuai (C) wrote:
>> 3) If we limit iops to 8 for each client on the server side, the clients
>> should work fine; however, the server can receive at most 64 x 8 = 512
>> IOs at the same time, which might cause too much pressure on the server
>> (maybe bps is more appropriate to explain the high pressure).
> 
> I don't follow this part. Why can't the server control it however it
> wants to?

Hi,

Do you agree that the server can't control how many IOs it receives from
one client if we limit on the server side? I think the difference is that
limiting from the client side can control it...

Thanks,
Kuai
> 
> Thanks.
> 


* Re: [PATCH -next] blk-throttle: enable io throttle for root in cgroup v2
  2022-01-14  9:30 [PATCH -next] blk-throttle: enable io throttle for root in cgroup v2 Yu Kuai
  2022-01-26 17:29 ` Tejun Heo
@ 2022-02-09  3:14 ` Ming Lei
  2023-02-05 15:55   ` Ofir Gal
  1 sibling, 1 reply; 11+ messages in thread
From: Ming Lei @ 2022-02-09  3:14 UTC (permalink / raw)
  To: Yu Kuai; +Cc: tj, axboe, cgroups, linux-block, linux-kernel, yi.zhang

Hello Yu Kuai,

On Fri, Jan 14, 2022 at 05:30:00PM +0800, Yu Kuai wrote:
> RFC patch: https://lkml.org/lkml/2021/9/9/1432
> 
> There is a performance problem in our environment:
> 
> A host can provide a remote device to different clients. If one client
> is under high IO pressure, other clients might be affected.

Can you use common Linux kernel storage terms to describe the issue?
For example, I guess "host" here means the target server (iSCSI or NVMe
target?), and "client" should be the SCSI initiator or NVMe host. If
not, can you provide an actual example of your storage use case?

With common terms used, it becomes much easier for people to understand
and solve the issue, and avoids any misunderstanding.

> 
> Limit the overall iops/bps(io.max) from the client can fix the problem,

I'm just curious how each client can figure out the perfect iops/bps
limit, given that one client doesn't know how many clients are connected
to the target server.

It sounds like the throttling shouldn't be done in the client-side
cgroup, given that the throttling has nothing to do with tasks.

Maybe it should be done on the server side, since the server has enough
information to provide fair iops/bps allocation for each client.


Thanks, 
Ming



* [PATCH -next] blk-throttle: enable io throttle for root in cgroup v2
  2022-02-09  3:14 ` Ming Lei
@ 2023-02-05 15:55   ` Ofir Gal
  2023-02-06 15:00     ` Michal Koutný
  0 siblings, 1 reply; 11+ messages in thread
From: Ofir Gal @ 2023-02-05 15:55 UTC (permalink / raw)
  To: ming.lei
  Cc: axboe, cgroups, linux-block, linux-kernel, tj, yi.zhang, yukuai3,
	Ofir Gal

From: Ofir Gal <ofir@gal.software>

Hello Ming Lei,

I am trying to use cgroup v2 to throttle a media disk that is controlled
by an NVMe target. Unfortunately, this cannot be done without setting the
limit in the root cgroup. It can be done via cgroup v1, and Yu Kuai's
patch allows it to be accomplished in v2.
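For reference, here is a sketch of the cgroup v1 configuration that works
today (the 8:0 device numbers and the 100K limit are illustrative):

```shell
# cgroup v1 exposes the throttle knobs in the root blkio group.
# Limit device 8:0 to 100K read and write IOPS:
echo "8:0 100000" > /sys/fs/cgroup/blkio/blkio.throttle.read_iops_device
echo "8:0 100000" > /sys/fs/cgroup/blkio/blkio.throttle.write_iops_device
```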

My setup consists of 3 servers.
Server #1:
    a. SSD media disk (needs to be throttled to 100K IOPS)
    b. NVMe target controlling the SSD (1.a)

Server #2:
    a. NVMe initiator connected to Server #1's NVMe target (1.b)

Server #3:
    a. NVMe initiator connected to Server #1's NVMe target (1.b)

My setup accesses this media from multiple servers using NVMe over TCP,
but the initiator servers' workloads are unknown and can change
dynamically. I need to limit the media disk to 100K IOPS on the target
side.

I have tried to limit the SSD on Server #1, but it seems that the NVMe
target kworkers are not affected unless I use Yu Kuai's patch.

Can you elaborate on the issues with this patch, or on how the scenario
described above can be achieved with cgroup v2?

Best regards, Ofir Gal.


* Re: [PATCH -next] blk-throttle: enable io throttle for root in cgroup v2
  2023-02-05 15:55   ` Ofir Gal
@ 2023-02-06 15:00     ` Michal Koutný
  0 siblings, 0 replies; 11+ messages in thread
From: Michal Koutný @ 2023-02-06 15:00 UTC (permalink / raw)
  To: Ofir Gal
  Cc: ming.lei, axboe, cgroups, linux-block, linux-kernel, tj,
	yi.zhang, yukuai3, Ofir Gal


On Sun, Feb 05, 2023 at 05:55:41PM +0200, Ofir Gal <ofir.gal@volumez.com> wrote:
> I have tried to limit the SSD on Server #1, but it seems that the NVME
> target kworkers are not affected unless I use Yu Kuai's patch.
> 
> Can you elaborate on the issues with this patch or how the scenario
> described above can be done with cgroups v2?

The issue is that if there's a client that doesn't implement
self-throttling, you cannot guarantee anything on the server side.
Hence the mechanism must exist on the server side.

The NVMe target should charge IO to the respective blkcgs (just generic
advice, I'm not familiar with that interface; see also the
kthread_associate_blkcg() use in the loop device driver).

HTH,
Michal



* Re: [PATCH -next] blk-throttle: enable io throttle for root in cgroup v2
  2022-02-09  1:22           ` yukuai (C)
@ 2023-02-06 15:10             ` Michal Koutný
  0 siblings, 0 replies; 11+ messages in thread
From: Michal Koutný @ 2023-02-06 15:10 UTC (permalink / raw)
  To: yukuai (C); +Cc: Tejun Heo, axboe, cgroups, linux-block, linux-kernel, yi.zhang


Hello Kuai.

On Wed, Feb 09, 2022 at 09:22:30AM +0800, "yukuai (C)" <yukuai3@huawei.com> wrote:
> Do you agree that the server can't control how many io it can receives
> from one client if we limit from server? I think the difference is that
> limit from client can control it...

(Perhaps it depends on the protocol used for the IO, but) eventually the
client's requests would be noticeably lost/dropped, and that's how the
server propagates the requested control onto the clients without relying
on the clients throttling themselves.

(Maybe a better place to implement this would be a "testing" device
mapper target akin to the 'delay' one.)

Regards,
Michal



