From: Dietmar Eggemann <firstname.lastname@example.org>
To: Josh Don <email@example.com>
Cc: Ingo Molnar <firstname.lastname@example.org>,
Peter Zijlstra <email@example.com>,
Juri Lelli <firstname.lastname@example.org>,
Vincent Guittot <email@example.com>,
Steven Rostedt <firstname.lastname@example.org>,
Ben Segall <email@example.com>, Mel Gorman <firstname.lastname@example.org>,
Daniel Bristot de Oliveira <email@example.com>,
Paul Turner <firstname.lastname@example.org>,
David Rientjes <email@example.com>,
Oleg Rombakh <firstname.lastname@example.org>,
Viresh Kumar <email@example.com>,
Steve Sistare <firstname.lastname@example.org>,
Tejun Heo <email@example.com>,
Subject: Re: [PATCH] sched: cgroup SCHED_IDLE support
Date: Tue, 15 Jun 2021 12:06:57 +0200
Message-ID: <firstname.lastname@example.org>
On 12/06/2021 01:34, Josh Don wrote:
> On Fri, Jun 11, 2021 at 9:43 AM Dietmar Eggemann
> <email@example.com> wrote:
>> On 10/06/2021 21:14, Josh Don wrote:
>>> Hey Dietmar,
>>> On Thu, Jun 10, 2021 at 5:53 AM Dietmar Eggemann
>>> <firstname.lastname@example.org> wrote:
>>>> Any reason why this should only work on cgroup-v2?
>>> My (perhaps incorrect) assumption that new development should not
>>> extend v1. I'd actually prefer making this work on v1 as well; I'll
>>> add that support.
>>>> struct cftype cpu_legacy_files vs. cpu_files
>>>>> @@ -11340,10 +11408,14 @@ void init_tg_cfs_entry(struct task_group *tg, struct cfs_rq *cfs_rq,
>>>>> static DEFINE_MUTEX(shares_mutex);
>>>>> -int sched_group_set_shares(struct task_group *tg, unsigned long shares)
>>>>> +#define IDLE_WEIGHT sched_prio_to_weight[ARRAY_SIZE(sched_prio_to_weight) - 1]
>>>> Why not 3 ? Like for tasks (WEIGHT_IDLEPRIO)?
>>> Went back and forth on this; on second look, I do think it makes sense
>>> to use the IDLEPRIO weight of 3 here. This gets converted to a 0,
>>> rather than a 1 for display of cpu.weight, which is also actually a
>>> nice property.
>> I'm struggling to see the benefit here.
>> For a taskgroup A: Why setting A/cpu.idle=1 to force a minimum A->shares
>> when you can set it directly via A/cpu.weight (to 1 (minimum))?
>> WEIGHT    cpu.weight    tg->shares
>> 3         0             3072
>> 15        1             15360
>>           1             10240
>> `A/cpu.weight` follows cgroup-v2's `weights` `resource distribution
>> model`* but I can only see `A/cpu.idle` as a layer on top of it forcing
>> `A/cpu.weight` to get its minimum value?
> Setting cpu.idle carries additional properties in addition to just the
> weight. Currently, it primarily includes (a) special wakeup preemption
> handling, and (b) contribution to idle_h_nr_running for the purpose of
> marking a cpu as a sched_idle_cpu(). Essentially, the current
> SCHED_IDLE mechanics. I've also discussed with Peter a potential
> extension to SCHED_IDLE to manipulate vruntime.
Right, I forgot about (b).
But IMHO, (a) could be handled with this special tg->shares value for
SCHED_IDLE.
If there would be a way to open up `cpu.weight`, `cpu.weight.nice` (and
`cpu.shares` for v1) to take a special value for SCHED_IDLE, then you
won't need cpu.idle.
And you could handle the functionality from sched_group_set_idle()
directly in sched_group_set_shares().
In this case sched_group_set_shares() wouldn't have to be rejected on an
idle tg.
A tg would just become !idle by writing a different cpu.weight value.
Currently, if you !idle a tg it gets the default NICE_0_LOAD.
I guess cpu.weight [1, 10000] would be easy, 0 could be taken for that
and mapped into weight = WEIGHT_IDLEPRIO (3, 3072) to call
sched_group_set_shares().
cpu.weight = 1 maps to (10, 10240)
cpu.weight.nice [-20, 19] would be already more complicated, 20?
And for cpu.shares [2, 1 << 18] 0 could be used. The issue here is that
WEIGHT_IDLEPRIO (3, 3072) is a valid value already for shares.
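To make the numbers above concrete, here is a rough userspace sketch of the cpu.weight <-> tg->shares round trip with 0 taken as the special SCHED_IDLE value. It is not kernel code: `CPU_WEIGHT_IDLE`, `cpu_weight_to_shares()`, `shares_to_cpu_weight()` and `div_round_closest()` are names made up for illustration, the constants are copied in by hand, and a 64-bit build (scale_load() shifting by 10) is assumed.

```c
/* Constants mirrored by hand for illustration; 64-bit scale_load() assumed. */
#define CGROUP_WEIGHT_DFL      100ULL
#define CGROUP_WEIGHT_MAX      10000ULL
#define SCHED_FIXEDPOINT_SHIFT 10
#define WEIGHT_IDLEPRIO        3ULL

/* Hypothetical: the special cpu.weight value suggested in this mail. */
#define CPU_WEIGHT_IDLE        0ULL

/* Round-to-nearest unsigned division. */
static unsigned long long div_round_closest(unsigned long long x,
					    unsigned long long d)
{
	return (x + d / 2) / d;
}

/*
 * Value written to cpu.weight -> tg->shares.
 * Returns 0 for an out-of-range write (0 is never a valid shares value).
 */
static unsigned long long cpu_weight_to_shares(unsigned long long cpu_weight)
{
	unsigned long long weight;

	if (cpu_weight > CGROUP_WEIGHT_MAX)
		return 0;

	if (cpu_weight == CPU_WEIGHT_IDLE)
		weight = WEIGHT_IDLEPRIO;	/* assumed special case */
	else
		weight = div_round_closest(cpu_weight * 1024,
					   CGROUP_WEIGHT_DFL);

	return weight << SCHED_FIXEDPOINT_SHIFT;
}

/* tg->shares -> value shown when reading cpu.weight. */
static unsigned long long shares_to_cpu_weight(unsigned long long shares)
{
	unsigned long long weight = shares >> SCHED_FIXEDPOINT_SHIFT;

	return div_round_closest(weight * CGROUP_WEIGHT_DFL, 1024);
}
```

With this, writing 0 gives tg->shares = 3072 and reads back as 0, while writing 1 gives 10240 and reads back as 1, so the idle value stays distinguishable from the existing minimum. The cpu.shares case does not get this for free, since 3072 can already be written there directly.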
> We set the cgroup weight here, since by definition SCHED_IDLE entities
> have the least scheduling weight. From the perspective of your
> question, the analogous statement for tasks would be that we set task
> weight to the min when doing setsched(SCHED_IDLE), even though we
> already have a renice mechanism.
I agree. `cpu.idle = 1` is like setting the task policy to SCHED_IDLE.
And there is even the `cpu.weight.nice` to support the `task - tg`
analogy on nice values.
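For reference, the task side of that analogy from userspace looks roughly like this. It is only a sketch: `set_self_sched_idle()` is a made-up helper name, and the `SCHED_IDLE` fallback define assumes the Linux value of 5 for libcs that hide it without _GNU_SOURCE.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Fallback if the libc headers do not expose it (Linux value assumed). */
#ifndef SCHED_IDLE
#define SCHED_IDLE 5
#endif

/*
 * Hypothetical helper: switch the calling thread to SCHED_IDLE.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int set_self_sched_idle(void)
{
	/* sched_priority must be 0 for the non-realtime policies. */
	struct sched_param param = { .sched_priority = 0 };

	return sched_setscheduler(0, SCHED_IDLE, &param);
}
```

The policy switch is a separate knob from nice, just as `cpu.idle` is proposed as a separate knob from `cpu.weight`.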
I'm just wondering if integrating this into `cpu.weight` and friends
would be better to make the code behind this easier to grasp.