linux-kernel.vger.kernel.org archive mirror
From: Joel Fernandes <joel@joelfernandes.org>
To: Tejun Heo <tj@kernel.org>, Hao Luo <haoluo@google.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>,
	"Hyser,Chris" <chris.hyser@oracle.com>,
	"Josh Don" <joshdon@google.com>, "Ingo Molnar" <mingo@kernel.org>,
	"Vincent Guittot" <vincent.guittot@linaro.org>,
	"Valentin Schneider" <valentin.schneider@arm.com>,
	"Mel Gorman" <mgorman@suse.de>,
	LKML <linux-kernel@vger.kernel.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Michal Koutný" <mkoutny@suse.com>,
	"Christian Brauner" <christian.brauner@ubuntu.com>,
	"Zefan Li" <lizefan.x@bytedance.com>
Subject: Re: [PATCH 0/9] sched: Core scheduling interfaces
Date: Mon, 5 Apr 2021 14:46:09 -0400
Message-ID: <CAEXW_YSS0ex8xK7t2R7c1jiE4eNbwxdwP2uyGPDK78YAaYQr5A@mail.gmail.com>
In-Reply-To: <YGpOF6f0YcMkWy1u@mtj.duckdns.org>

Hi TJ, Peter,

On Sun, Apr 4, 2021 at 7:39 PM Tejun Heo <tj@kernel.org> wrote:
>
> cc'ing Michal and Christian who've been spending some time on cgroup
> interface issues recently and Li Zefan for cpuset.
>
> On Thu, Apr 01, 2021 at 03:10:12PM +0200, Peter Zijlstra wrote:
> > The cgroup interface now uses a 'core_sched' file, which still takes 0,1. It is
> > however changed such that you can have nested tags. Then for any given task, the
> > first parent with a cookie is the effective one. The rationale is that this way
> > you can delegate subtrees and still allow them some control over grouping.
>
> I find it difficult to like the proposed interface from the name (the term
> "core" is really confusing given how the word tends to be used internally)
> to the semantics (it isn't like anything else) and even the functionality
> (we're gonna have fixed processors at some point, right?).
>
> Here are some preliminary thoughts:
>
> * Are both prctl and cgroup based interfaces really necessary? I could be
>   being naive but given that we're (hopefully) working around hardware
>   deficiencies which will go away in time, I think there's a strong case for
>   minimizing at least the interface to the bare minimum.

I don't think these issues are going away, as new SMT-related exploits
keep coming out. Further, core scheduling is not only for SMT - there
are other use cases as well (such as VM performance, by preventing vCPU
threads from core-sharing).
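
To make the core-sharing point concrete, here is a toy Python model
(not kernel code) of the rule core scheduling enforces: two tasks may
run at the same time on SMT siblings of one core only if their cookies
match. The names Task and can_share_core are made up for illustration,
and the cookie values are arbitrary. A cookie of 0 is assumed to stand
for an untagged task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    cookie: int = 0  # assumption: 0 models an untagged task


def can_share_core(a: Task, b: Task) -> bool:
    """True if a and b may run on sibling hyperthreads at the same time."""
    return a.cookie == b.cookie


# Giving each vCPU thread its own cookie keeps them from core-sharing.
vcpu0 = Task("vm1-vcpu0", cookie=1)
vcpu1 = Task("vm1-vcpu1", cookie=2)

# Two untagged tasks carry the same (zero) cookie and may share a core.
idle_a = Task("untagged-a")
idle_b = Task("untagged-b")
```

In this sketch, can_share_core(vcpu0, vcpu1) is False, while the two
untagged tasks are allowed to share.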

>
>   Given how cgroups are set up (membership operations happening only for
>   seeding, especially with the new clone interface), it isn't too difficult
>   to synchronize process tree and cgroup hierarchy where it matters - ie.
>   given the right per-process level interface, restricting configuration for
>   a cgroup sub-hierarchy may not need any cgroup involvement at all. This
>   also nicely gets rid of the interaction between prctl and cgroup bits.
> * If we *have* to have cgroup interface, I wonder whether this would fit a
>   lot better as a part of cpuset. If you squint just right, this can be
>   viewed as some dynamic form of cpuset. Implementation-wise, it probably
>   won't integrate with the rest but I think the feature will be less jarring
>   as a part of cpuset, which already is a bit of kitchensink anyway.

I think both interfaces are important for different reasons. Could you
take a look at the initial thread I started a few months ago? I tried to
elaborate on the use cases in detail:
http://lore.kernel.org/r/20200822030155.GA414063@google.com

Also, in ChromeOS we can't use CGroups for this purpose. The CGroup
hierarchy does not fit well with the threads we are tagging. We also
use CGroup v1, and since CGroups cannot overlap, this is impossible,
not merely cumbersome. That said, a CGroup interface for core
scheduling is still useful for people using containers who want to
core-schedule each container separately (+Hao Luo can elaborate more
on that, but I did describe it in the link above).
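
For the container case, the nested-tag rule from Peter's cover letter
quoted above ("the first parent with a cookie is the effective one")
can be sketched as a walk up the hierarchy. This is a toy Python model,
not the kernel implementation: Cgroup and effective_cookie are
hypothetical names, the cookie values are arbitrary, and I am assuming
the lookup starts at the task's own cgroup before moving to its
ancestors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cgroup:
    name: str
    parent: Optional["Cgroup"] = None
    cookie: Optional[int] = None  # None models an untagged group (core_sched == 0)


def effective_cookie(cg: Optional[Cgroup]) -> Optional[int]:
    """Return the cookie of the nearest tagged group, walking towards the root."""
    while cg is not None:
        if cg.cookie is not None:
            return cg.cookie
        cg = cg.parent
    return None  # no tagged group on the path: the task stays untagged


root = Cgroup("root")
container = Cgroup("container-a", parent=root, cookie=100)
workers = Cgroup("container-a/workers", parent=container)
untrusted = Cgroup("container-a/untrusted", parent=container, cookie=101)
```

Here tasks in "workers" inherit the container's cookie (100), while the
delegated "untrusted" subtree sets its own tag (101) and so is grouped
separately from the rest of the container.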

> > The cgroup thing also '(ab)uses' cgroup_mutex for serialization because it
> > needs to ensure continuity between ss->can_attach() and ss->attach() for the
> > memory allocation. If the prctl() were allowed to interleave it might steal the
> > memory.
> >
> > Using cgroup_mutex feels icky, but is not without precedent,
> > kernel/bpf/cgroup.c does the same thing afaict.
> >
> > TJ, can you please have a look at this?
>
> Yeah, using cgroup_mutex for stabilizing cgroup hierarchy for consecutive
> operations is fine. It might be worthwhile to break that out into a proper
> interface but that's the least of concerns here.
>
> Can someone point me to a realistic and concrete usage scenario for this
> feature?

Yeah, it's at http://lore.kernel.org/r/20200822030155.GA414063@google.com
as mentioned above. Let me know if you need any more details about the
use case.

About the file name, how about kernel/sched/smt.c? That definitely
provides more information than 'core_sched.c'.

Thanks,
- Joel

