linux-kernel.vger.kernel.org archive mirror
From: Waiman Long <longman@redhat.com>
To: "Michal Koutný" <mkoutny@suse.com>
Cc: Tejun Heo <tj@kernel.org>, Zefan Li <lizefan.x@bytedance.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Jonathan Corbet <corbet@lwn.net>, Shuah Khan <shuah@kernel.org>,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org,
	Juri Lelli <juri.lelli@redhat.com>,
	Valentin Schneider <vschneid@redhat.com>,
	Frederic Weisbecker <frederic@kernel.org>
Subject: Re: [RFC PATCH 0/5] cgroup/cpuset: A new "isolcpus" paritition
Date: Wed, 3 May 2023 23:01:36 -0400
Message-ID: <deb7b684-3d7c-b3ae-7b36-5b7ba2dd8001@redhat.com>
In-Reply-To: <ZFGOTHQj3k5rzmyR@blackbook>


On 5/2/23 18:27, Michal Koutný wrote:
> On Tue, May 02, 2023 at 05:26:17PM -0400, Waiman Long <longman@redhat.com> wrote:
>> In the new scheme, the available cpus are still directly passed down to a
>> descendant cgroup. However, isolated CPUs (or more generally CPUs dedicated
>> to a partition) have to be exclusive. So what the cpuset.cpus.reserve does
>> is to identify those exclusive CPUs that can be excluded from the
>> effective_cpus of the parent cgroups before they are claimed by a child
>> partition. Currently this is done automatically when a child partition is
>> created off a parent partition root. The new scheme will break it into 2
>> separate steps without the requirement that the parent of a partition has to
>> be a partition root itself.
> new scheme
>    1st step:
>    echo C >p/cpuset.cpus.reserve
>    # p/cpuset.cpus.effective == A-C (1)
>    2nd step (claim):
>    echo C' >p/c/cpuset.cpus # C'⊆C
>    echo root >p/c/cpuset.cpus.partition

It is something like that. However, the current scheme of automatic
reservation is still supported, i.e. cpuset.cpus.reserve will be set
automatically when the child cgroup becomes a valid partition, as long
as the cpuset.cpus.reserve file has never been written to. This is for
backward compatibility.

Once it is written to, automatic mode ends and users have to set it
manually from then on.
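
As a rough illustration of the two modes, here is a toy Python model.
It is not kernel code: the class, its fields and its method names are
all made up for this sketch, and it ignores every other condition a
real partition has to satisfy.

```python
class ReserveModel:
    """Toy model of a parent's cpuset.cpus.reserve (hypothetical names)."""

    def __init__(self):
        self.reserve = set()
        self.manual = False  # flips to True on the first explicit write

    def write_reserve(self, cpus):
        # An explicit write to the file ends automatic mode for good.
        self.manual = True
        self.reserve = set(cpus)

    def child_becomes_partition(self, child_cpus):
        child_cpus = set(child_cpus)
        if not self.manual:
            # Automatic mode: the reservation is made implicitly when
            # the child turns into a partition (backward compatible).
            self.reserve |= child_cpus
            return True
        # Manual mode: the CPUs must already have been reserved.
        return child_cpus <= self.reserve
```

In this model a parent that never had its reserve written accepts the
child's CPUs and records them, while a parent whose reserve was written
once only accepts CPUs that are already in the set.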


>
> current scheme
>    1st step (configure):
>    echo C >p/c/cpuset.cpus
>    2nd step (reserve & claim):
>    echo root >p/c/cpuset.cpus.partition
>    # p/cpuset.cpus.effective == A-C (2)
>
> As long as p/c is unpopulated, (1) and (2) are equal situations.
> Why is the (different) two step procedure needed?
>
> Also the relaxation of requirement of a parent being a partition
> confuses me -- if the parent is not a partition, i.e. it has no
> exclusive ownership of CPUs but it can still "give" it to children -- is
> child partition meant to be exclusive? (IOW can parent siblings reserve
> some same CPUs?)

A valid partition root has exclusive ownership of its CPUs. That is a
rule that won't be changed. As a result, an incoming partition root
cannot claim CPUs that have been allocated to another partition. To
simplify things, transition to a valid partition root is not possible
if any of the CPUs in its cpuset.cpus are not in the cpuset.cpus.reserve
of its ancestor or have been allocated to another partition. In that
case the partition root simply becomes invalid.
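
The validity rule above can be written as two set checks. This is only
a sketch in Python with a hypothetical function name and hypothetical
parameters; the actual check in the patch set may be structured
differently and covers more conditions than these two.

```python
def is_valid_partition_root(cpus, ancestor_reserve, allocated_elsewhere):
    """Return True if every requested CPU is reserved and still free.

    cpus                -- the CPUs in the candidate's cpuset.cpus
    ancestor_reserve    -- the CPUs in the ancestor's cpuset.cpus.reserve
    allocated_elsewhere -- the CPUs already owned by other partitions
    """
    cpus = set(cpus)
    not_reserved = cpus - set(ancestor_reserve)      # requested, never reserved
    already_taken = cpus & set(allocated_elsewhere)  # owned by another partition
    return not not_reserved and not already_taken
```

For example, with CPUs 2-4 reserved, a request for {2, 3} passes, a
request for {2, 5} fails because 5 was never reserved, and a request
for {2, 3} fails if CPU 3 already belongs to another partition.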

The parent can virtually give the reserved CPUs from the root down the
hierarchy and a child can claim them once it becomes a partition root.
In manual mode, we need to check all the way up the hierarchy to the
root to figure out which CPUs in cpuset.cpus.reserve are valid. That has
higher overhead, but enabling a partition is not a fast operation
anyway.
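
One possible reading of the walk up the hierarchy, sketched in Python.
The assumption here (not spelled out above) is that a CPU in a cgroup's
reserve only counts as valid if every ancestor up to the root also
lists it; the function name and the two dictionaries are hypothetical.

```python
def valid_reserved_cpus(node, parent_of, reserve_of):
    """Intersect a cgroup's reserve with that of each ancestor.

    parent_of  -- maps a cgroup name to its parent (None for the root)
    reserve_of -- maps a cgroup name to its set of reserved CPUs
    """
    valid = set(reserve_of[node])
    cur = parent_of.get(node)
    while cur is not None:
        # Assumption: a CPU must be reserved at every level above.
        valid &= set(reserve_of[cur])
        cur = parent_of.get(cur)
    return valid
```

The loop runs once per level between the cgroup and the root, which is
the extra overhead of manual mode in this model.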

Cheers,
Longman

