From: Waiman Long <llong@redhat.com>
To: Tejun Heo <tj@kernel.org>, Waiman Long <llong@redhat.com>
Cc: Zefan Li <lizefan.x@bytedance.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Jonathan Corbet <corbet@lwn.net>, Shuah Khan <shuah@kernel.org>,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Roman Gushchin <guro@fb.com>, Phil Auld <pauld@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>
Subject: Re: [PATCH v2 2/6] cgroup/cpuset: Clarify the use of invalid partition root
Date: Fri, 16 Jul 2021 14:44:27 -0400
Message-ID: <c6ae2d9b-ad6e-9bbd-b25c-f52b0ff6fb9b@redhat.com>
In-Reply-To: <YONGk3iw/zrNzwLK@mtj.duckdns.org>

On 7/5/21 1:51 PM, Tejun Heo wrote:
> Hello, Waiman.
>
> On Mon, Jun 28, 2021 at 09:06:50AM -0400, Waiman Long wrote:
>> The main reason for doing this is because normal cpuset control file actions
>> are under the direct control of the cpuset code. So it is up to us to decide
>> whether to grant it or deny it. Hotplug, on the other hand, is not under the
>> control of cpuset code. It can't deny a hotplug operation. This is the main
>> reason why the partition root error state was added in the first place.
> I have a difficult time convincing myself that this difference justifies the
> behavior difference and it keeps bothering me that there is a state which
> can be reached through one path but rejected by the other. I'll continue
> below.
>
>> Normally, users can set cpuset.cpus to whatever value they want even though
>> they are not actually granted. However, turning on partition root is under
>> more strict control. You can't turn on partition root if the CPUs requested
>> cannot actually be granted. The problem with setting the state to just
>> partition error is that users may not be aware that the partition creation
>> operation fails.  We can't assume all users will do the proper error
>> checking. I would rather let them know the operation fails rather than
>> relying on them doing the proper check afterward.
>>
>> Yes, I agree that it is a different philosophy than the original cpuset
>> code, but I thought one reason for doing cgroup v2 is to simplify the
>> interface and make it a bit more error-proof. Since partition root creation
>> is a relatively rare operation, we can afford to make it more strict than
>> the other operations.
> So, IMO, one of the reasons why cgroup1 interface was such a mess was
> because each piece of interaction was designed ad-hoc without regard to the
> overall consistency. One person feels a particular way of interacting with
> the interface is "correct" and does it that way and another person does
> another part in a different way. In the end, we ended up with a messy
> patchwork.
>
> One problematic aspect of cpuset in cgroup1 was the handling of failure
> modes, which was caused by the same exact approach - we wanted the interface
> to reject invalid configurations outright even though we didn't have the
> ability to prevent those configurations from occurring through other paths,
> which makes the failure mode more subtle by further obscuring them.
>
> I think a better approach would be having a clear signal and mechanism to
> watch the state and explicitly requiring users to verify and monitor the
> state transitions.

Sorry for the late reply; I was busy with other work.

I agree with you in principle. However, the reason there are more
restrictions on enabling a partition is that I want to avoid forcing
users to always read back cpuset.cpus.partition to see whether the
operation succeeded, instead of just getting an error from the write
itself. Requiring a read-back is more error-prone. If you don't want
changes to existing behavior, I can relax the checking and allow such a
cpuset to become an invalid partition if an illegal operation happens.
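
To make the two models concrete, here is a minimal userspace sketch of
what a careful caller would have to do. It is only an illustration, not
code from this series: the helper name enable_partition() is
hypothetical, and the assumption that a failed partition reads back as
something other than the requested value (e.g. "root invalid") should be
checked against the documentation of the kernel actually in use.

```python
import os


def enable_partition(cgroup_dir, value="root"):
    """Write `value` to cpuset.cpus.partition, then read the state back.

    Hypothetical helper. Returns (ok, state): ok is True only if the
    write succeeded AND the read-back state equals the requested value.
    """
    path = os.path.join(cgroup_dir, "cpuset.cpus.partition")
    try:
        with open(path, "w") as f:
            f.write(value)
    except OSError as e:
        # Strict model: the request is rejected at write time.
        return False, f"write failed: {e.strerror}"

    with open(path) as f:
        state = f.read().strip()
    # Permissive model: the write may succeed even though the partition
    # could not be formed, so the state must be compared explicitly.
    # (Assumption: an invalid partition reads back as a different string.)
    return state == value, state
```

With the stricter checking, the except branch alone catches the failure.
With the relaxed checking, a caller that skips the read-back would not
notice that the partition is invalid.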

Also, there is now another cpuset patch to extend CPU isolation to cgroup
v1 [1]. I think it is better suited to the cgroup v2 partition scheme, but
cgroup v1 is still quite heavily used out there.

Please let me know what you want me to do and I will send out a v3.

Thanks a lot!
Longman

