From: Waiman Long <longman@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>, Tejun Heo <tj@kernel.org>,
	Li Zefan <lizefan@huawei.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Ingo Molnar <mingo@redhat.com>,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, kernel-team@fb.com, pjt@google.com,
	luto@amacapital.net, torvalds@linux-foundation.org,
	Roman Gushchin <guro@fb.com>
Subject: Re: [PATCH v4] cpuset: Enable cpuset controller in default hierarchy
Date: Mon, 12 Mar 2018 10:20:32 -0400
Message-ID: <12ef7f60-5b14-8e09-2478-1453a3071f21@redhat.com>
In-Reply-To: <20180310131647.GB4043@hirez.programming.kicks-ass.net>

On 03/10/2018 08:16 AM, Peter Zijlstra wrote:
> On Fri, Mar 09, 2018 at 06:06:29PM -0500, Waiman Long wrote:
>> So you are talking about sched_relax_domain_level and
> That one I wouldn't be sad to see the back of.
>
>> sched_load_balance.
> This one, that's critical. And this is the perfect time to try and fix
> the whole isolcpus issue.
>
> The primary issue is that to make equivalent functionality available
> through cpuset, we need to basically start all tasks outside the root
> group.
>
> The equivalent of isolcpus=xxx is a cgroup setup like:
>
>         root
> 	/  \
>   system    other
>
> Where other has the @xxx cpus and system the remainder and
> root.sched_load_balance = 0.

I saw in kernel-parameters.txt that the isolcpus option is deprecated in
favor of cpusets. However, there doesn't seem to be any documentation on
the right way to do that. Of course, we can achieve similar results with
what you have outlined above, but the process is more complex than just
adding another boot command line argument with isolcpus. So I doubt
isolcpus will die anytime soon unless we can make the alternative as
easy to use.
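
As a rough illustration of why the cpuset route takes more steps, the
setup outlined above can be sketched as an ordered list of filesystem
operations. This is only a sketch under assumptions: it assumes a cgroup
v1 cpuset hierarchy mounted at /sys/fs/cgroup/cpuset, the helper names
format_cpu_list and isolation_plan are hypothetical, and the function
only builds the plan rather than touching the filesystem. Moving the
existing tasks into "system" would be a further step that is not shown.

```python
from pathlib import PurePosixPath


def format_cpu_list(cpus):
    """Render CPU ids in the kernel's list format, e.g. {0, 1, 2, 5} -> '0-2,5'."""
    ranges = []
    for cpu in sorted(set(cpus)):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu          # extend the current run
        else:
            ranges.append([cpu, cpu])    # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def isolation_plan(all_cpus, isolated, mems="0", mount="/sys/fs/cgroup/cpuset"):
    """Return the (action, path, value) steps for the root/system/other layout.

    'mount' and 'mems' are assumptions for illustration; adjust for the
    actual machine.
    """
    all_cpus, isolated = set(all_cpus), set(isolated)
    if not isolated or not isolated < all_cpus:
        raise ValueError("isolated CPUs must be a non-empty proper subset")
    root = PurePosixPath(mount)
    # root.sched_load_balance = 0, as in the quoted description.
    plan = [("write", str(root / "cpuset.sched_load_balance"), "0")]
    for name, cpus in (("system", all_cpus - isolated), ("other", isolated)):
        group = root / name
        plan.append(("mkdir", str(group), ""))
        plan.append(("write", str(group / "cpuset.cpus"), format_cpu_list(cpus)))
        plan.append(("write", str(group / "cpuset.mems"), mems))
    return plan
```

Even for a two-CPU isolation this yields seven operations, compared with
a single isolcpus= argument on the boot command line.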

> Back before cgroups (and the new workqueue stuff), we could've started
> everything in the !root group, no worry. But now that doesn't work,
> because a bunch of controllers can't deal with that and everything
> cgroup expects the cgroupfs to be empty on boot.

AFAIK, all processes belong to the root cgroup on boot. The root cgroup
is usually special in that a controller may not exert any control over
the processes in it; many controllers become active only for processes
in child cgroups. Would you mind elaborating on what doesn't quite work
currently?

 
> It's one of my biggest regrets that I didn't 'fix' this before cgroups
> came along.
>
>> I have not removed any bits. I just haven't exposed
>> them yet. It does seem like these 2 control knobs are useful from the
>> scheduling perspective. Do we also need cpu_exclusive or just the two
>> sched control knobs are enough?
> I always forget if we need exclusive for load_balance to work; I'll
> peruse the document/code.

I think the cpu_exclusive feature can be useful to enforce that CPUs
allocated to the "other" isolated cgroup cannot be used by the processes
under the "system" parent.
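
To make the exclusivity rule concrete, here is a minimal sketch of the
check it implies between sibling cpusets: if either sibling is marked
exclusive, the two may not share any CPU. The function name
exclusive_conflicts and the dict layout are hypothetical and only model
the rule; this is not how the kernel implements it.

```python
from itertools import combinations


def exclusive_conflicts(siblings):
    """Find sibling cpusets that overlap although one of them is exclusive.

    'siblings' maps a cpuset name to (cpus, cpu_exclusive).
    Returns a list of (name_a, name_b, shared_cpus) tuples.
    """
    conflicts = []
    for (a, (cpus_a, excl_a)), (b, (cpus_b, excl_b)) in combinations(
        sorted(siblings.items()), 2
    ):
        shared = set(cpus_a) & set(cpus_b)
        if shared and (excl_a or excl_b):
            conflicts.append((a, b, sorted(shared)))
    return conflicts
```

With "other" marked exclusive, a "system" cpuset that still lists one of
its CPUs shows up as a conflict; without the flag the overlap is allowed.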

I know that there is special code to handle the isolcpus option. How
about changing it to create an exclusive cpuset automatically instead?
Applications that need to run on those isolated CPUs can then be moved
into the isolated cgroup through the standard cgroup interface. For example,

isolcpus=<cpuset-name>,<cpu-id-list>

or

isolcpuset=<cpuset-name>[,cpu:<cpu-id-list>][,mem:<memory-node-list>]

We can then retire the old usage and encourage users to use the cgroup
API to manage it.
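
A small sketch of how the proposed isolcpuset= value could be parsed may
help show one wrinkle in the syntax: commas separate both the fields and
the items inside a CPU or memory node list. The parser below assumes
that an item without a "cpu:" or "mem:" prefix continues the previous
list. That rule, and the names parse_isolcpuset and expand_range, are
assumptions made for illustration and are not part of the proposal.

```python
def expand_range(item):
    """Expand '3' or '2-5' into the ids it covers."""
    first, sep, last = item.partition("-")
    lo = int(first)
    hi = int(last) if sep else lo
    if hi < lo:
        raise ValueError(f"descending range: {item!r}")
    return range(lo, hi + 1)


def parse_isolcpuset(value):
    """Parse '<name>[,cpu:<list>][,mem:<list>]' into a dict."""
    name, *tokens = value.split(",")
    if not name or ":" in name:
        raise ValueError("missing cpuset name")
    result = {"name": name, "cpu": [], "mem": []}
    current = None
    for token in tokens:
        key, sep, rest = token.partition(":")
        if sep:
            if key not in ("cpu", "mem"):
                raise ValueError(f"unknown key: {key!r}")
            current, token = key, rest
        elif current is None:
            # A bare list item must follow a 'cpu:' or 'mem:' field.
            raise ValueError(f"list item before any key: {token!r}")
        result[current].extend(expand_range(token))
    return result
```

For instance, parse_isolcpuset("rt,cpu:2-3,6,mem:1") returns a cpuset
named "rt" with CPUs 2, 3 and 6 and memory node 1.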

Cheers,
Longman

