LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Patrick Bellasi <patrick.bellasi@arm.com>
To: Subhra Mazumdar <subhra.mazumdar@oracle.com>,
	Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org, mingo@redhat.com,
	tglx@linutronix.de, steven.sistare@oracle.com,
	dhaval.giani@oracle.com, daniel.lezcano@linaro.org,
	vincent.guittot@linaro.org, viresh.kumar@linaro.org,
	tim.c.chen@linux.intel.com, mgorman@techsingularity.net,
	parth@linux.ibm.com
Subject: Re: [RFC PATCH 1/9] sched,cgroup: Add interface for latency-nice
Date: Thu, 05 Sep 2019 10:45:27 +0100
Message-ID: <87r24v2i14.fsf@arm.com> (raw)
In-Reply-To: <20190905083127.GA2332@hirez.programming.kicks-ass.net>


On Thu, Sep 05, 2019 at 09:31:27 +0100, Peter Zijlstra wrote...

> On Fri, Aug 30, 2019 at 10:49:36AM -0700, subhra mazumdar wrote:
>> Add Cgroup interface for latency-nice. Each CPU Cgroup adds a new file
>> "latency-nice" which is shared by all the threads in that Cgroup.
>
> *sigh*, no. We start with a normal per task attribute, and then later,
> if it is needed and makes sense, we add it to cgroups.

FWIW, to add on top of what Peter says, we used this same approach for
uclamp and it proved to be a very effective way to come up with a good
design. General principles have been:

 - a system wide API [1] (under /proc/sys/kernel/sched_*) defines
   default values for all tasks affected by that feature.
   This interface has to define also upper bounds for task specific
   values. Thus, in the case of latency-nice, it should be set by
   default to the MIN value, since that's the current mainline
   behaviour: all tasks are latency sensitive.

 - a per-task API [2] (via the sched_setattr() syscall) can be used to
   relax the system wide setting thus implementing a "nice" policy.

 - a per-taskgroup API [3] (via cpu controller's attributes) can be used
   to relax the system-wide settings and restrict the per-task API.

The above features are worth to be added in that exact order.

> Also, your Changelog fails on pretty much every point. It doesn't
> explain why, it doesn't describe anything and so on.

On the description side, I guess it's worth to mention somewhere to
which scheduling classes this feature can be useful for. It's worth to
mention that it can apply only to:

 - CFS tasks: for example, at wakeup time a task with an high
   latency-nice should avoid to preempt a low latency-nice task.
   Maybe by mapping the latency nice value into proper vruntime
   normalization value?

 - RT tasks: for example, at wakeup time a task with an high
   latency-nice value could avoid to preempt a CFS task.

I'm sure there will be discussion about some of these features, that's
why it's important in the proposal presentation to keep a well defined
distinction among the "mechanisms and API" and how we use the new
concept to "bias" some scheduler policies.

> From just reading the above, I would expect it to have the range
> [-20,19] just like normal nice. Apparently this is not so.

Regarding the range for the latency-nice values, I guess we have two
options:

  - [-20..19], which makes it similar to priorities
  downside: we quite likely end up with a kernel space representation
  which does not match the user-space one, e.g. look at
  task_struct::prio.

  - [0..1024], which makes it more similar to a "percentage"

Being latency-nice a new concept, we are not constrained by POSIX and
IMHO the [0..1024] scale is a better fit.

That will translate into:

  latency-nice=0 : default (current mainline) behaviour, all "biasing"
  policies are disabled and we wakeup up as fast as possible

  latency-nice=1024 : maximum niceness, where for example we can imaging
  to turn switch a CFS task to be SCHED_IDLE?

Best,
Patrick

[1] commit e8f14172c6b1 ("sched/uclamp: Add system default clamps")
[2] commit a509a7cd7974 ("sched/uclamp: Extend sched_setattr() to support utilization clamping")
[3] 5 patches in today's tip/sched/core up to:
    commit babbe170e053 ("sched/uclamp: Update CPU's refcount on TG's clamp changes")

-- 
#include <best/regards.h>

Patrick Bellasi

  reply index

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-30 17:49 [RFC PATCH 0/9] Task latency-nice subhra mazumdar
2019-08-30 17:49 ` [RFC PATCH 1/9] sched,cgroup: Add interface for latency-nice subhra mazumdar
2019-09-04 17:32   ` Tim Chen
2019-09-05  6:15     ` Parth Shah
2019-09-05 10:11       ` Patrick Bellasi
2019-09-06 12:22         ` Parth Shah
2019-09-05  8:31   ` Peter Zijlstra
2019-09-05  9:45     ` Patrick Bellasi [this message]
2019-09-05 10:46       ` Peter Zijlstra
2019-09-05 11:13         ` Qais Yousef
2019-09-05 11:30           ` Peter Zijlstra
2019-09-05 11:40             ` Patrick Bellasi
2019-09-05 11:48               ` Peter Zijlstra
2019-09-05 13:32                 ` Qais Yousef
2019-09-05 11:47             ` Qais Yousef
2020-04-16  0:02               ` Joel Fernandes
2020-04-16 17:23                 ` Dietmar Eggemann
2020-04-18 16:01                   ` Joel Fernandes
2020-04-20 11:26                     ` Parth Shah
2020-04-20 19:14                       ` Joel Fernandes
2020-04-20 11:47                     ` Qais Yousef
2020-04-20 19:10                       ` Joel Fernandes
2019-09-05 11:30           ` Patrick Bellasi
2019-09-05 11:47             ` Peter Zijlstra
2019-09-05 11:18         ` Patrick Bellasi
2019-09-05 11:40           ` Peter Zijlstra
2019-09-05 11:46             ` Patrick Bellasi
2019-09-05 11:46           ` Valentin Schneider
2019-09-05 13:07             ` Patrick Bellasi
2019-09-05 14:48               ` Valentin Schneider
2019-09-06 12:45               ` Parth Shah
2019-09-06 14:13                 ` Valentin Schneider
2019-09-06 14:32                   ` Vincent Guittot
2019-09-06 17:10                   ` Parth Shah
2019-09-06 22:50                     ` Valentin Schneider
2019-09-06 12:31       ` Parth Shah
2019-09-05 10:05   ` Patrick Bellasi
2019-09-05 10:48     ` Peter Zijlstra
2019-08-30 17:49 ` [RFC PATCH 2/9] sched: add search limit as per latency-nice subhra mazumdar
2019-09-05  6:22   ` Parth Shah
2019-08-30 17:49 ` [RFC PATCH 3/9] sched: add sched feature to disable idle core search subhra mazumdar
2019-09-05 10:17   ` Patrick Bellasi
2019-09-05 22:02     ` Subhra Mazumdar
2019-08-30 17:49 ` [RFC PATCH 4/9] sched: SIS_CORE " subhra mazumdar
2019-09-05 10:19   ` Patrick Bellasi
2019-08-30 17:49 ` [RFC PATCH 5/9] sched: Define macro for number of CPUs in core subhra mazumdar
2019-08-30 17:49 ` [RFC PATCH 6/9] x86/smpboot: Optimize cpumask_weight_sibling macro for x86 subhra mazumdar
2019-08-30 17:49 ` [RFC PATCH 7/9] sched: search SMT before LLC domain subhra mazumdar
2019-09-05  9:31   ` Peter Zijlstra
2019-09-05 20:40     ` Subhra Mazumdar
2019-08-30 17:49 ` [RFC PATCH 8/9] sched: introduce per-cpu var next_cpu to track search limit subhra mazumdar
2019-08-30 17:49 ` [RFC PATCH 9/9] sched: rotate the cpu search window for better spread subhra mazumdar
2019-09-05  6:37   ` Parth Shah
2019-09-05  5:55 ` [RFC PATCH 0/9] Task latency-nice Parth Shah
2019-09-05 10:31 ` Patrick Bellasi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87r24v2i14.fsf@arm.com \
    --to=patrick.bellasi@arm.com \
    --cc=daniel.lezcano@linaro.org \
    --cc=dhaval.giani@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=mingo@redhat.com \
    --cc=parth@linux.ibm.com \
    --cc=peterz@infradead.org \
    --cc=steven.sistare@oracle.com \
    --cc=subhra.mazumdar@oracle.com \
    --cc=tglx@linutronix.de \
    --cc=tim.c.chen@linux.intel.com \
    --cc=vincent.guittot@linaro.org \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git
	git clone --mirror https://lore.kernel.org/lkml/10 lkml/git/10.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git