All of lore.kernel.org
 help / color / mirror / Atom feed
From: Parth Shah <parth@linux.ibm.com>
To: Patrick Bellasi <patrick.bellasi@arm.com>,
	Valentin Schneider <valentin.schneider@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Subhra Mazumdar <subhra.mazumdar@oracle.com>,
	linux-kernel@vger.kernel.org, mingo@redhat.com,
	tglx@linutronix.de, steven.sistare@oracle.com,
	dhaval.giani@oracle.com, daniel.lezcano@linaro.org,
	vincent.guittot@linaro.org, viresh.kumar@linaro.org,
	tim.c.chen@linux.intel.com, mgorman@techsingularity.net
Subject: Re: [RFC PATCH 1/9] sched,cgroup: Add interface for latency-nice
Date: Fri, 6 Sep 2019 18:15:28 +0530	[thread overview]
Message-ID: <3bb17e15-5492-b78c-20a8-5989519f20e2@linux.ibm.com> (raw)
In-Reply-To: <87d0ge3n85.fsf@arm.com>



On 9/5/19 6:37 PM, Patrick Bellasi wrote:
> 
> On Thu, Sep 05, 2019 at 12:46:37 +0100, Valentin Schneider wrote...
> 
>> On 05/09/2019 12:18, Patrick Bellasi wrote:
>>>> There's a few things wrong there; I really feel that if we call it nice,
>>>> it should be like nice. Otherwise we should call it latency-bias and not
>>>> have the association with nice to confuse people.
>>>>
>>>> Secondly; the default should be in the middle of the range. Naturally
>>>> this would be a signed range like nice [-(x+1),x] for some x. but if you
>>>> want [0,1024], then the default really should be 512, but personally I
>>>> like 0 better as a default, in which case we need negative numbers.
>>>>
>>>> This is important because we want to be able to bias towards less
>>>> importance to (tail) latency as well as more importantance to (tail)
>>>> latency.
>>>>
>>>> Specifically, Oracle wants to sacrifice (some) latency for throughput.
>>>> Facebook OTOH seems to want to sacrifice (some) throughput for latency.
>>>
>>> Right, we have this dualism to deal with and current mainline behaviour
>>> is somehow in the middle.
>>>
>>> BTW, the FB requirement is the same we have in Android.
>>> We want some CFS tasks to have very small latency and a low chance
>>> to be preempted by the wake-up of less-important "background" tasks.
>>>
>>> I'm not totally against the usage of a signed range, but I'm thinking
>>> that since we are introducing a new (non POSIX) concept we can get the
>>> chance to make it more human friendly.
>>>
>>> Give the two extremes above, would not be much simpler and intuitive to
>>> have 0 implementing the FB/Android (no latency) case and 1024 the
>>> (max latency) Oracle case?
>>>
>>
>> For something like latency-<whatever>, I don't see the point of having
>> such a wide range. The nice range is probably more than enough - and before
>> even bothering about the range, we should probably agree on what the range
>> should represent.
>>
>> If it's niceness, I read it as: positive latency-nice value means we're
>> nice to latency, means we reduce it. So the further up you go, the more you
>> restrict your wakeup scan. I think it's quite easy to map that into the
>> code: current behaviour at 0, with a decreasing scan mask size as we go
>> towards +19. I don't think anyone needs 512 steps to tune this.
>>
>> I don't know what logic we'd follow for negative values though. Maybe
>> latency-nice -20 means always going through the slowpath, but what of the
>> intermediate values?
> 
> Yep, I think so fare we are all converging towards the idea to use the
> a signed range. Regarding the range itself, yes: 1024 looks very
> oversized, but +-20 is still something which leave room for a bit of
> flexibility and it also better matches the idea that we don't want to
> "enumerate behaviours" but just expose a knob. To map certain "bias" we
> could benefit from a slightly larger range.
> 
>> AFAICT this RFC only looks at wakeups, but I guess latency-nice can be
> 
> For the wakeup path there is also the TurboSched proposal by Parth:
> 
>    Message-ID: <20190725070857.6639-1-parth@linux.ibm.com> 
>    https://lore.kernel.org/lkml/20190725070857.6639-1-parth@linux.ibm.com/
> 
> we should keep in mind.
> 
>> applied elsewhere (e.g. load-balance, something like task_hot() and its
>> use of sysctl_sched_migration_cost).
> 
> For LB can you come up with some better description of what usages you
> see could benefit from a "per task" or "per task-group" latency niceness?
> 

I guess there is some usecase in case of thermal throttling.
If a task is heating up the core then in ideal scenarios POWER systems throttle
down to rated frequency.
In such case, if the task is latency sensitive (min latency nice), we can move the
task around the chip to heat up the chip uniformly allowing me to gain more performance
with sustained higher frequency.
With this, we will require the help from active load balancer and latency-nice
classification on per task and/or group basis.

Hopefully, this might be useful for other arch as well, right?

> Best,
> Patrick
> 

Thanks,
Parth


  parent reply	other threads:[~2019-09-06 12:45 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-30 17:49 [RFC PATCH 0/9] Task latency-nice subhra mazumdar
2019-08-30 17:49 ` [RFC PATCH 1/9] sched,cgroup: Add interface for latency-nice subhra mazumdar
2019-09-04 17:32   ` Tim Chen
2019-09-05  6:15     ` Parth Shah
2019-09-05 10:11       ` Patrick Bellasi
2019-09-06 12:22         ` Parth Shah
2019-09-05  8:31   ` Peter Zijlstra
2019-09-05  9:45     ` Patrick Bellasi
2019-09-05 10:46       ` Peter Zijlstra
2019-09-05 11:13         ` Qais Yousef
2019-09-05 11:30           ` Peter Zijlstra
2019-09-05 11:40             ` Patrick Bellasi
2019-09-05 11:48               ` Peter Zijlstra
2019-09-05 13:32                 ` Qais Yousef
2019-09-05 11:47             ` Qais Yousef
2020-04-16  0:02               ` Joel Fernandes
2020-04-16 17:23                 ` Dietmar Eggemann
2020-04-18 16:01                   ` Joel Fernandes
2020-04-20 11:26                     ` Parth Shah
2020-04-20 19:14                       ` Joel Fernandes
2020-04-20 11:47                     ` Qais Yousef
2020-04-20 19:10                       ` Joel Fernandes
2019-09-05 11:30           ` Patrick Bellasi
2019-09-05 11:47             ` Peter Zijlstra
2019-09-05 11:18         ` Patrick Bellasi
2019-09-05 11:40           ` Peter Zijlstra
2019-09-05 11:46             ` Patrick Bellasi
2019-09-05 11:46           ` Valentin Schneider
2019-09-05 13:07             ` Patrick Bellasi
2019-09-05 14:48               ` Valentin Schneider
2019-09-06 12:45               ` Parth Shah [this message]
2019-09-06 14:13                 ` Valentin Schneider
2019-09-06 14:32                   ` Vincent Guittot
2019-09-06 17:10                   ` Parth Shah
2019-09-06 22:50                     ` Valentin Schneider
2019-09-06 12:31       ` Parth Shah
2019-09-05 10:05   ` Patrick Bellasi
2019-09-05 10:48     ` Peter Zijlstra
2019-08-30 17:49 ` [RFC PATCH 2/9] sched: add search limit as per latency-nice subhra mazumdar
2019-09-05  6:22   ` Parth Shah
2019-08-30 17:49 ` [RFC PATCH 3/9] sched: add sched feature to disable idle core search subhra mazumdar
2019-09-05 10:17   ` Patrick Bellasi
2019-09-05 22:02     ` Subhra Mazumdar
2019-08-30 17:49 ` [RFC PATCH 4/9] sched: SIS_CORE " subhra mazumdar
2019-09-05 10:19   ` Patrick Bellasi
2019-08-30 17:49 ` [RFC PATCH 5/9] sched: Define macro for number of CPUs in core subhra mazumdar
2019-08-30 17:49 ` [RFC PATCH 6/9] x86/smpboot: Optimize cpumask_weight_sibling macro for x86 subhra mazumdar
2019-08-30 17:49 ` [RFC PATCH 7/9] sched: search SMT before LLC domain subhra mazumdar
2019-09-05  9:31   ` Peter Zijlstra
2019-09-05 20:40     ` Subhra Mazumdar
2019-08-30 17:49 ` [RFC PATCH 8/9] sched: introduce per-cpu var next_cpu to track search limit subhra mazumdar
2019-08-30 17:49 ` [RFC PATCH 9/9] sched: rotate the cpu search window for better spread subhra mazumdar
2019-09-05  6:37   ` Parth Shah
2019-09-05  5:55 ` [RFC PATCH 0/9] Task latency-nice Parth Shah
2019-09-05 10:31 ` Patrick Bellasi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3bb17e15-5492-b78c-20a8-5989519f20e2@linux.ibm.com \
    --to=parth@linux.ibm.com \
    --cc=daniel.lezcano@linaro.org \
    --cc=dhaval.giani@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=mingo@redhat.com \
    --cc=patrick.bellasi@arm.com \
    --cc=peterz@infradead.org \
    --cc=steven.sistare@oracle.com \
    --cc=subhra.mazumdar@oracle.com \
    --cc=tglx@linutronix.de \
    --cc=tim.c.chen@linux.intel.com \
    --cc=valentin.schneider@arm.com \
    --cc=vincent.guittot@linaro.org \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.