* Re: [RFC 1/6] Optimize wake-up task for Task Packing heuristic
       [not found]   ` <0b198386-ef0b-75e0-e53a-1160c77326b7@arm.com>
@ 2019-04-07  5:57     ` Parth Shah
  2019-04-10 16:33       ` Dietmar Eggemann
  0 siblings, 1 reply; 20+ messages in thread
From: Parth Shah @ 2019-04-07  5:57 UTC (permalink / raw)
  To: Dietmar Eggemann, linux-pm



On 3/27/19 3:01 PM, Dietmar Eggemann wrote:
> Hi Parth,
> 
> On 3/22/19 7:06 AM, Parth Shah wrote:
>> The TurboSched feature requires a minimal number of cores to be active
>> in order to sustain higher Turbo frequency in SMP systems. Packing
>> jitter tasks onto the idle CPUs of already active cores results in
>> better performance for throughput-intensive workloads.
>>
>> Signed-off-by: Parth Shah <parth015@linux.vnet.ibm.com>
>> ---
>>   kernel/sched/fair.c | 91 ++++++++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 89 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 51003e1c794d..dcf48f37e0fa 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -6202,6 +6202,82 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>>       return cpu;
>>   }
>>   +/*
>> + * Core is defined as under-utilized if the aggregated utilization of
>> + * all the CPUs in a core is less than 12.5% of the core's capacity.
>> + */
>> +static inline bool core_underutilized(long unsigned core_util,
>> +           long unsigned core_capacity)
>> +{
>> +    return core_util < (core_capacity >> 3);
>> +}
>> +
>> +/*
>> + * Core Capacity Multiplication Factor
>> + * The capacity of a core is defined to be 1.6x the capacity of any
>> + * CPU (or SMT thread), since the architecture is symmetric.
>> + */
>> +static const int core_cap_mf = 16;
>> +
>> +/*
>> + * Try to find non idle core in the system, but with spare capacity available
>> + * for task packing, thereby keeping minimal cores active.
>> + */
>> +static int select_non_idle_core(struct task_struct *p, int prev_cpu)
>> +{
>> +    struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
>> +    int core, smt;
>> +
>> +    cpumask_and(cpus, cpu_online_mask, &p->cpus_allowed);
>> +
>> +    for_each_cpu_wrap(core, cpus, prev_cpu)
>> +    {
>> +        long unsigned int core_util = 0;
>> +        long unsigned int core_cap = core_cap_mf*capacity_of(core)/10;
>> +        long unsigned int cache_cpu_util = (unsigned)-1;
>> +        long unsigned est_util = 0, est_util_enqueued = 0;
>> +        int cache_cpu = core;
>> +        struct cfs_rq *cfs_rq;
>> +
>> +        for_each_cpu(smt, cpu_smt_mask(core)) {
> 
> This one doesn't build for me on arm64 (make defconfig) since it uses cpu_smt_mask() outside the CONFIG_SCHED_SMT guard.
> 
> kernel/sched/fair.c: In function ‘select_non_idle_core’:
> kernel/sched/fair.c:6243:21: error: implicit declaration of function ‘cpu_smt_mask’; did you mean ‘cpu_cpu_mask’? [-Werror=implicit-function-declaration]
>    for_each_cpu(smt, cpu_smt_mask(core)) {
>                      ^
> ./include/linux/cpumask.h:242:32: note: in definition of macro ‘for_each_cpu’
> 
> [...]
> 

Thanks for pointing that out. The individual patches of the current version of the RFC will not build on their own.

Please try building with the full patch set applied. I assure you, the following iterations of the RFC will resolve this issue.



* Re: [RFC 3/6] Introduce static key to enable or disable TurboSched
       [not found]     ` <20190404095854.GB25302@aks.ibm>
@ 2019-04-07  6:43       ` Parth Shah
  0 siblings, 0 replies; 20+ messages in thread
From: Parth Shah @ 2019-04-07  6:43 UTC (permalink / raw)
  To: Akshay Adiga, Doug Smythies; +Cc: linux-pm



On 4/4/19 3:28 PM, Akshay Adiga wrote:
> On Mon, Mar 25, 2019 at 03:42:48PM -0700, Doug Smythies wrote:
>> On 2019.03.21 23:06 Parth Shah wrote:
>>
>>> Signed-off-by: Parth Shah <parth015@linux.vnet.ibm.com>
>>> ---
>>>  kernel/sched/core.c  | 36 ++++++++++++++++++++++++++++++++++++
>>> kernel/sched/fair.c  |  2 +-
>>> kernel/sched/sched.h |  7 +++++++
>>> 3 files changed, 44 insertions(+), 1 deletion(-)
>>
>> ...[snip]...
>>
>> I wanted to try this patch set with a particular work flow.
>> I actually don't think it will help, but wanted to try it.
>>
>> Forgive me if this is something that should be obvious but...
>>
>> I have not been able to figure out how to enable or disable
>> Turboshed.
>>
>> This patch suggests that I should be able to, but how?
>>
>> I have a kernel 5.1-rc1 with this entire patch set added.
>>  
>> ... Doug
>>
>>
> I think he added an attribute called cpu.turbo_sched to each cpu controller.
> 
> In order to enable this feature you first need to create a cpu cgroup and set
> the attribute cpu.turbo_sched to:
>   - cpu.turbo_sched = 1 (tasks benefiting from high frequency), or
>   - cpu.turbo_sched = 2 (tasks which are jitters)
> 
> Looks like the feature is disabled by default, i.e. cpu.turbo_sched = 0.
> 

Correct, Akshay.

In fact, one can follow the procedure below to create two different categories
of workloads:

cd /sys/fs/cgroup/cpu/
mkdir type1
# execute type1_workload
echo `pidof type1_workload` > type1/cgroup.procs && echo 1 > type1/cpu.turbo_sched
mkdir type2
# execute type2_workload
echo `pidof type2_workload` > type2/cgroup.procs && echo 2 > type2/cpu.turbo_sched

where    Type1 = tasks benefiting from high frequency
and      Type2 = tasks which are jitters


One can use the test benchmark (wofbench.c) for a quick demo of TurboSched as follows:

Execute the program with 8 jitters and 8 high-utilization tasks: wofbench -t 30 -h 8 -n 16
Create a cgroup: cd /sys/fs/cgroup/cpu/ && mkdir type2
Put the tasks in it: echo `pidof wofbench` > type2/cgroup.procs && echo 2 > type2/cpu.turbo_sched
Result: the jitters will be packed on the same cores as the other 8 high-utilization tasks.

Thus, during multi-threaded workload execution, if the system throttles frequency
due to its power budget constraints, using these patches may allow higher
frequencies to be sustained by keeping as many cores as possible idle.

Hope this helps. Thanks



* Re: [RFC 1/6] Optimize wake-up task for Task Packing heuristic
  2019-04-07  5:57     ` [RFC 1/6] Optimize wake-up task for Task Packing heuristic Parth Shah
@ 2019-04-10 16:33       ` Dietmar Eggemann
  2019-04-11 14:05         ` Dietmar Eggemann
  2019-04-11 14:44         ` Parth Shah
  0 siblings, 2 replies; 20+ messages in thread
From: Dietmar Eggemann @ 2019-04-10 16:33 UTC (permalink / raw)
  To: Parth Shah, linux-pm

On 4/7/19 7:57 AM, Parth Shah wrote:
> 
> 
> On 3/27/19 3:01 PM, Dietmar Eggemann wrote:
>> Hi Parth,
>>
>> On 3/22/19 7:06 AM, Parth Shah wrote:

[...]

>>> +    for_each_cpu_wrap(core, cpus, prev_cpu)
>>> +    {
>>> +        long unsigned int core_util = 0;
>>> +        long unsigned int core_cap = core_cap_mf*capacity_of(core)/10;
>>> +        long unsigned int cache_cpu_util = (unsigned)-1;
>>> +        long unsigned est_util = 0, est_util_enqueued = 0;
>>> +        int cache_cpu = core;
>>> +        struct cfs_rq *cfs_rq;
>>> +
>>> +        for_each_cpu(smt, cpu_smt_mask(core)) {
>>
>> This one doesn't build for me on arm64 (make defconfig) since it uses cpu_smt_mask() outside the CONFIG_SCHED_SMT guard.
>>
>> kernel/sched/fair.c: In function ‘select_non_idle_core’:
>> kernel/sched/fair.c:6243:21: error: implicit declaration of function ‘cpu_smt_mask’; did you mean ‘cpu_cpu_mask’? [-Werror=implicit-function-declaration]
>>     for_each_cpu(smt, cpu_smt_mask(core)) {
>>                       ^
>> ./include/linux/cpumask.h:242:32: note: in definition of macro ‘for_each_cpu’
>>
>> [...]
>>
> 
> Thanks for pointing that out. The individual patches of the current version of the RFC will not build on their own.
> 
> Please try building with the full patch set applied. I assure you, the following iterations of the RFC will resolve this issue.

Ah, I see, you have this CONFIG_TURBO_SCHED patch at the end of the 
series. I think that you will have a hard time getting new CONFIG 
switches into the task scheduler code. Can you not bring in your feature 
w/o CONFIG_TURBO_SCHED?

[...]


* Re: [RFC 6/6] Providing TurboSched as config option
       [not found] ` <20190322060621.27021-7-parth015@linux.vnet.ibm.com>
@ 2019-04-10 16:39   ` Dietmar Eggemann
  0 siblings, 0 replies; 20+ messages in thread
From: Dietmar Eggemann @ 2019-04-10 16:39 UTC (permalink / raw)
  To: Parth Shah, linux-pm

On 3/22/19 7:06 AM, Parth Shah wrote:
> Signed-off-by: Parth Shah <parth015@linux.vnet.ibm.com>

[...]

> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 06e440d3ea2f..2484fa8c49d0 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -54,6 +54,7 @@ unsigned int sysctl_sched_rt_period = 1000000;
>   
>   __read_mostly int scheduler_running;
>   
> +#ifdef CONFIG_TURBO_SCHED

This one takes sysctl_sched_rt_runtime away from a non-CONFIG_TURBO_SCHED 
build such as arm64 defconfig.

...
   MODPOST vmlinux.o
kernel/sysctl.o:(.data+0x14d0): undefined reference to 
`sysctl_sched_rt_runtime'
kernel/sched/core.o: In function `dl_bandwidth_enabled':
/opt/git/kernel_org/kernel/sched/sched.h:282: undefined reference to 
`sysctl_sched_rt_runtime'
...

[...]

--- 8< ---

@@ -54,13 +54,13 @@ unsigned int sysctl_sched_rt_period = 1000000;

  __read_mostly int scheduler_running;

-#ifdef CONFIG_TURBO_SCHED
  /*
   * part of the period that we allow rt tasks to run in us.
   * default: 0.95s
   */
  int sysctl_sched_rt_runtime = 950000;

+#ifdef CONFIG_TURBO_SCHED


* Re: [RFC 2/6] Provide cgroup interface for manual jitter classification
       [not found] ` <20190322060621.27021-3-parth015@linux.vnet.ibm.com>
@ 2019-04-11 13:46   ` Dietmar Eggemann
  2019-04-15  7:23     ` Parth Shah
  0 siblings, 1 reply; 20+ messages in thread
From: Dietmar Eggemann @ 2019-04-11 13:46 UTC (permalink / raw)
  To: Parth Shah, linux-pm

On 3/22/19 7:06 AM, Parth Shah wrote:
> Signed-off-by: Parth Shah <parth015@linux.vnet.ibm.com>
> ---
>   include/linux/sched.h |  8 ++++++++
>   kernel/sched/core.c   | 46 +++++++++++++++++++++++++++++++++++++++++++
>   kernel/sched/fair.c   | 15 +++++++++-----
>   kernel/sched/sched.h  |  2 ++
>   4 files changed, 66 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 903ef29b62c3..be8e8617e0cb 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -604,6 +604,14 @@ struct task_struct {
>   	unsigned int			flags;
>   	unsigned int			ptrace;
>   
> +	/*
> +	 * task_characteristics:
> +	 * 0 = unknown. Follows regular CFS policy for task placement.
> +	 * 1 = Throughput intensive high utilization task
> +	 * 2 = Jitter tasks. Should be packed to reduce active core count.

This is very similar to the ideas in [PATCH v8 00/16] Add utilization 
clamping support 
https://lore.kernel.org/lkml/20190402104153.25404-1-patrick.bellasi@arm.com 
. Task classification from userspace ...

[...]


* Re: [RFC 1/6] Optimize wake-up task for Task Packing heuristic
  2019-04-10 16:33       ` Dietmar Eggemann
@ 2019-04-11 14:05         ` Dietmar Eggemann
  2019-04-11 14:44         ` Parth Shah
  1 sibling, 0 replies; 20+ messages in thread
From: Dietmar Eggemann @ 2019-04-11 14:05 UTC (permalink / raw)
  To: Parth Shah, linux-pm

On 4/10/19 6:33 PM, Dietmar Eggemann wrote:
> On 4/7/19 7:57 AM, Parth Shah wrote:
>>
>>
>> On 3/27/19 3:01 PM, Dietmar Eggemann wrote:
>>> Hi Parth,
>>>
>>> On 3/22/19 7:06 AM, Parth Shah wrote:
> 
> [...]
> 
>>>> +    for_each_cpu_wrap(core, cpus, prev_cpu)
>>>> +    {
>>>> +        long unsigned int core_util = 0;
>>>> +        long unsigned int core_cap = core_cap_mf*capacity_of(core)/10;
>>>> +        long unsigned int cache_cpu_util = (unsigned)-1;
>>>> +        long unsigned est_util = 0, est_util_enqueued = 0;
>>>> +        int cache_cpu = core;
>>>> +        struct cfs_rq *cfs_rq;
>>>> +
>>>> +        for_each_cpu(smt, cpu_smt_mask(core)) {
>>>
>>> This one doesn't build for me on arm64 (make defconfig) since it uses 
>>> cpu_smt_mask() outside the CONFIG_SCHED_SMT guard.
>>>
>>> kernel/sched/fair.c: In function ‘select_non_idle_core’:
>>> kernel/sched/fair.c:6243:21: error: implicit declaration of function 
>>> ‘cpu_smt_mask’; did you mean ‘cpu_cpu_mask’? 
>>> [-Werror=implicit-function-declaration]
>>>     for_each_cpu(smt, cpu_smt_mask(core)) {
>>>                       ^
>>> ./include/linux/cpumask.h:242:32: note: in definition of macro 
>>> ‘for_each_cpu’
>>>
>>> [...]
>>>
>>
>> Thanks for pointing that out. The individual patches of the current 
>> version of the RFC will not build on their own.
>>
>> Please try building with the full patch set applied. I assure you, the 
>> following iterations of the RFC will resolve this issue.
> 
> Ah, I see, you have this CONFIG_TURBO_SCHED patch at the end of the 
> series. I think that you will have a hard time getting new CONFIG 
> switches into the task scheduler code. Can you not bring in your feature 
> w/o CONFIG_TURBO_SCHED?

I think you could replace CONFIG_TURBO_SCHED entirely with your static 
key __turbo_sched_enabled (and the existing CONFIG_SCHED_SMT in case your 
feature relies on SMT).
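
For reference, a minimal sketch of what such a static-key-only switch could
look like (the key name is the one mentioned above; the helper names and
their exact placement are assumptions, not the patch itself):

/* e.g. kernel/sched/core.c */
DEFINE_STATIC_KEY_FALSE(__turbo_sched_enabled);

void turbo_sched_enable(void)
{
	static_branch_enable(&__turbo_sched_enabled);
}

void turbo_sched_disable(void)
{
	static_branch_disable(&__turbo_sched_enabled);
}

/* e.g. kernel/sched/sched.h */
extern struct static_key_false __turbo_sched_enabled;

static inline bool turbo_sched_enabled(void)
{
	/* compiles down to a patched no-op branch when the feature is off */
	return static_branch_unlikely(&__turbo_sched_enabled);
}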

Please send your next version to linux-kernel@vger.kernel.org as well 
and cc the task scheduler maintainer to get more review.


* Re: [RFC 1/6] Optimize wake-up task for Task Packing heuristic
  2019-04-10 16:33       ` Dietmar Eggemann
  2019-04-11 14:05         ` Dietmar Eggemann
@ 2019-04-11 14:44         ` Parth Shah
  1 sibling, 0 replies; 20+ messages in thread
From: Parth Shah @ 2019-04-11 14:44 UTC (permalink / raw)
  To: Dietmar Eggemann, linux-pm



On 4/10/19 10:03 PM, Dietmar Eggemann wrote:
> On 4/7/19 7:57 AM, Parth Shah wrote:
>>
>>
>> On 3/27/19 3:01 PM, Dietmar Eggemann wrote:
>>> Hi Parth,
>>>
>>> On 3/22/19 7:06 AM, Parth Shah wrote:
> 
> [...]
> 
>>>> +    for_each_cpu_wrap(core, cpus, prev_cpu)
>>>> +    {
>>>> +        long unsigned int core_util = 0;
>>>> +        long unsigned int core_cap = core_cap_mf*capacity_of(core)/10;
>>>> +        long unsigned int cache_cpu_util = (unsigned)-1;
>>>> +        long unsigned est_util = 0, est_util_enqueued = 0;
>>>> +        int cache_cpu = core;
>>>> +        struct cfs_rq *cfs_rq;
>>>> +
>>>> +        for_each_cpu(smt, cpu_smt_mask(core)) {
>>>
>>> This one doesn't build for me on arm64 (make defconfig) since it uses cpu_smt_mask() outside the CONFIG_SCHED_SMT guard.
>>>
>>> kernel/sched/fair.c: In function ‘select_non_idle_core’:
>>> kernel/sched/fair.c:6243:21: error: implicit declaration of function ‘cpu_smt_mask’; did you mean ‘cpu_cpu_mask’? [-Werror=implicit-function-declaration]
>>>     for_each_cpu(smt, cpu_smt_mask(core)) {
>>>                       ^
>>> ./include/linux/cpumask.h:242:32: note: in definition of macro ‘for_each_cpu’
>>>
>>> [...]
>>>
>>
>> Thanks for pointing that out. The individual patches of the current version of the RFC will not build on their own.
>> 
>> Please try building with the full patch set applied. I assure you, the following iterations of the RFC will resolve this issue.
> 
> Ah, I see, you have this CONFIG_TURBO_SCHED patch at the end of the series. I think that you will have a hard time getting new CONFIG switches into the task scheduler code. Can you not bring in your feature w/o CONFIG_TURBO_SCHED?
> 
> [...]
> 

Yes, that seems doable. I will remove the additional CONFIG option and modify the code so that it builds for arm64 as well.
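
For the cpu_smt_mask() build failure, one possible approach (only a sketch
under the assumption that a wrapper is acceptable; the macro name is a
placeholder) is to fall back to a single-CPU mask when CONFIG_SCHED_SMT is
not set:

/* sketch: fall back to a single-CPU mask on !CONFIG_SCHED_SMT kernels */
#ifdef CONFIG_SCHED_SMT
#define turbosched_smt_mask(cpu)	cpu_smt_mask(cpu)
#else
#define turbosched_smt_mask(cpu)	cpumask_of(cpu)
#endif

select_non_idle_core() could then iterate over turbosched_smt_mask(core)
instead of calling cpu_smt_mask() directly.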
Thanks.



* Re: [RFC 4/6] Add cpumask to track throughput intensive tasks
       [not found] ` <20190322060621.27021-5-parth015@linux.vnet.ibm.com>
@ 2019-04-11 15:54   ` Dietmar Eggemann
  2019-04-12 10:47     ` Dietmar Eggemann
  2019-04-12 10:58   ` Dietmar Eggemann
  2019-04-12 11:13   ` Dietmar Eggemann
  2 siblings, 1 reply; 20+ messages in thread
From: Dietmar Eggemann @ 2019-04-11 15:54 UTC (permalink / raw)
  To: Parth Shah, linux-pm

On 3/22/19 7:06 AM, Parth Shah wrote:
> WOF tasks indicates the workload optimized frequency tasks which has

You should explain what WOF stands for before you use it. I think it's only 
done in the next patch.

[...]

> @@ -6225,10 +6232,11 @@ static const int core_cap_mf = 16;
>    */
>   static int select_non_idle_core(struct task_struct *p, int prev_cpu)
>   {
> -	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
> +	struct cpumask highutil_task_mask_copy;

Don't put struct cpumask on the stack.

[...]



* Re: [RFC 5/6] Improvise cgroup interface for classifying jitter from WOF tasks
       [not found] ` <20190322060621.27021-6-parth015@linux.vnet.ibm.com>
@ 2019-04-11 16:01   ` Dietmar Eggemann
  2019-04-12 11:28     ` Parth Shah
  2019-04-12 13:08   ` Dietmar Eggemann
  1 sibling, 1 reply; 20+ messages in thread
From: Dietmar Eggemann @ 2019-04-11 16:01 UTC (permalink / raw)
  To: Parth Shah, linux-pm

On 3/22/19 7:06 AM, Parth Shah wrote:
> Signed-off-by: Parth Shah <parth015@linux.vnet.ibm.com>
> ---
>   kernel/sched/fair.c  | 40 +++++++++++++++++++++++++++++++++++++---
>   kernel/sched/sched.h |  2 ++
>   2 files changed, 39 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 7ecf5ed0b391..c7be5803a37c 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -27,9 +27,15 @@
>   /*
>    * Maintain cpumask to track the core occupying throughput intensive tasks,
>    * which are usually high utilization tasks.
> + *
> + * WOF(Workload Optimized Frequency) tasks are manually classified(task_type=2)

This contradicts patch 2/6:

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 903ef29b62c3..be8e8617e0cb 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -604,6 +604,14 @@ struct task_struct {
  	unsigned int			flags;
  	unsigned int			ptrace;

+	/*
+	 * task_characteristics:
+	 * 0 = unknown. Follows regular CFS policy for task placement.
+	 * 1 = Throughput intensive high utilization task
+	 * 2 = Jitter tasks. Should be packed to reduce active core count.
+	 */
+	long long unsigned		task_type;

> + * high utilization tasks which may get benefit of higher frequencies, because
> + * of higher Instructions per second metric.
>    */
>   static struct cpumask __highutil_task_cpu_mask;
>   #define highutil_task_cpu_mask ((struct cpumask *)&__highutil_task_cpu_mask)
> +static struct cpumask __wof_tasks_cpu_mask;
> +#define wof_tasks_cpu_mask ((struct cpumask *)&__wof_tasks_cpu_mask)
>   
>   /*
>    * Targeted preemption latency for CPU-bound tasks:
> @@ -5184,6 +5190,16 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
>   		flags = ENQUEUE_WAKEUP;
>   	}
>   
> +	/*
> +	 * If TurboSched feature is enabled and task is of WOF type(=1)
> +	 * then mark it on the rq
> +	 */
> +	if (turbo_sched_enabled() && p->task_type == 1)

... and with the code here.

[...]


* Re: [RFC 4/6] Add cpumask to track throughput intensive tasks
  2019-04-11 15:54   ` [RFC 4/6] Add cpumask to track throughput intensive tasks Dietmar Eggemann
@ 2019-04-12 10:47     ` Dietmar Eggemann
  2019-04-12 11:16       ` Parth Shah
  0 siblings, 1 reply; 20+ messages in thread
From: Dietmar Eggemann @ 2019-04-12 10:47 UTC (permalink / raw)
  To: Parth Shah, linux-pm

On 4/11/19 5:54 PM, Dietmar Eggemann wrote:
> On 3/22/19 7:06 AM, Parth Shah wrote:
>> WOF tasks indicates the workload optimized frequency tasks which has
> 
> You should explain what WOF stands for before you use it. I think it's only 
> done in the next patch.
> 
> [...]
> 
>> @@ -6225,10 +6232,11 @@ static const int core_cap_mf = 16;
>>    */
>>   static int select_non_idle_core(struct task_struct *p, int prev_cpu)
>>   {
>> -    struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
>> +    struct cpumask highutil_task_mask_copy;
> 
> Don't put struct cpumask on the stack.

Why don't you provide a per-cpu mask for highutil_task_cpu_mask and 
wof_tasks_cpu_mask like we have for select_idle_mask?

DEFINE_PER_CPU(cpumask_var_t, select_idle_mask);

[...]



* Re: [RFC 4/6] Add cpumask to track throughput intensive tasks
       [not found] ` <20190322060621.27021-5-parth015@linux.vnet.ibm.com>
  2019-04-11 15:54   ` [RFC 4/6] Add cpumask to track throughput intensive tasks Dietmar Eggemann
@ 2019-04-12 10:58   ` Dietmar Eggemann
  2019-04-12 11:31     ` Parth Shah
  2019-04-12 11:13   ` Dietmar Eggemann
  2 siblings, 1 reply; 20+ messages in thread
From: Dietmar Eggemann @ 2019-04-12 10:58 UTC (permalink / raw)
  To: Parth Shah, linux-pm

On 3/22/19 7:06 AM, Parth Shah wrote:

[...]

> @@ -6225,10 +6232,11 @@ static const int core_cap_mf = 16;
>    */
>   static int select_non_idle_core(struct task_struct *p, int prev_cpu)
>   {
> -	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
> +	struct cpumask highutil_task_mask_copy;
> +	struct cpumask *cpus = &highutil_task_mask_copy;
>   	int core, smt;
>   
> -	cpumask_and(cpus, cpu_online_mask, &p->cpus_allowed);
> +	cpumask_copy(cpus, highutil_task_cpu_mask);

IMHO, you still would have to and cpus with cpu_online_mask and 
&p->cpus_allowed.

[...]


* Re: [RFC 4/6] Add cpumask to track throughput intensive tasks
       [not found] ` <20190322060621.27021-5-parth015@linux.vnet.ibm.com>
  2019-04-11 15:54   ` [RFC 4/6] Add cpumask to track throughput intensive tasks Dietmar Eggemann
  2019-04-12 10:58   ` Dietmar Eggemann
@ 2019-04-12 11:13   ` Dietmar Eggemann
  2019-04-12 11:23     ` Parth Shah
  2 siblings, 1 reply; 20+ messages in thread
From: Dietmar Eggemann @ 2019-04-12 11:13 UTC (permalink / raw)
  To: Parth Shah, linux-pm

On 3/22/19 7:06 AM, Parth Shah wrote:

[...]

> @@ -7038,6 +7046,11 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf
>   	struct task_struct *p;
>   	int new_tasks;
>   
> +	if (!core_underutilized(cpu_util(rq->cpu), capacity_of(rq->cpu)))
> +		cpumask_test_and_set_cpu(rq->cpu, highutil_task_cpu_mask);
> +	else
> +		cpumask_test_and_clear_cpu(rq->cpu, highutil_task_cpu_mask);
> +

Shouldn't this update of the highutil_task_cpu_mask also be guarded by 
turbo_sched_enabled()?

[...]


* Re: [RFC 4/6] Add cpumask to track throughput intensive tasks
  2019-04-12 10:47     ` Dietmar Eggemann
@ 2019-04-12 11:16       ` Parth Shah
  0 siblings, 0 replies; 20+ messages in thread
From: Parth Shah @ 2019-04-12 11:16 UTC (permalink / raw)
  To: Dietmar Eggemann, linux-pm



On 4/12/19 4:17 PM, Dietmar Eggemann wrote:
> On 4/11/19 5:54 PM, Dietmar Eggemann wrote:
>> On 3/22/19 7:06 AM, Parth Shah wrote:
>>> WOF tasks indicates the workload optimized frequency tasks which has
>>
>> You should explain what WOF stands for before you use it. I think it's only done in the next patch.
>>
>> [...]
>>
>>> @@ -6225,10 +6232,11 @@ static const int core_cap_mf = 16;
>>>    */
>>>   static int select_non_idle_core(struct task_struct *p, int prev_cpu)
>>>   {
>>> -    struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
>>> +    struct cpumask highutil_task_mask_copy;
>>
>> Don't put struct cpumask on the stack.
> 
> Why don't you provide a per-cpu mask for highutil_task_cpu_mask and wof_tasks_cpu_mask like we have for select_idle_mask?
> 
> DEFINE_PER_CPU(cpumask_var_t, select_idle_mask);
> 
> [...]
> 
Yes, referring to a per-CPU highutil_task_mask will solve the problem of initializing a new cpumask on the stack.
Thanks for pointing that out.
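
Roughly along these lines (a sketch only; the per-CPU mask name is a
placeholder, and allocation would mirror what is already done for
select_idle_mask under CONFIG_CPUMASK_OFFSTACK):

/* sketch: per-CPU scratch mask instead of a struct cpumask on the stack */
DEFINE_PER_CPU(cpumask_var_t, turbosched_select_mask);

static int select_non_idle_core(struct task_struct *p, int prev_cpu)
{
	struct cpumask *cpus = this_cpu_cpumask_var_ptr(turbosched_select_mask);
	int core, smt;

	cpumask_copy(cpus, highutil_task_cpu_mask);
	...
}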



* Re: [RFC 4/6] Add cpumask to track throughput intensive tasks
  2019-04-12 11:13   ` Dietmar Eggemann
@ 2019-04-12 11:23     ` Parth Shah
  0 siblings, 0 replies; 20+ messages in thread
From: Parth Shah @ 2019-04-12 11:23 UTC (permalink / raw)
  To: Dietmar Eggemann, linux-pm



On 4/12/19 4:43 PM, Dietmar Eggemann wrote:
> On 3/22/19 7:06 AM, Parth Shah wrote:
> 
> [...]
> 
>> @@ -7038,6 +7046,11 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf
>>       struct task_struct *p;
>>       int new_tasks;
>> +    if (!core_underutilized(cpu_util(rq->cpu), capacity_of(rq->cpu)))
>> +        cpumask_test_and_set_cpu(rq->cpu, highutil_task_cpu_mask);
>> +    else
>> +        cpumask_test_and_clear_cpu(rq->cpu, highutil_task_cpu_mask);
>> +
> 
> Shouldn't this update of the highutil_task_cpu_mask also be guarded by turbo_sched_enabled()?
> 
> [...]
> 

So if this is kept inside turbo_sched_enabled(), then highutil_task_cpu_mask will contain stale
information at the moment turbo_sched gets enabled. This will result in selecting the wrong CPU until
pick_next_task_fair() has been invoked on most CPUs.

Maybe that is okay, since it only has an impact around the time the feature is enabled. Also, since I will
remove the CONFIG options, almost every piece of TurboSched code will have to pass through the turbo_sched_enabled() condition anyway.
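
For illustration, the guarded update would look something like this (a sketch
of the change being discussed, not the actual next revision):

	if (turbo_sched_enabled()) {
		if (!core_underutilized(cpu_util(rq->cpu), capacity_of(rq->cpu)))
			cpumask_test_and_set_cpu(rq->cpu, highutil_task_cpu_mask);
		else
			cpumask_test_and_clear_cpu(rq->cpu, highutil_task_cpu_mask);
	}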

Thanks



* Re: [RFC 5/6] Improvise cgroup interface for classifying jitter from WOF tasks
  2019-04-11 16:01   ` [RFC 5/6] Improvise cgroup interface for classifying jitter from WOF tasks Dietmar Eggemann
@ 2019-04-12 11:28     ` Parth Shah
  0 siblings, 0 replies; 20+ messages in thread
From: Parth Shah @ 2019-04-12 11:28 UTC (permalink / raw)
  To: Dietmar Eggemann, linux-pm



On 4/11/19 9:31 PM, Dietmar Eggemann wrote:
> On 3/22/19 7:06 AM, Parth Shah wrote:
>> Signed-off-by: Parth Shah <parth015@linux.vnet.ibm.com>
>> ---
>>   kernel/sched/fair.c  | 40 +++++++++++++++++++++++++++++++++++++---
>>   kernel/sched/sched.h |  2 ++
>>   2 files changed, 39 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 7ecf5ed0b391..c7be5803a37c 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -27,9 +27,15 @@
>>   /*
>>    * Maintain cpumask to track the core occupying throughput intensive tasks,
>>    * which are usually high utilization tasks.
>> + *
>> + * WOF(Workload Optimized Frequency) tasks are manually classified(task_type=2)

Good catch. This has to be task_type=1.
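
i.e. the quoted comment line would presumably become:

 * WOF(Workload Optimized Frequency) tasks are manually classified (task_type=1)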



* Re: [RFC 4/6] Add cpumask to track throughput intensive tasks
  2019-04-12 10:58   ` Dietmar Eggemann
@ 2019-04-12 11:31     ` Parth Shah
  0 siblings, 0 replies; 20+ messages in thread
From: Parth Shah @ 2019-04-12 11:31 UTC (permalink / raw)
  To: Dietmar Eggemann, linux-pm



On 4/12/19 4:28 PM, Dietmar Eggemann wrote:
> On 3/22/19 7:06 AM, Parth Shah wrote:
> 
> [...]
> 
>> @@ -6225,10 +6232,11 @@ static const int core_cap_mf = 16;
>>    */
>>   static int select_non_idle_core(struct task_struct *p, int prev_cpu)
>>   {
>> -    struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
>> +    struct cpumask highutil_task_mask_copy;
>> +    struct cpumask *cpus = &highutil_task_mask_copy;
>>       int core, smt;
>>   -    cpumask_and(cpus, cpu_online_mask, &p->cpus_allowed);
>>   +    cpumask_copy(cpus, highutil_task_cpu_mask);
> 
> IMHO, you still would have to and cpus with cpu_online_mask and &p->cpus_allowed.
> 
> [...]

I agree. I think I should retain the cpumask_and() with cpu_online_mask and &p->cpus_allowed in addition to the cpumask_copy() from highutil_task_cpu_mask.
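
Something like the following, as a rough sketch of the agreed change (not the
actual next revision):

	cpumask_copy(cpus, highutil_task_cpu_mask);
	cpumask_and(cpus, cpus, cpu_online_mask);
	cpumask_and(cpus, cpus, &p->cpus_allowed);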



* Re: [RFC 5/6] Improvise cgroup interface for classifying jitter from WOF tasks
       [not found] ` <20190322060621.27021-6-parth015@linux.vnet.ibm.com>
  2019-04-11 16:01   ` [RFC 5/6] Improvise cgroup interface for classifying jitter from WOF tasks Dietmar Eggemann
@ 2019-04-12 13:08   ` Dietmar Eggemann
  2019-04-12 15:48     ` Parth Shah
  1 sibling, 1 reply; 20+ messages in thread
From: Dietmar Eggemann @ 2019-04-12 13:08 UTC (permalink / raw)
  To: Parth Shah, linux-pm

On 3/22/19 7:06 AM, Parth Shah wrote:

[...]

> @@ -6236,7 +6259,15 @@ static int select_non_idle_core(struct task_struct *p, int prev_cpu)
>   	struct cpumask *cpus = &highutil_task_mask_copy;
>   	int core, smt;
>   
> -	cpumask_copy(cpus, highutil_task_cpu_mask);
> +	/*
> +	 * Prefer jitters to be pulled on core occupying WOF tasks
> +	 * If such tasks are not running then use long code path to find
> +	 * non-idle core with spare capacity
> +	 */

I'm struggling to grasp what you mean by ' ... long code path to find 
non-idle core with spare capacity' here?

We're here in:
select_task_rq_fair()->__select_idle_sibling()->select_non_idle_core()

What is the 'long code path'? In case select_non_idle_core() can't find 
a CPU, select_idle_sibling() is called but IMHO that's the original 
fastpath operating on the LLC domain.

It can't be the slow path find_idlest_cpu() because that's only called 
in case __select_idle_sibling() isn't called.

[...]


* Re: [RFC 5/6] Improvise cgroup interface for classifying jitter from WOF tasks
  2019-04-12 13:08   ` Dietmar Eggemann
@ 2019-04-12 15:48     ` Parth Shah
  2019-04-15 10:27       ` Dietmar Eggemann
  0 siblings, 1 reply; 20+ messages in thread
From: Parth Shah @ 2019-04-12 15:48 UTC (permalink / raw)
  To: Dietmar Eggemann, linux-pm



On 4/12/19 6:38 PM, Dietmar Eggemann wrote:
> On 3/22/19 7:06 AM, Parth Shah wrote:
> 
> [...]
> 
>> @@ -6236,7 +6259,15 @@ static int select_non_idle_core(struct task_struct *p, int prev_cpu)
>>       struct cpumask *cpus = &highutil_task_mask_copy;
>>       int core, smt;
>>   -    cpumask_copy(cpus, highutil_task_cpu_mask);
>> +    /*
>> +     * Prefer jitters to be pulled on core occupying WOF tasks
>> +     * If such tasks are not running then use long code path to find
>> +     * non-idle core with spare capacity
>> +     */
> 
> I'm struggling to grasp what you mean by ' ... long code path to find non-idle core with spare capacity' here?
> 
> We're here in:
> select_task_rq_fair()->__select_idle_sibling()->select_non_idle_core()
> 
> What is the 'long code path'? In case select_non_idle_core() can't find a CPU, select_idle_sibling() is called but IMHO that's the original fastpath operating on the LLC domain.
> 
> It can't be the slow path find_idlest_cpu() because that's only called in case __select_idle_sibling() isn't called.
> 
> [...]
> 

Maybe the comment is not self-explanatory. Please see the explanation below,

Here we have two cpumasks to iterate through:
1. wof_tasks_cpu_mask, which contains the CPUs occupied by user-classified tasks and is hence easier to track
2. highutil_task_cpu_mask, which contains the CPUs that are non-idle (not core_underutilized)

So "cpus" can be selected from either of the above two masks.
Selecting "wof_tasks_cpu_mask" leads to a quicker and better selection of a non-idle CPU,
whereas "highutil_task_cpu_mask" contains more CPUs (inclusive of wof_tasks_cpu_mask) and hence finding the right CPU may require more iterations.

Hence, iterating over "highutil_task_cpu_mask" is relatively exhaustive (which is what I termed the "longer code path") in the general case.
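
As a rough illustration of that selection (a sketch only, not the exact patch code):

	if (!cpumask_empty(wof_tasks_cpu_mask))
		/* quick path: prefer cores already running user-classified WOF tasks */
		cpumask_copy(cpus, wof_tasks_cpu_mask);
	else
		/* longer path: consider every core that is not under-utilized */
		cpumask_copy(cpus, highutil_task_cpu_mask);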

Hope this helps.

Thanks



* Re: [RFC 2/6] Provide cgroup interface for manual jitter classification
  2019-04-11 13:46   ` [RFC 2/6] Provide cgroup interface for manual jitter classification Dietmar Eggemann
@ 2019-04-15  7:23     ` Parth Shah
  0 siblings, 0 replies; 20+ messages in thread
From: Parth Shah @ 2019-04-15  7:23 UTC (permalink / raw)
  To: Dietmar Eggemann, linux-pm



On 4/11/19 7:16 PM, Dietmar Eggemann wrote:
> On 3/22/19 7:06 AM, Parth Shah wrote:
>> Signed-off-by: Parth Shah <parth015@linux.vnet.ibm.com>
>> ---
>>   include/linux/sched.h |  8 ++++++++
>>   kernel/sched/core.c   | 46 +++++++++++++++++++++++++++++++++++++++++++
>>   kernel/sched/fair.c   | 15 +++++++++-----
>>   kernel/sched/sched.h  |  2 ++
>>   4 files changed, 66 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index 903ef29b62c3..be8e8617e0cb 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -604,6 +604,14 @@ struct task_struct {
>>       unsigned int            flags;
>>       unsigned int            ptrace;
>>   +    /*
>> +     * task_characteristics:
>> +     * 0 = unknown. Follows regular CFS policy for task placement.
>> +     * 1 = Throughput intensive high utilization task
>> +     * 2 = Jitter tasks. Should be packed to reduce active core count.
> 
> This is very similar to the ideas in [PATCH v8 00/16] Add utilization clamping support https://lore.kernel.org/lkml/20190402104153.25404-1-patrick.bellasi@arm.com . Task classification from userspace ...
> 
> [...]
> 

The patches seem interesting and converge with the process of jitter classification.
I will try to rebase my patches on top of them.

Thanks



* Re: [RFC 5/6] Improvise cgroup interface for classifying jitter from WOF tasks
  2019-04-12 15:48     ` Parth Shah
@ 2019-04-15 10:27       ` Dietmar Eggemann
  0 siblings, 0 replies; 20+ messages in thread
From: Dietmar Eggemann @ 2019-04-15 10:27 UTC (permalink / raw)
  To: Parth Shah, linux-pm

On 4/12/19 5:48 PM, Parth Shah wrote:
> 
> 
> On 4/12/19 6:38 PM, Dietmar Eggemann wrote:
>> On 3/22/19 7:06 AM, Parth Shah wrote:

[...]

> Maybe the comment is not self-explanatory. Please see the explanation below,
> 
> Here we have two cpumasks to iterate through:
> 1. wof_tasks_cpu_mask, which contains the CPUs occupied by user-classified tasks and is hence easier to track
> 2. highutil_task_cpu_mask, which contains the CPUs that are non-idle (not core_underutilized)
> 
> So "cpus" can be selected from either of the above two masks.
> Selecting "wof_tasks_cpu_mask" leads to a quicker and better selection of a non-idle CPU,
> whereas "highutil_task_cpu_mask" contains more CPUs (inclusive of wof_tasks_cpu_mask) and hence finding the right CPU may require more iterations.
> 
> Hence, iterating over "highutil_task_cpu_mask" is relatively exhaustive (which is what I termed the "longer code path") in the general case.
> 
> Hope this helps.

Ah, OK, that helped me understand it.

[...]


