* Re: [PATCH 1/3] sched/fair: Introduce scaled capacity awareness in find_idlest_cpu code path
@ 2017-10-10 15:59 Atish Patra
  0 siblings, 0 replies; 9+ messages in thread
From: Atish Patra @ 2017-10-10 15:59 UTC (permalink / raw)
  To: rohit.k.jain
  Cc: mingo, morten.rasmussen, vincent.guittot, linux-kernel, joelaf,
	peterz, eas-dev, dietmar.eggemann


Minor nit: Patch version missing in the subject line.

Other than that:
Reviewed-by: Atish Patra <atish.patra@oracle.com>

Regards,
Atish
----- Original Message -----
From: rohit.k.jain@oracle.com
To: linux-kernel@vger.kernel.org, eas-dev@lists.linaro.org
Cc: peterz@infradead.org, mingo@redhat.com, joelaf@google.com, atish.patra@oracle.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, morten.rasmussen@arm.com
Sent: Saturday, October 7, 2017 6:44:47 PM GMT -06:00 US/Canada Central
Subject: [PATCH 1/3] sched/fair: Introduce scaled capacity awareness in find_idlest_cpu code path

While looking for idle CPUs for a waking task, we should also account
for the delays caused due to the bandwidth reduction by RT/IRQ tasks.

This patch does that by trying to find a higher capacity CPU with
minimum wake up latency.

Signed-off-by: Rohit Jain <rohit.k.jain@oracle.com>
---
 kernel/sched/fair.c | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0107280..eaede50 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5579,6 +5579,11 @@ static unsigned long capacity_orig_of(int cpu)
 	return cpu_rq(cpu)->cpu_capacity_orig;
 }
 
+static inline bool full_capacity(int cpu)
+{
+	return (capacity_of(cpu) >= (capacity_orig_of(cpu)*768 >> 10));
+}
+
 static unsigned long cpu_avg_load_per_task(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
@@ -5865,8 +5870,10 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
 	unsigned long load, min_load = ULONG_MAX;
 	unsigned int min_exit_latency = UINT_MAX;
 	u64 latest_idle_timestamp = 0;
+	unsigned int backup_cap = 0;
 	int least_loaded_cpu = this_cpu;
 	int shallowest_idle_cpu = -1;
+	int shallowest_idle_cpu_backup = -1;
 	int i;
 
 	/* Check if we have any choice: */
@@ -5876,6 +5883,7 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
 	/* Traverse only the allowed CPUs */
 	for_each_cpu_and(i, sched_group_span(group), &p->cpus_allowed) {
 		if (idle_cpu(i)) {
+			int idle_candidate = -1;
 			struct rq *rq = cpu_rq(i);
 			struct cpuidle_state *idle = idle_get_state(rq);
 			if (idle && idle->exit_latency < min_exit_latency) {
@@ -5886,7 +5894,7 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
 				 */
 				min_exit_latency = idle->exit_latency;
 				latest_idle_timestamp = rq->idle_stamp;
-				shallowest_idle_cpu = i;
+				idle_candidate = i;
 			} else if ((!idle || idle->exit_latency == min_exit_latency) &&
 				   rq->idle_stamp > latest_idle_timestamp) {
 				/*
@@ -5895,7 +5903,16 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
 				 * a warmer cache.
 				 */
 				latest_idle_timestamp = rq->idle_stamp;
-				shallowest_idle_cpu = i;
+				idle_candidate = i;
+			}
+
+			if (idle_candidate != -1) {
+				if (full_capacity(idle_candidate)) {
+					shallowest_idle_cpu = idle_candidate;
+				} else if (capacity_of(idle_candidate) > backup_cap) {
+					shallowest_idle_cpu_backup = idle_candidate;
+					backup_cap = capacity_of(idle_candidate);
+				}
 			}
 		} else if (shallowest_idle_cpu == -1) {
 			load = weighted_cpuload(cpu_rq(i));
@@ -5906,7 +5923,11 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
 		}
 	}
 
-	return shallowest_idle_cpu != -1 ? shallowest_idle_cpu : least_loaded_cpu;
+	if (shallowest_idle_cpu != -1)
+		return shallowest_idle_cpu;
+
+	return (shallowest_idle_cpu_backup != -1 ?
+		shallowest_idle_cpu_backup : least_loaded_cpu);
 }
 
 #ifdef CONFIG_SCHED_SMT
-- 
2.7.4


* Re: [PATCH 1/3] sched/fair: Introduce scaled capacity awareness in find_idlest_cpu code path
  2017-10-12 21:47     ` Joel Fernandes
@ 2017-10-13  1:54       ` Rohit Jain
  0 siblings, 0 replies; 9+ messages in thread
From: Rohit Jain @ 2017-10-13  1:54 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Peter Zijlstra, Atish Patra, LKML, eas-dev, Ingo Molnar,
	Vincent Guittot, Dietmar Eggemann, Morten Rasmussen

Hi Joel,


On 10/12/2017 02:47 PM, Joel Fernandes wrote:
> On Thu, Oct 12, 2017 at 10:03 AM, Rohit Jain <rohit.k.jain@oracle.com> wrote:
>> Hi Joel, Atish,
>>
>> Moving off-line discussions to LKML, just so everyone's on the same page.
>> I actually like this version now and it is outperforming my previous
>> code, so I am on board with this version. It makes the code simpler too.
> I think you should have explained what the version does differently.
> Nobody can read your mind.

I apologize for being terse (will do better next time).

This is based on your (offline) suggestion (and rightly so), that
find_idlest_group today bases its decision on capacity_spare_wake which
in turn only looks at the original capacity of the CPU. This diff
(version) changes that to look at the current capacity after being
scaled down (due to IRQ/RT/etc.).

Also, this diff changed find_idlest_group_cpu to not do a search for
CPUs based on the 'full_capacity()' function, instead changed it to
find the idlest CPU with max available capacity. This way we can avoid
all the 'backup' stuff in the code as in the version (v5) below it.

As you can see, the code becomes much simpler with the new search. This
is OK because we are doing a full CPU search in the sched_group_span
anyway.
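
To make the capacity_spare_wake() change concrete, here is a minimal
user-space sketch. It is not kernel code: struct cpu_model and both helper
names are hypothetical, and the numbers in the note below are made up for
illustration.

```c
/* Hypothetical user-space model of one CPU; not the kernel's struct rq. */
struct cpu_model {
	unsigned long capacity_orig;	/* capacity with no RT/IRQ pressure */
	unsigned long capacity;		/* capacity left after RT/IRQ scaling */
	unsigned long util_wake;	/* CFS utilization without the waking task */
};

/* Before the diff: spare capacity measured against the original capacity. */
static unsigned long spare_wake_orig(const struct cpu_model *c)
{
	return c->capacity_orig - c->util_wake;
}

/*
 * After the diff: spare capacity measured against the scaled capacity.
 * Assumes util_wake <= capacity; the subtraction is unsigned and would
 * wrap around otherwise.
 */
static unsigned long spare_wake_scaled(const struct cpu_model *c)
{
	return c->capacity - c->util_wake;
}
```

For a CPU with {1024, 512, 100} and another with {1024, 1000, 300}, the
original-capacity view reports 924 vs 724 and favors the first CPU, while
the scaled view reports 412 vs 700 and favors the second.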

[..]
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 56f343b..a1f622c 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -5724,7 +5724,7 @@ static int cpu_util_wake(int cpu, struct task_struct *p);
>>
>>   static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
>>   {
>> -    return capacity_orig_of(cpu) - cpu_util_wake(cpu, p);
>> +    return capacity_of(cpu) - cpu_util_wake(cpu, p);
>>   }
>>
>>   /*
>> @@ -5870,6 +5870,7 @@ find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this
>>       unsigned long load, min_load = ULONG_MAX;
>>       unsigned int min_exit_latency = UINT_MAX;
>>       u64 latest_idle_timestamp = 0;
>> +    unsigned int idle_cpu_cap = 0;
>>       int least_loaded_cpu = this_cpu;
>>       int shallowest_idle_cpu = -1;
>>       int i;
>> @@ -5881,6 +5882,7 @@ find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this
>>       /* Traverse only the allowed CPUs */
>>       for_each_cpu_and(i, sched_group_span(group), &p->cpus_allowed) {
>>           if (idle_cpu(i)) {
>> +            int idle_candidate = -1;
>>               struct rq *rq = cpu_rq(i);
>>               struct cpuidle_state *idle = idle_get_state(rq);
>>               if (idle && idle->exit_latency < min_exit_latency) {
>> @@ -5891,7 +5893,7 @@ find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this
>>                    */
>>                   min_exit_latency = idle->exit_latency;
>>                   latest_idle_timestamp = rq->idle_stamp;
>> -                shallowest_idle_cpu = i;
>> +                idle_candidate = i;
>>               } else if ((!idle || idle->exit_latency == min_exit_latency) &&
>>                      rq->idle_stamp > latest_idle_timestamp) {
>>                   /*
>> @@ -5900,8 +5902,14 @@ find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this
>>                    * a warmer cache.
>>                    */
>>                   latest_idle_timestamp = rq->idle_stamp;
>> -                shallowest_idle_cpu = i;
>> +                idle_candidate = i;
>>               }
>> +
>> +            if (idle_candidate != -1 &&
>> +                (capacity_of(idle_candidate) > idle_cpu_cap)) {
>> +                shallowest_idle_cpu = idle_candidate;
>> +                idle_cpu_cap = capacity_of(idle_candidate);
>> +            }
> This is broken: in case idle_candidate != -1 but idle_cpu_cap makes the
> condition false, you're still setting min_exit_latency, which is
> wrong.

Yes, you're right. I will fix this.

>
> Also, this means that if you have 2 CPUs and one is in a shallower idle
> state than the other but has less capacity, then it would select the
> CPU in the deeper idle state, right? So 'shallowest_idle_cpu' loses its
> meaning.

OK, I will change the name.

Thanks,
Rohit
> [..]


* Re: [PATCH 1/3] sched/fair: Introduce scaled capacity awareness in find_idlest_cpu code path
  2017-10-12 17:03   ` Rohit Jain
@ 2017-10-12 21:47     ` Joel Fernandes
  2017-10-13  1:54       ` Rohit Jain
  0 siblings, 1 reply; 9+ messages in thread
From: Joel Fernandes @ 2017-10-12 21:47 UTC (permalink / raw)
  To: Rohit Jain
  Cc: Peter Zijlstra, Atish Patra, LKML, eas-dev, Ingo Molnar,
	Vincent Guittot, Dietmar Eggemann, Morten Rasmussen

On Thu, Oct 12, 2017 at 10:03 AM, Rohit Jain <rohit.k.jain@oracle.com> wrote:
> Hi Joel, Atish,
>
> Moving off-line discussions to LKML, just so everyone's on the same page.
> I actually like this version now and it is outperforming my previous
> code, so I am on board with this version. It makes the code simpler too.

I think you should have explained what the version does differently.
Nobody can read your mind.

>
> Since we need a fast way of returning an idle CPU in the
> select_idle_sibling path, I think that can remain as it is (or maybe we
> can argue about the patch on that thread).

This is hardly an explanation of the diff below.

>
> If what I said above makes sense to everyone, I will send out a v6.
>
> As always, please let me know what you think.

More below:

> Thanks,
> Rohit
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 56f343b..a1f622c 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5724,7 +5724,7 @@ static int cpu_util_wake(int cpu, struct task_struct *p);
>
>  static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
>  {
> -    return capacity_orig_of(cpu) - cpu_util_wake(cpu, p);
> +    return capacity_of(cpu) - cpu_util_wake(cpu, p);
>  }
>
>  /*
> @@ -5870,6 +5870,7 @@ find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this
>      unsigned long load, min_load = ULONG_MAX;
>      unsigned int min_exit_latency = UINT_MAX;
>      u64 latest_idle_timestamp = 0;
> +    unsigned int idle_cpu_cap = 0;
>      int least_loaded_cpu = this_cpu;
>      int shallowest_idle_cpu = -1;
>      int i;
> @@ -5881,6 +5882,7 @@ find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this
>      /* Traverse only the allowed CPUs */
>      for_each_cpu_and(i, sched_group_span(group), &p->cpus_allowed) {
>          if (idle_cpu(i)) {
> +            int idle_candidate = -1;
>              struct rq *rq = cpu_rq(i);
>              struct cpuidle_state *idle = idle_get_state(rq);
>              if (idle && idle->exit_latency < min_exit_latency) {
> @@ -5891,7 +5893,7 @@ find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this
>                   */
>                  min_exit_latency = idle->exit_latency;
>                  latest_idle_timestamp = rq->idle_stamp;
> -                shallowest_idle_cpu = i;
> +                idle_candidate = i;
>              } else if ((!idle || idle->exit_latency == min_exit_latency) &&
>                     rq->idle_stamp > latest_idle_timestamp) {
>                  /*
> @@ -5900,8 +5902,14 @@ find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this
>                   * a warmer cache.
>                   */
>                  latest_idle_timestamp = rq->idle_stamp;
> -                shallowest_idle_cpu = i;
> +                idle_candidate = i;
>              }
> +
> +            if (idle_candidate != -1 &&
> +                (capacity_of(idle_candidate) > idle_cpu_cap)) {
> +                shallowest_idle_cpu = idle_candidate;
> +                idle_cpu_cap = capacity_of(idle_candidate);
> +            }

This is broken: in case idle_candidate != -1 but idle_cpu_cap makes the
condition false, you're still setting min_exit_latency, which is
wrong.

Also, this means that if you have 2 CPUs and one is in a shallower idle
state than the other but has less capacity, then it would select the
CPU in the deeper idle state, right? So 'shallowest_idle_cpu' loses its
meaning.
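
Both points can be reproduced with a small user-space model of the loop.
This is only a sketch under assumptions: struct idle_cpu_model,
struct pick_result and pick_like_draft() are hypothetical names, every CPU
in the array is treated as idle, and the idle_stamp tie-break is left out.

```c
#include <limits.h>

/* Hypothetical stand-in for the per-CPU data the loop inspects. */
struct idle_cpu_model {
	unsigned int exit_latency;	/* exit latency of the CPU's idle state */
	unsigned long capacity;		/* scaled capacity, as capacity_of() would report */
};

struct pick_result {
	int cpu;			/* index of the chosen CPU, or -1 */
	unsigned int min_exit_latency;	/* tracker value when the loop ends */
};

/* Mirrors the structure of the posted diff, minus the idle_stamp branch. */
static struct pick_result pick_like_draft(const struct idle_cpu_model *cpus, int n)
{
	struct pick_result r = { .cpu = -1, .min_exit_latency = UINT_MAX };
	unsigned long idle_cpu_cap = 0;
	int i;

	for (i = 0; i < n; i++) {
		int idle_candidate = -1;

		if (cpus[i].exit_latency < r.min_exit_latency) {
			/* Updated even if the capacity check below rejects i. */
			r.min_exit_latency = cpus[i].exit_latency;
			idle_candidate = i;
		}

		if (idle_candidate != -1 && cpus[i].capacity > idle_cpu_cap) {
			r.cpu = idle_candidate;
			idle_cpu_cap = cpus[i].capacity;
		}
	}
	return r;
}
```

With CPUs {100, 1024} and {10, 512} in that order, the model returns CPU 0
while min_exit_latency ends at 10, a value that belongs to the rejected
CPU 1. A third CPU {50, 2048} is then never considered, even though it has
the most capacity, because 50 is not below the stale minimum.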

thanks,

- Joel

[..]


* Re: [PATCH 1/3] sched/fair: Introduce scaled capacity awareness in find_idlest_cpu code path
  2017-10-07 23:48 ` [PATCH 1/3] sched/fair: Introduce scaled capacity awareness in find_idlest_cpu code path Rohit Jain
@ 2017-10-12 17:03   ` Rohit Jain
  2017-10-12 21:47     ` Joel Fernandes
  0 siblings, 1 reply; 9+ messages in thread
From: Rohit Jain @ 2017-10-12 17:03 UTC (permalink / raw)
  To: peterz, joelaf, atish.patra
  Cc: linux-kernel, eas-dev, mingo, vincent.guittot, dietmar.eggemann,
	morten.rasmussen

Hi Joel, Atish,

Moving off-line discussions to LKML, just so everyone's on the same page.
I actually like this version now and it is outperforming my previous
code, so I am on board with this version. It makes the code simpler too.

Since we need a fast way of returning an idle CPU in the
select_idle_sibling path, I think that can remain as it is (or maybe we
can argue about the patch on that thread).

If what I said above makes sense to everyone, I will send out a v6.

As always, please let me know what you think.

Thanks,
Rohit

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 56f343b..a1f622c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5724,7 +5724,7 @@ static int cpu_util_wake(int cpu, struct task_struct *p);

  static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
  {
-    return capacity_orig_of(cpu) - cpu_util_wake(cpu, p);
+    return capacity_of(cpu) - cpu_util_wake(cpu, p);
  }

  /*
@@ -5870,6 +5870,7 @@ find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this
      unsigned long load, min_load = ULONG_MAX;
      unsigned int min_exit_latency = UINT_MAX;
      u64 latest_idle_timestamp = 0;
+    unsigned int idle_cpu_cap = 0;
      int least_loaded_cpu = this_cpu;
      int shallowest_idle_cpu = -1;
      int i;
@@ -5881,6 +5882,7 @@ find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this
      /* Traverse only the allowed CPUs */
      for_each_cpu_and(i, sched_group_span(group), &p->cpus_allowed) {
          if (idle_cpu(i)) {
+            int idle_candidate = -1;
              struct rq *rq = cpu_rq(i);
              struct cpuidle_state *idle = idle_get_state(rq);
              if (idle && idle->exit_latency < min_exit_latency) {
@@ -5891,7 +5893,7 @@ find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this
                   */
                  min_exit_latency = idle->exit_latency;
                  latest_idle_timestamp = rq->idle_stamp;
-                shallowest_idle_cpu = i;
+                idle_candidate = i;
              } else if ((!idle || idle->exit_latency == min_exit_latency) &&
                     rq->idle_stamp > latest_idle_timestamp) {
                  /*
@@ -5900,8 +5902,14 @@ find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this
                   * a warmer cache.
                   */
                  latest_idle_timestamp = rq->idle_stamp;
-                shallowest_idle_cpu = i;
+                idle_candidate = i;
              }
+
+            if (idle_candidate != -1 &&
+                (capacity_of(idle_candidate) > idle_cpu_cap)) {
+                shallowest_idle_cpu = idle_candidate;
+                idle_cpu_cap = capacity_of(idle_candidate);
+            }
          } else if (shallowest_idle_cpu == -1) {
              load = weighted_cpuload(cpu_rq(i));
              if (load < min_load || (load == min_load && i == this_cpu)) {
-- 
2.7.4


On 10/07/2017 04:48 PM, Rohit Jain wrote:
> While looking for idle CPUs for a waking task, we should also account
> for the delays caused due to the bandwidth reduction by RT/IRQ tasks.
>
> This patch does that by trying to find a higher capacity CPU with
> minimum wake up latency.
>
> Signed-off-by: Rohit Jain<rohit.k.jain@oracle.com>
> ---
>   kernel/sched/fair.c | 27 ++++++++++++++++++++++++---
>   1 file changed, 24 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0107280..eaede50 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5579,6 +5579,11 @@ static unsigned long capacity_orig_of(int cpu)
>   	return cpu_rq(cpu)->cpu_capacity_orig;
>   }
>   
> +static inline bool full_capacity(int cpu)
> +{
> +	return (capacity_of(cpu) >= (capacity_orig_of(cpu)*768 >> 10));
> +}
> +
>   static unsigned long cpu_avg_load_per_task(int cpu)
>   {
>   	struct rq *rq = cpu_rq(cpu);
> @@ -5865,8 +5870,10 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>   	unsigned long load, min_load = ULONG_MAX;
>   	unsigned int min_exit_latency = UINT_MAX;
>   	u64 latest_idle_timestamp = 0;
> +	unsigned int backup_cap = 0;
>   	int least_loaded_cpu = this_cpu;
>   	int shallowest_idle_cpu = -1;
> +	int shallowest_idle_cpu_backup = -1;
>   	int i;
>   
>   	/* Check if we have any choice: */
> @@ -5876,6 +5883,7 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>   	/* Traverse only the allowed CPUs */
>   	for_each_cpu_and(i, sched_group_span(group), &p->cpus_allowed) {
>   		if (idle_cpu(i)) {
> +			int idle_candidate = -1;
>   			struct rq *rq = cpu_rq(i);
>   			struct cpuidle_state *idle = idle_get_state(rq);
>   			if (idle && idle->exit_latency < min_exit_latency) {
> @@ -5886,7 +5894,7 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>   				 */
>   				min_exit_latency = idle->exit_latency;
>   				latest_idle_timestamp = rq->idle_stamp;
> -				shallowest_idle_cpu = i;
> +				idle_candidate = i;
>   			} else if ((!idle || idle->exit_latency == min_exit_latency) &&
>   				   rq->idle_stamp > latest_idle_timestamp) {
>   				/*
> @@ -5895,7 +5903,16 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>   				 * a warmer cache.
>   				 */
>   				latest_idle_timestamp = rq->idle_stamp;
> -				shallowest_idle_cpu = i;
> +				idle_candidate = i;
> +			}
> +
> +			if (idle_candidate != -1) {
> +				if (full_capacity(idle_candidate)) {
> +					shallowest_idle_cpu = idle_candidate;
> +				} else if (capacity_of(idle_candidate) > backup_cap) {
> +					shallowest_idle_cpu_backup = idle_candidate;
> +					backup_cap = capacity_of(idle_candidate);
> +				}
>   			}
>   		} else if (shallowest_idle_cpu == -1) {
>   			load = weighted_cpuload(cpu_rq(i));
> @@ -5906,7 +5923,11 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>   		}
>   	}
>   
> -	return shallowest_idle_cpu != -1 ? shallowest_idle_cpu : least_loaded_cpu;
> +	if (shallowest_idle_cpu != -1)
> +		return shallowest_idle_cpu;
> +
> +	return (shallowest_idle_cpu_backup != -1 ?
> +		shallowest_idle_cpu_backup : least_loaded_cpu);
>   }
>   
>   #ifdef CONFIG_SCHED_SMT


* [PATCH 1/3] sched/fair: Introduce scaled capacity awareness in find_idlest_cpu code path
  2017-10-07 23:48 [PATCH v5 0/3] sched/fair: Introduce scaled capacity awareness in enqueue Rohit Jain
@ 2017-10-07 23:48 ` Rohit Jain
  2017-10-12 17:03   ` Rohit Jain
  0 siblings, 1 reply; 9+ messages in thread
From: Rohit Jain @ 2017-10-07 23:48 UTC (permalink / raw)
  To: linux-kernel, eas-dev
  Cc: peterz, mingo, joelaf, atish.patra, vincent.guittot,
	dietmar.eggemann, morten.rasmussen

While looking for idle CPUs for a waking task, we should also account
for the delays caused due to the bandwidth reduction by RT/IRQ tasks.

This patch does that by trying to find a higher capacity CPU with
minimum wake up latency.

Signed-off-by: Rohit Jain <rohit.k.jain@oracle.com>
---
 kernel/sched/fair.c | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0107280..eaede50 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5579,6 +5579,11 @@ static unsigned long capacity_orig_of(int cpu)
 	return cpu_rq(cpu)->cpu_capacity_orig;
 }
 
+static inline bool full_capacity(int cpu)
+{
+	return (capacity_of(cpu) >= (capacity_orig_of(cpu)*768 >> 10));
+}
+
 static unsigned long cpu_avg_load_per_task(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
@@ -5865,8 +5870,10 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
 	unsigned long load, min_load = ULONG_MAX;
 	unsigned int min_exit_latency = UINT_MAX;
 	u64 latest_idle_timestamp = 0;
+	unsigned int backup_cap = 0;
 	int least_loaded_cpu = this_cpu;
 	int shallowest_idle_cpu = -1;
+	int shallowest_idle_cpu_backup = -1;
 	int i;
 
 	/* Check if we have any choice: */
@@ -5876,6 +5883,7 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
 	/* Traverse only the allowed CPUs */
 	for_each_cpu_and(i, sched_group_span(group), &p->cpus_allowed) {
 		if (idle_cpu(i)) {
+			int idle_candidate = -1;
 			struct rq *rq = cpu_rq(i);
 			struct cpuidle_state *idle = idle_get_state(rq);
 			if (idle && idle->exit_latency < min_exit_latency) {
@@ -5886,7 +5894,7 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
 				 */
 				min_exit_latency = idle->exit_latency;
 				latest_idle_timestamp = rq->idle_stamp;
-				shallowest_idle_cpu = i;
+				idle_candidate = i;
 			} else if ((!idle || idle->exit_latency == min_exit_latency) &&
 				   rq->idle_stamp > latest_idle_timestamp) {
 				/*
@@ -5895,7 +5903,16 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
 				 * a warmer cache.
 				 */
 				latest_idle_timestamp = rq->idle_stamp;
-				shallowest_idle_cpu = i;
+				idle_candidate = i;
+			}
+
+			if (idle_candidate != -1) {
+				if (full_capacity(idle_candidate)) {
+					shallowest_idle_cpu = idle_candidate;
+				} else if (capacity_of(idle_candidate) > backup_cap) {
+					shallowest_idle_cpu_backup = idle_candidate;
+					backup_cap = capacity_of(idle_candidate);
+				}
 			}
 		} else if (shallowest_idle_cpu == -1) {
 			load = weighted_cpuload(cpu_rq(i));
@@ -5906,7 +5923,11 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
 		}
 	}
 
-	return shallowest_idle_cpu != -1 ? shallowest_idle_cpu : least_loaded_cpu;
+	if (shallowest_idle_cpu != -1)
+		return shallowest_idle_cpu;
+
+	return (shallowest_idle_cpu_backup != -1 ?
+		shallowest_idle_cpu_backup : least_loaded_cpu);
 }
 
 #ifdef CONFIG_SCHED_SMT
-- 
2.7.4


* Re: [PATCH 1/3] sched/fair: Introduce scaled capacity awareness in find_idlest_cpu code path
  2017-09-26  4:40     ` Rohit Jain
@ 2017-09-26  6:59       ` Joel Fernandes
  0 siblings, 0 replies; 9+ messages in thread
From: Joel Fernandes @ 2017-09-26  6:59 UTC (permalink / raw)
  To: Rohit Jain
  Cc: LKML, eas-dev, Peter Zijlstra, Ingo Molnar, Atish Patra,
	Vincent Guittot, Dietmar Eggemann, Morten Rasmussen

Hi Rohit,

On Mon, Sep 25, 2017 at 9:40 PM, Rohit Jain <rohit.k.jain@oracle.com> wrote:
> On 09/25/2017 07:51 PM, joelaf wrote:
[...]
>>
>> On 09/25/2017 05:02 PM, Rohit Jain wrote:
>>>
>>> While looking for idle CPUs for a waking task, we should also account
>>> for the delays caused due to the bandwidth reduction by RT/IRQ tasks.
>>>
>>> This patch does that by trying to find a higher capacity CPU with
>>> minimum wake up latency.
>>>
>>>
>>> Signed-off-by: Rohit Jain <rohit.k.jain@oracle.com>
>>> ---
>>>   kernel/sched/fair.c | 27 ++++++++++++++++++++++++---
>>>   1 file changed, 24 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index eca6a57..afb701f 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -5590,6 +5590,11 @@ static unsigned long capacity_orig_of(int cpu)
>>>         return cpu_rq(cpu)->cpu_capacity_orig;
>>>   }
>>> +static inline bool full_capacity(int cpu)
>>> +{
>>> +       return (capacity_of(cpu) >= (capacity_orig_of(cpu)*819 >> 10));
>>
>> Wouldn't 768 be better for multiplication? gcc converts the expression to
>> shifts and adds then.
>
>
> While 768 is easier to convert to shifts and adds, 819/1024 gets you
> very close to 80% which is what I was trying to achieve.

Yeah, I guess if it's not too hard, you could check if 768 gets you a
similar result, but I would defer to the maintainers on what they are
OK with.
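
For reference, the arithmetic behind the two multipliers can be sketched in
user space as below. Both helper names are hypothetical, and whether a
given compiler actually emits shifts and adds for the multiplication
depends on the target and flags.

```c
/* cap_orig * mult / 1024, the same form as the full_capacity() threshold. */
static unsigned long capacity_threshold(unsigned long cap_orig, unsigned long mult)
{
	return (cap_orig * mult) >> 10;
}

/* 768 == 512 + 256, so x * 768 can be written as two shifts and one add. */
static unsigned long mul768_shift_add(unsigned long x)
{
	return (x << 9) + (x << 8);
}
```

For an original capacity of 1024, a multiplier of 768 gives a threshold of
768 (exactly 75%), while 819 gives 819 (about 79.98%).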

>>> +}
>>> +
>>>   static unsigned long cpu_avg_load_per_task(int cpu)
>>>   {
>>>         struct rq *rq = cpu_rq(cpu);
>>> @@ -5916,8 +5921,10 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>>>         unsigned long load, min_load = ULONG_MAX;
>>>         unsigned int min_exit_latency = UINT_MAX;
>>>         u64 latest_idle_timestamp = 0;
>>> +       unsigned int backup_cap = 0;
>>>         int least_loaded_cpu = this_cpu;
>>>         int shallowest_idle_cpu = -1;
>>> +       int shallowest_idle_cpu_backup = -1;
>>>         int i;
>>>         /* Check if we have any choice: */
>>> @@ -5937,7 +5944,12 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>>>                                  */
>>>                                 min_exit_latency = idle->exit_latency;
>>>                                 latest_idle_timestamp = rq->idle_stamp;
>>> -                               shallowest_idle_cpu = i;
>>> +                               if (full_capacity(i)) {
>>> +                                       shallowest_idle_cpu = i;
>>> +                               } else if (capacity_of(i) > backup_cap) {
>>> +                                       shallowest_idle_cpu_backup = i;
>>> +                                       backup_cap = capacity_of(i);
>>> +                               }
>>
>> I'm a bit skeptical about this - if the CPU is idle, then is it likely
>> that the capacity of the CPU is reduced due to RT pressure?
>
>
> What has idleness got to do with RT pressure?
>
> This is an instantaneous view where the scheduler is looking to place
> threads. In this case, if we know historically the capacity of the CPU
> is reduced (due to RT/IRQ/Thermal Throttling or whatever it may be) we
> should avoid that CPU if we have a choice.

Yeah Ok, that's a fair point, I don't dispute this fact. I was just
trying to understand your patch.

thanks,

- Joel

[...]


* Re: [PATCH 1/3] sched/fair: Introduce scaled capacity awareness in find_idlest_cpu code path
  2017-09-26  2:51   ` joelaf
@ 2017-09-26  4:40     ` Rohit Jain
  2017-09-26  6:59       ` Joel Fernandes
  0 siblings, 1 reply; 9+ messages in thread
From: Rohit Jain @ 2017-09-26  4:40 UTC (permalink / raw)
  To: joelaf, linux-kernel, eas-dev
  Cc: peterz, mingo, atish.patra, vincent.guittot, dietmar.eggemann,
	morten.rasmussen

On 09/25/2017 07:51 PM, joelaf wrote:
> Hi Rohit,
>
> Just some comments:

Hi Joel,

Thanks for the comments

>
> On 09/25/2017 05:02 PM, Rohit Jain wrote:
>> While looking for idle CPUs for a waking task, we should also account
>> for the delays caused due to the bandwidth reduction by RT/IRQ tasks.
>>
>> This patch does that by trying to find a higher capacity CPU with
>> minimum wake up latency.
>>
>>
>> Signed-off-by: Rohit Jain <rohit.k.jain@oracle.com>
>> ---
>>   kernel/sched/fair.c | 27 ++++++++++++++++++++++++---
>>   1 file changed, 24 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index eca6a57..afb701f 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -5590,6 +5590,11 @@ static unsigned long capacity_orig_of(int cpu)
>>   	return cpu_rq(cpu)->cpu_capacity_orig;
>>   }
>>   
>> +static inline bool full_capacity(int cpu)
>> +{
>> +	return (capacity_of(cpu) >= (capacity_orig_of(cpu)*819 >> 10));
> Wouldn't 768 be better for multiplication? gcc converts the expression to shifts and adds then.

While 768 is easier to convert to shifts and adds, 819/1024 gets you
very close to 80% which is what I was trying to achieve.

>
>> +}
>> +
>>   static unsigned long cpu_avg_load_per_task(int cpu)
>>   {
>>   	struct rq *rq = cpu_rq(cpu);
>> @@ -5916,8 +5921,10 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>>   	unsigned long load, min_load = ULONG_MAX;
>>   	unsigned int min_exit_latency = UINT_MAX;
>>   	u64 latest_idle_timestamp = 0;
>> +	unsigned int backup_cap = 0;
>>   	int least_loaded_cpu = this_cpu;
>>   	int shallowest_idle_cpu = -1;
>> +	int shallowest_idle_cpu_backup = -1;
>>   	int i;
>>   
>>   	/* Check if we have any choice: */
>> @@ -5937,7 +5944,12 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>>   				 */
>>   				min_exit_latency = idle->exit_latency;
>>   				latest_idle_timestamp = rq->idle_stamp;
>> -				shallowest_idle_cpu = i;
>> +				if (full_capacity(i)) {
>> +					shallowest_idle_cpu = i;
>> +				} else if (capacity_of(i) > backup_cap) {
>> +					shallowest_idle_cpu_backup = i;
>> +					backup_cap = capacity_of(i);
>> +				}
> I'm a bit skeptical about this - if the CPU is idle, then is it likely that the capacity of the CPU is reduced due to RT pressure?

What has idleness got to do with RT pressure?

This is an instantaneous view where the scheduler is looking to place
threads. In this case, if we know historically the capacity of the CPU
is reduced (due to RT/IRQ/Thermal Throttling or whatever it may be) we
should avoid that CPU if we have a choice.
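
The "avoid that CPU if we have a choice" policy can be sketched in user
space as follows. This is only an illustration: struct cpu_cap_model,
full_capacity_model() and pick_with_backup() are hypothetical names, and
the exit-latency/idle_stamp filter from the patch is omitted, so every
array entry is treated as an idle CPU that already passed it.

```c
#include <stdbool.h>

/* Hypothetical model of the two capacity values the patch compares. */
struct cpu_cap_model {
	unsigned long capacity_orig;	/* capacity with no RT/IRQ pressure */
	unsigned long capacity;		/* capacity after RT/IRQ scaling */
};

/* Same shape as full_capacity() in the patch: at least 819/1024 of original. */
static bool full_capacity_model(const struct cpu_cap_model *c)
{
	return c->capacity >= (c->capacity_orig * 819 >> 10);
}

/*
 * Prefer a CPU that still has (nearly) full capacity; otherwise fall back
 * to the reduced-capacity CPU with the most capacity left.
 * Returns an index into cpus[], or -1 if n is 0.
 */
static int pick_with_backup(const struct cpu_cap_model *cpus, int n)
{
	unsigned long backup_cap = 0;
	int chosen = -1, backup = -1, i;

	for (i = 0; i < n; i++) {
		if (full_capacity_model(&cpus[i])) {
			chosen = i;
		} else if (cpus[i].capacity > backup_cap) {
			backup = i;
			backup_cap = cpus[i].capacity;
		}
	}
	return chosen != -1 ? chosen : backup;
}
```

With capacities 600 and 700 out of 1024, neither CPU reaches the threshold
of 819, so the backup with 700 is returned. Adding a CPU at 900 makes that
one the choice regardless of the backups.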

> I can see that it can matter, but I am wondering if you have any data for your use case to show that it does (that is, if you didn't consider RT pressure for idle CPUs, are you still seeing a big enough performance improvement to warrant the change?)

I tested this with OLTP, which has a mix of both IRQ and RT threads.
Also, I tested with micro-benchmarks which have IRQ and fair threads. I
haven't seen what happens with RT alone. This would come back to the
question: "Why reduce capacities when an RT thread is running?" Frankly,
I don't know the answer; however, from a capacity standpoint that is
taken into account.

It makes sense to me, although I don't know the reasoning.

>>   			} else if ((!idle || idle->exit_latency == min_exit_latency) &&
>>   				   rq->idle_stamp > latest_idle_timestamp) {
>>   				/*
>> @@ -5946,7 +5958,12 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>>   				 * a warmer cache.
>>   				 */
>>   				latest_idle_timestamp = rq->idle_stamp;
>> -				shallowest_idle_cpu = i;
>> +				if (full_capacity(i)) {
>> +					shallowest_idle_cpu = i;
>> +				} else if (capacity_of(i) > backup_cap) {
>> +					shallowest_idle_cpu_backup = i;
>> +					backup_cap = capacity_of(i);
>> +				}
>>   			}
>>   		} else if (shallowest_idle_cpu == -1) {
>>   			load = weighted_cpuload(cpu_rq(i));
>> @@ -5957,7 +5974,11 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>>   		}
>>   	}
>>   
>> -	return shallowest_idle_cpu != -1 ? shallowest_idle_cpu : least_loaded_cpu;
>> +	if (shallowest_idle_cpu != -1)
>> +		return shallowest_idle_cpu;
>> +
>> +	return (shallowest_idle_cpu_backup != -1 ?
>> +		shallowest_idle_cpu_backup : least_loaded_cpu);
>>   }
>>   
>>   #ifdef CONFIG_SCHED_SMT
>>
> I see code duplication here which can be reduced by 7 lines compared to your original patch:

This does look better and I will try to incorporate this.

Thanks,
Rohit

>
> ---
>   kernel/sched/fair.c | 20 +++++++++++++++++---
>   1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index c95880e216f6..72fc8d18b251 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5528,6 +5528,7 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>   	/* Traverse only the allowed CPUs */
>   	for_each_cpu_and(i, sched_group_span(group), &p->cpus_allowed) {
>   		if (idle_cpu(i)) {
> +			int idle_candidate = -1;
>   			struct rq *rq = cpu_rq(i);
>   			struct cpuidle_state *idle = idle_get_state(rq);
>   			if (idle && idle->exit_latency < min_exit_latency) {
> @@ -5538,7 +5539,7 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>   				 */
>   				min_exit_latency = idle->exit_latency;
>   				latest_idle_timestamp = rq->idle_stamp;
> -				shallowest_idle_cpu = i;
> +				idle_candidate = i;
>   			} else if ((!idle || idle->exit_latency == min_exit_latency) &&
>   				   rq->idle_stamp > latest_idle_timestamp) {
>   				/*
> @@ -5547,7 +5548,16 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>   				 * a warmer cache.
>   				 */
>   				latest_idle_timestamp = rq->idle_stamp;
> -				shallowest_idle_cpu = i;
> +				idle_candidate = i;
> +			}
> +
> +			if (idle_candidate != -1) {
> +				if (full_capacity(idle_candidate)) {
> +					shallowest_idle_cpu = idle_candidate;
> +				} else if (capacity_of(idle_candidate) > backup_cap) {
> +					shallowest_idle_cpu_backup = idle_candidate;
> +					backup_cap = capacity_of(idle_candidate);
> +				}
>   			}
>   		} else if (shallowest_idle_cpu == -1) {
>   			load = weighted_cpuload(i);
> @@ -5558,7 +5568,11 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>   		}
>   	}
>   
> -	return shallowest_idle_cpu != -1 ? shallowest_idle_cpu : least_loaded_cpu;
> +	if (shallowest_idle_cpu != -1)
> +		return shallowest_idle_cpu;
> +
> +	return (shallowest_idle_cpu_backup != -1 ?
> +			shallowest_idle_cpu_backup : least_loaded_cpu);
>   }
>   
>   #ifdef CONFIG_SCHED_SMT

* Re: [PATCH 1/3] sched/fair: Introduce scaled capacity awareness in find_idlest_cpu code path
  2017-09-26  0:02 ` [PATCH 1/3] sched/fair: Introduce scaled capacity awareness in find_idlest_cpu code path Rohit Jain
@ 2017-09-26  2:51   ` joelaf
  2017-09-26  4:40     ` Rohit Jain
  0 siblings, 1 reply; 9+ messages in thread
From: joelaf @ 2017-09-26  2:51 UTC (permalink / raw)
  To: Rohit Jain, linux-kernel, eas-dev
  Cc: peterz, mingo, atish.patra, vincent.guittot, dietmar.eggemann,
	morten.rasmussen

Hi Rohit,

Just some comments:

On 09/25/2017 05:02 PM, Rohit Jain wrote:
> While looking for idle CPUs for a waking task, we should also account
> for the delays caused due to the bandwidth reduction by RT/IRQ tasks.
> 
> This patch does that by trying to find a higher capacity CPU with
> minimum wake up latency.
> 
> 
> Signed-off-by: Rohit Jain <rohit.k.jain@oracle.com>
> ---
>  kernel/sched/fair.c | 27 ++++++++++++++++++++++++---
>  1 file changed, 24 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index eca6a57..afb701f 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5590,6 +5590,11 @@ static unsigned long capacity_orig_of(int cpu)
>  	return cpu_rq(cpu)->cpu_capacity_orig;
>  }
>  
> +static inline bool full_capacity(int cpu)
> +{
> +	return (capacity_of(cpu) >= (capacity_orig_of(cpu)*819 >> 10));

Wouldn't 768 be a better multiplier? gcc can then convert the expression to shifts and adds.
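
As a worked illustration of the two factors, here is a small userspace sketch.
The function names are hypothetical and 1024 as the full-capacity value is an
assumption. What a compiler actually emits depends on the target and the
optimization level, so the shift form below only shows the arithmetic identity.

```c
#include <assert.h>

#define CAPACITY_SHIFT 10	/* assumption: 1 << 10 == 1024 is full capacity */

/* Threshold with the factor from the posted patch: 819/1024, about 80%. */
static unsigned long threshold_819(unsigned long cap_orig)
{
	return (cap_orig * 819) >> CAPACITY_SHIFT;
}

/* Threshold with the suggested factor: 768/1024, exactly 75%. */
static unsigned long threshold_768(unsigned long cap_orig)
{
	return (cap_orig * 768) >> CAPACITY_SHIFT;
}

/* 768 == 512 + 256, so the multiply equals two shifts and one add. */
static unsigned long threshold_768_shifts(unsigned long cap_orig)
{
	return ((cap_orig << 9) + (cap_orig << 8)) >> CAPACITY_SHIFT;
}

static void check_thresholds(void)
{
	assert(threshold_819(1024) == 819);
	assert(threshold_768(1024) == 768);
	assert(threshold_768_shifts(1024) == threshold_768(1024));

	/* A CPU at 800/1024 passes the 75% check but not the 80% one. */
	assert(800 >= threshold_768(1024));
	assert(800 < threshold_819(1024));
}
```

Note that switching from 819 to 768 is not only a code generation change: it
also lowers the cut-off from roughly 80% to 75% of the original capacity.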

> +}
> +
>  static unsigned long cpu_avg_load_per_task(int cpu)
>  {
>  	struct rq *rq = cpu_rq(cpu);
> @@ -5916,8 +5921,10 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>  	unsigned long load, min_load = ULONG_MAX;
>  	unsigned int min_exit_latency = UINT_MAX;
>  	u64 latest_idle_timestamp = 0;
> +	unsigned int backup_cap = 0;
>  	int least_loaded_cpu = this_cpu;
>  	int shallowest_idle_cpu = -1;
> +	int shallowest_idle_cpu_backup = -1;
>  	int i;
>  
>  	/* Check if we have any choice: */
> @@ -5937,7 +5944,12 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>  				 */
>  				min_exit_latency = idle->exit_latency;
>  				latest_idle_timestamp = rq->idle_stamp;
> -				shallowest_idle_cpu = i;
> +				if (full_capacity(i)) {
> +					shallowest_idle_cpu = i;
> +				} else if (capacity_of(i) > backup_cap) {
> +					shallowest_idle_cpu_backup = i;
> +					backup_cap = capacity_of(i);
> +				}

I'm a bit skeptical about this - if the CPU is idle, then is it likely that the capacity of the CPU is reduced due to RT pressure? I can see that it can matter, but I am wondering if you have any data for your use case to show that it does. That is, if you didn't consider RT pressure for idle CPUs, are you still seeing a big enough performance improvement to warrant the change?

>  			} else if ((!idle || idle->exit_latency == min_exit_latency) &&
>  				   rq->idle_stamp > latest_idle_timestamp) {
>  				/*
> @@ -5946,7 +5958,12 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>  				 * a warmer cache.
>  				 */
>  				latest_idle_timestamp = rq->idle_stamp;
> -				shallowest_idle_cpu = i;
> +				if (full_capacity(i)) {
> +					shallowest_idle_cpu = i;
> +				} else if (capacity_of(i) > backup_cap) {
> +					shallowest_idle_cpu_backup = i;
> +					backup_cap = capacity_of(i);
> +				}
>  			}
>  		} else if (shallowest_idle_cpu == -1) {
>  			load = weighted_cpuload(cpu_rq(i));
> @@ -5957,7 +5974,11 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>  		}
>  	}
>  
> -	return shallowest_idle_cpu != -1 ? shallowest_idle_cpu : least_loaded_cpu;
> +	if (shallowest_idle_cpu != -1)
> +		return shallowest_idle_cpu;
> +
> +	return (shallowest_idle_cpu_backup != -1 ?
> +		shallowest_idle_cpu_backup : least_loaded_cpu);
>  }
>  
>  #ifdef CONFIG_SCHED_SMT
> 

I see code duplication here which can be reduced by 7 lines compared to your original patch:

---
 kernel/sched/fair.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c95880e216f6..72fc8d18b251 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5528,6 +5528,7 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
 	/* Traverse only the allowed CPUs */
 	for_each_cpu_and(i, sched_group_span(group), &p->cpus_allowed) {
 		if (idle_cpu(i)) {
+			int idle_candidate = -1;
 			struct rq *rq = cpu_rq(i);
 			struct cpuidle_state *idle = idle_get_state(rq);
 			if (idle && idle->exit_latency < min_exit_latency) {
@@ -5538,7 +5539,7 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
 				 */
 				min_exit_latency = idle->exit_latency;
 				latest_idle_timestamp = rq->idle_stamp;
-				shallowest_idle_cpu = i;
+				idle_candidate = i;
 			} else if ((!idle || idle->exit_latency == min_exit_latency) &&
 				   rq->idle_stamp > latest_idle_timestamp) {
 				/*
@@ -5547,7 +5548,16 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
 				 * a warmer cache.
 				 */
 				latest_idle_timestamp = rq->idle_stamp;
-				shallowest_idle_cpu = i;
+				idle_candidate = i;
+			}
+
+			if (idle_candidate != -1) {
+				if (full_capacity(idle_candidate)) {
+					shallowest_idle_cpu = idle_candidate;
+				} else if (capacity_of(idle_candidate) > backup_cap) {
+					shallowest_idle_cpu_backup = idle_candidate;
+					backup_cap = capacity_of(idle_candidate);
+				}
 			}
 		} else if (shallowest_idle_cpu == -1) {
 			load = weighted_cpuload(i);
@@ -5558,7 +5568,11 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
 		}
 	}
 
-	return shallowest_idle_cpu != -1 ? shallowest_idle_cpu : least_loaded_cpu;
+	if (shallowest_idle_cpu != -1)
+		return shallowest_idle_cpu;
+
+	return (shallowest_idle_cpu_backup != -1 ?
+			shallowest_idle_cpu_backup : least_loaded_cpu);
 }
 
 #ifdef CONFIG_SCHED_SMT
-- 
2.14.1.821.g8fa685d3b7-goog
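
For readers who want to trace the control flow of the restructured loop
outside the kernel, here is a rough userspace model. Everything in it is a
stand-in: struct cpu_model, its fields and the model_* functions are
hypothetical. The NULL idle-state case and the this_cpu tie-break in the load
comparison are left out for brevity. The threshold uses the 819 factor from
the patch under review.

```c
#include <assert.h>
#include <limits.h>

struct cpu_model {			/* hypothetical stand-in for per-CPU state */
	int idle;			/* non-zero if the CPU is idle */
	unsigned int exit_latency;	/* exit latency of its idle state */
	unsigned long long idle_stamp;	/* when it went idle */
	unsigned long cap;		/* models capacity_of() */
	unsigned long cap_orig;		/* models capacity_orig_of() */
	unsigned long load;		/* models weighted_cpuload() */
};

static int model_full_capacity(const struct cpu_model *c)
{
	return c->cap >= (c->cap_orig * 819 >> 10);
}

static int model_find_idlest_cpu(const struct cpu_model *cpus, int n,
				 int this_cpu)
{
	unsigned long min_load = ULONG_MAX, backup_cap = 0;
	unsigned int min_exit_latency = UINT_MAX;
	unsigned long long latest_idle_timestamp = 0;
	int least_loaded_cpu = this_cpu;
	int shallowest_idle_cpu = -1, shallowest_idle_cpu_backup = -1;
	int i;

	for (i = 0; i < n; i++) {
		const struct cpu_model *c = &cpus[i];

		if (c->idle) {
			int idle_candidate = -1;

			if (c->exit_latency < min_exit_latency) {
				/* Shallower idle state: cheaper to wake. */
				min_exit_latency = c->exit_latency;
				latest_idle_timestamp = c->idle_stamp;
				idle_candidate = i;
			} else if (c->exit_latency == min_exit_latency &&
				   c->idle_stamp > latest_idle_timestamp) {
				/* Same state, went idle more recently. */
				latest_idle_timestamp = c->idle_stamp;
				idle_candidate = i;
			}

			/* Single place where the capacity check happens. */
			if (idle_candidate != -1) {
				if (model_full_capacity(c)) {
					shallowest_idle_cpu = idle_candidate;
				} else if (c->cap > backup_cap) {
					shallowest_idle_cpu_backup = idle_candidate;
					backup_cap = c->cap;
				}
			}
		} else if (shallowest_idle_cpu == -1) {
			if (c->load < min_load) {
				min_load = c->load;
				least_loaded_cpu = i;
			}
		}
	}

	if (shallowest_idle_cpu != -1)
		return shallowest_idle_cpu;

	return shallowest_idle_cpu_backup != -1 ?
		shallowest_idle_cpu_backup : least_loaded_cpu;
}
```

In this model, with two idle CPUs at capacities 500 and 1000 out of 1024, the
second one is returned. If both idle CPUs are below the threshold (say 500 and
600), the one with the higher capacity is returned as the backup. With no idle
CPU at all, the least loaded CPU is returned.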

* [PATCH 1/3] sched/fair: Introduce scaled capacity awareness in find_idlest_cpu code path
  2017-09-26  0:02 [PATCH v4 0/3] sched/fair: Introduce scaled capacity awareness in enqueue Rohit Jain
@ 2017-09-26  0:02 ` Rohit Jain
  2017-09-26  2:51   ` joelaf
  0 siblings, 1 reply; 9+ messages in thread
From: Rohit Jain @ 2017-09-26  0:02 UTC (permalink / raw)
  To: linux-kernel, eas-dev
  Cc: peterz, mingo, joelaf, atish.patra, vincent.guittot,
	dietmar.eggemann, morten.rasmussen

While looking for idle CPUs for a waking task, we should also account
for the delays caused due to the bandwidth reduction by RT/IRQ tasks.

This patch does that by trying to find a higher capacity CPU with
minimum wake up latency.


Signed-off-by: Rohit Jain <rohit.k.jain@oracle.com>
---
 kernel/sched/fair.c | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index eca6a57..afb701f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5590,6 +5590,11 @@ static unsigned long capacity_orig_of(int cpu)
 	return cpu_rq(cpu)->cpu_capacity_orig;
 }
 
+static inline bool full_capacity(int cpu)
+{
+	return (capacity_of(cpu) >= (capacity_orig_of(cpu)*819 >> 10));
+}
+
 static unsigned long cpu_avg_load_per_task(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
@@ -5916,8 +5921,10 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
 	unsigned long load, min_load = ULONG_MAX;
 	unsigned int min_exit_latency = UINT_MAX;
 	u64 latest_idle_timestamp = 0;
+	unsigned int backup_cap = 0;
 	int least_loaded_cpu = this_cpu;
 	int shallowest_idle_cpu = -1;
+	int shallowest_idle_cpu_backup = -1;
 	int i;
 
 	/* Check if we have any choice: */
@@ -5937,7 +5944,12 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
 				 */
 				min_exit_latency = idle->exit_latency;
 				latest_idle_timestamp = rq->idle_stamp;
-				shallowest_idle_cpu = i;
+				if (full_capacity(i)) {
+					shallowest_idle_cpu = i;
+				} else if (capacity_of(i) > backup_cap) {
+					shallowest_idle_cpu_backup = i;
+					backup_cap = capacity_of(i);
+				}
 			} else if ((!idle || idle->exit_latency == min_exit_latency) &&
 				   rq->idle_stamp > latest_idle_timestamp) {
 				/*
@@ -5946,7 +5958,12 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
 				 * a warmer cache.
 				 */
 				latest_idle_timestamp = rq->idle_stamp;
-				shallowest_idle_cpu = i;
+				if (full_capacity(i)) {
+					shallowest_idle_cpu = i;
+				} else if (capacity_of(i) > backup_cap) {
+					shallowest_idle_cpu_backup = i;
+					backup_cap = capacity_of(i);
+				}
 			}
 		} else if (shallowest_idle_cpu == -1) {
 			load = weighted_cpuload(cpu_rq(i));
@@ -5957,7 +5974,11 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
 		}
 	}
 
-	return shallowest_idle_cpu != -1 ? shallowest_idle_cpu : least_loaded_cpu;
+	if (shallowest_idle_cpu != -1)
+		return shallowest_idle_cpu;
+
+	return (shallowest_idle_cpu_backup != -1 ?
+		shallowest_idle_cpu_backup : least_loaded_cpu);
 }
 
 #ifdef CONFIG_SCHED_SMT
-- 
2.7.4

end of thread, other threads:[~2017-10-13  1:54 UTC | newest]

Thread overview: 9+ messages
2017-10-10 15:59 [PATCH 1/3] sched/fair: Introduce scaled capacity awareness in find_idlest_cpu code path Atish Patra
  -- strict thread matches above, loose matches on Subject: below --
2017-10-07 23:48 [PATCH v5 0/3] sched/fair: Introduce scaled capacity awareness in enqueue Rohit Jain
2017-10-07 23:48 ` [PATCH 1/3] sched/fair: Introduce scaled capacity awareness in find_idlest_cpu code path Rohit Jain
2017-10-12 17:03   ` Rohit Jain
2017-10-12 21:47     ` Joel Fernandes
2017-10-13  1:54       ` Rohit Jain
2017-09-26  0:02 [PATCH v4 0/3] sched/fair: Introduce scaled capacity awareness in enqueue Rohit Jain
2017-09-26  0:02 ` [PATCH 1/3] sched/fair: Introduce scaled capacity awareness in find_idlest_cpu code path Rohit Jain
2017-09-26  2:51   ` joelaf
2017-09-26  4:40     ` Rohit Jain
2017-09-26  6:59       ` Joel Fernandes
