* [PATCH v2] sched/deadline: Make find_later_rq() choose a closer cpu in topology
@ 2017-03-21 7:52 Byungchul Park
2017-03-21 13:28 ` Daniel Bristot de Oliveira
2017-03-22 12:37 ` Peter Zijlstra
0 siblings, 2 replies; 5+ messages in thread
From: Byungchul Park @ 2017-03-21 7:52 UTC
To: peterz, mingo; +Cc: linux-kernel, juri.lelli, rostedt, kernel-team
When cpudl_find() returns an arbitrary cpu from free_cpus, that cpu might
not be the closest one available in terms of sched domains. For example:

   this_cpu: 15
   free_cpus: 0, 1,..., 14 (== later_mask)
   best_cpu: 0

   topology:

   0 --+
       +--+
   1 --+  |
          +-- ... --+
   2 --+  |         |
       +--+         |
   3 --+            |

   ...             ...

   12 --+           |
        +--+        |
   13 --+  |        |
           +-- ... -+
   14 --+  |
        +--+
   15 --+

In this case, it would be best to select 14 since it is a free cpu and
the closest one to 15 (this_cpu). However, the code currently selects
0 (best_cpu) even though that is just an arbitrary cpu among free_cpus.
Fix it.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
kernel/sched/deadline.c | 29 +++++++++++++++--------------
1 file changed, 15 insertions(+), 14 deletions(-)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index a2ce590..49c93b9 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1324,7 +1324,7 @@ static int find_later_rq(struct task_struct *task)
 	struct sched_domain *sd;
 	struct cpumask *later_mask = this_cpu_cpumask_var_ptr(local_cpu_mask_dl);
 	int this_cpu = smp_processor_id();
-	int best_cpu, cpu = task_cpu(task);
+	int cpu = task_cpu(task);
 
 	/* Make sure the mask is initialized first */
 	if (unlikely(!later_mask))
@@ -1337,17 +1337,14 @@ static int find_later_rq(struct task_struct *task)
 	 * We have to consider system topology and task affinity
 	 * first, then we can look for a suitable cpu.
 	 */
-	best_cpu = cpudl_find(&task_rq(task)->rd->cpudl,
-			task, later_mask);
-	if (best_cpu == -1)
+	if (cpudl_find(&task_rq(task)->rd->cpudl, task, later_mask) == -1)
 		return -1;
 
 	/*
-	 * If we are here, some target has been found,
-	 * the most suitable of which is cached in best_cpu.
-	 * This is, among the runqueues where the current tasks
-	 * have later deadlines than the task's one, the rq
-	 * with the latest possible one.
+	 * If we are here, some targets have been found, including
+	 * the most suitable which is, among the runqueues where the
+	 * current tasks have later deadlines than the task's one, the
+	 * rq with the latest possible one.
 	 *
 	 * Now we check how well this matches with task's
 	 * affinity and system topology.
@@ -1367,6 +1364,7 @@ static int find_later_rq(struct task_struct *task)
 	rcu_read_lock();
 	for_each_domain(cpu, sd) {
 		if (sd->flags & SD_WAKE_AFFINE) {
+			int closest_cpu;
 
 			/*
 			 * If possible, preempting this_cpu is
@@ -1378,14 +1376,17 @@ static int find_later_rq(struct task_struct *task)
 				return this_cpu;
 			}
 
+			closest_cpu = cpumask_first_and(later_mask,
+							sched_domain_span(sd));
 			/*
-			 * Last chance: if best_cpu is valid and is
-			 * in the mask, that becomes our choice.
+			 * Last chance: if a cpu being in both later_mask
+			 * and current sd span is valid, that becomes our
+			 * choice. Of course, the latest possible cpu is
+			 * already under consideration through later_mask.
 			 */
-			if (best_cpu < nr_cpu_ids &&
-			    cpumask_test_cpu(best_cpu, sched_domain_span(sd))) {
+			if (closest_cpu < nr_cpu_ids) {
 				rcu_read_unlock();
-				return best_cpu;
+				return closest_cpu;
 			}
 		}
 	}
--
1.9.1
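
For readers following the thread, here is a minimal userspace sketch of the behaviour change described in the changelog. It is not kernel code: cpumasks are modelled as 16-bit masks, and `struct toy_domain`, `toy_first_and()`, `pick_old()`, `pick_new()` and `cpu15_domains` are hypothetical names made up for illustration. It assumes the binary-tree topology from the example, walks the domains of CPU 15 only, and leaves out the SD_WAKE_AFFINE check and the this_cpu shortcut that the real find_later_rq() performs.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 16

/* Hypothetical stand-in for a sched domain: only its span, as a bitmask. */
struct toy_domain {
	uint32_t span;
};

/* Lowest CPU set in both masks, or NR_CPUS if the intersection is empty. */
static int toy_first_and(uint32_t a, uint32_t b)
{
	uint32_t both = a & b;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (both & (UINT32_C(1) << cpu))
			return cpu;
	return NR_CPUS;
}

/* Old logic: only best_cpu is ever considered, whatever level matches. */
static int pick_old(const struct toy_domain *sd, int levels, int best_cpu)
{
	assert(best_cpu >= 0 && best_cpu < NR_CPUS);

	for (int i = 0; i < levels; i++)
		if (sd[i].span & (UINT32_C(1) << best_cpu))
			return best_cpu;
	return -1;
}

/* Patched logic: take any later_mask CPU in the smallest matching span. */
static int pick_new(const struct toy_domain *sd, int levels,
		    uint32_t later_mask)
{
	for (int i = 0; i < levels; i++) {
		int cpu = toy_first_and(later_mask, sd[i].span);

		if (cpu < NR_CPUS)
			return cpu;
	}
	return -1;
}

/* Domains of CPU 15 in the example topology, innermost first. */
static const struct toy_domain cpu15_domains[] = {
	{ 0xC000u },	/* CPUs 14-15 */
	{ 0xF000u },	/* CPUs 12-15 */
	{ 0xFF00u },	/* CPUs  8-15 */
	{ 0xFFFFu },	/* CPUs  0-15 */
};
```

With later_mask covering CPUs 0-14 (0x7FFF) and best_cpu == 0, pick_old() returns 0 while pick_new() returns 14, which matches the changelog's example.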
* Re: [PATCH v2] sched/deadline: Make find_later_rq() choose a closer cpu in topology
2017-03-21 7:52 [PATCH v2] sched/deadline: Make find_later_rq() choose a closer cpu in topology Byungchul Park
@ 2017-03-21 13:28 ` Daniel Bristot de Oliveira
2017-03-22 2:46 ` Byungchul Park
2017-03-22 12:37 ` Peter Zijlstra
1 sibling, 1 reply; 5+ messages in thread
From: Daniel Bristot de Oliveira @ 2017-03-21 13:28 UTC
To: Byungchul Park, peterz, mingo
Cc: linux-kernel, juri.lelli, rostedt, kernel-team
On 03/21/2017 08:52 AM, Byungchul Park wrote:
> When cpudl_find() returns any among free_cpus, the cpu might not be
> closer than others, considering sched domain. For example:
>
> this_cpu: 15
> free_cpus: 0, 1,..., 14 (== later_mask)
> best_cpu: 0
>
> topology:
>
> 0 --+
> +--+
> 1 --+ |
> +-- ... --+
> 2 --+ | |
> +--+ |
> 3 --+ |
>
> ... ...
>
> 12 --+ |
> +--+ |
> 13 --+ | |
> +-- ... -+
> 14 --+ |
> +--+
> 15 --+
>
> In this case, it would be best to select 14 since it's a free cpu and
> closest to 15(this_cpu). However, currently the code select 0(best_cpu)
> even though that's just any among free_cpus. Fix it.
That is a nice patch! But I wonder what would be the behavior with your
patch in the following hw:
# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 16159 MB
node 0 free: 15308 MB
node 1 cpus: 1 3 5 7 9 11 13 15
node 1 size: 16384 MB
node 1 free: 15028 MB
node distances:
node 0 1
0: 10 21
1: 21 10
-- Daniel
* Re: [PATCH v2] sched/deadline: Make find_later_rq() choose a closer cpu in topology
2017-03-21 13:28 ` Daniel Bristot de Oliveira
@ 2017-03-22 2:46 ` Byungchul Park
0 siblings, 0 replies; 5+ messages in thread
From: Byungchul Park @ 2017-03-22 2:46 UTC
To: Daniel Bristot de Oliveira
Cc: peterz, mingo, linux-kernel, juri.lelli, rostedt, kernel-team
On Tue, Mar 21, 2017 at 02:28:50PM +0100, Daniel Bristot de Oliveira wrote:
> On 03/21/2017 08:52 AM, Byungchul Park wrote:
> > When cpudl_find() returns any among free_cpus, the cpu might not be
> > closer than others, considering sched domain. For example:
> >
> > this_cpu: 15
> > free_cpus: 0, 1,..., 14 (== later_mask)
> > best_cpu: 0
> >
> > topology:
> >
> > 0 --+
> > +--+
> > 1 --+ |
> > +-- ... --+
> > 2 --+ | |
> > +--+ |
> > 3 --+ |
> >
> > ... ...
> >
> > 12 --+ |
> > +--+ |
> > 13 --+ | |
> > +-- ... -+
> > 14 --+ |
> > +--+
> > 15 --+
> >
> > In this case, it would be best to select 14 since it's a free cpu and
> > closest to 15(this_cpu). However, currently the code select 0(best_cpu)
> > even though that's just any among free_cpus. Fix it.
>
> That is a nice patch! But I wonder what would be the behavior with your
> patch in the following hw:
>
> # numactl --hardware
> available: 2 nodes (0-1)
> node 0 cpus: 0 2 4 6 8 10 12 14
> node 0 size: 16159 MB
> node 0 free: 15308 MB
> node 1 cpus: 1 3 5 7 9 11 13 15
> node 1 size: 16384 MB
> node 1 free: 15028 MB
> node distances:
> node 0 1
> 0: 10 21
> 1: 21 10
Hi,
In this case, I guess the topology looks like:

    0 --+
        +--+
    2 --+  |
           +-- ... --+
    4 --+  |         |
        +--+         |
    6 --+            |

    ...             ...

    9 --+            |
        +--+         |
   11 --+  |         |
           +-- ... -+
   13 --+  |
        +--+
   15 --+

The sched domains would reflect that as well, so the dl push should work
correctly. Am I missing something?

In addition, IMHO, this would not be an issue with the dl push itself but
with how the sched domains are built. Am I wrong?

Thanks,
Byungchul
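
A small userspace sketch of the NUMA case discussed above, under stated assumptions: the machine is modelled with only two domain levels for CPU 15 (its NUMA node, then the whole system), because the numactl output does not show any SMT or core-level domains. The names `numa_first_and()`, `numa_pick_closest()`, `numa_node_of()` and `cpu15_spans` are hypothetical, and the spans are derived by hand from the CPU lists quoted above rather than read from a real kernel.

```c
#include <assert.h>
#include <stdint.h>

#define NUMA_NR_CPUS 16

/* Lowest CPU set in both masks, or NUMA_NR_CPUS if there is none. */
static int numa_first_and(uint32_t a, uint32_t b)
{
	uint32_t both = a & b;

	for (int cpu = 0; cpu < NUMA_NR_CPUS; cpu++)
		if (both & (UINT32_C(1) << cpu))
			return cpu;
	return NUMA_NR_CPUS;
}

/* Walk the spans innermost first and take the first later_mask CPU found. */
static int numa_pick_closest(const uint32_t *spans, int levels,
			     uint32_t later_mask)
{
	assert(levels > 0);

	for (int i = 0; i < levels; i++) {
		int cpu = numa_first_and(later_mask, spans[i]);

		if (cpu < NUMA_NR_CPUS)
			return cpu;
	}
	return -1;
}

/* On the quoted machine, even CPUs sit on node 0 and odd CPUs on node 1. */
static int numa_node_of(int cpu)
{
	return cpu & 1;
}

/* Assumed domains of CPU 15: its own node first, then every CPU. */
static const uint32_t cpu15_spans[] = {
	0xAAAAu,	/* node 1: CPUs 1 3 5 7 9 11 13 15 */
	0xFFFFu,	/* both nodes */
};
```

With later_mask covering CPUs 0-14 (0x7FFF), the sketch picks CPU 1: not numerically adjacent to 15, but on the same node, because the choice is driven by the domain span and not by the CPU number. If only node 0 CPUs are in later_mask (0x5555), the walk falls through to the system-wide level and picks CPU 0.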
* Re: [PATCH v2] sched/deadline: Make find_later_rq() choose a closer cpu in topology
2017-03-21 7:52 [PATCH v2] sched/deadline: Make find_later_rq() choose a closer cpu in topology Byungchul Park
2017-03-21 13:28 ` Daniel Bristot de Oliveira
@ 2017-03-22 12:37 ` Peter Zijlstra
2017-03-22 23:21 ` Byungchul Park
1 sibling, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2017-03-22 12:37 UTC
To: Byungchul Park; +Cc: mingo, linux-kernel, juri.lelli, rostedt, kernel-team
On Tue, Mar 21, 2017 at 04:52:24PM +0900, Byungchul Park wrote:
> When cpudl_find() returns any among free_cpus, the cpu might not be
> closer than others, considering sched domain. For example:
>
> this_cpu: 15
> free_cpus: 0, 1,..., 14 (== later_mask)
> best_cpu: 0
>
> topology:
>
> 0 --+
> +--+
> 1 --+ |
> +-- ... --+
> 2 --+ | |
> +--+ |
> 3 --+ |
>
> ... ...
>
> 12 --+ |
> +--+ |
> 13 --+ | |
> +-- ... -+
> 14 --+ |
> +--+
> 15 --+
>
> In this case, it would be best to select 14 since it's a free cpu and
> closest to 15(this_cpu). However, currently the code select 0(best_cpu)
> even though that's just any among free_cpus. Fix it.
This would result in picking the HT sibling, if available. Which is
typically the worst possible pick.
If you add support for SD_PREFER_SIBLING, which denotes a preference for
any other sibling domain above this one, this might work.
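
One possible reading of this suggestion, as a userspace sketch only. It is not the follow-up patch and not kernel code. `TOY_PREFER_SIBLING` is a hypothetical flag modelled on SD_PREFER_SIBLING, and `struct ht_domain`, `ht_first()`, `ht_pick()` and the two domain tables are made-up names. The assumption is that a level carrying the flag should be skipped in favour of CPUs further out, with the HT sibling kept only as a fallback when nothing else is in later_mask.

```c
#include <assert.h>
#include <stdint.h>

#define HT_NR_CPUS 8
#define TOY_PREFER_SIBLING 0x1u	/* hypothetical, modelled on SD_PREFER_SIBLING */

/* Hypothetical stand-in for a sched domain: span bitmask plus flags. */
struct ht_domain {
	uint32_t span;
	unsigned int flags;
};

/* Lowest CPU set in mask, or HT_NR_CPUS if the mask is empty. */
static int ht_first(uint32_t mask)
{
	for (int cpu = 0; cpu < HT_NR_CPUS; cpu++)
		if (mask & (UINT32_C(1) << cpu))
			return cpu;
	return HT_NR_CPUS;
}

static int ht_pick(const struct ht_domain *sd, int levels, uint32_t later_mask)
{
	uint32_t avoid = 0;	/* spans of flagged levels seen so far */
	int fallback = -1;	/* best candidate inside a flagged level */

	assert(levels > 0);

	for (int i = 0; i < levels; i++) {
		int cpu;

		if (sd[i].flags & TOY_PREFER_SIBLING) {
			/* Remember a candidate, but keep looking further out. */
			cpu = ht_first(later_mask & sd[i].span);
			if (fallback < 0 && cpu < HT_NR_CPUS)
				fallback = cpu;
			avoid |= sd[i].span;
			continue;
		}

		cpu = ht_first(later_mask & sd[i].span & ~avoid);
		if (cpu < HT_NR_CPUS)
			return cpu;
	}
	return fallback;
}

/* Assumed domains of CPU 7: HT pair 6-7, cores 4-7, then all CPUs. */
static const struct ht_domain cpu7_plain[] = {
	{ 0xC0u, 0 }, { 0xF0u, 0 }, { 0xFFu, 0 },
};
static const struct ht_domain cpu7_prefer[] = {
	{ 0xC0u, TOY_PREFER_SIBLING }, { 0xF0u, 0 }, { 0xFFu, 0 },
};
```

With later_mask = {0, 4, 6} (0x51), the unflagged table yields CPU 6, the HT sibling of 7, while the flagged table yields CPU 4. If CPU 6 is the only candidate (0x40), the flagged table still falls back to it.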
* Re: [PATCH v2] sched/deadline: Make find_later_rq() choose a closer cpu in topology
2017-03-22 12:37 ` Peter Zijlstra
@ 2017-03-22 23:21 ` Byungchul Park
0 siblings, 0 replies; 5+ messages in thread
From: Byungchul Park @ 2017-03-22 23:21 UTC
To: Peter Zijlstra; +Cc: mingo, linux-kernel, juri.lelli, rostedt, kernel-team
On Wed, Mar 22, 2017 at 01:37:28PM +0100, Peter Zijlstra wrote:
> On Tue, Mar 21, 2017 at 04:52:24PM +0900, Byungchul Park wrote:
> > When cpudl_find() returns any among free_cpus, the cpu might not be
> > closer than others, considering sched domain. For example:
> >
> > this_cpu: 15
> > free_cpus: 0, 1,..., 14 (== later_mask)
> > best_cpu: 0
> >
> > topology:
> >
> > 0 --+
> > +--+
> > 1 --+ |
> > +-- ... --+
> > 2 --+ | |
> > +--+ |
> > 3 --+ |
> >
> > ... ...
> >
> > 12 --+ |
> > +--+ |
> > 13 --+ | |
> > +-- ... -+
> > 14 --+ |
> > +--+
> > 15 --+
> >
> > In this case, it would be best to select 14 since it's a free cpu and
> > closest to 15(this_cpu). However, currently the code select 0(best_cpu)
> > even though that's just any among free_cpus. Fix it.
>
> This would result in picking the HT sibling, if available. Which is
> typically the worst possible pick.
>
> If you add support for SD_PREFER_SIBLING, which denotes a preference for
> any other sibling domain above this one, this might work.
Sure. I will add that support as well. Thank you.