* [RFC PATCH 1/4] Minimally invasive deadline wrap-around bugfix for SCHED_DEADLINE cpu heap
2016-07-19 9:44 SCHED_DEADLINE cpudeadline.{h,c} fixup Tommaso Cucinotta
@ 2016-07-19 9:44 ` Tommaso Cucinotta
2016-08-01 11:43 ` Juri Lelli
2016-08-10 18:00 ` [tip:sched/core] sched/deadline: Fix wrap-around in DL heap tip-bot for Tommaso Cucinotta
2016-07-19 9:44 ` [RFC PATCH 2/4] SCHED_DEADLINE cpu heap code clarification/refactory Tommaso Cucinotta
` (2 subsequent siblings)
3 siblings, 2 replies; 13+ messages in thread
From: Tommaso Cucinotta @ 2016-07-19 9:44 UTC (permalink / raw)
To: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar
Cc: linux-kernel, linux-dl, Tommaso Cucinotta, Juri Lelli
Current code in cpudeadline.c has a bug in re-heapifying when adding a
new element at the end of the heap, because a deadline value of 0 is
temporarily set in the new elem, then cpudl_change_key() is called
with the actual elem deadline as param. However, the function compares
the new deadline to set with the one previously in the elem, which is
0. So, if current absolute deadlines grew so much to have negative
values as s64, the comparison in cpudl_change_key() makes the wrong
decision. Instead, as from dl_time_before(), the kernel should handle
correctly abs deadlines wrap-arounds.
This patch fixes the problem with a minimally invasive change that
forces cpudl_change_key() to heapify up in this case.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Luca Abeni <luca.abeni@unitn.it>
Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
---
kernel/sched/cpudeadline.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index 5be5882..d418449 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -168,7 +168,7 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
if (old_idx == IDX_INVALID) {
cp->size++;
- cp->elements[cp->size - 1].dl = 0;
+ cp->elements[cp->size - 1].dl = dl;
cp->elements[cp->size - 1].cpu = cpu;
cp->elements[cpu].idx = cp->size - 1;
cpudl_change_key(cp, cp->size - 1, dl);
--
2.7.4
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 1/4] Minimally invasive deadline wrap-around bugfix for SCHED_DEADLINE cpu heap
2016-07-19 9:44 ` [RFC PATCH 1/4] Minimally invasive deadline wrap-around bugfix for SCHED_DEADLINE cpu heap Tommaso Cucinotta
@ 2016-08-01 11:43 ` Juri Lelli
2016-08-01 12:57 ` Juri Lelli
2016-08-10 18:00 ` [tip:sched/core] sched/deadline: Fix wrap-around in DL heap tip-bot for Tommaso Cucinotta
1 sibling, 1 reply; 13+ messages in thread
From: Juri Lelli @ 2016-08-01 11:43 UTC (permalink / raw)
To: Tommaso Cucinotta
Cc: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar,
linux-kernel, linux-dl
Hi,
On 19/07/16 11:44, Tommaso Cucinotta wrote:
> Current code in cpudeadline.c has a bug in re-heapifying when adding a
> new element at the end of the heap, because a deadline value of 0 is
> temporarily set in the new elem, then cpudl_change_key() is called
> with the actual elem deadline as param. However, the function compares
> the new deadline to set with the one previously in the elem, which is
> 0. So, if current absolute deadlines grew so much to have negative
> values as s64, the comparison in cpudl_change_key() makes the wrong
> decision. Instead, as from dl_time_before(), the kernel should handle
> correctly abs deadlines wrap-arounds.
>
> This patch fixes the problem with a minimally invasive change that
> forces cpudl_change_key() to heapify up in this case.
>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Juri Lelli <juri.lelli@arm.com>
> Cc: Luca Abeni <luca.abeni@unitn.it>
> Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
> Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
> ---
> kernel/sched/cpudeadline.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
> index 5be5882..d418449 100644
> --- a/kernel/sched/cpudeadline.c
> +++ b/kernel/sched/cpudeadline.c
> @@ -168,7 +168,7 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
>
> if (old_idx == IDX_INVALID) {
> cp->size++;
> - cp->elements[cp->size - 1].dl = 0;
> + cp->elements[cp->size - 1].dl = dl;
> cp->elements[cp->size - 1].cpu = cpu;
> cp->elements[cpu].idx = cp->size - 1;
> cpudl_change_key(cp, cp->size - 1, dl);
Looks good. I'd only change the subject to something like:
sched/deadline: Wrap-around bugfix for cpu heap
Best,
- Juri
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 1/4] Minimally invasive deadline wrap-around bugfix for SCHED_DEADLINE cpu heap
2016-08-01 11:43 ` Juri Lelli
@ 2016-08-01 12:57 ` Juri Lelli
0 siblings, 0 replies; 13+ messages in thread
From: Juri Lelli @ 2016-08-01 12:57 UTC (permalink / raw)
To: Tommaso Cucinotta
Cc: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar, linux-kernel
On 01/08/16 12:43, Juri Lelli wrote:
> Hi,
>
> On 19/07/16 11:44, Tommaso Cucinotta wrote:
> > Current code in cpudeadline.c has a bug in re-heapifying when adding a
> > new element at the end of the heap, because a deadline value of 0 is
> > temporarily set in the new elem, then cpudl_change_key() is called
> > with the actual elem deadline as param. However, the function compares
> > the new deadline to set with the one previously in the elem, which is
> > 0. So, if current absolute deadlines grew so much to have negative
> > values as s64, the comparison in cpudl_change_key() makes the wrong
> > decision. Instead, as from dl_time_before(), the kernel should handle
> > correctly abs deadlines wrap-arounds.
> >
> > This patch fixes the problem with a minimally invasive change that
> > forces cpudl_change_key() to heapify up in this case.
> >
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Juri Lelli <juri.lelli@arm.com>
> > Cc: Luca Abeni <luca.abeni@unitn.it>
> > Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
> > Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
> > ---
> > kernel/sched/cpudeadline.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
> > index 5be5882..d418449 100644
> > --- a/kernel/sched/cpudeadline.c
> > +++ b/kernel/sched/cpudeadline.c
> > @@ -168,7 +168,7 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
> >
> > if (old_idx == IDX_INVALID) {
> > cp->size++;
> > - cp->elements[cp->size - 1].dl = 0;
> > + cp->elements[cp->size - 1].dl = dl;
> > cp->elements[cp->size - 1].cpu = cpu;
> > cp->elements[cpu].idx = cp->size - 1;
> > cpudl_change_key(cp, cp->size - 1, dl);
>
> Looks good. I'd only change the subject to something like:
>
> sched/deadline: Wrap-around bugfix for cpu heap
>
and please remove linux-dl mailing list from future postings. It's
subscribers only, and practically dead (AFAIK).
Thanks,
- Juri
^ permalink raw reply [flat|nested] 13+ messages in thread
* [tip:sched/core] sched/deadline: Fix wrap-around in DL heap
2016-07-19 9:44 ` [RFC PATCH 1/4] Minimally invasive deadline wrap-around bugfix for SCHED_DEADLINE cpu heap Tommaso Cucinotta
2016-08-01 11:43 ` Juri Lelli
@ 2016-08-10 18:00 ` tip-bot for Tommaso Cucinotta
1 sibling, 0 replies; 13+ messages in thread
From: tip-bot for Tommaso Cucinotta @ 2016-08-10 18:00 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, hpa, tommaso.cucinotta, peterz, torvalds,
luca.abeni, juri.lelli, mingo, juri.lelli, tglx
Commit-ID: a23eadfae2fd45536a355b785d5a1533e1955c22
Gitweb: http://git.kernel.org/tip/a23eadfae2fd45536a355b785d5a1533e1955c22
Author: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
AuthorDate: Tue, 19 Jul 2016 11:44:50 +0200
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 10 Aug 2016 13:32:55 +0200
sched/deadline: Fix wrap-around in DL heap
Current code in cpudeadline.c has a bug in re-heapifying when adding a
new element at the end of the heap, because a deadline value of 0 is
temporarily set in the new elem, then cpudl_change_key() is called
with the actual elem deadline as param.
However, the function compares the new deadline to set with the one
previously in the elem, which is 0. So, if current absolute deadlines
grew so much to have negative values as s64, the comparison in
cpudl_change_key() makes the wrong decision. Instead, as from
dl_time_before(), the kernel should handle correctly abs deadlines
wrap-arounds.
This patch fixes the problem with a minimally invasive change that
forces cpudl_change_key() to heapify up in this case.
Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Juri Lelli <juri.lelli@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1468921493-10054-2-git-send-email-tommaso.cucinotta@sssup.it
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
kernel/sched/cpudeadline.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index 5be5882..d418449 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -168,7 +168,7 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
if (old_idx == IDX_INVALID) {
cp->size++;
- cp->elements[cp->size - 1].dl = 0;
+ cp->elements[cp->size - 1].dl = dl;
cp->elements[cp->size - 1].cpu = cpu;
cp->elements[cpu].idx = cp->size - 1;
cpudl_change_key(cp, cp->size - 1, dl);
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [RFC PATCH 2/4] SCHED_DEADLINE cpu heap code clarification/refactory.
2016-07-19 9:44 SCHED_DEADLINE cpudeadline.{h,c} fixup Tommaso Cucinotta
2016-07-19 9:44 ` [RFC PATCH 1/4] Minimally invasive deadline wrap-around bugfix for SCHED_DEADLINE cpu heap Tommaso Cucinotta
@ 2016-07-19 9:44 ` Tommaso Cucinotta
2016-08-01 16:30 ` Juri Lelli
2016-07-19 9:44 ` [RFC PATCH 3/4] Make deadline max-heap faster avoiding real swaps on heapify ops Tommaso Cucinotta
2016-07-19 9:44 ` [RFC PATCH 4/4] Split cpudl_set() into cpudl_set() and cpudl_clear() Tommaso Cucinotta
3 siblings, 1 reply; 13+ messages in thread
From: Tommaso Cucinotta @ 2016-07-19 9:44 UTC (permalink / raw)
To: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar
Cc: linux-kernel, linux-dl, Tommaso Cucinotta, Juri Lelli
1. heapify up factored out in new dedicated function heapify_up()
(avoids repeatition of same code)
2. call to cpudl_change_key() replaced with heapify_up() when
cpudl_set actually inserts a new node in the heap
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Luca Abeni <luca.abeni@unitn.it>
Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
---
kernel/sched/cpudeadline.c | 38 +++++++++++++++++++-------------------
1 file changed, 19 insertions(+), 19 deletions(-)
diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index d418449..3c42702 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -41,7 +41,7 @@ static void cpudl_exchange(struct cpudl *cp, int a, int b)
swap(cp->elements[cpu_a].idx, cp->elements[cpu_b].idx);
}
-static void cpudl_heapify(struct cpudl *cp, int idx)
+static void cpudl_heapify_down(struct cpudl *cp, int idx)
{
int l, r, largest;
@@ -66,20 +66,25 @@ static void cpudl_heapify(struct cpudl *cp, int idx)
}
}
+static void cpudl_heapify_up(struct cpudl *cp, int idx)
+{
+ while (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl,
+ cp->elements[idx].dl)) {
+ cpudl_exchange(cp, idx, parent(idx));
+ idx = parent(idx);
+ }
+}
+
static void cpudl_change_key(struct cpudl *cp, int idx, u64 new_dl)
{
WARN_ON(idx == IDX_INVALID || !cpu_present(idx));
if (dl_time_before(new_dl, cp->elements[idx].dl)) {
cp->elements[idx].dl = new_dl;
- cpudl_heapify(cp, idx);
+ cpudl_heapify_down(cp, idx);
} else {
cp->elements[idx].dl = new_dl;
- while (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl,
- cp->elements[idx].dl)) {
- cpudl_exchange(cp, idx, parent(idx));
- idx = parent(idx);
- }
+ cpudl_heapify_up(cp, idx);
}
}
@@ -154,24 +159,19 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
cp->size--;
cp->elements[new_cpu].idx = old_idx;
cp->elements[cpu].idx = IDX_INVALID;
- while (old_idx > 0 && dl_time_before(
- cp->elements[parent(old_idx)].dl,
- cp->elements[old_idx].dl)) {
- cpudl_exchange(cp, old_idx, parent(old_idx));
- old_idx = parent(old_idx);
- }
+ cpudl_heapify_up(cp, old_idx);
cpumask_set_cpu(cpu, cp->free_cpus);
- cpudl_heapify(cp, old_idx);
+ cpudl_heapify_down(cp, old_idx);
goto out;
}
if (old_idx == IDX_INVALID) {
- cp->size++;
- cp->elements[cp->size - 1].dl = dl;
- cp->elements[cp->size - 1].cpu = cpu;
- cp->elements[cpu].idx = cp->size - 1;
- cpudl_change_key(cp, cp->size - 1, dl);
+ int size1 = cp->size++;
+ cp->elements[size1].dl = dl;
+ cp->elements[size1].cpu = cpu;
+ cp->elements[cpu].idx = size1;
+ cpudl_heapify_up(cp, size1);
cpumask_clear_cpu(cpu, cp->free_cpus);
} else {
cpudl_change_key(cp, old_idx, dl);
--
2.7.4
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 2/4] SCHED_DEADLINE cpu heap code clarification/refactory.
2016-07-19 9:44 ` [RFC PATCH 2/4] SCHED_DEADLINE cpu heap code clarification/refactory Tommaso Cucinotta
@ 2016-08-01 16:30 ` Juri Lelli
0 siblings, 0 replies; 13+ messages in thread
From: Juri Lelli @ 2016-08-01 16:30 UTC (permalink / raw)
To: Tommaso Cucinotta
Cc: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar, linux-kernel
Hi,
nice clean-up. Maybe change the subject to something like
"sched/deadline: refactor cpu heap code" ?
On 19/07/16 11:44, Tommaso Cucinotta wrote:
This change does two things:
> 1. heapify up factored out in new dedicated function heapify_up()
> (avoids repeatition of same code)
s/repeatition/repetition/
> 2. call to cpudl_change_key() replaced with heapify_up() when
> cpudl_set actually inserts a new node in the heap
>
Maybe we want a separate patch (we usually want 1 patch - 1 change) ?
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Juri Lelli <juri.lelli@arm.com>
> Cc: Luca Abeni <luca.abeni@unitn.it>
> Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
> Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
> ---
> kernel/sched/cpudeadline.c | 38 +++++++++++++++++++-------------------
> 1 file changed, 19 insertions(+), 19 deletions(-)
>
> diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
> index d418449..3c42702 100644
> --- a/kernel/sched/cpudeadline.c
> +++ b/kernel/sched/cpudeadline.c
> @@ -41,7 +41,7 @@ static void cpudl_exchange(struct cpudl *cp, int a, int b)
> swap(cp->elements[cpu_a].idx, cp->elements[cpu_b].idx);
> }
>
> -static void cpudl_heapify(struct cpudl *cp, int idx)
> +static void cpudl_heapify_down(struct cpudl *cp, int idx)
> {
> int l, r, largest;
>
> @@ -66,20 +66,25 @@ static void cpudl_heapify(struct cpudl *cp, int idx)
> }
> }
>
> +static void cpudl_heapify_up(struct cpudl *cp, int idx)
> +{
> + while (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl,
> + cp->elements[idx].dl)) {
> + cpudl_exchange(cp, idx, parent(idx));
> + idx = parent(idx);
> + }
> +}
> +
> static void cpudl_change_key(struct cpudl *cp, int idx, u64 new_dl)
> {
> WARN_ON(idx == IDX_INVALID || !cpu_present(idx));
>
> if (dl_time_before(new_dl, cp->elements[idx].dl)) {
> cp->elements[idx].dl = new_dl;
> - cpudl_heapify(cp, idx);
> + cpudl_heapify_down(cp, idx);
> } else {
> cp->elements[idx].dl = new_dl;
> - while (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl,
> - cp->elements[idx].dl)) {
> - cpudl_exchange(cp, idx, parent(idx));
> - idx = parent(idx);
> - }
> + cpudl_heapify_up(cp, idx);
> }
> }
>
> @@ -154,24 +159,19 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
> cp->size--;
> cp->elements[new_cpu].idx = old_idx;
> cp->elements[cpu].idx = IDX_INVALID;
> - while (old_idx > 0 && dl_time_before(
> - cp->elements[parent(old_idx)].dl,
> - cp->elements[old_idx].dl)) {
> - cpudl_exchange(cp, old_idx, parent(old_idx));
> - old_idx = parent(old_idx);
> - }
> + cpudl_heapify_up(cp, old_idx);
> cpumask_set_cpu(cpu, cp->free_cpus);
> - cpudl_heapify(cp, old_idx);
> + cpudl_heapify_down(cp, old_idx);
I think this line was already whitespace damaged. Could you fix it (with
a proper tab) in next version?
>
> goto out;
> }
>
> if (old_idx == IDX_INVALID) {
> - cp->size++;
> - cp->elements[cp->size - 1].dl = dl;
> - cp->elements[cp->size - 1].cpu = cpu;
> - cp->elements[cpu].idx = cp->size - 1;
> - cpudl_change_key(cp, cp->size - 1, dl);
> + int size1 = cp->size++;
s/size1/new_size/ ?
> + cp->elements[size1].dl = dl;
> + cp->elements[size1].cpu = cpu;
> + cp->elements[cpu].idx = size1;
> + cpudl_heapify_up(cp, size1);
> cpumask_clear_cpu(cpu, cp->free_cpus);
> } else {
> cpudl_change_key(cp, old_idx, dl);
We also seem to do almost the same ("cp->size - 1" mutliple times and
then cp->size--) up above, !is_valid branch. Maybe we want to clean
that up as well?
Thanks,
- Juri
^ permalink raw reply [flat|nested] 13+ messages in thread
* [RFC PATCH 3/4] Make deadline max-heap faster avoiding real swaps on heapify ops.
2016-07-19 9:44 SCHED_DEADLINE cpudeadline.{h,c} fixup Tommaso Cucinotta
2016-07-19 9:44 ` [RFC PATCH 1/4] Minimally invasive deadline wrap-around bugfix for SCHED_DEADLINE cpu heap Tommaso Cucinotta
2016-07-19 9:44 ` [RFC PATCH 2/4] SCHED_DEADLINE cpu heap code clarification/refactory Tommaso Cucinotta
@ 2016-07-19 9:44 ` Tommaso Cucinotta
2016-08-02 11:38 ` Juri Lelli
2016-08-08 14:04 ` Peter Zijlstra
2016-07-19 9:44 ` [RFC PATCH 4/4] Split cpudl_set() into cpudl_set() and cpudl_clear() Tommaso Cucinotta
3 siblings, 2 replies; 13+ messages in thread
From: Tommaso Cucinotta @ 2016-07-19 9:44 UTC (permalink / raw)
To: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar
Cc: linux-kernel, linux-dl, Tommaso Cucinotta, Juri Lelli
This change achieves up to 10% of speed-up for cpudl_set() calls, as
measured with a andomly generated workload of 1K,10K,100K random heap
insertions and deletions (75% cpudl_set() calls with is_valid=1 and
25% with is_valid=0), and randomly generated cpu IDs, with up to 256
CPUs, as measured on an Intel Core2 Duo.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Luca Abeni <luca.abeni@unitn.it>
Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
---
kernel/sched/cpudeadline.c | 114 +++++++++++++++++++++++++++------------------
1 file changed, 69 insertions(+), 45 deletions(-)
diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index 3c42702..60f933a 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -31,60 +31,82 @@ static inline int right_child(int i)
return (i << 1) + 2;
}
-static void cpudl_exchange(struct cpudl *cp, int a, int b)
-{
- int cpu_a = cp->elements[a].cpu, cpu_b = cp->elements[b].cpu;
-
- swap(cp->elements[a].cpu, cp->elements[b].cpu);
- swap(cp->elements[a].dl , cp->elements[b].dl );
-
- swap(cp->elements[cpu_a].idx, cp->elements[cpu_b].idx);
-}
-
static void cpudl_heapify_down(struct cpudl *cp, int idx)
{
int l, r, largest;
+ int orig_cpu = cp->elements[idx].cpu;
+ u64 orig_dl = cp->elements[idx].dl;
+
+ if (left_child(idx) >= cp->size)
+ return;
+
/* adapted from lib/prio_heap.c */
while(1) {
+ u64 largest_dl;
l = left_child(idx);
r = right_child(idx);
largest = idx;
+ largest_dl = orig_dl;
- if ((l < cp->size) && dl_time_before(cp->elements[idx].dl,
- cp->elements[l].dl))
+ if ((l < cp->size) && dl_time_before(orig_dl, cp->elements[l].dl)) {
largest = l;
- if ((r < cp->size) && dl_time_before(cp->elements[largest].dl,
- cp->elements[r].dl))
+ largest_dl = cp->elements[l].dl;
+ }
+ if ((r < cp->size) && dl_time_before(largest_dl, cp->elements[r].dl))
largest = r;
+
if (largest == idx)
break;
- /* Push idx down the heap one level and bump one up */
- cpudl_exchange(cp, largest, idx);
+ /* pull largest child onto idx */
+ cp->elements[idx].cpu = cp->elements[largest].cpu;
+ cp->elements[idx].dl = cp->elements[largest].dl;
+ cp->elements[cp->elements[idx].cpu].idx = idx;
idx = largest;
}
+ /* actual push down of saved original values orig_* */
+ cp->elements[idx].cpu = orig_cpu;
+ cp->elements[idx].dl = orig_dl;
+ cp->elements[cp->elements[idx].cpu].idx = idx;
}
static void cpudl_heapify_up(struct cpudl *cp, int idx)
{
- while (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl,
- cp->elements[idx].dl)) {
- cpudl_exchange(cp, idx, parent(idx));
- idx = parent(idx);
- }
+ int p;
+
+ int orig_cpu = cp->elements[idx].cpu;
+ u64 orig_dl = cp->elements[idx].dl;
+
+ if (idx == 0)
+ return;
+
+ do {
+ p = parent(idx);
+ if (dl_time_before(cp->elements[idx].dl, cp->elements[p].dl))
+ break;
+ /* pull parent onto idx */
+ cp->elements[idx].cpu = cp->elements[p].cpu;
+ cp->elements[idx].dl = cp->elements[p].dl;
+ cp->elements[cp->elements[idx].cpu].idx = idx;
+ idx = p;
+ } while (idx != 0);
+ /* actual push up of saved original values orig_* */
+ cp->elements[idx].cpu = orig_cpu;
+ cp->elements[idx].dl = orig_dl;
+ cp->elements[cp->elements[idx].cpu].idx = idx;
}
-static void cpudl_change_key(struct cpudl *cp, int idx, u64 new_dl)
+static void cpudl_heapify(struct cpudl *cp, int idx)
{
WARN_ON(idx == IDX_INVALID || !cpu_present(idx));
+ if (idx == IDX_INVALID)
+ return;
- if (dl_time_before(new_dl, cp->elements[idx].dl)) {
- cp->elements[idx].dl = new_dl;
- cpudl_heapify_down(cp, idx);
- } else {
- cp->elements[idx].dl = new_dl;
+ if (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl, cp->elements[idx].dl)) {
cpudl_heapify_up(cp, idx);
+ } else {
+ cpudl_heapify_down(cp, idx);
}
}
@@ -153,28 +175,30 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
*/
goto out;
}
- new_cpu = cp->elements[cp->size - 1].cpu;
- cp->elements[old_idx].dl = cp->elements[cp->size - 1].dl;
- cp->elements[old_idx].cpu = new_cpu;
cp->size--;
- cp->elements[new_cpu].idx = old_idx;
cp->elements[cpu].idx = IDX_INVALID;
- cpudl_heapify_up(cp, old_idx);
- cpumask_set_cpu(cpu, cp->free_cpus);
- cpudl_heapify_down(cp, old_idx);
-
- goto out;
- }
+ if (old_idx != cp->size) {
+ new_cpu = cp->elements[cp->size].cpu;
+ cp->elements[old_idx].dl = cp->elements[cp->size].dl;
+ cp->elements[old_idx].cpu = new_cpu;
+ cp->elements[new_cpu].idx = old_idx;
+ cpudl_heapify(cp, old_idx);
+ }
- if (old_idx == IDX_INVALID) {
- int size1 = cp->size++;
- cp->elements[size1].dl = dl;
- cp->elements[size1].cpu = cpu;
- cp->elements[cpu].idx = size1;
- cpudl_heapify_up(cp, size1);
- cpumask_clear_cpu(cpu, cp->free_cpus);
+ cpumask_set_cpu(cpu, cp->free_cpus);
} else {
- cpudl_change_key(cp, old_idx, dl);
+ if (old_idx == IDX_INVALID) {
+ int size1 = cp->size++;
+ cp->elements[size1].dl = dl;
+ cp->elements[size1].cpu = cpu;
+ cp->elements[cpu].idx = size1;
+ cpudl_heapify_up(cp, size1);
+
+ cpumask_clear_cpu(cpu, cp->free_cpus);
+ } else {
+ cp->elements[old_idx].dl = dl;
+ cpudl_heapify(cp, old_idx);
+ }
}
out:
--
2.7.4
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 3/4] Make deadline max-heap faster avoiding real swaps on heapify ops.
2016-07-19 9:44 ` [RFC PATCH 3/4] Make deadline max-heap faster avoiding real swaps on heapify ops Tommaso Cucinotta
@ 2016-08-02 11:38 ` Juri Lelli
2016-08-08 14:04 ` Peter Zijlstra
1 sibling, 0 replies; 13+ messages in thread
From: Juri Lelli @ 2016-08-02 11:38 UTC (permalink / raw)
To: Tommaso Cucinotta
Cc: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar, linux-kernel
Hi,
On 19/07/16 11:44, Tommaso Cucinotta wrote:
> This change achieves up to 10% of speed-up for cpudl_set() calls, as
> measured with a andomly generated workload of 1K,10K,100K random heap
> insertions and deletions (75% cpudl_set() calls with is_valid=1 and
> 25% with is_valid=0), and randomly generated cpu IDs, with up to 256
> CPUs, as measured on an Intel Core2 Duo.
>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Juri Lelli <juri.lelli@arm.com>
> Cc: Luca Abeni <luca.abeni@unitn.it>
> Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
> Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
> ---
> kernel/sched/cpudeadline.c | 114 +++++++++++++++++++++++++++------------------
> 1 file changed, 69 insertions(+), 45 deletions(-)
>
> diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
> index 3c42702..60f933a 100644
> --- a/kernel/sched/cpudeadline.c
> +++ b/kernel/sched/cpudeadline.c
> @@ -31,60 +31,82 @@ static inline int right_child(int i)
> return (i << 1) + 2;
> }
>
> -static void cpudl_exchange(struct cpudl *cp, int a, int b)
> -{
> - int cpu_a = cp->elements[a].cpu, cpu_b = cp->elements[b].cpu;
> -
> - swap(cp->elements[a].cpu, cp->elements[b].cpu);
> - swap(cp->elements[a].dl , cp->elements[b].dl );
> -
> - swap(cp->elements[cpu_a].idx, cp->elements[cpu_b].idx);
> -}
> -
> static void cpudl_heapify_down(struct cpudl *cp, int idx)
> {
> int l, r, largest;
>
> + int orig_cpu = cp->elements[idx].cpu;
> + u64 orig_dl = cp->elements[idx].dl;
> +
> + if (left_child(idx) >= cp->size)
> + return;
> +
> /* adapted from lib/prio_heap.c */
> while(1) {
> + u64 largest_dl;
> l = left_child(idx);
> r = right_child(idx);
> largest = idx;
> + largest_dl = orig_dl;
>
> - if ((l < cp->size) && dl_time_before(cp->elements[idx].dl,
> - cp->elements[l].dl))
> + if ((l < cp->size) && dl_time_before(orig_dl, cp->elements[l].dl)) {
> largest = l;
> - if ((r < cp->size) && dl_time_before(cp->elements[largest].dl,
> - cp->elements[r].dl))
> + largest_dl = cp->elements[l].dl;
OK, I was a bit puzzled by the usage of largest_dl in this loop, but I
seemed to convince myself that it is right. It's basically a local tmp
variable that you use to pick left or right child at each iteration, and
you reset it to the deadline of the node your are moving (orig_dl) after
it iteration.
> + }
> + if ((r < cp->size) && dl_time_before(largest_dl, cp->elements[r].dl))
Nitpick, this and the above conditions break 80 columns.
> largest = r;
> +
> if (largest == idx)
> break;
>
> - /* Push idx down the heap one level and bump one up */
> - cpudl_exchange(cp, largest, idx);
> + /* pull largest child onto idx */
> + cp->elements[idx].cpu = cp->elements[largest].cpu;
> + cp->elements[idx].dl = cp->elements[largest].dl;
> + cp->elements[cp->elements[idx].cpu].idx = idx;
> idx = largest;
> }
> + /* actual push down of saved original values orig_* */
> + cp->elements[idx].cpu = orig_cpu;
> + cp->elements[idx].dl = orig_dl;
> + cp->elements[cp->elements[idx].cpu].idx = idx;
> }
>
> static void cpudl_heapify_up(struct cpudl *cp, int idx)
> {
> - while (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl,
> - cp->elements[idx].dl)) {
> - cpudl_exchange(cp, idx, parent(idx));
> - idx = parent(idx);
> - }
> + int p;
> +
> + int orig_cpu = cp->elements[idx].cpu;
> + u64 orig_dl = cp->elements[idx].dl;
> +
> + if (idx == 0)
> + return;
> +
> + do {
> + p = parent(idx);
> + if (dl_time_before(cp->elements[idx].dl, cp->elements[p].dl))
> + break;
> + /* pull parent onto idx */
> + cp->elements[idx].cpu = cp->elements[p].cpu;
> + cp->elements[idx].dl = cp->elements[p].dl;
> + cp->elements[cp->elements[idx].cpu].idx = idx;
> + idx = p;
> + } while (idx != 0);
> + /* actual push up of saved original values orig_* */
> + cp->elements[idx].cpu = orig_cpu;
> + cp->elements[idx].dl = orig_dl;
> + cp->elements[cp->elements[idx].cpu].idx = idx;
> }
>
> -static void cpudl_change_key(struct cpudl *cp, int idx, u64 new_dl)
> +static void cpudl_heapify(struct cpudl *cp, int idx)
> {
> WARN_ON(idx == IDX_INVALID || !cpu_present(idx));
> + if (idx == IDX_INVALID)
> + return;
Can we actually get here with IDX_INVALID? We bail out in the !is_valid
branch and the other call point is an else branch of old_idx ==
IDX_INVALID. So, it seems that this is more a BUG_ON().
>
> - if (dl_time_before(new_dl, cp->elements[idx].dl)) {
> - cp->elements[idx].dl = new_dl;
> - cpudl_heapify_down(cp, idx);
> - } else {
> - cp->elements[idx].dl = new_dl;
> + if (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl, cp->elements[idx].dl)) {
You can wrap after the comma. And brackets are not required.
> cpudl_heapify_up(cp, idx);
> + } else {
> + cpudl_heapify_down(cp, idx);
> }
> }
>
> @@ -153,28 +175,30 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
> */
> goto out;
> }
> - new_cpu = cp->elements[cp->size - 1].cpu;
> - cp->elements[old_idx].dl = cp->elements[cp->size - 1].dl;
> - cp->elements[old_idx].cpu = new_cpu;
> cp->size--;
> - cp->elements[new_cpu].idx = old_idx;
> cp->elements[cpu].idx = IDX_INVALID;
> - cpudl_heapify_up(cp, old_idx);
> - cpumask_set_cpu(cpu, cp->free_cpus);
> - cpudl_heapify_down(cp, old_idx);
> -
> - goto out;
> - }
> + if (old_idx != cp->size) {
> + new_cpu = cp->elements[cp->size].cpu;
> + cp->elements[old_idx].dl = cp->elements[cp->size].dl;
> + cp->elements[old_idx].cpu = new_cpu;
> + cp->elements[new_cpu].idx = old_idx;
> + cpudl_heapify(cp, old_idx);
> + }
>
> - if (old_idx == IDX_INVALID) {
> - int size1 = cp->size++;
> - cp->elements[size1].dl = dl;
> - cp->elements[size1].cpu = cpu;
> - cp->elements[cpu].idx = size1;
> - cpudl_heapify_up(cp, size1);
> - cpumask_clear_cpu(cpu, cp->free_cpus);
> + cpumask_set_cpu(cpu, cp->free_cpus);
> } else {
> - cpudl_change_key(cp, old_idx, dl);
> + if (old_idx == IDX_INVALID) {
> + int size1 = cp->size++;
> + cp->elements[size1].dl = dl;
> + cp->elements[size1].cpu = cpu;
> + cp->elements[cpu].idx = size1;
> + cpudl_heapify_up(cp, size1);
> +
> + cpumask_clear_cpu(cpu, cp->free_cpus);
> + } else {
> + cp->elements[old_idx].dl = dl;
> + cpudl_heapify(cp, old_idx);
> + }
> }
>
> out:
> --
Couldn't spot any problem with this change (apart from the minor things
I pointed out above). Didn't test it extensively, though. But it seems
you did. :)
Best,
- Juri
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 3/4] Make deadline max-heap faster avoiding real swaps on heapify ops.
2016-07-19 9:44 ` [RFC PATCH 3/4] Make deadline max-heap faster avoiding real swaps on heapify ops Tommaso Cucinotta
2016-08-02 11:38 ` Juri Lelli
@ 2016-08-08 14:04 ` Peter Zijlstra
1 sibling, 0 replies; 13+ messages in thread
From: Peter Zijlstra @ 2016-08-08 14:04 UTC (permalink / raw)
To: Tommaso Cucinotta
Cc: Luca Abeni, Juri Lelli, Ingo Molnar, linux-kernel, linux-dl, Juri Lelli
On Tue, Jul 19, 2016 at 11:44:52AM +0200, Tommaso Cucinotta wrote:
> This change achieves up to 10% of speed-up for cpudl_set() calls, as
> measured with a andomly generated workload of 1K,10K,100K random heap
> insertions and deletions (75% cpudl_set() calls with is_valid=1 and
> 25% with is_valid=0), and randomly generated cpu IDs, with up to 256
> CPUs, as measured on an Intel Core2 Duo.
Changelog fails to explain _what_ the change is.
Some details on the change in implementation and why this is faster
would be good.
^ permalink raw reply [flat|nested] 13+ messages in thread
* [RFC PATCH 4/4] Split cpudl_set() into cpudl_set() and cpudl_clear().
2016-07-19 9:44 SCHED_DEADLINE cpudeadline.{h,c} fixup Tommaso Cucinotta
` (2 preceding siblings ...)
2016-07-19 9:44 ` [RFC PATCH 3/4] Make deadline max-heap faster avoiding real swaps on heapify ops Tommaso Cucinotta
@ 2016-07-19 9:44 ` Tommaso Cucinotta
2016-08-02 11:52 ` Juri Lelli
3 siblings, 1 reply; 13+ messages in thread
From: Tommaso Cucinotta @ 2016-07-19 9:44 UTC (permalink / raw)
To: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar
Cc: linux-kernel, linux-dl, Tommaso Cucinotta, Juri Lelli
These 2 exercise independent code paths and need different arguments.
Now you call
cpudl_clear(cp, cpu)
cpudl_set(cp, cpu, dl)
instead of
cpudl_set(cp, cpu, 0 /* dl */, 0 /* is_valid */)
cpudl_set(cp, cpu, dl, 1 /* is_valid */)
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Luca Abeni <luca.abeni@unitn.it>
Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
---
kernel/sched/cpudeadline.c | 71 +++++++++++++++++++++++++++++-----------------
kernel/sched/cpudeadline.h | 3 +-
kernel/sched/deadline.c | 10 +++----
3 files changed, 52 insertions(+), 32 deletions(-)
diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index 60f933a..0f276bf 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -147,16 +147,15 @@ out:
}
/*
- * cpudl_set - update the cpudl max-heap
+ * cpudl_clear - remove a cpu from the cpudl max-heap
* @cp: the cpudl max-heap context
* @cpu: the target cpu
- * @dl: the new earliest deadline for this cpu
*
* Notes: assumes cpu_rq(cpu)->lock is locked
*
* Returns: (void)
*/
-void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
+void cpudl_clear(struct cpudl *cp, int cpu)
{
int old_idx, new_cpu;
unsigned long flags;
@@ -164,17 +163,16 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
WARN_ON(!cpu_present(cpu));
raw_spin_lock_irqsave(&cp->lock, flags);
+
old_idx = cp->elements[cpu].idx;
- if (!is_valid) {
+ if (old_idx == IDX_INVALID) {
+ /*
+ * Nothing to remove if old_idx was invalid.
+ * This could happen if a rq_offline_dl is
+ * called for a CPU without -dl tasks running.
+ */
+ } else {
/* remove item */
- if (old_idx == IDX_INVALID) {
- /*
- * Nothing to remove if old_idx was invalid.
- * This could happen if a rq_offline_dl is
- * called for a CPU without -dl tasks running.
- */
- goto out;
- }
cp->size--;
cp->elements[cpu].idx = IDX_INVALID;
if (old_idx != cp->size) {
@@ -184,24 +182,45 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
cp->elements[new_cpu].idx = old_idx;
cpudl_heapify(cp, old_idx);
}
-
cpumask_set_cpu(cpu, cp->free_cpus);
+ }
+
+ raw_spin_unlock_irqrestore(&cp->lock, flags);
+}
+
+/*
+ * cpudl_set - update the cpudl max-heap
+ * @cp: the cpudl max-heap context
+ * @cpu: the target cpu
+ * @dl: the new earliest deadline for this cpu
+ *
+ * Notes: assumes cpu_rq(cpu)->lock is locked
+ *
+ * Returns: (void)
+ */
+void cpudl_set(struct cpudl *cp, int cpu, u64 dl)
+{
+ int old_idx;
+ unsigned long flags;
+
+ WARN_ON(!cpu_present(cpu));
+
+ raw_spin_lock_irqsave(&cp->lock, flags);
+
+ old_idx = cp->elements[cpu].idx;
+ if (old_idx == IDX_INVALID) {
+ int sz1 = cp->size++;
+ cp->elements[sz1].dl = dl;
+ cp->elements[sz1].cpu = cpu;
+ cp->elements[cpu].idx = sz1;
+ cpudl_heapify_up(cp, sz1);
+
+ cpumask_clear_cpu(cpu, cp->free_cpus);
} else {
- if (old_idx == IDX_INVALID) {
- int size1 = cp->size++;
- cp->elements[size1].dl = dl;
- cp->elements[size1].cpu = cpu;
- cp->elements[cpu].idx = size1;
- cpudl_heapify_up(cp, size1);
-
- cpumask_clear_cpu(cpu, cp->free_cpus);
- } else {
- cp->elements[old_idx].dl = dl;
- cpudl_heapify(cp, old_idx);
- }
+ cp->elements[old_idx].dl = dl;
+ cpudl_heapify(cp, old_idx);
}
-out:
raw_spin_unlock_irqrestore(&cp->lock, flags);
}
diff --git a/kernel/sched/cpudeadline.h b/kernel/sched/cpudeadline.h
index fcbdf83..f7da8c5 100644
--- a/kernel/sched/cpudeadline.h
+++ b/kernel/sched/cpudeadline.h
@@ -23,7 +23,8 @@ struct cpudl {
#ifdef CONFIG_SMP
int cpudl_find(struct cpudl *cp, struct task_struct *p,
struct cpumask *later_mask);
-void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid);
+void cpudl_set(struct cpudl *cp, int cpu, u64 dl);
+void cpudl_clear(struct cpudl *cp, int cpu);
int cpudl_init(struct cpudl *cp);
void cpudl_set_freecpu(struct cpudl *cp, int cpu);
void cpudl_clear_freecpu(struct cpudl *cp, int cpu);
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index fcb7f02..f2e8f47 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -795,7 +795,7 @@ static void inc_dl_deadline(struct dl_rq *dl_rq, u64 deadline)
if (dl_rq->earliest_dl.curr == 0 ||
dl_time_before(deadline, dl_rq->earliest_dl.curr)) {
dl_rq->earliest_dl.curr = deadline;
- cpudl_set(&rq->rd->cpudl, rq->cpu, deadline, 1);
+ cpudl_set(&rq->rd->cpudl, rq->cpu, deadline);
}
}
@@ -810,14 +810,14 @@ static void dec_dl_deadline(struct dl_rq *dl_rq, u64 deadline)
if (!dl_rq->dl_nr_running) {
dl_rq->earliest_dl.curr = 0;
dl_rq->earliest_dl.next = 0;
- cpudl_set(&rq->rd->cpudl, rq->cpu, 0, 0);
+ cpudl_clear(&rq->rd->cpudl, rq->cpu);
} else {
struct rb_node *leftmost = dl_rq->rb_leftmost;
struct sched_dl_entity *entry;
entry = rb_entry(leftmost, struct sched_dl_entity, rb_node);
dl_rq->earliest_dl.curr = entry->deadline;
- cpudl_set(&rq->rd->cpudl, rq->cpu, entry->deadline, 1);
+ cpudl_set(&rq->rd->cpudl, rq->cpu, entry->deadline);
}
}
@@ -1668,7 +1668,7 @@ static void rq_online_dl(struct rq *rq)
cpudl_set_freecpu(&rq->rd->cpudl, rq->cpu);
if (rq->dl.dl_nr_running > 0)
- cpudl_set(&rq->rd->cpudl, rq->cpu, rq->dl.earliest_dl.curr, 1);
+ cpudl_set(&rq->rd->cpudl, rq->cpu, rq->dl.earliest_dl.curr);
}
/* Assumes rq->lock is held */
@@ -1677,7 +1677,7 @@ static void rq_offline_dl(struct rq *rq)
if (rq->dl.overloaded)
dl_clear_overload(rq);
- cpudl_set(&rq->rd->cpudl, rq->cpu, 0, 0);
+ cpudl_clear(&rq->rd->cpudl, rq->cpu);
cpudl_clear_freecpu(&rq->rd->cpudl, rq->cpu);
}
--
2.7.4
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 4/4] Split cpudl_set() into cpudl_set() and cpudl_clear().
2016-07-19 9:44 ` [RFC PATCH 4/4] Split cpudl_set() into cpudl_set() and cpudl_clear() Tommaso Cucinotta
@ 2016-08-02 11:52 ` Juri Lelli
0 siblings, 0 replies; 13+ messages in thread
From: Juri Lelli @ 2016-08-02 11:52 UTC (permalink / raw)
To: Tommaso Cucinotta
Cc: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar, linux-kernel
Hi,
you should add "sched/deadline:" to the subject.
On 19/07/16 11:44, Tommaso Cucinotta wrote:
I'd repeat what the subject says, so that you can refer that from the
changelog with "These".
> These 2 exercise independent code paths and need different arguments.
> Now you call
It's more "after this change" that now, IMHO.
> cpudl_clear(cp, cpu)
> cpudl_set(cp, cpu, dl)
> instead of
> cpudl_set(cp, cpu, 0 /* dl */, 0 /* is_valid */)
> cpudl_set(cp, cpu, dl, 1 /* is_valid */)
>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Juri Lelli <juri.lelli@arm.com>
> Cc: Luca Abeni <luca.abeni@unitn.it>
> Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
> Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
> ---
> kernel/sched/cpudeadline.c | 71 +++++++++++++++++++++++++++++-----------------
> kernel/sched/cpudeadline.h | 3 +-
> kernel/sched/deadline.c | 10 +++----
> 3 files changed, 52 insertions(+), 32 deletions(-)
>
> diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
> index 60f933a..0f276bf 100644
> --- a/kernel/sched/cpudeadline.c
> +++ b/kernel/sched/cpudeadline.c
> @@ -147,16 +147,15 @@ out:
> }
>
> /*
> - * cpudl_set - update the cpudl max-heap
> + * cpudl_clear - remove a cpu from the cpudl max-heap
> * @cp: the cpudl max-heap context
> * @cpu: the target cpu
> - * @dl: the new earliest deadline for this cpu
> *
> * Notes: assumes cpu_rq(cpu)->lock is locked
We should probably add (in a separate patch) a lockdep_assert for this.
> *
> * Returns: (void)
> */
> -void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
> +void cpudl_clear(struct cpudl *cp, int cpu)
> {
> int old_idx, new_cpu;
> unsigned long flags;
> @@ -164,17 +163,16 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
> WARN_ON(!cpu_present(cpu));
>
> raw_spin_lock_irqsave(&cp->lock, flags);
> +
> old_idx = cp->elements[cpu].idx;
> - if (!is_valid) {
> + if (old_idx == IDX_INVALID) {
> + /*
> + * Nothing to remove if old_idx was invalid.
> + * This could happen if a rq_offline_dl is
> + * called for a CPU without -dl tasks running.
> + */
> + } else {
> /* remove item */
> - if (old_idx == IDX_INVALID) {
> - /*
> - * Nothing to remove if old_idx was invalid.
> - * This could happen if a rq_offline_dl is
> - * called for a CPU without -dl tasks running.
> - */
> - goto out;
> - }
> cp->size--;
> cp->elements[cpu].idx = IDX_INVALID;
> if (old_idx != cp->size) {
> @@ -184,24 +182,45 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
> cp->elements[new_cpu].idx = old_idx;
> cpudl_heapify(cp, old_idx);
> }
> -
> cpumask_set_cpu(cpu, cp->free_cpus);
> + }
> +
> + raw_spin_unlock_irqrestore(&cp->lock, flags);
> +}
> +
> +/*
> + * cpudl_set - update the cpudl max-heap
> + * @cp: the cpudl max-heap context
> + * @cpu: the target cpu
> + * @dl: the new earliest deadline for this cpu
> + *
> + * Notes: assumes cpu_rq(cpu)->lock is locked
> + *
> + * Returns: (void)
> + */
> +void cpudl_set(struct cpudl *cp, int cpu, u64 dl)
> +{
> + int old_idx;
> + unsigned long flags;
> +
> + WARN_ON(!cpu_present(cpu));
> +
> + raw_spin_lock_irqsave(&cp->lock, flags);
> +
> + old_idx = cp->elements[cpu].idx;
> + if (old_idx == IDX_INVALID) {
> + int sz1 = cp->size++;
You also change the temp variable name. I think you might want to fix
that in one of the previous patches once for all.
> + cp->elements[sz1].dl = dl;
> + cp->elements[sz1].cpu = cpu;
> + cp->elements[cpu].idx = sz1;
> + cpudl_heapify_up(cp, sz1);
> +
> + cpumask_clear_cpu(cpu, cp->free_cpus);
> } else {
> - if (old_idx == IDX_INVALID) {
> - int size1 = cp->size++;
> - cp->elements[size1].dl = dl;
> - cp->elements[size1].cpu = cpu;
> - cp->elements[cpu].idx = size1;
> - cpudl_heapify_up(cp, size1);
> -
> - cpumask_clear_cpu(cpu, cp->free_cpus);
> - } else {
> - cp->elements[old_idx].dl = dl;
> - cpudl_heapify(cp, old_idx);
> - }
> + cp->elements[old_idx].dl = dl;
> + cpudl_heapify(cp, old_idx);
> }
>
> -out:
> raw_spin_unlock_irqrestore(&cp->lock, flags);
> }
>
> diff --git a/kernel/sched/cpudeadline.h b/kernel/sched/cpudeadline.h
> index fcbdf83..f7da8c5 100644
> --- a/kernel/sched/cpudeadline.h
> +++ b/kernel/sched/cpudeadline.h
> @@ -23,7 +23,8 @@ struct cpudl {
> #ifdef CONFIG_SMP
> int cpudl_find(struct cpudl *cp, struct task_struct *p,
> struct cpumask *later_mask);
> -void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid);
> +void cpudl_set(struct cpudl *cp, int cpu, u64 dl);
> +void cpudl_clear(struct cpudl *cp, int cpu);
> int cpudl_init(struct cpudl *cp);
> void cpudl_set_freecpu(struct cpudl *cp, int cpu);
> void cpudl_clear_freecpu(struct cpudl *cp, int cpu);
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index fcb7f02..f2e8f47 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -795,7 +795,7 @@ static void inc_dl_deadline(struct dl_rq *dl_rq, u64 deadline)
> if (dl_rq->earliest_dl.curr == 0 ||
> dl_time_before(deadline, dl_rq->earliest_dl.curr)) {
> dl_rq->earliest_dl.curr = deadline;
> - cpudl_set(&rq->rd->cpudl, rq->cpu, deadline, 1);
> + cpudl_set(&rq->rd->cpudl, rq->cpu, deadline);
> }
> }
>
> @@ -810,14 +810,14 @@ static void dec_dl_deadline(struct dl_rq *dl_rq, u64 deadline)
> if (!dl_rq->dl_nr_running) {
> dl_rq->earliest_dl.curr = 0;
> dl_rq->earliest_dl.next = 0;
> - cpudl_set(&rq->rd->cpudl, rq->cpu, 0, 0);
> + cpudl_clear(&rq->rd->cpudl, rq->cpu);
> } else {
> struct rb_node *leftmost = dl_rq->rb_leftmost;
> struct sched_dl_entity *entry;
>
> entry = rb_entry(leftmost, struct sched_dl_entity, rb_node);
> dl_rq->earliest_dl.curr = entry->deadline;
> - cpudl_set(&rq->rd->cpudl, rq->cpu, entry->deadline, 1);
> + cpudl_set(&rq->rd->cpudl, rq->cpu, entry->deadline);
> }
> }
>
> @@ -1668,7 +1668,7 @@ static void rq_online_dl(struct rq *rq)
>
> cpudl_set_freecpu(&rq->rd->cpudl, rq->cpu);
> if (rq->dl.dl_nr_running > 0)
> - cpudl_set(&rq->rd->cpudl, rq->cpu, rq->dl.earliest_dl.curr, 1);
> + cpudl_set(&rq->rd->cpudl, rq->cpu, rq->dl.earliest_dl.curr);
> }
>
> /* Assumes rq->lock is held */
> @@ -1677,7 +1677,7 @@ static void rq_offline_dl(struct rq *rq)
> if (rq->dl.overloaded)
> dl_clear_overload(rq);
>
> - cpudl_set(&rq->rd->cpudl, rq->cpu, 0, 0);
> + cpudl_clear(&rq->rd->cpudl, rq->cpu);
> cpudl_clear_freecpu(&rq->rd->cpudl, rq->cpu);
> }
>
> --
Apart from the minor nitpicks above, the change looks good to me and it
shouldn't introduce any functional changes (maybe worth stating it in
the changelog).
Best,
- Juri
^ permalink raw reply [flat|nested] 13+ messages in thread
* [RFC PATCH 3/4] Make deadline max-heap faster avoiding real swaps on heapify ops.
2016-05-19 16:02 SCHED_DEADLINE cpudeadline.{h,c} fixup Tommaso Cucinotta
@ 2016-05-19 16:02 ` Tommaso Cucinotta
0 siblings, 0 replies; 13+ messages in thread
From: Tommaso Cucinotta @ 2016-05-19 16:02 UTC (permalink / raw)
To: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar
Cc: linux-kernel, Tommaso Cucinotta
---
kernel/sched/cpudeadline.c | 114 +++++++++++++++++++++++++++------------------
1 file changed, 69 insertions(+), 45 deletions(-)
diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index 3c42702..60f933a 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -31,60 +31,82 @@ static inline int right_child(int i)
return (i << 1) + 2;
}
-static void cpudl_exchange(struct cpudl *cp, int a, int b)
-{
- int cpu_a = cp->elements[a].cpu, cpu_b = cp->elements[b].cpu;
-
- swap(cp->elements[a].cpu, cp->elements[b].cpu);
- swap(cp->elements[a].dl , cp->elements[b].dl );
-
- swap(cp->elements[cpu_a].idx, cp->elements[cpu_b].idx);
-}
-
static void cpudl_heapify_down(struct cpudl *cp, int idx)
{
int l, r, largest;
+ int orig_cpu = cp->elements[idx].cpu;
+ u64 orig_dl = cp->elements[idx].dl;
+
+ if (left_child(idx) >= cp->size)
+ return;
+
/* adapted from lib/prio_heap.c */
while(1) {
+ u64 largest_dl;
l = left_child(idx);
r = right_child(idx);
largest = idx;
+ largest_dl = orig_dl;
- if ((l < cp->size) && dl_time_before(cp->elements[idx].dl,
- cp->elements[l].dl))
+ if ((l < cp->size) && dl_time_before(orig_dl, cp->elements[l].dl)) {
largest = l;
- if ((r < cp->size) && dl_time_before(cp->elements[largest].dl,
- cp->elements[r].dl))
+ largest_dl = cp->elements[l].dl;
+ }
+ if ((r < cp->size) && dl_time_before(largest_dl, cp->elements[r].dl))
largest = r;
+
if (largest == idx)
break;
- /* Push idx down the heap one level and bump one up */
- cpudl_exchange(cp, largest, idx);
+ /* pull largest child onto idx */
+ cp->elements[idx].cpu = cp->elements[largest].cpu;
+ cp->elements[idx].dl = cp->elements[largest].dl;
+ cp->elements[cp->elements[idx].cpu].idx = idx;
idx = largest;
}
+ /* actual push down of saved original values orig_* */
+ cp->elements[idx].cpu = orig_cpu;
+ cp->elements[idx].dl = orig_dl;
+ cp->elements[cp->elements[idx].cpu].idx = idx;
}
static void cpudl_heapify_up(struct cpudl *cp, int idx)
{
- while (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl,
- cp->elements[idx].dl)) {
- cpudl_exchange(cp, idx, parent(idx));
- idx = parent(idx);
- }
+ int p;
+
+ int orig_cpu = cp->elements[idx].cpu;
+ u64 orig_dl = cp->elements[idx].dl;
+
+ if (idx == 0)
+ return;
+
+ do {
+ p = parent(idx);
+ if (dl_time_before(cp->elements[idx].dl, cp->elements[p].dl))
+ break;
+ /* pull parent onto idx */
+ cp->elements[idx].cpu = cp->elements[p].cpu;
+ cp->elements[idx].dl = cp->elements[p].dl;
+ cp->elements[cp->elements[idx].cpu].idx = idx;
+ idx = p;
+ } while (idx != 0);
+ /* actual push up of saved original values orig_* */
+ cp->elements[idx].cpu = orig_cpu;
+ cp->elements[idx].dl = orig_dl;
+ cp->elements[cp->elements[idx].cpu].idx = idx;
}
-static void cpudl_change_key(struct cpudl *cp, int idx, u64 new_dl)
+static void cpudl_heapify(struct cpudl *cp, int idx)
{
WARN_ON(idx == IDX_INVALID || !cpu_present(idx));
+ if (idx == IDX_INVALID)
+ return;
- if (dl_time_before(new_dl, cp->elements[idx].dl)) {
- cp->elements[idx].dl = new_dl;
- cpudl_heapify_down(cp, idx);
- } else {
- cp->elements[idx].dl = new_dl;
+ if (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl, cp->elements[idx].dl)) {
cpudl_heapify_up(cp, idx);
+ } else {
+ cpudl_heapify_down(cp, idx);
}
}
@@ -153,28 +175,30 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
*/
goto out;
}
- new_cpu = cp->elements[cp->size - 1].cpu;
- cp->elements[old_idx].dl = cp->elements[cp->size - 1].dl;
- cp->elements[old_idx].cpu = new_cpu;
cp->size--;
- cp->elements[new_cpu].idx = old_idx;
cp->elements[cpu].idx = IDX_INVALID;
- cpudl_heapify_up(cp, old_idx);
- cpumask_set_cpu(cpu, cp->free_cpus);
- cpudl_heapify_down(cp, old_idx);
-
- goto out;
- }
+ if (old_idx != cp->size) {
+ new_cpu = cp->elements[cp->size].cpu;
+ cp->elements[old_idx].dl = cp->elements[cp->size].dl;
+ cp->elements[old_idx].cpu = new_cpu;
+ cp->elements[new_cpu].idx = old_idx;
+ cpudl_heapify(cp, old_idx);
+ }
- if (old_idx == IDX_INVALID) {
- int size1 = cp->size++;
- cp->elements[size1].dl = dl;
- cp->elements[size1].cpu = cpu;
- cp->elements[cpu].idx = size1;
- cpudl_heapify_up(cp, size1);
- cpumask_clear_cpu(cpu, cp->free_cpus);
+ cpumask_set_cpu(cpu, cp->free_cpus);
} else {
- cpudl_change_key(cp, old_idx, dl);
+ if (old_idx == IDX_INVALID) {
+ int size1 = cp->size++;
+ cp->elements[size1].dl = dl;
+ cp->elements[size1].cpu = cpu;
+ cp->elements[cpu].idx = size1;
+ cpudl_heapify_up(cp, size1);
+
+ cpumask_clear_cpu(cpu, cp->free_cpus);
+ } else {
+ cp->elements[old_idx].dl = dl;
+ cpudl_heapify(cp, old_idx);
+ }
}
out:
--
2.7.4
^ permalink raw reply related [flat|nested] 13+ messages in thread