* SCHED_DEADLINE cpudeadline.{h,c} fixup
@ 2016-07-19  9:44 Tommaso Cucinotta
  2016-07-19  9:44 ` [RFC PATCH 1/4] Minimally invasive deadline wrap-around bugfix for SCHED_DEADLINE cpu heap Tommaso Cucinotta
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Tommaso Cucinotta @ 2016-07-19  9:44 UTC (permalink / raw)
  To: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar
  Cc: linux-kernel, linux-dl, Tommaso Cucinotta

Hi,

this is a rework of the cpudeadline bugfix and speed-up patch-set, which
integrates all comments received so far from Luca and Juri.

The first patch is a minimally invasive (1-line) fix for the deadline
wrap-around bug. This leaves some weirdness in how cpudl_change_key() is
called. Therefore, the second patch does a minimum of refactoring to make
things more explicit and clear.
The 3rd patch now contains the actual performance enhancement (avoiding
unneeded swaps during heapify operations), which has been measured to
achieve up to a 10% speed-up for cpudl_set() calls.
This has been measured with a randomly generated workload of 1K, 10K and 100K
random heap insertions and deletions (75% cpudl_set() calls with is_valid=1
and 25% with is_valid=0), and randomly generated cpu IDs, with up to 256
CPUs.

Benchmarking code is available at:
  https://github.com/tomcucinotta/cpudl-bench
Obtained speed-up plot:
  https://github.com/tomcucinotta/cpudl-bench/blob/master/cpudl.pdf
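
For reference, here is only a rough userspace sketch of the workload
generation (the real driver and timing code live in the repository above;
bench_set()/bench_clear() below are hypothetical stand-ins, not the actual
cpudl API):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define NCPUS 256

static void bench_set(int cpu, uint64_t dl) { (void)cpu; (void)dl; }
static void bench_clear(int cpu) { (void)cpu; }

int main(void)
{
	int i, nops = 100000;	/* 1K, 10K or 100K in the measurements */

	srand(0);
	for (i = 0; i < nops; i++) {
		int cpu = rand() % NCPUS;
		uint64_t dl = ((uint64_t)rand() << 32) | (uint64_t)rand();

		if (rand() % 4 != 0)	/* ~75% insertions/updates */
			bench_set(cpu, dl);
		else			/* ~25% removals */
			bench_clear(cpu);
	}
	printf("generated %d random cpudl operations\n", nops);
	return 0;
}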

Finally, the 4th patch is another clean-up patch touching cpudeadline.{h,c}
and deadline.c. Now you call cpudl_clear(cp, cpu) and cpudl_set(cp, cpu, dl)
instead of cpudl_set(cp, cpu, 0 /* dl */, 0  /* is_valid */) and
cpudl_set(cp, cpu, dl, 1 /* is_valid */).

Please, share your comments, thanks!

  Tommaso

* [RFC PATCH 1/4] Minimally invasive deadline wrap-around bugfix for SCHED_DEADLINE cpu heap
  2016-07-19  9:44 SCHED_DEADLINE cpudeadline.{h,c} fixup Tommaso Cucinotta
@ 2016-07-19  9:44 ` Tommaso Cucinotta
  2016-08-01 11:43   ` Juri Lelli
  2016-08-10 18:00   ` [tip:sched/core] sched/deadline: Fix wrap-around in DL heap tip-bot for Tommaso Cucinotta
  2016-07-19  9:44 ` [RFC PATCH 2/4] SCHED_DEADLINE cpu heap code clarification/refactory Tommaso Cucinotta
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 12+ messages in thread
From: Tommaso Cucinotta @ 2016-07-19  9:44 UTC (permalink / raw)
  To: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar
  Cc: linux-kernel, linux-dl, Tommaso Cucinotta, Juri Lelli

Current code in cpudeadline.c has a bug in re-heapifying when adding a
new element at the end of the heap, because a deadline value of 0 is
temporarily set in the new elem, then cpudl_change_key() is called
with the actual elem deadline as param. However, the function compares
the new deadline to set with the one previously in the elem, which is
0.  So, if current absolute deadlines grew so much to have negative
values as s64, the comparison in cpudl_change_key() makes the wrong
decision.  Instead, as from dl_time_before(), the kernel should handle
correctly abs deadlines wrap-arounds.

This patch fixes the problem with a minimally invasive change that
forces cpudl_change_key() to heapify up in this case.
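
To make the failure mode concrete, here is a minimal userspace
illustration (not kernel code; dl_time_before() is re-implemented below
following its kernel definition, and the deadline value is made up):

#include <stdint.h>
#include <stdio.h>

typedef uint64_t u64;
typedef int64_t s64;

static int dl_time_before(u64 a, u64 b)
{
	return (s64)(a - b) < 0;
}

int main(void)
{
	/* an absolute deadline that has wrapped into the upper half of u64 */
	u64 dl = 0x8000000000000100ULL;

	/*
	 * Old code: the new heap element temporarily gets .dl = 0, then
	 * cpudl_change_key() compares the real deadline against that 0.
	 * The comparison below is true, so the key looks like it "decreased"
	 * and the code heapifies down from a leaf (a no-op) instead of up.
	 */
	printf("dl_time_before(dl, 0)  = %d (wrong branch)\n",
	       dl_time_before(dl, 0));

	/* With the fix the element starts with .dl = dl, so the comparison
	 * is against dl itself and the heapify-up branch is taken. */
	printf("dl_time_before(dl, dl) = %d (fixed: heapify up)\n",
	       dl_time_before(dl, dl));
	return 0;
}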

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Luca Abeni <luca.abeni@unitn.it>
Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
---
 kernel/sched/cpudeadline.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index 5be5882..d418449 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -168,7 +168,7 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
 
 	if (old_idx == IDX_INVALID) {
 		cp->size++;
-		cp->elements[cp->size - 1].dl = 0;
+		cp->elements[cp->size - 1].dl = dl;
 		cp->elements[cp->size - 1].cpu = cpu;
 		cp->elements[cpu].idx = cp->size - 1;
 		cpudl_change_key(cp, cp->size - 1, dl);
-- 
2.7.4

* [RFC PATCH 2/4] SCHED_DEADLINE cpu heap code clarification/refactory.
  2016-07-19  9:44 SCHED_DEADLINE cpudeadline.{h,c} fixup Tommaso Cucinotta
  2016-07-19  9:44 ` [RFC PATCH 1/4] Minimally invasive deadline wrap-around bugfix for SCHED_DEADLINE cpu heap Tommaso Cucinotta
@ 2016-07-19  9:44 ` Tommaso Cucinotta
  2016-08-01 16:30   ` Juri Lelli
  2016-07-19  9:44 ` [RFC PATCH 3/4] Make deadline max-heap faster avoiding real swaps on heapify ops Tommaso Cucinotta
  2016-07-19  9:44 ` [RFC PATCH 4/4] Split cpudl_set() into cpudl_set() and cpudl_clear() Tommaso Cucinotta
  3 siblings, 1 reply; 12+ messages in thread
From: Tommaso Cucinotta @ 2016-07-19  9:44 UTC (permalink / raw)
  To: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar
  Cc: linux-kernel, linux-dl, Tommaso Cucinotta, Juri Lelli

1. heapify up factored out in new dedicated function heapify_up()
   (avoids repeatition of same code)
2. call to cpudl_change_key() replaced with heapify_up() when
   cpudl_set actually inserts a new node in the heap

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Luca Abeni <luca.abeni@unitn.it>
Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
---
 kernel/sched/cpudeadline.c | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index d418449..3c42702 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -41,7 +41,7 @@ static void cpudl_exchange(struct cpudl *cp, int a, int b)
 	swap(cp->elements[cpu_a].idx, cp->elements[cpu_b].idx);
 }
 
-static void cpudl_heapify(struct cpudl *cp, int idx)
+static void cpudl_heapify_down(struct cpudl *cp, int idx)
 {
 	int l, r, largest;
 
@@ -66,20 +66,25 @@ static void cpudl_heapify(struct cpudl *cp, int idx)
 	}
 }
 
+static void cpudl_heapify_up(struct cpudl *cp, int idx)
+{
+	while (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl,
+			cp->elements[idx].dl)) {
+		cpudl_exchange(cp, idx, parent(idx));
+		idx = parent(idx);
+	}
+}
+
 static void cpudl_change_key(struct cpudl *cp, int idx, u64 new_dl)
 {
 	WARN_ON(idx == IDX_INVALID || !cpu_present(idx));
 
 	if (dl_time_before(new_dl, cp->elements[idx].dl)) {
 		cp->elements[idx].dl = new_dl;
-		cpudl_heapify(cp, idx);
+		cpudl_heapify_down(cp, idx);
 	} else {
 		cp->elements[idx].dl = new_dl;
-		while (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl,
-					cp->elements[idx].dl)) {
-			cpudl_exchange(cp, idx, parent(idx));
-			idx = parent(idx);
-		}
+		cpudl_heapify_up(cp, idx);
 	}
 }
 
@@ -154,24 +159,19 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
 		cp->size--;
 		cp->elements[new_cpu].idx = old_idx;
 		cp->elements[cpu].idx = IDX_INVALID;
-		while (old_idx > 0 && dl_time_before(
-				cp->elements[parent(old_idx)].dl,
-				cp->elements[old_idx].dl)) {
-			cpudl_exchange(cp, old_idx, parent(old_idx));
-			old_idx = parent(old_idx);
-		}
+		cpudl_heapify_up(cp, old_idx);
 		cpumask_set_cpu(cpu, cp->free_cpus);
-                cpudl_heapify(cp, old_idx);
+                cpudl_heapify_down(cp, old_idx);
 
 		goto out;
 	}
 
 	if (old_idx == IDX_INVALID) {
-		cp->size++;
-		cp->elements[cp->size - 1].dl = dl;
-		cp->elements[cp->size - 1].cpu = cpu;
-		cp->elements[cpu].idx = cp->size - 1;
-		cpudl_change_key(cp, cp->size - 1, dl);
+		int size1 = cp->size++;
+		cp->elements[size1].dl = dl;
+		cp->elements[size1].cpu = cpu;
+		cp->elements[cpu].idx = size1;
+		cpudl_heapify_up(cp, size1);
 		cpumask_clear_cpu(cpu, cp->free_cpus);
 	} else {
 		cpudl_change_key(cp, old_idx, dl);
-- 
2.7.4

* [RFC PATCH 3/4] Make deadline max-heap faster avoiding real swaps on heapify ops.
  2016-07-19  9:44 SCHED_DEADLINE cpudeadline.{h,c} fixup Tommaso Cucinotta
  2016-07-19  9:44 ` [RFC PATCH 1/4] Minimally invasive deadline wrap-around bugfix for SCHED_DEADLINE cpu heap Tommaso Cucinotta
  2016-07-19  9:44 ` [RFC PATCH 2/4] SCHED_DEADLINE cpu heap code clarification/refactory Tommaso Cucinotta
@ 2016-07-19  9:44 ` Tommaso Cucinotta
  2016-08-02 11:38   ` Juri Lelli
  2016-08-08 14:04   ` Peter Zijlstra
  2016-07-19  9:44 ` [RFC PATCH 4/4] Split cpudl_set() into cpudl_set() and cpudl_clear() Tommaso Cucinotta
  3 siblings, 2 replies; 12+ messages in thread
From: Tommaso Cucinotta @ 2016-07-19  9:44 UTC (permalink / raw)
  To: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar
  Cc: linux-kernel, linux-dl, Tommaso Cucinotta, Juri Lelli

This change achieves up to a 10% speed-up for cpudl_set() calls, as
measured with a randomly generated workload of 1K, 10K and 100K random heap
insertions and deletions (75% cpudl_set() calls with is_valid=1 and
25% with is_valid=0), and randomly generated cpu IDs, with up to 256
CPUs, as measured on an Intel Core2 Duo.
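
The core idea is the classic "shift instead of swap" optimization: save
the element being repositioned once, shift parents (or children) into the
hole one copy at a time, and write the saved element back only at its
final position. A simplified, standalone sketch of the sift-up case on a
plain array of keys (the real cpudl code additionally keeps the per-CPU
index map in sync and compares with dl_time_before() rather than <):

#include <stdint.h>
#include <stdio.h>

typedef uint64_t u64;

static inline int parent(int i)
{
	return (i - 1) >> 1;
}

/* no-swap sift-up: one copy per level plus a single final placement,
 * instead of a full element swap at every level */
static void sift_up_noswap(u64 *h, int idx)
{
	u64 orig = h[idx];

	while (idx > 0 && h[parent(idx)] < orig) {
		h[idx] = h[parent(idx)];	/* pull parent down one level */
		idx = parent(idx);
	}
	h[idx] = orig;				/* place saved key once */
}

int main(void)
{
	u64 heap[] = { 90, 70, 80, 10, 20, 30, 40, 85 };
	int i, n = sizeof(heap) / sizeof(heap[0]);

	/* 85 was just appended at index n-1: restore the max-heap property */
	sift_up_noswap(heap, n - 1);
	for (i = 0; i < n; i++)
		printf("%llu ", (unsigned long long)heap[i]);
	printf("\n");	/* expected: 90 85 80 70 20 30 40 10 */
	return 0;
}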

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Luca Abeni <luca.abeni@unitn.it>
Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
---
 kernel/sched/cpudeadline.c | 114 +++++++++++++++++++++++++++------------------
 1 file changed, 69 insertions(+), 45 deletions(-)

diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index 3c42702..60f933a 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -31,60 +31,82 @@ static inline int right_child(int i)
 	return (i << 1) + 2;
 }
 
-static void cpudl_exchange(struct cpudl *cp, int a, int b)
-{
-	int cpu_a = cp->elements[a].cpu, cpu_b = cp->elements[b].cpu;
-
-	swap(cp->elements[a].cpu, cp->elements[b].cpu);
-	swap(cp->elements[a].dl , cp->elements[b].dl );
-
-	swap(cp->elements[cpu_a].idx, cp->elements[cpu_b].idx);
-}
-
 static void cpudl_heapify_down(struct cpudl *cp, int idx)
 {
 	int l, r, largest;
 
+	int orig_cpu = cp->elements[idx].cpu;
+	u64 orig_dl = cp->elements[idx].dl;
+
+	if (left_child(idx) >= cp->size)
+		return;
+
 	/* adapted from lib/prio_heap.c */
 	while(1) {
+		u64 largest_dl;
 		l = left_child(idx);
 		r = right_child(idx);
 		largest = idx;
+		largest_dl = orig_dl;
 
-		if ((l < cp->size) && dl_time_before(cp->elements[idx].dl,
-							cp->elements[l].dl))
+		if ((l < cp->size) && dl_time_before(orig_dl, cp->elements[l].dl)) {
 			largest = l;
-		if ((r < cp->size) && dl_time_before(cp->elements[largest].dl,
-							cp->elements[r].dl))
+			largest_dl = cp->elements[l].dl;
+		}
+		if ((r < cp->size) && dl_time_before(largest_dl, cp->elements[r].dl))
 			largest = r;
+
 		if (largest == idx)
 			break;
 
-		/* Push idx down the heap one level and bump one up */
-		cpudl_exchange(cp, largest, idx);
+		/* pull largest child onto idx */
+		cp->elements[idx].cpu = cp->elements[largest].cpu;
+		cp->elements[idx].dl = cp->elements[largest].dl;
+		cp->elements[cp->elements[idx].cpu].idx = idx;
 		idx = largest;
 	}
+	/* actual push down of saved original values orig_* */
+	cp->elements[idx].cpu = orig_cpu;
+	cp->elements[idx].dl = orig_dl;
+	cp->elements[cp->elements[idx].cpu].idx = idx;
 }
 
 static void cpudl_heapify_up(struct cpudl *cp, int idx)
 {
-	while (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl,
-			cp->elements[idx].dl)) {
-		cpudl_exchange(cp, idx, parent(idx));
-		idx = parent(idx);
-	}
+	int p;
+
+	int orig_cpu = cp->elements[idx].cpu;
+	u64 orig_dl = cp->elements[idx].dl;
+
+	if (idx == 0)
+		return;
+
+	do {
+		p = parent(idx);
+		if (dl_time_before(cp->elements[idx].dl, cp->elements[p].dl))
+			break;
+		/* pull parent onto idx */
+		cp->elements[idx].cpu = cp->elements[p].cpu;
+		cp->elements[idx].dl = cp->elements[p].dl;
+		cp->elements[cp->elements[idx].cpu].idx = idx;
+		idx = p;
+	} while (idx != 0);
+	/* actual push up of saved original values orig_* */
+	cp->elements[idx].cpu = orig_cpu;
+	cp->elements[idx].dl = orig_dl;
+	cp->elements[cp->elements[idx].cpu].idx = idx;
 }
 
-static void cpudl_change_key(struct cpudl *cp, int idx, u64 new_dl)
+static void cpudl_heapify(struct cpudl *cp, int idx)
 {
 	WARN_ON(idx == IDX_INVALID || !cpu_present(idx));
+	if (idx == IDX_INVALID)
+		return;
 
-	if (dl_time_before(new_dl, cp->elements[idx].dl)) {
-		cp->elements[idx].dl = new_dl;
-		cpudl_heapify_down(cp, idx);
-	} else {
-		cp->elements[idx].dl = new_dl;
+	if (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl, cp->elements[idx].dl)) {
 		cpudl_heapify_up(cp, idx);
+	} else {
+		cpudl_heapify_down(cp, idx);
 	}
 }
 
@@ -153,28 +175,30 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
 			 */
 			goto out;
 		}
-		new_cpu = cp->elements[cp->size - 1].cpu;
-		cp->elements[old_idx].dl = cp->elements[cp->size - 1].dl;
-		cp->elements[old_idx].cpu = new_cpu;
 		cp->size--;
-		cp->elements[new_cpu].idx = old_idx;
 		cp->elements[cpu].idx = IDX_INVALID;
-		cpudl_heapify_up(cp, old_idx);
-		cpumask_set_cpu(cpu, cp->free_cpus);
-                cpudl_heapify_down(cp, old_idx);
-
-		goto out;
-	}
+		if (old_idx != cp->size) {
+			new_cpu = cp->elements[cp->size].cpu;
+			cp->elements[old_idx].dl = cp->elements[cp->size].dl;
+			cp->elements[old_idx].cpu = new_cpu;
+			cp->elements[new_cpu].idx = old_idx;
+			cpudl_heapify(cp, old_idx);
+		}
 
-	if (old_idx == IDX_INVALID) {
-		int size1 = cp->size++;
-		cp->elements[size1].dl = dl;
-		cp->elements[size1].cpu = cpu;
-		cp->elements[cpu].idx = size1;
-		cpudl_heapify_up(cp, size1);
-		cpumask_clear_cpu(cpu, cp->free_cpus);
+		cpumask_set_cpu(cpu, cp->free_cpus);
 	} else {
-		cpudl_change_key(cp, old_idx, dl);
+		if (old_idx == IDX_INVALID) {
+			int size1 = cp->size++;
+			cp->elements[size1].dl = dl;
+			cp->elements[size1].cpu = cpu;
+			cp->elements[cpu].idx = size1;
+			cpudl_heapify_up(cp, size1);
+
+			cpumask_clear_cpu(cpu, cp->free_cpus);
+		} else {
+			cp->elements[old_idx].dl = dl;
+			cpudl_heapify(cp, old_idx);
+		}
 	}
 
 out:
-- 
2.7.4

* [RFC PATCH 4/4] Split cpudl_set() into cpudl_set() and cpudl_clear().
  2016-07-19  9:44 SCHED_DEADLINE cpudeadline.{h,c} fixup Tommaso Cucinotta
                   ` (2 preceding siblings ...)
  2016-07-19  9:44 ` [RFC PATCH 3/4] Make deadline max-heap faster avoiding real swaps on heapify ops Tommaso Cucinotta
@ 2016-07-19  9:44 ` Tommaso Cucinotta
  2016-08-02 11:52   ` Juri Lelli
  3 siblings, 1 reply; 12+ messages in thread
From: Tommaso Cucinotta @ 2016-07-19  9:44 UTC (permalink / raw)
  To: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar
  Cc: linux-kernel, linux-dl, Tommaso Cucinotta, Juri Lelli

These 2 exercise independent code paths and need different arguments.
Now you call
  cpudl_clear(cp, cpu)
  cpudl_set(cp, cpu, dl)
instead of
  cpudl_set(cp, cpu, 0 /* dl */, 0 /* is_valid */)
  cpudl_set(cp, cpu, dl, 1 /* is_valid */)

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Luca Abeni <luca.abeni@unitn.it>
Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
---
 kernel/sched/cpudeadline.c | 71 +++++++++++++++++++++++++++++-----------------
 kernel/sched/cpudeadline.h |  3 +-
 kernel/sched/deadline.c    | 10 +++----
 3 files changed, 52 insertions(+), 32 deletions(-)

diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index 60f933a..0f276bf 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -147,16 +147,15 @@ out:
 }
 
 /*
- * cpudl_set - update the cpudl max-heap
+ * cpudl_clear - remove a cpu from the cpudl max-heap
  * @cp: the cpudl max-heap context
  * @cpu: the target cpu
- * @dl: the new earliest deadline for this cpu
  *
  * Notes: assumes cpu_rq(cpu)->lock is locked
  *
  * Returns: (void)
  */
-void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
+void cpudl_clear(struct cpudl *cp, int cpu)
 {
 	int old_idx, new_cpu;
 	unsigned long flags;
@@ -164,17 +163,16 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
 	WARN_ON(!cpu_present(cpu));
 
 	raw_spin_lock_irqsave(&cp->lock, flags);
+
 	old_idx = cp->elements[cpu].idx;
-	if (!is_valid) {
+	if (old_idx == IDX_INVALID) {
+		/*
+		 * Nothing to remove if old_idx was invalid.
+		 * This could happen if a rq_offline_dl is
+		 * called for a CPU without -dl tasks running.
+		 */
+	} else {
 		/* remove item */
-		if (old_idx == IDX_INVALID) {
-			/*
-			 * Nothing to remove if old_idx was invalid.
-			 * This could happen if a rq_offline_dl is
-			 * called for a CPU without -dl tasks running.
-			 */
-			goto out;
-		}
 		cp->size--;
 		cp->elements[cpu].idx = IDX_INVALID;
 		if (old_idx != cp->size) {
@@ -184,24 +182,45 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
 			cp->elements[new_cpu].idx = old_idx;
 			cpudl_heapify(cp, old_idx);
 		}
-
 		cpumask_set_cpu(cpu, cp->free_cpus);
+	}
+
+	raw_spin_unlock_irqrestore(&cp->lock, flags);
+}
+
+/*
+ * cpudl_set - update the cpudl max-heap
+ * @cp: the cpudl max-heap context
+ * @cpu: the target cpu
+ * @dl: the new earliest deadline for this cpu
+ *
+ * Notes: assumes cpu_rq(cpu)->lock is locked
+ *
+ * Returns: (void)
+ */
+void cpudl_set(struct cpudl *cp, int cpu, u64 dl)
+{
+	int old_idx;
+	unsigned long flags;
+
+	WARN_ON(!cpu_present(cpu));
+
+	raw_spin_lock_irqsave(&cp->lock, flags);
+
+	old_idx = cp->elements[cpu].idx;
+	if (old_idx == IDX_INVALID) {
+		int sz1 = cp->size++;
+		cp->elements[sz1].dl = dl;
+		cp->elements[sz1].cpu = cpu;
+		cp->elements[cpu].idx = sz1;
+		cpudl_heapify_up(cp, sz1);
+
+		cpumask_clear_cpu(cpu, cp->free_cpus);
 	} else {
-		if (old_idx == IDX_INVALID) {
-			int size1 = cp->size++;
-			cp->elements[size1].dl = dl;
-			cp->elements[size1].cpu = cpu;
-			cp->elements[cpu].idx = size1;
-			cpudl_heapify_up(cp, size1);
-
-			cpumask_clear_cpu(cpu, cp->free_cpus);
-		} else {
-			cp->elements[old_idx].dl = dl;
-			cpudl_heapify(cp, old_idx);
-		}
+		cp->elements[old_idx].dl = dl;
+		cpudl_heapify(cp, old_idx);
 	}
 
-out:
 	raw_spin_unlock_irqrestore(&cp->lock, flags);
 }
 
diff --git a/kernel/sched/cpudeadline.h b/kernel/sched/cpudeadline.h
index fcbdf83..f7da8c5 100644
--- a/kernel/sched/cpudeadline.h
+++ b/kernel/sched/cpudeadline.h
@@ -23,7 +23,8 @@ struct cpudl {
 #ifdef CONFIG_SMP
 int cpudl_find(struct cpudl *cp, struct task_struct *p,
 	       struct cpumask *later_mask);
-void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid);
+void cpudl_set(struct cpudl *cp, int cpu, u64 dl);
+void cpudl_clear(struct cpudl *cp, int cpu);
 int cpudl_init(struct cpudl *cp);
 void cpudl_set_freecpu(struct cpudl *cp, int cpu);
 void cpudl_clear_freecpu(struct cpudl *cp, int cpu);
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index fcb7f02..f2e8f47 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -795,7 +795,7 @@ static void inc_dl_deadline(struct dl_rq *dl_rq, u64 deadline)
 	if (dl_rq->earliest_dl.curr == 0 ||
 	    dl_time_before(deadline, dl_rq->earliest_dl.curr)) {
 		dl_rq->earliest_dl.curr = deadline;
-		cpudl_set(&rq->rd->cpudl, rq->cpu, deadline, 1);
+		cpudl_set(&rq->rd->cpudl, rq->cpu, deadline);
 	}
 }
 
@@ -810,14 +810,14 @@ static void dec_dl_deadline(struct dl_rq *dl_rq, u64 deadline)
 	if (!dl_rq->dl_nr_running) {
 		dl_rq->earliest_dl.curr = 0;
 		dl_rq->earliest_dl.next = 0;
-		cpudl_set(&rq->rd->cpudl, rq->cpu, 0, 0);
+		cpudl_clear(&rq->rd->cpudl, rq->cpu);
 	} else {
 		struct rb_node *leftmost = dl_rq->rb_leftmost;
 		struct sched_dl_entity *entry;
 
 		entry = rb_entry(leftmost, struct sched_dl_entity, rb_node);
 		dl_rq->earliest_dl.curr = entry->deadline;
-		cpudl_set(&rq->rd->cpudl, rq->cpu, entry->deadline, 1);
+		cpudl_set(&rq->rd->cpudl, rq->cpu, entry->deadline);
 	}
 }
 
@@ -1668,7 +1668,7 @@ static void rq_online_dl(struct rq *rq)
 
 	cpudl_set_freecpu(&rq->rd->cpudl, rq->cpu);
 	if (rq->dl.dl_nr_running > 0)
-		cpudl_set(&rq->rd->cpudl, rq->cpu, rq->dl.earliest_dl.curr, 1);
+		cpudl_set(&rq->rd->cpudl, rq->cpu, rq->dl.earliest_dl.curr);
 }
 
 /* Assumes rq->lock is held */
@@ -1677,7 +1677,7 @@ static void rq_offline_dl(struct rq *rq)
 	if (rq->dl.overloaded)
 		dl_clear_overload(rq);
 
-	cpudl_set(&rq->rd->cpudl, rq->cpu, 0, 0);
+	cpudl_clear(&rq->rd->cpudl, rq->cpu);
 	cpudl_clear_freecpu(&rq->rd->cpudl, rq->cpu);
 }
 
-- 
2.7.4

* Re: [RFC PATCH 1/4] Minimally invasive deadline wrap-around bugfix for SCHED_DEADLINE cpu heap
  2016-07-19  9:44 ` [RFC PATCH 1/4] Minimally invasive deadline wrap-around bugfix for SCHED_DEADLINE cpu heap Tommaso Cucinotta
@ 2016-08-01 11:43   ` Juri Lelli
  2016-08-01 12:57     ` Juri Lelli
  2016-08-10 18:00   ` [tip:sched/core] sched/deadline: Fix wrap-around in DL heap tip-bot for Tommaso Cucinotta
  1 sibling, 1 reply; 12+ messages in thread
From: Juri Lelli @ 2016-08-01 11:43 UTC (permalink / raw)
  To: Tommaso Cucinotta
  Cc: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar,
	linux-kernel, linux-dl

Hi,

On 19/07/16 11:44, Tommaso Cucinotta wrote:
> Current code in cpudeadline.c has a bug in re-heapifying when adding a
> new element at the end of the heap, because a deadline value of 0 is
> temporarily set in the new elem, then cpudl_change_key() is called
> with the actual elem deadline as param. However, the function compares
> the new deadline to set with the one previously in the elem, which is
> 0.  So, if current absolute deadlines grew so much to have negative
> values as s64, the comparison in cpudl_change_key() makes the wrong
> decision.  Instead, as from dl_time_before(), the kernel should handle
> correctly abs deadlines wrap-arounds.
> 
> This patch fixes the problem with a minimally invasive change that
> forces cpudl_change_key() to heapify up in this case.
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Juri Lelli <juri.lelli@arm.com>
> Cc: Luca Abeni <luca.abeni@unitn.it>
> Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
> Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
> ---
>  kernel/sched/cpudeadline.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
> index 5be5882..d418449 100644
> --- a/kernel/sched/cpudeadline.c
> +++ b/kernel/sched/cpudeadline.c
> @@ -168,7 +168,7 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
>  
>  	if (old_idx == IDX_INVALID) {
>  		cp->size++;
> -		cp->elements[cp->size - 1].dl = 0;
> +		cp->elements[cp->size - 1].dl = dl;
>  		cp->elements[cp->size - 1].cpu = cpu;
>  		cp->elements[cpu].idx = cp->size - 1;
>  		cpudl_change_key(cp, cp->size - 1, dl);

Looks good. I'd only change the subject to something like:

 sched/deadline: Wrap-around bugfix for cpu heap

Best,

- Juri

* Re: [RFC PATCH 1/4] Minimally invasive deadline wrap-around bugfix for SCHED_DEADLINE cpu heap
  2016-08-01 11:43   ` Juri Lelli
@ 2016-08-01 12:57     ` Juri Lelli
  0 siblings, 0 replies; 12+ messages in thread
From: Juri Lelli @ 2016-08-01 12:57 UTC (permalink / raw)
  To: Tommaso Cucinotta
  Cc: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar, linux-kernel

On 01/08/16 12:43, Juri Lelli wrote:
> Hi,
> 
> On 19/07/16 11:44, Tommaso Cucinotta wrote:
> > Current code in cpudeadline.c has a bug in re-heapifying when adding a
> > new element at the end of the heap, because a deadline value of 0 is
> > temporarily set in the new elem, then cpudl_change_key() is called
> > with the actual elem deadline as param. However, the function compares
> > the new deadline to set with the one previously in the elem, which is
> > 0.  So, if current absolute deadlines grew so much to have negative
> > values as s64, the comparison in cpudl_change_key() makes the wrong
> > decision.  Instead, as from dl_time_before(), the kernel should handle
> > correctly abs deadlines wrap-arounds.
> > 
> > This patch fixes the problem with a minimally invasive change that
> > forces cpudl_change_key() to heapify up in this case.
> > 
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Juri Lelli <juri.lelli@arm.com>
> > Cc: Luca Abeni <luca.abeni@unitn.it>
> > Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
> > Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
> > ---
> >  kernel/sched/cpudeadline.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
> > index 5be5882..d418449 100644
> > --- a/kernel/sched/cpudeadline.c
> > +++ b/kernel/sched/cpudeadline.c
> > @@ -168,7 +168,7 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
> >  
> >  	if (old_idx == IDX_INVALID) {
> >  		cp->size++;
> > -		cp->elements[cp->size - 1].dl = 0;
> > +		cp->elements[cp->size - 1].dl = dl;
> >  		cp->elements[cp->size - 1].cpu = cpu;
> >  		cp->elements[cpu].idx = cp->size - 1;
> >  		cpudl_change_key(cp, cp->size - 1, dl);
> 
> Looks good. I'd only change the subject to something like:
> 
>  sched/deadline: Wrap-around bugfix for cpu heap
> 

and please remove linux-dl mailing list from future postings. It's
subscribers only, and practically dead (AFAIK).

Thanks,

- Juri

* Re: [RFC PATCH 2/4] SCHED_DEADLINE cpu heap code clarification/refactory.
  2016-07-19  9:44 ` [RFC PATCH 2/4] SCHED_DEADLINE cpu heap code clarification/refactory Tommaso Cucinotta
@ 2016-08-01 16:30   ` Juri Lelli
  0 siblings, 0 replies; 12+ messages in thread
From: Juri Lelli @ 2016-08-01 16:30 UTC (permalink / raw)
  To: Tommaso Cucinotta
  Cc: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar, linux-kernel

Hi,

nice clean-up. Maybe change the subject to something like
"sched/deadline: refactor cpu heap code" ?

On 19/07/16 11:44, Tommaso Cucinotta wrote:

This change does two things:

> 1. heapify up factored out in new dedicated function heapify_up()
>    (avoids repeatition of same code)

s/repeatition/repetition/

> 2. call to cpudl_change_key() replaced with heapify_up() when
>    cpudl_set actually inserts a new node in the heap
> 

Maybe we want a separate patch (we usually want 1 patch - 1 change) ?

> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Juri Lelli <juri.lelli@arm.com>
> Cc: Luca Abeni <luca.abeni@unitn.it>
> Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
> Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
> ---
>  kernel/sched/cpudeadline.c | 38 +++++++++++++++++++-------------------
>  1 file changed, 19 insertions(+), 19 deletions(-)
> 
> diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
> index d418449..3c42702 100644
> --- a/kernel/sched/cpudeadline.c
> +++ b/kernel/sched/cpudeadline.c
> @@ -41,7 +41,7 @@ static void cpudl_exchange(struct cpudl *cp, int a, int b)
>  	swap(cp->elements[cpu_a].idx, cp->elements[cpu_b].idx);
>  }
>  
> -static void cpudl_heapify(struct cpudl *cp, int idx)
> +static void cpudl_heapify_down(struct cpudl *cp, int idx)
>  {
>  	int l, r, largest;
>  
> @@ -66,20 +66,25 @@ static void cpudl_heapify(struct cpudl *cp, int idx)
>  	}
>  }
>  
> +static void cpudl_heapify_up(struct cpudl *cp, int idx)
> +{
> +	while (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl,
> +			cp->elements[idx].dl)) {
> +		cpudl_exchange(cp, idx, parent(idx));
> +		idx = parent(idx);
> +	}
> +}
> +
>  static void cpudl_change_key(struct cpudl *cp, int idx, u64 new_dl)
>  {
>  	WARN_ON(idx == IDX_INVALID || !cpu_present(idx));
>  
>  	if (dl_time_before(new_dl, cp->elements[idx].dl)) {
>  		cp->elements[idx].dl = new_dl;
> -		cpudl_heapify(cp, idx);
> +		cpudl_heapify_down(cp, idx);
>  	} else {
>  		cp->elements[idx].dl = new_dl;
> -		while (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl,
> -					cp->elements[idx].dl)) {
> -			cpudl_exchange(cp, idx, parent(idx));
> -			idx = parent(idx);
> -		}
> +		cpudl_heapify_up(cp, idx);
>  	}
>  }
>  
> @@ -154,24 +159,19 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
>  		cp->size--;
>  		cp->elements[new_cpu].idx = old_idx;
>  		cp->elements[cpu].idx = IDX_INVALID;
> -		while (old_idx > 0 && dl_time_before(
> -				cp->elements[parent(old_idx)].dl,
> -				cp->elements[old_idx].dl)) {
> -			cpudl_exchange(cp, old_idx, parent(old_idx));
> -			old_idx = parent(old_idx);
> -		}
> +		cpudl_heapify_up(cp, old_idx);
>  		cpumask_set_cpu(cpu, cp->free_cpus);
> -                cpudl_heapify(cp, old_idx);
> +                cpudl_heapify_down(cp, old_idx);

I think this line was already whitespace damaged. Could you fix it (with
a proper tab) in the next version?

>  
>  		goto out;
>  	}
>  
>  	if (old_idx == IDX_INVALID) {
> -		cp->size++;
> -		cp->elements[cp->size - 1].dl = dl;
> -		cp->elements[cp->size - 1].cpu = cpu;
> -		cp->elements[cpu].idx = cp->size - 1;
> -		cpudl_change_key(cp, cp->size - 1, dl);
> +		int size1 = cp->size++;

s/size1/new_size/ ?

> +		cp->elements[size1].dl = dl;
> +		cp->elements[size1].cpu = cpu;
> +		cp->elements[cpu].idx = size1;
> +		cpudl_heapify_up(cp, size1);
>  		cpumask_clear_cpu(cpu, cp->free_cpus);
>  	} else {
>  		cpudl_change_key(cp, old_idx, dl);

We also seem to do almost the same ("cp->size - 1" multiple times and
then cp->size--) up above, in the !is_valid branch. Maybe we want to clean
that up as well?

Thanks,

- Juri

* Re: [RFC PATCH 3/4] Make deadline max-heap faster avoiding real swaps on heapify ops.
  2016-07-19  9:44 ` [RFC PATCH 3/4] Make deadline max-heap faster avoiding real swaps on heapify ops Tommaso Cucinotta
@ 2016-08-02 11:38   ` Juri Lelli
  2016-08-08 14:04   ` Peter Zijlstra
  1 sibling, 0 replies; 12+ messages in thread
From: Juri Lelli @ 2016-08-02 11:38 UTC (permalink / raw)
  To: Tommaso Cucinotta
  Cc: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar, linux-kernel

Hi,

On 19/07/16 11:44, Tommaso Cucinotta wrote:
> This change achieves up to a 10% speed-up for cpudl_set() calls, as
> measured with a randomly generated workload of 1K, 10K and 100K random heap
> insertions and deletions (75% cpudl_set() calls with is_valid=1 and
> 25% with is_valid=0), and randomly generated cpu IDs, with up to 256
> CPUs, as measured on an Intel Core2 Duo.
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Juri Lelli <juri.lelli@arm.com>
> Cc: Luca Abeni <luca.abeni@unitn.it>
> Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
> Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
> ---
>  kernel/sched/cpudeadline.c | 114 +++++++++++++++++++++++++++------------------
>  1 file changed, 69 insertions(+), 45 deletions(-)
> 
> diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
> index 3c42702..60f933a 100644
> --- a/kernel/sched/cpudeadline.c
> +++ b/kernel/sched/cpudeadline.c
> @@ -31,60 +31,82 @@ static inline int right_child(int i)
>  	return (i << 1) + 2;
>  }
>  
> -static void cpudl_exchange(struct cpudl *cp, int a, int b)
> -{
> -	int cpu_a = cp->elements[a].cpu, cpu_b = cp->elements[b].cpu;
> -
> -	swap(cp->elements[a].cpu, cp->elements[b].cpu);
> -	swap(cp->elements[a].dl , cp->elements[b].dl );
> -
> -	swap(cp->elements[cpu_a].idx, cp->elements[cpu_b].idx);
> -}
> -
>  static void cpudl_heapify_down(struct cpudl *cp, int idx)
>  {
>  	int l, r, largest;
>  
> +	int orig_cpu = cp->elements[idx].cpu;
> +	u64 orig_dl = cp->elements[idx].dl;
> +
> +	if (left_child(idx) >= cp->size)
> +		return;
> +
>  	/* adapted from lib/prio_heap.c */
>  	while(1) {
> +		u64 largest_dl;
>  		l = left_child(idx);
>  		r = right_child(idx);
>  		largest = idx;
> +		largest_dl = orig_dl;
>  
> -		if ((l < cp->size) && dl_time_before(cp->elements[idx].dl,
> -							cp->elements[l].dl))
> +		if ((l < cp->size) && dl_time_before(orig_dl, cp->elements[l].dl)) {
>  			largest = l;
> -		if ((r < cp->size) && dl_time_before(cp->elements[largest].dl,
> -							cp->elements[r].dl))
> +			largest_dl = cp->elements[l].dl;

OK, I was a bit puzzled by the usage of largest_dl in this loop, but I
managed to convince myself that it is right. It's basically a local tmp
variable that you use to pick left or right child at each iteration, and
you reset it to the deadline of the node you are moving (orig_dl) at the
start of each iteration.

> +		}
> +		if ((r < cp->size) && dl_time_before(largest_dl, cp->elements[r].dl))

Nitpick, this and the above conditions break 80 columns.

>  			largest = r;
> +
>  		if (largest == idx)
>  			break;
>  
> -		/* Push idx down the heap one level and bump one up */
> -		cpudl_exchange(cp, largest, idx);
> +		/* pull largest child onto idx */
> +		cp->elements[idx].cpu = cp->elements[largest].cpu;
> +		cp->elements[idx].dl = cp->elements[largest].dl;
> +		cp->elements[cp->elements[idx].cpu].idx = idx;
>  		idx = largest;
>  	}
> +	/* actual push down of saved original values orig_* */
> +	cp->elements[idx].cpu = orig_cpu;
> +	cp->elements[idx].dl = orig_dl;
> +	cp->elements[cp->elements[idx].cpu].idx = idx;
>  }
>  
>  static void cpudl_heapify_up(struct cpudl *cp, int idx)
>  {
> -	while (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl,
> -			cp->elements[idx].dl)) {
> -		cpudl_exchange(cp, idx, parent(idx));
> -		idx = parent(idx);
> -	}
> +	int p;
> +
> +	int orig_cpu = cp->elements[idx].cpu;
> +	u64 orig_dl = cp->elements[idx].dl;
> +
> +	if (idx == 0)
> +		return;
> +
> +	do {
> +		p = parent(idx);
> +		if (dl_time_before(cp->elements[idx].dl, cp->elements[p].dl))
> +			break;
> +		/* pull parent onto idx */
> +		cp->elements[idx].cpu = cp->elements[p].cpu;
> +		cp->elements[idx].dl = cp->elements[p].dl;
> +		cp->elements[cp->elements[idx].cpu].idx = idx;
> +		idx = p;
> +	} while (idx != 0);
> +	/* actual push up of saved original values orig_* */
> +	cp->elements[idx].cpu = orig_cpu;
> +	cp->elements[idx].dl = orig_dl;
> +	cp->elements[cp->elements[idx].cpu].idx = idx;
>  }
>  
> -static void cpudl_change_key(struct cpudl *cp, int idx, u64 new_dl)
> +static void cpudl_heapify(struct cpudl *cp, int idx)
>  {
>  	WARN_ON(idx == IDX_INVALID || !cpu_present(idx));
> +	if (idx == IDX_INVALID)
> +		return;

Can we actually get here with IDX_INVALID? We bail out in the !is_valid
branch and the other call point is an else branch of old_idx ==
IDX_INVALID. So, it seems that this is more of a BUG_ON().

>  
> -	if (dl_time_before(new_dl, cp->elements[idx].dl)) {
> -		cp->elements[idx].dl = new_dl;
> -		cpudl_heapify_down(cp, idx);
> -	} else {
> -		cp->elements[idx].dl = new_dl;
> +	if (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl, cp->elements[idx].dl)) {

You can wrap after the comma. And brackets are not required.

>  		cpudl_heapify_up(cp, idx);
> +	} else {
> +		cpudl_heapify_down(cp, idx);
>  	}
>  }
>  
> @@ -153,28 +175,30 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
>  			 */
>  			goto out;
>  		}
> -		new_cpu = cp->elements[cp->size - 1].cpu;
> -		cp->elements[old_idx].dl = cp->elements[cp->size - 1].dl;
> -		cp->elements[old_idx].cpu = new_cpu;
>  		cp->size--;
> -		cp->elements[new_cpu].idx = old_idx;
>  		cp->elements[cpu].idx = IDX_INVALID;
> -		cpudl_heapify_up(cp, old_idx);
> -		cpumask_set_cpu(cpu, cp->free_cpus);
> -                cpudl_heapify_down(cp, old_idx);
> -
> -		goto out;
> -	}
> +		if (old_idx != cp->size) {
> +			new_cpu = cp->elements[cp->size].cpu;
> +			cp->elements[old_idx].dl = cp->elements[cp->size].dl;
> +			cp->elements[old_idx].cpu = new_cpu;
> +			cp->elements[new_cpu].idx = old_idx;
> +			cpudl_heapify(cp, old_idx);
> +		}
>  
> -	if (old_idx == IDX_INVALID) {
> -		int size1 = cp->size++;
> -		cp->elements[size1].dl = dl;
> -		cp->elements[size1].cpu = cpu;
> -		cp->elements[cpu].idx = size1;
> -		cpudl_heapify_up(cp, size1);
> -		cpumask_clear_cpu(cpu, cp->free_cpus);
> +		cpumask_set_cpu(cpu, cp->free_cpus);
>  	} else {
> -		cpudl_change_key(cp, old_idx, dl);
> +		if (old_idx == IDX_INVALID) {
> +			int size1 = cp->size++;
> +			cp->elements[size1].dl = dl;
> +			cp->elements[size1].cpu = cpu;
> +			cp->elements[cpu].idx = size1;
> +			cpudl_heapify_up(cp, size1);
> +
> +			cpumask_clear_cpu(cpu, cp->free_cpus);
> +		} else {
> +			cp->elements[old_idx].dl = dl;
> +			cpudl_heapify(cp, old_idx);
> +		}
>  	}
>  
>  out:
> -- 

Couldn't spot any problem with this change (apart from the minor things
I pointed out above). Didn't test it extensively, though. But it seems
you did. :)

Best,

- Juri

* Re: [RFC PATCH 4/4] Split cpudl_set() into cpudl_set() and cpudl_clear().
  2016-07-19  9:44 ` [RFC PATCH 4/4] Split cpudl_set() into cpudl_set() and cpudl_clear() Tommaso Cucinotta
@ 2016-08-02 11:52   ` Juri Lelli
  0 siblings, 0 replies; 12+ messages in thread
From: Juri Lelli @ 2016-08-02 11:52 UTC (permalink / raw)
  To: Tommaso Cucinotta
  Cc: Luca Abeni, Juri Lelli, Peter Zijlstra, Ingo Molnar, linux-kernel

Hi,

you should add "sched/deadline:" to the subject.

On 19/07/16 11:44, Tommaso Cucinotta wrote:

I'd repeat what the subject says, so that you can refer to it from the
changelog with "These".

> These 2 exercise independent code paths and need different arguments.
> Now you call

It's more "after this change" than "now", IMHO.

>   cpudl_clear(cp, cpu)
>   cpudl_set(cp, cpu, dl)
> instead of
>   cpudl_set(cp, cpu, 0 /* dl */, 0 /* is_valid */)
>   cpudl_set(cp, cpu, dl, 1 /* is_valid */)
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Juri Lelli <juri.lelli@arm.com>
> Cc: Luca Abeni <luca.abeni@unitn.it>
> Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
> Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
> ---
>  kernel/sched/cpudeadline.c | 71 +++++++++++++++++++++++++++++-----------------
>  kernel/sched/cpudeadline.h |  3 +-
>  kernel/sched/deadline.c    | 10 +++----
>  3 files changed, 52 insertions(+), 32 deletions(-)
> 
> diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
> index 60f933a..0f276bf 100644
> --- a/kernel/sched/cpudeadline.c
> +++ b/kernel/sched/cpudeadline.c
> @@ -147,16 +147,15 @@ out:
>  }
>  
>  /*
> - * cpudl_set - update the cpudl max-heap
> + * cpudl_clear - remove a cpu from the cpudl max-heap
>   * @cp: the cpudl max-heap context
>   * @cpu: the target cpu
> - * @dl: the new earliest deadline for this cpu
>   *
>   * Notes: assumes cpu_rq(cpu)->lock is locked

We should probably add (in a separate patch) a lockdep_assert for this.

>   *
>   * Returns: (void)
>   */
> -void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
> +void cpudl_clear(struct cpudl *cp, int cpu)
>  {
>  	int old_idx, new_cpu;
>  	unsigned long flags;
> @@ -164,17 +163,16 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
>  	WARN_ON(!cpu_present(cpu));
>  
>  	raw_spin_lock_irqsave(&cp->lock, flags);
> +
>  	old_idx = cp->elements[cpu].idx;
> -	if (!is_valid) {
> +	if (old_idx == IDX_INVALID) {
> +		/*
> +		 * Nothing to remove if old_idx was invalid.
> +		 * This could happen if a rq_offline_dl is
> +		 * called for a CPU without -dl tasks running.
> +		 */
> +	} else {
>  		/* remove item */
> -		if (old_idx == IDX_INVALID) {
> -			/*
> -			 * Nothing to remove if old_idx was invalid.
> -			 * This could happen if a rq_offline_dl is
> -			 * called for a CPU without -dl tasks running.
> -			 */
> -			goto out;
> -		}
>  		cp->size--;
>  		cp->elements[cpu].idx = IDX_INVALID;
>  		if (old_idx != cp->size) {
> @@ -184,24 +182,45 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
>  			cp->elements[new_cpu].idx = old_idx;
>  			cpudl_heapify(cp, old_idx);
>  		}
> -
>  		cpumask_set_cpu(cpu, cp->free_cpus);
> +	}
> +
> +	raw_spin_unlock_irqrestore(&cp->lock, flags);
> +}
> +
> +/*
> + * cpudl_set - update the cpudl max-heap
> + * @cp: the cpudl max-heap context
> + * @cpu: the target cpu
> + * @dl: the new earliest deadline for this cpu
> + *
> + * Notes: assumes cpu_rq(cpu)->lock is locked
> + *
> + * Returns: (void)
> + */
> +void cpudl_set(struct cpudl *cp, int cpu, u64 dl)
> +{
> +	int old_idx;
> +	unsigned long flags;
> +
> +	WARN_ON(!cpu_present(cpu));
> +
> +	raw_spin_lock_irqsave(&cp->lock, flags);
> +
> +	old_idx = cp->elements[cpu].idx;
> +	if (old_idx == IDX_INVALID) {
> +		int sz1 = cp->size++;

You also change the temp variable name. I think you might want to fix
that in one of the previous patches once and for all.

> +		cp->elements[sz1].dl = dl;
> +		cp->elements[sz1].cpu = cpu;
> +		cp->elements[cpu].idx = sz1;
> +		cpudl_heapify_up(cp, sz1);
> +
> +		cpumask_clear_cpu(cpu, cp->free_cpus);
>  	} else {
> -		if (old_idx == IDX_INVALID) {
> -			int size1 = cp->size++;
> -			cp->elements[size1].dl = dl;
> -			cp->elements[size1].cpu = cpu;
> -			cp->elements[cpu].idx = size1;
> -			cpudl_heapify_up(cp, size1);
> -
> -			cpumask_clear_cpu(cpu, cp->free_cpus);
> -		} else {
> -			cp->elements[old_idx].dl = dl;
> -			cpudl_heapify(cp, old_idx);
> -		}
> +		cp->elements[old_idx].dl = dl;
> +		cpudl_heapify(cp, old_idx);
>  	}
>  
> -out:
>  	raw_spin_unlock_irqrestore(&cp->lock, flags);
>  }
>  
> diff --git a/kernel/sched/cpudeadline.h b/kernel/sched/cpudeadline.h
> index fcbdf83..f7da8c5 100644
> --- a/kernel/sched/cpudeadline.h
> +++ b/kernel/sched/cpudeadline.h
> @@ -23,7 +23,8 @@ struct cpudl {
>  #ifdef CONFIG_SMP
>  int cpudl_find(struct cpudl *cp, struct task_struct *p,
>  	       struct cpumask *later_mask);
> -void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid);
> +void cpudl_set(struct cpudl *cp, int cpu, u64 dl);
> +void cpudl_clear(struct cpudl *cp, int cpu);
>  int cpudl_init(struct cpudl *cp);
>  void cpudl_set_freecpu(struct cpudl *cp, int cpu);
>  void cpudl_clear_freecpu(struct cpudl *cp, int cpu);
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index fcb7f02..f2e8f47 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -795,7 +795,7 @@ static void inc_dl_deadline(struct dl_rq *dl_rq, u64 deadline)
>  	if (dl_rq->earliest_dl.curr == 0 ||
>  	    dl_time_before(deadline, dl_rq->earliest_dl.curr)) {
>  		dl_rq->earliest_dl.curr = deadline;
> -		cpudl_set(&rq->rd->cpudl, rq->cpu, deadline, 1);
> +		cpudl_set(&rq->rd->cpudl, rq->cpu, deadline);
>  	}
>  }
>  
> @@ -810,14 +810,14 @@ static void dec_dl_deadline(struct dl_rq *dl_rq, u64 deadline)
>  	if (!dl_rq->dl_nr_running) {
>  		dl_rq->earliest_dl.curr = 0;
>  		dl_rq->earliest_dl.next = 0;
> -		cpudl_set(&rq->rd->cpudl, rq->cpu, 0, 0);
> +		cpudl_clear(&rq->rd->cpudl, rq->cpu);
>  	} else {
>  		struct rb_node *leftmost = dl_rq->rb_leftmost;
>  		struct sched_dl_entity *entry;
>  
>  		entry = rb_entry(leftmost, struct sched_dl_entity, rb_node);
>  		dl_rq->earliest_dl.curr = entry->deadline;
> -		cpudl_set(&rq->rd->cpudl, rq->cpu, entry->deadline, 1);
> +		cpudl_set(&rq->rd->cpudl, rq->cpu, entry->deadline);
>  	}
>  }
>  
> @@ -1668,7 +1668,7 @@ static void rq_online_dl(struct rq *rq)
>  
>  	cpudl_set_freecpu(&rq->rd->cpudl, rq->cpu);
>  	if (rq->dl.dl_nr_running > 0)
> -		cpudl_set(&rq->rd->cpudl, rq->cpu, rq->dl.earliest_dl.curr, 1);
> +		cpudl_set(&rq->rd->cpudl, rq->cpu, rq->dl.earliest_dl.curr);
>  }
>  
>  /* Assumes rq->lock is held */
> @@ -1677,7 +1677,7 @@ static void rq_offline_dl(struct rq *rq)
>  	if (rq->dl.overloaded)
>  		dl_clear_overload(rq);
>  
> -	cpudl_set(&rq->rd->cpudl, rq->cpu, 0, 0);
> +	cpudl_clear(&rq->rd->cpudl, rq->cpu);
>  	cpudl_clear_freecpu(&rq->rd->cpudl, rq->cpu);
>  }
>  
> -- 

Apart from the minor nitpicks above, the change looks good to me and it
shouldn't introduce any functional changes (maybe worth stating it in
the changelog).

Best,

- Juri

* Re: [RFC PATCH 3/4] Make deadline max-heap faster avoiding real swaps on heapify ops.
  2016-07-19  9:44 ` [RFC PATCH 3/4] Make deadline max-heap faster avoiding real swaps on heapify ops Tommaso Cucinotta
  2016-08-02 11:38   ` Juri Lelli
@ 2016-08-08 14:04   ` Peter Zijlstra
  1 sibling, 0 replies; 12+ messages in thread
From: Peter Zijlstra @ 2016-08-08 14:04 UTC (permalink / raw)
  To: Tommaso Cucinotta
  Cc: Luca Abeni, Juri Lelli, Ingo Molnar, linux-kernel, linux-dl, Juri Lelli

On Tue, Jul 19, 2016 at 11:44:52AM +0200, Tommaso Cucinotta wrote:
> This change achieves up to a 10% speed-up for cpudl_set() calls, as
> measured with a randomly generated workload of 1K, 10K and 100K random heap
> insertions and deletions (75% cpudl_set() calls with is_valid=1 and
> 25% with is_valid=0), and randomly generated cpu IDs, with up to 256
> CPUs, as measured on an Intel Core2 Duo.

Changelog fails to explain _what_ the change is.

Some details on the change in implementation and why this is faster
would be good.

* [tip:sched/core] sched/deadline: Fix wrap-around in DL heap
  2016-07-19  9:44 ` [RFC PATCH 1/4] Minimally invasive deadline wrap-around bugfix for SCHED_DEADLINE cpu heap Tommaso Cucinotta
  2016-08-01 11:43   ` Juri Lelli
@ 2016-08-10 18:00   ` tip-bot for Tommaso Cucinotta
  1 sibling, 0 replies; 12+ messages in thread
From: tip-bot for Tommaso Cucinotta @ 2016-08-10 18:00 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, tommaso.cucinotta, peterz, torvalds,
	luca.abeni, juri.lelli, mingo, juri.lelli, tglx

Commit-ID:  a23eadfae2fd45536a355b785d5a1533e1955c22
Gitweb:     http://git.kernel.org/tip/a23eadfae2fd45536a355b785d5a1533e1955c22
Author:     Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
AuthorDate: Tue, 19 Jul 2016 11:44:50 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 10 Aug 2016 13:32:55 +0200

sched/deadline: Fix wrap-around in DL heap

Current code in cpudeadline.c has a bug in re-heapifying when adding a
new element at the end of the heap, because a deadline value of 0 is
temporarily set in the new elem, then cpudl_change_key() is called
with the actual elem deadline as param.

However, the function compares the new deadline to set with the one
previously in the elem, which is 0.  So, if current absolute deadlines
grew so much to have negative values as s64, the comparison in
cpudl_change_key() makes the wrong decision.  Instead, as from
dl_time_before(), the kernel should handle correctly abs deadlines
wrap-arounds.

This patch fixes the problem with a minimally invasive change that
forces cpudl_change_key() to heapify up in this case.

Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Juri Lelli <juri.lelli@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1468921493-10054-2-git-send-email-tommaso.cucinotta@sssup.it
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/cpudeadline.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index 5be5882..d418449 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -168,7 +168,7 @@ void cpudl_set(struct cpudl *cp, int cpu, u64 dl, int is_valid)
 
 	if (old_idx == IDX_INVALID) {
 		cp->size++;
-		cp->elements[cp->size - 1].dl = 0;
+		cp->elements[cp->size - 1].dl = dl;
 		cp->elements[cp->size - 1].cpu = cpu;
 		cp->elements[cpu].idx = cp->size - 1;
 		cpudl_change_key(cp, cp->size - 1, dl);

end of thread

Thread overview: 12+ messages
2016-07-19  9:44 SCHED_DEADLINE cpudeadline.{h,c} fixup Tommaso Cucinotta
2016-07-19  9:44 ` [RFC PATCH 1/4] Minimally invasive deadline wrap-around bugfix for SCHED_DEADLINE cpu heap Tommaso Cucinotta
2016-08-01 11:43   ` Juri Lelli
2016-08-01 12:57     ` Juri Lelli
2016-08-10 18:00   ` [tip:sched/core] sched/deadline: Fix wrap-around in DL heap tip-bot for Tommaso Cucinotta
2016-07-19  9:44 ` [RFC PATCH 2/4] SCHED_DEADLINE cpu heap code clarification/refactory Tommaso Cucinotta
2016-08-01 16:30   ` Juri Lelli
2016-07-19  9:44 ` [RFC PATCH 3/4] Make deadline max-heap faster avoiding real swaps on heapify ops Tommaso Cucinotta
2016-08-02 11:38   ` Juri Lelli
2016-08-08 14:04   ` Peter Zijlstra
2016-07-19  9:44 ` [RFC PATCH 4/4] Split cpudl_set() into cpudl_set() and cpudl_clear() Tommaso Cucinotta
2016-08-02 11:52   ` Juri Lelli
