From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1755567AbcIELzS (ORCPT <rfc822;w@1wt.eu>);
        Mon, 5 Sep 2016 07:55:18 -0400
Received: from terminus.zytor.com ([198.137.202.10]:36356 "EHLO
        terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1755379AbcIELzC (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Mon, 5 Sep 2016 07:55:02 -0400
Date: Mon, 5 Sep 2016 04:54:26 -0700
From: tip-bot for Tommaso Cucinotta <tipbot@zytor.com>
Message-ID: <tip-8e1bc301aaf9f9a2d731bf8d50d549ac2dcfdab2@git.kernel.org>
Cc: tommaso.cucinotta@sssup.it, hpa@zytor.com, peterz@infradead.org,
        torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
        juri.lelli@gmail.com, mingo@kernel.org, luca.abeni@unitn.it,
        juri.lelli@arm.com, tglx@linutronix.de
Reply-To: tglx@linutronix.de, juri.lelli@arm.com, luca.abeni@unitn.it,
        mingo@kernel.org, linux-kernel@vger.kernel.org, juri.lelli@gmail.com,
        torvalds@linux-foundation.org, peterz@infradead.org, hpa@zytor.com,
        tommaso.cucinotta@sssup.it
In-Reply-To: <1471184828-12644-3-git-send-email-tommaso.cucinotta@sssup.it>
References: <1471184828-12644-3-git-send-email-tommaso.cucinotta@sssup.it>
To: linux-tip-commits@vger.kernel.org
Subject: [tip:sched/core] sched/deadline: Make CPU heap faster avoiding real
 swaps on heapify
Git-Commit-ID: 8e1bc301aaf9f9a2d731bf8d50d549ac2dcfdab2
X-Mailer: tip-git-log-daemon
Robot-ID: <tip-bot.git.kernel.org>
Robot-Unsubscribe: Contact <mailto:hpa@kernel.org> to get blacklisted from
 these emails
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8
Content-Disposition: inline
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Commit-ID:  8e1bc301aaf9f9a2d731bf8d50d549ac2dcfdab2
Gitweb:     http://git.kernel.org/tip/8e1bc301aaf9f9a2d731bf8d50d549ac2dcfdab2
Author:     Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
AuthorDate: Sun, 14 Aug 2016 16:27:07 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 5 Sep 2016 13:29:43 +0200

sched/deadline: Make CPU heap faster avoiding real swaps on heapify

This change goes from heapify() ops done by swapping with parent/child
so that the item to fix moves along, to heapify() ops done by just
pulling the parent/child chain by 1 pos, then storing the item to fix
just at the end. On a non-trivial heapify(), this performs roughly half
stores wrt swaps.

This has been measured to achieve up to 10% of speed-up for cpudl_set()
calls, with a randomly generated workload of 1K,10K,100K random heap
insertions and deletions (75% cpudl_set() calls with is_valid=1 and
25% with is_valid=0), and randomly generated cpu IDs, with up to 256
CPUs, as measured on an Intel Core2 Duo.

Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Luca Abeni <luca.abeni@unitn.it>
Reviewed-by: Juri Lelli <juri.lelli@arm.com>
Cc: Juri Lelli <juri.lelli@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-dl@retis.sssup.it
Link: http://lkml.kernel.org/r/1471184828-12644-3-git-send-email-tommaso.cucinotta@sssup.it
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/cpudeadline.c | 66 +++++++++++++++++++++++++++++++---------------
 1 file changed, 45 insertions(+), 21 deletions(-)

diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index 0acb0d4..0ace75a 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -31,48 +31,72 @@ static inline int right_child(int i)
 	return (i << 1) + 2;
 }
 
-static void cpudl_exchange(struct cpudl *cp, int a, int b)
-{
-	int cpu_a = cp->elements[a].cpu, cpu_b = cp->elements[b].cpu;
-
-	swap(cp->elements[a].cpu, cp->elements[b].cpu);
-	swap(cp->elements[a].dl , cp->elements[b].dl );
-
-	swap(cp->elements[cpu_a].idx, cp->elements[cpu_b].idx);
-}
-
 static void cpudl_heapify_down(struct cpudl *cp, int idx)
 {
 	int l, r, largest;
 
+	int orig_cpu = cp->elements[idx].cpu;
+	u64 orig_dl = cp->elements[idx].dl;
+
+	if (left_child(idx) >= cp->size)
+		return;
+
 	/* adapted from lib/prio_heap.c */
 	while(1) {
+		u64 largest_dl;
 		l = left_child(idx);
 		r = right_child(idx);
 		largest = idx;
+		largest_dl = orig_dl;
 
-		if ((l < cp->size) && dl_time_before(cp->elements[idx].dl,
-							cp->elements[l].dl))
+		if ((l < cp->size) && dl_time_before(orig_dl,
+						cp->elements[l].dl)) {
 			largest = l;
-		if ((r < cp->size) && dl_time_before(cp->elements[largest].dl,
-							cp->elements[r].dl))
+			largest_dl = cp->elements[l].dl;
+		}
+		if ((r < cp->size) && dl_time_before(largest_dl,
+						cp->elements[r].dl))
 			largest = r;
+
 		if (largest == idx)
 			break;
 
-		/* Push idx down the heap one level and bump one up */
-		cpudl_exchange(cp, largest, idx);
+		/* pull largest child onto idx */
+		cp->elements[idx].cpu = cp->elements[largest].cpu;
+		cp->elements[idx].dl = cp->elements[largest].dl;
+		cp->elements[cp->elements[idx].cpu].idx = idx;
 		idx = largest;
 	}
+	/* actual push down of saved original values orig_* */
+	cp->elements[idx].cpu = orig_cpu;
+	cp->elements[idx].dl = orig_dl;
+	cp->elements[cp->elements[idx].cpu].idx = idx;
 }
 
 static void cpudl_heapify_up(struct cpudl *cp, int idx)
 {
-	while (idx > 0 && dl_time_before(cp->elements[parent(idx)].dl,
-			cp->elements[idx].dl)) {
-		cpudl_exchange(cp, idx, parent(idx));
-		idx = parent(idx);
-	}
+	int p;
+
+	int orig_cpu = cp->elements[idx].cpu;
+	u64 orig_dl = cp->elements[idx].dl;
+
+	if (idx == 0)
+		return;
+
+	do {
+		p = parent(idx);
+		if (dl_time_before(orig_dl, cp->elements[p].dl))
+			break;
+		/* pull parent onto idx */
+		cp->elements[idx].cpu = cp->elements[p].cpu;
+		cp->elements[idx].dl = cp->elements[p].dl;
+		cp->elements[cp->elements[idx].cpu].idx = idx;
+		idx = p;
+	} while (idx != 0);
+	/* actual push up of saved original values orig_* */
+	cp->elements[idx].cpu = orig_cpu;
+	cp->elements[idx].dl = orig_dl;
+	cp->elements[cp->elements[idx].cpu].idx = idx;
 }
 
 static void cpudl_heapify(struct cpudl *cp, int idx)