Subject: Re: [RFC patch 1/2] sched: dynamically adapt granularity with nr_running
From: Peter Zijlstra
To: Mathieu Desnoyers
Cc: LKML, Linus Torvalds, Andrew Morton, Ingo Molnar, Steven Rostedt,
    Thomas Gleixner, Tony Lindgren, Mike Galbraith
Date: Mon, 13 Sep 2010 15:15:58 +0200
Message-ID: <1284383758.2275.283.camel@laptop>
In-Reply-To: <1284382387.2275.265.camel@laptop>
References: <20100911173732.551632040@efficios.com>
 <20100911174003.051303123@efficios.com>
 <1284231470.2251.52.camel@laptop>
 <20100911195708.GA9273@Krystal>
 <1284288072.2251.91.camel@laptop>
 <20100912203712.GD32327@Krystal>
 <1284382387.2275.265.camel@laptop>

On Mon, 2010-09-13 at 14:53 +0200, Peter Zijlstra wrote:
> On Sun, 2010-09-12 at 16:37 -0400, Mathieu Desnoyers wrote:
> > The whole point of my patch is not to have to do this latency vs performance
> > tradeoff for low number of running threads. With your approach, lowering the
> > granularity even when there are few threads running will very likely hurt
> > performance, no ?
>
> But you presented it as a latency patch, not a throughput patch. And I'm
> not sure it will matter enough to offset the computational cost it
> introduces.

One option is to simply get rid of that stuff in check_preempt_tick()
and instead do a wakeup-preempt check on the leftmost task.

The code as it stands today does that delta_exec < min_gran check to
ensure current gets some runtime before doing that second preemption
check, which compares vruntime with a wall-time measure.

Making that gran more complex doesn't really buy us much because for a
system with different weights the gran and slice lengths don't match up
anyway.
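For reference, the tick-side logic being talked about looks like this today;
this is a condensed sketch of the pre-patch check_preempt_tick() (the same
lines the patch below removes), with comments added here for illustration:

static void
check_preempt_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr)
{
	unsigned long ideal_runtime, delta_exec;

	/* wall-time check: resched once current has used up its slice */
	ideal_runtime = sched_slice(cfs_rq, curr);
	delta_exec = curr->sum_exec_runtime - curr->prev_sum_exec_runtime;
	if (delta_exec > ideal_runtime) {
		resched_task(rq_of(cfs_rq)->curr);
		clear_buddies(cfs_rq, curr);
		return;
	}

	if (!sched_feat(WAKEUP_PREEMPT))
		return;

	/* the min_gran guard: make sure current got some runtime first */
	if (delta_exec < sysctl_sched_min_granularity)
		return;

	/* second check: a vruntime delta compared against a wall-time slice */
	if (cfs_rq->nr_running > 1) {
		struct sched_entity *se = __pick_next_entity(cfs_rq);
		s64 delta = curr->vruntime - se->vruntime;

		if (delta > ideal_runtime)
			resched_task(rq_of(cfs_rq)->curr);
	}
}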
---
Subject: sched: Simplify tick preemption
From: Peter Zijlstra
Date: Mon Jul 05 13:56:30 CEST 2010

Check the current slice, if not expired, see if the leftmost task
would otherwise have preempted current.

Signed-off-by: Peter Zijlstra
---
 kernel/sched_fair.c |   43 +++++++++++++++----------------------------
 1 file changed, 15 insertions(+), 28 deletions(-)

Index: linux-2.6/kernel/sched_fair.c
===================================================================
--- linux-2.6.orig/kernel/sched_fair.c
+++ linux-2.6/kernel/sched_fair.c
@@ -838,44 +838,34 @@ dequeue_entity(struct cfs_rq *cfs_rq, st
 		se->vruntime -= cfs_rq->min_vruntime;
 }
 
+static int
+wakeup_preempt_entity(struct sched_entity *curr, struct sched_entity *se);
+
 /*
  * Preempt the current task with a newly woken task if needed:
  */
 static void
 check_preempt_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr)
 {
-	unsigned long ideal_runtime, delta_exec;
+	unsigned long slice = sched_slice(cfs_rq, curr);
+
+	if (curr->sum_exec_runtime - curr->prev_sum_exec_runtime < slice) {
+		struct sched_entity *pse = __pick_next_entity(cfs_rq);
+
+		if (pse && wakeup_preempt_entity(curr, pse) == 1)
+			goto preempt;
 
-	ideal_runtime = sched_slice(cfs_rq, curr);
-	delta_exec = curr->sum_exec_runtime - curr->prev_sum_exec_runtime;
-	if (delta_exec > ideal_runtime) {
-		resched_task(rq_of(cfs_rq)->curr);
-		/*
-		 * The current task ran long enough, ensure it doesn't get
-		 * re-elected due to buddy favours.
-		 */
-		clear_buddies(cfs_rq, curr);
 		return;
 	}
 
 	/*
-	 * Ensure that a task that missed wakeup preemption by a
-	 * narrow margin doesn't have to wait for a full slice.
-	 * This also mitigates buddy induced latencies under load.
+	 * The current task ran long enough, ensure it doesn't get
+	 * re-elected due to buddy favours.
 	 */
-	if (!sched_feat(WAKEUP_PREEMPT))
-		return;
-
-	if (delta_exec < sysctl_sched_min_granularity)
-		return;
+	clear_buddies(cfs_rq, curr);
 
-	if (cfs_rq->nr_running > 1) {
-		struct sched_entity *se = __pick_next_entity(cfs_rq);
-		s64 delta = curr->vruntime - se->vruntime;
-
-		if (delta > ideal_runtime)
-			resched_task(rq_of(cfs_rq)->curr);
-	}
+preempt:
+	resched_task(rq_of(cfs_rq)->curr);
 }
 
 static void
@@ -908,9 +898,6 @@ set_next_entity(struct cfs_rq *cfs_rq, s
 	se->prev_sum_exec_runtime = se->sum_exec_runtime;
 }
 
-static int
-wakeup_preempt_entity(struct sched_entity *curr, struct sched_entity *se);
-
 static struct sched_entity *pick_next_entity(struct cfs_rq *cfs_rq)
 {
 	struct sched_entity *se = __pick_next_entity(cfs_rq);
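
Assembled from the '+' lines above, the resulting check_preempt_tick() would
read roughly as follows (a readability sketch only, not a substitute for the
patch): the sysctl_sched_min_granularity guard and the vruntime-vs-walltime
comparison are gone, and the tick path reuses the same wakeup_preempt_entity()
test the wakeup path already uses.

static void
check_preempt_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr)
{
	unsigned long slice = sched_slice(cfs_rq, curr);

	/* slice not yet expired: ask whether the leftmost task would
	 * have preempted current on wakeup */
	if (curr->sum_exec_runtime - curr->prev_sum_exec_runtime < slice) {
		struct sched_entity *pse = __pick_next_entity(cfs_rq);

		if (pse && wakeup_preempt_entity(curr, pse) == 1)
			goto preempt;

		return;
	}

	/*
	 * The current task ran long enough, ensure it doesn't get
	 * re-elected due to buddy favours.
	 */
	clear_buddies(cfs_rq, curr);

preempt:
	resched_task(rq_of(cfs_rq)->curr);
}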