All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Ingo Molnar <mingo@elte.hu>, Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Tony Lindgren <tony@atomide.com>, Mike Galbraith <efault@gmx.de>
Subject: Re: [RFC patch 1/2] sched: dynamically adapt granularity with nr_running
Date: Mon, 13 Sep 2010 09:56:21 -0400	[thread overview]
Message-ID: <20100913135621.GA13442@Krystal> (raw)
In-Reply-To: <1284383758.2275.283.camel@laptop>

* Peter Zijlstra (peterz@infradead.org) wrote:
> On Mon, 2010-09-13 at 14:53 +0200, Peter Zijlstra wrote:
> > On Sun, 2010-09-12 at 16:37 -0400, Mathieu Desnoyers wrote:
> > > The whole point of my patch is not to have to do this latency vs performance
> > > tradeoff for low number of running threads. With your approach, lowering the
> > > granularity even when there are few threads running will very likely hurt
> > > performance, no ? 
> > 
> > But you presented it as a latency patch, not a throughput patch. And I'm
> > not sure it will matter enough to offset the computational cost it
> > introduces.

Can we agree that the patch I proposed is a patch that try to improve _latency_
under high load while preserving good throughput under lower load ? I find this
"performance" XOR "latency" dichotomy itching: it's like looking at an american
TV show where all bad guys are dressed in black, and the hero is in white.

> 
> 
> ---
> On Mon, 2010-09-13 at 14:53 +0200, Peter Zijlstra wrote:
> On Sun, 2010-09-12 at 16:37 -0400, Mathieu Desnoyers wrote:
> > > The whole point of my patch is not to have to do this latency vs performance
> > > tradeoff for low number of running threads. With your approach, lowering the
> > > granularity even when there are few threads running will very likely hurt
> > > performance, no ? 
> > 
> > But you presented it as a latency patch, not a throughput patch. And I'm
> > not sure it will matter enough to offset the computational cost it
> > introduces.
> > 
> 
> One option is to simply get rid of that stuff in check_preempt_tick()
> and instead do a wakeup-preempt check on the leftmost task instead.
> 
> The code as it stands today does that delta_exec < min_gran check to
> ensure current gets some runtime before doing that second preemption
> check, which compares vruntime with a wall-time measure.
> 
> Making that gran more complex doesn't really buy us much because for a
> system with different weights in the gran and slice lengths don't match
> up anyway.

So I bet this last sentence is about the example of a system with many nice 19
processes I told you about on IRC. Yes, this one is a bummer, as we would not
like to count them as running threads at all.

More comments below,

> 
> ---
> Subject: sched: Simplify tick preemption
> From: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date: Mon Jul 05 13:56:30 CEST 2010
> 
> Check the current slice, if not expired, see if the leftmost task
> would otherwise have preempted current.
> 
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> ---
>  kernel/sched_fair.c |   43 +++++++++++++++----------------------------
>  1 file changed, 15 insertions(+), 28 deletions(-)
> 
> Index: linux-2.6/kernel/sched_fair.c
> ===================================================================
> --- linux-2.6.orig/kernel/sched_fair.c
> +++ linux-2.6/kernel/sched_fair.c
> @@ -838,44 +838,34 @@ dequeue_entity(struct cfs_rq *cfs_rq, st
>  		se->vruntime -= cfs_rq->min_vruntime;
>  }
>  
> +static int
> +wakeup_preempt_entity(struct sched_entity *curr, struct sched_entity *se);
> +
>  /*
>   * Preempt the current task with a newly woken task if needed:
>   */
>  static void
>  check_preempt_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr)
>  {
> -	unsigned long ideal_runtime, delta_exec;
> +	unsigned long slice = sched_slice(cfs_rq, curr);

So you still compute the sched_slice(), based on sched_period(), based on
sysctl_sched_min_granularity *= nr_running when there are more than nr_latency
running threads.

> +
> +	if (curr->sum_exec_runtime - curr->prev_sum_exec_runtime < slice) {
> +		struct sched_entity *pse = __pick_next_entity(cfs_rq);
> +
> +		if (pse && wakeup_preempt_entity(curr, pse) == 1)
> +			goto preempt;
>  
> -	ideal_runtime = sched_slice(cfs_rq, curr);
> -	delta_exec = curr->sum_exec_runtime - curr->prev_sum_exec_runtime;
> -	if (delta_exec > ideal_runtime) {
> -		resched_task(rq_of(cfs_rq)->curr);
> -		/*
> -		 * The current task ran long enough, ensure it doesn't get
> -		 * re-elected due to buddy favours.
> -		 */
> -		clear_buddies(cfs_rq, curr);
>  		return;
>  	}
>  
>  	/*
> -	 * Ensure that a task that missed wakeup preemption by a
> -	 * narrow margin doesn't have to wait for a full slice.
> -	 * This also mitigates buddy induced latencies under load.
> +	 * The current task ran long enough, ensure it doesn't get
> +	 * re-elected due to buddy favours.
>  	 */
> -	if (!sched_feat(WAKEUP_PREEMPT))
> -		return;
> -
> -	if (delta_exec < sysctl_sched_min_granularity)
> -		return;

Well, the reason why this test is here seems to be that we don't want to trigger
"resched_task" more often than needed, and here it's defined by the granularity.
I don't quite see with what you are replacing this, other than "let's set the
resched flag all the time to save a 32-bit division". I figure out it's more
expensive the call the scheduler than to do a 32-bit div.

Thanks,

Mathieu

> +	clear_buddies(cfs_rq, curr);
>  
> -	if (cfs_rq->nr_running > 1) {
> -		struct sched_entity *se = __pick_next_entity(cfs_rq);
> -		s64 delta = curr->vruntime - se->vruntime;
> -
> -		if (delta > ideal_runtime)
> -			resched_task(rq_of(cfs_rq)->curr);
> -	}
> +preempt:
> +	resched_task(rq_of(cfs_rq)->curr);
>  }
>  
>  static void
> @@ -908,9 +898,6 @@ set_next_entity(struct cfs_rq *cfs_rq, s
>  	se->prev_sum_exec_runtime = se->sum_exec_runtime;
>  }
>  
> -static int
> -wakeup_preempt_entity(struct sched_entity *curr, struct sched_entity *se);
> -
>  static struct sched_entity *pick_next_entity(struct cfs_rq *cfs_rq)
>  {
>  	struct sched_entity *se = __pick_next_entity(cfs_rq);
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

  reply	other threads:[~2010-09-13 13:56 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-09-11 17:37 [RFC patch 0/2] sched: dynamically adapt granularity with nr_running Mathieu Desnoyers
2010-09-11 17:37 ` [RFC patch 1/2] " Mathieu Desnoyers
2010-09-11 18:57   ` Peter Zijlstra
2010-09-11 19:21     ` Linus Torvalds
2010-09-11 20:36       ` Peter Zijlstra
2010-09-11 20:45         ` Peter Zijlstra
2010-09-11 20:52           ` Linus Torvalds
2010-09-12  9:07             ` Peter Zijlstra
2010-09-11 20:48         ` Linus Torvalds
2010-09-12  9:06           ` Peter Zijlstra
2010-09-12  9:14             ` Peter Zijlstra
2010-09-12 20:39               ` Mathieu Desnoyers
2010-09-13 12:54                 ` Peter Zijlstra
2010-09-12 20:34             ` Mathieu Desnoyers
2010-09-13 12:53               ` Peter Zijlstra
2010-09-13  4:35             ` Mike Galbraith
2010-09-13  8:41               ` Peter Zijlstra
2010-09-13 11:22                 ` Ingo Molnar
2010-09-13 13:52                 ` Steven Rostedt
2010-09-13 13:54                   ` Peter Zijlstra
2010-09-13 14:02                     ` Peter Zijlstra
2010-09-13 14:21                       ` Ingo Molnar
2010-09-11 20:52         ` Mathieu Desnoyers
2010-09-11 19:57     ` Mathieu Desnoyers
2010-09-12 10:41       ` Peter Zijlstra
2010-09-12 20:37         ` Mathieu Desnoyers
2010-09-13 12:53           ` Peter Zijlstra
2010-09-13 13:15             ` Peter Zijlstra
2010-09-13 13:56               ` Mathieu Desnoyers [this message]
2010-09-13 14:16                 ` Peter Zijlstra
2010-09-13 14:43                   ` Steven Rostedt
2010-09-13 15:25                     ` Mathieu Desnoyers
2010-09-13 15:39                       ` Steven Rostedt
2010-09-13 16:16                   ` [RFC PATCH] check_preempt_tick should not compare vruntime with wall time Mathieu Desnoyers
2010-09-13 16:36                     ` Linus Torvalds
2010-09-13 17:45                       ` Mathieu Desnoyers
2010-09-13 17:51                         ` Linus Torvalds
2010-09-13 18:01                           ` Mathieu Desnoyers
2010-09-13 18:10                           ` Steven Rostedt
2010-09-13 18:03                         ` Ingo Molnar
2010-09-13 18:19                           ` Mathieu Desnoyers
2010-09-13 18:23                             ` [PATCH] sched: Improve latencies under load by decreasing minimum scheduling granularity Ingo Molnar
2010-09-13 18:28                               ` Joe Perches
2010-09-13 19:44                               ` Linus Torvalds
2010-09-13 20:00                                 ` Ingo Molnar
2010-09-13 18:19                         ` [RFC PATCH] check_preempt_tick should not compare vruntime with wall time Ingo Molnar
2010-09-13 17:36                     ` Ingo Molnar
2010-09-13 17:56                       ` Mathieu Desnoyers
2010-09-14  2:10                     ` Mike Galbraith
2010-09-13 14:44                 ` [RFC patch 1/2] sched: dynamically adapt granularity with nr_running Mike Galbraith
     [not found]               ` <1284386179.10436.6.camel@marge.simson.net>
2010-09-13 15:53                 ` Peter Zijlstra
2010-09-13 18:04                   ` [RFC][PATCH] sched: Improve tick preemption Peter Zijlstra
2010-09-14  2:27                   ` [RFC patch 1/2] sched: dynamically adapt granularity with nr_running Mike Galbraith
2010-09-12  6:14   ` Ingo Molnar
2010-09-12  7:21     ` Mike Galbraith
2010-09-12 18:16       ` Mathieu Desnoyers
2010-09-13  4:13         ` Mike Galbraith
2010-09-13  6:41           ` Ingo Molnar
2010-09-13  7:08             ` Mike Galbraith
2010-09-13  7:35               ` Mike Galbraith
2010-09-13  8:35               ` Peter Zijlstra
2010-09-13  9:16                 ` Mike Galbraith
2010-09-13  9:37                   ` Peter Zijlstra
2010-09-13  9:50                     ` Mike Galbraith
2010-09-13  9:55                       ` Peter Zijlstra
2010-09-13 10:06                         ` Mike Galbraith
2010-09-13 10:45                         ` Peter Zijlstra
2010-09-13 11:43                           ` Peter Zijlstra
2010-09-13 11:49                             ` Peter Zijlstra
2010-09-13 12:32                             ` Mike Galbraith
2010-09-13 20:19             ` Mathieu Desnoyers
2010-09-13 20:56               ` Mathieu Desnoyers
2010-09-12 18:13     ` Mathieu Desnoyers
2010-09-12 23:44       ` Mathieu Desnoyers
2010-09-11 17:37 ` [RFC patch 2/2] sched: sleepers coarse granularity on wakeup Mathieu Desnoyers
2010-09-12 12:44 ` [RFC patch 0/2] sched: dynamically adapt granularity with nr_running Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100913135621.GA13442@Krystal \
    --to=mathieu.desnoyers@efficios.com \
    --cc=akpm@linux-foundation.org \
    --cc=efault@gmx.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=tony@atomide.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.