All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Low <jason.low2@hp.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>,
	mingo@kernel.org, linux-kernel@vger.kernel.org,
	daniel.lezcano@linaro.org, alex.shi@linaro.org, efault@gmx.de,
	vincent.guittot@linaro.org, morten.rasmussen@arm.com,
	aswin@hp.com, chegu_vinod@hp.com
Subject: Re: [PATCH 1/3] sched, balancing: Update rq->max_idle_balance_cost whenever newidle balance is attempted
Date: Thu, 24 Apr 2014 15:18:37 -0700	[thread overview]
Message-ID: <1398377917.3509.32.camel@j-VirtualBox> (raw)
In-Reply-To: <20140424171453.GZ11096@twins.programming.kicks-ass.net>

On Thu, 2014-04-24 at 19:14 +0200, Peter Zijlstra wrote:
> On Thu, Apr 24, 2014 at 09:53:37AM -0700, Jason Low wrote:
> > 
> > So I thought that the original rationale (commit 1bd77f2d) behind
> > updating rq->next_balance in idle_balance() is that, if we are going
> > idle (!pulled_task), we want to ensure that the next_balance gets
> > calculated without the busy_factor.
> > 
> > If the rq is busy, then rq->next_balance gets updated based on
> > sd->interval * busy_factor. However, when the rq goes from "busy"
> > to idle, rq->next_balance might still have been calculated under
> > the assumption that the rq is busy. Thus, if we are going idle, we
> > would then properly update next_balance without the busy factor
> > if we update when !pulled_task.
> > 
> 
> Its late here and I'm confused!
> 
> So the for_each_domain() loop calculates a new next_balance based on
> ->balance_interval (which has that busy_factor on, right).
> 
> But if it fails to pull anything, we'll (potentially) iterate the entire
> tree up to the largest domain; and supposedly set next_balanced to the
> largest possible interval.
> 
> So when we go from busy to idle (!pulled_task), we actually set
> ->next_balance to the longest interval. Whereas the commit you
> referenced says it sets it to a shorter while.
> 
> Not seeing it.

So this is the way I understand that code:

In rebalance_domain, next_balance is suppose to be set to the
minimum of all sd->last_balance + interval so that we properly call
into rebalance_domains() if one of the domains is due for a balance.

In the domain traversals:

	if (time_after(next_balance, sd->last_balance + interval))
		next_balance = sd->last_balance + interval;

we update next_balance to a new value if the current next_balance
is after, and we only update next_balance to a smaller value.

In rebalance_domains, we have code:

	interval = sd->balance_interval;
	if (idle != CPU_IDLE)
		interval *= sd->busy_factor;

	...

	if (time_after(next_balance, sd->last_balance + interval)) {
		next_balance = sd->last_balance + interval;

	...

	rq->next_balance = next_balance;

In the CPU_IDLE case, interval would not include the busy factor,
whereas in the !CPU_IDLE case, we multiply the interval by the
sd->busy_factor.

So as an example, if a CPU is not idle and we run this:

rebalance_domain()
	interval = 1 ms;
	if (idle != CPU_IDLE)
		interval *= 64;

	next_balance = sd->last_balance + 64 ms

	rq->next_balance = next_balance

The rq->next_balance is set to a large value since the CPU is not idle.

Then, let's say the CPU then goes idle 1 ms later. The
rq->next_balance can be up to 63 ms later, because we computed
it when the CPU is not idle. Now that we are going idle,
we would have to wait a long time for the next balance.

So I believe that the initial reason why rq->next_balance was
updated in idle_balance is that if the CPU is in the process 
of going idle (!pulled_task in idle_balance()), we can reset the
rq->next_balance based on the interval = 1 ms, as oppose to
having it remain up to 64 ms later (in idle_balance(), interval
doesn't get multiplied by sd->busy_factor).




  parent reply	other threads:[~2014-04-24 22:18 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-24  1:30 [PATCH 0/3] sched: Idle balance patches Jason Low
2014-04-24  1:30 ` [PATCH 1/3] sched, balancing: Update rq->max_idle_balance_cost whenever newidle balance is attempted Jason Low
2014-04-24 10:14   ` Preeti U Murthy
2014-04-24 12:04     ` Peter Zijlstra
2014-04-24 12:44       ` Peter Zijlstra
2014-04-24 16:53         ` Jason Low
2014-04-24 17:14           ` Peter Zijlstra
2014-04-24 17:29             ` Peter Zijlstra
2014-04-24 22:18             ` Jason Low [this message]
2014-04-25  5:12               ` Preeti U Murthy
2014-04-25  7:13                 ` Jason Low
2014-04-25  7:58                   ` Mike Galbraith
2014-04-25 17:03                     ` Jason Low
2014-04-25  5:08             ` Preeti U Murthy
2014-04-25  9:43               ` Peter Zijlstra
2014-04-25 19:54                 ` Jason Low
2014-04-26 14:50                   ` Peter Zijlstra
2014-04-28 16:42                     ` Jason Low
2014-04-27  8:31                   ` Preeti Murthy
2014-04-28  9:24                     ` Peter Zijlstra
2014-04-29  3:10                       ` Preeti U Murthy
2014-04-28 18:04                     ` Jason Low
2014-04-29  3:52                       ` Preeti U Murthy
2014-04-24  1:30 ` [PATCH 2/3] sched: Initialize newidle balance stats in sd_numa_init() Jason Low
2014-04-24 12:18   ` Peter Zijlstra
2014-04-25  5:57   ` Preeti U Murthy
2014-05-08 10:42   ` [tip:sched/core] sched/numa: " tip-bot for Jason Low
2014-04-24  1:30 ` [PATCH 3/3] sched, fair: Stop searching for tasks in newidle balance if there are runnable tasks Jason Low
2014-04-24  2:51   ` Mike Galbraith
2014-04-24  8:28     ` Mike Galbraith
2014-04-24 16:37     ` Jason Low
2014-04-24 19:07       ` Mike Galbraith
2014-04-24  7:15   ` Peter Zijlstra
2014-04-24 16:43     ` Jason Low
2014-04-24 16:52       ` Peter Zijlstra
2014-04-25  1:24         ` Jason Low
2014-04-25  2:45         ` Mike Galbraith
2014-04-25  3:33           ` Jason Low
2014-04-25  5:46             ` Mike Galbraith
2014-04-24 16:54       ` Peter Zijlstra
2014-04-24 10:30   ` Morten Rasmussen
2014-04-24 11:32     ` Peter Zijlstra
2014-04-24 14:08       ` Morten Rasmussen
2014-04-24 14:59         ` Peter Zijlstra
2014-05-08 10:44   ` [tip:sched/core] sched/fair: " tip-bot for Jason Low

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1398377917.3509.32.camel@j-VirtualBox \
    --to=jason.low2@hp.com \
    --cc=alex.shi@linaro.org \
    --cc=aswin@hp.com \
    --cc=chegu_vinod@hp.com \
    --cc=daniel.lezcano@linaro.org \
    --cc=efault@gmx.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=morten.rasmussen@arm.com \
    --cc=peterz@infradead.org \
    --cc=preeti@linux.vnet.ibm.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.