linuxppc-dev.lists.ozlabs.org archive mirror
From: Michael Neuling <mikey@neuling.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>,
	Gautham R Shenoy <ego@in.ibm.com>,
	linux-kernel@vger.kernel.org, linuxppc-dev@ozlabs.org,
	Ingo Molnar <mingo@elte.hu>
Subject: Re: [PATCH 1/5] sched: fix capacity calculations for SMT4
Date: Thu, 29 Apr 2010 16:55:58 +1000
Message-ID: <31281.1272524158@neuling.org>
In-Reply-To: <1271426308.1674.429.camel@laptop>

In message <1271426308.1674.429.camel@laptop> you wrote:
> On Wed, 2010-04-14 at 14:28 +1000, Michael Neuling wrote:
> 
> > > Right, so I suspect this will indeed break some things.
> > > 
> > > We initially allowed 0 capacity for when a cpu is consumed by an RT task
> > > and there simply isn't much capacity left, in that case you really want
> > > to try and move load to your sibling cpus if possible.
> > 
> > Changing the CPU power based on what tasks are running on them seems a
> > bit wrong to me.  Shouldn't we keep those concepts separate?
> 
> Well the thing cpu_power represents is a ratio of compute capacity
> available to this cpu as compared to other cpus. By normalizing the
> runqueue weights with this we end up with a fair balance.
> 
> The thing to realize here is that this is solely about SCHED_NORMAL
> tasks, SCHED_FIFO/RR (or the proposed DEADLINE) tasks do not care about
> fairness and available compute capacity.
> 
> So if we were to ignore RT tasks, you'd end up with a situation where,
> assuming 2 cpus and 4 equally weighted NORMAL tasks, and 1 RT task, the
> load-balancer would give each cpu 2 NORMAL tasks, but the tasks that
> would end up on the cpu the RT task would be running on would not run
> as fast -- is that fair?
> 
> Since RT tasks do not have a weight (FIFO/RR have no limit at all,
> DEADLINE would have something equivalent to a max weight), it is
> impossible to account them in the normal weight sense.
> 
> Therefore the current model takes them into account by lowering the
> compute capacity according to their (avg) cpu usage. So if the RT task
> would consume 66% cputime, we'd end up with a situation where the cpu
> running the RT task would get 1 NORMAL task, and other cpu would have
> the remaining 3, that way they'd all get 33% cpu.
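
To make the arithmetic above concrete, here is a minimal sketch in C. It is
not kernel code: the helper names scale_power_by_rt() and tasks_for_cpu() are
hypothetical, and the 1024 base power and percentage-based RT usage are
assumptions chosen to match the 66% example.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a kernel function): reduce a cpu's power by
 * the share of time consumed by RT tasks. base_power is on a 1024 scale,
 * rt_pct is the average RT usage in percent.
 */
unsigned long scale_power_by_rt(unsigned long base_power, unsigned int rt_pct)
{
	assert(rt_pct <= 100);
	return base_power * (100 - rt_pct) / 100;
}

/*
 * Hypothetical helper: how many of nr_tasks equally weighted NORMAL tasks
 * a cpu with power `mine` should get when the other cpu has power `other`,
 * rounded to the nearest whole task.
 */
unsigned int tasks_for_cpu(unsigned int nr_tasks, unsigned long mine,
			   unsigned long other)
{
	unsigned long total = mine + other;

	assert(total > 0);
	return (unsigned int)((nr_tasks * mine + total / 2) / total);
}
```

With a 66% RT load, scale_power_by_rt(1024, 66) leaves a power of 348, and
tasks_for_cpu(4, 348, 1024) assigns 1 of the 4 NORMAL tasks to that cpu,
leaving 3 for the other one.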
> 
> > > However you're right that this goes awry in your case.
> > > 
> > > One thing to look at is if that 15% increase is indeed representative
> > > for the power7 cpu, it having 4 SMT threads seems to suggest there was
> > > significant gains, otherwise they'd not have wasted the silicon.
> > 
> > There are certainly, for most workloads, per core gains for SMT4 over
> > SMT2 on P7.  My kernels certainly compile faster and that's the only
> > workload anyone who matters cares about.... ;-)
> 
> For sure ;-)
> 
> Are there any numbers available on how much they gain? It might be worth
> sticking in real numbers instead of this alleged 15%.
> 
> > > One thing we could look at is using the cpu base power to compute
> > > capacity from. We'd have to add another field to sched_group and store
> > > power before we do the scale_rt_power() stuff.
> > 
> > Separating capacity from what RT tasks are running seems like a good
> > idea to me.
> 
> Well, per the above we cannot fully separate them.
> 
> > This would fix the RT issue, but it's not clear to me how you are
> > suggesting fixing the rounding down to 0 SMT4 issue.  Are you suggesting
> > we bump smt_gain to say 2048 + 15%?  Or are you suggesting we separate
> > the RT tasks out from capacity and keep the max(1, capacity) that I've
> > added?  Or something else?
> 
> I would think that 4 SMT threads are still slower than two full cores,
> right? So cpu_power=2048 would not be appropriate.
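
For reference, the rounding-to-0 issue and the max(1, capacity) workaround
discussed above can be sketched as follows. This is only an illustration
under stated assumptions: smt_gain is taken to be 1178 (1024 + 15%), it is
assumed to be split evenly across the sibling threads, and capacity is
assumed to be the per-thread power divided by 1024 and rounded to nearest.
The function names are hypothetical, not the scheduler's.

```c
#include <assert.h>

#define LOAD_SCALE 1024UL	/* the capacity divider mentioned in the thread */

/* Assumption: each SMT thread gets an equal slice of smt_gain. */
unsigned long thread_power(unsigned long smt_gain, unsigned int threads)
{
	assert(threads > 0);
	return smt_gain / threads;
}

/* Capacity as power / 1024, rounded to the nearest integer. */
unsigned long capacity_rounded(unsigned long power)
{
	return (power + LOAD_SCALE / 2) / LOAD_SCALE;
}

/* The max(1, capacity) variant: never report a capacity of 0. */
unsigned long capacity_clamped(unsigned long power)
{
	unsigned long cap = capacity_rounded(power);

	return cap ? cap : 1;
}
```

Under these assumptions an SMT2 thread has a power of 589, which rounds to a
capacity of 1, while an SMT4 thread has a power of 294, which rounds to 0
unless the clamp is applied.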
> 
> > Would another possibility be changing capacity to a scaled value (like
> > cpu_power is now) rather than a small integer as it is now?  For
> > example, a scaled capacity of 1024 would be equivalent to a capacity of
> > 1 now.  This might enable us to handle partial capacities better?  We'd
> > probably have to scale a bunch of nr_running too.  
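
A rough sketch of what the scaled-capacity idea could look like, in C. It is
purely illustrative: has_spare_capacity() is a hypothetical name, and
comparing nr_running scaled by 1024 against a scaled capacity is one possible
reading of the suggestion, not an existing scheduler interface.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define CAPACITY_SCALE 1024UL	/* a scaled capacity of 1024 equals 1 today */

/*
 * Hypothetical check: does a cpu or group with the given scaled capacity
 * have room for another task? nr_running is scaled up by the same factor
 * so that partial capacities (values below 1024) are not rounded away.
 */
bool has_spare_capacity(unsigned int nr_running, unsigned long scaled_capacity)
{
	assert(nr_running <= ULONG_MAX / CAPACITY_SCALE);
	return nr_running * CAPACITY_SCALE < scaled_capacity;
}
```

With a partial capacity of, say, 294, an idle thread still reports spare
capacity (0 < 294), whereas a thread already running one task does not
(1024 >= 294).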
> 
> Right, so my proposal was to scale down the capacity divider (currently
> 1024) to whatever would be the base capacity for that cpu. Trouble seems
> to be that that makes group capacity a lot more complex, as you would
> end up needing to average the base capacity of all the cpus.
> 
> 
> Hrmm, my brain seems muddled but I might have another solution, let me
> ponder this for a bit..
> 

Peter,

Did you manage to get anywhere on this capacity issue?

Mikey
