From: Mike Galbraith <efault@gmx.de>
To: Alex Shi <alex.shi@intel.com>
Cc: Borislav Petkov <bp@alien8.de>,
torvalds@linux-foundation.org, mingo@redhat.com,
peterz@infradead.org, tglx@linutronix.de,
akpm@linux-foundation.org, arjan@linux.intel.com, pjt@google.com,
namhyung@kernel.org, vincent.guittot@linaro.org,
gregkh@linuxfoundation.org, preeti@linux.vnet.ibm.com,
viresh.kumar@linaro.org, linux-kernel@vger.kernel.org
Subject: Re: [patch v4 0/18] sched: simplified fork, release load avg and power awareness scheduling
Date: Mon, 28 Jan 2013 07:42:17 +0100
Message-ID: <1359355337.5783.56.camel@marge.simpson.net>
In-Reply-To: <1359353720.5783.51.camel@marge.simpson.net>

On Mon, 2013-01-28 at 07:15 +0100, Mike Galbraith wrote:
> On Mon, 2013-01-28 at 13:51 +0800, Alex Shi wrote:
> > On 01/28/2013 01:17 PM, Mike Galbraith wrote:
> > > On Sun, 2013-01-27 at 16:51 +0100, Mike Galbraith wrote:
> > >> On Sun, 2013-01-27 at 21:25 +0800, Alex Shi wrote:
> > >>> On 01/27/2013 06:35 PM, Borislav Petkov wrote:
> > >>>> On Sun, Jan 27, 2013 at 05:36:25AM +0100, Mike Galbraith wrote:
> > >>>>> With aim7 compute on 4 node 40 core box, I see stable throughput
> > >>>>> improvement at tasks = nr_cores and below w. balance and powersaving.
> > >> ...
> > >>>> Ok, this is sick. How is balance and powersaving better than perf? Both
> > >>>> have many more jobs per minute than perf; is that because we pack
> > >>>> many more tasks per cpu with balance and powersaving?
> > >>>
> > >>> Maybe it is due to the lazy balancing on balance/powersaving. You can
> > >>> check the CS times in /proc/pid/status.
> > >>
> > >> Well, it's not wakeup path, limiting entry frequency per waker did zip
> > >> squat nada to any policy throughput.
> > >
> > > monteverdi:/abuild/mike/:[0]# echo powersaving > /sys/devices/system/cpu/sched_policy/current_sched_policy
> > > monteverdi:/abuild/mike/:[0]# massive_intr 10 60
> > > 043321 00058616
> > > 043313 00058616
> > > 043318 00058968
> > > 043317 00058968
> > > 043316 00059184
> > > 043319 00059192
> > > 043320 00059048
> > > 043314 00059048
> > > 043312 00058176
> > > 043315 00058184
> > > monteverdi:/abuild/mike/:[0]# echo balance > /sys/devices/system/cpu/sched_policy/current_sched_policy
> > > monteverdi:/abuild/mike/:[0]# massive_intr 10 60
> > > 043337 00053448
> > > 043333 00053456
> > > 043338 00052992
> > > 043331 00053448
> > > 043332 00053488
> > > 043335 00053496
> > > 043334 00053480
> > > 043329 00053288
> > > 043336 00053464
> > > 043330 00053496
> > > monteverdi:/abuild/mike/:[0]# echo performance > /sys/devices/system/cpu/sched_policy/current_sched_policy
> > > monteverdi:/abuild/mike/:[0]# massive_intr 10 60
> > > 043348 00052488
> > > 043344 00052488
> > > 043349 00052744
> > > 043343 00052504
> > > 043347 00052504
> > > 043352 00052888
> > > 043345 00052504
> > > 043351 00052496
> > > 043346 00052496
> > > 043350 00052304
> > > monteverdi:/abuild/mike/:[0]#
> >
> > similar with aim7 results. Thanks, Mike!
> >
> > Would you like to collect vmstat info in the background?
> > >
> > > Zzzt. Wish I could turn turbo thingy off.
> >
> > Do you mean the turbo mode of cpu frequency? I remember some machines
> > can disable it in BIOS.
>
> Yeah, I can do that in my local x3550 box. I can't fiddle with BIOS
> settings on the remote NUMA box.
>
> This can't be anything but turbo gizmo mucking up the numbers I think,
> not that the numbers are invalid or anything, better numbers are better
> numbers no matter where/how they come about ;-)
>
> The massive_intr load is dirt simple sleep/spin with bean counting. It
> sleeps 1ms, spins 8ms. Change that to sleep 8ms, grind away for 1ms...
>
> monteverdi:/abuild/mike/:[0]# ./massive_intr 10 60
> 045150 00006484
> 045157 00006427
> 045156 00006401
> 045152 00006428
> 045155 00006372
> 045154 00006370
> 045158 00006453
> 045149 00006372
> 045151 00006371
> 045153 00006371
> monteverdi:/abuild/mike/:[0]# echo balance > /sys/devices/system/cpu/sched_policy/current_sched_policy
> monteverdi:/abuild/mike/:[0]# ./massive_intr 10 60
> 045170 00006380
> 045172 00006374
> 045169 00006376
> 045175 00006376
> 045171 00006334
> 045176 00006380
> 045168 00006374
> 045174 00006334
> 045177 00006375
> 045173 00006376
> monteverdi:/abuild/mike/:[0]# echo performance > /sys/devices/system/cpu/sched_policy/current_sched_policy
> monteverdi:/abuild/mike/:[0]# ./massive_intr 10 60
> 045198 00006408
> 045191 00006408
> 045197 00006408
> 045192 00006411
> 045194 00006409
> 045196 00006409
> 045195 00006336
> 045189 00006336
> 045193 00006411
> 045190 00006410
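
For readers without the tool at hand, the sleep/spin pattern described in the quoted text can be sketched roughly as below. This is not the massive_intr source: the function name sleep_spin, its parameters, and the choice to count busy-loop iterations as the "beans" are all assumptions made for illustration.

```python
import time


def sleep_spin(duration_s, sleep_s=0.001, spin_s=0.008):
    """Alternate a short sleep with a busy loop for duration_s seconds.

    Returns the number of busy-loop iterations completed. What the real
    tool tallies is an assumption here; this sketch counts spin iterations.
    """
    count = 0
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        # Sleep phase: give up the CPU for roughly sleep_s seconds.
        time.sleep(sleep_s)
        # Spin phase: burn CPU for roughly spin_s seconds, counting iterations.
        spin_until = time.monotonic() + spin_s
        while time.monotonic() < spin_until:
            count += 1
    return count


if __name__ == "__main__":
    # Original shape of the load: 1ms sleep, 8ms work.
    print(sleep_spin(1.0, sleep_s=0.001, spin_s=0.008))
    # Inverted shape: 8ms sleep, 1ms work.
    print(sleep_spin(1.0, sleep_s=0.008, spin_s=0.001))
```

Swapping the two intervals, as in the second call, gives the mostly-idle variant of the load.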

Back to the original 1ms sleep, 8ms work, this time turning the NUMA box
into a single-node, 10-core box with numactl.

monteverdi:/abuild/mike/:[0]# echo powersaving > /sys/devices/system/cpu/sched_policy/current_sched_policy
monteverdi:/abuild/mike/:[0]# numactl --cpunodebind=0 massive_intr 10 60
045286 00043872
045289 00043464
045284 00043488
045287 00043440
045283 00043416
045281 00044456
045285 00043456
045288 00044312
045280 00043048
045282 00043240
monteverdi:/abuild/mike/:[0]# echo balance > /sys/devices/system/cpu/sched_policy/current_sched_policy
monteverdi:/abuild/mike/:[0]# numactl --cpunodebind=0 massive_intr 10 60
045300 00052536
045307 00052472
045304 00052536
045299 00052536
045305 00052520
045306 00052528
045302 00052528
045303 00052528
045308 00052512
045301 00052520
monteverdi:/abuild/mike/:[0]# echo performance > /sys/devices/system/cpu/sched_policy/current_sched_policy
monteverdi:/abuild/mike/:[0]# numactl --cpunodebind=0 massive_intr 10 60
045339 00052600
045340 00052608
045338 00052600
045337 00052608
045343 00052600
045341 00052600
045336 00052608
045335 00052616
045334 00052576
045342 00052600
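
To make the three numactl-bound runs above easier to compare, here is a small sketch that averages the per-task counts for each policy and expresses them relative to performance. The counts are copied from the output above; the summarize helper and the choice of performance as baseline are illustrative assumptions. By this arithmetic powersaving comes out roughly 17% below performance, while balance is within a fraction of a percent of it.

```python
from statistics import mean

# Per-task counts copied from the numactl --cpunodebind=0 runs above.
counts = {
    "powersaving": [43872, 43464, 43488, 43440, 43416,
                    44456, 43456, 44312, 43048, 43240],
    "balance": [52536, 52472, 52536, 52536, 52520,
                52528, 52528, 52528, 52512, 52520],
    "performance": [52600, 52608, 52600, 52608, 52600,
                    52600, 52608, 52616, 52576, 52600],
}


def summarize(counts, baseline="performance"):
    """Return {policy: (mean count, mean relative to the baseline policy)}."""
    means = {policy: mean(values) for policy, values in counts.items()}
    base = means[baseline]
    return {policy: (m, m / base) for policy, m in means.items()}


summary = summarize(counts)
for policy, (avg, ratio) in summary.items():
    print(f"{policy:12s} mean={avg:9.1f} ratio={ratio:.3f}")
```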