From: Andrea Arcangeli <aarcange@redhat.com>
To: Dan Smith <danms@us.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
Paul Turner <pjt@google.com>,
Suresh Siddha <suresh.b.siddha@intel.com>,
Mike Galbraith <efault@gmx.de>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Lai Jiangshan <laijs@cn.fujitsu.com>,
Bharata B Rao <bharata.rao@gmail.com>,
Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
Rik van Riel <riel@redhat.com>,
Johannes Weiner <hannes@cmpxchg.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC] AutoNUMA alpha6
Date: Wed, 21 Mar 2012 13:49:37 +0100
Message-ID: <20120321124937.GX24602@redhat.com>
In-Reply-To: <87fwd2d2kp.fsf@danplanet.com>
Hi Dan,
On Tue, Mar 20, 2012 at 09:01:58PM -0700, Dan Smith wrote:
> AA>         upstream  autonuma  numasched  hard  inverse
> AA> numa02  64        45        66         42    81
> AA> numa01  491       328       607        321   623   -D THREAD_ALLOC
> AA> numa01  305       207       338        196   378   -D NO_BIND_FORCE_SAME_NODE
>
> AA> So give me a break... you must have made a real mess in your
> AA> benchmarking.
>
> I'm just running what you posted, dude :)
Apologies if it felt like I was attacking you, that wasn't my
intention, I actually appreciate your effort!
My exclamation was because I was shocked by the staggering difference
in results, nothing else.
Here I still get the results I posted above from numasched. In fact
it's even worse: now even -D THREAD_ALLOC won't finish (and I disabled
lockdep just in case). I'll try rebooting a few more times to see if I
can get some numbers out of it again.
numa02 at least repeats at 66 sec reproducibly with numasched with or
without lockdep.
> AA> numasched is always doing worse than upstream here, in fact two
> AA> times massively worse. Almost as bad as the inverse binds.
>
> Well, something clearly isn't right, because my numbers don't match
> yours at all. This time with THP disabled, and compared to the rest of
> the numbers from my previous runs:
>
>            autonuma  HARD  INVERSE  NO_BIND_FORCE_SAME_MODE
>
> numa01     366       335   356      377
> numa01THP  388       336   353      399
>
> That shows that autonuma is worse than inverse binds here. If I'm
> running your stuff incorrectly, please tell me and I'll correct
> it. However, I've now compiled the binary exactly as you asked, with THP
> disabled, and am seeing surprisingly consistent results.
HARD and INVERSE should be the min and max you get.
Before you test AutoNUMA or numasched again, I would ask you to
repeat this "HARD" vs "INVERSE" vs "NO_BIND_FORCE_SAME_NODE"
benchmark and make sure the above numbers are correct for those
three cases.
On my hardware you can see on page 7 of my pdf what I get:
http://www.kernel.org/pub/linux/kernel/people/andrea/autonuma/autonuma_bench-20120321.pdf
numa01  -DHARD_BIND | -DNO_BIND_FORCE_SAME_NODE | -DINVERSE_BIND
        196           305                         378
You can run this benchmark on an upstream 3.3-rc kernel; no patch is
needed to collect the above three numbers.
For me this is always true: HARD_BIND <= NO_BIND_FORCE_SAME_NODE <= INVERSE_BIND.
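A minimal sketch of that sanity check, using only the numa01 timings (in
seconds) already quoted in this thread. The function name and the dict
layout are made up for illustration; they are not part of the benchmark.

```python
def bind_order_ok(hard, same_node, inverse):
    """True when HARD_BIND <= NO_BIND_FORCE_SAME_NODE <= INVERSE_BIND."""
    return hard <= same_node <= inverse


# numa01 elapsed seconds copied from the tables in this thread.
results = {
    "andrea": {"hard": 196, "same_node": 305, "inverse": 378},
    "dan": {"hard": 335, "same_node": 377, "inverse": 356},
}

for who, t in results.items():
    ok = bind_order_ok(t["hard"], t["same_node"], t["inverse"])
    print(who, "ordering holds" if ok else "ordering violated")
```

With the numbers above the ordering holds for my run and is violated for
Dan's (377 > 356), which is why the three baseline cases are worth
repeating first.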
Checking that the numa01 HARD_BIND and INVERSE_BIND cases are setting up
your hardware topology correctly may be a good idea too.
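One possible way to look at the topology the kernel reports, as a rough
sketch: it reads the per-node cpulist files under the Linux sysfs
directory /sys/devices/system/node and prints which CPUs belong to each
node. The helper name is hypothetical, and how numa01 itself picks the
CPUs and nodes for its binds is not shown here, so the output still has
to be compared by hand against what the benchmark assumes.

```python
from pathlib import Path


def read_node_cpulists(root="/sys/devices/system/node"):
    """Map each NUMA node directory name (e.g. 'node0') to its cpulist string."""
    base = Path(root)
    nodes = {}
    if not base.is_dir():
        return nodes  # no NUMA sysfs directory on this system
    for node_dir in sorted(base.glob("node[0-9]*")):
        cpulist = node_dir / "cpulist"
        if cpulist.is_file():
            nodes[node_dir.name] = cpulist.read_text().strip()
    return nodes


if __name__ == "__main__":
    for node, cpus in read_node_cpulists().items():
        print(node, cpus)
```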
If it's not a benchmarking error or a topology error in
HARD_BIND/INVERSE_BIND, it may be that the hardware you're using is very
different. That would be bad news though: I thought you were using the
same common 2-socket hexacore setup that I'm using, and I wouldn't have
expected such a staggering difference in results (even for HARD vs
INVERSE vs NO_BIND_FORCE_SAME_NODE, even before we put autonuma or
numasched into the equation).
> AA> Maybe you've more than 16g? I've 16G and that leaves 1G free on both
> AA> nodes at the peak load with AutoNUMA. That shall be enough for
> AA> numasched too (Peter complained me I waste 80MB on a 16G system, so
> AA> he can't possibly be intentionally wasting me 2GB).
>
> Yep, 24G here. Do I need to tweak the test?
Well, maybe you could try to repeat at 16G if you still see numasched
performing great after running it with -DNO_BIND_FORCE_SAME_NODE.
What -DNO_BIND_FORCE_SAME_NODE is meant to do is to start the "NUMA
migration" races from the worst possible condition.
Imagine it like running a hiking race that always starts from the
_bottom_ of the mountain, and not randomly from the middle as
would happen without -DNO_BIND_FORCE_SAME_NODE.
> How do you figure? I didn't post any hard binding numbers. In fact,
> numasched performed about equal to hard binding...definitely within your
> stated 2% error interval. That was with THP enabled, tomorrow I'll be
> glad to run them all again without THP.
Again, thanks so much for your effort. I hope others will run more
benchmarks too on both solutions. And I repeat what I said yesterday,
clear and straight: if numasched is shown to have the lead on the
vast majority of workloads, I will be happy to "rm -r autonuma" to
stop wasting time on an inferior dead project, and work on something
else entirely or contribute to numasched in case they need
help with something.