From: Juergen Gross <juergen.gross@ts.fujitsu.com>
To: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Marcus Granado <Marcus.Granado@eu.citrix.com>,
	Andre Przywara <andre.przywara@amd.com>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	Anil Madhavapeddy <anil@recoil.org>,
	George Dunlap <george.dunlap@eu.citrix.com>,
	Andrew Cooper <Andrew.Cooper3@citrix.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	xen-devel@lists.xen.org, Jan Beulich <JBeulich@suse.com>,
	Daniel De Graaf <dgdegra@tycho.nsa.gov>,
	Matt Wilson <msw@amazon.com>
Subject: Re: [PATCH 0 of 8] NUMA Awareness for the Credit Scheduler
Date: Tue, 09 Oct 2012 12:02:00 +0200
Message-ID: <5073F618.7020709@ts.fujitsu.com>
In-Reply-To: <patchbomb.1349446098@Solace>

On 05.10.2012 16:08, Dario Faggioli wrote:
> Hi Everyone,
>
> Here comes a patch series instilling some NUMA awareness into the Credit
> scheduler.
>
> What the patches do is teach Xen's scheduler how to try to maximize
> performance on a NUMA host, taking advantage of the information coming from
> the automatic NUMA placement we have in libxl.  Right now, the placement
> algorithm runs and selects a node (or a set of nodes) on which it is best to
> put a new domain. Then, all the memory for the new domain is allocated from
> those node(s) and all the vCPUs of the new domain are pinned to the pCPUs of
> those node(s). What we do here is, instead of statically pinning the domain's
> vCPUs to the nodes' pCPUs, have the (Credit) scheduler _prefer_ running them
> there. That enables most of the performance benefits of "real" pinning, but
> without its intrinsic lack of flexibility.
>
> The above works by extending the scheduler's knowledge of a domain's
> node-affinity. We then ask it to first try to run the domain's vCPUs on one of
> the nodes the domain has affinity with. Of course, if that turns out to be
> impossible, it falls back to the old behaviour (i.e., considering
> vcpu-affinity only).
>
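For what it's worth, my mental model of the balancing logic described above is
roughly the following sketch (illustrative C only, not the actual patch; the
helper name pick_cpu_sketch() and its exact arguments are mine):

  /* Sketch: pick a pCPU for a vCPU by first intersecting its (hard)
   * vcpu-affinity with the pCPUs of the nodes it has node-affinity
   * with, and falling back to plain vcpu-affinity when that
   * intersection is empty. */
  static int pick_cpu_sketch(const cpumask_t *vcpu_affinity,
                             const cpumask_t *node_affinity_cpus)
  {
      cpumask_t candidates;

      /* Step 1: prefer pCPUs satisfying both affinities. */
      cpumask_and(&candidates, vcpu_affinity, node_affinity_cpus);

      /* Step 2: no such pCPU? Fall back to the old behaviour,
       * i.e., consider vcpu-affinity only. */
      if ( cpumask_empty(&candidates) )
          cpumask_copy(&candidates, vcpu_affinity);

      /* Stand-in for the real load balancing among the candidates. */
      return cpumask_first(&candidates);
  }
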
> Allow me to mention that NUMA-aware scheduling is not only one of the items
> on the NUMA roadmap I'm trying to maintain here:
> http://wiki.xen.org/wiki/Xen_NUMA_Roadmap. It is also one of the features we
> decided we want for Xen 4.3 (and thus it is part of the list of such features
> that George is maintaining).
>
> Up to now, I've been able to thoroughly test this only on my 2-node NUMA test
> box, by running the SpecJBB2005 benchmark concurrently in multiple VMs, and
> the results look really nice.  A full set of what I got can be found in my
> presentation from the last XenSummit, which is available here:
>
>   http://www.slideshare.net/xen_com_mgr/numa-and-virtualization-the-case-of-xen?ref=http://www.xen.org/xensummit/xs12na_talks/T9.html
>
> However, I re-ran some of the tests over the last few days (since I changed
> some bits of the implementation), and here's what I got:
>
> -------------------------------------------------------
>   SpecJBB2005 Total Aggregate Throughput
> -------------------------------------------------------
> #VMs       No NUMA affinity     NUMA affinity&    +/- %
>                                    scheduling
> -------------------------------------------------------
>     2            34653.273          40243.015    +16.13%
>     4            29883.057          35526.807    +18.88%
>     6            23512.926          27015.786    +14.89%
>     8            19120.243          21825.818    +14.15%
>    10            15676.675          17701.472    +12.91%
>
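For clarity, the +/- % column above is the relative gain of the "NUMA
affinity & scheduling" column over the "No NUMA affinity" one, e.g., for
2 VMs: (40243.015 - 34653.273) / 34653.273 = +16.13%.
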
> Basically, the results are consistent with what is shown in the super-nice
> graphs I have in the slides above! :-) As said, this looks nice to me,
> especially considering that my test machine is quite small, i.e., its 2 nodes
> are very close to each other from a latency point of view. I really expect
> more improvement on bigger hardware, where a much greater NUMA effect is to be
> expected.  Of course, I will continue benchmarking myself (hopefully, on
> systems with more than 2 nodes too), but should anyone want to run their own
> tests, that would be great, so feel free to do so and report the results to me
> and/or to the list!
>
> A little bit more about the series:
>
>   1/8 xen, libxc: rename xenctl_cpumap to xenctl_bitmap
>   2/8 xen, libxc: introduce node maps and masks
>
> These two are preparation work.
>
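The rename in 1/8 makes sense to me: once the map type is generic, CPU maps
and node maps can share one representation, along these lines (the field
names below are illustrative, not necessarily what the patch uses):

  /* Sketch: a generic bitmap usable both as a CPU map and as a node
   * map; only the meaning of a set bit differs.  Bit i set means
   * CPU i (respectively node i) is in the map. */
  struct bitmap_sketch {
      uint8_t *bits;     /* the bitmap itself, byte-granular */
      uint32_t nr_bits;  /* number of valid bits in bits[] */
  };
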
>   3/8 xen: let the (credit) scheduler know about `node affinity`
>
> Is where the vcpu load balancing logic of the credit scheduler is modified to
> support node-affinity.
>
>   4/8 xen: allow for explicitly specifying node-affinity
>   5/8 libxc: allow for explicitly specifying node-affinity
>   6/8 libxl: allow for explicitly specifying node-affinity
>   7/8 libxl: automatic placement deals with node-affinity
>
> Is what wires the in-scheduler node-affinity support to the external world.
> Please note that patch 4 touches XSM and Flask, which is the area where I have
> the least experience and the least ability to test properly. So, if Daniel
> and/or anyone else interested in that could take a look and comment, that
> would be awesome.
>
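To help picture the plumbing of patches 4-7, the call chain I would expect
looks roughly like this (the names and the exact split are my guesses from
the patch titles; the real ones are in the patches themselves):

  xl / API user
    -> libxl: set the domain's node-affinity             [6/8, 7/8]
    -> libxc: marshal the node map, issue the domctl     [5/8]
    -> hypervisor: update the domain's node-affinity     [4/8]
    -> (credit) scheduler consults it when balancing     [3/8]
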
>   8/8 xl: report node-affinity for domains
>
> Is just a small output enhancement.

Apart from my minor comment on Patch 3:

Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com>


-- 
Juergen Gross                 Principal Developer Operating Systems
PBG PDG ES&S SWE OS6                   Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@ts.fujitsu.com
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html

Thread overview: 38+ messages
2012-10-05 14:08 [PATCH 0 of 8] NUMA Awareness for the Credit Scheduler Dario Faggioli
2012-10-05 14:08 ` [PATCH 1 of 8] xen, libxc: rename xenctl_cpumap to xenctl_bitmap Dario Faggioli
2012-10-09 15:59   ` George Dunlap
2012-10-05 14:08 ` [PATCH 2 of 8] xen, libxc: introduce node maps and masks Dario Faggioli
2012-10-09 15:59   ` George Dunlap
2012-10-05 14:08 ` [PATCH 3 of 8] xen: let the (credit) scheduler know about `node affinity` Dario Faggioli
2012-10-05 14:25   ` Jan Beulich
2012-10-09 10:29     ` Dario Faggioli
2012-10-09 11:10       ` Keir Fraser
2012-10-09  9:53   ` Juergen Gross
2012-10-09 10:21     ` Dario Faggioli
2012-10-09 16:29   ` George Dunlap
2012-10-05 14:08 ` [PATCH 4 of 8] xen: allow for explicitly specifying node-affinity Dario Faggioli
2012-10-09 16:47   ` George Dunlap
2012-10-09 16:52     ` Ian Campbell
2012-10-09 18:31       ` [PATCH RFC] flask: move policy header sources into hypervisor Daniel De Graaf
2012-10-10  8:38         ` Ian Campbell
2012-10-10  8:44         ` Dario Faggioli
2012-10-10 14:03           ` Daniel De Graaf
2012-10-10 14:39             ` Dario Faggioli
2012-10-10 15:32               ` Daniel De Graaf
2012-10-09 17:17     ` [PATCH 4 of 8] xen: allow for explicitly specifying node-affinity Dario Faggioli
2012-10-05 14:08 ` [PATCH 5 of 8] libxc: " Dario Faggioli
2012-10-05 14:08 ` [PATCH 6 of 8] libxl: " Dario Faggioli
2012-10-05 14:08 ` [PATCH 7 of 8] libxl: automatic placement deals with node-affinity Dario Faggioli
2012-10-10 10:55   ` George Dunlap
2012-10-05 14:08 ` [PATCH 8 of 8] xl: add node-affinity to the output of `xl list` Dario Faggioli
2012-10-05 16:36   ` Ian Jackson
2012-10-09 11:07     ` Dario Faggioli
2012-10-09 15:03       ` Ian Jackson
2012-10-10  8:46         ` Dario Faggioli
2012-10-08 19:43 ` [PATCH 0 of 8] NUMA Awareness for the Credit Scheduler Dan Magenheimer
2012-10-09 10:45   ` Dario Faggioli
2012-10-09 20:20     ` Matt Wilson
2012-10-10 16:18   ` Dario Faggioli
2012-10-09 10:02 ` Juergen Gross [this message]
2012-10-10 11:00 ` George Dunlap
2012-10-10 12:28   ` Dario Faggioli
