All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dario Faggioli <raistlin@linux.it>
To: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Andre Przywara <andre.przywara@amd.com>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>,
	George Dunlap <George.Dunlap@eu.citrix.com>,
	Juergen Gross <juergen.gross@ts.fujitsu.com>,
	xen-devel <xen-devel@lists.xen.org>
Subject: Re: [PATCH 1 of 3 v5/leftover] libxl: enable automatic placement of guests on NUMA nodes [and 1 more messages]
Date: Wed, 18 Jul 2012 02:22:27 +0200	[thread overview]
Message-ID: <1342570947.11794.92.camel@Abyss> (raw)
In-Reply-To: <20485.43293.833036.352186@mariner.uk.xensource.com>


[-- Attachment #1.1: Type: text/plain, Size: 3856 bytes --]

On Tue, 2012-07-17 at 19:04 +0100, Ian Jackson wrote: 
> I've been told that boxes with 16 NUMA nodes are coming out about
> now.
> 
> This algorithm will start to have unreasonable performance with
> anything bigger than that.  Eg choosing 16 NUMA nodes out of 32 would
> involve searching 601,080,390 combinations.  32 out of 64 gives 10^18.
> 
Ok, when designing this thing, I knew it would turn out to be the most
computationally complex part of the whole NUMA thing (hopefully!).

That was on purpose, mainly because this is the *first* (from the
guest's lifetime POV) of the various others NUMA-aware related
heuristics that will come, and the only one that will run userspace and
far from critical paths. That's why, when the suggestion to consider
"all the possible combination of ..." came, during one round of the
review, I liked it very much and went for it.

_That_all_being_said_, the algorithm takes some input parameters, the
most important of which is probably max_nodes. It should contain the
maximum number of nodes one wants his guest to span. I worked really
hard for it not to not be just hardcoded to 1, as that would mean we
were giving up trying to improve performances for guests not fitting in
just one node. However, it is not written in stone that it must be free
to range from 1 to #_of_nodes. That is sure possible, with the current
NUMA systems, and here they are the number of combinations we get for 4,
8 and 16 nodes (n), respectively: 
$ echo "combs=0; n=4; for k=1:4 combs=combs+bincoeff(n,k); endfor; combs" | octave -q
  combs =  15
 $ echo "combs=0; n=8; for k=1:8 combs=combs+bincoeff(n,k); endfor; combs" | octave -q
  combs =  255
 $ echo "combs=0; n=16; for k=1:16 combs=combs+bincoeff(n,k); endfor; combs" | octave -q
  combs =  65535

However, it might well be decided that it is sane to limit max_nodes to,
say, 4, which would mean we'll be trying our best to boost the
performances of all guests that fits within 4 nodes (which is a lot
better than just one!), which would result in these numbers, for nodes
(n) equal to 4, 8, 16, 32 and 64: 
$ echo "combs=0; n=4; for k=1:4 combs=combs+bincoeff(n,k); endfor; combs" | octave -q
  combs =  15
 $ echo "combs=0; n=8; for k=1:4 combs=combs+bincoeff(n,k); endfor; combs" | octave -q
  combs =  162
 $ echo "combs=0; n=16; for k=1:4 combs=combs+bincoeff(n,k); endfor; combs" | octave -q
  combs =  2516
 $ echo "combs=0; n=32; for k=1:4 combs=combs+bincoeff(n,k); endfor; combs" | octave -q
  combs =  41448
 $ echo "combs=0; n=64; for k=1:4 combs=combs+bincoeff(n,k); endfor; combs" | octave -q
  combs =  679120

Moreover, as another input parameter is a nodemask, telling the
algorithm not only how many but also _which_ nodes it should consider,
one could imagine to have some multi-tier form of the algorithm itself,
or, perhaps, implementing a more abstract and coarse partitioning
heuristics among group of nodes (which will better be done as soon as
we'll see how 32 and 64 nodes system will actually look like :-D) and
then run the current algorithm only on these groups, with numbers that
would then look just like above.

So, to summarize, I can't be farther than saying the current algorithm
is perfect or fast. Rather, I'm definitely saying it has the advantage
of not branching out the problem space too much aggressively at this
early stage, and it is flexible enough for avoiding getting unreasonable
performances, or at least, it looks like that to me, I guess. :-)

Thoughts?

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

  parent reply	other threads:[~2012-07-18  0:22 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-07-10 15:03 [PATCH 0 of 3 v4/leftover] Automatic NUMA placement for xl Dario Faggioli
2012-07-10 15:03 ` [PATCH 1 of 3 v4/leftover] libxl: enable automatic placement of guests on NUMA nodes Dario Faggioli
2012-07-17 15:55   ` Ian Jackson
2012-07-16 17:13     ` [PATCH 0 of 3 v5/leftover] Automatic NUMA placement for xl Dario Faggioli
2012-07-16 17:13       ` [PATCH 1 of 3 v5/leftover] libxl: enable automatic placement of guests on NUMA nodes Dario Faggioli
2012-07-17 18:04         ` [PATCH 1 of 3 v5/leftover] libxl: enable automatic placement of guests on NUMA nodes [and 1 more messages] Ian Jackson
2012-07-17 20:23           ` Ian Campbell
2012-07-18  0:31             ` Dario Faggioli
2012-07-18 10:44             ` Ian Jackson
2012-07-18  0:22           ` Dario Faggioli [this message]
2012-07-18  8:27             ` Dario Faggioli
2012-07-18  9:13             ` Ian Campbell
2012-07-18  9:43               ` Dario Faggioli
2012-07-18  9:53                 ` Ian Campbell
2012-07-18 10:08                   ` Dario Faggioli
2012-07-18 11:00                   ` Ian Jackson
2012-07-18 13:14                     ` Ian Campbell
2012-07-18 13:35                       ` Dario Faggioli
2012-07-19 12:47                       ` Dario Faggioli
2012-07-18 13:40                     ` Andre Przywara
2012-07-18 13:54                       ` Juergen Gross
2012-07-18 14:00                       ` Dario Faggioli
2012-07-19 14:43                       ` Ian Jackson
2012-07-19 18:37                         ` Andre Przywara
2012-07-21  1:46                           ` Dario Faggioli
2012-07-18 10:53                 ` Ian Jackson
2012-07-18 13:12                   ` Ian Campbell
2012-07-18  9:47             ` Dario Faggioli
2012-07-19 12:21         ` [PATCH 1 of 3 v5/leftover] libxl: enable automatic placement of guests on NUMA nodes Andre Przywara
2012-07-19 14:22           ` Dario Faggioli
2012-07-20  8:19             ` Andre Przywara
2012-07-20  9:39               ` Dario Faggioli
2012-07-20 10:01                 ` Dario Faggioli
2012-07-20  8:20             ` Dario Faggioli
2012-07-20  8:26               ` Andre Przywara
2012-07-20  8:38                 ` Juergen Gross
2012-07-20  9:52                   ` Dario Faggioli
2012-07-20  9:56                     ` Juergen Gross
2012-07-20  9:44                 ` Dario Faggioli
2012-07-20 11:47                   ` Andre Przywara
2012-07-20 12:54                     ` Dario Faggioli
2012-07-20 13:07                       ` Andre Przywara
2012-07-21  1:44                         ` Dario Faggioli
2012-07-16 17:13       ` [PATCH 2 of 3 v5/leftover] libxl: have NUMA placement deal with cpupools Dario Faggioli
2012-07-16 17:13       ` [PATCH 3 of 3 v5/leftover] Some automatic NUMA placement documentation Dario Faggioli
2012-07-20 11:07       ` [PATCH 0 of 3 v5/leftover] Automatic NUMA placement for xl David Vrabel
2012-07-20 11:43         ` Andre Przywara
2012-07-20 12:00           ` Ian Campbell
2012-07-20 12:08             ` Ian Campbell
2012-07-23 10:38               ` Dario Faggioli
2012-07-23 10:42                 ` Ian Campbell
2012-07-23 15:31                   ` Dario Faggioli
2012-07-23 10:23             ` Dario Faggioli
2012-07-20 12:14           ` David Vrabel
2012-07-17 15:59     ` [PATCH 1 of 3 v4/leftover] libxl: enable automatic placement of guests on NUMA nodes Ian Campbell
2012-07-17 18:01       ` Ian Jackson
2012-07-17 22:15     ` Dario Faggioli
2012-07-10 15:03 ` [PATCH 2 of 3 v4/leftover] libxl: have NUMA placement deal with cpupools Dario Faggioli
2012-07-10 15:03 ` [PATCH 3 of 3 v4/leftover] Some automatic NUMA placement documentation Dario Faggioli
2012-07-16 17:03 ` [PATCH 0 of 3 v4/leftover] Automatic NUMA placement for xl Dario Faggioli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1342570947.11794.92.camel@Abyss \
    --to=raistlin@linux.it \
    --cc=George.Dunlap@eu.citrix.com \
    --cc=Ian.Campbell@citrix.com \
    --cc=Ian.Jackson@eu.citrix.com \
    --cc=Stefano.Stabellini@eu.citrix.com \
    --cc=andre.przywara@amd.com \
    --cc=juergen.gross@ts.fujitsu.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.