From: "Cui, Dexuan" <dexuan.cui@intel.com>
To: Keir Fraser <keir.fraser@eu.citrix.com>,
	Dulloor <dulloor@gmail.com>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
Cc: Andre Przywara <andre.przywara@amd.com>
Subject: RE: [PATCH 00/11] PV NUMA Guests
Date: Wed, 7 Apr 2010 15:57:05 +0800
Message-ID: <ED3036A092A28F4C91B0B4360DD128EABE1D656C@shzsmsx502.ccr.corp.intel.com>
In-Reply-To: <C7DF41BF.F558%keir.fraser@eu.citrix.com>

Keir Fraser wrote:
> I would like Acks from the people working on HVM NUMA for this patch
> series. At the very least it would be nice to have a single user
> interface for setting this up, regardless of whether for a PV or HVM
> guest. Hopefully code in the toolstack also can be shared. So I'm
Yes, I strongly agree we should share one interface. For example, the XENMEM_numa_op hypercall implemented by Dulloor could be re-used in the HVM NUMA case, and I think some parts of the toolstack could be shared as well. I also replied in another thread and pointed out some similarities I found between Andre's and Dulloor's patches.
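
Just to illustrate the kind of sharing I have in mind (a rough sketch only: the struct layout and the function name below are hypothetical, not the actual interface defined by Dulloor's patch):

#include <stdint.h>

#define MAX_VNODES 8
#define MAX_VCPUS  128

/* The guest NUMA layout the toolstack would hand to Xen. */
struct vnuma_layout {
    uint32_t nr_vnodes;                 /* number of virtual nodes           */
    uint64_t vnode_mem_mb[MAX_VNODES];  /* memory assigned to each node (MB) */
    uint32_t vcpu_to_vnode[MAX_VCPUS];  /* vcpu -> virtual node mapping      */
};

/* Hypothetical shared libxc helper: both the PV build path and the HVM
 * build path would call this one function, which in turn issues the
 * XENMEM_numa_op hypercall. */
int xc_domain_set_vnuma(int xc_handle, uint32_t domid,
                        const struct vnuma_layout *layout);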

> cc'ing Dexuan and Andre, as I know they are involved in the HVM NUMA
> work. 
> 
>  Thanks,
>  Keir
> 
> On 04/04/2010 20:30, "Dulloor" <dulloor@gmail.com> wrote:
> 
>> The set of patches implements virtual NUMA-enlightenment to support
>> NUMA-aware PV guests. In more detail, the patch implements the
>> following : 
>> 
>> * For the NUMA systems, the following memory allocation strategies
>>   are implemented:
>>   - CONFINE : Confine the VM memory allocation to a single node. As
>>     opposed to the current method of doing this in python, the patch
>>     implements this in libxc (along with other strategies) and with
>>     assurance that the memory actually comes from the selected node.
>>   - STRIPE : If the VM memory doesn't fit in a single node and if the
>>     VM is not compiled with guest-numa-support, the memory is
>>     allocated striped across a selected max-set of nodes.
>>   - SPLIT : If the VM memory doesn't fit in a single node and if the
>>     VM is compiled with guest-numa-support, the memory is allocated
>>     split (equally for now) from the min-set of nodes. The VM is then
>>     made aware of this NUMA allocation (virtual NUMA enlightenment).
>>   - DEFAULT : This is the existing allocation scheme.
>> 
>> * If the numa-guest support is compiled into the PV guest, we add
>> numa-guest-support to xen features elfnote. The xen tools use this to
>> determine if SPLIT strategy can be applied.
>> 
I think this looks too complex for a real user to easily determine which strategy to use...
About the CONFINE strategy -- this doesn't look like a useful usage model to me -- do we really think ensuring a VM's memory can only be allocated on a specified node is a typical usage model?
The definitions of STRIPE and SPLIT don't sound like typical usage models to me, either.
Why must the tools know whether or not the PV kernel is built with guest NUMA support?
If a user configures guest NUMA to "on" for a PV guest, the tools can supply the NUMA info to the PV kernel even if the kernel is not built with guest NUMA support -- the kernel will safely ignore the info;
if a user configures guest NUMA to "off" for a PV guest and the tools don't supply the NUMA info, a PV kernel built with guest NUMA support can easily detect this via your new hypercall and will not enable NUMA.
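
To illustrate the detection I mean (a sketch only -- XENMEM_numa_op is from your series, but the subop name, the struct and the return convention below are my guesses, not the real interface):

/* Guest-side probe in the PV kernel: ask Xen for the virtual topology and
 * fall back to a flat (non-NUMA) layout if the toolstack never provided one.
 * The subop and struct names here are illustrative only. */
static int __init xen_init_vnuma(void)
{
    struct xenmem_numa_op op = {
        .cmd   = XENMEM_get_domain_numa_layout,   /* hypothetical subop */
        .domid = DOMID_SELF,
    };

    if (HYPERVISOR_memory_op(XENMEM_numa_op, &op) < 0 ||
        op.u.layout.nr_vnodes <= 1)
        return 0;   /* no vNUMA info supplied -- stay non-NUMA */

    /* ... register op.u.layout.nr_vnodes nodes with the NUMA core ... */
    return 0;
}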

When a user finds that the computing capability of a single node can't satisfy the actual need and hence wants to use guest NUMA, the user has already specified the amount of guest memory and the number of vcpus in the guest config file, so I think the user only needs to specify how many guest nodes the guest will see (the "guestnodes" option in Andre's patch), and the tools and the hypervisor should co-work to distribute guest memory and vcpus uniformly among the guest nodes (I think we may not want to support non-uniform nodes, as that doesn't look like a typical usage model). Of course, a given node may not have the expected amount of memory -- in that case the guest can continue to run at a slower speed (we can print a warning message to the user); or, if the user does care about predictable guest performance, the guest creation should fail.
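
To make that concrete, the only computation the tools would need is something like the toy sketch below (the helper name, the "strict" flag and the example numbers are all made up for illustration; "guestnodes" is Andre's config option):

#include <stdint.h>
#include <stdio.h>

/* E.g. memory=4096, vcpus=4, guestnodes=2 gives two virtual nodes of
 * 2048 MB and 2 vcpus each. */
static int build_uniform_vnuma(uint64_t mem_mb, unsigned int vcpus,
                               unsigned int guestnodes, int strict)
{
    unsigned int n;

    if (guestnodes == 0 || vcpus < guestnodes)
        return -1;                                  /* nonsensical config */

    for (n = 0; n < guestnodes; n++) {
        uint64_t node_mem = mem_mb / guestnodes;    /* uniform memory share */
        unsigned int node_vcpus = vcpus / guestnodes +
                                  (n < vcpus % guestnodes); /* spread leftovers */

        /* If the chosen physical node cannot actually supply node_mem,
         * either warn and continue (default), or fail the build when the
         * user asked for predictable performance ("strict"). */
        printf("vnode %u: %llu MB, %u vcpus\n",
               n, (unsigned long long)node_mem, node_vcpus);
    }
    (void)strict;
    return 0;
}

int main(void)
{
    return build_uniform_vnuma(4096, 4, 2, 0);
}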

What do you think? My thought is that we can keep things simple in the first step. :-)

Thanks,
-- Dexuan


Thread overview: 12+ messages
2010-04-04 19:30 [PATCH 00/11] PV NUMA Guests Dulloor
2010-04-05  6:29 ` Keir Fraser
2010-04-07  7:57   ` Cui, Dexuan [this message]
2010-04-09  4:47     ` Dulloor
2010-04-14  5:18       ` Cui, Dexuan
2010-04-15 17:19         ` Dulloor
2010-04-05 14:52 ` Dan Magenheimer
2010-04-06  3:51   ` Dulloor
2010-04-06 17:18     ` Dan Magenheimer
2010-04-09  4:16       ` Dulloor
2010-04-09 11:34 ` Ian Pratt
2010-04-11  3:06   ` Dulloor
