From: Dario Faggioli <dario.faggioli@citrix.com>
To: Wei Liu <wei.liu2@citrix.com>
Cc: "JBeulich@suse.com" <JBeulich@suse.com>,
	Andrew Cooper <Andrew.Cooper3@citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
	"ufimtseva@gmail.com" <ufimtseva@gmail.com>,
	Ian Jackson <Ian.Jackson@citrix.com>,
	Ian Campbell <Ian.Campbell@citrix.com>
Subject: Re: [PATCH v5 06/24] libxl: introduce vNUMA types
Date: Mon, 16 Feb 2015 16:51:43 +0000	[thread overview]
Message-ID: <1424105501.2591.53.camel@citrix.com> (raw)
In-Reply-To: <20150216161155.GF20572@zion.uk.xensource.com>

On Mon, 2015-02-16 at 16:11 +0000, Wei Liu wrote:
> On Mon, Feb 16, 2015 at 03:56:21PM +0000, Dario Faggioli wrote:
> > On Mon, 2015-02-16 at 15:17 +0000, Wei Liu wrote:

> > > And there is no way to
> > > specify priority among the group of nodes you specify with a single
> > > bitmap.
> > > 
> > Why do we need such a thing as a 'priority'? What I'm talking about is
> > making it possible, for each vnode, to specify the vnode-to-pnode mapping
> > as a bitmap of pnodes. What we'd do, in the presence of a bitmap, would
> > be to allocate the memory by striping it across _all_ the pnodes present
> > in the bitmap.
> > 
> 
> Should we enforce memory being equally striped across all nodes? If so,
> this should be stated explicitly in the comment of the interface.
>
I don't think we should enforce anything... I was rather describing
what happens *right* *now* in that scenario, whether it is documented or not.

> I can't see
> that in your original description. I asked about "priority" because I
> interpreted it as something else (which is one of many possible ways to
> interpret it, I think).
> 
So, if you're saying that, should we use a bitmap, we should write down
somewhere how libxl would use it, I certainly agree. I'm not sure what
level of detail we should go into at that point. As a user, I think I'd
be fine with finding it written that "the memory of the vnode will be
allocated out of the pnodes specified in the bitmap", without much
further detail, especially considering the use case for the feature.

> If it's up to libxl to make a dynamic choice, we should also say that. But
> this is not very useful to the user, because libxl's algorithm can change,
> can't it? How are users expected to know that across versions of Xen?
> 
Why would they need to? This would enable a bit more flexibility, if one
wants it, or slightly less degraded performance in some specific
situations, and all this pretty much independently of the algorithm used
inside libxl, I think.

As I said, if there is only 1GB free on each pnode, the user would be
allowed to specify a set of pnodes for each vnode, instead of not being
able to use vNUMA at all, no matter how libxl (or whoever else) actually
splits the memory in this, a previous, or a future version of Xen...
This is the scenario I'm talking about, and in such a scenario, knowing
how the split happens does not really help much; it is the
_possibility_ of splitting that helps...

> > If we allow the user (or the automatic placement algorithm) to specify a
> > bitmap of pnodes for each vnode, he could put, say, vnode #1 on pnodes #0
> > and #2, which maybe are really close (in terms of NUMA distance) to
> > each other, and vnode #2 on pnodes #5 and #6 (close to each other too).
> > This would give worse performance than having each vnode on just one
> > pnode, but, most likely, better performance than the scenario described
> > right above.
> > 
> 
> I get what you mean. So by writing the above paragraphs, you sort of
> confirm that there still are too many implications in the algorithms,
> right? A user cannot just tell from the interface what the behaviour is
> going to be.  
>
A user can tell that, if they want a 2GB vnode, and no single pnode has
2GB free, but the sum of the free memory on pnodes #4 and #6 is >= 2GB,
they can still use vNUMA, by paying the price (small or high, depending
on other factors) of having that vnode split in two (or more!).

I think there would be room for some increased user satisfaction in
this, even without knowing much about, or being in control of, how
exactly the split happens, as there is a chance for performance to be
(if the feature is used properly) better than in the no-vNUMA case,
which is what we're after.

> You can of course say the algorithm is fixed but I don't
> think we want to do that?
> 
I don't want to, but I don't think it's needed.

Anyway, I'm more than ok with deferring the discussion to after this
series is in. It would require a further change to the interface, but I
don't think that would be a terrible price to pay, if we decide the
feature is worth it.

Or, and that was the other thing I was suggesting, we can have the
bitmap in vnode_info from now on, but only accept ints in xl config
parsing, and enforce the weight of the bitmap to be 1 (perhaps printing
a warning) for now. This would not require changing the API in the
future; it would just be a matter of changing the xl config file
parsing. The "problem" would still stand for libxl callers other than
xl, though, I know.

Regards,
Dario

> Wei.
> 
> > Hope I made myself clear enough :-)
> > 
> > Regards,
> > Dario
> 
> 



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

