From: Wei Liu <wei.liu2@citrix.com>
To: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>,
	"JBeulich@suse.com" <JBeulich@suse.com>,
	Andrew Cooper <Andrew.Cooper3@citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
	"ufimtseva@gmail.com" <ufimtseva@gmail.com>,
	Ian Jackson <Ian.Jackson@citrix.com>,
	Ian Campbell <Ian.Campbell@citrix.com>
Subject: Re: [PATCH v5 06/24] libxl: introduce vNUMA types
Date: Mon, 16 Feb 2015 16:11:55 +0000
Message-ID: <20150216161155.GF20572@zion.uk.xensource.com>
In-Reply-To: <1424102178.2591.12.camel@citrix.com>

On Mon, Feb 16, 2015 at 03:56:21PM +0000, Dario Faggioli wrote:
> On Mon, 2015-02-16 at 15:17 +0000, Wei Liu wrote:
> > On Mon, Feb 16, 2015 at 02:58:32PM +0000, Dario Faggioli wrote:
> 
> > > > +libxl_vnode_info = Struct("vnode_info", [
> > > > +    ("memkb", MemKB),
> > > > +    ("distances", Array(uint32, "num_distances")), # distances from this node to other nodes
> > > > +    ("pnode", uint32), # physical node of this node
> > > >
> > > I am unsure whether we ever discussed this (and sorry for not
> > > recalling), but, in principle, one vnode can be mapped to more than
> > > just one pnode.
> > > 
> > 
> > I don't recall either.
> > 
> > > The semantics would be that the memory of the vnode is somehow split
> > > (evenly, by default, I would say) between the specified pnodes. So, pnode
> > > could be a bitmap too (and be called "pnodes" :-) ), although we can put
> > > checks in place so that --for now-- it always has only one bit set.
> > > 
> > > Reasons might be that the user just wants it, or that there is not
> > > enough (free) memory on just one pnode, but we still want to achieve
> > > some locality.
> > > 
> > 
> > Wouldn't this cause unpredictable performance? 
> >
> A certain amount of it, yes, for sure, but always less than having the
> memory striped on all nodes, I would say.
> 
> Well, of course it depends on how it will be used, as usual with these
> things...
> 
> > And there is no way to
> > specify priority among the group of nodes you specify with a single
> > bitmap.
> > 
> Why do we need such a thing as a 'priority'? What I'm talking about is
> making it possible, for each vnode, to specify the vnode-to-pnode mapping
> as a bitmap of pnodes. What we'd do, in the presence of a bitmap, would be
> to allocate the memory by striping it across _all_ the pnodes present in
> the bitmap.
> 

Should we enforce that memory is striped equally across all nodes? If
so, this should be stated explicitly in the interface comment; I can't
see that in your original description. I asked about "priority" because
I interpreted it as something else (which is just one of many possible
interpretations, I think).
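
Just to make the ambiguity concrete, here is a minimal sketch (not
actual libxl code; the uint64_t bitmap representation and all names
are made up for illustration) of one thing "split evenly across the
pnodes in the bitmap" could mean:

#include <stdint.h>

#define MAX_PNODES 64

/* Illustrative only -- not libxl code. */

/* Number of pnodes selected in the bitmap. */
static int nr_set_bits(uint64_t bitmap)
{
    int n = 0;

    for (; bitmap; bitmap >>= 1)
        n += bitmap & 1;
    return n;
}

/*
 * Split a vnode's memory evenly over the pnodes whose bits are set;
 * out[i] receives the share (in KiB) for pnode i.  One arbitrary
 * choice already shows up here: the remainder goes to the first
 * selected pnode.
 */
static void stripe_vnode_memory(uint64_t memkb, uint64_t pnodes,
                                uint64_t out[MAX_PNODES])
{
    int nr = nr_set_bits(pnodes);
    uint64_t share = nr ? memkb / nr : 0;
    uint64_t rem = nr ? memkb % nr : 0;

    for (int i = 0; i < MAX_PNODES; i++) {
        if (pnodes & (1ULL << i)) {
            out[i] = share + rem;
            rem = 0;
        } else {
            out[i] = 0;
        }
    }
}

Even this trivial version has to decide where the remainder goes and
what an empty bitmap means, and those are exactly the details the
interface comment would need to spell out.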

If it's up to libxl to make a dynamic choice, we should say that too.
But that is not very useful to the user, because libxl's algorithm can
change, can't it? How are users expected to know that across versions
of Xen?

> If there's only one bit set, you have the same behavior as in this
> patch.
> 
> > I can't say I fully understand the implications of the scenario you
> > described.
> > 
> Ok. Imagine you want to create a guest with 2 vnodes, 4GB RAM total, so
> 2GB on each vnode. On the host, you have 8 pnodes, but only 1GB free on
> each of them.
> 
> If you can only associate a vnode with a single pnode, there is no node
> that can accommodate a full vnode, so we would have to give up trying
> to place the domain and map the vnodes, and we'd end up with 0.5GB on
> each pnode, unpredictable performance, and, basically, no vNUMA at all
> (or at least no vnode-to-pnode mapping)... Does this make sense?
> 
> If we allow the user (or the automatic placement algorithm) to specify a
> bitmap of pnodes for each vnode, he could put, say, vnode #1 on pnodes #0
> and #2, which maybe are really close (in terms of NUMA distances) to
> each other, and vnode #2 on pnodes #5 and #6 (close to each other too).
> This would give worse performance than having each vnode on just one
> pnode but, most likely, better performance than the scenario described
> right above.
> 

I get what you mean. So by writing the above paragraphs, you sort of
confirm that there are still too many implicit details in the
algorithm, right? A user cannot tell from the interface alone what the
behaviour is going to be. You could of course say the algorithm is
fixed, but I don't think we want to do that, do we?
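
That said, if we do go the bitmap route with the restriction you
mention, the "exactly one bit set, for now" check itself is trivial;
again a sketch with made-up names, assuming a uint64_t bitmap:

#include <stdbool.h>
#include <stdint.h>

/* Illustrative only.  Accept the pnodes bitmap only if exactly one
 * bit is set, i.e. the vnode still maps to a single pnode. */
static bool pnodes_is_single(uint64_t pnodes)
{
    /* A non-zero value has exactly one bit set iff clearing its
     * lowest set bit yields zero. */
    return pnodes != 0 && (pnodes & (pnodes - 1)) == 0;
}

The hard part is not the check but documenting when and how the
restriction would later be relaxed.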

Wei.

> Hope I made myself clear enough :-)
> 
> Regards,
> Dario
