From mboxrd@z Thu Jan 1 00:00:00 1970
From: Elena Ufimtseva
Subject: Re: [PATCH v10 6/9] libxl: build numa nodes memory blocks
Date: Thu, 4 Sep 2014 00:47:57 -0400
References: <1409718258-3276-1-git-send-email-ufimtseva@gmail.com>
 <1409718258-3276-4-git-send-email-ufimtseva@gmail.com>
 <1409757696.3323.33.camel@kazak.uk.xensource.com>
In-Reply-To: <1409757696.3323.33.camel@kazak.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ian Campbell
Cc: Keir Fraser, Stefano Stabellini, George Dunlap, Matt Wilson,
 Dario Faggioli, Li Yechen, Ian Jackson, "xen-devel@lists.xen.org",
 Jan Beulich
List-Id: xen-devel@lists.xenproject.org

On Wed, Sep 3, 2014 at 11:21 AM, Ian Campbell wrote:
> On Wed, 2014-09-03 at 00:24 -0400, Elena Ufimtseva wrote:
>> Create the vmemrange structure based on the
>> PV guest's E820 map. Values are in megabytes.
>> Also export the E820 filter code e820_sanitize
>> so that it is available internally.
>> Although Xen can support multi-range vNUMA nodes,
>> for a PV guest there is one range per node. The other
>> domain types should have their building routines
>> implemented.
>>
>> Changes since v8:
>>    - Added setting of the vnuma memory ranges node ids;
>>
>> Signed-off-by: Elena Ufimtseva
>> ---
>>  tools/libxl/libxl_internal.h |    9 ++
>>  tools/libxl/libxl_numa.c     |  201 ++++++++++++++++++++++++++++++++++++++++++
>>  tools/libxl/libxl_x86.c      |    3 +-
>>  3 files changed, 212 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
>> index beb052e..63ccb5e 100644
>> --- a/tools/libxl/libxl_internal.h
>> +++ b/tools/libxl/libxl_internal.h
>> @@ -3088,6 +3088,15 @@ void libxl__numa_candidate_put_nodemap(libxl__gc *gc,
>>      libxl_bitmap_copy(CTX, &cndt->nodemap, nodemap);
>>  }
>>
>> +bool libxl__vnodemap_is_usable(libxl__gc *gc, libxl_domain_build_info *info);
>> +
>> +int e820_sanitize(libxl_ctx *ctx, struct e820entry src[], uint32_t *nr_entries,
>> +                  unsigned long map_limitkb, unsigned long balloon_kb);
>
> This is x86 specific, so simply exposing it to common libxl code as
> you've done will break on arm.
>
> I'm not sure if this means moving the whole thing to libxl_x86 and
> adding an ARM stub or if there is some common stuff from which the x86
> stuff needs to be abstracted.
>
>> +
>> +int libxl__vnuma_align_mem(libxl__gc *gc, uint32_t domid,
>> +                           struct libxl_domain_build_info *b_info,
>> +                           vmemrange_t *memblks);
>> +
>>  _hidden int libxl__ms_vm_genid_set(libxl__gc *gc, uint32_t domid,
>>                                     const libxl_ms_vm_genid *id);
>>
>> diff --git a/tools/libxl/libxl_numa.c b/tools/libxl/libxl_numa.c
>> index 94ca4fe..c416faf 100644
>> --- a/tools/libxl/libxl_numa.c
>> +++ b/tools/libxl/libxl_numa.c
>> @@ -19,6 +19,10 @@
>>
>>  #include "libxl_internal.h"
>>
>> +#include "libxl_vnuma.h"
>> +
>> +#include "xc_private.h"
>> +
>>  /*
>>   * What follows are helpers for generating all the k-combinations
>>   * without repetitions of a set S with n elements in it. Formally
>> @@ -508,6 +512,203 @@ int libxl__get_numa_candidate(libxl__gc *gc,
>>  }
>>
>>  /*
>> + * Check if we can fit vnuma nodes to numa pnodes
>> + * from the vnode_to_pnode array.
>> + */
>> +bool libxl__vnodemap_is_usable(libxl__gc *gc,
>> +                               libxl_domain_build_info *info)
>> +{
>> +    unsigned int i;
>> +    libxl_numainfo *ninfo = NULL;
>> +    unsigned long long *claim;
>> +    unsigned int node;
>> +    uint64_t *sz_array;
>
> I don't think you set this, so please make it const so as to make sure.
>
>> +    int nr_nodes = 0;
>> +
>> +    /* Cannot use the specified mapping if not a NUMA machine. */
>> +    ninfo = libxl_get_numainfo(CTX, &nr_nodes);
>> +    if (ninfo == NULL)
>> +        return false;
>> +
>> +    sz_array = info->vnuma_mem;
>> +    claim = libxl__calloc(gc, info->vnodes, sizeof(*claim));
>> +    /* Get the total memory required on each physical node. */
>> +    for (i = 0; i < info->vnodes; i++)
>> +    {
>> +        node = info->vnuma_vnodemap[i];
>> +
>> +        if (node < nr_nodes)
>> +            claim[node] += (sz_array[i] << 20);
>> +        else
>> +            goto vnodemapout;
>> +    }
>> +    for (i = 0; i < nr_nodes; i++) {
>> +        if (claim[i] > ninfo[i].free)
>> +            /* Cannot complete the user request, falling back to default. */
>> +            goto vnodemapout;
>> +    }
>> +
>> + vnodemapout:
>> +    return true;
>
> Apart from non-NUMA systems, you always return true. Is that really
> right? I'd expect one or the other "goto vnodemapout" to want to
> return false.
>
> "out" is OK as a label name.
>
>
>> +
>> +/*
>> + * For each node, build the memory block start and end addresses.
>> + * Subtract any memory hole from the range found in the e820 map.
>> + * vnode memory sizes are passed here in megabytes; the result is
>> + * in memory block addresses.
>> + * The Linux kernel will adjust numa memory block sizes on its own,
>> + * but we want to provide to the kernel numa block addresses that
>> + * will be the same in the kernel and the hypervisor.
>
> You shouldn't be making these sorts of guest OS assumptions. This is not
> only Linux specific but also (presumably) specific to the version of
> Linux you happened to be looking at.
>
> Please define the semantics wrt the interface and guarantees which Xen
> wants to provide to all PV guests, not just one particular kernel.
>
>> + */
>> +int libxl__vnuma_align_mem(libxl__gc *gc,
>
> Is "align" in the name here accurate? The doc comment says "build"
> instead, which suggests it is creating things (perhaps aligned as it
> goes).
>
>> +                           uint32_t domid,
>> +                           /* IN: mem sizes in megabytes */
>> +                           libxl_domain_build_info *b_info,
>
> If it is input only then please make it const.
>
>> +                           /* OUT: linux NUMA blocks addresses */
>
> Again -- Linux specific.
>
>> +    libxl_ctx *ctx = libxl__gc_owner(gc);
>
> You can use CTX instead of creating this local thing.
>
> Ian.
>

Thanks, Ian, for reviewing.
I will take a closer look at your comments tomorrow and reply.

-- 
Elena
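
For the return-value issue Ian raises, the following is a minimal sketch (not the actual follow-up patch) of a libxl__vnodemap_is_usable() that reports failure on both paths, as he suggests. It assumes the field names from the quoted patch (vnodes, vnuma_mem, vnuma_vnodemap) and the stock libxl helpers libxl_get_numainfo(), libxl__calloc() and libxl_numainfo_list_free():

/* Sketch only: follows the review comments above, not code from the series. */
bool libxl__vnodemap_is_usable(libxl__gc *gc, libxl_domain_build_info *info)
{
    libxl_numainfo *ninfo;
    uint64_t *claim;
    const uint64_t *sz_array = info->vnuma_mem;  /* per-vnode sizes, in MB */
    int nr_nodes = 0;
    unsigned int i;
    bool usable = false;

    /* Cannot use the specified mapping on a non-NUMA machine. */
    ninfo = libxl_get_numainfo(CTX, &nr_nodes);
    if (ninfo == NULL)
        return false;

    /* claim is indexed by physical node, so size it by nr_nodes. */
    claim = libxl__calloc(gc, nr_nodes, sizeof(*claim));

    /* Sum the memory each physical node would have to provide. */
    for (i = 0; i < info->vnodes; i++) {
        unsigned int pnode = info->vnuma_vnodemap[i];

        if (pnode >= nr_nodes)
            goto out;                    /* bogus mapping: unusable */
        claim[pnode] += sz_array[i] << 20;
    }

    /* Every claim must fit into that node's free memory. */
    for (i = 0; i < nr_nodes; i++)
        if (claim[i] > ninfo[i].free)
            goto out;                    /* does not fit: unusable */

    usable = true;                       /* reached only if all checks pass */

 out:
    libxl_numainfo_list_free(ninfo, nr_nodes);
    return usable;
}

On the x86/ARM question, one possible (purely hypothetical) shape is to keep the e820-based range building in libxl_x86.c and give libxl_arm.c a stub along these lines, since ARM guests have no e820 map to derive ranges from:

/* Hypothetical ARM stub, not from the posted series. */
int libxl__vnuma_align_mem(libxl__gc *gc, uint32_t domid,
                           const libxl_domain_build_info *b_info,
                           vmemrange_t *memblks)
{
    /* vNUMA range building from e820 does not apply on ARM. */
    return ERROR_NI;
}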