From: Chris Wright <chrisw@sous-sol.org>
To: Alexander Graf <agraf@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	kvm list <kvm@vger.kernel.org>,
	bharata@linux.vnet.ibm.com,
	qemu-devel Developers <qemu-devel@nongnu.org>,
	dipankar@in.ibm.com, Vaidyanathan S <svaidy@in.ibm.com>
Subject: Re: [RFC PATCH] Exporting Guest RAM information for NUMA binding
Date: Tue, 8 Nov 2011 09:33:04 -0800	[thread overview]
Message-ID: <20111108173304.GA14486@sequoia.sous-sol.org> (raw)
In-Reply-To: <7816C401-9BE5-48A9-8BA9-4CDAD1B39FC8@suse.de>

* Alexander Graf (agraf@suse.de) wrote:
> On 29.10.2011, at 20:45, Bharata B Rao wrote:
> > As guests become NUMA aware, it becomes important for the guests to
> > have correct NUMA policies when they run on NUMA aware hosts.
> > Currently limited support for NUMA binding is available via libvirt
> > where it is possible to apply a NUMA policy to the guest as a whole.
> > However, multinode guests would benefit if guest memory belonging to
> > different guest nodes is mapped appropriately to different host NUMA nodes.
> > 
> > To achieve this we would need QEMU to expose information about
> > guest RAM ranges (Guest Physical Address - GPA) and their host virtual
> > address mappings (Host Virtual Address - HVA). Using GPA and HVA, any external
> > tool like libvirt would be able to divide the guest RAM as per the guest NUMA
> > node geometry and bind guest memory nodes to corresponding host memory nodes
> > using HVA. This needs both QEMU (and libvirt) changes as well as changes
> > in the kernel.
> 
> Ok, let's take a step back here. You are basically growing libvirt into a memory resource manager that knows how much memory is available on which nodes and how these nodes would possibly fit into the host's memory layout.
> 
> Shouldn't that be the kernel's job? It seems to me that architecturally the kernel is the place I would want my memory resource controls to be in.

I think that both Peter and Andrea are looking at this.  Before we commit
an API to QEMU that has different semantics from a possible new kernel
interface (one that QEMU could perhaps use directly to inform the kernel of
the binding/relationship between a vcpu thread and its memory at VM startup),
it would be useful to see what these guys are working on...

thanks,
-chris

Thread overview: 56+ messages
2011-10-29 18:45 [Qemu-devel] [RFC PATCH] Exporting Guest RAM information for NUMA binding Bharata B Rao
2011-10-29 19:57 ` Alexander Graf
2011-10-30  9:32   ` Vaidyanathan Srinivasan
2011-11-08 17:33   ` Chris Wright [this message]
2011-11-21 15:18     ` Bharata B Rao
2011-11-21 15:25       ` Peter Zijlstra
2011-11-21 16:00         ` Bharata B Rao
2011-11-21 17:03           ` Peter Zijlstra
2011-11-21 22:50             ` Chris Wright
2011-11-22  1:57               ` Anthony Liguori
2011-11-22  1:51             ` Anthony Liguori
2011-11-23 15:03               ` Andrea Arcangeli
2011-11-23 18:34                 ` Alexander Graf
2011-11-23 20:19                   ` Andrea Arcangeli
2011-11-30 16:22                   ` Dipankar Sarma
2011-11-30 16:25                     ` Peter Zijlstra
2011-11-30 16:33                       ` Chris Wright
2011-11-30 17:41                     ` Andrea Arcangeli
2011-12-01 17:25                       ` Dipankar Sarma
2011-12-01 17:36                         ` Andrea Arcangeli
2011-12-01 17:49                           ` Dipankar Sarma
2011-12-01 17:40                 ` Peter Zijlstra
2011-12-22 11:01                   ` Marcelo Tosatti
2011-12-22 17:13                     ` Anthony Liguori
2011-12-22 17:55                       ` Marcelo Tosatti
2011-12-22 19:04                     ` Peter Zijlstra
2011-12-22 11:24                   ` Marcelo Tosatti
2011-11-21 18:03         ` Avi Kivity
2011-11-21 19:31           ` Peter Zijlstra
