From: Igor Mammedov <imammedo@redhat.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: qemu-devel@nongnu.org, ehabkost@redhat.com,
	libvir-list@redhat.com, eblake@redhat.com,
	Markus Armbruster <armbru@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] numa: warn if numa 'mem' option or default RAM splitting between nodes is used.
Date: Thu, 7 Mar 2019 16:55:16 +0100	[thread overview]
Message-ID: <20190307165516.5fe005e5@redhat.com> (raw)
In-Reply-To: <20190307100456.GG32268@redhat.com>

On Thu, 7 Mar 2019 10:04:56 +0000
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Wed, Mar 06, 2019 at 07:48:22PM +0100, Igor Mammedov wrote:
> > On Wed, 6 Mar 2019 17:10:37 +0000
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> >   
> > > On Wed, Mar 06, 2019 at 05:58:35PM +0100, Igor Mammedov wrote:  
> > > > On Wed, 6 Mar 2019 16:39:38 +0000
> > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > >   
> > > > > On Wed, Mar 06, 2019 at 05:30:25PM +0100, Igor Mammedov wrote:  
> > > > > > Amend the -numa option docs and print warnings if the 'mem' option or default
> > > > > > RAM splitting between nodes is used. The intent is to discourage users from
> > > > > > using a configuration that merely fakes NUMA on the guest side while degrading
> > > > > > guest performance, because it makes it impossible to properly configure the
> > > > > > VM's RAM on the host.
> > > > > > 
> > > > > > In the NUMA case, it is recommended to always configure guest RAM explicitly,
> > > > > > using the -numa node,memdev={backend-id} option.
> > > > > > 
> > > > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > > > > > ---
> > > > > >  numa.c          |  5 +++++
> > > > > >  qemu-options.hx | 12 ++++++++----
> > > > > >  2 files changed, 13 insertions(+), 4 deletions(-)
> > > > > > 
> > > > > > diff --git a/numa.c b/numa.c
> > > > > > index 3875e1e..c6c2a6f 100644
> > > > > > --- a/numa.c
> > > > > > +++ b/numa.c
> > > > > > @@ -121,6 +121,8 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
> > > > > >  
> > > > > >      if (node->has_mem) {
> > > > > >          numa_info[nodenr].node_mem = node->mem;
> > > > > > +        warn_report("Parameter -numa node,mem is obsolete,"
> > > > > > +                    " use -numa node,memdev instead");  
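
For the record, the recommended form is something along these lines (backend ids,
sizes and cpu ranges below are just examples, not taken from the patch):

    -object memory-backend-ram,id=ram-node0,size=2G \
    -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 \
    -object memory-backend-ram,id=ram-node1,size=2G \
    -numa node,nodeid=1,cpus=2-3,memdev=ram-node1

as opposed to the legacy '-numa node,mem=2G' form, which only splits the size
number and leaves no per-node host-side backend to configure.
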
> > > > > 
> > > > > I don't think we should do this. Libvirt isn't going to stop using this
> > > > > option in the near term. When users see warnings like this in logs  
> > > > Well, when it was the only option available libvirt had no other choice,
> > > > but since memdev became available libvirt should try to use it whenever
> > > > possible.  
> > > 
> > > As we previously discussed, it is not possible for libvirt to use it
> > > in all cases.
> > >   
> > > >   
> > > > > they'll often file bug reports thinking something is broken, which is
> > > > > not the case here.   
> > > > That's the exact purpose of the warning: to prompt users to ask questions
> > > > and fix their configuration, since they are obviously not getting the NUMA
> > > > benefits and/or the performance ones.  
> > > 
> > > That's only useful if it is possible to do something about the problem.
> > > Libvirt wants to use the new option but it can't due to the live migration
> > > problems. So this simply leads to bug reports that will end up marked
> > > as CANTFIX.  
> > The problem could be solved by the user, though, by reconfiguring and restarting
> > the domain, since it's impossible to do otherwise (at least as it stands now
> > wrt migration).
> >   
> > > I don't believe libvirt actually  suffers from the performance problem
> > > you describe wrt lack of pinning.   When we attempt to pin guest NUMA
> > > nodes to host NUMA nodes, libvirt *will* use "memdev". IIUC, we
> > > use "mem" in the case where there is /no/ requested pinning of guest
> > > NUMA nodes, and so we're not suffering from the limitations of "mem"
> > > in that case.  
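
(For what it's worth, in the pinned case libvirt ends up generating a QEMU
command line roughly along these lines; ids and sizes here are illustrative only:

    -object memory-backend-ram,id=ram-node0,size=2G,host-nodes=0,policy=bind \
    -numa node,nodeid=0,cpus=0-1,memdev=ram-node0

i.e. the host binding lives on the backend object, which is exactly what plain
'mem' cannot express.)
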
> > What would be the use case for not pinning NUMA nodes?
> > If the user isn't asking for pinning, the VM would run with degraded performance,
> > and it would be better off being non-NUMA.  
> 
> The guest could have been originally booted on a host which has 2 NUMA
> nodes and have been migrated to a host with 1 NUMA node, in which case
> pinnning is not relevant.
> 
> For CI purposes too it is reasonable to create guests with NUMA configurations
> that bear no resemblance to the host NUMA configuration. This allows for testing
> the operation of guest applications. This is something that is relevant to
> OpenStack for testing Nova's handling of NUMA placement logic, since almost
> all their testing is done in VMs, not on bare metal.
>
> Not pinning isn't common, but it is reasonable to do it.

 
 
> > Even if the user doesn't ask for pinning, a non-pinned memdev could be used as
> > well (for new VMs, where available). Users of 'mem' with new QEMU will see the
> > warning and probably think about whether they have configured the VM correctly.  
> 
> We can't change to memdev due to migration problems. Since there is no
> functional problem with using 'mem' in this scenario there's no pressing
> reason to stop using 'mem'.
The problem with it is that it blocks fixing a QEMU bug for the multi-node case
 '-numa node,mem + -mem-path + -mem-prealloc + -object "memdev",policy=bind'
and it also gets in the way of unifying RAM handling (using device-memory for
initial RAM), due to the same migration issue: the RAM layout in the migration
stream is different.
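
To make the contrast concrete, a rough sketch of the two styles (sizes and paths
are placeholders, not the exact buggy combination above): the legacy
global-allocation form is

    -m 4G -mem-path /dev/hugepages -mem-prealloc \
    -numa node,nodeid=0,mem=2G -numa node,nodeid=1,mem=2G

whereas the per-node memdev form that actually lets host placement be controlled is

    -object memory-backend-file,id=ram-node0,size=2G,mem-path=/dev/hugepages,prealloc=on,host-nodes=0,policy=bind \
    -numa node,nodeid=0,memdev=ram-node0 \
    -object memory-backend-file,id=ram-node1,size=2G,mem-path=/dev/hugepages,prealloc=on,host-nodes=1,policy=bind \
    -numa node,nodeid=1,memdev=ram-node1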

> > The desire to deprecate 'mem', at least for new machine types, is to be able to
> > prevent users from creating broken configs and to manage all RAM in the same
> > manner (frontend/backend). (I can't do it for old machine types, but for new
> > ones it is possible.)  
> 
> As before, libvirt can't make its CLI config dependent on the machine type
> version, as that breaks live migration to old libvirt versions.
> 
> Regards,
> Daniel

