* [Qemu-devel] [PATCH 0/2] numa: deprecate -numa node, mem and default memory distribution
@ 2019-03-01 15:42 Igor Mammedov
  2019-03-01 15:42 ` [Qemu-devel] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option Igor Mammedov
  2019-03-01 15:42 ` [Qemu-devel] [PATCH 2/2] numa: deprecate implict memory distribution between nodes Igor Mammedov
  0 siblings, 2 replies; 37+ messages in thread
From: Igor Mammedov @ 2019-03-01 15:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, libvir-list, pbonzini, peter.maydell, david, qemu-ppc,
	qemu-arm


 1) "I'm considering deprecating the -mem-path/prealloc CLI options and replacing
them with a single memdev Machine property, to let interested users pick the
backend used for initial RAM (fixes mixed -mem-path+hostmem backend issues)
and as a transition step to modeling initial RAM as a Device instead of (ab)using
MemoryRegion APIs."
(for more details see: https://www.mail-archive.com/qemu-devel@nongnu.org/msg596314.html)

However, there are a couple of roadblocks on the way (s390x and NUMA memory handling).
I think I have finally worked out a way to hack s390x in a migration-compatible manner,
but I don't see a way to do it for -numa node,mem and the default RAM assignment to
nodes. Considering that neither NUMA use case meaningfully uses NUMA (aside from
guest-side testing), and both could be replaced with an explicit memdev parameter, I'd
like to propose removing these fake NUMA friends, hence this deprecation.

As a result of removing the deprecated options and replacing initial RAM allocation
with 'memdev's (1), QEMU will allocate guest RAM in a consistent way, fixing the mixed
use case and allowing boards to move towards modelling initial RAM as Device(s).
That in turn should allow further cleanup of the NUMA/HMP/memory accounting code
by dropping ad-hoc node_mem tracking and reusing device enumeration instead.

Igor Mammedov (2):
  numa: deprecate 'mem' parameter of '-numa node' option
  numa: deprecate implict memory distribution between nodes

 numa.c               |  5 +++++
 qemu-deprecated.texi | 21 +++++++++++++++++++++
 2 files changed, 26 insertions(+)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [Qemu-devel] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-01 15:42 [Qemu-devel] [PATCH 0/2] numa: deprecate -numa node, mem and default memory distribution Igor Mammedov
@ 2019-03-01 15:42 ` Igor Mammedov
  2019-03-01 15:49   ` [Qemu-devel] [libvirt] " Daniel P. Berrangé
  2019-03-01 15:42 ` [Qemu-devel] [PATCH 2/2] numa: deprecate implict memory distribution between nodes Igor Mammedov
  1 sibling, 1 reply; 37+ messages in thread
From: Igor Mammedov @ 2019-03-01 15:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, libvir-list, pbonzini, peter.maydell, david, qemu-ppc,
	qemu-arm

The parameter allows configuring a fake NUMA topology, where the guest
VM simulates a NUMA topology but does not actually get any performance
benefit from it. The same or better results can be achieved
using the 'memdev' parameter. In light of that, any VM that uses NUMA
to get its benefits should use 'memdev'. To allow transitioning
initial RAM to a device-based model, deprecate the 'mem' parameter, as
its ad-hoc partitioning of the initial RAM MemoryRegion can't be
translated to a memdev-based backend transparently to users and in a
compatible manner (migration-wise).

That will also allow us to clean up our NUMA code a bit, leaving only
the 'memdev' implementation in place, plus several boards that use node_mem
to generate the FDT/ACPI description from it.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 numa.c               |  2 ++
 qemu-deprecated.texi | 14 ++++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/numa.c b/numa.c
index 3875e1e..2205773 100644
--- a/numa.c
+++ b/numa.c
@@ -121,6 +121,8 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
 
     if (node->has_mem) {
         numa_info[nodenr].node_mem = node->mem;
+        warn_report("Parameter -numa node,mem is deprecated,"
+                    " use -numa node,memdev instead");
     }
     if (node->has_memdev) {
         Object *o;
diff --git a/qemu-deprecated.texi b/qemu-deprecated.texi
index 45c5795..73f99d4 100644
--- a/qemu-deprecated.texi
+++ b/qemu-deprecated.texi
@@ -60,6 +60,20 @@ Support for invalid topologies will be removed, the user must ensure
 topologies described with -smp include all possible cpus, i.e.
   @math{@var{sockets} * @var{cores} * @var{threads} = @var{maxcpus}}.
 
+@subsection -numa node,mem=@var{size} (since 4.0)
+
+The parameter @option{mem} of @option{-numa node} is used to assign a part of
+guest RAM to a NUMA node. But when using it, it's impossible to manage the specified
+size on the host side (e.g. bind it to a host node, set a bind policy, ...),
+so the guest ends up with a fake NUMA configuration with suboptimal performance.
+However, since 2014 there has been an alternative way to assign RAM to a NUMA node
+using the parameter @option{memdev}, which does the same as @option{mem} and is
+able to actually manage node RAM on the host side. Use the parameter
+@option{memdev} with the @var{memory-backend-ram} backend as a replacement for
+parameter @option{mem} to achieve the same fake NUMA effect or a properly
+configured @var{memory-backend-file} backend to actually benefit from NUMA
+configuration.
+
 @section QEMU Machine Protocol (QMP) commands
 
 @subsection block-dirty-bitmap-add "autoload" parameter (since 2.12.0)
-- 
2.7.4
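
For illustration only, here is a minimal sketch of the replacement that the
documentation hunk above describes: one '-numa node,mem' entry becomes an
'-object memory-backend-ram' plus a '-numa node,memdev' pair. The helper name
format_memdev_args, the 'ram-node%d' id scheme and the choice of MiB units are
assumptions made for this example and are not part of the patch; only the
option names themselves are existing QEMU syntax.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of QEMU): format the memdev-based
 * equivalent of "-numa node,nodeid=N,mem=SIZE" into buf.
 * Returns the number of characters written, or -1 if buf is too small.
 */
int format_memdev_args(char *buf, size_t len, int nodeid,
                       unsigned long size_mib)
{
    int n;

    assert(buf != NULL && nodeid >= 0);
    /* The backend id "ram-node<N>" is an arbitrary naming choice. */
    n = snprintf(buf, len,
                 "-object memory-backend-ram,id=ram-node%d,size=%luM "
                 "-numa node,nodeid=%d,memdev=ram-node%d",
                 nodeid, size_mib, nodeid, nodeid);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

Calling format_memdev_args(buf, sizeof(buf), 0, 2048) fills buf with the
options for a 2048 MiB node 0. Host-side binding properties would go on the
'-object' part and are left out here.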

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [PATCH 2/2] numa: deprecate implict memory distribution between nodes
  2019-03-01 15:42 [Qemu-devel] [PATCH 0/2] numa: deprecate -numa node, mem and default memory distribution Igor Mammedov
  2019-03-01 15:42 ` [Qemu-devel] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option Igor Mammedov
@ 2019-03-01 15:42 ` Igor Mammedov
  1 sibling, 0 replies; 37+ messages in thread
From: Igor Mammedov @ 2019-03-01 15:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, libvir-list, pbonzini, peter.maydell, david, qemu-ppc,
	qemu-arm

Implicit RAM distribution between nodes has exactly the same issues as:
  "numa: deprecate 'mem' parameter of '-numa node' option"
only with QEMU being the user that's 'adding' the 'mem' parameter.

Deprecate it, to get it out of the way so that we can switch to
consistent guest RAM allocation using memory backends and possibly
memory devices later on top of that.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 numa.c               | 3 +++
 qemu-deprecated.texi | 7 +++++++
 2 files changed, 10 insertions(+)

diff --git a/numa.c b/numa.c
index 2205773..6d45a1f 100644
--- a/numa.c
+++ b/numa.c
@@ -409,6 +409,9 @@ void numa_complete_configuration(MachineState *ms)
         if (i == nb_numa_nodes) {
             assert(mc->numa_auto_assign_ram);
             mc->numa_auto_assign_ram(mc, numa_info, nb_numa_nodes, ram_size);
+            warn_report("Default splitting of RAM between nodes is deprecated,"
+                        " use '-numa node,memdev' to explicitly define RAM"
+                        " allocation per node");
         }
 
         numa_total = 0;
diff --git a/qemu-deprecated.texi b/qemu-deprecated.texi
index 73f99d4..09bec7d 100644
--- a/qemu-deprecated.texi
+++ b/qemu-deprecated.texi
@@ -74,6 +74,13 @@ parameter @option{mem} to achieve the same fake NUMA effect or a properly
 configured @var{memory-backend-file} backend to actually benefit from NUMA
 configuration.
 
+@subsection -numa node (without memory specified) (since 4.0)
+
+Splitting RAM by default between NUMA nodes has the same issues as the @option{mem}
+parameter described above, with the difference that the role of the user is played
+by QEMU, using a generic splitting rule or a board-specific one. Use @option{memdev}
+with the @var{memory-backend-ram} backend to define the mapping explicitly instead.
+
 @section QEMU Machine Protocol (QMP) commands
 
 @subsection block-dirty-bitmap-add "autoload" parameter (since 2.12.0)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-01 15:42 ` [Qemu-devel] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option Igor Mammedov
@ 2019-03-01 15:49   ` Daniel P. Berrangé
  2019-03-01 17:33     ` Igor Mammedov
  0 siblings, 1 reply; 37+ messages in thread
From: Daniel P. Berrangé @ 2019-03-01 15:49 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: qemu-devel, peter.maydell, ehabkost, libvir-list, qemu-arm,
	qemu-ppc, pbonzini, david

On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:
> The parameter allows configuring a fake NUMA topology, where the guest
> VM simulates a NUMA topology but does not actually get any performance
> benefit from it. The same or better results can be achieved
> using the 'memdev' parameter. In light of that, any VM that uses NUMA
> to get its benefits should use 'memdev'. To allow transitioning
> initial RAM to a device-based model, deprecate the 'mem' parameter, as
> its ad-hoc partitioning of the initial RAM MemoryRegion can't be
> translated to a memdev-based backend transparently to users and in a
> compatible manner (migration-wise).
> 
> That will also allow us to clean up our NUMA code a bit, leaving only
> the 'memdev' implementation in place, plus several boards that use node_mem
> to generate the FDT/ACPI description from it.

Can you confirm that the  'mem' and 'memdev' parameters to -numa
are 100% live migration compatible in both directions ?  Libvirt
would need this to be the case in order to use the 'memdev' syntax
instead.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-01 15:49   ` [Qemu-devel] [libvirt] " Daniel P. Berrangé
@ 2019-03-01 17:33     ` Igor Mammedov
  2019-03-01 17:48       ` Daniel P. Berrangé
  2019-03-01 18:01       ` [Qemu-devel] " Dr. David Alan Gilbert
  0 siblings, 2 replies; 37+ messages in thread
From: Igor Mammedov @ 2019-03-01 17:33 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: peter.maydell, ehabkost, libvir-list, qemu-devel, qemu-arm,
	qemu-ppc, pbonzini, david, Dr. David Alan Gilbert

On Fri, 1 Mar 2019 15:49:47 +0000
Daniel P. Berrangé <berrange@redhat.com> wrote:

> Can you confirm that the  'mem' and 'memdev' parameters to -numa
> are 100% live migration compatible in both directions ?  Libvirt
> would need this to be the case in order to use the 'memdev' syntax
> instead.
Unfortunately they are not migration compatible in either direction;
if it were possible to translate them to each other, I'd alias 'mem'
to 'memdev' without deprecation. The former sends only one
MemoryRegion over to the target, while the latter sends several (one per
memdev).

The mixed memory issue[1] first came from the libvirt side (RHBZ1624223);
back then it was resolved on the libvirt side in favor of migration
compatibility over correctness (i.e. bind policy doesn't work as expected).
What's worse is that it was made the default and affects all new machines,
as I understood it.

In the case of -mem-path + -mem-prealloc (with 1 NUMA node or without NUMA)
it's possible on the QEMU side to convert to memdev in a migration-compatible
way (that's what stopped Michal from taking the memdev approach).
But it's hard to do so in the multi-node case, as the number of MemoryRegions
is different.

The point is to treat 'mem' as a misconfiguration error, as the user
is using a broken NUMA configuration in the first place
(i.e. a fake NUMA configuration doesn't actually improve performance).

CCed David, maybe he could offer a way to do 1:n migration and the other
way around.
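
As a simplified model of the mismatch described above (an illustration only,
not QEMU's migration code), the number of RAM regions carried for initial RAM
can be sketched as a function of the configuration style. The enum and both
function names are hypothetical, and a matching count is only a necessary
condition in this model, not a guarantee of compatibility.

```c
#include <assert.h>
#include <stdbool.h>

/* How initial RAM is assigned to NUMA nodes on the command line. */
enum numa_ram_style {
    NUMA_RAM_MEM,     /* -numa node,mem=...    */
    NUMA_RAM_MEMDEV   /* -numa node,memdev=... */
};

/*
 * Simplified model: 'mem' carves nodes out of one MemoryRegion, while
 * 'memdev' uses one backend (and so one region) per node.
 */
int initial_ram_regions(enum numa_ram_style style, int nb_nodes)
{
    assert(nb_nodes > 0);
    return style == NUMA_RAM_MEM ? 1 : nb_nodes;
}

/* Necessary (not sufficient) condition: both sides expect the same count. */
bool region_count_matches(enum numa_ram_style src, enum numa_ram_style dst,
                          int nb_nodes)
{
    return initial_ram_regions(src, nb_nodes) ==
           initial_ram_regions(dst, nb_nodes);
}
```

In this model the counts only line up for a single node, which fits the
single-node case above being convertible while the multi-node case is not.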



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-01 17:33     ` Igor Mammedov
@ 2019-03-01 17:48       ` Daniel P. Berrangé
  2019-03-04  7:13         ` Markus Armbruster
  2019-03-04  8:11         ` [Qemu-devel] [Qemu-ppc] " Thomas Huth
  2019-03-01 18:01       ` [Qemu-devel] " Dr. David Alan Gilbert
  1 sibling, 2 replies; 37+ messages in thread
From: Daniel P. Berrangé @ 2019-03-01 17:48 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: peter.maydell, ehabkost, libvir-list, qemu-devel, qemu-arm,
	qemu-ppc, pbonzini, david, Dr. David Alan Gilbert

On Fri, Mar 01, 2019 at 06:33:28PM +0100, Igor Mammedov wrote:
> On Fri, 1 Mar 2019 15:49:47 +0000
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > Can you confirm that the  'mem' and 'memdev' parameters to -numa
> > are 100% live migration compatible in both directions ?  Libvirt
> > would need this to be the case in order to use the 'memdev' syntax
> > instead.
> Unfortunately they are not migration compatible in either direction;
> if it were possible to translate them to each other, I'd alias 'mem'
> to 'memdev' without deprecation. The former sends only one
> MemoryRegion over to the target, while the latter sends several (one per
> memdev).

If we can't migrate from one to the other, then we can not deprecate
the existing 'mem' syntax. Even if libvirt were to provide a config
option to let apps opt-in to the new syntax, we need to be able to
support live migration of existing running VMs indefinitely. Effectively
this means we need to keep 'mem' support forever, or at least for such
a long time that it effectively means forever.

So I think this patch has to be dropped & replaced with one that
simply documents that memdev syntax is preferred.

Regards,
Daniel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-01 17:33     ` Igor Mammedov
  2019-03-01 17:48       ` Daniel P. Berrangé
@ 2019-03-01 18:01       ` Dr. David Alan Gilbert
  2019-03-04 13:52         ` Igor Mammedov
  1 sibling, 1 reply; 37+ messages in thread
From: Dr. David Alan Gilbert @ 2019-03-01 18:01 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Daniel P. Berrangé,
	peter.maydell, ehabkost, libvir-list, qemu-devel, qemu-arm,
	qemu-ppc, pbonzini, david

* Igor Mammedov (imammedo@redhat.com) wrote:
> On Fri, 1 Mar 2019 15:49:47 +0000
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > Can you confirm that the  'mem' and 'memdev' parameters to -numa
> > are 100% live migration compatible in both directions ?  Libvirt
> > would need this to be the case in order to use the 'memdev' syntax
> > instead.
> Unfortunately they are not migration compatible in either direction;
> if it were possible to translate them to each other, I'd alias 'mem'
> to 'memdev' without deprecation. The former sends only one
> MemoryRegion over to the target, while the latter sends several (one per
> memdev).
> 
> The mixed memory issue[1] first came from the libvirt side (RHBZ1624223);
> back then it was resolved on the libvirt side in favor of migration
> compatibility over correctness (i.e. bind policy doesn't work as expected).
> What's worse is that it was made the default and affects all new machines,
> as I understood it.
> 
> In the case of -mem-path + -mem-prealloc (with 1 NUMA node or without NUMA)
> it's possible on the QEMU side to convert to memdev in a migration-compatible
> way (that's what stopped Michal from taking the memdev approach).
> But it's hard to do so in the multi-node case, as the number of MemoryRegions
> is different.
> 
> The point is to treat 'mem' as a misconfiguration error, as the user
> is using a broken NUMA configuration in the first place
> (i.e. a fake NUMA configuration doesn't actually improve performance).
> 
> CCed David, maybe he could offer a way to do 1:n migration and the other
> way around.

I can't see a trivial way.
About the easiest I can think of is if you had a way to create a memdev
that was an alias to pc.ram (of a particular size and offset).

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-01 17:48       ` Daniel P. Berrangé
@ 2019-03-04  7:13         ` Markus Armbruster
  2019-03-04 10:19           ` Daniel P. Berrangé
  2019-03-04 12:25           ` Igor Mammedov
  2019-03-04  8:11         ` [Qemu-devel] [Qemu-ppc] " Thomas Huth
  1 sibling, 2 replies; 37+ messages in thread
From: Markus Armbruster @ 2019-03-04  7:13 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Igor Mammedov, peter.maydell, ehabkost, libvir-list, qemu-devel,
	Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini, david

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Fri, Mar 01, 2019 at 06:33:28PM +0100, Igor Mammedov wrote:
>> On Fri, 1 Mar 2019 15:49:47 +0000
>> Daniel P. Berrangé <berrange@redhat.com> wrote:
>> 
>> > Can you confirm that the  'mem' and 'memdev' parameters to -numa
>> > are 100% live migration compatible in both directions ?  Libvirt
>> > would need this to be the case in order to use the 'memdev' syntax
>> > instead.
>> Unfortunately they are not migration compatible in either direction;
>> if it were possible to translate them to each other, I'd alias 'mem'
>> to 'memdev' without deprecation. The former sends only one
>> MemoryRegion over to the target, while the latter sends several (one per
>> memdev).
>
> If we can't migrate from one to the other, then we can not deprecate
> the existing 'mem' syntax. Even if libvirt were to provide a config
> option to let apps opt-in to the new syntax, we need to be able to
> support live migration of existing running VMs indefinitely. Effectively
> this means we need to keep 'mem' support forever, or at least for such
> a long time that it effectively means forever.
>
> So I think this patch has to be dropped & replaced with one that
> simply documents that memdev syntax is preferred.

We have this habit of postulating absolutes like "can not deprecate"
instead of engaging with the tradeoffs.  We need to kick it.

So let's have an actual look at the tradeoffs.

We don't actually "support live migration of existing running VMs
indefinitely".

We support live migration to any newer version of QEMU that still
supports the machine type.

We support live migration to any older version of QEMU that already
supports the machine type and all the devices the machine uses.

Aside: "support" is really an honest best effort here.  If you rely on
it, use a downstream that puts in the (substantial!) QA work real
support takes.

Feature deprecation is not a contract to drop the feature after two
releases, or even five.  It's a formal notice that users of the feature
should transition to its replacement in an orderly manner.

If I understand Igor correctly, all users should transition away from
outdated NUMA configurations at least for new VMs in an orderly manner.

So, how could this formal notice be served constructively?

If we reject outdated NUMA configurations starting with machine type T,
we can remove the means to create those configurations along with
machine type T-1.  Won't happen anytime soon, will happen eventually,
because in the long run, all machine types are dead (apologies to
Keynes).

If we deprecate outdated NUMA configurations now, we can start rejecting
them with new machine types after a suitable grace period.
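
One way to picture the gating described above is a per-machine-type version
check. This is a sketch under assumptions only: the function name and the idea
of comparing major/minor numbers against a cutoff "machine type T" are
hypothetical, and nothing in this thread settles how (or whether) such a check
would be implemented.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical policy check: legacy NUMA RAM configuration ('mem' or
 * implicit splitting) stays available for machine types older than the
 * cutoff version T and is rejected from T onwards.
 */
bool legacy_numa_mem_allowed(int major, int minor,
                             int cutoff_major, int cutoff_minor)
{
    assert(major >= 0 && minor >= 0);
    if (major != cutoff_major) {
        return major < cutoff_major;
    }
    return minor < cutoff_minor;
}
```

Existing VMs on older machine types keep working (and stay migratable) under
such a rule, and the legacy code can go once the last machine type older than
the cutoff is itself removed.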

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-01 17:48       ` Daniel P. Berrangé
  2019-03-04  7:13         ` Markus Armbruster
@ 2019-03-04  8:11         ` Thomas Huth
  2019-03-04 13:55           ` Igor Mammedov
  1 sibling, 1 reply; 37+ messages in thread
From: Thomas Huth @ 2019-03-04  8:11 UTC (permalink / raw)
  To: Daniel P. Berrangé, Igor Mammedov
  Cc: peter.maydell, ehabkost, libvir-list, qemu-devel,
	Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini, david

On 01/03/2019 18.48, Daniel P. Berrangé wrote:
[...]
> So I think this patch has to be dropped & replaced with one that
> simply documents that memdev syntax is preferred.

That's definitely not enough. I've had a couple of cases already where
we documented that certain options should not be used anymore, and
people simply ignored it (aka. if it ain't broken, don't do any change).
Then they just started to complain when I really tried to remove the
option after the deprecation period.

So Igor, if you can not officially deprecate these things here yet, you
should at least make sure that they can not be used with new machine
types anymore. Then, after a couple of years, when we feel sure that
there are only a few or no people left who still use it with the old
machine types, we can start to discuss the deprecation process again, I
think.

 Thomas

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04  7:13         ` Markus Armbruster
@ 2019-03-04 10:19           ` Daniel P. Berrangé
  2019-03-04 11:45             ` Markus Armbruster
  2019-03-04 14:24             ` Michal Privoznik
  2019-03-04 12:25           ` Igor Mammedov
  1 sibling, 2 replies; 37+ messages in thread
From: Daniel P. Berrangé @ 2019-03-04 10:19 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Igor Mammedov, peter.maydell, ehabkost, libvir-list, qemu-devel,
	Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini, david

On Mon, Mar 04, 2019 at 08:13:53AM +0100, Markus Armbruster wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
> > On Fri, Mar 01, 2019 at 06:33:28PM +0100, Igor Mammedov wrote:
> >> On Fri, 1 Mar 2019 15:49:47 +0000
> >> Daniel P. Berrangé <berrange@redhat.com> wrote:
> >> 
> >> > On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:
> >> > > The parameter allows to configure fake NUMA topology where guest
> >> > > VM simulates NUMA topology but not actually getting a performance
> >> > > benefits from it. The same or better results could be achieved
> >> > > using 'memdev' parameter. In light of that any VM that uses NUMA
> >> > > to get its benefits should use 'memdev' and to allow transition
> >> > > initial RAM to device based model, deprecate 'mem' parameter as
> >> > > its ad-hoc partitioning of initial RAM MemoryRegion can't be
> >> > > translated to memdev based backend transparently to users and in
> >> > > compatible manner (migration wise).
> >> > > 
> >> > > That will also allow to clean up a bit our numa code, leaving only
> >> > > 'memdev' impl. in place and several boards that use node_mem
> >> > > to generate FDT/ACPI description from it.  
> >> > 
> >> > Can you confirm that the  'mem' and 'memdev' parameters to -numa
> >> > are 100% live migration compatible in both directions ?  Libvirt
> >> > would need this to be the case in order to use the 'memdev' syntax
> >> > instead.
> >> Unfortunately they are not migration compatible in any direction,
> >> if it where possible to translate them to each other I'd alias 'mem'
> >> to 'memdev' without deprecation. The former sends over only one
> >> MemoryRegion to target, while the later sends over several (one per
> >> memdev).
> >
> > If we can't migration from one to the other, then we can not deprecate
> > the existing 'mem' syntax. Even if libvirt were to provide a config
> > option to let apps opt-in to the new syntax, we need to be able to
> > support live migration of existing running VMs indefinitely. Effectively
> > this means we need the to keep 'mem' support forever, or at least such
> > a long time that it effectively means forever.
> >
> > So I think this patch has to be dropped & replaced with one that
> > simply documents that memdev syntax is preferred.
> 
> We have this habit of postulating absolutes like "can not deprecate"
> instead of engaging with the tradeoffs.  We need to kick it.
> 
> So let's have an actual look at the tradeoffs.
> 
> We don't actually "support live migration of existing running VMs
> indefinitely".
>
> We support live migration to any newer version of QEMU that still
> supports the machine type.
> 
> We support live migration to any older version of QEMU that already
> supports the machine type and all the devices the machine uses.
> 
> Aside: "support" is really an honest best effort here.  If you rely on
> it, use a downstream that puts in the (substantial!) QA work real
> support takes.

If upstream deletes the feature, then that in turn breaks the downstream
unless downstream reverts the upstream change. When there is a large overlap
between downstream & upstream maintainers, it is not beneficial to delete
the feature upstream, as any effort saved upstream usually expands into
larger effort downstream.

> Feature deprecation is not a contract to drop the feature after two
> releases, or even five.  It's a formal notice that users of the feature
> should transition to its replacement in an orderly manner.
> 
> If I understand Igor correctly, all users should transition away from
> outdated NUMA configurations at least for new VMs in an orderly manner.
> 
> So, how could this formal notice be served constructively?
> 
> If we reject outdated NUMA configurations starting with machine type T,
> we can remove the means to create those configurations along with
> machine type T-1.  Won't happen anytime soon, will happen eventually,
> because in the long run, all machine types are dead (apologies to
> Keynes).
> 
> If we deprecate outdated NUMA configurations now, we can start rejecting
> them with new machine types after a suitable grace period.

How is libvirt going to know what machines it can use with the feature ?
We don't have any way to introspect machine type specific logic, since we
run all probing with "-machine none", and QEMU can't report anything about
machines without instantiating them.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 10:19           ` Daniel P. Berrangé
@ 2019-03-04 11:45             ` Markus Armbruster
  2019-03-04 15:28               ` Daniel P. Berrangé
  2019-03-04 14:24             ` Michal Privoznik
  1 sibling, 1 reply; 37+ messages in thread
From: Markus Armbruster @ 2019-03-04 11:45 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: peter.maydell, ehabkost, libvir-list, qemu-devel,
	Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini,
	Igor Mammedov, david

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Mon, Mar 04, 2019 at 08:13:53AM +0100, Markus Armbruster wrote:
>> Daniel P. Berrangé <berrange@redhat.com> writes:
>> 
>> > On Fri, Mar 01, 2019 at 06:33:28PM +0100, Igor Mammedov wrote:
>> >> On Fri, 1 Mar 2019 15:49:47 +0000
>> >> Daniel P. Berrangé <berrange@redhat.com> wrote:
>> >> 
>> >> > On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:
>> >> > > The parameter allows to configure fake NUMA topology where guest
>> >> > > VM simulates NUMA topology but not actually getting a performance
>> >> > > benefits from it. The same or better results could be achieved
>> >> > > using 'memdev' parameter. In light of that any VM that uses NUMA
>> >> > > to get its benefits should use 'memdev' and to allow transition
>> >> > > initial RAM to device based model, deprecate 'mem' parameter as
>> >> > > its ad-hoc partitioning of initial RAM MemoryRegion can't be
>> >> > > translated to memdev based backend transparently to users and in
>> >> > > compatible manner (migration wise).
>> >> > > 
>> >> > > That will also allow to clean up a bit our numa code, leaving only
>> >> > > 'memdev' impl. in place and several boards that use node_mem
>> >> > > to generate FDT/ACPI description from it.  
>> >> > 
>> >> > Can you confirm that the  'mem' and 'memdev' parameters to -numa
>> >> > are 100% live migration compatible in both directions ?  Libvirt
>> >> > would need this to be the case in order to use the 'memdev' syntax
>> >> > instead.
>> >> Unfortunately they are not migration compatible in any direction,
>> >> if it where possible to translate them to each other I'd alias 'mem'
>> >> to 'memdev' without deprecation. The former sends over only one
>> >> MemoryRegion to target, while the later sends over several (one per
>> >> memdev).
>> >
>> > If we can't migration from one to the other, then we can not deprecate
>> > the existing 'mem' syntax. Even if libvirt were to provide a config
>> > option to let apps opt-in to the new syntax, we need to be able to
>> > support live migration of existing running VMs indefinitely. Effectively
>> > this means we need the to keep 'mem' support forever, or at least such
>> > a long time that it effectively means forever.
>> >
>> > So I think this patch has to be dropped & replaced with one that
>> > simply documents that memdev syntax is preferred.
>> 
>> We have this habit of postulating absolutes like "can not deprecate"
>> instead of engaging with the tradeoffs.  We need to kick it.
>> 
>> So let's have an actual look at the tradeoffs.
>> 
>> We don't actually "support live migration of existing running VMs
>> indefinitely".
>>
>> We support live migration to any newer version of QEMU that still
>> supports the machine type.
>> 
>> We support live migration to any older version of QEMU that already
>> supports the machine type and all the devices the machine uses.
>> 
>> Aside: "support" is really an honest best effort here.  If you rely on
>> it, use a downstream that puts in the (substantial!) QA work real
>> support takes.
>
> If upstream deletes the feature, then that in turn breaks the downstream
> unless downstream reverts the upstream change. When we have large overlap
> between downstream & upstream maintainer, it is not beneficial to delete
> the feature upstream as any effort saved upstream usually expands into
> larger effort downstream.

It can't "break" existing downstreams, only future ones forked off after
the deletion.  Such a fork cares only if it has backward compatibility
requirements to satisfy that require the feature.  My point is: it's not
a simple absolute, it's a complex tradeoff.

>> Feature deprecation is not a contract to drop the feature after two
>> releases, or even five.  It's a formal notice that users of the feature
>> should transition to its replacement in an orderly manner.
>> 
>> If I understand Igor correctly, all users should transition away from
>> outdated NUMA configurations at least for new VMs in an orderly manner.
>> 
>> So, how could this formal notice be served constructively?
>> 
>> If we reject outdated NUMA configurations starting with machine type T,
>> we can remove the means to create those configurations along with
>> machine type T-1.  Won't happen anytime soon, will happen eventually,
>> because in the long run, all machine types are dead (apologies to
>> Keynes).
>> 
>> If we deprecate outdated NUMA configurations now, we can start rejecting
>> them with new machine types after a suitable grace period.
>
> How is libvirt going to know what machines it can use with the feature ?
> We don't have any way to introspect machine type specific logic, since we
> run all probing with "-machine none", and QEMU can't report anything about
> machines without instantiating them.

Fair point.  A practical way for management applications to decide which
of the two interfaces they can use with which machine type may be
required for deprecating one of the interfaces with new machine types.


* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04  7:13         ` Markus Armbruster
  2019-03-04 10:19           ` Daniel P. Berrangé
@ 2019-03-04 12:25           ` Igor Mammedov
  2019-03-04 12:39             ` Daniel P. Berrangé
  1 sibling, 1 reply; 37+ messages in thread
From: Igor Mammedov @ 2019-03-04 12:25 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Daniel P. Berrangé,
	peter.maydell, ehabkost, libvir-list, qemu-devel,
	Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini, david,
	mprivozn

On Mon, 04 Mar 2019 08:13:53 +0100
Markus Armbruster <armbru@redhat.com> wrote:

> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
> > On Fri, Mar 01, 2019 at 06:33:28PM +0100, Igor Mammedov wrote:  
> >> On Fri, 1 Mar 2019 15:49:47 +0000
> >> Daniel P. Berrangé <berrange@redhat.com> wrote:
> >>   
> >> > On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:  
> >> > > The parameter allows to configure fake NUMA topology where guest
> >> > > VM simulates NUMA topology but not actually getting a performance
> >> > > benefits from it. The same or better results could be achieved
> >> > > using 'memdev' parameter. In light of that any VM that uses NUMA
> >> > > to get its benefits should use 'memdev' and to allow transition
> >> > > initial RAM to device based model, deprecate 'mem' parameter as
> >> > > its ad-hoc partitioning of initial RAM MemoryRegion can't be
> >> > > translated to memdev based backend transparently to users and in
> >> > > compatible manner (migration wise).
> >> > > 
> >> > > That will also allow to clean up a bit our numa code, leaving only
> >> > > 'memdev' impl. in place and several boards that use node_mem
> >> > > to generate FDT/ACPI description from it.    
> >> > 
> >> > Can you confirm that the  'mem' and 'memdev' parameters to -numa
> >> > are 100% live migration compatible in both directions ?  Libvirt
> >> > would need this to be the case in order to use the 'memdev' syntax
> >> > instead.  
> >> Unfortunately they are not migration compatible in any direction,
> >> if it where possible to translate them to each other I'd alias 'mem'
> >> to 'memdev' without deprecation. The former sends over only one
> >> MemoryRegion to target, while the later sends over several (one per
> >> memdev).  
> >
> > If we can't migration from one to the other, then we can not deprecate
> > the existing 'mem' syntax. Even if libvirt were to provide a config
> > option to let apps opt-in to the new syntax, we need to be able to
> > support live migration of existing running VMs indefinitely. Effectively
> > this means we need the to keep 'mem' support forever, or at least such
> > a long time that it effectively means forever.
> >
> > So I think this patch has to be dropped & replaced with one that
> > simply documents that memdev syntax is preferred.  
> 
> We have this habit of postulating absolutes like "can not deprecate"
> instead of engaging with the tradeoffs.  We need to kick it.
> 
> So let's have an actual look at the tradeoffs.
> 
> We don't actually "support live migration of existing running VMs
> indefinitely".
> 
> We support live migration to any newer version of QEMU that still
> supports the machine type.
> 
> We support live migration to any older version of QEMU that already
> supports the machine type and all the devices the machine uses.
> 
> Aside: "support" is really an honest best effort here.  If you rely on
> it, use a downstream that puts in the (substantial!) QA work real
> support takes.
> 
> Feature deprecation is not a contract to drop the feature after two
> releases, or even five.  It's a formal notice that users of the feature
> should transition to its replacement in an orderly manner.
> 
> If I understand Igor correctly, all users should transition away from
> outdated NUMA configurations at least for new VMs in an orderly manner.
Yes, we can postpone removing the options for as long as there are machine
type versions that were able to use them (unfortunate, but probably
unavoidable unless there is a migration trick to make the transition
transparent), but that should not stop us from at least disabling the
broken options on new machine types.

This series can serve as the formal notice, with a follow-up that disables
the deprecated options for new machine types. (As Thomas noted, warnings
alone do not work: users keep using broken features whether or not they
are aware of the issues [*].)

Hence the suggested deprecation approach, with enforced rejection of legacy
numa options for new machine types in 2 releases, so that users eventually
stop using them.
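
A rough illustration of that two-stage behaviour, as a Python toy model
rather than QEMU code. The cutoff version (4, 2) is only an assumption
derived from "since 4.0" plus "2 releases", and the function name is made
up for this sketch:

```python
import warnings

# Hypothetical cutoff: first machine type version that rejects '-numa node,mem'.
REJECT_FROM = (4, 2)


def check_legacy_numa(machine_version, uses_mem_option, reject_from=REJECT_FROM):
    """Return True if the configuration may start, raise if it is rejected."""
    if not uses_mem_option:
        return True
    if machine_version >= reject_from:
        # New machine types: hard error instead of a warning.
        raise ValueError("-numa node,mem is not supported by this machine "
                         "type, use -numa node,memdev instead")
    # Old machine types: keep working, but serve the deprecation notice.
    warnings.warn("Parameter -numa node,mem is deprecated,"
                  " use -numa node,memdev instead")
    return True
```

Old machine types keep starting with a warning, so existing VMs can still
be migrated, while new machine types refuse the legacy option outright.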

*) https://www.redhat.com/archives/libvir-list/2018-November/msg00159.html

> So, how could this formal notice be served constructively?
> 
> If we reject outdated NUMA configurations starting with machine type T,
> we can remove the means to create those configurations along with
> machine type T-1.  Won't happen anytime soon, will happen eventually,
> because in the long run, all machine types are dead (apologies to
> Keynes).
> 
> If we deprecate outdated NUMA configurations now, we can start rejecting
> them with new machine types after a suitable grace period.
> 


* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 12:25           ` Igor Mammedov
@ 2019-03-04 12:39             ` Daniel P. Berrangé
  2019-03-04 14:16               ` Igor Mammedov
  0 siblings, 1 reply; 37+ messages in thread
From: Daniel P. Berrangé @ 2019-03-04 12:39 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Markus Armbruster, peter.maydell, ehabkost, libvir-list,
	qemu-devel, Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini,
	david, mprivozn

On Mon, Mar 04, 2019 at 01:25:07PM +0100, Igor Mammedov wrote:
> On Mon, 04 Mar 2019 08:13:53 +0100
> Markus Armbruster <armbru@redhat.com> wrote:
> 
> > Daniel P. Berrangé <berrange@redhat.com> writes:
> > 
> > > On Fri, Mar 01, 2019 at 06:33:28PM +0100, Igor Mammedov wrote:  
> > >> On Fri, 1 Mar 2019 15:49:47 +0000
> > >> Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >>   
> > >> > On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:  
> > >> > > The parameter allows to configure fake NUMA topology where guest
> > >> > > VM simulates NUMA topology but not actually getting a performance
> > >> > > benefits from it. The same or better results could be achieved
> > >> > > using 'memdev' parameter. In light of that any VM that uses NUMA
> > >> > > to get its benefits should use 'memdev' and to allow transition
> > >> > > initial RAM to device based model, deprecate 'mem' parameter as
> > >> > > its ad-hoc partitioning of initial RAM MemoryRegion can't be
> > >> > > translated to memdev based backend transparently to users and in
> > >> > > compatible manner (migration wise).
> > >> > > 
> > >> > > That will also allow to clean up a bit our numa code, leaving only
> > >> > > 'memdev' impl. in place and several boards that use node_mem
> > >> > > to generate FDT/ACPI description from it.    
> > >> > 
> > >> > Can you confirm that the  'mem' and 'memdev' parameters to -numa
> > >> > are 100% live migration compatible in both directions ?  Libvirt
> > >> > would need this to be the case in order to use the 'memdev' syntax
> > >> > instead.  
> > >> Unfortunately they are not migration compatible in any direction,
> > >> if it where possible to translate them to each other I'd alias 'mem'
> > >> to 'memdev' without deprecation. The former sends over only one
> > >> MemoryRegion to target, while the later sends over several (one per
> > >> memdev).  
> > >
> > > If we can't migration from one to the other, then we can not deprecate
> > > the existing 'mem' syntax. Even if libvirt were to provide a config
> > > option to let apps opt-in to the new syntax, we need to be able to
> > > support live migration of existing running VMs indefinitely. Effectively
> > > this means we need the to keep 'mem' support forever, or at least such
> > > a long time that it effectively means forever.
> > >
> > > So I think this patch has to be dropped & replaced with one that
> > > simply documents that memdev syntax is preferred.  
> > 
> > We have this habit of postulating absolutes like "can not deprecate"
> > instead of engaging with the tradeoffs.  We need to kick it.
> > 
> > So let's have an actual look at the tradeoffs.
> > 
> > We don't actually "support live migration of existing running VMs
> > indefinitely".
> > 
> > We support live migration to any newer version of QEMU that still
> > supports the machine type.
> > 
> > We support live migration to any older version of QEMU that already
> > supports the machine type and all the devices the machine uses.
> > 
> > Aside: "support" is really an honest best effort here.  If you rely on
> > it, use a downstream that puts in the (substantial!) QA work real
> > support takes.
> > 
> > Feature deprecation is not a contract to drop the feature after two
> > releases, or even five.  It's a formal notice that users of the feature
> > should transition to its replacement in an orderly manner.
> > 
> > If I understand Igor correctly, all users should transition away from
> > outdated NUMA configurations at least for new VMs in an orderly manner.
> Yes, we can postpone removing options until there are machines type
> versions that were capable to use it (unfortunate but probably 
> unavoidable unless there is a migration trick to make transition
> transparent) but that should not stop us from disabling broken
> options on new machine types at least.
> 
> This series can serve as formal notice with follow up disabling of
> deprecated options for new machine types. (As Thomas noted, just warnings
> do not work and users continue to use broken features regardless whether
> they are don't know about issues or aware of it [*])
> 
> Hence suggested deprecation approach and enforced rejection of legacy
> numa options for new machine types in 2 releases so users would stop
> using them eventually.

When we deprecate something, we need to have a way for apps to use the
new alternative approach *at the same time*.  So even if we only want to
deprecate for new machine types, we still have to first solve the problem
of how mgmt apps will introspect QEMU to learn which machine types expect
the new options.
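
To illustrate what a mgmt app would have to do, here is a minimal Python
sketch. The flag machine_supports_legacy_mem stands for exactly the
introspection result that does not exist today, and the backend ids
(ram-node0, ...) are made up for the example; only the -numa and -object
option names are existing QEMU syntax:

```python
def numa_args(node_sizes_mb, machine_supports_legacy_mem):
    """Build NUMA-related QEMU arguments for the syntax the machine accepts.

    machine_supports_legacy_mem is a hypothetical per-machine-type answer
    that the app would need to obtain from QEMU somehow.
    """
    args = []
    for i, size in enumerate(node_sizes_mb):
        if machine_supports_legacy_mem:
            # Legacy syntax: size given directly on the node.
            args += ["-numa", f"node,nodeid={i},mem={size}M"]
        else:
            # memdev syntax: one backend object per node, referenced by id.
            backend_id = f"ram-node{i}"  # hypothetical id
            args += ["-object",
                     f"memory-backend-ram,id={backend_id},size={size}M",
                     "-numa", f"node,nodeid={i},memdev={backend_id}"]
    return args
```

Without a reliable value for that flag per machine type, the app cannot
pick the right branch, which is the problem described above.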

Regards,
Daniel


* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-01 18:01       ` [Qemu-devel] " Dr. David Alan Gilbert
@ 2019-03-04 13:52         ` Igor Mammedov
  0 siblings, 0 replies; 37+ messages in thread
From: Igor Mammedov @ 2019-03-04 13:52 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Daniel P. Berrangé,
	peter.maydell, ehabkost, libvir-list, qemu-devel, qemu-arm,
	qemu-ppc, pbonzini, david

On Fri, 1 Mar 2019 18:01:52 +0000
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Igor Mammedov (imammedo@redhat.com) wrote:
> > On Fri, 1 Mar 2019 15:49:47 +0000
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> >   
> > > On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:  
> > > > The parameter allows to configure fake NUMA topology where guest
> > > > VM simulates NUMA topology but not actually getting a performance
> > > > benefits from it. The same or better results could be achieved
> > > > using 'memdev' parameter. In light of that any VM that uses NUMA
> > > > to get its benefits should use 'memdev' and to allow transition
> > > > initial RAM to device based model, deprecate 'mem' parameter as
> > > > its ad-hoc partitioning of initial RAM MemoryRegion can't be
> > > > translated to memdev based backend transparently to users and in
> > > > compatible manner (migration wise).
> > > > 
> > > > That will also allow to clean up a bit our numa code, leaving only
> > > > 'memdev' impl. in place and several boards that use node_mem
> > > > to generate FDT/ACPI description from it.    
> > > 
> > > Can you confirm that the  'mem' and 'memdev' parameters to -numa
> > > are 100% live migration compatible in both directions ?  Libvirt
> > > would need this to be the case in order to use the 'memdev' syntax
> > > instead.  
> > Unfortunately they are not migration compatible in any direction,
> > if it where possible to translate them to each other I'd alias 'mem'
> > to 'memdev' without deprecation. The former sends over only one
> > MemoryRegion to target, while the later sends over several (one per
> > memdev).
> > 
> > Mixed memory issue[1] first came from libvirt side RHBZ1624223,
> > back then it was resolved on libvirt side in favor of migration
> > compatibility vs correctness (i.e. bind policy doesn't work as expected).
> > What worse that it was made default and affects all new machines,
> > as I understood it.
> > 
> > In case of -mem-path + -mem-prealloc (with 1 numa node or numa less)
> > it's possible on QEMU side to make conversion to memdev in migration
> > compatible way (that's what stopped Michal from memdev approach).
> > But it's hard to do so in multi-nodes case as amount of MemoryRegions
> > is different.
> > 
> > Point is to consider 'mem' as mis-configuration error, as the user
> > in the first place using broken numa configuration
> > (i.e. fake numa configuration doesn't actually improve performance).
> > 
> > CCed David, maybe he could offer a way to do 1:n migration and other
> > way around.  
> 
> I can't see a trivial way.
> About the easiest I can think of is if you had a way to create a memdev
> that was an alias to pc.ram (of a particular size and offset).
If I get you right, that's what I was planning to do for numa-less machines
that use the -mem-path/prealloc options, where it's possible to replace
the initial RAM MemoryRegion with a correspondingly named memdev and its
backing MemoryRegion.

But I don't see how it could work in the case of legacy NUMA 'mem' options,
where the initial RAM is 1 MemoryRegion (it's a fake numa after all), or how
to translate that into several MemoryRegions (one per node/memdev).
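
To make the mismatch concrete, here is a toy Python model, not QEMU code.
It assumes that source and target must agree on RAM region names and
sizes; the name pc.ram is taken from the mail above, while the per-node
backend ids (ram-node0, ...) are made up for the example:

```python
def ram_regions_legacy(node_sizes_mb):
    """'-numa node,mem=X': one initial RAM region covering all nodes."""
    return {"pc.ram": sum(node_sizes_mb)}


def ram_regions_memdev(node_sizes_mb):
    """'-numa node,memdev=ID': one region per backend (hypothetical ids)."""
    return {f"ram-node{i}": size for i, size in enumerate(node_sizes_mb)}


def stream_compatible(source_regions, target_regions):
    """Toy rule: both sides must expose the same region names and sizes."""
    return source_regions == target_regions


nodes = [2048, 2048]
legacy = ram_regions_legacy(nodes)
memdev = ram_regions_memdev(nodes)
print(legacy)
print(memdev)
print(stream_compatible(legacy, memdev))
```

With a single region the names could be made to line up, which is the
numa-less case; with two or more nodes the number of regions differs no
matter how they are named.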

> Dave
> 
> >   
> > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > > > ---
> > > >  numa.c               |  2 ++
> > > >  qemu-deprecated.texi | 14 ++++++++++++++
> > > >  2 files changed, 16 insertions(+)
> > > > 
> > > > diff --git a/numa.c b/numa.c
> > > > index 3875e1e..2205773 100644
> > > > --- a/numa.c
> > > > +++ b/numa.c
> > > > @@ -121,6 +121,8 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
> > > >  
> > > >      if (node->has_mem) {
> > > >          numa_info[nodenr].node_mem = node->mem;
> > > > +        warn_report("Parameter -numa node,mem is deprecated,"
> > > > +                    " use -numa node,memdev instead");
> > > >      }
> > > >      if (node->has_memdev) {
> > > >          Object *o;
> > > > diff --git a/qemu-deprecated.texi b/qemu-deprecated.texi
> > > > index 45c5795..73f99d4 100644
> > > > --- a/qemu-deprecated.texi
> > > > +++ b/qemu-deprecated.texi
> > > > @@ -60,6 +60,20 @@ Support for invalid topologies will be removed, the user must ensure
> > > >  topologies described with -smp include all possible cpus, i.e.
> > > >    @math{@var{sockets} * @var{cores} * @var{threads} = @var{maxcpus}}.
> > > >  
> > > > +@subsection -numa node,mem=@var{size} (since 4.0)
> > > > +
> > > > +The parameter @option{mem} of @option{-numa node} is used to assign a part of
> > > > +guest RAM to a NUMA node. But when using it, it's impossible to manage specified
> > > > +size on the host side (like bind it to a host node, setting bind policy, ...),
> > > > +so guest end-ups with the fake NUMA configuration with suboptiomal performance.
> > > > +However since 2014 there is an alternative way to assign RAM to a NUMA node
> > > > +using parameter @option{memdev}, which does the same as @option{mem} and has
> > > > +an ability to actualy manage node RAM on the host side. Use parameter
> > > > +@option{memdev} with @var{memory-backend-ram} backend as an replacement for
> > > > +parameter @option{mem} to achieve the same fake NUMA effect or a properly
> > > > +configured @var{memory-backend-file} backend to actually benefit from NUMA
> > > > +configuration.
> > > > +
> > > >  @section QEMU Machine Protocol (QMP) commands
> > > >  
> > > >  @subsection block-dirty-bitmap-add "autoload" parameter (since 2.12.0)
> > > > -- 
> > > > 2.7.4
> > > > 
> > > > --
> > > > libvir-list mailing list
> > > > libvir-list@redhat.com
> > > > https://www.redhat.com/mailman/listinfo/libvir-list    
> > > 
> > > Regards,
> > > Daniel  
> >   
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [Qemu-ppc] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04  8:11         ` [Qemu-devel] [Qemu-ppc] " Thomas Huth
@ 2019-03-04 13:55           ` Igor Mammedov
  2019-03-04 13:59             ` Daniel P. Berrangé
  0 siblings, 1 reply; 37+ messages in thread
From: Igor Mammedov @ 2019-03-04 13:55 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Daniel P. Berrangé,
	peter.maydell, ehabkost, libvir-list, qemu-devel,
	Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini, david

On Mon, 4 Mar 2019 09:11:19 +0100
Thomas Huth <thuth@redhat.com> wrote:

> On 01/03/2019 18.48, Daniel P. Berrangé wrote:
> [...]
> > So I think this patch has to be dropped & replaced with one that
> > simply documents that memdev syntax is preferred.  
> 
> That's definitely not enough. I've had a couple of cases already where
> we documented that certain options should not be used anymore, and
> people simply ignored it (aka. if it ain't broken, don't do any change).
> Then they just started to complain when I really tried to remove the
> option after the deprecation period.

> So Igor, if you can not officially deprecate these things here yet, you
> should at least make sure that they can not be used with new machine
> types anymore. Then, after a couple of years, when we feel sure that
> there are only some few or no people left who still use it with the old
> machine types, we can start to discuss the deprecation process again, I
> think.
Is it acceptable to silently disable CLI options (even if they are broken
like in this case) for new machine types?
I was under the impression that it should go through deprecation first.

> 
>  Thomas
> 


* Re: [Qemu-devel] [Qemu-ppc] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 13:55           ` Igor Mammedov
@ 2019-03-04 13:59             ` Daniel P. Berrangé
  2019-03-04 14:54               ` Igor Mammedov
  0 siblings, 1 reply; 37+ messages in thread
From: Daniel P. Berrangé @ 2019-03-04 13:59 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Thomas Huth, peter.maydell, ehabkost, libvir-list, qemu-devel,
	Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini, david

On Mon, Mar 04, 2019 at 02:55:10PM +0100, Igor Mammedov wrote:
> On Mon, 4 Mar 2019 09:11:19 +0100
> Thomas Huth <thuth@redhat.com> wrote:
> 
> > On 01/03/2019 18.48, Daniel P. Berrangé wrote:
> > [...]
> > > So I think this patch has to be dropped & replaced with one that
> > > simply documents that memdev syntax is preferred.  
> > 
> > That's definitely not enough. I've had a couple of cases already where
> > we documented that certain options should not be used anymore, and
> > people simply ignored it (aka. if it ain't broken, don't do any change).
> > Then they just started to complain when I really tried to remove the
> > option after the deprecation period.
> 
> > So Igor, if you can not officially deprecate these things here yet, you
> > should at least make sure that they can not be used with new machine
> > types anymore. Then, after a couple of years, when we feel sure that
> > there are only some few or no people left who still use it with the old
> > machine types, we can start to discuss the deprecation process again, I
> > think.
> Is it acceptable to silently disable CLI options (even if they are broken
> like in this case) for new machine types?
> I was under impression that it should go through deprecation first.

Yes, it must go through deprecation. I was saying we can't disable
the CLI options at all, until there is a way for libvirt to correctly
use the new options.

Regards,
Daniel


* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 12:39             ` Daniel P. Berrangé
@ 2019-03-04 14:16               ` Igor Mammedov
  2019-03-04 14:24                 ` Daniel P. Berrangé
  2019-03-04 14:34                 ` Michal Privoznik
  0 siblings, 2 replies; 37+ messages in thread
From: Igor Mammedov @ 2019-03-04 14:16 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Markus Armbruster, peter.maydell, ehabkost, libvir-list,
	qemu-devel, Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini,
	david, mprivozn

On Mon, 4 Mar 2019 12:39:08 +0000
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Mon, Mar 04, 2019 at 01:25:07PM +0100, Igor Mammedov wrote:
> > On Mon, 04 Mar 2019 08:13:53 +0100
> > Markus Armbruster <armbru@redhat.com> wrote:
> >   
> > > Daniel P. Berrangé <berrange@redhat.com> writes:
> > >   
> > > > On Fri, Mar 01, 2019 at 06:33:28PM +0100, Igor Mammedov wrote:    
> > > >> On Fri, 1 Mar 2019 15:49:47 +0000
> > > >> Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > >>     
> > > >> > On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:    
> > > >> > > The parameter allows to configure fake NUMA topology where guest
> > > >> > > VM simulates NUMA topology but not actually getting a performance
> > > >> > > benefits from it. The same or better results could be achieved
> > > >> > > using 'memdev' parameter. In light of that any VM that uses NUMA
> > > >> > > to get its benefits should use 'memdev' and to allow transition
> > > >> > > initial RAM to device based model, deprecate 'mem' parameter as
> > > >> > > its ad-hoc partitioning of initial RAM MemoryRegion can't be
> > > >> > > translated to memdev based backend transparently to users and in
> > > >> > > compatible manner (migration wise).
> > > >> > > 
> > > >> > > That will also allow to clean up a bit our numa code, leaving only
> > > >> > > 'memdev' impl. in place and several boards that use node_mem
> > > >> > > to generate FDT/ACPI description from it.      
> > > >> > 
> > > >> > Can you confirm that the  'mem' and 'memdev' parameters to -numa
> > > >> > are 100% live migration compatible in both directions ?  Libvirt
> > > >> > would need this to be the case in order to use the 'memdev' syntax
> > > >> > instead.    
> > > >> Unfortunately they are not migration compatible in any direction,
> > > >> if it where possible to translate them to each other I'd alias 'mem'
> > > >> to 'memdev' without deprecation. The former sends over only one
> > > >> MemoryRegion to target, while the later sends over several (one per
> > > >> memdev).    
> > > >
> > > > If we can't migration from one to the other, then we can not deprecate
> > > > the existing 'mem' syntax. Even if libvirt were to provide a config
> > > > option to let apps opt-in to the new syntax, we need to be able to
> > > > support live migration of existing running VMs indefinitely. Effectively
> > > > this means we need the to keep 'mem' support forever, or at least such
> > > > a long time that it effectively means forever.
> > > >
> > > > So I think this patch has to be dropped & replaced with one that
> > > > simply documents that memdev syntax is preferred.    
> > > 
> > > We have this habit of postulating absolutes like "can not deprecate"
> > > instead of engaging with the tradeoffs.  We need to kick it.
> > > 
> > > So let's have an actual look at the tradeoffs.
> > > 
> > > We don't actually "support live migration of existing running VMs
> > > indefinitely".
> > > 
> > > We support live migration to any newer version of QEMU that still
> > > supports the machine type.
> > > 
> > > We support live migration to any older version of QEMU that already
> > > supports the machine type and all the devices the machine uses.
> > > 
> > > Aside: "support" is really an honest best effort here.  If you rely on
> > > it, use a downstream that puts in the (substantial!) QA work real
> > > support takes.
> > > 
> > > Feature deprecation is not a contract to drop the feature after two
> > > releases, or even five.  It's a formal notice that users of the feature
> > > should transition to its replacement in an orderly manner.
> > > 
> > > If I understand Igor correctly, all users should transition away from
> > > outdated NUMA configurations at least for new VMs in an orderly manner.  
> > Yes, we can postpone removing options until there are machines type
> > versions that were capable to use it (unfortunate but probably 
> > unavoidable unless there is a migration trick to make transition
> > transparent) but that should not stop us from disabling broken
> > options on new machine types at least.
> > 
> > This series can serve as formal notice with follow up disabling of
> > deprecated options for new machine types. (As Thomas noted, just warnings
> > do not work and users continue to use broken features regardless whether
> > they are don't know about issues or aware of it [*])
> > 
> > Hence suggested deprecation approach and enforced rejection of legacy
> > numa options for new machine types in 2 releases so users would stop
> > using them eventually.  
> 
> When we deprecate something, we need to have a way for apps to use the
> new alternative approach *at the same time*.  So even if we only want to
> deprecate for new machine types, we still have to first solve the problem
> of how mgmt apps will introspect QEMU to learn which machine types expect
> the new options.
I'm not aware of any mechanism to introspect machine type options (either
existing or in development). Have any ideas along those lines been
discussed in the past?

Aside from developing a new mechanism, what are the alternative approaches?
I mean, when we delete a deprecated CLI option, how is that handled on the
libvirt side currently?

For example, I haven't seen anything introspection-related when we removed
deprecated options recently.

A more specific question for this series' use case:
how does libvirt currently decide whether to use -numa node,memdev?


> 
> Regards,
> Daniel

* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 10:19           ` Daniel P. Berrangé
  2019-03-04 11:45             ` Markus Armbruster
@ 2019-03-04 14:24             ` Michal Privoznik
  2019-03-04 15:03               ` Igor Mammedov
  1 sibling, 1 reply; 37+ messages in thread
From: Michal Privoznik @ 2019-03-04 14:24 UTC (permalink / raw)
  To: Daniel P. Berrangé, Markus Armbruster
  Cc: peter.maydell, ehabkost, libvir-list, qemu-devel,
	Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini,
	Igor Mammedov, david

[Thanks Igor for bringing this onto my radar. I don't follow qemu-devel
that closely]

On 3/4/19 11:19 AM, Daniel P. Berrangé wrote:
> On Mon, Mar 04, 2019 at 08:13:53AM +0100, Markus Armbruster wrote:
>> Daniel P. Berrangé <berrange@redhat.com> writes:
>>
>>> On Fri, Mar 01, 2019 at 06:33:28PM +0100, Igor Mammedov wrote:
>>>> On Fri, 1 Mar 2019 15:49:47 +0000
>>>> Daniel P. Berrangé <berrange@redhat.com> wrote:
>>>>
>>>>> On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:
>>>>>> The parameter allows to configure fake NUMA topology where guest
>>>>>> VM simulates NUMA topology but not actually getting a performance
>>>>>> benefits from it. The same or better results could be achieved
>>>>>> using 'memdev' parameter. In light of that any VM that uses NUMA
>>>>>> to get its benefits should use 'memdev' and to allow transition
>>>>>> initial RAM to device based model, deprecate 'mem' parameter as
>>>>>> its ad-hoc partitioning of initial RAM MemoryRegion can't be
>>>>>> translated to memdev based backend transparently to users and in
>>>>>> compatible manner (migration wise).
>>>>>>
>>>>>> That will also allow to clean up a bit our numa code, leaving only
>>>>>> 'memdev' impl. in place and several boards that use node_mem
>>>>>> to generate FDT/ACPI description from it.
>>>>>
>>>>> Can you confirm that the  'mem' and 'memdev' parameters to -numa
>>>>> are 100% live migration compatible in both directions ?  Libvirt
>>>>> would need this to be the case in order to use the 'memdev' syntax
>>>>> instead.
>>>> Unfortunately they are not migration compatible in any direction,
>>>> if it where possible to translate them to each other I'd alias 'mem'
>>>> to 'memdev' without deprecation. The former sends over only one
>>>> MemoryRegion to target, while the later sends over several (one per
>>>> memdev).
>>>
>>> If we can't migration from one to the other, then we can not deprecate
>>> the existing 'mem' syntax. Even if libvirt were to provide a config
>>> option to let apps opt-in to the new syntax, we need to be able to
>>> support live migration of existing running VMs indefinitely. Effectively
>>> this means we need the to keep 'mem' support forever, or at least such
>>> a long time that it effectively means forever.

I'm with Daniel on this. The reason libvirt still defaults to '-numa
node,mem=' is precisely backward compatibility. Since a machine
can't be migrated from '-numa node,mem=' to '-numa node,memdev= +
-object memory-backend-*', libvirt has to play it safe and choose the
combination that is most widely accessible.
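
To make the two spellings concrete, here is a minimal Python sketch that
only builds the argument lists. It is illustrative and not taken from
libvirt: the helper names, the backend ids (ram-node0, ...) and the choice
of memory-backend-ram are assumptions made for the example.

```python
def numa_args_legacy(node_sizes_mib):
    """Legacy spelling: '-numa node,mem=' for each node.

    Per this thread, this form migrates initial RAM as a single
    MemoryRegion.
    """
    args = []
    for nodeid, size in enumerate(node_sizes_mib):
        args += ["-numa", f"node,nodeid={nodeid},mem={size}M"]
    return args


def numa_args_memdev(node_sizes_mib):
    """Preferred spelling: one memory backend object per node.

    Per this thread, this form migrates one MemoryRegion per memdev,
    which is why the two spellings are not migration compatible.
    The backend id scheme below is hypothetical.
    """
    args = []
    for nodeid, size in enumerate(node_sizes_mib):
        backend_id = f"ram-node{nodeid}"
        args += ["-object", f"memory-backend-ram,id={backend_id},size={size}M"]
        args += ["-numa", f"node,nodeid={nodeid},memdev={backend_id}"]
    return args
```

Both helpers describe the same guest-visible layout; only the way initial
RAM is backed (and therefore the migration stream) differs.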

If you remove this, how would you expect older machines to migrate to
the newer command line?

I'm all for deprecating old stuff. In fact, I've suggested that in
libvirt(!) here and there, but I'm afraid we can't just remove
functionality unless we give users a way to migrate to the one we
prefer now.

And if libvirt doesn't follow qemu's warnings, then it definitely should.
It's a libvirt bug if it doesn't follow best practices (well, where it can).

Michal

* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 14:16               ` Igor Mammedov
@ 2019-03-04 14:24                 ` Daniel P. Berrangé
  2019-03-04 15:19                   ` Igor Mammedov
  2019-03-04 16:20                   ` Michal Privoznik
  2019-03-04 14:34                 ` Michal Privoznik
  1 sibling, 2 replies; 37+ messages in thread
From: Daniel P. Berrangé @ 2019-03-04 14:24 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Markus Armbruster, peter.maydell, ehabkost, libvir-list,
	qemu-devel, Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini,
	david, mprivozn

On Mon, Mar 04, 2019 at 03:16:41PM +0100, Igor Mammedov wrote:
> On Mon, 4 Mar 2019 12:39:08 +0000
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Mon, Mar 04, 2019 at 01:25:07PM +0100, Igor Mammedov wrote:
> > > On Mon, 04 Mar 2019 08:13:53 +0100
> > > Markus Armbruster <armbru@redhat.com> wrote:
> > >   
> > > > Daniel P. Berrangé <berrange@redhat.com> writes:
> > > >   
> > > > > On Fri, Mar 01, 2019 at 06:33:28PM +0100, Igor Mammedov wrote:    
> > > > >> On Fri, 1 Mar 2019 15:49:47 +0000
> > > > >> Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > >>     
> > > > >> > On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:    
> > > > >> > > The parameter allows to configure fake NUMA topology where guest
> > > > >> > > VM simulates NUMA topology but not actually getting a performance
> > > > >> > > benefits from it. The same or better results could be achieved
> > > > >> > > using 'memdev' parameter. In light of that any VM that uses NUMA
> > > > >> > > to get its benefits should use 'memdev' and to allow transition
> > > > >> > > initial RAM to device based model, deprecate 'mem' parameter as
> > > > >> > > its ad-hoc partitioning of initial RAM MemoryRegion can't be
> > > > >> > > translated to memdev based backend transparently to users and in
> > > > >> > > compatible manner (migration wise).
> > > > >> > > 
> > > > >> > > That will also allow to clean up a bit our numa code, leaving only
> > > > >> > > 'memdev' impl. in place and several boards that use node_mem
> > > > >> > > to generate FDT/ACPI description from it.      
> > > > >> > 
> > > > >> > Can you confirm that the  'mem' and 'memdev' parameters to -numa
> > > > >> > are 100% live migration compatible in both directions ?  Libvirt
> > > > >> > would need this to be the case in order to use the 'memdev' syntax
> > > > >> > instead.    
> > > > >> Unfortunately they are not migration compatible in any direction,
> > > > >> if it where possible to translate them to each other I'd alias 'mem'
> > > > >> to 'memdev' without deprecation. The former sends over only one
> > > > >> MemoryRegion to target, while the later sends over several (one per
> > > > >> memdev).    
> > > > >
> > > > > If we can't migration from one to the other, then we can not deprecate
> > > > > the existing 'mem' syntax. Even if libvirt were to provide a config
> > > > > option to let apps opt-in to the new syntax, we need to be able to
> > > > > support live migration of existing running VMs indefinitely. Effectively
> > > > > this means we need the to keep 'mem' support forever, or at least such
> > > > > a long time that it effectively means forever.
> > > > >
> > > > > So I think this patch has to be dropped & replaced with one that
> > > > > simply documents that memdev syntax is preferred.    
> > > > 
> > > > We have this habit of postulating absolutes like "can not deprecate"
> > > > instead of engaging with the tradeoffs.  We need to kick it.
> > > > 
> > > > So let's have an actual look at the tradeoffs.
> > > > 
> > > > We don't actually "support live migration of existing running VMs
> > > > indefinitely".
> > > > 
> > > > We support live migration to any newer version of QEMU that still
> > > > supports the machine type.
> > > > 
> > > > We support live migration to any older version of QEMU that already
> > > > supports the machine type and all the devices the machine uses.
> > > > 
> > > > Aside: "support" is really an honest best effort here.  If you rely on
> > > > it, use a downstream that puts in the (substantial!) QA work real
> > > > support takes.
> > > > 
> > > > Feature deprecation is not a contract to drop the feature after two
> > > > releases, or even five.  It's a formal notice that users of the feature
> > > > should transition to its replacement in an orderly manner.
> > > > 
> > > > If I understand Igor correctly, all users should transition away from
> > > > outdated NUMA configurations at least for new VMs in an orderly manner.  
> > > Yes, we can postpone removing options until there are machines type
> > > versions that were capable to use it (unfortunate but probably 
> > > unavoidable unless there is a migration trick to make transition
> > > transparent) but that should not stop us from disabling broken
> > > options on new machine types at least.
> > > 
> > > This series can serve as formal notice with follow up disabling of
> > > deprecated options for new machine types. (As Thomas noted, just warnings
> > > do not work and users continue to use broken features regardless whether
> > > they are don't know about issues or aware of it [*])
> > > 
> > > Hence suggested deprecation approach and enforced rejection of legacy
> > > numa options for new machine types in 2 releases so users would stop
> > > using them eventually.  
> > 
> > When we deprecate something, we need to have a way for apps to use the
> > new alternative approach *at the same time*.  So even if we only want to
> > deprecate for new machine types, we still have to first solve the problem
> > of how mgmt apps will introspect QEMU to learn which machine types expect
> > the new options.
> I'm not aware of any mechanism to introspect machine type options (either
> existing or in development). Have any ideas along those lines been
> discussed in the past?
> 
> Aside from developing a new mechanism, what are the alternative approaches?
> I mean, when we delete a deprecated CLI option, how is that handled on the
> libvirt side currently?
> 
> For example, I haven't seen anything introspection-related when we removed
> deprecated options recently.

Right, with the other stuff we've deprecated we've had a simpler time, as it
either didn't affect migration at all, or the new replacement
was fully compatible with the migration data stream. IOW, libvirt
could unconditionally use the new feature as soon as it saw that it
exists in QEMU. We haven't had any machine type dependency to deal
with before now.

> A more specific question for this series' use case:
> how does libvirt currently decide whether to use -numa node,memdev?

It is pretty hard to follow the code, but IIUC we only use memdev when
setting up NVDIMMs, or for guests configured to have the "shared" flag
on the memory region.

Regards,
Daniel

* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 14:16               ` Igor Mammedov
  2019-03-04 14:24                 ` Daniel P. Berrangé
@ 2019-03-04 14:34                 ` Michal Privoznik
  1 sibling, 0 replies; 37+ messages in thread
From: Michal Privoznik @ 2019-03-04 14:34 UTC (permalink / raw)
  To: Igor Mammedov, Daniel P. Berrangé
  Cc: Markus Armbruster, peter.maydell, ehabkost, libvir-list,
	qemu-devel, Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini,
	david

On 3/4/19 3:16 PM, Igor Mammedov wrote:
> On Mon, 4 Mar 2019 12:39:08 +0000
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
>> On Mon, Mar 04, 2019 at 01:25:07PM +0100, Igor Mammedov wrote:
>>> On Mon, 04 Mar 2019 08:13:53 +0100
>>> Markus Armbruster <armbru@redhat.com> wrote:
>>>    
>>>> Daniel P. Berrangé <berrange@redhat.com> writes:
>>>>    
>>>>> On Fri, Mar 01, 2019 at 06:33:28PM +0100, Igor Mammedov wrote:
>>>>>> On Fri, 1 Mar 2019 15:49:47 +0000
>>>>>> Daniel P. Berrangé <berrange@redhat.com> wrote:
>>>>>>      
>>>>>>> On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:
>>>>>>>> The parameter allows to configure fake NUMA topology where guest
>>>>>>>> VM simulates NUMA topology but not actually getting a performance
>>>>>>>> benefits from it. The same or better results could be achieved
>>>>>>>> using 'memdev' parameter. In light of that any VM that uses NUMA
>>>>>>>> to get its benefits should use 'memdev' and to allow transition
>>>>>>>> initial RAM to device based model, deprecate 'mem' parameter as
>>>>>>>> its ad-hoc partitioning of initial RAM MemoryRegion can't be
>>>>>>>> translated to memdev based backend transparently to users and in
>>>>>>>> compatible manner (migration wise).
>>>>>>>>
>>>>>>>> That will also allow to clean up a bit our numa code, leaving only
>>>>>>>> 'memdev' impl. in place and several boards that use node_mem
>>>>>>>> to generate FDT/ACPI description from it.
>>>>>>>
>>>>>>> Can you confirm that the  'mem' and 'memdev' parameters to -numa
>>>>>>> are 100% live migration compatible in both directions ?  Libvirt
>>>>>>> would need this to be the case in order to use the 'memdev' syntax
>>>>>>> instead.
>>>>>> Unfortunately they are not migration compatible in any direction,
>>>>>> if it where possible to translate them to each other I'd alias 'mem'
>>>>>> to 'memdev' without deprecation. The former sends over only one
>>>>>> MemoryRegion to target, while the later sends over several (one per
>>>>>> memdev).
>>>>>
>>>>> If we can't migration from one to the other, then we can not deprecate
>>>>> the existing 'mem' syntax. Even if libvirt were to provide a config
>>>>> option to let apps opt-in to the new syntax, we need to be able to
>>>>> support live migration of existing running VMs indefinitely. Effectively
>>>>> this means we need the to keep 'mem' support forever, or at least such
>>>>> a long time that it effectively means forever.
>>>>>
>>>>> So I think this patch has to be dropped & replaced with one that
>>>>> simply documents that memdev syntax is preferred.
>>>>
>>>> We have this habit of postulating absolutes like "can not deprecate"
>>>> instead of engaging with the tradeoffs.  We need to kick it.
>>>>
>>>> So let's have an actual look at the tradeoffs.
>>>>
>>>> We don't actually "support live migration of existing running VMs
>>>> indefinitely".
>>>>
>>>> We support live migration to any newer version of QEMU that still
>>>> supports the machine type.
>>>>
>>>> We support live migration to any older version of QEMU that already
>>>> supports the machine type and all the devices the machine uses.
>>>>
>>>> Aside: "support" is really an honest best effort here.  If you rely on
>>>> it, use a downstream that puts in the (substantial!) QA work real
>>>> support takes.
>>>>
>>>> Feature deprecation is not a contract to drop the feature after two
>>>> releases, or even five.  It's a formal notice that users of the feature
>>>> should transition to its replacement in an orderly manner.
>>>>
>>>> If I understand Igor correctly, all users should transition away from
>>>> outdated NUMA configurations at least for new VMs in an orderly manner.
>>> Yes, we can postpone removing options until there are machines type
>>> versions that were capable to use it (unfortunate but probably
>>> unavoidable unless there is a migration trick to make transition
>>> transparent) but that should not stop us from disabling broken
>>> options on new machine types at least.
>>>
>>> This series can serve as formal notice with follow up disabling of
>>> deprecated options for new machine types. (As Thomas noted, just warnings
>>> do not work and users continue to use broken features regardless whether
>>> they are don't know about issues or aware of it [*])
>>>
>>> Hence suggested deprecation approach and enforced rejection of legacy
>>> numa options for new machine types in 2 releases so users would stop
>>> using them eventually.
>>
>> When we deprecate something, we need to have a way for apps to use the
>> new alternative approach *at the same time*.  So even if we only want to
>> deprecate for new machine types, we still have to first solve the problem
>> of how mgmt apps will introspect QEMU to learn which machine types expect
>> the new options.
> I'm not aware of any mechanism to introspect machine type options (either
> existing or in development). Have any ideas along those lines been
> discussed in the past?
> 
> Aside from developing a new mechanism, what are the alternative approaches?
> I mean, when we delete a deprecated CLI option, how is that handled on the
> libvirt side currently?

Libvirt queries qemu capabilities via QMP. Wherever it can, it
prefers the latest recommended command-line options (well, those known to a
libvirt developer at the time he/she is writing the code). So as long as
you remove only old stuff and libvirt keeps itself up to date with
best practices, we're okay.
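
As a rough illustration of what such a QMP probe looks like on the wire,
the Python sketch below builds a few JSON messages without connecting to
anything. The QMP commands named in it exist in QEMU; which of them libvirt
actually issues, and in what order, is an assumption here and not something
stated in this thread. The helper names are made up for the example.

```python
import json


def qmp_command(name, arguments=None):
    """Serialize one QMP command as a JSON string."""
    msg = {"execute": name}
    if arguments is not None:
        msg["arguments"] = arguments
    return json.dumps(msg)


def probe_messages():
    """Messages a management app might send to learn what a binary offers.

    The selection is hypothetical. 'qmp_capabilities' has to come first,
    because QMP accepts no other command until capability negotiation
    has completed.
    """
    return [
        qmp_command("qmp_capabilities"),
        qmp_command("query-machines"),
        qmp_command("query-command-line-options", {"option": "numa"}),
    ]
```

Note that none of these answers the question raised above: they describe
the binary as a whole and do not say which options a given machine type
expects.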

> 
> For example, I haven't seen anything introspection-related when we removed
> deprecated options recently.
> 
> A more specific question for this series' use case:
> how does libvirt currently decide whether to use -numa node,memdev?

It has a mechanism to tell if '-numa node,memdev=' is needed (i.e. there
is no other way to satisfy the user-requested configuration) and only then
does it use ,memdev. For all other cases it defaults to -numa node,mem=
simply to keep backwards compatibility (as I explain in another
e-mail I've just sent to this list).

Anyway, in the libvirt code you want to be looking at:

src/qemu/qemu_command.c: qemuBuildNumaArgStr
src/qemu/qemu_command.c: qemuBuildMemoryCellBackendStr
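
The selection rule described above can be sketched in a few lines of
Python. This is a simplification for illustration, not a transcription of
the functions just listed: the two trigger conditions (NVDIMM, "shared"
memory) follow Daniel's "IIUC" summary elsewhere in this thread, and the
class, function and backend id names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GuestMemConfig:
    """Hypothetical subset of a guest's memory configuration."""
    has_nvdimm: bool = False
    shared_memory: bool = False


def needs_memdev(cfg):
    """True when the requested config cannot be expressed with 'mem='.

    The conditions are assumptions based on this thread, not an
    exhaustive list.
    """
    return cfg.has_nvdimm or cfg.shared_memory


def numa_memory_param(cfg, nodeid, size_mib):
    """Return the '-numa' value for one node.

    Falls back to the legacy 'mem=' spelling unless memdev is required,
    mirroring the backward-compatibility default described above.
    """
    if needs_memdev(cfg):
        return f"node,nodeid={nodeid},memdev=ram-node{nodeid}"
    return f"node,nodeid={nodeid},mem={size_mib}M"
```

The point of the fallback is that a guest started with 'mem=' can still be
migrated to and from hosts that were also started with 'mem='.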

Michal

* Re: [Qemu-devel] [Qemu-ppc] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 13:59             ` Daniel P. Berrangé
@ 2019-03-04 14:54               ` Igor Mammedov
  2019-03-04 15:02                 ` Daniel P. Berrangé
  0 siblings, 1 reply; 37+ messages in thread
From: Igor Mammedov @ 2019-03-04 14:54 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Thomas Huth, peter.maydell, ehabkost, libvir-list, qemu-devel,
	Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini, david

On Mon, 4 Mar 2019 13:59:09 +0000
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Mon, Mar 04, 2019 at 02:55:10PM +0100, Igor Mammedov wrote:
> > On Mon, 4 Mar 2019 09:11:19 +0100
> > Thomas Huth <thuth@redhat.com> wrote:
> >   
> > > On 01/03/2019 18.48, Daniel P. Berrangé wrote:
> > > [...]  
> > > > So I think this patch has to be dropped & replaced with one that
> > > > simply documents that memdev syntax is preferred.    
> > > 
> > > That's definitely not enough. I've had a couple of cases already where
> > > we documented that certain options should not be used anymore, and
> > > people simply ignored it (aka. if it ain't broken, don't do any change).
> > > Then they just started to complain when I really tried to remove the
> > > option after the deprecation period.  
> >   
> > > So Igor, if you can not officially deprecate these things here yet, you
> > > should at least make sure that they can not be used with new machine
> > > types anymore. Then, after a couple of years, when we feel sure that
> > > there are only some few or no people left who still use it with the old
> > > machine types, we can start to discuss the deprecation process again, I
> > > think.  
> > Is it acceptable to silently disable CLI options (even if they are broken
> > like in this case) for new machine types?
> > I was under impression that it should go through deprecation first.  
> 
> Yes, it must go through deprecation. I was saying we can't disable
> the CLI options at all, until there is a way for libvirt to correctly
> use the new options.

I'm not adding new options (nor do I plan to for the numa case, yet);
-numa node,memdev has been around for several years now and should be used
as the default when creating new configs.

Given that the 'mem' option is being kept around for old machines,
deprecation is meant to notify users that the legacy
options will be disabled later on (at least for new machines,
if no migration-compatible transition is found for older ones).

What I'm mainly aiming for here is to prevent the use of broken legacy
options for new VMs (as in the RHBZ1624223 case), and deprecation is the
only way we currently have to notify users about CLI-breaking changes.

In the -mem-path/prealloc case, there will be a new machine option 'memdev'
to replace them, but that one is migration compatible, so libvirt could use
the new CLI syntax to replace even old configs.
(If there is a mechanism for introducing/introspecting new options
per machine or for QEMU as a whole, I'd be happy to add a new option there
on top of the bare deprecation notice that we have now.)

> Regards,
> Daniel

* Re: [Qemu-devel] [Qemu-ppc] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 14:54               ` Igor Mammedov
@ 2019-03-04 15:02                 ` Daniel P. Berrangé
  2019-03-04 16:45                   ` Igor Mammedov
  0 siblings, 1 reply; 37+ messages in thread
From: Daniel P. Berrangé @ 2019-03-04 15:02 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Thomas Huth, peter.maydell, ehabkost, libvir-list, qemu-devel,
	Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini, david

On Mon, Mar 04, 2019 at 03:54:57PM +0100, Igor Mammedov wrote:
> On Mon, 4 Mar 2019 13:59:09 +0000
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Mon, Mar 04, 2019 at 02:55:10PM +0100, Igor Mammedov wrote:
> > > On Mon, 4 Mar 2019 09:11:19 +0100
> > > Thomas Huth <thuth@redhat.com> wrote:
> > >   
> > > > On 01/03/2019 18.48, Daniel P. Berrangé wrote:
> > > > [...]  
> > > > > So I think this patch has to be dropped & replaced with one that
> > > > > simply documents that memdev syntax is preferred.    
> > > > 
> > > > That's definitely not enough. I've had a couple of cases already where
> > > > we documented that certain options should not be used anymore, and
> > > > people simply ignored it (aka. if it ain't broken, don't do any change).
> > > > Then they just started to complain when I really tried to remove the
> > > > option after the deprecation period.  
> > >   
> > > > So Igor, if you can not officially deprecate these things here yet, you
> > > > should at least make sure that they can not be used with new machine
> > > > types anymore. Then, after a couple of years, when we feel sure that
> > > > there are only some few or no people left who still use it with the old
> > > > machine types, we can start to discuss the deprecation process again, I
> > > > think.  
> > > Is it acceptable to silently disable CLI options (even if they are broken
> > > like in this case) for new machine types?
> > > I was under impression that it should go through deprecation first.  
> > 
> > Yes, it must go through deprecation. I was saying we can't disable
> > the CLI options at all, until there is a way for libvirt to correctly
> > use the new options.
> 
> I'm not adding new options (nor do I plan to for the numa case, yet);
> -numa node,memdev has been around for several years now and should be used
> as the default when creating new configs.
> 
> Given that the 'mem' option is being kept around for old machines,
> deprecation is meant to notify users that the legacy
> options will be disabled later on (at least for new machines,
> if no migration-compatible transition is found for older ones).
> 
> What I'm mainly aiming for here is to prevent the use of broken legacy
> options for new VMs (as in the RHBZ1624223 case), and deprecation is the
> only way we currently have to notify users about CLI-breaking changes.

The idea of doing advance warnings via deprecations is that applications
have time to adapt to the new mechanism several releases before the old
mechanism is removed/disabled.  Since the new mechanism isn't fully
usable yet, applications can't adapt to use it. So we can't start the
deprecation process yet, as it would be telling apps to do a switch
that isn't possible for many to actually do.

In the meantime, qemu-options.hx could be updated. It currently documents
both "mem" and "memdev" but neither tells people that "memdev" is
the preferred syntax for future usage nor warns against using "mem".


Regards,
Daniel

* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 14:24             ` Michal Privoznik
@ 2019-03-04 15:03               ` Igor Mammedov
  0 siblings, 0 replies; 37+ messages in thread
From: Igor Mammedov @ 2019-03-04 15:03 UTC (permalink / raw)
  To: Michal Privoznik
  Cc: Daniel P. Berrangé,
	Markus Armbruster, peter.maydell, ehabkost, libvir-list,
	qemu-devel, Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini,
	david

On Mon, 4 Mar 2019 15:24:28 +0100
Michal Privoznik <mprivozn@redhat.com> wrote:

> [Thanks Igor for bringing this onto my radar. I don't follow qemu-devel
> that closely]
> 
> On 3/4/19 11:19 AM, Daniel P. Berrangé wrote:
> > On Mon, Mar 04, 2019 at 08:13:53AM +0100, Markus Armbruster wrote:  
> >> Daniel P. Berrangé <berrange@redhat.com> writes:
> >>  
> >>> On Fri, Mar 01, 2019 at 06:33:28PM +0100, Igor Mammedov wrote:  
> >>>> On Fri, 1 Mar 2019 15:49:47 +0000
> >>>> Daniel P. Berrangé <berrange@redhat.com> wrote:
> >>>>  
> >>>>> On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:  
> >>>>>> The parameter allows to configure fake NUMA topology where guest
> >>>>>> VM simulates NUMA topology but not actually getting a performance
> >>>>>> benefits from it. The same or better results could be achieved
> >>>>>> using 'memdev' parameter. In light of that any VM that uses NUMA
> >>>>>> to get its benefits should use 'memdev' and to allow transition
> >>>>>> initial RAM to device based model, deprecate 'mem' parameter as
> >>>>>> its ad-hoc partitioning of initial RAM MemoryRegion can't be
> >>>>>> translated to memdev based backend transparently to users and in
> >>>>>> compatible manner (migration wise).
> >>>>>>
> >>>>>> That will also allow to clean up a bit our numa code, leaving only
> >>>>>> 'memdev' impl. in place and several boards that use node_mem
> >>>>>> to generate FDT/ACPI description from it.  
> >>>>>
> >>>>> Can you confirm that the  'mem' and 'memdev' parameters to -numa
> >>>>> are 100% live migration compatible in both directions ?  Libvirt
> >>>>> would need this to be the case in order to use the 'memdev' syntax
> >>>>> instead.  
> >>>> Unfortunately they are not migration compatible in any direction,
> >>>> if it where possible to translate them to each other I'd alias 'mem'
> >>>> to 'memdev' without deprecation. The former sends over only one
> >>>> MemoryRegion to target, while the later sends over several (one per
> >>>> memdev).  
> >>>
> >>> If we can't migration from one to the other, then we can not deprecate
> >>> the existing 'mem' syntax. Even if libvirt were to provide a config
> >>> option to let apps opt-in to the new syntax, we need to be able to
> >>> support live migration of existing running VMs indefinitely. Effectively
> >>> this means we need the to keep 'mem' support forever, or at least such
> >>> a long time that it effectively means forever.  
> 
> I'm with Daniel on this. The reason why libvirt still defaults to '-numa 
> node,mem=' is exactly because of backward compatibility. Since a machine 
> can't be migrated from '-numa node,mem=' to '-numa node,memdev= + 
> -object memory-backend-*' libvirt hast to play it safe and chose a 
> combination that is acessible the widest.
> 
> If you remove this, how would you expect older machines to migrate to 
> newer cmd line?
> 
> I'm all for deprecating old stuff. In fact, I've suggested that in 
> libvirt(!) here and there, but I'm afraid we can't just remove 
> functionatlity unless we give users a way to migrate to the one we 
> prefer now.
Agreed, it's clear now that I can't simply remove 'mem' for OLD machine
types (even though this safe variant is broken and doesn't actually do
what it should). Libvirt should use 'memdev' for new VMs so that they
actually benefit from the NUMA configuration.

Currently we are talking about disabling 'mem' for new machine types
only (pity that I have to keep around legacy code but at least we would
be able to move on to normal device modeling for initial memory on
new machines).
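
A rough sketch of what such per-machine-type gating could look like from the
side of whatever generates the command line. The cutoff version (4, 1) and the
helper names are hypothetical; nothing in this thread fixes the release in
which 'mem' would start being rejected, and treating an unversioned alias as
"newest" is an assumption too.

```python
import re

# Hypothetical cutoff: first versioned machine type that rejects 'mem'.
# The actual release is not decided anywhere in this thread.
MEM_REJECTED_SINCE = (4, 1)


def machine_version(machine_type):
    """Extract (major, minor) from names like 'pc-q35-2.6'; None if unversioned."""
    m = re.search(r"-(\d+)\.(\d+)$", machine_type)
    return (int(m.group(1)), int(m.group(2))) if m else None


def legacy_mem_allowed(machine_type):
    """Old machine types keep '-numa node,mem='; new ones would require memdev."""
    version = machine_version(machine_type)
    if version is None:
        # Unversioned alias (e.g. 'q35'): assume it resolves to the newest type.
        return False
    return version < MEM_REJECTED_SINCE
```

With this, an existing 'pc-q35-2.6' guest keeps its legacy syntax (and stays
migratable), while anything at or past the cutoff has to use memdev.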

> And if libvirt doesn't follow qemu's warnings then it definitely should. 
> It's a libvirt bug if it doesn't follow the best practicies (well, if can).
> 
> Michal


* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 14:24                 ` Daniel P. Berrangé
@ 2019-03-04 15:19                   ` Igor Mammedov
  2019-03-04 16:12                     ` Michal Privoznik
  2019-03-04 16:20                   ` Michal Privoznik
  1 sibling, 1 reply; 37+ messages in thread
From: Igor Mammedov @ 2019-03-04 15:19 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: peter.maydell, ehabkost, libvir-list, mprivozn, qemu-devel,
	Markus Armbruster, qemu-arm, qemu-ppc, pbonzini,
	Dr. David Alan Gilbert, david

On Mon, 4 Mar 2019 14:24:32 +0000
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Mon, Mar 04, 2019 at 03:16:41PM +0100, Igor Mammedov wrote:
> > On Mon, 4 Mar 2019 12:39:08 +0000
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> >   
> > > On Mon, Mar 04, 2019 at 01:25:07PM +0100, Igor Mammedov wrote:  
> > > > On Mon, 04 Mar 2019 08:13:53 +0100
> > > > Markus Armbruster <armbru@redhat.com> wrote:
> > > >     
> > > > > Daniel P. Berrangé <berrange@redhat.com> writes:
> > > > >     
> > > > > > On Fri, Mar 01, 2019 at 06:33:28PM +0100, Igor Mammedov wrote:      
> > > > > >> On Fri, 1 Mar 2019 15:49:47 +0000
> > > > > >> Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > >>       
> > > > > >> > On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:      
> > > > > >> > > The parameter allows to configure fake NUMA topology where guest
> > > > > >> > > VM simulates NUMA topology but not actually getting a performance
> > > > > >> > > benefits from it. The same or better results could be achieved
> > > > > >> > > using 'memdev' parameter. In light of that any VM that uses NUMA
> > > > > >> > > to get its benefits should use 'memdev' and to allow transition
> > > > > >> > > initial RAM to device based model, deprecate 'mem' parameter as
> > > > > >> > > its ad-hoc partitioning of initial RAM MemoryRegion can't be
> > > > > >> > > translated to memdev based backend transparently to users and in
> > > > > >> > > compatible manner (migration wise).
> > > > > >> > > 
> > > > > >> > > That will also allow to clean up a bit our numa code, leaving only
> > > > > >> > > 'memdev' impl. in place and several boards that use node_mem
> > > > > >> > > to generate FDT/ACPI description from it.        
> > > > > >> > 
> > > > > >> > Can you confirm that the  'mem' and 'memdev' parameters to -numa
> > > > > >> > are 100% live migration compatible in both directions ?  Libvirt
> > > > > >> > would need this to be the case in order to use the 'memdev' syntax
> > > > > >> > instead.      
> > > > > >> Unfortunately they are not migration compatible in any direction,
> > > > > >> if it where possible to translate them to each other I'd alias 'mem'
> > > > > >> to 'memdev' without deprecation. The former sends over only one
> > > > > >> MemoryRegion to target, while the later sends over several (one per
> > > > > >> memdev).      
> > > > > >
> > > > > > If we can't migration from one to the other, then we can not deprecate
> > > > > > the existing 'mem' syntax. Even if libvirt were to provide a config
> > > > > > option to let apps opt-in to the new syntax, we need to be able to
> > > > > > support live migration of existing running VMs indefinitely. Effectively
> > > > > > this means we need the to keep 'mem' support forever, or at least such
> > > > > > a long time that it effectively means forever.
> > > > > >
> > > > > > So I think this patch has to be dropped & replaced with one that
> > > > > > simply documents that memdev syntax is preferred.      
> > > > > 
> > > > > We have this habit of postulating absolutes like "can not deprecate"
> > > > > instead of engaging with the tradeoffs.  We need to kick it.
> > > > > 
> > > > > So let's have an actual look at the tradeoffs.
> > > > > 
> > > > > We don't actually "support live migration of existing running VMs
> > > > > indefinitely".
> > > > > 
> > > > > We support live migration to any newer version of QEMU that still
> > > > > supports the machine type.
> > > > > 
> > > > > We support live migration to any older version of QEMU that already
> > > > > supports the machine type and all the devices the machine uses.
> > > > > 
> > > > > Aside: "support" is really an honest best effort here.  If you rely on
> > > > > it, use a downstream that puts in the (substantial!) QA work real
> > > > > support takes.
> > > > > 
> > > > > Feature deprecation is not a contract to drop the feature after two
> > > > > releases, or even five.  It's a formal notice that users of the feature
> > > > > should transition to its replacement in an orderly manner.
> > > > > 
> > > > > If I understand Igor correctly, all users should transition away from
> > > > > outdated NUMA configurations at least for new VMs in an orderly manner.    
> > > > Yes, we can postpone removing options until there are machines type
> > > > versions that were capable to use it (unfortunate but probably 
> > > > unavoidable unless there is a migration trick to make transition
> > > > transparent) but that should not stop us from disabling broken
> > > > options on new machine types at least.
> > > > 
> > > > This series can serve as formal notice with follow up disabling of
> > > > deprecated options for new machine types. (As Thomas noted, just warnings
> > > > do not work and users continue to use broken features regardless whether
> > > > they are don't know about issues or aware of it [*])
> > > > 
> > > > Hence suggested deprecation approach and enforced rejection of legacy
> > > > numa options for new machine types in 2 releases so users would stop
> > > > using them eventually.    
> > > 
> > > When we deprecate something, we need to have a way for apps to use the
> > > new alternative approach *at the same time*.  So even if we only want to
> > > deprecate for new machine types, we still have to first solve the problem
> > > of how mgmt apps will introspect QEMU to learn which machine types expect
> > > the new options.  
> > I'm not aware any mechanism to introspect machine type options (existing
> > or something being developed). Are/were there any ideas about it that were
> > discussed in the past?
> > 
> > Aside from developing a new mechanism what are alternative approaches?
> > I mean when we delete deprecated CLI option, how it's solved on libvirt
> > side currently?
> > 
> > For example I don't see anything introspection related when we have been
> > removing deprecated options recently.  
> 
> Right, with other stuff we deprecate we've had a simpler time, as it
> either didn't affect migration at all, or the new replacement stuff
> was fully compatible with the migration data stream. IOW, libvirt
> could unconditionally use the new feature as soon as it saw that it
> exists in QEMU. We didn't have any machine type dependancy to deal
> with before now.
Any suggestions on which direction we should take?
(I'm really not keen to develop a new introspection feature, but if that's
the only way to move forward ...)

> > More exact question specific to this series usecase,
> > how libvirt decides when to use -numa node,memdev or not currently?  
> 
> It is pretty hard to follow the code, but IIUC we only use memdev when
> stting up NVDIMMs, or for guests configured to have the "shared" flag
> on the memory region.
Then I'd guess that most VMs end up with the default '-numa node,mem',
which by design can only produce fake NUMA, without the ability to manage
guest RAM on the host side. So such VMs aren't getting performance benefits
or, worse, run with a performance regression (due to wrong sched/mm decisions,
as the guest kernel assumes the NUMA topology is a valid one).
 
> Regards,
> Daniel


* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 11:45             ` Markus Armbruster
@ 2019-03-04 15:28               ` Daniel P. Berrangé
  2019-03-04 15:46                 ` Igor Mammedov
  2019-03-10 10:14                 ` Markus Armbruster
  0 siblings, 2 replies; 37+ messages in thread
From: Daniel P. Berrangé @ 2019-03-04 15:28 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: peter.maydell, ehabkost, libvir-list, qemu-devel,
	Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini,
	Igor Mammedov, david

On Mon, Mar 04, 2019 at 12:45:14PM +0100, Markus Armbruster wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
> > On Mon, Mar 04, 2019 at 08:13:53AM +0100, Markus Armbruster wrote:
> >> If we deprecate outdated NUMA configurations now, we can start rejecting
> >> them with new machine types after a suitable grace period.
> >
> > How is libvirt going to know what machines it can use with the feature ?
> > We don't have any way to introspect machine type specific logic, since we
> > run all probing with "-machine none", and QEMU can't report anything about
> > machines without instantiating them.
> 
> Fair point.  A practical way for management applications to decide which
> of the two interfaces they can use with which machine type may be
> required for deprecating one of the interfaces with new machine types.

We currently have "qom-list-properties", which can report on the
existence of properties registered against object types. What it
can't do, though, is report on the default values of these properties.

What's interesting though is that qmp_qom_list_properties will actually
instantiate objects in order to query properties, if the type isn't an
abstract type.

IOW, even if you are running "$QEMU -machine none", then if at the qmp-shell
you do

   (QEMU) qom-list-properties typename=pc-q35-2.6-machine

it will actually instantiate the pc-q35-2.6-machine machine type.
Since it has instantiated the machine, the object initializer function
will have run and initialized the default values for various properties.

IOW, it is possible for qom-list-properties to report on default values
for non-abstract types.

I did a quick hack to PoC the theory:

diff --git a/qapi/misc.json b/qapi/misc.json
index 8b3ca4fdd3..906dfbf3b5 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -1368,7 +1368,8 @@
 # Since: 1.2
 ##
 { 'struct': 'ObjectPropertyInfo',
-  'data': { 'name': 'str', 'type': 'str', '*description': 'str' } }
+  'data': { 'name': 'str', 'type': 'str', '*description': 'str',
+            '*default': 'str'} }
 
 ##
 # @qom-list:
diff --git a/qmp.c b/qmp.c
index b92d62cd5f..a45669032c 100644
--- a/qmp.c
+++ b/qmp.c
@@ -594,6 +594,11 @@ ObjectPropertyInfoList *qmp_qom_list_properties(const char *typename,
         info->has_description = !!prop->description;
         info->description = g_strdup(prop->description);
 
+        if (obj && g_str_equal(info->type, "string")) {
+            info->q_default = g_strdup(object_property_get_str(obj, info->name, NULL));
+            info->has_q_default = info->q_default != NULL;
+        }
+
         entry = g_malloc0(sizeof(*entry));
         entry->value = info;
         entry->next = prop_list;


If we could make this hack less of a hack, then perhaps this is good
enough to cope with reporting machine types which forbid use of "mem" in
favour of "memdev" ? They would need to have a property registered
against them of course to identify the "memdev" requirement.
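
On the management side, consuming such output might look roughly like the
sketch below. It assumes the optional 'default' member from the PoC above is
present in the reply. The property name "numa-mem-supported" and its
"true"/"false" string values are purely hypothetical; it is string-typed only
because the PoC reports defaults for string properties alone.

```python
import json

# Hypothetical property name; the thread only says that *some* property
# would have to be registered on machine types to signal the requirement.
HYPOTHETICAL_PROP = "numa-mem-supported"


def build_query(typename):
    """Serialize the QMP command that lists the properties of a QOM type."""
    return json.dumps({"execute": "qom-list-properties",
                       "arguments": {"typename": typename}})


def mem_allowed(reply_text):
    """Return True/False if the machine reports the property, else None.

    Relies on the PoC's optional 'default' member of ObjectPropertyInfo,
    which QEMU does not report without that patch.
    """
    reply = json.loads(reply_text)
    for prop in reply.get("return", []):
        if prop.get("name") == HYPOTHETICAL_PROP and "default" in prop:
            return prop["default"] == "true"
    return None
```

A None result would mean "this QEMU can't tell us", in which case the caller
would have to fall back to whatever it does today.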

Regards,
Daniel


* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 15:28               ` Daniel P. Berrangé
@ 2019-03-04 15:46                 ` Igor Mammedov
  2019-03-10 10:14                 ` Markus Armbruster
  1 sibling, 0 replies; 37+ messages in thread
From: Igor Mammedov @ 2019-03-04 15:46 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Markus Armbruster, peter.maydell, ehabkost, libvir-list,
	qemu-devel, Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini,
	david

On Mon, 4 Mar 2019 15:28:11 +0000
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Mon, Mar 04, 2019 at 12:45:14PM +0100, Markus Armbruster wrote:
> > Daniel P. Berrangé <berrange@redhat.com> writes:
> >   
> > > On Mon, Mar 04, 2019 at 08:13:53AM +0100, Markus Armbruster wrote:  
> > >> If we deprecate outdated NUMA configurations now, we can start rejecting
> > >> them with new machine types after a suitable grace period.  
> > >
> > > How is libvirt going to know what machines it can use with the feature ?
> > > We don't have any way to introspect machine type specific logic, since we
> > > run all probing with "-machine none", and QEMU can't report anything about
> > > machines without instantiating them.  
> > 
> > Fair point.  A practical way for management applications to decide which
> > of the two interfaces they can use with which machine type may be
> > required for deprecating one of the interfaces with new machine types.  
> 
> We currently have  "qom-list-properties" which can report on the
> existance of properties registered against object types. What it
> can't do though is report on the default values of these properties.
> 
> What's interesting though is that qmp_qom_list_properties will actually
> instantiate objects in order to query properties, if the type isn't an
> abstract type.
> 
> IOW, even if you are running "$QEMU -machine none", then if at the qmp-shell
> you do
> 
>    (QEMU) qom-list-properties typename=pc-q35-2.6-machine
> 
> it will have actually instantiate the pc-q35-2.6-machine machine type.
> Since it has instantiated the machine, the object initializer function
> will have run and initialized the default values for various properties.
> 
> IOW, it is possible for qom-list-properties to report on default values
> for non-abstract types.
> 
> I did a quick hack to PoC the theory:
> 
> diff --git a/qapi/misc.json b/qapi/misc.json
> index 8b3ca4fdd3..906dfbf3b5 100644
> --- a/qapi/misc.json
> +++ b/qapi/misc.json
> @@ -1368,7 +1368,8 @@
>  # Since: 1.2
>  ##
>  { 'struct': 'ObjectPropertyInfo',
> -  'data': { 'name': 'str', 'type': 'str', '*description': 'str' } }
> +  'data': { 'name': 'str', 'type': 'str', '*description': 'str',
> +            '*default': 'str'} }
>  
>  ##
>  # @qom-list:
> diff --git a/qmp.c b/qmp.c
> index b92d62cd5f..a45669032c 100644
> --- a/qmp.c
> +++ b/qmp.c
> @@ -594,6 +594,11 @@ ObjectPropertyInfoList *qmp_qom_list_properties(const char *typename,
>          info->has_description = !!prop->description;
>          info->description = g_strdup(prop->description);
>  
> +        if (obj && g_str_equal(info->type, "string")) {
> +            info->q_default = g_strdup(object_property_get_str(obj, info->name, NULL));
> +            info->has_q_default = info->q_default != NULL;
> +        }
> +
>          entry = g_malloc0(sizeof(*entry));
>          entry->value = info;
>          entry->next = prop_list;
> 
> 
> If we could make this hack less of a hack, then perhaps this is good
> enough to cope reporting machine types which forbid use of "mem" in
> favour of "memdev" ? They would need to have a property registered
> against them of course to identify the "memdev" requirement.

Thanks, I'll look into it and try to come up with patches.

> Regards,
> Daniel


* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 15:19                   ` Igor Mammedov
@ 2019-03-04 16:12                     ` Michal Privoznik
  2019-03-04 16:27                       ` Daniel P. Berrangé
  0 siblings, 1 reply; 37+ messages in thread
From: Michal Privoznik @ 2019-03-04 16:12 UTC (permalink / raw)
  To: Igor Mammedov, Daniel P. Berrangé
  Cc: peter.maydell, ehabkost, libvir-list, qemu-devel,
	Markus Armbruster, qemu-arm, qemu-ppc, pbonzini,
	Dr. David Alan Gilbert, david

On 3/4/19 4:19 PM, Igor Mammedov wrote:

> Then I'd guess that most VMs end up with default '-numa node,mem'
> which by design can produce only fake NUMA without ability to manage
> guest RAM on host side. So such VMs aren't getting performance benefits
> or worse run with performance regression (due to wrong sched/mm decisions
> as guest kernel assumes NUMA topology is valid one).

Specifying NUMA distances in libvirt XML makes it generate the modern 
cmd line.

Michal


* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 14:24                 ` Daniel P. Berrangé
  2019-03-04 15:19                   ` Igor Mammedov
@ 2019-03-04 16:20                   ` Michal Privoznik
  2019-03-04 16:31                     ` Dr. David Alan Gilbert
                                       ` (2 more replies)
  1 sibling, 3 replies; 37+ messages in thread
From: Michal Privoznik @ 2019-03-04 16:20 UTC (permalink / raw)
  To: Daniel P. Berrangé, Igor Mammedov
  Cc: Markus Armbruster, peter.maydell, ehabkost, libvir-list,
	qemu-devel, Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini,
	david

On 3/4/19 3:24 PM, Daniel P. Berrangé wrote:
> On Mon, Mar 04, 2019 at 03:16:41PM +0100, Igor Mammedov wrote:
>> On Mon, 4 Mar 2019 12:39:08 +0000
>> Daniel P. Berrangé <berrange@redhat.com> wrote:
>>
>>> On Mon, Mar 04, 2019 at 01:25:07PM +0100, Igor Mammedov wrote:
>>>> On Mon, 04 Mar 2019 08:13:53 +0100
>>>> Markus Armbruster <armbru@redhat.com> wrote:
>>>>    
>>>>> Daniel P. Berrangé <berrange@redhat.com> writes:
>>>>>    
>>>>>> On Fri, Mar 01, 2019 at 06:33:28PM +0100, Igor Mammedov wrote:
>>>>>>> On Fri, 1 Mar 2019 15:49:47 +0000
>>>>>>> Daniel P. Berrangé <berrange@redhat.com> wrote:
>>>>>>>      
>>>>>>>> On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:
>>>>>>>>> The parameter allows to configure fake NUMA topology where guest
>>>>>>>>> VM simulates NUMA topology but not actually getting a performance
>>>>>>>>> benefits from it. The same or better results could be achieved
>>>>>>>>> using 'memdev' parameter. In light of that any VM that uses NUMA
>>>>>>>>> to get its benefits should use 'memdev' and to allow transition
>>>>>>>>> initial RAM to device based model, deprecate 'mem' parameter as
>>>>>>>>> its ad-hoc partitioning of initial RAM MemoryRegion can't be
>>>>>>>>> translated to memdev based backend transparently to users and in
>>>>>>>>> compatible manner (migration wise).
>>>>>>>>>
>>>>>>>>> That will also allow to clean up a bit our numa code, leaving only
>>>>>>>>> 'memdev' impl. in place and several boards that use node_mem
>>>>>>>>> to generate FDT/ACPI description from it.
>>>>>>>>
>>>>>>>> Can you confirm that the  'mem' and 'memdev' parameters to -numa
>>>>>>>> are 100% live migration compatible in both directions ?  Libvirt
>>>>>>>> would need this to be the case in order to use the 'memdev' syntax
>>>>>>>> instead.
>>>>>>> Unfortunately they are not migration compatible in any direction,
>>>>>>> if it where possible to translate them to each other I'd alias 'mem'
>>>>>>> to 'memdev' without deprecation. The former sends over only one
>>>>>>> MemoryRegion to target, while the later sends over several (one per
>>>>>>> memdev).
>>>>>>
>>>>>> If we can't migration from one to the other, then we can not deprecate
>>>>>> the existing 'mem' syntax. Even if libvirt were to provide a config
>>>>>> option to let apps opt-in to the new syntax, we need to be able to
>>>>>> support live migration of existing running VMs indefinitely. Effectively
>>>>>> this means we need the to keep 'mem' support forever, or at least such
>>>>>> a long time that it effectively means forever.
>>>>>>
>>>>>> So I think this patch has to be dropped & replaced with one that
>>>>>> simply documents that memdev syntax is preferred.
>>>>>
>>>>> We have this habit of postulating absolutes like "can not deprecate"
>>>>> instead of engaging with the tradeoffs.  We need to kick it.
>>>>>
>>>>> So let's have an actual look at the tradeoffs.
>>>>>
>>>>> We don't actually "support live migration of existing running VMs
>>>>> indefinitely".
>>>>>
>>>>> We support live migration to any newer version of QEMU that still
>>>>> supports the machine type.
>>>>>
>>>>> We support live migration to any older version of QEMU that already
>>>>> supports the machine type and all the devices the machine uses.
>>>>>
>>>>> Aside: "support" is really an honest best effort here.  If you rely on
>>>>> it, use a downstream that puts in the (substantial!) QA work real
>>>>> support takes.
>>>>>
>>>>> Feature deprecation is not a contract to drop the feature after two
>>>>> releases, or even five.  It's a formal notice that users of the feature
>>>>> should transition to its replacement in an orderly manner.
>>>>>
>>>>> If I understand Igor correctly, all users should transition away from
>>>>> outdated NUMA configurations at least for new VMs in an orderly manner.
>>>> Yes, we can postpone removing options until there are machines type
>>>> versions that were capable to use it (unfortunate but probably
>>>> unavoidable unless there is a migration trick to make transition
>>>> transparent) but that should not stop us from disabling broken
>>>> options on new machine types at least.
>>>>
>>>> This series can serve as formal notice with follow up disabling of
>>>> deprecated options for new machine types. (As Thomas noted, just warnings
>>>> do not work and users continue to use broken features regardless whether
>>>> they are don't know about issues or aware of it [*])
>>>>
>>>> Hence suggested deprecation approach and enforced rejection of legacy
>>>> numa options for new machine types in 2 releases so users would stop
>>>> using them eventually.
>>>
>>> When we deprecate something, we need to have a way for apps to use the
>>> new alternative approach *at the same time*.  So even if we only want to
>>> deprecate for new machine types, we still have to first solve the problem
>>> of how mgmt apps will introspect QEMU to learn which machine types expect
>>> the new options.
>> I'm not aware any mechanism to introspect machine type options (existing
>> or something being developed). Are/were there any ideas about it that were
>> discussed in the past?
>>
>> Aside from developing a new mechanism what are alternative approaches?
>> I mean when we delete deprecated CLI option, how it's solved on libvirt
>> side currently?
>>
>> For example I don't see anything introspection related when we have been
>> removing deprecated options recently.
> 
> Right, with other stuff we deprecate we've had a simpler time, as it
> either didn't affect migration at all, or the new replacement stuff
> was fully compatible with the migration data stream. IOW, libvirt
> could unconditionally use the new feature as soon as it saw that it
> exists in QEMU. We didn't have any machine type dependancy to deal
> with before now.

We couldn't have done that. How would we migrate from older qemu?

Anyway, now that I look into this (esp. git log) I came across:

commit f309db1f4d51009bad0d32e12efc75530b66836b
Author:     Michal Privoznik <mprivozn@redhat.com>
AuthorDate: Thu Dec 18 12:36:48 2014 +0100
Commit:     Michal Privoznik <mprivozn@redhat.com>
CommitDate: Fri Dec 19 07:44:44 2014 +0100

     qemu: Create memory-backend-{ram,file} iff needed

Or this 7832fac84741d65e851dbdbfaf474785cbfdcf3c. We did try to
generate the newer cmd line, but then for various reasons (e.g. avoiding
triggering a qemu bug) we turned it off and made libvirt default to the
older (now deprecated) cmd line.

Frankly, I don't know how to proceed. Unless qemu is fixed to allow
migration from the deprecated to the new cmd line (unlikely, if not
impossible, right?) then I guess the only approach we can take is this:

1) whenever cold booting a new machine (a fresh, brand new start
of a new domain), libvirt would default to the modern cmd line,

2) on migration, libvirt would record in the migration stream (or status
XML or wherever) that the modern cmd line was generated, and thus make
the destination generate the modern cmd line too.

This solution still suffers from a couple of problems:
a) migration to older libvirt will fail, as older libvirt won't recognize
the flag set in 2) and therefore would default to the deprecated cmd line
b) migrating from one host to another won't modernize the cmd line

But I guess we have to draw a line somewhere (if we are not willing to 
write those migration patches).
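
To make 1) and 2) concrete, the decision could be sketched like this. The
function and parameter names are made up for illustration and this is not
existing libvirt code.

```python
def choose_numa_syntax(cold_boot, source_used_memdev=None):
    """Pick 'memdev' or 'mem' for the '-numa node' arguments.

    cold_boot: True for a fresh start of a domain, False for an incoming
        migration.
    source_used_memdev: flag recorded by the migration source (step 2);
        None when the source did not record it at all.
    """
    if cold_boot:
        return "memdev"   # step 1: new domains get the modern cmd line
    if source_used_memdev:
        return "memdev"   # step 2: mirror what the source generated
    return "mem"          # no flag, or flag unset: stay on the legacy syntax
```

The last branch is problem b) in practice: a domain started with 'mem' stays
on 'mem' across any number of migrations until its next cold boot.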

Michal


* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 16:12                     ` Michal Privoznik
@ 2019-03-04 16:27                       ` Daniel P. Berrangé
  0 siblings, 0 replies; 37+ messages in thread
From: Daniel P. Berrangé @ 2019-03-04 16:27 UTC (permalink / raw)
  To: Michal Privoznik
  Cc: Igor Mammedov, peter.maydell, ehabkost, libvir-list, qemu-devel,
	Markus Armbruster, qemu-arm, qemu-ppc, pbonzini,
	Dr. David Alan Gilbert, david

On Mon, Mar 04, 2019 at 05:12:40PM +0100, Michal Privoznik wrote:
> On 3/4/19 4:19 PM, Igor Mammedov wrote:
> 
> > Then I'd guess that most VMs end up with default '-numa node,mem'
> > which by design can produce only fake NUMA without ability to manage
> > guest RAM on host side. So such VMs aren't getting performance benefits
> > or worse run with performance regression (due to wrong sched/mm decisions
> > as guest kernel assumes NUMA topology is valid one).
> 
> Specifying NUMA distances in libvirt XML makes it generate the modern cmd
> line.

AFAIK, specifying any guest NUMA -> host NUMA affinity makes it use the
modern cmd line. E.g. I just modified a plain 8 CPU / 2 GB RAM guest
with this:

  <numatune>
    <memnode cellid='0' mode='strict' nodeset='0'/>
    <memnode cellid='1' mode='strict' nodeset='1'/>
  </numatune>
  <cpu mode='host-model'>
    <numa>
      <cell id='0' cpus='0-3' memory='1024000' unit='KiB'/>
      <cell id='1' cpus='4-7' memory='1024000' unit='KiB'/>
    </numa>
  </cpu>

and I can see libvirt decided to use memdev

  -object memory-backend-ram,id=ram-node0,size=1048576000,host-nodes=0,policy=bind
  -numa node,nodeid=0,cpus=0-3,memdev=ram-node0
  -object memory-backend-ram,id=ram-node1,size=1048576000,host-nodes=1,policy=bind
  -numa node,nodeid=1,cpus=4-7,memdev=ram-node1 

So unless I'm missing something, we aren't suffering from the problem
described by Igor above even today.
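
For reference, the mapping from the XML above to those arguments can be
sketched as follows. This is only an illustration of the arithmetic and the
argument layout, not libvirt's actual generator; the dict keys are made up,
and policy=bind is hard-coded to match mode='strict' in the example.

```python
def numa_args(cells):
    """Build memdev-style QEMU arguments from guest NUMA cell descriptions.

    Each cell is a dict with 'id', 'cpus', 'memory_kib' and 'host_node';
    these key names exist only in this sketch.
    """
    args = []
    for cell in cells:
        backend_id = "ram-node%d" % cell["id"]
        size_bytes = cell["memory_kib"] * 1024  # KiB -> bytes
        args += [
            "-object",
            "memory-backend-ram,id=%s,size=%d,host-nodes=%d,policy=bind"
            % (backend_id, size_bytes, cell["host_node"]),
            "-numa",
            "node,nodeid=%d,cpus=%s,memdev=%s"
            % (cell["id"], cell["cpus"], backend_id),
        ]
    return args
```

Feeding it the two cells from the XML (1024000 KiB each) gives
size=1048576000 per backend, matching the command line above.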


Regards,
Daniel


* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 16:20                   ` Michal Privoznik
@ 2019-03-04 16:31                     ` Dr. David Alan Gilbert
  2019-03-04 16:35                     ` Daniel P. Berrangé
  2019-03-06 19:56                     ` Igor Mammedov
  2 siblings, 0 replies; 37+ messages in thread
From: Dr. David Alan Gilbert @ 2019-03-04 16:31 UTC (permalink / raw)
  To: Michal Privoznik
  Cc: Daniel P. Berrangé,
	Igor Mammedov, Markus Armbruster, peter.maydell, ehabkost,
	libvir-list, qemu-devel, qemu-arm, qemu-ppc, pbonzini, david

* Michal Privoznik (mprivozn@redhat.com) wrote:
> On 3/4/19 3:24 PM, Daniel P. Berrangé wrote:
> > On Mon, Mar 04, 2019 at 03:16:41PM +0100, Igor Mammedov wrote:
> > > On Mon, 4 Mar 2019 12:39:08 +0000
> > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > 
> > > > On Mon, Mar 04, 2019 at 01:25:07PM +0100, Igor Mammedov wrote:
> > > > > On Mon, 04 Mar 2019 08:13:53 +0100
> > > > > Markus Armbruster <armbru@redhat.com> wrote:
> > > > > > Daniel P. Berrangé <berrange@redhat.com> writes:
> > > > > > > On Fri, Mar 01, 2019 at 06:33:28PM +0100, Igor Mammedov wrote:
> > > > > > > > On Fri, 1 Mar 2019 15:49:47 +0000
> > > > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > > > > On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:
> > > > > > > > > > The parameter allows to configure fake NUMA topology where guest
> > > > > > > > > > VM simulates NUMA topology but not actually getting a performance
> > > > > > > > > > benefits from it. The same or better results could be achieved
> > > > > > > > > > using 'memdev' parameter. In light of that any VM that uses NUMA
> > > > > > > > > > to get its benefits should use 'memdev' and to allow transition
> > > > > > > > > > initial RAM to device based model, deprecate 'mem' parameter as
> > > > > > > > > > its ad-hoc partitioning of initial RAM MemoryRegion can't be
> > > > > > > > > > translated to memdev based backend transparently to users and in
> > > > > > > > > > compatible manner (migration wise).
> > > > > > > > > > 
> > > > > > > > > > That will also allow to clean up a bit our numa code, leaving only
> > > > > > > > > > 'memdev' impl. in place and several boards that use node_mem
> > > > > > > > > > to generate FDT/ACPI description from it.
> > > > > > > > > 
> > > > > > > > > Can you confirm that the  'mem' and 'memdev' parameters to -numa
> > > > > > > > > are 100% live migration compatible in both directions ?  Libvirt
> > > > > > > > > would need this to be the case in order to use the 'memdev' syntax
> > > > > > > > > instead.
> > > > > > > > > Unfortunately they are not migration compatible in any direction;
> > > > > > > > > if it were possible to translate them to each other I'd alias 'mem'
> > > > > > > > > to 'memdev' without deprecation. The former sends over only one
> > > > > > > > > MemoryRegion to the target, while the latter sends over several (one per
> > > > > > > > > memdev).
> > > > > > > > 
> > > > > > > > If we can't migrate from one to the other, then we can not deprecate
> > > > > > > > the existing 'mem' syntax. Even if libvirt were to provide a config
> > > > > > > > option to let apps opt-in to the new syntax, we need to be able to
> > > > > > > > support live migration of existing running VMs indefinitely. Effectively
> > > > > > > > this means we need to keep 'mem' support forever, or at least such
> > > > > > > > a long time that it effectively means forever.
> > > > > > > 
> > > > > > > So I think this patch has to be dropped & replaced with one that
> > > > > > > simply documents that memdev syntax is preferred.
> > > > > > 
> > > > > > We have this habit of postulating absolutes like "can not deprecate"
> > > > > > instead of engaging with the tradeoffs.  We need to kick it.
> > > > > > 
> > > > > > So let's have an actual look at the tradeoffs.
> > > > > > 
> > > > > > We don't actually "support live migration of existing running VMs
> > > > > > indefinitely".
> > > > > > 
> > > > > > We support live migration to any newer version of QEMU that still
> > > > > > supports the machine type.
> > > > > > 
> > > > > > We support live migration to any older version of QEMU that already
> > > > > > supports the machine type and all the devices the machine uses.
> > > > > > 
> > > > > > Aside: "support" is really an honest best effort here.  If you rely on
> > > > > > it, use a downstream that puts in the (substantial!) QA work real
> > > > > > support takes.
> > > > > > 
> > > > > > Feature deprecation is not a contract to drop the feature after two
> > > > > > releases, or even five.  It's a formal notice that users of the feature
> > > > > > should transition to its replacement in an orderly manner.
> > > > > > 
> > > > > > If I understand Igor correctly, all users should transition away from
> > > > > > outdated NUMA configurations at least for new VMs in an orderly manner.
> > > > > Yes, we can postpone removing options until there are machines type
> > > > > versions that were capable to use it (unfortunate but probably
> > > > > unavoidable unless there is a migration trick to make transition
> > > > > transparent) but that should not stop us from disabling broken
> > > > > options on new machine types at least.
> > > > > 
> > > > > This series can serve as formal notice with follow up disabling of
> > > > > deprecated options for new machine types. (As Thomas noted, just warnings
> > > > > > do not work and users continue to use broken features regardless of whether
> > > > > > they don't know about the issues or are aware of them [*])
> > > > > 
> > > > > Hence suggested deprecation approach and enforced rejection of legacy
> > > > > numa options for new machine types in 2 releases so users would stop
> > > > > using them eventually.
> > > > 
> > > > When we deprecate something, we need to have a way for apps to use the
> > > > new alternative approach *at the same time*.  So even if we only want to
> > > > deprecate for new machine types, we still have to first solve the problem
> > > > of how mgmt apps will introspect QEMU to learn which machine types expect
> > > > the new options.
> > > > I'm not aware of any mechanism to introspect machine type options (existing
> > > or something being developed). Are/were there any ideas about it that were
> > > discussed in the past?
> > > 
> > > Aside from developing a new mechanism what are alternative approaches?
> > > I mean when we delete deprecated CLI option, how it's solved on libvirt
> > > side currently?
> > > 
> > > For example I don't see anything introspection related when we have been
> > > removing deprecated options recently.
> > 
> > Right, with other stuff we deprecate we've had a simpler time, as it
> > either didn't affect migration at all, or the new replacement stuff
> > was fully compatible with the migration data stream. IOW, libvirt
> > could unconditionally use the new feature as soon as it saw that it
> > > exists in QEMU. We didn't have any machine type dependency to deal
> > with before now.
> 
> We couldn't have done that. How would we migrate from older qemu?
> 
> Anyway, now that I look into this (esp. git log) I came across:
> 
> commit f309db1f4d51009bad0d32e12efc75530b66836b
> Author:     Michal Privoznik <mprivozn@redhat.com>
> AuthorDate: Thu Dec 18 12:36:48 2014 +0100
> Commit:     Michal Privoznik <mprivozn@redhat.com>
> CommitDate: Fri Dec 19 07:44:44 2014 +0100
> 
>     qemu: Create memory-backend-{ram,file} iff needed
> 
> Or this 7832fac84741d65e851dbdbfaf474785cbfdcf3c. We did try to generate the
> newer cmd line but then for various reasons (e.g. avoiding triggering a qemu
> bug) we turned it off and made libvirt default to the older (now deprecated) cmd
> line.
> 
> Frankly, I don't know how to proceed. Unless qemu is fixed to allow
> migration from deprecated to new cmd line (unlikely, if not impossible,
> right?) then I guess the only approach we can have is that:
> 
> 1) whenever so called cold booting a new machine (fresh, brand new start of
> a new domain) libvirt would default to modern cmd line,
>
> 2) on migration, libvirt would record in the migration stream (or status XML
> or wherever) that modern cmd line was generated and thus it'll make the
> destination generate modern cmd line too.
> 
> This solution still suffers a couple of problems:
> a) migration to older libvirt will fail as older libvirt won't recognize the
> flag set in 2) and therefore would default to deprecated cmd line
> b) migrating from one host to another won't modernize the cmd line
> 
> But I guess we have to draw a line somewhere (if we are not willing to write
> those migration patches).

What's interesting here is that this problem isn't really machine-type
related; so providing introspection on the machine type doesn't
immediately help.  What we're actually trying to do here is (mis)use a
machine type as a proxy for knowing that both sides are new enough to
handle the new command line.

That's an OK thing to do, and if we did have introspection we could
add a fudge flag to say it's allowed now.
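
To make the scheme from the quoted text concrete (cold boot picks the modern
command line, migration carries a flag so the destination matches the source),
here is a minimal Python sketch. It is not libvirt code: MigrationCookie, its
numa_memdev field and choose_numa_syntax() are hypothetical names invented for
this illustration, and the only assumption about QEMU is that 'mem' and
'memdev' guests cannot be migrated into each other.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MigrationCookie:
    # Hypothetical flag sent alongside the migration data.
    # None means the source side predates the flag.
    numa_memdev: Optional[bool] = None


def choose_numa_syntax(cold_boot: bool,
                       qemu_has_memdev: bool,
                       cookie: Optional[MigrationCookie] = None) -> str:
    """Return 'memdev' or 'mem' for the -numa node option."""
    if cold_boot:
        # Step 1: a freshly started domain defaults to the modern syntax.
        return "memdev" if qemu_has_memdev else "mem"

    # Step 2: an incoming migration must mirror what the source generated,
    # because the two layouts send different sets of RAM regions.
    if cookie is not None and cookie.numa_memdev:
        if not qemu_has_memdev:
            raise RuntimeError("source used memdev, destination cannot")
        return "memdev"

    # No flag (or flag unset): the source used the legacy syntax.
    return "mem"
```

A destination that predates the flag would ignore it and always pick 'mem',
which is exactly problem a) in the quoted list.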

Dave


> Michal
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 16:20                   ` Michal Privoznik
  2019-03-04 16:31                     ` Dr. David Alan Gilbert
@ 2019-03-04 16:35                     ` Daniel P. Berrangé
  2019-03-06 19:03                       ` Igor Mammedov
  2019-03-06 19:56                     ` Igor Mammedov
  2 siblings, 1 reply; 37+ messages in thread
From: Daniel P. Berrangé @ 2019-03-04 16:35 UTC (permalink / raw)
  To: Michal Privoznik
  Cc: Igor Mammedov, Markus Armbruster, peter.maydell, ehabkost,
	libvir-list, qemu-devel, Dr. David Alan Gilbert, qemu-arm,
	qemu-ppc, pbonzini, david

On Mon, Mar 04, 2019 at 05:20:13PM +0100, Michal Privoznik wrote:
> On 3/4/19 3:24 PM, Daniel P. Berrangé wrote:
> > On Mon, Mar 04, 2019 at 03:16:41PM +0100, Igor Mammedov wrote:
> > > On Mon, 4 Mar 2019 12:39:08 +0000
> > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > 
> > > > On Mon, Mar 04, 2019 at 01:25:07PM +0100, Igor Mammedov wrote:
> > > > > On Mon, 04 Mar 2019 08:13:53 +0100
> > > > > Markus Armbruster <armbru@redhat.com> wrote:
> > > > > > Daniel P. Berrangé <berrange@redhat.com> writes:
> > > > > > > On Fri, Mar 01, 2019 at 06:33:28PM +0100, Igor Mammedov wrote:
> > > > > > > > On Fri, 1 Mar 2019 15:49:47 +0000
> > > > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > > > > On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:
> > > > > > > > > > The parameter allows to configure fake NUMA topology where guest
> > > > > > > > > > VM simulates NUMA topology but not actually getting a performance
> > > > > > > > > > benefits from it. The same or better results could be achieved
> > > > > > > > > > using 'memdev' parameter. In light of that any VM that uses NUMA
> > > > > > > > > > to get its benefits should use 'memdev' and to allow transition
> > > > > > > > > > initial RAM to device based model, deprecate 'mem' parameter as
> > > > > > > > > > its ad-hoc partitioning of initial RAM MemoryRegion can't be
> > > > > > > > > > translated to memdev based backend transparently to users and in
> > > > > > > > > > compatible manner (migration wise).
> > > > > > > > > > 
> > > > > > > > > > That will also allow to clean up a bit our numa code, leaving only
> > > > > > > > > > 'memdev' impl. in place and several boards that use node_mem
> > > > > > > > > > to generate FDT/ACPI description from it.
> > > > > > > > > 
> > > > > > > > > Can you confirm that the  'mem' and 'memdev' parameters to -numa
> > > > > > > > > are 100% live migration compatible in both directions ?  Libvirt
> > > > > > > > > would need this to be the case in order to use the 'memdev' syntax
> > > > > > > > > instead.
> > > > > > > > > Unfortunately they are not migration compatible in any direction;
> > > > > > > > > if it were possible to translate them to each other I'd alias 'mem'
> > > > > > > > > to 'memdev' without deprecation. The former sends over only one
> > > > > > > > > MemoryRegion to the target, while the latter sends over several (one per
> > > > > > > > > memdev).
> > > > > > > > 
> > > > > > > > If we can't migrate from one to the other, then we can not deprecate
> > > > > > > > the existing 'mem' syntax. Even if libvirt were to provide a config
> > > > > > > > option to let apps opt-in to the new syntax, we need to be able to
> > > > > > > > support live migration of existing running VMs indefinitely. Effectively
> > > > > > > > this means we need to keep 'mem' support forever, or at least such
> > > > > > > > a long time that it effectively means forever.
> > > > > > > 
> > > > > > > So I think this patch has to be dropped & replaced with one that
> > > > > > > simply documents that memdev syntax is preferred.
> > > > > > 
> > > > > > We have this habit of postulating absolutes like "can not deprecate"
> > > > > > instead of engaging with the tradeoffs.  We need to kick it.
> > > > > > 
> > > > > > So let's have an actual look at the tradeoffs.
> > > > > > 
> > > > > > We don't actually "support live migration of existing running VMs
> > > > > > indefinitely".
> > > > > > 
> > > > > > We support live migration to any newer version of QEMU that still
> > > > > > supports the machine type.
> > > > > > 
> > > > > > We support live migration to any older version of QEMU that already
> > > > > > supports the machine type and all the devices the machine uses.
> > > > > > 
> > > > > > Aside: "support" is really an honest best effort here.  If you rely on
> > > > > > it, use a downstream that puts in the (substantial!) QA work real
> > > > > > support takes.
> > > > > > 
> > > > > > Feature deprecation is not a contract to drop the feature after two
> > > > > > releases, or even five.  It's a formal notice that users of the feature
> > > > > > should transition to its replacement in an orderly manner.
> > > > > > 
> > > > > > If I understand Igor correctly, all users should transition away from
> > > > > > outdated NUMA configurations at least for new VMs in an orderly manner.
> > > > > Yes, we can postpone removing options until there are machines type
> > > > > versions that were capable to use it (unfortunate but probably
> > > > > unavoidable unless there is a migration trick to make transition
> > > > > transparent) but that should not stop us from disabling broken
> > > > > options on new machine types at least.
> > > > > 
> > > > > This series can serve as formal notice with follow up disabling of
> > > > > deprecated options for new machine types. (As Thomas noted, just warnings
> > > > > > do not work and users continue to use broken features regardless of whether
> > > > > > they don't know about the issues or are aware of them [*])
> > > > > 
> > > > > Hence suggested deprecation approach and enforced rejection of legacy
> > > > > numa options for new machine types in 2 releases so users would stop
> > > > > using them eventually.
> > > > 
> > > > When we deprecate something, we need to have a way for apps to use the
> > > > new alternative approach *at the same time*.  So even if we only want to
> > > > deprecate for new machine types, we still have to first solve the problem
> > > > of how mgmt apps will introspect QEMU to learn which machine types expect
> > > > the new options.
> > > > I'm not aware of any mechanism to introspect machine type options (existing
> > > or something being developed). Are/were there any ideas about it that were
> > > discussed in the past?
> > > 
> > > Aside from developing a new mechanism what are alternative approaches?
> > > I mean when we delete deprecated CLI option, how it's solved on libvirt
> > > side currently?
> > > 
> > > For example I don't see anything introspection related when we have been
> > > removing deprecated options recently.
> > 
> > Right, with other stuff we deprecate we've had a simpler time, as it
> > either didn't affect migration at all, or the new replacement stuff
> > was fully compatible with the migration data stream. IOW, libvirt
> > could unconditionally use the new feature as soon as it saw that it
> > > exists in QEMU. We didn't have any machine type dependency to deal
> > with before now.
> 
> We couldn't have done that. How would we migrate from older qemu?
> 
> Anyway, now that I look into this (esp. git log) I came across:
> 
> commit f309db1f4d51009bad0d32e12efc75530b66836b
> Author:     Michal Privoznik <mprivozn@redhat.com>
> AuthorDate: Thu Dec 18 12:36:48 2014 +0100
> Commit:     Michal Privoznik <mprivozn@redhat.com>
> CommitDate: Fri Dec 19 07:44:44 2014 +0100
> 
>     qemu: Create memory-backend-{ram,file} iff needed
> 
> Or this 7832fac84741d65e851dbdbfaf474785cbfdcf3c. We did try to generate the
> newer cmd line but then for various reasons (e.g. avoiding triggering a qemu
> bug) we turned it off and made libvirt default to the older (now deprecated) cmd
> line.
> 
> Frankly, I don't know how to proceed. Unless qemu is fixed to allow
> migration from deprecated to new cmd line (unlikely, if not impossible,
> right?) then I guess the only approach we can have is that:
> 
> 1) whenever so called cold booting a new machine (fresh, brand new start of
> a new domain) libvirt would default to modern cmd line,
> 
> 2) on migration, libvirt would record in the migration stream (or status XML
> or wherever) that modern cmd line was generated and thus it'll make the
> destination generate modern cmd line too.
> 
> This solution still suffers a couple of problems:
> a) migration to older libvirt will fail as older libvirt won't recognize the
> flag set in 2) and therefore would default to deprecated cmd line
> b) migrating from one host to another won't modernize the cmd line
> 
> But I guess we have to draw a line somewhere (if we are not willing to write
> those migration patches).

Yeah supporting backwards migration is a non-optional requirement from at
least one of the mgmt apps using libvirt, so breaking the new to old case
is something we always aim to avoid.

These incompatibilities are reminding me why we haven't tied these kinds of
changes to machine type versions in the past. New machine type != new
libvirt, so we can't tie usage of a feature in libvirt to a new machine
type.

I'm wondering exactly in which cases libvirt will still use the "mem" option
as opposed to "memdev".  If none of the cases using "mem" actually
suffer from the ill-effects of "mem", then there's not a compelling reason
to stop using it. It can be discouraged in QEMU documentation but otherwise
left alone.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 15:02                 ` Daniel P. Berrangé
@ 2019-03-04 16:45                   ` Igor Mammedov
  0 siblings, 0 replies; 37+ messages in thread
From: Igor Mammedov @ 2019-03-04 16:45 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Thomas Huth, peter.maydell, ehabkost, libvir-list, qemu-devel,
	Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini, david

On Mon, 4 Mar 2019 15:02:18 +0000
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Mon, Mar 04, 2019 at 03:54:57PM +0100, Igor Mammedov wrote:
> > On Mon, 4 Mar 2019 13:59:09 +0000
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> >   
> > > On Mon, Mar 04, 2019 at 02:55:10PM +0100, Igor Mammedov wrote:  
> > > > On Mon, 4 Mar 2019 09:11:19 +0100
> > > > Thomas Huth <thuth@redhat.com> wrote:
> > > >     
> > > > > On 01/03/2019 18.48, Daniel P. Berrangé wrote:
> > > > > [...]    
> > > > > > So I think this patch has to be dropped & replaced with one that
> > > > > > simply documents that memdev syntax is preferred.      
> > > > > 
> > > > > That's definitely not enough. I've had a couple of cases already where
> > > > > we documented that certain options should not be used anymore, and
> > > > > people simply ignored it (aka. if it ain't broken, don't do any change).
> > > > > Then they just started to complain when I really tried to remove the
> > > > > option after the deprecation period.    
> > > >     
> > > > > So Igor, if you can not officially deprecate these things here yet, you
> > > > > should at least make sure that they can not be used with new machine
> > > > > types anymore. Then, after a couple of years, when we feel sure that
> > > > > there are only some few or no people left who still use it with the old
> > > > > machine types, we can start to discuss the deprecation process again, I
> > > > > think.    
> > > > Is it acceptable to silently disable CLI options (even if they are broken
> > > > like in this case) for new machine types?
> > > > I was under impression that it should go through deprecation first.    
> > > 
> > > Yes, it must go through deprecation. I was saying we can't disable
> > > the CLI options at all, until there is a way for libvirt to correctly
> > > use the new options.  
> > 
> > I'm not adding new options (nor do I plan to for the numa case (yet));
> > -numa node,memdev has been around for several years by now and should be used
> > as the default for creating new configs.
> > 
> > In light of keeping the 'mem' option around for old machines,
> > deprecation should serve to notify users that legacy
> > options will be disabled later on (for new machines at least,
> > if no way is found for a migration compatible transition for older ones).
> > 
> > What I'm mainly aiming at here is to prevent using broken legacy options
> > for new VMs (like in the RHBZ1624223 case), and deprecation is the only way
> > we have now to notify users about CLI breaking changes.
> 
> The idea of doing advance warnings via deprecations is that applications
> have time to adapt to the new mechanism several releases before the old
> mechanism is removed/disabled.  Since the new mechanism isn't fully
> usable yet, applications can't adapt to use it. So we can't start the
> deprecation process yet, as it would be telling apps to do a switch
> that isn't possible for many to actually do.

At least I hope the attempt to deprecate 'mem' served its purpose somewhat.
Now it's clear that libvirt defaults to the wrong legacy mode and should use
'memdev' instead. I hope that it will be addressed on the libvirt side
regardless of the fate of the 'mem' deprecation process.
Basically, a new VM should default to 'memdev' if it's available.
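
For reference, the two ways of describing the same guest NUMA layout look
roughly like what the sketch below generates. The helper is hypothetical
(numa_args and the ram-nodeN ids are made up for illustration); the
-numa node,mem=, -numa node,memdev= and -object memory-backend-ram options
themselves are existing QEMU syntax.

```python
from typing import List


def numa_args(node_sizes_mib: List[int], use_memdev: bool) -> List[str]:
    """Build -numa arguments for the given per-node RAM sizes (in MiB)."""
    args: List[str] = []
    for nodeid, size in enumerate(node_sizes_mib):
        if use_memdev:
            # Modern form: one explicit memory backend object per node.
            backend_id = f"ram-node{nodeid}"  # illustrative id, any unique id works
            args += [
                "-object",
                f"memory-backend-ram,id={backend_id},size={size}M",
                "-numa",
                f"node,nodeid={nodeid},memdev={backend_id}",
            ]
        else:
            # Legacy form: QEMU slices the single initial RAM region itself.
            args += ["-numa", f"node,nodeid={nodeid},mem={size}M"]
    return args


if __name__ == "__main__":
    print(" ".join(numa_args([1024, 1024], use_memdev=False)))
    print(" ".join(numa_args([1024, 1024], use_memdev=True)))
```

The legacy form keeps a single RAM region while the memdev form creates one
backend per node, which is why the two are not interchangeable across a
migration.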


> In the meantime, qemu-options.hx could be updated. It documents both
> "mem" and "memdev" currently but doesn't tell people that "memdev" is
> the preferred syntax for future usage / warn against using "mem".

Will do so.

> 
> Regards,
> Daniel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 16:35                     ` Daniel P. Berrangé
@ 2019-03-06 19:03                       ` Igor Mammedov
  2019-03-07  9:59                         ` Daniel P. Berrangé
  0 siblings, 1 reply; 37+ messages in thread
From: Igor Mammedov @ 2019-03-06 19:03 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Michal Privoznik, Markus Armbruster, peter.maydell, ehabkost,
	libvir-list, qemu-devel, Dr. David Alan Gilbert, qemu-arm,
	qemu-ppc, pbonzini, david

On Mon, 4 Mar 2019 16:35:16 +0000
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Mon, Mar 04, 2019 at 05:20:13PM +0100, Michal Privoznik wrote:
> > On 3/4/19 3:24 PM, Daniel P. Berrangé wrote:
> > > On Mon, Mar 04, 2019 at 03:16:41PM +0100, Igor Mammedov wrote:
> > > > On Mon, 4 Mar 2019 12:39:08 +0000
> > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > 
> > > > > On Mon, Mar 04, 2019 at 01:25:07PM +0100, Igor Mammedov wrote:
> > > > > > On Mon, 04 Mar 2019 08:13:53 +0100
> > > > > > Markus Armbruster <armbru@redhat.com> wrote:
> > > > > > > Daniel P. Berrangé <berrange@redhat.com> writes:
> > > > > > > > On Fri, Mar 01, 2019 at 06:33:28PM +0100, Igor Mammedov wrote:
> > > > > > > > > On Fri, 1 Mar 2019 15:49:47 +0000
> > > > > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > > > > > On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:
> > > > > > > > > > > The parameter allows to configure fake NUMA topology where guest
> > > > > > > > > > > VM simulates NUMA topology but not actually getting a performance
> > > > > > > > > > > benefits from it. The same or better results could be achieved
> > > > > > > > > > > using 'memdev' parameter. In light of that any VM that uses NUMA
> > > > > > > > > > > to get its benefits should use 'memdev' and to allow transition
> > > > > > > > > > > initial RAM to device based model, deprecate 'mem' parameter as
> > > > > > > > > > > its ad-hoc partitioning of initial RAM MemoryRegion can't be
> > > > > > > > > > > translated to memdev based backend transparently to users and in
> > > > > > > > > > > compatible manner (migration wise).
> > > > > > > > > > > 
> > > > > > > > > > > That will also allow to clean up a bit our numa code, leaving only
> > > > > > > > > > > 'memdev' impl. in place and several boards that use node_mem
> > > > > > > > > > > to generate FDT/ACPI description from it.
> > > > > > > > > > 
> > > > > > > > > > Can you confirm that the  'mem' and 'memdev' parameters to -numa
> > > > > > > > > > are 100% live migration compatible in both directions ?  Libvirt
> > > > > > > > > > would need this to be the case in order to use the 'memdev' syntax
> > > > > > > > > > instead.
> > > > > > > > > > Unfortunately they are not migration compatible in any direction;
> > > > > > > > > > if it were possible to translate them to each other I'd alias 'mem'
> > > > > > > > > > to 'memdev' without deprecation. The former sends over only one
> > > > > > > > > > MemoryRegion to the target, while the latter sends over several (one per
> > > > > > > > > > memdev).
> > > > > > > > > 
> > > > > > > > > If we can't migrate from one to the other, then we can not deprecate
> > > > > > > > > the existing 'mem' syntax. Even if libvirt were to provide a config
> > > > > > > > > option to let apps opt-in to the new syntax, we need to be able to
> > > > > > > > > support live migration of existing running VMs indefinitely. Effectively
> > > > > > > > > this means we need to keep 'mem' support forever, or at least such
> > > > > > > > > a long time that it effectively means forever.
> > > > > > > > 
> > > > > > > > So I think this patch has to be dropped & replaced with one that
> > > > > > > > simply documents that memdev syntax is preferred.
> > > > > > > 
> > > > > > > We have this habit of postulating absolutes like "can not deprecate"
> > > > > > > instead of engaging with the tradeoffs.  We need to kick it.
> > > > > > > 
> > > > > > > So let's have an actual look at the tradeoffs.
> > > > > > > 
> > > > > > > We don't actually "support live migration of existing running VMs
> > > > > > > indefinitely".
> > > > > > > 
> > > > > > > We support live migration to any newer version of QEMU that still
> > > > > > > supports the machine type.
> > > > > > > 
> > > > > > > We support live migration to any older version of QEMU that already
> > > > > > > supports the machine type and all the devices the machine uses.
> > > > > > > 
> > > > > > > Aside: "support" is really an honest best effort here.  If you rely on
> > > > > > > it, use a downstream that puts in the (substantial!) QA work real
> > > > > > > support takes.
> > > > > > > 
> > > > > > > Feature deprecation is not a contract to drop the feature after two
> > > > > > > releases, or even five.  It's a formal notice that users of the feature
> > > > > > > should transition to its replacement in an orderly manner.
> > > > > > > 
> > > > > > > If I understand Igor correctly, all users should transition away from
> > > > > > > outdated NUMA configurations at least for new VMs in an orderly manner.
> > > > > > Yes, we can postpone removing options until there are machines type
> > > > > > versions that were capable to use it (unfortunate but probably
> > > > > > unavoidable unless there is a migration trick to make transition
> > > > > > transparent) but that should not stop us from disabling broken
> > > > > > options on new machine types at least.
> > > > > > 
> > > > > > This series can serve as formal notice with follow up disabling of
> > > > > > deprecated options for new machine types. (As Thomas noted, just warnings
> > > > > > > do not work and users continue to use broken features regardless of whether
> > > > > > > they don't know about the issues or are aware of them [*])
> > > > > > 
> > > > > > Hence suggested deprecation approach and enforced rejection of legacy
> > > > > > numa options for new machine types in 2 releases so users would stop
> > > > > > using them eventually.
> > > > > 
> > > > > When we deprecate something, we need to have a way for apps to use the
> > > > > new alternative approach *at the same time*.  So even if we only want to
> > > > > deprecate for new machine types, we still have to first solve the problem
> > > > > of how mgmt apps will introspect QEMU to learn which machine types expect
> > > > > the new options.
> > > > > I'm not aware of any mechanism to introspect machine type options (existing
> > > > or something being developed). Are/were there any ideas about it that were
> > > > discussed in the past?
> > > > 
> > > > Aside from developing a new mechanism what are alternative approaches?
> > > > I mean when we delete deprecated CLI option, how it's solved on libvirt
> > > > side currently?
> > > > 
> > > > For example I don't see anything introspection related when we have been
> > > > removing deprecated options recently.
> > > 
> > > Right, with other stuff we deprecate we've had a simpler time, as it
> > > either didn't affect migration at all, or the new replacement stuff
> > > was fully compatible with the migration data stream. IOW, libvirt
> > > could unconditionally use the new feature as soon as it saw that it
> > > > exists in QEMU. We didn't have any machine type dependency to deal
> > > with before now.
> > 
> > We couldn't have done that. How would we migrate from older qemu?
> > 
> > Anyway, now that I look into this (esp. git log) I came across:
> > 
> > commit f309db1f4d51009bad0d32e12efc75530b66836b
> > Author:     Michal Privoznik <mprivozn@redhat.com>
> > AuthorDate: Thu Dec 18 12:36:48 2014 +0100
> > Commit:     Michal Privoznik <mprivozn@redhat.com>
> > CommitDate: Fri Dec 19 07:44:44 2014 +0100
> > 
> >     qemu: Create memory-backend-{ram,file} iff needed
> > 
> > Or this 7832fac84741d65e851dbdbfaf474785cbfdcf3c. We did try to generate the
> > newer cmd line but then for various reasons (e.g. avoiding triggering a qemu
> > bug) we turned it off and made libvirt default to the older (now deprecated) cmd
> > line.
> > 
> > Frankly, I don't know how to proceed. Unless qemu is fixed to allow
> > migration from deprecated to new cmd line (unlikely, if not impossible,
> > right?) then I guess the only approach we can have is that:
> > 
> > 1) whenever so called cold booting a new machine (fresh, brand new start of
> > a new domain) libvirt would default to modern cmd line,
> > 
> > 2) on migration, libvirt would record in the migration stream (or status XML
> > or wherever) that modern cmd line was generated and thus it'll make the
> > destination generate modern cmd line too.
> > 
> > This solution still suffers a couple of problems:
> > a) migration to older libvirt will fail as older libvirt won't recognize the
> > flag set in 2) and therefore would default to deprecated cmd line
> > b) migrating from one host to another won't modernize the cmd line
> > 
> > But I guess we have to draw a line somewhere (if we are not willing to write
> > those migration patches).
> 
> Yeah supporting backwards migration is a non-optional requirement from at
> least one of the mgmt apps using libvirt, so breaking the new to old case
> is something we always aim to avoid.
Aiming for support of
"new QEMU + new machine type" => "old QEMU + non-existing machine type"
seems a bit difficult.
Note that old machine types will continue to work with the old CLI.
 

> These incompatibilities are reminding me why we haven't tied these kinds of
> changes to machine type versions in the past. New machine type != new
> libvirt, so we can't tie usage of a feature in libvirt to a new machine
> type.
> 
> I'm wondering exactly in which cases libvirt will still use the "mem" option
> as opposed to "memdev".  If none of the cases using "mem" actually
> suffer from the ill-effects of "mem", then there's not a compelling reason
> to stop using it. It can be discouraged in QEMU documentation but otherwise
> left alone.
> 
> Regards,
> Daniel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 16:20                   ` Michal Privoznik
  2019-03-04 16:31                     ` Dr. David Alan Gilbert
  2019-03-04 16:35                     ` Daniel P. Berrangé
@ 2019-03-06 19:56                     ` Igor Mammedov
  2 siblings, 0 replies; 37+ messages in thread
From: Igor Mammedov @ 2019-03-06 19:56 UTC (permalink / raw)
  To: Michal Privoznik
  Cc: Daniel P. Berrangé,
	Markus Armbruster, peter.maydell, ehabkost, libvir-list,
	qemu-devel, Dr. David Alan Gilbert, qemu-arm, qemu-ppc, pbonzini,
	david

On Mon, 4 Mar 2019 17:20:13 +0100
Michal Privoznik <mprivozn@redhat.com> wrote:

> On 3/4/19 3:24 PM, Daniel P. Berrangé wrote:
> > On Mon, Mar 04, 2019 at 03:16:41PM +0100, Igor Mammedov wrote:
> >> On Mon, 4 Mar 2019 12:39:08 +0000
> >> Daniel P. Berrangé <berrange@redhat.com> wrote:
> >>
> >>> On Mon, Mar 04, 2019 at 01:25:07PM +0100, Igor Mammedov wrote:
> >>>> On Mon, 04 Mar 2019 08:13:53 +0100
> >>>> Markus Armbruster <armbru@redhat.com> wrote:
> >>>>    
> >>>>> Daniel P. Berrangé <berrange@redhat.com> writes:
> >>>>>    
> >>>>>> On Fri, Mar 01, 2019 at 06:33:28PM +0100, Igor Mammedov wrote:
> >>>>>>> On Fri, 1 Mar 2019 15:49:47 +0000
> >>>>>>> Daniel P. Berrangé <berrange@redhat.com> wrote:
> >>>>>>>      
> >>>>>>>> On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:
> >>>>>>>>> The parameter allows to configure fake NUMA topology where guest
> >>>>>>>>> VM simulates NUMA topology but not actually getting a performance
> >>>>>>>>> benefits from it. The same or better results could be achieved
> >>>>>>>>> using 'memdev' parameter. In light of that any VM that uses NUMA
> >>>>>>>>> to get its benefits should use 'memdev' and to allow transition
> >>>>>>>>> initial RAM to device based model, deprecate 'mem' parameter as
> >>>>>>>>> its ad-hoc partitioning of initial RAM MemoryRegion can't be
> >>>>>>>>> translated to memdev based backend transparently to users and in
> >>>>>>>>> compatible manner (migration wise).
> >>>>>>>>>
> >>>>>>>>> That will also allow to clean up a bit our numa code, leaving only
> >>>>>>>>> 'memdev' impl. in place and several boards that use node_mem
> >>>>>>>>> to generate FDT/ACPI description from it.
> >>>>>>>>
> >>>>>>>> Can you confirm that the  'mem' and 'memdev' parameters to -numa
> >>>>>>>> are 100% live migration compatible in both directions ?  Libvirt
> >>>>>>>> would need this to be the case in order to use the 'memdev' syntax
> >>>>>>>> instead.
> >>>>>>> Unfortunately they are not migration compatible in any direction,
> >>>>>>> if it where possible to translate them to each other I'd alias 'mem'
> >>>>>>> to 'memdev' without deprecation. The former sends over only one
> >>>>>>> MemoryRegion to target, while the later sends over several (one per
> >>>>>>> memdev).
> >>>>>>
> >>>>>> If we can't migration from one to the other, then we can not deprecate
> >>>>>> the existing 'mem' syntax. Even if libvirt were to provide a config
> >>>>>> option to let apps opt-in to the new syntax, we need to be able to
> >>>>>> support live migration of existing running VMs indefinitely. Effectively
> >>>>>> this means we need the to keep 'mem' support forever, or at least such
> >>>>>> a long time that it effectively means forever.
> >>>>>>
> >>>>>> So I think this patch has to be dropped & replaced with one that
> >>>>>> simply documents that memdev syntax is preferred.
> >>>>>
> >>>>> We have this habit of postulating absolutes like "can not deprecate"
> >>>>> instead of engaging with the tradeoffs.  We need to kick it.
> >>>>>
> >>>>> So let's have an actual look at the tradeoffs.
> >>>>>
> >>>>> We don't actually "support live migration of existing running VMs
> >>>>> indefinitely".
> >>>>>
> >>>>> We support live migration to any newer version of QEMU that still
> >>>>> supports the machine type.
> >>>>>
> >>>>> We support live migration to any older version of QEMU that already
> >>>>> supports the machine type and all the devices the machine uses.
> >>>>>
> >>>>> Aside: "support" is really an honest best effort here.  If you rely on
> >>>>> it, use a downstream that puts in the (substantial!) QA work real
> >>>>> support takes.
> >>>>>
> >>>>> Feature deprecation is not a contract to drop the feature after two
> >>>>> releases, or even five.  It's a formal notice that users of the feature
> >>>>> should transition to its replacement in an orderly manner.
> >>>>>
> >>>>> If I understand Igor correctly, all users should transition away from
> >>>>> outdated NUMA configurations at least for new VMs in an orderly manner.
> >>>> Yes, we can postpone removing options until there are machines type
> >>>> versions that were capable to use it (unfortunate but probably
> >>>> unavoidable unless there is a migration trick to make transition
> >>>> transparent) but that should not stop us from disabling broken
> >>>> options on new machine types at least.
> >>>>
> >>>> This series can serve as formal notice with follow up disabling of
> >>>> deprecated options for new machine types. (As Thomas noted, just warnings
> >>>> do not work and users continue to use broken features regardless whether
> >>>> they are don't know about issues or aware of it [*])
> >>>>
> >>>> Hence suggested deprecation approach and enforced rejection of legacy
> >>>> numa options for new machine types in 2 releases so users would stop
> >>>> using them eventually.
> >>>
> >>> When we deprecate something, we need to have a way for apps to use the
> >>> new alternative approach *at the same time*.  So even if we only want to
> >>> deprecate for new machine types, we still have to first solve the problem
> >>> of how mgmt apps will introspect QEMU to learn which machine types expect
> >>> the new options.
> >> I'm not aware any mechanism to introspect machine type options (existing
> >> or something being developed). Are/were there any ideas about it that were
> >> discussed in the past?
> >>
> >> Aside from developing a new mechanism what are alternative approaches?
> >> I mean when we delete deprecated CLI option, how it's solved on libvirt
> >> side currently?
> >>
> >> For example I don't see anything introspection related when we have been
> >> removing deprecated options recently.
> > 
> > Right, with other stuff we deprecate we've had a simpler time, as it
> > either didn't affect migration at all, or the new replacement stuff
> > was fully compatible with the migration data stream. IOW, libvirt
> > could unconditionally use the new feature as soon as it saw that it
> > exists in QEMU. We didn't have any machine type dependancy to deal
> > with before now.
> 
> We couldn't have done that. How we would migrate from older qemu?
> 
> Anyway, now that I look into this (esp. git log) I came accross:
> 
> commit f309db1f4d51009bad0d32e12efc75530b66836b
> Author:     Michal Privoznik <mprivozn@redhat.com>
> AuthorDate: Thu Dec 18 12:36:48 2014 +0100
> Commit:     Michal Privoznik <mprivozn@redhat.com>
> CommitDate: Fri Dec 19 07:44:44 2014 +0100
> 
>      qemu: Create memory-backend-{ram,file} iff needed
> 
> Or this 7832fac84741d65e851dbdbfaf474785cbfdcf3c. We did try to 
> generated newer cmd line but then for various reasong (e.g. avoiding 
> triggering a qemu bug) we turned it off and make libvirt default to 
> older (now deprecated) cmd line.
> 
> Frankly, I don't know how to proceed. Unless qemu is fixed to allow 
> migration from deprecated to new cmd line (unlikely, if not impossible, 
> right?) then I guess the only approach we can have is that:
In the non-NUMA case (or the single-node case) there is a migration-compatible
solution; I'm working on it.

The only problematic case is where -mem-path + -mem-prealloc + pc-dimm are
used with the multi-node variant plus the '-numa node,mem=xx' option, or with
plain '-numa node' and implicit RAM distribution (the same as 'mem', just done
by QEMU). Here I can't translate the options because the memory layout differs,
so these VMs would continue using the old 'mem' with old machine types.

> 1) whenever so called cold booting a new machine (fresh, brand new start 
> of a new domain) libvirt would default to modern cmd line,
That's what I'd prefer, and I complained that this route wasn't followed.
(It would even work with currently broken QEMU, since the modern CLI variant works.)

It would shrink the 'mem' user base that configures NUMA incorrectly (i.e. with
non-pinned memory and CPUs). Users can still do that with 'memdev' by leaving
out the bind policy, but when they ask for pinning with a bind policy they will
really get it, since under the hood "-mem-path + -mem-prealloc" will be the same
as 'memdev,prealloc=on', without surprises due to the quite non-obvious call
ordering.
 
> 2) on migration, libvirt would record in the migration stream (or status 
> XML or wherever) that modern cmd line was generated and thus it'll make 
> the destination generate modern cmd line too.
Since the legacy option is not removed, it's not necessary to convert a
legacy SOURCE to the new CLI (new QEMU would still accept the legacy CLI for
old machine types), i.e. business as usual.


> This solution still suffers a couple of problems:
> a) migration to older libvirt will fail as older libvirt won't recognize 
> the flag set in 2) and therefore would default to deprecated cmd line

> b) migrating from one host to another won't modernize the cmd line
I didn't know that there was a 'modernizing' feature in libvirt, but
modernizing the CLI is only necessary when we remove an option completely.


> But I guess we have to draw a line somewhere (if we are not willing to 
> write those migration patches).
I wasn't able to make migration from 1 to N MemoryRegions and back work.
So the only choice left is to start deprecating the numa 'mem' option with
new machine types, so that at least there memory pinning would not be broken
and it wouldn't be possible to misconfigure it.
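
A toy model of why 1 vs. N MemoryRegions does not migrate, under the assumption
that the destination can only load RAM blocks it has itself created with the
same name and size. The block names ("pc.ram", "ram0", ...) are illustrative
placeholders, and the functions are hypothetical, not QEMU code.

```python
def ram_blocks_legacy(total_mb):
    """'mem=' style: all initial RAM lives in a single block."""
    return {"pc.ram": total_mb}  # illustrative block name


def ram_blocks_memdev(node_sizes_mb):
    """'memdev=' style: one block per memory backend."""
    return {f"ram{i}": size for i, size in enumerate(node_sizes_mb)}


def can_load(source_blocks, dest_blocks):
    """Simplified rule: the stream loads only if both sides agree on the
    set of blocks and their sizes."""
    return source_blocks == dest_blocks


if __name__ == "__main__":
    src = ram_blocks_legacy(4096)
    dst = ram_blocks_memdev([2048, 2048])
    print("legacy -> memdev loads:", can_load(src, dst))
    print("legacy -> legacy loads:", can_load(src, ram_blocks_legacy(4096)))
```

The total amount of RAM is the same in both layouts; it is the shape of the
stream that differs, in either direction.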


> Michal


* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-06 19:03                       ` Igor Mammedov
@ 2019-03-07  9:59                         ` Daniel P. Berrangé
  2019-03-10 10:16                           ` Markus Armbruster
  0 siblings, 1 reply; 37+ messages in thread
From: Daniel P. Berrangé @ 2019-03-07  9:59 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Michal Privoznik, Markus Armbruster, peter.maydell, ehabkost,
	libvir-list, qemu-devel, Dr. David Alan Gilbert, qemu-arm,
	qemu-ppc, pbonzini, david

On Wed, Mar 06, 2019 at 08:03:48PM +0100, Igor Mammedov wrote:
> On Mon, 4 Mar 2019 16:35:16 +0000
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Mon, Mar 04, 2019 at 05:20:13PM +0100, Michal Privoznik wrote:
> > > We couldn't have done that. How we would migrate from older qemu?
> > > 
> > > Anyway, now that I look into this (esp. git log) I came accross:
> > > 
> > > commit f309db1f4d51009bad0d32e12efc75530b66836b
> > > Author:     Michal Privoznik <mprivozn@redhat.com>
> > > AuthorDate: Thu Dec 18 12:36:48 2014 +0100
> > > Commit:     Michal Privoznik <mprivozn@redhat.com>
> > > CommitDate: Fri Dec 19 07:44:44 2014 +0100
> > > 
> > >     qemu: Create memory-backend-{ram,file} iff needed
> > > 
> > > Or this 7832fac84741d65e851dbdbfaf474785cbfdcf3c. We did try to generated
> > > newer cmd line but then for various reasong (e.g. avoiding triggering a qemu
> > > bug) we turned it off and make libvirt default to older (now deprecated) cmd
> > > line.
> > > 
> > > Frankly, I don't know how to proceed. Unless qemu is fixed to allow
> > > migration from deprecated to new cmd line (unlikely, if not impossible,
> > > right?) then I guess the only approach we can have is that:
> > > 
> > > 1) whenever so called cold booting a new machine (fresh, brand new start of
> > > a new domain) libvirt would default to modern cmd line,
> > > 
> > > 2) on migration, libvirt would record in the migration stream (or status XML
> > > or wherever) that modern cmd line was generated and thus it'll make the
> > > destination generate modern cmd line too.
> > > 
> > > This solution still suffers a couple of problems:
> > > a) migration to older libvirt will fail as older libvirt won't recognize the
> > > flag set in 2) and therefore would default to deprecated cmd line
> > > b) migrating from one host to another won't modernize the cmd line
> > > 
> > > But I guess we have to draw a line somewhere (if we are not willing to write
> > > those migration patches).
> > 
> > Yeah supporting backwards migration is a non-optional requirement from at
> > least one of the mgmt apps using libvirt, so breaking the new to old case
> > is something we always aim to avoid.
> Aiming for support of 
> "new QEMU + new machine type" => "old QEMU + non-existing machine type"
> seems a bit difficult.

That's not the scenario that's the problem. The problem is

   new QEMU + new machine type + new libvirt   -> new QEMU + new machine type + old libvirt

Previously released versions of libvirt will happily use any new machine
type that QEMU introduces. So we can't make new libvirt use different
options only for new machine types, as old libvirt supports those machine
types too.
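
A toy model of that scenario, assuming a hypothetical policy where new libvirt
picks 'memdev' only for new machine types while old libvirt always picks 'mem'.
None of this is actual libvirt code; the function names are made up.

```python
LEGACY, MODERN = "mem", "memdev"


def numa_style(libvirt_is_new, machine_is_new):
    """Hypothetical policy: only new libvirt knows about the switch, and
    it applies it only to new machine types."""
    if libvirt_is_new and machine_is_new:
        return MODERN
    return LEGACY


def migration_ok(src_libvirt_new, dst_libvirt_new, machine_is_new):
    """Both ends must generate the same style for the same machine type."""
    return (numa_style(src_libvirt_new, machine_is_new)
            == numa_style(dst_libvirt_new, machine_is_new))


if __name__ == "__main__":
    # new libvirt -> old libvirt, both on new QEMU with a new machine type
    print(migration_ok(True, False, machine_is_new=True))
    # same pair of libvirt versions, but an old machine type
    print(migration_ok(True, False, machine_is_new=False))
```

The mismatch only appears for the new machine type, which is exactly the
combination old libvirt is happy to start.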

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-04 15:28               ` Daniel P. Berrangé
  2019-03-04 15:46                 ` Igor Mammedov
@ 2019-03-10 10:14                 ` Markus Armbruster
  1 sibling, 0 replies; 37+ messages in thread
From: Markus Armbruster @ 2019-03-10 10:14 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: peter.maydell, ehabkost, libvir-list, Dr. David Alan Gilbert,
	qemu-devel, qemu-arm, qemu-ppc, Igor Mammedov, pbonzini, david

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Mon, Mar 04, 2019 at 12:45:14PM +0100, Markus Armbruster wrote:
>> Daniel P. Berrangé <berrange@redhat.com> writes:
>> 
>> > On Mon, Mar 04, 2019 at 08:13:53AM +0100, Markus Armbruster wrote:
>> >> If we deprecate outdated NUMA configurations now, we can start rejecting
>> >> them with new machine types after a suitable grace period.
>> >
>> > How is libvirt going to know what machines it can use with the feature ?
>> > We don't have any way to introspect machine type specific logic, since we
>> > run all probing with "-machine none", and QEMU can't report anything about
>> > machines without instantiating them.
>> 
>> Fair point.  A practical way for management applications to decide which
>> of the two interfaces they can use with which machine type may be
>> required for deprecating one of the interfaces with new machine types.
>
> We currently have  "qom-list-properties" which can report on the
> existance of properties registered against object types. What it
> can't do though is report on the default values of these properties.

Yes.

> What's interesting though is that qmp_qom_list_properties will actually
> instantiate objects in order to query properties, if the type isn't an
> abstract type.

If it's an abstract type, qom-list-properties returns the properties
created with object_class_property_add() & friends, typically by the
class_init method.  This is possible without instantiating the type.

If it's a concrete type, qom-list-properties additionally returns the
properties created with object_property_add(), typically by the
instance_init() method.  This requires instantiating the type.

Both kinds of properties can be added or deleted at any time.  For
instance, setting a property value with object_property_set() or similar
could create additional properties.

For historical reasons, we often use object_property_add() where
object_class_property_add() would do.  Sad.

> IOW, even if you are running "$QEMU -machine none", then if at the qmp-shell
> you do
>
>    (QEMU) qom-list-properties typename=pc-q35-2.6-machine
>
> it will have actually instantiate the pc-q35-2.6-machine machine type.
> Since it has instantiated the machine, the object initializer function
> will have run and initialized the default values for various properties.
>
> IOW, it is possible for qom-list-properties to report on default values
> for non-abstract types.

instance_init() also initializes the properties' values.
qom-list-properties could show these initial values (I hesitate calling
them default values).

Setting a property's value can change other properties' values by side
effect.

My point is: the properties qom-list-properties shows and the initial
values it could show are not necessarily final.  QOM is designed to be
maximally flexible, and flexibility brings along its bosom-buddy
complexity.

If you keep that in mind, qom-list-properties can be put to good use all
the same.

A way to report "default values" (really: whatever the values are after
object_new()) feels like a fair feature request to me, if backed by an
actual use case.
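
For reference, a minimal sketch of issuing qom-list-properties over QMP from a
script. The socket path is whatever was passed to -qmp and is an assumption
here, as are the helper names; the reply handling is simplified (it skips
anything that is not a command reply and does no error recovery).

```python
import json
import socket


def qom_list_properties_cmd(typename):
    """Build the QMP command for one QOM type name."""
    return {"execute": "qom-list-properties",
            "arguments": {"typename": typename}}


def qmp_command(sock_path, cmd):
    """Connect to a QMP UNIX socket, negotiate capabilities, run one command."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        chan = sock.makefile("rw", encoding="utf-8")
        json.loads(chan.readline())          # server greeting
        reply = None
        for msg in ({"execute": "qmp_capabilities"}, cmd):
            chan.write(json.dumps(msg) + "\n")
            chan.flush()
            while True:                      # skip asynchronous events
                reply = json.loads(chan.readline())
                if "return" in reply or "error" in reply:
                    break
        return reply


def property_names(reply):
    """Extract the sorted property names from a qom-list-properties reply."""
    return sorted(prop["name"] for prop in reply.get("return", []))


if __name__ == "__main__":
    print(json.dumps(qom_list_properties_cmd("pc-q35-2.6-machine")))
```

With a running "$QEMU -machine none -qmp unix:/path/to/sock,server,nowait",
property_names(qmp_command("/path/to/sock", ...)) lists the properties as they
are right after object_new(), with the caveats above.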

[...]


* Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
  2019-03-07  9:59                         ` Daniel P. Berrangé
@ 2019-03-10 10:16                           ` Markus Armbruster
  0 siblings, 0 replies; 37+ messages in thread
From: Markus Armbruster @ 2019-03-10 10:16 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Igor Mammedov, peter.maydell, ehabkost, libvir-list,
	Michal Privoznik, qemu-devel, qemu-arm, qemu-ppc, pbonzini,
	Dr. David Alan Gilbert, david

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Wed, Mar 06, 2019 at 08:03:48PM +0100, Igor Mammedov wrote:
>> On Mon, 4 Mar 2019 16:35:16 +0000
>> Daniel P. Berrangé <berrange@redhat.com> wrote:
>> 
>> > On Mon, Mar 04, 2019 at 05:20:13PM +0100, Michal Privoznik wrote:
>> > > We couldn't have done that. How we would migrate from older qemu?
>> > > 
>> > > Anyway, now that I look into this (esp. git log) I came accross:
>> > > 
>> > > commit f309db1f4d51009bad0d32e12efc75530b66836b
>> > > Author:     Michal Privoznik <mprivozn@redhat.com>
>> > > AuthorDate: Thu Dec 18 12:36:48 2014 +0100
>> > > Commit:     Michal Privoznik <mprivozn@redhat.com>
>> > > CommitDate: Fri Dec 19 07:44:44 2014 +0100
>> > > 
>> > >     qemu: Create memory-backend-{ram,file} iff needed
>> > > 
>> > > Or this 7832fac84741d65e851dbdbfaf474785cbfdcf3c. We did try to generated
>> > > newer cmd line but then for various reasong (e.g. avoiding triggering a qemu
>> > > bug) we turned it off and make libvirt default to older (now deprecated) cmd
>> > > line.
>> > > 
>> > > Frankly, I don't know how to proceed. Unless qemu is fixed to allow
>> > > migration from deprecated to new cmd line (unlikely, if not impossible,
>> > > right?) then I guess the only approach we can have is that:
>> > > 
>> > > 1) whenever so called cold booting a new machine (fresh, brand new start of
>> > > a new domain) libvirt would default to modern cmd line,
>> > > 
>> > > 2) on migration, libvirt would record in the migration stream (or status XML
>> > > or wherever) that modern cmd line was generated and thus it'll make the
>> > > destination generate modern cmd line too.
>> > > 
>> > > This solution still suffers a couple of problems:
>> > > a) migration to older libvirt will fail as older libvirt won't recognize the
>> > > flag set in 2) and therefore would default to deprecated cmd line
>> > > b) migrating from one host to another won't modernize the cmd line
>> > > 
>> > > But I guess we have to draw a line somewhere (if we are not willing to write
>> > > those migration patches).
>> > 
>> > Yeah supporting backwards migration is a non-optional requirement from at
>> > least one of the mgmt apps using libvirt, so breaking the new to old case
>> > is something we always aim to avoid.
>> Aiming for support of 
>> "new QEMU + new machine type" => "old QEMU + non-existing machine type"
>> seems a bit difficult.
>
> That's not the scenario that's the problem. The problem is
>
>    new QEMU + new machine type + new libvirt   -> new QEMU + new machine type + old libvirt
>
> Previously released versions of libvirt will happily use any new machine
> type that QEMU introduces. So we can't make new libvirt use a different
> options, only for new machine types, as old libvirt supports those machine
> types too.

Avoiding tight coupling between QEMU and libvirt versions makes sense,
because having to upgrade stuff in lock-step is such a pain.

That does not imply we must support arbitrary combinations of QEMU and
libvirt versions.

Unless upstream libvirt's test matrix covers all versions of libvirt
against all released versions of QEMU, "previously released versions of
libvirt will continue to work with new QEMU" is largely an empty promise
anyway.  The real promise is more like "we won't break it intentionally;
good luck".

Mind, I'm not criticizing that real promise.  I'm criticizing cutting
yourself off from large areas of the solution space so you can continue
to pretend to yourself you actually deliver on the empty promise.

Now, if you limited what you promise to something more realistic,
ideally to something you actually test, we could talk about deprecation
schedules constructively.

For instance, if you promised

    QEMU as of time T + its latest machine type + libvirt as of time T
 -> QEMU as of time T + its latest machine type + libvirt as of time T - d

will work for a certain value of d, then once all released versions of
libvirt since T - d support a new way of doing things, flipping to that
new way becomes a whole lot easier.
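
A small sketch of that rule, assuming we know for every libvirt release its
date and whether it supports the new way. The release data and function name
are invented for the example.

```python
from datetime import date, timedelta


def can_flip(libvirt_releases, today, window):
    """Return True once every libvirt released within [today - window, today]
    supports the new way of doing things.

    libvirt_releases: iterable of (release_date, supports_new_way) pairs.
    """
    cutoff = today - window
    return all(supports_new
               for released, supports_new in libvirt_releases
               if released >= cutoff)


if __name__ == "__main__":
    releases = [
        (date(2018, 1, 15), False),   # made-up release history
        (date(2019, 4, 1), True),
        (date(2019, 10, 1), True),
    ]
    d = timedelta(days=365)
    print(can_flip(releases, date(2019, 1, 1), d))   # old release still in window
    print(can_flip(releases, date(2019, 12, 1), d))  # only new-way releases left
```

The choice of d is the actual policy decision; the check itself is trivial
once the promise is limited to a window.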


end of thread, other threads:[~2019-03-10 10:16 UTC | newest]

Thread overview: 37+ messages
2019-03-01 15:42 [Qemu-devel] [PATCH 0/2] numa: deprecate -numa node, mem and default memory distribution Igor Mammedov
2019-03-01 15:42 ` [Qemu-devel] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option Igor Mammedov
2019-03-01 15:49   ` [Qemu-devel] [libvirt] " Daniel P. Berrangé
2019-03-01 17:33     ` Igor Mammedov
2019-03-01 17:48       ` Daniel P. Berrangé
2019-03-04  7:13         ` Markus Armbruster
2019-03-04 10:19           ` Daniel P. Berrangé
2019-03-04 11:45             ` Markus Armbruster
2019-03-04 15:28               ` Daniel P. Berrangé
2019-03-04 15:46                 ` Igor Mammedov
2019-03-10 10:14                 ` Markus Armbruster
2019-03-04 14:24             ` Michal Privoznik
2019-03-04 15:03               ` Igor Mammedov
2019-03-04 12:25           ` Igor Mammedov
2019-03-04 12:39             ` Daniel P. Berrangé
2019-03-04 14:16               ` Igor Mammedov
2019-03-04 14:24                 ` Daniel P. Berrangé
2019-03-04 15:19                   ` Igor Mammedov
2019-03-04 16:12                     ` Michal Privoznik
2019-03-04 16:27                       ` Daniel P. Berrangé
2019-03-04 16:20                   ` Michal Privoznik
2019-03-04 16:31                     ` Dr. David Alan Gilbert
2019-03-04 16:35                     ` Daniel P. Berrangé
2019-03-06 19:03                       ` Igor Mammedov
2019-03-07  9:59                         ` Daniel P. Berrangé
2019-03-10 10:16                           ` Markus Armbruster
2019-03-06 19:56                     ` Igor Mammedov
2019-03-04 14:34                 ` Michal Privoznik
2019-03-04  8:11         ` [Qemu-devel] [Qemu-ppc] " Thomas Huth
2019-03-04 13:55           ` Igor Mammedov
2019-03-04 13:59             ` Daniel P. Berrangé
2019-03-04 14:54               ` Igor Mammedov
2019-03-04 15:02                 ` Daniel P. Berrangé
2019-03-04 16:45                   ` Igor Mammedov
2019-03-01 18:01       ` [Qemu-devel] " Dr. David Alan Gilbert
2019-03-04 13:52         ` Igor Mammedov
2019-03-01 15:42 ` [Qemu-devel] [PATCH 2/2] numa: deprecate implict memory distribution between nodes Igor Mammedov
