qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/3] pc: Fix die-id validation and compatibility with libvirt
@ 2019-08-15 18:38 Eduardo Habkost
  2019-08-15 18:38 ` [Qemu-devel] [PATCH 1/3] pc: Fix error message on die-id validation Eduardo Habkost
                   ` (3 more replies)
  0 siblings, 4 replies; 27+ messages in thread
From: Eduardo Habkost @ 2019-08-15 18:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Krempa, Eduardo Habkost, Like Xu, Michael S. Tsirkin,
	Igor Mammedov, Paolo Bonzini, Richard Henderson

Currently, if die-id is omitted on -device for CPUs, we get a
very confusing error message:

  $ qemu-system-x86_64 -smp 1,sockets=6,maxcpus=6 \
    -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0
  qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0: \
    Invalid CPU die-id: 4294967295 must be in range 0:5

This has 3 problems

1) The actual range for die-id is 0:0.
   This is fixed by patch 1/3.
2) The user didn't specify die-id=4294967295.
   This is fixed by patch 2/3.
3) It breaks compatibility with libvirt because die-id was not
   mandatory before.
   This is addressed by patch 3/3.

Issues #1 and #2 were reported at:
https://bugzilla.redhat.com/show_bug.cgi?id=1741151

Issue #3 was reported at:
https://bugzilla.redhat.com/show_bug.cgi?id=1741451

Cc: Like Xu <like.xu@linux.intel.com>
Cc: Peter Krempa <pkrempa@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>

Eduardo Habkost (3):
  pc: Fix error message on die-id validation
  pc: Improve error message when die-id is omitted
  pc: Don't make CPU properties mandatory unless necessary

 hw/i386/pc.c                             | 23 ++++++++-
 tests/acceptance/pc_cpu_hotplug_props.py | 59 ++++++++++++++++++++++++
 2 files changed, 81 insertions(+), 1 deletion(-)
 create mode 100644 tests/acceptance/pc_cpu_hotplug_props.py

-- 
2.21.0



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH 1/3] pc: Fix error message on die-id validation
  2019-08-15 18:38 [Qemu-devel] [PATCH 0/3] pc: Fix die-id validation and compatibility with libvirt Eduardo Habkost
@ 2019-08-15 18:38 ` Eduardo Habkost
  2019-08-15 20:04   ` Vanderson Martins do Rosario
                     ` (3 more replies)
  2019-08-15 18:38 ` [Qemu-devel] [PATCH 2/3] pc: Improve error message when die-id is omitted Eduardo Habkost
                   ` (2 subsequent siblings)
  3 siblings, 4 replies; 27+ messages in thread
From: Eduardo Habkost @ 2019-08-15 18:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Krempa, Eduardo Habkost, Like Xu, Michael S. Tsirkin,
	Igor Mammedov, Paolo Bonzini, Richard Henderson

The error message for die-id range validation is incorrect.  Example:

  $ qemu-system-x86_64 -smp 1,sockets=6,maxcpus=6 \
    -device qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0
  qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0: \
    Invalid CPU die-id: 1 must be in range 0:5

The actual range for die-id in this example is 0:0.

Fix the error message to use smp_dies and print the correct range.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
 hw/i386/pc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 549c437050..24b03bb49c 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2412,7 +2412,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
             return;
         } else if (cpu->die_id > pcms->smp_dies - 1) {
             error_setg(errp, "Invalid CPU die-id: %u must be in range 0:%u",
-                       cpu->die_id, max_socket);
+                       cpu->die_id, pcms->smp_dies - 1);
             return;
         }
         if (cpu->core_id < 0) {
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH 2/3] pc: Improve error message when die-id is omitted
  2019-08-15 18:38 [Qemu-devel] [PATCH 0/3] pc: Fix die-id validation and compatibility with libvirt Eduardo Habkost
  2019-08-15 18:38 ` [Qemu-devel] [PATCH 1/3] pc: Fix error message on die-id validation Eduardo Habkost
@ 2019-08-15 18:38 ` Eduardo Habkost
  2019-08-15 20:11   ` Vanderson Martins do Rosario
  2019-08-16 14:06   ` Igor Mammedov
  2019-08-15 18:38 ` [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary Eduardo Habkost
  2019-08-19 19:19 ` [Qemu-devel] [PATCH 0/3] pc: Fix die-id validation and compatibility with libvirt Michael S. Tsirkin
  3 siblings, 2 replies; 27+ messages in thread
From: Eduardo Habkost @ 2019-08-15 18:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Krempa, Eduardo Habkost, Like Xu, Michael S. Tsirkin,
	Igor Mammedov, Paolo Bonzini, Richard Henderson

The error message when die-id is omitted doesn't make sense:

  $ qemu-system-x86_64 -smp 1,sockets=6,maxcpus=6 \
    -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0
  qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0: \
    Invalid CPU die-id: 4294967295 must be in range 0:0

Fix it, so it will now read:

  qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0: \
    CPU die-id is not set

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
 hw/i386/pc.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 24b03bb49c..fb4ac5ca90 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2410,6 +2410,10 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
             error_setg(errp, "Invalid CPU socket-id: %u must be in range 0:%u",
                        cpu->socket_id, max_socket);
             return;
+        }
+        if (cpu->die_id < 0) {
+            error_setg(errp, "CPU die-id is not set");
+            return;
         } else if (cpu->die_id > pcms->smp_dies - 1) {
             error_setg(errp, "Invalid CPU die-id: %u must be in range 0:%u",
                        cpu->die_id, pcms->smp_dies - 1);
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary
  2019-08-15 18:38 [Qemu-devel] [PATCH 0/3] pc: Fix die-id validation and compatibility with libvirt Eduardo Habkost
  2019-08-15 18:38 ` [Qemu-devel] [PATCH 1/3] pc: Fix error message on die-id validation Eduardo Habkost
  2019-08-15 18:38 ` [Qemu-devel] [PATCH 2/3] pc: Improve error message when die-id is omitted Eduardo Habkost
@ 2019-08-15 18:38 ` Eduardo Habkost
  2019-08-16  6:10   ` Markus Armbruster
  2019-08-16 13:20   ` Igor Mammedov
  2019-08-19 19:19 ` [Qemu-devel] [PATCH 0/3] pc: Fix die-id validation and compatibility with libvirt Michael S. Tsirkin
  3 siblings, 2 replies; 27+ messages in thread
From: Eduardo Habkost @ 2019-08-15 18:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Krempa, Eduardo Habkost, Like Xu, Michael S. Tsirkin,
	Igor Mammedov, Paolo Bonzini, Richard Henderson

We have this issue reported when using libvirt to hotplug CPUs:
https://bugzilla.redhat.com/show_bug.cgi?id=1741451

Basically, libvirt is not copying die-id from
query-hotpluggable-cpus, but die-id is now mandatory.

We could blame libvirt and say it is not following the documented
interface, because we have this buried in the QAPI schema
documentation:

> Note: currently there are 5 properties that could be present
> but management should be prepared to pass through other
> properties with device_add command to allow for future
> interface extension. This also requires the filed names to be kept in
> sync with the properties passed to -device/device_add.

But I don't think this would be reasonable from us.  We can just
make QEMU more flexible and let CPU properties to be omitted when
there's no ambiguity.  This will allow us to keep compatibility
with existing libvirt versions.

Test case included to ensure we don't break this again.

Cc: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
 hw/i386/pc.c                             | 17 +++++++
 tests/acceptance/pc_cpu_hotplug_props.py | 59 ++++++++++++++++++++++++
 2 files changed, 76 insertions(+)
 create mode 100644 tests/acceptance/pc_cpu_hotplug_props.py

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index fb4ac5ca90..4d773c862d 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2403,6 +2403,23 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
         int max_socket = (ms->smp.max_cpus - 1) /
                                 smp_threads / smp_cores / pcms->smp_dies;
 
+        /*
+         * If there's only one possible value for a topology property,
+         * allow it to be omitted.
+         */
+        if (cpu->socket_id < 0 && max_socket == 0) {
+            cpu->socket_id = 0;
+        }
+        if (cpu->die_id < 0 && pcms->smp_dies == 1) {
+            cpu->die_id = 0;
+        }
+        if (cpu->core_id < 0 && smp_cores == 1) {
+            cpu->core_id = 0;
+        }
+        if (cpu->thread_id < 0 && smp_threads == 1) {
+            cpu->thread_id = 0;
+        }
+
         if (cpu->socket_id < 0) {
             error_setg(errp, "CPU socket-id is not set");
             return;
diff --git a/tests/acceptance/pc_cpu_hotplug_props.py b/tests/acceptance/pc_cpu_hotplug_props.py
new file mode 100644
index 0000000000..2c36926214
--- /dev/null
+++ b/tests/acceptance/pc_cpu_hotplug_props.py
@@ -0,0 +1,59 @@
+#
+# Ensure CPU topology parameters may be omitted on -device
+#
+#  Copyright (c) 2019 Red Hat Inc
+#
+# Author:
+#  Eduardo Habkost <ehabkost@redhat.com>
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, see <http://www.gnu.org/licenses/>.
+#
+
+from avocado_qemu import Test
+
+class OmittedCPUProps(Test):
+    """
+    :avocado: tags=arch:x86_64
+    """
+    def test_only_socket(self):
+        self.vm.add_args('-nodefaults', '-S')
+        self.vm.add_args('-smp', '1,sockets=2,maxcpus=2')
+        self.vm.add_args('-cpu', 'qemu64')
+        self.vm.add_args('-device', 'qemu64-x86_64-cpu,socket-id=1')
+        self.vm.launch()
+        self.assertEquals(len(self.vm.command('query-cpus')), 2)
+
+    def test_only_die(self):
+        self.vm.add_args('-nodefaults', '-S')
+        self.vm.add_args('-smp', '1,dies=2,maxcpus=2')
+        self.vm.add_args('-cpu', 'qemu64')
+        self.vm.add_args('-device', 'qemu64-x86_64-cpu,die-id=1')
+        self.vm.launch()
+        self.assertEquals(len(self.vm.command('query-cpus')), 2)
+
+    def test_only_core(self):
+        self.vm.add_args('-nodefaults', '-S')
+        self.vm.add_args('-smp', '1,cores=2,maxcpus=2')
+        self.vm.add_args('-cpu', 'qemu64')
+        self.vm.add_args('-device', 'qemu64-x86_64-cpu,core-id=1')
+        self.vm.launch()
+        self.assertEquals(len(self.vm.command('query-cpus')), 2)
+
+    def test_only_thread(self):
+        self.vm.add_args('-nodefaults', '-S')
+        self.vm.add_args('-smp', '1,threads=2,maxcpus=2')
+        self.vm.add_args('-cpu', 'qemu64')
+        self.vm.add_args('-device', 'qemu64-x86_64-cpu,thread-id=1')
+        self.vm.launch()
+        self.assertEquals(len(self.vm.command('query-cpus')), 2)
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] pc: Fix error message on die-id validation
  2019-08-15 18:38 ` [Qemu-devel] [PATCH 1/3] pc: Fix error message on die-id validation Eduardo Habkost
@ 2019-08-15 20:04   ` Vanderson Martins do Rosario
  2019-08-16  1:04   ` Like Xu
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 27+ messages in thread
From: Vanderson Martins do Rosario @ 2019-08-15 20:04 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Peter Krempa, Like Xu, Michael S. Tsirkin, qemu-devel,
	Paolo Bonzini, Igor Mammedov, Richard Henderson

Reviewed-by: Vanderson M. do Rosario <vandersonmr2@gmail.com>

Vanderson M. Rosario


On Thu, Aug 15, 2019 at 3:48 PM Eduardo Habkost <ehabkost@redhat.com> wrote:

> The error message for die-id range validation is incorrect.  Example:
>
>   $ qemu-system-x86_64 -smp 1,sockets=6,maxcpus=6 \
>     -device qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0
>   qemu-system-x86_64: -device
> qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0: \
>     Invalid CPU die-id: 1 must be in range 0:5
>
> The actual range for die-id in this example is 0:0.
>
> Fix the error message to use smp_dies and print the correct range.
>
> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> ---
>  hw/i386/pc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 549c437050..24b03bb49c 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -2412,7 +2412,7 @@ static void pc_cpu_pre_plug(HotplugHandler
> *hotplug_dev,
>              return;
>          } else if (cpu->die_id > pcms->smp_dies - 1) {
>              error_setg(errp, "Invalid CPU die-id: %u must be in range
> 0:%u",
> -                       cpu->die_id, max_socket);
> +                       cpu->die_id, pcms->smp_dies - 1);
>              return;
>          }
>          if (cpu->core_id < 0) {
> --
> 2.21.0
>
>
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] pc: Improve error message when die-id is omitted
  2019-08-15 18:38 ` [Qemu-devel] [PATCH 2/3] pc: Improve error message when die-id is omitted Eduardo Habkost
@ 2019-08-15 20:11   ` Vanderson Martins do Rosario
  2019-08-16 14:06   ` Igor Mammedov
  1 sibling, 0 replies; 27+ messages in thread
From: Vanderson Martins do Rosario @ 2019-08-15 20:11 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Peter Krempa, Like Xu, Michael S. Tsirkin, qemu-devel,
	Paolo Bonzini, Igor Mammedov, Richard Henderson

Reviewed-by: Vanderson M. do Rosario <vandersonmr2@gmail.com>

Vanderson M. Rosario


On Thu, Aug 15, 2019 at 3:43 PM Eduardo Habkost <ehabkost@redhat.com> wrote:

> The error message when die-id is omitted doesn't make sense:
>
>   $ qemu-system-x86_64 -smp 1,sockets=6,maxcpus=6 \
>     -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0
>   qemu-system-x86_64: -device
> qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0: \
>     Invalid CPU die-id: 4294967295 must be in range 0:0
>
> Fix it, so it will now read:
>
>   qemu-system-x86_64: -device
> qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0: \
>     CPU die-id is not set
>
> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> ---
>  hw/i386/pc.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 24b03bb49c..fb4ac5ca90 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -2410,6 +2410,10 @@ static void pc_cpu_pre_plug(HotplugHandler
> *hotplug_dev,
>              error_setg(errp, "Invalid CPU socket-id: %u must be in range
> 0:%u",
>                         cpu->socket_id, max_socket);
>              return;
> +        }
> +        if (cpu->die_id < 0) {
> +            error_setg(errp, "CPU die-id is not set");
> +            return;
>          } else if (cpu->die_id > pcms->smp_dies - 1) {
>              error_setg(errp, "Invalid CPU die-id: %u must be in range
> 0:%u",
>                         cpu->die_id, pcms->smp_dies - 1);
> --
> 2.21.0
>
>
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] pc: Fix error message on die-id validation
  2019-08-15 18:38 ` [Qemu-devel] [PATCH 1/3] pc: Fix error message on die-id validation Eduardo Habkost
  2019-08-15 20:04   ` Vanderson Martins do Rosario
@ 2019-08-16  1:04   ` Like Xu
  2019-08-16 13:49     ` Eduardo Habkost
  2019-08-16  6:06   ` Markus Armbruster
  2019-08-16 14:04   ` Igor Mammedov
  3 siblings, 1 reply; 27+ messages in thread
From: Like Xu @ 2019-08-16  1:04 UTC (permalink / raw)
  To: Eduardo Habkost, qemu-devel
  Cc: Igor Mammedov, Richard Henderson, Peter Krempa, Paolo Bonzini,
	Michael S. Tsirkin

Hi,

On 2019/8/16 2:38, Eduardo Habkost wrote:
> The error message for die-id range validation is incorrect.  Example:
> 
>    $ qemu-system-x86_64 -smp 1,sockets=6,maxcpus=6 \
>      -device qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0
>    qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0: \
>      Invalid CPU die-id: 1 must be in range 0:5
> 
> The actual range for die-id in this example is 0:0.

There is one die per socket by default.

The case sockets=6 means there are 6 dies by default
and the range for die-id is 0:5.

> 
> Fix the error message to use smp_dies and print the correct range.
> 
> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> ---
>   hw/i386/pc.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 549c437050..24b03bb49c 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -2412,7 +2412,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>               return;
>           } else if (cpu->die_id > pcms->smp_dies - 1) {
>               error_setg(errp, "Invalid CPU die-id: %u must be in range 0:%u",
> -                       cpu->die_id, max_socket);
> +                       cpu->die_id, pcms->smp_dies - 1);
>               return;
>           }
>           if (cpu->core_id < 0) {
> 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] pc: Fix error message on die-id validation
  2019-08-15 18:38 ` [Qemu-devel] [PATCH 1/3] pc: Fix error message on die-id validation Eduardo Habkost
  2019-08-15 20:04   ` Vanderson Martins do Rosario
  2019-08-16  1:04   ` Like Xu
@ 2019-08-16  6:06   ` Markus Armbruster
  2019-08-16 14:04   ` Igor Mammedov
  3 siblings, 0 replies; 27+ messages in thread
From: Markus Armbruster @ 2019-08-16  6:06 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Peter Krempa, Like Xu, Michael S. Tsirkin, qemu-devel,
	Paolo Bonzini, Igor Mammedov, Richard Henderson

Eduardo Habkost <ehabkost@redhat.com> writes:

> The error message for die-id range validation is incorrect.  Example:
>
>   $ qemu-system-x86_64 -smp 1,sockets=6,maxcpus=6 \
>     -device qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0
>   qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0: \
>     Invalid CPU die-id: 1 must be in range 0:5

The error message makes me wonder whether the upper bound is inclusive.

> The actual range for die-id in this example is 0:0.

Aha.  I guess it is.

Hmm...

$ git-grep 'must be between' \*.c
block/qcow.c:        error_setg(errp, "Cluster size must be between 512 and 64k");
block/qcow.c:        error_setg(errp, "L2 table size must be between 512 and 64k");
hw/arm/armsse.c:        error_setg(errp, "SRAM_ADDR_WIDTH must be between 1 and %d",
hw/block/block.c:            error_setg(errp, "cyls must be between 1 and %u", cyls_max);
hw/block/block.c:            error_setg(errp, "heads must be between 1 and %u", heads_max);
hw/block/block.c:            error_setg(errp, "secs must be between 1 and %u", secs_max);
hw/block/virtio-blk.c:                   ", must be between 1 and %d",
hw/block/virtio-blk.c:                   "), must be between 1 and %d",
hw/net/rocker/rocker_of_dpa.c:            DPRINTF("New vlan_id (%d) must be between 1 and 4095\n",
hw/net/virtio-net.c:        error_setg(errp, "'speed' must be between 0 and INT_MAX");
hw/nvram/spapr_nvram.c:                   "spapr-nvram must be between %" PRId64
hw/timer/a9gtimer.c:        error_setg(errp, "%s: num-cpu must be between 1 and %d",
hw/timer/arm_mptimer.c:        error_setg(errp, "num-cpu must be between 1 and %d",
hw/usb/hcd-ehci.c:        error_setg(errp, "firstport must be between 0 and %u",
net/slirp.c:                   "(IPv6 prefix length must be between 0 and 126)");
tests/test-throttle.c:        /* burst_length must be between 1 and THROTTLE_VALUE_MAX */
util/keyval.c: * The length of any key-fragment must be between 1 and 127.

> Fix the error message to use smp_dies and print the correct range.
>
> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> ---
>  hw/i386/pc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 549c437050..24b03bb49c 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -2412,7 +2412,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>              return;
>          } else if (cpu->die_id > pcms->smp_dies - 1) {
>              error_setg(errp, "Invalid CPU die-id: %u must be in range 0:%u",
> -                       cpu->die_id, max_socket);
> +                       cpu->die_id, pcms->smp_dies - 1);
>              return;
>          }
>          if (cpu->core_id < 0) {


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary
  2019-08-15 18:38 ` [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary Eduardo Habkost
@ 2019-08-16  6:10   ` Markus Armbruster
  2019-08-16  7:49     ` Erik Skultety
  2019-08-16 16:49     ` Eduardo Habkost
  2019-08-16 13:20   ` Igor Mammedov
  1 sibling, 2 replies; 27+ messages in thread
From: Markus Armbruster @ 2019-08-16  6:10 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Peter Krempa, Like Xu, Michael S. Tsirkin, qemu-devel,
	Paolo Bonzini, Igor Mammedov, Richard Henderson

Eduardo Habkost <ehabkost@redhat.com> writes:

> We have this issue reported when using libvirt to hotplug CPUs:
> https://bugzilla.redhat.com/show_bug.cgi?id=1741451
>
> Basically, libvirt is not copying die-id from
> query-hotpluggable-cpus, but die-id is now mandatory.

Uh-oh, "is now mandatory": making an optional property mandatory is an
incompatible change.  When did we do that?  Commit hash, please.

[...]


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary
  2019-08-16  6:10   ` Markus Armbruster
@ 2019-08-16  7:49     ` Erik Skultety
  2019-08-16 12:22       ` Markus Armbruster
  2019-08-16 16:49     ` Eduardo Habkost
  1 sibling, 1 reply; 27+ messages in thread
From: Erik Skultety @ 2019-08-16  7:49 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Peter Krempa, Eduardo Habkost, Like Xu, Michael S. Tsirkin,
	qemu-devel, Igor Mammedov, Paolo Bonzini, Richard Henderson

On Fri, Aug 16, 2019 at 08:10:20AM +0200, Markus Armbruster wrote:
> Eduardo Habkost <ehabkost@redhat.com> writes:
>
> > We have this issue reported when using libvirt to hotplug CPUs:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1741451
> >
> > Basically, libvirt is not copying die-id from
> > query-hotpluggable-cpus, but die-id is now mandatory.
>
> Uh-oh, "is now mandatory": making an optional property mandatory is an
> incompatible change.  When did we do that?  Commit hash, please.
>
> [...]
>

I don't even see it as being optional ever - the property wasn't even
recognized before commit 176d2cda0de introduced it as mandatory.

Regards,
Erik


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary
  2019-08-16  7:49     ` Erik Skultety
@ 2019-08-16 12:22       ` Markus Armbruster
  2019-08-16 17:42         ` Eduardo Habkost
  0 siblings, 1 reply; 27+ messages in thread
From: Markus Armbruster @ 2019-08-16 12:22 UTC (permalink / raw)
  To: Erik Skultety
  Cc: Peter Krempa, Eduardo Habkost, Like Xu, Michael S. Tsirkin,
	qemu-devel, Paolo Bonzini, Igor Mammedov, Richard Henderson

Erik Skultety <eskultet@redhat.com> writes:

> On Fri, Aug 16, 2019 at 08:10:20AM +0200, Markus Armbruster wrote:
>> Eduardo Habkost <ehabkost@redhat.com> writes:
>>
>> > We have this issue reported when using libvirt to hotplug CPUs:
>> > https://bugzilla.redhat.com/show_bug.cgi?id=1741451
>> >
>> > Basically, libvirt is not copying die-id from
>> > query-hotpluggable-cpus, but die-id is now mandatory.
>>
>> Uh-oh, "is now mandatory": making an optional property mandatory is an
>> incompatible change.  When did we do that?  Commit hash, please.
>>
>> [...]
>>
>
> I don't even see it as being optional ever - the property wasn't even
> recognized before commit 176d2cda0de introduced it as mandatory.

Compatibility break.

Commit 176d2cda0de is in v4.1.0.  If I had learned about it a bit
earlier, I would've argued for a last minute fix or a revert.  Now we
have a regression in the release.

Eduardo, I think this fix should go into v4.1.1.  Please add cc:
qemu-stable.

How can we best avoid such compatibility breaks to slip in undetected?

A static checker would be nice.  For vmstate, we have
scripts/vmstate-static-checker.py.  Not sure it's used.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary
  2019-08-15 18:38 ` [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary Eduardo Habkost
  2019-08-16  6:10   ` Markus Armbruster
@ 2019-08-16 13:20   ` Igor Mammedov
  2019-08-16 16:56     ` Eduardo Habkost
  1 sibling, 1 reply; 27+ messages in thread
From: Igor Mammedov @ 2019-08-16 13:20 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Peter Krempa, Like Xu, Michael S. Tsirkin, qemu-devel,
	Paolo Bonzini, Richard Henderson

On Thu, 15 Aug 2019 15:38:03 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:

> We have this issue reported when using libvirt to hotplug CPUs:
> https://bugzilla.redhat.com/show_bug.cgi?id=1741451
> 
> Basically, libvirt is not copying die-id from
> query-hotpluggable-cpus, but die-id is now mandatory.

this should have been gated on compat property and affect
only new machine types.
Maybe we should do just that instead of fixup so libvirt
would finally make proper handling of query-hotpluggable-cpus.

 
> We could blame libvirt and say it is not following the documented
> interface, because we have this buried in the QAPI schema
> documentation:

I wouldn't say buried, if I understand it right QAPI schema
should be the authoritative source of interface description.

If I recall it's not the first time, there was similar issue
for exactly the same reason (libvirt not passing through
all properties from query-hotpluggable-cpus).

And we had to fix it up on QEMU side (numa_cpu_pre_plug),
but it seems 2 years later libvirt is still broken the same way :(

Should we really do fixups or finaly fix it on libvirt side?

 
> > Note: currently there are 5 properties that could be present
> > but management should be prepared to pass through other
> > properties with device_add command to allow for future
> > interface extension. This also requires the filed names to be kept in
> > sync with the properties passed to -device/device_add.  
> 
> But I don't think this would be reasonable from us.  We can just
> make QEMU more flexible and let CPU properties to be omitted when
> there's no ambiguity.  This will allow us to keep compatibility
> with existing libvirt versions.

I don't really like making rule from exceptions so I'd suggest doing
it only for  die_id if we have to do fix it up (with fat comment
like in numa_cpu_pre_plug).
The rest are working fine as is.


> Test case included to ensure we don't break this again.
> 
> Cc: Peter Krempa <pkrempa@redhat.com>
> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> ---
>  hw/i386/pc.c                             | 17 +++++++
>  tests/acceptance/pc_cpu_hotplug_props.py | 59 ++++++++++++++++++++++++
>  2 files changed, 76 insertions(+)
>  create mode 100644 tests/acceptance/pc_cpu_hotplug_props.py
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index fb4ac5ca90..4d773c862d 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -2403,6 +2403,23 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>          int max_socket = (ms->smp.max_cpus - 1) /
>                                  smp_threads / smp_cores / pcms->smp_dies;
>  
> +        /*
> +         * If there's only one possible value for a topology property,
> +         * allow it to be omitted.
> +         */
> +        if (cpu->socket_id < 0 && max_socket == 0) {
> +            cpu->socket_id = 0;
> +        }
> +        if (cpu->die_id < 0 && pcms->smp_dies == 1) {
> +            cpu->die_id = 0;
> +        }
> +        if (cpu->core_id < 0 && smp_cores == 1) {
> +            cpu->core_id = 0;
> +        }
> +        if (cpu->thread_id < 0 && smp_threads == 1) {
> +            cpu->thread_id = 0;
> +        }
> +
>          if (cpu->socket_id < 0) {
>              error_setg(errp, "CPU socket-id is not set");
>              return;
> diff --git a/tests/acceptance/pc_cpu_hotplug_props.py b/tests/acceptance/pc_cpu_hotplug_props.py
> new file mode 100644
> index 0000000000..2c36926214
> --- /dev/null
> +++ b/tests/acceptance/pc_cpu_hotplug_props.py
> @@ -0,0 +1,59 @@
> +#
> +# Ensure CPU topology parameters may be omitted on -device
> +#
> +#  Copyright (c) 2019 Red Hat Inc
> +#
> +# Author:
> +#  Eduardo Habkost <ehabkost@redhat.com>
> +#
> +# This library is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2 of the License, or (at your option) any later version.
> +#
> +# This library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +# Lesser General Public License for more details.
> +#
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with this library; if not, see <http://www.gnu.org/licenses/>.
> +#
> +
> +from avocado_qemu import Test
> +
> +class OmittedCPUProps(Test):
> +    """
> +    :avocado: tags=arch:x86_64
> +    """
> +    def test_only_socket(self):
> +        self.vm.add_args('-nodefaults', '-S')
> +        self.vm.add_args('-smp', '1,sockets=2,maxcpus=2')
> +        self.vm.add_args('-cpu', 'qemu64')
> +        self.vm.add_args('-device', 'qemu64-x86_64-cpu,socket-id=1')
> +        self.vm.launch()
> +        self.assertEquals(len(self.vm.command('query-cpus')), 2)
> +
> +    def test_only_die(self):
> +        self.vm.add_args('-nodefaults', '-S')
> +        self.vm.add_args('-smp', '1,dies=2,maxcpus=2')
> +        self.vm.add_args('-cpu', 'qemu64')
> +        self.vm.add_args('-device', 'qemu64-x86_64-cpu,die-id=1')
> +        self.vm.launch()
> +        self.assertEquals(len(self.vm.command('query-cpus')), 2)
> +
> +    def test_only_core(self):
> +        self.vm.add_args('-nodefaults', '-S')
> +        self.vm.add_args('-smp', '1,cores=2,maxcpus=2')
> +        self.vm.add_args('-cpu', 'qemu64')
> +        self.vm.add_args('-device', 'qemu64-x86_64-cpu,core-id=1')
> +        self.vm.launch()
> +        self.assertEquals(len(self.vm.command('query-cpus')), 2)
> +
> +    def test_only_thread(self):
> +        self.vm.add_args('-nodefaults', '-S')
> +        self.vm.add_args('-smp', '1,threads=2,maxcpus=2')
> +        self.vm.add_args('-cpu', 'qemu64')
> +        self.vm.add_args('-device', 'qemu64-x86_64-cpu,thread-id=1')
> +        self.vm.launch()
> +        self.assertEquals(len(self.vm.command('query-cpus')), 2)



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] pc: Fix error message on die-id validation
  2019-08-16  1:04   ` Like Xu
@ 2019-08-16 13:49     ` Eduardo Habkost
  2019-08-19  0:53       ` Like Xu
  0 siblings, 1 reply; 27+ messages in thread
From: Eduardo Habkost @ 2019-08-16 13:49 UTC (permalink / raw)
  To: Like Xu
  Cc: Peter Krempa, Michael S. Tsirkin, qemu-devel, Paolo Bonzini,
	Igor Mammedov, Richard Henderson

On Fri, Aug 16, 2019 at 09:04:16AM +0800, Like Xu wrote:
> Hi,
> 
> On 2019/8/16 2:38, Eduardo Habkost wrote:
> > The error message for die-id range validation is incorrect.  Example:
> > 
> >    $ qemu-system-x86_64 -smp 1,sockets=6,maxcpus=6 \
> >      -device qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0
> >    qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0: \
> >      Invalid CPU die-id: 1 must be in range 0:5
> > 
> > The actual range for die-id in this example is 0:0.
> 
> There is one die per socket by default.
> 
> The case sockets=6 means there are 6 dies by default
> and the range for die-id is 0:5.
> 

I don't understand why you say that.  die-id supposed to identify
a die inside a socket.  The code below is already checking for
(cpu->die_id > pcms->smp_dies - 1) because of that.


> > 
> > Fix the error message to use smp_dies and print the correct range.
> > 
> > Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> > ---
> >   hw/i386/pc.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index 549c437050..24b03bb49c 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -2412,7 +2412,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
> >               return;
> >           } else if (cpu->die_id > pcms->smp_dies - 1) {
> >               error_setg(errp, "Invalid CPU die-id: %u must be in range 0:%u",
> > -                       cpu->die_id, max_socket);
> > +                       cpu->die_id, pcms->smp_dies - 1);
> >               return;
> >           }
> >           if (cpu->core_id < 0) {
> > 
> 

-- 
Eduardo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] pc: Fix error message on die-id validation
  2019-08-15 18:38 ` [Qemu-devel] [PATCH 1/3] pc: Fix error message on die-id validation Eduardo Habkost
                     ` (2 preceding siblings ...)
  2019-08-16  6:06   ` Markus Armbruster
@ 2019-08-16 14:04   ` Igor Mammedov
  3 siblings, 0 replies; 27+ messages in thread
From: Igor Mammedov @ 2019-08-16 14:04 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Peter Krempa, Like Xu, Michael S. Tsirkin, qemu-devel,
	Paolo Bonzini, Richard Henderson

On Thu, 15 Aug 2019 15:38:01 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:

> The error message for die-id range validation is incorrect.  Example:
> 
>   $ qemu-system-x86_64 -smp 1,sockets=6,maxcpus=6 \
>     -device qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0
>   qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0: \
>     Invalid CPU die-id: 1 must be in range 0:5
> 
> The actual range for die-id in this example is 0:0.
> 
> Fix the error message to use smp_dies and print the correct range.
> 
> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

Reviewed-by: Igor Mammedov <imammedo@redhat.com>

PS:
there is another unrelated bug, QEMU crashes if run like this:
   qemu-system-x86_64 -smp 1,sockets=6,dies=0,maxcpus=6

> ---
>  hw/i386/pc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 549c437050..24b03bb49c 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -2412,7 +2412,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>              return;
>          } else if (cpu->die_id > pcms->smp_dies - 1) {
>              error_setg(errp, "Invalid CPU die-id: %u must be in range 0:%u",
> -                       cpu->die_id, max_socket);
> +                       cpu->die_id, pcms->smp_dies - 1);
>              return;
>          }
>          if (cpu->core_id < 0) {



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] pc: Improve error message when die-id is omitted
  2019-08-15 18:38 ` [Qemu-devel] [PATCH 2/3] pc: Improve error message when die-id is omitted Eduardo Habkost
  2019-08-15 20:11   ` Vanderson Martins do Rosario
@ 2019-08-16 14:06   ` Igor Mammedov
  1 sibling, 0 replies; 27+ messages in thread
From: Igor Mammedov @ 2019-08-16 14:06 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Peter Krempa, Like Xu, Michael S. Tsirkin, qemu-devel,
	Paolo Bonzini, Richard Henderson

On Thu, 15 Aug 2019 15:38:02 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:

> The error message when die-id is omitted doesn't make sense:
> 
>   $ qemu-system-x86_64 -smp 1,sockets=6,maxcpus=6 \
>     -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0
>   qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0: \
>     Invalid CPU die-id: 4294967295 must be in range 0:0
> 
> Fix it, so it will now read:
> 
>   qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0: \
>     CPU die-id is not set
> 
> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

Reviewed-by: Igor Mammedov <imammedo@redhat.com>

> ---
>  hw/i386/pc.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 24b03bb49c..fb4ac5ca90 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -2410,6 +2410,10 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>              error_setg(errp, "Invalid CPU socket-id: %u must be in range 0:%u",
>                         cpu->socket_id, max_socket);
>              return;
> +        }
> +        if (cpu->die_id < 0) {
> +            error_setg(errp, "CPU die-id is not set");
> +            return;
>          } else if (cpu->die_id > pcms->smp_dies - 1) {
>              error_setg(errp, "Invalid CPU die-id: %u must be in range 0:%u",
>                         cpu->die_id, pcms->smp_dies - 1);



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary
  2019-08-16  6:10   ` Markus Armbruster
  2019-08-16  7:49     ` Erik Skultety
@ 2019-08-16 16:49     ` Eduardo Habkost
  1 sibling, 0 replies; 27+ messages in thread
From: Eduardo Habkost @ 2019-08-16 16:49 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Peter Krempa, Like Xu, Michael S. Tsirkin, qemu-devel,
	Paolo Bonzini, Igor Mammedov, Richard Henderson

On Fri, Aug 16, 2019 at 08:10:20AM +0200, Markus Armbruster wrote:
> Eduardo Habkost <ehabkost@redhat.com> writes:
> 
> > We have this issue reported when using libvirt to hotplug CPUs:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1741451
> >
> > Basically, libvirt is not copying die-id from
> > query-hotpluggable-cpus, but die-id is now mandatory.
> 
> Uh-oh, "is now mandatory": making an optional property mandatory is an
> incompatible change.  When did we do that?  Commit hash, please.

Sorry, forgot to include a "Fixes:" line.

commit 176d2cda0dee9f4f78f604ad72d6a111e8e38f3b
Author: Like Xu <like.xu@linux.intel.com>
Date:   Wed Jun 12 16:40:58 2019 +0800

    i386/cpu: Consolidate die-id validity in smp context
    
    The field die_id (default as 0) and has_die_id are introduced to X86CPU.
    Following the legacy smp check rules, the die_id validity is added to
    the same contexts as leagcy smp variables such as hmp_hotpluggable_cpus(),
    machine_set_cpu_numa_node(), cpu_slot_to_string() and pc_cpu_pre_plug().
    
    Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    Signed-off-by: Like Xu <like.xu@linux.intel.com>
    Message-Id: <20190612084104.34984-4-like.xu@linux.intel.com>
    Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
    Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>


-- 
Eduardo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary
  2019-08-16 13:20   ` Igor Mammedov
@ 2019-08-16 16:56     ` Eduardo Habkost
  2019-08-17  6:17       ` Markus Armbruster
  0 siblings, 1 reply; 27+ messages in thread
From: Eduardo Habkost @ 2019-08-16 16:56 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Peter Krempa, Like Xu, Michael S. Tsirkin, qemu-devel,
	Paolo Bonzini, Richard Henderson

On Fri, Aug 16, 2019 at 03:20:11PM +0200, Igor Mammedov wrote:
> On Thu, 15 Aug 2019 15:38:03 -0300
> Eduardo Habkost <ehabkost@redhat.com> wrote:
> 
> > We have this issue reported when using libvirt to hotplug CPUs:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1741451
> > 
> > Basically, libvirt is not copying die-id from
> > query-hotpluggable-cpus, but die-id is now mandatory.
> 
> this should have been gated on compat property and affect
> only new machine types.
> Maybe we should do just that instead of fixup so libvirt
> would finally make proper handling of query-hotpluggable-cpus.
> 
>  
> > We could blame libvirt and say it is not following the documented
> > interface, because we have this buried in the QAPI schema
> > documentation:
> 
> I wouldn't say buried, if I understand it right QAPI schema
> should be the authoritative source of interface description.
> 
> If I recall it's not the first time, there was similar issue
> for exactly the same reason (libvirt not passing through
> all properties from query-hotpluggable-cpus).
> 
> And we had to fix it up on QEMU side (numa_cpu_pre_plug),
> but it seems 2 years later libvirt is still broken the same way :(
> 
> Should we really do fixups or finaly fix it on libvirt side?

Is it truly a bug in libvirt?  Making QEMU behave differently
when getting exactly the same input sounds like a bad idea, even
if we documented that at the QAPI documentation.

My suggestion is to instead drop the comment below from the QAPI
documentation.  New properties shouldn't become mandatory.

>  
> > > Note: currently there are 5 properties that could be present
> > > but management should be prepared to pass through other
> > > properties with device_add command to allow for future
> > > interface extension. This also requires the filed names to be kept in
> > > sync with the properties passed to -device/device_add.  
> > 
> > But I don't think this would be reasonable from us.  We can just
> > make QEMU more flexible and let CPU properties to be omitted when
> > there's no ambiguity.  This will allow us to keep compatibility
> > with existing libvirt versions.
> 
> I don't really like making rule from exceptions so I'd suggest doing
> it only for  die_id if we have to do fix it up (with fat comment
> like in numa_cpu_pre_plug).
> The rest are working fine as is.

I will insist we make it consistent for all properties, but I
don't want this discussion to hold the bug fix.  So I'll do this:

I will submit a new patch that makes only die-id optional, and CC
qemu-stable.

After that, i will submit this patch again, and we can discuss
whether we should make all properties optional.

-- 
Eduardo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary
  2019-08-16 12:22       ` Markus Armbruster
@ 2019-08-16 17:42         ` Eduardo Habkost
  2019-08-16 21:07           ` Yash Mankad
  2019-08-17  5:34           ` Markus Armbruster
  0 siblings, 2 replies; 27+ messages in thread
From: Eduardo Habkost @ 2019-08-16 17:42 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Yash Mankad, Peter Krempa, Like Xu, Michael S. Tsirkin,
	Erik Skultety, qemu-devel, Paolo Bonzini, Igor Mammedov,
	Miroslav Rezanina, Danilo C. L. de Paula, Richard Henderson

On Fri, Aug 16, 2019 at 02:22:58PM +0200, Markus Armbruster wrote:
> Erik Skultety <eskultet@redhat.com> writes:
> 
> > On Fri, Aug 16, 2019 at 08:10:20AM +0200, Markus Armbruster wrote:
> >> Eduardo Habkost <ehabkost@redhat.com> writes:
> >>
> >> > We have this issue reported when using libvirt to hotplug CPUs:
> >> > https://bugzilla.redhat.com/show_bug.cgi?id=1741451
> >> >
> >> > Basically, libvirt is not copying die-id from
> >> > query-hotpluggable-cpus, but die-id is now mandatory.
> >>
> >> Uh-oh, "is now mandatory": making an optional property mandatory is an
> >> incompatible change.  When did we do that?  Commit hash, please.
> >>
> >> [...]
> >>
> >
> > I don't even see it as being optional ever - the property wasn't even
> > recognized before commit 176d2cda0de introduced it as mandatory.
> 
> Compatibility break.
> 
> Commit 176d2cda0de is in v4.1.0.  If I had learned about it a bit
> earlier, I would've argued for a last minute fix or a revert.  Now we
> have a regression in the release.
> 
> Eduardo, I think this fix should go into v4.1.1.  Please add cc:
> qemu-stable.

I did it in v2.

> 
> How can we best avoid such compatibility breaks to slip in undetected?
> 
> A static checker would be nice.  For vmstate, we have
> scripts/vmstate-static-checker.py.  Not sure it's used.

I don't think this specific bug would be detected with a static
checker.  "die-id is mandatory" is not something that can be
extracted by looking at QOM data structures.  The new rule was
being enforced by the hotplug handler callbacks, and the hotplug
handler call tree is a bit complex (too complex for my taste, but
I digress).

We could have detected this with a simple CPU hotplug automated
test case, though.  Or with a very simple -device test case like
the one I have submitted with this patch.

This was detected by libvirt automated test cases.  It would be
nice if this was run during the -rc stage and not only after the
4.1.0 release, though.

I don't know details of the test job.  Danilo, Mirek, Yash: do
you know how this bug was detected, and what we could do to run
the same test jobs in upstream QEMU release candidates?

-- 
Eduardo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary
  2019-08-16 17:42         ` Eduardo Habkost
@ 2019-08-16 21:07           ` Yash Mankad
  2019-08-20 21:06             ` Eduardo Habkost
  2019-08-17  5:34           ` Markus Armbruster
  1 sibling, 1 reply; 27+ messages in thread
From: Yash Mankad @ 2019-08-16 21:07 UTC (permalink / raw)
  To: Eduardo Habkost, Markus Armbruster
  Cc: Peter Krempa, Like Xu, Erik Skultety, Michael S. Tsirkin,
	qemu-devel, Igor Mammedov, Paolo Bonzini, Miroslav Rezanina,
	Danilo C. L. de Paula, Richard Henderson



On 8/16/19 1:42 PM, Eduardo Habkost wrote:
> On Fri, Aug 16, 2019 at 02:22:58PM +0200, Markus Armbruster wrote:
>> Erik Skultety <eskultet@redhat.com> writes:
>>
>>> On Fri, Aug 16, 2019 at 08:10:20AM +0200, Markus Armbruster wrote:
>>>> Eduardo Habkost <ehabkost@redhat.com> writes:
>>>>
>>>>> We have this issue reported when using libvirt to hotplug CPUs:
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1741451
>>>>>
>>>>> Basically, libvirt is not copying die-id from
>>>>> query-hotpluggable-cpus, but die-id is now mandatory.
>>>> Uh-oh, "is now mandatory": making an optional property mandatory is an
>>>> incompatible change.  When did we do that?  Commit hash, please.
>>>>
>>>> [...]
>>>>
>>> I don't even see it as being optional ever - the property wasn't even
>>> recognized before commit 176d2cda0de introduced it as mandatory.
>> Compatibility break.
>>
>> Commit 176d2cda0de is in v4.1.0.  If I had learned about it a bit
>> earlier, I would've argued for a last minute fix or a revert.  Now we
>> have a regression in the release.
>>
>> Eduardo, I think this fix should go into v4.1.1.  Please add cc:
>> qemu-stable.
> I did it in v2.
>
>> How can we best avoid such compatibility breaks to slip in undetected?
>>
>> A static checker would be nice.  For vmstate, we have
>> scripts/vmstate-static-checker.py.  Not sure it's used.
> I don't think this specific bug would be detected with a static
> checker.  "die-id is mandatory" is not something that can be
> extracted by looking at QOM data structures.  The new rule was
> being enforced by the hotplug handler callbacks, and the hotplug
> handler call tree is a bit complex (too complex for my taste, but
> I digress).
>
> We could have detected this with a simple CPU hotplug automated
> test case, though.  Or with a very simple -device test case like
> the one I have submitted with this patch.
>
> This was detected by libvirt automated test cases.  It would be
> nice if this was run during the -rc stage and not only after the
> 4.1.0 release, though.
>
> I don't know details of the test job.  Danilo, Mirek, Yash: do
> you know how this bug was detected, and what we could do to run
> the same test jobs in upstream QEMU release candidates?

This bug was caught by our internal gating tests.

The libvirt gating tests for the virt module include the
following Avocado-VT test case:

libvirt_vcpu_plug_unplug.positive_test.vcpu_set.live.vm_operate.save

This job failed with the error that you can see in the description
of the BZ#1741451 [0].

If you think that this would have been caught by a simple hotplug
case, I'd recommend adding a test for hotplug to avocado_qemu.
Otherwise, if you want, I can look into adding this particular
libvirt test case to our QEMU CI efforts.

Thanks,
Yash

[0] https://bugzilla.redhat.com/show_bug.cgi?id=1741451#c0



>




^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary
  2019-08-16 17:42         ` Eduardo Habkost
  2019-08-16 21:07           ` Yash Mankad
@ 2019-08-17  5:34           ` Markus Armbruster
  1 sibling, 0 replies; 27+ messages in thread
From: Markus Armbruster @ 2019-08-17  5:34 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Yash Mankad, Peter Krempa, Like Xu, Erik Skultety,
	Michael S. Tsirkin, qemu-devel, Igor Mammedov, Paolo Bonzini,
	Miroslav Rezanina, Danilo C. L. de Paula, Richard Henderson

Eduardo Habkost <ehabkost@redhat.com> writes:

> On Fri, Aug 16, 2019 at 02:22:58PM +0200, Markus Armbruster wrote:
>> Erik Skultety <eskultet@redhat.com> writes:
>> 
>> > On Fri, Aug 16, 2019 at 08:10:20AM +0200, Markus Armbruster wrote:
>> >> Eduardo Habkost <ehabkost@redhat.com> writes:
>> >>
>> >> > We have this issue reported when using libvirt to hotplug CPUs:
>> >> > https://bugzilla.redhat.com/show_bug.cgi?id=1741451
>> >> >
>> >> > Basically, libvirt is not copying die-id from
>> >> > query-hotpluggable-cpus, but die-id is now mandatory.
>> >>
>> >> Uh-oh, "is now mandatory": making an optional property mandatory is an
>> >> incompatible change.  When did we do that?  Commit hash, please.
>> >>
>> >> [...]
>> >>
>> >
>> > I don't even see it as being optional ever - the property wasn't even
>> > recognized before commit 176d2cda0de introduced it as mandatory.
>> 
>> Compatibility break.
>> 
>> Commit 176d2cda0de is in v4.1.0.  If I had learned about it a bit
>> earlier, I would've argued for a last minute fix or a revert.  Now we
>> have a regression in the release.
>> 
>> Eduardo, I think this fix should go into v4.1.1.  Please add cc:
>> qemu-stable.
>
> I did it in v2.
>
>> 
>> How can we best avoid such compatibility breaks to slip in undetected?
>> 
>> A static checker would be nice.  For vmstate, we have
>> scripts/vmstate-static-checker.py.  Not sure it's used.
>
> I don't think this specific bug would be detected with a static
> checker.  "die-id is mandatory" is not something that can be
> extracted by looking at QOM data structures.  The new rule was
> being enforced by the hotplug handler callbacks, and the hotplug
> handler call tree is a bit complex (too complex for my taste, but
> I digress).

QOM does too much in code.  Turing tarpit.

> We could have detected this with a simple CPU hotplug automated
> test case, though.  Or with a very simple -device test case like
> the one I have submitted with this patch.

The external QOM interface is huge.  Even if we had an army of
industrious gnomes writing simple test cases for all of it, we'd still
need a fleet of machines to actually run them, and at least a batallion
of gnomes to maintain them.

The extremely basic qom-test gobbles up a painful amount of CPU cycles
already:

$ time for i in `find bld/*-softmmu -maxdepth 1 -name qemu-system-\* -perm /u+x`; do QTEST_QEMU_BINARY=$i bld/tests/qom-test; done
/aarch64/qom/versatileab: OK
[260 lines of the form name/of/test: OK omitted...]
/xtensaeb/qom/lx60: OK

real	3m33.001s
user	2m18.081s
sys	1m31.809s

> This was detected by libvirt automated test cases.  It would be

Nice.

> nice if this was run during the -rc stage and not only after the
> 4.1.0 release, though.

We don't always get lucky.

> I don't know details of the test job.  Danilo, Mirek, Yash: do
> you know how this bug was detected, and what we could do to run
> the same test jobs in upstream QEMU release candidates?

Thinking about how to make the best use of the tests we have is in
order.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary
  2019-08-16 16:56     ` Eduardo Habkost
@ 2019-08-17  6:17       ` Markus Armbruster
  2019-08-26 14:51         ` Igor Mammedov
  0 siblings, 1 reply; 27+ messages in thread
From: Markus Armbruster @ 2019-08-17  6:17 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Peter Krempa, Like Xu, Michael S. Tsirkin, qemu-devel,
	Paolo Bonzini, Igor Mammedov, Richard Henderson

Eduardo Habkost <ehabkost@redhat.com> writes:

> On Fri, Aug 16, 2019 at 03:20:11PM +0200, Igor Mammedov wrote:
>> On Thu, 15 Aug 2019 15:38:03 -0300
>> Eduardo Habkost <ehabkost@redhat.com> wrote:
>> 
>> > We have this issue reported when using libvirt to hotplug CPUs:
>> > https://bugzilla.redhat.com/show_bug.cgi?id=1741451
>> > 
>> > Basically, libvirt is not copying die-id from
>> > query-hotpluggable-cpus, but die-id is now mandatory.
>> 
>> this should have been gated on compat property and affect
>> only new machine types.
>> Maybe we should do just that instead of fixup so libvirt
>> would finally make proper handling of query-hotpluggable-cpus.
>> 
>>  
>> > We could blame libvirt and say it is not following the documented
>> > interface, because we have this buried in the QAPI schema
>> > documentation:
>> 
>> I wouldn't say buried, if I understand it right QAPI schema
>> should be the authoritative source of interface description.
>> 
>> If I recall it's not the first time, there was similar issue
>> for exactly the same reason (libvirt not passing through
>> all properties from query-hotpluggable-cpus).
>> 
>> And we had to fix it up on QEMU side (numa_cpu_pre_plug),
>> but it seems 2 years later libvirt is still broken the same way :(
>> 
>> Should we really do fixups or finaly fix it on libvirt side?
>
> Is it truly a bug in libvirt?  Making QEMU behave differently
> when getting exactly the same input sounds like a bad idea, even
> if we documented that at the QAPI documentation.
>
> My suggestion is to instead drop the comment below from the QAPI
> documentation.  New properties shouldn't become mandatory.

The "comment below" is this one, in qapi/machine.json:

>> > > Note: currently there are 5 properties that could be present
>> > > but management should be prepared to pass through other
>> > > properties with device_add command to allow for future
>> > > interface extension. This also requires the filed names to be kept in
>> > > sync with the properties passed to -device/device_add.  

Goes back to commit d4633541ee0, v2.7.0.  @die-id was the first such
interface extension.

A rule like "to use command C, you must pass it whatever you get from
command Q" punches a hole into the "QMP is a stable interface" promise.
Retroactively tacking it onto an existing interface like device-add
some-existing-device is even more problematic than specifying it for a
new interface.  Mind, this is not a categorical "can't ever do that".
It's more like "you better show this is less bad than all the
alternatives we can think of, and we've thought pretty hard".

Since this particular hole failed us the first time anybody actually
tried to wiggle through it, I think Eduardo has a point when he calls
for filling it in by deleting the comment.

By the way, the line preceding the comment

     # @core-id: core number within die the CPU belongs to# @thread-id: thread number within core the CPU belongs to

is actually two lines run together.  Messed up when @die-id was added
(commit 176d2cda0de).  Minor review fail (I didn't look even though I
was cc'ed on v2; I wasn't on v3).  Let's fix that, too.

>> > But I don't think this would be reasonable from us.  We can just
>> > make QEMU more flexible and let CPU properties to be omitted when
>> > there's no ambiguity.  This will allow us to keep compatibility
>> > with existing libvirt versions.
>> 
>> I don't really like making rule from exceptions so I'd suggest doing
>> it only for  die_id if we have to do fix it up (with fat comment
>> like in numa_cpu_pre_plug).
>> The rest are working fine as is.
>
> I will insist we make it consistent for all properties, but I
> don't want this discussion to hold the bug fix.  So I'll do this:
>
> I will submit a new patch that makes only die-id optional, and CC
> qemu-stable.
>
> After that, i will submit this patch again, and we can discuss
> whether we should make all properties optional.

Makes sense, go ahead.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] pc: Fix error message on die-id validation
  2019-08-16 13:49     ` Eduardo Habkost
@ 2019-08-19  0:53       ` Like Xu
  0 siblings, 0 replies; 27+ messages in thread
From: Like Xu @ 2019-08-19  0:53 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Peter Krempa, Michael S. Tsirkin, qemu-devel, Igor Mammedov,
	Paolo Bonzini, Richard Henderson

On 2019/8/16 21:49, Eduardo Habkost wrote:
> On Fri, Aug 16, 2019 at 09:04:16AM +0800, Like Xu wrote:
>> Hi,
>>
>> On 2019/8/16 2:38, Eduardo Habkost wrote:
>>> The error message for die-id range validation is incorrect.  Example:
>>>
>>>     $ qemu-system-x86_64 -smp 1,sockets=6,maxcpus=6 \
>>>       -device qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0
>>>     qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0: \
>>>       Invalid CPU die-id: 1 must be in range 0:5
>>>
>>> The actual range for die-id in this example is 0:0.
>>
>> There is one die per socket by default.
>>
>> The case sockets=6 means there are 6 dies by default
>> and the range for die-id is 0:5.
>>
> 
> I don't understand why you say that.  die-id supposed to identify
> a die inside a socket.  The code below is already checking for
> (cpu->die_id > pcms->smp_dies - 1) because of that.
> 

You're right about this.
Sorry to make a mess to support die topology.

> 
>>>
>>> Fix the error message to use smp_dies and print the correct range.
>>>
>>> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
>>> ---
>>>    hw/i386/pc.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>>> index 549c437050..24b03bb49c 100644
>>> --- a/hw/i386/pc.c
>>> +++ b/hw/i386/pc.c
>>> @@ -2412,7 +2412,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>>>                return;
>>>            } else if (cpu->die_id > pcms->smp_dies - 1) {
>>>                error_setg(errp, "Invalid CPU die-id: %u must be in range 0:%u",
>>> -                       cpu->die_id, max_socket);
>>> +                       cpu->die_id, pcms->smp_dies - 1);
>>>                return;
>>>            }
>>>            if (cpu->core_id < 0) {
>>>
>>
> 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 0/3] pc: Fix die-id validation and compatibility with libvirt
  2019-08-15 18:38 [Qemu-devel] [PATCH 0/3] pc: Fix die-id validation and compatibility with libvirt Eduardo Habkost
                   ` (2 preceding siblings ...)
  2019-08-15 18:38 ` [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary Eduardo Habkost
@ 2019-08-19 19:19 ` Michael S. Tsirkin
  3 siblings, 0 replies; 27+ messages in thread
From: Michael S. Tsirkin @ 2019-08-19 19:19 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Peter Krempa, Like Xu, qemu-devel, Igor Mammedov, Paolo Bonzini,
	Richard Henderson

On Thu, Aug 15, 2019 at 03:38:00PM -0300, Eduardo Habkost wrote:
> Currently, if die-id is omitted on -device for CPUs, we get a
> very confusing error message:
> 
>   $ qemu-system-x86_64 -smp 1,sockets=6,maxcpus=6 \
>     -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0
>   qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0: \
>     Invalid CPU die-id: 4294967295 must be in range 0:5
> 
> This has 3 problems
> 
> 1) The actual range for die-id is 0:0.
>    This is fixed by patch 1/3.
> 2) The user didn't specify die-id=4294967295.
>    This is fixed by patch 2/3.
> 3) It breaks compatibility with libvirt because die-id was not
>    mandatory before.
>    This is addressed by patch 3/3.
> 
> Issues #1 and #2 were reported at:
> https://bugzilla.redhat.com/show_bug.cgi?id=1741151
> 
> Issue #3 was reported at:
> https://bugzilla.redhat.com/show_bug.cgi?id=1741451
> 
> Cc: Like Xu <like.xu@linux.intel.com>
> Cc: Peter Krempa <pkrempa@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>

Looks good

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>

I'm traveling and can't test this properly.
Anyone else can merge this? Eduardo?


> Eduardo Habkost (3):
>   pc: Fix error message on die-id validation
>   pc: Improve error message when die-id is omitted
>   pc: Don't make CPU properties mandatory unless necessary
> 
>  hw/i386/pc.c                             | 23 ++++++++-
>  tests/acceptance/pc_cpu_hotplug_props.py | 59 ++++++++++++++++++++++++
>  2 files changed, 81 insertions(+), 1 deletion(-)
>  create mode 100644 tests/acceptance/pc_cpu_hotplug_props.py
> 
> -- 
> 2.21.0


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary
  2019-08-16 21:07           ` Yash Mankad
@ 2019-08-20 21:06             ` Eduardo Habkost
  0 siblings, 0 replies; 27+ messages in thread
From: Eduardo Habkost @ 2019-08-20 21:06 UTC (permalink / raw)
  To: Yash Mankad
  Cc: Peter Krempa, Like Xu, Erik Skultety, Michael S. Tsirkin,
	qemu-devel, Markus Armbruster, Igor Mammedov, Paolo Bonzini,
	Miroslav Rezanina, Danilo C. L. de Paula, Richard Henderson

On Fri, Aug 16, 2019 at 05:07:10PM -0400, Yash Mankad wrote:
> On 8/16/19 1:42 PM, Eduardo Habkost wrote:
> > On Fri, Aug 16, 2019 at 02:22:58PM +0200, Markus Armbruster wrote:
> >> Erik Skultety <eskultet@redhat.com> writes:
> >>
> >>> On Fri, Aug 16, 2019 at 08:10:20AM +0200, Markus Armbruster wrote:
> >>>> Eduardo Habkost <ehabkost@redhat.com> writes:
> >>>>
> >>>>> We have this issue reported when using libvirt to hotplug CPUs:
> >>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1741451
> >>>>>
> >>>>> Basically, libvirt is not copying die-id from
> >>>>> query-hotpluggable-cpus, but die-id is now mandatory.
> >>>> Uh-oh, "is now mandatory": making an optional property mandatory is an
> >>>> incompatible change.  When did we do that?  Commit hash, please.
> >>>>
> >>>> [...]
> >>>>
> >>> I don't even see it as being optional ever - the property wasn't even
> >>> recognized before commit 176d2cda0de introduced it as mandatory.
> >> Compatibility break.
> >>
> >> Commit 176d2cda0de is in v4.1.0.  If I had learned about it a bit
> >> earlier, I would've argued for a last minute fix or a revert.  Now we
> >> have a regression in the release.
> >>
> >> Eduardo, I think this fix should go into v4.1.1.  Please add cc:
> >> qemu-stable.
> > I did it in v2.
> >
> >> How can we best avoid such compatibility breaks to slip in undetected?
> >>
> >> A static checker would be nice.  For vmstate, we have
> >> scripts/vmstate-static-checker.py.  Not sure it's used.
> > I don't think this specific bug would be detected with a static
> > checker.  "die-id is mandatory" is not something that can be
> > extracted by looking at QOM data structures.  The new rule was
> > being enforced by the hotplug handler callbacks, and the hotplug
> > handler call tree is a bit complex (too complex for my taste, but
> > I digress).
> >
> > We could have detected this with a simple CPU hotplug automated
> > test case, though.  Or with a very simple -device test case like
> > the one I have submitted with this patch.
> >
> > This was detected by libvirt automated test cases.  It would be
> > nice if this was run during the -rc stage and not only after the
> > 4.1.0 release, though.
> >
> > I don't know details of the test job.  Danilo, Mirek, Yash: do
> > you know how this bug was detected, and what we could do to run
> > the same test jobs in upstream QEMU release candidates?
> 
> This bug was caught by our internal gating tests.
> 
> The libvirt gating tests for the virt module include the
> following Avocado-VT test case:
> 
> libvirt_vcpu_plug_unplug.positive_test.vcpu_set.live.vm_operate.save
> 
> This job failed with the error that you can see in the description
> of the BZ#1741451 [0].
> 
> If you think that this would have been caught by a simple hotplug
> case, I'd recommend adding a test for hotplug to avocado_qemu.
> Otherwise, if you want, I can look into adding this particular
> libvirt test case to our QEMU CI efforts.

Having a hotplug test case using avocado_qemu would help catch
some bugs, but it's not enough if we want to catch integration
issues between QEMU and libvirt (like this one).

I wouldn't focus just on that particular libvirt test case.
I suggest we run a reasonably large set of libvirt test cases
against QEMU release candidates as soon as they are tagged.

> 
> Thanks,
> Yash
> 
> [0] https://bugzilla.redhat.com/show_bug.cgi?id=1741451#c0
> 

-- 
Eduardo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary
  2019-08-17  6:17       ` Markus Armbruster
@ 2019-08-26 14:51         ` Igor Mammedov
  2019-08-27 16:15           ` Markus Armbruster
  0 siblings, 1 reply; 27+ messages in thread
From: Igor Mammedov @ 2019-08-26 14:51 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Peter Krempa, Eduardo Habkost, Like Xu, Michael S. Tsirkin,
	qemu-devel, Paolo Bonzini, Richard Henderson

On Sat, 17 Aug 2019 08:17:48 +0200
Markus Armbruster <armbru@redhat.com> wrote:

> Eduardo Habkost <ehabkost@redhat.com> writes:
> 
> > On Fri, Aug 16, 2019 at 03:20:11PM +0200, Igor Mammedov wrote:  
> >> On Thu, 15 Aug 2019 15:38:03 -0300
> >> Eduardo Habkost <ehabkost@redhat.com> wrote:
> >>   
> >> > We have this issue reported when using libvirt to hotplug CPUs:
> >> > https://bugzilla.redhat.com/show_bug.cgi?id=1741451
> >> > 
> >> > Basically, libvirt is not copying die-id from
> >> > query-hotpluggable-cpus, but die-id is now mandatory.  
> >> 
> >> this should have been gated on compat property and affect
> >> only new machine types.
> >> Maybe we should do just that instead of fixup so libvirt
> >> would finally make proper handling of query-hotpluggable-cpus.
> >> 
> >>    
> >> > We could blame libvirt and say it is not following the documented
> >> > interface, because we have this buried in the QAPI schema
> >> > documentation:  
> >> 
> >> I wouldn't say buried, if I understand it right QAPI schema
> >> should be the authoritative source of interface description.
> >> 
> >> If I recall it's not the first time, there was similar issue
> >> for exactly the same reason (libvirt not passing through
> >> all properties from query-hotpluggable-cpus).
> >> 
> >> And we had to fix it up on QEMU side (numa_cpu_pre_plug),
> >> but it seems 2 years later libvirt is still broken the same way :(
> >> 
> >> Should we really do fixups or finaly fix it on libvirt side?  
> >
> > Is it truly a bug in libvirt?  Making QEMU behave differently
> > when getting exactly the same input sounds like a bad idea, even
> > if we documented that at the QAPI documentation.
> >
> > My suggestion is to instead drop the comment below from the QAPI
> > documentation.  New properties shouldn't become mandatory.  
> 
> The "comment below" is this one, in qapi/machine.json:
> 
> >> > > Note: currently there are 5 properties that could be present
> >> > > but management should be prepared to pass through other
> >> > > properties with device_add command to allow for future
> >> > > interface extension. This also requires the filed names to be kept in
> >> > > sync with the properties passed to -device/device_add.    
> 
> Goes back to commit d4633541ee0, v2.7.0.  @die-id was the first such
> interface extension.
> 
> A rule like "to use command C, you must pass it whatever you get from
> command Q" punches a hole into the "QMP is a stable interface" promise.
> Retroactively tacking it onto an existing interface like device-add
> some-existing-device is even more problematic than specifying it for a
> new interface.  Mind, this is not a categorical "can't ever do that".
> It's more like "you better show this is less bad than all the
> alternatives we can think of, and we've thought pretty hard".
> Since this particular hole failed us the first time anybody actually
> tried to wiggle through it, I think Eduardo has a point when he calls
> for filling it in by deleting the comment.

That was a consensus we were able to reach when discussing cpu hotplug
QMP interface. If I recall correctly idea was that it should work for
different targets (cpu topology properties target specific) and be
extensible without breaking old mgmt stack  or requiring its update
in lock step.

If implemented correctly mgmt would not only query from QEMU/machine
possible CPUs (with properties and valid values needed to plug it in,
which it does already) but also 'keep' them around and pass back to
device_add. In that case it would have worked as designed just fine.

But this also shows a problem that we still need versioned machine type
to keep old set of properties for old machine types anyway and we can
miss it during review as tests we have might be not enough
(tests/cpu-plug-test didn't detect it for some reason).

 
> By the way, the line preceding the comment
> 
>      # @core-id: core number within die the CPU belongs to# @thread-id: thread number within core the CPU belongs to
> 
> is actually two lines run together.  Messed up when @die-id was added
> (commit 176d2cda0de).  Minor review fail (I didn't look even though I
> was cc'ed on v2; I wasn't on v3).  Let's fix that, too.
> 
> >> > But I don't think this would be reasonable from us.  We can just
> >> > make QEMU more flexible and let CPU properties to be omitted when
> >> > there's no ambiguity.  This will allow us to keep compatibility
> >> > with existing libvirt versions.  
> >> 
> >> I don't really like making rule from exceptions so I'd suggest doing
> >> it only for  die_id if we have to do fix it up (with fat comment
> >> like in numa_cpu_pre_plug).
> >> The rest are working fine as is.  
> >
> > I will insist we make it consistent for all properties, but I
> > don't want this discussion to hold the bug fix.  So I'll do this:
> >
> > I will submit a new patch that makes only die-id optional, and CC
> > qemu-stable.
> >
> > After that, i will submit this patch again, and we can discuss
> > whether we should make all properties optional.  
> 
> Makes sense, go ahead.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary
  2019-08-26 14:51         ` Igor Mammedov
@ 2019-08-27 16:15           ` Markus Armbruster
  2019-08-28 15:27             ` Igor Mammedov
  0 siblings, 1 reply; 27+ messages in thread
From: Markus Armbruster @ 2019-08-27 16:15 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Peter Krempa, Eduardo Habkost, Like Xu, Michael S. Tsirkin,
	qemu-devel, Paolo Bonzini, Richard Henderson

Igor Mammedov <imammedo@redhat.com> writes:

> On Sat, 17 Aug 2019 08:17:48 +0200
> Markus Armbruster <armbru@redhat.com> wrote:
>
>> Eduardo Habkost <ehabkost@redhat.com> writes:
>> 
>> > On Fri, Aug 16, 2019 at 03:20:11PM +0200, Igor Mammedov wrote:  
>> >> On Thu, 15 Aug 2019 15:38:03 -0300
>> >> Eduardo Habkost <ehabkost@redhat.com> wrote:
>> >>   
>> >> > We have this issue reported when using libvirt to hotplug CPUs:
>> >> > https://bugzilla.redhat.com/show_bug.cgi?id=1741451
>> >> > 
>> >> > Basically, libvirt is not copying die-id from
>> >> > query-hotpluggable-cpus, but die-id is now mandatory.  
>> >> 
>> >> this should have been gated on compat property and affect
>> >> only new machine types.
>> >> Maybe we should do just that instead of fixup so libvirt
>> >> would finally make proper handling of query-hotpluggable-cpus.
>> >> 
>> >>    
>> >> > We could blame libvirt and say it is not following the documented
>> >> > interface, because we have this buried in the QAPI schema
>> >> > documentation:  
>> >> 
>> >> I wouldn't say buried, if I understand it right QAPI schema
>> >> should be the authoritative source of interface description.
>> >> 
>> >> If I recall it's not the first time, there was similar issue
>> >> for exactly the same reason (libvirt not passing through
>> >> all properties from query-hotpluggable-cpus).
>> >> 
>> >> And we had to fix it up on QEMU side (numa_cpu_pre_plug),
>> >> but it seems 2 years later libvirt is still broken the same way :(
>> >> 
>> >> Should we really do fixups or finaly fix it on libvirt side?  
>> >
>> > Is it truly a bug in libvirt?  Making QEMU behave differently
>> > when getting exactly the same input sounds like a bad idea, even
>> > if we documented that at the QAPI documentation.
>> >
>> > My suggestion is to instead drop the comment below from the QAPI
>> > documentation.  New properties shouldn't become mandatory.  
>> 
>> The "comment below" is this one, in qapi/machine.json:
>> 
>> >> > > Note: currently there are 5 properties that could be present
>> >> > > but management should be prepared to pass through other
>> >> > > properties with device_add command to allow for future
>> >> > > interface extension. This also requires the filed names to be kept in
>> >> > > sync with the properties passed to -device/device_add.    
>> 
>> Goes back to commit d4633541ee0, v2.7.0.  @die-id was the first such
>> interface extension.
>> 
>> A rule like "to use command C, you must pass it whatever you get from
>> command Q" punches a hole into the "QMP is a stable interface" promise.
>> Retroactively tacking it onto an existing interface like device-add
>> some-existing-device is even more problematic than specifying it for a
>> new interface.  Mind, this is not a categorical "can't ever do that".
>> It's more like "you better show this is less bad than all the
>> alternatives we can think of, and we've thought pretty hard".
>> Since this particular hole failed us the first time anybody actually
>> tried to wiggle through it, I think Eduardo has a point when he calls
>> for filling it in by deleting the comment.
>
> That was a consensus we were able to reach when discussing cpu hotplug
> QMP interface. If I recall correctly idea was that it should work for
> different targets (cpu topology properties target specific) and be
> extensible without breaking old mgmt stack  or requiring its update
> in lock step.
>
> If implemented correctly mgmt would not only query from QEMU/machine
> possible CPUs (with properties and valid values needed to plug it in,
> which it does already) but also 'keep' them around and pass back to
> device_add. In that case it would have worked as designed just fine.
>
> But this also shows a problem that we still need versioned machine type
> to keep old set of properties for old machine types anyway and we can
> miss it during review as tests we have might be not enough
> (tests/cpu-plug-test didn't detect it for some reason).

I think the lesson to learn here is "non-trivial rules on correct
interface use need to be backed by integration tests".

The rule in question is "a CPU hot-plug with device_add must specify all
the properties returned by query-hotpluggable-cpus".

Sadly, stipulating such rules does not change the de facto API.  Case in
point: libvirt did not obey this one, and even though it's been in place
for years, yet we're (rightly!) unwilling to blame libvirt for the
regression.  The stipulation was futile.

How could we increase our chances that management applications pick up
such rules?  I can see only one promising way: make tests fail unless
they do.  Add some arbitray dummy property, fail the hot plug unless
it's given.  Of course, we can't do that, because it's exactly the
breakage we're trying to avoid.  So do it only when QEMU is run with
--future, then have integration tests run it that way.

Aside: I'm afraid "# TODO: Better documentation; currently there is
none" didn't exactly help with query-hotpluggable-cpus uptake.

[...]


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary
  2019-08-27 16:15           ` Markus Armbruster
@ 2019-08-28 15:27             ` Igor Mammedov
  0 siblings, 0 replies; 27+ messages in thread
From: Igor Mammedov @ 2019-08-28 15:27 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Peter Krempa, Eduardo Habkost, Like Xu, Michael S. Tsirkin,
	qemu-devel, Paolo Bonzini, Richard Henderson

On Tue, 27 Aug 2019 18:15:53 +0200
Markus Armbruster <armbru@redhat.com> wrote:

> Igor Mammedov <imammedo@redhat.com> writes:
> 
> > On Sat, 17 Aug 2019 08:17:48 +0200
> > Markus Armbruster <armbru@redhat.com> wrote:
> >  
> >> Eduardo Habkost <ehabkost@redhat.com> writes:
> >>   
> >> > On Fri, Aug 16, 2019 at 03:20:11PM +0200, Igor Mammedov wrote:    
> >> >> On Thu, 15 Aug 2019 15:38:03 -0300
> >> >> Eduardo Habkost <ehabkost@redhat.com> wrote:
> >> >>     
> >> >> > We have this issue reported when using libvirt to hotplug CPUs:
> >> >> > https://bugzilla.redhat.com/show_bug.cgi?id=1741451
> >> >> > 
> >> >> > Basically, libvirt is not copying die-id from
> >> >> > query-hotpluggable-cpus, but die-id is now mandatory.    
> >> >> 
> >> >> this should have been gated on compat property and affect
> >> >> only new machine types.
> >> >> Maybe we should do just that instead of fixup so libvirt
> >> >> would finally make proper handling of query-hotpluggable-cpus.
> >> >> 
> >> >>      
> >> >> > We could blame libvirt and say it is not following the documented
> >> >> > interface, because we have this buried in the QAPI schema
> >> >> > documentation:    
> >> >> 
> >> >> I wouldn't say buried, if I understand it right QAPI schema
> >> >> should be the authoritative source of interface description.
> >> >> 
> >> >> If I recall it's not the first time, there was similar issue
> >> >> for exactly the same reason (libvirt not passing through
> >> >> all properties from query-hotpluggable-cpus).
> >> >> 
> >> >> And we had to fix it up on QEMU side (numa_cpu_pre_plug),
> >> >> but it seems 2 years later libvirt is still broken the same way :(
> >> >> 
> >> >> Should we really do fixups or finaly fix it on libvirt side?    
> >> >
> >> > Is it truly a bug in libvirt?  Making QEMU behave differently
> >> > when getting exactly the same input sounds like a bad idea, even
> >> > if we documented that at the QAPI documentation.
> >> >
> >> > My suggestion is to instead drop the comment below from the QAPI
> >> > documentation.  New properties shouldn't become mandatory.    
> >> 
> >> The "comment below" is this one, in qapi/machine.json:
> >>   
> >> >> > > Note: currently there are 5 properties that could be present
> >> >> > > but management should be prepared to pass through other
> >> >> > > properties with device_add command to allow for future
> >> >> > > interface extension. This also requires the filed names to be kept in
> >> >> > > sync with the properties passed to -device/device_add.      
> >> 
> >> Goes back to commit d4633541ee0, v2.7.0.  @die-id was the first such
> >> interface extension.
> >> 
> >> A rule like "to use command C, you must pass it whatever you get from
> >> command Q" punches a hole into the "QMP is a stable interface" promise.
> >> Retroactively tacking it onto an existing interface like device-add
> >> some-existing-device is even more problematic than specifying it for a
> >> new interface.  Mind, this is not a categorical "can't ever do that".
> >> It's more like "you better show this is less bad than all the
> >> alternatives we can think of, and we've thought pretty hard".
> >> Since this particular hole failed us the first time anybody actually
> >> tried to wiggle through it, I think Eduardo has a point when he calls
> >> for filling it in by deleting the comment.  
> >
> > That was a consensus we were able to reach when discussing cpu hotplug
> > QMP interface. If I recall correctly idea was that it should work for
> > different targets (cpu topology properties target specific) and be
> > extensible without breaking old mgmt stack  or requiring its update
> > in lock step.
> >
> > If implemented correctly mgmt would not only query from QEMU/machine
> > possible CPUs (with properties and valid values needed to plug it in,
> > which it does already) but also 'keep' them around and pass back to
> > device_add. In that case it would have worked as designed just fine.
> >
> > But this also shows a problem that we still need versioned machine type
> > to keep old set of properties for old machine types anyway and we can
> > miss it during review as tests we have might be not enough
> > (tests/cpu-plug-test didn't detect it for some reason).  
> 
> I think the lesson to learn here is "non-trivial rules on correct
> interface use need to be backed by integration tests".
> 
> The rule in question is "a CPU hot-plug with device_add must specify all
> the properties returned by query-hotpluggable-cpus".
> 
> Sadly, stipulating such rules does not change the de facto API.  Case in
> point: libvirt did not obey this one, and even though it's been in place
> for years, yet we're (rightly!) unwilling to blame libvirt for the
> regression.  The stipulation was futile.
it will be futile while we continue fixing up QEMU.
practice shows there will be nothing to motivate fixing
client side for years (really, why even bother???).

> How could we increase our chances that management applications pick up
> such rules?  I can see only one promising way: make tests fail unless
> they do.  Add some arbitray dummy property, fail the hot plug unless
> it's given.  Of course, we can't do that, because it's exactly the
> breakage we're trying to avoid.  So do it only when QEMU is run with
> --future, then have integration tests run it that way.
Unfortunately it didn't work this time,
it was too late when integration tests caught die-id issue.
Adding random property would ensure that client won't be able
to implement only 'dummy' instead of copying all properties,
but too late detected bug issue would remain the same.

PS:
it's also me to blame not paying much attention to series and
breaking tests/cpu-plug-test, which allowed this issue to sip
through without any notice. We had test_plug_with_device_add_x86()
which would caught issue if it was run and I did break it with
commit bc1fb850a31. /me looking into fixing it/

 
> Aside: I'm afraid "# TODO: Better documentation; currently there is
> none" didn't exactly help with query-hotpluggable-cpus uptake.
> [...]



^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2019-08-28 15:28 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-15 18:38 [Qemu-devel] [PATCH 0/3] pc: Fix die-id validation and compatibility with libvirt Eduardo Habkost
2019-08-15 18:38 ` [Qemu-devel] [PATCH 1/3] pc: Fix error message on die-id validation Eduardo Habkost
2019-08-15 20:04   ` Vanderson Martins do Rosario
2019-08-16  1:04   ` Like Xu
2019-08-16 13:49     ` Eduardo Habkost
2019-08-19  0:53       ` Like Xu
2019-08-16  6:06   ` Markus Armbruster
2019-08-16 14:04   ` Igor Mammedov
2019-08-15 18:38 ` [Qemu-devel] [PATCH 2/3] pc: Improve error message when die-id is omitted Eduardo Habkost
2019-08-15 20:11   ` Vanderson Martins do Rosario
2019-08-16 14:06   ` Igor Mammedov
2019-08-15 18:38 ` [Qemu-devel] [PATCH 3/3] pc: Don't make CPU properties mandatory unless necessary Eduardo Habkost
2019-08-16  6:10   ` Markus Armbruster
2019-08-16  7:49     ` Erik Skultety
2019-08-16 12:22       ` Markus Armbruster
2019-08-16 17:42         ` Eduardo Habkost
2019-08-16 21:07           ` Yash Mankad
2019-08-20 21:06             ` Eduardo Habkost
2019-08-17  5:34           ` Markus Armbruster
2019-08-16 16:49     ` Eduardo Habkost
2019-08-16 13:20   ` Igor Mammedov
2019-08-16 16:56     ` Eduardo Habkost
2019-08-17  6:17       ` Markus Armbruster
2019-08-26 14:51         ` Igor Mammedov
2019-08-27 16:15           ` Markus Armbruster
2019-08-28 15:27             ` Igor Mammedov
2019-08-19 19:19 ` [Qemu-devel] [PATCH 0/3] pc: Fix die-id validation and compatibility with libvirt Michael S. Tsirkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).