All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH RFC 0/8] virtio: migrate new properties
@ 2014-05-14 15:41 Greg Kurz
  2014-05-14 15:41 ` [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream Greg Kurz
                   ` (7 more replies)
  0 siblings, 8 replies; 68+ messages in thread
From: Greg Kurz @ 2014-05-14 15:41 UTC (permalink / raw)
  To: Kevin Wolf, Anthony Liguori, Michael S. Tsirkin, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini
  Cc: Juan Quintela, Fam Zheng, Alexander Graf, Andreas Färber,
	qemu-devel

The work on virtio endianness unveiled the need to add new properties
to VirtIODevice and be able to migrate without breaking compatibility.
There was a similar concern with the introduction of a VirtIODevice.broken
property.

This patch set is a tentative to address the issue, following a suggestion
from Alex:
 - patches 1 to 7 adds optional subsections in the virtio migration stream
 - patch 8 belongs to the virtio endianness serie, it is sent as an example

Thanks for your comments.

---

Greg Kurz (8):
      virtio: add subsections to the migration stream
      virtio-net: migrate subsections
      virtio-blk: migrate subsections
      virtio-scsi: migrate subsections
      virtio-serial: migrate subsections
      virtio-balloon: migrate subsections
      virtio-rng: migrate subsections
      virtio: add endian-ambivalent support to VirtIODevice


 exec.c                      |    8 ---
 hw/block/virtio-blk.c       |    4 +
 hw/char/virtio-serial-bus.c |   10 ++-
 hw/net/virtio-net.c         |    7 ++
 hw/scsi/virtio-scsi.c       |    3 +
 hw/virtio/virtio-balloon.c  |    4 +
 hw/virtio/virtio-pci.c      |   11 +---
 hw/virtio/virtio-rng.c      |   13 ++++
 hw/virtio/virtio.c          |  129 +++++++++++++++++++++++++++++++++++++++----
 include/exec/cpu-common.h   |    1 
 include/hw/virtio/virtio.h  |   17 ++++++
 11 files changed, 175 insertions(+), 32 deletions(-)

-- 
Greg

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-14 15:41 [Qemu-devel] [PATCH RFC 0/8] virtio: migrate new properties Greg Kurz
@ 2014-05-14 15:41 ` Greg Kurz
  2014-05-15  6:04   ` Amit Shah
  2014-05-14 15:41 ` [Qemu-devel] [PATCH RFC 2/8] virtio-net: migrate subsections Greg Kurz
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 68+ messages in thread
From: Greg Kurz @ 2014-05-14 15:41 UTC (permalink / raw)
  To: Kevin Wolf, Anthony Liguori, Michael S. Tsirkin, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini
  Cc: Juan Quintela, Fam Zheng, Alexander Graf, Andreas Färber,
	qemu-devel

There is a need to add some more fields to VirtIODevice that should be
migrated (broken status, endianness). The problem is that we do not
want to break compatibility while adding a new feature... This issue has
been addressed in the generic VMState code with the use of optional
subsections. As a *temporary* alternative to port the whole virtio
migration code to VMState, this patch mimics a similar subsectionning
ability for virtio.

Since each virtio device is streamed in its own section, the idea is to
stream subsections between the end of the device section and the start
of the next sections. This allows an older QEMU to complain and exit
when fed with subsections:

Unknown savevm section type 5
Error -22 while loading VM state

All users of virtio_load()/virtio_save() need to be patched because the
subsections are streamed AFTER the device itself.

Suggested-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 hw/virtio/virtio.c         |   65 ++++++++++++++++++++++++++++++++++++++++++++
 include/hw/virtio/virtio.h |    4 +++
 2 files changed, 69 insertions(+)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index aeabf3a..f4eaa3f 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -19,6 +19,7 @@
 #include "hw/virtio/virtio.h"
 #include "qemu/atomic.h"
 #include "hw/virtio/virtio-bus.h"
+#include "migration/migration.h"
 
 /*
  * The alignment to use between consumer and producer parts of vring.
@@ -833,6 +834,70 @@ void virtio_notify_config(VirtIODevice *vdev)
     virtio_notify_vector(vdev, vdev->config_vector);
 }
 
+static const struct VirtIOSubsectionDescStruct {
+    const char *name;
+    int version_id;
+    void (*save)(VirtIODevice *vdev, QEMUFile *f);
+    int (*load)(VirtIODevice *vdev, QEMUFile *f);
+} virtio_subsection[] = {
+    { .name = NULL }
+};
+
+void virtio_save_subsections(VirtIODevice *vdev, QEMUFile *f)
+{
+    int i;
+
+    for (i = 0; virtio_subsection[i].name; i++) {
+        const char *name = virtio_subsection[i].name;
+        uint8_t len = strlen(name);
+
+        qemu_put_byte(f, QEMU_VM_SUBSECTION);
+        qemu_put_byte(f, len);
+        qemu_put_buffer(f, (uint8_t *) name, len);
+        qemu_put_be32(f, virtio_subsection[i].version_id);
+        (*virtio_subsection[i].save)(vdev, f);
+    }
+}
+
+int virtio_load_subsections(VirtIODevice *vdev, QEMUFile *f)
+{
+    while (qemu_peek_byte(f, 0) == QEMU_VM_SUBSECTION) {
+        char idstr[256];
+        uint8_t len, size;
+        int i;
+
+        len = qemu_peek_byte(f, 1);
+        size = qemu_peek_buffer(f, (uint8_t *) idstr, len, 2);
+        if (size != len) {
+            break;
+        }
+
+        idstr[size] = 0;
+
+        for (i = 0; virtio_subsection[i].name; i++) {
+            if (strcmp(virtio_subsection[i].name, idstr) == 0) {
+                uint32_t version_id;
+                int ret;
+
+                qemu_file_skip(f, 1); /* subsection */
+                qemu_file_skip(f, 1); /* len */
+                qemu_file_skip(f, len); /* idstr */
+
+                version_id = qemu_get_be32(f);
+                if (version_id > virtio_subsection[i].version_id) {
+                    return -EINVAL;
+                }
+                ret = (*virtio_subsection[i].load)(vdev, f);
+                if (ret) {
+                    return ret;
+                }
+            }
+        }
+    }
+
+    return 0;
+}
+
 void virtio_save(VirtIODevice *vdev, QEMUFile *f)
 {
     BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 3e54e90..82f852f 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -186,6 +186,10 @@ void virtio_save(VirtIODevice *vdev, QEMUFile *f);
 
 int virtio_load(VirtIODevice *vdev, QEMUFile *f);
 
+void virtio_save_subsections(VirtIODevice *vdev, QEMUFile *f);
+
+int virtio_load_subsections(VirtIODevice *vdev, QEMUFile *f);
+
 void virtio_notify_config(VirtIODevice *vdev);
 
 void virtio_queue_set_notification(VirtQueue *vq, int enable);

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [Qemu-devel] [PATCH RFC 2/8] virtio-net: migrate subsections
  2014-05-14 15:41 [Qemu-devel] [PATCH RFC 0/8] virtio: migrate new properties Greg Kurz
  2014-05-14 15:41 ` [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream Greg Kurz
@ 2014-05-14 15:41 ` Greg Kurz
  2014-05-14 15:41 ` [Qemu-devel] [PATCH RFC 3/8] virtio-blk: " Greg Kurz
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 68+ messages in thread
From: Greg Kurz @ 2014-05-14 15:41 UTC (permalink / raw)
  To: Kevin Wolf, Anthony Liguori, Michael S. Tsirkin, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini
  Cc: Juan Quintela, Fam Zheng, Alexander Graf, Andreas Färber,
	qemu-devel

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 hw/net/virtio-net.c |    7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index fd23c46..4004d02 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1300,6 +1300,8 @@ static void virtio_net_save(QEMUFile *f, void *opaque)
     if ((1 << VIRTIO_NET_F_CTRL_GUEST_OFFLOADS) & vdev->guest_features) {
         qemu_put_be64(f, n->curr_guest_offloads);
     }
+
+    virtio_save_subsections(vdev, f);
 }
 
 static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
@@ -1396,6 +1398,11 @@ static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
         n->curr_guest_offloads = virtio_net_supported_guest_offloads(n);
     }
 
+    ret = virtio_load_subsections(vdev, f);
+    if (ret) {
+        return ret;
+    }
+
     if (peer_has_vnet_hdr(n)) {
         virtio_net_apply_guest_offloads(n);
     }

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [Qemu-devel] [PATCH RFC 3/8] virtio-blk: migrate subsections
  2014-05-14 15:41 [Qemu-devel] [PATCH RFC 0/8] virtio: migrate new properties Greg Kurz
  2014-05-14 15:41 ` [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream Greg Kurz
  2014-05-14 15:41 ` [Qemu-devel] [PATCH RFC 2/8] virtio-net: migrate subsections Greg Kurz
@ 2014-05-14 15:41 ` Greg Kurz
  2014-05-14 15:42 ` [Qemu-devel] [PATCH RFC 4/8] virtio-scsi: " Greg Kurz
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 68+ messages in thread
From: Greg Kurz @ 2014-05-14 15:41 UTC (permalink / raw)
  To: Kevin Wolf, Anthony Liguori, Michael S. Tsirkin, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini
  Cc: Juan Quintela, Fam Zheng, Alexander Graf, Andreas Färber,
	qemu-devel

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 hw/block/virtio-blk.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 8a568e5..5a0d224 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -600,6 +600,8 @@ static void virtio_blk_save(QEMUFile *f, void *opaque)
         req = req->next;
     }
     qemu_put_sbyte(f, 0);
+
+    virtio_save_subsections(vdev, f);
 }
 
 static int virtio_blk_load(QEMUFile *f, void *opaque, int version_id)
@@ -628,7 +630,7 @@ static int virtio_blk_load(QEMUFile *f, void *opaque, int version_id)
             req->elem.out_num, 0);
     }
 
-    return 0;
+    return virtio_load_subsections(vdev, f);
 }
 
 static void virtio_blk_resize(void *opaque)

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [Qemu-devel] [PATCH RFC 4/8] virtio-scsi: migrate subsections
  2014-05-14 15:41 [Qemu-devel] [PATCH RFC 0/8] virtio: migrate new properties Greg Kurz
                   ` (2 preceding siblings ...)
  2014-05-14 15:41 ` [Qemu-devel] [PATCH RFC 3/8] virtio-blk: " Greg Kurz
@ 2014-05-14 15:42 ` Greg Kurz
  2014-05-14 15:42 ` [Qemu-devel] [PATCH RFC 5/8] virtio-serial: " Greg Kurz
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 68+ messages in thread
From: Greg Kurz @ 2014-05-14 15:42 UTC (permalink / raw)
  To: Kevin Wolf, Anthony Liguori, Michael S. Tsirkin, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini
  Cc: Juan Quintela, Fam Zheng, Alexander Graf, Andreas Färber,
	qemu-devel

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 hw/scsi/virtio-scsi.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index b0d7517..989e4ed 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -471,6 +471,7 @@ static void virtio_scsi_save(QEMUFile *f, void *opaque)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
     virtio_save(vdev, f);
+    virtio_save_subsections(vdev, f);
 }
 
 static int virtio_scsi_load(QEMUFile *f, void *opaque, int version_id)
@@ -482,7 +483,7 @@ static int virtio_scsi_load(QEMUFile *f, void *opaque, int version_id)
     if (ret) {
         return ret;
     }
-    return 0;
+    return virtio_load_subsections(vdev, f);
 }
 
 static void virtio_scsi_push_event(VirtIOSCSI *s, SCSIDevice *dev,

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [Qemu-devel] [PATCH RFC 5/8] virtio-serial: migrate subsections
  2014-05-14 15:41 [Qemu-devel] [PATCH RFC 0/8] virtio: migrate new properties Greg Kurz
                   ` (3 preceding siblings ...)
  2014-05-14 15:42 ` [Qemu-devel] [PATCH RFC 4/8] virtio-scsi: " Greg Kurz
@ 2014-05-14 15:42 ` Greg Kurz
  2014-05-14 15:42 ` [Qemu-devel] [PATCH RFC 6/8] virtio-balloon: " Greg Kurz
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 68+ messages in thread
From: Greg Kurz @ 2014-05-14 15:42 UTC (permalink / raw)
  To: Kevin Wolf, Anthony Liguori, Michael S. Tsirkin, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini
  Cc: Juan Quintela, Fam Zheng, Alexander Graf, Andreas Färber,
	qemu-devel

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 hw/char/virtio-serial-bus.c |   10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index 2b647b6..98f32fd 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -522,12 +522,13 @@ static void vser_reset(VirtIODevice *vdev)
 static void virtio_serial_save(QEMUFile *f, void *opaque)
 {
     VirtIOSerial *s = VIRTIO_SERIAL(opaque);
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
     VirtIOSerialPort *port;
     uint32_t nr_active_ports;
     unsigned int i, max_nr_ports;
 
     /* The virtio device */
-    virtio_save(VIRTIO_DEVICE(s), f);
+    virtio_save(vdev, f);
 
     /* The config space */
     qemu_put_be16s(f, &s->config.cols);
@@ -573,6 +574,8 @@ static void virtio_serial_save(QEMUFile *f, void *opaque)
                             sizeof(port->elem));
         }
     }
+
+    virtio_save_subsections(vdev, f);
 }
 
 static void virtio_serial_post_load_timer_cb(void *opaque)
@@ -667,6 +670,7 @@ static int fetch_active_ports_list(QEMUFile *f, int version_id,
 static int virtio_serial_load(QEMUFile *f, void *opaque, int version_id)
 {
     VirtIOSerial *s = VIRTIO_SERIAL(opaque);
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
     uint32_t max_nr_ports, nr_active_ports, ports_map;
     unsigned int i;
     int ret;
@@ -676,7 +680,7 @@ static int virtio_serial_load(QEMUFile *f, void *opaque, int version_id)
     }
 
     /* The virtio device */
-    ret = virtio_load(VIRTIO_DEVICE(s), f);
+    ret = virtio_load(vdev, f);
     if (ret) {
         return ret;
     }
@@ -716,7 +720,7 @@ static int virtio_serial_load(QEMUFile *f, void *opaque, int version_id)
             return ret;
         }
     }
-    return 0;
+    return virtio_load_subsections(vdev, f);
 }
 
 static void virtser_bus_dev_print(Monitor *mon, DeviceState *qdev, int indent);

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [Qemu-devel] [PATCH RFC 6/8] virtio-balloon: migrate subsections
  2014-05-14 15:41 [Qemu-devel] [PATCH RFC 0/8] virtio: migrate new properties Greg Kurz
                   ` (4 preceding siblings ...)
  2014-05-14 15:42 ` [Qemu-devel] [PATCH RFC 5/8] virtio-serial: " Greg Kurz
@ 2014-05-14 15:42 ` Greg Kurz
  2014-05-14 15:42 ` [Qemu-devel] [PATCH RFC 7/8] virtio-rng: " Greg Kurz
  2014-05-14 15:42 ` [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice Greg Kurz
  7 siblings, 0 replies; 68+ messages in thread
From: Greg Kurz @ 2014-05-14 15:42 UTC (permalink / raw)
  To: Kevin Wolf, Anthony Liguori, Michael S. Tsirkin, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini
  Cc: Juan Quintela, Fam Zheng, Alexander Graf, Andreas Färber,
	qemu-devel

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 hw/virtio/virtio-balloon.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index a470a0b..41f7ae1 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -316,6 +316,8 @@ static void virtio_balloon_save(QEMUFile *f, void *opaque)
 
     qemu_put_be32(f, s->num_pages);
     qemu_put_be32(f, s->actual);
+
+    virtio_save_subsections(vdev, f);
 }
 
 static int virtio_balloon_load(QEMUFile *f, void *opaque, int version_id)
@@ -334,7 +336,7 @@ static int virtio_balloon_load(QEMUFile *f, void *opaque, int version_id)
 
     s->num_pages = qemu_get_be32(f);
     s->actual = qemu_get_be32(f);
-    return 0;
+    return virtio_load_subsections(vdev, f);
 }
 
 static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [Qemu-devel] [PATCH RFC 7/8] virtio-rng: migrate subsections
  2014-05-14 15:41 [Qemu-devel] [PATCH RFC 0/8] virtio: migrate new properties Greg Kurz
                   ` (5 preceding siblings ...)
  2014-05-14 15:42 ` [Qemu-devel] [PATCH RFC 6/8] virtio-balloon: " Greg Kurz
@ 2014-05-14 15:42 ` Greg Kurz
  2014-05-14 15:42 ` [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice Greg Kurz
  7 siblings, 0 replies; 68+ messages in thread
From: Greg Kurz @ 2014-05-14 15:42 UTC (permalink / raw)
  To: Kevin Wolf, Anthony Liguori, Michael S. Tsirkin, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini
  Cc: Juan Quintela, Fam Zheng, Alexander Graf, Andreas Färber,
	qemu-devel

While we are here, we also check virtio_load() return value.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 hw/virtio/virtio-rng.c |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c
index b6ab361..0371103 100644
--- a/hw/virtio/virtio-rng.c
+++ b/hw/virtio/virtio-rng.c
@@ -103,17 +103,28 @@ static void virtio_rng_save(QEMUFile *f, void *opaque)
     VirtIODevice *vdev = opaque;
 
     virtio_save(vdev, f);
+    virtio_save_subsections(vdev, f);
 }
 
 static int virtio_rng_load(QEMUFile *f, void *opaque, int version_id)
 {
     VirtIORNG *vrng = opaque;
     VirtIODevice *vdev = VIRTIO_DEVICE(vrng);
+    int ret;
 
     if (version_id != 1) {
         return -EINVAL;
     }
-    virtio_load(vdev, f);
+
+    ret = virtio_load(vdev, f);
+    if (ret) {
+        return ret;
+    }
+
+    ret = virtio_load_subsections(vdev, f);
+    if (ret) {
+        return ret;
+    }
 
     /* We may have an element ready but couldn't process it due to a quota
      * limit.  Make sure to try again after live migration when the quota may

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-05-14 15:41 [Qemu-devel] [PATCH RFC 0/8] virtio: migrate new properties Greg Kurz
                   ` (6 preceding siblings ...)
  2014-05-14 15:42 ` [Qemu-devel] [PATCH RFC 7/8] virtio-rng: " Greg Kurz
@ 2014-05-14 15:42 ` Greg Kurz
       [not found]   ` <5384A8D2.8050104@redhat.com>
  7 siblings, 1 reply; 68+ messages in thread
From: Greg Kurz @ 2014-05-14 15:42 UTC (permalink / raw)
  To: Kevin Wolf, Anthony Liguori, Michael S. Tsirkin, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini
  Cc: Juan Quintela, Fam Zheng, Alexander Graf, Andreas Färber,
	qemu-devel

Some CPU families can dynamically change their endianness. This means we
can have little endian ppc or big endian arm guests for example. This has
an impact on legacy virtio data structures since they are target endian.
We hence introduce a new property to track the endianness of each virtio
device. It is reasonnably assumed that endianness won't change while the
device is in use : we hence capture the device endianness when it gets
reset.

Of course this property must be part of the migration stream as most of
the virtio code will depend on it. It is hence migrated in a subsection
to preserve compatibility with older migration streams. The tricky part
is that the endianness gets known quite late and we must ensure no access
is made to virtio data before that. It is for example invalid to call
vring_avail_idx() as does the actual migration code: the VQ indexes sanity
checks had to be moved from virtio_load() to virtio_load_subsections()
because of that.

In fact, we have two conditions where the endianness property is stale:
- from initialization time (virtio_init) to first reset time (virtio_reset)
- from incoming migration time (virtio_load) to the time the property
  is restored (virtio_load_device_endian)
We enforce some paranoia by making the endianness a 3-value enum so
that we can poison it when it should not be used.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 exec.c                     |    8 +-----
 hw/virtio/virtio-pci.c     |   11 +++-----
 hw/virtio/virtio.c         |   64 ++++++++++++++++++++++++++++++++++++--------
 include/exec/cpu-common.h  |    1 +
 include/hw/virtio/virtio.h |   13 +++++++++
 5 files changed, 72 insertions(+), 25 deletions(-)

diff --git a/exec.c b/exec.c
index 91513c6..4e83588 100644
--- a/exec.c
+++ b/exec.c
@@ -2740,13 +2740,7 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,
 #endif
 
 #if !defined(CONFIG_USER_ONLY)
-
-/*
- * A helper function for the _utterly broken_ virtio device model to find out if
- * it's running on a big endian machine. Don't do this at home kids!
- */
-bool virtio_is_big_endian(void);
-bool virtio_is_big_endian(void)
+bool target_words_bigendian(void)
 {
 #if defined(TARGET_WORDS_BIGENDIAN)
     return true;
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index ce97514..2ffceb8 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -89,9 +89,6 @@
 /* Flags track per-device state like workarounds for quirks in older guests. */
 #define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
 
-/* HACK for virtio to determine if it's running a big endian guest */
-bool virtio_is_big_endian(void);
-
 static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
                                VirtIOPCIProxy *dev);
 
@@ -409,13 +406,13 @@ static uint64_t virtio_pci_config_read(void *opaque, hwaddr addr,
         break;
     case 2:
         val = virtio_config_readw(vdev, addr);
-        if (virtio_is_big_endian()) {
+        if (virtio_is_big_endian(vdev)) {
             val = bswap16(val);
         }
         break;
     case 4:
         val = virtio_config_readl(vdev, addr);
-        if (virtio_is_big_endian()) {
+        if (virtio_is_big_endian(vdev)) {
             val = bswap32(val);
         }
         break;
@@ -443,13 +440,13 @@ static void virtio_pci_config_write(void *opaque, hwaddr addr,
         virtio_config_writeb(vdev, addr, val);
         break;
     case 2:
-        if (virtio_is_big_endian()) {
+        if (virtio_is_big_endian(vdev)) {
             val = bswap16(val);
         }
         virtio_config_writew(vdev, addr, val);
         break;
     case 4:
-        if (virtio_is_big_endian()) {
+        if (virtio_is_big_endian(vdev)) {
             val = bswap32(val);
         }
         virtio_config_writel(vdev, addr, val);
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index f4eaa3f..9018b6c 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -547,6 +547,12 @@ void virtio_reset(void *opaque)
 
     virtio_set_status(vdev, 0);
 
+    if (target_words_bigendian()) {
+        vdev->device_endian = VIRTIO_DEVICE_ENDIAN_BIG;
+    } else {
+        vdev->device_endian = VIRTIO_DEVICE_ENDIAN_LITTLE;
+    }
+
     if (k->reset) {
         k->reset(vdev);
     }
@@ -834,12 +840,28 @@ void virtio_notify_config(VirtIODevice *vdev)
     virtio_notify_vector(vdev, vdev->config_vector);
 }
 
+static void virtio_save_device_endian(VirtIODevice *vdev, QEMUFile *f)
+{
+    qemu_put_byte(f, vdev->device_endian);
+}
+
+static int virtio_load_device_endian(VirtIODevice *vdev, QEMUFile *f)
+{
+    vdev->device_endian = qemu_get_byte(f);
+    return 0;
+}
+
 static const struct VirtIOSubsectionDescStruct {
     const char *name;
     int version_id;
     void (*save)(VirtIODevice *vdev, QEMUFile *f);
     int (*load)(VirtIODevice *vdev, QEMUFile *f);
 } virtio_subsection[] = {
+    { .name = "virtio/is_big_endian",
+      .version_id = 1,
+      .save = virtio_save_device_endian,
+      .load = virtio_load_device_endian,
+    },
     { .name = NULL }
 };
 
@@ -861,6 +883,8 @@ void virtio_save_subsections(VirtIODevice *vdev, QEMUFile *f)
 
 int virtio_load_subsections(VirtIODevice *vdev, QEMUFile *f)
 {
+    int i;
+
     while (qemu_peek_byte(f, 0) == QEMU_VM_SUBSECTION) {
         char idstr[256];
         uint8_t len, size;
@@ -895,6 +919,26 @@ int virtio_load_subsections(VirtIODevice *vdev, QEMUFile *f)
         }
     }
 
+    for (i = 0; i < VIRTIO_PCI_QUEUE_MAX; i++) {
+        if (vdev->vq[i].vring.num == 0) {
+            break;
+        }
+
+        if (vdev->vq[i].pa) {
+            uint16_t nheads;
+            nheads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx;
+            /* Check it isn't doing strange things with descriptor numbers. */
+            if (nheads > vdev->vq[i].vring.num) {
+                error_report("VQ %d size 0x%x Guest index 0x%x "
+                             "inconsistent with Host index 0x%x: delta 0x%x",
+                             i, vdev->vq[i].vring.num,
+                             vring_avail_idx(&vdev->vq[i]),
+                             vdev->vq[i].last_avail_idx, nheads);
+                return -1;
+            }
+        }
+    }
+
     return 0;
 }
 
@@ -962,6 +1006,11 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f)
     BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
     VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
 
+    /* We poison the endianness to ensure it does not get used before
+     * subsections have been loaded.
+     */
+    vdev->device_endian = VIRTIO_DEVICE_ENDIAN_UNKNOWN;
+
     if (k->load_config) {
         ret = k->load_config(qbus->parent, f);
         if (ret)
@@ -995,18 +1044,7 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f)
         vdev->vq[i].notification = true;
 
         if (vdev->vq[i].pa) {
-            uint16_t nheads;
             virtqueue_init(&vdev->vq[i]);
-            nheads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx;
-            /* Check it isn't doing very strange things with descriptor numbers. */
-            if (nheads > vdev->vq[i].vring.num) {
-                error_report("VQ %d size 0x%x Guest index 0x%x "
-                             "inconsistent with Host index 0x%x: delta 0x%x",
-                             i, vdev->vq[i].vring.num,
-                             vring_avail_idx(&vdev->vq[i]),
-                             vdev->vq[i].last_avail_idx, nheads);
-                return -1;
-            }
         } else if (vdev->vq[i].last_avail_idx) {
             error_report("VQ %d address 0x0 "
                          "inconsistent with Host index 0x%x",
@@ -1078,6 +1116,10 @@ void virtio_init(VirtIODevice *vdev, const char *name,
     }
     vdev->vmstate = qemu_add_vm_change_state_handler(virtio_vmstate_change,
                                                      vdev);
+    /* We poison the endianness to ensure it does not get used before
+     * the device is reset.
+     */
+    vdev->device_endian = VIRTIO_DEVICE_ENDIAN_UNKNOWN;
 }
 
 hwaddr virtio_queue_get_desc_addr(VirtIODevice *vdev, int n)
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index a21b65a..eb798c1 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -122,4 +122,5 @@ void qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque);
 
 #endif
 
+bool target_words_bigendian(void);
 #endif /* !CPU_COMMON_H */
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 82f852f..0012470 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -104,6 +104,12 @@ typedef struct VirtQueueElement
 #define VIRTIO_DEVICE(obj) \
         OBJECT_CHECK(VirtIODevice, (obj), TYPE_VIRTIO_DEVICE)
 
+enum virtio_device_endian {
+    VIRTIO_DEVICE_ENDIAN_UNKNOWN,
+    VIRTIO_DEVICE_ENDIAN_LITTLE,
+    VIRTIO_DEVICE_ENDIAN_BIG,
+};
+
 struct VirtIODevice
 {
     DeviceState parent_obj;
@@ -121,6 +127,7 @@ struct VirtIODevice
     bool vm_running;
     VMChangeStateEntry *vmstate;
     char *bus_name;
+    enum virtio_device_endian device_endian;
 };
 
 typedef struct VirtioDeviceClass {
@@ -257,4 +264,10 @@ void virtio_queue_set_host_notifier_fd_handler(VirtQueue *vq, bool assign,
                                                bool set_handler);
 void virtio_queue_notify_vq(VirtQueue *vq);
 void virtio_irq(VirtQueue *vq);
+
+static inline bool virtio_is_big_endian(VirtIODevice *vdev)
+{
+    assert(vdev->device_endian != VIRTIO_DEVICE_ENDIAN_UNKNOWN);
+    return vdev->device_endian == VIRTIO_DEVICE_ENDIAN_BIG;
+}
 #endif

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-14 15:41 ` [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream Greg Kurz
@ 2014-05-15  6:04   ` Amit Shah
  2014-05-15  6:23     ` Michael S. Tsirkin
  2014-05-15  6:49     ` Greg Kurz
  0 siblings, 2 replies; 68+ messages in thread
From: Amit Shah @ 2014-05-15  6:04 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Michael S. Tsirkin,
	Juan Quintela, Alexander Graf, qemu-devel, Stefan Hajnoczi,
	Paolo Bonzini, Andreas Färber

On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> There is a need to add some more fields to VirtIODevice that should be
> migrated (broken status, endianness). The problem is that we do not
> want to break compatibility while adding a new feature... This issue has
> been addressed in the generic VMState code with the use of optional
> subsections. As a *temporary* alternative to port the whole virtio
> migration code to VMState, this patch mimics a similar subsectionning
> ability for virtio.
> 
> Since each virtio device is streamed in its own section, the idea is to
> stream subsections between the end of the device section and the start
> of the next sections. This allows an older QEMU to complain and exit
> when fed with subsections:
> 
> Unknown savevm section type 5
> Error -22 while loading VM state

Please make this configurable -- either via configure or device
properties.  That avoids having to break existing configurations that
work without this patch.

> All users of virtio_load()/virtio_save() need to be patched because the
> subsections are streamed AFTER the device itself.

Since all have the same fixup, I'm wondering if a new section can be
added to the virtio-bus itself, which gets propagated to all devices
upon load in the dest.

		Amit

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15  6:04   ` Amit Shah
@ 2014-05-15  6:23     ` Michael S. Tsirkin
  2014-05-15  6:46       ` Amit Shah
  2014-05-15  6:49     ` Greg Kurz
  1 sibling, 1 reply; 68+ messages in thread
From: Michael S. Tsirkin @ 2014-05-15  6:23 UTC (permalink / raw)
  To: Amit Shah
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	Alexander Graf, qemu-devel, Stefan Hajnoczi, Paolo Bonzini,
	Andreas Färber, Greg Kurz

On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
> On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> > There is a need to add some more fields to VirtIODevice that should be
> > migrated (broken status, endianness). The problem is that we do not
> > want to break compatibility while adding a new feature... This issue has
> > been addressed in the generic VMState code with the use of optional
> > subsections. As a *temporary* alternative to port the whole virtio
> > migration code to VMState, this patch mimics a similar subsectionning
> > ability for virtio.
> > 
> > Since each virtio device is streamed in its own section, the idea is to
> > stream subsections between the end of the device section and the start
> > of the next sections. This allows an older QEMU to complain and exit
> > when fed with subsections:
> > 
> > Unknown savevm section type 5
> > Error -22 while loading VM state
> 
> Please make this configurable -- either via configure or device
> properties.  That avoids having to break existing configurations that
> work without this patch.
> 
> > All users of virtio_load()/virtio_save() need to be patched because the
> > subsections are streamed AFTER the device itself.
> 
> Since all have the same fixup, I'm wondering if a new section can be
> added to the virtio-bus itself, which gets propagated to all devices
> upon load in the dest.
> 
> 		Amit

This calls for a way for devices to inherit properties from the bus,
which doesn't exist ATM.
Fine but let's not hold up this patchset because of this.

-- 
MST

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15  6:23     ` Michael S. Tsirkin
@ 2014-05-15  6:46       ` Amit Shah
  2014-05-15  7:04         ` Greg Kurz
  2014-05-15  7:14         ` Michael S. Tsirkin
  0 siblings, 2 replies; 68+ messages in thread
From: Amit Shah @ 2014-05-15  6:46 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	Alexander Graf, qemu-devel, Stefan Hajnoczi, Paolo Bonzini,
	Andreas Färber, Greg Kurz

On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
> On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
> > On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> > > There is a need to add some more fields to VirtIODevice that should be
> > > migrated (broken status, endianness). The problem is that we do not
> > > want to break compatibility while adding a new feature... This issue has
> > > been addressed in the generic VMState code with the use of optional
> > > subsections. As a *temporary* alternative to port the whole virtio
> > > migration code to VMState, this patch mimics a similar subsectionning
> > > ability for virtio.

BTW Greg, do you plan on working on vmstate for virtio?

> > > Since each virtio device is streamed in its own section, the idea is to
> > > stream subsections between the end of the device section and the start
> > > of the next sections. This allows an older QEMU to complain and exit
> > > when fed with subsections:
> > > 
> > > Unknown savevm section type 5
> > > Error -22 while loading VM state
> > 
> > Please make this configurable -- either via configure or device
> > properties.  That avoids having to break existing configurations that
> > work without this patch.
> > 
> > > All users of virtio_load()/virtio_save() need to be patched because the
> > > subsections are streamed AFTER the device itself.
> > 
> > Since all have the same fixup, I'm wondering if a new section can be
> > added to the virtio-bus itself, which gets propagated to all devices
> > upon load in the dest.
> 
> This calls for a way for devices to inherit properties from the bus,
> which doesn't exist ATM.
> Fine but let's not hold up this patchset because of this.

No, only suggestion is to add a migration section in the bus, and then
it's easier to do this in the post-migrate functions for each device
-- so only one new section gets introduced instead of all devices
being modified to send a new subsection.

		Amit

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15  6:04   ` Amit Shah
  2014-05-15  6:23     ` Michael S. Tsirkin
@ 2014-05-15  6:49     ` Greg Kurz
  2014-05-15  6:55       ` Amit Shah
  2014-05-15  7:12       ` Michael S. Tsirkin
  1 sibling, 2 replies; 68+ messages in thread
From: Greg Kurz @ 2014-05-15  6:49 UTC (permalink / raw)
  To: Amit Shah
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Michael S. Tsirkin,
	Juan Quintela, Alexander Graf, qemu-devel, Stefan Hajnoczi,
	Paolo Bonzini, Andreas Färber

On Thu, 15 May 2014 11:34:25 +0530
Amit Shah <amit.shah@redhat.com> wrote:

> On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> > There is a need to add some more fields to VirtIODevice that should be
> > migrated (broken status, endianness). The problem is that we do not
> > want to break compatibility while adding a new feature... This issue has
> > been addressed in the generic VMState code with the use of optional
> > subsections. As a *temporary* alternative to port the whole virtio
> > migration code to VMState, this patch mimics a similar subsectionning
> > ability for virtio.
> > 
> > Since each virtio device is streamed in its own section, the idea is to
> > stream subsections between the end of the device section and the start
> > of the next sections. This allows an older QEMU to complain and exit
> > when fed with subsections:
> > 
> > Unknown savevm section type 5
> > Error -22 while loading VM state
> 
> Please make this configurable -- either via configure or device
> properties.  That avoids having to break existing configurations that
> work without this patch.
> 

Hmmm... you mean we support migration from a newer QEMU to an older one ?

> > All users of virtio_load()/virtio_save() need to be patched because the
> > subsections are streamed AFTER the device itself.
> 
> Since all have the same fixup, I'm wondering if a new section can be
> added to the virtio-bus itself, which gets propagated to all devices
> upon load in the dest.
> 

That would be nice if possible. I will have a closer look.

> 		Amit
> 

Thanks.

-- 
Gregory Kurz                                     kurzgreg@fr.ibm.com
                                                 gkurz@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15  6:49     ` Greg Kurz
@ 2014-05-15  6:55       ` Amit Shah
  2014-05-15  7:12       ` Michael S. Tsirkin
  1 sibling, 0 replies; 68+ messages in thread
From: Amit Shah @ 2014-05-15  6:55 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Michael S. Tsirkin,
	Juan Quintela, Alexander Graf, qemu-devel, Stefan Hajnoczi,
	Paolo Bonzini, Andreas Färber

On (Thu) 15 May 2014 [08:49:48], Greg Kurz wrote:
> On Thu, 15 May 2014 11:34:25 +0530
> Amit Shah <amit.shah@redhat.com> wrote:
> 
> > On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> > > There is a need to add some more fields to VirtIODevice that should be
> > > migrated (broken status, endianness). The problem is that we do not
> > > want to break compatibility while adding a new feature... This issue has
> > > been addressed in the generic VMState code with the use of optional
> > > subsections. As a *temporary* alternative to port the whole virtio
> > > migration code to VMState, this patch mimics a similar subsectionning
> > > ability for virtio.
> > > 
> > > Since each virtio device is streamed in its own section, the idea is to
> > > stream subsections between the end of the device section and the start
> > > of the next sections. This allows an older QEMU to complain and exit
> > > when fed with subsections:
> > > 
> > > Unknown savevm section type 5
> > > Error -22 while loading VM state
> > 
> > Please make this configurable -- either via configure or device
> > properties.  That avoids having to break existing configurations that
> > work without this patch.
> > 
> 
> Hmmm... you mean we support migration from a newer QEMU to an older one ?

Well, not really "support", since many things don't work anyway, but
let's make that easier on downstreams that may want to.

> > > All users of virtio_load()/virtio_save() need to be patched because the
> > > subsections are streamed AFTER the device itself.
> > 
> > Since all have the same fixup, I'm wondering if a new section can be
> > added to the virtio-bus itself, which gets propagated to all devices
> > upon load in the dest.
> > 
> 
> That would be nice if possible. I will have a closer look.

Thanks!

		Amit

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15  6:46       ` Amit Shah
@ 2014-05-15  7:04         ` Greg Kurz
  2014-05-15  9:20           ` Andreas Färber
  2014-05-17 18:29           ` Michael S. Tsirkin
  2014-05-15  7:14         ` Michael S. Tsirkin
  1 sibling, 2 replies; 68+ messages in thread
From: Greg Kurz @ 2014-05-15  7:04 UTC (permalink / raw)
  To: Amit Shah
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	Michael S. Tsirkin, Alexander Graf, qemu-devel, Anthony Liguori,
	Paolo Bonzini, Andreas Färber

On Thu, 15 May 2014 12:16:35 +0530
Amit Shah <amit.shah@redhat.com> wrote:
> On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
> > On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
> > > On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> > > > There is a need to add some more fields to VirtIODevice that should be
> > > > migrated (broken status, endianness). The problem is that we do not
> > > > want to break compatibility while adding a new feature... This issue has
> > > > been addressed in the generic VMState code with the use of optional
> > > > subsections. As a *temporary* alternative to port the whole virtio
> > > > migration code to VMState, this patch mimics a similar subsectionning
> > > > ability for virtio.
> 
> BTW Greg, do you plan on working on vmstate for virtio?
> 

Yes.

> > > > Since each virtio device is streamed in its own section, the idea is to
> > > > stream subsections between the end of the device section and the start
> > > > of the next sections. This allows an older QEMU to complain and exit
> > > > when fed with subsections:
> > > > 
> > > > Unknown savevm section type 5
> > > > Error -22 while loading VM state
> > > 
> > > Please make this configurable -- either via configure or device
> > > properties.  That avoids having to break existing configurations that
> > > work without this patch.
> > > 
> > > > All users of virtio_load()/virtio_save() need to be patched because the
> > > > subsections are streamed AFTER the device itself.
> > > 
> > > Since all have the same fixup, I'm wondering if a new section can be
> > > added to the virtio-bus itself, which gets propagated to all devices
> > > upon load in the dest.
> > 
> > This calls for a way for devices to inherit properties from the bus,
> > which doesn't exist ATM.
> > Fine but let's not hold up this patchset because of this.
> 
> No, only suggestion is to add a migration section in the bus, and then
> it's easier to do this in the post-migrate functions for each device
> -- so only one new section gets introduced instead of all devices
> being modified to send a new subsection.
> 

The main problem I see is that virtio sucks: as you see in patch 8, we have
to be careful not to call vring or virtqueue stuff before the device knows
its endianness or it breaks... I need to study how the virtio-bus gets
migrated to ensure the endian section is streamed before the devices.

> 		Amit
> 

Thanks.

-- 
Gregory Kurz                                     kurzgreg@fr.ibm.com
                                                 gkurz@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15  6:49     ` Greg Kurz
  2014-05-15  6:55       ` Amit Shah
@ 2014-05-15  7:12       ` Michael S. Tsirkin
  1 sibling, 0 replies; 68+ messages in thread
From: Michael S. Tsirkin @ 2014-05-15  7:12 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	Alexander Graf, qemu-devel, Stefan Hajnoczi, Amit Shah,
	Paolo Bonzini, Andreas Färber

On Thu, May 15, 2014 at 08:49:48AM +0200, Greg Kurz wrote:
> On Thu, 15 May 2014 11:34:25 +0530
> Amit Shah <amit.shah@redhat.com> wrote:
> 
> > On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> > > There is a need to add some more fields to VirtIODevice that should be
> > > migrated (broken status, endianness). The problem is that we do not
> > > want to break compatibility while adding a new feature... This issue has
> > > been addressed in the generic VMState code with the use of optional
> > > subsections. As a *temporary* alternative to port the whole virtio
> > > migration code to VMState, this patch mimics a similar subsectionning
> > > ability for virtio.
> > > 
> > > Since each virtio device is streamed in its own section, the idea is to
> > > stream subsections between the end of the device section and the start
> > > of the next sections. This allows an older QEMU to complain and exit
> > > when fed with subsections:
> > > 
> > > Unknown savevm section type 5
> > > Error -22 while loading VM state
> > 
> > Please make this configurable -- either via configure or device
> > properties.  That avoids having to break existing configurations that
> > work without this patch.
> > 
> 
> Hmmm... you mean we support migration from a newer QEMU to an older one ?

In theory yes, that's why we have the -M switch.
This area isn't well tested unfortunately, but there are
downstreams that test and productize it.


> > > All users of virtio_load()/virtio_save() need to be patched because the
> > > subsections are streamed AFTER the device itself.
> > 
> > Since all have the same fixup, I'm wondering if a new section can be
> > added to the virtio-bus itself, which gets propagated to all devices
> > upon load in the dest.
> > 
> 
> That would be nice if possible. I will have a closer look.
> 
> > 		Amit
> > 
> 
> Thanks.
> 
> -- 
> Gregory Kurz                                     kurzgreg@fr.ibm.com
>                                                  gkurz@linux.vnet.ibm.com
> Software Engineer @ IBM/Meiosys                  http://www.ibm.com
> Tel +33 (0)562 165 496
> 
> "Anarchy is about taking complete responsibility for yourself."
>         Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15  6:46       ` Amit Shah
  2014-05-15  7:04         ` Greg Kurz
@ 2014-05-15  7:14         ` Michael S. Tsirkin
  1 sibling, 0 replies; 68+ messages in thread
From: Michael S. Tsirkin @ 2014-05-15  7:14 UTC (permalink / raw)
  To: Amit Shah
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	Alexander Graf, qemu-devel, Stefan Hajnoczi, Paolo Bonzini,
	Andreas Färber, Greg Kurz

On Thu, May 15, 2014 at 12:16:35PM +0530, Amit Shah wrote:
> On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
> > On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
> > > On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> > > > There is a need to add some more fields to VirtIODevice that should be
> > > > migrated (broken status, endianness). The problem is that we do not
> > > > want to break compatibility while adding a new feature... This issue has
> > > > been addressed in the generic VMState code with the use of optional
> > > > subsections. As a *temporary* alternative to port the whole virtio
> > > > migration code to VMState, this patch mimics a similar subsectionning
> > > > ability for virtio.
> 
> BTW Greg, do you plan on working on vmstate for virtio?
> 
> > > > Since each virtio device is streamed in its own section, the idea is to
> > > > stream subsections between the end of the device section and the start
> > > > of the next sections. This allows an older QEMU to complain and exit
> > > > when fed with subsections:
> > > > 
> > > > Unknown savevm section type 5
> > > > Error -22 while loading VM state
> > > 
> > > Please make this configurable -- either via configure or device
> > > properties.  That avoids having to break existing configurations that
> > > work without this patch.
> > > 
> > > > All users of virtio_load()/virtio_save() need to be patched because the
> > > > subsections are streamed AFTER the device itself.
> > > 
> > > Since all have the same fixup, I'm wondering if a new section can be
> > > added to the virtio-bus itself, which gets propagated to all devices
> > > upon load in the dest.
> > 
> > This calls for a way for devices to inherit properties from the bus,
> > which doesn't exist ATM.
> > Fine but let's not hold up this patchset because of this.
> 
> No, only suggestion is to add a migration section in the bus, and then
> it's easier to do this in the post-migrate functions for each device
> -- so only one new section gets introduced instead of all devices
> being modified to send a new subsection.
> 
> 		Amit

I don't mind but the gain isn't very big here.

-- 
MST

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15  7:04         ` Greg Kurz
@ 2014-05-15  9:20           ` Andreas Färber
  2014-05-15  9:52             ` Michael S. Tsirkin
  2014-05-15 10:08             ` Greg Kurz
  2014-05-17 18:29           ` Michael S. Tsirkin
  1 sibling, 2 replies; 68+ messages in thread
From: Andreas Färber @ 2014-05-15  9:20 UTC (permalink / raw)
  To: Greg Kurz, Amit Shah
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	Michael S. Tsirkin, Alexander Graf, qemu-devel, Anthony Liguori,
	Paolo Bonzini

Am 15.05.2014 09:04, schrieb Greg Kurz:
> On Thu, 15 May 2014 12:16:35 +0530
> Amit Shah <amit.shah@redhat.com> wrote:
>> On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
>>> On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
>>>> On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
>>>>> Since each virtio device is streamed in its own section, the idea is to
>>>>> stream subsections between the end of the device section and the start
>>>>> of the next sections. This allows an older QEMU to complain and exit
>>>>> when fed with subsections:
>>>>>
>>>>> Unknown savevm section type 5
>>>>> Error -22 while loading VM state
>>>>
>>>> Please make this configurable -- either via configure or device
>>>> properties.  That avoids having to break existing configurations that
>>>> work without this patch.

Since backwards migration is not supported upstream, wouldn't it be
easiest to just add support for the subsection marker and skipping to
the end of section in that downstream?

>>>>> All users of virtio_load()/virtio_save() need to be patched because the
>>>>> subsections are streamed AFTER the device itself.

IMO this is calling for inversion of control - i.e. let virtio devices
call generic load/save functions that then dispatch to device-specific
code and let us add common stuff in a central place without forgetting
to add calls in some new device.

>>>> Since all have the same fixup, I'm wondering if a new section can be
>>>> added to the virtio-bus itself, which gets propagated to all devices
>>>> upon load in the dest.
>>>
>>> This calls for a way for devices to inherit properties from the bus,
>>> which doesn't exist ATM.
>>> Fine but let's not hold up this patchset because of this.
>>
>> No, only suggestion is to add a migration section in the bus, and then
>> it's easier to do this in the post-migrate functions for each device
>> -- so only one new section gets introduced instead of all devices
>> being modified to send a new subsection.
>>
> 
> The main problem I see is that virtio sucks: as you see in patch 8, we have
> to be careful not to call vring or virtqueue stuff before the device knows
> its endianness or it breaks... I need to study how the virtio-bus gets
> migrated to ensure the endian section is streamed before the devices.

There is no ordering guarantee. The state needs to be migrated in the
device or bus where it sits, if post-load processing is required; i.e.,
if it's in VirtIODevice then something like this series, if it were on
VirtioBus exclusively (device asking bus for its endianness each time
and does not do post-load stuff) then endianness could be migrated as a
new bus section. Not sure if that would help the "broken" state though?

Would touch on Stefan's alias properties for anything but virtio-mmio.

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15  9:20           ` Andreas Färber
@ 2014-05-15  9:52             ` Michael S. Tsirkin
  2014-05-15  9:58               ` Andreas Färber
  2014-05-15 12:33               ` Markus Armbruster
  2014-05-15 10:08             ` Greg Kurz
  1 sibling, 2 replies; 68+ messages in thread
From: Michael S. Tsirkin @ 2014-05-15  9:52 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	Alexander Graf, qemu-devel, Anthony Liguori, Amit Shah,
	Paolo Bonzini, Greg Kurz

On Thu, May 15, 2014 at 11:20:18AM +0200, Andreas Färber wrote:
> Am 15.05.2014 09:04, schrieb Greg Kurz:
> > On Thu, 15 May 2014 12:16:35 +0530
> > Amit Shah <amit.shah@redhat.com> wrote:
> >> On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
> >>> On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
> >>>> On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> >>>>> Since each virtio device is streamed in its own section, the idea is to
> >>>>> stream subsections between the end of the device section and the start
> >>>>> of the next sections. This allows an older QEMU to complain and exit
> >>>>> when fed with subsections:
> >>>>>
> >>>>> Unknown savevm section type 5
> >>>>> Error -22 while loading VM state
> >>>>
> >>>> Please make this configurable -- either via configure or device
> >>>> properties.  That avoids having to break existing configurations that
> >>>> work without this patch.
> 
> Since backwards migration is not supported upstream, wouldn't it be
> easiest to just add support for the subsection marker and skipping to
> the end of section in that downstream?

Backwards and forwards migration need to be supported,
customers told us repeatedly. So some downstreams support this
and not supporting it upstream just means downstreams need
to do their own thing.

As importantly, ping-pong migration is the only
reliable way to stress migration.

So if we want to test cross-version we need it to work
both way.

Finally, the real issue and difficulty with cross-version migration is
making VM behave in a backwards compatible way.  Serializing in a
compatible way is a trivial problem, or would be if the code wasn't a
mess :) Once you do the hard part, breaking migration because of the
trivial serialization issue is just silly.  And special-casing forward
migration does not make code simpler, it really only leads to
proliferation of hacks and lack of symmetry.

So yes it's a useful feature, and no not supporting it does
not help anyway.


> >>>>> All users of virtio_load()/virtio_save() need to be patched because the
> >>>>> subsections are streamed AFTER the device itself.
> 
> IMO this is calling for inversion of control - i.e. let virtio devices
> call generic load/save functions that then dispatch to device-specific
> code and let us add common stuff in a central place without forgetting
> to add calls in some new device.
> 
> >>>> Since all have the same fixup, I'm wondering if a new section can be
> >>>> added to the virtio-bus itself, which gets propagated to all devices
> >>>> upon load in the dest.
> >>>
> >>> This calls for a way for devices to inherit properties from the bus,
> >>> which doesn't exist ATM.
> >>> Fine but let's not hold up this patchset because of this.
> >>
> >> No, only suggestion is to add a migration section in the bus, and then
> >> it's easier to do this in the post-migrate functions for each device
> >> -- so only one new section gets introduced instead of all devices
> >> being modified to send a new subsection.
> >>
> > 
> > The main problem I see is that virtio sucks: as you see in patch 8, we have
> > to be careful not to call vring or virtqueue stuff before the device knows
> > its endianness or it breaks... I need to study how the virtio-bus gets
> > migrated to ensure the endian section is streamed before the devices.
> 
> There is no ordering guarantee. The state needs to be migrated in the
> device or bus where it sits, if post-load processing is required; i.e.,
> if it's in VirtIODevice then something like this series, if it were on
> VirtioBus exclusively (device asking bus for its endianness each time
> and does not do post-load stuff) then endianness could be migrated as a
> new bus section. Not sure if that would help the "broken" state though?
> 
> Would touch on Stefan's alias properties for anything but virtio-mmio.
> 
> Regards,
> Andreas
> 
> -- 
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15  9:52             ` Michael S. Tsirkin
@ 2014-05-15  9:58               ` Andreas Färber
  2014-05-15 10:03                 ` Michael S. Tsirkin
  2014-05-15 12:33               ` Markus Armbruster
  1 sibling, 1 reply; 68+ messages in thread
From: Andreas Färber @ 2014-05-15  9:58 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	Alexander Graf, qemu-devel, Anthony Liguori, Amit Shah,
	Paolo Bonzini, Greg Kurz

Am 15.05.2014 11:52, schrieb Michael S. Tsirkin:
> On Thu, May 15, 2014 at 11:20:18AM +0200, Andreas Färber wrote:
>> Am 15.05.2014 09:04, schrieb Greg Kurz:
>>> On Thu, 15 May 2014 12:16:35 +0530
>>> Amit Shah <amit.shah@redhat.com> wrote:
>>>> On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
>>>>> On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
>>>>>> On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
>>>>>>> Since each virtio device is streamed in its own section, the idea is to
>>>>>>> stream subsections between the end of the device section and the start
>>>>>>> of the next sections. This allows an older QEMU to complain and exit
>>>>>>> when fed with subsections:
>>>>>>>
>>>>>>> Unknown savevm section type 5
>>>>>>> Error -22 while loading VM state
>>>>>>
>>>>>> Please make this configurable -- either via configure or device
>>>>>> properties.  That avoids having to break existing configurations that
>>>>>> work without this patch.
>>
>> Since backwards migration is not supported upstream, wouldn't it be
>> easiest to just add support for the subsection marker and skipping to
>> the end of section in that downstream?
> 
> Backwards and forwards migration need to be supported,
> customers told us repeatedly. So some downstreams support this
> and not supporting it upstream just means downstreams need
> to do their own thing.
> 
> As importantly, ping-pong migration is the only
> reliable way to stress migration.
> 
> So if we want to test cross-version we need it to work
> both way.
> 
> Finally, the real issue and difficulty with cross-version migration is
> making VM behave in a backwards compatible way.  Serializing in a
> compatible way is a trivial problem, or would be if the code wasn't a
> mess :) Once you do the hard part, breaking migration because of the
> trivial serialization issue is just silly.  And special-casing forward
> migration does not make code simpler, it really only leads to
> proliferation of hacks and lack of symmetry.
> 
> So yes it's a useful feature, and no not supporting it does
> not help anyway.

It seems you misunderstand. I was not saying it's not useful.

My point is that VMStateSubsections added in newer versions (and thus
not existing in older versions) need to be handled for any
VMState-converted devices. So why can't we make that work for virtio too?

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15  9:58               ` Andreas Färber
@ 2014-05-15 10:03                 ` Michael S. Tsirkin
  2014-05-15 10:11                   ` Andreas Färber
  0 siblings, 1 reply; 68+ messages in thread
From: Michael S. Tsirkin @ 2014-05-15 10:03 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	Alexander Graf, qemu-devel, Anthony Liguori, Amit Shah,
	Paolo Bonzini, Greg Kurz

On Thu, May 15, 2014 at 11:58:25AM +0200, Andreas Färber wrote:
> Am 15.05.2014 11:52, schrieb Michael S. Tsirkin:
> > On Thu, May 15, 2014 at 11:20:18AM +0200, Andreas Färber wrote:
> >> Am 15.05.2014 09:04, schrieb Greg Kurz:
> >>> On Thu, 15 May 2014 12:16:35 +0530
> >>> Amit Shah <amit.shah@redhat.com> wrote:
> >>>> On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
> >>>>> On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
> >>>>>> On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> >>>>>>> Since each virtio device is streamed in its own section, the idea is to
> >>>>>>> stream subsections between the end of the device section and the start
> >>>>>>> of the next sections. This allows an older QEMU to complain and exit
> >>>>>>> when fed with subsections:
> >>>>>>>
> >>>>>>> Unknown savevm section type 5
> >>>>>>> Error -22 while loading VM state
> >>>>>>
> >>>>>> Please make this configurable -- either via configure or device
> >>>>>> properties.  That avoids having to break existing configurations that
> >>>>>> work without this patch.
> >>
> >> Since backwards migration is not supported upstream, wouldn't it be
> >> easiest to just add support for the subsection marker and skipping to
> >> the end of section in that downstream?
> > 
> > Backwards and forwards migration need to be supported,
> > customers told us repeatedly. So some downstreams support this
> > and not supporting it upstream just means downstreams need
> > to do their own thing.
> > 
> > As importantly, ping-pong migration is the only
> > reliable way to stress migration.
> > 
> > So if we want to test cross-version we need it to work
> > both way.
> > 
> > Finally, the real issue and difficulty with cross-version migration is
> > making VM behave in a backwards compatible way.  Serializing in a
> > compatible way is a trivial problem, or would be if the code wasn't a
> > mess :) Once you do the hard part, breaking migration because of the
> > trivial serialization issue is just silly.  And special-casing forward
> > migration does not make code simpler, it really only leads to
> > proliferation of hacks and lack of symmetry.
> > 
> > So yes it's a useful feature, and no not supporting it does
> > not help anyway.
> 
> It seems you misunderstand. I was not saying it's not useful.
> 
> My point is that VMStateSubsections added in newer versions (and thus
> not existing in older versions) need to be handled for any
> VMState-converted devices. So why can't we make that work for virtio too?
> 
> Andreas

Sure.
AFAIK the way to this works is by adding an "field_exists" callback,
and not sending the section when we are in a compat mode.

> -- 
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15  9:20           ` Andreas Färber
  2014-05-15  9:52             ` Michael S. Tsirkin
@ 2014-05-15 10:08             ` Greg Kurz
  2014-05-15 10:12               ` Michael S. Tsirkin
                                 ` (2 more replies)
  1 sibling, 3 replies; 68+ messages in thread
From: Greg Kurz @ 2014-05-15 10:08 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	Michael S. Tsirkin, Alexander Graf, qemu-devel, Anthony Liguori,
	Amit Shah, Paolo Bonzini

On Thu, 15 May 2014 11:20:18 +0200
Andreas Färber <afaerber@suse.de> wrote:
> Am 15.05.2014 09:04, schrieb Greg Kurz:
> > On Thu, 15 May 2014 12:16:35 +0530
> > Amit Shah <amit.shah@redhat.com> wrote:
> >> On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
> >>> On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
> >>>> On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> >>>>> Since each virtio device is streamed in its own section, the idea is to
> >>>>> stream subsections between the end of the device section and the start
> >>>>> of the next sections. This allows an older QEMU to complain and exit
> >>>>> when fed with subsections:
> >>>>>
> >>>>> Unknown savevm section type 5
> >>>>> Error -22 while loading VM state
> >>>>
> >>>> Please make this configurable -- either via configure or device
> >>>> properties.  That avoids having to break existing configurations that
> >>>> work without this patch.
> 
> Since backwards migration is not supported upstream, wouldn't it be
> easiest to just add support for the subsection marker and skipping to
> the end of section in that downstream?
> 

Not sure I understand well... Do you suggest to stream the markers first,
then the device, then the subsections ? And then there would be a way
we can have the subsections restored before the device ?

> >>>>> All users of virtio_load()/virtio_save() need to be patched because the
> >>>>> subsections are streamed AFTER the device itself.
> 
> IMO this is calling for inversion of control - i.e. let virtio devices
> call generic load/save functions that then dispatch to device-specific
> code and let us add common stuff in a central place without forgetting
> to add calls in some new device.
> 

That makes a lot of sense.

> >>>> Since all have the same fixup, I'm wondering if a new section can be
> >>>> added to the virtio-bus itself, which gets propagated to all devices
> >>>> upon load in the dest.
> >>>
> >>> This calls for a way for devices to inherit properties from the bus,
> >>> which doesn't exist ATM.
> >>> Fine but let's not hold up this patchset because of this.
> >>
> >> No, only suggestion is to add a migration section in the bus, and then
> >> it's easier to do this in the post-migrate functions for each device
> >> -- so only one new section gets introduced instead of all devices
> >> being modified to send a new subsection.
> >>
> > 
> > The main problem I see is that virtio sucks: as you see in patch 8, we have
> > to be careful not to call vring or virtqueue stuff before the device knows
> > its endianness or it breaks... I need to study how the virtio-bus gets
> > migrated to ensure the endian section is streamed before the devices.
> 
> There is no ordering guarantee. The state needs to be migrated in the
> device or bus where it sits, if post-load processing is required; i.e.,
> if it's in VirtIODevice then something like this series, if it were on
> VirtioBus exclusively (device asking bus for its endianness each time
> and does not do post-load stuff) then endianness could be migrated as a
> new bus section. Not sure if that would help the "broken" state though?
> 

IIRW the "broken" state was proposed as a per-device property...

Fam,

Do you have plans about the "broken" property ? Is it still needed ?

> Would touch on Stefan's alias properties for anything but virtio-mmio.
> 

OMG... maybe I should hold on then.

> Regards,
> Andreas
> 

Thanks !

-- 
Gregory Kurz                                     kurzgreg@fr.ibm.com
                                                 gkurz@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15 10:03                 ` Michael S. Tsirkin
@ 2014-05-15 10:11                   ` Andreas Färber
  2014-05-15 10:16                     ` Michael S. Tsirkin
  0 siblings, 1 reply; 68+ messages in thread
From: Andreas Färber @ 2014-05-15 10:11 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	Alexander Graf, qemu-devel, Anthony Liguori, Amit Shah,
	Paolo Bonzini, Greg Kurz

Am 15.05.2014 12:03, schrieb Michael S. Tsirkin:
> On Thu, May 15, 2014 at 11:58:25AM +0200, Andreas Färber wrote:
>> Am 15.05.2014 11:52, schrieb Michael S. Tsirkin:
>>> On Thu, May 15, 2014 at 11:20:18AM +0200, Andreas Färber wrote:
>>>> Am 15.05.2014 09:04, schrieb Greg Kurz:
>>>>> On Thu, 15 May 2014 12:16:35 +0530
>>>>> Amit Shah <amit.shah@redhat.com> wrote:
>>>>>> On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
>>>>>>> On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
>>>>>>>> On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
>>>>>>>>> Since each virtio device is streamed in its own section, the idea is to
>>>>>>>>> stream subsections between the end of the device section and the start
>>>>>>>>> of the next sections. This allows an older QEMU to complain and exit
>>>>>>>>> when fed with subsections:
>>>>>>>>>
>>>>>>>>> Unknown savevm section type 5
>>>>>>>>> Error -22 while loading VM state
>>>>>>>>
>>>>>>>> Please make this configurable -- either via configure or device
>>>>>>>> properties.  That avoids having to break existing configurations that
>>>>>>>> work without this patch.
>>>>
>>>> Since backwards migration is not supported upstream, wouldn't it be
>>>> easiest to just add support for the subsection marker and skipping to
>>>> the end of section in that downstream?
>>>
>>> Backwards and forwards migration need to be supported,
>>> customers told us repeatedly. So some downstreams support this
>>> and not supporting it upstream just means downstreams need
>>> to do their own thing.
>>>
>>> As importantly, ping-pong migration is the only
>>> reliable way to stress migration.
>>>
>>> So if we want to test cross-version we need it to work
>>> both way.
>>>
>>> Finally, the real issue and difficulty with cross-version migration is
>>> making VM behave in a backwards compatible way.  Serializing in a
>>> compatible way is a trivial problem, or would be if the code wasn't a
>>> mess :) Once you do the hard part, breaking migration because of the
>>> trivial serialization issue is just silly.  And special-casing forward
>>> migration does not make code simpler, it really only leads to
>>> proliferation of hacks and lack of symmetry.
>>>
>>> So yes it's a useful feature, and no not supporting it does
>>> not help anyway.
>>
>> It seems you misunderstand. I was not saying it's not useful.
>>
>> My point is that VMStateSubsections added in newer versions (and thus
>> not existing in older versions) need to be handled for any
>> VMState-converted devices. So why can't we make that work for virtio too?
> 
> Sure.
> AFAIK the way to this works is by adding an "field_exists" callback,
> and not sending the section when we are in a compat mode.

OK, but upstream always sends the latest version today. So isn't that
just two ifs that you would need to add in save and load functions added
here for the downstream? x86_64 is unaffected from ppc's endianness
changes and you still do ppc64 BE, so there's no real functional problem
for RHEL, is there?

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15 10:08             ` Greg Kurz
@ 2014-05-15 10:12               ` Michael S. Tsirkin
  2014-05-15 10:21                 ` Greg Kurz
  2014-05-15 10:16               ` Greg Kurz
  2014-05-16  9:14               ` Fam Zheng
  2 siblings, 1 reply; 68+ messages in thread
From: Michael S. Tsirkin @ 2014-05-15 10:12 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	Alexander Graf, qemu-devel, Anthony Liguori, Amit Shah,
	Paolo Bonzini, Andreas Färber

On Thu, May 15, 2014 at 12:08:26PM +0200, Greg Kurz wrote:
> On Thu, 15 May 2014 11:20:18 +0200
> Andreas Färber <afaerber@suse.de> wrote:
> > Am 15.05.2014 09:04, schrieb Greg Kurz:
> > > On Thu, 15 May 2014 12:16:35 +0530
> > > Amit Shah <amit.shah@redhat.com> wrote:
> > >> On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
> > >>> On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
> > >>>> On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> > >>>>> Since each virtio device is streamed in its own section, the idea is to
> > >>>>> stream subsections between the end of the device section and the start
> > >>>>> of the next sections. This allows an older QEMU to complain and exit
> > >>>>> when fed with subsections:
> > >>>>>
> > >>>>> Unknown savevm section type 5
> > >>>>> Error -22 while loading VM state
> > >>>>
> > >>>> Please make this configurable -- either via configure or device
> > >>>> properties.  That avoids having to break existing configurations that
> > >>>> work without this patch.
> > 
> > Since backwards migration is not supported upstream, wouldn't it be
> > easiest to just add support for the subsection marker and skipping to
> > the end of section in that downstream?
> > 
> 
> Not sure I understand well... Do you suggest to stream the markers first,
> then the device, then the subsections ? And then there would be a way
> we can have the subsections restored before the device ?
> 
> > >>>>> All users of virtio_load()/virtio_save() need to be patched because the
> > >>>>> subsections are streamed AFTER the device itself.
> > 
> > IMO this is calling for inversion of control - i.e. let virtio devices
> > call generic load/save functions that then dispatch to device-specific
> > code and let us add common stuff in a central place without forgetting
> > to add calls in some new device.
> > 
> 
> That makes a lot of sense.
> 
> > >>>> Since all have the same fixup, I'm wondering if a new section can be
> > >>>> added to the virtio-bus itself, which gets propagated to all devices
> > >>>> upon load in the dest.
> > >>>
> > >>> This calls for a way for devices to inherit properties from the bus,
> > >>> which doesn't exist ATM.
> > >>> Fine but let's not hold up this patchset because of this.
> > >>
> > >> No, only suggestion is to add a migration section in the bus, and then
> > >> it's easier to do this in the post-migrate functions for each device
> > >> -- so only one new section gets introduced instead of all devices
> > >> being modified to send a new subsection.
> > >>
> > > 
> > > The main problem I see is that virtio sucks: as you see in patch 8, we have
> > > to be careful not to call vring or virtqueue stuff before the device knows
> > > its endianness or it breaks... I need to study how the virtio-bus gets
> > > migrated to ensure the endian section is streamed before the devices.
> > 
> > There is no ordering guarantee. The state needs to be migrated in the
> > device or bus where it sits, if post-load processing is required; i.e.,
> > if it's in VirtIODevice then something like this series, if it were on
> > VirtioBus exclusively (device asking bus for its endianness each time
> > and does not do post-load stuff) then endianness could be migrated as a
> > new bus section. Not sure if that would help the "broken" state though?
> > 
> 
> IIRW the "broken" state was proposed as a per-device property...
> 
> Fam,
> 
> Do you have plans about the "broken" property ? Is it still needed ?
> 
> > Would touch on Stefan's alias properties for anything but virtio-mmio.
> > 
> 
> OMG... maybe I should hold on then.

No need to wait imho.
Can this be made even simpler - call this stuff
from virtio_save/virtio_load?

Why not?


> > Regards,
> > Andreas
> > 
> 
> Thanks !
> 
> -- 
> Gregory Kurz                                     kurzgreg@fr.ibm.com
>                                                  gkurz@linux.vnet.ibm.com
> Software Engineer @ IBM/Meiosys                  http://www.ibm.com
> Tel +33 (0)562 165 496
> 
> "Anarchy is about taking complete responsibility for yourself."
>         Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15 10:08             ` Greg Kurz
  2014-05-15 10:12               ` Michael S. Tsirkin
@ 2014-05-15 10:16               ` Greg Kurz
  2014-05-16  9:14               ` Fam Zheng
  2 siblings, 0 replies; 68+ messages in thread
From: Greg Kurz @ 2014-05-15 10:16 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Michael S. Tsirkin,
	Juan Quintela, Alexander Graf, qemu-devel, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini, Andreas Färber

On Thu, 15 May 2014 12:08:26 +0200
Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> On Thu, 15 May 2014 11:20:18 +0200
> Andreas Färber <afaerber@suse.de> wrote:
> > Am 15.05.2014 09:04, schrieb Greg Kurz:
> > > On Thu, 15 May 2014 12:16:35 +0530
> > > Amit Shah <amit.shah@redhat.com> wrote:
> > >> On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
> > >>> On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
> > >>>> On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> > >>>>> Since each virtio device is streamed in its own section, the idea is to
> > >>>>> stream subsections between the end of the device section and the start
> > >>>>> of the next sections. This allows an older QEMU to complain and exit
> > >>>>> when fed with subsections:
> > >>>>>
> > >>>>> Unknown savevm section type 5
> > >>>>> Error -22 while loading VM state
> > >>>>
> > >>>> Please make this configurable -- either via configure or device
> > >>>> properties.  That avoids having to break existing configurations that
> > >>>> work without this patch.
> > 
> > Since backwards migration is not supported upstream, wouldn't it be
> > easiest to just add support for the subsection marker and skipping to
> > the end of section in that downstream?
> > 
> 
> Not sure I understand well... Do you suggest to stream the markers first,
> then the device, then the subsections ? And then there would be a way
> we can have the subsections restored before the device ?
> 

Heh just got the clarification thanks to Michael's mail... Sorry for the
confusion and noise. :P


> > >>>>> All users of virtio_load()/virtio_save() need to be patched because the
> > >>>>> subsections are streamed AFTER the device itself.
> > 
> > IMO this is calling for inversion of control - i.e. let virtio devices
> > call generic load/save functions that then dispatch to device-specific
> > code and let us add common stuff in a central place without forgetting
> > to add calls in some new device.
> > 
> 
> That makes a lot of sense.
> 
> > >>>> Since all have the same fixup, I'm wondering if a new section can be
> > >>>> added to the virtio-bus itself, which gets propagated to all devices
> > >>>> upon load in the dest.
> > >>>
> > >>> This calls for a way for devices to inherit properties from the bus,
> > >>> which doesn't exist ATM.
> > >>> Fine but let's not hold up this patchset because of this.
> > >>
> > >> No, only suggestion is to add a migration section in the bus, and then
> > >> it's easier to do this in the post-migrate functions for each device
> > >> -- so only one new section gets introduced instead of all devices
> > >> being modified to send a new subsection.
> > >>
> > > 
> > > The main problem I see is that virtio sucks: as you see in patch 8, we have
> > > to be careful not to call vring or virtqueue stuff before the device knows
> > > its endianness or it breaks... I need to study how the virtio-bus gets
> > > migrated to ensure the endian section is streamed before the devices.
> > 
> > There is no ordering guarantee. The state needs to be migrated in the
> > device or bus where it sits, if post-load processing is required; i.e.,
> > if it's in VirtIODevice then something like this series, if it were on
> > VirtioBus exclusively (device asking bus for its endianness each time
> > and does not do post-load stuff) then endianness could be migrated as a
> > new bus section. Not sure if that would help the "broken" state though?
> > 
> 
> IIRW the "broken" state was proposed as a per-device property...
> 
> Fam,
> 
> Do you have plans about the "broken" property ? Is it still needed ?
> 
> > Would touch on Stefan's alias properties for anything but virtio-mmio.
> > 
> 
> OMG... maybe I should hold on then.
> 
> > Regards,
> > Andreas
> > 
> 
> Thanks !
> 



-- 
Gregory Kurz                                     kurzgreg@fr.ibm.com
                                                 gkurz@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15 10:11                   ` Andreas Färber
@ 2014-05-15 10:16                     ` Michael S. Tsirkin
  2014-05-15 12:00                       ` Andreas Färber
  0 siblings, 1 reply; 68+ messages in thread
From: Michael S. Tsirkin @ 2014-05-15 10:16 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	Alexander Graf, qemu-devel, Anthony Liguori, Amit Shah,
	Paolo Bonzini, Greg Kurz

On Thu, May 15, 2014 at 12:11:12PM +0200, Andreas Färber wrote:
> Am 15.05.2014 12:03, schrieb Michael S. Tsirkin:
> > On Thu, May 15, 2014 at 11:58:25AM +0200, Andreas Färber wrote:
> >> Am 15.05.2014 11:52, schrieb Michael S. Tsirkin:
> >>> On Thu, May 15, 2014 at 11:20:18AM +0200, Andreas Färber wrote:
> >>>> Am 15.05.2014 09:04, schrieb Greg Kurz:
> >>>>> On Thu, 15 May 2014 12:16:35 +0530
> >>>>> Amit Shah <amit.shah@redhat.com> wrote:
> >>>>>> On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
> >>>>>>> On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
> >>>>>>>> On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> >>>>>>>>> Since each virtio device is streamed in its own section, the idea is to
> >>>>>>>>> stream subsections between the end of the device section and the start
> >>>>>>>>> of the next sections. This allows an older QEMU to complain and exit
> >>>>>>>>> when fed with subsections:
> >>>>>>>>>
> >>>>>>>>> Unknown savevm section type 5
> >>>>>>>>> Error -22 while loading VM state
> >>>>>>>>
> >>>>>>>> Please make this configurable -- either via configure or device
> >>>>>>>> properties.  That avoids having to break existing configurations that
> >>>>>>>> work without this patch.
> >>>>
> >>>> Since backwards migration is not supported upstream, wouldn't it be
> >>>> easiest to just add support for the subsection marker and skipping to
> >>>> the end of section in that downstream?
> >>>
> >>> Backwards and forwards migration need to be supported,
> >>> customers told us repeatedly. So some downstreams support this
> >>> and not supporting it upstream just means downstreams need
> >>> to do their own thing.
> >>>
> >>> As importantly, ping-pong migration is the only
> >>> reliable way to stress migration.
> >>>
> >>> So if we want to test cross-version we need it to work
> >>> both way.
> >>>
> >>> Finally, the real issue and difficulty with cross-version migration is
> >>> making VM behave in a backwards compatible way.  Serializing in a
> >>> compatible way is a trivial problem, or would be if the code wasn't a
> >>> mess :) Once you do the hard part, breaking migration because of the
> >>> trivial serialization issue is just silly.  And special-casing forward
> >>> migration does not make code simpler, it really only leads to
> >>> proliferation of hacks and lack of symmetry.
> >>>
> >>> So yes it's a useful feature, and no not supporting it does
> >>> not help anyway.
> >>
> >> It seems you misunderstand. I was not saying it's not useful.
> >>
> >> My point is that VMStateSubsections added in newer versions (and thus
> >> not existing in older versions) need to be handled for any
> >> VMState-converted devices. So why can't we make that work for virtio too?
> > 
> > Sure.
> > AFAIK the way to this works is by adding an "field_exists" callback,
> > and not sending the section when we are in a compat mode.
> 
> OK, but upstream always sends the latest version today.
> So isn't that
> just two ifs that you would need to add in save and load functions added
> here for the downstream? x86_64 is unaffected from ppc's endianness
> changes and you still do ppc64 BE, so there's no real functional problem
> for RHEL, is there?
> 
> Andreas

I'm sorry I don't understand what you are saying here.
Simply put, if you specify a compatible -M then your
device should behave, and migrate, exactly like and old
qemu did.

With *any* change, there's always the temptation to say
"oh but old behaviour was too buggy we should just ignore it".
Seems to make sense in each case but does not scale,
you end up with breakages in all version without exception.
So this should be a very high bar to overcome.



> -- 
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15 10:12               ` Michael S. Tsirkin
@ 2014-05-15 10:21                 ` Greg Kurz
  0 siblings, 0 replies; 68+ messages in thread
From: Greg Kurz @ 2014-05-15 10:21 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	Alexander Graf, qemu-devel, Anthony Liguori, Amit Shah,
	Paolo Bonzini, Andreas Färber

On Thu, 15 May 2014 13:12:12 +0300
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Thu, May 15, 2014 at 12:08:26PM +0200, Greg Kurz wrote:
> > On Thu, 15 May 2014 11:20:18 +0200
> > Andreas Färber <afaerber@suse.de> wrote:
> > > Am 15.05.2014 09:04, schrieb Greg Kurz:
> > > > On Thu, 15 May 2014 12:16:35 +0530
> > > > Amit Shah <amit.shah@redhat.com> wrote:
> > > >> On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
> > > >>> On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
> > > >>>> On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> > > >>>>> Since each virtio device is streamed in its own section, the idea is to
> > > >>>>> stream subsections between the end of the device section and the start
> > > >>>>> of the next sections. This allows an older QEMU to complain and exit
> > > >>>>> when fed with subsections:
> > > >>>>>
> > > >>>>> Unknown savevm section type 5
> > > >>>>> Error -22 while loading VM state
> > > >>>>
> > > >>>> Please make this configurable -- either via configure or device
> > > >>>> properties.  That avoids having to break existing configurations that
> > > >>>> work without this patch.
> > > 
> > > Since backwards migration is not supported upstream, wouldn't it be
> > > easiest to just add support for the subsection marker and skipping to
> > > the end of section in that downstream?
> > > 
> > 
> > Not sure I understand well... Do you suggest to stream the markers first,
> > then the device, then the subsections ? And then there would be a way
> > we can have the subsections restored before the device ?
> > 
> > > >>>>> All users of virtio_load()/virtio_save() need to be patched because the
> > > >>>>> subsections are streamed AFTER the device itself.
> > > 
> > > IMO this is calling for inversion of control - i.e. let virtio devices
> > > call generic load/save functions that then dispatch to device-specific
> > > code and let us add common stuff in a central place without forgetting
> > > to add calls in some new device.
> > > 
> > 
> > That makes a lot of sense.
> > 
> > > >>>> Since all have the same fixup, I'm wondering if a new section can be
> > > >>>> added to the virtio-bus itself, which gets propagated to all devices
> > > >>>> upon load in the dest.
> > > >>>
> > > >>> This calls for a way for devices to inherit properties from the bus,
> > > >>> which doesn't exist ATM.
> > > >>> Fine but let's not hold up this patchset because of this.
> > > >>
> > > >> No, only suggestion is to add a migration section in the bus, and then
> > > >> it's easier to do this in the post-migrate functions for each device
> > > >> -- so only one new section gets introduced instead of all devices
> > > >> being modified to send a new subsection.
> > > >>
> > > > 
> > > > The main problem I see is that virtio sucks: as you see in patch 8, we have
> > > > to be careful not to call vring or virtqueue stuff before the device knows
> > > > its endianness or it breaks... I need to study how the virtio-bus gets
> > > > migrated to ensure the endian section is streamed before the devices.
> > > 
> > > There is no ordering guarantee. The state needs to be migrated in the
> > > device or bus where it sits, if post-load processing is required; i.e.,
> > > if it's in VirtIODevice then something like this series, if it were on
> > > VirtioBus exclusively (device asking bus for its endianness each time
> > > and does not do post-load stuff) then endianness could be migrated as a
> > > new bus section. Not sure if that would help the "broken" state though?
> > > 
> > 
> > IIRW the "broken" state was proposed as a per-device property...
> > 
> > Fam,
> > 
> > Do you have plans about the "broken" property ? Is it still needed ?
> > 
> > > Would touch on Stefan's alias properties for anything but virtio-mmio.
> > > 
> > 
> > OMG... maybe I should hold on then.
> 
> No need to wait imho.
> Can this be made even simpler - call this stuff
> from virtio_save/virtio_load?
> 

Andreas already suggested this inversion of control.

> Why not?
> 

No reason indeed. I'll rewrite the code that way ! :)

> 
> > > Regards,
> > > Andreas
> > > 
> > 
> > Thanks !
> > 
> > -- 
> > Gregory Kurz                                     kurzgreg@fr.ibm.com
> >                                                  gkurz@linux.vnet.ibm.com
> > Software Engineer @ IBM/Meiosys                  http://www.ibm.com
> > Tel +33 (0)562 165 496
> > 
> > "Anarchy is about taking complete responsibility for yourself."
> >         Alan Moore.
> 

Thnaks !

-- 
Gregory Kurz                                     kurzgreg@fr.ibm.com
                                                 gkurz@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15 10:16                     ` Michael S. Tsirkin
@ 2014-05-15 12:00                       ` Andreas Färber
  2014-05-15 12:20                         ` Michael S. Tsirkin
  2014-05-15 13:49                         ` Greg Kurz
  0 siblings, 2 replies; 68+ messages in thread
From: Andreas Färber @ 2014-05-15 12:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	Alexander Graf, qemu-devel, Anthony Liguori, Amit Shah,
	Paolo Bonzini, Greg Kurz

Am 15.05.2014 12:16, schrieb Michael S. Tsirkin:
> On Thu, May 15, 2014 at 12:11:12PM +0200, Andreas Färber wrote:
>> Am 15.05.2014 12:03, schrieb Michael S. Tsirkin:
>>> On Thu, May 15, 2014 at 11:58:25AM +0200, Andreas Färber wrote:
>>>> Am 15.05.2014 11:52, schrieb Michael S. Tsirkin:
>>>>> On Thu, May 15, 2014 at 11:20:18AM +0200, Andreas Färber wrote:
>>>>>> Am 15.05.2014 09:04, schrieb Greg Kurz:
>>>>>>> On Thu, 15 May 2014 12:16:35 +0530
>>>>>>> Amit Shah <amit.shah@redhat.com> wrote:
>>>>>>>> On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
>>>>>>>>> On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
>>>>>>>>>> On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
>>>>>>>>>>> Since each virtio device is streamed in its own section, the idea is to
>>>>>>>>>>> stream subsections between the end of the device section and the start
>>>>>>>>>>> of the next sections. This allows an older QEMU to complain and exit
>>>>>>>>>>> when fed with subsections:
>>>>>>>>>>>
>>>>>>>>>>> Unknown savevm section type 5
>>>>>>>>>>> Error -22 while loading VM state
>>>>>>>>>>
>>>>>>>>>> Please make this configurable -- either via configure or device
>>>>>>>>>> properties.  That avoids having to break existing configurations that
>>>>>>>>>> work without this patch.
>>>>>>
>>>>>> Since backwards migration is not supported upstream, wouldn't it be
>>>>>> easiest to just add support for the subsection marker and skipping to
>>>>>> the end of section in that downstream?
>>>>>
>>>>> Backwards and forwards migration need to be supported,
>>>>> customers told us repeatedly. So some downstreams support this
>>>>> and not supporting it upstream just means downstreams need
>>>>> to do their own thing.
>>>>>
>>>>> As importantly, ping-pong migration is the only
>>>>> reliable way to stress migration.
>>>>>
>>>>> So if we want to test cross-version we need it to work
>>>>> both way.
>>>>>
>>>>> Finally, the real issue and difficulty with cross-version migration is
>>>>> making VM behave in a backwards compatible way.  Serializing in a
>>>>> compatible way is a trivial problem, or would be if the code wasn't a
>>>>> mess :) Once you do the hard part, breaking migration because of the
>>>>> trivial serialization issue is just silly.  And special-casing forward
>>>>> migration does not make code simpler, it really only leads to
>>>>> proliferation of hacks and lack of symmetry.
>>>>>
>>>>> So yes it's a useful feature, and no not supporting it does
>>>>> not help anyway.
>>>>
>>>> It seems you misunderstand. I was not saying it's not useful.
>>>>
>>>> My point is that VMStateSubsections added in newer versions (and thus
>>>> not existing in older versions) need to be handled for any
>>>> VMState-converted devices. So why can't we make that work for virtio too?
>>>
>>> Sure.
>>> AFAIK the way to this works is by adding an "field_exists" callback,
>>> and not sending the section when we are in a compat mode.
>>
>> OK, but upstream always sends the latest version today.
>> So isn't that
>> just two ifs that you would need to add in save and load functions added
>> here for the downstream? x86_64 is unaffected from ppc's endianness
>> changes and you still do ppc64 BE, so there's no real functional problem
>> for RHEL, is there?
> 
> I'm sorry I don't understand what you are saying here.
> Simply put, if you specify a compatible -M then your
> device should behave, and migrate, exactly like and old
> qemu did.

Whatever the version_id fields are for, upstream QEMU *always* saves the
newest version_id format that it knows. There is no mechanism that I'm
aware of in upstream QEMU for migrating device fields dependent on -M.
So a device in QEMU only migrates exactly like an old QEMU did if
neither fields nor subsections were added.

Subsections are usually migrated based on whether the state differs from
the default state (didn't check whether the final patch does this
correctly here, but doesn't matter for 1/8 concept). So for x86 the
subsection should *never* get written and thus not be a problem for you.
For ppc64 it should not get written either, unless you care about
migration from SLES12 or upstream ppc64le to RHEL ppc64.
As I understood the IRC discussion Alex and me had with Greg about this,
this is copying the exact code VMState uses to write its subsections, so
there would be no forward migration problems at all once we convert to
VMState and VMStateSubsections. The only question remaining is how does
your downstream react when it encounters a subsection marker for
subsections it doesn't know - which is not something upstream can solve
for you unless you contribute backwards migration support upstream.

Don't forget that Greg is introducing subsection support *because of*
your backwards migration wish, so telling him not to add subsections
because of backwards migration is really weird! Either subsections work
with your downstream or they don't; if they don't, then you have much
larger problems than can be solved by blocking this one section for ppc.
Therefore I was saying if your downstream forgot to handle the case of a
non-VMState device having subsections, then it may be easier to fix
exactly that than make Greg jump through more hoops here for a
theoretical problem you are unlikely to encounter on your downstream.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15 12:00                       ` Andreas Färber
@ 2014-05-15 12:20                         ` Michael S. Tsirkin
  2014-05-15 13:47                           ` Markus Armbruster
  2014-05-15 13:49                         ` Greg Kurz
  1 sibling, 1 reply; 68+ messages in thread
From: Michael S. Tsirkin @ 2014-05-15 12:20 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	Alexander Graf, qemu-devel, Anthony Liguori, Amit Shah,
	Paolo Bonzini, Greg Kurz

On Thu, May 15, 2014 at 02:00:15PM +0200, Andreas Färber wrote:
> Am 15.05.2014 12:16, schrieb Michael S. Tsirkin:
> > On Thu, May 15, 2014 at 12:11:12PM +0200, Andreas Färber wrote:
> >> Am 15.05.2014 12:03, schrieb Michael S. Tsirkin:
> >>> On Thu, May 15, 2014 at 11:58:25AM +0200, Andreas Färber wrote:
> >>>> Am 15.05.2014 11:52, schrieb Michael S. Tsirkin:
> >>>>> On Thu, May 15, 2014 at 11:20:18AM +0200, Andreas Färber wrote:
> >>>>>> Am 15.05.2014 09:04, schrieb Greg Kurz:
> >>>>>>> On Thu, 15 May 2014 12:16:35 +0530
> >>>>>>> Amit Shah <amit.shah@redhat.com> wrote:
> >>>>>>>> On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
> >>>>>>>>> On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
> >>>>>>>>>> On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> >>>>>>>>>>> Since each virtio device is streamed in its own section, the idea is to
> >>>>>>>>>>> stream subsections between the end of the device section and the start
> >>>>>>>>>>> of the next sections. This allows an older QEMU to complain and exit
> >>>>>>>>>>> when fed with subsections:
> >>>>>>>>>>>
> >>>>>>>>>>> Unknown savevm section type 5
> >>>>>>>>>>> Error -22 while loading VM state
> >>>>>>>>>>
> >>>>>>>>>> Please make this configurable -- either via configure or device
> >>>>>>>>>> properties.  That avoids having to break existing configurations that
> >>>>>>>>>> work without this patch.
> >>>>>>
> >>>>>> Since backwards migration is not supported upstream, wouldn't it be
> >>>>>> easiest to just add support for the subsection marker and skipping to
> >>>>>> the end of section in that downstream?
> >>>>>
> >>>>> Backwards and forwards migration need to be supported,
> >>>>> customers told us repeatedly. So some downstreams support this
> >>>>> and not supporting it upstream just means downstreams need
> >>>>> to do their own thing.
> >>>>>
> >>>>> As importantly, ping-pong migration is the only
> >>>>> reliable way to stress migration.
> >>>>>
> >>>>> So if we want to test cross-version we need it to work
> >>>>> both way.
> >>>>>
> >>>>> Finally, the real issue and difficulty with cross-version migration is
> >>>>> making VM behave in a backwards compatible way.  Serializing in a
> >>>>> compatible way is a trivial problem, or would be if the code wasn't a
> >>>>> mess :) Once you do the hard part, breaking migration because of the
> >>>>> trivial serialization issue is just silly.  And special-casing forward
> >>>>> migration does not make code simpler, it really only leads to
> >>>>> proliferation of hacks and lack of symmetry.
> >>>>>
> >>>>> So yes it's a useful feature, and no not supporting it does
> >>>>> not help anyway.
> >>>>
> >>>> It seems you misunderstand. I was not saying it's not useful.
> >>>>
> >>>> My point is that VMStateSubsections added in newer versions (and thus
> >>>> not existing in older versions) need to be handled for any
> >>>> VMState-converted devices. So why can't we make that work for virtio too?
> >>>
> >>> Sure.
> >>> AFAIK the way to this works is by adding an "field_exists" callback,
> >>> and not sending the section when we are in a compat mode.
> >>
> >> OK, but upstream always sends the latest version today.
> >> So isn't that
> >> just two ifs that you would need to add in save and load functions added
> >> here for the downstream? x86_64 is unaffected from ppc's endianness
> >> changes and you still do ppc64 BE, so there's no real functional problem
> >> for RHEL, is there?
> > 
> > I'm sorry I don't understand what you are saying here.
> > Simply put, if you specify a compatible -M then your
> > device should behave, and migrate, exactly like and old
> > qemu did.
> 
> Whatever the version_id fields are for, upstream QEMU *always* saves the
> newest version_id format that it knows. There is no mechanism that I'm
> aware of in upstream QEMU for migrating device fields dependent on -M.
> So a device in QEMU only migrates exactly like an old QEMU did if
> neither fields nor subsections were added.

Ah you are talking about version ids.
Yes, they can not support cross-version migration.
For this reason these should only be incremented as the last resort,
normally new fields (possibly subsections) should be added,
with field_exists callback checking for compatibility mode.


> Subsections are usually migrated based on whether the state differs from
> the default state (didn't check whether the final patch does this
> correctly here, but doesn't matter for 1/8 concept).

Yes, this would be fine, but it's not what's implemented IIUC.

> So for x86 the
> subsection should *never* get written and thus not be a problem for you.
> For ppc64 it should not get written either, unless you care about
> migration from SLES12 or upstream ppc64le to RHEL ppc64.
> As I understood the IRC discussion Alex and me had with Greg about this,
> this is copying the exact code VMState uses to write its subsections, so
> there would be no forward migration problems at all once we convert to
> VMState and VMStateSubsections. The only question remaining is how does
> your downstream react when it encounters a subsection marker for
> subsections it doesn't know - which is not something upstream can solve
> for you unless you contribute backwards migration support upstream.
> 
> Don't forget that Greg is introducing subsection support *because of*
> your backwards migration wish, so telling him not to add subsections
> because of backwards migration is really weird!

I don't really care whether it's a subsection or just plain
qemu_put_byte, but I'm not saying not to use subsection
either.
I am merely saying this should have "exists" callback checking
for compability mode for old machine types.

> Either subsections work
> with your downstream or they don't; if they don't, then you have much
> larger problems than can be solved by blocking this one section for ppc.
> Therefore I was saying if your downstream forgot to handle the case of a
> non-VMState device having subsections, then it may be easier to fix
> exactly that than make Greg jump through more hoops here for a
> theoretical problem you are unlikely to encounter on your downstream.
> 
> Andreas

I only mentioned downstream because it's a requirement coming from
users most of whom never interact with upstream.
There shouldn't be any difference and people building clusters
using upstream qemu will have same needs: clusters can't be
completely brought down for version upgrades, so need to be able to run
a mix of versions, and be able to migrate in any direction
while in the process.

> -- 
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15  9:52             ` Michael S. Tsirkin
  2014-05-15  9:58               ` Andreas Färber
@ 2014-05-15 12:33               ` Markus Armbruster
  2014-05-15 12:58                 ` Michael S. Tsirkin
  1 sibling, 1 reply; 68+ messages in thread
From: Markus Armbruster @ 2014-05-15 12:33 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	Alexander Graf, qemu-devel, Stefan Hajnoczi, Amit Shah,
	Paolo Bonzini, Andreas Färber, Greg Kurz

"Michael S. Tsirkin" <mst@redhat.com> writes:

> On Thu, May 15, 2014 at 11:20:18AM +0200, Andreas Färber wrote:
>> Am 15.05.2014 09:04, schrieb Greg Kurz:
>> > On Thu, 15 May 2014 12:16:35 +0530
>> > Amit Shah <amit.shah@redhat.com> wrote:
>> >> On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
>> >>> On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
>> >>>> On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
>> >>>>> Since each virtio device is streamed in its own section, the idea is to
>> >>>>> stream subsections between the end of the device section and the start
>> >>>>> of the next sections. This allows an older QEMU to complain and exit
>> >>>>> when fed with subsections:
>> >>>>>
>> >>>>> Unknown savevm section type 5
>> >>>>> Error -22 while loading VM state
>> >>>>
>> >>>> Please make this configurable -- either via configure or device
>> >>>> properties.  That avoids having to break existing configurations that
>> >>>> work without this patch.
>> 
>> Since backwards migration is not supported upstream, wouldn't it be
>> easiest to just add support for the subsection marker and skipping to
>> the end of section in that downstream?
>
> Backwards and forwards migration need to be supported,
> customers told us repeatedly.

Can I have world peace and a pony with that?

Given the current state of things, attempting to support backward
migration is trying to run before you can walk.  We need to put
migration on a more solid footing first.

The migration format is crap, and needs to be replaced.

Reasoning on migration compatibility is entirely manual.

Systematic testing of migration compatibility is done downstream.

Fortunately, there's progress being made on all of the above.  Let's not
sabotage it by biting off yet another mouthful.

>                               So some downstreams support this
> and not supporting it upstream just means downstreams need
> to do their own thing.
>
> As importantly, ping-pong migration is the only
> reliable way to stress migration.
>
> So if we want to test cross-version we need it to work
> both way.

Non sequitur.

> Finally, the real issue and difficulty with cross-version migration is
> making VM behave in a backwards compatible way.  Serializing in a
> compatible way is a trivial problem, or would be if the code wasn't a
> mess :)

However, it is.

>         Once you do the hard part, breaking migration because of the
> trivial serialization issue is just silly.  And special-casing forward
> migration does not make code simpler, it really only leads to
> proliferation of hacks and lack of symmetry.

Bold claim; citation needed.

> So yes it's a useful feature, and no not supporting it does
> not help anyway.

Nobody denies reliable backward migration would be useful.  However,
attempting to do every useful feature at once just because they're all
useful is foolish.

Treating backward migration as strictly secondary concern while we're up
to the ass in other alligators *can* help, by letting us focus on the
said other alligators.

I'm not opposed to coding things in ways that help backward migration.
Speaking of "support", however, is clearly premature and misleading.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15 12:33               ` Markus Armbruster
@ 2014-05-15 12:58                 ` Michael S. Tsirkin
  2014-05-15 13:35                   ` Greg Kurz
  0 siblings, 1 reply; 68+ messages in thread
From: Michael S. Tsirkin @ 2014-05-15 12:58 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	Alexander Graf, qemu-devel, Stefan Hajnoczi, Amit Shah,
	Paolo Bonzini, Andreas Färber, Greg Kurz

On Thu, May 15, 2014 at 02:33:47PM +0200, Markus Armbruster wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> 
> > On Thu, May 15, 2014 at 11:20:18AM +0200, Andreas Färber wrote:
> >> Am 15.05.2014 09:04, schrieb Greg Kurz:
> >> > On Thu, 15 May 2014 12:16:35 +0530
> >> > Amit Shah <amit.shah@redhat.com> wrote:
> >> >> On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
> >> >>> On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
> >> >>>> On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> >> >>>>> Since each virtio device is streamed in its own section, the idea is to
> >> >>>>> stream subsections between the end of the device section and the start
> >> >>>>> of the next sections. This allows an older QEMU to complain and exit
> >> >>>>> when fed with subsections:
> >> >>>>>
> >> >>>>> Unknown savevm section type 5
> >> >>>>> Error -22 while loading VM state
> >> >>>>
> >> >>>> Please make this configurable -- either via configure or device
> >> >>>> properties.  That avoids having to break existing configurations that
> >> >>>> work without this patch.
> >> 
> >> Since backwards migration is not supported upstream, wouldn't it be
> >> easiest to just add support for the subsection marker and skipping to
> >> the end of section in that downstream?
> >
> > Backwards and forwards migration need to be supported,
> > customers told us repeatedly.
> 
> Can I have world peace and a pony with that?
> 
> Given the current state of things, attempting to support backward
> migration is trying to run before you can walk.  We need to put
> migration on a more solid footing first.
> 
> The migration format is crap, and needs to be replaced.
> 
> Reasoning on migration compatibility is entirely manual.
> 
> Systematic testing of migration compatibility is done downstream.
> 
> Fortunately, there's progress being made on all of the above.  Let's not
> sabotage it by biting off yet another mouthful.
> 
> >                               So some downstreams support this
> > and not supporting it upstream just means downstreams need
> > to do their own thing.
> >
> > As importantly, ping-pong migration is the only
> > reliable way to stress migration.
> >
> > So if we want to test cross-version we need it to work
> > both way.
> 
> Non sequitur.
> 
> > Finally, the real issue and difficulty with cross-version migration is
> > making VM behave in a backwards compatible way.  Serializing in a
> > compatible way is a trivial problem, or would be if the code wasn't a
> > mess :)
> 
> However, it is.
> 
> >         Once you do the hard part, breaking migration because of the
> > trivial serialization issue is just silly.  And special-casing forward
> > migration does not make code simpler, it really only leads to
> > proliferation of hacks and lack of symmetry.
> 
> Bold claim; citation needed.

You are asking for examples of ugly assymetry?
It's easy: grep for .load_state_old.
You have here a bunch of functions loading format that qemu
can no longer produce, with any set of flags.
The only way to make them run is to install two qemu versions side by side,
save from old one and load in the new one.
What, would you guess, is the chance that they actually work?

I'm going to send a patch removing all this stuff, it's
effectively dead code, but this is just one, biggest example.


> > So yes it's a useful feature, and no not supporting it does
> > not help anyway.
> 
> Nobody denies reliable backward migration would be useful.  However,
> attempting to do every useful feature at once just because they're all
> useful is foolish.
> 
> Treating backward migration as strictly secondary concern while we're up
> to the ass in other alligators *can* help, by letting us focus on the
> said other alligators.
> 
> I'm not opposed to coding things in ways that help backward migration.
> Speaking of "support", however, is clearly premature and misleading.


I agree it's a secondary concern upstream and your comments really apply
to migration generally.  We can't claim that it's properly supported by
the upstream QEMU.

I am merely asking that we don't break cross-version migration
intentionally. I and others also try fix it when we notice it's broken.
In particular I'm not asking that submitters test it.

-- 
MST

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15 12:58                 ` Michael S. Tsirkin
@ 2014-05-15 13:35                   ` Greg Kurz
  0 siblings, 0 replies; 68+ messages in thread
From: Greg Kurz @ 2014-05-15 13:35 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	qemu-devel, Markus Armbruster, Alexander Graf, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini, Andreas Färber

On Thu, 15 May 2014 15:58:16 +0300
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Thu, May 15, 2014 at 02:33:47PM +0200, Markus Armbruster wrote:
> > "Michael S. Tsirkin" <mst@redhat.com> writes:
> > 
> > > On Thu, May 15, 2014 at 11:20:18AM +0200, Andreas Färber wrote:
> > >> Am 15.05.2014 09:04, schrieb Greg Kurz:
> > >> > On Thu, 15 May 2014 12:16:35 +0530
> > >> > Amit Shah <amit.shah@redhat.com> wrote:
> > >> >> On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
> > >> >>> On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
> > >> >>>> On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> > >> >>>>> Since each virtio device is streamed in its own section, the idea is to
> > >> >>>>> stream subsections between the end of the device section and the start
> > >> >>>>> of the next sections. This allows an older QEMU to complain and exit
> > >> >>>>> when fed with subsections:
> > >> >>>>>
> > >> >>>>> Unknown savevm section type 5
> > >> >>>>> Error -22 while loading VM state
> > >> >>>>
> > >> >>>> Please make this configurable -- either via configure or device
> > >> >>>> properties.  That avoids having to break existing configurations that
> > >> >>>> work without this patch.
> > >> 
> > >> Since backwards migration is not supported upstream, wouldn't it be
> > >> easiest to just add support for the subsection marker and skipping to
> > >> the end of section in that downstream?
> > >
> > > Backwards and forwards migration need to be supported,
> > > customers told us repeatedly.
> > 
> > Can I have world peace and a pony with that?
> > 
> > Given the current state of things, attempting to support backward
> > migration is trying to run before you can walk.  We need to put
> > migration on a more solid footing first.
> > 
> > The migration format is crap, and needs to be replaced.
> > 
> > Reasoning on migration compatibility is entirely manual.
> > 
> > Systematic testing of migration compatibility is done downstream.
> > 
> > Fortunately, there's progress being made on all of the above.  Let's not
> > sabotage it by biting off yet another mouthful.
> > 
> > >                               So some downstreams support this
> > > and not supporting it upstream just means downstreams need
> > > to do their own thing.
> > >
> > > As importantly, ping-pong migration is the only
> > > reliable way to stress migration.
> > >
> > > So if we want to test cross-version we need it to work
> > > both way.
> > 
> > Non sequitur.
> > 
> > > Finally, the real issue and difficulty with cross-version migration is
> > > making VM behave in a backwards compatible way.  Serializing in a
> > > compatible way is a trivial problem, or would be if the code wasn't a
> > > mess :)
> > 
> > However, it is.
> > 
> > >         Once you do the hard part, breaking migration because of the
> > > trivial serialization issue is just silly.  And special-casing forward
> > > migration does not make code simpler, it really only leads to
> > > proliferation of hacks and lack of symmetry.
> > 
> > Bold claim; citation needed.
> 
> You are asking for examples of ugly assymetry?
> It's easy: grep for .load_state_old.
> You have here a bunch of functions loading format that qemu
> can no longer produce, with any set of flags.
> The only way to make them run is to install two qemu versions side by side,
> save from old one and load in the new one.
> What, would you guess, is the chance that they actually work?
> 
> I'm going to send a patch removing all this stuff, it's
> effectively dead code, but this is just one, biggest example.
> 
> 
> > > So yes it's a useful feature, and no not supporting it does
> > > not help anyway.
> > 
> > Nobody denies reliable backward migration would be useful.  However,
> > attempting to do every useful feature at once just because they're all
> > useful is foolish.
> > 
> > Treating backward migration as strictly secondary concern while we're up
> > to the ass in other alligators *can* help, by letting us focus on the
> > said other alligators.
> > 
> > I'm not opposed to coding things in ways that help backward migration.
> > Speaking of "support", however, is clearly premature and misleading.
> 
> 
> I agree it's a secondary concern upstream and your comments really apply
> to migration generally.  We can't claim that it's properly supported by
> the upstream QEMU.
> 
> I am merely asking that we don't break cross-version migration
> intentionally. I and others also try fix it when we notice it's broken.
> In particular I'm not asking that submitters test it.
> 

Michael,

I fully understand your concern and I will take some time to study backward
migration. This being said, the task is not that simple and I will need
several rounds to have the job done...

Cheers.

-- 
Gregory Kurz                                     kurzgreg@fr.ibm.com
                                                 gkurz@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15 12:20                         ` Michael S. Tsirkin
@ 2014-05-15 13:47                           ` Markus Armbruster
  0 siblings, 0 replies; 68+ messages in thread
From: Markus Armbruster @ 2014-05-15 13:47 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	Alexander Graf, qemu-devel, Stefan Hajnoczi, Amit Shah,
	Paolo Bonzini, Andreas Färber, Greg Kurz

"Michael S. Tsirkin" <mst@redhat.com> writes:

> On Thu, May 15, 2014 at 02:00:15PM +0200, Andreas Färber wrote:
[...]
>> Either subsections work
>> with your downstream or they don't; if they don't, then you have much
>> larger problems than can be solved by blocking this one section for ppc.
>> Therefore I was saying if your downstream forgot to handle the case of a
>> non-VMState device having subsections, then it may be easier to fix
>> exactly that than make Greg jump through more hoops here for a
>> theoretical problem you are unlikely to encounter on your downstream.
>> 
>> Andreas
>
> I only mentioned downstream because it's a requirement coming from
> users most of whom never interact with upstream.
> There shouldn't be any difference and people building clusters
> using upstream qemu will have same needs: clusters can't be
> completely brought down for version upgrades, so need to be able to run
> a mix of versions, and be able to migrate in any direction
> while in the process.

Okay, I'll have world peace, a cluster, and a pony, please.

Seriously, no matter how much your cluster management software wants to
migrate backwards, it's not going to work reliably, except in very
narrow cases, at least not today.

If you want to migrate freely, stick to an ultra-stable (downstream)
branch of QEMU.  When that branch EOLs, scrap your cluster and start
over.

If you want to upgrade QEMU in more interesting ways, keep track of what
your cluster nodes are running, and migrate only forward.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15 12:00                       ` Andreas Färber
  2014-05-15 12:20                         ` Michael S. Tsirkin
@ 2014-05-15 13:49                         ` Greg Kurz
  1 sibling, 0 replies; 68+ messages in thread
From: Greg Kurz @ 2014-05-15 13:49 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	Michael S. Tsirkin, Alexander Graf, qemu-devel, Anthony Liguori,
	Amit Shah, Paolo Bonzini

On Thu, 15 May 2014 14:00:15 +0200
Andreas Färber <afaerber@suse.de> wrote:

> Am 15.05.2014 12:16, schrieb Michael S. Tsirkin:
> > On Thu, May 15, 2014 at 12:11:12PM +0200, Andreas Färber wrote:
> >> Am 15.05.2014 12:03, schrieb Michael S. Tsirkin:
> >>> On Thu, May 15, 2014 at 11:58:25AM +0200, Andreas Färber wrote:
> >>>> Am 15.05.2014 11:52, schrieb Michael S. Tsirkin:
> >>>>> On Thu, May 15, 2014 at 11:20:18AM +0200, Andreas Färber wrote:
> >>>>>> Am 15.05.2014 09:04, schrieb Greg Kurz:
> >>>>>>> On Thu, 15 May 2014 12:16:35 +0530
> >>>>>>> Amit Shah <amit.shah@redhat.com> wrote:
> >>>>>>>> On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
> >>>>>>>>> On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
> >>>>>>>>>> On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> >>>>>>>>>>> Since each virtio device is streamed in its own section, the idea is to
> >>>>>>>>>>> stream subsections between the end of the device section and the start
> >>>>>>>>>>> of the next sections. This allows an older QEMU to complain and exit
> >>>>>>>>>>> when fed with subsections:
> >>>>>>>>>>>
> >>>>>>>>>>> Unknown savevm section type 5
> >>>>>>>>>>> Error -22 while loading VM state
> >>>>>>>>>>
> >>>>>>>>>> Please make this configurable -- either via configure or device
> >>>>>>>>>> properties.  That avoids having to break existing configurations that
> >>>>>>>>>> work without this patch.
> >>>>>>
> >>>>>> Since backwards migration is not supported upstream, wouldn't it be
> >>>>>> easiest to just add support for the subsection marker and skipping to
> >>>>>> the end of section in that downstream?
> >>>>>
> >>>>> Backwards and forwards migration need to be supported,
> >>>>> customers told us repeatedly. So some downstreams support this
> >>>>> and not supporting it upstream just means downstreams need
> >>>>> to do their own thing.
> >>>>>
> >>>>> As importantly, ping-pong migration is the only
> >>>>> reliable way to stress migration.
> >>>>>
> >>>>> So if we want to test cross-version we need it to work
> >>>>> both way.
> >>>>>
> >>>>> Finally, the real issue and difficulty with cross-version migration is
> >>>>> making VM behave in a backwards compatible way.  Serializing in a
> >>>>> compatible way is a trivial problem, or would be if the code wasn't a
> >>>>> mess :) Once you do the hard part, breaking migration because of the
> >>>>> trivial serialization issue is just silly.  And special-casing forward
> >>>>> migration does not make code simpler, it really only leads to
> >>>>> proliferation of hacks and lack of symmetry.
> >>>>>
> >>>>> So yes it's a useful feature, and no not supporting it does
> >>>>> not help anyway.
> >>>>
> >>>> It seems you misunderstand. I was not saying it's not useful.
> >>>>
> >>>> My point is that VMStateSubsections added in newer versions (and thus
> >>>> not existing in older versions) need to be handled for any
> >>>> VMState-converted devices. So why can't we make that work for virtio too?
> >>>
> >>> Sure.
> >>> AFAIK the way to this works is by adding an "field_exists" callback,
> >>> and not sending the section when we are in a compat mode.
> >>
> >> OK, but upstream always sends the latest version today.
> >> So isn't that
> >> just two ifs that you would need to add in save and load functions added
> >> here for the downstream? x86_64 is unaffected from ppc's endianness
> >> changes and you still do ppc64 BE, so there's no real functional problem
> >> for RHEL, is there?
> > 
> > I'm sorry I don't understand what you are saying here.
> > Simply put, if you specify a compatible -M then your
> > device should behave, and migrate, exactly like and old
> > qemu did.
> 
> Whatever the version_id fields are for, upstream QEMU *always* saves the
> newest version_id format that it knows. There is no mechanism that I'm
> aware of in upstream QEMU for migrating device fields dependent on -M.
> So a device in QEMU only migrates exactly like an old QEMU did if
> neither fields nor subsections were added.
> 
> Subsections are usually migrated based on whether the state differs from
> the default state (didn't check whether the final patch does this
> correctly here, but doesn't matter for 1/8 concept). So for x86 the

For this first round, subsections are sent unconditionally. I will change
that.

> subsection should *never* get written and thus not be a problem for you.
> For ppc64 it should not get written either, unless you care about
> migration from SLES12 or upstream ppc64le to RHEL ppc64.
> As I understood the IRC discussion Alex and me had with Greg about this,
> this is copying the exact code VMState uses to write its subsections, so
> there would be no forward migration problems at all once we convert to
> VMState and VMStateSubsections. The only question remaining is how does
> your downstream react when it encounters a subsection marker for
> subsections it doesn't know - which is not something upstream can solve
> for you unless you contribute backwards migration support upstream.
> 
> Don't forget that Greg is introducing subsection support *because of*
> your backwards migration wish, so telling him not to add subsections
> because of backwards migration is really weird! Either subsections work
> with your downstream or they don't; if they don't, then you have much
> larger problems than can be solved by blocking this one section for ppc.
> Therefore I was saying if your downstream forgot to handle the case of a
> non-VMState device having subsections, then it may be easier to fix
> exactly that than make Greg jump through more hoops here for a
> theoretical problem you are unlikely to encounter on your downstream.
> 
> Andreas
> 



-- 
Gregory Kurz                                     kurzgreg@fr.ibm.com
                                                 gkurz@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15 10:08             ` Greg Kurz
  2014-05-15 10:12               ` Michael S. Tsirkin
  2014-05-15 10:16               ` Greg Kurz
@ 2014-05-16  9:14               ` Fam Zheng
  2014-05-16  9:22                 ` Andreas Färber
  2 siblings, 1 reply; 68+ messages in thread
From: Fam Zheng @ 2014-05-16  9:14 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Kevin Wolf, Anthony Liguori, Michael S. Tsirkin, Juan Quintela,
	Alexander Graf, qemu-devel, Stefan Hajnoczi, Amit Shah,
	Paolo Bonzini, Andreas Färber

On Thu, 05/15 12:08, Greg Kurz wrote:
> > > The main problem I see is that virtio sucks: as you see in patch 8, we have
> > > to be careful not to call vring or virtqueue stuff before the device knows
> > > its endianness or it breaks... I need to study how the virtio-bus gets
> > > migrated to ensure the endian section is streamed before the devices.
> > 
> > There is no ordering guarantee. The state needs to be migrated in the
> > device or bus where it sits, if post-load processing is required; i.e.,
> > if it's in VirtIODevice then something like this series, if it were on
> > VirtioBus exclusively (device asking bus for its endianness each time
> > and does not do post-load stuff) then endianness could be migrated as a
> > new bus section. Not sure if that would help the "broken" state though?
> > 
> 
> IIRW the "broken" state was proposed as a per-device property...

Yes.

> 
> Fam,
> 
> Do you have plans about the "broken" property ? Is it still needed ?

It is on my todo list. We will need it. I hope the next revision of virtio
specification has a proper device status bit that the "broken" property can
utilize.

Thanks,
Fam

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-16  9:14               ` Fam Zheng
@ 2014-05-16  9:22                 ` Andreas Färber
  2014-05-16  9:40                   ` Fam Zheng
  0 siblings, 1 reply; 68+ messages in thread
From: Andreas Färber @ 2014-05-16  9:22 UTC (permalink / raw)
  To: Fam Zheng, Greg Kurz
  Cc: Kevin Wolf, Anthony Liguori, Michael S. Tsirkin, Juan Quintela,
	qemu-devel, Alexander Graf, Stefan Hajnoczi, Amit Shah,
	Paolo Bonzini

Am 16.05.2014 11:14, schrieb Fam Zheng:
> On Thu, 05/15 12:08, Greg Kurz wrote:
>>>> The main problem I see is that virtio sucks: as you see in patch 8, we have
>>>> to be careful not to call vring or virtqueue stuff before the device knows
>>>> its endianness or it breaks... I need to study how the virtio-bus gets
>>>> migrated to ensure the endian section is streamed before the devices.
>>>
>>> There is no ordering guarantee. The state needs to be migrated in the
>>> device or bus where it sits, if post-load processing is required; i.e.,
>>> if it's in VirtIODevice then something like this series, if it were on
>>> VirtioBus exclusively (device asking bus for its endianness each time
>>> and does not do post-load stuff) then endianness could be migrated as a
>>> new bus section. Not sure if that would help the "broken" state though?
>>>
>>
>> IIRW the "broken" state was proposed as a per-device property...
> 
> Yes.

Sure, and that makes sense, but we do have a 1:1 relation of bus/device,
or does virtio-mmio support more? If device doesn't work for some
reason, we could (mis)use the bus as fallback then.

Cheers,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-16  9:22                 ` Andreas Färber
@ 2014-05-16  9:40                   ` Fam Zheng
  2014-05-16  9:48                     ` Greg Kurz
  0 siblings, 1 reply; 68+ messages in thread
From: Fam Zheng @ 2014-05-16  9:40 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Kevin Wolf, Stefan Hajnoczi, Juan Quintela, Michael S. Tsirkin,
	qemu-devel, Alexander Graf, Anthony Liguori, Amit Shah,
	Paolo Bonzini, Greg Kurz

On Fri, 05/16 11:22, Andreas Färber wrote:
> Am 16.05.2014 11:14, schrieb Fam Zheng:
> > On Thu, 05/15 12:08, Greg Kurz wrote:
> >>>> The main problem I see is that virtio sucks: as you see in patch 8, we have
> >>>> to be careful not to call vring or virtqueue stuff before the device knows
> >>>> its endianness or it breaks... I need to study how the virtio-bus gets
> >>>> migrated to ensure the endian section is streamed before the devices.
> >>>
> >>> There is no ordering guarantee. The state needs to be migrated in the
> >>> device or bus where it sits, if post-load processing is required; i.e.,
> >>> if it's in VirtIODevice then something like this series, if it were on
> >>> VirtioBus exclusively (device asking bus for its endianness each time
> >>> and does not do post-load stuff) then endianness could be migrated as a
> >>> new bus section. Not sure if that would help the "broken" state though?
> >>>
> >>
> >> IIRW the "broken" state was proposed as a per-device property...
> > 
> > Yes.
> 
> Sure, and that makes sense, but we do have a 1:1 relation of bus/device,
> or does virtio-mmio support more? If device doesn't work for some
> reason, we could (mis)use the bus as fallback then.
> 

FWIW, I just realized that "broken" may be loaded from device status bit in the
future, so we don't need to migrate the field separately:

https://tools.oasis-open.org/issues/browse/VIRTIO-98

Fam

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-16  9:40                   ` Fam Zheng
@ 2014-05-16  9:48                     ` Greg Kurz
  0 siblings, 0 replies; 68+ messages in thread
From: Greg Kurz @ 2014-05-16  9:48 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, Stefan Hajnoczi, Juan Quintela, Michael S. Tsirkin,
	qemu-devel, Alexander Graf, Anthony Liguori, Amit Shah,
	Paolo Bonzini, Andreas Färber

On Fri, 16 May 2014 17:40:00 +0800
Fam Zheng <famz@redhat.com> wrote:
> On Fri, 05/16 11:22, Andreas Färber wrote:
> > Am 16.05.2014 11:14, schrieb Fam Zheng:
> > > On Thu, 05/15 12:08, Greg Kurz wrote:
> > >>>> The main problem I see is that virtio sucks: as you see in patch 8, we have
> > >>>> to be careful not to call vring or virtqueue stuff before the device knows
> > >>>> its endianness or it breaks... I need to study how the virtio-bus gets
> > >>>> migrated to ensure the endian section is streamed before the devices.
> > >>>
> > >>> There is no ordering guarantee. The state needs to be migrated in the
> > >>> device or bus where it sits, if post-load processing is required; i.e.,
> > >>> if it's in VirtIODevice then something like this series, if it were on
> > >>> VirtioBus exclusively (device asking bus for its endianness each time
> > >>> and does not do post-load stuff) then endianness could be migrated as a
> > >>> new bus section. Not sure if that would help the "broken" state though?
> > >>>
> > >>
> > >> IIRW the "broken" state was proposed as a per-device property...
> > > 
> > > Yes.
> > 
> > Sure, and that makes sense, but we do have a 1:1 relation of bus/device,
> > or does virtio-mmio support more? If device doesn't work for some
> > reason, we could (mis)use the bus as fallback then.
> > 
> 
> FWIW, I just realized that "broken" may be loaded from device status bit in the
> future, so we don't need to migrate the field separately:
> 
> https://tools.oasis-open.org/issues/browse/VIRTIO-98
> 
> Fam
> 

Oh, if you are targetting virtio 1.0... all the virtio endian stuff I am
working on is for legacy virtio. Unless you plan to do something for
legacy, I guess our objectives differ.

Cheers.

-- 
Gregory Kurz                                     kurzgreg@fr.ibm.com
                                                 gkurz@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream
  2014-05-15  7:04         ` Greg Kurz
  2014-05-15  9:20           ` Andreas Färber
@ 2014-05-17 18:29           ` Michael S. Tsirkin
  1 sibling, 0 replies; 68+ messages in thread
From: Michael S. Tsirkin @ 2014-05-17 18:29 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	Alexander Graf, qemu-devel, Anthony Liguori, Amit Shah,
	Paolo Bonzini, Andreas Färber

On Thu, May 15, 2014 at 09:04:49AM +0200, Greg Kurz wrote:
> On Thu, 15 May 2014 12:16:35 +0530
> Amit Shah <amit.shah@redhat.com> wrote:
> > On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
> > > On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
> > > > On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
> > > > > There is a need to add some more fields to VirtIODevice that should be
> > > > > migrated (broken status, endianness). The problem is that we do not
> > > > > want to break compatibility while adding a new feature... This issue has
> > > > > been addressed in the generic VMState code with the use of optional
> > > > > subsections. As a *temporary* alternative to port the whole virtio
> > > > > migration code to VMState, this patch mimics a similar subsectionning
> > > > > ability for virtio.
> > 
> > BTW Greg, do you plan on working on vmstate for virtio?
> > 
> 
> Yes.
> 
> > > > > Since each virtio device is streamed in its own section, the idea is to
> > > > > stream subsections between the end of the device section and the start
> > > > > of the next sections. This allows an older QEMU to complain and exit
> > > > > when fed with subsections:
> > > > > 
> > > > > Unknown savevm section type 5
> > > > > Error -22 while loading VM state
> > > > 
> > > > Please make this configurable -- either via configure or device
> > > > properties.  That avoids having to break existing configurations that
> > > > work without this patch.
> > > > 
> > > > > All users of virtio_load()/virtio_save() need to be patched because the
> > > > > subsections are streamed AFTER the device itself.
> > > > 
> > > > Since all have the same fixup, I'm wondering if a new section can be
> > > > added to the virtio-bus itself, which gets propagated to all devices
> > > > upon load in the dest.
> > > 
> > > This calls for a way for devices to inherit properties from the bus,
> > > which doesn't exist ATM.
> > > Fine but let's not hold up this patchset because of this.
> > 
> > No, only suggestion is to add a migration section in the bus, and then
> > it's easier to do this in the post-migrate functions for each device
> > -- so only one new section gets introduced instead of all devices
> > being modified to send a new subsection.
> > 
> 
> The main problem I see is that virtio sucks: as you see in patch 8, we have
> to be careful not to call vring or virtqueue stuff before the device knows
> its endianness or it breaks...

I see.
I think it's all not a big deal.
People here suggested many ways to deal with it, but IMHO the
only thing that does matter is the functionality.

Functionality-wise I think the only two things that were mentioned were
- decent chance that migration to a wrong machine
  version fails gracefully
- migrate in a compatible way with correct legacy machine version

> I need to study how the virtio-bus gets
> migrated to ensure the endian section is streamed before the devices.
> 
> > 		Amit
> > 
> 
> Thanks.
> 
> -- 
> Gregory Kurz                                     kurzgreg@fr.ibm.com
>                                                  gkurz@linux.vnet.ibm.com
> Software Engineer @ IBM/Meiosys                  http://www.ibm.com
> Tel +33 (0)562 165 496
> 
> "Anarchy is about taking complete responsibility for yourself."
>         Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
       [not found]       ` <538708FA.4070309@redhat.com>
@ 2014-06-12  7:43         ` Greg Kurz
  2014-06-12  7:54           ` Michael S. Tsirkin
  2014-06-12  8:55           ` Paolo Bonzini
  0 siblings, 2 replies; 68+ messages in thread
From: Greg Kurz @ 2014-06-12  7:43 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Michael S. Tsirkin,
	Juan Quintela, Alexander Graf, qemu-devel, Anthony Liguori,
	Amit Shah, Andreas Färber

On Thu, 29 May 2014 12:16:26 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 29/05/2014 11:12, Greg Kurz ha scritto:
> > int virtio_load(VirtIODevice *vdev, QEMUFile *f)
> > {
> > [...]
> >             nheads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx;
> >                        ^^^^^^^^^^^
> >             /* Check it isn't doing very strange things with descriptor numbers. */
> >             if (nheads > vdev->vq[i].vring.num) {
> > [...]
> > }
> >
> > and
> >
> > static int virtio_serial_load(QEMUFile *f, void *opaque, int version_id)
> > {
> > [...]
> >     /* The config space */
> >     qemu_get_be16s(f, &s->config.cols);
> >     qemu_get_be16s(f, &s->config.rows);
> >
> >     qemu_get_be32s(f, &max_nr_ports);
> >     tswap32s(&max_nr_ports);
> >      ^^^^^^
> >     if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> > [...]
> > }
> >
> > If we stream subsections after the device descriptor as it is done
> > in VMState, these two will break because the device endian is stale.
> >
> > The first one can be easily dealt with: just defer the sanity check
> > to a post_load function.
> 
> Good, we're lucky here.
> 
> > The second is a bit more tricky: the
> > virtio serial migrates its config space (target endian) and the
> > active ports bitmap. The load code assumes max_nr_ports from the
> > config space tells the size of the ports bitmap... that means the
> > virtio migration protocol is also contaminated by target endianness. :-\
> 
> Ouch.
> 
> I guess we could break migration in the case of host endianness != 
> target endianness, like this:
> 
>      /* These three used to be fetched in target endianness and then
>       * stored as big endian.  It ended up as little endian if host and
>       * target endianness doesn't match.
>       *
>       * Starting with qemu 2.1, we always store as big endian.  The
>       * version wasn't bumped to avoid breaking backwards compatibility.
>       * We check the validity of max_nr_ports, and the incorrect-
>       * endianness max_nr_ports will be huge, which will abort migration
>       * anyway.
>       */
>      uint16_t cols = tswap16(s->config.cols);
>      uint16_t rows = tswap16(s->config.rows);
>      uint32_t max_nr_ports = tswap32(s->config.max_nr_ports);
> 
>      qemu_put_be16s(f, &cols);
>      qemu_put_be16s(f, &rows);
>      qemu_put_be32s(f, &max_nr_ports);
> 
> ...
> 
>      uint16_t cols, rows;
> 
>      qemu_get_be16s(f, &cols);
>      qemu_get_be16s(f, &rows);
>      qemu_get_be32s(f, &max_nr_ports);
> 
>      /* Convert back to target endianness when storing into the config
>       * space.
>       */

Paolo,

The patch set to support endian changing targets adds a device_endian
field to the VirtIODevice structure to be used instead of the default
target endianness as it happens with tswap() macros. It also introduces
virtio_tswap() helpers for this purpose, but they can only be used when
the device_endian field has been restored... in a subsection after the
device descriptor... :-\
If the scenario is ppc64le-on-ppc64: tswap() macros don't do anything
and we cannot convert back to LE...

>      s->config.cols = tswap16(cols);
>      s->config.rows = tswap16(rows);

Since cols and rows are not involved in the protocol, we can safely
defer the conversion to post load.

>      if (max_nr_ports > tswap32(s->config.max_nr_ports) {
>          ...
>      }
> 

Since we know that 0 < max_nr_ports < 32,  is it acceptable to guess
the correct endianness with a heuristic ?

if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
	max_nr_ports = bswap32(max_nr_ports);
}

if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
	return -EINVAL;
}

> > In the case the answer for above is "legacy virtio really sucks" then,
> > is it acceptable to not honor bug-compatibility with older versions and
> > fix the code ? :)
> 
> As long as the common cases don't break, yes.  The question is what are 
> the common cases.  Here I think the only non-obscure case that could 
> break is x86-on-PPC, and it's not that common.
> 
> Paolo
> 

Thanks.

-- 
Gregory Kurz                                     kurzgreg@fr.ibm.com
                                                 gkurz@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12  7:43         ` Greg Kurz
@ 2014-06-12  7:54           ` Michael S. Tsirkin
  2014-06-12  8:47             ` Greg Kurz
  2014-06-12  8:55             ` Alexander Graf
  2014-06-12  8:55           ` Paolo Bonzini
  1 sibling, 2 replies; 68+ messages in thread
From: Michael S. Tsirkin @ 2014-06-12  7:54 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	Alexander Graf, qemu-devel, Anthony Liguori, Amit Shah,
	Paolo Bonzini, Andreas Färber

On Thu, Jun 12, 2014 at 09:43:51AM +0200, Greg Kurz wrote:
> On Thu, 29 May 2014 12:16:26 +0200
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> > Il 29/05/2014 11:12, Greg Kurz ha scritto:
> > > int virtio_load(VirtIODevice *vdev, QEMUFile *f)
> > > {
> > > [...]
> > >             nheads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx;
> > >                        ^^^^^^^^^^^
> > >             /* Check it isn't doing very strange things with descriptor numbers. */
> > >             if (nheads > vdev->vq[i].vring.num) {
> > > [...]
> > > }
> > >
> > > and
> > >
> > > static int virtio_serial_load(QEMUFile *f, void *opaque, int version_id)
> > > {
> > > [...]
> > >     /* The config space */
> > >     qemu_get_be16s(f, &s->config.cols);
> > >     qemu_get_be16s(f, &s->config.rows);
> > >
> > >     qemu_get_be32s(f, &max_nr_ports);
> > >     tswap32s(&max_nr_ports);
> > >      ^^^^^^
> > >     if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> > > [...]
> > > }
> > >
> > > If we stream subsections after the device descriptor as it is done
> > > in VMState, these two will break because the device endian is stale.
> > >
> > > The first one can be easily dealt with: just defer the sanity check
> > > to a post_load function.
> > 
> > Good, we're lucky here.
> > 
> > > The second is a bit more tricky: the
> > > virtio serial migrates its config space (target endian) and the
> > > active ports bitmap. The load code assumes max_nr_ports from the
> > > config space tells the size of the ports bitmap... that means the
> > > virtio migration protocol is also contaminated by target endianness. :-\
> > 
> > Ouch.
> > 
> > I guess we could break migration in the case of host endianness != 
> > target endianness, like this:
> > 
> >      /* These three used to be fetched in target endianness and then
> >       * stored as big endian.  It ended up as little endian if host and
> >       * target endianness doesn't match.
> >       *
> >       * Starting with qemu 2.1, we always store as big endian.  The
> >       * version wasn't bumped to avoid breaking backwards compatibility.
> >       * We check the validity of max_nr_ports, and the incorrect-
> >       * endianness max_nr_ports will be huge, which will abort migration
> >       * anyway.
> >       */
> >      uint16_t cols = tswap16(s->config.cols);
> >      uint16_t rows = tswap16(s->config.rows);
> >      uint32_t max_nr_ports = tswap32(s->config.max_nr_ports);
> > 
> >      qemu_put_be16s(f, &cols);
> >      qemu_put_be16s(f, &rows);
> >      qemu_put_be32s(f, &max_nr_ports);
> > 
> > ...
> > 
> >      uint16_t cols, rows;
> > 
> >      qemu_get_be16s(f, &cols);
> >      qemu_get_be16s(f, &rows);
> >      qemu_get_be32s(f, &max_nr_ports);
> > 
> >      /* Convert back to target endianness when storing into the config
> >       * space.
> >       */
> 
> Paolo,
> 
> The patch set to support endian changing targets adds a device_endian
> field to the VirtIODevice structure to be used instead of the default
> target endianness as it happens with tswap() macros. It also introduces
> virtio_tswap() helpers for this purpose, but they can only be used when
> the device_endian field has been restored... in a subsection after the
> device descriptor... :-\

Store it earlier then, using plain put/get.
You can still add a section conditionally to cause
a cleaner failure in broken cross-version scenarios.

> If the scenario is ppc64le-on-ppc64: tswap() macros don't do anything
> and we cannot convert back to LE...
> 
> >      s->config.cols = tswap16(cols);
> >      s->config.rows = tswap16(rows);
> 
> Since cols and rows are not involved in the protocol, we can safely
> defer the conversion to post load.
> 
> >      if (max_nr_ports > tswap32(s->config.max_nr_ports) {
> >          ...
> >      }
> > 
> 
> Since we know that 0 < max_nr_ports < 32,  is it acceptable to guess
> the correct endianness with a heuristic ?
> 
> if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> 	max_nr_ports = bswap32(max_nr_ports);
> }
> 
> if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> 	return -EINVAL;
> }
> 
> > > In the case the answer for above is "legacy virtio really sucks" then,
> > > is it acceptable to not honor bug-compatibility with older versions and
> > > fix the code ? :)
> > 
> > As long as the common cases don't break, yes.  The question is what are 
> > the common cases.  Here I think the only non-obscure case that could 
> > break is x86-on-PPC, and it's not that common.
> > 
> > Paolo
> > 
> 
> Thanks.

One starts doubting whether all this hackery is worth it.  virtio 1.0
should be out real soon now, it makes everything LE so the problem goes
away. It's not like PPC LE is so popular that we must support old drivers
at all costs.  Won't time be better spent backporting virtio 1.0 drivers?


> -- 
> Gregory Kurz                                     kurzgreg@fr.ibm.com
>                                                  gkurz@linux.vnet.ibm.com
> Software Engineer @ IBM/Meiosys                  http://www.ibm.com
> Tel +33 (0)562 165 496
> 
> "Anarchy is about taking complete responsibility for yourself."
>         Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12  7:54           ` Michael S. Tsirkin
@ 2014-06-12  8:47             ` Greg Kurz
  2014-06-12  9:05               ` Michael S. Tsirkin
  2014-06-12  9:06               ` Alexander Graf
  2014-06-12  8:55             ` Alexander Graf
  1 sibling, 2 replies; 68+ messages in thread
From: Greg Kurz @ 2014-06-12  8:47 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	Alexander Graf, qemu-devel, Anthony Liguori, Amit Shah,
	Paolo Bonzini, Andreas Färber

On Thu, 12 Jun 2014 10:54:48 +0300
"Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Thu, Jun 12, 2014 at 09:43:51AM +0200, Greg Kurz wrote:
> > On Thu, 29 May 2014 12:16:26 +0200
> > Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > Il 29/05/2014 11:12, Greg Kurz ha scritto:
> > > > int virtio_load(VirtIODevice *vdev, QEMUFile *f)
> > > > {
> > > > [...]
> > > >             nheads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx;
> > > >                        ^^^^^^^^^^^
> > > >             /* Check it isn't doing very strange things with descriptor numbers. */
> > > >             if (nheads > vdev->vq[i].vring.num) {
> > > > [...]
> > > > }
> > > >
> > > > and
> > > >
> > > > static int virtio_serial_load(QEMUFile *f, void *opaque, int version_id)
> > > > {
> > > > [...]
> > > >     /* The config space */
> > > >     qemu_get_be16s(f, &s->config.cols);
> > > >     qemu_get_be16s(f, &s->config.rows);
> > > >
> > > >     qemu_get_be32s(f, &max_nr_ports);
> > > >     tswap32s(&max_nr_ports);
> > > >      ^^^^^^
> > > >     if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> > > > [...]
> > > > }
> > > >
> > > > If we stream subsections after the device descriptor as it is done
> > > > in VMState, these two will break because the device endian is stale.
> > > >
> > > > The first one can be easily dealt with: just defer the sanity check
> > > > to a post_load function.
> > > 
> > > Good, we're lucky here.
> > > 
> > > > The second is a bit more tricky: the
> > > > virtio serial migrates its config space (target endian) and the
> > > > active ports bitmap. The load code assumes max_nr_ports from the
> > > > config space tells the size of the ports bitmap... that means the
> > > > virtio migration protocol is also contaminated by target endianness. :-\
> > > 
> > > Ouch.
> > > 
> > > I guess we could break migration in the case of host endianness != 
> > > target endianness, like this:
> > > 
> > >      /* These three used to be fetched in target endianness and then
> > >       * stored as big endian.  It ended up as little endian if host and
> > >       * target endianness doesn't match.
> > >       *
> > >       * Starting with qemu 2.1, we always store as big endian.  The
> > >       * version wasn't bumped to avoid breaking backwards compatibility.
> > >       * We check the validity of max_nr_ports, and the incorrect-
> > >       * endianness max_nr_ports will be huge, which will abort migration
> > >       * anyway.
> > >       */
> > >      uint16_t cols = tswap16(s->config.cols);
> > >      uint16_t rows = tswap16(s->config.rows);
> > >      uint32_t max_nr_ports = tswap32(s->config.max_nr_ports);
> > > 
> > >      qemu_put_be16s(f, &cols);
> > >      qemu_put_be16s(f, &rows);
> > >      qemu_put_be32s(f, &max_nr_ports);
> > > 
> > > ...
> > > 
> > >      uint16_t cols, rows;
> > > 
> > >      qemu_get_be16s(f, &cols);
> > >      qemu_get_be16s(f, &rows);
> > >      qemu_get_be32s(f, &max_nr_ports);
> > > 
> > >      /* Convert back to target endianness when storing into the config
> > >       * space.
> > >       */
> > 
> > Paolo,
> > 
> > The patch set to support endian changing targets adds a device_endian
> > field to the VirtIODevice structure to be used instead of the default
> > target endianness as it happens with tswap() macros. It also introduces
> > virtio_tswap() helpers for this purpose, but they can only be used when
> > the device_endian field has been restored... in a subsection after the
> > device descriptor... :-\
> 
> Store it earlier then, using plain put/get.

Not sure I follow... this will break compatibility, no ?

> You can still add a section conditionally to cause
> a cleaner failure in broken cross-version scenarios.
> 
> > If the scenario is ppc64le-on-ppc64: tswap() macros don't do anything
> > and we cannot convert back to LE...
> > 
> > >      s->config.cols = tswap16(cols);
> > >      s->config.rows = tswap16(rows);
> > 
> > Since cols and rows are not involved in the protocol, we can safely
> > defer the conversion to post load.
> > 
> > >      if (max_nr_ports > tswap32(s->config.max_nr_ports) {
> > >          ...
> > >      }
> > > 
> > 
> > Since we know that 0 < max_nr_ports < 32,  is it acceptable to guess
> > the correct endianness with a heuristic ?
> > 
> > if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> > 	max_nr_ports = bswap32(max_nr_ports);
> > }
> > 
> > if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> > 	return -EINVAL;
> > }
> > 
> > > > In the case the answer for above is "legacy virtio really sucks" then,
> > > > is it acceptable to not honor bug-compatibility with older versions and
> > > > fix the code ? :)
> > > 
> > > As long as the common cases don't break, yes.  The question is what are 
> > > the common cases.  Here I think the only non-obscure case that could 
> > > break is x86-on-PPC, and it's not that common.
> > > 
> > > Paolo
> > > 
> > 
> > Thanks.
> 
> One starts doubting whether all this hackery is worth it.  virtio 1.0
> should be out real soon now, it makes everything LE so the problem goes
> away. It's not like PPC LE is so popular that we must support old drivers
> at all costs.  Won't time be better spent backporting virtio 1.0 drivers?
> 

Hmmm... AFAIC some QEMU maintainers expressed interest in supporting legacy
virtio in the case we have endian-changing targets. Patches to run a ppc64le
guests have been accepted in KVM, Linux and QEMU... the only missing block
is virtio. I don't especially care in supporting old drivers at all cost: this
request was expressed on the list. I just want people to be able to run a ppc64le
ubuntu-14.04 (and soon other distros) guest on a ppc64 box and be able to migrate.

Would it be acceptable to break compatibility for ppc64 (and maybe ARM) only with
a target specific hook being called from the virtio code ?

> 
> > -- 
> > Gregory Kurz                                     kurzgreg@fr.ibm.com
> >                                                  gkurz@linux.vnet.ibm.com
> > Software Engineer @ IBM/Meiosys                  http://www.ibm.com
> > Tel +33 (0)562 165 496
> > 
> > "Anarchy is about taking complete responsibility for yourself."
> >         Alan Moore.
> 

Thanks.

-- 
Gregory Kurz                                     kurzgreg@fr.ibm.com
                                                 gkurz@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12  7:54           ` Michael S. Tsirkin
  2014-06-12  8:47             ` Greg Kurz
@ 2014-06-12  8:55             ` Alexander Graf
  2014-06-12  9:07               ` Michael S. Tsirkin
  1 sibling, 1 reply; 68+ messages in thread
From: Alexander Graf @ 2014-06-12  8:55 UTC (permalink / raw)
  To: Michael S. Tsirkin, Greg Kurz
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	qemu-devel, Anthony Liguori, Amit Shah, Paolo Bonzini,
	Andreas Färber


On 12.06.14 09:54, Michael S. Tsirkin wrote:
> On Thu, Jun 12, 2014 at 09:43:51AM +0200, Greg Kurz wrote:
>> On Thu, 29 May 2014 12:16:26 +0200
>> Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> Il 29/05/2014 11:12, Greg Kurz ha scritto:
>>>> int virtio_load(VirtIODevice *vdev, QEMUFile *f)
>>>> {
>>>> [...]
>>>>              nheads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx;
>>>>                         ^^^^^^^^^^^
>>>>              /* Check it isn't doing very strange things with descriptor numbers. */
>>>>              if (nheads > vdev->vq[i].vring.num) {
>>>> [...]
>>>> }
>>>>
>>>> and
>>>>
>>>> static int virtio_serial_load(QEMUFile *f, void *opaque, int version_id)
>>>> {
>>>> [...]
>>>>      /* The config space */
>>>>      qemu_get_be16s(f, &s->config.cols);
>>>>      qemu_get_be16s(f, &s->config.rows);
>>>>
>>>>      qemu_get_be32s(f, &max_nr_ports);
>>>>      tswap32s(&max_nr_ports);
>>>>       ^^^^^^
>>>>      if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
>>>> [...]
>>>> }
>>>>
>>>> If we stream subsections after the device descriptor as it is done
>>>> in VMState, these two will break because the device endian is stale.
>>>>
>>>> The first one can be easily dealt with: just defer the sanity check
>>>> to a post_load function.
>>> Good, we're lucky here.
>>>
>>>> The second is a bit more tricky: the
>>>> virtio serial migrates its config space (target endian) and the
>>>> active ports bitmap. The load code assumes max_nr_ports from the
>>>> config space tells the size of the ports bitmap... that means the
>>>> virtio migration protocol is also contaminated by target endianness. :-\
>>> Ouch.
>>>
>>> I guess we could break migration in the case of host endianness !=
>>> target endianness, like this:
>>>
>>>       /* These three used to be fetched in target endianness and then
>>>        * stored as big endian.  It ended up as little endian if host and
>>>        * target endianness doesn't match.
>>>        *
>>>        * Starting with qemu 2.1, we always store as big endian.  The
>>>        * version wasn't bumped to avoid breaking backwards compatibility.
>>>        * We check the validity of max_nr_ports, and the incorrect-
>>>        * endianness max_nr_ports will be huge, which will abort migration
>>>        * anyway.
>>>        */
>>>       uint16_t cols = tswap16(s->config.cols);
>>>       uint16_t rows = tswap16(s->config.rows);
>>>       uint32_t max_nr_ports = tswap32(s->config.max_nr_ports);
>>>
>>>       qemu_put_be16s(f, &cols);
>>>       qemu_put_be16s(f, &rows);
>>>       qemu_put_be32s(f, &max_nr_ports);
>>>
>>> ...
>>>
>>>       uint16_t cols, rows;
>>>
>>>       qemu_get_be16s(f, &cols);
>>>       qemu_get_be16s(f, &rows);
>>>       qemu_get_be32s(f, &max_nr_ports);
>>>
>>>       /* Convert back to target endianness when storing into the config
>>>        * space.
>>>        */
>> Paolo,
>>
>> The patch set to support endian changing targets adds a device_endian
>> field to the VirtIODevice structure to be used instead of the default
>> target endianness as it happens with tswap() macros. It also introduces
>> virtio_tswap() helpers for this purpose, but they can only be used when
>> the device_endian field has been restored... in a subsection after the
>> device descriptor... :-\
> Store it earlier then, using plain put/get.
> You can still add a section conditionally to cause
> a cleaner failure in broken cross-version scenarios.
>
>> If the scenario is ppc64le-on-ppc64: tswap() macros don't do anything
>> and we cannot convert back to LE...
>>
>>>       s->config.cols = tswap16(cols);
>>>       s->config.rows = tswap16(rows);
>> Since cols and rows are not involved in the protocol, we can safely
>> defer the conversion to post load.
>>
>>>       if (max_nr_ports > tswap32(s->config.max_nr_ports) {
>>>           ...
>>>       }
>>>
>> Since we know that 0 < max_nr_ports < 32,  is it acceptable to guess
>> the correct endianness with a heuristic ?
>>
>> if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
>> 	max_nr_ports = bswap32(max_nr_ports);
>> }
>>
>> if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
>> 	return -EINVAL;
>> }
>>
>>>> In the case the answer for above is "legacy virtio really sucks" then,
>>>> is it acceptable to not honor bug-compatibility with older versions and
>>>> fix the code ? :)
>>> As long as the common cases don't break, yes.  The question is what are
>>> the common cases.  Here I think the only non-obscure case that could
>>> break is x86-on-PPC, and it's not that common.
>>>
>>> Paolo
>>>
>> Thanks.
> One starts doubting whether all this hackery is worth it.  virtio 1.0
> should be out real soon now, it makes everything LE so the problem goes
> away. It's not like PPC LE is so popular that we must support old drivers
> at all costs.  Won't time be better spent backporting virtio 1.0 drivers?

There are already released and working Linux distributions (Ubuntu, 
openSUSE, maybe others) that don't have virtio 1.0 drivers. Putting our 
heads into the sand is not an option ;).


Alex

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12  7:43         ` Greg Kurz
  2014-06-12  7:54           ` Michael S. Tsirkin
@ 2014-06-12  8:55           ` Paolo Bonzini
  2014-06-12  8:57             ` Alexander Graf
  2014-06-12  9:06             ` Greg Kurz
  1 sibling, 2 replies; 68+ messages in thread
From: Paolo Bonzini @ 2014-06-12  8:55 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	Michael S. Tsirkin, Alexander Graf, qemu-devel, Stefan Hajnoczi,
	Amit Shah, Andreas Färber

Il 12/06/2014 09:43, Greg Kurz ha scritto:
> Since we know that 0 < max_nr_ports < 32,  is it acceptable to guess
> the correct endianness with a heuristic ?
>
> if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> 	max_nr_ports = bswap32(max_nr_ports);
> }
>
> if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> 	return -EINVAL;
> }

Yes, I guess it is acceptable.  So first you should fix the code to 
always serialize fields as big-endian, and then apply this little hack 
and virtio_tswap on top of the previous change.

Paolo

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12  8:55           ` Paolo Bonzini
@ 2014-06-12  8:57             ` Alexander Graf
  2014-06-12  9:06             ` Greg Kurz
  1 sibling, 0 replies; 68+ messages in thread
From: Alexander Graf @ 2014-06-12  8:57 UTC (permalink / raw)
  To: Paolo Bonzini, Greg Kurz
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Amit Shah,
	Andreas Färber


On 12.06.14 10:55, Paolo Bonzini wrote:
> Il 12/06/2014 09:43, Greg Kurz ha scritto:
>> Since we know that 0 < max_nr_ports < 32,  is it acceptable to guess
>> the correct endianness with a heuristic ?
>>
>> if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
>>     max_nr_ports = bswap32(max_nr_ports);
>> }
>>
>> if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
>>     return -EINVAL;
>> }
>
> Yes, I guess it is acceptable.  So first you should fix the code to 
> always serialize fields as big-endian, and then apply this little hack 
> and virtio_tswap on top of the previous change.

Why don't we just ignore those config fields? They won't change during 
the lifetime of the guest anyway - and configuration needs to be the 
same on source and dest to work.


Alex

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12  8:47             ` Greg Kurz
@ 2014-06-12  9:05               ` Michael S. Tsirkin
  2014-06-12  9:06               ` Alexander Graf
  1 sibling, 0 replies; 68+ messages in thread
From: Michael S. Tsirkin @ 2014-06-12  9:05 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	Alexander Graf, qemu-devel, Anthony Liguori, Amit Shah,
	Paolo Bonzini, Andreas Färber

On Thu, Jun 12, 2014 at 10:47:17AM +0200, Greg Kurz wrote:
> On Thu, 12 Jun 2014 10:54:48 +0300
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Thu, Jun 12, 2014 at 09:43:51AM +0200, Greg Kurz wrote:
> > > On Thu, 29 May 2014 12:16:26 +0200
> > > Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > > Il 29/05/2014 11:12, Greg Kurz ha scritto:
> > > > > int virtio_load(VirtIODevice *vdev, QEMUFile *f)
> > > > > {
> > > > > [...]
> > > > >             nheads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx;
> > > > >                        ^^^^^^^^^^^
> > > > >             /* Check it isn't doing very strange things with descriptor numbers. */
> > > > >             if (nheads > vdev->vq[i].vring.num) {
> > > > > [...]
> > > > > }
> > > > >
> > > > > and
> > > > >
> > > > > static int virtio_serial_load(QEMUFile *f, void *opaque, int version_id)
> > > > > {
> > > > > [...]
> > > > >     /* The config space */
> > > > >     qemu_get_be16s(f, &s->config.cols);
> > > > >     qemu_get_be16s(f, &s->config.rows);
> > > > >
> > > > >     qemu_get_be32s(f, &max_nr_ports);
> > > > >     tswap32s(&max_nr_ports);
> > > > >      ^^^^^^
> > > > >     if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> > > > > [...]
> > > > > }
> > > > >
> > > > > If we stream subsections after the device descriptor as it is done
> > > > > in VMState, these two will break because the device endian is stale.
> > > > >
> > > > > The first one can be easily dealt with: just defer the sanity check
> > > > > to a post_load function.
> > > > 
> > > > Good, we're lucky here.
> > > > 
> > > > > The second is a bit more tricky: the
> > > > > virtio serial migrates its config space (target endian) and the
> > > > > active ports bitmap. The load code assumes max_nr_ports from the
> > > > > config space tells the size of the ports bitmap... that means the
> > > > > virtio migration protocol is also contaminated by target endianness. :-\
> > > > 
> > > > Ouch.
> > > > 
> > > > I guess we could break migration in the case of host endianness != 
> > > > target endianness, like this:
> > > > 
> > > >      /* These three used to be fetched in target endianness and then
> > > >       * stored as big endian.  It ended up as little endian if host and
> > > >       * target endianness doesn't match.
> > > >       *
> > > >       * Starting with qemu 2.1, we always store as big endian.  The
> > > >       * version wasn't bumped to avoid breaking backwards compatibility.
> > > >       * We check the validity of max_nr_ports, and the incorrect-
> > > >       * endianness max_nr_ports will be huge, which will abort migration
> > > >       * anyway.
> > > >       */
> > > >      uint16_t cols = tswap16(s->config.cols);
> > > >      uint16_t rows = tswap16(s->config.rows);
> > > >      uint32_t max_nr_ports = tswap32(s->config.max_nr_ports);
> > > > 
> > > >      qemu_put_be16s(f, &cols);
> > > >      qemu_put_be16s(f, &rows);
> > > >      qemu_put_be32s(f, &max_nr_ports);
> > > > 
> > > > ...
> > > > 
> > > >      uint16_t cols, rows;
> > > > 
> > > >      qemu_get_be16s(f, &cols);
> > > >      qemu_get_be16s(f, &rows);
> > > >      qemu_get_be32s(f, &max_nr_ports);
> > > > 
> > > >      /* Convert back to target endianness when storing into the config
> > > >       * space.
> > > >       */
> > > 
> > > Paolo,
> > > 
> > > The patch set to support endian changing targets adds a device_endian
> > > field to the VirtIODevice structure to be used instead of the default
> > > target endianness as it happens with tswap() macros. It also introduces
> > > virtio_tswap() helpers for this purpose, but they can only be used when
> > > the device_endian field has been restored... in a subsection after the
> > > device descriptor... :-\
> > 
> > Store it earlier then, using plain put/get.
> 
> Not sure I follow... this will break compatibility, no ?

Do it only on the ambialent targets :)

> > You can still add a section conditionally to cause
> > a cleaner failure in broken cross-version scenarios.
> > 
> > > If the scenario is ppc64le-on-ppc64: tswap() macros don't do anything
> > > and we cannot convert back to LE...
> > > 
> > > >      s->config.cols = tswap16(cols);
> > > >      s->config.rows = tswap16(rows);
> > > 
> > > Since cols and rows are not involved in the protocol, we can safely
> > > defer the conversion to post load.
> > > 
> > > >      if (max_nr_ports > tswap32(s->config.max_nr_ports) {
> > > >          ...
> > > >      }
> > > > 
> > > 
> > > Since we know that 0 < max_nr_ports < 32,  is it acceptable to guess
> > > the correct endianness with a heuristic ?
> > > 
> > > if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> > > 	max_nr_ports = bswap32(max_nr_ports);
> > > }
> > > 
> > > if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> > > 	return -EINVAL;
> > > }
> > > 
> > > > > In the case the answer for above is "legacy virtio really sucks" then,
> > > > > is it acceptable to not honor bug-compatibility with older versions and
> > > > > fix the code ? :)
> > > > 
> > > > As long as the common cases don't break, yes.  The question is what are 
> > > > the common cases.  Here I think the only non-obscure case that could 
> > > > break is x86-on-PPC, and it's not that common.
> > > > 
> > > > Paolo
> > > > 
> > > 
> > > Thanks.
> > 
> > One starts doubting whether all this hackery is worth it.  virtio 1.0
> > should be out real soon now, it makes everything LE so the problem goes
> > away. It's not like PPC LE is so popular that we must support old drivers
> > at all costs.  Won't time be better spent backporting virtio 1.0 drivers?
> > 
> 
> Hmmm... AFAIC some QEMU maintainers expressed interest in supporting legacy
> virtio in the case we have endian-changing targets. Patches to run a ppc64le
> guests have been accepted in KVM, Linux and QEMU... the only missing block
> is virtio. I don't especially care in supporting old drivers at all cost: this
> request was expressed on the list. I just want people to be able to run a ppc64le
> ubuntu-14.04 (and soon other distros) guest on a ppc64 box and be able to migrate.

Sigh. So ubuntu ships known-broken virtio drivers in their LTS release
and now we are trying to make them work.
There's no working qemu configuration at the moment, do we really
have to build new hardware around known-broken drivers
and then maintain it all for years?
Why not build good hardware and backport new drivers?


> Would it be acceptable to break compatibility for ppc64 (and maybe ARM) only with
> a target specific hook being called from the virtio code ?

I think so.
Can we just add a function that tells us whether target is ambivalent?
Only migrate new stuff if that's the case.

> > 
> > > -- 
> > > Gregory Kurz                                     kurzgreg@fr.ibm.com
> > >                                                  gkurz@linux.vnet.ibm.com
> > > Software Engineer @ IBM/Meiosys                  http://www.ibm.com
> > > Tel +33 (0)562 165 496
> > > 
> > > "Anarchy is about taking complete responsibility for yourself."
> > >         Alan Moore.
> > 
> 
> Thanks.
> 
> -- 
> Gregory Kurz                                     kurzgreg@fr.ibm.com
>                                                  gkurz@linux.vnet.ibm.com
> Software Engineer @ IBM/Meiosys                  http://www.ibm.com
> Tel +33 (0)562 165 496
> 
> "Anarchy is about taking complete responsibility for yourself."
>         Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12  8:55           ` Paolo Bonzini
  2014-06-12  8:57             ` Alexander Graf
@ 2014-06-12  9:06             ` Greg Kurz
  2014-06-12  9:19               ` Paolo Bonzini
  1 sibling, 1 reply; 68+ messages in thread
From: Greg Kurz @ 2014-06-12  9:06 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Michael S. Tsirkin,
	Juan Quintela, Alexander Graf, qemu-devel, Anthony Liguori,
	Amit Shah, Andreas Färber

On Thu, 12 Jun 2014 10:55:42 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> Il 12/06/2014 09:43, Greg Kurz ha scritto:
> > Since we know that 0 < max_nr_ports < 32,  is it acceptable to guess
> > the correct endianness with a heuristic ?
> >
> > if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> > 	max_nr_ports = bswap32(max_nr_ports);
> > }
> >
> > if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> > 	return -EINVAL;
> > }
> 
> Yes, I guess it is acceptable.  So first you should fix the code to 
> always serialize fields as big-endian, and then apply this little hack 
> and virtio_tswap on top of the previous change.
> 
> Paolo
> 

BTW, can someone explain why we stream the device config ?

-- 
Gregory Kurz                                     kurzgreg@fr.ibm.com
                                                 gkurz@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12  8:47             ` Greg Kurz
  2014-06-12  9:05               ` Michael S. Tsirkin
@ 2014-06-12  9:06               ` Alexander Graf
  1 sibling, 0 replies; 68+ messages in thread
From: Alexander Graf @ 2014-06-12  9:06 UTC (permalink / raw)
  To: Greg Kurz, Michael S. Tsirkin
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	qemu-devel, Anthony Liguori, Amit Shah, Paolo Bonzini,
	Andreas Färber


On 12.06.14 10:47, Greg Kurz wrote:
> On Thu, 12 Jun 2014 10:54:48 +0300
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>> On Thu, Jun 12, 2014 at 09:43:51AM +0200, Greg Kurz wrote:
>>> On Thu, 29 May 2014 12:16:26 +0200
>>> Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>> Il 29/05/2014 11:12, Greg Kurz ha scritto:
>>>>> int virtio_load(VirtIODevice *vdev, QEMUFile *f)
>>>>> {
>>>>> [...]
>>>>>              nheads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx;
>>>>>                         ^^^^^^^^^^^
>>>>>              /* Check it isn't doing very strange things with descriptor numbers. */
>>>>>              if (nheads > vdev->vq[i].vring.num) {
>>>>> [...]
>>>>> }
>>>>>
>>>>> and
>>>>>
>>>>> static int virtio_serial_load(QEMUFile *f, void *opaque, int version_id)
>>>>> {
>>>>> [...]
>>>>>      /* The config space */
>>>>>      qemu_get_be16s(f, &s->config.cols);
>>>>>      qemu_get_be16s(f, &s->config.rows);
>>>>>
>>>>>      qemu_get_be32s(f, &max_nr_ports);
>>>>>      tswap32s(&max_nr_ports);
>>>>>       ^^^^^^
>>>>>      if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
>>>>> [...]
>>>>> }
>>>>>
>>>>> If we stream subsections after the device descriptor as it is done
>>>>> in VMState, these two will break because the device endian is stale.
>>>>>
>>>>> The first one can be easily dealt with: just defer the sanity check
>>>>> to a post_load function.
>>>> Good, we're lucky here.
>>>>
>>>>> The second is a bit more tricky: the
>>>>> virtio serial migrates its config space (target endian) and the
>>>>> active ports bitmap. The load code assumes max_nr_ports from the
>>>>> config space tells the size of the ports bitmap... that means the
>>>>> virtio migration protocol is also contaminated by target endianness. :-\
>>>> Ouch.
>>>>
>>>> I guess we could break migration in the case of host endianness !=
>>>> target endianness, like this:
>>>>
>>>>       /* These three used to be fetched in target endianness and then
>>>>        * stored as big endian.  It ended up as little endian if host and
>>>>        * target endianness doesn't match.
>>>>        *
>>>>        * Starting with qemu 2.1, we always store as big endian.  The
>>>>        * version wasn't bumped to avoid breaking backwards compatibility.
>>>>        * We check the validity of max_nr_ports, and the incorrect-
>>>>        * endianness max_nr_ports will be huge, which will abort migration
>>>>        * anyway.
>>>>        */
>>>>       uint16_t cols = tswap16(s->config.cols);
>>>>       uint16_t rows = tswap16(s->config.rows);
>>>>       uint32_t max_nr_ports = tswap32(s->config.max_nr_ports);
>>>>
>>>>       qemu_put_be16s(f, &cols);
>>>>       qemu_put_be16s(f, &rows);
>>>>       qemu_put_be32s(f, &max_nr_ports);
>>>>
>>>> ...
>>>>
>>>>       uint16_t cols, rows;
>>>>
>>>>       qemu_get_be16s(f, &cols);
>>>>       qemu_get_be16s(f, &rows);
>>>>       qemu_get_be32s(f, &max_nr_ports);
>>>>
>>>>       /* Convert back to target endianness when storing into the config
>>>>        * space.
>>>>        */
>>> Paolo,
>>>
>>> The patch set to support endian changing targets adds a device_endian
>>> field to the VirtIODevice structure to be used instead of the default
>>> target endianness as it happens with tswap() macros. It also introduces
>>> virtio_tswap() helpers for this purpose, but they can only be used when
>>> the device_endian field has been restored... in a subsection after the
>>> device descriptor... :-\
>> Store it earlier then, using plain put/get.
> Not sure I follow... this will break compatibility, no ?
>
>> You can still add a section conditionally to cause
>> a cleaner failure in broken cross-version scenarios.
>>
>>> If the scenario is ppc64le-on-ppc64: tswap() macros don't do anything
>>> and we cannot convert back to LE...
>>>
>>>>       s->config.cols = tswap16(cols);
>>>>       s->config.rows = tswap16(rows);
>>> Since cols and rows are not involved in the protocol, we can safely
>>> defer the conversion to post load.
>>>
>>>>       if (max_nr_ports > tswap32(s->config.max_nr_ports) {
>>>>           ...
>>>>       }
>>>>
>>> Since we know that 0 < max_nr_ports < 32,  is it acceptable to guess
>>> the correct endianness with a heuristic ?
>>>
>>> if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
>>> 	max_nr_ports = bswap32(max_nr_ports);
>>> }
>>>
>>> if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
>>> 	return -EINVAL;
>>> }
>>>
>>>>> In the case the answer for above is "legacy virtio really sucks" then,
>>>>> is it acceptable to not honor bug-compatibility with older versions and
>>>>> fix the code ? :)
>>>> As long as the common cases don't break, yes.  The question is what are
>>>> the common cases.  Here I think the only non-obscure case that could
>>>> break is x86-on-PPC, and it's not that common.
>>>>
>>>> Paolo
>>>>
>>> Thanks.
>> One starts doubting whether all this hackery is worth it.  virtio 1.0
>> should be out real soon now, it makes everything LE so the problem goes
>> away. It's not like PPC LE is so popular that we must support old drivers
>> at all costs.  Won't time be better spent backporting virtio 1.0 drivers?
>>
> Hmmm... AFAIC some QEMU maintainers expressed interest in supporting legacy
> virtio in the case we have endian-changing targets. Patches to run a ppc64le
> guests have been accepted in KVM, Linux and QEMU... the only missing block
> is virtio. I don't especially care in supporting old drivers at all cost: this
> request was expressed on the list. I just want people to be able to run a ppc64le
> ubuntu-14.04 (and soon other distros) guest on a ppc64 box and be able to migrate.
>
> Would it be acceptable to break compatibility for ppc64 (and maybe ARM) only with
> a target specific hook being called from the virtio code ?

I don't see the problem. Let's just make the config fields UNUSED in 
vmstate and keep the virtio-serial config space in good endian locally. 
Then you don't care about ordering of sections anymore.

Then we only need the subsection code for "endian is not target endian" 
which makes migration backwards compatible with non-switchable QEMU 
versions and everyone's happy. End of story :).


Alex

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12  8:55             ` Alexander Graf
@ 2014-06-12  9:07               ` Michael S. Tsirkin
  2014-06-12  9:08                 ` Alexander Graf
  0 siblings, 1 reply; 68+ messages in thread
From: Michael S. Tsirkin @ 2014-06-12  9:07 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	qemu-devel, Anthony Liguori, Amit Shah, Paolo Bonzini,
	Andreas Färber, Greg Kurz

On Thu, Jun 12, 2014 at 10:55:40AM +0200, Alexander Graf wrote:
> 
> On 12.06.14 09:54, Michael S. Tsirkin wrote:
> >On Thu, Jun 12, 2014 at 09:43:51AM +0200, Greg Kurz wrote:
> >>On Thu, 29 May 2014 12:16:26 +0200
> >>Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>>Il 29/05/2014 11:12, Greg Kurz ha scritto:
> >>>>int virtio_load(VirtIODevice *vdev, QEMUFile *f)
> >>>>{
> >>>>[...]
> >>>>             nheads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx;
> >>>>                        ^^^^^^^^^^^
> >>>>             /* Check it isn't doing very strange things with descriptor numbers. */
> >>>>             if (nheads > vdev->vq[i].vring.num) {
> >>>>[...]
> >>>>}
> >>>>
> >>>>and
> >>>>
> >>>>static int virtio_serial_load(QEMUFile *f, void *opaque, int version_id)
> >>>>{
> >>>>[...]
> >>>>     /* The config space */
> >>>>     qemu_get_be16s(f, &s->config.cols);
> >>>>     qemu_get_be16s(f, &s->config.rows);
> >>>>
> >>>>     qemu_get_be32s(f, &max_nr_ports);
> >>>>     tswap32s(&max_nr_ports);
> >>>>      ^^^^^^
> >>>>     if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> >>>>[...]
> >>>>}
> >>>>
> >>>>If we stream subsections after the device descriptor as it is done
> >>>>in VMState, these two will break because the device endian is stale.
> >>>>
> >>>>The first one can be easily dealt with: just defer the sanity check
> >>>>to a post_load function.
> >>>Good, we're lucky here.
> >>>
> >>>>The second is a bit more tricky: the
> >>>>virtio serial migrates its config space (target endian) and the
> >>>>active ports bitmap. The load code assumes max_nr_ports from the
> >>>>config space tells the size of the ports bitmap... that means the
> >>>>virtio migration protocol is also contaminated by target endianness. :-\
> >>>Ouch.
> >>>
> >>>I guess we could break migration in the case of host endianness !=
> >>>target endianness, like this:
> >>>
> >>>      /* These three used to be fetched in target endianness and then
> >>>       * stored as big endian.  It ended up as little endian if host and
> >>>       * target endianness doesn't match.
> >>>       *
> >>>       * Starting with qemu 2.1, we always store as big endian.  The
> >>>       * version wasn't bumped to avoid breaking backwards compatibility.
> >>>       * We check the validity of max_nr_ports, and the incorrect-
> >>>       * endianness max_nr_ports will be huge, which will abort migration
> >>>       * anyway.
> >>>       */
> >>>      uint16_t cols = tswap16(s->config.cols);
> >>>      uint16_t rows = tswap16(s->config.rows);
> >>>      uint32_t max_nr_ports = tswap32(s->config.max_nr_ports);
> >>>
> >>>      qemu_put_be16s(f, &cols);
> >>>      qemu_put_be16s(f, &rows);
> >>>      qemu_put_be32s(f, &max_nr_ports);
> >>>
> >>>...
> >>>
> >>>      uint16_t cols, rows;
> >>>
> >>>      qemu_get_be16s(f, &cols);
> >>>      qemu_get_be16s(f, &rows);
> >>>      qemu_get_be32s(f, &max_nr_ports);
> >>>
> >>>      /* Convert back to target endianness when storing into the config
> >>>       * space.
> >>>       */
> >>Paolo,
> >>
> >>The patch set to support endian changing targets adds a device_endian
> >>field to the VirtIODevice structure to be used instead of the default
> >>target endianness as it happens with tswap() macros. It also introduces
> >>virtio_tswap() helpers for this purpose, but they can only be used when
> >>the device_endian field has been restored... in a subsection after the
> >>device descriptor... :-\
> >Store it earlier then, using plain put/get.
> >You can still add a section conditionally to cause
> >a cleaner failure in broken cross-version scenarios.
> >
> >>If the scenario is ppc64le-on-ppc64: tswap() macros don't do anything
> >>and we cannot convert back to LE...
> >>
> >>>      s->config.cols = tswap16(cols);
> >>>      s->config.rows = tswap16(rows);
> >>Since cols and rows are not involved in the protocol, we can safely
> >>defer the conversion to post load.
> >>
> >>>      if (max_nr_ports > tswap32(s->config.max_nr_ports) {
> >>>          ...
> >>>      }
> >>>
> >>Since we know that 0 < max_nr_ports < 32,  is it acceptable to guess
> >>the correct endianness with a heuristic ?
> >>
> >>if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> >>	max_nr_ports = bswap32(max_nr_ports);
> >>}
> >>
> >>if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> >>	return -EINVAL;
> >>}
> >>
> >>>>In the case the answer for above is "legacy virtio really sucks" then,
> >>>>is it acceptable to not honor bug-compatibility with older versions and
> >>>>fix the code ? :)
> >>>As long as the common cases don't break, yes.  The question is what are
> >>>the common cases.  Here I think the only non-obscure case that could
> >>>break is x86-on-PPC, and it's not that common.
> >>>
> >>>Paolo
> >>>
> >>Thanks.
> >One starts doubting whether all this hackery is worth it.  virtio 1.0
> >should be out real soon now, it makes everything LE so the problem goes
> >away. It's not like PPC LE is so popular that we must support old drivers
> >at all costs.  Won't time be better spent backporting virtio 1.0 drivers?
> 
> There are already released and working Linux distributions (Ubuntu,
> openSUSE, maybe others) that don't have virtio 1.0 drivers. Putting our
> heads into the sand is not an option ;).
> 
> 
> Alex

I don't get it. Does virtio work there at the moment?

-- 
MST

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12  9:07               ` Michael S. Tsirkin
@ 2014-06-12  9:08                 ` Alexander Graf
  0 siblings, 0 replies; 68+ messages in thread
From: Alexander Graf @ 2014-06-12  9:08 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	qemu-devel, Anthony Liguori, Amit Shah, Paolo Bonzini,
	Andreas Färber, Greg Kurz


On 12.06.14 11:07, Michael S. Tsirkin wrote:
> On Thu, Jun 12, 2014 at 10:55:40AM +0200, Alexander Graf wrote:
>> On 12.06.14 09:54, Michael S. Tsirkin wrote:
>>> On Thu, Jun 12, 2014 at 09:43:51AM +0200, Greg Kurz wrote:
>>>> On Thu, 29 May 2014 12:16:26 +0200
>>>> Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>>> Il 29/05/2014 11:12, Greg Kurz ha scritto:
>>>>>> int virtio_load(VirtIODevice *vdev, QEMUFile *f)
>>>>>> {
>>>>>> [...]
>>>>>>              nheads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx;
>>>>>>                         ^^^^^^^^^^^
>>>>>>              /* Check it isn't doing very strange things with descriptor numbers. */
>>>>>>              if (nheads > vdev->vq[i].vring.num) {
>>>>>> [...]
>>>>>> }
>>>>>>
>>>>>> and
>>>>>>
>>>>>> static int virtio_serial_load(QEMUFile *f, void *opaque, int version_id)
>>>>>> {
>>>>>> [...]
>>>>>>      /* The config space */
>>>>>>      qemu_get_be16s(f, &s->config.cols);
>>>>>>      qemu_get_be16s(f, &s->config.rows);
>>>>>>
>>>>>>      qemu_get_be32s(f, &max_nr_ports);
>>>>>>      tswap32s(&max_nr_ports);
>>>>>>       ^^^^^^
>>>>>>      if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
>>>>>> [...]
>>>>>> }
>>>>>>
>>>>>> If we stream subsections after the device descriptor as it is done
>>>>>> in VMState, these two will break because the device endian is stale.
>>>>>>
>>>>>> The first one can be easily dealt with: just defer the sanity check
>>>>>> to a post_load function.
>>>>> Good, we're lucky here.
>>>>>
>>>>>> The second is a bit more tricky: the
>>>>>> virtio serial migrates its config space (target endian) and the
>>>>>> active ports bitmap. The load code assumes max_nr_ports from the
>>>>>> config space tells the size of the ports bitmap... that means the
>>>>>> virtio migration protocol is also contaminated by target endianness. :-\
>>>>> Ouch.
>>>>>
>>>>> I guess we could break migration in the case of host endianness !=
>>>>> target endianness, like this:
>>>>>
>>>>>       /* These three used to be fetched in target endianness and then
>>>>>        * stored as big endian.  It ended up as little endian if host and
>>>>>        * target endianness doesn't match.
>>>>>        *
>>>>>        * Starting with qemu 2.1, we always store as big endian.  The
>>>>>        * version wasn't bumped to avoid breaking backwards compatibility.
>>>>>        * We check the validity of max_nr_ports, and the incorrect-
>>>>>        * endianness max_nr_ports will be huge, which will abort migration
>>>>>        * anyway.
>>>>>        */
>>>>>       uint16_t cols = tswap16(s->config.cols);
>>>>>       uint16_t rows = tswap16(s->config.rows);
>>>>>       uint32_t max_nr_ports = tswap32(s->config.max_nr_ports);
>>>>>
>>>>>       qemu_put_be16s(f, &cols);
>>>>>       qemu_put_be16s(f, &rows);
>>>>>       qemu_put_be32s(f, &max_nr_ports);
>>>>>
>>>>> ...
>>>>>
>>>>>       uint16_t cols, rows;
>>>>>
>>>>>       qemu_get_be16s(f, &cols);
>>>>>       qemu_get_be16s(f, &rows);
>>>>>       qemu_get_be32s(f, &max_nr_ports);
>>>>>
>>>>>       /* Convert back to target endianness when storing into the config
>>>>>        * space.
>>>>>        */
>>>> Paolo,
>>>>
>>>> The patch set to support endian changing targets adds a device_endian
>>>> field to the VirtIODevice structure to be used instead of the default
>>>> target endianness as it happens with tswap() macros. It also introduces
>>>> virtio_tswap() helpers for this purpose, but they can only be used when
>>>> the device_endian field has been restored... in a subsection after the
>>>> device descriptor... :-\
>>> Store it earlier then, using plain put/get.
>>> You can still add a section conditionally to cause
>>> a cleaner failure in broken cross-version scenarios.
>>>
>>>> If the scenario is ppc64le-on-ppc64: tswap() macros don't do anything
>>>> and we cannot convert back to LE...
>>>>
>>>>>       s->config.cols = tswap16(cols);
>>>>>       s->config.rows = tswap16(rows);
>>>> Since cols and rows are not involved in the protocol, we can safely
>>>> defer the conversion to post load.
>>>>
>>>>>       if (max_nr_ports > tswap32(s->config.max_nr_ports) {
>>>>>           ...
>>>>>       }
>>>>>
>>>> Since we know that 0 < max_nr_ports < 32,  is it acceptable to guess
>>>> the correct endianness with a heuristic ?
>>>>
>>>> if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
>>>> 	max_nr_ports = bswap32(max_nr_ports);
>>>> }
>>>>
>>>> if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
>>>> 	return -EINVAL;
>>>> }
>>>>
>>>>>> In the case the answer for above is "legacy virtio really sucks" then,
>>>>>> is it acceptable to not honor bug-compatibility with older versions and
>>>>>> fix the code ? :)
>>>>> As long as the common cases don't break, yes.  The question is what are
>>>>> the common cases.  Here I think the only non-obscure case that could
>>>>> break is x86-on-PPC, and it's not that common.
>>>>>
>>>>> Paolo
>>>>>
>>>> Thanks.
>>> One starts doubting whether all this hackery is worth it.  virtio 1.0
>>> should be out real soon now, it makes everything LE so the problem goes
>>> away. It's not like PPC LE is so popular that we must support old drivers
>>> at all costs.  Won't time be better spent backporting virtio 1.0 drivers?
>> There are already released and working Linux distributions (Ubuntu,
>> openSUSE, maybe others) that don't have virtio 1.0 drivers. Putting our
>> heads into the sand is not an option ;).
>>
>>
>> Alex
> I don't get it. Does virtio work there at the moment?

With the LE enable patch, yes, sure. Imagine the same would happen for 
Windows and e1000 - would you still argue the same way?


Alex

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12  9:06             ` Greg Kurz
@ 2014-06-12  9:19               ` Paolo Bonzini
  2014-06-12  9:37                 ` Michael S. Tsirkin
  0 siblings, 1 reply; 68+ messages in thread
From: Paolo Bonzini @ 2014-06-12  9:19 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Michael S. Tsirkin,
	Juan Quintela, Alexander Graf, qemu-devel, Anthony Liguori,
	Amit Shah, Andreas Färber

Il 12/06/2014 11:06, Greg Kurz ha scritto:
> On Thu, 12 Jun 2014 10:55:42 +0200
> Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>> Il 12/06/2014 09:43, Greg Kurz ha scritto:
>>> Since we know that 0 < max_nr_ports < 32,  is it acceptable to guess
>>> the correct endianness with a heuristic ?
>>>
>>> if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
>>> 	max_nr_ports = bswap32(max_nr_ports);
>>> }
>>>
>>> if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
>>> 	return -EINVAL;
>>> }
>>
>> Yes, I guess it is acceptable.  So first you should fix the code to
>> always serialize fields as big-endian, and then apply this little hack
>> and virtio_tswap on top of the previous change.
>
> BTW, can someone explain why we stream the device config ?

For max_nr_ports it is useless indeed.  Let's keep storing it for 
backwards compatibility, but you can remove its load.

The cols and rows values should be defined by the host and thus could 
even change on migration, there is no need to store them.  As of now, 
however, they are unused in QEMU and should always be zero, because 
VIRTIO_CONSOLE_F_SIZE is not supported.

Paolo

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12  9:19               ` Paolo Bonzini
@ 2014-06-12  9:37                 ` Michael S. Tsirkin
  2014-06-12  9:39                   ` Paolo Bonzini
  0 siblings, 1 reply; 68+ messages in thread
From: Michael S. Tsirkin @ 2014-06-12  9:37 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	Alexander Graf, qemu-devel, Anthony Liguori, Amit Shah,
	Andreas Färber, Greg Kurz

On Thu, Jun 12, 2014 at 11:19:47AM +0200, Paolo Bonzini wrote:
> Il 12/06/2014 11:06, Greg Kurz ha scritto:
> >On Thu, 12 Jun 2014 10:55:42 +0200
> >Paolo Bonzini <pbonzini@redhat.com> wrote:
> >
> >>Il 12/06/2014 09:43, Greg Kurz ha scritto:
> >>>Since we know that 0 < max_nr_ports < 32,  is it acceptable to guess
> >>>the correct endianness with a heuristic ?
> >>>
> >>>if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> >>>	max_nr_ports = bswap32(max_nr_ports);
> >>>}
> >>>
> >>>if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> >>>	return -EINVAL;
> >>>}
> >>
> >>Yes, I guess it is acceptable.  So first you should fix the code to
> >>always serialize fields as big-endian, and then apply this little hack
> >>and virtio_tswap on top of the previous change.
> >
> >BTW, can someone explain why we stream the device config ?
> 
> For max_nr_ports it is useless indeed.  Let's keep storing it for backwards
> compatibility, but you can remove its load.
> 
> The cols and rows values should be defined by the host and thus could even
> change on migration, there is no need to store them.  As of now, however,
> they are unused in QEMU and should always be zero, because
> VIRTIO_CONSOLE_F_SIZE is not supported.
> 
> Paolo

Maybe just drop unnecessary stuff for new machine types?
Then we won't need hacks to migrate it.

-- 
MST

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12  9:37                 ` Michael S. Tsirkin
@ 2014-06-12  9:39                   ` Paolo Bonzini
  2014-06-12  9:43                     ` Alexander Graf
  2014-06-12 10:55                     ` Michael S. Tsirkin
  0 siblings, 2 replies; 68+ messages in thread
From: Paolo Bonzini @ 2014-06-12  9:39 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	qemu-devel, Alexander Graf, Stefan Hajnoczi, Amit Shah,
	Andreas Färber, Greg Kurz

Il 12/06/2014 11:37, Michael S. Tsirkin ha scritto:
> Maybe just drop unnecessary stuff for new machine types?
> Then we won't need hacks to migrate it.

For any machine type it's trivial to migrate it.  All machine types 
including old ones can disregard the migrated values.

Paolo

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12  9:39                   ` Paolo Bonzini
@ 2014-06-12  9:43                     ` Alexander Graf
  2014-06-12 10:14                       ` Greg Kurz
  2014-06-12 10:56                       ` Michael S. Tsirkin
  2014-06-12 10:55                     ` Michael S. Tsirkin
  1 sibling, 2 replies; 68+ messages in thread
From: Alexander Graf @ 2014-06-12  9:43 UTC (permalink / raw)
  To: Paolo Bonzini, Michael S. Tsirkin
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	qemu-devel, Anthony Liguori, Amit Shah, Andreas Färber,
	Greg Kurz


On 12.06.14 11:39, Paolo Bonzini wrote:
> Il 12/06/2014 11:37, Michael S. Tsirkin ha scritto:
>> Maybe just drop unnecessary stuff for new machine types?
>> Then we won't need hacks to migrate it.
>
> For any machine type it's trivial to migrate it.  All machine types 
> including old ones can disregard the migrated values.

How about a patch like this before the actual LE awareness ones? With 
this we should make virtio-serial config space completely independent of 
live migration.

Also since QEMU versions that do read these swapped values during 
migration are not bi-endian aware, we can never get into a case where a 
cross-endian save needs to be considered ;).


Alex


diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index 2b647b6..73cb9b7 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -670,6 +670,7 @@ static int virtio_serial_load(QEMUFile *f, void 
*opaque, int version_id)
      uint32_t max_nr_ports, nr_active_ports, ports_map;
      unsigned int i;
      int ret;
+    uint32_t tmp;

      if (version_id > 3) {
          return -EINVAL;
@@ -685,17 +686,12 @@ static int virtio_serial_load(QEMUFile *f, void 
*opaque, int version_id)
          return 0;
      }

-    /* The config space */
-    qemu_get_be16s(f, &s->config.cols);
-    qemu_get_be16s(f, &s->config.rows);
-
-    qemu_get_be32s(f, &max_nr_ports);
-    tswap32s(&max_nr_ports);
-    if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
-        /* Source could have had more ports than us. Fail migration. */
-        return -EINVAL;
-    }
+    /* Unused */
+    qemu_get_be16s(f, &tmp);
+    qemu_get_be16s(f, &tmp);
+    qemu_get_be32s(f, &tmp);

+    max_nr_ports = tswap32(s->config.max_nr_ports);
      for (i = 0; i < (max_nr_ports + 31) / 32; i++) {
          qemu_get_be32s(f, &ports_map);

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12  9:43                     ` Alexander Graf
@ 2014-06-12 10:14                       ` Greg Kurz
  2014-06-12 10:39                         ` Alexander Graf
  2014-06-12 10:57                         ` Michael S. Tsirkin
  2014-06-12 10:56                       ` Michael S. Tsirkin
  1 sibling, 2 replies; 68+ messages in thread
From: Greg Kurz @ 2014-06-12 10:14 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Amit Shah,
	Paolo Bonzini, Andreas Färber

On Thu, 12 Jun 2014 11:43:20 +0200
Alexander Graf <agraf@suse.de> wrote:

> 
> On 12.06.14 11:39, Paolo Bonzini wrote:
> > Il 12/06/2014 11:37, Michael S. Tsirkin ha scritto:
> >> Maybe just drop unnecessary stuff for new machine types?
> >> Then we won't need hacks to migrate it.
> >
> > For any machine type it's trivial to migrate it.  All machine types 
> > including old ones can disregard the migrated values.
> 
> How about a patch like this before the actual LE awareness ones? With 
> this we should make virtio-serial config space completely independent of 
> live migration.
> 
> Also since QEMU versions that do read these swapped values during 
> migration are not bi-endian aware, we can never get into a case where a 
> cross-endian save needs to be considered ;).
> 
> 
> Alex
> 
> 
> diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
> index 2b647b6..73cb9b7 100644
> --- a/hw/char/virtio-serial-bus.c
> +++ b/hw/char/virtio-serial-bus.c
> @@ -670,6 +670,7 @@ static int virtio_serial_load(QEMUFile *f, void 
> *opaque, int version_id)
>       uint32_t max_nr_ports, nr_active_ports, ports_map;
>       unsigned int i;
>       int ret;
> +    uint32_t tmp;
> 
>       if (version_id > 3) {
>           return -EINVAL;
> @@ -685,17 +686,12 @@ static int virtio_serial_load(QEMUFile *f, void 
> *opaque, int version_id)
>           return 0;
>       }
> 
> -    /* The config space */
> -    qemu_get_be16s(f, &s->config.cols);
> -    qemu_get_be16s(f, &s->config.rows);
> -
> -    qemu_get_be32s(f, &max_nr_ports);
> -    tswap32s(&max_nr_ports);
> -    if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> -        /* Source could have had more ports than us. Fail migration. */
> -        return -EINVAL;
> -    }
> +    /* Unused */
> +    qemu_get_be16s(f, &tmp);
> +    qemu_get_be16s(f, &tmp);
> +    qemu_get_be32s(f, &tmp);
> 
> +    max_nr_ports = tswap32(s->config.max_nr_ports);
>       for (i = 0; i < (max_nr_ports + 31) / 32; i++) {
>           qemu_get_be32s(f, &ports_map);
> 
> 

For the moment, we have 0 < max_nr_ports < 32 so the source
machine only stores a single 32 bit value... If this limit
gets raised, we can end up sending more than that... and
only the source machine max_nr_ports value can give the
information...

-- 
Gregory Kurz                                     kurzgreg@fr.ibm.com
                                                 gkurz@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12 10:14                       ` Greg Kurz
@ 2014-06-12 10:39                         ` Alexander Graf
  2014-06-12 10:50                           ` Greg Kurz
  2014-06-12 10:57                         ` Michael S. Tsirkin
  1 sibling, 1 reply; 68+ messages in thread
From: Alexander Graf @ 2014-06-12 10:39 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Amit Shah,
	Paolo Bonzini, Andreas Färber



> Am 12.06.2014 um 12:14 schrieb Greg Kurz <gkurz@linux.vnet.ibm.com>:
> 
> On Thu, 12 Jun 2014 11:43:20 +0200
> Alexander Graf <agraf@suse.de> wrote:
> 
>> 
>>> On 12.06.14 11:39, Paolo Bonzini wrote:
>>> Il 12/06/2014 11:37, Michael S. Tsirkin ha scritto:
>>>> Maybe just drop unnecessary stuff for new machine types?
>>>> Then we won't need hacks to migrate it.
>>> 
>>> For any machine type it's trivial to migrate it.  All machine types 
>>> including old ones can disregard the migrated values.
>> 
>> How about a patch like this before the actual LE awareness ones? With 
>> this we should make virtio-serial config space completely independent of 
>> live migration.
>> 
>> Also since QEMU versions that do read these swapped values during 
>> migration are not bi-endian aware, we can never get into a case where a 
>> cross-endian save needs to be considered ;).
>> 
>> 
>> Alex
>> 
>> 
>> diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
>> index 2b647b6..73cb9b7 100644
>> --- a/hw/char/virtio-serial-bus.c
>> +++ b/hw/char/virtio-serial-bus.c
>> @@ -670,6 +670,7 @@ static int virtio_serial_load(QEMUFile *f, void 
>> *opaque, int version_id)
>>      uint32_t max_nr_ports, nr_active_ports, ports_map;
>>      unsigned int i;
>>      int ret;
>> +    uint32_t tmp;
>> 
>>      if (version_id > 3) {
>>          return -EINVAL;
>> @@ -685,17 +686,12 @@ static int virtio_serial_load(QEMUFile *f, void 
>> *opaque, int version_id)
>>          return 0;
>>      }
>> 
>> -    /* The config space */
>> -    qemu_get_be16s(f, &s->config.cols);
>> -    qemu_get_be16s(f, &s->config.rows);
>> -
>> -    qemu_get_be32s(f, &max_nr_ports);
>> -    tswap32s(&max_nr_ports);
>> -    if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
>> -        /* Source could have had more ports than us. Fail migration. */
>> -        return -EINVAL;
>> -    }
>> +    /* Unused */
>> +    qemu_get_be16s(f, &tmp);
>> +    qemu_get_be16s(f, &tmp);
>> +    qemu_get_be32s(f, &tmp);
>> 
>> +    max_nr_ports = tswap32(s->config.max_nr_ports);
>>      for (i = 0; i < (max_nr_ports + 31) / 32; i++) {
>>          qemu_get_be32s(f, &ports_map);
> 
> For the moment, we have 0 < max_nr_ports < 32 so the source
> machine only stores a single 32 bit value... If this limit
> gets raised, we can end up sending more than that... and
> only the source machine max_nr_ports value can give the
> information...

Why? This value only ever gets set in realize, so it will not change during the lifetime of the device - which means we don't need to migrate it.


Alex

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12 10:39                         ` Alexander Graf
@ 2014-06-12 10:50                           ` Greg Kurz
  2014-06-12 10:58                             ` Alexander Graf
  2014-06-12 10:59                             ` Michael S. Tsirkin
  0 siblings, 2 replies; 68+ messages in thread
From: Greg Kurz @ 2014-06-12 10:50 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Amit Shah,
	Paolo Bonzini, Andreas Färber

On Thu, 12 Jun 2014 12:39:27 +0200
Alexander Graf <agraf@suse.de> wrote:

> 
> 
> > Am 12.06.2014 um 12:14 schrieb Greg Kurz <gkurz@linux.vnet.ibm.com>:
> > 
> > On Thu, 12 Jun 2014 11:43:20 +0200
> > Alexander Graf <agraf@suse.de> wrote:
> > 
> >> 
> >>> On 12.06.14 11:39, Paolo Bonzini wrote:
> >>> Il 12/06/2014 11:37, Michael S. Tsirkin ha scritto:
> >>>> Maybe just drop unnecessary stuff for new machine types?
> >>>> Then we won't need hacks to migrate it.
> >>> 
> >>> For any machine type it's trivial to migrate it.  All machine types 
> >>> including old ones can disregard the migrated values.
> >> 
> >> How about a patch like this before the actual LE awareness ones? With 
> >> this we should make virtio-serial config space completely independent of 
> >> live migration.
> >> 
> >> Also since QEMU versions that do read these swapped values during 
> >> migration are not bi-endian aware, we can never get into a case where a 
> >> cross-endian save needs to be considered ;).
> >> 
> >> 
> >> Alex
> >> 
> >> 
> >> diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
> >> index 2b647b6..73cb9b7 100644
> >> --- a/hw/char/virtio-serial-bus.c
> >> +++ b/hw/char/virtio-serial-bus.c
> >> @@ -670,6 +670,7 @@ static int virtio_serial_load(QEMUFile *f, void 
> >> *opaque, int version_id)
> >>      uint32_t max_nr_ports, nr_active_ports, ports_map;
> >>      unsigned int i;
> >>      int ret;
> >> +    uint32_t tmp;
> >> 
> >>      if (version_id > 3) {
> >>          return -EINVAL;
> >> @@ -685,17 +686,12 @@ static int virtio_serial_load(QEMUFile *f, void 
> >> *opaque, int version_id)
> >>          return 0;
> >>      }
> >> 
> >> -    /* The config space */
> >> -    qemu_get_be16s(f, &s->config.cols);
> >> -    qemu_get_be16s(f, &s->config.rows);
> >> -
> >> -    qemu_get_be32s(f, &max_nr_ports);
> >> -    tswap32s(&max_nr_ports);
> >> -    if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> >> -        /* Source could have had more ports than us. Fail migration. */
> >> -        return -EINVAL;
> >> -    }
> >> +    /* Unused */
> >> +    qemu_get_be16s(f, &tmp);
> >> +    qemu_get_be16s(f, &tmp);
> >> +    qemu_get_be32s(f, &tmp);
> >> 
> >> +    max_nr_ports = tswap32(s->config.max_nr_ports);
> >>      for (i = 0; i < (max_nr_ports + 31) / 32; i++) {
> >>          qemu_get_be32s(f, &ports_map);
> > 
> > For the moment, we have 0 < max_nr_ports < 32 so the source
> > machine only stores a single 32 bit value... If this limit
> > gets raised, we can end up sending more than that... and
> > only the source machine max_nr_ports value can give the
> > information...
> 
> Why? This value only ever gets set in realize, so it will not change during the lifetime of the device - which means we don't need to migrate it.
> 

I agree with the fact that the value does not change and should not be migrated in the first place.
I am just worried about the size of the active ports bitmap that is streamed in the for loop... it
is only 32 bit as of today, because we are limited to 32 ports. How would this work if the limit is
raised ? How can the destination machine know how many bits have to be read ?

-- 
Gregory Kurz                                     kurzgreg@fr.ibm.com
                                                 gkurz@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12  9:39                   ` Paolo Bonzini
  2014-06-12  9:43                     ` Alexander Graf
@ 2014-06-12 10:55                     ` Michael S. Tsirkin
  1 sibling, 0 replies; 68+ messages in thread
From: Michael S. Tsirkin @ 2014-06-12 10:55 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	qemu-devel, Alexander Graf, Stefan Hajnoczi, Amit Shah,
	Andreas Färber, Greg Kurz

On Thu, Jun 12, 2014 at 11:39:42AM +0200, Paolo Bonzini wrote:
> Il 12/06/2014 11:37, Michael S. Tsirkin ha scritto:
> >Maybe just drop unnecessary stuff for new machine types?
> >Then we won't need hacks to migrate it.
> 
> For any machine type it's trivial to migrate it.  All machine types
> including old ones can disregard the migrated values.
> 
> Paolo

Ah, good idea, so simply disregard the values in load.

-- 
MST

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12  9:43                     ` Alexander Graf
  2014-06-12 10:14                       ` Greg Kurz
@ 2014-06-12 10:56                       ` Michael S. Tsirkin
  1 sibling, 0 replies; 68+ messages in thread
From: Michael S. Tsirkin @ 2014-06-12 10:56 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	qemu-devel, Stefan Hajnoczi, Amit Shah, Paolo Bonzini,
	Andreas Färber, Greg Kurz

On Thu, Jun 12, 2014 at 11:43:20AM +0200, Alexander Graf wrote:
> 
> On 12.06.14 11:39, Paolo Bonzini wrote:
> >Il 12/06/2014 11:37, Michael S. Tsirkin ha scritto:
> >>Maybe just drop unnecessary stuff for new machine types?
> >>Then we won't need hacks to migrate it.
> >
> >For any machine type it's trivial to migrate it.  All machine types
> >including old ones can disregard the migrated values.
> 
> How about a patch like this before the actual LE awareness ones? With this
> we should make virtio-serial config space completely independent of live
> migration.
> 
> Also since QEMU versions that do read these swapped values during migration
> are not bi-endian aware, we can never get into a case where a cross-endian
> save needs to be considered ;).
> 
> 
> Alex
> 
> 
> diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
> index 2b647b6..73cb9b7 100644
> --- a/hw/char/virtio-serial-bus.c
> +++ b/hw/char/virtio-serial-bus.c
> @@ -670,6 +670,7 @@ static int virtio_serial_load(QEMUFile *f, void *opaque,
> int version_id)
>      uint32_t max_nr_ports, nr_active_ports, ports_map;
>      unsigned int i;
>      int ret;
> +    uint32_t tmp;
> 
>      if (version_id > 3) {
>          return -EINVAL;
> @@ -685,17 +686,12 @@ static int virtio_serial_load(QEMUFile *f, void
> *opaque, int version_id)
>          return 0;
>      }
> 
> -    /* The config space */
> -    qemu_get_be16s(f, &s->config.cols);
> -    qemu_get_be16s(f, &s->config.rows);
> -
> -    qemu_get_be32s(f, &max_nr_ports);
> -    tswap32s(&max_nr_ports);
> -    if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> -        /* Source could have had more ports than us. Fail migration. */
> -        return -EINVAL;
> -    }
> +    /* Unused */
> +    qemu_get_be16s(f, &tmp);
> +    qemu_get_be16s(f, &tmp);
> +    qemu_get_be32s(f, &tmp);
> 
> +    max_nr_ports = tswap32(s->config.max_nr_ports);
>      for (i = 0; i < (max_nr_ports + 31) / 32; i++) {
>          qemu_get_be32s(f, &ports_map);
>

Exactly.
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12 10:14                       ` Greg Kurz
  2014-06-12 10:39                         ` Alexander Graf
@ 2014-06-12 10:57                         ` Michael S. Tsirkin
  1 sibling, 0 replies; 68+ messages in thread
From: Michael S. Tsirkin @ 2014-06-12 10:57 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	Alexander Graf, qemu-devel, Stefan Hajnoczi, Amit Shah,
	Paolo Bonzini, Andreas Färber

On Thu, Jun 12, 2014 at 12:14:40PM +0200, Greg Kurz wrote:
> On Thu, 12 Jun 2014 11:43:20 +0200
> Alexander Graf <agraf@suse.de> wrote:
> 
> > 
> > On 12.06.14 11:39, Paolo Bonzini wrote:
> > > Il 12/06/2014 11:37, Michael S. Tsirkin ha scritto:
> > >> Maybe just drop unnecessary stuff for new machine types?
> > >> Then we won't need hacks to migrate it.
> > >
> > > For any machine type it's trivial to migrate it.  All machine types 
> > > including old ones can disregard the migrated values.
> > 
> > How about a patch like this before the actual LE awareness ones? With 
> > this we should make virtio-serial config space completely independent of 
> > live migration.
> > 
> > Also since QEMU versions that do read these swapped values during 
> > migration are not bi-endian aware, we can never get into a case where a 
> > cross-endian save needs to be considered ;).
> > 
> > 
> > Alex
> > 
> > 
> > diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
> > index 2b647b6..73cb9b7 100644
> > --- a/hw/char/virtio-serial-bus.c
> > +++ b/hw/char/virtio-serial-bus.c
> > @@ -670,6 +670,7 @@ static int virtio_serial_load(QEMUFile *f, void 
> > *opaque, int version_id)
> >       uint32_t max_nr_ports, nr_active_ports, ports_map;
> >       unsigned int i;
> >       int ret;
> > +    uint32_t tmp;
> > 
> >       if (version_id > 3) {
> >           return -EINVAL;
> > @@ -685,17 +686,12 @@ static int virtio_serial_load(QEMUFile *f, void 
> > *opaque, int version_id)
> >           return 0;
> >       }
> > 
> > -    /* The config space */
> > -    qemu_get_be16s(f, &s->config.cols);
> > -    qemu_get_be16s(f, &s->config.rows);
> > -
> > -    qemu_get_be32s(f, &max_nr_ports);
> > -    tswap32s(&max_nr_ports);
> > -    if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> > -        /* Source could have had more ports than us. Fail migration. */
> > -        return -EINVAL;
> > -    }
> > +    /* Unused */
> > +    qemu_get_be16s(f, &tmp);
> > +    qemu_get_be16s(f, &tmp);
> > +    qemu_get_be32s(f, &tmp);
> > 
> > +    max_nr_ports = tswap32(s->config.max_nr_ports);
> >       for (i = 0; i < (max_nr_ports + 31) / 32; i++) {
> >           qemu_get_be32s(f, &ports_map);
> > 
> > 
> 
> For the moment, we have 0 < max_nr_ports < 32 so the source
> machine only stores a single 32 bit value... If this limit
> gets raised, we can end up sending more than that... and
> only the source machine max_nr_ports value can give the
> information...

I don't think we need to worry.
We won't be able to change max_nr_ports in compat machine types.


> -- 
> Gregory Kurz                                     kurzgreg@fr.ibm.com
>                                                  gkurz@linux.vnet.ibm.com
> Software Engineer @ IBM/Meiosys                  http://www.ibm.com
> Tel +33 (0)562 165 496
> 
> "Anarchy is about taking complete responsibility for yourself."
>         Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12 10:50                           ` Greg Kurz
@ 2014-06-12 10:58                             ` Alexander Graf
  2014-06-12 10:59                             ` Michael S. Tsirkin
  1 sibling, 0 replies; 68+ messages in thread
From: Alexander Graf @ 2014-06-12 10:58 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Amit Shah,
	Paolo Bonzini, Andreas Färber



> Am 12.06.2014 um 12:50 schrieb Greg Kurz <gkurz@linux.vnet.ibm.com>:
> 
> On Thu, 12 Jun 2014 12:39:27 +0200
> Alexander Graf <agraf@suse.de> wrote:
> 
>> 
>> 
>>> Am 12.06.2014 um 12:14 schrieb Greg Kurz <gkurz@linux.vnet.ibm.com>:
>>> 
>>> On Thu, 12 Jun 2014 11:43:20 +0200
>>> Alexander Graf <agraf@suse.de> wrote:
>>> 
>>>> 
>>>>>> On 12.06.14 11:39, Paolo Bonzini wrote:
>>>>>> Il 12/06/2014 11:37, Michael S. Tsirkin ha scritto:
>>>>>> Maybe just drop unnecessary stuff for new machine types?
>>>>>> Then we won't need hacks to migrate it.
>>>>> 
>>>>> For any machine type it's trivial to migrate it.  All machine types 
>>>>> including old ones can disregard the migrated values.
>>>> 
>>>> How about a patch like this before the actual LE awareness ones? With 
>>>> this we should make virtio-serial config space completely independent of 
>>>> live migration.
>>>> 
>>>> Also since QEMU versions that do read these swapped values during 
>>>> migration are not bi-endian aware, we can never get into a case where a 
>>>> cross-endian save needs to be considered ;).
>>>> 
>>>> 
>>>> Alex
>>>> 
>>>> 
>>>> diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
>>>> index 2b647b6..73cb9b7 100644
>>>> --- a/hw/char/virtio-serial-bus.c
>>>> +++ b/hw/char/virtio-serial-bus.c
>>>> @@ -670,6 +670,7 @@ static int virtio_serial_load(QEMUFile *f, void 
>>>> *opaque, int version_id)
>>>>     uint32_t max_nr_ports, nr_active_ports, ports_map;
>>>>     unsigned int i;
>>>>     int ret;
>>>> +    uint32_t tmp;
>>>> 
>>>>     if (version_id > 3) {
>>>>         return -EINVAL;
>>>> @@ -685,17 +686,12 @@ static int virtio_serial_load(QEMUFile *f, void 
>>>> *opaque, int version_id)
>>>>         return 0;
>>>>     }
>>>> 
>>>> -    /* The config space */
>>>> -    qemu_get_be16s(f, &s->config.cols);
>>>> -    qemu_get_be16s(f, &s->config.rows);
>>>> -
>>>> -    qemu_get_be32s(f, &max_nr_ports);
>>>> -    tswap32s(&max_nr_ports);
>>>> -    if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
>>>> -        /* Source could have had more ports than us. Fail migration. */
>>>> -        return -EINVAL;
>>>> -    }
>>>> +    /* Unused */
>>>> +    qemu_get_be16s(f, &tmp);
>>>> +    qemu_get_be16s(f, &tmp);
>>>> +    qemu_get_be32s(f, &tmp);
>>>> 
>>>> +    max_nr_ports = tswap32(s->config.max_nr_ports);
>>>>     for (i = 0; i < (max_nr_ports + 31) / 32; i++) {
>>>>         qemu_get_be32s(f, &ports_map);
>>> 
>>> For the moment, we have 0 < max_nr_ports < 32 so the source
>>> machine only stores a single 32 bit value... If this limit
>>> gets raised, we can end up sending more than that... and
>>> only the source machine max_nr_ports value can give the
>>> information...
>> 
>> Why? This value only ever gets set in realize, so it will not change during the lifetime of the device - which means we don't need to migrate it.
> 
> I agree with the fact that the value does not change and should not be migrated in the first place.
> I am just worried about the size of the active ports bitmap that is streamed in the for loop... it
> is only 32 bit as of today, because we are limited to 32 ports. How would this work if the limit is
> raised ? How can the destination machine know how many bits have to be read ?

The destination has be be executed in compat mode to the older qemu version via a versioned machine type. This should ensure that limits are kept.


Alex

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12 10:50                           ` Greg Kurz
  2014-06-12 10:58                             ` Alexander Graf
@ 2014-06-12 10:59                             ` Michael S. Tsirkin
  2014-06-12 11:10                               ` Greg Kurz
  1 sibling, 1 reply; 68+ messages in thread
From: Michael S. Tsirkin @ 2014-06-12 10:59 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	Alexander Graf, qemu-devel, Stefan Hajnoczi, Amit Shah,
	Paolo Bonzini, Andreas Färber

On Thu, Jun 12, 2014 at 12:50:56PM +0200, Greg Kurz wrote:
> On Thu, 12 Jun 2014 12:39:27 +0200
> Alexander Graf <agraf@suse.de> wrote:
> 
> > 
> > 
> > > Am 12.06.2014 um 12:14 schrieb Greg Kurz <gkurz@linux.vnet.ibm.com>:
> > > 
> > > On Thu, 12 Jun 2014 11:43:20 +0200
> > > Alexander Graf <agraf@suse.de> wrote:
> > > 
> > >> 
> > >>> On 12.06.14 11:39, Paolo Bonzini wrote:
> > >>> Il 12/06/2014 11:37, Michael S. Tsirkin ha scritto:
> > >>>> Maybe just drop unnecessary stuff for new machine types?
> > >>>> Then we won't need hacks to migrate it.
> > >>> 
> > >>> For any machine type it's trivial to migrate it.  All machine types 
> > >>> including old ones can disregard the migrated values.
> > >> 
> > >> How about a patch like this before the actual LE awareness ones? With 
> > >> this we should make virtio-serial config space completely independent of 
> > >> live migration.
> > >> 
> > >> Also since QEMU versions that do read these swapped values during 
> > >> migration are not bi-endian aware, we can never get into a case where a 
> > >> cross-endian save needs to be considered ;).
> > >> 
> > >> 
> > >> Alex
> > >> 
> > >> 
> > >> diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
> > >> index 2b647b6..73cb9b7 100644
> > >> --- a/hw/char/virtio-serial-bus.c
> > >> +++ b/hw/char/virtio-serial-bus.c
> > >> @@ -670,6 +670,7 @@ static int virtio_serial_load(QEMUFile *f, void 
> > >> *opaque, int version_id)
> > >>      uint32_t max_nr_ports, nr_active_ports, ports_map;
> > >>      unsigned int i;
> > >>      int ret;
> > >> +    uint32_t tmp;
> > >> 
> > >>      if (version_id > 3) {
> > >>          return -EINVAL;
> > >> @@ -685,17 +686,12 @@ static int virtio_serial_load(QEMUFile *f, void 
> > >> *opaque, int version_id)
> > >>          return 0;
> > >>      }
> > >> 
> > >> -    /* The config space */
> > >> -    qemu_get_be16s(f, &s->config.cols);
> > >> -    qemu_get_be16s(f, &s->config.rows);
> > >> -
> > >> -    qemu_get_be32s(f, &max_nr_ports);
> > >> -    tswap32s(&max_nr_ports);
> > >> -    if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> > >> -        /* Source could have had more ports than us. Fail migration. */
> > >> -        return -EINVAL;
> > >> -    }
> > >> +    /* Unused */
> > >> +    qemu_get_be16s(f, &tmp);
> > >> +    qemu_get_be16s(f, &tmp);
> > >> +    qemu_get_be32s(f, &tmp);
> > >> 
> > >> +    max_nr_ports = tswap32(s->config.max_nr_ports);
> > >>      for (i = 0; i < (max_nr_ports + 31) / 32; i++) {
> > >>          qemu_get_be32s(f, &ports_map);
> > > 
> > > For the moment, we have 0 < max_nr_ports < 32 so the source
> > > machine only stores a single 32 bit value... If this limit
> > > gets raised, we can end up sending more than that... and
> > > only the source machine max_nr_ports value can give the
> > > information...
> > 
> > Why? This value only ever gets set in realize, so it will not change during the lifetime of the device - which means we don't need to migrate it.
> > 
> 
> I agree with the fact that the value does not change and should not be migrated in the first place.
> I am just worried about the size of the active ports bitmap that is streamed in the for loop... it
> is only 32 bit as of today, because we are limited to 32 ports. How would this work if the limit is
> raised ? How can the destination machine know how many bits have to be read ?

When the destination machine is started with -M 2.1, it
knows that it has to read 32 bit.
If started with -M 3.0 it reads in 42 bits :)


> -- 
> Gregory Kurz                                     kurzgreg@fr.ibm.com
>                                                  gkurz@linux.vnet.ibm.com
> Software Engineer @ IBM/Meiosys                  http://www.ibm.com
> Tel +33 (0)562 165 496
> 
> "Anarchy is about taking complete responsibility for yourself."
>         Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-06-12 10:59                             ` Michael S. Tsirkin
@ 2014-06-12 11:10                               ` Greg Kurz
  0 siblings, 0 replies; 68+ messages in thread
From: Greg Kurz @ 2014-06-12 11:10 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Juan Quintela,
	Alexander Graf, qemu-devel, Stefan Hajnoczi, Amit Shah,
	Paolo Bonzini, Andreas Färber

On Thu, 12 Jun 2014 13:59:59 +0300
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Thu, Jun 12, 2014 at 12:50:56PM +0200, Greg Kurz wrote:
> > On Thu, 12 Jun 2014 12:39:27 +0200
> > Alexander Graf <agraf@suse.de> wrote:
> > 
> > > 
> > > 
> > > > Am 12.06.2014 um 12:14 schrieb Greg Kurz <gkurz@linux.vnet.ibm.com>:
> > > > 
> > > > On Thu, 12 Jun 2014 11:43:20 +0200
> > > > Alexander Graf <agraf@suse.de> wrote:
> > > > 
> > > >> 
> > > >>> On 12.06.14 11:39, Paolo Bonzini wrote:
> > > >>> Il 12/06/2014 11:37, Michael S. Tsirkin ha scritto:
> > > >>>> Maybe just drop unnecessary stuff for new machine types?
> > > >>>> Then we won't need hacks to migrate it.
> > > >>> 
> > > >>> For any machine type it's trivial to migrate it.  All machine types 
> > > >>> including old ones can disregard the migrated values.
> > > >> 
> > > >> How about a patch like this before the actual LE awareness ones? With 
> > > >> this we should make virtio-serial config space completely independent of 
> > > >> live migration.
> > > >> 
> > > >> Also since QEMU versions that do read these swapped values during 
> > > >> migration are not bi-endian aware, we can never get into a case where a 
> > > >> cross-endian save needs to be considered ;).
> > > >> 
> > > >> 
> > > >> Alex
> > > >> 
> > > >> 
> > > >> diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
> > > >> index 2b647b6..73cb9b7 100644
> > > >> --- a/hw/char/virtio-serial-bus.c
> > > >> +++ b/hw/char/virtio-serial-bus.c
> > > >> @@ -670,6 +670,7 @@ static int virtio_serial_load(QEMUFile *f, void 
> > > >> *opaque, int version_id)
> > > >>      uint32_t max_nr_ports, nr_active_ports, ports_map;
> > > >>      unsigned int i;
> > > >>      int ret;
> > > >> +    uint32_t tmp;
> > > >> 
> > > >>      if (version_id > 3) {
> > > >>          return -EINVAL;
> > > >> @@ -685,17 +686,12 @@ static int virtio_serial_load(QEMUFile *f, void 
> > > >> *opaque, int version_id)
> > > >>          return 0;
> > > >>      }
> > > >> 
> > > >> -    /* The config space */
> > > >> -    qemu_get_be16s(f, &s->config.cols);
> > > >> -    qemu_get_be16s(f, &s->config.rows);
> > > >> -
> > > >> -    qemu_get_be32s(f, &max_nr_ports);
> > > >> -    tswap32s(&max_nr_ports);
> > > >> -    if (max_nr_ports > tswap32(s->config.max_nr_ports)) {
> > > >> -        /* Source could have had more ports than us. Fail migration. */
> > > >> -        return -EINVAL;
> > > >> -    }
> > > >> +    /* Unused */
> > > >> +    qemu_get_be16s(f, &tmp);
> > > >> +    qemu_get_be16s(f, &tmp);
> > > >> +    qemu_get_be32s(f, &tmp);
> > > >> 
> > > >> +    max_nr_ports = tswap32(s->config.max_nr_ports);
> > > >>      for (i = 0; i < (max_nr_ports + 31) / 32; i++) {
> > > >>          qemu_get_be32s(f, &ports_map);
> > > > 
> > > > For the moment, we have 0 < max_nr_ports < 32 so the source
> > > > machine only stores a single 32 bit value... If this limit
> > > > gets raised, we can end up sending more than that... and
> > > > only the source machine max_nr_ports value can give the
> > > > information...
> > > 
> > > Why? This value only ever gets set in realize, so it will not change during the lifetime of the device - which means we don't need to migrate it.
> > > 
> > 
> > I agree with the fact that the value does not change and should not be migrated in the first place.
> > I am just worried about the size of the active ports bitmap that is streamed in the for loop... it
> > is only 32 bit as of today, because we are limited to 32 ports. How would this work if the limit is
> > raised ? How can the destination machine know how many bits have to be read ?
> 
> When the destination machine is started with -M 2.1, it
> knows that it has to read 32 bit.
> If started with -M 3.0 it reads in 42 bits :)
> 

Okay ! I was completely missing this point... now things make sense at last ! :)
About Alex's patch, will you or Amit or Anthony apply it directly or should
I send it along with my patches for a full review ?

Thanks for your time. 

-- 
Gregory Kurz                                     kurzgreg@fr.ibm.com
                                                 gkurz@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-05-19 17:06     ` Andreas Färber
  2014-05-19 17:32       ` Greg Kurz
@ 2014-05-19 18:07       ` Dr. David Alan Gilbert
  1 sibling, 0 replies; 68+ messages in thread
From: Dr. David Alan Gilbert @ 2014-05-19 18:07 UTC (permalink / raw)
  To: Andreas F?rber
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Juan Quintela,
	Michael S. Tsirkin, Alexander Graf, qemu-devel, Anthony Liguori,
	Amit Shah, Paolo Bonzini, Greg Kurz

* Andreas F?rber (afaerber@suse.de) wrote:
> Am 19.05.2014 15:06, schrieb Greg Kurz:
> > On Mon, 19 May 2014 10:39:09 +0200
> > Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> >> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> >> index 7fbad29..6578854 100644
> >> --- a/hw/virtio/virtio.c
> >> +++ b/hw/virtio/virtio.c
> [...]
> >> @@ -839,10 +849,39 @@ typedef struct VirtIOSubsection {
> >>      int version_id;
> >>      void (*save)(VirtIODevice *vdev, QEMUFile *f);
> >>      int (*load)(VirtIODevice *vdev, QEMUFile *f);
> >> -    int (*needed)(VirtIODevice *vdev);
> >> +    bool (*needed)(VirtIODevice *vdev);
> >>  } VirtIOSubsection;
> >>
> >> +static void virtio_save_device_endian(VirtIODevice *vdev, QEMUFile *f)
> >> +{
> >> +    qemu_put_byte(f, vdev->device_endian);
> >> +}
> >> +
> >> +static int virtio_load_device_endian(VirtIODevice *vdev, QEMUFile *f)
> >> +{
> >> +    vdev->device_endian = qemu_get_byte(f);
> >> +    return 0;
> >> +}
> >> +
> >> +static bool virtio_device_endian_needed(VirtIODevice *vdev)
> >> +{
> >> +    /* No migration is supposed to occur while we are loading state.
> >> +     */
> >> +    assert(vdev->device_endian != VIRTIO_DEVICE_ENDIAN_UNKNOWN);
> >> +    if (target_words_bigendian()) {
> >> +        return vdev->device_endian == VIRTIO_DEVICE_ENDIAN_LITTLE;
> >> +    } else {
> >> +        return vdev->device_endian == VIRTIO_DEVICE_ENDIAN_BIG;
> >> +    }
> >> +}
> >> +
> >>  static const VirtIOSubsection virtio_subsection[] = {
> >> +    { .name = "virtio/device_endian",
> > 
> > Can anyone comment the subsection name ? Is there a chance the
> > VMState port would come up with the same ?
> > 
> >> +      .version_id = 1,
> >> +      .save = virtio_save_device_endian,
> >> +      .load = virtio_load_device_endian,
> >> +      .needed = virtio_device_endian_needed,
> >> +    },
> >>      { .name = NULL }
> >>  };
> >>
> 
> Different question: With converting VirtIO to VMState in mind, why are
> you not using a regular VMStateSubsection and loading/saving that as
> part of the old-style load/save functions? Is an API for that missing?

There are a handful of places that call into vmstate from a non-vmstate
routine but I don't think they're using plain subsections.

hw/pci/pci.c: pci_device_save/load
hw/scsi/spapr_vscsi.c: vscsi_save_request
hw/acpi/piix4.c: acpi_load_old

Dave
> 
> Regards,
> Andreas
> 
> -- 
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-05-19 17:06     ` Andreas Färber
@ 2014-05-19 17:32       ` Greg Kurz
  2014-05-19 18:07       ` Dr. David Alan Gilbert
  1 sibling, 0 replies; 68+ messages in thread
From: Greg Kurz @ 2014-05-19 17:32 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Michael S. Tsirkin,
	Juan Quintela, Alexander Graf, qemu-devel, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Mon, 19 May 2014 19:06:39 +0200
Andreas Färber <afaerber@suse.de> wrote:

> Am 19.05.2014 15:06, schrieb Greg Kurz:
> > On Mon, 19 May 2014 10:39:09 +0200
> > Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> >> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> >> index 7fbad29..6578854 100644
> >> --- a/hw/virtio/virtio.c
> >> +++ b/hw/virtio/virtio.c
> [...]
> >> @@ -839,10 +849,39 @@ typedef struct VirtIOSubsection {
> >>      int version_id;
> >>      void (*save)(VirtIODevice *vdev, QEMUFile *f);
> >>      int (*load)(VirtIODevice *vdev, QEMUFile *f);
> >> -    int (*needed)(VirtIODevice *vdev);
> >> +    bool (*needed)(VirtIODevice *vdev);
> >>  } VirtIOSubsection;
> >>
> >> +static void virtio_save_device_endian(VirtIODevice *vdev, QEMUFile *f)
> >> +{
> >> +    qemu_put_byte(f, vdev->device_endian);
> >> +}
> >> +
> >> +static int virtio_load_device_endian(VirtIODevice *vdev, QEMUFile *f)
> >> +{
> >> +    vdev->device_endian = qemu_get_byte(f);
> >> +    return 0;
> >> +}
> >> +
> >> +static bool virtio_device_endian_needed(VirtIODevice *vdev)
> >> +{
> >> +    /* No migration is supposed to occur while we are loading state.
> >> +     */
> >> +    assert(vdev->device_endian != VIRTIO_DEVICE_ENDIAN_UNKNOWN);
> >> +    if (target_words_bigendian()) {
> >> +        return vdev->device_endian == VIRTIO_DEVICE_ENDIAN_LITTLE;
> >> +    } else {
> >> +        return vdev->device_endian == VIRTIO_DEVICE_ENDIAN_BIG;
> >> +    }
> >> +}
> >> +
> >>  static const VirtIOSubsection virtio_subsection[] = {
> >> +    { .name = "virtio/device_endian",
> > 
> > Can anyone comment the subsection name ? Is there a chance the
> > VMState port would come up with the same ?
> > 
> >> +      .version_id = 1,
> >> +      .save = virtio_save_device_endian,
> >> +      .load = virtio_load_device_endian,
> >> +      .needed = virtio_device_endian_needed,
> >> +    },
> >>      { .name = NULL }
> >>  };
> >>
> 
> Different question: With converting VirtIO to VMState in mind, why are
> you not using a regular VMStateSubsection and loading/saving that as
> part of the old-style load/save functions? Is an API for that missing?
> 

I guess because I haven't tried yet. :)
I'll have a closer look at this.

> Regards,
> Andreas
> 

Thanks.

-- 
Gregory Kurz                                     kurzgreg@fr.ibm.com
                                                 gkurz@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-05-19 13:06   ` Greg Kurz
@ 2014-05-19 17:06     ` Andreas Färber
  2014-05-19 17:32       ` Greg Kurz
  2014-05-19 18:07       ` Dr. David Alan Gilbert
  0 siblings, 2 replies; 68+ messages in thread
From: Andreas Färber @ 2014-05-19 17:06 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Kevin Wolf, Fam Zheng, Anthony Liguori, Michael S. Tsirkin,
	Juan Quintela, Alexander Graf, qemu-devel, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

Am 19.05.2014 15:06, schrieb Greg Kurz:
> On Mon, 19 May 2014 10:39:09 +0200
> Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>> index 7fbad29..6578854 100644
>> --- a/hw/virtio/virtio.c
>> +++ b/hw/virtio/virtio.c
[...]
>> @@ -839,10 +849,39 @@ typedef struct VirtIOSubsection {
>>      int version_id;
>>      void (*save)(VirtIODevice *vdev, QEMUFile *f);
>>      int (*load)(VirtIODevice *vdev, QEMUFile *f);
>> -    int (*needed)(VirtIODevice *vdev);
>> +    bool (*needed)(VirtIODevice *vdev);
>>  } VirtIOSubsection;
>>
>> +static void virtio_save_device_endian(VirtIODevice *vdev, QEMUFile *f)
>> +{
>> +    qemu_put_byte(f, vdev->device_endian);
>> +}
>> +
>> +static int virtio_load_device_endian(VirtIODevice *vdev, QEMUFile *f)
>> +{
>> +    vdev->device_endian = qemu_get_byte(f);
>> +    return 0;
>> +}
>> +
>> +static bool virtio_device_endian_needed(VirtIODevice *vdev)
>> +{
>> +    /* No migration is supposed to occur while we are loading state.
>> +     */
>> +    assert(vdev->device_endian != VIRTIO_DEVICE_ENDIAN_UNKNOWN);
>> +    if (target_words_bigendian()) {
>> +        return vdev->device_endian == VIRTIO_DEVICE_ENDIAN_LITTLE;
>> +    } else {
>> +        return vdev->device_endian == VIRTIO_DEVICE_ENDIAN_BIG;
>> +    }
>> +}
>> +
>>  static const VirtIOSubsection virtio_subsection[] = {
>> +    { .name = "virtio/device_endian",
> 
> Can anyone comment the subsection name ? Is there a chance the
> VMState port would come up with the same ?
> 
>> +      .version_id = 1,
>> +      .save = virtio_save_device_endian,
>> +      .load = virtio_load_device_endian,
>> +      .needed = virtio_device_endian_needed,
>> +    },
>>      { .name = NULL }
>>  };
>>

Different question: With converting VirtIO to VMState in mind, why are
you not using a regular VMStateSubsection and loading/saving that as
part of the old-style load/save functions? Is an API for that missing?

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-05-19  8:39 ` [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice Greg Kurz
@ 2014-05-19 13:06   ` Greg Kurz
  2014-05-19 17:06     ` Andreas Färber
  0 siblings, 1 reply; 68+ messages in thread
From: Greg Kurz @ 2014-05-19 13:06 UTC (permalink / raw)
  To: Kevin Wolf, Anthony Liguori, Michael S. Tsirkin, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini
  Cc: qemu-devel, Fam Zheng, Alexander Graf, Andreas Färber,
	Juan Quintela

On Mon, 19 May 2014 10:39:09 +0200
Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:

> Some CPU families can dynamically change their endianness. This means we
> can have little endian ppc or big endian arm guests for example. This has
> an impact on legacy virtio data structures since they are target endian.
> We hence introduce a new property to track the endianness of each virtio
> device. It is reasonnably assumed that endianness won't change while the
> device is in use : we hence capture the device endianness when it gets
> reset.
> 
> Of course this property must be part of the migration stream as most of
> the virtio code will depend on it. It is hence migrated in a subsection
> to preserve compatibility with older migration streams. The tricky part
> is that the endianness gets known quite late and we must ensure no access
> is made to virtio data before that. It is for example invalid to call
> vring_avail_idx() as does the actual migration code: the VQ indexes sanity
> checks had to be moved from virtio_load() to virtio_load_subsections()
> because of that.
> 
> We enforce some paranoia by making the endianness a 3-value enum so
> that we can temporarily poison it while loading state.
> 
> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> ---
>  exec.c                     |    8 +---
>  hw/virtio/virtio-pci.c     |   11 ++----
>  hw/virtio/virtio.c         |   80 +++++++++++++++++++++++++++++++++++++-------
>  include/exec/cpu-common.h  |    1 +
>  include/hw/virtio/virtio.h |   13 +++++++
>  5 files changed, 87 insertions(+), 26 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index 91513c6..4e83588 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2740,13 +2740,7 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,
>  #endif
> 
>  #if !defined(CONFIG_USER_ONLY)
> -
> -/*
> - * A helper function for the _utterly broken_ virtio device model to find out if
> - * it's running on a big endian machine. Don't do this at home kids!
> - */
> -bool virtio_is_big_endian(void);
> -bool virtio_is_big_endian(void)
> +bool target_words_bigendian(void)
>  {
>  #if defined(TARGET_WORDS_BIGENDIAN)
>      return true;
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index ce97514..2ffceb8 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -89,9 +89,6 @@
>  /* Flags track per-device state like workarounds for quirks in older guests. */
>  #define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
> 
> -/* HACK for virtio to determine if it's running a big endian guest */
> -bool virtio_is_big_endian(void);
> -
>  static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
>                                 VirtIOPCIProxy *dev);
> 
> @@ -409,13 +406,13 @@ static uint64_t virtio_pci_config_read(void *opaque, hwaddr addr,
>          break;
>      case 2:
>          val = virtio_config_readw(vdev, addr);
> -        if (virtio_is_big_endian()) {
> +        if (virtio_is_big_endian(vdev)) {
>              val = bswap16(val);
>          }
>          break;
>      case 4:
>          val = virtio_config_readl(vdev, addr);
> -        if (virtio_is_big_endian()) {
> +        if (virtio_is_big_endian(vdev)) {
>              val = bswap32(val);
>          }
>          break;
> @@ -443,13 +440,13 @@ static void virtio_pci_config_write(void *opaque, hwaddr addr,
>          virtio_config_writeb(vdev, addr, val);
>          break;
>      case 2:
> -        if (virtio_is_big_endian()) {
> +        if (virtio_is_big_endian(vdev)) {
>              val = bswap16(val);
>          }
>          virtio_config_writew(vdev, addr, val);
>          break;
>      case 4:
> -        if (virtio_is_big_endian()) {
> +        if (virtio_is_big_endian(vdev)) {
>              val = bswap32(val);
>          }
>          virtio_config_writel(vdev, addr, val);
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 7fbad29..6578854 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -539,6 +539,15 @@ void virtio_set_status(VirtIODevice *vdev, uint8_t val)
>      vdev->status = val;
>  }
> 
> +static void virtio_set_endian_target_default(VirtIODevice *vdev)
> +{
> +    if (target_words_bigendian()) {
> +        vdev->device_endian = VIRTIO_DEVICE_ENDIAN_BIG;
> +    } else {
> +        vdev->device_endian = VIRTIO_DEVICE_ENDIAN_LITTLE;
> +    }
> +}
> +
>  void virtio_reset(void *opaque)
>  {
>      VirtIODevice *vdev = opaque;
> @@ -546,6 +555,7 @@ void virtio_reset(void *opaque)
>      int i;
> 
>      virtio_set_status(vdev, 0);
> +    virtio_set_endian_target_default(vdev);
> 
>      if (k->reset) {
>          k->reset(vdev);
> @@ -839,10 +849,39 @@ typedef struct VirtIOSubsection {
>      int version_id;
>      void (*save)(VirtIODevice *vdev, QEMUFile *f);
>      int (*load)(VirtIODevice *vdev, QEMUFile *f);
> -    int (*needed)(VirtIODevice *vdev);
> +    bool (*needed)(VirtIODevice *vdev);
>  } VirtIOSubsection;
> 
> +static void virtio_save_device_endian(VirtIODevice *vdev, QEMUFile *f)
> +{
> +    qemu_put_byte(f, vdev->device_endian);
> +}
> +
> +static int virtio_load_device_endian(VirtIODevice *vdev, QEMUFile *f)
> +{
> +    vdev->device_endian = qemu_get_byte(f);
> +    return 0;
> +}
> +
> +static bool virtio_device_endian_needed(VirtIODevice *vdev)
> +{
> +    /* No migration is supposed to occur while we are loading state.
> +     */
> +    assert(vdev->device_endian != VIRTIO_DEVICE_ENDIAN_UNKNOWN);
> +    if (target_words_bigendian()) {
> +        return vdev->device_endian == VIRTIO_DEVICE_ENDIAN_LITTLE;
> +    } else {
> +        return vdev->device_endian == VIRTIO_DEVICE_ENDIAN_BIG;
> +    }
> +}
> +
>  static const VirtIOSubsection virtio_subsection[] = {
> +    { .name = "virtio/device_endian",

Can anyone comment the subsection name ? Is there a chance the
VMState port would come up with the same ?

> +      .version_id = 1,
> +      .save = virtio_save_device_endian,
> +      .load = virtio_load_device_endian,
> +      .needed = virtio_device_endian_needed,
> +    },
>      { .name = NULL }
>  };
> 
> @@ -868,6 +907,8 @@ static void virtio_save_subsections(VirtIODevice *vdev, QEMUFile *f)
> 
>  static int virtio_load_subsections(VirtIODevice *vdev, QEMUFile *f)
>  {
> +    int i;
> +
>      while (qemu_peek_byte(f, 0) == QEMU_VM_SUBSECTION) {
>          char idstr[256];
>          uint8_t len, size;
> @@ -904,6 +945,26 @@ static int virtio_load_subsections(VirtIODevice *vdev, QEMUFile *f)
>          }
>      }
> 
> +    for (i = 0; i < VIRTIO_PCI_QUEUE_MAX; i++) {
> +        if (vdev->vq[i].vring.num == 0) {
> +            break;
> +        }
> +
> +        if (vdev->vq[i].pa) {
> +            uint16_t nheads;
> +            nheads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx;
> +            /* Check it isn't doing strange things with descriptor numbers. */
> +            if (nheads > vdev->vq[i].vring.num) {
> +                error_report("VQ %d size 0x%x Guest index 0x%x "
> +                             "inconsistent with Host index 0x%x: delta 0x%x",
> +                             i, vdev->vq[i].vring.num,
> +                             vring_avail_idx(&vdev->vq[i]),
> +                             vdev->vq[i].last_avail_idx, nheads);
> +                return -1;
> +            }
> +        }
> +    }
> +
>      return 0;
>  }
> 
> @@ -979,6 +1040,11 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
>      VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
>      VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
> 
> +    /* We poison the endianness to ensure it does not get used before
> +     * subsections have been loaded.
> +     */
> +    vdev->device_endian = VIRTIO_DEVICE_ENDIAN_UNKNOWN;
> +
>      if (k->load_config) {
>          ret = k->load_config(qbus->parent, f);
>          if (ret)
> @@ -1012,18 +1078,7 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
>          vdev->vq[i].notification = true;
> 
>          if (vdev->vq[i].pa) {
> -            uint16_t nheads;
>              virtqueue_init(&vdev->vq[i]);
> -            nheads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx;
> -            /* Check it isn't doing very strange things with descriptor numbers. */
> -            if (nheads > vdev->vq[i].vring.num) {
> -                error_report("VQ %d size 0x%x Guest index 0x%x "
> -                             "inconsistent with Host index 0x%x: delta 0x%x",
> -                             i, vdev->vq[i].vring.num,
> -                             vring_avail_idx(&vdev->vq[i]),
> -                             vdev->vq[i].last_avail_idx, nheads);
> -                return -1;
> -            }
>          } else if (vdev->vq[i].last_avail_idx) {
>              error_report("VQ %d address 0x0 "
>                           "inconsistent with Host index 0x%x",
> @@ -1100,6 +1155,7 @@ void virtio_init(VirtIODevice *vdev, const char *name,
>      }
>      vdev->vmstate = qemu_add_vm_change_state_handler(virtio_vmstate_change,
>                                                       vdev);
> +    virtio_set_endian_target_default(vdev);
>  }
> 
>  hwaddr virtio_queue_get_desc_addr(VirtIODevice *vdev, int n)
> diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
> index a21b65a..eb798c1 100644
> --- a/include/exec/cpu-common.h
> +++ b/include/exec/cpu-common.h
> @@ -122,4 +122,5 @@ void qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque);
> 
>  #endif
> 
> +bool target_words_bigendian(void);
>  #endif /* !CPU_COMMON_H */
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index 3505ce5..1c4a736 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -104,6 +104,12 @@ typedef struct VirtQueueElement
>  #define VIRTIO_DEVICE(obj) \
>          OBJECT_CHECK(VirtIODevice, (obj), TYPE_VIRTIO_DEVICE)
> 
> +enum virtio_device_endian {
> +    VIRTIO_DEVICE_ENDIAN_UNKNOWN,
> +    VIRTIO_DEVICE_ENDIAN_LITTLE,
> +    VIRTIO_DEVICE_ENDIAN_BIG,
> +};
> +
>  struct VirtIODevice
>  {
>      DeviceState parent_obj;
> @@ -121,6 +127,7 @@ struct VirtIODevice
>      bool vm_running;
>      VMChangeStateEntry *vmstate;
>      char *bus_name;
> +    enum virtio_device_endian device_endian;
>  };
> 
>  typedef struct VirtioDeviceClass {
> @@ -255,4 +262,10 @@ void virtio_queue_set_host_notifier_fd_handler(VirtQueue *vq, bool assign,
>                                                 bool set_handler);
>  void virtio_queue_notify_vq(VirtQueue *vq);
>  void virtio_irq(VirtQueue *vq);
> +
> +static inline bool virtio_is_big_endian(VirtIODevice *vdev)
> +{
> +    assert(vdev->device_endian != VIRTIO_DEVICE_ENDIAN_UNKNOWN);
> +    return vdev->device_endian == VIRTIO_DEVICE_ENDIAN_BIG;
> +}
>  #endif
> 
> 



-- 
Gregory Kurz                                     kurzgreg@fr.ibm.com
                                                 gkurz@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice
  2014-05-19  8:38 [Qemu-devel] [PATCH RFC V2 0/8] virtio: migrate new properties Greg Kurz
@ 2014-05-19  8:39 ` Greg Kurz
  2014-05-19 13:06   ` Greg Kurz
  0 siblings, 1 reply; 68+ messages in thread
From: Greg Kurz @ 2014-05-19  8:39 UTC (permalink / raw)
  To: Kevin Wolf, Anthony Liguori, Michael S. Tsirkin, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini
  Cc: Juan Quintela, Fam Zheng, Alexander Graf, Andreas Färber,
	qemu-devel

Some CPU families can dynamically change their endianness. This means we
can have little endian ppc or big endian arm guests for example. This has
an impact on legacy virtio data structures since they are target endian.
We hence introduce a new property to track the endianness of each virtio
device. It is reasonnably assumed that endianness won't change while the
device is in use : we hence capture the device endianness when it gets
reset.

Of course this property must be part of the migration stream as most of
the virtio code will depend on it. It is hence migrated in a subsection
to preserve compatibility with older migration streams. The tricky part
is that the endianness gets known quite late and we must ensure no access
is made to virtio data before that. It is for example invalid to call
vring_avail_idx() as does the actual migration code: the VQ indexes sanity
checks had to be moved from virtio_load() to virtio_load_subsections()
because of that.

We enforce some paranoia by making the endianness a 3-value enum so
that we can temporarily poison it while loading state.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 exec.c                     |    8 +---
 hw/virtio/virtio-pci.c     |   11 ++----
 hw/virtio/virtio.c         |   80 +++++++++++++++++++++++++++++++++++++-------
 include/exec/cpu-common.h  |    1 +
 include/hw/virtio/virtio.h |   13 +++++++
 5 files changed, 87 insertions(+), 26 deletions(-)

diff --git a/exec.c b/exec.c
index 91513c6..4e83588 100644
--- a/exec.c
+++ b/exec.c
@@ -2740,13 +2740,7 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,
 #endif
 
 #if !defined(CONFIG_USER_ONLY)
-
-/*
- * A helper function for the _utterly broken_ virtio device model to find out if
- * it's running on a big endian machine. Don't do this at home kids!
- */
-bool virtio_is_big_endian(void);
-bool virtio_is_big_endian(void)
+bool target_words_bigendian(void)
 {
 #if defined(TARGET_WORDS_BIGENDIAN)
     return true;
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index ce97514..2ffceb8 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -89,9 +89,6 @@
 /* Flags track per-device state like workarounds for quirks in older guests. */
 #define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
 
-/* HACK for virtio to determine if it's running a big endian guest */
-bool virtio_is_big_endian(void);
-
 static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
                                VirtIOPCIProxy *dev);
 
@@ -409,13 +406,13 @@ static uint64_t virtio_pci_config_read(void *opaque, hwaddr addr,
         break;
     case 2:
         val = virtio_config_readw(vdev, addr);
-        if (virtio_is_big_endian()) {
+        if (virtio_is_big_endian(vdev)) {
             val = bswap16(val);
         }
         break;
     case 4:
         val = virtio_config_readl(vdev, addr);
-        if (virtio_is_big_endian()) {
+        if (virtio_is_big_endian(vdev)) {
             val = bswap32(val);
         }
         break;
@@ -443,13 +440,13 @@ static void virtio_pci_config_write(void *opaque, hwaddr addr,
         virtio_config_writeb(vdev, addr, val);
         break;
     case 2:
-        if (virtio_is_big_endian()) {
+        if (virtio_is_big_endian(vdev)) {
             val = bswap16(val);
         }
         virtio_config_writew(vdev, addr, val);
         break;
     case 4:
-        if (virtio_is_big_endian()) {
+        if (virtio_is_big_endian(vdev)) {
             val = bswap32(val);
         }
         virtio_config_writel(vdev, addr, val);
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 7fbad29..6578854 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -539,6 +539,15 @@ void virtio_set_status(VirtIODevice *vdev, uint8_t val)
     vdev->status = val;
 }
 
+static void virtio_set_endian_target_default(VirtIODevice *vdev)
+{
+    if (target_words_bigendian()) {
+        vdev->device_endian = VIRTIO_DEVICE_ENDIAN_BIG;
+    } else {
+        vdev->device_endian = VIRTIO_DEVICE_ENDIAN_LITTLE;
+    }
+}
+
 void virtio_reset(void *opaque)
 {
     VirtIODevice *vdev = opaque;
@@ -546,6 +555,7 @@ void virtio_reset(void *opaque)
     int i;
 
     virtio_set_status(vdev, 0);
+    virtio_set_endian_target_default(vdev);
 
     if (k->reset) {
         k->reset(vdev);
@@ -839,10 +849,39 @@ typedef struct VirtIOSubsection {
     int version_id;
     void (*save)(VirtIODevice *vdev, QEMUFile *f);
     int (*load)(VirtIODevice *vdev, QEMUFile *f);
-    int (*needed)(VirtIODevice *vdev);
+    bool (*needed)(VirtIODevice *vdev);
 } VirtIOSubsection;
 
+static void virtio_save_device_endian(VirtIODevice *vdev, QEMUFile *f)
+{
+    qemu_put_byte(f, vdev->device_endian);
+}
+
+static int virtio_load_device_endian(VirtIODevice *vdev, QEMUFile *f)
+{
+    vdev->device_endian = qemu_get_byte(f);
+    return 0;
+}
+
+static bool virtio_device_endian_needed(VirtIODevice *vdev)
+{
+    /* No migration is supposed to occur while we are loading state.
+     */
+    assert(vdev->device_endian != VIRTIO_DEVICE_ENDIAN_UNKNOWN);
+    if (target_words_bigendian()) {
+        return vdev->device_endian == VIRTIO_DEVICE_ENDIAN_LITTLE;
+    } else {
+        return vdev->device_endian == VIRTIO_DEVICE_ENDIAN_BIG;
+    }
+}
+
 static const VirtIOSubsection virtio_subsection[] = {
+    { .name = "virtio/device_endian",
+      .version_id = 1,
+      .save = virtio_save_device_endian,
+      .load = virtio_load_device_endian,
+      .needed = virtio_device_endian_needed,
+    },
     { .name = NULL }
 };
 
@@ -868,6 +907,8 @@ static void virtio_save_subsections(VirtIODevice *vdev, QEMUFile *f)
 
 static int virtio_load_subsections(VirtIODevice *vdev, QEMUFile *f)
 {
+    int i;
+
     while (qemu_peek_byte(f, 0) == QEMU_VM_SUBSECTION) {
         char idstr[256];
         uint8_t len, size;
@@ -904,6 +945,26 @@ static int virtio_load_subsections(VirtIODevice *vdev, QEMUFile *f)
         }
     }
 
+    for (i = 0; i < VIRTIO_PCI_QUEUE_MAX; i++) {
+        if (vdev->vq[i].vring.num == 0) {
+            break;
+        }
+
+        if (vdev->vq[i].pa) {
+            uint16_t nheads;
+            nheads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx;
+            /* Check it isn't doing strange things with descriptor numbers. */
+            if (nheads > vdev->vq[i].vring.num) {
+                error_report("VQ %d size 0x%x Guest index 0x%x "
+                             "inconsistent with Host index 0x%x: delta 0x%x",
+                             i, vdev->vq[i].vring.num,
+                             vring_avail_idx(&vdev->vq[i]),
+                             vdev->vq[i].last_avail_idx, nheads);
+                return -1;
+            }
+        }
+    }
+
     return 0;
 }
 
@@ -979,6 +1040,11 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
     VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
     VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
 
+    /* We poison the endianness to ensure it does not get used before
+     * subsections have been loaded.
+     */
+    vdev->device_endian = VIRTIO_DEVICE_ENDIAN_UNKNOWN;
+
     if (k->load_config) {
         ret = k->load_config(qbus->parent, f);
         if (ret)
@@ -1012,18 +1078,7 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
         vdev->vq[i].notification = true;
 
         if (vdev->vq[i].pa) {
-            uint16_t nheads;
             virtqueue_init(&vdev->vq[i]);
-            nheads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx;
-            /* Check it isn't doing very strange things with descriptor numbers. */
-            if (nheads > vdev->vq[i].vring.num) {
-                error_report("VQ %d size 0x%x Guest index 0x%x "
-                             "inconsistent with Host index 0x%x: delta 0x%x",
-                             i, vdev->vq[i].vring.num,
-                             vring_avail_idx(&vdev->vq[i]),
-                             vdev->vq[i].last_avail_idx, nheads);
-                return -1;
-            }
         } else if (vdev->vq[i].last_avail_idx) {
             error_report("VQ %d address 0x0 "
                          "inconsistent with Host index 0x%x",
@@ -1100,6 +1155,7 @@ void virtio_init(VirtIODevice *vdev, const char *name,
     }
     vdev->vmstate = qemu_add_vm_change_state_handler(virtio_vmstate_change,
                                                      vdev);
+    virtio_set_endian_target_default(vdev);
 }
 
 hwaddr virtio_queue_get_desc_addr(VirtIODevice *vdev, int n)
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index a21b65a..eb798c1 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -122,4 +122,5 @@ void qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque);
 
 #endif
 
+bool target_words_bigendian(void);
 #endif /* !CPU_COMMON_H */
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 3505ce5..1c4a736 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -104,6 +104,12 @@ typedef struct VirtQueueElement
 #define VIRTIO_DEVICE(obj) \
         OBJECT_CHECK(VirtIODevice, (obj), TYPE_VIRTIO_DEVICE)
 
+enum virtio_device_endian {
+    VIRTIO_DEVICE_ENDIAN_UNKNOWN,
+    VIRTIO_DEVICE_ENDIAN_LITTLE,
+    VIRTIO_DEVICE_ENDIAN_BIG,
+};
+
 struct VirtIODevice
 {
     DeviceState parent_obj;
@@ -121,6 +127,7 @@ struct VirtIODevice
     bool vm_running;
     VMChangeStateEntry *vmstate;
     char *bus_name;
+    enum virtio_device_endian device_endian;
 };
 
 typedef struct VirtioDeviceClass {
@@ -255,4 +262,10 @@ void virtio_queue_set_host_notifier_fd_handler(VirtQueue *vq, bool assign,
                                                bool set_handler);
 void virtio_queue_notify_vq(VirtQueue *vq);
 void virtio_irq(VirtQueue *vq);
+
+static inline bool virtio_is_big_endian(VirtIODevice *vdev)
+{
+    assert(vdev->device_endian != VIRTIO_DEVICE_ENDIAN_UNKNOWN);
+    return vdev->device_endian == VIRTIO_DEVICE_ENDIAN_BIG;
+}
 #endif

^ permalink raw reply related	[flat|nested] 68+ messages in thread

end of thread, other threads:[~2014-06-12 11:10 UTC | newest]

Thread overview: 68+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-14 15:41 [Qemu-devel] [PATCH RFC 0/8] virtio: migrate new properties Greg Kurz
2014-05-14 15:41 ` [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream Greg Kurz
2014-05-15  6:04   ` Amit Shah
2014-05-15  6:23     ` Michael S. Tsirkin
2014-05-15  6:46       ` Amit Shah
2014-05-15  7:04         ` Greg Kurz
2014-05-15  9:20           ` Andreas Färber
2014-05-15  9:52             ` Michael S. Tsirkin
2014-05-15  9:58               ` Andreas Färber
2014-05-15 10:03                 ` Michael S. Tsirkin
2014-05-15 10:11                   ` Andreas Färber
2014-05-15 10:16                     ` Michael S. Tsirkin
2014-05-15 12:00                       ` Andreas Färber
2014-05-15 12:20                         ` Michael S. Tsirkin
2014-05-15 13:47                           ` Markus Armbruster
2014-05-15 13:49                         ` Greg Kurz
2014-05-15 12:33               ` Markus Armbruster
2014-05-15 12:58                 ` Michael S. Tsirkin
2014-05-15 13:35                   ` Greg Kurz
2014-05-15 10:08             ` Greg Kurz
2014-05-15 10:12               ` Michael S. Tsirkin
2014-05-15 10:21                 ` Greg Kurz
2014-05-15 10:16               ` Greg Kurz
2014-05-16  9:14               ` Fam Zheng
2014-05-16  9:22                 ` Andreas Färber
2014-05-16  9:40                   ` Fam Zheng
2014-05-16  9:48                     ` Greg Kurz
2014-05-17 18:29           ` Michael S. Tsirkin
2014-05-15  7:14         ` Michael S. Tsirkin
2014-05-15  6:49     ` Greg Kurz
2014-05-15  6:55       ` Amit Shah
2014-05-15  7:12       ` Michael S. Tsirkin
2014-05-14 15:41 ` [Qemu-devel] [PATCH RFC 2/8] virtio-net: migrate subsections Greg Kurz
2014-05-14 15:41 ` [Qemu-devel] [PATCH RFC 3/8] virtio-blk: " Greg Kurz
2014-05-14 15:42 ` [Qemu-devel] [PATCH RFC 4/8] virtio-scsi: " Greg Kurz
2014-05-14 15:42 ` [Qemu-devel] [PATCH RFC 5/8] virtio-serial: " Greg Kurz
2014-05-14 15:42 ` [Qemu-devel] [PATCH RFC 6/8] virtio-balloon: " Greg Kurz
2014-05-14 15:42 ` [Qemu-devel] [PATCH RFC 7/8] virtio-rng: " Greg Kurz
2014-05-14 15:42 ` [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice Greg Kurz
     [not found]   ` <5384A8D2.8050104@redhat.com>
     [not found]     ` <20140529111253.4ff55199@bahia.local>
     [not found]       ` <538708FA.4070309@redhat.com>
2014-06-12  7:43         ` Greg Kurz
2014-06-12  7:54           ` Michael S. Tsirkin
2014-06-12  8:47             ` Greg Kurz
2014-06-12  9:05               ` Michael S. Tsirkin
2014-06-12  9:06               ` Alexander Graf
2014-06-12  8:55             ` Alexander Graf
2014-06-12  9:07               ` Michael S. Tsirkin
2014-06-12  9:08                 ` Alexander Graf
2014-06-12  8:55           ` Paolo Bonzini
2014-06-12  8:57             ` Alexander Graf
2014-06-12  9:06             ` Greg Kurz
2014-06-12  9:19               ` Paolo Bonzini
2014-06-12  9:37                 ` Michael S. Tsirkin
2014-06-12  9:39                   ` Paolo Bonzini
2014-06-12  9:43                     ` Alexander Graf
2014-06-12 10:14                       ` Greg Kurz
2014-06-12 10:39                         ` Alexander Graf
2014-06-12 10:50                           ` Greg Kurz
2014-06-12 10:58                             ` Alexander Graf
2014-06-12 10:59                             ` Michael S. Tsirkin
2014-06-12 11:10                               ` Greg Kurz
2014-06-12 10:57                         ` Michael S. Tsirkin
2014-06-12 10:56                       ` Michael S. Tsirkin
2014-06-12 10:55                     ` Michael S. Tsirkin
2014-05-19  8:38 [Qemu-devel] [PATCH RFC V2 0/8] virtio: migrate new properties Greg Kurz
2014-05-19  8:39 ` [Qemu-devel] [PATCH RFC 8/8] virtio: add endian-ambivalent support to VirtIODevice Greg Kurz
2014-05-19 13:06   ` Greg Kurz
2014-05-19 17:06     ` Andreas Färber
2014-05-19 17:32       ` Greg Kurz
2014-05-19 18:07       ` Dr. David Alan Gilbert

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.