All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH RFC 0/7] vhost: cross-endian support (vhost-net only)
@ 2015-05-06 12:07 ` Greg Kurz
  0 siblings, 0 replies; 44+ messages in thread
From: Greg Kurz @ 2015-05-06 12:07 UTC (permalink / raw)
  To: Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
  Cc: qemu-devel, virtualization

Hi,

This series allows QEMU to use vhost with legacy virtio devices when
host and target don't have the same endianness. Only network devices
are covered for the moment.

I had already posted a series some monthes ago but it never got reviewed.
Moreover, the underlying kernel support was entirely re-written and is still
waiting to be applied by Michael. I hence post as RFC.

The corresponding kernel patches are available here:

http://lists.linuxfoundation.org/pipermail/virtualization/2015-April/029885.html

Please comment.

---

Cédric Le Goater (1):
      vhost_net: re-enable when cross endian

Greg Kurz (6):
      virtio: relax feature check
      linux-headers: sync vhost.h
      virtio: introduce virtio_legacy_is_cross_endian()
      vhost: set vring endianness for legacy virtio
      tap: add VNET_LE/VNET_BE operations
      vhost-net: tell tap backend about the vnet endianness


 hw/net/vhost_net.c                |   50 +++++++++++++++++++++++--------------
 hw/virtio/vhost.c                 |   50 ++++++++++++++++++++++++++++++++++++-
 include/hw/virtio/virtio-access.h |   13 ++++++++++
 include/hw/virtio/virtio.h        |    1 -
 include/net/net.h                 |    6 ++++
 linux-headers/linux/vhost.h       |   14 ++++++++++
 net/net.c                         |   18 +++++++++++++
 net/tap-linux.c                   |   34 +++++++++++++++++++++++++
 net/tap-linux.h                   |    2 +
 net/tap.c                         |   16 ++++++++++++
 net/tap_int.h                     |    2 +
 11 files changed, 185 insertions(+), 21 deletions(-)

--
Greg

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH RFC 0/7] vhost: cross-endian support (vhost-net only)
@ 2015-05-06 12:07 ` Greg Kurz
  0 siblings, 0 replies; 44+ messages in thread
From: Greg Kurz @ 2015-05-06 12:07 UTC (permalink / raw)
  To: Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
  Cc: qemu-devel, virtualization

Hi,

This series allows QEMU to use vhost with legacy virtio devices when
host and target don't have the same endianness. Only network devices
are covered for the moment.

I had already posted a series some monthes ago but it never got reviewed.
Moreover, the underlying kernel support was entirely re-written and is still
waiting to be applied by Michael. I hence post as RFC.

The corresponding kernel patches are available here:

http://lists.linuxfoundation.org/pipermail/virtualization/2015-April/029885.html

Please comment.

---

Cédric Le Goater (1):
      vhost_net: re-enable when cross endian

Greg Kurz (6):
      virtio: relax feature check
      linux-headers: sync vhost.h
      virtio: introduce virtio_legacy_is_cross_endian()
      vhost: set vring endianness for legacy virtio
      tap: add VNET_LE/VNET_BE operations
      vhost-net: tell tap backend about the vnet endianness


 hw/net/vhost_net.c                |   50 +++++++++++++++++++++++--------------
 hw/virtio/vhost.c                 |   50 ++++++++++++++++++++++++++++++++++++-
 include/hw/virtio/virtio-access.h |   13 ++++++++++
 include/hw/virtio/virtio.h        |    1 -
 include/net/net.h                 |    6 ++++
 linux-headers/linux/vhost.h       |   14 ++++++++++
 net/net.c                         |   18 +++++++++++++
 net/tap-linux.c                   |   34 +++++++++++++++++++++++++
 net/tap-linux.h                   |    2 +
 net/tap.c                         |   16 ++++++++++++
 net/tap_int.h                     |    2 +
 11 files changed, 185 insertions(+), 21 deletions(-)

--
Greg

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check
  2015-05-06 12:07 ` Greg Kurz
@ 2015-05-06 12:07   ` Greg Kurz
  -1 siblings, 0 replies; 44+ messages in thread
From: Greg Kurz @ 2015-05-06 12:07 UTC (permalink / raw)
  To: Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
  Cc: qemu-devel, virtualization

Unlike with add and clear, there is no valid reason to abort when checking
for a feature. It makes more sense to return false (i.e. the feature bit
isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.

This allows to introduce code that is aware about new 64-bit features like
VIRTIO_F_VERSION_1, even if they are still not implemented.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 include/hw/virtio/virtio.h |    1 -
 1 file changed, 1 deletion(-)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index d95f8b6..6ef70f1 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t *features, unsigned int fbit)
 
 static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
 {
-    assert(fbit < 32);
     return !!(features & (1 << fbit));
 }
 

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH RFC 1/7] virtio: relax feature check
@ 2015-05-06 12:07   ` Greg Kurz
  0 siblings, 0 replies; 44+ messages in thread
From: Greg Kurz @ 2015-05-06 12:07 UTC (permalink / raw)
  To: Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
  Cc: qemu-devel, virtualization

Unlike with add and clear, there is no valid reason to abort when checking
for a feature. It makes more sense to return false (i.e. the feature bit
isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.

This allows to introduce code that is aware about new 64-bit features like
VIRTIO_F_VERSION_1, even if they are still not implemented.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 include/hw/virtio/virtio.h |    1 -
 1 file changed, 1 deletion(-)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index d95f8b6..6ef70f1 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t *features, unsigned int fbit)
 
 static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
 {
-    assert(fbit < 32);
     return !!(features & (1 << fbit));
 }

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [Qemu-devel] [PATCH RFC 2/7] linux-headers: sync vhost.h
  2015-05-06 12:07 ` Greg Kurz
@ 2015-05-06 12:07   ` Greg Kurz
  -1 siblings, 0 replies; 44+ messages in thread
From: Greg Kurz @ 2015-05-06 12:07 UTC (permalink / raw)
  To: Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
  Cc: qemu-devel, virtualization

This patch brings the cross-endian vhost API to QEMU.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 linux-headers/linux/vhost.h |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
index c656f61..ead86db 100644
--- a/linux-headers/linux/vhost.h
+++ b/linux-headers/linux/vhost.h
@@ -103,6 +103,20 @@ struct vhost_memory {
 /* Get accessor: reads index, writes value in num */
 #define VHOST_GET_VRING_BASE _IOWR(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
 
+/* Set the vring byte order in num. Valid values are VHOST_VRING_LITTLE_ENDIAN
+ * or VHOST_VRING_BIG_ENDIAN (other values return -EINVAL).
+ * The byte order cannot be changed while the device is active: trying to do so
+ * returns -EBUSY.
+ * This is a legacy only API that is simply ignored when VIRTIO_F_VERSION_1 is
+ * set.
+ * Not all kernel configurations support this ioctl, but all configurations that
+ * support SET also support GET.
+ */
+#define VHOST_VRING_LITTLE_ENDIAN 0
+#define VHOST_VRING_BIG_ENDIAN 1
+#define VHOST_SET_VRING_ENDIAN _IOW(VHOST_VIRTIO, 0x13, struct vhost_vring_state)
+#define VHOST_GET_VRING_ENDIAN _IOW(VHOST_VIRTIO, 0x14, struct vhost_vring_state)
+
 /* The following ioctls use eventfd file descriptors to signal and poll
  * for events. */
 

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH RFC 2/7] linux-headers: sync vhost.h
@ 2015-05-06 12:07   ` Greg Kurz
  0 siblings, 0 replies; 44+ messages in thread
From: Greg Kurz @ 2015-05-06 12:07 UTC (permalink / raw)
  To: Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
  Cc: qemu-devel, virtualization

This patch brings the cross-endian vhost API to QEMU.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 linux-headers/linux/vhost.h |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
index c656f61..ead86db 100644
--- a/linux-headers/linux/vhost.h
+++ b/linux-headers/linux/vhost.h
@@ -103,6 +103,20 @@ struct vhost_memory {
 /* Get accessor: reads index, writes value in num */
 #define VHOST_GET_VRING_BASE _IOWR(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
 
+/* Set the vring byte order in num. Valid values are VHOST_VRING_LITTLE_ENDIAN
+ * or VHOST_VRING_BIG_ENDIAN (other values return -EINVAL).
+ * The byte order cannot be changed while the device is active: trying to do so
+ * returns -EBUSY.
+ * This is a legacy only API that is simply ignored when VIRTIO_F_VERSION_1 is
+ * set.
+ * Not all kernel configurations support this ioctl, but all configurations that
+ * support SET also support GET.
+ */
+#define VHOST_VRING_LITTLE_ENDIAN 0
+#define VHOST_VRING_BIG_ENDIAN 1
+#define VHOST_SET_VRING_ENDIAN _IOW(VHOST_VIRTIO, 0x13, struct vhost_vring_state)
+#define VHOST_GET_VRING_ENDIAN _IOW(VHOST_VIRTIO, 0x14, struct vhost_vring_state)
+
 /* The following ioctls use eventfd file descriptors to signal and poll
  * for events. */

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [Qemu-devel] [PATCH RFC 3/7] virtio: introduce virtio_legacy_is_cross_endian()
  2015-05-06 12:07 ` Greg Kurz
@ 2015-05-06 12:07   ` Greg Kurz
  -1 siblings, 0 replies; 44+ messages in thread
From: Greg Kurz @ 2015-05-06 12:07 UTC (permalink / raw)
  To: Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
  Cc: qemu-devel, virtualization

This helper will be used by vhost and tap to detect cross-endianness in
the legacy virtio case.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 include/hw/virtio/virtio-access.h |   13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/include/hw/virtio/virtio-access.h b/include/hw/virtio/virtio-access.h
index 46456fd..caf0940 100644
--- a/include/hw/virtio/virtio-access.h
+++ b/include/hw/virtio/virtio-access.h
@@ -28,6 +28,19 @@ static inline bool virtio_access_is_big_endian(VirtIODevice *vdev)
 #endif
 }
 
+static inline bool virtio_legacy_is_cross_endian(VirtIODevice *vdev)
+{
+#ifdef TARGET_IS_BIENDIAN
+#ifdef HOST_WORDS_BIGENDIAN
+    return !virtio_is_big_endian(vdev);
+#else
+    return virtio_is_big_endian(vdev);
+#endif
+#else
+    return false;
+#endif
+}
+
 static inline uint16_t virtio_lduw_phys(VirtIODevice *vdev, hwaddr pa)
 {
     if (virtio_access_is_big_endian(vdev)) {

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH RFC 3/7] virtio: introduce virtio_legacy_is_cross_endian()
@ 2015-05-06 12:07   ` Greg Kurz
  0 siblings, 0 replies; 44+ messages in thread
From: Greg Kurz @ 2015-05-06 12:07 UTC (permalink / raw)
  To: Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
  Cc: qemu-devel, virtualization

This helper will be used by vhost and tap to detect cross-endianness in
the legacy virtio case.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 include/hw/virtio/virtio-access.h |   13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/include/hw/virtio/virtio-access.h b/include/hw/virtio/virtio-access.h
index 46456fd..caf0940 100644
--- a/include/hw/virtio/virtio-access.h
+++ b/include/hw/virtio/virtio-access.h
@@ -28,6 +28,19 @@ static inline bool virtio_access_is_big_endian(VirtIODevice *vdev)
 #endif
 }
 
+static inline bool virtio_legacy_is_cross_endian(VirtIODevice *vdev)
+{
+#ifdef TARGET_IS_BIENDIAN
+#ifdef HOST_WORDS_BIGENDIAN
+    return !virtio_is_big_endian(vdev);
+#else
+    return virtio_is_big_endian(vdev);
+#endif
+#else
+    return false;
+#endif
+}
+
 static inline uint16_t virtio_lduw_phys(VirtIODevice *vdev, hwaddr pa)
 {
     if (virtio_access_is_big_endian(vdev)) {

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio
  2015-05-06 12:07 ` Greg Kurz
@ 2015-05-06 12:08   ` Greg Kurz
  -1 siblings, 0 replies; 44+ messages in thread
From: Greg Kurz @ 2015-05-06 12:08 UTC (permalink / raw)
  To: Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
  Cc: qemu-devel, virtualization

Legacy virtio is native endian: if the guest and host endianness differ,
we have to tell vhost so it can swap bytes where appropriate. This is
done through a vhost ring ioctl.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 hw/virtio/vhost.c |   50 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 49 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 54851b7..1d7b939 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -17,9 +17,11 @@
 #include "hw/hw.h"
 #include "qemu/atomic.h"
 #include "qemu/range.h"
+#include "qemu/error-report.h"
 #include <linux/vhost.h>
 #include "exec/address-spaces.h"
 #include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
 #include "migration/migration.h"
 
 static void vhost_dev_sync_region(struct vhost_dev *dev,
@@ -647,6 +649,27 @@ static void vhost_log_stop(MemoryListener *listener,
     /* FIXME: implement */
 }
 
+static int vhost_virtqueue_set_vring_endian_legacy(struct vhost_dev *dev,
+                                                   bool is_big_endian,
+                                                   int vhost_vq_index)
+{
+    struct vhost_vring_state s = {
+        .index = vhost_vq_index,
+        .num = is_big_endian
+    };
+
+    if (!dev->vhost_ops->vhost_call(dev, VHOST_SET_VRING_ENDIAN, &s)) {
+        return 0;
+    }
+
+    if (errno == ENOTTY) {
+        error_report("vhost does not support cross-endian");
+        return -ENOSYS;
+    }
+
+    return -errno;
+}
+
 static int vhost_virtqueue_start(struct vhost_dev *dev,
                                 struct VirtIODevice *vdev,
                                 struct vhost_virtqueue *vq,
@@ -677,6 +700,16 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
         return -errno;
     }
 
+    if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) &&
+        virtio_legacy_is_cross_endian(vdev)) {
+        r = vhost_virtqueue_set_vring_endian_legacy(dev,
+                                                    virtio_is_big_endian(vdev),
+                                                    vhost_vq_index);
+        if (r) {
+            return -errno;
+        }
+    }
+
     s = l = virtio_queue_get_desc_size(vdev, idx);
     a = virtio_queue_get_desc_addr(vdev, idx);
     vq->desc = cpu_physical_memory_map(a, &l, 0);
@@ -747,8 +780,9 @@ static void vhost_virtqueue_stop(struct vhost_dev *dev,
                                     struct vhost_virtqueue *vq,
                                     unsigned idx)
 {
+    int vhost_vq_index = idx - dev->vq_index;
     struct vhost_vring_state state = {
-        .index = idx - dev->vq_index
+        .index = vhost_vq_index,
     };
     int r;
     assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs);
@@ -759,6 +793,20 @@ static void vhost_virtqueue_stop(struct vhost_dev *dev,
     }
     virtio_queue_set_last_avail_idx(vdev, idx, state.num);
     virtio_queue_invalidate_signalled_used(vdev, idx);
+
+    /* In the cross-endian case, we need to reset the vring endianness to
+     * native as legacy devices expect so by default.
+     */
+    if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) &&
+        virtio_legacy_is_cross_endian(vdev)) {
+        r = vhost_virtqueue_set_vring_endian_legacy(dev,
+                                                    !virtio_is_big_endian(vdev),
+                                                    vhost_vq_index);
+        if (r < 0) {
+            error_report("failed to reset vring endianness");
+        }
+    }
+
     assert (r >= 0);
     cpu_physical_memory_unmap(vq->ring, virtio_queue_get_ring_size(vdev, idx),
                               0, virtio_queue_get_ring_size(vdev, idx));

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio
@ 2015-05-06 12:08   ` Greg Kurz
  0 siblings, 0 replies; 44+ messages in thread
From: Greg Kurz @ 2015-05-06 12:08 UTC (permalink / raw)
  To: Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
  Cc: qemu-devel, virtualization

Legacy virtio is native endian: if the guest and host endianness differ,
we have to tell vhost so it can swap bytes where appropriate. This is
done through a vhost ring ioctl.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 hw/virtio/vhost.c |   50 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 49 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 54851b7..1d7b939 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -17,9 +17,11 @@
 #include "hw/hw.h"
 #include "qemu/atomic.h"
 #include "qemu/range.h"
+#include "qemu/error-report.h"
 #include <linux/vhost.h>
 #include "exec/address-spaces.h"
 #include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
 #include "migration/migration.h"
 
 static void vhost_dev_sync_region(struct vhost_dev *dev,
@@ -647,6 +649,27 @@ static void vhost_log_stop(MemoryListener *listener,
     /* FIXME: implement */
 }
 
+static int vhost_virtqueue_set_vring_endian_legacy(struct vhost_dev *dev,
+                                                   bool is_big_endian,
+                                                   int vhost_vq_index)
+{
+    struct vhost_vring_state s = {
+        .index = vhost_vq_index,
+        .num = is_big_endian
+    };
+
+    if (!dev->vhost_ops->vhost_call(dev, VHOST_SET_VRING_ENDIAN, &s)) {
+        return 0;
+    }
+
+    if (errno == ENOTTY) {
+        error_report("vhost does not support cross-endian");
+        return -ENOSYS;
+    }
+
+    return -errno;
+}
+
 static int vhost_virtqueue_start(struct vhost_dev *dev,
                                 struct VirtIODevice *vdev,
                                 struct vhost_virtqueue *vq,
@@ -677,6 +700,16 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
         return -errno;
     }
 
+    if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) &&
+        virtio_legacy_is_cross_endian(vdev)) {
+        r = vhost_virtqueue_set_vring_endian_legacy(dev,
+                                                    virtio_is_big_endian(vdev),
+                                                    vhost_vq_index);
+        if (r) {
+            return -errno;
+        }
+    }
+
     s = l = virtio_queue_get_desc_size(vdev, idx);
     a = virtio_queue_get_desc_addr(vdev, idx);
     vq->desc = cpu_physical_memory_map(a, &l, 0);
@@ -747,8 +780,9 @@ static void vhost_virtqueue_stop(struct vhost_dev *dev,
                                     struct vhost_virtqueue *vq,
                                     unsigned idx)
 {
+    int vhost_vq_index = idx - dev->vq_index;
     struct vhost_vring_state state = {
-        .index = idx - dev->vq_index
+        .index = vhost_vq_index,
     };
     int r;
     assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs);
@@ -759,6 +793,20 @@ static void vhost_virtqueue_stop(struct vhost_dev *dev,
     }
     virtio_queue_set_last_avail_idx(vdev, idx, state.num);
     virtio_queue_invalidate_signalled_used(vdev, idx);
+
+    /* In the cross-endian case, we need to reset the vring endianness to
+     * native as legacy devices expect so by default.
+     */
+    if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) &&
+        virtio_legacy_is_cross_endian(vdev)) {
+        r = vhost_virtqueue_set_vring_endian_legacy(dev,
+                                                    !virtio_is_big_endian(vdev),
+                                                    vhost_vq_index);
+        if (r < 0) {
+            error_report("failed to reset vring endianness");
+        }
+    }
+
     assert (r >= 0);
     cpu_physical_memory_unmap(vq->ring, virtio_queue_get_ring_size(vdev, idx),
                               0, virtio_queue_get_ring_size(vdev, idx));

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [Qemu-devel] [PATCH RFC 5/7] tap: add VNET_LE/VNET_BE operations
  2015-05-06 12:07 ` Greg Kurz
@ 2015-05-06 12:08   ` Greg Kurz
  -1 siblings, 0 replies; 44+ messages in thread
From: Greg Kurz @ 2015-05-06 12:08 UTC (permalink / raw)
  To: Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
  Cc: qemu-devel, virtualization

The linux tap and macvtap backends can be told to parse vnet headers
according to little or big endian. This is done through the TUNSETVNETLE
and TUNSETVNETBE ioctls.

This patch brings all the plumbing for QEMU to use these APIs.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 include/net/net.h |    6 ++++++
 net/net.c         |   18 ++++++++++++++++++
 net/tap-linux.c   |   34 ++++++++++++++++++++++++++++++++++
 net/tap-linux.h   |    2 ++
 net/tap.c         |   16 ++++++++++++++++
 net/tap_int.h     |    2 ++
 6 files changed, 78 insertions(+)

diff --git a/include/net/net.h b/include/net/net.h
index 50ffcb9..86f57f7 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -55,6 +55,8 @@ typedef bool (HasVnetHdrLen)(NetClientState *, int);
 typedef void (UsingVnetHdr)(NetClientState *, bool);
 typedef void (SetOffload)(NetClientState *, int, int, int, int, int);
 typedef void (SetVnetHdrLen)(NetClientState *, int);
+typedef int (SetVnetLE)(NetClientState *, bool);
+typedef int (SetVnetBE)(NetClientState *, bool);
 
 typedef struct NetClientInfo {
     NetClientOptionsKind type;
@@ -73,6 +75,8 @@ typedef struct NetClientInfo {
     UsingVnetHdr *using_vnet_hdr;
     SetOffload *set_offload;
     SetVnetHdrLen *set_vnet_hdr_len;
+    SetVnetLE *set_vnet_le;
+    SetVnetBE *set_vnet_be;
 } NetClientInfo;
 
 struct NetClientState {
@@ -138,6 +142,8 @@ void qemu_using_vnet_hdr(NetClientState *nc, bool enable);
 void qemu_set_offload(NetClientState *nc, int csum, int tso4, int tso6,
                       int ecn, int ufo);
 void qemu_set_vnet_hdr_len(NetClientState *nc, int len);
+int qemu_set_vnet_le(NetClientState *nc, bool is_le);
+int qemu_set_vnet_be(NetClientState *nc, bool is_be);
 void qemu_macaddr_default_if_unset(MACAddr *macaddr);
 int qemu_show_nic_models(const char *arg, const char *const *models);
 void qemu_check_nic_model(NICInfo *nd, const char *model);
diff --git a/net/net.c b/net/net.c
index 0be084d..eb8ef3e 100644
--- a/net/net.c
+++ b/net/net.c
@@ -452,6 +452,24 @@ void qemu_set_vnet_hdr_len(NetClientState *nc, int len)
     nc->info->set_vnet_hdr_len(nc, len);
 }
 
+int qemu_set_vnet_le(NetClientState *nc, bool is_le)
+{
+    if (!nc || !nc->info->set_vnet_le) {
+        return -ENOSYS;
+    }
+
+    return nc->info->set_vnet_le(nc, is_le);
+}
+
+int qemu_set_vnet_be(NetClientState *nc, bool is_be)
+{
+    if (!nc || !nc->info->set_vnet_be) {
+        return -ENOSYS;
+    }
+
+    return nc->info->set_vnet_be(nc, is_be);
+}
+
 int qemu_can_send_packet(NetClientState *sender)
 {
     int vm_running = runstate_is_running();
diff --git a/net/tap-linux.c b/net/tap-linux.c
index 812bf2d..15b57a7 100644
--- a/net/tap-linux.c
+++ b/net/tap-linux.c
@@ -198,6 +198,40 @@ void tap_fd_set_vnet_hdr_len(int fd, int len)
     }
 }
 
+int tap_fd_set_vnet_le(int fd, int is_le)
+{
+    int arg = is_le ? 1 : 0;
+
+    if (!ioctl(fd, TUNSETVNETLE, &arg)) {
+        return 0;
+    }
+
+    /* Check if our kernel supports TUNSETVNETLE */
+    if (errno == EINVAL) {
+        return -errno;
+    }
+
+    error_report("TUNSETVNETLE ioctl() failed: %s.\n", strerror(errno));
+    abort();
+}
+
+int tap_fd_set_vnet_be(int fd, int is_be)
+{
+    int arg = is_be ? 1 : 0;
+
+    if (!ioctl(fd, TUNSETVNETBE, &arg)) {
+        return 0;
+    }
+
+    /* Check if our kernel supports TUNSETVNETBE */
+    if (errno == EINVAL) {
+        return -errno;
+    }
+
+    error_report("TUNSETVNETBE ioctl() failed: %s.\n", strerror(errno));
+    abort();
+}
+
 void tap_fd_set_offload(int fd, int csum, int tso4,
                         int tso6, int ecn, int ufo)
 {
diff --git a/net/tap-linux.h b/net/tap-linux.h
index 1cf35d4..01dc6f8 100644
--- a/net/tap-linux.h
+++ b/net/tap-linux.h
@@ -30,6 +30,8 @@
 #define TUNGETVNETHDRSZ _IOR('T', 215, int)
 #define TUNSETVNETHDRSZ _IOW('T', 216, int)
 #define TUNSETQUEUE  _IOW('T', 217, int)
+#define TUNSETVNETLE _IOW('T', 220, int)
+#define TUNSETVNETBE _IOW('T', 222, int)
 
 #endif
 
diff --git a/net/tap.c b/net/tap.c
index 968df46..c6f9a7d 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -274,6 +274,20 @@ static void tap_using_vnet_hdr(NetClientState *nc, bool using_vnet_hdr)
     s->using_vnet_hdr = using_vnet_hdr;
 }
 
+static int tap_set_vnet_le(NetClientState *nc, bool is_le)
+{
+    TAPState *s = DO_UPCAST(TAPState, nc, nc);
+
+    return tap_fd_set_vnet_le(s->fd, is_le);
+}
+
+static int tap_set_vnet_be(NetClientState *nc, bool is_be)
+{
+    TAPState *s = DO_UPCAST(TAPState, nc, nc);
+
+    return tap_fd_set_vnet_be(s->fd, is_be);
+}
+
 static void tap_set_offload(NetClientState *nc, int csum, int tso4,
                      int tso6, int ecn, int ufo)
 {
@@ -335,6 +349,8 @@ static NetClientInfo net_tap_info = {
     .using_vnet_hdr = tap_using_vnet_hdr,
     .set_offload = tap_set_offload,
     .set_vnet_hdr_len = tap_set_vnet_hdr_len,
+    .set_vnet_le = tap_set_vnet_le,
+    .set_vnet_be = tap_set_vnet_be,
 };
 
 static TAPState *net_tap_fd_init(NetClientState *peer,
diff --git a/net/tap_int.h b/net/tap_int.h
index 79afdf2..5cb79fc 100644
--- a/net/tap_int.h
+++ b/net/tap_int.h
@@ -40,6 +40,8 @@ int tap_probe_vnet_hdr_len(int fd, int len);
 int tap_probe_has_ufo(int fd);
 void tap_fd_set_offload(int fd, int csum, int tso4, int tso6, int ecn, int ufo);
 void tap_fd_set_vnet_hdr_len(int fd, int len);
+int tap_fd_set_vnet_le(int fd, int vnet_is_le);
+int tap_fd_set_vnet_be(int fd, int vnet_is_be);
 int tap_fd_enable(int fd);
 int tap_fd_disable(int fd);
 int tap_fd_get_ifname(int fd, char *ifname);

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH RFC 5/7] tap: add VNET_LE/VNET_BE operations
@ 2015-05-06 12:08   ` Greg Kurz
  0 siblings, 0 replies; 44+ messages in thread
From: Greg Kurz @ 2015-05-06 12:08 UTC (permalink / raw)
  To: Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
  Cc: qemu-devel, virtualization

The linux tap and macvtap backends can be told to parse vnet headers
according to little or big endian. This is done through the TUNSETVNETLE
and TUNSETVNETBE ioctls.

This patch brings all the plumbing for QEMU to use these APIs.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 include/net/net.h |    6 ++++++
 net/net.c         |   18 ++++++++++++++++++
 net/tap-linux.c   |   34 ++++++++++++++++++++++++++++++++++
 net/tap-linux.h   |    2 ++
 net/tap.c         |   16 ++++++++++++++++
 net/tap_int.h     |    2 ++
 6 files changed, 78 insertions(+)

diff --git a/include/net/net.h b/include/net/net.h
index 50ffcb9..86f57f7 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -55,6 +55,8 @@ typedef bool (HasVnetHdrLen)(NetClientState *, int);
 typedef void (UsingVnetHdr)(NetClientState *, bool);
 typedef void (SetOffload)(NetClientState *, int, int, int, int, int);
 typedef void (SetVnetHdrLen)(NetClientState *, int);
+typedef int (SetVnetLE)(NetClientState *, bool);
+typedef int (SetVnetBE)(NetClientState *, bool);
 
 typedef struct NetClientInfo {
     NetClientOptionsKind type;
@@ -73,6 +75,8 @@ typedef struct NetClientInfo {
     UsingVnetHdr *using_vnet_hdr;
     SetOffload *set_offload;
     SetVnetHdrLen *set_vnet_hdr_len;
+    SetVnetLE *set_vnet_le;
+    SetVnetBE *set_vnet_be;
 } NetClientInfo;
 
 struct NetClientState {
@@ -138,6 +142,8 @@ void qemu_using_vnet_hdr(NetClientState *nc, bool enable);
 void qemu_set_offload(NetClientState *nc, int csum, int tso4, int tso6,
                       int ecn, int ufo);
 void qemu_set_vnet_hdr_len(NetClientState *nc, int len);
+int qemu_set_vnet_le(NetClientState *nc, bool is_le);
+int qemu_set_vnet_be(NetClientState *nc, bool is_be);
 void qemu_macaddr_default_if_unset(MACAddr *macaddr);
 int qemu_show_nic_models(const char *arg, const char *const *models);
 void qemu_check_nic_model(NICInfo *nd, const char *model);
diff --git a/net/net.c b/net/net.c
index 0be084d..eb8ef3e 100644
--- a/net/net.c
+++ b/net/net.c
@@ -452,6 +452,24 @@ void qemu_set_vnet_hdr_len(NetClientState *nc, int len)
     nc->info->set_vnet_hdr_len(nc, len);
 }
 
+int qemu_set_vnet_le(NetClientState *nc, bool is_le)
+{
+    if (!nc || !nc->info->set_vnet_le) {
+        return -ENOSYS;
+    }
+
+    return nc->info->set_vnet_le(nc, is_le);
+}
+
+int qemu_set_vnet_be(NetClientState *nc, bool is_be)
+{
+    if (!nc || !nc->info->set_vnet_be) {
+        return -ENOSYS;
+    }
+
+    return nc->info->set_vnet_be(nc, is_be);
+}
+
 int qemu_can_send_packet(NetClientState *sender)
 {
     int vm_running = runstate_is_running();
diff --git a/net/tap-linux.c b/net/tap-linux.c
index 812bf2d..15b57a7 100644
--- a/net/tap-linux.c
+++ b/net/tap-linux.c
@@ -198,6 +198,40 @@ void tap_fd_set_vnet_hdr_len(int fd, int len)
     }
 }
 
+int tap_fd_set_vnet_le(int fd, int is_le)
+{
+    int arg = is_le ? 1 : 0;
+
+    if (!ioctl(fd, TUNSETVNETLE, &arg)) {
+        return 0;
+    }
+
+    /* Check if our kernel supports TUNSETVNETLE */
+    if (errno == EINVAL) {
+        return -errno;
+    }
+
+    error_report("TUNSETVNETLE ioctl() failed: %s.\n", strerror(errno));
+    abort();
+}
+
+int tap_fd_set_vnet_be(int fd, int is_be)
+{
+    int arg = is_be ? 1 : 0;
+
+    if (!ioctl(fd, TUNSETVNETBE, &arg)) {
+        return 0;
+    }
+
+    /* Check if our kernel supports TUNSETVNETBE */
+    if (errno == EINVAL) {
+        return -errno;
+    }
+
+    error_report("TUNSETVNETBE ioctl() failed: %s.\n", strerror(errno));
+    abort();
+}
+
 void tap_fd_set_offload(int fd, int csum, int tso4,
                         int tso6, int ecn, int ufo)
 {
diff --git a/net/tap-linux.h b/net/tap-linux.h
index 1cf35d4..01dc6f8 100644
--- a/net/tap-linux.h
+++ b/net/tap-linux.h
@@ -30,6 +30,8 @@
 #define TUNGETVNETHDRSZ _IOR('T', 215, int)
 #define TUNSETVNETHDRSZ _IOW('T', 216, int)
 #define TUNSETQUEUE  _IOW('T', 217, int)
+#define TUNSETVNETLE _IOW('T', 220, int)
+#define TUNSETVNETBE _IOW('T', 222, int)
 
 #endif
 
diff --git a/net/tap.c b/net/tap.c
index 968df46..c6f9a7d 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -274,6 +274,20 @@ static void tap_using_vnet_hdr(NetClientState *nc, bool using_vnet_hdr)
     s->using_vnet_hdr = using_vnet_hdr;
 }
 
+static int tap_set_vnet_le(NetClientState *nc, bool is_le)
+{
+    TAPState *s = DO_UPCAST(TAPState, nc, nc);
+
+    return tap_fd_set_vnet_le(s->fd, is_le);
+}
+
+static int tap_set_vnet_be(NetClientState *nc, bool is_be)
+{
+    TAPState *s = DO_UPCAST(TAPState, nc, nc);
+
+    return tap_fd_set_vnet_be(s->fd, is_be);
+}
+
 static void tap_set_offload(NetClientState *nc, int csum, int tso4,
                      int tso6, int ecn, int ufo)
 {
@@ -335,6 +349,8 @@ static NetClientInfo net_tap_info = {
     .using_vnet_hdr = tap_using_vnet_hdr,
     .set_offload = tap_set_offload,
     .set_vnet_hdr_len = tap_set_vnet_hdr_len,
+    .set_vnet_le = tap_set_vnet_le,
+    .set_vnet_be = tap_set_vnet_be,
 };
 
 static TAPState *net_tap_fd_init(NetClientState *peer,
diff --git a/net/tap_int.h b/net/tap_int.h
index 79afdf2..5cb79fc 100644
--- a/net/tap_int.h
+++ b/net/tap_int.h
@@ -40,6 +40,8 @@ int tap_probe_vnet_hdr_len(int fd, int len);
 int tap_probe_has_ufo(int fd);
 void tap_fd_set_offload(int fd, int csum, int tso4, int tso6, int ecn, int ufo);
 void tap_fd_set_vnet_hdr_len(int fd, int len);
+int tap_fd_set_vnet_le(int fd, int vnet_is_le);
+int tap_fd_set_vnet_be(int fd, int vnet_is_be);
 int tap_fd_enable(int fd);
 int tap_fd_disable(int fd);
 int tap_fd_get_ifname(int fd, char *ifname);

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [Qemu-devel] [PATCH RFC 6/7] vhost-net: tell tap backend about the vnet endianness
  2015-05-06 12:07 ` Greg Kurz
@ 2015-05-06 12:08   ` Greg Kurz
  -1 siblings, 0 replies; 44+ messages in thread
From: Greg Kurz @ 2015-05-06 12:08 UTC (permalink / raw)
  To: Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
  Cc: qemu-devel, virtualization

The default behaviour for TAP/MACVTAP is to consider vnet as native endian.

This patch handles the cases when this is not true:
- virtio 1.0: always little-endian
- legacy cross-endian

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 hw/net/vhost_net.c |   33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index cf23335..1884e59 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -38,6 +38,7 @@
 #include "standard-headers/linux/virtio_ring.h"
 #include "hw/virtio/vhost.h"
 #include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
 
 struct vhost_net {
     struct vhost_dev dev;
@@ -194,6 +195,27 @@ static void vhost_net_set_vq_index(struct vhost_net *net, int vq_index)
     net->dev.vq_index = vq_index;
 }
 
+static int vhost_net_set_vnet_endian(VirtIODevice *dev, NetClientState *peer,
+                                     bool set)
+{
+    int r = 0;
+
+    if (virtio_has_feature(dev, VIRTIO_F_VERSION_1) ||
+        (virtio_legacy_is_cross_endian(dev) && !virtio_is_big_endian(dev))) {
+        r = qemu_set_vnet_le(peer, set);
+        if (r) {
+            error_report("backend does not support LE vnet headers");
+        }
+    } else if (virtio_legacy_is_cross_endian(dev)) {
+        r = qemu_set_vnet_be(peer, set);
+        if (r) {
+            error_report("backend does not support BE vnet headers");
+        }
+    }
+
+    return r;
+}
+
 static int vhost_net_start_one(struct vhost_net *net,
                                VirtIODevice *dev)
 {
@@ -304,6 +326,11 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
         goto err;
     }
 
+    r = vhost_net_set_vnet_endian(dev, ncs[0].peer, true);
+    if (r < 0) {
+        goto err;
+    }
+
     for (i = 0; i < total_queues; i++) {
         vhost_net_set_vq_index(get_vhost_net(ncs[i].peer), i * 2);
     }
@@ -311,7 +338,7 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
     r = k->set_guest_notifiers(qbus->parent, total_queues * 2, true);
     if (r < 0) {
         error_report("Error binding guest notifier: %d", -r);
-        goto err;
+        goto err_endian;
     }
 
     for (i = 0; i < total_queues; i++) {
@@ -333,6 +360,8 @@ err_start:
         fprintf(stderr, "vhost guest notifier cleanup failed: %d\n", e);
         fflush(stderr);
     }
+err_endian:
+    vhost_net_set_vnet_endian(dev, ncs[0].peer, false);
 err:
     return r;
 }
@@ -355,6 +384,8 @@ void vhost_net_stop(VirtIODevice *dev, NetClientState *ncs,
         fflush(stderr);
     }
     assert(r >= 0);
+
+    assert(vhost_net_set_vnet_endian(dev, ncs[0].peer, false) >= 0);
 }
 
 void vhost_net_cleanup(struct vhost_net *net)

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH RFC 6/7] vhost-net: tell tap backend about the vnet endianness
@ 2015-05-06 12:08   ` Greg Kurz
  0 siblings, 0 replies; 44+ messages in thread
From: Greg Kurz @ 2015-05-06 12:08 UTC (permalink / raw)
  To: Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
  Cc: qemu-devel, virtualization

The default behaviour for TAP/MACVTAP is to consider vnet as native endian.

This patch handles the cases when this is not true:
- virtio 1.0: always little-endian
- legacy cross-endian

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 hw/net/vhost_net.c |   33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index cf23335..1884e59 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -38,6 +38,7 @@
 #include "standard-headers/linux/virtio_ring.h"
 #include "hw/virtio/vhost.h"
 #include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
 
 struct vhost_net {
     struct vhost_dev dev;
@@ -194,6 +195,27 @@ static void vhost_net_set_vq_index(struct vhost_net *net, int vq_index)
     net->dev.vq_index = vq_index;
 }
 
+static int vhost_net_set_vnet_endian(VirtIODevice *dev, NetClientState *peer,
+                                     bool set)
+{
+    int r = 0;
+
+    if (virtio_has_feature(dev, VIRTIO_F_VERSION_1) ||
+        (virtio_legacy_is_cross_endian(dev) && !virtio_is_big_endian(dev))) {
+        r = qemu_set_vnet_le(peer, set);
+        if (r) {
+            error_report("backend does not support LE vnet headers");
+        }
+    } else if (virtio_legacy_is_cross_endian(dev)) {
+        r = qemu_set_vnet_be(peer, set);
+        if (r) {
+            error_report("backend does not support BE vnet headers");
+        }
+    }
+
+    return r;
+}
+
 static int vhost_net_start_one(struct vhost_net *net,
                                VirtIODevice *dev)
 {
@@ -304,6 +326,11 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
         goto err;
     }
 
+    r = vhost_net_set_vnet_endian(dev, ncs[0].peer, true);
+    if (r < 0) {
+        goto err;
+    }
+
     for (i = 0; i < total_queues; i++) {
         vhost_net_set_vq_index(get_vhost_net(ncs[i].peer), i * 2);
     }
@@ -311,7 +338,7 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
     r = k->set_guest_notifiers(qbus->parent, total_queues * 2, true);
     if (r < 0) {
         error_report("Error binding guest notifier: %d", -r);
-        goto err;
+        goto err_endian;
     }
 
     for (i = 0; i < total_queues; i++) {
@@ -333,6 +360,8 @@ err_start:
         fprintf(stderr, "vhost guest notifier cleanup failed: %d\n", e);
         fflush(stderr);
     }
+err_endian:
+    vhost_net_set_vnet_endian(dev, ncs[0].peer, false);
 err:
     return r;
 }
@@ -355,6 +384,8 @@ void vhost_net_stop(VirtIODevice *dev, NetClientState *ncs,
         fflush(stderr);
     }
     assert(r >= 0);
+
+    assert(vhost_net_set_vnet_endian(dev, ncs[0].peer, false) >= 0);
 }
 
 void vhost_net_cleanup(struct vhost_net *net)

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [Qemu-devel] [PATCH RFC 7/7] vhost_net: re-enable when cross endian
  2015-05-06 12:07 ` Greg Kurz
@ 2015-05-06 12:08   ` Greg Kurz
  -1 siblings, 0 replies; 44+ messages in thread
From: Greg Kurz @ 2015-05-06 12:08 UTC (permalink / raw)
  To: Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
  Cc: qemu-devel, virtualization

From: Cédric Le Goater <clg@fr.ibm.com>

Cross-endianness is now checked by the core vhost code.

revert 371df9f5e0f1 "vhost-net: disable when cross-endian"

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
[ added commit message, Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 hw/net/vhost_net.c |   19 -------------------
 1 file changed, 19 deletions(-)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 1884e59..3e4b0f2 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -293,19 +293,6 @@ static void vhost_net_stop_one(struct vhost_net *net,
     vhost_dev_disable_notifiers(&net->dev, dev);
 }
 
-static bool vhost_net_device_endian_ok(VirtIODevice *vdev)
-{
-#ifdef TARGET_IS_BIENDIAN
-#ifdef HOST_WORDS_BIGENDIAN
-    return virtio_is_big_endian(vdev);
-#else
-    return !virtio_is_big_endian(vdev);
-#endif
-#else
-    return true;
-#endif
-}
-
 int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
                     int total_queues)
 {
@@ -314,12 +301,6 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
     VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
     int r, e, i;
 
-    if (!vhost_net_device_endian_ok(dev)) {
-        error_report("vhost-net does not support cross-endian");
-        r = -ENOSYS;
-        goto err;
-    }
-
     if (!k->set_guest_notifiers) {
         error_report("binding does not support guest notifiers");
         r = -ENOSYS;

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH RFC 7/7] vhost_net: re-enable when cross endian
@ 2015-05-06 12:08   ` Greg Kurz
  0 siblings, 0 replies; 44+ messages in thread
From: Greg Kurz @ 2015-05-06 12:08 UTC (permalink / raw)
  To: Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin
  Cc: qemu-devel, virtualization

From: Cédric Le Goater <clg@fr.ibm.com>

Cross-endianness is now checked by the core vhost code.

revert 371df9f5e0f1 "vhost-net: disable when cross-endian"

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
[ added commit message, Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 hw/net/vhost_net.c |   19 -------------------
 1 file changed, 19 deletions(-)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 1884e59..3e4b0f2 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -293,19 +293,6 @@ static void vhost_net_stop_one(struct vhost_net *net,
     vhost_dev_disable_notifiers(&net->dev, dev);
 }
 
-static bool vhost_net_device_endian_ok(VirtIODevice *vdev)
-{
-#ifdef TARGET_IS_BIENDIAN
-#ifdef HOST_WORDS_BIGENDIAN
-    return virtio_is_big_endian(vdev);
-#else
-    return !virtio_is_big_endian(vdev);
-#endif
-#else
-    return true;
-#endif
-}
-
 int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
                     int total_queues)
 {
@@ -314,12 +301,6 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
     VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
     int r, e, i;
 
-    if (!vhost_net_device_endian_ok(dev)) {
-        error_report("vhost-net does not support cross-endian");
-        r = -ENOSYS;
-        goto err;
-    }
-
     if (!k->set_guest_notifiers) {
         error_report("binding does not support guest notifiers");
         r = -ENOSYS;

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check
  2015-05-06 12:07   ` Greg Kurz
@ 2015-05-12 13:14     ` Cornelia Huck
  -1 siblings, 0 replies; 44+ messages in thread
From: Cornelia Huck @ 2015-05-12 13:14 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Jason Wang, virtualization, qemu-devel, Stefan Hajnoczi,
	Michael S. Tsirkin

On Wed, 06 May 2015 14:07:37 +0200
Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:

> Unlike with add and clear, there is no valid reason to abort when checking
> for a feature. It makes more sense to return false (i.e. the feature bit
> isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.
> 
> This allows to introduce code that is aware about new 64-bit features like
> VIRTIO_F_VERSION_1, even if they are still not implemented.
> 
> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> ---
>  include/hw/virtio/virtio.h |    1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index d95f8b6..6ef70f1 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t *features, unsigned int fbit)
> 
>  static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
>  {
> -    assert(fbit < 32);
>      return !!(features & (1 << fbit));
>  }
> 
> 
> 

I must say I'm not very comfortable with knowingly passing out-of-rage
values to this function.

Can we perhaps apply at least the feature-bit-size extending patches
prior to your patchset, if the remainder of the virtio-1 patchset still
takes some time?

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check
@ 2015-05-12 13:14     ` Cornelia Huck
  0 siblings, 0 replies; 44+ messages in thread
From: Cornelia Huck @ 2015-05-12 13:14 UTC (permalink / raw)
  To: Greg Kurz; +Cc: virtualization, qemu-devel, Stefan Hajnoczi, Michael S. Tsirkin

On Wed, 06 May 2015 14:07:37 +0200
Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:

> Unlike with add and clear, there is no valid reason to abort when checking
> for a feature. It makes more sense to return false (i.e. the feature bit
> isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.
> 
> This allows to introduce code that is aware about new 64-bit features like
> VIRTIO_F_VERSION_1, even if they are still not implemented.
> 
> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> ---
>  include/hw/virtio/virtio.h |    1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index d95f8b6..6ef70f1 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t *features, unsigned int fbit)
> 
>  static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
>  {
> -    assert(fbit < 32);
>      return !!(features & (1 << fbit));
>  }
> 
> 
> 

I must say I'm not very comfortable with knowingly passing out-of-rage
values to this function.

Can we perhaps apply at least the feature-bit-size extending patches
prior to your patchset, if the remainder of the virtio-1 patchset still
takes some time?

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio
  2015-05-06 12:08   ` Greg Kurz
@ 2015-05-12 13:25     ` Cornelia Huck
  -1 siblings, 0 replies; 44+ messages in thread
From: Cornelia Huck @ 2015-05-12 13:25 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Jason Wang, virtualization, qemu-devel, Stefan Hajnoczi,
	Michael S. Tsirkin

On Wed, 06 May 2015 14:08:02 +0200
Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:

> Legacy virtio is native endian: if the guest and host endianness differ,
> we have to tell vhost so it can swap bytes where appropriate. This is
> done through a vhost ring ioctl.
> 
> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> ---
>  hw/virtio/vhost.c |   50 +++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 49 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 54851b7..1d7b939 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
(...)
> @@ -677,6 +700,16 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
>          return -errno;
>      }
> 
> +    if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) &&

I think this should either go in after the virtio-1 base support (more
feature bits etc.) or get a big fat comment and be touched up later.
I'd prefer the first solution so it does not get forgotten, but I'm not
sure when Michael plans to proceed with the virtio-1 patches (I think
they're mostly fine already).

> +        virtio_legacy_is_cross_endian(vdev)) {
> +        r = vhost_virtqueue_set_vring_endian_legacy(dev,
> +                                                    virtio_is_big_endian(vdev),
> +                                                    vhost_vq_index);
> +        if (r) {
> +            return -errno;
> +        }
> +    }
> +
>      s = l = virtio_queue_get_desc_size(vdev, idx);
>      a = virtio_queue_get_desc_addr(vdev, idx);
>      vq->desc = cpu_physical_memory_map(a, &l, 0);

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio
@ 2015-05-12 13:25     ` Cornelia Huck
  0 siblings, 0 replies; 44+ messages in thread
From: Cornelia Huck @ 2015-05-12 13:25 UTC (permalink / raw)
  To: Greg Kurz; +Cc: virtualization, qemu-devel, Stefan Hajnoczi, Michael S. Tsirkin

On Wed, 06 May 2015 14:08:02 +0200
Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:

> Legacy virtio is native endian: if the guest and host endianness differ,
> we have to tell vhost so it can swap bytes where appropriate. This is
> done through a vhost ring ioctl.
> 
> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> ---
>  hw/virtio/vhost.c |   50 +++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 49 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 54851b7..1d7b939 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
(...)
> @@ -677,6 +700,16 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
>          return -errno;
>      }
> 
> +    if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) &&

I think this should either go in after the virtio-1 base support (more
feature bits etc.) or get a big fat comment and be touched up later.
I'd prefer the first solution so it does not get forgotten, but I'm not
sure when Michael plans to proceed with the virtio-1 patches (I think
they're mostly fine already).

> +        virtio_legacy_is_cross_endian(vdev)) {
> +        r = vhost_virtqueue_set_vring_endian_legacy(dev,
> +                                                    virtio_is_big_endian(vdev),
> +                                                    vhost_vq_index);
> +        if (r) {
> +            return -errno;
> +        }
> +    }
> +
>      s = l = virtio_queue_get_desc_size(vdev, idx);
>      a = virtio_queue_get_desc_addr(vdev, idx);
>      vq->desc = cpu_physical_memory_map(a, &l, 0);

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check
  2015-05-12 13:14     ` Cornelia Huck
@ 2015-05-12 13:34       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2015-05-12 13:34 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Jason Wang, virtualization, qemu-devel, Stefan Hajnoczi

On Tue, May 12, 2015 at 03:14:53PM +0200, Cornelia Huck wrote:
> On Wed, 06 May 2015 14:07:37 +0200
> Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> 
> > Unlike with add and clear, there is no valid reason to abort when checking
> > for a feature. It makes more sense to return false (i.e. the feature bit
> > isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.
> > 
> > This allows to introduce code that is aware about new 64-bit features like
> > VIRTIO_F_VERSION_1, even if they are still not implemented.
> > 
> > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > ---
> >  include/hw/virtio/virtio.h |    1 -
> >  1 file changed, 1 deletion(-)
> > 
> > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > index d95f8b6..6ef70f1 100644
> > --- a/include/hw/virtio/virtio.h
> > +++ b/include/hw/virtio/virtio.h
> > @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t *features, unsigned int fbit)
> > 
> >  static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
> >  {
> > -    assert(fbit < 32);
> >      return !!(features & (1 << fbit));
> >  }
> > 
> > 
> > 
> 
> I must say I'm not very comfortable with knowingly passing out-of-rage
> values to this function.
> 
> Can we perhaps apply at least the feature-bit-size extending patches
> prior to your patchset, if the remainder of the virtio-1 patchset still
> takes some time?

So the feature-bit-size extending patches currently don't support
migration correctly, that's why they are not merged.

What I think we need to do for this is move host_features out
from transports into core virtio device.

Then we can simply check host features >31 and skip
migrating low guest features is none set.

Thoughts? Any takers?

-- 
MST

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check
@ 2015-05-12 13:34       ` Michael S. Tsirkin
  0 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2015-05-12 13:34 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: virtualization, qemu-devel, Stefan Hajnoczi

On Tue, May 12, 2015 at 03:14:53PM +0200, Cornelia Huck wrote:
> On Wed, 06 May 2015 14:07:37 +0200
> Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> 
> > Unlike with add and clear, there is no valid reason to abort when checking
> > for a feature. It makes more sense to return false (i.e. the feature bit
> > isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.
> > 
> > This allows to introduce code that is aware about new 64-bit features like
> > VIRTIO_F_VERSION_1, even if they are still not implemented.
> > 
> > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > ---
> >  include/hw/virtio/virtio.h |    1 -
> >  1 file changed, 1 deletion(-)
> > 
> > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > index d95f8b6..6ef70f1 100644
> > --- a/include/hw/virtio/virtio.h
> > +++ b/include/hw/virtio/virtio.h
> > @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t *features, unsigned int fbit)
> > 
> >  static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
> >  {
> > -    assert(fbit < 32);
> >      return !!(features & (1 << fbit));
> >  }
> > 
> > 
> > 
> 
> I must say I'm not very comfortable with knowingly passing out-of-rage
> values to this function.
> 
> Can we perhaps apply at least the feature-bit-size extending patches
> prior to your patchset, if the remainder of the virtio-1 patchset still
> takes some time?

So the feature-bit-size extending patches currently don't support
migration correctly, that's why they are not merged.

What I think we need to do for this is move host_features out
from transports into core virtio device.

Then we can simply check host features >31 and skip
migrating low guest features is none set.

Thoughts? Any takers?

-- 
MST

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check
  2015-05-12 13:34       ` Michael S. Tsirkin
@ 2015-05-12 13:44         ` Cornelia Huck
  -1 siblings, 0 replies; 44+ messages in thread
From: Cornelia Huck @ 2015-05-12 13:44 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, virtualization, qemu-devel, Stefan Hajnoczi

On Tue, 12 May 2015 15:34:47 +0200
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, May 12, 2015 at 03:14:53PM +0200, Cornelia Huck wrote:
> > On Wed, 06 May 2015 14:07:37 +0200
> > Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> > 
> > > Unlike with add and clear, there is no valid reason to abort when checking
> > > for a feature. It makes more sense to return false (i.e. the feature bit
> > > isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.
> > > 
> > > This allows to introduce code that is aware about new 64-bit features like
> > > VIRTIO_F_VERSION_1, even if they are still not implemented.
> > > 
> > > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > > ---
> > >  include/hw/virtio/virtio.h |    1 -
> > >  1 file changed, 1 deletion(-)
> > > 
> > > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > > index d95f8b6..6ef70f1 100644
> > > --- a/include/hw/virtio/virtio.h
> > > +++ b/include/hw/virtio/virtio.h
> > > @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t *features, unsigned int fbit)
> > > 
> > >  static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
> > >  {
> > > -    assert(fbit < 32);
> > >      return !!(features & (1 << fbit));
> > >  }
> > > 
> > > 
> > > 
> > 
> > I must say I'm not very comfortable with knowingly passing out-of-rage
> > values to this function.
> > 
> > Can we perhaps apply at least the feature-bit-size extending patches
> > prior to your patchset, if the remainder of the virtio-1 patchset still
> > takes some time?
> 
> So the feature-bit-size extending patches currently don't support
> migration correctly, that's why they are not merged.
> 
> What I think we need to do for this is move host_features out
> from transports into core virtio device.
> 
> Then we can simply check host features >31 and skip
> migrating low guest features is none set.
> 
> Thoughts? Any takers?
> 

After we move host_features, put them into an optional vmstate
subsection?

I think with the recent patchsets, most of the interesting stuff is
already not handled by the transport anymore. There's only
VIRTIO_F_NOTIFY_ON_EMPTY and VIRTIO_F_BAD_FEATURE left (set by pci and
ccw).

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check
@ 2015-05-12 13:44         ` Cornelia Huck
  0 siblings, 0 replies; 44+ messages in thread
From: Cornelia Huck @ 2015-05-12 13:44 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtualization, qemu-devel, Stefan Hajnoczi

On Tue, 12 May 2015 15:34:47 +0200
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, May 12, 2015 at 03:14:53PM +0200, Cornelia Huck wrote:
> > On Wed, 06 May 2015 14:07:37 +0200
> > Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> > 
> > > Unlike with add and clear, there is no valid reason to abort when checking
> > > for a feature. It makes more sense to return false (i.e. the feature bit
> > > isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.
> > > 
> > > This allows to introduce code that is aware about new 64-bit features like
> > > VIRTIO_F_VERSION_1, even if they are still not implemented.
> > > 
> > > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > > ---
> > >  include/hw/virtio/virtio.h |    1 -
> > >  1 file changed, 1 deletion(-)
> > > 
> > > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > > index d95f8b6..6ef70f1 100644
> > > --- a/include/hw/virtio/virtio.h
> > > +++ b/include/hw/virtio/virtio.h
> > > @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t *features, unsigned int fbit)
> > > 
> > >  static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
> > >  {
> > > -    assert(fbit < 32);
> > >      return !!(features & (1 << fbit));
> > >  }
> > > 
> > > 
> > > 
> > 
> > I must say I'm not very comfortable with knowingly passing out-of-rage
> > values to this function.
> > 
> > Can we perhaps apply at least the feature-bit-size extending patches
> > prior to your patchset, if the remainder of the virtio-1 patchset still
> > takes some time?
> 
> So the feature-bit-size extending patches currently don't support
> migration correctly, that's why they are not merged.
> 
> What I think we need to do for this is move host_features out
> from transports into core virtio device.
> 
> Then we can simply check host features >31 and skip
> migrating low guest features is none set.
> 
> Thoughts? Any takers?
> 

After we move host_features, put them into an optional vmstate
subsection?

I think with the recent patchsets, most of the interesting stuff is
already not handled by the transport anymore. There's only
VIRTIO_F_NOTIFY_ON_EMPTY and VIRTIO_F_BAD_FEATURE left (set by pci and
ccw).

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check
  2015-05-12 13:14     ` Cornelia Huck
  (?)
  (?)
@ 2015-05-12 13:49     ` Peter Maydell
  -1 siblings, 0 replies; 44+ messages in thread
From: Peter Maydell @ 2015-05-12 13:49 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Michael S. Tsirkin, Jason Wang, QEMU Developers, virtualization,
	Stefan Hajnoczi

On 12 May 2015 at 14:14, Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> On Wed, 06 May 2015 14:07:37 +0200
> Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
>> @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t *features, unsigned int fbit)
>>
>>  static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
>>  {
>> -    assert(fbit < 32);
>>      return !!(features & (1 << fbit));
>>  }
>>
>>
>>
>
> I must say I'm not very comfortable with knowingly passing out-of-rage
> values to this function.

It would invoke C undefined behaviour, so clearly a bug if we did
pass an out-of-range value here. You'd need to at least do
    if (fbit >= 32) {
        return false;
    }
if you want to make it valid.

-- PMM

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check
  2015-05-12 13:14     ` Cornelia Huck
                       ` (2 preceding siblings ...)
  (?)
@ 2015-05-12 13:49     ` Peter Maydell
  -1 siblings, 0 replies; 44+ messages in thread
From: Peter Maydell @ 2015-05-12 13:49 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Michael S. Tsirkin, QEMU Developers, virtualization, Stefan Hajnoczi

On 12 May 2015 at 14:14, Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> On Wed, 06 May 2015 14:07:37 +0200
> Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
>> @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t *features, unsigned int fbit)
>>
>>  static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
>>  {
>> -    assert(fbit < 32);
>>      return !!(features & (1 << fbit));
>>  }
>>
>>
>>
>
> I must say I'm not very comfortable with knowingly passing out-of-rage
> values to this function.

It would invoke C undefined behaviour, so clearly a bug if we did
pass an out-of-range value here. You'd need to at least do
    if (fbit >= 32) {
        return false;
    }
if you want to make it valid.

-- PMM

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check
  2015-05-12 13:14     ` Cornelia Huck
@ 2015-05-12 13:55       ` Greg Kurz
  -1 siblings, 0 replies; 44+ messages in thread
From: Greg Kurz @ 2015-05-12 13:55 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Jason Wang, virtualization, qemu-devel, Stefan Hajnoczi,
	Michael S. Tsirkin

On Tue, 12 May 2015 15:14:53 +0200
Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> On Wed, 06 May 2015 14:07:37 +0200
> Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> 
> > Unlike with add and clear, there is no valid reason to abort when checking
> > for a feature. It makes more sense to return false (i.e. the feature bit
> > isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.
> > 
> > This allows to introduce code that is aware about new 64-bit features like
> > VIRTIO_F_VERSION_1, even if they are still not implemented.
> > 
> > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > ---
> >  include/hw/virtio/virtio.h |    1 -
> >  1 file changed, 1 deletion(-)
> > 
> > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > index d95f8b6..6ef70f1 100644
> > --- a/include/hw/virtio/virtio.h
> > +++ b/include/hw/virtio/virtio.h
> > @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t *features, unsigned int fbit)
> > 
> >  static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
> >  {
> > -    assert(fbit < 32);
> >      return !!(features & (1 << fbit));
> >  }
> > 
> > 
> > 
> 
> I must say I'm not very comfortable with knowingly passing out-of-rage
> values to this function.
> 

I take that as a valid reason then :)

> Can we perhaps apply at least the feature-bit-size extending patches
> prior to your patchset, if the remainder of the virtio-1 patchset still
> takes some time?

Hmm... if I remember well, it still lacks migration support.

--
Greg

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check
@ 2015-05-12 13:55       ` Greg Kurz
  0 siblings, 0 replies; 44+ messages in thread
From: Greg Kurz @ 2015-05-12 13:55 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: virtualization, qemu-devel, Stefan Hajnoczi, Michael S. Tsirkin

On Tue, 12 May 2015 15:14:53 +0200
Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> On Wed, 06 May 2015 14:07:37 +0200
> Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> 
> > Unlike with add and clear, there is no valid reason to abort when checking
> > for a feature. It makes more sense to return false (i.e. the feature bit
> > isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.
> > 
> > This allows to introduce code that is aware about new 64-bit features like
> > VIRTIO_F_VERSION_1, even if they are still not implemented.
> > 
> > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > ---
> >  include/hw/virtio/virtio.h |    1 -
> >  1 file changed, 1 deletion(-)
> > 
> > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > index d95f8b6..6ef70f1 100644
> > --- a/include/hw/virtio/virtio.h
> > +++ b/include/hw/virtio/virtio.h
> > @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t *features, unsigned int fbit)
> > 
> >  static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
> >  {
> > -    assert(fbit < 32);
> >      return !!(features & (1 << fbit));
> >  }
> > 
> > 
> > 
> 
> I must say I'm not very comfortable with knowingly passing out-of-rage
> values to this function.
> 

I take that as a valid reason then :)

> Can we perhaps apply at least the feature-bit-size extending patches
> prior to your patchset, if the remainder of the virtio-1 patchset still
> takes some time?

Hmm... if I remember well, it still lacks migration support.

--
Greg

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check
  2015-05-12 13:44         ` Cornelia Huck
@ 2015-05-12 14:46           ` Cornelia Huck
  -1 siblings, 0 replies; 44+ messages in thread
From: Cornelia Huck @ 2015-05-12 14:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Jason Wang, virtualization, Stefan Hajnoczi, Michael S. Tsirkin

On Tue, 12 May 2015 15:44:46 +0200
Cornelia Huck <cornelia.huck@de.ibm.com> wrote:

> On Tue, 12 May 2015 15:34:47 +0200
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Tue, May 12, 2015 at 03:14:53PM +0200, Cornelia Huck wrote:
> > > On Wed, 06 May 2015 14:07:37 +0200
> > > Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> > > 
> > > > Unlike with add and clear, there is no valid reason to abort when checking
> > > > for a feature. It makes more sense to return false (i.e. the feature bit
> > > > isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.
> > > > 
> > > > This allows to introduce code that is aware about new 64-bit features like
> > > > VIRTIO_F_VERSION_1, even if they are still not implemented.
> > > > 
> > > > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > > > ---
> > > >  include/hw/virtio/virtio.h |    1 -
> > > >  1 file changed, 1 deletion(-)
> > > > 
> > > > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > > > index d95f8b6..6ef70f1 100644
> > > > --- a/include/hw/virtio/virtio.h
> > > > +++ b/include/hw/virtio/virtio.h
> > > > @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t *features, unsigned int fbit)
> > > > 
> > > >  static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
> > > >  {
> > > > -    assert(fbit < 32);
> > > >      return !!(features & (1 << fbit));
> > > >  }
> > > > 
> > > > 
> > > > 
> > > 
> > > I must say I'm not very comfortable with knowingly passing out-of-rage
> > > values to this function.
> > > 
> > > Can we perhaps apply at least the feature-bit-size extending patches
> > > prior to your patchset, if the remainder of the virtio-1 patchset still
> > > takes some time?
> > 
> > So the feature-bit-size extending patches currently don't support
> > migration correctly, that's why they are not merged.
> > 
> > What I think we need to do for this is move host_features out
> > from transports into core virtio device.
> > 
> > Then we can simply check host features >31 and skip
> > migrating low guest features is none set.
> > 
> > Thoughts? Any takers?
> > 
> 
> After we move host_features, put them into an optional vmstate
> subsection?
> 
> I think with the recent patchsets, most of the interesting stuff is
> already not handled by the transport anymore. There's only
> VIRTIO_F_NOTIFY_ON_EMPTY and VIRTIO_F_BAD_FEATURE left (set by pci and
> ccw).

Thinking a bit more, we probably don't need this move of host_features
to get migration right (although it might be a nice cleanup later).

Could we
- keep migration of bits 0..31 as-is
- add a vmstate subsection for bits 32..63 only included if one of
  those bits is set
- have a post handler that performs a validation of the full set of
  bits 0..63
?

We could do a similar exercise with a subsection containing the
addresses for avail and used with a post handler overwriting any
addresses set by the old style migration code.

Does that make sense?

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check
@ 2015-05-12 14:46           ` Cornelia Huck
  0 siblings, 0 replies; 44+ messages in thread
From: Cornelia Huck @ 2015-05-12 14:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: virtualization, Stefan Hajnoczi, Michael S. Tsirkin

On Tue, 12 May 2015 15:44:46 +0200
Cornelia Huck <cornelia.huck@de.ibm.com> wrote:

> On Tue, 12 May 2015 15:34:47 +0200
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Tue, May 12, 2015 at 03:14:53PM +0200, Cornelia Huck wrote:
> > > On Wed, 06 May 2015 14:07:37 +0200
> > > Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> > > 
> > > > Unlike with add and clear, there is no valid reason to abort when checking
> > > > for a feature. It makes more sense to return false (i.e. the feature bit
> > > > isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.
> > > > 
> > > > This allows to introduce code that is aware about new 64-bit features like
> > > > VIRTIO_F_VERSION_1, even if they are still not implemented.
> > > > 
> > > > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > > > ---
> > > >  include/hw/virtio/virtio.h |    1 -
> > > >  1 file changed, 1 deletion(-)
> > > > 
> > > > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > > > index d95f8b6..6ef70f1 100644
> > > > --- a/include/hw/virtio/virtio.h
> > > > +++ b/include/hw/virtio/virtio.h
> > > > @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t *features, unsigned int fbit)
> > > > 
> > > >  static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
> > > >  {
> > > > -    assert(fbit < 32);
> > > >      return !!(features & (1 << fbit));
> > > >  }
> > > > 
> > > > 
> > > > 
> > > 
> > > I must say I'm not very comfortable with knowingly passing out-of-rage
> > > values to this function.
> > > 
> > > Can we perhaps apply at least the feature-bit-size extending patches
> > > prior to your patchset, if the remainder of the virtio-1 patchset still
> > > takes some time?
> > 
> > So the feature-bit-size extending patches currently don't support
> > migration correctly, that's why they are not merged.
> > 
> > What I think we need to do for this is move host_features out
> > from transports into core virtio device.
> > 
> > Then we can simply check host features >31 and skip
> > migrating low guest features is none set.
> > 
> > Thoughts? Any takers?
> > 
> 
> After we move host_features, put them into an optional vmstate
> subsection?
> 
> I think with the recent patchsets, most of the interesting stuff is
> already not handled by the transport anymore. There's only
> VIRTIO_F_NOTIFY_ON_EMPTY and VIRTIO_F_BAD_FEATURE left (set by pci and
> ccw).

Thinking a bit more, we probably don't need this move of host_features
to get migration right (although it might be a nice cleanup later).

Could we
- keep migration of bits 0..31 as-is
- add a vmstate subsection for bits 32..63 only included if one of
  those bits is set
- have a post handler that performs a validation of the full set of
  bits 0..63
?

We could do a similar exercise with a subsection containing the
addresses for avail and used with a post handler overwriting any
addresses set by the old style migration code.

Does that make sense?

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio
  2015-05-12 13:25     ` Cornelia Huck
@ 2015-05-12 15:15       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2015-05-12 15:15 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Jason Wang, virtualization, qemu-devel, Stefan Hajnoczi

On Tue, May 12, 2015 at 03:25:30PM +0200, Cornelia Huck wrote:
> On Wed, 06 May 2015 14:08:02 +0200
> Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> 
> > Legacy virtio is native endian: if the guest and host endianness differ,
> > we have to tell vhost so it can swap bytes where appropriate. This is
> > done through a vhost ring ioctl.
> > 
> > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > ---
> >  hw/virtio/vhost.c |   50 +++++++++++++++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 49 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index 54851b7..1d7b939 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> (...)
> > @@ -677,6 +700,16 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
> >          return -errno;
> >      }
> > 
> > +    if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) &&
> 
> I think this should either go in after the virtio-1 base support (more
> feature bits etc.) or get a big fat comment and be touched up later.
> I'd prefer the first solution so it does not get forgotten, but I'm not
> sure when Michael plans to proceed with the virtio-1 patches (I think
> they're mostly fine already).

There are three main issues with virtio 1 patches that I am aware of.

One issue with virtio 1 patches as they are is with how features are
handled ATM.  There are 3 types of features

	a. virtio 1 only features
	b. virtio 0 only features
	c. shared features

and 3 types of devices
	a. legacy device: has b+c features
	b. modern device: has a+c features
	c. transitional device: has a+c features but exposes
	   only c through the legacy interface


So I think a callback that gets features depending on guest
version isn't a good way to model it because fundamentally device
has one set of features.
A better way to model this is really just a single
host_features bitmask, and for transitional devices, a mask
hiding a features - which are so far all bits > 31, so maybe
for now we can just have a global mask.

We need to validate features at initialization time and make
sure they make sense, fail if not (sometimes we need to mask
features if they don't make sense - this is unfortunate
but might be needed for compatibility).

Moving host_features to virtio core would make all of the above
easier.


Second issue is migration, some of it is with migrating the new
features, so that's tied to the first one.


Third issue is fixing devices so they don't try to
access guest memory until DRIVER_OK is set.
This is surprisingly hard to do generally given need to support old
drivers which don't set DRIVER_OK or set it very late, and the fact that
we tied work-arounds for even older drivers which dont' set pci bus
master to the DRIVER_OK bit. I tried, and I'm close to giving up and
just checking guest ack for virtio 1, and ignoring DRIVER_OK requirement
if not there.



> > +        virtio_legacy_is_cross_endian(vdev)) {
> > +        r = vhost_virtqueue_set_vring_endian_legacy(dev,
> > +                                                    virtio_is_big_endian(vdev),
> > +                                                    vhost_vq_index);
> > +        if (r) {
> > +            return -errno;
> > +        }
> > +    }
> > +
> >      s = l = virtio_queue_get_desc_size(vdev, idx);
> >      a = virtio_queue_get_desc_addr(vdev, idx);
> >      vq->desc = cpu_physical_memory_map(a, &l, 0);

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio
@ 2015-05-12 15:15       ` Michael S. Tsirkin
  0 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2015-05-12 15:15 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: virtualization, qemu-devel, Stefan Hajnoczi

On Tue, May 12, 2015 at 03:25:30PM +0200, Cornelia Huck wrote:
> On Wed, 06 May 2015 14:08:02 +0200
> Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> 
> > Legacy virtio is native endian: if the guest and host endianness differ,
> > we have to tell vhost so it can swap bytes where appropriate. This is
> > done through a vhost ring ioctl.
> > 
> > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > ---
> >  hw/virtio/vhost.c |   50 +++++++++++++++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 49 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index 54851b7..1d7b939 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> (...)
> > @@ -677,6 +700,16 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
> >          return -errno;
> >      }
> > 
> > +    if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) &&
> 
> I think this should either go in after the virtio-1 base support (more
> feature bits etc.) or get a big fat comment and be touched up later.
> I'd prefer the first solution so it does not get forgotten, but I'm not
> sure when Michael plans to proceed with the virtio-1 patches (I think
> they're mostly fine already).

There are three main issues with virtio 1 patches that I am aware of.

One issue with virtio 1 patches as they are is with how features are
handled ATM.  There are 3 types of features

	a. virtio 1 only features
	b. virtio 0 only features
	c. shared features

and 3 types of devices
	a. legacy device: has b+c features
	b. modern device: has a+c features
	c. transitional device: has a+c features but exposes
	   only c through the legacy interface


So I think a callback that gets features depending on guest
version isn't a good way to model it because fundamentally device
has one set of features.
A better way to model this is really just a single
host_features bitmask, and for transitional devices, a mask
hiding a features - which are so far all bits > 31, so maybe
for now we can just have a global mask.

We need to validate features at initialization time and make
sure they make sense, fail if not (sometimes we need to mask
features if they don't make sense - this is unfortunate
but might be needed for compatibility).

Moving host_features to virtio core would make all of the above
easier.


Second issue is migration, some of it is with migrating the new
features, so that's tied to the first one.


Third issue is fixing devices so they don't try to
access guest memory until DRIVER_OK is set.
This is surprisingly hard to do generally given need to support old
drivers which don't set DRIVER_OK or set it very late, and the fact that
we tied work-arounds for even older drivers which dont' set pci bus
master to the DRIVER_OK bit. I tried, and I'm close to giving up and
just checking guest ack for virtio 1, and ignoring DRIVER_OK requirement
if not there.



> > +        virtio_legacy_is_cross_endian(vdev)) {
> > +        r = vhost_virtqueue_set_vring_endian_legacy(dev,
> > +                                                    virtio_is_big_endian(vdev),
> > +                                                    vhost_vq_index);
> > +        if (r) {
> > +            return -errno;
> > +        }
> > +    }
> > +
> >      s = l = virtio_queue_get_desc_size(vdev, idx);
> >      a = virtio_queue_get_desc_addr(vdev, idx);
> >      vq->desc = cpu_physical_memory_map(a, &l, 0);

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check
  2015-05-12 14:46           ` Cornelia Huck
@ 2015-05-12 15:30             ` Michael S. Tsirkin
  -1 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2015-05-12 15:30 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Jason Wang, virtualization, qemu-devel, Stefan Hajnoczi

On Tue, May 12, 2015 at 04:46:11PM +0200, Cornelia Huck wrote:
> On Tue, 12 May 2015 15:44:46 +0200
> Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> 
> > On Tue, 12 May 2015 15:34:47 +0200
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > 
> > > On Tue, May 12, 2015 at 03:14:53PM +0200, Cornelia Huck wrote:
> > > > On Wed, 06 May 2015 14:07:37 +0200
> > > > Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> > > > 
> > > > > Unlike with add and clear, there is no valid reason to abort when checking
> > > > > for a feature. It makes more sense to return false (i.e. the feature bit
> > > > > isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.
> > > > > 
> > > > > This allows to introduce code that is aware about new 64-bit features like
> > > > > VIRTIO_F_VERSION_1, even if they are still not implemented.
> > > > > 
> > > > > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > > > > ---
> > > > >  include/hw/virtio/virtio.h |    1 -
> > > > >  1 file changed, 1 deletion(-)
> > > > > 
> > > > > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > > > > index d95f8b6..6ef70f1 100644
> > > > > --- a/include/hw/virtio/virtio.h
> > > > > +++ b/include/hw/virtio/virtio.h
> > > > > @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t *features, unsigned int fbit)
> > > > > 
> > > > >  static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
> > > > >  {
> > > > > -    assert(fbit < 32);
> > > > >      return !!(features & (1 << fbit));
> > > > >  }
> > > > > 
> > > > > 
> > > > > 
> > > > 
> > > > I must say I'm not very comfortable with knowingly passing out-of-rage
> > > > values to this function.
> > > > 
> > > > Can we perhaps apply at least the feature-bit-size extending patches
> > > > prior to your patchset, if the remainder of the virtio-1 patchset still
> > > > takes some time?
> > > 
> > > So the feature-bit-size extending patches currently don't support
> > > migration correctly, that's why they are not merged.
> > > 
> > > What I think we need to do for this is move host_features out
> > > from transports into core virtio device.
> > > 
> > > Then we can simply check host features >31 and skip
> > > migrating low guest features is none set.
> > > 
> > > Thoughts? Any takers?
> > > 
> > 
> > After we move host_features, put them into an optional vmstate
> > subsection?
> > 
> > I think with the recent patchsets, most of the interesting stuff is
> > already not handled by the transport anymore. There's only
> > VIRTIO_F_NOTIFY_ON_EMPTY and VIRTIO_F_BAD_FEATURE left (set by pci and
> > ccw).

notify on empty is likely safe to set for everyone.

bad feature should be pci specific, it's a mistake that
we have it in ccw. it's there to detect very old buggy guests.
in fact ccw ignores this bit completely.

For PCI, I think VIRTIO_F_BAD_FEATURE is never
actually set in guest features. If guest attempts to set it,
it is immediately cleared.

So it can be handled in pci specific code, and won't
affect migration.


> Thinking a bit more, we probably don't need this move of host_features
> to get migration right (although it might be a nice cleanup later).
> 
> Could we
> - keep migration of bits 0..31 as-is
> - add a vmstate subsection for bits 32..63 only included if one of
>   those bits is set
> - have a post handler that performs a validation of the full set of
>   bits 0..63
> ?
> 
> We could do a similar exercise with a subsection containing the
> addresses for avail and used with a post handler overwriting any
> addresses set by the old style migration code.
> 
> Does that make sense?

I don't see how it does: on the receive side you don't know
whether guest acked bits 32..63 so you can't decide whether
to parse bits 32..63.

The right thing to do IMHO is to migrate the high guest bits if and only
if the *host* bits 32..63 are set.  And that needs the host features in
core, or at least is easier if they are there.

-- 
MST

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check
@ 2015-05-12 15:30             ` Michael S. Tsirkin
  0 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2015-05-12 15:30 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: virtualization, qemu-devel, Stefan Hajnoczi

On Tue, May 12, 2015 at 04:46:11PM +0200, Cornelia Huck wrote:
> On Tue, 12 May 2015 15:44:46 +0200
> Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> 
> > On Tue, 12 May 2015 15:34:47 +0200
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > 
> > > On Tue, May 12, 2015 at 03:14:53PM +0200, Cornelia Huck wrote:
> > > > On Wed, 06 May 2015 14:07:37 +0200
> > > > Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> > > > 
> > > > > Unlike with add and clear, there is no valid reason to abort when checking
> > > > > for a feature. It makes more sense to return false (i.e. the feature bit
> > > > > isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.
> > > > > 
> > > > > This allows to introduce code that is aware about new 64-bit features like
> > > > > VIRTIO_F_VERSION_1, even if they are still not implemented.
> > > > > 
> > > > > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > > > > ---
> > > > >  include/hw/virtio/virtio.h |    1 -
> > > > >  1 file changed, 1 deletion(-)
> > > > > 
> > > > > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > > > > index d95f8b6..6ef70f1 100644
> > > > > --- a/include/hw/virtio/virtio.h
> > > > > +++ b/include/hw/virtio/virtio.h
> > > > > @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t *features, unsigned int fbit)
> > > > > 
> > > > >  static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
> > > > >  {
> > > > > -    assert(fbit < 32);
> > > > >      return !!(features & (1 << fbit));
> > > > >  }
> > > > > 
> > > > > 
> > > > > 
> > > > 
> > > > I must say I'm not very comfortable with knowingly passing out-of-rage
> > > > values to this function.
> > > > 
> > > > Can we perhaps apply at least the feature-bit-size extending patches
> > > > prior to your patchset, if the remainder of the virtio-1 patchset still
> > > > takes some time?
> > > 
> > > So the feature-bit-size extending patches currently don't support
> > > migration correctly, that's why they are not merged.
> > > 
> > > What I think we need to do for this is move host_features out
> > > from transports into core virtio device.
> > > 
> > > Then we can simply check host features >31 and skip
> > > migrating low guest features is none set.
> > > 
> > > Thoughts? Any takers?
> > > 
> > 
> > After we move host_features, put them into an optional vmstate
> > subsection?
> > 
> > I think with the recent patchsets, most of the interesting stuff is
> > already not handled by the transport anymore. There's only
> > VIRTIO_F_NOTIFY_ON_EMPTY and VIRTIO_F_BAD_FEATURE left (set by pci and
> > ccw).

notify on empty is likely safe to set for everyone.

bad feature should be pci specific, it's a mistake that
we have it in ccw. it's there to detect very old buggy guests.
in fact ccw ignores this bit completely.

For PCI, I think VIRTIO_F_BAD_FEATURE is never
actually set in guest features. If guest attempts to set it,
it is immediately cleared.

So it can be handled in pci specific code, and won't
affect migration.


> Thinking a bit more, we probably don't need this move of host_features
> to get migration right (although it might be a nice cleanup later).
> 
> Could we
> - keep migration of bits 0..31 as-is
> - add a vmstate subsection for bits 32..63 only included if one of
>   those bits is set
> - have a post handler that performs a validation of the full set of
>   bits 0..63
> ?
> 
> We could do a similar exercise with a subsection containing the
> addresses for avail and used with a post handler overwriting any
> addresses set by the old style migration code.
> 
> Does that make sense?

I don't see how it does: on the receive side you don't know
whether guest acked bits 32..63 so you can't decide whether
to parse bits 32..63.

The right thing to do IMHO is to migrate the high guest bits if and only
if the *host* bits 32..63 are set.  And that needs the host features in
core, or at least is easier if they are there.

-- 
MST

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio
  2015-05-12 15:15       ` Michael S. Tsirkin
@ 2015-05-12 16:25         ` Cornelia Huck
  -1 siblings, 0 replies; 44+ messages in thread
From: Cornelia Huck @ 2015-05-12 16:25 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, virtualization, qemu-devel, Stefan Hajnoczi

On Tue, 12 May 2015 17:15:53 +0200
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, May 12, 2015 at 03:25:30PM +0200, Cornelia Huck wrote:
> > On Wed, 06 May 2015 14:08:02 +0200
> > Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> > 
> > > Legacy virtio is native endian: if the guest and host endianness differ,
> > > we have to tell vhost so it can swap bytes where appropriate. This is
> > > done through a vhost ring ioctl.
> > > 
> > > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > > ---
> > >  hw/virtio/vhost.c |   50 +++++++++++++++++++++++++++++++++++++++++++++++++-
> > >  1 file changed, 49 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > > index 54851b7..1d7b939 100644
> > > --- a/hw/virtio/vhost.c
> > > +++ b/hw/virtio/vhost.c
> > (...)
> > > @@ -677,6 +700,16 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
> > >          return -errno;
> > >      }
> > > 
> > > +    if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) &&
> > 
> > I think this should either go in after the virtio-1 base support (more
> > feature bits etc.) or get a big fat comment and be touched up later.
> > I'd prefer the first solution so it does not get forgotten, but I'm not
> > sure when Michael plans to proceed with the virtio-1 patches (I think
> > they're mostly fine already).
> 
> There are three main issues with virtio 1 patches that I am aware of.
> 
> One issue with virtio 1 patches as they are is with how features are
> handled ATM.  There are 3 types of features
> 
> 	a. virtio 1 only features
> 	b. virtio 0 only features
> 	c. shared features
> 
> and 3 types of devices
> 	a. legacy device: has b+c features
> 	b. modern device: has a+c features
> 	c. transitional device: has a+c features but exposes
> 	   only c through the legacy interface

Wouldn't a transitional device be able to expose b as well?

> 
> 
> So I think a callback that gets features depending on guest
> version isn't a good way to model it because fundamentally device
> has one set of features.
> A better way to model this is really just a single
> host_features bitmask, and for transitional devices, a mask
> hiding a features - which are so far all bits > 31, so maybe
> for now we can just have a global mask.

How would this work for transitional presenting a modern device - would
you have a superset of bits and masks for legacy and modern?

> 
> We need to validate features at initialization time and make
> sure they make sense, fail if not (sometimes we need to mask
> features if they don't make sense - this is unfortunate
> but might be needed for compatibility).
> 
> Moving host_features to virtio core would make all of the above
> easier.

I have started hacking up code that moves host_features, but I'm quite
lost with all the different virtio versions floating around. Currently
trying against master, but that of course ignores the virtio-1 issues.

> 
> 
> Second issue is migration, some of it is with migrating the new
> features, so that's tied to the first one.

There's also the used and avail addresses, but that kind of follows
from virtio-1 support.

> 
> 
> Third issue is fixing devices so they don't try to
> access guest memory until DRIVER_OK is set.
> This is surprisingly hard to do generally given need to support old
> drivers which don't set DRIVER_OK or set it very late, and the fact that
> we tied work-arounds for even older drivers which dont' set pci bus
> master to the DRIVER_OK bit. I tried, and I'm close to giving up and
> just checking guest ack for virtio 1, and ignoring DRIVER_OK requirement
> if not there.

If legacy survived like it is until now, it might be best to focus on
modern devices for this.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio
@ 2015-05-12 16:25         ` Cornelia Huck
  0 siblings, 0 replies; 44+ messages in thread
From: Cornelia Huck @ 2015-05-12 16:25 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtualization, qemu-devel, Stefan Hajnoczi

On Tue, 12 May 2015 17:15:53 +0200
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, May 12, 2015 at 03:25:30PM +0200, Cornelia Huck wrote:
> > On Wed, 06 May 2015 14:08:02 +0200
> > Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> > 
> > > Legacy virtio is native endian: if the guest and host endianness differ,
> > > we have to tell vhost so it can swap bytes where appropriate. This is
> > > done through a vhost ring ioctl.
> > > 
> > > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > > ---
> > >  hw/virtio/vhost.c |   50 +++++++++++++++++++++++++++++++++++++++++++++++++-
> > >  1 file changed, 49 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > > index 54851b7..1d7b939 100644
> > > --- a/hw/virtio/vhost.c
> > > +++ b/hw/virtio/vhost.c
> > (...)
> > > @@ -677,6 +700,16 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
> > >          return -errno;
> > >      }
> > > 
> > > +    if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) &&
> > 
> > I think this should either go in after the virtio-1 base support (more
> > feature bits etc.) or get a big fat comment and be touched up later.
> > I'd prefer the first solution so it does not get forgotten, but I'm not
> > sure when Michael plans to proceed with the virtio-1 patches (I think
> > they're mostly fine already).
> 
> There are three main issues with virtio 1 patches that I am aware of.
> 
> One issue with virtio 1 patches as they are is with how features are
> handled ATM.  There are 3 types of features
> 
> 	a. virtio 1 only features
> 	b. virtio 0 only features
> 	c. shared features
> 
> and 3 types of devices
> 	a. legacy device: has b+c features
> 	b. modern device: has a+c features
> 	c. transitional device: has a+c features but exposes
> 	   only c through the legacy interface

Wouldn't a transitional device be able to expose b as well?

> 
> 
> So I think a callback that gets features depending on guest
> version isn't a good way to model it because fundamentally device
> has one set of features.
> A better way to model this is really just a single
> host_features bitmask, and for transitional devices, a mask
> hiding a features - which are so far all bits > 31, so maybe
> for now we can just have a global mask.

How would this work for transitional presenting a modern device - would
you have a superset of bits and masks for legacy and modern?

> 
> We need to validate features at initialization time and make
> sure they make sense, fail if not (sometimes we need to mask
> features if they don't make sense - this is unfortunate
> but might be needed for compatibility).
> 
> Moving host_features to virtio core would make all of the above
> easier.

I have started hacking up code that moves host_features, but I'm quite
lost with all the different virtio versions floating around. Currently
trying against master, but that of course ignores the virtio-1 issues.

> 
> 
> Second issue is migration, some of it is with migrating the new
> features, so that's tied to the first one.

There's also the used and avail addresses, but that kind of follows
from virtio-1 support.

> 
> 
> Third issue is fixing devices so they don't try to
> access guest memory until DRIVER_OK is set.
> This is surprisingly hard to do generally given need to support old
> drivers which don't set DRIVER_OK or set it very late, and the fact that
> we tied work-arounds for even older drivers which dont' set pci bus
> master to the DRIVER_OK bit. I tried, and I'm close to giving up and
> just checking guest ack for virtio 1, and ignoring DRIVER_OK requirement
> if not there.

If legacy survived like it is until now, it might be best to focus on
modern devices for this.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check
  2015-05-12 15:30             ` Michael S. Tsirkin
@ 2015-05-12 16:27               ` Cornelia Huck
  -1 siblings, 0 replies; 44+ messages in thread
From: Cornelia Huck @ 2015-05-12 16:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, virtualization, qemu-devel, Stefan Hajnoczi

On Tue, 12 May 2015 17:30:21 +0200
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, May 12, 2015 at 04:46:11PM +0200, Cornelia Huck wrote:
> > On Tue, 12 May 2015 15:44:46 +0200
> > Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> > 
> > > On Tue, 12 May 2015 15:34:47 +0200
> > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > 
> > > > On Tue, May 12, 2015 at 03:14:53PM +0200, Cornelia Huck wrote:
> > > > > On Wed, 06 May 2015 14:07:37 +0200
> > > > > Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> > > > > 
> > > > > > Unlike with add and clear, there is no valid reason to abort when checking
> > > > > > for a feature. It makes more sense to return false (i.e. the feature bit
> > > > > > isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.
> > > > > > 
> > > > > > This allows to introduce code that is aware about new 64-bit features like
> > > > > > VIRTIO_F_VERSION_1, even if they are still not implemented.
> > > > > > 
> > > > > > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > > > > > ---
> > > > > >  include/hw/virtio/virtio.h |    1 -
> > > > > >  1 file changed, 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > > > > > index d95f8b6..6ef70f1 100644
> > > > > > --- a/include/hw/virtio/virtio.h
> > > > > > +++ b/include/hw/virtio/virtio.h
> > > > > > @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t *features, unsigned int fbit)
> > > > > > 
> > > > > >  static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
> > > > > >  {
> > > > > > -    assert(fbit < 32);
> > > > > >      return !!(features & (1 << fbit));
> > > > > >  }
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > 
> > > > > I must say I'm not very comfortable with knowingly passing out-of-rage
> > > > > values to this function.
> > > > > 
> > > > > Can we perhaps apply at least the feature-bit-size extending patches
> > > > > prior to your patchset, if the remainder of the virtio-1 patchset still
> > > > > takes some time?
> > > > 
> > > > So the feature-bit-size extending patches currently don't support
> > > > migration correctly, that's why they are not merged.
> > > > 
> > > > What I think we need to do for this is move host_features out
> > > > from transports into core virtio device.
> > > > 
> > > > Then we can simply check host features >31 and skip
> > > > migrating low guest features is none set.
> > > > 
> > > > Thoughts? Any takers?
> > > > 
> > > 
> > > After we move host_features, put them into an optional vmstate
> > > subsection?
> > > 
> > > I think with the recent patchsets, most of the interesting stuff is
> > > already not handled by the transport anymore. There's only
> > > VIRTIO_F_NOTIFY_ON_EMPTY and VIRTIO_F_BAD_FEATURE left (set by pci and
> > > ccw).
> 
> notify on empty is likely safe to set for everyone.
> 
> bad feature should be pci specific, it's a mistake that
> we have it in ccw. it's there to detect very old buggy guests.
> in fact ccw ignores this bit completely.
> 
> For PCI, I think VIRTIO_F_BAD_FEATURE is never
> actually set in guest features. If guest attempts to set it,
> it is immediately cleared.
> 
> So it can be handled in pci specific code, and won't
> affect migration.
> 
> 
> > Thinking a bit more, we probably don't need this move of host_features
> > to get migration right (although it might be a nice cleanup later).
> > 
> > Could we
> > - keep migration of bits 0..31 as-is
> > - add a vmstate subsection for bits 32..63 only included if one of
> >   those bits is set
> > - have a post handler that performs a validation of the full set of
> >   bits 0..63
> > ?
> > 
> > We could do a similar exercise with a subsection containing the
> > addresses for avail and used with a post handler overwriting any
> > addresses set by the old style migration code.
> > 
> > Does that make sense?
> 
> I don't see how it does: on the receive side you don't know
> whether guest acked bits 32..63 so you can't decide whether
> to parse bits 32..63.

But if it wasn't set, it obviously wasn't acked, I'd think?

> 
> The right thing to do IMHO is to migrate the high guest bits if and only
> if the *host* bits 32..63 are set.  And that needs the host features in
> core, or at least is easier if they are there.

Aren't the host bits a prereq? Confused. I'll think about that tomorrow
when it's hopefully a bit cooler around here :)

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check
@ 2015-05-12 16:27               ` Cornelia Huck
  0 siblings, 0 replies; 44+ messages in thread
From: Cornelia Huck @ 2015-05-12 16:27 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtualization, qemu-devel, Stefan Hajnoczi

On Tue, 12 May 2015 17:30:21 +0200
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, May 12, 2015 at 04:46:11PM +0200, Cornelia Huck wrote:
> > On Tue, 12 May 2015 15:44:46 +0200
> > Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> > 
> > > On Tue, 12 May 2015 15:34:47 +0200
> > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > 
> > > > On Tue, May 12, 2015 at 03:14:53PM +0200, Cornelia Huck wrote:
> > > > > On Wed, 06 May 2015 14:07:37 +0200
> > > > > Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> > > > > 
> > > > > > Unlike with add and clear, there is no valid reason to abort when checking
> > > > > > for a feature. It makes more sense to return false (i.e. the feature bit
> > > > > > isn't set). This is exactly what __virtio_has_feature() does if fbit >= 32.
> > > > > > 
> > > > > > This allows to introduce code that is aware about new 64-bit features like
> > > > > > VIRTIO_F_VERSION_1, even if they are still not implemented.
> > > > > > 
> > > > > > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > > > > > ---
> > > > > >  include/hw/virtio/virtio.h |    1 -
> > > > > >  1 file changed, 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > > > > > index d95f8b6..6ef70f1 100644
> > > > > > --- a/include/hw/virtio/virtio.h
> > > > > > +++ b/include/hw/virtio/virtio.h
> > > > > > @@ -233,7 +233,6 @@ static inline void virtio_clear_feature(uint32_t *features, unsigned int fbit)
> > > > > > 
> > > > > >  static inline bool __virtio_has_feature(uint32_t features, unsigned int fbit)
> > > > > >  {
> > > > > > -    assert(fbit < 32);
> > > > > >      return !!(features & (1 << fbit));
> > > > > >  }
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > 
> > > > > I must say I'm not very comfortable with knowingly passing out-of-rage
> > > > > values to this function.
> > > > > 
> > > > > Can we perhaps apply at least the feature-bit-size extending patches
> > > > > prior to your patchset, if the remainder of the virtio-1 patchset still
> > > > > takes some time?
> > > > 
> > > > So the feature-bit-size extending patches currently don't support
> > > > migration correctly, that's why they are not merged.
> > > > 
> > > > What I think we need to do for this is move host_features out
> > > > from transports into core virtio device.
> > > > 
> > > > Then we can simply check host features >31 and skip
> > > > migrating low guest features is none set.
> > > > 
> > > > Thoughts? Any takers?
> > > > 
> > > 
> > > After we move host_features, put them into an optional vmstate
> > > subsection?
> > > 
> > > I think with the recent patchsets, most of the interesting stuff is
> > > already not handled by the transport anymore. There's only
> > > VIRTIO_F_NOTIFY_ON_EMPTY and VIRTIO_F_BAD_FEATURE left (set by pci and
> > > ccw).
> 
> notify on empty is likely safe to set for everyone.
> 
> bad feature should be pci specific, it's a mistake that
> we have it in ccw. it's there to detect very old buggy guests.
> in fact ccw ignores this bit completely.
> 
> For PCI, I think VIRTIO_F_BAD_FEATURE is never
> actually set in guest features. If guest attempts to set it,
> it is immediately cleared.
> 
> So it can be handled in pci specific code, and won't
> affect migration.
> 
> 
> > Thinking a bit more, we probably don't need this move of host_features
> > to get migration right (although it might be a nice cleanup later).
> > 
> > Could we
> > - keep migration of bits 0..31 as-is
> > - add a vmstate subsection for bits 32..63 only included if one of
> >   those bits is set
> > - have a post handler that performs a validation of the full set of
> >   bits 0..63
> > ?
> > 
> > We could do a similar exercise with a subsection containing the
> > addresses for avail and used with a post handler overwriting any
> > addresses set by the old style migration code.
> > 
> > Does that make sense?
> 
> I don't see how it does: on the receive side you don't know
> whether guest acked bits 32..63 so you can't decide whether
> to parse bits 32..63.

But if it wasn't set, it obviously wasn't acked, I'd think?

> 
> The right thing to do IMHO is to migrate the high guest bits if and only
> if the *host* bits 32..63 are set.  And that needs the host features in
> core, or at least is easier if they are there.

Aren't the host bits a prereq? Confused. I'll think about that tomorrow
when it's hopefully a bit cooler around here :)

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio
  2015-05-12 16:25         ` Cornelia Huck
@ 2015-05-12 16:40           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2015-05-12 16:40 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Jason Wang, virtualization, qemu-devel, Stefan Hajnoczi

On Tue, May 12, 2015 at 06:25:32PM +0200, Cornelia Huck wrote:
> On Tue, 12 May 2015 17:15:53 +0200
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Tue, May 12, 2015 at 03:25:30PM +0200, Cornelia Huck wrote:
> > > On Wed, 06 May 2015 14:08:02 +0200
> > > Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> > > 
> > > > Legacy virtio is native endian: if the guest and host endianness differ,
> > > > we have to tell vhost so it can swap bytes where appropriate. This is
> > > > done through a vhost ring ioctl.
> > > > 
> > > > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > > > ---
> > > >  hw/virtio/vhost.c |   50 +++++++++++++++++++++++++++++++++++++++++++++++++-
> > > >  1 file changed, 49 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > > > index 54851b7..1d7b939 100644
> > > > --- a/hw/virtio/vhost.c
> > > > +++ b/hw/virtio/vhost.c
> > > (...)
> > > > @@ -677,6 +700,16 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
> > > >          return -errno;
> > > >      }
> > > > 
> > > > +    if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) &&
> > > 
> > > I think this should either go in after the virtio-1 base support (more
> > > feature bits etc.) or get a big fat comment and be touched up later.
> > > I'd prefer the first solution so it does not get forgotten, but I'm not
> > > sure when Michael plans to proceed with the virtio-1 patches (I think
> > > they're mostly fine already).
> > 
> > There are three main issues with virtio 1 patches that I am aware of.
> > 
> > One issue with virtio 1 patches as they are is with how features are
> > handled ATM.  There are 3 types of features
> > 
> > 	a. virtio 1 only features
> > 	b. virtio 0 only features
> > 	c. shared features
> > 
> > and 3 types of devices
> > 	a. legacy device: has b+c features
> > 	b. modern device: has a+c features
> > 	c. transitional device: has a+c features but exposes
> > 	   only c through the legacy interface
> 
> Wouldn't a transitional device be able to expose b as well?

No because the virtio 1 spec says it shouldn't.


> > 
> > 
> > So I think a callback that gets features depending on guest
> > version isn't a good way to model it because fundamentally device
> > has one set of features.
> > A better way to model this is really just a single
> > host_features bitmask, and for transitional devices, a mask
> > hiding a features - which are so far all bits > 31, so maybe
> > for now we can just have a global mask.
> 
> How would this work for transitional presenting a modern device - would
> you have a superset of bits and masks for legacy and modern?

Basically we expose through modern interface a superset
of bits exposed through legacy.
F_BAD for pci is probably the only exception.


> > 
> > We need to validate features at initialization time and make
> > sure they make sense, fail if not (sometimes we need to mask
> > features if they don't make sense - this is unfortunate
> > but might be needed for compatibility).
> > 
> > Moving host_features to virtio core would make all of the above
> > easier.
> 
> I have started hacking up code that moves host_features, but I'm quite
> lost with all the different virtio versions floating around. Currently
> trying against master, but that of course ignores the virtio-1 issues.

Yes, I think we should focus on infrastructure cleanups in master first.

> > 
> > 
> > Second issue is migration, some of it is with migrating the new
> > features, so that's tied to the first one.
> 
> There's also the used and avail addresses, but that kind of follows
> from virtio-1 support.
> 
> > 
> > 
> > Third issue is fixing devices so they don't try to
> > access guest memory until DRIVER_OK is set.
> > This is surprisingly hard to do generally given need to support old
> > drivers which don't set DRIVER_OK or set it very late, and the fact that
> > we tied work-arounds for even older drivers which dont' set pci bus
> > master to the DRIVER_OK bit. I tried, and I'm close to giving up and
> > just checking guest ack for virtio 1, and ignoring DRIVER_OK requirement
> > if not there.
> 
> If legacy survived like it is until now, it might be best to focus on
> modern devices for this.

I'm kind of unhappy that it's up to guest though as that controls
whether we run in modern mode. But yea.

-- 
MST

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio
@ 2015-05-12 16:40           ` Michael S. Tsirkin
  0 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2015-05-12 16:40 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: virtualization, qemu-devel, Stefan Hajnoczi

On Tue, May 12, 2015 at 06:25:32PM +0200, Cornelia Huck wrote:
> On Tue, 12 May 2015 17:15:53 +0200
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Tue, May 12, 2015 at 03:25:30PM +0200, Cornelia Huck wrote:
> > > On Wed, 06 May 2015 14:08:02 +0200
> > > Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> > > 
> > > > Legacy virtio is native endian: if the guest and host endianness differ,
> > > > we have to tell vhost so it can swap bytes where appropriate. This is
> > > > done through a vhost ring ioctl.
> > > > 
> > > > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > > > ---
> > > >  hw/virtio/vhost.c |   50 +++++++++++++++++++++++++++++++++++++++++++++++++-
> > > >  1 file changed, 49 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > > > index 54851b7..1d7b939 100644
> > > > --- a/hw/virtio/vhost.c
> > > > +++ b/hw/virtio/vhost.c
> > > (...)
> > > > @@ -677,6 +700,16 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
> > > >          return -errno;
> > > >      }
> > > > 
> > > > +    if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) &&
> > > 
> > > I think this should either go in after the virtio-1 base support (more
> > > feature bits etc.) or get a big fat comment and be touched up later.
> > > I'd prefer the first solution so it does not get forgotten, but I'm not
> > > sure when Michael plans to proceed with the virtio-1 patches (I think
> > > they're mostly fine already).
> > 
> > There are three main issues with virtio 1 patches that I am aware of.
> > 
> > One issue with virtio 1 patches as they are is with how features are
> > handled ATM.  There are 3 types of features
> > 
> > 	a. virtio 1 only features
> > 	b. virtio 0 only features
> > 	c. shared features
> > 
> > and 3 types of devices
> > 	a. legacy device: has b+c features
> > 	b. modern device: has a+c features
> > 	c. transitional device: has a+c features but exposes
> > 	   only c through the legacy interface
> 
> Wouldn't a transitional device be able to expose b as well?

No because the virtio 1 spec says it shouldn't.


> > 
> > 
> > So I think a callback that gets features depending on guest
> > version isn't a good way to model it because fundamentally device
> > has one set of features.
> > A better way to model this is really just a single
> > host_features bitmask, and for transitional devices, a mask
> > hiding a features - which are so far all bits > 31, so maybe
> > for now we can just have a global mask.
> 
> How would this work for transitional presenting a modern device - would
> you have a superset of bits and masks for legacy and modern?

Basically we expose through modern interface a superset
of bits exposed through legacy.
F_BAD for pci is probably the only exception.


> > 
> > We need to validate features at initialization time and make
> > sure they make sense, fail if not (sometimes we need to mask
> > features if they don't make sense - this is unfortunate
> > but might be needed for compatibility).
> > 
> > Moving host_features to virtio core would make all of the above
> > easier.
> 
> I have started hacking up code that moves host_features, but I'm quite
> lost with all the different virtio versions floating around. Currently
> trying against master, but that of course ignores the virtio-1 issues.

Yes, I think we should focus on infrastructure cleanups in master first.

> > 
> > 
> > Second issue is migration, some of it is with migrating the new
> > features, so that's tied to the first one.
> 
> There's also the used and avail addresses, but that kind of follows
> from virtio-1 support.
> 
> > 
> > 
> > Third issue is fixing devices so they don't try to
> > access guest memory until DRIVER_OK is set.
> > This is surprisingly hard to do generally given need to support old
> > drivers which don't set DRIVER_OK or set it very late, and the fact that
> > we tied work-arounds for even older drivers which dont' set pci bus
> > master to the DRIVER_OK bit. I tried, and I'm close to giving up and
> > just checking guest ack for virtio 1, and ignoring DRIVER_OK requirement
> > if not there.
> 
> If legacy survived like it is until now, it might be best to focus on
> modern devices for this.

I'm kind of unhappy that it's up to guest though as that controls
whether we run in modern mode. But yea.

-- 
MST

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio
  2015-05-12 16:40           ` Michael S. Tsirkin
@ 2015-05-13 10:39             ` Cornelia Huck
  -1 siblings, 0 replies; 44+ messages in thread
From: Cornelia Huck @ 2015-05-13 10:39 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, virtualization, qemu-devel, Stefan Hajnoczi

On Tue, 12 May 2015 18:40:35 +0200
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, May 12, 2015 at 06:25:32PM +0200, Cornelia Huck wrote:
> > On Tue, 12 May 2015 17:15:53 +0200
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:

> > > One issue with virtio 1 patches as they are is with how features are
> > > handled ATM.  There are 3 types of features
> > > 
> > > 	a. virtio 1 only features
> > > 	b. virtio 0 only features
> > > 	c. shared features
> > > 
> > > and 3 types of devices
> > > 	a. legacy device: has b+c features
> > > 	b. modern device: has a+c features
> > > 	c. transitional device: has a+c features but exposes
> > > 	   only c through the legacy interface
> > 
> > Wouldn't a transitional device be able to expose b as well?
> 
> No because the virtio 1 spec says it shouldn't.

Can you point me to the part where it says that? Can't seem to find it.

> > > So I think a callback that gets features depending on guest
> > > version isn't a good way to model it because fundamentally device
> > > has one set of features.
> > > A better way to model this is really just a single
> > > host_features bitmask, and for transitional devices, a mask
> > > hiding a features - which are so far all bits > 31, so maybe
> > > for now we can just have a global mask.
> > 
> > How would this work for transitional presenting a modern device - would
> > you have a superset of bits and masks for legacy and modern?
> 
> Basically we expose through modern interface a superset
> of bits exposed through legacy.
> F_BAD for pci is probably the only exception.

ccw has F_BAD as well.

NOTIFY_ON_EMPTY is in the MAY be offered category.

I'm still wondering about the old legacy bits. Transitional devices not
offering them is easiest to implement from the host side, but I'm
wondering if all legacy devices will be able to function with missing
bits? Is breaking some legacy drivers OK?

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio
@ 2015-05-13 10:39             ` Cornelia Huck
  0 siblings, 0 replies; 44+ messages in thread
From: Cornelia Huck @ 2015-05-13 10:39 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtualization, qemu-devel, Stefan Hajnoczi

On Tue, 12 May 2015 18:40:35 +0200
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, May 12, 2015 at 06:25:32PM +0200, Cornelia Huck wrote:
> > On Tue, 12 May 2015 17:15:53 +0200
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:

> > > One issue with virtio 1 patches as they are is with how features are
> > > handled ATM.  There are 3 types of features
> > > 
> > > 	a. virtio 1 only features
> > > 	b. virtio 0 only features
> > > 	c. shared features
> > > 
> > > and 3 types of devices
> > > 	a. legacy device: has b+c features
> > > 	b. modern device: has a+c features
> > > 	c. transitional device: has a+c features but exposes
> > > 	   only c through the legacy interface
> > 
> > Wouldn't a transitional device be able to expose b as well?
> 
> No because the virtio 1 spec says it shouldn't.

Can you point me to the part where it says that? Can't seem to find it.

> > > So I think a callback that gets features depending on guest
> > > version isn't a good way to model it because fundamentally device
> > > has one set of features.
> > > A better way to model this is really just a single
> > > host_features bitmask, and for transitional devices, a mask
> > > hiding a features - which are so far all bits > 31, so maybe
> > > for now we can just have a global mask.
> > 
> > How would this work for transitional presenting a modern device - would
> > you have a superset of bits and masks for legacy and modern?
> 
> Basically we expose through modern interface a superset
> of bits exposed through legacy.
> F_BAD for pci is probably the only exception.

ccw has F_BAD as well.

NOTIFY_ON_EMPTY is in the MAY be offered category.

I'm still wondering about the old legacy bits. Transitional devices not
offering them is easiest to implement from the host side, but I'm
wondering if all legacy devices will be able to function with missing
bits? Is breaking some legacy drivers OK?

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio
  2015-05-13 10:39             ` Cornelia Huck
@ 2015-05-13 11:22               ` Michael S. Tsirkin
  -1 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2015-05-13 11:22 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Jason Wang, virtualization, qemu-devel, Stefan Hajnoczi

On Wed, May 13, 2015 at 12:39:07PM +0200, Cornelia Huck wrote:
> On Tue, 12 May 2015 18:40:35 +0200
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Tue, May 12, 2015 at 06:25:32PM +0200, Cornelia Huck wrote:
> > > On Tue, 12 May 2015 17:15:53 +0200
> > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > > > One issue with virtio 1 patches as they are is with how features are
> > > > handled ATM.  There are 3 types of features
> > > > 
> > > > 	a. virtio 1 only features
> > > > 	b. virtio 0 only features
> > > > 	c. shared features
> > > > 
> > > > and 3 types of devices
> > > > 	a. legacy device: has b+c features
> > > > 	b. modern device: has a+c features
> > > > 	c. transitional device: has a+c features but exposes
> > > > 	   only c through the legacy interface
> > > 
> > > Wouldn't a transitional device be able to expose b as well?
> > 
> > No because the virtio 1 spec says it shouldn't.
> 
> Can you point me to the part where it says that? Can't seem to find it.


What I mean is that these features can't be operated through the modern
interface.
Consider the BLK SCSI command feature. If you set it you really
expect guest to be able to e.g. use your device as a cd writer.
So, a legacy guest is able to do this, but a modern guest isn't?
Doesn't sound like an expected outcome to me.


So the principle of the least surprise dictates that we
- disable the feature by default for everyone
- add flag to enable the feature, available for legacy devices only


> > > > So I think a callback that gets features depending on guest
> > > > version isn't a good way to model it because fundamentally device
> > > > has one set of features.
> > > > A better way to model this is really just a single
> > > > host_features bitmask, and for transitional devices, a mask
> > > > hiding a features - which are so far all bits > 31, so maybe
> > > > for now we can just have a global mask.
> > > 
> > > How would this work for transitional presenting a modern device - would
> > > you have a superset of bits and masks for legacy and modern?
> > 
> > Basically we expose through modern interface a superset
> > of bits exposed through legacy.
> > F_BAD for pci is probably the only exception.
> 
> ccw has F_BAD as well.

It does but it ignores it, so I think setting that bit
in host features is a bug. PCI uses it to detect very old
guests that just acked all features without looking
at them. ccw never had such guests.

> NOTIFY_ON_EMPTY is in the MAY be offered category.

Right. Current guests simply ignore this feature bit,
so I don't much care what we do with this bit as long as
- pci exposes this on the legacy interface
- we don't spend many lines of code working around that special case

> I'm still wondering about the old legacy bits. Transitional devices not
> offering them is easiest to implement from the host side, but I'm
> wondering if all legacy devices will be able to function with missing
> bits? Is breaking some legacy drivers OK?

We should add flags to re-enable them, for legacy devices only.

-- 
MST

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio
@ 2015-05-13 11:22               ` Michael S. Tsirkin
  0 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2015-05-13 11:22 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: virtualization, qemu-devel, Stefan Hajnoczi

On Wed, May 13, 2015 at 12:39:07PM +0200, Cornelia Huck wrote:
> On Tue, 12 May 2015 18:40:35 +0200
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Tue, May 12, 2015 at 06:25:32PM +0200, Cornelia Huck wrote:
> > > On Tue, 12 May 2015 17:15:53 +0200
> > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > > > One issue with virtio 1 patches as they are is with how features are
> > > > handled ATM.  There are 3 types of features
> > > > 
> > > > 	a. virtio 1 only features
> > > > 	b. virtio 0 only features
> > > > 	c. shared features
> > > > 
> > > > and 3 types of devices
> > > > 	a. legacy device: has b+c features
> > > > 	b. modern device: has a+c features
> > > > 	c. transitional device: has a+c features but exposes
> > > > 	   only c through the legacy interface
> > > 
> > > Wouldn't a transitional device be able to expose b as well?
> > 
> > No because the virtio 1 spec says it shouldn't.
> 
> Can you point me to the part where it says that? Can't seem to find it.


What I mean is that these features can't be operated through the modern
interface.
Consider the BLK SCSI command feature. If you set it you really
expect guest to be able to e.g. use your device as a cd writer.
So, a legacy guest is able to do this, but a modern guest isn't?
Doesn't sound like an expected outcome to me.


So the principle of the least surprise dictates that we
- disable the feature by default for everyone
- add flag to enable the feature, available for legacy devices only


> > > > So I think a callback that gets features depending on guest
> > > > version isn't a good way to model it because fundamentally device
> > > > has one set of features.
> > > > A better way to model this is really just a single
> > > > host_features bitmask, and for transitional devices, a mask
> > > > hiding a features - which are so far all bits > 31, so maybe
> > > > for now we can just have a global mask.
> > > 
> > > How would this work for transitional presenting a modern device - would
> > > you have a superset of bits and masks for legacy and modern?
> > 
> > Basically we expose through modern interface a superset
> > of bits exposed through legacy.
> > F_BAD for pci is probably the only exception.
> 
> ccw has F_BAD as well.

It does but it ignores it, so I think setting that bit
in host features is a bug. PCI uses it to detect very old
guests that just acked all features without looking
at them. ccw never had such guests.

> NOTIFY_ON_EMPTY is in the MAY be offered category.

Right. Current guests simply ignore this feature bit,
so I don't much care what we do with this bit as long as
- pci exposes this on the legacy interface
- we don't spend many lines of code working around that special case

> I'm still wondering about the old legacy bits. Transitional devices not
> offering them is easiest to implement from the host side, but I'm
> wondering if all legacy devices will be able to function with missing
> bits? Is breaking some legacy drivers OK?

We should add flags to re-enable them, for legacy devices only.

-- 
MST

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2015-05-13 11:22 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-06 12:07 [Qemu-devel] [PATCH RFC 0/7] vhost: cross-endian support (vhost-net only) Greg Kurz
2015-05-06 12:07 ` Greg Kurz
2015-05-06 12:07 ` [Qemu-devel] [PATCH RFC 1/7] virtio: relax feature check Greg Kurz
2015-05-06 12:07   ` Greg Kurz
2015-05-12 13:14   ` [Qemu-devel] " Cornelia Huck
2015-05-12 13:14     ` Cornelia Huck
2015-05-12 13:34     ` Michael S. Tsirkin
2015-05-12 13:34       ` Michael S. Tsirkin
2015-05-12 13:44       ` Cornelia Huck
2015-05-12 13:44         ` Cornelia Huck
2015-05-12 14:46         ` Cornelia Huck
2015-05-12 14:46           ` Cornelia Huck
2015-05-12 15:30           ` Michael S. Tsirkin
2015-05-12 15:30             ` Michael S. Tsirkin
2015-05-12 16:27             ` Cornelia Huck
2015-05-12 16:27               ` Cornelia Huck
2015-05-12 13:49     ` Peter Maydell
2015-05-12 13:49     ` Peter Maydell
2015-05-12 13:55     ` Greg Kurz
2015-05-12 13:55       ` Greg Kurz
2015-05-06 12:07 ` [Qemu-devel] [PATCH RFC 2/7] linux-headers: sync vhost.h Greg Kurz
2015-05-06 12:07   ` Greg Kurz
2015-05-06 12:07 ` [Qemu-devel] [PATCH RFC 3/7] virtio: introduce virtio_legacy_is_cross_endian() Greg Kurz
2015-05-06 12:07   ` Greg Kurz
2015-05-06 12:08 ` [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio Greg Kurz
2015-05-06 12:08   ` Greg Kurz
2015-05-12 13:25   ` [Qemu-devel] " Cornelia Huck
2015-05-12 13:25     ` Cornelia Huck
2015-05-12 15:15     ` Michael S. Tsirkin
2015-05-12 15:15       ` Michael S. Tsirkin
2015-05-12 16:25       ` Cornelia Huck
2015-05-12 16:25         ` Cornelia Huck
2015-05-12 16:40         ` Michael S. Tsirkin
2015-05-12 16:40           ` Michael S. Tsirkin
2015-05-13 10:39           ` Cornelia Huck
2015-05-13 10:39             ` Cornelia Huck
2015-05-13 11:22             ` Michael S. Tsirkin
2015-05-13 11:22               ` Michael S. Tsirkin
2015-05-06 12:08 ` [Qemu-devel] [PATCH RFC 5/7] tap: add VNET_LE/VNET_BE operations Greg Kurz
2015-05-06 12:08   ` Greg Kurz
2015-05-06 12:08 ` [Qemu-devel] [PATCH RFC 6/7] vhost-net: tell tap backend about the vnet endianness Greg Kurz
2015-05-06 12:08   ` Greg Kurz
2015-05-06 12:08 ` [Qemu-devel] [PATCH RFC 7/7] vhost_net: re-enable when cross endian Greg Kurz
2015-05-06 12:08   ` Greg Kurz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.