* [PATCH 00/18] virtio and vhost-net performance enhancements
@ 2011-05-04 20:50 ` Michael S. Tsirkin
  0 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

OK, here's a large patchset that implements the virtio spec update that I
sent earlier. It supersedes the PUBLISH_USED_IDX patches
I sent out earlier.

I know it's a lot to ask but please test, and please consider for 2.6.40 :)

I see nice performance improvements: one run showed going from 12
to 18 Gbit/s host to guest with netperf, but I did not spend a lot
of time testing performance, so no guarantees it's not a fluke,
I hope others will try this out and report.
Pls note I will be away from keyboard for the next week.

Essentially we change virtio ring notification
hand-off to work like the one in Xen:
each side publishes an event index, and the other one
notifies when it reaches that value.
The one difference is that the event index starts at 0,
same as the request index (in Xen the event index starts at 1).
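
To make the hand-off concrete, here is a minimal sketch of the kind of check the notifying side can perform. The function name and signature are illustrative assumptions and the code in the patches may differ in detail; the point is that all arithmetic is done modulo 2^16, so the comparison stays correct when the indexes wrap.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch only: need_event() is a hypothetical helper name.
 *
 * Return true when the index the other side asked to be told about
 * (event_idx) was passed while our own index moved from old_idx to
 * new_idx. Unsigned 16-bit subtraction wraps, so the test also works
 * across the 65535 -> 0 boundary.
 */
static bool need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}
```

With an event index of 0 and the first request moving the index from 0 to 1, the check is true, which matches the statement above that event indexes start at 0.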

Each side of the handoff has a feature bit independent
of the other one, so we can have e.g. interrupts
handled in the new way and exits in the old one.

This is actually what made the patchset larger:
we ran out of feature bits so I had to add some more.
I tested various combinations of hosts and guests and
this code seems to be solid.
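
As a rough illustration of how the extra feature bits can be exposed, the sketch below combines a low and a high 32-bit feature word into one 64-bit mask, trusting the high word only when a "high bits present" flag is set in the low word. This mirrors the idea of patch 01 (which uses bit 31 for that flag), but the helper names here are assumptions made for the example, not the kernel's API.

```c
#include <stdint.h>

/* Bit in the low word that says the high word is valid (bit 31 in patch 01). */
#define FEATURES_HI_BIT 31

/* Hypothetical helper: build a 64-bit feature mask from two 32-bit words. */
static uint64_t combine_features(uint32_t lo, uint32_t hi)
{
	/* Without the flag, the high word is not meaningful: treat it as 0. */
	if (!(lo & (UINT32_C(1) << FEATURES_HI_BIT)))
		hi = 0;
	return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical helper: test one bit of the combined mask. */
static int has_feature(uint64_t features, unsigned int bit)
{
	return bit < 64 && (features & (UINT64_C(1) << bit)) != 0;
}
```

Note the 64-bit constant in the shift: shifting a plain `1` by 32 or more would not produce the intended mask, which is why the patch switches to `1ull` in the generic code.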

With the indexes in place it becomes possible to request an event after
many requests (and not just on the next one as done now). This shall fix
the TX queue overrun which currently triggers a storm of interrupts.
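
A small sketch of what "request an event after many requests" can look like: instead of publishing the very next index, the side that wants fewer interrupts publishes an index some distance ahead. The helper name and the three-quarters fraction are assumptions chosen for illustration; the policy in the patches may be different.

```c
#include <stdint.h>

/*
 * Hypothetical helper: pick the event index to publish so that the
 * other side notifies only after most of the outstanding requests
 * have been processed.
 *
 * last_used: index of the next entry we expect to consume
 * avail_idx: index one past the last request we made available
 */
static uint16_t delayed_event_idx(uint16_t last_used, uint16_t avail_idx)
{
	/* Requests still in flight; the cast keeps the result modulo 2^16. */
	uint16_t outstanding = (uint16_t)(avail_idx - last_used);

	/* Ask for an event once about 3/4 of them are done (assumed ratio). */
	return (uint16_t)(last_used + outstanding * 3 / 4);
}
```

With 40 requests outstanding starting at index 100, this publishes index 130, so a single notification replaces what could otherwise be up to 30 separate ones.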

The patches are mostly independent and can also be cherry-picked,
hopefully there won't be much need of that.

One dependency I'd like to note is on two cleanup patches:
the patch removing batching of available index updates
and the patch fixing ring capacity checks in virtio-net.
This simplified code a bit and made the following patch simpler.

I could unwrap the dependency but prefer as is.

The patchset is on top of net-next which at the time
I last rebased was 15ecd03 - so roughly 2.6.39-rc2.

qemu patch will follow shortly.

Rusty, I think (in the hope it will come to that) it will be easier to
merge vhost and virtio bits in one go. Can all go in through your tree
(Dave in the past acked a very similar patch so should not be a problem)
or from me to Dave Miller.

Michael S. Tsirkin (17):
  virtio: 64 bit features
  virtio_test: update for 64 bit features
  vhost: fix 64 bit features
  virtio: don't delay avail index update
  virtio: used event index interface
  virtio_ring: avail event index interface
  virtio ring: inline function to check for events
  virtio_ring: support for used_event idx feature
  virtio: use avail_event index
  vhost: utilize used_event index
  vhost: support avail_event idx
  virtio_test: support used_event index
  virtio_test: avail_event index support
  virtio: add api for delayed callbacks
  virtio_net: delay TX callbacks
  virtio_net: fix TX capacity checks using new API
  virtio_net: limit xmit polling

Shirley Ma (1):
  virtio_ring: Add capacity check API

 drivers/lguest/lguest_device.c |    8 +-
 drivers/net/virtio_net.c       |   25 ++++---
 drivers/s390/kvm/kvm_virtio.c  |    8 +-
 drivers/vhost/net.c            |   12 ++--
 drivers/vhost/test.c           |    6 +-
 drivers/vhost/vhost.c          |  139 ++++++++++++++++++++++++++++++----------
 drivers/vhost/vhost.h          |   30 ++++++---
 drivers/virtio/virtio.c        |    8 +-
 drivers/virtio/virtio_pci.c    |   34 ++++++++--
 drivers/virtio/virtio_ring.c   |  105 +++++++++++++++++++++++++++---
 include/linux/virtio.h         |   16 ++++-
 include/linux/virtio_config.h  |   15 +++--
 include/linux/virtio_pci.h     |    9 ++-
 include/linux/virtio_ring.h    |   30 ++++++++-
 tools/virtio/virtio_test.c     |   39 ++++++++++-
 15 files changed, 377 insertions(+), 107 deletions(-)

-- 
1.7.5.53.gc233e

* [PATCH 01/18] virtio: 64 bit features
  2011-05-04 20:50 ` Michael S. Tsirkin
@ 2011-05-04 20:50   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Extend features to 64 bit so we can use more
transport bits.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/lguest/lguest_device.c |    8 ++++----
 drivers/s390/kvm/kvm_virtio.c  |    8 ++++----
 drivers/virtio/virtio.c        |    8 ++++----
 drivers/virtio/virtio_pci.c    |   34 ++++++++++++++++++++++++++++------
 drivers/virtio/virtio_ring.c   |    2 ++
 include/linux/virtio.h         |    2 +-
 include/linux/virtio_config.h  |   15 +++++++++------
 include/linux/virtio_pci.h     |    9 ++++++++-
 8 files changed, 60 insertions(+), 26 deletions(-)

diff --git a/drivers/lguest/lguest_device.c b/drivers/lguest/lguest_device.c
index 69c84a1..d2d6953 100644
--- a/drivers/lguest/lguest_device.c
+++ b/drivers/lguest/lguest_device.c
@@ -93,17 +93,17 @@ static unsigned desc_size(const struct lguest_device_desc *desc)
 }
 
 /* This gets the device's feature bits. */
-static u32 lg_get_features(struct virtio_device *vdev)
+static u64 lg_get_features(struct virtio_device *vdev)
 {
 	unsigned int i;
-	u32 features = 0;
+	u64 features = 0;
 	struct lguest_device_desc *desc = to_lgdev(vdev)->desc;
 	u8 *in_features = lg_features(desc);
 
 	/* We do this the slow but generic way. */
-	for (i = 0; i < min(desc->feature_len * 8, 32); i++)
+	for (i = 0; i < min(desc->feature_len * 8, 64); i++)
 		if (in_features[i / 8] & (1 << (i % 8)))
-			features |= (1 << i);
+			features |= (1ull << i);
 
 	return features;
 }
diff --git a/drivers/s390/kvm/kvm_virtio.c b/drivers/s390/kvm/kvm_virtio.c
index 414427d..c56293c 100644
--- a/drivers/s390/kvm/kvm_virtio.c
+++ b/drivers/s390/kvm/kvm_virtio.c
@@ -79,16 +79,16 @@ static unsigned desc_size(const struct kvm_device_desc *desc)
 }
 
 /* This gets the device's feature bits. */
-static u32 kvm_get_features(struct virtio_device *vdev)
+static u64 kvm_get_features(struct virtio_device *vdev)
 {
 	unsigned int i;
-	u32 features = 0;
+	u64 features = 0;
 	struct kvm_device_desc *desc = to_kvmdev(vdev)->desc;
 	u8 *in_features = kvm_vq_features(desc);
 
-	for (i = 0; i < min(desc->feature_len * 8, 32); i++)
+	for (i = 0; i < min(desc->feature_len * 8, 64); i++)
 		if (in_features[i / 8] & (1 << (i % 8)))
-			features |= (1 << i);
+			features |= (1ull << i);
 	return features;
 }
 
diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index efb35aa..52b24d7 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -112,7 +112,7 @@ static int virtio_dev_probe(struct device *_d)
 	struct virtio_device *dev = container_of(_d,struct virtio_device,dev);
 	struct virtio_driver *drv = container_of(dev->dev.driver,
 						 struct virtio_driver, driver);
-	u32 device_features;
+	u64 device_features;
 
 	/* We have a driver! */
 	add_status(dev, VIRTIO_CONFIG_S_DRIVER);
@@ -124,14 +124,14 @@ static int virtio_dev_probe(struct device *_d)
 	memset(dev->features, 0, sizeof(dev->features));
 	for (i = 0; i < drv->feature_table_size; i++) {
 		unsigned int f = drv->feature_table[i];
-		BUG_ON(f >= 32);
-		if (device_features & (1 << f))
+		BUG_ON(f >= 64);
+		if (device_features & (1ull << f))
 			set_bit(f, dev->features);
 	}
 
 	/* Transport features always preserved to pass to finalize_features. */
 	for (i = VIRTIO_TRANSPORT_F_START; i < VIRTIO_TRANSPORT_F_END; i++)
-		if (device_features & (1 << i))
+		if (device_features & (1ull << i))
 			set_bit(i, dev->features);
 
 	dev->config->finalize_features(dev);
diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
index 4fb5b2b..04b216f 100644
--- a/drivers/virtio/virtio_pci.c
+++ b/drivers/virtio/virtio_pci.c
@@ -44,6 +44,8 @@ struct virtio_pci_device
 	spinlock_t lock;
 	struct list_head virtqueues;
 
+	/* 64 bit features */
+	int features_hi;
 	/* MSI-X support */
 	int msix_enabled;
 	int intx_enabled;
@@ -103,26 +105,46 @@ static struct virtio_pci_device *to_vp_device(struct virtio_device *vdev)
 }
 
 /* virtio config->get_features() implementation */
-static u32 vp_get_features(struct virtio_device *vdev)
+static u64 vp_get_features(struct virtio_device *vdev)
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
+	u32 flo, fhi;
 
-	/* When someone needs more than 32 feature bits, we'll need to
+	/* When someone needs more than 32 feature bits, we need to
 	 * steal a bit to indicate that the rest are somewhere else. */
-	return ioread32(vp_dev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
+	flo = ioread32(vp_dev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
+	if (flo & (0x1 << VIRTIO_F_FEATURES_HI)) {
+		vp_dev->features_hi = 1;
+		iowrite32(0x1 << VIRTIO_F_FEATURES_HI,
+			  vp_dev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
+		fhi = ioread32(vp_dev->ioaddr + VIRTIO_PCI_HOST_FEATURES_HI);
+	} else {
+		vp_dev->features_hi = 0;
+		fhi = 0;
+	}
+	return (((u64)fhi) << 32) | flo;
 }
 
 /* virtio config->finalize_features() implementation */
 static void vp_finalize_features(struct virtio_device *vdev)
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
+	u32 flo, fhi;
 
 	/* Give virtio_ring a chance to accept features. */
 	vring_transport_features(vdev);
 
-	/* We only support 32 feature bits. */
-	BUILD_BUG_ON(ARRAY_SIZE(vdev->features) != 1);
-	iowrite32(vdev->features[0], vp_dev->ioaddr+VIRTIO_PCI_GUEST_FEATURES);
+	/* We only support 64 feature bits. */
+	BUILD_BUG_ON(ARRAY_SIZE(vdev->features) != 64 / BITS_PER_LONG);
+	flo = vdev->features[0];
+	fhi = vdev->features[64 / BITS_PER_LONG - 1] >> (BITS_PER_LONG - 32);
+	iowrite32(flo, vp_dev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
+	if (flo & (0x1 << VIRTIO_F_FEATURES_HI)) {
+		vp_dev->features_hi = 1;
+		iowrite32(fhi, vp_dev->ioaddr + VIRTIO_PCI_GUEST_FEATURES_HI);
+	} else {
+		vp_dev->features_hi = 0;
+	}
 }
 
 /* virtio config->get() implementation */
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index cc2f73e..059e02d 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -469,6 +469,8 @@ void vring_transport_features(struct virtio_device *vdev)
 
 	for (i = VIRTIO_TRANSPORT_F_START; i < VIRTIO_TRANSPORT_F_END; i++) {
 		switch (i) {
+		case VIRTIO_F_FEATURES_HI:
+			break;
 		case VIRTIO_RING_F_INDIRECT_DESC:
 			break;
 		default:
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index aff5b4f..718336b 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -105,7 +105,7 @@ struct virtio_device {
 	struct virtio_config_ops *config;
 	struct list_head vqs;
 	/* Note that this is a Linux set_bit-style bitmap. */
-	unsigned long features[1];
+	unsigned long features[64 / BITS_PER_LONG];
 	void *priv;
 };
 
diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index 800617b..b1a1981 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -18,16 +18,19 @@
 /* We've given up on this device. */
 #define VIRTIO_CONFIG_S_FAILED		0x80
 
-/* Some virtio feature bits (currently bits 28 through 31) are reserved for the
+/* Some virtio feature bits (currently bits 28 through 39) are reserved for the
  * transport being used (eg. virtio_ring), the rest are per-device feature
  * bits. */
 #define VIRTIO_TRANSPORT_F_START	28
-#define VIRTIO_TRANSPORT_F_END		32
+#define VIRTIO_TRANSPORT_F_END		40
 
 /* Do we get callbacks when the ring is completely used, even if we've
  * suppressed them? */
 #define VIRTIO_F_NOTIFY_ON_EMPTY	24
 
+/* Enables feature bits 32 to 63 (only really required for virtio_pci). */
+#define VIRTIO_F_FEATURES_HI		31
+
 #ifdef __KERNEL__
 #include <linux/err.h>
 #include <linux/virtio.h>
@@ -72,7 +75,7 @@
  * @del_vqs: free virtqueues found by find_vqs().
  * @get_features: get the array of feature bits for this device.
  *	vdev: the virtio_device
- *	Returns the first 32 feature bits (all we currently need).
+ *	Returns the first 64 feature bits (all we currently need).
  * @finalize_features: confirm what device features we'll be using.
  *	vdev: the virtio_device
  *	This gives the final feature bits for the device: it can change
@@ -92,7 +95,7 @@ struct virtio_config_ops {
 			vq_callback_t *callbacks[],
 			const char *names[]);
 	void (*del_vqs)(struct virtio_device *);
-	u32 (*get_features)(struct virtio_device *vdev);
+	u64 (*get_features)(struct virtio_device *vdev);
 	void (*finalize_features)(struct virtio_device *vdev);
 };
 
@@ -110,9 +113,9 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
 {
 	/* Did you forget to fix assumptions on max features? */
 	if (__builtin_constant_p(fbit))
-		BUILD_BUG_ON(fbit >= 32);
+		BUILD_BUG_ON(fbit >= 64);
 	else
-		BUG_ON(fbit >= 32);
+		BUG_ON(fbit >= 64);
 
 	if (fbit < VIRTIO_TRANSPORT_F_START)
 		virtio_check_driver_offered_feature(vdev, fbit);
diff --git a/include/linux/virtio_pci.h b/include/linux/virtio_pci.h
index 9a3d7c4..90f9725 100644
--- a/include/linux/virtio_pci.h
+++ b/include/linux/virtio_pci.h
@@ -55,9 +55,16 @@
 /* Vector value used to disable MSI for queue */
 #define VIRTIO_MSI_NO_VECTOR            0xffff
 
+/* An extended 32-bit r/o bitmask of the features supported by the host */
+#define VIRTIO_PCI_HOST_FEATURES_HI	24
+
+/* An extended 32-bit r/w bitmask of features activated by the guest */
+#define VIRTIO_PCI_GUEST_FEATURES_HI	28
+
 /* The remaining space is defined by each driver as the per-driver
  * configuration space */
-#define VIRTIO_PCI_CONFIG(dev)		((dev)->msix_enabled ? 24 : 20)
+#define VIRTIO_PCI_CONFIG(dev)		((dev)->features_hi ? 32 : \
+						(dev)->msix_enabled ? 24 : 20)
 
 /* Virtio ABI version, this must match exactly */
 #define VIRTIO_PCI_ABI_VERSION		0
-- 
1.7.5.53.gc233e


* [PATCH 01/18] virtio: 64 bit features
  2011-05-04 20:50 ` Michael S. Tsirkin
  (?)
@ 2011-05-04 20:50 ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:50 UTC (permalink / raw)
  Cc: Krishna Kumar, Carsten Otte, lguest, Shirley Ma, kvm, linux-s390,
	netdev, habanero, Heiko Carstens, linux-kernel, virtualization,
	steved, Christian Borntraeger, Tom Lendacky, Martin Schwidefsky,
	linux390

Extend features to 64 bit so we can use more
transport bits.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/lguest/lguest_device.c |    8 ++++----
 drivers/s390/kvm/kvm_virtio.c  |    8 ++++----
 drivers/virtio/virtio.c        |    8 ++++----
 drivers/virtio/virtio_pci.c    |   34 ++++++++++++++++++++++++++++------
 drivers/virtio/virtio_ring.c   |    2 ++
 include/linux/virtio.h         |    2 +-
 include/linux/virtio_config.h  |   15 +++++++++------
 include/linux/virtio_pci.h     |    9 ++++++++-
 8 files changed, 60 insertions(+), 26 deletions(-)

diff --git a/drivers/lguest/lguest_device.c b/drivers/lguest/lguest_device.c
index 69c84a1..d2d6953 100644
--- a/drivers/lguest/lguest_device.c
+++ b/drivers/lguest/lguest_device.c
@@ -93,17 +93,17 @@ static unsigned desc_size(const struct lguest_device_desc *desc)
 }
 
 /* This gets the device's feature bits. */
-static u32 lg_get_features(struct virtio_device *vdev)
+static u64 lg_get_features(struct virtio_device *vdev)
 {
 	unsigned int i;
-	u32 features = 0;
+	u64 features = 0;
 	struct lguest_device_desc *desc = to_lgdev(vdev)->desc;
 	u8 *in_features = lg_features(desc);
 
 	/* We do this the slow but generic way. */
-	for (i = 0; i < min(desc->feature_len * 8, 32); i++)
+	for (i = 0; i < min(desc->feature_len * 8, 64); i++)
 		if (in_features[i / 8] & (1 << (i % 8)))
-			features |= (1 << i);
+			features |= (1ull << i);
 
 	return features;
 }
diff --git a/drivers/s390/kvm/kvm_virtio.c b/drivers/s390/kvm/kvm_virtio.c
index 414427d..c56293c 100644
--- a/drivers/s390/kvm/kvm_virtio.c
+++ b/drivers/s390/kvm/kvm_virtio.c
@@ -79,16 +79,16 @@ static unsigned desc_size(const struct kvm_device_desc *desc)
 }
 
 /* This gets the device's feature bits. */
-static u32 kvm_get_features(struct virtio_device *vdev)
+static u64 kvm_get_features(struct virtio_device *vdev)
 {
 	unsigned int i;
-	u32 features = 0;
+	u64 features = 0;
 	struct kvm_device_desc *desc = to_kvmdev(vdev)->desc;
 	u8 *in_features = kvm_vq_features(desc);
 
-	for (i = 0; i < min(desc->feature_len * 8, 32); i++)
+	for (i = 0; i < min(desc->feature_len * 8, 64); i++)
 		if (in_features[i / 8] & (1 << (i % 8)))
-			features |= (1 << i);
+			features |= (1ull << i);
 	return features;
 }
 
diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index efb35aa..52b24d7 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -112,7 +112,7 @@ static int virtio_dev_probe(struct device *_d)
 	struct virtio_device *dev = container_of(_d,struct virtio_device,dev);
 	struct virtio_driver *drv = container_of(dev->dev.driver,
 						 struct virtio_driver, driver);
-	u32 device_features;
+	u64 device_features;
 
 	/* We have a driver! */
 	add_status(dev, VIRTIO_CONFIG_S_DRIVER);
@@ -124,14 +124,14 @@ static int virtio_dev_probe(struct device *_d)
 	memset(dev->features, 0, sizeof(dev->features));
 	for (i = 0; i < drv->feature_table_size; i++) {
 		unsigned int f = drv->feature_table[i];
-		BUG_ON(f >= 32);
-		if (device_features & (1 << f))
+		BUG_ON(f >= 64);
+		if (device_features & (1ull << f))
 			set_bit(f, dev->features);
 	}
 
 	/* Transport features always preserved to pass to finalize_features. */
 	for (i = VIRTIO_TRANSPORT_F_START; i < VIRTIO_TRANSPORT_F_END; i++)
-		if (device_features & (1 << i))
+		if (device_features & (1ull << i))
 			set_bit(i, dev->features);
 
 	dev->config->finalize_features(dev);
diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
index 4fb5b2b..04b216f 100644
--- a/drivers/virtio/virtio_pci.c
+++ b/drivers/virtio/virtio_pci.c
@@ -44,6 +44,8 @@ struct virtio_pci_device
 	spinlock_t lock;
 	struct list_head virtqueues;
 
+	/* 64 bit features */
+	int features_hi;
 	/* MSI-X support */
 	int msix_enabled;
 	int intx_enabled;
@@ -103,26 +105,46 @@ static struct virtio_pci_device *to_vp_device(struct virtio_device *vdev)
 }
 
 /* virtio config->get_features() implementation */
-static u32 vp_get_features(struct virtio_device *vdev)
+static u64 vp_get_features(struct virtio_device *vdev)
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
+	u32 flo, fhi;
 
-	/* When someone needs more than 32 feature bits, we'll need to
+	/* When someone needs more than 32 feature bits, we need to
 	 * steal a bit to indicate that the rest are somewhere else. */
-	return ioread32(vp_dev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
+	flo = ioread32(vp_dev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
+	if (flo & (0x1 << VIRTIO_F_FEATURES_HI)) {
+		vp_dev->features_hi = 1;
+		iowrite32(0x1 << VIRTIO_F_FEATURES_HI,
+			  vp_dev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
+		fhi = ioread32(vp_dev->ioaddr + VIRTIO_PCI_HOST_FEATURES_HI);
+	} else {
+		vp_dev->features_hi = 0;
+		fhi = 0;
+	}
+	return (((u64)fhi) << 32) | flo;
 }
 
 /* virtio config->finalize_features() implementation */
 static void vp_finalize_features(struct virtio_device *vdev)
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
+	u32 flo, fhi;
 
 	/* Give virtio_ring a chance to accept features. */
 	vring_transport_features(vdev);
 
-	/* We only support 32 feature bits. */
-	BUILD_BUG_ON(ARRAY_SIZE(vdev->features) != 1);
-	iowrite32(vdev->features[0], vp_dev->ioaddr+VIRTIO_PCI_GUEST_FEATURES);
+	/* We only support 64 feature bits. */
+	BUILD_BUG_ON(ARRAY_SIZE(vdev->features) != 64 / BITS_PER_LONG);
+	flo = vdev->features[0];
+	fhi = vdev->features[64 / BITS_PER_LONG - 1] >> (BITS_PER_LONG - 32);
+	iowrite32(flo, vp_dev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
+	if (flo & (0x1 << VIRTIO_F_FEATURES_HI)) {
+		vp_dev->features_hi = 1;
+		iowrite32(fhi, vp_dev->ioaddr + VIRTIO_PCI_GUEST_FEATURES_HI);
+	} else {
+		vp_dev->features_hi = 0;
+	}
 }
 
 /* virtio config->get() implementation */
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index cc2f73e..059e02d 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -469,6 +469,8 @@ void vring_transport_features(struct virtio_device *vdev)
 
 	for (i = VIRTIO_TRANSPORT_F_START; i < VIRTIO_TRANSPORT_F_END; i++) {
 		switch (i) {
+		case VIRTIO_F_FEATURES_HI:
+			break;
 		case VIRTIO_RING_F_INDIRECT_DESC:
 			break;
 		default:
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index aff5b4f..718336b 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -105,7 +105,7 @@ struct virtio_device {
 	struct virtio_config_ops *config;
 	struct list_head vqs;
 	/* Note that this is a Linux set_bit-style bitmap. */
-	unsigned long features[1];
+	unsigned long features[64 / BITS_PER_LONG];
 	void *priv;
 };
 
diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index 800617b..b1a1981 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -18,16 +18,19 @@
 /* We've given up on this device. */
 #define VIRTIO_CONFIG_S_FAILED		0x80
 
-/* Some virtio feature bits (currently bits 28 through 31) are reserved for the
+/* Some virtio feature bits (currently bits 28 through 39) are reserved for the
  * transport being used (eg. virtio_ring), the rest are per-device feature
  * bits. */
 #define VIRTIO_TRANSPORT_F_START	28
-#define VIRTIO_TRANSPORT_F_END		32
+#define VIRTIO_TRANSPORT_F_END		40
 
 /* Do we get callbacks when the ring is completely used, even if we've
  * suppressed them? */
 #define VIRTIO_F_NOTIFY_ON_EMPTY	24
 
+/* Enables feature bits 32 to 63 (only really required for virtio_pci). */
+#define VIRTIO_F_FEATURES_HI		31
+
 #ifdef __KERNEL__
 #include <linux/err.h>
 #include <linux/virtio.h>
@@ -72,7 +75,7 @@
  * @del_vqs: free virtqueues found by find_vqs().
  * @get_features: get the array of feature bits for this device.
  *	vdev: the virtio_device
- *	Returns the first 32 feature bits (all we currently need).
+ *	Returns the first 64 feature bits (all we currently need).
  * @finalize_features: confirm what device features we'll be using.
  *	vdev: the virtio_device
  *	This gives the final feature bits for the device: it can change
@@ -92,7 +95,7 @@ struct virtio_config_ops {
 			vq_callback_t *callbacks[],
 			const char *names[]);
 	void (*del_vqs)(struct virtio_device *);
-	u32 (*get_features)(struct virtio_device *vdev);
+	u64 (*get_features)(struct virtio_device *vdev);
 	void (*finalize_features)(struct virtio_device *vdev);
 };
 
@@ -110,9 +113,9 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
 {
 	/* Did you forget to fix assumptions on max features? */
 	if (__builtin_constant_p(fbit))
-		BUILD_BUG_ON(fbit >= 32);
+		BUILD_BUG_ON(fbit >= 64);
 	else
-		BUG_ON(fbit >= 32);
+		BUG_ON(fbit >= 64);
 
 	if (fbit < VIRTIO_TRANSPORT_F_START)
 		virtio_check_driver_offered_feature(vdev, fbit);
diff --git a/include/linux/virtio_pci.h b/include/linux/virtio_pci.h
index 9a3d7c4..90f9725 100644
--- a/include/linux/virtio_pci.h
+++ b/include/linux/virtio_pci.h
@@ -55,9 +55,16 @@
 /* Vector value used to disable MSI for queue */
 #define VIRTIO_MSI_NO_VECTOR            0xffff
 
+/* An extended 32-bit r/o bitmask of the features supported by the host */
+#define VIRTIO_PCI_HOST_FEATURES_HI	24
+
+/* An extended 32-bit r/w bitmask of features activated by the guest */
+#define VIRTIO_PCI_GUEST_FEATURES_HI	28
+
 /* The remaining space is defined by each driver as the per-driver
  * configuration space */
-#define VIRTIO_PCI_CONFIG(dev)		((dev)->msix_enabled ? 24 : 20)
+#define VIRTIO_PCI_CONFIG(dev)		((dev)->features_hi ? 32 : \
+						(dev)->msix_enabled ? 24 : 20)
 
 /* Virtio ABI version, this must match exactly */
 #define VIRTIO_PCI_ABI_VERSION		0
-- 
1.7.5.53.gc233e

^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 02/18] virtio_test: update for 64 bit features
  2011-05-04 20:50 ` Michael S. Tsirkin
  (?)
@ 2011-05-04 20:50   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Extend the virtio_test tool so it can work with
64 bit features.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 tools/virtio/virtio_test.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/virtio/virtio_test.c b/tools/virtio/virtio_test.c
index df0c6d2..9e65e6d 100644
--- a/tools/virtio/virtio_test.c
+++ b/tools/virtio/virtio_test.c
@@ -55,7 +55,6 @@ void vhost_vq_setup(struct vdev_info *dev, struct vq_info *info)
 {
 	struct vhost_vring_state state = { .index = info->idx };
 	struct vhost_vring_file file = { .index = info->idx };
-	unsigned long long features = dev->vdev.features[0];
 	struct vhost_vring_addr addr = {
 		.index = info->idx,
 		.desc_user_addr = (uint64_t)(unsigned long)info->vring.desc,
@@ -63,6 +62,10 @@ void vhost_vq_setup(struct vdev_info *dev, struct vq_info *info)
 		.used_user_addr = (uint64_t)(unsigned long)info->vring.used,
 	};
 	int r;
+	unsigned long long features = dev->vdev.features[0];
+	if (sizeof features > sizeof dev->vdev.features[0])
+		features |= ((unsigned long long)dev->vdev.features[1]) << 32;
+
 	r = ioctl(dev->control, VHOST_SET_FEATURES, &features);
 	assert(r >= 0);
 	state.num = info->vring.num;
@@ -107,7 +110,8 @@ static void vdev_info_init(struct vdev_info* dev, unsigned long long features)
 	int r;
 	memset(dev, 0, sizeof *dev);
 	dev->vdev.features[0] = features;
-	dev->vdev.features[1] = features >> 32;
+	if (sizeof features > sizeof dev->vdev.features[0])
+		dev->vdev.features[1] = features >> 32;
 	dev->buf_size = 1024;
 	dev->buf = malloc(dev->buf_size);
 	assert(dev->buf);
-- 
1.7.5.53.gc233e
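
Editorial sketch (not part of the patch): the sizeof checks above exist
because an unsigned long holds all 64 feature bits on a 64-bit build but
only 32 of them on a 32-bit build. One way to write the same split without
the explicit check is to loop over however many words are needed. The names
BITS_PER_ULONG, FEATURE_WORDS, features_store and features_load are
hypothetical, and the code assumes unsigned long is either 32 or 64 bits.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_ULONG	(sizeof(unsigned long) * CHAR_BIT)
#define FEATURE_WORDS	((64 + BITS_PER_ULONG - 1) / BITS_PER_ULONG)

/* Spread a 64-bit feature mask over an array of unsigned long words. */
static void features_store(unsigned long words[FEATURE_WORDS], uint64_t features)
{
	size_t i;

	assert(BITS_PER_ULONG == 32 || BITS_PER_ULONG == 64);
	for (i = 0; i < FEATURE_WORDS; i++)
		/* Shift count is 0 or 32, never 64, so it stays defined. */
		words[i] = (unsigned long)(features >> (i * BITS_PER_ULONG));
}

/* Rebuild the 64-bit mask from the word array. */
static uint64_t features_load(const unsigned long words[FEATURE_WORDS])
{
	uint64_t features = 0;
	size_t i;

	for (i = 0; i < FEATURE_WORDS; i++)
		features |= (uint64_t)words[i] << (i * BITS_PER_ULONG);
	return features;
}
```

With one word the loop degenerates to a plain assignment, which matches
what the patch does when the sizeof check is false.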


^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 03/18] vhost: fix 64 bit features
  2011-05-04 20:50 ` Michael S. Tsirkin
  (?)
@ 2011-05-04 20:50   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Update vhost_has_feature to make it work correctly for bits >= 32.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/vhost/vhost.h |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index b3363ae..0f1bf33 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -117,7 +117,7 @@ struct vhost_dev {
 	struct vhost_memory __rcu *memory;
 	struct mm_struct *mm;
 	struct mutex mutex;
-	unsigned acked_features;
+	u64 acked_features;
 	struct vhost_virtqueue *vqs;
 	int nvqs;
 	struct file *log_file;
@@ -169,14 +169,14 @@ enum {
 			 (1 << VIRTIO_NET_F_MRG_RXBUF),
 };
 
-static inline int vhost_has_feature(struct vhost_dev *dev, int bit)
+static inline bool vhost_has_feature(struct vhost_dev *dev, int bit)
 {
-	unsigned acked_features;
+	u64 acked_features;
 
 	/* TODO: check that we are running from vhost_worker or dev mutex is
 	 * held? */
 	acked_features = rcu_dereference_index_check(dev->acked_features, 1);
-	return acked_features & (1 << bit);
+	return acked_features & (1ull << bit);
 }
 
 #endif
-- 
1.7.5.53.gc233e
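
Editorial sketch (not part of the patch): besides the 1ull shift, the
return type matters once the mask is 64 bits wide. Handing the masked value
back through a 32-bit integer drops the upper half on typical
implementations, while a bool return maps any non-zero value to true. The
struct and function names below are hypothetical stand-ins, and the
truncating variant models the narrowing explicitly rather than copying any
kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-device acked feature mask. */
struct feature_set {
	uint64_t acked;
};

/* bool return: any non-zero result of the AND converts to true. */
static bool has_feature(const struct feature_set *fs, unsigned int bit)
{
	assert(bit < 64);
	return fs->acked & (1ull << bit);
}

/* Models a 32-bit return value: only the low 32 bits of the masked
 * value survive, so features at bit 32 and above always read as 0. */
static int has_feature_truncating(const struct feature_set *fs, unsigned int bit)
{
	assert(bit < 64);
	return (int)(uint32_t)(fs->acked & (1ull << bit));
}
```

With acked set to 1ull << 32, has_feature() reports bit 32 as present while
the truncating variant reports it as absent.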


^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 04/18] virtio: don't delay avail index update
  2011-05-04 20:50 ` Michael S. Tsirkin
@ 2011-05-04 20:51   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Update avail index immediately instead of upon kick:
for virtio-net RX this helps parallelism with the host.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/virtio/virtio_ring.c |   15 +++++----------
 1 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 059e02d..507d6eb 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -86,8 +86,6 @@ struct vring_virtqueue
 	unsigned int num_free;
 	/* Head of free buffer list. */
 	unsigned int free_head;
-	/* Number we've added since last sync. */
-	unsigned int num_added;
 
 	/* Last used index we've seen. */
 	u16 last_used_idx;
@@ -224,8 +222,12 @@ add_head:
 
 	/* Put entry in available array (but don't update avail->idx until they
 	 * do sync).  FIXME: avoid modulus here? */
-	avail = (vq->vring.avail->idx + vq->num_added++) % vq->vring.num;
+	avail = vq->vring.avail->idx % vq->vring.num;
 	vq->vring.avail->ring[avail] = head;
+	/* Descriptors and available array need to be set before we expose the
+	 * new available array entries. */
+	virtio_wmb();
+	vq->vring.avail->idx++;
 
 	pr_debug("Added buffer head %i to %p\n", head, vq);
 	END_USE(vq);
@@ -238,12 +240,6 @@ void virtqueue_kick(struct virtqueue *_vq)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
 	START_USE(vq);
-	/* Descriptors and available array need to be set before we expose the
-	 * new available array entries. */
-	virtio_wmb();
-
-	vq->vring.avail->idx += vq->num_added;
-	vq->num_added = 0;
 
 	/* Need to update avail index before checking if we should notify */
 	virtio_mb();
@@ -430,7 +426,6 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
 	vq->notify = notify;
 	vq->broken = false;
 	vq->last_used_idx = 0;
-	vq->num_added = 0;
 	list_add_tail(&vq->vq.list, &vdev->vqs);
 #ifdef DEBUG
 	vq->in_use = false;
-- 
1.7.5.53.gc233e
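
Editorial sketch (not part of the patch): the ordering the patch relies on
is "fill the ring slot, barrier, then bump the index", done per buffer
instead of once per kick. The userspace analogue below uses a C11 release
store in place of virtio_wmb() followed by the increment. It assumes a
single producer; RING_NUM, struct avail_ring and avail_publish are
hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_NUM 8u	/* hypothetical ring size */

struct avail_ring {
	_Atomic uint16_t idx;		/* free-running, wraps at 65536 */
	uint16_t ring[RING_NUM];
};

/* Publish one descriptor head so the consumer can see it right away. */
static void avail_publish(struct avail_ring *avail, uint16_t head)
{
	uint16_t idx = atomic_load_explicit(&avail->idx, memory_order_relaxed);

	assert(head < RING_NUM);
	avail->ring[idx % RING_NUM] = head;
	/* Release ordering: the slot write above becomes visible before
	 * the new index does, the role virtio_wmb() plays in the patch. */
	atomic_store_explicit(&avail->idx, (uint16_t)(idx + 1),
			      memory_order_release);
}
```

A consumer would pair this with an acquire load of idx before reading the
ring slots it covers.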


^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 05/18] virtio: used event index interface
  2011-05-04 20:50 ` Michael S. Tsirkin
  (?)
@ 2011-05-04 20:51   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Define a new feature bit for the guest to utilize a used_event index
(like Xen) instead of a flag bit to enable/disable interrupts.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/linux/virtio_ring.h |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
index e4d144b..f5c1b75 100644
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -29,6 +29,10 @@
 /* We support indirect buffer descriptors */
 #define VIRTIO_RING_F_INDIRECT_DESC	28
 
+/* The Guest publishes the used index for which it expects an interrupt
+ * at the end of the avail ring. Host should ignore the avail->flags field. */
+#define VIRTIO_RING_F_USED_EVENT_IDX	29
+
 /* Virtio ring descriptors: 16 bytes.  These can chain together via "next". */
 struct vring_desc {
 	/* Address (guest-physical). */
@@ -83,6 +87,7 @@ struct vring {
  *	__u16 avail_flags;
  *	__u16 avail_idx;
  *	__u16 available[num];
+ *	__u16 used_event_idx;
  *
  *	// Padding to the next align boundary.
  *	char pad[];
@@ -93,6 +98,10 @@ struct vring {
  *	struct vring_used_elem used[num];
  * };
  */
+/* We publish the used event index at the end of the available ring.
+ * It is at the end for backwards compatibility. */
+#define vring_used_event(vr) ((vr)->avail->ring[(vr)->num])
+
 static inline void vring_init(struct vring *vr, unsigned int num, void *p,
 			      unsigned long align)
 {
-- 
1.7.5.53.gc233e
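
Editorial sketch (not part of the patch): vring_used_event() works because
the available ring is a header followed by num 16-bit entries, so indexing
ring[num] lands on the extra slot directly behind them. The standalone
layout below illustrates that with a flexible array member. The names
struct avail_hdr, avail_size and used_event are hypothetical, and the
allocation must reserve the extra slot for the access to be valid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct avail_hdr {
	uint16_t flags;
	uint16_t idx;
	uint16_t ring[];	/* num entries, then one used_event slot */
};

/* Bytes needed for the header, num ring entries and the trailing slot. */
static size_t avail_size(unsigned int num)
{
	assert(num > 0);
	return sizeof(struct avail_hdr) + ((size_t)num + 1) * sizeof(uint16_t);
}

/* Address of the used_event slot, i.e. one past the last ring entry. */
static uint16_t *used_event(struct avail_hdr *avail, unsigned int num)
{
	return &avail->ring[num];
}
```

Because the slot sits after the existing fields, the offsets of flags, idx
and the ring entries stay where an older peer expects them.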


^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 06/18] virtio_ring: avail event index interface
  2011-05-04 20:50 ` Michael S. Tsirkin
  (?)
@ 2011-05-04 20:51   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Define a new feature bit for the host to
declare that it uses an avail_event index
(like Xen) instead of a flag bit
to enable/disable kicks.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/linux/virtio_ring.h |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
index f5c1b75..f791772 100644
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -32,6 +32,9 @@
 /* The Guest publishes the used index for which it expects an interrupt
  * at the end of the avail ring. Host should ignore the avail->flags field. */
 #define VIRTIO_RING_F_USED_EVENT_IDX	29
+/* The Host publishes the avail index for which it expects a kick
+ * at the end of the used ring. Guest should ignore the used->flags field. */
+#define VIRTIO_RING_F_AVAIL_EVENT_IDX	32
 
 /* Virtio ring descriptors: 16 bytes.  These can chain together via "next". */
 struct vring_desc {
@@ -96,11 +99,13 @@ struct vring {
  *	__u16 used_flags;
  *	__u16 used_idx;
  *	struct vring_used_elem used[num];
+ *	__u16 avail_event_idx;
  * };
  */
-/* We publish the used event index at the end of the available ring.
- * It is at the end for backwards compatibility. */
+/* We publish the used event index at the end of the available ring, and vice
+ * versa. They are at the end for backwards compatibility. */
 #define vring_used_event(vr) ((vr)->avail->ring[(vr)->num])
+#define vring_avail_event(vr) (*(__u16 *)&(vr)->used->ring[(vr)->num])
 
 static inline void vring_init(struct vring *vr, unsigned int num, void *p,
 			      unsigned long align)
@@ -116,7 +121,7 @@ static inline unsigned vring_size(unsigned int num, unsigned long align)
 {
 	return ((sizeof(struct vring_desc) * num + sizeof(__u16) * (2 + num)
 		 + align - 1) & ~(align - 1))
-		+ sizeof(__u16) * 2 + sizeof(struct vring_used_elem) * num;
+		+ sizeof(__u16) * 3 + sizeof(struct vring_used_elem) * num;
 }
 
 #ifdef __KERNEL__
-- 
1.7.5.53.gc233e
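
A worked example of the size arithmetic changed above may help. This is a user-space sketch under stated assumptions: the demo_* structs are hypothetical stand-ins sized like the kernel ones (16-byte descriptors as the header comment says, and 8-byte used elements made of two 32-bit fields), and the used_words parameter is added here only so the old and new formulas can be compared.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins with the same sizes as the kernel structs. */
struct demo_desc {		/* 16 bytes */
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

struct demo_used_elem {		/* 8 bytes */
	uint32_t id;
	uint32_t len;
};

/* Mirrors vring_size() from the patch. used_words is the number of
 * __u16 fields around the used ring: 2 before this patch (flags, idx),
 * 3 after it (flags, idx, avail_event). align must be a power of two. */
static inline unsigned long demo_vring_size(unsigned int num,
					    unsigned long align,
					    unsigned int used_words)
{
	return ((sizeof(struct demo_desc) * num
		 + sizeof(uint16_t) * (2 + num)
		 + align - 1) & ~(align - 1))
		+ sizeof(uint16_t) * used_words
		+ sizeof(struct demo_used_elem) * num;
}

/* Self-check for a 256-entry ring with 4096-byte alignment. */
static inline void demo_vring_size_example(void)
{
	/* 4096 (descriptors) + 516 (avail) rounds up to 8192,
	 * then 2048 bytes of used elements plus the u16 header fields. */
	assert(demo_vring_size(256, 4096, 2) == 10244);
	assert(demo_vring_size(256, 4096, 3) == 10246);
}
```

For this ring geometry the patch grows the total by exactly one __u16, the avail_event slot at the end of the used ring.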


^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 07/18] virtio ring: inline function to check for events
  2011-05-04 20:50 ` Michael S. Tsirkin
@ 2011-05-04 20:51   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

With the new used_event and avail_event features, both
host and guest need similar logic to check whether events are
enabled, so it helps to put the common code in the header.

Note that Xen has similar logic for notification hold-off
in include/xen/interface/io/ring.h with req_event and req_prod
corresponding to event_idx + 1 and new_idx respectively.
+1 comes from the fact that req_event and req_prod in Xen start at 1,
while event index in virtio starts at 0.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/linux/virtio_ring.h |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
index f791772..2a3b0ea 100644
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -124,6 +124,20 @@ static inline unsigned vring_size(unsigned int num, unsigned long align)
 		+ sizeof(__u16) * 3 + sizeof(struct vring_used_elem) * num;
 }
 
+/* The following is used with USED_EVENT_IDX and AVAIL_EVENT_IDX */
+/* Assuming a given event_idx value from the other side, if
+ * we have just incremented index from old to new_idx,
+ * should we trigger an event? */
+static inline int vring_need_event(__u16 event_idx, __u16 new_idx, __u16 old)
+{
+	/* Note: Xen has similar logic for notification hold-off
+	 * in include/xen/interface/io/ring.h with req_event and req_prod
+	 * corresponding to event_idx + 1 and new_idx respectively.
+	 * Note also that req_event and req_prod in Xen start at 1,
+	 * event indexes in virtio start at 0. */
+	return (__u16)(new_idx - event_idx - 1) < (__u16)(new_idx - old);
+}
+
 #ifdef __KERNEL__
 #include <linux/irqreturn.h>
 struct virtio_device;
-- 
1.7.5.53.gc233e
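
The comparison in vring_need_event() is easiest to follow with concrete numbers. The sketch below copies the expression from the patch into a user-space function (demo_need_event is a hypothetical name, and uint16_t stands in for __u16) and checks a few cases, including one where the index wraps past 65535.

```c
#include <assert.h>
#include <stdint.h>

/* Same expression as vring_need_event() in the patch: returns nonzero
 * when event_idx lies in the half-open window [old, new_idx), i.e. the
 * move from old to new_idx stepped past the index the other side
 * asked to be told about. All arithmetic is modulo 2^16. */
static inline int demo_need_event(uint16_t event_idx, uint16_t new_idx,
				  uint16_t old)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
}

static inline void demo_need_event_example(void)
{
	/* Index moved 5 -> 8. */
	assert(demo_need_event(5, 8, 5));	/* first entry in window */
	assert(demo_need_event(7, 8, 5));	/* last entry in window */
	assert(!demo_need_event(4, 8, 5));	/* already passed earlier */
	assert(!demo_need_event(8, 8, 5));	/* not reached yet */
	assert(!demo_need_event(10, 8, 5));	/* still in the future */

	/* Index moved 65534 -> 2, wrapping around (4 new entries). */
	assert(demo_need_event(65535, 2, 65534));
	assert(demo_need_event(0, 2, 65534));
	assert(!demo_need_event(2, 2, 65534));
	assert(!demo_need_event(65533, 2, 65534));

	/* No progress (new == old) never triggers an event. */
	assert(!demo_need_event(3, 4, 4));
}
```

The unsigned 16-bit casts are what make the wrapped case work: both sides of the comparison are distances measured backwards from new_idx, so no special handling of the wrap point is needed.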


^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 08/18] virtio_ring: support for used_event idx feature
  2011-05-04 20:50 ` Michael S. Tsirkin
  (?)
@ 2011-05-04 20:51   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Add support for the used_event idx feature: when enabling
interrupts, publish the current used index value (last_used_idx) to
the host so that we get an interrupt on the next update.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/virtio/virtio_ring.c |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 507d6eb..3a3ed75 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -320,6 +320,14 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
 	ret = vq->data[i];
 	detach_buf(vq, i);
 	vq->last_used_idx++;
+	/* If we expect an interrupt for the next entry, tell host
+	 * by writing event index and flush out the write before
+	 * the read in the next get_buf call. */
+	if (!(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT)) {
+		vring_used_event(&vq->vring) = vq->last_used_idx;
+		virtio_mb();
+	}
+
 	END_USE(vq);
 	return ret;
 }
@@ -341,7 +349,11 @@ bool virtqueue_enable_cb(struct virtqueue *_vq)
 
 	/* We optimistically turn back on interrupts, then check if there was
 	 * more to do. */
+	/* Depending on the VIRTIO_RING_F_USED_EVENT_IDX feature, we need to
+	 * either clear the flags bit or point the event index at the next
+	 * entry. Always do both to keep code simple. */
 	vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT;
+	vring_used_event(&vq->vring) = vq->last_used_idx;
 	virtio_mb();
 	if (unlikely(more_used(vq))) {
 		END_USE(vq);
@@ -468,6 +480,8 @@ void vring_transport_features(struct virtio_device *vdev)
 			break;
 		case VIRTIO_RING_F_INDIRECT_DESC:
 			break;
+		case VIRTIO_RING_F_USED_EVENT_IDX:
+			break;
 		default:
 			/* We don't understand this bit. */
 			clear_bit(i, vdev->features);
-- 
1.7.5.53.gc233e
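
To show how publishing last_used_idx suppresses interrupts, here is a small user-space simulation. It is only a sketch: the demo_* names are hypothetical, the guest side mirrors what virtqueue_get_buf() does in this patch with callbacks enabled, and the host side is a deliberately simplified stand-in that signals after every completion (it is not the vhost implementation from later in the series).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same check as vring_need_event() from patch 07. */
static inline int demo_need_event(uint16_t event_idx, uint16_t new_idx,
				  uint16_t old)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
}

struct demo_ring {
	uint16_t used_idx;	/* written by the host */
	uint16_t used_event;	/* written by the guest */
	uint16_t last_used_idx;	/* guest-private consumer index */
	unsigned int interrupts;/* interrupts the host decided to send */
};

/* Simplified host: complete one buffer, interrupt only if the guest
 * asked to hear about the entry that was just filled in. */
static inline void demo_host_complete(struct demo_ring *r)
{
	uint16_t old = r->used_idx;

	r->used_idx = (uint16_t)(old + 1);
	if (demo_need_event(r->used_event, r->used_idx, old))
		r->interrupts++;
}

/* Guest: consume one used entry and, as in the patch, publish the
 * index of the next entry it wants an interrupt for. */
static inline bool demo_guest_get_buf(struct demo_ring *r)
{
	if (r->last_used_idx == r->used_idx)
		return false;	/* nothing to consume */
	r->last_used_idx++;
	r->used_event = r->last_used_idx;
	return true;
}

static inline void demo_suppression_example(void)
{
	struct demo_ring r = {0};
	int i;

	/* Host completes 4 buffers while the guest is busy:
	 * only the first one crosses used_event == 0. */
	for (i = 0; i < 4; i++)
		demo_host_complete(&r);
	assert(r.interrupts == 1);

	/* Guest drains the ring, moving used_event up to 4. */
	while (demo_guest_get_buf(&r))
		;
	assert(r.used_event == 4);

	/* The next completion is interesting to the guest again. */
	demo_host_complete(&r);
	assert(r.interrupts == 2);
}
```

With the flag-based scheme the host would have had no way to know that the guest was still busy after the first interrupt unless the guest explicitly set VRING_AVAIL_F_NO_INTERRUPT; here the stale used_event value carries that information implicitly.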


^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 09/18] virtio: use avail_event index
  2011-05-04 20:50 ` Michael S. Tsirkin
  (?)
@ 2011-05-04 20:51   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Use the new avail_event feature to reduce the number
of exits from the guest.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/virtio/virtio_ring.c |   39 ++++++++++++++++++++++++++++++++++++++-
 1 files changed, 38 insertions(+), 1 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 3a3ed75..262dfe6 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -82,6 +82,15 @@ struct vring_virtqueue
 	/* Host supports indirect buffers */
 	bool indirect;
 
+	/* Host publishes avail event idx */
+	bool event;
+
+	/* Is kicked_avail below valid? */
+	bool kicked_avail_valid;
+
+	/* avail idx value we already kicked. */
+	u16 kicked_avail;
+
 	/* Number of free buffers */
 	unsigned int num_free;
 	/* Head of free buffer list. */
@@ -228,6 +237,12 @@ add_head:
 	 * new available array entries. */
 	virtio_wmb();
 	vq->vring.avail->idx++;
+	/* If the driver never bothers to kick in a very long while,
+	 * avail index might wrap around. If that happens, invalidate
+	 * kicked_avail index we stored. TODO: make sure all drivers
+	 * kick at least once in 2^16 and remove this. */
+	if (unlikely(vq->vring.avail->idx == vq->kicked_avail))
+		vq->kicked_avail_valid = false;
 
 	pr_debug("Added buffer head %i to %p\n", head, vq);
 	END_USE(vq);
@@ -236,6 +251,23 @@ add_head:
 }
 EXPORT_SYMBOL_GPL(virtqueue_add_buf_gfp);
 
+
+static bool vring_notify(struct vring_virtqueue *vq)
+{
+	u16 old, new;
+	bool v;
+	if (!vq->event)
+		return !(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY);
+
+	v = vq->kicked_avail_valid;
+	old = vq->kicked_avail;
+	new = vq->kicked_avail = vq->vring.avail->idx;
+	vq->kicked_avail_valid = true;
+	if (unlikely(!v))
+		return true;
+	return vring_need_event(vring_avail_event(&vq->vring), new, old);
+}
+
 void virtqueue_kick(struct virtqueue *_vq)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
@@ -244,7 +276,7 @@ void virtqueue_kick(struct virtqueue *_vq)
 	/* Need to update avail index before checking if we should notify */
 	virtio_mb();
 
-	if (!(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY))
+	if (vring_notify(vq))
 		/* Prod other side to tell it about changes. */
 		vq->notify(&vq->vq);
 
@@ -437,6 +469,8 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
 	vq->vq.name = name;
 	vq->notify = notify;
 	vq->broken = false;
+	vq->kicked_avail_valid = false;
+	vq->kicked_avail = 0;
 	vq->last_used_idx = 0;
 	list_add_tail(&vq->vq.list, &vdev->vqs);
 #ifdef DEBUG
@@ -444,6 +478,7 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
 #endif
 
 	vq->indirect = virtio_has_feature(vdev, VIRTIO_RING_F_INDIRECT_DESC);
+	vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_AVAIL_EVENT_IDX);
 
 	/* No callback?  Tell other side not to bother us. */
 	if (!callback)
@@ -482,6 +517,8 @@ void vring_transport_features(struct virtio_device *vdev)
 			break;
 		case VIRTIO_RING_F_USED_EVENT_IDX:
 			break;
+		case VIRTIO_RING_F_AVAIL_EVENT_IDX:
+			break;
 		default:
 			/* We don't understand this bit. */
 			clear_bit(i, vdev->features);
-- 
1.7.5.53.gc233e
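
The kick decision in vring_notify() can be modelled in user space as follows. This is a sketch under assumptions: struct demo_vq and the demo_* functions are hypothetical, the ring fields are flattened into one struct for brevity, and the wrap guard invalidates the stored index, as the comment in the patch describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_USED_F_NO_NOTIFY 1	/* stand-in for VRING_USED_F_NO_NOTIFY */

/* Same check as vring_need_event() from patch 07. */
static inline int demo_need_event(uint16_t event_idx, uint16_t new_idx,
				  uint16_t old)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
}

struct demo_vq {
	bool event;		/* host offered AVAIL_EVENT_IDX */
	bool kicked_avail_valid;
	uint16_t kicked_avail;	/* avail idx at the last kick attempt */
	uint16_t avail_idx;	/* written by the guest */
	uint16_t avail_event;	/* written by the host */
	uint16_t used_flags;	/* written by the host (legacy path) */
};

/* Expose one more buffer; forget kicked_avail if we wrapped onto it. */
static inline void demo_add_buf(struct demo_vq *vq)
{
	vq->avail_idx++;
	if (vq->avail_idx == vq->kicked_avail)
		vq->kicked_avail_valid = false;
}

/* Should the guest notify (kick) the host now? */
static inline bool demo_notify(struct demo_vq *vq)
{
	uint16_t old, new_idx;
	bool valid;

	if (!vq->event)
		return !(vq->used_flags & DEMO_USED_F_NO_NOTIFY);

	valid = vq->kicked_avail_valid;
	old = vq->kicked_avail;
	new_idx = vq->kicked_avail = vq->avail_idx;
	vq->kicked_avail_valid = true;
	if (!valid)
		return true;	/* no usable history: kick to be safe */
	return demo_need_event(vq->avail_event, new_idx, old);
}

static inline void demo_notify_example(void)
{
	struct demo_vq vq = { .event = true };
	long i;

	demo_add_buf(&vq);
	assert(demo_notify(&vq));	/* first kick always goes out */

	vq.avail_event = 3;		/* host: wake me once idx passes 3 */
	demo_add_buf(&vq);
	demo_add_buf(&vq);		/* avail_idx == 3 */
	assert(!demo_notify(&vq));	/* exit to the host avoided */
	demo_add_buf(&vq);		/* avail_idx == 4 */
	assert(demo_notify(&vq));

	/* 2^16 buffers without a kick wrap back onto kicked_avail. */
	for (i = 0; i < 65536; i++)
		demo_add_buf(&vq);
	assert(demo_notify(&vq));
}
```

The last case is why the validity flag exists: after a full wrap, old and new are equal, so the index comparison alone would report that nothing happened and the kick would be lost.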


^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 09/18] virtio: use avail_event index
@ 2011-05-04 20:51   ` Michael S. Tsirkin
  0 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Krishna Kumar, Carsten Otte, lguest, Shirley Ma, kvm, linux-s390,
	netdev, habanero, Heiko Carstens, linux-kernel, virtualization,
	steved, Christian Borntraeger, Tom Lendacky, Martin Schwidefsky,
	linux390

Use the new avail_event feature to reduce the number
of exits from the guest.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/virtio/virtio_ring.c |   39 ++++++++++++++++++++++++++++++++++++++-
 1 files changed, 38 insertions(+), 1 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 3a3ed75..262dfe6 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -82,6 +82,15 @@ struct vring_virtqueue
 	/* Host supports indirect buffers */
 	bool indirect;
 
+	/* Host publishes avail event idx */
+	bool event;
+
+	/* Is kicked_avail below valid? */
+	bool kicked_avail_valid;
+
+	/* avail idx value we already kicked. */
+	u16 kicked_avail;
+
 	/* Number of free buffers */
 	unsigned int num_free;
 	/* Head of free buffer list. */
@@ -228,6 +237,12 @@ add_head:
 	 * new available array entries. */
 	virtio_wmb();
 	vq->vring.avail->idx++;
+	/* If the driver never bothers to kick in a very long while,
+	 * avail index might wrap around. If that happens, invalidate
+	 * kicked_avail index we stored. TODO: make sure all drivers
+	 * kick at least once in 2^16 and remove this. */
+	if (unlikely(vq->vring.avail->idx == vq->kicked_avail))
+		vq->kicked_avail_valid = true;
 
 	pr_debug("Added buffer head %i to %p\n", head, vq);
 	END_USE(vq);
@@ -236,6 +251,23 @@ add_head:
 }
 EXPORT_SYMBOL_GPL(virtqueue_add_buf_gfp);
 
+
+static bool vring_notify(struct vring_virtqueue *vq)
+{
+	u16 old, new;
+	bool v;
+	if (!vq->event)
+		return !(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY);
+
+	v = vq->kicked_avail_valid;
+	old = vq->kicked_avail;
+	new = vq->kicked_avail = vq->vring.avail->idx;
+	vq->kicked_avail_valid = true;
+	if (unlikely(!v))
+		return true;
+	return vring_need_event(vring_avail_event(&vq->vring), new, old);
+}
+
 void virtqueue_kick(struct virtqueue *_vq)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
@@ -244,7 +276,7 @@ void virtqueue_kick(struct virtqueue *_vq)
 	/* Need to update avail index before checking if we should notify */
 	virtio_mb();
 
-	if (!(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY))
+	if (vring_notify(vq))
 		/* Prod other side to tell it about changes. */
 		vq->notify(&vq->vq);
 
@@ -437,6 +469,8 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
 	vq->vq.name = name;
 	vq->notify = notify;
 	vq->broken = false;
+	vq->kicked_avail_valid = false;
+	vq->kicked_avail = 0;
 	vq->last_used_idx = 0;
 	list_add_tail(&vq->vq.list, &vdev->vqs);
 #ifdef DEBUG
@@ -444,6 +478,7 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
 #endif
 
 	vq->indirect = virtio_has_feature(vdev, VIRTIO_RING_F_INDIRECT_DESC);
+	vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_AVAIL_EVENT_IDX);
 
 	/* No callback?  Tell other side not to bother us. */
 	if (!callback)
@@ -482,6 +517,8 @@ void vring_transport_features(struct virtio_device *vdev)
 			break;
 		case VIRTIO_RING_F_USED_EVENT_IDX:
 			break;
+		case VIRTIO_RING_F_AVAIL_EVENT_IDX:
+			break;
 		default:
 			/* We don't understand this bit. */
 			clear_bit(i, vdev->features);
-- 
1.7.5.53.gc233e

^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 09/18] virtio: use avail_event index
@ 2011-05-04 20:51   ` Michael S. Tsirkin
  0 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:51 UTC (permalink / raw)
  Cc: Krishna Kumar, Carsten Otte, lguest, Shirley Ma, kvm, linux-s390,
	netdev, habanero, Heiko Carstens, linux-kernel, virtualization,
	steved, Christian Borntraeger, Tom Lendacky, Martin Schwidefsky,
	linux390

Use the new avail_event feature to reduce the number
of exits from the guest.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/virtio/virtio_ring.c |   39 ++++++++++++++++++++++++++++++++++++++-
 1 files changed, 38 insertions(+), 1 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 3a3ed75..262dfe6 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -82,6 +82,15 @@ struct vring_virtqueue
 	/* Host supports indirect buffers */
 	bool indirect;
 
+	/* Host publishes avail event idx */
+	bool event;
+
+	/* Is kicked_avail below valid? */
+	bool kicked_avail_valid;
+
+	/* avail idx value we already kicked. */
+	u16 kicked_avail;
+
 	/* Number of free buffers */
 	unsigned int num_free;
 	/* Head of free buffer list. */
@@ -228,6 +237,12 @@ add_head:
 	 * new available array entries. */
 	virtio_wmb();
 	vq->vring.avail->idx++;
+	/* If the driver never bothers to kick in a very long while,
+	 * avail index might wrap around. If that happens, invalidate
+	 * kicked_avail index we stored. TODO: make sure all drivers
+	 * kick at least once in 2^16 and remove this. */
+	if (unlikely(vq->vring.avail->idx == vq->kicked_avail))
+		vq->kicked_avail_valid = true;
 
 	pr_debug("Added buffer head %i to %p\n", head, vq);
 	END_USE(vq);
@@ -236,6 +251,23 @@ add_head:
 }
 EXPORT_SYMBOL_GPL(virtqueue_add_buf_gfp);
 
+
+static bool vring_notify(struct vring_virtqueue *vq)
+{
+	u16 old, new;
+	bool v;
+	if (!vq->event)
+		return !(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY);
+
+	v = vq->kicked_avail_valid;
+	old = vq->kicked_avail;
+	new = vq->kicked_avail = vq->vring.avail->idx;
+	vq->kicked_avail_valid = true;
+	if (unlikely(!v))
+		return true;
+	return vring_need_event(vring_avail_event(&vq->vring), new, old);
+}
+
 void virtqueue_kick(struct virtqueue *_vq)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
@@ -244,7 +276,7 @@ void virtqueue_kick(struct virtqueue *_vq)
 	/* Need to update avail index before checking if we should notify */
 	virtio_mb();
 
-	if (!(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY))
+	if (vring_notify(vq))
 		/* Prod other side to tell it about changes. */
 		vq->notify(&vq->vq);
 
@@ -437,6 +469,8 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
 	vq->vq.name = name;
 	vq->notify = notify;
 	vq->broken = false;
+	vq->kicked_avail_valid = false;
+	vq->kicked_avail = 0;
 	vq->last_used_idx = 0;
 	list_add_tail(&vq->vq.list, &vdev->vqs);
 #ifdef DEBUG
@@ -444,6 +478,7 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
 #endif
 
 	vq->indirect = virtio_has_feature(vdev, VIRTIO_RING_F_INDIRECT_DESC);
+	vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_AVAIL_EVENT_IDX);
 
 	/* No callback?  Tell other side not to bother us. */
 	if (!callback)
@@ -482,6 +517,8 @@ void vring_transport_features(struct virtio_device *vdev)
 			break;
 		case VIRTIO_RING_F_USED_EVENT_IDX:
 			break;
+		case VIRTIO_RING_F_AVAIL_EVENT_IDX:
+			break;
 		default:
 			/* We don't understand this bit. */
 			clear_bit(i, vdev->features);
-- 
1.7.5.53.gc233e
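
Note (not part of the patch): a minimal userspace sketch of the decision
that the event-index branch of vring_notify() makes. It assumes that
vring_need_event(), which an earlier patch in the series defines and which is
not shown here, behaves like the helper in mainline
include/linux/virtio_ring.h. The names need_event, kick_state, should_kick
and kick_demo are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed semantics of vring_need_event(): true when event_idx lies in
 * the half-open window [old_idx, new_idx) of free-running 16-bit indexes.
 * The casts make the comparison work across wrap-around. */
static inline bool need_event(uint16_t event_idx, uint16_t new_idx,
			      uint16_t old_idx)
{
	return (uint16_t)(new_idx - event_idx - 1) <
	       (uint16_t)(new_idx - old_idx);
}

/* Hypothetical stand-in for the two fields the patch adds. */
struct kick_state {
	uint16_t kicked_avail;       /* avail index at the last kick decision */
	bool     kicked_avail_valid; /* false at init and after a wrap */
};

/* Record the current avail index, then notify if nothing valid was
 * recorded before or if the other side's event index falls among the
 * entries added since the previous decision. */
static inline bool should_kick(struct kick_state *s, uint16_t avail_idx,
			       uint16_t avail_event)
{
	bool was_valid = s->kicked_avail_valid;
	uint16_t old_idx = s->kicked_avail;

	s->kicked_avail = avail_idx;
	s->kicked_avail_valid = true;
	if (!was_valid)
		return true;
	return need_event(avail_event, avail_idx, old_idx);
}

void kick_demo(void)
{
	struct kick_state s = { .kicked_avail = 0, .kicked_avail_valid = false };

	assert(should_kick(&s, 1, 0));       /* nothing recorded yet: kick */
	assert(!should_kick(&s, 3, 5));      /* event index 5 not reached */
	assert(should_kick(&s, 6, 5));       /* window [3, 6) contains 5 */
	assert(need_event(65535, 1, 65534)); /* window wraps past 65535 */
}
```

The valid flag matters because the window test only sees 16 bits: once the
stored index is stale by a full 2^16 entries, the window looks empty, so the
sketch (like the patch) falls back to notifying unconditionally.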

^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 10/18] vhost: utilize used_event index
  2011-05-04 20:50 ` Michael S. Tsirkin
  (?)
@ 2011-05-04 20:51   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Support the new used_event index. When the guest acks the feature,
use it to reduce the number of interrupts sent to the guest.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/vhost/vhost.c |   74 +++++++++++++++++++++++++++++++++++++------------
 drivers/vhost/vhost.h |    7 ++++
 2 files changed, 63 insertions(+), 18 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 2ab2912..e33d5a3 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -37,6 +37,8 @@ enum {
 	VHOST_MEMORY_F_LOG = 0x1,
 };
 
+#define vhost_used_event(vq) ((u16 __user *)&vq->avail->ring[vq->num])
+
 static void vhost_poll_func(struct file *file, wait_queue_head_t *wqh,
 			    poll_table *pt)
 {
@@ -161,6 +163,8 @@ static void vhost_vq_reset(struct vhost_dev *dev,
 	vq->last_avail_idx = 0;
 	vq->avail_idx = 0;
 	vq->last_used_idx = 0;
+	vq->signalled_used = 0;
+	vq->signalled_used_valid = false;
 	vq->used_flags = 0;
 	vq->log_used = false;
 	vq->log_addr = -1ull;
@@ -489,14 +493,15 @@ static int memory_access_ok(struct vhost_dev *d, struct vhost_memory *mem,
 	return 1;
 }
 
-static int vq_access_ok(unsigned int num,
+static int vq_access_ok(struct vhost_dev *d, unsigned int num,
 			struct vring_desc __user *desc,
 			struct vring_avail __user *avail,
 			struct vring_used __user *used)
 {
+	size_t sa = vhost_has_feature(d, VIRTIO_RING_F_USED_EVENT_IDX) ? 2 : 0;
 	return access_ok(VERIFY_READ, desc, num * sizeof *desc) &&
 	       access_ok(VERIFY_READ, avail,
-			 sizeof *avail + num * sizeof *avail->ring) &&
+			 sizeof *avail + num * sizeof *avail->ring + sa) &&
 	       access_ok(VERIFY_WRITE, used,
 			sizeof *used + num * sizeof *used->ring);
 }
@@ -531,7 +536,7 @@ static int vq_log_access_ok(struct vhost_virtqueue *vq, void __user *log_base)
 /* Caller should have vq mutex and device mutex */
 int vhost_vq_access_ok(struct vhost_virtqueue *vq)
 {
-	return vq_access_ok(vq->num, vq->desc, vq->avail, vq->used) &&
+	return vq_access_ok(vq->dev, vq->num, vq->desc, vq->avail, vq->used) &&
 		vq_log_access_ok(vq, vq->log_base);
 }
 
@@ -577,6 +582,7 @@ static int init_used(struct vhost_virtqueue *vq,
 
 	if (r)
 		return r;
+	vq->signalled_used_valid = false;
 	return get_user(vq->last_used_idx, &used->idx);
 }
 
@@ -674,7 +680,7 @@ static long vhost_set_vring(struct vhost_dev *d, int ioctl, void __user *argp)
 		 * If it is not, we don't as size might not have been setup.
 		 * We will verify when backend is configured. */
 		if (vq->private_data) {
-			if (!vq_access_ok(vq->num,
+			if (!vq_access_ok(d, vq->num,
 				(void __user *)(unsigned long)a.desc_user_addr,
 				(void __user *)(unsigned long)a.avail_user_addr,
 				(void __user *)(unsigned long)a.used_user_addr)) {
@@ -1267,6 +1273,12 @@ int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len)
 			eventfd_signal(vq->log_ctx, 1);
 	}
 	vq->last_used_idx++;
+	/* If the driver never bothers to signal in a very long while,
+	 * used index might wrap around. If that happens, invalidate
+	 * signalled_used index we stored. TODO: make sure driver
+	 * signals at least once in 2^16 and remove this. */
+	if (unlikely(vq->last_used_idx == vq->signalled_used))
+		vq->signalled_used_valid = false;
 	return 0;
 }
 
@@ -1275,6 +1287,7 @@ static int __vhost_add_used_n(struct vhost_virtqueue *vq,
 			    unsigned count)
 {
 	struct vring_used_elem __user *used;
+	u16 old, new;
 	int start;
 
 	start = vq->last_used_idx % vq->num;
@@ -1292,7 +1305,14 @@ static int __vhost_add_used_n(struct vhost_virtqueue *vq,
 			   ((void __user *)used - (void __user *)vq->used),
 			  count * sizeof *used);
 	}
-	vq->last_used_idx += count;
+	old = vq->last_used_idx;
+	new = (vq->last_used_idx += count);
+	/* If the driver never bothers to signal in a very long while,
+	 * used index might wrap around. If that happens, invalidate
+	 * signalled_used index we stored. TODO: make sure driver
+	 * signals at least once in 2^16 and remove this. */
+	if (unlikely((u16)(new - vq->signalled_used) < (u16)(new - old)))
+		vq->signalled_used_valid = false;
 	return 0;
 }
 
@@ -1331,29 +1351,47 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads,
 	return r;
 }
 
-/* This actually signals the guest, using eventfd. */
-void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq)
+static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 {
-	__u16 flags;
-
+	__u16 old, new, event;
+	bool v;
 	/* Flush out used index updates. This is paired
 	 * with the barrier that the Guest executes when enabling
 	 * interrupts. */
 	smp_mb();
 
-	if (__get_user(flags, &vq->avail->flags)) {
-		vq_err(vq, "Failed to get flags");
-		return;
+	if (vhost_has_feature(dev, VIRTIO_F_NOTIFY_ON_EMPTY) &&
+	    unlikely(vq->avail_idx == vq->last_avail_idx))
+		return true;
+
+	if (!vhost_has_feature(dev, VIRTIO_RING_F_USED_EVENT_IDX)) {
+		__u16 flags;
+		if (__get_user(flags, &vq->avail->flags)) {
+			vq_err(vq, "Failed to get flags");
+			return true;
+		}
+		return !(flags & VRING_AVAIL_F_NO_INTERRUPT);
 	}
+	old = vq->signalled_used;
+	v = vq->signalled_used_valid;
+	new = vq->signalled_used = vq->last_used_idx;
+	vq->signalled_used_valid = true;
 
-	/* If they don't want an interrupt, don't signal, unless empty. */
-	if ((flags & VRING_AVAIL_F_NO_INTERRUPT) &&
-	    (vq->avail_idx != vq->last_avail_idx ||
-	     !vhost_has_feature(dev, VIRTIO_F_NOTIFY_ON_EMPTY)))
-		return;
+	if (unlikely(!v))
+		return true;
+
+	if (get_user(event, vhost_used_event(vq))) {
+		vq_err(vq, "Failed to get used event idx");
+		return true;
+	}
+	return vring_need_event(event, new, old);
+}
 
+/* This actually signals the guest, using eventfd. */
+void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq)
+{
 	/* Signal the Guest tell them we used something up. */
-	if (vq->call_ctx)
+	if (vq->call_ctx && vhost_notify(dev, vq))
 		eventfd_signal(vq->call_ctx, 1);
 }
 
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 0f1bf33..5825ac6 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -84,6 +84,12 @@ struct vhost_virtqueue {
 	/* Used flags */
 	u16 used_flags;
 
+	/* Last used index value we have signalled on */
+	u16 signalled_used;
+
+	/* Whether signalled_used holds a valid index */
+	bool signalled_used_valid;
+
 	/* Log writes to used structure. */
 	bool log_used;
 	u64 log_addr;
@@ -164,6 +170,7 @@ int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
 enum {
 	VHOST_FEATURES = (1 << VIRTIO_F_NOTIFY_ON_EMPTY) |
 			 (1 << VIRTIO_RING_F_INDIRECT_DESC) |
+			 (1 << VIRTIO_RING_F_USED_EVENT_IDX) |
 			 (1 << VHOST_F_LOG_ALL) |
 			 (1 << VHOST_NET_F_VIRTIO_NET_HDR) |
 			 (1 << VIRTIO_NET_F_MRG_RXBUF),
-- 
1.7.5.53.gc233e
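
Note (not part of the patch): a small userspace sketch of the wrap check
added to __vhost_add_used_n(). The expression is copied from the patch; the
function names crossed_signalled and wrap_demo are hypothetical. It is meant
only to show which index ranges trip the check, under the assumption that all
three values are free-running 16-bit counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when advancing the used index from old_idx to new_idx moved it
 * onto or past `signalled`, that is, when signalled lies in the
 * half-open window (old_idx, new_idx]. Unsigned 16-bit subtraction keeps
 * the comparison correct across wrap-around. */
static inline bool crossed_signalled(uint16_t signalled, uint16_t old_idx,
				     uint16_t new_idx)
{
	return (uint16_t)(new_idx - signalled) < (uint16_t)(new_idx - old_idx);
}

void wrap_demo(void)
{
	/* Batch of 4 entries: 10 -> 14. */
	assert(crossed_signalled(12, 10, 14));  /* strictly inside */
	assert(crossed_signalled(14, 10, 14));  /* landed exactly on it */
	assert(!crossed_signalled(10, 10, 14)); /* start point is excluded */
	assert(!crossed_signalled(15, 10, 14)); /* still ahead of us */

	/* Batch of 4 entries that wraps: 65534 -> 2. */
	assert(crossed_signalled(0, 65534, 2));
	assert(!crossed_signalled(65534, 65534, 2));

	/* Empty batch never crosses anything. */
	assert(!crossed_signalled(7, 7, 7));
}
```

With a batch of one entry the window shrinks to the single value new_idx,
which is the equality test vhost_add_used() uses after its increment.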


^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 11/18] vhost: support avail_event idx
  2011-05-04 20:50 ` Michael S. Tsirkin
  (?)
@ 2011-05-04 20:52   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Add support for the new avail_event feature in vhost_net
and vhost test modules.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/vhost/net.c   |   12 ++++----
 drivers/vhost/test.c  |    6 ++--
 drivers/vhost/vhost.c |   65 +++++++++++++++++++++++++++++++++++++------------
 drivers/vhost/vhost.h |   17 +++++++------
 4 files changed, 67 insertions(+), 33 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 2f7c76a..e224a92 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -144,7 +144,7 @@ static void handle_tx(struct vhost_net *net)
 	}
 
 	mutex_lock(&vq->mutex);
-	vhost_disable_notify(vq);
+	vhost_disable_notify(&net->dev, vq);
 
 	if (wmem < sock->sk->sk_sndbuf / 2)
 		tx_poll_stop(net);
@@ -166,8 +166,8 @@ static void handle_tx(struct vhost_net *net)
 				set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
 				break;
 			}
-			if (unlikely(vhost_enable_notify(vq))) {
-				vhost_disable_notify(vq);
+			if (unlikely(vhost_enable_notify(&net->dev, vq))) {
+				vhost_disable_notify(&net->dev, vq);
 				continue;
 			}
 			break;
@@ -315,7 +315,7 @@ static void handle_rx(struct vhost_net *net)
 		return;
 
 	mutex_lock(&vq->mutex);
-	vhost_disable_notify(vq);
+	vhost_disable_notify(&net->dev, vq);
 	vhost_hlen = vq->vhost_hlen;
 	sock_hlen = vq->sock_hlen;
 
@@ -334,10 +334,10 @@ static void handle_rx(struct vhost_net *net)
 			break;
 		/* OK, now we need to know about added descriptors. */
 		if (!headcount) {
-			if (unlikely(vhost_enable_notify(vq))) {
+			if (unlikely(vhost_enable_notify(&net->dev, vq))) {
 				/* They have slipped one in as we were
 				 * doing that: check again. */
-				vhost_disable_notify(vq);
+				vhost_disable_notify(&net->dev, vq);
 				continue;
 			}
 			/* Nothing new?  Wait for eventfd to tell us
diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 099f302..734e1d7 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -49,7 +49,7 @@ static void handle_vq(struct vhost_test *n)
 		return;
 
 	mutex_lock(&vq->mutex);
-	vhost_disable_notify(vq);
+	vhost_disable_notify(&n->dev, vq);
 
 	for (;;) {
 		head = vhost_get_vq_desc(&n->dev, vq, vq->iov,
@@ -61,8 +61,8 @@ static void handle_vq(struct vhost_test *n)
 			break;
 		/* Nothing new?  Wait for eventfd to tell us they refilled. */
 		if (head == vq->num) {
-			if (unlikely(vhost_enable_notify(vq))) {
-				vhost_disable_notify(vq);
+			if (unlikely(vhost_enable_notify(&n->dev, vq))) {
+				vhost_disable_notify(&n->dev, vq);
 				continue;
 			}
 			break;
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index e33d5a3..2aea4cb 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -38,6 +38,7 @@ enum {
 };
 
 #define vhost_used_event(vq) ((u16 __user *)&vq->avail->ring[vq->num])
+#define vhost_avail_event(vq) ((u16 __user *)&vq->used->ring[vq->num])
 
 static void vhost_poll_func(struct file *file, wait_queue_head_t *wqh,
 			    poll_table *pt)
@@ -499,11 +500,12 @@ static int vq_access_ok(struct vhost_dev *d, unsigned int num,
 			struct vring_used __user *used)
 {
 	size_t sa = vhost_has_feature(d, VIRTIO_RING_F_USED_EVENT_IDX) ? 2 : 0;
+	size_t su = vhost_has_feature(d, VIRTIO_RING_F_AVAIL_EVENT_IDX) ? 2 : 0;
 	return access_ok(VERIFY_READ, desc, num * sizeof *desc) &&
 	       access_ok(VERIFY_READ, avail,
 			 sizeof *avail + num * sizeof *avail->ring + sa) &&
 	       access_ok(VERIFY_WRITE, used,
-			sizeof *used + num * sizeof *used->ring);
+			sizeof *used + num * sizeof *used->ring + su);
 }
 
 /* Can we log writes? */
@@ -519,9 +521,11 @@ int vhost_log_access_ok(struct vhost_dev *dev)
 
 /* Verify access for write logging. */
 /* Caller should have vq mutex and device mutex */
-static int vq_log_access_ok(struct vhost_virtqueue *vq, void __user *log_base)
+static int vq_log_access_ok(struct vhost_dev *d, struct vhost_virtqueue *vq,
+			    void __user *log_base)
 {
 	struct vhost_memory *mp;
+	size_t s = vhost_has_feature(d, VIRTIO_RING_F_AVAIL_EVENT_IDX) ? 2 : 0;
 
 	mp = rcu_dereference_protected(vq->dev->memory,
 				       lockdep_is_held(&vq->mutex));
@@ -529,7 +533,7 @@ static int vq_log_access_ok(struct vhost_virtqueue *vq, void __user *log_base)
 			    vhost_has_feature(vq->dev, VHOST_F_LOG_ALL)) &&
 		(!vq->log_used || log_access_ok(log_base, vq->log_addr,
 					sizeof *vq->used +
-					vq->num * sizeof *vq->used->ring));
+					vq->num * sizeof *vq->used->ring + s));
 }
 
 /* Can we start vq? */
@@ -537,7 +541,7 @@ static int vq_log_access_ok(struct vhost_virtqueue *vq, void __user *log_base)
 int vhost_vq_access_ok(struct vhost_virtqueue *vq)
 {
 	return vq_access_ok(vq->dev, vq->num, vq->desc, vq->avail, vq->used) &&
-		vq_log_access_ok(vq, vq->log_base);
+		vq_log_access_ok(vq->dev, vq, vq->log_base);
 }
 
 static long vhost_set_memory(struct vhost_dev *d, struct vhost_memory __user *m)
@@ -824,7 +828,7 @@ long vhost_dev_ioctl(struct vhost_dev *d, unsigned int ioctl, unsigned long arg)
 			vq = d->vqs + i;
 			mutex_lock(&vq->mutex);
 			/* If ring is inactive, will check when it's enabled. */
-			if (vq->private_data && !vq_log_access_ok(vq, base))
+			if (vq->private_data && !vq_log_access_ok(d, vq, base))
 				r = -EFAULT;
 			else
 				vq->log_base = base;
@@ -1225,6 +1229,10 @@ int vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
 
 	/* On success, increment avail index. */
 	vq->last_avail_idx++;
+
+	/* Assume notifications from guest are disabled at this point,
+	 * if they aren't we would need to update avail_event index. */
+	BUG_ON(!(vq->used_flags & VRING_USED_F_NO_NOTIFY));
 	return head;
 }
 
@@ -1414,7 +1422,7 @@ void vhost_add_used_and_signal_n(struct vhost_dev *dev,
 }
 
 /* OK, now we need to know about added descriptors. */
-bool vhost_enable_notify(struct vhost_virtqueue *vq)
+bool vhost_enable_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 {
 	u16 avail_idx;
 	int r;
@@ -1422,11 +1430,34 @@ bool vhost_enable_notify(struct vhost_virtqueue *vq)
 	if (!(vq->used_flags & VRING_USED_F_NO_NOTIFY))
 		return false;
 	vq->used_flags &= ~VRING_USED_F_NO_NOTIFY;
-	r = put_user(vq->used_flags, &vq->used->flags);
-	if (r) {
-		vq_err(vq, "Failed to enable notification at %p: %d\n",
-		       &vq->used->flags, r);
-		return false;
+	if (!vhost_has_feature(dev, VIRTIO_RING_F_AVAIL_EVENT_IDX)) {
+		r = put_user(vq->used_flags, &vq->used->flags);
+		if (r) {
+			vq_err(vq, "Failed to enable notification at %p: %d\n",
+			       &vq->used->flags, r);
+			return false;
+		}
+	} else {
+		r = put_user(vq->last_avail_idx, vhost_avail_event(vq));
+		if (r) {
+			vq_err(vq, "Failed to update avail event index at %p: %d\n",
+			       vhost_avail_event(vq), r);
+			return false;
+		}
+	}
+	if (unlikely(vq->log_used)) {
+		void __user *used;
+		/* Make sure data is seen before log. */
+		smp_wmb();
+		used = vhost_has_feature(dev, VIRTIO_RING_F_AVAIL_EVENT_IDX) ?
+			vhost_avail_event(vq) : &vq->used->flags;
+		/* Log used flags or event index entry write. Both are 16 bit
+		 * fields. */
+		log_write(vq->log_base, vq->log_addr +
+			   (used - (void __user *)vq->used),
+			  sizeof(u16));
+		if (vq->log_ctx)
+			eventfd_signal(vq->log_ctx, 1);
 	}
 	/* They could have slipped one in as we were doing that: make
 	 * sure it's written, then check again. */
@@ -1442,15 +1473,17 @@ bool vhost_enable_notify(struct vhost_virtqueue *vq)
 }
 
 /* We don't need to be notified again. */
-void vhost_disable_notify(struct vhost_virtqueue *vq)
+void vhost_disable_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 {
 	int r;
 
 	if (vq->used_flags & VRING_USED_F_NO_NOTIFY)
 		return;
 	vq->used_flags |= VRING_USED_F_NO_NOTIFY;
-	r = put_user(vq->used_flags, &vq->used->flags);
-	if (r)
-		vq_err(vq, "Failed to enable notification at %p: %d\n",
-		       &vq->used->flags, r);
+	if (!vhost_has_feature(dev, VIRTIO_RING_F_AVAIL_EVENT_IDX)) {
+		r = put_user(vq->used_flags, &vq->used->flags);
+		if (r)
+			vq_err(vq, "Failed to disable notification at %p: %d\n",
+			       &vq->used->flags, r);
+	}
 }
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 5825ac6..edf84be 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -155,8 +155,8 @@ void vhost_add_used_and_signal(struct vhost_dev *, struct vhost_virtqueue *,
 void vhost_add_used_and_signal_n(struct vhost_dev *, struct vhost_virtqueue *,
 			       struct vring_used_elem *heads, unsigned count);
 void vhost_signal(struct vhost_dev *, struct vhost_virtqueue *);
-void vhost_disable_notify(struct vhost_virtqueue *);
-bool vhost_enable_notify(struct vhost_virtqueue *);
+void vhost_disable_notify(struct vhost_dev *, struct vhost_virtqueue *);
+bool vhost_enable_notify(struct vhost_dev *, struct vhost_virtqueue *);
 
 int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
 		    unsigned int log_num, u64 len);
@@ -168,12 +168,13 @@ int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
 	} while (0)
 
 enum {
-	VHOST_FEATURES = (1 << VIRTIO_F_NOTIFY_ON_EMPTY) |
-			 (1 << VIRTIO_RING_F_INDIRECT_DESC) |
-			 (1 << VIRTIO_RING_F_USED_EVENT_IDX) |
-			 (1 << VHOST_F_LOG_ALL) |
-			 (1 << VHOST_NET_F_VIRTIO_NET_HDR) |
-			 (1 << VIRTIO_NET_F_MRG_RXBUF),
+	VHOST_FEATURES = (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
+			 (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
+			 (1ULL << VIRTIO_RING_F_USED_EVENT_IDX) |
+			 (1ULL << VIRTIO_RING_F_AVAIL_EVENT_IDX) |
+			 (1ULL << VHOST_F_LOG_ALL) |
+			 (1ULL << VHOST_NET_F_VIRTIO_NET_HDR) |
+			 (1ULL << VIRTIO_NET_F_MRG_RXBUF),
 };
 
 static inline bool vhost_has_feature(struct vhost_dev *dev, int bit)
-- 
1.7.5.53.gc233e
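
For readers following the vhost_enable_notify()/vhost_disable_notify() hunks
above, here is a small user-space model of the hand-off. It is only a sketch
under stated assumptions: struct host_vq, the shared_* fields and need_event()
are hypothetical names made up for illustration, plain stores stand in for
put_user(), and the wrap-safe comparison in need_event() is one common way to
express "notify when the index reaches the published value", not necessarily
the exact check used elsewhere in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define USED_F_NO_NOTIFY 1u	/* same value as VRING_USED_F_NO_NOTIFY */

/* Hypothetical stand-in for the host-side virtqueue state. */
struct host_vq {
	bool     event_idx;		/* avail_event feature negotiated? */
	uint16_t used_flags;		/* host-private shadow of the flags */
	uint16_t last_avail_idx;	/* next avail entry the host will read */
	uint16_t shared_flags;		/* models used->flags in guest memory */
	uint16_t shared_avail_event;	/* models used->ring[num] in guest memory */
};

static void host_disable_notify(struct host_vq *vq)
{
	if (vq->used_flags & USED_F_NO_NOTIFY)
		return;
	vq->used_flags |= USED_F_NO_NOTIFY;
	/* With the event index in use nothing is written to shared memory:
	 * the previously published index simply stops matching. */
	if (!vq->event_idx)
		vq->shared_flags = vq->used_flags;
}

static void host_enable_notify(struct host_vq *vq)
{
	if (!(vq->used_flags & USED_F_NO_NOTIFY))
		return;
	vq->used_flags = (uint16_t)(vq->used_flags & ~USED_F_NO_NOTIFY);
	if (!vq->event_idx)
		vq->shared_flags = vq->used_flags;
	else
		vq->shared_avail_event = vq->last_avail_idx;
}

/* Assumed check: true if moving the index from old_idx to new_idx
 * stepped over event_idx, with 16-bit wraparound handled. */
static bool need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}

/* Guest side: should adding buffers old_idx..new_idx kick the host? */
static bool guest_should_kick(const struct host_vq *vq,
			      uint16_t old_idx, uint16_t new_idx)
{
	if (vq->event_idx)
		return need_event(vq->shared_avail_event, new_idx, old_idx);
	return !(vq->shared_flags & USED_F_NO_NOTIFY);
}
```

In this model a host that enabled notification at last_avail_idx == 5 gets one
kick when the guest moves its index from 5 to 6, and none for 6 to 7 until it
publishes a new event index.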


^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 12/18] virtio_test: support used_event index
  2011-05-04 20:50 ` Michael S. Tsirkin
  (?)
@ 2011-05-04 20:52   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Add the ability to test the new used_event feature;
it is enabled by default.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 tools/virtio/virtio_test.c |   18 ++++++++++++++++--
 1 files changed, 16 insertions(+), 2 deletions(-)
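
A minimal sketch of the feature-mask handling this patch relies on: start with
the bits set, then clear one when a --no-... option is seen. The bit numbers
below are assumptions for illustration only (28 is the mainline value of
VIRTIO_RING_F_INDIRECT_DESC; the other two are hypothetical), and
feature_bit(), clear_feature() and has_feature() are helper names invented for
this sketch. The point of 1ULL is that the shift stays well defined for bit
numbers of 32 and above, which matters once the series adds more feature bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum {
	F_INDIRECT_DESC  = 28,	/* mainline VIRTIO_RING_F_INDIRECT_DESC */
	F_USED_EVENT_IDX = 29,	/* hypothetical value, for illustration */
	F_HIGH_EXAMPLE   = 32,	/* hypothetical bit beyond the low 32 */
};

/* 64-bit mask for one feature bit; 1ULL keeps bits >= 32 representable. */
static uint64_t feature_bit(unsigned int bit)
{
	assert(bit < 64);
	return 1ULL << bit;
}

/* Everything on by default, as in the test tool. */
static uint64_t default_features(void)
{
	return feature_bit(F_INDIRECT_DESC) | feature_bit(F_USED_EVENT_IDX);
}

/* What a --no-<feature> option does to the mask. */
static uint64_t clear_feature(uint64_t features, unsigned int bit)
{
	return features & ~feature_bit(bit);
}

static bool has_feature(uint64_t features, unsigned int bit)
{
	return (features & feature_bit(bit)) != 0;
}
```

Clearing one bit leaves the others untouched, so --no-indirect and
--no-used-event-idx can be combined in any order.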

diff --git a/tools/virtio/virtio_test.c b/tools/virtio/virtio_test.c
index 9e65e6d..157ec68 100644
--- a/tools/virtio/virtio_test.c
+++ b/tools/virtio/virtio_test.c
@@ -210,18 +210,29 @@ const struct option longopts[] = {
 		.val = 'i',
 	},
 	{
+		.name = "used-event-idx",
+		.val = 'U',
+	},
+	{
+		.name = "no-used-event-idx",
+		.val = 'u',
+	},
+	{
 	}
 };
 
 static void help()
 {
-	fprintf(stderr, "Usage: virtio_test [--help] [--no-indirect]\n");
+	fprintf(stderr, "Usage: virtio_test [--help]"
+		" [--no-indirect] "
+		" [--no-used-event-idx]\n");
 }
 
 int main(int argc, char **argv)
 {
 	struct vdev_info dev;
-	unsigned long long features = 1ULL << VIRTIO_RING_F_INDIRECT_DESC;
+	unsigned long long features = (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
+		(1ULL << VIRTIO_RING_F_USED_EVENT_IDX);
 	int o;
 
 	for (;;) {
@@ -238,6 +249,9 @@ int main(int argc, char **argv)
 		case 'i':
 			features &= ~(1ULL << VIRTIO_RING_F_INDIRECT_DESC);
 			break;
+		case 'u':
+			features &= ~(1ULL << VIRTIO_RING_F_USED_EVENT_IDX);
+			break;
 		default:
 			assert(0);
 			break;
-- 
1.7.5.53.gc233e


^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 13/18] virtio_test: avail_event index support
  2011-05-04 20:50 ` Michael S. Tsirkin
@ 2011-05-04 20:52   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Add the ability to test the new avail_event feature;
it is enabled by default.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 tools/virtio/virtio_test.c |   17 +++++++++++++++--
 1 files changed, 15 insertions(+), 2 deletions(-)
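
The option pairs could be wired up roughly as below. This is a sketch, not the
tool's code: parse_features() and sketch_opts are hypothetical names, the bit
numbers are made up, and the 'A'/'U' cases are an addition of this sketch (the
hunks in this series define those values in longopts but only add switch cases
for the lowercase, disabling options). Resetting optind to 1 so the function
can be called more than once is assumed to be sufficient for the C library in
use.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>
#include <stdint.h>

enum {
	F_USED_EVENT_IDX  = 29,	/* hypothetical bit numbers */
	F_AVAIL_EVENT_IDX = 32,
};

static const struct option sketch_opts[] = {
	{ .name = "avail-event-idx",    .val = 'A' },
	{ .name = "no-avail-event-idx", .val = 'a' },
	{ .name = "used-event-idx",     .val = 'U' },
	{ .name = "no-used-event-idx",  .val = 'u' },
	{ .name = NULL }
};

/* Return the feature mask selected by the command line. */
static uint64_t parse_features(int argc, char **argv)
{
	/* Both event-index features start enabled. */
	uint64_t features = (1ULL << F_USED_EVENT_IDX) |
			    (1ULL << F_AVAIL_EVENT_IDX);
	int o;

	optind = 1;	/* restart scanning (assumption, see above) */
	while ((o = getopt_long(argc, argv, "", sketch_opts, NULL)) != -1) {
		switch (o) {
		case 'A':
			features |= 1ULL << F_AVAIL_EVENT_IDX;
			break;
		case 'a':
			features &= ~(1ULL << F_AVAIL_EVENT_IDX);
			break;
		case 'U':
			features |= 1ULL << F_USED_EVENT_IDX;
			break;
		case 'u':
			features &= ~(1ULL << F_USED_EVENT_IDX);
			break;
		default:
			/* '?': unknown option, leave the mask unchanged. */
			break;
		}
	}
	return features;
}
```

Options are applied left to right, so a later --avail-event-idx overrides an
earlier --no-avail-event-idx.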

diff --git a/tools/virtio/virtio_test.c b/tools/virtio/virtio_test.c
index 157ec68..8adf55d 100644
--- a/tools/virtio/virtio_test.c
+++ b/tools/virtio/virtio_test.c
@@ -202,6 +202,14 @@ const struct option longopts[] = {
 		.val = 'h',
 	},
 	{
+		.name = "avail-event-idx",
+		.val = 'A',
+	},
+	{
+		.name = "no-avail-event-idx",
+		.val = 'a',
+	},
+	{
 		.name = "indirect",
 		.val = 'I',
 	},
@@ -224,7 +232,8 @@ const struct option longopts[] = {
 static void help()
 {
 	fprintf(stderr, "Usage: virtio_test [--help]"
-		" [--no-indirect] "
+		" [--no-indirect]"
+		" [--no-avail-event-idx]"
 		" [--no-used-event-idx]\n");
 }
 
@@ -232,7 +241,8 @@ int main(int argc, char **argv)
 {
 	struct vdev_info dev;
 	unsigned long long features = (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
-		(1ULL << VIRTIO_RING_F_USED_EVENT_IDX);
+		(1ULL << VIRTIO_RING_F_USED_EVENT_IDX) |
+		(1ULL << VIRTIO_RING_F_AVAIL_EVENT_IDX);
 	int o;
 
 	for (;;) {
@@ -243,6 +253,9 @@ int main(int argc, char **argv)
 		case '?':
 			help();
 			exit(2);
+		case 'a':
+			features &= ~(1ULL << VIRTIO_RING_F_AVAIL_EVENT_IDX);
+			break;
 		case 'h':
 			help();
 			goto done;
-- 
1.7.5.53.gc233e


^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 14/18] virtio: add api for delayed callbacks
  2011-05-04 20:50 ` Michael S. Tsirkin
  (?)
@ 2011-05-04 20:52   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Add an API that tells the other side that callbacks
should be delayed until most of the outstanding buffers
(currently 3/4 of them) have been processed.
Implement using the new used_event feature.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
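
Not part of the patch: a minimal userspace sketch of the threshold
arithmetic in virtqueue_enable_cb_delayed(), assuming free-running
16-bit ring indexes. delayed_used_event() and the examples function
are made-up names for illustration only. The uint16_t casts are there
so that the subtraction stays correct once avail->idx has wrapped
past 65535.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the used index at which a callback is requested,
 * three quarters of the way through the outstanding buffers. */
uint16_t delayed_used_event(uint16_t avail_idx, uint16_t last_used_idx)
{
	/* Truncate to 16 bits before scaling so index wrap-around is harmless. */
	uint16_t outstanding = (uint16_t)(avail_idx - last_used_idx);
	uint16_t bufs = (uint16_t)(outstanding * 3u / 4u);

	return (uint16_t)(last_used_idx + bufs);
}

/* Example values, checked with assert(). */
void delayed_used_event_examples(void)
{
	/* 8 outstanding buffers: ask for a callback 6 entries ahead. */
	assert(delayed_used_event(108, 100) == 106);
	/* Same distance, but avail_idx has wrapped around. */
	assert(delayed_used_event(2, 65530) == 0);
	/* Nothing outstanding: the event index stays at last_used_idx. */
	assert(delayed_used_event(100, 100) == 100);
}
```

With 8 buffers outstanding the event index lands 6 entries past
last_used_idx, which is the "most of the work" hint this API is
meant to give.
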
 drivers/virtio/virtio_ring.c |   27 +++++++++++++++++++++++++++
 include/linux/virtio.h       |    9 +++++++++
 2 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 262dfe6..3a70d70 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -397,6 +397,33 @@ bool virtqueue_enable_cb(struct virtqueue *_vq)
 }
 EXPORT_SYMBOL_GPL(virtqueue_enable_cb);
 
+bool virtqueue_enable_cb_delayed(struct virtqueue *_vq)
+{
+	struct vring_virtqueue *vq = to_vvq(_vq);
+	int bufs;
+
+	START_USE(vq);
+
+	/* We optimistically turn back on interrupts, then check if there was
+	 * more to do. */
+	/* Depending on the VIRTIO_RING_F_USED_EVENT_IDX feature, we need to
+	 * either clear the flags bit or move the event index ahead of the
+	 * last used entry. Always do both to keep code simple. */
+	vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT;
+	/* TODO: tune this threshold */
+	bufs = (u16)(vq->vring.avail->idx - vq->last_used_idx) * 3 / 4;
+	vring_used_event(&vq->vring) = vq->last_used_idx + bufs;
+	virtio_mb();
+	if (unlikely((u16)(vq->vring.used->idx - vq->last_used_idx) > bufs)) {
+		END_USE(vq);
+		return false;
+	}
+
+	END_USE(vq);
+	return true;
+}
+EXPORT_SYMBOL_GPL(virtqueue_enable_cb_delayed);
+
 void *virtqueue_detach_unused_buf(struct virtqueue *_vq)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 718336b..5151fd1 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -51,6 +51,13 @@ struct virtqueue {
  *	This re-enables callbacks; it returns "false" if there are pending
  *	buffers in the queue, to detect a possible race between the driver
  *	checking for more work, and enabling callbacks.
+ * virtqueue_enable_cb_delayed: restart callbacks after disable_cb.
+ *	vq: the struct virtqueue we're talking about.
+ *	This re-enables callbacks but hints to the other side to delay
+ *	interrupts until most of the available buffers have been processed;
+ *	it returns "false" if there are many pending buffers in the queue,
+ *	to detect a possible race between the driver checking for more work,
+ *	and enabling callbacks.
  * virtqueue_detach_unused_buf: detach first unused buffer
  * 	vq: the struct virtqueue we're talking about.
  * 	Returns NULL or the "data" token handed to add_buf
@@ -86,6 +93,8 @@ void virtqueue_disable_cb(struct virtqueue *vq);
 
 bool virtqueue_enable_cb(struct virtqueue *vq);
 
+bool virtqueue_enable_cb_delayed(struct virtqueue *vq);
+
 void *virtqueue_detach_unused_buf(struct virtqueue *vq);
 
 /**
-- 
1.7.5.53.gc233e


^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 15/18] virtio_net: delay TX callbacks
  2011-05-04 20:50 ` Michael S. Tsirkin
@ 2011-05-04 20:52   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Ask for delayed callbacks on TX ring full, to give the
other side more of a chance to make progress.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
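
Not part of the patch: a sketch of why a delayed event index gives
the other side room to make progress. It assumes the comparison that
the virtio specification gives as vring_need_event(); this series may
express the check differently. need_event() below is a userspace
stand-in, and the index values are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: after moving its index from old_idx to new_idx,
 * should this side signal a peer that published event_idx? True when
 * event_idx lies in [old_idx, new_idx), modulo 2^16. */
bool need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}

void need_event_examples(void)
{
	/* Plain enable_cb: event index at last_used_idx (100), so the very
	 * next used buffer raises a callback. */
	assert(need_event(100, 101, 100));
	/* Delayed: event index 6 entries ahead (106). Using 3 buffers
	 * (100 -> 103) stays silent ... */
	assert(!need_event(106, 103, 100));
	/* ... and the callback fires once entry 106 has been used. */
	assert(need_event(106, 107, 103));
}
```

On a full TX ring this means one callback after a batch of buffers
has been consumed, rather than one for the first buffer freed.
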
 drivers/net/virtio_net.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 0cb0b06..f685324 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -609,7 +609,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	 * before it gets out of hand.  Naturally, this wastes entries. */
 	if (capacity < 2+MAX_SKB_FRAGS) {
 		netif_stop_queue(dev);
-		if (unlikely(!virtqueue_enable_cb(vi->svq))) {
+		if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
 			/* More just got used, free them then recheck. */
 			capacity += free_old_xmit_skbs(vi);
 			if (capacity >= 2+MAX_SKB_FRAGS) {
-- 
1.7.5.53.gc233e


^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 16/18] virtio_ring: Add capacity check API
  2011-05-04 20:50 ` Michael S. Tsirkin
  (?)
@ 2011-05-04 20:52   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

From: Shirley Ma <mashirle@us.ibm.com>

Signed-off-by: Shirley Ma <xma@us.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---

I'm not sure who wrote this first anymore :)
But it's a simple patch.

 drivers/virtio/virtio_ring.c |    8 ++++++++
 include/linux/virtio.h       |    5 +++++
 2 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 3a70d70..57bf9d5 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -365,6 +365,14 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
 }
 EXPORT_SYMBOL_GPL(virtqueue_get_buf);
 
+int virtqueue_get_capacity(struct virtqueue *_vq)
+{
+	struct vring_virtqueue *vq = to_vvq(_vq);
+
+	return vq->num_free;
+}
+EXPORT_SYMBOL_GPL(virtqueue_get_capacity);
+
 void virtqueue_disable_cb(struct virtqueue *_vq)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 5151fd1..944ebcd 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -42,6 +42,9 @@ struct virtqueue {
  *	vq: the struct virtqueue we're talking about.
  *	len: the length written into the buffer
  *	Returns NULL or the "data" token handed to add_buf.
+ * virtqueue_get_capacity: get the current capacity of the queue
+ *	vq: the struct virtqueue we're talking about.
+ *	Returns remaining capacity of the queue.
  * virtqueue_disable_cb: disable callbacks
  *	vq: the struct virtqueue we're talking about.
  *	Note that this is not necessarily synchronous, hence unreliable and only
@@ -89,6 +92,8 @@ void virtqueue_kick(struct virtqueue *vq);
 
 void *virtqueue_get_buf(struct virtqueue *vq, unsigned int *len);
 
+int virtqueue_get_capacity(struct virtqueue *vq);
+
 void virtqueue_disable_cb(struct virtqueue *vq);
 
 bool virtqueue_enable_cb(struct virtqueue *vq);
-- 
1.7.5.53.gc233e


^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 17/18] virtio_net: fix TX capacity checks using new API
  2011-05-04 20:50 ` Michael S. Tsirkin
  (?)
@ 2011-05-04 20:53   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

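virtio_net counts the sg entries of completed skbs to
work out how much TX ring capacity was freed. But this
gives incorrect results when indirect buffers are used,
since an indirect buffer takes a single ring entry however
many sg entries it carries. Use the new capacity API instead.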

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
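
Not part of the patch: a toy model of the accounting problem, under
the simplifying assumption that every multi-entry buffer goes
indirect when the feature is on. struct toy_ring and toy_ring_cost()
are invented for illustration and are not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of ring accounting; not kernel code. */
struct toy_ring {
	unsigned int size;	/* total descriptors in the ring */
	unsigned int num_free;	/* descriptors currently free */
	bool indirect;		/* indirect descriptors in use */
};

/* Ring descriptors consumed by one buffer carrying num_sg entries. */
unsigned int toy_ring_cost(const struct toy_ring *r, unsigned int num_sg)
{
	/* An indirect buffer keeps its sg list in a separate table and
	 * takes one ring descriptor, whatever num_sg is. */
	return (r->indirect && num_sg > 1) ? 1 : num_sg;
}

void toy_ring_examples(void)
{
	struct toy_ring r = { .size = 16, .num_free = 16, .indirect = true };
	unsigned int num_sg = 5;
	unsigned int guess;

	/* Queue one 5-entry skb: only one descriptor leaves the ring. */
	r.num_free -= toy_ring_cost(&r, num_sg);
	assert(r.num_free == 15);

	/* Old accounting after completion: capacity += num_sg. */
	guess = r.num_free + num_sg;
	assert(guess == 20 && guess > r.size);	/* more than the ring holds */

	/* Asking the ring directly gives the right answer. */
	r.num_free += toy_ring_cost(&r, num_sg);
	assert(r.num_free == r.size);
}
```

Adding num_sg overestimates the space freed, so the queue could be
restarted while the ring is still too full; reading the free count
from the ring avoids the guess.
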
 drivers/net/virtio_net.c |    9 ++++-----
 1 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f685324..f33c92b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -509,19 +509,17 @@ again:
 	return received;
 }
 
-static unsigned int free_old_xmit_skbs(struct virtnet_info *vi)
+static void free_old_xmit_skbs(struct virtnet_info *vi)
 {
 	struct sk_buff *skb;
-	unsigned int len, tot_sgs = 0;
+	unsigned int len;
 
 	while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
 		pr_debug("Sent skb %p\n", skb);
 		vi->dev->stats.tx_bytes += skb->len;
 		vi->dev->stats.tx_packets++;
-		tot_sgs += skb_vnet_hdr(skb)->num_sg;
 		dev_kfree_skb_any(skb);
 	}
-	return tot_sgs;
 }
 
 static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
@@ -611,7 +609,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 		netif_stop_queue(dev);
 		if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
 			/* More just got used, free them then recheck. */
-			capacity += free_old_xmit_skbs(vi);
+			free_old_xmit_skbs(vi);
+			capacity = virtqueue_get_capacity(vi->svq);
 			if (capacity >= 2+MAX_SKB_FRAGS) {
 				netif_start_queue(dev);
 				virtqueue_disable_cb(vi->svq);
-- 
1.7.5.53.gc233e


^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 18/18] virtio_net: limit xmit polling
  2011-05-04 20:50 ` Michael S. Tsirkin
  (?)
@ 2011-05-04 20:53   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Current code might introduce a lot of latency variation
if there are many pending bufs at the time we
attempt to transmit a new one. This is bad for
real-time applications and can't be good for TCP either.

Free up just enough to both clean up all buffers
eventually and to be able to xmit the next packet.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
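
Not part of the patch: a userspace sketch of one reading of the
policy described above, namely reclaim until the next packet fits and
then a small fixed number more, so that the backlog still drains over
time. struct toy_txq and toy_free_old() are invented names, and each
completed buffer is assumed to give back exactly one slot.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy TX queue; not kernel code. Each completed buffer gives back one
 * slot of capacity when it is reclaimed. */
struct toy_txq {
	int capacity;	/* free slots */
	int completed;	/* buffers used by the other side, not yet reclaimed */
};

/* Reclaim until `needed` slots are free, then at most two more buffers.
 * Returns true if `needed` slots are free on exit. */
bool toy_free_old(struct toy_txq *q, int needed)
{
	int extra = 2;
	bool enough;

	while ((!(enough = q->capacity >= needed) || extra-- > 0) &&
	       q->completed > 0) {
		q->completed--;
		q->capacity++;
	}
	return enough;
}

void toy_free_old_examples(void)
{
	struct toy_txq busy = { .capacity = 0, .completed = 100 };
	struct toy_txq idle = { .capacity = 10, .completed = 100 };
	struct toy_txq stuck = { .capacity = 0, .completed = 1 };

	/* Short of space: reclaim 3 to reach the target, then 2 more. */
	assert(toy_free_old(&busy, 3) && busy.capacity == 5);
	/* Already enough space: reclaim only the 2-buffer budget. */
	assert(toy_free_old(&idle, 3) && idle.completed == 98);
	/* Nothing more to reclaim: report that space is still short. */
	assert(!toy_free_old(&stuck, 3) && stuck.capacity == 1);
}
```

The work done per transmit is bounded by the shortfall plus two
buffers, instead of by the whole backlog of completed buffers.
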
 drivers/net/virtio_net.c |   18 +++++++++++-------
 1 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f33c92b..9982bd7 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -509,17 +509,23 @@ again:
 	return received;
 }
 
-static void free_old_xmit_skbs(struct virtnet_info *vi)
+static bool free_old_xmit_skbs(struct virtnet_info *vi, int capacity)
 {
 	struct sk_buff *skb;
 	unsigned int len;
+	bool c;
+	/* We try to free up at least 2 skbs per one sent, so that we'll get
+	 * all of the memory back if they are used fast enough. */
+	int n = 2;
 
-	while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
+	while ((!(c = virtqueue_get_capacity(vi->svq) >= capacity) || n-- > 0) &&
+	       (skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
 		pr_debug("Sent skb %p\n", skb);
 		vi->dev->stats.tx_bytes += skb->len;
 		vi->dev->stats.tx_packets++;
 		dev_kfree_skb_any(skb);
 	}
+	return c;
 }
 
 static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
@@ -574,8 +580,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct virtnet_info *vi = netdev_priv(dev);
 	int capacity;
 
-	/* Free up any pending old buffers before queueing new ones. */
-	free_old_xmit_skbs(vi);
+	/* Free enough pending old buffers to enable queueing new ones. */
+	free_old_xmit_skbs(vi, 2+MAX_SKB_FRAGS);
 
 	/* Try to transmit */
 	capacity = xmit_skb(vi, skb);
@@ -609,9 +615,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 		netif_stop_queue(dev);
 		if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
 			/* More just got used, free them then recheck. */
-			free_old_xmit_skbs(vi);
-			capacity = virtqueue_get_capacity(vi->svq);
-			if (capacity >= 2+MAX_SKB_FRAGS) {
+			if (likely(free_old_xmit_skbs(vi, 2+MAX_SKB_FRAGS))) {
 				netif_start_queue(dev);
 				virtqueue_disable_cb(vi->svq);
 			}
-- 
1.7.5.53.gc233e

^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 18/18] virtio_net: limit xmit polling
@ 2011-05-04 20:53   ` Michael S. Tsirkin
  0 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: Krishna Kumar, Carsten Otte, lguest, Shirley Ma, kvm, linux-s390,
	netdev, habanero, Heiko Carstens, linux-kernel, virtualization,
	steved, Christian Borntraeger, Tom Lendacky, Martin Schwidefsky,
	linux390

Current code might introduce a lot of latency variation
if there are many pending bufs at the time we
attempt to transmit a new one. This is bad for
real-time applications and can't be good for TCP either.

Free up just enough to both clean up all buffers
eventually and to be able to xmit the next packet.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/net/virtio_net.c |   18 +++++++++++-------
 1 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f33c92b..9982bd7 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -509,17 +509,23 @@ again:
 	return received;
 }
 
-static void free_old_xmit_skbs(struct virtnet_info *vi)
+static bool free_old_xmit_skbs(struct virtnet_info *vi, int capacity)
 {
 	struct sk_buff *skb;
 	unsigned int len;
+	bool c;
+	/* We try to free up at least 2 skbs per one sent, so that we'll get
+	 * all of the memory back if they are used fast enough. */
+	int n = 2;
 
-	while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
+	while ((c = virtqueue_get_capacity(vi->svq) >= capacity) && --n > 0 &&
+	       (skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
 		pr_debug("Sent skb %p\n", skb);
 		vi->dev->stats.tx_bytes += skb->len;
 		vi->dev->stats.tx_packets++;
 		dev_kfree_skb_any(skb);
 	}
+	return c;
 }
 
 static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
@@ -574,8 +580,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct virtnet_info *vi = netdev_priv(dev);
 	int capacity;
 
-	/* Free up any pending old buffers before queueing new ones. */
-	free_old_xmit_skbs(vi);
+	/* Free enough pending old buffers to enable queueing new ones. */
+	free_old_xmit_skbs(vi, 2+MAX_SKB_FRAGS);
 
 	/* Try to transmit */
 	capacity = xmit_skb(vi, skb);
@@ -609,9 +615,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 		netif_stop_queue(dev);
 		if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
 			/* More just got used, free them then recheck. */
-			free_old_xmit_skbs(vi);
-			capacity = virtqueue_get_capacity(vi->svq);
-			if (capacity >= 2+MAX_SKB_FRAGS) {
+			if (!likely(free_old_xmit_skbs(vi, 2+MAX_SKB_FRAGS))) {
 				netif_start_queue(dev);
 				virtqueue_disable_cb(vi->svq);
 			}
-- 
1.7.5.53.gc233e
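
As a reading aid for the commit message above, here is a minimal sketch of
the stated goal: reclaim a couple of completed buffers on every transmit so
that all of them are cleaned up eventually, and keep reclaiming only until
there is room for the next packet. It is not the loop from the patch. The
queue is modelled with two plain counters, and the names sketch_txq,
sketch_get_buf and sketch_free_old are hypothetical; "needed" plays the role
of 2+MAX_SKB_FRAGS, and one descriptor per buffer is assumed for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the send queue: counters instead of a real vring. */
struct sketch_txq {
	int capacity;      /* free descriptors, like virtqueue_get_capacity() */
	int pending_used;  /* completed buffers that were not reclaimed yet */
};

/* Reclaim one completed buffer; returns false when none are pending. */
static bool sketch_get_buf(struct sketch_txq *q)
{
	if (q->pending_used == 0)
		return false;
	q->pending_used--;
	q->capacity++;     /* assumption: one descriptor per buffer */
	return true;
}

/*
 * Reclaim at least min_free buffers per call, and continue past that only
 * while fewer than `needed` descriptors are free. Stops early when nothing
 * is pending. Returns true if the queue can take the next packet.
 */
static bool sketch_free_old(struct sketch_txq *q, int needed, int min_free)
{
	int freed = 0;

	assert(q != NULL);
	while ((q->capacity < needed || freed < min_free) && sketch_get_buf(q))
		freed++;
	return q->capacity >= needed;
}
```

With 100 completed buffers pending and room already available, a call such
as sketch_free_old(&q, 5, 2) reclaims only two of them, so the work done per
transmit stays bounded instead of scaling with the backlog.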

^ permalink raw reply related	[flat|nested] 145+ messages in thread

* Re: [PATCH 05/18] virtio: used event index interface
  2011-05-04 20:51   ` Michael S. Tsirkin
                     ` (2 preceding siblings ...)
  (?)
@ 2011-05-04 21:56   ` Tom Lendacky
  2011-05-05  9:38     ` Michael S. Tsirkin
  2011-05-05  9:38     ` Michael S. Tsirkin
  -1 siblings, 2 replies; 145+ messages in thread
From: Tom Lendacky @ 2011-05-04 21:56 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Rusty Russell, Carsten Otte, Christian Borntraeger,
	linux390, Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar, steved,
	habanero

On Wednesday, May 04, 2011 03:51:09 PM Michael S. Tsirkin wrote:
> Define a new feature bit for the guest to utilize a used_event index
> (like Xen) instead of a flag bit to enable/disable interrupts.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  include/linux/virtio_ring.h |    9 +++++++++
>  1 files changed, 9 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
> index e4d144b..f5c1b75 100644
> --- a/include/linux/virtio_ring.h
> +++ b/include/linux/virtio_ring.h
> @@ -29,6 +29,10 @@
>  /* We support indirect buffer descriptors */
>  #define VIRTIO_RING_F_INDIRECT_DESC	28
> 
> +/* The Guest publishes the used index for which it expects an interrupt
> + * at the end of the avail ring. Host should ignore the avail->flags field. */
> +#define VIRTIO_RING_F_USED_EVENT_IDX	29
> +
>  /* Virtio ring descriptors: 16 bytes.  These can chain together via "next". */
>  struct vring_desc {
>  	/* Address (guest-physical). */
> @@ -83,6 +87,7 @@ struct vring {
>   *	__u16 avail_flags;
>   *	__u16 avail_idx;
>   *	__u16 available[num];
> + *	__u16 used_event_idx;
>   *
>   *	// Padding to the next align boundary.
>   *	char pad[];
> @@ -93,6 +98,10 @@ struct vring {
>   *	struct vring_used_elem used[num];
>   * };
>   */
> +/* We publish the used event index at the end of the available ring.
> + * It is at the end for backwards compatibility. */
> +#define vring_used_event(vr) ((vr)->avail->ring[(vr)->num])
> +
>  static inline void vring_init(struct vring *vr, unsigned int num, void *p,
>  			      unsigned long align)
>  {

You should update the vring_size procedure to account for the extra field at
the end of the available ring by changing "(2 + num)" to "(3 + num)":
    return ((sizeof(struct vring_desc) * num + sizeof(__u16) * (3 + num)

Tom
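
To make the arithmetic behind this suggestion concrete, here is a small
sketch of the first half of the size computation. It only models the
descriptor table and the avail ring; the element sizes are taken from the
layout comment in virtio_ring.h, and the helper names (sketch_avail_end,
sketch_used_event_offset) are hypothetical, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size from the layout comment in virtio_ring.h. */
#define SKETCH_DESC_SIZE 16u	/* struct vring_desc */

/*
 * Bytes taken by the descriptor table plus the avail ring, rounded up to
 * align. The avail ring holds 2 + num __u16 words without the event index
 * and 3 + num words with it.
 */
static size_t sketch_avail_end(unsigned int num, size_t align,
			       int with_used_event)
{
	size_t words = (with_used_event ? 3u : 2u) + num;
	size_t bytes = SKETCH_DESC_SIZE * (size_t)num + sizeof(uint16_t) * words;

	assert(align != 0 && (align & (align - 1)) == 0); /* power of two */
	return (bytes + align - 1) & ~(align - 1);
}

/* Byte offset of avail->ring[num], which is where vring_used_event() points. */
static size_t sketch_used_event_offset(unsigned int num)
{
	return SKETCH_DESC_SIZE * (size_t)num + sizeof(uint16_t) * (2u + num);
}
```

With a deliberately small alignment the difference is visible: for num = 2
and align = 4, the "(2 + num)" form ends the avail area at byte 40, which is
exactly where the used_event word starts, while the "(3 + num)" form ends it
at byte 44. With page-sized alignment the padding usually absorbs the extra
two bytes, so the sketch is only meant to show which term accounts for the
new field.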

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 09/18] virtio: use avail_event index
  2011-05-04 20:51   ` Michael S. Tsirkin
                     ` (2 preceding siblings ...)
  (?)
@ 2011-05-04 21:58   ` Tom Lendacky
  2011-05-05  9:34     ` Michael S. Tsirkin
  2011-05-05  9:34     ` Michael S. Tsirkin
  -1 siblings, 2 replies; 145+ messages in thread
From: Tom Lendacky @ 2011-05-04 21:58 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Rusty Russell, Carsten Otte, Christian Borntraeger,
	linux390, Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar, steved,
	habanero


On Wednesday, May 04, 2011 03:51:47 PM Michael S. Tsirkin wrote:
> Use the new avail_event feature to reduce the number
> of exits from the guest.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  drivers/virtio/virtio_ring.c |   39 ++++++++++++++++++++++++++++++++++++++-
>  1 files changed, 38 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 3a3ed75..262dfe6 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -82,6 +82,15 @@ struct vring_virtqueue
>  	/* Host supports indirect buffers */
>  	bool indirect;
> 
> +	/* Host publishes avail event idx */
> +	bool event;
> +
> +	/* Is kicked_avail below valid? */
> +	bool kicked_avail_valid;
> +
> +	/* avail idx value we already kicked. */
> +	u16 kicked_avail;
> +
>  	/* Number of free buffers */
>  	unsigned int num_free;
>  	/* Head of free buffer list. */
> @@ -228,6 +237,12 @@ add_head:
>  	 * new available array entries. */
>  	virtio_wmb();
>  	vq->vring.avail->idx++;
> +	/* If the driver never bothers to kick in a very long while,
> +	 * avail index might wrap around. If that happens, invalidate
> +	 * kicked_avail index we stored. TODO: make sure all drivers
> +	 * kick at least once in 2^16 and remove this. */
> +	if (unlikely(vq->vring.avail->idx == vq->kicked_avail))
> +		vq->kicked_avail_valid = true;

vq->kicked_avail_valid should be set to false here.

Tom

> 
>  	pr_debug("Added buffer head %i to %p\n", head, vq);
>  	END_USE(vq);
> @@ -236,6 +251,23 @@ add_head:
>  }
>  EXPORT_SYMBOL_GPL(virtqueue_add_buf_gfp);
> 
> +
> +static bool vring_notify(struct vring_virtqueue *vq)
> +{
> +	u16 old, new;
> +	bool v;
> +	if (!vq->event)
> +		return !(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY);
> +
> +	v = vq->kicked_avail_valid;
> +	old = vq->kicked_avail;
> +	new = vq->kicked_avail = vq->vring.avail->idx;
> +	vq->kicked_avail_valid = true;
> +	if (unlikely(!v))
> +		return true;
> +	return vring_need_event(vring_avail_event(&vq->vring), new, old);
> +}
> +
>  void virtqueue_kick(struct virtqueue *_vq)
>  {
>  	struct vring_virtqueue *vq = to_vvq(_vq);
> @@ -244,7 +276,7 @@ void virtqueue_kick(struct virtqueue *_vq)
>  	/* Need to update avail index before checking if we should notify */
>  	virtio_mb();
> 
> -	if (!(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY))
> +	if (vring_notify(vq))
>  		/* Prod other side to tell it about changes. */
>  		vq->notify(&vq->vq);
> 
> @@ -437,6 +469,8 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
>  	vq->vq.name = name;
>  	vq->notify = notify;
>  	vq->broken = false;
> +	vq->kicked_avail_valid = false;
> +	vq->kicked_avail = 0;
>  	vq->last_used_idx = 0;
>  	list_add_tail(&vq->vq.list, &vdev->vqs);
>  #ifdef DEBUG
> @@ -444,6 +478,7 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
>  #endif
> 
>  	vq->indirect = virtio_has_feature(vdev, VIRTIO_RING_F_INDIRECT_DESC);
> +	vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_AVAIL_EVENT_IDX);
> 
>  	/* No callback?  Tell other side not to bother us. */
>  	if (!callback)
> @@ -482,6 +517,8 @@ void vring_transport_features(struct virtio_device *vdev)
>  			break;
>  		case VIRTIO_RING_F_USED_EVENT_IDX:
>  			break;
> +		case VIRTIO_RING_F_AVAIL_EVENT_IDX:
> +			break;
>  		default:
>  			/* We don't understand this bit. */
>  			clear_bit(i, vdev->features);
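
For readers following the kicked_avail_valid comment above, here is a
user-space sketch of the bookkeeping with the correction applied (the flag
is cleared, not set, when avail idx wraps back onto kicked_avail). The
struct and function names are hypothetical stand-ins for the fields in
struct vring_virtqueue, and the comparison in need_event() is an assumed
form of vring_need_event(), which is not shown in this excerpt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver-side state in struct vring_virtqueue. */
struct kick_state {
	uint16_t avail_idx;		/* models vq->vring.avail->idx */
	uint16_t avail_event;		/* models the index published by the host */
	uint16_t kicked_avail;		/* avail idx seen at the last kick check */
	bool kicked_avail_valid;	/* false initially and after a full wrap */
};

/* Assumed wrap-safe check: is event_idx within [old, new_idx) modulo 2^16? */
static bool need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
}

/* Models the tail of virtqueue_add_buf_gfp() with the correction applied. */
static void add_one(struct kick_state *s)
{
	assert(s != NULL);
	s->avail_idx++;
	/* A full 2^16 wrap without a kick makes kicked_avail ambiguous. */
	if (s->avail_idx == s->kicked_avail)
		s->kicked_avail_valid = false;
}

/* Models vring_notify() for the event-index case. */
static bool should_notify(struct kick_state *s)
{
	bool was_valid;
	uint16_t old, new_idx;

	assert(s != NULL);
	was_valid = s->kicked_avail_valid;
	old = s->kicked_avail;
	new_idx = s->avail_idx;
	s->kicked_avail = new_idx;
	s->kicked_avail_valid = true;
	if (!was_valid)
		return true;	/* no trustworthy history: always kick */
	return need_event(s->avail_event, new_idx, old);
}
```

If 65536 buffers are added between two kick checks, new_idx equals old and
the window [old, new_idx) looks empty, so need_event() alone would report
that no kick is needed. Clearing kicked_avail_valid in add_one() is what
forces the kick in that case.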

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 07/18] virtio ring: inline function to check for events
  2011-05-04 20:51   ` Michael S. Tsirkin
  (?)
@ 2011-05-05  8:34   ` Stefan Hajnoczi
  2011-05-05  8:56     ` Michael S. Tsirkin
  2011-05-05  8:56     ` Michael S. Tsirkin
  -1 siblings, 2 replies; 145+ messages in thread
From: Stefan Hajnoczi @ 2011-05-05  8:34 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Krishna Kumar, Carsten Otte, lguest, Shirley Ma,
	kvm, linux-s390, netdev, habanero, Heiko Carstens,
	virtualization, steved, Christian Borntraeger, Tom Lendacky,
	Martin Schwidefsky, linux390

On Wed, May 4, 2011 at 9:51 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> With the new used_event and avail_event and features, both
> host and guest need similar logic to check whether events are
> enabled, so it helps to put the common code in the header.
>
> Note that Xen has similar logic for notification hold-off
> in include/xen/interface/io/ring.h with req_event and req_prod
> corresponding to event_idx + 1 and new_idx respectively.
> +1 comes from the fact that req_event and req_prod in Xen start at 1,
> while event index in virtio starts at 0.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  include/linux/virtio_ring.h |   14 ++++++++++++++
>  1 files changed, 14 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
> index f791772..2a3b0ea 100644
> --- a/include/linux/virtio_ring.h
> +++ b/include/linux/virtio_ring.h
> @@ -124,6 +124,20 @@ static inline unsigned vring_size(unsigned int num, unsigned long align)
>                + sizeof(__u16) * 3 + sizeof(struct vring_used_elem) * num;
>  }
>
> +/* The following is used with USED_EVENT_IDX and AVAIL_EVENT_IDX */
> +/* Assuming a given event_idx value from the other size, if

s/other size/other side/ ?

Stefan
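
The quoted hunk is cut off before the function body, so as context for the
commit message, here is a sketch of the kind of check it describes. The
comparison follows the form of vring_need_event() as I understand it from
the mainline virtio_ring.h; treat it as an assumption, since this excerpt
does not show it. The sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * The other side published event_idx. We moved our index from old to
 * new_idx. Notify only if event_idx lies in the half-open window
 * [old, new_idx), with all arithmetic done modulo 2^16.
 */
static bool sketch_need_event(uint16_t event_idx, uint16_t new_idx,
			      uint16_t old)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
}

/* A few spot checks, including one that crosses the 16-bit wrap. */
static void sketch_need_event_selftest(void)
{
	assert(sketch_need_event(5, 6, 5));		/* just passed the event */
	assert(!sketch_need_event(5, 5, 3));		/* not reached yet */
	assert(sketch_need_event(5, 8, 3));		/* passed within a batch */
	assert(sketch_need_event(0xFFFF, 1, 0xFFFE));	/* passed across the wrap */
	assert(!sketch_need_event(10, 1, 0xFFFE));	/* event still ahead */
}
```

Because the window is expressed as a difference of 16-bit indexes, the same
expression works for both directions of the hand-off, which is presumably
why it can live in the shared header.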

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 07/18] virtio ring: inline function to check for events
  2011-05-05  8:34   ` Stefan Hajnoczi
@ 2011-05-05  8:56     ` Michael S. Tsirkin
  2011-05-05  8:56     ` Michael S. Tsirkin
  1 sibling, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-05  8:56 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: linux-kernel, Krishna Kumar, Carsten Otte, lguest, Shirley Ma,
	kvm, linux-s390, netdev, habanero, Heiko Carstens,
	virtualization, steved, Christian Borntraeger, Tom Lendacky,
	Martin Schwidefsky, linux390

On Thu, May 05, 2011 at 09:34:46AM +0100, Stefan Hajnoczi wrote:
> On Wed, May 4, 2011 at 9:51 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > With the new used_event and avail_event and features, both
> > host and guest need similar logic to check whether events are
> > enabled, so it helps to put the common code in the header.
> >
> > Note that Xen has similar logic for notification hold-off
> > in include/xen/interface/io/ring.h with req_event and req_prod
> > corresponding to event_idx + 1 and new_idx respectively.
> > +1 comes from the fact that req_event and req_prod in Xen start at 1,
> > while event index in virtio starts at 0.
> >
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >  include/linux/virtio_ring.h |   14 ++++++++++++++
> >  1 files changed, 14 insertions(+), 0 deletions(-)
> >
> > diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
> > index f791772..2a3b0ea 100644
> > --- a/include/linux/virtio_ring.h
> > +++ b/include/linux/virtio_ring.h
> > @@ -124,6 +124,20 @@ static inline unsigned vring_size(unsigned int num, unsigned long align)
> >                + sizeof(__u16) * 3 + sizeof(struct vring_used_elem) * num;
> >  }
> >
> > +/* The following is used with USED_EVENT_IDX and AVAIL_EVENT_IDX */
> > +/* Assuming a given event_idx value from the other size, if
> 
> s/other size/other side/ ?
> 
> Stefan

Exactly. Good catch, thanks.

-- 
MST

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 09/18] virtio: use avail_event index
  2011-05-04 21:58   ` Tom Lendacky
@ 2011-05-05  9:34     ` Michael S. Tsirkin
  2011-05-05  9:34     ` Michael S. Tsirkin
  1 sibling, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-05  9:34 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: linux-kernel, Rusty Russell, Carsten Otte, Christian Borntraeger,
	linux390, Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar, steved,
	habanero

On Wed, May 04, 2011 at 04:58:18PM -0500, Tom Lendacky wrote:
> 
> On Wednesday, May 04, 2011 03:51:47 PM Michael S. Tsirkin wrote:
> > Use the new avail_event feature to reduce the number
> > of exits from the guest.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >  drivers/virtio/virtio_ring.c |   39 ++++++++++++++++++++++++++++++++++++++-
> >  1 files changed, 38 insertions(+), 1 deletions(-)
> > 
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 3a3ed75..262dfe6 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -82,6 +82,15 @@ struct vring_virtqueue
> >  	/* Host supports indirect buffers */
> >  	bool indirect;
> > 
> > +	/* Host publishes avail event idx */
> > +	bool event;
> > +
> > +	/* Is kicked_avail below valid? */
> > +	bool kicked_avail_valid;
> > +
> > +	/* avail idx value we already kicked. */
> > +	u16 kicked_avail;
> > +
> >  	/* Number of free buffers */
> >  	unsigned int num_free;
> >  	/* Head of free buffer list. */
> > @@ -228,6 +237,12 @@ add_head:
> >  	 * new available array entries. */
> >  	virtio_wmb();
> >  	vq->vring.avail->idx++;
> > +	/* If the driver never bothers to kick in a very long while,
> > +	 * avail index might wrap around. If that happens, invalidate
> > +	 * kicked_avail index we stored. TODO: make sure all drivers
> > +	 * kick at least once in 2^16 and remove this. */
> > +	if (unlikely(vq->vring.avail->idx == vq->kicked_avail))
> > +		vq->kicked_avail_valid = true;
> 
> vq->kicked_avail_valid should be set to false here.
> 
> Tom

Right, good catch.

> > 
> >  	pr_debug("Added buffer head %i to %p\n", head, vq);
> >  	END_USE(vq);
> > @@ -236,6 +251,23 @@ add_head:
> >  }
> >  EXPORT_SYMBOL_GPL(virtqueue_add_buf_gfp);
> > 
> > +
> > +static bool vring_notify(struct vring_virtqueue *vq)
> > +{
> > +	u16 old, new;
> > +	bool v;
> > +	if (!vq->event)
> > +		return !(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY);
> > +
> > +	v = vq->kicked_avail_valid;
> > +	old = vq->kicked_avail;
> > +	new = vq->kicked_avail = vq->vring.avail->idx;
> > +	vq->kicked_avail_valid = true;
> > +	if (unlikely(!v))
> > +		return true;
> > +	return vring_need_event(vring_avail_event(&vq->vring), new, old);
> > +}
> > +
> >  void virtqueue_kick(struct virtqueue *_vq)
> >  {
> >  	struct vring_virtqueue *vq = to_vvq(_vq);
> > @@ -244,7 +276,7 @@ void virtqueue_kick(struct virtqueue *_vq)
> >  	/* Need to update avail index before checking if we should notify */
> >  	virtio_mb();
> > 
> > -	if (!(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY))
> > +	if (vring_notify(vq))
> >  		/* Prod other side to tell it about changes. */
> >  		vq->notify(&vq->vq);
> > 
> > @@ -437,6 +469,8 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
> >  	vq->vq.name = name;
> >  	vq->notify = notify;
> >  	vq->broken = false;
> > +	vq->kicked_avail_valid = false;
> > +	vq->kicked_avail = 0;
> >  	vq->last_used_idx = 0;
> >  	list_add_tail(&vq->vq.list, &vdev->vqs);
> >  #ifdef DEBUG
> > @@ -444,6 +478,7 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
> >  #endif
> > 
> >  	vq->indirect = virtio_has_feature(vdev, VIRTIO_RING_F_INDIRECT_DESC);
> > +	vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_AVAIL_EVENT_IDX);
> > 
> >  	/* No callback?  Tell other side not to bother us. */
> >  	if (!callback)
> > @@ -482,6 +517,8 @@ void vring_transport_features(struct virtio_device *vdev)
> >  			break;
> >  		case VIRTIO_RING_F_USED_EVENT_IDX:
> >  			break;
> > +		case VIRTIO_RING_F_AVAIL_EVENT_IDX:
> > +			break;
> >  		default:
> >  			/* We don't understand this bit. */
> >  			clear_bit(i, vdev->features);

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 09/18] virtio: use avail_event index
  2011-05-04 21:58   ` Tom Lendacky
  2011-05-05  9:34     ` Michael S. Tsirkin
@ 2011-05-05  9:34     ` Michael S. Tsirkin
  1 sibling, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-05  9:34 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Krishna Kumar, Carsten Otte, lguest, Shirley Ma, kvm, linux-s390,
	netdev, habanero, Heiko Carstens, linux-kernel, virtualization,
	steved, Christian Borntraeger, Martin Schwidefsky, linux390

On Wed, May 04, 2011 at 04:58:18PM -0500, Tom Lendacky wrote:
> 
> On Wednesday, May 04, 2011 03:51:47 PM Michael S. Tsirkin wrote:
> > Use the new avail_event feature to reduce the number
> > of exits from the guest.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >  drivers/virtio/virtio_ring.c |   39
> > ++++++++++++++++++++++++++++++++++++++- 1 files changed, 38 insertions(+),
> > 1 deletions(-)
> > 
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 3a3ed75..262dfe6 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -82,6 +82,15 @@ struct vring_virtqueue
> >  	/* Host supports indirect buffers */
> >  	bool indirect;
> > 
> > +	/* Host publishes avail event idx */
> > +	bool event;
> > +
> > +	/* Is kicked_avail below valid? */
> > +	bool kicked_avail_valid;
> > +
> > +	/* avail idx value we already kicked. */
> > +	u16 kicked_avail;
> > +
> >  	/* Number of free buffers */
> >  	unsigned int num_free;
> >  	/* Head of free buffer list. */
> > @@ -228,6 +237,12 @@ add_head:
> >  	 * new available array entries. */
> >  	virtio_wmb();
> >  	vq->vring.avail->idx++;
> > +	/* If the driver never bothers to kick in a very long while,
> > +	 * avail index might wrap around. If that happens, invalidate
> > +	 * kicked_avail index we stored. TODO: make sure all drivers
> > +	 * kick at least once in 2^16 and remove this. */
> > +	if (unlikely(vq->vring.avail->idx == vq->kicked_avail))
> > +		vq->kicked_avail_valid = true;
> 
> vq->kicked_avail_valid should be set to false here.
> 
> Tom

Right, good catch.

> > 
> >  	pr_debug("Added buffer head %i to %p\n", head, vq);
> >  	END_USE(vq);
> > @@ -236,6 +251,23 @@ add_head:
> >  }
> >  EXPORT_SYMBOL_GPL(virtqueue_add_buf_gfp);
> > 
> > +
> > +static bool vring_notify(struct vring_virtqueue *vq)
> > +{
> > +	u16 old, new;
> > +	bool v;
> > +	if (!vq->event)
> > +		return !(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY);
> > +
> > +	v = vq->kicked_avail_valid;
> > +	old = vq->kicked_avail;
> > +	new = vq->kicked_avail = vq->vring.avail->idx;
> > +	vq->kicked_avail_valid = true;
> > +	if (unlikely(!v))
> > +		return true;
> > +	return vring_need_event(vring_avail_event(&vq->vring), new, old);
> > +}
> > +
> >  void virtqueue_kick(struct virtqueue *_vq)
> >  {
> >  	struct vring_virtqueue *vq = to_vvq(_vq);
> > @@ -244,7 +276,7 @@ void virtqueue_kick(struct virtqueue *_vq)
> >  	/* Need to update avail index before checking if we should notify */
> >  	virtio_mb();
> > 
> > -	if (!(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY))
> > +	if (vring_notify(vq))
> >  		/* Prod other side to tell it about changes. */
> >  		vq->notify(&vq->vq);
> > 
> > @@ -437,6 +469,8 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
> >  	vq->vq.name = name;
> >  	vq->notify = notify;
> >  	vq->broken = false;
> > +	vq->kicked_avail_valid = false;
> > +	vq->kicked_avail = 0;
> >  	vq->last_used_idx = 0;
> >  	list_add_tail(&vq->vq.list, &vdev->vqs);
> >  #ifdef DEBUG
> > @@ -444,6 +478,7 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
> >  #endif
> > 
> >  	vq->indirect = virtio_has_feature(vdev, VIRTIO_RING_F_INDIRECT_DESC);
> > +	vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_AVAIL_EVENT_IDX);
> > 
> >  	/* No callback?  Tell other side not to bother us. */
> >  	if (!callback)
> > @@ -482,6 +517,8 @@ void vring_transport_features(struct virtio_device *vdev)
> >  			break;
> >  		case VIRTIO_RING_F_USED_EVENT_IDX:
> >  			break;
> > +		case VIRTIO_RING_F_AVAIL_EVENT_IDX:
> > +			break;
> >  		default:
> >  			/* We don't understand this bit. */
> >  			clear_bit(i, vdev->features);
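
The wrap-around case is easier to see in a small userspace model. What
follows is a minimal sketch, not the kernel code: need_event() is assumed
to be the 16-bit window test that vring_need_event() performs (its body is
not shown in this part of the thread), and struct fake_vq, add_buf(),
should_kick() and kick_survives_wrap() are hypothetical names that only
mirror the fields and logic of the patch quoted above. Memory barriers are
left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed form of vring_need_event(): true when event_idx lies in the
 * half-open window [old, new_idx), computed modulo 2^16. */
static bool need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
}

/* Hypothetical stand-in for the fields the patch adds to vring_virtqueue. */
struct fake_vq {
	uint16_t avail_idx;       /* models vring.avail->idx */
	uint16_t avail_event;     /* index the host asked to be kicked at */
	uint16_t kicked_avail;    /* avail idx at the time of the last kick */
	bool kicked_avail_valid;
};

/* Models the add_head path; "fixed" selects the corrected assignment. */
static void add_buf(struct fake_vq *vq, bool fixed)
{
	vq->avail_idx++;
	if (vq->avail_idx == vq->kicked_avail)
		vq->kicked_avail_valid = !fixed;
}

/* Models vring_notify() with the avail event feature negotiated. */
static bool should_kick(struct fake_vq *vq)
{
	bool v = vq->kicked_avail_valid;
	uint16_t old = vq->kicked_avail;
	uint16_t cur = vq->kicked_avail = vq->avail_idx;

	vq->kicked_avail_valid = true;
	if (!v)
		return true;
	return need_event(vq->avail_event, cur, old);
}

/* Adds 2^16 buffers between two kicks and reports whether the second
 * kick is delivered. */
static bool kick_survives_wrap(bool fixed)
{
	struct fake_vq vq = { 0 };
	bool first;
	uint32_t i;

	add_buf(&vq, fixed);
	first = should_kick(&vq);      /* stored index is not valid yet */
	assert(first);
	(void)first;
	vq.avail_event = vq.avail_idx; /* host wants a kick for the next buffer */
	for (i = 0; i < 65536; i++)
		add_buf(&vq, fixed);
	return should_kick(&vq);
}
```

With fixed == false (the "= true" assignment as posted),
kick_survives_wrap() returns false: after exactly 2^16 additions old and
new are equal, the window is empty, and the kick is lost. With the valid
flag cleared instead, it returns true.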

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 05/18] virtio: used event index interface
  2011-05-04 21:56   ` Tom Lendacky
  2011-05-05  9:38     ` Michael S. Tsirkin
@ 2011-05-05  9:38     ` Michael S. Tsirkin
  1 sibling, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-05  9:38 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: linux-kernel, Rusty Russell, Carsten Otte, Christian Borntraeger,
	linux390, Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar, steved,
	habanero

On Wed, May 04, 2011 at 04:56:09PM -0500, Tom Lendacky wrote:
> On Wednesday, May 04, 2011 03:51:09 PM Michael S. Tsirkin wrote:
> > Define a new feature bit for the guest to utilize a used_event index
> > (like Xen) instead of a flag bit to enable/disable interrupts.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >  include/linux/virtio_ring.h |    9 +++++++++
> >  1 files changed, 9 insertions(+), 0 deletions(-)
> > 
> > diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
> > index e4d144b..f5c1b75 100644
> > --- a/include/linux/virtio_ring.h
> > +++ b/include/linux/virtio_ring.h
> > @@ -29,6 +29,10 @@
> >  /* We support indirect buffer descriptors */
> >  #define VIRTIO_RING_F_INDIRECT_DESC	28
> > 
> > +/* The Guest publishes the used index for which it expects an interrupt
> > + * at the end of the avail ring. Host should ignore the avail->flags field. */
> > +#define VIRTIO_RING_F_USED_EVENT_IDX	29
> > +
> >  /* Virtio ring descriptors: 16 bytes.  These can chain together via "next". */
> >  struct vring_desc {
> >  	/* Address (guest-physical). */
> > @@ -83,6 +87,7 @@ struct vring {
> >   *	__u16 avail_flags;
> >   *	__u16 avail_idx;
> >   *	__u16 available[num];
> > + *	__u16 used_event_idx;
> >   *
> >   *	// Padding to the next align boundary.
> >   *	char pad[];
> > @@ -93,6 +98,10 @@ struct vring {
> >   *	struct vring_used_elem used[num];
> >   * };
> >   */
> > +/* We publish the used event index at the end of the available ring.
> > + * It is at the end for backwards compatibility. */
> > +#define vring_used_event(vr) ((vr)->avail->ring[(vr)->num])
> > +
> >  static inline void vring_init(struct vring *vr, unsigned int num, void *p,
> >  			      unsigned long align)
> >  {
> 
> You should update the vring_size procedure to account for the extra field at
> the end of the available ring by changing the "(2 + num)" to "(3 + num)":
>     return ((sizeof(struct vring_desc) * num + sizeof(__u16) * (3 + num)
> 
> Tom

In practice it gives the same result because of the alignment, but sure.
Thanks!

-- 
MST

^ permalink raw reply	[flat|nested] 145+ messages in thread

* [PATCH 0/3] virtio and vhost-net performance enhancements
  2011-05-04 20:50 ` Michael S. Tsirkin
@ 2011-05-05 15:07   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-05 15:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Here are a couple of minor fixes suggested on list.
Applies on top of the previous patchset.

As before, the result is pushed here:
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost-net-next-event-idx-v1

Michael S. Tsirkin (3):
  virtio: fix avail event support
  virtio_ring: check used_event offset
  virtio_ring: need_event api comment fix

 drivers/virtio/virtio_ring.c |    2 +-
 include/linux/virtio_ring.h  |   10 ++++++++--
 2 files changed, 9 insertions(+), 3 deletions(-)

-- 
1.7.5.53.gc233e

^ permalink raw reply	[flat|nested] 145+ messages in thread

* [PATCH 1/3] virtio: fix avail event support
  2011-05-05 15:07   ` Michael S. Tsirkin
@ 2011-05-05 15:08     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-05 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Set the kicked_avail_valid flag to false, not true, when the avail index
wraps around to kicked_avail.

Reported-by: Tom Lendacky <tahm@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/virtio/virtio_ring.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 57bf9d5..0ea0781 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -242,7 +242,7 @@ add_head:
 	 * kicked_avail index we stored. TODO: make sure all drivers
 	 * kick at least once in 2^16 and remove this. */
 	if (unlikely(vq->vring.avail->idx == vq->kicked_avail))
-		vq->kicked_avail_valid = true;
+		vq->kicked_avail_valid = false;
 
 	pr_debug("Added buffer head %i to %p\n", head, vq);
 	END_USE(vq);
-- 
1.7.5.53.gc233e


^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 2/3] virtio_ring: check used_event offset
  2011-05-05 15:07   ` Michael S. Tsirkin
@ 2011-05-05 15:08     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-05 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Nothing's wrong with vring_size as is, but it's nice
to check that the new field in the avail ring
won't overflow into the used ring.

Reported-by: Tom Lendacky <tahm@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/linux/virtio_ring.h |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
index 2a3b0ea..089cbf2 100644
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -119,7 +119,13 @@ static inline void vring_init(struct vring *vr, unsigned int num, void *p,
 
 static inline unsigned vring_size(unsigned int num, unsigned long align)
 {
-	return ((sizeof(struct vring_desc) * num + sizeof(__u16) * (2 + num)
+#ifdef __KERNEL__
+	/* Older versions did not have used_event field at the end of the
+	 * avail ring. Used ring offset must be compatible with such devices. */
+	size_t s = sizeof(struct vring_desc) * num + sizeof(__u16) * (2 + num);
+	BUG_ON(ALIGN(s, align) != ALIGN(s + sizeof(__u16), align));
+#endif
+	return ((sizeof(struct vring_desc) * num + sizeof(__u16) * (3 + num)
 		 + align - 1) & ~(align - 1))
 		+ sizeof(__u16) * 3 + sizeof(struct vring_used_elem) * num;
 }
-- 
1.7.5.53.gc233e
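
The compatibility condition that the BUG_ON above enforces can be tried
out in userspace. This is only a sketch of the arithmetic under stated
assumptions: the 16-byte descriptor size comes from the header comment
quoted earlier in the thread, the 8-byte used element (two 32-bit fields)
is assumed, and align_up(), used_offset(), layout_compatible() and
ring_size() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DESC_SIZE      16u  /* "Virtio ring descriptors: 16 bytes" */
#define USED_ELEM_SIZE 8u   /* assumed: 32-bit id plus 32-bit len */

/* Round x up to a multiple of a; a must be a power of two. */
static size_t align_up(size_t x, size_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Offset of the used ring. extra_u16 is 0 for the old avail layout and
 * 1 when used_event is appended after the avail ring. */
static size_t used_offset(unsigned int num, size_t align, unsigned int extra_u16)
{
	size_t s = DESC_SIZE * num + sizeof(uint16_t) * (2 + num + extra_u16);

	return align_up(s, align);
}

/* True when appending used_event leaves the used ring where devices
 * without the field expect it. */
static int layout_compatible(unsigned int num, size_t align)
{
	return used_offset(num, align, 0) == used_offset(num, align, 1);
}

/* Mirrors the arithmetic of the patched vring_size(). */
static size_t ring_size(unsigned int num, size_t align)
{
	return used_offset(num, align, 1)
		+ sizeof(uint16_t) * 3 + USED_ELEM_SIZE * num;
}
```

For example, layout_compatible(256, 4096) holds because the avail ring
ends 516 bytes into a page and the extra two bytes fit in the padding,
while layout_compatible(2, 4) does not: there the avail ring already ends
on an alignment boundary (offset 40), so the new field pushes the used
ring from 40 to 44.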


^ permalink raw reply related	[flat|nested] 145+ messages in thread

* [PATCH 3/3] virtio_ring: need_event api comment fix
  2011-05-05 15:07   ` Michael S. Tsirkin
@ 2011-05-05 15:08     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-05 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

fix typo in a comment: size -> side

Reported-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/linux/virtio_ring.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
index 089cbf2..0a45c6e 100644
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -131,7 +131,7 @@ static inline unsigned vring_size(unsigned int num, unsigned long align)
 }
 
 /* The following is used with USED_EVENT_IDX and AVAIL_EVENT_IDX */
-/* Assuming a given event_idx value from the other size, if
+/* Assuming a given event_idx value from the other side, if
  * we have just incremented index from old to new_idx,
  * should we trigger an event? */
 static inline int vring_need_event(__u16 event_idx, __u16 new_idx, __u16 old)
-- 
1.7.5.53.gc233e

^ permalink raw reply related	[flat|nested] 145+ messages in thread

* Re: [PATCH 06/18] virtio_ring: avail event index interface
  2011-05-04 20:51   ` Michael S. Tsirkin
@ 2011-05-09  4:13     ` Rusty Russell
  -1 siblings, 0 replies; 145+ messages in thread
From: Rusty Russell @ 2011-05-09  4:13 UTC (permalink / raw)
  To: Michael S. Tsirkin, linux-kernel
  Cc: Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

On Wed, 4 May 2011 23:51:19 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> Define a new feature bit for the host to
> declare that it uses an avail_event index
> (like Xen) instead of a feature bit
> to enable/disable interrupts.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  include/linux/virtio_ring.h |   11 ++++++++---
>  1 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
> index f5c1b75..f791772 100644
> --- a/include/linux/virtio_ring.h
> +++ b/include/linux/virtio_ring.h
> @@ -32,6 +32,9 @@
>  /* The Guest publishes the used index for which it expects an interrupt
>   * at the end of the avail ring. Host should ignore the avail->flags field. */
>  #define VIRTIO_RING_F_USED_EVENT_IDX	29
> +/* The Host publishes the avail index for which it expects a kick
> + * at the end of the used ring. Guest should ignore the used->flags field. */
> +#define VIRTIO_RING_F_AVAIL_EVENT_IDX	32

Are you really sure we want to separate the two?  Seems a little simpler
to have one bit to mean "we're publishing our threshold".  For someone
implementing this from scratch, it's a little simpler.

Or are there cases where the old style makes more sense?

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 08/18] virtio_ring: support for used_event idx feature
@ 2011-05-09  4:17     ` Rusty Russell
  0 siblings, 0 replies; 145+ messages in thread
From: Rusty Russell @ 2011-05-09  4:17 UTC (permalink / raw)
  To: Michael S. Tsirkin, linux-kernel
  Cc: Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

On Wed, 4 May 2011 23:51:38 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> Add support for the used_event idx feature: when enabling
> interrupts, publish the current avail index value to
> the host so that we get interrupts on the next update.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  drivers/virtio/virtio_ring.c |   14 ++++++++++++++
>  1 files changed, 14 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 507d6eb..3a3ed75 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -320,6 +320,14 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
>  	ret = vq->data[i];
>  	detach_buf(vq, i);
>  	vq->last_used_idx++;
> +	/* If we expect an interrupt for the next entry, tell host
> +	 * by writing event index and flush out the write before
> +	 * the read in the next get_buf call. */
> +	if (!(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT)) {
> +		vring_used_event(&vq->vring) = vq->last_used_idx;
> +		virtio_mb();
> +	}
> +

Hmm, so you're still using the avail->flags; it's just if thresholding
is enabled the host will ignore it?

It's a little subtle, but it keeps this patch small.  Perhaps we'll want
to make it more explicit later.

Thanks,
Rusty.
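
The subtlety noted above, that avail->flags keeps steering the guest even
though the host no longer reads it, can be modelled in a few lines. This
is a rough sketch under assumptions, not the driver: struct fake_ring,
guest_get_buf() and host_push_used() are hypothetical names, need_event()
is assumed to match vring_need_event(), NO_INTERRUPT stands in for
VRING_AVAIL_F_NO_INTERRUPT, and the virtio_mb() barrier is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NO_INTERRUPT 1 /* stands in for VRING_AVAIL_F_NO_INTERRUPT */

/* Assumed form of vring_need_event(): event_idx in [old, new_idx) mod 2^16. */
static bool need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
}

struct fake_ring {
	uint16_t avail_flags;   /* only consulted by the guest in this model */
	uint16_t used_event;    /* published by the guest, read by the host */
	uint16_t used_idx;      /* written by the host */
	uint16_t last_used_idx; /* guest's consumption cursor */
};

/* Guest side: consume one used entry and, if interrupts are wanted,
 * publish the index of the next entry we expect an interrupt for. */
static void guest_get_buf(struct fake_ring *r)
{
	r->last_used_idx++;
	if (!(r->avail_flags & NO_INTERRUPT))
		r->used_event = r->last_used_idx;
}

/* Host side: add n used entries and decide whether to interrupt,
 * looking only at used_event and never at avail_flags. */
static bool host_push_used(struct fake_ring *r, uint16_t n)
{
	uint16_t old = r->used_idx;

	assert(n != 0);
	r->used_idx = (uint16_t)(old + n);
	return need_event(r->used_event, r->used_idx, old);
}
```

In this model, setting NO_INTERRUPT does not reach the host directly. It
only stops the guest from advancing used_event, so the published index
goes stale and later batches of used entries no longer cross it.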

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 08/18] virtio_ring: support for used_event idx feature
  2011-05-04 20:51   ` Michael S. Tsirkin
  (?)
  (?)
@ 2011-05-09  4:17   ` Rusty Russell
  -1 siblings, 0 replies; 145+ messages in thread
From: Rusty Russell @ 2011-05-09  4:17 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Krishna Kumar, Carsten Otte, lguest, Shirley Ma, kvm, linux-s390,
	netdev, habanero, Heiko Carstens, linux-kernel, virtualization,
	steved, Christian Borntraeger, Tom Lendacky, Martin Schwidefsky,
	linux390

On Wed, 4 May 2011 23:51:38 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> Add support for the used_event idx feature: when enabling
> interrupts, publish the current avail index value to
> the host so that we get interrupts on the next update.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  drivers/virtio/virtio_ring.c |   14 ++++++++++++++
>  1 files changed, 14 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 507d6eb..3a3ed75 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -320,6 +320,14 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
>  	ret = vq->data[i];
>  	detach_buf(vq, i);
>  	vq->last_used_idx++;
> +	/* If we expect an interrupt for the next entry, tell host
> +	 * by writing event index and flush out the write before
> +	 * the read in the next get_buf call. */
> +	if (!(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT)) {
> +		vring_used_event(&vq->vring) = vq->last_used_idx;
> +		virtio_mb();
> +	}
> +

Hmm, so you're still using the avail->flags; it's just if thresholding
is enabled the host will ignore it?

It's a little subtle, but it keeps this patch small.  Perhaps we'll want
to make it more explicit later.

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 09/18] virtio: use avail_event index
@ 2011-05-09  4:33     ` Rusty Russell
  0 siblings, 0 replies; 145+ messages in thread
From: Rusty Russell @ 2011-05-09  4:33 UTC (permalink / raw)
  To: Michael S. Tsirkin, linux-kernel
  Cc: Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

On Wed, 4 May 2011 23:51:47 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> Use the new avail_event feature to reduce the number
> of exits from the guest.

Figures here would be nice :)

> @@ -228,6 +237,12 @@ add_head:
>  	 * new available array entries. */
>  	virtio_wmb();
>  	vq->vring.avail->idx++;
> +	/* If the driver never bothers to kick in a very long while,
> +	 * avail index might wrap around. If that happens, invalidate
> +	 * kicked_avail index we stored. TODO: make sure all drivers
> +	 * kick at least once in 2^16 and remove this. */
> +	if (unlikely(vq->vring.avail->idx == vq->kicked_avail))
> +		vq->kicked_avail_valid = false;

If they don't, they're already buggy.  Simply do:
        WARN_ON(vq->vring.avail->idx == vq->kicked_avail);
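
As a rough illustration of why this only triggers for a driver that never kicks: the avail index is 16 bits wide, so it takes a full 2^16 additions before it can equal the value recorded at the last kick. The helper below is a hypothetical demonstration, not code from the patch.

```c
#include <stdint.h>

/* Count how many buffers must be added after a kick before the 16-bit
 * avail index is again equal to the index recorded at that kick. */
static inline uint32_t adds_until_wrap(uint16_t kicked_avail)
{
	uint16_t idx = kicked_avail;
	uint32_t adds = 0;

	do {
		idx++;		/* wraps from 65535 back to 0 */
		adds++;
	} while (idx != kicked_avail);

	return adds;
}
```

The count is the same for every starting index, so the condition depends only on how long the driver goes without kicking.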

> +static bool vring_notify(struct vring_virtqueue *vq)
> +{
> +	u16 old, new;
> +	bool v;
> +	if (!vq->event)
> +		return !(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY);
> +
> +	v = vq->kicked_avail_valid;
> +	old = vq->kicked_avail;
> +	new = vq->kicked_avail = vq->vring.avail->idx;
> +	vq->kicked_avail_valid = true;
> +	if (unlikely(!v))
> +		return true;

This is the only place you actually used kicked_avail_valid.  Is it
possible to initialize it in such a way that you can remove this?
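
For context, a sketch of the wrap-safe comparison that an old/new pair like the one above typically feeds into. It assumes the usual event-index test (the same shape as the need_event helper referred to elsewhere in this thread); the name and signature here are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Did the index step past `event` while moving from old_idx to
 * new_idx?  All arithmetic is modulo 2^16, so wrap-around is fine. */
static inline bool need_event(uint16_t event, uint16_t new_idx,
			      uint16_t old_idx)
{
	return (uint16_t)(new_idx - event - 1) <
	       (uint16_t)(new_idx - old_idx);
}
```

Note that when old_idx equals new_idx the test is always false, so whatever initial value stands in for "old" before the first kick has to be chosen with care if the validity flag is dropped.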

> @@ -482,6 +517,8 @@ void vring_transport_features(struct virtio_device *vdev)
>  			break;
>  		case VIRTIO_RING_F_USED_EVENT_IDX:
>  			break;
> +		case VIRTIO_RING_F_AVAIL_EVENT_IDX:
> +			break;
>  		default:
>  			/* We don't understand this bit. */
>  			clear_bit(i, vdev->features);

Does this belong in a prior patch?

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 14/18] virtio: add api for delayed callbacks
@ 2011-05-09  5:57     ` Rusty Russell
  0 siblings, 0 replies; 145+ messages in thread
From: Rusty Russell @ 2011-05-09  5:57 UTC (permalink / raw)
  To: Michael S. Tsirkin, linux-kernel
  Cc: Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

On Wed, 4 May 2011 23:52:33 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> Add an API that tells the other side that callbacks
> should be delayed until a lot of work has been done.
> Implement using the new used_event feature.

Since you're going to add a capacity query anyway, why not add the
threshold argument here?

Then the caller can choose how much space it needs.  Maybe net and block
will want different things...
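
A minimal sketch of what a caller-supplied threshold could translate to, assuming the event-index convention where publishing used_event = N requests an interrupt when the used index moves past N. The function name and the treatment of a zero threshold are assumptions for illustration, not the API proposed in the patch.

```c
#include <stdint.h>

/* Hypothetical: compute the used_event value to publish so that the
 * host interrupts only after `threshold` more buffers have been used.
 * A threshold of 0 is treated as 1 (interrupt on the next buffer). */
static inline uint16_t delayed_used_event(uint16_t last_used_idx,
					  uint16_t threshold)
{
	if (threshold == 0)
		threshold = 1;

	/* Arithmetic is modulo 2^16, matching the ring indexes. */
	return (uint16_t)(last_used_idx + threshold - 1);
}
```

A net driver might pass a large fraction of its outstanding buffers while a block driver might pass 1; the helper itself does not care.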

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 3/3] virtio_ring: need_event api comment fix
@ 2011-05-09  5:59       ` Rusty Russell
  0 siblings, 0 replies; 145+ messages in thread
From: Rusty Russell @ 2011-05-09  5:59 UTC (permalink / raw)
  To: Michael S. Tsirkin, linux-kernel
  Cc: Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

On Thu, 5 May 2011 18:08:17 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> fix typo in a comment: size -> side
> 
> Reported-by: Stefan Hajnoczi <stefanha@gmail.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

I could merge these together for you, but I *really* want benchmarks in
these commit messages.

Thanks,
Rusty.
PS. Was away last week, hence the delay on this...

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 00/18] virtio and vhost-net performance enhancements
  2011-05-04 20:50 ` Michael S. Tsirkin
                   ` (25 preceding siblings ...)
  (?)
@ 2011-05-11 17:10 ` Krishna Kumar2
  -1 siblings, 0 replies; 145+ messages in thread
From: Krishna Kumar2 @ 2011-05-11 17:10 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Christian Borntraeger, Carsten Otte, habanero, Heiko Carstens,
	kvm, lguest, linux-kernel, linux-s390, linux390, netdev,
	Rusty Russell, Martin Schwidefsky, steved, Tom Lendacky,
	virtualization, Shirley Ma

"Michael S. Tsirkin" <mst@redhat.com> wrote on 05/05/2011 02:20:18 AM:

> [PATCH 00/18] virtio and vhost-net performance enhancements
>
> OK, here's a large patchset that implements the virtio spec update that I
> sent earlier. It supersedes the PUBLISH_USED_IDX patches
> I sent out earlier.
>
> I know it's a lot to ask but please test, and please consider for 2.6.40 :)
>
> I see nice performance improvements: one run showed going from 12
> to 18 Gbit/s host to guest with netperf, but I did not spend a lot
> of time testing performance, so no guarantees it's not a fluke,
> I hope others will try this out and report.
> Pls note I will be away from keyboard for the next week.

I tested with the git tree (which also contains the later
additional patch), and get this error on guest:

May 11 08:06:08 localhost kernel: net eth0: Unexpected TX queue failure: -28
May 11 08:06:08 localhost kernel: net eth0: Unexpected TX queue failure: -28
May 11 08:06:08 localhost kernel: net eth0: Unexpected TX queue failure: -28
May 11 08:06:08 localhost kernel: net eth0: Unexpected TX queue failure: -28
...

The network stops after that and requires a modprobe "restart" to
get it working again. This is with the new qemu/vhost/virtio-net.
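
For readers decoding the log: on Linux, errno 28 is ENOSPC, so the -28 above most likely means the add-buffer call found no free ring descriptors. That reading is an assumption based on the errno value alone, and the helper name below is hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper: does this return code look like "ring full"
 * (no descriptors left) rather than, say, an allocation failure? */
static inline bool is_ring_full_error(int err)
{
	return err == -ENOSPC;
}
```

If the ring really is full and never drains, a missed interrupt or kick on the TX queue would be a natural suspect.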

Please let me know if I am missing something.

thanks,

- KK


^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 06/18] virtio_ring: avail event index interface
@ 2011-05-15 12:47       ` Michael S. Tsirkin
  0 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-15 12:47 UTC (permalink / raw)
  To: Rusty Russell
  Cc: linux-kernel, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Mon, May 09, 2011 at 01:43:15PM +0930, Rusty Russell wrote:
> On Wed, 4 May 2011 23:51:19 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > Define a new feature bit for the host to
> > declare that it uses an avail_event index
> > (like Xen) instead of a feature bit
> > to enable/disable interrupts.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >  include/linux/virtio_ring.h |   11 ++++++++---
> >  1 files changed, 8 insertions(+), 3 deletions(-)
> > 
> > diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
> > index f5c1b75..f791772 100644
> > --- a/include/linux/virtio_ring.h
> > +++ b/include/linux/virtio_ring.h
> > @@ -32,6 +32,9 @@
> >  /* The Guest publishes the used index for which it expects an interrupt
> >   * at the end of the avail ring. Host should ignore the avail->flags field. */
> >  #define VIRTIO_RING_F_USED_EVENT_IDX	29
> > +/* The Host publishes the avail index for which it expects a kick
> > + * at the end of the used ring. Guest should ignore the used->flags field. */
> > +#define VIRTIO_RING_F_AVAIL_EVENT_IDX	32
> 
> Are you really sure we want to separate the two?  Seems a little simpler
> to have one bit to mean "we're publishing our threshold".  For someone
> implementing this from scratch, it's a little simpler.
> 
> Or are there cases where the old style makes more sense?
> 
> Thanks,
> Rusty.

Hmm, it makes debugging easier as each side can disable
publishing separately - I used it all the time when I saw
e.g. networking stuck to guess whether I need to investigate the
interrupt or the exit handling.
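
A small sketch of what the two independent bits buy, assuming the bit numbers from the quoted patch (29 and 32) and a 64-bit feature mask. The struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define F_USED_EVENT_IDX	29	/* value from the quoted patch */
#define F_AVAIL_EVENT_IDX	32	/* value from the quoted patch */

/* Which direction of the hand-off uses the event index. */
struct ring_modes {
	bool interrupts_use_event_idx;	/* host -> guest */
	bool kicks_use_event_idx;	/* guest -> host */
};

/* Hypothetical negotiation: a mode is active only if both sides
 * offer the corresponding feature bit. */
static inline struct ring_modes negotiate(uint64_t host, uint64_t guest)
{
	uint64_t common = host & guest;
	struct ring_modes m = {
		.interrupts_use_event_idx = (common >> F_USED_EVENT_IDX) & 1,
		.kicks_use_event_idx = (common >> F_AVAIL_EVENT_IDX) & 1,
	};

	return m;
}
```

Masking one bit on either side drops just that direction back to the old flag-based scheme, which is the debugging trick described above.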

But I'm not hung up on this.

Let me know pls.

-- 
MST

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 08/18] virtio_ring: support for used_event idx feature
@ 2011-05-15 12:47       ` Michael S. Tsirkin
  0 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-15 12:47 UTC (permalink / raw)
  To: Rusty Russell
  Cc: linux-kernel, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Mon, May 09, 2011 at 01:47:32PM +0930, Rusty Russell wrote:
> On Wed, 4 May 2011 23:51:38 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > Add support for the used_event idx feature: when enabling
> > interrupts, publish the current avail index value to
> > the host so that we get interrupts on the next update.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >  drivers/virtio/virtio_ring.c |   14 ++++++++++++++
> >  1 files changed, 14 insertions(+), 0 deletions(-)
> > 
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 507d6eb..3a3ed75 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -320,6 +320,14 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
> >  	ret = vq->data[i];
> >  	detach_buf(vq, i);
> >  	vq->last_used_idx++;
> > +	/* If we expect an interrupt for the next entry, tell host
> > +	 * by writing event index and flush out the write before
> > +	 * the read in the next get_buf call. */
> > +	if (!(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT)) {
> > +		vring_used_event(&vq->vring) = vq->last_used_idx;
> > +		virtio_mb();
> > +	}
> > +
> 
> Hmm, so you're still using the avail->flags; it's just if thresholding
> is enabled the host will ignore it?
> 
> It's a little subtle, but it keeps this patch small.

Right, that's exactly why I do it this way.

> Perhaps we'll want to make it more explicit later.
> 
> Thanks,
> Rusty.

Yes, e.g. it might be better to avoid touching that cache line,
and track the current status in a private field in the guest.
But I was unable to measure any effect from doing it either way.
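
A rough sketch of the private-field alternative, for comparison with the quoted hunk. The struct, its fields, and the function are hypothetical; used_event here stands in for the slot in the shared ring, and the memory barrier the real code needs is only noted in a comment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical guest-side state. */
struct guest_vq {
	uint16_t last_used_idx;
	uint16_t used_event;		/* stand-in for the shared slot */
	bool interrupts_enabled;	/* private: never read by the host */
};

/* Consume one used buffer and, if interrupts are wanted, republish
 * the event index without reading avail->flags. */
static inline void guest_consume_one(struct guest_vq *vq)
{
	vq->last_used_idx++;
	if (vq->interrupts_enabled) {
		vq->used_event = vq->last_used_idx;
		/* Real code would issue a full memory barrier here. */
	}
}
```

The only difference from the quoted hunk is where the "interrupts wanted" state lives, which fits the observation that no performance difference was measurable.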

-- 
MST

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 14/18] virtio: add api for delayed callbacks
@ 2011-05-15 12:48       ` Michael S. Tsirkin
  0 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-15 12:48 UTC (permalink / raw)
  To: Rusty Russell
  Cc: linux-kernel, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Mon, May 09, 2011 at 03:27:33PM +0930, Rusty Russell wrote:
> On Wed, 4 May 2011 23:52:33 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > Add an API that tells the other side that callbacks
> > should be delayed until a lot of work has been done.
> > Implement using the new used_event feature.
> 
> Since you're going to add a capacity query anyway, why not add the
> threshold argument here?

I thought that if we keep the API kind of generic
there might be more of a chance that future transports
will be able to implement it. For example, with an
old host we can't commit to a specific index.

> 
> Then the caller can choose how much space it needs.  Maybe net and block
> will want different things...
> 
> Cheers,
> Rusty.

Hmm, maybe. OK.

-- 
MST

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 09/18] virtio: use avail_event index
@ 2011-05-15 13:55       ` Michael S. Tsirkin
  0 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-15 13:55 UTC (permalink / raw)
  To: Rusty Russell
  Cc: linux-kernel, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Mon, May 09, 2011 at 02:03:26PM +0930, Rusty Russell wrote:
> On Wed, 4 May 2011 23:51:47 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > Use the new avail_event feature to reduce the number
> > of exits from the guest.
> 
> Figures here would be nice :)

You mean ASCII art in comments?

> > @@ -228,6 +237,12 @@ add_head:
> >  	 * new available array entries. */
> >  	virtio_wmb();
> >  	vq->vring.avail->idx++;
> > +	/* If the driver never bothers to kick in a very long while,
> > +	 * avail index might wrap around. If that happens, invalidate
> > +	 * kicked_avail index we stored. TODO: make sure all drivers
> > +	 * kick at least once in 2^16 and remove this. */
> > +	if (unlikely(vq->vring.avail->idx == vq->kicked_avail))
> > +		vq->kicked_avail_valid = false;
> 
> If they don't, they're already buggy.  Simply do:
>         WARN_ON(vq->vring.avail->idx == vq->kicked_avail);

Hmm, but does it say that somewhere?
It seems that most drivers use locking to prevent
posting more buffers while they drain the used ring,
and also kick at least once in vq->num buffers,
which I guess in the end would work out fine
as vq num is never as large as 2^15.
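
To make the wrap-around concern concrete, here is a sketch of a 16-bit
window check. It assumes the host publishes the avail index for which it
wants a kick, and that the guest compares it against the index range
added since the last kick. kick_needed() is a hypothetical name; the
quoted hunk does not show the actual comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if event_idx falls in [old_idx, new_idx),
 * computed modulo 2^16 so that it stays correct across index wrap.
 * old_idx is the avail index at the previous kick, new_idx the current one.
 */
static bool kick_needed(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}
```

If exactly 2^16 buffers are added without a kick, old_idx equals new_idx,
the window looks empty and the check returns false. That is the case the
kicked_avail_valid flag guards against.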

> > +static bool vring_notify(struct vring_virtqueue *vq)
> > +{
> > +	u16 old, new;
> > +	bool v;
> > +	if (!vq->event)
> > +		return !(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY);
> > +
> > +	v = vq->kicked_avail_valid;
> > +	old = vq->kicked_avail;
> > +	new = vq->kicked_avail = vq->vring.avail->idx;
> > +	vq->kicked_avail_valid = true;
> > +	if (unlikely(!v))
> > +		return true;
> 
> This is the only place you actually used kicked_avail_valid.  Is it
> possible to initialize it in such a way that you can remove this?

If we kill the code above as you suggested, likely yes.
Maybe an explicit flag is more obvious?

> > @@ -482,6 +517,8 @@ void vring_transport_features(struct virtio_device *vdev)
> >  			break;
> >  		case VIRTIO_RING_F_USED_EVENT_IDX:
> >  			break;
> > +		case VIRTIO_RING_F_AVAIL_EVENT_IDX:
> > +			break;
> >  		default:
> >  			/* We don't understand this bit. */
> >  			clear_bit(i, vdev->features);
> 
> Does this belong in a prior patch?
> 
> Thanks,
> Rusty.

Well if we don't support the feature in the ring we should not
ack the feature, right?
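
A sketch of that rule on a plain 64-bit mask, for illustration: every
transport feature bit the ring code does not implement is cleared, so it
is never acked. The bit numbers 29 and 32 are the ones from this series;
the 28..40 transport range and the function name are assumptions made
for this example only.

```c
#include <stdint.h>

#define TRANSPORT_F_START	28	/* assumed range, for illustration */
#define TRANSPORT_F_END		40
#define RING_F_USED_EVENT_IDX	29
#define RING_F_AVAIL_EVENT_IDX	32

/* Hypothetical stand-in for the transport feature filter. */
static uint64_t filter_transport_features(uint64_t features)
{
	unsigned int i;

	for (i = TRANSPORT_F_START; i < TRANSPORT_F_END; i++) {
		switch (i) {
		case RING_F_USED_EVENT_IDX:
		case RING_F_AVAIL_EVENT_IDX:
			break;	/* implemented by the ring: keep the bit */
		default:
			/* We don't understand this bit: don't ack it. */
			features &= ~(UINT64_C(1) << i);
		}
	}
	return features;
}
```

Bits outside the transport range belong to the device driver and are
left untouched.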

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 06/18] virtio_ring: avail event index interface
@ 2011-05-16  6:23         ` Rusty Russell
  0 siblings, 0 replies; 145+ messages in thread
From: Rusty Russell @ 2011-05-16  6:23 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Sun, 15 May 2011 15:47:27 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Mon, May 09, 2011 at 01:43:15PM +0930, Rusty Russell wrote:
> > On Wed, 4 May 2011 23:51:19 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > >  #define VIRTIO_RING_F_USED_EVENT_IDX	29
> > > +/* The Host publishes the avail index for which it expects a kick
> > > + * at the end of the used ring. Guest should ignore the used->flags field. */
> > > +#define VIRTIO_RING_F_AVAIL_EVENT_IDX	32
> > 
> > Are you really sure we want to separate the two?  Seems a little simpler
> > to have one bit to mean "we're publishing our threshold".  For someone
> > implementing this from scratch, it's a little simpler.
> > 
> > Or are there cases where the old style makes more sense?
> > 
> > Thanks,
> > Rusty.
> 
> Hmm, it makes debugging easier as each side can disable
> publishing separately - I used it all the time when I saw
> e.g. networking stuck to guess whether I need to investigate the
> interrupt or the exit handling.
> 
> But I'm not hung up on this.
> 
> Let me know pls.

If we combine them into one, then these patches no longer depend on
the feature bit expansion, which is worthwhile (though I'll take both).

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 09/18] virtio: use avail_event index
@ 2011-05-16  7:12         ` Rusty Russell
  0 siblings, 0 replies; 145+ messages in thread
From: Rusty Russell @ 2011-05-16  7:12 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Sun, 15 May 2011 16:55:41 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Mon, May 09, 2011 at 02:03:26PM +0930, Rusty Russell wrote:
> > On Wed, 4 May 2011 23:51:47 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > Use the new avail_event feature to reduce the number
> > > of exits from the guest.
> > 
> > Figures here would be nice :)
> 
> You mean ASCII art in comments?

I mean benchmarks of some kind.

> 
> > > @@ -228,6 +237,12 @@ add_head:
> > >  	 * new available array entries. */
> > >  	virtio_wmb();
> > >  	vq->vring.avail->idx++;
> > > +	/* If the driver never bothers to kick in a very long while,
> > > +	 * avail index might wrap around. If that happens, invalidate
> > > +	 * kicked_avail index we stored. TODO: make sure all drivers
> > > +	 * kick at least once in 2^16 and remove this. */
> > > +	if (unlikely(vq->vring.avail->idx == vq->kicked_avail))
> > > +		vq->kicked_avail_valid = false;
> > 
> > If they don't, they're already buggy.  Simply do:
> >         WARN_ON(vq->vring.avail->idx == vq->kicked_avail);
> 
> Hmm, but does it say that somewhere?

AFAICT it's a corollary of:
1) You have a finite ring of size <= 2^16.
2) You need to kick the other side once you've done some work.

> > > @@ -482,6 +517,8 @@ void vring_transport_features(struct virtio_device *vdev)
> > >  			break;
> > >  		case VIRTIO_RING_F_USED_EVENT_IDX:
> > >  			break;
> > > +		case VIRTIO_RING_F_AVAIL_EVENT_IDX:
> > > +			break;
> > >  		default:
> > >  			/* We don't understand this bit. */
> > >  			clear_bit(i, vdev->features);
> > 
> > Does this belong in a prior patch?
> > 
> > Thanks,
> > Rusty.
> 
> Well if we don't support the feature in the ring we should not
> ack the feature, right?

Ah, you're right.

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 14/18] virtio: add api for delayed callbacks
@ 2011-05-16  7:13         ` Rusty Russell
  0 siblings, 0 replies; 145+ messages in thread
From: Rusty Russell @ 2011-05-16  7:13 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Sun, 15 May 2011 15:48:18 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Mon, May 09, 2011 at 03:27:33PM +0930, Rusty Russell wrote:
> > On Wed, 4 May 2011 23:52:33 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > Add an API that tells the other side that callbacks
> > > should be delayed until a lot of work has been done.
> > > Implement using the new used_event feature.
> > 
> > Since you're going to add a capacity query anyway, why not add the
> > threshold argument here?
> 
> I thought that if we keep the API kind of generic
> there might be more of a chance that future transports
> will be able to implement it. For example, with an
> old host we can't commit to a specific index.

No, it's always a hint anyway: you can be notified before the threshold
is reached.  But best make it explicit I think.
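
Something along these lines, as a rough sketch only: the function name,
its arguments and the clamping policy are hypothetical. It assumes the
event index names the used entry whose arrival should raise the
interrupt, so a threshold of 1 means "the next entry".

```c
#include <stdint.h>

/*
 * Hypothetical helper: pick the used_event index to publish so that the
 * callback is delayed until `threshold` more buffers have been used.
 * The threshold is clamped to the number of outstanding buffers, and to
 * at least 1, so the event can always be reached.
 */
static uint16_t delayed_used_event(uint16_t last_used_idx, uint16_t avail_idx,
				   uint16_t threshold)
{
	uint16_t outstanding = (uint16_t)(avail_idx - last_used_idx);

	if (threshold > outstanding)
		threshold = outstanding;
	if (threshold == 0)
		threshold = 1;
	return (uint16_t)(last_used_idx + threshold - 1);
}
```

An old host that ignores the event index would simply interrupt
earlier, which is still within "hint" semantics.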

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 06/18] virtio_ring: avail event index interface
@ 2011-05-17  6:00           ` Michael S. Tsirkin
  0 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-17  6:00 UTC (permalink / raw)
  To: Rusty Russell
  Cc: linux-kernel, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Mon, May 16, 2011 at 03:53:19PM +0930, Rusty Russell wrote:
> On Sun, 15 May 2011 15:47:27 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Mon, May 09, 2011 at 01:43:15PM +0930, Rusty Russell wrote:
> > > On Wed, 4 May 2011 23:51:19 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > >  #define VIRTIO_RING_F_USED_EVENT_IDX	29
> > > > +/* The Host publishes the avail index for which it expects a kick
> > > > + * at the end of the used ring. Guest should ignore the used->flags field. */
> > > > +#define VIRTIO_RING_F_AVAIL_EVENT_IDX	32
> > > 
> > > Are you really sure we want to separate the two?  Seems a little simpler
> > > to have one bit to mean "we're publishing our threshold".  For someone
> > > implementing this from scratch, it's a little simpler.
> > > 
> > > Or are there cases where the old style makes more sense?
> > > 
> > > Thanks,
> > > Rusty.
> > 
> > Hmm, it makes debugging easier as each side can disable
> > publishing separately - I used it all the time when I saw
> > e.g. networking stuck to guess whether I need to investigate the
> > interrupt or the exit handling.
> > 
> > But I'm not hung up on this.
> > 
> > Let me know pls.
> 
> If we combine them into one, then these patches no longer depend on
> the feature bit expansion, which is worthwhile (though I'll take both).
> 
> Thanks,
> Rusty.

Yes, I know. But if we do expand feature bits anyway, for debugging
and profiling if nothing else it's useful to have them separate ...
If you take both why does the order matter?
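
To spell out the dependency being discussed: bit 32 does not fit in a
32-bit feature word, so a separate AVAIL_EVENT_IDX bit needs the feature
bit expansion, while a single combined bit at 29 would not. A minimal
sketch, assuming features are modelled as a plain 64-bit mask;
has_feature() and through_32bit_transport() are hypothetical names used
only for illustration, not the kernel API:

```c
#include <assert.h>
#include <stdint.h>

#define VIRTIO_RING_F_USED_EVENT_IDX	29
#define VIRTIO_RING_F_AVAIL_EVENT_IDX	32	/* value from the patch quoted above */

/* Hypothetical helper, not the kernel API: test one bit of a feature mask. */
static int has_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64);	/* shifting by >= 64 would be undefined */
	return (int)((features >> bit) & 1);
}

/* Hypothetical model of a transport that only carries 32 feature bits. */
static uint64_t through_32bit_transport(uint64_t features)
{
	return (uint32_t)features;	/* bits 32..63 are lost */
}
```

With both bits set, has_feature() still reports bit 29 after the mask
has gone through through_32bit_transport(), but bit 32 comes back clear.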

-- 
MST

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 09/18] virtio: use avail_event index
@ 2011-05-17  6:10           ` Michael S. Tsirkin
  0 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-17  6:10 UTC (permalink / raw)
  To: Rusty Russell
  Cc: linux-kernel, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Mon, May 16, 2011 at 04:42:21PM +0930, Rusty Russell wrote:
> On Sun, 15 May 2011 16:55:41 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Mon, May 09, 2011 at 02:03:26PM +0930, Rusty Russell wrote:
> > > On Wed, 4 May 2011 23:51:47 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > Use the new avail_event feature to reduce the number
> > > > of exits from the guest.
> > > 
> > > Figures here would be nice :)
> > 
> > You mean ASCII art in comments?
> 
> I mean benchmarks of some kind.

:)

> > 
> > > > @@ -228,6 +237,12 @@ add_head:
> > > >  	 * new available array entries. */
> > > >  	virtio_wmb();
> > > >  	vq->vring.avail->idx++;
> > > > +	/* If the driver never bothers to kick in a very long while,
> > > > +	 * avail index might wrap around. If that happens, invalidate
> > > > +	 * kicked_avail index we stored. TODO: make sure all drivers
> > > > +	 * kick at least once in 2^16 and remove this. */
> > > > +	if (unlikely(vq->vring.avail->idx == vq->kicked_avail))
> > > > +		vq->kicked_avail_valid = true;
> > > 
> > > If they don't, they're already buggy.  Simply do:
> > >         WARN_ON(vq->vring.avail->idx == vq->kicked_avail);
> > 
> > Hmm, but does it say that somewhere?
> 
> AFAICT it's a corollary of:
> 1) You have a finite ring of size <= 2^16.
> 2) You need to kick the other side once you've done some work.

Well one can imagine a driver doing:

	while (virtqueue_get_buf()) {
		virtqueue_add_buf()
	}
	virtqueue_kick()

which looks sensible (batch kicks) but might
process any number of bufs between kicks.

If we look at drivers closely enough, I think none
of them do the equivalent of the above, but not 100% sure.
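
The wrap-around itself is easy to show in isolation. A minimal sketch,
assuming only that both indices are free-running 16-bit counters as in
the hunk quoted above; struct ring_model and its helpers are hypothetical
and are not the virtio_ring code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: just the two 16-bit indices involved in the check. */
struct ring_model {
	uint16_t avail_idx;	/* free-running, wraps at 2^16 */
	uint16_t kicked_avail;	/* avail index at the time of the last kick */
};

/* Record a kick: remember where the avail index was. */
static void model_kick(struct ring_model *r)
{
	assert(r != NULL);
	r->kicked_avail = r->avail_idx;
}

/* Add n buffers without kicking. Returns how many times the avail index
 * landed on the stored kicked_avail value along the way. */
static unsigned int model_add_without_kick(struct ring_model *r, uint32_t n)
{
	unsigned int hits = 0;

	assert(r != NULL);
	while (n--) {
		r->avail_idx++;		/* wraps from 0xffff to 0 */
		if (r->avail_idx == r->kicked_avail)
			hits++;
	}
	return hits;
}
```

In this model 65535 adds after a kick never hit the stored index, and
the 65536th does, which is the case the check in the patch guards
against.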


> > > > @@ -482,6 +517,8 @@ void vring_transport_features(struct virtio_device *vdev)
> > > >  			break;
> > > >  		case VIRTIO_RING_F_USED_EVENT_IDX:
> > > >  			break;
> > > > +		case VIRTIO_RING_F_AVAIL_EVENT_IDX:
> > > > +			break;
> > > >  		default:
> > > >  			/* We don't understand this bit. */
> > > >  			clear_bit(i, vdev->features);
> > > 
> > > Does this belong in a prior patch?
> > > 
> > > Thanks,
> > > Rusty.
> > 
> > Well if we don't support the feature in the ring we should not
> > ack the feature, right?
> 
> Ah, you're right.
> 
> Thanks,
> Rusty.

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 09/18] virtio: use avail_event index
  2011-05-16  7:12         ` Rusty Russell
                           ` (2 preceding siblings ...)
  (?)
@ 2011-05-17 16:23         ` Tom Lendacky
  -1 siblings, 0 replies; 145+ messages in thread
From: Tom Lendacky @ 2011-05-17 16:23 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Michael S. Tsirkin, linux-kernel, Carsten Otte,
	Christian Borntraeger, linux390, Martin Schwidefsky,
	Heiko Carstens, Shirley Ma, lguest, virtualization, netdev,
	linux-s390, kvm, Krishna Kumar, steved, habanero


On Monday, May 16, 2011 02:12:21 AM Rusty Russell wrote:
> On Sun, 15 May 2011 16:55:41 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Mon, May 09, 2011 at 02:03:26PM +0930, Rusty Russell wrote:
> > > On Wed, 4 May 2011 23:51:47 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > Use the new avail_event feature to reduce the number
> > > > of exits from the guest.
> > > 
> > > Figures here would be nice :)
> > 
> > You mean ASCII art in comments?
> 
> I mean benchmarks of some kind.

I'm working on getting some benchmark results for the patches.  I should 
hopefully have something in the next day or two.

Tom
> 
> > > > @@ -228,6 +237,12 @@ add_head:
> > > >  	 * new available array entries. */
> > > >  	virtio_wmb();
> > > >  	vq->vring.avail->idx++;
> > > > +	/* If the driver never bothers to kick in a very long while,
> > > > +	 * avail index might wrap around. If that happens, invalidate
> > > > +	 * kicked_avail index we stored. TODO: make sure all drivers
> > > > +	 * kick at least once in 2^16 and remove this. */
> > > > +	if (unlikely(vq->vring.avail->idx == vq->kicked_avail))
> > > > +		vq->kicked_avail_valid = true;
> > > 
> > > If they don't, they're already buggy.  Simply do:
> > >         WARN_ON(vq->vring.avail->idx == vq->kicked_avail);
> > 
> > Hmm, but does it say that somewhere?
> 
> AFAICT it's a corollary of:
> 1) You have a finite ring of size <= 2^16.
> 2) You need to kick the other side once you've done some work.
> 
> > > > @@ -482,6 +517,8 @@ void vring_transport_features(struct virtio_device *vdev)
> > > >  			break;
> > > >  		case VIRTIO_RING_F_USED_EVENT_IDX:
> > > >  			break;
> > > > +		case VIRTIO_RING_F_AVAIL_EVENT_IDX:
> > > > +			break;
> > > >  		default:
> > > >  			/* We don't understand this bit. */
> > > >  			clear_bit(i, vdev->features);
> > > 
> > > Does this belong in a prior patch?
> > > 
> > > Thanks,
> > > Rusty.
> > 
> > Well if we don't support the feature in the ring we should not
> > ack the feature, right?
> 
> Ah, you're right.
> 
> Thanks,
> Rusty.

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 06/18] virtio_ring: avail event index interface
@ 2011-05-18  0:08             ` Rusty Russell
  0 siblings, 0 replies; 145+ messages in thread
From: Rusty Russell @ 2011-05-18  0:08 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Tue, 17 May 2011 09:00:52 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Mon, May 16, 2011 at 03:53:19PM +0930, Rusty Russell wrote:
> > On Sun, 15 May 2011 15:47:27 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > On Mon, May 09, 2011 at 01:43:15PM +0930, Rusty Russell wrote:
> > > > On Wed, 4 May 2011 23:51:19 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > >  #define VIRTIO_RING_F_USED_EVENT_IDX	29
> > > > > +/* The Host publishes the avail index for which it expects a kick
> > > > > + * at the end of the used ring. Guest should ignore the used->flags field. */
> > > > > +#define VIRTIO_RING_F_AVAIL_EVENT_IDX	32
> > > > 
> > > > Are you really sure we want to separate the two?  Seems a little simpler
> > > > to have one bit to mean "we're publishing our threshold".  For someone
> > > > implementing this from scratch, it's a little simpler.
> > > > 
> > > > Or are there cases where the old style makes more sense?
> > > > 
> > > > Thanks,
> > > > Rusty.
> > > 
> > > Hmm, it makes debugging easier as each side can disable
> > > publishing separately - I used it all the time when I saw
> > > e.g. networking stuck to guess whether I need to investigate the
> > > interrupt or the exit handling.
> > > 
> > > But I'm not hung up on this.
> > > 
> > > Let me know pls.
> > 
> > If we combine them into one, then these patches no longer depend on
> > the feature bit expansion, which is worthwhile (though I'll take both).
> > 
> > Thanks,
> > Rusty.
> 
> Yes, I know. But if we do expand feature bits anyway, for debugging
> and profiling if nothing else it's useful to have them separate ...
> If you take both why does the order matter?

Damage control.  Then if something breaks, it doesn't break everything.

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 09/18] virtio: use avail_event index
@ 2011-05-18  0:19             ` Rusty Russell
  0 siblings, 0 replies; 145+ messages in thread
From: Rusty Russell @ 2011-05-18  0:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Tue, 17 May 2011 09:10:31 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> Well one can imagine a driver doing:
> 
> 	while (virtqueue_get_buf()) {
> 		virtqueue_add_buf()
> 	}
> 	virtqueue_kick()
> 
> which looks sensible (batch kicks) but might
> process any number of bufs between kicks.

No, we currently only expose the buffers in the kick, so it can only
fill the ring doing that.
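
A toy model of that bound, assuming buffers become visible to the other
side only at kick time and, as the worst case, that the other side
completes everything it can see. struct toy_vq and the toy_* helpers are
hypothetical and only count descriptors; they are not the virtqueue API:

```c
#include <assert.h>

/* Toy virtqueue: only counts where each descriptor currently is. */
struct toy_vq {
	unsigned int num;	/* ring size */
	unsigned int free;	/* free descriptors */
	unsigned int pending;	/* added, not yet exposed by a kick */
	unsigned int exposed;	/* visible to the other side */
};

static void toy_check(const struct toy_vq *vq)
{
	/* Every descriptor is in exactly one of the three states. */
	assert(vq->free + vq->pending + vq->exposed == vq->num);
}

static int toy_add_buf(struct toy_vq *vq)
{
	if (!vq->free)
		return -1;
	vq->free--;
	vq->pending++;
	return 0;
}

/* Worst case: every exposed buffer has already been completed. */
static int toy_get_buf(struct toy_vq *vq)
{
	if (!vq->exposed)
		return -1;
	vq->exposed--;
	vq->free++;
	return 0;
}

static void toy_kick(struct toy_vq *vq)
{
	vq->exposed += vq->pending;
	vq->pending = 0;
}

/* The 'echo' loop from the quoted mail; returns the adds done before the kick. */
static unsigned int toy_echo_until_kick(struct toy_vq *vq)
{
	unsigned int adds = 0;

	while (toy_get_buf(vq) == 0) {
		if (toy_add_buf(vq) == 0)
			adds++;
		toy_check(vq);
	}
	toy_kick(vq);
	return adds;
}
```

Under these assumptions the loop stops once nothing exposed is left, so
the number of adds between two kicks never exceeds the ring size.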

We could change that (and maybe that's worth looking at)...

> If we look at drivers closely enough, I think none
> of them do the equivalent of the above, but not 100% sure.

I'm pretty sure we don't have this kind of 'echo' driver yet.  Drivers
tend to take OS requests and queue them.  The only one which does
anything even partially sophisticated is the net driver...

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 09/18] virtio: use avail_event index
@ 2011-05-18  5:43               ` Michael S. Tsirkin
  0 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-18  5:43 UTC (permalink / raw)
  To: Rusty Russell
  Cc: linux-kernel, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Wed, May 18, 2011 at 09:49:42AM +0930, Rusty Russell wrote:
> On Tue, 17 May 2011 09:10:31 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > Well one can imagine a driver doing:
> > 
> > 	while (virtqueue_get_buf()) {
> > 		virtqueue_add_buf()
> > 	}
> > 	virtqueue_kick()
> > 
> > which looks sensible (batch kicks) but might
> > process any number of bufs between kicks.
> 
> No, we currently only expose the buffers in the kick, so it can only
> fill the ring doing that.
> 
> We could change that (and maybe that's worth looking at)...

Yes, I think we should - this way host and guest can process
data in parallel without a kick.

My patchset included that simply because it's one index
less to be confused about.
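
The difference between publishing on kick and publishing on add can be shown with a toy model. This is only a sketch: `struct toy_vq` and the `toy_*` helpers are hypothetical names, not virtio_ring code, and the memory barriers a real ring needs are left out.

```c
#include <stdint.h>

/* Toy driver-side queue state; all names are illustrative. */
struct toy_vq {
	uint16_t avail_idx;  /* index the host can see */
	uint16_t num_added;  /* buffers queued but not yet published */
	unsigned kicks;      /* notifications sent to the host */
};

/* Deferred publish: the add only counts, the host sees nothing yet. */
static void toy_add_buf_deferred(struct toy_vq *vq)
{
	vq->num_added++;
}

/* Immediate publish: the host can start on the buffer before any kick. */
static void toy_add_buf_immediate(struct toy_vq *vq)
{
	vq->avail_idx++;
}

/* Kick: publish anything still pending, then notify the host. */
static void toy_kick(struct toy_vq *vq)
{
	vq->avail_idx += vq->num_added;
	vq->num_added = 0;
	vq->kicks++;
}
```

With the deferred variant the host-visible index stays put until `toy_kick()`, which is why the loop quoted above cannot run ahead of the ring size. With the immediate variant the host can work on buffers while the guest is still adding more.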


> > If we look at drivers closely enough, I think none
> > of them do the equivalent of the above, but not 100% sure.
> 
> I'm pretty sure we don't have this kind of 'echo' driver yet.  Drivers
> tend to take OS requests and queue them.  The only one which does
> anything even partially sophisticated is the net driver...
> 
> Thanks,
> Rusty.

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 14/18] virtio: add api for delayed callbacks
@ 2011-05-19  7:24           ` Michael S. Tsirkin
  0 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-19  7:24 UTC (permalink / raw)
  To: Rusty Russell
  Cc: linux-kernel, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Mon, May 16, 2011 at 04:43:21PM +0930, Rusty Russell wrote:
> On Sun, 15 May 2011 15:48:18 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Mon, May 09, 2011 at 03:27:33PM +0930, Rusty Russell wrote:
> > > On Wed, 4 May 2011 23:52:33 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > Add an API that tells the other side that callbacks
> > > > should be delayed until a lot of work has been done.
> > > > Implement using the new used_event feature.
> > > 
> > > Since you're going to add a capacity query anyway, why not add the
> > > threshold argument here?
> > 
> > I thought that if we keep the API kind of generic
> > there might be more of a chance that future transports
> > will be able to implement it. For example, with an
> > old host we can't commit to a specific index.
> 
> No, it's always a hint anyway: you can be notified before the threshold
> is reached.  But best make it explicit I think.
> 
> Cheers,
> Rusty.


I tried doing that and remembered the real reason I went for this API:

Capacity is limited by descriptor table space, not by
used ring space: each entry in the used ring can free up multiple entries
in the descriptor table. Thus the ring can't promise a
callback once capacity reaches N: capacity only becomes available
after we get bufs.
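
A small sketch of why a used-ring count does not translate into capacity. It assumes a hypothetical array recording how many descriptors each outstanding buffer occupies; nothing here is taken from the patches.

```c
#include <stddef.h>

/*
 * Toy model: chain_len[i] is the number of descriptors held by the i-th
 * outstanding buffer (hypothetical bookkeeping, for illustration only).
 * Returns how many descriptors come free once the host has used the
 * first n_used buffers and the driver has reclaimed them.
 */
static unsigned toy_freed_descriptors(const unsigned *chain_len,
				      size_t n_outstanding, size_t n_used)
{
	unsigned freed = 0;
	size_t i;

	if (n_used > n_outstanding)
		n_used = n_outstanding;
	for (i = 0; i < n_used; i++)
		freed += chain_len[i];
	return freed;
}
```

Two used entries free 2 descriptors when every buffer is a single descriptor, but 20 when the first two chains are 18 and 2 long. The used index alone cannot tell these cases apart.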

We could try and make the API pass in the number of freed bufs, however:
- this is not really what virtio-net cares about (it cares about
  capacity)
- if the driver passes a number > number of outstanding bufs, it will
  never get a callback. So to stay correct the driver will need to
  track number of outstanding requests. The simpler API avoids that. 
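
The second point can be sketched as follows, assuming 16-bit free-running indexes. `toy_callback_target` and its parameters are hypothetical; the sketch only shows the bookkeeping a driver would need if it passed in a count itself.

```c
#include <stdint.h>

/*
 * Pick the used index value the host must reach before the driver is
 * called back.
 * last_used:   used index the driver has processed up to.
 * outstanding: buffers handed to the host and not yet reclaimed.
 * wanted:      completions the driver would like to batch up.
 * Clamping to outstanding keeps the target reachable; at least one
 * completion is required so the target is always ahead of last_used.
 */
static uint16_t toy_callback_target(uint16_t last_used, uint16_t outstanding,
				    uint16_t wanted)
{
	if (wanted > outstanding)
		wanted = outstanding;
	if (wanted == 0)
		wanted = 1;
	return (uint16_t)(last_used + wanted);
}
```

Without the clamp, a request for 100 completions with only 4 buffers outstanding would name an index the host never reaches, and the callback would never fire.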


APIs are easy to change so I'm guessing it's not a major blocker:
we can change later when e.g. block tries to
pass in some kind of extra hint: we'll be smarter
about how this API can change then.

Right?

-- 
MST

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 09/18] virtio: use avail_event index
@ 2011-05-19  7:27               ` Michael S. Tsirkin
  0 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-19  7:27 UTC (permalink / raw)
  To: Rusty Russell
  Cc: linux-kernel, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Wed, May 18, 2011 at 09:49:42AM +0930, Rusty Russell wrote:
> On Tue, 17 May 2011 09:10:31 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > Well one can imagine a driver doing:
> > 
> > 	while (virtqueue_get_buf()) {
> > 		virtqueue_add_buf()
> > 	}
> > 	virtqueue_kick()
> > 
> > which looks sensible (batch kicks) but might
> > process any number of bufs between kicks.
> 
> No, we currently only expose the buffers in the kick, so it can only
> fill the ring doing that.
> 
> We could change that (and maybe that's worth looking at)...

That's actually what one of the early patches in the series did.
I guess I can try to reorder the patches. I do believe
it makes sense to publish immediately, as this way
the host can work in parallel with the guest.

> > If we look at drivers closely enough, I think none
> > of them do the equivalent of the above, but not 100% sure.
> 
> I'm pretty sure we don't have this kind of 'echo' driver yet.  Drivers
> tend to take OS requests and queue them.  The only one which does
> anything even partially sophisticated is the net driver...
> 
> Thanks,
> Rusty.

I guess I'll just need to do the legwork and check then.

-- 
MST

^ permalink raw reply	[flat|nested] 145+ messages in thread

* Re: [PATCH 14/18] virtio: add api for delayed callbacks
@ 2011-05-20  7:43             ` Rusty Russell
  0 siblings, 0 replies; 145+ messages in thread
From: Rusty Russell @ 2011-05-20  7:43 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Thu, 19 May 2011 10:24:12 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Mon, May 16, 2011 at 04:43:21PM +0930, Rusty Russell wrote:
> > On Sun, 15 May 2011 15:48:18 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > On Mon, May 09, 2011 at 03:27:33PM +0930, Rusty Russell wrote:
> > > > On Wed, 4 May 2011 23:52:33 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > Add an API that tells the other side that callbacks
> > > > > should be delayed until a lot of work has been done.
> > > > > Implement using the new used_event feature.
> > > > 
> > > > Since you're going to add a capacity query anyway, why not add the
> > > > threshold argument here?
> > > 
> > > I thought that if we keep the API kind of generic
> > > there might be more of a chance that future transports
> > > will be able to implement it. For example, with an
> > > old host we can't commit to a specific index.
> > 
> > No, it's always a hint anyway: you can be notified before the threshold
> > is reached.  But best make it explicit I think.
> > 
> > Cheers,
> > Rusty.
> 
> 
> I tried doing that and remembered the real reason I went for this API:
> 
> capacity is limited by descriptor table space, not
> used ring space: each entry in the used ring frees up multiple entries
> in the descriptor ring. Thus the ring can't provide
> callback after capacity is N: capacity is only available
> after we get bufs.

I think this indicates a problem, but I haven't reviewed your patches
except very cursorily.  I am slack...

> We could try and make the API pass in the number of freed bufs, however:
> - this is not really what virtio-net cares about (it cares about
>   capacity)

Yes, max sg elements and max requests are separate, though in the
current virtio implementation remaining sg <= remaining request slots.

That's why we currently feed back remaining descriptors to the driver,
not the number of request slots.

This implies that the thresholds should refer to descriptor numbers
(i.e. wake me when there are this many descriptors freed), not to the used
ring at all.  Which means we're barking up the wrong tree...

I think this needs more thought.

> - if the driver passes a number > number of outstanding bufs, it will
>   never get a callback. So to stay correct the driver will need to
>   track number of outstanding requests. The simpler API avoids that. 

I think the driver should simply say "wake me when you have this many
descriptors free".  And set it during initialization, rather than every
time.  The virtio_ring code should handle it from there.

Perhaps that can be done with the current technique, where the
virtio_ring makes an educated guess on when sufficient capacity will be
available...
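
A minimal sketch of the "wake me when you have this many descriptors free" idea, with the threshold set once at initialization. `struct toy_ring` and the `toy_ring_*` helpers are hypothetical and do not exist in virtio_ring; how the ring would turn this into a used-index hint for the host is left open, as in the discussion above.

```c
#include <stdbool.h>

/* Toy ring accounting; all names are illustrative. */
struct toy_ring {
	unsigned num_free;        /* free descriptors right now */
	unsigned wake_threshold;  /* set once at initialization */
};

static void toy_ring_init(struct toy_ring *r, unsigned size,
			  unsigned threshold)
{
	r->num_free = size;
	r->wake_threshold = threshold;
}

/* Driver queues a request using n descriptors; false if it does not fit. */
static bool toy_ring_take(struct toy_ring *r, unsigned n)
{
	if (n > r->num_free)
		return false;
	r->num_free -= n;
	return true;
}

/*
 * A completed request returns n descriptors. Returns true exactly when
 * this reclaim lifts the free count from below the threshold to at or
 * above it, which is the moment the driver asked to be woken.
 */
static bool toy_ring_reclaim(struct toy_ring *r, unsigned n)
{
	bool was_below = r->num_free < r->wake_threshold;

	r->num_free += n;
	return was_below && r->num_free >= r->wake_threshold;
}
```

The driver never tracks outstanding requests here: it states its need once, and the ring code decides when that need is met.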

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 145+ messages in thread

* [PATCH 00/18] virtio and vhost-net performance enhancements
@ 2011-05-04 20:50 Michael S. Tsirkin
  0 siblings, 0 replies; 145+ messages in thread
From: Michael S. Tsirkin @ 2011-05-04 20:50 UTC (permalink / raw)
  Cc: Krishna Kumar, Carsten Otte, lguest, Shirley Ma, kvm, linux-s390,
	netdev, habanero, Heiko Carstens, linux-kernel, virtualization,
	steved, Christian Borntraeger, Tom Lendacky, Martin Schwidefsky,
	linux390

OK, here's a large patchset that implements the virtio spec update that I
sent earlier. It supersedes the PUBLISH_USED_IDX patches
I sent out earlier.

I know it's a lot to ask but please test, and please consider for 2.6.40 :)

I see nice performance improvements: one run showed going from 12
to 18 Gbit/s host to guest with netperf, but I did not spend a lot
of time testing performance, so no guarantees it's not a fluke,
I hope others will try this out and report.
Pls note I will be away from keyboard for the next week.

Essentially we change virtio ring notification
hand-off to work like the one in Xen -
each side publishes an event index, the other one
notifies when it reaches that value -
With the one difference that event index starts at 0,
same as request index (in xen event index starts at 1).
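
The hand-off described above comes down to one wrap-safe comparison. A minimal sketch, assuming 16-bit free-running indexes; the function name and parameter names are illustrative and not taken from the patches.

```c
#include <stdint.h>
#include <stdbool.h>

/*
 * The other side published event_idx and asked to be notified once our
 * index moves past it. We have just advanced our index from old_idx to
 * new_idx. Notify only if event_idx lies in the window [old_idx, new_idx).
 * All arithmetic is modulo 2^16, so index wrap-around is handled.
 */
static bool need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}
```

With an event index of 0, the very first request (index going from 0 to 1) triggers a notification, which matches the event index starting at 0 alongside the request index. Publishing an event index far ahead suppresses notifications until that many requests have gone by.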

Each side of the handoff has a feature bit independent
of the other one, so we can have e.g. interrupts
handled in the new way and exits in the old one.

This is actually what made the patchset larger:
we ran out of feature bits so I had to add some more.
I tested various combinations of hosts and guests and
this code seems to be solid.

With the indexes in place it becomes possible to request an event after
many requests (and not just on the next one as done now). This should fix
the TX queue overrun which currently triggers a storm of interrupts.

The patches are mostly independent and can also be cherry-picked,
hopefully there won't be much need of that.

One dependency I'd like to note is on two cleanup patches:
the patch removing batching of available index updates
and the patch fixing ring capability checks in virtio-net.
This simplified code a bit and made the following patch simpler.

I could unwrap the dependency but prefer as is.

The patchset is on top of net-next which at the time
I last rebased was 15ecd03 - so roughly 2.6.39-rc2.

qemu patch will follow shortly.

Rusty, I think (in the hope it will come to that) it will be easier to
merge vhost and virtio bits in one go. Can all go in through your tree
(Dave in the past acked a very similar patch so should not be a problem)
or from me to Dave Miller.

I see nice performance improvements: e.g. from 12 to 18 Gbit/s host
to guest with netperf, but did not spend a lot of time testing
performance, and I will be away from keyboard for the next week.
I hope others will try this out and report.

Michael S. Tsirkin (17):
  virtio: 64 bit features
  virtio_test: update for 64 bit features
  vhost: fix 64 bit features
  virtio: don't delay avail index update
  virtio: used event index interface
  virtio_ring: avail event index interface
  virtio ring: inline function to check for events
  virtio_ring: support for used_event idx feature
  virtio: use avail_event index
  vhost: utilize used_event index
  vhost: support avail_event idx
  virtio_test: support used_event index
  virtio_test: avail_event index support
  virtio: add api for delayed callbacks
  virtio_net: delay TX callbacks
  virtio_net: fix TX capacity checks using new API
  virtio_net: limit xmit polling

Shirley Ma (1):
  virtio_ring: Add capacity check API

 drivers/lguest/lguest_device.c |    8 +-
 drivers/net/virtio_net.c       |   25 ++++---
 drivers/s390/kvm/kvm_virtio.c  |    8 +-
 drivers/vhost/net.c            |   12 ++--
 drivers/vhost/test.c           |    6 +-
 drivers/vhost/vhost.c          |  139 ++++++++++++++++++++++++++++++----------
 drivers/vhost/vhost.h          |   30 ++++++---
 drivers/virtio/virtio.c        |    8 +-
 drivers/virtio/virtio_pci.c    |   34 ++++++++--
 drivers/virtio/virtio_ring.c   |  105 +++++++++++++++++++++++++++---
 include/linux/virtio.h         |   16 ++++-
 include/linux/virtio_config.h  |   15 +++--
 include/linux/virtio_pci.h     |    9 ++-
 include/linux/virtio_ring.h    |   30 ++++++++-
 tools/virtio/virtio_test.c     |   39 ++++++++++-
 15 files changed, 377 insertions(+), 107 deletions(-)

-- 
1.7.5.53.gc233e


end of thread, other threads:[~2011-05-20  7:52 UTC | newest]

Thread overview: 145+ messages
2011-05-04 20:50 [PATCH 00/18] virtio and vhost-net performance enhancements Michael S. Tsirkin
2011-05-04 20:50 ` Michael S. Tsirkin
2011-05-04 20:50 ` [PATCH 01/18] virtio: 64 bit features Michael S. Tsirkin
2011-05-04 20:50 ` Michael S. Tsirkin
2011-05-04 20:50   ` Michael S. Tsirkin
2011-05-04 20:50 ` [PATCH 02/18] virtio_test: update for " Michael S. Tsirkin
2011-05-04 20:50   ` Michael S. Tsirkin
2011-05-04 20:50   ` Michael S. Tsirkin
2011-05-04 20:50 ` [PATCH 03/18] vhost: fix " Michael S. Tsirkin
2011-05-04 20:50   ` Michael S. Tsirkin
2011-05-04 20:50   ` Michael S. Tsirkin
2011-05-04 20:51 ` [PATCH 04/18] virtio: don't delay avail index update Michael S. Tsirkin
2011-05-04 20:51 ` Michael S. Tsirkin
2011-05-04 20:51   ` Michael S. Tsirkin
2011-05-04 20:51 ` [PATCH 05/18] virtio: used event index interface Michael S. Tsirkin
2011-05-04 20:51   ` Michael S. Tsirkin
2011-05-04 20:51   ` Michael S. Tsirkin
2011-05-04 21:56   ` Tom Lendacky
2011-05-04 21:56   ` Tom Lendacky
2011-05-05  9:38     ` Michael S. Tsirkin
2011-05-05  9:38     ` Michael S. Tsirkin
2011-05-04 20:51 ` [PATCH 06/18] virtio_ring: avail " Michael S. Tsirkin
2011-05-04 20:51   ` Michael S. Tsirkin
2011-05-04 20:51   ` Michael S. Tsirkin
2011-05-09  4:13   ` Rusty Russell
2011-05-09  4:13   ` Rusty Russell
2011-05-09  4:13     ` Rusty Russell
2011-05-15 12:47     ` Michael S. Tsirkin
2011-05-15 12:47     ` Michael S. Tsirkin
2011-05-15 12:47       ` Michael S. Tsirkin
2011-05-16  6:23       ` Rusty Russell
2011-05-16  6:23       ` Rusty Russell
2011-05-16  6:23         ` Rusty Russell
2011-05-17  6:00         ` Michael S. Tsirkin
2011-05-17  6:00         ` Michael S. Tsirkin
2011-05-17  6:00           ` Michael S. Tsirkin
2011-05-18  0:08           ` Rusty Russell
2011-05-18  0:08           ` Rusty Russell
2011-05-18  0:08             ` Rusty Russell
2011-05-04 20:51 ` [PATCH 07/18] virtio ring: inline function to check for events Michael S. Tsirkin
2011-05-04 20:51 ` Michael S. Tsirkin
2011-05-04 20:51   ` Michael S. Tsirkin
2011-05-05  8:34   ` Stefan Hajnoczi
2011-05-05  8:56     ` Michael S. Tsirkin
2011-05-05  8:56     ` Michael S. Tsirkin
2011-05-05  8:34   ` Stefan Hajnoczi
2011-05-04 20:51 ` [PATCH 08/18] virtio_ring: support for used_event idx feature Michael S. Tsirkin
2011-05-04 20:51   ` Michael S. Tsirkin
2011-05-04 20:51   ` Michael S. Tsirkin
2011-05-09  4:17   ` Rusty Russell
2011-05-09  4:17   ` Rusty Russell
2011-05-09  4:17     ` Rusty Russell
2011-05-15 12:47     ` Michael S. Tsirkin
2011-05-15 12:47     ` Michael S. Tsirkin
2011-05-15 12:47       ` Michael S. Tsirkin
2011-05-04 20:51 ` [PATCH 09/18] virtio: use avail_event index Michael S. Tsirkin
2011-05-04 20:51   ` Michael S. Tsirkin
2011-05-04 20:51   ` Michael S. Tsirkin
2011-05-04 21:58   ` Tom Lendacky
2011-05-04 21:58   ` Tom Lendacky
2011-05-05  9:34     ` Michael S. Tsirkin
2011-05-05  9:34     ` Michael S. Tsirkin
2011-05-09  4:33   ` Rusty Russell
2011-05-09  4:33   ` Rusty Russell
2011-05-09  4:33     ` Rusty Russell
2011-05-15 13:55     ` Michael S. Tsirkin
2011-05-15 13:55     ` Michael S. Tsirkin
2011-05-15 13:55       ` Michael S. Tsirkin
2011-05-16  7:12       ` Rusty Russell
2011-05-16  7:12       ` Rusty Russell
2011-05-16  7:12         ` Rusty Russell
2011-05-17  6:10         ` Michael S. Tsirkin
2011-05-17  6:10           ` Michael S. Tsirkin
2011-05-18  0:19           ` Rusty Russell
2011-05-18  0:19           ` Rusty Russell
2011-05-18  0:19             ` Rusty Russell
2011-05-18  5:43             ` Michael S. Tsirkin
2011-05-18  5:43             ` Michael S. Tsirkin
2011-05-18  5:43               ` Michael S. Tsirkin
2011-05-19  7:27             ` Michael S. Tsirkin
2011-05-19  7:27             ` Michael S. Tsirkin
2011-05-19  7:27               ` Michael S. Tsirkin
2011-05-17  6:10         ` Michael S. Tsirkin
2011-05-17 16:23         ` Tom Lendacky
2011-05-17 16:23         ` Tom Lendacky
2011-05-04 20:51 ` [PATCH 10/18] vhost: utilize used_event index Michael S. Tsirkin
2011-05-04 20:51   ` Michael S. Tsirkin
2011-05-04 20:51   ` Michael S. Tsirkin
2011-05-04 20:52 ` [PATCH 11/18] vhost: support avail_event idx Michael S. Tsirkin
2011-05-04 20:52   ` Michael S. Tsirkin
2011-05-04 20:52   ` Michael S. Tsirkin
2011-05-04 20:52 ` [PATCH 12/18] virtio_test: support used_event index Michael S. Tsirkin
2011-05-04 20:52   ` Michael S. Tsirkin
2011-05-04 20:52   ` Michael S. Tsirkin
2011-05-04 20:52 ` [PATCH 13/18] virtio_test: avail_event index support Michael S. Tsirkin
2011-05-04 20:52 ` Michael S. Tsirkin
2011-05-04 20:52   ` Michael S. Tsirkin
2011-05-04 20:52 ` [PATCH 14/18] virtio: add api for delayed callbacks Michael S. Tsirkin
2011-05-04 20:52   ` Michael S. Tsirkin
2011-05-04 20:52   ` Michael S. Tsirkin
2011-05-09  5:57   ` Rusty Russell
2011-05-09  5:57   ` Rusty Russell
2011-05-09  5:57     ` Rusty Russell
2011-05-15 12:48     ` Michael S. Tsirkin
2011-05-15 12:48     ` Michael S. Tsirkin
2011-05-15 12:48       ` Michael S. Tsirkin
2011-05-16  7:13       ` Rusty Russell
2011-05-16  7:13       ` Rusty Russell
2011-05-16  7:13         ` Rusty Russell
2011-05-19  7:24         ` Michael S. Tsirkin
2011-05-19  7:24           ` Michael S. Tsirkin
2011-05-20  7:43           ` Rusty Russell
2011-05-20  7:43           ` Rusty Russell
2011-05-20  7:43             ` Rusty Russell
2011-05-19  7:24         ` Michael S. Tsirkin
2011-05-04 20:52 ` [PATCH 15/18] virtio_net: delay TX callbacks Michael S. Tsirkin
2011-05-04 20:52   ` Michael S. Tsirkin
2011-05-04 20:52 ` Michael S. Tsirkin
2011-05-04 20:52 ` [PATCH 16/18] virtio_ring: Add capacity check API Michael S. Tsirkin
2011-05-04 20:52   ` Michael S. Tsirkin
2011-05-04 20:52   ` Michael S. Tsirkin
2011-05-04 20:53 ` [PATCH 17/18] virtio_net: fix TX capacity checks using new API Michael S. Tsirkin
2011-05-04 20:53   ` Michael S. Tsirkin
2011-05-04 20:53   ` Michael S. Tsirkin
2011-05-04 20:53 ` [PATCH 18/18] virtio_net: limit xmit polling Michael S. Tsirkin
2011-05-04 20:53   ` Michael S. Tsirkin
2011-05-04 20:53   ` Michael S. Tsirkin
2011-05-05 15:07 ` [PATCH 0/3] virtio and vhost-net performance enhancements Michael S. Tsirkin
2011-05-05 15:07   ` Michael S. Tsirkin
2011-05-05 15:08   ` [PATCH 1/3] virtio: fix avail event support Michael S. Tsirkin
2011-05-05 15:08   ` Michael S. Tsirkin
2011-05-05 15:08     ` Michael S. Tsirkin
2011-05-05 15:08   ` [PATCH 2/3] virtio_ring: check used_event offset Michael S. Tsirkin
2011-05-05 15:08     ` Michael S. Tsirkin
2011-05-05 15:08   ` Michael S. Tsirkin
2011-05-05 15:08   ` [PATCH 3/3] virtio_ring: need_event api comment fix Michael S. Tsirkin
2011-05-05 15:08   ` Michael S. Tsirkin
2011-05-05 15:08     ` Michael S. Tsirkin
2011-05-09  5:59     ` Rusty Russell
2011-05-09  5:59     ` Rusty Russell
2011-05-09  5:59       ` Rusty Russell
2011-05-05 15:07 ` [PATCH 0/3] virtio and vhost-net performance enhancements Michael S. Tsirkin
2011-05-11 17:10 ` [PATCH 00/18] " Krishna Kumar2
2011-05-11 17:10 ` Krishna Kumar2
2011-05-04 20:50 Michael S. Tsirkin
