* [PATCHv2 RFC 0/4] virtio and vhost-net capacity handling
@ 2011-06-02 15:42 ` Michael S. Tsirkin
From: Michael S. Tsirkin @ 2011-06-02 15:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

OK, here's a new attempt to use the new capacity API.  I also added more
comments to clarify the logic.  Hope this is more readable.  Please let
me know.

This is on top of the patches applied by Rusty.

Warning: untested. Posting now to give people a chance to
comment on the API.

Changes from v1:
- fix comment in patch 2 to correct confusion noted by Rusty
- rewrite patch 3 along the lines suggested by Rusty
  note: it's not exactly the same but I hope it's close
  enough, the main difference is that mine does limited
  polling even in the unlikely xmit failure case.
- added a patch to not return capacity from add_buf
  it always looked like a weird hack

Michael S. Tsirkin (4):
  virtio_ring: add capacity check API
  virtio_net: fix tx capacity checks using new API
  virtio_net: limit xmit polling
  Revert "virtio: make add_buf return capacity remaining:

 drivers/net/virtio_net.c     |  111 ++++++++++++++++++++++++++----------------
 drivers/virtio/virtio_ring.c |   10 +++-
 include/linux/virtio.h       |    7 ++-
 3 files changed, 84 insertions(+), 44 deletions(-)

-- 
1.7.5.53.gc233e


* [PATCHv2 RFC 1/4] virtio_ring: add capacity check API
@ 2011-06-02 15:42   ` Michael S. Tsirkin
From: Michael S. Tsirkin @ 2011-06-02 15:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Add an API to check ring capacity. Because of the option
to use indirect buffers, this returns the worst-case
capacity, not the normal-case capacity.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/virtio/virtio_ring.c |    8 ++++++++
 include/linux/virtio.h       |    5 +++++
 2 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 68b9136..23422f1 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -344,6 +344,14 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
 }
 EXPORT_SYMBOL_GPL(virtqueue_get_buf);
 
+int virtqueue_min_capacity(struct virtqueue *_vq)
+{
+	struct vring_virtqueue *vq = to_vvq(_vq);
+
+	return vq->num_free;
+}
+EXPORT_SYMBOL_GPL(virtqueue_min_capacity);
+
 void virtqueue_disable_cb(struct virtqueue *_vq)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 7108857..209220d 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -42,6 +42,9 @@ struct virtqueue {
  *	vq: the struct virtqueue we're talking about.
  *	len: the length written into the buffer
  *	Returns NULL or the "data" token handed to add_buf.
+ * virtqueue_min_capacity: get the current capacity of the queue
+ *	vq: the struct virtqueue we're talking about.
+ *	Returns the current worst case capacity of the queue.
  * virtqueue_disable_cb: disable callbacks
  *	vq: the struct virtqueue we're talking about.
  *	Note that this is not necessarily synchronous, hence unreliable and only
@@ -89,6 +92,8 @@ void virtqueue_kick(struct virtqueue *vq);
 
 void *virtqueue_get_buf(struct virtqueue *vq, unsigned int *len);
 
+int virtqueue_min_capacity(struct virtqueue *vq);
+
 void virtqueue_disable_cb(struct virtqueue *vq);
 
 bool virtqueue_enable_cb(struct virtqueue *vq);
-- 
1.7.5.53.gc233e
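
For context, here is a minimal usage sketch (not part of the patch) of how
a driver might consult the new call before queueing a packet; the helper
name is illustrative only, and MAX_SKB_FRAGS + 2 mirrors the threshold
virtio_net uses (frags + linear data + header):

#include <linux/skbuff.h>
#include <linux/virtio.h>

/* Illustrative helper, not from the patch: true if one more
 * maximally-fragmented skb is guaranteed to fit, even in the worst
 * case where every sg entry consumes its own ring descriptor. */
static bool my_tx_has_room(struct virtqueue *vq)
{
	return virtqueue_min_capacity(vq) >= MAX_SKB_FRAGS + 2;
}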



* [PATCHv2 RFC 2/4] virtio_net: fix tx capacity checks using new API
@ 2011-06-02 15:43   ` Michael S. Tsirkin
From: Michael S. Tsirkin @ 2011-06-02 15:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

In the (rare) case where new descriptors are used
while virtio_net enables vq callback for the TX vq,
virtio_net uses the number of sg entries in the skb it frees to
calculate how many descriptors in the ring have just been made
available.  But this value is an overestimate: with indirect buffers
each skb only uses one descriptor entry, meaning we may wake the queue
only to find we are still out of space and need to stop
the ring almost at once.

Using the new virtqueue_min_capacity() call, we can determine
the remaining capacity, so we should use that instead.
This estimate is worst-case, which is consistent with
the value returned by virtqueue_add_buf.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/net/virtio_net.c |    9 ++++-----
 1 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f685324..a0ee78d 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -509,19 +509,17 @@ again:
 	return received;
 }
 
-static unsigned int free_old_xmit_skbs(struct virtnet_info *vi)
+static void free_old_xmit_skbs(struct virtnet_info *vi)
 {
 	struct sk_buff *skb;
-	unsigned int len, tot_sgs = 0;
+	unsigned int len;
 
 	while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
 		pr_debug("Sent skb %p\n", skb);
 		vi->dev->stats.tx_bytes += skb->len;
 		vi->dev->stats.tx_packets++;
-		tot_sgs += skb_vnet_hdr(skb)->num_sg;
 		dev_kfree_skb_any(skb);
 	}
-	return tot_sgs;
 }
 
 static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
@@ -611,7 +609,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 		netif_stop_queue(dev);
 		if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
 			/* More just got used, free them then recheck. */
-			capacity += free_old_xmit_skbs(vi);
+			free_old_xmit_skbs(vi);
+			capacity = virtqueue_min_capacity(vi->svq);
 			if (capacity >= 2+MAX_SKB_FRAGS) {
 				netif_start_queue(dev);
 				virtqueue_disable_cb(vi->svq);
-- 
1.7.5.53.gc233e
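
A worked example of the overestimate (numbers assumed for illustration,
not taken from the patch): suppose the wake-up threshold is
2 + MAX_SKB_FRAGS = 20 and a completed skb reports num_sg = 20 while,
thanks to indirect buffers, it occupied only a single ring descriptor.
Freeing one such skb used to bump "capacity" by 20 and restart the queue
even though only one descriptor had actually become free;
virtqueue_min_capacity() reports 1 in that situation and correctly keeps
the queue stopped.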



* [PATCHv2 RFC 3/4] virtio_net: limit xmit polling
@ 2011-06-02 15:43   ` Michael S. Tsirkin
From: Michael S. Tsirkin @ 2011-06-02 15:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

Current code might introduce a lot of latency variation
if there are many pending bufs at the time we
attempt to transmit a new one. This is bad for
real-time applications and can't be good for TCP either.

Free up just enough to both clean up all buffers
eventually and to be able to xmit the next packet.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/net/virtio_net.c |  106 +++++++++++++++++++++++++++++-----------------
 1 files changed, 67 insertions(+), 39 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index a0ee78d..b25db1c 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -509,17 +509,33 @@ again:
 	return received;
 }
 
-static void free_old_xmit_skbs(struct virtnet_info *vi)
+static bool free_old_xmit_skb(struct virtnet_info *vi)
 {
 	struct sk_buff *skb;
 	unsigned int len;
 
-	while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
-		pr_debug("Sent skb %p\n", skb);
-		vi->dev->stats.tx_bytes += skb->len;
-		vi->dev->stats.tx_packets++;
-		dev_kfree_skb_any(skb);
-	}
+	skb = virtqueue_get_buf(vi->svq, &len);
+	if (unlikely(!skb))
+		return false;
+	pr_debug("Sent skb %p\n", skb);
+	vi->dev->stats.tx_bytes += skb->len;
+	vi->dev->stats.tx_packets++;
+	dev_kfree_skb_any(skb);
+	return true;
+}
+
+/* Check capacity and try to free enough pending old buffers to enable queueing
+ * new ones. Return true if we can guarantee that a following
+ * virtqueue_add_buf will succeed. */
+static bool free_xmit_capacity(struct virtnet_info *vi)
+{
+	struct sk_buff *skb;
+	unsigned int len;
+
+	while (virtqueue_min_capacity(vi->svq) < MAX_SKB_FRAGS + 2)
+		if (unlikely(!free_old_xmit_skb(vi)))
+			return false;
+	return true;
 }
 
 static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
@@ -572,30 +588,34 @@ static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
 static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct virtnet_info *vi = netdev_priv(dev);
-	int capacity;
-
-	/* Free up any pending old buffers before queueing new ones. */
-	free_old_xmit_skbs(vi);
-
-	/* Try to transmit */
-	capacity = xmit_skb(vi, skb);
-
-	/* This can happen with OOM and indirect buffers. */
-	if (unlikely(capacity < 0)) {
-		if (net_ratelimit()) {
-			if (likely(capacity == -ENOMEM)) {
-				dev_warn(&dev->dev,
-					 "TX queue failure: out of memory\n");
-			} else {
-				dev->stats.tx_fifo_errors++;
+	int ret, i;
+
+	/* We normally do have space in the ring, so try to queue the skb as
+	 * fast as possible. */
+	ret = xmit_skb(vi, skb);
+	if (unlikely(ret < 0)) {
+		/* This triggers on the first xmit after ring full condition.
+		 * We need to free up some skbs first. */
+		if (likely(free_xmit_capacity(vi))) {
+			ret = xmit_skb(vi, skb);
+			/* This should never fail. Check, just in case. */
+			if (unlikely(ret < 0)) {
 				dev_warn(&dev->dev,
 					 "Unexpected TX queue failure: %d\n",
-					 capacity);
+					 ret);
+				dev->stats.tx_fifo_errors++;
+				dev->stats.tx_dropped++;
+				kfree_skb(skb);
+				return NETDEV_TX_OK;
 			}
+		} else {
+			/* Ring full: it might happen if we get a callback while
+			 * the queue is still mostly full. This should be
+			 * extremely rare. */
+			dev->stats.tx_dropped++;
+			kfree_skb(skb);
+			goto stop;
 		}
-		dev->stats.tx_dropped++;
-		kfree_skb(skb);
-		return NETDEV_TX_OK;
 	}
 	virtqueue_kick(vi->svq);
 
@@ -603,18 +623,26 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	skb_orphan(skb);
 	nf_reset(skb);
 
-	/* Apparently nice girls don't return TX_BUSY; stop the queue
-	 * before it gets out of hand.  Naturally, this wastes entries. */
-	if (capacity < 2+MAX_SKB_FRAGS) {
-		netif_stop_queue(dev);
-		if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
-			/* More just got used, free them then recheck. */
-			free_old_xmit_skbs(vi);
-			capacity = virtqueue_min_capacity(vi->svq);
-			if (capacity >= 2+MAX_SKB_FRAGS) {
-				netif_start_queue(dev);
-				virtqueue_disable_cb(vi->svq);
-			}
+	/* We transmit one skb, so try to free at least two pending skbs.
+	 * This is so that we don't hog the skb memory unnecessarily.
+	 * Doing this after kick means there's a chance we'll free
+	 * the skb we have just sent, which is hot in cache. */
+	for (i = 0; i < 2; i++)
+		free_old_xmit_skb(vi);
+
+	if (likely(free_xmit_capacity(vi)))
+		return NETDEV_TX_OK;
+
+stop:
+	/* Apparently nice girls don't return TX_BUSY; check capacity and stop
+	 * the queue before it gets out of hand.
+	 * Naturally, this wastes entries. */
+	netif_stop_queue(dev);
+	if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
+		/* More just got used, free them and recheck. */
+		if (free_xmit_capacity(vi)) {
+			netif_start_queue(dev);
+			virtqueue_disable_cb(vi->svq);
 		}
 	}
 
-- 
1.7.5.53.gc233e
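
A brief note on why the polling stays bounded (my paraphrase, not text
from the patch): each start_xmit queues one skb and unconditionally tries
to free at most two completed ones, so the backlog of completed-but-unfreed
skbs cannot grow over time; on top of that, free_xmit_capacity() only keeps
polling until MAX_SKB_FRAGS + 2 descriptors are free, so the work done per
transmit is bounded instead of draining an arbitrarily long completion
backlog the way the old free_old_xmit_skbs() loop did.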



* [PATCHv2 RFC 4/4] Revert "virtio: make add_buf return capacity remaining"
@ 2011-06-02 15:43   ` Michael S. Tsirkin
From: Michael S. Tsirkin @ 2011-06-02 15:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	linux-kernel, virtualization, netdev, linux-s390, kvm,
	Krishna Kumar, Tom Lendacky, steved, habanero

This reverts commit 3c1b27d5043086a485f8526353ae9fe37bfa1065.
The only user was virtio_net, and it switched to
min_capacity instead.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/virtio/virtio_ring.c |    2 +-
 include/linux/virtio.h       |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 23422f1..a6c21eb 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -233,7 +233,7 @@ add_head:
 	pr_debug("Added buffer head %i to %p\n", head, vq);
 	END_USE(vq);
 
-	return vq->num_free;
+	return 0;
 }
 EXPORT_SYMBOL_GPL(virtqueue_add_buf_gfp);
 
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 209220d..63c4908 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -34,7 +34,7 @@ struct virtqueue {
  *	in_num: the number of sg which are writable (after readable ones)
  *	data: the token identifying the buffer.
  *	gfp: how to do memory allocations (if necessary).
- *      Returns remaining capacity of queue (sg segments) or a negative error.
+ *      Returns 0 on success or a negative error.
  * virtqueue_kick: update after add_buf
  *	vq: the struct virtqueue
  *	After one or more add_buf calls, invoke this to kick the other side.
-- 
1.7.5.53.gc233e
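
With the revert, callers treat virtqueue_add_buf_gfp() as an ordinary
0-or-negative-errno kernel API instead of mining its return value for
remaining capacity.  A minimal caller sketch (illustrative only, assuming
the then-current signature; the error value shown is an example):

#include <linux/gfp.h>
#include <linux/scatterlist.h>
#include <linux/virtio.h>

/* Illustrative helper, not from the patch: queue one buffer and kick. */
static int my_queue_one(struct virtqueue *vq, struct scatterlist *sg,
			unsigned int out, unsigned int in, void *data)
{
	int err = virtqueue_add_buf_gfp(vq, sg, out, in, data, GFP_ATOMIC);

	if (err < 0)		/* e.g. -ENOSPC when the ring is full */
		return err;	/* no capacity information here any more */
	virtqueue_kick(vq);
	return 0;
}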


* Re: [PATCHv2 RFC 0/4] virtio and vhost-net capacity handling
@ 2011-06-02 17:17   ` Michael S. Tsirkin
From: Michael S. Tsirkin @ 2011-06-02 17:17 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Thu, Jun 02, 2011 at 06:42:35PM +0300, Michael S. Tsirkin wrote:
> OK, here's a new attempt to use the new capacity API.  I also added more
> comments to clarify the logic.  Hope this is more readable.  Please let
> me know.
> 
> This is on top of the patches applied by Rusty.
> 
> Warning: untested. Posting now to give people a chance to
> comment on the API.
> 
> Changes from v1:
> - fix comment in patch 2 to correct confusion noted by Rusty
> - rewrite patch 3 along the lines suggested by Rusty
>   note: it's not exactly the same but I hope it's close
>   enough, the main difference is that mine does limited
>   polling even in the unlikely xmit failure case.
> - added a patch to not return capacity from add_buf
>   it always looked like a weird hack
> 
> Michael S. Tsirkin (4):
>   virtio_ring: add capacity check API
>   virtio_net: fix tx capacity checks using new API
>   virtio_net: limit xmit polling
>   Revert "virtio: make add_buf return capacity remaining"
> 
>  drivers/net/virtio_net.c     |  111 ++++++++++++++++++++++++++----------------
>  drivers/virtio/virtio_ring.c |   10 +++-
>  include/linux/virtio.h       |    7 ++-
>  3 files changed, 84 insertions(+), 44 deletions(-)
> 
> -- 
> 1.7.5.53.gc233e


And just FYI, here's a patch (on top) that I considered but
decided against. This does a single get_buf before
xmit. I thought it's not really needed, as the capacity
check in add_buf is relatively cheap and we removed
the kick from xmit_skb. But the point is that the loop
will now be easy to move around if someone manages
to show that this benefits speed (which I doubt).

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b25db1c..75af5be 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -590,6 +590,9 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct virtnet_info *vi = netdev_priv(dev);
 	int ret, i;
 
+	/* Try to pop an skb, to increase the chance we can add this one. */
+	free_old_xmit_skb(vi);
+
 	/* We normally do have space in the ring, so try to queue the skb as
 	 * fast as possible. */
 	ret = xmit_skb(vi, skb);
@@ -627,8 +630,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	 * This is so that we don't hog the skb memory unnecessarily.
 	 * Doing this after kick means there's a chance we'll free
 	 * the skb we have just sent, which is hot in cache. */
-	for (i = 0; i < 2; i++)
-		free_old_xmit_skb(vi);
+	free_old_xmit_skb(vi);
 
 	if (likely(free_xmit_capacity(vi)))
 		return NETDEV_TX_OK;


* Re: [PATCHv2 RFC 3/4] virtio_net: limit xmit polling
@ 2011-06-02 18:09   ` Sridhar Samudrala
From: Sridhar Samudrala @ 2011-06-02 18:09 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Rusty Russell, Carsten Otte, Christian Borntraeger,
	linux390, Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Thu, 2011-06-02 at 18:43 +0300, Michael S. Tsirkin wrote:
> Current code might introduce a lot of latency variation
> if there are many pending bufs at the time we
> attempt to transmit a new one. This is bad for
> real-time applications and can't be good for TCP either.
> 
> Free up just enough to both clean up all buffers
> eventually and to be able to xmit the next packet.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  drivers/net/virtio_net.c |  106 +++++++++++++++++++++++++++++-----------------
>  1 files changed, 67 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index a0ee78d..b25db1c 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -509,17 +509,33 @@ again:
>  	return received;
>  }
> 
> -static void free_old_xmit_skbs(struct virtnet_info *vi)
> +static bool free_old_xmit_skb(struct virtnet_info *vi)
>  {
>  	struct sk_buff *skb;
>  	unsigned int len;
> 
> -	while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
> -		pr_debug("Sent skb %p\n", skb);
> -		vi->dev->stats.tx_bytes += skb->len;
> -		vi->dev->stats.tx_packets++;
> -		dev_kfree_skb_any(skb);
> -	}
> +	skb = virtqueue_get_buf(vi->svq, &len);
> +	if (unlikely(!skb))
> +		return false;
> +	pr_debug("Sent skb %p\n", skb);
> +	vi->dev->stats.tx_bytes += skb->len;
> +	vi->dev->stats.tx_packets++;
> +	dev_kfree_skb_any(skb);
> +	return true;
> +}
> +
> +/* Check capacity and try to free enough pending old buffers to enable queueing
> + * new ones. Return true if we can guarantee that a following
> + * virtqueue_add_buf will succeed. */
> +static bool free_xmit_capacity(struct virtnet_info *vi)
> +{
> +	struct sk_buff *skb;
> +	unsigned int len;
> +
> +	while (virtqueue_min_capacity(vi->svq) < MAX_SKB_FRAGS + 2)
> +		if (unlikely(!free_old_xmit_skb(vi)))
> +			return false;
If we are using INDIRECT descriptors, 1 descriptor entry is good enough
to guarantee that an skb can be queued unless we run out of memory.
Is it worth checking if 'indirect' is set on the svq and then only free
1 descriptor? Otherwise, we will be dropping the packet if there are
less than 18 free descriptors although we only need 1.  (A sketch of this
idea follows the quoted patch below.)


> +	return true;
>  }
> 
>  static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
> @@ -572,30 +588,34 @@ static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
>  static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>  {
>  	struct virtnet_info *vi = netdev_priv(dev);
> -	int capacity;
> -
> -	/* Free up any pending old buffers before queueing new ones. */
> -	free_old_xmit_skbs(vi);
> -
> -	/* Try to transmit */
> -	capacity = xmit_skb(vi, skb);
> -
> -	/* This can happen with OOM and indirect buffers. */
> -	if (unlikely(capacity < 0)) {
> -		if (net_ratelimit()) {
> -			if (likely(capacity == -ENOMEM)) {
> -				dev_warn(&dev->dev,
> -					 "TX queue failure: out of memory\n");
> -			} else {
> -				dev->stats.tx_fifo_errors++;
> +	int ret, i;
> +
> +	/* We normally do have space in the ring, so try to queue the skb as
> +	 * fast as possible. */
> +	ret = xmit_skb(vi, skb);
> +	if (unlikely(ret < 0)) {
> +		/* This triggers on the first xmit after ring full condition.
> +		 * We need to free up some skbs first. */
> +		if (likely(free_xmit_capacity(vi))) {
> +			ret = xmit_skb(vi, skb);
> +			/* This should never fail. Check, just in case. */
> +			if (unlikely(ret < 0)) {
>  				dev_warn(&dev->dev,
>  					 "Unexpected TX queue failure: %d\n",
> -					 capacity);
> +					 ret);
> +				dev->stats.tx_fifo_errors++;
> +				dev->stats.tx_dropped++;
> +				kfree_skb(skb);
> +				return NETDEV_TX_OK;
>  			}
> +		} else {
> +			/* Ring full: it might happen if we get a callback while
> +			 * the queue is still mostly full. This should be
> +			 * extremely rare. */
> +			dev->stats.tx_dropped++;
> +			kfree_skb(skb);
> +			goto stop;
>  		}
> -		dev->stats.tx_dropped++;
> -		kfree_skb(skb);
> -		return NETDEV_TX_OK;
>  	}
>  	virtqueue_kick(vi->svq);
> 
> @@ -603,18 +623,26 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>  	skb_orphan(skb);
>  	nf_reset(skb);
> 
> -	/* Apparently nice girls don't return TX_BUSY; stop the queue
> -	 * before it gets out of hand.  Naturally, this wastes entries. */
> -	if (capacity < 2+MAX_SKB_FRAGS) {
> -		netif_stop_queue(dev);
> -		if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
> -			/* More just got used, free them then recheck. */
> -			free_old_xmit_skbs(vi);
> -			capacity = virtqueue_min_capacity(vi->svq);
> -			if (capacity >= 2+MAX_SKB_FRAGS) {
> -				netif_start_queue(dev);
> -				virtqueue_disable_cb(vi->svq);
> -			}
> +	/* We transmit one skb, so try to free at least two pending skbs.
> +	 * This is so that we don't hog the skb memory unnecessarily.
> +	 * Doing this after kick means there's a chance we'll free
> +	 * the skb we have just sent, which is hot in cache. */
> +	for (i = 0; i < 2; i++)
> +		free_old_xmit_skb(vi);
> +
> +	if (likely(free_xmit_capacity(vi)))
> +		return NETDEV_TX_OK;
> +
> +stop:
> +	/* Apparently nice girls don't return TX_BUSY; check capacity and stop
> +	 * the queue before it gets out of hand.
> +	 * Naturally, this wastes entries. */
> +	netif_stop_queue(dev);
> +	if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
> +		/* More just got used, free them and recheck. */
> +		if (free_xmit_capacity(vi)) {
> +			netif_start_queue(dev);
> +			virtqueue_disable_cb(vi->svq);
>  		}
>  	}
> 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 3/4] virtio_net: limit xmit polling
  2011-06-02 15:43   ` Michael S. Tsirkin
  (?)
  (?)
@ 2011-06-02 18:09   ` Sridhar Samudrala
  -1 siblings, 0 replies; 50+ messages in thread
From: Sridhar Samudrala @ 2011-06-02 18:09 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Krishna Kumar, Carsten Otte, lguest, Shirley Ma, kvm, linux-s390,
	netdev, habanero, Heiko Carstens, linux-kernel, virtualization,
	steved, Christian Borntraeger, Tom Lendacky, Martin Schwidefsky,
	linux390

On Thu, 2011-06-02 at 18:43 +0300, Michael S. Tsirkin wrote:
> Current code might introduce a lot of latency variation
> if there are many pending bufs at the time we
> attempt to transmit a new one. This is bad for
> real-time applications and can't be good for TCP either.
> 
> Free up just enough to both clean up all buffers
> eventually and to be able to xmit the next packet.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  drivers/net/virtio_net.c |  106 +++++++++++++++++++++++++++++-----------------
>  1 files changed, 67 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index a0ee78d..b25db1c 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -509,17 +509,33 @@ again:
>  	return received;
>  }
> 
> -static void free_old_xmit_skbs(struct virtnet_info *vi)
> +static bool free_old_xmit_skb(struct virtnet_info *vi)
>  {
>  	struct sk_buff *skb;
>  	unsigned int len;
> 
> -	while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
> -		pr_debug("Sent skb %p\n", skb);
> -		vi->dev->stats.tx_bytes += skb->len;
> -		vi->dev->stats.tx_packets++;
> -		dev_kfree_skb_any(skb);
> -	}
> +	skb = virtqueue_get_buf(vi->svq, &len);
> +	if (unlikely(!skb))
> +		return false;
> +	pr_debug("Sent skb %p\n", skb);
> +	vi->dev->stats.tx_bytes += skb->len;
> +	vi->dev->stats.tx_packets++;
> +	dev_kfree_skb_any(skb);
> +	return true;
> +}
> +
> +/* Check capacity and try to free enough pending old buffers to enable queueing
> + * new ones. Return true if we can guarantee that a following
> + * virtqueue_add_buf will succeed. */
> +static bool free_xmit_capacity(struct virtnet_info *vi)
> +{
> +	struct sk_buff *skb;
> +	unsigned int len;
> +
> +	while (virtqueue_min_capacity(vi->svq) < MAX_SKB_FRAGS + 2)
> +		if (unlikely(!free_old_xmit_skb))
> +			return false;
If we are using INDIRECT descriptors, 1 descriptor entry is enough to
guarantee that an skb can be queued, unless we run out of memory.
Is it worth checking whether 'indirect' is set on the svq and then
freeing only until 1 descriptor is available? Otherwise, we will drop
the packet if there are fewer than 18 free descriptors even though we
only need 1.
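
Something like the sketch below is what I have in mind.  Note that
virtqueue_uses_indirect() is a made-up accessor for the svq's
'indirect' flag - the current API does not expose it - so this only
illustrates the idea:

/* Hypothetical variant of free_xmit_capacity(): with indirect
 * descriptors a single free slot is enough for any skb, so free old
 * skbs only until one slot is available. */
static bool free_xmit_capacity_indirect(struct virtnet_info *vi)
{
	unsigned int needed = virtqueue_uses_indirect(vi->svq) ?
			      1 : MAX_SKB_FRAGS + 2;

	while (virtqueue_min_capacity(vi->svq) < needed)
		if (unlikely(!free_old_xmit_skb(vi)))
			return false;
	return true;
}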


> +	return true;
>  }
> 
>  static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
> @@ -572,30 +588,34 @@ static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
>  static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>  {
>  	struct virtnet_info *vi = netdev_priv(dev);
> -	int capacity;
> -
> -	/* Free up any pending old buffers before queueing new ones. */
> -	free_old_xmit_skbs(vi);
> -
> -	/* Try to transmit */
> -	capacity = xmit_skb(vi, skb);
> -
> -	/* This can happen with OOM and indirect buffers. */
> -	if (unlikely(capacity < 0)) {
> -		if (net_ratelimit()) {
> -			if (likely(capacity == -ENOMEM)) {
> -				dev_warn(&dev->dev,
> -					 "TX queue failure: out of memory\n");
> -			} else {
> -				dev->stats.tx_fifo_errors++;
> +	int ret, i;
> +
> +	/* We normally do have space in the ring, so try to queue the skb as
> +	 * fast as possible. */
> +	ret = xmit_skb(vi, skb);
> +	if (unlikely(ret < 0)) {
> +		/* This triggers on the first xmit after ring full condition.
> +		 * We need to free up some skbs first. */
> +		if (likely(free_xmit_capacity(vi))) {
> +			ret = xmit_skb(vi, skb);
> +			/* This should never fail. Check, just in case. */
> +			if (unlikely(ret < 0)) {
>  				dev_warn(&dev->dev,
>  					 "Unexpected TX queue failure: %d\n",
> -					 capacity);
> +					 ret);
> +				dev->stats.tx_fifo_errors++;
> +				dev->stats.tx_dropped++;
> +				kfree_skb(skb);
> +				return NETDEV_TX_OK;
>  			}
> +		} else {
> +			/* Ring full: it might happen if we get a callback while
> +			 * the queue is still mostly full. This should be
> +			 * extremely rare. */
> +			dev->stats.tx_dropped++;
> +			kfree_skb(skb);
> +			goto stop;
>  		}
> -		dev->stats.tx_dropped++;
> -		kfree_skb(skb);
> -		return NETDEV_TX_OK;
>  	}
>  	virtqueue_kick(vi->svq);
> 
> @@ -603,18 +623,26 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>  	skb_orphan(skb);
>  	nf_reset(skb);
> 
> -	/* Apparently nice girls don't return TX_BUSY; stop the queue
> -	 * before it gets out of hand.  Naturally, this wastes entries. */
> -	if (capacity < 2+MAX_SKB_FRAGS) {
> -		netif_stop_queue(dev);
> -		if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
> -			/* More just got used, free them then recheck. */
> -			free_old_xmit_skbs(vi);
> -			capacity = virtqueue_min_capacity(vi->svq);
> -			if (capacity >= 2+MAX_SKB_FRAGS) {
> -				netif_start_queue(dev);
> -				virtqueue_disable_cb(vi->svq);
> -			}
> +	/* We transmit one skb, so try to free at least two pending skbs.
> +	 * This is so that we don't hog the skb memory unnecessarily. *
> +	 * Doing this after kick means there's a chance we'll free
> +	 * the skb we have just sent, which is hot in cache. */
> +	for (i = 0; i < 2; i++)
> +		free_old_xmit_skb(v);
> +
> +	if (likely(free_xmit_capacity(vi)))
> +		return NETDEV_TX_OK;
> +
> +stop:
> +	/* Apparently nice girls don't return TX_BUSY; check capacity and stop
> +	 * the queue before it gets out of hand.
> +	 * Naturally, this wastes entries. */
> +	netif_stop_queue(dev);
> +	if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
> +		/* More just got used, free them and recheck. */
> +		if (free_xmit_capacity(vi)) {
> +			netif_start_queue(dev);
> +			virtqueue_disable_cb(vi->svq);
>  		}
>  	}
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 3/4] virtio_net: limit xmit polling
@ 2011-06-02 19:23       ` Michael S. Tsirkin
  0 siblings, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2011-06-02 19:23 UTC (permalink / raw)
  To: Sridhar Samudrala
  Cc: linux-kernel, Rusty Russell, Carsten Otte, Christian Borntraeger,
	linux390, Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Thu, Jun 02, 2011 at 11:09:53AM -0700, Sridhar Samudrala wrote:
> On Thu, 2011-06-02 at 18:43 +0300, Michael S. Tsirkin wrote:
> > Current code might introduce a lot of latency variation
> > if there are many pending bufs at the time we
> > attempt to transmit a new one. This is bad for
> > real-time applications and can't be good for TCP either.
> > 
> > Free up just enough to both clean up all buffers
> > eventually and to be able to xmit the next packet.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >  drivers/net/virtio_net.c |  106 +++++++++++++++++++++++++++++-----------------
> >  1 files changed, 67 insertions(+), 39 deletions(-)
> > 
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index a0ee78d..b25db1c 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -509,17 +509,33 @@ again:
> >  	return received;
> >  }
> > 
> > -static void free_old_xmit_skbs(struct virtnet_info *vi)
> > +static bool free_old_xmit_skb(struct virtnet_info *vi)
> >  {
> >  	struct sk_buff *skb;
> >  	unsigned int len;
> > 
> > -	while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
> > -		pr_debug("Sent skb %p\n", skb);
> > -		vi->dev->stats.tx_bytes += skb->len;
> > -		vi->dev->stats.tx_packets++;
> > -		dev_kfree_skb_any(skb);
> > -	}
> > +	skb = virtqueue_get_buf(vi->svq, &len);
> > +	if (unlikely(!skb))
> > +		return false;
> > +	pr_debug("Sent skb %p\n", skb);
> > +	vi->dev->stats.tx_bytes += skb->len;
> > +	vi->dev->stats.tx_packets++;
> > +	dev_kfree_skb_any(skb);
> > +	return true;
> > +}
> > +
> > +/* Check capacity and try to free enough pending old buffers to enable queueing
> > + * new ones. Return true if we can guarantee that a following
> > + * virtqueue_add_buf will succeed. */
> > +static bool free_xmit_capacity(struct virtnet_info *vi)
> > +{
> > +	struct sk_buff *skb;
> > +	unsigned int len;
> > +
> > +	while (virtqueue_min_capacity(vi->svq) < MAX_SKB_FRAGS + 2)
> > +		if (unlikely(!free_old_xmit_skb))
> > +			return false;
> If we are using INDIRECT descriptors, 1 descriptor entry is enough to
> guarantee that an skb can be queued, unless we run out of memory.
> Is it worth checking whether 'indirect' is set on the svq and then
> freeing only until 1 descriptor is available? Otherwise, we will drop
> the packet if there are fewer than 18 free descriptors even though we
> only need 1.

This is not introduced by this patch: the
issue is that we need to consider the worst
case, since allocating the indirect descriptor
block can fail, in which case add_buf falls
back to one direct descriptor per sg entry.

I don't think it matters much: 18 out of 256
descriptors is not too expensive.

> > +	return true;
> >  }
> > 
> >  static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
> > @@ -572,30 +588,34 @@ static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
> >  static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> >  {
> >  	struct virtnet_info *vi = netdev_priv(dev);
> > -	int capacity;
> > -
> > -	/* Free up any pending old buffers before queueing new ones. */
> > -	free_old_xmit_skbs(vi);
> > -
> > -	/* Try to transmit */
> > -	capacity = xmit_skb(vi, skb);
> > -
> > -	/* This can happen with OOM and indirect buffers. */
> > -	if (unlikely(capacity < 0)) {
> > -		if (net_ratelimit()) {
> > -			if (likely(capacity == -ENOMEM)) {
> > -				dev_warn(&dev->dev,
> > -					 "TX queue failure: out of memory\n");
> > -			} else {
> > -				dev->stats.tx_fifo_errors++;
> > +	int ret, i;
> > +
> > +	/* We normally do have space in the ring, so try to queue the skb as
> > +	 * fast as possible. */
> > +	ret = xmit_skb(vi, skb);
> > +	if (unlikely(ret < 0)) {
> > +		/* This triggers on the first xmit after ring full condition.
> > +		 * We need to free up some skbs first. */
> > +		if (likely(free_xmit_capacity(vi))) {
> > +			ret = xmit_skb(vi, skb);
> > +			/* This should never fail. Check, just in case. */
> > +			if (unlikely(ret < 0)) {
> >  				dev_warn(&dev->dev,
> >  					 "Unexpected TX queue failure: %d\n",
> > -					 capacity);
> > +					 ret);
> > +				dev->stats.tx_fifo_errors++;
> > +				dev->stats.tx_dropped++;
> > +				kfree_skb(skb);
> > +				return NETDEV_TX_OK;
> >  			}
> > +		} else {
> > +			/* Ring full: it might happen if we get a callback while
> > +			 * the queue is still mostly full. This should be
> > +			 * extremely rare. */
> > +			dev->stats.tx_dropped++;
> > +			kfree_skb(skb);
> > +			goto stop;
> >  		}
> > -		dev->stats.tx_dropped++;
> > -		kfree_skb(skb);
> > -		return NETDEV_TX_OK;
> >  	}
> >  	virtqueue_kick(vi->svq);
> > 
> > @@ -603,18 +623,26 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> >  	skb_orphan(skb);
> >  	nf_reset(skb);
> > 
> > -	/* Apparently nice girls don't return TX_BUSY; stop the queue
> > -	 * before it gets out of hand.  Naturally, this wastes entries. */
> > -	if (capacity < 2+MAX_SKB_FRAGS) {
> > -		netif_stop_queue(dev);
> > -		if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
> > -			/* More just got used, free them then recheck. */
> > -			free_old_xmit_skbs(vi);
> > -			capacity = virtqueue_min_capacity(vi->svq);
> > -			if (capacity >= 2+MAX_SKB_FRAGS) {
> > -				netif_start_queue(dev);
> > -				virtqueue_disable_cb(vi->svq);
> > -			}
> > +	/* We transmit one skb, so try to free at least two pending skbs.
> > +	 * This is so that we don't hog the skb memory unnecessarily. *
> > +	 * Doing this after kick means there's a chance we'll free
> > +	 * the skb we have just sent, which is hot in cache. */
> > +	for (i = 0; i < 2; i++)
> > +		free_old_xmit_skb(v);
> > +
> > +	if (likely(free_xmit_capacity(vi)))
> > +		return NETDEV_TX_OK;
> > +
> > +stop:
> > +	/* Apparently nice girls don't return TX_BUSY; check capacity and stop
> > +	 * the queue before it gets out of hand.
> > +	 * Naturally, this wastes entries. */
> > +	netif_stop_queue(dev);
> > +	if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
> > +		/* More just got used, free them and recheck. */
> > +		if (free_xmit_capacity(vi)) {
> > +			netif_start_queue(dev);
> > +			virtqueue_disable_cb(vi->svq);
> >  		}
> >  	}
> > 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 3/4] virtio_net: limit xmit polling
@ 2011-06-02 19:23       ` Michael S. Tsirkin
  0 siblings, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2011-06-02 19:23 UTC (permalink / raw)
  To: Sridhar Samudrala
  Cc: Krishna Kumar, Carsten Otte, lguest-uLR06cmDAlY/bJ5BZ2RsiQ,
	Shirley Ma, kvm-u79uwXL29TY76Z2rM5mHXA,
	linux-s390-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	habanero-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Heiko Carstens,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	steved-r/Jw6+rmf7HQT0dZR+AlfA, Christian Borntraeger,
	Tom Lendacky, Martin Schwidefsky,
	linux390-tA70FqPdS9bQT0dZR+AlfA

On Thu, Jun 02, 2011 at 11:09:53AM -0700, Sridhar Samudrala wrote:
> On Thu, 2011-06-02 at 18:43 +0300, Michael S. Tsirkin wrote:
> > Current code might introduce a lot of latency variation
> > if there are many pending bufs at the time we
> > attempt to transmit a new one. This is bad for
> > real-time applications and can't be good for TCP either.
> > 
> > Free up just enough to both clean up all buffers
> > eventually and to be able to xmit the next packet.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> > ---
> >  drivers/net/virtio_net.c |  106 +++++++++++++++++++++++++++++-----------------
> >  1 files changed, 67 insertions(+), 39 deletions(-)
> > 
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index a0ee78d..b25db1c 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -509,17 +509,33 @@ again:
> >  	return received;
> >  }
> > 
> > -static void free_old_xmit_skbs(struct virtnet_info *vi)
> > +static bool free_old_xmit_skb(struct virtnet_info *vi)
> >  {
> >  	struct sk_buff *skb;
> >  	unsigned int len;
> > 
> > -	while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
> > -		pr_debug("Sent skb %p\n", skb);
> > -		vi->dev->stats.tx_bytes += skb->len;
> > -		vi->dev->stats.tx_packets++;
> > -		dev_kfree_skb_any(skb);
> > -	}
> > +	skb = virtqueue_get_buf(vi->svq, &len);
> > +	if (unlikely(!skb))
> > +		return false;
> > +	pr_debug("Sent skb %p\n", skb);
> > +	vi->dev->stats.tx_bytes += skb->len;
> > +	vi->dev->stats.tx_packets++;
> > +	dev_kfree_skb_any(skb);
> > +	return true;
> > +}
> > +
> > +/* Check capacity and try to free enough pending old buffers to enable queueing
> > + * new ones. Return true if we can guarantee that a following
> > + * virtqueue_add_buf will succeed. */
> > +static bool free_xmit_capacity(struct virtnet_info *vi)
> > +{
> > +	struct sk_buff *skb;
> > +	unsigned int len;
> > +
> > +	while (virtqueue_min_capacity(vi->svq) < MAX_SKB_FRAGS + 2)
> > +		if (unlikely(!free_old_xmit_skb))
> > +			return false;
> If we are using INDIRECT descriptors, 1 descriptor entry is enough to
> guarantee that an skb can be queued, unless we run out of memory.
> Is it worth checking whether 'indirect' is set on the svq and then
> freeing only until 1 descriptor is available? Otherwise, we will drop
> the packet if there are fewer than 18 free descriptors even though we
> only need 1.

This is not introduced by this patch: the
issue is that we need to consider the worst
case, since allocating the indirect descriptor
block can fail, in which case add_buf falls
back to one direct descriptor per sg entry.

I don't think it matters much: 18 out of 256
descriptors is not too expensive.

> > +	return true;
> >  }
> > 
> >  static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
> > @@ -572,30 +588,34 @@ static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
> >  static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> >  {
> >  	struct virtnet_info *vi = netdev_priv(dev);
> > -	int capacity;
> > -
> > -	/* Free up any pending old buffers before queueing new ones. */
> > -	free_old_xmit_skbs(vi);
> > -
> > -	/* Try to transmit */
> > -	capacity = xmit_skb(vi, skb);
> > -
> > -	/* This can happen with OOM and indirect buffers. */
> > -	if (unlikely(capacity < 0)) {
> > -		if (net_ratelimit()) {
> > -			if (likely(capacity == -ENOMEM)) {
> > -				dev_warn(&dev->dev,
> > -					 "TX queue failure: out of memory\n");
> > -			} else {
> > -				dev->stats.tx_fifo_errors++;
> > +	int ret, i;
> > +
> > +	/* We normally do have space in the ring, so try to queue the skb as
> > +	 * fast as possible. */
> > +	ret = xmit_skb(vi, skb);
> > +	if (unlikely(ret < 0)) {
> > +		/* This triggers on the first xmit after ring full condition.
> > +		 * We need to free up some skbs first. */
> > +		if (likely(free_xmit_capacity(vi))) {
> > +			ret = xmit_skb(vi, skb);
> > +			/* This should never fail. Check, just in case. */
> > +			if (unlikely(ret < 0)) {
> >  				dev_warn(&dev->dev,
> >  					 "Unexpected TX queue failure: %d\n",
> > -					 capacity);
> > +					 ret);
> > +				dev->stats.tx_fifo_errors++;
> > +				dev->stats.tx_dropped++;
> > +				kfree_skb(skb);
> > +				return NETDEV_TX_OK;
> >  			}
> > +		} else {
> > +			/* Ring full: it might happen if we get a callback while
> > +			 * the queue is still mostly full. This should be
> > +			 * extremely rare. */
> > +			dev->stats.tx_dropped++;
> > +			kfree_skb(skb);
> > +			goto stop;
> >  		}
> > -		dev->stats.tx_dropped++;
> > -		kfree_skb(skb);
> > -		return NETDEV_TX_OK;
> >  	}
> >  	virtqueue_kick(vi->svq);
> > 
> > @@ -603,18 +623,26 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> >  	skb_orphan(skb);
> >  	nf_reset(skb);
> > 
> > -	/* Apparently nice girls don't return TX_BUSY; stop the queue
> > -	 * before it gets out of hand.  Naturally, this wastes entries. */
> > -	if (capacity < 2+MAX_SKB_FRAGS) {
> > -		netif_stop_queue(dev);
> > -		if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
> > -			/* More just got used, free them then recheck. */
> > -			free_old_xmit_skbs(vi);
> > -			capacity = virtqueue_min_capacity(vi->svq);
> > -			if (capacity >= 2+MAX_SKB_FRAGS) {
> > -				netif_start_queue(dev);
> > -				virtqueue_disable_cb(vi->svq);
> > -			}
> > +	/* We transmit one skb, so try to free at least two pending skbs.
> > +	 * This is so that we don't hog the skb memory unnecessarily. *
> > +	 * Doing this after kick means there's a chance we'll free
> > +	 * the skb we have just sent, which is hot in cache. */
> > +	for (i = 0; i < 2; i++)
> > +		free_old_xmit_skb(v);
> > +
> > +	if (likely(free_xmit_capacity(vi)))
> > +		return NETDEV_TX_OK;
> > +
> > +stop:
> > +	/* Apparently nice girls don't return TX_BUSY; check capacity and stop
> > +	 * the queue before it gets out of hand.
> > +	 * Naturally, this wastes entries. */
> > +	netif_stop_queue(dev);
> > +	if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
> > +		/* More just got used, free them and recheck. */
> > +		if (free_xmit_capacity(vi)) {
> > +			netif_start_queue(dev);
> > +			virtqueue_disable_cb(vi->svq);
> >  		}
> >  	}
> > 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 3/4] virtio_net: limit xmit polling
  2011-06-02 18:09   ` Sridhar Samudrala
@ 2011-06-02 19:23     ` Michael S. Tsirkin
  2011-06-02 19:23       ` Michael S. Tsirkin
  1 sibling, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2011-06-02 19:23 UTC (permalink / raw)
  To: Sridhar Samudrala
  Cc: Krishna Kumar, Carsten Otte, lguest, Shirley Ma, kvm, linux-s390,
	netdev, habanero, Heiko Carstens, linux-kernel, virtualization,
	steved, Christian Borntraeger, Tom Lendacky, Martin Schwidefsky,
	linux390

On Thu, Jun 02, 2011 at 11:09:53AM -0700, Sridhar Samudrala wrote:
> On Thu, 2011-06-02 at 18:43 +0300, Michael S. Tsirkin wrote:
> > Current code might introduce a lot of latency variation
> > if there are many pending bufs at the time we
> > attempt to transmit a new one. This is bad for
> > real-time applications and can't be good for TCP either.
> > 
> > Free up just enough to both clean up all buffers
> > eventually and to be able to xmit the next packet.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >  drivers/net/virtio_net.c |  106 +++++++++++++++++++++++++++++-----------------
> >  1 files changed, 67 insertions(+), 39 deletions(-)
> > 
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index a0ee78d..b25db1c 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -509,17 +509,33 @@ again:
> >  	return received;
> >  }
> > 
> > -static void free_old_xmit_skbs(struct virtnet_info *vi)
> > +static bool free_old_xmit_skb(struct virtnet_info *vi)
> >  {
> >  	struct sk_buff *skb;
> >  	unsigned int len;
> > 
> > -	while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
> > -		pr_debug("Sent skb %p\n", skb);
> > -		vi->dev->stats.tx_bytes += skb->len;
> > -		vi->dev->stats.tx_packets++;
> > -		dev_kfree_skb_any(skb);
> > -	}
> > +	skb = virtqueue_get_buf(vi->svq, &len);
> > +	if (unlikely(!skb))
> > +		return false;
> > +	pr_debug("Sent skb %p\n", skb);
> > +	vi->dev->stats.tx_bytes += skb->len;
> > +	vi->dev->stats.tx_packets++;
> > +	dev_kfree_skb_any(skb);
> > +	return true;
> > +}
> > +
> > +/* Check capacity and try to free enough pending old buffers to enable queueing
> > + * new ones. Return true if we can guarantee that a following
> > + * virtqueue_add_buf will succeed. */
> > +static bool free_xmit_capacity(struct virtnet_info *vi)
> > +{
> > +	struct sk_buff *skb;
> > +	unsigned int len;
> > +
> > +	while (virtqueue_min_capacity(vi->svq) < MAX_SKB_FRAGS + 2)
> > +		if (unlikely(!free_old_xmit_skb))
> > +			return false;
> If we are using INDIRECT descriptors, 1 descriptor entry is enough to
> guarantee that an skb can be queued, unless we run out of memory.
> Is it worth checking whether 'indirect' is set on the svq and then
> freeing only until 1 descriptor is available? Otherwise, we will drop
> the packet if there are fewer than 18 free descriptors even though we
> only need 1.

This is not introduced by this patch: the
issue is that we need to consider the worst
case, since allocating the indirect descriptor
block can fail, in which case add_buf falls
back to one direct descriptor per sg entry.

I don't think it matters much: 18 out of 256
descriptors is not too expensive.

> > +	return true;
> >  }
> > 
> >  static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
> > @@ -572,30 +588,34 @@ static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
> >  static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> >  {
> >  	struct virtnet_info *vi = netdev_priv(dev);
> > -	int capacity;
> > -
> > -	/* Free up any pending old buffers before queueing new ones. */
> > -	free_old_xmit_skbs(vi);
> > -
> > -	/* Try to transmit */
> > -	capacity = xmit_skb(vi, skb);
> > -
> > -	/* This can happen with OOM and indirect buffers. */
> > -	if (unlikely(capacity < 0)) {
> > -		if (net_ratelimit()) {
> > -			if (likely(capacity == -ENOMEM)) {
> > -				dev_warn(&dev->dev,
> > -					 "TX queue failure: out of memory\n");
> > -			} else {
> > -				dev->stats.tx_fifo_errors++;
> > +	int ret, i;
> > +
> > +	/* We normally do have space in the ring, so try to queue the skb as
> > +	 * fast as possible. */
> > +	ret = xmit_skb(vi, skb);
> > +	if (unlikely(ret < 0)) {
> > +		/* This triggers on the first xmit after ring full condition.
> > +		 * We need to free up some skbs first. */
> > +		if (likely(free_xmit_capacity(vi))) {
> > +			ret = xmit_skb(vi, skb);
> > +			/* This should never fail. Check, just in case. */
> > +			if (unlikely(ret < 0)) {
> >  				dev_warn(&dev->dev,
> >  					 "Unexpected TX queue failure: %d\n",
> > -					 capacity);
> > +					 ret);
> > +				dev->stats.tx_fifo_errors++;
> > +				dev->stats.tx_dropped++;
> > +				kfree_skb(skb);
> > +				return NETDEV_TX_OK;
> >  			}
> > +		} else {
> > +			/* Ring full: it might happen if we get a callback while
> > +			 * the queue is still mostly full. This should be
> > +			 * extremely rare. */
> > +			dev->stats.tx_dropped++;
> > +			kfree_skb(skb);
> > +			goto stop;
> >  		}
> > -		dev->stats.tx_dropped++;
> > -		kfree_skb(skb);
> > -		return NETDEV_TX_OK;
> >  	}
> >  	virtqueue_kick(vi->svq);
> > 
> > @@ -603,18 +623,26 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> >  	skb_orphan(skb);
> >  	nf_reset(skb);
> > 
> > -	/* Apparently nice girls don't return TX_BUSY; stop the queue
> > -	 * before it gets out of hand.  Naturally, this wastes entries. */
> > -	if (capacity < 2+MAX_SKB_FRAGS) {
> > -		netif_stop_queue(dev);
> > -		if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
> > -			/* More just got used, free them then recheck. */
> > -			free_old_xmit_skbs(vi);
> > -			capacity = virtqueue_min_capacity(vi->svq);
> > -			if (capacity >= 2+MAX_SKB_FRAGS) {
> > -				netif_start_queue(dev);
> > -				virtqueue_disable_cb(vi->svq);
> > -			}
> > +	/* We transmit one skb, so try to free at least two pending skbs.
> > +	 * This is so that we don't hog the skb memory unnecessarily. *
> > +	 * Doing this after kick means there's a chance we'll free
> > +	 * the skb we have just sent, which is hot in cache. */
> > +	for (i = 0; i < 2; i++)
> > +		free_old_xmit_skb(v);
> > +
> > +	if (likely(free_xmit_capacity(vi)))
> > +		return NETDEV_TX_OK;
> > +
> > +stop:
> > +	/* Apparently nice girls don't return TX_BUSY; check capacity and stop
> > +	 * the queue before it gets out of hand.
> > +	 * Naturally, this wastes entries. */
> > +	netif_stop_queue(dev);
> > +	if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
> > +		/* More just got used, free them and recheck. */
> > +		if (free_xmit_capacity(vi)) {
> > +			netif_start_queue(dev);
> > +			virtqueue_disable_cb(vi->svq);
> >  		}
> >  	}
> > 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 0/4] virtio and vhost-net capacity handling
@ 2011-06-06  3:39     ` Rusty Russell
  0 siblings, 0 replies; 50+ messages in thread
From: Rusty Russell @ 2011-06-06  3:39 UTC (permalink / raw)
  To: Michael S. Tsirkin, linux-kernel
  Cc: Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Thu, 2 Jun 2011 20:17:21 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Thu, Jun 02, 2011 at 06:42:35PM +0300, Michael S. Tsirkin wrote:
> > OK, here's a new attempt to use the new capacity api.  I also added more
> > comments to clarify the logic.  Hope this is more readable.  Let me know
> > pls.
> > 
> > This is on top of the patches applied by Rusty.
> > 
> > Warning: untested. Posting now to give people chance to
> > comment on the API.
> > 
> > Changes from v1:
> > - fix comment in patch 2 to correct confusion noted by Rusty
> > - rewrite patch 3 along the lines suggested by Rusty
> >   note: it's not exactly the same but I hope it's close
> >   enough, the main difference is that mine does limited
> >   polling even in the unlikely xmit failure case.
> > - added a patch to not return capacity from add_buf
> >   it always looked like a weird hack
> > 
> > Michael S. Tsirkin (4):
> >   virtio_ring: add capacity check API
> >   virtio_net: fix tx capacity checks using new API
> >   virtio_net: limit xmit polling
> >   Revert "virtio: make add_buf return capacity remaining:
> > 
> >  drivers/net/virtio_net.c     |  111 ++++++++++++++++++++++++++----------------
> >  drivers/virtio/virtio_ring.c |   10 +++-
> >  include/linux/virtio.h       |    7 ++-
> >  3 files changed, 84 insertions(+), 44 deletions(-)
> > 
> > -- 
> > 1.7.5.53.gc233e
> 
> 
> And just FYI, here's a patch (on top) that I considered but
> decided against. This does a single get_buf before
> xmit. I thought it's not really needed as the capacity
> check in add_buf is relatively cheap, and we removed
> the kick in xmit_skb. But the point is that the loop
> will now be easy to move around if someone manages
> to show this benefits speed (which I doubt).

Agreed.  The other is clearer.

I like the approach these patches take.  Testing is required, but I
think the final result is a neater driver than the current one, as well
as having nicer latency.

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 0/4] virtio and vhost-net capacity handling
@ 2011-06-06  3:39     ` Rusty Russell
  0 siblings, 0 replies; 50+ messages in thread
From: Rusty Russell @ 2011-06-06  3:39 UTC (permalink / raw)
  To: Michael S. Tsirkin, linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: Krishna Kumar, Carsten Otte, lguest-uLR06cmDAlY/bJ5BZ2RsiQ,
	Shirley Ma, kvm-u79uwXL29TY76Z2rM5mHXA,
	linux-s390-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	habanero-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Heiko Carstens,
	virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	steved-r/Jw6+rmf7HQT0dZR+AlfA, Christian Borntraeger,
	Tom Lendacky, Martin Schwidefsky,
	linux390-tA70FqPdS9bQT0dZR+AlfA

On Thu, 2 Jun 2011 20:17:21 +0300, "Michael S. Tsirkin" <mst-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On Thu, Jun 02, 2011 at 06:42:35PM +0300, Michael S. Tsirkin wrote:
> > OK, here's a new attempt to use the new capacity api.  I also added more
> > comments to clarify the logic.  Hope this is more readable.  Let me know
> > pls.
> > 
> > This is on top of the patches applied by Rusty.
> > 
> > Warning: untested. Posting now to give people chance to
> > comment on the API.
> > 
> > Changes from v1:
> > - fix comment in patch 2 to correct confusion noted by Rusty
> > - rewrite patch 3 along the lines suggested by Rusty
> >   note: it's not exactly the same but I hope it's close
> >   enough, the main difference is that mine does limited
> >   polling even in the unlikely xmit failure case.
> > - added a patch to not return capacity from add_buf
> >   it always looked like a weird hack
> > 
> > Michael S. Tsirkin (4):
> >   virtio_ring: add capacity check API
> >   virtio_net: fix tx capacity checks using new API
> >   virtio_net: limit xmit polling
> >   Revert "virtio: make add_buf return capacity remaining:
> > 
> >  drivers/net/virtio_net.c     |  111 ++++++++++++++++++++++++++----------------
> >  drivers/virtio/virtio_ring.c |   10 +++-
> >  include/linux/virtio.h       |    7 ++-
> >  3 files changed, 84 insertions(+), 44 deletions(-)
> > 
> > -- 
> > 1.7.5.53.gc233e
> 
> 
> And just FYI, here's a patch (on top) that I considered but
> decided against. This does a single get_buf before
> xmit. I thought it's not really needed as the capacity
> check in add_buf is relatively cheap, and we removed
> the kick in xmit_skb. But the point is that the loop
> will now be easy to move around if someone manages
> to show this benefits speed (which I doubt).

Agreed.  The other is clearer.

I like the approach these patches take.  Testing is required, but I
think the final result is a neater driver than the current one, as well
as having nicer latency.

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 0/4] virtio and vhost-net capacity handling
  2011-06-02 17:17   ` Michael S. Tsirkin
  (?)
  (?)
@ 2011-06-06  3:39   ` Rusty Russell
  -1 siblings, 0 replies; 50+ messages in thread
From: Rusty Russell @ 2011-06-06  3:39 UTC (permalink / raw)
  To: Michael S. Tsirkin, linux-kernel
  Cc: Krishna Kumar, Carsten Otte, lguest, Shirley Ma, kvm, linux-s390,
	netdev, habanero, Heiko Carstens, virtualization, steved,
	Christian Borntraeger, Tom Lendacky, Martin Schwidefsky,
	linux390

On Thu, 2 Jun 2011 20:17:21 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Thu, Jun 02, 2011 at 06:42:35PM +0300, Michael S. Tsirkin wrote:
> > OK, here's a new attempt to use the new capacity api.  I also added more
> > comments to clarify the logic.  Hope this is more readable.  Let me know
> > pls.
> > 
> > This is on top of the patches applied by Rusty.
> > 
> > Warning: untested. Posting now to give people chance to
> > comment on the API.
> > 
> > Changes from v1:
> > - fix comment in patch 2 to correct confusion noted by Rusty
> > - rewrite patch 3 along the lines suggested by Rusty
> >   note: it's not exactly the same but I hope it's close
> >   enough, the main difference is that mine does limited
> >   polling even in the unlikely xmit failure case.
> > - added a patch to not return capacity from add_buf
> >   it always looked like a weird hack
> > 
> > Michael S. Tsirkin (4):
> >   virtio_ring: add capacity check API
> >   virtio_net: fix tx capacity checks using new API
> >   virtio_net: limit xmit polling
> >   Revert "virtio: make add_buf return capacity remaining:
> > 
> >  drivers/net/virtio_net.c     |  111 ++++++++++++++++++++++++++----------------
> >  drivers/virtio/virtio_ring.c |   10 +++-
> >  include/linux/virtio.h       |    7 ++-
> >  3 files changed, 84 insertions(+), 44 deletions(-)
> > 
> > -- 
> > 1.7.5.53.gc233e
> 
> 
> And just FYI, here's a patch (on top) that I considered but
> decided against. This does a single get_buf before
> xmit. I thought it's not really needed as the capacity
> check in add_buf is relatively cheap, and we removed
> the kick in xmit_skb. But the point is that the loop
> will now be easy to move around if someone manages
> to show this benefits speed (which I doubt).

Agreed.  The other is clearer.

I like the approach these patches take.  Testing is required, but I
think the final result is a neater driver than the current one, as well
as having nicer latency.

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 4/4] Revert "virtio: make add_buf return capacity remaining:
@ 2011-06-07 15:54     ` Michael S. Tsirkin
  0 siblings, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2011-06-07 15:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Thu, Jun 02, 2011 at 06:43:25PM +0300, Michael S. Tsirkin wrote:
> This reverts commit 3c1b27d5043086a485f8526353ae9fe37bfa1065.
> The only user was virtio_net, and it switched to
> min_capacity instead.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

It turns out another place in virtio_net - receive
buffer processing - relies on the old behaviour:

try_fill_recv:
	do {
		if (vi->mergeable_rx_bufs)
			err = add_recvbuf_mergeable(vi, gfp);
		else if (vi->big_packets)
			err = add_recvbuf_big(vi, gfp);
		else
			err = add_recvbuf_small(vi, gfp);

		oom = err == -ENOMEM;
		if (err < 0)
			break;
		++vi->num;
	} while (err > 0);

The point is to avoid allocating a buf if
the ring is out of space and we are sure
add_buf will fail.

It works well for mergeable buffers and for big
packets when we are not OOM. Small packets, and
the OOM case, will do extra get_page/put_page calls
(but maybe we don't care).

So this is RX; I intend to drop it from this patchset and focus on the
TX side for starters.
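
For reference, if we did want to keep this working without the add_buf
return value, one untested option is an up-front capacity check along
these lines (the per-mode descriptor counts are my reading of the
existing add_recvbuf_* helpers, so treat them as assumptions):

/* Untested sketch: report whether the receive ring can take the
 * worst-case buffer for the current mode, using the new
 * virtqueue_min_capacity() instead of the add_buf return value. */
static bool recv_ring_has_room(struct virtnet_info *vi)
{
	unsigned int needed;

	if (vi->mergeable_rx_bufs)
		needed = 1;			/* single page */
	else if (vi->big_packets)
		needed = MAX_SKB_FRAGS + 2;	/* header + frag pages */
	else
		needed = 2;			/* header + skb data */

	return virtqueue_min_capacity(vi->rvq) >= needed;
}

try_fill_recv would then loop while recv_ring_has_room(vi) returns true
and break only on allocation failure.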

> ---
>  drivers/virtio/virtio_ring.c |    2 +-
>  include/linux/virtio.h       |    2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 23422f1..a6c21eb 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -233,7 +233,7 @@ add_head:
>  	pr_debug("Added buffer head %i to %p\n", head, vq);
>  	END_USE(vq);
>  
> -	return vq->num_free;
> +	return 0;
>  }
>  EXPORT_SYMBOL_GPL(virtqueue_add_buf_gfp);
>  
> diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> index 209220d..63c4908 100644
> --- a/include/linux/virtio.h
> +++ b/include/linux/virtio.h
> @@ -34,7 +34,7 @@ struct virtqueue {
>   *	in_num: the number of sg which are writable (after readable ones)
>   *	data: the token identifying the buffer.
>   *	gfp: how to do memory allocations (if necessary).
> - *      Returns remaining capacity of queue (sg segments) or a negative error.
> + *      Returns 0 on success or a negative error.
>   * virtqueue_kick: update after add_buf
>   *	vq: the struct virtqueue
>   *	After one or more add_buf calls, invoke this to kick the other side.
> -- 
> 1.7.5.53.gc233e

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 4/4] Revert "virtio: make add_buf return capacity remaining:
@ 2011-06-07 15:54     ` Michael S. Tsirkin
  0 siblings, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2011-06-07 15:54 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: Krishna Kumar, Carsten Otte, lguest-uLR06cmDAlY/bJ5BZ2RsiQ,
	Shirley Ma, kvm-u79uwXL29TY76Z2rM5mHXA,
	linux-s390-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	habanero-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Heiko Carstens,
	virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	steved-r/Jw6+rmf7HQT0dZR+AlfA, Christian Borntraeger,
	Tom Lendacky, Martin Schwidefsky,
	linux390-tA70FqPdS9bQT0dZR+AlfA

On Thu, Jun 02, 2011 at 06:43:25PM +0300, Michael S. Tsirkin wrote:
> This reverts commit 3c1b27d5043086a485f8526353ae9fe37bfa1065.
> The only user was virtio_net, and it switched to
> min_capacity instead.
> 
> Signed-off-by: Michael S. Tsirkin <mst-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

It turns out another place in virtio_net - receive
buffer processing - relies on the old behaviour:

try_fill_recv:
	do {
		if (vi->mergeable_rx_bufs)
			err = add_recvbuf_mergeable(vi, gfp);
		else if (vi->big_packets)
			err = add_recvbuf_big(vi, gfp);
		else
			err = add_recvbuf_small(vi, gfp);

		oom = err == -ENOMEM;
		if (err < 0)
			break;
		++vi->num;
	} while (err > 0);

The point is to avoid allocating a buf if
the ring is out of space and we are sure
add_buf will fail.

It works well for mergeable buffers and for big
packets when we are not OOM. Small packets, and
the OOM case, will do extra get_page/put_page calls
(but maybe we don't care).

So this is RX; I intend to drop it from this patchset and focus on the
TX side for starters.

> ---
>  drivers/virtio/virtio_ring.c |    2 +-
>  include/linux/virtio.h       |    2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 23422f1..a6c21eb 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -233,7 +233,7 @@ add_head:
>  	pr_debug("Added buffer head %i to %p\n", head, vq);
>  	END_USE(vq);
>  
> -	return vq->num_free;
> +	return 0;
>  }
>  EXPORT_SYMBOL_GPL(virtqueue_add_buf_gfp);
>  
> diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> index 209220d..63c4908 100644
> --- a/include/linux/virtio.h
> +++ b/include/linux/virtio.h
> @@ -34,7 +34,7 @@ struct virtqueue {
>   *	in_num: the number of sg which are writable (after readable ones)
>   *	data: the token identifying the buffer.
>   *	gfp: how to do memory allocations (if necessary).
> - *      Returns remaining capacity of queue (sg segments) or a negative error.
> + *      Returns 0 on success or a negative error.
>   * virtqueue_kick: update after add_buf
>   *	vq: the struct virtqueue
>   *	After one or more add_buf calls, invoke this to kick the other side.
> -- 
> 1.7.5.53.gc233e

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 4/4] Revert "virtio: make add_buf return capacity remaining:
  2011-06-02 15:43   ` Michael S. Tsirkin
  (?)
  (?)
@ 2011-06-07 15:54   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2011-06-07 15:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Krishna Kumar, Carsten Otte, lguest, Shirley Ma, kvm, linux-s390,
	netdev, habanero, Heiko Carstens, virtualization, steved,
	Christian Borntraeger, Tom Lendacky, Martin Schwidefsky,
	linux390

On Thu, Jun 02, 2011 at 06:43:25PM +0300, Michael S. Tsirkin wrote:
> This reverts commit 3c1b27d5043086a485f8526353ae9fe37bfa1065.
> The only user was virtio_net, and it switched to
> min_capacity instead.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

It turns out another place in virtio_net - receive
buffer processing - relies on the old behaviour:

try_fill_recv:
	do {
		if (vi->mergeable_rx_bufs)
			err = add_recvbuf_mergeable(vi, gfp);
		else if (vi->big_packets)
			err = add_recvbuf_big(vi, gfp);
		else
			err = add_recvbuf_small(vi, gfp);

		oom = err == -ENOMEM;
		if (err < 0)
			break;
		++vi->num;
	} while (err > 0);

The point is to avoid allocating a buf if
the ring is out of space and we are sure
add_buf will fail.

It works well for mergeable buffers and for big
packets when we are not OOM. Small packets, and
the OOM case, will do extra get_page/put_page calls
(but maybe we don't care).

So this is RX; I intend to drop it from this patchset and focus on the
TX side for starters.

> ---
>  drivers/virtio/virtio_ring.c |    2 +-
>  include/linux/virtio.h       |    2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 23422f1..a6c21eb 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -233,7 +233,7 @@ add_head:
>  	pr_debug("Added buffer head %i to %p\n", head, vq);
>  	END_USE(vq);
>  
> -	return vq->num_free;
> +	return 0;
>  }
>  EXPORT_SYMBOL_GPL(virtqueue_add_buf_gfp);
>  
> diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> index 209220d..63c4908 100644
> --- a/include/linux/virtio.h
> +++ b/include/linux/virtio.h
> @@ -34,7 +34,7 @@ struct virtqueue {
>   *	in_num: the number of sg which are writable (after readable ones)
>   *	data: the token identifying the buffer.
>   *	gfp: how to do memory allocations (if necessary).
> - *      Returns remaining capacity of queue (sg segments) or a negative error.
> + *      Returns 0 on success or a negative error.
>   * virtqueue_kick: update after add_buf
>   *	vq: the struct virtqueue
>   *	After one or more add_buf calls, invoke this to kick the other side.
> -- 
> 1.7.5.53.gc233e

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 3/4] virtio_net: limit xmit polling
@ 2011-06-07 15:59     ` Michael S. Tsirkin
  0 siblings, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2011-06-07 15:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Thu, Jun 02, 2011 at 06:43:17PM +0300, Michael S. Tsirkin wrote:
> Current code might introduce a lot of latency variation
> if there are many pending bufs at the time we
> attempt to transmit a new one. This is bad for
> real-time applications and can't be good for TCP either.
> 
> Free up just enough to both clean up all buffers
> eventually and to be able to xmit the next packet.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>


I've been testing this patch and it seems to work fine
so far. The following fixups are needed to make it
build, though:


diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b25db1c..77cdf34 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -529,11 +529,8 @@ static bool free_old_xmit_skb(struct virtnet_info *vi)
  * virtqueue_add_buf will succeed. */
 static bool free_xmit_capacity(struct virtnet_info *vi)
 {
-	struct sk_buff *skb;
-	unsigned int len;
-
 	while (virtqueue_min_capacity(vi->svq) < MAX_SKB_FRAGS + 2)
-		if (unlikely(!free_old_xmit_skb))
+		if (unlikely(!free_old_xmit_skb(vi)))
 			return false;
 	return true;
 }
@@ -628,7 +625,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	 * Doing this after kick means there's a chance we'll free
 	 * the skb we have just sent, which is hot in cache. */
 	for (i = 0; i < 2; i++)
-		free_old_xmit_skb(v);
+		free_old_xmit_skb(vi);
 
 	if (likely(free_xmit_capacity(vi)))
 		return NETDEV_TX_OK;

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 3/4] virtio_net: limit xmit polling
@ 2011-06-07 15:59     ` Michael S. Tsirkin
  0 siblings, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2011-06-07 15:59 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: Krishna Kumar, Carsten Otte, lguest-uLR06cmDAlY/bJ5BZ2RsiQ,
	Shirley Ma, kvm-u79uwXL29TY76Z2rM5mHXA,
	linux-s390-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	habanero-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Heiko Carstens,
	virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	steved-r/Jw6+rmf7HQT0dZR+AlfA, Christian Borntraeger,
	Tom Lendacky, Martin Schwidefsky,
	linux390-tA70FqPdS9bQT0dZR+AlfA

On Thu, Jun 02, 2011 at 06:43:17PM +0300, Michael S. Tsirkin wrote:
> Current code might introduce a lot of latency variation
> if there are many pending bufs at the time we
> attempt to transmit a new one. This is bad for
> real-time applications and can't be good for TCP either.
> 
> Free up just enough to both clean up all buffers
> eventually and to be able to xmit the next packet.
> 
> Signed-off-by: Michael S. Tsirkin <mst-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>


I've been testing this patch and it seems to work fine
so far. The following fixups are needed to make it
build, though:


diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b25db1c..77cdf34 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -529,11 +529,8 @@ static bool free_old_xmit_skb(struct virtnet_info *vi)
  * virtqueue_add_buf will succeed. */
 static bool free_xmit_capacity(struct virtnet_info *vi)
 {
-	struct sk_buff *skb;
-	unsigned int len;
-
 	while (virtqueue_min_capacity(vi->svq) < MAX_SKB_FRAGS + 2)
-		if (unlikely(!free_old_xmit_skb))
+		if (unlikely(!free_old_xmit_skb(vi)))
 			return false;
 	return true;
 }
@@ -628,7 +625,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	 * Doing this after kick means there's a chance we'll free
 	 * the skb we have just sent, which is hot in cache. */
 	for (i = 0; i < 2; i++)
-		free_old_xmit_skb(v);
+		free_old_xmit_skb(vi);
 
 	if (likely(free_xmit_capacity(vi)))
 		return NETDEV_TX_OK;

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 3/4] virtio_net: limit xmit polling
  2011-06-02 15:43   ` Michael S. Tsirkin
                     ` (2 preceding siblings ...)
  (?)
@ 2011-06-07 15:59   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2011-06-07 15:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Krishna Kumar, Carsten Otte, lguest, Shirley Ma, kvm, linux-s390,
	netdev, habanero, Heiko Carstens, virtualization, steved,
	Christian Borntraeger, Tom Lendacky, Martin Schwidefsky,
	linux390

On Thu, Jun 02, 2011 at 06:43:17PM +0300, Michael S. Tsirkin wrote:
> Current code might introduce a lot of latency variation
> if there are many pending bufs at the time we
> attempt to transmit a new one. This is bad for
> real-time applications and can't be good for TCP either.
> 
> Free up just enough to both clean up all buffers
> eventually and to be able to xmit the next packet.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>


I've been testing this patch and it seems to work fine
so far. The following fixups are needed to make it
build, though:


diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b25db1c..77cdf34 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -529,11 +529,8 @@ static bool free_old_xmit_skb(struct virtnet_info *vi)
  * virtqueue_add_buf will succeed. */
 static bool free_xmit_capacity(struct virtnet_info *vi)
 {
-	struct sk_buff *skb;
-	unsigned int len;
-
 	while (virtqueue_min_capacity(vi->svq) < MAX_SKB_FRAGS + 2)
-		if (unlikely(!free_old_xmit_skb))
+		if (unlikely(!free_old_xmit_skb(vi)))
 			return false;
 	return true;
 }
@@ -628,7 +625,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	 * Doing this after kick means there's a chance we'll free
 	 * the skb we have just sent, which is hot in cache. */
 	for (i = 0; i < 2; i++)
-		free_old_xmit_skb(v);
+		free_old_xmit_skb(vi);
 
 	if (likely(free_xmit_capacity(vi)))
 		return NETDEV_TX_OK;

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 0/4] virtio and vhost-net capacity handling
@ 2011-06-07 16:08   ` Michael S. Tsirkin
  0 siblings, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2011-06-07 16:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Thu, Jun 02, 2011 at 06:42:35PM +0300, Michael S. Tsirkin wrote:
> OK, here's a new attempt to use the new capacity api.  I also added more
> comments to clarify the logic.  Hope this is more readable.  Let me know
> pls.
> 
> This is on top of the patches applied by Rusty.
> 
> Warning: untested. Posting now to give people chance to
> comment on the API.

OK, this seems to have survived some testing so far,
after I dropped patch 4 and fixed the build for patch 3
(build fixup patch sent in reply to the original).

I'll be mostly offline until Sunday; I would appreciate
testing reports.

git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git
virtio-net-xmit-polling-v2
git://git.kernel.org/pub/scm/linux/kernel/git/mst/qemu-kvm.git
virtio-net-event-idx-v3

Thanks!

> Changes from v1:
> - fix comment in patch 2 to correct confusion noted by Rusty
> - rewrite patch 3 along the lines suggested by Rusty
>   note: it's not exactly the same but I hope it's close
>   enough, the main difference is that mine does limited
>   polling even in the unlikely xmit failure case.
> - added a patch to not return capacity from add_buf
>   it always looked like a weird hack
> 
> Michael S. Tsirkin (4):
>   virtio_ring: add capacity check API
>   virtio_net: fix tx capacity checks using new API
>   virtio_net: limit xmit polling
>   Revert "virtio: make add_buf return capacity remaining:
> 
>  drivers/net/virtio_net.c     |  111 ++++++++++++++++++++++++++----------------
>  drivers/virtio/virtio_ring.c |   10 +++-
>  include/linux/virtio.h       |    7 ++-
>  3 files changed, 84 insertions(+), 44 deletions(-)
> 
> -- 
> 1.7.5.53.gc233e

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 0/4] virtio and vhost-net capacity handling
@ 2011-06-07 16:08   ` Michael S. Tsirkin
  0 siblings, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2011-06-07 16:08 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: Krishna Kumar, Carsten Otte, lguest-uLR06cmDAlY/bJ5BZ2RsiQ,
	Shirley Ma, kvm-u79uwXL29TY76Z2rM5mHXA,
	linux-s390-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	habanero-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, Heiko Carstens,
	virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	steved-r/Jw6+rmf7HQT0dZR+AlfA, Christian Borntraeger,
	Tom Lendacky, Martin Schwidefsky,
	linux390-tA70FqPdS9bQT0dZR+AlfA

On Thu, Jun 02, 2011 at 06:42:35PM +0300, Michael S. Tsirkin wrote:
> OK, here's a new attempt to use the new capacity api.  I also added more
> comments to clarify the logic.  Hope this is more readable.  Let me know
> pls.
> 
> This is on top of the patches applied by Rusty.
> 
> Warning: untested. Posting now to give people chance to
> comment on the API.

OK, this seems to have survived some testing so far,
after I dropped patch 4 and fixed the build for patch 3
(build fixup patch sent in reply to the original).

I'll be mostly offline until Sunday; I would appreciate
testing reports.

git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git
virtio-net-xmit-polling-v2
git://git.kernel.org/pub/scm/linux/kernel/git/mst/qemu-kvm.git
virtio-net-event-idx-v3

Thanks!

> Changes from v1:
> - fix comment in patch 2 to correct confusion noted by Rusty
> - rewrite patch 3 along the lines suggested by Rusty
>   note: it's not exactly the same but I hope it's close
>   enough, the main difference is that mine does limited
>   polling even in the unlikely xmit failure case.
> - added a patch to not return capacity from add_buf
>   it always looked like a weird hack
> 
> Michael S. Tsirkin (4):
>   virtio_ring: add capacity check API
>   virtio_net: fix tx capacity checks using new API
>   virtio_net: limit xmit polling
>   Revert "virtio: make add_buf return capacity remaining:
> 
>  drivers/net/virtio_net.c     |  111 ++++++++++++++++++++++++++----------------
>  drivers/virtio/virtio_ring.c |   10 +++-
>  include/linux/virtio.h       |    7 ++-
>  3 files changed, 84 insertions(+), 44 deletions(-)
> 
> -- 
> 1.7.5.53.gc233e

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 0/4] virtio and vhost-net capacity handling
  2011-06-02 15:42 ` Michael S. Tsirkin
                   ` (11 preceding siblings ...)
  (?)
@ 2011-06-07 16:08 ` Michael S. Tsirkin
  -1 siblings, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2011-06-07 16:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Krishna Kumar, Carsten Otte, lguest, Shirley Ma, kvm, linux-s390,
	netdev, habanero, Heiko Carstens, virtualization, steved,
	Christian Borntraeger, Tom Lendacky, Martin Schwidefsky,
	linux390

On Thu, Jun 02, 2011 at 06:42:35PM +0300, Michael S. Tsirkin wrote:
> OK, here's a new attempt to use the new capacity api.  I also added more
> comments to clarify the logic.  Hope this is more readable.  Let me know
> pls.
> 
> This is on top of the patches applied by Rusty.
> 
> Warning: untested. Posting now to give people chance to
> comment on the API.

OK, this seems to have survived some testing so far,
after I dropped patch 4 and fixed the build for patch 3
(build fixup patch sent in reply to the original).

I'll be mostly offline until Sunday; I would appreciate
testing reports.

git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git
virtio-net-xmit-polling-v2
git://git.kernel.org/pub/scm/linux/kernel/git/mst/qemu-kvm.git
virtio-net-event-idx-v3

Thanks!

> Changes from v1:
> - fix comment in patch 2 to correct confusion noted by Rusty
> - rewrite patch 3 along the lines suggested by Rusty
>   note: it's not exactly the same but I hope it's close
>   enough, the main difference is that mine does limited
>   polling even in the unlikely xmit failure case.
> - added a patch to not return capacity from add_buf
>   it always looked like a weird hack
> 
> Michael S. Tsirkin (4):
>   virtio_ring: add capacity check API
>   virtio_net: fix tx capacity checks using new API
>   virtio_net: limit xmit polling
>   Revert "virtio: make add_buf return capacity remaining:
> 
>  drivers/net/virtio_net.c     |  111 ++++++++++++++++++++++++++----------------
>  drivers/virtio/virtio_ring.c |   10 +++-
>  include/linux/virtio.h       |    7 ++-
>  3 files changed, 84 insertions(+), 44 deletions(-)
> 
> -- 
> 1.7.5.53.gc233e

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 4/4] Revert "virtio: make add_buf return capacity remaining:
  2011-06-07 15:54     ` Michael S. Tsirkin
  (?)
@ 2011-06-08  0:19     ` Rusty Russell
  -1 siblings, 0 replies; 50+ messages in thread
From: Rusty Russell @ 2011-06-08  0:19 UTC (permalink / raw)
  To: Michael S. Tsirkin, linux-kernel
  Cc: Carsten Otte, Christian Borntraeger, linux390,
	Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
	virtualization, netdev, linux-s390, kvm, Krishna Kumar,
	Tom Lendacky, steved, habanero

On Tue, 7 Jun 2011 18:54:57 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Thu, Jun 02, 2011 at 06:43:25PM +0300, Michael S. Tsirkin wrote:
> > This reverts commit 3c1b27d5043086a485f8526353ae9fe37bfa1065.
> > The only user was virtio_net, and it switched to
> > min_capacity instead.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> It turns out that another place in virtio_net - receive
> buf processing - relies on the old behaviour:
> 
> try_fill_recv:
> 	do {
> 		if (vi->mergeable_rx_bufs)
> 			err = add_recvbuf_mergeable(vi, gfp);
> 		else if (vi->big_packets)
> 			err = add_recvbuf_big(vi, gfp);
> 		else
> 			err = add_recvbuf_small(vi, gfp);
> 
> 		oom = err == -ENOMEM;
> 		if (err < 0)
> 			break;
> 		++vi->num;
> 	} while (err > 0);
> 
> The point is to avoid allocating a buf if
> the ring is out of space and we are sure
> add_buf will fail.
> 
> It works well for mergeable buffers and for big
> packets if we are not OOM.  Small packets and
> OOM will do extra get_page/put_page calls
> (but maybe we don't care).
> 
> So this is RX; I intend to drop it from this patchset and focus on the
> TX side for starters.
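
One way to keep that guarantee without add_buf's return value would be
to check capacity before each allocation.  A rough sketch, not the
actual patch (virtqueue_min_capacity() is the helper assumed from
patch 1, and the per-mode slot counts are guesses):

	/* Skip the allocation entirely when the ring cannot take the
	 * buffer we are about to build.  Assumed slot costs: mergeable
	 * rx buf = 1, big packet = MAX_SKB_FRAGS + 2, small packet = 2.
	 */
	static bool recv_ring_has_room(struct virtnet_info *vi)
	{
		unsigned int need;

		if (vi->mergeable_rx_bufs)
			need = 1;
		else if (vi->big_packets)
			need = MAX_SKB_FRAGS + 2;
		else
			need = 2;

		return virtqueue_min_capacity(vi->rvq) >= need;
	}

try_fill_recv() would then break out of its loop as soon as
recv_ring_has_room() returns false, instead of relying on what
add_buf returns.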

We could do some hack where we get the capacity, estimate how many
packets we need to fill it, and then try to add that many.

I say hack, because knowing whether we're doing indirect buffers is a
layering violation.  But that's life when you're trying to do
micro-optimizations.
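
Something like this, perhaps (a hand-waving sketch of a fragment of
try_fill_recv(); using_indirect stands in for exactly the layering
violation above, and virtqueue_min_capacity() is the helper assumed
from patch 1):

	/* Estimate how many receive buffers fit in the remaining ring
	 * space and add only that many.  With indirect descriptors a
	 * buffer costs one slot; without them a big packet costs
	 * MAX_SKB_FRAGS + 2 slots.
	 */
	unsigned int slots = virtqueue_min_capacity(vi->rvq);
	unsigned int per_buf = using_indirect ? 1 : MAX_SKB_FRAGS + 2;
	unsigned int budget = slots / per_buf;

	while (budget--) {
		err = add_recvbuf_big(vi, gfp);	/* or small/mergeable */
		if (err < 0)
			break;
		++vi->num;
	}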

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 0/4] virtio and vhost-net capacity handling
  2011-06-07 16:08   ` Michael S. Tsirkin
  (?)
  (?)
@ 2011-06-13 13:32   ` Krishna Kumar2
  2011-06-13 13:35     ` Michael S. Tsirkin
                       ` (5 more replies)
  -1 siblings, 6 replies; 50+ messages in thread
From: Krishna Kumar2 @ 2011-06-13 13:32 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Christian Borntraeger, Carsten Otte, habanero, Heiko Carstens,
	kvm, lguest, linux-kernel, linux-s390, linux390, netdev,
	Rusty Russell, Martin Schwidefsky, steved, Tom Lendacky,
	virtualization, Shirley Ma

"Michael S. Tsirkin" <mst@redhat.com> wrote on 06/07/2011 09:38:30 PM:

> > This is on top of the patches applied by Rusty.
> >
> > Warning: untested. Posting now to give people chance to
> > comment on the API.
>
> OK, this seems to have survived some testing so far,
> after I dropped patch 4 and fixed build for patch 3
> (build fixup patch sent in reply to the original).
>
> I'll be mostly offline until Sunday, would appreciate
> testing reports.

Hi Michael,

I ran the latest patches with 1K I/O (guest->local host) and
the results are (60 sec run for each test case):

______________________________
#sessions   BW%       SD%
______________________________
1           -25.6     47.0
2           -29.3     22.9
4              .8     1.6
8             1.6     0
16          -1.6      4.1
32          -5.3      2.1
48           11.3    -7.8
64          -2.8      .7
96          -6.2      .6
128         -10.6     12.7
______________________________
BW: -4.8     SD: 5.4

I tested it again to see if the regression is fleeting (since
the numbers vary quite a bit for 1K I/O even between guest->
local host), but:

______________________________
#sessions  BW%     SD%
______________________________
1          14.0    -17.3
2          19.9    -11.1
4          7.9     -15.3
8          9.6     -13.1
16         1.2     -7.3
32        -.6      -13.5
48        -28.7     10.0
64        -5.7     -.7
96        -9.4     -8.1
128       -9.4      .7
______________________________
BW: -3.7     SD: -2.0


With 16K, there was an improvement in SD, but
higher sessions seem to slightly degrade BW/SD:

______________________________
#sessions  BW%      SD%
______________________________
1          30.9    -25.0
2          16.5    -19.4
4         -1.3      7.9
8          1.4      6.2
16         3.9     -5.4
32         0        4.3
48        -.5        .1
64         32.1    -1.5
96        -2.1      23.2
128       -7.4      3.8
______________________________
BW: 5.0          SD: 7.5


Thanks,

- KK


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 0/4] virtio and vhost-net capacity handling
@ 2011-06-13 13:35       ` Michael S. Tsirkin
  0 siblings, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2011-06-13 13:35 UTC (permalink / raw)
  To: Krishna Kumar2
  Cc: Christian Borntraeger, Carsten Otte, habanero, Heiko Carstens,
	kvm, lguest, linux-kernel, linux-s390, linux390, netdev,
	Rusty Russell, Martin Schwidefsky, steved, Tom Lendacky,
	virtualization, Shirley Ma

On Mon, Jun 13, 2011 at 07:02:27PM +0530, Krishna Kumar2 wrote:
> "Michael S. Tsirkin" <mst@redhat.com> wrote on 06/07/2011 09:38:30 PM:
> 
> > > This is on top of the patches applied by Rusty.
> > >
> > > Warning: untested. Posting now to give people chance to
> > > comment on the API.
> >
> > OK, this seems to have survived some testing so far,
> > after I dropped patch 4 and fixed build for patch 3
> > (build fixup patch sent in reply to the original).
> >
> > I'll be mostly offline until Sunday, would appreciate
> > testing reports.
> 
> Hi Michael,
> 
> I ran the latest patches with 1K I/O (guest->local host) and
> the results are (60 sec run for each test case):

Hi!
Did you apply this one:
[PATCHv2 RFC 4/4] Revert "virtio: make add_buf return capacity remaining"
?

It turns out that that patch has a bug and should be reverted,
only patches 1-3 should be applied.

Could you confirm please?


> ______________________________
> #sessions   BW%       SD%
> ______________________________
> 1           -25.6     47.0
> 2           -29.3     22.9
> 4              .8     1.6
> 8             1.6     0
> 16          -1.6      4.1
> 32          -5.3      2.1
> 48           11.3    -7.8
> 64          -2.8      .7
> 96          -6.2      .6
> 128         -10.6     12.7
> ______________________________
> BW: -4.8     SD: 5.4
> 
> I tested it again to see if the regression is fleeting (since
> the numbers vary quite a bit for 1K I/O even between guest->
> local host), but:
> 
> ______________________________
> #sessions  BW%     SD%
> ______________________________
> 1          14.0    -17.3
> 2          19.9    -11.1
> 4          7.9     -15.3
> 8          9.6     -13.1
> 16         1.2     -7.3
> 32        -.6      -13.5
> 48        -28.7     10.0
> 64        -5.7     -.7
> 96        -9.4     -8.1
> 128       -9.4      .7
> ______________________________
> BW: -3.7     SD: -2.0
> 
> 
> With 16K, there was an improvement in SD, but
> higher sessions seem to slightly degrade BW/SD:
> 
> ______________________________
> #sessions  BW%      SD%
> ______________________________
> 1          30.9    -25.0
> 2          16.5    -19.4
> 4         -1.3      7.9
> 8          1.4      6.2
> 16         3.9     -5.4
> 32         0        4.3
> 48        -.5        .1
> 64         32.1    -1.5
> 96        -2.1      23.2
> 128       -7.4      3.8
> ______________________________
> BW: 5.0          SD: 7.5
> 
> 
> Thanks,
> 
> - KK

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 0/4] virtio and vhost-net capacity handling
@ 2011-06-13 13:38       ` Krishna Kumar2
  0 siblings, 0 replies; 50+ messages in thread
From: Krishna Kumar2 @ 2011-06-13 13:38 UTC (permalink / raw)
  To: Krishna Kumar2
  Cc: Christian Borntraeger, Carsten Otte, habanero, Heiko Carstens,
	kvm, lguest, linux-kernel, linux-s390, linux390,
	Michael S. Tsirkin, netdev, netdev-owner, Rusty Russell,
	Martin Schwidefsky, steved, Tom Lendacky, virtualization,
	Shirley Ma

> Krishna Kumar2/India/IBM@IBMIN wrote on 06/13/2011 07:02:27 PM:

...

> With 16K, there was an improvement in SD, but
> higher sessions seem to slightly degrade BW/SD:

I meant to say "With 16K, there was an improvement in BW"
above. Again, the numbers are not very reproducible;
I will also test with a remote host to see if I get more
consistent numbers.

Thanks,

- KK

>
> ______________________________
> #sessions  BW%      SD%
> ______________________________
> 1          30.9    -25.0
> 2          16.5    -19.4
> 4         -1.3      7.9
> 8          1.4      6.2
> 16         3.9     -5.4
> 32         0        4.3
> 48        -.5        .1
> 64         32.1    -1.5
> 96        -2.1      23.2
> 128       -7.4      3.8
> ______________________________
> BW: 5.0          SD: 7.5


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 0/4] virtio and vhost-net capacity handling
@ 2011-06-13 13:44         ` Krishna Kumar2
  0 siblings, 0 replies; 50+ messages in thread
From: Krishna Kumar2 @ 2011-06-13 13:44 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Christian Borntraeger, Carsten Otte, habanero, Heiko Carstens,
	kvm, lguest, linux-kernel, linux-s390, linux390, netdev,
	Rusty Russell, Martin Schwidefsky, steved, Tom Lendacky,
	virtualization, Shirley Ma

"Michael S. Tsirkin" <mst@redhat.com> wrote on 06/13/2011 07:05:13 PM:

> > I ran the latest patches with 1K I/O (guest->local host) and
> > the results are (60 sec run for each test case):
>
> Hi!
> Did you apply this one:
> [PATCHv2 RFC 4/4] Revert "virtio: make add_buf return capacity remaining"
> ?
>
> It turns out that that patch has a bug and should be reverted,
> only patches 1-3 should be applied.
>
> Could you confirm please?

No, I didn't apply that patch. I had also seen your earlier
mail about this patch breaking receive buffer processing
if applied.

thanks,

- KK


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHv2 RFC 0/4] virtio and vhost-net capacity handling
@ 2011-06-19  8:53       ` Michael S. Tsirkin
  0 siblings, 0 replies; 50+ messages in thread
From: Michael S. Tsirkin @ 2011-06-19  8:53 UTC (permalink / raw)
  To: Krishna Kumar2
  Cc: Christian Borntraeger, Carsten Otte, habanero, Heiko Carstens,
	kvm, lguest, linux-kernel, linux-s390, linux390, netdev,
	Rusty Russell, Martin Schwidefsky, steved, Tom Lendacky,
	virtualization, Shirley Ma

On Mon, Jun 13, 2011 at 07:02:27PM +0530, Krishna Kumar2 wrote:
> "Michael S. Tsirkin" <mst@redhat.com> wrote on 06/07/2011 09:38:30 PM:
> 
> > > This is on top of the patches applied by Rusty.
> > >
> > > Warning: untested. Posting now to give people chance to
> > > comment on the API.
> >
> > OK, this seems to have survived some testing so far,
> > after I dropped patch 4 and fixed build for patch 3
> > (build fixup patch sent in reply to the original).
> >
> > I'll be mostly offline until Sunday, would appreciate
> > testing reports.
> 
> Hi Michael,
> 
> I ran the latest patches with 1K I/O (guest->local host) and
> the results are (60 sec run for each test case):
> 
> ______________________________
> #sessions   BW%       SD%
> ______________________________
> 1           -25.6     47.0
> 2           -29.3     22.9
> 4              .8     1.6
> 8             1.6     0
> 16          -1.6      4.1
> 32          -5.3      2.1
> 48           11.3    -7.8
> 64          -2.8      .7
> 96          -6.2      .6
> 128         -10.6     12.7
> ______________________________
> BW: -4.8     SD: 5.4
> 
> I tested it again to see if the regression is fleeting (since
> the numbers vary quite a bit for 1K I/O even between guest->
> local host), but:
> 
> ______________________________
> #sessions  BW%     SD%
> ______________________________
> 1          14.0    -17.3
> 2          19.9    -11.1
> 4          7.9     -15.3
> 8          9.6     -13.1
> 16         1.2     -7.3
> 32        -.6      -13.5
> 48        -28.7     10.0
> 64        -5.7     -.7
> 96        -9.4     -8.1
> 128       -9.4      .7
> ______________________________
> BW: -3.7     SD: -2.0
> 
> 
> With 16K, there was an improvement in SD, but
> higher sessions seem to slightly degrade BW/SD:
> 
> ______________________________
> #sessions  BW%      SD%
> ______________________________
> 1          30.9    -25.0
> 2          16.5    -19.4
> 4         -1.3      7.9
> 8          1.4      6.2
> 16         3.9     -5.4
> 32         0        4.3
> 48        -.5        .1
> 64         32.1    -1.5
> 96        -2.1      23.2
> 128       -7.4      3.8
> ______________________________
> BW: 5.0          SD: 7.5
> 
> 
> Thanks,
> 
> - KK

I think I see one scenario where we do extra work:
when the TX ring overflows, the first attempt to
add a buf will fail, so the work to format the s/g
list is wasted.  So it might make sense to free up
buffers up to capacity first thing after all (which
will typically do nothing) and add the buf afterwards.
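
Roughly, as an untested sketch rather than a patch
(virtqueue_min_capacity() is the helper assumed from patch 1, and
2 + MAX_SKB_FRAGS is the usual worst case for a tx packet):

	/* In start_xmit, before formatting the s/g list: reclaim
	 * completed tx buffers until there is room for a worst-case
	 * packet.  When the ring already has space the loop body never
	 * runs, so the common case does no extra work.
	 */
	unsigned int len;

	while (virtqueue_min_capacity(vi->svq) < 2 + MAX_SKB_FRAGS) {
		struct sk_buff *old = virtqueue_get_buf(vi->svq, &len);

		if (!old)
			break;	/* nothing completed: ring really is full */
		dev_kfree_skb_any(old);
	}

	/* only now build the s/g list and call add_buf */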

-- 
MST

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2011-06-19  8:53 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-06-02 15:42 [PATCHv2 RFC 0/4] virtio and vhost-net capacity handling Michael S. Tsirkin
2011-06-02 15:42 ` Michael S. Tsirkin
2011-06-02 15:42 ` [PATCHv2 RFC 1/4] virtio_ring: add capacity check API Michael S. Tsirkin
2011-06-02 15:42 ` Michael S. Tsirkin
2011-06-02 15:42   ` Michael S. Tsirkin
2011-06-02 15:43 ` [PATCHv2 RFC 2/4] virtio_net: fix tx capacity checks using new API Michael S. Tsirkin
2011-06-02 15:43   ` Michael S. Tsirkin
2011-06-02 15:43 ` Michael S. Tsirkin
2011-06-02 15:43 ` [PATCHv2 RFC 3/4] virtio_net: limit xmit polling Michael S. Tsirkin
2011-06-02 15:43   ` Michael S. Tsirkin
2011-06-02 18:09   ` Sridhar Samudrala
2011-06-02 19:23     ` Michael S. Tsirkin
2011-06-02 19:23     ` Michael S. Tsirkin
2011-06-02 19:23       ` Michael S. Tsirkin
2011-06-02 18:09   ` Sridhar Samudrala
2011-06-07 15:59   ` Michael S. Tsirkin
2011-06-07 15:59   ` Michael S. Tsirkin
2011-06-07 15:59     ` Michael S. Tsirkin
2011-06-02 15:43 ` Michael S. Tsirkin
2011-06-02 15:43 ` [PATCHv2 RFC 4/4] Revert "virtio: make add_buf return capacity remaining: Michael S. Tsirkin
2011-06-02 15:43   ` Michael S. Tsirkin
2011-06-07 15:54   ` Michael S. Tsirkin
2011-06-07 15:54     ` Michael S. Tsirkin
2011-06-08  0:19     ` Rusty Russell
2011-06-08  0:19     ` Rusty Russell
2011-06-07 15:54   ` Michael S. Tsirkin
2011-06-02 15:43 ` Michael S. Tsirkin
2011-06-02 17:17 ` [PATCHv2 RFC 0/4] virtio and vhost-net capacity handling Michael S. Tsirkin
2011-06-02 17:17   ` Michael S. Tsirkin
2011-06-06  3:39   ` Rusty Russell
2011-06-06  3:39     ` Rusty Russell
2011-06-06  3:39   ` Rusty Russell
2011-06-02 17:17 ` Michael S. Tsirkin
2011-06-07 16:08 ` Michael S. Tsirkin
2011-06-07 16:08   ` Michael S. Tsirkin
2011-06-13 13:32   ` Krishna Kumar2
2011-06-13 13:32   ` Krishna Kumar2
2011-06-13 13:35     ` Michael S. Tsirkin
2011-06-13 13:35     ` Michael S. Tsirkin
2011-06-13 13:35       ` Michael S. Tsirkin
2011-06-13 13:44       ` Krishna Kumar2
2011-06-13 13:44       ` Krishna Kumar2
2011-06-13 13:44         ` Krishna Kumar2
2011-06-13 13:38     ` Krishna Kumar2
2011-06-13 13:38     ` Krishna Kumar2
2011-06-13 13:38       ` Krishna Kumar2
2011-06-19  8:53     ` Michael S. Tsirkin
2011-06-19  8:53     ` Michael S. Tsirkin
2011-06-19  8:53       ` Michael S. Tsirkin
2011-06-07 16:08 ` Michael S. Tsirkin
