* [PATCH net-next V3 0/6] switch to use tx skb array in tun
@ 2016-06-30  3:52 ` Jason Wang
  0 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30  3:52 UTC (permalink / raw)
  To: mst, netdev, linux-kernel, davem
  Cc: kvm, virtualization, eric.dumazet, brouer, Jason Wang

Hi all:

This series switches tun to use an skb array for tx. The goal is to
eliminate the spinlock contention between producer and consumer. The
conversion is straightforward: just introduce a tx skb array and use
it instead of sk_receive_queue.
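
As a rough illustration of the idea (not the kernel's ptr_ring code), the
sketch below models a pointer ring in userspace. A NULL slot means "empty",
so the producer only advances its own index and the consumer only advances
its own. All names (toy_ring and friends) are hypothetical, and locking and
memory barriers are left out, so it is only safe single-threaded.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model: a NULL slot means "empty". */
struct toy_ring {
	int producer;	/* next slot to fill, advanced only by the producer */
	int consumer;	/* next slot to drain, advanced only by the consumer */
	int size;	/* number of slots, assumed > 0 in this sketch */
	void **queue;
};

static int toy_ring_init(struct toy_ring *r, int size)
{
	r->producer = r->consumer = 0;
	r->size = size;
	r->queue = calloc(size, sizeof(*r->queue));
	return r->queue ? 0 : -1;
}

/* Returns 0 on success, -1 if the next slot is still occupied (ring full). */
static int toy_ring_produce(struct toy_ring *r, void *ptr)
{
	if (r->queue[r->producer])
		return -1;
	r->queue[r->producer++] = ptr;
	if (r->producer >= r->size)
		r->producer = 0;
	return 0;
}

/* Returns the oldest entry, or NULL if the ring is empty. */
static void *toy_ring_consume(struct toy_ring *r)
{
	void *ptr = r->queue[r->consumer];

	if (!ptr)
		return NULL;
	r->queue[r->consumer++] = NULL;	/* hand the slot back to the producer */
	if (r->consumer >= r->size)
		r->consumer = 0;
	return ptr;
}
```

Because "full" and "empty" are read off the slot contents, neither side
needs to look at the other side's index.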

A minor issue is keeping the tx_queue_len behaviour, since tun used
it as the length of sk_receive_queue. This is done by:

- adding the ability to resize multiple rings at once, to avoid handling
  partial resize failure for multiple rings.
- adding support for zero length rings.
- introducing a notifier which is triggered when tx_queue_len is
  changed for a netdev.
- resizing all queues when tx_queue_len changes.

Tests show about 15% improvement on guest rx pps:

Before: ~1300000pps
After : ~1500000pps

Changes from V2:
- add multiple rings resizing support for ptr_ring/skb_array
- add zero length ring support
- introduce NETDEV_CHANGE_TX_QUEUE_LEN
- drop new flags

Changes from V1:
- switch to use skb array instead of a customized circular buffer
- add non-blocking support
- rename .peek to .peek_len
- drop lockless peeking since tests show very minor improvement

Jason Wang (5):
  ptr_ring: support zero length ring
  skb_array: minor tweak
  skb_array: add wrappers for resizing
  net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
  tun: switch to use skb array for tx

Michael S. Tsirkin (1):
  ptr_ring: support resizing multiple queues

 drivers/net/tun.c                | 138 ++++++++++++++++++++++++++++++++++++---
 drivers/vhost/net.c              |  16 ++++-
 include/linux/net.h              |   1 +
 include/linux/netdevice.h        |   1 +
 include/linux/ptr_ring.h         |  77 ++++++++++++++++++----
 include/linux/skb_array.h        |  13 +++-
 net/core/net-sysfs.c             |  15 ++++-
 tools/virtio/ringtest/ptr_ring.c |   5 ++
 8 files changed, 243 insertions(+), 23 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH net-next V3 1/6] ptr_ring: support zero length ring
  2016-06-30  3:52 ` Jason Wang
@ 2016-06-30  3:52   ` Jason Wang
  -1 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30  3:52 UTC (permalink / raw)
  To: mst, netdev, linux-kernel, davem
  Cc: kvm, virtualization, eric.dumazet, brouer, Jason Wang

Sometimes we need a zero length ring. The current code will crash in that
case, since we don't do any check before accessing the ring. This patch
fixes this.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 include/linux/ptr_ring.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
index 562a65e..d78b8b8 100644
--- a/include/linux/ptr_ring.h
+++ b/include/linux/ptr_ring.h
@@ -102,7 +102,7 @@ static inline bool ptr_ring_full_bh(struct ptr_ring *r)
  */
 static inline int __ptr_ring_produce(struct ptr_ring *r, void *ptr)
 {
-	if (r->queue[r->producer])
+	if (unlikely(!r->size) || r->queue[r->producer])
 		return -ENOSPC;
 
 	r->queue[r->producer++] = ptr;
@@ -164,7 +164,9 @@ static inline int ptr_ring_produce_bh(struct ptr_ring *r, void *ptr)
  */
 static inline void *__ptr_ring_peek(struct ptr_ring *r)
 {
-	return r->queue[r->consumer];
+	if (likely(r->size))
+		return r->queue[r->consumer];
+	return NULL;
 }
 
 /* Note: callers invoking this in a loop must use a compiler barrier,
-- 
2.7.4
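
For readers who want to try the guard outside the kernel, here is a minimal
userspace sketch of the same check. The zring names are hypothetical, and
the sketch assumes the queue pointer is simply NULL for a zero length ring,
which is an illustration choice rather than what the kernel allocator does.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct zring {
	int producer;
	int consumer;
	int size;	/* may be 0 */
	void **queue;	/* assumption of this sketch: NULL when size == 0 */
};

static int zring_init(struct zring *r, int size)
{
	r->producer = r->consumer = 0;
	r->size = size;
	r->queue = size ? calloc(size, sizeof(*r->queue)) : NULL;
	return (size && !r->queue) ? -ENOMEM : 0;
}

static int zring_produce(struct zring *r, void *ptr)
{
	/* Test size first so queue[] is never touched on an empty ring. */
	if (!r->size || r->queue[r->producer])
		return -ENOSPC;
	r->queue[r->producer++] = ptr;
	if (r->producer >= r->size)
		r->producer = 0;
	return 0;
}

static void *zring_peek(struct zring *r)
{
	if (r->size)
		return r->queue[r->consumer];
	return NULL;	/* a zero length ring always looks empty */
}
```

With the size test in place, a zero length ring behaves like a ring that is
permanently full for producers and permanently empty for consumers.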

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH net-next V3 2/6] skb_array: minor tweak
  2016-06-30  3:52 ` Jason Wang
@ 2016-06-30  3:52   ` Jason Wang
  -1 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30  3:52 UTC (permalink / raw)
  To: mst, netdev, linux-kernel, davem
  Cc: kvm, virtualization, eric.dumazet, brouer, Jason Wang

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 include/linux/skb_array.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/skb_array.h b/include/linux/skb_array.h
index 678bfbf..2dd0d1e 100644
--- a/include/linux/skb_array.h
+++ b/include/linux/skb_array.h
@@ -151,12 +151,12 @@ static inline int skb_array_init(struct skb_array *a, int size, gfp_t gfp)
 	return ptr_ring_init(&a->ring, size, gfp);
 }
 
-void __skb_array_destroy_skb(void *ptr)
+static void __skb_array_destroy_skb(void *ptr)
 {
 	kfree_skb(ptr);
 }
 
-int skb_array_resize(struct skb_array *a, int size, gfp_t gfp)
+static inline int skb_array_resize(struct skb_array *a, int size, gfp_t gfp)
 {
 	return ptr_ring_resize(&a->ring, size, gfp, __skb_array_destroy_skb);
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH net-next V3 3/6] ptr_ring: support resizing multiple queues
  2016-06-30  3:52 ` Jason Wang
@ 2016-06-30  3:52   ` Jason Wang
  -1 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30  3:52 UTC (permalink / raw)
  To: mst, netdev, linux-kernel, davem
  Cc: kvm, virtualization, eric.dumazet, brouer, Jason Wang

From: "Michael S. Tsirkin" <mst@redhat.com>

Sometimes we need to resize multiple queues at once. This is
because it is not easy to recover from a partial failure when
resizing multiple queues.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 include/linux/ptr_ring.h         | 71 +++++++++++++++++++++++++++++++++++-----
 tools/virtio/ringtest/ptr_ring.c |  5 +++
 2 files changed, 67 insertions(+), 9 deletions(-)

diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
index d78b8b8..2052011 100644
--- a/include/linux/ptr_ring.h
+++ b/include/linux/ptr_ring.h
@@ -349,20 +349,14 @@ static inline int ptr_ring_init(struct ptr_ring *r, int size, gfp_t gfp)
 	return 0;
 }
 
-static inline int ptr_ring_resize(struct ptr_ring *r, int size, gfp_t gfp,
-				  void (*destroy)(void *))
+static inline void **__ptr_ring_swap_queue(struct ptr_ring *r, void **queue,
+					   int size, gfp_t gfp,
+					   void (*destroy)(void *))
 {
-	unsigned long flags;
 	int producer = 0;
-	void **queue = __ptr_ring_init_queue_alloc(size, gfp);
 	void **old;
 	void *ptr;
 
-	if (!queue)
-		return -ENOMEM;
-
-	spin_lock_irqsave(&(r)->producer_lock, flags);
-
 	while ((ptr = ptr_ring_consume(r)))
 		if (producer < size)
 			queue[producer++] = ptr;
@@ -375,6 +369,23 @@ static inline int ptr_ring_resize(struct ptr_ring *r, int size, gfp_t gfp,
 	old = r->queue;
 	r->queue = queue;
 
+	return old;
+}
+
+static inline int ptr_ring_resize(struct ptr_ring *r, int size, gfp_t gfp,
+				  void (*destroy)(void *))
+{
+	unsigned long flags;
+	void **queue = __ptr_ring_init_queue_alloc(size, gfp);
+	void **old;
+
+	if (!queue)
+		return -ENOMEM;
+
+	spin_lock_irqsave(&(r)->producer_lock, flags);
+
+	old = __ptr_ring_swap_queue(r, queue, size, gfp, destroy);
+
 	spin_unlock_irqrestore(&(r)->producer_lock, flags);
 
 	kfree(old);
@@ -382,6 +393,48 @@ static inline int ptr_ring_resize(struct ptr_ring *r, int size, gfp_t gfp,
 	return 0;
 }
 
+static inline int ptr_ring_resize_multiple(struct ptr_ring **rings, int nrings,
+					   int size,
+					   gfp_t gfp, void (*destroy)(void *))
+{
+	unsigned long flags;
+	void ***queues;
+	int i;
+
+	queues = kmalloc(nrings * sizeof *queues, gfp);
+	if (!queues)
+		goto noqueues;
+
+	for (i = 0; i < nrings; ++i) {
+		queues[i] = __ptr_ring_init_queue_alloc(size, gfp);
+		if (!queues[i])
+			goto nomem;
+	}
+
+	for (i = 0; i < nrings; ++i) {
+		spin_lock_irqsave(&(rings[i])->producer_lock, flags);
+		queues[i] = __ptr_ring_swap_queue(rings[i], queues[i],
+						  size, gfp, destroy);
+		spin_unlock_irqrestore(&(rings[i])->producer_lock, flags);
+	}
+
+	for (i = 0; i < nrings; ++i)
+		kfree(queues[i]);
+
+	kfree(queues);
+
+	return 0;
+
+nomem:
+	while (--i >= 0)
+		kfree(queues[i]);
+
+	kfree(queues);
+
+noqueues:
+	return -ENOMEM;
+}
+
 static inline void ptr_ring_cleanup(struct ptr_ring *r, void (*destroy)(void *))
 {
 	void *ptr;
diff --git a/tools/virtio/ringtest/ptr_ring.c b/tools/virtio/ringtest/ptr_ring.c
index 74abd74..68e4f9f 100644
--- a/tools/virtio/ringtest/ptr_ring.c
+++ b/tools/virtio/ringtest/ptr_ring.c
@@ -17,6 +17,11 @@
 typedef pthread_spinlock_t  spinlock_t;
 
 typedef int gfp_t;
+static void *kmalloc(unsigned size, gfp_t gfp)
+{
+	return memalign(64, size);
+}
+
 static void *kzalloc(unsigned size, gfp_t gfp)
 {
 	void *p = memalign(64, size);
-- 
2.7.4
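
The all-or-nothing structure of the multi-queue resize can be sketched in
userspace as two phases: allocate every new queue first, then swap. The
sketch below is only a model under stated assumptions: the mring names and
the mring_fail_at test hook are hypothetical, and it leaves out both the
locking and the migration of existing entries that the patch performs.

```c
#include <errno.h>
#include <stdlib.h>

struct mring {
	int size;
	void **queue;
};

/* Hypothetical test hook: fail the Nth allocation from now, -1 = never. */
static int mring_fail_at = -1;

static void **mring_alloc_queue(int size)
{
	if (mring_fail_at == 0) {
		mring_fail_at = -1;
		return NULL;
	}
	if (mring_fail_at > 0)
		mring_fail_at--;
	return calloc(size ? size : 1, sizeof(void *));
}

static int mring_resize_multiple(struct mring **rings, int nrings, int size)
{
	void ***queues;
	int i;

	queues = malloc(nrings * sizeof(*queues));
	if (!queues)
		return -ENOMEM;

	/* Phase 1: allocate everything; no ring has been modified yet. */
	for (i = 0; i < nrings; i++) {
		queues[i] = mring_alloc_queue(size);
		if (!queues[i])
			goto nomem;
	}

	/* Phase 2: cannot fail; swap in new queues, remember the old ones. */
	for (i = 0; i < nrings; i++) {
		void **old = rings[i]->queue;

		rings[i]->queue = queues[i];
		rings[i]->size = size;
		queues[i] = old;
	}

	for (i = 0; i < nrings; i++)
		free(queues[i]);
	free(queues);
	return 0;

nomem:
	/* Only the new, unused queues are freed; all rings are untouched. */
	while (--i >= 0)
		free(queues[i]);
	free(queues);
	return -ENOMEM;
}
```

Since every failure point sits before the first swap, a caller never
observes a state where some rings have the new size and others the old one.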

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH net-next V3 4/6] skb_array: add wrappers for resizing
  2016-06-30  3:52 ` Jason Wang
@ 2016-06-30  3:52   ` Jason Wang
  -1 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30  3:52 UTC (permalink / raw)
  To: mst, netdev, linux-kernel, davem
  Cc: kvm, virtualization, eric.dumazet, brouer, Jason Wang

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 include/linux/skb_array.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/include/linux/skb_array.h b/include/linux/skb_array.h
index 2dd0d1e..f4dfade 100644
--- a/include/linux/skb_array.h
+++ b/include/linux/skb_array.h
@@ -161,6 +161,15 @@ static inline int skb_array_resize(struct skb_array *a, int size, gfp_t gfp)
 	return ptr_ring_resize(&a->ring, size, gfp, __skb_array_destroy_skb);
 }
 
+static inline int skb_array_resize_multiple(struct skb_array **rings,
+					    int nrings, int size, gfp_t gfp)
+{
+	BUILD_BUG_ON(offsetof(struct skb_array, ring));
+	return ptr_ring_resize_multiple((struct ptr_ring **)rings,
+					nrings, size, gfp,
+					__skb_array_destroy_skb);
+}
+
 static inline void skb_array_cleanup(struct skb_array *a)
 {
 	ptr_ring_cleanup(&a->ring, __skb_array_destroy_skb);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH net-next V3 5/6] net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
  2016-06-30  3:52 ` Jason Wang
@ 2016-06-30  3:52   ` Jason Wang
  -1 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30  3:52 UTC (permalink / raw)
  To: mst, netdev, linux-kernel, davem
  Cc: kvm, virtualization, eric.dumazet, brouer, Jason Wang

This patch introduces a new event, NETDEV_CHANGE_TX_QUEUE_LEN, which
is triggered when tx_queue_len is changed. It can be used by net devices
that want to do some processing at that time. An example is tun, which
may want to resize its tx array when tx_queue_len is changed.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 include/linux/netdevice.h |  1 +
 net/core/net-sysfs.c      | 15 ++++++++++++++-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index e84d9d2..7dc2ec7 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2237,6 +2237,7 @@ struct netdev_lag_lower_state_info {
 #define NETDEV_PRECHANGEUPPER	0x001A
 #define NETDEV_CHANGELOWERSTATE	0x001B
 #define NETDEV_UDP_TUNNEL_PUSH_INFO	0x001C
+#define NETDEV_CHANGE_TX_QUEUE_LEN	0x001E
 
 int register_netdevice_notifier(struct notifier_block *nb);
 int unregister_netdevice_notifier(struct notifier_block *nb);
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 7a0b616..6e4f347 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -322,7 +322,20 @@ NETDEVICE_SHOW_RW(flags, fmt_hex);
 
 static int change_tx_queue_len(struct net_device *dev, unsigned long new_len)
 {
-	dev->tx_queue_len = new_len;
+	int res, orig_len = dev->tx_queue_len;
+
+	if (new_len != orig_len) {
+		dev->tx_queue_len = new_len;
+		res = call_netdevice_notifiers(NETDEV_CHANGE_TX_QUEUE_LEN, dev);
+		res = notifier_to_errno(res);
+		if (res) {
+			netdev_err(dev,
+				   "refused to change device tx_queue_len\n");
+			dev->tx_queue_len = orig_len;
+			return -EFAULT;
+		}
+	}
+
 	return 0;
 }
 
-- 
2.7.4
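
The notify-then-roll-back flow in change_tx_queue_len() can be modelled in
userspace roughly as follows. This is a sketch, not the netdevice notifier
API: toy_dev, toy_register_notifier() and the listener are hypothetical
names, and unlike the patch (which always returns -EFAULT on refusal) the
sketch passes the listener's error code through.

```c
#include <errno.h>
#include <stddef.h>

struct toy_dev {
	unsigned long tx_queue_len;
};

/* A listener returns 0 to accept the new value, a negative errno to veto. */
typedef int (*toy_notifier_fn)(struct toy_dev *dev);

#define TOY_MAX_NOTIFIERS 4
static toy_notifier_fn toy_notifiers[TOY_MAX_NOTIFIERS];
static int toy_nr_notifiers;

static int toy_register_notifier(toy_notifier_fn fn)
{
	if (toy_nr_notifiers >= TOY_MAX_NOTIFIERS)
		return -ENOSPC;
	toy_notifiers[toy_nr_notifiers++] = fn;
	return 0;
}

static int toy_change_tx_queue_len(struct toy_dev *dev, unsigned long new_len)
{
	unsigned long orig_len = dev->tx_queue_len;
	int i, res;

	if (new_len == orig_len)
		return 0;	/* nothing changed, nobody is notified */

	/* Publish the new value first so listeners can read it from dev. */
	dev->tx_queue_len = new_len;
	for (i = 0; i < toy_nr_notifiers; i++) {
		res = toy_notifiers[i](dev);
		if (res) {
			dev->tx_queue_len = orig_len;	/* roll back */
			return res;
		}
	}
	return 0;
}

/* Example listener: refuses lengths above an arbitrary limit. */
static int toy_limit_listener(struct toy_dev *dev)
{
	return dev->tx_queue_len > 1000 ? -ENOMEM : 0;
}
```

Note that the rollback only restores the stored value. Any listener that
already acted on the new length before a later one refused is not undone in
this sketch.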

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH net-next V3 6/6] tun: switch to use skb array for tx
  2016-06-30  3:52 ` Jason Wang
@ 2016-06-30  3:52   ` Jason Wang
  -1 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30  3:52 UTC (permalink / raw)
  To: mst, netdev, linux-kernel, davem
  Cc: kvm, virtualization, eric.dumazet, brouer, Jason Wang

We used to queue tx packets in sk_receive_queue. This is less
efficient since it requires spinlocks to synchronize between producer
and consumer.

This patch tries to address this by:

- switching from sk_receive_queue to a skb_array, and resizing it when
  tx_queue_len is changed.
- introducing a new proto_ops method, peek_len, which is used for peeking
  the skb length.
- implementing a tun version of peek_len for vhost_net to use, and
  converting vhost_net to use peek_len if possible.

Pktgen test shows about 15.3% improvement on guest receiving pps for small
buffers:

Before: ~1300000pps
After : ~1500000pps

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/net/tun.c   | 138 +++++++++++++++++++++++++++++++++++++++++++++++++---
 drivers/vhost/net.c |  16 +++++-
 include/linux/net.h |   1 +
 3 files changed, 146 insertions(+), 9 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 4884802..3be69ea 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -71,6 +71,7 @@
 #include <net/sock.h>
 #include <linux/seq_file.h>
 #include <linux/uio.h>
+#include <linux/skb_array.h>
 
 #include <asm/uaccess.h>
 
@@ -167,6 +168,7 @@ struct tun_file {
 	};
 	struct list_head next;
 	struct tun_struct *detached;
+	struct skb_array tx_array;
 };
 
 struct tun_flow_entry {
@@ -515,7 +517,11 @@ static struct tun_struct *tun_enable_queue(struct tun_file *tfile)
 
 static void tun_queue_purge(struct tun_file *tfile)
 {
-	skb_queue_purge(&tfile->sk.sk_receive_queue);
+	struct sk_buff *skb;
+
+	while ((skb = skb_array_consume(&tfile->tx_array)) != NULL)
+		kfree_skb(skb);
+
 	skb_queue_purge(&tfile->sk.sk_error_queue);
 }
 
@@ -560,6 +566,8 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
 			    tun->dev->reg_state == NETREG_REGISTERED)
 				unregister_netdevice(tun->dev);
 		}
+		if (tun)
+			skb_array_cleanup(&tfile->tx_array);
 		sock_put(&tfile->sk);
 	}
 }
@@ -613,6 +621,7 @@ static void tun_detach_all(struct net_device *dev)
 static int tun_attach(struct tun_struct *tun, struct file *file, bool skip_filter)
 {
 	struct tun_file *tfile = file->private_data;
+	struct net_device *dev = tun->dev;
 	int err;
 
 	err = security_tun_dev_attach(tfile->socket.sk, tun->security);
@@ -642,6 +651,13 @@ static int tun_attach(struct tun_struct *tun, struct file *file, bool skip_filte
 		if (!err)
 			goto out;
 	}
+
+	if (!tfile->detached &&
+	    skb_array_init(&tfile->tx_array, dev->tx_queue_len, GFP_KERNEL)) {
+		err = -ENOMEM;
+		goto out;
+	}
+
 	tfile->queue_index = tun->numqueues;
 	tfile->socket.sk->sk_shutdown &= ~RCV_SHUTDOWN;
 	rcu_assign_pointer(tfile->tun, tun);
@@ -891,8 +907,8 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	nf_reset(skb);
 
-	/* Enqueue packet */
-	skb_queue_tail(&tfile->socket.sk->sk_receive_queue, skb);
+	if (skb_array_produce(&tfile->tx_array, skb))
+		goto drop;
 
 	/* Notify and wake up reader process */
 	if (tfile->flags & TUN_FASYNC)
@@ -1107,7 +1123,7 @@ static unsigned int tun_chr_poll(struct file *file, poll_table *wait)
 
 	poll_wait(file, sk_sleep(sk), wait);
 
-	if (!skb_queue_empty(&sk->sk_receive_queue))
+	if (!skb_array_empty(&tfile->tx_array))
 		mask |= POLLIN | POLLRDNORM;
 
 	if (sock_writeable(sk) ||
@@ -1426,22 +1442,61 @@ done:
 	return total;
 }
 
+static struct sk_buff *tun_ring_recv(struct tun_file *tfile, int noblock,
+				     int *err)
+{
+	DECLARE_WAITQUEUE(wait, current);
+	struct sk_buff *skb = NULL;
+
+	skb = skb_array_consume(&tfile->tx_array);
+	if (skb)
+		goto out;
+	if (noblock) {
+		*err = -EAGAIN;
+		goto out;
+	}
+
+	add_wait_queue(&tfile->wq.wait, &wait);
+	current->state = TASK_INTERRUPTIBLE;
+
+	while (1) {
+		skb = skb_array_consume(&tfile->tx_array);
+		if (skb)
+			break;
+		if (signal_pending(current)) {
+			*err = -ERESTARTSYS;
+			break;
+		}
+		if (tfile->socket.sk->sk_shutdown & RCV_SHUTDOWN) {
+			*err = -EFAULT;
+			break;
+		}
+
+		schedule();
+	};
+
+	current->state = TASK_RUNNING;
+	remove_wait_queue(&tfile->wq.wait, &wait);
+
+out:
+	return skb;
+}
+
 static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
 			   struct iov_iter *to,
 			   int noblock)
 {
 	struct sk_buff *skb;
 	ssize_t ret;
-	int peeked, err, off = 0;
+	int err;
 
 	tun_debug(KERN_INFO, tun, "tun_do_read\n");
 
 	if (!iov_iter_count(to))
 		return 0;
 
-	/* Read frames from queue */
-	skb = __skb_recv_datagram(tfile->socket.sk, noblock ? MSG_DONTWAIT : 0,
-				  &peeked, &off, &err);
+	/* Read frames from ring */
+	skb = tun_ring_recv(tfile, noblock, &err);
 	if (!skb)
 		return err;
 
@@ -1574,8 +1629,25 @@ out:
 	return ret;
 }
 
+static int tun_peek_len(struct socket *sock)
+{
+	struct tun_file *tfile = container_of(sock, struct tun_file, socket);
+	struct tun_struct *tun;
+	int ret = 0;
+
+	tun = __tun_get(tfile);
+	if (!tun)
+		return 0;
+
+	ret = skb_array_peek_len(&tfile->tx_array);
+	tun_put(tun);
+
+	return ret;
+}
+
 /* Ops structure to mimic raw sockets with tun */
 static const struct proto_ops tun_socket_ops = {
+	.peek_len = tun_peek_len,
 	.sendmsg = tun_sendmsg,
 	.recvmsg = tun_recvmsg,
 };
@@ -2397,6 +2469,53 @@ static const struct ethtool_ops tun_ethtool_ops = {
 	.get_ts_info	= ethtool_op_get_ts_info,
 };
 
+static int tun_queue_resize(struct tun_struct *tun)
+{
+	struct net_device *dev = tun->dev;
+	struct tun_file *tfile;
+	struct skb_array **arrays;
+	int n = tun->numqueues + tun->numdisabled;
+	int ret, i;
+
+	arrays = kmalloc(sizeof *arrays * n, GFP_KERNEL);
+	if (!arrays)
+		return -ENOMEM;
+
+	for (i = 0; i < tun->numqueues; i++) {
+		tfile = rtnl_dereference(tun->tfiles[i]);
+		arrays[i] = &tfile->tx_array;
+	}
+	list_for_each_entry(tfile, &tun->disabled, next)
+		arrays[i++] = &tfile->tx_array;
+
+	ret = skb_array_resize_multiple(arrays, n,
+					dev->tx_queue_len, GFP_KERNEL);
+
+	kfree(arrays);
+	return ret;
+}
+
+static int tun_device_event(struct notifier_block *unused,
+			    unsigned long event, void *ptr)
+{
+	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+	struct tun_struct *tun = netdev_priv(dev);
+
+	switch (event) {
+	case NETDEV_CHANGE_TX_QUEUE_LEN:
+		if (tun_queue_resize(tun))
+			return NOTIFY_BAD;
+		break;
+	default:
+		break;
+	}
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block tun_notifier_block __read_mostly = {
+	.notifier_call	= tun_device_event,
+};
 
 static int __init tun_init(void)
 {
@@ -2416,6 +2535,8 @@ static int __init tun_init(void)
 		pr_err("Can't register misc device %d\n", TUN_MINOR);
 		goto err_misc;
 	}
+
+	register_netdevice_notifier(&tun_notifier_block);
 	return  0;
 err_misc:
 	rtnl_link_unregister(&tun_link_ops);
@@ -2427,6 +2548,7 @@ static void tun_cleanup(void)
 {
 	misc_deregister(&tun_miscdev);
 	rtnl_link_unregister(&tun_link_ops);
+	unregister_netdevice_notifier(&tun_notifier_block);
 }
 
 /* Get an underlying socket object from tun file.  Returns error unless file is
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 1d3e45f..e032ca3 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -481,10 +481,14 @@ out:
 
 static int peek_head_len(struct sock *sk)
 {
+	struct socket *sock = sk->sk_socket;
 	struct sk_buff *head;
 	int len = 0;
 	unsigned long flags;
 
+	if (sock->ops->peek_len)
+		return sock->ops->peek_len(sock);
+
 	spin_lock_irqsave(&sk->sk_receive_queue.lock, flags);
 	head = skb_peek(&sk->sk_receive_queue);
 	if (likely(head)) {
@@ -497,6 +501,16 @@ static int peek_head_len(struct sock *sk)
 	return len;
 }
 
+static int sk_has_rx_data(struct sock *sk)
+{
+	struct socket *sock = sk->sk_socket;
+
+	if (sock->ops->peek_len)
+		return sock->ops->peek_len(sock);
+
+	return !skb_queue_empty(&sk->sk_receive_queue);
+}
+
 static int vhost_net_rx_peek_head_len(struct vhost_net *net, struct sock *sk)
 {
 	struct vhost_net_virtqueue *nvq = &net->vqs[VHOST_NET_VQ_TX];
@@ -513,7 +527,7 @@ static int vhost_net_rx_peek_head_len(struct vhost_net *net, struct sock *sk)
 		endtime = busy_clock() + vq->busyloop_timeout;
 
 		while (vhost_can_busy_poll(&net->dev, endtime) &&
-		       skb_queue_empty(&sk->sk_receive_queue) &&
+		       !sk_has_rx_data(sk) &&
 		       vhost_vq_avail_empty(&net->dev, vq))
 			cpu_relax_lowlatency();
 
diff --git a/include/linux/net.h b/include/linux/net.h
index 9aa49a0..b6b3843 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -185,6 +185,7 @@ struct proto_ops {
 	ssize_t 	(*splice_read)(struct socket *sock,  loff_t *ppos,
 				       struct pipe_inode_info *pipe, size_t len, unsigned int flags);
 	int		(*set_peek_off)(struct sock *sk, int val);
+	int		(*peek_len)(struct socket *sock);
 };
 
 #define DECLARE_SOCKADDR(type, dst, src)	\
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread
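
Editorial sketch (not part of the patch): the tun changes above replace the spinlock-protected sk_receive_queue with an skb_array, a fixed-size ring of pointers. The userspace C model below illustrates the produce/consume/peek behaviour the patch relies on: produce fails when the ring is full (tun_net_xmit then drops the packet), consume returns NULL when the ring is empty, and peeking reports the head length without dequeuing. All toy_* names are hypothetical. The model is single-threaded and omits the locking and memory barriers of the real ptr_ring/skb_array code, and the assumption that a zero-length ring rejects every produce is mine.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sk_buff: only the length matters here. */
struct toy_pkt {
	int len;
};

/* Simplified pointer ring: a NULL slot means "empty". */
struct toy_ring {
	struct toy_pkt **queue;
	int size;	/* number of slots; 0 is allowed and accepts nothing */
	int producer;	/* next slot to fill */
	int consumer;	/* next slot to drain */
};

static int toy_ring_init(struct toy_ring *r, int size)
{
	r->queue = size ? calloc((size_t)size, sizeof(*r->queue)) : NULL;
	if (size && !r->queue)
		return -ENOMEM;
	r->size = size;
	r->producer = r->consumer = 0;
	return 0;
}

/* Returns 0 on success, -ENOSPC when the ring is full or has zero length. */
static int toy_ring_produce(struct toy_ring *r, struct toy_pkt *pkt)
{
	if (!r->size || r->queue[r->producer])
		return -ENOSPC;
	r->queue[r->producer] = pkt;
	if (++r->producer >= r->size)
		r->producer = 0;
	return 0;
}

/* Returns the oldest packet, or NULL when the ring is empty. */
static struct toy_pkt *toy_ring_consume(struct toy_ring *r)
{
	struct toy_pkt *pkt;

	if (!r->size)
		return NULL;
	pkt = r->queue[r->consumer];
	if (pkt) {
		r->queue[r->consumer] = NULL;
		if (++r->consumer >= r->size)
			r->consumer = 0;
	}
	return pkt;
}

/* Length of the head packet without consuming it; 0 means nothing is queued. */
static int toy_ring_peek_len(const struct toy_ring *r)
{
	if (!r->size || !r->queue[r->consumer])
		return 0;
	return r->queue[r->consumer]->len;
}

static void toy_ring_cleanup(struct toy_ring *r)
{
	free(r->queue);
	r->queue = NULL;
	r->size = 0;
}
```

Because a full slot is detected by its non-NULL content, the producer only writes the producer index and the consumer only writes the consumer index, which is what lets the real implementation avoid a shared queue lock between the two sides.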

* Re: [PATCH net-next V3 5/6] net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
  2016-06-30  3:52   ` Jason Wang
  (?)
@ 2016-06-30  4:56   ` John Fastabend
  2016-06-30  5:12       ` Jason Wang
  -1 siblings, 1 reply; 28+ messages in thread
From: John Fastabend @ 2016-06-30  4:56 UTC (permalink / raw)
  To: Jason Wang, mst, netdev, linux-kernel, davem
  Cc: kvm, virtualization, eric.dumazet, brouer

On 16-06-29 08:52 PM, Jason Wang wrote:
> This patch introduces a new event, NETDEV_CHANGE_TX_QUEUE_LEN, which
> is triggered when tx_queue_len is changed. It can be used by net
> devices that want to do some processing at that time. An example is
> tun, which may want to resize its tx array when tx_queue_len changes.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  include/linux/netdevice.h |  1 +
>  net/core/net-sysfs.c      | 15 ++++++++++++++-
>  2 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index e84d9d2..7dc2ec7 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -2237,6 +2237,7 @@ struct netdev_lag_lower_state_info {
>  #define NETDEV_PRECHANGEUPPER	0x001A
>  #define NETDEV_CHANGELOWERSTATE	0x001B
>  #define NETDEV_UDP_TUNNEL_PUSH_INFO	0x001C
> +#define NETDEV_CHANGE_TX_QUEUE_LEN	0x001E
>  
>  int register_netdevice_notifier(struct notifier_block *nb);
>  int unregister_netdevice_notifier(struct notifier_block *nb);
> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
> index 7a0b616..6e4f347 100644
> --- a/net/core/net-sysfs.c
> +++ b/net/core/net-sysfs.c
> @@ -322,7 +322,20 @@ NETDEVICE_SHOW_RW(flags, fmt_hex);
>  
>  static int change_tx_queue_len(struct net_device *dev, unsigned long new_len)
>  {
> -	dev->tx_queue_len = new_len;
> +	int res, orig_len = dev->tx_queue_len;
> +
> +	if (new_len != orig_len) {
> +		dev->tx_queue_len = new_len;
> +		res = call_netdevice_notifiers(NETDEV_CHANGE_TX_QUEUE_LEN, dev);
> +		res = notifier_to_errno(res);
> +		if (res) {
> +			netdev_err(dev,
> +				   "refused to change device tx_queue_len\n");
> +			dev->tx_queue_len = orig_len;
> +			return -EFAULT;
> +		}
> +	}
> +
>  	return 0;
>  }
>  
> 

Acked-by: John Fastabend <john.r.fastabend@intel.com>

Great timing, I was just looking into this because I need it for the
qdisc side.

It looks like this covers the sysfs change, but tx_queue_len can also
be changed via rtnetlink. So we need another patch for that path,
right?

        if (tb[IFLA_TXQLEN]) {
                unsigned long value = nla_get_u32(tb[IFLA_TXQLEN]);

                if (dev->tx_queue_len ^ value)
                        status |= DO_SETLINK_NOTIFY;

                dev->tx_queue_len = value;
        }

Thanks,
John

^ permalink raw reply	[flat|nested] 28+ messages in thread
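
Editorial sketch (not part of the thread): one way to cover both the sysfs and the rtnetlink paths mentioned above is to route them through a single helper that updates the value, notifies listeners, and restores the old value if a listener refuses. The userspace C model below shows that shape only. toy_dev, toy_change_tx_queue_len and the two entry points are hypothetical names, not kernel APIs, and the actual follow-up patch may be structured differently. Unlike the quoted change_tx_queue_len(), which returns -EFAULT, this sketch passes the listener's error code through.

```c
#include <errno.h>

/* Hypothetical userspace model of a device with a tunable queue length. */
struct toy_dev {
	unsigned long tx_queue_len;
	/* Called after the field is updated; a nonzero return vetoes the change. */
	int (*on_len_change)(struct toy_dev *dev);
};

/* Shared helper: update, notify, and roll back if the listener refuses. */
static int toy_change_tx_queue_len(struct toy_dev *dev, unsigned long new_len)
{
	unsigned long orig_len = dev->tx_queue_len;
	int res;

	if (new_len == orig_len)
		return 0;	/* nothing changed, so nobody is notified */

	dev->tx_queue_len = new_len;
	if (!dev->on_len_change)
		return 0;

	res = dev->on_len_change(dev);
	if (res) {
		dev->tx_queue_len = orig_len;
		return res;
	}
	return 0;
}

/* Two hypothetical entry points standing in for the sysfs and rtnetlink paths. */
static int toy_sysfs_store(struct toy_dev *dev, unsigned long val)
{
	return toy_change_tx_queue_len(dev, val);
}

static int toy_netlink_set(struct toy_dev *dev, unsigned long val)
{
	return toy_change_tx_queue_len(dev, val);
}

/* Example listener: refuses lengths above an arbitrary limit. */
static int toy_reject_large(struct toy_dev *dev)
{
	return dev->tx_queue_len > 4096 ? -ENOMEM : 0;
}
```

With a single helper, a listener such as tun's resize handler sees every change regardless of which interface userspace used, and a refused change leaves the old length in place on both paths.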

* Re: [PATCH net-next V3 5/6] net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
  2016-06-30  4:56   ` John Fastabend
@ 2016-06-30  5:12       ` Jason Wang
  0 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30  5:12 UTC (permalink / raw)
  To: John Fastabend, mst, netdev, linux-kernel, davem
  Cc: kvm, virtualization, eric.dumazet, brouer



On 2016-06-30 12:56, John Fastabend wrote:
> On 16-06-29 08:52 PM, Jason Wang wrote:
>> This patch introduces a new event, NETDEV_CHANGE_TX_QUEUE_LEN, which
>> is triggered when tx_queue_len is changed. It can be used by net
>> devices that want to do some processing at that time. An example is
>> tun, which may want to resize its tx array when tx_queue_len changes.
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>   include/linux/netdevice.h |  1 +
>>   net/core/net-sysfs.c      | 15 ++++++++++++++-
>>   2 files changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index e84d9d2..7dc2ec7 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -2237,6 +2237,7 @@ struct netdev_lag_lower_state_info {
>>   #define NETDEV_PRECHANGEUPPER	0x001A
>>   #define NETDEV_CHANGELOWERSTATE	0x001B
>>   #define NETDEV_UDP_TUNNEL_PUSH_INFO	0x001C
>> +#define NETDEV_CHANGE_TX_QUEUE_LEN	0x001E
>>   
>>   int register_netdevice_notifier(struct notifier_block *nb);
>>   int unregister_netdevice_notifier(struct notifier_block *nb);
>> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
>> index 7a0b616..6e4f347 100644
>> --- a/net/core/net-sysfs.c
>> +++ b/net/core/net-sysfs.c
>> @@ -322,7 +322,20 @@ NETDEVICE_SHOW_RW(flags, fmt_hex);
>>   
>>   static int change_tx_queue_len(struct net_device *dev, unsigned long new_len)
>>   {
>> -	dev->tx_queue_len = new_len;
>> +	int res, orig_len = dev->tx_queue_len;
>> +
>> +	if (new_len != orig_len) {
>> +		dev->tx_queue_len = new_len;
>> +		res = call_netdevice_notifiers(NETDEV_CHANGE_TX_QUEUE_LEN, dev);
>> +		res = notifier_to_errno(res);
>> +		if (res) {
>> +			netdev_err(dev,
>> +				   "refused to change device tx_queue_len\n");
>> +			dev->tx_queue_len = orig_len;
>> +			return -EFAULT;
>> +		}
>> +	}
>> +
>>   	return 0;
>>   }
>>   
>>
> Acked-by: John Fastabend <john.r.fastabend@intel.com>
>
> Great timing, I was just looking into this because I need it for the
> qdisc side.
>
> It looks like this covers the sysfs change, but tx_queue_len can also
> be changed via rtnetlink. So we need another patch for that path,
> right?
>
>          if (tb[IFLA_TXQLEN]) {
>                  unsigned long value = nla_get_u32(tb[IFLA_TXQLEN]);
>
>                  if (dev->tx_queue_len ^ value)
>                          status |= DO_SETLINK_NOTIFY;
>
>                  dev->tx_queue_len = value;
>          }
>
> Thanks,
> John
>

Right, will do this in the next version.

Thanks

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH net-next V3 0/6] switch to use tx skb array in tun
  2016-06-30  3:52 ` Jason Wang
@ 2016-06-30  5:37   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 28+ messages in thread
From: Michael S. Tsirkin @ 2016-06-30  5:37 UTC (permalink / raw)
  To: Jason Wang
  Cc: netdev, linux-kernel, davem, kvm, virtualization, eric.dumazet, brouer

On Thu, Jun 30, 2016 at 11:52:53AM +0800, Jason Wang wrote:
> Hi all:
> 
> This series switches tun to use an skb array. This eliminates the
> spinlock contention between producer and consumer. The conversion is
> straightforward: just introduce a tx skb array and use it instead of
> sk_receive_queue.
> 
> A minor issue is keeping the tx_queue_len behaviour, since tun used
> it as the length of sk_receive_queue. This is done by:
> 
> - adding the ability to resize multiple rings at once, to avoid
>   handling partial resize failure across multiple rings.
> - adding support for zero-length rings.
> - introducing a notifier that is triggered when tx_queue_len is
>   changed for a netdev.
> - resizing all queues when tx_queue_len changes.
> 
> Tests show about a 15% improvement in guest rx pps:
> 
> Before: ~1300000pps
> After : ~1500000pps

Series:

Acked-by: Michael S. Tsirkin <mst@redhat.com>


> Changes from V2:
> - add multiple rings resizing support for ptr_ring/skb_array
> - add zero length ring support
> - introduce NETDEV_CHANGE_TX_QUEUE_LEN
> - drop new flags
> 
> Changes from V1:
> - switch to use skb array instead of a customized circular buffer
> - add non-blocking support
> - rename .peek to .peek_len
> - drop lockless peeking since tests show only a very minor improvement
> 
> Jason Wang (5):
>   ptr_ring: support zero length ring
>   skb_array: minor tweak
>   skb_array: add wrappers for resizing
>   net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
>   tun: switch to use skb array for tx
> 
> Michael S. Tsirkin (1):
>   ptr_ring: support resizing multiple queues
> 
>  drivers/net/tun.c                | 138 ++++++++++++++++++++++++++++++++++++---
>  drivers/vhost/net.c              |  16 ++++-
>  include/linux/net.h              |   1 +
>  include/linux/netdevice.h        |   1 +
>  include/linux/ptr_ring.h         |  77 ++++++++++++++++++----
>  include/linux/skb_array.h        |  13 +++-
>  net/core/net-sysfs.c             |  15 ++++-
>  tools/virtio/ringtest/ptr_ring.c |   5 ++
>  8 files changed, 243 insertions(+), 23 deletions(-)
> 
> -- 
> 2.7.4

^ permalink raw reply	[flat|nested] 28+ messages in thread
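
Editorial sketch (not part of the thread): the cover letter quoted above resizes all rings in one call so that a failure never leaves some queues at the new length and others at the old one. A common way to get that all-or-nothing behaviour is to allocate every replacement buffer first and only then swap them in. The userspace C model below illustrates that two-phase idea under simplifying assumptions: toy_array, toy_calloc and toy_resize_multiple are hypothetical names, the structure tracks only slot storage and capacity (no producer/consumer indices, no locking), and entries that do not fit the new size are simply left behind here rather than released.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one tx array: slot storage plus capacity only. */
struct toy_array {
	void **queue;
	int size;
};

/* Fault injection for the demonstration: when set to N >= 0, the call after
 * N successful allocations fails once; -1 disables it. */
static int toy_fail_countdown = -1;

static void *toy_calloc(size_t n, size_t size)
{
	if (toy_fail_countdown == 0) {
		toy_fail_countdown = -1;
		return NULL;
	}
	if (toy_fail_countdown > 0)
		toy_fail_countdown--;
	return calloc(n ? n : 1, size);	/* never ask calloc for 0 elements */
}

/* Resize every array to new_size, or leave all of them untouched. */
static int toy_resize_multiple(struct toy_array **arrays, int n, int new_size)
{
	void ***queues;
	int i;

	/* Phase 1: allocate every replacement buffer; nothing is modified yet. */
	queues = toy_calloc((size_t)n, sizeof(*queues));
	if (!queues)
		return -ENOMEM;
	for (i = 0; i < n; i++) {
		queues[i] = toy_calloc((size_t)new_size, sizeof(void *));
		if (!queues[i])
			goto nomem;
	}

	/* Phase 2: swap the buffers in; this part cannot fail. */
	for (i = 0; i < n; i++) {
		int keep = arrays[i]->size < new_size ? arrays[i]->size : new_size;

		if (keep)
			memcpy(queues[i], arrays[i]->queue,
			       (size_t)keep * sizeof(void *));
		free(arrays[i]->queue);
		arrays[i]->queue = queues[i];
		arrays[i]->size = new_size;
	}
	free(queues);
	return 0;

nomem:
	/* Undo phase 1 only; the caller's arrays were never touched. */
	while (--i >= 0)
		free(queues[i]);
	free(queues);
	return -ENOMEM;
}
```

Because every step that can fail happens before any array is modified, the error path only has to free the temporary buffers, and the caller never needs to handle a partially resized set of queues.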

* Re: [PATCH net-next V3 5/6] net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
  2016-06-30  5:12       ` Jason Wang
@ 2016-06-30  5:59         ` Jason Wang
  -1 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30  5:59 UTC (permalink / raw)
  To: John Fastabend, mst, netdev, linux-kernel, davem
  Cc: kvm, virtualization, eric.dumazet, brouer



On 2016-06-30 13:12, Jason Wang wrote:
>
>
> On 2016-06-30 12:56, John Fastabend wrote:
>> On 16-06-29 08:52 PM, Jason Wang wrote:
>>> This patch introduces a new event, NETDEV_CHANGE_TX_QUEUE_LEN, which
>>> is triggered when tx_queue_len is changed. It can be used by net
>>> devices that want to do some processing at that time. An example is
>>> tun, which may want to resize its tx array when tx_queue_len changes.
>>>
>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>> ---
>>>   include/linux/netdevice.h |  1 +
>>>   net/core/net-sysfs.c      | 15 ++++++++++++++-
>>>   2 files changed, 15 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>> index e84d9d2..7dc2ec7 100644
>>> --- a/include/linux/netdevice.h
>>> +++ b/include/linux/netdevice.h
>>> @@ -2237,6 +2237,7 @@ struct netdev_lag_lower_state_info {
>>>   #define NETDEV_PRECHANGEUPPER    0x001A
>>>   #define NETDEV_CHANGELOWERSTATE    0x001B
>>>   #define NETDEV_UDP_TUNNEL_PUSH_INFO    0x001C
>>> +#define NETDEV_CHANGE_TX_QUEUE_LEN    0x001E
>>>
>>>   int register_netdevice_notifier(struct notifier_block *nb);
>>>   int unregister_netdevice_notifier(struct notifier_block *nb);
>>> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
>>> index 7a0b616..6e4f347 100644
>>> --- a/net/core/net-sysfs.c
>>> +++ b/net/core/net-sysfs.c
>>> @@ -322,7 +322,20 @@ NETDEVICE_SHOW_RW(flags, fmt_hex);
>>>
>>>   static int change_tx_queue_len(struct net_device *dev, unsigned long new_len)
>>>   {
>>> -    dev->tx_queue_len = new_len;
>>> +    int res, orig_len = dev->tx_queue_len;
>>> +
>>> +    if (new_len != orig_len) {
>>> +        dev->tx_queue_len = new_len;
>>> +        res = call_netdevice_notifiers(NETDEV_CHANGE_TX_QUEUE_LEN, dev);
>>> +        res = notifier_to_errno(res);
>>> +        if (res) {
>>> +            netdev_err(dev,
>>> +                   "refused to change device tx_queue_len\n");
>>> +            dev->tx_queue_len = orig_len;
>>> +            return -EFAULT;
>>> +        }
>>> +    }
>>> +
>>>       return 0;
>>>   }
>>>
>> Acked-by: John Fastabend <john.r.fastabend@intel.com>
>>
>> Great timing, I was just looking into this because I need it for the
>> qdisc side.
>>
>> It looks like this covers the sysfs change, but tx_queue_len can also
>> be changed via rtnetlink. So we need another patch for that path,
>> right?
>>
>>          if (tb[IFLA_TXQLEN]) {
>>                  unsigned long value = nla_get_u32(tb[IFLA_TXQLEN]);
>>
>>                  if (dev->tx_queue_len ^ value)
>>                          status |= DO_SETLINK_NOTIFY;
>>
>>                  dev->tx_queue_len = value;
>>          }
>>
>> Thanks,
>> John
>>
>
> Right, will do this in the next version.
>
> Thanks

OK, since Michael has acked the series, I will prepare a patch on top.

Thanks

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH net-next V3 6/6] tun: switch to use skb array for tx
  2016-06-30  3:52   ` Jason Wang
@ 2016-06-30  6:20     ` kbuild test robot
  -1 siblings, 0 replies; 28+ messages in thread
From: kbuild test robot @ 2016-06-30  6:20 UTC (permalink / raw)
  To: Jason Wang
  Cc: kbuild-all, mst, netdev, linux-kernel, davem, kvm,
	virtualization, eric.dumazet, brouer, Jason Wang

Hi,

[auto build test WARNING on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Jason-Wang/switch-to-use-tx-skb-array-in-tun/20160630-120656


coccinelle warnings: (new ones prefixed by >>)

>> drivers/net/tun.c:1476:2-3: Unneeded semicolon

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] tun: fix semicolon.cocci warnings
  2016-06-30  3:52   ` Jason Wang
@ 2016-06-30  6:20     ` kbuild test robot
  -1 siblings, 0 replies; 28+ messages in thread
From: kbuild test robot @ 2016-06-30  6:20 UTC (permalink / raw)
  To: Jason Wang
  Cc: kbuild-all, mst, netdev, linux-kernel, davem, kvm,
	virtualization, eric.dumazet, brouer, Jason Wang

drivers/net/tun.c:1476:2-3: Unneeded semicolon


 Remove unneeded semicolon.

Generated by: scripts/coccinelle/misc/semicolon.cocci

CC: Jason Wang <jasowang@redhat.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---

 tun.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1473,7 +1473,7 @@ static struct sk_buff *tun_ring_recv(str
 		}
 
 		schedule();
-	};
+	}
 
 	current->state = TASK_RUNNING;
 	remove_wait_queue(&tfile->wq.wait, &wait);
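
For readers wondering why this is reported as a warning rather than an
error: in C the closing brace already ends the loop body, so a trailing
semicolon parses as a separate empty statement that does nothing. A minimal
sketch of that, using hypothetical helper names that are not taken from
tun.c:

```c
#include <assert.h>

/*
 * Hypothetical helper, unrelated to tun.c: a loop whose closing brace is
 * followed by a stray semicolon, like the line flagged by the robot.
 */
static int count_with_stray_semicolon(int n)
{
	int iterations = 0;

	assert(n >= 0);
	while (n > 0) {
		n--;
		iterations++;
	};	/* "}" ends the loop; ";" is a separate empty statement */

	return iterations;
}

/* The same loop after the kind of cleanup the follow-up patch applies. */
static int count_without_semicolon(int n)
{
	int iterations = 0;

	assert(n >= 0);
	while (n > 0) {
		n--;
		iterations++;
	}

	return iterations;
}
```

Both helpers return the same count for any non-negative input, which is why
the fix is purely cosmetic and can be folded into the original patch.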

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH net-next V3 5/6] net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
  2016-06-30  5:59         ` Jason Wang
@ 2016-06-30  6:43           ` Jason Wang
  -1 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30  6:43 UTC (permalink / raw)
  To: John Fastabend, mst, netdev, linux-kernel, davem
  Cc: kvm, virtualization, eric.dumazet, brouer



On 2016-06-30 13:59, Jason Wang wrote:
>
>
> On 2016-06-30 13:12, Jason Wang wrote:
>>
>>
>> On 2016-06-30 12:56, John Fastabend wrote:
>>> On 16-06-29 08:52 PM, Jason Wang wrote:
>>>> This patch introduces a new event, NETDEV_CHANGE_TX_QUEUE_LEN, which
>>>> is triggered when tx_queue_len is changed. It can be used by net
>>>> devices that want to do some processing at that time. An example is
>>>> tun, which may want to resize its tx array when tx_queue_len changes.
>>>>
>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>> ---
>>>>   include/linux/netdevice.h |  1 +
>>>>   net/core/net-sysfs.c      | 15 ++++++++++++++-
>>>>   2 files changed, 15 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>>> index e84d9d2..7dc2ec7 100644
>>>> --- a/include/linux/netdevice.h
>>>> +++ b/include/linux/netdevice.h
>>>> @@ -2237,6 +2237,7 @@ struct netdev_lag_lower_state_info {
>>>>   #define NETDEV_PRECHANGEUPPER    0x001A
>>>>   #define NETDEV_CHANGELOWERSTATE    0x001B
>>>>   #define NETDEV_UDP_TUNNEL_PUSH_INFO    0x001C
>>>> +#define NETDEV_CHANGE_TX_QUEUE_LEN    0x001E
>>>>
>>>>   int register_netdevice_notifier(struct notifier_block *nb);
>>>>   int unregister_netdevice_notifier(struct notifier_block *nb);
>>>> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
>>>> index 7a0b616..6e4f347 100644
>>>> --- a/net/core/net-sysfs.c
>>>> +++ b/net/core/net-sysfs.c
>>>> @@ -322,7 +322,20 @@ NETDEVICE_SHOW_RW(flags, fmt_hex);
>>>>
>>>>   static int change_tx_queue_len(struct net_device *dev, unsigned long new_len)
>>>>   {
>>>> -    dev->tx_queue_len = new_len;
>>>> +    int res, orig_len = dev->tx_queue_len;
>>>> +
>>>> +    if (new_len != orig_len) {
>>>> +        dev->tx_queue_len = new_len;
>>>> +        res = call_netdevice_notifiers(NETDEV_CHANGE_TX_QUEUE_LEN, dev);
>>>> +        res = notifier_to_errno(res);
>>>> +        if (res) {
>>>> +            netdev_err(dev,
>>>> +                   "refused to change device tx_queue_len\n");
>>>> +            dev->tx_queue_len = orig_len;
>>>> +            return -EFAULT;
>>>> +        }
>>>> +    }
>>>> +
>>>>       return 0;
>>>>   }
>>>>
>>> Acked-by: John Fastabend <john.r.fastabend@intel.com>
>>>
>>> Great timing, I was just looking into this because I need it for the
>>> qdisc side.
>>>
>>> It looks like this covers the sysfs change, but tx_queue_len can
>>> also be changed via rtnetlink. So we need another patch for that
>>> path, right?
>>>
>>>          if (tb[IFLA_TXQLEN]) {
>>>                  unsigned long value = nla_get_u32(tb[IFLA_TXQLEN]);
>>>
>>>                  if (dev->tx_queue_len ^ value)
>>>                          status |= DO_SETLINK_NOTIFY;
>>>
>>>                  dev->tx_queue_len = value;
>>>          }
>>>
>>> Thanks,
>>> John
>>>
>>
>> Right, will do this in next version.
>>
>> Thanks
>
> OK, since Michael has acked the series, I will prepare a patch on top.
>
> Thanks 

Since the kbuild test robot has found a minor issue in this series, I will
post v4 with this fixed.

Thanks
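
The notify-then-roll-back flow in the quoted change_tx_queue_len() hunk can
be modelled in userspace roughly as follows. This is only a sketch under
stated assumptions: struct fake_dev, notify_fn, ring_resize_listener() and
set_tx_queue_len() are hypothetical stand-ins, not kernel API, and the 4096
limit is invented for illustration. Unlike the quoted diff, which returns
-EFAULT on refusal, the sketch passes the listener's error code through.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for struct net_device; only the field we need. */
struct fake_dev {
	unsigned long tx_queue_len;
};

/* A listener returns 0 to accept the new length or a negative errno. */
typedef int (*notify_fn)(struct fake_dev *dev);

/* Invented example listener: pretends rings above 4096 slots cannot be allocated. */
static int ring_resize_listener(struct fake_dev *dev)
{
	return dev->tx_queue_len > 4096 ? -ENOMEM : 0;
}

/*
 * Publish the new length, let the listener react to it, and restore the
 * old value if the listener refuses the change.
 */
static int set_tx_queue_len(struct fake_dev *dev, unsigned long new_len,
			    notify_fn notify)
{
	unsigned long orig_len;
	int res;

	assert(dev && notify);
	orig_len = dev->tx_queue_len;

	if (new_len == orig_len)
		return 0;	/* nothing changed, nobody to notify */

	dev->tx_queue_len = new_len;
	res = notify(dev);
	if (res) {
		dev->tx_queue_len = orig_len;	/* roll back on refusal */
		return res;
	}

	return 0;
}
```

Starting from a length of 500, a request for 1000 is accepted, while a
request for 1 << 20 is refused and leaves the length at 1000. The rtnetlink
path John points out could in principle go through the same kind of helper,
though the thread does not say how that will be handled.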

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2016-06-30  6:51 UTC | newest]

Thread overview: 28+ messages
2016-06-30  3:52 [PATCH net-next V3 0/6] switch to use tx skb array in tun Jason Wang
2016-06-30  3:52 ` Jason Wang
2016-06-30  3:52 ` [PATCH net-next V3 1/6] ptr_ring: support zero length ring Jason Wang
2016-06-30  3:52   ` Jason Wang
2016-06-30  3:52 ` [PATCH net-next V3 2/6] skb_array: minor tweak Jason Wang
2016-06-30  3:52   ` Jason Wang
2016-06-30  3:52 ` [PATCH net-next V3 3/6] ptr_ring: support resizing multiple queues Jason Wang
2016-06-30  3:52   ` Jason Wang
2016-06-30  3:52 ` [PATCH net-next V3 4/6] skb_array: add wrappers for resizing Jason Wang
2016-06-30  3:52   ` Jason Wang
2016-06-30  3:52 ` [PATCH net-next V3 5/6] net: introduce NETDEV_CHANGE_TX_QUEUE_LEN Jason Wang
2016-06-30  3:52   ` Jason Wang
2016-06-30  4:56   ` John Fastabend
2016-06-30  5:12     ` Jason Wang
2016-06-30  5:12       ` Jason Wang
2016-06-30  5:59       ` Jason Wang
2016-06-30  5:59         ` Jason Wang
2016-06-30  6:43         ` Jason Wang
2016-06-30  6:43           ` Jason Wang
2016-06-30  4:56   ` John Fastabend
2016-06-30  3:52 ` [PATCH net-next V3 6/6] tun: switch to use skb array for tx Jason Wang
2016-06-30  3:52   ` Jason Wang
2016-06-30  6:20   ` [PATCH] tun: fix semicolon.cocci warnings kbuild test robot
2016-06-30  6:20     ` kbuild test robot
2016-06-30  6:20   ` [PATCH net-next V3 6/6] tun: switch to use skb array for tx kbuild test robot
2016-06-30  6:20     ` kbuild test robot
2016-06-30  5:37 ` [PATCH net-next V3 0/6] switch to use tx skb array in tun Michael S. Tsirkin
2016-06-30  5:37   ` Michael S. Tsirkin
