* [PATCH net-next V3 0/6] switch to use tx skb array in tun
@ 2016-06-30 3:52 ` Jason Wang
0 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 3:52 UTC (permalink / raw)
To: mst, netdev, linux-kernel, davem
Cc: kvm, virtualization, eric.dumazet, brouer, Jason Wang
Hi all:
This series tries to switch to use skb array in tun. This is used to
eliminate the spinlock contention between producer and consumer. The
conversion was straightforward: just introdce a tx skb array and use
it instead of sk_receive_queue.
A minor issue is to keep the tx_queue_len behaviour, since tun used to
use it for the length of sk_receive_queue. This is done through:
- add the ability to resize multiple rings at once to avoid handling
partial resize failure for mutiple rings.
- add the support for zero length ring.
- introduce a notifier which was triggered when tx_queue_len was
changed for a netdev.
- resize all queues during the tx_queue_len changing.
Tests shows about 15% improvement on guest rx pps:
Before: ~1300000pps
After : ~1500000pps
Changes from V2:
- add multiple rings resizing support for ptr_ring/skb_array
- add zero length ring support
- introdce a NETDEV_CHANGE_TX_QUEUE_LEN
- drop new flags
Changes from V1:
- switch to use skb array instead of a customized circular buffer
- add non-blocking support
- rename .peek to .peek_len
- drop lockless peeking since test show very minor improvement
Jason Wang (5):
ptr_ring: support zero length ring
skb_array: minor tweak
skb_array: add wrappers for resizing
net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
tun: switch to use skb array for tx
Michael S. Tsirkin (1):
ptr_ring: support resizing multiple queues
drivers/net/tun.c | 138 ++++++++++++++++++++++++++++++++++++---
drivers/vhost/net.c | 16 ++++-
include/linux/net.h | 1 +
include/linux/netdevice.h | 1 +
include/linux/ptr_ring.h | 77 ++++++++++++++++++----
include/linux/skb_array.h | 13 +++-
net/core/net-sysfs.c | 15 ++++-
tools/virtio/ringtest/ptr_ring.c | 5 ++
8 files changed, 243 insertions(+), 23 deletions(-)
--
2.7.4
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH net-next V3 0/6] switch to use tx skb array in tun
@ 2016-06-30 3:52 ` Jason Wang
0 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 3:52 UTC (permalink / raw)
To: mst, netdev, linux-kernel, davem
Cc: brouer, eric.dumazet, kvm, virtualization
Hi all:
This series tries to switch to use skb array in tun. This is used to
eliminate the spinlock contention between producer and consumer. The
conversion was straightforward: just introdce a tx skb array and use
it instead of sk_receive_queue.
A minor issue is to keep the tx_queue_len behaviour, since tun used to
use it for the length of sk_receive_queue. This is done through:
- add the ability to resize multiple rings at once to avoid handling
partial resize failure for mutiple rings.
- add the support for zero length ring.
- introduce a notifier which was triggered when tx_queue_len was
changed for a netdev.
- resize all queues during the tx_queue_len changing.
Tests shows about 15% improvement on guest rx pps:
Before: ~1300000pps
After : ~1500000pps
Changes from V2:
- add multiple rings resizing support for ptr_ring/skb_array
- add zero length ring support
- introdce a NETDEV_CHANGE_TX_QUEUE_LEN
- drop new flags
Changes from V1:
- switch to use skb array instead of a customized circular buffer
- add non-blocking support
- rename .peek to .peek_len
- drop lockless peeking since test show very minor improvement
Jason Wang (5):
ptr_ring: support zero length ring
skb_array: minor tweak
skb_array: add wrappers for resizing
net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
tun: switch to use skb array for tx
Michael S. Tsirkin (1):
ptr_ring: support resizing multiple queues
drivers/net/tun.c | 138 ++++++++++++++++++++++++++++++++++++---
drivers/vhost/net.c | 16 ++++-
include/linux/net.h | 1 +
include/linux/netdevice.h | 1 +
include/linux/ptr_ring.h | 77 ++++++++++++++++++----
include/linux/skb_array.h | 13 +++-
net/core/net-sysfs.c | 15 ++++-
tools/virtio/ringtest/ptr_ring.c | 5 ++
8 files changed, 243 insertions(+), 23 deletions(-)
--
2.7.4
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH net-next V3 1/6] ptr_ring: support zero length ring
2016-06-30 3:52 ` Jason Wang
@ 2016-06-30 3:52 ` Jason Wang
-1 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 3:52 UTC (permalink / raw)
To: mst, netdev, linux-kernel, davem
Cc: kvm, virtualization, eric.dumazet, brouer, Jason Wang
Sometimes, we need zero length ring. But current code will crash since
we don't do any check before accessing the ring. This patch fixes this.
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
include/linux/ptr_ring.h | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
index 562a65e..d78b8b8 100644
--- a/include/linux/ptr_ring.h
+++ b/include/linux/ptr_ring.h
@@ -102,7 +102,7 @@ static inline bool ptr_ring_full_bh(struct ptr_ring *r)
*/
static inline int __ptr_ring_produce(struct ptr_ring *r, void *ptr)
{
- if (r->queue[r->producer])
+ if (unlikely(!r->size) || r->queue[r->producer])
return -ENOSPC;
r->queue[r->producer++] = ptr;
@@ -164,7 +164,9 @@ static inline int ptr_ring_produce_bh(struct ptr_ring *r, void *ptr)
*/
static inline void *__ptr_ring_peek(struct ptr_ring *r)
{
- return r->queue[r->consumer];
+ if (likely(r->size))
+ return r->queue[r->consumer];
+ return NULL;
}
/* Note: callers invoking this in a loop must use a compiler barrier,
--
2.7.4
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH net-next V3 1/6] ptr_ring: support zero length ring
@ 2016-06-30 3:52 ` Jason Wang
0 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 3:52 UTC (permalink / raw)
To: mst, netdev, linux-kernel, davem
Cc: brouer, eric.dumazet, kvm, virtualization
Sometimes, we need zero length ring. But current code will crash since
we don't do any check before accessing the ring. This patch fixes this.
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
include/linux/ptr_ring.h | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
index 562a65e..d78b8b8 100644
--- a/include/linux/ptr_ring.h
+++ b/include/linux/ptr_ring.h
@@ -102,7 +102,7 @@ static inline bool ptr_ring_full_bh(struct ptr_ring *r)
*/
static inline int __ptr_ring_produce(struct ptr_ring *r, void *ptr)
{
- if (r->queue[r->producer])
+ if (unlikely(!r->size) || r->queue[r->producer])
return -ENOSPC;
r->queue[r->producer++] = ptr;
@@ -164,7 +164,9 @@ static inline int ptr_ring_produce_bh(struct ptr_ring *r, void *ptr)
*/
static inline void *__ptr_ring_peek(struct ptr_ring *r)
{
- return r->queue[r->consumer];
+ if (likely(r->size))
+ return r->queue[r->consumer];
+ return NULL;
}
/* Note: callers invoking this in a loop must use a compiler barrier,
--
2.7.4
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH net-next V3 2/6] skb_array: minor tweak
2016-06-30 3:52 ` Jason Wang
@ 2016-06-30 3:52 ` Jason Wang
-1 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 3:52 UTC (permalink / raw)
To: mst, netdev, linux-kernel, davem
Cc: kvm, virtualization, eric.dumazet, brouer, Jason Wang
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
include/linux/skb_array.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/skb_array.h b/include/linux/skb_array.h
index 678bfbf..2dd0d1e 100644
--- a/include/linux/skb_array.h
+++ b/include/linux/skb_array.h
@@ -151,12 +151,12 @@ static inline int skb_array_init(struct skb_array *a, int size, gfp_t gfp)
return ptr_ring_init(&a->ring, size, gfp);
}
-void __skb_array_destroy_skb(void *ptr)
+static void __skb_array_destroy_skb(void *ptr)
{
kfree_skb(ptr);
}
-int skb_array_resize(struct skb_array *a, int size, gfp_t gfp)
+static inline int skb_array_resize(struct skb_array *a, int size, gfp_t gfp)
{
return ptr_ring_resize(&a->ring, size, gfp, __skb_array_destroy_skb);
}
--
2.7.4
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH net-next V3 2/6] skb_array: minor tweak
@ 2016-06-30 3:52 ` Jason Wang
0 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 3:52 UTC (permalink / raw)
To: mst, netdev, linux-kernel, davem
Cc: brouer, eric.dumazet, kvm, virtualization
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
include/linux/skb_array.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/skb_array.h b/include/linux/skb_array.h
index 678bfbf..2dd0d1e 100644
--- a/include/linux/skb_array.h
+++ b/include/linux/skb_array.h
@@ -151,12 +151,12 @@ static inline int skb_array_init(struct skb_array *a, int size, gfp_t gfp)
return ptr_ring_init(&a->ring, size, gfp);
}
-void __skb_array_destroy_skb(void *ptr)
+static void __skb_array_destroy_skb(void *ptr)
{
kfree_skb(ptr);
}
-int skb_array_resize(struct skb_array *a, int size, gfp_t gfp)
+static inline int skb_array_resize(struct skb_array *a, int size, gfp_t gfp)
{
return ptr_ring_resize(&a->ring, size, gfp, __skb_array_destroy_skb);
}
--
2.7.4
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH net-next V3 3/6] ptr_ring: support resizing multiple queues
2016-06-30 3:52 ` Jason Wang
@ 2016-06-30 3:52 ` Jason Wang
-1 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 3:52 UTC (permalink / raw)
To: mst, netdev, linux-kernel, davem
Cc: kvm, virtualization, eric.dumazet, brouer, Jason Wang
From: "Michael S. Tsirkin" <mst@redhat.com>
Sometimes, we need support resizing multiple queues at once. This is
because it was not easy to recover to recover from a partial failure
of multiple queues resizing.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
include/linux/ptr_ring.h | 71 +++++++++++++++++++++++++++++++++++-----
tools/virtio/ringtest/ptr_ring.c | 5 +++
2 files changed, 67 insertions(+), 9 deletions(-)
diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
index d78b8b8..2052011 100644
--- a/include/linux/ptr_ring.h
+++ b/include/linux/ptr_ring.h
@@ -349,20 +349,14 @@ static inline int ptr_ring_init(struct ptr_ring *r, int size, gfp_t gfp)
return 0;
}
-static inline int ptr_ring_resize(struct ptr_ring *r, int size, gfp_t gfp,
- void (*destroy)(void *))
+static inline void **__ptr_ring_swap_queue(struct ptr_ring *r, void **queue,
+ int size, gfp_t gfp,
+ void (*destroy)(void *))
{
- unsigned long flags;
int producer = 0;
- void **queue = __ptr_ring_init_queue_alloc(size, gfp);
void **old;
void *ptr;
- if (!queue)
- return -ENOMEM;
-
- spin_lock_irqsave(&(r)->producer_lock, flags);
-
while ((ptr = ptr_ring_consume(r)))
if (producer < size)
queue[producer++] = ptr;
@@ -375,6 +369,23 @@ static inline int ptr_ring_resize(struct ptr_ring *r, int size, gfp_t gfp,
old = r->queue;
r->queue = queue;
+ return old;
+}
+
+static inline int ptr_ring_resize(struct ptr_ring *r, int size, gfp_t gfp,
+ void (*destroy)(void *))
+{
+ unsigned long flags;
+ void **queue = __ptr_ring_init_queue_alloc(size, gfp);
+ void **old;
+
+ if (!queue)
+ return -ENOMEM;
+
+ spin_lock_irqsave(&(r)->producer_lock, flags);
+
+ old = __ptr_ring_swap_queue(r, queue, size, gfp, destroy);
+
spin_unlock_irqrestore(&(r)->producer_lock, flags);
kfree(old);
@@ -382,6 +393,48 @@ static inline int ptr_ring_resize(struct ptr_ring *r, int size, gfp_t gfp,
return 0;
}
+static inline int ptr_ring_resize_multiple(struct ptr_ring **rings, int nrings,
+ int size,
+ gfp_t gfp, void (*destroy)(void *))
+{
+ unsigned long flags;
+ void ***queues;
+ int i;
+
+ queues = kmalloc(nrings * sizeof *queues, gfp);
+ if (!queues)
+ goto noqueues;
+
+ for (i = 0; i < nrings; ++i) {
+ queues[i] = __ptr_ring_init_queue_alloc(size, gfp);
+ if (!queues[i])
+ goto nomem;
+ }
+
+ for (i = 0; i < nrings; ++i) {
+ spin_lock_irqsave(&(rings[i])->producer_lock, flags);
+ queues[i] = __ptr_ring_swap_queue(rings[i], queues[i],
+ size, gfp, destroy);
+ spin_unlock_irqrestore(&(rings[i])->producer_lock, flags);
+ }
+
+ for (i = 0; i < nrings; ++i)
+ kfree(queues[i]);
+
+ kfree(queues);
+
+ return 0;
+
+nomem:
+ while (--i >= 0)
+ kfree(queues[i]);
+
+ kfree(queues);
+
+noqueues:
+ return -ENOMEM;
+}
+
static inline void ptr_ring_cleanup(struct ptr_ring *r, void (*destroy)(void *))
{
void *ptr;
diff --git a/tools/virtio/ringtest/ptr_ring.c b/tools/virtio/ringtest/ptr_ring.c
index 74abd74..68e4f9f 100644
--- a/tools/virtio/ringtest/ptr_ring.c
+++ b/tools/virtio/ringtest/ptr_ring.c
@@ -17,6 +17,11 @@
typedef pthread_spinlock_t spinlock_t;
typedef int gfp_t;
+static void *kmalloc(unsigned size, gfp_t gfp)
+{
+ return memalign(64, size);
+}
+
static void *kzalloc(unsigned size, gfp_t gfp)
{
void *p = memalign(64, size);
--
2.7.4
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH net-next V3 3/6] ptr_ring: support resizing multiple queues
@ 2016-06-30 3:52 ` Jason Wang
0 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 3:52 UTC (permalink / raw)
To: mst, netdev, linux-kernel, davem
Cc: brouer, eric.dumazet, kvm, virtualization
From: "Michael S. Tsirkin" <mst@redhat.com>
Sometimes, we need support resizing multiple queues at once. This is
because it was not easy to recover to recover from a partial failure
of multiple queues resizing.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
include/linux/ptr_ring.h | 71 +++++++++++++++++++++++++++++++++++-----
tools/virtio/ringtest/ptr_ring.c | 5 +++
2 files changed, 67 insertions(+), 9 deletions(-)
diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
index d78b8b8..2052011 100644
--- a/include/linux/ptr_ring.h
+++ b/include/linux/ptr_ring.h
@@ -349,20 +349,14 @@ static inline int ptr_ring_init(struct ptr_ring *r, int size, gfp_t gfp)
return 0;
}
-static inline int ptr_ring_resize(struct ptr_ring *r, int size, gfp_t gfp,
- void (*destroy)(void *))
+static inline void **__ptr_ring_swap_queue(struct ptr_ring *r, void **queue,
+ int size, gfp_t gfp,
+ void (*destroy)(void *))
{
- unsigned long flags;
int producer = 0;
- void **queue = __ptr_ring_init_queue_alloc(size, gfp);
void **old;
void *ptr;
- if (!queue)
- return -ENOMEM;
-
- spin_lock_irqsave(&(r)->producer_lock, flags);
-
while ((ptr = ptr_ring_consume(r)))
if (producer < size)
queue[producer++] = ptr;
@@ -375,6 +369,23 @@ static inline int ptr_ring_resize(struct ptr_ring *r, int size, gfp_t gfp,
old = r->queue;
r->queue = queue;
+ return old;
+}
+
+static inline int ptr_ring_resize(struct ptr_ring *r, int size, gfp_t gfp,
+ void (*destroy)(void *))
+{
+ unsigned long flags;
+ void **queue = __ptr_ring_init_queue_alloc(size, gfp);
+ void **old;
+
+ if (!queue)
+ return -ENOMEM;
+
+ spin_lock_irqsave(&(r)->producer_lock, flags);
+
+ old = __ptr_ring_swap_queue(r, queue, size, gfp, destroy);
+
spin_unlock_irqrestore(&(r)->producer_lock, flags);
kfree(old);
@@ -382,6 +393,48 @@ static inline int ptr_ring_resize(struct ptr_ring *r, int size, gfp_t gfp,
return 0;
}
+static inline int ptr_ring_resize_multiple(struct ptr_ring **rings, int nrings,
+ int size,
+ gfp_t gfp, void (*destroy)(void *))
+{
+ unsigned long flags;
+ void ***queues;
+ int i;
+
+ queues = kmalloc(nrings * sizeof *queues, gfp);
+ if (!queues)
+ goto noqueues;
+
+ for (i = 0; i < nrings; ++i) {
+ queues[i] = __ptr_ring_init_queue_alloc(size, gfp);
+ if (!queues[i])
+ goto nomem;
+ }
+
+ for (i = 0; i < nrings; ++i) {
+ spin_lock_irqsave(&(rings[i])->producer_lock, flags);
+ queues[i] = __ptr_ring_swap_queue(rings[i], queues[i],
+ size, gfp, destroy);
+ spin_unlock_irqrestore(&(rings[i])->producer_lock, flags);
+ }
+
+ for (i = 0; i < nrings; ++i)
+ kfree(queues[i]);
+
+ kfree(queues);
+
+ return 0;
+
+nomem:
+ while (--i >= 0)
+ kfree(queues[i]);
+
+ kfree(queues);
+
+noqueues:
+ return -ENOMEM;
+}
+
static inline void ptr_ring_cleanup(struct ptr_ring *r, void (*destroy)(void *))
{
void *ptr;
diff --git a/tools/virtio/ringtest/ptr_ring.c b/tools/virtio/ringtest/ptr_ring.c
index 74abd74..68e4f9f 100644
--- a/tools/virtio/ringtest/ptr_ring.c
+++ b/tools/virtio/ringtest/ptr_ring.c
@@ -17,6 +17,11 @@
typedef pthread_spinlock_t spinlock_t;
typedef int gfp_t;
+static void *kmalloc(unsigned size, gfp_t gfp)
+{
+ return memalign(64, size);
+}
+
static void *kzalloc(unsigned size, gfp_t gfp)
{
void *p = memalign(64, size);
--
2.7.4
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH net-next V3 4/6] skb_array: add wrappers for resizing
2016-06-30 3:52 ` Jason Wang
@ 2016-06-30 3:52 ` Jason Wang
-1 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 3:52 UTC (permalink / raw)
To: mst, netdev, linux-kernel, davem
Cc: kvm, virtualization, eric.dumazet, brouer, Jason Wang
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
include/linux/skb_array.h | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/include/linux/skb_array.h b/include/linux/skb_array.h
index 2dd0d1e..f4dfade 100644
--- a/include/linux/skb_array.h
+++ b/include/linux/skb_array.h
@@ -161,6 +161,15 @@ static inline int skb_array_resize(struct skb_array *a, int size, gfp_t gfp)
return ptr_ring_resize(&a->ring, size, gfp, __skb_array_destroy_skb);
}
+static inline int skb_array_resize_multiple(struct skb_array **rings,
+ int nrings, int size, gfp_t gfp)
+{
+ BUILD_BUG_ON(offsetof(struct skb_array, ring));
+ return ptr_ring_resize_multiple((struct ptr_ring **)rings,
+ nrings, size, gfp,
+ __skb_array_destroy_skb);
+}
+
static inline void skb_array_cleanup(struct skb_array *a)
{
ptr_ring_cleanup(&a->ring, __skb_array_destroy_skb);
--
2.7.4
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH net-next V3 4/6] skb_array: add wrappers for resizing
@ 2016-06-30 3:52 ` Jason Wang
0 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 3:52 UTC (permalink / raw)
To: mst, netdev, linux-kernel, davem
Cc: brouer, eric.dumazet, kvm, virtualization
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
include/linux/skb_array.h | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/include/linux/skb_array.h b/include/linux/skb_array.h
index 2dd0d1e..f4dfade 100644
--- a/include/linux/skb_array.h
+++ b/include/linux/skb_array.h
@@ -161,6 +161,15 @@ static inline int skb_array_resize(struct skb_array *a, int size, gfp_t gfp)
return ptr_ring_resize(&a->ring, size, gfp, __skb_array_destroy_skb);
}
+static inline int skb_array_resize_multiple(struct skb_array **rings,
+ int nrings, int size, gfp_t gfp)
+{
+ BUILD_BUG_ON(offsetof(struct skb_array, ring));
+ return ptr_ring_resize_multiple((struct ptr_ring **)rings,
+ nrings, size, gfp,
+ __skb_array_destroy_skb);
+}
+
static inline void skb_array_cleanup(struct skb_array *a)
{
ptr_ring_cleanup(&a->ring, __skb_array_destroy_skb);
--
2.7.4
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH net-next V3 5/6] net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
2016-06-30 3:52 ` Jason Wang
@ 2016-06-30 3:52 ` Jason Wang
-1 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 3:52 UTC (permalink / raw)
To: mst, netdev, linux-kernel, davem
Cc: kvm, virtualization, eric.dumazet, brouer, Jason Wang
This patch introduces a new event - NETDEV_CHANGE_TX_QUEUE_LEN, this
will be triggered when tx_queue_len. It could be used by net device
who want to do some processing at that time. An example is tun who may
want to resize tx array when tx_queue_len is changed.
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
include/linux/netdevice.h | 1 +
net/core/net-sysfs.c | 15 ++++++++++++++-
2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index e84d9d2..7dc2ec7 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2237,6 +2237,7 @@ struct netdev_lag_lower_state_info {
#define NETDEV_PRECHANGEUPPER 0x001A
#define NETDEV_CHANGELOWERSTATE 0x001B
#define NETDEV_UDP_TUNNEL_PUSH_INFO 0x001C
+#define NETDEV_CHANGE_TX_QUEUE_LEN 0x001E
int register_netdevice_notifier(struct notifier_block *nb);
int unregister_netdevice_notifier(struct notifier_block *nb);
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 7a0b616..6e4f347 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -322,7 +322,20 @@ NETDEVICE_SHOW_RW(flags, fmt_hex);
static int change_tx_queue_len(struct net_device *dev, unsigned long new_len)
{
- dev->tx_queue_len = new_len;
+ int res, orig_len = dev->tx_queue_len;
+
+ if (new_len != orig_len) {
+ dev->tx_queue_len = new_len;
+ res = call_netdevice_notifiers(NETDEV_CHANGE_TX_QUEUE_LEN, dev);
+ res = notifier_to_errno(res);
+ if (res) {
+ netdev_err(dev,
+ "refused to change device tx_queue_len\n");
+ dev->tx_queue_len = orig_len;
+ return -EFAULT;
+ }
+ }
+
return 0;
}
--
2.7.4
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH net-next V3 5/6] net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
@ 2016-06-30 3:52 ` Jason Wang
0 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 3:52 UTC (permalink / raw)
To: mst, netdev, linux-kernel, davem
Cc: brouer, eric.dumazet, kvm, virtualization
This patch introduces a new event - NETDEV_CHANGE_TX_QUEUE_LEN, this
will be triggered when tx_queue_len. It could be used by net device
who want to do some processing at that time. An example is tun who may
want to resize tx array when tx_queue_len is changed.
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
include/linux/netdevice.h | 1 +
net/core/net-sysfs.c | 15 ++++++++++++++-
2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index e84d9d2..7dc2ec7 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2237,6 +2237,7 @@ struct netdev_lag_lower_state_info {
#define NETDEV_PRECHANGEUPPER 0x001A
#define NETDEV_CHANGELOWERSTATE 0x001B
#define NETDEV_UDP_TUNNEL_PUSH_INFO 0x001C
+#define NETDEV_CHANGE_TX_QUEUE_LEN 0x001E
int register_netdevice_notifier(struct notifier_block *nb);
int unregister_netdevice_notifier(struct notifier_block *nb);
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 7a0b616..6e4f347 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -322,7 +322,20 @@ NETDEVICE_SHOW_RW(flags, fmt_hex);
static int change_tx_queue_len(struct net_device *dev, unsigned long new_len)
{
- dev->tx_queue_len = new_len;
+ int res, orig_len = dev->tx_queue_len;
+
+ if (new_len != orig_len) {
+ dev->tx_queue_len = new_len;
+ res = call_netdevice_notifiers(NETDEV_CHANGE_TX_QUEUE_LEN, dev);
+ res = notifier_to_errno(res);
+ if (res) {
+ netdev_err(dev,
+ "refused to change device tx_queue_len\n");
+ dev->tx_queue_len = orig_len;
+ return -EFAULT;
+ }
+ }
+
return 0;
}
--
2.7.4
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH net-next V3 6/6] tun: switch to use skb array for tx
2016-06-30 3:52 ` Jason Wang
@ 2016-06-30 3:52 ` Jason Wang
-1 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 3:52 UTC (permalink / raw)
To: mst, netdev, linux-kernel, davem
Cc: kvm, virtualization, eric.dumazet, brouer, Jason Wang
We used to queue tx packets in sk_receive_queue, this is less
efficient since it requires spinlocks to synchronize between producer
and consumer.
This patch tries to address this by:
- switch from sk_receive_queue to a skb_array, and resize it when
tx_queue_len was changed.
- introduce a new proto_ops peek_len which was used for peeking the
skb length.
- implement a tun version of peek_len for vhost_net to use and convert
vhost_net to use peek_len if possible.
Pktgen test shows about 15.3% improvement on guest receiving pps for small
buffers:
Before: ~1300000pps
After : ~1500000pps
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
drivers/net/tun.c | 138 +++++++++++++++++++++++++++++++++++++++++++++++++---
drivers/vhost/net.c | 16 +++++-
include/linux/net.h | 1 +
3 files changed, 146 insertions(+), 9 deletions(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 4884802..3be69ea 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -71,6 +71,7 @@
#include <net/sock.h>
#include <linux/seq_file.h>
#include <linux/uio.h>
+#include <linux/skb_array.h>
#include <asm/uaccess.h>
@@ -167,6 +168,7 @@ struct tun_file {
};
struct list_head next;
struct tun_struct *detached;
+ struct skb_array tx_array;
};
struct tun_flow_entry {
@@ -515,7 +517,11 @@ static struct tun_struct *tun_enable_queue(struct tun_file *tfile)
static void tun_queue_purge(struct tun_file *tfile)
{
- skb_queue_purge(&tfile->sk.sk_receive_queue);
+ struct sk_buff *skb;
+
+ while ((skb = skb_array_consume(&tfile->tx_array)) != NULL)
+ kfree_skb(skb);
+
skb_queue_purge(&tfile->sk.sk_error_queue);
}
@@ -560,6 +566,8 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
tun->dev->reg_state == NETREG_REGISTERED)
unregister_netdevice(tun->dev);
}
+ if (tun)
+ skb_array_cleanup(&tfile->tx_array);
sock_put(&tfile->sk);
}
}
@@ -613,6 +621,7 @@ static void tun_detach_all(struct net_device *dev)
static int tun_attach(struct tun_struct *tun, struct file *file, bool skip_filter)
{
struct tun_file *tfile = file->private_data;
+ struct net_device *dev = tun->dev;
int err;
err = security_tun_dev_attach(tfile->socket.sk, tun->security);
@@ -642,6 +651,13 @@ static int tun_attach(struct tun_struct *tun, struct file *file, bool skip_filte
if (!err)
goto out;
}
+
+ if (!tfile->detached &&
+ skb_array_init(&tfile->tx_array, dev->tx_queue_len, GFP_KERNEL)) {
+ err = -ENOMEM;
+ goto out;
+ }
+
tfile->queue_index = tun->numqueues;
tfile->socket.sk->sk_shutdown &= ~RCV_SHUTDOWN;
rcu_assign_pointer(tfile->tun, tun);
@@ -891,8 +907,8 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
nf_reset(skb);
- /* Enqueue packet */
- skb_queue_tail(&tfile->socket.sk->sk_receive_queue, skb);
+ if (skb_array_produce(&tfile->tx_array, skb))
+ goto drop;
/* Notify and wake up reader process */
if (tfile->flags & TUN_FASYNC)
@@ -1107,7 +1123,7 @@ static unsigned int tun_chr_poll(struct file *file, poll_table *wait)
poll_wait(file, sk_sleep(sk), wait);
- if (!skb_queue_empty(&sk->sk_receive_queue))
+ if (!skb_array_empty(&tfile->tx_array))
mask |= POLLIN | POLLRDNORM;
if (sock_writeable(sk) ||
@@ -1426,22 +1442,61 @@ done:
return total;
}
+static struct sk_buff *tun_ring_recv(struct tun_file *tfile, int noblock,
+ int *err)
+{
+ DECLARE_WAITQUEUE(wait, current);
+ struct sk_buff *skb = NULL;
+
+ skb = skb_array_consume(&tfile->tx_array);
+ if (skb)
+ goto out;
+ if (noblock) {
+ *err = -EAGAIN;
+ goto out;
+ }
+
+ add_wait_queue(&tfile->wq.wait, &wait);
+ current->state = TASK_INTERRUPTIBLE;
+
+ while (1) {
+ skb = skb_array_consume(&tfile->tx_array);
+ if (skb)
+ break;
+ if (signal_pending(current)) {
+ *err = -ERESTARTSYS;
+ break;
+ }
+ if (tfile->socket.sk->sk_shutdown & RCV_SHUTDOWN) {
+ *err = -EFAULT;
+ break;
+ }
+
+ schedule();
+ };
+
+ current->state = TASK_RUNNING;
+ remove_wait_queue(&tfile->wq.wait, &wait);
+
+out:
+ return skb;
+}
+
static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
struct iov_iter *to,
int noblock)
{
struct sk_buff *skb;
ssize_t ret;
- int peeked, err, off = 0;
+ int err;
tun_debug(KERN_INFO, tun, "tun_do_read\n");
if (!iov_iter_count(to))
return 0;
- /* Read frames from queue */
- skb = __skb_recv_datagram(tfile->socket.sk, noblock ? MSG_DONTWAIT : 0,
- &peeked, &off, &err);
+ /* Read frames from ring */
+ skb = tun_ring_recv(tfile, noblock, &err);
if (!skb)
return err;
@@ -1574,8 +1629,25 @@ out:
return ret;
}
+static int tun_peek_len(struct socket *sock)
+{
+ struct tun_file *tfile = container_of(sock, struct tun_file, socket);
+ struct tun_struct *tun;
+ int ret = 0;
+
+ tun = __tun_get(tfile);
+ if (!tun)
+ return 0;
+
+ ret = skb_array_peek_len(&tfile->tx_array);
+ tun_put(tun);
+
+ return ret;
+}
+
/* Ops structure to mimic raw sockets with tun */
static const struct proto_ops tun_socket_ops = {
+ .peek_len = tun_peek_len,
.sendmsg = tun_sendmsg,
.recvmsg = tun_recvmsg,
};
@@ -2397,6 +2469,53 @@ static const struct ethtool_ops tun_ethtool_ops = {
.get_ts_info = ethtool_op_get_ts_info,
};
+static int tun_queue_resize(struct tun_struct *tun)
+{
+ struct net_device *dev = tun->dev;
+ struct tun_file *tfile;
+ struct skb_array **arrays;
+ int n = tun->numqueues + tun->numdisabled;
+ int ret, i;
+
+ arrays = kmalloc(sizeof *arrays * n, GFP_KERNEL);
+ if (!arrays)
+ return -ENOMEM;
+
+ for (i = 0; i < tun->numqueues; i++) {
+ tfile = rtnl_dereference(tun->tfiles[i]);
+ arrays[i] = &tfile->tx_array;
+ }
+ list_for_each_entry(tfile, &tun->disabled, next)
+ arrays[i++] = &tfile->tx_array;
+
+ ret = skb_array_resize_multiple(arrays, n,
+ dev->tx_queue_len, GFP_KERNEL);
+
+ kfree(arrays);
+ return ret;
+}
+
+static int tun_device_event(struct notifier_block *unused,
+ unsigned long event, void *ptr)
+{
+ struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+ struct tun_struct *tun = netdev_priv(dev);
+
+ switch (event) {
+ case NETDEV_CHANGE_TX_QUEUE_LEN:
+ if (tun_queue_resize(tun))
+ return NOTIFY_BAD;
+ break;
+ default:
+ break;
+ }
+
+ return NOTIFY_DONE;
+}
+
+static struct notifier_block tun_notifier_block __read_mostly = {
+ .notifier_call = tun_device_event,
+};
static int __init tun_init(void)
{
@@ -2416,6 +2535,8 @@ static int __init tun_init(void)
pr_err("Can't register misc device %d\n", TUN_MINOR);
goto err_misc;
}
+
+ register_netdevice_notifier(&tun_notifier_block);
return 0;
err_misc:
rtnl_link_unregister(&tun_link_ops);
@@ -2427,6 +2548,7 @@ static void tun_cleanup(void)
{
misc_deregister(&tun_miscdev);
rtnl_link_unregister(&tun_link_ops);
+ unregister_netdevice_notifier(&tun_notifier_block);
}
/* Get an underlying socket object from tun file. Returns error unless file is
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 1d3e45f..e032ca3 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -481,10 +481,14 @@ out:
static int peek_head_len(struct sock *sk)
{
+ struct socket *sock = sk->sk_socket;
struct sk_buff *head;
int len = 0;
unsigned long flags;
+ if (sock->ops->peek_len)
+ return sock->ops->peek_len(sock);
+
spin_lock_irqsave(&sk->sk_receive_queue.lock, flags);
head = skb_peek(&sk->sk_receive_queue);
if (likely(head)) {
@@ -497,6 +501,16 @@ static int peek_head_len(struct sock *sk)
return len;
}
+static int sk_has_rx_data(struct sock *sk)
+{
+ struct socket *sock = sk->sk_socket;
+
+ if (sock->ops->peek_len)
+ return sock->ops->peek_len(sock);
+
+ return skb_queue_empty(&sk->sk_receive_queue);
+}
+
static int vhost_net_rx_peek_head_len(struct vhost_net *net, struct sock *sk)
{
struct vhost_net_virtqueue *nvq = &net->vqs[VHOST_NET_VQ_TX];
@@ -513,7 +527,7 @@ static int vhost_net_rx_peek_head_len(struct vhost_net *net, struct sock *sk)
endtime = busy_clock() + vq->busyloop_timeout;
while (vhost_can_busy_poll(&net->dev, endtime) &&
- skb_queue_empty(&sk->sk_receive_queue) &&
+ !sk_has_rx_data(sk) &&
vhost_vq_avail_empty(&net->dev, vq))
cpu_relax_lowlatency();
diff --git a/include/linux/net.h b/include/linux/net.h
index 9aa49a0..b6b3843 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -185,6 +185,7 @@ struct proto_ops {
ssize_t (*splice_read)(struct socket *sock, loff_t *ppos,
struct pipe_inode_info *pipe, size_t len, unsigned int flags);
int (*set_peek_off)(struct sock *sk, int val);
+ int (*peek_len)(struct socket *sock);
};
#define DECLARE_SOCKADDR(type, dst, src) \
--
2.7.4
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH net-next V3 6/6] tun: switch to use skb array for tx
@ 2016-06-30 3:52 ` Jason Wang
0 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 3:52 UTC (permalink / raw)
To: mst, netdev, linux-kernel, davem
Cc: brouer, eric.dumazet, kvm, virtualization
We used to queue tx packets in sk_receive_queue, this is less
efficient since it requires spinlocks to synchronize between producer
and consumer.
This patch tries to address this by:
- switch from sk_receive_queue to a skb_array, and resize it when
tx_queue_len was changed.
- introduce a new proto_ops peek_len which was used for peeking the
skb length.
- implement a tun version of peek_len for vhost_net to use and convert
vhost_net to use peek_len if possible.
Pktgen test shows about 15.3% improvement on guest receiving pps for small
buffers:
Before: ~1300000pps
After : ~1500000pps
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
drivers/net/tun.c | 138 +++++++++++++++++++++++++++++++++++++++++++++++++---
drivers/vhost/net.c | 16 +++++-
include/linux/net.h | 1 +
3 files changed, 146 insertions(+), 9 deletions(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 4884802..3be69ea 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -71,6 +71,7 @@
#include <net/sock.h>
#include <linux/seq_file.h>
#include <linux/uio.h>
+#include <linux/skb_array.h>
#include <asm/uaccess.h>
@@ -167,6 +168,7 @@ struct tun_file {
};
struct list_head next;
struct tun_struct *detached;
+ struct skb_array tx_array;
};
struct tun_flow_entry {
@@ -515,7 +517,11 @@ static struct tun_struct *tun_enable_queue(struct tun_file *tfile)
static void tun_queue_purge(struct tun_file *tfile)
{
- skb_queue_purge(&tfile->sk.sk_receive_queue);
+ struct sk_buff *skb;
+
+ while ((skb = skb_array_consume(&tfile->tx_array)) != NULL)
+ kfree_skb(skb);
+
skb_queue_purge(&tfile->sk.sk_error_queue);
}
@@ -560,6 +566,8 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
tun->dev->reg_state == NETREG_REGISTERED)
unregister_netdevice(tun->dev);
}
+ if (tun)
+ skb_array_cleanup(&tfile->tx_array);
sock_put(&tfile->sk);
}
}
@@ -613,6 +621,7 @@ static void tun_detach_all(struct net_device *dev)
static int tun_attach(struct tun_struct *tun, struct file *file, bool skip_filter)
{
struct tun_file *tfile = file->private_data;
+ struct net_device *dev = tun->dev;
int err;
err = security_tun_dev_attach(tfile->socket.sk, tun->security);
@@ -642,6 +651,13 @@ static int tun_attach(struct tun_struct *tun, struct file *file, bool skip_filte
if (!err)
goto out;
}
+
+ if (!tfile->detached &&
+ skb_array_init(&tfile->tx_array, dev->tx_queue_len, GFP_KERNEL)) {
+ err = -ENOMEM;
+ goto out;
+ }
+
tfile->queue_index = tun->numqueues;
tfile->socket.sk->sk_shutdown &= ~RCV_SHUTDOWN;
rcu_assign_pointer(tfile->tun, tun);
@@ -891,8 +907,8 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
nf_reset(skb);
- /* Enqueue packet */
- skb_queue_tail(&tfile->socket.sk->sk_receive_queue, skb);
+ if (skb_array_produce(&tfile->tx_array, skb))
+ goto drop;
/* Notify and wake up reader process */
if (tfile->flags & TUN_FASYNC)
@@ -1107,7 +1123,7 @@ static unsigned int tun_chr_poll(struct file *file, poll_table *wait)
poll_wait(file, sk_sleep(sk), wait);
- if (!skb_queue_empty(&sk->sk_receive_queue))
+ if (!skb_array_empty(&tfile->tx_array))
mask |= POLLIN | POLLRDNORM;
if (sock_writeable(sk) ||
@@ -1426,22 +1442,61 @@ done:
return total;
}
+static struct sk_buff *tun_ring_recv(struct tun_file *tfile, int noblock,
+ int *err)
+{
+ DECLARE_WAITQUEUE(wait, current);
+ struct sk_buff *skb = NULL;
+
+ skb = skb_array_consume(&tfile->tx_array);
+ if (skb)
+ goto out;
+ if (noblock) {
+ *err = -EAGAIN;
+ goto out;
+ }
+
+ add_wait_queue(&tfile->wq.wait, &wait);
+ current->state = TASK_INTERRUPTIBLE;
+
+ while (1) {
+ skb = skb_array_consume(&tfile->tx_array);
+ if (skb)
+ break;
+ if (signal_pending(current)) {
+ *err = -ERESTARTSYS;
+ break;
+ }
+ if (tfile->socket.sk->sk_shutdown & RCV_SHUTDOWN) {
+ *err = -EFAULT;
+ break;
+ }
+
+ schedule();
+ };
+
+ current->state = TASK_RUNNING;
+ remove_wait_queue(&tfile->wq.wait, &wait);
+
+out:
+ return skb;
+}
+
static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
struct iov_iter *to,
int noblock)
{
struct sk_buff *skb;
ssize_t ret;
- int peeked, err, off = 0;
+ int err;
tun_debug(KERN_INFO, tun, "tun_do_read\n");
if (!iov_iter_count(to))
return 0;
- /* Read frames from queue */
- skb = __skb_recv_datagram(tfile->socket.sk, noblock ? MSG_DONTWAIT : 0,
- &peeked, &off, &err);
+ /* Read frames from ring */
+ skb = tun_ring_recv(tfile, noblock, &err);
if (!skb)
return err;
@@ -1574,8 +1629,25 @@ out:
return ret;
}
+static int tun_peek_len(struct socket *sock)
+{
+ struct tun_file *tfile = container_of(sock, struct tun_file, socket);
+ struct tun_struct *tun;
+ int ret = 0;
+
+ tun = __tun_get(tfile);
+ if (!tun)
+ return 0;
+
+ ret = skb_array_peek_len(&tfile->tx_array);
+ tun_put(tun);
+
+ return ret;
+}
+
/* Ops structure to mimic raw sockets with tun */
static const struct proto_ops tun_socket_ops = {
+ .peek_len = tun_peek_len,
.sendmsg = tun_sendmsg,
.recvmsg = tun_recvmsg,
};
@@ -2397,6 +2469,53 @@ static const struct ethtool_ops tun_ethtool_ops = {
.get_ts_info = ethtool_op_get_ts_info,
};
+static int tun_queue_resize(struct tun_struct *tun)
+{
+ struct net_device *dev = tun->dev;
+ struct tun_file *tfile;
+ struct skb_array **arrays;
+ int n = tun->numqueues + tun->numdisabled;
+ int ret, i;
+
+ arrays = kmalloc(sizeof *arrays * n, GFP_KERNEL);
+ if (!arrays)
+ return -ENOMEM;
+
+ for (i = 0; i < tun->numqueues; i++) {
+ tfile = rtnl_dereference(tun->tfiles[i]);
+ arrays[i] = &tfile->tx_array;
+ }
+ list_for_each_entry(tfile, &tun->disabled, next)
+ arrays[i++] = &tfile->tx_array;
+
+ ret = skb_array_resize_multiple(arrays, n,
+ dev->tx_queue_len, GFP_KERNEL);
+
+ kfree(arrays);
+ return ret;
+}
+
+static int tun_device_event(struct notifier_block *unused,
+ unsigned long event, void *ptr)
+{
+ struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+ struct tun_struct *tun = netdev_priv(dev);
+
+ switch (event) {
+ case NETDEV_CHANGE_TX_QUEUE_LEN:
+ if (tun_queue_resize(tun))
+ return NOTIFY_BAD;
+ break;
+ default:
+ break;
+ }
+
+ return NOTIFY_DONE;
+}
+
+static struct notifier_block tun_notifier_block __read_mostly = {
+ .notifier_call = tun_device_event,
+};
static int __init tun_init(void)
{
@@ -2416,6 +2535,8 @@ static int __init tun_init(void)
pr_err("Can't register misc device %d\n", TUN_MINOR);
goto err_misc;
}
+
+ register_netdevice_notifier(&tun_notifier_block);
return 0;
err_misc:
rtnl_link_unregister(&tun_link_ops);
@@ -2427,6 +2548,7 @@ static void tun_cleanup(void)
{
misc_deregister(&tun_miscdev);
rtnl_link_unregister(&tun_link_ops);
+ unregister_netdevice_notifier(&tun_notifier_block);
}
/* Get an underlying socket object from tun file. Returns error unless file is
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 1d3e45f..e032ca3 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -481,10 +481,14 @@ out:
static int peek_head_len(struct sock *sk)
{
+ struct socket *sock = sk->sk_socket;
struct sk_buff *head;
int len = 0;
unsigned long flags;
+ if (sock->ops->peek_len)
+ return sock->ops->peek_len(sock);
+
spin_lock_irqsave(&sk->sk_receive_queue.lock, flags);
head = skb_peek(&sk->sk_receive_queue);
if (likely(head)) {
@@ -497,6 +501,16 @@ static int peek_head_len(struct sock *sk)
return len;
}
+static int sk_has_rx_data(struct sock *sk)
+{
+ struct socket *sock = sk->sk_socket;
+
+ if (sock->ops->peek_len)
+ return sock->ops->peek_len(sock);
+
+ return skb_queue_empty(&sk->sk_receive_queue);
+}
+
static int vhost_net_rx_peek_head_len(struct vhost_net *net, struct sock *sk)
{
struct vhost_net_virtqueue *nvq = &net->vqs[VHOST_NET_VQ_TX];
@@ -513,7 +527,7 @@ static int vhost_net_rx_peek_head_len(struct vhost_net *net, struct sock *sk)
endtime = busy_clock() + vq->busyloop_timeout;
while (vhost_can_busy_poll(&net->dev, endtime) &&
- skb_queue_empty(&sk->sk_receive_queue) &&
+ !sk_has_rx_data(sk) &&
vhost_vq_avail_empty(&net->dev, vq))
cpu_relax_lowlatency();
diff --git a/include/linux/net.h b/include/linux/net.h
index 9aa49a0..b6b3843 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -185,6 +185,7 @@ struct proto_ops {
ssize_t (*splice_read)(struct socket *sock, loff_t *ppos,
struct pipe_inode_info *pipe, size_t len, unsigned int flags);
int (*set_peek_off)(struct sock *sk, int val);
+ int (*peek_len)(struct socket *sock);
};
#define DECLARE_SOCKADDR(type, dst, src) \
--
2.7.4
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH net-next V3 5/6] net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
2016-06-30 3:52 ` Jason Wang
(?)
@ 2016-06-30 4:56 ` John Fastabend
2016-06-30 5:12 ` Jason Wang
-1 siblings, 1 reply; 28+ messages in thread
From: John Fastabend @ 2016-06-30 4:56 UTC (permalink / raw)
To: Jason Wang, mst, netdev, linux-kernel, davem
Cc: kvm, virtualization, eric.dumazet, brouer
On 16-06-29 08:52 PM, Jason Wang wrote:
> This patch introduces a new event - NETDEV_CHANGE_TX_QUEUE_LEN, this
> will be triggered when tx_queue_len. It could be used by net device
> who want to do some processing at that time. An example is tun who may
> want to resize tx array when tx_queue_len is changed.
>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
> include/linux/netdevice.h | 1 +
> net/core/net-sysfs.c | 15 ++++++++++++++-
> 2 files changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index e84d9d2..7dc2ec7 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -2237,6 +2237,7 @@ struct netdev_lag_lower_state_info {
> #define NETDEV_PRECHANGEUPPER 0x001A
> #define NETDEV_CHANGELOWERSTATE 0x001B
> #define NETDEV_UDP_TUNNEL_PUSH_INFO 0x001C
> +#define NETDEV_CHANGE_TX_QUEUE_LEN 0x001E
>
> int register_netdevice_notifier(struct notifier_block *nb);
> int unregister_netdevice_notifier(struct notifier_block *nb);
> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
> index 7a0b616..6e4f347 100644
> --- a/net/core/net-sysfs.c
> +++ b/net/core/net-sysfs.c
> @@ -322,7 +322,20 @@ NETDEVICE_SHOW_RW(flags, fmt_hex);
>
> static int change_tx_queue_len(struct net_device *dev, unsigned long new_len)
> {
> - dev->tx_queue_len = new_len;
> + int res, orig_len = dev->tx_queue_len;
> +
> + if (new_len != orig_len) {
> + dev->tx_queue_len = new_len;
> + res = call_netdevice_notifiers(NETDEV_CHANGE_TX_QUEUE_LEN, dev);
> + res = notifier_to_errno(res);
> + if (res) {
> + netdev_err(dev,
> + "refused to change device tx_queue_len\n");
> + dev->tx_queue_len = orig_len;
> + return -EFAULT;
> + }
> + }
> +
> return 0;
> }
>
>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Great timing I was just looking into this because I need it for the
qdisc side.
It looks like this covers the sysfs change but the tx_queue_len can
also be changed via rtnetlink as well. So we need another patch for
that path right?
if (tb[IFLA_TXQLEN]) {
unsigned long value = nla_get_u32(tb[IFLA_TXQLEN]);
if (dev->tx_queue_len ^ value)
status |= DO_SETLINK_NOTIFY;
dev->tx_queue_len = value;
}
Thanks,
John
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next V3 5/6] net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
2016-06-30 3:52 ` Jason Wang
(?)
(?)
@ 2016-06-30 4:56 ` John Fastabend
-1 siblings, 0 replies; 28+ messages in thread
From: John Fastabend @ 2016-06-30 4:56 UTC (permalink / raw)
To: Jason Wang, mst, netdev, linux-kernel, davem
Cc: brouer, eric.dumazet, kvm, virtualization
On 16-06-29 08:52 PM, Jason Wang wrote:
> This patch introduces a new event - NETDEV_CHANGE_TX_QUEUE_LEN, this
> will be triggered when tx_queue_len. It could be used by net device
> who want to do some processing at that time. An example is tun who may
> want to resize tx array when tx_queue_len is changed.
>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
> include/linux/netdevice.h | 1 +
> net/core/net-sysfs.c | 15 ++++++++++++++-
> 2 files changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index e84d9d2..7dc2ec7 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -2237,6 +2237,7 @@ struct netdev_lag_lower_state_info {
> #define NETDEV_PRECHANGEUPPER 0x001A
> #define NETDEV_CHANGELOWERSTATE 0x001B
> #define NETDEV_UDP_TUNNEL_PUSH_INFO 0x001C
> +#define NETDEV_CHANGE_TX_QUEUE_LEN 0x001E
>
> int register_netdevice_notifier(struct notifier_block *nb);
> int unregister_netdevice_notifier(struct notifier_block *nb);
> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
> index 7a0b616..6e4f347 100644
> --- a/net/core/net-sysfs.c
> +++ b/net/core/net-sysfs.c
> @@ -322,7 +322,20 @@ NETDEVICE_SHOW_RW(flags, fmt_hex);
>
> static int change_tx_queue_len(struct net_device *dev, unsigned long new_len)
> {
> - dev->tx_queue_len = new_len;
> + int res, orig_len = dev->tx_queue_len;
> +
> + if (new_len != orig_len) {
> + dev->tx_queue_len = new_len;
> + res = call_netdevice_notifiers(NETDEV_CHANGE_TX_QUEUE_LEN, dev);
> + res = notifier_to_errno(res);
> + if (res) {
> + netdev_err(dev,
> + "refused to change device tx_queue_len\n");
> + dev->tx_queue_len = orig_len;
> + return -EFAULT;
> + }
> + }
> +
> return 0;
> }
>
>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Great timing I was just looking into this because I need it for the
qdisc side.
It looks like this covers the sysfs change but the tx_queue_len can
also be changed via rtnetlink as well. So we need another patch for
that path right?
if (tb[IFLA_TXQLEN]) {
unsigned long value = nla_get_u32(tb[IFLA_TXQLEN]);
if (dev->tx_queue_len ^ value)
status |= DO_SETLINK_NOTIFY;
dev->tx_queue_len = value;
}
Thanks,
John
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next V3 5/6] net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
2016-06-30 4:56 ` John Fastabend
@ 2016-06-30 5:12 ` Jason Wang
0 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 5:12 UTC (permalink / raw)
To: John Fastabend, mst, netdev, linux-kernel, davem
Cc: kvm, virtualization, eric.dumazet, brouer
On 2016年06月30日 12:56, John Fastabend wrote:
> On 16-06-29 08:52 PM, Jason Wang wrote:
>> This patch introduces a new event - NETDEV_CHANGE_TX_QUEUE_LEN, this
>> will be triggered when tx_queue_len. It could be used by net device
>> who want to do some processing at that time. An example is tun who may
>> want to resize tx array when tx_queue_len is changed.
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>> include/linux/netdevice.h | 1 +
>> net/core/net-sysfs.c | 15 ++++++++++++++-
>> 2 files changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index e84d9d2..7dc2ec7 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -2237,6 +2237,7 @@ struct netdev_lag_lower_state_info {
>> #define NETDEV_PRECHANGEUPPER 0x001A
>> #define NETDEV_CHANGELOWERSTATE 0x001B
>> #define NETDEV_UDP_TUNNEL_PUSH_INFO 0x001C
>> +#define NETDEV_CHANGE_TX_QUEUE_LEN 0x001E
>>
>> int register_netdevice_notifier(struct notifier_block *nb);
>> int unregister_netdevice_notifier(struct notifier_block *nb);
>> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
>> index 7a0b616..6e4f347 100644
>> --- a/net/core/net-sysfs.c
>> +++ b/net/core/net-sysfs.c
>> @@ -322,7 +322,20 @@ NETDEVICE_SHOW_RW(flags, fmt_hex);
>>
>> static int change_tx_queue_len(struct net_device *dev, unsigned long new_len)
>> {
>> - dev->tx_queue_len = new_len;
>> + int res, orig_len = dev->tx_queue_len;
>> +
>> + if (new_len != orig_len) {
>> + dev->tx_queue_len = new_len;
>> + res = call_netdevice_notifiers(NETDEV_CHANGE_TX_QUEUE_LEN, dev);
>> + res = notifier_to_errno(res);
>> + if (res) {
>> + netdev_err(dev,
>> + "refused to change device tx_queue_len\n");
>> + dev->tx_queue_len = orig_len;
>> + return -EFAULT;
>> + }
>> + }
>> +
>> return 0;
>> }
>>
>>
> Acked-by: John Fastabend <john.r.fastabend@intel.com>
>
> Great timing I was just looking into this because I need it for the
> qdisc side.
>
> It looks like this covers the sysfs change but the tx_queue_len can
> also be changed via rtnetlink as well. So we need another patch for
> that path right?
>
> if (tb[IFLA_TXQLEN]) {
> unsigned long value = nla_get_u32(tb[IFLA_TXQLEN]);
>
> if (dev->tx_queue_len ^ value)
> status |= DO_SETLINK_NOTIFY;
>
> dev->tx_queue_len = value;
> }
>
> Thanks,
> John
>
Right, will do this in next version.
Thanks
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next V3 5/6] net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
@ 2016-06-30 5:12 ` Jason Wang
0 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 5:12 UTC (permalink / raw)
To: John Fastabend, mst, netdev, linux-kernel, davem
Cc: brouer, eric.dumazet, kvm, virtualization
On 2016年06月30日 12:56, John Fastabend wrote:
> On 16-06-29 08:52 PM, Jason Wang wrote:
>> This patch introduces a new event - NETDEV_CHANGE_TX_QUEUE_LEN, this
>> will be triggered when tx_queue_len. It could be used by net device
>> who want to do some processing at that time. An example is tun who may
>> want to resize tx array when tx_queue_len is changed.
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>> include/linux/netdevice.h | 1 +
>> net/core/net-sysfs.c | 15 ++++++++++++++-
>> 2 files changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index e84d9d2..7dc2ec7 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -2237,6 +2237,7 @@ struct netdev_lag_lower_state_info {
>> #define NETDEV_PRECHANGEUPPER 0x001A
>> #define NETDEV_CHANGELOWERSTATE 0x001B
>> #define NETDEV_UDP_TUNNEL_PUSH_INFO 0x001C
>> +#define NETDEV_CHANGE_TX_QUEUE_LEN 0x001E
>>
>> int register_netdevice_notifier(struct notifier_block *nb);
>> int unregister_netdevice_notifier(struct notifier_block *nb);
>> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
>> index 7a0b616..6e4f347 100644
>> --- a/net/core/net-sysfs.c
>> +++ b/net/core/net-sysfs.c
>> @@ -322,7 +322,20 @@ NETDEVICE_SHOW_RW(flags, fmt_hex);
>>
>> static int change_tx_queue_len(struct net_device *dev, unsigned long new_len)
>> {
>> - dev->tx_queue_len = new_len;
>> + int res, orig_len = dev->tx_queue_len;
>> +
>> + if (new_len != orig_len) {
>> + dev->tx_queue_len = new_len;
>> + res = call_netdevice_notifiers(NETDEV_CHANGE_TX_QUEUE_LEN, dev);
>> + res = notifier_to_errno(res);
>> + if (res) {
>> + netdev_err(dev,
>> + "refused to change device tx_queue_len\n");
>> + dev->tx_queue_len = orig_len;
>> + return -EFAULT;
>> + }
>> + }
>> +
>> return 0;
>> }
>>
>>
> Acked-by: John Fastabend <john.r.fastabend@intel.com>
>
> Great timing I was just looking into this because I need it for the
> qdisc side.
>
> It looks like this covers the sysfs change but the tx_queue_len can
> also be changed via rtnetlink as well. So we need another patch for
> that path right?
>
> if (tb[IFLA_TXQLEN]) {
> unsigned long value = nla_get_u32(tb[IFLA_TXQLEN]);
>
> if (dev->tx_queue_len ^ value)
> status |= DO_SETLINK_NOTIFY;
>
> dev->tx_queue_len = value;
> }
>
> Thanks,
> John
>
Right, will do this in next version.
Thanks
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next V3 0/6] switch to use tx skb array in tun
2016-06-30 3:52 ` Jason Wang
@ 2016-06-30 5:37 ` Michael S. Tsirkin
-1 siblings, 0 replies; 28+ messages in thread
From: Michael S. Tsirkin @ 2016-06-30 5:37 UTC (permalink / raw)
To: Jason Wang
Cc: netdev, linux-kernel, davem, kvm, virtualization, eric.dumazet, brouer
On Thu, Jun 30, 2016 at 11:52:53AM +0800, Jason Wang wrote:
> Hi all:
>
> This series tries to switch to use skb array in tun. This is used to
> eliminate the spinlock contention between producer and consumer. The
> conversion was straightforward: just introdce a tx skb array and use
> it instead of sk_receive_queue.
>
> A minor issue is to keep the tx_queue_len behaviour, since tun used to
> use it for the length of sk_receive_queue. This is done through:
>
> - add the ability to resize multiple rings at once to avoid handling
> partial resize failure for mutiple rings.
> - add the support for zero length ring.
> - introduce a notifier which was triggered when tx_queue_len was
> changed for a netdev.
> - resize all queues during the tx_queue_len changing.
>
> Tests shows about 15% improvement on guest rx pps:
>
> Before: ~1300000pps
> After : ~1500000pps
Series:
Acked-by: Michael S. Tsirkin <mst@redhat.com>
> Changes from V2:
> - add multiple rings resizing support for ptr_ring/skb_array
> - add zero length ring support
> - introdce a NETDEV_CHANGE_TX_QUEUE_LEN
> - drop new flags
>
> Changes from V1:
> - switch to use skb array instead of a customized circular buffer
> - add non-blocking support
> - rename .peek to .peek_len
> - drop lockless peeking since test show very minor improvement
>
> Jason Wang (5):
> ptr_ring: support zero length ring
> skb_array: minor tweak
> skb_array: add wrappers for resizing
> net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
> tun: switch to use skb array for tx
>
> Michael S. Tsirkin (1):
> ptr_ring: support resizing multiple queues
>
> drivers/net/tun.c | 138 ++++++++++++++++++++++++++++++++++++---
> drivers/vhost/net.c | 16 ++++-
> include/linux/net.h | 1 +
> include/linux/netdevice.h | 1 +
> include/linux/ptr_ring.h | 77 ++++++++++++++++++----
> include/linux/skb_array.h | 13 +++-
> net/core/net-sysfs.c | 15 ++++-
> tools/virtio/ringtest/ptr_ring.c | 5 ++
> 8 files changed, 243 insertions(+), 23 deletions(-)
>
> --
> 2.7.4
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next V3 0/6] switch to use tx skb array in tun
@ 2016-06-30 5:37 ` Michael S. Tsirkin
0 siblings, 0 replies; 28+ messages in thread
From: Michael S. Tsirkin @ 2016-06-30 5:37 UTC (permalink / raw)
To: Jason Wang
Cc: kvm, eric.dumazet, netdev, linux-kernel, virtualization, brouer, davem
On Thu, Jun 30, 2016 at 11:52:53AM +0800, Jason Wang wrote:
> Hi all:
>
> This series tries to switch to use skb array in tun. This is used to
> eliminate the spinlock contention between producer and consumer. The
> conversion was straightforward: just introdce a tx skb array and use
> it instead of sk_receive_queue.
>
> A minor issue is to keep the tx_queue_len behaviour, since tun used to
> use it for the length of sk_receive_queue. This is done through:
>
> - add the ability to resize multiple rings at once to avoid handling
> partial resize failure for mutiple rings.
> - add the support for zero length ring.
> - introduce a notifier which was triggered when tx_queue_len was
> changed for a netdev.
> - resize all queues during the tx_queue_len changing.
>
> Tests shows about 15% improvement on guest rx pps:
>
> Before: ~1300000pps
> After : ~1500000pps
Series:
Acked-by: Michael S. Tsirkin <mst@redhat.com>
> Changes from V2:
> - add multiple rings resizing support for ptr_ring/skb_array
> - add zero length ring support
> - introdce a NETDEV_CHANGE_TX_QUEUE_LEN
> - drop new flags
>
> Changes from V1:
> - switch to use skb array instead of a customized circular buffer
> - add non-blocking support
> - rename .peek to .peek_len
> - drop lockless peeking since test show very minor improvement
>
> Jason Wang (5):
> ptr_ring: support zero length ring
> skb_array: minor tweak
> skb_array: add wrappers for resizing
> net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
> tun: switch to use skb array for tx
>
> Michael S. Tsirkin (1):
> ptr_ring: support resizing multiple queues
>
> drivers/net/tun.c | 138 ++++++++++++++++++++++++++++++++++++---
> drivers/vhost/net.c | 16 ++++-
> include/linux/net.h | 1 +
> include/linux/netdevice.h | 1 +
> include/linux/ptr_ring.h | 77 ++++++++++++++++++----
> include/linux/skb_array.h | 13 +++-
> net/core/net-sysfs.c | 15 ++++-
> tools/virtio/ringtest/ptr_ring.c | 5 ++
> 8 files changed, 243 insertions(+), 23 deletions(-)
>
> --
> 2.7.4
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next V3 5/6] net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
2016-06-30 5:12 ` Jason Wang
@ 2016-06-30 5:59 ` Jason Wang
-1 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 5:59 UTC (permalink / raw)
To: John Fastabend, mst, netdev, linux-kernel, davem
Cc: kvm, virtualization, eric.dumazet, brouer
On 2016年06月30日 13:12, Jason Wang wrote:
>
>
> On 2016年06月30日 12:56, John Fastabend wrote:
>> On 16-06-29 08:52 PM, Jason Wang wrote:
>>> This patch introduces a new event - NETDEV_CHANGE_TX_QUEUE_LEN, this
>>> will be triggered when tx_queue_len. It could be used by net device
>>> who want to do some processing at that time. An example is tun who may
>>> want to resize tx array when tx_queue_len is changed.
>>>
>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>> ---
>>> include/linux/netdevice.h | 1 +
>>> net/core/net-sysfs.c | 15 ++++++++++++++-
>>> 2 files changed, 15 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>> index e84d9d2..7dc2ec7 100644
>>> --- a/include/linux/netdevice.h
>>> +++ b/include/linux/netdevice.h
>>> @@ -2237,6 +2237,7 @@ struct netdev_lag_lower_state_info {
>>> #define NETDEV_PRECHANGEUPPER 0x001A
>>> #define NETDEV_CHANGELOWERSTATE 0x001B
>>> #define NETDEV_UDP_TUNNEL_PUSH_INFO 0x001C
>>> +#define NETDEV_CHANGE_TX_QUEUE_LEN 0x001E
>>> int register_netdevice_notifier(struct notifier_block *nb);
>>> int unregister_netdevice_notifier(struct notifier_block *nb);
>>> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
>>> index 7a0b616..6e4f347 100644
>>> --- a/net/core/net-sysfs.c
>>> +++ b/net/core/net-sysfs.c
>>> @@ -322,7 +322,20 @@ NETDEVICE_SHOW_RW(flags, fmt_hex);
>>> static int change_tx_queue_len(struct net_device *dev, unsigned
>>> long new_len)
>>> {
>>> - dev->tx_queue_len = new_len;
>>> + int res, orig_len = dev->tx_queue_len;
>>> +
>>> + if (new_len != orig_len) {
>>> + dev->tx_queue_len = new_len;
>>> + res = call_netdevice_notifiers(NETDEV_CHANGE_TX_QUEUE_LEN,
>>> dev);
>>> + res = notifier_to_errno(res);
>>> + if (res) {
>>> + netdev_err(dev,
>>> + "refused to change device tx_queue_len\n");
>>> + dev->tx_queue_len = orig_len;
>>> + return -EFAULT;
>>> + }
>>> + }
>>> +
>>> return 0;
>>> }
>>>
>> Acked-by: John Fastabend <john.r.fastabend@intel.com>
>>
>> Great timing I was just looking into this because I need it for the
>> qdisc side.
>>
>> It looks like this covers the sysfs change but the tx_queue_len can
>> also be changed via rtnetlink as well. So we need another patch for
>> that path right?
>>
>> if (tb[IFLA_TXQLEN]) {
>> unsigned long value = nla_get_u32(tb[IFLA_TXQLEN]);
>>
>> if (dev->tx_queue_len ^ value)
>> status |= DO_SETLINK_NOTIFY;
>>
>> dev->tx_queue_len = value;
>> }
>>
>> Thanks,
>> John
>>
>
> Right, will do this in next version.
>
> Thanks
Ok, since Michael has acked on the series, will prepare a patch on top.
Thanks
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next V3 5/6] net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
@ 2016-06-30 5:59 ` Jason Wang
0 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 5:59 UTC (permalink / raw)
To: John Fastabend, mst, netdev, linux-kernel, davem
Cc: brouer, eric.dumazet, kvm, virtualization
On 2016年06月30日 13:12, Jason Wang wrote:
>
>
> On 2016年06月30日 12:56, John Fastabend wrote:
>> On 16-06-29 08:52 PM, Jason Wang wrote:
>>> This patch introduces a new event - NETDEV_CHANGE_TX_QUEUE_LEN, this
>>> will be triggered when tx_queue_len. It could be used by net device
>>> who want to do some processing at that time. An example is tun who may
>>> want to resize tx array when tx_queue_len is changed.
>>>
>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>> ---
>>> include/linux/netdevice.h | 1 +
>>> net/core/net-sysfs.c | 15 ++++++++++++++-
>>> 2 files changed, 15 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>> index e84d9d2..7dc2ec7 100644
>>> --- a/include/linux/netdevice.h
>>> +++ b/include/linux/netdevice.h
>>> @@ -2237,6 +2237,7 @@ struct netdev_lag_lower_state_info {
>>> #define NETDEV_PRECHANGEUPPER 0x001A
>>> #define NETDEV_CHANGELOWERSTATE 0x001B
>>> #define NETDEV_UDP_TUNNEL_PUSH_INFO 0x001C
>>> +#define NETDEV_CHANGE_TX_QUEUE_LEN 0x001E
>>> int register_netdevice_notifier(struct notifier_block *nb);
>>> int unregister_netdevice_notifier(struct notifier_block *nb);
>>> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
>>> index 7a0b616..6e4f347 100644
>>> --- a/net/core/net-sysfs.c
>>> +++ b/net/core/net-sysfs.c
>>> @@ -322,7 +322,20 @@ NETDEVICE_SHOW_RW(flags, fmt_hex);
>>> static int change_tx_queue_len(struct net_device *dev, unsigned
>>> long new_len)
>>> {
>>> - dev->tx_queue_len = new_len;
>>> + int res, orig_len = dev->tx_queue_len;
>>> +
>>> + if (new_len != orig_len) {
>>> + dev->tx_queue_len = new_len;
>>> + res = call_netdevice_notifiers(NETDEV_CHANGE_TX_QUEUE_LEN,
>>> dev);
>>> + res = notifier_to_errno(res);
>>> + if (res) {
>>> + netdev_err(dev,
>>> + "refused to change device tx_queue_len\n");
>>> + dev->tx_queue_len = orig_len;
>>> + return -EFAULT;
>>> + }
>>> + }
>>> +
>>> return 0;
>>> }
>>>
>> Acked-by: John Fastabend <john.r.fastabend@intel.com>
>>
>> Great timing I was just looking into this because I need it for the
>> qdisc side.
>>
>> It looks like this covers the sysfs change but the tx_queue_len can
>> also be changed via rtnetlink as well. So we need another patch for
>> that path right?
>>
>> if (tb[IFLA_TXQLEN]) {
>> unsigned long value = nla_get_u32(tb[IFLA_TXQLEN]);
>>
>> if (dev->tx_queue_len ^ value)
>> status |= DO_SETLINK_NOTIFY;
>>
>> dev->tx_queue_len = value;
>> }
>>
>> Thanks,
>> John
>>
>
> Right, will do this in next version.
>
> Thanks
Ok, since Michael has acked on the series, will prepare a patch on top.
Thanks
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next V3 6/6] tun: switch to use skb array for tx
2016-06-30 3:52 ` Jason Wang
@ 2016-06-30 6:20 ` kbuild test robot
-1 siblings, 0 replies; 28+ messages in thread
From: kbuild test robot @ 2016-06-30 6:20 UTC (permalink / raw)
To: Jason Wang
Cc: kbuild-all, mst, netdev, linux-kernel, davem, kvm,
virtualization, eric.dumazet, brouer, Jason Wang
Hi,
[auto build test WARNING on net-next/master]
url: https://github.com/0day-ci/linux/commits/Jason-Wang/switch-to-use-tx-skb-array-in-tun/20160630-120656
coccinelle warnings: (new ones prefixed by >>)
>> drivers/net/tun.c:1476:2-3: Unneeded semicolon
Please review and possibly fold the followup patch.
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH] tun: fix semicolon.cocci warnings
2016-06-30 3:52 ` Jason Wang
@ 2016-06-30 6:20 ` kbuild test robot
-1 siblings, 0 replies; 28+ messages in thread
From: kbuild test robot @ 2016-06-30 6:20 UTC (permalink / raw)
To: Jason Wang
Cc: kbuild-all, mst, netdev, linux-kernel, davem, kvm,
virtualization, eric.dumazet, brouer, Jason Wang
drivers/net/tun.c:1476:2-3: Unneeded semicolon
Remove unneeded semicolon.
Generated by: scripts/coccinelle/misc/semicolon.cocci
CC: Jason Wang <jasowang@redhat.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---
tun.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1473,7 +1473,7 @@ static struct sk_buff *tun_ring_recv(str
}
schedule();
- };
+ }
current->state = TASK_RUNNING;
remove_wait_queue(&tfile->wq.wait, &wait);
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH] tun: fix semicolon.cocci warnings
@ 2016-06-30 6:20 ` kbuild test robot
0 siblings, 0 replies; 28+ messages in thread
From: kbuild test robot @ 2016-06-30 6:20 UTC (permalink / raw)
To: Jason Wang
Cc: eric.dumazet, kvm, mst, netdev, linux-kernel, virtualization,
kbuild-all, brouer, davem
drivers/net/tun.c:1476:2-3: Unneeded semicolon
Remove unneeded semicolon.
Generated by: scripts/coccinelle/misc/semicolon.cocci
CC: Jason Wang <jasowang@redhat.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---
tun.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1473,7 +1473,7 @@ static struct sk_buff *tun_ring_recv(str
}
schedule();
- };
+ }
current->state = TASK_RUNNING;
remove_wait_queue(&tfile->wq.wait, &wait);
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next V3 6/6] tun: switch to use skb array for tx
@ 2016-06-30 6:20 ` kbuild test robot
0 siblings, 0 replies; 28+ messages in thread
From: kbuild test robot @ 2016-06-30 6:20 UTC (permalink / raw)
To: Jason Wang
Cc: eric.dumazet, kvm, mst, netdev, linux-kernel, virtualization,
kbuild-all, brouer, davem
Hi,
[auto build test WARNING on net-next/master]
url: https://github.com/0day-ci/linux/commits/Jason-Wang/switch-to-use-tx-skb-array-in-tun/20160630-120656
coccinelle warnings: (new ones prefixed by >>)
>> drivers/net/tun.c:1476:2-3: Unneeded semicolon
Please review and possibly fold the followup patch.
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next V3 5/6] net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
2016-06-30 5:59 ` Jason Wang
@ 2016-06-30 6:43 ` Jason Wang
-1 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 6:43 UTC (permalink / raw)
To: John Fastabend, mst, netdev, linux-kernel, davem
Cc: kvm, virtualization, eric.dumazet, brouer
On 2016年06月30日 13:59, Jason Wang wrote:
>
>
> On 2016年06月30日 13:12, Jason Wang wrote:
>>
>>
>> On 2016年06月30日 12:56, John Fastabend wrote:
>>> On 16-06-29 08:52 PM, Jason Wang wrote:
>>>> This patch introduces a new event - NETDEV_CHANGE_TX_QUEUE_LEN, this
>>>> will be triggered when tx_queue_len. It could be used by net device
>>>> who want to do some processing at that time. An example is tun who may
>>>> want to resize tx array when tx_queue_len is changed.
>>>>
>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>> ---
>>>> include/linux/netdevice.h | 1 +
>>>> net/core/net-sysfs.c | 15 ++++++++++++++-
>>>> 2 files changed, 15 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>>> index e84d9d2..7dc2ec7 100644
>>>> --- a/include/linux/netdevice.h
>>>> +++ b/include/linux/netdevice.h
>>>> @@ -2237,6 +2237,7 @@ struct netdev_lag_lower_state_info {
>>>> #define NETDEV_PRECHANGEUPPER 0x001A
>>>> #define NETDEV_CHANGELOWERSTATE 0x001B
>>>> #define NETDEV_UDP_TUNNEL_PUSH_INFO 0x001C
>>>> +#define NETDEV_CHANGE_TX_QUEUE_LEN 0x001E
>>>> int register_netdevice_notifier(struct notifier_block *nb);
>>>> int unregister_netdevice_notifier(struct notifier_block *nb);
>>>> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
>>>> index 7a0b616..6e4f347 100644
>>>> --- a/net/core/net-sysfs.c
>>>> +++ b/net/core/net-sysfs.c
>>>> @@ -322,7 +322,20 @@ NETDEVICE_SHOW_RW(flags, fmt_hex);
>>>> static int change_tx_queue_len(struct net_device *dev, unsigned
>>>> long new_len)
>>>> {
>>>> - dev->tx_queue_len = new_len;
>>>> + int res, orig_len = dev->tx_queue_len;
>>>> +
>>>> + if (new_len != orig_len) {
>>>> + dev->tx_queue_len = new_len;
>>>> + res = call_netdevice_notifiers(NETDEV_CHANGE_TX_QUEUE_LEN,
>>>> dev);
>>>> + res = notifier_to_errno(res);
>>>> + if (res) {
>>>> + netdev_err(dev,
>>>> + "refused to change device tx_queue_len\n");
>>>> + dev->tx_queue_len = orig_len;
>>>> + return -EFAULT;
>>>> + }
>>>> + }
>>>> +
>>>> return 0;
>>>> }
>>>>
>>> Acked-by: John Fastabend <john.r.fastabend@intel.com>
>>>
>>> Great timing I was just looking into this because I need it for the
>>> qdisc side.
>>>
>>> It looks like this covers the sysfs change but the tx_queue_len can
>>> also be changed via rtnetlink as well. So we need another patch for
>>> that path right?
>>>
>>> if (tb[IFLA_TXQLEN]) {
>>> unsigned long value = nla_get_u32(tb[IFLA_TXQLEN]);
>>>
>>> if (dev->tx_queue_len ^ value)
>>> status |= DO_SETLINK_NOTIFY;
>>>
>>> dev->tx_queue_len = value;
>>> }
>>>
>>> Thanks,
>>> John
>>>
>>
>> Right, will do this in next version.
>>
>> Thanks
>
> Ok, since Michael has acked on the series, will prepare a patch on top.
>
> Thanks
Since kbuild test robot has found a minor issue on this series, I will
post v4 with this fixed.
Thanks
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next V3 5/6] net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
@ 2016-06-30 6:43 ` Jason Wang
0 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2016-06-30 6:43 UTC (permalink / raw)
To: John Fastabend, mst, netdev, linux-kernel, davem
Cc: brouer, eric.dumazet, kvm, virtualization
On 2016年06月30日 13:59, Jason Wang wrote:
>
>
> On 2016年06月30日 13:12, Jason Wang wrote:
>>
>>
>> On 2016年06月30日 12:56, John Fastabend wrote:
>>> On 16-06-29 08:52 PM, Jason Wang wrote:
>>>> This patch introduces a new event - NETDEV_CHANGE_TX_QUEUE_LEN, this
>>>> will be triggered when tx_queue_len. It could be used by net device
>>>> who want to do some processing at that time. An example is tun who may
>>>> want to resize tx array when tx_queue_len is changed.
>>>>
>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>> ---
>>>> include/linux/netdevice.h | 1 +
>>>> net/core/net-sysfs.c | 15 ++++++++++++++-
>>>> 2 files changed, 15 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>>> index e84d9d2..7dc2ec7 100644
>>>> --- a/include/linux/netdevice.h
>>>> +++ b/include/linux/netdevice.h
>>>> @@ -2237,6 +2237,7 @@ struct netdev_lag_lower_state_info {
>>>> #define NETDEV_PRECHANGEUPPER 0x001A
>>>> #define NETDEV_CHANGELOWERSTATE 0x001B
>>>> #define NETDEV_UDP_TUNNEL_PUSH_INFO 0x001C
>>>> +#define NETDEV_CHANGE_TX_QUEUE_LEN 0x001E
>>>> int register_netdevice_notifier(struct notifier_block *nb);
>>>> int unregister_netdevice_notifier(struct notifier_block *nb);
>>>> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
>>>> index 7a0b616..6e4f347 100644
>>>> --- a/net/core/net-sysfs.c
>>>> +++ b/net/core/net-sysfs.c
>>>> @@ -322,7 +322,20 @@ NETDEVICE_SHOW_RW(flags, fmt_hex);
>>>> static int change_tx_queue_len(struct net_device *dev, unsigned
>>>> long new_len)
>>>> {
>>>> - dev->tx_queue_len = new_len;
>>>> + int res, orig_len = dev->tx_queue_len;
>>>> +
>>>> + if (new_len != orig_len) {
>>>> + dev->tx_queue_len = new_len;
>>>> + res = call_netdevice_notifiers(NETDEV_CHANGE_TX_QUEUE_LEN,
>>>> dev);
>>>> + res = notifier_to_errno(res);
>>>> + if (res) {
>>>> + netdev_err(dev,
>>>> + "refused to change device tx_queue_len\n");
>>>> + dev->tx_queue_len = orig_len;
>>>> + return -EFAULT;
>>>> + }
>>>> + }
>>>> +
>>>> return 0;
>>>> }
>>>>
>>> Acked-by: John Fastabend <john.r.fastabend@intel.com>
>>>
>>> Great timing I was just looking into this because I need it for the
>>> qdisc side.
>>>
>>> It looks like this covers the sysfs change but the tx_queue_len can
>>> also be changed via rtnetlink as well. So we need another patch for
>>> that path right?
>>>
>>> if (tb[IFLA_TXQLEN]) {
>>> unsigned long value = nla_get_u32(tb[IFLA_TXQLEN]);
>>>
>>> if (dev->tx_queue_len ^ value)
>>> status |= DO_SETLINK_NOTIFY;
>>>
>>> dev->tx_queue_len = value;
>>> }
>>>
>>> Thanks,
>>> John
>>>
>>
>> Right, will do this in next version.
>>
>> Thanks
>
> Ok, since Michael has acked on the series, will prepare a patch on top.
>
> Thanks
Since kbuild test robot has found a minor issue on this series, I will
post v4 with this fixed.
Thanks
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2016-06-30 6:51 UTC | newest]
Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-30 3:52 [PATCH net-next V3 0/6] switch to use tx skb array in tun Jason Wang
2016-06-30 3:52 ` Jason Wang
2016-06-30 3:52 ` [PATCH net-next V3 1/6] ptr_ring: support zero length ring Jason Wang
2016-06-30 3:52 ` Jason Wang
2016-06-30 3:52 ` [PATCH net-next V3 2/6] skb_array: minor tweak Jason Wang
2016-06-30 3:52 ` Jason Wang
2016-06-30 3:52 ` [PATCH net-next V3 3/6] ptr_ring: support resizing multiple queues Jason Wang
2016-06-30 3:52 ` Jason Wang
2016-06-30 3:52 ` [PATCH net-next V3 4/6] skb_array: add wrappers for resizing Jason Wang
2016-06-30 3:52 ` Jason Wang
2016-06-30 3:52 ` [PATCH net-next V3 5/6] net: introduce NETDEV_CHANGE_TX_QUEUE_LEN Jason Wang
2016-06-30 3:52 ` Jason Wang
2016-06-30 4:56 ` John Fastabend
2016-06-30 5:12 ` Jason Wang
2016-06-30 5:12 ` Jason Wang
2016-06-30 5:59 ` Jason Wang
2016-06-30 5:59 ` Jason Wang
2016-06-30 6:43 ` Jason Wang
2016-06-30 6:43 ` Jason Wang
2016-06-30 4:56 ` John Fastabend
2016-06-30 3:52 ` [PATCH net-next V3 6/6] tun: switch to use skb array for tx Jason Wang
2016-06-30 3:52 ` Jason Wang
2016-06-30 6:20 ` [PATCH] tun: fix semicolon.cocci warnings kbuild test robot
2016-06-30 6:20 ` kbuild test robot
2016-06-30 6:20 ` [PATCH net-next V3 6/6] tun: switch to use skb array for tx kbuild test robot
2016-06-30 6:20 ` kbuild test robot
2016-06-30 5:37 ` [PATCH net-next V3 0/6] switch to use tx skb array in tun Michael S. Tsirkin
2016-06-30 5:37 ` Michael S. Tsirkin
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.