* [REBASE PATCH net-next v9 0/4] net: vhost: improve performance when enable busyloop
@ 2018-09-25 12:36 xiangxia.m.yue
  2018-09-25 12:36 ` [REBASE PATCH net-next v9 1/4] net: vhost: lock the vqs one by one xiangxia.m.yue
                   ` (4 more replies)
  0 siblings, 5 replies; 9 messages in thread
From: xiangxia.m.yue @ 2018-09-25 12:36 UTC (permalink / raw)
  To: jasowang, mst, makita.toshiaki, davem
  Cc: netdev, virtualization, Tonghao Zhang

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

This patch series improves the guest receive performance.
On the handle_tx side, we poll the sock receive queue
at the same time; handle_rx does the same.

For the full performance report, see patch 4.

Tonghao Zhang (4):
  net: vhost: lock the vqs one by one
  net: vhost: replace magic number of lock annotation
  net: vhost: factor out busy polling logic to vhost_net_busy_poll()
  net: vhost: add rx busy polling in tx path

 drivers/vhost/net.c   | 147 +++++++++++++++++++++++++++++---------------------
 drivers/vhost/vhost.c |  24 +++------
 2 files changed, 92 insertions(+), 79 deletions(-)

-- 
1.8.3.1


* [REBASE PATCH net-next v9 1/4] net: vhost: lock the vqs one by one
  2018-09-25 12:36 [REBASE PATCH net-next v9 0/4] net: vhost: improve performance when enable busyloop xiangxia.m.yue
@ 2018-09-25 12:36 ` xiangxia.m.yue
  2018-11-29 19:28   ` Jean-Philippe Brucker
  2018-09-25 12:36 ` [REBASE PATCH net-next v9 2/4] net: vhost: replace magic number of lock annotation xiangxia.m.yue
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 9 messages in thread
From: xiangxia.m.yue @ 2018-09-25 12:36 UTC (permalink / raw)
  To: jasowang, mst, makita.toshiaki, davem
  Cc: netdev, virtualization, Tonghao Zhang

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

This patch changes the locking scheme: instead of
locking all vqs at the same time, lock them one by
one. This will be used by the next patch to avoid
a deadlock.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/vhost/vhost.c | 24 +++++++-----------------
 1 file changed, 7 insertions(+), 17 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index b13c6b4..f52008b 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -294,8 +294,11 @@ static void vhost_vq_meta_reset(struct vhost_dev *d)
 {
 	int i;
 
-	for (i = 0; i < d->nvqs; ++i)
+	for (i = 0; i < d->nvqs; ++i) {
+		mutex_lock(&d->vqs[i]->mutex);
 		__vhost_vq_meta_reset(d->vqs[i]);
+		mutex_unlock(&d->vqs[i]->mutex);
+	}
 }
 
 static void vhost_vq_reset(struct vhost_dev *dev,
@@ -891,20 +894,6 @@ static inline void __user *__vhost_get_user(struct vhost_virtqueue *vq,
 #define vhost_get_used(vq, x, ptr) \
 	vhost_get_user(vq, x, ptr, VHOST_ADDR_USED)
 
-static void vhost_dev_lock_vqs(struct vhost_dev *d)
-{
-	int i = 0;
-	for (i = 0; i < d->nvqs; ++i)
-		mutex_lock_nested(&d->vqs[i]->mutex, i);
-}
-
-static void vhost_dev_unlock_vqs(struct vhost_dev *d)
-{
-	int i = 0;
-	for (i = 0; i < d->nvqs; ++i)
-		mutex_unlock(&d->vqs[i]->mutex);
-}
-
 static int vhost_new_umem_range(struct vhost_umem *umem,
 				u64 start, u64 size, u64 end,
 				u64 userspace_addr, int perm)
@@ -954,7 +943,10 @@ static void vhost_iotlb_notify_vq(struct vhost_dev *d,
 		if (msg->iova <= vq_msg->iova &&
 		    msg->iova + msg->size - 1 >= vq_msg->iova &&
 		    vq_msg->type == VHOST_IOTLB_MISS) {
+			mutex_lock(&node->vq->mutex);
 			vhost_poll_queue(&node->vq->poll);
+			mutex_unlock(&node->vq->mutex);
+
 			list_del(&node->node);
 			kfree(node);
 		}
@@ -986,7 +978,6 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
 	int ret = 0;
 
 	mutex_lock(&dev->mutex);
-	vhost_dev_lock_vqs(dev);
 	switch (msg->type) {
 	case VHOST_IOTLB_UPDATE:
 		if (!dev->iotlb) {
@@ -1020,7 +1011,6 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
 		break;
 	}
 
-	vhost_dev_unlock_vqs(dev);
 	mutex_unlock(&dev->mutex);
 
 	return ret;
-- 
1.8.3.1


* [REBASE PATCH net-next v9 2/4] net: vhost: replace magic number of lock annotation
  2018-09-25 12:36 [REBASE PATCH net-next v9 0/4] net: vhost: improve performance when enable busyloop xiangxia.m.yue
  2018-09-25 12:36 ` [REBASE PATCH net-next v9 1/4] net: vhost: lock the vqs one by one xiangxia.m.yue
@ 2018-09-25 12:36 ` xiangxia.m.yue
  2018-09-25 12:36 ` [REBASE PATCH net-next v9 3/4] net: vhost: factor out busy polling logic to vhost_net_busy_poll() xiangxia.m.yue
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9 messages in thread
From: xiangxia.m.yue @ 2018-09-25 12:36 UTC (permalink / raw)
  To: jasowang, mst, makita.toshiaki, davem
  Cc: netdev, virtualization, Tonghao Zhang

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

Use VHOST_NET_VQ_XXX as the lockdep subclass for
mutex_lock_nested(), replacing the magic numbers 0 and 1.
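
For reference, the subclass values are the existing per-device
virtqueue indices in drivers/vhost/net.c:

	enum {
		VHOST_NET_VQ_RX = 0,
		VHOST_NET_VQ_TX = 1,
		VHOST_NET_VQ_MAX = 2,
	};

Passing them to mutex_lock_nested() tells lockdep that the rx
and tx vq mutexes are distinct subclasses of the same lock
class, so taking one while holding the other is not reported
as recursive locking.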

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
 drivers/vhost/net.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 1bff6bc..5fe57ab 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -856,7 +856,7 @@ static void handle_tx(struct vhost_net *net)
 	struct vhost_virtqueue *vq = &nvq->vq;
 	struct socket *sock;
 
-	mutex_lock(&vq->mutex);
+	mutex_lock_nested(&vq->mutex, VHOST_NET_VQ_TX);
 	sock = vq->private_data;
 	if (!sock)
 		goto out;
@@ -921,7 +921,7 @@ static int vhost_net_rx_peek_head_len(struct vhost_net *net, struct sock *sk,
 		/* Flush batched heads first */
 		vhost_net_signal_used(rnvq);
 		/* Both tx vq and rx socket were polled here */
-		mutex_lock_nested(&tvq->mutex, 1);
+		mutex_lock_nested(&tvq->mutex, VHOST_NET_VQ_TX);
 		vhost_disable_notify(&net->dev, tvq);
 
 		preempt_disable();
@@ -1063,7 +1063,7 @@ static void handle_rx(struct vhost_net *net)
 	__virtio16 num_buffers;
 	int recv_pkts = 0;
 
-	mutex_lock_nested(&vq->mutex, 0);
+	mutex_lock_nested(&vq->mutex, VHOST_NET_VQ_RX);
 	sock = vq->private_data;
 	if (!sock)
 		goto out;
-- 
1.8.3.1


* [REBASE PATCH net-next v9 3/4] net: vhost: factor out busy polling logic to vhost_net_busy_poll()
  2018-09-25 12:36 [REBASE PATCH net-next v9 0/4] net: vhost: improve performance when enable busyloop xiangxia.m.yue
                   ` (1 preceding sibling ...)
  2018-09-25 12:36 ` [REBASE PATCH net-next v9 2/4] net: vhost: replace magic number of lock annotation xiangxia.m.yue
@ 2018-09-25 12:36 ` xiangxia.m.yue
  2018-09-25 12:36 ` [REBASE PATCH net-next v9 4/4] net: vhost: add rx busy polling in tx path xiangxia.m.yue
                   ` (1 subsequent sibling)
  4 siblings, 0 replies; 9 messages in thread
From: xiangxia.m.yue @ 2018-09-25 12:36 UTC (permalink / raw)
  To: jasowang, mst, makita.toshiaki, davem
  Cc: netdev, virtualization, Tonghao Zhang

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

Factor out the generic busy polling logic; it will be used
in the tx path by the next patch. With this patch, qemu can
also set a different busyloop_timeout for the rx queue.

To avoid duplicating code, introduce two helper functions
(a usage sketch of the factored-out helper follows):
* sock_has_rx_data (renamed from sk_has_rx_data)
* vhost_net_busy_poll_try_queue
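
A minimal sketch of the two call sites of the new helper,
taken from this patch and the next one (the tx-side call
only lands in patch 4):

	/* rx path, vhost_net_rx_peek_head_len(): poll the tx vq too */
	vhost_net_busy_poll(net, rvq, tvq, busyloop_intr, true);

	/* tx path, vhost_net_tx_get_vq_desc(): poll the rx sock too */
	vhost_net_busy_poll(net, rvq, tvq, busyloop_intr, false);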

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
 drivers/vhost/net.c | 110 +++++++++++++++++++++++++++++++++-------------------
 1 file changed, 70 insertions(+), 40 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 5fe57ab..ac0b954 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -480,6 +480,74 @@ static void vhost_tx_batch(struct vhost_net *net,
 	nvq->batched_xdp = 0;
 }
 
+static int sock_has_rx_data(struct socket *sock)
+{
+	if (unlikely(!sock))
+		return 0;
+
+	if (sock->ops->peek_len)
+		return sock->ops->peek_len(sock);
+
+	return skb_queue_empty(&sock->sk->sk_receive_queue);
+}
+
+static void vhost_net_busy_poll_try_queue(struct vhost_net *net,
+					  struct vhost_virtqueue *vq)
+{
+	if (!vhost_vq_avail_empty(&net->dev, vq)) {
+		vhost_poll_queue(&vq->poll);
+	} else if (unlikely(vhost_enable_notify(&net->dev, vq))) {
+		vhost_disable_notify(&net->dev, vq);
+		vhost_poll_queue(&vq->poll);
+	}
+}
+
+static void vhost_net_busy_poll(struct vhost_net *net,
+				struct vhost_virtqueue *rvq,
+				struct vhost_virtqueue *tvq,
+				bool *busyloop_intr,
+				bool poll_rx)
+{
+	unsigned long busyloop_timeout;
+	unsigned long endtime;
+	struct socket *sock;
+	struct vhost_virtqueue *vq = poll_rx ? tvq : rvq;
+
+	mutex_lock_nested(&vq->mutex, poll_rx ? VHOST_NET_VQ_TX: VHOST_NET_VQ_RX);
+	vhost_disable_notify(&net->dev, vq);
+	sock = rvq->private_data;
+
+	busyloop_timeout = poll_rx ? rvq->busyloop_timeout:
+				     tvq->busyloop_timeout;
+
+	preempt_disable();
+	endtime = busy_clock() + busyloop_timeout;
+
+	while (vhost_can_busy_poll(endtime)) {
+		if (vhost_has_work(&net->dev)) {
+			*busyloop_intr = true;
+			break;
+		}
+
+		if ((sock_has_rx_data(sock) &&
+		     !vhost_vq_avail_empty(&net->dev, rvq)) ||
+		    !vhost_vq_avail_empty(&net->dev, tvq))
+			break;
+
+		cpu_relax();
+	}
+
+	preempt_enable();
+
+	if (poll_rx || sock_has_rx_data(sock))
+		vhost_net_busy_poll_try_queue(net, vq);
+	else if (!poll_rx) /* On tx here, sock has no rx data. */
+		vhost_enable_notify(&net->dev, rvq);
+
+	mutex_unlock(&vq->mutex);
+}
+
+
 static int vhost_net_tx_get_vq_desc(struct vhost_net *net,
 				    struct vhost_net_virtqueue *nvq,
 				    unsigned int *out_num, unsigned int *in_num,
@@ -897,16 +965,6 @@ static int peek_head_len(struct vhost_net_virtqueue *rvq, struct sock *sk)
 	return len;
 }
 
-static int sk_has_rx_data(struct sock *sk)
-{
-	struct socket *sock = sk->sk_socket;
-
-	if (sock->ops->peek_len)
-		return sock->ops->peek_len(sock);
-
-	return skb_queue_empty(&sk->sk_receive_queue);
-}
-
 static int vhost_net_rx_peek_head_len(struct vhost_net *net, struct sock *sk,
 				      bool *busyloop_intr)
 {
@@ -914,41 +972,13 @@ static int vhost_net_rx_peek_head_len(struct vhost_net *net, struct sock *sk,
 	struct vhost_net_virtqueue *tnvq = &net->vqs[VHOST_NET_VQ_TX];
 	struct vhost_virtqueue *rvq = &rnvq->vq;
 	struct vhost_virtqueue *tvq = &tnvq->vq;
-	unsigned long uninitialized_var(endtime);
 	int len = peek_head_len(rnvq, sk);
 
-	if (!len && tvq->busyloop_timeout) {
+	if (!len && rvq->busyloop_timeout) {
 		/* Flush batched heads first */
 		vhost_net_signal_used(rnvq);
 		/* Both tx vq and rx socket were polled here */
-		mutex_lock_nested(&tvq->mutex, VHOST_NET_VQ_TX);
-		vhost_disable_notify(&net->dev, tvq);
-
-		preempt_disable();
-		endtime = busy_clock() + tvq->busyloop_timeout;
-
-		while (vhost_can_busy_poll(endtime)) {
-			if (vhost_has_work(&net->dev)) {
-				*busyloop_intr = true;
-				break;
-			}
-			if ((sk_has_rx_data(sk) &&
-			     !vhost_vq_avail_empty(&net->dev, rvq)) ||
-			    !vhost_vq_avail_empty(&net->dev, tvq))
-				break;
-			cpu_relax();
-		}
-
-		preempt_enable();
-
-		if (!vhost_vq_avail_empty(&net->dev, tvq)) {
-			vhost_poll_queue(&tvq->poll);
-		} else if (unlikely(vhost_enable_notify(&net->dev, tvq))) {
-			vhost_disable_notify(&net->dev, tvq);
-			vhost_poll_queue(&tvq->poll);
-		}
-
-		mutex_unlock(&tvq->mutex);
+		vhost_net_busy_poll(net, rvq, tvq, busyloop_intr, true);
 
 		len = peek_head_len(rnvq, sk);
 	}
-- 
1.8.3.1


* [REBASE PATCH net-next v9 4/4] net: vhost: add rx busy polling in tx path
  2018-09-25 12:36 [REBASE PATCH net-next v9 0/4] net: vhost: improve performance when enable busyloop xiangxia.m.yue
                   ` (2 preceding siblings ...)
  2018-09-25 12:36 ` [REBASE PATCH net-next v9 3/4] net: vhost: factor out busy polling logic to vhost_net_busy_poll() xiangxia.m.yue
@ 2018-09-25 12:36 ` xiangxia.m.yue
  2018-09-27  3:26 ` [REBASE PATCH net-next v9 0/4] net: vhost: improve performance when enable busyloop David Miller
  4 siblings, 0 replies; 9 messages in thread
From: xiangxia.m.yue @ 2018-09-25 12:36 UTC (permalink / raw)
  To: jasowang, mst, makita.toshiaki, davem
  Cc: netdev, virtualization, Tonghao Zhang

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

This patch improves the guest receive performance.
On the handle_tx side, we poll the sock receive queue at the
same time; handle_rx does the same.

We set poll-us=100us and use netperf to measure throughput
and mean latency. While the tests run, the vhost-net kthread
of that VM is constantly at 100% CPU. The command is shown
below.

Rx performance is greatly improved by this patch. There is no
notable tx performance change with this series, though. This
patch is useful for bi-directional traffic.

netperf -H IP -t TCP_STREAM -l 20 -- -O "THROUGHPUT, THROUGHPUT_UNITS, MEAN_LATENCY"

Topology:
[Host] ->linux bridge -> tap vhost-net ->[Guest]
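
For reference, the 100us budget is the per-queue busy-poll
timeout; with qemu it can be set via the tap netdev's poll-us
property (an illustrative invocation, not the exact command
line used for these tests):

	qemu-system-x86_64 ... \
		-netdev tap,id=net0,vhost=on,poll-us=100 \
		-device virtio-net-pci,netdev=net0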

TCP_STREAM:
* Without the patch:  19842.95 Mbps, 6.50 us mean latency
* With the patch:     37598.20 Mbps, 3.43 us mean latency

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
 drivers/vhost/net.c | 35 ++++++++++++++---------------------
 1 file changed, 14 insertions(+), 21 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index ac0b954..015abf3 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -547,34 +547,27 @@ static void vhost_net_busy_poll(struct vhost_net *net,
 	mutex_unlock(&vq->mutex);
 }
 
-
 static int vhost_net_tx_get_vq_desc(struct vhost_net *net,
-				    struct vhost_net_virtqueue *nvq,
+				    struct vhost_net_virtqueue *tnvq,
 				    unsigned int *out_num, unsigned int *in_num,
 				    struct msghdr *msghdr, bool *busyloop_intr)
 {
-	struct vhost_virtqueue *vq = &nvq->vq;
-	unsigned long uninitialized_var(endtime);
-	int r = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
+	struct vhost_net_virtqueue *rnvq = &net->vqs[VHOST_NET_VQ_RX];
+	struct vhost_virtqueue *rvq = &rnvq->vq;
+	struct vhost_virtqueue *tvq = &tnvq->vq;
+
+	int r = vhost_get_vq_desc(tvq, tvq->iov, ARRAY_SIZE(tvq->iov),
 				  out_num, in_num, NULL, NULL);
 
-	if (r == vq->num && vq->busyloop_timeout) {
+	if (r == tvq->num && tvq->busyloop_timeout) {
 		/* Flush batched packets first */
-		if (!vhost_sock_zcopy(vq->private_data))
-			vhost_tx_batch(net, nvq, vq->private_data, msghdr);
-		preempt_disable();
-		endtime = busy_clock() + vq->busyloop_timeout;
-		while (vhost_can_busy_poll(endtime)) {
-			if (vhost_has_work(vq->dev)) {
-				*busyloop_intr = true;
-				break;
-			}
-			if (!vhost_vq_avail_empty(vq->dev, vq))
-				break;
-			cpu_relax();
-		}
-		preempt_enable();
-		r = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
+		if (!vhost_sock_zcopy(tvq->private_data))
+			// vhost_net_signal_used(tnvq);
+			vhost_tx_batch(net, tnvq, tvq->private_data, msghdr);
+
+		vhost_net_busy_poll(net, rvq, tvq, busyloop_intr, false);
+
+		r = vhost_get_vq_desc(tvq, tvq->iov, ARRAY_SIZE(tvq->iov),
 				      out_num, in_num, NULL, NULL);
 	}
 
-- 
1.8.3.1


* Re: [REBASE PATCH net-next v9 0/4] net: vhost: improve performance when enable busyloop
  2018-09-25 12:36 [REBASE PATCH net-next v9 0/4] net: vhost: improve performance when enable busyloop xiangxia.m.yue
                   ` (3 preceding siblings ...)
  2018-09-25 12:36 ` [REBASE PATCH net-next v9 4/4] net: vhost: add rx busy polling in tx path xiangxia.m.yue
@ 2018-09-27  3:26 ` David Miller
  4 siblings, 0 replies; 9 messages in thread
From: David Miller @ 2018-09-27  3:26 UTC (permalink / raw)
  To: xiangxia.m.yue; +Cc: jasowang, mst, makita.toshiaki, netdev, virtualization

From: xiangxia.m.yue@gmail.com
Date: Tue, 25 Sep 2018 05:36:48 -0700

> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> 
> This patch series improves the guest receive performance.
> On the handle_tx side, we poll the sock receive queue
> at the same time; handle_rx does the same.
> 
> For the full performance report, see patch 4.

Series applied, thank you.


* Re: [REBASE PATCH net-next v9 1/4] net: vhost: lock the vqs one by one
  2018-09-25 12:36 ` xiangxia.m.yue
@ 2018-11-29 19:28   ` Jean-Philippe Brucker
  2018-11-30  2:34     ` Jason Wang
  0 siblings, 1 reply; 9 messages in thread
From: Jean-Philippe Brucker @ 2018-11-29 19:28 UTC (permalink / raw)
  To: xiangxia.m.yue, jasowang, mst, makita.toshiaki, davem
  Cc: netdev, virtualization

Hi,

On 25/09/2018 13:36, xiangxia.m.yue@gmail.com wrote:
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> 
> This patch changes the locking scheme: instead of
> locking all vqs at the same time, lock them one by
> one. This will be used by the next patch to avoid
> a deadlock.
> 
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> Acked-by: Jason Wang <jasowang@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/vhost/vhost.c | 24 +++++++-----------------
>  1 file changed, 7 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index b13c6b4..f52008b 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -294,8 +294,11 @@ static void vhost_vq_meta_reset(struct vhost_dev *d)
>  {
>  	int i;
>  
> -	for (i = 0; i < d->nvqs; ++i)
> +	for (i = 0; i < d->nvqs; ++i) {
> +		mutex_lock(&d->vqs[i]->mutex);
>  		__vhost_vq_meta_reset(d->vqs[i]);
> +		mutex_unlock(&d->vqs[i]->mutex);
> +	}
>  }
>  
>  static void vhost_vq_reset(struct vhost_dev *dev,
> @@ -891,20 +894,6 @@ static inline void __user *__vhost_get_user(struct vhost_virtqueue *vq,
>  #define vhost_get_used(vq, x, ptr) \
>  	vhost_get_user(vq, x, ptr, VHOST_ADDR_USED)
>  
> -static void vhost_dev_lock_vqs(struct vhost_dev *d)
> -{
> -	int i = 0;
> -	for (i = 0; i < d->nvqs; ++i)
> -		mutex_lock_nested(&d->vqs[i]->mutex, i);
> -}
> -
> -static void vhost_dev_unlock_vqs(struct vhost_dev *d)
> -{
> -	int i = 0;
> -	for (i = 0; i < d->nvqs; ++i)
> -		mutex_unlock(&d->vqs[i]->mutex);
> -}
> -
>  static int vhost_new_umem_range(struct vhost_umem *umem,
>  				u64 start, u64 size, u64 end,
>  				u64 userspace_addr, int perm)
> @@ -954,7 +943,10 @@ static void vhost_iotlb_notify_vq(struct vhost_dev *d,
>  		if (msg->iova <= vq_msg->iova &&
>  		    msg->iova + msg->size - 1 >= vq_msg->iova &&
>  		    vq_msg->type == VHOST_IOTLB_MISS) {
> +			mutex_lock(&node->vq->mutex);

This seems to introduce a deadlock (and sleep-in-atomic): the vq->mutex
is taken while the IOTLB spinlock is held (taken earlier in
vhost_iotlb_notify_vq()). On the vhost_iotlb_miss() path, the IOTLB
spinlock is taken while the vq->mutex is held.
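
A sketch of the inverted ordering (the iotlb_lock name is
assumed from drivers/vhost/vhost.c):

	iotlb msg path                    vhost_iotlb_miss() path
	--------------                    -----------------------
	spin_lock(&d->iotlb_lock);        mutex_lock(&vq->mutex);
	mutex_lock(&node->vq->mutex);     spin_lock(&d->iotlb_lock);
	/* may sleep under a spinlock,    /* spins, waiting for the
	   waiting on the right side */      left side */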

I'm not sure how to fix it. Given that we're holding dev->mutex, that
vq->poll only seems to be modified under dev->mutex, and assuming that
vhost_poll_queue(vq->poll) can be called concurrently, is it safe to
simply not take vq->mutex here?

Thanks,
Jean


>  			vhost_poll_queue(&node->vq->poll);
> +			mutex_unlock(&node->vq->mutex);
> +
>  			list_del(&node->node);
>  			kfree(node);
>  		}
> @@ -986,7 +978,6 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
>  	int ret = 0;
>  
>  	mutex_lock(&dev->mutex);
> -	vhost_dev_lock_vqs(dev);
>  	switch (msg->type) {
>  	case VHOST_IOTLB_UPDATE:
>  		if (!dev->iotlb) {
> @@ -1020,7 +1011,6 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
>  		break;
>  	}
>  
> -	vhost_dev_unlock_vqs(dev);
>  	mutex_unlock(&dev->mutex);
>  
>  	return ret;
> 


* Re: [REBASE PATCH net-next v9 1/4] net: vhost: lock the vqs one by one
  2018-11-29 19:28   ` Jean-Philippe Brucker
@ 2018-11-30  2:34     ` Jason Wang
  2018-11-30 10:28       ` Jean-Philippe Brucker
  0 siblings, 1 reply; 9 messages in thread
From: Jason Wang @ 2018-11-30  2:34 UTC (permalink / raw)
  To: Jean-Philippe Brucker, xiangxia.m.yue, mst, makita.toshiaki, davem
  Cc: netdev, virtualization


On 2018/11/30 3:28 AM, Jean-Philippe Brucker wrote:
> Hi,
>
> On 25/09/2018 13:36, xiangxia.m.yue@gmail.com wrote:
>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>
>> This patch changes the locking scheme: instead of
>> locking all vqs at the same time, lock them one by
>> one. This will be used by the next patch to avoid
>> a deadlock.
>>
>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>> Acked-by: Jason Wang <jasowang@redhat.com>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>   drivers/vhost/vhost.c | 24 +++++++-----------------
>>   1 file changed, 7 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> index b13c6b4..f52008b 100644
>> --- a/drivers/vhost/vhost.c
>> +++ b/drivers/vhost/vhost.c
>> @@ -294,8 +294,11 @@ static void vhost_vq_meta_reset(struct vhost_dev *d)
>>   {
>>   	int i;
>>   
>> -	for (i = 0; i < d->nvqs; ++i)
>> +	for (i = 0; i < d->nvqs; ++i) {
>> +		mutex_lock(&d->vqs[i]->mutex);
>>   		__vhost_vq_meta_reset(d->vqs[i]);
>> +		mutex_unlock(&d->vqs[i]->mutex);
>> +	}
>>   }
>>   
>>   static void vhost_vq_reset(struct vhost_dev *dev,
>> @@ -891,20 +894,6 @@ static inline void __user *__vhost_get_user(struct vhost_virtqueue *vq,
>>   #define vhost_get_used(vq, x, ptr) \
>>   	vhost_get_user(vq, x, ptr, VHOST_ADDR_USED)
>>   
>> -static void vhost_dev_lock_vqs(struct vhost_dev *d)
>> -{
>> -	int i = 0;
>> -	for (i = 0; i < d->nvqs; ++i)
>> -		mutex_lock_nested(&d->vqs[i]->mutex, i);
>> -}
>> -
>> -static void vhost_dev_unlock_vqs(struct vhost_dev *d)
>> -{
>> -	int i = 0;
>> -	for (i = 0; i < d->nvqs; ++i)
>> -		mutex_unlock(&d->vqs[i]->mutex);
>> -}
>> -
>>   static int vhost_new_umem_range(struct vhost_umem *umem,
>>   				u64 start, u64 size, u64 end,
>>   				u64 userspace_addr, int perm)
>> @@ -954,7 +943,10 @@ static void vhost_iotlb_notify_vq(struct vhost_dev *d,
>>   		if (msg->iova <= vq_msg->iova &&
>>   		    msg->iova + msg->size - 1 >= vq_msg->iova &&
>>   		    vq_msg->type == VHOST_IOTLB_MISS) {
>> +			mutex_lock(&node->vq->mutex);
> This seems to introduce a deadlock (and sleep-in-atomic): the vq->mutex
> is taken while the IOTLB spinlock is held (taken earlier in
> vhost_iotlb_notify_vq()). On the vhost_iotlb_miss() path, the IOTLB
> spinlock is taken while the vq->mutex is held.


Good catch.


> I'm not sure how to fix it. Given that we're holding dev->mutex, that
> vq->poll only seems to be modified under dev->mutex, and assuming that
> vhost_poll_queue(vq->poll) can be called concurrently, is it safe to
> simply not take vq->mutex here?


Yes, I think it can be removed here.
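
A minimal sketch of that fix (an assumption based on this
exchange, not necessarily the final patch):

	/* In vhost_iotlb_notify_vq(): queue the poll without
	 * taking vq->mutex. vq->poll is only modified under
	 * dev->mutex, which is already held on this path.
	 */
	vhost_poll_queue(&node->vq->poll);
	list_del(&node->node);
	kfree(node);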

Want to post a patch for this?

Thanks


> Thanks,
> Jean
>
>


* Re: [REBASE PATCH net-next v9 1/4] net: vhost: lock the vqs one by one
  2018-11-30  2:34     ` Jason Wang
@ 2018-11-30 10:28       ` Jean-Philippe Brucker
  0 siblings, 0 replies; 9 messages in thread
From: Jean-Philippe Brucker @ 2018-11-30 10:28 UTC (permalink / raw)
  To: Jason Wang, xiangxia.m.yue, mst, makita.toshiaki, davem
  Cc: netdev, virtualization

On 30/11/2018 02:34, Jason Wang wrote:
> 
> On 2018/11/30 3:28 AM, Jean-Philippe Brucker wrote:
>> Hi,
>>
>> On 25/09/2018 13:36, xiangxia.m.yue@gmail.com wrote:
>>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>>
>>> This patch changes the locking scheme: instead of
>>> locking all vqs at the same time, lock them one by
>>> one. This will be used by the next patch to avoid
>>> a deadlock.
>>>
>>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>> Acked-by: Jason Wang <jasowang@redhat.com>
>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>> ---
>>>   drivers/vhost/vhost.c | 24 +++++++-----------------
>>>   1 file changed, 7 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>>> index b13c6b4..f52008b 100644
>>> --- a/drivers/vhost/vhost.c
>>> +++ b/drivers/vhost/vhost.c
>>> @@ -294,8 +294,11 @@ static void vhost_vq_meta_reset(struct vhost_dev *d)
>>>   {
>>>   	int i;
>>>   
>>> -	for (i = 0; i < d->nvqs; ++i)
>>> +	for (i = 0; i < d->nvqs; ++i) {
>>> +		mutex_lock(&d->vqs[i]->mutex);
>>>   		__vhost_vq_meta_reset(d->vqs[i]);
>>> +		mutex_unlock(&d->vqs[i]->mutex);
>>> +	}
>>>   }
>>>   
>>>   static void vhost_vq_reset(struct vhost_dev *dev,
>>> @@ -891,20 +894,6 @@ static inline void __user *__vhost_get_user(struct vhost_virtqueue *vq,
>>>   #define vhost_get_used(vq, x, ptr) \
>>>   	vhost_get_user(vq, x, ptr, VHOST_ADDR_USED)
>>>   
>>> -static void vhost_dev_lock_vqs(struct vhost_dev *d)
>>> -{
>>> -	int i = 0;
>>> -	for (i = 0; i < d->nvqs; ++i)
>>> -		mutex_lock_nested(&d->vqs[i]->mutex, i);
>>> -}
>>> -
>>> -static void vhost_dev_unlock_vqs(struct vhost_dev *d)
>>> -{
>>> -	int i = 0;
>>> -	for (i = 0; i < d->nvqs; ++i)
>>> -		mutex_unlock(&d->vqs[i]->mutex);
>>> -}
>>> -
>>>   static int vhost_new_umem_range(struct vhost_umem *umem,
>>>   				u64 start, u64 size, u64 end,
>>>   				u64 userspace_addr, int perm)
>>> @@ -954,7 +943,10 @@ static void vhost_iotlb_notify_vq(struct vhost_dev *d,
>>>   		if (msg->iova <= vq_msg->iova &&
>>>   		    msg->iova + msg->size - 1 >= vq_msg->iova &&
>>>   		    vq_msg->type == VHOST_IOTLB_MISS) {
>>> +			mutex_lock(&node->vq->mutex);
>> This seems to introduce a deadlock (and sleep-in-atomic): the vq->mutex
>> is taken while the IOTLB spinlock is held (taken earlier in
>> vhost_iotlb_notify_vq()). On the vhost_iotlb_miss() path, the IOTLB
>> spinlock is taken while the vq->mutex is held.
> 
> 
> Good catch.
> 
> 
>> I'm not sure how to fix it. Given that we're holding dev->mutex, that
>> vq->poll only seems to be modified under dev->mutex, and assuming that
>> vhost_poll_queue(vq->poll) can be called concurrently, is it safe to
>> simply not take vq->mutex here?
> 
> 
> Yes, I think it can be removed here.
> 
> Want to post a patch for this?

Yes, I'll post it shortly

Thanks,
Jean


end of thread, other threads:[~2018-11-30 13:42 UTC | newest]

Thread overview: 9 messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-25 12:36 [REBASE PATCH net-next v9 0/4] net: vhost: improve performance when enable busyloop xiangxia.m.yue
2018-09-25 12:36 ` [REBASE PATCH net-next v9 1/4] net: vhost: lock the vqs one by one xiangxia.m.yue
2018-11-29 19:28   ` Jean-Philippe Brucker
2018-11-30  2:34     ` Jason Wang
2018-11-30 10:28       ` Jean-Philippe Brucker
2018-09-25 12:36 ` [REBASE PATCH net-next v9 2/4] net: vhost: replace magic number of lock annotation xiangxia.m.yue
2018-09-25 12:36 ` [REBASE PATCH net-next v9 3/4] net: vhost: factor out busy polling logic to vhost_net_busy_poll() xiangxia.m.yue
2018-09-25 12:36 ` [REBASE PATCH net-next v9 4/4] net: vhost: add rx busy polling in tx path xiangxia.m.yue
2018-09-27  3:26 ` [REBASE PATCH net-next v9 0/4] net: vhost: improve performance when enable busyloop David Miller
