* [PATCH net 0/2] fixes for vhost_net
@ 2020-12-15  1:48 wangyunjian
  2020-12-15  1:48 ` [PATCH net 1/2] vhost_net: fix ubuf refcount incorrectly when sendmsg fails wangyunjian
                   ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: wangyunjian @ 2020-12-15  1:48 UTC (permalink / raw)
  To: netdev, mst, jasowang, willemdebruijn.kernel
  Cc: virtualization, jerry.lilijun, chenchanghu, xudingke,
	brian.huangbin, Yunjian Wang

From: Yunjian Wang <wangyunjian@huawei.com>

This series includes two fix patches for vhost_net.

Yunjian Wang (2):
  vhost_net: fix ubuf refcount incorrectly when sendmsg fails
  vhost_net: fix high cpu load when sendmsg fails

 drivers/vhost/net.c | 27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

-- 
2.23.0


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH net 1/2] vhost_net: fix ubuf refcount incorrectly when sendmsg fails
  2020-12-15  1:48 [PATCH net 0/2] fixes for vhost_net wangyunjian
@ 2020-12-15  1:48 ` wangyunjian
  2020-12-15  2:45     ` Willem de Bruijn
  2020-12-15  1:48 ` [PATCH net 2/2] vhost_net: fix high cpu load " wangyunjian
  2020-12-16  8:20 ` [PATCH net v2 0/2] fixes for vhost_net wangyunjian
  2 siblings, 1 reply; 38+ messages in thread
From: wangyunjian @ 2020-12-15  1:48 UTC (permalink / raw)
  To: netdev, mst, jasowang, willemdebruijn.kernel
  Cc: virtualization, jerry.lilijun, chenchanghu, xudingke,
	brian.huangbin, Yunjian Wang

From: Yunjian Wang <wangyunjian@huawei.com>

Currently vhost_zerocopy_callback() may be called to decrease the
refcount when sendmsg fails in tun. The error handling in vhost
handle_tx_zerocopy() will then try to decrease the same refcount
again, which is wrong. To fix this issue, we only call
vhost_net_ubuf_put() when
vq->heads[nvq->desc].len == VHOST_DMA_IN_PROGRESS.
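
For reference, the callback side looks roughly like this (an abridged
sketch of vhost_zerocopy_callback() in drivers/vhost/net.c; locking and
the polling-thread wakeup are omitted, and details vary by kernel
version). Because the callback overwrites the len field before dropping
its reference, testing for VHOST_DMA_IN_PROGRESS tells the error path
whether the callback has already run:

	static void vhost_zerocopy_callback(struct ubuf_info *ubuf, bool success)
	{
		struct vhost_net_ubuf_ref *ubufs = ubuf->ctx;
		struct vhost_virtqueue *vq = ubufs->vq;

		/* Mark this descriptor's buffers as done with DMA: len is
		 * no longer VHOST_DMA_IN_PROGRESS once the callback has run.
		 */
		vq->heads[ubuf->desc].len = success ?
			VHOST_DMA_DONE_LEN : VHOST_DMA_FAILED_LEN;

		/* Drop the reference that the sendmsg error path must not
		 * drop a second time.
		 */
		vhost_net_ubuf_put(ubufs);
	}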

Fixes: 4477138fa0ae ("tun: properly test for IFF_UP")
Fixes: 90e33d459407 ("tun: enable napi_gro_frags() for TUN/TAP driver")

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
---
 drivers/vhost/net.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 531a00d703cd..c8784dfafdd7 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -863,6 +863,7 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
 	size_t len, total_len = 0;
 	int err;
 	struct vhost_net_ubuf_ref *ubufs;
+	struct ubuf_info *ubuf;
 	bool zcopy_used;
 	int sent_pkts = 0;
 
@@ -895,9 +896,7 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
 
 		/* use msg_control to pass vhost zerocopy ubuf info to skb */
 		if (zcopy_used) {
-			struct ubuf_info *ubuf;
 			ubuf = nvq->ubuf_info + nvq->upend_idx;
-
 			vq->heads[nvq->upend_idx].id = cpu_to_vhost32(vq, head);
 			vq->heads[nvq->upend_idx].len = VHOST_DMA_IN_PROGRESS;
 			ubuf->callback = vhost_zerocopy_callback;
@@ -927,7 +926,8 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
 		err = sock->ops->sendmsg(sock, &msg, len);
 		if (unlikely(err < 0)) {
 			if (zcopy_used) {
-				vhost_net_ubuf_put(ubufs);
+				if (vq->heads[ubuf->desc].len == VHOST_DMA_IN_PROGRESS)
+					vhost_net_ubuf_put(ubufs);
 				nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
 					% UIO_MAXIOV;
 			}
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH net 2/2] vhost_net: fix high cpu load when sendmsg fails
  2020-12-15  1:48 [PATCH net 0/2] fixes for vhost_net wangyunjian
  2020-12-15  1:48 ` [PATCH net 1/2] vhost_net: fix ubuf refcount incorrectly when sendmsg fails wangyunjian
@ 2020-12-15  1:48 ` wangyunjian
  2020-12-15  4:09     ` Jason Wang
  2020-12-16  8:20 ` [PATCH net v2 0/2] fixes for vhost_net wangyunjian
  2 siblings, 1 reply; 38+ messages in thread
From: wangyunjian @ 2020-12-15  1:48 UTC (permalink / raw)
  To: netdev, mst, jasowang, willemdebruijn.kernel
  Cc: virtualization, jerry.lilijun, chenchanghu, xudingke,
	brian.huangbin, Yunjian Wang

From: Yunjian Wang <wangyunjian@huawei.com>

Currently we break the loop and wake up the vhost_worker when
sendmsg fails. When the worker wakes up again, we'll meet the
same error, causing high CPU load. To fix this issue, we can
skip this descriptor by ignoring the error. When we exceed
sndbuf, the return value of sendmsg is -EAGAIN; in that case
we don't skip the descriptor and don't drop the packet.
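
For context, -EAGAIN comes from the tap side when the socket send
buffer is full. A rough sketch (abridged from tun_alloc_skb() in
drivers/net/tun.c; exact signatures vary by kernel version):

	static struct sk_buff *tun_alloc_skb(struct tun_file *tfile,
					     size_t prepad, size_t len,
					     size_t linear, int noblock)
	{
		struct sock *sk = tfile->socket.sk;
		struct sk_buff *skb;
		int err;

		/* Fails with err = -EAGAIN when the socket is non-blocking
		 * and sk_wmem_alloc would exceed sk_sndbuf, i.e. the
		 * transient "sndbuf full" case that this patch retries.
		 */
		skb = sock_alloc_send_pskb(sk, prepad + len, linear, noblock,
					   &err, 0);
		if (!skb)
			return ERR_PTR(err);

		/* ... skb layout setup omitted ... */
		return skb;
	}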

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
---
 drivers/vhost/net.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index c8784dfafdd7..f966592d8900 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -827,16 +827,13 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
 				msg.msg_flags &= ~MSG_MORE;
 		}
 
-		/* TODO: Check specific error and bomb out unless ENOBUFS? */
 		err = sock->ops->sendmsg(sock, &msg, len);
-		if (unlikely(err < 0)) {
+		if (unlikely(err == -EAGAIN)) {
 			vhost_discard_vq_desc(vq, 1);
 			vhost_net_enable_vq(net, vq);
 			break;
-		}
-		if (err != len)
-			pr_debug("Truncated TX packet: len %d != %zd\n",
-				 err, len);
+		} else if (unlikely(err < 0 || err != len))
+			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n", err, len);
 done:
 		vq->heads[nvq->done_idx].id = cpu_to_vhost32(vq, head);
 		vq->heads[nvq->done_idx].len = 0;
@@ -922,7 +919,6 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
 			msg.msg_flags &= ~MSG_MORE;
 		}
 
-		/* TODO: Check specific error and bomb out unless ENOBUFS? */
 		err = sock->ops->sendmsg(sock, &msg, len);
 		if (unlikely(err < 0)) {
 			if (zcopy_used) {
@@ -931,13 +927,14 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
 				nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
 					% UIO_MAXIOV;
 			}
-			vhost_discard_vq_desc(vq, 1);
-			vhost_net_enable_vq(net, vq);
-			break;
+			if (err == -EAGAIN) {
+				vhost_discard_vq_desc(vq, 1);
+				vhost_net_enable_vq(net, vq);
+				break;
+			}
 		}
 		if (err != len)
-			pr_debug("Truncated TX packet: "
-				 " len %d != %zd\n", err, len);
+			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n", err, len);
 		if (!zcopy_used)
 			vhost_add_used_and_signal(&net->dev, vq, head, 0);
 		else
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH net 1/2] vhost_net: fix ubuf refcount incorrectly when sendmsg fails
  2020-12-15  1:48 ` [PATCH net 1/2] vhost_net: fix ubuf refcount incorrectly when sendmsg fails wangyunjian
@ 2020-12-15  2:45     ` Willem de Bruijn
  0 siblings, 0 replies; 38+ messages in thread
From: Willem de Bruijn @ 2020-12-15  2:45 UTC (permalink / raw)
  To: wangyunjian
  Cc: Network Development, Michael S. Tsirkin, Jason Wang,
	Willem de Bruijn, virtualization, Lilijun (Jerry),
	chenchanghu, xudingke, huangbin (J)

On Mon, Dec 14, 2020 at 8:59 PM wangyunjian <wangyunjian@huawei.com> wrote:
>
> From: Yunjian Wang <wangyunjian@huawei.com>
>
> Currently vhost_zerocopy_callback() may be called to decrease the
> refcount when sendmsg fails in tun. The error handling in vhost
> handle_tx_zerocopy() will then try to decrease the same refcount
> again, which is wrong. To fix this issue, we only call
> vhost_net_ubuf_put() when
> vq->heads[nvq->desc].len == VHOST_DMA_IN_PROGRESS.
>
> Fixes: 4477138fa0ae ("tun: properly test for IFF_UP")
> Fixes: 90e33d459407 ("tun: enable napi_gro_frags() for TUN/TAP driver")
>
> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>

Patch looks good to me. Thanks.

But I think the right Fixes tag would be

Fixes: 0690899b4d45 ("tun: experimental zero copy tx support")

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net 2/2] vhost_net: fix high cpu load when sendmsg fails
  2020-12-15  1:48 ` [PATCH net 2/2] vhost_net: fix high cpu load " wangyunjian
@ 2020-12-15  4:09     ` Jason Wang
  0 siblings, 0 replies; 38+ messages in thread
From: Jason Wang @ 2020-12-15  4:09 UTC (permalink / raw)
  To: wangyunjian, netdev, mst, willemdebruijn.kernel
  Cc: virtualization, jerry.lilijun, chenchanghu, xudingke, brian.huangbin


On 2020/12/15 9:48 AM, wangyunjian wrote:
> From: Yunjian Wang <wangyunjian@huawei.com>
>
> Currently we break the loop and wake up the vhost_worker when
> sendmsg fails. When the worker wakes up again, we'll meet the
> same error, causing high CPU load. To fix this issue, we can
> skip this descriptor by ignoring the error. When we exceed
> sndbuf, the return value of sendmsg is -EAGAIN; in that case
> we don't skip the descriptor and don't drop the packet.
>
> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> ---
>   drivers/vhost/net.c | 21 +++++++++------------
>   1 file changed, 9 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index c8784dfafdd7..f966592d8900 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -827,16 +827,13 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
>   				msg.msg_flags &= ~MSG_MORE;
>   		}
>   
> -		/* TODO: Check specific error and bomb out unless ENOBUFS? */
>   		err = sock->ops->sendmsg(sock, &msg, len);
> -		if (unlikely(err < 0)) {
> +		if (unlikely(err == -EAGAIN)) {
>   			vhost_discard_vq_desc(vq, 1);
>   			vhost_net_enable_vq(net, vq);
>   			break;
> -		}


As I've pointed out in the last version: if you don't discard the
descriptor, you probably need to add the head to the used ring.
Otherwise this descriptor will always be in flight, which may confuse
drivers.


> -		if (err != len)
> -			pr_debug("Truncated TX packet: len %d != %zd\n",
> -				 err, len);
> +		} else if (unlikely(err < 0 || err != len))


It looks to me err != len covers err < 0.

Thanks


> +			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n", err, len);
>   done:
>   		vq->heads[nvq->done_idx].id = cpu_to_vhost32(vq, head);
>   		vq->heads[nvq->done_idx].len = 0;
> @@ -922,7 +919,6 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
>   			msg.msg_flags &= ~MSG_MORE;
>   		}
>   
> -		/* TODO: Check specific error and bomb out unless ENOBUFS? */
>   		err = sock->ops->sendmsg(sock, &msg, len);
>   		if (unlikely(err < 0)) {
>   			if (zcopy_used) {
> @@ -931,13 +927,14 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
>   				nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
>   					% UIO_MAXIOV;
>   			}
> -			vhost_discard_vq_desc(vq, 1);
> -			vhost_net_enable_vq(net, vq);
> -			break;
> +			if (err == -EAGAIN) {
> +				vhost_discard_vq_desc(vq, 1);
> +				vhost_net_enable_vq(net, vq);
> +				break;
> +			}
>   		}
>   		if (err != len)
> -			pr_debug("Truncated TX packet: "
> -				 " len %d != %zd\n", err, len);
> +			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n", err, len);
>   		if (!zcopy_used)
>   			vhost_add_used_and_signal(&net->dev, vq, head, 0);
>   		else


^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH net 1/2] vhost_net: fix ubuf refcount incorrectly when sendmsg fails
  2020-12-15  2:45     ` Willem de Bruijn
  (?)
@ 2020-12-15  7:51     ` wangyunjian
  -1 siblings, 0 replies; 38+ messages in thread
From: wangyunjian @ 2020-12-15  7:51 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, Michael S. Tsirkin, Jason Wang,
	virtualization, Lilijun (Jerry),
	chenchanghu, xudingke, huangbin (J)

> -----Original Message-----
> From: Willem de Bruijn [mailto:willemdebruijn.kernel@gmail.com]
> Sent: Tuesday, December 15, 2020 10:46 AM
> To: wangyunjian <wangyunjian@huawei.com>
> Cc: Network Development <netdev@vger.kernel.org>; Michael S. Tsirkin
> <mst@redhat.com>; Jason Wang <jasowang@redhat.com>; Willem de Bruijn
> <willemdebruijn.kernel@gmail.com>; virtualization@lists.linux-foundation.org;
> Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu
> <chenchanghu@huawei.com>; xudingke <xudingke@huawei.com>; huangbin (J)
> <brian.huangbin@huawei.com>
> Subject: Re: [PATCH net 1/2] vhost_net: fix ubuf refcount incorrectly when
> sendmsg fails
> 
> On Mon, Dec 14, 2020 at 8:59 PM wangyunjian <wangyunjian@huawei.com>
> wrote:
> >
> > From: Yunjian Wang <wangyunjian@huawei.com>
> >
> > Currently vhost_zerocopy_callback() may be called to decrease the
> > refcount when sendmsg fails in tun. The error handling in vhost
> > handle_tx_zerocopy() will then try to decrease the same refcount
> > again, which is wrong. To fix this issue, we only call
> > vhost_net_ubuf_put() when
> > vq->heads[nvq->desc].len == VHOST_DMA_IN_PROGRESS.
> >
> > Fixes: 4477138fa0ae ("tun: properly test for IFF_UP")
> > Fixes: 90e33d459407 ("tun: enable napi_gro_frags() for TUN/TAP
> > driver")
> >
> > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> 
> Patch looks good to me. Thanks.
> 
> But I think the right Fixes tag would be
> 
> Fixes: 0690899b4d45 ("tun: experimental zero copy tx support")

OK, thanks for the suggestion. I will fix it in the next version.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH net 2/2] vhost_net: fix high cpu load when sendmsg fails
  2020-12-15  4:09     ` Jason Wang
  (?)
@ 2020-12-15  8:03     ` wangyunjian
  2020-12-16  5:56         ` Jason Wang
  -1 siblings, 1 reply; 38+ messages in thread
From: wangyunjian @ 2020-12-15  8:03 UTC (permalink / raw)
  To: Jason Wang, netdev, mst, willemdebruijn.kernel
  Cc: virtualization, Lilijun (Jerry), chenchanghu, xudingke, huangbin (J)



> -----Original Message-----
> From: Jason Wang [mailto:jasowang@redhat.com]
> Sent: Tuesday, December 15, 2020 12:10 PM
> To: wangyunjian <wangyunjian@huawei.com>; netdev@vger.kernel.org;
> mst@redhat.com; willemdebruijn.kernel@gmail.com
> Cc: virtualization@lists.linux-foundation.org; Lilijun (Jerry)
> <jerry.lilijun@huawei.com>; chenchanghu <chenchanghu@huawei.com>;
> xudingke <xudingke@huawei.com>; huangbin (J)
> <brian.huangbin@huawei.com>
> Subject: Re: [PATCH net 2/2] vhost_net: fix high cpu load when sendmsg fails
> 
> 
> On 2020/12/15 9:48 AM, wangyunjian wrote:
> > From: Yunjian Wang <wangyunjian@huawei.com>
> >
> > Currently we break the loop and wake up the vhost_worker when
> > sendmsg fails. When the worker wakes up again, we'll meet the same
> > error, causing high CPU load. To fix this issue, we can skip this
> > descriptor by ignoring the error. When we exceed sndbuf, the return
> > value of sendmsg is -EAGAIN; in that case we don't skip the
> > descriptor and don't drop the packet.
> >
> > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> > ---
> >   drivers/vhost/net.c | 21 +++++++++------------
> >   1 file changed, 9 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index
> > c8784dfafdd7..f966592d8900 100644
> > --- a/drivers/vhost/net.c
> > +++ b/drivers/vhost/net.c
> > @@ -827,16 +827,13 @@ static void handle_tx_copy(struct vhost_net *net,
> struct socket *sock)
> >   				msg.msg_flags &= ~MSG_MORE;
> >   		}
> >
> > -		/* TODO: Check specific error and bomb out unless ENOBUFS? */
> >   		err = sock->ops->sendmsg(sock, &msg, len);
> > -		if (unlikely(err < 0)) {
> > +		if (unlikely(err == -EAGAIN)) {
> >   			vhost_discard_vq_desc(vq, 1);
> >   			vhost_net_enable_vq(net, vq);
> >   			break;
> > -		}
> 
> 
> As I've pointed out in the last version: if you don't discard the
> descriptor, you probably need to add the head to the used ring.
> Otherwise this descriptor will always be in flight, which may confuse
> drivers.

Sorry for missing the comment.

After removing the descriptor discard and the break, the subsequent
processing is the same as when sendmsg() succeeds:
vhost_zerocopy_signal_used() or vhost_add_used_and_signal() will be
called to add the head to the used ring.
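
To illustrate, the zerocopy heads reach the used ring via
vhost_zerocopy_signal_used(), roughly as follows (an abridged sketch;
details vary by kernel version):

	static void vhost_zerocopy_signal_used(struct vhost_net *net,
					       struct vhost_virtqueue *vq)
	{
		struct vhost_net_virtqueue *nvq =
			container_of(vq, struct vhost_net_virtqueue, vq);
		int i, add, j = 0;

		/* Walk the completed heads in order, stopping at the
		 * first one that is still in flight.
		 */
		for (i = nvq->done_idx; i != nvq->upend_idx;
		     i = (i + 1) % UIO_MAXIOV) {
			if (vq->heads[i].len == VHOST_DMA_FAILED_LEN)
				vhost_net_tx_err(net);
			if (VHOST_DMA_IS_DONE(vq->heads[i].len)) {
				vq->heads[i].len = VHOST_DMA_CLEAR_LEN;
				++j;
			} else
				break;
		}

		/* Add the completed heads to the used ring and signal
		 * the guest.
		 */
		while (j) {
			add = min(UIO_MAXIOV - nvq->done_idx, j);
			vhost_add_used_and_signal_n(&net->dev, vq,
						    &vq->heads[nvq->done_idx],
						    add);
			nvq->done_idx = (nvq->done_idx + add) % UIO_MAXIOV;
			j -= add;
		}
	}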

Thanks
> 
> 
> > -		if (err != len)
> > -			pr_debug("Truncated TX packet: len %d != %zd\n",
> > -				 err, len);
> > +		} else if (unlikely(err < 0 || err != len))
> 
> 
> It looks to me err != len covers err < 0.

OK

> 
> Thanks
> 
> 
> > +			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n", err,
> > +len);
> >   done:
> >   		vq->heads[nvq->done_idx].id = cpu_to_vhost32(vq, head);
> >   		vq->heads[nvq->done_idx].len = 0;
> > @@ -922,7 +919,6 @@ static void handle_tx_zerocopy(struct vhost_net
> *net, struct socket *sock)
> >   			msg.msg_flags &= ~MSG_MORE;
> >   		}
> >
> > -		/* TODO: Check specific error and bomb out unless ENOBUFS? */
> >   		err = sock->ops->sendmsg(sock, &msg, len);
> >   		if (unlikely(err < 0)) {
> >   			if (zcopy_used) {
> > @@ -931,13 +927,14 @@ static void handle_tx_zerocopy(struct vhost_net
> *net, struct socket *sock)
> >   				nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
> >   					% UIO_MAXIOV;
> >   			}
> > -			vhost_discard_vq_desc(vq, 1);
> > -			vhost_net_enable_vq(net, vq);
> > -			break;
> > +			if (err == -EAGAIN) {
> > +				vhost_discard_vq_desc(vq, 1);
> > +				vhost_net_enable_vq(net, vq);
> > +				break;
> > +			}
> >   		}
> >   		if (err != len)
> > -			pr_debug("Truncated TX packet: "
> > -				 " len %d != %zd\n", err, len);
> > +			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n", err,
> > +len);
> >   		if (!zcopy_used)
> >   			vhost_add_used_and_signal(&net->dev, vq, head, 0);
> >   		else


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net 2/2] vhost_net: fix high cpu load when sendmsg fails
  2020-12-15  8:03     ` wangyunjian
@ 2020-12-16  5:56         ` Jason Wang
  0 siblings, 0 replies; 38+ messages in thread
From: Jason Wang @ 2020-12-16  5:56 UTC (permalink / raw)
  To: wangyunjian
  Cc: netdev, mst, willemdebruijn kernel, virtualization,
	Lilijun (Jerry), chenchanghu, xudingke, huangbin (J)



----- Original Message -----
> 
> 
> > -----Original Message-----
> > From: Jason Wang [mailto:jasowang@redhat.com]
> > Sent: Tuesday, December 15, 2020 12:10 PM
> > To: wangyunjian <wangyunjian@huawei.com>; netdev@vger.kernel.org;
> > mst@redhat.com; willemdebruijn.kernel@gmail.com
> > Cc: virtualization@lists.linux-foundation.org; Lilijun (Jerry)
> > <jerry.lilijun@huawei.com>; chenchanghu <chenchanghu@huawei.com>;
> > xudingke <xudingke@huawei.com>; huangbin (J)
> > <brian.huangbin@huawei.com>
> > Subject: Re: [PATCH net 2/2] vhost_net: fix high cpu load when sendmsg
> > fails
> > 
> > 
> > On 2020/12/15 9:48 AM, wangyunjian wrote:
> > > From: Yunjian Wang <wangyunjian@huawei.com>
> > >
> > > Currently we break the loop and wake up the vhost_worker when
> > > sendmsg fails. When the worker wakes up again, we'll meet the same
> > > error, causing high CPU load. To fix this issue, we can skip this
> > > descriptor by ignoring the error. When we exceed sndbuf, the return
> > > value of sendmsg is -EAGAIN; in that case we don't skip the
> > > descriptor and don't drop the packet.
> > >
> > > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> > > ---
> > >   drivers/vhost/net.c | 21 +++++++++------------
> > >   1 file changed, 9 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index
> > > c8784dfafdd7..f966592d8900 100644
> > > --- a/drivers/vhost/net.c
> > > +++ b/drivers/vhost/net.c
> > > @@ -827,16 +827,13 @@ static void handle_tx_copy(struct vhost_net *net,
> > struct socket *sock)
> > >   				msg.msg_flags &= ~MSG_MORE;
> > >   		}
> > >
> > > -		/* TODO: Check specific error and bomb out unless ENOBUFS? */
> > >   		err = sock->ops->sendmsg(sock, &msg, len);
> > > -		if (unlikely(err < 0)) {
> > > +		if (unlikely(err == -EAGAIN)) {
> > >   			vhost_discard_vq_desc(vq, 1);
> > >   			vhost_net_enable_vq(net, vq);
> > >   			break;
> > > -		}
> > 
> > 
> > As I've pointed out in the last version: if you don't discard the
> > descriptor, you probably need to add the head to the used ring.
> > Otherwise this descriptor will always be in flight, which may confuse
> > drivers.
> 
> Sorry for missing the comment.
> 
> After removing the descriptor discard and the break, the subsequent
> processing is the same as when sendmsg() succeeds:
> vhost_zerocopy_signal_used() or vhost_add_used_and_signal() will be
> called to add the head to the used ring.

It's the next head, not the one that contains the buggy packet?

Thanks

> 
> Thanks
> > 
> > 
> > > -		if (err != len)
> > > -			pr_debug("Truncated TX packet: len %d != %zd\n",
> > > -				 err, len);
> > > +		} else if (unlikely(err < 0 || err != len))
> > 
> > 
> > It looks to me err != len covers err < 0.
> 
> OK
> 
> > 
> > Thanks
> > 
> > 
> > > +			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n", err,
> > > +len);
> > >   done:
> > >   		vq->heads[nvq->done_idx].id = cpu_to_vhost32(vq, head);
> > >   		vq->heads[nvq->done_idx].len = 0;
> > > @@ -922,7 +919,6 @@ static void handle_tx_zerocopy(struct vhost_net
> > *net, struct socket *sock)
> > >   			msg.msg_flags &= ~MSG_MORE;
> > >   		}
> > >
> > > -		/* TODO: Check specific error and bomb out unless ENOBUFS? */
> > >   		err = sock->ops->sendmsg(sock, &msg, len);
> > >   		if (unlikely(err < 0)) {
> > >   			if (zcopy_used) {
> > > @@ -931,13 +927,14 @@ static void handle_tx_zerocopy(struct vhost_net
> > *net, struct socket *sock)
> > >   				nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
> > >   					% UIO_MAXIOV;
> > >   			}
> > > -			vhost_discard_vq_desc(vq, 1);
> > > -			vhost_net_enable_vq(net, vq);
> > > -			break;
> > > +			if (err == -EAGAIN) {
> > > +				vhost_discard_vq_desc(vq, 1);
> > > +				vhost_net_enable_vq(net, vq);
> > > +				break;
> > > +			}
> > >   		}
> > >   		if (err != len)
> > > -			pr_debug("Truncated TX packet: "
> > > -				 " len %d != %zd\n", err, len);
> > > +			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n", err,
> > > +len);
> > >   		if (!zcopy_used)
> > >   			vhost_add_used_and_signal(&net->dev, vq, head, 0);
> > >   		else
> 
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH net 2/2] vhost_net: fix high cpu load when sendmsg fails
  2020-12-16  5:56         ` Jason Wang
  (?)
@ 2020-12-16  7:43         ` wangyunjian
  2020-12-16  7:47             ` Jason Wang
  -1 siblings, 1 reply; 38+ messages in thread
From: wangyunjian @ 2020-12-16  7:43 UTC (permalink / raw)
  To: Jason Wang
  Cc: netdev, mst, willemdebruijn kernel, virtualization,
	Lilijun (Jerry), chenchanghu, xudingke, huangbin (J)

> -----Original Message-----
> From: Jason Wang [mailto:jasowang@redhat.com]
> Sent: Wednesday, December 16, 2020 1:57 PM
> To: wangyunjian <wangyunjian@huawei.com>
> Cc: netdev@vger.kernel.org; mst@redhat.com; willemdebruijn kernel
> <willemdebruijn.kernel@gmail.com>; virtualization@lists.linux-foundation.org;
> Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu
> <chenchanghu@huawei.com>; xudingke <xudingke@huawei.com>; huangbin (J)
> <brian.huangbin@huawei.com>
> Subject: Re: [PATCH net 2/2] vhost_net: fix high cpu load when sendmsg fails
> 
> 
> 
> ----- Original Message -----
> >
> >
> > > -----Original Message-----
> > > From: Jason Wang [mailto:jasowang@redhat.com]
> > > Sent: Tuesday, December 15, 2020 12:10 PM
> > > To: wangyunjian <wangyunjian@huawei.com>; netdev@vger.kernel.org;
> > > mst@redhat.com; willemdebruijn.kernel@gmail.com
> > > Cc: virtualization@lists.linux-foundation.org; Lilijun (Jerry)
> > > <jerry.lilijun@huawei.com>; chenchanghu <chenchanghu@huawei.com>;
> > > xudingke <xudingke@huawei.com>; huangbin (J)
> > > <brian.huangbin@huawei.com>
> > > Subject: Re: [PATCH net 2/2] vhost_net: fix high cpu load when sendmsg
> > > fails
> > >
> > >
> > > On 2020/12/15 9:48 AM, wangyunjian wrote:
> > > > From: Yunjian Wang <wangyunjian@huawei.com>
> > > >
> > > > Currently we break the loop and wake up the vhost_worker when
> > > > sendmsg fails. When the worker wakes up again, we'll meet the same
> > > > error, causing high CPU load. To fix this issue, we can skip this
> > > > descriptor by ignoring the error. When we exceed sndbuf, the return
> > > > value of sendmsg is -EAGAIN; in that case we don't skip the
> > > > descriptor and don't drop the packet.
> > > >
> > > > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> > > > ---
> > > >   drivers/vhost/net.c | 21 +++++++++------------
> > > >   1 file changed, 9 insertions(+), 12 deletions(-)
> > > >
> > > > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index
> > > > c8784dfafdd7..f966592d8900 100644
> > > > --- a/drivers/vhost/net.c
> > > > +++ b/drivers/vhost/net.c
> > > > @@ -827,16 +827,13 @@ static void handle_tx_copy(struct vhost_net
> *net,
> > > struct socket *sock)
> > > >   				msg.msg_flags &= ~MSG_MORE;
> > > >   		}
> > > >
> > > > -		/* TODO: Check specific error and bomb out unless ENOBUFS?
> */
> > > >   		err = sock->ops->sendmsg(sock, &msg, len);
> > > > -		if (unlikely(err < 0)) {
> > > > +		if (unlikely(err == -EAGAIN)) {
> > > >   			vhost_discard_vq_desc(vq, 1);
> > > >   			vhost_net_enable_vq(net, vq);
> > > >   			break;
> > > > -		}
> > >
> > >
> > > As I've pointed out in the last version: if you don't discard the
> > > descriptor, you probably need to add the head to the used ring.
> > > Otherwise this descriptor will always be in flight, which may
> > > confuse drivers.
> >
> > Sorry for missing the comment.
> >
> > After removing the descriptor discard and the break, the subsequent
> > processing is the same as when sendmsg() succeeds:
> > vhost_zerocopy_signal_used() or vhost_add_used_and_signal() will be
> > called to add the head to the used ring.
> 
> It's the next head, not the one that contains the buggy packet?

In the modified code logic, the head added to the used ring is exactly
the one that contains the buggy packet.
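
In other words, for the copy path the control flow after patch 2 looks
roughly like this (a sketch of handle_tx_copy(), not the literal diff):

	err = sock->ops->sendmsg(sock, &msg, len);
	if (unlikely(err == -EAGAIN)) {
		/* transient: requeue this head and retry later */
		vhost_discard_vq_desc(vq, 1);
		vhost_net_enable_vq(net, vq);
		break;
	} else if (unlikely(err != len))
		/* persistent error: log it and fall through, so 'head',
		 * the very packet that failed, is recorded as used */
		vq_err(vq, ...);
done:
	vq->heads[nvq->done_idx].id = cpu_to_vhost32(vq, head);
	vq->heads[nvq->done_idx].len = 0;
	++nvq->done_idx;	/* flushed to the used ring by vhost_tx_batch() */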

Thanks

> 
> Thanks
> 
> >
> > Thanks
> > >
> > >
> > > > -		if (err != len)
> > > > -			pr_debug("Truncated TX packet: len %d != %zd\n",
> > > > -				 err, len);
> > > > +		} else if (unlikely(err < 0 || err != len))
> > >
> > >
> > > It looks to me err != len covers err < 0.
> >
> > OK
> >
> > >
> > > Thanks
> > >
> > >
> > > > +			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n",
> err,
> > > > +len);
> > > >   done:
> > > >   		vq->heads[nvq->done_idx].id = cpu_to_vhost32(vq, head);
> > > >   		vq->heads[nvq->done_idx].len = 0;
> > > > @@ -922,7 +919,6 @@ static void handle_tx_zerocopy(struct vhost_net
> > > *net, struct socket *sock)
> > > >   			msg.msg_flags &= ~MSG_MORE;
> > > >   		}
> > > >
> > > > -		/* TODO: Check specific error and bomb out unless ENOBUFS?
> */
> > > >   		err = sock->ops->sendmsg(sock, &msg, len);
> > > >   		if (unlikely(err < 0)) {
> > > >   			if (zcopy_used) {
> > > > @@ -931,13 +927,14 @@ static void handle_tx_zerocopy(struct
> vhost_net
> > > *net, struct socket *sock)
> > > >   				nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
> > > >   					% UIO_MAXIOV;
> > > >   			}
> > > > -			vhost_discard_vq_desc(vq, 1);
> > > > -			vhost_net_enable_vq(net, vq);
> > > > -			break;
> > > > +			if (err == -EAGAIN) {
> > > > +				vhost_discard_vq_desc(vq, 1);
> > > > +				vhost_net_enable_vq(net, vq);
> > > > +				break;
> > > > +			}
> > > >   		}
> > > >   		if (err != len)
> > > > -			pr_debug("Truncated TX packet: "
> > > > -				 " len %d != %zd\n", err, len);
> > > > +			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n",
> err,
> > > > +len);
> > > >   		if (!zcopy_used)
> > > >   			vhost_add_used_and_signal(&net->dev, vq, head, 0);
> > > >   		else
> >
> >


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net 2/2] vhost_net: fix high cpu load when sendmsg fails
  2020-12-16  7:43         ` wangyunjian
@ 2020-12-16  7:47             ` Jason Wang
  0 siblings, 0 replies; 38+ messages in thread
From: Jason Wang @ 2020-12-16  7:47 UTC (permalink / raw)
  To: wangyunjian
  Cc: netdev, mst, willemdebruijn kernel, virtualization,
	Lilijun (Jerry), chenchanghu, xudingke, huangbin (J)



----- Original Message -----
> > -----Original Message-----
> > From: Jason Wang [mailto:jasowang@redhat.com]
> > Sent: Wednesday, December 16, 2020 1:57 PM
> > To: wangyunjian <wangyunjian@huawei.com>
> > Cc: netdev@vger.kernel.org; mst@redhat.com; willemdebruijn kernel
> > <willemdebruijn.kernel@gmail.com>;
> > virtualization@lists.linux-foundation.org;
> > Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu
> > <chenchanghu@huawei.com>; xudingke <xudingke@huawei.com>; huangbin (J)
> > <brian.huangbin@huawei.com>
> > Subject: Re: [PATCH net 2/2] vhost_net: fix high cpu load when sendmsg
> > fails
> > 
> > 
> > 
> > ----- Original Message -----
> > >
> > >
> > > > -----Original Message-----
> > > > From: Jason Wang [mailto:jasowang@redhat.com]
> > > > Sent: Tuesday, December 15, 2020 12:10 PM
> > > > To: wangyunjian <wangyunjian@huawei.com>; netdev@vger.kernel.org;
> > > > mst@redhat.com; willemdebruijn.kernel@gmail.com
> > > > Cc: virtualization@lists.linux-foundation.org; Lilijun (Jerry)
> > > > <jerry.lilijun@huawei.com>; chenchanghu <chenchanghu@huawei.com>;
> > > > xudingke <xudingke@huawei.com>; huangbin (J)
> > > > <brian.huangbin@huawei.com>
> > > > Subject: Re: [PATCH net 2/2] vhost_net: fix high cpu load when sendmsg
> > > > fails
> > > >
> > > >
> > > > On 2020/12/15 9:48 AM, wangyunjian wrote:
> > > > > From: Yunjian Wang <wangyunjian@huawei.com>
> > > > >
> > > > > Currently we break the loop and wake up the vhost_worker when
> > > > > sendmsg fails. When the worker wakes up again, we'll meet the
> > > > > same error, causing high CPU load. To fix this issue, we can
> > > > > skip this descriptor by ignoring the error. When we exceed
> > > > > sndbuf, the return value of sendmsg is -EAGAIN; in that case
> > > > > we don't skip the descriptor and don't drop the packet.
> > > > >
> > > > > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> > > > > ---
> > > > >   drivers/vhost/net.c | 21 +++++++++------------
> > > > >   1 file changed, 9 insertions(+), 12 deletions(-)
> > > > >
> > > > > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index
> > > > > c8784dfafdd7..f966592d8900 100644
> > > > > --- a/drivers/vhost/net.c
> > > > > +++ b/drivers/vhost/net.c
> > > > > @@ -827,16 +827,13 @@ static void handle_tx_copy(struct vhost_net
> > *net,
> > > > struct socket *sock)
> > > > >   				msg.msg_flags &= ~MSG_MORE;
> > > > >   		}
> > > > >
> > > > > -		/* TODO: Check specific error and bomb out unless ENOBUFS?
> > */
> > > > >   		err = sock->ops->sendmsg(sock, &msg, len);
> > > > > -		if (unlikely(err < 0)) {
> > > > > +		if (unlikely(err == -EAGAIN)) {
> > > > >   			vhost_discard_vq_desc(vq, 1);
> > > > >   			vhost_net_enable_vq(net, vq);
> > > > >   			break;
> > > > > -		}
> > > >
> > > >
> > > > As I've pointed out in the last version: if you don't discard the
> > > > descriptor, you probably need to add the head to the used ring.
> > > > Otherwise this descriptor will always be in flight, which may
> > > > confuse drivers.
> > >
> > > Sorry for missing the comment.
> > >
> > > After removing the descriptor discard and the break, the subsequent
> > > processing is the same as when sendmsg() succeeds:
> > > vhost_zerocopy_signal_used() or vhost_add_used_and_signal() will be
> > > called to add the head to the used ring.
> > 
> > It's the next head, not the one that contains the buggy packet?
> 
> In the modified code logic, the head added to the used ring is exactly
> the one that contains the buggy packet.

-ENOTEA :( You're right, I misread the code.

Thanks

> 
> Thanks
> 
> > 
> > Thanks
> > 
> > >
> > > Thanks
> > > >
> > > >
> > > > > -		if (err != len)
> > > > > -			pr_debug("Truncated TX packet: len %d != %zd\n",
> > > > > -				 err, len);
> > > > > +		} else if (unlikely(err < 0 || err != len))
> > > >
> > > >
> > > > It looks to me err != len covers err < 0.
> > >
> > > OK
> > >
> > > >
> > > > Thanks
> > > >
> > > >
> > > > > +			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n",
> > err,
> > > > > +len);
> > > > >   done:
> > > > >   		vq->heads[nvq->done_idx].id = cpu_to_vhost32(vq, head);
> > > > >   		vq->heads[nvq->done_idx].len = 0;
> > > > > @@ -922,7 +919,6 @@ static void handle_tx_zerocopy(struct vhost_net
> > > > *net, struct socket *sock)
> > > > >   			msg.msg_flags &= ~MSG_MORE;
> > > > >   		}
> > > > >
> > > > > -		/* TODO: Check specific error and bomb out unless ENOBUFS?
> > */
> > > > >   		err = sock->ops->sendmsg(sock, &msg, len);
> > > > >   		if (unlikely(err < 0)) {
> > > > >   			if (zcopy_used) {
> > > > > @@ -931,13 +927,14 @@ static void handle_tx_zerocopy(struct
> > vhost_net
> > > > *net, struct socket *sock)
> > > > >   				nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
> > > > >   					% UIO_MAXIOV;
> > > > >   			}
> > > > > -			vhost_discard_vq_desc(vq, 1);
> > > > > -			vhost_net_enable_vq(net, vq);
> > > > > -			break;
> > > > > +			if (err == -EAGAIN) {
> > > > > +				vhost_discard_vq_desc(vq, 1);
> > > > > +				vhost_net_enable_vq(net, vq);
> > > > > +				break;
> > > > > +			}
> > > > >   		}
> > > > >   		if (err != len)
> > > > > -			pr_debug("Truncated TX packet: "
> > > > > -				 " len %d != %zd\n", err, len);
> > > > > +			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n",
> > err,
> > > > > +len);
> > > > >   		if (!zcopy_used)
> > > > >   			vhost_add_used_and_signal(&net->dev, vq, head, 0);
> > > > >   		else
> > >
> > >
> 
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH net v2 0/2] fixes for vhost_net
  2020-12-15  1:48 [PATCH net 0/2] fixes for vhost_net wangyunjian
  2020-12-15  1:48 ` [PATCH net 1/2] vhost_net: fix ubuf refcount incorrectly when sendmsg fails wangyunjian
  2020-12-15  1:48 ` [PATCH net 2/2] vhost_net: fix high cpu load " wangyunjian
@ 2020-12-16  8:20 ` wangyunjian
  2020-12-16  8:20   ` [PATCH net v2 1/2] vhost_net: fix ubuf refcount incorrectly when sendmsg fails wangyunjian
  2020-12-16  8:20   ` [PATCH net v2 2/2] vhost_net: fix high cpu load " wangyunjian
  2 siblings, 2 replies; 38+ messages in thread
From: wangyunjian @ 2020-12-16  8:20 UTC (permalink / raw)
  To: netdev, mst, jasowang, willemdebruijn.kernel
  Cc: virtualization, jerry.lilijun, chenchanghu, xudingke,
	brian.huangbin, Yunjian Wang

From: Yunjian Wang <wangyunjian@huawei.com>

This series includes two fix patches for vhost_net.

---
v2:
   * update patch 1/2 Fixes tag as suggested by Willem de Bruijn
   * update patch 2/2 code style as suggested by Jason Wang


Yunjian Wang (2):
  vhost_net: fix ubuf refcount incorrectly when sendmsg fails
  vhost_net: fix high cpu load when sendmsg fails

 drivers/vhost/net.c | 27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

-- 
2.23.0


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH net v2 1/2] vhost_net: fix ubuf refcount incorrectly when sendmsg fails
  2020-12-16  8:20 ` [PATCH net v2 0/2] fixes for vhost_net wangyunjian
@ 2020-12-16  8:20   ` wangyunjian
  2020-12-16 14:17       ` Willem de Bruijn
  2020-12-16 20:56       ` Michael S. Tsirkin
  2020-12-16  8:20   ` [PATCH net v2 2/2] vhost_net: fix high cpu load " wangyunjian
  1 sibling, 2 replies; 38+ messages in thread
From: wangyunjian @ 2020-12-16  8:20 UTC (permalink / raw)
  To: netdev, mst, jasowang, willemdebruijn.kernel
  Cc: virtualization, jerry.lilijun, chenchanghu, xudingke,
	brian.huangbin, Yunjian Wang

From: Yunjian Wang <wangyunjian@huawei.com>

Currently vhost_zerocopy_callback() may be called to decrease the
refcount when sendmsg fails in tun. The error handling in vhost
handle_tx_zerocopy() will then try to decrease the same refcount
again, which is wrong. To fix this issue, we only call
vhost_net_ubuf_put() when
vq->heads[nvq->desc].len == VHOST_DMA_IN_PROGRESS.

Fixes: 0690899b4d45 ("tun: experimental zero copy tx support")

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
---
 drivers/vhost/net.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 531a00d703cd..c8784dfafdd7 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -863,6 +863,7 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
 	size_t len, total_len = 0;
 	int err;
 	struct vhost_net_ubuf_ref *ubufs;
+	struct ubuf_info *ubuf;
 	bool zcopy_used;
 	int sent_pkts = 0;
 
@@ -895,9 +896,7 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
 
 		/* use msg_control to pass vhost zerocopy ubuf info to skb */
 		if (zcopy_used) {
-			struct ubuf_info *ubuf;
 			ubuf = nvq->ubuf_info + nvq->upend_idx;
-
 			vq->heads[nvq->upend_idx].id = cpu_to_vhost32(vq, head);
 			vq->heads[nvq->upend_idx].len = VHOST_DMA_IN_PROGRESS;
 			ubuf->callback = vhost_zerocopy_callback;
@@ -927,7 +926,8 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
 		err = sock->ops->sendmsg(sock, &msg, len);
 		if (unlikely(err < 0)) {
 			if (zcopy_used) {
-				vhost_net_ubuf_put(ubufs);
+				if (vq->heads[ubuf->desc].len == VHOST_DMA_IN_PROGRESS)
+					vhost_net_ubuf_put(ubufs);
 				nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
 					% UIO_MAXIOV;
 			}
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails
  2020-12-16  8:20 ` [PATCH net v2 0/2] fixes for vhost_net wangyunjian
  2020-12-16  8:20   ` [PATCH net v2 1/2] vhost_net: fix ubuf refcount incorrectly when sendmsg fails wangyunjian
@ 2020-12-16  8:20   ` wangyunjian
  2020-12-16  9:23       ` Michael S. Tsirkin
  2020-12-21 23:07       ` Willem de Bruijn
  1 sibling, 2 replies; 38+ messages in thread
From: wangyunjian @ 2020-12-16  8:20 UTC (permalink / raw)
  To: netdev, mst, jasowang, willemdebruijn.kernel
  Cc: virtualization, jerry.lilijun, chenchanghu, xudingke,
	brian.huangbin, Yunjian Wang

From: Yunjian Wang <wangyunjian@huawei.com>

Currently we break the loop and wake up the vhost_worker when
sendmsg fails. When the worker wakes up again, we'll meet the
same error, causing high CPU load. To fix this issue, we can
skip this descriptor by ignoring the error. When we exceed
sndbuf, the return value of sendmsg is -EAGAIN; in that case
we don't skip the descriptor and don't drop the packet.

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
---
 drivers/vhost/net.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index c8784dfafdd7..3d33f3183abe 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -827,16 +827,13 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
 				msg.msg_flags &= ~MSG_MORE;
 		}
 
-		/* TODO: Check specific error and bomb out unless ENOBUFS? */
 		err = sock->ops->sendmsg(sock, &msg, len);
-		if (unlikely(err < 0)) {
+		if (unlikely(err == -EAGAIN)) {
 			vhost_discard_vq_desc(vq, 1);
 			vhost_net_enable_vq(net, vq);
 			break;
-		}
-		if (err != len)
-			pr_debug("Truncated TX packet: len %d != %zd\n",
-				 err, len);
+		} else if (unlikely(err != len))
+			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n", err, len);
 done:
 		vq->heads[nvq->done_idx].id = cpu_to_vhost32(vq, head);
 		vq->heads[nvq->done_idx].len = 0;
@@ -922,7 +919,6 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
 			msg.msg_flags &= ~MSG_MORE;
 		}
 
-		/* TODO: Check specific error and bomb out unless ENOBUFS? */
 		err = sock->ops->sendmsg(sock, &msg, len);
 		if (unlikely(err < 0)) {
 			if (zcopy_used) {
@@ -931,13 +927,14 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
 				nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
 					% UIO_MAXIOV;
 			}
-			vhost_discard_vq_desc(vq, 1);
-			vhost_net_enable_vq(net, vq);
-			break;
+			if (err == -EAGAIN) {
+				vhost_discard_vq_desc(vq, 1);
+				vhost_net_enable_vq(net, vq);
+				break;
+			}
 		}
 		if (err != len)
-			pr_debug("Truncated TX packet: "
-				 " len %d != %zd\n", err, len);
+			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n", err, len);
 		if (!zcopy_used)
 			vhost_add_used_and_signal(&net->dev, vq, head, 0);
 		else
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails
  2020-12-16  8:20   ` [PATCH net v2 2/2] vhost_net: fix high cpu load " wangyunjian
@ 2020-12-16  9:23       ` Michael S. Tsirkin
  2020-12-21 23:07       ` Willem de Bruijn
  1 sibling, 0 replies; 38+ messages in thread
From: Michael S. Tsirkin @ 2020-12-16  9:23 UTC (permalink / raw)
  To: wangyunjian
  Cc: netdev, jasowang, willemdebruijn.kernel, virtualization,
	jerry.lilijun, chenchanghu, xudingke, brian.huangbin

On Wed, Dec 16, 2020 at 04:20:37PM +0800, wangyunjian wrote:
> From: Yunjian Wang <wangyunjian@huawei.com>
> 
> Currently we break the loop and wake up the vhost_worker when
> sendmsg fails. When the worker wakes up again, we'll meet the
> same error, causing high CPU load. To fix this issue, we can
> skip this descriptor by ignoring the error. When we exceed
> sndbuf, the return value of sendmsg is -EAGAIN; in that case
> we don't skip the descriptor and don't drop the packet.

Question: with this patch, what happens if sendmsg is interrupted by a signal?


> 
> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> ---
>  drivers/vhost/net.c | 21 +++++++++------------
>  1 file changed, 9 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index c8784dfafdd7..3d33f3183abe 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -827,16 +827,13 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
>  				msg.msg_flags &= ~MSG_MORE;
>  		}
>  
> -		/* TODO: Check specific error and bomb out unless ENOBUFS? */
>  		err = sock->ops->sendmsg(sock, &msg, len);
> -		if (unlikely(err < 0)) {
> +		if (unlikely(err == -EAGAIN)) {
>  			vhost_discard_vq_desc(vq, 1);
>  			vhost_net_enable_vq(net, vq);
>  			break;
> -		}
> -		if (err != len)
> -			pr_debug("Truncated TX packet: len %d != %zd\n",
> -				 err, len);
> +		} else if (unlikely(err != len))
> +			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n", err, len);
>  done:
>  		vq->heads[nvq->done_idx].id = cpu_to_vhost32(vq, head);
>  		vq->heads[nvq->done_idx].len = 0;
> @@ -922,7 +919,6 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
>  			msg.msg_flags &= ~MSG_MORE;
>  		}
>  
> -		/* TODO: Check specific error and bomb out unless ENOBUFS? */
>  		err = sock->ops->sendmsg(sock, &msg, len);
>  		if (unlikely(err < 0)) {
>  			if (zcopy_used) {
> @@ -931,13 +927,14 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
>  				nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
>  					% UIO_MAXIOV;
>  			}
> -			vhost_discard_vq_desc(vq, 1);
> -			vhost_net_enable_vq(net, vq);
> -			break;
> +			if (err == -EAGAIN) {
> +				vhost_discard_vq_desc(vq, 1);
> +				vhost_net_enable_vq(net, vq);
> +				break;
> +			}
>  		}
>  		if (err != len)
> -			pr_debug("Truncated TX packet: "
> -				 " len %d != %zd\n", err, len);
> +			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n", err, len);

I'd rather make the pr_debug -> vq_err a separate change, with proper
commit log describing motivation.


>  		if (!zcopy_used)
>  			vhost_add_used_and_signal(&net->dev, vq, head, 0);
>  		else
> -- 
> 2.23.0


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net v2 1/2] vhost_net: fix ubuf refcount incorrectly when sendmsg fails
  2020-12-16  8:20   ` [PATCH net v2 1/2] vhost_net: fix ubuf refcount incorrectly when sendmsg fails wangyunjian
@ 2020-12-16 14:17       ` Willem de Bruijn
  2020-12-16 20:56       ` Michael S. Tsirkin
  1 sibling, 0 replies; 38+ messages in thread
From: Willem de Bruijn @ 2020-12-16 14:17 UTC (permalink / raw)
  To: wangyunjian
  Cc: Network Development, Michael S. Tsirkin, Jason Wang,
	Willem de Bruijn, virtualization, Lilijun (Jerry),
	chenchanghu, xudingke, huangbin (J)

On Wed, Dec 16, 2020 at 3:26 AM wangyunjian <wangyunjian@huawei.com> wrote:
>
> From: Yunjian Wang <wangyunjian@huawei.com>
>
> Currently the vhost_zerocopy_callback() maybe be called to decrease
> the refcount when sendmsg fails in tun. The error handling in vhost
> handle_tx_zerocopy() will try to decrease the same refcount again.
> This is wrong. To fix this issue, we only call vhost_net_ubuf_put()
> when vq->heads[nvq->desc].len == VHOST_DMA_IN_PROGRESS.
>
> Fixes: 0690899b4d45 ("tun: experimental zero copy tx support")
>
> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>

Acked-by: Willem de Bruijn <willemb@google.com>

for next time: it's not customary to have an empty line between Fixes
and Signed-off-by

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net v2 1/2] vhost_net: fix ubuf refcount incorrectly when sendmsg fails
  2020-12-16  8:20   ` [PATCH net v2 1/2] vhost_net: fix ubuf refcount incorrectly when sendmsg fails wangyunjian
@ 2020-12-16 20:56       ` Michael S. Tsirkin
  2020-12-16 20:56       ` Michael S. Tsirkin
  1 sibling, 0 replies; 38+ messages in thread
From: Michael S. Tsirkin @ 2020-12-16 20:56 UTC (permalink / raw)
  To: wangyunjian
  Cc: netdev, jasowang, willemdebruijn.kernel, virtualization,
	jerry.lilijun, chenchanghu, xudingke, brian.huangbin

On Wed, Dec 16, 2020 at 04:20:20PM +0800, wangyunjian wrote:
> From: Yunjian Wang <wangyunjian@huawei.com>
> 
> Currently the vhost_zerocopy_callback() maybe be called to decrease
> the refcount when sendmsg fails in tun. The error handling in vhost
> handle_tx_zerocopy() will try to decrease the same refcount again.
> This is wrong. To fix this issue, we only call vhost_net_ubuf_put()
> when vq->heads[nvq->desc].len == VHOST_DMA_IN_PROGRESS.
> 
> Fixes: 0690899b4d45 ("tun: experimental zero copy tx support")
> 
> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  drivers/vhost/net.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 531a00d703cd..c8784dfafdd7 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -863,6 +863,7 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
>  	size_t len, total_len = 0;
>  	int err;
>  	struct vhost_net_ubuf_ref *ubufs;
> +	struct ubuf_info *ubuf;
>  	bool zcopy_used;
>  	int sent_pkts = 0;
>  
> @@ -895,9 +896,7 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
>  
>  		/* use msg_control to pass vhost zerocopy ubuf info to skb */
>  		if (zcopy_used) {
> -			struct ubuf_info *ubuf;
>  			ubuf = nvq->ubuf_info + nvq->upend_idx;
> -
>  			vq->heads[nvq->upend_idx].id = cpu_to_vhost32(vq, head);
>  			vq->heads[nvq->upend_idx].len = VHOST_DMA_IN_PROGRESS;
>  			ubuf->callback = vhost_zerocopy_callback;
> @@ -927,7 +926,8 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
>  		err = sock->ops->sendmsg(sock, &msg, len);
>  		if (unlikely(err < 0)) {
>  			if (zcopy_used) {
> -				vhost_net_ubuf_put(ubufs);
> +				if (vq->heads[ubuf->desc].len == VHOST_DMA_IN_PROGRESS)
> +					vhost_net_ubuf_put(ubufs);
>  				nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
>  					% UIO_MAXIOV;
>  			}
> -- 
> 2.23.0


^ permalink raw reply	[flat|nested] 38+ messages in thread
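
The idiom being acked here generalizes: a reference may be dropped at most
once, and when a lower layer's completion callback can already have
consumed it, the error path has to check the in-flight state first. A toy,
self-contained sketch of that pattern; the struct and names are
illustrative, not vhost's own:

#include <stdio.h>

#define DMA_IN_PROGRESS	1
#define DMA_DONE	2

struct toy_ubuf {
	int refcnt;
	int state;
};

static void toy_put(struct toy_ubuf *u)
{
	if (--u->refcnt == 0)
		printf("last reference dropped, freeing\n");
}

/* Completion callback: marks the buffer done and consumes the
 * reference, as vhost_zerocopy_callback() does on sendmsg failure. */
static void toy_callback(struct toy_ubuf *u)
{
	u->state = DMA_DONE;
	toy_put(u);
}

/* Error path: put only if the callback has NOT already run, mirroring
 * the vq->heads[ubuf->desc].len == VHOST_DMA_IN_PROGRESS test above. */
static void toy_error_path(struct toy_ubuf *u)
{
	if (u->state == DMA_IN_PROGRESS)
		toy_put(u);
}

int main(void)
{
	struct toy_ubuf u = { .refcnt = 1, .state = DMA_IN_PROGRESS };

	toy_callback(&u);	/* the lower layer already released it */
	toy_error_path(&u);	/* guarded, so no double put */
	return 0;
}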

* RE: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails
  2020-12-16  9:23       ` Michael S. Tsirkin
@ 2020-12-17  2:38       ` wangyunjian
  -1 siblings, 0 replies; 38+ messages in thread
From: wangyunjian @ 2020-12-17  2:38 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, willemdebruijn.kernel, virtualization,
	Lilijun (Jerry), chenchanghu, xudingke, huangbin (J)

> -----Original Message-----
> From: Michael S. Tsirkin [mailto:mst@redhat.com]
> Sent: Wednesday, December 16, 2020 5:23 PM
> To: wangyunjian <wangyunjian@huawei.com>
> Cc: netdev@vger.kernel.org; jasowang@redhat.com;
> willemdebruijn.kernel@gmail.com; virtualization@lists.linux-foundation.org;
> Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu
> <chenchanghu@huawei.com>; xudingke <xudingke@huawei.com>; huangbin (J)
> <brian.huangbin@huawei.com>
> Subject: Re: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails
> 
> On Wed, Dec 16, 2020 at 04:20:37PM +0800, wangyunjian wrote:
> > From: Yunjian Wang <wangyunjian@huawei.com>
> >
> > Currently we break the loop and wake up the vhost_worker when sendmsg
> > fails. When the worker wakes up again, we'll meet the same error. This
> > will cause high CPU load. To fix this issue, we can skip this
> > description by ignoring the error. When we exceeds sndbuf, the return
> > value of sendmsg is -EAGAIN. In the case we don't skip the description
> > and don't drop packet.
> 
> Question: with this patch, what happens if sendmsg is interrupted by a signal?

The descriptors are consumed as normal. However, the packet is discarded.
Could you explain the specific scenario?

> 
> 
> >
> > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> > ---
> >  drivers/vhost/net.c | 21 +++++++++------------
> >  1 file changed, 9 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index
> > c8784dfafdd7..3d33f3183abe 100644
> > --- a/drivers/vhost/net.c
> > +++ b/drivers/vhost/net.c
> > @@ -827,16 +827,13 @@ static void handle_tx_copy(struct vhost_net *net,
> struct socket *sock)
> >  				msg.msg_flags &= ~MSG_MORE;
> >  		}
> >
> > -		/* TODO: Check specific error and bomb out unless ENOBUFS? */
> >  		err = sock->ops->sendmsg(sock, &msg, len);
> > -		if (unlikely(err < 0)) {
> > +		if (unlikely(err == -EAGAIN)) {
> >  			vhost_discard_vq_desc(vq, 1);
> >  			vhost_net_enable_vq(net, vq);
> >  			break;
> > -		}
> > -		if (err != len)
> > -			pr_debug("Truncated TX packet: len %d != %zd\n",
> > -				 err, len);
> > +		} else if (unlikely(err != len))
> > +			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n", err,
> > +len);
> >  done:
> >  		vq->heads[nvq->done_idx].id = cpu_to_vhost32(vq, head);
> >  		vq->heads[nvq->done_idx].len = 0;
> > @@ -922,7 +919,6 @@ static void handle_tx_zerocopy(struct vhost_net
> *net, struct socket *sock)
> >  			msg.msg_flags &= ~MSG_MORE;
> >  		}
> >
> > -		/* TODO: Check specific error and bomb out unless ENOBUFS? */
> >  		err = sock->ops->sendmsg(sock, &msg, len);
> >  		if (unlikely(err < 0)) {
> >  			if (zcopy_used) {
> > @@ -931,13 +927,14 @@ static void handle_tx_zerocopy(struct vhost_net
> *net, struct socket *sock)
> >  				nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
> >  					% UIO_MAXIOV;
> >  			}
> > -			vhost_discard_vq_desc(vq, 1);
> > -			vhost_net_enable_vq(net, vq);
> > -			break;
> > +			if (err == -EAGAIN) {
> > +				vhost_discard_vq_desc(vq, 1);
> > +				vhost_net_enable_vq(net, vq);
> > +				break;
> > +			}
> >  		}
> >  		if (err != len)
> > -			pr_debug("Truncated TX packet: "
> > -				 " len %d != %zd\n", err, len);
> > +			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n", err,
> > +len);
> 
> I'd rather make the pr_debug -> vq_err a separate change, with proper commit
> log describing motivation.

This log was originally triggered when packets were truncated. But after
the modification in this patch, other error scenarios will also trigger
this log. That's why I modified the content and the level of this log
together. Now, should I just change the content of the log in this patch?

Thanks

> 
> 
> >  		if (!zcopy_used)
> >  			vhost_add_used_and_signal(&net->dev, vq, head, 0);
> >  		else
> > --
> > 2.23.0


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails
  2020-12-16  9:23       ` Michael S. Tsirkin
@ 2020-12-17  3:19         ` Jason Wang
  -1 siblings, 0 replies; 38+ messages in thread
From: Jason Wang @ 2020-12-17  3:19 UTC (permalink / raw)
  To: Michael S. Tsirkin, wangyunjian
  Cc: netdev, willemdebruijn.kernel, virtualization, jerry.lilijun,
	chenchanghu, xudingke, brian.huangbin


On 2020/12/16 5:23 PM, Michael S. Tsirkin wrote:
> On Wed, Dec 16, 2020 at 04:20:37PM +0800, wangyunjian wrote:
>> From: Yunjian Wang<wangyunjian@huawei.com>
>>
>> Currently we break the loop and wake up the vhost_worker when
>> sendmsg fails. When the worker wakes up again, we'll meet the
>> same error. This will cause high CPU load. To fix this issue,
>> we can skip this description by ignoring the error. When we
>> exceeds sndbuf, the return value of sendmsg is -EAGAIN. In
>> the case we don't skip the description and don't drop packet.
> Question: with this patch, what happens if sendmsg is interrupted by a signal?


Since we use MSG_DONTWAIT, I don't think we need to care about signals.

Thanks


>
>


^ permalink raw reply	[flat|nested] 38+ messages in thread
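
The point about signals can be checked from userspace as well: with
MSG_DONTWAIT there is no sleep for a signal to interrupt, so the send
fails with EAGAIN rather than EINTR. A small sketch contrasting the two
modes, reusing the no-reader socketpair trick (illustrative setup only):

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static void on_alarm(int sig) { (void)sig; }

int main(void)
{
	int fds[2];
	char pkt[1024] = { 0 };
	struct sigaction sa = { .sa_handler = on_alarm };	/* no SA_RESTART */

	socketpair(AF_UNIX, SOCK_DGRAM, 0, fds);
	sigaction(SIGALRM, &sa, NULL);

	/* Fill the queue: nobody reads fds[1]. */
	while (send(fds[0], pkt, sizeof(pkt), MSG_DONTWAIT) > 0)
		;
	printf("non-blocking send: %s (returns at once, never EINTR)\n",
	       strerror(errno));

	/* A blocking send on the full queue sleeps until the signal
	 * arrives, then fails with EINTR instead. */
	alarm(1);
	if (send(fds[0], pkt, sizeof(pkt), 0) < 0)
		printf("blocking send: %s\n", strerror(errno));
	return 0;
}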

* Re: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails
  2020-12-16  8:20   ` [PATCH net v2 2/2] vhost_net: fix high cpu load " wangyunjian
@ 2020-12-21 23:07       ` Willem de Bruijn
  2020-12-21 23:07       ` Willem de Bruijn
  1 sibling, 0 replies; 38+ messages in thread
From: Willem de Bruijn @ 2020-12-21 23:07 UTC (permalink / raw)
  To: wangyunjian
  Cc: Network Development, Michael S. Tsirkin, Jason Wang,
	virtualization, Lilijun (Jerry),
	chenchanghu, xudingke, huangbin (J)

On Wed, Dec 16, 2020 at 3:20 AM wangyunjian <wangyunjian@huawei.com> wrote:
>
> From: Yunjian Wang <wangyunjian@huawei.com>
>
> Currently we break the loop and wake up the vhost_worker when
> sendmsg fails. When the worker wakes up again, we'll meet the
> same error.

The patch is based on the assumption that such error cases always
return EAGAIN. Can it not also be ENOMEM, such as from tun_build_skb?

> This will cause high CPU load. To fix this issue,
> we can skip this description by ignoring the error. When we
> exceeds sndbuf, the return value of sendmsg is -EAGAIN. In
> the case we don't skip the description and don't drop packet.

the -> that

here and above: description -> descriptor

Perhaps slightly revise to more explicitly state that

1. in the case of persistent failure (i.e., bad packet), the driver
drops the packet
2. in the case of transient failure (e.g., memory pressure) the driver
schedules the worker to try again later


> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> ---
>  drivers/vhost/net.c | 21 +++++++++------------
>  1 file changed, 9 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index c8784dfafdd7..3d33f3183abe 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -827,16 +827,13 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
>                                 msg.msg_flags &= ~MSG_MORE;
>                 }
>
> -               /* TODO: Check specific error and bomb out unless ENOBUFS? */
>                 err = sock->ops->sendmsg(sock, &msg, len);
> -               if (unlikely(err < 0)) {
> +               if (unlikely(err == -EAGAIN)) {
>                         vhost_discard_vq_desc(vq, 1);
>                         vhost_net_enable_vq(net, vq);
>                         break;
> -               }
> -               if (err != len)
> -                       pr_debug("Truncated TX packet: len %d != %zd\n",
> -                                err, len);
> +               } else if (unlikely(err != len))
> +                       vq_err(vq, "Fail to sending packets err : %d, len : %zd\n", err, len);

sending -> send

Even though vq_err is a wrapper around pr_debug, I agree with Michael
that such a change should be a separate patch to net-next; it does not
belong in a fix.

More importantly, the error message is now the same for persistent
errors and for truncated packets. But on truncation the packet was
sent, so that is not entirely correct.

>  done:
>                 vq->heads[nvq->done_idx].id = cpu_to_vhost32(vq, head);
>                 vq->heads[nvq->done_idx].len = 0;
> @@ -922,7 +919,6 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
>                         msg.msg_flags &= ~MSG_MORE;
>                 }
>
> -               /* TODO: Check specific error and bomb out unless ENOBUFS? */
>                 err = sock->ops->sendmsg(sock, &msg, len);
>                 if (unlikely(err < 0)) {
>                         if (zcopy_used) {
> @@ -931,13 +927,14 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
>                                 nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
>                                         % UIO_MAXIOV;
>                         }
> -                       vhost_discard_vq_desc(vq, 1);
> -                       vhost_net_enable_vq(net, vq);
> -                       break;
> +                       if (err == -EAGAIN) {
> +                               vhost_discard_vq_desc(vq, 1);
> +                               vhost_net_enable_vq(net, vq);
> +                               break;
> +                       }
>                 }
>                 if (err != len)
> -                       pr_debug("Truncated TX packet: "
> -                                " len %d != %zd\n", err, len);
> +                       vq_err(vq, "Fail to sending packets err : %d, len : %zd\n", err, len);
>                 if (!zcopy_used)
>                         vhost_add_used_and_signal(&net->dev, vq, head, 0);
>                 else
> --
> 2.23.0
>

^ permalink raw reply	[flat|nested] 38+ messages in thread
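
One way to read the suggested revision is as an explicit classification of
sendmsg() errors. A hedged sketch of such a helper; the name and the exact
set of transient errnos are assumptions for illustration, not what the
patch implements:

#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Transient failures are worth retrying later; anything else is
 * treated as a bad descriptor and dropped so the ring keeps moving. */
static bool send_error_is_transient(int err)
{
	switch (err) {
	case -EAGAIN:	/* sndbuf exceeded */
	case -ENOMEM:	/* e.g. an allocation failure in tun_build_skb() */
		return true;
	default:
		return false;
	}
}

int main(void)
{
	printf("-EAGAIN transient: %d\n", send_error_is_transient(-EAGAIN));
	printf("-EFAULT transient: %d\n", send_error_is_transient(-EFAULT));
	return 0;
}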

* Re: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails
  2020-12-21 23:07       ` Willem de Bruijn
@ 2020-12-22  4:41         ` Jason Wang
  -1 siblings, 0 replies; 38+ messages in thread
From: Jason Wang @ 2020-12-22  4:41 UTC (permalink / raw)
  To: Willem de Bruijn, wangyunjian
  Cc: Network Development, Michael S. Tsirkin, virtualization,
	Lilijun (Jerry), chenchanghu, xudingke, huangbin (J)


On 2020/12/22 7:07 AM, Willem de Bruijn wrote:
> On Wed, Dec 16, 2020 at 3:20 AM wangyunjian<wangyunjian@huawei.com>  wrote:
>> From: Yunjian Wang<wangyunjian@huawei.com>
>>
>> Currently we break the loop and wake up the vhost_worker when
>> sendmsg fails. When the worker wakes up again, we'll meet the
>> same error.
> The patch is based on the assumption that such error cases always
> return EAGAIN. Can it not also be ENOMEM, such as from tun_build_skb?
>
>> This will cause high CPU load. To fix this issue,
>> we can skip this description by ignoring the error. When we
>> exceeds sndbuf, the return value of sendmsg is -EAGAIN. In
>> the case we don't skip the description and don't drop packet.
> the -> that
>
> here and above: description -> descriptor
>
> Perhaps slightly revise to more explicitly state that
>
> 1. in the case of persistent failure (i.e., bad packet), the driver
> drops the packet
> 2. in the case of transient failure (e.g., memory pressure) the driver
> schedules the worker to try again later


If we want to go this way, we need a better time to wake up the
worker. Otherwise it just produces more stress on the CPU, which is what
this patch tries to avoid.

Thanks


>
>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails
  2020-12-22  4:41         ` Jason Wang
@ 2020-12-22 14:24           ` Willem de Bruijn
  -1 siblings, 0 replies; 38+ messages in thread
From: Willem de Bruijn @ 2020-12-22 14:24 UTC (permalink / raw)
  To: Jason Wang
  Cc: wangyunjian, Network Development, Michael S. Tsirkin,
	virtualization, Lilijun (Jerry),
	chenchanghu, xudingke, huangbin (J)

On Mon, Dec 21, 2020 at 11:41 PM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2020/12/22 7:07 AM, Willem de Bruijn wrote:
> > On Wed, Dec 16, 2020 at 3:20 AM wangyunjian<wangyunjian@huawei.com>  wrote:
> >> From: Yunjian Wang<wangyunjian@huawei.com>
> >>
> >> Currently we break the loop and wake up the vhost_worker when
> >> sendmsg fails. When the worker wakes up again, we'll meet the
> >> same error.
> > The patch is based on the assumption that such error cases always
> > return EAGAIN. Can it not also be ENOMEM, such as from tun_build_skb?
> >
> >> This will cause high CPU load. To fix this issue,
> >> we can skip this description by ignoring the error. When we
> >> exceeds sndbuf, the return value of sendmsg is -EAGAIN. In
> >> the case we don't skip the description and don't drop packet.
> > the -> that
> >
> > here and above: description -> descriptor
> >
> > Perhaps slightly revise to more explicitly state that
> >
> > 1. in the case of persistent failure (i.e., bad packet), the driver
> > drops the packet
> > 2. in the case of transient failure (e.g., memory pressure) the driver
> > schedules the worker to try again later
>
>
> If we want to go this way, we need a better time to wake up the
> worker. Otherwise it just produces more stress on the CPU, which is what
> this patch tries to avoid.

Perhaps I misunderstood the purpose of the patch: is it to drop
everything, regardless of transient or persistent failure, until the
ring runs out of descriptors?

I can understand both a blocking and drop strategy during memory
pressure. But partial drop strategy until exceeding ring capacity
seems like a peculiar hybrid?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails
  2020-12-22  4:41         ` Jason Wang
@ 2020-12-23  2:46         ` wangyunjian
  -1 siblings, 0 replies; 38+ messages in thread
From: wangyunjian @ 2020-12-23  2:46 UTC (permalink / raw)
  To: Jason Wang, Willem de Bruijn
  Cc: Network Development, Michael S. Tsirkin, virtualization,
	Lilijun (Jerry), chenchanghu, xudingke, huangbin (J)



> -----Original Message-----
> From: Jason Wang [mailto:jasowang@redhat.com]
> Sent: Tuesday, December 22, 2020 12:41 PM
> To: Willem de Bruijn <willemdebruijn.kernel@gmail.com>; wangyunjian
> <wangyunjian@huawei.com>
> Cc: Network Development <netdev@vger.kernel.org>; Michael S. Tsirkin
> <mst@redhat.com>; virtualization@lists.linux-foundation.org; Lilijun (Jerry)
> <jerry.lilijun@huawei.com>; chenchanghu <chenchanghu@huawei.com>;
> xudingke <xudingke@huawei.com>; huangbin (J)
> <brian.huangbin@huawei.com>
> Subject: Re: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails
> 
> 
> On 2020/12/22 上午7:07, Willem de Bruijn wrote:
> > On Wed, Dec 16, 2020 at 3:20 AM wangyunjian<wangyunjian@huawei.com>
> wrote:
> >> From: Yunjian Wang<wangyunjian@huawei.com>
> >>
> >> Currently we break the loop and wake up the vhost_worker when sendmsg
> >> fails. When the worker wakes up again, we'll meet the same error.
> > The patch is based on the assumption that such error cases always
> > return EAGAIN. Can it not also be ENOMEM, such as from tun_build_skb?
> >
> >> This will cause high CPU load. To fix this issue, we can skip this
> >> description by ignoring the error. When we exceeds sndbuf, the return
> >> value of sendmsg is -EAGAIN. In the case we don't skip the
> >> description and don't drop packet.
> > the -> that
> >
> > here and above: description -> descriptor
> >
> > Perhaps slightly revise to more explicitly state that
> >
> > 1. in the case of persistent failure (i.e., bad packet), the driver
> > drops the packet 2. in the case of transient failure (e.g,. memory
> > pressure) the driver schedules the worker to try again later
> 
> 
> If we want to go with this way, we need a better time to wakeup the worker.
> Otherwise it just produces more stress on the cpu that is what this patch tries
> to avoid.

The problem was initially discovered when a VM sent an abnormal packet,
which caused the VM to be unable to send packets anymore. Since commit
feb8892cb441c7 ("vhost_net: conditionally enable tx polling"), there have
also been high CPU consumption issues.

It is the first problem that I am actually more concerned with and want
to solve.

Thanks

> 
> Thanks
> 
> 
> >
> >


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails
  2020-12-22 14:24           ` Willem de Bruijn
@ 2020-12-23  2:53             ` Jason Wang
  -1 siblings, 0 replies; 38+ messages in thread
From: Jason Wang @ 2020-12-23  2:53 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: wangyunjian, Network Development, Michael S. Tsirkin,
	virtualization, Lilijun (Jerry),
	chenchanghu, xudingke, huangbin (J)


On 2020/12/22 10:24 PM, Willem de Bruijn wrote:
> On Mon, Dec 21, 2020 at 11:41 PM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2020/12/22 7:07 AM, Willem de Bruijn wrote:
>>> On Wed, Dec 16, 2020 at 3:20 AM wangyunjian<wangyunjian@huawei.com>  wrote:
>>>> From: Yunjian Wang<wangyunjian@huawei.com>
>>>>
>>>> Currently we break the loop and wake up the vhost_worker when
>>>> sendmsg fails. When the worker wakes up again, we'll meet the
>>>> same error.
>>> The patch is based on the assumption that such error cases always
>>> return EAGAIN. Can it not also be ENOMEM, such as from tun_build_skb?
>>>
>>>> This will cause high CPU load. To fix this issue,
>>>> we can skip this description by ignoring the error. When we
>>>> exceeds sndbuf, the return value of sendmsg is -EAGAIN. In
>>>> the case we don't skip the description and don't drop packet.
>>> the -> that
>>>
>>> here and above: description -> descriptor
>>>
>>> Perhaps slightly revise to more explicitly state that
>>>
>>> 1. in the case of persistent failure (i.e., bad packet), the driver
>>> drops the packet
>>> 2. in the case of transient failure (e.g., memory pressure) the driver
>>> schedules the worker to try again later
>>
>> If we want to go this way, we need a better time to wake up the
>> worker. Otherwise it just produces more stress on the CPU, which is what
>> this patch tries to avoid.
> Perhaps I misunderstood the purpose of the patch: is it to drop
> everything, regardless of transient or persistent failure, until the
> ring runs out of descriptors?


My understanding is that the main motivation is to avoid high CPU
utilization when sendmsg() fails due to a guest reason (e.g. a bad packet).


>
> I can understand both a blocking and drop strategy during memory
> pressure. But partial drop strategy until exceeding ring capacity
> seems like a peculiar hybrid?


Yes. So I wonder if we want to do better when we are under memory
pressure. E.g. can we let the socket wake us up instead of rescheduling
the workers here? At least in this case we know some memory might be freed.

Thanks


>


^ permalink raw reply	[flat|nested] 38+ messages in thread
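
"Let the socket wake us up" corresponds, in userspace terms, to sleeping on
POLLOUT instead of spinning on retries; in vhost itself that role is played
by re-enabling tx polling rather than by poll(2). A minimal sketch of the
idea, with illustrative names:

#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Retry a non-blocking send, sleeping in poll() until the socket
 * reports writable space instead of busy-looping on EAGAIN. */
static ssize_t send_when_writable(int fd, const void *buf, size_t len)
{
	for (;;) {
		ssize_t n = send(fd, buf, len, MSG_DONTWAIT);
		if (n >= 0 || errno != EAGAIN)
			return n;

		struct pollfd pfd = { .fd = fd, .events = POLLOUT };
		if (poll(&pfd, 1, -1) < 0)	/* woken by the socket */
			return -1;
	}
}

int main(void)
{
	int fds[2];
	char pkt[512] = { 0 };

	socketpair(AF_UNIX, SOCK_DGRAM, 0, fds);
	/* The peer's queue is empty here, so this returns at once; on
	 * a full queue it would sleep in poll() rather than spin. */
	return send_when_writable(fds[0], pkt, sizeof(pkt)) < 0;
}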

* RE: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails
  2020-12-23  2:53             ` Jason Wang
@ 2020-12-23 13:21             ` wangyunjian
  2020-12-23 13:48                 ` Willem de Bruijn
  -1 siblings, 1 reply; 38+ messages in thread
From: wangyunjian @ 2020-12-23 13:21 UTC (permalink / raw)
  To: Jason Wang, Willem de Bruijn
  Cc: Network Development, Michael S. Tsirkin, virtualization,
	Lilijun (Jerry), chenchanghu, xudingke, huangbin (J)

> -----Original Message-----
> From: Jason Wang [mailto:jasowang@redhat.com]
> Sent: Wednesday, December 23, 2020 10:54 AM
> To: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> Cc: wangyunjian <wangyunjian@huawei.com>; Network Development
> <netdev@vger.kernel.org>; Michael S. Tsirkin <mst@redhat.com>;
> virtualization@lists.linux-foundation.org; Lilijun (Jerry)
> <jerry.lilijun@huawei.com>; chenchanghu <chenchanghu@huawei.com>;
> xudingke <xudingke@huawei.com>; huangbin (J)
> <brian.huangbin@huawei.com>
> Subject: Re: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails
> 
> 
> On 2020/12/22 10:24 PM, Willem de Bruijn wrote:
> > On Mon, Dec 21, 2020 at 11:41 PM Jason Wang <jasowang@redhat.com>
> wrote:
> >>
> >> On 2020/12/22 7:07 AM, Willem de Bruijn wrote:
> >>> On Wed, Dec 16, 2020 at 3:20 AM wangyunjian<wangyunjian@huawei.com>
> wrote:
> >>>> From: Yunjian Wang<wangyunjian@huawei.com>
> >>>>
> >>>> Currently we break the loop and wake up the vhost_worker when
> >>>> sendmsg fails. When the worker wakes up again, we'll meet the same
> >>>> error.
> >>> The patch is based on the assumption that such error cases always
> >>> return EAGAIN. Can it not also be ENOMEM, such as from tun_build_skb?
> >>>
> >>>> This will cause high CPU load. To fix this issue, we can skip this
> >>>> description by ignoring the error. When we exceeds sndbuf, the
> >>>> return value of sendmsg is -EAGAIN. In the case we don't skip the
> >>>> description and don't drop packet.
> >>> the -> that
> >>>
> >>> here and above: description -> descriptor
> >>>
> >>> Perhaps slightly revise to more explicitly state that
> >>>
> >>> 1. in the case of persistent failure (i.e., bad packet), the driver
> >>> drops the packet 2. in the case of transient failure (e.g,. memory
> >>> pressure) the driver schedules the worker to try again later
> >>
> >> If we want to go this way, we need a better time to wake up the
> >> worker. Otherwise it just produces more stress on the CPU, which is
> >> what this patch tries to avoid.
> > Perhaps I misunderstood the purpose of the patch: is it to drop
> > everything, regardless of transient or persistent failure, until the
> > ring runs out of descriptors?
> 
> 
> My understanding is that the main motivation is to avoid high CPU utilization
> when sendmsg() fails due to a guest reason (e.g. a bad packet).
> 

My main motivation is to avoid the tx queue getting stuck.

Should I describe it like this:
Currently the driver doesn't drop a packet which can't be sent by tun
(e.g. a bad packet). In this case, the driver will always process the
same packet, leading to the tx queue getting stuck.

To fix this issue:
1. in the case of persistent failure (e.g. a bad packet), the driver can skip
this descriptor by ignoring the error.
2. in the case of transient failure (e.g. -EAGAIN and -ENOMEM), the driver
schedules the worker to try again.

Thanks

> 
> >
> > I can understand both a blocking and drop strategy during memory
> > pressure. But partial drop strategy until exceeding ring capacity
> > seems like a peculiar hybrid?
> 
> 
> Yes. So I wonder if we want to do better when we are under memory
> pressure. E.g. can we let the socket wake us up instead of rescheduling
> the workers here? At least in this case we know some memory might be freed.
> 
> Thanks
> 
> 
> >


^ permalink raw reply	[flat|nested] 38+ messages in thread
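
A minimal simulation of the control flow described above, with hypothetical
helper names standing in for the real sendmsg() and vhost calls:

#include <errno.h>
#include <stdio.h>

/* Simulated per-descriptor sendmsg() results. */
static int fake_sendmsg(int head)
{
	switch (head) {
	case 2: return -EFAULT;	/* persistent: a bad packet */
	case 4: return -EAGAIN;	/* transient: sndbuf full */
	default: return 0;
	}
}

int main(void)
{
	int head, err;

	for (head = 0; head < 6; head++) {
		err = fake_sendmsg(head);
		if (err == -EAGAIN || err == -ENOMEM) {
			/* transient: keep the descriptor and stop; in the
			 * driver this is vhost_discard_vq_desc() followed
			 * by vhost_net_enable_vq() so tx polling can wake
			 * the worker when there is room again */
			printf("desc %d: transient error, retry later\n", head);
			break;
		}
		if (err < 0)
			/* persistent: consume the descriptor anyway so the
			 * tx queue cannot get stuck behind one packet */
			printf("desc %d: dropped (err %d)\n", head, err);
		else
			printf("desc %d: sent\n", head);
	}
	return 0;
}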

* Re: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails
  2020-12-23 13:21             ` wangyunjian
@ 2020-12-23 13:48                 ` Willem de Bruijn
  0 siblings, 0 replies; 38+ messages in thread
From: Willem de Bruijn @ 2020-12-23 13:48 UTC (permalink / raw)
  To: wangyunjian
  Cc: Jason Wang, Network Development, Michael S. Tsirkin,
	virtualization, Lilijun (Jerry),
	chenchanghu, xudingke, huangbin (J)

On Wed, Dec 23, 2020 at 8:21 AM wangyunjian <wangyunjian@huawei.com> wrote:
>
> > -----Original Message-----
> > From: Jason Wang [mailto:jasowang@redhat.com]
> > Sent: Wednesday, December 23, 2020 10:54 AM
> > To: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> > Cc: wangyunjian <wangyunjian@huawei.com>; Network Development
> > <netdev@vger.kernel.org>; Michael S. Tsirkin <mst@redhat.com>;
> > virtualization@lists.linux-foundation.org; Lilijun (Jerry)
> > <jerry.lilijun@huawei.com>; chenchanghu <chenchanghu@huawei.com>;
> > xudingke <xudingke@huawei.com>; huangbin (J)
> > <brian.huangbin@huawei.com>
> > Subject: Re: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails
> >
> >
> > On 2020/12/22 10:24 PM, Willem de Bruijn wrote:
> > > On Mon, Dec 21, 2020 at 11:41 PM Jason Wang <jasowang@redhat.com>
> > wrote:
> > >>
> > >> On 2020/12/22 7:07 AM, Willem de Bruijn wrote:
> > >>> On Wed, Dec 16, 2020 at 3:20 AM wangyunjian<wangyunjian@huawei.com>
> > wrote:
> > >>>> From: Yunjian Wang<wangyunjian@huawei.com>
> > >>>>
> > >>>> Currently we break the loop and wake up the vhost_worker when
> > >>>> sendmsg fails. When the worker wakes up again, we'll meet the same
> > >>>> error.
> > >>> The patch is based on the assumption that such error cases always
> > >>> return EAGAIN. Can it not also be ENOMEM, such as from tun_build_skb?
> > >>>
> > >>>> This will cause high CPU load. To fix this issue, we can skip this
> > >>>> description by ignoring the error. When we exceeds sndbuf, the
> > >>>> return value of sendmsg is -EAGAIN. In the case we don't skip the
> > >>>> description and don't drop packet.
> > >>> the -> that
> > >>>
> > >>> here and above: description -> descriptor
> > >>>
> > >>> Perhaps slightly revise to more explicitly state that
> > >>>
> > >>> 1. in the case of persistent failure (i.e., bad packet), the driver
> > >>> drops the packet 2. in the case of transient failure (e.g., memory
> > >>> pressure) the driver schedules the worker to try again later
> > >>
> > >> If we want to go this way, we need a better time to wake up the
> > >> worker. Otherwise it just produces more stress on the CPU, which is
> > >> what this patch tries to avoid.
> > > Perhaps I misunderstood the purpose of the patch: is it to drop
> > > everything, regardless of transient or persistent failure, until the
> > > ring runs out of descriptors?
> >
> >
> > My understanding is that the main motivation is to avoid high CPU utilization
> > when sendmsg() fails due to a guest reason (e.g. a bad packet).
> >
>
> My main motivation is to avoid the tx queue getting stuck.
>
> Should I describe it like this:
> Currently the driver doesn't drop a packet which can't be sent by tun
> (e.g. a bad packet). In this case, the driver will always process the
> same packet, leading to the tx queue getting stuck.
>
> To fix this issue:
> 1. in the case of persistent failure (e.g. a bad packet), the driver can skip
> this descriptor by ignoring the error.
> 2. in the case of transient failure (e.g. -EAGAIN and -ENOMEM), the driver
> schedules the worker to try again.

That sounds good to me, thanks.

> Thanks
>
> >
> > >
> > > I can understand both a blocking and drop strategy during memory
> > > pressure. But partial drop strategy until exceeding ring capacity
> > > seems like a peculiar hybrid?
> >
> >
> > Yes. So I wonder if we want to do better when we are under memory
> > pressure. E.g. can we let the socket wake us up instead of rescheduling
> > the workers here? At least in this case we know some memory might be freed.

I don't know whether a blocking or drop strategy is the better choice.
Either way, it probably deserves to be handled separately.

^ permalink raw reply	[flat|nested] 38+ messages in thread
