From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: wangyunjian <wangyunjian@huawei.com>
Cc: Network Development <netdev@vger.kernel.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	virtualization@lists.linux-foundation.org,
	"Lilijun (Jerry)" <jerry.lilijun@huawei.com>,
	chenchanghu <chenchanghu@huawei.com>,
	xudingke <xudingke@huawei.com>,
	"huangbin (J)" <brian.huangbin@huawei.com>
Subject: Re: [PATCH net v3 2/2] vhost_net: fix tx queue stuck when sendmsg fails
Date: Wed, 23 Dec 2020 12:05:38 -0500
Message-ID: <CAF=yD-KSm4fTWUZy1F2gFOw-qLmMV76rHmzcr05Upz9WV=SXvg@mail.gmail.com>
In-Reply-To: <1608734856-12516-1-git-send-email-wangyunjian@huawei.com>

On Wed, Dec 23, 2020 at 9:47 AM wangyunjian <wangyunjian@huawei.com> wrote:
>
> From: Yunjian Wang <wangyunjian@huawei.com>
>
> Currently the driver don't drop a packet which can't be send by tun
> (e.g bad packet). In this case, the driver will always process the
> same packet lead to the tx queue stuck.
>
> To fix this issue:
> 1. in the case of persistent failure (e.g bad packet), the driver
> can skip this descriptior by ignoring the error.
> 2. in the case of transient failure (e.g -EAGAIN and -ENOMEM), the
> driver schedules the worker to try again.
>
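
The two cases in the commit message can be sketched as a pair of small
userspace predicates. This is only an illustration under stated
assumptions: tx_error_is_transient() and tx_error_is_persistent() are
hypothetical helpers, not functions in drivers/vhost/net.c, and the set of
errno values treated as transient is taken directly from the patch
description (-EAGAIN and -ENOMEM).

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of vhost): true when a negative
 * sendmsg() result should be retried later instead of being dropped.
 */
bool tx_error_is_transient(int err)
{
	return err == -EAGAIN || err == -ENOMEM;
}

/*
 * Hypothetical helper: true when the error is persistent, i.e. the
 * descriptor should be skipped so the queue can make progress.
 */
bool tx_error_is_persistent(int err)
{
	return err < 0 && !tx_error_is_transient(err);
}
```

With this split, only the transient case would discard the descriptor and
re-enable the vq for a later retry; every other negative result lets the
loop move on to the next descriptor.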

Fixes: 3a4d5c94e959 ("vhost_net: a kernel-level virtio server")

Since I have a few other comments anyway, some minor typo corrections too:
don't -> doesn't, send -> sent, descriptior -> descriptor.

> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
>
>  drivers/vhost/net.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index c8784dfafdd7..e49dd64d086a 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -827,9 +827,8 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
>                                 msg.msg_flags &= ~MSG_MORE;
>                 }
>
> -               /* TODO: Check specific error and bomb out unless ENOBUFS? */
>                 err = sock->ops->sendmsg(sock, &msg, len);
> -               if (unlikely(err < 0)) {
> +               if (unlikely(err == -EAGAIN || err == -ENOMEM)) {
>                         vhost_discard_vq_desc(vq, 1);
>                         vhost_net_enable_vq(net, vq);
>                         break;
> @@ -922,7 +921,6 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
>                         msg.msg_flags &= ~MSG_MORE;
>                 }
>
> -               /* TODO: Check specific error and bomb out unless ENOBUFS? */
>                 err = sock->ops->sendmsg(sock, &msg, len);
>                 if (unlikely(err < 0)) {
>                         if (zcopy_used) {
> @@ -931,9 +929,11 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
>                                 nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
>                                         % UIO_MAXIOV;
>                         }
> -                       vhost_discard_vq_desc(vq, 1);
> -                       vhost_net_enable_vq(net, vq);
> -                       break;
> +                       if (err == -EAGAIN || err == -ENOMEM) {
> +                               vhost_discard_vq_desc(vq, 1);
> +                               vhost_net_enable_vq(net, vq);
> +                               break;
> +                       }
>                 }
>                 if (err != len)
>                         pr_debug("Truncated TX packet: "

Probably my bad for feedback in patch 2/2, but now vhost will
incorrectly log bad packets as truncated packets: a dropped packet
falls through with err < 0, which also satisfies err != len.

This will need to be if (err >= 0 && err != len).
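
To make the logging issue concrete, here is a minimal userspace sketch of
the suggested condition. tx_is_truncated() is a hypothetical name used
only for illustration; in the driver the check would stay inline in front
of pr_debug(). The point is that a negative errno from a dropped bad
packet should not be reported as a truncated send.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper mirroring the suggested check: only a successful
 * but short send counts as truncation; a negative errno does not.
 */
bool tx_is_truncated(int err, size_t len)
{
	return err >= 0 && (size_t)err != len;
}
```

For example, a 64-byte packet for which sendmsg() returns 60 is truncated,
while one for which it returns -EINVAL is a drop, not a truncation.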

It would be nice if we could notify the guest in the transmit
descriptor when a packet was dropped due to failing integrity checks
(bad packet). But I don't think we easily can, so that is out of scope
for this fix.
