From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Andreas Hartmann <andihartmann@01019freenet.de>
Cc: Michal Kubecek <mkubecek@suse.cz>,
	Jason Wang <jasowang@redhat.com>,
	David Miller <davem@davemloft.net>,
	Network Development <netdev@vger.kernel.org>
Subject: Re: Linux 4.14 - regression: broken tun/tap / bridge network with virtio - bisected
Date: Thu, 14 Dec 2017 17:47:31 -0500
Message-ID: <CAF=yD-+ZvcVjcNm_7QWCFNTewcJkNtoufP1vY_Mi0mJRK5GVyQ@mail.gmail.com>
In-Reply-To: <CAF=yD-JnDSfH-2wMjN5BUVdXeezp+k4ievSuvVCvEjX-jncD7Q@mail.gmail.com>

On Thu, Dec 14, 2017 at 5:17 PM, Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>>> Well, the patch does not fix hanging VMs, which have been shutdown and
>>> can't be killed any more.
>>> Because of the stack trace
>>>
>>> [<ffffffffc0d0e3c5>] vhost_net_ubuf_put_and_wait+0x35/0x60 [vhost_net]
>>> [<ffffffffc0d0f264>] vhost_net_ioctl+0x304/0x870 [vhost_net]
>>> [<ffffffff9b25460f>] do_vfs_ioctl+0x8f/0x5c0
>>> [<ffffffff9b254bb4>] SyS_ioctl+0x74/0x80
>>> [<ffffffff9b00365b>] do_syscall_64+0x5b/0x100
>>> [<ffffffff9b78e7ab>] entry_SYSCALL64_slow_path+0x25/0x25
>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>
>>> I was hoping, that the problems could be related - but that seems not to
>>> be true.
>>
>> However, it turned out that reverting the complete patchset "Remove UDP
>> Fragmentation Offload support" prevents hanging qemu processes.
>
> That implies a combination of UFO and vhost zerocopy. Disabling
> experimental_zcopytx in vhost_net will probably work around the bug
> then.
>
> On the surface the two features are independent. Most of the relevant
> UFO code is reverted with the patch mentioned earlier. Missing from
> that is protocol stack support, but it is unlikely that your host OS is
> generating these UFO packets.
>
> They are coming from a guest over virtio_net, to which vhost_net then
> applies zerocopy. Then the packet(s) is/are either freed without calling
> uarg->callback() or queued somewhere for a very long time.
>
> Looking at the diff-of-diffs between my stable patch and your full revert,
> the majority of missing bits beside the protocol layer is in device driver
> support. Removing that causes the UFO packets to be segmented at any
> dev_queue_xmit on their path. skb_segment ensures that when it segments
> a large zerocopy packet, all new segments also point to the zerocopy
> callback struct (ubuf_info), as the shared memory pages may not be
> released until all skbs pointing to them are freed.
>
> That may be wrong with vhost_zerocopy_callback, which does not use
> refcounting. I will look into that. It may be that before the msg_zerocopy
> patchsets large packets were copied before entering segmentation. It is
> safe to enter segmentation for msg_zerocopy skbs, but not legacy zerocopy
> skbs.

If this is the cause, then the following, while not a real solution, would
probably also resolve the observed issue.

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index e140ba49b30a..8fe5bca1d6ae 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3655,10 +3655,10 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
                skb_copy_from_linear_data_offset(head_skb, offset,
                                                 skb_put(nskb, hsize), hsize);

+               if (unlikely(skb_orphan_frags_rx(head_skb, GFP_ATOMIC)))
+                       goto err;
                skb_shinfo(nskb)->tx_flags |= skb_shinfo(head_skb)->tx_flags &
                                              SKBTX_SHARED_FRAG;
-               if (skb_zerocopy_clone(nskb, head_skb, GFP_ATOMIC))
-                       goto err;

This basically converts zerocopy TSO skbs to regular and calls their
uarg->callback just before segmenting them.
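
To make the ordering argument concrete, here is a toy userspace model, not
kernel code. All names (model_uarg, model_skb, model_orphan_frags,
model_free, model_segment_and_free) are hypothetical. It assumes, as the
hypothesis above does, that a callback without refcounting fires once for
every skb that still points at the shared callback struct when it is freed.
Under that assumption, segmenting first yields one notification per segment
plus one for the head, while orphaning first yields exactly one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for ubuf_info: counts completion notifications. */
struct model_uarg {
	int callbacks;
};

/* Hypothetical stand-in for an skb that may still borrow the sender's pages. */
struct model_skb {
	struct model_uarg *uarg;	/* non-NULL while pages are borrowed */
	size_t len;
};

/* Model of orphaning frags: detach from the borrowed pages, notify once. */
static void model_orphan_frags(struct model_skb *skb)
{
	if (!skb->uarg)
		return;
	skb->uarg->callbacks++;
	skb->uarg = NULL;
}

/* ASSUMPTION: without refcounting, every free of an attached skb notifies. */
static void model_free(struct model_skb *skb)
{
	if (skb->uarg)
		skb->uarg->callbacks++;
	skb->uarg = NULL;
}

/*
 * Split a packet of `len` bytes into segments of at most `mss` bytes, free
 * every segment and the head, and return how many notifications were sent.
 * With orphan_first the head is detached before any segment is created, so
 * no segment inherits the callback pointer.
 */
int model_segment_and_free(size_t len, size_t mss, bool orphan_first)
{
	struct model_uarg uarg = { .callbacks = 0 };
	struct model_skb head = { .uarg = &uarg, .len = len };
	size_t offset;

	assert(mss > 0);

	if (orphan_first)
		model_orphan_frags(&head);

	for (offset = 0; offset < head.len; offset += mss) {
		size_t left = head.len - offset;
		struct model_skb seg = {
			.uarg = head.uarg,	/* inherited only if still attached */
			.len = left < mss ? left : mss,
		};

		model_free(&seg);
	}
	model_free(&head);

	return uarg.callbacks;
}
```

Calling model_segment_and_free(4000, 1500, false) walks three segments and
the head, all still attached; with orphan_first set to true the same packet
produces a single notification. Whether vhost_zerocopy_callback actually
behaves like the per-free assumption here is exactly what still needs to be
checked.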
