From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Andreas Hartmann <andihartmann@01019freenet.de>
Cc: Michal Kubecek <mkubecek@suse.cz>,
	Jason Wang <jasowang@redhat.com>,
	David Miller <davem@davemloft.net>,
	Network Development <netdev@vger.kernel.org>
Subject: Re: Linux 4.14 - regression: broken tun/tap / bridge network with virtio - bisected
Date: Thu, 14 Dec 2017 17:17:45 -0500
Message-ID: <CAF=yD-JnDSfH-2wMjN5BUVdXeezp+k4ievSuvVCvEjX-jncD7Q@mail.gmail.com>
In-Reply-To: <d71df64e-e65f-4db4-6f2e-c002c15fcbe4@01019freenet.de>

>> Well, the patch does not fix hanging VMs, which have been shutdown and
>> can't be killed any more.
>> Because of the stack trace
>>
>> [<ffffffffc0d0e3c5>] vhost_net_ubuf_put_and_wait+0x35/0x60 [vhost_net]
>> [<ffffffffc0d0f264>] vhost_net_ioctl+0x304/0x870 [vhost_net]
>> [<ffffffff9b25460f>] do_vfs_ioctl+0x8f/0x5c0
>> [<ffffffff9b254bb4>] SyS_ioctl+0x74/0x80
>> [<ffffffff9b00365b>] do_syscall_64+0x5b/0x100
>> [<ffffffff9b78e7ab>] entry_SYSCALL64_slow_path+0x25/0x25
>> [<ffffffffffffffff>] 0xffffffffffffffff
>>
>> I was hoping, that the problems could be related - but that seems not to
>> be true.
>
> However, it turned out, that reverting the complete patchset "Remove UDP
> Fragmentation Offload support" prevent hanging qemu processes.

That implies a combination of UFO and vhost zerocopy. Disabling
experimental_zcopytx in vhost_net will probably work around the bug
then.

On the surface the two features are independent. Most of the relevant
UFO code is reverted with the patch mentioned earlier. Missing from
that is protocol stack support, but it is unlikely that your host OS is
generating these UFO packets.

They are coming from a guest over virtio_net, to which vhost_net then
applies zerocopy. The packets are then either freed without calling
uarg->callback() or queued somewhere for a very long time.

Looking at the diff-of-diffs between my stable patch and your full revert,
the majority of missing bits besides the protocol layer is in device driver
support. Removing that causes the UFO packets to be segmented at any
dev_queue_xmit on their path. skb_segment ensures that when it segments
a large zerocopy packet, all new segments also point to the zerocopy
callback struct (ubuf_info), as the shared memory pages may not be
released until all skbs pointing to them are freed.

That sharing may be unsafe with vhost_zerocopy_callback, which does not use
refcounting. I will look into that. It may be that before the msg_zerocopy
patchsets large packets were copied before entering segmentation. It is
safe to enter segmentation for msg_zerocopy skbs, but not legacy zerocopy
skbs.
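
To make the concern concrete, here is a minimal userspace sketch of the
hypothesis, not kernel code. All names (model_ubuf, model_skb,
model_segment_and_free) are hypothetical stand-ins, and the behaviour of the
non-refcounted path (the callback runs on every skb free) is an assumption
made for illustration only. It does not claim to reproduce what
vhost_zerocopy_callback actually does.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the zerocopy callback struct (ubuf_info). */
struct model_ubuf {
	int refcnt;       /* skbs currently pointing at the shared pages */
	int completions;  /* how many times the completion callback ran */
	int use_refcnt;   /* 1: refcounted completion, 0: assumed legacy style */
};

/* Hypothetical stand-in for an skb that borrows zerocopy pages. */
struct model_skb {
	struct model_ubuf *uarg;
};

static void model_callback(struct model_ubuf *uarg)
{
	uarg->completions++;
}

static struct model_skb *model_skb_alloc(struct model_ubuf *uarg)
{
	struct model_skb *skb = malloc(sizeof(*skb));

	assert(skb != NULL);
	skb->uarg = uarg;
	uarg->refcnt++;   /* every skb sharing the pages holds a reference */
	return skb;
}

static void model_skb_free(struct model_skb *skb)
{
	struct model_ubuf *uarg = skb->uarg;

	uarg->refcnt--;
	if (uarg->use_refcnt) {
		/* Notify the owner only once the last user is gone. */
		if (uarg->refcnt == 0)
			model_callback(uarg);
	} else {
		/* Assumption: without refcounting, every free notifies. */
		model_callback(uarg);
	}
	free(skb);
}

/*
 * Segment one large zerocopy skb into nsegs smaller ones that all point
 * to the same callback struct, free everything, and return how many
 * times the completion callback fired.
 */
int model_segment_and_free(int nsegs, int use_refcnt)
{
	struct model_ubuf uarg = { 0, 0, use_refcnt };
	struct model_skb *big, **segs;
	int i;

	assert(nsegs > 0);
	big = model_skb_alloc(&uarg);
	segs = malloc((size_t)nsegs * sizeof(*segs));
	assert(segs != NULL);

	for (i = 0; i < nsegs; i++)
		segs[i] = model_skb_alloc(&uarg);

	model_skb_free(big);          /* original is consumed after segmentation */
	for (i = 0; i < nsegs; i++)
		model_skb_free(segs[i]);

	free(segs);
	return uarg.completions;
}
```

In this model, the refcounted variant reports exactly one completion after
the last segment is freed, while the non-refcounted variant reports one per
freed skb, the first of them while other segments still reference the pages.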

I will also set up two VMs and try to send UFO packets and see whether
they indeed are freed somewhere in the stack without notifying vhost_net.
