From mboxrd@z Thu Jan 1 00:00:00 1970
From: Willem de Bruijn
Subject: Re: Linux 4.14 - regression: broken tun/tap / bridge network with virtio - bisected
Date: Thu, 14 Dec 2017 17:17:45 -0500
Message-ID:
References: <9615150a-eb78-2f9d-798f-6aa460932aec@01019freenet.de>
 <2e2392b7-84c5-be89-b0e5-5bae3b2fdaed@01019freenet.de>
 <4efbaf24-f419-2c8e-c705-59a5242b0575@01019freenet.de>
 <881560f8-54ec-e946-50cb-b2e80ddb5f97@01019freenet.de>
 <73b7a7b0-4264-2bd0-9e65-69841377f09f@redhat.com>
 <401a0715-fd28-63a3-8dfd-e89835d70db0@01019freenet.de>
 <11c25b88-af9b-a1f7-b5f5-0420c75916d7@01019freenet.de>
 <20171208084751.tom4auppogz4lanz@unicorn.suse.cz>
 <20171208114025.kjcaratqcveq7zu5@unicorn.suse.cz>
 <96a16c1f-c026-f506-78c1-dad88471361d@01019freenet.de>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Cc: Michal Kubecek, Jason Wang, David Miller, Network Development
To: Andreas Hartmann
Return-path:
Received: from mail-oi0-f50.google.com ([209.85.218.50]:39276 "EHLO
 mail-oi0-f50.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1753272AbdLNWS0 (ORCPT ); Thu, 14 Dec 2017 17:18:26 -0500
Received: by mail-oi0-f50.google.com with SMTP id r63so4956843oia.6
 for ; Thu, 14 Dec 2017 14:18:26 -0800 (PST)
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

>> Well, the patch does not fix hanging VMs, which have been shutdown and
>> can't be killed any more.
>> Because of the stack trace
>>
>> [] vhost_net_ubuf_put_and_wait+0x35/0x60 [vhost_net]
>> [] vhost_net_ioctl+0x304/0x870 [vhost_net]
>> [] do_vfs_ioctl+0x8f/0x5c0
>> [] SyS_ioctl+0x74/0x80
>> [] do_syscall_64+0x5b/0x100
>> [] entry_SYSCALL64_slow_path+0x25/0x25
>> [] 0xffffffffffffffff
>>
>> I was hoping, that the problems could be related - but that seems not to
>> be true.
>
> However, it turned out, that reverting the complete patchset "Remove UDP
> Fragmentation Offload support" prevent hanging qemu processes.

That implies a combination of UFO and vhost zerocopy.
Disabling experimental_zcopytx in vhost_net will probably work around
the bug then.

On the surface the two features are independent. Most of the relevant
UFO code is reverted with the patch mentioned earlier. Missing from
that is protocol stack support, but it is unlikely that your host OS
is generating these UFO packets. They are coming from a guest over
virtio_net, to which vhost_net then applies zerocopy. After that, the
packets are either freed without calling uarg->callback() or queued
somewhere for a very long time.

Looking at the diff-of-diffs between my stable patch and your full
revert, the majority of the missing bits besides the protocol layer is
in device driver support. Removing that causes the UFO packets to be
segmented at any dev_queue_xmit on their path.

skb_segment ensures that when it segments a large zerocopy packet,
all new segments also point to the zerocopy callback struct
(ubuf_info), as the shared memory pages may not be released until
all skbs pointing to them are freed. That may be wrong with
vhost_zerocopy_callback, which does not use refcounting. I will
look into that.

It may be that before the msg_zerocopy patchsets large packets were
copied before entering segmentation. It is safe to enter segmentation
for msg_zerocopy skbs, but not for legacy zerocopy skbs.

I will also set up two VMs and try to send UFO packets and see
whether they indeed are freed somewhere in the stack without
notifying vhost_net.
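
To make the refcounting concern concrete, here is a small userspace toy
model. It is not kernel code and not a diagnosis: toy_ubuf, toy_skb and
every function below are hypothetical names invented for illustration,
and the real failure mode in vhost_net may well differ. It only shows why
a completion callback that fires on every free behaves differently from a
refcounted one once a single large packet has been split into several
segments that all point at the same callback struct.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a zerocopy callback struct; NOT the kernel's ubuf_info. */
struct toy_ubuf {
    int refcnt;       /* toy packets still pointing at the shared pages */
    int completions;  /* how often the owner was told "pages are free" */
};

/* Toy stand-in for an skb that may reference shared user pages. */
struct toy_skb {
    struct toy_ubuf *uarg;
};

enum { TOY_MAX_SEGS = 8 };

/* Split one large packet: every segment points at the same toy_ubuf
 * and takes its own reference. */
static void toy_segment(const struct toy_skb *big,
                        struct toy_skb *segs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        segs[i].uarg = big->uarg;
        if (big->uarg)
            big->uarg->refcnt++;
    }
}

/* Refcounted free: notify the owner only when the last reference drops. */
static void toy_free_refcounted(struct toy_skb *skb)
{
    struct toy_ubuf *u = skb->uarg;

    if (!u)
        return;
    assert(u->refcnt > 0);
    if (--u->refcnt == 0)
        u->completions++;
    skb->uarg = NULL;
}

/* Non-refcounted free: notify the owner on every free, ignoring refcnt. */
static void toy_free_legacy(struct toy_skb *skb)
{
    if (skb->uarg) {
        skb->uarg->completions++;
        skb->uarg = NULL;
    }
}

/* Segment one packet into n pieces, free the original and all pieces,
 * and return how many completion notifications the owner received. */
static int toy_run(size_t n, int refcounted)
{
    struct toy_ubuf u = { .refcnt = 1, .completions = 0 };
    struct toy_skb big = { .uarg = &u };
    struct toy_skb segs[TOY_MAX_SEGS];
    void (*put)(struct toy_skb *) =
        refcounted ? toy_free_refcounted : toy_free_legacy;

    assert(n <= TOY_MAX_SEGS);
    toy_segment(&big, segs, n);
    put(&big);
    for (size_t i = 0; i < n; i++)
        put(&segs[i]);
    return u.completions;
}
```

In this model the refcounted variant notifies the owner exactly once no
matter how many segments were created, while the per-free variant
notifies once per packet freed, so the owner's accounting no longer
matches the number of buffers it actually handed out.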