From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andreas Hartmann Subject: Re: Linux 4.14 - regression: broken tun/tap / bridge network with virtio - bisected Date: Thu, 21 Dec 2017 18:05:54 +0100 Message-ID: References: <9615150a-eb78-2f9d-798f-6aa460932aec@01019freenet.de> <401a0715-fd28-63a3-8dfd-e89835d70db0@01019freenet.de> <11c25b88-af9b-a1f7-b5f5-0420c75916d7@01019freenet.de> <20171208084751.tom4auppogz4lanz@unicorn.suse.cz> <20171208114025.kjcaratqcveq7zu5@unicorn.suse.cz> <96a16c1f-c026-f506-78c1-dad88471361d@01019freenet.de> <6f75bdf5-839b-8c84-c8be-e83d071b245e@maya.org> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Cc: Michal Kubecek , Jason Wang , David Miller , Network Development To: Willem de Bruijn Return-path: Received: from mx2.mailbox.org ([80.241.60.215]:59262 "EHLO mx2.mailbox.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752322AbdLURFm (ORCPT ); Thu, 21 Dec 2017 12:05:42 -0500 In-Reply-To: Content-Language: en-US Sender: netdev-owner@vger.kernel.org List-ID: On 12/20/2017 at 11:44 PM Willem de Bruijn wrote: > On Wed, Dec 20, 2017 at 10:56 AM, Andreas Hartmann > wrote: >> On 12/18/2017 at 06:11 PM Andreas Hartmann wrote: >>> On 12/17/2017 at 11:33 PM Willem de Bruijn wrote: >> [...] >>>> I have been able to reproduce the hang by sending a UFO packet >>>> between two guests running v4.13 on a host running v4.15-rc1. >>>> >>>> The vhost_net_ubuf_ref refcount indeed hits overflow (-1) from >>>> vhost_zerocopy_callback being called for each segment of a >>>> segmented UFO skb. This refcount is decremented then on each >>>> segment, but incremented only once for the entire UFO skb. >>>> >>>> Before v4.14, these packets would be converted in skb_segment to >>>> regular copy packets with skb_orphan_frags and the callback function >>>> called once at this point. v4.14 added support for reference counted >>>> zerocopy skb that can pass through skb_orphan_frags unmodified and >>>> have their zerocopy state safely cloned with skb_zerocopy_clone. >>>> >>>> The call to skb_zerocopy_clone must come after skb_orphan_frags >>>> to limit cloning of this state to those skbs that can do so safely. >>>> >>>> Please try a host with the following patch. This fixes it for me. I intend to >>>> send it to net. >>>> >>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c >>>> index a592ca025fc4..d2d985418819 100644 >>>> --- a/net/core/skbuff.c >>>> +++ b/net/core/skbuff.c >>>> @@ -3654,8 +3654,6 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb, >>>> >>>> skb_shinfo(nskb)->tx_flags |= skb_shinfo(head_skb)->tx_flags & >>>> SKBTX_SHARED_FRAG; >>>> - if (skb_zerocopy_clone(nskb, head_skb, GFP_ATOMIC)) >>>> - goto err; >>>> >>>> while (pos < offset + len) { >>>> if (i >= nfrags) { >>>> @@ -3681,6 +3679,8 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb, >>>> >>>> if (unlikely(skb_orphan_frags(frag_skb, GFP_ATOMIC))) >>>> goto err; >>>> + if (skb_zerocopy_clone(nskb, frag_skb, GFP_ATOMIC)) >>>> + goto err; >>>> >>>> *nskb_frag = *frag; >>>> __skb_frag_ref(nskb_frag); >>>> >>>> >>>> This is relatively inefficient, as it calls skb_zerocopy_clone for each frag >>>> in the frags[] array. I will follow-up with a patch to net-next that only >>>> checks once per skb: >>>> >>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c >>>> index 466581cf4cdc..a293a33604ec 100644 >>>> --- a/net/core/skbuff.c >>>> +++ b/net/core/skbuff.c >>>> @@ -3662,7 +3662,8 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb, >>>> >>>> skb_shinfo(nskb)->tx_flags |= skb_shinfo(head_skb)->tx_flags & >>>> SKBTX_SHARED_FRAG; >>>> - if (skb_zerocopy_clone(nskb, head_skb, GFP_ATOMIC)) >>>> + if (skb_orphan_frags(frag_skb, GFP_ATOMIC) || >>>> + skb_zerocopy_clone(nskb, frag_skb, GFP_ATOMIC)) >>>> goto err; >>>> >>>> while (pos < offset + len) { >>>> @@ -3676,6 +3677,11 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb, >>>> >>>> BUG_ON(!nfrags); >>>> >>>> + if (skb_orphan_frags(frag_skb, GFP_ATOMIC) || >>>> + skb_zerocopy_clone(nskb, frag_skb, >>>> + GFP_ATOMIC)) >>>> + goto err; >>>> + >>>> list_skb = list_skb->next; >>>> } >>>> >>>> @@ -3687,9 +3693,6 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb, >>>> goto err; >>>> } >>>> >>>> - if (unlikely(skb_orphan_frags(frag_skb, GFP_ATOMIC))) >>>> - goto err; >>>> - >>> >>> I'm currently testing this one. >>> >> >> Test is in progress. I'm testing w/ 4.14.7, which already contains "net: >> accept UFO datagrams from tuntap and packet". >> >> At first, I tested an unpatched 4.14.7 - the problem (no more killable >> qemu-process) did occur promptly on shutdown of the machine. This was >> expected. >> >> Next, I applied the above patch (the second one). Until now, I didn't >> face any problem any more on shutdown of VMs. Looks promising. > > Thanks for testing. > > I sent the first, simpler, one to net together with another fix. > > http://patchwork.ozlabs.org/patch/851715/ > If I'm using the second patch above (the more efficient one and not "[net,1/2] skbuff: orphan frags before zerocopy clone"), which I'm already testing here: Is it still necessary to apply this patch "[net,2/2] skbuff: skb_copy_ubufs must release uarg even without user frags"? Thanks, Andreas