From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.3 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3A587C432BE for ; Mon, 23 Aug 2021 06:00:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1207261284 for ; Mon, 23 Aug 2021 06:00:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234842AbhHWGAr (ORCPT ); Mon, 23 Aug 2021 02:00:47 -0400 Received: from relay.sw.ru ([185.231.240.75]:59326 "EHLO relay.sw.ru" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231267AbhHWGAp (ORCPT ); Mon, 23 Aug 2021 02:00:45 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=virtuozzo.com; s=relay; h=Content-Type:MIME-Version:Date:Message-ID:From: Subject; bh=yhI7M6EFfWyK6pD4Yg7rL901LqGoNx39+qphy4MnlyE=; b=qocYmqO2Fxhx/yhGT zLCAi4lmWYs8XCKQ7guDdj2zbJhlxne211slE+YHKS67j2M9cJsX8MOwS4v/xyacIKcbJcFWtKHwS etNI/yO7lo6ojx+Cq4qjcemTtpAAo4vRMQSJH8NQPfICnJmip7ky0YB2j4KAERsi6Pp4MTlP6gKtc =; Received: from [10.93.0.56] by relay.sw.ru with esmtp (Exim 4.94.2) (envelope-from ) id 1mI302-008Y5t-Rk; Mon, 23 Aug 2021 08:59:58 +0300 Subject: Re: [PATCH NET v4 3/7] ipv6: use skb_expand_head in ip6_xmit From: Vasily Averin To: Christoph Paasch Cc: "David S. Miller" , Hideaki YOSHIFUJI , David Ahern , Jakub Kicinski , Eric Dumazet , netdev , LKML , kernel@openvz.org, Julian Wiedmann References: <77f3e358-c75e-b0bf-ca87-6f8297f5593c@virtuozzo.com> <7d71f2cc-3fff-beff-b82b-d0a81fb60429@virtuozzo.com> Message-ID: <6858f130-e6b4-1ba7-ed6f-58c00152be69@virtuozzo.com> Date: Mon, 23 Aug 2021 08:59:58 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: <7d71f2cc-3fff-beff-b82b-d0a81fb60429@virtuozzo.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 8/23/21 8:44 AM, Vasily Averin wrote: > On 8/22/21 8:13 PM, Christoph Paasch wrote: >> On Sun, Aug 22, 2021 at 10:04 AM Christoph Paasch >> wrote: >>> >>> Hello Vasily, >>> >>> On Fri, Aug 20, 2021 at 11:21 PM Vasily Averin wrote: >>>> >>>> On 8/21/21 1:44 AM, Christoph Paasch wrote: >>>>> (resend without html - thanks gmail web-interface...) >>>>> On Fri, Aug 20, 2021 at 3:41 PM Christoph Paasch >>>>>> AFAICS, this is because pskb_expand_head (called from >>>>>> skb_expand_head) is not adjusting skb->truesize when skb->sk is set >>>>>> (which I guess is the case in this particular scenario). I'm not >>>>>> sure what the proper fix would be though... >>>> >>>> Could you please elaborate? >>>> it seems to me skb_realloc_headroom used before my patch called pskb_expand_head() too >>>> and did not adjusted skb->truesize too. Am I missed something perhaps? >>>> >>>> The only difference in my patch is that skb_clone can be not called, >>>> though I do not understand how this can affect skb->truesize. >>> >>> I *believe* that the difference is that after skb_clone() skb->sk is >>> NULL and thus truesize will be adjusted. >>> >>> I will try to confirm that with some more debugging. >> >> Yes indeed. >> >> Before your patch: >> [ 19.154039] ip6_xmit before realloc truesize 4864 sk? 000000002ccd6868 >> [ 19.155230] ip6_xmit after realloc truesize 5376 sk? 0000000000000000 >> >> skb->sk is not set and thus truesize will be adjusted. > > This looks strange for me. skb should not lost sk reference. > > Could you please clarify where exactly you cheked it? > sk on newly allocated skb is set on line 291 > > net/ipv6/ip6_output.c::ip6_xmit() > 282 if (unlikely(skb_headroom(skb) < head_room)) { > 283 struct sk_buff *skb2 = skb_realloc_headroom(skb, head_room); > 284 if (!skb2) { > 285 IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)), > 286 IPSTATS_MIB_OUTDISCARDS); > 287 kfree_skb(skb); > 288 return -ENOBUFS; > 289 } > 290 if (skb->sk) > 291 skb_set_owner_w(skb2, skb->sk); <<<<< here > 292 consume_skb(skb); > 293 skb = skb2; > 294 } > >> With your change: >> [ 15.092933] ip6_xmit before realloc truesize 4864 sk? 00000000072930fd >> [ 15.094131] ip6_xmit after realloc truesize 4864 sk? 00000000072930fd >> >> skb->sk is set and thus truesize is not adjusted. > > In this case skb_set_owner_w() is called inside skb_expand_head() > > net/ipv6/ip6_output.c::ip6_xmit() > 265 if (unlikely(head_room > skb_headroom(skb))) { > 266 skb = skb_expand_head(skb, head_room); > 267 if (!skb) { > 268 IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTDISCARDS); > 269 return -ENOBUFS; > 270 } > 271 } > > net/core/skbuff.c::skb_expand_head() > 1813 if (skb_shared(skb)) { > 1814 struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC); > 1815 > 1816 if (likely(nskb)) { > 1817 if (skb->sk) > 1818 skb_set_owner_w(nskb, skb->sk); <<<< here > 1819 consume_skb(skb); > 1820 } else { > 1821 kfree_skb(skb); > 1822 } > 1823 skb = nskb; > 1824 } > > So I do not understand how this can happen. > With my patch: > a) if skb is not shared -- it should keep original skb->sk > b) if skb is shared -- new skb should set sk if it was set on original skb. > > Your results can be explained if you looked and skb->sk and truesize right after skb_realloc_headroom() call > but before following skb_set_owner_w(). Could you please check it? It seems I've found the reason: before my change pskb_expand_head() is called for newly cloned skb where sk was not set. after my change skb->sk is set before following pskb_expand_head() call On own turn pskb_expand_head() adjust truesize: net/core/skbuff.c::pskb_expand_head() 1751 /* It is not generally safe to change skb->truesize. 1752 * For the moment, we really care of rx path, or 1753 * when skb is orphaned (not attached to a socket). 1754 */ 1755 if (!skb->sk || skb->destructor == sock_edemux) 1756 skb->truesize += size - osize; 1757 1758 return 0; Could you please confirm it? Thank you, Vasily Averin