Subject: Re: [PATCH net-next v4] skb_expand_head() adjust skb->truesize incorrectly
From: Vasily Averin
To: Eric Dumazet, Christoph Paasch, "David S.
Miller"
Cc: Hideaki YOSHIFUJI, David Ahern, Jakub Kicinski, netdev, linux-kernel@vger.kernel.org, kernel@openvz.org, Alexey Kuznetsov, Julian Wiedmann
Message-ID: <2984f16b-7f20-e72d-1661-b942fdc4ff9b@virtuozzo.com>
Date: Thu, 2 Sep 2021 10:33:09 +0300

On 9/2/21 10:13 AM, Vasily Averin wrote:
> On 9/2/21 7:48 AM, Eric Dumazet wrote:
>> On 9/1/21 9:32 PM, Eric Dumazet wrote:
>>> I think you missed netem case, in particular
>>> skb_orphan_partial() which I already pointed out.
>>>
>>> You can setup a stack of virtual devices (tunnels),
>>> with a qdisc on them, before ip6_xmit() is finally called...
>>>
>>> Socket might have been closed already.
>>>
>>> To test your patch, you could force a skb_orphan_partial() at the beginning
>>> of skb_expand_head() (extending code coverage)
>>
>> To clarify :
>>
>> It is ok to 'downgrade' an skb->destructor having a ref on sk->sk_wmem_alloc to
>> something owning a ref on sk->sk_refcnt.
>>
>> But the opposite operation (ref on sk->sk_refcnt --> ref on sk->sk_wmem_alloc) is not safe.
>
> Could you please explain in more detail, since I still have a completely opposite point of view?
>
> Every sk referenced in an skb has sk_wmem_alloc > 0.
> It is assigned to 1 in sk_alloc() and decremented right before the last __sk_free(),
> inside sk_free(), sock_wfree() and __sock_wfree().
>
> So it is safe to adjust skb->sk->sk_wmem_alloc,
> because a live skb keeps a reference to a live sk, and the latter keeps sk_wmem_alloc > 0.
>
> So any destructor that uses sk->sk_refcnt will already have sk_wmem_alloc > 0,
> because the last sock_put() calls sk_free().
>
> However, now I'm not sure about the reverse direction.
> skb_set_owner_w() checks !sk_fullsock(sk) and calls sock_hold(sk);
> if sk->sk_refcnt can be 0 here (i.e. after execution of the old destructor inside skb_orphan),
> it can trigger the problem you pointed out:
> "refcount_add() will trigger a warning (panic under KASAN)".
>
> Could you please explain where I'm wrong?

To clarify:
I agree it is unsafe to call the following on a live skb:

  skb_orphan(skb)
  adjust(skb->sk->sk_wmem_alloc)

for two reasons:
1) the old destructor can decrease sk_wmem_alloc to zero and free sk;
2) if !sk_fullsock(sk), the old destructor can call sock_put() and release the last
   sk->sk_refcnt reference. In this case sock_hold() will trigger a warning.

1) can be handled: we can adjust(sk_wmem_alloc) before skb_orphan(),
but I do not understand well how to handle the 2nd case.

Thank you,
	Vasily Averin
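
For illustration only, the two cases above can be modeled in a small userspace sketch. This is not kernel code: struct toy_sock and every toy_* helper are hypothetical names, plain ints stand in for refcount_t, and "freeing" is just a flag. The only behavior it assumes is what the message describes: sk_wmem_alloc starts at 1, the last sock_put() drops that unit, and taking a reference on a zero sk_refcnt is reported as a warning.

```c
/* Toy userspace model, NOT kernel code: names mirror the helpers discussed
 * in this thread, but the bodies are simplified assumptions. */
#include <assert.h>
#include <stdbool.h>

struct toy_sock {
	int refcnt;      /* models sk->sk_refcnt */
	int wmem_alloc;  /* models sk->sk_wmem_alloc */
	bool freed;      /* set when the model's __sk_free() would run */
	int warnings;    /* counts references taken on a zero refcnt */
};

static void toy_sk_init(struct toy_sock *sk)
{
	sk->refcnt = 1;
	sk->wmem_alloc = 1;   /* assigned to 1, as described for sk_alloc() */
	sk->freed = false;
	sk->warnings = 0;
}

/* Drop n units of wmem_alloc; dropping the last unit frees the socket. */
static void toy_wmem_sub(struct toy_sock *sk, int n)
{
	assert(sk->wmem_alloc >= n);
	sk->wmem_alloc -= n;
	if (sk->wmem_alloc == 0)
		sk->freed = true;
}

/* Models sock_put(): the last sk_refcnt reference calls sk_free(). */
static void toy_sock_put(struct toy_sock *sk)
{
	if (--sk->refcnt == 0)
		toy_wmem_sub(sk, 1);
}

/* Models sock_hold(): taking a reference on a zero count is a bug. */
static void toy_sock_hold(struct toy_sock *sk)
{
	if (sk->refcnt == 0)
		sk->warnings++;
	sk->refcnt++;
}

/* Case 1: the skb is charged `truesize` on wmem_alloc, the owner has closed.
 * Returns true if the old destructor freed the socket. */
bool toy_case1(bool adjust_first, int truesize, int delta)
{
	struct toy_sock sk;

	toy_sk_init(&sk);
	sk.wmem_alloc += truesize;    /* charge taken when the skb got its owner */
	toy_sock_put(&sk);            /* owner closes; the skb keeps sk alive */
	if (adjust_first)
		sk.wmem_alloc += delta;   /* adjust before the old destructor runs */
	toy_wmem_sub(&sk, truesize);  /* skb_orphan() runs the old destructor */
	return sk.freed;              /* true: a later adjust would touch freed sk */
}

/* Case 2: the skb holds the last sk_refcnt reference.
 * Returns the number of warnings raised by the later hold. */
int toy_case2(void)
{
	struct toy_sock sk;

	toy_sk_init(&sk);             /* the only reference belongs to the skb */
	toy_sock_put(&sk);            /* skb_orphan(): old destructor does sock_put() */
	toy_sock_hold(&sk);           /* the hold taken for a !sk_fullsock(sk) socket */
	return sk.warnings;
}
```

In this model toy_case1(false, 100, 20) reports that the destructor freed the socket, toy_case1(true, 100, 20) does not, and toy_case2() counts one warning. That matches the reasoning above: adjusting before skb_orphan() covers the first case, while the second case has no equally simple reordering.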