From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7EE8BC04AA5 for ; Thu, 25 Aug 2022 12:29:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241632AbiHYM33 (ORCPT ); Thu, 25 Aug 2022 08:29:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33484 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236033AbiHYM31 (ORCPT ); Thu, 25 Aug 2022 08:29:27 -0400 Received: from szxga08-in.huawei.com (szxga08-in.huawei.com [45.249.212.255]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8A49EB2CDC; Thu, 25 Aug 2022 05:29:25 -0700 (PDT) Received: from dggemv711-chm.china.huawei.com (unknown [172.30.72.53]) by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4MD2Hg4kgtz1N7WY; Thu, 25 Aug 2022 20:25:51 +0800 (CST) Received: from kwepemm600001.china.huawei.com (7.193.23.3) by dggemv711-chm.china.huawei.com (10.1.198.66) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Thu, 25 Aug 2022 20:29:22 +0800 Received: from [10.174.176.245] (10.174.176.245) by kwepemm600001.china.huawei.com (7.193.23.3) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Thu, 25 Aug 2022 20:29:21 +0800 Message-ID: Date: Thu, 25 Aug 2022 20:29:21 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.7.0 Subject: Re: [PATCH net] net/sched: fix netdevice reference leaks in attach_one_default_qdisc() From: "wanghai (M)" To: Jakub Kicinski CC: , , , , , , , , References: <20220817104646.22861-1-wanghai38@huawei.com> <20220818105642.6d58e9d4@kernel.org> In-Reply-To: Content-Type: text/plain; charset="UTF-8"; format=flowed Content-Transfer-Encoding: 8bit X-Originating-IP: [10.174.176.245] X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To kwepemm600001.china.huawei.com (7.193.23.3) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 在 2022/8/19 23:58, wanghai (M) 写道: > > 在 2022/8/19 1:56, Jakub Kicinski 写道: >> On Wed, 17 Aug 2022 18:46:46 +0800 Wang Hai wrote: >>> In attach_default_qdiscs(), when attach default qdisc (fq_codel) fails >>> and fallback to noqueue, if the original attached qdisc is not released >>> and a new one is directly attached, this will cause netdevice reference >>> leaks. >> Could you provide more details on the failure path? My preference would >> be to try to clean up properly there, if possible. > Hi Jakub. > > Here are the details of the failure. Do I need to do cleanup under the > failed path? > > If a dev has multiple queues and queue 0 fails to attach qdisc > because there is no memory in attach_one_default_qdisc(). Then > dev->qdisc will be noop_qdisc by default. But the other queues > may be able to successfully attach to default qdisc. > > In this case, the fallback to noqueue process will be triggered > > static void attach_default_qdiscs(struct net_device *dev) > { >     ... >     if (!netif_is_multiqueue(dev) || >         dev->priv_flags & IFF_NO_QUEUE) { >             ... >             netdev_for_each_tx_queue(dev, attach_one_default_qdisc, > NULL); // queue 0 attach failed because -ENOBUFS, but the other queues > attach successfully >             qdisc = txq->qdisc_sleeping; >             rcu_assign_pointer(dev->qdisc, qdisc); // dev->qdisc = > &noop_qdisc >             ... >     } >     ... >     if (qdisc == &noop_qdisc) { >         ... >         netdev_for_each_tx_queue(dev, attach_one_default_qdisc, NULL); > // Re-attach, but not release the previously created qdisc >         ... >     } > } > Hi Jakub. Do you have any other suggestions for this patch? Any replies would be appreciated. >>> The following is the bug log: >>> >>> veth0: default qdisc (fq_codel) fail, fallback to noqueue >>> unregister_netdevice: waiting for veth0 to become free. Usage count >>> = 32 >>> leaked reference. >>>   qdisc_alloc+0x12e/0x210 >>>   qdisc_create_dflt+0x62/0x140 >>>   attach_one_default_qdisc.constprop.41+0x44/0x70 >>>   dev_activate+0x128/0x290 >>>   __dev_open+0x12a/0x190 >>>   __dev_change_flags+0x1a2/0x1f0 >>>   dev_change_flags+0x23/0x60 >>>   do_setlink+0x332/0x1150 >>>   __rtnl_newlink+0x52f/0x8e0 >>>   rtnl_newlink+0x43/0x70 >>>   rtnetlink_rcv_msg+0x140/0x3b0 >>>   netlink_rcv_skb+0x50/0x100 >>>   netlink_unicast+0x1bb/0x290 >>>   netlink_sendmsg+0x37c/0x4e0 >>>   sock_sendmsg+0x5f/0x70 >>>   ____sys_sendmsg+0x208/0x280 >>> >>> In attach_one_default_qdisc(), release the old one before attaching >>> a new qdisc to fix this bug. >>> >>> Fixes: bf6dba76d278 ("net: sched: fallback to qdisc noqueue if >>> default qdisc setup fail") >>> Signed-off-by: Wang Hai >>> --- >>>   net/sched/sch_generic.c | 5 +++++ >>>   1 file changed, 5 insertions(+) >>> >>> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c >>> index d47b9689eba6..87b61ef14497 100644 >>> --- a/net/sched/sch_generic.c >>> +++ b/net/sched/sch_generic.c >>> @@ -1140,6 +1140,11 @@ static void attach_one_default_qdisc(struct >>> net_device *dev, >>>         if (!netif_is_multiqueue(dev)) >>>           qdisc->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT; >>> + >>> +    if (dev_queue->qdisc_sleeping && >>> +        dev_queue->qdisc_sleeping != &noop_qdisc) >>> +        qdisc_put(dev_queue->qdisc_sleeping); >>> + >>>       dev_queue->qdisc_sleeping = qdisc; >>>   } >> . > -- Wang Hai