From mboxrd@z Thu Jan 1 00:00:00 1970
From: Cong Wang
Date: Mon, 18 May 2020 11:22:06 -0700
Subject: Re: iproute2: tc deletion freezes whole server
To: Václav Zindulka
Cc: Linux Kernel Network Developers
X-Mailing-List: netdev@vger.kernel.org

On Mon, May 18, 2020 at 7:16 AM Václav Zindulka wrote:
>
> On Sun, May 17, 2020 at 9:35 PM Cong Wang wrote:
> >
> > On Fri, May 8, 2020 at 6:59 AM Václav Zindulka
> > wrote:
> > >
> > > > > >
> > > > > > I tried to emulate your test case in my VM, here is the script I use:
> > > > > >
> > > > > > ====
> > > > > > ip li set dev dummy0 up
> > > > > > tc qd add dev dummy0 root handle 1: htb default 1
> > > > > > for i in `seq 1 1000`
> > > > > > do
> > > > > > tc class add dev dummy0 parent 1:0 classid 1:$i htb rate 1mbit ceil 1.5mbit
> > > > > > tc qd add dev dummy0 parent 1:$i fq_codel
> > > > > > done
> > > > > >
> > > > > > time tc qd del dev dummy0 root
> > > > > > ====
> > > > > >
> > > > > > And this is the result:
> > > > > >
> > > > > > Before my patch:
> > > > > > real 0m0.488s
> > > > > > user 0m0.000s
> > > > > > sys 0m0.325s
> > > > > >
> > > > > > After my patch:
> > > > > > real 0m0.180s
> > > > > > user 0m0.000s
> > > > > > sys 0m0.132s
> > > > >
> > > > > My results with your test script.
> > > > >
> > > > > before patch:
> > > > > /usr/bin/time -p tc qdisc del dev enp1s0f0 root
> > > > > real 1.63
> > > > > user 0.00
> > > > > sys 1.63
> > > > >
> > > > > after patch:
> > > > > /usr/bin/time -p tc qdisc del dev enp1s0f0 root
> > > > > real 1.55
> > > > > user 0.00
> > > > > sys 1.54
> > > > >
> > > > > > This is an obvious improvement, so I have no idea why you didn't
> > > > > > catch any difference.
> > > > >
> > > > > We use hfsc instead of htb. I don't know whether it may cause any
> > > > > difference. I can provide you with my test scripts if necessary.
> > > >
> > > > Yeah, you can try to replace the htb with hfsc in my script;
> > > > I didn't spend time figuring out the hfsc parameters.
> > >
> > > class add dev dummy0 parent 1:0 classid 1:$i hfsc ls m1 0 d 0 m2 13107200 ul m1 0 d 0 m2 13107200
> > >
> > > but it behaves the same as htb...
> > >
> > > > My point here is, if I can see the difference with merely 1000
> > > > tc classes, you should see a bigger difference with hundreds
> > > > of thousands of classes in your setup. So, I don't know why you
> > > > saw a relatively smaller difference.
> > >
> > > I saw a relatively big difference. It was about 1.5s faster on my huge
> > > setup, which is a lot. Yet maybe the problem is caused by something
> >
> > What percentage? IIUC, without the patch it took you about 11s, so
> > 1.5s faster means a 13% improvement for you?
>
> My whole setup needs 22.17 seconds to delete with an unpatched kernel.
> With your patches applied it is 21.08. So it varies between 1 - 1.5s.
> Improvement is about 5 - 6%.

Good to know.

> > > else? I thought about tx/rx queues. RJ45 ports have up to 4 tx and rx
> > > queues. SFP+ interfaces have much higher limits: 8 or even 64 possible
> > > queues. I've tried to increase the number of queues using ethtool from
> > > 4 to 8 and to decrease it to 2, but there was no difference. It was about
> > > 1.62 - 1.63 with an unpatched kernel and about 1.55 - 1.58 with your
> > > patches applied. I've tried it for ifb and RJ45 interfaces, where it
> > > took about 0.02 - 0.03 with an unpatched kernel and 0.05 with your
> > > patches applied, which is strange, but it may be caused by the fact it
> > > was very fast even before.
> >
> > That is odd. In fact, this is highly related to the number of TX queues,
> > because the existing code resets the qdiscs once for each TX
> > queue, so the more TX queues you have, the more resets the kernel
> > will do, that is, the more time it will take.
>
> Can't the problem be caused by the reset being done on both active and
> inactive queues every time? That would explain why increasing and
> decreasing the number of active queues had no effect. Yet it doesn't
> explain why an Intel card (82599ES) with 64 possible queues has exactly
> the same problem as a Mellanox card (ConnectX-4 LX) with 8 possible queues.

Regardless of these queues, the qdiscs should be reset only once,
because all of these queues point to the same instance of the root
qdisc in your case.

[...]

> With the attached patch I'm down to 1.7 seconds - more than 90%
> improvement :-) Can you please check it and pass it to the proper places?
> According to debugging printk messages it empties only active queues.

You can't change netdev_for_each_tx_queue(), it would certainly at least
break netif_alloc_netdev_queues().
Let me think about how to fix this properly. I have some ideas and will
provide you with some patch(es) to test soon.

Thanks!