From: Toke Høiland-Jørgensen
To: Marc Kleine-Budde, Cong Wang
Cc: Jiri Pirko, Linux Kernel Network Developers, Dave Taht, Jamal Hadi Salim, kernel@pengutronix.de, linux-can@vger.kernel.org, David Miller
Subject: Re: [PATCH 1/2] net: sch_generic: add flag IFF_FIFO_QUEUE to use pfifo_fast as default scheduler
In-Reply-To: <7a25d800-aed1-bec6-0ff8-38c06c3b8fb5@pengutronix.de>
References: <20190327165632.10711-1-mkl@pengutronix.de> <20190327165632.10711-2-mkl@pengutronix.de> <7a25d800-aed1-bec6-0ff8-38c06c3b8fb5@pengutronix.de>
Date: Tue, 02 Apr 2019 19:22:24 +0200
Message-ID: <87mul8nwz3.fsf@toke.dk>
X-Mailing-List: netdev@vger.kernel.org

Marc Kleine-Budde writes:

> On 3/27/19 6:14 PM, Cong Wang wrote:
>> On Wed, Mar 27, 2019 at 9:56 AM Marc Kleine-Budde wrote:
>>>
>>> There is networking hardware that isn't based on Ethernet for layers 1 and 2.
>>>
>>> For example CAN.
>>>
>>> CAN is a multi-master serial bus standard for connecting Electronic Control
>>> Units [ECUs] also known as nodes. A frame on the CAN bus carries up to 8 bytes
>>> of payload. Frame corruption is detected by a CRC. However frame loss due to
>>> corruption is possible, but a quite unusual phenomenon.
>>>
>>> While fq_codel works great for TCP/IP, it doesn't for CAN.
>>> There are a lot of
>>> legacy protocols on top of CAN, which are not built with flow control or high
>>> CAN frame drop rates in mind.
>>>
>>> When using fq_codel, as soon as the queue reaches a certain delay based length,
>>> skbs from the head of the queue are silently dropped. Silently meaning that the
>>> user space using a send() or similar syscall doesn't get an error. However
>>> TCP's flow control algorithm will detect dropped packets and adjust the
>>> bandwidth accordingly.
>>>
>>> When using fq_codel and sending raw frames over CAN, which is the common use
>>> case, the user space thinks the packet has been sent without problems, because
>>> send() returned without an error. pfifo_fast will drop skbs, if the queue
>>> length exceeds the maximum. But with this scheduler the skbs at the tail are
>>> dropped and an error (-ENOBUFS) is propagated to user space, so that the user
>>> space can slow down the packet generation.
>>>
>>> On distributions, where fq_codel is made default via CONFIG_DEFAULT_NET_SCH
>>> during compile time, or set default during runtime with sysctl
>>> net.core.default_qdisc (see [1]), we get a bad user experience. In my test case
>>> with pfifo_fast, I can transfer thousands of millions of CAN frames without a
>>> frame drop. On the other hand with fq_codel there is more than one lost CAN
>>> frame per thousand frames.
>>>
>>> As pointed out fq_codel is not suited for CAN hardware, so this patch
>>> introduces a new netdev_priv_flag called "IFF_FIFO_QUEUE" (in contrast to the
>>> existing "IFF_NO_QUEUE").
>>>
>>> During transition of a netdev from down to up state the default queuing
>>> discipline is attached by attach_default_qdiscs() with the help of
>>> attach_one_default_qdisc(). This patch modifies attach_one_default_qdisc() to
>>> attach the pfifo_fast (pfifo_fast_ops) if the "IFF_FIFO_QUEUE" flag is set.
>>
>> I wonder if we just need to allow arbitrary default qdisc per netdevice
>> while you are on it.
>> A private flag is simply a boolean, perhaps in the
>> future other types of devices want other default qdiscs, so that could
>> make it more flexible.
>
> From my point of view there is networking hardware that uses protocols
> that work with (i.e. benefit from) fq_codel (hash flow/queue/head drop).
>
> The silent head drop is the most prominent reason why it doesn't work on
> CAN. I haven't dug deep enough into the code to see if skb->hash is used
> or what the flow dissector will do on CAN frames. So reordering of CAN
> frames (if something else than skb->priority is used) might be a
> problem, too.
>
> From my point of view, if your networking hardware and the protocols on
> top don't like re-ordering or silent head drop, then pfifo_fast is
> probably a good default choice.
>
> I discussed the problem a bit at netdev 0x13 and one point someone
> mentioned is that if there is a generic set-this-qdisc function people
> might start to add this to network drivers to "optimize" them for
> their special workflow or test case.

I think I was one of the people you spoke with about this. I agree that
the flag approach makes sense, since I view the requirements of the CAN
protocol as very specifically being met by a FIFO queue. And yeah, I do
think we should push back on every device type defining its own
arbitrary qdisc default; having the two very specific exceptions "no
queue" and "FIFO queue" to the general qdisc default setting makes it
explicit that this is for special cases only, and that any other
optimisation of the qdisc configuration should be done in userspace.

-Toke
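
To make the difference between the two drop behaviours in the quoted
patch description concrete, here is a toy user-space model in Python. It
is only a sketch and not kernel code: the class and function names
(TailDropFifo, SilentHeadDropQueue, send_burst) are made up for
illustration, and a fixed length limit stands in for fq_codel's
delay-based drop decision, which in reality is more involved.

```python
from collections import deque
import errno


class TailDropFifo:
    """Toy bounded FIFO: refuses new frames when full and reports an error."""

    def __init__(self, limit):
        self.limit = limit
        self.queue = deque()

    def enqueue(self, frame):
        if len(self.queue) >= self.limit:
            # The newest frame is refused and the sender is told about it.
            return -errno.ENOBUFS
        self.queue.append(frame)
        return 0


class SilentHeadDropQueue:
    """Toy queue that makes room by discarding its oldest frame.

    Assumption: a simple length limit replaces the real delay-based logic.
    """

    def __init__(self, limit):
        self.limit = limit
        self.queue = deque()
        self.dropped = []

    def enqueue(self, frame):
        if len(self.queue) >= self.limit:
            # The oldest frame is discarded; the sender still sees success.
            self.dropped.append(self.queue.popleft())
        self.queue.append(frame)
        return 0


def send_burst(qdisc, frames):
    """Enqueue frames back to back; return those the sender knows were refused."""
    return [frame for frame in frames if qdisc.enqueue(frame) != 0]
```

Sending a burst of five frames into a three-slot queue, the tail-drop
model hands the last two frames back to the sender with an error, so the
sender can slow down and retry. The head-drop model reports success for
all five while the first two frames are gone, which is the failure mode
that hurts protocols on top of CAN.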