From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,MENTIONS_GIT_HOSTING, SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 90501C3E8C5 for ; Fri, 27 Nov 2020 20:04:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3512421534 for ; Fri, 27 Nov 2020 20:04:41 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="bbEqHQ0S" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731429AbgK0UEN (ORCPT ); Fri, 27 Nov 2020 15:04:13 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57140 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731233AbgK0UCS (ORCPT ); Fri, 27 Nov 2020 15:02:18 -0500 Received: from mail-ej1-x62c.google.com (mail-ej1-x62c.google.com [IPv6:2a00:1450:4864:20::62c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4F500C061A47; Fri, 27 Nov 2020 11:29:35 -0800 (PST) Received: by mail-ej1-x62c.google.com with SMTP id lt17so9067060ejb.3; Fri, 27 Nov 2020 11:29:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=CQWS0KFOZGclrXsgAFamD/N59/wMiUy4GaVAQ/IVTQQ=; b=bbEqHQ0SHpXzNZZ6lUDkdI7Zrv36B6q9wKcMZ6nWddqQyOMkms+7kMHhRyydqnP5lC TzzNgYCGUQNK0mBu89B4aBen+5zU1JpabGW3YPz2iEeKJYR/Gqlix6qZql3Cj4LSajLC 8AVTQwPCu1ZRNkmiHgkOF0a382Qoi8CqVXcyIKsXnI8KOwlN6Z3kVXTWP45fQRxY7+WA tEp5z+rsKx98XLpdjrB5QsXeAMVxKxkXPNuBpQDJni5sOn/Wz9Fnz/1WmUDGbKosbB54 SgOb/mD1nk3qxTE1epLJBmQgm1rYbMwFT1jxU2UU6TIy588adK86mD7EFd+ufVltJaJK xw2A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=CQWS0KFOZGclrXsgAFamD/N59/wMiUy4GaVAQ/IVTQQ=; b=kSzdrXQfeCEc0JWMg93KnPQehUKhaBjeiKGV9q96sXeJ2Gkwsw4z+y9f9vQiEdk6AU 2NxhsdKLUXBrcQr7A9Dx5FXCJordyPx0SwZNjkw8TJa+DC6/4Rx/X/3v7TthdXDugI4H s+POaUnZ8U4Y3rg6VIpGfHECkI4iFWCF2occR/5m9249yBr9dOrP3tgHymkeIZwXJh0t FN/0yp2yjbnOKeLv+/j8uafhK50IhOkZDZclBwFtl9nAcpUqYdXZrITij6H0ltF5tvae SM2DqIO/mq39UGyuX0svblgQex8DiJ/xLO76bzPRzqDrTOWi027ld+dzuKJAhdFhIaWF sB6A== X-Gm-Message-State: AOAM5313/AqFIekEA8DXh0JuL4e40uFv0BDJdkz8NrP1s1dSen9LBv90 vL6KtS24Iid8ksKmgvyVt7g= X-Google-Smtp-Source: ABdhPJys+iRhVhnlJ6hsEnSQgG7iAg+ZS41ltM5kbgn+YhatkkQ3k8lCYuWwu2NDfkgXQ/Q2wx81kA== X-Received: by 2002:a17:906:3bd6:: with SMTP id v22mr9564484ejf.160.1606505373831; Fri, 27 Nov 2020 11:29:33 -0800 (PST) Received: from skbuf ([188.25.2.120]) by smtp.gmail.com with ESMTPSA id gl2sm5313191ejb.29.2020.11.27.11.29.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 27 Nov 2020 11:29:33 -0800 (PST) Date: Fri, 27 Nov 2020 21:29:31 +0200 From: Vladimir Oltean To: Lukasz Majewski Cc: Fugang Duan , "David S . Miller" , Jakub Kicinski , netdev@vger.kernel.org, Andrew Lunn , Fabio Estevam , Vivien Didelot , NXP Linux Team , Florian Fainelli , linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Peng Fan , stefan.agner@toradex.com, krzk@kernel.org, Shawn Guo Subject: Re: [RFC 0/4] net: l2switch: Provide support for L2 switch on i.MX28 SoC Message-ID: <20201127192931.4arbxkttmpfcqpz5@skbuf> References: <20201125232459.378-1-lukma@denx.de> <20201126123027.ocsykutucnhpmqbt@skbuf> <20201127003549.3753d64a@jawa> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20201127003549.3753d64a@jawa> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Nov 27, 2020 at 12:35:49AM +0100, Lukasz Majewski wrote: > > > - The question regarding power management - at least for my use > > > case there is no need for runtime power management. The L2 switch > > > shall work always at it connects other devices. > > > > > > - The FEC clock is also used for L2 switch management and > > > configuration (as the L2 switch is just in the same, large IP > > > block). For now I just keep it enabled so DSA code can use it. It > > > looks a bit problematic to export fec_enet_clk_enable() to be > > > reused on DSA code. > > > > > > Links: > > > [0] - "i.MX28 Applications Processor Reference Manual, Rev. 2, > > > 08/2013" [1] - > > > https://github.com/lmajewski/linux-imx28-l2switch/commit/e3c7a6eab73401e021aef0070e1935a0dba84fb5 > > > > > > > Disclaimer: I don't know the details of imx28, it's just now that I > > downloaded the reference manual to see what it's about. > > > > I would push back and say that the switch offers bridge acceleration > > for the FEC. > > Am I correct, that the "bridge acceleration" means in-hardware support > for L2 packet bridging? > > And without the switch IP block enabled one shall be able to have > software bridging in Linux in those two interfaces? So if the switch is bypassed through pin strapping of sx_ena, then the DMA0 would be connected to ENETC-MAC 0, DMA1 to ENET-MAC 1, and both these ports could be driven by the regular fec driver, am I right? This is how people use the imx28 with mainline linux right now, no? When the sx_ena signal enables the switch, a hardware accelerator appears between the DMA engine and the same MACs. But that DMA engine is still compatible with the expectations of the fec driver. And the MACs still belong to the FEC. So, at the end of the day, there are still 2 FEC interfaces. Where I was going is that from a user's perspective, it would be natural to have the exact same view of the system in both cases, aka still two network interfaces for MAC 0 and MAC 1, they still function in the same way (i.e. you can still ping through them) but with the additional ability to do hardware-accelerating bridging, if the MAC 0 and MAC 1 network interfaces are put under the same bridge. Currently, you do not offer that, at all. You split the MTIP switch configuration between DSA and the FEC driver. But you are still not exposing networking-capable net devices for MAC 0 and MAC 1. You still do I/O through the DMA engine. So why use DSA at all? What benefit does it bring you? Why not do the entire switch configuration from within FEC, or a separate driver very closely related to it? > > The fact that the bridge acceleration is provided by a > > different vendor and requires access to an extra set of register > > blocks is immaterial. > > Am I correct that you mean not important above (i.e. immaterial == not > important)? Yes, sorry if that was not clear. > > To qualify as a DSA switch, you need to have > > indirect networking I/O through a different network interface. You do > > not have that. > > I do have eth0 (DMA0) -> MoreThanIP switch port 0 input > | > |----> switch port1 -> ENET-MAC0 > | > |----> switch port2 -> ENET-MAC1 The whole point of DSA is to intercept traffic from a different and completely unaware network interface, parse a tag, and masquerade the packet as though it came on a different, virtual, network interface corresponding to the switch. DSA offers this service so that: - any switch works with any host Ethernet controller - all vendors use the same mechanism and do not reinvent the same wheel The last part is especially relevant. Just grep net/core/ for "dsa" and be amazed at the number of hacks there. DSA can justify that only by scale. Aka "we have hardware that works this way, and doesn't work any other way. And lots of it". That being said, in your case, the network interface is not unaware of the switch at all. Come on, you need to put the (allegedly) switch's MAC in promiscuous mode, from the "DSA master" driver! Hardware resource ownership is very DSA-unlike in the imx28/vybrid model, it seems. This is a very big deal. And there is no tag to parse. You have a switch with a DMA engine which is a priori known to be register-compatible with the DMA engine used for plain FEC interfaces. It's not like you have a switch that can be connected via MII to anything and everything. Force forwarding is done via writing a register, again very much unlike DSA, and retrieving the source port on RX is up in the air. > > What I would do is I would expand the fec driver into > > something that, on capable SoCs, detects bridging of the ENET_MAC0 > > and ENETC_MAC1 ports and configures the switch accordingly to offload > > that in a seamless manner for the user. > > Do you propose to catch some kind of notification when user calls: > > ip link add name br0 type bridge; ip link set br0 up; > ip link set lan1 up; ip link set lan2 up; > ip link set lan1 master br0; ip link set lan2 master br0; > bridge link > > And then configure the FEC driver to use this L2 switch driver? Yes, that can be summarized as: One option to get this upstream would be to transform fec into a switchdev driver, or to create a mtip switchdev driver that shares a lot of code with the fec driver. The correct tool for code sharing is EXPORT_SYMBOL_GPL, not DSA. > > This would also solve your > > power management issues, since the entire Ethernet block would be > > handled by a single driver. DSA is a complication you do not need. > > Convince me otherwise. > > From what I see the MoreThanIP IP block looks like a "typical" L2 switch > (like lan9xxx), with VLAN tagging support, static and dynamic tables, > forcing the packet to be passed to port [*], congestion management, > switch input buffer, priority of packets/queues, broadcast, multicast, > port snooping, and even IEEE1588 timestamps. > > Seems like a lot of useful features. I did not say that it is not a typical switch, or that it doesn't have useful features. I said that DSA does not help you. Adding a DSA_TAG_PROTO_NONE driver in 2020 is a no-go all around. > The differences from "normal" DSA switches: > > 1. It uses mapped memory (for its register space) for > configuration/statistics gathering (instead of e.g. SPI, I2C) Nope, that's not a difference. > 2. The TAG is not appended to the frame outgoing from the "master" FEC > port - it can be setup when DMA transfers packet to MTIP switch internal > buffer. That is a difference indeed. Since DSA switches are isolated from their host interface, and there has been a traditional separation at the MAC-to-MAC layer (or MAC-to-PHY-to-PHY-to-MAC in weird cases), the routing information is passed in-band via a header in the packet. The host interface has no idea that this header exists, it is just traffic as usual. The whole DSA infrastructure is built around intercepting and decoding that. > Note: > > [*] - The same situation is in the VF610 and IMX28: > The ESW_FFEN register - Bit 0 -> FEN > > "When set, the next frame received from port 0 (the local DMA port) is > forwarded to the ports defined in FD. The bit resets to zero > automatically when one frame from port 0 has been processed by the > switch (i.e. has been read from the port 0 input buffer; see Figure > 32-1). Therefore, the bit must be set again as necessary. See also > Section 32.5.8.2, "Forced Forwarding" for a description." > > (Of course the "Section 32.5.8.2" is not available) > > > According to above the "tag" (engress port) is set when DMA transfers > the packet to input MTIP buffer. This shall allow force forwarding as > we can setup this bit when we normally append tag in the network stack. > > I will investigate this issue - and check the port separation. If it > works then DSA (or switchdev) shall be used? Source port identification and individual egress port addressing are things that should behave absolutely the same regardless of whether you have a DSA or a pure switchdev driver. I don't think that this is clear enough to you. DSA with DSA_TAG_PROTO_NONE is not the shortcut you're looking for. Please take a look at Documentation/networking/switchdev.rst, it explains pretty clearly that a switchdev port must still expose the same interface as any other net device. > (A side question - DSA uses switchdev, so when one shall use switchdev > standalone?) Short answer: if the system-side packet I/O interface is Ethernet and the hardware ownership is clearly delineated at the boundary of that Ethernet port, then it should be DSA, otherwise it shouldn't. But nonetheless, this raises a fundamental question, and I'll indulge in attempting to answer it more comprehensively. You seem to be tapping into an idea which has been circulating for a while, and which can be also found in Documentation/networking/dsa/dsa.rst: -----------------------------[cut here]----------------------------- TODO ==== Making SWITCHDEV and DSA converge towards an unified codebase ------------------------------------------------------------- SWITCHDEV properly takes care of abstracting the networking stack with offload capable hardware, but does not enforce a strict switch device driver model. On the other DSA enforces a fairly strict device driver model, and deals with most of the switch specific. At some point we should envision a merger between these two subsystems and get the best of both worlds. -----------------------------[cut here]----------------------------- IMO this is never going to happen, nor should it. To unify DSA and switchdev would mean to force plain switchdev drivers to have a stricter separation between the control path and the data path, a la DSA. So, just like DSA has separate drivers for the switch, the tagging protocol and for the DSA master, plain switchdev would need a separate driver too for the DMA/rings/queues, or whatever, of the switch (the piece of code that implements the NAPI instance). That driver would need to: (a) expose a net device for the DMA interface (b) fake a DSA tag that gets added by the DMA net device, just to be later removed by the switchdev net device. This is just for compatibility with what DSA _needs_ to do to drive hardware that was _designed_ to work that way, and where the _hardware_ adds that tag. But non-DSA drivers don't need any of that, nor does it really help them in any way! So why should we model things this way? I don't think the lines between plain switchdev and DSA are blurry at all. And especially not in this case. You should not have a networking interface registered to the system for DMA-0 at all, just for MAC-0 and MAC-1.