netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vladimir Oltean <olteanv@gmail.com>
To: Ido Schimmel <idosch@idosch.org>
Cc: Andrew Lunn <andrew@lunn.ch>,
	Florian Fainelli <f.fainelli@gmail.com>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	"David S. Miller" <davem@davemloft.net>,
	Jiri Pirko <jiri@resnulli.us>, Jakub Kicinski <kuba@kernel.org>,
	Ivan Vecera <ivecera@redhat.com>, netdev <netdev@vger.kernel.org>,
	Horatiu Vultur <horatiu.vultur@microchip.com>,
	"Allan W. Nielsen" <allan.nielsen@microchip.com>,
	Nikolay Aleksandrov <nikolay@cumulusnetworks.com>,
	Roopa Prabhu <roopa@cumulusnetworks.com>
Subject: Re: [PATCH RFC net-next 00/13] RX filtering for DSA switches
Date: Sun, 24 May 2020 19:24:27 +0300	[thread overview]
Message-ID: <CA+h21hoJwjBt=Uu_tYw3vv2Sze28iRdAAoR3S+LFrKbL6-iuJQ@mail.gmail.com> (raw)
In-Reply-To: <20200524140657.GA1281067@splinter>

On Sun, 24 May 2020 at 17:07, Ido Schimmel <idosch@idosch.org> wrote:
>
> On Fri, May 22, 2020 at 12:10:23AM +0300, Vladimir Oltean wrote:
> > From: Vladimir Oltean <vladimir.oltean@nxp.com>
> >
> > This is a WIP series whose stated goal is to allow DSA and switchdev
> > drivers to flood less traffic to the CPU while keeping the same level of
> > functionality.
> >
> > The strategy is to whitelist towards the CPU only the {DMAC, VLAN} pairs
> > that the operating system has expressed its interest in, either due to
> > those being the MAC addresses of one of the switch ports, or addresses
> > added to our device's RX filter via calls to dev_uc_add/dev_mc_add.
> > Then, the traffic which is not explicitly whitelisted is not sent by the
> > hardware to the CPU, under the assumption that the CPU didn't ask for it
> > and would have dropped it anyway.
> >
> > The ground for these patches were the discussions surrounding RX
> > filtering with switchdev in general, as well as with DSA in particular:
> >
> > "[PATCH net-next 0/4] DSA: promisc on master, generic flow dissector code":
> > https://www.spinics.net/lists/netdev/msg651922.html
> > "[PATCH v3 net-next 2/2] net: dsa: felix: Allow unknown unicast traffic towards the CPU port module":
> > https://www.spinics.net/lists/netdev/msg634859.html
> > "[PATCH v3 0/2] net: core: Notify on changes to dev->promiscuity":
> > https://lkml.org/lkml/2019/8/29/255
> > LPC2019 - SwitchDev offload optimizations:
> > https://www.youtube.com/watch?v=B1HhxEcU7Jg
> >
> > Unicast filtering comes to me as most important, and this includes
> > termination of MAC addresses corresponding to the network interfaces in
> > the system (DSA switch ports, VLAN sub-interfaces, bridge interface).
> > The first 4 patches use Ivan Khoronzhuk's IVDF framework for extending
> > network interface addresses with a Virtual ID (typically VLAN ID). This
> > matches DSA switches perfectly because their FDB already contains keys
> > of the {DMAC, VID} form.
>
> Hi,
>
> I read through the series and I'm not sure how unicast filtering works.
> Instead of writing a very long mail I just created a script with
> comments. I think it's clearer that way. Note that this is not a made up
> configuration. It is used in setups involving VRRP / VXLAN, for example.
>
> ```
> #!/bin/bash
>
> ip netns add ns1
>
> ip -n ns1 link add name br0 type bridge vlan_filtering 1
> ip -n ns1 link add name dummy10 up type dummy
>
> ip -n ns1 link set dev dummy10 master br0
> ip -n ns1 link set dev br0 up
>
> ip -n ns1 link add link br0 name vlan10 up type vlan id 10
> bridge -n ns1 vlan add vid 10 dev br0 self
>
> echo "Before adding macvlan:"
> echo "======================"
>
> echo -n "Promiscuous mode: "
> ip -n ns1 -j -p -d link show dev br0 | jq .[][\"promiscuity\"]
>
> echo -e "\nvlan10's MAC is in br0's FDB:"
> bridge -n ns1 fdb show br0 vlan 10
>
> echo
> echo "After adding macvlan:"
> echo "====================="
>
> ip -n ns1 link add link vlan10 name vlan10-v up address 00:00:5e:00:01:01 \
>         type macvlan mode private
>
> echo -n "Promiscuous mode: "
> ip -n ns1 -j -p -d link show dev br0 | jq .[][\"promiscuity\"]
>
> echo -e "\nvlan10-v's MAC is not in br0's FDB:"
> bridge -n ns1 fdb show br0 | grep master | grep 00:00:5e:00:01:01
> ```
>
> This is the output on my laptop (kernel 5.6.8):
>
> ```
> Before adding macvlan:
> ======================
> Promiscuous mode: 0
>
> vlan10's MAC is in br0's FDB:
> 42:bd:b1:cc:67:15 dev br0 vlan 10 master br0 permanent
>
> After adding macvlan:
> =====================
> Promiscuous mode: 1
>
> vlan10-v's MAC is not in br0's FDB:
> ```
>
> Basically, if the MAC of the VLAN device is not inherited from the
> bridge or you stack macvlans on top, then the bridge will go into
> promiscuous mode and it will locally receive all frames passing through
> it. It's not ideal, but it's a very old and simple behavior. It does not
> require you to track the VLAN associated with the MAC addresses, for
> example.
>

This is a good point. I wasn't aware that the bridge 'gives up' with
macvlan upper devices, but if I understand correctly, we do have the
necessary tools to improve that.
But actually, I'm wondering if this simple behavior from the bridge is
correct. As you, Jiri and Ivan pointed out in last summer's email
thread about the Linux bridge and promiscuous mode, putting the
interface in IFF_PROMISC is only going to guarantee acceptance through
the net device's RX filter, but not that the packets will go to the
CPU. So from that perspective, the current series would break things,
so we should definitely fix that and keep the {MAC, VLAN} pairs in the
bridge's local FDB.

> When you are offloading the Linux data path to hardware this behavior is
> not ideal as your hardware can handle much higher packet rates than the
> CPU.
>
> In mlxsw we handle this by tracking the upper devices of the bridge. I
> was hoping that with Ivan's patches we could add support for unicast
> filtering in the bridge driver and program the MAC addresses to its FDB
> with 'local' flag. Then the FDB entries would be notified via switchdev
> to device drivers.
>

Yes, it should be possible to do that. I'll try and see how far I get.

> >
> > Multicast filtering was taken and reworked from Florian Fainelli's
> > previous attempts, according to my own understanding of multicast
> > forwarding requirements of an IGMP snooping switch. This is the part
> > that needs the most extra work, not only in the DSA core but also in
> > drivers. For this reason, I've left out of this patchset anything that
> > has to do with driver-level configuration (since the audience is a bit
> > larger than usual), as I'm trying to focus more on policy for now, and
> > the series is already pretty huge.
>
> From what I remember, this is the logic in the Linux bridge:
>
> * Broadcast is always locally received
> * Multicast is locally received if:
>         * Snooping disabled
>         * Snooping enabled:
>                 * Bridge netdev is mrouter port
>                 or
>                 * Matches MDB entry with 'host_joined' indication
>
> >
> > Florian Fainelli (3):
> >   net: bridge: multicast: propagate br_mc_disabled_update() return
> >   net: dsa: add ability to program unicast and multicast filters for CPU
> >     port
> >   net: dsa: wire up multicast IGMP snooping attribute notification
> >
> > Ivan Khoronzhuk (4):
> >   net: core: dev_addr_lists: add VID to device address
> >   net: 8021q: vlan_dev: add vid tag to addresses of uc and mc lists
> >   net: 8021q: vlan_dev: add vid tag for vlan device own address
> >   ethernet: eth: add default vid len for all ethernet kind devices
> >
> > Vladimir Oltean (6):
> >   net: core: dev_addr_lists: export some raw __hw_addr helpers
> >   net: dsa: don't use switchdev_notifier_fdb_info in
> >     dsa_switchdev_event_work
> >   net: dsa: mroute: don't panic the kernel if called without the prepare
> >     phase
> >   net: bridge: add port flags for host flooding
> >   net: dsa: deal with new flooding port attributes from bridge
> >   net: dsa: treat switchdev notifications for multicast router connected
> >     to port
> >
> >  include/linux/if_bridge.h |   3 +
> >  include/linux/if_vlan.h   |   2 +
> >  include/linux/netdevice.h |  11 ++
> >  include/net/dsa.h         |  17 +++
> >  net/8021q/Kconfig         |  12 ++
> >  net/8021q/vlan.c          |   3 +
> >  net/8021q/vlan.h          |   2 +
> >  net/8021q/vlan_core.c     |  25 ++++
> >  net/8021q/vlan_dev.c      | 102 +++++++++++---
> >  net/bridge/br_if.c        |  40 ++++++
> >  net/bridge/br_multicast.c |  21 ++-
> >  net/bridge/br_switchdev.c |   4 +-
> >  net/core/dev_addr_lists.c | 144 +++++++++++++++----
> >  net/dsa/Kconfig           |   1 +
> >  net/dsa/dsa2.c            |   6 +
> >  net/dsa/dsa_priv.h        |  27 +++-
> >  net/dsa/port.c            | 155 ++++++++++++++++----
> >  net/dsa/slave.c           | 288 +++++++++++++++++++++++++++++++-------
> >  net/dsa/switch.c          |  36 +++++
> >  net/ethernet/eth.c        |  12 +-
> >  20 files changed, 780 insertions(+), 131 deletions(-)
> >
> > --
> > 2.25.1
> >

Thanks,
-Vladimir

  reply	other threads:[~2020-05-24 16:24 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-21 21:10 [PATCH RFC net-next 00/13] RX filtering for DSA switches Vladimir Oltean
2020-05-21 21:10 ` [PATCH RFC net-next 01/13] net: core: dev_addr_lists: add VID to device address Vladimir Oltean
2020-05-21 21:10 ` [PATCH RFC net-next 02/13] net: 8021q: vlan_dev: add vid tag to addresses of uc and mc lists Vladimir Oltean
2020-05-21 21:10 ` [PATCH RFC net-next 03/13] net: 8021q: vlan_dev: add vid tag for vlan device own address Vladimir Oltean
2020-05-21 21:10 ` [PATCH RFC net-next 04/13] ethernet: eth: add default vid len for all ethernet kind devices Vladimir Oltean
2020-05-21 21:10 ` [PATCH RFC net-next 05/13] net: bridge: multicast: propagate br_mc_disabled_update() return Vladimir Oltean
2020-05-21 21:10 ` [PATCH RFC net-next 06/13] net: core: dev_addr_lists: export some raw __hw_addr helpers Vladimir Oltean
2020-05-21 21:10 ` [PATCH RFC net-next 07/13] net: dsa: don't use switchdev_notifier_fdb_info in dsa_switchdev_event_work Vladimir Oltean
2020-05-21 21:10 ` [PATCH RFC net-next 08/13] net: dsa: add ability to program unicast and multicast filters for CPU port Vladimir Oltean
2020-05-21 21:10 ` [PATCH RFC net-next 09/13] net: dsa: mroute: don't panic the kernel if called without the prepare phase Vladimir Oltean
2020-05-21 21:10 ` [PATCH RFC net-next 10/13] net: bridge: add port flags for host flooding Vladimir Oltean
2020-05-22 12:38   ` Nikolay Aleksandrov
2020-05-22 13:13     ` Vladimir Oltean
2020-05-22 18:45       ` Allan W. Nielsen
2020-07-20 11:08         ` Vladimir Oltean
2020-05-24 14:26   ` Ido Schimmel
2020-05-24 16:13     ` Vladimir Oltean
2020-05-25 20:11       ` Ido Schimmel
2020-05-25 20:32         ` Vladimir Oltean
2020-07-23 22:35         ` Vladimir Oltean
2020-07-27 17:15           ` Ido Schimmel
2020-05-21 21:10 ` [PATCH RFC net-next 11/13] net: dsa: deal with new flooding port attributes from bridge Vladimir Oltean
2020-05-21 21:10 ` [PATCH RFC net-next 12/13] net: dsa: treat switchdev notifications for multicast router connected to port Vladimir Oltean
2020-05-21 21:10 ` [PATCH RFC net-next 13/13] net: dsa: wire up multicast IGMP snooping attribute notification Vladimir Oltean
2020-05-22 18:42 ` [PATCH RFC net-next 00/13] RX filtering for DSA switches Allan W. Nielsen
2020-05-24 14:06 ` Ido Schimmel
2020-05-24 16:24   ` Vladimir Oltean [this message]
2020-05-25 19:48     ` Ido Schimmel
2020-05-25 20:23       ` Vladimir Oltean
2020-05-26 14:01         ` Ido Schimmel
2020-05-27 11:36           ` Vladimir Oltean
2020-05-28 14:37             ` Ido Schimmel
2020-07-20 10:00               ` Vladimir Oltean
2020-07-27 16:56                 ` Ido Schimmel
2020-10-27 11:52                   ` Vladimir Oltean
2020-10-28 14:43                     ` Ido Schimmel
2020-10-28 18:46                       ` Vladimir Oltean
2020-11-01 11:27                         ` Ido Schimmel
2020-11-01 12:06                           ` Vladimir Oltean
2020-11-01 14:42                             ` Ido Schimmel
2020-11-01 15:04                               ` Vladimir Oltean
2020-11-01 15:39                                 ` Ido Schimmel
2020-11-01 16:13                                   ` Vladimir Oltean
2020-11-11  4:12                                     ` Florian Fainelli
2020-05-24 16:13 ` Florian Fainelli
2020-05-24 16:34   ` Vladimir Oltean

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+h21hoJwjBt=Uu_tYw3vv2Sze28iRdAAoR3S+LFrKbL6-iuJQ@mail.gmail.com' \
    --to=olteanv@gmail.com \
    --cc=allan.nielsen@microchip.com \
    --cc=andrew@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=f.fainelli@gmail.com \
    --cc=horatiu.vultur@microchip.com \
    --cc=idosch@idosch.org \
    --cc=ivecera@redhat.com \
    --cc=jiri@resnulli.us \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=nikolay@cumulusnetworks.com \
    --cc=roopa@cumulusnetworks.com \
    --cc=vivien.didelot@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).