linux-kernel.vger.kernel.org archive mirror
From: Ioana Ciornei <ciorneiioana@gmail.com>
To: Vladimir Oltean <olteanv@gmail.com>
Cc: Ioana Ciornei <ciorneiioana@gmail.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Andrew Lunn <andrew@lunn.ch>,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	Ioana Ciornei <ioana.ciornei@nxp.com>
Subject: Re: [RFC 8/9] staging: dpaa2-switch: properly setup switching domains
Date: Thu, 5 Nov 2020 12:58:48 +0200
Message-ID: <20201105105848.tt3gktuxkq36nt57@skbuf>
In-Reply-To: <20201104220810.a5n24vh45hsvv646@skbuf>

On Thu, Nov 05, 2020 at 12:08:10AM +0200, Vladimir Oltean wrote:
> On Wed, Nov 04, 2020 at 06:57:19PM +0200, Ioana Ciornei wrote:
> > From: Ioana Ciornei <ioana.ciornei@nxp.com>
> > 
> > Until now, the DPAA2 switch was not capable to properly setup it's
> > switching domains depending on the existence, or lack thereof, of a
> > upper bridge device. This meant that all switch ports of a DPSW object
> > were switching by default even though they were not under the same
> > bridge device.
> > 
> > Another issue was the inability to actually add the CPU in the flooding
> > domains (broadcast, unknown unicast etc) of a particular switch port.
> > This meant that a simple ping on a switch interface was not possible
> > since no broadcast ARP frame would actually reach the CPU queues.
> > 
> > This patch tries to fix exactly these problems by:
> > 
> > * Creating and managing a FDB table for each flooding domain. This means
> >   that when a switch interface is not bridged it will use it's own FDB
> >   table. While in bridged mode all DPAA2 switch interfaces under the
> >   same upper will use the same FDB table, thus leverage the same FDB
> >   entries.
> > 
> > * Adding a new MC firmware command - dpsw_set_egress_flood() - through
> >   which the driver can setup the flooding domains as needed. For
> >   example, when the switch interface is standalone, thus not in a
> >   bridge with any other DPAA2 switch port, it will setup it's broadcast
> >   and unknown unicast flooding domains to only include the control
> >   interface (the queues that reach the CPU and the driver can dequeue
> >   from). This flooding domain changes when the interface joins a bridge
> >   and is configured to include, beside the control interface, all other
> >   DPAA2 switch interfaces.
> > 
> > Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
> > ---
> 
> None of the occurrences of "it's" in the commit message is grammatically
> correct. So please s/it's/its/g.
> 
> > diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
> > index 24bdac6d6005..7a0d9a178cdc 100644
> > --- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
> > +++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
> > @@ -25,6 +25,36 @@
> >  
> >  #define DEFAULT_VLAN_ID			1
> >  
> > +static u16 dpaa2_switch_port_get_fdb_id(struct ethsw_port_priv *port_priv)
> > +{
> > +	struct ethsw_port_priv *other_port_priv = NULL;
> > +	struct net_device *other_dev;
> > +	struct list_head *iter;
> > +
> > +	/* If not part of a bridge, just use the private FDB */
> > +	if (!port_priv->bridge_dev)
> > +		return port_priv->fdb_id;
> > +
> > +	/* If part of a bridge, use the FDB of the first dpaa2 switch interface
> > +	 * to be present in that bridge
> > +	 */
> > +	netdev_for_each_lower_dev(port_priv->bridge_dev, other_dev, iter) {
> 
> netdev_for_each_lower_dev calls netdev_lower_get_next which has this in
> the comments:
> 
>  * The caller must hold RTNL lock or
>  * its own locking that guarantees that the neighbour lower
>  * list will remain unchanged.
> 
> Does that hold true for all callers, if you put ASSERT_RTNL() here?

No, not for all. The probe path uses this as well and is not protected
by the rtnl lock.

Good point, I'll add an explicit lock/unlock. Thanks.
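
As a rough illustration of the locking pattern (a single-threaded
userspace model, not driver code), the helper can assert that the lock is
held while the probe-like caller takes it explicitly. Every name below is
made up for the sketch; the boolean flag merely stands in for the RTNL
mutex and the assert plays the role of ASSERT_RTNL().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the RTNL mutex: a flag is enough for a single-threaded model. */
struct model_lock {
	bool held;
};

void model_lock_acquire(struct model_lock *l)
{
	assert(!l->held); /* the modelled lock is not recursive */
	l->held = true;
}

void model_lock_release(struct model_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct model_port {
	uint16_t fdb_id;
};

/* Helper that reads shared state: it only checks that the lock is held. */
uint16_t model_get_fdb_id(const struct model_lock *l,
			  const struct model_port *p)
{
	assert(l->held); /* plays the role of ASSERT_RTNL() */
	return p->fdb_id;
}

/* Probe-like caller that runs unlocked and takes the lock explicitly. */
uint16_t model_probe_get_fdb_id(struct model_lock *l,
				const struct model_port *p)
{
	uint16_t id;

	model_lock_acquire(l);
	id = model_get_fdb_id(l, p);
	model_lock_release(l);
	return id;
}
```

Callers that already hold the lock would call model_get_fdb_id()
directly; taking the lock inside the helper itself would deadlock on
those paths, which is why the lock/unlock sits in the probe-like caller.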

> 
> > +		if (!dpaa2_switch_port_dev_check(other_dev, NULL))
> > +			continue;
> > +
> > +		other_port_priv = netdev_priv(other_dev);
> > +		break;
> > +	}
> > +
> > +	/* We are the first dpaa2 switch interface to join the bridge, just use
> > +	 * our own FDB
> > +	 */
> > +	if (!other_port_priv)
> > +		other_port_priv = port_priv;
> > +
> > +	return other_port_priv->fdb_id;
> > +}
> > +
> >  static void *dpaa2_iova_to_virt(struct iommu_domain *domain,
> >  				dma_addr_t iova_addr)
> >  {
> > @@ -133,7 +163,7 @@ static int dpaa2_switch_port_add_vlan(struct ethsw_port_priv *port_priv,
> >  {
> >  	struct ethsw_core *ethsw = port_priv->ethsw_data;
> >  	struct net_device *netdev = port_priv->netdev;
> > -	struct dpsw_vlan_if_cfg vcfg;
> > +	struct dpsw_vlan_if_cfg vcfg = {0};
> >  	int err;
> >  
> >  	if (port_priv->vlans[vid]) {
> > @@ -141,8 +171,13 @@ static int dpaa2_switch_port_add_vlan(struct ethsw_port_priv *port_priv,
> >  		return -EEXIST;
> >  	}
> >  
> > +	/* If hit, this VLAN rule will lead the packet into the FDB table
> > +	 * specified in the vlan configuration below
> > +	 */
> 
> And this is the reason why VLAN-unaware mode is unsupported, right?

Yes, exactly.

> No
> hit on any VLAN rule => no FDB table selected for the packet. What is
> the default action for misses on VLAN rules? Drop or some default FDB
> ID?

The default action for misses on the VLAN table is drop. For example, if
a VLAN tagged packet is received on a switch interface which does not
have an upper VLAN interface with that VLAN ID (thus
.ndo_vlan_rx_add_vid() is not called), the packet will get dropped
immediately.
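
The hit/miss behaviour can be modelled in a few lines of userspace C.
This is only a sketch of the lookup semantics described above, not the
hardware or MC firmware interface: the table layout, the port count and
all model_* names are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_N_VID   4096 /* 12-bit VLAN ID space */
#define MODEL_N_PORTS 4    /* arbitrary port count for the model */

/* One rule per (port, VID): when valid, it selects an FDB table. */
struct model_vlan_rule {
	bool valid;
	uint16_t fdb_id;
};

struct model_vlan_table {
	struct model_vlan_rule rule[MODEL_N_PORTS][MODEL_N_VID];
};

/* Install a rule, as would happen when a VLAN is added on a port. */
void model_vlan_add(struct model_vlan_table *t, unsigned int port,
		    uint16_t vid, uint16_t fdb_id)
{
	assert(port < MODEL_N_PORTS && vid < MODEL_N_VID);
	t->rule[port][vid].valid = true;
	t->rule[port][vid].fdb_id = fdb_id;
}

/* Return true and set *fdb_id on a hit; return false (drop) on a miss. */
bool model_vlan_lookup(const struct model_vlan_table *t, unsigned int port,
		       uint16_t vid, uint16_t *fdb_id)
{
	assert(port < MODEL_N_PORTS && vid < MODEL_N_VID);
	if (!t->rule[port][vid].valid)
		return false;
	*fdb_id = t->rule[port][vid].fdb_id;
	return true;
}
```

With a zero-initialised table, a lookup for a VID that was never added on
the ingress port misses, so no FDB table is selected and the packet is
dropped.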

> 
> >  	vcfg.num_ifs = 1;
> >  	vcfg.if_id[0] = port_priv->idx;
> > +	vcfg.fdb_id = dpaa2_switch_port_get_fdb_id(port_priv);
> > +	vcfg.options |= DPSW_VLAN_ADD_IF_OPT_FDB_ID;
> >  	err = dpsw_vlan_add_if(ethsw->mc_io, 0, ethsw->dpsw_handle, vid, &vcfg);
> >  	if (err) {
> >  		netdev_err(netdev, "dpsw_vlan_add_if err %d\n", err);
> > @@ -172,8 +207,10 @@ static int dpaa2_switch_port_add_vlan(struct ethsw_port_priv *port_priv,
> >  	return 0;
> >  }
> >  
> > -static int dpaa2_switch_set_learning(struct ethsw_core *ethsw, bool enable)
> > +static int dpaa2_switch_port_set_learning(struct ethsw_port_priv *port_priv, bool enable)
> 
> The commit message says nothing about changes to the learning
> configuration.

The learning flag is per FDB table, thus its configuration now has to
take into account the corresponding FDB of an interface.

Actually, being able to configure the learning flag is somewhat of an
inconvenience, since changing it on one port would also change the
learning behavior of all the other switch ports under the same bridge,
without the bridge being notified of this change.

I think I will just remove the code that handles changing the learning
status at the moment, until I can make changes in the MC firmware so
that this flag is indeed per port.
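
To make the side effect concrete, here is a small userspace model (not
driver code). Only the idea that the learning flag is stored per FDB
table comes from the discussion above; the structs, the table count and
the model_* names are invented for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_FDBS 8 /* arbitrary number of FDB tables */

/* The learning flag lives in the FDB table, not in the port. */
struct model_switch {
	bool fdb_learning[MODEL_MAX_FDBS];
};

/* A port only records which FDB table it currently uses. */
struct model_port {
	uint16_t fdb_id;
};

/* Setting learning "on a port" really updates the port's FDB table. */
void model_port_set_learning(struct model_switch *sw,
			     const struct model_port *p, bool enable)
{
	assert(p->fdb_id < MODEL_MAX_FDBS);
	sw->fdb_learning[p->fdb_id] = enable;
}

/* Effective learning state of a port is whatever its FDB table says. */
bool model_port_learning(const struct model_switch *sw,
			 const struct model_port *p)
{
	assert(p->fdb_id < MODEL_MAX_FDBS);
	return sw->fdb_learning[p->fdb_id];
}
```

If two ports share FDB table 3 and learning is disabled through one of
them, the other port reports learning disabled as well, while a port on a
different table is unaffected.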

> 
> >  {
> > +	u16 fdb_id = dpaa2_switch_port_get_fdb_id(port_priv);
> > +	struct ethsw_core *ethsw = port_priv->ethsw_data;
> >  	enum dpsw_fdb_learning_mode learn_mode;
> >  	int err;
> >  
> > @@ -182,13 +219,12 @@ static int dpaa2_switch_set_learning(struct ethsw_core *ethsw, bool enable)
> >  	else
> >  		learn_mode = DPSW_FDB_LEARNING_MODE_DIS;
> >  
> > -	err = dpsw_fdb_set_learning_mode(ethsw->mc_io, 0, ethsw->dpsw_handle, 0,
> > +	err = dpsw_fdb_set_learning_mode(ethsw->mc_io, 0, ethsw->dpsw_handle, fdb_id,
> >  					 learn_mode);
> >  	if (err) {
> >  		dev_err(ethsw->dev, "dpsw_fdb_set_learning_mode err %d\n", err);
> >  		return err;
> >  	}
> > -	ethsw->learning = enable;
> >  
> >  	return 0;
> >  }
> > @@ -267,15 +303,17 @@ static int dpaa2_switch_port_fdb_add_uc(struct ethsw_port_priv *port_priv,
> >  					const unsigned char *addr)
> >  {
> >  	struct dpsw_fdb_unicast_cfg entry = {0};
> > +	u16 fdb_id;
> >  	int err;
> >  
> >  	entry.if_egress = port_priv->idx;
> >  	entry.type = DPSW_FDB_ENTRY_STATIC;
> >  	ether_addr_copy(entry.mac_addr, addr);
> >  
> > +	fdb_id = dpaa2_switch_port_get_fdb_id(port_priv);
> >  	err = dpsw_fdb_add_unicast(port_priv->ethsw_data->mc_io, 0,
> >  				   port_priv->ethsw_data->dpsw_handle,
> > -				   0, &entry);
> > +				   fdb_id, &entry);
> 
> Hmmm, so in dpaa2_switch_port_get_fdb_id you say:
> 
> 	/* If part of a bridge, use the FDB of the first dpaa2 switch interface
> 	 * to be present in that bridge
> 	 */
> 
> So let's say there is a br0 with swp3 and swp2, and a br1 with swp4.
> IIUC, br0 interfaces (swp3 and swp2) will have an fdb_id of 3 (due to
> swp3 being added first) and br1 will have an fdb_id of 4 (due to swp4).
> 
> When swp3 leaves br0, will this cause the fdb_id of swp2 to change?
> I expect the answer is yes, since otherwise swp2 and swp3 would keep
> forwarding packets to one another. Is this change graceful?
> 

Yes, the fdb_id of swp2 will change but, as the code is now, the new
fdb_id will only be used for static FDB entries and VLANs installed
after the change. Existing ones are not updated.

> For example, if you add a static FDB entry to swp2 prior to removing
> swp3, I would expect the fdb_id of swp2 to preserve that static FDB
> entry, even if swp2 now gets moved to a different fdb_id. Similarly,
> flooding settings, everything is preserved when the fdb_id changes?
> 

No, moving static FDB entries on a bridge leave/join does not happen
now.

> The flip side of that is: what happens if you add an FDB entry to swp2,
> then you remove swp2 from br0 and move it to br1? Will swp4 (which was
> already in br1) see that static FDB entry in hardware, even if the
> software bridge br1 hasn't notified you about it?
> 

As you said, moving any FDB entry from one FDB table to another one
would be a problem since the software bridge br1 would not be notified
of any of these new entries.

Taking all the above into account, what I think the code should do if
swp2, for example, leaves a bridge is to:
 - update all VLAN entries installed for swp2 so that on a hit they
   lead the packet into the new FDB table.
 - remove all static FDB entries already installed on swp2 from the FDB
   table corresponding to the bridge that the port was previously
   under.
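
The two steps above can be sketched as a userspace model. This is not
the driver's bridge-leave handler: the flat arrays, the struct layouts
and every model_* name are hypothetical, and the example assumes the
leaving port falls back to a private FDB table of its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MAX_FDB_ENTRIES 32
#define MODEL_MAX_VLAN_RULES  8

/* Static FDB entry: which table it sits in and which port owns it. */
struct model_fdb_entry {
	bool used;
	uint16_t fdb_id;
	unsigned int port;
	uint8_t mac[6];
};

/* VLAN rule: on a hit, packets from (port, vid) go to table fdb_id. */
struct model_vlan_rule {
	bool used;
	unsigned int port;
	uint16_t vid;
	uint16_t fdb_id;
};

struct model_switch {
	struct model_fdb_entry fdb[MODEL_MAX_FDB_ENTRIES];
	struct model_vlan_rule vlan[MODEL_MAX_VLAN_RULES];
};

/* Hypothetical bridge-leave handling for one port. */
void model_port_bridge_leave(struct model_switch *sw, unsigned int port,
			     uint16_t old_fdb, uint16_t new_fdb)
{
	size_t i;

	/* Step 1: re-point the port's VLAN rules at the new FDB table. */
	for (i = 0; i < MODEL_MAX_VLAN_RULES; i++)
		if (sw->vlan[i].used && sw->vlan[i].port == port)
			sw->vlan[i].fdb_id = new_fdb;

	/* Step 2: drop the port's static entries from the old table. */
	for (i = 0; i < MODEL_MAX_FDB_ENTRIES; i++)
		if (sw->fdb[i].used && sw->fdb[i].fdb_id == old_fdb &&
		    sw->fdb[i].port == port)
			memset(&sw->fdb[i], 0, sizeof(sw->fdb[i]));
}

/* Count the entries currently installed in one FDB table. */
size_t model_fdb_count(const struct model_switch *sw, uint16_t fdb_id)
{
	size_t i, n = 0;

	for (i = 0; i < MODEL_MAX_FDB_ENTRIES; i++)
		if (sw->fdb[i].used && sw->fdb[i].fdb_id == fdb_id)
			n++;
	return n;
}
```

Entries are removed rather than moved, so nothing shows up in the new
table that the software bridge has not been told about. Entries and VLAN
rules that belong to other ports in the old table are left untouched.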

> Basically, my question boils down to: why is there so little activity in
> dpaa2_switch_port_bridge_leave.
> 
> >  	if (err)
> >  		netdev_err(port_priv->netdev,
> >  			   "dpsw_fdb_add_unicast err %d\n", err);
