[RFC,6/6] Modify tag_ksz.c to support other KSZ switch drivers
diff mbox series

Message ID 93AF473E2DA327428DE3D46B72B1E9FD41121A22@CHN-SV-EXMX02.mchp-main.com
State New, archived
Headers show
Series
  • Untitled series #324469
Related show

Commit Message

Tristram.Ha@microchip.com Sept. 7, 2017, 9:09 p.m. UTC
From: Tristram Ha <Tristram.Ha@microchip.com>

The KSZ tail tag code can be used by different KSZ switch drivers.

Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
---
 		nskb = skb;
 	} else {
 		nskb = alloc_skb(NET_IP_ALIGN + skb->len +
-				 padlen + KSZ_INGRESS_TAG_LEN, GFP_ATOMIC);
+				 padlen + len, GFP_ATOMIC);
 		if (!nskb)
 			return NULL;
 		skb_reserve(nskb, NET_IP_ALIGN);
@@ -70,9 +56,7 @@ static struct sk_buff *ksz_xmit(struct sk_buff *skb, struct net_device *dev)
 		consume_skb(skb);
 	}
 
-	tag = skb_put(nskb, KSZ_INGRESS_TAG_LEN);
-	tag[0] = 0;
-	tag[1] = 1 << p->dp->index; /* destination port */
+	swdev->dev_ops->add_tail_tag(swdev, nskb, p->dp->index);
 
 	return nskb;
 }
@@ -83,16 +67,16 @@ static struct sk_buff *ksz_rcv(struct sk_buff *skb, struct net_device *dev,
 	struct dsa_switch_tree *dst = dev->dsa_ptr;
 	struct dsa_port *cpu_dp = dsa_get_cpu_port(dst);
 	struct dsa_switch *ds = cpu_dp->ds;
-	u8 *tag;
+	struct ksz_device *swdev = ds->priv;
+	int len;
 	int source_port;
 
-	tag = skb_tail_pointer(skb) - KSZ_EGRESS_TAG_LEN;
+	len = swdev->dev_ops->get_tail_tag(swdev, skb, &source_port);
 
-	source_port = tag[0] & 7;
 	if (source_port >= ds->num_ports || !ds->ports[source_port].netdev)
 		return NULL;
 
-	pskb_trim_rcsum(skb, skb->len - KSZ_EGRESS_TAG_LEN);
+	pskb_trim_rcsum(skb, skb->len - len);
 
 	skb->dev = ds->ports[source_port].netdev;

Comments

Andrew Lunn Sept. 7, 2017, 9:48 p.m. UTC | #1
On Thu, Sep 07, 2017 at 09:09:30PM +0000, Tristram.Ha@microchip.com wrote:
> From: Tristram Ha <Tristram.Ha@microchip.com>
> 
> The KSZ tail tag code can be used by different KSZ switch drivers.
> 
> Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
> ---
> diff --git a/net/dsa/tag_ksz.c b/net/dsa/tag_ksz.c index 010ca0a..d5faf14 100644
> --- a/net/dsa/tag_ksz.c
> +++ b/net/dsa/tag_ksz.c
> @@ -13,35 +13,21 @@
>  #include <linux/slab.h>
>  #include <net/dsa.h>
>  #include "dsa_priv.h"
> -
> -/* For Ingress (Host -> KSZ), 2 bytes are added before FCS.
> - * ---------------------------------------------------------------------------
> - * DA(6bytes)|SA(6bytes)|....|Data(nbytes)|tag0(1byte)|tag1(1byte)|FCS(4bytes)
> - * ---------------------------------------------------------------------------
> - * tag0 : Prioritization (not used now)
> - * tag1 : each bit represents port (eg, 0x01=port1, 0x02=port2, 0x10=port5)
> - *
> - * For Egress (KSZ -> Host), 1 byte is added before FCS.
> - * ---------------------------------------------------------------------------
> - * DA(6bytes)|SA(6bytes)|....|Data(nbytes)|tag0(1byte)|FCS(4bytes)
> - * ---------------------------------------------------------------------------
> - * tag0 : zero-based value represents port
> - *	  (eg, 0x00=port1, 0x02=port3, 0x06=port7)
> - */
> -
> -#define	KSZ_INGRESS_TAG_LEN	2
> -#define	KSZ_EGRESS_TAG_LEN	1
> +#include "../../drivers/net/dsa/microchip/ksz_priv.h"

No.

Move the header file to somewhere under include/linux. You can split
it into private and public parts if you want, and just put the public
bits in include/linux.

>  static struct sk_buff *ksz_xmit(struct sk_buff *skb, struct net_device *dev)  {
>  	struct dsa_slave_priv *p = netdev_priv(dev);
> +	struct dsa_switch *ds = p->dp->ds;
> +	struct ksz_device *swdev = ds->priv;
>  	struct sk_buff *nskb;
> +	int len;
>  	int padlen;
> -	u8 *tag;
>  
>  	padlen = (skb->len >= ETH_ZLEN) ? 0 : ETH_ZLEN - skb->len;
>  
> -	if (skb_tailroom(skb) >= padlen + KSZ_INGRESS_TAG_LEN) {
> +	len = swdev->dev_ops->get_tx_len(swdev);

This is ugly. We have a clean separation between a switch driver and a
tag driver. Here you are mixing them together. Don't. Look at the
mailing lists for what Florian and I suggested to Pavel. It will also
solve your include file issue above.

	 Andrew
Tristram.Ha@microchip.com Sept. 18, 2017, 7:38 p.m. UTC | #2
Sorry for the late response.

> > diff --git a/net/dsa/tag_ksz.c b/net/dsa/tag_ksz.c index 010ca0a..d5faf14
> 100644
> > --- a/net/dsa/tag_ksz.c
> > +++ b/net/dsa/tag_ksz.c
> >  static struct sk_buff *ksz_xmit(struct sk_buff *skb, struct net_device
> *dev)  {
> >  	struct dsa_slave_priv *p = netdev_priv(dev);
> > +	struct dsa_switch *ds = p->dp->ds;
> > +	struct ksz_device *swdev = ds->priv;
> >  	struct sk_buff *nskb;
> > +	int len;
> >  	int padlen;
> > -	u8 *tag;
> >
> >  	padlen = (skb->len >= ETH_ZLEN) ? 0 : ETH_ZLEN - skb->len;
> >
> > -	if (skb_tailroom(skb) >= padlen + KSZ_INGRESS_TAG_LEN) {
> > +	len = swdev->dev_ops->get_tx_len(swdev);
> 
> This is ugly. We have a clean separation between a switch driver and a
> tag driver. Here you are mixing them together. Don't. Look at the
> mailing lists for what Florian and I suggested to Pavel. It will also
> solve your include file issue above.

It seems to me all tag_*.c are hard-coded.  They do not have access to
the switch device and just use the bit definitions defined in the top to
do the job.

This creates a problem for all KSZ switch devices as they have different
tail tag formats and lengths.  There will be potentially 4 additional
DSA_TAG_PROT_KSZ* just to handle them.  Extra files will be added
with about the same code.

The KSZ9477 driver has its own problem as the tail tag length is dynamic
and not fixed.  Right now the function feature that affects this behavior is
not turned on and so the problem does not happen.

But the most important thing is how do we support the offload_fwd_mark
bit in skb when the only place that has access to skb does not have access to
the switch hardware?  I thought this bit is important for the new switchdev model.

A little out-of-topic is the MAC driver may have problem receiving the frame if
the tail tag length is too big.  Most of the MAC drivers have big enough receive
buffer or require 64-bytes multiple so there are extra room to accommodate
the big tail tag at the end of the frame.  Still some MAC drivers use exactly 1500
MTU and some additional length and may run into this problem.  Currently the
Atmel Cadence MAC driver has this potential problem when certain conditions
are met.
Andrew Lunn Sept. 18, 2017, 7:57 p.m. UTC | #3
On Mon, Sep 18, 2017 at 07:38:03PM +0000, Tristram.Ha@microchip.com wrote:
> Sorry for the late response.
> 
> > > diff --git a/net/dsa/tag_ksz.c b/net/dsa/tag_ksz.c index 010ca0a..d5faf14
> > 100644
> > > --- a/net/dsa/tag_ksz.c
> > > +++ b/net/dsa/tag_ksz.c
> > >  static struct sk_buff *ksz_xmit(struct sk_buff *skb, struct net_device
> > *dev)  {
> > >  	struct dsa_slave_priv *p = netdev_priv(dev);
> > > +	struct dsa_switch *ds = p->dp->ds;
> > > +	struct ksz_device *swdev = ds->priv;
> > >  	struct sk_buff *nskb;
> > > +	int len;
> > >  	int padlen;
> > > -	u8 *tag;
> > >
> > >  	padlen = (skb->len >= ETH_ZLEN) ? 0 : ETH_ZLEN - skb->len;
> > >
> > > -	if (skb_tailroom(skb) >= padlen + KSZ_INGRESS_TAG_LEN) {
> > > +	len = swdev->dev_ops->get_tx_len(swdev);
> > 
> > This is ugly. We have a clean separation between a switch driver and a
> > tag driver. Here you are mixing them together. Don't. Look at the
> > mailing lists for what Florian and I suggested to Pavel. It will also
> > solve your include file issue above.
> 
> It seems to me all tag_*.c are hard-coded.  They do not have access to
> the switch device and just use the bit definitions defined in the top to
> do the job.
> 
> This creates a problem for all KSZ switch devices as they have different
> tail tag formats and lengths.  There will be potentially 4 additional
> DSA_TAG_PROT_KSZ* just to handle them.  Extra files will be added
> with about the same code.

Hi Tristram

Think about factoring out the common code into a shared functions.

> But the most important thing is how do we support the offload_fwd_mark
> bit in skb when the only place that has access to skb does not have access to
> the switch hardware?

Well, so far, nothing has set offload_fwd_mark. This is about to
change, since i need it for IGMP snooping on the brX interface, for
Marvell at least. Bit i just need to set it to true for all frames.

What use cases do you have in mind where you need information from the
switch driver to decide on its value?

> A little out-of-topic is the MAC driver may have problem receiving the frame if
> the tail tag length is too big.  Most of the MAC drivers have big enough receive
> buffer or require 64-bytes multiple so there are extra room to accommodate
> the big tail tag at the end of the frame.  Still some MAC drivers use exactly 1500
> MTU and some additional length and may run into this problem.  Currently the
> Atmel Cadence MAC driver has this potential problem when certain conditions
> are met.

This is a know problem. I just fixed the FEC driver which also
discarded DSA tagged frames when they were bigger than the MTU.

So far, it has not been too much of an issue, because most boards with
a switch, use an Ethernet interface from the same manufacture. Marvell
Switches with a Marvell Ethernet interface. Broadcom switch with a
Broadcom Interface. Atheros switch with an Atheros interface. And
naturally, these combinations just work. But Freescale FEC and Marvell
had not been combined before, and it was broken.

So i would not be surprised if the Cadence MAC driver had an issue.

   Andrew
Tristram.Ha@microchip.com Sept. 18, 2017, 8:55 p.m. UTC | #4
> > > This is ugly. We have a clean separation between a switch driver and a
> > > tag driver. Here you are mixing them together. Don't. Look at the
> > > mailing lists for what Florian and I suggested to Pavel. It will also
> > > solve your include file issue above.
> >
> > It seems to me all tag_*.c are hard-coded.  They do not have access to
> > the switch device and just use the bit definitions defined in the top to
> > do the job.
> >
> > This creates a problem for all KSZ switch devices as they have different
> > tail tag formats and lengths.  There will be potentially 4 additional
> > DSA_TAG_PROT_KSZ* just to handle them.  Extra files will be added
> > with about the same code.
> 
> Hi Tristram
> 
> Think about factoring out the common code into a shared functions.
>

I am a little unsure what you have in mind.  Can you elaborate?

Like creating an additional ksz_xmit function and passing a function pointer
to it to do the job?
 
> > But the most important thing is how do we support the offload_fwd_mark
> > bit in skb when the only place that has access to skb does not have access
> to
> > the switch hardware?
> 
> Well, so far, nothing has set offload_fwd_mark. This is about to
> change, since i need it for IGMP snooping on the brX interface, for
> Marvell at least. Bit i just need to set it to true for all frames.
> 
> What use cases do you have in mind where you need information from the
> switch driver to decide on its value?
>

In the old DSA implementation all the ports are partitioned into its own device
and the bridge joining them will do all the forwarding.  This is useful for quick
testing with some protocols like RSTP but it is probably useless for real
operation.

The new switchdev model tries to use the switch hardware as much as
possible.  This offload_fwd_mark bit means the frame is forwarded by the
hardware switch, so the software bridge does not need to do it again.  Without
this bit there will be duplicated multicast frames coming out the ports if internal
forwarding is enabled.

When the DSA device is first created all ports are independent devices.  When
a bridge is created and those devices are attached the ports share the same
membership so all frames will be forwarded internally.

When RSTP is used the port can be put in blocked state and so the forwarding
will stop for that port.   Currently the switch driver will check that membership
to decide whether to set that bit.

> > A little out-of-topic is the MAC driver may have problem receiving the
> frame if
> > the tail tag length is too big.  Most of the MAC drivers have big enough
> receive
> > buffer or require 64-bytes multiple so there are extra room to
> accommodate
> > the big tail tag at the end of the frame.  Still some MAC drivers use exactly
> 1500
> > MTU and some additional length and may run into this problem.  Currently
> the
> > Atmel Cadence MAC driver has this potential problem when certain
> conditions
> > are met.
> 
> This is a know problem. I just fixed the FEC driver which also
> discarded DSA tagged frames when they were bigger than the MTU.
> 
> So far, it has not been too much of an issue, because most boards with
> a switch, use an Ethernet interface from the same manufacture. Marvell
> Switches with a Marvell Ethernet interface. Broadcom switch with a
> Broadcom Interface. Atheros switch with an Atheros interface. And
> naturally, these combinations just work. But Freescale FEC and Marvell
> had not been combined before, and it was broken.
> 
> So i would not be surprised if the Cadence MAC driver had an issue.

The KSZ switches never have a built-in MAC controller, so it is always a support
issue for us.
Andrew Lunn Sept. 18, 2017, 10:42 p.m. UTC | #5
On Mon, Sep 18, 2017 at 08:55:17PM +0000, Tristram.Ha@microchip.com wrote:
> > > > This is ugly. We have a clean separation between a switch driver and a
> > > > tag driver. Here you are mixing them together. Don't. Look at the
> > > > mailing lists for what Florian and I suggested to Pavel. It will also
> > > > solve your include file issue above.
> > >
> > > It seems to me all tag_*.c are hard-coded.  They do not have access to
> > > the switch device and just use the bit definitions defined in the top to
> > > do the job.
> > >
> > > This creates a problem for all KSZ switch devices as they have different
> > > tail tag formats and lengths.  There will be potentially 4 additional
> > > DSA_TAG_PROT_KSZ* just to handle them.  Extra files will be added
> > > with about the same code.
> > 
> > Hi Tristram
> > 
> > Think about factoring out the common code into a shared functions.
> >
> 
> I am a little unsure what you have in mind.  Can you elaborate?

You say you need 4 DSA_TAG_PROT_KSZ. I guess the code is nearly
identical in them all? If i remember correctly, the two tag bytes are
the other way around?

static int ksz8k_xmit( *skb, struct net_device *dev)
{
	uint8* tag;

	tag = ksz_xmit(skb, dev)
	if (!tag)
	     return NULL;

	tag[0] = 1 << p->dp->index;
	tag[1] = 0;

	return skb;
}

static int ksz9k_xmit( *skb, struct net_device *dev)
{
	uint8* tag;

	tag = ksz_xmit(skb, dev)
	if (!tag)
	     return NULL;

	tag[0] = 0;
	tag[1] = 1 << p->dp->index;

	return skb;
}

const struct dsa_device_ops ksz8k_netdev_ops = {
       .xmit   = ksz8k_xmit,
       .rcv    = ksz8k_rcv,
};

const struct dsa_device_ops ksz9k_netdev_ops = {
       .xmit   = ksz9k_xmit,
       .rcv    = ksz9k_rcv,
};

	Andrew
Andrew Lunn Sept. 18, 2017, 10:51 p.m. UTC | #6
> In the old DSA implementation all the ports are partitioned into its own device
> and the bridge joining them will do all the forwarding.  This is useful for quick
> testing with some protocols like RSTP but it is probably useless for real
> operation.

It is a good minimal driver, to get something into the kernel. You can
then add features to it.

> The new switchdev model tries to use the switch hardware as much as
> possible.  This offload_fwd_mark bit means the frame is forwarded by the
> hardware switch, so the software bridge does not need to do it again.  Without
> this bit there will be duplicated multicast frames coming out the ports if internal
> forwarding is enabled.

Correct. Once you switch driver is clever enough, you can enable
offload_fwd_mark.
 
> When RSTP is used the port can be put in blocked state and so the forwarding
> will stop for that port.   Currently the switch driver will check that membership
> to decide whether to set that bit.

This i don't get. RSTP or STP just break loops. How does RSTP vs STP
mean you need to set offload_fwd_mark differently?

> The KSZ switches never have a built-in MAC controller, so it is always a support
> issue for us.

Not quite right. Once your drivers are in mainline, it becomes a
support issue for the community, with you being part of the community.
I've helped others fix this issue, one of the less well used Marvell
Ethernet drivers had this problem, and i gave somebody pointers how to
fix it.

    Andrew
Tristram.Ha@microchip.com Sept. 18, 2017, 11:44 p.m. UTC | #7
> > In the old DSA implementation all the ports are partitioned into its own
> device
> > and the bridge joining them will do all the forwarding.  This is useful for
> quick
> > testing with some protocols like RSTP but it is probably useless for real
> > operation.
> 
> It is a good minimal driver, to get something into the kernel. You can
> then add features to it.
> 
> > The new switchdev model tries to use the switch hardware as much as
> > possible.  This offload_fwd_mark bit means the frame is forwarded by the
> > hardware switch, so the software bridge does not need to do it again.
> Without
> > this bit there will be duplicated multicast frames coming out the ports if
> internal
> > forwarding is enabled.
> 
> Correct. Once you switch driver is clever enough, you can enable
> offload_fwd_mark.
> 
> > When RSTP is used the port can be put in blocked state and so the
> forwarding
> > will stop for that port.   Currently the switch driver will check that
> membership
> > to decide whether to set that bit.
> 
> This i don't get. RSTP or STP just break loops. How does RSTP vs STP
> mean you need to set offload_fwd_mark differently?
> 

The logic of the switch driver is if the membership of the port receiving
the frame contains other ports--not counting cpu port--the bit
offload_fwd_mark is set.  In RSTP closing the blocked port is generally good
enough, but there are exceptions, so the port is removed from the
membership of other forwarding ports.  A disabled port will have its
membership completely reset so it cannot receive anything.  It does not
matter much in RSTP as the software bridge should know whether to forward
the frame or not.

We are back to square one.  Is there any plan to add this offload_fwd_mark
support to DSA driver so that it can be reported properly?  It can be set all the
time, except during port initialization or before bridge creation the forwarding
state does not reflect reality.

If not the port membership can be fixed and there is no internal switch
forwarding, leaving everything handled by the software bridge.
Florian Fainelli Sept. 18, 2017, 11:48 p.m. UTC | #8
On 09/18/2017 04:44 PM, Tristram.Ha@microchip.com wrote:
>>> In the old DSA implementation all the ports are partitioned into its own
>> device
>>> and the bridge joining them will do all the forwarding.  This is useful for
>> quick
>>> testing with some protocols like RSTP but it is probably useless for real
>>> operation.
>>
>> It is a good minimal driver, to get something into the kernel. You can
>> then add features to it.
>>
>>> The new switchdev model tries to use the switch hardware as much as
>>> possible.  This offload_fwd_mark bit means the frame is forwarded by the
>>> hardware switch, so the software bridge does not need to do it again.
>> Without
>>> this bit there will be duplicated multicast frames coming out the ports if
>> internal
>>> forwarding is enabled.
>>
>> Correct. Once you switch driver is clever enough, you can enable
>> offload_fwd_mark.
>>
>>> When RSTP is used the port can be put in blocked state and so the
>> forwarding
>>> will stop for that port.   Currently the switch driver will check that
>> membership
>>> to decide whether to set that bit.
>>
>> This i don't get. RSTP or STP just break loops. How does RSTP vs STP
>> mean you need to set offload_fwd_mark differently?
>>
> 
> The logic of the switch driver is if the membership of the port receiving
> the frame contains other ports--not counting cpu port--the bit
> offload_fwd_mark is set.  In RSTP closing the blocked port is generally good
> enough, but there are exceptions, so the port is removed from the
> membership of other forwarding ports.  A disabled port will have its
> membership completely reset so it cannot receive anything.  It does not
> matter much in RSTP as the software bridge should know whether to forward
> the frame or not.
> 
> We are back to square one.  Is there any plan to add this offload_fwd_mark
> support to DSA driver so that it can be reported properly?  It can be set all the
> time, except during port initialization or before bridge creation the forwarding
> state does not reflect reality.
> 
> If not the port membership can be fixed and there is no internal switch
> forwarding, leaving everything handled by the software bridge.

I am not really sure why this is such a concern for you so soon when
your driver is not even included yet. You should really aim for baby
steps here: get the basic driver(s) included, with a limited set of
features, and gradually add more features to the driver. When
fwd_offload_mark and RSTP become a real problem, we can most
definitively find a way to fix those in DSA and depending drivers.
Tristram.Ha@microchip.com Sept. 19, 2017, 12:45 a.m. UTC | #9
> I am not really sure why this is such a concern for you so soon when
> your driver is not even included yet. You should really aim for baby
> steps here: get the basic driver(s) included, with a limited set of
> features, and gradually add more features to the driver. When
> fwd_offload_mark and RSTP become a real problem, we can most
> definitively find a way to fix those in DSA and depending drivers.

I was under the impression that there is a new push of this new switchdev
model and so the DSA model was overhauled to support that.

The KSZ9477 driver is already in the kernel, and its register access is actually
much different from the other older switches.  There are not much common
code to be reused.  I always know this tail tag handling is the sticking point.
I will submit a much simplified driver and wait for switch access in the future.

Patch
diff mbox series

diff --git a/net/dsa/tag_ksz.c b/net/dsa/tag_ksz.c index 010ca0a..d5faf14 100644
--- a/net/dsa/tag_ksz.c
+++ b/net/dsa/tag_ksz.c
@@ -13,35 +13,21 @@ 
 #include <linux/slab.h>
 #include <net/dsa.h>
 #include "dsa_priv.h"
-
-/* For Ingress (Host -> KSZ), 2 bytes are added before FCS.
- * ---------------------------------------------------------------------------
- * DA(6bytes)|SA(6bytes)|....|Data(nbytes)|tag0(1byte)|tag1(1byte)|FCS(4bytes)
- * ---------------------------------------------------------------------------
- * tag0 : Prioritization (not used now)
- * tag1 : each bit represents port (eg, 0x01=port1, 0x02=port2, 0x10=port5)
- *
- * For Egress (KSZ -> Host), 1 byte is added before FCS.
- * ---------------------------------------------------------------------------
- * DA(6bytes)|SA(6bytes)|....|Data(nbytes)|tag0(1byte)|FCS(4bytes)
- * ---------------------------------------------------------------------------
- * tag0 : zero-based value represents port
- *	  (eg, 0x00=port1, 0x02=port3, 0x06=port7)
- */
-
-#define	KSZ_INGRESS_TAG_LEN	2
-#define	KSZ_EGRESS_TAG_LEN	1
+#include "../../drivers/net/dsa/microchip/ksz_priv.h"
 
 static struct sk_buff *ksz_xmit(struct sk_buff *skb, struct net_device *dev)  {
 	struct dsa_slave_priv *p = netdev_priv(dev);
+	struct dsa_switch *ds = p->dp->ds;
+	struct ksz_device *swdev = ds->priv;
 	struct sk_buff *nskb;
+	int len;
 	int padlen;
-	u8 *tag;
 
 	padlen = (skb->len >= ETH_ZLEN) ? 0 : ETH_ZLEN - skb->len;
 
-	if (skb_tailroom(skb) >= padlen + KSZ_INGRESS_TAG_LEN) {
+	len = swdev->dev_ops->get_tx_len(swdev);
+	if (skb_tailroom(skb) >= padlen + len) {
 		/* Let dsa_slave_xmit() free skb */
 		if (__skb_put_padto(skb, skb->len + padlen, false))
 			return NULL;
@@ -49,7 +35,7 @@  static struct sk_buff *ksz_xmit(struct sk_buff *skb, struct net_device *dev)