linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
@ 2021-12-07 14:59 Ansuel Smith
  2021-12-07 14:59 ` [net-next RFC PATCH 1/6] net: dsa: tag_qca: convert to FIELD macro Ansuel Smith
                   ` (7 more replies)
  0 siblings, 8 replies; 40+ messages in thread
From: Ansuel Smith @ 2021-12-07 14:59 UTC (permalink / raw)
  To: Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, linux-kernel, netdev
  Cc: Ansuel Smith

Hi, this is still WIP and currently has some problem but I would love if
someone can give this a superficial review and answer to some problem
with this.

The main reason for this is that we notice some routing problem in the
switch and it seems assisted learning is needed. Considering mdio is
quite slow due to the indirect write using this Ethernet alternative way
seems to be quicker.

The qca8k switch supports a special way to pass mdio read/write request
using specially crafted Ethernet packet.
This works by putting some defined data in the Ethernet header where the
mac source and dst should be placed. The Ethernet type header is set to qca
header and is set to a mdio read/write type.
This is used to communicate to the switch that this is a special packet
and should be parsed differently.

Current implementation of this use completion API to wait for the packet
to be processed by the tagger and has a timeout that fallback to the
legacy mdio way and mutex to enforce one transaction at time.

Here I list the main concern I have about this:
- Is the changes done to the tagger acceptable? (moving stuff to global
  include)
- Is it correct to put the skb generation code in the qca8k source?
- Is the changes generally correct? (referring to how this is
  implemented with part of the implementation split between the tagger
  and the driver)

I still have to find a solution to a slowdown problem and this is where
I would love to get some hint.
Currently I still didn't find a good way to understand when the tagger
starts to accept packets and because of this the initial setup is slow
as every completion timeouts. Am I missing something or is there a way
to check for this?
After the initial slowdown, as soon as the cpu port is ready and starts
to accept packet, every transaction is near instant and no completion
timeouts.

As I said this is still WIP but it does work correctly aside from the
initial slowdown problem. (the slowdown is in the first port init and at
the first port init... from port 2 the tagger starts to accept packet
and this starts to work)

Ansuel Smith (6):
  net: dsa: tag_qca: convert to FIELD macro
  net: dsa: tag_qca: move define to include linux/dsa
  net: dsa: tag_qca: add define for mdio read/write in ethernet packet
  net: dsa: qca8k: Add support for mdio read/write in Ethernet packet
  net: dsa: tag_qca: Add support for handling mdio read/write packet
  net: dsa: qca8k: cache lo and hi for mdio write

 drivers/net/dsa/qca8k.c     | 205 ++++++++++++++++++++++++++++++++++--
 drivers/net/dsa/qca8k.h     |   5 +
 include/linux/dsa/tag_qca.h |  75 +++++++++++++
 net/dsa/tag_qca.c           |  69 +++++++-----
 4 files changed, 320 insertions(+), 34 deletions(-)
 create mode 100644 include/linux/dsa/tag_qca.h

-- 
2.32.0


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [net-next RFC PATCH 1/6] net: dsa: tag_qca: convert to FIELD macro
  2021-12-07 14:59 [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet Ansuel Smith
@ 2021-12-07 14:59 ` Ansuel Smith
  2021-12-07 14:59 ` [net-next RFC PATCH 2/6] net: dsa: tag_qca: move define to include linux/dsa Ansuel Smith
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 40+ messages in thread
From: Ansuel Smith @ 2021-12-07 14:59 UTC (permalink / raw)
  To: Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, linux-kernel, netdev
  Cc: Ansuel Smith

Convert driver to FIELD macro to drop redundant define.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
---
 net/dsa/tag_qca.c | 34 +++++++++++++++-------------------
 1 file changed, 15 insertions(+), 19 deletions(-)

diff --git a/net/dsa/tag_qca.c b/net/dsa/tag_qca.c
index 1ea9401b8ace..55fa6b96b4eb 100644
--- a/net/dsa/tag_qca.c
+++ b/net/dsa/tag_qca.c
@@ -4,29 +4,24 @@
  */
 
 #include <linux/etherdevice.h>
+#include <linux/bitfield.h>
 
 #include "dsa_priv.h"
 
 #define QCA_HDR_LEN	2
 #define QCA_HDR_VERSION	0x2
 
-#define QCA_HDR_RECV_VERSION_MASK	GENMASK(15, 14)
-#define QCA_HDR_RECV_VERSION_S		14
-#define QCA_HDR_RECV_PRIORITY_MASK	GENMASK(13, 11)
-#define QCA_HDR_RECV_PRIORITY_S		11
-#define QCA_HDR_RECV_TYPE_MASK		GENMASK(10, 6)
-#define QCA_HDR_RECV_TYPE_S		6
+#define QCA_HDR_RECV_VERSION		GENMASK(15, 14)
+#define QCA_HDR_RECV_PRIORITY		GENMASK(13, 11)
+#define QCA_HDR_RECV_TYPE		GENMASK(10, 6)
 #define QCA_HDR_RECV_FRAME_IS_TAGGED	BIT(3)
-#define QCA_HDR_RECV_SOURCE_PORT_MASK	GENMASK(2, 0)
-
-#define QCA_HDR_XMIT_VERSION_MASK	GENMASK(15, 14)
-#define QCA_HDR_XMIT_VERSION_S		14
-#define QCA_HDR_XMIT_PRIORITY_MASK	GENMASK(13, 11)
-#define QCA_HDR_XMIT_PRIORITY_S		11
-#define QCA_HDR_XMIT_CONTROL_MASK	GENMASK(10, 8)
-#define QCA_HDR_XMIT_CONTROL_S		8
+#define QCA_HDR_RECV_SOURCE_PORT	GENMASK(2, 0)
+
+#define QCA_HDR_XMIT_VERSION		GENMASK(15, 14)
+#define QCA_HDR_XMIT_PRIORITY		GENMASK(13, 11)
+#define QCA_HDR_XMIT_CONTROL		GENMASK(10, 8)
 #define QCA_HDR_XMIT_FROM_CPU		BIT(7)
-#define QCA_HDR_XMIT_DP_BIT_MASK	GENMASK(6, 0)
+#define QCA_HDR_XMIT_DP_BIT		GENMASK(6, 0)
 
 static struct sk_buff *qca_tag_xmit(struct sk_buff *skb, struct net_device *dev)
 {
@@ -40,8 +35,9 @@ static struct sk_buff *qca_tag_xmit(struct sk_buff *skb, struct net_device *dev)
 	phdr = dsa_etype_header_pos_tx(skb);
 
 	/* Set the version field, and set destination port information */
-	hdr = QCA_HDR_VERSION << QCA_HDR_XMIT_VERSION_S |
-		QCA_HDR_XMIT_FROM_CPU | BIT(dp->index);
+	hdr = FIELD_PREP(QCA_HDR_XMIT_VERSION, QCA_HDR_VERSION);
+	hdr |= QCA_HDR_XMIT_FROM_CPU;
+	hdr |= FIELD_PREP(QCA_HDR_XMIT_DP_BIT, BIT(dp->index));
 
 	*phdr = htons(hdr);
 
@@ -62,7 +58,7 @@ static struct sk_buff *qca_tag_rcv(struct sk_buff *skb, struct net_device *dev)
 	hdr = ntohs(*phdr);
 
 	/* Make sure the version is correct */
-	ver = (hdr & QCA_HDR_RECV_VERSION_MASK) >> QCA_HDR_RECV_VERSION_S;
+	ver = FIELD_GET(QCA_HDR_RECV_VERSION, hdr);
 	if (unlikely(ver != QCA_HDR_VERSION))
 		return NULL;
 
@@ -71,7 +67,7 @@ static struct sk_buff *qca_tag_rcv(struct sk_buff *skb, struct net_device *dev)
 	dsa_strip_etype_header(skb, QCA_HDR_LEN);
 
 	/* Get source port information */
-	port = (hdr & QCA_HDR_RECV_SOURCE_PORT_MASK);
+	port = FIELD_GET(QCA_HDR_RECV_SOURCE_PORT, hdr);
 
 	skb->dev = dsa_master_find_slave(dev, 0, port);
 	if (!skb->dev)
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [net-next RFC PATCH 2/6] net: dsa: tag_qca: move define to include linux/dsa
  2021-12-07 14:59 [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet Ansuel Smith
  2021-12-07 14:59 ` [net-next RFC PATCH 1/6] net: dsa: tag_qca: convert to FIELD macro Ansuel Smith
@ 2021-12-07 14:59 ` Ansuel Smith
  2021-12-07 14:59 ` [net-next RFC PATCH 3/6] net: dsa: tag_qca: add define for mdio read/write in ethernet packet Ansuel Smith
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 40+ messages in thread
From: Ansuel Smith @ 2021-12-07 14:59 UTC (permalink / raw)
  To: Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, linux-kernel, netdev
  Cc: Ansuel Smith

Move tag_qca define to include dir linux/dsa as the qca8k require access
to the tagger define to support in-band mdio read/write using ethernet
packet.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
---
 include/linux/dsa/tag_qca.h | 21 +++++++++++++++++++++
 net/dsa/tag_qca.c           | 16 +---------------
 2 files changed, 22 insertions(+), 15 deletions(-)
 create mode 100644 include/linux/dsa/tag_qca.h

diff --git a/include/linux/dsa/tag_qca.h b/include/linux/dsa/tag_qca.h
new file mode 100644
index 000000000000..c02d2d39ff4a
--- /dev/null
+++ b/include/linux/dsa/tag_qca.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __TAG_QCA_H
+#define __TAG_QCA_H
+
+#define QCA_HDR_LEN	2
+#define QCA_HDR_VERSION	0x2
+
+#define QCA_HDR_RECV_VERSION		GENMASK(15, 14)
+#define QCA_HDR_RECV_PRIORITY		GENMASK(13, 11)
+#define QCA_HDR_RECV_TYPE		GENMASK(10, 6)
+#define QCA_HDR_RECV_FRAME_IS_TAGGED	BIT(3)
+#define QCA_HDR_RECV_SOURCE_PORT	GENMASK(2, 0)
+
+#define QCA_HDR_XMIT_VERSION		GENMASK(15, 14)
+#define QCA_HDR_XMIT_PRIORITY		GENMASK(13, 11)
+#define QCA_HDR_XMIT_CONTROL		GENMASK(10, 8)
+#define QCA_HDR_XMIT_FROM_CPU		BIT(7)
+#define QCA_HDR_XMIT_DP_BIT		GENMASK(6, 0)
+
+#endif /* __TAG_QCA_H */
diff --git a/net/dsa/tag_qca.c b/net/dsa/tag_qca.c
index 55fa6b96b4eb..34e565e00ece 100644
--- a/net/dsa/tag_qca.c
+++ b/net/dsa/tag_qca.c
@@ -5,24 +5,10 @@
 
 #include <linux/etherdevice.h>
 #include <linux/bitfield.h>
+#include <linux/dsa/tag_qca.h>
 
 #include "dsa_priv.h"
 
-#define QCA_HDR_LEN	2
-#define QCA_HDR_VERSION	0x2
-
-#define QCA_HDR_RECV_VERSION		GENMASK(15, 14)
-#define QCA_HDR_RECV_PRIORITY		GENMASK(13, 11)
-#define QCA_HDR_RECV_TYPE		GENMASK(10, 6)
-#define QCA_HDR_RECV_FRAME_IS_TAGGED	BIT(3)
-#define QCA_HDR_RECV_SOURCE_PORT	GENMASK(2, 0)
-
-#define QCA_HDR_XMIT_VERSION		GENMASK(15, 14)
-#define QCA_HDR_XMIT_PRIORITY		GENMASK(13, 11)
-#define QCA_HDR_XMIT_CONTROL		GENMASK(10, 8)
-#define QCA_HDR_XMIT_FROM_CPU		BIT(7)
-#define QCA_HDR_XMIT_DP_BIT		GENMASK(6, 0)
-
 static struct sk_buff *qca_tag_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct dsa_port *dp = dsa_slave_to_port(dev);
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [net-next RFC PATCH 3/6] net: dsa: tag_qca: add define for mdio read/write in ethernet packet
  2021-12-07 14:59 [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet Ansuel Smith
  2021-12-07 14:59 ` [net-next RFC PATCH 1/6] net: dsa: tag_qca: convert to FIELD macro Ansuel Smith
  2021-12-07 14:59 ` [net-next RFC PATCH 2/6] net: dsa: tag_qca: move define to include linux/dsa Ansuel Smith
@ 2021-12-07 14:59 ` Ansuel Smith
  2021-12-07 14:59 ` [net-next RFC PATCH 4/6] net: dsa: qca8k: Add support for mdio read/write in Ethernet packet Ansuel Smith
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 40+ messages in thread
From: Ansuel Smith @ 2021-12-07 14:59 UTC (permalink / raw)
  To: Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, linux-kernel, netdev
  Cc: Ansuel Smith

Add all the required define to prepare support for mdio read/write in
Ethernet packet. Any packet of this type has to be dropped as the only
use of these special packet is receive ack for an mdio write request or
receive data for an mdio read request.
A struct is used that emulates the Ethernet header but is used for a
different purpose.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
---
 include/linux/dsa/tag_qca.h | 41 +++++++++++++++++++++++++++++++++++++
 net/dsa/tag_qca.c           | 13 +++++++++---
 2 files changed, 51 insertions(+), 3 deletions(-)

diff --git a/include/linux/dsa/tag_qca.h b/include/linux/dsa/tag_qca.h
index c02d2d39ff4a..578a4aeafd92 100644
--- a/include/linux/dsa/tag_qca.h
+++ b/include/linux/dsa/tag_qca.h
@@ -12,10 +12,51 @@
 #define QCA_HDR_RECV_FRAME_IS_TAGGED	BIT(3)
 #define QCA_HDR_RECV_SOURCE_PORT	GENMASK(2, 0)
 
+/* Packet type for recv */
+#define QCA_HDR_RECV_TYPE_NORMAL	0x0
+#define QCA_HDR_RECV_TYPE_MIB		0x1
+#define QCA_HDR_RECV_TYPE_RW_REG_ACK	0x2
+
 #define QCA_HDR_XMIT_VERSION		GENMASK(15, 14)
 #define QCA_HDR_XMIT_PRIORITY		GENMASK(13, 11)
 #define QCA_HDR_XMIT_CONTROL		GENMASK(10, 8)
 #define QCA_HDR_XMIT_FROM_CPU		BIT(7)
 #define QCA_HDR_XMIT_DP_BIT		GENMASK(6, 0)
 
+/* Packet type for xmit */
+#define QCA_HDR_XMIT_TYPE_NORMAL	0x0
+#define QCA_HDR_XMIT_TYPE_RW_REG	0x1
+
+#define MDIO_CHECK_CODE_VAL		0x5
+
+/* Specific define for in-band MDIO read/write with Ethernet packet */
+#define QCA_HDR_MDIO_SEQ_LEN		4 /* 4 byte for the seq */
+#define QCA_HDR_MDIO_COMMAND_LEN	4 /* 4 byte for the command */
+#define QCA_HDR_MDIO_DATA1_LEN		4 /* First 4 byte for the mdio data */
+#define QCA_HDR_MDIO_HEADER_LEN		(QCA_HDR_MDIO_SEQ_LEN + \
+					QCA_HDR_MDIO_COMMAND_LEN + \
+					QCA_HDR_MDIO_DATA1_LEN)
+
+#define QCA_HDR_MDIO_DATA2_LEN		12 /* Other 12 byte for the mdio data */
+#define QCA_HDR_MDIO_PADDING_LEN	34 /* Padding to reach the min Ethernet packet */
+
+#define QCA_HDR_MDIO_PKG_LEN		(QCA_HDR_MDIO_HEADER_LEN + \
+					QCA_HDR_LEN + \
+					QCA_HDR_MDIO_DATA2_LEN + \
+					QCA_HDR_MDIO_PADDING_LEN)
+
+#define QCA_HDR_MDIO_SEQ_NUM		GENMASK(31, 0)  /* 63, 32 */
+#define QCA_HDR_MDIO_CHECK_CODE		GENMASK(31, 29) /* 31, 29 */
+#define QCA_HDR_MDIO_CMD		BIT(28)		/* 28 */
+#define QCA_HDR_MDIO_LENGTH		GENMASK(23, 20) /* 23, 20 */
+#define QCA_HDR_MDIO_ADDR		GENMASK(18, 0)  /* 18, 0 */
+
+/* Special struct emulating a Ethernet header */
+struct mdio_ethhdr {
+	u32 command;		/* command bit 31:0 */
+	u32 seq;		/* seq 63:32 */
+	u32 mdio_data;		/* first 4byte mdio */
+	u16 hdr;		/* qca hdr */
+} __packed;
+
 #endif /* __TAG_QCA_H */
diff --git a/net/dsa/tag_qca.c b/net/dsa/tag_qca.c
index 34e565e00ece..b8b05d54a74c 100644
--- a/net/dsa/tag_qca.c
+++ b/net/dsa/tag_qca.c
@@ -32,10 +32,10 @@ static struct sk_buff *qca_tag_xmit(struct sk_buff *skb, struct net_device *dev)
 
 static struct sk_buff *qca_tag_rcv(struct sk_buff *skb, struct net_device *dev)
 {
-	u8 ver;
-	u16  hdr;
-	int port;
+	u16  hdr, pk_type;
 	__be16 *phdr;
+	int port;
+	u8 ver;
 
 	if (unlikely(!pskb_may_pull(skb, QCA_HDR_LEN)))
 		return NULL;
@@ -48,6 +48,13 @@ static struct sk_buff *qca_tag_rcv(struct sk_buff *skb, struct net_device *dev)
 	if (unlikely(ver != QCA_HDR_VERSION))
 		return NULL;
 
+	/* Get pk type */
+	pk_type = FIELD_GET(QCA_HDR_RECV_TYPE, hdr);
+
+	/* MDIO read/write packet */
+	if (pk_type == QCA_HDR_RECV_TYPE_RW_REG_ACK)
+		return NULL;
+
 	/* Remove QCA tag and recalculate checksum */
 	skb_pull_rcsum(skb, QCA_HDR_LEN);
 	dsa_strip_etype_header(skb, QCA_HDR_LEN);
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [net-next RFC PATCH 4/6] net: dsa: qca8k: Add support for mdio read/write in Ethernet packet
  2021-12-07 14:59 [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet Ansuel Smith
                   ` (2 preceding siblings ...)
  2021-12-07 14:59 ` [net-next RFC PATCH 3/6] net: dsa: tag_qca: add define for mdio read/write in ethernet packet Ansuel Smith
@ 2021-12-07 14:59 ` Ansuel Smith
  2021-12-07 14:59 ` [net-next RFC PATCH 5/6] net: dsa: tag_qca: Add support for handling mdio read/write packet Ansuel Smith
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 40+ messages in thread
From: Ansuel Smith @ 2021-12-07 14:59 UTC (permalink / raw)
  To: Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, linux-kernel, netdev
  Cc: Ansuel Smith

Add qca8k side support for mdio read/write in Ethernet packet.
qca8k supports some specially crafted Ethernet packet that can be used
for mdio read/write instead of the legacy method uart/internal mdio.
This add support for the qca8k side to craft the packet and enqueue it.
Each port and the qca8k_priv have a special struct to put data in it.
The completion API is used to wait for the packet to be received back
with the requested data.

The various steps are:
1. Craft the special packet with the qca hdr set to mdio read/write
   mode.
2. Set the lock in the dedicated mdio struct.
3. Reinit the completion.
4. Enqueue the packet.
5. Wait the packet to be received.
6. Use the data set by the tagger to complete the mdio operation.

If the completion timeouts or the ack value is not true, the legacy
mdio way is used.

It has to be considered that in the initial setup mdio is still used and
mdio is still used until DSA is ready to accept and tag packet.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
---
 drivers/net/dsa/qca8k.c     | 156 +++++++++++++++++++++++++++++++++++-
 drivers/net/dsa/qca8k.h     |   5 ++
 include/linux/dsa/tag_qca.h |  13 +++
 3 files changed, 172 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index 96a7fbf8700c..d2c6139be9ac 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -20,6 +20,7 @@
 #include <linux/phylink.h>
 #include <linux/gpio/consumer.h>
 #include <linux/etherdevice.h>
+#include <linux/dsa/tag_qca.h>
 
 #include "qca8k.h"
 
@@ -170,6 +171,128 @@ qca8k_rmw(struct qca8k_priv *priv, u32 reg, u32 mask, u32 write_val)
 	return regmap_update_bits(priv->regmap, reg, mask, write_val);
 }
 
+static struct sk_buff *qca8k_alloc_mdio_header(struct qca8k_port_tag *header, enum mdio_cmd cmd,
+					       u32 reg, u32 *val)
+{
+	struct mdio_ethhdr *mdio_ethhdr;
+	struct sk_buff *skb;
+	u16 hdr;
+
+	skb = dev_alloc_skb(QCA_HDR_MDIO_PKG_LEN);
+
+	prefetchw(skb->data);
+
+	skb_reset_mac_header(skb);
+	skb_set_network_header(skb, skb->len);
+
+	mdio_ethhdr = skb_push(skb, QCA_HDR_MDIO_HEADER_LEN + QCA_HDR_LEN);
+
+	hdr = FIELD_PREP(QCA_HDR_XMIT_VERSION, QCA_HDR_VERSION);
+	hdr |= QCA_HDR_XMIT_FROM_CPU;
+	hdr |= FIELD_PREP(QCA_HDR_XMIT_DP_BIT, BIT(0));
+	hdr |= FIELD_PREP(QCA_HDR_XMIT_CONTROL, QCA_HDR_XMIT_TYPE_RW_REG);
+
+	mdio_ethhdr->seq = FIELD_PREP(QCA_HDR_MDIO_SEQ_NUM, 200);
+
+	mdio_ethhdr->command = FIELD_PREP(QCA_HDR_MDIO_ADDR, reg);
+	mdio_ethhdr->command |= FIELD_PREP(QCA_HDR_MDIO_LENGTH, 4);
+	mdio_ethhdr->command |= FIELD_PREP(QCA_HDR_MDIO_CMD, cmd);
+	mdio_ethhdr->command |= FIELD_PREP(QCA_HDR_MDIO_CHECK_CODE, MDIO_CHECK_CODE_VAL);
+
+	if (cmd == MDIO_WRITE)
+		mdio_ethhdr->mdio_data = *val;
+
+	mdio_ethhdr->hdr = htons(hdr);
+
+	skb_put_zero(skb, QCA_HDR_MDIO_DATA2_LEN);
+	skb_put_zero(skb, QCA_HDR_MDIO_PADDING_LEN);
+
+	return skb;
+}
+
+static int qca8k_read_eth(struct qca8k_priv *priv, u32 reg, u32 *val)
+{
+	struct qca8k_port_tag *header = priv->header_mdio;
+	struct sk_buff *skb;
+	bool ack;
+	int ret;
+
+	skb = qca8k_alloc_mdio_header(header, MDIO_READ, reg, 0);
+	skb->dev = dsa_to_port(priv->ds, 0)->master;
+
+	mutex_lock(&header->mdio_mutex);
+
+	reinit_completion(&header->rw_done);
+	header->seq = 200;
+	header->ack = false;
+
+	dev_queue_xmit(skb);
+
+	ret = wait_for_completion_timeout(&header->rw_done, QCA8K_MDIO_RW_ETHERNET);
+
+	*val = header->data[0];
+	ack = header->ack;
+
+	mutex_unlock(&header->mdio_mutex);
+
+	if (ret <= 0)
+		return -ETIMEDOUT;
+
+	if (!ack)
+		return -EINVAL;
+
+	return 0;
+}
+
+static int qca8k_write_eth(struct qca8k_priv *priv, u32 reg, u32 val)
+{
+	struct qca8k_port_tag *header = priv->header_mdio;
+	struct sk_buff *skb;
+	bool ack;
+	int ret;
+
+	skb = qca8k_alloc_mdio_header(header, MDIO_WRITE, reg, &val);
+	skb->dev = dsa_to_port(priv->ds, 0)->master;
+
+	mutex_lock(&header->mdio_mutex);
+
+	dev_queue_xmit(skb);
+
+	reinit_completion(&header->rw_done);
+	header->ack = false;
+	header->seq = 200;
+
+	ret = wait_for_completion_timeout(&header->rw_done, QCA8K_MDIO_RW_ETHERNET);
+
+	ack = header->ack;
+
+	mutex_unlock(&header->mdio_mutex);
+
+	if (ret <= 0)
+		return -ETIMEDOUT;
+
+	if (!ack)
+		return -EINVAL;
+
+	return 0;
+}
+
+static int
+qca8k_regmap_update_bits_eth(struct qca8k_priv *priv, u32 reg, u32 mask, u32 write_val)
+{
+	u32 val = 0;
+	int ret;
+
+	ret = qca8k_read_eth(priv, reg, &val);
+	if (ret)
+		return ret;
+
+	val &= ~mask;
+	val |= write_val;
+
+	return qca8k_write_eth(priv, reg, val);
+}
+
 static int
 qca8k_regmap_read(void *ctx, uint32_t reg, uint32_t *val)
 {
@@ -178,6 +301,10 @@ qca8k_regmap_read(void *ctx, uint32_t reg, uint32_t *val)
 	u16 r1, r2, page;
 	int ret;
 
+	if (priv->atheros_header_ready)
+		if (!qca8k_read_eth(priv, reg, val))
+			return 0;
+
 	qca8k_split_addr(reg, &r1, &r2, &page);
 
 	mutex_lock_nested(&bus->mdio_lock, MDIO_MUTEX_NESTED);
@@ -201,6 +328,10 @@ qca8k_regmap_write(void *ctx, uint32_t reg, uint32_t val)
 	u16 r1, r2, page;
 	int ret;
 
+	if (priv->atheros_header_ready)
+		if (!qca8k_write_eth(priv, reg, val))
+			return 0;
+
 	qca8k_split_addr(reg, &r1, &r2, &page);
 
 	mutex_lock_nested(&bus->mdio_lock, MDIO_MUTEX_NESTED);
@@ -225,6 +356,10 @@ qca8k_regmap_update_bits(void *ctx, uint32_t reg, uint32_t mask, uint32_t write_
 	u32 val;
 	int ret;
 
+	if (priv->atheros_header_ready)
+		if (!qca8k_regmap_update_bits_eth(priv, reg, mask, write_val))
+			return 0;
+
 	qca8k_split_addr(reg, &r1, &r2, &page);
 
 	mutex_lock_nested(&bus->mdio_lock, MDIO_MUTEX_NESTED);
@@ -1223,8 +1358,13 @@ qca8k_setup(struct dsa_switch *ds)
 	 * Configure specific switch configuration for ports
 	 */
 	for (i = 0; i < QCA8K_NUM_PORTS; i++) {
+		struct dsa_port *dp = dsa_to_port(ds, i);
+
+		/* Set the header_mdio to be accessible by the qca tagger */
+		dp->priv = priv->header_mdio;
+
 		/* CPU port gets connected to all user ports of the switch */
-		if (dsa_is_cpu_port(ds, i)) {
+		if (dsa_port_is_cpu(dp)) {
 			ret = qca8k_rmw(priv, QCA8K_PORT_LOOKUP_CTRL(i),
 					QCA8K_PORT_LOOKUP_MEMBER, dsa_user_ports(ds));
 			if (ret)
@@ -1232,7 +1372,7 @@ qca8k_setup(struct dsa_switch *ds)
 		}
 
 		/* Individual user ports get connected to CPU port only */
-		if (dsa_is_user_port(ds, i)) {
+		if (dsa_port_is_user(dp)) {
 			ret = qca8k_rmw(priv, QCA8K_PORT_LOOKUP_CTRL(i),
 					QCA8K_PORT_LOOKUP_MEMBER,
 					BIT(cpu_port));
@@ -1684,6 +1824,9 @@ qca8k_phylink_mac_link_up(struct dsa_switch *ds, int port, unsigned int mode,
 	reg |= QCA8K_PORT_STATUS_TXMAC | QCA8K_PORT_STATUS_RXMAC;
 
 	qca8k_write(priv, QCA8K_REG_PORT_STATUS(port), reg);
+
+	if (dsa_is_cpu_port(ds, port))
+		priv->atheros_header_ready = true;
 }
 
 static void
@@ -2452,6 +2595,7 @@ static int qca8k_read_switch_id(struct qca8k_priv *priv)
 static int
 qca8k_sw_probe(struct mdio_device *mdiodev)
 {
+	struct qca8k_port_tag *header_mdio;
 	struct qca8k_priv *priv;
 	int ret;
 
@@ -2462,6 +2606,13 @@ qca8k_sw_probe(struct mdio_device *mdiodev)
 	if (!priv)
 		return -ENOMEM;
 
+	header_mdio = devm_kzalloc(&mdiodev->dev, sizeof(*header_mdio), GFP_KERNEL);
+	if (!header_mdio)
+		return -ENOMEM;
+
+	mutex_init(&header_mdio->mdio_mutex);
+	init_completion(&header_mdio->rw_done);
+
 	priv->bus = mdiodev->bus;
 	priv->dev = &mdiodev->dev;
 
@@ -2501,6 +2652,7 @@ qca8k_sw_probe(struct mdio_device *mdiodev)
 	priv->ds->priv = priv;
 	priv->ops = qca8k_switch_ops;
 	priv->ds->ops = &priv->ops;
+	priv->header_mdio = header_mdio;
 	mutex_init(&priv->reg_mutex);
 	dev_set_drvdata(&mdiodev->dev, priv);
 
diff --git a/drivers/net/dsa/qca8k.h b/drivers/net/dsa/qca8k.h
index ab4a417b25a9..149bc4280856 100644
--- a/drivers/net/dsa/qca8k.h
+++ b/drivers/net/dsa/qca8k.h
@@ -11,6 +11,9 @@
 #include <linux/delay.h>
 #include <linux/regmap.h>
 #include <linux/gpio.h>
+#include <linux/dsa/tag_qca.h>
+
+#define QCA8K_MDIO_RW_ETHERNET				100
 
 #define QCA8K_NUM_PORTS					7
 #define QCA8K_NUM_CPU_PORTS				2
@@ -342,6 +345,7 @@ struct qca8k_priv {
 	u8 mirror_rx;
 	u8 mirror_tx;
 	u8 lag_hash_mode;
+	bool atheros_header_ready;
 	bool legacy_phy_port_mapping;
 	struct qca8k_ports_config ports_config;
 	struct regmap *regmap;
@@ -353,6 +357,7 @@ struct qca8k_priv {
 	struct dsa_switch_ops ops;
 	struct gpio_desc *reset_gpio;
 	unsigned int port_mtu[QCA8K_NUM_PORTS];
+	struct qca8k_port_tag *header_mdio;
 };
 
 struct qca8k_mib_desc {
diff --git a/include/linux/dsa/tag_qca.h b/include/linux/dsa/tag_qca.h
index 578a4aeafd92..a45a973865c3 100644
--- a/include/linux/dsa/tag_qca.h
+++ b/include/linux/dsa/tag_qca.h
@@ -59,4 +59,17 @@ struct mdio_ethhdr {
 	u16 hdr;		/* qca hdr */
 } __packed;
 
+enum mdio_cmd {
+	MDIO_WRITE = 0x0,
+	MDIO_READ
+};
+
+struct qca8k_port_tag {
+	struct completion rw_done;
+	struct mutex mdio_mutex; /* Enforce one mdio read/write at time */
+	bool ack;
+	u32 seq;
+	u32 data[4];
+};
+
 #endif /* __TAG_QCA_H */
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [net-next RFC PATCH 5/6] net: dsa: tag_qca: Add support for handling mdio read/write packet
  2021-12-07 14:59 [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet Ansuel Smith
                   ` (3 preceding siblings ...)
  2021-12-07 14:59 ` [net-next RFC PATCH 4/6] net: dsa: qca8k: Add support for mdio read/write in Ethernet packet Ansuel Smith
@ 2021-12-07 14:59 ` Ansuel Smith
  2021-12-07 14:59 ` [net-next RFC PATCH 6/6] net: dsa: qca8k: cache lo and hi for mdio write Ansuel Smith
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 40+ messages in thread
From: Ansuel Smith @ 2021-12-07 14:59 UTC (permalink / raw)
  To: Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, linux-kernel, netdev
  Cc: Ansuel Smith

Handle mdio read/write Ethernet packet.
When a packet is received, these operation are done:
1. Qca HDR is checked.
2. Packet type is checked.
3. If the type is an mdio read/write packet is parsed.
4. The header data is parsed and put in the generic mdio struct.
5. The rest of the data is copied to the data mdio struct.
6. The seq number is checked and copared with the one in the mdio struct
7. The ack is set to true to set a correct read/write operation
8. The completion is complete
9. The packet is dropped.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
---
 net/dsa/tag_qca.c | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/net/dsa/tag_qca.c b/net/dsa/tag_qca.c
index b8b05d54a74c..1d2c4f519c99 100644
--- a/net/dsa/tag_qca.c
+++ b/net/dsa/tag_qca.c
@@ -9,6 +9,30 @@
 
 #include "dsa_priv.h"
 
+static void qca_tag_handle_mdio_packet(struct sk_buff *skb,
+				       struct net_device *dev)
+{
+	struct mdio_ethhdr *mdio_ethhdr;
+	struct qca8k_port_tag *header;
+	struct dsa_port *cpu_dp;
+
+	cpu_dp = dev->dsa_ptr;
+	header = cpu_dp->priv;
+
+	mdio_ethhdr = (struct mdio_ethhdr *)skb_mac_header(skb);
+
+	header->data[0] = mdio_ethhdr->mdio_data;
+
+	/* Get the rest of the 12 byte of data */
+	memcpy(header->data + 1, skb->data, QCA_HDR_MDIO_DATA2_LEN);
+
+	/* Make sure the seq match the requested packet */
+	if (mdio_ethhdr->seq == header->seq)
+		header->ack = true;
+
+	complete(&header->rw_done);
+}
+
 static struct sk_buff *qca_tag_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct dsa_port *dp = dsa_slave_to_port(dev);
@@ -52,8 +76,10 @@ static struct sk_buff *qca_tag_rcv(struct sk_buff *skb, struct net_device *dev)
 	pk_type = FIELD_GET(QCA_HDR_RECV_TYPE, hdr);
 
 	/* MDIO read/write packet */
-	if (pk_type == QCA_HDR_RECV_TYPE_RW_REG_ACK)
+	if (pk_type == QCA_HDR_RECV_TYPE_RW_REG_ACK) {
+		qca_tag_handle_mdio_packet(skb, dev);
 		return NULL;
+	}
 
 	/* Remove QCA tag and recalculate checksum */
 	skb_pull_rcsum(skb, QCA_HDR_LEN);
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [net-next RFC PATCH 6/6] net: dsa: qca8k: cache lo and hi for mdio write
  2021-12-07 14:59 [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet Ansuel Smith
                   ` (4 preceding siblings ...)
  2021-12-07 14:59 ` [net-next RFC PATCH 5/6] net: dsa: tag_qca: Add support for handling mdio read/write packet Ansuel Smith
@ 2021-12-07 14:59 ` Ansuel Smith
  2021-12-07 15:15 ` [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet Andrew Lunn
  2021-12-07 18:41 ` Andrew Lunn
  7 siblings, 0 replies; 40+ messages in thread
From: Ansuel Smith @ 2021-12-07 14:59 UTC (permalink / raw)
  To: Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, linux-kernel, netdev
  Cc: Ansuel Smith

From Documentation, we can cache lo and hi the same way we do with the
page. This massively reduce the mdio write as 3/4 of the time we only
require to write the lo or hi part for a mdio write.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
---
 drivers/net/dsa/qca8k.c | 49 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 44 insertions(+), 5 deletions(-)

diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index d2c6139be9ac..64643f1e2f16 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -94,6 +94,48 @@ qca8k_split_addr(u32 regaddr, u16 *r1, u16 *r2, u16 *page)
 	*page = regaddr & 0x3ff;
 }
 
+static u16 qca8k_current_lo = 0xffff;
+
+static int
+qca8k_set_lo(struct mii_bus *bus, int phy_id, u32 regnum, u16 lo)
+{
+	int ret;
+
+	if (lo == qca8k_current_lo) {
+		// pr_info("SAME LOW");
+		return 0;
+	}
+
+	ret = bus->write(bus, phy_id, regnum, lo);
+	if (ret < 0)
+		dev_err_ratelimited(&bus->dev,
+				    "failed to write qca8k 32bit lo register\n");
+
+	qca8k_current_lo = lo;
+	return 0;
+}
+
+static u16 qca8k_current_hi = 0xffff;
+
+static int
+qca8k_set_hi(struct mii_bus *bus, int phy_id, u32 regnum, u16 hi)
+{
+	int ret;
+
+	if (hi == qca8k_current_hi) {
+		// pr_info("SAME HI");
+		return 0;
+	}
+
+	ret = bus->write(bus, phy_id, regnum, hi);
+	if (ret < 0)
+		dev_err_ratelimited(&bus->dev,
+				    "failed to write qca8k 32bit hi register\n");
+
+	qca8k_current_hi = hi;
+	return 0;
+}
+
 static int
 qca8k_mii_read32(struct mii_bus *bus, int phy_id, u32 regnum, u32 *val)
 {
@@ -125,12 +167,9 @@ qca8k_mii_write32(struct mii_bus *bus, int phy_id, u32 regnum, u32 val)
 	lo = val & 0xffff;
 	hi = (u16)(val >> 16);
 
-	ret = bus->write(bus, phy_id, regnum, lo);
+	ret = qca8k_set_lo(bus, phy_id, regnum, lo);
 	if (ret >= 0)
-		ret = bus->write(bus, phy_id, regnum + 1, hi);
-	if (ret < 0)
-		dev_err_ratelimited(&bus->dev,
-				    "failed to write qca8k 32bit register\n");
+		ret = qca8k_set_hi(bus, phy_id, regnum + 1, hi);
 }
 
 static int
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 14:59 [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet Ansuel Smith
                   ` (5 preceding siblings ...)
  2021-12-07 14:59 ` [net-next RFC PATCH 6/6] net: dsa: qca8k: cache lo and hi for mdio write Ansuel Smith
@ 2021-12-07 15:15 ` Andrew Lunn
  2021-12-07 15:33   ` Ansuel Smith
  2021-12-07 18:49   ` Florian Fainelli
  2021-12-07 18:41 ` Andrew Lunn
  7 siblings, 2 replies; 40+ messages in thread
From: Andrew Lunn @ 2021-12-07 15:15 UTC (permalink / raw)
  To: Ansuel Smith
  Cc: Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, linux-kernel, netdev

On Tue, Dec 07, 2021 at 03:59:36PM +0100, Ansuel Smith wrote:
> Hi, this is still WIP and currently has some problem but I would love if
> someone can give this a superficial review and answer to some problem
> with this.
> 
> The main reason for this is that we notice some routing problem in the
> switch and it seems assisted learning is needed. Considering mdio is
> quite slow due to the indirect write using this Ethernet alternative way
> seems to be quicker.
> 
> The qca8k switch supports a special way to pass mdio read/write request
> using specially crafted Ethernet packet.

Oh! Cool! Marvell has this as well, and i suspect a few others. It is
something i've wanted to work on for a long long time, but never had
the opportunity.

This also means that, even if you are focusing on qca8k, please try to
think what could be generic, and what should specific to the
qca8k. The idea of sending an Ethernet frame and sometime later
receiving a reply should be generic and usable for other DSA
drivers. The contents of those frames needs to be driver specific.
How we hook this into MDIO might also be generic, maybe.

I will look at your questions later, but soon.

  Andrew

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 15:15 ` [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet Andrew Lunn
@ 2021-12-07 15:33   ` Ansuel Smith
  2021-12-07 18:49   ` Florian Fainelli
  1 sibling, 0 replies; 40+ messages in thread
From: Ansuel Smith @ 2021-12-07 15:33 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, linux-kernel, netdev

On Tue, Dec 07, 2021 at 04:15:51PM +0100, Andrew Lunn wrote:
> On Tue, Dec 07, 2021 at 03:59:36PM +0100, Ansuel Smith wrote:
> > Hi, this is still WIP and currently has some problem but I would love if
> > someone can give this a superficial review and answer to some problem
> > with this.
> > 
> > The main reason for this is that we notice some routing problem in the
> > switch and it seems assisted learning is needed. Considering mdio is
> > quite slow due to the indirect write using this Ethernet alternative way
> > seems to be quicker.
> > 
> > The qca8k switch supports a special way to pass mdio read/write request
> > using specially crafted Ethernet packet.
> 
> Oh! Cool! Marvell has this as well, and i suspect a few others. It is
> something i've wanted to work on for a long long time, but never had
> the opportunity.
>

Really? I tought this was very specific to qca8k.

> This also means that, even if you are focusing on qca8k, please try to
> think what could be generic, and what should specific to the
> qca8k. The idea of sending an Ethernet frame and sometime later
> receiving a reply should be generic and usable for other DSA
> drivers. The contents of those frames needs to be driver specific.
> How we hook this into MDIO might also be generic, maybe.

A generic implementation of this would be add an ops to the dsa generic
struct that driver can implement and find a clean way to use this
alternative way instead of normal mdio. (Implement something like
eth_mdio_read/write ops and the driver will decide when to use them?
The dsa then will correctly understand when the driver is ready to
accept packet and send the skb, in all the other case just send an error
and the driver will use normal mdio?)

I think the tagger require some modification anyway as it's the one that
receive packet and parse them. (I assume also other switch will use the
tagger to understand that the packet is used for mdio)
(A bool to declare that the tagger can correctly parse this kind of
stuff and complete the completion?)

> 
> I will look at your questions later, but soon.
> 

Thanks a lot for the quick response. I'm more than happy to generalize
this and find the a correct way.

>   Andrew

-- 
	Ansuel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 14:59 [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet Ansuel Smith
                   ` (6 preceding siblings ...)
  2021-12-07 15:15 ` [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet Andrew Lunn
@ 2021-12-07 18:41 ` Andrew Lunn
  2021-12-07 18:53   ` Ansuel Smith
  7 siblings, 1 reply; 40+ messages in thread
From: Andrew Lunn @ 2021-12-07 18:41 UTC (permalink / raw)
  To: Ansuel Smith
  Cc: Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, linux-kernel, netdev

> I still have to find a solution to a slowdown problem and this is where
> I would love to get some hint.
> Currently I still didn't find a good way to understand when the tagger
> starts to accept packets and because of this the initial setup is slow
> as every completion timeouts. Am I missing something or is there a way
> to check for this?

I've not looked at this particular driver, i just know the general
architecture.

The MDIO bus driver probes first, maybe as part of the Ethernet
driver, maybe as a standalone MDIO driver. The switch is found in DT
and the driver code will at some point later probe the switch driver.

The switch driver has working MDIO at this point. It should use MDIO
to talk to the switch, make sure it is there, maybe do some initial
configuration. Once it is happy, it registers the switch with the DSA
core using dsa_register_switch().

If this is a single switch, the DSA core will then start setting
things up. As part of dsa_switch_setup() it will call the switch
drivers setup() method. It then figures out what tag driver to use, by
calling dsa_switch_setup_tag_protocol(). However, the tag driver
itself is not inserted into the chain yet. That happens later.  Once
the switch is setup, dsa_tree_setup_master() is called which does
dsa_master_setup() and in the middle there is:

	/* If we use a tagging format that doesn't have an ethertype
	 * field, make sure that all packets from this point on get
	 * sent to the tag format's receive function.
	 */
	wmb();

	dev->dsa_ptr = cpu_dp;

This is the magic to actually enable the tagger receiving frames.

I need to look at your patches, but why is the tagger involved?  At
least for the Marvell switch, you send a pretty normal looking
Ethernet frame to a specific MAC address, and the switch replies using
that MAC address. And it has an Ether Type specific to switch
control. Since this is all normal looking, there are hooks in the
network stack which can be used to get these frames.

	Andrew


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 15:15 ` [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet Andrew Lunn
  2021-12-07 15:33   ` Ansuel Smith
@ 2021-12-07 18:49   ` Florian Fainelli
  2021-12-07 19:44     ` Ansuel Smith
  2021-12-07 21:10     ` Vladimir Oltean
  1 sibling, 2 replies; 40+ messages in thread
From: Florian Fainelli @ 2021-12-07 18:49 UTC (permalink / raw)
  To: Andrew Lunn, Ansuel Smith
  Cc: Vivien Didelot, Vladimir Oltean, David S. Miller, Jakub Kicinski,
	linux-kernel, netdev

On 12/7/21 7:15 AM, Andrew Lunn wrote:
> On Tue, Dec 07, 2021 at 03:59:36PM +0100, Ansuel Smith wrote:
>> Hi, this is still WIP and currently has some problem but I would love if
>> someone can give this a superficial review and answer to some problem
>> with this.
>>
>> The main reason for this is that we notice some routing problem in the
>> switch and it seems assisted learning is needed. Considering mdio is
>> quite slow due to the indirect write using this Ethernet alternative way
>> seems to be quicker.
>>
>> The qca8k switch supports a special way to pass mdio read/write request
>> using specially crafted Ethernet packet.
> 
> Oh! Cool! Marvell has this as well, and i suspect a few others. It is
> something i've wanted to work on for a long long time, but never had
> the opportunity.
> 
> This also means that, even if you are focusing on qca8k, please try to
> think what could be generic, and what should specific to the
> qca8k. The idea of sending an Ethernet frame and sometime later
> receiving a reply should be generic and usable for other DSA
> drivers. The contents of those frames needs to be driver specific.
> How we hook this into MDIO might also be generic, maybe.
> 
> I will look at your questions later, but soon.

There was a priori attempt from Vivien to add support for mv88e6xxx over
RMU frames:

https://www.mail-archive.com/netdev@vger.kernel.org/msg298317.html

This gets interesting because the switch's control path moves from MDIO
to Ethernet and there is not really an "ethernet bus" though we could
certainly come up with one. We have mdio-i2c, so maybe we should have
mdio-ethernet?
-- 
Florian

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 18:41 ` Andrew Lunn
@ 2021-12-07 18:53   ` Ansuel Smith
  2021-12-07 19:15     ` Andrew Lunn
  0 siblings, 1 reply; 40+ messages in thread
From: Ansuel Smith @ 2021-12-07 18:53 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, linux-kernel, netdev

On Tue, Dec 07, 2021 at 07:41:23PM +0100, Andrew Lunn wrote:
> > I still have to find a solution to a slowdown problem and this is where
> > I would love to get some hint.
> > Currently I still didn't find a good way to understand when the tagger
> > starts to accept packets and because of this the initial setup is slow
> > as every completion timeouts. Am I missing something or is there a way
> > to check for this?
> 
> I've not looked at this particular driver, i just know the general
> architecture.
> 
> The MDIO bus driver probes first, maybe as part of the Ethernet
> driver, maybe as a standalone MDIO driver. The switch is found in DT
> and the driver code will at some point later probe the switch driver.
> 
> The switch driver has working MDIO at this point. It should use MDIO
> to talk to the switch, make sure it is there, maybe do some initial
> configuration. Once it is happy, it registers the switch with the DSA
> core using dsa_register_switch().
> 
> If this is a single switch, the DSA core will then start setting
> things up. As part of dsa_switch_setup() it will call the switch
> drivers setup() method. It then figures out what tag driver to use, by
> calling dsa_switch_setup_tag_protocol(). However, the tag driver
> itself is not inserted into the chain yet. That happens later.  Once
> the switch is setup, dsa_tree_setup_master() is called which does
> dsa_master_setup() and in the middle there is:
> 
> 	/* If we use a tagging format that doesn't have an ethertype
> 	 * field, make sure that all packets from this point on get
> 	 * sent to the tag format's receive function.
> 	 */
> 	wmb();
> 
> 	dev->dsa_ptr = cpu_dp;
> 
> This is the magic to actually enable the tagger receiving frames.
> 

Will check if using this is the correct way to prevent use of this
alternative way before it's available.

> I need to look at your patches, but why is the tagger involved?  At
> least for the Marvell switch, you send a pretty normal looking
> Ethernet frame to a specific MAC address, and the switch replies using
> that MAC address. And it has an Ether Type specific to switch
> control. Since this is all normal looking, there are hooks in the
> network stack which can be used to get these frames.
>

The qca tag header provide a TYPE value that refer to a big list of
Frame type. In all of this at value 2 we have the type that tells us
that is a READ_WRITE_REG_ACK (aka a mdio rw Ethernet packet)

The idea of using the tagger is to skip parsing the packet 2 times
considering the qca tag header is present at the same place in both
normal packet and mdio rw Ethernet packet.

Your idea would be hook this before the tagger and parse it?
I assume that is the only way if this has to be generilized. But I
wonder if this would create some overhead by the double parsing.

> 	Andrew
> 

-- 
	Ansuel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 18:53   ` Ansuel Smith
@ 2021-12-07 19:15     ` Andrew Lunn
  2021-12-07 19:21       ` Ansuel Smith
  0 siblings, 1 reply; 40+ messages in thread
From: Andrew Lunn @ 2021-12-07 19:15 UTC (permalink / raw)
  To: Ansuel Smith
  Cc: Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, linux-kernel, netdev

> The qca tag header provide a TYPE value that refer to a big list of
> Frame type. In all of this at value 2 we have the type that tells us
> that is a READ_WRITE_REG_ACK (aka a mdio rw Ethernet packet)
> 
> The idea of using the tagger is to skip parsing the packet 2 times
> considering the qca tag header is present at the same place in both
> normal packet and mdio rw Ethernet packet.
> 
> Your idea would be hook this before the tagger and parse it?
> I assume that is the only way if this has to be generilized. But I
> wonder if this would create some overhead by the double parsing.

So it seems i remembered this incorrectly. Marvell call this Remote
Management Unit, RMU. And RMU makes use of bits inside the Marvell
Tag. I was thinking it was outside of the tag.

So, yes, the tagger does need to be involved in this.

The initial design of DSA was that the tagger and main driver were
kept separate. This has been causing us problems recently, we have use
cases where we need to share information between the tagger and the
driver. This looks like it is going to be another case of that.

	Andrew

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 19:15     ` Andrew Lunn
@ 2021-12-07 19:21       ` Ansuel Smith
  2021-12-07 20:52         ` Vladimir Oltean
  0 siblings, 1 reply; 40+ messages in thread
From: Ansuel Smith @ 2021-12-07 19:21 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, linux-kernel, netdev

On Tue, Dec 07, 2021 at 08:15:24PM +0100, Andrew Lunn wrote:
> > The qca tag header provide a TYPE value that refer to a big list of
> > Frame type. In all of this at value 2 we have the type that tells us
> > that is a READ_WRITE_REG_ACK (aka a mdio rw Ethernet packet)
> > 
> > The idea of using the tagger is to skip parsing the packet 2 times
> > considering the qca tag header is present at the same place in both
> > normal packet and mdio rw Ethernet packet.
> > 
> > Your idea would be hook this before the tagger and parse it?
> > I assume that is the only way if this has to be generilized. But I
> > wonder if this would create some overhead by the double parsing.
> 
> So it seems i remembered this incorrectly. Marvell call this Remote
> Management Unit, RMU. And RMU makes use of bits inside the Marvell
> Tag. I was thinking it was outside of the tag.
> 
> So, yes, the tagger does need to be involved in this.
> 
> The initial design of DSA was that the tagger and main driver were
> kept separate. This has been causing us problems recently, we have use
> cases where we need to share information between the tagger and the
> driver. This looks like it is going to be another case of that.
> 
> 	Andrew

I mean if you check the code this is still somewhat ""separate"".
I ""abuse"" the dsa port priv to share the required data.
(I allocate a different struct... i put it in qca8k_priv and i set every
port priv to this struct)

Wonder if we can add something to share data between the driver and the
port so the access that from the tagger. (something that doesn't use the
port priv)

-- 
	Ansuel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 18:49   ` Florian Fainelli
@ 2021-12-07 19:44     ` Ansuel Smith
  2021-12-07 21:10     ` Vladimir Oltean
  1 sibling, 0 replies; 40+ messages in thread
From: Ansuel Smith @ 2021-12-07 19:44 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Andrew Lunn, Vivien Didelot, Vladimir Oltean, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Tue, Dec 07, 2021 at 10:49:43AM -0800, Florian Fainelli wrote:
> On 12/7/21 7:15 AM, Andrew Lunn wrote:
> > On Tue, Dec 07, 2021 at 03:59:36PM +0100, Ansuel Smith wrote:
> >> Hi, this is still WIP and currently has some problem but I would love if
> >> someone can give this a superficial review and answer to some problem
> >> with this.
> >>
> >> The main reason for this is that we notice some routing problem in the
> >> switch and it seems assisted learning is needed. Considering mdio is
> >> quite slow due to the indirect write using this Ethernet alternative way
> >> seems to be quicker.
> >>
> >> The qca8k switch supports a special way to pass mdio read/write request
> >> using specially crafted Ethernet packet.
> > 
> > Oh! Cool! Marvell has this as well, and i suspect a few others. It is
> > something i've wanted to work on for a long long time, but never had
> > the opportunity.
> > 
> > This also means that, even if you are focusing on qca8k, please try to
> > think what could be generic, and what should specific to the
> > qca8k. The idea of sending an Ethernet frame and sometime later
> > receiving a reply should be generic and usable for other DSA
> > drivers. The contents of those frames needs to be driver specific.
> > How we hook this into MDIO might also be generic, maybe.
> > 
> > I will look at your questions later, but soon.
> 
> There was a priori attempt from Vivien to add support for mv88e6xxx over
> RMU frames:
> 
> https://www.mail-archive.com/netdev@vger.kernel.org/msg298317.html
> 
> This gets interesting because the switch's control path moves from MDIO
> to Ethernet and there is not really an "ethernet bus" though we could
> certainly come up with one. We have mdio-i2c, so maybe we should have
> mdio-ethernet?
> -- 
> Florian

I checked that series and I notice that the proposed implementation used
a workqueue. The current implementation here use completion and mutex so
the transaction is really one command at time and wait for response.
Considering most of the time we do read and write operation is seems a
bit overkill to use a queue... Also to track the response.
Using a single queue simplify the implementation and should be just
good. (btw qca8k supports a way to queue packet using a seq int but we
don't use it to keep things simple)

Is that acceptable? Also I notice in that series mru have some
limitation and can be used only for some kind of data...
Should we add a way to blacklist some particular reg and use the legacy
mdio way?

-- 
	Ansuel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 19:21       ` Ansuel Smith
@ 2021-12-07 20:52         ` Vladimir Oltean
  2021-12-07 21:47           ` Ansuel Smith
  0 siblings, 1 reply; 40+ messages in thread
From: Vladimir Oltean @ 2021-12-07 20:52 UTC (permalink / raw)
  To: Ansuel Smith
  Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Tue, Dec 07, 2021 at 08:21:52PM +0100, Ansuel Smith wrote:
> On Tue, Dec 07, 2021 at 08:15:24PM +0100, Andrew Lunn wrote:
> > > The qca tag header provide a TYPE value that refer to a big list of
> > > Frame type. In all of this at value 2 we have the type that tells us
> > > that is a READ_WRITE_REG_ACK (aka a mdio rw Ethernet packet)
> > > 
> > > The idea of using the tagger is to skip parsing the packet 2 times
> > > considering the qca tag header is present at the same place in both
> > > normal packet and mdio rw Ethernet packet.
> > > 
> > > Your idea would be hook this before the tagger and parse it?
> > > I assume that is the only way if this has to be generilized. But I
> > > wonder if this would create some overhead by the double parsing.
> > 
> > So it seems i remembered this incorrectly. Marvell call this Remote
> > Management Unit, RMU. And RMU makes use of bits inside the Marvell
> > Tag. I was thinking it was outside of the tag.
> > 
> > So, yes, the tagger does need to be involved in this.
> > 
> > The initial design of DSA was that the tagger and main driver were
> > kept separate. This has been causing us problems recently, we have use
> > cases where we need to share information between the tagger and the
> > driver. This looks like it is going to be another case of that.
> > 
> > 	Andrew
> 
> I mean if you check the code this is still somewhat ""separate"".
> I ""abuse"" the dsa port priv to share the required data.
> (I allocate a different struct... i put it in qca8k_priv and i set every
> port priv to this struct)
> 
> Wonder if we can add something to share data between the driver and the
> port so the access that from the tagger. (something that doesn't use the
> port priv)

The one problem relevant to this submission among those referenced by
Andrew is that dp->priv needs to be allocated by the Ethernet switch
driver, although it is used by the tagging protocol driver. So they
aren't really 'separate', nor can they be, the way dp->priv is currently
designed, it can only be "abused", not really "used".

The DSA design allows in principle for any switch driver to return any
protocol it desires in ->get_tag_protocol(). I occasionally test various
tagger submissions by hacking dsa_loop to do just that. But your
tag_qca.c driver would have a pretty unpleasant surprise if it was to be
paired to any other switch driver than qca8k, because that other driver
would either not allocate memory for dp->priv, or (worse) allocate some
other type of structure, expected to be used differently etc.

An even bigger complication is created by the fact that we can
dynamically change tagging protocols in certain cases (dsa <-> edsa,
ocelot <-> ocelot-8021q), and the current design doesn't really scale:
if any tagging protocol required its own dp->priv format, we may end up
with bugs such as the driver not freeing the old dp->priv and setting up
the new one, when the tagging protocol changes. These mistakes are all
too easy to make currently.

Another potential issue, which I don't see present here, but still
worth watching out for, is that the tagger cannot use symbols exported
by the switch, and vice versa. Otherwise the tagger cannot be inserted
into the kernel when built as module, due to missing symbols provided by
the switch. And the switch will not probe until it has a tagger.

I'm afraid we need to make a decision now whether we keep supporting the
separation between taggers and switch drivers, especially since the
tagger could become a bus provider for the switch driver. We need to
weigh the pros and cons.

I thought about what would be needed and I think we'd need tagger-owned
per-switch-tree private data. But this implies that there needs to be a
hook in the tagging protocol driver that notifies it when a certain
struct dsa_switch_tree *dst binds and unbinds to a certain tagger.
Then it could pick and choose the ports that need dp->priv configured in
a certain way: some taggers need the dp->priv of user ports to hold
something per port, others need the dp->priv of _all_ user ports to
point to something shared, others (like yours) apparently need the
dp->priv of the CPU port to hold something. This would become something
handled and owned exclusively by the tagger.

Ansuel, would something like this help you in any way?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 18:49   ` Florian Fainelli
  2021-12-07 19:44     ` Ansuel Smith
@ 2021-12-07 21:10     ` Vladimir Oltean
  2021-12-07 22:01       ` Ansuel Smith
  2021-12-07 22:37       ` Andrew Lunn
  1 sibling, 2 replies; 40+ messages in thread
From: Vladimir Oltean @ 2021-12-07 21:10 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Andrew Lunn, Ansuel Smith, Vivien Didelot, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Tue, Dec 07, 2021 at 10:49:43AM -0800, Florian Fainelli wrote:
> On 12/7/21 7:15 AM, Andrew Lunn wrote:
> > On Tue, Dec 07, 2021 at 03:59:36PM +0100, Ansuel Smith wrote:
> >> Hi, this is still WIP and currently has some problem but I would love if
> >> someone can give this a superficial review and answer to some problem
> >> with this.
> >>
> >> The main reason for this is that we notice some routing problem in the
> >> switch and it seems assisted learning is needed. Considering mdio is
> >> quite slow due to the indirect write using this Ethernet alternative way
> >> seems to be quicker.
> >>
> >> The qca8k switch supports a special way to pass mdio read/write request
> >> using specially crafted Ethernet packet.
> > 
> > Oh! Cool! Marvell has this as well, and i suspect a few others. It is
> > something i've wanted to work on for a long long time, but never had
> > the opportunity.
> > 
> > This also means that, even if you are focusing on qca8k, please try to
> > think what could be generic, and what should specific to the
> > qca8k. The idea of sending an Ethernet frame and sometime later
> > receiving a reply should be generic and usable for other DSA
> > drivers. The contents of those frames needs to be driver specific.
> > How we hook this into MDIO might also be generic, maybe.
> > 
> > I will look at your questions later, but soon.
> 
> There was a priori attempt from Vivien to add support for mv88e6xxx over
> RMU frames:
> 
> https://www.mail-archive.com/netdev@vger.kernel.org/msg298317.html
> 
> This gets interesting because the switch's control path moves from MDIO
> to Ethernet and there is not really an "ethernet bus" though we could
> certainly come up with one. We have mdio-i2c, so maybe we should have
> mdio-ethernet?

This raises interesting questions. I see two distinct cases:

1. "dual I/O": the switch probes initially over a standard bus (MDIO,
   SPI, I2C) then at some point transitions towards I/O through the
   tagger.  This would be the case when there is some preparation work
   to be done (maybe the CPU port needs to be brought up, maybe there is
   a firmware to be uploaded to the switch's embedded microcontroller
   such that the expected remote management protocol is supported, etc).
   It would also be the case when multiple CPU ports are supported (and
   changing between CPU ports), because we could end up bringing a
   certain CPU port down, and the register I/O would need to be
   temporarily done over MDIO before bringing the other CPU port up.

2. "single I/O": the switch needs no such configuration, and in this
    case, it could in principle probe over an "Ethernet bus" rather than
    a standard bus as mentioned above.

I don't know which case is going to be more common, honestly. The
difference between the two is that the former would work using the
existing infrastructure (bus types) we have today, whereas the latter
would (maybe) need an "Ethernet bus" as mentioned by Vivien and Florian.

I'm not completely convinced, though. The argument for an "Ethernet bus"
seems to be that any Ethernet controller may need to set up such a
thing, independently of being a DSA master. In Vivien's link, an example
is given where we have "Control path via port 1, Data path via port port 3".
But I don't know, this separation seems pretty artificial and ultimately
boils down to configuration. Like it or not, in that particular example,
both ports 1 and 3 are CPU ports, and both eth1 and eth0 are DSA masters.
The fact that they are used asymmetrically should pretty much not matter.

I think a fair simplifying assumption is that switch management
protocols will never be spoken through interfaces that aren't DSA
masters (because being a DSA master _implies_ having a physical link to
a DSA switch). And if we have this simplifying factor, we could consider
moving dsa_tree_setup_master() somewhere earlier in the DSA probe path,
and just make "type 2" / "single I/O" switches be platform devices, with
a more-or-less empty probe function (just calls dsa_register_switch),
and do their hardware init in ->setup, which is _after_
dsa_tree_setup_master and therefore after the tagging protocol has bound
to the DSA switch tree and is prepared to handle I/O. So no bus really
needed.

Although I have to point this out. An "Ethernet bus" would be one of the
most unreliable buses out there. Consider that the user may run "ip link
set eth0 down" at any given time. What do you do, as a switch driver on
an Ethernet bus, when your DSA master goes down? Postpone all your I/O?
Error out on I/O? Unbind?

Even moreso, there are currently code paths in DSA that can only be done
while the DSA master is down (changing the tagging protocol comes to
mind). So does this mean that those code paths are simply not available
when the I/O is over Ethernet? Or does it mean that Ethernet cannot be
the sole I/O method of any switch driver, due to it being unreliable?
And if the latter, this is yet another argument against "Ethernet as bus".
A bus is basically only there for probing purposes, but if you need to
have a primary I/O method beside Ethernet, you already have a bus and
don't need another.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 20:52         ` Vladimir Oltean
@ 2021-12-07 21:47           ` Ansuel Smith
  2021-12-07 22:22             ` Andrew Lunn
  2021-12-07 22:45             ` Vladimir Oltean
  0 siblings, 2 replies; 40+ messages in thread
From: Ansuel Smith @ 2021-12-07 21:47 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Tue, Dec 07, 2021 at 10:52:19PM +0200, Vladimir Oltean wrote:
> On Tue, Dec 07, 2021 at 08:21:52PM +0100, Ansuel Smith wrote:
> > On Tue, Dec 07, 2021 at 08:15:24PM +0100, Andrew Lunn wrote:
> > > > The qca tag header provide a TYPE value that refer to a big list of
> > > > Frame type. In all of this at value 2 we have the type that tells us
> > > > that is a READ_WRITE_REG_ACK (aka a mdio rw Ethernet packet)
> > > > 
> > > > The idea of using the tagger is to skip parsing the packet 2 times
> > > > considering the qca tag header is present at the same place in both
> > > > normal packet and mdio rw Ethernet packet.
> > > > 
> > > > Your idea would be hook this before the tagger and parse it?
> > > > I assume that is the only way if this has to be generilized. But I
> > > > wonder if this would create some overhead by the double parsing.
> > > 
> > > So it seems i remembered this incorrectly. Marvell call this Remote
> > > Management Unit, RMU. And RMU makes use of bits inside the Marvell
> > > Tag. I was thinking it was outside of the tag.
> > > 
> > > So, yes, the tagger does need to be involved in this.
> > > 
> > > The initial design of DSA was that the tagger and main driver were
> > > kept separate. This has been causing us problems recently, we have use
> > > cases where we need to share information between the tagger and the
> > > driver. This looks like it is going to be another case of that.
> > > 
> > > 	Andrew
> > 
> > I mean if you check the code this is still somewhat ""separate"".
> > I ""abuse"" the dsa port priv to share the required data.
> > (I allocate a different struct... i put it in qca8k_priv and i set every
> > port priv to this struct)
> > 
> > Wonder if we can add something to share data between the driver and the
> > port so the access that from the tagger. (something that doesn't use the
> > port priv)
> 
> The one problem relevant to this submission among those referenced by
> Andrew is that dp->priv needs to be allocated by the Ethernet switch
> driver, although it is used by the tagging protocol driver. So they
> aren't really 'separate', nor can they be, the way dp->priv is currently
> designed, it can only be "abused", not really "used".
> 
> The DSA design allows in principle for any switch driver to return any
> protocol it desires in ->get_tag_protocol(). I occasionally test various
> tagger submissions by hacking dsa_loop to do just that. But your
> tag_qca.c driver would have a pretty unpleasant surprise if it was to be
> paired to any other switch driver than qca8k, because that other driver
> would either not allocate memory for dp->priv, or (worse) allocate some
> other type of structure, expected to be used differently etc.
>
> An even bigger complication is created by the fact that we can
> dynamically change tagging protocols in certain cases (dsa <-> edsa,
> ocelot <-> ocelot-8021q), and the current design doesn't really scale:
> if any tagging protocol required its own dp->priv format, we may end up
> with bugs such as the driver not freeing the old dp->priv and setting up
> the new one, when the tagging protocol changes. These mistakes are all
> too easy to make currently.
> 
> Another potential issue, which I don't see present here, but still
> worth watching out for, is that the tagger cannot use symbols exported
> by the switch, and vice versa. Otherwise the tagger cannot be inserted
> into the kernel when built as module, due to missing symbols provided by
> the switch. And the switch will not probe until it has a tagger.
> 
> I'm afraid we need to make a decision now whether we keep supporting the
> separation between taggers and switch drivers, especially since the
> tagger could become a bus provider for the switch driver. We need to
> weigh the pros and cons.
> 
> I thought about what would be needed and I think we'd need tagger-owned
> per-switch-tree private data. But this implies that there needs to be a
> hook in the tagging protocol driver that notifies it when a certain
> struct dsa_switch_tree *dst binds and unbinds to a certain tagger.
> Then it could pick and choose the ports that need dp->priv configured in
> a certain way: some taggers need the dp->priv of user ports to hold
> something per port, others need the dp->priv of _all_ user ports to
> point to something shared, others (like yours) apparently need the
> dp->priv of the CPU port to hold something. This would become something
> handled and owned exclusively by the tagger.
> 
> Ansuel, would something like this help you in any way?

I agree on all the concern you pointed out. IMHO, current implementation
of the dsa port priv is a bit confusing and someone can really do bad
things with it. (like it's done in this implementation)

The main problem here is that we really need a way to have shared data
between tagger and dsa driver. I also think that it would be limiting
using this only for mdio. For example qca8k can autocast mib with
Ethernet port and that would be another feature that the tagger would
handle.

I like the idea of tagger-owend per-switch-tree private data.
Do we really need to hook logic?
Wonder if something like that would work:
1. Each tagger declare size of his private data (if any).
2. Change tag dsa helper make sure the privata data in dst gets
   allocated and freed.
3. We create some helper to get the tagger private data pointer that
   dsa driver will use. (an error is returned if no data is present)
4. Tagger will use the dst to access his own data.

In theory that way we should be able to make a ""connection"" between
dsa driver and the tagger and prevent any sort of strange stuff that
could result in bug/kernel panic.

I mean for the current task (mdio in ethernet packet) we just need to
put data, send the skb and wait for a response (and after parsing) get
the data from a response skb.

-- 
	Ansuel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 21:10     ` Vladimir Oltean
@ 2021-12-07 22:01       ` Ansuel Smith
  2021-12-07 22:37       ` Andrew Lunn
  1 sibling, 0 replies; 40+ messages in thread
From: Ansuel Smith @ 2021-12-07 22:01 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Florian Fainelli, Andrew Lunn, Vivien Didelot, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Tue, Dec 07, 2021 at 11:10:18PM +0200, Vladimir Oltean wrote:
> On Tue, Dec 07, 2021 at 10:49:43AM -0800, Florian Fainelli wrote:
> > On 12/7/21 7:15 AM, Andrew Lunn wrote:
> > > On Tue, Dec 07, 2021 at 03:59:36PM +0100, Ansuel Smith wrote:
> > >> Hi, this is still WIP and currently has some problem but I would love if
> > >> someone can give this a superficial review and answer to some problem
> > >> with this.
> > >>
> > >> The main reason for this is that we notice some routing problem in the
> > >> switch and it seems assisted learning is needed. Considering mdio is
> > >> quite slow due to the indirect write using this Ethernet alternative way
> > >> seems to be quicker.
> > >>
> > >> The qca8k switch supports a special way to pass mdio read/write request
> > >> using specially crafted Ethernet packet.
> > > 
> > > Oh! Cool! Marvell has this as well, and i suspect a few others. It is
> > > something i've wanted to work on for a long long time, but never had
> > > the opportunity.
> > > 
> > > This also means that, even if you are focusing on qca8k, please try to
> > > think what could be generic, and what should specific to the
> > > qca8k. The idea of sending an Ethernet frame and sometime later
> > > receiving a reply should be generic and usable for other DSA
> > > drivers. The contents of those frames needs to be driver specific.
> > > How we hook this into MDIO might also be generic, maybe.
> > > 
> > > I will look at your questions later, but soon.
> > 
> > There was a priori attempt from Vivien to add support for mv88e6xxx over
> > RMU frames:
> > 
> > https://www.mail-archive.com/netdev@vger.kernel.org/msg298317.html
> > 
> > This gets interesting because the switch's control path moves from MDIO
> > to Ethernet and there is not really an "ethernet bus" though we could
> > certainly come up with one. We have mdio-i2c, so maybe we should have
> > mdio-ethernet?
> 
> This raises interesting questions. I see two distinct cases:
> 
> 1. "dual I/O": the switch probes initially over a standard bus (MDIO,
>    SPI, I2C) then at some point transitions towards I/O through the
>    tagger.  This would be the case when there is some preparation work
>    to be done (maybe the CPU port needs to be brought up, maybe there is
>    a firmware to be uploaded to the switch's embedded microcontroller
>    such that the expected remote management protocol is supported, etc).
>    It would also be the case when multiple CPU ports are supported (and
>    changing between CPU ports), because we could end up bringing a
>    certain CPU port down, and the register I/O would need to be
>    temporarily done over MDIO before bringing the other CPU port up.
> 
> 2. "single I/O": the switch needs no such configuration, and in this
>     case, it could in principle probe over an "Ethernet bus" rather than
>     a standard bus as mentioned above.
> 
> I don't know which case is going to be more common, honestly. The
> difference between the two is that the former would work using the
> existing infrastructure (bus types) we have today, whereas the latter
> would (maybe) need an "Ethernet bus" as mentioned by Vivien and Florian.
>

Considering this is very specific for qca8k (it does use very not
standard Ethernet packet) and we have only another switch that more or
less supports this, I honestly think we should think about the dual I/0.
qca8k for example require to be setup first with mdio (as it does
require some bit to enable header mode, the tagger way to comunicate
ethernet mdio type packet) and generally the cpu port has to be
configured for the phy mode. We can't assume a bootloader correctly
setup the cpu port and the switch being able to receive packet so I
think a fallback is always necessary.

> I'm not completely convinced, though. The argument for an "Ethernet bus"
> seems to be that any Ethernet controller may need to set up such a
> thing, independently of being a DSA master. In Vivien's link, an example
> is given where we have "Control path via port 1, Data path via port port 3".
> But I don't know, this separation seems pretty artificial and ultimately
> boils down to configuration. Like it or not, in that particular example,
> both ports 1 and 3 are CPU ports, and both eth1 and eth0 are DSA masters.
> The fact that they are used asymmetrically should pretty much not matter.
> 
> I think a fair simplifying assumption is that switch management
> protocols will never be spoken through interfaces that aren't DSA
> masters (because being a DSA master _implies_ having a physical link to
> a DSA switch). And if we have this simplifying factor, we could consider
> moving dsa_tree_setup_master() somewhere earlier in the DSA probe path,
> and just make "type 2" / "single I/O" switches be platform devices, with
> a more-or-less empty probe function (just calls dsa_register_switch),
> and do their hardware init in ->setup, which is _after_
> dsa_tree_setup_master and therefore after the tagging protocol has bound
> to the DSA switch tree and is prepared to handle I/O. So no bus really
> needed.
> 
> Although I have to point this out. An "Ethernet bus" would be one of the
> most unreliable buses out there. Consider that the user may run "ip link
> set eth0 down" at any given time. What do you do, as a switch driver on
> an Ethernet bus, when your DSA master goes down? Postpone all your I/O?
> Error out on I/O? Unbind?

With something like an Ethernet bus only (no dual I/O) the operation
should just be rejected to prevent this kind of stuff. (return EBUSY error?)

> 
> Even moreso, there are currently code paths in DSA that can only be done
> while the DSA master is down (changing the tagging protocol comes to
> mind). So does this mean that those code paths are simply not available
> when the I/O is over Ethernet? Or does it mean that Ethernet cannot be
> the sole I/O method of any switch driver, due to it being unreliable?
> And if the latter, this is yet another argument against "Ethernet as bus".
> A bus is basically only there for probing purposes, but if you need to
> have a primary I/O method beside Ethernet, you already have a bus and
> don't need another.

As I said up, a ""stable"" I/O method must be present and the fast path
should be used only if available. About this I'm thinking if we should
create an helper and let dsa decide when a method can be used instead of
another one. A dsa driver will then use these helper function and mimic
the standard read/write/update_bits thing (but this would be very specific
and used only for Ethernet mdio and probably not correct)

-- 
	Ansuel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 21:47           ` Ansuel Smith
@ 2021-12-07 22:22             ` Andrew Lunn
  2021-12-07 22:30               ` Ansuel Smith
  2021-12-07 23:47               ` Vladimir Oltean
  2021-12-07 22:45             ` Vladimir Oltean
  1 sibling, 2 replies; 40+ messages in thread
From: Andrew Lunn @ 2021-12-07 22:22 UTC (permalink / raw)
  To: Ansuel Smith
  Cc: Vladimir Oltean, Vivien Didelot, Florian Fainelli,
	David S. Miller, Jakub Kicinski, linux-kernel, netdev

> The main problem here is that we really need a way to have shared data
> between tagger and dsa driver. I also think that it would be limiting
> using this only for mdio. For example qca8k can autocast mib with
> Ethernet port and that would be another feature that the tagger would
> handle.

The Marvell switches also have an efficient way to get the whole MIB
table. So this is something i would also want.

> I like the idea of tagger-owend per-switch-tree private data.
> Do we really need to hook logic?

We have two different things here.

1) The tagger needs somewhere to store its own private data.
2) The tagger needs to share state with the switch driver.

We can probably have the DSA core provide 1). Add the size to
dsa_device_ops structure, and provide helpers to go from either a
master or a slave netdev to the private data.

2) is harder. But as far as i know, we have an 1:N setup.  One switch
driver can use N tag drivers. So we need the switch driver to be sure
the tag driver is what it expects. We keep the shared state in the tag
driver, so it always has valid data, but when the switch driver wants
to get a pointer to it, it needs to pass a enum dsa_tag_protocol and
if it does not match, the core should return -EINVAL or similar.

   Andrew

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 22:22             ` Andrew Lunn
@ 2021-12-07 22:30               ` Ansuel Smith
  2021-12-07 22:46                 ` Andrew Lunn
  2021-12-07 23:47               ` Vladimir Oltean
  1 sibling, 1 reply; 40+ messages in thread
From: Ansuel Smith @ 2021-12-07 22:30 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Vladimir Oltean, Vivien Didelot, Florian Fainelli,
	David S. Miller, Jakub Kicinski, linux-kernel, netdev

On Tue, Dec 07, 2021 at 11:22:41PM +0100, Andrew Lunn wrote:
> > The main problem here is that we really need a way to have shared data
> > between tagger and dsa driver. I also think that it would be limiting
> > using this only for mdio. For example qca8k can autocast mib with
> > Ethernet port and that would be another feature that the tagger would
> > handle.
> 
> The Marvell switches also have an efficient way to get the whole MIB
> table. So this is something i would also want.
>

Again same think... they just put the type in the qca hdr (placed in the
EthType position) and everything else is a mib. The switch send 7 packet
and each correspond to the MIB for the port (the port number is
comunicated in the qca hdr)

> > I like the idea of tagger-owend per-switch-tree private data.
> > Do we really need to hook logic?
> 
> We have two different things here.
> 
> 1) The tagger needs somewhere to store its own private data.
> 2) The tagger needs to share state with the switch driver.
> 
> We can probably have the DSA core provide 1). Add the size to
> dsa_device_ops structure, and provide helpers to go from either a
> master or a slave netdev to the private data.

I'm just implementing this. It doesn't look that hard.

> 
> 2) is harder. But as far as i know, we have an 1:N setup.  One switch
> driver can use N tag drivers. So we need the switch driver to be sure
> the tag driver is what it expects. We keep the shared state in the tag
> driver, so it always has valid data, but when the switch driver wants
> to get a pointer to it, it needs to pass a enum dsa_tag_protocol and
> if it does not match, the core should return -EINVAL or similar.
> 

Mhh this looks a bit complex. I'm probably missing something but why the
tagger needs to share a state? To check if it does support some feature?
If it's ready to be used for mdio Ethernet? Or just to be future-proof?

>    Andrew

-- 
	Ansuel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 21:10     ` Vladimir Oltean
  2021-12-07 22:01       ` Ansuel Smith
@ 2021-12-07 22:37       ` Andrew Lunn
  1 sibling, 0 replies; 40+ messages in thread
From: Andrew Lunn @ 2021-12-07 22:37 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Florian Fainelli, Ansuel Smith, Vivien Didelot, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

> This raises interesting questions. I see two distinct cases:
> 
> 1. "dual I/O": the switch probes initially over a standard bus (MDIO,
>    SPI, I2C) then at some point transitions towards I/O through the
>    tagger.  This would be the case when there is some preparation work
>    to be done (maybe the CPU port needs to be brought up, maybe there is
>    a firmware to be uploaded to the switch's embedded microcontroller
>    such that the expected remote management protocol is supported, etc).
>    It would also be the case when multiple CPU ports are supported (and
>    changing between CPU ports), because we could end up bringing a
>    certain CPU port down, and the register I/O would need to be
>    temporarily done over MDIO before bringing the other CPU port up.

mv88e6xxx is very likely to take this path. You need to program some
registers to enable RMU. It is possible to enable this via EEPROM
configuration, but i've never seen any hardware with the necessary
EEPROM configuration. And you have the old chicken/egg, in order to be
able to program the EEPROM, you need access to the switch, or a header
and a cable.

> 2. "single I/O": the switch needs no such configuration, and in this
>     case, it could in principle probe over an "Ethernet bus" rather than
>     a standard bus as mentioned above.
> 
> I don't know which case is going to be more common, honestly.

Given the history, i think MDIO startup, and then transition to
Ethernet is going to be a lot more common.  If there was a lot of
hardware out there which could do Ethernet from the beginning, we
would of had patches or at least requests for it by now. 

I would keep it KISS.

      Andrew

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 21:47           ` Ansuel Smith
  2021-12-07 22:22             ` Andrew Lunn
@ 2021-12-07 22:45             ` Vladimir Oltean
  2021-12-07 22:54               ` Andrew Lunn
  2021-12-07 23:05               ` Ansuel Smith
  1 sibling, 2 replies; 40+ messages in thread
From: Vladimir Oltean @ 2021-12-07 22:45 UTC (permalink / raw)
  To: Ansuel Smith
  Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Tue, Dec 07, 2021 at 10:47:59PM +0100, Ansuel Smith wrote:
> The main problem here is that we really need a way to have shared data
> between tagger and dsa driver. I also think that it would be limiting
> using this only for mdio. For example qca8k can autocast mib with
> Ethernet port and that would be another feature that the tagger would
> handle.

This is cool. I suppose this is what QCA_HDR_RECV_TYPE_MIB is for.
But it wouldn't work with your design because the tagger doesn't hold
any queues, it is basically a request/response which is always initiated
by the switch driver. The hardware can't automatically push anything to
software on its own. Maybe if the tagger wouldn't be stateless, that
would be better? What if the qca8k switch driver would just provide some
function pointers to the switch driver (these would be protocol
sub-handlers for QCA_HDR_RECV_TYPE_MIB, QCA_HDR_RECV_TYPE_RW_REG_ACK
etc), and your completion structure would be both initialized, as well
as finalized, all from within the switch driver itself?

> I like the idea of tagger-owend per-switch-tree private data.
> Do we really need to hook logic?
> Wonder if something like that would work:
> 1. Each tagger declare size of his private data (if any).
> 2. Change tag dsa helper make sure the privata data in dst gets
>    allocated and freed.
> 3. We create some helper to get the tagger private data pointer that
>    dsa driver will use. (an error is returned if no data is present)
> 4. Tagger will use the dst to access his own data.

I considered a simplified form like this, but I think the tagger private
data will still stay in dp->priv, only its ownership will change.
It is less flexible to just have an autoalloc size. Ok, you allocate a
structure the size you need, but which dp->priv gets to have it?
Maybe a certain tagging protocol will need dp1->priv == dp2->priv ==
dp3->priv == ..., whereas a different tagging protocol will need unique
different structures for each dp.

> 
> In theory that way we should be able to make a ""connection"" between
> dsa driver and the tagger and prevent any sort of strange stuff that
> could result in bug/kernel panic.
> 
> I mean for the current task (mdio in ethernet packet) we just need to
> put data, send the skb and wait for a response (and after parsing) get
> the data from a response skb.

It would be a huge win IMO if we could avoid managing the lifetime of
dp->priv _directly_. I'm thinking something along the lines of:

- every time we make the "dst->tag_ops = tag_ops;" assignment (see dsa2.c)
  there is a connection event between the switch tree and the tagging
  protocol (and also a disconnection event, if dst->tag_ops wasn't
  previously NULL).

- we could add a new tag_ops->connect(dst) and tag_ops->disconnect(dst)
  and call them. These could allocate/free the dp->priv memory for each
  dp in &dst->ports.

- _after_ the tag_ops->connect() has been called (this makes sure that
  the tagger memory has been allocated) we could also emit a cross-chip
  notifier event:

	/* DSA_NOTIFIER_TAG_PROTO_CONNECT */
	struct dsa_notifier_tag_proto_connect_info {
		const struct dsa_device_ops *tag_ops;
	};

	struct dsa_notifier_tag_proto_connect_info info;

	dsa_tree_notify(dst, DSA_NOTIFIER_TAG_PROTO, &info);

  The role of a cross-chip notifier is to fan-out a call exactly once to
  every switch within a tree. This particular cross-chip notifier could
  end up with an implementation in switch.c that lands with a call to:

  ds->ops->tag_proto_connect(ds, tag_ops);

  At this point, I'm a bit fuzzy on the details. I'm thinking of
  something like this:

	struct qca8k_tagger_private {
		void (*rw_reg_ack_handler)(struct dsa_port *dp, void *buf);
		void (*mib_autocast_handler)(struct dsa_port *dp, void *buf);
	};

	static void qca8k_rw_reg_ack_handler(struct dsa_port *dp, void *buf)
	{
		... (code moved from tagger)
	}

	static void qca8k_mib_autocast_handler(struct dsa_port *dp, void *buf)
	{
		... (code moved from tagger)
	}

	static int qca8k_tag_proto_connect(struct dsa_switch *ds,
					   const struct dsa_device_ops *tag_ops)
	{
		switch (tag_ops->proto) {
		case DSA_TAG_PROTO_QCA:
			struct dsa_port *dp;

			dsa_switch_for_each_port(dp, ds) {
				struct qca8k_tagger_private *priv = dp->priv;

				priv->rw_reg_ack_handler = qca8k_rw_reg_ack_handler;
				priv->mib_autocast_handler = qca8k_mib_autocast_handler;
			}

			break;
		default:
			return -EOPNOTSUPP;
		}
	}

	static const struct dsa_switch_ops qca8k_switch_ops = {
		...
		.tag_proto_connect	= qca8k_tag_proto_connect,
	};

  My current idea is maybe not ideal and a bit fuzzy, because the switch
  driver would need to be aware of the fact that the tagger private data
  is in dp->priv, and some code in one folder needs to be in sync with
  some code in another folder. But at least it should be safer this way,
  because we are in more control over the exact connection that's being
  made.

- to avoid leaking memory, we also need to patch dsa_tree_put() to issue
  a disconnect event on unbind.

- the tagging protocol driver would always need to NULL-check the
  function pointer before dereferencing it, because it may connect to a
  switch driver that doesn't set them up (dsa_loop):

	struct qca8k_tagger_private *priv = dp->priv;

	if (priv->rw_reg_ack_handler)
		priv->rw_reg_ack_handler(dp, skb_mac_header(skb));

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 22:30               ` Ansuel Smith
@ 2021-12-07 22:46                 ` Andrew Lunn
  0 siblings, 0 replies; 40+ messages in thread
From: Andrew Lunn @ 2021-12-07 22:46 UTC (permalink / raw)
  To: Ansuel Smith
  Cc: Vladimir Oltean, Vivien Didelot, Florian Fainelli,
	David S. Miller, Jakub Kicinski, linux-kernel, netdev

> > 1) The tagger needs somewhere to store its own private data.
> > 2) The tagger needs to share state with the switch driver.
> > 
> > We can probably have the DSA core provide 1). Add the size to
> > dsa_device_ops structure, and provide helpers to go from either a
> > master or a slave netdev to the private data.
> 
> I'm just implementing this. It doesn't look that hard.
> 
> > 
> > 2) is harder. But as far as i know, we have an 1:N setup.  One switch
> > driver can use N tag drivers. So we need the switch driver to be sure
> > the tag driver is what it expects. We keep the shared state in the tag
> > driver, so it always has valid data, but when the switch driver wants
> > to get a pointer to it, it needs to pass a enum dsa_tag_protocol and
> > if it does not match, the core should return -EINVAL or similar.
> > 
> 
> Mhh this looks a bit complex. I'm probably missing something but why the
> tagger needs to share a state? To check if it does support some feature?
> If it's ready to be used for mdio Ethernet? Or just to be future-proof?

This is the general problem we have, and it might not be relevant for
this specific problem of MDIO over Ethernet.

tag_sja1105 wants access to a queue of frames and a work queue shared
with switch driver.

tag_ocelot_8021q has something similar to tag_sja1105.

tag_lan9303 wants to know if its two ports are in the same bridge.

	Andrew

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 22:45             ` Vladimir Oltean
@ 2021-12-07 22:54               ` Andrew Lunn
  2021-12-07 23:14                 ` Vladimir Oltean
  2021-12-07 23:05               ` Ansuel Smith
  1 sibling, 1 reply; 40+ messages in thread
From: Andrew Lunn @ 2021-12-07 22:54 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Ansuel Smith, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

> I considered a simplified form like this, but I think the tagger private
> data will still stay in dp->priv, only its ownership will change.

Isn't dp a port structure. So there is one per port?

This is where i think we need to separate shared state from tagger
private data. Probably tagger private data is not per port. Shared
state between the switch driver and the tagger maybe is per port?

      Andrew

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 22:45             ` Vladimir Oltean
  2021-12-07 22:54               ` Andrew Lunn
@ 2021-12-07 23:05               ` Ansuel Smith
  2021-12-07 23:20                 ` Vladimir Oltean
  1 sibling, 1 reply; 40+ messages in thread
From: Ansuel Smith @ 2021-12-07 23:05 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Wed, Dec 08, 2021 at 12:45:25AM +0200, Vladimir Oltean wrote:
> On Tue, Dec 07, 2021 at 10:47:59PM +0100, Ansuel Smith wrote:
> > The main problem here is that we really need a way to have shared data
> > between tagger and dsa driver. I also think that it would be limiting
> > using this only for mdio. For example qca8k can autocast mib with
> > Ethernet port and that would be another feature that the tagger would
> > handle.
> 
> This is cool. I suppose this is what QCA_HDR_RECV_TYPE_MIB is for.

Exactly that.

> But it wouldn't work with your design because the tagger doesn't hold
> any queues, it is basically a request/response which is always initiated
> by the switch driver. The hardware can't automatically push anything to
> software on its own. Maybe if the tagger wouldn't be stateless, that
> would be better? What if the qca8k switch driver would just provide some
> function pointers to the switch driver (these would be protocol
> sub-handlers for QCA_HDR_RECV_TYPE_MIB, QCA_HDR_RECV_TYPE_RW_REG_ACK
> etc), and your completion structure would be both initialized, as well
> as finalized, all from within the switch driver itself?
> 

Hm. Interesting idea. So qca8k would provide the way to parse the packet
and made the request. The tagger would just detect the packet and
execute the dedicated function.
About mib considering the driver autocast counter for every port and
every packet have the relevant port to it (set in the qca tag), the
idea was to put a big array and directly write the data. The ethtool
function will then just read the data and report it. (or even work
directly on the ethtool data array).

> > I like the idea of tagger-owend per-switch-tree private data.
> > Do we really need to hook logic?
> > Wonder if something like that would work:
> > 1. Each tagger declare size of his private data (if any).
> > 2. Change tag dsa helper make sure the privata data in dst gets
> >    allocated and freed.
> > 3. We create some helper to get the tagger private data pointer that
> >    dsa driver will use. (an error is returned if no data is present)
> > 4. Tagger will use the dst to access his own data.
> 
> I considered a simplified form like this, but I think the tagger private
> data will still stay in dp->priv, only its ownership will change.
> It is less flexible to just have an autoalloc size. Ok, you allocate a
> structure the size you need, but which dp->priv gets to have it?
> Maybe a certain tagging protocol will need dp1->priv == dp2->priv ==
> dp3->priv == ..., whereas a different tagging protocol will need unique
> different structures for each dp.
> 
> > 
> > In theory that way we should be able to make a ""connection"" between
> > dsa driver and the tagger and prevent any sort of strange stuff that
> > could result in bug/kernel panic.
> > 
> > I mean for the current task (mdio in ethernet packet) we just need to
> > put data, send the skb and wait for a response (and after parsing) get
> > the data from a response skb.
> 
> It would be a huge win IMO if we could avoid managing the lifetime of
> dp->priv _directly_. I'm thinking something along the lines of:
> 
> - every time we make the "dst->tag_ops = tag_ops;" assignment (see dsa2.c)
>   there is a connection event between the switch tree and the tagging
>   protocol (and also a disconnection event, if dst->tag_ops wasn't
>   previously NULL).
> 
> - we could add a new tag_ops->connect(dst) and tag_ops->disconnect(dst)
>   and call them. These could allocate/free the dp->priv memory for each
>   dp in &dst->ports.
> 
> - _after_ the tag_ops->connect() has been called (this makes sure that
>   the tagger memory has been allocated) we could also emit a cross-chip
>   notifier event:
> 
> 	/* DSA_NOTIFIER_TAG_PROTO_CONNECT */
> 	struct dsa_notifier_tag_proto_connect_info {
> 		const struct dsa_device_ops *tag_ops;
> 	};
> 
> 	struct dsa_notifier_tag_proto_connect_info info;
> 
> 	dsa_tree_notify(dst, DSA_NOTIFIER_TAG_PROTO, &info);
> 
>   The role of a cross-chip notifier is to fan-out a call exactly once to
>   every switch within a tree. This particular cross-chip notifier could
>   end up with an implementation in switch.c that lands with a call to:
> 
>   ds->ops->tag_proto_connect(ds, tag_ops);
> 
>   At this point, I'm a bit fuzzy on the details. I'm thinking of
>   something like this:
> 
> 	struct qca8k_tagger_private {
> 		void (*rw_reg_ack_handler)(struct dsa_port *dp, void *buf);
> 		void (*mib_autocast_handler)(struct dsa_port *dp, void *buf);
> 	};
> 
> 	static void qca8k_rw_reg_ack_handler(struct dsa_port *dp, void *buf)
> 	{
> 		... (code moved from tagger)
> 	}
> 
> 	static void qca8k_mib_autocast_handler(struct dsa_port *dp, void *buf)
> 	{
> 		... (code moved from tagger)
> 	}
> 
> 	static int qca8k_tag_proto_connect(struct dsa_switch *ds,
> 					   const struct dsa_device_ops *tag_ops)
> 	{
> 		switch (tag_ops->proto) {
> 		case DSA_TAG_PROTO_QCA:
> 			struct dsa_port *dp;
> 
> 			dsa_switch_for_each_port(dp, ds) {
> 				struct qca8k_tagger_private *priv = dp->priv;
> 
> 				priv->rw_reg_ack_handler = qca8k_rw_reg_ack_handler;
> 				priv->mib_autocast_handler = qca8k_mib_autocast_handler;
> 			}
> 
> 			break;
> 		default:
> 			return -EOPNOTSUPP;
> 		}
> 	}
> 
> 	static const struct dsa_switch_ops qca8k_switch_ops = {
> 		...
> 		.tag_proto_connect	= qca8k_tag_proto_connect,
> 	};
> 
>   My current idea is maybe not ideal and a bit fuzzy, because the switch
>   driver would need to be aware of the fact that the tagger private data
>   is in dp->priv, and some code in one folder needs to be in sync with
>   some code in another folder. But at least it should be safer this way,
>   because we are in more control over the exact connection that's being
>   made.
> 
> - to avoid leaking memory, we also need to patch dsa_tree_put() to issue
>   a disconnect event on unbind.
> 
> - the tagging protocol driver would always need to NULL-check the
>   function pointer before dereferencing it, because it may connect to a
>   switch driver that doesn't set them up (dsa_loop):
> 
> 	struct qca8k_tagger_private *priv = dp->priv;
> 
> 	if (priv->rw_reg_ack_handler)
> 		priv->rw_reg_ack_handler(dp, skb_mac_header(skb));

Ok so your idea is to make the driver the one controlling ""everything""
and keep the tagger as dummy as possible. That would also remove all the
need to put stuff in the global include dir. Looks complex but handy. We
still need to understand the state part. Any hint about that?

In the mean time I will try implement this.

-- 
	Ansuel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 22:54               ` Andrew Lunn
@ 2021-12-07 23:14                 ` Vladimir Oltean
  2021-12-08  1:35                   ` Andrew Lunn
  0 siblings, 1 reply; 40+ messages in thread
From: Vladimir Oltean @ 2021-12-07 23:14 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Ansuel Smith, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Tue, Dec 07, 2021 at 11:54:07PM +0100, Andrew Lunn wrote:
> > I considered a simplified form like this, but I think the tagger private
> > data will still stay in dp->priv, only its ownership will change.
> 
> Isn't dp a port structure. So there is one per port?

Yes, but dp->priv is a pointer. The thing it points to may not
necessarily be per port.

> This is where i think we need to separate shared state from tagger
> private data. Probably tagger private data is not per port. Shared
> state between the switch driver and the tagger maybe is per port?

I don't know whether there's such a big difference between
"shared state" vs "private data"? The dp->priv model is flexible enough
to support both. For example, in tag_sja1105, dp->priv is a struct
sja1105_port. All struct sja1105_port of a switch have a common struct
sja1105_tagger_data *data pointer. We could certainly set up the
tag_ops->connect(dst) function to allocate memory in this way.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 23:05               ` Ansuel Smith
@ 2021-12-07 23:20                 ` Vladimir Oltean
  2021-12-07 23:24                   ` Ansuel Smith
  0 siblings, 1 reply; 40+ messages in thread
From: Vladimir Oltean @ 2021-12-07 23:20 UTC (permalink / raw)
  To: Ansuel Smith
  Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Wed, Dec 08, 2021 at 12:05:11AM +0100, Ansuel Smith wrote:
> Hm. Interesting idea. So qca8k would provide the way to parse the packet
> and made the request. The tagger would just detect the packet and
> execute the dedicated function.
> About mib considering the driver autocast counter for every port and
> every packet have the relevant port to it (set in the qca tag), the
> idea was to put a big array and directly write the data. The ethtool
> function will then just read the data and report it. (or even work
> directly on the ethtool data array).

Apart from the fact that you'd be running inside the priv->rw_reg_ack_handler()
which runs in softirq context (so you need spinlocks to serialize with
the code that runs in process and/or workqueue context), you have access
to all the data structures from the switch driver that you're used to.
So you could copy from the void *buf into something owned by struct
qca8k_priv *priv, sure.

> >   My current idea is maybe not ideal and a bit fuzzy, because the switch
> >   driver would need to be aware of the fact that the tagger private data
> >   is in dp->priv, and some code in one folder needs to be in sync with
> >   some code in another folder. But at least it should be safer this way,
> >   because we are in more control over the exact connection that's being
> >   made.
> > 
> > - to avoid leaking memory, we also need to patch dsa_tree_put() to issue
> >   a disconnect event on unbind.
> > 
> > - the tagging protocol driver would always need to NULL-check the
> >   function pointer before dereferencing it, because it may connect to a
> >   switch driver that doesn't set them up (dsa_loop):
> > 
> > 	struct qca8k_tagger_private *priv = dp->priv;
> > 
> > 	if (priv->rw_reg_ack_handler)
> > 		priv->rw_reg_ack_handler(dp, skb_mac_header(skb));
> 
> Ok so your idea is to make the driver the one controlling ""everything""
> and keep the tagger as dummy as possible. That would also remove all the
> need to put stuff in the global include dir. Looks complex but handy. We
> still need to understand the state part. Any hint about that?
> 
> In the mean time I will try implement this.

What do you mean exactly by understanding the state?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 23:20                 ` Vladimir Oltean
@ 2021-12-07 23:24                   ` Ansuel Smith
  0 siblings, 0 replies; 40+ messages in thread
From: Ansuel Smith @ 2021-12-07 23:24 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Wed, Dec 08, 2021 at 01:20:20AM +0200, Vladimir Oltean wrote:
> On Wed, Dec 08, 2021 at 12:05:11AM +0100, Ansuel Smith wrote:
> > Hm. Interesting idea. So qca8k would provide the way to parse the packet
> > and made the request. The tagger would just detect the packet and
> > execute the dedicated function.
> > About mib considering the driver autocast counter for every port and
> > every packet have the relevant port to it (set in the qca tag), the
> > idea was to put a big array and directly write the data. The ethtool
> > function will then just read the data and report it. (or even work
> > directly on the ethtool data array).
> 
> Apart from the fact that you'd be running inside the priv->rw_reg_ack_handler()
> which runs in softirq context (so you need spinlocks to serialize with
> the code that runs in process and/or workqueue context), you have access
> to all the data structures from the switch driver that you're used to.
> So you could copy from the void *buf into something owned by struct
> qca8k_priv *priv, sure.
> 
> > >   My current idea is maybe not ideal and a bit fuzzy, because the switch
> > >   driver would need to be aware of the fact that the tagger private data
> > >   is in dp->priv, and some code in one folder needs to be in sync with
> > >   some code in another folder. But at least it should be safer this way,
> > >   because we are in more control over the exact connection that's being
> > >   made.
> > > 
> > > - to avoid leaking memory, we also need to patch dsa_tree_put() to issue
> > >   a disconnect event on unbind.
> > > 
> > > - the tagging protocol driver would always need to NULL-check the
> > >   function pointer before dereferencing it, because it may connect to a
> > >   switch driver that doesn't set them up (dsa_loop):
> > > 
> > > 	struct qca8k_tagger_private *priv = dp->priv;
> > > 
> > > 	if (priv->rw_reg_ack_handler)
> > > 		priv->rw_reg_ack_handler(dp, skb_mac_header(skb));
> > 
> > Ok so your idea is to make the driver the one controlling ""everything""
> > and keep the tagger as dummy as possible. That would also remove all the
> > need to put stuff in the global include dir. Looks complex but handy. We
> > still need to understand the state part. Any hint about that?
> > 
> > In the mean time I will try implement this.
> 
> What do you mean exactly by understanding the state?

I was referring to the "shared state" problem but you already answer
that in the prev email.

-- 
	Ansuel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 22:22             ` Andrew Lunn
  2021-12-07 22:30               ` Ansuel Smith
@ 2021-12-07 23:47               ` Vladimir Oltean
  2021-12-08  0:04                 ` Vladimir Oltean
  2021-12-08  1:15                 ` Andrew Lunn
  1 sibling, 2 replies; 40+ messages in thread
From: Vladimir Oltean @ 2021-12-07 23:47 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Ansuel Smith, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Tue, Dec 07, 2021 at 11:22:41PM +0100, Andrew Lunn wrote:
> > I like the idea of tagger-owend per-switch-tree private data.
> > Do we really need to hook logic?
> 
> We have two different things here.
> 
> 1) The tagger needs somewhere to store its own private data.
> 2) The tagger needs to share state with the switch driver.
> 
> We can probably have the DSA core provide 1). Add the size to
> dsa_device_ops structure, and provide helpers to go from either a
> master or a slave netdev to the private data.

We cannot "add the size to the dsa_device_ops structure", because it is
a singleton (const struct). It is not replicated at all, not per port,
not per switch, not per tree, but global to the kernel. Not to mention
const. Nobody needs state as shared as that.

Given this, do you have objections to the sja1105_port->data model for
shared state?

> 2) is harder. But as far as i know, we have an 1:N setup.  One switch
> driver can use N tag drivers. So we need the switch driver to be sure
> the tag driver is what it expects. We keep the shared state in the tag
> driver, so it always has valid data, but when the switch driver wants
> to get a pointer to it, it needs to pass a enum dsa_tag_protocol and
> if it does not match, the core should return -EINVAL or similar.

In my proposal, the tagger will allocate the memory from its side of the
->connect() call. So regardless of whether the switch driver side
connects or not, the memory inside dp->priv is there for the tagger to
use. The switch can access it or it can ignore it.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 23:47               ` Vladimir Oltean
@ 2021-12-08  0:04                 ` Vladimir Oltean
  2021-12-08  0:40                   ` Vladimir Oltean
  2021-12-08  1:15                 ` Andrew Lunn
  1 sibling, 1 reply; 40+ messages in thread
From: Vladimir Oltean @ 2021-12-08  0:04 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Ansuel Smith, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Wed, Dec 08, 2021 at 01:47:36AM +0200, Vladimir Oltean wrote:
> > 2) is harder. But as far as i know, we have an 1:N setup.  One switch
> > driver can use N tag drivers. So we need the switch driver to be sure
> > the tag driver is what it expects. We keep the shared state in the tag
> > driver, so it always has valid data, but when the switch driver wants
> > to get a pointer to it, it needs to pass a enum dsa_tag_protocol and
> > if it does not match, the core should return -EINVAL or similar.
> 
> In my proposal, the tagger will allocate the memory from its side of the
> ->connect() call. So regardless of whether the switch driver side
> connects or not, the memory inside dp->priv is there for the tagger to
> use. The switch can access it or it can ignore it.

I don't think I actually said something useful here.

The goal would be to minimize use of dp->priv inside the switch driver,
outside of the actual ->connect() / ->disconnect() calls.
For example, in the felix driver which supports two tagging protocol
drivers, I think these two methods would be enough, and they would
replace the current felix_port_setup_tagger_data() and
felix_port_teardown_tagger_data() calls.

An additional benefit would be that in ->connect() and ->disconnect() we
get the actual tagging protocol in use. Currently the felix driver lacks
there, because felix_port_setup_tagger_data() just sets dp->priv up
unconditionally for the ocelot-8021q tagging protocol (luckily the
normal ocelot tagger doesn't need dp->priv).

In sja1105 the story is a bit longer, but I believe that can also be
cleaned up to stay within the confines of ->connect()/->disconnect().

So I guess we just need to be careful and push back against dubious use
during review.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-08  0:04                 ` Vladimir Oltean
@ 2021-12-08  0:40                   ` Vladimir Oltean
  2021-12-08  0:42                     ` Ansuel Smith
  0 siblings, 1 reply; 40+ messages in thread
From: Vladimir Oltean @ 2021-12-08  0:40 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Ansuel Smith, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Wed, Dec 08, 2021 at 02:04:32AM +0200, Vladimir Oltean wrote:
> On Wed, Dec 08, 2021 at 01:47:36AM +0200, Vladimir Oltean wrote:
> > > 2) is harder. But as far as i know, we have an 1:N setup.  One switch
> > > driver can use N tag drivers. So we need the switch driver to be sure
> > > the tag driver is what it expects. We keep the shared state in the tag
> > > driver, so it always has valid data, but when the switch driver wants
> > > to get a pointer to it, it needs to pass a enum dsa_tag_protocol and
> > > if it does not match, the core should return -EINVAL or similar.
> > 
> > In my proposal, the tagger will allocate the memory from its side of the
> > ->connect() call. So regardless of whether the switch driver side
> > connects or not, the memory inside dp->priv is there for the tagger to
> > use. The switch can access it or it can ignore it.
> 
> I don't think I actually said something useful here.
> 
> The goal would be to minimize use of dp->priv inside the switch driver,
> outside of the actual ->connect() / ->disconnect() calls.
> For example, in the felix driver which supports two tagging protocol
> drivers, I think these two methods would be enough, and they would
> replace the current felix_port_setup_tagger_data() and
> felix_port_teardown_tagger_data() calls.
> 
> An additional benefit would be that in ->connect() and ->disconnect() we
> get the actual tagging protocol in use. Currently the felix driver lacks
> there, because felix_port_setup_tagger_data() just sets dp->priv up
> unconditionally for the ocelot-8021q tagging protocol (luckily the
> normal ocelot tagger doesn't need dp->priv).
> 
> In sja1105 the story is a bit longer, but I believe that can also be
> cleaned up to stay within the confines of ->connect()/->disconnect().
> 
> So I guess we just need to be careful and push back against dubious use
> during review.

I've started working on a prototype for converting sja1105 to this model.
It should be clearer to me by tomorrow whether there is anything missing
from this proposal.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-08  0:40                   ` Vladimir Oltean
@ 2021-12-08  0:42                     ` Ansuel Smith
  2021-12-08  1:09                       ` Vladimir Oltean
  0 siblings, 1 reply; 40+ messages in thread
From: Ansuel Smith @ 2021-12-08  0:42 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Wed, Dec 08, 2021 at 02:40:51AM +0200, Vladimir Oltean wrote:
> On Wed, Dec 08, 2021 at 02:04:32AM +0200, Vladimir Oltean wrote:
> > On Wed, Dec 08, 2021 at 01:47:36AM +0200, Vladimir Oltean wrote:
> > > > 2) is harder. But as far as i know, we have an 1:N setup.  One switch
> > > > driver can use N tag drivers. So we need the switch driver to be sure
> > > > the tag driver is what it expects. We keep the shared state in the tag
> > > > driver, so it always has valid data, but when the switch driver wants
> > > > to get a pointer to it, it needs to pass a enum dsa_tag_protocol and
> > > > if it does not match, the core should return -EINVAL or similar.
> > > 
> > > In my proposal, the tagger will allocate the memory from its side of the
> > > ->connect() call. So regardless of whether the switch driver side
> > > connects or not, the memory inside dp->priv is there for the tagger to
> > > use. The switch can access it or it can ignore it.
> > 
> > I don't think I actually said something useful here.
> > 
> > The goal would be to minimize use of dp->priv inside the switch driver,
> > outside of the actual ->connect() / ->disconnect() calls.
> > For example, in the felix driver which supports two tagging protocol
> > drivers, I think these two methods would be enough, and they would
> > replace the current felix_port_setup_tagger_data() and
> > felix_port_teardown_tagger_data() calls.
> > 
> > An additional benefit would be that in ->connect() and ->disconnect() we
> > get the actual tagging protocol in use. Currently the felix driver lacks
> > there, because felix_port_setup_tagger_data() just sets dp->priv up
> > unconditionally for the ocelot-8021q tagging protocol (luckily the
> > normal ocelot tagger doesn't need dp->priv).
> > 
> > In sja1105 the story is a bit longer, but I believe that can also be
> > cleaned up to stay within the confines of ->connect()/->disconnect().
> > 
> > So I guess we just need to be careful and push back against dubious use
> > during review.
> 
> I've started working on a prototype for converting sja1105 to this model.
> It should be clearer to me by tomorrow whether there is anything missing
> from this proposal.

I'm working on your suggestion and I should be able to post another RFC
this night if all works correctly with my switch.

-- 
	Ansuel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-08  0:42                     ` Ansuel Smith
@ 2021-12-08  1:09                       ` Vladimir Oltean
  2021-12-08  3:32                         ` Ansuel Smith
  0 siblings, 1 reply; 40+ messages in thread
From: Vladimir Oltean @ 2021-12-08  1:09 UTC (permalink / raw)
  To: Ansuel Smith
  Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Wed, Dec 08, 2021 at 01:42:59AM +0100, Ansuel Smith wrote:
> On Wed, Dec 08, 2021 at 02:40:51AM +0200, Vladimir Oltean wrote:
> > On Wed, Dec 08, 2021 at 02:04:32AM +0200, Vladimir Oltean wrote:
> > > On Wed, Dec 08, 2021 at 01:47:36AM +0200, Vladimir Oltean wrote:
> > > > > 2) is harder. But as far as i know, we have an 1:N setup.  One switch
> > > > > driver can use N tag drivers. So we need the switch driver to be sure
> > > > > the tag driver is what it expects. We keep the shared state in the tag
> > > > > driver, so it always has valid data, but when the switch driver wants
> > > > > to get a pointer to it, it needs to pass a enum dsa_tag_protocol and
> > > > > if it does not match, the core should return -EINVAL or similar.
> > > > 
> > > > In my proposal, the tagger will allocate the memory from its side of the
> > > > ->connect() call. So regardless of whether the switch driver side
> > > > connects or not, the memory inside dp->priv is there for the tagger to
> > > > use. The switch can access it or it can ignore it.
> > > 
> > > I don't think I actually said something useful here.
> > > 
> > > The goal would be to minimize use of dp->priv inside the switch driver,
> > > outside of the actual ->connect() / ->disconnect() calls.
> > > For example, in the felix driver which supports two tagging protocol
> > > drivers, I think these two methods would be enough, and they would
> > > replace the current felix_port_setup_tagger_data() and
> > > felix_port_teardown_tagger_data() calls.
> > > 
> > > An additional benefit would be that in ->connect() and ->disconnect() we
> > > get the actual tagging protocol in use. Currently the felix driver lacks
> > > there, because felix_port_setup_tagger_data() just sets dp->priv up
> > > unconditionally for the ocelot-8021q tagging protocol (luckily the
> > > normal ocelot tagger doesn't need dp->priv).
> > > 
> > > In sja1105 the story is a bit longer, but I believe that can also be
> > > cleaned up to stay within the confines of ->connect()/->disconnect().
> > > 
> > > So I guess we just need to be careful and push back against dubious use
> > > during review.
> > 
> > I've started working on a prototype for converting sja1105 to this model.
> > It should be clearer to me by tomorrow whether there is anything missing
> > from this proposal.
> 
> I'm working on your suggestion and I should be able to post another RFC
> this night if all works correctly with my switch.

Ok. The key point with my progress so far is that Andrew may be right
and we might need separate tagger priv pointers per port and per switch.
At least for the cleanliness of implementation. In the end I plan to
remove dp->priv and stay with dp->tagger_priv and ds->tagger_priv.

Here's what I have so far. I have more changes locally, but the rest it
isn't ready and overall also a bit irrelevant for the discussion.
I'm going to sleep now.

diff --git a/include/net/dsa.h b/include/net/dsa.h
index bdf308a5c55e..f0f702774c8d 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -82,12 +82,15 @@ enum dsa_tag_protocol {
 };
 
 struct dsa_switch;
+struct dsa_switch_tree;
 
 struct dsa_device_ops {
 	struct sk_buff *(*xmit)(struct sk_buff *skb, struct net_device *dev);
 	struct sk_buff *(*rcv)(struct sk_buff *skb, struct net_device *dev);
 	void (*flow_dissect)(const struct sk_buff *skb, __be16 *proto,
 			     int *offset);
+	int (*connect)(struct dsa_switch_tree *dst);
+	void (*disconnect)(struct dsa_switch_tree *dst);
 	unsigned int needed_headroom;
 	unsigned int needed_tailroom;
 	const char *name;
@@ -279,6 +282,8 @@ struct dsa_port {
 	 */
 	void *priv;
 
+	void *tagger_priv;
+
 	/*
 	 * Original copy of the master netdev ethtool_ops
 	 */
@@ -337,6 +342,8 @@ struct dsa_switch {
 	 */
 	void *priv;
 
+	void *tagger_priv;
+
 	/*
 	 * Configuration data for this switch.
 	 */
@@ -689,6 +696,8 @@ struct dsa_switch_ops {
 						  enum dsa_tag_protocol mprot);
 	int	(*change_tag_protocol)(struct dsa_switch *ds, int port,
 				       enum dsa_tag_protocol proto);
+	int	(*connect_tag_protocol)(struct dsa_switch *ds,
+					enum dsa_tag_protocol proto);
 
 	/* Optional switch-wide initialization and destruction methods */
 	int	(*setup)(struct dsa_switch *ds);
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 8814fa0e44c8..3787cbce1175 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -248,8 +248,12 @@ static struct dsa_switch_tree *dsa_tree_alloc(int index)
 
 static void dsa_tree_free(struct dsa_switch_tree *dst)
 {
-	if (dst->tag_ops)
+	if (dst->tag_ops) {
+		if (dst->tag_ops->disconnect)
+			dst->tag_ops->disconnect(dst);
+
 		dsa_tag_driver_put(dst->tag_ops);
+	}
 	list_del(&dst->list);
 	kfree(dst);
 }
@@ -1136,6 +1140,36 @@ static void dsa_tree_teardown(struct dsa_switch_tree *dst)
 	dst->setup = false;
 }
 
+static int dsa_tree_bind_tag_proto(struct dsa_switch_tree *dst,
+				   const struct dsa_device_ops *tag_ops)
+{
+	struct dsa_notifier_tag_proto_info info;
+	int err;
+
+	if (dst->tag_ops && dst->tag_ops->disconnect)
+		dst->tag_ops->disconnect(dst);
+
+	if (tag_ops->connect) {
+		err = tag_ops->connect(dst);
+		if (err)
+			return err;
+	}
+
+	info.tag_ops = tag_ops;
+	err = dsa_tree_notify(dst, DSA_NOTIFIER_TAG_PROTO_CONNECT, &info);
+	if (err && err != -EOPNOTSUPP)
+		goto out_disconnect;
+
+	dst->tag_ops = tag_ops;
+
+	return 0;
+
+out_disconnect:
+	if (tag_ops->disconnect)
+		tag_ops->disconnect(dst);
+	return err;
+}
+
 /* Since the dsa/tagging sysfs device attribute is per master, the assumption
  * is that all DSA switches within a tree share the same tagger, otherwise
  * they would have formed disjoint trees (different "dsa,member" values).
@@ -1173,7 +1207,9 @@ int dsa_tree_change_tag_proto(struct dsa_switch_tree *dst,
 	if (err)
 		goto out_unwind_tagger;
 
-	dst->tag_ops = tag_ops;
+	err = dsa_tree_bind_tag_proto(dst, tag_ops);
+	if (err)
+		goto out_unwind_tagger;
 
 	rtnl_unlock();
 
@@ -1307,7 +1343,9 @@ static int dsa_port_parse_cpu(struct dsa_port *dp, struct net_device *master,
 		 */
 		dsa_tag_driver_put(tag_ops);
 	} else {
-		dst->tag_ops = tag_ops;
+		err = dsa_tree_bind_tag_proto(dst, tag_ops);
+		if (err)
+			return err;
 	}
 
 	dp->master = master;
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 38ce5129a33d..0db2b26b0c83 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -37,6 +37,7 @@ enum {
 	DSA_NOTIFIER_VLAN_DEL,
 	DSA_NOTIFIER_MTU,
 	DSA_NOTIFIER_TAG_PROTO,
+	DSA_NOTIFIER_TAG_PROTO_CONNECT,
 	DSA_NOTIFIER_MRP_ADD,
 	DSA_NOTIFIER_MRP_DEL,
 	DSA_NOTIFIER_MRP_ADD_RING_ROLE,
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index 9c92edd96961..06948f536829 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -647,6 +647,17 @@ static int dsa_switch_change_tag_proto(struct dsa_switch *ds,
 	return 0;
 }
 
+static int dsa_switch_connect_tag_proto(struct dsa_switch *ds,
+					struct dsa_notifier_tag_proto_info *info)
+{
+	const struct dsa_device_ops *tag_ops = info->tag_ops;
+
+	if (!ds->ops->connect_tag_protocol)
+		return -EOPNOTSUPP;
+
+	return ds->ops->connect_tag_protocol(ds, tag_ops->proto);
+}
+
 static int dsa_switch_mrp_add(struct dsa_switch *ds,
 			      struct dsa_notifier_mrp_info *info)
 {
@@ -766,6 +777,9 @@ static int dsa_switch_event(struct notifier_block *nb,
 	case DSA_NOTIFIER_TAG_PROTO:
 		err = dsa_switch_change_tag_proto(ds, info);
 		break;
+	case DSA_NOTIFIER_TAG_PROTO_CONNECT:
+		err = dsa_switch_connect_tag_proto(ds, info);
+		break;
 	case DSA_NOTIFIER_MRP_ADD:
 		err = dsa_switch_mrp_add(ds, info);
 		break;
diff --git a/net/dsa/tag_sja1105.c b/net/dsa/tag_sja1105.c
index 6c293c2a3008..53362a0f0aab 100644
--- a/net/dsa/tag_sja1105.c
+++ b/net/dsa/tag_sja1105.c
@@ -722,11 +722,59 @@ static void sja1110_flow_dissect(const struct sk_buff *skb, __be16 *proto,
 	*proto = ((__be16 *)skb->data)[(VLAN_HLEN / 2) - 1];
 }
 
+static void sja1105_disconnect(struct dsa_switch_tree *dst)
+{
+	struct dsa_port *dp;
+
+	dsa_tree_for_each_user_port(dp, dst) {
+		if (dp->tagger_priv) {
+			kfree(dp->tagger_priv);
+			dp->tagger_priv = NULL;
+		}
+
+		if (dp->ds->tagger_priv) {
+			kfree(dp->ds->tagger_priv);
+			dp->ds->tagger_priv = NULL;
+		}
+	}
+}
+
+static int sja1105_connect(struct dsa_switch_tree *dst)
+{
+	struct sja1105_tagger_data *data;
+	struct sja1105_port *sp;
+	struct dsa_port *dp;
+
+	dsa_tree_for_each_user_port(dp, dst) {
+		if (!dp->ds->tagger_priv) {
+			data = kzalloc(sizeof(*data), GFP_KERNEL);
+			if (!data)
+				goto out;
+
+			dp->ds->tagger_priv = data;
+		}
+
+		sp = kzalloc(sizeof(*sp), GFP_KERNEL);
+		if (!sp)
+			goto out;
+
+		dp->tagger_priv = sp;
+	}
+
+	return 0;
+
+out:
+	sja1105_disconnect(dst);
+	return -ENOMEM;
+}
+
 static const struct dsa_device_ops sja1105_netdev_ops = {
 	.name = "sja1105",
 	.proto = DSA_TAG_PROTO_SJA1105,
 	.xmit = sja1105_xmit,
 	.rcv = sja1105_rcv,
+	.connect = sja1105_connect,
+	.disconnect = sja1105_disconnect,
 	.needed_headroom = VLAN_HLEN,
 	.flow_dissect = sja1105_flow_dissect,
 	.promisc_on_master = true,
@@ -740,6 +788,8 @@ static const struct dsa_device_ops sja1110_netdev_ops = {
 	.proto = DSA_TAG_PROTO_SJA1110,
 	.xmit = sja1110_xmit,
 	.rcv = sja1110_rcv,
+	.connect = sja1105_connect,
+	.disconnect = sja1105_disconnect,
 	.flow_dissect = sja1110_flow_dissect,
 	.needed_headroom = SJA1110_HEADER_LEN + VLAN_HLEN,
 	.needed_tailroom = SJA1110_RX_TRAILER_LEN + SJA1110_MAX_PADDING_LEN,
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 23:47               ` Vladimir Oltean
  2021-12-08  0:04                 ` Vladimir Oltean
@ 2021-12-08  1:15                 ` Andrew Lunn
  1 sibling, 0 replies; 40+ messages in thread
From: Andrew Lunn @ 2021-12-08  1:15 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Ansuel Smith, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Wed, Dec 08, 2021 at 01:47:36AM +0200, Vladimir Oltean wrote:
> On Tue, Dec 07, 2021 at 11:22:41PM +0100, Andrew Lunn wrote:
> > > I like the idea of tagger-owend per-switch-tree private data.
> > > Do we really need to hook logic?
> > 
> > We have two different things here.
> > 
> > 1) The tagger needs somewhere to store its own private data.
> > 2) The tagger needs to share state with the switch driver.
> > 
> > We can probably have the DSA core provide 1). Add the size to
> > dsa_device_ops structure, and provide helpers to go from either a
> > master or a slave netdev to the private data.
> 
> We cannot "add the size to the dsa_device_ops structure", because it is
> a singleton (const struct). It is not replicated at all, not per port,
> not per switch, not per tree, but global to the kernel. Not to mention
> const. Nobody needs state as shared as that.

What i'm suggesting is 

static const struct dsa_device_ops edsa_netdev_ops = {
        .name     = "edsa",
        .proto    = DSA_TAG_PROTO_EDSA,
        .xmit     = edsa_xmit,
        .rcv      = edsa_rcv,
        .needed_headroom = EDSA_HLEN,
	.priv_size = 42;
};

The priv_size indicates that an instance of this tagger needs 42 bytes
of private data. More likely it will be a sizeof(struct dsa_priv), but
that is a detail.

When a master is setup and the tagger instantiated, 42 bytes of memory
will be allocated and put somewhere it can be found via a helper.

This is not meant for shared state between the tagger and the switch
driver, this is private to the tagger. As such it is less likely to be
dependent on the number of ports etc. It is somewhere to store an skb
pointer, maybe a sequence number for the management message expected
as a reply from the switch etc.

   Andrew

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-07 23:14                 ` Vladimir Oltean
@ 2021-12-08  1:35                   ` Andrew Lunn
  2021-12-08  3:39                     ` Ansuel Smith
  2021-12-08 11:51                     ` Vladimir Oltean
  0 siblings, 2 replies; 40+ messages in thread
From: Andrew Lunn @ 2021-12-08  1:35 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Ansuel Smith, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Wed, Dec 08, 2021 at 01:14:49AM +0200, Vladimir Oltean wrote:
> On Tue, Dec 07, 2021 at 11:54:07PM +0100, Andrew Lunn wrote:
> > > I considered a simplified form like this, but I think the tagger private
> > > data will still stay in dp->priv, only its ownership will change.
> > 
> > Isn't dp a port structure. So there is one per port?
> 
> Yes, but dp->priv is a pointer. The thing it points to may not
> necessarily be per port.
> 
> > This is where i think we need to separate shared state from tagger
> > private data. Probably tagger private data is not per port. Shared
> > state between the switch driver and the tagger maybe is per port?
> 
> I don't know whether there's such a big difference between
> "shared state" vs "private data"?

The difference is to do with stopping the kernel exploding when the
switch driver is not using the tagger it expects.

Anything which is private to the tagger is not a problem. Only the
tagger uses it, so it cannot be wrong.

Anything which is shared between the tagger and the switch driver we
have to be careful about. We are just passing void * pointers
about. There is no type checking. If i'm correct about the 1:N
relationship, we can store shared state in the tagger. The tagger
should be O.K, because it only ever needs to deal with one format of
shared state. The switch driver needs to handle N different formats of
shared state, depending on which of the N different taggers are in
operation. Ideally, when it asks for the void * pointer for shared
information, some sort of checking is performed to ensure the void *
is what the switch driver actually expects. Maybe it needs to pass the
tag driver it thinks it is talking to, or as well as getting the void
* back, it also gets the tag enum and it verifies it actually knows
about that tag driver.

     Andrew

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-08  1:09                       ` Vladimir Oltean
@ 2021-12-08  3:32                         ` Ansuel Smith
  2021-12-08 11:54                           ` Vladimir Oltean
  0 siblings, 1 reply; 40+ messages in thread
From: Ansuel Smith @ 2021-12-08  3:32 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Wed, Dec 08, 2021 at 03:09:47AM +0200, Vladimir Oltean wrote:
> On Wed, Dec 08, 2021 at 01:42:59AM +0100, Ansuel Smith wrote:
> > On Wed, Dec 08, 2021 at 02:40:51AM +0200, Vladimir Oltean wrote:
> > > On Wed, Dec 08, 2021 at 02:04:32AM +0200, Vladimir Oltean wrote:
> > > > On Wed, Dec 08, 2021 at 01:47:36AM +0200, Vladimir Oltean wrote:
> > > > > > 2) is harder. But as far as i know, we have an 1:N setup.  One switch
> > > > > > driver can use N tag drivers. So we need the switch driver to be sure
> > > > > > the tag driver is what it expects. We keep the shared state in the tag
> > > > > > driver, so it always has valid data, but when the switch driver wants
> > > > > > to get a pointer to it, it needs to pass a enum dsa_tag_protocol and
> > > > > > if it does not match, the core should return -EINVAL or similar.
> > > > > 
> > > > > In my proposal, the tagger will allocate the memory from its side of the
> > > > > ->connect() call. So regardless of whether the switch driver side
> > > > > connects or not, the memory inside dp->priv is there for the tagger to
> > > > > use. The switch can access it or it can ignore it.
> > > > 
> > > > I don't think I actually said something useful here.
> > > > 
> > > > The goal would be to minimize use of dp->priv inside the switch driver,
> > > > outside of the actual ->connect() / ->disconnect() calls.
> > > > For example, in the felix driver which supports two tagging protocol
> > > > drivers, I think these two methods would be enough, and they would
> > > > replace the current felix_port_setup_tagger_data() and
> > > > felix_port_teardown_tagger_data() calls.
> > > > 
> > > > An additional benefit would be that in ->connect() and ->disconnect() we
> > > > get the actual tagging protocol in use. Currently the felix driver lacks
> > > > there, because felix_port_setup_tagger_data() just sets dp->priv up
> > > > unconditionally for the ocelot-8021q tagging protocol (luckily the
> > > > normal ocelot tagger doesn't need dp->priv).
> > > > 
> > > > In sja1105 the story is a bit longer, but I believe that can also be
> > > > cleaned up to stay within the confines of ->connect()/->disconnect().
> > > > 
> > > > So I guess we just need to be careful and push back against dubious use
> > > > during review.
> > > 
> > > I've started working on a prototype for converting sja1105 to this model.
> > > It should be clearer to me by tomorrow whether there is anything missing
> > > from this proposal.
> > 
> > I'm working on your suggestion and I should be able to post another RFC
> > this night if all works correctly with my switch.
> 
> Ok. The key point with my progress so far is that Andrew may be right
> and we might need separate tagger priv pointers per port and per switch.
> At least for the cleanliness of implementation. In the end I plan to
> remove dp->priv and stay with dp->tagger_priv and ds->tagger_priv.
> 
> Here's what I have so far. I have more changes locally, but the rest it
> isn't ready and overall also a bit irrelevant for the discussion.
> I'm going to sleep now.
>

BTW, I notice we made the same mistake. Don't know if it was the problem
and you didn't notice... The notifier is not ready at times of the first
tagger setup and the tag_proto_connect is never called.
Anyway sending v2 with your suggestion applied.

> diff --git a/include/net/dsa.h b/include/net/dsa.h
> index bdf308a5c55e..f0f702774c8d 100644
> --- a/include/net/dsa.h
> +++ b/include/net/dsa.h
> @@ -82,12 +82,15 @@ enum dsa_tag_protocol {
>  };
>  
>  struct dsa_switch;
> +struct dsa_switch_tree;
>  
>  struct dsa_device_ops {
>  	struct sk_buff *(*xmit)(struct sk_buff *skb, struct net_device *dev);
>  	struct sk_buff *(*rcv)(struct sk_buff *skb, struct net_device *dev);
>  	void (*flow_dissect)(const struct sk_buff *skb, __be16 *proto,
>  			     int *offset);
> +	int (*connect)(struct dsa_switch_tree *dst);
> +	void (*disconnect)(struct dsa_switch_tree *dst);
>  	unsigned int needed_headroom;
>  	unsigned int needed_tailroom;
>  	const char *name;
> @@ -279,6 +282,8 @@ struct dsa_port {
>  	 */
>  	void *priv;
>  
> +	void *tagger_priv;
> +
>  	/*
>  	 * Original copy of the master netdev ethtool_ops
>  	 */
> @@ -337,6 +342,8 @@ struct dsa_switch {
>  	 */
>  	void *priv;
>  
> +	void *tagger_priv;
> +
>  	/*
>  	 * Configuration data for this switch.
>  	 */
> @@ -689,6 +696,8 @@ struct dsa_switch_ops {
>  						  enum dsa_tag_protocol mprot);
>  	int	(*change_tag_protocol)(struct dsa_switch *ds, int port,
>  				       enum dsa_tag_protocol proto);
> +	int	(*connect_tag_protocol)(struct dsa_switch *ds,
> +					enum dsa_tag_protocol proto);
>  
>  	/* Optional switch-wide initialization and destruction methods */
>  	int	(*setup)(struct dsa_switch *ds);
> diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
> index 8814fa0e44c8..3787cbce1175 100644
> --- a/net/dsa/dsa2.c
> +++ b/net/dsa/dsa2.c
> @@ -248,8 +248,12 @@ static struct dsa_switch_tree *dsa_tree_alloc(int index)
>  
>  static void dsa_tree_free(struct dsa_switch_tree *dst)
>  {
> -	if (dst->tag_ops)
> +	if (dst->tag_ops) {
> +		if (dst->tag_ops->disconnect)
> +			dst->tag_ops->disconnect(dst);
> +
>  		dsa_tag_driver_put(dst->tag_ops);
> +	}
>  	list_del(&dst->list);
>  	kfree(dst);
>  }
> @@ -1136,6 +1140,36 @@ static void dsa_tree_teardown(struct dsa_switch_tree *dst)
>  	dst->setup = false;
>  }
>  
> +static int dsa_tree_bind_tag_proto(struct dsa_switch_tree *dst,
> +				   const struct dsa_device_ops *tag_ops)
> +{
> +	struct dsa_notifier_tag_proto_info info;
> +	int err;
> +
> +	if (dst->tag_ops && dst->tag_ops->disconnect)
> +		dst->tag_ops->disconnect(dst);
> +
> +	if (tag_ops->connect) {
> +		err = tag_ops->connect(dst);
> +		if (err)
> +			return err;
> +	}
> +
> +	info.tag_ops = tag_ops;
> +	err = dsa_tree_notify(dst, DSA_NOTIFIER_TAG_PROTO_CONNECT, &info);
> +	if (err && err != -EOPNOTSUPP)
> +		goto out_disconnect;
> +
> +	dst->tag_ops = tag_ops;
> +
> +	return 0;
> +
> +out_disconnect:
> +	if (tag_ops->disconnect)
> +		tag_ops->disconnect(dst);
> +	return err;
> +}
> +
>  /* Since the dsa/tagging sysfs device attribute is per master, the assumption
>   * is that all DSA switches within a tree share the same tagger, otherwise
>   * they would have formed disjoint trees (different "dsa,member" values).
> @@ -1173,7 +1207,9 @@ int dsa_tree_change_tag_proto(struct dsa_switch_tree *dst,
>  	if (err)
>  		goto out_unwind_tagger;
>  
> -	dst->tag_ops = tag_ops;
> +	err = dsa_tree_bind_tag_proto(dst, tag_ops);
> +	if (err)
> +		goto out_unwind_tagger;
>  
>  	rtnl_unlock();
>  
> @@ -1307,7 +1343,9 @@ static int dsa_port_parse_cpu(struct dsa_port *dp, struct net_device *master,
>  		 */
>  		dsa_tag_driver_put(tag_ops);
>  	} else {
> -		dst->tag_ops = tag_ops;
> +		err = dsa_tree_bind_tag_proto(dst, tag_ops);
> +		if (err)
> +			return err;
>  	}
>  
>  	dp->master = master;
> diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
> index 38ce5129a33d..0db2b26b0c83 100644
> --- a/net/dsa/dsa_priv.h
> +++ b/net/dsa/dsa_priv.h
> @@ -37,6 +37,7 @@ enum {
>  	DSA_NOTIFIER_VLAN_DEL,
>  	DSA_NOTIFIER_MTU,
>  	DSA_NOTIFIER_TAG_PROTO,
> +	DSA_NOTIFIER_TAG_PROTO_CONNECT,
>  	DSA_NOTIFIER_MRP_ADD,
>  	DSA_NOTIFIER_MRP_DEL,
>  	DSA_NOTIFIER_MRP_ADD_RING_ROLE,
> diff --git a/net/dsa/switch.c b/net/dsa/switch.c
> index 9c92edd96961..06948f536829 100644
> --- a/net/dsa/switch.c
> +++ b/net/dsa/switch.c
> @@ -647,6 +647,17 @@ static int dsa_switch_change_tag_proto(struct dsa_switch *ds,
>  	return 0;
>  }
>  
> +static int dsa_switch_connect_tag_proto(struct dsa_switch *ds,
> +					struct dsa_notifier_tag_proto_info *info)
> +{
> +	const struct dsa_device_ops *tag_ops = info->tag_ops;
> +
> +	if (!ds->ops->connect_tag_protocol)
> +		return -EOPNOTSUPP;
> +
> +	return ds->ops->connect_tag_protocol(ds, tag_ops->proto);
> +}
> +
>  static int dsa_switch_mrp_add(struct dsa_switch *ds,
>  			      struct dsa_notifier_mrp_info *info)
>  {
> @@ -766,6 +777,9 @@ static int dsa_switch_event(struct notifier_block *nb,
>  	case DSA_NOTIFIER_TAG_PROTO:
>  		err = dsa_switch_change_tag_proto(ds, info);
>  		break;
> +	case DSA_NOTIFIER_TAG_PROTO_CONNECT:
> +		err = dsa_switch_connect_tag_proto(ds, info);
> +		break;
>  	case DSA_NOTIFIER_MRP_ADD:
>  		err = dsa_switch_mrp_add(ds, info);
>  		break;
> diff --git a/net/dsa/tag_sja1105.c b/net/dsa/tag_sja1105.c
> index 6c293c2a3008..53362a0f0aab 100644
> --- a/net/dsa/tag_sja1105.c
> +++ b/net/dsa/tag_sja1105.c
> @@ -722,11 +722,59 @@ static void sja1110_flow_dissect(const struct sk_buff *skb, __be16 *proto,
>  	*proto = ((__be16 *)skb->data)[(VLAN_HLEN / 2) - 1];
>  }
>  
> +static void sja1105_disconnect(struct dsa_switch_tree *dst)
> +{
> +	struct dsa_port *dp;
> +
> +	dsa_tree_for_each_user_port(dp, dst) {
> +		if (dp->tagger_priv) {
> +			kfree(dp->tagger_priv);
> +			dp->tagger_priv = NULL;
> +		}
> +
> +		if (dp->ds->tagger_priv) {
> +			kfree(dp->ds->tagger_priv);
> +			dp->ds->tagger_priv = NULL;
> +		}
> +	}
> +}
> +
> +static int sja1105_connect(struct dsa_switch_tree *dst)
> +{
> +	struct sja1105_tagger_data *data;
> +	struct sja1105_port *sp;
> +	struct dsa_port *dp;
> +
> +	dsa_tree_for_each_user_port(dp, dst) {
> +		if (!dp->ds->tagger_priv) {
> +			data = kzalloc(sizeof(*data), GFP_KERNEL);
> +			if (!data)
> +				goto out;
> +
> +			dp->ds->tagger_priv = data;
> +		}
> +
> +		sp = kzalloc(sizeof(*sp), GFP_KERNEL);
> +		if (!sp)
> +			goto out;
> +
> +		dp->tagger_priv = sp;
> +	}
> +
> +	return 0;
> +
> +out:
> +	sja1105_disconnect(dst);
> +	return -ENOMEM;
> +}
> +
>  static const struct dsa_device_ops sja1105_netdev_ops = {
>  	.name = "sja1105",
>  	.proto = DSA_TAG_PROTO_SJA1105,
>  	.xmit = sja1105_xmit,
>  	.rcv = sja1105_rcv,
> +	.connect = sja1105_connect,
> +	.disconnect = sja1105_disconnect,
>  	.needed_headroom = VLAN_HLEN,
>  	.flow_dissect = sja1105_flow_dissect,
>  	.promisc_on_master = true,
> @@ -740,6 +788,8 @@ static const struct dsa_device_ops sja1110_netdev_ops = {
>  	.proto = DSA_TAG_PROTO_SJA1110,
>  	.xmit = sja1110_xmit,
>  	.rcv = sja1110_rcv,
> +	.connect = sja1105_connect,
> +	.disconnect = sja1105_disconnect,
>  	.flow_dissect = sja1110_flow_dissect,
>  	.needed_headroom = SJA1110_HEADER_LEN + VLAN_HLEN,
>  	.needed_tailroom = SJA1110_RX_TRAILER_LEN + SJA1110_MAX_PADDING_LEN,
> -- 
> 2.25.1

-- 
	Ansuel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-08  1:35                   ` Andrew Lunn
@ 2021-12-08  3:39                     ` Ansuel Smith
  2021-12-08 11:51                     ` Vladimir Oltean
  1 sibling, 0 replies; 40+ messages in thread
From: Ansuel Smith @ 2021-12-08  3:39 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Vladimir Oltean, Vivien Didelot, Florian Fainelli,
	David S. Miller, Jakub Kicinski, linux-kernel, netdev

On Wed, Dec 08, 2021 at 02:35:34AM +0100, Andrew Lunn wrote:
> On Wed, Dec 08, 2021 at 01:14:49AM +0200, Vladimir Oltean wrote:
> > On Tue, Dec 07, 2021 at 11:54:07PM +0100, Andrew Lunn wrote:
> > > > I considered a simplified form like this, but I think the tagger private
> > > > data will still stay in dp->priv, only its ownership will change.
> > > 
> > > Isn't dp a port structure. So there is one per port?
> > 
> > Yes, but dp->priv is a pointer. The thing it points to may not
> > necessarily be per port.
> > 
> > > This is where i think we need to separate shared state from tagger
> > > private data. Probably tagger private data is not per port. Shared
> > > state between the switch driver and the tagger maybe is per port?
> > 
> > I don't know whether there's such a big difference between
> > "shared state" vs "private data"?
> 
> The difference is to do with stopping the kernel exploding when the
> switch driver is not using the tagger it expects.
> 
> Anything which is private to the tagger is not a problem. Only the
> tagger uses it, so it cannot be wrong.
> 
> Anything which is shared between the tagger and the switch driver we
> have to be careful about. We are just passing void * pointers
> about. There is no type checking. If i'm correct about the 1:N
> relationship, we can store shared state in the tagger. The tagger
> should be O.K, because it only ever needs to deal with one format of
> shared state. The switch driver needs to handle N different formats of
> shared state, depending on which of the N different taggers are in
> operation. Ideally, when it asks for the void * pointer for shared
> information, some sort of checking is performed to ensure the void *
> is what the switch driver actually expects. Maybe it needs to pass the
> tag driver it thinks it is talking to, or as well as getting the void
> * back, it also gets the tag enum and it verifies it actually knows
> about that tag driver.
> 
>      Andrew

I'm sending v2 with Vladimir suggestion so we can start working on that.
Hope with a some split code it would be easier to find the problem with
this and find a way to correctly validate the shared data between tagger
and dsa driver. (you will probably have to rewrite this also for v2 and
sorry for this)

-- 
	Ansuel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-08  1:35                   ` Andrew Lunn
  2021-12-08  3:39                     ` Ansuel Smith
@ 2021-12-08 11:51                     ` Vladimir Oltean
  1 sibling, 0 replies; 40+ messages in thread
From: Vladimir Oltean @ 2021-12-08 11:51 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Ansuel Smith, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Wed, Dec 08, 2021 at 02:35:34AM +0100, Andrew Lunn wrote:
> On Wed, Dec 08, 2021 at 01:14:49AM +0200, Vladimir Oltean wrote:
> > On Tue, Dec 07, 2021 at 11:54:07PM +0100, Andrew Lunn wrote:
> > > > I considered a simplified form like this, but I think the tagger private
> > > > data will still stay in dp->priv, only its ownership will change.
> > > 
> > > Isn't dp a port structure. So there is one per port?
> > 
> > Yes, but dp->priv is a pointer. The thing it points to may not
> > necessarily be per port.
> > 
> > > This is where i think we need to separate shared state from tagger
> > > private data. Probably tagger private data is not per port. Shared
> > > state between the switch driver and the tagger maybe is per port?
> > 
> > I don't know whether there's such a big difference between
> > "shared state" vs "private data"?
> 
> The difference is to do with stopping the kernel exploding when the
> switch driver is not using the tagger it expects.
> 
> Anything which is private to the tagger is not a problem. Only the
> tagger uses it, so it cannot be wrong.
> 
> Anything which is shared between the tagger and the switch driver we
> have to be careful about. We are just passing void * pointers
> about. There is no type checking. If i'm correct about the 1:N
> relationship, we can store shared state in the tagger. The tagger
> should be O.K, because it only ever needs to deal with one format of
> shared state. The switch driver needs to handle N different formats of
> shared state, depending on which of the N different taggers are in
> operation. Ideally, when it asks for the void * pointer for shared
> information, some sort of checking is performed to ensure the void *
> is what the switch driver actually expects. Maybe it needs to pass the
> tag driver it thinks it is talking to, or as well as getting the void
> * back, it also gets the tag enum and it verifies it actually knows
> about that tag driver.

Understood what you mean now (actually I don't know what was unclear yesterday).
I should start doing something else past a certain hour...

What I've started doing now is something like this:

/* include/linux/dsa/ocelot.h */
struct ocelot_8021q_tagger_data {
	void (*xmit_work_fn)(struct kthread_work *work);
};

static inline struct ocelot_8021q_tagger_data *
ocelot_8021q_tagger_data(struct dsa_switch *ds)
{
	BUG_ON(ds->dst->tag_ops->proto != DSA_TAG_PROTO_OCELOT_8021Q);

	return ds->tagger_data;
}

/* net/dsa/tag_ocelot_8021q.c */
struct ocelot_8021q_tagger_private {
	struct ocelot_8021q_tagger_data data; /* Must be first */
	struct kthread_worker *xmit_worker;
};

static struct sk_buff *ocelot_defer_xmit(struct dsa_port *dp,
					 struct sk_buff *skb)
{
	struct ocelot_8021q_tagger_private *priv = dp->ds->tagger_data;
	struct ocelot_8021q_tagger_data *data = &priv->data;
	void (*xmit_work_fn)(struct kthread_work *work);
	struct felix_deferred_xmit_work *xmit_work;
	struct kthread_worker *xmit_worker;

	xmit_work_fn = data->xmit_work_fn;
	xmit_worker = priv->xmit_worker;

	if (!xmit_work_fn || !xmit_worker)
		return NULL;

	xmit_work = kzalloc(sizeof(*xmit_work), GFP_ATOMIC);
	if (!xmit_work)
		return NULL;

	/* Calls felix_port_deferred_xmit in felix.c */
	kthread_init_work(&xmit_work->work, xmit_work_fn);
	/* Increase refcount so the kfree_skb in dsa_slave_xmit
	 * won't really free the packet.
	 */
	xmit_work->dp = dp;
	xmit_work->skb = skb_get(skb);

	kthread_queue_work(xmit_worker, &xmit_work->work);

	return NULL;
}

static void ocelot_disconnect(struct dsa_switch_tree *dst)
{
	struct ocelot_8021q_tagger_private *priv;
	struct dsa_port *dp;

	list_for_each_entry(dp, &dst->ports, list) {
		priv = dp->ds->tagger_data;

		if (!priv)
			continue;

		if (priv->xmit_worker)
			kthread_destroy_worker(priv->xmit_worker);

		kfree(priv);
		dp->ds->tagger_data = NULL;
	}
}

static int ocelot_connect(struct dsa_switch_tree *dst)
{
	struct ocelot_8021q_tagger_private *priv;
	struct dsa_port *dp;
	int err;

	list_for_each_entry(dp, &dst->ports, list) {
		if (dp->ds->tagger_data)
			continue;

		priv = kzalloc(sizeof(*priv), GFP_KERNEL);
		if (!priv) {
			err = -ENOMEM;
			goto out;
		}

		priv->xmit_worker = kthread_create_worker(0, "felix_xmit");
		if (IS_ERR(priv->xmit_worker)) {
			err = PTR_ERR(priv->xmit_worker);
			goto out;
		}

		dp->ds->tagger_data = priv;
	}

	return 0;

out:
	ocelot_disconnect(dst);
	return err;
}

/* drivers/net/dsa/felix.c */
static int felix_connect_tag_protocol(struct dsa_switch *ds,
				      enum dsa_tag_protocol proto)
{
	struct ocelot_8021q_tagger_data *tagger_data;

	switch (proto) {
	case DSA_TAG_PROTO_OCELOT_8021Q:
		tagger_data = ocelot_8021q_tagger_data(ds);
		tagger_data->xmit_work_fn = felix_port_deferred_xmit;
		return 0;
	case DSA_TAG_PROTO_OCELOT:
	case DSA_TAG_PROTO_SEVILLE:
		return 0;
	default:
		return -EPROTONOSUPPORT;
	}
}

Something like this shares memory between what's private and what's
public (it's part of the same allocation), but there's type checking at
least, and private data isn't exposed directly. Is this ok?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet
  2021-12-08  3:32                         ` Ansuel Smith
@ 2021-12-08 11:54                           ` Vladimir Oltean
  0 siblings, 0 replies; 40+ messages in thread
From: Vladimir Oltean @ 2021-12-08 11:54 UTC (permalink / raw)
  To: Ansuel Smith
  Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, linux-kernel, netdev

On Wed, Dec 08, 2021 at 04:32:43AM +0100, Ansuel Smith wrote:
> On Wed, Dec 08, 2021 at 03:09:47AM +0200, Vladimir Oltean wrote:
> > On Wed, Dec 08, 2021 at 01:42:59AM +0100, Ansuel Smith wrote:
> > > On Wed, Dec 08, 2021 at 02:40:51AM +0200, Vladimir Oltean wrote:
> > > > On Wed, Dec 08, 2021 at 02:04:32AM +0200, Vladimir Oltean wrote:
> > > > > On Wed, Dec 08, 2021 at 01:47:36AM +0200, Vladimir Oltean wrote:
> > > > > > > 2) is harder. But as far as i know, we have an 1:N setup.  One switch
> > > > > > > driver can use N tag drivers. So we need the switch driver to be sure
> > > > > > > the tag driver is what it expects. We keep the shared state in the tag
> > > > > > > driver, so it always has valid data, but when the switch driver wants
> > > > > > > to get a pointer to it, it needs to pass a enum dsa_tag_protocol and
> > > > > > > if it does not match, the core should return -EINVAL or similar.
> > > > > > 
> > > > > > In my proposal, the tagger will allocate the memory from its side of the
> > > > > > ->connect() call. So regardless of whether the switch driver side
> > > > > > connects or not, the memory inside dp->priv is there for the tagger to
> > > > > > use. The switch can access it or it can ignore it.
> > > > > 
> > > > > I don't think I actually said something useful here.
> > > > > 
> > > > > The goal would be to minimize use of dp->priv inside the switch driver,
> > > > > outside of the actual ->connect() / ->disconnect() calls.
> > > > > For example, in the felix driver which supports two tagging protocol
> > > > > drivers, I think these two methods would be enough, and they would
> > > > > replace the current felix_port_setup_tagger_data() and
> > > > > felix_port_teardown_tagger_data() calls.
> > > > > 
> > > > > An additional benefit would be that in ->connect() and ->disconnect() we
> > > > > get the actual tagging protocol in use. Currently the felix driver lacks
> > > > > there, because felix_port_setup_tagger_data() just sets dp->priv up
> > > > > unconditionally for the ocelot-8021q tagging protocol (luckily the
> > > > > normal ocelot tagger doesn't need dp->priv).
> > > > > 
> > > > > In sja1105 the story is a bit longer, but I believe that can also be
> > > > > cleaned up to stay within the confines of ->connect()/->disconnect().
> > > > > 
> > > > > So I guess we just need to be careful and push back against dubious use
> > > > > during review.
> > > > 
> > > > I've started working on a prototype for converting sja1105 to this model.
> > > > It should be clearer to me by tomorrow whether there is anything missing
> > > > from this proposal.
> > > 
> > > I'm working on your suggestion and I should be able to post another RFC
> > > this night if all works correctly with my switch.
> > 
> > Ok. The key point with my progress so far is that Andrew may be right
> > and we might need separate tagger priv pointers per port and per switch.
> > At least for the cleanliness of implementation. In the end I plan to
> > remove dp->priv and stay with dp->tagger_priv and ds->tagger_priv.
> > 
> > Here's what I have so far. I have more changes locally, but the rest it
> > isn't ready and overall also a bit irrelevant for the discussion.
> > I'm going to sleep now.
> >
> 
> BTW, I notice we made the same mistake. Don't know if it was the problem
> and you didn't notice... The notifier is not ready at times of the first
> tagger setup and the tag_proto_connect is never called.
> Anyway sending v2 with your suggestion applied.

I didn't go past the compilation stage yesterday. Anyway, now that you
mention it, I remember Tobias hitting this issue as well when he worked
on changing tagging protocol via device tree, and this is why
dsa_switch_setup_tag_protocol() exists.  I believe that's where we'd
need to call ds->ops->connect_tag_proto from, like this:

static int dsa_switch_setup_tag_protocol(struct dsa_switch *ds)
{
	const struct dsa_device_ops *tag_ops = ds->dst->tag_ops;
	struct dsa_switch_tree *dst = ds->dst;
	struct dsa_port *cpu_dp;
	int err;

	if (tag_ops->proto == dst->default_proto)
		goto connect;

	dsa_switch_for_each_cpu_port(cpu_dp, ds) {
		rtnl_lock();
		err = ds->ops->change_tag_protocol(ds, cpu_dp->index,
						   tag_ops->proto);
		rtnl_unlock();
		if (err) {
			dev_err(ds->dev, "Unable to use tag protocol \"%s\": %pe\n",
				tag_ops->name, ERR_PTR(err));
			return err;
		}
	}

connect:
	if (ds->ops->connect_tag_protocol) {
		err = ds->ops->connect_tag_protocol(ds, tag_ops->proto);
		if (err) {
			dev_err(ds->dev,
				"Unable to connect to tag protocol \"%s\": %pe\n",
				tag_ops->name, ERR_PTR(err));
			return err;
		}
	}

	return 0;
}

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2021-12-08 11:54 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-07 14:59 [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet Ansuel Smith
2021-12-07 14:59 ` [net-next RFC PATCH 1/6] net: dsa: tag_qca: convert to FIELD macro Ansuel Smith
2021-12-07 14:59 ` [net-next RFC PATCH 2/6] net: dsa: tag_qca: move define to include linux/dsa Ansuel Smith
2021-12-07 14:59 ` [net-next RFC PATCH 3/6] net: dsa: tag_qca: add define for mdio read/write in ethernet packet Ansuel Smith
2021-12-07 14:59 ` [net-next RFC PATCH 4/6] net: dsa: qca8k: Add support for mdio read/write in Ethernet packet Ansuel Smith
2021-12-07 14:59 ` [net-next RFC PATCH 5/6] net: dsa: tag_qca: Add support for handling mdio read/write packet Ansuel Smith
2021-12-07 14:59 ` [net-next RFC PATCH 6/6] net: dsa: qca8k: cache lo and hi for mdio write Ansuel Smith
2021-12-07 15:15 ` [net-next RFC PATCH 0/6] Add support for qca8k mdio rw in Ethernet packet Andrew Lunn
2021-12-07 15:33   ` Ansuel Smith
2021-12-07 18:49   ` Florian Fainelli
2021-12-07 19:44     ` Ansuel Smith
2021-12-07 21:10     ` Vladimir Oltean
2021-12-07 22:01       ` Ansuel Smith
2021-12-07 22:37       ` Andrew Lunn
2021-12-07 18:41 ` Andrew Lunn
2021-12-07 18:53   ` Ansuel Smith
2021-12-07 19:15     ` Andrew Lunn
2021-12-07 19:21       ` Ansuel Smith
2021-12-07 20:52         ` Vladimir Oltean
2021-12-07 21:47           ` Ansuel Smith
2021-12-07 22:22             ` Andrew Lunn
2021-12-07 22:30               ` Ansuel Smith
2021-12-07 22:46                 ` Andrew Lunn
2021-12-07 23:47               ` Vladimir Oltean
2021-12-08  0:04                 ` Vladimir Oltean
2021-12-08  0:40                   ` Vladimir Oltean
2021-12-08  0:42                     ` Ansuel Smith
2021-12-08  1:09                       ` Vladimir Oltean
2021-12-08  3:32                         ` Ansuel Smith
2021-12-08 11:54                           ` Vladimir Oltean
2021-12-08  1:15                 ` Andrew Lunn
2021-12-07 22:45             ` Vladimir Oltean
2021-12-07 22:54               ` Andrew Lunn
2021-12-07 23:14                 ` Vladimir Oltean
2021-12-08  1:35                   ` Andrew Lunn
2021-12-08  3:39                     ` Ansuel Smith
2021-12-08 11:51                     ` Vladimir Oltean
2021-12-07 23:05               ` Ansuel Smith
2021-12-07 23:20                 ` Vladimir Oltean
2021-12-07 23:24                   ` Ansuel Smith

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).