* [RFC 0/9] staging: dpaa2-switch: add support for CPU terminated traffic
@ 2020-11-04 16:57 Ioana Ciornei
  2020-11-04 16:57 ` [RFC 1/9] staging: dpaa2-switch: get control interface attributes Ioana Ciornei
                   ` (8 more replies)
  0 siblings, 9 replies; 20+ messages in thread
From: Ioana Ciornei @ 2020-11-04 16:57 UTC
  To: Greg Kroah-Hartman, Andrew Lunn; +Cc: linux-kernel, netdev, Ioana Ciornei

From: Ioana Ciornei <ioana.ciornei@nxp.com>

This patch set adds support for Rx/Tx capabilities on DPAA2 switch port
interfaces as well as fixing up some major blunders in how we take care
of the switching domains. The netdev community considers these basic
features, thus the series is sent against the staging tree before
anything can be moved out of it.

The control interface comprises 3 queues in total: Rx, Rx error and
Tx confirmation. In this patch set we only enable Rx and Tx conf.
All switch ports share the same queues when frames are redirected to the
CPU.  Information regarding the ingress switch port is passed through
frame metadata - the flow context field of the descriptor.

NAPI instances are also shared between the switch net_devices. They are
enabled when .dev_open() is called on at least one of the switch ports
and disabled when no switch port is up any longer.

Since the last version of this feature was submitted to the list, I
reworked how the switching and flooding domains are taken care of by the
driver, thus the switch is now able to also add the control port (the
queues that the CPU can dequeue from) into the flooding domains of a
port (broadcast, unknown unicast etc). With this, we are able to receive
and send traffic from the switch interfaces.

Also, the capability to properly partition the DPSW object into
multiple switching domains was added so that, when not under a bridge,
the ports are not actually capable of switching between themselves.
This is achieved by adding a private FDB table per switch interface.
When multiple switch interfaces are under the same bridge, they will
all use the same FDB table, as in the sketch below.
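
For illustration, the FDB selection boils down to something like this
(the helper and field names are hypothetical, the real implementation
lives in patch 8/9):

  /* Hypothetical sketch: a port uses its private FDB unless it has
   * joined a bridge, in which case all ports under that bridge share
   * one FDB table.
   */
  static u16 dpaa2_switch_port_fdb_id(struct ethsw_port_priv *port_priv)
  {
  	if (port_priv->bridge_dev)
  		return port_priv->bridge_fdb_id;	/* shared table */

  	return port_priv->fdb_id;			/* private table */
  }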

Another thing fixed in this patch set is how the driver handles VLAN
awareness. The DPAA2 switch is not capable of running as VLAN-unaware,
but this was not reflected in how the driver responded to requests to
change the VLAN awareness. The last patch fixes this by describing the
switch interfaces as Rx VLAN filtering on [fixed] and declining any
request to join a VLAN-unaware bridge, as sketched below.
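
A minimal sketch of the declining logic, assuming the standard
netdevice notifier infrastructure (the function name here is
illustrative; the real check is in patch 9/9):

  static int dpaa2_switch_port_prechangeupper(struct net_device *netdev,
  				struct netdev_notifier_changeupper_info *info)
  {
  	struct net_device *upper_dev = info->upper_dev;

  	/* Only VLAN-aware bridges can be joined */
  	if (netif_is_bridge_master(upper_dev) && !br_vlan_enabled(upper_dev)) {
  		netdev_err(netdev, "Cannot join a VLAN-unaware bridge\n");
  		return -EOPNOTSUPP;
  	}

  	return 0;
  }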

I am sending this as an RFC since I would like to receive feedback on
the current capabilities of the switch driver and whether or not they
meet expectations, all this before I actually settle on a firmware
interface.

Ioana Ciornei (9):
  staging: dpaa2-switch: get control interface attributes
  staging: dpaa2-switch: setup buffer pool for control traffic
  staging: dpaa2-switch: setup RX path rings
  staging: dpaa2-switch: setup dpio
  staging: dpaa2-switch: handle Rx path on control interface
  staging: dpaa2-switch: add .ndo_start_xmit() callback
  staging: dpaa2-switch: enable the control interface
  staging: dpaa2-switch: properly setup switching domains
  staging: dpaa2-switch: accept only vlan-aware upper devices

 drivers/staging/fsl-dpaa2/Kconfig          |    1 +
 drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h |   89 +-
 drivers/staging/fsl-dpaa2/ethsw/dpsw.c     |  269 +++++
 drivers/staging/fsl-dpaa2/ethsw/dpsw.h     |  144 +++
 drivers/staging/fsl-dpaa2/ethsw/ethsw.c    | 1131 ++++++++++++++++++--
 drivers/staging/fsl-dpaa2/ethsw/ethsw.h    |   75 +-
 6 files changed, 1641 insertions(+), 68 deletions(-)

-- 
2.28.0



* [RFC 1/9] staging: dpaa2-switch: get control interface attributes
  2020-11-04 16:57 [RFC 0/9] staging: dpaa2-switch: add support for CPU terminated traffic Ioana Ciornei
@ 2020-11-04 16:57 ` Ioana Ciornei
  2020-11-04 16:57 ` [RFC 2/9] staging: dpaa2-switch: setup buffer pool for control traffic Ioana Ciornei
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Ioana Ciornei @ 2020-11-04 16:57 UTC
  To: Greg Kroah-Hartman, Andrew Lunn; +Cc: linux-kernel, netdev, Ioana Ciornei

From: Ioana Ciornei <ioana.ciornei@nxp.com>

Introduce a new structure to hold all the necessary information related
to an Rx queue of the control interface and populate the FQ IDs.
We only have one Rx queue and one Tx confirmation queue on the control
interface, both shared by all the switch ports.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
 drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h |  9 +++++
 drivers/staging/fsl-dpaa2/ethsw/dpsw.c     | 31 ++++++++++++++++
 drivers/staging/fsl-dpaa2/ethsw/dpsw.h     | 21 +++++++++++
 drivers/staging/fsl-dpaa2/ethsw/ethsw.c    | 43 ++++++++++++++++++++++
 drivers/staging/fsl-dpaa2/ethsw/ethsw.h    | 15 ++++++++
 5 files changed, 119 insertions(+)

diff --git a/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h b/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h
index f100d503bd17..24c902eb63e3 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h
@@ -73,6 +73,8 @@
 #define DPSW_CMDID_IF_GET_PRIMARY_MAC_ADDR  DPSW_CMD_ID(0x0A8)
 #define DPSW_CMDID_IF_SET_PRIMARY_MAC_ADDR  DPSW_CMD_ID(0x0A9)
 
+#define DPSW_CMDID_CTRL_IF_GET_ATTR         DPSW_CMD_ID(0x0A0)
+
 /* Macros for accessing command fields smaller than 1byte */
 #define DPSW_MASK(field)        \
 	GENMASK(DPSW_##field##_SHIFT + DPSW_##field##_SIZE - 1, \
@@ -368,6 +370,13 @@ struct dpsw_rsp_fdb_dump {
 	__le16 num_entries;
 };
 
+struct dpsw_rsp_ctrl_if_get_attr {
+	__le64 pad;
+	__le32 rx_fqid;
+	__le32 rx_err_fqid;
+	__le32 tx_err_conf_fqid;
+};
+
 struct dpsw_rsp_get_api_version {
 	__le16 version_major;
 	__le16 version_minor;
diff --git a/drivers/staging/fsl-dpaa2/ethsw/dpsw.c b/drivers/staging/fsl-dpaa2/ethsw/dpsw.c
index f8bfe779bd30..996f13abfb1b 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/dpsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/dpsw.c
@@ -1182,6 +1182,37 @@ int dpsw_fdb_set_learning_mode(struct fsl_mc_io *mc_io,
 	return mc_send_command(mc_io, &cmd);
 }
 
+/**
+ * dpsw_ctrl_if_get_attributes() - Obtain control interface attributes
+ * @mc_io:	Pointer to MC portal's I/O object
+ * @cmd_flags:	Command flags; one or more of 'MC_CMD_FLAG_'
+ * @token:	Token of DPSW object
+ * @attr:	Returned control interface attributes
+ *
+ * Return:	'0' on Success; Error code otherwise.
+ */
+int dpsw_ctrl_if_get_attributes(struct fsl_mc_io *mc_io, u32 cmd_flags,
+				u16 token, struct dpsw_ctrl_if_attr *attr)
+{
+	struct dpsw_rsp_ctrl_if_get_attr *rsp_params;
+	struct fsl_mc_command cmd = { 0 };
+	int err;
+
+	cmd.header = mc_encode_cmd_header(DPSW_CMDID_CTRL_IF_GET_ATTR,
+					  cmd_flags, token);
+
+	err = mc_send_command(mc_io, &cmd);
+	if (err)
+		return err;
+
+	rsp_params = (struct dpsw_rsp_ctrl_if_get_attr *)cmd.params;
+	attr->rx_fqid = le32_to_cpu(rsp_params->rx_fqid);
+	attr->rx_err_fqid = le32_to_cpu(rsp_params->rx_err_fqid);
+	attr->tx_err_conf_fqid = le32_to_cpu(rsp_params->tx_err_conf_fqid);
+
+	return 0;
+}
+
 /**
  * dpsw_get_api_version() - Get Data Path Switch API version
  * @mc_io:	Pointer to MC portal's I/O object
diff --git a/drivers/staging/fsl-dpaa2/ethsw/dpsw.h b/drivers/staging/fsl-dpaa2/ethsw/dpsw.h
index ab63ee4f5cb7..fcc07991119d 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/dpsw.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/dpsw.h
@@ -175,6 +175,27 @@ int dpsw_get_attributes(struct fsl_mc_io *mc_io,
 			u16 token,
 			struct dpsw_attr *attr);
 
+/**
+ * struct dpsw_ctrl_if_attr - Control interface attributes
+ * @rx_fqid:		Receive FQID
+ * @rx_err_fqid:	Receive error FQID
+ * @tx_err_conf_fqid:	Transmit error and confirmation FQID
+ */
+struct dpsw_ctrl_if_attr {
+	u32 rx_fqid;
+	u32 rx_err_fqid;
+	u32 tx_err_conf_fqid;
+};
+
+int dpsw_ctrl_if_get_attributes(struct fsl_mc_io *mc_io, u32 cmd_flags,
+				u16 token, struct dpsw_ctrl_if_attr *attr);
+
+enum dpsw_queue_type {
+	DPSW_QUEUE_RX,
+	DPSW_QUEUE_TX_ERR_CONF,
+	DPSW_QUEUE_RX_ERR,
+};
+
 /**
  * enum dpsw_action - Action selection for special/control frames
  * @DPSW_ACTION_DROP: Drop frame
diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
index 20c6326e5dee..c22b9ad558c8 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
@@ -1478,6 +1478,43 @@ static void dpaa2_switch_detect_features(struct ethsw_core *ethsw)
 		ethsw->features |= ETHSW_FEATURE_MAC_ADDR;
 }
 
+static int dpaa2_switch_setup_fqs(struct ethsw_core *ethsw)
+{
+	struct dpsw_ctrl_if_attr ctrl_if_attr;
+	struct device *dev = ethsw->dev;
+	int i = 0;
+	int err;
+
+	err = dpsw_ctrl_if_get_attributes(ethsw->mc_io, 0, ethsw->dpsw_handle,
+					  &ctrl_if_attr);
+	if (err) {
+		dev_err(dev, "dpsw_ctrl_if_get_attributes() = %d\n", err);
+		return err;
+	}
+
+	ethsw->fq[i].fqid = ctrl_if_attr.rx_fqid;
+	ethsw->fq[i].ethsw = ethsw;
+	ethsw->fq[i++].type = DPSW_QUEUE_RX;
+
+	ethsw->fq[i].fqid = ctrl_if_attr.tx_err_conf_fqid;
+	ethsw->fq[i].ethsw = ethsw;
+	ethsw->fq[i++].type = DPSW_QUEUE_TX_ERR_CONF;
+
+	return 0;
+}
+
+static int dpaa2_switch_ctrl_if_setup(struct ethsw_core *ethsw)
+{
+	int err;
+
+	/* setup FQs for Rx and Tx Conf */
+	err = dpaa2_switch_setup_fqs(ethsw);
+	if (err)
+		return err;
+
+	return 0;
+}
+
 static int dpaa2_switch_init(struct fsl_mc_device *sw_dev)
 {
 	struct device *dev = &sw_dev->dev;
@@ -1566,6 +1603,12 @@ static int dpaa2_switch_init(struct fsl_mc_device *sw_dev)
 		goto err_close;
 	}
 
+	if (dpaa2_switch_has_ctrl_if(ethsw)) {
+		err = dpaa2_switch_ctrl_if_setup(ethsw);
+		if (err)
+			goto err_destroy_ordered_workqueue;
+	}
+
 	err = dpaa2_switch_register_notifier(dev);
 	if (err)
 		goto err_destroy_ordered_workqueue;
diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
index 5f9211ccb1ef..3189dd72a51b 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
@@ -39,10 +39,19 @@
 
 #define ETHSW_FEATURE_MAC_ADDR	BIT(0)
 
+/* Number of receive queues (one RX and one TX_CONF) */
+#define DPAA2_SWITCH_RX_NUM_FQS	2
+
 extern const struct ethtool_ops dpaa2_switch_port_ethtool_ops;
 
 struct ethsw_core;
 
+struct dpaa2_switch_fq {
+	struct ethsw_core *ethsw;
+	enum dpsw_queue_type type;
+	u32 fqid;
+};
+
 /* Per port private data */
 struct ethsw_port_priv {
 	struct net_device	*netdev;
@@ -75,6 +84,12 @@ struct ethsw_core {
 	struct notifier_block		port_switchdev_nb;
 	struct notifier_block		port_switchdevb_nb;
 	struct workqueue_struct		*workqueue;
+
+	struct dpaa2_switch_fq		fq[DPAA2_SWITCH_RX_NUM_FQS];
 };
 
+static inline bool dpaa2_switch_has_ctrl_if(struct ethsw_core *ethsw)
+{
+	return !(ethsw->sw_attr.options & DPSW_OPT_CTRL_IF_DIS);
+}
 #endif	/* __ETHSW_H */
-- 
2.28.0



* [RFC 2/9] staging: dpaa2-switch: setup buffer pool for control traffic
  2020-11-04 16:57 [RFC 0/9] staging: dpaa2-switch: add support for CPU terminated traffic Ioana Ciornei
  2020-11-04 16:57 ` [RFC 1/9] staging: dpaa2-switch: get control interface attributes Ioana Ciornei
@ 2020-11-04 16:57 ` Ioana Ciornei
  2020-11-04 16:57 ` [RFC 3/9] staging: dpaa2-switch: setup RX path rings Ioana Ciornei
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Ioana Ciornei @ 2020-11-04 16:57 UTC
  To: Greg Kroah-Hartman, Andrew Lunn; +Cc: linux-kernel, netdev, Ioana Ciornei

From: Ioana Ciornei <ioana.ciornei@nxp.com>

Allocate and set up a buffer pool, needed on the Rx path of the control
interface. Also, define the Rx buffer size seen by the WRIOP, derived
from the PAGE_SIZE buffers seeded into the pool (see the arithmetic
sketched below).
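
On a system with 4 KiB pages the numbers work out roughly as follows
(a sketch; the 320-byte skb_shared_info tailroom is an assumption for
a typical arm64 configuration):

  DPAA2_SWITCH_RX_BUF_RAW_SIZE = PAGE_SIZE                       = 4096
  DPAA2_SWITCH_RX_BUF_TAILROOM = SKB_DATA_ALIGN(
                                   sizeof(struct skb_shared_info)) ~ 320
  DPAA2_SWITCH_RX_BUF_SIZE     = 4096 - 320 = 3776 bytes seen by WRIOP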

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
 drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h | 12 +++
 drivers/staging/fsl-dpaa2/ethsw/dpsw.c     | 31 ++++++++
 drivers/staging/fsl-dpaa2/ethsw/dpsw.h     | 26 +++++++
 drivers/staging/fsl-dpaa2/ethsw/ethsw.c    | 90 ++++++++++++++++++++++
 drivers/staging/fsl-dpaa2/ethsw/ethsw.h    | 10 +++
 5 files changed, 169 insertions(+)

diff --git a/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h b/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h
index 24c902eb63e3..e27e86d5d4fb 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h
@@ -8,6 +8,8 @@
 #ifndef __FSL_DPSW_CMD_H
 #define __FSL_DPSW_CMD_H
 
+#include "dpsw.h"
+
 /* DPSW Version */
 #define DPSW_VER_MAJOR		8
 #define DPSW_VER_MINOR		1
@@ -74,6 +76,7 @@
 #define DPSW_CMDID_IF_SET_PRIMARY_MAC_ADDR  DPSW_CMD_ID(0x0A9)
 
 #define DPSW_CMDID_CTRL_IF_GET_ATTR         DPSW_CMD_ID(0x0A0)
+#define DPSW_CMDID_CTRL_IF_SET_POOLS        DPSW_CMD_ID(0x0A1)
 
 /* Macros for accessing command fields smaller than 1byte */
 #define DPSW_MASK(field)        \
@@ -377,6 +380,15 @@ struct dpsw_rsp_ctrl_if_get_attr {
 	__le32 tx_err_conf_fqid;
 };
 
+#define DPSW_BACKUP_POOL(val, order)	(((val) & 0x1) << (order))
+struct dpsw_cmd_ctrl_if_set_pools {
+	u8 num_dpbp;
+	u8 backup_pool_mask;
+	__le16 pad;
+	__le32 dpbp_id[DPSW_MAX_DPBP];
+	__le16 buffer_size[DPSW_MAX_DPBP];
+};
+
 struct dpsw_rsp_get_api_version {
 	__le16 version_major;
 	__le16 version_minor;
diff --git a/drivers/staging/fsl-dpaa2/ethsw/dpsw.c b/drivers/staging/fsl-dpaa2/ethsw/dpsw.c
index 996f13abfb1b..01ebdb12adec 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/dpsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/dpsw.c
@@ -1213,6 +1213,37 @@ int dpsw_ctrl_if_get_attributes(struct fsl_mc_io *mc_io, u32 cmd_flags,
 	return 0;
 }
 
+/**
+ * dpsw_ctrl_if_set_pools() - Set control interface buffer pools
+ * @mc_io:	Pointer to MC portal's I/O object
+ * @cmd_flags:	Command flags; one or more of 'MC_CMD_FLAG_'
+ * @token:	Token of DPSW object
+ * @cfg:	Buffer pools configuration
+ *
+ * Return:	'0' on Success; Error code otherwise.
+ */
+int dpsw_ctrl_if_set_pools(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token,
+			   const struct dpsw_ctrl_if_pools_cfg *cfg)
+{
+	struct dpsw_cmd_ctrl_if_set_pools *cmd_params;
+	struct fsl_mc_command cmd = { 0 };
+	int i;
+
+	cmd.header = mc_encode_cmd_header(DPSW_CMDID_CTRL_IF_SET_POOLS,
+					  cmd_flags, token);
+	cmd_params = (struct dpsw_cmd_ctrl_if_set_pools *)cmd.params;
+	cmd_params->num_dpbp = cfg->num_dpbp;
+	for (i = 0; i < DPSW_MAX_DPBP; i++) {
+		cmd_params->dpbp_id[i] = cpu_to_le32(cfg->pools[i].dpbp_id);
+		cmd_params->buffer_size[i] =
+			cpu_to_le16(cfg->pools[i].buffer_size);
+		cmd_params->backup_pool_mask |=
+			DPSW_BACKUP_POOL(cfg->pools[i].backup_pool, i);
+	}
+
+	return mc_send_command(mc_io, &cmd);
+}
+
 /**
  * dpsw_get_api_version() - Get Data Path Switch API version
  * @mc_io:	Pointer to MC portal's I/O object
diff --git a/drivers/staging/fsl-dpaa2/ethsw/dpsw.h b/drivers/staging/fsl-dpaa2/ethsw/dpsw.h
index fcc07991119d..5ad7c0634992 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/dpsw.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/dpsw.h
@@ -196,6 +196,32 @@ enum dpsw_queue_type {
 	DPSW_QUEUE_RX_ERR,
 };
 
+/**
+ * Maximum number of DPBP
+ */
+#define DPSW_MAX_DPBP     8
+
+/**
+ * struct dpsw_ctrl_if_pools_cfg - Control interface buffer pools configuration
+ * @num_dpbp: Number of DPBPs
+ * @pools: Array of buffer pools parameters; The number of valid entries
+ *	must match 'num_dpbp' value
+ * @pools.dpbp_id: DPBP object ID
+ * @pools.buffer_size: Buffer size
+ * @pools.backup_pool: Backup pool
+ */
+struct dpsw_ctrl_if_pools_cfg {
+	u8 num_dpbp;
+	struct {
+		int dpbp_id;
+		u16 buffer_size;
+		int backup_pool;
+	} pools[DPSW_MAX_DPBP];
+};
+
+int dpsw_ctrl_if_set_pools(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token,
+			   const struct dpsw_ctrl_if_pools_cfg *cfg);
+
 /**
  * enum dpsw_action - Action selection for special/control frames
  * @DPSW_ACTION_DROP: Drop frame
diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
index c22b9ad558c8..21d3ff6b9f55 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
@@ -1503,6 +1503,83 @@ static int dpaa2_switch_setup_fqs(struct ethsw_core *ethsw)
 	return 0;
 }
 
+static int dpaa2_switch_setup_dpbp(struct ethsw_core *ethsw)
+{
+	struct dpsw_ctrl_if_pools_cfg dpsw_ctrl_if_pools_cfg = { 0 };
+	struct device *dev = ethsw->dev;
+	struct fsl_mc_device *dpbp_dev;
+	struct dpbp_attr dpbp_attrs;
+	int err;
+
+	err = fsl_mc_object_allocate(to_fsl_mc_device(dev), FSL_MC_POOL_DPBP,
+				     &dpbp_dev);
+	if (err) {
+		if (err == -ENXIO)
+			err = -EPROBE_DEFER;
+		else
+			dev_err(dev, "DPBP device allocation failed\n");
+		return err;
+	}
+	ethsw->dpbp_dev = dpbp_dev;
+
+	err = dpbp_open(ethsw->mc_io, 0, dpbp_dev->obj_desc.id,
+			&dpbp_dev->mc_handle);
+	if (err) {
+		dev_err(dev, "dpbp_open() failed\n");
+		goto err_open;
+	}
+
+	err = dpbp_reset(ethsw->mc_io, 0, dpbp_dev->mc_handle);
+	if (err) {
+		dev_err(dev, "dpbp_reset() failed\n");
+		goto err_reset;
+	}
+
+	err = dpbp_enable(ethsw->mc_io, 0, dpbp_dev->mc_handle);
+	if (err) {
+		dev_err(dev, "dpbp_enable() failed\n");
+		goto err_enable;
+	}
+
+	err = dpbp_get_attributes(ethsw->mc_io, 0, dpbp_dev->mc_handle,
+				  &dpbp_attrs);
+	if (err) {
+		dev_err(dev, "dpbp_get_attributes() failed\n");
+		goto err_get_attr;
+	}
+
+	dpsw_ctrl_if_pools_cfg.num_dpbp = 1;
+	dpsw_ctrl_if_pools_cfg.pools[0].dpbp_id = dpbp_attrs.id;
+	dpsw_ctrl_if_pools_cfg.pools[0].buffer_size = DPAA2_SWITCH_RX_BUF_SIZE;
+	dpsw_ctrl_if_pools_cfg.pools[0].backup_pool = 0;
+
+	err = dpsw_ctrl_if_set_pools(ethsw->mc_io, 0, ethsw->dpsw_handle,
+				     &dpsw_ctrl_if_pools_cfg);
+	if (err) {
+		dev_err(dev, "dpsw_ctrl_if_set_pools() failed\n");
+		goto err_get_attr;
+	}
+	ethsw->bpid = dpbp_attrs.id;
+
+	return 0;
+
+err_get_attr:
+	dpbp_disable(ethsw->mc_io, 0, dpbp_dev->mc_handle);
+err_enable:
+err_reset:
+	dpbp_close(ethsw->mc_io, 0, dpbp_dev->mc_handle);
+err_open:
+	fsl_mc_object_free(dpbp_dev);
+	return err;
+}
+
+static void dpaa2_switch_free_dpbp(struct ethsw_core *ethsw)
+{
+	dpbp_disable(ethsw->mc_io, 0, ethsw->dpbp_dev->mc_handle);
+	dpbp_close(ethsw->mc_io, 0, ethsw->dpbp_dev->mc_handle);
+	fsl_mc_object_free(ethsw->dpbp_dev);
+}
+
 static int dpaa2_switch_ctrl_if_setup(struct ethsw_core *ethsw)
 {
 	int err;
@@ -1512,6 +1589,11 @@ static int dpaa2_switch_ctrl_if_setup(struct ethsw_core *ethsw)
 	if (err)
 		return err;
 
+	/* setup the buffer pool needed on the Rx path */
+	err = dpaa2_switch_setup_dpbp(ethsw);
+	if (err)
+		return err;
+
 	return 0;
 }
 
@@ -1693,6 +1775,11 @@ static void dpaa2_switch_takedown(struct fsl_mc_device *sw_dev)
 		dev_warn(dev, "dpsw_close err %d\n", err);
 }
 
+static void dpaa2_switch_ctrl_if_teardown(struct ethsw_core *ethsw)
+{
+	dpaa2_switch_free_dpbp(ethsw);
+}
+
 static int dpaa2_switch_remove(struct fsl_mc_device *sw_dev)
 {
 	struct ethsw_port_priv *port_priv;
@@ -1703,6 +1790,9 @@ static int dpaa2_switch_remove(struct fsl_mc_device *sw_dev)
 	dev = &sw_dev->dev;
 	ethsw = dev_get_drvdata(dev);
 
+	if (dpaa2_switch_has_ctrl_if(ethsw))
+		dpaa2_switch_ctrl_if_teardown(ethsw);
+
 	dpaa2_switch_teardown_irqs(sw_dev);
 
 	dpsw_disable(ethsw->mc_io, 0, ethsw->dpsw_handle);
diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
index 3189dd72a51b..84130134aa67 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
@@ -17,6 +17,7 @@
 #include <uapi/linux/if_bridge.h>
 #include <net/switchdev.h>
 #include <linux/if_bridge.h>
+#include <linux/fsl/mc.h>
 
 #include "dpsw.h"
 
@@ -42,6 +43,13 @@
 /* Number of receive queues (one RX and one TX_CONF) */
 #define DPAA2_SWITCH_RX_NUM_FQS	2
 
+/* Hardware requires alignment for ingress/egress buffer addresses */
+#define DPAA2_SWITCH_RX_BUF_RAW_SIZE	PAGE_SIZE
+#define DPAA2_SWITCH_RX_BUF_TAILROOM \
+	SKB_DATA_ALIGN(sizeof(struct skb_shared_info))
+#define DPAA2_SWITCH_RX_BUF_SIZE \
+	(DPAA2_SWITCH_RX_BUF_RAW_SIZE - DPAA2_SWITCH_RX_BUF_TAILROOM)
+
 extern const struct ethtool_ops dpaa2_switch_port_ethtool_ops;
 
 struct ethsw_core;
@@ -86,6 +94,8 @@ struct ethsw_core {
 	struct workqueue_struct		*workqueue;
 
 	struct dpaa2_switch_fq		fq[DPAA2_SWITCH_RX_NUM_FQS];
+	struct fsl_mc_device		*dpbp_dev;
+	u16				bpid;
 };
 
 static inline bool dpaa2_switch_has_ctrl_if(struct ethsw_core *ethsw)
-- 
2.28.0



* [RFC 3/9] staging: dpaa2-switch: setup RX path rings
  2020-11-04 16:57 [RFC 0/9] staging: dpaa2-switch: add support for CPU terminated traffic Ioana Ciornei
  2020-11-04 16:57 ` [RFC 1/9] staging: dpaa2-switch: get control interface attributes Ioana Ciornei
  2020-11-04 16:57 ` [RFC 2/9] staging: dpaa2-switch: setup buffer pool for control traffic Ioana Ciornei
@ 2020-11-04 16:57 ` Ioana Ciornei
  2020-11-04 16:57 ` [RFC 4/9] staging: dpaa2-switch: setup dpio Ioana Ciornei
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Ioana Ciornei @ 2020-11-04 16:57 UTC
  To: Greg Kroah-Hartman, Andrew Lunn; +Cc: linux-kernel, netdev, Ioana Ciornei

From: Ioana Ciornei <ioana.ciornei@nxp.com>

On the Rx path, when a pull-dequeue operation is performed on a
software portal, the available frame descriptors are placed in a ring -
a DMA memory storage - for further processing. Create the needed rings
for both frame queues used on the control interface; the consume
pattern is sketched below.
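
Condensed, the dequeue pattern looks like this (a sketch only:
retry/timeout handling is trimmed and consume() stands in for the
per-queue consume callback added in a later patch):

  struct dpaa2_dq *dq;
  int is_last, err;

  /* pull up to DPAA2_SWITCH_STORE_SIZE FDs into the ring */
  err = dpaa2_io_service_pull_fq(NULL, fq->fqid, fq->store);
  if (err)
  	return err;

  /* walk the store until the dequeue response is marked 'last' */
  do {
  	dq = dpaa2_io_store_next(fq->store, &is_last);
  	if (dq)
  		consume(fq, dpaa2_dq_fd(dq));
  } while (!is_last);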

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
 drivers/staging/fsl-dpaa2/ethsw/ethsw.c | 37 +++++++++++++++++++++++++
 drivers/staging/fsl-dpaa2/ethsw/ethsw.h |  4 +++
 2 files changed, 41 insertions(+)

diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
index 21d3ff6b9f55..d81961d36821 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
@@ -1580,6 +1580,33 @@ static void dpaa2_switch_free_dpbp(struct ethsw_core *ethsw)
 	fsl_mc_object_free(ethsw->dpbp_dev);
 }
 
+static int dpaa2_switch_alloc_rings(struct ethsw_core *ethsw)
+{
+	int i;
+
+	for (i = 0; i < DPAA2_SWITCH_RX_NUM_FQS; i++) {
+		ethsw->fq[i].store =
+			dpaa2_io_store_create(DPAA2_SWITCH_STORE_SIZE,
+					      ethsw->dev);
+		if (!ethsw->fq[i].store) {
+			dev_err(ethsw->dev, "dpaa2_io_store_create failed\n");
+			while (--i >= 0)
+				dpaa2_io_store_destroy(ethsw->fq[i].store);
+			return -ENOMEM;
+		}
+	}
+
+	return 0;
+}
+
+static void dpaa2_switch_destroy_rings(struct ethsw_core *ethsw)
+{
+	int i;
+
+	for (i = 0; i < DPAA2_SWITCH_RX_NUM_FQS; i++)
+		dpaa2_io_store_destroy(ethsw->fq[i].store);
+}
+
 static int dpaa2_switch_ctrl_if_setup(struct ethsw_core *ethsw)
 {
 	int err;
@@ -1594,7 +1621,16 @@ static int dpaa2_switch_ctrl_if_setup(struct ethsw_core *ethsw)
 	if (err)
 		return err;
 
+	err = dpaa2_switch_alloc_rings(ethsw);
+	if (err)
+		goto err_free_dpbp;
+
 	return 0;
+
+err_free_dpbp:
+	dpaa2_switch_free_dpbp(ethsw);
+
+	return err;
 }
 
 static int dpaa2_switch_init(struct fsl_mc_device *sw_dev)
@@ -1777,6 +1813,7 @@ static void dpaa2_switch_takedown(struct fsl_mc_device *sw_dev)
 
 static void dpaa2_switch_ctrl_if_teardown(struct ethsw_core *ethsw)
 {
+	dpaa2_switch_destroy_rings(ethsw);
 	dpaa2_switch_free_dpbp(ethsw);
 }
 
diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
index 84130134aa67..b476f9ac4c55 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
@@ -18,6 +18,7 @@
 #include <net/switchdev.h>
 #include <linux/if_bridge.h>
 #include <linux/fsl/mc.h>
+#include <soc/fsl/dpaa2-io.h>
 
 #include "dpsw.h"
 
@@ -50,6 +51,8 @@
 #define DPAA2_SWITCH_RX_BUF_SIZE \
 	(DPAA2_SWITCH_RX_BUF_RAW_SIZE - DPAA2_SWITCH_RX_BUF_TAILROOM)
 
+#define DPAA2_SWITCH_STORE_SIZE 16
+
 extern const struct ethtool_ops dpaa2_switch_port_ethtool_ops;
 
 struct ethsw_core;
@@ -57,6 +60,7 @@ struct ethsw_core;
 struct dpaa2_switch_fq {
 	struct ethsw_core *ethsw;
 	enum dpsw_queue_type type;
+	struct dpaa2_io_store *store;
 	u32 fqid;
 };
 
-- 
2.28.0



* [RFC 4/9] staging: dpaa2-switch: setup dpio
  2020-11-04 16:57 [RFC 0/9] staging: dpaa2-switch: add support for CPU terminated traffic Ioana Ciornei
                   ` (2 preceding siblings ...)
  2020-11-04 16:57 ` [RFC 3/9] staging: dpaa2-switch: setup RX path rings Ioana Ciornei
@ 2020-11-04 16:57 ` Ioana Ciornei
  2020-11-04 16:57 ` [RFC 5/9] staging: dpaa2-switch: handle Rx path on control interface Ioana Ciornei
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Ioana Ciornei @ 2020-11-04 16:57 UTC
  To: Greg Kroah-Hartman, Andrew Lunn; +Cc: linux-kernel, netdev, Ioana Ciornei

From: Ioana Ciornei <ioana.ciornei@nxp.com>

Set up interrupts on the control interface queues. We do not force an
exact affinity between the interrupts received from a specific queue
and a CPU. Concretely:
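
  /* Sketch: registering the notification context with
   * DPAA2_IO_ANY_CPU instead of a fixed CPU leaves the servicing
   * CPU up to the dpaa2-io layer.
   */
  nctx->is_cdan = 0;			/* FQDAN, not a channel DAN */
  nctx->id = fq->fqid;
  nctx->desired_cpu = DPAA2_IO_ANY_CPU;	/* any CPU may service it */
  err = dpaa2_io_service_register(NULL, nctx, dev);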

Also, the DPSW object version is incremented since the
dpsw_ctrl_if_set_queue() API was introduced in the v8.4 object
(first seen in the MC 10.19.0 firmware).

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
 drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h | 17 +++++-
 drivers/staging/fsl-dpaa2/ethsw/dpsw.c     | 33 +++++++++++
 drivers/staging/fsl-dpaa2/ethsw/dpsw.h     | 23 ++++++++
 drivers/staging/fsl-dpaa2/ethsw/ethsw.c    | 65 ++++++++++++++++++++++
 drivers/staging/fsl-dpaa2/ethsw/ethsw.h    |  1 +
 5 files changed, 138 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h b/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h
index e27e86d5d4fb..9caff864ae3e 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h
@@ -12,7 +12,7 @@
 
 /* DPSW Version */
 #define DPSW_VER_MAJOR		8
-#define DPSW_VER_MINOR		1
+#define DPSW_VER_MINOR		4
 
 #define DPSW_CMD_BASE_VERSION	1
 #define DPSW_CMD_ID_OFFSET	4
@@ -77,6 +77,7 @@
 
 #define DPSW_CMDID_CTRL_IF_GET_ATTR         DPSW_CMD_ID(0x0A0)
 #define DPSW_CMDID_CTRL_IF_SET_POOLS        DPSW_CMD_ID(0x0A1)
+#define DPSW_CMDID_CTRL_IF_SET_QUEUE        DPSW_CMD_ID(0x0A6)
 
 /* Macros for accessing command fields smaller than 1byte */
 #define DPSW_MASK(field)        \
@@ -389,6 +390,20 @@ struct dpsw_cmd_ctrl_if_set_pools {
 	__le16 buffer_size[DPSW_MAX_DPBP];
 };
 
+#define DPSW_DEST_TYPE_SHIFT	0
+#define DPSW_DEST_TYPE_SIZE	4
+
+struct dpsw_cmd_ctrl_if_set_queue {
+	__le32 dest_id;
+	u8 dest_priority;
+	u8 pad;
+	/* from LSB: dest_type:4 */
+	u8 dest_type;
+	u8 qtype;
+	__le64 user_ctx;
+	__le32 options;
+};
+
 struct dpsw_rsp_get_api_version {
 	__le16 version_major;
 	__le16 version_minor;
diff --git a/drivers/staging/fsl-dpaa2/ethsw/dpsw.c b/drivers/staging/fsl-dpaa2/ethsw/dpsw.c
index 01ebdb12adec..785140d4652c 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/dpsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/dpsw.c
@@ -1244,6 +1244,39 @@ int dpsw_ctrl_if_set_pools(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token,
 	return mc_send_command(mc_io, &cmd);
 }
 
+/**
+ * dpsw_ctrl_if_set_queue() - Set Rx queue configuration
+ * @mc_io:	Pointer to MC portal's I/O object
+ * @cmd_flags:	Command flags; one or more of 'MC_CMD_FLAG_'
+ * @token:	Token of dpsw object
+ * @qtype:	dpsw_queue_type of the targeted queue
+ * @cfg:	Rx queue configuration
+ *
+ * Return:	'0' on Success; Error code otherwise.
+ */
+int dpsw_ctrl_if_set_queue(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token,
+			   enum dpsw_queue_type qtype,
+			   const struct dpsw_ctrl_if_queue_cfg *cfg)
+{
+	struct dpsw_cmd_ctrl_if_set_queue *cmd_params;
+	struct fsl_mc_command cmd = { 0 };
+
+	cmd.header = mc_encode_cmd_header(DPSW_CMDID_CTRL_IF_SET_QUEUE,
+					  cmd_flags,
+					  token);
+	cmd_params = (struct dpsw_cmd_ctrl_if_set_queue *)cmd.params;
+	cmd_params->dest_id = cpu_to_le32(cfg->dest_cfg.dest_id);
+	cmd_params->dest_priority = cfg->dest_cfg.priority;
+	cmd_params->qtype = qtype;
+	cmd_params->user_ctx = cpu_to_le64(cfg->user_ctx);
+	cmd_params->options = cpu_to_le32(cfg->options);
+	dpsw_set_field(cmd_params->dest_type,
+		       DEST_TYPE,
+		       cfg->dest_cfg.dest_type);
+
+	return mc_send_command(mc_io, &cmd);
+}
+
 /**
  * dpsw_get_api_version() - Get Data Path Switch API version
  * @mc_io:	Pointer to MC portal's I/O object
diff --git a/drivers/staging/fsl-dpaa2/ethsw/dpsw.h b/drivers/staging/fsl-dpaa2/ethsw/dpsw.h
index 5ad7c0634992..718242ea7d87 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/dpsw.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/dpsw.h
@@ -222,6 +222,29 @@ struct dpsw_ctrl_if_pools_cfg {
 int dpsw_ctrl_if_set_pools(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token,
 			   const struct dpsw_ctrl_if_pools_cfg *cfg);
 
+#define DPSW_CTRL_IF_QUEUE_OPT_USER_CTX		0x00000001
+#define DPSW_CTRL_IF_QUEUE_OPT_DEST		0x00000002
+
+enum dpsw_ctrl_if_dest {
+	DPSW_CTRL_IF_DEST_NONE = 0,
+	DPSW_CTRL_IF_DEST_DPIO = 1,
+};
+
+struct dpsw_ctrl_if_dest_cfg {
+	enum dpsw_ctrl_if_dest dest_type;
+	int dest_id;
+	u8 priority;
+};
+
+struct dpsw_ctrl_if_queue_cfg {
+	u32 options;
+	u64 user_ctx;
+	struct dpsw_ctrl_if_dest_cfg dest_cfg;
+};
+
+int dpsw_ctrl_if_set_queue(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token,
+			   enum dpsw_queue_type qtype,
+			   const struct dpsw_ctrl_if_queue_cfg *cfg);
 /**
  * enum dpsw_action - Action selection for special/control frames
  * @DPSW_ACTION_DROP: Drop frame
diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
index d81961d36821..f8f972421fea 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
@@ -1607,6 +1607,64 @@ static void dpaa2_switch_destroy_rings(struct ethsw_core *ethsw)
 		dpaa2_io_store_destroy(ethsw->fq[i].store);
 }
 
+static int dpaa2_switch_setup_dpio(struct ethsw_core *ethsw)
+{
+	struct dpsw_ctrl_if_queue_cfg queue_cfg;
+	struct dpaa2_io_notification_ctx *nctx;
+	int err, i, j;
+
+	for (i = 0; i < DPAA2_SWITCH_RX_NUM_FQS; i++) {
+		nctx = &ethsw->fq[i].nctx;
+
+		/* Register a new software context for the FQID.
+		 * By using NULL as the first parameter, we specify that we do
+		 * not care on which CPU interrupts for this queue are received
+		 */
+		nctx->is_cdan = 0;
+		nctx->id = ethsw->fq[i].fqid;
+		nctx->desired_cpu = DPAA2_IO_ANY_CPU;
+		err = dpaa2_io_service_register(NULL, nctx, ethsw->dev);
+		if (err) {
+			err = -EPROBE_DEFER;
+			goto err_register;
+		}
+
+		queue_cfg.options = DPSW_CTRL_IF_QUEUE_OPT_DEST |
+				    DPSW_CTRL_IF_QUEUE_OPT_USER_CTX;
+		queue_cfg.dest_cfg.dest_type = DPSW_CTRL_IF_DEST_DPIO;
+		queue_cfg.dest_cfg.dest_id = nctx->dpio_id;
+		queue_cfg.dest_cfg.priority = 0;
+		queue_cfg.user_ctx = nctx->qman64;
+
+		err = dpsw_ctrl_if_set_queue(ethsw->mc_io, 0,
+					     ethsw->dpsw_handle,
+					     ethsw->fq[i].type,
+					     &queue_cfg);
+		if (err)
+			goto err_set_queue;
+	}
+
+	return 0;
+
+err_set_queue:
+	dpaa2_io_service_deregister(NULL, nctx, ethsw->dev);
+err_register:
+	for (j = 0; j < i; j++)
+		dpaa2_io_service_deregister(NULL, &ethsw->fq[j].nctx,
+					    ethsw->dev);
+
+	return err;
+}
+
+static void dpaa2_switch_free_dpio(struct ethsw_core *ethsw)
+{
+	int i;
+
+	for (i = 0; i < DPAA2_SWITCH_RX_NUM_FQS; i++)
+		dpaa2_io_service_deregister(NULL, &ethsw->fq[i].nctx,
+					    ethsw->dev);
+}
+
 static int dpaa2_switch_ctrl_if_setup(struct ethsw_core *ethsw)
 {
 	int err;
@@ -1625,8 +1683,14 @@ static int dpaa2_switch_ctrl_if_setup(struct ethsw_core *ethsw)
 	if (err)
 		goto err_free_dpbp;
 
+	err = dpaa2_switch_setup_dpio(ethsw);
+	if (err)
+		goto err_destroy_rings;
+
 	return 0;
 
+err_destroy_rings:
+	dpaa2_switch_destroy_rings(ethsw);
 err_free_dpbp:
 	dpaa2_switch_free_dpbp(ethsw);
 
@@ -1813,6 +1877,7 @@ static void dpaa2_switch_takedown(struct fsl_mc_device *sw_dev)
 
 static void dpaa2_switch_ctrl_if_teardown(struct ethsw_core *ethsw)
 {
+	dpaa2_switch_free_dpio(ethsw);
 	dpaa2_switch_destroy_rings(ethsw);
 	dpaa2_switch_free_dpbp(ethsw);
 }
diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
index b476f9ac4c55..c78ee5409b8a 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
@@ -61,6 +61,7 @@ struct dpaa2_switch_fq {
 	struct ethsw_core *ethsw;
 	enum dpsw_queue_type type;
 	struct dpaa2_io_store *store;
+	struct dpaa2_io_notification_ctx nctx;
 	u32 fqid;
 };
 
-- 
2.28.0



* [RFC 5/9] staging: dpaa2-switch: handle Rx path on control interface
  2020-11-04 16:57 [RFC 0/9] staging: dpaa2-switch: add support for CPU terminated traffic Ioana Ciornei
                   ` (3 preceding siblings ...)
  2020-11-04 16:57 ` [RFC 4/9] staging: dpaa2-switch: setup dpio Ioana Ciornei
@ 2020-11-04 16:57 ` Ioana Ciornei
  2020-11-05  0:45   ` Andrew Lunn
  2020-11-04 16:57 ` [RFC 6/9] staging: dpaa2-switch: add .ndo_start_xmit() callback Ioana Ciornei
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Ioana Ciornei @ 2020-11-04 16:57 UTC
  To: Greg Kroah-Hartman, Andrew Lunn; +Cc: linux-kernel, netdev, Ioana Ciornei

From: Ioana Ciornei <ioana.ciornei@nxp.com>

The dpaa2-ethsw supports only one Rx queue that is shared by all switch
ports. This means that information about which port was the ingress port
for a specific frame needs to be passed in metadata. In our case, the
Flow Context (FLC) field from the frame descriptor holds this
information. Besides the interface ID of the ingress port we also
receive the virtual QDID of the port. Below is a visual description of
the 64 bits of FLC.

63           47           31           15           0
+---------------------------------------------------+
|            |            |            |            |
|  RESERVED  |    IF_ID   |  RESERVED  |  IF QDID   |
|            |            |            |            |
+---------------------------------------------------+
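
In code, this layout translates to the following extraction (a sketch
matching the masks used in the patch below; dpaa2_fd_get_flc() returns
the raw 64-bit FLC value):

  u64 flc = dpaa2_fd_get_flc(fd);
  u16 if_id = upper_32_bits(flc) & 0xFFFF;	/* bits 47..32 */
  u16 qdid  = lower_32_bits(flc) & 0xFFFF;	/* bits 15..0  */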

Because all switch ports share the same Rx and Tx conf queues, NAPI
management only enables the NAPI instances when the first switch
interface is opened and disables them once no switch interface is open.

The dequeue path is, for the most part, common to both Rx and Tx conf,
the difference being that each of them has its own consume function for
a frame descriptor. Dequeueing from a FQ, consuming the dequeued store
and the NAPI poll function are all shared between both queues.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
 drivers/staging/fsl-dpaa2/ethsw/ethsw.c | 449 +++++++++++++++++++++++-
 drivers/staging/fsl-dpaa2/ethsw/ethsw.h |  18 +
 2 files changed, 463 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
index f8f972421fea..d09aa4a5126a 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
@@ -13,6 +13,7 @@
 #include <linux/msi.h>
 #include <linux/kthread.h>
 #include <linux/workqueue.h>
+#include <linux/iommu.h>
 
 #include <linux/fsl/mc.h>
 
@@ -24,6 +25,16 @@
 
 #define DEFAULT_VLAN_ID			1
 
+static void *dpaa2_iova_to_virt(struct iommu_domain *domain,
+				dma_addr_t iova_addr)
+{
+	phys_addr_t phys_addr;
+
+	phys_addr = domain ? iommu_iova_to_phys(domain, iova_addr) : iova_addr;
+
+	return phys_to_virt(phys_addr);
+}
+
 static int dpaa2_switch_add_vlan(struct ethsw_core *ethsw, u16 vid)
 {
 	int err;
@@ -496,9 +507,51 @@ static int dpaa2_switch_port_carrier_state_sync(struct net_device *netdev)
 	return 0;
 }
 
+/* Manage all NAPI instances for the control interface.
+ *
+ * We only have one Rx queue and one Tx conf queue for all
+ * switch ports. Therefore, we only need to enable the NAPI instance once, the
+ * first time one of the switch ports runs .dev_open().
+ */
+
+static void dpaa2_switch_enable_ctrl_if_napi(struct ethsw_core *ethsw)
+{
+	int i;
+
+	/* a new interface is using the NAPI instance */
+	ethsw->napi_users++;
+
+	/* if there is already a user of the instance, return */
+	if (ethsw->napi_users > 1)
+		return;
+
+	if (!dpaa2_switch_has_ctrl_if(ethsw))
+		return;
+
+	for (i = 0; i < DPAA2_SWITCH_RX_NUM_FQS; i++)
+		napi_enable(&ethsw->fq[i].napi);
+}
+
+static void dpaa2_switch_disable_ctrl_if_napi(struct ethsw_core *ethsw)
+{
+	int i;
+
+	/* If we are not the last interface using the NAPI, return */
+	ethsw->napi_users--;
+	if (ethsw->napi_users)
+		return;
+
+	if (!dpaa2_switch_has_ctrl_if(ethsw))
+		return;
+
+	for (i = 0; i < DPAA2_SWITCH_RX_NUM_FQS; i++)
+		napi_disable(&ethsw->fq[i].napi);
+}
+
 static int dpaa2_switch_port_open(struct net_device *netdev)
 {
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
+	struct ethsw_core *ethsw = port_priv->ethsw_data;
 	int err;
 
 	/* No need to allow Tx as control interface is disabled */
@@ -527,6 +580,8 @@ static int dpaa2_switch_port_open(struct net_device *netdev)
 		goto err_carrier_sync;
 	}
 
+	dpaa2_switch_enable_ctrl_if_napi(ethsw);
+
 	return 0;
 
 err_carrier_sync:
@@ -539,6 +594,7 @@ static int dpaa2_switch_port_open(struct net_device *netdev)
 static int dpaa2_switch_port_stop(struct net_device *netdev)
 {
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
+	struct ethsw_core *ethsw = port_priv->ethsw_data;
 	int err;
 
 	err = dpsw_if_disable(port_priv->ethsw_data->mc_io, 0,
@@ -549,6 +605,8 @@ static int dpaa2_switch_port_stop(struct net_device *netdev)
 		return err;
 	}
 
+	dpaa2_switch_disable_ctrl_if_napi(ethsw);
+
 	return 0;
 }
 
@@ -755,6 +813,28 @@ static int dpaa2_switch_port_set_mac_addr(struct ethsw_port_priv *port_priv)
 	return 0;
 }
 
+static void dpaa2_switch_free_fd(const struct ethsw_core *ethsw,
+				 const struct dpaa2_fd *fd)
+{
+	struct device *dev = ethsw->dev;
+	unsigned char *buffer_start;
+	struct sk_buff **skbh, *skb;
+	dma_addr_t fd_addr;
+
+	fd_addr = dpaa2_fd_get_addr(fd);
+	skbh = dpaa2_iova_to_virt(ethsw->iommu_domain, fd_addr);
+
+	skb = *skbh;
+	buffer_start = (unsigned char *)skbh;
+
+	dma_unmap_single(dev, fd_addr,
+			 skb_tail_pointer(skb) - buffer_start,
+			 DMA_TO_DEVICE);
+
+	/* Move on with skb release */
+	dev_kfree_skb(skb);
+}
+
 static const struct net_device_ops dpaa2_switch_port_ops = {
 	.ndo_open		= dpaa2_switch_port_open,
 	.ndo_stop		= dpaa2_switch_port_stop,
@@ -1470,6 +1550,104 @@ static int dpaa2_switch_register_notifier(struct device *dev)
 	return err;
 }
 
+/* Build a linear skb based on a single-buffer frame descriptor */
+static struct sk_buff *dpaa2_switch_build_linear_skb(struct ethsw_core *ethsw,
+						     const struct dpaa2_fd *fd)
+{
+	u16 fd_offset = dpaa2_fd_get_offset(fd);
+	u32 fd_length = dpaa2_fd_get_len(fd);
+	struct device *dev = ethsw->dev;
+	struct sk_buff *skb = NULL;
+	dma_addr_t addr;
+	void *fd_vaddr;
+
+	addr = dpaa2_fd_get_addr(fd);
+	dma_unmap_single(dev, addr, DPAA2_SWITCH_RX_BUF_SIZE,
+			 DMA_FROM_DEVICE);
+	fd_vaddr = dpaa2_iova_to_virt(ethsw->iommu_domain, addr);
+	prefetch(fd_vaddr + fd_offset);
+
+	skb = build_skb(fd_vaddr, DPAA2_SWITCH_RX_BUF_SIZE +
+			SKB_DATA_ALIGN(sizeof(struct skb_shared_info)));
+	if (unlikely(!skb)) {
+		dev_err(dev, "build_skb() failed\n");
+		return NULL;
+	}
+
+	skb_reserve(skb, fd_offset);
+	skb_put(skb, fd_length);
+
+	ethsw->buf_count--;
+
+	return skb;
+}
+
+static void dpaa2_switch_tx_conf(struct dpaa2_switch_fq *fq,
+				 const struct dpaa2_fd *fd)
+{
+	dpaa2_switch_free_fd(fq->ethsw, fd);
+}
+
+static void dpaa2_switch_rx(struct dpaa2_switch_fq *fq,
+			    const struct dpaa2_fd *fd)
+{
+	struct ethsw_core *ethsw = fq->ethsw;
+	struct ethsw_port_priv *port_priv;
+	struct net_device *netdev;
+	struct vlan_ethhdr *hdr;
+	struct sk_buff *skb;
+	u16 vlan_tci, vid;
+	int if_id = -1;
+	int err;
+
+	/* prefetch the frame descriptor */
+	prefetch(fd);
+
+	/* get switch ingress interface ID */
+	if_id = upper_32_bits(dpaa2_fd_get_flc(fd)) & 0x0000FFFF;
+
+	if (if_id < 0 || if_id >= ethsw->sw_attr.num_ifs) {
+		dev_err(ethsw->dev, "Frame received from unknown interface!\n");
+		goto err_free_fd;
+	}
+	port_priv = ethsw->ports[if_id];
+	netdev = port_priv->netdev;
+
+	/* build the SKB based on the FD received */
+	if (dpaa2_fd_get_format(fd) == dpaa2_fd_single) {
+		skb = dpaa2_switch_build_linear_skb(ethsw, fd);
+	} else {
+		netdev_err(netdev, "Received invalid frame format\n");
+		goto err_free_fd;
+	}
+
+	if (unlikely(!skb))
+		goto err_free_fd;
+
+	skb_reset_mac_header(skb);
+
+	/* Remove PVID from received frame */
+	hdr = vlan_eth_hdr(skb);
+	vid = ntohs(hdr->h_vlan_TCI) & VLAN_VID_MASK;
+	if (vid == port_priv->pvid) {
+		err = __skb_vlan_pop(skb, &vlan_tci);
+		if (err) {
+			dev_info(ethsw->dev, "__skb_vlan_pop() failed %d\n", err);
+			goto err_free_fd;
+		}
+	}
+
+	skb->dev = netdev;
+	skb->protocol = eth_type_trans(skb, skb->dev);
+
+	netif_receive_skb(skb);
+
+	return;
+
+err_free_fd:
+	dpaa2_switch_free_fd(ethsw, fd);
+}
+
 static void dpaa2_switch_detect_features(struct ethsw_core *ethsw)
 {
 	ethsw->features = 0;
@@ -1486,7 +1664,7 @@ static int dpaa2_switch_setup_fqs(struct ethsw_core *ethsw)
 	int err;
 
 	err = dpsw_ctrl_if_get_attributes(ethsw->mc_io, 0, ethsw->dpsw_handle,
-					  &ctrl_if_attr);
+			&ctrl_if_attr);
 	if (err) {
 		dev_err(dev, "dpsw_ctrl_if_get_attributes() = %d\n", err);
 		return err;
@@ -1494,15 +1672,158 @@ static int dpaa2_switch_setup_fqs(struct ethsw_core *ethsw)
 
 	ethsw->fq[i].fqid = ctrl_if_attr.rx_fqid;
 	ethsw->fq[i].ethsw = ethsw;
-	ethsw->fq[i++].type = DPSW_QUEUE_RX;
+	ethsw->fq[i].type = DPSW_QUEUE_RX;
+	ethsw->fq[i++].consume = dpaa2_switch_rx;
+
 
 	ethsw->fq[i].fqid = ctrl_if_attr.tx_err_conf_fqid;
 	ethsw->fq[i].ethsw = ethsw;
-	ethsw->fq[i++].type = DPSW_QUEUE_TX_ERR_CONF;
+	ethsw->fq[i].type = DPSW_QUEUE_TX_ERR_CONF;
+	ethsw->fq[i++].consume = dpaa2_switch_tx_conf;
 
 	return 0;
 }
 
+/* Free buffers acquired from the buffer pool or which were meant to
+ * be released in the pool
+ */
+static void dpaa2_switch_free_bufs(struct ethsw_core *ethsw, u64 *buf_array, int count)
+{
+	struct device *dev = ethsw->dev;
+	void *vaddr;
+	int i;
+
+	for (i = 0; i < count; i++) {
+		vaddr = dpaa2_iova_to_virt(ethsw->iommu_domain, buf_array[i]);
+		dma_unmap_page(dev, buf_array[i], DPAA2_SWITCH_RX_BUF_SIZE,
+			       DMA_BIDIRECTIONAL);
+		free_pages((unsigned long)vaddr, 0);
+	}
+}
+
+/* Perform a single release command to add buffers
+ * to the specified buffer pool
+ */
+static int dpaa2_switch_add_bufs(struct ethsw_core *ethsw, u16 bpid)
+{
+	struct device *dev = ethsw->dev;
+	u64 buf_array[BUFS_PER_CMD];
+	struct page *page;
+	int retries = 0;
+	dma_addr_t addr;
+	int err;
+	int i;
+
+	for (i = 0; i < BUFS_PER_CMD; i++) {
+		/* Allocate one page for each Rx buffer. WRIOP sees
+		 * the entire page except for a tailroom reserved for
+		 * skb shared info
+		 */
+		page = dev_alloc_pages(0);
+		if (!page) {
+			dev_err(dev, "buffer allocation failed\n");
+			goto err_alloc;
+		}
+
+		addr = dma_map_page(dev, page, 0, DPAA2_SWITCH_RX_BUF_SIZE,
+				    DMA_FROM_DEVICE);
+		if (dma_mapping_error(dev, addr)) {
+			dev_err(dev, "dma_map_single() failed\n");
+			goto err_map;
+		}
+		buf_array[i] = addr;
+	}
+
+release_bufs:
+	/* In case the portal is busy, retry until successful or
+	 * max retries hit.
+	 */
+	while ((err = dpaa2_io_service_release(NULL, bpid,
+					       buf_array, i)) == -EBUSY) {
+		if (retries++ >= DPAA2_SWITCH_SWP_BUSY_RETRIES)
+			break;
+
+		cpu_relax();
+	}
+
+	/* If release command failed, clean up and bail out. */
+	if (err) {
+		dpaa2_switch_free_bufs(ethsw, buf_array, i);
+		return 0;
+	}
+
+	return i;
+
+err_map:
+	__free_pages(page, 0);
+err_alloc:
+	/* If we managed to allocate at least some buffers,
+	 * release them to hardware
+	 */
+	if (i)
+		goto release_bufs;
+
+	return 0;
+}
+
+static int dpaa2_switch_refill_bp(struct ethsw_core *ethsw)
+{
+	int *count = &ethsw->buf_count;
+	int new_count;
+	int err = 0;
+
+	if (unlikely(*count < DPAA2_ETHSW_REFILL_THRESH)) {
+		do {
+			new_count = dpaa2_switch_add_bufs(ethsw, ethsw->bpid);
+			if (unlikely(!new_count)) {
+				/* Out of memory; abort for now, we'll
+				 * try later on
+				 */
+				break;
+			}
+			*count += new_count;
+		} while (*count < DPAA2_ETHSW_NUM_BUFS);
+
+		if (unlikely(*count < DPAA2_ETHSW_NUM_BUFS))
+			err = -ENOMEM;
+	}
+
+	return err;
+}
+
+static int dpaa2_switch_seed_bp(struct ethsw_core *ethsw)
+{
+	int *count, i;
+
+	for (i = 0; i < DPAA2_ETHSW_NUM_BUFS; i += BUFS_PER_CMD) {
+		count = &ethsw->buf_count;
+		*count += dpaa2_switch_add_bufs(ethsw, ethsw->bpid);
+
+		if (unlikely(*count < BUFS_PER_CMD))
+			return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static void dpaa2_switch_drain_bp(struct ethsw_core *ethsw)
+{
+	u64 buf_array[BUFS_PER_CMD];
+	int ret;
+
+	do {
+		ret = dpaa2_io_service_acquire(NULL, ethsw->bpid,
+					       buf_array, BUFS_PER_CMD);
+		if (ret < 0) {
+			dev_err(ethsw->dev,
+				"dpaa2_io_service_acquire() = %d\n", ret);
+			return;
+		}
+		dpaa2_switch_free_bufs(ethsw, buf_array, ret);
+
+	} while (ret);
+}
+
 static int dpaa2_switch_setup_dpbp(struct ethsw_core *ethsw)
 {
 	struct dpsw_ctrl_if_pools_cfg dpsw_ctrl_if_pools_cfg = { 0 };
@@ -1607,6 +1928,106 @@ static void dpaa2_switch_destroy_rings(struct ethsw_core *ethsw)
 		dpaa2_io_store_destroy(ethsw->fq[i].store);
 }
 
+static int dpaa2_switch_pull_fq(struct dpaa2_switch_fq *fq)
+{
+	int err, retries = 0;
+
+	/* Try to pull from the FQ while the portal is busy and we didn't hit
+	 * the maximum number of retries
+	 */
+	do {
+		err = dpaa2_io_service_pull_fq(NULL,
+					       fq->fqid,
+					       fq->store);
+		cpu_relax();
+	} while (err == -EBUSY && retries++ < DPAA2_SWITCH_SWP_BUSY_RETRIES);
+
+	if (unlikely(err))
+		dev_err(fq->ethsw->dev, "dpaa2_io_service_pull err %d", err);
+
+	return err;
+}
+
+/* Consume all frames pull-dequeued into the store */
+static int dpaa2_switch_store_consume(struct dpaa2_switch_fq *fq)
+{
+	struct ethsw_core *ethsw = fq->ethsw;
+	int cleaned = 0, is_last;
+	struct dpaa2_dq *dq;
+	int retries = 0;
+
+	do {
+		/* Get the next available FD from the store */
+		dq = dpaa2_io_store_next(fq->store, &is_last);
+		if (unlikely(!dq)) {
+			if (retries++ >= DPAA2_SWITCH_SWP_BUSY_RETRIES) {
+				dev_err_once(ethsw->dev,
+					     "No valid dequeue response\n");
+				return -ETIMEDOUT;
+			}
+			continue;
+		}
+
+		/* Process the FD */
+		fq->consume(fq, dpaa2_dq_fd(dq));
+		cleaned++;
+
+	} while (!is_last);
+
+	return cleaned;
+}
+
+/* NAPI poll routine */
+static int dpaa2_switch_poll(struct napi_struct *napi, int budget)
+{
+	int err, cleaned = 0, store_cleaned, work_done;
+	struct dpaa2_switch_fq *fq;
+	int retries = 0;
+
+	fq = container_of(napi, struct dpaa2_switch_fq, napi);
+
+	do {
+		err = dpaa2_switch_pull_fq(fq);
+		if (unlikely(err))
+			break;
+
+		/* Refill pool if appropriate */
+		dpaa2_switch_refill_bp(fq->ethsw);
+
+		store_cleaned = dpaa2_switch_store_consume(fq);
+		cleaned += store_cleaned;
+
+		if (cleaned >= budget) {
+			work_done = budget;
+			goto out;
+		}
+
+	} while (store_cleaned);
+
+	/* We didn't consume entire budget, so finish napi and
+	 * re-enable data availability notifications
+	 */
+	napi_complete_done(napi, cleaned);
+	do {
+		err = dpaa2_io_service_rearm(NULL, &fq->nctx);
+		cpu_relax();
+	} while (err == -EBUSY && retries++ < DPAA2_SWITCH_SWP_BUSY_RETRIES);
+
+	work_done = max(cleaned, 1);
+out:
+
+	return work_done;
+}
+
+static void dpaa2_switch_fqdan_cb(struct dpaa2_io_notification_ctx *nctx)
+{
+	struct dpaa2_switch_fq *fq;
+
+	fq = container_of(nctx, struct dpaa2_switch_fq, nctx);
+
+	napi_schedule_irqoff(&fq->napi);
+}
+
 static int dpaa2_switch_setup_dpio(struct ethsw_core *ethsw)
 {
 	struct dpsw_ctrl_if_queue_cfg queue_cfg;
@@ -1623,6 +2044,7 @@ static int dpaa2_switch_setup_dpio(struct ethsw_core *ethsw)
 		nctx->is_cdan = 0;
 		nctx->id = ethsw->fq[i].fqid;
 		nctx->desired_cpu = DPAA2_IO_ANY_CPU;
+		nctx->cb = dpaa2_switch_fqdan_cb;
 		err = dpaa2_io_service_register(NULL, nctx, ethsw->dev);
 		if (err) {
 			err = -EPROBE_DEFER;
@@ -1679,10 +2101,14 @@ static int dpaa2_switch_ctrl_if_setup(struct ethsw_core *ethsw)
 	if (err)
 		return err;
 
-	err = dpaa2_switch_alloc_rings(ethsw);
+	err = dpaa2_switch_seed_bp(ethsw);
 	if (err)
 		goto err_free_dpbp;
 
+	err = dpaa2_switch_alloc_rings(ethsw);
+	if (err)
+		goto err_drain_dpbp;
+
 	err = dpaa2_switch_setup_dpio(ethsw);
 	if (err)
 		goto err_destroy_rings;
@@ -1691,6 +2117,8 @@ static int dpaa2_switch_ctrl_if_setup(struct ethsw_core *ethsw)
 
 err_destroy_rings:
 	dpaa2_switch_destroy_rings(ethsw);
+err_drain_dpbp:
+	dpaa2_switch_drain_bp(ethsw);
 err_free_dpbp:
 	dpaa2_switch_free_dpbp(ethsw);
 
@@ -1879,6 +2307,7 @@ static void dpaa2_switch_ctrl_if_teardown(struct ethsw_core *ethsw)
 {
 	dpaa2_switch_free_dpio(ethsw);
 	dpaa2_switch_destroy_rings(ethsw);
+	dpaa2_switch_drain_bp(ethsw);
 	dpaa2_switch_free_dpbp(ethsw);
 }
 
@@ -1988,6 +2417,7 @@ static int dpaa2_switch_probe(struct fsl_mc_device *sw_dev)
 		return -ENOMEM;
 
 	ethsw->dev = dev;
+	ethsw->iommu_domain = iommu_get_domain_for_dev(dev);
 	dev_set_drvdata(dev, ethsw);
 
 	err = fsl_mc_portal_allocate(sw_dev, FSL_MC_IO_ATOMIC_CONTEXT_PORTAL,
@@ -2023,6 +2453,17 @@ static int dpaa2_switch_probe(struct fsl_mc_device *sw_dev)
 			goto err_free_ports;
 	}
 
+	/* Add a NAPI instance for each of the Rx queues. The first port's
+	 * net_device will be associated with the instances since we do not have
+	 * different queues for each switch port.
+	 */
+	if (dpaa2_switch_has_ctrl_if(ethsw)) {
+		for (i = 0; i < DPAA2_SWITCH_RX_NUM_FQS; i++)
+			netif_napi_add(ethsw->ports[0]->netdev,
+				       &ethsw->fq[i].napi, dpaa2_switch_poll,
+				       NAPI_POLL_WEIGHT);
+	}
+
 	err = dpsw_enable(ethsw->mc_io, 0, ethsw->dpsw_handle);
 	if (err) {
 		dev_err(ethsw->dev, "dpsw_enable err %d\n", err);
diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
index c78ee5409b8a..bd24be2c6308 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
@@ -53,15 +53,30 @@
 
 #define DPAA2_SWITCH_STORE_SIZE 16
 
+/* Buffer management */
+#define BUFS_PER_CMD			7
+#define DPAA2_ETHSW_NUM_BUFS		(1024 * BUFS_PER_CMD)
+#define DPAA2_ETHSW_REFILL_THRESH	(DPAA2_ETHSW_NUM_BUFS * 5 / 6)
+
+/* Number of times to retry DPIO portal operations while waiting
+ * for portal to finish executing current command and become
+ * available. We want to avoid being stuck in a while loop in case
+ * hardware becomes unresponsive, but not give up too easily if
+ * the portal really is busy for valid reasons
+ */
+#define DPAA2_SWITCH_SWP_BUSY_RETRIES		1000
+
 extern const struct ethtool_ops dpaa2_switch_port_ethtool_ops;
 
 struct ethsw_core;
 
 struct dpaa2_switch_fq {
+	void (*consume)(struct dpaa2_switch_fq *fq, const struct dpaa2_fd *fd);
 	struct ethsw_core *ethsw;
 	enum dpsw_queue_type type;
 	struct dpaa2_io_store *store;
 	struct dpaa2_io_notification_ctx nctx;
+	struct napi_struct napi;
 	u32 fqid;
 };
 
@@ -89,6 +104,7 @@ struct ethsw_core {
 	unsigned long			features;
 	int				dev_id;
 	struct ethsw_port_priv		**ports;
+	struct iommu_domain		*iommu_domain;
 
 	u8				vlans[VLAN_VID_MASK + 1];
 	bool				learning;
@@ -100,7 +116,9 @@ struct ethsw_core {
 
 	struct dpaa2_switch_fq		fq[DPAA2_SWITCH_RX_NUM_FQS];
 	struct fsl_mc_device		*dpbp_dev;
+	int				buf_count;
 	u16				bpid;
+	int				napi_users;
 };
 
 static inline bool dpaa2_switch_has_ctrl_if(struct ethsw_core *ethsw)
-- 
2.28.0



* [RFC 6/9] staging: dpaa2-switch: add .ndo_start_xmit() callback
  2020-11-04 16:57 [RFC 0/9] staging: dpaa2-switch: add support for CPU terminated traffic Ioana Ciornei
                   ` (4 preceding siblings ...)
  2020-11-04 16:57 ` [RFC 5/9] staging: dpaa2-switch: handle Rx path on control interface Ioana Ciornei
@ 2020-11-04 16:57 ` Ioana Ciornei
  2020-11-04 21:27   ` Vladimir Oltean
  2020-11-05  1:04   ` Andrew Lunn
  2020-11-04 16:57 ` [RFC 7/9] staging: dpaa2-switch: enable the control interface Ioana Ciornei
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 20+ messages in thread
From: Ioana Ciornei @ 2020-11-04 16:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Andrew Lunn; +Cc: linux-kernel, netdev, Ioana Ciornei

From: Ioana Ciornei <ioana.ciornei@nxp.com>

Implement the .ndo_start_xmit() callback for the switch port interfaces.
For each of the switch ports, gather the corresponding queue
destination ID (QDID) necessary for Tx enqueueing.

We reserve 64 bytes for software annotations, in which we keep an skb
backpointer used on the Tx confirmation side for releasing the
allocated memory. At the moment, we only support linear skbs. See the
sketch below.
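
A condensed sketch of the backpointer scheme (the confirmation side is
the dpaa2_switch_free_fd() helper from the previous patch):

  /* Tx: stash the skb pointer at the start of the hardware buffer,
   * inside the 64-byte software annotation area
   */
  skbh = (struct sk_buff **)buff_start;
  *skbh = skb;

  /* Tx conf: recover the skb from the FD address and release it */
  skbh = dpaa2_iova_to_virt(ethsw->iommu_domain, dpaa2_fd_get_addr(fd));
  skb = *skbh;
  dev_kfree_skb(skb);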

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
 drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h |  24 ++++
 drivers/staging/fsl-dpaa2/ethsw/dpsw.c     |  41 ++++++
 drivers/staging/fsl-dpaa2/ethsw/dpsw.h     |  28 ++++
 drivers/staging/fsl-dpaa2/ethsw/ethsw.c    | 146 ++++++++++++++++++---
 drivers/staging/fsl-dpaa2/ethsw/ethsw.h    |  14 ++
 5 files changed, 238 insertions(+), 15 deletions(-)

diff --git a/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h b/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h
index 9caff864ae3e..e239747f8a54 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h
@@ -45,6 +45,8 @@
 #define DPSW_CMDID_IF_ENABLE                DPSW_CMD_ID(0x03D)
 #define DPSW_CMDID_IF_DISABLE               DPSW_CMD_ID(0x03E)
 
+#define DPSW_CMDID_IF_GET_ATTR              DPSW_CMD_ID(0x042)
+
 #define DPSW_CMDID_IF_SET_MAX_FRAME_LENGTH  DPSW_CMD_ID(0x044)
 
 #define DPSW_CMDID_IF_GET_LINK_STATE        DPSW_CMD_ID(0x046)
@@ -258,6 +260,28 @@ struct dpsw_cmd_if {
 	__le16 if_id;
 };
 
+#define DPSW_ADMIT_UNTAGGED_SHIFT	0
+#define DPSW_ADMIT_UNTAGGED_SIZE	4
+#define DPSW_ENABLED_SHIFT		5
+#define DPSW_ENABLED_SIZE		1
+#define DPSW_ACCEPT_ALL_VLAN_SHIFT	6
+#define DPSW_ACCEPT_ALL_VLAN_SIZE	1
+
+struct dpsw_rsp_if_get_attr {
+	/* cmd word 0 */
+	/* from LSB: admit_untagged:4 enabled:1 accept_all_vlan:1 */
+	u8 conf;
+	u8 pad1;
+	u8 num_tcs;
+	u8 pad2;
+	__le16 qdid;
+	/* cmd word 1 */
+	__le32 options;
+	__le32 pad3;
+	/* cmd word 2 */
+	__le32 rate;
+};
+
 struct dpsw_cmd_if_set_max_frame_length {
 	__le16 if_id;
 	__le16 frame_length;
diff --git a/drivers/staging/fsl-dpaa2/ethsw/dpsw.c b/drivers/staging/fsl-dpaa2/ethsw/dpsw.c
index 785140d4652c..2a1967754913 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/dpsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/dpsw.c
@@ -704,6 +704,47 @@ int dpsw_if_disable(struct fsl_mc_io *mc_io,
 	return mc_send_command(mc_io, &cmd);
 }
 
+/**
+ * dpsw_if_get_attributes() - Function obtains attributes of interface
+ * @mc_io:	Pointer to MC portal's I/O object
+ * @cmd_flags:	Command flags; one or more of 'MC_CMD_FLAG_'
+ * @token:	Token of DPSW object
+ * @if_id:	Interface Identifier
+ * @attr:	Returned interface attributes
+ *
+ * Return:	Completion status. '0' on Success; Error code otherwise.
+ */
+int dpsw_if_get_attributes(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token,
+			   u16 if_id, struct dpsw_if_attr *attr)
+{
+	struct dpsw_rsp_if_get_attr *rsp_params;
+	struct fsl_mc_command cmd = { 0 };
+	struct dpsw_cmd_if *cmd_params;
+	int err;
+
+	cmd.header = mc_encode_cmd_header(DPSW_CMDID_IF_GET_ATTR, cmd_flags,
+					  token);
+	cmd_params = (struct dpsw_cmd_if *)cmd.params;
+	cmd_params->if_id = cpu_to_le16(if_id);
+
+	err = mc_send_command(mc_io, &cmd);
+	if (err)
+		return err;
+
+	rsp_params = (struct dpsw_rsp_if_get_attr *)cmd.params;
+	attr->num_tcs = rsp_params->num_tcs;
+	attr->rate = le32_to_cpu(rsp_params->rate);
+	attr->options = le32_to_cpu(rsp_params->options);
+	attr->qdid = le16_to_cpu(rsp_params->qdid);
+	attr->enabled = dpsw_get_field(rsp_params->conf, ENABLED);
+	attr->accept_all_vlan = dpsw_get_field(rsp_params->conf,
+					       ACCEPT_ALL_VLAN);
+	attr->admit_untagged = dpsw_get_field(rsp_params->conf,
+					      ADMIT_UNTAGGED);
+
+	return 0;
+}
+
 /**
  * dpsw_if_set_max_frame_length() - Set Maximum Receive frame length.
  * @mc_io:		Pointer to MC portal's I/O object
diff --git a/drivers/staging/fsl-dpaa2/ethsw/dpsw.h b/drivers/staging/fsl-dpaa2/ethsw/dpsw.h
index 718242ea7d87..a6ed6860946c 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/dpsw.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/dpsw.h
@@ -440,6 +440,34 @@ int dpsw_if_disable(struct fsl_mc_io *mc_io,
 		    u16 token,
 		    u16 if_id);
 
+/**
+ * struct dpsw_if_attr - Structure representing DPSW interface attributes
+ * @num_tcs: Number of traffic classes
+ * @rate: Transmit rate in bits per second
+ * @options: Interface configuration options (bitmap)
+ * @enabled: Indicates if interface is enabled
+ * @accept_all_vlan: The device discards/accepts incoming frames
+ *		for VLANs that do not include this interface
+ * @admit_untagged: When set to 'DPSW_ADMIT_ONLY_VLAN_TAGGED', the device
+ *		discards untagged frames or priority-tagged frames received on
+ *		this interface;
+ *		When set to 'DPSW_ADMIT_ALL', untagged frames or priority-
+ *		tagged frames received on this interface are accepted
+ * @qdid: control frames transmit qdid
+ */
+struct dpsw_if_attr {
+	u8 num_tcs;
+	u32 rate;
+	u32 options;
+	int enabled;
+	int accept_all_vlan;
+	enum dpsw_accepted_frames admit_untagged;
+	u16 qdid;
+};
+
+int dpsw_if_get_attributes(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token,
+			   u16 if_id, struct dpsw_if_attr *attr);
+
 int dpsw_if_set_max_frame_length(struct fsl_mc_io *mc_io,
 				 u32 cmd_flags,
 				 u16 token,
diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
index d09aa4a5126a..7805f0e58ec2 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
@@ -477,6 +477,7 @@ static int dpaa2_switch_port_change_mtu(struct net_device *netdev, int mtu)
 static int dpaa2_switch_port_carrier_state_sync(struct net_device *netdev)
 {
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
+	struct ethsw_core *ethsw = port_priv->ethsw_data;
 	struct dpsw_link_state state;
 	int err;
 
@@ -497,10 +498,15 @@ static int dpaa2_switch_port_carrier_state_sync(struct net_device *netdev)
 	WARN_ONCE(state.up > 1, "Garbage read into link_state");
 
 	if (state.up != port_priv->link_state) {
-		if (state.up)
+		if (state.up) {
 			netif_carrier_on(netdev);
-		else
+			if (dpaa2_switch_has_ctrl_if(ethsw))
+				netif_tx_start_all_queues(netdev);
+		} else {
 			netif_carrier_off(netdev);
+			if (dpaa2_switch_has_ctrl_if(ethsw))
+				netif_tx_stop_all_queues(netdev);
+		}
 		port_priv->link_state = state.up;
 	}
 
@@ -554,8 +560,10 @@ static int dpaa2_switch_port_open(struct net_device *netdev)
 	struct ethsw_core *ethsw = port_priv->ethsw_data;
 	int err;
 
-	/* No need to allow Tx as control interface is disabled */
-	netif_tx_stop_all_queues(netdev);
+	if (!dpaa2_switch_has_ctrl_if(port_priv->ethsw_data)) {
+		/* No need to allow Tx as control interface is disabled */
+		netif_tx_stop_all_queues(netdev);
+	}
 
 	/* Explicitly set carrier off, otherwise
 	 * netif_carrier_ok() will return true and cause 'ip link show'
@@ -610,15 +618,6 @@ static int dpaa2_switch_port_stop(struct net_device *netdev)
 	return 0;
 }
 
-static netdev_tx_t dpaa2_switch_port_dropframe(struct sk_buff *skb,
-					       struct net_device *netdev)
-{
-	/* we don't support I/O for now, drop the frame */
-	dev_kfree_skb_any(skb);
-
-	return NETDEV_TX_OK;
-}
-
 static int dpaa2_switch_port_parent_id(struct net_device *dev,
 				       struct netdev_phys_item_id *ppid)
 {
@@ -835,6 +834,114 @@ static void dpaa2_switch_free_fd(const struct ethsw_core *ethsw,
 	dev_kfree_skb(skb);
 }
 
+static int dpaa2_switch_build_single_fd(struct ethsw_core *ethsw,
+					struct sk_buff *skb,
+					struct dpaa2_fd *fd)
+{
+	struct device *dev = ethsw->dev;
+	struct sk_buff **skbh;
+	dma_addr_t addr;
+	u8 *buff_start;
+	void *hwa;
+
+	buff_start = PTR_ALIGN(skb->data - DPAA2_SWITCH_TX_DATA_OFFSET -
+			       DPAA2_SWITCH_TX_BUF_ALIGN,
+			       DPAA2_SWITCH_TX_BUF_ALIGN);
+
+	/* Clear FAS to have consistent values for TX confirmation. It is
+	 * located in the first 8 bytes of the buffer's hardware annotation
+	 * area
+	 */
+	hwa = buff_start + DPAA2_SWITCH_SWA_SIZE;
+	memset(hwa, 0, 8);
+
+	/* Store a backpointer to the skb at the beginning of the buffer
+	 * (in the private data area) such that we can release it
+	 * on Tx confirm
+	 */
+	skbh = (struct sk_buff **)buff_start;
+	*skbh = skb;
+
+	addr = dma_map_single(dev, buff_start,
+			      skb_tail_pointer(skb) - buff_start,
+			      DMA_TO_DEVICE);
+	if (unlikely(dma_mapping_error(dev, addr)))
+		return -ENOMEM;
+
+	/* Setup the FD fields */
+	memset(fd, 0, sizeof(*fd));
+
+	dpaa2_fd_set_addr(fd, addr);
+	dpaa2_fd_set_offset(fd, (u16)(skb->data - buff_start));
+	dpaa2_fd_set_len(fd, skb->len);
+	dpaa2_fd_set_format(fd, dpaa2_fd_single);
+
+	return 0;
+}
+
+static netdev_tx_t dpaa2_switch_port_tx(struct sk_buff *skb,
+					struct net_device *net_dev)
+{
+	struct ethsw_port_priv *port_priv = netdev_priv(net_dev);
+	struct ethsw_core *ethsw = port_priv->ethsw_data;
+	int retries = DPAA2_SWITCH_SWP_BUSY_RETRIES;
+	struct dpaa2_fd fd;
+	int err;
+
+	if (!dpaa2_switch_has_ctrl_if(ethsw))
+		goto err_free_skb;
+
+	if (unlikely(skb_headroom(skb) < DPAA2_SWITCH_NEEDED_HEADROOM)) {
+		struct sk_buff *ns;
+
+		ns = skb_realloc_headroom(skb, DPAA2_SWITCH_NEEDED_HEADROOM);
+		if (unlikely(!ns)) {
+			netdev_err(net_dev, "Error reallocating skb headroom\n");
+			goto err_free_skb;
+		}
+		dev_kfree_skb(skb);
+		skb = ns;
+	}
+
+	/* We'll be holding a back-reference to the skb until Tx confirmation */
+	skb = skb_unshare(skb, GFP_ATOMIC);
+	if (unlikely(!skb)) {
+		/* skb_unshare() has already freed the skb */
+		netdev_err(net_dev, "Error copying the socket buffer\n");
+		goto err_exit;
+	}
+
+	if (skb_is_nonlinear(skb)) {
+		netdev_err(net_dev, "No support for non-linear SKBs!\n");
+		goto err_free_skb;
+	}
+
+	err = dpaa2_switch_build_single_fd(ethsw, skb, &fd);
+	if (unlikely(err)) {
+		netdev_err(net_dev, "ethsw_build_*_fd() %d\n", err);
+		goto err_free_skb;
+	}
+
+	do {
+		err = dpaa2_io_service_enqueue_qd(NULL,
+						  port_priv->tx_qdid,
+						  8, 0, &fd);
+		retries--;
+	} while (err == -EBUSY && retries);
+
+	if (unlikely(err < 0)) {
+		dpaa2_switch_free_fd(ethsw, &fd);
+		goto err_exit;
+	}
+
+	return NETDEV_TX_OK;
+
+err_free_skb:
+	dev_kfree_skb(skb);
+err_exit:
+	return NETDEV_TX_OK;
+}
+
 static const struct net_device_ops dpaa2_switch_port_ops = {
 	.ndo_open		= dpaa2_switch_port_open,
 	.ndo_stop		= dpaa2_switch_port_stop,
@@ -848,7 +955,7 @@ static const struct net_device_ops dpaa2_switch_port_ops = {
 	.ndo_fdb_del		= dpaa2_switch_port_fdb_del,
 	.ndo_fdb_dump		= dpaa2_switch_port_fdb_dump,
 
-	.ndo_start_xmit		= dpaa2_switch_port_dropframe,
+	.ndo_start_xmit		= dpaa2_switch_port_tx,
 	.ndo_get_port_parent_id	= dpaa2_switch_port_parent_id,
 	.ndo_get_phys_port_name = dpaa2_switch_port_get_phys_name,
 };
@@ -2237,6 +2344,7 @@ static int dpaa2_switch_port_init(struct ethsw_port_priv *port_priv, u16 port)
 {
 	struct net_device *netdev = port_priv->netdev;
 	struct ethsw_core *ethsw = port_priv->ethsw_data;
+	struct dpsw_if_attr dpsw_if_attr;
 	struct dpsw_vlan_if_cfg vcfg;
 	int err;
 
@@ -2263,7 +2371,15 @@ static int dpaa2_switch_port_init(struct ethsw_port_priv *port_priv, u16 port)
 	if (err)
 		netdev_err(netdev, "dpsw_vlan_remove_if err %d\n", err);
 
-	return err;
+	err = dpsw_if_get_attributes(ethsw->mc_io, 0, ethsw->dpsw_handle,
+				     port_priv->idx, &dpsw_if_attr);
+	if (err) {
+		netdev_err(netdev, "dpsw_if_get_attributes err %d\n", err);
+		return err;
+	}
+	port_priv->tx_qdid = dpsw_if_attr.qdid;
+
+	return 0;
 }
 
 static void dpaa2_switch_unregister_notifier(struct device *dev)
diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
index bd24be2c6308..b267c04e2008 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
@@ -66,6 +66,19 @@
  */
 #define DPAA2_SWITCH_SWP_BUSY_RETRIES		1000
 
+/* Hardware annotation buffer size */
+#define DPAA2_SWITCH_HWA_SIZE		64
+/* Software annotation buffer size */
+#define DPAA2_SWITCH_SWA_SIZE		64
+
+#define DPAA2_SWITCH_TX_BUF_ALIGN	64
+
+#define DPAA2_SWITCH_TX_DATA_OFFSET \
+	(DPAA2_SWITCH_HWA_SIZE + DPAA2_SWITCH_SWA_SIZE)
+
+#define DPAA2_SWITCH_NEEDED_HEADROOM \
+	(DPAA2_SWITCH_TX_DATA_OFFSET + DPAA2_SWITCH_TX_BUF_ALIGN)
+
 extern const struct ethtool_ops dpaa2_switch_port_ethtool_ops;
 
 struct ethsw_core;
@@ -92,6 +105,7 @@ struct ethsw_port_priv {
 	u8			vlans[VLAN_VID_MASK + 1];
 	u16			pvid;
 	struct net_device	*bridge_dev;
+	u16			tx_qdid;
 };
 
 /* Switch data */
-- 
2.28.0



* [RFC 7/9] staging: dpaa2-switch: enable the control interface
  2020-11-04 16:57 [RFC 0/9] staging: dpaa2-switch: add support for CPU terminated traffic Ioana Ciornei
                   ` (5 preceding siblings ...)
  2020-11-04 16:57 ` [RFC 6/9] staging: dpaa2-switch: add .ndo_start_xmit() callback Ioana Ciornei
@ 2020-11-04 16:57 ` Ioana Ciornei
  2020-11-04 16:57 ` [RFC 8/9] staging: dpaa2-switch: properly setup switching domains Ioana Ciornei
  2020-11-04 16:57 ` [RFC 9/9] staging: dpaa2-switch: accept only vlan-aware upper devices Ioana Ciornei
  8 siblings, 0 replies; 20+ messages in thread
From: Ioana Ciornei @ 2020-11-04 16:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Andrew Lunn; +Cc: linux-kernel, netdev, Ioana Ciornei

From: Ioana Ciornei <ioana.ciornei@nxp.com>

Enable the CTRL_IF of the switch object, now that all the pieces are in
place (buffer and queue management, interrupts, NAPI instances etc).

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
 drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h |  2 ++
 drivers/staging/fsl-dpaa2/ethsw/dpsw.c     | 37 ++++++++++++++++++++++
 drivers/staging/fsl-dpaa2/ethsw/dpsw.h     |  5 +++
 drivers/staging/fsl-dpaa2/ethsw/ethsw.c    |  9 ++++++
 4 files changed, 53 insertions(+)

diff --git a/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h b/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h
index e239747f8a54..c6eb9bffdd3e 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h
@@ -79,6 +79,8 @@
 
 #define DPSW_CMDID_CTRL_IF_GET_ATTR         DPSW_CMD_ID(0x0A0)
 #define DPSW_CMDID_CTRL_IF_SET_POOLS        DPSW_CMD_ID(0x0A1)
+#define DPSW_CMDID_CTRL_IF_ENABLE           DPSW_CMD_ID(0x0A2)
+#define DPSW_CMDID_CTRL_IF_DISABLE          DPSW_CMD_ID(0x0A3)
 #define DPSW_CMDID_CTRL_IF_SET_QUEUE        DPSW_CMD_ID(0x0A6)
 
 /* Macros for accessing command fields smaller than 1byte */
diff --git a/drivers/staging/fsl-dpaa2/ethsw/dpsw.c b/drivers/staging/fsl-dpaa2/ethsw/dpsw.c
index 2a1967754913..f5cfe61498b8 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/dpsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/dpsw.c
@@ -1456,3 +1456,40 @@ int dpsw_if_set_primary_mac_addr(struct fsl_mc_io *mc_io, u32 cmd_flags,
 	/* send command to mc*/
 	return mc_send_command(mc_io, &cmd);
 }
+
+/**
+ * dpsw_ctrl_if_enable() - Enable control interface
+ * @mc_io:	Pointer to MC portal's I/O object
+ * @cmd_flags:	Command flags; one or more of 'MC_CMD_FLAG_'
+ * @token:	Token of DPSW object
+ *
+ * Return:	'0' on Success; Error code otherwise.
+ */
+int dpsw_ctrl_if_enable(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token)
+{
+	struct fsl_mc_command cmd = { 0 };
+
+	cmd.header = mc_encode_cmd_header(DPSW_CMDID_CTRL_IF_ENABLE, cmd_flags,
+					  token);
+
+	return mc_send_command(mc_io, &cmd);
+}
+
+/**
+ * dpsw_ctrl_if_disable() - Function disables control interface
+ * @mc_io:	Pointer to MC portal's I/O object
+ * @cmd_flags:	Command flags; one or more of 'MC_CMD_FLAG_'
+ * @token:	Token of DPSW object
+ *
+ * Return:	'0' on Success; Error code otherwise.
+ */
+int dpsw_ctrl_if_disable(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token)
+{
+	struct fsl_mc_command cmd = { 0 };
+
+	cmd.header = mc_encode_cmd_header(DPSW_CMDID_CTRL_IF_DISABLE,
+					  cmd_flags,
+					  token);
+
+	return mc_send_command(mc_io, &cmd);
+}
diff --git a/drivers/staging/fsl-dpaa2/ethsw/dpsw.h b/drivers/staging/fsl-dpaa2/ethsw/dpsw.h
index a6ed6860946c..9e30adad4ffa 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/dpsw.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/dpsw.h
@@ -245,6 +245,11 @@ struct dpsw_ctrl_if_queue_cfg {
 int dpsw_ctrl_if_set_queue(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token,
 			   enum dpsw_queue_type qtype,
 			   const struct dpsw_ctrl_if_queue_cfg *cfg);
+
+int dpsw_ctrl_if_enable(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token);
+
+int dpsw_ctrl_if_disable(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token);
+
 /**
  * enum dpsw_action - Action selection for special/control frames
  * @DPSW_ACTION_DROP: Drop frame
diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
index 7805f0e58ec2..24bdac6d6005 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
@@ -2220,8 +2220,16 @@ static int dpaa2_switch_ctrl_if_setup(struct ethsw_core *ethsw)
 	if (err)
 		goto err_destroy_rings;
 
+	err = dpsw_ctrl_if_enable(ethsw->mc_io, 0, ethsw->dpsw_handle);
+	if (err) {
+		dev_err(ethsw->dev, "dpsw_ctrl_if_enable err %d\n", err);
+		goto err_deregister_dpio;
+	}
+
 	return 0;
 
+err_deregister_dpio:
+	dpaa2_switch_free_dpio(ethsw);
 err_destroy_rings:
 	dpaa2_switch_destroy_rings(ethsw);
 err_drain_dpbp:
@@ -2421,6 +2429,7 @@ static void dpaa2_switch_takedown(struct fsl_mc_device *sw_dev)
 
 static void dpaa2_switch_ctrl_if_teardown(struct ethsw_core *ethsw)
 {
+	dpsw_ctrl_if_disable(ethsw->mc_io, 0, ethsw->dpsw_handle);
 	dpaa2_switch_free_dpio(ethsw);
 	dpaa2_switch_destroy_rings(ethsw);
 	dpaa2_switch_drain_bp(ethsw);
-- 
2.28.0



* [RFC 8/9] staging: dpaa2-switch: properly setup switching domains
  2020-11-04 16:57 [RFC 0/9] staging: dpaa2-switch: add support for CPU terminated traffic Ioana Ciornei
                   ` (6 preceding siblings ...)
  2020-11-04 16:57 ` [RFC 7/9] staging: dpaa2-switch: enable the control interface Ioana Ciornei
@ 2020-11-04 16:57 ` Ioana Ciornei
  2020-11-04 22:08   ` Vladimir Oltean
  2020-11-04 16:57 ` [RFC 9/9] staging: dpaa2-switch: accept only vlan-aware upper devices Ioana Ciornei
  8 siblings, 1 reply; 20+ messages in thread
From: Ioana Ciornei @ 2020-11-04 16:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Andrew Lunn; +Cc: linux-kernel, netdev, Ioana Ciornei

From: Ioana Ciornei <ioana.ciornei@nxp.com>

Until now, the DPAA2 switch was not capable to properly setup it's
switching domains depending on the existence, or lack thereof, of an
upper bridge device. This meant that all switch ports of a DPSW object
were switching by default even though they were not under the same
bridge device.

Another issue was the inability to actually add the CPU to the flooding
domains (broadcast, unknown unicast etc) of a particular switch port.
This meant that a simple ping on a switch interface was not possible
since no broadcast ARP frame would actually reach the CPU queues.

This patch tries to fix exactly these problems by:

* Creating and managing a FDB table for each flooding domain. This means
  that when a switch interface is not bridged it will use it's own FDB
  table. While in bridged mode all DPAA2 switch interfaces under the
  same upper will use the same FDB table, thus leverage the same FDB
  entries.

* Adding a new MC firmware command - dpsw_set_egress_flood() - through
  which the driver can setup the flooding domains as needed. For
  example, when the switch interface is standalone, thus not in a
  bridge with any other DPAA2 switch port, it will setup it's broadcast
  and unknown unicast flooding domains to only include the control
  interface (the queues that reach the CPU and the driver can dequeue
  from). This flooding domain changes when the interface joins a bridge
  and is configured to include, beside the control interface, all other
  DPAA2 switch interfaces.
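
For example, for a standalone switch interface the broadcast flooding
domain boils down to something like this (a minimal sketch using the
names from this patch; the control interface is addressed in the if_id
bitmap by the index right after the last switch port):

	struct dpsw_egress_flood_cfg cfg = {
		.fdb_id     = port_priv->fdb_id,
		.flood_type = DPSW_BROADCAST_FLOOD,
		.num_ifs    = 2,
		.if_id      = { port_priv->idx, ethsw->sw_attr.num_ifs },
	};

	err = dpsw_set_egress_flood(ethsw->mc_io, 0, ethsw->dpsw_handle, &cfg);

with the same configuration then resubmitted for DPSW_UNICAST_FLOOD.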

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
 drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h |  25 ++-
 drivers/staging/fsl-dpaa2/ethsw/dpsw.c     |  96 +++++++++
 drivers/staging/fsl-dpaa2/ethsw/dpsw.h     |  41 ++++
 drivers/staging/fsl-dpaa2/ethsw/ethsw.c    | 220 +++++++++++++++++----
 drivers/staging/fsl-dpaa2/ethsw/ethsw.h    |   6 +-
 5 files changed, 345 insertions(+), 43 deletions(-)

diff --git a/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h b/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h
index c6eb9bffdd3e..9b068b07bbcd 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/dpsw-cmd.h
@@ -15,9 +15,11 @@
 #define DPSW_VER_MINOR		4
 
 #define DPSW_CMD_BASE_VERSION	1
+#define DPSW_CMD_2ND_VERSION	2
 #define DPSW_CMD_ID_OFFSET	4
 
 #define DPSW_CMD_ID(id)	(((id) << DPSW_CMD_ID_OFFSET) | DPSW_CMD_BASE_VERSION)
+#define DPSW_CMD_V2(id) (((id) << DPSW_CMD_ID_OFFSET) | DPSW_CMD_2ND_VERSION)
 
 /* Command IDs */
 #define DPSW_CMDID_CLOSE                    DPSW_CMD_ID(0x800)
@@ -58,7 +60,7 @@
 #define DPSW_CMDID_IF_SET_LINK_CFG          DPSW_CMD_ID(0x04C)
 
 #define DPSW_CMDID_VLAN_ADD                 DPSW_CMD_ID(0x060)
-#define DPSW_CMDID_VLAN_ADD_IF              DPSW_CMD_ID(0x061)
+#define DPSW_CMDID_VLAN_ADD_IF              DPSW_CMD_V2(0x061)
 #define DPSW_CMDID_VLAN_ADD_IF_UNTAGGED     DPSW_CMD_ID(0x062)
 
 #define DPSW_CMDID_VLAN_REMOVE_IF           DPSW_CMD_ID(0x064)
@@ -66,6 +68,8 @@
 #define DPSW_CMDID_VLAN_REMOVE_IF_FLOODING  DPSW_CMD_ID(0x066)
 #define DPSW_CMDID_VLAN_REMOVE              DPSW_CMD_ID(0x067)
 
+#define DPSW_CMDID_FDB_ADD                  DPSW_CMD_ID(0x082)
+#define DPSW_CMDID_FDB_REMOVE               DPSW_CMD_ID(0x083)
 #define DPSW_CMDID_FDB_ADD_UNICAST          DPSW_CMD_ID(0x084)
 #define DPSW_CMDID_FDB_REMOVE_UNICAST       DPSW_CMD_ID(0x085)
 #define DPSW_CMDID_FDB_ADD_MULTICAST        DPSW_CMD_ID(0x086)
@@ -83,6 +87,8 @@
 #define DPSW_CMDID_CTRL_IF_DISABLE          DPSW_CMD_ID(0x0A3)
 #define DPSW_CMDID_CTRL_IF_SET_QUEUE        DPSW_CMD_ID(0x0A6)
 
+#define DPSW_CMDID_SET_EGRESS_FLOOD         DPSW_CMD_ID(0x0AC)
+
 /* Macros for accessing command fields smaller than 1byte */
 #define DPSW_MASK(field)        \
 	GENMASK(DPSW_##field##_SHIFT + DPSW_##field##_SIZE - 1, \
@@ -324,6 +330,16 @@ struct dpsw_vlan_add {
 	__le16 vlan_id;
 };
 
+struct dpsw_cmd_vlan_add_if {
+	/* cmd word 0 */
+	__le16 options;
+	__le16 vlan_id;
+	__le16 fdb_id;
+	__le16 pad0;
+	/* cmd word 1-4 */
+	__le64 if_id;
+};
+
 struct dpsw_cmd_vlan_manage_if {
 	/* cmd word 0 */
 	__le16 pad0;
@@ -445,4 +461,11 @@ struct dpsw_cmd_if_set_mac_addr {
 	u8 mac_addr[6];
 };
 
+struct dpsw_cmd_set_egress_flood {
+	__le16 fdb_id;
+	u8 flood_type;
+	u8 pad[5];
+	__le64 if_id;
+};
+
 #endif /* __FSL_DPSW_CMD_H */
diff --git a/drivers/staging/fsl-dpaa2/ethsw/dpsw.c b/drivers/staging/fsl-dpaa2/ethsw/dpsw.c
index f5cfe61498b8..936984e7af50 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/dpsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/dpsw.c
@@ -986,6 +986,76 @@ int dpsw_vlan_remove(struct fsl_mc_io *mc_io,
 	return mc_send_command(mc_io, &cmd);
 }
 
+/**
+ * dpsw_fdb_add() - Add an FDB table to the switch and return a handle
+ *		to it for later reference
+ * @mc_io:	Pointer to MC portal's I/O object
+ * @cmd_flags:	Command flags; one or more of 'MC_CMD_FLAG_'
+ * @token:	Token of DPSW object
+ * @fdb_id:	Returned Forwarding Database Identifier
+ * @cfg:	FDB Configuration
+ *
+ * Return:	Completion status. '0' on Success; Error code otherwise.
+ */
+int dpsw_fdb_add(struct fsl_mc_io *mc_io,
+		 u32 cmd_flags,
+		 u16 token,
+		 u16 *fdb_id,
+		 const struct dpsw_fdb_cfg *cfg)
+{
+	struct dpsw_cmd_fdb_add *cmd_params;
+	struct dpsw_rsp_fdb_add *rsp_params;
+	struct fsl_mc_command cmd = { 0 };
+	int err;
+
+	/* prepare command */
+	cmd.header = mc_encode_cmd_header(DPSW_CMDID_FDB_ADD,
+					  cmd_flags,
+					  token);
+	cmd_params = (struct dpsw_cmd_fdb_add *)cmd.params;
+	cmd_params->fdb_aging_time = cpu_to_le16(cfg->fdb_aging_time);
+	cmd_params->num_fdb_entries = cpu_to_le16(cfg->num_fdb_entries);
+
+	/* send command to mc*/
+	err = mc_send_command(mc_io, &cmd);
+	if (err)
+		return err;
+
+	/* retrieve response parameters */
+	rsp_params = (struct dpsw_rsp_fdb_add *)cmd.params;
+	*fdb_id = le16_to_cpu(rsp_params->fdb_id);
+
+	return 0;
+}
+
+/**
+ * dpsw_fdb_remove() - Remove FDB from switch
+ * @mc_io:	Pointer to MC portal's I/O object
+ * @cmd_flags:	Command flags; one or more of 'MC_CMD_FLAG_'
+ * @token:	Token of DPSW object
+ * @fdb_id:	Forwarding Database Identifier
+ *
+ * Return:	Completion status. '0' on Success; Error code otherwise.
+ */
+int dpsw_fdb_remove(struct fsl_mc_io *mc_io,
+		    u32 cmd_flags,
+		    u16 token,
+		    u16 fdb_id)
+{
+	struct dpsw_cmd_fdb_remove *cmd_params;
+	struct fsl_mc_command cmd = { 0 };
+
+	/* prepare command */
+	cmd.header = mc_encode_cmd_header(DPSW_CMDID_FDB_REMOVE,
+					  cmd_flags,
+					  token);
+	cmd_params = (struct dpsw_cmd_fdb_remove *)cmd.params;
+	cmd_params->fdb_id = cpu_to_le16(fdb_id);
+
+	/* send command to mc*/
+	return mc_send_command(mc_io, &cmd);
+}
+
 /**
  * dpsw_fdb_add_unicast() - Function adds an unicast entry into MAC lookup table
  * @mc_io:	Pointer to MC portal's I/O object
@@ -1493,3 +1563,29 @@ int dpsw_ctrl_if_disable(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token)
 
 	return mc_send_command(mc_io, &cmd);
 }
+
+/**
+ * dpsw_set_egress_flood() - Set egress parameters associated with an FDB ID
+ * @mc_io:	Pointer to MC portal's I/O object
+ * @cmd_flags:	Command flags; one or more of 'MC_CMD_FLAG_'
+ * @token:	Token of DPSW object
+ * @cfg:	Egress flooding configuration
+ *
+ * Return:	'0' on Success; Error code otherwise.
+ */
+int dpsw_set_egress_flood(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token,
+			  const struct dpsw_egress_flood_cfg *cfg)
+{
+	struct dpsw_cmd_set_egress_flood *cmd_params;
+	struct fsl_mc_command cmd = { 0 };
+
+	cmd.header = mc_encode_cmd_header(DPSW_CMDID_SET_EGRESS_FLOOD,
+					  cmd_flags,
+					  token);
+	cmd_params = (struct dpsw_cmd_set_egress_flood *)cmd.params;
+	cmd_params->fdb_id = cpu_to_le16(cfg->fdb_id);
+	cmd_params->flood_type = cfg->flood_type;
+	build_if_id_bitmap(&cmd_params->if_id, cfg->if_id, cfg->num_ifs);
+
+	return mc_send_command(mc_io, &cmd);
+}
diff --git a/drivers/staging/fsl-dpaa2/ethsw/dpsw.h b/drivers/staging/fsl-dpaa2/ethsw/dpsw.h
index 9e30adad4ffa..8fe76c7227ea 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/dpsw.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/dpsw.h
@@ -493,6 +493,8 @@ int dpsw_vlan_add(struct fsl_mc_io *mc_io,
 		  u16 vlan_id,
 		  const struct dpsw_vlan_cfg *cfg);
 
+#define DPSW_VLAN_ADD_IF_OPT_FDB_ID            0x0001
+
 /**
  * struct dpsw_vlan_if_cfg - Set of VLAN Interfaces
  * @num_ifs: The number of interfaces that are assigned to the egress
@@ -502,7 +504,9 @@ int dpsw_vlan_add(struct fsl_mc_io *mc_io,
  */
 struct dpsw_vlan_if_cfg {
 	u16 num_ifs;
+	u16 options;
 	u16 if_id[DPSW_MAX_IF];
+	u16 fdb_id;
 };
 
 int dpsw_vlan_add_if(struct fsl_mc_io *mc_io,
@@ -692,4 +696,41 @@ int dpsw_if_get_primary_mac_addr(struct fsl_mc_io *mc_io, u32 cmd_flags,
 int dpsw_if_set_primary_mac_addr(struct fsl_mc_io *mc_io, u32 cmd_flags,
 				 u16 token, u16 if_id, u8 mac_addr[6]);
 
+/**
+ * struct dpsw_fdb_cfg  - FDB Configuration
+ * @num_fdb_entries: Number of FDB entries
+ * @fdb_aging_time: Aging time in seconds
+ */
+struct dpsw_fdb_cfg {
+	u16 num_fdb_entries;
+	u16 fdb_aging_time;
+};
+
+int dpsw_fdb_add(struct fsl_mc_io *mc_io,
+		 u32 cmd_flags,
+		 u16 token,
+		 u16 *fdb_id,
+		 const struct dpsw_fdb_cfg *cfg);
+
+int dpsw_fdb_remove(struct fsl_mc_io *mc_io,
+		    u32 cmd_flags,
+		    u16 token,
+		    u16 fdb_id);
+
+enum dpsw_flood_type {
+	DPSW_BROADCAST_FLOOD = 0,
+	DPSW_UNICAST_FLOOD,
+	DPSW_MULTICAST_FLOOD,
+};
+
+struct dpsw_egress_flood_cfg {
+	u16 fdb_id;
+	enum dpsw_flood_type flood_type;
+	u16 num_ifs;
+	u16 if_id[DPSW_MAX_IF];
+};
+
+int dpsw_set_egress_flood(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token,
+			  const struct dpsw_egress_flood_cfg *cfg);
+
 #endif /* __FSL_DPSW_H */
diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
index 24bdac6d6005..7a0d9a178cdc 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
@@ -25,6 +25,36 @@
 
 #define DEFAULT_VLAN_ID			1
 
+static u16 dpaa2_switch_port_get_fdb_id(struct ethsw_port_priv *port_priv)
+{
+	struct ethsw_port_priv *other_port_priv = NULL;
+	struct net_device *other_dev;
+	struct list_head *iter;
+
+	/* If not part of a bridge, just use the private FDB */
+	if (!port_priv->bridge_dev)
+		return port_priv->fdb_id;
+
+	/* If part of a bridge, use the FDB of the first dpaa2 switch interface
+	 * to be present in that bridge
+	 */
+	netdev_for_each_lower_dev(port_priv->bridge_dev, other_dev, iter) {
+		if (!dpaa2_switch_port_dev_check(other_dev, NULL))
+			continue;
+
+		other_port_priv = netdev_priv(other_dev);
+		break;
+	}
+
+	/* We are the first dpaa2 switch interface to join the bridge, just use
+	 * our own FDB
+	 */
+	if (!other_port_priv)
+		other_port_priv = port_priv;
+
+	return other_port_priv->fdb_id;
+}
+
 static void *dpaa2_iova_to_virt(struct iommu_domain *domain,
 				dma_addr_t iova_addr)
 {
@@ -133,7 +163,7 @@ static int dpaa2_switch_port_add_vlan(struct ethsw_port_priv *port_priv,
 {
 	struct ethsw_core *ethsw = port_priv->ethsw_data;
 	struct net_device *netdev = port_priv->netdev;
-	struct dpsw_vlan_if_cfg vcfg;
+	struct dpsw_vlan_if_cfg vcfg = {0};
 	int err;
 
 	if (port_priv->vlans[vid]) {
@@ -141,8 +171,13 @@ static int dpaa2_switch_port_add_vlan(struct ethsw_port_priv *port_priv,
 		return -EEXIST;
 	}
 
+	/* If hit, this VLAN rule will lead the packet into the FDB table
+	 * specified in the vlan configuration below
+	 */
 	vcfg.num_ifs = 1;
 	vcfg.if_id[0] = port_priv->idx;
+	vcfg.fdb_id = dpaa2_switch_port_get_fdb_id(port_priv);
+	vcfg.options |= DPSW_VLAN_ADD_IF_OPT_FDB_ID;
 	err = dpsw_vlan_add_if(ethsw->mc_io, 0, ethsw->dpsw_handle, vid, &vcfg);
 	if (err) {
 		netdev_err(netdev, "dpsw_vlan_add_if err %d\n", err);
@@ -172,8 +207,10 @@ static int dpaa2_switch_port_add_vlan(struct ethsw_port_priv *port_priv,
 	return 0;
 }
 
-static int dpaa2_switch_set_learning(struct ethsw_core *ethsw, bool enable)
+static int dpaa2_switch_port_set_learning(struct ethsw_port_priv *port_priv, bool enable)
 {
+	u16 fdb_id = dpaa2_switch_port_get_fdb_id(port_priv);
+	struct ethsw_core *ethsw = port_priv->ethsw_data;
 	enum dpsw_fdb_learning_mode learn_mode;
 	int err;
 
@@ -182,13 +219,12 @@ static int dpaa2_switch_set_learning(struct ethsw_core *ethsw, bool enable)
 	else
 		learn_mode = DPSW_FDB_LEARNING_MODE_DIS;
 
-	err = dpsw_fdb_set_learning_mode(ethsw->mc_io, 0, ethsw->dpsw_handle, 0,
+	err = dpsw_fdb_set_learning_mode(ethsw->mc_io, 0, ethsw->dpsw_handle, fdb_id,
 					 learn_mode);
 	if (err) {
 		dev_err(ethsw->dev, "dpsw_fdb_set_learning_mode err %d\n", err);
 		return err;
 	}
-	ethsw->learning = enable;
 
 	return 0;
 }
@@ -267,15 +303,17 @@ static int dpaa2_switch_port_fdb_add_uc(struct ethsw_port_priv *port_priv,
 					const unsigned char *addr)
 {
 	struct dpsw_fdb_unicast_cfg entry = {0};
+	u16 fdb_id;
 	int err;
 
 	entry.if_egress = port_priv->idx;
 	entry.type = DPSW_FDB_ENTRY_STATIC;
 	ether_addr_copy(entry.mac_addr, addr);
 
+	fdb_id = dpaa2_switch_port_get_fdb_id(port_priv);
 	err = dpsw_fdb_add_unicast(port_priv->ethsw_data->mc_io, 0,
 				   port_priv->ethsw_data->dpsw_handle,
-				   0, &entry);
+				   fdb_id, &entry);
 	if (err)
 		netdev_err(port_priv->netdev,
 			   "dpsw_fdb_add_unicast err %d\n", err);
@@ -306,6 +344,7 @@ static int dpaa2_switch_port_fdb_add_mc(struct ethsw_port_priv *port_priv,
 					const unsigned char *addr)
 {
 	struct dpsw_fdb_multicast_cfg entry = {0};
+	u16 fdb_id;
 	int err;
 
 	ether_addr_copy(entry.mac_addr, addr);
@@ -313,9 +352,10 @@ static int dpaa2_switch_port_fdb_add_mc(struct ethsw_port_priv *port_priv,
 	entry.num_ifs = 1;
 	entry.if_id[0] = port_priv->idx;
 
+	fdb_id = dpaa2_switch_port_get_fdb_id(port_priv);
 	err = dpsw_fdb_add_multicast(port_priv->ethsw_data->mc_io, 0,
 				     port_priv->ethsw_data->dpsw_handle,
-				     0, &entry);
+				     fdb_id, &entry);
 	/* Silently discard error for calling multiple times the add command */
 	if (err && err != -ENXIO)
 		netdev_err(port_priv->netdev, "dpsw_fdb_add_multicast err %d\n",
@@ -723,6 +763,7 @@ static int dpaa2_switch_port_fdb_dump(struct sk_buff *skb, struct netlink_callba
 	u32 fdb_dump_size;
 	int err = 0, i;
 	u8 *dma_mem;
+	u16 fdb_id;
 
 	fdb_dump_size = ethsw->sw_attr.max_fdb_entries * sizeof(fdb_entry);
 	dma_mem = kzalloc(fdb_dump_size, GFP_KERNEL);
@@ -737,7 +778,8 @@ static int dpaa2_switch_port_fdb_dump(struct sk_buff *skb, struct netlink_callba
 		goto err_map;
 	}
 
-	err = dpsw_fdb_dump(ethsw->mc_io, 0, ethsw->dpsw_handle, 0,
+	fdb_id = dpaa2_switch_port_get_fdb_id(port_priv);
+	err = dpsw_fdb_dump(ethsw->mc_io, 0, ethsw->dpsw_handle, fdb_id,
 			    fdb_dump_iova, fdb_dump_size, &num_fdb_entries);
 	if (err) {
 		netdev_err(net_dev, "dpsw_fdb_dump() = %d\n", err);
@@ -960,8 +1002,8 @@ static const struct net_device_ops dpaa2_switch_port_ops = {
 	.ndo_get_phys_port_name = dpaa2_switch_port_get_phys_name,
 };
 
-static bool dpaa2_switch_port_dev_check(const struct net_device *netdev,
-					struct notifier_block *nb)
+bool dpaa2_switch_port_dev_check(const struct net_device *netdev,
+				 struct notifier_block *nb)
 {
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
 
@@ -1119,9 +1161,11 @@ static int dpaa2_switch_port_attr_br_flags_set(struct net_device *netdev,
 	if (switchdev_trans_ph_prepare(trans))
 		return 0;
 
-	/* Learning is enabled per switch */
-	err = dpaa2_switch_set_learning(port_priv->ethsw_data,
-					!!(flags & BR_LEARNING));
+	/* Learning is enabled per switching domain, thus all dpaa2 switch
+	 * interfaces under the same bridge will have their learning state
+	 * changed as well
+	 */
+	err = dpaa2_switch_port_set_learning(port_priv, !!(flags & BR_LEARNING));
 	if (err)
 		goto exit;
 
@@ -1403,24 +1447,69 @@ static int dpaa2_switch_port_attr_set_event(struct net_device *netdev,
 	return notifier_from_errno(err);
 }
 
-/* For the moment, only flood setting needs to be updated */
-static int dpaa2_switch_port_bridge_join(struct net_device *netdev,
-					 struct net_device *upper_dev)
+static int dpaa2_switch_port_set_egress_flood(struct ethsw_port_priv *port_priv)
 {
-	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
 	struct ethsw_core *ethsw = port_priv->ethsw_data;
+	struct net_device *netdev = port_priv->netdev;
 	struct ethsw_port_priv *other_port_priv;
+	struct dpsw_egress_flood_cfg flood_cfg;
 	struct net_device *other_dev;
 	struct list_head *iter;
-	int i, err;
+	int i = 0;
+	int err;
 
-	for (i = 0; i < ethsw->sw_attr.num_ifs; i++)
-		if (ethsw->ports[i]->bridge_dev &&
-		    (ethsw->ports[i]->bridge_dev != upper_dev)) {
-			netdev_err(netdev,
-				   "Only one bridge supported per DPSW object!\n");
-			return -EINVAL;
+	if (port_priv->bridge_dev) {
+		/* Add all the DPAA2 switch ports found under the same bridge to the
+		 * egress flooding domain
+		 */
+		netdev_for_each_lower_dev(port_priv->bridge_dev, other_dev, iter) {
+			if (!dpaa2_switch_port_dev_check(other_dev, NULL))
+				continue;
+
+			other_port_priv = netdev_priv(other_dev);
+			flood_cfg.if_id[i++] = other_port_priv->idx;
 		}
+	} else {
+		flood_cfg.if_id[i++] = port_priv->idx;
+	}
+
+	/* Add the CTRL interface to the egress flooding domain */
+	flood_cfg.if_id[i++] = ethsw->sw_attr.num_ifs;
+
+	/* Use the FDB of the first dpaa2 switch port added to the bridge */
+	flood_cfg.fdb_id = dpaa2_switch_port_get_fdb_id(port_priv);
+
+	/* Setup broadcast flooding domain */
+	flood_cfg.flood_type = DPSW_BROADCAST_FLOOD;
+	flood_cfg.num_ifs = i;
+	err = dpsw_set_egress_flood(ethsw->mc_io, 0, ethsw->dpsw_handle,
+				    &flood_cfg);
+	if (err) {
+		netdev_err(netdev, "dpsw_set_egress_flood() = %d\n", err);
+		return err;
+	}
+
+	/* Setup unknown unicast flooding domain */
+	flood_cfg.flood_type = DPSW_UNICAST_FLOOD;
+	flood_cfg.num_ifs = i;
+	err = dpsw_set_egress_flood(ethsw->mc_io, 0, ethsw->dpsw_handle,
+				    &flood_cfg);
+	if (err) {
+		netdev_err(netdev, "dpsw_set_egress_flood() = %d\n", err);
+		return err;
+	}
+
+	return 0;
+}
+
+static int dpaa2_switch_port_bridge_join(struct net_device *netdev,
+					 struct net_device *upper_dev)
+{
+	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
+	struct ethsw_port_priv *other_port_priv;
+	struct net_device *other_dev;
+	struct list_head *iter;
+	int err;
 
 	netdev_for_each_lower_dev(upper_dev, other_dev, iter) {
 		if (!dpaa2_switch_port_dev_check(other_dev, NULL))
@@ -1434,11 +1523,23 @@ static int dpaa2_switch_port_bridge_join(struct net_device *netdev,
 		}
 	}
 
-	/* Enable flooding */
-	err = dpaa2_switch_port_set_flood(port_priv, 1);
-	if (!err)
-		port_priv->bridge_dev = upper_dev;
+	/* Delete the previously manually installed VLAN 1 */
+	err = dpaa2_switch_port_del_vlan(port_priv, 1);
+	if (err)
+		return err;
+
+	/* Keep track of the upper bridge device */
+	port_priv->bridge_dev = upper_dev;
+
+	/* Setup the egress flood policy (broadcast, unknown unicast) */
+	err = dpaa2_switch_port_set_egress_flood(port_priv);
+	if (err)
+		goto err_egress_flood;
+
+	return 0;
 
+err_egress_flood:
+	port_priv->bridge_dev = NULL;
 	return err;
 }
 
@@ -1447,10 +1548,24 @@ static int dpaa2_switch_port_bridge_leave(struct net_device *netdev)
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
 	int err;
 
-	/* Disable flooding */
-	err = dpaa2_switch_port_set_flood(port_priv, 0);
-	if (!err)
-		port_priv->bridge_dev = NULL;
+	/* Port is not part of a bridge anymore */
+	port_priv->bridge_dev = NULL;
+
+	/* Setup the egress flood policy (broadcast, unknown unicast).
+	 * When the port is not under a bridge, only the CTRL interface is part
+	 * of the flooding domain besides the actual port
+	 */
+	err = dpaa2_switch_port_set_egress_flood(port_priv);
+	if (err)
+		return err;
+
+	/* Add the VLAN 1 as PVID when not under a bridge. We need this since
+	 * the dpaa2 switch interfaces are not capable of being VLAN unaware
+	 */
+	err = dpaa2_switch_port_add_vlan(port_priv, 1,
+					 BRIDGE_VLAN_INFO_UNTAGGED | BRIDGE_VLAN_INFO_PVID);
+	if (err)
+		return err;
 
 	return err;
 }
@@ -2291,13 +2406,6 @@ static int dpaa2_switch_init(struct fsl_mc_device *sw_dev)
 		goto err_close;
 	}
 
-	err = dpsw_fdb_set_learning_mode(ethsw->mc_io, 0, ethsw->dpsw_handle, 0,
-					 DPSW_FDB_LEARNING_MODE_HW);
-	if (err) {
-		dev_err(dev, "dpsw_fdb_set_learning_mode err %d\n", err);
-		goto err_close;
-	}
-
 	stp_cfg.vlan_id = DEFAULT_VLAN_ID;
 	stp_cfg.state = DPSW_STP_STATE_FORWARDING;
 
@@ -2352,6 +2460,7 @@ static int dpaa2_switch_port_init(struct ethsw_port_priv *port_priv, u16 port)
 {
 	struct net_device *netdev = port_priv->netdev;
 	struct ethsw_core *ethsw = port_priv->ethsw_data;
+	struct dpsw_fdb_cfg fdb_cfg = {0};
 	struct dpsw_if_attr dpsw_if_attr;
 	struct dpsw_vlan_if_cfg vcfg;
 	int err;
@@ -2387,6 +2496,38 @@ static int dpaa2_switch_port_init(struct ethsw_port_priv *port_priv, u16 port)
 	}
 	port_priv->tx_qdid = dpsw_if_attr.qdid;
 
+	/* Create a FDB table for this particular switch port */
+	fdb_cfg.num_fdb_entries = ethsw->sw_attr.max_fdb_entries / ethsw->sw_attr.num_ifs;
+	err = dpsw_fdb_add(ethsw->mc_io, 0, ethsw->dpsw_handle,
+			   &port_priv->fdb_id, &fdb_cfg);
+	if (err) {
+		netdev_err(netdev, "dpsw_fdb_add err %d\n", err);
+		return err;
+	}
+
+	/* Set the default learning mode to HW learning enabled */
+	err = dpsw_fdb_set_learning_mode(ethsw->mc_io, 0, ethsw->dpsw_handle,
+					 port_priv->fdb_id,
+					 DPSW_FDB_LEARNING_MODE_HW);
+	if (err) {
+		netdev_err(netdev, "dpsw_fdb_set_learning_mode err %d\n", err);
+		return err;
+	}
+
+	/* We need to add VLAN 1 as the PVID on this port until it is under a
+	 * bridge since the DPAA2 switch is not able to handle the traffic in a
+	 * VLAN unaware fashion
+	 */
+	err = dpaa2_switch_port_add_vlan(port_priv, 1,
+					 BRIDGE_VLAN_INFO_UNTAGGED | BRIDGE_VLAN_INFO_PVID);
+	if (err)
+		return err;
+
+	/* Setup the egress flooding domains (broadcast, unknown unicast) */
+	err = dpaa2_switch_port_set_egress_flood(port_priv);
+	if (err)
+		return err;
+
 	return 0;
 }
 
@@ -2562,9 +2703,6 @@ static int dpaa2_switch_probe(struct fsl_mc_device *sw_dev)
 	/* DEFAULT_VLAN_ID is implicitly configured on the switch */
 	ethsw->vlans[DEFAULT_VLAN_ID] = ETHSW_VLAN_MEMBER;
 
-	/* Learning is implicitly enabled */
-	ethsw->learning = true;
-
 	ethsw->ports = kcalloc(ethsw->sw_attr.num_ifs, sizeof(*ethsw->ports),
 			       GFP_KERNEL);
 	if (!(ethsw->ports)) {
diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
index b267c04e2008..f905acd18c67 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
@@ -106,6 +106,7 @@ struct ethsw_port_priv {
 	u16			pvid;
 	struct net_device	*bridge_dev;
 	u16			tx_qdid;
+	u16			fdb_id;
 };
 
 /* Switch data */
@@ -121,7 +122,6 @@ struct ethsw_core {
 	struct iommu_domain		*iommu_domain;
 
 	u8				vlans[VLAN_VID_MASK + 1];
-	bool				learning;
 
 	struct notifier_block		port_nb;
 	struct notifier_block		port_switchdev_nb;
@@ -139,4 +139,8 @@ static inline bool dpaa2_switch_has_ctrl_if(struct ethsw_core *ethsw)
 {
 	return !(ethsw->sw_attr.options & DPSW_OPT_CTRL_IF_DIS);
 }
+
+bool dpaa2_switch_port_dev_check(const struct net_device *netdev,
+				 struct notifier_block *nb);
+
 #endif	/* __ETHSW_H */
-- 
2.28.0



* [RFC 9/9] staging: dpaa2-switch: accept only vlan-aware upper devices
  2020-11-04 16:57 [RFC 0/9] staging: dpaa2-switch: add support for CPU terminated traffic Ioana Ciornei
                   ` (7 preceding siblings ...)
  2020-11-04 16:57 ` [RFC 8/9] staging: dpaa2-switch: properly setup switching domains Ioana Ciornei
@ 2020-11-04 16:57 ` Ioana Ciornei
  8 siblings, 0 replies; 20+ messages in thread
From: Ioana Ciornei @ 2020-11-04 16:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Andrew Lunn; +Cc: linux-kernel, netdev, Ioana Ciornei

From: Ioana Ciornei <ioana.ciornei@nxp.com>

The DPAA2 switch is not capable of handling traffic in a VLAN-unaware
fashion, thus the previous handling of both the accepted upper devices
and the SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING flag was wrong.

Fix this by checking that the bridge we are joining is indeed VLAN
aware and returning an error if it is not. Also, the RX VLAN filtering
feature is defined as 'on [fixed]' and the .ndo_vlan_rx_add_vid() and
.ndo_vlan_rx_kill_vid() callbacks are implemented just by recreating a
switchdev_obj_port_vlan object and then calling the same functions used
on the switchdev notifier path.
In addition, changing the vlan_filtering flag to 0 on a bridge under
which a DPAA2 switch interface is present is not supported, thus such a
request is rejected when SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING is
received.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
 drivers/staging/fsl-dpaa2/Kconfig       |  1 +
 drivers/staging/fsl-dpaa2/ethsw/ethsw.c | 80 ++++++++++++++++++++++---
 drivers/staging/fsl-dpaa2/ethsw/ethsw.h |  7 +++
 3 files changed, 79 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/fsl-dpaa2/Kconfig b/drivers/staging/fsl-dpaa2/Kconfig
index 244237bb068a..7cb005b6e7ab 100644
--- a/drivers/staging/fsl-dpaa2/Kconfig
+++ b/drivers/staging/fsl-dpaa2/Kconfig
@@ -12,6 +12,7 @@ config FSL_DPAA2
 
 config FSL_DPAA2_ETHSW
 	tristate "Freescale DPAA2 Ethernet Switch"
+	depends on BRIDGE || BRIDGE=n
 	depends on FSL_DPAA2
 	depends on NET_SWITCHDEV
 	help
diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
index 7a0d9a178cdc..4327c432a39f 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
@@ -814,6 +814,50 @@ static int dpaa2_switch_port_fdb_dump(struct sk_buff *skb, struct netlink_callba
 	return err;
 }
 
+static int dpaa2_switch_port_vlan_add(struct net_device *netdev, __be16 proto,
+				      u16 vid)
+{
+	struct switchdev_obj_port_vlan vlan = {
+		.obj.id = SWITCHDEV_OBJ_ID_PORT_VLAN,
+		.vid_begin = vid,
+		.vid_end = vid,
+		/* This API only allows programming tagged, non-PVID VIDs */
+		.flags = 0,
+	};
+	struct switchdev_trans trans;
+	int err;
+
+	trans.ph_prepare = true;
+	err = dpaa2_switch_port_vlans_add(netdev, &vlan, &trans);
+	if (err)
+		return err;
+
+	trans.ph_prepare = false;
+	err = dpaa2_switch_port_vlans_add(netdev, &vlan, &trans);
+	if (err)
+		return err;
+
+	return 0;
+}
+
+static int dpaa2_switch_port_vlan_kill(struct net_device *netdev, __be16 proto,
+				       u16 vid)
+{
+	struct switchdev_obj_port_vlan vlan = {
+		.vid_begin = vid,
+		.vid_end = vid,
+		/* This API only allows programming tagged, non-PVID VIDs */
+		.flags = 0,
+	};
+	int err;
+
+	err = dpaa2_switch_port_vlans_del(netdev, &vlan);
+	if (err)
+		return err;
+
+	return 0;
+}
+
 static int dpaa2_switch_port_set_mac_addr(struct ethsw_port_priv *port_priv)
 {
 	struct ethsw_core *ethsw = port_priv->ethsw_data;
@@ -996,6 +1040,8 @@ static const struct net_device_ops dpaa2_switch_port_ops = {
 	.ndo_fdb_add		= dpaa2_switch_port_fdb_add,
 	.ndo_fdb_del		= dpaa2_switch_port_fdb_del,
 	.ndo_fdb_dump		= dpaa2_switch_port_fdb_dump,
+	.ndo_vlan_rx_add_vid	= dpaa2_switch_port_vlan_add,
+	.ndo_vlan_rx_kill_vid	= dpaa2_switch_port_vlan_kill,
 
 	.ndo_start_xmit		= dpaa2_switch_port_tx,
 	.ndo_get_port_parent_id	= dpaa2_switch_port_parent_id,
@@ -1195,7 +1241,8 @@ static int dpaa2_switch_port_attr_set(struct net_device *netdev,
 							  attr->u.brport_flags);
 		break;
 	case SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING:
-		/* VLANs are supported by default  */
+		if (!attr->u.vlan_filtering)
+			return -EOPNOTSUPP;
 		break;
 	default:
 		err = -EOPNOTSUPP;
@@ -1205,9 +1252,9 @@ static int dpaa2_switch_port_attr_set(struct net_device *netdev,
 	return err;
 }
 
-static int dpaa2_switch_port_vlans_add(struct net_device *netdev,
-				       const struct switchdev_obj_port_vlan *vlan,
-				       struct switchdev_trans *trans)
+int dpaa2_switch_port_vlans_add(struct net_device *netdev,
+				const struct switchdev_obj_port_vlan *vlan,
+				struct switchdev_trans *trans)
 {
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
 	struct ethsw_core *ethsw = port_priv->ethsw_data;
@@ -1375,13 +1422,13 @@ static int dpaa2_switch_port_del_vlan(struct ethsw_port_priv *port_priv, u16 vid
 	return 0;
 }
 
-static int dpaa2_switch_port_vlans_del(struct net_device *netdev,
-				       const struct switchdev_obj_port_vlan *vlan)
+int dpaa2_switch_port_vlans_del(struct net_device *netdev,
+				const struct switchdev_obj_port_vlan *vlan)
 {
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
 	int vid, err = 0;
 
-	if (netif_is_bridge_master(vlan->obj.orig_dev))
+	if (vlan->obj.orig_dev && netif_is_bridge_master(vlan->obj.orig_dev))
 		return -EOPNOTSUPP;
 
 	for (vid = vlan->vid_begin; vid <= vlan->vid_end; vid++) {
@@ -1575,14 +1622,23 @@ static int dpaa2_switch_port_netdevice_event(struct notifier_block *nb,
 {
 	struct net_device *netdev = netdev_notifier_info_to_dev(ptr);
 	struct netdev_notifier_changeupper_info *info = ptr;
+	struct netlink_ext_ack *extack;
 	struct net_device *upper_dev;
 	int err = 0;
 
 	if (!dpaa2_switch_port_dev_check(netdev, nb))
 		return NOTIFY_DONE;
 
-	/* Handle just upper dev link/unlink for the moment */
-	if (event == NETDEV_CHANGEUPPER) {
+	extack = netdev_notifier_info_to_extack(&info->info);
+	switch (event) {
+	case NETDEV_PRECHANGEUPPER:
+		upper_dev = info->upper_dev;
+		if (netif_is_bridge_master(upper_dev) && !br_vlan_enabled(upper_dev)) {
+			NL_SET_ERR_MSG_MOD(extack, "Cannot join a vlan-unaware bridge");
+			err = -EOPNOTSUPP;
+		}
+		break;
+	case NETDEV_CHANGEUPPER:
 		upper_dev = info->upper_dev;
 		if (netif_is_bridge_master(upper_dev)) {
 			if (info->linking)
@@ -1590,6 +1646,7 @@ static int dpaa2_switch_port_netdevice_event(struct notifier_block *nb,
 			else
 				err = dpaa2_switch_port_bridge_leave(netdev);
 		}
+		break;
 	}
 
 	return notifier_from_errno(err);
@@ -2646,6 +2703,11 @@ static int dpaa2_switch_probe_port(struct ethsw_core *ethsw,
 	port_netdev->min_mtu = ETH_MIN_MTU;
 	port_netdev->max_mtu = ETHSW_MAX_FRAME_LENGTH;
 
+	/* The DPAA2 Switch's ingress path depends on the VLAN table,
+	 * thus we are not able to disable VLAN filtering.
+	 */
+	port_netdev->features = NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_HW_VLAN_STAG_FILTER;
+
 	err = dpaa2_switch_port_init(port_priv, port_idx);
 	if (err)
 		goto err_port_probe;
diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
index f905acd18c67..3163dd2ab2ab 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
@@ -143,4 +143,11 @@ static inline bool dpaa2_switch_has_ctrl_if(struct ethsw_core *ethsw)
 bool dpaa2_switch_port_dev_check(const struct net_device *netdev,
 				 struct notifier_block *nb);
 
+int dpaa2_switch_port_vlans_add(struct net_device *netdev,
+				const struct switchdev_obj_port_vlan *vlan,
+				struct switchdev_trans *trans);
+
+int dpaa2_switch_port_vlans_del(struct net_device *netdev,
+				const struct switchdev_obj_port_vlan *vlan);
+
 #endif	/* __ETHSW_H */
-- 
2.28.0



* Re: [RFC 6/9] staging: dpaa2-switch: add .ndo_start_xmit() callback
  2020-11-04 16:57 ` [RFC 6/9] staging: dpaa2-switch: add .ndo_start_xmit() callback Ioana Ciornei
@ 2020-11-04 21:27   ` Vladimir Oltean
  2020-11-05  8:11     ` Ioana Ciornei
  2020-11-05  1:04   ` Andrew Lunn
  1 sibling, 1 reply; 20+ messages in thread
From: Vladimir Oltean @ 2020-11-04 21:27 UTC (permalink / raw)
  To: Ioana Ciornei
  Cc: Greg Kroah-Hartman, Andrew Lunn, linux-kernel, netdev, Ioana Ciornei

On Wed, Nov 04, 2020 at 06:57:17PM +0200, Ioana Ciornei wrote:
> From: Ioana Ciornei <ioana.ciornei@nxp.com>
> 
> Implement the .ndo_start_xmit() callback for the switch port interfaces.
> For each of the switch ports, gather the corresponding queue
> destination ID (QDID) necessary for Tx enqueueing.
> 
> We'll reserve 64 bytes for software annotations, where we keep a skb
> backpointer used on the Tx confirmation side for releasing the allocated
> memory. At the moment, we only support linear skbs.
> 
> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
> ---
> @@ -554,8 +560,10 @@ static int dpaa2_switch_port_open(struct net_device *netdev)
>  	struct ethsw_core *ethsw = port_priv->ethsw_data;
>  	int err;
>  
> -	/* No need to allow Tx as control interface is disabled */
> -	netif_tx_stop_all_queues(netdev);
> +	if (!dpaa2_switch_has_ctrl_if(port_priv->ethsw_data)) {
> +		/* No need to allow Tx as control interface is disabled */
> +		netif_tx_stop_all_queues(netdev);

Personal taste probably, but you could remove the braces here.

> +	}
>  
>  	/* Explicitly set carrier off, otherwise
>  	 * netif_carrier_ok() will return true and cause 'ip link show'
> @@ -610,15 +618,6 @@ static int dpaa2_switch_port_stop(struct net_device *netdev)
>  	return 0;
>  }
>  
> +static netdev_tx_t dpaa2_switch_port_tx(struct sk_buff *skb,
> +					struct net_device *net_dev)
> +{
> +	struct ethsw_port_priv *port_priv = netdev_priv(net_dev);
> +	struct ethsw_core *ethsw = port_priv->ethsw_data;
> +	int retries = DPAA2_SWITCH_SWP_BUSY_RETRIES;
> +	struct dpaa2_fd fd;
> +	int err;
> +
> +	if (!dpaa2_switch_has_ctrl_if(ethsw))
> +		goto err_free_skb;
> +
> +	if (unlikely(skb_headroom(skb) < DPAA2_SWITCH_NEEDED_HEADROOM)) {
> +		struct sk_buff *ns;
> +
> +		ns = skb_realloc_headroom(skb, DPAA2_SWITCH_NEEDED_HEADROOM);

Looks like this passion for skb_realloc_headroom runs in the company?
Few other drivers use it, and Claudiu just had a bug with that in gianfar.
Luckily what saves you from the same bug is the skb_unshare from right below.
Maybe you could use skb_cow_head and simplify this a bit?

> +		if (unlikely(!ns)) {
> +			netdev_err(net_dev, "Error reallocating skb headroom\n");
> +			goto err_free_skb;
> +		}
> +		dev_kfree_skb(skb);

Please use dev_consume_skb_any here, as it's not error path. Or, if you
use skb_cow_head, only the skb data will be reallocated, not the skb
structure itself, so there will be no consume_skb in that case at all,
another reason to simplify.
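
Something like this, untested (the skb_unshare below would still cover
the cloned case):

	if (unlikely(skb_headroom(skb) < DPAA2_SWITCH_NEEDED_HEADROOM)) {
		err = skb_cow_head(skb, DPAA2_SWITCH_NEEDED_HEADROOM);
		if (unlikely(err)) {
			netdev_err(net_dev, "Error expanding skb headroom\n");
			goto err_free_skb;
		}
	}

One skb pointer, no freeing of the old one.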

> +		skb = ns;
> +	}
> +
> +	/* We'll be holding a back-reference to the skb until Tx confirmation */
> +	skb = skb_unshare(skb, GFP_ATOMIC);
> +	if (unlikely(!skb)) {
> +		/* skb_unshare() has already freed the skb */
> +		netdev_err(net_dev, "Error copying the socket buffer\n");
> +		goto err_exit;
> +	}
> +
> +	if (skb_is_nonlinear(skb)) {
> +		netdev_err(net_dev, "No support for non-linear SKBs!\n");

Rate-limit maybe?
And what is the reason for no non-linear skb's? Too much code to copy
from dpaa2-eth?
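
For the rate limiting, even just:

	if (net_ratelimit())
		netdev_err(net_dev, "No support for non-linear SKBs!\n");

would keep this from flooding the log under load.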

> +		goto err_free_skb;
> +	}
> +
> +	err = dpaa2_switch_build_single_fd(ethsw, skb, &fd);
> +	if (unlikely(err)) {
> +		netdev_err(net_dev, "ethsw_build_*_fd() %d\n", err);
> +		goto err_free_skb;
> +	}
> +
> +	do {
> +		err = dpaa2_io_service_enqueue_qd(NULL,
> +						  port_priv->tx_qdid,
> +						  8, 0, &fd);
> +		retries--;
> +	} while (err == -EBUSY && retries);
> +
> +	if (unlikely(err < 0)) {
> +		dpaa2_switch_free_fd(ethsw, &fd);
> +		goto err_exit;
> +	}
> +
> +	return NETDEV_TX_OK;
> +
> +err_free_skb:
> +	dev_kfree_skb(skb);
> +err_exit:
> +	return NETDEV_TX_OK;
> +}
> +
> diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
> index bd24be2c6308..b267c04e2008 100644
> --- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
> +++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
> @@ -66,6 +66,19 @@
>   */
>  #define DPAA2_SWITCH_SWP_BUSY_RETRIES		1000
>  
> +/* Hardware annotation buffer size */
> +#define DPAA2_SWITCH_HWA_SIZE		64
> +/* Software annotation buffer size */
> +#define DPAA2_SWITCH_SWA_SIZE		64
> +
> +#define DPAA2_SWITCH_TX_BUF_ALIGN	64

Could you align all of these to the "1000" from DPAA2_SWITCH_SWP_BUSY_RETRIES?

> +
> +#define DPAA2_SWITCH_TX_DATA_OFFSET \
> +	(DPAA2_SWITCH_HWA_SIZE + DPAA2_SWITCH_SWA_SIZE)
> +
> +#define DPAA2_SWITCH_NEEDED_HEADROOM \
> +	(DPAA2_SWITCH_TX_DATA_OFFSET + DPAA2_SWITCH_TX_BUF_ALIGN)
> +

Ironically, you create a definition for NEEDED_HEADROOM but you do not
set dev->needed_headroom.
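
i.e.:

	port_netdev->needed_headroom = DPAA2_SWITCH_NEEDED_HEADROOM;

somewhere in the port probe path, so the stack reserves the headroom up
front and the reallocation above becomes the exception, not the rule.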


* Re: [RFC 8/9] staging: dpaa2-switch: properly setup switching domains
  2020-11-04 16:57 ` [RFC 8/9] staging: dpaa2-switch: properly setup switching domains Ioana Ciornei
@ 2020-11-04 22:08   ` Vladimir Oltean
  2020-11-05 10:58     ` Ioana Ciornei
  0 siblings, 1 reply; 20+ messages in thread
From: Vladimir Oltean @ 2020-11-04 22:08 UTC (permalink / raw)
  To: Ioana Ciornei
  Cc: Greg Kroah-Hartman, Andrew Lunn, linux-kernel, netdev, Ioana Ciornei

On Wed, Nov 04, 2020 at 06:57:19PM +0200, Ioana Ciornei wrote:
> From: Ioana Ciornei <ioana.ciornei@nxp.com>
> 
> Until now, the DPAA2 switch was not capable to properly setup it's
> switching domains depending on the existence, or lack thereof, of an
> upper bridge device. This meant that all switch ports of a DPSW object
> were switching by default even though they were not under the same
> bridge device.
> 
> Another issue was the inability to actually add the CPU to the flooding
> domains (broadcast, unknown unicast etc) of a particular switch port.
> This meant that a simple ping on a switch interface was not possible
> since no broadcast ARP frame would actually reach the CPU queues.
> 
> This patch tries to fix exactly these problems by:
> 
> * Creating and managing a FDB table for each flooding domain. This means
>   that when a switch interface is not bridged it will use it's own FDB
>   table. While in bridged mode all DPAA2 switch interfaces under the
>   same upper will use the same FDB table, thus leverage the same FDB
>   entries.
> 
> * Adding a new MC firmware command - dpsw_set_egress_flood() - through
>   which the driver can setup the flooding domains as needed. For
>   example, when the switch interface is standalone, thus not in a
>   bridge with any other DPAA2 switch port, it will setup it's broadcast
>   and unknown unicast flooding domains to only include the control
>   interface (the queues that reach the CPU and the driver can dequeue
>   from). This flooding domain changes when the interface joins a bridge
>   and is configured to include, beside the control interface, all other
>   DPAA2 switch interfaces.
> 
> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
> ---

None of the occurrences of "it's" in the commit message is grammatically
correct. So please s/it's/its/g.

> diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
> index 24bdac6d6005..7a0d9a178cdc 100644
> --- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
> +++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
> @@ -25,6 +25,36 @@
>  
>  #define DEFAULT_VLAN_ID			1
>  
> +static u16 dpaa2_switch_port_get_fdb_id(struct ethsw_port_priv *port_priv)
> +{
> +	struct ethsw_port_priv *other_port_priv = NULL;
> +	struct net_device *other_dev;
> +	struct list_head *iter;
> +
> +	/* If not part of a bridge, just use the private FDB */
> +	if (!port_priv->bridge_dev)
> +		return port_priv->fdb_id;
> +
> +	/* If part of a bridge, use the FDB of the first dpaa2 switch interface
> +	 * to be present in that bridge
> +	 */
> +	netdev_for_each_lower_dev(port_priv->bridge_dev, other_dev, iter) {

netdev_for_each_lower_dev calls netdev_lower_get_next which has this in
the comments:

 * The caller must hold RTNL lock or
 * its own locking that guarantees that the neighbour lower
 * list will remain unchanged.

Does that hold true for all callers, if you put ASSERT_RTNL() here?
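
i.e., untested:

	/* the lower dev walk below relies on RTNL for consistency */
	ASSERT_RTNL();

right before the loop, and see if anything complains at runtime.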

> +		if (!dpaa2_switch_port_dev_check(other_dev, NULL))
> +			continue;
> +
> +		other_port_priv = netdev_priv(other_dev);
> +		break;
> +	}
> +
> +	/* We are the first dpaa2 switch interface to join the bridge, just use
> +	 * our own FDB
> +	 */
> +	if (!other_port_priv)
> +		other_port_priv = port_priv;
> +
> +	return other_port_priv->fdb_id;
> +}
> +
>  static void *dpaa2_iova_to_virt(struct iommu_domain *domain,
>  				dma_addr_t iova_addr)
>  {
> @@ -133,7 +163,7 @@ static int dpaa2_switch_port_add_vlan(struct ethsw_port_priv *port_priv,
>  {
>  	struct ethsw_core *ethsw = port_priv->ethsw_data;
>  	struct net_device *netdev = port_priv->netdev;
> -	struct dpsw_vlan_if_cfg vcfg;
> +	struct dpsw_vlan_if_cfg vcfg = {0};
>  	int err;
>  
>  	if (port_priv->vlans[vid]) {
> @@ -141,8 +171,13 @@ static int dpaa2_switch_port_add_vlan(struct ethsw_port_priv *port_priv,
>  		return -EEXIST;
>  	}
>  
> +	/* If hit, this VLAN rule will lead the packet into the FDB table
> +	 * specified in the vlan configuration below
> +	 */

And this is the reason why VLAN-unaware mode is unsupported, right? No
hit on any VLAN rule => no FDB table selected for the packet. What is
the default action for misses on VLAN rules? Drop or some default FDB
ID?

>  	vcfg.num_ifs = 1;
>  	vcfg.if_id[0] = port_priv->idx;
> +	vcfg.fdb_id = dpaa2_switch_port_get_fdb_id(port_priv);
> +	vcfg.options |= DPSW_VLAN_ADD_IF_OPT_FDB_ID;
>  	err = dpsw_vlan_add_if(ethsw->mc_io, 0, ethsw->dpsw_handle, vid, &vcfg);
>  	if (err) {
>  		netdev_err(netdev, "dpsw_vlan_add_if err %d\n", err);
> @@ -172,8 +207,10 @@ static int dpaa2_switch_port_add_vlan(struct ethsw_port_priv *port_priv,
>  	return 0;
>  }
>  
> -static int dpaa2_switch_set_learning(struct ethsw_core *ethsw, bool enable)
> +static int dpaa2_switch_port_set_learning(struct ethsw_port_priv *port_priv, bool enable)

The commit message says nothing about changes to the learning
configuration.

>  {
> +	u16 fdb_id = dpaa2_switch_port_get_fdb_id(port_priv);
> +	struct ethsw_core *ethsw = port_priv->ethsw_data;
>  	enum dpsw_fdb_learning_mode learn_mode;
>  	int err;
>  
> @@ -182,13 +219,12 @@ static int dpaa2_switch_set_learning(struct ethsw_core *ethsw, bool enable)
>  	else
>  		learn_mode = DPSW_FDB_LEARNING_MODE_DIS;
>  
> -	err = dpsw_fdb_set_learning_mode(ethsw->mc_io, 0, ethsw->dpsw_handle, 0,
> +	err = dpsw_fdb_set_learning_mode(ethsw->mc_io, 0, ethsw->dpsw_handle, fdb_id,
>  					 learn_mode);
>  	if (err) {
>  		dev_err(ethsw->dev, "dpsw_fdb_set_learning_mode err %d\n", err);
>  		return err;
>  	}
> -	ethsw->learning = enable;
>  
>  	return 0;
>  }
> @@ -267,15 +303,17 @@ static int dpaa2_switch_port_fdb_add_uc(struct ethsw_port_priv *port_priv,
>  					const unsigned char *addr)
>  {
>  	struct dpsw_fdb_unicast_cfg entry = {0};
> +	u16 fdb_id;
>  	int err;
>  
>  	entry.if_egress = port_priv->idx;
>  	entry.type = DPSW_FDB_ENTRY_STATIC;
>  	ether_addr_copy(entry.mac_addr, addr);
>  
> +	fdb_id = dpaa2_switch_port_get_fdb_id(port_priv);
>  	err = dpsw_fdb_add_unicast(port_priv->ethsw_data->mc_io, 0,
>  				   port_priv->ethsw_data->dpsw_handle,
> -				   0, &entry);
> +				   fdb_id, &entry);

Hmmm, so in dpaa2_switch_port_get_fdb_id you say:

	/* If part of a bridge, use the FDB of the first dpaa2 switch interface
	 * to be present in that bridge
	 */

So let's say there is a br0 with swp3 and swp2, and a br1 with swp4.
IIUC, br0 interfaces (swp3 and swp2) will have an fdb_id of 3 (due to
swp3 being added first) and br1 will have an fdb_id of 4 (due to swp4).

When swp3 leaves br0, will this cause the fdb_id of swp2 to change?
I expect the answer is yes, since otherwise swp2 and swp3 would keep
forwarding packets to one another. Is this change graceful?

For example, if you add a static FDB entry to swp2 prior to removing
swp3, I would expect the fdb_id of swp2 to preserve that static FDB
entry, even if swp2 now gets moved to a different fdb_id. Similarly,
flooding settings, everything is preserved when the fdb_id changes?

The flip side of that is: what happens if you add an FDB entry to swp2,
then you remove swp2 from br0 and move it to br1? Will swp4 (which was
already in br1) see that static FDB entry in hardware, even if the
software bridge br1 hasn't notified you about it?

Basically, my question boils down to: why is there so little activity in
dpaa2_switch_port_bridge_leave.

>  	if (err)
>  		netdev_err(port_priv->netdev,
>  			   "dpsw_fdb_add_unicast err %d\n", err);

* Re: [RFC 5/9] staging: dpaa2-switch: handle Rx path on control interface
  2020-11-04 16:57 ` [RFC 5/9] staging: dpaa2-switch: handle Rx path on control interface Ioana Ciornei
@ 2020-11-05  0:45   ` Andrew Lunn
  2020-11-05 11:22     ` Ioana Ciornei
  0 siblings, 1 reply; 20+ messages in thread
From: Andrew Lunn @ 2020-11-05  0:45 UTC (permalink / raw)
  To: Ioana Ciornei; +Cc: Greg Kroah-Hartman, linux-kernel, netdev, Ioana Ciornei

> +/* Manage all NAPI instances for the control interface.
> + *
> + * We only have one RX queue and one Tx Conf queue for all
> + * switch ports. Therefore, we only need to enable the NAPI instance once, the
> + * first time one of the switch ports runs .dev_open().
> + */
> +
> +static void dpaa2_switch_enable_ctrl_if_napi(struct ethsw_core *ethsw)
> +{
> +	int i;
> +
> +	/* a new interface is using the NAPI instance */
> +	ethsw->napi_users++;
> +
> +	/* if there is already a user of the instance, return */
> +	if (ethsw->napi_users > 1)
> +		return;

Does there need to be any locking here? Or does it rely on RTNL?
Maybe a comment would be nice, or a check that RTNL is actually held.

> +
> +	if (!dpaa2_switch_has_ctrl_if(ethsw))
> +		return;
> +
> +	for (i = 0; i < DPAA2_SWITCH_RX_NUM_FQS; i++)
> +		napi_enable(&ethsw->fq[i].napi);
> +}

> +static void dpaa2_switch_rx(struct dpaa2_switch_fq *fq,
> +			    const struct dpaa2_fd *fd)
> +{
> +	struct ethsw_core *ethsw = fq->ethsw;
> +	struct ethsw_port_priv *port_priv;
> +	struct net_device *netdev;
> +	struct vlan_ethhdr *hdr;
> +	struct sk_buff *skb;
> +	u16 vlan_tci, vid;
> +	int if_id = -1;
> +	int err;
> +
> +	/* prefetch the frame descriptor */
> +	prefetch(fd);

Does this actually do any good, given that the next call:

> +
> +	/* get switch ingress interface ID */
> +	if_id = upper_32_bits(dpaa2_fd_get_flc(fd)) & 0x0000FFFF;

is accessing the frame descriptor? The idea of prefetch is to let it
bring it into the cache while you are busy doing something else,
hopefully with something which is already cache hot.
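
For example, prefetching the frame data instead might pay off, since
that is only touched later, when the skb gets built. Untested sketch,
reusing this driver's dpaa2_iova_to_virt() helper:

	/* the FD fields are needed right away, so prefetching them buys
	 * nothing; start pulling in the frame data instead, which is
	 * only read once the skb is built
	 */
	if_id = upper_32_bits(dpaa2_fd_get_flc(fd)) & 0x0000FFFF;
	vaddr = dpaa2_iova_to_virt(ethsw->iommu_domain,
				   dpaa2_fd_get_addr(fd));
	prefetch(vaddr);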

	  Andrew

* Re: [RFC 6/9] staging: dpaa2-switch: add .ndo_start_xmit() callback
  2020-11-04 16:57 ` [RFC 6/9] staging: dpaa2-switch: add .ndo_start_xmit() callback Ioana Ciornei
  2020-11-04 21:27   ` Vladimir Oltean
@ 2020-11-05  1:04   ` Andrew Lunn
  2020-11-05  8:25     ` Ioana Ciornei
  1 sibling, 1 reply; 20+ messages in thread
From: Andrew Lunn @ 2020-11-05  1:04 UTC (permalink / raw)
  To: Ioana Ciornei; +Cc: Greg Kroah-Hartman, linux-kernel, netdev, Ioana Ciornei

> +static int dpaa2_switch_build_single_fd(struct ethsw_core *ethsw,
> +					struct sk_buff *skb,
> +					struct dpaa2_fd *fd)
> +{
> +	struct device *dev = ethsw->dev;
> +	struct sk_buff **skbh;
> +	dma_addr_t addr;
> +	u8 *buff_start;
> +	void *hwa;
> +
> +	buff_start = PTR_ALIGN(skb->data - DPAA2_SWITCH_TX_DATA_OFFSET -
> +			       DPAA2_SWITCH_TX_BUF_ALIGN,
> +			       DPAA2_SWITCH_TX_BUF_ALIGN);
> +
> +	/* Clear FAS to have consistent values for TX confirmation. It is
> +	 * located in the first 8 bytes of the buffer's hardware annotation
> +	 * area
> +	 */
> +	hwa = buff_start + DPAA2_SWITCH_SWA_SIZE;
> +	memset(hwa, 0, 8);
> +
> +	/* Store a backpointer to the skb at the beginning of the buffer
> +	 * (in the private data area) such that we can release it
> +	 * on Tx confirm
> +	 */
> +	skbh = (struct sk_buff **)buff_start;
> +	*skbh = skb;

Where is the TX confirm which uses this stored pointer? I don't see it
in this file.

It can be expensive to store a pointer like this in buffers used for
DMA. It has to be flushed out of the cache here as part of the
send. Then the TX complete needs to invalidate and then read it back
into the cache. Or you use coherent memory which is just slow.

It can be cheaper to keep a parallel ring in cacheable memory which
never gets flushed.

      Andrew

* Re: [RFC 6/9] staging: dpaa2-switch: add .ndo_start_xmit() callback
  2020-11-04 21:27   ` Vladimir Oltean
@ 2020-11-05  8:11     ` Ioana Ciornei
  0 siblings, 0 replies; 20+ messages in thread
From: Ioana Ciornei @ 2020-11-05  8:11 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Ioana Ciornei, Greg Kroah-Hartman, Andrew Lunn, linux-kernel,
	netdev, Ioana Ciornei

On Wed, Nov 04, 2020 at 11:27:00PM +0200, Vladimir Oltean wrote:
> On Wed, Nov 04, 2020 at 06:57:17PM +0200, Ioana Ciornei wrote:
> > From: Ioana Ciornei <ioana.ciornei@nxp.com>
> > 
> > Implement the .ndo_start_xmit() callback for the switch port interfaces.
> > For each of the switch ports, gather the corresponding queue
> > destination ID (QDID) necessary for Tx enqueueing.
> > 
> > We'll reserve 64 bytes for software annotations, where we keep a skb
> > backpointer used on the Tx confirmation side for releasing the allocated
> > memory. At the moment, we only support linear skbs.
> > 
> > Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
> > ---
> > @@ -554,8 +560,10 @@ static int dpaa2_switch_port_open(struct net_device *netdev)
> >  	struct ethsw_core *ethsw = port_priv->ethsw_data;
> >  	int err;
> >  
> > -	/* No need to allow Tx as control interface is disabled */
> > -	netif_tx_stop_all_queues(netdev);
> > +	if (!dpaa2_switch_has_ctrl_if(port_priv->ethsw_data)) {
> > +		/* No need to allow Tx as control interface is disabled */
> > +		netif_tx_stop_all_queues(netdev);
> 
> Personal taste probably, but you could remove the braces here.

Usually checkpatch complains about this kind of thing but not this time.
Maybe it takes the comment into account as well...

I'll remove the braces.

> 
> > +	}
> >  
> >  	/* Explicitly set carrier off, otherwise
> >  	 * netif_carrier_ok() will return true and cause 'ip link show'
> > @@ -610,15 +618,6 @@ static int dpaa2_switch_port_stop(struct net_device *netdev)
> >  	return 0;
> >  }
> >  
> > +static netdev_tx_t dpaa2_switch_port_tx(struct sk_buff *skb,
> > +					struct net_device *net_dev)
> > +{
> > +	struct ethsw_port_priv *port_priv = netdev_priv(net_dev);
> > +	struct ethsw_core *ethsw = port_priv->ethsw_data;
> > +	int retries = DPAA2_SWITCH_SWP_BUSY_RETRIES;
> > +	struct dpaa2_fd fd;
> > +	int err;
> > +
> > +	if (!dpaa2_switch_has_ctrl_if(ethsw))
> > +		goto err_free_skb;
> > +
> > +	if (unlikely(skb_headroom(skb) < DPAA2_SWITCH_NEEDED_HEADROOM)) {
> > +		struct sk_buff *ns;
> > +
> > +		ns = skb_realloc_headroom(skb, DPAA2_SWITCH_NEEDED_HEADROOM);
> 
> Looks like this passion for skb_realloc_headroom runs in the company?

Not really, ocelot and sja1105 are safe :)

> Few other drivers use it, and Claudiu just had a bug with that in gianfar.
> Luckily what saves you from the same bug is the skb_unshare from right below.
> Maybe you could use skb_cow_head and simplify this a bit?
> 
> > +		if (unlikely(!ns)) {
> > +			netdev_err(net_dev, "Error reallocating skb headroom\n");
> > +			goto err_free_skb;
> > +		}
> > +		dev_kfree_skb(skb);
> 
> Please use dev_consume_skb_any here, as it's not error path. Or, if you
> use skb_cow_head, only the skb data will be reallocated, not the skb
> structure itself, so there will be no consume_skb in that case at all,
> another reason to simplify.

Ok, I can try that.

How dpaa2-eth deals with this now is to just create an S/G FD when the
headroom is less than what's necessary, so no skb_realloc_headroom() or
skb_cow_head(). But I agree that it's best to make it as simple as
possible.
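
For the record, the skb_cow_head() variant would look something like
this (untested):

	/* Make sure there is enough writable headroom. This reallocates
	 * the data block only when needed and never swaps the skb
	 * itself, so there is no second skb to consume on success.
	 */
	if (unlikely(skb_cow_head(skb, DPAA2_SWITCH_NEEDED_HEADROOM)))
		goto err_free_skb;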

> 
> > +		skb = ns;
> > +	}
> > +
> > +	/* We'll be holding a back-reference to the skb until Tx confirmation */
> > +	skb = skb_unshare(skb, GFP_ATOMIC);
> > +	if (unlikely(!skb)) {
> > +		/* skb_unshare() has already freed the skb */
> > +		netdev_err(net_dev, "Error copying the socket buffer\n");
> > +		goto err_exit;
> > +	}
> > +
> > +	if (skb_is_nonlinear(skb)) {
> > +		netdev_err(net_dev, "No support for non-linear SKBs!\n");
> 
> Rate-limit maybe?

Yep, that probably should be rate-limited.
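
Something like:

	if (net_ratelimit())
		netdev_err(net_dev, "No support for non-linear SKBs!\n");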

> And what is the reason for no non-linear skb's? Too much code to copy
> from dpaa2-eth?

Once this is out of staging, dpaa2-eth and dpaa2-switch could share
the Tx/Rx code path, so, as you said, I just didn't want to duplicate
everything unless it's specifically needed.

> > diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
> > index bd24be2c6308..b267c04e2008 100644
> > --- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
> > +++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
> > @@ -66,6 +66,19 @@
> >   */
> >  #define DPAA2_SWITCH_SWP_BUSY_RETRIES		1000
> >  
> > +/* Hardware annotation buffer size */
> > +#define DPAA2_SWITCH_HWA_SIZE		64
> > +/* Software annotation buffer size */
> > +#define DPAA2_SWITCH_SWA_SIZE		64
> > +
> > +#define DPAA2_SWITCH_TX_BUF_ALIGN	64
> 
> Could you align all of these to the "1000" from DPAA2_SWITCH_SWP_BUSY_RETRIES?
> 

Sure.

> > +
> > +#define DPAA2_SWITCH_TX_DATA_OFFSET \
> > +	(DPAA2_SWITCH_HWA_SIZE + DPAA2_SWITCH_SWA_SIZE)
> > +
> > +#define DPAA2_SWITCH_NEEDED_HEADROOM \
> > +	(DPAA2_SWITCH_TX_DATA_OFFSET + DPAA2_SWITCH_TX_BUF_ALIGN)
> > +
> 
> Ironically, you create a definition for NEEDED_HEADROOM but you do not
> set dev->needed_headroom.
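
(Presumably a one-liner in the per-port netdev setup, e.g.:

	port_netdev->needed_headroom = DPAA2_SWITCH_NEEDED_HEADROOM;

where port_netdev stands for whatever the probe path calls the per-port
net_device, so the stack reserves the headroom up front and the
reallocation above becomes the slow path.)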

* Re: [RFC 6/9] staging: dpaa2-switch: add .ndo_start_xmit() callback
  2020-11-05  1:04   ` Andrew Lunn
@ 2020-11-05  8:25     ` Ioana Ciornei
  2020-11-05 13:45       ` Andrew Lunn
  0 siblings, 1 reply; 20+ messages in thread
From: Ioana Ciornei @ 2020-11-05  8:25 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Ioana Ciornei, Greg Kroah-Hartman, linux-kernel, netdev, Ioana Ciornei

On Thu, Nov 05, 2020 at 02:04:39AM +0100, Andrew Lunn wrote:
> > +static int dpaa2_switch_build_single_fd(struct ethsw_core *ethsw,
> > +					struct sk_buff *skb,
> > +					struct dpaa2_fd *fd)
> > +{
> > +	struct device *dev = ethsw->dev;
> > +	struct sk_buff **skbh;
> > +	dma_addr_t addr;
> > +	u8 *buff_start;
> > +	void *hwa;
> > +
> > +	buff_start = PTR_ALIGN(skb->data - DPAA2_SWITCH_TX_DATA_OFFSET -
> > +			       DPAA2_SWITCH_TX_BUF_ALIGN,
> > +			       DPAA2_SWITCH_TX_BUF_ALIGN);
> > +
> > +	/* Clear FAS to have consistent values for TX confirmation. It is
> > +	 * located in the first 8 bytes of the buffer's hardware annotation
> > +	 * area
> > +	 */
> > +	hwa = buff_start + DPAA2_SWITCH_SWA_SIZE;
> > +	memset(hwa, 0, 8);
> > +
> > +	/* Store a backpointer to the skb at the beginning of the buffer
> > +	 * (in the private data area) such that we can release it
> > +	 * on Tx confirm
> > +	 */
> > +	skbh = (struct sk_buff **)buff_start;
> > +	*skbh = skb;
> 
> Where is the TX confirm which uses this stored pointer. I don't see it
> in this file.
> 

The Tx confirm - dpaa2_switch_tx_conf() - is added in patch 5/9.

> It can be expensive to store pointer like this in buffers used for
> DMA.

Yes, it is. But the hardware does not give us any other indication that
a packet was actually sent so that we can move ahead with consuming the
initial skb.

> It has to be flushed out of the cache here as part of the
> send. Then the TX complete needs to invalidate and then read it back
> into the cache. Or you use coherent memory which is just slow.
> 
> It can be cheaper to keep a parallel ring in cacheable memory which
> never gets flushed.

I'm afraid I don't really understand your suggestion. In this parallel
ring I would keep the skb pointers of all frames which are in-flight?
Then, when a packet is received on the Tx confirmation queue I would
have to loop over the parallel ring and somehow determine which skb
this packet was initially associated with. Isn't this even more expensive?

Ioana


* Re: [RFC 8/9] staging: dpaa2-switch: properly setup switching domains
  2020-11-04 22:08   ` Vladimir Oltean
@ 2020-11-05 10:58     ` Ioana Ciornei
  0 siblings, 0 replies; 20+ messages in thread
From: Ioana Ciornei @ 2020-11-05 10:58 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Ioana Ciornei, Greg Kroah-Hartman, Andrew Lunn, linux-kernel,
	netdev, Ioana Ciornei

On Thu, Nov 05, 2020 at 12:08:10AM +0200, Vladimir Oltean wrote:
> On Wed, Nov 04, 2020 at 06:57:19PM +0200, Ioana Ciornei wrote:
> > From: Ioana Ciornei <ioana.ciornei@nxp.com>
> > 
> > Until now, the DPAA2 switch was not capable to properly setup it's
> > switching domains depending on the existence, or lack thereof, of a
> > upper bridge device. This meant that all switch ports of a DPSW object
> > were switching by default even though they were not under the same
> > bridge device.
> > 
> > Another issue was the inability to actually add the CPU in the flooding
> > domains (broadcast, unknown unicast etc) of a particular switch port.
> > This meant that a simple ping on a switch interface was not possible
> > since no broadcast ARP frame would actually reach the CPU queues.
> > 
> > This patch tries to fix exactly these problems by:
> > 
> > * Creating and managing a FDB table for each flooding domain. This means
> >   that when a switch interface is not bridged it will use it's own FDB
> >   table. While in bridged mode all DPAA2 switch interfaces under the
> >   same upper will use the same FDB table, thus leverage the same FDB
> >   entries.
> > 
> > * Adding a new MC firmware command - dpsw_set_egress_flood() - through
> >   which the driver can setup the flooding domains as needed. For
> >   example, when the switch interface is standalone, thus not in a
> >   bridge with any other DPAA2 switch port, it will setup it's broadcast
> >   and unknown unicast flooding domains to only include the control
> >   interface (the queues that reach the CPU and the driver can dequeue
> >   from). This flooding domain changes when the interface joins a bridge
> >   and is configured to include, beside the control interface, all other
> >   DPAA2 switch interfaces.
> > 
> > Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
> > ---
> 
> None of the occurrences of "it's" in the commit message is grammatically
> correct. So please s/it's/its/g.
> 
> > diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
> > index 24bdac6d6005..7a0d9a178cdc 100644
> > --- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
> > +++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
> > @@ -25,6 +25,36 @@
> >  
> >  #define DEFAULT_VLAN_ID			1
> >  
> > +static u16 dpaa2_switch_port_get_fdb_id(struct ethsw_port_priv *port_priv)
> > +{
> > +	struct ethsw_port_priv *other_port_priv = NULL;
> > +	struct net_device *other_dev;
> > +	struct list_head *iter;
> > +
> > +	/* If not part of a bridge, just use the private FDB */
> > +	if (!port_priv->bridge_dev)
> > +		return port_priv->fdb_id;
> > +
> > +	/* If part of a bridge, use the FDB of the first dpaa2 switch interface
> > +	 * to be present in that bridge
> > +	 */
> > +	netdev_for_each_lower_dev(port_priv->bridge_dev, other_dev, iter) {
> 
> netdev_for_each_lower_dev calls netdev_lower_get_next which has this in
> the comments:
> 
>  * The caller must hold RTNL lock or
>  * its own locking that guarantees that the neighbour lower
>  * list will remain unchanged.
> 
> Does that hold true for all callers, if you put ASSERT_RTNL() here?

No, not for all. The probe path uses this as well and is not protected
by the rtnl lock.

Good point, I'll add an explicit lock/unlock. Thanks.
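
Something along these lines around the probe-path caller, I think
(sketch):

	/* the probe path does not hold RTNL, which the lower-dev walk
	 * in dpaa2_switch_port_get_fdb_id() relies on
	 */
	rtnl_lock();
	fdb_id = dpaa2_switch_port_get_fdb_id(port_priv);
	rtnl_unlock();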

> 
> > +		if (!dpaa2_switch_port_dev_check(other_dev, NULL))
> > +			continue;
> > +
> > +		other_port_priv = netdev_priv(other_dev);
> > +		break;
> > +	}
> > +
> > +	/* We are the first dpaa2 switch interface to join the bridge, just use
> > +	 * our own FDB
> > +	 */
> > +	if (!other_port_priv)
> > +		other_port_priv = port_priv;
> > +
> > +	return other_port_priv->fdb_id;
> > +}
> > +
> >  static void *dpaa2_iova_to_virt(struct iommu_domain *domain,
> >  				dma_addr_t iova_addr)
> >  {
> > @@ -133,7 +163,7 @@ static int dpaa2_switch_port_add_vlan(struct ethsw_port_priv *port_priv,
> >  {
> >  	struct ethsw_core *ethsw = port_priv->ethsw_data;
> >  	struct net_device *netdev = port_priv->netdev;
> > -	struct dpsw_vlan_if_cfg vcfg;
> > +	struct dpsw_vlan_if_cfg vcfg = {0};
> >  	int err;
> >  
> >  	if (port_priv->vlans[vid]) {
> > @@ -141,8 +171,13 @@ static int dpaa2_switch_port_add_vlan(struct ethsw_port_priv *port_priv,
> >  		return -EEXIST;
> >  	}
> >  
> > +	/* If hit, this VLAN rule will lead the packet into the FDB table
> > +	 * specified in the vlan configuration below
> > +	 */
> 
> And this is the reason why VLAN-unaware mode is unsupported, right?

Yes, exactly.

> No
> hit on any VLAN rule => no FDB table selected for the packet. What is
> the default action for misses on VLAN rules? Drop or some default FDB
> ID?

The default action for misses on the VLAN table is drop. For example, if
a VLAN tagged packet is received on a switch interface which does not
have an upper VLAN interface with that VLAN id (thus
.ndo_vlan_rx_add_vid() is not called), the packet will get dropped
immediately.

> 
> >  	vcfg.num_ifs = 1;
> >  	vcfg.if_id[0] = port_priv->idx;
> > +	vcfg.fdb_id = dpaa2_switch_port_get_fdb_id(port_priv);
> > +	vcfg.options |= DPSW_VLAN_ADD_IF_OPT_FDB_ID;
> >  	err = dpsw_vlan_add_if(ethsw->mc_io, 0, ethsw->dpsw_handle, vid, &vcfg);
> >  	if (err) {
> >  		netdev_err(netdev, "dpsw_vlan_add_if err %d\n", err);
> > @@ -172,8 +207,10 @@ static int dpaa2_switch_port_add_vlan(struct ethsw_port_priv *port_priv,
> >  	return 0;
> >  }
> >  
> > -static int dpaa2_switch_set_learning(struct ethsw_core *ethsw, bool enable)
> > +static int dpaa2_switch_port_set_learning(struct ethsw_port_priv *port_priv, bool enable)
> 
> The commit message says nothing about changes to the learning
> configuration.

The learning flag is per FDB table, thus its configuration now has to
take into account the corresponding FDB of an interface.

Actually, being able to configure the learning flag is somewhat of an
inconvenience since this would also change the learning behavior of all
the other switch ports under the same bridge, all this without the
bridge actually being notified of the change.

I think I will just remove the code that handles changing the learning
status at the moment, until I can make changes in the MC firmware so
that this flag is indeed per port.

> 
> >  {
> > +	u16 fdb_id = dpaa2_switch_port_get_fdb_id(port_priv);
> > +	struct ethsw_core *ethsw = port_priv->ethsw_data;
> >  	enum dpsw_fdb_learning_mode learn_mode;
> >  	int err;
> >  
> > @@ -182,13 +219,12 @@ static int dpaa2_switch_set_learning(struct ethsw_core *ethsw, bool enable)
> >  	else
> >  		learn_mode = DPSW_FDB_LEARNING_MODE_DIS;
> >  
> > -	err = dpsw_fdb_set_learning_mode(ethsw->mc_io, 0, ethsw->dpsw_handle, 0,
> > +	err = dpsw_fdb_set_learning_mode(ethsw->mc_io, 0, ethsw->dpsw_handle, fdb_id,
> >  					 learn_mode);
> >  	if (err) {
> >  		dev_err(ethsw->dev, "dpsw_fdb_set_learning_mode err %d\n", err);
> >  		return err;
> >  	}
> > -	ethsw->learning = enable;
> >  
> >  	return 0;
> >  }
> > @@ -267,15 +303,17 @@ static int dpaa2_switch_port_fdb_add_uc(struct ethsw_port_priv *port_priv,
> >  					const unsigned char *addr)
> >  {
> >  	struct dpsw_fdb_unicast_cfg entry = {0};
> > +	u16 fdb_id;
> >  	int err;
> >  
> >  	entry.if_egress = port_priv->idx;
> >  	entry.type = DPSW_FDB_ENTRY_STATIC;
> >  	ether_addr_copy(entry.mac_addr, addr);
> >  
> > +	fdb_id = dpaa2_switch_port_get_fdb_id(port_priv);
> >  	err = dpsw_fdb_add_unicast(port_priv->ethsw_data->mc_io, 0,
> >  				   port_priv->ethsw_data->dpsw_handle,
> > -				   0, &entry);
> > +				   fdb_id, &entry);
> 
> Hmmm, so in dpaa2_switch_port_get_fdb_id you say:
> 
> 	/* If part of a bridge, use the FDB of the first dpaa2 switch interface
> 	 * to be present in that bridge
> 	 */
> 
> So let's say there is a br0 with swp3 and swp2, and a br1 with swp4.
> IIUC, br0 interfaces (swp3 and swp2) will have an fdb_id of 3 (due to
> swp3 being added first) and br1 will have an fdb_id of 4 (due to swp4).
> 
> When swp3 leaves br0, will this cause the fdb_id of swp2 to change?
> I expect the answer is yes, since otherwise swp2 and swp3 would keep
> forwarding packets to one another. Is this change graceful?
> 

Yes, the fdb_id of swp2 will change but, as the code is now, the new
id will only be used for newly added static FDB entries or newly
installed VLANs.

> For example, if you add a static FDB entry to swp2 prior to removing
> swp3, I would expect the fdb_id of swp2 to preserve that static FDB
> entry, even if swp2 now gets moved to a different fdb_id. Similarly,
> flooding settings, everything is preserved when the fdb_id changes?
> 

No, moving static FDB entries on a bridge leave/join does not happen
now.

> The flip side of that is: what happens if you add an FDB entry to swp2,
> then you remove swp2 from br0 and move it to br1? Will swp4 (which was
> already in br1) see that static FDB entry in hardware, even if the
> software bridge br1 hasn't notified you about it?
> 

As you said, moving any FDB entry from one FDB table to another one
would be a problem since the software bridge br1 would not be notified
of any of these new entries.

Taking all the above into account, what I think the code should do if
swp2, for example, leaves a bridge is to (see the sketch below):
 - update all VLAN entries installed for swp2 so that on a hit it would
   lead the packet into the new FDB table.
 - all static FDB entries already installed on swp2 should be removed
   from the FDB table corresponding to the previous bridge that the port
   was under.
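
A rough sketch of that leave path, where
dpaa2_switch_port_purge_static_fdb() and
dpaa2_switch_port_update_vlans() are hypothetical helpers for the two
steps above:

	static int dpaa2_switch_port_bridge_leave(struct ethsw_port_priv *port_priv)
	{
		u16 old_fdb_id = dpaa2_switch_port_get_fdb_id(port_priv);
		int err;

		/* step 2: drop the static entries installed while bridged */
		err = dpaa2_switch_port_purge_static_fdb(port_priv, old_fdb_id);
		if (err)
			return err;

		/* from now on, dpaa2_switch_port_get_fdb_id() returns the
		 * private, per-port FDB
		 */
		port_priv->bridge_dev = NULL;

		/* step 1: re-point the port's VLAN entries at the new FDB */
		return dpaa2_switch_port_update_vlans(port_priv,
						      port_priv->fdb_id);
	}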

> Basically, my question boils down to: why is there so little activity in
> dpaa2_switch_port_bridge_leave.
> 
> >  	if (err)
> >  		netdev_err(port_priv->netdev,
> >  			   "dpsw_fdb_add_unicast err %d\n", err);

* Re: [RFC 5/9] staging: dpaa2-switch: handle Rx path on control interface
  2020-11-05  0:45   ` Andrew Lunn
@ 2020-11-05 11:22     ` Ioana Ciornei
  0 siblings, 0 replies; 20+ messages in thread
From: Ioana Ciornei @ 2020-11-05 11:22 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Ioana Ciornei, Greg Kroah-Hartman, linux-kernel, netdev, Ioana Ciornei

On Thu, Nov 05, 2020 at 01:45:16AM +0100, Andrew Lunn wrote:
> > +/* Manage all NAPI instances for the control interface.
> > + *
> > + * We only have one RX queue and one Tx Conf queue for all
> > + * switch ports. Therefore, we only need to enable the NAPI instance once, the
> > + * first time one of the switch ports runs .dev_open().
> > + */
> > +
> > +static void dpaa2_switch_enable_ctrl_if_napi(struct ethsw_core *ethsw)
> > +{
> > +	int i;
> > +
> > +	/* a new interface is using the NAPI instance */
> > +	ethsw->napi_users++;
> > +
> > +	/* if there is already a user of the instance, return */
> > +	if (ethsw->napi_users > 1)
> > +		return;
> 
> Does there need to be any locking here? Or does it rely on RTNL?
> Maybe a comment would be nice, or a check that RTNL is actually held.
> 

It relies on the RTNL. I'll add an assert on the RTNL lock and a comment
to go with that.
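
I.e. something along these lines at the top of
dpaa2_switch_enable_ctrl_if_napi() and its disable counterpart:

	/* Relies on the RTNL lock: all callers come through
	 * .ndo_open()/.ndo_stop(), so napi_users cannot change
	 * underneath us.
	 */
	ASSERT_RTNL();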

> > +
> > +	if (!dpaa2_switch_has_ctrl_if(ethsw))
> > +		return;
> > +
> > +	for (i = 0; i < DPAA2_SWITCH_RX_NUM_FQS; i++)
> > +		napi_enable(&ethsw->fq[i].napi);
> > +}
> 
> > +static void dpaa2_switch_rx(struct dpaa2_switch_fq *fq,
> > +			    const struct dpaa2_fd *fd)
> > +{
> > +	struct ethsw_core *ethsw = fq->ethsw;
> > +	struct ethsw_port_priv *port_priv;
> > +	struct net_device *netdev;
> > +	struct vlan_ethhdr *hdr;
> > +	struct sk_buff *skb;
> > +	u16 vlan_tci, vid;
> > +	int if_id = -1;
> > +	int err;
> > +
> > +	/* prefetch the frame descriptor */
> > +	prefetch(fd);
> 
> Does this actually do any good, given that the next call:
> 
> > +
> > +	/* get switch ingress interface ID */
> > +	if_id = upper_32_bits(dpaa2_fd_get_flc(fd)) & 0x0000FFFF;
> 
> is accessing the frame descriptor? The idea of prefetch is to let it
> bring it into the cache while you are busy doing something else,
> hopefully with something which is already cache hot.
> 

I'll check with and without the prefetch but, most probably, it doesn't help.
Thanks.

Ioana

* Re: [RFC 6/9] staging: dpaa2-switch: add .ndo_start_xmit() callback
  2020-11-05  8:25     ` Ioana Ciornei
@ 2020-11-05 13:45       ` Andrew Lunn
  2020-11-05 15:51         ` Ioana Ciornei
  0 siblings, 1 reply; 20+ messages in thread
From: Andrew Lunn @ 2020-11-05 13:45 UTC (permalink / raw)
  To: Ioana Ciornei; +Cc: Greg Kroah-Hartman, linux-kernel, netdev, Ioana Ciornei

> > Where is the TX confirm which uses this stored pointer? I don't see it
> > in this file.
> > 
> 
> The Tx confirm - dpaa2_switch_tx_conf() - is added in patch 5/9.

Not so obvious. Could it be moved here?

> > It can be expensive to store a pointer like this in buffers used for
> > DMA.
> 
> Yes, it is. But the hardware does not give us any other indication that
> a packet was actually sent so that we can move ahead with consuming the
> initial skb.
> 
> > It has to be flushed out of the cache here as part of the
> > send. Then the TX complete needs to invalidate and then read it back
> > into the cache. Or you use coherent memory which is just slow.
> > 
> > It can be cheaper to keep a parallel ring in cacheable memory which
> > never gets flushed.
> 
> I'm afraid I don't really understand your suggestion. In this parallel
> ring I would keep the skb pointers of all frames which are in-flight?
> Then, when a packet is received on the Tx confirmation queue I would
> > have to loop over the parallel ring and somehow determine which skb
> > this packet was initially associated with. Isn't this even more expensive?

I don't know this particular hardware, so i will talk in general
terms. Generally, you have a transmit ring. You add new frames to be
sent to the beginning of the ring, and you take off completed frames
from the end of the ring. This is kept in 'expensive' memory, in that
either it is coherent, or you need to do flushed/invalidates.

It is expected that the hardware keeps to ring order. It does not pick
and choose which frames it sends, it does them in order. That means
completion also happens in ring order. So the driver can keep a simple
linear array the size of the ring, in cachable memory, with pointers
to the skbuf. And it just needs a counting index to know which one
just completed.
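
Roughly like this (all names made up, and assuming completions arrive
in submit order):

	#define DPAA2_SWITCH_TX_RING_SIZE	256	/* hypothetical, power of two */

	struct dpaa2_switch_tx_shadow {
		struct sk_buff *skb[DPAA2_SWITCH_TX_RING_SIZE];	/* cacheable memory */
		unsigned int head;	/* producer: .ndo_start_xmit() */
		unsigned int tail;	/* consumer: Tx confirmation */
	};

	/* on xmit */
	shadow->skb[shadow->head++ & (DPAA2_SWITCH_TX_RING_SIZE - 1)] = skb;

	/* on Tx confirmation, in the same order */
	skb = shadow->skb[shadow->tail++ & (DPAA2_SWITCH_TX_RING_SIZE - 1)];
	napi_consume_skb(skb, budget);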

Now, your hardware is more complex. You have one queue feeding
multiple switch ports. Maybe it does not keep to ring order? If you
have one port running at 10M/Half, and another at 10G/Full, does it
leave frames for the 10M/Half port in the ring when its egress queue is
full? That is probably a bad idea, since the 10G/Full port could then
starve for lack of free slots in the ring? So my guess would be, the
frames get dropped. And so ring order is maintained.

If you are paranoid it could get out of sync, keep an array of tuples,
address of the frame descriptor and the skbuf. If the fd address does
not match what you expect, then do the linear search of the fd
address, and increment a counter that something odd has happened.

	 Andrew

* Re: [RFC 6/9] staging: dpaa2-switch: add .ndo_start_xmit() callback
  2020-11-05 13:45       ` Andrew Lunn
@ 2020-11-05 15:51         ` Ioana Ciornei
  0 siblings, 0 replies; 20+ messages in thread
From: Ioana Ciornei @ 2020-11-05 15:51 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Ioana Ciornei, Greg Kroah-Hartman, linux-kernel, netdev, Ioana Ciornei

On Thu, Nov 05, 2020 at 02:45:12PM +0100, Andrew Lunn wrote:
> > > Where is the TX confirm which uses this stored pointer? I don't see it
> > > in this file.
> > > 
> > 
> > The Tx confirm - dpaa2_switch_tx_conf() - is added in patch 5/9.
> 
> Not so obvious. Could it be moved here?
> 

Sure, I'll move it here so that we have both Tx and Tx confirmation in
the same patch.

> > > It can be expensive to store a pointer like this in buffers used for
> > > DMA.
> > 
> > Yes, it is. But the hardware does not give us any other indication that
> > a packet was actually sent so that we can move ahead with consuming the
> > initial skb.
> > 
> > > It has to be flushed out of the cache here as part of the
> > > send. Then the TX complete needs to invalidate and then read it back
> > > into the cache. Or you use coherent memory which is just slow.
> > > 
> > > It can be cheaper to keep a parallel ring in cacheable memory which
> > > never gets flushed.
> > 
> > I'm afraid I don't really understand your suggestion. In this parallel
> > ring I would keep the skb pointers of all frames which are in-flight?
> > Then, when a packet is received on the Tx confirmation queue I would
> > > have to loop over the parallel ring and somehow determine which skb
> > > this packet was initially associated with. Isn't this even more expensive?
> 
> I don't know this particular hardware, so i will talk in general
> terms. Generally, you have a transmit ring. You add new frames to be
> sent to the beginning of the ring, and you take off completed frames
> from the end of the ring. This is kept in 'expensive' memory, in that
> either it is coherent, or you need to do flushed/invalidates.
> 
> It is expected that the hardware keeps to ring order. It does not pick
> and choose which frames it sends, it does them in order. That means
> completion also happens in ring order. So the driver can keep a simple
> linear array the size of the ring, in cacheable memory, with pointers
> to the skbuf. And it just needs a counting index to know which one
> just completed.

I agree with all of the above in a general sense.

> 
> Now, your hardware is more complex. You have one queue feeding
> multiple switch ports.

Not really. I have one Tx queue for each switch port and just one Tx
confirmation queue for all of them.

> Maybe it does not keep to ring order?

If the driver enqueues frames #1, #2, #3 in this exact order on a switch
port then the frames will arrive in the same order on the Tx
confirmation queue irrespective of any other traffic sent on other
switch ports.

> If you
> have one port running at 10M/Half, and another at 10G/Full, does it
> leave frames for the 10M/Half port in the ring when its egress queue is
> full? That is probably a bad idea, since the 10G/Full port could then
> starve for lack of free slots in the ring? So my guess would be, the
> frames get dropped. And so ring order is maintained.
> 
> If you are paranoid it could get out of sync, keep an array of tuples,
> address of the frame descriptor and the skbuf. If the fd address does
> not match what you expect, then do the linear search of the fd
> address, and increment a counter that something odd has happened.
> 

The problem with this would be, I think, with two TX softirqs on two
different cores which want to send a frame on the same switch port. In
order to update the shadow ring, there should be some kind of locking
mechanism for access to the shadow ring, which might invalidate
any attempt to make this more efficient.

This might not be a problem for the dpaa2-switch since it does not
enable NETIF_F_LLTX but it might be for dpaa2-eth.

Also, as the architecture is defined now, the driver does not really see
the Tx queues as having a fixed size, so it cannot infer a size for the
shadow copy.

I will have to dig a little bit more in this area to understand exactly
why the decision to use skb backpointers was made in the first place (I
am not really talking about the dpaa2-switch here, dpaa2-eth has the
same exact behavior and has been around for some time now).

Ioana

