* [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
@ 2020-05-15 18:47 Ioana Ciornei
  2020-05-15 18:47 ` [PATCH v2 net-next 1/7] dpaa2-eth: Add " Ioana Ciornei
                   ` (7 more replies)
  0 siblings, 8 replies; 28+ messages in thread
From: Ioana Ciornei @ 2020-05-15 18:47 UTC (permalink / raw)
  To: davem, netdev; +Cc: Ioana Ciornei

This patch set adds support for Rx traffic classes on DPAA2 Ethernet
devices.

The first two patches make the necessary changes so that multiple
traffic classes are configured and their statistics are displayed in
debugfs. The third patch adds a static distribution to said traffic
classes based on the VLAN PCP field.

The last patches add support for the congestion group taildrop mechanism
that allows us to control the number of frames that can accumulate on a
group of Rx frame queues belonging to the same traffic class.

Changes in v2:
  - use mask as u16* in set_vlan_qos - 3/7

Ioana Radulescu (7):
  dpaa2-eth: Add support for Rx traffic classes
  dpaa2-eth: Trim debugfs FQ stats
  dpaa2-eth: Distribute ingress frames based on VLAN prio
  dpaa2-eth: Add helper functions
  dpaa2-eth: Minor cleanup in dpaa2_eth_set_rx_taildrop()
  dpaa2-eth: Add congestion group taildrop
  dpaa2-eth: Update FQ taildrop threshold and buffer pool count

 .../freescale/dpaa2/dpaa2-eth-debugfs.c       |  11 +-
 .../net/ethernet/freescale/dpaa2/dpaa2-eth.c  | 225 +++++++++++++++---
 .../net/ethernet/freescale/dpaa2/dpaa2-eth.h  |  44 +++-
 .../ethernet/freescale/dpaa2/dpaa2-ethtool.c  |  24 +-
 .../net/ethernet/freescale/dpaa2/dpni-cmd.h   |  34 +++
 drivers/net/ethernet/freescale/dpaa2/dpni.c   | 131 ++++++++++
 drivers/net/ethernet/freescale/dpaa2/dpni.h   |  36 +++
 7 files changed, 450 insertions(+), 55 deletions(-)

-- 
2.17.1



* [PATCH v2 net-next 1/7] dpaa2-eth: Add support for Rx traffic classes
  2020-05-15 18:47 [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes Ioana Ciornei
@ 2020-05-15 18:47 ` Ioana Ciornei
  2020-05-15 18:47 ` [PATCH v2 net-next 2/7] dpaa2-eth: Trim debugfs FQ stats Ioana Ciornei
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 28+ messages in thread
From: Ioana Ciornei @ 2020-05-15 18:47 UTC (permalink / raw)
  To: davem, netdev; +Cc: Ioana Radulescu, Ioana Ciornei

From: Ioana Radulescu <ruxandra.radulescu@nxp.com>

The firmware reserves for each DPNI a number of RX frame queues
equal to the number of configured flows x number of configured
traffic classes.

Current driver configuration directs all incoming traffic to
FQs corresponding to TC0, leaving all other priority levels unused.

Start adding support for multiple ingress traffic classes by
configuring the FQs associated with all priority levels, not just
TC0. All settings that are per-TC, such as those related to
hashing and flow steering, are also updated.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v2:
 - none

 .../freescale/dpaa2/dpaa2-eth-debugfs.c       |  7 +-
 .../net/ethernet/freescale/dpaa2/dpaa2-eth.c  | 70 +++++++++++++------
 .../net/ethernet/freescale/dpaa2/dpaa2-eth.h  |  4 +-
 .../ethernet/freescale/dpaa2/dpaa2-ethtool.c  | 19 +++--
 4 files changed, 68 insertions(+), 32 deletions(-)
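
A worked example of the resulting FQ count, with illustrative values
rather than anything the patch mandates: on a DPNI configured with 8
Rx/Tx queue pairs and all 8 traffic classes, setup_fqs() now registers

	num_fqs = queue_count			/* TxConf FQs */
		+ queue_count * tc_count	/* Rx FQs, one set per TC */
		= 8 + 8 * 8 = 72

frame queues. The reworked bounds cover the worst case:
DPAA2_ETH_MAX_RX_QUEUES becomes DPAA2_ETH_MAX_RX_QUEUES_PER_TC *
DPAA2_ETH_MAX_TCS = 16 * 8 = 128, plus DPAA2_ETH_MAX_TX_QUEUES = 16
on the TxConf side.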

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth-debugfs.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth-debugfs.c
index 0a31e4268dfb..c453a23045c1 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth-debugfs.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth-debugfs.c
@@ -81,8 +81,8 @@ static int dpaa2_dbg_fqs_show(struct seq_file *file, void *offset)
 	int i, err;
 
 	seq_printf(file, "FQ stats for %s:\n", priv->net_dev->name);
-	seq_printf(file, "%s%16s%16s%16s%16s\n",
-		   "VFQID", "CPU", "Type", "Frames", "Pending frames");
+	seq_printf(file, "%s%16s%16s%16s%16s%16s\n",
+		   "VFQID", "CPU", "TC", "Type", "Frames", "Pending frames");
 
 	for (i = 0; i <  priv->num_fqs; i++) {
 		fq = &priv->fq[i];
@@ -90,9 +90,10 @@ static int dpaa2_dbg_fqs_show(struct seq_file *file, void *offset)
 		if (err)
 			fcnt = 0;
 
-		seq_printf(file, "%5d%16d%16s%16llu%16u\n",
+		seq_printf(file, "%5d%16d%16d%16s%16llu%16u\n",
 			   fq->fqid,
 			   fq->target_cpu,
+			   fq->tc,
 			   fq_type_to_str(fq),
 			   fq->stats.frames,
 			   fcnt);
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
index 4388d1039822..a8baf13063ae 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
@@ -1283,6 +1283,7 @@ static void disable_ch_napi(struct dpaa2_eth_priv *priv)
 static void dpaa2_eth_set_rx_taildrop(struct dpaa2_eth_priv *priv, bool enable)
 {
 	struct dpni_taildrop td = {0};
+	struct dpaa2_eth_fq *fq;
 	int i, err;
 
 	if (priv->rx_td_enabled == enable)
@@ -1292,11 +1293,12 @@ static void dpaa2_eth_set_rx_taildrop(struct dpaa2_eth_priv *priv, bool enable)
 	td.threshold = DPAA2_ETH_TAILDROP_THRESH;
 
 	for (i = 0; i < priv->num_fqs; i++) {
-		if (priv->fq[i].type != DPAA2_RX_FQ)
+		fq = &priv->fq[i];
+		if (fq->type != DPAA2_RX_FQ)
 			continue;
 		err = dpni_set_taildrop(priv->mc_io, 0, priv->mc_token,
-					DPNI_CP_QUEUE, DPNI_QUEUE_RX, 0,
-					priv->fq[i].flowid, &td);
+					DPNI_CP_QUEUE, DPNI_QUEUE_RX,
+					fq->tc, fq->flowid, &td);
 		if (err) {
 			netdev_err(priv->net_dev,
 				   "dpni_set_taildrop() failed\n");
@@ -2400,7 +2402,7 @@ static void set_fq_affinity(struct dpaa2_eth_priv *priv)
 
 static void setup_fqs(struct dpaa2_eth_priv *priv)
 {
-	int i;
+	int i, j;
 
 	/* We have one TxConf FQ per Tx flow.
 	 * The number of Tx and Rx queues is the same.
@@ -2412,10 +2414,13 @@ static void setup_fqs(struct dpaa2_eth_priv *priv)
 		priv->fq[priv->num_fqs++].flowid = (u16)i;
 	}
 
-	for (i = 0; i < dpaa2_eth_queue_count(priv); i++) {
-		priv->fq[priv->num_fqs].type = DPAA2_RX_FQ;
-		priv->fq[priv->num_fqs].consume = dpaa2_eth_rx;
-		priv->fq[priv->num_fqs++].flowid = (u16)i;
+	for (j = 0; j < dpaa2_eth_tc_count(priv); j++) {
+		for (i = 0; i < dpaa2_eth_queue_count(priv); i++) {
+			priv->fq[priv->num_fqs].type = DPAA2_RX_FQ;
+			priv->fq[priv->num_fqs].consume = dpaa2_eth_rx;
+			priv->fq[priv->num_fqs].tc = (u8)j;
+			priv->fq[priv->num_fqs++].flowid = (u16)i;
+		}
 	}
 
 	/* For each FQ, decide on which core to process incoming frames */
@@ -2782,7 +2787,7 @@ static int setup_rx_flow(struct dpaa2_eth_priv *priv,
 	int err;
 
 	err = dpni_get_queue(priv->mc_io, 0, priv->mc_token,
-			     DPNI_QUEUE_RX, 0, fq->flowid, &queue, &qid);
+			     DPNI_QUEUE_RX, fq->tc, fq->flowid, &queue, &qid);
 	if (err) {
 		dev_err(dev, "dpni_get_queue(RX) failed\n");
 		return err;
@@ -2795,7 +2800,7 @@ static int setup_rx_flow(struct dpaa2_eth_priv *priv,
 	queue.destination.priority = 1;
 	queue.user_context = (u64)(uintptr_t)fq;
 	err = dpni_set_queue(priv->mc_io, 0, priv->mc_token,
-			     DPNI_QUEUE_RX, 0, fq->flowid,
+			     DPNI_QUEUE_RX, fq->tc, fq->flowid,
 			     DPNI_QUEUE_OPT_USER_CTX | DPNI_QUEUE_OPT_DEST,
 			     &queue);
 	if (err) {
@@ -2804,6 +2809,10 @@ static int setup_rx_flow(struct dpaa2_eth_priv *priv,
 	}
 
 	/* xdp_rxq setup */
+	/* only once for each channel */
+	if (fq->tc > 0)
+		return 0;
+
 	err = xdp_rxq_info_reg(&fq->channel->xdp_rxq, priv->net_dev,
 			       fq->flowid);
 	if (err) {
@@ -2941,7 +2950,7 @@ static int config_legacy_hash_key(struct dpaa2_eth_priv *priv, dma_addr_t key)
 {
 	struct device *dev = priv->net_dev->dev.parent;
 	struct dpni_rx_tc_dist_cfg dist_cfg;
-	int err;
+	int i, err = 0;
 
 	memset(&dist_cfg, 0, sizeof(dist_cfg));
 
@@ -2949,9 +2958,14 @@ static int config_legacy_hash_key(struct dpaa2_eth_priv *priv, dma_addr_t key)
 	dist_cfg.dist_size = dpaa2_eth_queue_count(priv);
 	dist_cfg.dist_mode = DPNI_DIST_MODE_HASH;
 
-	err = dpni_set_rx_tc_dist(priv->mc_io, 0, priv->mc_token, 0, &dist_cfg);
-	if (err)
-		dev_err(dev, "dpni_set_rx_tc_dist failed\n");
+	for (i = 0; i < dpaa2_eth_tc_count(priv); i++) {
+		err = dpni_set_rx_tc_dist(priv->mc_io, 0, priv->mc_token,
+					  i, &dist_cfg);
+		if (err) {
+			dev_err(dev, "dpni_set_rx_tc_dist failed\n");
+			break;
+		}
+	}
 
 	return err;
 }
@@ -2961,7 +2975,7 @@ static int config_hash_key(struct dpaa2_eth_priv *priv, dma_addr_t key)
 {
 	struct device *dev = priv->net_dev->dev.parent;
 	struct dpni_rx_dist_cfg dist_cfg;
-	int err;
+	int i, err = 0;
 
 	memset(&dist_cfg, 0, sizeof(dist_cfg));
 
@@ -2969,9 +2983,15 @@ static int config_hash_key(struct dpaa2_eth_priv *priv, dma_addr_t key)
 	dist_cfg.dist_size = dpaa2_eth_queue_count(priv);
 	dist_cfg.enable = 1;
 
-	err = dpni_set_rx_hash_dist(priv->mc_io, 0, priv->mc_token, &dist_cfg);
-	if (err)
-		dev_err(dev, "dpni_set_rx_hash_dist failed\n");
+	for (i = 0; i < dpaa2_eth_tc_count(priv); i++) {
+		dist_cfg.tc = i;
+		err = dpni_set_rx_hash_dist(priv->mc_io, 0, priv->mc_token,
+					    &dist_cfg);
+		if (err) {
+			dev_err(dev, "dpni_set_rx_hash_dist failed\n");
+			break;
+		}
+	}
 
 	return err;
 }
@@ -2981,7 +3001,7 @@ static int config_cls_key(struct dpaa2_eth_priv *priv, dma_addr_t key)
 {
 	struct device *dev = priv->net_dev->dev.parent;
 	struct dpni_rx_dist_cfg dist_cfg;
-	int err;
+	int i, err = 0;
 
 	memset(&dist_cfg, 0, sizeof(dist_cfg));
 
@@ -2989,9 +3009,15 @@ static int config_cls_key(struct dpaa2_eth_priv *priv, dma_addr_t key)
 	dist_cfg.dist_size = dpaa2_eth_queue_count(priv);
 	dist_cfg.enable = 1;
 
-	err = dpni_set_rx_fs_dist(priv->mc_io, 0, priv->mc_token, &dist_cfg);
-	if (err)
-		dev_err(dev, "dpni_set_rx_fs_dist failed\n");
+	for (i = 0; i < dpaa2_eth_tc_count(priv); i++) {
+		dist_cfg.tc = i;
+		err = dpni_set_rx_fs_dist(priv->mc_io, 0, priv->mc_token,
+					  &dist_cfg);
+		if (err) {
+			dev_err(dev, "dpni_set_rx_fs_dist failed\n");
+			break;
+		}
+	}
 
 	return err;
 }
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
index 0581fbf1f98c..580ad5fd7bd8 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
@@ -294,7 +294,9 @@ struct dpaa2_eth_ch_stats {
 
 /* Maximum number of queues associated with a DPNI */
 #define DPAA2_ETH_MAX_TCS		8
-#define DPAA2_ETH_MAX_RX_QUEUES		16
+#define DPAA2_ETH_MAX_RX_QUEUES_PER_TC	16
+#define DPAA2_ETH_MAX_RX_QUEUES		\
+	(DPAA2_ETH_MAX_RX_QUEUES_PER_TC * DPAA2_ETH_MAX_TCS)
 #define DPAA2_ETH_MAX_TX_QUEUES		16
 #define DPAA2_ETH_MAX_QUEUES		(DPAA2_ETH_MAX_RX_QUEUES + \
 					DPAA2_ETH_MAX_TX_QUEUES)
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-ethtool.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-ethtool.c
index bd13ee48d623..1d0402011ac5 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-ethtool.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-ethtool.c
@@ -547,7 +547,7 @@ static int do_cls_rule(struct net_device *net_dev,
 	dma_addr_t key_iova;
 	u64 fields = 0;
 	void *key_buf;
-	int err;
+	int i, err;
 
 	if (fs->ring_cookie != RX_CLS_FLOW_DISC &&
 	    fs->ring_cookie >= dpaa2_eth_queue_count(priv))
@@ -607,11 +607,18 @@ static int do_cls_rule(struct net_device *net_dev,
 			fs_act.options |= DPNI_FS_OPT_DISCARD;
 		else
 			fs_act.flow_id = fs->ring_cookie;
-		err = dpni_add_fs_entry(priv->mc_io, 0, priv->mc_token, 0,
-					fs->location, &rule_cfg, &fs_act);
-	} else {
-		err = dpni_remove_fs_entry(priv->mc_io, 0, priv->mc_token, 0,
-					   &rule_cfg);
+	}
+	for (i = 0; i < dpaa2_eth_tc_count(priv); i++) {
+		if (add)
+			err = dpni_add_fs_entry(priv->mc_io, 0, priv->mc_token,
+						i, fs->location, &rule_cfg,
+						&fs_act);
+		else
+			err = dpni_remove_fs_entry(priv->mc_io, 0,
+						   priv->mc_token, i,
+						   &rule_cfg);
+		if (err)
+			break;
 	}
 
 	dma_unmap_single(dev, key_iova, rule_cfg.key_size * 2, DMA_TO_DEVICE);
-- 
2.17.1



* [PATCH v2 net-next 2/7] dpaa2-eth: Trim debugfs FQ stats
  2020-05-15 18:47 [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes Ioana Ciornei
  2020-05-15 18:47 ` [PATCH v2 net-next 1/7] dpaa2-eth: Add " Ioana Ciornei
@ 2020-05-15 18:47 ` Ioana Ciornei
  2020-05-15 18:47 ` [PATCH v2 net-next 3/7] dpaa2-eth: Distribute ingress frames based on VLAN prio Ioana Ciornei
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 28+ messages in thread
From: Ioana Ciornei @ 2020-05-15 18:47 UTC (permalink / raw)
  To: davem, netdev; +Cc: Ioana Radulescu, Ioana Ciornei

From: Ioana Radulescu <ruxandra.radulescu@nxp.com>

With the addition of support for multiple traffic classes, the number
of available frame queues grew significantly, overly inflating the
debugfs FQ statistics entry. Update it to show only the queues
which are actually in use (i.e. have a non-zero frame counter).

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v2:
 - none

 drivers/net/ethernet/freescale/dpaa2/dpaa2-eth-debugfs.c | 4 ++++
 1 file changed, 4 insertions(+)
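
With the TC column added in patch 1 and the filter above applied, the
debugfs entry would look roughly like this (hypothetical interface
name, FQIDs and counters; column spacing approximated):

	FQ stats for eth0:
	VFQID             CPU              TC            Type          Frames  Pending frames
	  136               0               0              Rx            5217               0
	  137               1               0              Rx            4980               2
	  144               0               0         Tx conf            9873               0

Queues whose cumulative and pending frame counters are both zero are
simply skipped.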

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth-debugfs.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth-debugfs.c
index c453a23045c1..2880ca02d7e7 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth-debugfs.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth-debugfs.c
@@ -90,6 +90,10 @@ static int dpaa2_dbg_fqs_show(struct seq_file *file, void *offset)
 		if (err)
 			fcnt = 0;
 
+		/* Skip FQs with no traffic */
+		if (!fq->stats.frames && !fcnt)
+			continue;
+
 		seq_printf(file, "%5d%16d%16d%16s%16llu%16u\n",
 			   fq->fqid,
 			   fq->target_cpu,
-- 
2.17.1



* [PATCH v2 net-next 3/7] dpaa2-eth: Distribute ingress frames based on VLAN prio
  2020-05-15 18:47 [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes Ioana Ciornei
  2020-05-15 18:47 ` [PATCH v2 net-next 1/7] dpaa2-eth: Add " Ioana Ciornei
  2020-05-15 18:47 ` [PATCH v2 net-next 2/7] dpaa2-eth: Trim debugfs FQ stats Ioana Ciornei
@ 2020-05-15 18:47 ` Ioana Ciornei
  2020-05-15 23:07   ` kbuild test robot
  2020-05-15 18:47 ` [PATCH v2 net-next 4/7] dpaa2-eth: Add helper functions Ioana Ciornei
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 28+ messages in thread
From: Ioana Ciornei @ 2020-05-15 18:47 UTC (permalink / raw)
  To: davem, netdev; +Cc: Ioana Radulescu, Ioana Ciornei

From: Ioana Radulescu <ruxandra.radulescu@nxp.com>

Configure static ingress classification based on VLAN PCP field.
If the DPNI doesn't have enough traffic classes to accommodate all
priority levels, the lowest ones end up on TC 0 (default on miss).

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v2:
  - use mask as u16* in set_vlan_qos

 .../net/ethernet/freescale/dpaa2/dpaa2-eth.c  | 117 ++++++++++++++++
 .../net/ethernet/freescale/dpaa2/dpaa2-eth.h  |   1 +
 .../net/ethernet/freescale/dpaa2/dpni-cmd.h   |  34 +++++
 drivers/net/ethernet/freescale/dpaa2/dpni.c   | 131 ++++++++++++++++++
 drivers/net/ethernet/freescale/dpaa2/dpni.h   |  36 +++++
 5 files changed, 319 insertions(+)
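
As a worked example of the mapping this patch sets up: on a DPNI
created with 4 traffic classes, the rule-installation loop in
set_vlan_qos() below (i = tc_count - 1, pcp = 7, both decremented
together) adds the entries

	VLAN PCP 7 -> TC 3
	VLAN PCP 6 -> TC 2
	VLAN PCP 5 -> TC 1
	VLAN PCP 4 -> TC 0

while frames with PCP 3..0, as well as untagged frames, match no QoS
entry and land on the default TC 0, since discard_on_miss is set to 0.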

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
index a8baf13063ae..f4f085b7829a 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
@@ -2689,6 +2689,119 @@ static void update_tx_fqids(struct dpaa2_eth_priv *priv)
 	priv->enqueue = dpaa2_eth_enqueue_qd;
 }
 
+/* Configure ingress classification based on VLAN PCP */
+static int set_vlan_qos(struct dpaa2_eth_priv *priv)
+{
+	struct device *dev = priv->net_dev->dev.parent;
+	struct dpkg_profile_cfg kg_cfg = {0};
+	struct dpni_qos_tbl_cfg qos_cfg = {0};
+	struct dpni_rule_cfg key_params;
+	void *dma_mem, *key;
+	u8 key_size = 2;	/* VLAN TCI field */
+	int i, pcp, err;
+	u16 *mask;
+
+	/* VLAN-based classification only makes sense if we have multiple
+	 * traffic classes.
+	 * Also, we need to extract just the 3-bit PCP field from the VLAN
+	 * header and we can only do that by using a mask
+	 */
+	if (dpaa2_eth_tc_count(priv) == 1 || !dpaa2_eth_fs_mask_enabled(priv)) {
+		dev_dbg(dev, "VLAN-based QoS classification not supported\n");
+		return -EOPNOTSUPP;
+	}
+
+	dma_mem = kzalloc(DPAA2_CLASSIFIER_DMA_SIZE, GFP_KERNEL);
+	if (!dma_mem)
+		return -ENOMEM;
+
+	kg_cfg.num_extracts = 1;
+	kg_cfg.extracts[0].type = DPKG_EXTRACT_FROM_HDR;
+	kg_cfg.extracts[0].extract.from_hdr.prot = NET_PROT_VLAN;
+	kg_cfg.extracts[0].extract.from_hdr.type = DPKG_FULL_FIELD;
+	kg_cfg.extracts[0].extract.from_hdr.field = NH_FLD_VLAN_TCI;
+
+	err =  dpni_prepare_key_cfg(&kg_cfg, dma_mem);
+	if (err) {
+		dev_err(dev, "dpni_prepare_key_cfg failed\n");
+		goto out_free_tbl;
+	}
+
+	/* set QoS table */
+	qos_cfg.default_tc = 0;
+	qos_cfg.discard_on_miss = 0;
+	qos_cfg.key_cfg_iova = dma_map_single(dev, dma_mem,
+					      DPAA2_CLASSIFIER_DMA_SIZE,
+					      DMA_TO_DEVICE);
+	if (dma_mapping_error(dev, qos_cfg.key_cfg_iova)) {
+		dev_err(dev, "QoS table DMA mapping failed\n");
+		err = -ENOMEM;
+		goto out_free_tbl;
+	}
+
+	err = dpni_set_qos_table(priv->mc_io, 0, priv->mc_token, &qos_cfg);
+	if (err) {
+		dev_err(dev, "dpni_set_qos_table failed\n");
+		goto out_unmap_tbl;
+	}
+
+	/* Add QoS table entries */
+	key = kzalloc(key_size * 2, GFP_KERNEL);
+	if (!key) {
+		err = -ENOMEM;
+		goto out_unmap_tbl;
+	}
+	mask = key + key_size;
+	*mask = cpu_to_be16(VLAN_PRIO_MASK);
+
+	key_params.key_iova = dma_map_single(dev, key, key_size * 2,
+					     DMA_TO_DEVICE);
+	if (dma_mapping_error(dev, key_params.key_iova)) {
+		dev_err(dev, "Qos table entry DMA mapping failed\n");
+		err = -ENOMEM;
+		goto out_free_key;
+	}
+
+	key_params.mask_iova = key_params.key_iova + key_size;
+	key_params.key_size = key_size;
+
+	/* We add rules for PCP-based distribution starting with highest
+	 * priority (VLAN PCP = 7). If this DPNI doesn't have enough traffic
+	 * classes to accommodate all priority levels, the lowest ones end up
+	 * on TC 0 which was configured as default
+	 */
+	for (i = dpaa2_eth_tc_count(priv) - 1, pcp = 7; i >= 0; i--, pcp--) {
+		*(u16 *)key = cpu_to_be16(pcp << VLAN_PRIO_SHIFT);
+		dma_sync_single_for_device(dev, key_params.key_iova,
+					   key_size * 2, DMA_TO_DEVICE);
+
+		err = dpni_add_qos_entry(priv->mc_io, 0, priv->mc_token,
+					 &key_params, i, i);
+		if (err) {
+			dev_err(dev, "dpni_add_qos_entry failed\n");
+			dpni_clear_qos_table(priv->mc_io, 0, priv->mc_token);
+			goto out_unmap_key;
+		}
+	}
+
+	priv->vlan_cls_enabled = true;
+
+	/* Table and key memory is not persistent, clean everything up after
+	 * configuration is finished
+	 */
+out_unmap_key:
+	dma_unmap_single(dev, key_params.key_iova, key_size * 2, DMA_TO_DEVICE);
+out_free_key:
+	kfree(key);
+out_unmap_tbl:
+	dma_unmap_single(dev, qos_cfg.key_cfg_iova, DPAA2_CLASSIFIER_DMA_SIZE,
+			 DMA_TO_DEVICE);
+out_free_tbl:
+	kfree(dma_mem);
+
+	return err;
+}
+
 /* Configure the DPNI object this interface is associated with */
 static int setup_dpni(struct fsl_mc_device *ls_dev)
 {
@@ -2751,6 +2864,10 @@ static int setup_dpni(struct fsl_mc_device *ls_dev)
 			goto close;
 	}
 
+	err = set_vlan_qos(priv);
+	if (err && err != -EOPNOTSUPP)
+		goto close;
+
 	priv->cls_rules = devm_kzalloc(dev, sizeof(struct dpaa2_eth_cls_rule) *
 				       dpaa2_eth_fs_count(priv), GFP_KERNEL);
 	if (!priv->cls_rules) {
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
index 580ad5fd7bd8..7856f69bcf36 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
@@ -427,6 +427,7 @@ struct dpaa2_eth_priv {
 	u64 rx_cls_fields;
 	struct dpaa2_eth_cls_rule *cls_rules;
 	u8 rx_cls_enabled;
+	u8 vlan_cls_enabled;
 	struct bpf_prog *xdp_prog;
 #ifdef CONFIG_DEBUG_FS
 	struct dpaa2_debugfs dbg;
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpni-cmd.h b/drivers/net/ethernet/freescale/dpaa2/dpni-cmd.h
index d9b6918807af..0048e856f85e 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpni-cmd.h
+++ b/drivers/net/ethernet/freescale/dpaa2/dpni-cmd.h
@@ -59,6 +59,10 @@
 
 #define DPNI_CMDID_SET_RX_TC_DIST			DPNI_CMD(0x235)
 
+#define DPNI_CMDID_SET_QOS_TBL				DPNI_CMD(0x240)
+#define DPNI_CMDID_ADD_QOS_ENT				DPNI_CMD(0x241)
+#define DPNI_CMDID_REMOVE_QOS_ENT			DPNI_CMD(0x242)
+#define DPNI_CMDID_CLR_QOS_TBL				DPNI_CMD(0x243)
 #define DPNI_CMDID_ADD_FS_ENT				DPNI_CMD(0x244)
 #define DPNI_CMDID_REMOVE_FS_ENT			DPNI_CMD(0x245)
 #define DPNI_CMDID_CLR_FS_ENT				DPNI_CMD(0x246)
@@ -567,4 +571,34 @@ struct dpni_cmd_remove_fs_entry {
 	__le64 mask_iova;
 };
 
+#define DPNI_DISCARD_ON_MISS_SHIFT	0
+#define DPNI_DISCARD_ON_MISS_SIZE	1
+
+struct dpni_cmd_set_qos_table {
+	__le32 pad;
+	u8 default_tc;
+	/* only the LSB */
+	u8 discard_on_miss;
+	__le16 pad1[21];
+	__le64 key_cfg_iova;
+};
+
+struct dpni_cmd_add_qos_entry {
+	__le16 pad;
+	u8 tc_id;
+	u8 key_size;
+	__le16 index;
+	__le16 pad1;
+	__le64 key_iova;
+	__le64 mask_iova;
+};
+
+struct dpni_cmd_remove_qos_entry {
+	u8 pad[3];
+	u8 key_size;
+	__le32 pad1;
+	__le64 key_iova;
+	__le64 mask_iova;
+};
+
 #endif /* _FSL_DPNI_CMD_H */
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpni.c b/drivers/net/ethernet/freescale/dpaa2/dpni.c
index dd54e6953aeb..78fa325407ca 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpni.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpni.c
@@ -1786,3 +1786,134 @@ int dpni_remove_fs_entry(struct fsl_mc_io *mc_io,
 	/* send command to mc*/
 	return mc_send_command(mc_io, &cmd);
 }
+
+/**
+ * dpni_set_qos_table() - Set QoS mapping table
+ * @mc_io:	Pointer to MC portal's I/O object
+ * @cmd_flags:	Command flags; one or more of 'MC_CMD_FLAG_'
+ * @token:	Token of DPNI object
+ * @cfg:	QoS table configuration
+ *
+ * This function and all QoS-related functions require that
+ * 'max_tcs > 1' was set at DPNI creation.
+ *
+ * warning: Before calling this function, call dpkg_prepare_key_cfg() to
+ *			prepare the key_cfg_iova parameter
+ *
+ * Return:	'0' on Success; Error code otherwise.
+ */
+int dpni_set_qos_table(struct fsl_mc_io *mc_io,
+		       u32 cmd_flags,
+		       u16 token,
+		       const struct dpni_qos_tbl_cfg *cfg)
+{
+	struct dpni_cmd_set_qos_table *cmd_params;
+	struct fsl_mc_command cmd = { 0 };
+
+	/* prepare command */
+	cmd.header = mc_encode_cmd_header(DPNI_CMDID_SET_QOS_TBL,
+					  cmd_flags,
+					  token);
+	cmd_params = (struct dpni_cmd_set_qos_table *)cmd.params;
+	cmd_params->default_tc = cfg->default_tc;
+	cmd_params->key_cfg_iova = cpu_to_le64(cfg->key_cfg_iova);
+	dpni_set_field(cmd_params->discard_on_miss, DISCARD_ON_MISS,
+		       cfg->discard_on_miss);
+
+	/* send command to mc*/
+	return mc_send_command(mc_io, &cmd);
+}
+
+/**
+ * dpni_add_qos_entry() - Add QoS mapping entry (to select a traffic class)
+ * @mc_io:	Pointer to MC portal's I/O object
+ * @cmd_flags:	Command flags; one or more of 'MC_CMD_FLAG_'
+ * @token:	Token of DPNI object
+ * @cfg:	QoS rule to add
+ * @tc_id:	Traffic class selection (0-7)
+ * @index:	Location in the QoS table where to insert the entry.
+ *		Only relevant if MASKING is enabled for QoS classification on
+ *		this DPNI, it is ignored for exact match.
+ *
+ * Return:	'0' on Success; Error code otherwise.
+ */
+int dpni_add_qos_entry(struct fsl_mc_io *mc_io,
+		       u32 cmd_flags,
+		       u16 token,
+		       const struct dpni_rule_cfg *cfg,
+		       u8 tc_id,
+		       u16 index)
+{
+	struct dpni_cmd_add_qos_entry *cmd_params;
+	struct fsl_mc_command cmd = { 0 };
+
+	/* prepare command */
+	cmd.header = mc_encode_cmd_header(DPNI_CMDID_ADD_QOS_ENT,
+					  cmd_flags,
+					  token);
+	cmd_params = (struct dpni_cmd_add_qos_entry *)cmd.params;
+	cmd_params->tc_id = tc_id;
+	cmd_params->key_size = cfg->key_size;
+	cmd_params->index = cpu_to_le16(index);
+	cmd_params->key_iova = cpu_to_le64(cfg->key_iova);
+	cmd_params->mask_iova = cpu_to_le64(cfg->mask_iova);
+
+	/* send command to mc*/
+	return mc_send_command(mc_io, &cmd);
+}
+
+/**
+ * dpni_remove_qos_entry() - Remove QoS mapping entry
+ * @mc_io:	Pointer to MC portal's I/O object
+ * @cmd_flags:	Command flags; one or more of 'MC_CMD_FLAG_'
+ * @token:	Token of DPNI object
+ * @cfg:	QoS rule to remove
+ *
+ * Return:	'0' on Success; Error code otherwise.
+ */
+int dpni_remove_qos_entry(struct fsl_mc_io *mc_io,
+			  u32 cmd_flags,
+			  u16 token,
+			  const struct dpni_rule_cfg *cfg)
+{
+	struct dpni_cmd_remove_qos_entry *cmd_params;
+	struct fsl_mc_command cmd = { 0 };
+
+	/* prepare command */
+	cmd.header = mc_encode_cmd_header(DPNI_CMDID_REMOVE_QOS_ENT,
+					  cmd_flags,
+					  token);
+	cmd_params = (struct dpni_cmd_remove_qos_entry *)cmd.params;
+	cmd_params->key_size = cfg->key_size;
+	cmd_params->key_iova = cpu_to_le64(cfg->key_iova);
+	cmd_params->mask_iova = cpu_to_le64(cfg->mask_iova);
+
+	/* send command to mc*/
+	return mc_send_command(mc_io, &cmd);
+}
+
+/**
+ * dpni_clear_qos_table() - Clear all QoS mapping entries
+ * @mc_io:	Pointer to MC portal's I/O object
+ * @cmd_flags:	Command flags; one or more of 'MC_CMD_FLAG_'
+ * @token:	Token of DPNI object
+ *
+ * Following this function call, all frames are directed to
+ * the default traffic class (0)
+ *
+ * Return:	'0' on Success; Error code otherwise.
+ */
+int dpni_clear_qos_table(struct fsl_mc_io *mc_io,
+			 u32 cmd_flags,
+			 u16 token)
+{
+	struct fsl_mc_command cmd = { 0 };
+
+	/* prepare command */
+	cmd.header = mc_encode_cmd_header(DPNI_CMDID_CLR_QOS_TBL,
+					  cmd_flags,
+					  token);
+
+	/* send command to mc*/
+	return mc_send_command(mc_io, &cmd);
+}
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpni.h b/drivers/net/ethernet/freescale/dpaa2/dpni.h
index ee0711d06b3a..8c7ac20bf1a7 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpni.h
+++ b/drivers/net/ethernet/freescale/dpaa2/dpni.h
@@ -715,6 +715,26 @@ int dpni_set_rx_hash_dist(struct fsl_mc_io *mc_io,
 			  u16 token,
 			  const struct dpni_rx_dist_cfg *cfg);
 
+/**
+ * struct dpni_qos_tbl_cfg - Structure representing QOS table configuration
+ * @key_cfg_iova: I/O virtual address of 256 bytes DMA-able memory filled with
+ *		key extractions to be used as the QoS criteria by calling
+ *		dpkg_prepare_key_cfg()
+ * @discard_on_miss: Set to '1' to discard frames in case of no match (miss);
+ *		'0' to use the 'default_tc' in such cases
+ * @default_tc: Used in case of no-match and 'discard_on_miss'= 0
+ */
+struct dpni_qos_tbl_cfg {
+	u64 key_cfg_iova;
+	int discard_on_miss;
+	u8 default_tc;
+};
+
+int dpni_set_qos_table(struct fsl_mc_io *mc_io,
+		       u32 cmd_flags,
+		       u16 token,
+		       const struct dpni_qos_tbl_cfg *cfg);
+
 /**
  * enum dpni_dest - DPNI destination types
  * @DPNI_DEST_NONE: Unassigned destination; The queue is set in parked mode and
@@ -961,6 +981,22 @@ int dpni_remove_fs_entry(struct fsl_mc_io *mc_io,
 			 u8 tc_id,
 			 const struct dpni_rule_cfg *cfg);
 
+int dpni_add_qos_entry(struct fsl_mc_io *mc_io,
+		       u32 cmd_flags,
+		       u16 token,
+		       const struct dpni_rule_cfg *cfg,
+		       u8 tc_id,
+		       u16 index);
+
+int dpni_remove_qos_entry(struct fsl_mc_io *mc_io,
+			  u32 cmd_flags,
+			  u16 token,
+			  const struct dpni_rule_cfg *cfg);
+
+int dpni_clear_qos_table(struct fsl_mc_io *mc_io,
+			 u32 cmd_flags,
+			 u16 token);
+
 int dpni_get_api_version(struct fsl_mc_io *mc_io,
 			 u32 cmd_flags,
 			 u16 *major_ver,
-- 
2.17.1



* [PATCH v2 net-next 4/7] dpaa2-eth: Add helper functions
  2020-05-15 18:47 [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes Ioana Ciornei
                   ` (2 preceding siblings ...)
  2020-05-15 18:47 ` [PATCH v2 net-next 3/7] dpaa2-eth: Distribute ingress frames based on VLAN prio Ioana Ciornei
@ 2020-05-15 18:47 ` Ioana Ciornei
  2020-05-15 18:47 ` [PATCH v2 net-next 5/7] dpaa2-eth: Minor cleanup in dpaa2_eth_set_rx_taildrop() Ioana Ciornei
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 28+ messages in thread
From: Ioana Ciornei @ 2020-05-15 18:47 UTC (permalink / raw)
  To: davem, netdev; +Cc: Ioana Radulescu, Ioana Ciornei

From: Ioana Radulescu <ruxandra.radulescu@nxp.com>

Add convenience helper functions that determine whether Rx/Tx pause
frames are enabled, based on the link state flags received from
firmware.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v2:
 - none

 drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c     |  3 +--
 drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h     | 11 +++++++++++
 drivers/net/ethernet/freescale/dpaa2/dpaa2-ethtool.c |  5 ++---
 3 files changed, 14 insertions(+), 5 deletions(-)
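
For reference, the helpers encode the usual PAUSE/ASYM_PAUSE flow
control semantics; the four possible combinations of the link options
map to Rx/Tx pause as follows:

	PAUSE  ASYM_PAUSE  |  Rx pause  Tx pause
	  0        0       |    off       off
	  0        1       |    off       on
	  1        0       |    on        on
	  1        1       |    on        off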

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
index f4f085b7829a..d426fc1f20ef 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
@@ -1326,8 +1326,7 @@ static int link_state_update(struct dpaa2_eth_priv *priv)
 	 * Rx FQ taildrop configuration as well. We configure taildrop
 	 * only when pause frame generation is disabled.
 	 */
-	tx_pause = !!(state.options & DPNI_LINK_OPT_PAUSE) ^
-		   !!(state.options & DPNI_LINK_OPT_ASYM_PAUSE);
+	tx_pause = dpaa2_eth_tx_pause_enabled(state.options);
 	dpaa2_eth_set_rx_taildrop(priv, !tx_pause);
 
 	/* When we manage the MAC/PHY using phylink there is no need
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
index 7856f69bcf36..6384f6a23349 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
@@ -510,6 +510,17 @@ enum dpaa2_eth_rx_dist {
 	(dpaa2_eth_cmp_dpni_ver((priv), DPNI_PAUSE_VER_MAJOR,	\
 				DPNI_PAUSE_VER_MINOR) >= 0)
 
+static inline bool dpaa2_eth_tx_pause_enabled(u64 link_options)
+{
+	return !!(link_options & DPNI_LINK_OPT_PAUSE) ^
+	       !!(link_options & DPNI_LINK_OPT_ASYM_PAUSE);
+}
+
+static inline bool dpaa2_eth_rx_pause_enabled(u64 link_options)
+{
+	return !!(link_options & DPNI_LINK_OPT_PAUSE);
+}
+
 static inline
 unsigned int dpaa2_eth_needed_headroom(struct dpaa2_eth_priv *priv,
 				       struct sk_buff *skb)
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-ethtool.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-ethtool.c
index 1d0402011ac5..326c638c7a56 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-ethtool.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-ethtool.c
@@ -130,9 +130,8 @@ static void dpaa2_eth_get_pauseparam(struct net_device *net_dev,
 		return;
 	}
 
-	pause->rx_pause = !!(link_options & DPNI_LINK_OPT_PAUSE);
-	pause->tx_pause = pause->rx_pause ^
-			  !!(link_options & DPNI_LINK_OPT_ASYM_PAUSE);
+	pause->rx_pause = dpaa2_eth_rx_pause_enabled(link_options);
+	pause->tx_pause = dpaa2_eth_tx_pause_enabled(link_options);
 	pause->autoneg = AUTONEG_DISABLE;
 }
 
-- 
2.17.1



* [PATCH v2 net-next 5/7] dpaa2-eth: Minor cleanup in dpaa2_eth_set_rx_taildrop()
  2020-05-15 18:47 [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes Ioana Ciornei
                   ` (3 preceding siblings ...)
  2020-05-15 18:47 ` [PATCH v2 net-next 4/7] dpaa2-eth: Add helper functions Ioana Ciornei
@ 2020-05-15 18:47 ` Ioana Ciornei
  2020-05-15 18:47 ` [PATCH v2 net-next 6/7] dpaa2-eth: Add congestion group taildrop Ioana Ciornei
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 28+ messages in thread
From: Ioana Ciornei @ 2020-05-15 18:47 UTC (permalink / raw)
  To: davem, netdev; +Cc: Ioana Radulescu, Ioana Ciornei

From: Ioana Radulescu <ruxandra.radulescu@nxp.com>

Make it clear that the setting refers to FQ-based taildrop and that
the threshold value is given in bytes (the default option).

Reverse the logic of the second argument: pass tx_pause transparently
instead of its negation. This will be helpful further on.

Also don't set the device's Rx taildrop flag unless configuration
succeeds.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v2:
 - none

 .../net/ethernet/freescale/dpaa2/dpaa2-eth.c  | 19 +++++++++++--------
 .../net/ethernet/freescale/dpaa2/dpaa2-eth.h  |  4 ++--
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
index d426fc1f20ef..d8b1d6ab6656 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
@@ -1280,17 +1280,20 @@ static void disable_ch_napi(struct dpaa2_eth_priv *priv)
 	}
 }
 
-static void dpaa2_eth_set_rx_taildrop(struct dpaa2_eth_priv *priv, bool enable)
+static void dpaa2_eth_set_rx_taildrop(struct dpaa2_eth_priv *priv,
+				      bool tx_pause)
 {
 	struct dpni_taildrop td = {0};
 	struct dpaa2_eth_fq *fq;
 	int i, err;
 
-	if (priv->rx_td_enabled == enable)
+	td.enable = !tx_pause;
+	if (priv->rx_td_enabled == td.enable)
 		return;
 
-	td.enable = enable;
-	td.threshold = DPAA2_ETH_TAILDROP_THRESH;
+	/* FQ taildrop: threshold is in bytes, per frame queue */
+	td.threshold = DPAA2_ETH_FQ_TAILDROP_THRESH;
+	td.units = DPNI_CONGESTION_UNIT_BYTES;
 
 	for (i = 0; i < priv->num_fqs; i++) {
 		fq = &priv->fq[i];
@@ -1301,12 +1304,12 @@ static void dpaa2_eth_set_rx_taildrop(struct dpaa2_eth_priv *priv, bool enable)
 					fq->tc, fq->flowid, &td);
 		if (err) {
 			netdev_err(priv->net_dev,
-				   "dpni_set_taildrop() failed\n");
-			break;
+				   "dpni_set_taildrop(FQ) failed\n");
+			return;
 		}
 	}
 
-	priv->rx_td_enabled = enable;
+	priv->rx_td_enabled = td.enable;
 }
 
 static int link_state_update(struct dpaa2_eth_priv *priv)
@@ -1327,7 +1330,7 @@ static int link_state_update(struct dpaa2_eth_priv *priv)
 	 * only when pause frame generation is disabled.
 	 */
 	tx_pause = dpaa2_eth_tx_pause_enabled(state.options);
-	dpaa2_eth_set_rx_taildrop(priv, !tx_pause);
+	dpaa2_eth_set_rx_taildrop(priv, tx_pause);
 
 	/* When we manage the MAC/PHY using phylink there is no need
 	 * to manually update the netif_carrier.
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
index 6384f6a23349..1b9f6689e9ec 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
@@ -40,7 +40,7 @@
  * frames in the Rx queues (length of the current frame is not
  * taken into account when making the taildrop decision)
  */
-#define DPAA2_ETH_TAILDROP_THRESH	(64 * 1024)
+#define DPAA2_ETH_FQ_TAILDROP_THRESH	(64 * 1024)
 
 /* Maximum number of Tx confirmation frames to be processed
  * in a single NAPI call
@@ -52,7 +52,7 @@
  * how many 64B frames fit inside the taildrop threshold and add a margin
  * to accommodate the buffer refill delay.
  */
-#define DPAA2_ETH_MAX_FRAMES_PER_QUEUE	(DPAA2_ETH_TAILDROP_THRESH / 64)
+#define DPAA2_ETH_MAX_FRAMES_PER_QUEUE	(DPAA2_ETH_FQ_TAILDROP_THRESH / 64)
 #define DPAA2_ETH_NUM_BUFS		(DPAA2_ETH_MAX_FRAMES_PER_QUEUE + 256)
 #define DPAA2_ETH_REFILL_THRESH \
 	(DPAA2_ETH_NUM_BUFS - DPAA2_ETH_BUFS_PER_CMD)
-- 
2.17.1



* [PATCH v2 net-next 6/7] dpaa2-eth: Add congestion group taildrop
  2020-05-15 18:47 [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes Ioana Ciornei
                   ` (4 preceding siblings ...)
  2020-05-15 18:47 ` [PATCH v2 net-next 5/7] dpaa2-eth: Minor cleanup in dpaa2_eth_set_rx_taildrop() Ioana Ciornei
@ 2020-05-15 18:47 ` Ioana Ciornei
  2020-05-15 18:47 ` [PATCH v2 net-next 7/7] dpaa2-eth: Update FQ taildrop threshold and buffer pool count Ioana Ciornei
  2020-05-15 19:20 ` [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes Jakub Kicinski
  7 siblings, 0 replies; 28+ messages in thread
From: Ioana Ciornei @ 2020-05-15 18:47 UTC (permalink / raw)
  To: davem, netdev; +Cc: Ioana Radulescu, Ioana Ciornei

From: Ioana Radulescu <ruxandra.radulescu@nxp.com>

The increase in the number of ingress frame queues means we now risk
depleting the buffer pool before the FQ taildrop kicks in.

Congestion group taildrop allows us to control the number of frames
that can accumulate on a group of Rx frame queues belonging to the
same traffic class.

This setting coexists with the frame queue based taildrop: whichever
limit gets hit first triggers the frame drop.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v2:
 - none

 drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c | 16 ++++++++++++++++
 drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h |  9 +++++++++
 2 files changed, 25 insertions(+)
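
A worked example with illustrative values: DPAA2_ETH_MAX_FRAMES_PER_QUEUE
is DPAA2_ETH_FQ_TAILDROP_THRESH / 64 = 65536 / 64 = 1024 frames, so on
a DPNI with 8 Rx queues and 2 traffic classes the new threshold
evaluates to

	DPAA2_ETH_CG_TAILDROP_THRESH = 1024 * 8 / 2 = 4096 frames

per congestion group, i.e. each of the two groups of 8 Rx FQs may
accumulate at most 4096 frames before taildrop kicks in.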

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
index d8b1d6ab6656..318b4332df87 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
@@ -1309,6 +1309,22 @@ static void dpaa2_eth_set_rx_taildrop(struct dpaa2_eth_priv *priv,
 		}
 	}
 
+	/* Congestion group taildrop: threshold is in frames, per group
+	 * of FQs belonging to the same traffic class
+	 */
+	td.threshold = DPAA2_ETH_CG_TAILDROP_THRESH(priv);
+	td.units = DPNI_CONGESTION_UNIT_FRAMES;
+	for (i = 0; i < dpaa2_eth_tc_count(priv); i++) {
+		err = dpni_set_taildrop(priv->mc_io, 0, priv->mc_token,
+					DPNI_CP_GROUP, DPNI_QUEUE_RX,
+					i, 0, &td);
+		if (err) {
+			netdev_err(priv->net_dev,
+				   "dpni_set_taildrop(CG) failed\n");
+			return;
+		}
+	}
+
 	priv->rx_td_enabled = td.enable;
 }
 
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
index 1b9f6689e9ec..184d5d83e497 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
@@ -57,6 +57,15 @@
 #define DPAA2_ETH_REFILL_THRESH \
 	(DPAA2_ETH_NUM_BUFS - DPAA2_ETH_BUFS_PER_CMD)
 
+/* Congestion group taildrop threshold: number of frames allowed to accumulate
+ * at any moment in a group of Rx queues belonging to the same traffic class.
+ * Choose value such that we don't risk depleting the buffer pool before the
+ * taildrop kicks in
+ */
+#define DPAA2_ETH_CG_TAILDROP_THRESH(priv)				\
+	(DPAA2_ETH_MAX_FRAMES_PER_QUEUE * dpaa2_eth_queue_count(priv) /	\
+	 dpaa2_eth_tc_count(priv))
+
 /* Maximum number of buffers that can be acquired/released through a single
  * QBMan command
  */
-- 
2.17.1



* [PATCH v2 net-next 7/7] dpaa2-eth: Update FQ taildrop threshold and buffer pool count
  2020-05-15 18:47 [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes Ioana Ciornei
                   ` (5 preceding siblings ...)
  2020-05-15 18:47 ` [PATCH v2 net-next 6/7] dpaa2-eth: Add congestion group taildrop Ioana Ciornei
@ 2020-05-15 18:47 ` Ioana Ciornei
  2020-05-15 19:20 ` [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes Jakub Kicinski
  7 siblings, 0 replies; 28+ messages in thread
From: Ioana Ciornei @ 2020-05-15 18:47 UTC (permalink / raw)
  To: davem, netdev; +Cc: Ioana Radulescu, Ioana Ciornei

From: Ioana Radulescu <ruxandra.radulescu@nxp.com>

Now that we have congestion group taildrop configured at all
times, we can afford to increase the frame queue taildrop
threshold; this will ensure a better response when receiving
bursts of large-sized frames.

Also decouple the buffer pool count from the Rx FQ taildrop
threshold, as the above change would increase it too much. Instead,
keep the old count as a hardcoded value.

With the new limits, we try to ensure that:
* enough leeway is allowed for large frame bursts (by buffering
enough of them in queues to avoid heavy dropping in case of
bursty traffic, as long as overall ingress bandwidth is manageable)
* pending frames are spread evenly between ingress FQs,
regardless of frame size
* frames are not dropped due to the buffer pool being empty; this
is not a bad behaviour per se, but the overall system response is
more linear and predictable when frames are dropped at frame
queue/group level.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
---
Changes in v2:
 - none

 .../net/ethernet/freescale/dpaa2/dpaa2-eth.h  | 23 +++++++++----------
 1 file changed, 11 insertions(+), 12 deletions(-)
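
A worked example under the new limits, assuming an 8-queue, single-TC
setup with one channel per queue: each Rx FQ may now hold up to 1 MB
of pending data, the congestion group threshold evaluates to
1024 * 8 / 1 = 8192 frames, and the buffer pools hold roughly
8 * 1280 = 10240 buffers in total. For small frames the congestion
group taildrop (8192 frames) thus triggers before the pools can be
depleted, while for large frames the 1 MB per-FQ threshold keeps the
amount of pending data in check.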

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
index 184d5d83e497..02c0eea69a23 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
@@ -36,24 +36,24 @@
 /* Convert L3 MTU to L2 MFL */
 #define DPAA2_ETH_L2_MAX_FRM(mtu)	((mtu) + VLAN_ETH_HLEN)
 
-/* Set the taildrop threshold (in bytes) to allow the enqueue of several jumbo
- * frames in the Rx queues (length of the current frame is not
- * taken into account when making the taildrop decision)
+/* Set the taildrop threshold (in bytes) to allow the enqueue of a large
+ * enough number of jumbo frames in the Rx queues (length of the current
+ * frame is not taken into account when making the taildrop decision)
  */
-#define DPAA2_ETH_FQ_TAILDROP_THRESH	(64 * 1024)
+#define DPAA2_ETH_FQ_TAILDROP_THRESH	(1024 * 1024)
 
 /* Maximum number of Tx confirmation frames to be processed
  * in a single NAPI call
  */
 #define DPAA2_ETH_TXCONF_PER_NAPI	256
 
-/* Buffer quota per queue. Must be large enough such that for minimum sized
- * frames taildrop kicks in before the bpool gets depleted, so we compute
- * how many 64B frames fit inside the taildrop threshold and add a margin
- * to accommodate the buffer refill delay.
+/* Buffer quota per channel. We want to keep the number of ingress frames
+ * in flight in check: for small sized frames, congestion group taildrop may
+ * kick in first; for large sizes, the Rx FQ taildrop threshold will ensure
+ * only a reasonable number of frames is pending at any given time.
+ * Ingress frame drop due to buffer pool depletion should be a corner case only
  */
-#define DPAA2_ETH_MAX_FRAMES_PER_QUEUE	(DPAA2_ETH_FQ_TAILDROP_THRESH / 64)
-#define DPAA2_ETH_NUM_BUFS		(DPAA2_ETH_MAX_FRAMES_PER_QUEUE + 256)
+#define DPAA2_ETH_NUM_BUFS		1280
 #define DPAA2_ETH_REFILL_THRESH \
 	(DPAA2_ETH_NUM_BUFS - DPAA2_ETH_BUFS_PER_CMD)
 
@@ -63,8 +63,7 @@
  * taildrop kicks in
  */
 #define DPAA2_ETH_CG_TAILDROP_THRESH(priv)				\
-	(DPAA2_ETH_MAX_FRAMES_PER_QUEUE * dpaa2_eth_queue_count(priv) /	\
-	 dpaa2_eth_tc_count(priv))
+	(1024 * dpaa2_eth_queue_count(priv) / dpaa2_eth_tc_count(priv))
 
 /* Maximum number of buffers that can be acquired/released through a single
  * QBMan command
-- 
2.17.1



* Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
  2020-05-15 18:47 [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes Ioana Ciornei
                   ` (6 preceding siblings ...)
  2020-05-15 18:47 ` [PATCH v2 net-next 7/7] dpaa2-eth: Update FQ taildrop threshold and buffer pool count Ioana Ciornei
@ 2020-05-15 19:20 ` Jakub Kicinski
  2020-05-15 19:31   ` Ioana Ciornei
  7 siblings, 1 reply; 28+ messages in thread
From: Jakub Kicinski @ 2020-05-15 19:20 UTC (permalink / raw)
  To: Ioana Ciornei; +Cc: davem, netdev

On Fri, 15 May 2020 21:47:46 +0300 Ioana Ciornei wrote:
> This patch set adds support for Rx traffic classes on DPAA2 Ethernet
> devices.
> 
> The first two patches make the necessary changes so that multiple
> traffic classes are configured and their statistics are displayed in the
> debugfs. The third patch adds a static distribution to said traffic
> classes based on the VLAN PCP field.
> 
> The last patches add support for the congestion group taildrop mechanism
> that allows us to control the number of frames that can accumulate on a
> group of Rx frame queues belonging to the same traffic class.

Ah, I missed that you already sent a v2. Same question applies:

> How is this configured from the user perspective? I looked through the
> patches and I see no information on how the input is taken from the
> user.


* RE: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
  2020-05-15 19:20 ` [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes Jakub Kicinski
@ 2020-05-15 19:31   ` Ioana Ciornei
  2020-05-15 19:40     ` Jakub Kicinski
  0 siblings, 1 reply; 28+ messages in thread
From: Ioana Ciornei @ 2020-05-15 19:31 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev

> Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic
> classes
> 
> On Fri, 15 May 2020 21:47:46 +0300 Ioana Ciornei wrote:
> > This patch set adds support for Rx traffic classes on DPAA2 Ethernet
> > devices.
> >
> > The first two patches make the necessary changes so that multiple
> > traffic classes are configured and their statistics are displayed in
> > the debugfs. The third patch adds a static distribution to said
> > traffic classes based on the VLAN PCP field.
> >
> > The last patches add support for the congestion group taildrop
> > mechanism that allows us to control the number of frames that can
> > accumulate on a group of Rx frame queues belonging to the same traffic class.
> 
> Ah, I missed that you already sent a v2. Same question applies:
> 
> > How is this configured from the user perspective? I looked through the
> > patches and I see no information on how the input is taken from the
> > user.

There is no input taken from the user at the moment. The traffic class
id is statically selected based on the VLAN PCP field. The
configuration for this is added in patch 3/7.

Ioana


* Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
  2020-05-15 19:31   ` Ioana Ciornei
@ 2020-05-15 19:40     ` Jakub Kicinski
  2020-05-15 20:48       ` Ioana Ciornei
  0 siblings, 1 reply; 28+ messages in thread
From: Jakub Kicinski @ 2020-05-15 19:40 UTC (permalink / raw)
  To: Ioana Ciornei; +Cc: davem, netdev

On Fri, 15 May 2020 19:31:18 +0000 Ioana Ciornei wrote:
> > Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic
> > classes
> > 
> > On Fri, 15 May 2020 21:47:46 +0300 Ioana Ciornei wrote:  
> > > This patch set adds support for Rx traffic classes on DPAA2 Ethernet
> > > devices.
> > >
> > > The first two patches make the necessary changes so that multiple
> > > traffic classes are configured and their statistics are displayed in
> > > the debugfs. The third patch adds a static distribution to said
> > > traffic classes based on the VLAN PCP field.
> > >
> > > The last patches add support for the congestion group taildrop
> > > mechanism that allows us to control the number of frames that can
> > > accumulate on a group of Rx frame queues belonging to the same traffic class.  
> > 
> > Ah, I missed that you already sent a v2. Same question applies:
> >   
> > > How is this configured from the user perspective? I looked through the
> > > patches and I see no information on how the input is taken from the
> > > user.  
> 
> There is no input taken from the user at the moment. The traffic
> class id is statically selected based on the VLAN PCP field. The
> configuration for this is added in patch 3/7.

Having some defaults for RX queues per TC is understandable. But patch 1
changes how many RX queues are used in the first place. Why do that if
the user does not need RX queues per TC?



* RE: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
  2020-05-15 19:40     ` Jakub Kicinski
@ 2020-05-15 20:48       ` Ioana Ciornei
  2020-05-15 22:25         ` Jakub Kicinski
  0 siblings, 1 reply; 28+ messages in thread
From: Ioana Ciornei @ 2020-05-15 20:48 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev


> Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic
> classes
> 
> On Fri, 15 May 2020 19:31:18 +0000 Ioana Ciornei wrote:
> > > Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx
> > > traffic classes
> > >
> > > On Fri, 15 May 2020 21:47:46 +0300 Ioana Ciornei wrote:
> > > > This patch set adds support for Rx traffic classes on DPAA2
> > > > Ethernet devices.
> > > >
> > > > The first two patches make the necessary changes so that multiple
> > > > traffic classes are configured and their statistics are displayed
> > > > in the debugfs. The third patch adds a static distribution to said
> > > > traffic classes based on the VLAN PCP field.
> > > >
> > > > The last patches add support for the congestion group taildrop
> > > > mechanism that allows us to control the number of frames that can
> > > > accumulate on a group of Rx frame queues belonging to the same traffic
> class.
> > >
> > > Ah, I missed that you already sent a v2. Same question applies:
> > >
> > > > How is this configured from the user perspective? I looked through
> > > > the patches and I see no information on how the input is taken
> > > > from the user.
> >
> > There is no input taken from the user at the moment. The traffic class
> > id is statically selected based on the VLAN PCP field. The
> > configuration for this is added in patch 3/7.
> 
> Having some defaults for RX queues per TC is understandable. But patch 1
> changes how many RX queues are used in the first place. Why do that if
> the user does not need RX queues per TC?

In DPAA2 we have a boot-time configurable system in which the user can select
for each interface how many queues and how many traffic classes it needs.

The driver picks these up from firmware and configures the traffic class
distribution only if more than one TC is requested.
With a single TC the behavior of the driver is exactly as before.

Ioana


* Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
  2020-05-15 20:48       ` Ioana Ciornei
@ 2020-05-15 22:25         ` Jakub Kicinski
  2020-05-15 23:33           ` David Miller
  2020-05-16  8:16           ` Ioana Ciornei
  0 siblings, 2 replies; 28+ messages in thread
From: Jakub Kicinski @ 2020-05-15 22:25 UTC (permalink / raw)
  To: Ioana Ciornei; +Cc: davem, netdev

On Fri, 15 May 2020 20:48:27 +0000 Ioana Ciornei wrote:
> > > There is no input taken from the user at the moment. The traffic class
> > > id is statically selected based on the VLAN PCP field. The
> > > configuration for this is added in patch 3/7.  
> > 
> > Having some defaults for RX queue per TC is understandable. But patch 1
> > changes how many RX queues are used in the first place. Why if user does not
> > need RX queues per TC?  
> 
> In DPAA2 we have a boot time configurable system in which the user can select
> for each interface how many queues and how many traffic classes it needs.

Looking at the UG online, DPNI_CREATE has a NUM_RX_TCS param. You're not
using that for the kernel driver?

> The driver picks these up from firmware and configures the traffic class
> distribution only if there is more than one requested.
> With one TC the behavior of the driver is exactly as before.

Configuring things statically via some direct FW interface when the
system boots really sounds like a typical "using Linux to boot a
proprietary networking stack" scenario.

With the Rx QoS features users won't even be able to tell via standard
Linux interfaces what the config was.



* Re: [PATCH v2 net-next 3/7] dpaa2-eth: Distribute ingress frames based on VLAN prio
  2020-05-15 18:47 ` [PATCH v2 net-next 3/7] dpaa2-eth: Distribute ingress frames based on VLAN prio Ioana Ciornei
@ 2020-05-15 23:07   ` kbuild test robot
  0 siblings, 0 replies; 28+ messages in thread
From: kbuild test robot @ 2020-05-15 23:07 UTC (permalink / raw)
  To: Ioana Ciornei, davem, netdev; +Cc: kbuild-all, Ioana Radulescu, Ioana Ciornei

Hi Ioana,

I love your patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]
[also build test WARNING on sparc-next/master linus/master v5.7-rc5 next-20200515]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Ioana-Ciornei/dpaa2-eth-add-support-for-Rx-traffic-classes/20200516-024950
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git d00f26b623333f2419f4c3b95ff11c8b1bb96f56
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.1-193-gb8fad4bc-dirty
        make ARCH=x86_64 allmodconfig
        make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>


sparse warnings: (new ones prefixed by >>)

>> drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c:2728:15: sparse: sparse: incorrect type in assignment (different base types) @@    expected unsigned short [usertype] @@    got restricted __be16 [usertype] @@
>> drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c:2728:15: sparse:    expected unsigned short [usertype]
>> drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c:2728:15: sparse:    got restricted __be16 [usertype]
   drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c:2747:29: sparse: sparse: incorrect type in assignment (different base types) @@    expected unsigned short [usertype] @@    got restricted __be16 [usertype] @@
   drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c:2747:29: sparse:    expected unsigned short [usertype]
   drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c:2747:29: sparse:    got restricted __be16 [usertype]

vim +2728 drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c

  2664	
  2665	/* Configure ingress classification based on VLAN PCP */
  2666	static int set_vlan_qos(struct dpaa2_eth_priv *priv)
  2667	{
  2668		struct device *dev = priv->net_dev->dev.parent;
  2669		struct dpkg_profile_cfg kg_cfg = {0};
  2670		struct dpni_qos_tbl_cfg qos_cfg = {0};
  2671		struct dpni_rule_cfg key_params;
  2672		void *dma_mem, *key;
  2673		u8 key_size = 2;	/* VLAN TCI field */
  2674		int i, pcp, err;
  2675		u16 *mask;
  2676	
  2677		/* VLAN-based classification only makes sense if we have multiple
  2678		 * traffic classes.
  2679		 * Also, we need to extract just the 3-bit PCP field from the VLAN
  2680		 * header and we can only do that by using a mask
  2681		 */
  2682		if (dpaa2_eth_tc_count(priv) == 1 || !dpaa2_eth_fs_mask_enabled(priv)) {
  2683			dev_dbg(dev, "VLAN-based QoS classification not supported\n");
  2684			return -EOPNOTSUPP;
  2685		}
  2686	
  2687		dma_mem = kzalloc(DPAA2_CLASSIFIER_DMA_SIZE, GFP_KERNEL);
  2688		if (!dma_mem)
  2689			return -ENOMEM;
  2690	
  2691		kg_cfg.num_extracts = 1;
  2692		kg_cfg.extracts[0].type = DPKG_EXTRACT_FROM_HDR;
  2693		kg_cfg.extracts[0].extract.from_hdr.prot = NET_PROT_VLAN;
  2694		kg_cfg.extracts[0].extract.from_hdr.type = DPKG_FULL_FIELD;
  2695		kg_cfg.extracts[0].extract.from_hdr.field = NH_FLD_VLAN_TCI;
  2696	
  2697		err =  dpni_prepare_key_cfg(&kg_cfg, dma_mem);
  2698		if (err) {
  2699			dev_err(dev, "dpni_prepare_key_cfg failed\n");
  2700			goto out_free_tbl;
  2701		}
  2702	
  2703		/* set QoS table */
  2704		qos_cfg.default_tc = 0;
  2705		qos_cfg.discard_on_miss = 0;
  2706		qos_cfg.key_cfg_iova = dma_map_single(dev, dma_mem,
  2707						      DPAA2_CLASSIFIER_DMA_SIZE,
  2708						      DMA_TO_DEVICE);
  2709		if (dma_mapping_error(dev, qos_cfg.key_cfg_iova)) {
  2710			dev_err(dev, "QoS table DMA mapping failed\n");
  2711			err = -ENOMEM;
  2712			goto out_free_tbl;
  2713		}
  2714	
  2715		err = dpni_set_qos_table(priv->mc_io, 0, priv->mc_token, &qos_cfg);
  2716		if (err) {
  2717			dev_err(dev, "dpni_set_qos_table failed\n");
  2718			goto out_unmap_tbl;
  2719		}
  2720	
  2721		/* Add QoS table entries */
  2722		key = kzalloc(key_size * 2, GFP_KERNEL);
  2723		if (!key) {
  2724			err = -ENOMEM;
  2725			goto out_unmap_tbl;
  2726		}
  2727		mask = key + key_size;
> 2728		*mask = cpu_to_be16(VLAN_PRIO_MASK);
  2729	
  2730		key_params.key_iova = dma_map_single(dev, key, key_size * 2,
  2731						     DMA_TO_DEVICE);
  2732		if (dma_mapping_error(dev, key_params.key_iova)) {
  2733			dev_err(dev, "Qos table entry DMA mapping failed\n");
  2734			err = -ENOMEM;
  2735			goto out_free_key;
  2736		}
  2737	
  2738		key_params.mask_iova = key_params.key_iova + key_size;
  2739		key_params.key_size = key_size;
  2740	
  2741		/* We add rules for PCP-based distribution starting with highest
  2742		 * priority (VLAN PCP = 7). If this DPNI doesn't have enough traffic
  2743		 * classes to accommodate all priority levels, the lowest ones end up
  2744		 * on TC 0 which was configured as default
  2745		 */
  2746		for (i = dpaa2_eth_tc_count(priv) - 1, pcp = 7; i >= 0; i--, pcp--) {
  2747			*(u16 *)key = cpu_to_be16(pcp << VLAN_PRIO_SHIFT);
  2748			dma_sync_single_for_device(dev, key_params.key_iova,
  2749						   key_size * 2, DMA_TO_DEVICE);
  2750	
  2751			err = dpni_add_qos_entry(priv->mc_io, 0, priv->mc_token,
  2752						 &key_params, i, i);
  2753			if (err) {
  2754				dev_err(dev, "dpni_add_qos_entry failed\n");
  2755				dpni_clear_qos_table(priv->mc_io, 0, priv->mc_token);
  2756				goto out_unmap_key;
  2757			}
  2758		}
  2759	
  2760		priv->vlan_cls_enabled = true;
  2761	
  2762		/* Table and key memory is not persistent, clean everything up after
  2763		 * configuration is finished
  2764		 */
  2765	out_unmap_key:
  2766		dma_unmap_single(dev, key_params.key_iova, key_size * 2, DMA_TO_DEVICE);
  2767	out_free_key:
  2768		kfree(key);
  2769	out_unmap_tbl:
  2770		dma_unmap_single(dev, qos_cfg.key_cfg_iova, DPAA2_CLASSIFIER_DMA_SIZE,
  2771				 DMA_TO_DEVICE);
  2772	out_free_tbl:
  2773		kfree(dma_mem);
  2774	
  2775		return err;
  2776	}
  2777	
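
A sparse-clean variant of the two flagged assignments (a sketch of one
possible fix, not necessarily the one eventually applied) would keep the
VLAN TCI key and mask explicitly big-endian instead of going through u16
pointers:

	__be16 *key_be = key;		/* key is void *, so this converts cleanly */
	__be16 *mask_be = key + key_size;

	*mask_be = cpu_to_be16(VLAN_PRIO_MASK);

	/* ... and in the per-PCP rule loop: */
	*key_be = cpu_to_be16(pcp << VLAN_PRIO_SHIFT);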

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
  2020-05-15 22:25         ` Jakub Kicinski
@ 2020-05-15 23:33           ` David Miller
  2020-05-16  8:16           ` Ioana Ciornei
  1 sibling, 0 replies; 28+ messages in thread
From: David Miller @ 2020-05-15 23:33 UTC (permalink / raw)
  To: kuba; +Cc: ioana.ciornei, netdev

From: Jakub Kicinski <kuba@kernel.org>
Date: Fri, 15 May 2020 15:25:00 -0700

> With the Rx QoS features users won't even be able to tell via standard
> Linux interfaces what the config was.

+1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
  2020-05-15 22:25         ` Jakub Kicinski
  2020-05-15 23:33           ` David Miller
@ 2020-05-16  8:16           ` Ioana Ciornei
  2020-05-18 19:35             ` Jakub Kicinski
  1 sibling, 1 reply; 28+ messages in thread
From: Ioana Ciornei @ 2020-05-16  8:16 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev


> Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic
> classes
> 
> On Fri, 15 May 2020 20:48:27 +0000 Ioana Ciornei wrote:
> > > > There is no input taken from the user at the moment. The traffic
> > > > class id is statically selected based on the VLAN PCP field. The
> > > > configuration for this is added in patch 3/7.
> > >
> > > Having some defaults for RX queue per TC is understandable. But
> > > patch 1 changes how many RX queues are used in the first place. Why,
> > > if the user does not need RX queues per TC?
> >
> > In DPAA2 we have a boot time configurable system in which the user can
> > select for each interface how many queues and how many traffic classes it
> > needs.
> 
> Looking at the UG online, DPNI_CREATE has a NUM_RX_TCS param. You're not
> using that for the kernel driver?

I have to give a bit of context here. If we look at what the hardware supports,
DPAA2 is reconfigurable and what I mean by that is that we can create new
interfaces (DPNIs) or destroy them at runtime using the DPNI_CREATE command
that you found in the UG.
This runtime reconfiguration is not supported in upstream. What we rely on
upstream is a static configuration of all the resources (how many interfaces
are needed and how these interfaces should be provisioned), which is applied
and configured at boot time, even before Linux boots and gets to probe those
interfaces.

In the kernel driver we just get the num_tcs and num_queues parameters
(which are set in stone by now) and configure everything based on them.
This is why the kernel driver does not use the DPNI_CREATE command at all.
 
> 
> > The driver picks these up from firmware and configures the traffic
> > class distribution only if there is more than one requested.
> > With one TC the behavior of the driver is exactly as before.
> 
> This configuring things statically via some direct FW interface when system
> boots really sounds like a typical "using Linux to boot a proprietary networking
> stack" scenario.

I may have explained it poorly before, but the kernel is not in control of
how many TCs or queues there are; it merely configures the distribution
depending on what the hardware was set up for.

> 
> With the Rx QoS features users won't even be able to tell via standard Linux
> interfaces what the config was.

Ok, that is true. So how should this information be exported to the user?

Ioana


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
  2020-05-16  8:16           ` Ioana Ciornei
@ 2020-05-18 19:35             ` Jakub Kicinski
  2020-05-19  7:38               ` Ioana Ciornei
  2020-05-29 11:45               ` Ioana Ciornei
  0 siblings, 2 replies; 28+ messages in thread
From: Jakub Kicinski @ 2020-05-18 19:35 UTC (permalink / raw)
  To: Ioana Ciornei; +Cc: davem, netdev

On Sat, 16 May 2020 08:16:47 +0000 Ioana Ciornei wrote:
> > With the Rx QoS features users won't even be able to tell via standard Linux
> > interfaces what the config was.  
> 
> Ok, that is true. So how should this information be exported to the user?

I believe no such interface currently exists. 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
  2020-05-18 19:35             ` Jakub Kicinski
@ 2020-05-19  7:38               ` Ioana Ciornei
  2020-05-19 18:43                 ` Jakub Kicinski
  2020-05-29 11:45               ` Ioana Ciornei
  1 sibling, 1 reply; 28+ messages in thread
From: Ioana Ciornei @ 2020-05-19  7:38 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev

> Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic
> classes
> 
> On Sat, 16 May 2020 08:16:47 +0000 Ioana Ciornei wrote:
> > > With the Rx QoS features users won't even be able to tell via
> > > standard Linux interfaces what the config was.
> >
> > Ok, that is true. So how should this information be exported to the user?
> 
> I believe no such interface currently exists.

I am having a bit of trouble understanding what should be the route for this feature to get accepted.

Is the problem the classification to a TC based on the VLAN PCP, or is there anything else?

Regards,
Ioana

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
  2020-05-19  7:38               ` Ioana Ciornei
@ 2020-05-19 18:43                 ` Jakub Kicinski
  2020-05-19 20:58                   ` Ioana Ciornei
  0 siblings, 1 reply; 28+ messages in thread
From: Jakub Kicinski @ 2020-05-19 18:43 UTC (permalink / raw)
  To: Ioana Ciornei; +Cc: davem, netdev

On Tue, 19 May 2020 07:38:57 +0000 Ioana Ciornei wrote:
> > Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
> > 
> > On Sat, 16 May 2020 08:16:47 +0000 Ioana Ciornei wrote:  
> > > > With the Rx QoS features users won't even be able to tell via
> > > > standard Linux interfaces what the config was.  
> > >
> > > Ok, that is true. So how should this information be exported to the user?  
> > 
> > I believe no such interface currently exists.  
> 
> I am having a bit of trouble understanding what should be the route
> for this feature to get accepted.

What's the feature you're trying to get accepted? Driver's datapath to
behave correctly when some proprietary out-of-tree API is used to do the
actual configuration? Unexciting.

> Is the problem the classification to a TC based on the VLAN
> PCP, or is there anything else?

What you have is basically an RX version of mqprio, right? Multiple rings
per "channel", each getting frames with specific priorities? This needs to
be well integrated with the rest of the stack, but I don't think TC
qdisc offload is a fit. Given we don't have qdiscs on ingress. As I
said a new API for this would most likely have to be created.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
  2020-05-19 18:43                 ` Jakub Kicinski
@ 2020-05-19 20:58                   ` Ioana Ciornei
  2020-05-19 21:35                     ` Jakub Kicinski
  0 siblings, 1 reply; 28+ messages in thread
From: Ioana Ciornei @ 2020-05-19 20:58 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev

> Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic
> classes
> 
> On Tue, 19 May 2020 07:38:57 +0000 Ioana Ciornei wrote:
> > > Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx
> > > traffic classes
> > >
> > > On Sat, 16 May 2020 08:16:47 +0000 Ioana Ciornei wrote:
> > > > > With the Rx QoS features users won't even be able to tell via
> > > > > standard Linux interfaces what the config was.
> > > >
> > > > Ok, that is true. So how should this information be exported to the user?
> > >
> > > I believe no such interface currently exists.
> >
> > I am having a bit of trouble understanding what should be the route
> > for this feature to get accepted.
> 
> What's the feature you're trying to get accepted? Driver's datapath to behave
> correctly when some proprietary out-of-tree API is used to do the actual
> configuration? Unexciting.
> 

I don't think this comment is very productive. The out-of-tree API has nothing
to do with this. The NIC parameters like number of traffic classes and number of
queues per traffic class are fixed as far as the Linux driver is concerned. 

What this patch set does is make use of the queues corresponding to traffic classes
different than 0, if those exist.

> > Is the problem the classification to a TC based on the VLAN PCP,
> > or is there anything else?
> 
> What you have is basically an RX version of mqprio, right? Multiple rings
> per "channel", each getting frames with specific priorities?

You can think of it like that (with the understanding that mqprio cannot be
attached to the ingress qdisc, indeed).
DPAA2 has many peculiarities but I don't think this is one of them. There are
some RX queues, some of which can be given a hardware priority at dequeue time,
and they form a traffic class together. The assumption is that you don't want
low-priority traffic to cause congestion for high-priority traffic, that you
want lower latency for high-priority traffic, etc.
Some 802.1Q standards like PFC actually depend on having multiple traffic classes too,
based on VLAN PCP.

> This needs to be well
> integrated with the rest of the stack, but I don't think TC qdisc offload is a fit.
> Given we don't have qdiscs on ingress. As I said a new API for this would most
> likely have to be created.

For just assigning a traffic class based on packet headers a tc filter with the
skbedit priority action on ingress is enough (maybe even too much since there are
other drivers that have the same default prioritization based on VLAN PCP).
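
For illustration, that simple case would look something along these lines
(eth1 being just a placeholder interface):

  tc qdisc add dev eth1 clsact
  tc filter add dev eth1 ingress protocol 802.1q flower vlan_prio 5 \
        action skbedit priority 5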

But you are correct that this would not be enough to cover all possible use cases except
for the most simple ones. There are per-traffic class ingress policers, and those become
tricky to support since there's nothing that denotes the traffic class to match on,
currently. I see 2 possible approaches, each with its own drawbacks:
- Allow clsact to be classful, similar to mqprio, and attach filters to its classes (each
  class would correspond to an ingress traffic class). But this would make the skbedit
  action redundant, since QoS classification with a classful clsact should be done
  completely differently now. Also, the classful clsact would have to deny qdiscs attached
  to it that don't make sense, because all of those were written with egress in mind.
- Try to linearize the ingress filter rules under the classless clsact, both the ones that
  have a skbedit action, and the ones that match on a skb priority in order to perform
  ingress policing. But this would be very brittle because the matching order would depend
  on the order in which the rules were introduced:
  rule 1: flower skb-priority 5 action police rate 34Mbps # note: matching on skb-priority doesn't exist (yet?)
  rule 2: flower vlan_prio 5 action skbedit priority 5
  In this case, traffic with VLAN priority 5 would not get rate-limited to 34Mbps.

So this is one of the reasons why I preferred to defer the hard questions and start with
something simple (which for some reason still gets pushback).

Ioana


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
  2020-05-19 20:58                   ` Ioana Ciornei
@ 2020-05-19 21:35                     ` Jakub Kicinski
  2020-05-20 15:10                       ` Ioana Ciornei
  0 siblings, 1 reply; 28+ messages in thread
From: Jakub Kicinski @ 2020-05-19 21:35 UTC (permalink / raw)
  To: Ioana Ciornei; +Cc: davem, netdev

On Tue, 19 May 2020 20:58:50 +0000 Ioana Ciornei wrote:
> > This needs to be well
> > integrated with the rest of the stack, but I don't think TC qdisc offload is a fit.
> > Given we don't have qdiscs on ingress. As I said a new API for this would most
> > likely have to be created.  
> 
> For just assigning a traffic class based on packet headers a tc filter with the
> skbedit priority action on ingress is enough (maybe even too much since there are
> other drivers that have the same default prioritization based on VLAN PCP).
> 
> But you are correct that this would not be enough to cover all possible use cases except
> for the most simple ones. There are per-traffic class ingress policers, and those become
> tricky to support since there's nothing that denotes the traffic class to match on,
> currently. I see 2 possible approaches, each with its own drawbacks:
> - Allow clsact to be classful, similar to mqprio, and attach filters to its classes (each
>   class would correspond to an ingress traffic class). But this would make the skbedit
>   action redundant, since QoS classification with a classful clsact should be done
>   completely differently now. Also, the classful clsact would have to deny qdiscs attached
>   to it that don't make sense, because all of those were written with egress in mind.
> - Try to linearize the ingress filter rules under the classless clsact, both the ones that
>   have a skbedit action, and the ones that match on a skb priority in order to perform
>   ingress policing. But this would be very brittle because the matching order would depend
>   on the order in which the rules were introduced:
>   rule 1: flower skb-priority 5 action police rate 34Mbps # note: matching on skb-priority doesn't exist (yet?)
>   rule 2: flower vlan_prio 5 action skbedit priority 5
>   In this case, traffic with VLAN priority 5 would not get rate-limited to 34Mbps.
> 
> So this is one of the reasons why I preferred to defer the hard questions and start with
> something simple (which for some reason still gets pushback).

You're jumping to classification while the configuration of the queues
itself is still not defined. How does the user know how many queues
there are to classify into?

Does this driver have descriptor rings for RX / completion? How does it
decide which queue to poll at NAPI time?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
  2020-05-19 21:35                     ` Jakub Kicinski
@ 2020-05-20 15:10                       ` Ioana Ciornei
  2020-05-20 19:12                         ` Jakub Kicinski
  0 siblings, 1 reply; 28+ messages in thread
From: Ioana Ciornei @ 2020-05-20 15:10 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev

> Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic
> classes
> 
> On Tue, 19 May 2020 20:58:50 +0000 Ioana Ciornei wrote:
> > > This needs to be well
> > > integrated with the rest of the stack, but I don't think TC qdisc offload is a fit.
> > > Given we don't have qdiscs on ingress. As I said a new API for this
> > > would most likely have to be created.
> >
> > For just assigning a traffic class based on packet headers a tc filter
> > with the skbedit priority action on ingress is enough (maybe even too
> > much since there are other drivers that have the same default
> > prioritization based on VLAN PCP).
> >
> > But you are correct that this would not be enough to cover all
> > possible use cases except for the most simple ones. There are
> > per-traffic class ingress policers, and those become tricky to support
> > since there's nothing that denotes the traffic class to match on,
> > currently. I see 2 possible approaches, each with its own drawbacks:
> > - Allow clsact to be classful, similar to mqprio, and attach filters
> >   to its classes (each class would correspond to an ingress traffic
> >   class). But this would make the skbedit action redundant, since QoS
> >   classification with a classful clsact should be done completely
> >   differently now. Also, the classful clsact would have to deny qdiscs
> >   attached to it that don't make sense, because all of those were
> >   written with egress in mind.
> > - Try to linearize the ingress filter rules under the classless
> >   clsact, both the ones that have a skbedit action, and the ones that
> >   match on a skb priority in order to perform ingress policing. But
> >   this would be very brittle because the matching order would depend
> >   on the order in which the rules were introduced:
> >   rule 1: flower skb-priority 5 action police rate 34Mbps # note:
> >   matching on skb-priority doesn't exist (yet?)
> >   rule 2: flower vlan_prio 5 action skbedit priority 5
> >   In this case, traffic with VLAN priority 5 would not get
> >   rate-limited to 34Mbps.
> >
> > So this is one of the reasons why I preferred to defer the hard
> > questions and start with something simple (which for some reason
> > still gets pushback).
> 
> You're jumping to classification while the configuration of the queues itself is
> still not defined. How does the user know how many queues there are to classify
> into?
> 
> Does this driver have descriptor rings for RX / completion? How does it
> decide which queue to poll at NAPI time?

DPAA2 has frame queues for each Rx traffic class, and the decision of which
queue to pull frames from is made by the HW based on the queue priority within
a channel (there is one channel per CPU).

If this should be modeled in software, then I assume there should be a NAPI instance
for each traffic class and the stack should know in which order to call the poll()
callbacks so that the priority is respected.

Ioana




^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
  2020-05-20 15:10                       ` Ioana Ciornei
@ 2020-05-20 19:12                         ` Jakub Kicinski
  2020-05-20 20:24                           ` Ioana Ciornei
  0 siblings, 1 reply; 28+ messages in thread
From: Jakub Kicinski @ 2020-05-20 19:12 UTC (permalink / raw)
  To: Ioana Ciornei; +Cc: davem, netdev

On Wed, 20 May 2020 15:10:42 +0000 Ioana Ciornei wrote:
> DPAA2 has frame queues for each Rx traffic class, and the decision of which
> queue to pull frames from is made by the HW based on the queue priority
> within a channel (there is one channel per CPU).

IOW you're reading the descriptor from the device memory/iomem address
and the HW will return the next descriptor based on configured priority?
Presumably strict priority?

> If this should be modeled in software, then I assume there should be a NAPI instance
> for each traffic class and the stack should know in which order to call the poll()
> callbacks so that the priority is respected.

Right, something like that. But IMHO not needed if HW can serve the
right descriptor upon poll.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
  2020-05-20 19:12                         ` Jakub Kicinski
@ 2020-05-20 20:24                           ` Ioana Ciornei
  2020-05-21 19:07                             ` Jakub Kicinski
  0 siblings, 1 reply; 28+ messages in thread
From: Ioana Ciornei @ 2020-05-20 20:24 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev

> Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic
> classes
> 
> On Wed, 20 May 2020 15:10:42 +0000 Ioana Ciornei wrote:
> > DPAA2 has frame queues for each Rx traffic class, and the decision of
> > which queue to pull frames from is made by the HW based on the queue
> > priority within a channel (there is one channel per CPU).
> 
> IOW you're reading the descriptor from the device memory/iomem address and
> the HW will return the next descriptor based on configured priority?

That's the general idea, but the decision is not made on a frame-by-frame
basis but rather on a dequeue operation which can, at a maximum, return
16 frame descriptors at a time.

> Presumably strict priority?

Only the two highest traffic classes are in strict priority, while the other
6 TCs form two priority tiers - medium (4 TCs) and low (last two TCs).
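
In driver terms that tiering could be captured roughly as below (purely
illustrative; not an actual dpaa2 define):

	/* Hypothetical grouping of the 8 TCs into dequeue priority tiers */
	enum rx_tc_tier {
		RX_TC_TIER_STRICT,	/* the two highest TCs */
		RX_TC_TIER_MEDIUM,	/* the next four TCs */
		RX_TC_TIER_LOW,		/* the last two TCs */
	};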

> 
> > If this should be modeled in software, then I assume there should be a
> > NAPI instance for each traffic class and the stack should know in
> > which order to call the poll() callbacks so that the priority is respected.
> 
> Right, something like that. But IMHO not needed if HW can serve the right
> descriptor upon poll.

After thinking this through I don't actually believe that multiple NAPI instances
would solve this in any circumstance at all:

- If you have hardware prioritization with full scheduling on dequeue, then the
job on the driver side is already done.
- If you only have hardware assist for prioritization (i.e. hardware gives you
multiple rings but doesn't tell you from which one to dequeue), then you can
still use a single NAPI instance just fine and basically pick the highest
priority non-empty ring on the fly (a minimal sketch of this follows below).
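
A minimal sketch of that single-NAPI variant - the prio_ring_* helpers and
struct prio_channel are hypothetical stand-ins for the HW-specific dequeue
primitives, with rings[0] assumed to be the highest priority:

	struct prio_ring;			/* hypothetical per-TC HW ring */
	bool prio_ring_has_frames(struct prio_ring *ring);
	int prio_ring_clean(struct prio_ring *ring, int budget);

	struct prio_channel {
		struct napi_struct napi;
		struct prio_ring *rings[8];	/* [0] = highest priority */
		int num_rings;
	};

	static int prio_napi_poll(struct napi_struct *napi, int budget)
	{
		struct prio_channel *ch = container_of(napi, struct prio_channel,
						       napi);
		int work_done = 0;
		int i;

		/* The highest priority non-empty ring gets first claim on
		 * the budget on every NAPI cycle.
		 */
		for (i = 0; i < ch->num_rings && work_done < budget; i++) {
			if (prio_ring_has_frames(ch->rings[i]))
				work_done += prio_ring_clean(ch->rings[i],
							     budget - work_done);
		}

		if (work_done < budget)
			napi_complete_done(napi, work_done);

		return work_done;
	}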

What I am having trouble understanding is how the fully software implementation
of this possible new Rx qdisc should work. Somehow the skb->priority should be
taken into account when the skb is passing through the stack (i.e. a higher
priority skb should overtake another previously received skb whose priority
queue is congested, even though the latter was received first).

I don't have a very deep understanding of the stack but I am thinking that the
enqueue_to_backlog()/process_backlog() area could be a candidate place for
sorting out bottlenecks. If we do that, I don't see why a qdisc would be
necessary at all, rather than having everybody benefit from prioritization
based on skb->priority.

Ioana

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
  2020-05-20 20:24                           ` Ioana Ciornei
@ 2020-05-21 19:07                             ` Jakub Kicinski
  2020-05-22 13:58                               ` Ioana Ciornei
  0 siblings, 1 reply; 28+ messages in thread
From: Jakub Kicinski @ 2020-05-21 19:07 UTC (permalink / raw)
  To: Ioana Ciornei; +Cc: davem, netdev

On Wed, 20 May 2020 20:24:43 +0000 Ioana Ciornei wrote:
> > Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic
> > classes
> > 
> > On Wed, 20 May 2020 15:10:42 +0000 Ioana Ciornei wrote:  
> > > DPAA2 has frame queues for each Rx traffic class, and the decision of
> > > which queue to pull frames from is made by the HW based on the queue
> > > priority within a channel (there is one channel per CPU).
> > 
> > IOW you're reading the descriptor from the device memory/iomem address and
> > the HW will return the next descriptor based on configured priority?
> 
> That's the general idea, but the decision is not made on a frame-by-frame
> basis but rather on a dequeue operation which can, at a maximum, return
> 16 frame descriptors at a time.

I see!

> > Presumably strict priority?  
> 
> Only the two highest traffic classes are in strict priority, while the
> other 6 TCs form two priority tiers - medium (4 TCs) and low (last two TCs).
> 
> > > If this should be modeled in software, then I assume there should be a
> > > NAPI instance for each traffic class and the stack should know in
> > > which order to call the poll() callbacks so that the priority is respected.  
> > 
> > Right, something like that. But IMHO not needed if HW can serve the right
> > descriptor upon poll.  
> 
> After thinking this through I don't actually believe that multiple NAPI instances
> would solve this in any circumstance at all:
> 
> - If you have hardware prioritization with full scheduling on dequeue, then
> the job on the driver side is already done.
> - If you only have hardware assist for prioritization (i.e. hardware gives
> you multiple rings but doesn't tell you from which one to dequeue), then you
> can still use a single NAPI instance just fine and basically pick the
> highest priority non-empty ring on the fly.
> 
> What I am having trouble understanding is how the fully software
> implementation of this possible new Rx qdisc should work. Somehow the
> skb->priority should be taken into account when the skb is passing through
> the stack (i.e. a higher priority skb should overtake another previously
> received skb whose priority queue is congested, even though the latter was
> received first).

I'd think the SW implementation would come down to which ring to
service first. If there are multiple rings on the host, NAPI can try
to read from the highest priority ring first and then move on to the
next prio. Not sure if there would be a use case for multiple NAPIs
for busy polling or not.

I was hoping we could solve this with the new ring config API (which is
coming any day now, ehh) - in which I hope user space will be able to
assign rings to NAPI instances; all we would have needed on top would be
control over the querying order. But that doesn't really work for you,
it seems, since the selection is offloaded to HW :S

> I don't have a very deep understanding of the stack but I am thinking that
> the enqueue_to_backlog()/process_backlog() area could be a candidate place
> for sorting out bottlenecks. If we do that, I don't see why a qdisc would be
> necessary at all, rather than having everybody benefit from prioritization
> based on skb->priority.

I think once the driver picks the frame up it should run with it to
completion (+/- GRO). We have natural batching with NAPI processing.
Every NAPI budget, high priority rings get a chance to preempt lower
ones.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
  2020-05-21 19:07                             ` Jakub Kicinski
@ 2020-05-22 13:58                               ` Ioana Ciornei
  0 siblings, 0 replies; 28+ messages in thread
From: Ioana Ciornei @ 2020-05-22 13:58 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev

> Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic
> classes
> 
> On Wed, 20 May 2020 20:24:43 +0000 Ioana Ciornei wrote:
> > > Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx
> > > traffic classes
> > >
> > > On Wed, 20 May 2020 15:10:42 +0000 Ioana Ciornei wrote:
> > > > DPAA2 has frame queues for each Rx traffic class, and the decision
> > > > of which queue to pull frames from is made by the HW based on
> > > > the queue priority within a channel (there is one channel per CPU).
> > >
> > > IOW you're reading the descriptor from the device memory/iomem
> > > address and the HW will return the next descriptor based on configured
> > > priority?
> >
> > That's the general idea, but the decision is not made on a frame-by-frame
> > basis but rather on a dequeue operation which can, at a maximum, return
> > 16 frame descriptors at a time.
> 
> I see!
> 
> > > Presumably strict priority?
> >
> > Only the two highest traffic classes are in strict priority, while the
> > other 6 TCs form two priority tiers - medium (4 TCs) and low (last two TCs).
> >
> > > > If this should be modeled in software, then I assume there should
> > > > be a NAPI instance for each traffic class and the stack should
> > > > know in which order to call the poll() callbacks so that the priority is
> respected.
> > >
> > > Right, something like that. But IMHO not needed if HW can serve the
> > > right descriptor upon poll.
> >
> > After thinking this through I don't actually believe that multiple
> > NAPI instances would solve this in any circumstance at all:
> >
> > - If you have hardware prioritization with full scheduling on dequeue,
> > then the job on the driver side is already done.
> > - If you only have hardware assist for prioritization (i.e. hardware
> > gives you multiple rings but doesn't tell you from which one to
> > dequeue), then you can still use a single NAPI instance just fine and
> > basically pick the highest priority non-empty ring on the fly.
> >
> > What I am having trouble understanding is how the fully software
> > implementation of this possible new Rx qdisc should work. Somehow the
> > skb->priority should be taken into account when the skb is passing
> > through the stack (i.e. a higher priority skb should overtake another
> > previously received skb whose priority queue is congested, even though
> > the latter was received first).
> 
> I'd think the SW implementation would come down to which ring to service
> first. If there are multiple rings on the host, NAPI can try to read from
> the highest priority ring first and then move on to the next prio.
> Not sure if there would be a use case for multiple NAPIs for busy polling
> or not.
> 
> I was hoping we could solve this with the new ring config API (which is
> coming any day now, ehh) - in which I hope user space will be able to assign
> rings to NAPI instances; all we would have needed on top would be control
> over the querying order. But that doesn't really work for you, it seems,
> since the selection is offloaded to HW :S
> 

Yes, I would need only the configuration of traffic classes and their priorities and
not the software prioritization.

I'll keep a close eye on the mailing list to see what the new ring config API
that you're referring to is adding.

> > I don't have a very deep understanding of the stack but I am thinking
> > that the enqueue_to_backlog()/process_backlog() area could be a candidate
> > place for sorting out bottlenecks. If we do that, I don't see why a
> > qdisc would be necessary at all, rather than having everybody benefit
> > from prioritization based on skb->priority.
> 
> I think once the driver picks the frame up it should run with it to
> completion (+/- GRO). We have natural batching with NAPI processing.
> Every NAPI budget, high priority rings get a chance to preempt lower ones.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
  2020-05-18 19:35             ` Jakub Kicinski
  2020-05-19  7:38               ` Ioana Ciornei
@ 2020-05-29 11:45               ` Ioana Ciornei
  2020-05-29 19:40                 ` Jakub Kicinski
  1 sibling, 1 reply; 28+ messages in thread
From: Ioana Ciornei @ 2020-05-29 11:45 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev

> Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic
> classes
> 
> On Sat, 16 May 2020 08:16:47 +0000 Ioana Ciornei wrote:
> > > With the Rx QoS features users won't even be able to tell via
> > > standard Linux interfaces what the config was.
> >
> > Ok, that is true. So how should this information be exported to the user?
> 
> I believe no such interface currently exists.

Somehow I missed this the first time around but the number of Rx traffic classes
can be exported through the DCB ops if those traffic classes are PFC enabled.
Also, adding PFC support was the main target of this patch set.

An output like the following would convey to the user how many traffic classes
are available and which of them are PFC enabled.

root@localhost:~# lldptool -t -i eth1 -V PFC
IEEE 8021QAZ PFC TLV
         Willing: yes
         MACsec Bypass Capable: no
         PFC capable traffic classes: 8
         PFC enabled: 1 3 7
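
For reference, those values would map naturally onto the dcbnl ieee_getpfc
callback. A minimal sketch, assuming a pfc_en bitmap cached in the driver
private data (dpaa2_eth_tc_count() is the existing helper; the priv->pfc
field is hypothetical):

	static int dpaa2_eth_dcbnl_ieee_getpfc(struct net_device *net_dev,
					       struct ieee_pfc *pfc)
	{
		struct dpaa2_eth_priv *priv = netdev_priv(net_dev);

		pfc->pfc_cap = dpaa2_eth_tc_count(priv);
		pfc->pfc_en = priv->pfc.pfc_en;	/* hypothetical cached bitmap */

		return 0;
	}

	static const struct dcbnl_rtnl_ops dpaa2_eth_dcbnl_ops = {
		.ieee_getpfc	= dpaa2_eth_dcbnl_ieee_getpfc,
	};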

Ioana

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
  2020-05-29 11:45               ` Ioana Ciornei
@ 2020-05-29 19:40                 ` Jakub Kicinski
  0 siblings, 0 replies; 28+ messages in thread
From: Jakub Kicinski @ 2020-05-29 19:40 UTC (permalink / raw)
  To: Ioana Ciornei; +Cc: davem, netdev

On Fri, 29 May 2020 11:45:08 +0000 Ioana Ciornei wrote:
> > Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic
> > classes
> > 
> > On Sat, 16 May 2020 08:16:47 +0000 Ioana Ciornei wrote:  
> > > > With the Rx QoS features users won't even be able to tell via
> > > > standard Linux interfaces what the config was.  
> > >
> > > Ok, that is true. So how should this information be exported to the user?  
> > 
> > I believe no such interface currently exists.  
> 
> Somehow I missed this the first time around but the number of Rx traffic classes
> can be exported through the DCB ops if those traffic classes are PFC enabled.
> Also, adding PFC support was the main target of this patch set.
> 
> An output like the following would convey to the user how many traffic classes
> are available and which of them are PFC enabled.
> 
> root@localhost:~# lldptool -t -i eth1 -V PFC
> IEEE 8021QAZ PFC TLV
>          Willing: yes
>          MACsec Bypass Capable: no
>          PFC capable traffic classes: 8
>          PFC enabled: 1 3 7

Ack, the DCB APIs are probably the closest today to what you need. I'm
not sure whether there is an established relation between the TCs
there and the number of queues reported and used in ethtool, though :(

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2020-05-29 19:40 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-15 18:47 [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes Ioana Ciornei
2020-05-15 18:47 ` [PATCH v2 net-next 1/7] dpaa2-eth: Add " Ioana Ciornei
2020-05-15 18:47 ` [PATCH v2 net-next 2/7] dpaa2-eth: Trim debugfs FQ stats Ioana Ciornei
2020-05-15 18:47 ` [PATCH v2 net-next 3/7] dpaa2-eth: Distribute ingress frames based on VLAN prio Ioana Ciornei
2020-05-15 23:07   ` kbuild test robot
2020-05-15 18:47 ` [PATCH v2 net-next 4/7] dpaa2-eth: Add helper functions Ioana Ciornei
2020-05-15 18:47 ` [PATCH v2 net-next 5/7] dpaa2-eth: Minor cleanup in dpaa2_eth_set_rx_taildrop() Ioana Ciornei
2020-05-15 18:47 ` [PATCH v2 net-next 6/7] dpaa2-eth: Add congestion group taildrop Ioana Ciornei
2020-05-15 18:47 ` [PATCH v2 net-next 7/7] dpaa2-eth: Update FQ taildrop threshold and buffer pool count Ioana Ciornei
2020-05-15 19:20 ` [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes Jakub Kicinski
2020-05-15 19:31   ` Ioana Ciornei
2020-05-15 19:40     ` Jakub Kicinski
2020-05-15 20:48       ` Ioana Ciornei
2020-05-15 22:25         ` Jakub Kicinski
2020-05-15 23:33           ` David Miller
2020-05-16  8:16           ` Ioana Ciornei
2020-05-18 19:35             ` Jakub Kicinski
2020-05-19  7:38               ` Ioana Ciornei
2020-05-19 18:43                 ` Jakub Kicinski
2020-05-19 20:58                   ` Ioana Ciornei
2020-05-19 21:35                     ` Jakub Kicinski
2020-05-20 15:10                       ` Ioana Ciornei
2020-05-20 19:12                         ` Jakub Kicinski
2020-05-20 20:24                           ` Ioana Ciornei
2020-05-21 19:07                             ` Jakub Kicinski
2020-05-22 13:58                               ` Ioana Ciornei
2020-05-29 11:45               ` Ioana Ciornei
2020-05-29 19:40                 ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).