* [PATCH v4 net-next 00/15] ENETC mqprio/taprio cleanup
@ 2023-01-30 17:31 Vladimir Oltean
  2023-01-30 17:31 ` [PATCH v4 net-next 01/15] net: enetc: simplify enetc_num_stack_tx_queues() Vladimir Oltean
                   ` (14 more replies)
  0 siblings, 15 replies; 37+ messages in thread
From: Vladimir Oltean @ 2023-01-30 17:31 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Claudiu Manoil, Vinicius Costa Gomes, Kurt Kanzenbach,
	Jacob Keller, Jamal Hadi Salim, Cong Wang, Jiri Pirko,
	Igor Russkikh, Yisen Zhuang, Salil Mehta, Jesse Brandeburg,
	Tony Nguyen, Thomas Petazzoni, Saeed Mahameed, Leon Romanovsky,
	Horatiu Vultur, Lars Povlsen, Steen Hegelund, Daniel Machon,
	UNGLinuxDriver, Gerhard Engleder, Siddharth Vadapalli,
	Roger Quadros

Please excuse the spam; I had failed to compile-test CONFIG_TI_CPSW with
the previous version. The only delta between v3 and v4 is that an
#include is no longer deleted.

v3->v4:
- adjusted patch 07/15 to not remove "#include <net/pkt_sched.h>" from
  ti cpsw
https://patchwork.kernel.org/project/netdevbpf/cover/20230127001516.592984-1-vladimir.oltean@nxp.com/

v2->v3:
- move min_num_stack_tx_queues definition so it doesn't conflict with
  the ethtool mm patches I haven't submitted yet for enetc (and also to
  make use of a 4 byte hole)
- warn and mask off excess TCs in gate mask instead of failing
- finally CC qdisc maintainers
v2 at:
https://patchwork.kernel.org/project/netdevbpf/patch/20230126125308.1199404-16-vladimir.oltean@nxp.com/

v1->v2:
- patches 1->4 are new
- update some header inclusions in drivers
- fix typo (said "taprio" instead of "mqprio")
- better enetc mqprio error handling
- dynamically reconstruct mqprio configuration in taprio offload
- also let stmmac and tsnep use per-TXQ gate_mask
v1 (RFC) at:
https://patchwork.kernel.org/project/netdevbpf/cover/20230120141537.1350744-1-vladimir.oltean@nxp.com/

The main goal of this patch set is to make taprio pass the mqprio queue
configuration structure down to ndo_setup_tc() - patch 12/15. But mqprio
itself is not in the best shape currently, so there are some
consolidation patches on that as well.

Next, there are some consolidation patches for the enetc driver's
handling of TX queues and their traffic class assignment. Then, the TX
queue configuration for mqprio and taprio is consolidated.

Finally, there is a change in the meaning of the gate_mask passed by
taprio through ndo_setup_tc(). We introduce a capability through which
drivers can request the gate mask to be per TXQ. The default is changed
so that it is per TC.
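
To illustrate the difference (a sketch only, not the exact helper added
by this series; the function name below is made up): with a per-TC gate
mask, bit i of a schedule entry's gate_mask refers to traffic class i,
while with a per-TXQ gate mask, bit i refers to TX queue i. Expanding
the former into the latter walks the mqprio-style TC-to-queue mapping:

#include <linux/bits.h>
#include <linux/pkt_sched.h>	/* struct tc_mqprio_qopt */

/* Sketch: expand a per-TC gate mask into a per-TXQ gate mask using the
 * count[]/offset[] arrays of the TC-to-queue mapping.
 */
static u32 example_tc_mask_to_queue_mask(const struct tc_mqprio_qopt *qopt,
					 u32 tc_mask)
{
	u32 queue_mask = 0;
	int tc, q;

	for (tc = 0; tc < qopt->num_tc; tc++) {
		if (!(tc_mask & BIT(tc)))
			continue;

		for (q = qopt->offset[tc];
		     q < qopt->offset[tc] + qopt->count[tc]; q++)
			queue_mask |= BIT(q);
	}

	return queue_mask;
}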

Cc: Igor Russkikh <irusskikh@marvell.com>
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Cc: Salil Mehta <salil.mehta@huawei.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Horatiu Vultur <horatiu.vultur@microchip.com>
Cc: Lars Povlsen <lars.povlsen@microchip.com>
Cc: Steen Hegelund <Steen.Hegelund@microchip.com>
Cc: Daniel Machon <daniel.machon@microchip.com>
Cc: UNGLinuxDriver@microchip.com
Cc: Gerhard Engleder <gerhard@engleder-embedded.com>
Cc: Siddharth Vadapalli <s-vadapalli@ti.com>
Cc: Roger Quadros <rogerq@kernel.org>

Vladimir Oltean (15):
  net: enetc: simplify enetc_num_stack_tx_queues()
  net: enetc: allow the enetc_reconfigure() callback to fail
  net: enetc: recalculate num_real_tx_queues when XDP program attaches
  net: enetc: ensure we always have a minimum number of TXQs for stack
  net/sched: mqprio: refactor nlattr parsing to a separate function
  net/sched: mqprio: refactor offloading and unoffloading to dedicated
    functions
  net/sched: move struct tc_mqprio_qopt_offload from pkt_cls.h to
    pkt_sched.h
  net/sched: mqprio: allow offloading drivers to request queue count
    validation
  net/sched: mqprio: add extack messages for queue count validation
  net: enetc: request mqprio to validate the queue counts
  net: enetc: act upon the requested mqprio queue configuration
  net/sched: taprio: pass mqprio queue configuration to ndo_setup_tc()
  net: enetc: act upon mqprio queue config in taprio offload
  net/sched: taprio: mask off bits in gate mask that exceed number of
    TCs
  net/sched: taprio: only calculate gate mask per TXQ for igc, stmmac
    and tsnep

 .../net/ethernet/aquantia/atlantic/aq_main.c  |   1 +
 .../ethernet/chelsio/cxgb4/cxgb4_tc_mqprio.h  |   2 +-
 drivers/net/ethernet/engleder/tsnep_tc.c      |  21 ++
 drivers/net/ethernet/freescale/enetc/enetc.c  | 174 ++++++----
 drivers/net/ethernet/freescale/enetc/enetc.h  |   3 +
 .../net/ethernet/freescale/enetc/enetc_qos.c  |  27 +-
 drivers/net/ethernet/hisilicon/hns3/hnae3.h   |   1 +
 .../net/ethernet/hisilicon/hns3/hns3_enet.c   |   1 +
 drivers/net/ethernet/intel/i40e/i40e.h        |   1 +
 drivers/net/ethernet/intel/iavf/iavf.h        |   1 +
 drivers/net/ethernet/intel/ice/ice.h          |   1 +
 drivers/net/ethernet/intel/igc/igc_main.c     |  23 ++
 drivers/net/ethernet/marvell/mvneta.c         |   2 +-
 .../net/ethernet/mellanox/mlx5/core/en_main.c |   1 +
 .../ethernet/microchip/lan966x/lan966x_tc.c   |   1 +
 .../net/ethernet/microchip/sparx5/sparx5_tc.c |   1 +
 drivers/net/ethernet/stmicro/stmmac/hwif.h    |   5 +
 .../net/ethernet/stmicro/stmmac/stmmac_main.c |   2 +
 .../net/ethernet/stmicro/stmmac/stmmac_tc.c   |  20 ++
 drivers/net/ethernet/ti/cpsw_priv.c           |   1 +
 include/net/pkt_cls.h                         |  10 -
 include/net/pkt_sched.h                       |  16 +
 net/sched/sch_mqprio.c                        | 298 +++++++++++-------
 net/sched/sch_taprio.c                        |  77 +++--
 24 files changed, 473 insertions(+), 217 deletions(-)

-- 
2.34.1



* [PATCH v4 net-next 01/15] net: enetc: simplify enetc_num_stack_tx_queues()
  2023-01-30 17:31 [PATCH v4 net-next 00/15] ENETC mqprio/taprio cleanup Vladimir Oltean
@ 2023-01-30 17:31 ` Vladimir Oltean
  2023-02-01 13:44   ` Simon Horman
  2023-01-30 17:31 ` [PATCH v4 net-next 02/15] net: enetc: allow the enetc_reconfigure() callback to fail Vladimir Oltean
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 37+ messages in thread
From: Vladimir Oltean @ 2023-01-30 17:31 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Claudiu Manoil, Vinicius Costa Gomes, Kurt Kanzenbach,
	Jacob Keller, Jamal Hadi Salim, Cong Wang, Jiri Pirko

We also keep a pointer to the xdp_prog in the private netdev structure;
the copy replicated per RX ring exists just for more convenient access
from the NAPI poll procedure.

Simplify enetc_num_stack_tx_queues() by looking at priv->xdp_prog rather
than iterating through the information replicated per RX ring.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v2->v4: none
v1->v2: patch is new

 drivers/net/ethernet/freescale/enetc/enetc.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index 159ae740ba3c..3a80f259b17e 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -28,11 +28,9 @@ EXPORT_SYMBOL_GPL(enetc_port_mac_wr);
 static int enetc_num_stack_tx_queues(struct enetc_ndev_priv *priv)
 {
 	int num_tx_rings = priv->num_tx_rings;
-	int i;
 
-	for (i = 0; i < priv->num_rx_rings; i++)
-		if (priv->rx_ring[i]->xdp.prog)
-			return num_tx_rings - num_possible_cpus();
+	if (priv->xdp_prog)
+		return num_tx_rings - num_possible_cpus();
 
 	return num_tx_rings;
 }
-- 
2.34.1



* [PATCH v4 net-next 02/15] net: enetc: allow the enetc_reconfigure() callback to fail
  2023-01-30 17:31 [PATCH v4 net-next 00/15] ENETC mqprio/taprio cleanup Vladimir Oltean
  2023-01-30 17:31 ` [PATCH v4 net-next 01/15] net: enetc: simplify enetc_num_stack_tx_queues() Vladimir Oltean
@ 2023-01-30 17:31 ` Vladimir Oltean
  2023-02-01 13:45   ` Simon Horman
  2023-01-30 17:31 ` [PATCH v4 net-next 03/15] net: enetc: recalculate num_real_tx_queues when XDP program attaches Vladimir Oltean
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 37+ messages in thread
From: Vladimir Oltean @ 2023-01-30 17:31 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Claudiu Manoil, Vinicius Costa Gomes, Kurt Kanzenbach,
	Jacob Keller, Jamal Hadi Salim, Cong Wang, Jiri Pirko

enetc_reconfigure() was modified in commit c33bfaf91c4c ("net: enetc:
set up XDP program under enetc_reconfigure()") to take an optional
callback that runs while the netdev is down, but this callback currently
cannot fail.

Code up the error handling so that the interface is restarted with the
old resources if the callback fails.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v2->v4: none
v1->v2: patch is new

 drivers/net/ethernet/freescale/enetc/enetc.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index 3a80f259b17e..5d7eeb1b5a23 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -2574,8 +2574,11 @@ static int enetc_reconfigure(struct enetc_ndev_priv *priv, bool extended,
 	 * without reconfiguration.
 	 */
 	if (!netif_running(priv->ndev)) {
-		if (cb)
-			cb(priv, ctx);
+		if (cb) {
+			err = cb(priv, ctx);
+			if (err)
+				return err;
+		}
 
 		return 0;
 	}
@@ -2596,8 +2599,11 @@ static int enetc_reconfigure(struct enetc_ndev_priv *priv, bool extended,
 	enetc_free_rxtx_rings(priv);
 
 	/* Interface is down, run optional callback now */
-	if (cb)
-		cb(priv, ctx);
+	if (cb) {
+		err = cb(priv, ctx);
+		if (err)
+			goto out_restart;
+	}
 
 	enetc_assign_tx_resources(priv, tx_res);
 	enetc_assign_rx_resources(priv, rx_res);
@@ -2606,6 +2612,10 @@ static int enetc_reconfigure(struct enetc_ndev_priv *priv, bool extended,
 
 	return 0;
 
+out_restart:
+	enetc_setup_bdrs(priv, extended);
+	enetc_start(priv->ndev);
+	enetc_free_rx_resources(rx_res, priv->num_rx_rings);
 out_free_tx_res:
 	enetc_free_tx_resources(tx_res, priv->num_tx_rings);
 out:
-- 
2.34.1



* [PATCH v4 net-next 03/15] net: enetc: recalculate num_real_tx_queues when XDP program attaches
  2023-01-30 17:31 [PATCH v4 net-next 00/15] ENETC mqprio/taprio cleanup Vladimir Oltean
  2023-01-30 17:31 ` [PATCH v4 net-next 01/15] net: enetc: simplify enetc_num_stack_tx_queues() Vladimir Oltean
  2023-01-30 17:31 ` [PATCH v4 net-next 02/15] net: enetc: allow the enetc_reconfigure() callback to fail Vladimir Oltean
@ 2023-01-30 17:31 ` Vladimir Oltean
  2023-02-01 13:45   ` Simon Horman
  2023-01-30 17:31 ` [PATCH v4 net-next 04/15] net: enetc: ensure we always have a minimum number of TXQs for stack Vladimir Oltean
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 37+ messages in thread
From: Vladimir Oltean @ 2023-01-30 17:31 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Claudiu Manoil, Vinicius Costa Gomes, Kurt Kanzenbach,
	Jacob Keller, Jamal Hadi Salim, Cong Wang, Jiri Pirko

Since the blamed net-next commit, enetc_setup_xdp_prog() no longer goes
through enetc_open(), and therefore, the function which was supposed to
detect whether a BPF program exists (in order to crop some TX queues
from network stack usage), enetc_num_stack_tx_queues(), no longer gets
called.

We can move the netif_set_real_num_rx_queues() call to enetc_alloc_msix()
(probe time), since it is a runtime invariant. We can do the same thing
with netif_set_real_num_tx_queues(), and let enetc_reconfigure_xdp_cb()
explicitly recalculate and change the number of stack TX queues.

Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under enetc_reconfigure()")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v2->v4: none
v1->v2: patch is new

 drivers/net/ethernet/freescale/enetc/enetc.c | 35 ++++++++++++--------
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index 5d7eeb1b5a23..e18a6c834eb4 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -2454,7 +2454,6 @@ int enetc_open(struct net_device *ndev)
 {
 	struct enetc_ndev_priv *priv = netdev_priv(ndev);
 	struct enetc_bdr_resource *tx_res, *rx_res;
-	int num_stack_tx_queues;
 	bool extended;
 	int err;
 
@@ -2480,16 +2479,6 @@ int enetc_open(struct net_device *ndev)
 		goto err_alloc_rx;
 	}
 
-	num_stack_tx_queues = enetc_num_stack_tx_queues(priv);
-
-	err = netif_set_real_num_tx_queues(ndev, num_stack_tx_queues);
-	if (err)
-		goto err_set_queues;
-
-	err = netif_set_real_num_rx_queues(ndev, priv->num_rx_rings);
-	if (err)
-		goto err_set_queues;
-
 	enetc_tx_onestep_tstamp_init(priv);
 	enetc_assign_tx_resources(priv, tx_res);
 	enetc_assign_rx_resources(priv, rx_res);
@@ -2498,8 +2487,6 @@ int enetc_open(struct net_device *ndev)
 
 	return 0;
 
-err_set_queues:
-	enetc_free_rx_resources(rx_res, priv->num_rx_rings);
 err_alloc_rx:
 	enetc_free_tx_resources(tx_res, priv->num_tx_rings);
 err_alloc_tx:
@@ -2683,9 +2670,18 @@ EXPORT_SYMBOL_GPL(enetc_setup_tc_mqprio);
 static int enetc_reconfigure_xdp_cb(struct enetc_ndev_priv *priv, void *ctx)
 {
 	struct bpf_prog *old_prog, *prog = ctx;
-	int i;
+	int num_stack_tx_queues;
+	int err, i;
 
 	old_prog = xchg(&priv->xdp_prog, prog);
+
+	num_stack_tx_queues = enetc_num_stack_tx_queues(priv);
+	err = netif_set_real_num_tx_queues(priv->ndev, num_stack_tx_queues);
+	if (err) {
+		xchg(&priv->xdp_prog, old_prog);
+		return err;
+	}
+
 	if (old_prog)
 		bpf_prog_put(old_prog);
 
@@ -2906,6 +2902,7 @@ EXPORT_SYMBOL_GPL(enetc_ioctl);
 int enetc_alloc_msix(struct enetc_ndev_priv *priv)
 {
 	struct pci_dev *pdev = priv->si->pdev;
+	int num_stack_tx_queues;
 	int first_xdp_tx_ring;
 	int i, n, err, nvec;
 	int v_tx_rings;
@@ -2982,6 +2979,16 @@ int enetc_alloc_msix(struct enetc_ndev_priv *priv)
 		}
 	}
 
+	num_stack_tx_queues = enetc_num_stack_tx_queues(priv);
+
+	err = netif_set_real_num_tx_queues(priv->ndev, num_stack_tx_queues);
+	if (err)
+		goto fail;
+
+	err = netif_set_real_num_rx_queues(priv->ndev, priv->num_rx_rings);
+	if (err)
+		goto fail;
+
 	first_xdp_tx_ring = priv->num_tx_rings - num_possible_cpus();
 	priv->xdp_tx_ring = &priv->tx_ring[first_xdp_tx_ring];
 
-- 
2.34.1



* [PATCH v4 net-next 04/15] net: enetc: ensure we always have a minimum number of TXQs for stack
  2023-01-30 17:31 [PATCH v4 net-next 00/15] ENETC mqprio/taprio cleanup Vladimir Oltean
                   ` (2 preceding siblings ...)
  2023-01-30 17:31 ` [PATCH v4 net-next 03/15] net: enetc: recalculate num_real_tx_queues when XDP program attaches Vladimir Oltean
@ 2023-01-30 17:31 ` Vladimir Oltean
  2023-02-01 13:43   ` Simon Horman
  2023-01-30 17:31 ` [PATCH v4 net-next 05/15] net/sched: mqprio: refactor nlattr parsing to a separate function Vladimir Oltean
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 37+ messages in thread
From: Vladimir Oltean @ 2023-01-30 17:31 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Claudiu Manoil, Vinicius Costa Gomes, Kurt Kanzenbach,
	Jacob Keller, Jamal Hadi Salim, Cong Wang, Jiri Pirko

Currently it can happen that an mqprio qdisc is installed with num_tc 8,
and this will reserve 8 (out of 8) TXQs for the network stack. Then we
can attach an XDP program, and this will crop 2 TXQs, leaving just 6 for
mqprio. That's not what the user requested, and we should fail it.

On the other hand, if mqprio isn't requested, we still give the 8 TXQs
to the network stack (with hashing among a single traffic class), but
then, cropping 2 TXQs for XDP is fine, because the user didn't
explicitly ask for any number of TXQs, so no expectations are violated.

Simply put, the logic that mqprio should impose a minimum number of TXQs
for the network stack never existed. Let's say (more or less arbitrarily)
that without mqprio, the driver expects a minimum number of TXQs equal to
the number of CPUs (on NXP LS1028A, that is either 1 or 2), and that with
mqprio, the mqprio configuration dictates the minimum required number of
TXQs.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v3->v4: none
v2->v3: move min_num_stack_tx_queues definition so it doesn't conflict
        with the ethtool mm patches I haven't submitted yet for enetc
        (and also to make use of a 4 byte hole)
v1->v2: patch is new

 drivers/net/ethernet/freescale/enetc/enetc.c | 14 ++++++++++++++
 drivers/net/ethernet/freescale/enetc/enetc.h |  3 +++
 2 files changed, 17 insertions(+)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index e18a6c834eb4..1c0aeaa13cde 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -2626,6 +2626,7 @@ int enetc_setup_tc_mqprio(struct net_device *ndev, void *type_data)
 	if (!num_tc) {
 		netdev_reset_tc(ndev);
 		netif_set_real_num_tx_queues(ndev, num_stack_tx_queues);
+		priv->min_num_stack_tx_queues = num_possible_cpus();
 
 		/* Reset all ring priorities to 0 */
 		for (i = 0; i < priv->num_tx_rings; i++) {
@@ -2656,6 +2657,7 @@ int enetc_setup_tc_mqprio(struct net_device *ndev, void *type_data)
 
 	/* Reset the number of netdev queues based on the TC count */
 	netif_set_real_num_tx_queues(ndev, num_tc);
+	priv->min_num_stack_tx_queues = num_tc;
 
 	netdev_set_num_tc(ndev, num_tc);
 
@@ -2702,9 +2704,20 @@ static int enetc_reconfigure_xdp_cb(struct enetc_ndev_priv *priv, void *ctx)
 static int enetc_setup_xdp_prog(struct net_device *ndev, struct bpf_prog *prog,
 				struct netlink_ext_ack *extack)
 {
+	int num_xdp_tx_queues = prog ? num_possible_cpus() : 0;
 	struct enetc_ndev_priv *priv = netdev_priv(ndev);
 	bool extended;
 
+	if (priv->min_num_stack_tx_queues + num_xdp_tx_queues >
+	    priv->num_tx_rings) {
+		NL_SET_ERR_MSG_FMT_MOD(extack,
+				       "Reserving %d XDP TXQs does not leave a minimum of %d TXQs for network stack (total %d available)",
+				       num_xdp_tx_queues,
+				       priv->min_num_stack_tx_queues,
+				       priv->num_tx_rings);
+		return -EBUSY;
+	}
+
 	extended = !!(priv->active_offloads & ENETC_F_RX_TSTAMP);
 
 	/* The buffer layout is changing, so we need to drain the old
@@ -2989,6 +3002,7 @@ int enetc_alloc_msix(struct enetc_ndev_priv *priv)
 	if (err)
 		goto fail;
 
+	priv->min_num_stack_tx_queues = num_possible_cpus();
 	first_xdp_tx_ring = priv->num_tx_rings - num_possible_cpus();
 	priv->xdp_tx_ring = &priv->tx_ring[first_xdp_tx_ring];
 
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
index 1fe8dfd6b6d4..e21d096c5a90 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc.h
@@ -369,6 +369,9 @@ struct enetc_ndev_priv {
 
 	struct psfp_cap psfp_cap;
 
+	/* Minimum number of TX queues required by the network stack */
+	unsigned int min_num_stack_tx_queues;
+
 	struct phylink *phylink;
 	int ic_mode;
 	u32 tx_ictt;
-- 
2.34.1



* [PATCH v4 net-next 05/15] net/sched: mqprio: refactor nlattr parsing to a separate function
  2023-01-30 17:31 [PATCH v4 net-next 00/15] ENETC mqprio/taprio cleanup Vladimir Oltean
                   ` (3 preceding siblings ...)
  2023-01-30 17:31 ` [PATCH v4 net-next 04/15] net: enetc: ensure we always have a minimum number of TXQs for stack Vladimir Oltean
@ 2023-01-30 17:31 ` Vladimir Oltean
  2023-02-01 14:03   ` Simon Horman
  2023-01-30 17:31 ` [PATCH v4 net-next 06/15] net/sched: mqprio: refactor offloading and unoffloading to dedicated functions Vladimir Oltean
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 37+ messages in thread
From: Vladimir Oltean @ 2023-01-30 17:31 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Claudiu Manoil, Vinicius Costa Gomes, Kurt Kanzenbach,
	Jacob Keller, Jamal Hadi Salim, Cong Wang, Jiri Pirko

mqprio_init() is quite large and unwieldy to add more code to.
Split the netlink attribute parsing into a dedicated function.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
---
v1->v4: none

 net/sched/sch_mqprio.c | 114 +++++++++++++++++++++++------------------
 1 file changed, 63 insertions(+), 51 deletions(-)

diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
index 4c68abaa289b..d2d8a02ded05 100644
--- a/net/sched/sch_mqprio.c
+++ b/net/sched/sch_mqprio.c
@@ -130,6 +130,67 @@ static int parse_attr(struct nlattr *tb[], int maxtype, struct nlattr *nla,
 	return 0;
 }
 
+static int mqprio_parse_nlattr(struct Qdisc *sch, struct tc_mqprio_qopt *qopt,
+			       struct nlattr *opt)
+{
+	struct mqprio_sched *priv = qdisc_priv(sch);
+	struct nlattr *tb[TCA_MQPRIO_MAX + 1];
+	struct nlattr *attr;
+	int i, rem, err;
+
+	err = parse_attr(tb, TCA_MQPRIO_MAX, opt, mqprio_policy,
+			 sizeof(*qopt));
+	if (err < 0)
+		return err;
+
+	if (!qopt->hw)
+		return -EINVAL;
+
+	if (tb[TCA_MQPRIO_MODE]) {
+		priv->flags |= TC_MQPRIO_F_MODE;
+		priv->mode = *(u16 *)nla_data(tb[TCA_MQPRIO_MODE]);
+	}
+
+	if (tb[TCA_MQPRIO_SHAPER]) {
+		priv->flags |= TC_MQPRIO_F_SHAPER;
+		priv->shaper = *(u16 *)nla_data(tb[TCA_MQPRIO_SHAPER]);
+	}
+
+	if (tb[TCA_MQPRIO_MIN_RATE64]) {
+		if (priv->shaper != TC_MQPRIO_SHAPER_BW_RATE)
+			return -EINVAL;
+		i = 0;
+		nla_for_each_nested(attr, tb[TCA_MQPRIO_MIN_RATE64],
+				    rem) {
+			if (nla_type(attr) != TCA_MQPRIO_MIN_RATE64)
+				return -EINVAL;
+			if (i >= qopt->num_tc)
+				break;
+			priv->min_rate[i] = *(u64 *)nla_data(attr);
+			i++;
+		}
+		priv->flags |= TC_MQPRIO_F_MIN_RATE;
+	}
+
+	if (tb[TCA_MQPRIO_MAX_RATE64]) {
+		if (priv->shaper != TC_MQPRIO_SHAPER_BW_RATE)
+			return -EINVAL;
+		i = 0;
+		nla_for_each_nested(attr, tb[TCA_MQPRIO_MAX_RATE64],
+				    rem) {
+			if (nla_type(attr) != TCA_MQPRIO_MAX_RATE64)
+				return -EINVAL;
+			if (i >= qopt->num_tc)
+				break;
+			priv->max_rate[i] = *(u64 *)nla_data(attr);
+			i++;
+		}
+		priv->flags |= TC_MQPRIO_F_MAX_RATE;
+	}
+
+	return 0;
+}
+
 static int mqprio_init(struct Qdisc *sch, struct nlattr *opt,
 		       struct netlink_ext_ack *extack)
 {
@@ -139,9 +200,6 @@ static int mqprio_init(struct Qdisc *sch, struct nlattr *opt,
 	struct Qdisc *qdisc;
 	int i, err = -EOPNOTSUPP;
 	struct tc_mqprio_qopt *qopt = NULL;
-	struct nlattr *tb[TCA_MQPRIO_MAX + 1];
-	struct nlattr *attr;
-	int rem;
 	int len;
 
 	BUILD_BUG_ON(TC_MAX_QUEUE != TC_QOPT_MAX_QUEUE);
@@ -166,55 +224,9 @@ static int mqprio_init(struct Qdisc *sch, struct nlattr *opt,
 
 	len = nla_len(opt) - NLA_ALIGN(sizeof(*qopt));
 	if (len > 0) {
-		err = parse_attr(tb, TCA_MQPRIO_MAX, opt, mqprio_policy,
-				 sizeof(*qopt));
-		if (err < 0)
+		err = mqprio_parse_nlattr(sch, qopt, opt);
+		if (err)
 			return err;
-
-		if (!qopt->hw)
-			return -EINVAL;
-
-		if (tb[TCA_MQPRIO_MODE]) {
-			priv->flags |= TC_MQPRIO_F_MODE;
-			priv->mode = *(u16 *)nla_data(tb[TCA_MQPRIO_MODE]);
-		}
-
-		if (tb[TCA_MQPRIO_SHAPER]) {
-			priv->flags |= TC_MQPRIO_F_SHAPER;
-			priv->shaper = *(u16 *)nla_data(tb[TCA_MQPRIO_SHAPER]);
-		}
-
-		if (tb[TCA_MQPRIO_MIN_RATE64]) {
-			if (priv->shaper != TC_MQPRIO_SHAPER_BW_RATE)
-				return -EINVAL;
-			i = 0;
-			nla_for_each_nested(attr, tb[TCA_MQPRIO_MIN_RATE64],
-					    rem) {
-				if (nla_type(attr) != TCA_MQPRIO_MIN_RATE64)
-					return -EINVAL;
-				if (i >= qopt->num_tc)
-					break;
-				priv->min_rate[i] = *(u64 *)nla_data(attr);
-				i++;
-			}
-			priv->flags |= TC_MQPRIO_F_MIN_RATE;
-		}
-
-		if (tb[TCA_MQPRIO_MAX_RATE64]) {
-			if (priv->shaper != TC_MQPRIO_SHAPER_BW_RATE)
-				return -EINVAL;
-			i = 0;
-			nla_for_each_nested(attr, tb[TCA_MQPRIO_MAX_RATE64],
-					    rem) {
-				if (nla_type(attr) != TCA_MQPRIO_MAX_RATE64)
-					return -EINVAL;
-				if (i >= qopt->num_tc)
-					break;
-				priv->max_rate[i] = *(u64 *)nla_data(attr);
-				i++;
-			}
-			priv->flags |= TC_MQPRIO_F_MAX_RATE;
-		}
 	}
 
 	/* pre-allocate qdisc, attachment can't fail */
-- 
2.34.1



* [PATCH v4 net-next 06/15] net/sched: mqprio: refactor offloading and unoffloading to dedicated functions
  2023-01-30 17:31 [PATCH v4 net-next 00/15] ENETC mqprio/taprio cleanup Vladimir Oltean
                   ` (4 preceding siblings ...)
  2023-01-30 17:31 ` [PATCH v4 net-next 05/15] net/sched: mqprio: refactor nlattr parsing to a separate function Vladimir Oltean
@ 2023-01-30 17:31 ` Vladimir Oltean
  2023-02-01 14:07   ` Simon Horman
  2023-01-30 17:31 ` [PATCH v4 net-next 07/15] net/sched: move struct tc_mqprio_qopt_offload from pkt_cls.h to pkt_sched.h Vladimir Oltean
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 37+ messages in thread
From: Vladimir Oltean @ 2023-01-30 17:31 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Claudiu Manoil, Vinicius Costa Gomes, Kurt Kanzenbach,
	Jacob Keller, Jamal Hadi Salim, Cong Wang, Jiri Pirko

Some more logic will be added to mqprio offloading, so split that code
up from mqprio_init(), which is already large, and create a new
function, mqprio_enable_offload(), similar to taprio_enable_offload().
Also create the opposite function mqprio_disable_offload().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
---
v1->v4: none

 net/sched/sch_mqprio.c | 102 ++++++++++++++++++++++++-----------------
 1 file changed, 59 insertions(+), 43 deletions(-)

diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
index d2d8a02ded05..3579a64da06e 100644
--- a/net/sched/sch_mqprio.c
+++ b/net/sched/sch_mqprio.c
@@ -27,6 +27,61 @@ struct mqprio_sched {
 	u64 max_rate[TC_QOPT_MAX_QUEUE];
 };
 
+static int mqprio_enable_offload(struct Qdisc *sch,
+				 const struct tc_mqprio_qopt *qopt)
+{
+	struct tc_mqprio_qopt_offload mqprio = {.qopt = *qopt};
+	struct mqprio_sched *priv = qdisc_priv(sch);
+	struct net_device *dev = qdisc_dev(sch);
+	int err, i;
+
+	switch (priv->mode) {
+	case TC_MQPRIO_MODE_DCB:
+		if (priv->shaper != TC_MQPRIO_SHAPER_DCB)
+			return -EINVAL;
+		break;
+	case TC_MQPRIO_MODE_CHANNEL:
+		mqprio.flags = priv->flags;
+		if (priv->flags & TC_MQPRIO_F_MODE)
+			mqprio.mode = priv->mode;
+		if (priv->flags & TC_MQPRIO_F_SHAPER)
+			mqprio.shaper = priv->shaper;
+		if (priv->flags & TC_MQPRIO_F_MIN_RATE)
+			for (i = 0; i < mqprio.qopt.num_tc; i++)
+				mqprio.min_rate[i] = priv->min_rate[i];
+		if (priv->flags & TC_MQPRIO_F_MAX_RATE)
+			for (i = 0; i < mqprio.qopt.num_tc; i++)
+				mqprio.max_rate[i] = priv->max_rate[i];
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_MQPRIO,
+					    &mqprio);
+	if (err)
+		return err;
+
+	priv->hw_offload = mqprio.qopt.hw;
+
+	return 0;
+}
+
+static void mqprio_disable_offload(struct Qdisc *sch)
+{
+	struct tc_mqprio_qopt_offload mqprio = { { 0 } };
+	struct mqprio_sched *priv = qdisc_priv(sch);
+	struct net_device *dev = qdisc_dev(sch);
+
+	switch (priv->mode) {
+	case TC_MQPRIO_MODE_DCB:
+	case TC_MQPRIO_MODE_CHANNEL:
+		dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_MQPRIO,
+					      &mqprio);
+		break;
+	}
+}
+
 static void mqprio_destroy(struct Qdisc *sch)
 {
 	struct net_device *dev = qdisc_dev(sch);
@@ -41,22 +96,10 @@ static void mqprio_destroy(struct Qdisc *sch)
 		kfree(priv->qdiscs);
 	}
 
-	if (priv->hw_offload && dev->netdev_ops->ndo_setup_tc) {
-		struct tc_mqprio_qopt_offload mqprio = { { 0 } };
-
-		switch (priv->mode) {
-		case TC_MQPRIO_MODE_DCB:
-		case TC_MQPRIO_MODE_CHANNEL:
-			dev->netdev_ops->ndo_setup_tc(dev,
-						      TC_SETUP_QDISC_MQPRIO,
-						      &mqprio);
-			break;
-		default:
-			return;
-		}
-	} else {
+	if (priv->hw_offload && dev->netdev_ops->ndo_setup_tc)
+		mqprio_disable_offload(sch);
+	else
 		netdev_set_num_tc(dev, 0);
-	}
 }
 
 static int mqprio_parse_opt(struct net_device *dev, struct tc_mqprio_qopt *qopt)
@@ -253,36 +296,9 @@ static int mqprio_init(struct Qdisc *sch, struct nlattr *opt,
 	 * supplied and verified mapping
 	 */
 	if (qopt->hw) {
-		struct tc_mqprio_qopt_offload mqprio = {.qopt = *qopt};
-
-		switch (priv->mode) {
-		case TC_MQPRIO_MODE_DCB:
-			if (priv->shaper != TC_MQPRIO_SHAPER_DCB)
-				return -EINVAL;
-			break;
-		case TC_MQPRIO_MODE_CHANNEL:
-			mqprio.flags = priv->flags;
-			if (priv->flags & TC_MQPRIO_F_MODE)
-				mqprio.mode = priv->mode;
-			if (priv->flags & TC_MQPRIO_F_SHAPER)
-				mqprio.shaper = priv->shaper;
-			if (priv->flags & TC_MQPRIO_F_MIN_RATE)
-				for (i = 0; i < mqprio.qopt.num_tc; i++)
-					mqprio.min_rate[i] = priv->min_rate[i];
-			if (priv->flags & TC_MQPRIO_F_MAX_RATE)
-				for (i = 0; i < mqprio.qopt.num_tc; i++)
-					mqprio.max_rate[i] = priv->max_rate[i];
-			break;
-		default:
-			return -EINVAL;
-		}
-		err = dev->netdev_ops->ndo_setup_tc(dev,
-						    TC_SETUP_QDISC_MQPRIO,
-						    &mqprio);
+		err = mqprio_enable_offload(sch, qopt);
 		if (err)
 			return err;
-
-		priv->hw_offload = mqprio.qopt.hw;
 	} else {
 		netdev_set_num_tc(dev, qopt->num_tc);
 		for (i = 0; i < qopt->num_tc; i++)
-- 
2.34.1



* [PATCH v4 net-next 07/15] net/sched: move struct tc_mqprio_qopt_offload from pkt_cls.h to pkt_sched.h
  2023-01-30 17:31 [PATCH v4 net-next 00/15] ENETC mqprio/taprio cleanup Vladimir Oltean
                   ` (5 preceding siblings ...)
  2023-01-30 17:31 ` [PATCH v4 net-next 06/15] net/sched: mqprio: refactor offloading and unoffloading to dedicated functions Vladimir Oltean
@ 2023-01-30 17:31 ` Vladimir Oltean
  2023-02-01 14:07   ` Simon Horman
  2023-01-30 17:31 ` [PATCH v4 net-next 08/15] net/sched: mqprio: allow offloading drivers to request queue count validation Vladimir Oltean
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 37+ messages in thread
From: Vladimir Oltean @ 2023-01-30 17:31 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Claudiu Manoil, Vinicius Costa Gomes, Kurt Kanzenbach,
	Jacob Keller, Jamal Hadi Salim, Cong Wang, Jiri Pirko,
	Igor Russkikh, Yisen Zhuang, Salil Mehta, Jesse Brandeburg,
	Tony Nguyen, Thomas Petazzoni, Saeed Mahameed, Leon Romanovsky,
	Horatiu Vultur, Lars Povlsen, Steen Hegelund, Daniel Machon,
	UNGLinuxDriver

Since mqprio is a scheduler and not a classifier, move its offload
structure to pkt_sched.h, where struct tc_taprio_qopt_offload also lies.

Also update some header inclusions in drivers that access this
structure, to the best of my abilities.

Cc: Igor Russkikh <irusskikh@marvell.com>
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Cc: Salil Mehta <salil.mehta@huawei.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Horatiu Vultur <horatiu.vultur@microchip.com>
Cc: Lars Povlsen <lars.povlsen@microchip.com>
Cc: Steen Hegelund <Steen.Hegelund@microchip.com>
Cc: Daniel Machon <daniel.machon@microchip.com>
Cc: UNGLinuxDriver@microchip.com
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v3->v4: shouldn't have removed "#include <net/pkt_sched.h>" from ti cpsw
v2->v3: none
v1->v2:
- update some header inclusions in drivers
- fix typo (said "taprio" instead of "mqprio")

 drivers/net/ethernet/aquantia/atlantic/aq_main.c     |  1 +
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_mqprio.h |  2 +-
 drivers/net/ethernet/hisilicon/hns3/hnae3.h          |  1 +
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c      |  1 +
 drivers/net/ethernet/intel/i40e/i40e.h               |  1 +
 drivers/net/ethernet/intel/iavf/iavf.h               |  1 +
 drivers/net/ethernet/intel/ice/ice.h                 |  1 +
 drivers/net/ethernet/marvell/mvneta.c                |  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c    |  1 +
 drivers/net/ethernet/microchip/lan966x/lan966x_tc.c  |  1 +
 drivers/net/ethernet/microchip/sparx5/sparx5_tc.c    |  1 +
 drivers/net/ethernet/ti/cpsw_priv.c                  |  1 +
 include/net/pkt_cls.h                                | 10 ----------
 include/net/pkt_sched.h                              | 10 ++++++++++
 14 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_main.c b/drivers/net/ethernet/aquantia/atlantic/aq_main.c
index 77609dc0a08d..0b2a52199914 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_main.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_main.c
@@ -21,6 +21,7 @@
 #include <linux/ip.h>
 #include <linux/udp.h>
 #include <net/pkt_cls.h>
+#include <net/pkt_sched.h>
 #include <linux/filter.h>
 
 MODULE_LICENSE("GPL v2");
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_mqprio.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_mqprio.h
index be96f1dc0372..d4a862a9fd7d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_mqprio.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_mqprio.h
@@ -4,7 +4,7 @@
 #ifndef __CXGB4_TC_MQPRIO_H__
 #define __CXGB4_TC_MQPRIO_H__
 
-#include <net/pkt_cls.h>
+#include <net/pkt_sched.h>
 
 #define CXGB4_EOSW_TXQ_DEFAULT_DESC_NUM 128
 
diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
index 17137de9338c..40f4306449eb 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
@@ -32,6 +32,7 @@
 #include <linux/pkt_sched.h>
 #include <linux/types.h>
 #include <net/pkt_cls.h>
+#include <net/pkt_sched.h>
 
 #define HNAE3_MOD_VERSION "1.0"
 
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index b4c4fb873568..25be7f8ac7cd 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -20,6 +20,7 @@
 #include <net/gro.h>
 #include <net/ip6_checksum.h>
 #include <net/pkt_cls.h>
+#include <net/pkt_sched.h>
 #include <net/tcp.h>
 #include <net/vxlan.h>
 #include <net/geneve.h>
diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 60e351665c70..38c341b9f368 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -33,6 +33,7 @@
 #include <linux/net_tstamp.h>
 #include <linux/ptp_clock_kernel.h>
 #include <net/pkt_cls.h>
+#include <net/pkt_sched.h>
 #include <net/tc_act/tc_gact.h>
 #include <net/tc_act/tc_mirred.h>
 #include <net/udp_tunnel.h>
diff --git a/drivers/net/ethernet/intel/iavf/iavf.h b/drivers/net/ethernet/intel/iavf/iavf.h
index 23bc000e77b8..232bc61d9eee 100644
--- a/drivers/net/ethernet/intel/iavf/iavf.h
+++ b/drivers/net/ethernet/intel/iavf/iavf.h
@@ -30,6 +30,7 @@
 #include <linux/jiffies.h>
 #include <net/ip6_checksum.h>
 #include <net/pkt_cls.h>
+#include <net/pkt_sched.h>
 #include <net/udp.h>
 #include <net/tc_act/tc_gact.h>
 #include <net/tc_act/tc_mirred.h>
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index ae93ae488bc2..ef6b91abce70 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -41,6 +41,7 @@
 #include <linux/dim.h>
 #include <linux/gnss.h>
 #include <net/pkt_cls.h>
+#include <net/pkt_sched.h>
 #include <net/tc_act/tc_mirred.h>
 #include <net/tc_act/tc_gact.h>
 #include <net/ip.h>
diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index f8925cac61e4..a48588c80317 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -38,7 +38,7 @@
 #include <net/ipv6.h>
 #include <net/tso.h>
 #include <net/page_pool.h>
-#include <net/pkt_cls.h>
+#include <net/pkt_sched.h>
 #include <linux/bpf_trace.h>
 
 /* Registers */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 0e87432ec6f1..7de21a1ef009 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -39,6 +39,7 @@
 #include <linux/if_bridge.h>
 #include <linux/filter.h>
 #include <net/page_pool.h>
+#include <net/pkt_sched.h>
 #include <net/xdp_sock_drv.h>
 #include "eswitch.h"
 #include "en.h"
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_tc.c b/drivers/net/ethernet/microchip/lan966x/lan966x_tc.c
index 80625ba0b354..cf0cc7562d04 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_tc.c
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_tc.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0+
 
 #include <net/pkt_cls.h>
+#include <net/pkt_sched.h>
 
 #include "lan966x_main.h"
 
diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_tc.c b/drivers/net/ethernet/microchip/sparx5/sparx5_tc.c
index 205246b5af82..e80f3166db7d 100644
--- a/drivers/net/ethernet/microchip/sparx5/sparx5_tc.c
+++ b/drivers/net/ethernet/microchip/sparx5/sparx5_tc.c
@@ -5,6 +5,7 @@
  */
 
 #include <net/pkt_cls.h>
+#include <net/pkt_sched.h>
 
 #include "sparx5_tc.h"
 #include "sparx5_main.h"
diff --git a/drivers/net/ethernet/ti/cpsw_priv.c b/drivers/net/ethernet/ti/cpsw_priv.c
index 758295c898ac..e966dd47e2db 100644
--- a/drivers/net/ethernet/ti/cpsw_priv.c
+++ b/drivers/net/ethernet/ti/cpsw_priv.c
@@ -20,6 +20,7 @@
 #include <linux/skbuff.h>
 #include <net/page_pool.h>
 #include <net/pkt_cls.h>
+#include <net/pkt_sched.h>
 
 #include "cpsw.h"
 #include "cpts.h"
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 4cabb32a2ad9..cd410a87517b 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -788,16 +788,6 @@ struct tc_cls_bpf_offload {
 	bool exts_integrated;
 };
 
-struct tc_mqprio_qopt_offload {
-	/* struct tc_mqprio_qopt must always be the first element */
-	struct tc_mqprio_qopt qopt;
-	u16 mode;
-	u16 shaper;
-	u32 flags;
-	u64 min_rate[TC_QOPT_MAX_QUEUE];
-	u64 max_rate[TC_QOPT_MAX_QUEUE];
-};
-
 /* This structure holds cookie structure that is passed from user
  * to the kernel for actions and classifiers
  */
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index 38207873eda6..6c5e64e0a0bb 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -160,6 +160,16 @@ struct tc_etf_qopt_offload {
 	s32 queue;
 };
 
+struct tc_mqprio_qopt_offload {
+	/* struct tc_mqprio_qopt must always be the first element */
+	struct tc_mqprio_qopt qopt;
+	u16 mode;
+	u16 shaper;
+	u32 flags;
+	u64 min_rate[TC_QOPT_MAX_QUEUE];
+	u64 max_rate[TC_QOPT_MAX_QUEUE];
+};
+
 struct tc_taprio_caps {
 	bool supports_queue_max_sdu:1;
 };
-- 
2.34.1



* [PATCH v4 net-next 08/15] net/sched: mqprio: allow offloading drivers to request queue count validation
  2023-01-30 17:31 [PATCH v4 net-next 00/15] ENETC mqprio/taprio cleanup Vladimir Oltean
                   ` (6 preceding siblings ...)
  2023-01-30 17:31 ` [PATCH v4 net-next 07/15] net/sched: move struct tc_mqprio_qopt_offload from pkt_cls.h to pkt_sched.h Vladimir Oltean
@ 2023-01-30 17:31 ` Vladimir Oltean
  2023-01-30 18:37   ` Claudiu Manoil
  2023-02-01 14:08   ` Simon Horman
  2023-01-30 17:31 ` [PATCH v4 net-next 09/15] net/sched: mqprio: add extack messages for " Vladimir Oltean
                   ` (6 subsequent siblings)
  14 siblings, 2 replies; 37+ messages in thread
From: Vladimir Oltean @ 2023-01-30 17:31 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Claudiu Manoil, Vinicius Costa Gomes, Kurt Kanzenbach,
	Jacob Keller, Jamal Hadi Salim, Cong Wang, Jiri Pirko

mqprio_parse_opt() proudly has a comment:

	/* If hardware offload is requested we will leave it to the device
	 * to either populate the queue counts itself or to validate the
	 * provided queue counts.
	 */

Unfortunately some device drivers did not get this memo, and don't
validate the queue counts.

Introduce a tc capability, and make mqprio query it.
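
As a rough sketch of the driver side (hypothetical "foo" driver; patch
10/15 wires this up for enetc), the capability is advertised from the
TC_QUERY_CAPS handler reached through ndo_setup_tc():

#include <linux/netdevice.h>
#include <net/pkt_cls.h>
#include <net/pkt_sched.h>

static int foo_query_caps(struct net_device *ndev, void *type_data)
{
	struct tc_query_caps_base *base = type_data;

	switch (base->type) {
	case TC_SETUP_QDISC_MQPRIO: {
		struct tc_mqprio_caps *caps = base->caps;

		/* Ask mqprio to run its queue count validation before
		 * calling ndo_setup_tc(TC_SETUP_QDISC_MQPRIO).
		 */
		caps->validate_queue_counts = true;
		return 0;
	}
	default:
		return -EOPNOTSUPP;
	}
}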

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
---
v1->v4: none

 include/net/pkt_sched.h |  4 +++
 net/sched/sch_mqprio.c  | 58 +++++++++++++++++++++++++++--------------
 2 files changed, 42 insertions(+), 20 deletions(-)

diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index 6c5e64e0a0bb..02e3ccfbc7d1 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -160,6 +160,10 @@ struct tc_etf_qopt_offload {
 	s32 queue;
 };
 
+struct tc_mqprio_caps {
+	bool validate_queue_counts:1;
+};
+
 struct tc_mqprio_qopt_offload {
 	/* struct tc_mqprio_qopt must always be the first element */
 	struct tc_mqprio_qopt qopt;
diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
index 3579a64da06e..5fdceab82ea1 100644
--- a/net/sched/sch_mqprio.c
+++ b/net/sched/sch_mqprio.c
@@ -27,14 +27,50 @@ struct mqprio_sched {
 	u64 max_rate[TC_QOPT_MAX_QUEUE];
 };
 
+static int mqprio_validate_queue_counts(struct net_device *dev,
+					const struct tc_mqprio_qopt *qopt)
+{
+	int i, j;
+
+	for (i = 0; i < qopt->num_tc; i++) {
+		unsigned int last = qopt->offset[i] + qopt->count[i];
+
+		/* Verify the queue count is in tx range being equal to the
+		 * real_num_tx_queues indicates the last queue is in use.
+		 */
+		if (qopt->offset[i] >= dev->real_num_tx_queues ||
+		    !qopt->count[i] ||
+		    last > dev->real_num_tx_queues)
+			return -EINVAL;
+
+		/* Verify that the offset and counts do not overlap */
+		for (j = i + 1; j < qopt->num_tc; j++) {
+			if (last > qopt->offset[j])
+				return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
 static int mqprio_enable_offload(struct Qdisc *sch,
 				 const struct tc_mqprio_qopt *qopt)
 {
 	struct tc_mqprio_qopt_offload mqprio = {.qopt = *qopt};
 	struct mqprio_sched *priv = qdisc_priv(sch);
 	struct net_device *dev = qdisc_dev(sch);
+	struct tc_mqprio_caps caps;
 	int err, i;
 
+	qdisc_offload_query_caps(dev, TC_SETUP_QDISC_MQPRIO,
+				 &caps, sizeof(caps));
+
+	if (caps.validate_queue_counts) {
+		err = mqprio_validate_queue_counts(dev, qopt);
+		if (err)
+			return err;
+	}
+
 	switch (priv->mode) {
 	case TC_MQPRIO_MODE_DCB:
 		if (priv->shaper != TC_MQPRIO_SHAPER_DCB)
@@ -104,7 +140,7 @@ static void mqprio_destroy(struct Qdisc *sch)
 
 static int mqprio_parse_opt(struct net_device *dev, struct tc_mqprio_qopt *qopt)
 {
-	int i, j;
+	int i;
 
 	/* Verify num_tc is not out of max range */
 	if (qopt->num_tc > TC_MAX_QUEUE)
@@ -131,25 +167,7 @@ static int mqprio_parse_opt(struct net_device *dev, struct tc_mqprio_qopt *qopt)
 	if (qopt->hw)
 		return dev->netdev_ops->ndo_setup_tc ? 0 : -EINVAL;
 
-	for (i = 0; i < qopt->num_tc; i++) {
-		unsigned int last = qopt->offset[i] + qopt->count[i];
-
-		/* Verify the queue count is in tx range being equal to the
-		 * real_num_tx_queues indicates the last queue is in use.
-		 */
-		if (qopt->offset[i] >= dev->real_num_tx_queues ||
-		    !qopt->count[i] ||
-		    last > dev->real_num_tx_queues)
-			return -EINVAL;
-
-		/* Verify that the offset and counts do not overlap */
-		for (j = i + 1; j < qopt->num_tc; j++) {
-			if (last > qopt->offset[j])
-				return -EINVAL;
-		}
-	}
-
-	return 0;
+	return mqprio_validate_queue_counts(dev, qopt);
 }
 
 static const struct nla_policy mqprio_policy[TCA_MQPRIO_MAX + 1] = {
-- 
2.34.1



* [PATCH v4 net-next 09/15] net/sched: mqprio: add extack messages for queue count validation
  2023-01-30 17:31 [PATCH v4 net-next 00/15] ENETC mqprio/taprio cleanup Vladimir Oltean
                   ` (7 preceding siblings ...)
  2023-01-30 17:31 ` [PATCH v4 net-next 08/15] net/sched: mqprio: allow offloading drivers to request queue count validation Vladimir Oltean
@ 2023-01-30 17:31 ` Vladimir Oltean
  2023-02-01 14:12   ` Simon Horman
  2023-01-30 17:31 ` [PATCH v4 net-next 10/15] net: enetc: request mqprio to validate the queue counts Vladimir Oltean
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 37+ messages in thread
From: Vladimir Oltean @ 2023-01-30 17:31 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Claudiu Manoil, Vinicius Costa Gomes, Kurt Kanzenbach,
	Jacob Keller, Jamal Hadi Salim, Cong Wang, Jiri Pirko

To make mqprio more user-friendly, create netlink extended ack messages
which say exactly what is wrong with the queue counts. This uses the new
support for printf-formatted extack messages.

Example:

$ tc qdisc add dev eno0 root handle 1: mqprio num_tc 8 \
	map 0 1 2 3 4 5 6 7 queues 3@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 1
Error: sch_mqprio: Queues 1:1 for TC 1 overlap with last TX queue 3 for TC 0.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
---
v1->v4: none

 net/sched/sch_mqprio.c | 40 ++++++++++++++++++++++++++++++----------
 1 file changed, 30 insertions(+), 10 deletions(-)

diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
index 5fdceab82ea1..4cd6d47cc7a1 100644
--- a/net/sched/sch_mqprio.c
+++ b/net/sched/sch_mqprio.c
@@ -28,25 +28,42 @@ struct mqprio_sched {
 };
 
 static int mqprio_validate_queue_counts(struct net_device *dev,
-					const struct tc_mqprio_qopt *qopt)
+					const struct tc_mqprio_qopt *qopt,
+					struct netlink_ext_ack *extack)
 {
 	int i, j;
 
 	for (i = 0; i < qopt->num_tc; i++) {
 		unsigned int last = qopt->offset[i] + qopt->count[i];
 
+		if (!qopt->count[i]) {
+			NL_SET_ERR_MSG_FMT_MOD(extack, "No queues for TC %d",
+					       i);
+			return -EINVAL;
+		}
+
 		/* Verify the queue count is in tx range being equal to the
 		 * real_num_tx_queues indicates the last queue is in use.
 		 */
 		if (qopt->offset[i] >= dev->real_num_tx_queues ||
-		    !qopt->count[i] ||
-		    last > dev->real_num_tx_queues)
+		    last > dev->real_num_tx_queues) {
+			NL_SET_ERR_MSG_FMT_MOD(extack,
+					       "Queues %d:%d for TC %d exceed the %d TX queues available",
+					       qopt->count[i], qopt->offset[i],
+					       i, dev->real_num_tx_queues);
 			return -EINVAL;
+		}
 
 		/* Verify that the offset and counts do not overlap */
 		for (j = i + 1; j < qopt->num_tc; j++) {
-			if (last > qopt->offset[j])
+			if (last > qopt->offset[j]) {
+				NL_SET_ERR_MSG_FMT_MOD(extack,
+						       "Queues %d:%d for TC %d overlap with last TX queue %d for TC %d",
+						       qopt->count[j],
+						       qopt->offset[j],
+						       j, last, i);
 				return -EINVAL;
+			}
 		}
 	}
 
@@ -54,7 +71,8 @@ static int mqprio_validate_queue_counts(struct net_device *dev,
 }
 
 static int mqprio_enable_offload(struct Qdisc *sch,
-				 const struct tc_mqprio_qopt *qopt)
+				 const struct tc_mqprio_qopt *qopt,
+				 struct netlink_ext_ack *extack)
 {
 	struct tc_mqprio_qopt_offload mqprio = {.qopt = *qopt};
 	struct mqprio_sched *priv = qdisc_priv(sch);
@@ -66,7 +84,7 @@ static int mqprio_enable_offload(struct Qdisc *sch,
 				 &caps, sizeof(caps));
 
 	if (caps.validate_queue_counts) {
-		err = mqprio_validate_queue_counts(dev, qopt);
+		err = mqprio_validate_queue_counts(dev, qopt, extack);
 		if (err)
 			return err;
 	}
@@ -138,7 +156,9 @@ static void mqprio_destroy(struct Qdisc *sch)
 		netdev_set_num_tc(dev, 0);
 }
 
-static int mqprio_parse_opt(struct net_device *dev, struct tc_mqprio_qopt *qopt)
+static int mqprio_parse_opt(struct net_device *dev,
+			    struct tc_mqprio_qopt *qopt,
+			    struct netlink_ext_ack *extack)
 {
 	int i;
 
@@ -167,7 +187,7 @@ static int mqprio_parse_opt(struct net_device *dev, struct tc_mqprio_qopt *qopt)
 	if (qopt->hw)
 		return dev->netdev_ops->ndo_setup_tc ? 0 : -EINVAL;
 
-	return mqprio_validate_queue_counts(dev, qopt);
+	return mqprio_validate_queue_counts(dev, qopt, extack);
 }
 
 static const struct nla_policy mqprio_policy[TCA_MQPRIO_MAX + 1] = {
@@ -280,7 +300,7 @@ static int mqprio_init(struct Qdisc *sch, struct nlattr *opt,
 		return -EINVAL;
 
 	qopt = nla_data(opt);
-	if (mqprio_parse_opt(dev, qopt))
+	if (mqprio_parse_opt(dev, qopt, extack))
 		return -EINVAL;
 
 	len = nla_len(opt) - NLA_ALIGN(sizeof(*qopt));
@@ -314,7 +334,7 @@ static int mqprio_init(struct Qdisc *sch, struct nlattr *opt,
 	 * supplied and verified mapping
 	 */
 	if (qopt->hw) {
-		err = mqprio_enable_offload(sch, qopt);
+		err = mqprio_enable_offload(sch, qopt, extack);
 		if (err)
 			return err;
 	} else {
-- 
2.34.1



* [PATCH v4 net-next 10/15] net: enetc: request mqprio to validate the queue counts
  2023-01-30 17:31 [PATCH v4 net-next 00/15] ENETC mqprio/taprio cleanup Vladimir Oltean
                   ` (8 preceding siblings ...)
  2023-01-30 17:31 ` [PATCH v4 net-next 09/15] net/sched: mqprio: add extack messages for " Vladimir Oltean
@ 2023-01-30 17:31 ` Vladimir Oltean
  2023-02-01 14:12   ` Simon Horman
  2023-01-30 17:31 ` [PATCH v4 net-next 11/15] net: enetc: act upon the requested mqprio queue configuration Vladimir Oltean
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 37+ messages in thread
From: Vladimir Oltean @ 2023-01-30 17:31 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Claudiu Manoil, Vinicius Costa Gomes, Kurt Kanzenbach,
	Jacob Keller, Jamal Hadi Salim, Cong Wang, Jiri Pirko

The enetc driver does not validate the mqprio queue configuration, so it
currently allows things like this:

$ tc qdisc add dev swp0 root handle 1: mqprio num_tc 8 \
	map 0 1 2 3 4 5 6 7 queues 3@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 1

By requesting validation via the mqprio capability structure, this is no
longer allowed, and needs no custom code in the driver.

The check that num_tc <= real_num_tx_queues also becomes superfluous and
can be dropped, because mqprio_validate_queue_counts() validates that no
TXQ range exceeds real_num_tx_queues. That is a stronger check, because
there is at least 1 TXQ per TC, so there are at least as many TXQs as TCs.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
---
v2->v4: none
v1->v2: move the deletion of the num_tc check to this patch, and add an
        explanation for it

 drivers/net/ethernet/freescale/enetc/enetc.c     | 7 -------
 drivers/net/ethernet/freescale/enetc/enetc_qos.c | 7 +++++++
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index 1c0aeaa13cde..e4718b50cf31 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -2638,13 +2638,6 @@ int enetc_setup_tc_mqprio(struct net_device *ndev, void *type_data)
 		return 0;
 	}
 
-	/* Check if we have enough BD rings available to accommodate all TCs */
-	if (num_tc > num_stack_tx_queues) {
-		netdev_err(ndev, "Max %d traffic classes supported\n",
-			   priv->num_tx_rings);
-		return -EINVAL;
-	}
-
 	/* For the moment, we use only one BD ring per TC.
 	 *
 	 * Configure num_tc BD rings with increasing priorities.
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_qos.c b/drivers/net/ethernet/freescale/enetc/enetc_qos.c
index fcebb54224c0..6e0b4dd91509 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_qos.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_qos.c
@@ -1611,6 +1611,13 @@ int enetc_qos_query_caps(struct net_device *ndev, void *type_data)
 	struct enetc_si *si = priv->si;
 
 	switch (base->type) {
+	case TC_SETUP_QDISC_MQPRIO: {
+		struct tc_mqprio_caps *caps = base->caps;
+
+		caps->validate_queue_counts = true;
+
+		return 0;
+	}
 	case TC_SETUP_QDISC_TAPRIO: {
 		struct tc_taprio_caps *caps = base->caps;
 
-- 
2.34.1



* [PATCH v4 net-next 11/15] net: enetc: act upon the requested mqprio queue configuration
  2023-01-30 17:31 [PATCH v4 net-next 00/15] ENETC mqprio/taprio cleanup Vladimir Oltean
                   ` (9 preceding siblings ...)
  2023-01-30 17:31 ` [PATCH v4 net-next 10/15] net: enetc: request mqprio to validate the queue counts Vladimir Oltean
@ 2023-01-30 17:31 ` Vladimir Oltean
  2023-02-01 15:15   ` Simon Horman
  2023-01-30 17:31 ` [PATCH v4 net-next 12/15] net/sched: taprio: pass mqprio queue configuration to ndo_setup_tc() Vladimir Oltean
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 37+ messages in thread
From: Vladimir Oltean @ 2023-01-30 17:31 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Claudiu Manoil, Vinicius Costa Gomes, Kurt Kanzenbach,
	Jacob Keller, Jamal Hadi Salim, Cong Wang, Jiri Pirko

Regardless of the requested queue count per traffic class, the enetc
driver allocates a number of TX rings equal to the number of TCs, and
hardcodes a queue configuration of "1@0 1@1 ... 1@max-tc". Other
configurations are silently ignored and treated the same.

Improve that by allowing what the user requests to be actually
fulfilled. This allows more than one TX ring per traffic class.
For example:

$ tc qdisc add dev eno0 root handle 1: mqprio num_tc 4 \
	map 0 0 1 1 2 2 3 3 queues 2@0 2@2 2@4 2@6
[  146.267648] fsl_enetc 0000:00:00.0 eno0: TX ring 0 prio 0
[  146.273451] fsl_enetc 0000:00:00.0 eno0: TX ring 1 prio 0
[  146.283280] fsl_enetc 0000:00:00.0 eno0: TX ring 2 prio 1
[  146.293987] fsl_enetc 0000:00:00.0 eno0: TX ring 3 prio 1
[  146.300467] fsl_enetc 0000:00:00.0 eno0: TX ring 4 prio 2
[  146.306866] fsl_enetc 0000:00:00.0 eno0: TX ring 5 prio 2
[  146.313261] fsl_enetc 0000:00:00.0 eno0: TX ring 6 prio 3
[  146.319622] fsl_enetc 0000:00:00.0 eno0: TX ring 7 prio 3
$ tc qdisc del dev eno0 root
[  178.238418] fsl_enetc 0000:00:00.0 eno0: TX ring 0 prio 0
[  178.244369] fsl_enetc 0000:00:00.0 eno0: TX ring 1 prio 0
[  178.251486] fsl_enetc 0000:00:00.0 eno0: TX ring 2 prio 0
[  178.258006] fsl_enetc 0000:00:00.0 eno0: TX ring 3 prio 0
[  178.265038] fsl_enetc 0000:00:00.0 eno0: TX ring 4 prio 0
[  178.271557] fsl_enetc 0000:00:00.0 eno0: TX ring 5 prio 0
[  178.277910] fsl_enetc 0000:00:00.0 eno0: TX ring 6 prio 0
[  178.284281] fsl_enetc 0000:00:00.0 eno0: TX ring 7 prio 0
$ tc qdisc add dev eno0 root handle 1: mqprio num_tc 8 \
	map 0 1 2 3 4 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 1
[  186.113162] fsl_enetc 0000:00:00.0 eno0: TX ring 0 prio 0
[  186.118764] fsl_enetc 0000:00:00.0 eno0: TX ring 1 prio 1
[  186.124374] fsl_enetc 0000:00:00.0 eno0: TX ring 2 prio 2
[  186.130765] fsl_enetc 0000:00:00.0 eno0: TX ring 3 prio 3
[  186.136404] fsl_enetc 0000:00:00.0 eno0: TX ring 4 prio 4
[  186.142049] fsl_enetc 0000:00:00.0 eno0: TX ring 5 prio 5
[  186.147674] fsl_enetc 0000:00:00.0 eno0: TX ring 6 prio 6
[  186.153305] fsl_enetc 0000:00:00.0 eno0: TX ring 7 prio 7

The driver used to set TC_MQPRIO_HW_OFFLOAD_TCS, near which there is
this comment in the UAPI header:

        TC_MQPRIO_HW_OFFLOAD_TCS,       /* offload TCs, no queue counts */

but I'm not sure who even looks at this field. Anyway, since this is
basically what enetc was doing up until now (and no longer is; we
offload queue counts too), remove that assignment.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v2->v4: none
v1->v2: move the mqprio teardown to enetc_reset_tc_mqprio(), and also
        call it on the error path

 drivers/net/ethernet/freescale/enetc/enetc.c | 102 +++++++++++++------
 1 file changed, 71 insertions(+), 31 deletions(-)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index e4718b50cf31..2d87deec6e77 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -2609,56 +2609,96 @@ static int enetc_reconfigure(struct enetc_ndev_priv *priv, bool extended,
 	return err;
 }
 
-int enetc_setup_tc_mqprio(struct net_device *ndev, void *type_data)
+static void enetc_debug_tx_ring_prios(struct enetc_ndev_priv *priv)
+{
+	int i;
+
+	for (i = 0; i < priv->num_tx_rings; i++)
+		netdev_dbg(priv->ndev, "TX ring %d prio %d\n", i,
+			   priv->tx_ring[i]->prio);
+}
+
+static void enetc_reset_tc_mqprio(struct net_device *ndev)
 {
 	struct enetc_ndev_priv *priv = netdev_priv(ndev);
-	struct tc_mqprio_qopt *mqprio = type_data;
 	struct enetc_hw *hw = &priv->si->hw;
 	struct enetc_bdr *tx_ring;
 	int num_stack_tx_queues;
-	u8 num_tc;
 	int i;
 
 	num_stack_tx_queues = enetc_num_stack_tx_queues(priv);
-	mqprio->hw = TC_MQPRIO_HW_OFFLOAD_TCS;
-	num_tc = mqprio->num_tc;
 
-	if (!num_tc) {
-		netdev_reset_tc(ndev);
-		netif_set_real_num_tx_queues(ndev, num_stack_tx_queues);
-		priv->min_num_stack_tx_queues = num_possible_cpus();
-
-		/* Reset all ring priorities to 0 */
-		for (i = 0; i < priv->num_tx_rings; i++) {
-			tx_ring = priv->tx_ring[i];
-			tx_ring->prio = 0;
-			enetc_set_bdr_prio(hw, tx_ring->index, tx_ring->prio);
-		}
+	netdev_reset_tc(ndev);
+	netif_set_real_num_tx_queues(ndev, num_stack_tx_queues);
+	priv->min_num_stack_tx_queues = num_possible_cpus();
+
+	/* Reset all ring priorities to 0 */
+	for (i = 0; i < priv->num_tx_rings; i++) {
+		tx_ring = priv->tx_ring[i];
+		tx_ring->prio = 0;
+		enetc_set_bdr_prio(hw, tx_ring->index, tx_ring->prio);
+	}
+
+	enetc_debug_tx_ring_prios(priv);
+}
+
+int enetc_setup_tc_mqprio(struct net_device *ndev, void *type_data)
+{
+	struct enetc_ndev_priv *priv = netdev_priv(ndev);
+	struct tc_mqprio_qopt *mqprio = type_data;
+	struct enetc_hw *hw = &priv->si->hw;
+	int num_stack_tx_queues = 0;
+	u8 num_tc = mqprio->num_tc;
+	struct enetc_bdr *tx_ring;
+	int offset, count;
+	int err, tc, q;
 
+	if (!num_tc) {
+		enetc_reset_tc_mqprio(ndev);
 		return 0;
 	}
 
-	/* For the moment, we use only one BD ring per TC.
-	 *
-	 * Configure num_tc BD rings with increasing priorities.
-	 */
-	for (i = 0; i < num_tc; i++) {
-		tx_ring = priv->tx_ring[i];
-		tx_ring->prio = i;
-		enetc_set_bdr_prio(hw, tx_ring->index, tx_ring->prio);
+	err = netdev_set_num_tc(ndev, num_tc);
+	if (err)
+		return err;
+
+	for (tc = 0; tc < num_tc; tc++) {
+		offset = mqprio->offset[tc];
+		count = mqprio->count[tc];
+
+		err = netdev_set_tc_queue(ndev, tc, count, offset);
+		if (err)
+			goto err_reset_tc;
+
+		for (q = offset; q < offset + count; q++) {
+			tx_ring = priv->tx_ring[q];
+			/* The prio_tc_map is skb_tx_hash()'s way of selecting
+			 * between TX queues based on skb->priority. As such,
+			 * there's nothing to offload based on it.
+			 * Make the mqprio "traffic class" be the priority of
+			 * this ring group, and leave the Tx IPV to traffic
+			 * class mapping as its default mapping value of 1:1.
+			 */
+			tx_ring->prio = tc;
+			enetc_set_bdr_prio(hw, tx_ring->index, tx_ring->prio);
+
+			num_stack_tx_queues++;
+		}
 	}
 
-	/* Reset the number of netdev queues based on the TC count */
-	netif_set_real_num_tx_queues(ndev, num_tc);
-	priv->min_num_stack_tx_queues = num_tc;
+	err = netif_set_real_num_tx_queues(ndev, num_stack_tx_queues);
+	if (err)
+		goto err_reset_tc;
 
-	netdev_set_num_tc(ndev, num_tc);
+	priv->min_num_stack_tx_queues = num_stack_tx_queues;
 
-	/* Each TC is associated with one netdev queue */
-	for (i = 0; i < num_tc; i++)
-		netdev_set_tc_queue(ndev, i, 1, i);
+	enetc_debug_tx_ring_prios(priv);
 
 	return 0;
+
+err_reset_tc:
+	enetc_reset_tc_mqprio(ndev);
+	return err;
 }
 EXPORT_SYMBOL_GPL(enetc_setup_tc_mqprio);
 
-- 
2.34.1



* [PATCH v4 net-next 12/15] net/sched: taprio: pass mqprio queue configuration to ndo_setup_tc()
  2023-01-30 17:31 [PATCH v4 net-next 00/15] ENETC mqprio/taprio cleanup Vladimir Oltean
                   ` (10 preceding siblings ...)
  2023-01-30 17:31 ` [PATCH v4 net-next 11/15] net: enetc: act upon the requested mqprio queue configuration Vladimir Oltean
@ 2023-01-30 17:31 ` Vladimir Oltean
  2023-02-01 15:16   ` Simon Horman
  2023-01-30 17:31 ` [PATCH v4 net-next 13/15] net: enetc: act upon mqprio queue config in taprio offload Vladimir Oltean
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 37+ messages in thread
From: Vladimir Oltean @ 2023-01-30 17:31 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Claudiu Manoil, Vinicius Costa Gomes, Kurt Kanzenbach,
	Jacob Keller, Jamal Hadi Salim, Cong Wang, Jiri Pirko

The taprio offload does not currently pass the mqprio queue configuration
down to the offloading device driver. So the driver cannot act upon the
TXQ counts/offsets per TC, or upon the prio->tc map. It was probably
assumed that the driver only wants to offload num_tc (see
TC_MQPRIO_HW_OFFLOAD_TCS), which it can get from netdev_get_num_tc(),
but there's clearly more to the mqprio configuration than that.

To remedy that, we need to actually reconstruct a struct
tc_mqprio_qopt_offload to pass as part of the tc_taprio_qopt_offload.
The problem is that taprio doesn't keep a persistent reference to the
mqprio queue structure in its own struct taprio_sched, instead it just
applies the contents of that to the netdev state (prio:tc map, per-TC
TXQ counts and offsets, num_tc etc). Maybe it's easier to understand
why, when we look at the size of struct tc_mqprio_qopt_offload: 352
bytes on arm64. Keeping such a large structure would throw off the
memory accesses in struct taprio_sched no matter where we put it.
So we prefer to dynamically reconstruct the mqprio offload structure
based on netdev information, rather than saving a copy of it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v2->v4: none
v1->v2: reconstruct the mqprio queue configuration structure

 include/net/pkt_sched.h |  1 +
 net/sched/sch_taprio.c  | 20 ++++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index 02e3ccfbc7d1..ace8be520fb0 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -187,6 +187,7 @@ struct tc_taprio_sched_entry {
 };
 
 struct tc_taprio_qopt_offload {
+	struct tc_mqprio_qopt_offload mqprio;
 	u8 enable;
 	ktime_t base_time;
 	u64 cycle_time;
diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index c322a61eaeea..f40016275384 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -1225,6 +1225,25 @@ static void taprio_sched_to_offload(struct net_device *dev,
 	offload->num_entries = i;
 }
 
+static void
+taprio_mqprio_qopt_reconstruct(struct net_device *dev,
+			       struct tc_mqprio_qopt_offload *mqprio)
+{
+	struct tc_mqprio_qopt *qopt = &mqprio->qopt;
+	int num_tc = netdev_get_num_tc(dev);
+	int tc, prio;
+
+	qopt->num_tc = num_tc;
+
+	for (prio = 0; prio <= TC_BITMASK; prio++)
+		qopt->prio_tc_map[prio] = netdev_get_prio_tc_map(dev, prio);
+
+	for (tc = 0; tc < num_tc; tc++) {
+		qopt->count[tc] = dev->tc_to_txq[tc].count;
+		qopt->offset[tc] = dev->tc_to_txq[tc].offset;
+	}
+}
+
 static int taprio_enable_offload(struct net_device *dev,
 				 struct taprio_sched *q,
 				 struct sched_gate_list *sched,
@@ -1261,6 +1280,7 @@ static int taprio_enable_offload(struct net_device *dev,
 		return -ENOMEM;
 	}
 	offload->enable = 1;
+	taprio_mqprio_qopt_reconstruct(dev, &offload->mqprio);
 	taprio_sched_to_offload(dev, sched, offload);
 
 	for (tc = 0; tc < TC_MAX_QUEUE; tc++)
-- 
2.34.1



* [PATCH v4 net-next 13/15] net: enetc: act upon mqprio queue config in taprio offload
  2023-01-30 17:31 [PATCH v4 net-next 00/15] ENETC mqprio/taprio cleanup Vladimir Oltean
                   ` (11 preceding siblings ...)
  2023-01-30 17:31 ` [PATCH v4 net-next 12/15] net/sched: taprio: pass mqprio queue configuration to ndo_setup_tc() Vladimir Oltean
@ 2023-01-30 17:31 ` Vladimir Oltean
  2023-02-01 15:16   ` Simon Horman
  2023-01-30 17:31 ` [PATCH v4 net-next 14/15] net/sched: taprio: mask off bits in gate mask that exceed number of TCs Vladimir Oltean
  2023-01-30 17:31 ` [PATCH v4 net-next 15/15] net/sched: taprio: only calculate gate mask per TXQ for igc, stmmac and tsnep Vladimir Oltean
  14 siblings, 1 reply; 37+ messages in thread
From: Vladimir Oltean @ 2023-01-30 17:31 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Claudiu Manoil, Vinicius Costa Gomes, Kurt Kanzenbach,
	Jacob Keller, Jamal Hadi Salim, Cong Wang, Jiri Pirko

We assume that the mqprio queue configuration from taprio has a simple
1:1 mapping between prio and traffic class, and one TX queue per TC.
That might not be the case. Actually parse and act upon the mqprio
config.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
---
v1->v4: none

 .../net/ethernet/freescale/enetc/enetc_qos.c  | 20 ++++++-------------
 1 file changed, 6 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc_qos.c b/drivers/net/ethernet/freescale/enetc/enetc_qos.c
index 6e0b4dd91509..130ebf6853e6 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_qos.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_qos.c
@@ -136,29 +136,21 @@ int enetc_setup_tc_taprio(struct net_device *ndev, void *type_data)
 {
 	struct tc_taprio_qopt_offload *taprio = type_data;
 	struct enetc_ndev_priv *priv = netdev_priv(ndev);
-	struct enetc_hw *hw = &priv->si->hw;
-	struct enetc_bdr *tx_ring;
-	int err;
-	int i;
+	int err, i;
 
 	/* TSD and Qbv are mutually exclusive in hardware */
 	for (i = 0; i < priv->num_tx_rings; i++)
 		if (priv->tx_ring[i]->tsd_enable)
 			return -EBUSY;
 
-	for (i = 0; i < priv->num_tx_rings; i++) {
-		tx_ring = priv->tx_ring[i];
-		tx_ring->prio = taprio->enable ? i : 0;
-		enetc_set_bdr_prio(hw, tx_ring->index, tx_ring->prio);
-	}
+	err = enetc_setup_tc_mqprio(ndev, &taprio->mqprio);
+	if (err)
+		return err;
 
 	err = enetc_setup_taprio(ndev, taprio);
 	if (err) {
-		for (i = 0; i < priv->num_tx_rings; i++) {
-			tx_ring = priv->tx_ring[i];
-			tx_ring->prio = taprio->enable ? 0 : i;
-			enetc_set_bdr_prio(hw, tx_ring->index, tx_ring->prio);
-		}
+		taprio->mqprio.qopt.num_tc = 0;
+		enetc_setup_tc_mqprio(ndev, &taprio->mqprio);
 	}
 
 	return err;
-- 
2.34.1



* [PATCH v4 net-next 14/15] net/sched: taprio: mask off bits in gate mask that exceed number of TCs
  2023-01-30 17:31 [PATCH v4 net-next 00/15] ENETC mqprio/taprio cleanup Vladimir Oltean
                   ` (12 preceding siblings ...)
  2023-01-30 17:31 ` [PATCH v4 net-next 13/15] net: enetc: act upon mqprio queue config in taprio offload Vladimir Oltean
@ 2023-01-30 17:31 ` Vladimir Oltean
  2023-02-01 15:17   ` Simon Horman
  2023-01-30 17:31 ` [PATCH v4 net-next 15/15] net/sched: taprio: only calculate gate mask per TXQ for igc, stmmac and tsnep Vladimir Oltean
  14 siblings, 1 reply; 37+ messages in thread
From: Vladimir Oltean @ 2023-01-30 17:31 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Claudiu Manoil, Vinicius Costa Gomes, Kurt Kanzenbach,
	Jacob Keller, Jamal Hadi Salim, Cong Wang, Jiri Pirko

"man tc-taprio" says:

| each gate state allows outgoing traffic for a subset (potentially
| empty) of traffic classes.

So it makes sense to not allow gate actions to have bits set for traffic
classes that exceed the number of TCs of the device (according to the
mqprio configuration).

Validating precisely that would risk introducing breakage in commands
that worked (because taprio ignores the upper bits). OTOH, the user may
not immediately realize that taprio ignores the upper bits (may confuse
the gate mask to be per TXQ rather than per TC). So at least warn to
dmesg, mask off the excess bits and continue.
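
As a hypothetical worked example: with "mqprio num_tc 4", the valid gate
bits are GENMASK(3, 0) = 0xf, so a schedule entry requesting gate mask
0x3c would trigger the warning and be truncated to 0x3c & 0xf = 0xc
(TCs 2 and 3 stay open, and the bits for non-existent TCs 4 and 5 are
dropped).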

For this patch to work, we need to move the assignment of the mqprio
queue configuration to the netdev above the parse_taprio_schedule()
call, because we make use of netdev_get_num_tc().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
---
v3->v4: none
v2->v3: warn and mask off instead of failing
v1->v2: none

 net/sched/sch_taprio.c | 46 +++++++++++++++++++++++++++---------------
 1 file changed, 30 insertions(+), 16 deletions(-)

diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index f40016275384..a9873056ea97 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -789,15 +789,29 @@ static int fill_sched_entry(struct taprio_sched *q, struct nlattr **tb,
 			    struct netlink_ext_ack *extack)
 {
 	int min_duration = length_to_duration(q, ETH_ZLEN);
+	struct net_device *dev = qdisc_dev(q->root);
+	int num_tc = netdev_get_num_tc(dev);
+	u32 max_gate_mask = 0;
 	u32 interval = 0;
 
+	if (num_tc)
+		max_gate_mask = GENMASK(num_tc - 1, 0);
+
 	if (tb[TCA_TAPRIO_SCHED_ENTRY_CMD])
 		entry->command = nla_get_u8(
 			tb[TCA_TAPRIO_SCHED_ENTRY_CMD]);
 
-	if (tb[TCA_TAPRIO_SCHED_ENTRY_GATE_MASK])
+	if (tb[TCA_TAPRIO_SCHED_ENTRY_GATE_MASK]) {
 		entry->gate_mask = nla_get_u32(
 			tb[TCA_TAPRIO_SCHED_ENTRY_GATE_MASK]);
+		if (entry->gate_mask & ~max_gate_mask) {
+			netdev_warn(dev,
+				    "Gate mask 0x%x contains bits for non-existent TCs (device has %d), truncating to 0x%x",
+				    entry->gate_mask, num_tc,
+				    entry->gate_mask & max_gate_mask);
+			entry->gate_mask &= max_gate_mask;
+		}
+	}
 
 	if (tb[TCA_TAPRIO_SCHED_ENTRY_INTERVAL])
 		interval = nla_get_u32(
@@ -1605,6 +1619,21 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt,
 		goto free_sched;
 	}
 
+	if (mqprio) {
+		err = netdev_set_num_tc(dev, mqprio->num_tc);
+		if (err)
+			goto free_sched;
+		for (i = 0; i < mqprio->num_tc; i++)
+			netdev_set_tc_queue(dev, i,
+					    mqprio->count[i],
+					    mqprio->offset[i]);
+
+		/* Always use supplied priority mappings */
+		for (i = 0; i <= TC_BITMASK; i++)
+			netdev_set_prio_tc_map(dev, i,
+					       mqprio->prio_tc_map[i]);
+	}
+
 	err = parse_taprio_schedule(q, tb, new_admin, extack);
 	if (err < 0)
 		goto free_sched;
@@ -1621,21 +1650,6 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt,
 
 	taprio_set_picos_per_byte(dev, q);
 
-	if (mqprio) {
-		err = netdev_set_num_tc(dev, mqprio->num_tc);
-		if (err)
-			goto free_sched;
-		for (i = 0; i < mqprio->num_tc; i++)
-			netdev_set_tc_queue(dev, i,
-					    mqprio->count[i],
-					    mqprio->offset[i]);
-
-		/* Always use supplied priority mappings */
-		for (i = 0; i <= TC_BITMASK; i++)
-			netdev_set_prio_tc_map(dev, i,
-					       mqprio->prio_tc_map[i]);
-	}
-
 	if (FULL_OFFLOAD_IS_ENABLED(q->flags))
 		err = taprio_enable_offload(dev, q, new_admin, extack);
 	else
-- 
2.34.1



* [PATCH v4 net-next 15/15] net/sched: taprio: only calculate gate mask per TXQ for igc, stmmac and tsnep
  2023-01-30 17:31 [PATCH v4 net-next 00/15] ENETC mqprio/taprio cleanup Vladimir Oltean
                   ` (13 preceding siblings ...)
  2023-01-30 17:31 ` [PATCH v4 net-next 14/15] net/sched: taprio: mask off bits in gate mask that exceed number of TCs Vladimir Oltean
@ 2023-01-30 17:31 ` Vladimir Oltean
  2023-02-01 15:17   ` Simon Horman
  14 siblings, 1 reply; 37+ messages in thread
From: Vladimir Oltean @ 2023-01-30 17:31 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Claudiu Manoil, Vinicius Costa Gomes, Kurt Kanzenbach,
	Jacob Keller, Jamal Hadi Salim, Cong Wang, Jiri Pirko,
	Horatiu Vultur, Siddharth Vadapalli, Roger Quadros,
	Gerhard Engleder

There are 2 classes of in-tree drivers currently:

- those who act upon struct tc_taprio_sched_entry :: gate_mask as if it
  holds a bit mask of TXQs

- those who act upon the gate_mask as if it holds a bit mask of TCs

When it comes to the standard, IEEE 802.1Q-2018 does say this in the
second paragraph of section 8.6.8.4 Enhancements for scheduled traffic:

| A gate control list associated with each Port contains an ordered list
| of gate operations. Each gate operation changes the transmission gate
| state for the gate associated with each of the Port's traffic class
| queues and allows associated control operations to be scheduled.

In typically obtuse language, it refers to a "traffic class queue"
rather than a "traffic class" or a "queue". But careful reading of
802.1Q clarifies that "traffic class" and "queue" are in fact
synonymous (see 8.6.6 Queuing frames):

| A queue in this context is not necessarily a single FIFO data structure.
| A queue is a record of all frames of a given traffic class awaiting
| transmission on a given Bridge Port. The structure of this record is not
| specified.

i.o.w. their definition of "queue" isn't the Linux TX queue.

The gate_mask really is input into taprio via its UAPI as a mask of
traffic classes, but taprio_sched_to_offload() converts it into a TXQ
mask.
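
As a rough illustration (not the exact taprio helper, just the idea
behind that conversion), expanding a per-TC gate mask into a per-TXQ
mask based on the netdev's tc_to_txq table looks roughly like this:

static u32 example_tc_mask_to_txq_mask(struct net_device *dev, u32 tc_mask)
{
	u32 txq_mask = 0;
	int tc;

	for (tc = 0; tc < netdev_get_num_tc(dev); tc++) {
		int q;

		if (!(tc_mask & BIT(tc)))
			continue;

		/* every TXQ assigned to this TC inherits the gate bit */
		for (q = 0; q < dev->tc_to_txq[tc].count; q++)
			txq_mask |= BIT(dev->tc_to_txq[tc].offset + q);
	}

	return txq_mask;
}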

The breakdown of drivers which handle TC_SETUP_QDISC_TAPRIO is:

- hellcreek, felix, sja1105: these are DSA switches, it's not even very
  clear what TXQs correspond to, other than purely software constructs.
  Only the mqprio configuration with 8 TCs and 1 TXQ per TC makes sense.
  So it's fine to convert these to a gate mask per TC.

- enetc: I have the hardware and can confirm that the gate mask is per
  TC, and affects all TXQs (BD rings) configured for that priority.

- igc: in igc_save_qbv_schedule(), the gate_mask is clearly interpreted
  to be per-TXQ.

- tsnep: Gerhard Engleder clarifies that even though this hardware
  supports at most 1 TXQ per TC, the TXQ indices may be different from
  the TC values themselves, and it is the TXQ indices that matter to
  this hardware. So keep it per-TXQ as well.

- stmmac: I have a GMAC datasheet, and in the EST section it does
  specify that the gate events are per TXQ rather than per TC.

- lan966x: again, this is a switch, and while not a DSA one, the way in
  which it implements lan966x_mqprio_add() - by only allowing num_tc ==
  NUM_PRIO_QUEUES (8) - makes it clear to me that TXQs are a purely
  software construct here as well. They seem to map 1:1 with TCs.

- am65_cpsw: from looking at am65_cpsw_est_set_sched_cmds(), I get the
  impression that the fetch_allow variable is treated like a prio_mask.
  I haven't studied this driver's interpretation of the prio_tc_map, but
  that definitely sounds closer to a per-TC gate mask rather than a
  per-TXQ one.

Based on this breakdown, we have 6 drivers with a gate mask per TC and
3 with a gate mask per TXQ. So let's make the gate mask per TXQ the
opt-in and the gate mask per TC the default.

Benefit from the TC_QUERY_CAPS feature that Jakub suggested we add, and
query the device driver before calling the proper ndo_setup_tc(), and
figure out if it expects one or the other format.

Cc: Horatiu Vultur <horatiu.vultur@microchip.com>
Cc: Siddharth Vadapalli <s-vadapalli@ti.com>
Cc: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com>
---
v3->v4: none
v2->v3: adjust commit message in light of what Kurt has said
v1->v2:
- rewrite commit message
- also opt in stmmac and tsnep

 drivers/net/ethernet/engleder/tsnep_tc.c      | 21 +++++++++++++++++
 drivers/net/ethernet/intel/igc/igc_main.c     | 23 +++++++++++++++++++
 drivers/net/ethernet/stmicro/stmmac/hwif.h    |  5 ++++
 .../net/ethernet/stmicro/stmmac/stmmac_main.c |  2 ++
 .../net/ethernet/stmicro/stmmac/stmmac_tc.c   | 20 ++++++++++++++++
 include/net/pkt_sched.h                       |  1 +
 net/sched/sch_taprio.c                        | 11 ++++++---
 7 files changed, 80 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/engleder/tsnep_tc.c b/drivers/net/ethernet/engleder/tsnep_tc.c
index c4c6e1357317..d083e6684f12 100644
--- a/drivers/net/ethernet/engleder/tsnep_tc.c
+++ b/drivers/net/ethernet/engleder/tsnep_tc.c
@@ -403,12 +403,33 @@ static int tsnep_taprio(struct tsnep_adapter *adapter,
 	return 0;
 }
 
+static int tsnep_tc_query_caps(struct tsnep_adapter *adapter,
+			       struct tc_query_caps_base *base)
+{
+	switch (base->type) {
+	case TC_SETUP_QDISC_TAPRIO: {
+		struct tc_taprio_caps *caps = base->caps;
+
+		if (!adapter->gate_control)
+			return -EOPNOTSUPP;
+
+		caps->gate_mask_per_txq = true;
+
+		return 0;
+	}
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
 int tsnep_tc_setup(struct net_device *netdev, enum tc_setup_type type,
 		   void *type_data)
 {
 	struct tsnep_adapter *adapter = netdev_priv(netdev);
 
 	switch (type) {
+	case TC_QUERY_CAPS:
+		return tsnep_tc_query_caps(adapter, type_data);
 	case TC_SETUP_QDISC_TAPRIO:
 		return tsnep_taprio(adapter, type_data);
 	default:
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index e86b15efaeb8..cce1dea51f76 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -6205,12 +6205,35 @@ static int igc_tsn_enable_cbs(struct igc_adapter *adapter,
 	return igc_tsn_offload_apply(adapter);
 }
 
+static int igc_tc_query_caps(struct igc_adapter *adapter,
+			     struct tc_query_caps_base *base)
+{
+	struct igc_hw *hw = &adapter->hw;
+
+	switch (base->type) {
+	case TC_SETUP_QDISC_TAPRIO: {
+		struct tc_taprio_caps *caps = base->caps;
+
+		if (hw->mac.type != igc_i225)
+			return -EOPNOTSUPP;
+
+		caps->gate_mask_per_txq = true;
+
+		return 0;
+	}
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
 static int igc_setup_tc(struct net_device *dev, enum tc_setup_type type,
 			void *type_data)
 {
 	struct igc_adapter *adapter = netdev_priv(dev);
 
 	switch (type) {
+	case TC_QUERY_CAPS:
+		return igc_tc_query_caps(adapter, type_data);
 	case TC_SETUP_QDISC_TAPRIO:
 		return igc_tsn_enable_qbv_scheduling(adapter, type_data);
 
diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.h b/drivers/net/ethernet/stmicro/stmmac/hwif.h
index 592b4067f9b8..16a7421715cb 100644
--- a/drivers/net/ethernet/stmicro/stmmac/hwif.h
+++ b/drivers/net/ethernet/stmicro/stmmac/hwif.h
@@ -567,6 +567,7 @@ struct tc_cbs_qopt_offload;
 struct flow_cls_offload;
 struct tc_taprio_qopt_offload;
 struct tc_etf_qopt_offload;
+struct tc_query_caps_base;
 
 struct stmmac_tc_ops {
 	int (*init)(struct stmmac_priv *priv);
@@ -580,6 +581,8 @@ struct stmmac_tc_ops {
 			    struct tc_taprio_qopt_offload *qopt);
 	int (*setup_etf)(struct stmmac_priv *priv,
 			 struct tc_etf_qopt_offload *qopt);
+	int (*query_caps)(struct stmmac_priv *priv,
+			  struct tc_query_caps_base *base);
 };
 
 #define stmmac_tc_init(__priv, __args...) \
@@ -594,6 +597,8 @@ struct stmmac_tc_ops {
 	stmmac_do_callback(__priv, tc, setup_taprio, __args)
 #define stmmac_tc_setup_etf(__priv, __args...) \
 	stmmac_do_callback(__priv, tc, setup_etf, __args)
+#define stmmac_tc_query_caps(__priv, __args...) \
+	stmmac_do_callback(__priv, tc, query_caps, __args)
 
 struct stmmac_counters;
 
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index b7e5af58ab75..17a7ea1cb961 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -5991,6 +5991,8 @@ static int stmmac_setup_tc(struct net_device *ndev, enum tc_setup_type type,
 	struct stmmac_priv *priv = netdev_priv(ndev);
 
 	switch (type) {
+	case TC_QUERY_CAPS:
+		return stmmac_tc_query_caps(priv, priv, type_data);
 	case TC_SETUP_BLOCK:
 		return flow_block_cb_setup_simple(type_data,
 						  &stmmac_block_cb_list,
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
index 2cfb18cef1d4..9d55226479b4 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
@@ -1107,6 +1107,25 @@ static int tc_setup_etf(struct stmmac_priv *priv,
 	return 0;
 }
 
+static int tc_query_caps(struct stmmac_priv *priv,
+			 struct tc_query_caps_base *base)
+{
+	switch (base->type) {
+	case TC_SETUP_QDISC_TAPRIO: {
+		struct tc_taprio_caps *caps = base->caps;
+
+		if (!priv->dma_cap.estsel)
+			return -EOPNOTSUPP;
+
+		caps->gate_mask_per_txq = true;
+
+		return 0;
+	}
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
 const struct stmmac_tc_ops dwmac510_tc_ops = {
 	.init = tc_init,
 	.setup_cls_u32 = tc_setup_cls_u32,
@@ -1114,4 +1133,5 @@ const struct stmmac_tc_ops dwmac510_tc_ops = {
 	.setup_cls = tc_setup_cls,
 	.setup_taprio = tc_setup_taprio,
 	.setup_etf = tc_setup_etf,
+	.query_caps = tc_query_caps,
 };
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index ace8be520fb0..fd889fc4912b 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -176,6 +176,7 @@ struct tc_mqprio_qopt_offload {
 
 struct tc_taprio_caps {
 	bool supports_queue_max_sdu:1;
+	bool gate_mask_per_txq:1;
 };
 
 struct tc_taprio_sched_entry {
diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index a9873056ea97..72271bf8cd8b 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -1217,7 +1217,8 @@ static u32 tc_map_to_queue_mask(struct net_device *dev, u32 tc_mask)
 
 static void taprio_sched_to_offload(struct net_device *dev,
 				    struct sched_gate_list *sched,
-				    struct tc_taprio_qopt_offload *offload)
+				    struct tc_taprio_qopt_offload *offload,
+				    const struct tc_taprio_caps *caps)
 {
 	struct sched_entry *entry;
 	int i = 0;
@@ -1231,7 +1232,11 @@ static void taprio_sched_to_offload(struct net_device *dev,
 
 		e->command = entry->command;
 		e->interval = entry->interval;
-		e->gate_mask = tc_map_to_queue_mask(dev, entry->gate_mask);
+		if (caps->gate_mask_per_txq)
+			e->gate_mask = tc_map_to_queue_mask(dev,
+							    entry->gate_mask);
+		else
+			e->gate_mask = entry->gate_mask;
 
 		i++;
 	}
@@ -1295,7 +1300,7 @@ static int taprio_enable_offload(struct net_device *dev,
 	}
 	offload->enable = 1;
 	taprio_mqprio_qopt_reconstruct(dev, &offload->mqprio);
-	taprio_sched_to_offload(dev, sched, offload);
+	taprio_sched_to_offload(dev, sched, offload, &caps);
 
 	for (tc = 0; tc < TC_MAX_QUEUE; tc++)
 		offload->max_sdu[tc] = q->max_sdu[tc];
-- 
2.34.1



* RE: [PATCH v4 net-next 08/15] net/sched: mqprio: allow offloading drivers to request queue count validation
  2023-01-30 17:31 ` [PATCH v4 net-next 08/15] net/sched: mqprio: allow offloading drivers to request queue count validation Vladimir Oltean
@ 2023-01-30 18:37   ` Claudiu Manoil
  2023-01-30 19:06     ` Vladimir Oltean
  2023-02-01 14:08   ` Simon Horman
  1 sibling, 1 reply; 37+ messages in thread
From: Claudiu Manoil @ 2023-01-30 18:37 UTC (permalink / raw)
  To: Vladimir Oltean, netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Vinicius Costa Gomes, Kurt Kanzenbach, Jacob Keller,
	Jamal Hadi Salim, Cong Wang, Jiri Pirko

> -----Original Message-----
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
> Sent: Monday, January 30, 2023 7:32 PM
[...]
> Subject: [PATCH v4 net-next 08/15] net/sched: mqprio: allow offloading drivers
> to request queue count validation
>

[...]

> +static int mqprio_validate_queue_counts(struct net_device *dev,
> +					const struct tc_mqprio_qopt *qopt)
> +{
> +	int i, j;
> +
> +	for (i = 0; i < qopt->num_tc; i++) {
> +		unsigned int last = qopt->offset[i] + qopt->count[i];
> +
> +		/* Verify the queue count is in tx range being equal to the
> +		 * real_num_tx_queues indicates the last queue is in use.
> +		 */
> +		if (qopt->offset[i] >= dev->real_num_tx_queues ||
> +		    !qopt->count[i] ||
> +		    last > dev->real_num_tx_queues)
> +			return -EINVAL;
> +
> +		/* Verify that the offset and counts do not overlap */
> +		for (j = i + 1; j < qopt->num_tc; j++) {
> +			if (last > qopt->offset[j])
> +				return -EINVAL;
> +		}
> +	}
> +
> +	return 0;
> +}
> +

Not related to this series, but the above O(n^2) code snippet....
If last[i] := offset[i] + count[i] and last[i] <= offset[i+1],
then offset[i] + count[i] <= offset[i+1] for every i := 0, num_tc - 1.

In other words, it's enough to check that last[i] <= offset[i+1] to make
sure there's no interval overlap, and it's O(n).
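
For illustration, a minimal sketch of that O(n) variant (hypothetical,
not from the patch; it implicitly assumes the per-TC offsets are laid
out in ascending order):

static int example_validate_sorted_queue_counts(struct net_device *dev,
						const struct tc_mqprio_qopt *qopt)
{
	int i;

	for (i = 0; i < qopt->num_tc; i++) {
		unsigned int last = qopt->offset[i] + qopt->count[i];

		if (qopt->offset[i] >= dev->real_num_tx_queues ||
		    !qopt->count[i] || last > dev->real_num_tx_queues)
			return -EINVAL;

		/* With sorted offsets, comparing against the next TC only
		 * is enough to rule out any overlap.
		 */
		if (i + 1 < qopt->num_tc && last > qopt->offset[i + 1])
			return -EINVAL;
	}

	return 0;
}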


* Re: [PATCH v4 net-next 08/15] net/sched: mqprio: allow offloading drivers to request queue count validation
  2023-01-30 18:37   ` Claudiu Manoil
@ 2023-01-30 19:06     ` Vladimir Oltean
  0 siblings, 0 replies; 37+ messages in thread
From: Vladimir Oltean @ 2023-01-30 19:06 UTC (permalink / raw)
  To: Claudiu Manoil
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Vinicius Costa Gomes, Kurt Kanzenbach, Jacob Keller,
	Jamal Hadi Salim, Cong Wang, Jiri Pirko

Hi Claudiu,

On Mon, Jan 30, 2023 at 08:37:02PM +0200, Claudiu Manoil wrote:
> > -----Original Message-----
> > From: Vladimir Oltean <vladimir.oltean@nxp.com>
> > Sent: Monday, January 30, 2023 7:32 PM
> [...]
> > Subject: [PATCH v4 net-next 08/15] net/sched: mqprio: allow offloading drivers
> > to request queue count validation
> >
> 
> [...]
> 
> > +static int mqprio_validate_queue_counts(struct net_device *dev,
> > +					const struct tc_mqprio_qopt *qopt)
> > +{
> > +	int i, j;
> > +
> > +	for (i = 0; i < qopt->num_tc; i++) {
> > +		unsigned int last = qopt->offset[i] + qopt->count[i];
> > +
> > +		/* Verify the queue count is in tx range being equal to the
> > +		 * real_num_tx_queues indicates the last queue is in use.
> > +		 */
> > +		if (qopt->offset[i] >= dev->real_num_tx_queues ||
> > +		    !qopt->count[i] ||
> > +		    last > dev->real_num_tx_queues)
> > +			return -EINVAL;
> > +
> > +		/* Verify that the offset and counts do not overlap */
> > +		for (j = i + 1; j < qopt->num_tc; j++) {
> > +			if (last > qopt->offset[j])
> > +				return -EINVAL;
> > +		}
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> 
> Not related to this series, but the above O(n^2) code snippet....
> If last[i] := offset[i] + count[i] and last[i] <= offset[i+1],
> then offset[i] + count[i] <= offset[i+1] for every i := 0, num_tc - 1.
> 
> In other words, it's enough to check that last[i] <= offset[i+1] to make
> sure there's no interval overlap, and it's O(n).

Hmm, actually you bring up a good point, which I hadn't noticed.

It looks to me like someone had an idea but never followed through on implementing it.
The complexity is O(n^2) because it's actually only the overlaps that
the code is supposed to check for. It's not necessary for TXQs to be in
ascending order ("last[i] <= offset[i+1]" isn't a given).

I'm pretty sure that TXQs can also be mapped in reverse compared to the TC,
like this:

tc qdisc add dev eno0 root handle 1: mqprio num_tc 8 map 0 1 2 3 4 5 6 7 queues 1@7 1@6 1@5 1@4 1@3 1@2 1@1 1@0 hw 1

Which *should* be allowed (at least in hardware, it is), and which would
indeed justify the higher complexity validation function.

But with "hw 0", the existing code indeed doesn't allow that.

We would need this change first, targeting "net":

diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
index 4c68abaa289b..4f6fb05a4adc 100644
--- a/net/sched/sch_mqprio.c
+++ b/net/sched/sch_mqprio.c
@@ -101,7 +101,8 @@ static int mqprio_parse_opt(struct net_device *dev, struct tc_mqprio_qopt *qopt)
 
 		/* Verify that the offset and counts do not overlap */
 		for (j = i + 1; j < qopt->num_tc; j++) {
-			if (last > qopt->offset[j])
+			if (last > qopt->offset[j] &&
+			    last <= qopt->offset[j] + qopt->count[j])
 				return -EINVAL;
 		}
 	}

then see you in a week from now, with net-next merged with that patch.
Oh well.. :)
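
For reference, a minimal sketch of the order-agnostic pairwise test that
such a validation needs to perform (illustrative only, not part of the
proposed patch): two [offset, offset + count) ranges overlap iff each
one starts before the other one ends.

static bool example_queue_ranges_overlap(const struct tc_mqprio_qopt *qopt,
					 int i, int j)
{
	unsigned int last_i = qopt->offset[i] + qopt->count[i];
	unsigned int last_j = qopt->offset[j] + qopt->count[j];

	return qopt->offset[i] < last_j && qopt->offset[j] < last_i;
}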


* Re: [PATCH v4 net-next 04/15] net: enetc: ensure we always have a minimum number of TXQs for stack
  2023-01-30 17:31 ` [PATCH v4 net-next 04/15] net: enetc: ensure we always have a minimum number of TXQs for stack Vladimir Oltean
@ 2023-02-01 13:43   ` Simon Horman
  2023-02-01 18:46     ` Vladimir Oltean
  0 siblings, 1 reply; 37+ messages in thread
From: Simon Horman @ 2023-02-01 13:43 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Claudiu Manoil, Vinicius Costa Gomes,
	Kurt Kanzenbach, Jacob Keller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko

On Mon, Jan 30, 2023 at 07:31:34PM +0200, Vladimir Oltean wrote:
> Currently it can happen that an mqprio qdisc is installed with num_tc 8,
> and this will reserve 8 (out of 8) TXQs for the network stack. Then we
> can attach an XDP program, and this will crop 2 TXQs, leaving just 6 for
> mqprio. That's not what the user requested, and we should fail it.
> 
> On the other hand, if mqprio isn't requested, we still give the 8 TXQs
> to the network stack (with hashing among a single traffic class), but
> then, cropping 2 TXQs for XDP is fine, because the user didn't
> explicitly ask for any number of TXQs, so no expectations are violated.
> 
> Simply put, the logic that mqprio should impose a minimum number of TXQs
> for the network never existed. Let's say (more or less arbitrarily) that
> without mqprio, the driver expects a minimum number of TXQs equal to the
> number of CPUs (on NXP LS1028A, that is either 1, or 2). And with mqprio,
> mqprio gives the minimum required number of TXQs.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

The nit below notwithstanding,

Reviewed-by: Simon Horman <simon.horman@corigine.com>

...

> diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
> index 1fe8dfd6b6d4..e21d096c5a90 100644
> --- a/drivers/net/ethernet/freescale/enetc/enetc.h
> +++ b/drivers/net/ethernet/freescale/enetc/enetc.h
> @@ -369,6 +369,9 @@ struct enetc_ndev_priv {
>  
>  	struct psfp_cap psfp_cap;
>  
> +	/* Minimum number of TX queues required by the network stack */
> +	unsigned int min_num_stack_tx_queues;
> +

It is probably not important.
But I do notice there are several holes in struct enetc_ndev_priv
that would fit this field.

>  	struct phylink *phylink;
>  	int ic_mode;
>  	u32 tx_ictt;
> -- 
> 2.34.1
> 


* Re: [PATCH v4 net-next 01/15] net: enetc: simplify enetc_num_stack_tx_queues()
  2023-01-30 17:31 ` [PATCH v4 net-next 01/15] net: enetc: simplify enetc_num_stack_tx_queues() Vladimir Oltean
@ 2023-02-01 13:44   ` Simon Horman
  0 siblings, 0 replies; 37+ messages in thread
From: Simon Horman @ 2023-02-01 13:44 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Claudiu Manoil, Vinicius Costa Gomes,
	Kurt Kanzenbach, Jacob Keller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko

On Mon, Jan 30, 2023 at 07:31:31PM +0200, Vladimir Oltean wrote:
> We keep a pointer to the xdp_prog in the private netdev structure as
> well; what's replicated per RX ring is done so just for more convenient
> access from the NAPI poll procedure.
> 
> Simplify enetc_num_stack_tx_queues() by looking at priv->xdp_prog rather
> than iterating through the information replicated per RX ring.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Simon Horman <simon.horman@corigine.com>


* Re: [PATCH v4 net-next 03/15] net: enetc: recalculate num_real_tx_queues when XDP program attaches
  2023-01-30 17:31 ` [PATCH v4 net-next 03/15] net: enetc: recalculate num_real_tx_queues when XDP program attaches Vladimir Oltean
@ 2023-02-01 13:45   ` Simon Horman
  0 siblings, 0 replies; 37+ messages in thread
From: Simon Horman @ 2023-02-01 13:45 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Claudiu Manoil, Vinicius Costa Gomes,
	Kurt Kanzenbach, Jacob Keller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko

On Mon, Jan 30, 2023 at 07:31:33PM +0200, Vladimir Oltean wrote:
> Since the blamed net-next commit, enetc_setup_xdp_prog() no longer goes
> through enetc_open(), and therefore, the function which was supposed to
> detect whether a BPF program exists (in order to crop some TX queues
> from network stack usage), enetc_num_stack_tx_queues(), no longer gets
> called.
> 
> We can move the netif_set_real_num_rx_queues() call to enetc_alloc_msix()
> (probe time), since it is a runtime invariant. We can do the same thing
> with netif_set_real_num_tx_queues(), and let enetc_reconfigure_xdp_cb()
> explicitly recalculate and change the number of stack TX queues.
> 
> Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under enetc_reconfigure()")
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Simon Horman <simon.horman@corigine.com>



* Re: [PATCH v4 net-next 02/15] net: enetc: allow the enetc_reconfigure() callback to fail
  2023-01-30 17:31 ` [PATCH v4 net-next 02/15] net: enetc: allow the enetc_reconfigure() callback to fail Vladimir Oltean
@ 2023-02-01 13:45   ` Simon Horman
  0 siblings, 0 replies; 37+ messages in thread
From: Simon Horman @ 2023-02-01 13:45 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Claudiu Manoil, Vinicius Costa Gomes,
	Kurt Kanzenbach, Jacob Keller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko

On Mon, Jan 30, 2023 at 07:31:32PM +0200, Vladimir Oltean wrote:
> enetc_reconfigure() was modified in commit c33bfaf91c4c ("net: enetc:
> set up XDP program under enetc_reconfigure()") to take an optional
> callback that runs while the netdev is down, but this callback currently
> cannot fail.
> 
> Code up the error handling so that the interface is restarted with the
> old resources if the callback fails.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Simon Horman <simon.horman@corigine.com>



* Re: [PATCH v4 net-next 05/15] net/sched: mqprio: refactor nlattr parsing to a separate function
  2023-01-30 17:31 ` [PATCH v4 net-next 05/15] net/sched: mqprio: refactor nlattr parsing to a separate function Vladimir Oltean
@ 2023-02-01 14:03   ` Simon Horman
  0 siblings, 0 replies; 37+ messages in thread
From: Simon Horman @ 2023-02-01 14:03 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Claudiu Manoil, Vinicius Costa Gomes,
	Kurt Kanzenbach, Jacob Keller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko

On Mon, Jan 30, 2023 at 07:31:35PM +0200, Vladimir Oltean wrote:
> mqprio_init() is quite large and unwieldy to add more code to.
> Split the netlink attribute parsing to a dedicated function.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>

Reviewed-by: Simon Horman <simon.horman@corigine.com>



* Re: [PATCH v4 net-next 06/15] net/sched: mqprio: refactor offloading and unoffloading to dedicated functions
  2023-01-30 17:31 ` [PATCH v4 net-next 06/15] net/sched: mqprio: refactor offloading and unoffloading to dedicated functions Vladimir Oltean
@ 2023-02-01 14:07   ` Simon Horman
  2023-02-01 14:09     ` Simon Horman
  0 siblings, 1 reply; 37+ messages in thread
From: Simon Horman @ 2023-02-01 14:07 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Claudiu Manoil, Vinicius Costa Gomes,
	Kurt Kanzenbach, Jacob Keller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko

On Mon, Jan 30, 2023 at 07:31:36PM +0200, Vladimir Oltean wrote:
> Some more logic will be added to mqprio offloading, so split that code
> up from mqprio_init(), which is already large, and create a new
> function, mqprio_enable_offload(), similar to taprio_enable_offload().
> Also create the opposite function mqprio_disable_offload().
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>

Signed-off-by: Simon Horman <simon.horman@corigine.com>



* Re: [PATCH v4 net-next 07/15] net/sched: move struct tc_mqprio_qopt_offload from pkt_cls.h to pkt_sched.h
  2023-01-30 17:31 ` [PATCH v4 net-next 07/15] net/sched: move struct tc_mqprio_qopt_offload from pkt_cls.h to pkt_sched.h Vladimir Oltean
@ 2023-02-01 14:07   ` Simon Horman
  2023-02-01 14:11     ` Simon Horman
  0 siblings, 1 reply; 37+ messages in thread
From: Simon Horman @ 2023-02-01 14:07 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Claudiu Manoil, Vinicius Costa Gomes,
	Kurt Kanzenbach, Jacob Keller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, Igor Russkikh, Yisen Zhuang, Salil Mehta,
	Jesse Brandeburg, Tony Nguyen, Thomas Petazzoni, Saeed Mahameed,
	Leon Romanovsky, Horatiu Vultur, Lars Povlsen, Steen Hegelund,
	Daniel Machon, UNGLinuxDriver

On Mon, Jan 30, 2023 at 07:31:37PM +0200, Vladimir Oltean wrote:
> Since mqprio is a scheduler and not a classifier, move its offload
> structure to pkt_sched.h, where struct tc_taprio_qopt_offload also lies.
> 
> Also update some header inclusions in drivers that access this
> structure, to the best of my abilities.

Signed-off-by: Simon Horman <simon.horman@corigine.com>



* Re: [PATCH v4 net-next 08/15] net/sched: mqprio: allow offloading drivers to request queue count validation
  2023-01-30 17:31 ` [PATCH v4 net-next 08/15] net/sched: mqprio: allow offloading drivers to request queue count validation Vladimir Oltean
  2023-01-30 18:37   ` Claudiu Manoil
@ 2023-02-01 14:08   ` Simon Horman
  1 sibling, 0 replies; 37+ messages in thread
From: Simon Horman @ 2023-02-01 14:08 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Claudiu Manoil, Vinicius Costa Gomes,
	Kurt Kanzenbach, Jacob Keller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko

On Mon, Jan 30, 2023 at 07:31:38PM +0200, Vladimir Oltean wrote:
> mqprio_parse_opt() proudly has a comment:
> 
> 	/* If hardware offload is requested we will leave it to the device
> 	 * to either populate the queue counts itself or to validate the
> 	 * provided queue counts.
> 	 */
> 
> Unfortunately some device drivers did not get this memo, and don't
> validate the queue counts.
> 
> Introduce a tc capability, and make mqprio query it.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>

Reviewed-by: Simon Horman <simon.horman@corigine.com>



* Re: [PATCH v4 net-next 06/15] net/sched: mqprio: refactor offloading and unoffloading to dedicated functions
  2023-02-01 14:07   ` Simon Horman
@ 2023-02-01 14:09     ` Simon Horman
  0 siblings, 0 replies; 37+ messages in thread
From: Simon Horman @ 2023-02-01 14:09 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Claudiu Manoil, Vinicius Costa Gomes,
	Kurt Kanzenbach, Jacob Keller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko

On Wed, Feb 01, 2023 at 03:07:02PM +0100, Simon Horman wrote:
> On Mon, Jan 30, 2023 at 07:31:36PM +0200, Vladimir Oltean wrote:
> > Some more logic will be added to mqprio offloading, so split that code
> > up from mqprio_init(), which is already large, and create a new
> > function, mqprio_enable_offload(), similar to taprio_enable_offload().
> > Also create the opposite function mqprio_disable_offload().
> > 
> > Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> > Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
> 
> Signed-off-by: Simon Horman <simon.horman@corigine.com>

Sorry, I hit the wrong button.
I meant:

Reviewed-by: Simon Horman <simon.horman@corigine.com>



* Re: [PATCH v4 net-next 07/15] net/sched: move struct tc_mqprio_qopt_offload from pkt_cls.h to pkt_sched.h
  2023-02-01 14:07   ` Simon Horman
@ 2023-02-01 14:11     ` Simon Horman
  0 siblings, 0 replies; 37+ messages in thread
From: Simon Horman @ 2023-02-01 14:11 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Claudiu Manoil, Vinicius Costa Gomes,
	Kurt Kanzenbach, Jacob Keller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, Igor Russkikh, Yisen Zhuang, Salil Mehta,
	Jesse Brandeburg, Tony Nguyen, Thomas Petazzoni, Saeed Mahameed,
	Leon Romanovsky, Horatiu Vultur, Lars Povlsen, Steen Hegelund,
	Daniel Machon, UNGLinuxDriver

On Wed, Feb 01, 2023 at 03:07:56PM +0100, Simon Horman wrote:
> On Mon, Jan 30, 2023 at 07:31:37PM +0200, Vladimir Oltean wrote:
> > Since mqprio is a scheduler and not a classifier, move its offload
> > structure to pkt_sched.h, where struct tc_taprio_qopt_offload also lies.
> > 
> > Also update some header inclusions in drivers that access this
> > structure, to the best of my abilities.
> 
> Signed-off-by: Simon Horman <simon.horman@corigine.com>

Sorry, I hit the wrong button a second time here.
I meant:

Reviewed-by: Simon Horman <simon.horman@corigine.com>



* Re: [PATCH v4 net-next 09/15] net/sched: mqprio: add extack messages for queue count validation
  2023-01-30 17:31 ` [PATCH v4 net-next 09/15] net/sched: mqprio: add extack messages for " Vladimir Oltean
@ 2023-02-01 14:12   ` Simon Horman
  0 siblings, 0 replies; 37+ messages in thread
From: Simon Horman @ 2023-02-01 14:12 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Claudiu Manoil, Vinicius Costa Gomes,
	Kurt Kanzenbach, Jacob Keller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko

On Mon, Jan 30, 2023 at 07:31:39PM +0200, Vladimir Oltean wrote:
> To make mqprio more user-friendly, create netlink extended ack messages
> which say exactly what is wrong about the queue counts. This uses the
> new support for printf-formatted extack messages.
> 
> Example:
> 
> $ tc qdisc add dev eno0 root handle 1: mqprio num_tc 8 \
> 	map 0 1 2 3 4 5 6 7 queues 3@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 1
> Error: sch_mqprio: Queues 1:1 for TC 1 overlap with last TX queue 3 for TC 0.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>

Reviewed-by: Simon Horman <simon.horman@corigine.com>



* Re: [PATCH v4 net-next 10/15] net: enetc: request mqprio to validate the queue counts
  2023-01-30 17:31 ` [PATCH v4 net-next 10/15] net: enetc: request mqprio to validate the queue counts Vladimir Oltean
@ 2023-02-01 14:12   ` Simon Horman
  0 siblings, 0 replies; 37+ messages in thread
From: Simon Horman @ 2023-02-01 14:12 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Claudiu Manoil, Vinicius Costa Gomes,
	Kurt Kanzenbach, Jacob Keller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko

On Mon, Jan 30, 2023 at 07:31:40PM +0200, Vladimir Oltean wrote:
> The enetc driver does not validate the mqprio queue configuration, so it
> currently allows things like this:
> 
> $ tc qdisc add dev swp0 root handle 1: mqprio num_tc 8 \
> 	map 0 1 2 3 4 5 6 7 queues 3@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 1
> 
> By requesting validation via the mqprio capability structure, this is no
> longer allowed, and needs no custom code in the driver.
> 
> The check that num_tc <= real_num_tx_queues also becomes superfluous and
> can be dropped, because mqprio_validate_queue_counts() validates that no
> TXQ range exceeds real_num_tx_queues. That is a stronger check, because
> there is at least 1 TXQ per TC, so there are at least as many TXQs as TCs.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>

Reviewed-by: Simon Horman <simon.horman@corigine.com>



* Re: [PATCH v4 net-next 11/15] net: enetc: act upon the requested mqprio queue configuration
  2023-01-30 17:31 ` [PATCH v4 net-next 11/15] net: enetc: act upon the requested mqprio queue configuration Vladimir Oltean
@ 2023-02-01 15:15   ` Simon Horman
  0 siblings, 0 replies; 37+ messages in thread
From: Simon Horman @ 2023-02-01 15:15 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Claudiu Manoil, Vinicius Costa Gomes,
	Kurt Kanzenbach, Jacob Keller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko

On Mon, Jan 30, 2023 at 07:31:41PM +0200, Vladimir Oltean wrote:
> Regardless of the requested queue count per traffic class, the enetc
> driver allocates a number of TX rings equal to the number of TCs, and
> hardcodes a queue configuration of "1@0 1@1 ... 1@max-tc". Other
> configurations are silently ignored and treated the same.
> 
> Improve that by allowing what the user requests to be actually
> fulfilled. This allows more than one TX ring per traffic class.
> For example:
> 
> $ tc qdisc add dev eno0 root handle 1: mqprio num_tc 4 \
> 	map 0 0 1 1 2 2 3 3 queues 2@0 2@2 2@4 2@6
> [  146.267648] fsl_enetc 0000:00:00.0 eno0: TX ring 0 prio 0
> [  146.273451] fsl_enetc 0000:00:00.0 eno0: TX ring 1 prio 0
> [  146.283280] fsl_enetc 0000:00:00.0 eno0: TX ring 2 prio 1
> [  146.293987] fsl_enetc 0000:00:00.0 eno0: TX ring 3 prio 1
> [  146.300467] fsl_enetc 0000:00:00.0 eno0: TX ring 4 prio 2
> [  146.306866] fsl_enetc 0000:00:00.0 eno0: TX ring 5 prio 2
> [  146.313261] fsl_enetc 0000:00:00.0 eno0: TX ring 6 prio 3
> [  146.319622] fsl_enetc 0000:00:00.0 eno0: TX ring 7 prio 3
> $ tc qdisc del dev eno0 root
> [  178.238418] fsl_enetc 0000:00:00.0 eno0: TX ring 0 prio 0
> [  178.244369] fsl_enetc 0000:00:00.0 eno0: TX ring 1 prio 0
> [  178.251486] fsl_enetc 0000:00:00.0 eno0: TX ring 2 prio 0
> [  178.258006] fsl_enetc 0000:00:00.0 eno0: TX ring 3 prio 0
> [  178.265038] fsl_enetc 0000:00:00.0 eno0: TX ring 4 prio 0
> [  178.271557] fsl_enetc 0000:00:00.0 eno0: TX ring 5 prio 0
> [  178.277910] fsl_enetc 0000:00:00.0 eno0: TX ring 6 prio 0
> [  178.284281] fsl_enetc 0000:00:00.0 eno0: TX ring 7 prio 0
> $ tc qdisc add dev eno0 root handle 1: mqprio num_tc 8 \
> 	map 0 1 2 3 4 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 1
> [  186.113162] fsl_enetc 0000:00:00.0 eno0: TX ring 0 prio 0
> [  186.118764] fsl_enetc 0000:00:00.0 eno0: TX ring 1 prio 1
> [  186.124374] fsl_enetc 0000:00:00.0 eno0: TX ring 2 prio 2
> [  186.130765] fsl_enetc 0000:00:00.0 eno0: TX ring 3 prio 3
> [  186.136404] fsl_enetc 0000:00:00.0 eno0: TX ring 4 prio 4
> [  186.142049] fsl_enetc 0000:00:00.0 eno0: TX ring 5 prio 5
> [  186.147674] fsl_enetc 0000:00:00.0 eno0: TX ring 6 prio 6
> [  186.153305] fsl_enetc 0000:00:00.0 eno0: TX ring 7 prio 7
> 
> The driver used to set TC_MQPRIO_HW_OFFLOAD_TCS, near which there is
> this comment in the UAPI header:
> 
>         TC_MQPRIO_HW_OFFLOAD_TCS,       /* offload TCs, no queue counts */
> 
> but I'm not sure who even looks at this field. Anyway, since this is
> basically what enetc was doing up until now (and no longer is; we
> offload queue counts too), remove that assignment.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Simon Horman <simon.horman@corigine.com>



* Re: [PATCH v4 net-next 12/15] net/sched: taprio: pass mqprio queue configuration to ndo_setup_tc()
  2023-01-30 17:31 ` [PATCH v4 net-next 12/15] net/sched: taprio: pass mqprio queue configuration to ndo_setup_tc() Vladimir Oltean
@ 2023-02-01 15:16   ` Simon Horman
  0 siblings, 0 replies; 37+ messages in thread
From: Simon Horman @ 2023-02-01 15:16 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Claudiu Manoil, Vinicius Costa Gomes,
	Kurt Kanzenbach, Jacob Keller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko

On Mon, Jan 30, 2023 at 07:31:42PM +0200, Vladimir Oltean wrote:
> The taprio offload does not currently pass the mqprio queue configuration
> down to the offloading device driver. So the driver cannot act upon the
> TXQ counts/offsets per TC, or upon the prio->tc map. It was probably
> assumed that the driver only wants to offload num_tc (see
> TC_MQPRIO_HW_OFFLOAD_TCS), which it can get from netdev_get_num_tc(),
> but there's clearly more to the mqprio configuration than that.
> 
> To remedy that, we need to actually reconstruct a struct
> tc_mqprio_qopt_offload to pass as part of the tc_taprio_qopt_offload.
> The problem is that taprio doesn't keep a persistent reference to the
> mqprio queue structure in its own struct taprio_sched, instead it just
> applies the contents of that to the netdev state (prio:tc map, per-TC
> TXQ counts and offsets, num_tc etc). Maybe it's easier to understand
> why, when we look at the size of struct tc_mqprio_qopt_offload: 352
> bytes on arm64. Keeping such a large structure would throw off the
> memory accesses in struct taprio_sched no matter where we put it.
> So we prefer to dynamically reconstruct the mqprio offload structure
> based on netdev information, rather than saving a copy of it.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Simon Horman <simon.horman@corigine.com>



* Re: [PATCH v4 net-next 13/15] net: enetc: act upon mqprio queue config in taprio offload
  2023-01-30 17:31 ` [PATCH v4 net-next 13/15] net: enetc: act upon mqprio queue config in taprio offload Vladimir Oltean
@ 2023-02-01 15:16   ` Simon Horman
  0 siblings, 0 replies; 37+ messages in thread
From: Simon Horman @ 2023-02-01 15:16 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Claudiu Manoil, Vinicius Costa Gomes,
	Kurt Kanzenbach, Jacob Keller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko

On Mon, Jan 30, 2023 at 07:31:43PM +0200, Vladimir Oltean wrote:
> We assume that the mqprio queue configuration from taprio has a simple
> 1:1 mapping between prio and traffic class, and one TX queue per TC.
> That might not be the case. Actually parse and act upon the mqprio
> config.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>

Reviewed-by: Simon Horman <simon.horman@corigine.com>



* Re: [PATCH v4 net-next 14/15] net/sched: taprio: mask off bits in gate mask that exceed number of TCs
  2023-01-30 17:31 ` [PATCH v4 net-next 14/15] net/sched: taprio: mask off bits in gate mask that exceed number of TCs Vladimir Oltean
@ 2023-02-01 15:17   ` Simon Horman
  0 siblings, 0 replies; 37+ messages in thread
From: Simon Horman @ 2023-02-01 15:17 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Claudiu Manoil, Vinicius Costa Gomes,
	Kurt Kanzenbach, Jacob Keller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko

On Mon, Jan 30, 2023 at 07:31:44PM +0200, Vladimir Oltean wrote:
> "man tc-taprio" says:
> 
> | each gate state allows outgoing traffic for a subset (potentially
> | empty) of traffic classes.
> 
> So it makes sense to not allow gate actions to have bits set for traffic
> classes that exceed the number of TCs of the device (according to the
> mqprio configuration).
> 
> Validating precisely that would risk introducing breakage in commands
> that worked (because taprio ignores the upper bits). OTOH, the user may
> not immediately realize that taprio ignores the upper bits (may confuse
> the gate mask to be per TXQ rather than per TC). So at least warn to
> dmesg, mask off the excess bits and continue.
> 
> For this patch to work, we need to move the assignment of the mqprio
> queue configuration to the netdev above the parse_taprio_schedule()
> call, because we make use of netdev_get_num_tc().
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>

Reviewed-by: Simon Horman <simon.horman@corigine.com>



* Re: [PATCH v4 net-next 15/15] net/sched: taprio: only calculate gate mask per TXQ for igc, stmmac and tsnep
  2023-01-30 17:31 ` [PATCH v4 net-next 15/15] net/sched: taprio: only calculate gate mask per TXQ for igc, stmmac and tsnep Vladimir Oltean
@ 2023-02-01 15:17   ` Simon Horman
  0 siblings, 0 replies; 37+ messages in thread
From: Simon Horman @ 2023-02-01 15:17 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Claudiu Manoil, Vinicius Costa Gomes,
	Kurt Kanzenbach, Jacob Keller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, Horatiu Vultur, Siddharth Vadapalli, Roger Quadros,
	Gerhard Engleder

On Mon, Jan 30, 2023 at 07:31:45PM +0200, Vladimir Oltean wrote:
> There are 2 classes of in-tree drivers currently:
> 
> - those who act upon struct tc_taprio_sched_entry :: gate_mask as if it
>   holds a bit mask of TXQs
> 
> - those who act upon the gate_mask as if it holds a bit mask of TCs
> 
> When it comes to the standard, IEEE 802.1Q-2018 does say this in the
> second paragraph of section 8.6.8.4 Enhancements for scheduled traffic:
> 
> | A gate control list associated with each Port contains an ordered list
> | of gate operations. Each gate operation changes the transmission gate
> | state for the gate associated with each of the Port's traffic class
> | queues and allows associated control operations to be scheduled.
> 
> In typically obtuse language, it refers to a "traffic class queue"
> rather than a "traffic class" or a "queue". But careful reading of
> 802.1Q clarifies that "traffic class" and "queue" are in fact
> synonymous (see 8.6.6 Queuing frames):
> 
> | A queue in this context is not necessarily a single FIFO data structure.
> | A queue is a record of all frames of a given traffic class awaiting
> | transmission on a given Bridge Port. The structure of this record is not
> | specified.
> 
> In other words, their definition of "queue" isn't the Linux TX queue.
> 
> The gate_mask really is input into taprio via its UAPI as a mask of
> traffic classes, but taprio_sched_to_offload() converts it into a TXQ
> mask.
> 
> The breakdown of drivers which handle TC_SETUP_QDISC_TAPRIO is:
> 
> - hellcreek, felix, sja1105: these are DSA switches, and it's not even
>   very clear what the TXQs correspond to, other than purely software
>   constructs.
>   Only the mqprio configuration with 8 TCs and 1 TXQ per TC makes sense.
>   So it's fine to convert these to a gate mask per TC.
> 
> - enetc: I have the hardware and can confirm that the gate mask is per
>   TC, and affects all TXQs (BD rings) configured for that priority.
> 
> - igc: in igc_save_qbv_schedule(), the gate_mask is clearly interpreted
>   to be per-TXQ.
> 
> - tsnep: Gerhard Engleder clarifies that even though this hardware
>   supports at most 1 TXQ per TC, the TXQ indices may be different from
>   the TC values themselves, and it is the TXQ indices that matter to
>   this hardware. So keep it per-TXQ as well.
> 
> - stmmac: I have a GMAC datasheet, and in the EST section it does
>   specify that the gate events are per TXQ rather than per TC.
> 
> - lan966x: again, this is a switch, and while not a DSA one, the way in
>   which it implements lan966x_mqprio_add() - by only allowing num_tc ==
>   NUM_PRIO_QUEUES (8) - makes it clear to me that TXQs are a purely
>   software construct here as well. They seem to map 1:1 with TCs.
> 
> - am65_cpsw: from looking at am65_cpsw_est_set_sched_cmds(), I get the
>   impression that the fetch_allow variable is treated like a prio_mask.
>   I haven't studied this driver's interpretation of the prio_tc_map, but
>   that definitely sounds closer to a per-TC gate mask rather than a
>   per-TXQ one.
> 
> Based on this breakdown, we have 6 drivers with a gate mask per TC and
> 3 with a gate mask per TXQ. So let's make the gate mask per TXQ the
> opt-in and the gate mask per TC the default.
> 
> Benefit from the TC_QUERY_CAPS feature that Jakub suggested we add:
> query the device driver before calling the proper ndo_setup_tc() to
> figure out which of the two formats it expects.
> 
> Cc: Horatiu Vultur <horatiu.vultur@microchip.com>
> Cc: Siddharth Vadapalli <s-vadapalli@ti.com>
> Cc: Roger Quadros <rogerq@kernel.org>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> Acked-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
> Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com>
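
For driver authors wondering what the per-TXQ opt-in looks like in practice,
here is a sketch of the TC_QUERY_CAPS side. The capability field name
(gate_mask_per_txq) is my reading of this patch; the function name and error
paths are illustrative only.

#include <linux/netdevice.h>
#include <net/pkt_cls.h>	/* struct tc_query_caps_base, TC_QUERY_CAPS */
#include <net/pkt_sched.h>	/* struct tc_taprio_caps */

/* Illustrative only: a TXQ-oriented driver (igc/stmmac/tsnep style) opts
 * in to a per-TXQ gate mask by answering TC_QUERY_CAPS, which taprio now
 * issues before TC_SETUP_QDISC_TAPRIO.
 */
static int example_setup_tc(struct net_device *dev, enum tc_setup_type type,
			    void *type_data)
{
	switch (type) {
	case TC_QUERY_CAPS: {
		struct tc_query_caps_base *base = type_data;

		if (base->type == TC_SETUP_QDISC_TAPRIO) {
			struct tc_taprio_caps *caps = base->caps;

			/* without this, taprio keeps the default per-TC mask */
			caps->gate_mask_per_txq = true;
		}
		return 0;
	}
	case TC_SETUP_QDISC_TAPRIO:
		/* the actual schedule offload would be handled here */
		return -EOPNOTSUPP;
	default:
		return -EOPNOTSUPP;
	}
}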

Reviewed-by: Simon Horman <simon.horman@corigine.com>



* Re: [PATCH v4 net-next 04/15] net: enetc: ensure we always have a minimum number of TXQs for stack
  2023-02-01 13:43   ` Simon Horman
@ 2023-02-01 18:46     ` Vladimir Oltean
  2023-02-02  9:29       ` Simon Horman
  0 siblings, 1 reply; 37+ messages in thread
From: Vladimir Oltean @ 2023-02-01 18:46 UTC (permalink / raw)
  To: Simon Horman
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Claudiu Manoil, Vinicius Costa Gomes,
	Kurt Kanzenbach, Jacob Keller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko

Hi Simon,

On Wed, Feb 01, 2023 at 02:43:44PM +0100, Simon Horman wrote:
> The nit below notwithstanding,
> 
> Reviewed-by: Simon Horman <simon.horman@corigine.com>

I appreciate your time to review this patch set.

> > diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
> > index 1fe8dfd6b6d4..e21d096c5a90 100644
> > --- a/drivers/net/ethernet/freescale/enetc/enetc.h
> > +++ b/drivers/net/ethernet/freescale/enetc/enetc.h
> > @@ -369,6 +369,9 @@ struct enetc_ndev_priv {
> >  
> >  	struct psfp_cap psfp_cap;
> >  
> > +	/* Minimum number of TX queues required by the network stack */
> > +	unsigned int min_num_stack_tx_queues;
> > +
> 
> It is probably not important.
> But I do notice there are several holes in struct enetc_ndev_priv
> that would fit this field.

This is true. However, this patch was written taking pahole into
consideration, and one new field can only fill a single hole :)

Before:

pahole -C enetc_ndev_priv $KBUILD_OUTPUT/drivers/net/ethernet/freescale/enetc/enetc.o
struct enetc_ndev_priv {
        struct net_device *        ndev;                 /*     0     8 */
        struct device *            dev;                  /*     8     8 */
        struct enetc_si *          si;                   /*    16     8 */
        int                        bdr_int_num;          /*    24     4 */

        /* XXX 4 bytes hole, try to pack */

        struct enetc_int_vector *  int_vector[2];        /*    32    16 */
        u16                        num_rx_rings;         /*    48     2 */
        u16                        num_tx_rings;         /*    50     2 */
        u16                        rx_bd_count;          /*    52     2 */
        u16                        tx_bd_count;          /*    54     2 */
        u16                        msg_enable;           /*    56     2 */

        /* XXX 2 bytes hole, try to pack */

        enum enetc_active_offloads active_offloads;      /*    60     4 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        u32                        speed;                /*    64     4 */

        /* XXX 4 bytes hole, try to pack */

        struct enetc_bdr * *       xdp_tx_ring;          /*    72     8 */
        struct enetc_bdr *         tx_ring[16];          /*    80   128 */
        /* --- cacheline 3 boundary (192 bytes) was 16 bytes ago --- */
        struct enetc_bdr *         rx_ring[16];          /*   208   128 */
        /* --- cacheline 5 boundary (320 bytes) was 16 bytes ago --- */
        const struct enetc_bdr_resource  * tx_res;       /*   336     8 */
        const struct enetc_bdr_resource  * rx_res;       /*   344     8 */
        struct enetc_cls_rule *    cls_rules;            /*   352     8 */
        struct psfp_cap            psfp_cap;             /*   360    20 */

        /* XXX 4 bytes hole, try to pack */

        /* --- cacheline 6 boundary (384 bytes) --- */
        struct phylink *           phylink;              /*   384     8 */
        int                        ic_mode;              /*   392     4 */
        u32                        tx_ictt;              /*   396     4 */
        struct bpf_prog *          xdp_prog;             /*   400     8 */
        long unsigned int          flags;                /*   408     8 */
        struct work_struct         tx_onestep_tstamp;    /*   416     0 */

        /* XXX 32 bytes hole, try to pack */

        /* --- cacheline 7 boundary (448 bytes) --- */
        struct sk_buff_head        tx_skbs;              /*   448     0 */

        /* size: 472, cachelines: 8, members: 26 */
        /* sum members: 402, holes: 5, sum holes: 46 */
        /* padding: 24 */
        /* last cacheline: 24 bytes */
};

After:

struct enetc_ndev_priv {
        struct net_device *        ndev;                 /*     0     8 */
        struct device *            dev;                  /*     8     8 */
        struct enetc_si *          si;                   /*    16     8 */
        int                        bdr_int_num;          /*    24     4 */

        /* XXX 4 bytes hole, try to pack */

        struct enetc_int_vector *  int_vector[2];        /*    32    16 */
        u16                        num_rx_rings;         /*    48     2 */
        u16                        num_tx_rings;         /*    50     2 */
        u16                        rx_bd_count;          /*    52     2 */
        u16                        tx_bd_count;          /*    54     2 */
        u16                        msg_enable;           /*    56     2 */

        /* XXX 2 bytes hole, try to pack */

        enum enetc_active_offloads active_offloads;      /*    60     4 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        u32                        speed;                /*    64     4 */

        /* XXX 4 bytes hole, try to pack */

        struct enetc_bdr * *       xdp_tx_ring;          /*    72     8 */
        struct enetc_bdr *         tx_ring[16];          /*    80   128 */
        /* --- cacheline 3 boundary (192 bytes) was 16 bytes ago --- */
        struct enetc_bdr *         rx_ring[16];          /*   208   128 */
        /* --- cacheline 5 boundary (320 bytes) was 16 bytes ago --- */
        const struct enetc_bdr_resource  * tx_res;       /*   336     8 */
        const struct enetc_bdr_resource  * rx_res;       /*   344     8 */
        struct enetc_cls_rule *    cls_rules;            /*   352     8 */
        struct psfp_cap            psfp_cap;             /*   360    20 */
        unsigned int               min_num_stack_tx_queues; /*   380     4 */
        /* --- cacheline 6 boundary (384 bytes) --- */
        struct phylink *           phylink;              /*   384     8 */
        int                        ic_mode;              /*   392     4 */
        u32                        tx_ictt;              /*   396     4 */
        struct bpf_prog *          xdp_prog;             /*   400     8 */
        long unsigned int          flags;                /*   408     8 */
        struct work_struct         tx_onestep_tstamp;    /*   416     0 */

        /* XXX 32 bytes hole, try to pack */

        /* --- cacheline 7 boundary (448 bytes) --- */
        struct sk_buff_head        tx_skbs;              /*   448     0 */

        /* size: 472, cachelines: 8, members: 27 */
        /* sum members: 406, holes: 4, sum holes: 42 */
        /* padding: 24 */
        /* last cacheline: 24 bytes */
};


* Re: [PATCH v4 net-next 04/15] net: enetc: ensure we always have a minimum number of TXQs for stack
  2023-02-01 18:46     ` Vladimir Oltean
@ 2023-02-02  9:29       ` Simon Horman
  0 siblings, 0 replies; 37+ messages in thread
From: Simon Horman @ 2023-02-02  9:29 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Claudiu Manoil, Vinicius Costa Gomes,
	Kurt Kanzenbach, Jacob Keller, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko

On Wed, Feb 01, 2023 at 08:46:52PM +0200, Vladimir Oltean wrote:
> Hi Simon,
> 
> On Wed, Feb 01, 2023 at 02:43:44PM +0100, Simon Horman wrote:
> > The nit below notwithstanding,
> > 
> > Reviewed-by: Simon Horman <simon.horman@corigine.com>
> 
> I appreciate your time to review this patch set.
> 
> > > diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
> > > index 1fe8dfd6b6d4..e21d096c5a90 100644
> > > --- a/drivers/net/ethernet/freescale/enetc/enetc.h
> > > +++ b/drivers/net/ethernet/freescale/enetc/enetc.h
> > > @@ -369,6 +369,9 @@ struct enetc_ndev_priv {
> > >  
> > >  	struct psfp_cap psfp_cap;
> > >  
> > > +	/* Minimum number of TX queues required by the network stack */
> > > +	unsigned int min_num_stack_tx_queues;
> > > +
> > 
> > It is probably not important.
> > But I do notice there are several holes in struct enetc_ndev_priv
> > that would fit this field.
> 
> This is true. However, this patch was written taking pahole into
> consideration, and one new field can only fill a single hole :)

Yes, indeed. Silly me.


end of thread

Thread overview: 37+ messages
2023-01-30 17:31 [PATCH v4 net-next 00/15] ENETC mqprio/taprio cleanup Vladimir Oltean
2023-01-30 17:31 ` [PATCH v4 net-next 01/15] net: enetc: simplify enetc_num_stack_tx_queues() Vladimir Oltean
2023-02-01 13:44   ` Simon Horman
2023-01-30 17:31 ` [PATCH v4 net-next 02/15] net: enetc: allow the enetc_reconfigure() callback to fail Vladimir Oltean
2023-02-01 13:45   ` Simon Horman
2023-01-30 17:31 ` [PATCH v4 net-next 03/15] net: enetc: recalculate num_real_tx_queues when XDP program attaches Vladimir Oltean
2023-02-01 13:45   ` Simon Horman
2023-01-30 17:31 ` [PATCH v4 net-next 04/15] net: enetc: ensure we always have a minimum number of TXQs for stack Vladimir Oltean
2023-02-01 13:43   ` Simon Horman
2023-02-01 18:46     ` Vladimir Oltean
2023-02-02  9:29       ` Simon Horman
2023-01-30 17:31 ` [PATCH v4 net-next 05/15] net/sched: mqprio: refactor nlattr parsing to a separate function Vladimir Oltean
2023-02-01 14:03   ` Simon Horman
2023-01-30 17:31 ` [PATCH v4 net-next 06/15] net/sched: mqprio: refactor offloading and unoffloading to dedicated functions Vladimir Oltean
2023-02-01 14:07   ` Simon Horman
2023-02-01 14:09     ` Simon Horman
2023-01-30 17:31 ` [PATCH v4 net-next 07/15] net/sched: move struct tc_mqprio_qopt_offload from pkt_cls.h to pkt_sched.h Vladimir Oltean
2023-02-01 14:07   ` Simon Horman
2023-02-01 14:11     ` Simon Horman
2023-01-30 17:31 ` [PATCH v4 net-next 08/15] net/sched: mqprio: allow offloading drivers to request queue count validation Vladimir Oltean
2023-01-30 18:37   ` Claudiu Manoil
2023-01-30 19:06     ` Vladimir Oltean
2023-02-01 14:08   ` Simon Horman
2023-01-30 17:31 ` [PATCH v4 net-next 09/15] net/sched: mqprio: add extack messages for " Vladimir Oltean
2023-02-01 14:12   ` Simon Horman
2023-01-30 17:31 ` [PATCH v4 net-next 10/15] net: enetc: request mqprio to validate the queue counts Vladimir Oltean
2023-02-01 14:12   ` Simon Horman
2023-01-30 17:31 ` [PATCH v4 net-next 11/15] net: enetc: act upon the requested mqprio queue configuration Vladimir Oltean
2023-02-01 15:15   ` Simon Horman
2023-01-30 17:31 ` [PATCH v4 net-next 12/15] net/sched: taprio: pass mqprio queue configuration to ndo_setup_tc() Vladimir Oltean
2023-02-01 15:16   ` Simon Horman
2023-01-30 17:31 ` [PATCH v4 net-next 13/15] net: enetc: act upon mqprio queue config in taprio offload Vladimir Oltean
2023-02-01 15:16   ` Simon Horman
2023-01-30 17:31 ` [PATCH v4 net-next 14/15] net/sched: taprio: mask off bits in gate mask that exceed number of TCs Vladimir Oltean
2023-02-01 15:17   ` Simon Horman
2023-01-30 17:31 ` [PATCH v4 net-next 15/15] net/sched: taprio: only calculate gate mask per TXQ for igc, stmmac and tsnep Vladimir Oltean
2023-02-01 15:17   ` Simon Horman
