All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch net-next 00/10] net: allow user specify TC filter HW stats type
@ 2020-02-21  9:56 Jiri Pirko
  2020-02-21  9:56 ` [patch net-next 01/10] net: rename tc_cls_can_offload_and_chain0() to tc_cls_can_offload_basic() Jiri Pirko
                   ` (10 more replies)
  0 siblings, 11 replies; 29+ messages in thread
From: Jiri Pirko @ 2020-02-21  9:56 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, saeedm, leon, michael.chan, vishal,
	jeffrey.t.kirsher, idosch, aelior, peppe.cavallaro,
	alexandre.torgue, jhs, xiyou.wangcong, pablo, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

Currently, when user adds a TC filter and the filter gets offloaded,
the user expects the HW stats to be counted and included in stats dump.
However, since drivers may implement different types of counting, there
is no way to specify which one the user is interested in.

For example for mlx5, only delayed counters are available as the driver
periodically polls for updated stats.

In case of mlxsw, the counters are queried on dump time. However, the
HW resources for this type of counters is quite limited (couple of
thousands). This limits the amount of supported offloaded filters
significantly. Without counter assigned, the HW is capable to carry
millions of those.

On top of that, mlxsw HW is able to support delayed counters as well in
greater numbers. That is going to be added in a follow-up patch.

This patchset allows user to specify one of the following types of HW
stats for added fitler:
any - current default, user does not care about the type, just expects
      any type of stats.
immediate - queried during dump time
delayed - polled from HW periodically or sent by HW in async manner
disabled - no stats needed

Examples:
$ tc filter add dev enp0s16np28 ingress proto ip handle 1 pref 1 flower hw_stats disabled dst_ip 192.168.1.1 action drop
$ tc -s filter show dev enp0s16np28 ingress
filter protocol ip pref 1 flower chain 0 
filter protocol ip pref 1 flower chain 0 handle 0x1 
  eth_type ipv4
  dst_ip 192.168.1.1
  in_hw in_hw_count 2
  hw_stats disabled
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1 installed 10 sec used 2 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
        backlog 0b 0p requeues 0

$ tc filter add dev enp0s16np28 ingress proto ip handle 1 pref 1 flower hw_stats immediate dst_ip 192.168.1.1 action drop
$ tc -s filter show dev enp0s16np28 ingress
filter protocol ip pref 1 flower chain 0 
filter protocol ip pref 1 flower chain 0 handle 0x1 
  eth_type ipv4
  dst_ip 192.168.1.1
  in_hw in_hw_count 2
  hw_stats immediate
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1 installed 14 sec used 7 sec
        Action statistics:
        Sent 102 bytes 1 pkt (dropped 1, overlimits 0 requeues 0) 
        Sent software 0 bytes 0 pkt
        Sent hardware 102 bytes 1 pkt
        backlog 0b 0p requeues 0

Jiri Pirko (10):
  net: rename tc_cls_can_offload_and_chain0() to
    tc_cls_can_offload_basic()
  iavf: use tc_cls_can_offload_basic() instead of chain check
  flow_offload: Introduce offload of HW stats type
  net: extend tc_cls_can_offload_basic() to check HW stats type
  mlx5: restrict supported HW stats type to "any"
  mlxsw: restrict supported HW stats type to "any"
  flow_offload: introduce "immediate" HW stats type and allow it in
    mlxsw
  flow_offload: introduce "delayed" HW stats type and allow it in mlx5
  flow_offload: introduce "disabled" HW stats type and allow it in mlxsw
  sched: cls_flower: allow user to specify type of HW stats for a filter

 drivers/net/ethernet/broadcom/bnxt/bnxt.c     |  2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c |  2 +-
 .../net/ethernet/chelsio/cxgb4/cxgb4_main.c   |  4 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c   |  2 +-
 drivers/net/ethernet/intel/iavf/iavf_main.c   |  8 ++--
 drivers/net/ethernet/intel/igb/igb_main.c     |  2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  2 +-
 .../net/ethernet/mellanox/mlx5/core/en_tc.c   |  8 +++-
 .../net/ethernet/mellanox/mlxsw/spectrum.c    |  3 +-
 .../ethernet/mellanox/mlxsw/spectrum_flower.c | 31 +++++++++++-----
 drivers/net/ethernet/mscc/ocelot_flower.c     |  2 +-
 drivers/net/ethernet/mscc/ocelot_tc.c         |  2 +-
 drivers/net/ethernet/netronome/nfp/abm/cls.c  |  2 +-
 drivers/net/ethernet/netronome/nfp/bpf/main.c |  2 +-
 .../ethernet/netronome/nfp/flower/offload.c   |  2 +-
 drivers/net/ethernet/qlogic/qede/qede_main.c  |  2 +-
 .../net/ethernet/stmicro/stmmac/stmmac_main.c |  2 +-
 drivers/net/netdevsim/bpf.c                   |  2 +-
 include/net/flow_offload.h                    |  8 ++++
 include/net/pkt_cls.h                         | 37 ++++++++++++++++++-
 include/uapi/linux/pkt_cls.h                  | 27 ++++++++++++++
 net/sched/cls_flower.c                        | 12 ++++++
 22 files changed, 132 insertions(+), 32 deletions(-)

-- 
2.21.1


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [patch net-next 01/10] net: rename tc_cls_can_offload_and_chain0() to tc_cls_can_offload_basic()
  2020-02-21  9:56 [patch net-next 00/10] net: allow user specify TC filter HW stats type Jiri Pirko
@ 2020-02-21  9:56 ` Jiri Pirko
  2020-02-21  9:56 ` [patch net-next 02/10] iavf: use tc_cls_can_offload_basic() instead of chain check Jiri Pirko
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: Jiri Pirko @ 2020-02-21  9:56 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, saeedm, leon, michael.chan, vishal,
	jeffrey.t.kirsher, idosch, aelior, peppe.cavallaro,
	alexandre.torgue, jhs, xiyou.wangcong, pablo, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

The function is going to be extended for checking not only chain 0, but
also HW stats type. So rename it accordingly as this checks for basic
offloading functionality.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c           | 2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c       | 2 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c     | 4 ++--
 drivers/net/ethernet/intel/i40e/i40e_main.c         | 2 +-
 drivers/net/ethernet/intel/igb/igb_main.c           | 2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c       | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c     | 2 +-
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c      | 3 +--
 drivers/net/ethernet/mscc/ocelot_flower.c           | 2 +-
 drivers/net/ethernet/mscc/ocelot_tc.c               | 2 +-
 drivers/net/ethernet/netronome/nfp/abm/cls.c        | 2 +-
 drivers/net/ethernet/netronome/nfp/bpf/main.c       | 2 +-
 drivers/net/ethernet/netronome/nfp/flower/offload.c | 2 +-
 drivers/net/ethernet/qlogic/qede/qede_main.c        | 2 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c   | 2 +-
 drivers/net/netdevsim/bpf.c                         | 2 +-
 include/net/pkt_cls.h                               | 4 ++--
 17 files changed, 19 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 597e6fd5bfea..f62a1d9667fa 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -11043,7 +11043,7 @@ static int bnxt_setup_tc_block_cb(enum tc_setup_type type, void *type_data,
 	struct bnxt *bp = cb_priv;
 
 	if (!bnxt_tc_flower_enabled(bp) ||
-	    !tc_cls_can_offload_and_chain0(bp->dev, type_data))
+	    !tc_cls_can_offload_basic(bp->dev, type_data))
 		return -EOPNOTSUPP;
 
 	switch (type) {
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c
index b010b34cdaf8..2ab3ecb298de 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c
@@ -150,7 +150,7 @@ static int bnxt_vf_rep_setup_tc_block_cb(enum tc_setup_type type,
 	int vf_fid = bp->pf.vf[vf_rep->vf_idx].fw_fid;
 
 	if (!bnxt_tc_flower_enabled(vf_rep->bp) ||
-	    !tc_cls_can_offload_and_chain0(bp->dev, type_data))
+	    !tc_cls_can_offload_basic(bp->dev, type_data))
 		return -EOPNOTSUPP;
 
 	switch (type) {
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 649842a8aa28..7b98c8e2ce69 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -3316,7 +3316,7 @@ static int cxgb_setup_tc_block_ingress_cb(enum tc_setup_type type,
 		return -EINVAL;
 	}
 
-	if (!tc_cls_can_offload_and_chain0(dev, type_data))
+	if (!tc_cls_can_offload_basic(dev, type_data))
 		return -EOPNOTSUPP;
 
 	switch (type) {
@@ -3345,7 +3345,7 @@ static int cxgb_setup_tc_block_egress_cb(enum tc_setup_type type,
 		return -EINVAL;
 	}
 
-	if (!tc_cls_can_offload_and_chain0(dev, type_data))
+	if (!tc_cls_can_offload_basic(dev, type_data))
 		return -EOPNOTSUPP;
 
 	switch (type) {
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 8c3e753bfb9d..afa9a96e9212 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -8141,7 +8141,7 @@ static int i40e_setup_tc_block_cb(enum tc_setup_type type, void *type_data,
 {
 	struct i40e_netdev_priv *np = cb_priv;
 
-	if (!tc_cls_can_offload_and_chain0(np->vsi->netdev, type_data))
+	if (!tc_cls_can_offload_basic(np->vsi->netdev, type_data))
 		return -EOPNOTSUPP;
 
 	switch (type) {
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index b46bff8fe056..396bda31ab76 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2775,7 +2775,7 @@ static int igb_setup_tc_block_cb(enum tc_setup_type type, void *type_data,
 {
 	struct igb_adapter *adapter = cb_priv;
 
-	if (!tc_cls_can_offload_and_chain0(adapter->netdev, type_data))
+	if (!tc_cls_can_offload_basic(adapter->netdev, type_data))
 		return -EOPNOTSUPP;
 
 	switch (type) {
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 718931d951bc..a91623b7ba19 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -9631,7 +9631,7 @@ static int ixgbe_setup_tc_block_cb(enum tc_setup_type type, void *type_data,
 {
 	struct ixgbe_adapter *adapter = cb_priv;
 
-	if (!tc_cls_can_offload_and_chain0(adapter->netdev, type_data))
+	if (!tc_cls_can_offload_basic(adapter->netdev, type_data))
 		return -EOPNOTSUPP;
 
 	switch (type) {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 74091f72c9a8..bcfe2b6e35e5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -3865,7 +3865,7 @@ mlx5e_add_nic_flow(struct mlx5e_priv *priv,
 	int attr_size, err;
 
 	/* multi-chain not supported for NIC rules */
-	if (!tc_cls_can_offload_and_chain0(priv->netdev, &f->common))
+	if (!tc_cls_can_offload_basic(priv->netdev, &f->common))
 		return -EOPNOTSUPP;
 
 	flow_flags |= BIT(MLX5E_TC_FLOW_FLAG_NIC);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index d78e790ba94a..de8c11b5c487 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -1589,8 +1589,7 @@ static int mlxsw_sp_setup_tc_block_cb_matchall(enum tc_setup_type type,
 
 	switch (type) {
 	case TC_SETUP_CLSMATCHALL:
-		if (!tc_cls_can_offload_and_chain0(mlxsw_sp_port->dev,
-						   type_data))
+		if (!tc_cls_can_offload_basic(mlxsw_sp_port->dev, type_data))
 			return -EOPNOTSUPP;
 
 		return mlxsw_sp_setup_tc_cls_matchall(mlxsw_sp_port, type_data,
diff --git a/drivers/net/ethernet/mscc/ocelot_flower.c b/drivers/net/ethernet/mscc/ocelot_flower.c
index 3d65b99b9734..64d2758eb223 100644
--- a/drivers/net/ethernet/mscc/ocelot_flower.c
+++ b/drivers/net/ethernet/mscc/ocelot_flower.c
@@ -261,7 +261,7 @@ static int ocelot_setup_tc_block_cb_flower(enum tc_setup_type type,
 {
 	struct ocelot_port_block *port_block = cb_priv;
 
-	if (!tc_cls_can_offload_and_chain0(port_block->priv->dev, type_data))
+	if (!tc_cls_can_offload_basic(port_block->priv->dev, type_data))
 		return -EOPNOTSUPP;
 
 	switch (type) {
diff --git a/drivers/net/ethernet/mscc/ocelot_tc.c b/drivers/net/ethernet/mscc/ocelot_tc.c
index a4f7fbd76507..954136db2f62 100644
--- a/drivers/net/ethernet/mscc/ocelot_tc.c
+++ b/drivers/net/ethernet/mscc/ocelot_tc.c
@@ -94,7 +94,7 @@ static int ocelot_setup_tc_block_cb(enum tc_setup_type type,
 {
 	struct ocelot_port_private *priv = cb_priv;
 
-	if (!tc_cls_can_offload_and_chain0(priv->dev, type_data))
+	if (!tc_cls_can_offload_basic(priv->dev, type_data))
 		return -EOPNOTSUPP;
 
 	switch (type) {
diff --git a/drivers/net/ethernet/netronome/nfp/abm/cls.c b/drivers/net/ethernet/netronome/nfp/abm/cls.c
index 23ebddfb9532..7c59b248e216 100644
--- a/drivers/net/ethernet/netronome/nfp/abm/cls.c
+++ b/drivers/net/ethernet/netronome/nfp/abm/cls.c
@@ -238,7 +238,7 @@ static int nfp_abm_setup_tc_block_cb(enum tc_setup_type type,
 				   "only offload of u32 classifier supported");
 		return -EOPNOTSUPP;
 	}
-	if (!tc_cls_can_offload_and_chain0(repr->netdev, &cls_u32->common))
+	if (!tc_cls_can_offload_basic(repr->netdev, &cls_u32->common))
 		return -EOPNOTSUPP;
 
 	if (cls_u32->common.protocol != htons(ETH_P_IP) &&
diff --git a/drivers/net/ethernet/netronome/nfp/bpf/main.c b/drivers/net/ethernet/netronome/nfp/bpf/main.c
index 11c83a99b014..4c1b2bd655a0 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/main.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/main.c
@@ -116,7 +116,7 @@ static int nfp_bpf_setup_tc_block_cb(enum tc_setup_type type,
 				   "only offload of BPF classifiers supported");
 		return -EOPNOTSUPP;
 	}
-	if (!tc_cls_can_offload_and_chain0(nn->dp.netdev, &cls_bpf->common))
+	if (!tc_cls_can_offload_basic(nn->dp.netdev, &cls_bpf->common))
 		return -EOPNOTSUPP;
 	if (!nfp_net_ebpf_capable(nn)) {
 		NL_SET_ERR_MSG_MOD(cls_bpf->common.extack,
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index 7ca5c1becfcf..7a9bb55deec2 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -1523,7 +1523,7 @@ static int nfp_flower_setup_tc_block_cb(enum tc_setup_type type,
 {
 	struct nfp_repr *repr = cb_priv;
 
-	if (!tc_cls_can_offload_and_chain0(repr->netdev, type_data))
+	if (!tc_cls_can_offload_basic(repr->netdev, type_data))
 		return -EOPNOTSUPP;
 
 	switch (type) {
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c
index 34fa3917eb33..6168bb6e855c 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -567,7 +567,7 @@ static int qede_setup_tc_block_cb(enum tc_setup_type type, void *type_data,
 	struct flow_cls_offload *f;
 	struct qede_dev *edev = cb_priv;
 
-	if (!tc_cls_can_offload_and_chain0(edev->ndev, type_data))
+	if (!tc_cls_can_offload_basic(edev->ndev, type_data))
 		return -EOPNOTSUPP;
 
 	switch (type) {
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 37920b4da091..80b528a58723 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -4120,7 +4120,7 @@ static int stmmac_setup_tc_block_cb(enum tc_setup_type type, void *type_data,
 	struct stmmac_priv *priv = cb_priv;
 	int ret = -EOPNOTSUPP;
 
-	if (!tc_cls_can_offload_and_chain0(priv->dev, type_data))
+	if (!tc_cls_can_offload_basic(priv->dev, type_data))
 		return ret;
 
 	stmmac_disable_all_queues(priv);
diff --git a/drivers/net/netdevsim/bpf.c b/drivers/net/netdevsim/bpf.c
index 0b362b8dac17..537a4233a989 100644
--- a/drivers/net/netdevsim/bpf.c
+++ b/drivers/net/netdevsim/bpf.c
@@ -124,7 +124,7 @@ int nsim_bpf_setup_tc_block_cb(enum tc_setup_type type,
 		return -EOPNOTSUPP;
 	}
 
-	if (!tc_cls_can_offload_and_chain0(ns->netdev, &cls_bpf->common))
+	if (!tc_cls_can_offload_basic(ns->netdev, &cls_bpf->common))
 		return -EOPNOTSUPP;
 
 	if (cls_bpf->common.protocol != htons(ETH_P_ALL)) {
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 53946b509b51..779364ed080a 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -584,8 +584,8 @@ static inline bool tc_can_offload_extack(const struct net_device *dev,
 }
 
 static inline bool
-tc_cls_can_offload_and_chain0(const struct net_device *dev,
-			      struct flow_cls_common_offload *common)
+tc_cls_can_offload_basic(const struct net_device *dev,
+			 struct flow_cls_common_offload *common)
 {
 	if (!tc_can_offload_extack(dev, common->extack))
 		return false;
-- 
2.21.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [patch net-next 02/10] iavf: use tc_cls_can_offload_basic() instead of chain check
  2020-02-21  9:56 [patch net-next 00/10] net: allow user specify TC filter HW stats type Jiri Pirko
  2020-02-21  9:56 ` [patch net-next 01/10] net: rename tc_cls_can_offload_and_chain0() to tc_cls_can_offload_basic() Jiri Pirko
@ 2020-02-21  9:56 ` Jiri Pirko
  2020-02-21  9:56 ` [patch net-next 03/10] flow_offload: Introduce offload of HW stats type Jiri Pirko
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: Jiri Pirko @ 2020-02-21  9:56 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, saeedm, leon, michael.chan, vishal,
	jeffrey.t.kirsher, idosch, aelior, peppe.cavallaro,
	alexandre.torgue, jhs, xiyou.wangcong, pablo, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

Looks like the iavf code actually experienced a race condition, when a
developer took code before the check for chain 0 was put to helper.
So use tc_cls_can_offload_basic() helper instead of direct check and
move the check to _cb() so this is similar to i40e code.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/intel/iavf/iavf_main.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 62fe56ddcb6e..8bc0d287d025 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -3061,9 +3061,6 @@ static int iavf_delete_clsflower(struct iavf_adapter *adapter,
 static int iavf_setup_tc_cls_flower(struct iavf_adapter *adapter,
 				    struct flow_cls_offload *cls_flower)
 {
-	if (cls_flower->common.chain_index)
-		return -EOPNOTSUPP;
-
 	switch (cls_flower->command) {
 	case FLOW_CLS_REPLACE:
 		return iavf_configure_clsflower(adapter, cls_flower);
@@ -3087,6 +3084,11 @@ static int iavf_setup_tc_cls_flower(struct iavf_adapter *adapter,
 static int iavf_setup_tc_block_cb(enum tc_setup_type type, void *type_data,
 				  void *cb_priv)
 {
+	struct iavf_adapter *adapter = cb_priv;
+
+	if (!tc_cls_can_offload_basic(adapter->netdev, type_data))
+		return -EOPNOTSUPP;
+
 	switch (type) {
 	case TC_SETUP_CLSFLOWER:
 		return iavf_setup_tc_cls_flower(cb_priv, type_data);
-- 
2.21.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [patch net-next 03/10] flow_offload: Introduce offload of HW stats type
  2020-02-21  9:56 [patch net-next 00/10] net: allow user specify TC filter HW stats type Jiri Pirko
  2020-02-21  9:56 ` [patch net-next 01/10] net: rename tc_cls_can_offload_and_chain0() to tc_cls_can_offload_basic() Jiri Pirko
  2020-02-21  9:56 ` [patch net-next 02/10] iavf: use tc_cls_can_offload_basic() instead of chain check Jiri Pirko
@ 2020-02-21  9:56 ` Jiri Pirko
  2020-02-21  9:56 ` [patch net-next 04/10] net: extend tc_cls_can_offload_basic() to check " Jiri Pirko
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: Jiri Pirko @ 2020-02-21  9:56 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, saeedm, leon, michael.chan, vishal,
	jeffrey.t.kirsher, idosch, aelior, peppe.cavallaro,
	alexandre.torgue, jhs, xiyou.wangcong, pablo, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

Initially, pass "ANY" (struct is zeroed) to the drivers as that is the
current implicit value coming down to flow_offload.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/flow_offload.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index c6f7bd22db60..34e1e7832cc3 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -350,10 +350,15 @@ enum flow_cls_command {
 	FLOW_CLS_TMPLT_DESTROY,
 };
 
+enum flow_cls_hw_stats_type {
+	FLOW_CLS_HW_STATS_TYPE_ANY,
+};
+
 struct flow_cls_common_offload {
 	u32 chain_index;
 	__be16 protocol;
 	u32 prio;
+	enum flow_cls_hw_stats_type hw_stats_type;
 	struct netlink_ext_ack *extack;
 };
 
-- 
2.21.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [patch net-next 04/10] net: extend tc_cls_can_offload_basic() to check HW stats type
  2020-02-21  9:56 [patch net-next 00/10] net: allow user specify TC filter HW stats type Jiri Pirko
                   ` (2 preceding siblings ...)
  2020-02-21  9:56 ` [patch net-next 03/10] flow_offload: Introduce offload of HW stats type Jiri Pirko
@ 2020-02-21  9:56 ` Jiri Pirko
  2020-02-21  9:56 ` [patch net-next 05/10] mlx5: restrict supported HW stats type to "any" Jiri Pirko
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: Jiri Pirko @ 2020-02-21  9:56 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, saeedm, leon, michael.chan, vishal,
	jeffrey.t.kirsher, idosch, aelior, peppe.cavallaro,
	alexandre.torgue, jhs, xiyou.wangcong, pablo, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

As the default type of HW stats is "any", extend
tc_cls_can_offload_basic() helper to check for it and don't allow any
other type to be handled by the drivers.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/pkt_cls.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 779364ed080a..d3d90f714a66 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -594,6 +594,11 @@ tc_cls_can_offload_basic(const struct net_device *dev,
 			       "Driver supports only offload of chain 0");
 		return false;
 	}
+	if (common->hw_stats_type != FLOW_CLS_HW_STATS_TYPE_ANY) {
+		NL_SET_ERR_MSG(common->extack,
+			       "Driver supports only default HW stats type \"any\"");
+		return false;
+	}
 	return true;
 }
 
-- 
2.21.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [patch net-next 05/10] mlx5: restrict supported HW stats type to "any"
  2020-02-21  9:56 [patch net-next 00/10] net: allow user specify TC filter HW stats type Jiri Pirko
                   ` (3 preceding siblings ...)
  2020-02-21  9:56 ` [patch net-next 04/10] net: extend tc_cls_can_offload_basic() to check " Jiri Pirko
@ 2020-02-21  9:56 ` Jiri Pirko
  2020-02-21  9:56 ` [patch net-next 06/10] mlxsw: " Jiri Pirko
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: Jiri Pirko @ 2020-02-21  9:56 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, saeedm, leon, michael.chan, vishal,
	jeffrey.t.kirsher, idosch, aelior, peppe.cavallaro,
	alexandre.torgue, jhs, xiyou.wangcong, pablo, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

Currently don't allow rules with any other type to be inserted.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index bcfe2b6e35e5..39bbd9675ae4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -3918,6 +3918,11 @@ mlx5e_tc_add_flow(struct mlx5e_priv *priv,
 	if (!tc_can_offload_extack(priv->netdev, f->common.extack))
 		return -EOPNOTSUPP;
 
+	if (f->common.hw_stats_type != FLOW_CLS_HW_STATS_TYPE_ANY) {
+		NL_SET_ERR_MSG_MOD(f->common.extack, "Unsupported HW stats type");
+		return -EOPNOTSUPP;
+	}
+
 	if (esw && esw->mode == MLX5_ESWITCH_OFFLOADS)
 		err = mlx5e_add_fdb_flow(priv, f, flow_flags,
 					 filter_dev, flow);
-- 
2.21.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [patch net-next 06/10] mlxsw: restrict supported HW stats type to "any"
  2020-02-21  9:56 [patch net-next 00/10] net: allow user specify TC filter HW stats type Jiri Pirko
                   ` (4 preceding siblings ...)
  2020-02-21  9:56 ` [patch net-next 05/10] mlx5: restrict supported HW stats type to "any" Jiri Pirko
@ 2020-02-21  9:56 ` Jiri Pirko
  2020-02-21  9:56 ` [patch net-next 07/10] flow_offload: introduce "immediate" HW stats type and allow it in mlxsw Jiri Pirko
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: Jiri Pirko @ 2020-02-21  9:56 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, saeedm, leon, michael.chan, vishal,
	jeffrey.t.kirsher, idosch, aelior, peppe.cavallaro,
	alexandre.torgue, jhs, xiyou.wangcong, pablo, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

Currently don't allow rules with any other type to be inserted.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 .../ethernet/mellanox/mlxsw/spectrum_flower.c | 28 +++++++++++++------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
index b607919c8ad0..ef0799a539d2 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
@@ -14,11 +14,13 @@
 #include "spectrum.h"
 #include "core_acl_flex_keys.h"
 
-static int mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
-					 struct mlxsw_sp_acl_block *block,
-					 struct mlxsw_sp_acl_rule_info *rulei,
-					 struct flow_action *flow_action,
-					 struct netlink_ext_ack *extack)
+static int
+mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
+			      struct mlxsw_sp_acl_block *block,
+			      struct mlxsw_sp_acl_rule_info *rulei,
+			      struct flow_action *flow_action,
+			      enum flow_cls_hw_stats_type hw_stats_type,
+			      struct netlink_ext_ack *extack)
 {
 	const struct flow_action_entry *act;
 	int mirror_act_count = 0;
@@ -27,10 +29,17 @@ static int mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
 	if (!flow_action_has_entries(flow_action))
 		return 0;
 
-	/* Count action is inserted first */
-	err = mlxsw_sp_acl_rulei_act_count(mlxsw_sp, rulei, extack);
-	if (err)
-		return err;
+	switch (hw_stats_type) {
+	case FLOW_CLS_HW_STATS_TYPE_ANY:
+		/* Count action is inserted first */
+		err = mlxsw_sp_acl_rulei_act_count(mlxsw_sp, rulei, extack);
+		if (err)
+			return err;
+		break;
+	default:
+		NL_SET_ERR_MSG_MOD(extack, "Unsupported HW stats type");
+		return -EOPNOTSUPP;
+	}
 
 	flow_action_for_each(i, act, flow_action) {
 		switch (act->id) {
@@ -449,6 +458,7 @@ static int mlxsw_sp_flower_parse(struct mlxsw_sp *mlxsw_sp,
 
 	return mlxsw_sp_flower_parse_actions(mlxsw_sp, block, rulei,
 					     &f->rule->action,
+					     f->common.hw_stats_type,
 					     f->common.extack);
 }
 
-- 
2.21.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [patch net-next 07/10] flow_offload: introduce "immediate" HW stats type and allow it in mlxsw
  2020-02-21  9:56 [patch net-next 00/10] net: allow user specify TC filter HW stats type Jiri Pirko
                   ` (5 preceding siblings ...)
  2020-02-21  9:56 ` [patch net-next 06/10] mlxsw: " Jiri Pirko
@ 2020-02-21  9:56 ` Jiri Pirko
  2020-02-21  9:56 ` [patch net-next 08/10] flow_offload: introduce "delayed" HW stats type and allow it in mlx5 Jiri Pirko
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: Jiri Pirko @ 2020-02-21  9:56 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, saeedm, leon, michael.chan, vishal,
	jeffrey.t.kirsher, idosch, aelior, peppe.cavallaro,
	alexandre.torgue, jhs, xiyou.wangcong, pablo, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

Introduce new type for immediate HW stats and allow the value in
mlxsw offload.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c | 3 ++-
 include/net/flow_offload.h                            | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
index ef0799a539d2..a3e9f72f50de 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
@@ -30,7 +30,8 @@ mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
 		return 0;
 
 	switch (hw_stats_type) {
-	case FLOW_CLS_HW_STATS_TYPE_ANY:
+	case FLOW_CLS_HW_STATS_TYPE_ANY: /* fall-through */
+	case FLOW_CLS_HW_STATS_TYPE_IMMEDIATE:
 		/* Count action is inserted first */
 		err = mlxsw_sp_acl_rulei_act_count(mlxsw_sp, rulei, extack);
 		if (err)
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index 34e1e7832cc3..b1a182941f1d 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -352,6 +352,7 @@ enum flow_cls_command {
 
 enum flow_cls_hw_stats_type {
 	FLOW_CLS_HW_STATS_TYPE_ANY,
+	FLOW_CLS_HW_STATS_TYPE_IMMEDIATE,
 };
 
 struct flow_cls_common_offload {
-- 
2.21.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [patch net-next 08/10] flow_offload: introduce "delayed" HW stats type and allow it in mlx5
  2020-02-21  9:56 [patch net-next 00/10] net: allow user specify TC filter HW stats type Jiri Pirko
                   ` (6 preceding siblings ...)
  2020-02-21  9:56 ` [patch net-next 07/10] flow_offload: introduce "immediate" HW stats type and allow it in mlxsw Jiri Pirko
@ 2020-02-21  9:56 ` Jiri Pirko
  2020-02-21  9:56 ` [patch net-next 09/10] flow_offload: introduce "disabled" HW stats type and allow it in mlxsw Jiri Pirko
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: Jiri Pirko @ 2020-02-21  9:56 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, saeedm, leon, michael.chan, vishal,
	jeffrey.t.kirsher, idosch, aelior, peppe.cavallaro,
	alexandre.torgue, jhs, xiyou.wangcong, pablo, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

Introduce new type for delayed HW stats and allow the value in
mlxsw offload.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 3 ++-
 include/net/flow_offload.h                      | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 39bbd9675ae4..d48c07bf9711 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -3918,7 +3918,8 @@ mlx5e_tc_add_flow(struct mlx5e_priv *priv,
 	if (!tc_can_offload_extack(priv->netdev, f->common.extack))
 		return -EOPNOTSUPP;
 
-	if (f->common.hw_stats_type != FLOW_CLS_HW_STATS_TYPE_ANY) {
+	if (f->common.hw_stats_type != FLOW_CLS_HW_STATS_TYPE_ANY &&
+	    f->common.hw_stats_type != FLOW_CLS_HW_STATS_TYPE_DELAYED) {
 		NL_SET_ERR_MSG_MOD(f->common.extack, "Unsupported HW stats type");
 		return -EOPNOTSUPP;
 	}
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index b1a182941f1d..4cff09890725 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -353,6 +353,7 @@ enum flow_cls_command {
 enum flow_cls_hw_stats_type {
 	FLOW_CLS_HW_STATS_TYPE_ANY,
 	FLOW_CLS_HW_STATS_TYPE_IMMEDIATE,
+	FLOW_CLS_HW_STATS_TYPE_DELAYED,
 };
 
 struct flow_cls_common_offload {
-- 
2.21.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [patch net-next 09/10] flow_offload: introduce "disabled" HW stats type and allow it in mlxsw
  2020-02-21  9:56 [patch net-next 00/10] net: allow user specify TC filter HW stats type Jiri Pirko
                   ` (7 preceding siblings ...)
  2020-02-21  9:56 ` [patch net-next 08/10] flow_offload: introduce "delayed" HW stats type and allow it in mlx5 Jiri Pirko
@ 2020-02-21  9:56 ` Jiri Pirko
  2020-02-21  9:56 ` [patch net-next 10/10] sched: cls_flower: allow user to specify type of HW stats for a filter Jiri Pirko
  2020-02-21 18:22 ` [patch net-next 00/10] net: allow user specify TC filter HW stats type Jakub Kicinski
  10 siblings, 0 replies; 29+ messages in thread
From: Jiri Pirko @ 2020-02-21  9:56 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, saeedm, leon, michael.chan, vishal,
	jeffrey.t.kirsher, idosch, aelior, peppe.cavallaro,
	alexandre.torgue, jhs, xiyou.wangcong, pablo, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

Introduce new type for disabled HW stats and allow the value in
mlxsw offload.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c | 2 ++
 include/net/flow_offload.h                            | 1 +
 2 files changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
index a3e9f72f50de..108a96901721 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
@@ -37,6 +37,8 @@ mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
 		if (err)
 			return err;
 		break;
+	case FLOW_CLS_HW_STATS_TYPE_DISABLED:
+		break;
 	default:
 		NL_SET_ERR_MSG_MOD(extack, "Unsupported HW stats type");
 		return -EOPNOTSUPP;
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index 4cff09890725..447e039762dd 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -354,6 +354,7 @@ enum flow_cls_hw_stats_type {
 	FLOW_CLS_HW_STATS_TYPE_ANY,
 	FLOW_CLS_HW_STATS_TYPE_IMMEDIATE,
 	FLOW_CLS_HW_STATS_TYPE_DELAYED,
+	FLOW_CLS_HW_STATS_TYPE_DISABLED,
 };
 
 struct flow_cls_common_offload {
-- 
2.21.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [patch net-next 10/10] sched: cls_flower: allow user to specify type of HW stats for a filter
  2020-02-21  9:56 [patch net-next 00/10] net: allow user specify TC filter HW stats type Jiri Pirko
                   ` (8 preceding siblings ...)
  2020-02-21  9:56 ` [patch net-next 09/10] flow_offload: introduce "disabled" HW stats type and allow it in mlxsw Jiri Pirko
@ 2020-02-21  9:56 ` Jiri Pirko
  2020-02-21 18:22 ` [patch net-next 00/10] net: allow user specify TC filter HW stats type Jakub Kicinski
  10 siblings, 0 replies; 29+ messages in thread
From: Jiri Pirko @ 2020-02-21  9:56 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, saeedm, leon, michael.chan, vishal,
	jeffrey.t.kirsher, idosch, aelior, peppe.cavallaro,
	alexandre.torgue, jhs, xiyou.wangcong, pablo, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

Currently, user who is adding a filter expects HW report stats, however
it does not have exact expectations about the stats types. That is
aligned with TCA_CLS_HW_STATS_TYPE_ANY.

Allow user to specify type of HW stats for a filter and require a type.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/pkt_cls.h        | 28 ++++++++++++++++++++++++++++
 include/uapi/linux/pkt_cls.h | 27 +++++++++++++++++++++++++++
 net/sched/cls_flower.c       | 12 ++++++++++++
 3 files changed, 67 insertions(+)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index d3d90f714a66..18044ea7c246 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -631,6 +631,34 @@ static inline bool tc_in_hw(u32 flags)
 	return (flags & TCA_CLS_FLAGS_IN_HW) ? true : false;
 }
 
+static inline enum tca_cls_hw_stats_type
+tc_hw_stats_type_get(struct nlattr *hw_stats_type_attr)
+{
+	/* If the user did not pass the attr, that means he does
+	 * not care about the type. Return "any" in that case.
+	 */
+	return hw_stats_type_attr ? nla_get_u8(hw_stats_type_attr) :
+				    TCA_CLS_HW_STATS_TYPE_ANY;
+}
+
+static inline enum flow_cls_hw_stats_type
+tc_flow_cls_hw_stats_type(enum tca_cls_hw_stats_type hw_stats_type)
+{
+	switch (hw_stats_type) {
+	default:
+		WARN_ON(1);
+		/* fall-through */
+	case TCA_CLS_HW_STATS_TYPE_ANY:
+		return FLOW_CLS_HW_STATS_TYPE_ANY;
+	case TCA_CLS_HW_STATS_TYPE_IMMEDIATE:
+		return FLOW_CLS_HW_STATS_TYPE_IMMEDIATE;
+	case TCA_CLS_HW_STATS_TYPE_DELAYED:
+		return FLOW_CLS_HW_STATS_TYPE_DELAYED;
+	case TCA_CLS_HW_STATS_TYPE_DISABLED:
+		return FLOW_CLS_HW_STATS_TYPE_DISABLED;
+	}
+}
+
 static inline void
 tc_cls_common_offload_init(struct flow_cls_common_offload *cls_common,
 			   const struct tcf_proto *tp, u32 flags,
diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index 449a63971451..7f89954c2998 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -180,6 +180,31 @@ enum {
 #define TCA_CLS_FLAGS_NOT_IN_HW (1 << 3) /* filter isn't offloaded to HW */
 #define TCA_CLS_FLAGS_VERBOSE	(1 << 4) /* verbose logging */
 
+/* tca HW stats type */
+enum tca_cls_hw_stats_type {
+	TCA_CLS_HW_STATS_TYPE_ANY, /* User does not care, it's default
+				    * when user does not pass the attr.
+				    * Instructs the driver that user does not
+				    * care if the HW stats are "immediate"
+				    * or "delayed".
+				    */
+	TCA_CLS_HW_STATS_TYPE_IMMEDIATE, /* Means that in dump, user gets
+					  * the current HW stats state from
+					  * the device queried at the dump time.
+					  */
+	TCA_CLS_HW_STATS_TYPE_DELAYED, /* Means that in dump, user gets
+					* HW stats that might be out of date
+					* for some time, maybe couple of
+					* seconds. This is the case when driver
+					* polls stats updates periodically
+					* or when it gets async stats update
+					* from the device.
+					*/
+	TCA_CLS_HW_STATS_TYPE_DISABLED, /* User is not interested in getting
+					 * any HW statistics.
+					 */
+};
+
 /* U32 filters */
 
 #define TC_U32_HTID(h) ((h)&0xFFF00000)
@@ -553,6 +578,8 @@ enum {
 	TCA_FLOWER_KEY_CT_LABELS,	/* u128 */
 	TCA_FLOWER_KEY_CT_LABELS_MASK,	/* u128 */
 
+	TCA_FLOWER_HW_STATS_TYPE,	/* u8 */
+
 	__TCA_FLOWER_MAX,
 };
 
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 726fc9c5910f..91171266ac26 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -113,6 +113,7 @@ struct cls_fl_filter {
 	u32 handle;
 	u32 flags;
 	u32 in_hw_count;
+	enum tca_cls_hw_stats_type hw_stats_type;
 	struct rcu_work rwork;
 	struct net_device *hw_dev;
 	/* Flower classifier is unlocked, which means that its reference counter
@@ -442,6 +443,8 @@ static int fl_hw_replace_filter(struct tcf_proto *tp,
 		return -ENOMEM;
 
 	tc_cls_common_offload_init(&cls_flower.common, tp, f->flags, extack);
+	cls_flower.common.hw_stats_type =
+		tc_flow_cls_hw_stats_type(f->hw_stats_type);
 	cls_flower.command = FLOW_CLS_REPLACE;
 	cls_flower.cookie = (unsigned long) f;
 	cls_flower.rule->match.dissector = &f->mask->dissector;
@@ -691,6 +694,7 @@ static const struct nla_policy fl_policy[TCA_FLOWER_MAX + 1] = {
 	[TCA_FLOWER_KEY_CT_LABELS_MASK]	= { .type = NLA_BINARY,
 					    .len = 128 / BITS_PER_BYTE },
 	[TCA_FLOWER_FLAGS]		= { .type = NLA_U32 },
+	[TCA_FLOWER_HW_STATS_TYPE]	= { .type = NLA_U8 },
 };
 
 static const struct nla_policy
@@ -1774,6 +1778,9 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 		}
 	}
 
+	fnew->hw_stats_type =
+		tc_hw_stats_type_get(tb[TCA_FLOWER_HW_STATS_TYPE]);
+
 	err = fl_set_parms(net, tp, fnew, mask, base, tb, tca[TCA_RATE], ovr,
 			   tp->chain->tmplt_priv, rtnl_held, extack);
 	if (err)
@@ -1992,6 +1999,8 @@ static int fl_reoffload(struct tcf_proto *tp, bool add, flow_setup_cb_t *cb,
 
 		tc_cls_common_offload_init(&cls_flower.common, tp, f->flags,
 					   extack);
+		cls_flower.common.hw_stats_type =
+			tc_flow_cls_hw_stats_type(f->hw_stats_type);
 		cls_flower.command = add ?
 			FLOW_CLS_REPLACE : FLOW_CLS_DESTROY;
 		cls_flower.cookie = (unsigned long)f;
@@ -2714,6 +2723,9 @@ static int fl_dump(struct net *net, struct tcf_proto *tp, void *fh,
 	if (f->flags && nla_put_u32(skb, TCA_FLOWER_FLAGS, f->flags))
 		goto nla_put_failure_locked;
 
+	if (nla_put_u8(skb, TCA_FLOWER_HW_STATS_TYPE, f->hw_stats_type))
+		goto nla_put_failure_locked;
+
 	spin_unlock(&tp->lock);
 
 	if (!skip_hw)
-- 
2.21.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [patch net-next 00/10] net: allow user specify TC filter HW stats type
  2020-02-21  9:56 [patch net-next 00/10] net: allow user specify TC filter HW stats type Jiri Pirko
                   ` (9 preceding siblings ...)
  2020-02-21  9:56 ` [patch net-next 10/10] sched: cls_flower: allow user to specify type of HW stats for a filter Jiri Pirko
@ 2020-02-21 18:22 ` Jakub Kicinski
  2020-02-22  6:38   ` Jiri Pirko
  10 siblings, 1 reply; 29+ messages in thread
From: Jakub Kicinski @ 2020-02-21 18:22 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, saeedm, leon, michael.chan, vishal,
	jeffrey.t.kirsher, idosch, aelior, peppe.cavallaro,
	alexandre.torgue, jhs, xiyou.wangcong, pablo, mlxsw, Edward Cree

On Fri, 21 Feb 2020 10:56:33 +0100 Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> Currently, when user adds a TC filter and the filter gets offloaded,
> the user expects the HW stats to be counted and included in stats dump.
> However, since drivers may implement different types of counting, there
> is no way to specify which one the user is interested in.
> 
> For example for mlx5, only delayed counters are available as the driver
> periodically polls for updated stats.
> 
> In case of mlxsw, the counters are queried on dump time. However, the
> HW resources for this type of counters is quite limited (couple of
> thousands). This limits the amount of supported offloaded filters
> significantly. Without counter assigned, the HW is capable to carry
> millions of those.
> 
> On top of that, mlxsw HW is able to support delayed counters as well in
> greater numbers. That is going to be added in a follow-up patch.
> 
> This patchset allows user to specify one of the following types of HW
> stats for added fitler:
> any - current default, user does not care about the type, just expects
>       any type of stats.
> immediate - queried during dump time
> delayed - polled from HW periodically or sent by HW in async manner
> disabled - no stats needed

Hmm, but the statistics are on actions, it feels a little bit like we
are perpetuating the mistake of counting on filters here.

Would it not work to share actions between filters which don't need
fine grained stats if HW can do more filters than stats?

Let's CC Ed on this.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch net-next 00/10] net: allow user specify TC filter HW stats type
  2020-02-21 18:22 ` [patch net-next 00/10] net: allow user specify TC filter HW stats type Jakub Kicinski
@ 2020-02-22  6:38   ` Jiri Pirko
  2020-02-24 11:38     ` Edward Cree
  0 siblings, 1 reply; 29+ messages in thread
From: Jiri Pirko @ 2020-02-22  6:38 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, davem, saeedm, leon, michael.chan, vishal,
	jeffrey.t.kirsher, idosch, aelior, peppe.cavallaro,
	alexandre.torgue, jhs, xiyou.wangcong, pablo, mlxsw, Edward Cree

Fri, Feb 21, 2020 at 07:22:00PM CET, kuba@kernel.org wrote:
>On Fri, 21 Feb 2020 10:56:33 +0100 Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@mellanox.com>
>> 
>> Currently, when user adds a TC filter and the filter gets offloaded,
>> the user expects the HW stats to be counted and included in stats dump.
>> However, since drivers may implement different types of counting, there
>> is no way to specify which one the user is interested in.
>> 
>> For example for mlx5, only delayed counters are available as the driver
>> periodically polls for updated stats.
>> 
>> In case of mlxsw, the counters are queried on dump time. However, the
>> HW resources for this type of counters is quite limited (couple of
>> thousands). This limits the amount of supported offloaded filters
>> significantly. Without counter assigned, the HW is capable to carry
>> millions of those.
>> 
>> On top of that, mlxsw HW is able to support delayed counters as well in
>> greater numbers. That is going to be added in a follow-up patch.
>> 
>> This patchset allows user to specify one of the following types of HW
>> stats for added fitler:
>> any - current default, user does not care about the type, just expects
>>       any type of stats.
>> immediate - queried during dump time
>> delayed - polled from HW periodically or sent by HW in async manner
>> disabled - no stats needed
>
>Hmm, but the statistics are on actions, it feels a little bit like we
>are perpetuating the mistake of counting on filters here.

You are right, the stats in tc are per-action. However, in both mlxsw
and mlx5, the stats are per-filter. What hw_stats does is that it
basically allows the user to set the type for all the actions in the
filter at once. 

Could you imagine a usecase that would use different HW stats type for
different actions in one action-chain?

Plus, if the fine grained setup is ever needed, the hw_stats could be in
future easilyt extended to be specified per-action too overriding
the per-filter setup only for specific action. What do you think?


>
>Would it not work to share actions between filters which don't need
>fine grained stats if HW can do more filters than stats?

For mlxsw it would work, if the action chain would be identical. For
that you don't need the per-action granularity of setting type hw_stats.

>
>Let's CC Ed on this.
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch net-next 00/10] net: allow user specify TC filter HW stats type
  2020-02-22  6:38   ` Jiri Pirko
@ 2020-02-24 11:38     ` Edward Cree
  2020-02-24 13:11       ` Jiri Pirko
  0 siblings, 1 reply; 29+ messages in thread
From: Edward Cree @ 2020-02-24 11:38 UTC (permalink / raw)
  To: Jiri Pirko, Jakub Kicinski
  Cc: netdev, davem, saeedm, leon, michael.chan, vishal,
	jeffrey.t.kirsher, idosch, aelior, peppe.cavallaro,
	alexandre.torgue, jhs, xiyou.wangcong, pablo, mlxsw

On 22/02/2020 06:38, Jiri Pirko wrote:
> Fri, Feb 21, 2020 at 07:22:00PM CET, kuba@kernel.org wrote:
>> Hmm, but the statistics are on actions, it feels a little bit like we
>> are perpetuating the mistake of counting on filters here.
> You are right, the stats in tc are per-action. However, in both mlxsw
> and mlx5, the stats are per-filter. What hw_stats does is that it
> basically allows the user to set the type for all the actions in the
> filter at once.
>
> Could you imagine a usecase that would use different HW stats type for
> different actions in one action-chain?
Potentially a user could only want stats on one action and disable them
 on another.  For instance, if their action chain does delivery to the
 'customer' and also mirrors the packets for monitoring, they might only
 want stats on the first delivery (e.g. for billing) and not want to
 waste a counter on the mirror.

> Plus, if the fine grained setup is ever needed, the hw_stats could be in
> future easilyt extended to be specified per-action too overriding
> the per-filter setup only for specific action. What do you think?
I think this is improper; the stats type should be defined per-action in
 the uapi, even if userland initially only has UI to set it across the
 entire filter.  (I imagine `tc action` would grow the corresponding UI
 pretty quickly.)  Then on the driver side, you must determine whether
 your hardware can support the user's request, and if not, return an
 appropriate error code.

Let's try not to encourage people (users, future HW & driver developers)
 to keep thinking of stats as per-filter entities.

-ed

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch net-next 00/10] net: allow user specify TC filter HW stats type
  2020-02-24 11:38     ` Edward Cree
@ 2020-02-24 13:11       ` Jiri Pirko
  2020-02-24 15:45         ` Jamal Hadi Salim
  0 siblings, 1 reply; 29+ messages in thread
From: Jiri Pirko @ 2020-02-24 13:11 UTC (permalink / raw)
  To: Edward Cree
  Cc: Jakub Kicinski, netdev, davem, saeedm, leon, michael.chan,
	vishal, jeffrey.t.kirsher, idosch, aelior, peppe.cavallaro,
	alexandre.torgue, jhs, xiyou.wangcong, pablo, mlxsw

Mon, Feb 24, 2020 at 12:38:20PM CET, ecree@solarflare.com wrote:
>On 22/02/2020 06:38, Jiri Pirko wrote:
>> Fri, Feb 21, 2020 at 07:22:00PM CET, kuba@kernel.org wrote:
>>> Hmm, but the statistics are on actions, it feels a little bit like we
>>> are perpetuating the mistake of counting on filters here.
>> You are right, the stats in tc are per-action. However, in both mlxsw
>> and mlx5, the stats are per-filter. What hw_stats does is that it
>> basically allows the user to set the type for all the actions in the
>> filter at once.
>>
>> Could you imagine a usecase that would use different HW stats type for
>> different actions in one action-chain?
>Potentially a user could only want stats on one action and disable them
> on another.  For instance, if their action chain does delivery to the
> 'customer' and also mirrors the packets for monitoring, they might only
> want stats on the first delivery (e.g. for billing) and not want to
> waste a counter on the mirror.

Okay.


>
>> Plus, if the fine grained setup is ever needed, the hw_stats could be in
>> future easilyt extended to be specified per-action too overriding
>> the per-filter setup only for specific action. What do you think?
>I think this is improper; the stats type should be defined per-action in
> the uapi, even if userland initially only has UI to set it across the
> entire filter.  (I imagine `tc action` would grow the corresponding UI
> pretty quickly.)  Then on the driver side, you must determine whether
> your hardware can support the user's request, and if not, return an
> appropriate error code.
>
>Let's try not to encourage people (users, future HW & driver developers)
> to keep thinking of stats as per-filter entities.

Okay, but in that case, it does not make sense to have "UI to set it
across the entire filter". The UI would have to set it per-action too.
Othewise it would make people think it is per-filter entity.
But I guess the tc cmdline is chatty already an it can take other
repetitive cmdline options.

What do you think?


Thanks!

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch net-next 00/10] net: allow user specify TC filter HW stats type
  2020-02-24 13:11       ` Jiri Pirko
@ 2020-02-24 15:45         ` Jamal Hadi Salim
  2020-02-24 15:50           ` Edward Cree
  2020-02-24 16:25           ` Jiri Pirko
  0 siblings, 2 replies; 29+ messages in thread
From: Jamal Hadi Salim @ 2020-02-24 15:45 UTC (permalink / raw)
  To: Jiri Pirko, Edward Cree
  Cc: Jakub Kicinski, netdev, davem, saeedm, leon, michael.chan,
	vishal, jeffrey.t.kirsher, idosch, aelior, peppe.cavallaro,
	alexandre.torgue, xiyou.wangcong, pablo, mlxsw

On 2020-02-24 8:11 a.m., Jiri Pirko wrote:
> Mon, Feb 24, 2020 at 12:38:20PM CET, ecree@solarflare.com wrote:
>> On 22/02/2020 06:38, Jiri Pirko wrote:

[..]
>> Potentially a user could only want stats on one action and disable them
>>   on another.  For instance, if their action chain does delivery to the
>>   'customer' and also mirrors the packets for monitoring, they might only
>>   want stats on the first delivery (e.g. for billing) and not want to
>>   waste a counter on the mirror.
> 
> Okay.
> 

+1  very important for telco billing use cases i am familiar with.

Ancient ACL implementations that had only one filter and
one action (drop/accept) didnt need more than one counter.
But not anymore in my opinion.

There's also a requirement for the concept of "sharing" - think
"family plans" or "small bussiness plan".
Counters may be shared across multiple filter-action chains for example.

> 
>>
>>> Plus, if the fine grained setup is ever needed, the hw_stats could be in
>>> future easilyt extended to be specified per-action too overriding
>>> the per-filter setup only for specific action. What do you think?
>> I think this is improper; the stats type should be defined per-action in
>>   the uapi, even if userland initially only has UI to set it across the
>>   entire filter.  (I imagine `tc action` would grow the corresponding UI
>>   pretty quickly.)  Then on the driver side, you must determine whether
>>   your hardware can support the user's request, and if not, return an
>>   appropriate error code.
>>
>> Let's try not to encourage people (users, future HW & driver developers)
>>   to keep thinking of stats as per-filter entities.
> > Okay, but in that case, it does not make sense to have "UI to set it
> across the entire filter". The UI would have to set it per-action too.
> Othewise it would make people think it is per-filter entity.
> But I guess the tc cmdline is chatty already an it can take other
> repetitive cmdline options.
> 

I normally visualize policy as a dag composed of filter(s) +
actions. The UI and uAPI has to be able to express that.

I am not sure if mlxsw works this way, but:
Most hardware i have encountered tends to have a separate
stats/counter table. The stats table is indexed.

Going backwards and looking at your example in this stanza:
---
   in_hw in_hw_count 2
   hw_stats immediate
         action order 1: gact action drop
          random type none pass val 0
          index 1 ref 1 bind 1 installed 14 sec used 7 sec
         Action statistics:
----

Guessing from "in_hw in_hw_count 2" - 2 is a hw stats table index?
If you have enough counters in hardware, the "stats table index"
essentially can be directly mapped to the action "index" attribute.

Sharing then becomes a matter of specifying the same drop action
with the correct index across multiple filters.

If you dont have enough hw counters - then perhaps a scheme to show
separate hardware counter index and software counter index (aka action
index) is needed.

cheers,
jamal

> What do you think?
> 
> 
> Thanks!
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch net-next 00/10] net: allow user specify TC filter HW stats type
  2020-02-24 15:45         ` Jamal Hadi Salim
@ 2020-02-24 15:50           ` Edward Cree
  2020-02-24 15:55             ` Jamal Hadi Salim
  2020-02-24 16:25           ` Jiri Pirko
  1 sibling, 1 reply; 29+ messages in thread
From: Edward Cree @ 2020-02-24 15:50 UTC (permalink / raw)
  To: Jamal Hadi Salim, Jiri Pirko
  Cc: Jakub Kicinski, netdev, davem, saeedm, leon, michael.chan,
	vishal, jeffrey.t.kirsher, idosch, aelior, peppe.cavallaro,
	alexandre.torgue, xiyou.wangcong, pablo, mlxsw

On 24/02/2020 15:45, Jamal Hadi Salim wrote:
> Going backwards and looking at your example in this stanza:
> ---
>   in_hw in_hw_count 2
>   hw_stats immediate
>         action order 1: gact action drop
>          random type none pass val 0
>          index 1 ref 1 bind 1 installed 14 sec used 7 sec
>         Action statistics:
> ----
>
> Guessing from "in_hw in_hw_count 2" - 2 is a hw stats table index?
AIUI in_hw_count is a reference count of hardware devices that have
 offloaded the rule.  Nothing to do with stats "counters".

-ed

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch net-next 00/10] net: allow user specify TC filter HW stats type
  2020-02-24 15:50           ` Edward Cree
@ 2020-02-24 15:55             ` Jamal Hadi Salim
  0 siblings, 0 replies; 29+ messages in thread
From: Jamal Hadi Salim @ 2020-02-24 15:55 UTC (permalink / raw)
  To: Edward Cree, Jiri Pirko
  Cc: Jakub Kicinski, netdev, davem, saeedm, leon, michael.chan,
	vishal, jeffrey.t.kirsher, idosch, aelior, peppe.cavallaro,
	alexandre.torgue, xiyou.wangcong, pablo, mlxsw

On 2020-02-24 10:50 a.m., Edward Cree wrote:
> On 24/02/2020 15:45, Jamal Hadi Salim wrote:
>> Going backwards and looking at your example in this stanza:
>> ---
>>    in_hw in_hw_count 2
>>    hw_stats immediate
>>          action order 1: gact action drop
>>           random type none pass val 0
>>           index 1 ref 1 bind 1 installed 14 sec used 7 sec
>>          Action statistics:
>> ----
>>
>> Guessing from "in_hw in_hw_count 2" - 2 is a hw stats table index?
> AIUI in_hw_count is a reference count of hardware devices that have
>   offloaded the rule.  Nothing to do with stats "counters".

ah, ok;->. A more descriptive noun (hw_ref_count) would be nice.

Is it fair to assume that there's some form of stats table which is
indexed? And that the hw table index is accessible to the driver?

cheers,
jamal

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch net-next 00/10] net: allow user specify TC filter HW stats type
  2020-02-24 15:45         ` Jamal Hadi Salim
  2020-02-24 15:50           ` Edward Cree
@ 2020-02-24 16:25           ` Jiri Pirko
  2020-02-25 16:01             ` Jamal Hadi Salim
  2020-02-27 15:57             ` Edward Cree
  1 sibling, 2 replies; 29+ messages in thread
From: Jiri Pirko @ 2020-02-24 16:25 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Edward Cree, Jakub Kicinski, netdev, davem, saeedm, leon,
	michael.chan, vishal, jeffrey.t.kirsher, idosch, aelior,
	peppe.cavallaro, alexandre.torgue, xiyou.wangcong, pablo, mlxsw

Mon, Feb 24, 2020 at 04:45:57PM CET, jhs@mojatatu.com wrote:
>On 2020-02-24 8:11 a.m., Jiri Pirko wrote:
>> Mon, Feb 24, 2020 at 12:38:20PM CET, ecree@solarflare.com wrote:
>> > On 22/02/2020 06:38, Jiri Pirko wrote:
>
>[..]
>> > Potentially a user could only want stats on one action and disable them
>> >   on another.  For instance, if their action chain does delivery to the
>> >   'customer' and also mirrors the packets for monitoring, they might only
>> >   want stats on the first delivery (e.g. for billing) and not want to
>> >   waste a counter on the mirror.
>> 
>> Okay.
>> 
>
>+1  very important for telco billing use cases i am familiar with.
>
>Ancient ACL implementations that had only one filter and
>one action (drop/accept) didnt need more than one counter.
>But not anymore in my opinion.
>
>There's also a requirement for the concept of "sharing" - think
>"family plans" or "small bussiness plan".
>Counters may be shared across multiple filter-action chains for example.

In hardware, we have a separate "counter" action with counter index.
You can reuse this index in multiple counter action instances.
However, in tc there is implicit separate counter for every action.

The counter is limited resource. So we overcome this mismatch in mlxsw
by having action "counter" always first for every rule inserted:
rule->action_counter,the_actual_action,the_actual_action2,...the_actual_actionN

and we report stats from action_counter for all the_actual_actionX.


>
>> 
>> > 
>> > > Plus, if the fine grained setup is ever needed, the hw_stats could be in
>> > > future easilyt extended to be specified per-action too overriding
>> > > the per-filter setup only for specific action. What do you think?
>> > I think this is improper; the stats type should be defined per-action in
>> >   the uapi, even if userland initially only has UI to set it across the
>> >   entire filter.  (I imagine `tc action` would grow the corresponding UI
>> >   pretty quickly.)  Then on the driver side, you must determine whether
>> >   your hardware can support the user's request, and if not, return an
>> >   appropriate error code.
>> > 
>> > Let's try not to encourage people (users, future HW & driver developers)
>> >   to keep thinking of stats as per-filter entities.
>> > Okay, but in that case, it does not make sense to have "UI to set it
>> across the entire filter". The UI would have to set it per-action too.
>> Othewise it would make people think it is per-filter entity.
>> But I guess the tc cmdline is chatty already an it can take other
>> repetitive cmdline options.
>> 
>
>I normally visualize policy as a dag composed of filter(s) +
>actions. The UI and uAPI has to be able to express that.
>
>I am not sure if mlxsw works this way, but:
>Most hardware i have encountered tends to have a separate
>stats/counter table. The stats table is indexed.
>
>Going backwards and looking at your example in this stanza:
>---
>  in_hw in_hw_count 2
>  hw_stats immediate
>        action order 1: gact action drop
>         random type none pass val 0
>         index 1 ref 1 bind 1 installed 14 sec used 7 sec
>        Action statistics:
>----
>
>Guessing from "in_hw in_hw_count 2" - 2 is a hw stats table index?
>If you have enough counters in hardware, the "stats table index"
>essentially can be directly mapped to the action "index" attribute.

>
>Sharing then becomes a matter of specifying the same drop action
>with the correct index across multiple filters.
>
>If you dont have enough hw counters - then perhaps a scheme to show
>separate hardware counter index and software counter index (aka action
>index) is needed.

I don't, that is the purpose of this patchset, to be able to avoid the
"action_counter" from the example I wrote above.

Note that I don't want to share, there is still separate "last_hit"
record in hw I expose in "used X sec". Interestingly enough, in
Spectrum-1 this is per rule, in Spectrum-2,3 this is per action block :)


>
>cheers,
>jamal
>
>> What do you think?
>> 
>> 
>> Thanks!
>> 
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch net-next 00/10] net: allow user specify TC filter HW stats type
  2020-02-24 16:25           ` Jiri Pirko
@ 2020-02-25 16:01             ` Jamal Hadi Salim
  2020-02-25 16:22               ` Jiri Pirko
  2020-02-27 15:57             ` Edward Cree
  1 sibling, 1 reply; 29+ messages in thread
From: Jamal Hadi Salim @ 2020-02-25 16:01 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Edward Cree, Jakub Kicinski, netdev, davem, saeedm, leon,
	michael.chan, vishal, jeffrey.t.kirsher, idosch, aelior,
	peppe.cavallaro, alexandre.torgue, xiyou.wangcong, pablo, mlxsw,
	Marian Pritsak

+Cc Marian.

On 2020-02-24 11:25 a.m., Jiri Pirko wrote:
> Mon, Feb 24, 2020 at 04:45:57PM CET, jhs@mojatatu.com wrote:
>> On 2020-02-24 8:11 a.m., Jiri Pirko wrote:
>>> Mon, Feb 24, 2020 at 12:38:20PM CET, ecree@solarflare.com wrote:
>>>> On 22/02/2020 06:38, Jiri Pirko wrote:
>>

>> There's also a requirement for the concept of "sharing" - think
>> "family plans" or "small bussiness plan".
>> Counters may be shared across multiple filter-action chains for example.
> 
> In hardware, we have a separate "counter" action with counter index.

Ok, so it is similar semantics.
In your case, you abstract it as a speacial action, but in most
abstractions(including P4) it looks like an indexed table.
 From a tc perspective you could abstract the equivalent to
your "counter action" as a gact "ok" or "pipe",etc depending
on your policy goal. The counter index becomes the gact index
if there is no conflict.
In most actions "index" attribute is really mapped to a
"counter" index. Exception would be actions with state
(like policer).

> You can reuse this index in multiple counter action instances.

That is great because it maps to tc semantics. When you create
an action of the same type, you can specify the index and it
is re-used. Example:

sudo tc filter add dev lo parent ffff: protocol ip prio 8 u32 \
match ip dst 127.0.0.8/32 flowid 1:8 \
action vlan push id 8 protocol 802.1q index 8\
action mirred egress mirror dev eth0 index 111

sudo tc filter add dev lo parent ffff: protocol ip prio 8 u32 \
match ip dst 127.0.0.15/32 flowid 1:10 \
action vlan push id 15 protocol 802.1q index 15 \
action mirred egress mirror index 111 \
action drop index 111

So for the shared mirror action the counter is shared
by virtue of specifying index 111.

What tc _doesnt allow_ is to re-use the same
counter index across different types of actions (example
mirror index 111 is not the same instance as drop 111).
Thats why i was asking if you are exposing the hw index.

> However, in tc there is implicit separate counter for every action.
> 

Yes, and is proving to be a challenge for hw. In s/w it makes sense.
It seemed sensible at the time; existing hardware back then
(broadcom 5691 family and cant remember the other vendor, iirc)
hard coded the actions with counters. Mind you they would
only support one action per match.

Some rethinking is needed of this status quo.
So maybe having syntaticaly an index for s/w vs h/w may make
sense.
Knowing the indices is important. As an example, for telemetry
you may be very interesting in dumping only the counter stats
instead of the rule. Dumping gact has always made it easy in
my use cases because it doesnt have a lot of attributes. But it
could make sense to introduce a new semantic like "dump action stats .."

> The counter is limited resource. So we overcome this mismatch in mlxsw
> by having action "counter" always first for every rule inserted:
> rule->action_counter,the_actual_action,the_actual_action2,...the_actual_actionN
>

So i am guessing the hw cant support "branching" i.e based on in
some action state sometime you may execute action foo and other times
action bar. Those kind of scenarios would need multiple counters.
> and we report stats from action_counter for all the_actual_actionX.

This may not be accurate if you are branching - for example
a policer or quota enforcer which either accepts or drops or sends next
to a marker action etc .
IMO, this was fine in the old days when you had one action per match.
Best is to leave it to whoever creates the policy to decide what to
count. IOW, I think modelling it as a pipe or ok or drop or continue
and be placed anywhere in the policy graph instead of the begining.

>> Sharing then becomes a matter of specifying the same drop action
>> with the correct index across multiple filters.
>>
>> If you dont have enough hw counters - then perhaps a scheme to show
>> separate hardware counter index and software counter index (aka action
>> index) is needed.
> 
> I don't, that is the purpose of this patchset, to be able to avoid the
> "action_counter" from the example I wrote above.

IMO, it would make sense to reuse existing gact for example and
annotate s/w vs h/w indices as a starting point. It keeps the
existing approach intact.

> Note that I don't want to share, there is still separate "last_hit"
> record in hw I expose in "used X sec". Interestingly enough, in
> Spectrum-1 this is per rule, in Spectrum-2,3 this is per action block :)

I didnt understand this one..

cheers,
jamal

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch net-next 00/10] net: allow user specify TC filter HW stats type
  2020-02-25 16:01             ` Jamal Hadi Salim
@ 2020-02-25 16:22               ` Jiri Pirko
  2020-02-25 18:06                 ` Jakub Kicinski
                                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Jiri Pirko @ 2020-02-25 16:22 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Edward Cree, Jakub Kicinski, netdev, davem, saeedm, leon,
	michael.chan, vishal, jeffrey.t.kirsher, idosch, aelior,
	peppe.cavallaro, alexandre.torgue, xiyou.wangcong, pablo, mlxsw,
	Marian Pritsak

Tue, Feb 25, 2020 at 05:01:05PM CET, jhs@mojatatu.com wrote:
>+Cc Marian.
>
>On 2020-02-24 11:25 a.m., Jiri Pirko wrote:
>> Mon, Feb 24, 2020 at 04:45:57PM CET, jhs@mojatatu.com wrote:
>> > On 2020-02-24 8:11 a.m., Jiri Pirko wrote:
>> > > Mon, Feb 24, 2020 at 12:38:20PM CET, ecree@solarflare.com wrote:
>> > > > On 22/02/2020 06:38, Jiri Pirko wrote:
>> > 
>
>> > There's also a requirement for the concept of "sharing" - think
>> > "family plans" or "small bussiness plan".
>> > Counters may be shared across multiple filter-action chains for example.
>> 
>> In hardware, we have a separate "counter" action with counter index.
>
>Ok, so it is similar semantics.
>In your case, you abstract it as a speacial action, but in most
>abstractions(including P4) it looks like an indexed table.
>From a tc perspective you could abstract the equivalent to
>your "counter action" as a gact "ok" or "pipe",etc depending
>on your policy goal. The counter index becomes the gact index
>if there is no conflict.
>In most actions "index" attribute is really mapped to a
>"counter" index. Exception would be actions with state
>(like policer).
>
>> You can reuse this index in multiple counter action instances.
>
>That is great because it maps to tc semantics. When you create
>an action of the same type, you can specify the index and it
>is re-used. Example:
>
>sudo tc filter add dev lo parent ffff: protocol ip prio 8 u32 \
>match ip dst 127.0.0.8/32 flowid 1:8 \
>action vlan push id 8 protocol 802.1q index 8\
>action mirred egress mirror dev eth0 index 111
>
>sudo tc filter add dev lo parent ffff: protocol ip prio 8 u32 \
>match ip dst 127.0.0.15/32 flowid 1:10 \
>action vlan push id 15 protocol 802.1q index 15 \
>action mirred egress mirror index 111 \
>action drop index 111
>
>So for the shared mirror action the counter is shared
>by virtue of specifying index 111.
>
>What tc _doesnt allow_ is to re-use the same
>counter index across different types of actions (example
>mirror index 111 is not the same instance as drop 111).
>Thats why i was asking if you are exposing the hw index.

User does not care about any "hw index". That should be abstracted out
by the driver.


>
>> However, in tc there is implicit separate counter for every action.
>> 
>
>Yes, and is proving to be a challenge for hw. In s/w it makes sense.
>It seemed sensible at the time; existing hardware back then
>(broadcom 5691 family and cant remember the other vendor, iirc)
>hard coded the actions with counters. Mind you they would
>only support one action per match.
>
>Some rethinking is needed of this status quo.
>So maybe having syntaticaly an index for s/w vs h/w may make
>sense.
>Knowing the indices is important. As an example, for telemetry
>you may be very interesting in dumping only the counter stats
>instead of the rule. Dumping gact has always made it easy in
>my use cases because it doesnt have a lot of attributes. But it
>could make sense to introduce a new semantic like "dump action stats .."
>
>> The counter is limited resource. So we overcome this mismatch in mlxsw
>> by having action "counter" always first for every rule inserted:
>> rule->action_counter,the_actual_action,the_actual_action2,...the_actual_actionN
>> 
>
>So i am guessing the hw cant support "branching" i.e based on in
>some action state sometime you may execute action foo and other times
>action bar. Those kind of scenarios would need multiple counters.

We don't and when/if we do, we need to put another counter to the
branch point.


>> and we report stats from action_counter for all the_actual_actionX.
>
>This may not be accurate if you are branching - for example
>a policer or quota enforcer which either accepts or drops or sends next
>to a marker action etc .
>IMO, this was fine in the old days when you had one action per match.
>Best is to leave it to whoever creates the policy to decide what to
>count. IOW, I think modelling it as a pipe or ok or drop or continue
>and be placed anywhere in the policy graph instead of the begining.

Eh, that is not that simple. The existing users are used to the fact
that the actions are providing counters by themselves. Having and
explicit counter action like this would break that expectation.
Also, I think it should be up to the driver implementation. Some HW
might only support stats per rule, not the actions. Driver should fit
into the existing abstraction, I think it is fine.


>
>> > Sharing then becomes a matter of specifying the same drop action
>> > with the correct index across multiple filters.
>> > 
>> > If you dont have enough hw counters - then perhaps a scheme to show
>> > separate hardware counter index and software counter index (aka action
>> > index) is needed.
>> 
>> I don't, that is the purpose of this patchset, to be able to avoid the
>> "action_counter" from the example I wrote above.
>
>IMO, it would make sense to reuse existing gact for example and
>annotate s/w vs h/w indices as a starting point. It keeps the
>existing approach intact.
>
>> Note that I don't want to share, there is still separate "last_hit"
>> record in hw I expose in "used X sec". Interestingly enough, in
>> Spectrum-1 this is per rule, in Spectrum-2,3 this is per action block :)
>
>I didnt understand this one..

It's not "stats", it's an information about how long ago the act was
used.


>
>cheers,
>jamal

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch net-next 00/10] net: allow user specify TC filter HW stats type
  2020-02-25 16:22               ` Jiri Pirko
@ 2020-02-25 18:06                 ` Jakub Kicinski
  2020-02-26 12:52                 ` Jamal Hadi Salim
  2020-02-28 19:59                 ` Pablo Neira Ayuso
  2 siblings, 0 replies; 29+ messages in thread
From: Jakub Kicinski @ 2020-02-25 18:06 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Jamal Hadi Salim, Edward Cree, netdev, davem, saeedm, leon,
	michael.chan, vishal, jeffrey.t.kirsher, idosch, aelior,
	peppe.cavallaro, alexandre.torgue, xiyou.wangcong, pablo, mlxsw,
	Marian Pritsak

On Tue, 25 Feb 2020 17:22:03 +0100 Jiri Pirko wrote:
> >> You can reuse this index in multiple counter action instances.  
> >
> >That is great because it maps to tc semantics. When you create
> >an action of the same type, you can specify the index and it
> >is re-used. Example:
> >
> >sudo tc filter add dev lo parent ffff: protocol ip prio 8 u32 \
> >match ip dst 127.0.0.8/32 flowid 1:8 \
> >action vlan push id 8 protocol 802.1q index 8\
> >action mirred egress mirror dev eth0 index 111
> >
> >sudo tc filter add dev lo parent ffff: protocol ip prio 8 u32 \
> >match ip dst 127.0.0.15/32 flowid 1:10 \
> >action vlan push id 15 protocol 802.1q index 15 \
> >action mirred egress mirror index 111 \
> >action drop index 111
> >
> >So for the shared mirror action the counter is shared
> >by virtue of specifying index 111.
> >
> >What tc _doesnt allow_ is to re-use the same
> >counter index across different types of actions (example
> >mirror index 111 is not the same instance as drop 111).
> >Thats why i was asking if you are exposing the hw index.  
> 
> User does not care about any "hw index". That should be abstracted out
> by the driver.

+1

> >> and we report stats from action_counter for all the_actual_actionX.  
> >
> >This may not be accurate if you are branching - for example
> >a policer or quota enforcer which either accepts or drops or sends next
> >to a marker action etc .
> >IMO, this was fine in the old days when you had one action per match.
> >Best is to leave it to whoever creates the policy to decide what to
> >count. IOW, I think modelling it as a pipe or ok or drop or continue
> >and be placed anywhere in the policy graph instead of the begining.  
> 
> Eh, that is not that simple. The existing users are used to the fact
> that the actions are providing counters by themselves. Having and
> explicit counter action like this would break that expectation.
> Also, I think it should be up to the driver implementation. Some HW
> might only support stats per rule, not the actions. Driver should fit
> into the existing abstraction, I think it is fine.

+1

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch net-next 00/10] net: allow user specify TC filter HW stats type
  2020-02-25 16:22               ` Jiri Pirko
  2020-02-25 18:06                 ` Jakub Kicinski
@ 2020-02-26 12:52                 ` Jamal Hadi Salim
  2020-02-26 13:56                   ` Jiri Pirko
  2020-02-28 19:59                 ` Pablo Neira Ayuso
  2 siblings, 1 reply; 29+ messages in thread
From: Jamal Hadi Salim @ 2020-02-26 12:52 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Edward Cree, Jakub Kicinski, netdev, davem, saeedm, leon,
	michael.chan, vishal, jeffrey.t.kirsher, idosch, aelior,
	peppe.cavallaro, alexandre.torgue, xiyou.wangcong, pablo, mlxsw,
	Marian Pritsak

On 2020-02-25 11:22 a.m., Jiri Pirko wrote:
> Tue, Feb 25, 2020 at 05:01:05PM CET, jhs@mojatatu.com wrote:
>> +Cc Marian.
>>



>> So for the shared mirror action the counter is shared
>> by virtue of specifying index 111.
>>
>> What tc _doesnt allow_ is to re-use the same
>> counter index across different types of actions (example
>> mirror index 111 is not the same instance as drop 111).
>> Thats why i was asking if you are exposing the hw index.
> 
> User does not care about any "hw index". That should be abstracted out
> by the driver.
> 

My main motivation is proper accounting (which is important
for the billing and debugging of course). Example:
if i say "get stats" I should know it is the sum of both
h/w + s/w stats or the rules are clear in regards to how
to retrieve each and sum them or differentiate them.
If your patch takes care of summing up things etc, then i agree.
Or if the rules for accounting are consistent then we are fine
as well.

>> So i am guessing the hw cant support "branching" i.e based on in
>> some action state sometime you may execute action foo and other times
>> action bar. Those kind of scenarios would need multiple counters.
> 
> We don't and when/if we do, we need to put another counter to the
> branch point.
>

Ok, that would work.
> 
>>> and we report stats from action_counter for all the_actual_actionX.
>>
>> This may not be accurate if you are branching - for example
>> a policer or quota enforcer which either accepts or drops or sends next
>> to a marker action etc .
>> IMO, this was fine in the old days when you had one action per match.
>> Best is to leave it to whoever creates the policy to decide what to
>> count. IOW, I think modelling it as a pipe or ok or drop or continue
>> and be placed anywhere in the policy graph instead of the begining.
> 
> Eh, that is not that simple. The existing users are used to the fact
> that the actions are providing counters by themselves. Having and
> explicit counter action like this would break that expectation.
 >
> Also, I think it should be up to the driver implementation. Some HW
> might only support stats per rule, not the actions. Driver should fit
> into the existing abstraction, I think it is fine.
>

Reasonable point.
So "count" action is only useful for h/w?

>>> Note that I don't want to share, there is still separate "last_hit"
>>> record in hw I expose in "used X sec". Interestingly enough, in
>>> Spectrum-1 this is per rule, in Spectrum-2,3 this is per action block :)
>>
>> I didnt understand this one..
> 
> It's not "stats", it's an information about how long ago the act was
> used.

ah. Given tc has one of those per action, are you looking to introduce
a new "last used" action?

cheers,
jamal

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch net-next 00/10] net: allow user specify TC filter HW stats type
  2020-02-26 12:52                 ` Jamal Hadi Salim
@ 2020-02-26 13:56                   ` Jiri Pirko
  0 siblings, 0 replies; 29+ messages in thread
From: Jiri Pirko @ 2020-02-26 13:56 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Edward Cree, Jakub Kicinski, netdev, davem, saeedm, leon,
	michael.chan, vishal, jeffrey.t.kirsher, idosch, aelior,
	peppe.cavallaro, alexandre.torgue, xiyou.wangcong, pablo, mlxsw,
	Marian Pritsak

Wed, Feb 26, 2020 at 01:52:20PM CET, jhs@mojatatu.com wrote:
>On 2020-02-25 11:22 a.m., Jiri Pirko wrote:
>> Tue, Feb 25, 2020 at 05:01:05PM CET, jhs@mojatatu.com wrote:
>> > +Cc Marian.
>> > 
>
>
>
>> > So for the shared mirror action the counter is shared
>> > by virtue of specifying index 111.
>> > 
>> > What tc _doesnt allow_ is to re-use the same
>> > counter index across different types of actions (example
>> > mirror index 111 is not the same instance as drop 111).
>> > Thats why i was asking if you are exposing the hw index.
>> 
>> User does not care about any "hw index". That should be abstracted out
>> by the driver.
>> 
>
>My main motivation is proper accounting (which is important
>for the billing and debugging of course). Example:
>if i say "get stats" I should know it is the sum of both
>h/w + s/w stats or the rules are clear in regards to how
>to retrieve each and sum them or differentiate them.
>If your patch takes care of summing up things etc, then i agree.

The current state implemented in the code is summing up the stats. My
patchset has no relation to that.


>Or if the rules for accounting are consistent then we are fine
>as well.
>
>> > So i am guessing the hw cant support "branching" i.e based on in
>> > some action state sometime you may execute action foo and other times
>> > action bar. Those kind of scenarios would need multiple counters.
>> 
>> We don't and when/if we do, we need to put another counter to the
>> branch point.
>> 
>
>Ok, that would work.
>> 
>> > > and we report stats from action_counter for all the_actual_actionX.
>> > 
>> > This may not be accurate if you are branching - for example
>> > a policer or quota enforcer which either accepts or drops or sends next
>> > to a marker action etc .
>> > IMO, this was fine in the old days when you had one action per match.
>> > Best is to leave it to whoever creates the policy to decide what to
>> > count. IOW, I think modelling it as a pipe or ok or drop or continue
>> > and be placed anywhere in the policy graph instead of the begining.
>> 
>> Eh, that is not that simple. The existing users are used to the fact
>> that the actions are providing counters by themselves. Having and
>> explicit counter action like this would break that expectation.
>>
>> Also, I think it should be up to the driver implementation. Some HW
>> might only support stats per rule, not the actions. Driver should fit
>> into the existing abstraction, I think it is fine.
>> 
>
>Reasonable point.
>So "count" action is only useful for h/w?

There is no "count" action and should not be.


>
>> > > Note that I don't want to share, there is still separate "last_hit"
>> > > record in hw I expose in "used X sec". Interestingly enough, in
>> > > Spectrum-1 this is per rule, in Spectrum-2,3 this is per action block :)
>> > 
>> > I didnt understand this one..
>> 
>> It's not "stats", it's an information about how long ago the act was
>> used.
>
>ah. Given tc has one of those per action, are you looking to introduce
>a new "last used" action?

No.


>
>cheers,
>jamal

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch net-next 00/10] net: allow user specify TC filter HW stats type
  2020-02-24 16:25           ` Jiri Pirko
  2020-02-25 16:01             ` Jamal Hadi Salim
@ 2020-02-27 15:57             ` Edward Cree
  1 sibling, 0 replies; 29+ messages in thread
From: Edward Cree @ 2020-02-27 15:57 UTC (permalink / raw)
  To: Jiri Pirko, Jamal Hadi Salim
  Cc: Jakub Kicinski, netdev, davem, saeedm, leon, michael.chan,
	vishal, jeffrey.t.kirsher, idosch, aelior, peppe.cavallaro,
	alexandre.torgue, xiyou.wangcong, pablo, mlxsw

On 24/02/2020 16:25, Jiri Pirko wrote:
> In hardware, we have a separate "counter" action with counter index.
> You can reuse this index in multiple counter action instances.
> However, in tc there is implicit separate counter for every action.
Importantly, not all actions have counters.  Only those with a
 .stats_update() method in their struct tc_action_ops have an implicit
 counter (which requires hardware offload), and to a first approximation
 those are the 'deliverish' actions (pass, drop, mirred).
This informs the sfc design below.

> The counter is limited resource. So we overcome this mismatch in mlxsw
> by having action "counter" always first for every rule inserted:
> rule->action_counter,the_actual_action,the_actual_action2,...the_actual_actionN
>
> and we report stats from action_counter for all the_actual_actionX.
For comparison, here's how we're doing it in the upcoming sfc hardware:
Rather than a sequence of actions in an arbitrary order (applied
 cumulatively to the original packet), we have a concept of an 'action
 set', which is some subset of a fixed sequence of actions, the last of
 which is a delivery.  (One of these actions is 'count', which takes one
 or more counter IDs as its metadata.)  Then the result of a rule match
 is an _action set list_, one or more action sets each of which is
 applied to a separate copy of the original packet.
This works because the delivery (hw) action corresponds to the only (tc)
 action that wants stats, combined with some cleverness in the defined
 order of our fixed action sequence.  And it means that we can properly
 support per-action counters, including making stats type be a per-action
 property (if one of the mirreds doesn't want stats, just don't put a
 'count' in that action-set).
For shared counters we just use the same counter index.  Mapping from
 tc action index to hw counter index is handled in the driver (transparent
 to userspace).

So from our perspective, making stats-type a property of the TC action is
 the Right Thing and fits in with the existing design of TC.  See also my
 old RFC series restoring per-action stats to flow_offload [1] which (in
 patch 4/4) added a 'want_stats' field to struct flow_action_entry; this
 would presumably become an enum flow_cls_hw_stats_typeto support these
 new stats-type semantics, and would be set to
 FLOW_CLS_HW_STATS_TYPE_DISABLEDif act->ops->stats_update == NULL
 regardless of the stats-type specified by the user.

-ed

[1] http://patchwork.ozlabs.org/cover/1110071/

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch net-next 00/10] net: allow user specify TC filter HW stats type
  2020-02-25 16:22               ` Jiri Pirko
  2020-02-25 18:06                 ` Jakub Kicinski
  2020-02-26 12:52                 ` Jamal Hadi Salim
@ 2020-02-28 19:59                 ` Pablo Neira Ayuso
  2020-02-29  8:01                   ` Jiri Pirko
  2 siblings, 1 reply; 29+ messages in thread
From: Pablo Neira Ayuso @ 2020-02-28 19:59 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Jamal Hadi Salim, Edward Cree, Jakub Kicinski, netdev, davem,
	saeedm, leon, michael.chan, vishal, jeffrey.t.kirsher, idosch,
	aelior, peppe.cavallaro, alexandre.torgue, xiyou.wangcong, mlxsw,
	Marian Pritsak

[-- Attachment #1: Type: text/plain, Size: 870 bytes --]

Hi Pirko,

On Tue, Feb 25, 2020 at 05:22:03PM +0100, Jiri Pirko wrote:
[...]
> Eh, that is not that simple. The existing users are used to the fact
> that the actions are providing counters by themselves. Having and
> explicit counter action like this would break that expectation.
> Also, I think it should be up to the driver implementation. Some HW
> might only support stats per rule, not the actions. Driver should fit
> into the existing abstraction, I think it is fine.

Something like the sketch patch that I'm attaching?

The idea behind it is to provide a counter action through the
flow_action API. Then, tc_setup_flow_action() checks if this action
comes with counters in that case the counter action is added.

My patch assumes tcf_vlan_counter() provides tells us what counter
type the user wants, I just introduced this to provide an example.

Thank you.

[-- Attachment #2: x.patch --]
[-- Type: text/x-diff, Size: 2103 bytes --]

diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index c6f7bd22db60..1a5006091edc 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -138,9 +138,16 @@ enum flow_action_id {
 	FLOW_ACTION_MPLS_PUSH,
 	FLOW_ACTION_MPLS_POP,
 	FLOW_ACTION_MPLS_MANGLE,
+	FLOW_ACTION_COUNTER,
 	NUM_FLOW_ACTIONS,
 };
 
+enum flow_action_counter_type {
+	FLOW_COUNTER_DISABLED		= 0,
+	FLOW_COUNTER_DELAYED,
+	FLOW_COUNTER_IMMEDIATE,
+};
+
 /* This is mirroring enum pedit_header_type definition for easy mapping between
  * tc pedit action. Legacy TCA_PEDIT_KEY_EX_HDR_TYPE_NETWORK is mapped to
  * FLOW_ACT_MANGLE_UNSPEC, which is supported by no driver.
@@ -213,6 +220,9 @@ struct flow_action_entry {
 			u8		bos;
 			u8		ttl;
 		} mpls_mangle;
+		struct {				/* FLOW_ACTION_COUNTER */
+			enum flow_action_counter_type	type;
+		} counter;
 	};
 };
 
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 13c33eaf1ca1..984f2129c760 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -3435,6 +3435,7 @@ static void tcf_sample_get_group(struct flow_action_entry *entry,
 int tc_setup_flow_action(struct flow_action *flow_action,
 			 const struct tcf_exts *exts)
 {
+	enum flow_action_counter_type counter = FLOW_COUNTER_DISABLED;
 	struct tc_action *act;
 	int i, j, k, err = 0;
 
@@ -3489,6 +3490,7 @@ int tc_setup_flow_action(struct flow_action *flow_action,
 				err = -EOPNOTSUPP;
 				goto err_out_locked;
 			}
+			counter = tcf_vlan_counter(act);
 		} else if (is_tcf_tunnel_set(act)) {
 			entry->id = FLOW_ACTION_TUNNEL_ENCAP;
 			err = tcf_tunnel_encap_get_tunnel(entry, act);
@@ -3567,10 +3569,19 @@ int tc_setup_flow_action(struct flow_action *flow_action,
 			err = -EOPNOTSUPP;
 			goto err_out_locked;
 		}
-		spin_unlock_bh(&act->tcfa_lock);
 
 		if (!is_tcf_pedit(act))
 			j++;
+
+		if (counter) {
+			struct flow_action_entry *entry;
+
+			entry = &flow_action->entries[j++];
+			entry->id = FLOW_ACTION_COUNTER;
+			entry->counter.type = counter;
+			counter = FLOW_COUNTER_DISABLED;
+		}
+		spin_unlock_bh(&act->tcfa_lock);
 	}
 
 err_out:

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [patch net-next 00/10] net: allow user specify TC filter HW stats type
  2020-02-28 19:59                 ` Pablo Neira Ayuso
@ 2020-02-29  8:01                   ` Jiri Pirko
  2020-02-29 19:56                     ` Pablo Neira Ayuso
  2020-03-02 16:07                     ` Edward Cree
  0 siblings, 2 replies; 29+ messages in thread
From: Jiri Pirko @ 2020-02-29  8:01 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Jamal Hadi Salim, Edward Cree, Jakub Kicinski, netdev, davem,
	saeedm, leon, michael.chan, vishal, jeffrey.t.kirsher, idosch,
	aelior, peppe.cavallaro, alexandre.torgue, xiyou.wangcong, mlxsw,
	Marian Pritsak

Fri, Feb 28, 2020 at 08:59:09PM CET, pablo@netfilter.org wrote:
>Hi Pirko,
>
>On Tue, Feb 25, 2020 at 05:22:03PM +0100, Jiri Pirko wrote:
>[...]
>> Eh, that is not that simple. The existing users are used to the fact
>> that the actions are providing counters by themselves. Having and
>> explicit counter action like this would break that expectation.
>> Also, I think it should be up to the driver implementation. Some HW
>> might only support stats per rule, not the actions. Driver should fit
>> into the existing abstraction, I think it is fine.
>
>Something like the sketch patch that I'm attaching?

But why? Actions are separate entities, with separate counters. The
driver is either able to offload that or not. Up to the driver to
abstract this out.


>
>The idea behind it is to provide a counter action through the
>flow_action API. Then, tc_setup_flow_action() checks if this action
>comes with counters in that case the counter action is added.
>
>My patch assumes tcf_vlan_counter() provides tells us what counter
>type the user wants, I just introduced this to provide an example.
>
>Thank you.

>diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
>index c6f7bd22db60..1a5006091edc 100644
>--- a/include/net/flow_offload.h
>+++ b/include/net/flow_offload.h
>@@ -138,9 +138,16 @@ enum flow_action_id {
> 	FLOW_ACTION_MPLS_PUSH,
> 	FLOW_ACTION_MPLS_POP,
> 	FLOW_ACTION_MPLS_MANGLE,
>+	FLOW_ACTION_COUNTER,
> 	NUM_FLOW_ACTIONS,
> };
> 
>+enum flow_action_counter_type {
>+	FLOW_COUNTER_DISABLED		= 0,
>+	FLOW_COUNTER_DELAYED,
>+	FLOW_COUNTER_IMMEDIATE,
>+};
>+
> /* This is mirroring enum pedit_header_type definition for easy mapping between
>  * tc pedit action. Legacy TCA_PEDIT_KEY_EX_HDR_TYPE_NETWORK is mapped to
>  * FLOW_ACT_MANGLE_UNSPEC, which is supported by no driver.
>@@ -213,6 +220,9 @@ struct flow_action_entry {
> 			u8		bos;
> 			u8		ttl;
> 		} mpls_mangle;
>+		struct {				/* FLOW_ACTION_COUNTER */
>+			enum flow_action_counter_type	type;
>+		} counter;
> 	};
> };
> 
>diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>index 13c33eaf1ca1..984f2129c760 100644
>--- a/net/sched/cls_api.c
>+++ b/net/sched/cls_api.c
>@@ -3435,6 +3435,7 @@ static void tcf_sample_get_group(struct flow_action_entry *entry,
> int tc_setup_flow_action(struct flow_action *flow_action,
> 			 const struct tcf_exts *exts)
> {
>+	enum flow_action_counter_type counter = FLOW_COUNTER_DISABLED;
> 	struct tc_action *act;
> 	int i, j, k, err = 0;
> 
>@@ -3489,6 +3490,7 @@ int tc_setup_flow_action(struct flow_action *flow_action,
> 				err = -EOPNOTSUPP;
> 				goto err_out_locked;
> 			}
>+			counter = tcf_vlan_counter(act);
> 		} else if (is_tcf_tunnel_set(act)) {
> 			entry->id = FLOW_ACTION_TUNNEL_ENCAP;
> 			err = tcf_tunnel_encap_get_tunnel(entry, act);
>@@ -3567,10 +3569,19 @@ int tc_setup_flow_action(struct flow_action *flow_action,
> 			err = -EOPNOTSUPP;
> 			goto err_out_locked;
> 		}
>-		spin_unlock_bh(&act->tcfa_lock);
> 
> 		if (!is_tcf_pedit(act))
> 			j++;
>+
>+		if (counter) {
>+			struct flow_action_entry *entry;
>+
>+			entry = &flow_action->entries[j++];
>+			entry->id = FLOW_ACTION_COUNTER;
>+			entry->counter.type = counter;
>+			counter = FLOW_COUNTER_DISABLED;
>+		}
>+		spin_unlock_bh(&act->tcfa_lock);
> 	}
> 
> err_out:


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch net-next 00/10] net: allow user specify TC filter HW stats type
  2020-02-29  8:01                   ` Jiri Pirko
@ 2020-02-29 19:56                     ` Pablo Neira Ayuso
  2020-03-02 16:07                     ` Edward Cree
  1 sibling, 0 replies; 29+ messages in thread
From: Pablo Neira Ayuso @ 2020-02-29 19:56 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Jamal Hadi Salim, Edward Cree, Jakub Kicinski, netdev, davem,
	saeedm, leon, michael.chan, vishal, jeffrey.t.kirsher, idosch,
	aelior, peppe.cavallaro, alexandre.torgue, xiyou.wangcong, mlxsw,
	Marian Pritsak

On Sat, Feb 29, 2020 at 09:01:20AM +0100, Jiri Pirko wrote:
> Fri, Feb 28, 2020 at 08:59:09PM CET, pablo@netfilter.org wrote:
> >Hi Pirko,
> >
> >On Tue, Feb 25, 2020 at 05:22:03PM +0100, Jiri Pirko wrote:
> >[...]
> >> Eh, that is not that simple. The existing users are used to the fact
> >> that the actions are providing counters by themselves. Having and
> >> explicit counter action like this would break that expectation.
> >> Also, I think it should be up to the driver implementation. Some HW
> >> might only support stats per rule, not the actions. Driver should fit
> >> into the existing abstraction, I think it is fine.
> >
> >Something like the sketch patch that I'm attaching?
> 
> But why? Actions are separate entities, with separate counters. The
> driver is either able to offload that or not. Up to the driver to
> abstract this out.

You can add one counter for each action through FLOW_ACTION_COUNTER.
Then, one single tc action maps to two flow_actions, the action itself
and the counter action. Hence, the tc frontend only needs to append a
counter after the action.

Why this might be a problem from the driver side? From reading this
thread, my understanding is that:

1) Some drivers have the ability to attach to immediate/delayed
   counters.
2) The immediate counters might be a limited resource while delayed
   counters are abundant.
3) Some drivers have counters per-filter, not per-action. In that
   case, for each counter action in the rule, the driver might decide
   to make all counter actions refer to the per-filter counter.

Thank you!

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch net-next 00/10] net: allow user specify TC filter HW stats type
  2020-02-29  8:01                   ` Jiri Pirko
  2020-02-29 19:56                     ` Pablo Neira Ayuso
@ 2020-03-02 16:07                     ` Edward Cree
  1 sibling, 0 replies; 29+ messages in thread
From: Edward Cree @ 2020-03-02 16:07 UTC (permalink / raw)
  To: Jiri Pirko, Pablo Neira Ayuso
  Cc: Jamal Hadi Salim, Jakub Kicinski, netdev, davem, saeedm, leon,
	michael.chan, vishal, jeffrey.t.kirsher, idosch, aelior,
	peppe.cavallaro, alexandre.torgue, xiyou.wangcong, mlxsw,
	Marian Pritsak

On 29/02/2020 08:01, Jiri Pirko wrote:
> Fri, Feb 28, 2020 at 08:59:09PM CET, pablo@netfilter.org wrote:
>> Something like the sketch patch that I'm attaching?
> But why? Actions are separate entities, with separate counters. The
> driver is either able to offload that or not. Up to the driver to
> abstract this out.
+1

If any of this "fake up a counter action" stuff is common to several
 drivers, we could maybe have some library functions to help them with
 it, but the tc<->driver API shouldn't change (since the change adds
 no expressiveness).

-ed

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2020-03-02 16:07 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-21  9:56 [patch net-next 00/10] net: allow user specify TC filter HW stats type Jiri Pirko
2020-02-21  9:56 ` [patch net-next 01/10] net: rename tc_cls_can_offload_and_chain0() to tc_cls_can_offload_basic() Jiri Pirko
2020-02-21  9:56 ` [patch net-next 02/10] iavf: use tc_cls_can_offload_basic() instead of chain check Jiri Pirko
2020-02-21  9:56 ` [patch net-next 03/10] flow_offload: Introduce offload of HW stats type Jiri Pirko
2020-02-21  9:56 ` [patch net-next 04/10] net: extend tc_cls_can_offload_basic() to check " Jiri Pirko
2020-02-21  9:56 ` [patch net-next 05/10] mlx5: restrict supported HW stats type to "any" Jiri Pirko
2020-02-21  9:56 ` [patch net-next 06/10] mlxsw: " Jiri Pirko
2020-02-21  9:56 ` [patch net-next 07/10] flow_offload: introduce "immediate" HW stats type and allow it in mlxsw Jiri Pirko
2020-02-21  9:56 ` [patch net-next 08/10] flow_offload: introduce "delayed" HW stats type and allow it in mlx5 Jiri Pirko
2020-02-21  9:56 ` [patch net-next 09/10] flow_offload: introduce "disabled" HW stats type and allow it in mlxsw Jiri Pirko
2020-02-21  9:56 ` [patch net-next 10/10] sched: cls_flower: allow user to specify type of HW stats for a filter Jiri Pirko
2020-02-21 18:22 ` [patch net-next 00/10] net: allow user specify TC filter HW stats type Jakub Kicinski
2020-02-22  6:38   ` Jiri Pirko
2020-02-24 11:38     ` Edward Cree
2020-02-24 13:11       ` Jiri Pirko
2020-02-24 15:45         ` Jamal Hadi Salim
2020-02-24 15:50           ` Edward Cree
2020-02-24 15:55             ` Jamal Hadi Salim
2020-02-24 16:25           ` Jiri Pirko
2020-02-25 16:01             ` Jamal Hadi Salim
2020-02-25 16:22               ` Jiri Pirko
2020-02-25 18:06                 ` Jakub Kicinski
2020-02-26 12:52                 ` Jamal Hadi Salim
2020-02-26 13:56                   ` Jiri Pirko
2020-02-28 19:59                 ` Pablo Neira Ayuso
2020-02-29  8:01                   ` Jiri Pirko
2020-02-29 19:56                     ` Pablo Neira Ayuso
2020-03-02 16:07                     ` Edward Cree
2020-02-27 15:57             ` Edward Cree

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.