* [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device
@ 2021-10-28 11:06 Simon Horman
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 1/8] flow_offload: fill flags to action structure Simon Horman
                   ` (9 more replies)
  0 siblings, 10 replies; 58+ messages in thread
From: Simon Horman @ 2021-10-28 11:06 UTC (permalink / raw)
  To: netdev
  Cc: Vlad Buslov, Jamal Hadi Salim, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers,
	Simon Horman

Baowen Zheng says:

Allow use of flow_indr_dev_register/flow_indr_dev_setup_offload to offload
tc actions independent of flows.

The motivation for this work is to prepare for using TC police action
instances to provide hardware offload of the OVS metering feature, which
calls for policers that may be used by multiple flows and whose lifecycle
is independent of any flows that use them.

This patchset includes basic changes to offload drivers so that they
return EOPNOTSUPP if this feature is used, as it is not yet supported by
any driver.

Example tc CLI commands to offload an action and reference it from filters:

tc qdisc del dev $DEV ingress && sleep 1 || true
tc actions delete action police index 99 || true

tc qdisc add dev $DEV ingress
tc qdisc show dev $DEV ingress

tc actions add action police index 99 rate 1mbit burst 100k skip_sw
tc actions list action police

tc filter add dev $DEV protocol ip parent ffff: \
    flower ip_proto tcp action police index 99
tc -s -d filter show dev $DEV protocol ip parent ffff:
tc filter add dev $DEV protocol ipv6 parent ffff: \
    flower skip_sw ip_proto tcp action police index 99
tc -s -d filter show dev $DEV protocol ipv6 parent ffff:
tc actions list action police

tc qdisc del dev $DEV ingress && sleep 1
tc actions delete action police index 99
tc actions list action police

Changes compared to v2 patches:

* Made changes according to the review comments.
* Removed the in_hw and not_in_hw flags; the user can tell whether the
  action is offloaded to any hardware from in_hw_count.
* Split the main action offload patch into three separate patches to
  facilitate code review.

We are posting this revision of the patchset as an RFC: while we feel it
is ready for review, we would like an opportunity to conduct further
testing before it is accepted upstream.

Baowen Zheng (8):
  flow_offload: fill flags to action structure
  flow_offload: reject to offload tc actions in offload drivers
  flow_offload: allow user to offload tc action to net device
  flow_offload: add skip_hw and skip_sw to control if offload the action
  flow_offload: add process to update action stats from hardware
  net: sched: save full flags for tc action
  flow_offload: add reoffload process to update hw_count
  flow_offload: validate flags of filter and actions

 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c  |   2 +-
 .../ethernet/mellanox/mlx5/core/en/rep/tc.c   |   3 +
 .../ethernet/netronome/nfp/flower/offload.c   |   3 +
 include/linux/netdevice.h                     |   1 +
 include/net/act_api.h                         |  34 +-
 include/net/flow_offload.h                    |  17 +
 include/net/pkt_cls.h                         |  61 ++-
 include/uapi/linux/pkt_cls.h                  |   9 +-
 net/core/flow_offload.c                       |  48 +-
 net/sched/act_api.c                           | 440 +++++++++++++++++-
 net/sched/act_bpf.c                           |   2 +-
 net/sched/act_connmark.c                      |   2 +-
 net/sched/act_ctinfo.c                        |   2 +-
 net/sched/act_gate.c                          |   2 +-
 net/sched/act_ife.c                           |   2 +-
 net/sched/act_ipt.c                           |   2 +-
 net/sched/act_mpls.c                          |   2 +-
 net/sched/act_nat.c                           |   2 +-
 net/sched/act_pedit.c                         |   2 +-
 net/sched/act_police.c                        |   2 +-
 net/sched/act_sample.c                        |   2 +-
 net/sched/act_simple.c                        |   2 +-
 net/sched/act_skbedit.c                       |   2 +-
 net/sched/act_skbmod.c                        |   2 +-
 net/sched/cls_api.c                           |  55 ++-
 net/sched/cls_flower.c                        |   3 +-
 net/sched/cls_matchall.c                      |   4 +-
 net/sched/cls_u32.c                           |   7 +-
 28 files changed, 661 insertions(+), 54 deletions(-)

-- 
2.20.1



* [RFC/PATCH net-next v3 1/8] flow_offload: fill flags to action structure
  2021-10-28 11:06 [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device Simon Horman
@ 2021-10-28 11:06 ` Simon Horman
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 2/8] flow_offload: reject to offload tc actions in offload drivers Simon Horman
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 58+ messages in thread
From: Simon Horman @ 2021-10-28 11:06 UTC (permalink / raw)
  To: netdev
  Cc: Vlad Buslov, Jamal Hadi Salim, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers,
	Baowen Zheng, Simon Horman

From: Baowen Zheng <baowen.zheng@corigine.com>

Pass the flags through to the action structure so that the user can
control whether the action is offloaded to hardware.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
---
 net/sched/act_bpf.c      | 2 +-
 net/sched/act_connmark.c | 2 +-
 net/sched/act_ctinfo.c   | 2 +-
 net/sched/act_gate.c     | 2 +-
 net/sched/act_ife.c      | 2 +-
 net/sched/act_ipt.c      | 2 +-
 net/sched/act_mpls.c     | 2 +-
 net/sched/act_nat.c      | 2 +-
 net/sched/act_pedit.c    | 2 +-
 net/sched/act_police.c   | 2 +-
 net/sched/act_sample.c   | 2 +-
 net/sched/act_simple.c   | 2 +-
 net/sched/act_skbedit.c  | 2 +-
 net/sched/act_skbmod.c   | 2 +-
 14 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c
index f2bf896331a5..a77d8908e737 100644
--- a/net/sched/act_bpf.c
+++ b/net/sched/act_bpf.c
@@ -305,7 +305,7 @@ static int tcf_bpf_init(struct net *net, struct nlattr *nla,
 	ret = tcf_idr_check_alloc(tn, &index, act, bind);
 	if (!ret) {
 		ret = tcf_idr_create(tn, index, est, act,
-				     &act_bpf_ops, bind, true, 0);
+				     &act_bpf_ops, bind, true, flags);
 		if (ret < 0) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_connmark.c b/net/sched/act_connmark.c
index 94e78ac7a748..09e2aafc8943 100644
--- a/net/sched/act_connmark.c
+++ b/net/sched/act_connmark.c
@@ -124,7 +124,7 @@ static int tcf_connmark_init(struct net *net, struct nlattr *nla,
 	ret = tcf_idr_check_alloc(tn, &index, a, bind);
 	if (!ret) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_connmark_ops, bind, false, 0);
+				     &act_connmark_ops, bind, false, flags);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_ctinfo.c b/net/sched/act_ctinfo.c
index 549374a2d008..0281e45987a4 100644
--- a/net/sched/act_ctinfo.c
+++ b/net/sched/act_ctinfo.c
@@ -212,7 +212,7 @@ static int tcf_ctinfo_init(struct net *net, struct nlattr *nla,
 	err = tcf_idr_check_alloc(tn, &index, a, bind);
 	if (!err) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_ctinfo_ops, bind, false, 0);
+				     &act_ctinfo_ops, bind, false, flags);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_gate.c b/net/sched/act_gate.c
index 7df72a4197a3..ac985c53ebaf 100644
--- a/net/sched/act_gate.c
+++ b/net/sched/act_gate.c
@@ -357,7 +357,7 @@ static int tcf_gate_init(struct net *net, struct nlattr *nla,
 
 	if (!err) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_gate_ops, bind, false, 0);
+				     &act_gate_ops, bind, false, flags);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_ife.c b/net/sched/act_ife.c
index b757f90a2d58..41ba55e60b1b 100644
--- a/net/sched/act_ife.c
+++ b/net/sched/act_ife.c
@@ -553,7 +553,7 @@ static int tcf_ife_init(struct net *net, struct nlattr *nla,
 
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, est, a, &act_ife_ops,
-				     bind, true, 0);
+				     bind, true, flags);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			kfree(p);
diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c
index 265b1443e252..2f3d507c24a1 100644
--- a/net/sched/act_ipt.c
+++ b/net/sched/act_ipt.c
@@ -145,7 +145,7 @@ static int __tcf_ipt_init(struct net *net, unsigned int id, struct nlattr *nla,
 
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, est, a, ops, bind,
-				     false, 0);
+				     false, flags);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_mpls.c b/net/sched/act_mpls.c
index 8faa4c58305e..2b30dc562743 100644
--- a/net/sched/act_mpls.c
+++ b/net/sched/act_mpls.c
@@ -248,7 +248,7 @@ static int tcf_mpls_init(struct net *net, struct nlattr *nla,
 
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_mpls_ops, bind, true, 0);
+				     &act_mpls_ops, bind, true, flags);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_nat.c b/net/sched/act_nat.c
index 7dd6b586ba7f..2a39b3729e84 100644
--- a/net/sched/act_nat.c
+++ b/net/sched/act_nat.c
@@ -61,7 +61,7 @@ static int tcf_nat_init(struct net *net, struct nlattr *nla, struct nlattr *est,
 	err = tcf_idr_check_alloc(tn, &index, a, bind);
 	if (!err) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_nat_ops, bind, false, 0);
+				     &act_nat_ops, bind, false, flags);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
index c6c862c459cc..cd3b8aad3192 100644
--- a/net/sched/act_pedit.c
+++ b/net/sched/act_pedit.c
@@ -189,7 +189,7 @@ static int tcf_pedit_init(struct net *net, struct nlattr *nla,
 	err = tcf_idr_check_alloc(tn, &index, a, bind);
 	if (!err) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_pedit_ops, bind, false, 0);
+				     &act_pedit_ops, bind, false, flags);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			goto out_free;
diff --git a/net/sched/act_police.c b/net/sched/act_police.c
index 9e77ba8401e5..c13a6245dfba 100644
--- a/net/sched/act_police.c
+++ b/net/sched/act_police.c
@@ -90,7 +90,7 @@ static int tcf_police_init(struct net *net, struct nlattr *nla,
 
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, NULL, a,
-				     &act_police_ops, bind, true, 0);
+				     &act_police_ops, bind, true, flags);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_sample.c b/net/sched/act_sample.c
index ce859b0e0deb..91a7a93d5f6a 100644
--- a/net/sched/act_sample.c
+++ b/net/sched/act_sample.c
@@ -70,7 +70,7 @@ static int tcf_sample_init(struct net *net, struct nlattr *nla,
 
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_sample_ops, bind, true, 0);
+				     &act_sample_ops, bind, true, flags);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_simple.c b/net/sched/act_simple.c
index e617ab4505ca..8c1d60bde93e 100644
--- a/net/sched/act_simple.c
+++ b/net/sched/act_simple.c
@@ -129,7 +129,7 @@ static int tcf_simp_init(struct net *net, struct nlattr *nla,
 
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_simp_ops, bind, false, 0);
+				     &act_simp_ops, bind, false, flags);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_skbedit.c b/net/sched/act_skbedit.c
index d30ecbfc8f84..cb2d10d3dcc0 100644
--- a/net/sched/act_skbedit.c
+++ b/net/sched/act_skbedit.c
@@ -176,7 +176,7 @@ static int tcf_skbedit_init(struct net *net, struct nlattr *nla,
 
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_skbedit_ops, bind, true, 0);
+				     &act_skbedit_ops, bind, true, flags);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_skbmod.c b/net/sched/act_skbmod.c
index 9b6b52c5e24e..2083612d8780 100644
--- a/net/sched/act_skbmod.c
+++ b/net/sched/act_skbmod.c
@@ -168,7 +168,7 @@ static int tcf_skbmod_init(struct net *net, struct nlattr *nla,
 
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_skbmod_ops, bind, true, 0);
+				     &act_skbmod_ops, bind, true, flags);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
-- 
2.20.1



* [RFC/PATCH net-next v3 2/8] flow_offload: reject to offload tc actions in offload drivers
  2021-10-28 11:06 [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device Simon Horman
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 1/8] flow_offload: fill flags to action structure Simon Horman
@ 2021-10-28 11:06 ` Simon Horman
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 3/8] flow_offload: allow user to offload tc action to net device Simon Horman
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 58+ messages in thread
From: Simon Horman @ 2021-10-28 11:06 UTC (permalink / raw)
  To: netdev
  Cc: Vlad Buslov, Jamal Hadi Salim, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers,
	Baowen Zheng, Simon Horman

From: Baowen Zheng <baowen.zheng@corigine.com>

A follow-up patch will allow users to offload tc actions independent of
classifier in the software datapath.

In preparation for this, teach all drivers that support offload of the flow
tables to reject such configuration as currently none of them support it.
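
As a rough illustration only (not part of this patch), the rejection can
be modelled in userspace as below. The struct mock_netdev type and the
mock_indr_setup_cb() callback are hypothetical stand-ins, not kernel
symbols; the sketch assumes that an action offloaded independently of a
classifier reaches the driver callback with a NULL netdev.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device; not the kernel type. */
struct mock_netdev {
	int ifindex;
};

/*
 * Hypothetical indirect-offload callback. A NULL netdev is assumed to
 * mean "action offload that is not tied to any device", which this mock
 * driver does not support.
 */
int mock_indr_setup_cb(const struct mock_netdev *netdev)
{
	if (!netdev)
		return -EOPNOTSUPP;

	/* Existing per-device block offload handling would continue here. */
	return 0;
}

/* Usage example: the action-only request is rejected, a bound one is not. */
void mock_indr_setup_cb_example(void)
{
	struct mock_netdev dev = { .ifindex = 1 };

	assert(mock_indr_setup_cb(NULL) == -EOPNOTSUPP);
	assert(mock_indr_setup_cb(&dev) == 0);
}
```

Checking for NULL first keeps the existing per-device paths unchanged
for callers that do pass a netdev.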

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c        | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c | 3 +++
 drivers/net/ethernet/netronome/nfp/flower/offload.c | 3 +++
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
index e6a4a768b10b..8c9bab932478 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
@@ -1962,7 +1962,7 @@ static int bnxt_tc_setup_indr_cb(struct net_device *netdev, struct Qdisc *sch, v
 				 void *data,
 				 void (*cleanup)(struct flow_block_cb *block_cb))
 {
-	if (!bnxt_is_netdev_indr_offload(netdev))
+	if (!netdev || !bnxt_is_netdev_indr_offload(netdev))
 		return -EOPNOTSUPP;
 
 	switch (type) {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
index 398c6761eeb3..5e69357df295 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
@@ -497,6 +497,9 @@ int mlx5e_rep_indr_setup_cb(struct net_device *netdev, struct Qdisc *sch, void *
 			    void *data,
 			    void (*cleanup)(struct flow_block_cb *block_cb))
 {
+	if (!netdev)
+		return -EOPNOTSUPP;
+
 	switch (type) {
 	case TC_SETUP_BLOCK:
 		return mlx5e_rep_indr_setup_block(netdev, sch, cb_priv, type_data,
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index 64c0ef57ad42..17190fe17a82 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -1867,6 +1867,9 @@ nfp_flower_indr_setup_tc_cb(struct net_device *netdev, struct Qdisc *sch, void *
 			    void *data,
 			    void (*cleanup)(struct flow_block_cb *block_cb))
 {
+	if (!netdev)
+		return -EOPNOTSUPP;
+
 	if (!nfp_fl_is_netdev_to_offload(netdev))
 		return -EOPNOTSUPP;
 
-- 
2.20.1



* [RFC/PATCH net-next v3 3/8] flow_offload: allow user to offload tc action to net device
  2021-10-28 11:06 [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device Simon Horman
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 1/8] flow_offload: fill flags to action structure Simon Horman
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 2/8] flow_offload: reject to offload tc actions in offload drivers Simon Horman
@ 2021-10-28 11:06 ` Simon Horman
  2021-10-29 16:59   ` Vlad Buslov
  2021-10-31  9:50   ` Oz Shlomo
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 4/8] flow_offload: add skip_hw and skip_sw to control if offload the action Simon Horman
                   ` (6 subsequent siblings)
  9 siblings, 2 replies; 58+ messages in thread
From: Simon Horman @ 2021-10-28 11:06 UTC (permalink / raw)
  To: netdev
  Cc: Vlad Buslov, Jamal Hadi Salim, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers,
	Baowen Zheng, Simon Horman

From: Baowen Zheng <baowen.zheng@corigine.com>

Use flow_indr_dev_register/flow_indr_dev_setup_offload to
offload tc actions.

We need to call tc_cleanup_flow_action to clean up the tc action entries
because some actions, notably the mirror action, may take a reference on
a net device in tc_setup_action.
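
As a rough userspace sketch (not part of this patch), the way an
action-only request can be fanned out to every registered callback, with
the successful ones counted, looks something like the code below. All
mock_* names are hypothetical; the real code walks a kernel list under a
mutex and passes more arguments.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type; the kernel callback takes more arguments. */
typedef int (*mock_indr_cb_t)(void);

/* A mock driver that accepts the offload request. */
int mock_cb_accept(void)
{
	return 0;
}

/* A mock driver that does not support action offload. */
int mock_cb_reject(void)
{
	return -EOPNOTSUPP;
}

/* Call every registered callback and count those that returned 0. */
int mock_setup_offload(const mock_indr_cb_t *cbs, size_t n)
{
	int count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (cbs[i]() == 0)
			count++;
	}

	return count;
}

/* Usage example: two of the three mock drivers accept the request. */
void mock_setup_offload_example(void)
{
	const mock_indr_cb_t cbs[] = {
		mock_cb_accept, mock_cb_reject, mock_cb_accept,
	};

	assert(mock_setup_offload(cbs, 3) == 2);
}
```

A non-negative return value then doubles as the number of devices that
accepted the action, while a negative value can still signal an error.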

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
---
 include/linux/netdevice.h  |   1 +
 include/net/act_api.h      |   2 +-
 include/net/flow_offload.h |  17 ++++
 include/net/pkt_cls.h      |  15 ++++
 net/core/flow_offload.c    |  43 ++++++++--
 net/sched/act_api.c        | 166 +++++++++++++++++++++++++++++++++++++
 net/sched/cls_api.c        |  29 ++++++-
 7 files changed, 260 insertions(+), 13 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3ec42495a43a..9815c3a058e9 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -916,6 +916,7 @@ enum tc_setup_type {
 	TC_SETUP_QDISC_TBF,
 	TC_SETUP_QDISC_FIFO,
 	TC_SETUP_QDISC_HTB,
+	TC_SETUP_ACT,
 };
 
 /* These structures hold the attributes of bpf state that are being passed
diff --git a/include/net/act_api.h b/include/net/act_api.h
index b5b624c7e488..9eb19188603c 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -239,7 +239,7 @@ static inline void tcf_action_inc_overlimit_qstats(struct tc_action *a)
 void tcf_action_update_stats(struct tc_action *a, u64 bytes, u64 packets,
 			     u64 drops, bool hw);
 int tcf_action_copy_stats(struct sk_buff *, struct tc_action *, int);
-
+int tcf_action_offload_del(struct tc_action *action);
 int tcf_action_check_ctrlact(int action, struct tcf_proto *tp,
 			     struct tcf_chain **handle,
 			     struct netlink_ext_ack *newchain);
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index 3961461d9c8b..aa28592fccc0 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -552,6 +552,23 @@ struct flow_cls_offload {
 	u32 classid;
 };
 
+enum flow_act_command {
+	FLOW_ACT_REPLACE,
+	FLOW_ACT_DESTROY,
+	FLOW_ACT_STATS,
+};
+
+struct flow_offload_action {
+	struct netlink_ext_ack *extack; /* NULL in FLOW_ACT_STATS process*/
+	enum flow_act_command command;
+	enum flow_action_id id;
+	u32 index;
+	struct flow_stats stats;
+	struct flow_action action;
+};
+
+struct flow_offload_action *flow_action_alloc(unsigned int num_actions);
+
 static inline struct flow_rule *
 flow_cls_offload_flow_rule(struct flow_cls_offload *flow_cmd)
 {
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 193f88ebf629..922775407257 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -258,6 +258,9 @@ static inline void tcf_exts_put_net(struct tcf_exts *exts)
 	for (; 0; (void)(i), (void)(a), (void)(exts))
 #endif
 
+#define tcf_act_for_each_action(i, a, actions) \
+	for (i = 0; i < TCA_ACT_MAX_PRIO && ((a) = actions[i]); i++)
+
 static inline void
 tcf_exts_stats_update(const struct tcf_exts *exts,
 		      u64 bytes, u64 packets, u64 drops, u64 lastuse,
@@ -532,8 +535,19 @@ tcf_match_indev(struct sk_buff *skb, int ifindex)
 	return ifindex == skb->skb_iif;
 }
 
+#ifdef CONFIG_NET_CLS_ACT
 int tc_setup_flow_action(struct flow_action *flow_action,
 			 const struct tcf_exts *exts);
+#else
+static inline int tc_setup_flow_action(struct flow_action *flow_action,
+				       const struct tcf_exts *exts)
+{
+	return 0;
+}
+#endif
+
+int tc_setup_action(struct flow_action *flow_action,
+		    struct tc_action *actions[]);
 void tc_cleanup_flow_action(struct flow_action *flow_action);
 
 int tc_setup_cb_call(struct tcf_block *block, enum tc_setup_type type,
@@ -554,6 +568,7 @@ int tc_setup_cb_reoffload(struct tcf_block *block, struct tcf_proto *tp,
 			  enum tc_setup_type type, void *type_data,
 			  void *cb_priv, u32 *flags, unsigned int *in_hw_count);
 unsigned int tcf_exts_num_actions(struct tcf_exts *exts);
+unsigned int tcf_act_num_actions_single(struct tc_action *act);
 
 #ifdef CONFIG_NET_CLS_ACT
 int tcf_qevent_init(struct tcf_qevent *qe, struct Qdisc *sch,
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index 6beaea13564a..6676431733ef 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -27,6 +27,27 @@ struct flow_rule *flow_rule_alloc(unsigned int num_actions)
 }
 EXPORT_SYMBOL(flow_rule_alloc);
 
+struct flow_offload_action *flow_action_alloc(unsigned int num_actions)
+{
+	struct flow_offload_action *fl_action;
+	int i;
+
+	fl_action = kzalloc(struct_size(fl_action, action.entries, num_actions),
+			    GFP_KERNEL);
+	if (!fl_action)
+		return NULL;
+
+	fl_action->action.num_entries = num_actions;
+	/* Pre-fill each action hw_stats with DONT_CARE.
+	 * Caller can override this if it wants stats for a given action.
+	 */
+	for (i = 0; i < num_actions; i++)
+		fl_action->action.entries[i].hw_stats = FLOW_ACTION_HW_STATS_DONT_CARE;
+
+	return fl_action;
+}
+EXPORT_SYMBOL(flow_action_alloc);
+
 #define FLOW_DISSECTOR_MATCH(__rule, __type, __out)				\
 	const struct flow_match *__m = &(__rule)->match;			\
 	struct flow_dissector *__d = (__m)->dissector;				\
@@ -549,19 +570,25 @@ int flow_indr_dev_setup_offload(struct net_device *dev,	struct Qdisc *sch,
 				void (*cleanup)(struct flow_block_cb *block_cb))
 {
 	struct flow_indr_dev *this;
+	u32 count = 0;
+	int err;
 
 	mutex_lock(&flow_indr_block_lock);
+	if (bo) {
+		if (bo->command == FLOW_BLOCK_BIND)
+			indir_dev_add(data, dev, sch, type, cleanup, bo);
+		else if (bo->command == FLOW_BLOCK_UNBIND)
+			indir_dev_remove(data);
+	}
 
-	if (bo->command == FLOW_BLOCK_BIND)
-		indir_dev_add(data, dev, sch, type, cleanup, bo);
-	else if (bo->command == FLOW_BLOCK_UNBIND)
-		indir_dev_remove(data);
-
-	list_for_each_entry(this, &flow_block_indr_dev_list, list)
-		this->cb(dev, sch, this->cb_priv, type, bo, data, cleanup);
+	list_for_each_entry(this, &flow_block_indr_dev_list, list) {
+		err = this->cb(dev, sch, this->cb_priv, type, bo, data, cleanup);
+		if (!err)
+			count++;
+	}
 
 	mutex_unlock(&flow_indr_block_lock);
 
-	return list_empty(&bo->cb_list) ? -EOPNOTSUPP : 0;
+	return (bo && list_empty(&bo->cb_list)) ? -EOPNOTSUPP : count;
 }
 EXPORT_SYMBOL(flow_indr_dev_setup_offload);
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 3258da3d5bed..33f2ff885b4b 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -21,6 +21,19 @@
 #include <net/pkt_cls.h>
 #include <net/act_api.h>
 #include <net/netlink.h>
+#include <net/tc_act/tc_pedit.h>
+#include <net/tc_act/tc_mirred.h>
+#include <net/tc_act/tc_vlan.h>
+#include <net/tc_act/tc_tunnel_key.h>
+#include <net/tc_act/tc_csum.h>
+#include <net/tc_act/tc_gact.h>
+#include <net/tc_act/tc_police.h>
+#include <net/tc_act/tc_sample.h>
+#include <net/tc_act/tc_skbedit.h>
+#include <net/tc_act/tc_ct.h>
+#include <net/tc_act/tc_mpls.h>
+#include <net/tc_act/tc_gate.h>
+#include <net/flow_offload.h>
 
 #ifdef CONFIG_INET
 DEFINE_STATIC_KEY_FALSE(tcf_frag_xmit_count);
@@ -148,6 +161,7 @@ static int __tcf_action_put(struct tc_action *p, bool bind)
 		idr_remove(&idrinfo->action_idr, p->tcfa_index);
 		mutex_unlock(&idrinfo->lock);
 
+		tcf_action_offload_del(p);
 		tcf_action_cleanup(p);
 		return 1;
 	}
@@ -341,6 +355,7 @@ static int tcf_idr_release_unsafe(struct tc_action *p)
 		return -EPERM;
 
 	if (refcount_dec_and_test(&p->tcfa_refcnt)) {
+		tcf_action_offload_del(p);
 		idr_remove(&p->idrinfo->action_idr, p->tcfa_index);
 		tcf_action_cleanup(p);
 		return ACT_P_DELETED;
@@ -452,6 +467,7 @@ static int tcf_idr_delete_index(struct tcf_idrinfo *idrinfo, u32 index)
 						p->tcfa_index));
 			mutex_unlock(&idrinfo->lock);
 
+			tcf_action_offload_del(p);
 			tcf_action_cleanup(p);
 			module_put(owner);
 			return 0;
@@ -1061,6 +1077,154 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 	return ERR_PTR(err);
 }
 
+static int flow_action_init(struct flow_offload_action *fl_action,
+			    struct tc_action *act,
+			    enum flow_act_command cmd,
+			    struct netlink_ext_ack *extack)
+{
+	if (!fl_action)
+		return -EINVAL;
+
+	fl_action->extack = extack;
+	fl_action->command = cmd;
+	fl_action->index = act->tcfa_index;
+
+	if (is_tcf_gact_ok(act)) {
+		fl_action->id = FLOW_ACTION_ACCEPT;
+	} else if (is_tcf_gact_shot(act)) {
+		fl_action->id = FLOW_ACTION_DROP;
+	} else if (is_tcf_gact_trap(act)) {
+		fl_action->id = FLOW_ACTION_TRAP;
+	} else if (is_tcf_gact_goto_chain(act)) {
+		fl_action->id = FLOW_ACTION_GOTO;
+	} else if (is_tcf_mirred_egress_redirect(act)) {
+		fl_action->id = FLOW_ACTION_REDIRECT;
+	} else if (is_tcf_mirred_egress_mirror(act)) {
+		fl_action->id = FLOW_ACTION_MIRRED;
+	} else if (is_tcf_mirred_ingress_redirect(act)) {
+		fl_action->id = FLOW_ACTION_REDIRECT_INGRESS;
+	} else if (is_tcf_mirred_ingress_mirror(act)) {
+		fl_action->id = FLOW_ACTION_MIRRED_INGRESS;
+	} else if (is_tcf_vlan(act)) {
+		switch (tcf_vlan_action(act)) {
+		case TCA_VLAN_ACT_PUSH:
+			fl_action->id = FLOW_ACTION_VLAN_PUSH;
+			break;
+		case TCA_VLAN_ACT_POP:
+			fl_action->id = FLOW_ACTION_VLAN_POP;
+			break;
+		case TCA_VLAN_ACT_MODIFY:
+			fl_action->id = FLOW_ACTION_VLAN_MANGLE;
+			break;
+		default:
+			return -EOPNOTSUPP;
+		}
+	} else if (is_tcf_tunnel_set(act)) {
+		fl_action->id = FLOW_ACTION_TUNNEL_ENCAP;
+	} else if (is_tcf_tunnel_release(act)) {
+		fl_action->id = FLOW_ACTION_TUNNEL_DECAP;
+	} else if (is_tcf_csum(act)) {
+		fl_action->id = FLOW_ACTION_CSUM;
+	} else if (is_tcf_skbedit_mark(act)) {
+		fl_action->id = FLOW_ACTION_MARK;
+	} else if (is_tcf_sample(act)) {
+		fl_action->id = FLOW_ACTION_SAMPLE;
+	} else if (is_tcf_police(act)) {
+		fl_action->id = FLOW_ACTION_POLICE;
+	} else if (is_tcf_ct(act)) {
+		fl_action->id = FLOW_ACTION_CT;
+	} else if (is_tcf_mpls(act)) {
+		switch (tcf_mpls_action(act)) {
+		case TCA_MPLS_ACT_PUSH:
+			fl_action->id = FLOW_ACTION_MPLS_PUSH;
+			break;
+		case TCA_MPLS_ACT_POP:
+			fl_action->id = FLOW_ACTION_MPLS_POP;
+			break;
+		case TCA_MPLS_ACT_MODIFY:
+			fl_action->id = FLOW_ACTION_MPLS_MANGLE;
+			break;
+		default:
+			return -EOPNOTSUPP;
+		}
+	} else if (is_tcf_skbedit_ptype(act)) {
+		fl_action->id = FLOW_ACTION_PTYPE;
+	} else if (is_tcf_skbedit_priority(act)) {
+		fl_action->id = FLOW_ACTION_PRIORITY;
+	} else if (is_tcf_gate(act)) {
+		fl_action->id = FLOW_ACTION_GATE;
+	} else {
+		return -EOPNOTSUPP;
+	}
+
+	return 0;
+}
+
+static int tcf_action_offload_cmd(struct flow_offload_action *fl_act,
+				  struct netlink_ext_ack *extack)
+{
+	int err;
+
+	if (IS_ERR(fl_act))
+		return PTR_ERR(fl_act);
+
+	err = flow_indr_dev_setup_offload(NULL, NULL, TC_SETUP_ACT,
+					  fl_act, NULL, NULL);
+	if (err < 0)
+		return err;
+
+	return 0;
+}
+
+/* offload the tc command after inserted */
+static int tcf_action_offload_add(struct tc_action *action,
+				  struct netlink_ext_ack *extack)
+{
+	struct tc_action *actions[TCA_ACT_MAX_PRIO] = {
+		[0] = action,
+	};
+	struct flow_offload_action *fl_action;
+	int err = 0;
+
+	fl_action = flow_action_alloc(tcf_act_num_actions_single(action));
+	if (!fl_action)
+		return -EINVAL;
+
+	err = flow_action_init(fl_action, action, FLOW_ACT_REPLACE, extack);
+	if (err)
+		goto fl_err;
+
+	err = tc_setup_action(&fl_action->action, actions);
+	if (err) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Failed to setup tc actions for offload\n");
+		goto fl_err;
+	}
+
+	err = tcf_action_offload_cmd(fl_action, extack);
+	tc_cleanup_flow_action(&fl_action->action);
+
+fl_err:
+	kfree(fl_action);
+
+	return err;
+}
+
+int tcf_action_offload_del(struct tc_action *action)
+{
+	struct flow_offload_action fl_act;
+	int err = 0;
+
+	if (!action)
+		return -EINVAL;
+
+	err = flow_action_init(&fl_act, action, FLOW_ACT_DESTROY, NULL);
+	if (err)
+		return err;
+
+	return tcf_action_offload_cmd(&fl_act, NULL);
+}
+
 /* Returns numbers of initialized actions or negative error. */
 
 int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
@@ -1103,6 +1267,8 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
 		sz += tcf_action_fill_size(act);
 		/* Start from index 0 */
 		actions[i - 1] = act;
+		if (!(flags & TCA_ACT_FLAGS_BIND))
+			tcf_action_offload_add(act, extack);
 	}
 
 	/* We have to commit them all together, because if any error happened in
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 2ef8f5a6205a..351d93988b8b 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -3544,8 +3544,8 @@ static enum flow_action_hw_stats tc_act_hw_stats(u8 hw_stats)
 	return hw_stats;
 }
 
-int tc_setup_flow_action(struct flow_action *flow_action,
-			 const struct tcf_exts *exts)
+int tc_setup_action(struct flow_action *flow_action,
+		    struct tc_action *actions[])
 {
 	struct tc_action *act;
 	int i, j, k, err = 0;
@@ -3554,11 +3554,11 @@ int tc_setup_flow_action(struct flow_action *flow_action,
 	BUILD_BUG_ON(TCA_ACT_HW_STATS_IMMEDIATE != FLOW_ACTION_HW_STATS_IMMEDIATE);
 	BUILD_BUG_ON(TCA_ACT_HW_STATS_DELAYED != FLOW_ACTION_HW_STATS_DELAYED);
 
-	if (!exts)
+	if (!actions)
 		return 0;
 
 	j = 0;
-	tcf_exts_for_each_action(i, act, exts) {
+	tcf_act_for_each_action(i, act, actions) {
 		struct flow_action_entry *entry;
 
 		entry = &flow_action->entries[j];
@@ -3725,7 +3725,19 @@ int tc_setup_flow_action(struct flow_action *flow_action,
 	spin_unlock_bh(&act->tcfa_lock);
 	goto err_out;
 }
+EXPORT_SYMBOL(tc_setup_action);
+
+#ifdef CONFIG_NET_CLS_ACT
+int tc_setup_flow_action(struct flow_action *flow_action,
+			 const struct tcf_exts *exts)
+{
+	if (!exts)
+		return 0;
+
+	return tc_setup_action(flow_action, exts->actions);
+}
 EXPORT_SYMBOL(tc_setup_flow_action);
+#endif
 
 unsigned int tcf_exts_num_actions(struct tcf_exts *exts)
 {
@@ -3743,6 +3755,15 @@ unsigned int tcf_exts_num_actions(struct tcf_exts *exts)
 }
 EXPORT_SYMBOL(tcf_exts_num_actions);
 
+unsigned int tcf_act_num_actions_single(struct tc_action *act)
+{
+	if (is_tcf_pedit(act))
+		return tcf_pedit_nkeys(act);
+	else
+		return 1;
+}
+EXPORT_SYMBOL(tcf_act_num_actions_single);
+
 #ifdef CONFIG_NET_CLS_ACT
 static int tcf_qevent_parse_block_index(struct nlattr *block_index_attr,
 					u32 *p_block_index,
-- 
2.20.1



* [RFC/PATCH net-next v3 4/8] flow_offload: add skip_hw and skip_sw to control if offload the action
  2021-10-28 11:06 [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device Simon Horman
                   ` (2 preceding siblings ...)
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 3/8] flow_offload: allow user to offload tc action to net device Simon Horman
@ 2021-10-28 11:06 ` Simon Horman
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 5/8] flow_offload: add process to update action stats from hardware Simon Horman
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 58+ messages in thread
From: Simon Horman @ 2021-10-28 11:06 UTC (permalink / raw)
  To: netdev
  Cc: Vlad Buslov, Jamal Hadi Salim, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers,
	Baowen Zheng, Simon Horman

From: Baowen Zheng <baowen.zheng@corigine.com>

Add skip_hw and skip_sw flags so that the user can control whether an
action is offloaded to hardware.

Also add in_hw_count to indicate to the user whether the action is
offloaded to any hardware.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
---
 include/net/act_api.h        |  7 +++++
 include/net/pkt_cls.h        | 23 +++++++++++++++
 include/uapi/linux/pkt_cls.h |  9 ++++--
 net/sched/act_api.c          | 54 ++++++++++++++++++++++++++++++++----
 4 files changed, 84 insertions(+), 9 deletions(-)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index 9eb19188603c..671208bd27ef 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -44,6 +44,7 @@ struct tc_action {
 	u8			hw_stats;
 	u8			used_hw_stats;
 	bool			used_hw_stats_valid;
+	u32			in_hw_count;
 };
 #define tcf_index	common.tcfa_index
 #define tcf_refcnt	common.tcfa_refcnt
@@ -236,6 +237,12 @@ static inline void tcf_action_inc_overlimit_qstats(struct tc_action *a)
 	spin_unlock(&a->tcfa_lock);
 }
 
+static inline void flow_action_hw_count_set(struct tc_action *act,
+					    u32 hw_count)
+{
+	act->in_hw_count = hw_count;
+}
+
 void tcf_action_update_stats(struct tc_action *a, u64 bytes, u64 packets,
 			     u64 drops, bool hw);
 int tcf_action_copy_stats(struct sk_buff *, struct tc_action *, int);
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 922775407257..44ae5182a965 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -261,6 +261,29 @@ static inline void tcf_exts_put_net(struct tcf_exts *exts)
 #define tcf_act_for_each_action(i, a, actions) \
 	for (i = 0; i < TCA_ACT_MAX_PRIO && ((a) = actions[i]); i++)
 
+static inline bool tc_act_skip_hw(u32 flags)
+{
+	return (flags & TCA_ACT_FLAGS_SKIP_HW) ? true : false;
+}
+
+static inline bool tc_act_skip_sw(u32 flags)
+{
+	return (flags & TCA_ACT_FLAGS_SKIP_SW) ? true : false;
+}
+
+static inline bool tc_act_in_hw(struct tc_action *act)
+{
+	return !!act->in_hw_count;
+}
+
+/* SKIP_HW and SKIP_SW are mutually exclusive flags. */
+static inline bool tc_act_flags_valid(u32 flags)
+{
+	flags &= TCA_ACT_FLAGS_SKIP_HW | TCA_ACT_FLAGS_SKIP_SW;
+
+	return flags ^ (TCA_ACT_FLAGS_SKIP_HW | TCA_ACT_FLAGS_SKIP_SW);
+}
+
 static inline void
 tcf_exts_stats_update(const struct tcf_exts *exts,
 		      u64 bytes, u64 packets, u64 drops, u64 lastuse,
diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index 6836ccb9c45d..ee38b35c3f57 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -19,13 +19,16 @@ enum {
 	TCA_ACT_FLAGS,
 	TCA_ACT_HW_STATS,
 	TCA_ACT_USED_HW_STATS,
+	TCA_ACT_IN_HW_COUNT,
 	__TCA_ACT_MAX
 };
 
 /* See other TCA_ACT_FLAGS_ * flags in include/net/act_api.h. */
-#define TCA_ACT_FLAGS_NO_PERCPU_STATS 1 /* Don't use percpu allocator for
-					 * actions stats.
-					 */
+#define TCA_ACT_FLAGS_NO_PERCPU_STATS (1 << 0) /* Don't use percpu allocator for
+						* actions stats.
+						*/
+#define TCA_ACT_FLAGS_SKIP_HW	(1 << 1) /* don't offload action to HW */
+#define TCA_ACT_FLAGS_SKIP_SW	(1 << 2) /* don't use action in SW */
 
 /* tca HW stats type
  * When user does not pass the attribute, he does not care.
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 33f2ff885b4b..604bf1923bcc 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -751,6 +751,9 @@ int tcf_action_exec(struct sk_buff *skb, struct tc_action **actions,
 			jmp_prgcnt -= 1;
 			continue;
 		}
+
+		if (tc_act_skip_sw(a->tcfa_flags))
+			continue;
 repeat:
 		ret = a->ops->act(skb, a, res);
 		if (ret == TC_ACT_REPEAT)
@@ -856,6 +859,9 @@ tcf_action_dump_1(struct sk_buff *skb, struct tc_action *a, int bind, int ref)
 			       a->tcfa_flags, a->tcfa_flags))
 		goto nla_put_failure;
 
+	if (nla_put_u32(skb, TCA_ACT_IN_HW_COUNT, a->in_hw_count))
+		goto nla_put_failure;
+
 	nest = nla_nest_start_noflag(skb, TCA_OPTIONS);
 	if (nest == NULL)
 		goto nla_put_failure;
@@ -935,7 +941,9 @@ static const struct nla_policy tcf_action_policy[TCA_ACT_MAX + 1] = {
 	[TCA_ACT_COOKIE]	= { .type = NLA_BINARY,
 				    .len = TC_COOKIE_MAX_SIZE },
 	[TCA_ACT_OPTIONS]	= { .type = NLA_NESTED },
-	[TCA_ACT_FLAGS]		= NLA_POLICY_BITFIELD32(TCA_ACT_FLAGS_NO_PERCPU_STATS),
+	[TCA_ACT_FLAGS]		= NLA_POLICY_BITFIELD32(TCA_ACT_FLAGS_NO_PERCPU_STATS |
+							TCA_ACT_FLAGS_SKIP_HW |
+							TCA_ACT_FLAGS_SKIP_SW),
 	[TCA_ACT_HW_STATS]	= NLA_POLICY_BITFIELD32(TCA_ACT_HW_STATS_ANY),
 };
 
@@ -1048,8 +1056,13 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 			}
 		}
 		hw_stats = tcf_action_hw_stats_get(tb[TCA_ACT_HW_STATS]);
-		if (tb[TCA_ACT_FLAGS])
+		if (tb[TCA_ACT_FLAGS]) {
 			userflags = nla_get_bitfield32(tb[TCA_ACT_FLAGS]);
+			if (!tc_act_flags_valid(userflags.value)) {
+				err = -EINVAL;
+				goto err_out;
+			}
+		}
 
 		err = a_o->init(net, tb[TCA_ACT_OPTIONS], est, &a, tp,
 				userflags.value | flags, extack);
@@ -1161,6 +1174,7 @@ static int flow_action_init(struct flow_offload_action *fl_action,
 }
 
 static int tcf_action_offload_cmd(struct flow_offload_action *fl_act,
+				  u32 *hw_count,
 				  struct netlink_ext_ack *extack)
 {
 	int err;
@@ -1173,6 +1187,9 @@ static int tcf_action_offload_cmd(struct flow_offload_action *fl_act,
 	if (err < 0)
 		return err;
 
+	if (hw_count)
+		*hw_count = err;
+
 	return 0;
 }
 
@@ -1180,12 +1197,17 @@ static int tcf_action_offload_cmd(struct flow_offload_action *fl_act,
 static int tcf_action_offload_add(struct tc_action *action,
 				  struct netlink_ext_ack *extack)
 {
+	bool skip_sw = tc_act_skip_sw(action->tcfa_flags);
 	struct tc_action *actions[TCA_ACT_MAX_PRIO] = {
 		[0] = action,
 	};
 	struct flow_offload_action *fl_action;
+	u32 in_hw_count = 0;
 	int err = 0;
 
+	if (tc_act_skip_hw(action->tcfa_flags))
+		return 0;
+
 	fl_action = flow_action_alloc(tcf_act_num_actions_single(action));
 	if (!fl_action)
 		return -EINVAL;
@@ -1201,7 +1223,13 @@ static int tcf_action_offload_add(struct tc_action *action,
 		goto fl_err;
 	}
 
-	err = tcf_action_offload_cmd(fl_action, extack);
+	err = tcf_action_offload_cmd(fl_action, &in_hw_count, extack);
+	if (!err)
+		flow_action_hw_count_set(action, in_hw_count);
+
+	if (skip_sw && !tc_act_in_hw(action))
+		err = -EINVAL;
+
 	tc_cleanup_flow_action(&fl_action->action);
 
 fl_err:
@@ -1213,16 +1241,27 @@ static int tcf_action_offload_add(struct tc_action *action,
 int tcf_action_offload_del(struct tc_action *action)
 {
 	struct flow_offload_action fl_act;
+	u32 in_hw_count = 0;
 	int err = 0;
 
 	if (!action)
 		return -EINVAL;
 
+	if (!tc_act_in_hw(action))
+		return 0;
+
 	err = flow_action_init(&fl_act, action, FLOW_ACT_DESTROY, NULL);
 	if (err)
 		return err;
 
-	return tcf_action_offload_cmd(&fl_act, NULL);
+	err = tcf_action_offload_cmd(&fl_act, &in_hw_count, NULL);
+	if (err)
+		return err;
+
+	if (action->in_hw_count != in_hw_count)
+		return -EINVAL;
+
+	return 0;
 }
 
 /* Returns numbers of initialized actions or negative error. */
@@ -1267,8 +1306,11 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
 		sz += tcf_action_fill_size(act);
 		/* Start from index 0 */
 		actions[i - 1] = act;
-		if (!(flags & TCA_ACT_FLAGS_BIND))
-			tcf_action_offload_add(act, extack);
+		if (!(flags & TCA_ACT_FLAGS_BIND)) {
+			err = tcf_action_offload_add(act, extack);
+			if (tc_act_skip_sw(act->tcfa_flags) && err)
+				goto err;
+		}
 	}
 
 	/* We have to commit them all together, because if any error happened in
-- 
2.20.1
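
Editorial illustration (not part of the patch): the XOR test in
tc_act_flags_valid() can be checked in user space. The sketch below copies
the three flag values from the uapi hunk above; act_flags_valid() and
act_flags_valid_selfcheck() are hypothetical names used only here, and the
code is a minimal model rather than the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values copied from the include/uapi/linux/pkt_cls.h hunk. */
#define TCA_ACT_FLAGS_NO_PERCPU_STATS	(1 << 0)
#define TCA_ACT_FLAGS_SKIP_HW		(1 << 1)
#define TCA_ACT_FLAGS_SKIP_SW		(1 << 2)

/* Hypothetical user-space copy of the tc_act_flags_valid() logic:
 * mask down to the two skip bits, then XOR with both bits set.
 * The result is zero (invalid) only when both skip bits are present.
 */
static bool act_flags_valid(uint32_t flags)
{
	uint32_t both = TCA_ACT_FLAGS_SKIP_HW | TCA_ACT_FLAGS_SKIP_SW;

	flags &= both;
	return (flags ^ both) != 0;
}

/* Walk the four combinations of the two skip bits. */
static void act_flags_valid_selfcheck(void)
{
	assert(act_flags_valid(0));
	assert(act_flags_valid(TCA_ACT_FLAGS_SKIP_HW));
	assert(act_flags_valid(TCA_ACT_FLAGS_SKIP_SW));
	assert(!act_flags_valid(TCA_ACT_FLAGS_SKIP_HW | TCA_ACT_FLAGS_SKIP_SW));
}
```

Unrelated bits such as TCA_ACT_FLAGS_NO_PERCPU_STATS are masked off before
the comparison, so they do not affect the result.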



* [RFC/PATCH net-next v3 5/8] flow_offload: add process to update action stats from hardware
  2021-10-28 11:06 [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device Simon Horman
                   ` (3 preceding siblings ...)
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 4/8] flow_offload: add skip_hw and skip_sw to control if offload the action Simon Horman
@ 2021-10-28 11:06 ` Simon Horman
  2021-10-29 17:11   ` Vlad Buslov
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 6/8] net: sched: save full flags for tc action Simon Horman
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 58+ messages in thread
From: Simon Horman @ 2021-10-28 11:06 UTC (permalink / raw)
  To: netdev
  Cc: Vlad Buslov, Jamal Hadi Salim, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers,
	Baowen Zheng, Simon Horman

From: Baowen Zheng <baowen.zheng@corigine.com>

When collecting stats for actions, update them using both
hardware and software counters.

The stats update process should not run in preempt_disable context.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
---
 include/net/act_api.h |  1 +
 include/net/pkt_cls.h | 18 ++++++++++--------
 net/sched/act_api.c   | 37 +++++++++++++++++++++++++++++++++++++
 3 files changed, 48 insertions(+), 8 deletions(-)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index 671208bd27ef..80a9d1e7d805 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -247,6 +247,7 @@ void tcf_action_update_stats(struct tc_action *a, u64 bytes, u64 packets,
 			     u64 drops, bool hw);
 int tcf_action_copy_stats(struct sk_buff *, struct tc_action *, int);
 int tcf_action_offload_del(struct tc_action *action);
+int tcf_action_update_hw_stats(struct tc_action *action);
 int tcf_action_check_ctrlact(int action, struct tcf_proto *tp,
 			     struct tcf_chain **handle,
 			     struct netlink_ext_ack *newchain);
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 44ae5182a965..88788b821f76 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -292,18 +292,20 @@ tcf_exts_stats_update(const struct tcf_exts *exts,
 #ifdef CONFIG_NET_CLS_ACT
 	int i;
 
-	preempt_disable();
-
 	for (i = 0; i < exts->nr_actions; i++) {
 		struct tc_action *a = exts->actions[i];
 
-		tcf_action_stats_update(a, bytes, packets, drops,
-					lastuse, true);
-		a->used_hw_stats = used_hw_stats;
-		a->used_hw_stats_valid = used_hw_stats_valid;
-	}
+		/* if stats from hw, just skip */
+		if (tcf_action_update_hw_stats(a)) {
+			preempt_disable();
+			tcf_action_stats_update(a, bytes, packets, drops,
+						lastuse, true);
+			preempt_enable();
 
-	preempt_enable();
+			a->used_hw_stats = used_hw_stats;
+			a->used_hw_stats_valid = used_hw_stats_valid;
+		}
+	}
 #endif
 }
 
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 604bf1923bcc..881c7ba4d180 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -1238,6 +1238,40 @@ static int tcf_action_offload_add(struct tc_action *action,
 	return err;
 }
 
+int tcf_action_update_hw_stats(struct tc_action *action)
+{
+	struct flow_offload_action fl_act = {};
+	int err = 0;
+
+	if (!tc_act_in_hw(action))
+		return -EOPNOTSUPP;
+
+	err = flow_action_init(&fl_act, action, FLOW_ACT_STATS, NULL);
+	if (err)
+		goto err_out;
+
+	err = tcf_action_offload_cmd(&fl_act, NULL, NULL);
+
+	if (!err && fl_act.stats.lastused) {
+		preempt_disable();
+		tcf_action_stats_update(action, fl_act.stats.bytes,
+					fl_act.stats.pkts,
+					fl_act.stats.drops,
+					fl_act.stats.lastused,
+					true);
+		preempt_enable();
+		action->used_hw_stats = fl_act.stats.used_hw_stats;
+		action->used_hw_stats_valid = true;
+		err = 0;
+	} else {
+		err = -EOPNOTSUPP;
+	}
+
+err_out:
+	return err;
+}
+EXPORT_SYMBOL(tcf_action_update_hw_stats);
+
 int tcf_action_offload_del(struct tc_action *action)
 {
 	struct flow_offload_action fl_act;
@@ -1362,6 +1396,9 @@ int tcf_action_copy_stats(struct sk_buff *skb, struct tc_action *p,
 	if (p == NULL)
 		goto errout;
 
+	/* update hw stats for this action */
+	tcf_action_update_hw_stats(p);
+
 	/* compat_mode being true specifies a call that is supposed
 	 * to add additional backward compatibility statistic TLVs.
 	 */
-- 
2.20.1
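
Editorial illustration (not part of the patch): the per-action fallback in
tcf_exts_stats_update() relies on tcf_action_update_hw_stats() returning 0
when the action's own hardware counters were applied and a negative errno
otherwise. The sketch below models only that control flow in user space.
struct model_action and the model_* functions are hypothetical, and the
driver query is replaced by fields on the struct.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct tc_action; only what the model needs. */
struct model_action {
	uint32_t in_hw_count;	/* devices that hold the action */
	uint64_t hw_bytes;	/* bytes the device would report */
	uint64_t hw_lastused;	/* 0 models "device reported nothing" */
	uint64_t bytes;		/* accumulated byte counter */
};

/* Same return convention as tcf_action_update_hw_stats(): 0 when the
 * action's own hardware counters were used, -EOPNOTSUPP otherwise.
 */
static int model_update_hw_stats(struct model_action *a)
{
	if (!a->in_hw_count || !a->hw_lastused)
		return -EOPNOTSUPP;

	a->bytes += a->hw_bytes;
	return 0;
}

/* Same shape as the loop in tcf_exts_stats_update(): the filter-supplied
 * counters are applied only to actions that could not report their own.
 */
static void model_exts_stats_update(struct model_action *acts, size_t n,
				    uint64_t filter_bytes)
{
	for (size_t i = 0; i < n; i++) {
		if (model_update_hw_stats(&acts[i]))
			acts[i].bytes += filter_bytes;
	}
}

static void model_stats_selfcheck(void)
{
	struct model_action acts[2] = {
		{ .in_hw_count = 1, .hw_bytes = 100, .hw_lastused = 5 },
		{ .in_hw_count = 0 },
	};

	model_exts_stats_update(acts, 2, 40);
	assert(acts[0].bytes == 100);	/* own hardware counters */
	assert(acts[1].bytes == 40);	/* fell back to filter counters */
}
```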



* [RFC/PATCH net-next v3 6/8] net: sched: save full flags for tc action
  2021-10-28 11:06 [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device Simon Horman
                   ` (4 preceding siblings ...)
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 5/8] flow_offload: add process to update action stats from hardware Simon Horman
@ 2021-10-28 11:06 ` Simon Horman
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 7/8] flow_offload: add reoffload process to update hw_count Simon Horman
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 58+ messages in thread
From: Simon Horman @ 2021-10-28 11:06 UTC (permalink / raw)
  To: netdev
  Cc: Vlad Buslov, Jamal Hadi Salim, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers,
	Baowen Zheng, Simon Horman

From: Baowen Zheng <baowen.zheng@corigine.com>

Save the full action flags, and return only the user flags when
returning flags to user space.

Saving the full action flags makes it possible to distinguish whether
the action was created independently of a classifier.

This change is made mainly for a later patch that reoffloads tc actions.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
---
 net/sched/act_api.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 881c7ba4d180..3893ffd91192 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -513,7 +513,7 @@ int tcf_idr_create(struct tc_action_net *tn, u32 index, struct nlattr *est,
 	p->tcfa_tm.install = jiffies;
 	p->tcfa_tm.lastuse = jiffies;
 	p->tcfa_tm.firstuse = 0;
-	p->tcfa_flags = flags & TCA_ACT_FLAGS_USER_MASK;
+	p->tcfa_flags = flags;
 	if (est) {
 		err = gen_new_estimator(&p->tcfa_bstats, p->cpu_bstats,
 					&p->tcfa_rate_est,
@@ -840,6 +840,7 @@ tcf_action_dump_1(struct sk_buff *skb, struct tc_action *a, int bind, int ref)
 	int err = -EINVAL;
 	unsigned char *b = skb_tail_pointer(skb);
 	struct nlattr *nest;
+	u32 flags;
 
 	if (tcf_action_dump_terse(skb, a, false))
 		goto nla_put_failure;
@@ -854,9 +855,10 @@ tcf_action_dump_1(struct sk_buff *skb, struct tc_action *a, int bind, int ref)
 			       a->used_hw_stats, TCA_ACT_HW_STATS_ANY))
 		goto nla_put_failure;
 
-	if (a->tcfa_flags &&
+	flags = a->tcfa_flags & TCA_ACT_FLAGS_USER_MASK;
+	if (flags &&
 	    nla_put_bitfield32(skb, TCA_ACT_FLAGS,
-			       a->tcfa_flags, a->tcfa_flags))
+			       flags, flags))
 		goto nla_put_failure;
 
 	if (nla_put_u32(skb, TCA_ACT_IN_HW_COUNT, a->in_hw_count))
-- 
2.20.1



* [RFC/PATCH net-next v3 7/8] flow_offload: add reoffload process to update hw_count
  2021-10-28 11:06 [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device Simon Horman
                   ` (5 preceding siblings ...)
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 6/8] net: sched: save full flags for tc action Simon Horman
@ 2021-10-28 11:06 ` Simon Horman
  2021-10-29 17:31   ` Vlad Buslov
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions Simon Horman
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 58+ messages in thread
From: Simon Horman @ 2021-10-28 11:06 UTC (permalink / raw)
  To: netdev
  Cc: Vlad Buslov, Jamal Hadi Salim, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers,
	Baowen Zheng, Simon Horman

From: Baowen Zheng <baowen.zheng@corigine.com>

Add a reoffload process to update hw_count when a driver
is inserted or removed.

When reoffloading actions, we still offload the actions
that were added independently of filters.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
---
 include/net/act_api.h   |  24 +++++
 include/net/pkt_cls.h   |   5 +
 net/core/flow_offload.c |   5 +
 net/sched/act_api.c     | 213 ++++++++++++++++++++++++++++++++++++----
 4 files changed, 228 insertions(+), 19 deletions(-)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index 80a9d1e7d805..03ff39e347c3 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -7,6 +7,7 @@
 */
 
 #include <linux/refcount.h>
+#include <net/flow_offload.h>
 #include <net/sch_generic.h>
 #include <net/pkt_sched.h>
 #include <net/net_namespace.h>
@@ -243,11 +244,26 @@ static inline void flow_action_hw_count_set(struct tc_action *act,
 	act->in_hw_count = hw_count;
 }
 
+static inline void flow_action_hw_count_inc(struct tc_action *act,
+					    u32 hw_count)
+{
+	act->in_hw_count += hw_count;
+}
+
+static inline void flow_action_hw_count_dec(struct tc_action *act,
+					    u32 hw_count)
+{
+	act->in_hw_count = act->in_hw_count > hw_count ?
+			   act->in_hw_count - hw_count : 0;
+}
+
 void tcf_action_update_stats(struct tc_action *a, u64 bytes, u64 packets,
 			     u64 drops, bool hw);
 int tcf_action_copy_stats(struct sk_buff *, struct tc_action *, int);
 int tcf_action_offload_del(struct tc_action *action);
 int tcf_action_update_hw_stats(struct tc_action *action);
+int tcf_action_reoffload_cb(flow_indr_block_bind_cb_t *cb,
+			    void *cb_priv, bool add);
 int tcf_action_check_ctrlact(int action, struct tcf_proto *tp,
 			     struct tcf_chain **handle,
 			     struct netlink_ext_ack *newchain);
@@ -259,6 +275,14 @@ DECLARE_STATIC_KEY_FALSE(tcf_frag_xmit_count);
 #endif
 
 int tcf_dev_queue_xmit(struct sk_buff *skb, int (*xmit)(struct sk_buff *skb));
+
+#else /* !CONFIG_NET_CLS_ACT */
+
+static inline int tcf_action_reoffload_cb(flow_indr_block_bind_cb_t *cb,
+					  void *cb_priv, bool add) {
+	return 0;
+}
+
 #endif /* CONFIG_NET_CLS_ACT */
 
 static inline void tcf_action_stats_update(struct tc_action *a, u64 bytes,
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 88788b821f76..82ac631c50bc 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -284,6 +284,11 @@ static inline bool tc_act_flags_valid(u32 flags)
 	return flags ^ (TCA_ACT_FLAGS_SKIP_HW | TCA_ACT_FLAGS_SKIP_SW);
 }
 
+static inline bool tc_act_bind(u32 flags)
+{
+	return !!(flags & TCA_ACT_FLAGS_BIND);
+}
+
 static inline void
 tcf_exts_stats_update(const struct tcf_exts *exts,
 		      u64 bytes, u64 packets, u64 drops, u64 lastuse,
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index 6676431733ef..d591204af6e0 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #include <linux/kernel.h>
 #include <linux/slab.h>
+#include <net/act_api.h>
 #include <net/flow_offload.h>
 #include <linux/rtnetlink.h>
 #include <linux/mutex.h>
@@ -418,6 +419,8 @@ int flow_indr_dev_register(flow_indr_block_bind_cb_t *cb, void *cb_priv)
 	existing_qdiscs_register(cb, cb_priv);
 	mutex_unlock(&flow_indr_block_lock);
 
+	tcf_action_reoffload_cb(cb, cb_priv, true);
+
 	return 0;
 }
 EXPORT_SYMBOL(flow_indr_dev_register);
@@ -472,6 +475,8 @@ void flow_indr_dev_unregister(flow_indr_block_bind_cb_t *cb, void *cb_priv,
 
 	flow_block_indr_notify(&cleanup_list);
 	kfree(indr_dev);
+
+	tcf_action_reoffload_cb(cb, cb_priv, false);
 }
 EXPORT_SYMBOL(flow_indr_dev_unregister);
 
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 3893ffd91192..dce25d8f147b 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -638,6 +638,59 @@ EXPORT_SYMBOL(tcf_idrinfo_destroy);
 
 static LIST_HEAD(act_base);
 static DEFINE_RWLOCK(act_mod_lock);
+/* since act ops id is stored in pernet subsystem list,
+ * then there is no way to walk through only all the action
+ * subsystem, so we keep tc action pernet ops id for
+ * reoffload to walk through.
+ */
+static LIST_HEAD(act_pernet_id_list);
+static DEFINE_MUTEX(act_id_mutex);
+struct tc_act_pernet_id {
+	struct list_head list;
+	unsigned int id;
+};
+
+static int tcf_pernet_add_id_list(unsigned int id)
+{
+	struct tc_act_pernet_id *id_ptr;
+	int ret = 0;
+
+	mutex_lock(&act_id_mutex);
+	list_for_each_entry(id_ptr, &act_pernet_id_list, list) {
+		if (id_ptr->id == id) {
+			ret = -EEXIST;
+			goto err_out;
+		}
+	}
+
+	id_ptr = kzalloc(sizeof(*id_ptr), GFP_KERNEL);
+	if (!id_ptr) {
+		ret = -ENOMEM;
+		goto err_out;
+	}
+	id_ptr->id = id;
+
+	list_add_tail(&id_ptr->list, &act_pernet_id_list);
+
+err_out:
+	mutex_unlock(&act_id_mutex);
+	return ret;
+}
+
+static void tcf_pernet_del_id_list(unsigned int id)
+{
+	struct tc_act_pernet_id *id_ptr;
+
+	mutex_lock(&act_id_mutex);
+	list_for_each_entry(id_ptr, &act_pernet_id_list, list) {
+		if (id_ptr->id == id) {
+			list_del(&id_ptr->list);
+			kfree(id_ptr);
+			break;
+		}
+	}
+	mutex_unlock(&act_id_mutex);
+}
 
 int tcf_register_action(struct tc_action_ops *act,
 			struct pernet_operations *ops)
@@ -656,18 +709,30 @@ int tcf_register_action(struct tc_action_ops *act,
 	if (ret)
 		return ret;
 
+	if (ops->id) {
+		ret = tcf_pernet_add_id_list(*ops->id);
+		if (ret)
+			goto id_err;
+	}
+
 	write_lock(&act_mod_lock);
 	list_for_each_entry(a, &act_base, head) {
 		if (act->id == a->id || (strcmp(act->kind, a->kind) == 0)) {
-			write_unlock(&act_mod_lock);
-			unregister_pernet_subsys(ops);
-			return -EEXIST;
+			ret = -EEXIST;
+			goto err_out;
 		}
 	}
 	list_add_tail(&act->head, &act_base);
 	write_unlock(&act_mod_lock);
 
 	return 0;
+
+err_out:
+	write_unlock(&act_mod_lock);
+	tcf_pernet_del_id_list(*ops->id);
+id_err:
+	unregister_pernet_subsys(ops);
+	return ret;
 }
 EXPORT_SYMBOL(tcf_register_action);
 
@@ -686,8 +751,11 @@ int tcf_unregister_action(struct tc_action_ops *act,
 		}
 	}
 	write_unlock(&act_mod_lock);
-	if (!err)
+	if (!err) {
 		unregister_pernet_subsys(ops);
+		if (ops->id)
+			tcf_pernet_del_id_list(*ops->id);
+	}
 	return err;
 }
 EXPORT_SYMBOL(tcf_unregister_action);
@@ -1175,15 +1243,11 @@ static int flow_action_init(struct flow_offload_action *fl_action,
 	return 0;
 }
 
-static int tcf_action_offload_cmd(struct flow_offload_action *fl_act,
-				  u32 *hw_count,
-				  struct netlink_ext_ack *extack)
+static int tcf_action_offload_cmd_ex(struct flow_offload_action *fl_act,
+				     u32 *hw_count)
 {
 	int err;
 
-	if (IS_ERR(fl_act))
-		return PTR_ERR(fl_act);
-
 	err = flow_indr_dev_setup_offload(NULL, NULL, TC_SETUP_ACT,
 					  fl_act, NULL, NULL);
 	if (err < 0)
@@ -1195,9 +1259,41 @@ static int tcf_action_offload_cmd(struct flow_offload_action *fl_act,
 	return 0;
 }
 
+static int tcf_action_offload_cmd_cb_ex(struct flow_offload_action *fl_act,
+					u32 *hw_count,
+					flow_indr_block_bind_cb_t *cb,
+					void *cb_priv)
+{
+	int err;
+
+	err = cb(NULL, NULL, cb_priv, TC_SETUP_ACT, NULL, fl_act, NULL);
+	if (err < 0)
+		return err;
+
+	if (hw_count)
+		*hw_count = 1;
+
+	return 0;
+}
+
+static int tcf_action_offload_cmd(struct flow_offload_action *fl_act,
+				  u32 *hw_count,
+				  flow_indr_block_bind_cb_t *cb,
+				  void *cb_priv)
+{
+	if (IS_ERR(fl_act))
+		return PTR_ERR(fl_act);
+
+	return cb ? tcf_action_offload_cmd_cb_ex(fl_act, hw_count,
+						 cb, cb_priv) :
+		    tcf_action_offload_cmd_ex(fl_act, hw_count);
+}
+
 /* offload the tc command after inserted */
-static int tcf_action_offload_add(struct tc_action *action,
-				  struct netlink_ext_ack *extack)
+static int tcf_action_offload_add_ex(struct tc_action *action,
+				     struct netlink_ext_ack *extack,
+				     flow_indr_block_bind_cb_t *cb,
+				     void *cb_priv)
 {
 	bool skip_sw = tc_act_skip_sw(action->tcfa_flags);
 	struct tc_action *actions[TCA_ACT_MAX_PRIO] = {
@@ -1225,9 +1321,10 @@ static int tcf_action_offload_add(struct tc_action *action,
 		goto fl_err;
 	}
 
-	err = tcf_action_offload_cmd(fl_action, &in_hw_count, extack);
+	err = tcf_action_offload_cmd(fl_action, &in_hw_count, cb, cb_priv);
 	if (!err)
-		flow_action_hw_count_set(action, in_hw_count);
+		cb ? flow_action_hw_count_inc(action, in_hw_count) :
+		     flow_action_hw_count_set(action, in_hw_count);
 
 	if (skip_sw && !tc_act_in_hw(action))
 		err = -EINVAL;
@@ -1240,6 +1337,12 @@ static int tcf_action_offload_add(struct tc_action *action,
 	return err;
 }
 
+static int tcf_action_offload_add(struct tc_action *action,
+				  struct netlink_ext_ack *extack)
+{
+	return tcf_action_offload_add_ex(action, extack, NULL, NULL);
+}
+
 int tcf_action_update_hw_stats(struct tc_action *action)
 {
 	struct flow_offload_action fl_act = {};
@@ -1252,7 +1355,7 @@ int tcf_action_update_hw_stats(struct tc_action *action)
 	if (err)
 		goto err_out;
 
-	err = tcf_action_offload_cmd(&fl_act, NULL, NULL);
+	err = tcf_action_offload_cmd(&fl_act, NULL, NULL, NULL);
 
 	if (!err && fl_act.stats.lastused) {
 		preempt_disable();
@@ -1274,7 +1377,9 @@ int tcf_action_update_hw_stats(struct tc_action *action)
 }
 EXPORT_SYMBOL(tcf_action_update_hw_stats);
 
-int tcf_action_offload_del(struct tc_action *action)
+static int tcf_action_offload_del_ex(struct tc_action *action,
+				     flow_indr_block_bind_cb_t *cb,
+				     void *cb_priv)
 {
 	struct flow_offload_action fl_act;
 	u32 in_hw_count = 0;
@@ -1290,13 +1395,83 @@ int tcf_action_offload_del(struct tc_action *action)
 	if (err)
 		return err;
 
-	err = tcf_action_offload_cmd(&fl_act, &in_hw_count, NULL);
-	if (err)
+	err = tcf_action_offload_cmd(&fl_act, &in_hw_count, cb, cb_priv);
+	if (err < 0)
 		return err;
 
-	if (action->in_hw_count != in_hw_count)
+	if (!cb && action->in_hw_count != in_hw_count)
 		return -EINVAL;
 
+	/* do not need to update hw state when deleting action */
+	if (cb && in_hw_count)
+		flow_action_hw_count_dec(action, in_hw_count);
+
+	return 0;
+}
+
+int tcf_action_offload_del(struct tc_action *action)
+{
+	return tcf_action_offload_del_ex(action, NULL, NULL);
+}
+
+int tcf_action_reoffload_cb(flow_indr_block_bind_cb_t *cb,
+			    void *cb_priv, bool add)
+{
+	struct tc_act_pernet_id *id_ptr;
+	struct tcf_idrinfo *idrinfo;
+	struct tc_action_net *tn;
+	struct tc_action *p;
+	unsigned int act_id;
+	unsigned long tmp;
+	unsigned long id;
+	struct idr *idr;
+	struct net *net;
+	int ret;
+
+	if (!cb)
+		return -EINVAL;
+
+	down_read(&net_rwsem);
+	mutex_lock(&act_id_mutex);
+
+	for_each_net(net) {
+		list_for_each_entry(id_ptr, &act_pernet_id_list, list) {
+			act_id = id_ptr->id;
+			tn = net_generic(net, act_id);
+			if (!tn)
+				continue;
+			idrinfo = tn->idrinfo;
+			if (!idrinfo)
+				continue;
+
+			mutex_lock(&idrinfo->lock);
+			idr = &idrinfo->action_idr;
+			idr_for_each_entry_ul(idr, p, tmp, id) {
+				if (IS_ERR(p) || tc_act_bind(p->tcfa_flags))
+					continue;
+				if (add) {
+					tcf_action_offload_add_ex(p, NULL, cb,
+								  cb_priv);
+					continue;
+				}
+
+				/* cb unregister to update hw count */
+				ret = tcf_action_offload_del_ex(p, cb, cb_priv);
+				if (ret < 0)
+					continue;
+				if (tc_act_skip_sw(p->tcfa_flags) &&
+				    !tc_act_in_hw(p)) {
+					ret = tcf_idr_release_unsafe(p);
+					if (ret == ACT_P_DELETED)
+						module_put(p->ops->owner);
+				}
+			}
+			mutex_unlock(&idrinfo->lock);
+		}
+	}
+	mutex_unlock(&act_id_mutex);
+	up_read(&net_rwsem);
+
 	return 0;
 }
 
-- 
2.20.1
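
Editorial illustration (not part of the patch): the hw_count helpers above
set the counter on the direct offload path, add to it when a driver
callback registers, and subtract with a floor of zero when one unregisters.
The sketch below reproduces that arithmetic in user space. struct
model_action and the model_* names are hypothetical stand-ins for struct
tc_action and the flow_action_hw_count_*() helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the in_hw_count field of struct tc_action. */
struct model_action {
	uint32_t in_hw_count;
};

/* Direct offload path: the count is replaced. */
static void model_hw_count_set(struct model_action *a, uint32_t n)
{
	a->in_hw_count = n;
}

/* Driver callback registered: the count grows. */
static void model_hw_count_inc(struct model_action *a, uint32_t n)
{
	a->in_hw_count += n;
}

/* Driver callback unregistered: the count shrinks but never wraps
 * below zero, matching flow_action_hw_count_dec().
 */
static void model_hw_count_dec(struct model_action *a, uint32_t n)
{
	a->in_hw_count = a->in_hw_count > n ? a->in_hw_count - n : 0;
}

static void model_hw_count_selfcheck(void)
{
	struct model_action a = { 0 };

	model_hw_count_set(&a, 1);
	model_hw_count_inc(&a, 1);
	assert(a.in_hw_count == 2);

	model_hw_count_dec(&a, 1);
	assert(a.in_hw_count == 1);

	/* Over-decrement saturates at zero instead of wrapping. */
	model_hw_count_dec(&a, 5);
	assert(a.in_hw_count == 0);
}
```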



* [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-10-28 11:06 [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device Simon Horman
                   ` (6 preceding siblings ...)
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 7/8] flow_offload: add reoffload process to update hw_count Simon Horman
@ 2021-10-28 11:06 ` Simon Horman
  2021-10-28 19:12   ` kernel test robot
  2021-10-29 18:01   ` Vlad Buslov
  2021-10-28 14:23 ` [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device Jamal Hadi Salim
  2021-10-31  9:50 ` Oz Shlomo
  9 siblings, 2 replies; 58+ messages in thread
From: Simon Horman @ 2021-10-28 11:06 UTC (permalink / raw)
  To: netdev
  Cc: Vlad Buslov, Jamal Hadi Salim, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers,
	Baowen Zheng, Simon Horman

From: Baowen Zheng <baowen.zheng@corigine.com>

Add a process to validate the flags of a filter and its actions when
adding a tc filter.

We need to prevent adding a filter whose flags conflict with those of
its actions.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
---
 net/sched/cls_api.c      | 26 ++++++++++++++++++++++++++
 net/sched/cls_flower.c   |  3 ++-
 net/sched/cls_matchall.c |  4 ++--
 net/sched/cls_u32.c      |  7 ++++---
 4 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 351d93988b8b..80647da9713a 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -3025,6 +3025,29 @@ void tcf_exts_destroy(struct tcf_exts *exts)
 }
 EXPORT_SYMBOL(tcf_exts_destroy);
 
+static bool tcf_exts_validate_actions(const struct tcf_exts *exts, u32 flags)
+{
+#ifdef CONFIG_NET_CLS_ACT
+	bool skip_sw = tc_skip_sw(flags);
+	bool skip_hw = tc_skip_hw(flags);
+	int i;
+
+	if (!(skip_sw | skip_hw))
+		return true;
+
+	for (i = 0; i < exts->nr_actions; i++) {
+		struct tc_action *a = exts->actions[i];
+
+		if ((skip_sw && tc_act_skip_hw(a->tcfa_flags)) ||
+		    (skip_hw && tc_act_skip_sw(a->tcfa_flags)))
+			return false;
+	}
+	return true;
+#else
+	return true;
+#endif
+}
+
 int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb,
 		      struct nlattr *rate_tlv, struct tcf_exts *exts,
 		      u32 flags, struct netlink_ext_ack *extack)
@@ -3066,6 +3089,9 @@ int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb,
 				return err;
 			exts->nr_actions = err;
 		}
+
+		if (!tcf_exts_validate_actions(exts, flags))
+			return -EINVAL;
 	}
 #else
 	if ((exts->action && tb[exts->action]) ||
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index eb6345a027e1..55f89f0e393e 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -2035,7 +2035,8 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 	}
 
 	err = fl_set_parms(net, tp, fnew, mask, base, tb, tca[TCA_RATE],
-			   tp->chain->tmplt_priv, flags, extack);
+			   tp->chain->tmplt_priv, flags | fnew->flags,
+			   extack);
 	if (err)
 		goto errout;
 
diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
index 24f0046ce0b3..00b76fbc1dce 100644
--- a/net/sched/cls_matchall.c
+++ b/net/sched/cls_matchall.c
@@ -226,8 +226,8 @@ static int mall_change(struct net *net, struct sk_buff *in_skb,
 		goto err_alloc_percpu;
 	}
 
-	err = mall_set_parms(net, tp, new, base, tb, tca[TCA_RATE], flags,
-			     extack);
+	err = mall_set_parms(net, tp, new, base, tb, tca[TCA_RATE],
+			     flags | new->flags, extack);
 	if (err)
 		goto err_set_parms;
 
diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index 4272814487f0..fc670cc45122 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -895,7 +895,8 @@ static int u32_change(struct net *net, struct sk_buff *in_skb,
 			return -ENOMEM;
 
 		err = u32_set_parms(net, tp, base, new, tb,
-				    tca[TCA_RATE], flags, extack);
+				    tca[TCA_RATE], flags | new->flags,
+				    extack);
 
 		if (err) {
 			u32_destroy_key(new, false);
@@ -1060,8 +1061,8 @@ static int u32_change(struct net *net, struct sk_buff *in_skb,
 	}
 #endif
 
-	err = u32_set_parms(net, tp, base, n, tb, tca[TCA_RATE], flags,
-			    extack);
+	err = u32_set_parms(net, tp, base, n, tb, tca[TCA_RATE],
+			    flags | n->flags, extack);
 	if (err == 0) {
 		struct tc_u_knode __rcu **ins;
 		struct tc_u_knode *pins;
-- 
2.20.1
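
Editorial illustration (not part of the patch): the conflict rule in
tcf_exts_validate_actions() rejects a skip_sw filter that carries a skip_hw
action, and a skip_hw filter that carries a skip_sw action. The sketch
below models that rule in user space. The action flag values are the ones
introduced earlier in this series; filter_actions_compatible() is a
hypothetical name, and the filter flags are passed as plain booleans to
avoid assuming anything about the classifier flag encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Action flag values from patch 4/8 of this series. */
#define TCA_ACT_FLAGS_SKIP_HW	(1 << 1)
#define TCA_ACT_FLAGS_SKIP_SW	(1 << 2)

/* Hypothetical model of tcf_exts_validate_actions(): returns false as
 * soon as one action's skip flag contradicts the filter's skip flag.
 */
static bool filter_actions_compatible(bool filter_skip_sw, bool filter_skip_hw,
				      const uint32_t *act_flags, size_t n)
{
	/* A filter with neither flag accepts any action. */
	if (!filter_skip_sw && !filter_skip_hw)
		return true;

	for (size_t i = 0; i < n; i++) {
		bool act_skip_hw = act_flags[i] & TCA_ACT_FLAGS_SKIP_HW;
		bool act_skip_sw = act_flags[i] & TCA_ACT_FLAGS_SKIP_SW;

		if ((filter_skip_sw && act_skip_hw) ||
		    (filter_skip_hw && act_skip_sw))
			return false;
	}
	return true;
}

static void filter_actions_selfcheck(void)
{
	const uint32_t hw_only[] = { TCA_ACT_FLAGS_SKIP_SW };
	const uint32_t sw_only[] = { TCA_ACT_FLAGS_SKIP_HW };

	/* Hardware-only filter with hardware-only action: accepted. */
	assert(filter_actions_compatible(true, false, hw_only, 1));
	/* Hardware-only filter with software-only action: rejected. */
	assert(!filter_actions_compatible(true, false, sw_only, 1));
	/* Software-only filter with hardware-only action: rejected. */
	assert(!filter_actions_compatible(false, true, hw_only, 1));
}
```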



* Re: [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device
  2021-10-28 11:06 [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device Simon Horman
                   ` (7 preceding siblings ...)
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions Simon Horman
@ 2021-10-28 14:23 ` Jamal Hadi Salim
  2021-10-28 14:39   ` Jamal Hadi Salim
  2021-10-31  9:50 ` Oz Shlomo
  9 siblings, 1 reply; 58+ messages in thread
From: Jamal Hadi Salim @ 2021-10-28 14:23 UTC (permalink / raw)
  To: Simon Horman, netdev
  Cc: Vlad Buslov, Roi Dayan, Ido Schimmel, Cong Wang, Jiri Pirko,
	Baowen Zheng, Louis Peens, oss-drivers

On 2021-10-28 07:06, Simon Horman wrote:
> Baowen Zheng says:
> 
> Allow use of flow_indr_dev_register/flow_indr_dev_setup_offload to offload
> tc actions independent of flows.
> 
> The motivation for this work is to prepare for using TC police action
> instances to provide hardware offload of OVS metering feature - which calls
> for policers that may be used by multiple flows and whose lifecycle is
> independent of any flows that use them.
> 
> This patch includes basic changes to offload drivers to return EOPNOTSUPP
> if this feature is used - it is not yet supported by any driver.
> 
> Tc cli command to offload and quote an action:
> 
> tc qdisc del dev $DEV ingress && sleep 1 || true
> tc actions delete action police index 99 || true
> 
> tc qdisc add dev $DEV ingress
> tc qdisc show dev $DEV ingress
> 
> tc actions add action police index 99 rate 1mbit burst 100k skip_sw
> tc actions list action police
> 
> tc filter add dev $DEV protocol ip parent ffff:
> flower ip_proto tcp action police index 99
> tc -s -d filter show dev $DEV protocol ip parent ffff:
> tc filter add dev $DEV protocol ipv6 parent ffff:
> flower skip_sw ip_proto tcp action police index 99
> tc -s -d filter show dev $DEV protocol ipv6 parent ffff:
> tc actions list action police
> 
> tc qdisc del dev $DEV ingress && sleep 1
> tc actions delete action police index 99
> tc actions list action police


It will be helpful to display the output of the show commands in the
cover letter....

cheers,
jamal

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device
  2021-10-28 14:23 ` [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device Jamal Hadi Salim
@ 2021-10-28 14:39   ` Jamal Hadi Salim
  0 siblings, 0 replies; 58+ messages in thread
From: Jamal Hadi Salim @ 2021-10-28 14:39 UTC (permalink / raw)
  To: Simon Horman, netdev
  Cc: Vlad Buslov, Roi Dayan, Ido Schimmel, Cong Wang, Jiri Pirko,
	Baowen Zheng, Louis Peens, oss-drivers

On 2021-10-28 10:23, Jamal Hadi Salim wrote:
[..]
> 
> It will be helpful to display the output of the show commands in the
> cover letter....

Also some tdc tests please...

cheers,
jamal


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions Simon Horman
@ 2021-10-28 19:12   ` kernel test robot
  2021-10-29 18:01   ` Vlad Buslov
  1 sibling, 0 replies; 58+ messages in thread
From: kernel test robot @ 2021-10-28 19:12 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 2244 bytes --]

Hi Simon,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Simon-Horman/allow-user-to-offload-tc-action-to-net-device/20211028-191349
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 911e3a46fb38669560021537e00222591231f456
config: i386-buildonly-randconfig-r001-20211028 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/f0dfc25677ef71290fccbf1f8da5e602b4bcfa80
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Simon-Horman/allow-user-to-offload-tc-action-to-net-device/20211028-191349
        git checkout f0dfc25677ef71290fccbf1f8da5e602b4bcfa80
        # save the attached .config to linux build tree
        make W=1 ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> net/sched/cls_api.c:3028:13: error: 'tcf_exts_validate_actions' defined but not used [-Werror=unused-function]
    3028 | static bool tcf_exts_validate_actions(const struct tcf_exts *exts, u32 flags)
         |             ^~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: all warnings being treated as errors


vim +/tcf_exts_validate_actions +3028 net/sched/cls_api.c

  3027	
> 3028	static bool tcf_exts_validate_actions(const struct tcf_exts *exts, u32 flags)
  3029	{
  3030	#ifdef CONFIG_NET_CLS_ACT
  3031		bool skip_sw = tc_skip_sw(flags);
  3032		bool skip_hw = tc_skip_hw(flags);
  3033		int i;
  3034	
  3035		if (!(skip_sw | skip_hw))
  3036			return true;
  3037	
  3038		for (i = 0; i < exts->nr_actions; i++) {
  3039			struct tc_action *a = exts->actions[i];
  3040	
  3041			if ((skip_sw && tc_act_skip_hw(a->tcfa_flags)) ||
  3042			    (skip_hw && tc_act_skip_sw(a->tcfa_flags)))
  3043				return false;
  3044		}
  3045		return true;
  3046	#else
  3047		return true;
  3048	#endif
  3049	}
  3050	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 39355 bytes --]

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC/PATCH net-next v3 3/8] flow_offload: allow user to offload tc action to net device
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 3/8] flow_offload: allow user to offload tc action to net device Simon Horman
@ 2021-10-29 16:59   ` Vlad Buslov
  2021-11-01  9:44     ` Baowen Zheng
  2021-10-31  9:50   ` Oz Shlomo
  1 sibling, 1 reply; 58+ messages in thread
From: Vlad Buslov @ 2021-10-29 16:59 UTC (permalink / raw)
  To: Simon Horman
  Cc: netdev, Jamal Hadi Salim, Roi Dayan, Ido Schimmel, Cong Wang,
	Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers, Baowen Zheng

On Thu 28 Oct 2021 at 14:06, Simon Horman <simon.horman@corigine.com> wrote:
> From: Baowen Zheng <baowen.zheng@corigine.com>
>
> Use flow_indr_dev_register/flow_indr_dev_setup_offload to
> offload tc action.
>
> We need to call tc_cleanup_flow_action to clean up tc action entry since
> in tc_setup_action, some actions may hold dev refcnt, especially the mirror
> action.
>
> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
> Signed-off-by: Louis Peens <louis.peens@corigine.com>
> Signed-off-by: Simon Horman <simon.horman@corigine.com>
> ---
>  include/linux/netdevice.h  |   1 +
>  include/net/act_api.h      |   2 +-
>  include/net/flow_offload.h |  17 ++++
>  include/net/pkt_cls.h      |  15 ++++
>  net/core/flow_offload.c    |  43 ++++++++--
>  net/sched/act_api.c        | 166 +++++++++++++++++++++++++++++++++++++
>  net/sched/cls_api.c        |  29 ++++++-
>  7 files changed, 260 insertions(+), 13 deletions(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 3ec42495a43a..9815c3a058e9 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -916,6 +916,7 @@ enum tc_setup_type {
>  	TC_SETUP_QDISC_TBF,
>  	TC_SETUP_QDISC_FIFO,
>  	TC_SETUP_QDISC_HTB,
> +	TC_SETUP_ACT,
>  };
>  
>  /* These structures hold the attributes of bpf state that are being passed
> diff --git a/include/net/act_api.h b/include/net/act_api.h
> index b5b624c7e488..9eb19188603c 100644
> --- a/include/net/act_api.h
> +++ b/include/net/act_api.h
> @@ -239,7 +239,7 @@ static inline void tcf_action_inc_overlimit_qstats(struct tc_action *a)
>  void tcf_action_update_stats(struct tc_action *a, u64 bytes, u64 packets,
>  			     u64 drops, bool hw);
>  int tcf_action_copy_stats(struct sk_buff *, struct tc_action *, int);
> -
> +int tcf_action_offload_del(struct tc_action *action);

This doesn't seem to be used anywhere outside of act_api in this series,
so why is it exported?

>  int tcf_action_check_ctrlact(int action, struct tcf_proto *tp,
>  			     struct tcf_chain **handle,
>  			     struct netlink_ext_ack *newchain);
> diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
> index 3961461d9c8b..aa28592fccc0 100644
> --- a/include/net/flow_offload.h
> +++ b/include/net/flow_offload.h
> @@ -552,6 +552,23 @@ struct flow_cls_offload {
>  	u32 classid;
>  };
>  
> +enum flow_act_command {
> +	FLOW_ACT_REPLACE,
> +	FLOW_ACT_DESTROY,
> +	FLOW_ACT_STATS,
> +};
> +
> +struct flow_offload_action {
> +	struct netlink_ext_ack *extack; /* NULL in FLOW_ACT_STATS process*/
> +	enum flow_act_command command;
> +	enum flow_action_id id;
> +	u32 index;
> +	struct flow_stats stats;
> +	struct flow_action action;
> +};
> +
> +struct flow_offload_action *flow_action_alloc(unsigned int num_actions);
> +
>  static inline struct flow_rule *
>  flow_cls_offload_flow_rule(struct flow_cls_offload *flow_cmd)
>  {
> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
> index 193f88ebf629..922775407257 100644
> --- a/include/net/pkt_cls.h
> +++ b/include/net/pkt_cls.h
> @@ -258,6 +258,9 @@ static inline void tcf_exts_put_net(struct tcf_exts *exts)
>  	for (; 0; (void)(i), (void)(a), (void)(exts))
>  #endif
>  
> +#define tcf_act_for_each_action(i, a, actions) \
> +	for (i = 0; i < TCA_ACT_MAX_PRIO && ((a) = actions[i]); i++)
> +
>  static inline void
>  tcf_exts_stats_update(const struct tcf_exts *exts,
>  		      u64 bytes, u64 packets, u64 drops, u64 lastuse,
> @@ -532,8 +535,19 @@ tcf_match_indev(struct sk_buff *skb, int ifindex)
>  	return ifindex == skb->skb_iif;
>  }
>  
> +#ifdef CONFIG_NET_CLS_ACT
>  int tc_setup_flow_action(struct flow_action *flow_action,
>  			 const struct tcf_exts *exts);

Why does existing cls_api function tc_setup_flow_action() now depend on
CONFIG_NET_CLS_ACT?

> +#else
> +static inline int tc_setup_flow_action(struct flow_action *flow_action,
> +				       const struct tcf_exts *exts)
> +{
> +	return 0;
> +}
> +#endif
> +
> +int tc_setup_action(struct flow_action *flow_action,
> +		    struct tc_action *actions[]);
>  void tc_cleanup_flow_action(struct flow_action *flow_action);
>  
>  int tc_setup_cb_call(struct tcf_block *block, enum tc_setup_type type,
> @@ -554,6 +568,7 @@ int tc_setup_cb_reoffload(struct tcf_block *block, struct tcf_proto *tp,
>  			  enum tc_setup_type type, void *type_data,
>  			  void *cb_priv, u32 *flags, unsigned int *in_hw_count);
>  unsigned int tcf_exts_num_actions(struct tcf_exts *exts);
> +unsigned int tcf_act_num_actions_single(struct tc_action *act);
>  
>  #ifdef CONFIG_NET_CLS_ACT
>  int tcf_qevent_init(struct tcf_qevent *qe, struct Qdisc *sch,
> diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
> index 6beaea13564a..6676431733ef 100644
> --- a/net/core/flow_offload.c
> +++ b/net/core/flow_offload.c
> @@ -27,6 +27,27 @@ struct flow_rule *flow_rule_alloc(unsigned int num_actions)
>  }
>  EXPORT_SYMBOL(flow_rule_alloc);
>  
> +struct flow_offload_action *flow_action_alloc(unsigned int num_actions)
> +{
> +	struct flow_offload_action *fl_action;
> +	int i;
> +
> +	fl_action = kzalloc(struct_size(fl_action, action.entries, num_actions),
> +			    GFP_KERNEL);
> +	if (!fl_action)
> +		return NULL;
> +
> +	fl_action->action.num_entries = num_actions;
> +	/* Pre-fill each action hw_stats with DONT_CARE.
> +	 * Caller can override this if it wants stats for a given action.
> +	 */
> +	for (i = 0; i < num_actions; i++)
> +		fl_action->action.entries[i].hw_stats = FLOW_ACTION_HW_STATS_DONT_CARE;
> +
> +	return fl_action;
> +}
> +EXPORT_SYMBOL(flow_action_alloc);
> +
>  #define FLOW_DISSECTOR_MATCH(__rule, __type, __out)				\
>  	const struct flow_match *__m = &(__rule)->match;			\
>  	struct flow_dissector *__d = (__m)->dissector;				\
> @@ -549,19 +570,25 @@ int flow_indr_dev_setup_offload(struct net_device *dev,	struct Qdisc *sch,
>  				void (*cleanup)(struct flow_block_cb *block_cb))
>  {
>  	struct flow_indr_dev *this;
> +	u32 count = 0;
> +	int err;
>  
>  	mutex_lock(&flow_indr_block_lock);
> +	if (bo) {
> +		if (bo->command == FLOW_BLOCK_BIND)
> +			indir_dev_add(data, dev, sch, type, cleanup, bo);
> +		else if (bo->command == FLOW_BLOCK_UNBIND)
> +			indir_dev_remove(data);
> +	}
>  
> -	if (bo->command == FLOW_BLOCK_BIND)
> -		indir_dev_add(data, dev, sch, type, cleanup, bo);
> -	else if (bo->command == FLOW_BLOCK_UNBIND)
> -		indir_dev_remove(data);
> -
> -	list_for_each_entry(this, &flow_block_indr_dev_list, list)
> -		this->cb(dev, sch, this->cb_priv, type, bo, data, cleanup);
> +	list_for_each_entry(this, &flow_block_indr_dev_list, list) {
> +		err = this->cb(dev, sch, this->cb_priv, type, bo, data, cleanup);
> +		if (!err)
> +			count++;
> +	}
>  
>  	mutex_unlock(&flow_indr_block_lock);
>  
> -	return list_empty(&bo->cb_list) ? -EOPNOTSUPP : 0;
> +	return (bo && list_empty(&bo->cb_list)) ? -EOPNOTSUPP : count;
>  }
>  EXPORT_SYMBOL(flow_indr_dev_setup_offload);
> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> index 3258da3d5bed..33f2ff885b4b 100644
> --- a/net/sched/act_api.c
> +++ b/net/sched/act_api.c
> @@ -21,6 +21,19 @@
>  #include <net/pkt_cls.h>
>  #include <net/act_api.h>
>  #include <net/netlink.h>
> +#include <net/tc_act/tc_pedit.h>
> +#include <net/tc_act/tc_mirred.h>
> +#include <net/tc_act/tc_vlan.h>
> +#include <net/tc_act/tc_tunnel_key.h>
> +#include <net/tc_act/tc_csum.h>
> +#include <net/tc_act/tc_gact.h>
> +#include <net/tc_act/tc_police.h>
> +#include <net/tc_act/tc_sample.h>
> +#include <net/tc_act/tc_skbedit.h>
> +#include <net/tc_act/tc_ct.h>
> +#include <net/tc_act/tc_mpls.h>
> +#include <net/tc_act/tc_gate.h>
> +#include <net/flow_offload.h>
>  
>  #ifdef CONFIG_INET
>  DEFINE_STATIC_KEY_FALSE(tcf_frag_xmit_count);
> @@ -148,6 +161,7 @@ static int __tcf_action_put(struct tc_action *p, bool bind)
>  		idr_remove(&idrinfo->action_idr, p->tcfa_index);
>  		mutex_unlock(&idrinfo->lock);
>  
> +		tcf_action_offload_del(p);
>  		tcf_action_cleanup(p);
>  		return 1;
>  	}
> @@ -341,6 +355,7 @@ static int tcf_idr_release_unsafe(struct tc_action *p)
>  		return -EPERM;
>  
>  	if (refcount_dec_and_test(&p->tcfa_refcnt)) {
> +		tcf_action_offload_del(p);
>  		idr_remove(&p->idrinfo->action_idr, p->tcfa_index);
>  		tcf_action_cleanup(p);
>  		return ACT_P_DELETED;
> @@ -452,6 +467,7 @@ static int tcf_idr_delete_index(struct tcf_idrinfo *idrinfo, u32 index)
>  						p->tcfa_index));
>  			mutex_unlock(&idrinfo->lock);
>  
> +			tcf_action_offload_del(p);

tcf_action_offload_del() and tcf_action_cleanup() seem to be always
called together. Consider moving the call to tcf_action_offload_del()
into tcf_action_cleanup().
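
A standalone sketch of that suggestion, using stand-in types rather than
the act_api ones (every sketch_ name is hypothetical). The offload removal
becomes a file-local helper called from the cleanup routine, so each
release path makes a single call and nothing needs to be exported:

```c
#include <assert.h>
#include <stddef.h>

struct sketch_action {
	int offloaded;	/* stand-in for state held by the driver */
	int cleaned;	/* stand-in for freed software resources */
};

/* Stand-in for tcf_action_offload_del(): static, since only the
 * cleanup routine below calls it.
 */
static void sketch_offload_del(struct sketch_action *p)
{
	p->offloaded = 0;
}

/* Stand-in for tcf_action_cleanup(): offload removal happens here, so
 * the release paths cannot forget it.
 */
void sketch_action_cleanup(struct sketch_action *p)
{
	assert(p != NULL);
	sketch_offload_del(p);
	p->cleaned = 1;
}

/* One of several release paths; drops a reference and cleans up on
 * the last one.
 */
int sketch_action_put(struct sketch_action *p, int *refcnt)
{
	if (--(*refcnt) > 0)
		return 0;
	sketch_action_cleanup(p);
	return 1;
}
```

Whether this ordering is safe relative to idr_remove() and the idrinfo
lock in the real code is not something the sketch can show.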

>  			tcf_action_cleanup(p);
>  			module_put(owner);
>  			return 0;
> @@ -1061,6 +1077,154 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
>  	return ERR_PTR(err);
>  }
>  
> +static int flow_action_init(struct flow_offload_action *fl_action,
> +			    struct tc_action *act,
> +			    enum flow_act_command cmd,
> +			    struct netlink_ext_ack *extack)
> +{
> +	if (!fl_action)
> +		return -EINVAL;
> +
> +	fl_action->extack = extack;
> +	fl_action->command = cmd;
> +	fl_action->index = act->tcfa_index;
> +
> +	if (is_tcf_gact_ok(act)) {
> +		fl_action->id = FLOW_ACTION_ACCEPT;
> +	} else if (is_tcf_gact_shot(act)) {
> +		fl_action->id = FLOW_ACTION_DROP;
> +	} else if (is_tcf_gact_trap(act)) {
> +		fl_action->id = FLOW_ACTION_TRAP;
> +	} else if (is_tcf_gact_goto_chain(act)) {
> +		fl_action->id = FLOW_ACTION_GOTO;
> +	} else if (is_tcf_mirred_egress_redirect(act)) {
> +		fl_action->id = FLOW_ACTION_REDIRECT;
> +	} else if (is_tcf_mirred_egress_mirror(act)) {
> +		fl_action->id = FLOW_ACTION_MIRRED;
> +	} else if (is_tcf_mirred_ingress_redirect(act)) {
> +		fl_action->id = FLOW_ACTION_REDIRECT_INGRESS;
> +	} else if (is_tcf_mirred_ingress_mirror(act)) {
> +		fl_action->id = FLOW_ACTION_MIRRED_INGRESS;
> +	} else if (is_tcf_vlan(act)) {
> +		switch (tcf_vlan_action(act)) {
> +		case TCA_VLAN_ACT_PUSH:
> +			fl_action->id = FLOW_ACTION_VLAN_PUSH;
> +			break;
> +		case TCA_VLAN_ACT_POP:
> +			fl_action->id = FLOW_ACTION_VLAN_POP;
> +			break;
> +		case TCA_VLAN_ACT_MODIFY:
> +			fl_action->id = FLOW_ACTION_VLAN_MANGLE;
> +			break;
> +		default:
> +			return -EOPNOTSUPP;
> +		}
> +	} else if (is_tcf_tunnel_set(act)) {
> +		fl_action->id = FLOW_ACTION_TUNNEL_ENCAP;
> +	} else if (is_tcf_tunnel_release(act)) {
> +		fl_action->id = FLOW_ACTION_TUNNEL_DECAP;
> +	} else if (is_tcf_csum(act)) {
> +		fl_action->id = FLOW_ACTION_CSUM;
> +	} else if (is_tcf_skbedit_mark(act)) {
> +		fl_action->id = FLOW_ACTION_MARK;
> +	} else if (is_tcf_sample(act)) {
> +		fl_action->id = FLOW_ACTION_SAMPLE;
> +	} else if (is_tcf_police(act)) {
> +		fl_action->id = FLOW_ACTION_POLICE;
> +	} else if (is_tcf_ct(act)) {
> +		fl_action->id = FLOW_ACTION_CT;
> +	} else if (is_tcf_mpls(act)) {
> +		switch (tcf_mpls_action(act)) {
> +		case TCA_MPLS_ACT_PUSH:
> +			fl_action->id = FLOW_ACTION_MPLS_PUSH;
> +			break;
> +		case TCA_MPLS_ACT_POP:
> +			fl_action->id = FLOW_ACTION_MPLS_POP;
> +			break;
> +		case TCA_MPLS_ACT_MODIFY:
> +			fl_action->id = FLOW_ACTION_MPLS_MANGLE;
> +			break;
> +		default:
> +			return -EOPNOTSUPP;
> +		}
> +	} else if (is_tcf_skbedit_ptype(act)) {
> +		fl_action->id = FLOW_ACTION_PTYPE;
> +	} else if (is_tcf_skbedit_priority(act)) {
> +		fl_action->id = FLOW_ACTION_PRIORITY;
> +	} else if (is_tcf_gate(act)) {
> +		fl_action->id = FLOW_ACTION_GATE;
> +	} else {
> +		return -EOPNOTSUPP;
> +	}
> +
> +	return 0;
> +}
> +
> +static int tcf_action_offload_cmd(struct flow_offload_action *fl_act,
> +				  struct netlink_ext_ack *extack)
> +{
> +	int err;
> +
> +	if (IS_ERR(fl_act))
> +		return PTR_ERR(fl_act);
> +
> +	err = flow_indr_dev_setup_offload(NULL, NULL, TC_SETUP_ACT,
> +					  fl_act, NULL, NULL);
> +	if (err < 0)
> +		return err;
> +
> +	return 0;
> +}
> +
> +/* offload the tc command after inserted */
> +static int tcf_action_offload_add(struct tc_action *action,
> +				  struct netlink_ext_ack *extack)
> +{
> +	struct tc_action *actions[TCA_ACT_MAX_PRIO] = {
> +		[0] = action,
> +	};
> +	struct flow_offload_action *fl_action;
> +	int err = 0;
> +
> +	fl_action = flow_action_alloc(tcf_act_num_actions_single(action));
> +	if (!fl_action)
> +		return -EINVAL;

Failed alloc-like functions usually result in -ENOMEM.

> +
> +	err = flow_action_init(fl_action, action, FLOW_ACT_REPLACE, extack);
> +	if (err)
> +		goto fl_err;
> +
> +	err = tc_setup_action(&fl_action->action, actions);
> +	if (err) {
> +		NL_SET_ERR_MSG_MOD(extack,
> +				   "Failed to setup tc actions for offload\n");
> +		goto fl_err;
> +	}
> +
> +	err = tcf_action_offload_cmd(fl_action, extack);
> +	tc_cleanup_flow_action(&fl_action->action);
> +
> +fl_err:
> +	kfree(fl_action);
> +
> +	return err;
> +}
> +
> +int tcf_action_offload_del(struct tc_action *action)
> +{
> +	struct flow_offload_action fl_act;
> +	int err = 0;
> +
> +	if (!action)
> +		return -EINVAL;
> +
> +	err = flow_action_init(&fl_act, action, FLOW_ACT_DESTROY, NULL);
> +	if (err)
> +		return err;
> +
> +	return tcf_action_offload_cmd(&fl_act, NULL);
> +}
> +
>  /* Returns numbers of initialized actions or negative error. */
>  
>  int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
> @@ -1103,6 +1267,8 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
>  		sz += tcf_action_fill_size(act);
>  		/* Start from index 0 */
>  		actions[i - 1] = act;
> +		if (!(flags & TCA_ACT_FLAGS_BIND))
> +			tcf_action_offload_add(act, extack);
>  	}
>  
>  	/* We have to commit them all together, because if any error happened in
> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> index 2ef8f5a6205a..351d93988b8b 100644
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -3544,8 +3544,8 @@ static enum flow_action_hw_stats tc_act_hw_stats(u8 hw_stats)
>  	return hw_stats;
>  }
>  
> -int tc_setup_flow_action(struct flow_action *flow_action,
> -			 const struct tcf_exts *exts)
> +int tc_setup_action(struct flow_action *flow_action,
> +		    struct tc_action *actions[])
>  {
>  	struct tc_action *act;
>  	int i, j, k, err = 0;
> @@ -3554,11 +3554,11 @@ int tc_setup_flow_action(struct flow_action *flow_action,
>  	BUILD_BUG_ON(TCA_ACT_HW_STATS_IMMEDIATE != FLOW_ACTION_HW_STATS_IMMEDIATE);
>  	BUILD_BUG_ON(TCA_ACT_HW_STATS_DELAYED != FLOW_ACTION_HW_STATS_DELAYED);
>  
> -	if (!exts)
> +	if (!actions)
>  		return 0;
>  
>  	j = 0;
> -	tcf_exts_for_each_action(i, act, exts) {
> +	tcf_act_for_each_action(i, act, actions) {
>  		struct flow_action_entry *entry;
>  
>  		entry = &flow_action->entries[j];
> @@ -3725,7 +3725,19 @@ int tc_setup_flow_action(struct flow_action *flow_action,
>  	spin_unlock_bh(&act->tcfa_lock);
>  	goto err_out;
>  }
> +EXPORT_SYMBOL(tc_setup_action);
> +
> +#ifdef CONFIG_NET_CLS_ACT

Maybe just move tc_setup_action() to act_api and ifdef its definition in
pkt_cls.h instead of existing tc_setup_flow_action()?

> +int tc_setup_flow_action(struct flow_action *flow_action,
> +			 const struct tcf_exts *exts)
> +{
> +	if (!exts)
> +		return 0;
> +
> +	return tc_setup_action(flow_action, exts->actions);
> +}
>  EXPORT_SYMBOL(tc_setup_flow_action);
> +#endif
>  
>  unsigned int tcf_exts_num_actions(struct tcf_exts *exts)
>  {
> @@ -3743,6 +3755,15 @@ unsigned int tcf_exts_num_actions(struct tcf_exts *exts)
>  }
>  EXPORT_SYMBOL(tcf_exts_num_actions);
>  
> +unsigned int tcf_act_num_actions_single(struct tc_action *act)
> +{
> +	if (is_tcf_pedit(act))
> +		return tcf_pedit_nkeys(act);
> +	else
> +		return 1;
> +}
> +EXPORT_SYMBOL(tcf_act_num_actions_single);
> +
>  #ifdef CONFIG_NET_CLS_ACT
>  static int tcf_qevent_parse_block_index(struct nlattr *block_index_attr,
>  					u32 *p_block_index,


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC/PATCH net-next v3 5/8] flow_offload: add process to update action stats from hardware
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 5/8] flow_offload: add process to update action stats from hardware Simon Horman
@ 2021-10-29 17:11   ` Vlad Buslov
  2021-11-01 10:07     ` Baowen Zheng
  0 siblings, 1 reply; 58+ messages in thread
From: Vlad Buslov @ 2021-10-29 17:11 UTC (permalink / raw)
  To: Simon Horman
  Cc: netdev, Jamal Hadi Salim, Roi Dayan, Ido Schimmel, Cong Wang,
	Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers, Baowen Zheng

On Thu 28 Oct 2021 at 14:06, Simon Horman <simon.horman@corigine.com> wrote:
> From: Baowen Zheng <baowen.zheng@corigine.com>
>
> When collecting stats for actions update them using both
> both hardware and software counters.
>
> Stats update process should not in context of preempt_disable.

I think you are missing a word here.

>
> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
> Signed-off-by: Louis Peens <louis.peens@corigine.com>
> Signed-off-by: Simon Horman <simon.horman@corigine.com>
> ---
>  include/net/act_api.h |  1 +
>  include/net/pkt_cls.h | 18 ++++++++++--------
>  net/sched/act_api.c   | 37 +++++++++++++++++++++++++++++++++++++
>  3 files changed, 48 insertions(+), 8 deletions(-)
>
> diff --git a/include/net/act_api.h b/include/net/act_api.h
> index 671208bd27ef..80a9d1e7d805 100644
> --- a/include/net/act_api.h
> +++ b/include/net/act_api.h
> @@ -247,6 +247,7 @@ void tcf_action_update_stats(struct tc_action *a, u64 bytes, u64 packets,
>  			     u64 drops, bool hw);
>  int tcf_action_copy_stats(struct sk_buff *, struct tc_action *, int);
>  int tcf_action_offload_del(struct tc_action *action);
> +int tcf_action_update_hw_stats(struct tc_action *action);
>  int tcf_action_check_ctrlact(int action, struct tcf_proto *tp,
>  			     struct tcf_chain **handle,
>  			     struct netlink_ext_ack *newchain);
> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
> index 44ae5182a965..88788b821f76 100644
> --- a/include/net/pkt_cls.h
> +++ b/include/net/pkt_cls.h
> @@ -292,18 +292,20 @@ tcf_exts_stats_update(const struct tcf_exts *exts,
>  #ifdef CONFIG_NET_CLS_ACT
>  	int i;
>  
> -	preempt_disable();
> -
>  	for (i = 0; i < exts->nr_actions; i++) {
>  		struct tc_action *a = exts->actions[i];
>  
> -		tcf_action_stats_update(a, bytes, packets, drops,
> -					lastuse, true);
> -		a->used_hw_stats = used_hw_stats;
> -		a->used_hw_stats_valid = used_hw_stats_valid;
> -	}
> +		/* if stats from hw, just skip */
> +		if (tcf_action_update_hw_stats(a)) {
> +			preempt_disable();
> +			tcf_action_stats_update(a, bytes, packets, drops,
> +						lastuse, true);
> +			preempt_enable();
>  
> -	preempt_enable();
> +			a->used_hw_stats = used_hw_stats;
> +			a->used_hw_stats_valid = used_hw_stats_valid;
> +		}
> +	}
>  #endif
>  }
>  
> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> index 604bf1923bcc..881c7ba4d180 100644
> --- a/net/sched/act_api.c
> +++ b/net/sched/act_api.c
> @@ -1238,6 +1238,40 @@ static int tcf_action_offload_add(struct tc_action *action,
>  	return err;
>  }
>  
> +int tcf_action_update_hw_stats(struct tc_action *action)
> +{
> +	struct flow_offload_action fl_act = {};
> +	int err = 0;
> +
> +	if (!tc_act_in_hw(action))
> +		return -EOPNOTSUPP;
> +
> +	err = flow_action_init(&fl_act, action, FLOW_ACT_STATS, NULL);
> +	if (err)
> +		goto err_out;
> +
> +	err = tcf_action_offload_cmd(&fl_act, NULL, NULL);
> +
> +	if (!err && fl_act.stats.lastused) {
> +		preempt_disable();
> +		tcf_action_stats_update(action, fl_act.stats.bytes,
> +					fl_act.stats.pkts,
> +					fl_act.stats.drops,
> +					fl_act.stats.lastused,
> +					true);
> +		preempt_enable();
> +		action->used_hw_stats = fl_act.stats.used_hw_stats;
> +		action->used_hw_stats_valid = true;
> +		err = 0;

Error handling here is slightly convoluted. This line assigns err=0 for
the third time (it is initialized to zero, and we can only get here if
tcf_action_offload_cmd() set 'err' to zero again). Since the error
handler in this function is empty, we can just return errors directly as
soon as they happen and return zero at the end of the function.

> +	} else {
> +		err = -EOPNOTSUPP;

Hmm, the code can return an error here when tcf_action_offload_cmd()
succeeded but 'lastused' is zero. Such behavior will cause
tcf_exts_stats_update() to update the action with filter counter values.
Is this the desired behavior when, for example, the filter action list
contains an action that can drop packets followed by some shared action?
In such a case 'lastused' can be zero if all packets that the filter
matched were dropped by the previous action, and the shared action will
be assigned the filter counter value, which includes the dropped
packets/bytes.

> +	}
> +
> +err_out:
> +	return err;
> +}
> +EXPORT_SYMBOL(tcf_action_update_hw_stats);
> +
>  int tcf_action_offload_del(struct tc_action *action)
>  {
>  	struct flow_offload_action fl_act;
> @@ -1362,6 +1396,9 @@ int tcf_action_copy_stats(struct sk_buff *skb, struct tc_action *p,
>  	if (p == NULL)
>  		goto errout;
>  
> +	/* update hw stats for this action */
> +	tcf_action_update_hw_stats(p);
> +
>  	/* compat_mode being true specifies a call that is supposed
>  	 * to add additional backward compatibility statistic TLVs.
>  	 */


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC/PATCH net-next v3 7/8] flow_offload: add reoffload process to update hw_count
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 7/8] flow_offload: add reoffload process to update hw_count Simon Horman
@ 2021-10-29 17:31   ` Vlad Buslov
  2021-11-02  9:20     ` Baowen Zheng
  0 siblings, 1 reply; 58+ messages in thread
From: Vlad Buslov @ 2021-10-29 17:31 UTC (permalink / raw)
  To: Simon Horman
  Cc: netdev, Jamal Hadi Salim, Roi Dayan, Ido Schimmel, Cong Wang,
	Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers, Baowen Zheng

On Thu 28 Oct 2021 at 14:06, Simon Horman <simon.horman@corigine.com> wrote:
> From: Baowen Zheng <baowen.zheng@corigine.com>
>
> Add reoffload process to update hw_count when driver
> is inserted or removed.
>
> When reoffloading actions, we still offload the actions
> that are added independent of filters.
>
> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
> Signed-off-by: Louis Peens <louis.peens@corigine.com>
> Signed-off-by: Simon Horman <simon.horman@corigine.com>
> ---
>  include/net/act_api.h   |  24 +++++
>  include/net/pkt_cls.h   |   5 +
>  net/core/flow_offload.c |   5 +
>  net/sched/act_api.c     | 213 ++++++++++++++++++++++++++++++++++++----
>  4 files changed, 228 insertions(+), 19 deletions(-)
>
> diff --git a/include/net/act_api.h b/include/net/act_api.h
> index 80a9d1e7d805..03ff39e347c3 100644
> --- a/include/net/act_api.h
> +++ b/include/net/act_api.h
> @@ -7,6 +7,7 @@
>  */
>  
>  #include <linux/refcount.h>
> +#include <net/flow_offload.h>
>  #include <net/sch_generic.h>
>  #include <net/pkt_sched.h>
>  #include <net/net_namespace.h>
> @@ -243,11 +244,26 @@ static inline void flow_action_hw_count_set(struct tc_action *act,
>  	act->in_hw_count = hw_count;
>  }
>  
> +static inline void flow_action_hw_count_inc(struct tc_action *act,
> +					    u32 hw_count)
> +{
> +	act->in_hw_count += hw_count;
> +}
> +
> +static inline void flow_action_hw_count_dec(struct tc_action *act,
> +					    u32 hw_count)
> +{
> +	act->in_hw_count = act->in_hw_count > hw_count ?
> +			   act->in_hw_count - hw_count : 0;
> +}
> +
>  void tcf_action_update_stats(struct tc_action *a, u64 bytes, u64 packets,
>  			     u64 drops, bool hw);
>  int tcf_action_copy_stats(struct sk_buff *, struct tc_action *, int);
>  int tcf_action_offload_del(struct tc_action *action);
>  int tcf_action_update_hw_stats(struct tc_action *action);
> +int tcf_action_reoffload_cb(flow_indr_block_bind_cb_t *cb,
> +			    void *cb_priv, bool add);
>  int tcf_action_check_ctrlact(int action, struct tcf_proto *tp,
>  			     struct tcf_chain **handle,
>  			     struct netlink_ext_ack *newchain);
> @@ -259,6 +275,14 @@ DECLARE_STATIC_KEY_FALSE(tcf_frag_xmit_count);
>  #endif
>  
>  int tcf_dev_queue_xmit(struct sk_buff *skb, int (*xmit)(struct sk_buff *skb));
> +
> +#else /* !CONFIG_NET_CLS_ACT */
> +
> +static inline int tcf_action_reoffload_cb(flow_indr_block_bind_cb_t *cb,
> +					  void *cb_priv, bool add) {
> +	return 0;
> +}
> +
>  #endif /* CONFIG_NET_CLS_ACT */
>  
>  static inline void tcf_action_stats_update(struct tc_action *a, u64 bytes,
> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
> index 88788b821f76..82ac631c50bc 100644
> --- a/include/net/pkt_cls.h
> +++ b/include/net/pkt_cls.h
> @@ -284,6 +284,11 @@ static inline bool tc_act_flags_valid(u32 flags)
>  	return flags ^ (TCA_ACT_FLAGS_SKIP_HW | TCA_ACT_FLAGS_SKIP_SW);
>  }
>  
> +static inline bool tc_act_bind(u32 flags)
> +{
> +	return !!(flags & TCA_ACT_FLAGS_BIND);
> +}
> +
>  static inline void
>  tcf_exts_stats_update(const struct tcf_exts *exts,
>  		      u64 bytes, u64 packets, u64 drops, u64 lastuse,
> diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
> index 6676431733ef..d591204af6e0 100644
> --- a/net/core/flow_offload.c
> +++ b/net/core/flow_offload.c
> @@ -1,6 +1,7 @@
>  /* SPDX-License-Identifier: GPL-2.0 */
>  #include <linux/kernel.h>
>  #include <linux/slab.h>
> +#include <net/act_api.h>
>  #include <net/flow_offload.h>
>  #include <linux/rtnetlink.h>
>  #include <linux/mutex.h>
> @@ -418,6 +419,8 @@ int flow_indr_dev_register(flow_indr_block_bind_cb_t *cb, void *cb_priv)
>  	existing_qdiscs_register(cb, cb_priv);
>  	mutex_unlock(&flow_indr_block_lock);
>  
> +	tcf_action_reoffload_cb(cb, cb_priv, true);
> +
>  	return 0;
>  }
>  EXPORT_SYMBOL(flow_indr_dev_register);
> @@ -472,6 +475,8 @@ void flow_indr_dev_unregister(flow_indr_block_bind_cb_t *cb, void *cb_priv,
>  
>  	flow_block_indr_notify(&cleanup_list);
>  	kfree(indr_dev);
> +
> +	tcf_action_reoffload_cb(cb, cb_priv, false);

Don't know if it is a problem, but shouldn't tcf_action_reoffload_cb()
be called before flow_block_indr_notify(), which calls
flow_block_indr->cleanup() callbacks?

>  }
>  EXPORT_SYMBOL(flow_indr_dev_unregister);
>  
> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> index 3893ffd91192..dce25d8f147b 100644
> --- a/net/sched/act_api.c
> +++ b/net/sched/act_api.c
> @@ -638,6 +638,59 @@ EXPORT_SYMBOL(tcf_idrinfo_destroy);
>  
>  static LIST_HEAD(act_base);
>  static DEFINE_RWLOCK(act_mod_lock);
> +/* since act ops id is stored in pernet subsystem list,
> + * then there is no way to walk through only all the action
> + * subsystem, so we keep tc action pernet ops id for
> + * reoffload to walk through.
> + */
> +static LIST_HEAD(act_pernet_id_list);
> +static DEFINE_MUTEX(act_id_mutex);
> +struct tc_act_pernet_id {
> +	struct list_head list;
> +	unsigned int id;
> +};
> +
> +static int tcf_pernet_add_id_list(unsigned int id)
> +{
> +	struct tc_act_pernet_id *id_ptr;
> +	int ret = 0;
> +
> +	mutex_lock(&act_id_mutex);
> +	list_for_each_entry(id_ptr, &act_pernet_id_list, list) {
> +		if (id_ptr->id == id) {
> +			ret = -EEXIST;
> +			goto err_out;
> +		}
> +	}
> +
> +	id_ptr = kzalloc(sizeof(*id_ptr), GFP_KERNEL);
> +	if (!id_ptr) {
> +		ret = -ENOMEM;
> +		goto err_out;
> +	}
> +	id_ptr->id = id;
> +
> +	list_add_tail(&id_ptr->list, &act_pernet_id_list);
> +
> +err_out:
> +	mutex_unlock(&act_id_mutex);
> +	return ret;
> +}
> +
> +static void tcf_pernet_del_id_list(unsigned int id)
> +{
> +	struct tc_act_pernet_id *id_ptr;
> +
> +	mutex_lock(&act_id_mutex);
> +	list_for_each_entry(id_ptr, &act_pernet_id_list, list) {
> +		if (id_ptr->id == id) {
> +			list_del(&id_ptr->list);
> +			kfree(id_ptr);
> +			break;
> +		}
> +	}
> +	mutex_unlock(&act_id_mutex);
> +}
>  
>  int tcf_register_action(struct tc_action_ops *act,
>  			struct pernet_operations *ops)
> @@ -656,18 +709,30 @@ int tcf_register_action(struct tc_action_ops *act,
>  	if (ret)
>  		return ret;
>  
> +	if (ops->id) {
> +		ret = tcf_pernet_add_id_list(*ops->id);
> +		if (ret)
> +			goto id_err;
> +	}
> +
>  	write_lock(&act_mod_lock);
>  	list_for_each_entry(a, &act_base, head) {
>  		if (act->id == a->id || (strcmp(act->kind, a->kind) == 0)) {
> -			write_unlock(&act_mod_lock);
> -			unregister_pernet_subsys(ops);
> -			return -EEXIST;
> +			ret = -EEXIST;
> +			goto err_out;
>  		}
>  	}
>  	list_add_tail(&act->head, &act_base);
>  	write_unlock(&act_mod_lock);
>  
>  	return 0;
> +
> +err_out:
> +	write_unlock(&act_mod_lock);
> +	tcf_pernet_del_id_list(*ops->id);
> +id_err:
> +	unregister_pernet_subsys(ops);
> +	return ret;
>  }
>  EXPORT_SYMBOL(tcf_register_action);
>  
> @@ -686,8 +751,11 @@ int tcf_unregister_action(struct tc_action_ops *act,
>  		}
>  	}
>  	write_unlock(&act_mod_lock);
> -	if (!err)
> +	if (!err) {
>  		unregister_pernet_subsys(ops);
> +		if (ops->id)
> +			tcf_pernet_del_id_list(*ops->id);
> +	}
>  	return err;
>  }
>  EXPORT_SYMBOL(tcf_unregister_action);
> @@ -1175,15 +1243,11 @@ static int flow_action_init(struct flow_offload_action *fl_action,
>  	return 0;
>  }
>  
> -static int tcf_action_offload_cmd(struct flow_offload_action *fl_act,
> -				  u32 *hw_count,
> -				  struct netlink_ext_ack *extack)
> +static int tcf_action_offload_cmd_ex(struct flow_offload_action *fl_act,
> +				     u32 *hw_count)
>  {
>  	int err;
>  
> -	if (IS_ERR(fl_act))
> -		return PTR_ERR(fl_act);
> -
>  	err = flow_indr_dev_setup_offload(NULL, NULL, TC_SETUP_ACT,
>  					  fl_act, NULL, NULL);
>  	if (err < 0)
> @@ -1195,9 +1259,41 @@ static int tcf_action_offload_cmd(struct flow_offload_action *fl_act,
>  	return 0;
>  }
>  
> +static int tcf_action_offload_cmd_cb_ex(struct flow_offload_action *fl_act,
> +					u32 *hw_count,
> +					flow_indr_block_bind_cb_t *cb,
> +					void *cb_priv)
> +{
> +	int err;
> +
> +	err = cb(NULL, NULL, cb_priv, TC_SETUP_ACT, NULL, fl_act, NULL);
> +	if (err < 0)
> +		return err;
> +
> +	if (hw_count)
> +		*hw_count = 1;
> +
> +	return 0;
> +}
> +
> +static int tcf_action_offload_cmd(struct flow_offload_action *fl_act,
> +				  u32 *hw_count,
> +				  flow_indr_block_bind_cb_t *cb,
> +				  void *cb_priv)
> +{
> +	if (IS_ERR(fl_act))
> +		return PTR_ERR(fl_act);
> +
> +	return cb ? tcf_action_offload_cmd_cb_ex(fl_act, hw_count,
> +						 cb, cb_priv) :
> +		    tcf_action_offload_cmd_ex(fl_act, hw_count);
> +}
> +
>  /* offload the tc command after inserted */
> -static int tcf_action_offload_add(struct tc_action *action,
> -				  struct netlink_ext_ack *extack)
> +static int tcf_action_offload_add_ex(struct tc_action *action,
> +				     struct netlink_ext_ack *extack,
> +				     flow_indr_block_bind_cb_t *cb,
> +				     void *cb_priv)
>  {
>  	bool skip_sw = tc_act_skip_sw(action->tcfa_flags);
>  	struct tc_action *actions[TCA_ACT_MAX_PRIO] = {
> @@ -1225,9 +1321,10 @@ static int tcf_action_offload_add(struct tc_action *action,
>  		goto fl_err;
>  	}
>  
> -	err = tcf_action_offload_cmd(fl_action, &in_hw_count, extack);
> +	err = tcf_action_offload_cmd(fl_action, &in_hw_count, cb, cb_priv);
>  	if (!err)
> -		flow_action_hw_count_set(action, in_hw_count);
> +		cb ? flow_action_hw_count_inc(action, in_hw_count) :
> +		     flow_action_hw_count_set(action, in_hw_count);
>  
>  	if (skip_sw && !tc_act_in_hw(action))
>  		err = -EINVAL;
> @@ -1240,6 +1337,12 @@ static int tcf_action_offload_add(struct tc_action *action,
>  	return err;
>  }
>  
> +static int tcf_action_offload_add(struct tc_action *action,
> +				  struct netlink_ext_ack *extack)
> +{
> +	return tcf_action_offload_add_ex(action, extack, NULL, NULL);
> +}
> +
>  int tcf_action_update_hw_stats(struct tc_action *action)
>  {
>  	struct flow_offload_action fl_act = {};
> @@ -1252,7 +1355,7 @@ int tcf_action_update_hw_stats(struct tc_action *action)
>  	if (err)
>  		goto err_out;
>  
> -	err = tcf_action_offload_cmd(&fl_act, NULL, NULL);
> +	err = tcf_action_offload_cmd(&fl_act, NULL, NULL, NULL);
>  
>  	if (!err && fl_act.stats.lastused) {
>  		preempt_disable();
> @@ -1274,7 +1377,9 @@ int tcf_action_update_hw_stats(struct tc_action *action)
>  }
>  EXPORT_SYMBOL(tcf_action_update_hw_stats);
>  
> -int tcf_action_offload_del(struct tc_action *action)
> +static int tcf_action_offload_del_ex(struct tc_action *action,
> +				     flow_indr_block_bind_cb_t *cb,
> +				     void *cb_priv)
>  {
>  	struct flow_offload_action fl_act;
>  	u32 in_hw_count = 0;
> @@ -1290,13 +1395,83 @@ int tcf_action_offload_del(struct tc_action *action)
>  	if (err)
>  		return err;
>  
> -	err = tcf_action_offload_cmd(&fl_act, &in_hw_count, NULL);
> -	if (err)
> +	err = tcf_action_offload_cmd(&fl_act, &in_hw_count, cb, cb_priv);
> +	if (err < 0)
>  		return err;
>  
> -	if (action->in_hw_count != in_hw_count)
> +	if (!cb && action->in_hw_count != in_hw_count)
>  		return -EINVAL;
>  
> +	/* do not need to update hw state when deleting action */
> +	if (cb && in_hw_count)
> +		flow_action_hw_count_dec(action, in_hw_count);
> +
> +	return 0;
> +}
> +
> +int tcf_action_offload_del(struct tc_action *action)
> +{
> +	return tcf_action_offload_del_ex(action, NULL, NULL);
> +}
> +
> +int tcf_action_reoffload_cb(flow_indr_block_bind_cb_t *cb,
> +			    void *cb_priv, bool add)
> +{
> +	struct tc_act_pernet_id *id_ptr;
> +	struct tcf_idrinfo *idrinfo;
> +	struct tc_action_net *tn;
> +	struct tc_action *p;
> +	unsigned int act_id;
> +	unsigned long tmp;
> +	unsigned long id;
> +	struct idr *idr;
> +	struct net *net;
> +	int ret;
> +
> +	if (!cb)
> +		return -EINVAL;
> +
> +	down_read(&net_rwsem);
> +	mutex_lock(&act_id_mutex);
> +
> +	for_each_net(net) {
> +		list_for_each_entry(id_ptr, &act_pernet_id_list, list) {
> +			act_id = id_ptr->id;
> +			tn = net_generic(net, act_id);
> +			if (!tn)
> +				continue;
> +			idrinfo = tn->idrinfo;
> +			if (!idrinfo)
> +				continue;
> +
> +			mutex_lock(&idrinfo->lock);
> +			idr = &idrinfo->action_idr;
> +			idr_for_each_entry_ul(idr, p, tmp, id) {
> +				if (IS_ERR(p) || tc_act_bind(p->tcfa_flags))
> +					continue;
> +				if (add) {
> +					tcf_action_offload_add_ex(p, NULL, cb,
> +								  cb_priv);
> +					continue;
> +				}
> +
> +				/* cb unregister to update hw count */
> +				ret = tcf_action_offload_del_ex(p, cb, cb_priv);
> +				if (ret < 0)
> +					continue;
> +				if (tc_act_skip_sw(p->tcfa_flags) &&
> +				    !tc_act_in_hw(p)) {
> +					ret = tcf_idr_release_unsafe(p);
> +					if (ret == ACT_P_DELETED)
> +						module_put(p->ops->owner);
> +				}
> +			}
> +			mutex_unlock(&idrinfo->lock);
> +		}
> +	}
> +	mutex_unlock(&act_id_mutex);
> +	up_read(&net_rwsem);
> +
>  	return 0;
>  }



* Re: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions Simon Horman
  2021-10-28 19:12   ` kernel test robot
@ 2021-10-29 18:01   ` Vlad Buslov
  2021-10-30 10:54     ` Jamal Hadi Salim
  2021-11-04  2:30     ` Baowen Zheng
  1 sibling, 2 replies; 58+ messages in thread
From: Vlad Buslov @ 2021-10-29 18:01 UTC (permalink / raw)
  To: Simon Horman
  Cc: netdev, Jamal Hadi Salim, Roi Dayan, Ido Schimmel, Cong Wang,
	Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers, Baowen Zheng

On Thu 28 Oct 2021 at 14:06, Simon Horman <simon.horman@corigine.com> wrote:
> From: Baowen Zheng <baowen.zheng@corigine.com>
>
> Add process to validate flags of filter and actions when adding
> a tc filter.
>
> We need to prevent adding filter with flags conflicts with its actions.
>
> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
> Signed-off-by: Louis Peens <louis.peens@corigine.com>
> Signed-off-by: Simon Horman <simon.horman@corigine.com>
> ---
>  net/sched/cls_api.c      | 26 ++++++++++++++++++++++++++
>  net/sched/cls_flower.c   |  3 ++-
>  net/sched/cls_matchall.c |  4 ++--
>  net/sched/cls_u32.c      |  7 ++++---
>  4 files changed, 34 insertions(+), 6 deletions(-)
>
> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> index 351d93988b8b..80647da9713a 100644
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -3025,6 +3025,29 @@ void tcf_exts_destroy(struct tcf_exts *exts)
>  }
>  EXPORT_SYMBOL(tcf_exts_destroy);
>  
> +static bool tcf_exts_validate_actions(const struct tcf_exts *exts, u32 flags)
> +{
> +#ifdef CONFIG_NET_CLS_ACT
> +	bool skip_sw = tc_skip_sw(flags);
> +	bool skip_hw = tc_skip_hw(flags);
> +	int i;
> +
> +	if (!(skip_sw | skip_hw))
> +		return true;
> +
> +	for (i = 0; i < exts->nr_actions; i++) {
> +		struct tc_action *a = exts->actions[i];
> +
> +		if ((skip_sw && tc_act_skip_hw(a->tcfa_flags)) ||
> +		    (skip_hw && tc_act_skip_sw(a->tcfa_flags)))
> +			return false;
> +	}
> +	return true;
> +#else
> +	return true;
> +#endif
> +}
> +

I know Jamal suggested having skip_sw for actions, but it complicates
the code and I still don't entirely understand why it is necessary.
After all, an action can only be applied to a packet if the packet has
been matched by some filter, and filters already have skip_sw/skip_hw
controls. Forgoing the action skip_sw flag would:

- Alleviate the need to validate that filter and action flags are
compatible. (Trying to offload a filter that points to an existing skip_hw
action would just fail because the driver wouldn't find the action with
the provided id in its tables.)

- Remove the need to add more conditionals to the TC software data path in
patch 4.

WDYT?
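
For reference, the compatibility rule that the quoted
tcf_exts_validate_actions() enforces can be reproduced as a small
userspace sketch. The EX_* bit values and the ex_flags_compatible() helper
are hypothetical stand-ins chosen for illustration, not the kernel's
definitions:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the skip flags; the values are illustrative. */
#define EX_SKIP_HW (1u << 0)
#define EX_SKIP_SW (1u << 1)

/* Same rule as the quoted patch: a skip_sw filter may not use a skip_hw
 * action, and a skip_hw filter may not use a skip_sw action. A filter
 * with neither flag set accepts any action.
 */
static bool ex_flags_compatible(unsigned int filter_flags,
				unsigned int action_flags)
{
	bool f_skip_sw = filter_flags & EX_SKIP_SW;
	bool f_skip_hw = filter_flags & EX_SKIP_HW;

	if (!(f_skip_sw || f_skip_hw))
		return true;

	if (f_skip_sw && (action_flags & EX_SKIP_HW))
		return false;
	if (f_skip_hw && (action_flags & EX_SKIP_SW))
		return false;

	return true;
}
```

Under these assumptions, ex_flags_compatible(EX_SKIP_SW, EX_SKIP_HW) is
rejected, while a filter with no skip flags accepts an action with any
combination of flags.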

>  int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb,
>  		      struct nlattr *rate_tlv, struct tcf_exts *exts,
>  		      u32 flags, struct netlink_ext_ack *extack)
> @@ -3066,6 +3089,9 @@ int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb,
>  				return err;
>  			exts->nr_actions = err;
>  		}
> +
> +		if (!tcf_exts_validate_actions(exts, flags))
> +			return -EINVAL;
>  	}
>  #else
>  	if ((exts->action && tb[exts->action]) ||
> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
> index eb6345a027e1..55f89f0e393e 100644
> --- a/net/sched/cls_flower.c
> +++ b/net/sched/cls_flower.c
> @@ -2035,7 +2035,8 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
>  	}
>  
>  	err = fl_set_parms(net, tp, fnew, mask, base, tb, tca[TCA_RATE],
> -			   tp->chain->tmplt_priv, flags, extack);
> +			   tp->chain->tmplt_priv, flags | fnew->flags,
> +			   extack);

Aren't you or-ing flags from two different ranges (TCA_CLS_FLAGS_* and
TCA_ACT_FLAGS_*) that map to the same bits, or am I missing something? This
isn't explained in the commit message, so it is hard for me to understand the
idea here.
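
To make the concern concrete, here is a minimal sketch of what can go
wrong when two flag namespaces that reuse the same bit positions are
or-ed together. The EX_* names and bit assignments are hypothetical and
chosen only so that they collide; they are not the values from the uapi
headers:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classifier namespace. */
#define EX_CLS_FLAGS_SKIP_HW (1u << 0)
#define EX_CLS_FLAGS_SKIP_SW (1u << 1)

/* Hypothetical action namespace that reuses the same bit positions. */
#define EX_ACT_FLAGS_A (1u << 0)
#define EX_ACT_FLAGS_B (1u << 1)

/* What "flags | fnew->flags" does: the result no longer records which
 * namespace each bit came from.
 */
static uint32_t ex_merge_flags(uint32_t act_flags, uint32_t cls_flags)
{
	return act_flags | cls_flags;
}

/* A consumer that interprets the merged word as classifier flags. */
static bool ex_cls_skip_sw(uint32_t flags)
{
	return (flags & EX_CLS_FLAGS_SKIP_SW) != 0;
}
```

With these assumed values, ex_cls_skip_sw(ex_merge_flags(EX_ACT_FLAGS_B, 0))
reports skip_sw even though the filter itself set no flags at all.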

>  	if (err)
>  		goto errout;
>  
> diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
> index 24f0046ce0b3..00b76fbc1dce 100644
> --- a/net/sched/cls_matchall.c
> +++ b/net/sched/cls_matchall.c
> @@ -226,8 +226,8 @@ static int mall_change(struct net *net, struct sk_buff *in_skb,
>  		goto err_alloc_percpu;
>  	}
>  
> -	err = mall_set_parms(net, tp, new, base, tb, tca[TCA_RATE], flags,
> -			     extack);
> +	err = mall_set_parms(net, tp, new, base, tb, tca[TCA_RATE],
> +			     flags | new->flags, extack);
>  	if (err)
>  		goto err_set_parms;
>  
> diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
> index 4272814487f0..fc670cc45122 100644
> --- a/net/sched/cls_u32.c
> +++ b/net/sched/cls_u32.c
> @@ -895,7 +895,8 @@ static int u32_change(struct net *net, struct sk_buff *in_skb,
>  			return -ENOMEM;
>  
>  		err = u32_set_parms(net, tp, base, new, tb,
> -				    tca[TCA_RATE], flags, extack);
> +				    tca[TCA_RATE], flags | new->flags,
> +				    extack);
>  
>  		if (err) {
>  			u32_destroy_key(new, false);
> @@ -1060,8 +1061,8 @@ static int u32_change(struct net *net, struct sk_buff *in_skb,
>  	}
>  #endif
>  
> -	err = u32_set_parms(net, tp, base, n, tb, tca[TCA_RATE], flags,
> -			    extack);
> +	err = u32_set_parms(net, tp, base, n, tb, tca[TCA_RATE],
> +			    flags | n->flags, extack);
>  	if (err == 0) {
>  		struct tc_u_knode __rcu **ins;
>  		struct tc_u_knode *pins;



* Re: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-10-29 18:01   ` Vlad Buslov
@ 2021-10-30 10:54     ` Jamal Hadi Salim
  2021-10-30 14:45       ` Vlad Buslov
  2021-11-04  2:30     ` Baowen Zheng
  1 sibling, 1 reply; 58+ messages in thread
From: Jamal Hadi Salim @ 2021-10-30 10:54 UTC (permalink / raw)
  To: Vlad Buslov, Simon Horman
  Cc: netdev, Roi Dayan, Ido Schimmel, Cong Wang, Jiri Pirko,
	Baowen Zheng, Louis Peens, oss-drivers, Baowen Zheng

On 2021-10-29 14:01, Vlad Buslov wrote:
> On Thu 28 Oct 2021 at 14:06, Simon Horman <simon.horman@corigine.com> wrote:
>> From: Baowen Zheng <baowen.zheng@corigine.com>
>>
>> Add process to validate flags of filter and actions when adding
>> a tc filter.
>>
>> We need to prevent adding filter with flags conflicts with its actions.
>>
>> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
>> Signed-off-by: Louis Peens <louis.peens@corigine.com>
>> Signed-off-by: Simon Horman <simon.horman@corigine.com>
>> ---
>>   net/sched/cls_api.c      | 26 ++++++++++++++++++++++++++
>>   net/sched/cls_flower.c   |  3 ++-
>>   net/sched/cls_matchall.c |  4 ++--
>>   net/sched/cls_u32.c      |  7 ++++---
>>   4 files changed, 34 insertions(+), 6 deletions(-)
>>
>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>> index 351d93988b8b..80647da9713a 100644
>> --- a/net/sched/cls_api.c
>> +++ b/net/sched/cls_api.c
>> @@ -3025,6 +3025,29 @@ void tcf_exts_destroy(struct tcf_exts *exts)
>>   }
>>   EXPORT_SYMBOL(tcf_exts_destroy);
>>   
>> +static bool tcf_exts_validate_actions(const struct tcf_exts *exts, u32 flags)
>> +{
>> +#ifdef CONFIG_NET_CLS_ACT
>> +	bool skip_sw = tc_skip_sw(flags);
>> +	bool skip_hw = tc_skip_hw(flags);
>> +	int i;
>> +
>> +	if (!(skip_sw | skip_hw))
>> +		return true;
>> +
>> +	for (i = 0; i < exts->nr_actions; i++) {
>> +		struct tc_action *a = exts->actions[i];
>> +
>> +		if ((skip_sw && tc_act_skip_hw(a->tcfa_flags)) ||
>> +		    (skip_hw && tc_act_skip_sw(a->tcfa_flags)))
>> +			return false;
>> +	}
>> +	return true;
>> +#else
>> +	return true;
>> +#endif
>> +}
>> +
> 
> I know Jamal suggested having skip_sw for actions, but it complicates
> the code and I still don't entirely understand why it is necessary.

If the hardware can independently accept an action offload then
skip_sw per action makes total sense. BTW, my understanding is
_your_ hardware is capable as such at least for policers ;->
And such policers are then shared across filters.
Other than the architectural reason, I may have missed something
because I don't see much complexity added as a result.
Are you more worried about slowing down the update rate?

cheers,
jamal


* Re: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-10-30 10:54     ` Jamal Hadi Salim
@ 2021-10-30 14:45       ` Vlad Buslov
       [not found]         ` <DM5PR1301MB21722A85B19EE97EFE27A5BBE7899@DM5PR1301MB2172.namprd13.prod.outlook.com>
  0 siblings, 1 reply; 58+ messages in thread
From: Vlad Buslov @ 2021-10-30 14:45 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Simon Horman, netdev, Roi Dayan, Ido Schimmel, Cong Wang,
	Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers, Baowen Zheng

On Sat 30 Oct 2021 at 13:54, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 2021-10-29 14:01, Vlad Buslov wrote:
>> On Thu 28 Oct 2021 at 14:06, Simon Horman <simon.horman@corigine.com> wrote:
>>> From: Baowen Zheng <baowen.zheng@corigine.com>
>>>
>>> Add process to validate flags of filter and actions when adding
>>> a tc filter.
>>>
>>> We need to prevent adding filter with flags conflicts with its actions.
>>>
>>> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
>>> Signed-off-by: Louis Peens <louis.peens@corigine.com>
>>> Signed-off-by: Simon Horman <simon.horman@corigine.com>
>>> ---
>>>   net/sched/cls_api.c      | 26 ++++++++++++++++++++++++++
>>>   net/sched/cls_flower.c   |  3 ++-
>>>   net/sched/cls_matchall.c |  4 ++--
>>>   net/sched/cls_u32.c      |  7 ++++---
>>>   4 files changed, 34 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>>> index 351d93988b8b..80647da9713a 100644
>>> --- a/net/sched/cls_api.c
>>> +++ b/net/sched/cls_api.c
>>> @@ -3025,6 +3025,29 @@ void tcf_exts_destroy(struct tcf_exts *exts)
>>>   }
>>>   EXPORT_SYMBOL(tcf_exts_destroy);
>>>   +static bool tcf_exts_validate_actions(const struct tcf_exts *exts, u32
>>> flags)
>>> +{
>>> +#ifdef CONFIG_NET_CLS_ACT
>>> +	bool skip_sw = tc_skip_sw(flags);
>>> +	bool skip_hw = tc_skip_hw(flags);
>>> +	int i;
>>> +
>>> +	if (!(skip_sw | skip_hw))
>>> +		return true;
>>> +
>>> +	for (i = 0; i < exts->nr_actions; i++) {
>>> +		struct tc_action *a = exts->actions[i];
>>> +
>>> +		if ((skip_sw && tc_act_skip_hw(a->tcfa_flags)) ||
>>> +		    (skip_hw && tc_act_skip_sw(a->tcfa_flags)))
>>> +			return false;
>>> +	}
>>> +	return true;
>>> +#else
>>> +	return true;
>>> +#endif
>>> +}
>>> +
>> I know Jamal suggested having skip_sw for actions, but it complicates
>> the code and I still don't entirely understand why it is necessary.
>
> If the hardware can independently accept an action offload then
> skip_sw per action makes total sense. BTW, my understanding is

An example configuration that seems bizarre to me is when an offloaded
shared action has the skip_sw flag set but the filter doesn't. Then the
behavior of a classifier that points to such an action diverges between
hardware and software (different lists of actions are applied). We always
try to make the offloaded TC data path behave exactly the same as software
and, even though here it would be explicit and deliberate, I don't see any
practical use case for this.
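
A rough sketch of that divergence, assuming the software path simply
skips actions marked skip_sw while the hardware path applies every
action that was offloaded. The struct, flag values and helper below are
hypothetical and only model the counting, not the real data path:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-action flags; values are illustrative. */
#define EX_ACT_SKIP_HW (1u << 0)
#define EX_ACT_SKIP_SW (1u << 1)

struct ex_action {
	const char *name;
	unsigned int flags;
};

/* Count the actions a filter with no skip flags would apply in the given
 * path: hardware ignores skip_hw actions, software ignores skip_sw ones.
 */
static size_t ex_count_applied(const struct ex_action *acts, size_t n,
			       bool in_hw)
{
	unsigned int skip = in_hw ? EX_ACT_SKIP_HW : EX_ACT_SKIP_SW;
	size_t count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!(acts[i].flags & skip))
			count++;
	}

	return count;
}
```

For a list of { police (skip_sw), gact (no flags) } this model applies
two actions in hardware but only one in software, which is the mismatch
described above.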

> _your_ hardware is capable as such at least for policers ;->
> And such policers are then shared across filters.

True, but why do you need the skip_sw action flag for that?

> Other than the architectural reason, I may have missed something
> because I don't see much complexity added as a result.

Well, the other part of my email was about how I don't understand what is
going on in the flags handling code here. This patch and parts of other
patches in the series would be unnecessary if we forgo the action
skip_sw flag. I guess we can just make the code nicer by folding the
validation into tcf_action_init(), for example.

> Are you more worried about slowing down the update rate?

I don't expect the validation code to significantly impact the update
rate.


* Re: [RFC/PATCH net-next v3 3/8] flow_offload: allow user to offload tc action to net device
  2021-10-28 11:06 ` [RFC/PATCH net-next v3 3/8] flow_offload: allow user to offload tc action to net device Simon Horman
  2021-10-29 16:59   ` Vlad Buslov
@ 2021-10-31  9:50   ` Oz Shlomo
  2021-11-01  2:30     ` Baowen Zheng
  1 sibling, 1 reply; 58+ messages in thread
From: Oz Shlomo @ 2021-10-31  9:50 UTC (permalink / raw)
  To: Simon Horman, netdev
  Cc: Vlad Buslov, Jamal Hadi Salim, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers,
	Baowen Zheng



On 10/28/2021 2:06 PM, Simon Horman wrote:
> From: Baowen Zheng <baowen.zheng@corigine.com>
> 
> Use flow_indr_dev_register/flow_indr_dev_setup_offload to
> offload tc action.

How will device drivers reference the offloaded actions when offloading a flow?
Perhaps the flow_action_entry structure should also include the action index.
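
As a rough illustration of the idea, a driver could keep a table of
independently offloaded actions keyed by action index and resolve a
flow's reference through it. Everything below (struct names, the
act_index field, the fixed-size table) is a hypothetical sketch, not
existing kernel or driver code:

```c
#include <stddef.h>
#include <stdint.h>

#define EX_MAX_POLICERS 8

/* Hypothetical driver-side record of an offloaded police action. */
struct ex_hw_policer {
	uint32_t index;		/* tc action index, e.g. "index 99" */
	uint64_t rate_bytes_ps;
	int in_use;
};

/* Hypothetical flow entry carrying the index of the shared action. */
struct ex_flow_entry {
	uint32_t act_index;
};

static struct ex_hw_policer ex_table[EX_MAX_POLICERS];

static struct ex_hw_policer *ex_policer_lookup(uint32_t index)
{
	size_t i;

	for (i = 0; i < EX_MAX_POLICERS; i++) {
		if (ex_table[i].in_use && ex_table[i].index == index)
			return &ex_table[i];
	}

	return NULL;
}

/* Returns 0 on success, -1 if the index exists or the table is full. */
static int ex_policer_add(uint32_t index, uint64_t rate_bytes_ps)
{
	size_t i;

	if (ex_policer_lookup(index))
		return -1;

	for (i = 0; i < EX_MAX_POLICERS; i++) {
		if (!ex_table[i].in_use) {
			ex_table[i].index = index;
			ex_table[i].rate_bytes_ps = rate_bytes_ps;
			ex_table[i].in_use = 1;
			return 0;
		}
	}

	return -1;
}
```

With such a table, offloading a flow whose entry carries act_index 99
would succeed only if ex_policer_lookup(99) finds a policer that was
offloaded earlier.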

> 
> We need to call tc_cleanup_flow_action to clean up tc action entry since
> in tc_setup_action, some actions may hold dev refcnt, especially the mirror
> action.
> 
> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
> Signed-off-by: Louis Peens <louis.peens@corigine.com>
> Signed-off-by: Simon Horman <simon.horman@corigine.com>
> ---
>   include/linux/netdevice.h  |   1 +
>   include/net/act_api.h      |   2 +-
>   include/net/flow_offload.h |  17 ++++
>   include/net/pkt_cls.h      |  15 ++++
>   net/core/flow_offload.c    |  43 ++++++++--
>   net/sched/act_api.c        | 166 +++++++++++++++++++++++++++++++++++++
>   net/sched/cls_api.c        |  29 ++++++-
>   7 files changed, 260 insertions(+), 13 deletions(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 3ec42495a43a..9815c3a058e9 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -916,6 +916,7 @@ enum tc_setup_type {
>   	TC_SETUP_QDISC_TBF,
>   	TC_SETUP_QDISC_FIFO,
>   	TC_SETUP_QDISC_HTB,
> +	TC_SETUP_ACT,
>   };
>   
>   /* These structures hold the attributes of bpf state that are being passed
> diff --git a/include/net/act_api.h b/include/net/act_api.h
> index b5b624c7e488..9eb19188603c 100644
> --- a/include/net/act_api.h
> +++ b/include/net/act_api.h
> @@ -239,7 +239,7 @@ static inline void tcf_action_inc_overlimit_qstats(struct tc_action *a)
>   void tcf_action_update_stats(struct tc_action *a, u64 bytes, u64 packets,
>   			     u64 drops, bool hw);
>   int tcf_action_copy_stats(struct sk_buff *, struct tc_action *, int);
> -
> +int tcf_action_offload_del(struct tc_action *action);
>   int tcf_action_check_ctrlact(int action, struct tcf_proto *tp,
>   			     struct tcf_chain **handle,
>   			     struct netlink_ext_ack *newchain);
> diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
> index 3961461d9c8b..aa28592fccc0 100644
> --- a/include/net/flow_offload.h
> +++ b/include/net/flow_offload.h
> @@ -552,6 +552,23 @@ struct flow_cls_offload {
>   	u32 classid;
>   };
>   
> +enum flow_act_command {
> +	FLOW_ACT_REPLACE,
> +	FLOW_ACT_DESTROY,
> +	FLOW_ACT_STATS,
> +};
> +
> +struct flow_offload_action {
> +	struct netlink_ext_ack *extack; /* NULL in FLOW_ACT_STATS process*/
> +	enum flow_act_command command;
> +	enum flow_action_id id;
> +	u32 index;
> +	struct flow_stats stats;
> +	struct flow_action action;
> +};
> +
> +struct flow_offload_action *flow_action_alloc(unsigned int num_actions);
> +
>   static inline struct flow_rule *
>   flow_cls_offload_flow_rule(struct flow_cls_offload *flow_cmd)
>   {
> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
> index 193f88ebf629..922775407257 100644
> --- a/include/net/pkt_cls.h
> +++ b/include/net/pkt_cls.h
> @@ -258,6 +258,9 @@ static inline void tcf_exts_put_net(struct tcf_exts *exts)
>   	for (; 0; (void)(i), (void)(a), (void)(exts))
>   #endif
>   
> +#define tcf_act_for_each_action(i, a, actions) \
> +	for (i = 0; i < TCA_ACT_MAX_PRIO && ((a) = actions[i]); i++)
> +
>   static inline void
>   tcf_exts_stats_update(const struct tcf_exts *exts,
>   		      u64 bytes, u64 packets, u64 drops, u64 lastuse,
> @@ -532,8 +535,19 @@ tcf_match_indev(struct sk_buff *skb, int ifindex)
>   	return ifindex == skb->skb_iif;
>   }
>   
> +#ifdef CONFIG_NET_CLS_ACT
>   int tc_setup_flow_action(struct flow_action *flow_action,
>   			 const struct tcf_exts *exts);
> +#else
> +static inline int tc_setup_flow_action(struct flow_action *flow_action,
> +				       const struct tcf_exts *exts)
> +{
> +	return 0;
> +}
> +#endif
> +
> +int tc_setup_action(struct flow_action *flow_action,
> +		    struct tc_action *actions[]);
>   void tc_cleanup_flow_action(struct flow_action *flow_action);
>   
>   int tc_setup_cb_call(struct tcf_block *block, enum tc_setup_type type,
> @@ -554,6 +568,7 @@ int tc_setup_cb_reoffload(struct tcf_block *block, struct tcf_proto *tp,
>   			  enum tc_setup_type type, void *type_data,
>   			  void *cb_priv, u32 *flags, unsigned int *in_hw_count);
>   unsigned int tcf_exts_num_actions(struct tcf_exts *exts);
> +unsigned int tcf_act_num_actions_single(struct tc_action *act);
>   
>   #ifdef CONFIG_NET_CLS_ACT
>   int tcf_qevent_init(struct tcf_qevent *qe, struct Qdisc *sch,
> diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
> index 6beaea13564a..6676431733ef 100644
> --- a/net/core/flow_offload.c
> +++ b/net/core/flow_offload.c
> @@ -27,6 +27,27 @@ struct flow_rule *flow_rule_alloc(unsigned int num_actions)
>   }
>   EXPORT_SYMBOL(flow_rule_alloc);
>   
> +struct flow_offload_action *flow_action_alloc(unsigned int num_actions)
> +{
> +	struct flow_offload_action *fl_action;
> +	int i;
> +
> +	fl_action = kzalloc(struct_size(fl_action, action.entries, num_actions),
> +			    GFP_KERNEL);
> +	if (!fl_action)
> +		return NULL;
> +
> +	fl_action->action.num_entries = num_actions;
> +	/* Pre-fill each action hw_stats with DONT_CARE.
> +	 * Caller can override this if it wants stats for a given action.
> +	 */
> +	for (i = 0; i < num_actions; i++)
> +		fl_action->action.entries[i].hw_stats = FLOW_ACTION_HW_STATS_DONT_CARE;
> +
> +	return fl_action;
> +}
> +EXPORT_SYMBOL(flow_action_alloc);
> +
>   #define FLOW_DISSECTOR_MATCH(__rule, __type, __out)				\
>   	const struct flow_match *__m = &(__rule)->match;			\
>   	struct flow_dissector *__d = (__m)->dissector;				\
> @@ -549,19 +570,25 @@ int flow_indr_dev_setup_offload(struct net_device *dev,	struct Qdisc *sch,
>   				void (*cleanup)(struct flow_block_cb *block_cb))
>   {
>   	struct flow_indr_dev *this;
> +	u32 count = 0;
> +	int err;
>   
>   	mutex_lock(&flow_indr_block_lock);
> +	if (bo) {
> +		if (bo->command == FLOW_BLOCK_BIND)
> +			indir_dev_add(data, dev, sch, type, cleanup, bo);
> +		else if (bo->command == FLOW_BLOCK_UNBIND)
> +			indir_dev_remove(data);
> +	}
>   
> -	if (bo->command == FLOW_BLOCK_BIND)
> -		indir_dev_add(data, dev, sch, type, cleanup, bo);
> -	else if (bo->command == FLOW_BLOCK_UNBIND)
> -		indir_dev_remove(data);
> -
> -	list_for_each_entry(this, &flow_block_indr_dev_list, list)
> -		this->cb(dev, sch, this->cb_priv, type, bo, data, cleanup);
> +	list_for_each_entry(this, &flow_block_indr_dev_list, list) {
> +		err = this->cb(dev, sch, this->cb_priv, type, bo, data, cleanup);
> +		if (!err)
> +			count++;
> +	}
>   
>   	mutex_unlock(&flow_indr_block_lock);
>   
> -	return list_empty(&bo->cb_list) ? -EOPNOTSUPP : 0;
> +	return (bo && list_empty(&bo->cb_list)) ? -EOPNOTSUPP : count;
>   }
>   EXPORT_SYMBOL(flow_indr_dev_setup_offload);
> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> index 3258da3d5bed..33f2ff885b4b 100644
> --- a/net/sched/act_api.c
> +++ b/net/sched/act_api.c
> @@ -21,6 +21,19 @@
>   #include <net/pkt_cls.h>
>   #include <net/act_api.h>
>   #include <net/netlink.h>
> +#include <net/tc_act/tc_pedit.h>
> +#include <net/tc_act/tc_mirred.h>
> +#include <net/tc_act/tc_vlan.h>
> +#include <net/tc_act/tc_tunnel_key.h>
> +#include <net/tc_act/tc_csum.h>
> +#include <net/tc_act/tc_gact.h>
> +#include <net/tc_act/tc_police.h>
> +#include <net/tc_act/tc_sample.h>
> +#include <net/tc_act/tc_skbedit.h>
> +#include <net/tc_act/tc_ct.h>
> +#include <net/tc_act/tc_mpls.h>
> +#include <net/tc_act/tc_gate.h>
> +#include <net/flow_offload.h>
>   
>   #ifdef CONFIG_INET
>   DEFINE_STATIC_KEY_FALSE(tcf_frag_xmit_count);
> @@ -148,6 +161,7 @@ static int __tcf_action_put(struct tc_action *p, bool bind)
>   		idr_remove(&idrinfo->action_idr, p->tcfa_index);
>   		mutex_unlock(&idrinfo->lock);
>   
> +		tcf_action_offload_del(p);
>   		tcf_action_cleanup(p);
>   		return 1;
>   	}
> @@ -341,6 +355,7 @@ static int tcf_idr_release_unsafe(struct tc_action *p)
>   		return -EPERM;
>   
>   	if (refcount_dec_and_test(&p->tcfa_refcnt)) {
> +		tcf_action_offload_del(p);
>   		idr_remove(&p->idrinfo->action_idr, p->tcfa_index);
>   		tcf_action_cleanup(p);
>   		return ACT_P_DELETED;
> @@ -452,6 +467,7 @@ static int tcf_idr_delete_index(struct tcf_idrinfo *idrinfo, u32 index)
>   						p->tcfa_index));
>   			mutex_unlock(&idrinfo->lock);
>   
> +			tcf_action_offload_del(p);
>   			tcf_action_cleanup(p);
>   			module_put(owner);
>   			return 0;
> @@ -1061,6 +1077,154 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
>   	return ERR_PTR(err);
>   }
>   
> +static int flow_action_init(struct flow_offload_action *fl_action,
> +			    struct tc_action *act,
> +			    enum flow_act_command cmd,
> +			    struct netlink_ext_ack *extack)
> +{
> +	if (!fl_action)
> +		return -EINVAL;
> +
> +	fl_action->extack = extack;
> +	fl_action->command = cmd;
> +	fl_action->index = act->tcfa_index;
> +
> +	if (is_tcf_gact_ok(act)) {
> +		fl_action->id = FLOW_ACTION_ACCEPT;
> +	} else if (is_tcf_gact_shot(act)) {
> +		fl_action->id = FLOW_ACTION_DROP;
> +	} else if (is_tcf_gact_trap(act)) {
> +		fl_action->id = FLOW_ACTION_TRAP;
> +	} else if (is_tcf_gact_goto_chain(act)) {
> +		fl_action->id = FLOW_ACTION_GOTO;
> +	} else if (is_tcf_mirred_egress_redirect(act)) {
> +		fl_action->id = FLOW_ACTION_REDIRECT;
> +	} else if (is_tcf_mirred_egress_mirror(act)) {
> +		fl_action->id = FLOW_ACTION_MIRRED;
> +	} else if (is_tcf_mirred_ingress_redirect(act)) {
> +		fl_action->id = FLOW_ACTION_REDIRECT_INGRESS;
> +	} else if (is_tcf_mirred_ingress_mirror(act)) {
> +		fl_action->id = FLOW_ACTION_MIRRED_INGRESS;
> +	} else if (is_tcf_vlan(act)) {
> +		switch (tcf_vlan_action(act)) {
> +		case TCA_VLAN_ACT_PUSH:
> +			fl_action->id = FLOW_ACTION_VLAN_PUSH;
> +			break;
> +		case TCA_VLAN_ACT_POP:
> +			fl_action->id = FLOW_ACTION_VLAN_POP;
> +			break;
> +		case TCA_VLAN_ACT_MODIFY:
> +			fl_action->id = FLOW_ACTION_VLAN_MANGLE;
> +			break;
> +		default:
> +			return -EOPNOTSUPP;
> +		}
> +	} else if (is_tcf_tunnel_set(act)) {
> +		fl_action->id = FLOW_ACTION_TUNNEL_ENCAP;
> +	} else if (is_tcf_tunnel_release(act)) {
> +		fl_action->id = FLOW_ACTION_TUNNEL_DECAP;
> +	} else if (is_tcf_csum(act)) {
> +		fl_action->id = FLOW_ACTION_CSUM;
> +	} else if (is_tcf_skbedit_mark(act)) {
> +		fl_action->id = FLOW_ACTION_MARK;
> +	} else if (is_tcf_sample(act)) {
> +		fl_action->id = FLOW_ACTION_SAMPLE;
> +	} else if (is_tcf_police(act)) {
> +		fl_action->id = FLOW_ACTION_POLICE;
> +	} else if (is_tcf_ct(act)) {
> +		fl_action->id = FLOW_ACTION_CT;
> +	} else if (is_tcf_mpls(act)) {
> +		switch (tcf_mpls_action(act)) {
> +		case TCA_MPLS_ACT_PUSH:
> +			fl_action->id = FLOW_ACTION_MPLS_PUSH;
> +			break;
> +		case TCA_MPLS_ACT_POP:
> +			fl_action->id = FLOW_ACTION_MPLS_POP;
> +			break;
> +		case TCA_MPLS_ACT_MODIFY:
> +			fl_action->id = FLOW_ACTION_MPLS_MANGLE;
> +			break;
> +		default:
> +			return -EOPNOTSUPP;
> +		}
> +	} else if (is_tcf_skbedit_ptype(act)) {
> +		fl_action->id = FLOW_ACTION_PTYPE;
> +	} else if (is_tcf_skbedit_priority(act)) {
> +		fl_action->id = FLOW_ACTION_PRIORITY;
> +	} else if (is_tcf_gate(act)) {
> +		fl_action->id = FLOW_ACTION_GATE;
> +	} else {
> +		return -EOPNOTSUPP;
> +	}
> +
> +	return 0;
> +}
> +
> +static int tcf_action_offload_cmd(struct flow_offload_action *fl_act,
> +				  struct netlink_ext_ack *extack)
> +{
> +	int err;
> +
> +	if (IS_ERR(fl_act))
> +		return PTR_ERR(fl_act);
> +
> +	err = flow_indr_dev_setup_offload(NULL, NULL, TC_SETUP_ACT,
> +					  fl_act, NULL, NULL);
> +	if (err < 0)
> +		return err;
> +
> +	return 0;
> +}
> +
> +/* offload the tc command after inserted */
> +static int tcf_action_offload_add(struct tc_action *action,
> +				  struct netlink_ext_ack *extack)
> +{
> +	struct tc_action *actions[TCA_ACT_MAX_PRIO] = {
> +		[0] = action,
> +	};
> +	struct flow_offload_action *fl_action;
> +	int err = 0;
> +
> +	fl_action = flow_action_alloc(tcf_act_num_actions_single(action));
> +	if (!fl_action)
> +		return -ENOMEM;
> +
> +	err = flow_action_init(fl_action, action, FLOW_ACT_REPLACE, extack);
> +	if (err)
> +		goto fl_err;
> +
> +	err = tc_setup_action(&fl_action->action, actions);
> +	if (err) {
> +		NL_SET_ERR_MSG_MOD(extack,
> +				   "Failed to setup tc actions for offload\n");
> +		goto fl_err;
> +	}
> +
> +	err = tcf_action_offload_cmd(fl_action, extack);
> +	tc_cleanup_flow_action(&fl_action->action);
> +
> +fl_err:
> +	kfree(fl_action);
> +
> +	return err;
> +}
> +
> +int tcf_action_offload_del(struct tc_action *action)
> +{
> +	struct flow_offload_action fl_act;
> +	int err = 0;
> +
> +	if (!action)
> +		return -EINVAL;
> +
> +	err = flow_action_init(&fl_act, action, FLOW_ACT_DESTROY, NULL);
> +	if (err)
> +		return err;
> +
> +	return tcf_action_offload_cmd(&fl_act, NULL);
> +}
> +
>   /* Returns numbers of initialized actions or negative error. */
>   
>   int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
> @@ -1103,6 +1267,8 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
>   		sz += tcf_action_fill_size(act);
>   		/* Start from index 0 */
>   		actions[i - 1] = act;
> +		if (!(flags & TCA_ACT_FLAGS_BIND))
> +			tcf_action_offload_add(act, extack);

Why is this restricted to actions created without the TCA_ACT_FLAGS_BIND flag?
How are actions instantiated by the filters different from those that are created by "tc actions"?

>   	}
>   
>   	/* We have to commit them all together, because if any error happened in
> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> index 2ef8f5a6205a..351d93988b8b 100644
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -3544,8 +3544,8 @@ static enum flow_action_hw_stats tc_act_hw_stats(u8 hw_stats)
>   	return hw_stats;
>   }
>   
> -int tc_setup_flow_action(struct flow_action *flow_action,
> -			 const struct tcf_exts *exts)
> +int tc_setup_action(struct flow_action *flow_action,
> +		    struct tc_action *actions[])
>   {
>   	struct tc_action *act;
>   	int i, j, k, err = 0;
> @@ -3554,11 +3554,11 @@ int tc_setup_flow_action(struct flow_action *flow_action,
>   	BUILD_BUG_ON(TCA_ACT_HW_STATS_IMMEDIATE != FLOW_ACTION_HW_STATS_IMMEDIATE);
>   	BUILD_BUG_ON(TCA_ACT_HW_STATS_DELAYED != FLOW_ACTION_HW_STATS_DELAYED);
>   
> -	if (!exts)
> +	if (!actions)
>   		return 0;
>   
>   	j = 0;
> -	tcf_exts_for_each_action(i, act, exts) {
> +	tcf_act_for_each_action(i, act, actions) {
>   		struct flow_action_entry *entry;
>   
>   		entry = &flow_action->entries[j];
> @@ -3725,7 +3725,19 @@ int tc_setup_flow_action(struct flow_action *flow_action,
>   	spin_unlock_bh(&act->tcfa_lock);
>   	goto err_out;
>   }
> +EXPORT_SYMBOL(tc_setup_action);
> +
> +#ifdef CONFIG_NET_CLS_ACT
> +int tc_setup_flow_action(struct flow_action *flow_action,
> +			 const struct tcf_exts *exts)
> +{
> +	if (!exts)
> +		return 0;
> +
> +	return tc_setup_action(flow_action, exts->actions);
> +}
>   EXPORT_SYMBOL(tc_setup_flow_action);
> +#endif
>   
>   unsigned int tcf_exts_num_actions(struct tcf_exts *exts)
>   {
> @@ -3743,6 +3755,15 @@ unsigned int tcf_exts_num_actions(struct tcf_exts *exts)
>   }
>   EXPORT_SYMBOL(tcf_exts_num_actions);
>   
> +unsigned int tcf_act_num_actions_single(struct tc_action *act)
> +{
> +	if (is_tcf_pedit(act))
> +		return tcf_pedit_nkeys(act);
> +	else
> +		return 1;
> +}
> +EXPORT_SYMBOL(tcf_act_num_actions_single);
> +
>   #ifdef CONFIG_NET_CLS_ACT
>   static int tcf_qevent_parse_block_index(struct nlattr *block_index_attr,
>   					u32 *p_block_index,
> 

* Re: [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device
  2021-10-28 11:06 [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device Simon Horman
                   ` (8 preceding siblings ...)
  2021-10-28 14:23 ` [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device Jamal Hadi Salim
@ 2021-10-31  9:50 ` Oz Shlomo
  2021-10-31 12:03   ` Dave Taht
  2021-10-31 13:40   ` Jamal Hadi Salim
  9 siblings, 2 replies; 58+ messages in thread
From: Oz Shlomo @ 2021-10-31  9:50 UTC (permalink / raw)
  To: Simon Horman, netdev
  Cc: Vlad Buslov, Jamal Hadi Salim, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers



On 10/28/2021 2:06 PM, Simon Horman wrote:
> Baowen Zheng says:
> 
> Allow use of flow_indr_dev_register/flow_indr_dev_setup_offload to offload
> tc actions independent of flows.
> 
> The motivation for this work is to prepare for using TC police action
> instances to provide hardware offload of OVS metering feature - which calls
> for policers that may be used by multiple flows and whose lifecycle is
> independent of any flows that use them.
> 
> This patch includes basic changes to offload drivers to return EOPNOTSUPP
> if this feature is used - it is not yet supported by any driver.
> 
> Tc cli command to offload and quote an action:
> 
> tc qdisc del dev $DEV ingress && sleep 1 || true
> tc actions delete action police index 99 || true
> 
> tc qdisc add dev $DEV ingress
> tc qdisc show dev $DEV ingress
> 
> tc actions add action police index 99 rate 1mbit burst 100k skip_sw
> tc actions list action police
> 
> tc filter add dev $DEV protocol ip parent ffff:
> flower ip_proto tcp action police index 99
> tc -s -d filter show dev $DEV protocol ip parent ffff:
> tc filter add dev $DEV protocol ipv6 parent ffff:
> flower skip_sw ip_proto tcp action police index 99
> tc -s -d filter show dev $DEV protocol ipv6 parent ffff:
> tc actions list action police
> 
> tc qdisc del dev $DEV ingress && sleep 1
> tc actions delete action police index 99
> tc actions list action police
> 

Actions are also (implicitly) instantiated when filters are created.
In the following example the mirred action instance (created by the first filter) is shared by the 
second filter:

tc filter add dev $DEV1 proto ip parent ffff: flower \
	ip_proto tcp action mirred egress redirect dev $DEV3

tc filter add dev $DEV2 proto ip parent ffff: flower \
	ip_proto tcp action mirred index 1


> Changes compared to v2 patches:
> 
> * Made changes according to the review comments.
> * Delete in_hw and not_in_hw flag and user can judge if the action is
>    offloaded to any hardware by in_hw_count.
> * Split the main patch of the action offload to three single patch to
> facilitate code review.
> 
> Posting this revision of the patchset as an RFC as while we feel it is
> ready for review we would like an opportunity to conduct further testing
> before acceptance into upstream.
> 
> Baowen Zheng (8):
>    flow_offload: fill flags to action structure
>    flow_offload: reject to offload tc actions in offload drivers
>    flow_offload: allow user to offload tc action to net device
>    flow_offload: add skip_hw and skip_sw to control if offload the action
>    flow_offload: add process to update action stats from hardware
>    net: sched: save full flags for tc action
>    flow_offload: add reoffload process to update hw_count
>    flow_offload: validate flags of filter and actions
> 
>   drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c  |   2 +-
>   .../ethernet/mellanox/mlx5/core/en/rep/tc.c   |   3 +
>   .../ethernet/netronome/nfp/flower/offload.c   |   3 +
>   include/linux/netdevice.h                     |   1 +
>   include/net/act_api.h                         |  34 +-
>   include/net/flow_offload.h                    |  17 +
>   include/net/pkt_cls.h                         |  61 ++-
>   include/uapi/linux/pkt_cls.h                  |   9 +-
>   net/core/flow_offload.c                       |  48 +-
>   net/sched/act_api.c                           | 440 +++++++++++++++++-
>   net/sched/act_bpf.c                           |   2 +-
>   net/sched/act_connmark.c                      |   2 +-
>   net/sched/act_ctinfo.c                        |   2 +-
>   net/sched/act_gate.c                          |   2 +-
>   net/sched/act_ife.c                           |   2 +-
>   net/sched/act_ipt.c                           |   2 +-
>   net/sched/act_mpls.c                          |   2 +-
>   net/sched/act_nat.c                           |   2 +-
>   net/sched/act_pedit.c                         |   2 +-
>   net/sched/act_police.c                        |   2 +-
>   net/sched/act_sample.c                        |   2 +-
>   net/sched/act_simple.c                        |   2 +-
>   net/sched/act_skbedit.c                       |   2 +-
>   net/sched/act_skbmod.c                        |   2 +-
>   net/sched/cls_api.c                           |  55 ++-
>   net/sched/cls_flower.c                        |   3 +-
>   net/sched/cls_matchall.c                      |   4 +-
>   net/sched/cls_u32.c                           |   7 +-
>   28 files changed, 661 insertions(+), 54 deletions(-)
> 

* Re: [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device
  2021-10-31  9:50 ` Oz Shlomo
@ 2021-10-31 12:03   ` Dave Taht
  2021-10-31 14:14     ` Jamal Hadi Salim
  2021-10-31 13:40   ` Jamal Hadi Salim
  1 sibling, 1 reply; 58+ messages in thread
From: Dave Taht @ 2021-10-31 12:03 UTC (permalink / raw)
  To: Oz Shlomo
  Cc: Simon Horman, Linux Kernel Network Developers, Vlad Buslov,
	Jamal Hadi Salim, Roi Dayan, Ido Schimmel, Cong Wang, Jiri Pirko,
	Baowen Zheng, Louis Peens, oss-drivers

On Sun, Oct 31, 2021 at 2:51 AM Oz Shlomo <ozsh@nvidia.com> wrote:
>
>
>
> On 10/28/2021 2:06 PM, Simon Horman wrote:
> > Baowen Zheng says:
> >
> > Allow use of flow_indr_dev_register/flow_indr_dev_setup_offload to offload
> > tc actions independent of flows.
> >
> > The motivation for this work is to prepare for using TC police action
> > instances to provide hardware offload of OVS metering feature - which calls
> > for policers that may be used by multiple flows and whose lifecycle is
> > independent of any flows that use them.
> >
> > This patch includes basic changes to offload drivers to return EOPNOTSUPP
> > if this feature is used - it is not yet supported by any driver.
> >
> > Tc cli command to offload and quote an action:
> >
> > tc qdisc del dev $DEV ingress && sleep 1 || true
> > tc actions delete action police index 99 || true
> >
> > tc qdisc add dev $DEV ingress
> > tc qdisc show dev $DEV ingress
> >
> > tc actions add action police index 99 rate 1mbit burst 100k skip_sw
> > tc actions list action police
> >
> > tc filter add dev $DEV protocol ip parent ffff:
> > flower ip_proto tcp action police index 99
> > tc -s -d filter show dev $DEV protocol ip parent ffff:
> > tc filter add dev $DEV protocol ipv6 parent ffff:
> > flower skip_sw ip_proto tcp action police index 99
> > tc -s -d filter show dev $DEV protocol ipv6 parent ffff:
> > tc actions list action police
> >
> > tc qdisc del dev $DEV ingress && sleep 1
> > tc actions delete action police index 99
> > tc actions list action police
> >
>
> Actions are also (implicitly) instantiated when filters are created.
> In the following example the mirred action instance (created by the first filter) is shared by the
> second filter:
>
> tc filter add dev $DEV1 proto ip parent ffff: flower \
>         ip_proto tcp action mirred egress redirect dev $DEV3
>
> tc filter add dev $DEV2 proto ip parent ffff: flower \
>         ip_proto tcp action mirred index 1
>
>
> > Changes compared to v2 patches:
> >
> > * Made changes according to the review comments.
> > * Delete in_hw and not_in_hw flag and user can judge if the action is
> >    offloaded to any hardware by in_hw_count.
> > * Split the main patch of the action offload to three single patch to
> > facilitate code review.
> >
> > Posting this revision of the patchset as an RFC as while we feel it is
> > ready for review we would like an opportunity to conduct further testing
> > before acceptance into upstream.
> >
> > Baowen Zheng (8):
> >    flow_offload: fill flags to action structure
> >    flow_offload: reject to offload tc actions in offload drivers
> >    flow_offload: allow user to offload tc action to net device
> >    flow_offload: add skip_hw and skip_sw to control if offload the action
> >    flow_offload: add process to update action stats from hardware
> >    net: sched: save full flags for tc action
> >    flow_offload: add reoffload process to update hw_count
> >    flow_offload: validate flags of filter and actions
> >
> >   drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c  |   2 +-
> >   .../ethernet/mellanox/mlx5/core/en/rep/tc.c   |   3 +
> >   .../ethernet/netronome/nfp/flower/offload.c   |   3 +
> >   include/linux/netdevice.h                     |   1 +
> >   include/net/act_api.h                         |  34 +-
> >   include/net/flow_offload.h                    |  17 +
> >   include/net/pkt_cls.h                         |  61 ++-
> >   include/uapi/linux/pkt_cls.h                  |   9 +-
> >   net/core/flow_offload.c                       |  48 +-
> >   net/sched/act_api.c                           | 440 +++++++++++++++++-
> >   net/sched/act_bpf.c                           |   2 +-
> >   net/sched/act_connmark.c                      |   2 +-
> >   net/sched/act_ctinfo.c                        |   2 +-
> >   net/sched/act_gate.c                          |   2 +-
> >   net/sched/act_ife.c                           |   2 +-
> >   net/sched/act_ipt.c                           |   2 +-
> >   net/sched/act_mpls.c                          |   2 +-
> >   net/sched/act_nat.c                           |   2 +-
> >   net/sched/act_pedit.c                         |   2 +-
> >   net/sched/act_police.c                        |   2 +-
> >   net/sched/act_sample.c                        |   2 +-
> >   net/sched/act_simple.c                        |   2 +-
> >   net/sched/act_skbedit.c                       |   2 +-
> >   net/sched/act_skbmod.c                        |   2 +-
> >   net/sched/cls_api.c                           |  55 ++-
> >   net/sched/cls_flower.c                        |   3 +-
> >   net/sched/cls_matchall.c                      |   4 +-
> >   net/sched/cls_u32.c                           |   7 +-
> >   28 files changed, 661 insertions(+), 54 deletions(-)
> >

Just as an on-going grump: It has been my hope that policing as a
technique would have died a horrible death by now. Seeing it come back
as an "easy to offload" operation here - fresh from the 1990s! - does
not mean it's a good idea, and I'd rather like it if we were finding
ways to offload newer things that work better, such as modern aqm, fair
queuing, and shaping technologies that are in pie, fq_codel, and cake.

Policing leads to bursty loss, especially at higher rates, BBR has a
specific mode designed to defeat it, and I ripped it out of wondershaper
long ago for very good reasons:
https://www.bufferbloat.net/projects/bloat/wiki/Wondershaper_Must_Die/

I did a long time ago start working on a better policing idea based on
some good aqm ideas like AFD, but dropped it figuring that policing
was going to vanish from the planet. It's baaaaaack.

-- 
I tried to build a better future, a few times:
https://wayforward.archive.org/?site=https%3A%2F%2Fwww.icei.org

Dave Täht CEO, TekLibre, LLC

* Re: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
       [not found]         ` <DM5PR1301MB21722A85B19EE97EFE27A5BBE7899@DM5PR1301MB2172.namprd13.prod.outlook.com>
@ 2021-10-31 13:30           ` Jamal Hadi Salim
  2021-11-01  3:29             ` Baowen Zheng
  0 siblings, 1 reply; 58+ messages in thread
From: Jamal Hadi Salim @ 2021-10-31 13:30 UTC (permalink / raw)
  To: Baowen Zheng, Vlad Buslov
  Cc: Simon Horman, netdev, Roi Dayan, Ido Schimmel, Cong Wang,
	Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers

On 2021-10-30 22:27, Baowen Zheng wrote:
> Thanks for your review. After some consideration, I think I understand what you mean.
> 

[..]

>>>> I know Jamal suggested to have skip_sw for actions, but it complicates
>>>> the code and I'm still not entirely understand why it is necessary.
>>>
>>> If the hardware can independently accept an action offload then
>>> skip_sw per action makes total sense. BTW, my understanding is
>>
>> Example configuration that seems bizarre to me is when offloaded shared
>> action has skip_sw flag set but filter doesn't. Then behavior of
>> classifier that points to such action diverges between hardware and
>> software (different lists of actions are applied). We always try to make
>> offloaded TC data path behave exactly the same as software and, even
>> though here it would be explicit and deliberate, I don't see any
>> practical use-case for this.
> We added skip_sw to stay compatible with the filter flags and to give the user an
> option to specify whether the action should run in software. I understand what you mean;
> maybe our example is not a good one. We need to prevent the filter from running in software
> if the actions it applies are skip_sw, so we need to add more validation to check for this.
> Also, I think your suggestion makes full sense if there is no use case for specifying that the
> action should not run in sw, and indeed it would make our implementation simpler if we omit
> the skip_sw option.
> Jamal, WDYT?


Let me use an example to illustrate my concern:

#add a policer and offload it
tc actions add action police skip_sw rate ... index 20
#now add filter1 which is offloaded
tc filter add dev $DEV1 proto ip parent ffff: flower \
     skip_sw ip_proto tcp action police index 20
#add filter2 likewise offloaded
tc filter add dev $DEV1 proto ip parent ffff: flower \
     skip_sw ip_proto udp action police index 20

All good so far...
#Now add a filter3 which is s/w only
tc filter add dev $DEV1 proto ip parent ffff: flower \
     skip_hw ip_proto icmp action police index 20

filter3 should not be allowed.

If we had added the policer without skip_sw and without
skip_hw then i think filter3 should have been legal
(we just need to account for stats in_hw vs in_sw).

Not sure if that makes sense (and addresses Vlad's earlier
comment).
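
To make the rule above concrete, here is a minimal userspace sketch in C.
It is not code from this series: the flag macros and
filter_may_bind_action() are hypothetical names, and the symmetric
skip_hw case is an assumption that was not spelled out above.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; not the kernel UAPI values. */
#define SKETCH_SKIP_HW 0x1u	/* run in software only */
#define SKETCH_SKIP_SW 0x2u	/* run in hardware only */

/*
 * Return true if a filter created with filter_flags may refer to an
 * action created with act_flags.
 */
static bool filter_may_bind_action(unsigned int filter_flags,
				   unsigned int act_flags)
{
	/* A hardware-only action may only be used by hardware-only filters. */
	if ((act_flags & SKETCH_SKIP_SW) && !(filter_flags & SKETCH_SKIP_SW))
		return false;

	/* Assumed symmetric case: a software-only action needs skip_hw filters. */
	if ((act_flags & SKETCH_SKIP_HW) && !(filter_flags & SKETCH_SKIP_HW))
		return false;

	/* An action with neither flag can be used by any filter. */
	return true;
}
```

With this check filter1 and filter2 pass, filter3 is rejected against the
skip_sw policer, and filter3 is accepted if the policer carries neither flag.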


cheers,
jamal

* Re: [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device
  2021-10-31  9:50 ` Oz Shlomo
  2021-10-31 12:03   ` Dave Taht
@ 2021-10-31 13:40   ` Jamal Hadi Salim
  2021-11-01  8:01     ` Vlad Buslov
  1 sibling, 1 reply; 58+ messages in thread
From: Jamal Hadi Salim @ 2021-10-31 13:40 UTC (permalink / raw)
  To: Oz Shlomo, Simon Horman, netdev
  Cc: Vlad Buslov, Roi Dayan, Ido Schimmel, Cong Wang, Jiri Pirko,
	Baowen Zheng, Louis Peens, oss-drivers

On 2021-10-31 05:50, Oz Shlomo wrote:
> 
> 
> On 10/28/2021 2:06 PM, Simon Horman wrote:
>> Baowen Zheng says:
>>
>> Allow use of flow_indr_dev_register/flow_indr_dev_setup_offload to 
>> offload
>> tc actions independent of flows.
>>
>> The motivation for this work is to prepare for using TC police action
>> instances to provide hardware offload of OVS metering feature - which 
>> calls
>> for policers that may be used by multiple flows and whose lifecycle is
>> independent of any flows that use them.
>>
>> This patch includes basic changes to offload drivers to return EOPNOTSUPP
>> if this feature is used - it is not yet supported by any driver.
>>
>> Tc cli command to offload and quote an action:
>>
>> tc qdisc del dev $DEV ingress && sleep 1 || true
>> tc actions delete action police index 99 || true
>>
>> tc qdisc add dev $DEV ingress
>> tc qdisc show dev $DEV ingress
>>
>> tc actions add action police index 99 rate 1mbit burst 100k skip_sw
>> tc actions list action police
>>
>> tc filter add dev $DEV protocol ip parent ffff:
>> flower ip_proto tcp action police index 99
>> tc -s -d filter show dev $DEV protocol ip parent ffff:
>> tc filter add dev $DEV protocol ipv6 parent ffff:
>> flower skip_sw ip_proto tcp action police index 99
>> tc -s -d filter show dev $DEV protocol ipv6 parent ffff:
>> tc actions list action police
>>
>> tc qdisc del dev $DEV ingress && sleep 1
>> tc actions delete action police index 99
>> tc actions list action police
>>
> 
> Actions are also (implicitly) instantiated when filters are created.
> In the following example the mirred action instance (created by the 
> first filter) is shared by the second filter:
> 
> tc filter add dev $DEV1 proto ip parent ffff: flower \
>      ip_proto tcp action mirred egress redirect dev $DEV3
> 
> tc filter add dev $DEV2 proto ip parent ffff: flower \
>      ip_proto tcp action mirred index 1
> 
> 

I sure hope this is supported. At least the discussions so far
are a nod in that direction...
I know there is hardware that is not capable of achieving this
(little CPE type devices) but let's not make that the common case.

cheers,
jamal

* Re: [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device
  2021-10-31 12:03   ` Dave Taht
@ 2021-10-31 14:14     ` Jamal Hadi Salim
  2021-10-31 14:19       ` Jamal Hadi Salim
  2021-11-01 14:27       ` Dave Taht
  0 siblings, 2 replies; 58+ messages in thread
From: Jamal Hadi Salim @ 2021-10-31 14:14 UTC (permalink / raw)
  To: Dave Taht, Oz Shlomo
  Cc: Simon Horman, Linux Kernel Network Developers, Vlad Buslov,
	Roi Dayan, Ido Schimmel, Cong Wang, Jiri Pirko, Baowen Zheng,
	Louis Peens, oss-drivers

On 2021-10-31 08:03, Dave Taht wrote:
[..]

> 
> Just as an on-going grump: It has been my hope that policing as a
> technique would have died a horrible death by now. Seeing it come back
> as an "easy to offload" operation here - fresh from the 1990s! - does
> not mean it's a good idea, and I'd rather like it if we were finding
> ways to offload newer things that work better, such as modern aqm, fair
> queuing, and shaping technologies that are in pie, fq_codel, and cake.
> 
> Policing leads to bursty loss, especially at higher rates, BBR has a
> specific mode designed to defeat it, and I ripped it out of wondershaper
> long ago for very good reasons:
> https://www.bufferbloat.net/projects/bloat/wiki/Wondershaper_Must_Die/
> 
> I did a long time ago start working on a better policing idea based on
> some good aqm ideas like AFD, but dropped it figuring that policing
> was going to vanish from the planet. It's baaaaaack.

A lot of enthusiasm for fq_codel in that link ;->
Root cause for burstiness is typically due to large transient queues
(which are sometimes not under your admin control) and of course if
you use a policer and don't have your double leaky buckets set properly
to compensate for both short and long term rates you will have bursts
of drops with the policer. It would be the same with shaper as well
if the packet burst shows up when the queue is full.
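
For what "double buckets" means here, a rough userspace sketch in C of
a policer with a long term and a short term bucket follows. It is only an
illustration and not the act_police code; every name in it is made up for
the sketch, and it ignores rounding loss on refill and arithmetic overflow
at very high rates.

```c
#include <stdbool.h>
#include <stdint.h>

/* One token bucket; all names here are hypothetical. */
struct bucket {
	uint64_t rate_Bps;	/* fill rate in bytes per second */
	uint64_t depth;		/* maximum tokens in bytes */
	uint64_t tokens;	/* currently available tokens in bytes */
};

struct policer {
	struct bucket longterm;		/* sustained rate, deep bucket */
	struct bucket shortterm;	/* peak rate, shallow bucket */
	uint64_t last_ns;		/* time of the previous packet */
};

/* Add the tokens earned during elapsed_ns, capped at the bucket depth. */
static void bucket_refill(struct bucket *b, uint64_t elapsed_ns)
{
	uint64_t add = b->rate_Bps * elapsed_ns / 1000000000ull;

	b->tokens = (b->tokens + add > b->depth) ? b->depth : b->tokens + add;
}

/* Return true if a packet of len bytes conforms to both buckets. */
static bool policer_conform(struct policer *p, uint64_t now_ns, uint64_t len)
{
	uint64_t elapsed = now_ns - p->last_ns;

	bucket_refill(&p->longterm, elapsed);
	bucket_refill(&p->shortterm, elapsed);
	p->last_ns = now_ns;

	/* Either bucket running dry means the packet is out of profile. */
	if (len > p->longterm.tokens || len > p->shortterm.tokens)
		return false;

	p->longterm.tokens -= len;
	p->shortterm.tokens -= len;
	return true;
}
```

The deep bucket bounds the long term rate while the shallow one bounds how
much of that allowance can be spent back to back. If the shallow bucket is
sized badly a burst is either dropped wholesale or let through unchecked.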

Intuitively it would feel, for non-work conserving approaches,
delaying a packet (as in shaping) is better than dropping (as in
policing) - but i do not have a study which scientifically proves it.
Any pointers in that regard?
TCP would recover either way (either detecting sequence gaps or RTO).

At the Linux kernel level i am not sure i see much difference in either
since we actually feedback an indicator to TCP to indicate a local
drop (as opposed to guessing when it is dropped in the network)
and the TCP code is smart enough to utilize that knowledge.
For hardware offload there is no such feedback for either of those
two approaches (so no difference with drop in the blackhole).

As to "policer must die" - not possible i am afraid;-> I mean there
has to be strong evidence that it is a bad idea and besides that
_a lot of hardware_ supports it;-> Ergo, we have to support it as well.
Note: RED for example has been proven almost impossible to configure
properly but we still support it and there's a good set of hardware
offload support for it. For RED - and i should say the policer as well -
if you configure properly, _it works_.


BTW, Some mellanox NICs offload HTB. See for example:
https://legacy.netdevconf.info/0x14/session.html?talk-hierarchical-QoS-hardware-offload

cheers,
jamal

* Re: [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device
  2021-10-31 14:14     ` Jamal Hadi Salim
@ 2021-10-31 14:19       ` Jamal Hadi Salim
  2021-11-01 14:27       ` Dave Taht
  1 sibling, 0 replies; 58+ messages in thread
From: Jamal Hadi Salim @ 2021-10-31 14:19 UTC (permalink / raw)
  To: Dave Taht, Oz Shlomo
  Cc: Simon Horman, Linux Kernel Network Developers, Vlad Buslov,
	Roi Dayan, Ido Schimmel, Cong Wang, Jiri Pirko, Baowen Zheng,
	Louis Peens, oss-drivers, Yossi Kuperman

On 2021-10-31 10:14, Jamal Hadi Salim wrote:

> BTW, Some mellanox NICs offload HTB. See for example:
             ^^^^^^^^

Sorry, to be politically correct s/mellanox/Nvidia ;->

cheers,
jamal

* RE: [RFC/PATCH net-next v3 3/8] flow_offload: allow user to offload tc action to net device
  2021-10-31  9:50   ` Oz Shlomo
@ 2021-11-01  2:30     ` Baowen Zheng
  2021-11-01 10:07       ` Oz Shlomo
  0 siblings, 1 reply; 58+ messages in thread
From: Baowen Zheng @ 2021-11-01  2:30 UTC (permalink / raw)
  To: Oz Shlomo, Simon Horman, netdev
  Cc: Vlad Buslov, Jamal Hadi Salim, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers

On 10/31/2021 5:50 PM, Oz Shlomo wrote:
>On 10/28/2021 2:06 PM, Simon Horman wrote:
>> From: Baowen Zheng <baowen.zheng@corigine.com>
>>
>> Use flow_indr_dev_register/flow_indr_dev_setup_offload to offload tc
>> action.
>
>How will device drivers reference the offloaded actions when offloading a
>flow?
>Perhaps the flow_action_entry structure should also include the action index.
>
We have set the action index in flow_offload_action to offload the action. Also, some of the
actions in flow_action_entry that we want to offload already include an index.
If a driver wants to support an action that needs an index, I think the index can be added
later; it may not need to be included in this patch, WDYT?
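
As an illustration of how a driver could use that index, below is a minimal
userspace sketch in C of a table keyed by action type and index. It is only
a sketch under assumptions: nothing in it is taken from an existing driver,
and all names (sketch_table, sketch_lookup, sketch_offload_cmd) are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical commands mirroring the replace/destroy idea. */
enum sketch_cmd { SKETCH_ACT_REPLACE, SKETCH_ACT_DESTROY };

struct sketch_hw_action {
	int in_use;
	int id;			/* action type, e.g. police */
	uint32_t index;		/* tc action index */
};

#define SKETCH_MAX_ACTIONS 8
static struct sketch_hw_action sketch_table[SKETCH_MAX_ACTIONS];

/* Find an offloaded action by type and index, or return NULL. */
static struct sketch_hw_action *sketch_lookup(int id, uint32_t index)
{
	size_t i;

	for (i = 0; i < SKETCH_MAX_ACTIONS; i++) {
		struct sketch_hw_action *a = &sketch_table[i];

		if (a->in_use && a->id == id && a->index == index)
			return a;
	}
	return NULL;
}

/* Handle an action offload command; return 0 on success, -1 on failure. */
static int sketch_offload_cmd(enum sketch_cmd cmd, int id, uint32_t index)
{
	struct sketch_hw_action *a = sketch_lookup(id, index);
	size_t i;

	switch (cmd) {
	case SKETCH_ACT_REPLACE:
		if (a)
			return 0;	/* present: parameters would be updated */
		for (i = 0; i < SKETCH_MAX_ACTIONS; i++) {
			if (!sketch_table[i].in_use) {
				sketch_table[i].in_use = 1;
				sketch_table[i].id = id;
				sketch_table[i].index = index;
				return 0;
			}
		}
		return -1;		/* table full */
	case SKETCH_ACT_DESTROY:
		if (!a)
			return -1;	/* nothing to remove */
		a->in_use = 0;
		return 0;
	}
	return -1;
}
```

When a flow is offloaded later, a driver built like this would call
sketch_lookup() with the index carried by the flow's action entry to find
the already offloaded instance.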
>>
>> We need to call tc_cleanup_flow_action to clean up tc action entry
>> since in tc_setup_action, some actions may hold dev refcnt, especially
>> the mirror action.
>>
>> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
>> Signed-off-by: Louis Peens <louis.peens@corigine.com>
>> Signed-off-by: Simon Horman <simon.horman@corigine.com>
>> ---
>>   include/linux/netdevice.h  |   1 +
>>   include/net/act_api.h      |   2 +-
>>   include/net/flow_offload.h |  17 ++++
>>   include/net/pkt_cls.h      |  15 ++++
>>   net/core/flow_offload.c    |  43 ++++++++--
>>   net/sched/act_api.c        | 166 +++++++++++++++++++++++++++++++++++++
>>   net/sched/cls_api.c        |  29 ++++++-
>>   7 files changed, 260 insertions(+), 13 deletions(-)
>>
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index 3ec42495a43a..9815c3a058e9 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -916,6 +916,7 @@ enum tc_setup_type {
>>   	TC_SETUP_QDISC_TBF,
>>   	TC_SETUP_QDISC_FIFO,
>>   	TC_SETUP_QDISC_HTB,
>> +	TC_SETUP_ACT,
>>   };
>>
>>   /* These structures hold the attributes of bpf state that are being passed
>> diff --git a/include/net/act_api.h b/include/net/act_api.h
>> index b5b624c7e488..9eb19188603c 100644
>> --- a/include/net/act_api.h
>> +++ b/include/net/act_api.h
>> @@ -239,7 +239,7 @@ static inline void tcf_action_inc_overlimit_qstats(struct tc_action *a)
>>   void tcf_action_update_stats(struct tc_action *a, u64 bytes, u64 packets,
>>   			     u64 drops, bool hw);
>>   int tcf_action_copy_stats(struct sk_buff *, struct tc_action *, int);
>> -
>> +int tcf_action_offload_del(struct tc_action *action);
>>   int tcf_action_check_ctrlact(int action, struct tcf_proto *tp,
>>   			     struct tcf_chain **handle,
>>   			     struct netlink_ext_ack *newchain);
>> diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
>> index 3961461d9c8b..aa28592fccc0 100644
>> --- a/include/net/flow_offload.h
>> +++ b/include/net/flow_offload.h
>> @@ -552,6 +552,23 @@ struct flow_cls_offload {
>>   	u32 classid;
>>   };
>>
>> +enum flow_act_command {
>> +	FLOW_ACT_REPLACE,
>> +	FLOW_ACT_DESTROY,
>> +	FLOW_ACT_STATS,
>> +};
>> +
>> +struct flow_offload_action {
>> +	struct netlink_ext_ack *extack; /* NULL in FLOW_ACT_STATS process*/
>> +	enum flow_act_command command;
>> +	enum flow_action_id id;
>> +	u32 index;
>> +	struct flow_stats stats;
>> +	struct flow_action action;
>> +};
>> +
>> +struct flow_offload_action *flow_action_alloc(unsigned int num_actions);
>> +
>>   static inline struct flow_rule *
>>   flow_cls_offload_flow_rule(struct flow_cls_offload *flow_cmd)
>>   {
>> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
>> index 193f88ebf629..922775407257 100644
>> --- a/include/net/pkt_cls.h
>> +++ b/include/net/pkt_cls.h
>> @@ -258,6 +258,9 @@ static inline void tcf_exts_put_net(struct tcf_exts *exts)
>>   	for (; 0; (void)(i), (void)(a), (void)(exts))
>>   #endif
>>
>> +#define tcf_act_for_each_action(i, a, actions) \
>> +	for (i = 0; i < TCA_ACT_MAX_PRIO && ((a) = actions[i]); i++)
>> +
>>   static inline void
>>   tcf_exts_stats_update(const struct tcf_exts *exts,
>>   		      u64 bytes, u64 packets, u64 drops, u64 lastuse,
>> @@ -532,8 +535,19 @@ tcf_match_indev(struct sk_buff *skb, int ifindex)
>>   	return ifindex == skb->skb_iif;
>>   }
>>
>> +#ifdef CONFIG_NET_CLS_ACT
>>   int tc_setup_flow_action(struct flow_action *flow_action,
>>   			 const struct tcf_exts *exts);
>> +#else
>> +static inline int tc_setup_flow_action(struct flow_action *flow_action,
>> +				       const struct tcf_exts *exts)
>> +{
>> +	return 0;
>> +}
>> +#endif
>> +
>> +int tc_setup_action(struct flow_action *flow_action,
>> +		    struct tc_action *actions[]);
>>   void tc_cleanup_flow_action(struct flow_action *flow_action);
>>
>>   int tc_setup_cb_call(struct tcf_block *block, enum tc_setup_type type,
>> @@ -554,6 +568,7 @@ int tc_setup_cb_reoffload(struct tcf_block *block, struct tcf_proto *tp,
>>   			  enum tc_setup_type type, void *type_data,
>>   			  void *cb_priv, u32 *flags, unsigned int *in_hw_count);
>>   unsigned int tcf_exts_num_actions(struct tcf_exts *exts);
>> +unsigned int tcf_act_num_actions_single(struct tc_action *act);
>>
>>   #ifdef CONFIG_NET_CLS_ACT
>>   int tcf_qevent_init(struct tcf_qevent *qe, struct Qdisc *sch,
>> diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
>> index 6beaea13564a..6676431733ef 100644
>> --- a/net/core/flow_offload.c
>> +++ b/net/core/flow_offload.c
>> @@ -27,6 +27,27 @@ struct flow_rule *flow_rule_alloc(unsigned int
>num_actions)
>>   }
>>   EXPORT_SYMBOL(flow_rule_alloc);
>>
>> +struct flow_offload_action *flow_action_alloc(unsigned int num_actions)
>> +{
>> +	struct flow_offload_action *fl_action;
>> +	int i;
>> +
>> +	fl_action = kzalloc(struct_size(fl_action, action.entries, num_actions),
>> +			    GFP_KERNEL);
>> +	if (!fl_action)
>> +		return NULL;
>> +
>> +	fl_action->action.num_entries = num_actions;
>> +	/* Pre-fill each action hw_stats with DONT_CARE.
>> +	 * Caller can override this if it wants stats for a given action.
>> +	 */
>> +	for (i = 0; i < num_actions; i++)
>> +		fl_action->action.entries[i].hw_stats = FLOW_ACTION_HW_STATS_DONT_CARE;
>> +
>> +	return fl_action;
>> +}
>> +EXPORT_SYMBOL(flow_action_alloc);
>> +
>>   #define FLOW_DISSECTOR_MATCH(__rule, __type, __out)
>		\
>>   	const struct flow_match *__m = &(__rule)->match;
>	\
>>   	struct flow_dissector *__d = (__m)->dissector;
>	\
>> @@ -549,19 +570,25 @@ int flow_indr_dev_setup_offload(struct
>net_device *dev,	struct Qdisc *sch,
>>   				void (*cleanup)(struct flow_block_cb
>*block_cb))
>>   {
>>   	struct flow_indr_dev *this;
>> +	u32 count = 0;
>> +	int err;
>>
>>   	mutex_lock(&flow_indr_block_lock);
>> +	if (bo) {
>> +		if (bo->command == FLOW_BLOCK_BIND)
>> +			indir_dev_add(data, dev, sch, type, cleanup, bo);
>> +		else if (bo->command == FLOW_BLOCK_UNBIND)
>> +			indir_dev_remove(data);
>> +	}
>>
>> -	if (bo->command == FLOW_BLOCK_BIND)
>> -		indir_dev_add(data, dev, sch, type, cleanup, bo);
>> -	else if (bo->command == FLOW_BLOCK_UNBIND)
>> -		indir_dev_remove(data);
>> -
>> -	list_for_each_entry(this, &flow_block_indr_dev_list, list)
>> -		this->cb(dev, sch, this->cb_priv, type, bo, data, cleanup);
>> +	list_for_each_entry(this, &flow_block_indr_dev_list, list) {
>> +		err = this->cb(dev, sch, this->cb_priv, type, bo, data, cleanup);
>> +		if (!err)
>> +			count++;
>> +	}
>>
>>   	mutex_unlock(&flow_indr_block_lock);
>>
>> -	return list_empty(&bo->cb_list) ? -EOPNOTSUPP : 0;
>> +	return (bo && list_empty(&bo->cb_list)) ? -EOPNOTSUPP : count;
>>   }
>>   EXPORT_SYMBOL(flow_indr_dev_setup_offload);
>> diff --git a/net/sched/act_api.c b/net/sched/act_api.c index
>> 3258da3d5bed..33f2ff885b4b 100644
>> --- a/net/sched/act_api.c
>> +++ b/net/sched/act_api.c
>> @@ -21,6 +21,19 @@
>>   #include <net/pkt_cls.h>
>>   #include <net/act_api.h>
>>   #include <net/netlink.h>
>> +#include <net/tc_act/tc_pedit.h>
>> +#include <net/tc_act/tc_mirred.h>
>> +#include <net/tc_act/tc_vlan.h>
>> +#include <net/tc_act/tc_tunnel_key.h>
>> +#include <net/tc_act/tc_csum.h>
>> +#include <net/tc_act/tc_gact.h>
>> +#include <net/tc_act/tc_police.h>
>> +#include <net/tc_act/tc_sample.h>
>> +#include <net/tc_act/tc_skbedit.h>
>> +#include <net/tc_act/tc_ct.h>
>> +#include <net/tc_act/tc_mpls.h>
>> +#include <net/tc_act/tc_gate.h>
>> +#include <net/flow_offload.h>
>>
>>   #ifdef CONFIG_INET
>>   DEFINE_STATIC_KEY_FALSE(tcf_frag_xmit_count);
>> @@ -148,6 +161,7 @@ static int __tcf_action_put(struct tc_action *p, bool
>bind)
>>   		idr_remove(&idrinfo->action_idr, p->tcfa_index);
>>   		mutex_unlock(&idrinfo->lock);
>>
>> +		tcf_action_offload_del(p);
>>   		tcf_action_cleanup(p);
>>   		return 1;
>>   	}
>> @@ -341,6 +355,7 @@ static int tcf_idr_release_unsafe(struct tc_action *p)
>>   		return -EPERM;
>>
>>   	if (refcount_dec_and_test(&p->tcfa_refcnt)) {
>> +		tcf_action_offload_del(p);
>>   		idr_remove(&p->idrinfo->action_idr, p->tcfa_index);
>>   		tcf_action_cleanup(p);
>>   		return ACT_P_DELETED;
>> @@ -452,6 +467,7 @@ static int tcf_idr_delete_index(struct tcf_idrinfo
>*idrinfo, u32 index)
>>   						p->tcfa_index));
>>   			mutex_unlock(&idrinfo->lock);
>>
>> +			tcf_action_offload_del(p);
>>   			tcf_action_cleanup(p);
>>   			module_put(owner);
>>   			return 0;
>> @@ -1061,6 +1077,154 @@ struct tc_action *tcf_action_init_1(struct net
>*net, struct tcf_proto *tp,
>>   	return ERR_PTR(err);
>>   }
>>
>> +static int flow_action_init(struct flow_offload_action *fl_action,
>> +			    struct tc_action *act,
>> +			    enum flow_act_command cmd,
>> +			    struct netlink_ext_ack *extack) {
>> +	if (!fl_action)
>> +		return -EINVAL;
>> +
>> +	fl_action->extack = extack;
>> +	fl_action->command = cmd;
>> +	fl_action->index = act->tcfa_index;
>> +
>> +	if (is_tcf_gact_ok(act)) {
>> +		fl_action->id = FLOW_ACTION_ACCEPT;
>> +	} else if (is_tcf_gact_shot(act)) {
>> +		fl_action->id = FLOW_ACTION_DROP;
>> +	} else if (is_tcf_gact_trap(act)) {
>> +		fl_action->id = FLOW_ACTION_TRAP;
>> +	} else if (is_tcf_gact_goto_chain(act)) {
>> +		fl_action->id = FLOW_ACTION_GOTO;
>> +	} else if (is_tcf_mirred_egress_redirect(act)) {
>> +		fl_action->id = FLOW_ACTION_REDIRECT;
>> +	} else if (is_tcf_mirred_egress_mirror(act)) {
>> +		fl_action->id = FLOW_ACTION_MIRRED;
>> +	} else if (is_tcf_mirred_ingress_redirect(act)) {
>> +		fl_action->id = FLOW_ACTION_REDIRECT_INGRESS;
>> +	} else if (is_tcf_mirred_ingress_mirror(act)) {
>> +		fl_action->id = FLOW_ACTION_MIRRED_INGRESS;
>> +	} else if (is_tcf_vlan(act)) {
>> +		switch (tcf_vlan_action(act)) {
>> +		case TCA_VLAN_ACT_PUSH:
>> +			fl_action->id = FLOW_ACTION_VLAN_PUSH;
>> +			break;
>> +		case TCA_VLAN_ACT_POP:
>> +			fl_action->id = FLOW_ACTION_VLAN_POP;
>> +			break;
>> +		case TCA_VLAN_ACT_MODIFY:
>> +			fl_action->id = FLOW_ACTION_VLAN_MANGLE;
>> +			break;
>> +		default:
>> +			return -EOPNOTSUPP;
>> +		}
>> +	} else if (is_tcf_tunnel_set(act)) {
>> +		fl_action->id = FLOW_ACTION_TUNNEL_ENCAP;
>> +	} else if (is_tcf_tunnel_release(act)) {
>> +		fl_action->id = FLOW_ACTION_TUNNEL_DECAP;
>> +	} else if (is_tcf_csum(act)) {
>> +		fl_action->id = FLOW_ACTION_CSUM;
>> +	} else if (is_tcf_skbedit_mark(act)) {
>> +		fl_action->id = FLOW_ACTION_MARK;
>> +	} else if (is_tcf_sample(act)) {
>> +		fl_action->id = FLOW_ACTION_SAMPLE;
>> +	} else if (is_tcf_police(act)) {
>> +		fl_action->id = FLOW_ACTION_POLICE;
>> +	} else if (is_tcf_ct(act)) {
>> +		fl_action->id = FLOW_ACTION_CT;
>> +	} else if (is_tcf_mpls(act)) {
>> +		switch (tcf_mpls_action(act)) {
>> +		case TCA_MPLS_ACT_PUSH:
>> +			fl_action->id = FLOW_ACTION_MPLS_PUSH;
>> +			break;
>> +		case TCA_MPLS_ACT_POP:
>> +			fl_action->id = FLOW_ACTION_MPLS_POP;
>> +			break;
>> +		case TCA_MPLS_ACT_MODIFY:
>> +			fl_action->id = FLOW_ACTION_MPLS_MANGLE;
>> +			break;
>> +		default:
>> +			return -EOPNOTSUPP;
>> +		}
>> +	} else if (is_tcf_skbedit_ptype(act)) {
>> +		fl_action->id = FLOW_ACTION_PTYPE;
>> +	} else if (is_tcf_skbedit_priority(act)) {
>> +		fl_action->id = FLOW_ACTION_PRIORITY;
>> +	} else if (is_tcf_gate(act)) {
>> +		fl_action->id = FLOW_ACTION_GATE;
>> +	} else {
>> +		return -EOPNOTSUPP;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int tcf_action_offload_cmd(struct flow_offload_action *fl_act,
>> +				  struct netlink_ext_ack *extack) {
>> +	int err;
>> +
>> +	if (IS_ERR(fl_act))
>> +		return PTR_ERR(fl_act);
>> +
>> +	err = flow_indr_dev_setup_offload(NULL, NULL, TC_SETUP_ACT,
>> +					  fl_act, NULL, NULL);
>> +	if (err < 0)
>> +		return err;
>> +
>> +	return 0;
>> +}
>> +
>> +/* offload the tc command after inserted */
>> +static int tcf_action_offload_add(struct tc_action *action,
>> +				  struct netlink_ext_ack *extack)
>> +{
>> +	struct tc_action *actions[TCA_ACT_MAX_PRIO] = {
>> +		[0] = action,
>> +	};
>> +	struct flow_offload_action *fl_action;
>> +	int err = 0;
>> +
>> +	fl_action = flow_action_alloc(tcf_act_num_actions_single(action));
>> +	if (!fl_action)
>> +		return -EINVAL;
>> +
>> +	err = flow_action_init(fl_action, action, FLOW_ACT_REPLACE, extack);
>> +	if (err)
>> +		goto fl_err;
>> +
>> +	err = tc_setup_action(&fl_action->action, actions);
>> +	if (err) {
>> +		NL_SET_ERR_MSG_MOD(extack,
>> +				   "Failed to setup tc actions for offload\n");
>> +		goto fl_err;
>> +	}
>> +
>> +	err = tcf_action_offload_cmd(fl_action, extack);
>> +	tc_cleanup_flow_action(&fl_action->action);
>> +
>> +fl_err:
>> +	kfree(fl_action);
>> +
>> +	return err;
>> +}
>> +
>> +int tcf_action_offload_del(struct tc_action *action)
>> +{
>> +	struct flow_offload_action fl_act;
>> +	int err = 0;
>> +
>> +	if (!action)
>> +		return -EINVAL;
>> +
>> +	err = flow_action_init(&fl_act, action, FLOW_ACT_DESTROY, NULL);
>> +	if (err)
>> +		return err;
>> +
>> +	return tcf_action_offload_cmd(&fl_act, NULL);
>> +}
>> +
>>   /* Returns numbers of initialized actions or negative error. */
>>
>>   int tcf_action_init(struct net *net, struct tcf_proto *tp, struct
>> nlattr *nla, @@ -1103,6 +1267,8 @@ int tcf_action_init(struct net *net,
>struct tcf_proto *tp, struct nlattr *nla,
>>   		sz += tcf_action_fill_size(act);
>>   		/* Start from index 0 */
>>   		actions[i - 1] = act;
>> +		if (!(flags & TCA_ACT_FLAGS_BIND))
>> +			tcf_action_offload_add(act, extack);
>
>Why is this restricted to actions created without the TCA_ACT_FLAGS_BIND
>flag?
>How are actions instantiated by the filters different from those that are
>created by "tc actions"?
>
Our patch series aims to offload tc actions that are created independently of
any flow. Such an action is usually offloaded when it is added or replaced.
This patch implements reoffloading of the actions when a driver is inserted or
removed, so it likewise only offloads the independent actions.
>>   	}
>>
>>   	/* We have to commit them all together, because if any error
>> happened in diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>> index 2ef8f5a6205a..351d93988b8b 100644
>> --- a/net/sched/cls_api.c
>> +++ b/net/sched/cls_api.c
>> @@ -3544,8 +3544,8 @@ static enum flow_action_hw_stats
>tc_act_hw_stats(u8 hw_stats)
>>   	return hw_stats;
>>   }
>>
>> -int tc_setup_flow_action(struct flow_action *flow_action,
>> -			 const struct tcf_exts *exts)
>> +int tc_setup_action(struct flow_action *flow_action,
>> +		    struct tc_action *actions[])
>>   {
>>   	struct tc_action *act;
>>   	int i, j, k, err = 0;
>> @@ -3554,11 +3554,11 @@ int tc_setup_flow_action(struct flow_action
>*flow_action,
>>   	BUILD_BUG_ON(TCA_ACT_HW_STATS_IMMEDIATE !=
>FLOW_ACTION_HW_STATS_IMMEDIATE);
>>   	BUILD_BUG_ON(TCA_ACT_HW_STATS_DELAYED !=
>> FLOW_ACTION_HW_STATS_DELAYED);
>>
>> -	if (!exts)
>> +	if (!actions)
>>   		return 0;
>>
>>   	j = 0;
>> -	tcf_exts_for_each_action(i, act, exts) {
>> +	tcf_act_for_each_action(i, act, actions) {
>>   		struct flow_action_entry *entry;
>>
>>   		entry = &flow_action->entries[j];
>> @@ -3725,7 +3725,19 @@ int tc_setup_flow_action(struct flow_action
>*flow_action,
>>   	spin_unlock_bh(&act->tcfa_lock);
>>   	goto err_out;
>>   }
>> +EXPORT_SYMBOL(tc_setup_action);
>> +
>> +#ifdef CONFIG_NET_CLS_ACT
>> +int tc_setup_flow_action(struct flow_action *flow_action,
>> +			 const struct tcf_exts *exts)
>> +{
>> +	if (!exts)
>> +		return 0;
>> +
>> +	return tc_setup_action(flow_action, exts->actions);
>> +}
>>   EXPORT_SYMBOL(tc_setup_flow_action);
>> +#endif
>>
>>   unsigned int tcf_exts_num_actions(struct tcf_exts *exts)
>>   {
>> @@ -3743,6 +3755,15 @@ unsigned int tcf_exts_num_actions(struct
>tcf_exts *exts)
>>   }
>>   EXPORT_SYMBOL(tcf_exts_num_actions);
>>
>> +unsigned int tcf_act_num_actions_single(struct tc_action *act)
>> +{
>> +	if (is_tcf_pedit(act))
>> +		return tcf_pedit_nkeys(act);
>> +	else
>> +		return 1;
>> +}
>> +EXPORT_SYMBOL(tcf_act_num_actions_single);
>> +
>>   #ifdef CONFIG_NET_CLS_ACT
>>   static int tcf_qevent_parse_block_index(struct nlattr *block_index_attr,
>>   					u32 *p_block_index,
>>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* RE: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-10-31 13:30           ` Jamal Hadi Salim
@ 2021-11-01  3:29             ` Baowen Zheng
  2021-11-01  7:38               ` Vlad Buslov
  0 siblings, 1 reply; 58+ messages in thread
From: Baowen Zheng @ 2021-11-01  3:29 UTC (permalink / raw)
  To: Jamal Hadi Salim, Vlad Buslov
  Cc: Simon Horman, netdev, Roi Dayan, Ido Schimmel, Cong Wang,
	Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers

On 2021-10-31 9:31 PM, Jamal Hadi Salim wrote:
>On 2021-10-30 22:27, Baowen Zheng wrote:
>> Thanks for your review, after some considerarion, I think I understand what
>you are meaning.
>>
>
>[..]
>
>>>>> I know Jamal suggested to have skip_sw for actions, but it
>>>>> complicates the code and I'm still not entirely understand why it is
>necessary.
>>>>
>>>> If the hardware can independently accept an action offload then
>>>> skip_sw per action makes total sense. BTW, my understanding is
>>>
>>> Example configuration that seems bizarre to me is when offloaded
>>> shared action has skip_sw flag set but filter doesn't. Then behavior
>>> of classifier that points to such action diverges between hardware
>>> and software (different lists of actions are applied). We always try
>>> to make offloaded TC data path behave exactly the same as software
>>> and, even though here it would be explicit and deliberate, I don't
>>> see any practical use-case for this.
>> We add the skip_sw to keep compatible with the filter flags and give
>> the user an option to specify if the action should run in software. I
>> understand what you mean, maybe our example is not proper, we need to
>> prevent the filter to run in software if the actions it applies is skip_sw, so we
>need to add more validation to check about this.
>> Also I think your suggestion makes full sense if there is no use case
>> to specify the action should not run in sw and indeed it will make our
>> implement more simple if we omit the skip_sw option.
>> Jamal, WDYT?
>
>
>Let me use an example to illustrate my concern:
>
>#add a policer offload it
>tc actions add action police skip_sw rate ... index 20
>#now add filter1 which is offloaded
>tc filter add dev $DEV1 proto ip parent ffff: flower \
>     skip_sw ip_proto tcp action police index 20
>#add filter2 likewise offloaded
>tc filter add dev $DEV1 proto ip parent ffff: flower \
>     skip_sw ip_proto udp action police index 20
>
>All good so far...
>#Now add a filter3 which is s/w only
>tc filter add dev $DEV1 proto ip parent ffff: flower \
>     skip_hw ip_proto icmp action police index 20
>
>filter3 should not be allowed.
>
>If we had added the policer without skip_sw and without skip_hw then i think
>filter3 should have been legal (we just need to account for stats in_hw vs
>in_sw).
>
>Not sure if that makes sense (and addresses Vlad's earlier comment).
>
I think the cases you mentioned make sense to us. But what concerns Vlad is a
use case like this:
#add a policer and offload it
tc actions add action police skip_sw rate ... index 20
#now add filter4, which can't be offloaded
tc filter add dev $DEV1 proto ip parent ffff: flower \
     ip_proto tcp action police index 20
It is possible that filter4 can't be offloaded, in which case filter4 will run
in software. Should this be legal?
Originally I thought this was legal, but according to Vlad's comments it should
not be, since the action will not be executed in software. I think Vlad's
question is whether we really need a skip_sw flag for an action: if a packet
matches the filter in software, the action should not be skip_sw.
If we choose to omit the skip_sw flag and keep only skip_hw, it will simplify our work.
Of course, we can also keep skip_sw by adding more checks to avoid the above case.

Vlad, I am not sure if I understand your idea correctly. 
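
To make the cases in this thread concrete, here is a minimal standalone sketch
of the stricter check, written as userspace C. It is not the code in this
series: the SKETCH_* flag values and sketch_validate_bind() are hypothetical
names used only for illustration, and the rule encoded is only the one
discussed above (a skip_sw action may be bound only by a skip_sw filter).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag bits for this sketch only; not the kernel's TCA_* values. */
#define SKETCH_SKIP_HW 0x1u
#define SKETCH_SKIP_SW 0x2u

/*
 * Decide whether a filter created with filter_flags may bind a shared
 * action created with act_flags.
 * Returns 0 if the binding is allowed, -EINVAL otherwise.
 */
static int sketch_validate_bind(unsigned int act_flags,
				unsigned int filter_flags)
{
	bool act_skip_sw = act_flags & SKETCH_SKIP_SW;
	bool flt_skip_sw = filter_flags & SKETCH_SKIP_SW;

	/*
	 * A hardware-only action cannot serve a filter that may run in
	 * software: filter3 (skip_hw) and filter4 (no flag) both fall here.
	 */
	if (act_skip_sw && !flt_skip_sw)
		return -EINVAL;

	return 0;
}
```

With this rule, filter1 and filter2 (skip_sw filter, skip_sw action) are
accepted, while filter3 and filter4 are rejected. A policer added with neither
flag stays usable from a skip_hw filter, as in Jamal's example.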


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-11-01  3:29             ` Baowen Zheng
@ 2021-11-01  7:38               ` Vlad Buslov
  2021-11-02 12:39                 ` Simon Horman
  0 siblings, 1 reply; 58+ messages in thread
From: Vlad Buslov @ 2021-11-01  7:38 UTC (permalink / raw)
  To: Baowen Zheng
  Cc: Jamal Hadi Salim, Simon Horman, netdev, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers


On Mon 01 Nov 2021 at 05:29, Baowen Zheng <baowen.zheng@corigine.com> wrote:
> On 2021-10-31 9:31 PM, Jamal Hadi Salim wrote:
>>On 2021-10-30 22:27, Baowen Zheng wrote:
>>> Thanks for your review, after some considerarion, I think I understand what
>>you are meaning.
>>>
>>
>>[..]
>>
>>>>>> I know Jamal suggested to have skip_sw for actions, but it
>>>>>> complicates the code and I'm still not entirely understand why it is
>>necessary.
>>>>>
>>>>> If the hardware can independently accept an action offload then
>>>>> skip_sw per action makes total sense. BTW, my understanding is
>>>>
>>>> Example configuration that seems bizarre to me is when offloaded
>>>> shared action has skip_sw flag set but filter doesn't. Then behavior
>>>> of classifier that points to such action diverges between hardware
>>>> and software (different lists of actions are applied). We always try
>>>> to make offloaded TC data path behave exactly the same as software
>>>> and, even though here it would be explicit and deliberate, I don't
>>>> see any practical use-case for this.
>>> We add the skip_sw to keep compatible with the filter flags and give
>>> the user an option to specify if the action should run in software. I
>>> understand what you mean, maybe our example is not proper, we need to
>>> prevent the filter to run in software if the actions it applies is skip_sw, so we
>>need to add more validation to check about this.
>>> Also I think your suggestion makes full sense if there is no use case
>>> to specify the action should not run in sw and indeed it will make our
>>> implement more simple if we omit the skip_sw option.
>>> Jamal, WDYT?
>>
>>
>>Let me use an example to illustrate my concern:
>>
>>#add a policer offload it
>>tc actions add action police skip_sw rate ... index 20
>>#now add filter1 which is offloaded
>>tc filter add dev $DEV1 proto ip parent ffff: flower \
>>     skip_sw ip_proto tcp action police index 20
>>#add filter2 likewise offloaded
>>tc filter add dev $DEV1 proto ip parent ffff: flower \
>>     skip_sw ip_proto udp action police index 20
>>
>>All good so far...
>>#Now add a filter3 which is s/w only
>>tc filter add dev $DEV1 proto ip parent ffff: flower \
>>     skip_hw ip_proto icmp action police index 20
>>
>>filter3 should not be allowed.
>>
>>If we had added the policer without skip_sw and without skip_hw then i think
>>filter3 should have been legal (we just need to account for stats in_hw vs
>>in_sw).
>>
>>Not sure if that makes sense (and addresses Vlad's earlier comment).
>>
> I think the cases you mentioned make sense to us. But what Vlad concerns is the use
> case as: 
> #add a policer offload it
> tc actions add action police skip_sw rate ... index 20
> #now add filter4 which can't be  offloaded
> tc filter add dev $DEV1 proto ip parent ffff: flower \
> ip_proto tcp action police index 20
> it is possible the filter4 can't be offloaded, then filter4 will run in software,
> should this be legal? 
> Originally I think this is legal, but as comments of Vlad, this should not be legal, since the action
> will not be executed in software. I think what Vlad concerns is do we really need skip_sw flag for
> an action? If a packet matches the filter in software, the action should not be skip_sw. 
> If we choose to omit the skip_sw flag and just keep skip_hw, it will simplify our work. 
> Of course, we can also keep skip_sw by adding more check to avoid the above case. 
>
> Vlad, I am not sure if I understand your idea correctly. 

My suggestion was to forgo the skip_sw flag for shared action offload
and, consequently, remove the validation code, not to add even more
checks. I still don't see a practical case where skip_sw shared action
is useful. But I don't have any strong feelings about this flag, so if
Jamal thinks it is necessary, then fine by me.


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device
  2021-10-31 13:40   ` Jamal Hadi Salim
@ 2021-11-01  8:01     ` Vlad Buslov
  2021-11-02 12:51       ` Simon Horman
  0 siblings, 1 reply; 58+ messages in thread
From: Vlad Buslov @ 2021-11-01  8:01 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Oz Shlomo, Simon Horman, netdev, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers

On Sun 31 Oct 2021 at 15:40, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 2021-10-31 05:50, Oz Shlomo wrote:
>> 
>> On 10/28/2021 2:06 PM, Simon Horman wrote:
>>> Baowen Zheng says:
>>>
>>> Allow use of flow_indr_dev_register/flow_indr_dev_setup_offload to offload
>>> tc actions independent of flows.
>>>
>>> The motivation for this work is to prepare for using TC police action
>>> instances to provide hardware offload of OVS metering feature - which calls
>>> for policers that may be used by multiple flows and whose lifecycle is
>>> independent of any flows that use them.
>>>
>>> This patch includes basic changes to offload drivers to return EOPNOTSUPP
>>> if this feature is used - it is not yet supported by any driver.
>>>
>>> Tc cli command to offload and quote an action:
>>>
>>> tc qdisc del dev $DEV ingress && sleep 1 || true
>>> tc actions delete action police index 99 || true
>>>
>>> tc qdisc add dev $DEV ingress
>>> tc qdisc show dev $DEV ingress
>>>
>>> tc actions add action police index 99 rate 1mbit burst 100k skip_sw
>>> tc actions list action police
>>>
>>> tc filter add dev $DEV protocol ip parent ffff:
>>> flower ip_proto tcp action police index 99
>>> tc -s -d filter show dev $DEV protocol ip parent ffff:
>>> tc filter add dev $DEV protocol ipv6 parent ffff:
>>> flower skip_sw ip_proto tcp action police index 99
>>> tc -s -d filter show dev $DEV protocol ipv6 parent ffff:
>>> tc actions list action police
>>>
>>> tc qdisc del dev $DEV ingress && sleep 1
>>> tc actions delete action police index 99
>>> tc actions list action police
>>>
>> Actions are also (implicitly) instantiated when filters are created.
>> In the following example the mirred action instance (created by the first
>> filter) is shared by the second filter:
>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>      ip_proto tcp action mirred egress redirect dev $DEV3
>> tc filter add dev $DEV2 proto ip parent ffff: flower \
>>      ip_proto tcp action mirred index 1
>> 
>
> I sure hope this is supported. At least the discussions so far
> are a nod in that direction...
> I know there is hardware that is not capable of achieving this
> (little CPE type devices) but lets not make that the common case.

Looks like it isn't supported in this change since
tcf_action_offload_add() is only called by tcf_action_init() when BIND
flag is not set (the flag is always set when called from cls code).
Moreover, I don't think it is a good idea to support such a use case, because
that would require increasing the number of calls to the driver offload
infrastructure from 1 per filter to 1+number_of_actions, which would
significantly impact the insertion rate.
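
As a rough illustration of that estimate (a standalone sketch, not kernel code;
sketch_offload_calls() and its parameters are hypothetical names):

```c
/*
 * Driver offload calls needed to insert one filter, following the
 * estimate above.
 */
static unsigned int sketch_offload_calls(unsigned int num_actions,
					 int per_action_offload)
{
	/* Current behaviour: one call covers the filter and its actions. */
	if (!per_action_offload)
		return 1;

	/* Implicit actions offloaded separately: one call each, plus the filter. */
	return 1 + num_actions;
}
```

For a filter with three actions this is 4 calls instead of 1 for every
insertion.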

^ permalink raw reply	[flat|nested] 58+ messages in thread

* RE: [RFC/PATCH net-next v3 3/8] flow_offload: allow user to offload tc action to net device
  2021-10-29 16:59   ` Vlad Buslov
@ 2021-11-01  9:44     ` Baowen Zheng
  2021-11-01 12:05       ` Vlad Buslov
  0 siblings, 1 reply; 58+ messages in thread
From: Baowen Zheng @ 2021-11-01  9:44 UTC (permalink / raw)
  To: Vlad Buslov, Simon Horman
  Cc: netdev, Jamal Hadi Salim, Roi Dayan, Ido Schimmel, Cong Wang,
	Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers

Thanks for your review and sorry for the delay in responding.

On October 30, 2021 12:59 AM, Vlad Buslov <vladbu@nvidia.com> wrote:
>On Thu 28 Oct 2021 at 14:06, Simon Horman <simon.horman@corigine.com>
>wrote:
>> From: Baowen Zheng <baowen.zheng@corigine.com>
>>
>> Use flow_indr_dev_register/flow_indr_dev_setup_offload to offload tc
>> action.
>>
>> We need to call tc_cleanup_flow_action to clean up tc action entry
>> since in tc_setup_action, some actions may hold dev refcnt, especially
>> the mirror action.
>>
>> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
>> Signed-off-by: Louis Peens <louis.peens@corigine.com>
>> Signed-off-by: Simon Horman <simon.horman@corigine.com>
>> ---
>>  include/linux/netdevice.h  |   1 +
>>  include/net/act_api.h      |   2 +-
>>  include/net/flow_offload.h |  17 ++++
>>  include/net/pkt_cls.h      |  15 ++++
>>  net/core/flow_offload.c    |  43 ++++++++--
>>  net/sched/act_api.c        | 166
>+++++++++++++++++++++++++++++++++++++
>>  net/sched/cls_api.c        |  29 ++++++-
>>  7 files changed, 260 insertions(+), 13 deletions(-)
>>
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index 3ec42495a43a..9815c3a058e9 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -916,6 +916,7 @@ enum tc_setup_type {
>>  	TC_SETUP_QDISC_TBF,
>>  	TC_SETUP_QDISC_FIFO,
>>  	TC_SETUP_QDISC_HTB,
>> +	TC_SETUP_ACT,
>>  };
>>
>>  /* These structures hold the attributes of bpf state that are being
>> passed diff --git a/include/net/act_api.h b/include/net/act_api.h
>> index b5b624c7e488..9eb19188603c 100644
>> --- a/include/net/act_api.h
>> +++ b/include/net/act_api.h
>> @@ -239,7 +239,7 @@ static inline void
>> tcf_action_inc_overlimit_qstats(struct tc_action *a)  void
>tcf_action_update_stats(struct tc_action *a, u64 bytes, u64 packets,
>>  			     u64 drops, bool hw);
>>  int tcf_action_copy_stats(struct sk_buff *, struct tc_action *, int);
>> -
>> +int tcf_action_offload_del(struct tc_action *action);
>
>This doesn't seem to be used anywhere outside of act_api in this series, so
>why is it exported?
Thanks for bringing this to our attention; we will fix this by moving the implementation within act_api.c.
>>  int tcf_action_check_ctrlact(int action, struct tcf_proto *tp,
>>  			     struct tcf_chain **handle,
>>  			     struct netlink_ext_ack *newchain); diff --git
>> a/include/net/flow_offload.h b/include/net/flow_offload.h index
>> 3961461d9c8b..aa28592fccc0 100644
>> --- a/include/net/flow_offload.h
>> +++ b/include/net/flow_offload.h
>> @@ -552,6 +552,23 @@ struct flow_cls_offload {
>>  	u32 classid;
>>  };
>>
>> +enum flow_act_command {
>> +	FLOW_ACT_REPLACE,
>> +	FLOW_ACT_DESTROY,
>> +	FLOW_ACT_STATS,
>> +};
>> +
>> +struct flow_offload_action {
>> +	struct netlink_ext_ack *extack; /* NULL in FLOW_ACT_STATS
>process*/
>> +	enum flow_act_command command;
>> +	enum flow_action_id id;
>> +	u32 index;
>> +	struct flow_stats stats;
>> +	struct flow_action action;
>> +};
>> +
>> +struct flow_offload_action *flow_action_alloc(unsigned int
>> +num_actions);
>> +
>>  static inline struct flow_rule *
>>  flow_cls_offload_flow_rule(struct flow_cls_offload *flow_cmd)  { diff
>> --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h index
>> 193f88ebf629..922775407257 100644
>> --- a/include/net/pkt_cls.h
>> +++ b/include/net/pkt_cls.h
>> @@ -258,6 +258,9 @@ static inline void tcf_exts_put_net(struct tcf_exts
>*exts)
>>  	for (; 0; (void)(i), (void)(a), (void)(exts))  #endif
>>
>> +#define tcf_act_for_each_action(i, a, actions) \
>> +	for (i = 0; i < TCA_ACT_MAX_PRIO && ((a) = actions[i]); i++)
>> +
>>  static inline void
>>  tcf_exts_stats_update(const struct tcf_exts *exts,
>>  		      u64 bytes, u64 packets, u64 drops, u64 lastuse, @@ -532,8
>> +535,19 @@ tcf_match_indev(struct sk_buff *skb, int ifindex)
>>  	return ifindex == skb->skb_iif;
>>  }
>>
>> +#ifdef CONFIG_NET_CLS_ACT
>>  int tc_setup_flow_action(struct flow_action *flow_action,
>>  			 const struct tcf_exts *exts);
>
>Why does existing cls_api function tc_setup_flow_action() now depend on
>CONFIG_NET_CLS_ACT?
Originally the function tc_setup_flow_action dealt with the dependency on
CONFIG_NET_CLS_ACT by calling the macro tcf_exts_for_each_action. Now we change
it to call the function tc_setup_action, so tc_setup_flow_action refers to
exts->actions and therefore depends on CONFIG_NET_CLS_ACT explicitly.
To fix this, we need the ifdef either at the tc_setup_flow_action declaration
or at its implementation in cls_api.c.
Do you think this makes sense?
>> +#else
>> +static inline int tc_setup_flow_action(struct flow_action *flow_action,
>> +				       const struct tcf_exts *exts) {
>> +	return 0;
>> +}
>> +#endif
>> +
>> +int tc_setup_action(struct flow_action *flow_action,
>> +		    struct tc_action *actions[]);
>>  void tc_cleanup_flow_action(struct flow_action *flow_action);
>>
...
>>  #ifdef CONFIG_INET
>>  DEFINE_STATIC_KEY_FALSE(tcf_frag_xmit_count);
>> @@ -148,6 +161,7 @@ static int __tcf_action_put(struct tc_action *p, bool
>bind)
>>  		idr_remove(&idrinfo->action_idr, p->tcfa_index);
>>  		mutex_unlock(&idrinfo->lock);
>>
>> +		tcf_action_offload_del(p);
>>  		tcf_action_cleanup(p);
>>  		return 1;
>>  	}
>> @@ -341,6 +355,7 @@ static int tcf_idr_release_unsafe(struct tc_action *p)
>>  		return -EPERM;
>>
>>  	if (refcount_dec_and_test(&p->tcfa_refcnt)) {
>> +		tcf_action_offload_del(p);
>>  		idr_remove(&p->idrinfo->action_idr, p->tcfa_index);
>>  		tcf_action_cleanup(p);
>>  		return ACT_P_DELETED;
>> @@ -452,6 +467,7 @@ static int tcf_idr_delete_index(struct tcf_idrinfo
>*idrinfo, u32 index)
>>  						p->tcfa_index));
>>  			mutex_unlock(&idrinfo->lock);
>>
>> +			tcf_action_offload_del(p);
>
>tcf_action_offload_del() and tcf_action_cleanup() seem to be always called
>together. Consider moving the call to tcf_action_offload_del() into
>tcf_action_cleanup().
>
Thanks, we will consider moving tcf_action_offload_del() into tcf_action_cleanup().
>>  			tcf_action_cleanup(p);
>>  			module_put(owner);
>>  			return 0;
>> @@ -1061,6 +1077,154 @@ struct tc_action *tcf_action_init_1(struct net
>*net, struct tcf_proto *tp,
>>  	return ERR_PTR(err);
>>  }
>>
...
>> +/* offload the tc command after inserted */
>> +static int tcf_action_offload_add(struct tc_action *action,
>> +				  struct netlink_ext_ack *extack)
>> +{
>> +	struct tc_action *actions[TCA_ACT_MAX_PRIO] = {
>> +		[0] = action,
>> +	};
>> +	struct flow_offload_action *fl_action;
>> +	int err = 0;
>> +
>> +	fl_action = flow_action_alloc(tcf_act_num_actions_single(action));
>> +	if (!fl_action)
>> +		return -EINVAL;
>
>Failed alloc-like functions usually result -ENOMEM.
>
Thanks, we will fix this in the v4 patch.
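The intended error handling can be sketched in standalone userspace C as
follows. This is only an illustration of the convention Vlad points out, not
the v4 code: sketch_offload_add(), sketch_fail_alloc() and the allocator
parameter are hypothetical, and calloc() stands in for kzalloc().

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocator signature matching calloc(), so a failing one can be injected. */
typedef void *(*sketch_alloc_fn)(size_t nmemb, size_t size);

/* Test helper that always fails, simulating an out-of-memory condition. */
static void *sketch_fail_alloc(size_t nmemb, size_t size)
{
	(void)nmemb;
	(void)size;
	return NULL;
}

static int sketch_offload_add(unsigned int num_actions, sketch_alloc_fn alloc)
{
	int *entries;

	/* Stand-in for flow_action_alloc(): one slot per action entry. */
	entries = alloc(num_actions, sizeof(*entries));
	if (!entries)
		return -ENOMEM;	/* allocation failure, not an invalid argument */

	/* ... action setup and the offload command would go here ... */

	free(entries);
	return 0;
}
```

Calling sketch_offload_add(1, calloc) returns 0, while passing
sketch_fail_alloc makes it return -ENOMEM rather than -EINVAL.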
>> +
>> +	err = flow_action_init(fl_action, action, FLOW_ACT_REPLACE, extack);
>> +	if (err)
>> +		goto fl_err;
>> +
>> +	err = tc_setup_action(&fl_action->action, actions);
>> +	if (err) {
>> +		NL_SET_ERR_MSG_MOD(extack,
>> +				   "Failed to setup tc actions for offload\n");
>> +		goto fl_err;
>> +	}
>> +
>> +	err = tcf_action_offload_cmd(fl_action, extack);
>> +	tc_cleanup_flow_action(&fl_action->action);
>> +
>> +fl_err:
>> +	kfree(fl_action);
>> +
>> +	return err;
>> +}
>> +
>> +int tcf_action_offload_del(struct tc_action *action)
>> +{
>> +	struct flow_offload_action fl_act;
>> +	int err = 0;
>> +
>> +	if (!action)
>> +		return -EINVAL;
>> +
>> +	err = flow_action_init(&fl_act, action, FLOW_ACT_DESTROY, NULL);
>> +	if (err)
>> +		return err;
>> +
>> +	return tcf_action_offload_cmd(&fl_act, NULL);
>> +}
>> +
>>  /* Returns numbers of initialized actions or negative error. */
>>
>>  int tcf_action_init(struct net *net, struct tcf_proto *tp, struct
>> nlattr *nla, @@ -1103,6 +1267,8 @@ int tcf_action_init(struct net *net,
>struct tcf_proto *tp, struct nlattr *nla,
>>  		sz += tcf_action_fill_size(act);
>>  		/* Start from index 0 */
>>  		actions[i - 1] = act;
>> +		if (!(flags & TCA_ACT_FLAGS_BIND))
>> +			tcf_action_offload_add(act, extack);
>>  	}
>>
>>  	/* We have to commit them all together, because if any error
>> happened in diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>> index 2ef8f5a6205a..351d93988b8b 100644
>> --- a/net/sched/cls_api.c
>> +++ b/net/sched/cls_api.c
>> @@ -3544,8 +3544,8 @@ static enum flow_action_hw_stats
>tc_act_hw_stats(u8 hw_stats)
>>  	return hw_stats;
>>  }
>>
>> -int tc_setup_flow_action(struct flow_action *flow_action,
>> -			 const struct tcf_exts *exts)
>> +int tc_setup_action(struct flow_action *flow_action,
>> +		    struct tc_action *actions[])
>>  {
>>  	struct tc_action *act;
>>  	int i, j, k, err = 0;
>> @@ -3554,11 +3554,11 @@ int tc_setup_flow_action(struct flow_action
>*flow_action,
>>  	BUILD_BUG_ON(TCA_ACT_HW_STATS_IMMEDIATE !=
>FLOW_ACTION_HW_STATS_IMMEDIATE);
>>  	BUILD_BUG_ON(TCA_ACT_HW_STATS_DELAYED !=
>> FLOW_ACTION_HW_STATS_DELAYED);
>>
>> -	if (!exts)
>> +	if (!actions)
>>  		return 0;
>>
>>  	j = 0;
>> -	tcf_exts_for_each_action(i, act, exts) {
>> +	tcf_act_for_each_action(i, act, actions) {
>>  		struct flow_action_entry *entry;
>>
>>  		entry = &flow_action->entries[j];
>> @@ -3725,7 +3725,19 @@ int tc_setup_flow_action(struct flow_action
>*flow_action,
>>  	spin_unlock_bh(&act->tcfa_lock);
>>  	goto err_out;
>>  }
>> +EXPORT_SYMBOL(tc_setup_action);
>> +
>> +#ifdef CONFIG_NET_CLS_ACT
>
>Maybe just move tc_setup_action() to act_api and ifdef its definition in
>pkt_cls.h instead of existing tc_setup_flow_action()?
As explained above, after this change tc_setup_flow_action will call
tc_setup_action and refer to exts->actions, so just moving tc_setup_action cannot
fix this problem.
>> +int tc_setup_flow_action(struct flow_action *flow_action,
>> +			 const struct tcf_exts *exts)
>> +{
>> +	if (!exts)
>> +		return 0;
>> +
>> +	return tc_setup_action(flow_action, exts->actions);
>> +}
>>  EXPORT_SYMBOL(tc_setup_flow_action);
>> +#endif
>>
>>  unsigned int tcf_exts_num_actions(struct tcf_exts *exts)
>>  {
>> @@ -3743,6 +3755,15 @@ unsigned int tcf_exts_num_actions(struct tcf_exts *exts)
>>  }
>>  EXPORT_SYMBOL(tcf_exts_num_actions);
>>
>> +unsigned int tcf_act_num_actions_single(struct tc_action *act)
>> +{
>> +	if (is_tcf_pedit(act))
>> +		return tcf_pedit_nkeys(act);
>> +	else
>> +		return 1;
>> +}
>> +EXPORT_SYMBOL(tcf_act_num_actions_single);
>> +
>>  #ifdef CONFIG_NET_CLS_ACT
>>  static int tcf_qevent_parse_block_index(struct nlattr *block_index_attr,
>>  					u32 *p_block_index,
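
A minimal userspace sketch of the two review points above, assuming simplified stand-in types. `struct act`, `struct exts`, `flow_action_alloc_sketch` and the other names below are hypothetical and are not the kernel definitions. It shows an allocation failure reported as -ENOMEM, and the exts-based entry point acting as a thin wrapper around the array-based helper, which is why moving only the helper would not remove the reference to exts->actions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_PRIO 32			/* models TCA_ACT_MAX_PRIO */

struct act { int id; };			/* stand-in for struct tc_action */
struct exts { struct act **actions; };	/* stand-in for struct tcf_exts */
struct flow_entry { int id; };

struct flow_action {
	unsigned int num_entries;
	struct flow_entry entries[];	/* flexible array, sized at alloc time */
};

static struct flow_action *flow_action_alloc_sketch(unsigned int n)
{
	struct flow_action *fa;

	fa = calloc(1, sizeof(*fa) + n * sizeof(fa->entries[0]));
	if (!fa)
		return NULL;
	fa->num_entries = n;
	return fa;
}

/* Array-based helper: walks a NULL-terminated array of actions. */
static int setup_action(struct flow_action *fa, struct act *actions[])
{
	unsigned int i;

	assert(fa != NULL);
	if (!actions)
		return 0;
	for (i = 0; i < MAX_PRIO && actions[i]; i++) {
		if (i >= fa->num_entries)
			return -ENOSPC;	/* guard added for this sketch only */
		fa->entries[i].id = actions[i]->id;
	}
	return 0;
}

/* exts-based wrapper: still has to dereference exts->actions. */
static int setup_flow_action(struct flow_action *fa, const struct exts *exts)
{
	if (!exts)
		return 0;
	return setup_action(fa, exts->actions);
}

/* Offload a single action; a failed allocation maps to -ENOMEM. */
static int offload_add(struct act *a)
{
	struct act *actions[MAX_PRIO] = { [0] = a };
	struct flow_action *fa;
	int err;

	fa = flow_action_alloc_sketch(1);
	if (!fa)
		return -ENOMEM;
	err = setup_action(fa, actions);
	free(fa);
	return err;
}
```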



* Re: [RFC/PATCH net-next v3 3/8] flow_offload: allow user to offload tc action to net device
  2021-11-01  2:30     ` Baowen Zheng
@ 2021-11-01 10:07       ` Oz Shlomo
  2021-11-01 10:27         ` Baowen Zheng
  0 siblings, 1 reply; 58+ messages in thread
From: Oz Shlomo @ 2021-11-01 10:07 UTC (permalink / raw)
  To: Baowen Zheng, Simon Horman, netdev
  Cc: Vlad Buslov, Jamal Hadi Salim, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers



On 11/1/2021 4:30 AM, Baowen Zheng wrote:
> On 10/31/2021 5:50 PM, Oz Shlomo wrote:
>> On 10/28/2021 2:06 PM, Simon Horman wrote:
>>> From: Baowen Zheng <baowen.zheng@corigine.com>
>>>
>>> Use flow_indr_dev_register/flow_indr_dev_setup_offload to offload tc
>>> action.
>>
>> How will device drivers reference the offloaded actions when offloading a
>> flow?
>> Perhaps the flow_action_entry structure should also include the action index.
>>
> We have set action index in flow_offload_action to offload the action, also there are
> already some actions in flow_action_entry include index which we want to offload.
> If the driver wants to support action that needs index, I think it can add the index later,
> it may not include in this patch, WDYT?

What do you mean by "action that needs index"?

Currently only the police and gate actions have an action index parameter.
However, with this series the user can create any action using the tc action API and then reference 
it from any filter.
Do you see a reason not to expose the action index as a flow_action_entry attribute?


>>>
>>> We need to call tc_cleanup_flow_action to clean up tc action entry
>>> since in tc_setup_action, some actions may hold dev refcnt, especially
>>> the mirror action.
>>>
>>> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
>>> Signed-off-by: Louis Peens <louis.peens@corigine.com>
>>> Signed-off-by: Simon Horman <simon.horman@corigine.com>
>>> ---
>>>    include/linux/netdevice.h  |   1 +
>>>    include/net/act_api.h      |   2 +-
>>>    include/net/flow_offload.h |  17 ++++
>>>    include/net/pkt_cls.h      |  15 ++++
>>>    net/core/flow_offload.c    |  43 ++++++++--
>>>    net/sched/act_api.c        | 166 +++++++++++++++++++++++++++++++++++++
>>>    net/sched/cls_api.c        |  29 ++++++-
>>>    7 files changed, 260 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>> index 3ec42495a43a..9815c3a058e9 100644
>>> --- a/include/linux/netdevice.h
>>> +++ b/include/linux/netdevice.h
>>> @@ -916,6 +916,7 @@ enum tc_setup_type {
>>>    	TC_SETUP_QDISC_TBF,
>>>    	TC_SETUP_QDISC_FIFO,
>>>    	TC_SETUP_QDISC_HTB,
>>> +	TC_SETUP_ACT,
>>>    };
>>>
>>>    /* These structures hold the attributes of bpf state that are being
>>> passed diff --git a/include/net/act_api.h b/include/net/act_api.h
>>> index b5b624c7e488..9eb19188603c 100644
>>> --- a/include/net/act_api.h
>>> +++ b/include/net/act_api.h
>>> @@ -239,7 +239,7 @@ static inline void
>> tcf_action_inc_overlimit_qstats(struct tc_action *a)
>>>    void tcf_action_update_stats(struct tc_action *a, u64 bytes, u64 packets,
>>>    			     u64 drops, bool hw);
>>>    int tcf_action_copy_stats(struct sk_buff *, struct tc_action *,
>>> int);
>>> -
>>> +int tcf_action_offload_del(struct tc_action *action);
>>>    int tcf_action_check_ctrlact(int action, struct tcf_proto *tp,
>>>    			     struct tcf_chain **handle,
>>>    			     struct netlink_ext_ack *newchain); diff --git
>>> a/include/net/flow_offload.h b/include/net/flow_offload.h index
>>> 3961461d9c8b..aa28592fccc0 100644
>>> --- a/include/net/flow_offload.h
>>> +++ b/include/net/flow_offload.h
>>> @@ -552,6 +552,23 @@ struct flow_cls_offload {
>>>    	u32 classid;
>>>    };
>>>
>>> +enum flow_act_command {
>>> +	FLOW_ACT_REPLACE,
>>> +	FLOW_ACT_DESTROY,
>>> +	FLOW_ACT_STATS,
>>> +};
>>> +
>>> +struct flow_offload_action {
>>> +	struct netlink_ext_ack *extack; /* NULL in FLOW_ACT_STATS
>> process*/
>>> +	enum flow_act_command command;
>>> +	enum flow_action_id id;
>>> +	u32 index;
>>> +	struct flow_stats stats;
>>> +	struct flow_action action;
>>> +};
>>> +
>>> +struct flow_offload_action *flow_action_alloc(unsigned int
>>> +num_actions);
>>> +
>>>    static inline struct flow_rule *
>>>    flow_cls_offload_flow_rule(struct flow_cls_offload *flow_cmd)
>>>    {
>>> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h index
>>> 193f88ebf629..922775407257 100644
>>> --- a/include/net/pkt_cls.h
>>> +++ b/include/net/pkt_cls.h
>>> @@ -258,6 +258,9 @@ static inline void tcf_exts_put_net(struct tcf_exts
>> *exts)
>>>    	for (; 0; (void)(i), (void)(a), (void)(exts))
>>>    #endif
>>>
>>> +#define tcf_act_for_each_action(i, a, actions) \
>>> +	for (i = 0; i < TCA_ACT_MAX_PRIO && ((a) = actions[i]); i++)
>>> +
>>>    static inline void
>>>    tcf_exts_stats_update(const struct tcf_exts *exts,
>>>    		      u64 bytes, u64 packets, u64 drops, u64 lastuse, @@ -532,8
>>> +535,19 @@ tcf_match_indev(struct sk_buff *skb, int ifindex)
>>>    	return ifindex == skb->skb_iif;
>>>    }
>>>
>>> +#ifdef CONFIG_NET_CLS_ACT
>>>    int tc_setup_flow_action(struct flow_action *flow_action,
>>>    			 const struct tcf_exts *exts);
>>> +#else
>>> +static inline int tc_setup_flow_action(struct flow_action *flow_action,
>>> +				       const struct tcf_exts *exts) {
>>> +	return 0;
>>> +}
>>> +#endif
>>> +
>>> +int tc_setup_action(struct flow_action *flow_action,
>>> +		    struct tc_action *actions[]);
>>>    void tc_cleanup_flow_action(struct flow_action *flow_action);
>>>
>>>    int tc_setup_cb_call(struct tcf_block *block, enum tc_setup_type
>>> type, @@ -554,6 +568,7 @@ int tc_setup_cb_reoffload(struct tcf_block
>> *block, struct tcf_proto *tp,
>>>    			  enum tc_setup_type type, void *type_data,
>>>    			  void *cb_priv, u32 *flags, unsigned int
>> *in_hw_count);
>>>    unsigned int tcf_exts_num_actions(struct tcf_exts *exts);
>>> +unsigned int tcf_act_num_actions_single(struct tc_action *act);
>>>
>>>    #ifdef CONFIG_NET_CLS_ACT
>>>    int tcf_qevent_init(struct tcf_qevent *qe, struct Qdisc *sch, diff
>>> --git a/net/core/flow_offload.c b/net/core/flow_offload.c index
>>> 6beaea13564a..6676431733ef 100644
>>> --- a/net/core/flow_offload.c
>>> +++ b/net/core/flow_offload.c
>>> @@ -27,6 +27,27 @@ struct flow_rule *flow_rule_alloc(unsigned int
>> num_actions)
>>>    }
>>>    EXPORT_SYMBOL(flow_rule_alloc);
>>>
>>> +struct flow_offload_action *flow_action_alloc(unsigned int
>>> +num_actions) {
>>> +	struct flow_offload_action *fl_action;
>>> +	int i;
>>> +
>>> +	fl_action = kzalloc(struct_size(fl_action, action.entries, num_actions),
>>> +			    GFP_KERNEL);
>>> +	if (!fl_action)
>>> +		return NULL;
>>> +
>>> +	fl_action->action.num_entries = num_actions;
>>> +	/* Pre-fill each action hw_stats with DONT_CARE.
>>> +	 * Caller can override this if it wants stats for a given action.
>>> +	 */
>>> +	for (i = 0; i < num_actions; i++)
>>> +		fl_action->action.entries[i].hw_stats =
>>> +FLOW_ACTION_HW_STATS_DONT_CARE;
>>> +
>>> +	return fl_action;
>>> +}
>>> +EXPORT_SYMBOL(flow_action_alloc);
>>> +
>>>    #define FLOW_DISSECTOR_MATCH(__rule, __type, __out)
>> 		\
>>>    	const struct flow_match *__m = &(__rule)->match;
>> 	\
>>>    	struct flow_dissector *__d = (__m)->dissector;
>> 	\
>>> @@ -549,19 +570,25 @@ int flow_indr_dev_setup_offload(struct
>> net_device *dev,	struct Qdisc *sch,
>>>    				void (*cleanup)(struct flow_block_cb
>> *block_cb))
>>>    {
>>>    	struct flow_indr_dev *this;
>>> +	u32 count = 0;
>>> +	int err;
>>>
>>>    	mutex_lock(&flow_indr_block_lock);
>>> +	if (bo) {
>>> +		if (bo->command == FLOW_BLOCK_BIND)
>>> +			indir_dev_add(data, dev, sch, type, cleanup, bo);
>>> +		else if (bo->command == FLOW_BLOCK_UNBIND)
>>> +			indir_dev_remove(data);
>>> +	}
>>>
>>> -	if (bo->command == FLOW_BLOCK_BIND)
>>> -		indir_dev_add(data, dev, sch, type, cleanup, bo);
>>> -	else if (bo->command == FLOW_BLOCK_UNBIND)
>>> -		indir_dev_remove(data);
>>> -
>>> -	list_for_each_entry(this, &flow_block_indr_dev_list, list)
>>> -		this->cb(dev, sch, this->cb_priv, type, bo, data, cleanup);
>>> +	list_for_each_entry(this, &flow_block_indr_dev_list, list) {
>>> +		err = this->cb(dev, sch, this->cb_priv, type, bo, data, cleanup);
>>> +		if (!err)
>>> +			count++;
>>> +	}
>>>
>>>    	mutex_unlock(&flow_indr_block_lock);
>>>
>>> -	return list_empty(&bo->cb_list) ? -EOPNOTSUPP : 0;
>>> +	return (bo && list_empty(&bo->cb_list)) ? -EOPNOTSUPP : count;
>>>    }
>>>    EXPORT_SYMBOL(flow_indr_dev_setup_offload);
>>> diff --git a/net/sched/act_api.c b/net/sched/act_api.c index
>>> 3258da3d5bed..33f2ff885b4b 100644
>>> --- a/net/sched/act_api.c
>>> +++ b/net/sched/act_api.c
>>> @@ -21,6 +21,19 @@
>>>    #include <net/pkt_cls.h>
>>>    #include <net/act_api.h>
>>>    #include <net/netlink.h>
>>> +#include <net/tc_act/tc_pedit.h>
>>> +#include <net/tc_act/tc_mirred.h>
>>> +#include <net/tc_act/tc_vlan.h>
>>> +#include <net/tc_act/tc_tunnel_key.h>
>>> +#include <net/tc_act/tc_csum.h>
>>> +#include <net/tc_act/tc_gact.h>
>>> +#include <net/tc_act/tc_police.h>
>>> +#include <net/tc_act/tc_sample.h>
>>> +#include <net/tc_act/tc_skbedit.h>
>>> +#include <net/tc_act/tc_ct.h>
>>> +#include <net/tc_act/tc_mpls.h>
>>> +#include <net/tc_act/tc_gate.h>
>>> +#include <net/flow_offload.h>
>>>
>>>    #ifdef CONFIG_INET
>>>    DEFINE_STATIC_KEY_FALSE(tcf_frag_xmit_count);
>>> @@ -148,6 +161,7 @@ static int __tcf_action_put(struct tc_action *p, bool
>> bind)
>>>    		idr_remove(&idrinfo->action_idr, p->tcfa_index);
>>>    		mutex_unlock(&idrinfo->lock);
>>>
>>> +		tcf_action_offload_del(p);
>>>    		tcf_action_cleanup(p);
>>>    		return 1;
>>>    	}
>>> @@ -341,6 +355,7 @@ static int tcf_idr_release_unsafe(struct tc_action *p)
>>>    		return -EPERM;
>>>
>>>    	if (refcount_dec_and_test(&p->tcfa_refcnt)) {
>>> +		tcf_action_offload_del(p);
>>>    		idr_remove(&p->idrinfo->action_idr, p->tcfa_index);
>>>    		tcf_action_cleanup(p);
>>>    		return ACT_P_DELETED;
>>> @@ -452,6 +467,7 @@ static int tcf_idr_delete_index(struct tcf_idrinfo
>> *idrinfo, u32 index)
>>>    						p->tcfa_index));
>>>    			mutex_unlock(&idrinfo->lock);
>>>
>>> +			tcf_action_offload_del(p);
>>>    			tcf_action_cleanup(p);
>>>    			module_put(owner);
>>>    			return 0;
>>> @@ -1061,6 +1077,154 @@ struct tc_action *tcf_action_init_1(struct net
>> *net, struct tcf_proto *tp,
>>>    	return ERR_PTR(err);
>>>    }
>>>
>>> +static int flow_action_init(struct flow_offload_action *fl_action,
>>> +			    struct tc_action *act,
>>> +			    enum flow_act_command cmd,
>>> +			    struct netlink_ext_ack *extack)
>>> +{
>>> +	if (!fl_action)
>>> +		return -EINVAL;
>>> +
>>> +	fl_action->extack = extack;
>>> +	fl_action->command = cmd;
>>> +	fl_action->index = act->tcfa_index;
>>> +
>>> +	if (is_tcf_gact_ok(act)) {
>>> +		fl_action->id = FLOW_ACTION_ACCEPT;
>>> +	} else if (is_tcf_gact_shot(act)) {
>>> +		fl_action->id = FLOW_ACTION_DROP;
>>> +	} else if (is_tcf_gact_trap(act)) {
>>> +		fl_action->id = FLOW_ACTION_TRAP;
>>> +	} else if (is_tcf_gact_goto_chain(act)) {
>>> +		fl_action->id = FLOW_ACTION_GOTO;
>>> +	} else if (is_tcf_mirred_egress_redirect(act)) {
>>> +		fl_action->id = FLOW_ACTION_REDIRECT;
>>> +	} else if (is_tcf_mirred_egress_mirror(act)) {
>>> +		fl_action->id = FLOW_ACTION_MIRRED;
>>> +	} else if (is_tcf_mirred_ingress_redirect(act)) {
>>> +		fl_action->id = FLOW_ACTION_REDIRECT_INGRESS;
>>> +	} else if (is_tcf_mirred_ingress_mirror(act)) {
>>> +		fl_action->id = FLOW_ACTION_MIRRED_INGRESS;
>>> +	} else if (is_tcf_vlan(act)) {
>>> +		switch (tcf_vlan_action(act)) {
>>> +		case TCA_VLAN_ACT_PUSH:
>>> +			fl_action->id = FLOW_ACTION_VLAN_PUSH;
>>> +			break;
>>> +		case TCA_VLAN_ACT_POP:
>>> +			fl_action->id = FLOW_ACTION_VLAN_POP;
>>> +			break;
>>> +		case TCA_VLAN_ACT_MODIFY:
>>> +			fl_action->id = FLOW_ACTION_VLAN_MANGLE;
>>> +			break;
>>> +		default:
>>> +			return -EOPNOTSUPP;
>>> +		}
>>> +	} else if (is_tcf_tunnel_set(act)) {
>>> +		fl_action->id = FLOW_ACTION_TUNNEL_ENCAP;
>>> +	} else if (is_tcf_tunnel_release(act)) {
>>> +		fl_action->id = FLOW_ACTION_TUNNEL_DECAP;
>>> +	} else if (is_tcf_csum(act)) {
>>> +		fl_action->id = FLOW_ACTION_CSUM;
>>> +	} else if (is_tcf_skbedit_mark(act)) {
>>> +		fl_action->id = FLOW_ACTION_MARK;
>>> +	} else if (is_tcf_sample(act)) {
>>> +		fl_action->id = FLOW_ACTION_SAMPLE;
>>> +	} else if (is_tcf_police(act)) {
>>> +		fl_action->id = FLOW_ACTION_POLICE;
>>> +	} else if (is_tcf_ct(act)) {
>>> +		fl_action->id = FLOW_ACTION_CT;
>>> +	} else if (is_tcf_mpls(act)) {
>>> +		switch (tcf_mpls_action(act)) {
>>> +		case TCA_MPLS_ACT_PUSH:
>>> +			fl_action->id = FLOW_ACTION_MPLS_PUSH;
>>> +			break;
>>> +		case TCA_MPLS_ACT_POP:
>>> +			fl_action->id = FLOW_ACTION_MPLS_POP;
>>> +			break;
>>> +		case TCA_MPLS_ACT_MODIFY:
>>> +			fl_action->id = FLOW_ACTION_MPLS_MANGLE;
>>> +			break;
>>> +		default:
>>> +			return -EOPNOTSUPP;
>>> +		}
>>> +	} else if (is_tcf_skbedit_ptype(act)) {
>>> +		fl_action->id = FLOW_ACTION_PTYPE;
>>> +	} else if (is_tcf_skbedit_priority(act)) {
>>> +		fl_action->id = FLOW_ACTION_PRIORITY;
>>> +	} else if (is_tcf_gate(act)) {
>>> +		fl_action->id = FLOW_ACTION_GATE;
>>> +	} else {
>>> +		return -EOPNOTSUPP;
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int tcf_action_offload_cmd(struct flow_offload_action *fl_act,
>>> +				  struct netlink_ext_ack *extack)
>>> +{
>>> +	int err;
>>> +
>>> +	if (IS_ERR(fl_act))
>>> +		return PTR_ERR(fl_act);
>>> +
>>> +	err = flow_indr_dev_setup_offload(NULL, NULL, TC_SETUP_ACT,
>>> +					  fl_act, NULL, NULL);
>>> +	if (err < 0)
>>> +		return err;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +/* offload the tc command after inserted */
>>> +static int tcf_action_offload_add(struct tc_action *action,
>>> +				  struct netlink_ext_ack *extack)
>>> +{
>>> +	struct tc_action *actions[TCA_ACT_MAX_PRIO] = {
>>> +		[0] = action,
>>> +	};
>>> +	struct flow_offload_action *fl_action;
>>> +	int err = 0;
>>> +
>>> +	fl_action = flow_action_alloc(tcf_act_num_actions_single(action));
>>> +	if (!fl_action)
>>> +		return -EINVAL;
>>> +
>>> +	err = flow_action_init(fl_action, action, FLOW_ACT_REPLACE, extack);
>>> +	if (err)
>>> +		goto fl_err;
>>> +
>>> +	err = tc_setup_action(&fl_action->action, actions);
>>> +	if (err) {
>>> +		NL_SET_ERR_MSG_MOD(extack,
>>> +				   "Failed to setup tc actions for offload\n");
>>> +		goto fl_err;
>>> +	}
>>> +
>>> +	err = tcf_action_offload_cmd(fl_action, extack);
>>> +	tc_cleanup_flow_action(&fl_action->action);
>>> +
>>> +fl_err:
>>> +	kfree(fl_action);
>>> +
>>> +	return err;
>>> +}
>>> +
>>> +int tcf_action_offload_del(struct tc_action *action)
>>> +{
>>> +	struct flow_offload_action fl_act;
>>> +	int err = 0;
>>> +
>>> +	if (!action)
>>> +		return -EINVAL;
>>> +
>>> +	err = flow_action_init(&fl_act, action, FLOW_ACT_DESTROY, NULL);
>>> +	if (err)
>>> +		return err;
>>> +
>>> +	return tcf_action_offload_cmd(&fl_act, NULL);
>>> +}
>>> +
>>>    /* Returns numbers of initialized actions or negative error. */
>>>
>>>    int tcf_action_init(struct net *net, struct tcf_proto *tp, struct
>>> nlattr *nla, @@ -1103,6 +1267,8 @@ int tcf_action_init(struct net *net,
>> struct tcf_proto *tp, struct nlattr *nla,
>>>    		sz += tcf_action_fill_size(act);
>>>    		/* Start from index 0 */
>>>    		actions[i - 1] = act;
>>> +		if (!(flags & TCA_ACT_FLAGS_BIND))
>>> +			tcf_action_offload_add(act, extack);
>>
>> Why is this restricted to actions created without the TCA_ACT_FLAGS_BIND
>> flag?
>> How are actions instantiated by the filters different from those that are
>> created by "tc actions"?
>>
> Our patch aims to offload tc action that is created independent of any flow. It is usually
> offloaded when it is added or replaced.
> This patch is to implement a process of reoffloading the actions when driver is
> inserted or removed, so it will still offload the independent actions.

I see.

>>>    	}
>>>
>>>    	/* We have to commit them all together, because if any error
>>> happened in diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>>> index 2ef8f5a6205a..351d93988b8b 100644
>>> --- a/net/sched/cls_api.c
>>> +++ b/net/sched/cls_api.c
>>> @@ -3544,8 +3544,8 @@ static enum flow_action_hw_stats
>> tc_act_hw_stats(u8 hw_stats)
>>>    	return hw_stats;
>>>    }
>>>
>>> -int tc_setup_flow_action(struct flow_action *flow_action,
>>> -			 const struct tcf_exts *exts)
>>> +int tc_setup_action(struct flow_action *flow_action,
>>> +		    struct tc_action *actions[])
>>>    {
>>>    	struct tc_action *act;
>>>    	int i, j, k, err = 0;
>>> @@ -3554,11 +3554,11 @@ int tc_setup_flow_action(struct flow_action
>> *flow_action,
>>>    	BUILD_BUG_ON(TCA_ACT_HW_STATS_IMMEDIATE !=
>> FLOW_ACTION_HW_STATS_IMMEDIATE);
>>>    	BUILD_BUG_ON(TCA_ACT_HW_STATS_DELAYED !=
>>> FLOW_ACTION_HW_STATS_DELAYED);
>>>
>>> -	if (!exts)
>>> +	if (!actions)
>>>    		return 0;
>>>
>>>    	j = 0;
>>> -	tcf_exts_for_each_action(i, act, exts) {
>>> +	tcf_act_for_each_action(i, act, actions) {
>>>    		struct flow_action_entry *entry;
>>>
>>>    		entry = &flow_action->entries[j];
>>> @@ -3725,7 +3725,19 @@ int tc_setup_flow_action(struct flow_action
>> *flow_action,
>>>    	spin_unlock_bh(&act->tcfa_lock);
>>>    	goto err_out;
>>>    }
>>> +EXPORT_SYMBOL(tc_setup_action);
>>> +
>>> +#ifdef CONFIG_NET_CLS_ACT
>>> +int tc_setup_flow_action(struct flow_action *flow_action,
>>> +			 const struct tcf_exts *exts)
>>> +{
>>> +	if (!exts)
>>> +		return 0;
>>> +
>>> +	return tc_setup_action(flow_action, exts->actions);
>>> +}
>>>    EXPORT_SYMBOL(tc_setup_flow_action);
>>> +#endif
>>>
>>>    unsigned int tcf_exts_num_actions(struct tcf_exts *exts)
>>>    {
>>> @@ -3743,6 +3755,15 @@ unsigned int tcf_exts_num_actions(struct
>> tcf_exts *exts)
>>>    }
>>>    EXPORT_SYMBOL(tcf_exts_num_actions);
>>>
>>> +unsigned int tcf_act_num_actions_single(struct tc_action *act)
>>> +{
>>> +	if (is_tcf_pedit(act))
>>> +		return tcf_pedit_nkeys(act);
>>> +	else
>>> +		return 1;
>>> +}
>>> +EXPORT_SYMBOL(tcf_act_num_actions_single);
>>> +
>>>    #ifdef CONFIG_NET_CLS_ACT
>>>    static int tcf_qevent_parse_block_index(struct nlattr *block_index_attr,
>>>    					u32 *p_block_index,
>>>
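
To make the question about the action index concrete, here is a small userspace sketch of how a driver might resolve an action referenced by a filter against actions it has already offloaded, assuming each flow action entry carried the index. All names here (`struct offloaded_action`, `struct entry`, `lookup_offloaded`, `hw_handle`) are hypothetical and only model the idea; they are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum act_id { ACT_POLICE, ACT_GACT_DROP };

/* What a driver could remember for each independently offloaded action. */
struct offloaded_action {
	enum act_id id;
	uint32_t index;		/* tc action index, e.g. "police index 99" */
	int hw_handle;		/* hypothetical hardware resource handle */
};

/* A flow action entry that also exposes the tc action index. */
struct entry {
	enum act_id id;
	uint32_t index;
};

/* Find the offloaded action that a flow entry refers to, or NULL. */
static const struct offloaded_action *
lookup_offloaded(const struct offloaded_action *tbl, size_t n,
		 const struct entry *e)
{
	size_t i;

	assert(tbl != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (tbl[i].id == e->id && tbl[i].index == e->index)
			return &tbl[i];
	return NULL;
}
```

Without an index on the entry, the lookup key would have to come from action-specific data, which only some actions provide.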


* RE: [RFC/PATCH net-next v3 5/8] flow_offload: add process to update action stats from hardware
  2021-10-29 17:11   ` Vlad Buslov
@ 2021-11-01 10:07     ` Baowen Zheng
  0 siblings, 0 replies; 58+ messages in thread
From: Baowen Zheng @ 2021-11-01 10:07 UTC (permalink / raw)
  To: Vlad Buslov, Simon Horman
  Cc: netdev, Jamal Hadi Salim, Roi Dayan, Ido Schimmel, Cong Wang,
	Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers

On October 30, 2021 1:11 AM, Vlad Buslov <vladbu@nvidia.com> wrote:
>On Thu 28 Oct 2021 at 14:06, Simon Horman <simon.horman@corigine.com>
>wrote:
>> From: Baowen Zheng <baowen.zheng@corigine.com>
>>
>> When collecting stats for actions update them using both hardware
>> and software counters.
>>
>> Stats update process should not in context of preempt_disable.
>
>I think you are missing a word here.
Thanks, we will fix it in the next patch.
>>
>> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
>> Signed-off-by: Louis Peens <louis.peens@corigine.com>
>> Signed-off-by: Simon Horman <simon.horman@corigine.com>
>> ---
>>  include/net/act_api.h |  1 +
>>  include/net/pkt_cls.h | 18 ++++++++++--------
>>  net/sched/act_api.c   | 37 +++++++++++++++++++++++++++++++++++++
>>  3 files changed, 48 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/net/act_api.h b/include/net/act_api.h index
>> 671208bd27ef..80a9d1e7d805 100644
>> --- a/include/net/act_api.h
>> +++ b/include/net/act_api.h
>> @@ -247,6 +247,7 @@ void tcf_action_update_stats(struct tc_action *a,
>u64 bytes, u64 packets,
>>  			     u64 drops, bool hw);
>>  int tcf_action_copy_stats(struct sk_buff *, struct tc_action *, int);
>> int tcf_action_offload_del(struct tc_action *action);
>> +int tcf_action_update_hw_stats(struct tc_action *action);
>>  int tcf_action_check_ctrlact(int action, struct tcf_proto *tp,
>>  			     struct tcf_chain **handle,
>>  			     struct netlink_ext_ack *newchain); diff --git
>> a/include/net/pkt_cls.h b/include/net/pkt_cls.h index
>> 44ae5182a965..88788b821f76 100644
>> --- a/include/net/pkt_cls.h
>> +++ b/include/net/pkt_cls.h
>> @@ -292,18 +292,20 @@ tcf_exts_stats_update(const struct tcf_exts
>> *exts,  #ifdef CONFIG_NET_CLS_ACT
>>  	int i;
>>
>> -	preempt_disable();
>> -
>>  	for (i = 0; i < exts->nr_actions; i++) {
>>  		struct tc_action *a = exts->actions[i];
>>
>> -		tcf_action_stats_update(a, bytes, packets, drops,
>> -					lastuse, true);
>> -		a->used_hw_stats = used_hw_stats;
>> -		a->used_hw_stats_valid = used_hw_stats_valid;
>> -	}
>> +		/* if stats from hw, just skip */
>> +		if (tcf_action_update_hw_stats(a)) {
>> +			preempt_disable();
>> +			tcf_action_stats_update(a, bytes, packets, drops,
>> +						lastuse, true);
>> +			preempt_enable();
>>
>> -	preempt_enable();
>> +			a->used_hw_stats = used_hw_stats;
>> +			a->used_hw_stats_valid = used_hw_stats_valid;
>> +		}
>> +	}
>>  #endif
>>  }
>>
>> diff --git a/net/sched/act_api.c b/net/sched/act_api.c index
>> 604bf1923bcc..881c7ba4d180 100644
>> --- a/net/sched/act_api.c
>> +++ b/net/sched/act_api.c
>> @@ -1238,6 +1238,40 @@ static int tcf_action_offload_add(struct tc_action
>*action,
>>  	return err;
>>  }
>>
>> +int tcf_action_update_hw_stats(struct tc_action *action)
>> +{
>> +	struct flow_offload_action fl_act = {};
>> +	int err = 0;
>> +
>> +	if (!tc_act_in_hw(action))
>> +		return -EOPNOTSUPP;
>> +
>> +	err = flow_action_init(&fl_act, action, FLOW_ACT_STATS, NULL);
>> +	if (err)
>> +		goto err_out;
>> +
>> +	err = tcf_action_offload_cmd(&fl_act, NULL, NULL);
>> +
>> +	if (!err && fl_act.stats.lastused) {
>> +		preempt_disable();
>> +		tcf_action_stats_update(action, fl_act.stats.bytes,
>> +					fl_act.stats.pkts,
>> +					fl_act.stats.drops,
>> +					fl_act.stats.lastused,
>> +					true);
>> +		preempt_enable();
>> +		action->used_hw_stats = fl_act.stats.used_hw_stats;
>> +		action->used_hw_stats_valid = true;
>> +		err = 0;
>
>Error handling here is slightly convoluted. This line assigns err=0 third time (it is
>initialized with zero and then we can only get here if result of
>tcf_action_offload_cmd() assigned 'err' to zero again).
>Considering that error handler in this function is empty we can just return
>errors directly as soon as they happen and return zero at the end of the
>function.
>
Thanks, we will change it as you suggest.
>> +	} else {
>> +		err = -EOPNOTSUPP;
>
>Hmm the code can return error here when tcf_action_offload_cmd()
>succeeded but 'lastused' is zero. Such behavior will cause
>tcf_exts_stats_update() to update action with filter counter values. Is this the
>desired behavior when, for example, in filter action list there is and action that
>can drop packets followed by some shared action? In such case 'lastused' can
>be zero if all packets that filter matched were dropped by previous action and
>shared action will be assigned with filter counter value that includes dropped
>packets/bytes.
>
Thanks, we will consider whether it makes sense to judge only the return value err from tcf_action_offload_cmd.
>> +	}
>> +
>> +err_out:
>> +	return err;
>> +}
>> +EXPORT_SYMBOL(tcf_action_update_hw_stats);
>> +
>>  int tcf_action_offload_del(struct tc_action *action)
>>  {
>>  	struct flow_offload_action fl_act;
>> @@ -1362,6 +1396,9 @@ int tcf_action_copy_stats(struct sk_buff *skb,
>struct tc_action *p,
>>  	if (p == NULL)
>>  		goto errout;
>>
>> +	/* update hw stats for this action */
>> +	tcf_action_update_hw_stats(p);
>> +
>>  	/* compat_mode being true specifies a call that is supposed
>>  	 * to add additional backward compatibility statistic TLVs.
>>  	 */



* RE: [RFC/PATCH net-next v3 3/8] flow_offload: allow user to offload tc action to net device
  2021-11-01 10:07       ` Oz Shlomo
@ 2021-11-01 10:27         ` Baowen Zheng
  0 siblings, 0 replies; 58+ messages in thread
From: Baowen Zheng @ 2021-11-01 10:27 UTC (permalink / raw)
  To: Oz Shlomo, Simon Horman, netdev
  Cc: Vlad Buslov, Jamal Hadi Salim, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers

On November 1, 2021 6:07 PM, Oz Shlomo wrote:
>On 11/1/2021 4:30 AM, Baowen Zheng wrote:
>> On 10/31/2021 5:50 PM, Oz Shlomo wrote:
>>> On 10/28/2021 2:06 PM, Simon Horman wrote:
>>>> From: Baowen Zheng <baowen.zheng@corigine.com>
>>>>
>>>> Use flow_indr_dev_register/flow_indr_dev_setup_offload to offload tc
>>>> action.
>>>
>>> How will device drivers reference the offloaded actions when
>>> offloading a flow?
>>> Perhaps the flow_action_entry structure should also include the action
>index.
>>>
>> We have set action index in flow_offload_action to offload the action, also
>> there are already some actions in flow_action_entry include index which we
>> want to offload.
>> If the driver wants to support action that needs index, I think it can
>> add the index later, it may not include in this patch, WDYT?
>
>What do you mean by "action that needs index"?
>
>Currently only the police and gate actions have an action index parameter.
>However, with this series the user can create any action using the tc action API
>and then reference it from any filter.
>Do you see a reason not to expose the action index as a flow_action_entry
>attribute?
What I mean is that currently the action is created along with the filter, so the index is not needed.
With this patch, we intend to offload the police action, which already includes an action index.
I think your suggestion makes sense; we will consider moving the index into the
flow_action_entry structure instead of keeping it only in the single-action structure. Thanks.
>>>>
>>>> We need to call tc_cleanup_flow_action to clean up tc action entry
>>>> since in tc_setup_action, some actions may hold dev refcnt, especially
>>>> the mirror action.
>>>>
>>>> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
>>>> Signed-off-by: Louis Peens <louis.peens@corigine.com>
>>>> Signed-off-by: Simon Horman <simon.horman@corigine.com>
>>>> ---
>>>>    include/linux/netdevice.h  |   1 +
>>>>    include/net/act_api.h      |   2 +-
>>>>    include/net/flow_offload.h |  17 ++++
>>>>    include/net/pkt_cls.h      |  15 ++++
>>>>    net/core/flow_offload.c    |  43 ++++++++--
>>>>    net/sched/act_api.c        | 166
>+++++++++++++++++++++++++++++++++++++
>>>>    net/sched/cls_api.c        |  29 ++++++-
>>>>    7 files changed, 260 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>>> index 3ec42495a43a..9815c3a058e9 100644
>>>> --- a/include/linux/netdevice.h
>>>> +++ b/include/linux/netdevice.h
>>>> @@ -916,6 +916,7 @@ enum tc_setup_type {
>>>>    	TC_SETUP_QDISC_TBF,
>>>>    	TC_SETUP_QDISC_FIFO,
>>>>    	TC_SETUP_QDISC_HTB,
>>>> +	TC_SETUP_ACT,
>>>>    };
>>>>
...
>>>>    int tcf_action_init(struct net *net, struct tcf_proto *tp, struct
>>>> nlattr *nla, @@ -1103,6 +1267,8 @@ int tcf_action_init(struct net *net,
>>> struct tcf_proto *tp, struct nlattr *nla,
>>>>    		sz += tcf_action_fill_size(act);
>>>>    		/* Start from index 0 */
>>>>    		actions[i - 1] = act;
>>>> +		if (!(flags & TCA_ACT_FLAGS_BIND))
>>>> +			tcf_action_offload_add(act, extack);
>>>
>>> Why is this restricted to actions created without the
>TCA_ACT_FLAGS_BIND
>>> flag?
>>> How are actions instantiated by the filters different from those that are
>>> created by "tc actions"?
>>>
>> Our patch aims to offload tc action that is created independent of any flow.
>It is usually
>> offloaded when it is added or replaced.
>> This patch is to implement a process of reoffloading the actions when driver
>is
>> inserted or removed, so it will still offload the independent actions.
>
>I see.
>
>>>>    	}
>>>>
>>>>    	/* We have to commit them all together, because if any error
>>>> happened in diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>>>> index 2ef8f5a6205a..351d93988b8b 100644
>>>> --- a/net/sched/cls_api.c
>>>> +++ b/net/sched/cls_api.c
>>>> @@ -3544,8 +3544,8 @@ static enum flow_action_hw_stats
>>> tc_act_hw_stats(u8 hw_stats)
>>>>    	return hw_stats;
>>>>    }
>>>>
>>>> -int tc_setup_flow_action(struct flow_action *flow_action,
>>>> -			 const struct tcf_exts *exts)
>>>> +int tc_setup_action(struct flow_action *flow_action,
>>>> +		    struct tc_action *actions[])
>>>>    {
>>>>    	struct tc_action *act;
>>>>    	int i, j, k, err = 0;
>>>> @@ -3554,11 +3554,11 @@ int tc_setup_flow_action(struct flow_action
>>> *flow_action,
>>>>    	BUILD_BUG_ON(TCA_ACT_HW_STATS_IMMEDIATE !=
>>> FLOW_ACTION_HW_STATS_IMMEDIATE);
>>>>    	BUILD_BUG_ON(TCA_ACT_HW_STATS_DELAYED !=
>>>> FLOW_ACTION_HW_STATS_DELAYED);
>>>>
>>>> -	if (!exts)
>>>> +	if (!actions)
>>>>    		return 0;
>>>>
>>>>    	j = 0;
>>>> -	tcf_exts_for_each_action(i, act, exts) {
>>>> +	tcf_act_for_each_action(i, act, actions) {
>>>>    		struct flow_action_entry *entry;
>>>>
>>>>    		entry = &flow_action->entries[j];
>>>> @@ -3725,7 +3725,19 @@ int tc_setup_flow_action(struct flow_action
>>> *flow_action,
>>>>    	spin_unlock_bh(&act->tcfa_lock);
>>>>    	goto err_out;
>>>>    }
>>>> +EXPORT_SYMBOL(tc_setup_action);
>>>> +
>>>> +#ifdef CONFIG_NET_CLS_ACT
>>>> +int tc_setup_flow_action(struct flow_action *flow_action,
>>>> +			 const struct tcf_exts *exts)
>>>> +{
>>>> +	if (!exts)
>>>> +		return 0;
>>>> +
>>>> +	return tc_setup_action(flow_action, exts->actions);
>>>> +}
>>>>    EXPORT_SYMBOL(tc_setup_flow_action);
>>>> +#endif
>>>>
>>>>    unsigned int tcf_exts_num_actions(struct tcf_exts *exts)
>>>>    {
>>>> @@ -3743,6 +3755,15 @@ unsigned int tcf_exts_num_actions(struct
>>> tcf_exts *exts)
>>>>    }
>>>>    EXPORT_SYMBOL(tcf_exts_num_actions);
>>>>
>>>> +unsigned int tcf_act_num_actions_single(struct tc_action *act)
>>>> +{
>>>> +	if (is_tcf_pedit(act))
>>>> +		return tcf_pedit_nkeys(act);
>>>> +	else
>>>> +		return 1;
>>>> +}
>>>> +EXPORT_SYMBOL(tcf_act_num_actions_single);
>>>> +
>>>>    #ifdef CONFIG_NET_CLS_ACT
>>>>    static int tcf_qevent_parse_block_index(struct nlattr *block_index_attr,
>>>>    					u32 *p_block_index,
>>>>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC/PATCH net-next v3 3/8] flow_offload: allow user to offload tc action to net device
  2021-11-01  9:44     ` Baowen Zheng
@ 2021-11-01 12:05       ` Vlad Buslov
  2021-11-02  1:38         ` Baowen Zheng
  0 siblings, 1 reply; 58+ messages in thread
From: Vlad Buslov @ 2021-11-01 12:05 UTC (permalink / raw)
  To: Baowen Zheng
  Cc: Simon Horman, netdev, Jamal Hadi Salim, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers


On Mon 01 Nov 2021 at 11:44, Baowen Zheng <baowen.zheng@corigine.com> wrote:
> Thanks for your review and sorry for delay in responding.
>
> On October 30, 2021 12:59 AM, Vlad Buslov <vladbu@nvidia.com> wrote:
>>On Thu 28 Oct 2021 at 14:06, Simon Horman <simon.horman@corigine.com>
>>wrote:
>>> From: Baowen Zheng <baowen.zheng@corigine.com>
>>>
>>> Use flow_indr_dev_register/flow_indr_dev_setup_offload to offload tc
>>> action.
>>>
>>> We need to call tc_cleanup_flow_action to clean up tc action entry
>>> since in tc_setup_action, some actions may hold dev refcnt, especially
>>> the mirror action.
>>>
>>> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
>>> Signed-off-by: Louis Peens <louis.peens@corigine.com>
>>> Signed-off-by: Simon Horman <simon.horman@corigine.com>
>>> ---
>>>  include/linux/netdevice.h  |   1 +
>>>  include/net/act_api.h      |   2 +-
>>>  include/net/flow_offload.h |  17 ++++
>>>  include/net/pkt_cls.h      |  15 ++++
>>>  net/core/flow_offload.c    |  43 ++++++++--
>>>  net/sched/act_api.c        | 166
>>+++++++++++++++++++++++++++++++++++++
>>>  net/sched/cls_api.c        |  29 ++++++-
>>>  7 files changed, 260 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>> index 3ec42495a43a..9815c3a058e9 100644
>>> --- a/include/linux/netdevice.h
>>> +++ b/include/linux/netdevice.h
>>> @@ -916,6 +916,7 @@ enum tc_setup_type {
>>>  	TC_SETUP_QDISC_TBF,
>>>  	TC_SETUP_QDISC_FIFO,
>>>  	TC_SETUP_QDISC_HTB,
>>> +	TC_SETUP_ACT,
>>>  };
>>>
>>>  /* These structures hold the attributes of bpf state that are being
>>> passed diff --git a/include/net/act_api.h b/include/net/act_api.h
>>> index b5b624c7e488..9eb19188603c 100644
>>> --- a/include/net/act_api.h
>>> +++ b/include/net/act_api.h
>>> @@ -239,7 +239,7 @@ static inline void
>>> tcf_action_inc_overlimit_qstats(struct tc_action *a)  void
>>tcf_action_update_stats(struct tc_action *a, u64 bytes, u64 packets,
>>>  			     u64 drops, bool hw);
>>>  int tcf_action_copy_stats(struct sk_buff *, struct tc_action *, int);
>>> -
>>> +int tcf_action_offload_del(struct tc_action *action);
>>
>>This doesn't seem to be used anywhere outside of act_api in this series, so
>>why is it exported?
> Thanks for bringing this to us; we will fix this by moving the implementation block in act_api.c.
>>>  int tcf_action_check_ctrlact(int action, struct tcf_proto *tp,
>>>  			     struct tcf_chain **handle,
>>>  			     struct netlink_ext_ack *newchain); diff --git
>>> a/include/net/flow_offload.h b/include/net/flow_offload.h index
>>> 3961461d9c8b..aa28592fccc0 100644
>>> --- a/include/net/flow_offload.h
>>> +++ b/include/net/flow_offload.h
>>> @@ -552,6 +552,23 @@ struct flow_cls_offload {
>>>  	u32 classid;
>>>  };
>>>
>>> +enum flow_act_command {
>>> +	FLOW_ACT_REPLACE,
>>> +	FLOW_ACT_DESTROY,
>>> +	FLOW_ACT_STATS,
>>> +};
>>> +
>>> +struct flow_offload_action {
>>> +	struct netlink_ext_ack *extack; /* NULL in FLOW_ACT_STATS
>>process*/
>>> +	enum flow_act_command command;
>>> +	enum flow_action_id id;
>>> +	u32 index;
>>> +	struct flow_stats stats;
>>> +	struct flow_action action;
>>> +};
>>> +
>>> +struct flow_offload_action *flow_action_alloc(unsigned int
>>> +num_actions);
>>> +
>>>  static inline struct flow_rule *
>>>  flow_cls_offload_flow_rule(struct flow_cls_offload *flow_cmd)  { diff
>>> --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h index
>>> 193f88ebf629..922775407257 100644
>>> --- a/include/net/pkt_cls.h
>>> +++ b/include/net/pkt_cls.h
>>> @@ -258,6 +258,9 @@ static inline void tcf_exts_put_net(struct tcf_exts
>>*exts)
>>>  	for (; 0; (void)(i), (void)(a), (void)(exts))  #endif
>>>
>>> +#define tcf_act_for_each_action(i, a, actions) \
>>> +	for (i = 0; i < TCA_ACT_MAX_PRIO && ((a) = actions[i]); i++)
>>> +
>>>  static inline void
>>>  tcf_exts_stats_update(const struct tcf_exts *exts,
>>>  		      u64 bytes, u64 packets, u64 drops, u64 lastuse, @@ -532,8
>>> +535,19 @@ tcf_match_indev(struct sk_buff *skb, int ifindex)
>>>  	return ifindex == skb->skb_iif;
>>>  }
>>>
>>> +#ifdef CONFIG_NET_CLS_ACT
>>>  int tc_setup_flow_action(struct flow_action *flow_action,
>>>  			 const struct tcf_exts *exts);
>>
>>Why does existing cls_api function tc_setup_flow_action() now depend on
>>CONFIG_NET_CLS_ACT?
> Originally, tc_setup_flow_action handled the CONFIG_NET_CLS_ACT dependency
> by calling the macro tcf_exts_for_each_action. Now that it calls
> tc_setup_action, tc_setup_flow_action refers to exts->actions directly,
> so it depends on CONFIG_NET_CLS_ACT explicitly.
> To fix this, we need the ifdef either in the tc_setup_flow_action
> declaration or in its implementation in cls_api.c.
> Does that make sense?

Since we already have multiple such ifdefs in cls_api I don't think
having more is an issue, but I also don't think we need to ifdef this
function in both pkt_cls.h and cls_api.c. Unless I'm missing something
you can either:

- Make tc_setup_flow_action() inline in pkt_cls.h and remove its
definition from cls_api.c since tc_setup_action() is also exported.

- Move ifdef check inside function definition in cls_api.c (return 0 if
config is not defined), which will allow you to remove ifdef from
pkt_cls.h.

WDYT?
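
To make the second option concrete, here is a rough userspace sketch of
the pattern. It is not the kernel code: the struct layouts are cut-down
stand-ins invented for illustration, CONFIG_NET_CLS_ACT is defined by
hand instead of by Kconfig, and tc_setup_action() only counts entries.
The point is only that the ifdef lives inside the one definition, so
the header can carry a single unconditional prototype.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the Kconfig symbol; remove this line to mimic a build
 * without action support. */
#define CONFIG_NET_CLS_ACT 1

#define TCA_ACT_MAX_PRIO 32

/* Cut-down, hypothetical versions of the kernel structures. */
struct tc_action { int id; };
struct flow_action { unsigned int num_entries; };
struct tcf_exts {
#ifdef CONFIG_NET_CLS_ACT
	struct tc_action **actions;	/* only present with action support */
#endif
	int unused;
};

/* Simplified stand-in: counts the NULL-terminated action array. */
int tc_setup_action(struct flow_action *flow_action,
		    struct tc_action *actions[])
{
	unsigned int i;

	assert(flow_action != NULL);
	if (!actions)
		return 0;

	for (i = 0; i < TCA_ACT_MAX_PRIO && actions[i]; i++)
		flow_action->num_entries++;
	return 0;
}

/* Option 2: one definition, ifdef inside the body. */
int tc_setup_flow_action(struct flow_action *flow_action,
			 const struct tcf_exts *exts)
{
#ifdef CONFIG_NET_CLS_ACT
	if (!exts)
		return 0;

	return tc_setup_action(flow_action, exts->actions);
#else
	return 0;
#endif
}
```

With the config symbol undefined the wrapper never touches
exts->actions, so it still compiles and simply returns 0.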

>>> +#else
>>> +static inline int tc_setup_flow_action(struct flow_action *flow_action,
>>> +				       const struct tcf_exts *exts) {
>>> +	return 0;
>>> +}
>>> +#endif
>>> +
>>> +int tc_setup_action(struct flow_action *flow_action,
>>> +		    struct tc_action *actions[]);
>>>  void tc_cleanup_flow_action(struct flow_action *flow_action);
>>>
> ...
>>>  #ifdef CONFIG_INET
>>>  DEFINE_STATIC_KEY_FALSE(tcf_frag_xmit_count);
>>> @@ -148,6 +161,7 @@ static int __tcf_action_put(struct tc_action *p, bool
>>bind)
>>>  		idr_remove(&idrinfo->action_idr, p->tcfa_index);
>>>  		mutex_unlock(&idrinfo->lock);
>>>
>>> +		tcf_action_offload_del(p);
>>>  		tcf_action_cleanup(p);
>>>  		return 1;
>>>  	}
>>> @@ -341,6 +355,7 @@ static int tcf_idr_release_unsafe(struct tc_action *p)
>>>  		return -EPERM;
>>>
>>>  	if (refcount_dec_and_test(&p->tcfa_refcnt)) {
>>> +		tcf_action_offload_del(p);
>>>  		idr_remove(&p->idrinfo->action_idr, p->tcfa_index);
>>>  		tcf_action_cleanup(p);
>>>  		return ACT_P_DELETED;
>>> @@ -452,6 +467,7 @@ static int tcf_idr_delete_index(struct tcf_idrinfo
>>*idrinfo, u32 index)
>>>  						p->tcfa_index));
>>>  			mutex_unlock(&idrinfo->lock);
>>>
>>> +			tcf_action_offload_del(p);
>>
>>tcf_action_offload_del() and tcf_action_cleanup() seem to be always called
>>together. Consider moving the call to tcf_action_offload_del() into
>>tcf_action_cleanup().
>>
> Thanks, we will consider moving tcf_action_offload_del() inside tcf_action_cleanup().
>>>  			tcf_action_cleanup(p);
>>>  			module_put(owner);
>>>  			return 0;
>>> @@ -1061,6 +1077,154 @@ struct tc_action *tcf_action_init_1(struct net
>>*net, struct tcf_proto *tp,
>>>  	return ERR_PTR(err);
>>>  }
>>>
> ...
>>> +/* offload the tc command after inserted */ static int
>>> +tcf_action_offload_add(struct tc_action *action,
>>> +				  struct netlink_ext_ack *extack) {
>>> +	struct tc_action *actions[TCA_ACT_MAX_PRIO] = {
>>> +		[0] = action,
>>> +	};
>>> +	struct flow_offload_action *fl_action;
>>> +	int err = 0;
>>> +
>>> +	fl_action = flow_action_alloc(tcf_act_num_actions_single(action));
>>> +	if (!fl_action)
>>> +		return -EINVAL;
>>
>>Failed alloc-like functions usually result in -ENOMEM.
>>
> Thanks, we will fix this in V4 patch.
>>> +
>>> +	err = flow_action_init(fl_action, action, FLOW_ACT_REPLACE, extack);
>>> +	if (err)
>>> +		goto fl_err;
>>> +
>>> +	err = tc_setup_action(&fl_action->action, actions);
>>> +	if (err) {
>>> +		NL_SET_ERR_MSG_MOD(extack,
>>> +				   "Failed to setup tc actions for offload\n");
>>> +		goto fl_err;
>>> +	}
>>> +
>>> +	err = tcf_action_offload_cmd(fl_action, extack);
>>> +	tc_cleanup_flow_action(&fl_action->action);
>>> +
>>> +fl_err:
>>> +	kfree(fl_action);
>>> +
>>> +	return err;
>>> +}
>>> +
>>> +int tcf_action_offload_del(struct tc_action *action) {
>>> +	struct flow_offload_action fl_act;
>>> +	int err = 0;
>>> +
>>> +	if (!action)
>>> +		return -EINVAL;
>>> +
>>> +	err = flow_action_init(&fl_act, action, FLOW_ACT_DESTROY, NULL);
>>> +	if (err)
>>> +		return err;
>>> +
>>> +	return tcf_action_offload_cmd(&fl_act, NULL); }
>>> +
>>>  /* Returns numbers of initialized actions or negative error. */
>>>
>>>  int tcf_action_init(struct net *net, struct tcf_proto *tp, struct
>>> nlattr *nla, @@ -1103,6 +1267,8 @@ int tcf_action_init(struct net *net,
>>struct tcf_proto *tp, struct nlattr *nla,
>>>  		sz += tcf_action_fill_size(act);
>>>  		/* Start from index 0 */
>>>  		actions[i - 1] = act;
>>> +		if (!(flags & TCA_ACT_FLAGS_BIND))
>>> +			tcf_action_offload_add(act, extack);
>>>  	}
>>>
>>>  	/* We have to commit them all together, because if any error
>>> happened in diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>>> index 2ef8f5a6205a..351d93988b8b 100644
>>> --- a/net/sched/cls_api.c
>>> +++ b/net/sched/cls_api.c
>>> @@ -3544,8 +3544,8 @@ static enum flow_action_hw_stats
>>tc_act_hw_stats(u8 hw_stats)
>>>  	return hw_stats;
>>>  }
>>>
>>> -int tc_setup_flow_action(struct flow_action *flow_action,
>>> -			 const struct tcf_exts *exts)
>>> +int tc_setup_action(struct flow_action *flow_action,
>>> +		    struct tc_action *actions[])
>>>  {
>>>  	struct tc_action *act;
>>>  	int i, j, k, err = 0;
>>> @@ -3554,11 +3554,11 @@ int tc_setup_flow_action(struct flow_action
>>*flow_action,
>>>  	BUILD_BUG_ON(TCA_ACT_HW_STATS_IMMEDIATE !=
>>FLOW_ACTION_HW_STATS_IMMEDIATE);
>>>  	BUILD_BUG_ON(TCA_ACT_HW_STATS_DELAYED !=
>>> FLOW_ACTION_HW_STATS_DELAYED);
>>>
>>> -	if (!exts)
>>> +	if (!actions)
>>>  		return 0;
>>>
>>>  	j = 0;
>>> -	tcf_exts_for_each_action(i, act, exts) {
>>> +	tcf_act_for_each_action(i, act, actions) {
>>>  		struct flow_action_entry *entry;
>>>
>>>  		entry = &flow_action->entries[j];
>>> @@ -3725,7 +3725,19 @@ int tc_setup_flow_action(struct flow_action
>>*flow_action,
>>>  	spin_unlock_bh(&act->tcfa_lock);
>>>  	goto err_out;
>>>  }
>>> +EXPORT_SYMBOL(tc_setup_action);
>>> +
>>> +#ifdef CONFIG_NET_CLS_ACT
>>
>>Maybe just move tc_setup_action() to act_api and ifdef its definition in
>>pkt_cls.h instead of existing tc_setup_flow_action()?
> As explained above, after the change tc_setup_flow_action will call
> tc_setup_action and refer to exts->actions, so just moving
> tc_setup_action cannot fix this problem.

Got it.

>>> +int tc_setup_flow_action(struct flow_action *flow_action,
>>> +			 const struct tcf_exts *exts)
>>> +{
>>> +	if (!exts)
>>> +		return 0;
>>> +
>>> +	return tc_setup_action(flow_action, exts->actions); }
>>>  EXPORT_SYMBOL(tc_setup_flow_action);
>>> +#endif
>>>
>>>  unsigned int tcf_exts_num_actions(struct tcf_exts *exts)  { @@
>>> -3743,6 +3755,15 @@ unsigned int tcf_exts_num_actions(struct tcf_exts
>>> *exts)  }  EXPORT_SYMBOL(tcf_exts_num_actions);
>>>
>>> +unsigned int tcf_act_num_actions_single(struct tc_action *act) {
>>> +	if (is_tcf_pedit(act))
>>> +		return tcf_pedit_nkeys(act);
>>> +	else
>>> +		return 1;
>>> +}
>>> +EXPORT_SYMBOL(tcf_act_num_actions_single);
>>> +
>>>  #ifdef CONFIG_NET_CLS_ACT
>>>  static int tcf_qevent_parse_block_index(struct nlattr *block_index_attr,
>>>  					u32 *p_block_index,


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device
  2021-10-31 14:14     ` Jamal Hadi Salim
  2021-10-31 14:19       ` Jamal Hadi Salim
@ 2021-11-01 14:27       ` Dave Taht
  1 sibling, 0 replies; 58+ messages in thread
From: Dave Taht @ 2021-11-01 14:27 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Oz Shlomo, Simon Horman, Linux Kernel Network Developers,
	Vlad Buslov, Roi Dayan, Ido Schimmel, Cong Wang, Jiri Pirko,
	Baowen Zheng, Louis Peens, oss-drivers

On Sun, Oct 31, 2021 at 7:14 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>
> On 2021-10-31 08:03, Dave Taht wrote:
> [..]
>
> >
> > Just as an on-going grump: It has been my hope that policing as a
> > technique would have died a horrible death by now. Seeing it come back
> > as an "easy to offload" operation here - fresh from the 1990s! does
> > not mean it's a good idea, and I'd rather like it if we were finding
> > ways to
> > offload newer things that work better, such as modern aqm, fair
> > queuing, and shaping technologies that are in pie, fq_codel, and cake.
> >
> > policing leads to bursty loss, especially at higher rates, BBR has a
> > specific mode designed to defeat it, and I ripped it out of
> > wondershaper
> > long ago for very good reasons:
> > https://www.bufferbloat.net/projects/bloat/wiki/Wondershaper_Must_Die/
> >
> > I did a long time ago start working on a better policing idea based on
> > some good aqm ideas like AFD, but dropped it figuring that policing
> > was going to vanish
> > from the planet. It's baaaaaack.
>
> A lot of enthusiasm for fq_codel in that link ;->

Wrote that in 2013. It's not every day you solve tcp global synchronization,
achieve a queue depth of 5ms no matter the rate, develop something that
has zero latency for sparse packets, only shoots at the fat flows, drops from
head so there's always an immediate signal of congestion from the packet
just behind, makes opus's PLC and simpler forms of FEC "just work", and
requires near zero configuration.

The plots at the end made a very convincing case for abandoning policing.

> Root cause for burstiness is typically due to large transient queues
> (which are sometimes not under your admin control) and of course if
> you use a policer and dont have your double leaky buckets set properly
> to compensate for both short and long term rates you will have bursts
> of drops with the policer.

I would really like to see a good configuration guide for policing at
multiple real-world bandwidths and at real-world workloads.

> It would be the same with shaper as well
> if the packet burst shows up when the queue is full.

Queues are shock absorbers as Van always says. We do drop packets
still, on the rx ring. The default queue depth of codel is 32MB. It takes
a really really really large burst to overwhelm that.

I wonder where all the userspace wireguard vpns are dropping packets nowadays.

> Intuitively it would feel, for non-work conserving approaches,
> delaying a packet (as in shaping) is better than dropping (as in

It's shaping + flow queueing that's the win, if you are going to
queue. It gets all the flows statistically multiplexed and in flow
balance orders of magnitude faster than a policer could. (flow queuing
is different from classic fair queuing)

The tiny flows pass through untouched at zero delay also.

At the time, I was considering applying a codel-like technique to policing -
I'd called it "bobbie", where once you exceed the rate, a virtual clock moves
forward as to how long you would have delayed packet delivery if you were
queueing and then starts shooting at packets once your burst tolerance is
exceeded until

But inbound fq+shaping did wonders faster, and selfishly I didn't feel
like abandoning floating point to work with in the kernel.
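
For anyone who wants to play with the virtual clock idea described
above, here is a minimal integer-only sketch. To be clear about what it
is: a plain virtual-scheduling (GCRA-style) policer, not the unfinished
"bobbie" design, and it has no codel-like drop scheduling. All names
(vc_policer, vc_policer_admit, tolerance_ns) are made up for this
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct vc_policer {
	uint64_t rate_Bps;	/* configured rate, bytes per second */
	uint64_t tolerance_ns;	/* how far the virtual clock may lead real time */
	uint64_t vclock_ns;	/* when the backlog would drain if we had queued */
};

void vc_policer_init(struct vc_policer *p, uint64_t rate_Bps,
		     uint64_t tolerance_ns)
{
	assert(rate_Bps > 0);
	p->rate_Bps = rate_Bps;
	p->tolerance_ns = tolerance_ns;
	p->vclock_ns = 0;
}

/* Returns true if the packet passes, false if it should be dropped. */
bool vc_policer_admit(struct vc_policer *p, uint64_t now_ns, uint32_t len)
{
	/* Time the packet would occupy the link at the configured rate. */
	uint64_t cost_ns = (uint64_t)len * 1000000000ULL / p->rate_Bps;

	/* After an idle period the virtual clock catches up to real time. */
	if (p->vclock_ns < now_ns)
		p->vclock_ns = now_ns;

	/* The lead of the virtual clock is the delay a shaper would have
	 * imposed; drop once it exceeds the burst tolerance. */
	if (p->vclock_ns - now_ns > p->tolerance_ns)
		return false;

	p->vclock_ns += cost_ns;
	return true;
}
```

Timestamps are nanoseconds from any monotonic clock. Dropped packets do
not advance the virtual clock, so a sender that backs off is admitted
again as soon as real time catches up.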

That said, it's taken breaking the qdisc lock and xpf to make inbound
shaping scale decently
(see: https://github.com/rchac/LibreQoS#how-do-cake-and-fq_codel-work )

> policing) - but i have not a study which scientifically proves it.
> Any pointers in that regard?

Neither do I. Matt Mathis has ranted about it, and certainly the workarounds
in BBRv1 to defeat others' desperate attempts to control their bandwidth with
a policer are obvious from their data.

If there really is a resurgence of interest in policing, a good paper
would compare a classic 3 color policer to bobbie, and to shaping
vis-a-vis BBR and cubic.

I'm low on students at the moment...

> TCP would recover either way (either detecting sequence gaps or RTO).

Yes, it does. But policing is often devastating to voip and videoconferencing
traffic.

> In Linux kernel level i am not sure i see much difference in either
> since we actually feedback an indicator to TCP to indicate a local
> drop (as opposed to guessing when it is dropped in the network)
> and the TCP code is smart enough to utilize that knowledge.

There is an extremely long and difficult conversation I'd had over
the differences between sch_fq and fq_codel and the differences
between a server and a router, for real world applications over here:

https://github.com/systemd/systemd/issues/9725#issuecomment-413369212

> For hardware offload there is no such feedback for either of those
> two approaches (so no difference with drop in the blackhole).

Yes, now you've built a *router* and lost the local control loop.
TSQ, sch_fq's pacing, and other host optimizations no longer work.

I encourage more folk to regularly take packet
captures of the end results of offloads vis-a-vis network latency.

Look! MORE BANDWIDTH for a single flow! Wait! There's
600ms of latency and new flows can't even get started!

>
> As to "policer must die" - not possible i am afraid;-> I mean there
> has to be strong evidence that it is a bad idea and besides that
> _a lot of hardware_ supports it;-> Ergo, we have to support it as well.

I agree that supporting hardware features is good. I merely wish that
certain other software features were making it into modern hardware.

I'm encouraged by this work in p4, at least.

https://arxiv.org/pdf/2010.04528.pdf

> Note: RED for example has been proven almost impossible to configure
> properly but we still support it and there's a good set of hardware
> offload support for it. For RED - and i should say the policer as well -
> if you configure properly, _it works_.

I have no idea how often RED is used nowadays. The *only* requests
for offloading it I've heard are for configuring it as a brick wall ecn
marking tool, which does indeed work for dctcp.

The hope was that pie, being similar in construction, would end up
implemented in hardware; however, it has so far turned out that codel
was easier to implement in hw and more effective.
>
>
> BTW, Some mellanox NICs offload HTB. See for example:
> https://legacy.netdevconf.info/0x14/session.html?talk-hierarchical-QoS-hardware-offload

Yes. Too bad they then attach it to fifos. I'd so love to help a
hardware company willing to do the work to put modern algorithms in hw...

> cheers,
> jamal

I note that although I've enjoyed ranting, I don't actually have any
objections to this particular patch.
--
I tried to build a better future, a few times:
https://wayforward.archive.org/?site=https%3A%2F%2Fwww.icei.org

Dave Täht CEO, TekLibre, LLC

^ permalink raw reply	[flat|nested] 58+ messages in thread

* RE: [RFC/PATCH net-next v3 3/8] flow_offload: allow user to offload tc action to net device
  2021-11-01 12:05       ` Vlad Buslov
@ 2021-11-02  1:38         ` Baowen Zheng
  0 siblings, 0 replies; 58+ messages in thread
From: Baowen Zheng @ 2021-11-02  1:38 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Simon Horman, netdev, Jamal Hadi Salim, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers

On November 1, 2021 8:06 PM, Vlad Buslov <vladbu@nvidia.com> wrote:
>On Mon 01 Nov 2021 at 11:44, Baowen Zheng <baowen.zheng@corigine.com>
>wrote:
>> Thanks for your review and sorry for delay in responding.
>>
>> On October 30, 2021 12:59 AM, Vlad Buslov <vladbu@nvidia.com> wrote:
>>>On Thu 28 Oct 2021 at 14:06, Simon Horman <simon.horman@corigine.com>
>>>wrote:
>>>> From: Baowen Zheng <baowen.zheng@corigine.com>
>>>>
>>>> Use flow_indr_dev_register/flow_indr_dev_setup_offload to offload tc
>>>> action.
>>>>
>>>> We need to call tc_cleanup_flow_action to clean up tc action entry
>>>> since in tc_setup_action, some actions may hold dev refcnt,
>>>> especially the mirror action.
>>>>
>>>> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
>>>> Signed-off-by: Louis Peens <louis.peens@corigine.com>
>>>> Signed-off-by: Simon Horman <simon.horman@corigine.com>
>>>> ---
>>>>  include/linux/netdevice.h  |   1 +
>>>>  include/net/act_api.h      |   2 +-
>>>>  include/net/flow_offload.h |  17 ++++
>>>>  include/net/pkt_cls.h      |  15 ++++
>>>>  net/core/flow_offload.c    |  43 ++++++++--
>>>>  net/sched/act_api.c        | 166
>>>+++++++++++++++++++++++++++++++++++++
>>>>  net/sched/cls_api.c        |  29 ++++++-
>>>>  7 files changed, 260 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>>> index 3ec42495a43a..9815c3a058e9 100644
>>>> --- a/include/linux/netdevice.h
>>>> +++ b/include/linux/netdevice.h
>>>> @@ -916,6 +916,7 @@ enum tc_setup_type {
>>>>  	TC_SETUP_QDISC_TBF,
>>>>  	TC_SETUP_QDISC_FIFO,
>>>>  	TC_SETUP_QDISC_HTB,
>>>> +	TC_SETUP_ACT,
>>>>  };
>>>>
>>>>  /* These structures hold the attributes of bpf state that are being
>>>> passed diff --git a/include/net/act_api.h b/include/net/act_api.h
>>>> index b5b624c7e488..9eb19188603c 100644
>>>> --- a/include/net/act_api.h
>>>> +++ b/include/net/act_api.h
>>>> @@ -239,7 +239,7 @@ static inline void
>>>> tcf_action_inc_overlimit_qstats(struct tc_action *a)  void
>>>tcf_action_update_stats(struct tc_action *a, u64 bytes, u64 packets,
>>>>  			     u64 drops, bool hw);
>>>>  int tcf_action_copy_stats(struct sk_buff *, struct tc_action *,
>>>> int);
>>>> -
>>>> +int tcf_action_offload_del(struct tc_action *action);
>>>
>>>This doesn't seem to be used anywhere outside of act_api in this
>>>series, so why is it exported?
>> Thanks for bringing this to us; we will fix this by moving the
>> implementation block in act_api.c.
>>>>  int tcf_action_check_ctrlact(int action, struct tcf_proto *tp,
>>>>  			     struct tcf_chain **handle,
>>>>  			     struct netlink_ext_ack *newchain); diff --git
>>>> a/include/net/flow_offload.h b/include/net/flow_offload.h index
>>>> 3961461d9c8b..aa28592fccc0 100644
>>>> --- a/include/net/flow_offload.h
>>>> +++ b/include/net/flow_offload.h
>>>> @@ -552,6 +552,23 @@ struct flow_cls_offload {
>>>>  	u32 classid;
>>>>  };
>>>>
>>>> +enum flow_act_command {
>>>> +	FLOW_ACT_REPLACE,
>>>> +	FLOW_ACT_DESTROY,
>>>> +	FLOW_ACT_STATS,
>>>> +};
>>>> +
>>>> +struct flow_offload_action {
>>>> +	struct netlink_ext_ack *extack; /* NULL in FLOW_ACT_STATS
>>>process*/
>>>> +	enum flow_act_command command;
>>>> +	enum flow_action_id id;
>>>> +	u32 index;
>>>> +	struct flow_stats stats;
>>>> +	struct flow_action action;
>>>> +};
>>>> +
>>>> +struct flow_offload_action *flow_action_alloc(unsigned int
>>>> +num_actions);
>>>> +
>>>>  static inline struct flow_rule *
>>>>  flow_cls_offload_flow_rule(struct flow_cls_offload *flow_cmd)  {
>>>> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h index
>>>> 193f88ebf629..922775407257 100644
>>>> --- a/include/net/pkt_cls.h
>>>> +++ b/include/net/pkt_cls.h
>>>> @@ -258,6 +258,9 @@ static inline void tcf_exts_put_net(struct
>>>> tcf_exts
>>>*exts)
>>>>  	for (; 0; (void)(i), (void)(a), (void)(exts))  #endif
>>>>
>>>> +#define tcf_act_for_each_action(i, a, actions) \
>>>> +	for (i = 0; i < TCA_ACT_MAX_PRIO && ((a) = actions[i]); i++)
>>>> +
>>>>  static inline void
>>>>  tcf_exts_stats_update(const struct tcf_exts *exts,
>>>>  		      u64 bytes, u64 packets, u64 drops, u64 lastuse, @@ -532,8
>>>> +535,19 @@ tcf_match_indev(struct sk_buff *skb, int ifindex)
>>>>  	return ifindex == skb->skb_iif;
>>>>  }
>>>>
>>>> +#ifdef CONFIG_NET_CLS_ACT
>>>>  int tc_setup_flow_action(struct flow_action *flow_action,
>>>>  			 const struct tcf_exts *exts);
>>>
>>>Why does existing cls_api function tc_setup_flow_action() now depend
>>>on CONFIG_NET_CLS_ACT?
>> Originally, tc_setup_flow_action handled the CONFIG_NET_CLS_ACT
>> dependency by calling the macro tcf_exts_for_each_action. Now that it
>> calls tc_setup_action, tc_setup_flow_action refers to exts->actions
>> directly, so it depends on CONFIG_NET_CLS_ACT explicitly.
>> To fix this, we need the ifdef either in the tc_setup_flow_action
>> declaration or in its implementation in cls_api.c.
>> Does that make sense?
>
>Since we already have multiple such ifdefs in cls_api I don't think having
>more is an issue, but I also don't think we need to ifdef this function in both
>pkt_cls.h and cls_api.c. Unless I'm missing something you can either:
>
>- Make tc_setup_flow_action() inline in pkt_cls.h and remove its definition
>from cls_api.c since tc_setup_action() is also exported.
>
>- Move ifdef check inside function definition in cls_api.c (return 0 if config is
>not defined), which will allow you to remove ifdef from pkt_cls.h.
>
>WDYT?
Thanks, I think it makes sense to us. We will make the change according to the second option.
>>>> +#else
>>>> +static inline int tc_setup_flow_action(struct flow_action *flow_action,
>>>> +				       const struct tcf_exts *exts) {
>>>> +	return 0;
>>>> +}
>>>> +#endif
>>>> +
>>>> +int tc_setup_action(struct flow_action *flow_action,
>>>> +		    struct tc_action *actions[]);
>>>>  void tc_cleanup_flow_action(struct flow_action *flow_action);
>>>>
>> ...
>>>>  #ifdef CONFIG_INET
>>>>  DEFINE_STATIC_KEY_FALSE(tcf_frag_xmit_count);
>>>> @@ -148,6 +161,7 @@ static int __tcf_action_put(struct tc_action *p,
>>>> bool
>>>bind)
>>>>  		idr_remove(&idrinfo->action_idr, p->tcfa_index);
>>>>  		mutex_unlock(&idrinfo->lock);
>>>>
>>>> +		tcf_action_offload_del(p);
>>>>  		tcf_action_cleanup(p);
>>>>  		return 1;
>>>>  	}
>>>> @@ -341,6 +355,7 @@ static int tcf_idr_release_unsafe(struct tc_action
>*p)
>>>>  		return -EPERM;
>>>>
>>>>  	if (refcount_dec_and_test(&p->tcfa_refcnt)) {
>>>> +		tcf_action_offload_del(p);
>>>>  		idr_remove(&p->idrinfo->action_idr, p->tcfa_index);
>>>>  		tcf_action_cleanup(p);
>>>>  		return ACT_P_DELETED;
>>>> @@ -452,6 +467,7 @@ static int tcf_idr_delete_index(struct
>>>> tcf_idrinfo
>>>*idrinfo, u32 index)
>>>>  						p->tcfa_index));
>>>>  			mutex_unlock(&idrinfo->lock);
>>>>
>>>> +			tcf_action_offload_del(p);
>>>
>>>tcf_action_offload_del() and tcf_action_cleanup() seem to be always
>>>called together. Consider moving the call to tcf_action_offload_del()
>>>into tcf_action_cleanup().
>>>
>> Thanks, we will consider moving tcf_action_offload_del() inside
>> tcf_action_cleanup().
>>>>  			tcf_action_cleanup(p);
>>>>  			module_put(owner);
>>>>  			return 0;
>>>> @@ -1061,6 +1077,154 @@ struct tc_action *tcf_action_init_1(struct
>>>> net
>>>*net, struct tcf_proto *tp,
>>>>  	return ERR_PTR(err);
>>>>  }
>>>>
>> ...
>>>> +/* offload the tc command after inserted */ static int
>>>> +tcf_action_offload_add(struct tc_action *action,
>>>> +				  struct netlink_ext_ack *extack) {
>>>> +	struct tc_action *actions[TCA_ACT_MAX_PRIO] = {
>>>> +		[0] = action,
>>>> +	};
>>>> +	struct flow_offload_action *fl_action;
>>>> +	int err = 0;
>>>> +
>>>> +	fl_action = flow_action_alloc(tcf_act_num_actions_single(action));
>>>> +	if (!fl_action)
>>>> +		return -EINVAL;
>>>
>>>Failed alloc-like functions usually result in -ENOMEM.
>>>
>> Thanks, we will fix this in V4 patch.
>>>> +
>>>> +	err = flow_action_init(fl_action, action, FLOW_ACT_REPLACE, extack);
>>>> +	if (err)
>>>> +		goto fl_err;
>>>> +
>>>> +	err = tc_setup_action(&fl_action->action, actions);
>>>> +	if (err) {
>>>> +		NL_SET_ERR_MSG_MOD(extack,
>>>> +				   "Failed to setup tc actions for offload\n");
>>>> +		goto fl_err;
>>>> +	}
>>>> +
>>>> +	err = tcf_action_offload_cmd(fl_action, extack);
>>>> +	tc_cleanup_flow_action(&fl_action->action);
>>>> +
>>>> +fl_err:
>>>> +	kfree(fl_action);
>>>> +
>>>> +	return err;
>>>> +}
>>>> +
>>>> +int tcf_action_offload_del(struct tc_action *action) {
>>>> +	struct flow_offload_action fl_act;
>>>> +	int err = 0;
>>>> +
>>>> +	if (!action)
>>>> +		return -EINVAL;
>>>> +
>>>> +	err = flow_action_init(&fl_act, action, FLOW_ACT_DESTROY, NULL);
>>>> +	if (err)
>>>> +		return err;
>>>> +
>>>> +	return tcf_action_offload_cmd(&fl_act, NULL); }
>>>> +
>>>>  /* Returns numbers of initialized actions or negative error. */
>>>>
>>>>  int tcf_action_init(struct net *net, struct tcf_proto *tp, struct
>>>> nlattr *nla, @@ -1103,6 +1267,8 @@ int tcf_action_init(struct net
>>>> *net,
>>>struct tcf_proto *tp, struct nlattr *nla,
>>>>  		sz += tcf_action_fill_size(act);
>>>>  		/* Start from index 0 */
>>>>  		actions[i - 1] = act;
>>>> +		if (!(flags & TCA_ACT_FLAGS_BIND))
>>>> +			tcf_action_offload_add(act, extack);
>>>>  	}
>>>>
>>>>  	/* We have to commit them all together, because if any error
>>>> happened in diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>>>> index 2ef8f5a6205a..351d93988b8b 100644
>>>> --- a/net/sched/cls_api.c
>>>> +++ b/net/sched/cls_api.c
>>>> @@ -3544,8 +3544,8 @@ static enum flow_action_hw_stats
>>>tc_act_hw_stats(u8 hw_stats)
>>>>  	return hw_stats;
>>>>  }
>>>>
>>>> -int tc_setup_flow_action(struct flow_action *flow_action,
>>>> -			 const struct tcf_exts *exts)
>>>> +int tc_setup_action(struct flow_action *flow_action,
>>>> +		    struct tc_action *actions[])
>>>>  {
>>>>  	struct tc_action *act;
>>>>  	int i, j, k, err = 0;
>>>> @@ -3554,11 +3554,11 @@ int tc_setup_flow_action(struct flow_action
>>>*flow_action,
>>>>  	BUILD_BUG_ON(TCA_ACT_HW_STATS_IMMEDIATE !=
>>>FLOW_ACTION_HW_STATS_IMMEDIATE);
>>>>  	BUILD_BUG_ON(TCA_ACT_HW_STATS_DELAYED !=
>>>> FLOW_ACTION_HW_STATS_DELAYED);
>>>>
>>>> -	if (!exts)
>>>> +	if (!actions)
>>>>  		return 0;
>>>>
>>>>  	j = 0;
>>>> -	tcf_exts_for_each_action(i, act, exts) {
>>>> +	tcf_act_for_each_action(i, act, actions) {
>>>>  		struct flow_action_entry *entry;
>>>>
>>>>  		entry = &flow_action->entries[j]; @@ -3725,7 +3725,19 @@
>int
>>>> tc_setup_flow_action(struct flow_action
>>>*flow_action,
>>>>  	spin_unlock_bh(&act->tcfa_lock);
>>>>  	goto err_out;
>>>>  }
>>>> +EXPORT_SYMBOL(tc_setup_action);
>>>> +
>>>> +#ifdef CONFIG_NET_CLS_ACT
>>>
>>>Maybe just move tc_setup_action() to act_api and ifdef its definition
>>>in pkt_cls.h instead of existing tc_setup_flow_action()?
>> As explained above, after the change tc_setup_flow_action will call
>> tc_setup_action and refer to exts->actions, so just moving
>> tc_setup_action cannot fix this problem.
>
>Got it.
>
>>>> +int tc_setup_flow_action(struct flow_action *flow_action,
>>>> +			 const struct tcf_exts *exts)
>>>> +{
>>>> +	if (!exts)
>>>> +		return 0;
>>>> +
>>>> +	return tc_setup_action(flow_action, exts->actions); }
>>>>  EXPORT_SYMBOL(tc_setup_flow_action);
>>>> +#endif
>>>>
>>>>  unsigned int tcf_exts_num_actions(struct tcf_exts *exts)  { @@
>>>> -3743,6 +3755,15 @@ unsigned int tcf_exts_num_actions(struct
>>>> tcf_exts
>>>> *exts)  }  EXPORT_SYMBOL(tcf_exts_num_actions);
>>>>
>>>> +unsigned int tcf_act_num_actions_single(struct tc_action *act) {
>>>> +	if (is_tcf_pedit(act))
>>>> +		return tcf_pedit_nkeys(act);
>>>> +	else
>>>> +		return 1;
>>>> +}
>>>> +EXPORT_SYMBOL(tcf_act_num_actions_single);
>>>> +
>>>>  #ifdef CONFIG_NET_CLS_ACT
>>>>  static int tcf_qevent_parse_block_index(struct nlattr *block_index_attr,
>>>>  					u32 *p_block_index,


^ permalink raw reply	[flat|nested] 58+ messages in thread

* RE: [RFC/PATCH net-next v3 7/8] flow_offload: add reoffload process to update hw_count
  2021-10-29 17:31   ` Vlad Buslov
@ 2021-11-02  9:20     ` Baowen Zheng
  0 siblings, 0 replies; 58+ messages in thread
From: Baowen Zheng @ 2021-11-02  9:20 UTC (permalink / raw)
  To: Vlad Buslov, Simon Horman
  Cc: netdev, Jamal Hadi Salim, Roi Dayan, Ido Schimmel, Cong Wang,
	Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers

On October 30, 2021 1:31 AM, Vlad Buslov <vladbu@nvidia.com> wrote:
>On Thu 28 Oct 2021 at 14:06, Simon Horman <simon.horman@corigine.com>
>wrote:
>> From: Baowen Zheng <baowen.zheng@corigine.com>
>>
>> Add reoffload process to update hw_count when driver is inserted or
>> removed.
>>
>> When reoffloading actions, we still offload the actions that are added
>> independent of filters.
>>
>> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
>> Signed-off-by: Louis Peens <louis.peens@corigine.com>
>> Signed-off-by: Simon Horman <simon.horman@corigine.com>
>> ---
>>  include/net/act_api.h   |  24 +++++
>>  include/net/pkt_cls.h   |   5 +
>>  net/core/flow_offload.c |   5 +
>>  net/sched/act_api.c     | 213 ++++++++++++++++++++++++++++++++++++----
>>  4 files changed, 228 insertions(+), 19 deletions(-)
>>
>> diff --git a/include/net/act_api.h b/include/net/act_api.h
>> index 80a9d1e7d805..03ff39e347c3 100644
>> --- a/include/net/act_api.h
>> +++ b/include/net/act_api.h
>> @@ -7,6 +7,7 @@
>>  */
>>
>>  #include <linux/refcount.h>
>> +#include <net/flow_offload.h>
>>  #include <net/sch_generic.h>
>>  #include <net/pkt_sched.h>
>>  #include <net/net_namespace.h>
>> @@ -243,11 +244,26 @@ static inline void flow_action_hw_count_set(struct tc_action *act,
>>  	act->in_hw_count = hw_count;
>>  }
>>
>> +static inline void flow_action_hw_count_inc(struct tc_action *act,
>> +					    u32 hw_count)
>> +{
>> +	act->in_hw_count += hw_count;
>> +}
>> +
>> +static inline void flow_action_hw_count_dec(struct tc_action *act,
>> +					    u32 hw_count)
>> +{
>> +	act->in_hw_count = act->in_hw_count > hw_count ?
>> +			   act->in_hw_count - hw_count : 0;
>> +}
>> +
>>  void tcf_action_update_stats(struct tc_action *a, u64 bytes, u64 packets,
>>  			     u64 drops, bool hw);
>>  int tcf_action_copy_stats(struct sk_buff *, struct tc_action *, int);
>>  int tcf_action_offload_del(struct tc_action *action);
>>  int tcf_action_update_hw_stats(struct tc_action *action);
>> +int tcf_action_reoffload_cb(flow_indr_block_bind_cb_t *cb,
>> +			    void *cb_priv, bool add);
>>  int tcf_action_check_ctrlact(int action, struct tcf_proto *tp,
>>  			     struct tcf_chain **handle,
>>  			     struct netlink_ext_ack *newchain);
>> @@ -259,6 +275,14 @@ DECLARE_STATIC_KEY_FALSE(tcf_frag_xmit_count);
>>  #endif
>>
>>  int tcf_dev_queue_xmit(struct sk_buff *skb, int (*xmit)(struct sk_buff *skb));
>> +
>> +#else /* !CONFIG_NET_CLS_ACT */
>> +
>> +static inline int tcf_action_reoffload_cb(flow_indr_block_bind_cb_t *cb,
>> +					  void *cb_priv, bool add)
>> +{
>> +	return 0;
>> +}
>> +
>>  #endif /* CONFIG_NET_CLS_ACT */
>>
>>  static inline void tcf_action_stats_update(struct tc_action *a, u64 bytes,
>> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
>> index 88788b821f76..82ac631c50bc 100644
>> --- a/include/net/pkt_cls.h
>> +++ b/include/net/pkt_cls.h
>> @@ -284,6 +284,11 @@ static inline bool tc_act_flags_valid(u32 flags)
>>  	return flags ^ (TCA_ACT_FLAGS_SKIP_HW | TCA_ACT_FLAGS_SKIP_SW);
>>  }
>>
>> +static inline bool tc_act_bind(u32 flags)
>> +{
>> +	return !!(flags & TCA_ACT_FLAGS_BIND);
>> +}
>> +
>>  static inline void
>>  tcf_exts_stats_update(const struct tcf_exts *exts,
>>  		      u64 bytes, u64 packets, u64 drops, u64 lastuse,
>> diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
>> index 6676431733ef..d591204af6e0 100644
>> --- a/net/core/flow_offload.c
>> +++ b/net/core/flow_offload.c
>> @@ -1,6 +1,7 @@
>>  /* SPDX-License-Identifier: GPL-2.0 */
>>  #include <linux/kernel.h>
>>  #include <linux/slab.h>
>> +#include <net/act_api.h>
>>  #include <net/flow_offload.h>
>>  #include <linux/rtnetlink.h>
>>  #include <linux/mutex.h>
>> @@ -418,6 +419,8 @@ int flow_indr_dev_register(flow_indr_block_bind_cb_t *cb, void *cb_priv)
>>  	existing_qdiscs_register(cb, cb_priv);
>>  	mutex_unlock(&flow_indr_block_lock);
>>
>> +	tcf_action_reoffload_cb(cb, cb_priv, true);
>> +
>>  	return 0;
>>  }
>>  EXPORT_SYMBOL(flow_indr_dev_register);
>> @@ -472,6 +475,8 @@ void flow_indr_dev_unregister(flow_indr_block_bind_cb_t *cb, void *cb_priv,
>>
>>  	flow_block_indr_notify(&cleanup_list);
>>  	kfree(indr_dev);
>> +
>> +	tcf_action_reoffload_cb(cb, cb_priv, false);
>
>Don't know if it is a problem, but shouldn't tcf_action_reoffload_cb() be called
>before flow_block_indr_notify(), which calls
>flow_block_indr->cleanup() callbacks?
Thanks for bringing this issue to our attention. I think it totally makes sense.
We have not hit a problem with this in our current tests, but we will make the
change according to your review.
>>  }
>>  EXPORT_SYMBOL(flow_indr_dev_unregister);
>>
>> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
>> index 3893ffd91192..dce25d8f147b 100644
>> --- a/net/sched/act_api.c
>> +++ b/net/sched/act_api.c
>> @@ -638,6 +638,59 @@ EXPORT_SYMBOL(tcf_idrinfo_destroy);
>>
>>  static LIST_HEAD(act_base);
>>  static DEFINE_RWLOCK(act_mod_lock);
>> +/* since act ops id is stored in pernet subsystem list,
>> + * then there is no way to walk through only all the action
>> + * subsystem, so we keep tc action pernet ops id for
>> + * reoffload to walk through.
>> + */
>> +static LIST_HEAD(act_pernet_id_list);
>> +static DEFINE_MUTEX(act_id_mutex);
>> +struct tc_act_pernet_id {
>> +	struct list_head list;
>> +	unsigned int id;
>> +};
>> +
>> +static int tcf_pernet_add_id_list(unsigned int id)
>> +{
>> +	struct tc_act_pernet_id *id_ptr;
>> +	int ret = 0;
>> +
>> +	mutex_lock(&act_id_mutex);
>> +	list_for_each_entry(id_ptr, &act_pernet_id_list, list) {
>> +		if (id_ptr->id == id) {
>> +			ret = -EEXIST;
>> +			goto err_out;
>> +		}
>> +	}
>> +
>> +	id_ptr = kzalloc(sizeof(*id_ptr), GFP_KERNEL);
>> +	if (!id_ptr) {
>> +		ret = -ENOMEM;
>> +		goto err_out;
>> +	}
>> +	id_ptr->id = id;
>> +
>> +	list_add_tail(&id_ptr->list, &act_pernet_id_list);
>> +
>> +err_out:
>> +	mutex_unlock(&act_id_mutex);
>> +	return ret;
>> +}
>> +
>> +static void tcf_pernet_del_id_list(unsigned int id)
>> +{
>> +	struct tc_act_pernet_id *id_ptr;
>> +
>> +	mutex_lock(&act_id_mutex);
>> +	list_for_each_entry(id_ptr, &act_pernet_id_list, list) {
>> +		if (id_ptr->id == id) {
>> +			list_del(&id_ptr->list);
>> +			kfree(id_ptr);
>> +			break;
>> +		}
>> +	}
>> +	mutex_unlock(&act_id_mutex);
>> +}
>>
>>  int tcf_register_action(struct tc_action_ops *act,
>>  			struct pernet_operations *ops)
>> @@ -656,18 +709,30 @@ int tcf_register_action(struct tc_action_ops *act,
>>  	if (ret)
>>  		return ret;
>>
>> +	if (ops->id) {
>> +		ret = tcf_pernet_add_id_list(*ops->id);
>> +		if (ret)
>> +			goto id_err;
>> +	}
>> +
>>  	write_lock(&act_mod_lock);
>>  	list_for_each_entry(a, &act_base, head) {
>>  		if (act->id == a->id || (strcmp(act->kind, a->kind) == 0)) {
>> -			write_unlock(&act_mod_lock);
>> -			unregister_pernet_subsys(ops);
>> -			return -EEXIST;
>> +			ret = -EEXIST;
>> +			goto err_out;
>>  		}
>>  	}
>>  	list_add_tail(&act->head, &act_base);
>>  	write_unlock(&act_mod_lock);
>>
>>  	return 0;
>> +
>> +err_out:
>> +	write_unlock(&act_mod_lock);
>> +	tcf_pernet_del_id_list(*ops->id);
>> +id_err:
>> +	unregister_pernet_subsys(ops);
>> +	return ret;
>>  }
>>  EXPORT_SYMBOL(tcf_register_action);
>>
>> @@ -686,8 +751,11 @@ int tcf_unregister_action(struct tc_action_ops *act,
>>  		}
>>  	}
>>  	write_unlock(&act_mod_lock);
>> -	if (!err)
>> +	if (!err) {
>>  		unregister_pernet_subsys(ops);
>> +		if (ops->id)
>> +			tcf_pernet_del_id_list(*ops->id);
>> +	}
>>  	return err;
>>  }
>>  EXPORT_SYMBOL(tcf_unregister_action);
>> @@ -1175,15 +1243,11 @@ static int flow_action_init(struct flow_offload_action *fl_action,
>>  	return 0;
>>  }
>>
>> -static int tcf_action_offload_cmd(struct flow_offload_action *fl_act,
>> -				  u32 *hw_count,
>> -				  struct netlink_ext_ack *extack)
>> +static int tcf_action_offload_cmd_ex(struct flow_offload_action *fl_act,
>> +				     u32 *hw_count)
>>  {
>>  	int err;
>>
>> -	if (IS_ERR(fl_act))
>> -		return PTR_ERR(fl_act);
>> -
>>  	err = flow_indr_dev_setup_offload(NULL, NULL, TC_SETUP_ACT,
>>  					  fl_act, NULL, NULL);
>>  	if (err < 0)
>> @@ -1195,9 +1259,41 @@ static int tcf_action_offload_cmd(struct flow_offload_action *fl_act,
>>  	return 0;
>>  }
>>
>> +static int tcf_action_offload_cmd_cb_ex(struct flow_offload_action *fl_act,
>> +					u32 *hw_count,
>> +					flow_indr_block_bind_cb_t *cb,
>> +					void *cb_priv)
>> +{
>> +	int err;
>> +
>> +	err = cb(NULL, NULL, cb_priv, TC_SETUP_ACT, NULL, fl_act, NULL);
>> +	if (err < 0)
>> +		return err;
>> +
>> +	if (hw_count)
>> +		*hw_count = 1;
>> +
>> +	return 0;
>> +}
>> +
>> +static int tcf_action_offload_cmd(struct flow_offload_action *fl_act,
>> +				  u32 *hw_count,
>> +				  flow_indr_block_bind_cb_t *cb,
>> +				  void *cb_priv)
>> +{
>> +	if (IS_ERR(fl_act))
>> +		return PTR_ERR(fl_act);
>> +
>> +	return cb ? tcf_action_offload_cmd_cb_ex(fl_act, hw_count,
>> +						 cb, cb_priv) :
>> +		    tcf_action_offload_cmd_ex(fl_act, hw_count);
>> +}
>> +
>>  /* offload the tc command after inserted */
>> -static int tcf_action_offload_add(struct tc_action *action,
>> -				  struct netlink_ext_ack *extack)
>> +static int tcf_action_offload_add_ex(struct tc_action *action,
>> +				     struct netlink_ext_ack *extack,
>> +				     flow_indr_block_bind_cb_t *cb,
>> +				     void *cb_priv)
>>  {
>>  	bool skip_sw = tc_act_skip_sw(action->tcfa_flags);
>>  	struct tc_action *actions[TCA_ACT_MAX_PRIO] = {
>> @@ -1225,9 +1321,10 @@ static int tcf_action_offload_add(struct tc_action *action,
>>  		goto fl_err;
>>  	}
>>
>> -	err = tcf_action_offload_cmd(fl_action, &in_hw_count, extack);
>> +	err = tcf_action_offload_cmd(fl_action, &in_hw_count, cb, cb_priv);
>>  	if (!err)
>> -		flow_action_hw_count_set(action, in_hw_count);
>> +		cb ? flow_action_hw_count_inc(action, in_hw_count) :
>> +		     flow_action_hw_count_set(action, in_hw_count);
>>
>>  	if (skip_sw && !tc_act_in_hw(action))
>>  		err = -EINVAL;
>> @@ -1240,6 +1337,12 @@ static int tcf_action_offload_add(struct tc_action *action,
>>  	return err;
>>  }
>>
>> +static int tcf_action_offload_add(struct tc_action *action,
>> +				  struct netlink_ext_ack *extack)
>> +{
>> +	return tcf_action_offload_add_ex(action, extack, NULL, NULL);
>> +}
>> +
>>  int tcf_action_update_hw_stats(struct tc_action *action)
>>  {
>>  	struct flow_offload_action fl_act = {};
>> @@ -1252,7 +1355,7 @@ int tcf_action_update_hw_stats(struct tc_action *action)
>>  	if (err)
>>  		goto err_out;
>>
>> -	err = tcf_action_offload_cmd(&fl_act, NULL, NULL);
>> +	err = tcf_action_offload_cmd(&fl_act, NULL, NULL, NULL);
>>
>>  	if (!err && fl_act.stats.lastused) {
>>  		preempt_disable();
>> @@ -1274,7 +1377,9 @@ int tcf_action_update_hw_stats(struct tc_action *action)
>>  }
>>  EXPORT_SYMBOL(tcf_action_update_hw_stats);
>>
>> -int tcf_action_offload_del(struct tc_action *action)
>> +static int tcf_action_offload_del_ex(struct tc_action *action,
>> +				     flow_indr_block_bind_cb_t *cb,
>> +				     void *cb_priv)
>>  {
>>  	struct flow_offload_action fl_act;
>>  	u32 in_hw_count = 0;
>> @@ -1290,13 +1395,83 @@ int tcf_action_offload_del(struct tc_action *action)
>>  	if (err)
>>  		return err;
>>
>> -	err = tcf_action_offload_cmd(&fl_act, &in_hw_count, NULL);
>> -	if (err)
>> +	err = tcf_action_offload_cmd(&fl_act, &in_hw_count, cb, cb_priv);
>> +	if (err < 0)
>>  		return err;
>>
>> -	if (action->in_hw_count != in_hw_count)
>> +	if (!cb && action->in_hw_count != in_hw_count)
>>  		return -EINVAL;
>>
>> +	/* do not need to update hw state when deleting action */
>> +	if (cb && in_hw_count)
>> +		flow_action_hw_count_dec(action, in_hw_count);
>> +
>> +	return 0;
>> +}
>> +
>> +int tcf_action_offload_del(struct tc_action *action)
>> +{
>> +	return tcf_action_offload_del_ex(action, NULL, NULL);
>> +}
>> +
>> +int tcf_action_reoffload_cb(flow_indr_block_bind_cb_t *cb,
>> +			    void *cb_priv, bool add)
>> +{
>> +	struct tc_act_pernet_id *id_ptr;
>> +	struct tcf_idrinfo *idrinfo;
>> +	struct tc_action_net *tn;
>> +	struct tc_action *p;
>> +	unsigned int act_id;
>> +	unsigned long tmp;
>> +	unsigned long id;
>> +	struct idr *idr;
>> +	struct net *net;
>> +	int ret;
>> +
>> +	if (!cb)
>> +		return -EINVAL;
>> +
>> +	down_read(&net_rwsem);
>> +	mutex_lock(&act_id_mutex);
>> +
>> +	for_each_net(net) {
>> +		list_for_each_entry(id_ptr, &act_pernet_id_list, list) {
>> +			act_id = id_ptr->id;
>> +			tn = net_generic(net, act_id);
>> +			if (!tn)
>> +				continue;
>> +			idrinfo = tn->idrinfo;
>> +			if (!idrinfo)
>> +				continue;
>> +
>> +			mutex_lock(&idrinfo->lock);
>> +			idr = &idrinfo->action_idr;
>> +			idr_for_each_entry_ul(idr, p, tmp, id) {
>> +				if (IS_ERR(p) || tc_act_bind(p->tcfa_flags))
>> +					continue;
>> +				if (add) {
>> +					tcf_action_offload_add_ex(p, NULL, cb,
>> +								  cb_priv);
>> +					continue;
>> +				}
>> +
>> +				/* cb unregister to update hw count */
>> +				ret = tcf_action_offload_del_ex(p, cb, cb_priv);
>> +				if (ret < 0)
>> +					continue;
>> +				if (tc_act_skip_sw(p->tcfa_flags) &&
>> +				    !tc_act_in_hw(p)) {
>> +					ret = tcf_idr_release_unsafe(p);
>> +					if (ret == ACT_P_DELETED)
>> +						module_put(p->ops->owner);
>> +				}
>> +			}
>> +			mutex_unlock(&idrinfo->lock);
>> +		}
>> +	}
>> +	mutex_unlock(&act_id_mutex);
>> +	up_read(&net_rwsem);
>> +
>>  	return 0;
>>  }
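
For readers following the hw_count bookkeeping in the quoted patch, here is a
minimal userspace sketch of the three update rules: set on the initial offload
path, increment when a newly registered callback accepts the action, and a
decrement that saturates at zero when a callback goes away. The struct
act_counter type and the hw_count_* names are hypothetical stand-ins for
struct tc_action and the flow_action_hw_count_* helpers; this is not kernel
code and it ignores locking entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the in_hw_count field of struct tc_action. */
struct act_counter {
	uint32_t in_hw_count;
};

/* Initial offload: the count is whatever the offload call reported. */
static void hw_count_set(struct act_counter *act, uint32_t hw_count)
{
	assert(act);
	act->in_hw_count = hw_count;
}

/* A driver callback registered later and accepted the action. */
static void hw_count_inc(struct act_counter *act, uint32_t hw_count)
{
	assert(act);
	act->in_hw_count += hw_count;
}

/* A driver callback went away: decrement, but never wrap below zero. */
static void hw_count_dec(struct act_counter *act, uint32_t hw_count)
{
	assert(act);
	act->in_hw_count = act->in_hw_count > hw_count ?
			   act->in_hw_count - hw_count : 0;
}
```

The saturating decrement matters because in_hw_count is unsigned: a plain
subtraction on an unbalanced unregister would wrap to a huge value and the
action would look permanently offloaded.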


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-11-01  7:38               ` Vlad Buslov
@ 2021-11-02 12:39                 ` Simon Horman
  2021-11-03  7:57                   ` Baowen Zheng
  0 siblings, 1 reply; 58+ messages in thread
From: Simon Horman @ 2021-11-02 12:39 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Baowen Zheng, Jamal Hadi Salim, netdev, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers

On Mon, Nov 01, 2021 at 09:38:34AM +0200, Vlad Buslov wrote:
> On Mon 01 Nov 2021 at 05:29, Baowen Zheng <baowen.zheng@corigine.com> wrote:
> > On 2021-10-31 9:31 PM, Jamal Hadi Salim wrote:
> >>On 2021-10-30 22:27, Baowen Zheng wrote:
> >>> Thanks for your review, after some consideration, I think I understand what

..

> >>Let me use an example to illustrate my concern:
> >>
> >>#add a policer offload it
> >>tc actions add action police skip_sw rate ... index 20
> >>#now add filter1 which is offloaded
> >>tc filter add dev $DEV1 proto ip parent ffff: flower \
> >>     skip_sw ip_proto tcp action police index 20
> >>#add filter2 likewise offloaded
> >>tc filter add dev $DEV1 proto ip parent ffff: flower \
> >>     skip_sw ip_proto udp action police index 20
> >>
> >>All good so far...
> >>#Now add a filter3 which is s/w only
> >>tc filter add dev $DEV1 proto ip parent ffff: flower \
> >>     skip_hw ip_proto icmp action police index 20
> >>
> >>filter3 should not be allowed.
> >>
> >>If we had added the policer without skip_sw and without skip_hw then i think
> >>filter3 should have been legal (we just need to account for stats in_hw vs
> >>in_sw).
> >>
> >>Not sure if that makes sense (and addresses Vlad's earlier comment).
> >>
> > I think the cases you mentioned make sense to us. But what Vlad concerns is the use
> > case as: 
> > #add a policer offload it
> > tc actions add action police skip_sw rate ... index 20
> > #now add filter4 which can't be  offloaded
> > tc filter add dev $DEV1 proto ip parent ffff: flower \
> > ip_proto tcp action police index 20
> > it is possible the filter4 can't be offloaded, then filter4 will run in software,
> > should this be legal? 
> > Originally I think this is legal, but as comments of Vlad, this should not be legal, since the action
> > will not be executed in software. I think what Vlad concerns is do we really need skip_sw flag for
> > an action? If a packet matches the filter in software, the action should not be skip_sw. 
> > If we choose to omit the skip_sw flag and just keep skip_hw, it will simplify our work. 
> > Of course, we can also keep skip_sw by adding more check to avoid the above case. 
> >
> > Vlad, I am not sure if I understand your idea correctly. 
> 
> My suggestion was to forgo the skip_sw flag for shared action offload
> and, consecutively, remove the validation code, not to add even more
> checks. I still don't see a practical case where skip_sw shared action
> is useful. But I don't have any strong feelings about this flag, so if
> Jamal thinks it is necessary, then fine by me.

FWIW, my feelings are the same as Vlad's.

I think these flags add complexity that would be nice to avoid.
But if Jamal thinks it's necessary, then including the flags implementation
is fine by me.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device
  2021-11-01  8:01     ` Vlad Buslov
@ 2021-11-02 12:51       ` Simon Horman
  2021-11-02 15:33         ` Vlad Buslov
  0 siblings, 1 reply; 58+ messages in thread
From: Simon Horman @ 2021-11-02 12:51 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Jamal Hadi Salim, Oz Shlomo, netdev, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers

On Mon, Nov 01, 2021 at 10:01:28AM +0200, Vlad Buslov wrote:
> On Sun 31 Oct 2021 at 15:40, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> > On 2021-10-31 05:50, Oz Shlomo wrote:
> >> 
> >> On 10/28/2021 2:06 PM, Simon Horman wrote:

...

> >> Actions are also (implicitly) instantiated when filters are created.
> >> In the following example the mirred action instance (created by the first
> >> filter) is shared by the second filter:
> >> tc filter add dev $DEV1 proto ip parent ffff: flower \
> >>      ip_proto tcp action mirred egress redirect dev $DEV3
> >> tc filter add dev $DEV2 proto ip parent ffff: flower \
> >>      ip_proto tcp action mirred index 1
> >> 
> >
> > I sure hope this is supported. At least the discussions so far
> > are a nod in that direction...
> > I know there is hardware that is not capable of achieving this
> > (little CPE type devices) but lets not make that the common case.
> 
> Looks like it isn't supported in this change since
> tcf_action_offload_add() is only called by tcf_action_init() when BIND
> flag is not set (the flag is always set when called from cls code).
> Moreover, I don't think it is good idea to support such use-case because
> that would require to increase number of calls to driver offload
> infrastructure from 1 per filter to 1+number_of_actions, which would
> significantly impact insertion rate.

Hi,

I feel that I am missing some very obvious point here.

But from my perspective the use case described by Oz is supported
by existing offload of the flower classifier (since ~4.13 IIRC).

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device
  2021-11-02 12:51       ` Simon Horman
@ 2021-11-02 15:33         ` Vlad Buslov
  2021-11-02 16:15           ` Simon Horman
  0 siblings, 1 reply; 58+ messages in thread
From: Vlad Buslov @ 2021-11-02 15:33 UTC (permalink / raw)
  To: Simon Horman
  Cc: Jamal Hadi Salim, Oz Shlomo, netdev, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers

On Tue 02 Nov 2021 at 14:51, Simon Horman <simon.horman@corigine.com> wrote:
> On Mon, Nov 01, 2021 at 10:01:28AM +0200, Vlad Buslov wrote:
>> On Sun 31 Oct 2021 at 15:40, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>> > On 2021-10-31 05:50, Oz Shlomo wrote:
>> >> 
>> >> On 10/28/2021 2:06 PM, Simon Horman wrote:
>
> ...
>
>> >> Actions are also (implicitly) instantiated when filters are created.
>> >> In the following example the mirred action instance (created by the first
>> >> filter) is shared by the second filter:
>> >> tc filter add dev $DEV1 proto ip parent ffff: flower \
>> >>      ip_proto tcp action mirred egress redirect dev $DEV3
>> >> tc filter add dev $DEV2 proto ip parent ffff: flower \
>> >>      ip_proto tcp action mirred index 1
>> >> 
>> >
>> > I sure hope this is supported. At least the discussions so far
>> > are a nod in that direction...
>> > I know there is hardware that is not capable of achieving this
>> > (little CPE type devices) but lets not make that the common case.
>> 
>> Looks like it isn't supported in this change since
>> tcf_action_offload_add() is only called by tcf_action_init() when BIND
>> flag is not set (the flag is always set when called from cls code).
>> Moreover, I don't think it is good idea to support such use-case because
>> that would require to increase number of calls to driver offload
>> infrastructure from 1 per filter to 1+number_of_actions, which would
>> significantly impact insertion rate.
>
> Hi,
>
> I feel that I am missing some very obvious point here.
>
> But from my perspective the use case described by Oz is supported
> by existing offload of the flower classifier (since ~4.13 IIRC).

The mlx5 driver can't support such a case without an infrastructure change
in the kernel, for the following reasons:

- Action index is not provided by flow_action offload infrastructure for
  most of the actions, so there is no way for driver to determine
  whether the action is shared.

- If we extend the infrastructure to always provide tcfa_index (a
  trivial change), there would not be much use for it because there is
  no way to properly update shared action counters without
  infrastructure code similar to what you implemented as part of this
  series.

How do you support shared actions created through cls_api in your
driver, considering the described limitations?


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device
  2021-11-02 15:33         ` Vlad Buslov
@ 2021-11-02 16:15           ` Simon Horman
  2021-11-03 10:56             ` Oz Shlomo
  0 siblings, 1 reply; 58+ messages in thread
From: Simon Horman @ 2021-11-02 16:15 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Jamal Hadi Salim, Oz Shlomo, netdev, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers

On Tue, Nov 02, 2021 at 05:33:14PM +0200, Vlad Buslov wrote:
> On Tue 02 Nov 2021 at 14:51, Simon Horman <simon.horman@corigine.com> wrote:
> > On Mon, Nov 01, 2021 at 10:01:28AM +0200, Vlad Buslov wrote:
> >> On Sun 31 Oct 2021 at 15:40, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> >> > On 2021-10-31 05:50, Oz Shlomo wrote:
> >> >> 
> >> >> On 10/28/2021 2:06 PM, Simon Horman wrote:
> >
> > ...
> >
> >> >> Actions are also (implicitly) instantiated when filters are created.
> >> >> In the following example the mirred action instance (created by the first
> >> >> filter) is shared by the second filter:
> >> >> tc filter add dev $DEV1 proto ip parent ffff: flower \
> >> >>      ip_proto tcp action mirred egress redirect dev $DEV3
> >> >> tc filter add dev $DEV2 proto ip parent ffff: flower \
> >> >>      ip_proto tcp action mirred index 1
> >> >> 
> >> >
> >> > I sure hope this is supported. At least the discussions so far
> >> > are a nod in that direction...
> >> > I know there is hardware that is not capable of achieving this
> >> > (little CPE type devices) but lets not make that the common case.
> >> 
> >> Looks like it isn't supported in this change since
> >> tcf_action_offload_add() is only called by tcf_action_init() when BIND
> >> flag is not set (the flag is always set when called from cls code).
> >> Moreover, I don't think it is good idea to support such use-case because
> >> that would require to increase number of calls to driver offload
> >> infrastructure from 1 per filter to 1+number_of_actions, which would
> >> significantly impact insertion rate.
> >
> > Hi,
> >
> > I feel that I am missing some very obvious point here.
> >
> > But from my perspective the use case described by Oz is supported
> > by existing offload of the flower classifier (since ~4.13 IIRC).
> 
> Mlx5 driver can't support such case without infrastructure change in
> kernel for following reasons:
> 
> - Action index is not provided by flow_action offload infrastructure for
>   most of the actions, so there is no way for driver to determine
>   whether the action is shared.
> 
> - If we extend the infrastructure to always provide tcfa_index (a
>   trivial change), there would be not much use for it because there is
>   no way to properly update shared action counters without
>   infrastructure code similar to what you implemented as part of this
>   series.
> 
> How do you support shared actions created through cls_api in your
> driver, considering described limitations?

Thanks,

I misread the use case described by Oz, but I believe I understand it now.

I agree that the case described is neither currently supported, nor
supported by this patchset (to be honest I for one had not considered it).

So, I think the question is: does supporting this use-case make sense - from
implementation, use-case, and consistency perspectives - in the context of
this patchset?

Am I on the right track?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* RE: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-11-02 12:39                 ` Simon Horman
@ 2021-11-03  7:57                   ` Baowen Zheng
  2021-11-03 10:13                     ` Jamal Hadi Salim
  0 siblings, 1 reply; 58+ messages in thread
From: Baowen Zheng @ 2021-11-03  7:57 UTC (permalink / raw)
  To: Simon Horman, Vlad Buslov, Jamal Hadi Salim
  Cc: netdev, Roi Dayan, Ido Schimmel, Cong Wang, Jiri Pirko,
	Baowen Zheng, Louis Peens, oss-drivers

On November 2, 2021 8:40 PM, Simon Horman wrote:
>On Mon, Nov 01, 2021 at 09:38:34AM +0200, Vlad Buslov wrote:
>> On Mon 01 Nov 2021 at 05:29, Baowen Zheng <baowen.zheng@corigine.com> wrote:
>> > On 2021-10-31 9:31 PM, Jamal Hadi Salim wrote:
>> >>On 2021-10-30 22:27, Baowen Zheng wrote:
>> >>> Thanks for your review, after some consideration, I think I
>> >>> understand what
>
>..
>
>> >>Let me use an example to illustrate my concern:
>> >>
>> >>#add a policer offload it
>> >>tc actions add action police skip_sw rate ... index 20
>> >>#now add filter1 which is offloaded
>> >>tc filter add dev $DEV1 proto ip parent ffff: flower \
>> >>     skip_sw ip_proto tcp action police index 20
>> >>#add filter2 likewise offloaded
>> >>tc filter add dev $DEV1 proto ip parent ffff: flower \
>> >>     skip_sw ip_proto udp action police index 20
>> >>
>> >>All good so far...
>> >>#Now add a filter3 which is s/w only
>> >>tc filter add dev $DEV1 proto ip parent ffff: flower \
>> >>     skip_hw ip_proto icmp action police index 20
>> >>
>> >>filter3 should not be allowed.
>> >>
>> >>If we had added the policer without skip_sw and without skip_hw then
>> >>i think
>> >>filter3 should have been legal (we just need to account for stats
>> >>in_hw vs in_sw).
>> >>
>> >>Not sure if that makes sense (and addresses Vlad's earlier comment).
>> >>
>> > I think the cases you mentioned make sense to us. But what Vlad
>> > concerns is the use case as:
>> > #add a policer offload it
>> > tc actions add action police skip_sw rate ... index 20
>> > #now add filter4 which can't be  offloaded
>> > tc filter add dev $DEV1 proto ip parent ffff: flower \
>> > ip_proto tcp action police index 20
>> > it is possible the filter4 can't be offloaded, then filter4 will run in
>> > software, should this be legal?
>> > Originally I think this is legal, but as comments of Vlad, this
>> > should not be legal, since the action will not be executed in
>> > software. I think what Vlad concerns is do we really need skip_sw flag for
>> > an action? If a packet matches the filter in software, the action should not be
>> > skip_sw.
>> > If we choose to omit the skip_sw flag and just keep skip_hw, it will simplify
>> > our work.
>> > Of course, we can also keep skip_sw by adding more check to avoid the
>> > above case.
>> >
>> > Vlad, I am not sure if I understand your idea correctly.
>>
>> My suggestion was to forgo the skip_sw flag for shared action offload
>> and, consecutively, remove the validation code, not to add even more
>> checks. I still don't see a practical case where skip_sw shared action
>> is useful. But I don't have any strong feelings about this flag, so if
>> Jamal thinks it is necessary, then fine by me.
>
>FWIW, my feelings are the same as Vlad's.
>
>I think these flags add complexity that would be nice to avoid.
>But if Jamal thinks it's necessary, then including the flags implementation is
>fine by me.
Thanks Simon. Jamal, do you think it is necessary to keep the skip_sw flag so that the
user can specify that the action should not run in software?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-11-03  7:57                   ` Baowen Zheng
@ 2021-11-03 10:13                     ` Jamal Hadi Salim
  2021-11-03 11:30                       ` Baowen Zheng
  0 siblings, 1 reply; 58+ messages in thread
From: Jamal Hadi Salim @ 2021-11-03 10:13 UTC (permalink / raw)
  To: Baowen Zheng, Simon Horman, Vlad Buslov
  Cc: netdev, Roi Dayan, Ido Schimmel, Cong Wang, Jiri Pirko,
	Baowen Zheng, Louis Peens, oss-drivers, Oz Shlomo

On 2021-11-03 03:57, Baowen Zheng wrote:
> On November 2, 2021 8:40 PM, Simon Horman wrote:
>> On Mon, Nov 01, 2021 at 09:38:34AM +0200, Vlad Buslov wrote:
>>> On Mon 01 Nov 2021 at 05:29, Baowen Zheng

[..]
>>>
>>> My suggestion was to forgo the skip_sw flag for shared action offload
>>> and, consecutively, remove the validation code, not to add even more
>>> checks. I still don't see a practical case where skip_sw shared action
>>> is useful. But I don't have any strong feelings about this flag, so if
>>> Jamal thinks it is necessary, then fine by me.
>>
>> FWIW, my feelings are the same as Vlad's.
>>
>> I think these flags add complexity that would be nice to avoid.
>> But if Jamal thinks it's necessary, then including the flags implementation is
>> fine by me.
> Thanks Simon. Jamal, do you think it is necessary to keep the skip_sw flag for user to specify
> the action should not run in software?
> 

Just catching up with the discussion...
IMO, we need the flag. Oz indicated a requirement to be able to
identify the action by its index. So if a specific action is added
with skip_sw (standalone or alongside a filter) then it can't be
used by a skip_hw filter. To illustrate with an extended example:

#filter 1, skip_sw
tc filter add dev $DEV1 proto ip parent ffff: flower \
     skip_sw ip_proto tcp action police blah index 10

#filter 2, skip_hw
tc filter add dev $DEV1 proto ip parent ffff: flower \
     skip_hw ip_proto udp action police index 10

Filter2 should be illegal.
And when I dump the actions like so:
tc actions ls action police

For debuggability, I should see index 10 clearly marked with
the skip_sw flag.

The other example i gave earlier which showed the sharing
of actions:

#add a policer action and offload it
tc actions add action police skip_sw rate ... index 20
#now add filter1 which is offloaded using offloaded policer
tc filter add dev $DEV1 proto ip parent ffff: flower \
     skip_sw ip_proto tcp action police index 20
#add filter2 likewise offloaded
tc filter add dev $DEV1 proto ip parent ffff: flower \
     skip_sw ip_proto udp action police index 20

All good and filter 1 and 2 are sharing policer instance with
index 20.

#Now add a filter3 which is s/w only
tc filter add dev $DEV1 proto ip parent ffff: flower \
     skip_hw ip_proto icmp action police index 20

filter3 should not be allowed.
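
The rule the examples above imply can be sketched as a small predicate. This
is only an illustration under stated assumptions, not code from the series:
the SKETCH_* flag bits and the function name are hypothetical, the
skip_hw-action/skip_sw-filter rejection is assumed by symmetry, and a filter
that carries neither flag is simply allowed here even though whether that
should be legal for a skip_sw action is still being discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch; not the kernel's actual values. */
#define SKETCH_SKIP_HW   (1u << 0)
#define SKETCH_SKIP_SW   (1u << 1)
#define SKETCH_SKIP_BOTH (SKETCH_SKIP_HW | SKETCH_SKIP_SW)

/* May a filter with filter_flags bind a shared action with action_flags? */
static bool filter_may_bind_action(uint32_t filter_flags, uint32_t action_flags)
{
	/* Precondition: neither side asks to skip both software and hardware. */
	assert((filter_flags & SKETCH_SKIP_BOTH) != SKETCH_SKIP_BOTH);
	assert((action_flags & SKETCH_SKIP_BOTH) != SKETCH_SKIP_BOTH);

	/* A hardware-only action cannot serve a software-only filter. */
	if ((action_flags & SKETCH_SKIP_SW) && (filter_flags & SKETCH_SKIP_HW))
		return false;

	/* Assumed by symmetry: a software-only action and a hardware-only filter. */
	if ((action_flags & SKETCH_SKIP_HW) && (filter_flags & SKETCH_SKIP_SW))
		return false;

	/* Everything else is allowed in this sketch, including unflagged filters. */
	return true;
}
```

With these definitions, filter1 and filter2 (skip_sw filter, skip_sw action)
pass, filter3 (skip_hw filter, skip_sw action) is rejected, and a skip_hw
filter using a policer added without either flag is accepted.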

cheers,
jamal

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device
  2021-11-02 16:15           ` Simon Horman
@ 2021-11-03 10:56             ` Oz Shlomo
  0 siblings, 0 replies; 58+ messages in thread
From: Oz Shlomo @ 2021-11-03 10:56 UTC (permalink / raw)
  To: Simon Horman, Vlad Buslov
  Cc: Jamal Hadi Salim, netdev, Roi Dayan, Ido Schimmel, Cong Wang,
	Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers



On 11/2/2021 6:15 PM, Simon Horman wrote:
> On Tue, Nov 02, 2021 at 05:33:14PM +0200, Vlad Buslov wrote:
>> On Tue 02 Nov 2021 at 14:51, Simon Horman <simon.horman@corigine.com> wrote:
>>> On Mon, Nov 01, 2021 at 10:01:28AM +0200, Vlad Buslov wrote:
>>>> On Sun 31 Oct 2021 at 15:40, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>>>>> On 2021-10-31 05:50, Oz Shlomo wrote:
>>>>>>
>>>>>> On 10/28/2021 2:06 PM, Simon Horman wrote:
>>>
>>> ...
>>>
>>>>>> Actions are also (implicitly) instantiated when filters are created.
>>>>>> In the following example the mirred action instance (created by the first
>>>>>> filter) is shared by the second filter:
>>>>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>>>>       ip_proto tcp action mirred egress redirect dev $DEV3
>>>>>> tc filter add dev $DEV2 proto ip parent ffff: flower \
>>>>>>       ip_proto tcp action mirred index 1
>>>>>>
>>>>>
>>>>> I sure hope this is supported. At least the discussions so far
>>>>> are a nod in that direction...
>>>>> I know there is hardware that is not capable of achieving this
>>>>> (little CPE type devices) but lets not make that the common case.
>>>>
>>>> Looks like it isn't supported in this change since
>>>> tcf_action_offload_add() is only called by tcf_action_init() when BIND
>>>> flag is not set (the flag is always set when called from cls code).
>>>> Moreover, I don't think it is good idea to support such use-case because
>>>> that would require to increase number of calls to driver offload
>>>> infrastructure from 1 per filter to 1+number_of_actions, which would
>>>> significantly impact insertion rate.
>>>
>>> Hi,
>>>
>>> I feel that I am missing some very obvious point here.
>>>
>>> But from my perspective the use case described by Oz is supported
>>> by existing offload of the flower classifier (since ~4.13 IIRC).
>>
>> Mlx5 driver can't support such case without infrastructure change in
>> kernel for following reasons:
>>
>> - Action index is not provided by flow_action offload infrastructure for
>>    most of the actions, so there is no way for driver to determine
>>    whether the action is shared.
>>
>> - If we extend the infrastructure to always provide tcfa_index (a
>>    trivial change), there would be not much use for it because there is
>>    no way to properly update shared action counters without
>>    infrastructure code similar to what you implemented as part of this
>>    series.
>>
>> How do you support shared actions created through cls_api in your
>> driver, considering described limitations?
> 
> Thanks,
> 
> I misread the use case described by Oz, but I believe I understand it now.
> 
> I agree that the case described is neither currently supported, nor
> supported by this patchset (to be honest I for one had not considered it).
> 
> So, I think the question is: does supporting this use-case make sense - from
> implementation, use-case, and consistency perspectives - in the context of
> this patchset?
> 
> Am I on the right track?
> 

Currently we don't have a specific application use case for sharing actions that were created by tc
filters. However, we do have a future use case in mind.
We could add such functionality on top of this series when a use case materializes.
Perhaps, at that point, we can also introduce a control flag in order to avoid unnecessary
insertion rate degradation.


* RE: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-11-03 10:13                     ` Jamal Hadi Salim
@ 2021-11-03 11:30                       ` Baowen Zheng
  2021-11-03 12:33                         ` Jamal Hadi Salim
  0 siblings, 1 reply; 58+ messages in thread
From: Baowen Zheng @ 2021-11-03 11:30 UTC (permalink / raw)
  To: Jamal Hadi Salim, Simon Horman, Vlad Buslov
  Cc: netdev, Roi Dayan, Ido Schimmel, Cong Wang, Jiri Pirko,
	Baowen Zheng, Louis Peens, oss-drivers, Oz Shlomo

On November 3, 2021 6:14 PM, Jamal Hadi Salim wrote:
>On 2021-11-03 03:57, Baowen Zheng wrote:
>> On November 2, 2021 8:40 PM, Simon Horman wrote:
>>> On Mon, Nov 01, 2021 at 09:38:34AM +0200, Vlad Buslov wrote:
>>>> On Mon 01 Nov 2021 at 05:29, Baowen Zheng
>
>[..]
>>>>
>>>> My suggestion was to forgo the skip_sw flag for shared action
>>>> offload and, consecutively, remove the validation code, not to add
>>>> even more checks. I still don't see a practical case where skip_sw
>>>> shared action is useful. But I don't have any strong feelings about
>>>> this flag, so if Jamal thinks it is necessary, then fine by me.
>>>
>>> FWIIW, my feelings are the same as Vlad's.
>>>
>>> I think these flags add complexity that would be nice to avoid.
>>> But if Jamal thinks its necessary, then including the flags
>>> implementation is fine by me.
>> Thanks Simon. Jamal, do you think it is necessary to keep the skip_sw
>> flag for user to specify the action should not run in software?
>>
>
>Just catching up with discussion...
>IMO, we need the flag. Oz indicated with requirement to be able to identify
>the action with an index. So if a specific action is added for skip_sw (as
>standalone or alongside a filter) then it cant be used for skip_hw. To illustrate
>using extended example:
>
>#filter 1, skip_sw
>tc filter add dev $DEV1 proto ip parent ffff: flower \
>     skip_sw ip_proto tcp action police blah index 10
>
>#filter 2, skip_hw
>tc filter add dev $DEV1 proto ip parent ffff: flower \
>     skip_hw ip_proto udp action police index 10
>
>Filter2 should be illegal.
>And when i dump the actions as so:
>tc actions ls action police
>
>For debugability, I should see index 10 clearly marked with the flag as skip_sw
>
>The other example i gave earlier which showed the sharing of actions:
>
>#add a policer action and offload it
>tc actions add action police skip_sw rate ... index 20
>#now add filter1 which is offloaded using offloaded policer
>tc filter add dev $DEV1 proto ip parent ffff: flower \
>     skip_sw ip_proto tcp action police index 20
>#add filter2 likewise offloaded
>tc filter add dev $DEV1 proto ip parent ffff: flower \
>     skip_sw ip_proto udp action police index 20
>
>All good and filter 1 and 2 are sharing policer instance with index 20.
>
>#Now add a filter3 which is s/w only
>tc filter add dev $DEV1 proto ip parent ffff: flower \
>     skip_hw ip_proto icmp action police index 20
>
>filter3 should not be allowed.
I think the use cases you mentioned above are clear to us. For this case:

#add a policer action and offload it
tc actions add action police skip_sw rate ... index 20
#Now add a filter4 which has no flag
tc filter add dev $DEV1 proto ip parent ffff: flower \
     ip_proto icmp action police index 20

Is filter4 legal? In principle it should be legal, but offloading filter4 may fail, in which case
it will run in software. The police action, however, should not run in software because it is skip_sw,
so I think filter4 should be illegal and we should not allow this case.
That is, if the action is skip_sw, then any filter that refers to this action should also be skip_sw.
WDYT?


* Re: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-11-03 11:30                       ` Baowen Zheng
@ 2021-11-03 12:33                         ` Jamal Hadi Salim
  2021-11-03 13:33                           ` Jamal Hadi Salim
  2021-11-03 13:37                           ` Baowen Zheng
  0 siblings, 2 replies; 58+ messages in thread
From: Jamal Hadi Salim @ 2021-11-03 12:33 UTC (permalink / raw)
  To: Baowen Zheng, Simon Horman, Vlad Buslov
  Cc: netdev, Roi Dayan, Ido Schimmel, Cong Wang, Jiri Pirko,
	Baowen Zheng, Louis Peens, oss-drivers, Oz Shlomo

On 2021-11-03 07:30, Baowen Zheng wrote:
> On November 3, 2021 6:14 PM, Jamal Hadi Salim wrote:
>> On 2021-11-03 03:57, Baowen Zheng wrote:
>>> On November 2, 2021 8:40 PM, Simon Horman wrote:
>>>> On Mon, Nov 01, 2021 at 09:38:34AM +0200, Vlad Buslov wrote:
>>>>> On Mon 01 Nov 2021 at 05:29, Baowen Zheng
>>
>> [..]
>>>>>
>>>>> My suggestion was to forgo the skip_sw flag for shared action
>>>>> offload and, consecutively, remove the validation code, not to add
>>>>> even more checks. I still don't see a practical case where skip_sw
>>>>> shared action is useful. But I don't have any strong feelings about
>>>>> this flag, so if Jamal thinks it is necessary, then fine by me.
>>>>
>>>> FWIIW, my feelings are the same as Vlad's.
>>>>
>>>> I think these flags add complexity that would be nice to avoid.
>>>> But if Jamal thinks its necessary, then including the flags
>>>> implementation is fine by me.
>>> Thanks Simon. Jamal, do you think it is necessary to keep the skip_sw
>>> flag for user to specify the action should not run in software?
>>>
>>
>> Just catching up with discussion...
>> IMO, we need the flag. Oz indicated with requirement to be able to identify
>> the action with an index. So if a specific action is added for skip_sw (as
>> standalone or alongside a filter) then it cant be used for skip_hw. To illustrate
>> using extended example:
>>
>> #filter 1, skip_sw
>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>      skip_sw ip_proto tcp action police blah index 10
>>
>> #filter 2, skip_hw
>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>      skip_hw ip_proto udp action police index 10
>>
>> Filter2 should be illegal.
>> And when i dump the actions as so:
>> tc actions ls action police
>>
>> For debugability, I should see index 10 clearly marked with the flag as skip_sw
>>
>> The other example i gave earlier which showed the sharing of actions:
>>
>> #add a policer action and offload it
>> tc actions add action police skip_sw rate ... index 20
>> #now add filter1 which is offloaded using offloaded policer
>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>      skip_sw ip_proto tcp action police index 20
>> #add filter2 likewise offloaded
>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>      skip_sw ip_proto udp action police index 20
>>
>> All good and filter 1 and 2 are sharing policer instance with index 20.
>>
>> #Now add a filter3 which is s/w only
>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>      skip_hw ip_proto icmp action police index 20
>>
>> filter3 should not be allowed.
> I think the use cases you mentioned above are clear for us. For the case:
> 
> #add a policer action and offload it
> tc actions add action police skip_sw rate ... index 20
> #Now add a filter4 which has no flag
> tc filter add dev $DEV1 proto ip parent ffff: flower \
>       ip_proto icmp action police index 20
> 
> Is filter4 legal? 

Yes, it is _based on current semantics_.
The reason is that when adding a filter and specifying neither
skip_sw nor skip_hw, it defaults to allowing both,
i.e. it is the same as skip_sw|skip_hw. You will need to have
counters for both s/w and h/w (which I think is taken care of today).


>basically, it should be legal, but since filter4 may be offloaded failed so
> it will run in software, you know the action police should not run in software with skip_sw,
> so I think filter4 should be illegal and we should not allow this case.
> That is if the action is skip_sw, then the filter refers to this action should also skip_sw.
> WDYT?

See above..

cheers,
jamal



* Re: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-11-03 12:33                         ` Jamal Hadi Salim
@ 2021-11-03 13:33                           ` Jamal Hadi Salim
  2021-11-03 13:38                             ` Simon Horman
  2021-11-03 14:03                             ` Baowen Zheng
  2021-11-03 13:37                           ` Baowen Zheng
  1 sibling, 2 replies; 58+ messages in thread
From: Jamal Hadi Salim @ 2021-11-03 13:33 UTC (permalink / raw)
  To: Baowen Zheng, Simon Horman, Vlad Buslov
  Cc: netdev, Roi Dayan, Ido Schimmel, Cong Wang, Jiri Pirko,
	Baowen Zheng, Louis Peens, oss-drivers, Oz Shlomo

On 2021-11-03 08:33, Jamal Hadi Salim wrote:
> On 2021-11-03 07:30, Baowen Zheng wrote:
>> On November 3, 2021 6:14 PM, Jamal Hadi Salim wrote:
>>> On 2021-11-03 03:57, Baowen Zheng wrote:
>>>> On November 2, 2021 8:40 PM, Simon Horman wrote:
>>>>> On Mon, Nov 01, 2021 at 09:38:34AM +0200, Vlad Buslov wrote:
>>>>>> On Mon 01 Nov 2021 at 05:29, Baowen Zheng
>>>
>>> [..]
>>>>>>
>>>>>> My suggestion was to forgo the skip_sw flag for shared action
>>>>>> offload and, consecutively, remove the validation code, not to add
>>>>>> even more checks. I still don't see a practical case where skip_sw
>>>>>> shared action is useful. But I don't have any strong feelings about
>>>>>> this flag, so if Jamal thinks it is necessary, then fine by me.
>>>>>
>>>>> FWIIW, my feelings are the same as Vlad's.
>>>>>
>>>>> I think these flags add complexity that would be nice to avoid.
>>>>> But if Jamal thinks its necessary, then including the flags
>>>>> implementation is fine by me.
>>>> Thanks Simon. Jamal, do you think it is necessary to keep the skip_sw
>>>> flag for user to specify the action should not run in software?
>>>>
>>>
>>> Just catching up with discussion...
>>> IMO, we need the flag. Oz indicated with requirement to be able to 
>>> identify
>>> the action with an index. So if a specific action is added for 
>>> skip_sw (as
>>> standalone or alongside a filter) then it cant be used for skip_hw. 
>>> To illustrate
>>> using extended example:
>>>
>>> #filter 1, skip_sw
>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>      skip_sw ip_proto tcp action police blah index 10
>>>
>>> #filter 2, skip_hw
>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>      skip_hw ip_proto udp action police index 10
>>>
>>> Filter2 should be illegal.
>>> And when i dump the actions as so:
>>> tc actions ls action police
>>>
>>> For debugability, I should see index 10 clearly marked with the flag 
>>> as skip_sw
>>>
>>> The other example i gave earlier which showed the sharing of actions:
>>>
>>> #add a policer action and offload it
>>> tc actions add action police skip_sw rate ... index 20
>>> #now add filter1 which is offloaded using offloaded policer
>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>      skip_sw ip_proto tcp action police index 20
>>> #add filter2 likewise offloaded
>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>      skip_sw ip_proto udp action police index 20
>>>
>>> All good and filter 1 and 2 are sharing policer instance with index 20.
>>>
>>> #Now add a filter3 which is s/w only
>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>      skip_hw ip_proto icmp action police index 20
>>>
>>> filter3 should not be allowed.
>> I think the use cases you mentioned above are clear for us. For the case:
>>
>> #add a policer action and offload it
>> tc actions add action police skip_sw rate ... index 20
>> #Now add a filter4 which has no flag
>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>       ip_proto icmp action police index 20
>>
>> Is filter4 legal? 
> 
> Yes it is _based on current semantics_.
> The reason is when adding a filter and specifying neither
> skip_sw nor skip_hw it defaults to allowing both.
> i.e is the same as skip_sw|skip_hw. You will need to have
> counters for both s/w and h/w (which i think is taken care of today).
> 
> 

Apologies, I would like to take this one back. Couldn't stop thinking
about it while sipping coffee ;->
To be safe, that should be illegal. The flags have to match _exactly_
for both action and filter to make any sense, i.e. in the above case
they do not.
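
A minimal sketch of the stricter "flags have to match exactly" rule, again
only as an illustration. The EX_SKIP_* constants and the helper name
ex_flags_match_exactly() are assumptions made for this example and do not
come from the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the skip flags; not the kernel uapi names. */
#define EX_SKIP_HW   (1u << 0) /* do not offload to hardware */
#define EX_SKIP_SW   (1u << 1) /* do not run in software */
#define EX_SKIP_MASK (EX_SKIP_HW | EX_SKIP_SW)

/*
 * Return true only when the action and the filter carry the same
 * combination of skip flags. Any other flag bits are ignored.
 */
bool ex_flags_match_exactly(uint32_t action_flags, uint32_t filter_flags)
{
	return (action_flags & EX_SKIP_MASK) == (filter_flags & EX_SKIP_MASK);
}
```

Under this rule filter4 (no flag) binding to the skip_sw policer is
rejected, because an empty flag set is not equal to skip_sw.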

cheers,
jamal



* RE: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-11-03 12:33                         ` Jamal Hadi Salim
  2021-11-03 13:33                           ` Jamal Hadi Salim
@ 2021-11-03 13:37                           ` Baowen Zheng
  1 sibling, 0 replies; 58+ messages in thread
From: Baowen Zheng @ 2021-11-03 13:37 UTC (permalink / raw)
  To: Jamal Hadi Salim, Simon Horman, Vlad Buslov
  Cc: netdev, Roi Dayan, Ido Schimmel, Cong Wang, Jiri Pirko,
	Baowen Zheng, Louis Peens, oss-drivers, Oz Shlomo

On November 3, 2021 8:34 PM, Jamal Hadi Salim wrote:
>On 2021-11-03 07:30, Baowen Zheng wrote:
>> On November 3, 2021 6:14 PM, Jamal Hadi Salim wrote:
>>> On 2021-11-03 03:57, Baowen Zheng wrote:
>>>> On November 2, 2021 8:40 PM, Simon Horman wrote:
>>>>> On Mon, Nov 01, 2021 at 09:38:34AM +0200, Vlad Buslov wrote:
>>>>>> On Mon 01 Nov 2021 at 05:29, Baowen Zheng
>>>
>>> [..]
>>>>>>
>>>>>> My suggestion was to forgo the skip_sw flag for shared action
>>>>>> offload and, consecutively, remove the validation code, not to add
>>>>>> even more checks. I still don't see a practical case where skip_sw
>>>>>> shared action is useful. But I don't have any strong feelings
>>>>>> about this flag, so if Jamal thinks it is necessary, then fine by me.
>>>>>
>>>>> FWIIW, my feelings are the same as Vlad's.
>>>>>
>>>>> I think these flags add complexity that would be nice to avoid.
>>>>> But if Jamal thinks its necessary, then including the flags
>>>>> implementation is fine by me.
>>>> Thanks Simon. Jamal, do you think it is necessary to keep the
>>>> skip_sw flag for user to specify the action should not run in software?
>>>>
>>>
>>> Just catching up with discussion...
>>> IMO, we need the flag. Oz indicated with requirement to be able to
>>> identify the action with an index. So if a specific action is added
>>> for skip_sw (as standalone or alongside a filter) then it cant be
>>> used for skip_hw. To illustrate using extended example:
>>>
>>> #filter 1, skip_sw
>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>      skip_sw ip_proto tcp action police blah index 10
>>>
>>> #filter 2, skip_hw
>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>      skip_hw ip_proto udp action police index 10
>>>
>>> Filter2 should be illegal.
>>> And when i dump the actions as so:
>>> tc actions ls action police
>>>
>>> For debugability, I should see index 10 clearly marked with the flag
>>> as skip_sw
>>>
>>> The other example i gave earlier which showed the sharing of actions:
>>>
>>> #add a policer action and offload it
>>> tc actions add action police skip_sw rate ... index 20
>>> #now add filter1 which is offloaded using offloaded policer
>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>      skip_sw ip_proto tcp action police index 20
>>> #add filter2 likewise offloaded
>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>      skip_sw ip_proto udp action police index 20
>>>
>>> All good and filter 1 and 2 are sharing policer instance with index 20.
>>>
>>> #Now add a filter3 which is s/w only
>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>      skip_hw ip_proto icmp action police index 20
>>>
>>> filter3 should not be allowed.
>> I think the use cases you mentioned above are clear for us. For the case:
>>
>> #add a policer action and offload it
>> tc actions add action police skip_sw rate ... index 20
>> #Now add a filter4 which has no flag
>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>       ip_proto icmp action police index 20
>>
>> Is filter4 legal?
>
>Yes it is _based on current semantics_.
>The reason is when adding a filter and specifying neither skip_sw nor skip_hw
>it defaults to allowing both.
>i.e is the same as skip_sw|skip_hw. You will need to have counters for both
>s/w and h/w (which i think is taken care of today).
Thanks, but our concern is not the counters but the behavior of this filter.
Since the filter runs in software and the action is skip_sw, the action will not execute in software.
So when a packet matches the filter, all the actions will execute except the skip_sw action.
I think that is not what we expect: we expect the packet to go through all the actions the filter refers to.
So I think in this case, filter4 should not be allowed.
WDYT?
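
A small sketch of the behavior described above, assuming that the software
path simply skips any action marked skip_sw. The struct ex_action type, the
EX_SKIP_SW constant and ex_sw_actions_run() are hypothetical and exist only
for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the skip_sw flag; not the kernel uapi name. */
#define EX_SKIP_SW (1u << 1) /* do not run in software */

/* Hypothetical, simplified view of an action bound to a filter. */
struct ex_action {
	const char *name;
	uint32_t flags;
};

/*
 * Count how many of the filter's actions the software path would run
 * if it skips every action that is marked skip_sw.
 */
size_t ex_sw_actions_run(const struct ex_action *acts, size_t n)
{
	size_t run = 0;

	for (size_t i = 0; i < n; i++) {
		if (!(acts[i].flags & EX_SKIP_SW))
			run++;
	}
	return run;
}
```

For a filter that refers to a skip_sw policer plus one ordinary action, the
count is 1 rather than 2, which is the mismatch being pointed out.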
>>basically, it should be legal, but since filter4 may be offloaded
>>failed so  it will run in software, you know the action police should
>>not run in software with skip_sw,  so I think filter4 should be illegal and we
>should not allow this case.
>> That is if the action is skip_sw, then the filter refers to this action should also
>skip_sw.
>> WDYT?
>
>See above..
>
>cheers,
>jamal



* Re: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-11-03 13:33                           ` Jamal Hadi Salim
@ 2021-11-03 13:38                             ` Simon Horman
  2021-11-03 14:05                               ` Jamal Hadi Salim
  2021-11-03 14:03                             ` Baowen Zheng
  1 sibling, 1 reply; 58+ messages in thread
From: Simon Horman @ 2021-11-03 13:38 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Baowen Zheng, Vlad Buslov, netdev, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers,
	Oz Shlomo

On Wed, Nov 03, 2021 at 09:33:52AM -0400, Jamal Hadi Salim wrote:
> On 2021-11-03 08:33, Jamal Hadi Salim wrote:
> > On 2021-11-03 07:30, Baowen Zheng wrote:
> > > On November 3, 2021 6:14 PM, Jamal Hadi Salim wrote:
> > > > On 2021-11-03 03:57, Baowen Zheng wrote:
> > > > > On November 2, 2021 8:40 PM, Simon Horman wrote:
> > > > > > On Mon, Nov 01, 2021 at 09:38:34AM +0200, Vlad Buslov wrote:
> > > > > > > On Mon 01 Nov 2021 at 05:29, Baowen Zheng
> > > > 
> > > > [..]
> > > > > > > 
> > > > > > > My suggestion was to forgo the skip_sw flag for shared action
> > > > > > > offload and, consecutively, remove the validation code, not to add
> > > > > > > even more checks. I still don't see a practical case where skip_sw
> > > > > > > shared action is useful. But I don't have any strong feelings about
> > > > > > > this flag, so if Jamal thinks it is necessary, then fine by me.
> > > > > > 
> > > > > > FWIIW, my feelings are the same as Vlad's.
> > > > > > 
> > > > > > I think these flags add complexity that would be nice to avoid.
> > > > > > But if Jamal thinks its necessary, then including the flags
> > > > > > implementation is fine by me.
> > > > > Thanks Simon. Jamal, do you think it is necessary to keep the skip_sw
> > > > > flag for user to specify the action should not run in software?
> > > > > 
> > > > 
> > > > Just catching up with discussion...
> > > > IMO, we need the flag. Oz indicated with requirement to be able
> > > > to identify
> > > > the action with an index. So if a specific action is added for
> > > > skip_sw (as
> > > > standalone or alongside a filter) then it cant be used for
> > > > skip_hw. To illustrate
> > > > using extended example:
> > > > 
> > > > #filter 1, skip_sw
> > > > tc filter add dev $DEV1 proto ip parent ffff: flower \
> > > >      skip_sw ip_proto tcp action police blah index 10
> > > > 
> > > > #filter 2, skip_hw
> > > > tc filter add dev $DEV1 proto ip parent ffff: flower \
> > > >      skip_hw ip_proto udp action police index 10
> > > > 
> > > > Filter2 should be illegal.
> > > > And when i dump the actions as so:
> > > > tc actions ls action police
> > > > 
> > > > For debugability, I should see index 10 clearly marked with the
> > > > flag as skip_sw
> > > > 
> > > > The other example i gave earlier which showed the sharing of actions:
> > > > 
> > > > #add a policer action and offload it
> > > > tc actions add action police skip_sw rate ... index 20
> > > > #now add filter1 which is offloaded using offloaded policer
> > > > tc filter add dev $DEV1 proto ip parent ffff: flower \
> > > >      skip_sw ip_proto tcp action police index 20
> > > > #add filter2 likewise offloaded
> > > > tc filter add dev $DEV1 proto ip parent ffff: flower \
> > > >      skip_sw ip_proto udp action police index 20
> > > > 
> > > > All good and filter 1 and 2 are sharing policer instance with index 20.
> > > > 
> > > > #Now add a filter3 which is s/w only
> > > > tc filter add dev $DEV1 proto ip parent ffff: flower \
> > > >      skip_hw ip_proto icmp action police index 20
> > > > 
> > > > filter3 should not be allowed.
> > > I think the use cases you mentioned above are clear for us. For the case:
> > > 
> > > #add a policer action and offload it
> > > tc actions add action police skip_sw rate ... index 20
> > > #Now add a filter4 which has no flag
> > > tc filter add dev $DEV1 proto ip parent ffff: flower \
> > >       ip_proto icmp action police index 20
> > > 
> > > Is filter4 legal?
> > 
> > Yes it is _based on current semantics_.
> > The reason is when adding a filter and specifying neither
> > skip_sw nor skip_hw it defaults to allowing both.
> > i.e is the same as skip_sw|skip_hw. You will need to have
> > counters for both s/w and h/w (which i think is taken care of today).
> > 
> > 
> 
> Apologies, i will like to take this one back. Couldnt stop thinking
> about it while sipping coffee;->
> To be safe that should be illegal. The flags have to match _exactly_
> for both  action and filter to make any sense. i.e in the above case
> they are not.

I could be wrong, but I would have thought that in this case the flow
is legal but is only added to hw (because the action doesn't exist in sw).

But if you prefer to make it illegal I guess that is ok too.
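
A sketch of the alternative reading above, in which the flow is legal and
simply ends up wherever both the filter and the action are allowed to run.
The EX_* constants and both helper names are assumptions for illustration,
not part of the series.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the skip flags; not the kernel uapi names. */
#define EX_SKIP_HW (1u << 0) /* do not offload to hardware */
#define EX_SKIP_SW (1u << 1) /* do not run in software */

/* Hypothetical placement bits used only by this sketch. */
#define EX_IN_SW (1u << 0)
#define EX_IN_HW (1u << 1)

/* Translate skip flags into the set of places an object may run. */
static uint32_t ex_allowed(uint32_t skip_flags)
{
	uint32_t where = EX_IN_SW | EX_IN_HW;

	if (skip_flags & EX_SKIP_SW)
		where &= ~EX_IN_SW;
	if (skip_flags & EX_SKIP_HW)
		where &= ~EX_IN_HW;
	return where;
}

/*
 * The flow may be installed only where both the action and the filter
 * are allowed. A result of 0 means there is no valid placement.
 */
uint32_t ex_filter_placement(uint32_t action_flags, uint32_t filter_flags)
{
	return ex_allowed(action_flags) & ex_allowed(filter_flags);
}
```

With this reading, filter4 against the skip_sw policer yields EX_IN_HW
only, and filter3 (skip_hw against a skip_sw action) yields 0.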


* RE: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-11-03 13:33                           ` Jamal Hadi Salim
  2021-11-03 13:38                             ` Simon Horman
@ 2021-11-03 14:03                             ` Baowen Zheng
  2021-11-03 14:16                               ` Jamal Hadi Salim
  1 sibling, 1 reply; 58+ messages in thread
From: Baowen Zheng @ 2021-11-03 14:03 UTC (permalink / raw)
  To: Jamal Hadi Salim, Simon Horman, Vlad Buslov
  Cc: netdev, Roi Dayan, Ido Schimmel, Cong Wang, Jiri Pirko,
	Baowen Zheng, Louis Peens, oss-drivers, Oz Shlomo

Thanks for your reply.
On November 3, 2021 9:34 PM, Jamal Hadi Salim wrote:
>On 2021-11-03 08:33, Jamal Hadi Salim wrote:
>> On 2021-11-03 07:30, Baowen Zheng wrote:
>>> On November 3, 2021 6:14 PM, Jamal Hadi Salim wrote:
>>>> On 2021-11-03 03:57, Baowen Zheng wrote:
>>>>> On November 2, 2021 8:40 PM, Simon Horman wrote:
>>>>>> On Mon, Nov 01, 2021 at 09:38:34AM +0200, Vlad Buslov wrote:
>>>>>>> On Mon 01 Nov 2021 at 05:29, Baowen Zheng
>>>>
>>>> [..]
>>>>>>>
>>>>>>> My suggestion was to forgo the skip_sw flag for shared action
>>>>>>> offload and, consecutively, remove the validation code, not to
>>>>>>> add even more checks. I still don't see a practical case where
>>>>>>> skip_sw shared action is useful. But I don't have any strong
>>>>>>> feelings about this flag, so if Jamal thinks it is necessary, then fine by
>me.
>>>>>>
>>>>>> FWIIW, my feelings are the same as Vlad's.
>>>>>>
>>>>>> I think these flags add complexity that would be nice to avoid.
>>>>>> But if Jamal thinks its necessary, then including the flags
>>>>>> implementation is fine by me.
>>>>> Thanks Simon. Jamal, do you think it is necessary to keep the
>>>>> skip_sw flag for user to specify the action should not run in software?
>>>>>
>>>>
>>>> Just catching up with discussion...
>>>> IMO, we need the flag. Oz indicated with requirement to be able to
>>>> identify the action with an index. So if a specific action is added
>>>> for skip_sw (as standalone or alongside a filter) then it cant be
>>>> used for skip_hw.
>>>> To illustrate
>>>> using extended example:
>>>>
>>>> #filter 1, skip_sw
>>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>>      skip_sw ip_proto tcp action police blah index 10
>>>>
>>>> #filter 2, skip_hw
>>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>>      skip_hw ip_proto udp action police index 10
>>>>
>>>> Filter2 should be illegal.
>>>> And when i dump the actions as so:
>>>> tc actions ls action police
>>>>
>>>> For debugability, I should see index 10 clearly marked with the flag
>>>> as skip_sw
>>>>
>>>> The other example i gave earlier which showed the sharing of actions:
>>>>
>>>> #add a policer action and offload it
>>>> tc actions add action police skip_sw rate ... index 20
>>>> #now add filter1 which is offloaded using offloaded policer
>>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>>      skip_sw ip_proto tcp action police index 20
>>>> #add filter2 likewise offloaded
>>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>>      skip_sw ip_proto udp action police index 20
>>>>
>>>> All good and filter 1 and 2 are sharing policer instance with index 20.
>>>>
>>>> #Now add a filter3 which is s/w only
>>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>>      skip_hw ip_proto icmp action police index 20
>>>>
>>>> filter3 should not be allowed.
>>> I think the use cases you mentioned above are clear for us. For the case:
>>>
>>> #add a policer action and offload it
>>> tc actions add action police skip_sw rate ... index 20
>>> #Now add a filter4 which has no flag
>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>       ip_proto icmp action police index 20
>>>
>>> Is filter4 legal?
>>
>> Yes it is _based on current semantics_.
>> The reason is when adding a filter and specifying neither skip_sw nor
>> skip_hw it defaults to allowing both.
>> i.e is the same as skip_sw|skip_hw. You will need to have counters for
>> both s/w and h/w (which i think is taken care of today).
>>
>>
>
>Apologies, i will like to take this one back. Couldnt stop thinking about it while
>sipping coffee;-> To be safe that should be illegal. The flags have to match
>_exactly_ for both  action and filter to make any sense. i.e in the above case
>they are not.
Thanks. I think we have reached agreement that filter4 is illegal.
Sorry to ask for more clarification, this time about another case that Vlad mentioned:
#add a policer action with skip_hw
tc actions add action police skip_hw rate ... index 20
#Now add a filter5 which has no flag
tc filter add dev $DEV1 proto ip parent ffff: flower \
       ip_proto icmp action police index 20
I think filter5 could be legal, since it will not run in hardware.
The driver's check will fail when it tries to offload this filter, so filter5 will only run in software.
WDYT?
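
A sketch of the fallback described for filter5, assuming that a filter
without skip flags is tried in hardware first and quietly stays in software
when the driver refuses it. The EX_* constants, ex_add_filter() and the
driver_accepts parameter are all hypothetical; the latter stands in for
whatever the real driver callback would return.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the skip flags; not the kernel uapi names. */
#define EX_SKIP_HW (1u << 0) /* do not offload to hardware */
#define EX_SKIP_SW (1u << 1) /* do not run in software */

/*
 * Decide the outcome of adding a filter.
 * driver_accepts models the driver's answer to the offload request.
 * On success returns 0 and reports in *in_hw whether the filter was
 * offloaded. Returns -EINVAL when the filter is skip_sw but could not
 * be placed in hardware, since it would then run nowhere.
 */
int ex_add_filter(uint32_t filter_flags, bool driver_accepts, bool *in_hw)
{
	*in_hw = false;

	/* Only try the hardware when the user did not ask to skip it. */
	if (!(filter_flags & EX_SKIP_HW) && driver_accepts)
		*in_hw = true;

	if ((filter_flags & EX_SKIP_SW) && !*in_hw)
		return -EINVAL;

	return 0;
}
```

For filter5 the driver would refuse the offload because the policer is
skip_hw, so the call succeeds with *in_hw left false, i.e. software only.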


* Re: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-11-03 13:38                             ` Simon Horman
@ 2021-11-03 14:05                               ` Jamal Hadi Salim
  0 siblings, 0 replies; 58+ messages in thread
From: Jamal Hadi Salim @ 2021-11-03 14:05 UTC (permalink / raw)
  To: Simon Horman
  Cc: Baowen Zheng, Vlad Buslov, netdev, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers,
	Oz Shlomo

On 2021-11-03 09:38, Simon Horman wrote:
> On Wed, Nov 03, 2021 at 09:33:52AM -0400, Jamal Hadi Salim wrote:
>> On 2021-11-03 08:33, Jamal Hadi Salim wrote:
>>> On 2021-11-03 07:30, Baowen Zheng wrote:
>>>> On November 3, 2021 6:14 PM, Jamal Hadi Salim wrote:
>>>>> On 2021-11-03 03:57, Baowen Zheng wrote:
>>>>>> On November 2, 2021 8:40 PM, Simon Horman wrote:
>>>>>>> On Mon, Nov 01, 2021 at 09:38:34AM +0200, Vlad Buslov wrote:
>>>>>>>> On Mon 01 Nov 2021 at 05:29, Baowen Zheng
>>>>>
>>>>> [..]
>>>>>>>>
>>>>>>>> My suggestion was to forgo the skip_sw flag for shared action
>>>>>>>> offload and, consecutively, remove the validation code, not to add
>>>>>>>> even more checks. I still don't see a practical case where skip_sw
>>>>>>>> shared action is useful. But I don't have any strong feelings about
>>>>>>>> this flag, so if Jamal thinks it is necessary, then fine by me.
>>>>>>>
>>>>>>> FWIIW, my feelings are the same as Vlad's.
>>>>>>>
>>>>>>> I think these flags add complexity that would be nice to avoid.
>>>>>>> But if Jamal thinks its necessary, then including the flags
>>>>>>> implementation is fine by me.
>>>>>> Thanks Simon. Jamal, do you think it is necessary to keep the skip_sw
>>>>>> flag for user to specify the action should not run in software?
>>>>>>
>>>>>
>>>>> Just catching up with discussion...
>>>>> IMO, we need the flag. Oz indicated with requirement to be able
>>>>> to identify
>>>>> the action with an index. So if a specific action is added for
>>>>> skip_sw (as
>>>>> standalone or alongside a filter) then it cant be used for
>>>>> skip_hw. To illustrate
>>>>> using extended example:
>>>>>
>>>>> #filter 1, skip_sw
>>>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>>>       skip_sw ip_proto tcp action police blah index 10
>>>>>
>>>>> #filter 2, skip_hw
>>>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>>>       skip_hw ip_proto udp action police index 10
>>>>>
>>>>> Filter2 should be illegal.
>>>>> And when i dump the actions as so:
>>>>> tc actions ls action police
>>>>>
>>>>> For debugability, I should see index 10 clearly marked with the
>>>>> flag as skip_sw
>>>>>
>>>>> The other example i gave earlier which showed the sharing of actions:
>>>>>
>>>>> #add a policer action and offload it
>>>>> tc actions add action police skip_sw rate ... index 20
>>>>> #now add filter1 which is offloaded using offloaded policer
>>>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>>>       skip_sw ip_proto tcp action police index 20
>>>>> #add filter2 likewise offloaded
>>>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>>>       skip_sw ip_proto udp action police index 20
>>>>>
>>>>> All good and filter 1 and 2 are sharing policer instance with index 20.
>>>>>
>>>>> #Now add a filter3 which is s/w only
>>>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>>>       skip_hw ip_proto icmp action police index 20
>>>>>
>>>>> filter3 should not be allowed.
>>>> I think the use cases you mentioned above are clear for us. For the case:
>>>>
>>>> #add a policer action and offload it
>>>> tc actions add action police skip_sw rate ... index 20
>>>> #Now add a filter4 which has no flag
>>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>>        ip_proto icmp action police index 20
>>>>
>>>> Is filter4 legal?
>>>
>>> Yes it is _based on current semantics_.
>>> The reason is when adding a filter and specifying neither
>>> skip_sw nor skip_hw it defaults to allowing both.
>>> i.e is the same as skip_sw|skip_hw. You will need to have
>>> counters for both s/w and h/w (which i think is taken care of today).
>>>
>>>
>>
>> Apologies, i will like to take this one back. Couldnt stop thinking
>> about it while sipping coffee;->
>> To be safe that should be illegal. The flags have to match _exactly_
>> for both  action and filter to make any sense. i.e in the above case
>> they are not.
> 
> I could be wrong, but I would have thought that in this case the flow
> is legal but is only added to hw (because the action doesn't exist in sw).
> 

I was worried what would show up in a dump of the filter.
Would it show only the h/w counter? And if yes, is the s/w
version mutated with no policer (since the policer is only
in h/w)?

> But if you prefer to make it illegal I guess that is ok too.

It just seemed easier from manageability pov to make it illegal, no?
i.e if flags dont match exactly it is illegal is a simple check.
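
A minimal sketch of such an exact-match check, for illustration only. The
DEMO_* names and bit values are made up for this example; they are not the
patch's code and are not copied from any kernel header:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only, not kernel definitions. */
#define DEMO_SKIP_HW   (1u << 0)
#define DEMO_SKIP_SW   (1u << 1)
#define DEMO_SKIP_MASK (DEMO_SKIP_HW | DEMO_SKIP_SW)

/*
 * Exact-match rule: a filter may refer to an action by index only when
 * both carry the same skip_sw/skip_hw setting (including "neither").
 */
static bool demo_flags_match_exactly(uint32_t filter_flags,
				     uint32_t action_flags)
{
	return (filter_flags & DEMO_SKIP_MASK) ==
	       (action_flags & DEMO_SKIP_MASK);
}
```

Under this sketch, a filter with no flag pointing at a skip_sw action
(filter4 above) is rejected, and so is a skip_hw filter pointing at a
skip_sw action (filter3 above).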

cheers,
jamal

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-11-03 14:03                             ` Baowen Zheng
@ 2021-11-03 14:16                               ` Jamal Hadi Salim
  2021-11-03 14:48                                 ` Baowen Zheng
  0 siblings, 1 reply; 58+ messages in thread
From: Jamal Hadi Salim @ 2021-11-03 14:16 UTC (permalink / raw)
  To: Baowen Zheng, Simon Horman, Vlad Buslov
  Cc: netdev, Roi Dayan, Ido Schimmel, Cong Wang, Jiri Pirko,
	Baowen Zheng, Louis Peens, oss-drivers, Oz Shlomo

On 2021-11-03 10:03, Baowen Zheng wrote:
> Thanks for your reply.
> On November 3, 2021 9:34 PM, Jamal Hadi Salim wrote:
>> On 2021-11-03 08:33, Jamal Hadi Salim wrote:
>>> On 2021-11-03 07:30, Baowen Zheng wrote:
>>>> On November 3, 2021 6:14 PM, Jamal Hadi Salim wrote:


[..]

> Sorry for more clarification about another case that Vlad mentioned:
> #add a policer action with skip_hw
> tc actions add action police skip_hw rate ... index 20
> #Now add a  filter5 which has no flag
> tc filter add dev $DEV1 proto ip parent ffff: flower \
>         ip_proto icmp action police index 20
> I think the filter5 could be legal, since it will not run in hardware.
> Driver will check failed when try to offload this filter. So the filter5 will only run in software.
> WDYT?
> 

I think this one also has ambiguity. If the filter doesn't specify
skip_sw or skip_hw it will run both in s/w and h/w. I am worried that
this looks surprising to someone debugging afterwards, because in h/w
there is filter 5 but no policer, while in the s/w twin we have filter 5
and policer index 20.
It could be design intent, but in my opinion we have priorities
to resolve such ambiguities in policies.

If we use the rule which says the flags have to match exactly then we
can simplify resolving any ambiguity - which will make it illegal, no?

cheers,
jamal

^ permalink raw reply	[flat|nested] 58+ messages in thread

* RE: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-11-03 14:16                               ` Jamal Hadi Salim
@ 2021-11-03 14:48                                 ` Baowen Zheng
  2021-11-03 15:35                                   ` Jamal Hadi Salim
  0 siblings, 1 reply; 58+ messages in thread
From: Baowen Zheng @ 2021-11-03 14:48 UTC (permalink / raw)
  To: Jamal Hadi Salim, Simon Horman, Vlad Buslov
  Cc: netdev, Roi Dayan, Ido Schimmel, Cong Wang, Jiri Pirko,
	Baowen Zheng, Louis Peens, oss-drivers, Oz Shlomo

On November 3, 2021 10:16 PM, Jamal Hadi Salim wrote:
>On 2021-11-03 10:03, Baowen Zheng wrote:
>> Thanks for your reply.
>> On November 3, 2021 9:34 PM, Jamal Hadi Salim wrote:
>>> On 2021-11-03 08:33, Jamal Hadi Salim wrote:
>>>> On 2021-11-03 07:30, Baowen Zheng wrote:
>>>>> On November 3, 2021 6:14 PM, Jamal Hadi Salim wrote:
>
>
>[..]
>
>> Sorry for more clarification about another case that Vlad mentioned:
>> #add a policer action with skip_hw
>> tc actions add action police skip_hw rate ... index 20
>> #Now add a  filter5 which has no flag
>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>         ip_proto icmp action police index 20
>> I think the filter5 could be legal, since it will not run in hardware.
>> Driver will check failed when try to offload this filter. So the filter5 will only run in software.
>> WDYT?
>>
>
>I think this one also has ambiguity. If the filter doesnt specify skip_sw or
>skip_hw it will run both in s/w and h/w. I am worried if that looks suprising to
>someone debugging after because in h/w there is filter 5 but no policer but in
>s/w twin we have filter 5 and policer index 20.
In this case, the filter will not be in h/w: when the driver tries to offload the filter,
it will find that the action is not in h/w and return failure. So the filter will only be
in software.
>It could be design intent, but in my opinion we have priorities to resolve such
>ambiguities in policies.
>
>If we use the rule which says the flags have to match exactly then we can
>simplify resolving any ambiguity - which will make it illegal, no?
When you say "match exactly", do you mean the flags of the filter and the actions should be
exactly the same?
Please consider the case where the filter has a flag and the action does not have any flag. I think we should allow this case.
It is legal before our patch, and we do not want to break this use case, yes?
So maybe "match exactly" should only constrain action flags: when an action has flags (skip_sw or skip_hw), the filter must have
exactly the same flags.
WDYT?
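
To make the proposed rule concrete, here is a minimal sketch. The DEMO_*
names and bit values are hypothetical, chosen only for this example; this
is not the code from the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only, not kernel definitions. */
#define DEMO_SKIP_HW   (1u << 0)
#define DEMO_SKIP_SW   (1u << 1)
#define DEMO_SKIP_MASK (DEMO_SKIP_HW | DEMO_SKIP_SW)

/*
 * Relaxed rule: an action without skip flags may be used by any filter
 * (the behaviour that exists before the patch). Once the action carries
 * skip_sw or skip_hw, the filter must carry exactly the same flag.
 */
static bool demo_filter_may_use_action(uint32_t filter_flags,
				       uint32_t action_flags)
{
	uint32_t act = action_flags & DEMO_SKIP_MASK;

	if (!act)
		return true;

	return (filter_flags & DEMO_SKIP_MASK) == act;
}
```

In this sketch a skip_sw filter with an unflagged action is accepted,
while a filter with no flag is rejected for an action that has skip_sw
or skip_hw set.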

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-11-03 14:48                                 ` Baowen Zheng
@ 2021-11-03 15:35                                   ` Jamal Hadi Salim
  0 siblings, 0 replies; 58+ messages in thread
From: Jamal Hadi Salim @ 2021-11-03 15:35 UTC (permalink / raw)
  To: Baowen Zheng, Simon Horman, Vlad Buslov
  Cc: netdev, Roi Dayan, Ido Schimmel, Cong Wang, Jiri Pirko,
	Baowen Zheng, Louis Peens, oss-drivers, Oz Shlomo

On 2021-11-03 10:48, Baowen Zheng wrote:
> On November 3, 2021 10:16 PM, Jamal Hadi Salim wrote:
>> On 2021-11-03 10:03, Baowen Zheng wrote:
>>> Thanks for your reply.
>>> On November 3, 2021 9:34 PM, Jamal Hadi Salim wrote:
>>>> On 2021-11-03 08:33, Jamal Hadi Salim wrote:
>>>>> On 2021-11-03 07:30, Baowen Zheng wrote:
>>>>>> On November 3, 2021 6:14 PM, Jamal Hadi Salim wrote:
>>
>>
>> [..]
>>
>>> Sorry for more clarification about another case that Vlad mentioned:
>>> #add a policer action with skip_hw
>>> tc actions add action police skip_hw rate ... index 20
>>> #Now add a  filter5 which has no flag
>>> tc filter add dev $DEV1 proto ip parent ffff: flower \
>>>          ip_proto icmp action police index 20
>>> I think the filter5 could be legal, since it will not run in hardware.
>>> Driver will check failed when try to offload this filter. So the filter5 will only run in software.
>>> WDYT?
>>>
>>
>> I think this one also has ambiguity. If the filter doesnt specify skip_sw or
>> skip_hw it will run both in s/w and h/w. I am worried if that looks suprising to
>> someone debugging after because in h/w there is filter 5 but no policer but in
>> s/w twin we have filter 5 and policer index 20.
> In this case, the filter will not in h/w because when the driver tries to offload the filter,
> It will found the action is not in h/w and return failed, then the filter will not in h/w, so the filter will only
> In software.

So you have partial failure? That doesn't sound good. What do you return
to the user - "success" or "somehow success"?
I worry it is still ambiguous. Did the user really intend to do that?
If they did, maybe they should have just added it to s/w only, instead of
adding it to both h/w and s/w and then getting saved by the driver?

>> It could be design intent, but in my opinion we have priorities to resolve such
>> ambiguities in policies.
>>
>> If we use the rule which says the flags have to match exactly then we can
>> simplify resolving any ambiguity - which will make it illegal, no?
> When you mentioned " match exactly ", do you mean the flags of the filter and the actions should be
> exactly same?
> Please consider the case that filter has flag and the action does not have any flag. 

See above.

> I think we should allow this case.
> Because it is legal before our patch, we do not expect to break this use case, yes?
> So maybe the "match exactly" just limits action flags, when action has flags(skip_sw or skip_hw), the filter must have
> exactly the same flags.

Maybe i am missing something but nothing should break.
I think what you mean is when the action is specified with
the filter. The flags should be the same in that case.

Example, filter 1:
tc filter add dev $DEV1 proto ip parent ffff: flower \
ip_proto icmp action police blah

where flag is 0 implies this filter goes both in h/w and s/w.
If i dump the policer or the filter i will see some index provided by
the kernel and i should be able to see both s/w and h/w
counters.

Same thing if i did:

Example filter 2:
tc filter add dev $DEV1 proto ip parent ffff: flower \
skip_sw ip_proto udp action police blah

Both filter and action will carry the skip_sw flag when i dump them, which
implies this filter goes only in h/w, and any displayed index
is allocated by the kernel.


Our challenge is when someone specifies a specific action by index
and tries to use it ambiguously.

cheers,
jamal

^ permalink raw reply	[flat|nested] 58+ messages in thread

* RE: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-10-29 18:01   ` Vlad Buslov
  2021-10-30 10:54     ` Jamal Hadi Salim
@ 2021-11-04  2:30     ` Baowen Zheng
  2021-11-04  5:51       ` Baowen Zheng
  1 sibling, 1 reply; 58+ messages in thread
From: Baowen Zheng @ 2021-11-04  2:30 UTC (permalink / raw)
  To: Vlad Buslov, Simon Horman
  Cc: netdev, Jamal Hadi Salim, Roi Dayan, Ido Schimmel, Cong Wang,
	Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers

Thanks for your review and sorry for the delay in responding.
On October 30, 2021 2:01 AM, Vlad Buslov wrote:
>On Thu 28 Oct 2021 at 14:06, Simon Horman <simon.horman@corigine.com>
>wrote:
>> From: Baowen Zheng <baowen.zheng@corigine.com>
>>
>> Add process to validate flags of filter and actions when adding a tc
>> filter.
>>
>> We need to prevent adding filter with flags conflicts with its actions.
>>
>> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
>> Signed-off-by: Louis Peens <louis.peens@corigine.com>
>> Signed-off-by: Simon Horman <simon.horman@corigine.com>
>> ---
>>  net/sched/cls_api.c      | 26 ++++++++++++++++++++++++++
>>  net/sched/cls_flower.c   |  3 ++-
>>  net/sched/cls_matchall.c |  4 ++--
>>  net/sched/cls_u32.c      |  7 ++++---
>>  4 files changed, 34 insertions(+), 6 deletions(-)
>>
>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>> index 351d93988b8b..80647da9713a 100644
>> --- a/net/sched/cls_api.c
>> +++ b/net/sched/cls_api.c
>> @@ -3025,6 +3025,29 @@ void tcf_exts_destroy(struct tcf_exts *exts)
>>  }
>>  EXPORT_SYMBOL(tcf_exts_destroy);
>>
>> +static bool tcf_exts_validate_actions(const struct tcf_exts *exts, u32 flags)
>> +{
>> +#ifdef CONFIG_NET_CLS_ACT
>> +	bool skip_sw = tc_skip_sw(flags);
>> +	bool skip_hw = tc_skip_hw(flags);
>> +	int i;
>> +
>> +	if (!(skip_sw | skip_hw))
>> +		return true;
>> +
>> +	for (i = 0; i < exts->nr_actions; i++) {
>> +		struct tc_action *a = exts->actions[i];
>> +
>> +		if ((skip_sw && tc_act_skip_hw(a->tcfa_flags)) ||
>> +		    (skip_hw && tc_act_skip_sw(a->tcfa_flags)))
>> +			return false;
>> +	}
>> +	return true;
>> +#else
>> +	return true;
>> +#endif
>> +}
>> +
>
>I know Jamal suggested to have skip_sw for actions, but it complicates the
>code and I'm still not entirely understand why it is necessary.
>After all, action can only get applied to a packet if the packet has been
>matched by some filter and filters already have skip sw/hw controls. Forgoing
>action skip_sw flag would:
>
>- Alleviate the need to validate that filter and action flags are compatible.
>(trying to offload filter that points to existing skip_hw action would just fail
>because the driver wouldn't find the action with provided id in its tables)
>
>- Remove the need to add more conditionals into TC software data path in
>patch 4.
>
>WDYT?
As we discussed with Jamal, we will keep the skip_sw flag, and we need to require an
exact match between the flags of an action and the flags of a filter that refers to that action by index.
>
>>  int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb,
>>  		      struct nlattr *rate_tlv, struct tcf_exts *exts,
>>  		      u32 flags, struct netlink_ext_ack *extack)
>> @@ -3066,6 +3089,9 @@ int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb,
>>  				return err;
>>  			exts->nr_actions = err;
>>  		}
>> +
>> +		if (!tcf_exts_validate_actions(exts, flags))
>> +			return -EINVAL;
>>  	}
>>  #else
>>  	if ((exts->action && tb[exts->action]) ||
>> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
>> index eb6345a027e1..55f89f0e393e 100644
>> --- a/net/sched/cls_flower.c
>> +++ b/net/sched/cls_flower.c
>> @@ -2035,7 +2035,8 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
>>  	}
>>
>>  	err = fl_set_parms(net, tp, fnew, mask, base, tb, tca[TCA_RATE],
>> -			   tp->chain->tmplt_priv, flags, extack);
>> +			   tp->chain->tmplt_priv, flags | fnew->flags,
>> +			   extack);
>
>Aren't you or-ing flags from two different ranges (TCA_CLS_FLAGS_* and
>TCA_ACT_FLAGS_*) that map to same bits, or am I missing something? This
>isn't explained in commit message so it is hard for me to understand the idea
>here.
Yes, as you said, we use both TCA_CLS_FLAGS_* and TCA_ACT_FLAGS_* flags to validate the action flags.
As you know, the TCA_ACT_FLAGS_* in flags are system flags (in the high 16 bits) and the TCA_CLS_FLAGS_*
are user flags (in the low 16 bits), so they will not conflict.
But I think your suggestion also makes sense to us. Do you think we need to pass a single filter flag
to make the process clearer?
>
>>  	if (err)
>>  		goto errout;
>>
>> diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
>> index 24f0046ce0b3..00b76fbc1dce 100644
>> --- a/net/sched/cls_matchall.c
>> +++ b/net/sched/cls_matchall.c
>> @@ -226,8 +226,8 @@ static int mall_change(struct net *net, struct sk_buff *in_skb,
>>  		goto err_alloc_percpu;
>>  	}
>>
>> -	err = mall_set_parms(net, tp, new, base, tb, tca[TCA_RATE], flags,
>> -			     extack);
>> +	err = mall_set_parms(net, tp, new, base, tb, tca[TCA_RATE],
>> +			     flags | new->flags, extack);
>>  	if (err)
>>  		goto err_set_parms;
>>
>> diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
>> index 4272814487f0..fc670cc45122 100644
>> --- a/net/sched/cls_u32.c
>> +++ b/net/sched/cls_u32.c
>> @@ -895,7 +895,8 @@ static int u32_change(struct net *net, struct sk_buff *in_skb,
>>  			return -ENOMEM;
>>
>>  		err = u32_set_parms(net, tp, base, new, tb,
>> -				    tca[TCA_RATE], flags, extack);
>> +				    tca[TCA_RATE], flags | new->flags,
>> +				    extack);
>>
>>  		if (err) {
>>  			u32_destroy_key(new, false);
>> @@ -1060,8 +1061,8 @@ static int u32_change(struct net *net, struct sk_buff *in_skb,
>>  	}
>>  #endif
>>
>> -	err = u32_set_parms(net, tp, base, n, tb, tca[TCA_RATE], flags,
>> -			    extack);
>> +	err = u32_set_parms(net, tp, base, n, tb, tca[TCA_RATE],
>> +			    flags | n->flags, extack);
>>  	if (err == 0) {
>>  		struct tc_u_knode __rcu **ins;
>>  		struct tc_u_knode *pins;


^ permalink raw reply	[flat|nested] 58+ messages in thread

* RE: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-11-04  2:30     ` Baowen Zheng
@ 2021-11-04  5:51       ` Baowen Zheng
  2021-11-04  9:07         ` Vlad Buslov
  0 siblings, 1 reply; 58+ messages in thread
From: Baowen Zheng @ 2021-11-04  5:51 UTC (permalink / raw)
  To: Vlad Buslov, Simon Horman
  Cc: netdev, Jamal Hadi Salim, Roi Dayan, Ido Schimmel, Cong Wang,
	Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers

Sorry for replying to this message again.
On November 4, 2021 10:31 AM, Baowen Zheng wrote:
>Thanks for your review and sorry for delay in responding.
>On October 30, 2021 2:01 AM, Vlad Buslov wrote:
>>On Thu 28 Oct 2021 at 14:06, Simon Horman <simon.horman@corigine.com>
>>wrote:
>>> From: Baowen Zheng <baowen.zheng@corigine.com>
>>>
>>> Add process to validate flags of filter and actions when adding a tc
>>> filter.
>>>
>>> We need to prevent adding filter with flags conflicts with its actions.
>>>
>>> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
>>> Signed-off-by: Louis Peens <louis.peens@corigine.com>
>>> Signed-off-by: Simon Horman <simon.horman@corigine.com>
>>> ---
>>>  net/sched/cls_api.c      | 26 ++++++++++++++++++++++++++
>>>  net/sched/cls_flower.c   |  3 ++-
>>>  net/sched/cls_matchall.c |  4 ++--
>>>  net/sched/cls_u32.c      |  7 ++++---
>>>  4 files changed, 34 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>>> index 351d93988b8b..80647da9713a 100644
>>> --- a/net/sched/cls_api.c
>>> +++ b/net/sched/cls_api.c
>>> @@ -3025,6 +3025,29 @@ void tcf_exts_destroy(struct tcf_exts *exts)
>>>  }
>>>  EXPORT_SYMBOL(tcf_exts_destroy);
>>>
>>> +static bool tcf_exts_validate_actions(const struct tcf_exts *exts, u32 flags)
>>> +{
>>> +#ifdef CONFIG_NET_CLS_ACT
>>> +	bool skip_sw = tc_skip_sw(flags);
>>> +	bool skip_hw = tc_skip_hw(flags);
>>> +	int i;
>>> +
>>> +	if (!(skip_sw | skip_hw))
>>> +		return true;
>>> +
>>> +	for (i = 0; i < exts->nr_actions; i++) {
>>> +		struct tc_action *a = exts->actions[i];
>>> +
>>> +		if ((skip_sw && tc_act_skip_hw(a->tcfa_flags)) ||
>>> +		    (skip_hw && tc_act_skip_sw(a->tcfa_flags)))
>>> +			return false;
>>> +	}
>>> +	return true;
>>> +#else
>>> +	return true;
>>> +#endif
>>> +}
>>> +
>>
>>I know Jamal suggested to have skip_sw for actions, but it complicates
>>the code and I'm still not entirely understand why it is necessary.
>>After all, action can only get applied to a packet if the packet has
>>been matched by some filter and filters already have skip sw/hw
>>controls. Forgoing action skip_sw flag would:
>>
>>- Alleviate the need to validate that filter and action flags are compatible.
>>(trying to offload filter that points to existing skip_hw action would
>>just fail because the driver wouldn't find the action with provided id
>>in its tables)
>>
>>- Remove the need to add more conditionals into TC software data path
>>in patch 4.
>>
>>WDYT?
>As we discussed with Jamal, we will keep the flag of skip_sw and we need to
>make exactly match for the actions with flags and the filter specific action with
>index.
>>
>>>  int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb,
>>>  		      struct nlattr *rate_tlv, struct tcf_exts *exts,
>>>  		      u32 flags, struct netlink_ext_ack *extack)
>>> @@ -3066,6 +3089,9 @@ int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb,
>>>  				return err;
>>>  			exts->nr_actions = err;
>>>  		}
>>> +
>>> +		if (!tcf_exts_validate_actions(exts, flags))
>>> +			return -EINVAL;
>>>  	}
>>>  #else
>>>  	if ((exts->action && tb[exts->action]) ||
>>> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
>>> index eb6345a027e1..55f89f0e393e 100644
>>> --- a/net/sched/cls_flower.c
>>> +++ b/net/sched/cls_flower.c
>>> @@ -2035,7 +2035,8 @@ static int fl_change(struct net *net, struct
>>> sk_buff
>>*in_skb,
>>>  	}
>>>
>>>  	err = fl_set_parms(net, tp, fnew, mask, base, tb, tca[TCA_RATE],
>>> -			   tp->chain->tmplt_priv, flags, extack);
>>> +			   tp->chain->tmplt_priv, flags | fnew->flags,
>>> +			   extack);
>>
>>Aren't you or-ing flags from two different ranges (TCA_CLS_FLAGS_* and
>>TCA_ACT_FLAGS_*) that map to same bits, or am I missing something? This
>>isn't explained in commit message so it is hard for me to understand
>>the idea here.
>Yes, as you said we use TCA_CLS_FLAGS_* or TCA_ACT_FLAGS_* flags to
>validate the action flags.
>As you know, the TCA_ACT_FLAGS_* in flags are system flags(in high 16 bits)
>and the TCA_CLS_FLAGS_* are user flags(in low 16 bits), so they will not be
>conflict.
>But I think you suggestion also makes sense to us, do you think we need to
>pass a single filter flag to make the process more clear?
After consideration, I think it is better to separate the CLS flags and the ACT flags.
So we will pass the CLS flags separately, thanks.
>>
>>>  	if (err)
>>>  		goto errout;
>>>
>>> diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
>>> index 24f0046ce0b3..00b76fbc1dce 100644
>>> --- a/net/sched/cls_matchall.c
>>> +++ b/net/sched/cls_matchall.c
>>> @@ -226,8 +226,8 @@ static int mall_change(struct net *net, struct
>>> sk_buff
>>*in_skb,
>>>  		goto err_alloc_percpu;
>>>  	}
>>>
>>> -	err = mall_set_parms(net, tp, new, base, tb, tca[TCA_RATE], flags,
>>> -			     extack);
>>> +	err = mall_set_parms(net, tp, new, base, tb, tca[TCA_RATE],
>>> +			     flags | new->flags, extack);
>>>  	if (err)
>>>  		goto err_set_parms;
>>>
>>> diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c index
>>> 4272814487f0..fc670cc45122 100644
>>> --- a/net/sched/cls_u32.c
>>> +++ b/net/sched/cls_u32.c
>>> @@ -895,7 +895,8 @@ static int u32_change(struct net *net, struct
>>> sk_buff
>>*in_skb,
>>>  			return -ENOMEM;
>>>
>>>  		err = u32_set_parms(net, tp, base, new, tb,
>>> -				    tca[TCA_RATE], flags, extack);
>>> +				    tca[TCA_RATE], flags | new->flags,
>>> +				    extack);
>>>
>>>  		if (err) {
>>>  			u32_destroy_key(new, false);
>>> @@ -1060,8 +1061,8 @@ static int u32_change(struct net *net, struct
>>sk_buff *in_skb,
>>>  	}
>>>  #endif
>>>
>>> -	err = u32_set_parms(net, tp, base, n, tb, tca[TCA_RATE], flags,
>>> -			    extack);
>>> +	err = u32_set_parms(net, tp, base, n, tb, tca[TCA_RATE],
>>> +			    flags | n->flags, extack);
>>>  	if (err == 0) {
>>>  		struct tc_u_knode __rcu **ins;
>>>  		struct tc_u_knode *pins;


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-11-04  5:51       ` Baowen Zheng
@ 2021-11-04  9:07         ` Vlad Buslov
  2021-11-04 11:15           ` Baowen Zheng
  0 siblings, 1 reply; 58+ messages in thread
From: Vlad Buslov @ 2021-11-04  9:07 UTC (permalink / raw)
  To: Baowen Zheng
  Cc: Simon Horman, netdev, Jamal Hadi Salim, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers

On Thu 04 Nov 2021 at 07:51, Baowen Zheng <baowen.zheng@corigine.com> wrote:
> Sorry for reply this message again.
> On November 4, 2021 10:31 AM, Baowen Zheng wrote:
>>Thanks for your review and sorry for delay in responding.
>>On October 30, 2021 2:01 AM, Vlad Buslov wrote:
>>>On Thu 28 Oct 2021 at 14:06, Simon Horman <simon.horman@corigine.com>
>>>wrote:
>>>> From: Baowen Zheng <baowen.zheng@corigine.com>
>>>>
>>>> Add process to validate flags of filter and actions when adding a tc
>>>> filter.
>>>>
>>>> We need to prevent adding filter with flags conflicts with its actions.
>>>>
>>>> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
>>>> Signed-off-by: Louis Peens <louis.peens@corigine.com>
>>>> Signed-off-by: Simon Horman <simon.horman@corigine.com>
>>>> ---
>>>>  net/sched/cls_api.c      | 26 ++++++++++++++++++++++++++
>>>>  net/sched/cls_flower.c   |  3 ++-
>>>>  net/sched/cls_matchall.c |  4 ++--
>>>>  net/sched/cls_u32.c      |  7 ++++---
>>>>  4 files changed, 34 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>>>> index 351d93988b8b..80647da9713a 100644
>>>> --- a/net/sched/cls_api.c
>>>> +++ b/net/sched/cls_api.c
>>>> @@ -3025,6 +3025,29 @@ void tcf_exts_destroy(struct tcf_exts *exts)
>>>>  }
>>>>  EXPORT_SYMBOL(tcf_exts_destroy);
>>>>
>>>> +static bool tcf_exts_validate_actions(const struct tcf_exts *exts, u32 flags)
>>>> +{
>>>> +#ifdef CONFIG_NET_CLS_ACT
>>>> +	bool skip_sw = tc_skip_sw(flags);
>>>> +	bool skip_hw = tc_skip_hw(flags);
>>>> +	int i;
>>>> +
>>>> +	if (!(skip_sw | skip_hw))
>>>> +		return true;
>>>> +
>>>> +	for (i = 0; i < exts->nr_actions; i++) {
>>>> +		struct tc_action *a = exts->actions[i];
>>>> +
>>>> +		if ((skip_sw && tc_act_skip_hw(a->tcfa_flags)) ||
>>>> +		    (skip_hw && tc_act_skip_sw(a->tcfa_flags)))
>>>> +			return false;
>>>> +	}
>>>> +	return true;
>>>> +#else
>>>> +	return true;
>>>> +#endif
>>>> +}
>>>> +
>>>
>>>I know Jamal suggested to have skip_sw for actions, but it complicates
>>>the code and I'm still not entirely understand why it is necessary.
>>>After all, action can only get applied to a packet if the packet has
>>>been matched by some filter and filters already have skip sw/hw
>>>controls. Forgoing action skip_sw flag would:
>>>
>>>- Alleviate the need to validate that filter and action flags are compatible.
>>>(trying to offload filter that points to existing skip_hw action would
>>>just fail because the driver wouldn't find the action with provided id
>>>in its tables)
>>>
>>>- Remove the need to add more conditionals into TC software data path
>>>in patch 4.
>>>
>>>WDYT?
>>As we discussed with Jamal, we will keep the flag of skip_sw and we need to
>>make exactly match for the actions with flags and the filter specific action with
>>index.
>>>
>>>>  int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb,
>>>>  		      struct nlattr *rate_tlv, struct tcf_exts *exts,
>>>>  		      u32 flags, struct netlink_ext_ack *extack)
>>>> @@ -3066,6 +3089,9 @@ int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb,
>>>>  				return err;
>>>>  			exts->nr_actions = err;
>>>>  		}
>>>> +
>>>> +		if (!tcf_exts_validate_actions(exts, flags))
>>>> +			return -EINVAL;
>>>>  	}
>>>>  #else
>>>>  	if ((exts->action && tb[exts->action]) ||
>>>> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
>>>> index eb6345a027e1..55f89f0e393e 100644
>>>> --- a/net/sched/cls_flower.c
>>>> +++ b/net/sched/cls_flower.c
>>>> @@ -2035,7 +2035,8 @@ static int fl_change(struct net *net, struct
>>>> sk_buff
>>>*in_skb,
>>>>  	}
>>>>
>>>>  	err = fl_set_parms(net, tp, fnew, mask, base, tb, tca[TCA_RATE],
>>>> -			   tp->chain->tmplt_priv, flags, extack);
>>>> +			   tp->chain->tmplt_priv, flags | fnew->flags,
>>>> +			   extack);
>>>
>>>Aren't you or-ing flags from two different ranges (TCA_CLS_FLAGS_* and
>>>TCA_ACT_FLAGS_*) that map to same bits, or am I missing something? This
>>>isn't explained in commit message so it is hard for me to understand
>>>the idea here.
>>Yes, as you said we use TCA_CLS_FLAGS_* or TCA_ACT_FLAGS_* flags to
>>validate the action flags.
>>As you know, the TCA_ACT_FLAGS_* in flags are system flags(in high 16 bits)
>>and the TCA_CLS_FLAGS_* are user flags(in low 16 bits), so they will not be
>>conflict.

Indeed, currently available TCA_CLS_FLAGS_* fit into first 16 bits, but
the field itself is 32 bits and with addition of more flags in the
future higher bits may start to be used since TCA_CLS_FLAGS_* and
TCA_ACT_FLAGS_* are independent sets.
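
A small sketch of the concern, under stated assumptions: the DEMO_* flags
below are hypothetical and their values are chosen only to show a
collision; they are not the real TCA_CLS_FLAGS_* or TCA_ACT_FLAGS_*
definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag sets; bit 16 is deliberately shared. */
#define DEMO_CLS_SKIP_HW  (1u << 0)
#define DEMO_CLS_FUTURE   (1u << 16)	/* imagined later classifier flag */
#define DEMO_ACT_INTERNAL (1u << 16)	/* imagined internal action flag */

/* Merged word: the callee cannot tell which set bit 16 came from. */
static bool demo_merged_has_act_internal(uint32_t merged)
{
	return (merged & DEMO_ACT_INTERNAL) != 0;
}

/* Separate arguments keep the two flag namespaces apart. */
static bool demo_split_has_act_internal(uint32_t cls_flags,
					uint32_t act_flags)
{
	(void)cls_flags;	/* classifier bits never alias action bits */
	return (act_flags & DEMO_ACT_INTERNAL) != 0;
}
```

With the merged word, a caller that sets only the imagined classifier flag
is misread as having set the action flag; with separate arguments that
cannot happen.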

>>But I think you suggestion also makes sense to us, do you think we need to
>>pass a single filter flag to make the process more clear?
> After consideration, I think it is better to separate CLS flags and ACT flags. 
> So we will pass CLS flags as a separate flags, thanks.

Please also validate inside tcf_action_init() instead of creating a new
tcf_exts_validate_actions() function, if possible. I think this will
lead to cleaner and simpler code.

>>>
>>>>  	if (err)
>>>>  		goto errout;
>>>>
>>>> diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
>>>> index 24f0046ce0b3..00b76fbc1dce 100644
>>>> --- a/net/sched/cls_matchall.c
>>>> +++ b/net/sched/cls_matchall.c
>>>> @@ -226,8 +226,8 @@ static int mall_change(struct net *net, struct
>>>> sk_buff
>>>*in_skb,
>>>>  		goto err_alloc_percpu;
>>>>  	}
>>>>
>>>> -	err = mall_set_parms(net, tp, new, base, tb, tca[TCA_RATE], flags,
>>>> -			     extack);
>>>> +	err = mall_set_parms(net, tp, new, base, tb, tca[TCA_RATE],
>>>> +			     flags | new->flags, extack);
>>>>  	if (err)
>>>>  		goto err_set_parms;
>>>>
>>>> diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c index
>>>> 4272814487f0..fc670cc45122 100644
>>>> --- a/net/sched/cls_u32.c
>>>> +++ b/net/sched/cls_u32.c
>>>> @@ -895,7 +895,8 @@ static int u32_change(struct net *net, struct
>>>> sk_buff
>>>*in_skb,
>>>>  			return -ENOMEM;
>>>>
>>>>  		err = u32_set_parms(net, tp, base, new, tb,
>>>> -				    tca[TCA_RATE], flags, extack);
>>>> +				    tca[TCA_RATE], flags | new->flags,
>>>> +				    extack);
>>>>
>>>>  		if (err) {
>>>>  			u32_destroy_key(new, false);
>>>> @@ -1060,8 +1061,8 @@ static int u32_change(struct net *net, struct
>>>sk_buff *in_skb,
>>>>  	}
>>>>  #endif
>>>>
>>>> -	err = u32_set_parms(net, tp, base, n, tb, tca[TCA_RATE], flags,
>>>> -			    extack);
>>>> +	err = u32_set_parms(net, tp, base, n, tb, tca[TCA_RATE],
>>>> +			    flags | n->flags, extack);
>>>>  	if (err == 0) {
>>>>  		struct tc_u_knode __rcu **ins;
>>>>  		struct tc_u_knode *pins;


^ permalink raw reply	[flat|nested] 58+ messages in thread

* RE: [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions
  2021-11-04  9:07         ` Vlad Buslov
@ 2021-11-04 11:15           ` Baowen Zheng
  0 siblings, 0 replies; 58+ messages in thread
From: Baowen Zheng @ 2021-11-04 11:15 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Simon Horman, netdev, Jamal Hadi Salim, Roi Dayan, Ido Schimmel,
	Cong Wang, Jiri Pirko, Baowen Zheng, Louis Peens, oss-drivers

On November 4, 2021 5:07 PM, Vlad Buslov wrote:
>On Thu 04 Nov 2021 at 07:51, Baowen Zheng <baowen.zheng@corigine.com>
>wrote:
>> Sorry for reply this message again.
>> On November 4, 2021 10:31 AM, Baowen Zheng wrote:
>>>Thanks for your review and sorry for delay in responding.
>>>On October 30, 2021 2:01 AM, Vlad Buslov wrote:
>>>>On Thu 28 Oct 2021 at 14:06, Simon Horman <simon.horman@corigine.com>
>>>>wrote:
>>>>> From: Baowen Zheng <baowen.zheng@corigine.com>
>>>>>
>>>>> Add process to validate flags of filter and actions when adding a
>>>>> tc filter.
>>>>>
>>>>> We need to prevent adding filter with flags conflicts with its actions.
>>>>>
>>>>> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
>>>>> Signed-off-by: Louis Peens <louis.peens@corigine.com>
>>>>> Signed-off-by: Simon Horman <simon.horman@corigine.com>
>>>>> ---
>>>>>  net/sched/cls_api.c      | 26 ++++++++++++++++++++++++++
>>>>>  net/sched/cls_flower.c   |  3 ++-
>>>>>  net/sched/cls_matchall.c |  4 ++--
>>>>>  net/sched/cls_u32.c      |  7 ++++---
>>>>>  4 files changed, 34 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>>>>> index 351d93988b8b..80647da9713a 100644
>>>>> --- a/net/sched/cls_api.c
>>>>> +++ b/net/sched/cls_api.c
>>>>> @@ -3025,6 +3025,29 @@ void tcf_exts_destroy(struct tcf_exts *exts)
>>>>>  }
>>>>>  EXPORT_SYMBOL(tcf_exts_destroy);
>>>>>
>>>>> +static bool tcf_exts_validate_actions(const struct tcf_exts *exts,
>>>>> +				      u32 flags)
>>>>> +{
>>>>> +#ifdef CONFIG_NET_CLS_ACT
>>>>> +	bool skip_sw = tc_skip_sw(flags);
>>>>> +	bool skip_hw = tc_skip_hw(flags);
>>>>> +	int i;
>>>>> +
>>>>> +	if (!(skip_sw | skip_hw))
>>>>> +		return true;
>>>>> +
>>>>> +	for (i = 0; i < exts->nr_actions; i++) {
>>>>> +		struct tc_action *a = exts->actions[i];
>>>>> +
>>>>> +		if ((skip_sw && tc_act_skip_hw(a->tcfa_flags)) ||
>>>>> +		    (skip_hw && tc_act_skip_sw(a->tcfa_flags)))
>>>>> +			return false;
>>>>> +	}
>>>>> +	return true;
>>>>> +#else
>>>>> +	return true;
>>>>> +#endif
>>>>> +}
>>>>> +
>>>>
>>>>I know Jamal suggested to have skip_sw for actions, but it
>>>>complicates the code and I'm still not entirely understand why it is
>>>>necessary.
>>>>After all, action can only get applied to a packet if the packet has
>>>>been matched by some filter and filters already have skip sw/hw
>>>>controls. Forgoing action skip_sw flag would:
>>>>
>>>>- Alleviate the need to validate that filter and action flags are compatible.
>>>>(trying to offload filter that points to existing skip_hw action
>>>>would just fail because the driver wouldn't find the action with
>>>>provided id in its tables)
>>>>
>>>>- Remove the need to add more conditionals into TC software data path
>>>>in patch 4.
>>>>
>>>>WDYT?
>>>As we discussed with Jamal, we will keep the flag of skip_sw and we
>>>need to make exactly match for the actions with flags and the filter
>>>specific action with index.
>>>>
>>>>>  int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb,
>>>>>  		      struct nlattr *rate_tlv, struct tcf_exts *exts,
>>>>>  		      u32 flags, struct netlink_ext_ack *extack)
>>>>> @@ -3066,6 +3089,9 @@ int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb,
>>>>>  				return err;
>>>>>  			exts->nr_actions = err;
>>>>>  		}
>>>>> +
>>>>> +		if (!tcf_exts_validate_actions(exts, flags))
>>>>> +			return -EINVAL;
>>>>>  	}
>>>>>  #else
>>>>>  	if ((exts->action && tb[exts->action]) ||
>>>>> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
>>>>> index eb6345a027e1..55f89f0e393e 100644
>>>>> --- a/net/sched/cls_flower.c
>>>>> +++ b/net/sched/cls_flower.c
>>>>> @@ -2035,7 +2035,8 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
>>>>>  	}
>>>>>
>>>>>  	err = fl_set_parms(net, tp, fnew, mask, base, tb, tca[TCA_RATE],
>>>>> -			   tp->chain->tmplt_priv, flags, extack);
>>>>> +			   tp->chain->tmplt_priv, flags | fnew->flags,
>>>>> +			   extack);
>>>>
>>>>Aren't you or-ing flags from two different ranges (TCA_CLS_FLAGS_*
>>>>and
>>>>TCA_ACT_FLAGS_*) that map to same bits, or am I missing something?
>>>>This isn't explained in commit message so it is hard for me to
>>>>understand the idea here.
>>>Yes, as you said we use TCA_CLS_FLAGS_* or TCA_ACT_FLAGS_* flags to
>>>validate the action flags.
>>>As you know, the TCA_ACT_FLAGS_* in flags are system flags(in high 16
>>>bits) and the TCA_CLS_FLAGS_* are user flags(in low 16 bits), so they
>>>will not be conflict.
>
>Indeed, currently available TCA_CLS_FLAGS_* fit into first 16 bits, but the field
>itself is 32 bits and with addition of more flags in the future higher bits may
>start to be used since TCA_CLS_FLAGS_* and
>TCA_ACT_FLAGS_* are independent sets.
Thanks, we will use a single parameter as the filter flag.
>
>>>But I think you suggestion also makes sense to us, do you think we
>>>need to pass a single filter flag to make the process more clear?
>> After consideration, I think it is better to separate CLS flags and ACT flags.
>> So we will pass CLS flags as a separate flags, thanks.
>
>Please also validate inside tcf_action_init() instead of creating new
>tcf_exts_validate_actions() function, if possible. I think this will lead to cleaner
>and more simple code.
Thanks, we will consider implementing the validation inside tcf_action_init().
>
>>>>
>>>>>  	if (err)
>>>>>  		goto errout;
>>>>>
>>>>> diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
>>>>> index 24f0046ce0b3..00b76fbc1dce 100644
>>>>> --- a/net/sched/cls_matchall.c
>>>>> +++ b/net/sched/cls_matchall.c
>>>>> @@ -226,8 +226,8 @@ static int mall_change(struct net *net, struct sk_buff *in_skb,
>>>>>  		goto err_alloc_percpu;
>>>>>  	}
>>>>>
>>>>> -	err = mall_set_parms(net, tp, new, base, tb, tca[TCA_RATE], flags,
>>>>> -			     extack);
>>>>> +	err = mall_set_parms(net, tp, new, base, tb, tca[TCA_RATE],
>>>>> +			     flags | new->flags, extack);
>>>>>  	if (err)
>>>>>  		goto err_set_parms;
>>>>>
>>>>> diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
>>>>> index 4272814487f0..fc670cc45122 100644
>>>>> --- a/net/sched/cls_u32.c
>>>>> +++ b/net/sched/cls_u32.c
>>>>> @@ -895,7 +895,8 @@ static int u32_change(struct net *net, struct sk_buff *in_skb,
>>>>>  			return -ENOMEM;
>>>>>
>>>>>  		err = u32_set_parms(net, tp, base, new, tb,
>>>>> -				    tca[TCA_RATE], flags, extack);
>>>>> +				    tca[TCA_RATE], flags | new->flags,
>>>>> +				    extack);
>>>>>
>>>>>  		if (err) {
>>>>>  			u32_destroy_key(new, false);
>>>>> @@ -1060,8 +1061,8 @@ static int u32_change(struct net *net, struct sk_buff *in_skb,
>>>>>  	}
>>>>>  #endif
>>>>>
>>>>> -	err = u32_set_parms(net, tp, base, n, tb, tca[TCA_RATE], flags,
>>>>> -			    extack);
>>>>> +	err = u32_set_parms(net, tp, base, n, tb, tca[TCA_RATE],
>>>>> +			    flags | n->flags, extack);
>>>>>  	if (err == 0) {
>>>>>  		struct tc_u_knode __rcu **ins;
>>>>>  		struct tc_u_knode *pins;
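
To make the concern about OR-ing the two flag ranges concrete, here is a
small userspace sketch. It assumes, as described in the reply above, that
classifier flags currently sit in the low 16 bits and action flags in the
high 16 bits of one u32. All DEMO_* names and bit positions are invented
for the example, including the "future" classifier flag; none of them are
real kernel definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative layout only; these are NOT the kernel's flag definitions. */
#define DEMO_CLS_SKIP_SW (1u << 1)	/* classifier flag in the low half */
#define DEMO_ACT_BIND    (1u << 16)	/* action flag in the high half */
#define DEMO_CLS_FUTURE  (1u << 16)	/* hypothetical later classifier flag */

/* Merge both sets into one word, as "flags | fnew->flags" does. */
uint32_t demo_merge(uint32_t cls_flags, uint32_t act_flags)
{
	return cls_flags | act_flags;
}

/* Test the action flag on whatever word it is handed. */
bool demo_act_bind_set(uint32_t flags)
{
	return (flags & DEMO_ACT_BIND) != 0;
}

/* Alternative: carry the two independent sets in separate fields. */
struct demo_flags {
	uint32_t cls;	/* classifier flags only */
	uint32_t act;	/* action flags only */
};
```

With the merged word, demo_act_bind_set(demo_merge(DEMO_CLS_FUTURE, 0))
reports the action flag as set even though no action flag was passed.
Keeping the sets apart, as in struct demo_flags, avoids that aliasing no
matter which bits either set grows into.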



end of thread, other threads:[~2021-11-04 11:15 UTC | newest]

Thread overview: 58+ messages
2021-10-28 11:06 [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device Simon Horman
2021-10-28 11:06 ` [RFC/PATCH net-next v3 1/8] flow_offload: fill flags to action structure Simon Horman
2021-10-28 11:06 ` [RFC/PATCH net-next v3 2/8] flow_offload: reject to offload tc actions in offload drivers Simon Horman
2021-10-28 11:06 ` [RFC/PATCH net-next v3 3/8] flow_offload: allow user to offload tc action to net device Simon Horman
2021-10-29 16:59   ` Vlad Buslov
2021-11-01  9:44     ` Baowen Zheng
2021-11-01 12:05       ` Vlad Buslov
2021-11-02  1:38         ` Baowen Zheng
2021-10-31  9:50   ` Oz Shlomo
2021-11-01  2:30     ` Baowen Zheng
2021-11-01 10:07       ` Oz Shlomo
2021-11-01 10:27         ` Baowen Zheng
2021-10-28 11:06 ` [RFC/PATCH net-next v3 4/8] flow_offload: add skip_hw and skip_sw to control if offload the action Simon Horman
2021-10-28 11:06 ` [RFC/PATCH net-next v3 5/8] flow_offload: add process to update action stats from hardware Simon Horman
2021-10-29 17:11   ` Vlad Buslov
2021-11-01 10:07     ` Baowen Zheng
2021-10-28 11:06 ` [RFC/PATCH net-next v3 6/8] net: sched: save full flags for tc action Simon Horman
2021-10-28 11:06 ` [RFC/PATCH net-next v3 7/8] flow_offload: add reoffload process to update hw_count Simon Horman
2021-10-29 17:31   ` Vlad Buslov
2021-11-02  9:20     ` Baowen Zheng
2021-10-28 11:06 ` [RFC/PATCH net-next v3 8/8] flow_offload: validate flags of filter and actions Simon Horman
2021-10-28 19:12   ` kernel test robot
2021-10-29 18:01   ` Vlad Buslov
2021-10-30 10:54     ` Jamal Hadi Salim
2021-10-30 14:45       ` Vlad Buslov
     [not found]         ` <DM5PR1301MB21722A85B19EE97EFE27A5BBE7899@DM5PR1301MB2172.namprd13.prod.outlook.com>
2021-10-31 13:30           ` Jamal Hadi Salim
2021-11-01  3:29             ` Baowen Zheng
2021-11-01  7:38               ` Vlad Buslov
2021-11-02 12:39                 ` Simon Horman
2021-11-03  7:57                   ` Baowen Zheng
2021-11-03 10:13                     ` Jamal Hadi Salim
2021-11-03 11:30                       ` Baowen Zheng
2021-11-03 12:33                         ` Jamal Hadi Salim
2021-11-03 13:33                           ` Jamal Hadi Salim
2021-11-03 13:38                             ` Simon Horman
2021-11-03 14:05                               ` Jamal Hadi Salim
2021-11-03 14:03                             ` Baowen Zheng
2021-11-03 14:16                               ` Jamal Hadi Salim
2021-11-03 14:48                                 ` Baowen Zheng
2021-11-03 15:35                                   ` Jamal Hadi Salim
2021-11-03 13:37                           ` Baowen Zheng
2021-11-04  2:30     ` Baowen Zheng
2021-11-04  5:51       ` Baowen Zheng
2021-11-04  9:07         ` Vlad Buslov
2021-11-04 11:15           ` Baowen Zheng
2021-10-28 14:23 ` [RFC/PATCH net-next v3 0/8] allow user to offload tc action to net device Jamal Hadi Salim
2021-10-28 14:39   ` Jamal Hadi Salim
2021-10-31  9:50 ` Oz Shlomo
2021-10-31 12:03   ` Dave Taht
2021-10-31 14:14     ` Jamal Hadi Salim
2021-10-31 14:19       ` Jamal Hadi Salim
2021-11-01 14:27       ` Dave Taht
2021-10-31 13:40   ` Jamal Hadi Salim
2021-11-01  8:01     ` Vlad Buslov
2021-11-02 12:51       ` Simon Horman
2021-11-02 15:33         ` Vlad Buslov
2021-11-02 16:15           ` Simon Horman
2021-11-03 10:56             ` Oz Shlomo
