* [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
@ 2019-10-22 14:17 Vlad Buslov
  2019-10-22 14:17 ` [PATCH net-next 01/13] net: sched: extract common action counters update code into function Vlad Buslov
                   ` (16 more replies)
  0 siblings, 17 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-22 14:17 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, mleitner, dcaratti, Vlad Buslov

Currently, a significant fraction of CPU time during TC filter allocation
is spent in the percpu allocator. Moreover, the percpu allocator is
protected by a single global mutex, which negates any potential
performance improvement from the recent developments in the TC filter
update API that removed the rtnl lock for some Qdiscs and classifiers. In
order to significantly improve the filter update rate and reduce memory
usage, we would like to allow users to skip percpu counters allocation
for a specific action if they don't expect a high traffic rate hitting
the action, which is a reasonable expectation for a hardware-offloaded
setup. In that case any potential gains to software fast-path performance
from percpu-allocated counters compared to regular integer counters
protected by a spinlock are not important, but the additional CPU and
memory they consume is significant.

In order to allow configuring the action counters allocation type at
runtime, implement the following changes:

- Implement helper functions to update the action counters and use them
  in the affected actions instead of updating the counters directly. This
  step abstracts the action implementations from the counter type that is
  used by a particular action instance at runtime.

- Modify the new helpers to use percpu counters if they were allocated
  during action initialization and regular counters otherwise.

- Extend the actions that are used for hardware offloads with an optional
  32-bit netlink flags field. Add the TCA_ACT_FLAGS_FAST_INIT action flag
  and update the affected actions to not allocate percpu counters when
  the flag is set.

With these changes, users that prefer action update (slow-path) speed
over software fast-path speed can dynamically request actions to skip
percpu counters allocation without affecting other users.
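
For illustration only (this sketch is not part of the series): at the
netlink level the flag is carried as a 32-bit bitfield attribute, so a
user-space program built against the updated uapi headers could construct
the payload of the new per-action flags attribute (TCA_GACT_FLAGS in the
gact case) roughly like this:

#include <linux/netlink.h>		/* struct nla_bitfield32 */
#include <linux/pkt_cls.h>		/* TCA_ACT_FLAGS_FAST_INIT */
#include <linux/tc_act/tc_gact.h>	/* TCA_GACT_FLAGS */

/* Request fast init: select the TCA_ACT_FLAGS_FAST_INIT bit and set it. */
static const struct nla_bitfield32 gact_flags = {
	.value		= TCA_ACT_FLAGS_FAST_INIT,
	.selector	= TCA_ACT_FLAGS_FAST_INIT,
};

/* gact_flags would then be attached as the TCA_GACT_FLAGS attribute inside
 * the nested gact options of an RTM_NEWACTION/RTM_NEWTFILTER request.
 */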

Now, let's look at the actual performance gains provided by this change.
A simple test is used to measure the insertion rate: iproute2 TC is
executed in parallel by xargs in batch mode, and its total execution time
is measured by the shell builtin "time" command. The command runs 20
concurrent tc instances, each with its own batch file with 100k rules:

$ time ls add* | xargs -n 1 -P 20 sudo tc -b

Two main rule profiles are tested. The first is a simple L2 flower
classifier with a single gact drop action. This configuration is chosen
as the worst-case scenario because with single-action rules the pressure
on the percpu allocator is minimized. Example rule:

filter add dev ens1f0 protocol ip ingress prio 1 handle 1 flower skip_hw
    src_mac e4:11:0:0:0:0 dst_mac e4:12:0:0:0:0 action drop

The second profile is a typical real-world scenario that uses the flower
classifier with some L2-L4 fields and two actions (tunnel_key+mirred).
Example rule:

filter add dev ens1f0_0 protocol ip ingress prio 1 handle 1 flower
    skip_hw src_mac e4:11:0:0:0:0 dst_mac e4:12:0:0:0:0 src_ip
    192.168.111.1 dst_ip 192.168.111.2 ip_proto udp dst_port 1 src_port
    1 action tunnel_key set id 1 src_ip 2.2.2.2 dst_ip 2.2.2.3 dst_port
    4789 action mirred egress redirect dev vxlan1

 Profile           |        percpu |     fast_init | X improvement 
                   | (k rules/sec) | (k rules/sec) |               
-------------------+---------------+---------------+---------------
 Gact drop         |           203 |           259 |          1.28 
 tunnel_key+mirred |            92 |           204 |          2.22 

For the simple drop action, removing the percpu allocation improves the
insertion rate by ~28% (1.28x in the table above). The perf profiles
below highlight the bottlenecks.

Perf profile of run with percpu allocation (gact drop):

+ 89.11% 0.48% tc [kernel.vmlinux] [k] entry_SYSCALL_64
+ 88.58% 0.04% tc [kernel.vmlinux] [k] do_syscall_64
+ 87.50% 0.04% tc libc-2.29.so [.] __libc_sendmsg
+ 86.96% 0.04% tc [kernel.vmlinux] [k] __sys_sendmsg
+ 86.85% 0.01% tc [kernel.vmlinux] [k] ___sys_sendmsg
+ 86.60% 0.05% tc [kernel.vmlinux] [k] sock_sendmsg
+ 86.55% 0.12% tc [kernel.vmlinux] [k] netlink_sendmsg
+ 86.04% 0.13% tc [kernel.vmlinux] [k] netlink_unicast
+ 85.42% 0.03% tc [kernel.vmlinux] [k] netlink_rcv_skb
+ 84.68% 0.04% tc [kernel.vmlinux] [k] rtnetlink_rcv_msg
+ 84.56% 0.24% tc [kernel.vmlinux] [k] tc_new_tfilter
+ 75.73% 0.65% tc [cls_flower] [k] fl_change
+ 71.30% 0.03% tc [kernel.vmlinux] [k] tcf_exts_validate
+ 71.27% 0.13% tc [kernel.vmlinux] [k] tcf_action_init
+ 71.06% 0.01% tc [kernel.vmlinux] [k] tcf_action_init_1
+ 70.41% 0.04% tc [act_gact] [k] tcf_gact_init
+ 53.59% 1.21% tc [kernel.vmlinux] [k] __mutex_lock.isra.0
+ 52.34% 0.34% tc [kernel.vmlinux] [k] tcf_idr_create
- 51.23% 2.17% tc [kernel.vmlinux] [k] pcpu_alloc
  - 49.05% pcpu_alloc
    + 39.35% __mutex_lock.isra.0
      4.99% memset_erms
    + 2.16% pcpu_alloc_area
  + 2.17% __libc_sendmsg
+ 45.89% 44.33% tc [kernel.vmlinux] [k] osq_lock
+ 9.94% 0.04% tc [kernel.vmlinux] [k] tcf_idr_check_alloc
+ 7.76% 0.00% tc [kernel.vmlinux] [k] tcf_idr_insert
+ 6.50% 0.03% tc [kernel.vmlinux] [k] tfilter_notify
+ 6.24% 6.11% tc [kernel.vmlinux] [k] mutex_spin_on_owner
+ 5.73% 5.32% tc [kernel.vmlinux] [k] memset_erms
+ 5.31% 0.18% tc [kernel.vmlinux] [k] tcf_fill_node

Here the bottleneck is clearly the pcpu_alloc() function, which takes
more than half of the CPU time, mostly wasted busy-waiting on the percpu
allocator's internal global lock.
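
For context, here is a heavily simplified, illustrative view of
mm/percpu.c (not part of this series) showing why the contention is
global: every GFP_KERNEL percpu allocation in the system serializes on a
single mutex:

/* mm/percpu.c, heavily simplified for illustration */
static DEFINE_MUTEX(pcpu_alloc_mutex);	/* one system-wide lock */

static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved,
				 gfp_t gfp)
{
	void __percpu *ptr = NULL;

	mutex_lock(&pcpu_alloc_mutex);	/* all 20 tc instances contend here */
	/* ... ptr = carve an area out of a percpu chunk ... */
	mutex_unlock(&pcpu_alloc_mutex);

	/* ... zero the area (the memset_erms samples above) ... */
	return ptr;
}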

With percpu allocation removed (gact drop):

+ 87.50% 0.51% tc [kernel.vmlinux] [k] entry_SYSCALL_64
+ 86.94% 0.07% tc [kernel.vmlinux] [k] do_syscall_64
+ 85.75% 0.04% tc libc-2.29.so [.] __libc_sendmsg
+ 85.00% 0.07% tc [kernel.vmlinux] [k] __sys_sendmsg
+ 84.84% 0.07% tc [kernel.vmlinux] [k] ___sys_sendmsg
+ 84.59% 0.01% tc [kernel.vmlinux] [k] sock_sendmsg
+ 84.58% 0.14% tc [kernel.vmlinux] [k] netlink_sendmsg
+ 83.95% 0.12% tc [kernel.vmlinux] [k] netlink_unicast
+ 83.34% 0.01% tc [kernel.vmlinux] [k] netlink_rcv_skb
+ 82.39% 0.12% tc [kernel.vmlinux] [k] rtnetlink_rcv_msg
+ 82.16% 0.25% tc [kernel.vmlinux] [k] tc_new_tfilter
+ 75.13% 0.84% tc [cls_flower] [k] fl_change
+ 69.92% 0.05% tc [kernel.vmlinux] [k] tcf_exts_validate
+ 69.87% 0.11% tc [kernel.vmlinux] [k] tcf_action_init
+ 69.61% 0.02% tc [kernel.vmlinux] [k] tcf_action_init_1
- 68.80% 0.10% tc [act_gact] [k] tcf_gact_init
  - 68.70% tcf_gact_init
    + 36.08% tcf_idr_check_alloc
    + 31.88% tcf_idr_insert
+ 63.72% 0.58% tc [kernel.vmlinux] [k] __mutex_lock.isra.0
+ 58.80% 56.68% tc [kernel.vmlinux] [k] osq_lock
+ 36.08% 0.04% tc [kernel.vmlinux] [k] tcf_idr_check_alloc
+ 31.88% 0.01% tc [kernel.vmlinux] [k] tcf_idr_insert

The gact actions (like all other action types) are inserted into a
single idr instance protected by a global (per-namespace) lock, which
becomes the new bottleneck with such a simple rule profile and prevents
the 2x+ performance increase that could be expected from the profiling
data of action insertion with percpu counters.
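
For reference, an illustrative (simplified) sketch of that path in
net/sched/act_api.c: all instances of a given action type in a network
namespace share one idrinfo, so its mutex becomes the serialization point
once pcpu_alloc() is out of the picture:

/* net/sched/act_api.c, simplified for illustration */
int tcf_idr_check_alloc(struct tc_action_net *tn, u32 *index,
			struct tc_action **a, int bind)
{
	struct tcf_idrinfo *idrinfo = tn->idrinfo;	/* per type, per netns */
	int ret = 0;

	mutex_lock(&idrinfo->lock);	/* concurrent inserts of the same
					 * action type serialize here
					 */
	/* ... look up or reserve *index in idrinfo->action_idr, set ret ... */
	mutex_unlock(&idrinfo->lock);
	return ret;
}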

Perf profile of run with percpu allocation (tunnel_key+mirred):

+ 91.95% 0.21% tc [kernel.vmlinux] [k] entry_SYSCALL_64
+ 91.74% 0.06% tc [kernel.vmlinux] [k] do_syscall_64
+ 90.74% 0.01% tc libc-2.29.so [.] __libc_sendmsg
+ 90.52% 0.01% tc [kernel.vmlinux] [k] __sys_sendmsg
+ 90.50% 0.04% tc [kernel.vmlinux] [k] ___sys_sendmsg
+ 90.41% 0.02% tc [kernel.vmlinux] [k] sock_sendmsg
+ 90.38% 0.04% tc [kernel.vmlinux] [k] netlink_sendmsg
+ 90.10% 0.06% tc [kernel.vmlinux] [k] netlink_unicast
+ 89.76% 0.01% tc [kernel.vmlinux] [k] netlink_rcv_skb
+ 89.28% 0.04% tc [kernel.vmlinux] [k] rtnetlink_rcv_msg
+ 89.15% 0.03% tc [kernel.vmlinux] [k] tc_new_tfilter
+ 83.41% 0.33% tc [cls_flower] [k] fl_change
+ 81.17% 0.04% tc [kernel.vmlinux] [k] tcf_exts_validate
+ 81.13% 0.06% tc [kernel.vmlinux] [k] tcf_action_init
+ 81.04% 0.04% tc [kernel.vmlinux] [k] tcf_action_init_1
- 73.59% 2.16% tc [kernel.vmlinux] [k] pcpu_alloc
  - 71.42% pcpu_alloc
    + 61.41% __mutex_lock.isra.0
      5.02% memset_erms
    + 2.93% pcpu_alloc_area
  + 2.16% __libc_sendmsg
+ 63.58% 0.17% tc [kernel.vmlinux] [k] tcf_idr_create
+ 63.40% 0.60% tc [kernel.vmlinux] [k] __mutex_lock.isra.0
+ 57.85% 56.38% tc [kernel.vmlinux] [k] osq_lock
+ 46.27% 0.13% tc [act_tunnel_key] [k] tunnel_key_init
+ 34.26% 0.02% tc [act_mirred] [k] tcf_mirred_init
+ 10.99% 0.00% tc [kernel.vmlinux] [k] dst_cache_init
+ 5.32% 5.11% tc [kernel.vmlinux] [k] memset_erms

With twice as many actions, the pressure on the percpu allocator
doubles, so it now takes ~74% of the CPU execution time.

With percpu allocation removed (tunnel_key+mirred):

+ 86.02% 0.50% tc [kernel.vmlinux] [k] entry_SYSCALL_64
+ 85.51% 0.12% tc [kernel.vmlinux] [k] do_syscall_64
+ 84.40% 0.03% tc libc-2.29.so [.] __libc_sendmsg
+ 83.84% 0.03% tc [kernel.vmlinux] [k] __sys_sendmsg
+ 83.72% 0.01% tc [kernel.vmlinux] [k] ___sys_sendmsg
+ 83.56% 0.01% tc [kernel.vmlinux] [k] sock_sendmsg
+ 83.50% 0.08% tc [kernel.vmlinux] [k] netlink_sendmsg
+ 83.02% 0.17% tc [kernel.vmlinux] [k] netlink_unicast
+ 82.48% 0.00% tc [kernel.vmlinux] [k] netlink_rcv_skb
+ 81.89% 0.11% tc [kernel.vmlinux] [k] rtnetlink_rcv_msg
+ 81.71% 0.25% tc [kernel.vmlinux] [k] tc_new_tfilter
+ 73.99% 0.63% tc [cls_flower] [k] fl_change
+ 69.72% 0.00% tc [kernel.vmlinux] [k] tcf_exts_validate
+ 69.72% 0.09% tc [kernel.vmlinux] [k] tcf_action_init
+ 69.53% 0.05% tc [kernel.vmlinux] [k] tcf_action_init_1
+ 53.08% 0.91% tc [kernel.vmlinux] [k] __mutex_lock.isra.0
+ 45.52% 43.99% tc [kernel.vmlinux] [k] osq_lock
- 36.02% 0.21% tc [act_tunnel_key] [k] tunnel_key_init
  - 35.81% tunnel_key_init
    + 15.95% tcf_idr_check_alloc
    + 13.91% tcf_idr_insert
    - 4.70% dst_cache_init
      + 4.68% pcpu_alloc
+ 33.22% 0.04% tc [kernel.vmlinux] [k] tcf_idr_check_alloc
+ 32.34% 0.05% tc [act_mirred] [k] tcf_mirred_init
+ 28.24% 0.01% tc [kernel.vmlinux] [k] tcf_idr_insert
+ 7.79% 0.05% tc [kernel.vmlinux] [k] idr_alloc_u32
+ 7.67% 7.35% tc [kernel.vmlinux] [k] idr_get_free
+ 6.46% 6.22% tc [kernel.vmlinux] [k] mutex_spin_on_owner
+ 5.11% 0.05% tc [kernel.vmlinux] [k] tfilter_notify

With the percpu allocation removed, the insertion rate increases by
~120%. This rule profile scales much better than the simple single-action
one because both action types were competing for the single lock in the
percpu allocator, but not for the action idr lock, which is per action
type. Note that the percpu allocator is still used by dst_cache in the
tunnel_key action and consumes 4.68% of CPU time. Dst_cache looks like a
good opportunity for further insertion rate optimization but is not
addressed by this change.

Setup details: 2x Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 32GB memory

Patches applied on top of net-next branch:

commit 2203cbf2c8b58a1e3bef98c47531d431d11639a0 (net-next)
Author: Russell King <rmk+kernel@armlinux.org.uk>
Date:   Tue Oct 15 11:38:39 2019 +0100

    net: sfp: move fwnode parsing into sfp-bus layer

Vlad Buslov (13):
  net: sched: extract common action counters update code into function
  net: sched: extract bstats update code into function
  net: sched: extract qstats update code into functions
  net: sched: don't expose action qstats to skb_tc_reinsert()
  net: sched: modify stats helper functions to support regular stats
  net: sched: add action fast initialization flag
  net: sched: act_gact: support fast init flag to skip pcpu allocation
  net: sched: act_csum: support fast init flag to skip pcpu allocation
  net: sched: act_mirred: support fast init flag to skip pcpu alloc
  net: sched: tunnel_key: support fast init flag to skip pcpu alloc
  net: sched: act_vlan: support fast init flag to skip pcpu allocation
  net: sched: act_ct: support fast init flag to skip pcpu allocation
  tc-testing: implement tests for new fast_init action flag

 include/net/act_api.h                         | 47 ++++++++++++++++++-
 include/net/pkt_cls.h                         |  3 ++
 include/net/sch_generic.h                     | 12 +----
 include/uapi/linux/pkt_cls.h                  |  8 ++++
 include/uapi/linux/tc_act/tc_csum.h           |  1 +
 include/uapi/linux/tc_act/tc_ct.h             |  1 +
 include/uapi/linux/tc_act/tc_gact.h           |  1 +
 include/uapi/linux/tc_act/tc_mirred.h         |  1 +
 include/uapi/linux/tc_act/tc_tunnel_key.h     |  1 +
 include/uapi/linux/tc_act/tc_vlan.h           |  1 +
 net/sched/act_api.c                           | 28 ++++++++++-
 net/sched/act_bpf.c                           |  2 +-
 net/sched/act_connmark.c                      |  3 +-
 net/sched/act_csum.c                          | 22 +++++++--
 net/sched/act_ct.c                            | 27 +++++++----
 net/sched/act_ctinfo.c                        |  3 +-
 net/sched/act_gact.c                          | 32 ++++++++-----
 net/sched/act_ife.c                           |  2 +-
 net/sched/act_ipt.c                           |  8 ++--
 net/sched/act_mirred.c                        | 30 ++++++++----
 net/sched/act_mpls.c                          |  2 +-
 net/sched/act_nat.c                           |  3 +-
 net/sched/act_pedit.c                         |  3 +-
 net/sched/act_police.c                        |  7 +--
 net/sched/act_sample.c                        |  2 +-
 net/sched/act_simple.c                        |  3 +-
 net/sched/act_skbedit.c                       |  2 +-
 net/sched/act_skbmod.c                        |  2 +-
 net/sched/act_tunnel_key.c                    | 20 ++++++--
 net/sched/act_vlan.c                          | 26 ++++++----
 .../tc-testing/tc-tests/actions/csum.json     | 24 ++++++++++
 .../tc-testing/tc-tests/actions/ct.json       | 24 ++++++++++
 .../tc-testing/tc-tests/actions/gact.json     | 24 ++++++++++
 .../tc-testing/tc-tests/actions/mirred.json   | 24 ++++++++++
 .../tc-tests/actions/tunnel_key.json          | 24 ++++++++++
 .../tc-testing/tc-tests/actions/vlan.json     | 24 ++++++++++
 36 files changed, 367 insertions(+), 80 deletions(-)

-- 
2.21.0



* [PATCH net-next 01/13] net: sched: extract common action counters update code into function
  2019-10-22 14:17 [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Vlad Buslov
@ 2019-10-22 14:17 ` Vlad Buslov
  2019-10-22 14:17 ` [PATCH net-next 02/13] net: sched: extract bstats " Vlad Buslov
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-22 14:17 UTC (permalink / raw)
  To: netdev
  Cc: jhs, xiyou.wangcong, jiri, davem, mleitner, dcaratti,
	Vlad Buslov, Jiri Pirko

Currently, all implementations of the tc_action_ops->stats_update()
callback have almost exactly the same counters update code (besides gact,
which also updates the drop counter). In order to simplify support for
using both percpu-allocated and regular action counters depending on a
run-time flag in the following patches, extract the action counters
update code into a standalone function in the act API.

This commit doesn't change functionality.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/act_api.h  |  2 ++
 net/sched/act_api.c    | 14 ++++++++++++++
 net/sched/act_ct.c     |  6 +-----
 net/sched/act_gact.c   | 10 +---------
 net/sched/act_mirred.c |  5 +----
 net/sched/act_police.c |  5 +----
 net/sched/act_vlan.c   |  5 +----
 7 files changed, 21 insertions(+), 26 deletions(-)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index b18c699681ca..f6f66c692385 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -186,6 +186,8 @@ int tcf_action_dump(struct sk_buff *skb, struct tc_action *actions[], int bind,
 		    int ref);
 int tcf_action_dump_old(struct sk_buff *skb, struct tc_action *a, int, int);
 int tcf_action_dump_1(struct sk_buff *skb, struct tc_action *a, int, int);
+void tcf_action_update_stats(struct tc_action *a, u64 bytes, u32 packets,
+			     bool drop, bool hw);
 int tcf_action_copy_stats(struct sk_buff *, struct tc_action *, int);
 
 int tcf_action_check_ctrlact(int action, struct tcf_proto *tp,
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 69d4676a402f..0638afa2fc3f 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -989,6 +989,20 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
 	return err;
 }
 
+void tcf_action_update_stats(struct tc_action *a, u64 bytes, u32 packets,
+			     bool drop, bool hw)
+{
+	_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets);
+
+	if (drop)
+		this_cpu_ptr(a->cpu_qstats)->drops += packets;
+
+	if (hw)
+		_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats_hw),
+				   bytes, packets);
+}
+EXPORT_SYMBOL(tcf_action_update_stats);
+
 int tcf_action_copy_stats(struct sk_buff *skb, struct tc_action *p,
 			  int compat_mode)
 {
diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index fcc46025e790..ba76857754e5 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -905,11 +905,7 @@ static void tcf_stats_update(struct tc_action *a, u64 bytes, u32 packets,
 {
 	struct tcf_ct *c = to_ct(a);
 
-	_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets);
-
-	if (hw)
-		_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats_hw),
-				   bytes, packets);
+	tcf_action_update_stats(a, bytes, packets, false, hw);
 	c->tcf_tm.lastuse = max_t(u64, c->tcf_tm.lastuse, lastuse);
 }
 
diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
index 324f1d1f6d47..569cec63d4c3 100644
--- a/net/sched/act_gact.c
+++ b/net/sched/act_gact.c
@@ -177,15 +177,7 @@ static void tcf_gact_stats_update(struct tc_action *a, u64 bytes, u32 packets,
 	int action = READ_ONCE(gact->tcf_action);
 	struct tcf_t *tm = &gact->tcf_tm;
 
-	_bstats_cpu_update(this_cpu_ptr(gact->common.cpu_bstats), bytes,
-			   packets);
-	if (action == TC_ACT_SHOT)
-		this_cpu_ptr(gact->common.cpu_qstats)->drops += packets;
-
-	if (hw)
-		_bstats_cpu_update(this_cpu_ptr(gact->common.cpu_bstats_hw),
-				   bytes, packets);
-
+	tcf_action_update_stats(a, bytes, packets, action == TC_ACT_SHOT, hw);
 	tm->lastuse = max_t(u64, tm->lastuse, lastuse);
 }
 
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index 08923b21e566..621686a6b5be 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -318,10 +318,7 @@ static void tcf_stats_update(struct tc_action *a, u64 bytes, u32 packets,
 	struct tcf_mirred *m = to_mirred(a);
 	struct tcf_t *tm = &m->tcf_tm;
 
-	_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets);
-	if (hw)
-		_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats_hw),
-				   bytes, packets);
+	tcf_action_update_stats(a, bytes, packets, false, hw);
 	tm->lastuse = max_t(u64, tm->lastuse, lastuse);
 }
 
diff --git a/net/sched/act_police.c b/net/sched/act_police.c
index 981a9eca0c52..51d34b1a61d5 100644
--- a/net/sched/act_police.c
+++ b/net/sched/act_police.c
@@ -294,10 +294,7 @@ static void tcf_police_stats_update(struct tc_action *a,
 	struct tcf_police *police = to_police(a);
 	struct tcf_t *tm = &police->tcf_tm;
 
-	_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets);
-	if (hw)
-		_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats_hw),
-				   bytes, packets);
+	tcf_action_update_stats(a, bytes, packets, false, hw);
 	tm->lastuse = max_t(u64, tm->lastuse, lastuse);
 }
 
diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c
index 08aaf719a70f..9e68edb22e53 100644
--- a/net/sched/act_vlan.c
+++ b/net/sched/act_vlan.c
@@ -307,10 +307,7 @@ static void tcf_vlan_stats_update(struct tc_action *a, u64 bytes, u32 packets,
 	struct tcf_vlan *v = to_vlan(a);
 	struct tcf_t *tm = &v->tcf_tm;
 
-	_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets);
-	if (hw)
-		_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats_hw),
-				   bytes, packets);
+	tcf_action_update_stats(a, bytes, packets, false, hw);
 	tm->lastuse = max_t(u64, tm->lastuse, lastuse);
 }
 
-- 
2.21.0



* [PATCH net-next 02/13] net: sched: extract bstats update code into function
  2019-10-22 14:17 [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Vlad Buslov
  2019-10-22 14:17 ` [PATCH net-next 01/13] net: sched: extract common action counters update code into function Vlad Buslov
@ 2019-10-22 14:17 ` Vlad Buslov
  2019-10-22 14:17 ` [PATCH net-next 03/13] net: sched: extract qstats update code into functions Vlad Buslov
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-22 14:17 UTC (permalink / raw)
  To: netdev
  Cc: jhs, xiyou.wangcong, jiri, davem, mleitner, dcaratti,
	Vlad Buslov, Jiri Pirko

Extract the common code that increments the cpu_bstats counter into a
standalone act API function. Change the hardware-offloaded actions that
use percpu counter allocation to use the new function instead of
incrementing cpu_bstats directly.

This commit doesn't change functionality.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/act_api.h      | 7 +++++++
 net/sched/act_csum.c       | 2 +-
 net/sched/act_ct.c         | 2 +-
 net/sched/act_gact.c       | 2 +-
 net/sched/act_mirred.c     | 2 +-
 net/sched/act_tunnel_key.c | 2 +-
 net/sched/act_vlan.c       | 2 +-
 7 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index f6f66c692385..9a32853f77f9 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -186,6 +186,13 @@ int tcf_action_dump(struct sk_buff *skb, struct tc_action *actions[], int bind,
 		    int ref);
 int tcf_action_dump_old(struct sk_buff *skb, struct tc_action *a, int, int);
 int tcf_action_dump_1(struct sk_buff *skb, struct tc_action *a, int, int);
+
+static inline void tcf_action_update_bstats(struct tc_action *a,
+					    struct sk_buff *skb)
+{
+	bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), skb);
+}
+
 void tcf_action_update_stats(struct tc_action *a, u64 bytes, u32 packets,
 			     bool drop, bool hw);
 int tcf_action_copy_stats(struct sk_buff *, struct tc_action *, int);
diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
index d3cfad88dc3a..69747b1860aa 100644
--- a/net/sched/act_csum.c
+++ b/net/sched/act_csum.c
@@ -580,7 +580,7 @@ static int tcf_csum_act(struct sk_buff *skb, const struct tc_action *a,
 	params = rcu_dereference_bh(p->params);
 
 	tcf_lastuse_update(&p->tcf_tm);
-	bstats_cpu_update(this_cpu_ptr(p->common.cpu_bstats), skb);
+	tcf_action_update_bstats(&p->common, skb);
 
 	action = READ_ONCE(p->tcf_action);
 	if (unlikely(action == TC_ACT_SHOT))
diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index ba76857754e5..f9779907dcf7 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -465,7 +465,7 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
 	skb_push_rcsum(skb, nh_ofs);
 
 out:
-	bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), skb);
+	tcf_action_update_bstats(&c->common, skb);
 	return retval;
 
 drop:
diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
index 569cec63d4c3..a7e3d5621608 100644
--- a/net/sched/act_gact.c
+++ b/net/sched/act_gact.c
@@ -161,7 +161,7 @@ static int tcf_gact_act(struct sk_buff *skb, const struct tc_action *a,
 		action = gact_rand[ptype](gact);
 	}
 #endif
-	bstats_cpu_update(this_cpu_ptr(gact->common.cpu_bstats), skb);
+	tcf_action_update_bstats(&gact->common, skb);
 	if (action == TC_ACT_SHOT)
 		qstats_drop_inc(this_cpu_ptr(gact->common.cpu_qstats));
 
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index 621686a6b5be..e5216f80883b 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -231,7 +231,7 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
 	}
 
 	tcf_lastuse_update(&m->tcf_tm);
-	bstats_cpu_update(this_cpu_ptr(m->common.cpu_bstats), skb);
+	tcf_action_update_bstats(&m->common, skb);
 
 	m_mac_header_xmit = READ_ONCE(m->tcfm_mac_header_xmit);
 	m_eaction = READ_ONCE(m->tcfm_eaction);
diff --git a/net/sched/act_tunnel_key.c b/net/sched/act_tunnel_key.c
index 2f83a79f76aa..9ab2d3b4a9fc 100644
--- a/net/sched/act_tunnel_key.c
+++ b/net/sched/act_tunnel_key.c
@@ -31,7 +31,7 @@ static int tunnel_key_act(struct sk_buff *skb, const struct tc_action *a,
 	params = rcu_dereference_bh(t->params);
 
 	tcf_lastuse_update(&t->tcf_tm);
-	bstats_cpu_update(this_cpu_ptr(t->common.cpu_bstats), skb);
+	tcf_action_update_bstats(&t->common, skb);
 	action = READ_ONCE(t->tcf_action);
 
 	switch (params->tcft_action) {
diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c
index 9e68edb22e53..f6dccaa29239 100644
--- a/net/sched/act_vlan.c
+++ b/net/sched/act_vlan.c
@@ -29,7 +29,7 @@ static int tcf_vlan_act(struct sk_buff *skb, const struct tc_action *a,
 	u16 tci;
 
 	tcf_lastuse_update(&v->tcf_tm);
-	bstats_cpu_update(this_cpu_ptr(v->common.cpu_bstats), skb);
+	tcf_action_update_bstats(&v->common, skb);
 
 	/* Ensure 'data' points at mac_header prior calling vlan manipulating
 	 * functions.
-- 
2.21.0



* [PATCH net-next 03/13] net: sched: extract qstats update code into functions
  2019-10-22 14:17 [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Vlad Buslov
  2019-10-22 14:17 ` [PATCH net-next 01/13] net: sched: extract common action counters update code into function Vlad Buslov
  2019-10-22 14:17 ` [PATCH net-next 02/13] net: sched: extract bstats " Vlad Buslov
@ 2019-10-22 14:17 ` Vlad Buslov
  2019-10-22 14:17 ` [PATCH net-next 04/13] net: sched: don't expose action qstats to skb_tc_reinsert() Vlad Buslov
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-22 14:17 UTC (permalink / raw)
  To: netdev
  Cc: jhs, xiyou.wangcong, jiri, davem, mleitner, dcaratti,
	Vlad Buslov, Jiri Pirko

Extract the common code that increments cpu_qstats counters into
standalone act API functions. Change the hardware-offloaded actions that
use percpu counter allocation to use the new functions instead of
accessing cpu_qstats directly.

This commit doesn't change functionality.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/act_api.h  | 16 ++++++++++++++++
 net/sched/act_csum.c   |  2 +-
 net/sched/act_ct.c     |  2 +-
 net/sched/act_gact.c   |  2 +-
 net/sched/act_mirred.c |  2 +-
 net/sched/act_vlan.c   |  2 +-
 6 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index 9a32853f77f9..8d6861ce205b 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -193,6 +193,22 @@ static inline void tcf_action_update_bstats(struct tc_action *a,
 	bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), skb);
 }
 
+static inline struct gnet_stats_queue *
+tcf_action_get_qstats(struct tc_action *a)
+{
+	return this_cpu_ptr(a->cpu_qstats);
+}
+
+static inline void tcf_action_inc_drop_qstats(struct tc_action *a)
+{
+	qstats_drop_inc(this_cpu_ptr(a->cpu_qstats));
+}
+
+static inline void tcf_action_inc_overlimit_qstats(struct tc_action *a)
+{
+	qstats_overlimit_inc(this_cpu_ptr(a->cpu_qstats));
+}
+
 void tcf_action_update_stats(struct tc_action *a, u64 bytes, u32 packets,
 			     bool drop, bool hw);
 int tcf_action_copy_stats(struct sk_buff *, struct tc_action *, int);
diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
index 69747b1860aa..bc909cf72257 100644
--- a/net/sched/act_csum.c
+++ b/net/sched/act_csum.c
@@ -624,7 +624,7 @@ static int tcf_csum_act(struct sk_buff *skb, const struct tc_action *a,
 	return action;
 
 drop:
-	qstats_drop_inc(this_cpu_ptr(p->common.cpu_qstats));
+	tcf_action_inc_drop_qstats(&p->common);
 	action = TC_ACT_SHOT;
 	goto out;
 }
diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index f9779907dcf7..eabae2227e13 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -469,7 +469,7 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
 	return retval;
 
 drop:
-	qstats_drop_inc(this_cpu_ptr(a->cpu_qstats));
+	tcf_action_inc_drop_qstats(&c->common);
 	return TC_ACT_SHOT;
 }
 
diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
index a7e3d5621608..221f0c2e26b1 100644
--- a/net/sched/act_gact.c
+++ b/net/sched/act_gact.c
@@ -163,7 +163,7 @@ static int tcf_gact_act(struct sk_buff *skb, const struct tc_action *a,
 #endif
 	tcf_action_update_bstats(&gact->common, skb);
 	if (action == TC_ACT_SHOT)
-		qstats_drop_inc(this_cpu_ptr(gact->common.cpu_qstats));
+		tcf_action_inc_drop_qstats(&gact->common);
 
 	tcf_lastuse_update(&gact->tcf_tm);
 
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index e5216f80883b..49a378a5b4fa 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -303,7 +303,7 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
 
 	if (err) {
 out:
-		qstats_overlimit_inc(this_cpu_ptr(m->common.cpu_qstats));
+		tcf_action_inc_overlimit_qstats(&m->common);
 		if (tcf_mirred_is_act_redirect(m_eaction))
 			retval = TC_ACT_SHOT;
 	}
diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c
index f6dccaa29239..ffa0f431aa84 100644
--- a/net/sched/act_vlan.c
+++ b/net/sched/act_vlan.c
@@ -88,7 +88,7 @@ static int tcf_vlan_act(struct sk_buff *skb, const struct tc_action *a,
 	return action;
 
 drop:
-	qstats_drop_inc(this_cpu_ptr(v->common.cpu_qstats));
+	tcf_action_inc_drop_qstats(&v->common);
 	return TC_ACT_SHOT;
 }
 
-- 
2.21.0



* [PATCH net-next 04/13] net: sched: don't expose action qstats to skb_tc_reinsert()
  2019-10-22 14:17 [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Vlad Buslov
                   ` (2 preceding siblings ...)
  2019-10-22 14:17 ` [PATCH net-next 03/13] net: sched: extract qstats update code into functions Vlad Buslov
@ 2019-10-22 14:17 ` Vlad Buslov
  2019-10-22 14:17 ` [PATCH net-next 05/13] net: sched: modify stats helper functions to support regular stats Vlad Buslov
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-22 14:17 UTC (permalink / raw)
  To: netdev
  Cc: jhs, xiyou.wangcong, jiri, davem, mleitner, dcaratti,
	Vlad Buslov, Jiri Pirko

The previous commit introduced helper functions for updating qstats and
refactored a set of actions to use the helpers instead of modifying
qstats directly. However, one of the affected actions exposes its qstats
to skb_tc_reinsert(), which then modifies them.

Refactor skb_tc_reinsert() to return an integer error code and no longer
increment the overlimit qstats on error, and use the returned error code
in tcf_mirred_act() to manually increment the overlimit counter with the
new helper function.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/sch_generic.h | 12 ++----------
 net/sched/act_mirred.c    |  4 ++--
 2 files changed, 4 insertions(+), 12 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 637548d54b3e..a8b0a9a4c686 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -1286,17 +1286,9 @@ void mini_qdisc_pair_swap(struct mini_Qdisc_pair *miniqp,
 void mini_qdisc_pair_init(struct mini_Qdisc_pair *miniqp, struct Qdisc *qdisc,
 			  struct mini_Qdisc __rcu **p_miniq);
 
-static inline void skb_tc_reinsert(struct sk_buff *skb, struct tcf_result *res)
+static inline int skb_tc_reinsert(struct sk_buff *skb, struct tcf_result *res)
 {
-	struct gnet_stats_queue *stats = res->qstats;
-	int ret;
-
-	if (res->ingress)
-		ret = netif_receive_skb(skb);
-	else
-		ret = dev_queue_xmit(skb);
-	if (ret && stats)
-		qstats_overlimit_inc(res->qstats);
+	return res->ingress ? netif_receive_skb(skb) : dev_queue_xmit(skb);
 }
 
 #endif
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index 49a378a5b4fa..ae1129aaf3c0 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -289,8 +289,8 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
 		/* let's the caller reinsert the packet, if possible */
 		if (use_reinsert) {
 			res->ingress = want_ingress;
-			res->qstats = this_cpu_ptr(m->common.cpu_qstats);
-			skb_tc_reinsert(skb, res);
+			if (skb_tc_reinsert(skb, res))
+				tcf_action_inc_overlimit_qstats(&m->common);
 			__this_cpu_dec(mirred_rec_level);
 			return TC_ACT_CONSUMED;
 		}
-- 
2.21.0



* [PATCH net-next 05/13] net: sched: modify stats helper functions to support regular stats
  2019-10-22 14:17 [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Vlad Buslov
                   ` (3 preceding siblings ...)
  2019-10-22 14:17 ` [PATCH net-next 04/13] net: sched: don't expose action qstats to skb_tc_reinsert() Vlad Buslov
@ 2019-10-22 14:17 ` Vlad Buslov
  2019-10-22 14:17 ` [PATCH net-next 06/13] net: sched: add action fast initialization flag Vlad Buslov
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-22 14:17 UTC (permalink / raw)
  To: netdev
  Cc: jhs, xiyou.wangcong, jiri, davem, mleitner, dcaratti,
	Vlad Buslov, Jiri Pirko

Modify the stats update helper functions introduced in the previous
patches in this series to fall back to the regular
tc_action->tcfa_{b|q}stats if cpu stats are not allocated for the action
argument. If regular non-percpu counters are in use, obtain the action
tcfa_lock while modifying them.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/act_api.h | 30 +++++++++++++++++++++---------
 net/sched/act_api.c   | 19 ++++++++++++++-----
 2 files changed, 35 insertions(+), 14 deletions(-)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index 8d6861ce205b..a56477051dae 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -190,23 +190,35 @@ int tcf_action_dump_1(struct sk_buff *skb, struct tc_action *a, int, int);
 static inline void tcf_action_update_bstats(struct tc_action *a,
 					    struct sk_buff *skb)
 {
-	bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), skb);
-}
-
-static inline struct gnet_stats_queue *
-tcf_action_get_qstats(struct tc_action *a)
-{
-	return this_cpu_ptr(a->cpu_qstats);
+	if (likely(a->cpu_bstats)) {
+		bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), skb);
+		return;
+	}
+	spin_lock(&a->tcfa_lock);
+	bstats_update(&a->tcfa_bstats, skb);
+	spin_unlock(&a->tcfa_lock);
 }
 
 static inline void tcf_action_inc_drop_qstats(struct tc_action *a)
 {
-	qstats_drop_inc(this_cpu_ptr(a->cpu_qstats));
+	if (likely(a->cpu_qstats)) {
+		qstats_drop_inc(this_cpu_ptr(a->cpu_qstats));
+		return;
+	}
+	spin_lock(&a->tcfa_lock);
+	qstats_drop_inc(&a->tcfa_qstats);
+	spin_unlock(&a->tcfa_lock);
 }
 
 static inline void tcf_action_inc_overlimit_qstats(struct tc_action *a)
 {
-	qstats_overlimit_inc(this_cpu_ptr(a->cpu_qstats));
+	if (likely(a->cpu_qstats)) {
+		qstats_overlimit_inc(this_cpu_ptr(a->cpu_qstats));
+		return;
+	}
+	spin_lock(&a->tcfa_lock);
+	qstats_overlimit_inc(&a->tcfa_qstats);
+	spin_unlock(&a->tcfa_lock);
 }
 
 void tcf_action_update_stats(struct tc_action *a, u64 bytes, u32 packets,
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 0638afa2fc3f..f85b88da5216 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -992,14 +992,23 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
 void tcf_action_update_stats(struct tc_action *a, u64 bytes, u32 packets,
 			     bool drop, bool hw)
 {
-	_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets);
+	if (a->cpu_bstats) {
+		_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets);
 
-	if (drop)
-		this_cpu_ptr(a->cpu_qstats)->drops += packets;
+		if (drop)
+			this_cpu_ptr(a->cpu_qstats)->drops += packets;
+
+		if (hw)
+			_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats_hw),
+					   bytes, packets);
+		return;
+	}
 
+	_bstats_update(&a->tcfa_bstats, bytes, packets);
+	if (drop)
+		a->tcfa_qstats.drops += packets;
 	if (hw)
-		_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats_hw),
-				   bytes, packets);
+		_bstats_update(&a->tcfa_bstats_hw, bytes, packets);
 }
 EXPORT_SYMBOL(tcf_action_update_stats);
 
-- 
2.21.0



* [PATCH net-next 06/13] net: sched: add action fast initialization flag
  2019-10-22 14:17 [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Vlad Buslov
                   ` (4 preceding siblings ...)
  2019-10-22 14:17 ` [PATCH net-next 05/13] net: sched: modify stats helper functions to support regular stats Vlad Buslov
@ 2019-10-22 14:17 ` Vlad Buslov
  2019-10-22 14:17 ` [PATCH net-next 07/13] net: sched: act_gact: support fast init flag to skip pcpu allocation Vlad Buslov
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-22 14:17 UTC (permalink / raw)
  To: netdev
  Cc: jhs, xiyou.wangcong, jiri, davem, mleitner, dcaratti,
	Vlad Buslov, Jiri Pirko

Add a new tc action "tcfa_flags" field and define the first flag value
TCA_ACT_FLAGS_FAST_INIT, which will be used by the following patches in
the series to skip percpu allocator usage in specific actions. Implement
a helper function to check for the fast initialization flag bit. Provide
a default allowed flags value to be used for netlink validation.

Refactor tcf_idr_create() to accept an additional 'flags' argument
instead of 'cpustats'. Don't allocate percpu counters if the
TCA_ACT_FLAGS_FAST_INIT flag is set. Always pass the
TCA_ACT_FLAGS_FAST_INIT flag to tcf_idr_create() from actions that don't
use percpu-allocated counters.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/act_api.h        | 10 +++++++++-
 include/net/pkt_cls.h        |  3 +++
 include/uapi/linux/pkt_cls.h |  8 ++++++++
 net/sched/act_api.c          |  5 +++--
 net/sched/act_bpf.c          |  2 +-
 net/sched/act_connmark.c     |  3 ++-
 net/sched/act_csum.c         |  2 +-
 net/sched/act_ct.c           |  2 +-
 net/sched/act_ctinfo.c       |  3 ++-
 net/sched/act_gact.c         |  2 +-
 net/sched/act_ife.c          |  2 +-
 net/sched/act_ipt.c          |  8 ++++----
 net/sched/act_mirred.c       |  2 +-
 net/sched/act_mpls.c         |  2 +-
 net/sched/act_nat.c          |  3 ++-
 net/sched/act_pedit.c        |  3 ++-
 net/sched/act_police.c       |  2 +-
 net/sched/act_sample.c       |  2 +-
 net/sched/act_simple.c       |  3 ++-
 net/sched/act_skbedit.c      |  2 +-
 net/sched/act_skbmod.c       |  2 +-
 net/sched/act_tunnel_key.c   |  2 +-
 net/sched/act_vlan.c         |  2 +-
 23 files changed, 50 insertions(+), 25 deletions(-)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index a56477051dae..d6c3e283f7e6 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -27,6 +27,7 @@ struct tc_action {
 	struct tcf_idrinfo		*idrinfo;
 
 	u32				tcfa_index;
+	u32				tcfa_flags;
 	refcount_t			tcfa_refcnt;
 	atomic_t			tcfa_bindcnt;
 	int				tcfa_action;
@@ -43,6 +44,7 @@ struct tc_action {
 	struct tcf_chain	__rcu *goto_chain;
 };
 #define tcf_index	common.tcfa_index
+#define tcf_flags	common.tcfa_flags
 #define tcf_refcnt	common.tcfa_refcnt
 #define tcf_bindcnt	common.tcfa_bindcnt
 #define tcf_action	common.tcfa_action
@@ -154,7 +156,7 @@ int tcf_generic_walker(struct tc_action_net *tn, struct sk_buff *skb,
 int tcf_idr_search(struct tc_action_net *tn, struct tc_action **a, u32 index);
 int tcf_idr_create(struct tc_action_net *tn, u32 index, struct nlattr *est,
 		   struct tc_action **a, const struct tc_action_ops *ops,
-		   int bind, bool cpustats);
+		   int bind, u32 flags);
 void tcf_idr_insert(struct tc_action_net *tn, struct tc_action *a);
 
 void tcf_idr_cleanup(struct tc_action_net *tn, u32 index);
@@ -230,6 +232,12 @@ int tcf_action_check_ctrlact(int action, struct tcf_proto *tp,
 			     struct netlink_ext_ack *newchain);
 struct tcf_chain *tcf_action_set_ctrlact(struct tc_action *a, int action,
 					 struct tcf_chain *newchain);
+
+static inline bool tc_act_fast_init(u32 flags)
+{
+	return (flags & TCA_ACT_FLAGS_FAST_INIT) ? true : false;
+}
+
 #endif /* CONFIG_NET_CLS_ACT */
 
 static inline void tcf_action_stats_update(struct tc_action *a, u64 bytes,
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index e553fc80eb23..53c118dcfeca 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -11,6 +11,9 @@
 /* TC action not accessible from user space */
 #define TC_ACT_CONSUMED		(TC_ACT_VALUE_MAX + 1)
 
+/* Default allowed action flags. */
+static const u32 tca_flags_allowed = TCA_ACT_FLAGS_FAST_INIT;
+
 /* Basic packet classifier frontend definitions. */
 
 struct tcf_walker {
diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index a6aa466fac9e..56664854a5ab 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -113,6 +113,14 @@ enum tca_id {
 
 #define TCA_ID_MAX __TCA_ID_MAX
 
+/* act flags definitions */
+#define TCA_ACT_FLAGS_FAST_INIT	(1 << 0) /* prefer action update rate
+					  * (slow-path), even at the cost of
+					  * reduced software data-path
+					  * performance (intended to be used for
+					  * hardware-offloaded actions)
+					  */
+
 struct tc_police {
 	__u32			index;
 	int			action;
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index f85b88da5216..948e458217cf 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -399,7 +399,7 @@ static int tcf_idr_delete_index(struct tcf_idrinfo *idrinfo, u32 index)
 
 int tcf_idr_create(struct tc_action_net *tn, u32 index, struct nlattr *est,
 		   struct tc_action **a, const struct tc_action_ops *ops,
-		   int bind, bool cpustats)
+		   int bind, u32 flags)
 {
 	struct tc_action *p = kzalloc(ops->size, GFP_KERNEL);
 	struct tcf_idrinfo *idrinfo = tn->idrinfo;
@@ -411,7 +411,7 @@ int tcf_idr_create(struct tc_action_net *tn, u32 index, struct nlattr *est,
 	if (bind)
 		atomic_set(&p->tcfa_bindcnt, 1);
 
-	if (cpustats) {
+	if (!tc_act_fast_init(flags)) {
 		p->cpu_bstats = netdev_alloc_pcpu_stats(struct gnet_stats_basic_cpu);
 		if (!p->cpu_bstats)
 			goto err1;
@@ -437,6 +437,7 @@ int tcf_idr_create(struct tc_action_net *tn, u32 index, struct nlattr *est,
 
 	p->idrinfo = idrinfo;
 	p->ops = ops;
+	p->tcfa_flags = flags;
 	*a = p;
 	return 0;
 err4:
diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c
index 04b7bd4ec751..250f995a06d1 100644
--- a/net/sched/act_bpf.c
+++ b/net/sched/act_bpf.c
@@ -303,7 +303,7 @@ static int tcf_bpf_init(struct net *net, struct nlattr *nla,
 	ret = tcf_idr_check_alloc(tn, &index, act, bind);
 	if (!ret) {
 		ret = tcf_idr_create(tn, index, est, act,
-				     &act_bpf_ops, bind, true);
+				     &act_bpf_ops, bind, 0);
 		if (ret < 0) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_connmark.c b/net/sched/act_connmark.c
index 2b43cacf82af..3222c7caeaba 100644
--- a/net/sched/act_connmark.c
+++ b/net/sched/act_connmark.c
@@ -121,7 +121,8 @@ static int tcf_connmark_init(struct net *net, struct nlattr *nla,
 	ret = tcf_idr_check_alloc(tn, &index, a, bind);
 	if (!ret) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_connmark_ops, bind, false);
+				     &act_connmark_ops, bind,
+				     TCA_ACT_FLAGS_FAST_INIT);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
index bc909cf72257..d2e6d8d77d9a 100644
--- a/net/sched/act_csum.c
+++ b/net/sched/act_csum.c
@@ -69,7 +69,7 @@ static int tcf_csum_init(struct net *net, struct nlattr *nla,
 	err = tcf_idr_check_alloc(tn, &index, a, bind);
 	if (!err) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_csum_ops, bind, true);
+				     &act_csum_ops, bind, 0);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index eabae2227e13..5dc8de42e1d4 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -689,7 +689,7 @@ static int tcf_ct_init(struct net *net, struct nlattr *nla,
 
 	if (!err) {
 		err = tcf_idr_create(tn, index, est, a,
-				     &act_ct_ops, bind, true);
+				     &act_ct_ops, bind, 0);
 		if (err) {
 			tcf_idr_cleanup(tn, index);
 			return err;
diff --git a/net/sched/act_ctinfo.c b/net/sched/act_ctinfo.c
index 0dbcfd1dca7b..1d80ae4eecc4 100644
--- a/net/sched/act_ctinfo.c
+++ b/net/sched/act_ctinfo.c
@@ -210,7 +210,8 @@ static int tcf_ctinfo_init(struct net *net, struct nlattr *nla,
 	err = tcf_idr_check_alloc(tn, &index, a, bind);
 	if (!err) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_ctinfo_ops, bind, false);
+				     &act_ctinfo_ops, bind,
+				     TCA_ACT_FLAGS_FAST_INIT);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
index 221f0c2e26b1..67654aaa37a3 100644
--- a/net/sched/act_gact.c
+++ b/net/sched/act_gact.c
@@ -99,7 +99,7 @@ static int tcf_gact_init(struct net *net, struct nlattr *nla,
 	err = tcf_idr_check_alloc(tn, &index, a, bind);
 	if (!err) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_gact_ops, bind, true);
+				     &act_gact_ops, bind, 0);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_ife.c b/net/sched/act_ife.c
index 3a31e241c647..b4de0addea04 100644
--- a/net/sched/act_ife.c
+++ b/net/sched/act_ife.c
@@ -522,7 +522,7 @@ static int tcf_ife_init(struct net *net, struct nlattr *nla,
 
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, est, a, &act_ife_ops,
-				     bind, true);
+				     bind, 0);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			kfree(p);
diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c
index 214a03d405cf..dc6a8c0d0905 100644
--- a/net/sched/act_ipt.c
+++ b/net/sched/act_ipt.c
@@ -95,7 +95,7 @@ static const struct nla_policy ipt_policy[TCA_IPT_MAX + 1] = {
 static int __tcf_ipt_init(struct net *net, unsigned int id, struct nlattr *nla,
 			  struct nlattr *est, struct tc_action **a,
 			  const struct tc_action_ops *ops, int ovr, int bind,
-			  struct tcf_proto *tp)
+			  struct tcf_proto *tp, struct netlink_ext_ack *extack)
 {
 	struct tc_action_net *tn = net_generic(net, id);
 	struct nlattr *tb[TCA_IPT_MAX + 1];
@@ -144,7 +144,7 @@ static int __tcf_ipt_init(struct net *net, unsigned int id, struct nlattr *nla,
 
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, est, a, ops, bind,
-				     false);
+				     TCA_ACT_FLAGS_FAST_INIT);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
@@ -208,7 +208,7 @@ static int tcf_ipt_init(struct net *net, struct nlattr *nla,
 			struct netlink_ext_ack *extack)
 {
 	return __tcf_ipt_init(net, ipt_net_id, nla, est, a, &act_ipt_ops, ovr,
-			      bind, tp);
+			      bind, tp, extack);
 }
 
 static int tcf_xt_init(struct net *net, struct nlattr *nla,
@@ -217,7 +217,7 @@ static int tcf_xt_init(struct net *net, struct nlattr *nla,
 		       struct netlink_ext_ack *extack)
 {
 	return __tcf_ipt_init(net, xt_net_id, nla, est, a, &act_xt_ops, ovr,
-			      bind, tp);
+			      bind, tp, extack);
 }
 
 static int tcf_ipt_act(struct sk_buff *skb, const struct tc_action *a,
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index ae1129aaf3c0..f5582236bcee 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -149,7 +149,7 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
 			return -EINVAL;
 		}
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_mirred_ops, bind, true);
+				     &act_mirred_ops, bind, 0);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_mpls.c b/net/sched/act_mpls.c
index 4cf6c553bb0b..91083075e148 100644
--- a/net/sched/act_mpls.c
+++ b/net/sched/act_mpls.c
@@ -224,7 +224,7 @@ static int tcf_mpls_init(struct net *net, struct nlattr *nla,
 
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_mpls_ops, bind, true);
+				     &act_mpls_ops, bind, 0);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_nat.c b/net/sched/act_nat.c
index ea4c5359e7df..d66fed95a7d4 100644
--- a/net/sched/act_nat.c
+++ b/net/sched/act_nat.c
@@ -61,7 +61,8 @@ static int tcf_nat_init(struct net *net, struct nlattr *nla, struct nlattr *est,
 	err = tcf_idr_check_alloc(tn, &index, a, bind);
 	if (!err) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_nat_ops, bind, false);
+				     &act_nat_ops, bind,
+				     TCA_ACT_FLAGS_FAST_INIT);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
index cdfaa79382a2..65b0c7428341 100644
--- a/net/sched/act_pedit.c
+++ b/net/sched/act_pedit.c
@@ -190,7 +190,8 @@ static int tcf_pedit_init(struct net *net, struct nlattr *nla,
 			goto out_free;
 		}
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_pedit_ops, bind, false);
+				     &act_pedit_ops, bind,
+				     TCA_ACT_FLAGS_FAST_INIT);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			goto out_free;
diff --git a/net/sched/act_police.c b/net/sched/act_police.c
index 51d34b1a61d5..46528dcd907c 100644
--- a/net/sched/act_police.c
+++ b/net/sched/act_police.c
@@ -87,7 +87,7 @@ static int tcf_police_init(struct net *net, struct nlattr *nla,
 
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, NULL, a,
-				     &act_police_ops, bind, true);
+				     &act_police_ops, bind, 0);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_sample.c b/net/sched/act_sample.c
index 514456a0b9a8..ee7bb68cb919 100644
--- a/net/sched/act_sample.c
+++ b/net/sched/act_sample.c
@@ -69,7 +69,7 @@ static int tcf_sample_init(struct net *net, struct nlattr *nla,
 
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_sample_ops, bind, true);
+				     &act_sample_ops, bind, 0);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_simple.c b/net/sched/act_simple.c
index 6120e56117ca..ff6ab2708978 100644
--- a/net/sched/act_simple.c
+++ b/net/sched/act_simple.c
@@ -127,7 +127,8 @@ static int tcf_simp_init(struct net *net, struct nlattr *nla,
 
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_simp_ops, bind, false);
+				     &act_simp_ops, bind,
+				     TCA_ACT_FLAGS_FAST_INIT);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_skbedit.c b/net/sched/act_skbedit.c
index 6a8d3337c577..a1ac65116391 100644
--- a/net/sched/act_skbedit.c
+++ b/net/sched/act_skbedit.c
@@ -165,7 +165,7 @@ static int tcf_skbedit_init(struct net *net, struct nlattr *nla,
 
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_skbedit_ops, bind, true);
+				     &act_skbedit_ops, bind, 0);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_skbmod.c b/net/sched/act_skbmod.c
index 888437f97ba6..1dfe42503835 100644
--- a/net/sched/act_skbmod.c
+++ b/net/sched/act_skbmod.c
@@ -143,7 +143,7 @@ static int tcf_skbmod_init(struct net *net, struct nlattr *nla,
 
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_skbmod_ops, bind, true);
+				     &act_skbmod_ops, bind, 0);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
diff --git a/net/sched/act_tunnel_key.c b/net/sched/act_tunnel_key.c
index 9ab2d3b4a9fc..b517bcf9152c 100644
--- a/net/sched/act_tunnel_key.c
+++ b/net/sched/act_tunnel_key.c
@@ -348,7 +348,7 @@ static int tunnel_key_init(struct net *net, struct nlattr *nla,
 
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_tunnel_key_ops, bind, true);
+				     &act_tunnel_key_ops, bind, 0);
 		if (ret) {
 			NL_SET_ERR_MSG(extack, "Cannot create TC IDR");
 			goto release_tun_meta;
diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c
index ffa0f431aa84..f7aff66e5026 100644
--- a/net/sched/act_vlan.c
+++ b/net/sched/act_vlan.c
@@ -189,7 +189,7 @@ static int tcf_vlan_init(struct net *net, struct nlattr *nla,
 
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_vlan_ops, bind, true);
+				     &act_vlan_ops, bind, 0);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
-- 
2.21.0



* [PATCH net-next 07/13] net: sched: act_gact: support fast init flag to skip pcpu allocation
  2019-10-22 14:17 [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Vlad Buslov
                   ` (5 preceding siblings ...)
  2019-10-22 14:17 ` [PATCH net-next 06/13] net: sched: add action fast initialization flag Vlad Buslov
@ 2019-10-22 14:17 ` Vlad Buslov
  2019-10-22 14:17 ` [PATCH net-next 08/13] net: sched: act_csum: " Vlad Buslov
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-22 14:17 UTC (permalink / raw)
  To: netdev
  Cc: jhs, xiyou.wangcong, jiri, davem, mleitner, dcaratti,
	Vlad Buslov, Jiri Pirko

Extend the gact action with a u32 netlink TCA_GACT_FLAGS field. Use it
to pass the fast initialization flag defined by the previous patch in
this series and don't allocate percpu counters in the init code when the
flag is set. Extend the action dump callback to also dump the flags, if
the field is set.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/uapi/linux/tc_act/tc_gact.h |  1 +
 net/sched/act_gact.c                | 18 ++++++++++++++++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/tc_act/tc_gact.h b/include/uapi/linux/tc_act/tc_gact.h
index 37e5392e02c7..b621b7df9919 100644
--- a/include/uapi/linux/tc_act/tc_gact.h
+++ b/include/uapi/linux/tc_act/tc_gact.h
@@ -26,6 +26,7 @@ enum {
 	TCA_GACT_PARMS,
 	TCA_GACT_PROB,
 	TCA_GACT_PAD,
+	TCA_GACT_FLAGS,
 	__TCA_GACT_MAX
 };
 #define TCA_GACT_MAX (__TCA_GACT_MAX - 1)
diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
index 67654aaa37a3..9d052ade336a 100644
--- a/net/sched/act_gact.c
+++ b/net/sched/act_gact.c
@@ -48,6 +48,8 @@ static g_rand gact_rand[MAX_RAND] = { NULL, gact_net_rand, gact_determ };
 static const struct nla_policy gact_policy[TCA_GACT_MAX + 1] = {
 	[TCA_GACT_PARMS]	= { .len = sizeof(struct tc_gact) },
 	[TCA_GACT_PROB]		= { .len = sizeof(struct tc_gact_p) },
+	[TCA_GACT_FLAGS]	= { .type = NLA_BITFIELD32,
+				    .validation_data = &tca_flags_allowed },
 };
 
 static int tcf_gact_init(struct net *net, struct nlattr *nla,
@@ -60,8 +62,8 @@ static int tcf_gact_init(struct net *net, struct nlattr *nla,
 	struct tcf_chain *goto_ch = NULL;
 	struct tc_gact *parm;
 	struct tcf_gact *gact;
+	u32 index, flags = 0;
 	int ret = 0;
-	u32 index;
 	int err;
 #ifdef CONFIG_GACT_PROB
 	struct tc_gact_p *p_parm = NULL;
@@ -96,10 +98,13 @@ static int tcf_gact_init(struct net *net, struct nlattr *nla,
 	}
 #endif
 
+	if (tb[TCA_GACT_FLAGS])
+		flags = nla_get_bitfield32(tb[TCA_GACT_FLAGS]).value;
+
 	err = tcf_idr_check_alloc(tn, &index, a, bind);
 	if (!err) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_gact_ops, bind, 0);
+				     &act_gact_ops, bind, flags);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
@@ -209,6 +214,15 @@ static int tcf_gact_dump(struct sk_buff *skb, struct tc_action *a,
 			goto nla_put_failure;
 	}
 #endif
+	if (gact->tcf_flags) {
+		struct nla_bitfield32 flags = { gact->tcf_flags,
+						gact->tcf_flags };
+
+		if (nla_put(skb, TCA_GACT_FLAGS, sizeof(struct nla_bitfield32),
+			    &flags))
+			goto nla_put_failure;
+	}
+
 	tcf_tm_dump(&t, &gact->tcf_tm);
 	if (nla_put_64bit(skb, TCA_GACT_TM, sizeof(t), &t, TCA_GACT_PAD))
 		goto nla_put_failure;
-- 
2.21.0



* [PATCH net-next 08/13] net: sched: act_csum: support fast init flag to skip pcpu allocation
  2019-10-22 14:17 [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Vlad Buslov
                   ` (6 preceding siblings ...)
  2019-10-22 14:17 ` [PATCH net-next 07/13] net: sched: act_gact: support fast init flag to skip pcpu allocation Vlad Buslov
@ 2019-10-22 14:17 ` Vlad Buslov
  2019-10-22 14:18 ` [PATCH net-next 09/13] net: sched: act_mirred: support fast init flag to skip pcpu alloc Vlad Buslov
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-22 14:17 UTC (permalink / raw)
  To: netdev
  Cc: jhs, xiyou.wangcong, jiri, davem, mleitner, dcaratti,
	Vlad Buslov, Jiri Pirko

Extend the csum action with a u32 netlink TCA_CSUM_FLAGS field. Use it to
pass the fast initialization flag defined earlier in this series and don't
allocate percpu counters in the init code when the flag is set. Extend the
action dump callback to also dump the flags field when it is set.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/uapi/linux/tc_act/tc_csum.h |  1 +
 net/sched/act_csum.c                | 18 ++++++++++++++++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/tc_act/tc_csum.h b/include/uapi/linux/tc_act/tc_csum.h
index 94b2044929de..14eddaca85f8 100644
--- a/include/uapi/linux/tc_act/tc_csum.h
+++ b/include/uapi/linux/tc_act/tc_csum.h
@@ -10,6 +10,7 @@ enum {
 	TCA_CSUM_PARMS,
 	TCA_CSUM_TM,
 	TCA_CSUM_PAD,
+	TCA_CSUM_FLAGS,
 	__TCA_CSUM_MAX
 };
 #define TCA_CSUM_MAX (__TCA_CSUM_MAX - 1)
diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
index d2e6d8d77d9a..655d80ccbac2 100644
--- a/net/sched/act_csum.c
+++ b/net/sched/act_csum.c
@@ -35,6 +35,8 @@
 
 static const struct nla_policy csum_policy[TCA_CSUM_MAX + 1] = {
 	[TCA_CSUM_PARMS] = { .len = sizeof(struct tc_csum), },
+	[TCA_CSUM_FLAGS] = { .type = NLA_BITFIELD32,
+			     .validation_data = &tca_flags_allowed },
 };
 
 static unsigned int csum_net_id;
@@ -50,9 +52,9 @@ static int tcf_csum_init(struct net *net, struct nlattr *nla,
 	struct nlattr *tb[TCA_CSUM_MAX + 1];
 	struct tcf_chain *goto_ch = NULL;
 	struct tc_csum *parm;
+	u32 index, flags = 0;
 	struct tcf_csum *p;
 	int ret = 0, err;
-	u32 index;
 
 	if (nla == NULL)
 		return -EINVAL;
@@ -66,10 +68,14 @@ static int tcf_csum_init(struct net *net, struct nlattr *nla,
 		return -EINVAL;
 	parm = nla_data(tb[TCA_CSUM_PARMS]);
 	index = parm->index;
+
+	if (tb[TCA_CSUM_FLAGS])
+		flags = nla_get_bitfield32(tb[TCA_CSUM_FLAGS]).value;
+
 	err = tcf_idr_check_alloc(tn, &index, a, bind);
 	if (!err) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_csum_ops, bind, 0);
+				     &act_csum_ops, bind, flags);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
@@ -650,6 +656,14 @@ static int tcf_csum_dump(struct sk_buff *skb, struct tc_action *a, int bind,
 
 	if (nla_put(skb, TCA_CSUM_PARMS, sizeof(opt), &opt))
 		goto nla_put_failure;
+	if (p->tcf_flags) {
+		struct nla_bitfield32 flags = { p->tcf_flags,
+						p->tcf_flags };
+
+		if (nla_put(skb, TCA_CSUM_FLAGS, sizeof(struct nla_bitfield32),
+			    &flags))
+			goto nla_put_failure;
+	}
 
 	tcf_tm_dump(&t, &p->tcf_tm);
 	if (nla_put_64bit(skb, TCA_CSUM_TM, sizeof(t), &t, TCA_CSUM_PAD))
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH net-next 09/13] net: sched: act_mirred: support fast init flag to skip pcpu alloc
  2019-10-22 14:17 [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Vlad Buslov
                   ` (7 preceding siblings ...)
  2019-10-22 14:17 ` [PATCH net-next 08/13] net: sched: act_csum: " Vlad Buslov
@ 2019-10-22 14:18 ` Vlad Buslov
  2019-10-22 14:18 ` [PATCH net-next 10/13] net: sched: tunnel_key: " Vlad Buslov
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-22 14:18 UTC (permalink / raw)
  To: netdev
  Cc: jhs, xiyou.wangcong, jiri, davem, mleitner, dcaratti,
	Vlad Buslov, Jiri Pirko

Extend the mirred action with a u32 netlink TCA_MIRRED_FLAGS field. Use it
to pass the fast initialization flag defined earlier in this series and
don't allocate percpu counters in the init code when the flag is set.
Extend the action dump callback to also dump the flags field when it is set.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/uapi/linux/tc_act/tc_mirred.h |  1 +
 net/sched/act_mirred.c                | 17 +++++++++++++++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/tc_act/tc_mirred.h b/include/uapi/linux/tc_act/tc_mirred.h
index 2500a0005d05..322108890807 100644
--- a/include/uapi/linux/tc_act/tc_mirred.h
+++ b/include/uapi/linux/tc_act/tc_mirred.h
@@ -21,6 +21,7 @@ enum {
 	TCA_MIRRED_TM,
 	TCA_MIRRED_PARMS,
 	TCA_MIRRED_PAD,
+	TCA_MIRRED_FLAGS,
 	__TCA_MIRRED_MAX
 };
 #define TCA_MIRRED_MAX (__TCA_MIRRED_MAX - 1)
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index f5582236bcee..2a56ddbacff4 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -84,6 +84,8 @@ static void tcf_mirred_release(struct tc_action *a)
 
 static const struct nla_policy mirred_policy[TCA_MIRRED_MAX + 1] = {
 	[TCA_MIRRED_PARMS]	= { .len = sizeof(struct tc_mirred) },
+	[TCA_MIRRED_FLAGS]	= { .type = NLA_BITFIELD32,
+				    .validation_data = &tca_flags_allowed },
 };
 
 static unsigned int mirred_net_id;
@@ -102,9 +104,9 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
 	struct tc_mirred *parm;
 	struct tcf_mirred *m;
 	struct net_device *dev;
+	u32 index, flags = 0;
 	bool exists = false;
 	int ret, err;
-	u32 index;
 
 	if (!nla) {
 		NL_SET_ERR_MSG_MOD(extack, "Mirred requires attributes to be passed");
@@ -142,6 +144,9 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
 		return -EINVAL;
 	}
 
+	if (tb[TCA_MIRRED_FLAGS])
+		flags = nla_get_bitfield32(tb[TCA_MIRRED_FLAGS]).value;
+
 	if (!exists) {
 		if (!parm->ifindex) {
 			tcf_idr_cleanup(tn, index);
@@ -149,7 +154,7 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
 			return -EINVAL;
 		}
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_mirred_ops, bind, 0);
+				     &act_mirred_ops, bind, flags);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
@@ -344,6 +349,14 @@ static int tcf_mirred_dump(struct sk_buff *skb, struct tc_action *a, int bind,
 
 	if (nla_put(skb, TCA_MIRRED_PARMS, sizeof(opt), &opt))
 		goto nla_put_failure;
+	if (m->tcf_flags) {
+		struct nla_bitfield32 flags = { m->tcf_flags,
+						m->tcf_flags };
+
+		if (nla_put(skb, TCA_MIRRED_FLAGS,
+			    sizeof(struct nla_bitfield32), &flags))
+			goto nla_put_failure;
+	}
 
 	tcf_tm_dump(&t, &m->tcf_tm);
 	if (nla_put_64bit(skb, TCA_MIRRED_TM, sizeof(t), &t, TCA_MIRRED_PAD))
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH net-next 10/13] net: sched: tunnel_key: support fast init flag to skip pcpu alloc
  2019-10-22 14:17 [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Vlad Buslov
                   ` (8 preceding siblings ...)
  2019-10-22 14:18 ` [PATCH net-next 09/13] net: sched: act_mirred: support fast init flag to skip pcpu alloc Vlad Buslov
@ 2019-10-22 14:18 ` Vlad Buslov
  2019-10-22 14:18 ` [PATCH net-next 11/13] net: sched: act_vlan: support fast init flag to skip pcpu allocation Vlad Buslov
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-22 14:18 UTC (permalink / raw)
  To: netdev
  Cc: jhs, xiyou.wangcong, jiri, davem, mleitner, dcaratti,
	Vlad Buslov, Jiri Pirko

Extend the tunnel_key action with a u32 netlink TCA_TUNNEL_KEY_FLAGS field.
Use it to pass the fast initialization flag defined earlier in this series
and don't allocate percpu counters in the init code when the flag is set.
Extend the action dump callback to also dump the flags field when it is set.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/uapi/linux/tc_act/tc_tunnel_key.h |  1 +
 net/sched/act_tunnel_key.c                | 18 ++++++++++++++++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/tc_act/tc_tunnel_key.h b/include/uapi/linux/tc_act/tc_tunnel_key.h
index 41c8b462c177..f572101f6adc 100644
--- a/include/uapi/linux/tc_act/tc_tunnel_key.h
+++ b/include/uapi/linux/tc_act/tc_tunnel_key.h
@@ -39,6 +39,7 @@ enum {
 					 */
 	TCA_TUNNEL_KEY_ENC_TOS,		/* u8 */
 	TCA_TUNNEL_KEY_ENC_TTL,		/* u8 */
+	TCA_TUNNEL_KEY_FLAGS,
 	__TCA_TUNNEL_KEY_MAX,
 };
 
diff --git a/net/sched/act_tunnel_key.c b/net/sched/act_tunnel_key.c
index b517bcf9152c..965912b1cc74 100644
--- a/net/sched/act_tunnel_key.c
+++ b/net/sched/act_tunnel_key.c
@@ -193,6 +193,8 @@ static const struct nla_policy tunnel_key_policy[TCA_TUNNEL_KEY_MAX + 1] = {
 	[TCA_TUNNEL_KEY_ENC_OPTS]     = { .type = NLA_NESTED },
 	[TCA_TUNNEL_KEY_ENC_TOS]      = { .type = NLA_U8 },
 	[TCA_TUNNEL_KEY_ENC_TTL]      = { .type = NLA_U8 },
+	[TCA_TUNNEL_KEY_FLAGS]	= { .type = NLA_BITFIELD32,
+				    .validation_data = &tca_flags_allowed },
 };
 
 static void tunnel_key_release_params(struct tcf_tunnel_key_params *p)
@@ -218,6 +220,7 @@ static int tunnel_key_init(struct net *net, struct nlattr *nla,
 	struct tcf_chain *goto_ch = NULL;
 	struct tc_tunnel_key *parm;
 	struct tcf_tunnel_key *t;
+	u32 index, act_flags = 0;
 	bool exists = false;
 	__be16 dst_port = 0;
 	__be64 key_id = 0;
@@ -225,7 +228,6 @@ static int tunnel_key_init(struct net *net, struct nlattr *nla,
 	__be16 flags = 0;
 	u8 tos, ttl;
 	int ret = 0;
-	u32 index;
 	int err;
 
 	if (!nla) {
@@ -346,9 +348,12 @@ static int tunnel_key_init(struct net *net, struct nlattr *nla,
 		goto err_out;
 	}
 
+	if (tb[TCA_TUNNEL_KEY_FLAGS])
+		act_flags = nla_get_bitfield32(tb[TCA_TUNNEL_KEY_FLAGS]).value;
+
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_tunnel_key_ops, bind, 0);
+				     &act_tunnel_key_ops, bind, act_flags);
 		if (ret) {
 			NL_SET_ERR_MSG(extack, "Cannot create TC IDR");
 			goto release_tun_meta;
@@ -552,6 +557,15 @@ static int tunnel_key_dump(struct sk_buff *skb, struct tc_action *a,
 			goto nla_put_failure;
 	}
 
+	if (t->tcf_flags) {
+		struct nla_bitfield32 flags = { t->tcf_flags,
+						t->tcf_flags };
+
+		if (nla_put(skb, TCA_TUNNEL_KEY_FLAGS,
+			    sizeof(struct nla_bitfield32), &flags))
+			goto nla_put_failure;
+	}
+
 	tcf_tm_dump(&tm, &t->tcf_tm);
 	if (nla_put_64bit(skb, TCA_TUNNEL_KEY_TM, sizeof(tm),
 			  &tm, TCA_TUNNEL_KEY_PAD))
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH net-next 11/13] net: sched: act_vlan: support fast init flag to skip pcpu allocation
  2019-10-22 14:17 [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Vlad Buslov
                   ` (9 preceding siblings ...)
  2019-10-22 14:18 ` [PATCH net-next 10/13] net: sched: tunnel_key: " Vlad Buslov
@ 2019-10-22 14:18 ` Vlad Buslov
  2019-10-22 14:18 ` [PATCH net-next 12/13] net: sched: act_ct: " Vlad Buslov
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-22 14:18 UTC (permalink / raw)
  To: netdev
  Cc: jhs, xiyou.wangcong, jiri, davem, mleitner, dcaratti,
	Vlad Buslov, Jiri Pirko

Extend the vlan action with a u32 netlink TCA_VLAN_FLAGS field. Use it to
pass the fast initialization flag defined earlier in this series and don't
allocate percpu counters in the init code when the flag is set. Extend the
action dump callback to also dump the flags field when it is set.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/uapi/linux/tc_act/tc_vlan.h |  1 +
 net/sched/act_vlan.c                | 17 +++++++++++++++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/tc_act/tc_vlan.h b/include/uapi/linux/tc_act/tc_vlan.h
index 168995b54a70..b2847ed14f08 100644
--- a/include/uapi/linux/tc_act/tc_vlan.h
+++ b/include/uapi/linux/tc_act/tc_vlan.h
@@ -30,6 +30,7 @@ enum {
 	TCA_VLAN_PUSH_VLAN_PROTOCOL,
 	TCA_VLAN_PAD,
 	TCA_VLAN_PUSH_VLAN_PRIORITY,
+	TCA_VLAN_FLAGS,
 	__TCA_VLAN_MAX,
 };
 #define TCA_VLAN_MAX (__TCA_VLAN_MAX - 1)
diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c
index f7aff66e5026..063c6afd51c4 100644
--- a/net/sched/act_vlan.c
+++ b/net/sched/act_vlan.c
@@ -97,6 +97,8 @@ static const struct nla_policy vlan_policy[TCA_VLAN_MAX + 1] = {
 	[TCA_VLAN_PUSH_VLAN_ID]		= { .type = NLA_U16 },
 	[TCA_VLAN_PUSH_VLAN_PROTOCOL]	= { .type = NLA_U16 },
 	[TCA_VLAN_PUSH_VLAN_PRIORITY]	= { .type = NLA_U8 },
+	[TCA_VLAN_FLAGS]	= { .type = NLA_BITFIELD32,
+				    .validation_data = &tca_flags_allowed },
 };
 
 static int tcf_vlan_init(struct net *net, struct nlattr *nla,
@@ -109,6 +111,7 @@ static int tcf_vlan_init(struct net *net, struct nlattr *nla,
 	struct tcf_chain *goto_ch = NULL;
 	struct tcf_vlan_params *p;
 	struct tc_vlan *parm;
+	u32 index, flags = 0;
 	struct tcf_vlan *v;
 	int action;
 	u16 push_vid = 0;
@@ -116,7 +119,6 @@ static int tcf_vlan_init(struct net *net, struct nlattr *nla,
 	u8 push_prio = 0;
 	bool exists = false;
 	int ret = 0, err;
-	u32 index;
 
 	if (!nla)
 		return -EINVAL;
@@ -187,9 +189,12 @@ static int tcf_vlan_init(struct net *net, struct nlattr *nla,
 	}
 	action = parm->v_action;
 
+	if (tb[TCA_VLAN_FLAGS])
+		flags = nla_get_bitfield32(tb[TCA_VLAN_FLAGS]).value;
+
 	if (!exists) {
 		ret = tcf_idr_create(tn, index, est, a,
-				     &act_vlan_ops, bind, 0);
+				     &act_vlan_ops, bind, flags);
 		if (ret) {
 			tcf_idr_cleanup(tn, index);
 			return ret;
@@ -277,6 +282,14 @@ static int tcf_vlan_dump(struct sk_buff *skb, struct tc_action *a,
 	     (nla_put_u8(skb, TCA_VLAN_PUSH_VLAN_PRIORITY,
 					      p->tcfv_push_prio))))
 		goto nla_put_failure;
+	if (v->tcf_flags) {
+		struct nla_bitfield32 flags = { v->tcf_flags,
+						v->tcf_flags };
+
+		if (nla_put(skb, TCA_VLAN_FLAGS, sizeof(struct nla_bitfield32),
+			    &flags))
+			goto nla_put_failure;
+	}
 
 	tcf_tm_dump(&t, &v->tcf_tm);
 	if (nla_put_64bit(skb, TCA_VLAN_TM, sizeof(t), &t, TCA_VLAN_PAD))
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH net-next 12/13] net: sched: act_ct: support fast init flag to skip pcpu allocation
  2019-10-22 14:17 [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Vlad Buslov
                   ` (10 preceding siblings ...)
  2019-10-22 14:18 ` [PATCH net-next 11/13] net: sched: act_vlan: support fast init flag to skip pcpu allocation Vlad Buslov
@ 2019-10-22 14:18 ` Vlad Buslov
  2019-10-22 14:18 ` [PATCH net-next 13/13] tc-testing: implement tests for new fast_init action flag Vlad Buslov
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-22 14:18 UTC (permalink / raw)
  To: netdev
  Cc: jhs, xiyou.wangcong, jiri, davem, mleitner, dcaratti,
	Vlad Buslov, Jiri Pirko

Extend the ct action with a u32 netlink TCA_CT_FLAGS field. Use it to pass
the fast initialization flag defined earlier in this series and don't
allocate percpu counters in the init code when the flag is set. Extend the
action dump callback to also dump the flags field when it is set.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/uapi/linux/tc_act/tc_ct.h |  1 +
 net/sched/act_ct.c                | 17 +++++++++++++++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/tc_act/tc_ct.h b/include/uapi/linux/tc_act/tc_ct.h
index 5fb1d7ac1027..82ea92b59d3d 100644
--- a/include/uapi/linux/tc_act/tc_ct.h
+++ b/include/uapi/linux/tc_act/tc_ct.h
@@ -22,6 +22,7 @@ enum {
 	TCA_CT_NAT_PORT_MIN,	/* be16 */
 	TCA_CT_NAT_PORT_MAX,	/* be16 */
 	TCA_CT_PAD,
+	TCA_CT_FLAGS,
 	__TCA_CT_MAX
 };
 
diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index 5dc8de42e1d4..04968dad328a 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -492,6 +492,8 @@ static const struct nla_policy ct_policy[TCA_CT_MAX + 1] = {
 				   .len = sizeof(struct in6_addr) },
 	[TCA_CT_NAT_PORT_MIN] = { .type = NLA_U16 },
 	[TCA_CT_NAT_PORT_MAX] = { .type = NLA_U16 },
+	[TCA_CT_FLAGS] = { .type = NLA_BITFIELD32,
+			   .validation_data = &tca_flags_allowed },
 };
 
 static int tcf_ct_fill_params_nat(struct tcf_ct_params *p,
@@ -663,10 +665,10 @@ static int tcf_ct_init(struct net *net, struct nlattr *nla,
 	struct tcf_ct_params *params = NULL;
 	struct nlattr *tb[TCA_CT_MAX + 1];
 	struct tcf_chain *goto_ch = NULL;
+	u32 index, flags = 0;
 	struct tc_ct *parm;
 	struct tcf_ct *c;
 	int err, res = 0;
-	u32 index;
 
 	if (!nla) {
 		NL_SET_ERR_MSG_MOD(extack, "Ct requires attributes to be passed");
@@ -687,9 +689,12 @@ static int tcf_ct_init(struct net *net, struct nlattr *nla,
 	if (err < 0)
 		return err;
 
+	if (tb[TCA_CT_FLAGS])
+		flags = nla_get_bitfield32(tb[TCA_CT_FLAGS]).value;
+
 	if (!err) {
 		err = tcf_idr_create(tn, index, est, a,
-				     &act_ct_ops, bind, 0);
+				     &act_ct_ops, bind, flags);
 		if (err) {
 			tcf_idr_cleanup(tn, index);
 			return err;
@@ -870,6 +875,14 @@ static inline int tcf_ct_dump(struct sk_buff *skb, struct tc_action *a,
 skip_dump:
 	if (nla_put(skb, TCA_CT_PARMS, sizeof(opt), &opt))
 		goto nla_put_failure;
+	if (c->tcf_flags) {
+		struct nla_bitfield32 flags = { c->tcf_flags,
+						c->tcf_flags };
+
+		if (nla_put(skb, TCA_CT_FLAGS, sizeof(struct nla_bitfield32),
+			    &flags))
+			goto nla_put_failure;
+	}
 
 	tcf_tm_dump(&t, &c->tcf_tm);
 	if (nla_put_64bit(skb, TCA_CT_TM, sizeof(t), &t, TCA_CT_PAD))
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH net-next 13/13] tc-testing: implement tests for new fast_init action flag
  2019-10-22 14:17 [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Vlad Buslov
                   ` (11 preceding siblings ...)
  2019-10-22 14:18 ` [PATCH net-next 12/13] net: sched: act_ct: " Vlad Buslov
@ 2019-10-22 14:18 ` Vlad Buslov
  2019-10-22 14:28 ` [PATCH iproute2-next] tc: implement support for action flags Vlad Buslov
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-22 14:18 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, mleitner, dcaratti, Vlad Buslov

Add basic tests that verify action creation with the new fast_init flag
for all actions that support it.
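
A quick way to exercise the new cases, assuming the usual tdc.py runner
in tools/testing/selftests/tc-testing and root privileges (exact options
may differ between tdc versions):

 # run one of the new test IDs added by this patch; the other IDs
 # (eaf0, 3991, 31e3, 0cd2, 1a3d) are executed the same way
 $ cd tools/testing/selftests/tc-testing
 $ sudo ./tdc.py -e 95ad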

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
---
 .../tc-testing/tc-tests/actions/csum.json     | 24 +++++++++++++++++++
 .../tc-testing/tc-tests/actions/ct.json       | 24 +++++++++++++++++++
 .../tc-testing/tc-tests/actions/gact.json     | 24 +++++++++++++++++++
 .../tc-testing/tc-tests/actions/mirred.json   | 24 +++++++++++++++++++
 .../tc-tests/actions/tunnel_key.json          | 24 +++++++++++++++++++
 .../tc-testing/tc-tests/actions/vlan.json     | 24 +++++++++++++++++++
 6 files changed, 144 insertions(+)

diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/csum.json b/tools/testing/selftests/tc-testing/tc-tests/actions/csum.json
index ddabb2fbb7c7..1dbbd53b994f 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/actions/csum.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/actions/csum.json
@@ -525,5 +525,29 @@
         "teardown": [
             "$TC actions flush action csum"
         ]
+    },
+    {
+        "id": "eaf0",
+        "name": "Add csum iph action with fast_init flag",
+        "category": [
+            "actions",
+            "csum"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action csum",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "$TC actions add action csum iph fast_init",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions list action csum",
+        "matchPattern": "action order [0-9]*: csum \\(iph\\) action pass.*fast_init",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action csum"
+        ]
     }
 ]
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/ct.json b/tools/testing/selftests/tc-testing/tc-tests/actions/ct.json
index 62b82fe10c89..e6c798e840de 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/actions/ct.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/actions/ct.json
@@ -310,5 +310,29 @@
         "teardown": [
             "$TC actions flush action ct"
         ]
+    },
+    {
+        "id": "3991",
+        "name": "Add simple ct action with fast_init flag",
+        "category": [
+            "actions",
+            "ct"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action ct",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "$TC actions add action ct fast_init",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions list action ct",
+        "matchPattern": "action order [0-9]*: ct zone 0 pipe.*fast_init",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action ct"
+        ]
     }
 ]
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/gact.json b/tools/testing/selftests/tc-testing/tc-tests/actions/gact.json
index 814b7a8a478b..b7b03aff2328 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/actions/gact.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/actions/gact.json
@@ -585,5 +585,29 @@
         "teardown": [
             "$TC actions flush action gact"
         ]
+    },
+    {
+        "id": "95ad",
+        "name": "Add gact pass action with fast_init flag",
+        "category": [
+            "actions",
+            "gact"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action gact",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "$TC actions add action pass fast_init",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions list action gact",
+        "matchPattern": "action order [0-9]*: gact action pass.*fast_init",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action gact"
+        ]
     }
 ]
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json b/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json
index 2232b21e2510..87331a6ef45f 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json
@@ -553,5 +553,29 @@
         "matchPattern": "^[ \t]+index [0-9]+ ref",
         "matchCount": "0",
         "teardown": []
+    },
+    {
+        "id": "31e3",
+        "name": "Add mirred mirror to egress action with fast_init flag",
+        "category": [
+            "actions",
+            "mirred"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action mirred",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "$TC actions add action mirred egress mirror fast_init dev lo",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions list action mirred",
+        "matchPattern": "action order [0-9]*: mirred \\(Egress Mirror to device lo\\).*fast_init",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action mirred"
+        ]
     }
 ]
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/tunnel_key.json b/tools/testing/selftests/tc-testing/tc-tests/actions/tunnel_key.json
index 28453a445fdb..89097c28b6d1 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/actions/tunnel_key.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/actions/tunnel_key.json
@@ -909,5 +909,29 @@
         "teardown": [
             "$TC actions flush action tunnel_key"
         ]
+    },
+    {
+        "id": "0cd2",
+        "name": "Add tunnel_key set action with fast_init flag",
+        "category": [
+            "actions",
+            "tunnel_key"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action tunnel_key",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "$TC actions add action tunnel_key set fast_init src_ip 10.10.10.1 dst_ip 20.20.20.2 id 1",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions list action tunnel_key",
+        "matchPattern": "action order [0-9]+: tunnel_key.*set.*src_ip 10.10.10.1.*dst_ip 20.20.20.2.*key_id 1.*fast_init",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action tunnel_key"
+        ]
     }
 ]
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/vlan.json b/tools/testing/selftests/tc-testing/tc-tests/actions/vlan.json
index 6503b1ce091f..3ededaf18ad0 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/actions/vlan.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/actions/vlan.json
@@ -807,5 +807,29 @@
         "matchPattern": "^[ \t]+index [0-9]+ ref",
         "matchCount": "0",
         "teardown": []
+    },
+    {
+        "id": "1a3d",
+        "name": "Add vlan pop action with fast_init flag",
+        "category": [
+            "actions",
+            "vlan"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action vlan",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "$TC actions add action vlan pop fast_init",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions list action vlan",
+        "matchPattern": "action order [0-9]+: vlan.*pop.*fast_init",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action vlan"
+        ]
     }
 ]
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH iproute2-next] tc: implement support for action flags
  2019-10-22 14:17 [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Vlad Buslov
                   ` (12 preceding siblings ...)
  2019-10-22 14:18 ` [PATCH net-next 13/13] tc-testing: implement tests for new fast_init action flag Vlad Buslov
@ 2019-10-22 14:28 ` Vlad Buslov
  2019-10-22 14:35 ` [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Marcelo Ricardo Leitner
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-22 14:28 UTC (permalink / raw)
  To: netdev; +Cc: mleitner, dcaratti, Vlad Buslov, Jiri Pirko

Implement setting and printing flags for the actions that support the new
field (gact, csum, mirred, tunnel_key, vlan, ct). Update the man pages for
the affected actions.

Example usage of the fast_init flag (the only supported flag value at the
moment):

 # tc actions add action gact drop fast_init index 1
 # tc -s actions ls action gact
 total acts 1

        action order 0: gact action drop
         random type none pass val 0
         fast_init
         index 1 ref 1 bind 0 installed 7 sec used 7 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
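
For a rule targeted at hardware offload the flag would typically be set
on every action of the filter; a sketch with placeholder interface names
and addresses:

 # tc filter add dev eth0 ingress protocol ip flower dst_ip 198.51.100.1 \
       action tunnel_key set fast_init id 42 src_ip 10.0.0.1 \
           dst_ip 10.0.0.2 dst_port 4789 \
       action mirred egress redirect fast_init dev eth1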

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/uapi/linux/pkt_cls.h              |  8 ++++++++
 include/uapi/linux/tc_act/tc_csum.h       |  1 +
 include/uapi/linux/tc_act/tc_ct.h         |  1 +
 include/uapi/linux/tc_act/tc_gact.h       |  1 +
 include/uapi/linux/tc_act/tc_mirred.h     |  1 +
 include/uapi/linux/tc_act/tc_tunnel_key.h |  1 +
 include/uapi/linux/tc_act/tc_vlan.h       |  1 +
 man/man8/tc-csum.8                        |  4 ++++
 man/man8/tc-mirred.8                      |  4 ++++
 man/man8/tc-tunnel_key.8                  |  4 ++++
 man/man8/tc-vlan.8                        |  7 ++++++-
 tc/m_csum.c                               | 17 ++++++++++++++++-
 tc/m_ct.c                                 | 19 ++++++++++++++++---
 tc/m_gact.c                               | 22 ++++++++++++++++++++--
 tc/m_mirred.c                             | 19 +++++++++++++++++--
 tc/m_tunnel_key.c                         | 17 +++++++++++++++--
 tc/m_vlan.c                               | 19 ++++++++++++++++---
 17 files changed, 132 insertions(+), 14 deletions(-)

diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index a6aa466fac9e..56664854a5ab 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -113,6 +113,14 @@ enum tca_id {
 
 #define TCA_ID_MAX __TCA_ID_MAX
 
+/* act flags definitions */
+#define TCA_ACT_FLAGS_FAST_INIT	(1 << 0) /* prefer action update rate
+					  * (slow-path), even at the cost of
+					  * reduced software data-path
+					  * performance (intended to be used for
+					  * hardware-offloaded actions)
+					  */
+
 struct tc_police {
 	__u32			index;
 	int			action;
diff --git a/include/uapi/linux/tc_act/tc_csum.h b/include/uapi/linux/tc_act/tc_csum.h
index 94b2044929de..14eddaca85f8 100644
--- a/include/uapi/linux/tc_act/tc_csum.h
+++ b/include/uapi/linux/tc_act/tc_csum.h
@@ -10,6 +10,7 @@ enum {
 	TCA_CSUM_PARMS,
 	TCA_CSUM_TM,
 	TCA_CSUM_PAD,
+	TCA_CSUM_FLAGS,
 	__TCA_CSUM_MAX
 };
 #define TCA_CSUM_MAX (__TCA_CSUM_MAX - 1)
diff --git a/include/uapi/linux/tc_act/tc_ct.h b/include/uapi/linux/tc_act/tc_ct.h
index 5fb1d7ac1027..82ea92b59d3d 100644
--- a/include/uapi/linux/tc_act/tc_ct.h
+++ b/include/uapi/linux/tc_act/tc_ct.h
@@ -22,6 +22,7 @@ enum {
 	TCA_CT_NAT_PORT_MIN,	/* be16 */
 	TCA_CT_NAT_PORT_MAX,	/* be16 */
 	TCA_CT_PAD,
+	TCA_CT_FLAGS,
 	__TCA_CT_MAX
 };
 
diff --git a/include/uapi/linux/tc_act/tc_gact.h b/include/uapi/linux/tc_act/tc_gact.h
index 37e5392e02c7..b621b7df9919 100644
--- a/include/uapi/linux/tc_act/tc_gact.h
+++ b/include/uapi/linux/tc_act/tc_gact.h
@@ -26,6 +26,7 @@ enum {
 	TCA_GACT_PARMS,
 	TCA_GACT_PROB,
 	TCA_GACT_PAD,
+	TCA_GACT_FLAGS,
 	__TCA_GACT_MAX
 };
 #define TCA_GACT_MAX (__TCA_GACT_MAX - 1)
diff --git a/include/uapi/linux/tc_act/tc_mirred.h b/include/uapi/linux/tc_act/tc_mirred.h
index 2500a0005d05..322108890807 100644
--- a/include/uapi/linux/tc_act/tc_mirred.h
+++ b/include/uapi/linux/tc_act/tc_mirred.h
@@ -21,6 +21,7 @@ enum {
 	TCA_MIRRED_TM,
 	TCA_MIRRED_PARMS,
 	TCA_MIRRED_PAD,
+	TCA_MIRRED_FLAGS,
 	__TCA_MIRRED_MAX
 };
 #define TCA_MIRRED_MAX (__TCA_MIRRED_MAX - 1)
diff --git a/include/uapi/linux/tc_act/tc_tunnel_key.h b/include/uapi/linux/tc_act/tc_tunnel_key.h
index 41c8b462c177..f572101f6adc 100644
--- a/include/uapi/linux/tc_act/tc_tunnel_key.h
+++ b/include/uapi/linux/tc_act/tc_tunnel_key.h
@@ -39,6 +39,7 @@ enum {
 					 */
 	TCA_TUNNEL_KEY_ENC_TOS,		/* u8 */
 	TCA_TUNNEL_KEY_ENC_TTL,		/* u8 */
+	TCA_TUNNEL_KEY_FLAGS,
 	__TCA_TUNNEL_KEY_MAX,
 };
 
diff --git a/include/uapi/linux/tc_act/tc_vlan.h b/include/uapi/linux/tc_act/tc_vlan.h
index 168995b54a70..b2847ed14f08 100644
--- a/include/uapi/linux/tc_act/tc_vlan.h
+++ b/include/uapi/linux/tc_act/tc_vlan.h
@@ -30,6 +30,7 @@ enum {
 	TCA_VLAN_PUSH_VLAN_PROTOCOL,
 	TCA_VLAN_PAD,
 	TCA_VLAN_PUSH_VLAN_PRIORITY,
+	TCA_VLAN_FLAGS,
 	__TCA_VLAN_MAX,
 };
 #define TCA_VLAN_MAX (__TCA_VLAN_MAX - 1)
diff --git a/man/man8/tc-csum.8 b/man/man8/tc-csum.8
index 65724b88d0b6..c93707948d80 100644
--- a/man/man8/tc-csum.8
+++ b/man/man8/tc-csum.8
@@ -6,6 +6,7 @@ csum - checksum update action
 .in +8
 .ti -8
 .BR tc " ... " "action csum"
+.RB "[ " fast_init " ] "
 .I UPDATE
 
 .ti -8
@@ -52,6 +53,9 @@ SCTP header
 .TP
 .B SWEETS
 These are merely syntactic sugar and ignored internally.
+.TP
+.B fast_init
+Prefer initialization speed over run time fast-path performance.
 .SH EXAMPLES
 The following performs stateless NAT for incoming packets from 192.0.2.100 to
 new destination 198.51.100.1. Assuming these are UDP
diff --git a/man/man8/tc-mirred.8 b/man/man8/tc-mirred.8
index 38833b452d92..0cd39f454347 100644
--- a/man/man8/tc-mirred.8
+++ b/man/man8/tc-mirred.8
@@ -7,6 +7,7 @@ mirred - mirror/redirect action
 .ti -8
 .BR tc " ... " "action mirred"
 .I DIRECTION ACTION
+.RB "[ " fast_init " ] "
 .RB "[ " index
 .IR INDEX " ] "
 .BI dev " DEVICENAME"
@@ -49,6 +50,9 @@ is a 32bit unsigned integer greater than zero.
 .TP
 .BI dev " DEVICENAME"
 Specify the network interface to redirect or mirror to.
+.TP
+.B fast_init
+Prefer initialization speed over run time fast-path performance.
 .SH EXAMPLES
 Limit ingress bandwidth on eth0 to 1mbit/s, redirect exceeding traffic to lo for
 debugging purposes:
diff --git a/man/man8/tc-tunnel_key.8 b/man/man8/tc-tunnel_key.8
index 2145eb62e70e..7b827ae0d257 100644
--- a/man/man8/tc-tunnel_key.8
+++ b/man/man8/tc-tunnel_key.8
@@ -7,6 +7,7 @@ tunnel_key - Tunnel metadata manipulation
 .ti -8
 .BR tc " ... " "action tunnel_key" " { " unset " | "
 .IR SET " }"
+.RB "[ " fast_init " ] "
 
 .ti -8
 .IR SET " := "
@@ -114,6 +115,9 @@ If using
 with IPv6, be sure you know what you are doing. Zero UDP checksums provide
 weaker protection against corrupted packets. See RFC6935 for details.
 .RE
+.TP
+.B fast_init
+Prefer initialization speed over run time fast-path performance.
 .SH EXAMPLES
 The following example encapsulates incoming ICMP packets on eth0 into a vxlan
 tunnel, by setting metadata to VNI 11, source IP 11.11.0.1 and destination IP
diff --git a/man/man8/tc-vlan.8 b/man/man8/tc-vlan.8
index f5ffc25f054e..e69322fefd70 100644
--- a/man/man8/tc-vlan.8
+++ b/man/man8/tc-vlan.8
@@ -6,7 +6,9 @@ vlan - vlan manipulation module
 .in +8
 .ti -8
 .BR tc " ... " "action vlan" " { " pop " |"
-.IR PUSH " | " MODIFY " } [ " CONTROL " ]"
+.IR PUSH " | " MODIFY " } [ "
+.BR fast_init " ] [ "
+.IR CONTROL " ]"
 
 .ti -8
 .IR PUSH " := "
@@ -94,6 +96,9 @@ Continue classification with next filter in line.
 Return to calling qdisc for packet processing. This ends the classification
 process.
 .RE
+.TP
+.B fast_init
+Prefer initialization speed over run time fast-path performance.
 .SH EXAMPLES
 The following example encapsulates incoming ICMP packets on eth0 from 10.0.0.2
 into VLAN ID 123:
diff --git a/tc/m_csum.c b/tc/m_csum.c
index 3e3dc251ea38..64cc5a96ff5c 100644
--- a/tc/m_csum.c
+++ b/tc/m_csum.c
@@ -22,7 +22,7 @@
 static void
 explain(void)
 {
-	fprintf(stderr, "Usage: ... csum <UPDATE>\n"
+	fprintf(stderr, "Usage: ... csum [fast_init] <UPDATE>\n"
 			"Where: UPDATE := <TARGET> [<UPDATE>]\n"
 			"       TARGET := { ip4h | icmp | igmp | tcp | udp | udplite | sctp | <SWEETS> }\n"
 			"       SWEETS := { and | or | \'+\' }\n");
@@ -88,6 +88,7 @@ static int
 parse_csum(struct action_util *a, int *argc_p,
 	   char ***argv_p, int tca_id, struct nlmsghdr *n)
 {
+	struct nla_bitfield32 flags = { 0 };
 	struct tc_csum sel = {};
 
 	int argc = *argc_p;
@@ -106,6 +107,11 @@ parse_csum(struct action_util *a, int *argc_p,
 			}
 			ok++;
 			continue;
+		} else if (matches(*argv, "fast_init") == 0) {
+			flags.value |= TCA_ACT_FLAGS_FAST_INIT;
+			flags.selector |= TCA_ACT_FLAGS_FAST_INIT;
+			NEXT_ARG_FWD();
+			continue;
 		} else if (matches(*argv, "help") == 0) {
 			usage();
 		} else {
@@ -140,6 +146,8 @@ parse_csum(struct action_util *a, int *argc_p,
 
 	tail = addattr_nest(n, MAX_MSG, tca_id);
 	addattr_l(n, MAX_MSG, TCA_CSUM_PARMS, &sel, sizeof(sel));
+	addattr_l(n, MAX_MSG, TCA_CSUM_FLAGS, &flags,
+		  sizeof(struct nla_bitfield32));
 	addattr_nest_end(n, tail);
 
 	*argc_p = argc;
@@ -206,6 +214,13 @@ print_csum(struct action_util *au, FILE *f, struct rtattr *arg)
 	print_string(PRINT_ANY, "csum", "(%s) ", buf);
 
 	print_action_control(f, "action ", sel->action, "\n");
+	if (tb[TCA_CSUM_FLAGS]) {
+		struct nla_bitfield32 *flags = RTA_DATA(tb[TCA_CSUM_FLAGS]);
+
+		if (flags->selector & TCA_ACT_FLAGS_FAST_INIT)
+			print_bool(PRINT_ANY, "fast_init", "\n\t fast_init",
+				   flags->value & TCA_ACT_FLAGS_FAST_INIT);
+	}
 	print_uint(PRINT_ANY, "index", "\tindex %u", sel->index);
 	print_int(PRINT_ANY, "ref", " ref %d", sel->refcnt);
 	print_int(PRINT_ANY, "bind", " bind %d", sel->bindcnt);
diff --git a/tc/m_ct.c b/tc/m_ct.c
index 8589cb9a3c51..02947e5075c8 100644
--- a/tc/m_ct.c
+++ b/tc/m_ct.c
@@ -19,9 +19,9 @@ static void
 usage(void)
 {
 	fprintf(stderr,
-		"Usage: ct clear\n"
-		"	ct commit [force] [zone ZONE] [mark MASKED_MARK] [label MASKED_LABEL] [nat NAT_SPEC]\n"
-		"	ct [nat] [zone ZONE]\n"
+		"Usage: ct clear [fast_init]\n"
+		"	ct commit [fast_init] [force] [zone ZONE] [mark MASKED_MARK] [label MASKED_LABEL] [nat NAT_SPEC]\n"
+		"	ct [fast_init] [nat] [zone ZONE]\n"
 		"Where: ZONE is the conntrack zone table number\n"
 		"	NAT_SPEC is {src|dst} addr addr1[-addr2] [port port1[-port2]]\n"
 		"\n");
@@ -200,6 +200,7 @@ static int
 parse_ct(struct action_util *a, int *argc_p, char ***argv_p, int tca_id,
 		struct nlmsghdr *n)
 {
+	struct nla_bitfield32 flags = { 0 };
 	struct tc_ct sel = {};
 	char **argv = *argv_p;
 	struct rtattr *tail;
@@ -281,6 +282,9 @@ parse_ct(struct action_util *a, int *argc_p, char ***argv_p, int tca_id,
 				fprintf(stderr, "ct: Illegal \"label\"\n");
 				return -1;
 			}
+		} else if (matches(*argv, "fast_init") == 0) {
+			flags.value |= TCA_ACT_FLAGS_FAST_INIT;
+			flags.selector |= TCA_ACT_FLAGS_FAST_INIT;
 		} else if (matches(*argv, "help") == 0) {
 			usage();
 		} else {
@@ -320,6 +324,8 @@ parse_ct(struct action_util *a, int *argc_p, char ***argv_p, int tca_id,
 
 	addattr16(n, MAX_MSG, TCA_CT_ACTION, ct_action);
 	addattr_l(n, MAX_MSG, TCA_CT_PARMS, &sel, sizeof(sel));
+	addattr_l(n, MAX_MSG, TCA_CT_FLAGS, &flags,
+		  sizeof(struct nla_bitfield32));
 	addattr_nest_end(n, tail);
 
 	*argc_p = argc;
@@ -473,6 +479,13 @@ static int print_ct(struct action_util *au, FILE *f, struct rtattr *arg)
 	ct_print_nat(ct_action, tb);
 
 	print_action_control(f, " ", p->action, "");
+	if (tb[TCA_CT_FLAGS]) {
+		struct nla_bitfield32 *flags = RTA_DATA(tb[TCA_CT_FLAGS]);
+
+		if (flags->selector & TCA_ACT_FLAGS_FAST_INIT)
+			print_bool(PRINT_ANY, "fast_init", "\n\t fast_init",
+				   flags->value & TCA_ACT_FLAGS_FAST_INIT);
+	}
 
 	print_uint(PRINT_ANY, "index", "\n\t index %u", p->index);
 	print_int(PRINT_ANY, "ref", " ref %d", p->refcnt);
diff --git a/tc/m_gact.c b/tc/m_gact.c
index dca2a2f9692f..ce5f22b59da0 100644
--- a/tc/m_gact.c
+++ b/tc/m_gact.c
@@ -42,7 +42,7 @@ static void
 explain(void)
 {
 #ifdef CONFIG_GACT_PROB
-	fprintf(stderr, "Usage: ... gact <ACTION> [RAND] [INDEX]\n");
+	fprintf(stderr, "Usage: ... gact <ACTION> [RAND] [fast_init] [INDEX]\n");
 	fprintf(stderr,
 		"Where: \tACTION := reclassify | drop | continue | pass | pipe |\n"
 		"       \t          goto chain <CHAIN_INDEX> | jump <JUMP_COUNT>\n"
@@ -53,7 +53,7 @@ explain(void)
 			"\tINDEX := index value used\n"
 			"\n");
 #else
-	fprintf(stderr, "Usage: ... gact <ACTION> [INDEX]\n"
+	fprintf(stderr, "Usage: ... gact <ACTION> [fast_init] [INDEX]\n"
 		"Where: \tACTION := reclassify | drop | continue | pass | pipe |\n"
 		"       \t          goto chain <CHAIN_INDEX> | jump <JUMP_COUNT>\n"
 		"\tINDEX := index value used\n"
@@ -74,6 +74,7 @@ static int
 parse_gact(struct action_util *a, int *argc_p, char ***argv_p,
 	   int tca_id, struct nlmsghdr *n)
 {
+	struct nla_bitfield32 flags = { 0 };
 	int argc = *argc_p;
 	char **argv = *argv_p;
 	struct tc_gact p = { 0 };
@@ -133,6 +134,12 @@ parse_gact(struct action_util *a, int *argc_p, char ***argv_p,
 	}
 #endif
 
+	if (argc > 0 && matches(*argv, "fast_init") == 0) {
+		flags.value |= TCA_ACT_FLAGS_FAST_INIT;
+		flags.selector |= TCA_ACT_FLAGS_FAST_INIT;
+		NEXT_ARG_FWD();
+	}
+
 	if (argc > 0) {
 		if (matches(*argv, "index") == 0) {
 skip_args:
@@ -154,6 +161,9 @@ skip_args:
 	if (rd)
 		addattr_l(n, MAX_MSG, TCA_GACT_PROB, &pp, sizeof(pp));
 #endif
+	addattr_l(n, MAX_MSG, TCA_GACT_FLAGS, &flags,
+		  sizeof(struct nla_bitfield32));
+
 	addattr_nest_end(n, tail);
 
 	*argc_p = argc;
@@ -199,6 +209,14 @@ print_gact(struct action_util *au, FILE *f, struct rtattr *arg)
 	print_int(PRINT_ANY, "val", "val %d", pp->pval);
 	close_json_object();
 #endif
+	if (tb[TCA_GACT_FLAGS]) {
+		struct nla_bitfield32 *flags = RTA_DATA(tb[TCA_GACT_FLAGS]);
+
+		if (flags->selector & TCA_ACT_FLAGS_FAST_INIT)
+			print_bool(PRINT_ANY, "fast_init", "\n\t fast_init",
+				   flags->value & TCA_ACT_FLAGS_FAST_INIT);
+	}
+
 	print_uint(PRINT_ANY, "index", "\n\t index %u", p->index);
 	print_int(PRINT_ANY, "ref", " ref %d", p->refcnt);
 	print_int(PRINT_ANY, "bind", " bind %d", p->bindcnt);
diff --git a/tc/m_mirred.c b/tc/m_mirred.c
index 132095237929..b95f4611891e 100644
--- a/tc/m_mirred.c
+++ b/tc/m_mirred.c
@@ -29,7 +29,7 @@ static void
 explain(void)
 {
 	fprintf(stderr,
-		"Usage: mirred <DIRECTION> <ACTION> [index INDEX] <dev DEVICENAME>\n"
+		"Usage: mirred <DIRECTION> <ACTION> [fast_init] [index INDEX] <dev DEVICENAME>\n"
 		"where:\n"
 		"\tDIRECTION := <ingress | egress>\n"
 		"\tACTION := <mirror | redirect>\n"
@@ -92,7 +92,7 @@ static int
 parse_direction(struct action_util *a, int *argc_p, char ***argv_p,
 		int tca_id, struct nlmsghdr *n)
 {
-
+	struct nla_bitfield32 flags = { 0 };
 	int argc = *argc_p;
 	char **argv = *argv_p;
 	int ok = 0, iok = 0, mirror = 0, redir = 0, ingress = 0, egress = 0;
@@ -125,6 +125,11 @@ parse_direction(struct action_util *a, int *argc_p, char ***argv_p,
 			NEXT_ARG();
 			ok++;
 			continue;
+		} else if (matches(*argv, "fast_init") == 0) {
+			flags.value |= TCA_ACT_FLAGS_FAST_INIT;
+			flags.selector |= TCA_ACT_FLAGS_FAST_INIT;
+			NEXT_ARG_FWD();
+			continue;
 		} else {
 
 			if (matches(*argv, "index") == 0) {
@@ -225,6 +230,8 @@ parse_direction(struct action_util *a, int *argc_p, char ***argv_p,
 
 	tail = addattr_nest(n, MAX_MSG, tca_id);
 	addattr_l(n, MAX_MSG, TCA_MIRRED_PARMS, &p, sizeof(p));
+	addattr_l(n, MAX_MSG, TCA_MIRRED_FLAGS, &flags,
+		  sizeof(struct nla_bitfield32));
 	addattr_nest_end(n, tail);
 
 	*argc_p = argc;
@@ -307,6 +314,14 @@ print_mirred(struct action_util *au, FILE *f, struct rtattr *arg)
 	print_string(PRINT_ANY, "to_dev", " to device %s)", dev);
 	print_action_control(f, " ", p->action, "");
 
+	if (tb[TCA_MIRRED_FLAGS]) {
+		struct nla_bitfield32 *flags = RTA_DATA(tb[TCA_MIRRED_FLAGS]);
+
+		if (flags->selector & TCA_ACT_FLAGS_FAST_INIT)
+			print_bool(PRINT_ANY, "fast_init", "\n\t fast_init",
+				   flags->value & TCA_ACT_FLAGS_FAST_INIT);
+	}
+
 	print_uint(PRINT_ANY, "index", "\n \tindex %u", p->index);
 	print_int(PRINT_ANY, "ref", " ref %d", p->refcnt);
 	print_int(PRINT_ANY, "bind", " bind %d", p->bindcnt);
diff --git a/tc/m_tunnel_key.c b/tc/m_tunnel_key.c
index 4e65e444776a..e8b96f8c008f 100644
--- a/tc/m_tunnel_key.c
+++ b/tc/m_tunnel_key.c
@@ -22,8 +22,8 @@
 static void explain(void)
 {
 	fprintf(stderr,
-		"Usage: tunnel_key unset\n"
-		"       tunnel_key set <TUNNEL_KEY>\n"
+		"Usage: tunnel_key [fast_init] unset\n"
+		"       tunnel_key [fast_init] set <TUNNEL_KEY>\n"
 		"Where TUNNEL_KEY is a combination of:\n"
 		"id <TUNNELID>\n"
 		"src_ip <IP> (mandatory)\n"
@@ -209,6 +209,7 @@ static int tunnel_key_parse_tos_ttl(char *str, int type, struct nlmsghdr *n)
 static int parse_tunnel_key(struct action_util *a, int *argc_p, char ***argv_p,
 			    int tca_id, struct nlmsghdr *n)
 {
+	struct nla_bitfield32 flags = { 0 };
 	struct tc_tunnel_key parm = {};
 	char **argv = *argv_p;
 	int argc = *argc_p;
@@ -309,6 +310,9 @@ static int parse_tunnel_key(struct action_util *a, int *argc_p, char ***argv_p,
 			csum = 0;
 		} else if (matches(*argv, "help") == 0) {
 			usage();
+		} else if (matches(*argv, "fast_init") == 0) {
+			flags.value |= TCA_ACT_FLAGS_FAST_INIT;
+			flags.selector |= TCA_ACT_FLAGS_FAST_INIT;
 		} else {
 			break;
 		}
@@ -341,6 +345,8 @@ static int parse_tunnel_key(struct action_util *a, int *argc_p, char ***argv_p,
 
 	parm.t_action = action;
 	addattr_l(n, MAX_MSG, TCA_TUNNEL_KEY_PARMS, &parm, sizeof(parm));
+	addattr_l(n, MAX_MSG, TCA_TUNNEL_KEY_FLAGS, &flags,
+		  sizeof(struct nla_bitfield32));
 	addattr_nest_end(n, tail);
 
 	*argc_p = argc;
@@ -529,6 +535,13 @@ static int print_tunnel_key(struct action_util *au, FILE *f, struct rtattr *arg)
 		break;
 	}
 	print_action_control(f, " ", parm->action, "");
+	if (tb[TCA_TUNNEL_KEY_FLAGS]) {
+		struct nla_bitfield32 *flags = RTA_DATA(tb[TCA_TUNNEL_KEY_FLAGS]);
+
+		if (flags->selector & TCA_ACT_FLAGS_FAST_INIT)
+			print_bool(PRINT_ANY, "fast_init", "\n\t fast_init",
+				   flags->value & TCA_ACT_FLAGS_FAST_INIT);
+	}
 
 	print_string(PRINT_FP, NULL, "%s", _SL_);
 	print_uint(PRINT_ANY, "index", "\t index %u", parm->index);
diff --git a/tc/m_vlan.c b/tc/m_vlan.c
index 9c8071e9dbbe..c54829ce297f 100644
--- a/tc/m_vlan.c
+++ b/tc/m_vlan.c
@@ -28,9 +28,9 @@ static const char * const action_names[] = {
 static void explain(void)
 {
 	fprintf(stderr,
-		"Usage: vlan pop\n"
-		"       vlan push [ protocol VLANPROTO ] id VLANID [ priority VLANPRIO ] [CONTROL]\n"
-		"       vlan modify [ protocol VLANPROTO ] id VLANID [ priority VLANPRIO ] [CONTROL]\n"
+		"Usage: vlan pop [fast_init]\n"
+		"       vlan push [fast_init] [ protocol VLANPROTO ] id VLANID [ priority VLANPRIO ] [CONTROL]\n"
+		"       vlan modify [fast_init] [ protocol VLANPROTO ] id VLANID [ priority VLANPRIO ] [CONTROL]\n"
 		"       VLANPROTO is one of 802.1Q or 802.1AD\n"
 		"            with default: 802.1Q\n"
 		"       CONTROL := reclassify | pipe | drop | continue | pass |\n"
@@ -59,6 +59,7 @@ static void unexpected(const char *arg)
 static int parse_vlan(struct action_util *a, int *argc_p, char ***argv_p,
 		      int tca_id, struct nlmsghdr *n)
 {
+	struct nla_bitfield32 flags = { 0 };
 	int argc = *argc_p;
 	char **argv = *argv_p;
 	struct rtattr *tail;
@@ -119,6 +120,9 @@ static int parse_vlan(struct action_util *a, int *argc_p, char ***argv_p,
 			if (get_u8(&prio, *argv, 0) || (prio & ~0x7))
 				invarg("prio is invalid", *argv);
 			prio_set = 1;
+		} else if (matches(*argv, "fast_init") == 0) {
+			flags.value |= TCA_ACT_FLAGS_FAST_INIT;
+			flags.selector |= TCA_ACT_FLAGS_FAST_INIT;
 		} else if (matches(*argv, "help") == 0) {
 			usage();
 		} else {
@@ -167,6 +171,8 @@ static int parse_vlan(struct action_util *a, int *argc_p, char ***argv_p,
 	}
 	if (prio_set)
 		addattr8(n, MAX_MSG, TCA_VLAN_PUSH_VLAN_PRIORITY, prio);
+	addattr_l(n, MAX_MSG, TCA_VLAN_FLAGS, &flags,
+		  sizeof(struct nla_bitfield32));
 
 	addattr_nest_end(n, tail);
 
@@ -218,6 +224,13 @@ static int print_vlan(struct action_util *au, FILE *f, struct rtattr *arg)
 		break;
 	}
 	print_action_control(f, " ", parm->action, "");
+	if (tb[TCA_VLAN_FLAGS]) {
+		struct nla_bitfield32 *flags = RTA_DATA(tb[TCA_VLAN_FLAGS]);
+
+		if (flags->selector & TCA_ACT_FLAGS_FAST_INIT)
+			print_bool(PRINT_ANY, "fast_init", "\n\t fast_init",
+				   flags->value & TCA_ACT_FLAGS_FAST_INIT);
+	}
 
 	print_uint(PRINT_ANY, "index", "\n\t index %u", parm->index);
 	print_int(PRINT_ANY, "ref", " ref %d", parm->refcnt);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-22 14:17 [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Vlad Buslov
                   ` (13 preceding siblings ...)
  2019-10-22 14:28 ` [PATCH iproute2-next] tc: implement support for action flags Vlad Buslov
@ 2019-10-22 14:35 ` Marcelo Ricardo Leitner
  2019-10-22 14:52   ` Vlad Buslov
  2019-10-22 15:15 ` Marcelo Ricardo Leitner
  2019-10-23 12:49 ` Jamal Hadi Salim
  16 siblings, 1 reply; 64+ messages in thread
From: Marcelo Ricardo Leitner @ 2019-10-22 14:35 UTC (permalink / raw)
  To: Vlad Buslov; +Cc: netdev, jhs, xiyou.wangcong, jiri, davem, dcaratti, pabeni

On Tue, Oct 22, 2019 at 05:17:51PM +0300, Vlad Buslov wrote:
> Currently, significant fraction of CPU time during TC filter allocation
> is spent in percpu allocator. Moreover, percpu allocator is protected
> with single global mutex which negates any potential to improve its
> performance by means of recent developments in TC filter update API that
> removed rtnl lock for some Qdiscs and classifiers. In order to
> significantly improve filter update rate and reduce memory usage we
> would like to allow users to skip percpu counters allocation for
> specific action if they don't expect high traffic rate hitting the
> action, which is a reasonable expectation for hardware-offloaded setup.
> In that case any potential gains to software fast-path performance
> gained by usage of percpu-allocated counters compared to regular integer
> counters protected by spinlock are not important, but amount of
> additional CPU and memory consumed by them is significant.

Yes!

I wonder how this can play together with conntrack offloading.  With
it the sw datapath will be used more, as a conntrack entry can only be
offloaded after the handshake.  That said, the host may have to
process quite a few handshakes in the sw datapath.  It seems OvS can
then just not set this flag in act_ct (and the other actions for such
a rule), and those cases will still be able to leverage the percpu
stats.  Right?

> allocator, but not for action idr lock, which is per-action. Note that
> percpu allocator is still used by dst_cache in tunnel_key actions and
> consumes 4.68% CPU time. Dst_cache seems like good opportunity for
> further insertion rate optimization but is not addressed by this change.

I vented this idea re dst_cache last week with Paolo. He sent me a
draft patch but I didn't test it yet.

Thanks,
Marcelo


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-22 14:35 ` [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Marcelo Ricardo Leitner
@ 2019-10-22 14:52   ` Vlad Buslov
  2019-10-22 18:17     ` Roman Mashak
  0 siblings, 1 reply; 64+ messages in thread
From: Vlad Buslov @ 2019-10-22 14:52 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Vlad Buslov, netdev, jhs, xiyou.wangcong, jiri, davem, dcaratti, pabeni


On Tue 22 Oct 2019 at 17:35, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
> On Tue, Oct 22, 2019 at 05:17:51PM +0300, Vlad Buslov wrote:
>> Currently, significant fraction of CPU time during TC filter allocation
>> is spent in percpu allocator. Moreover, percpu allocator is protected
>> with single global mutex which negates any potential to improve its
>> performance by means of recent developments in TC filter update API that
>> removed rtnl lock for some Qdiscs and classifiers. In order to
>> significantly improve filter update rate and reduce memory usage we
>> would like to allow users to skip percpu counters allocation for
>> specific action if they don't expect high traffic rate hitting the
>> action, which is a reasonable expectation for hardware-offloaded setup.
>> In that case any potential gains to software fast-path performance
>> gained by usage of percpu-allocated counters compared to regular integer
>> counters protected by spinlock are not important, but amount of
>> additional CPU and memory consumed by them is significant.
>
> Yes!
>
> I wonder how this can play together with conntrack offloading.  With
> it the sw datapath will be used more, as a conntrack entry can only be
> offloaded after the handshake.  That said, the host may have to
> process quite a few handshakes in the sw datapath.  It seems OvS can
> then just not set this flag in act_ct (and the other actions for such
> a rule), and those cases will still be able to leverage the percpu
> stats.  Right?

The flag is set per action instance, so the client can choose not to use
the flag on a case-by-case basis. The conntrack use case requires further
investigation since I'm not entirely convinced that handling the first few
packets in sw (before the connection reaches established state and is
offloaded) warrants having a percpu counter.
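
Since the attribute is per action instance, a single rule can mix both
modes. A sketch with placeholder interface names, keeping percpu counters
for ct (which still sees handshakes in sw) while skipping them for the
offload-oriented redirect:

 # tc filter add dev eth0 ingress protocol ip flower \
       action ct pipe \
       action mirred egress redirect fast_init dev eth1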

>
>> allocator, but not for action idr lock, which is per-action. Note that
>> percpu allocator is still used by dst_cache in tunnel_key actions and
>> consumes 4.68% CPU time. Dst_cache seems like good opportunity for
>> further insertion rate optimization but is not addressed by this change.
>
> I vented this idea re dst_cache last week with Paolo. He sent me a
> draft patch but I didn't test it yet.

Looking forward to it!

>
> Thanks,
> Marcelo

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-22 14:17 [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Vlad Buslov
                   ` (14 preceding siblings ...)
  2019-10-22 14:35 ` [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Marcelo Ricardo Leitner
@ 2019-10-22 15:15 ` Marcelo Ricardo Leitner
  2019-10-22 15:52   ` Vlad Buslov
  2019-10-23 12:49 ` Jamal Hadi Salim
  16 siblings, 1 reply; 64+ messages in thread
From: Marcelo Ricardo Leitner @ 2019-10-22 15:15 UTC (permalink / raw)
  To: Vlad Buslov; +Cc: netdev, jhs, xiyou.wangcong, jiri, davem, dcaratti

On Tue, Oct 22, 2019 at 05:17:51PM +0300, Vlad Buslov wrote:
> - Extend actions that are used for hardware offloads with optional
>   netlink 32bit flags field. Add TCA_ACT_FLAGS_FAST_INIT action flag and
>   update affected actions to not allocate percpu counters when the flag
>   is set.

I just went over all the patches and they mostly make sense to me. So
far the only point I'm uncertain of is the naming of the flag,
"fast_init".  It is not clear what it does, and it can be overloaded
with other stuff later, which we probably don't want.

Say, for example, we want percpu counters but want to disable allocating
the hw stats, i.e. to make the counter added in 28169abadb08 ("net/sched:
Add hardware specific counters to TC actions") optional.

So what about:
TCA_ACT_FLAGS_NO_PERCPU_STATS
TCA_ACT_FLAGS_NO_HW_STATS (this one to be done on a subsequent patchset, yes)
?
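
The NLA_BITFIELD32 encoding should accommodate that without extra
attributes; a sketch using the (so far hypothetical) names above:

	/* set NO_PERCPU_STATS, explicitly leave NO_HW_STATS cleared */
	struct nla_bitfield32 flags = {
		.selector = TCA_ACT_FLAGS_NO_PERCPU_STATS |
			    TCA_ACT_FLAGS_NO_HW_STATS,
		.value    = TCA_ACT_FLAGS_NO_PERCPU_STATS,
	};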

  Marcelo


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-22 15:15 ` Marcelo Ricardo Leitner
@ 2019-10-22 15:52   ` Vlad Buslov
  2019-10-22 17:09     ` Marcelo Ricardo Leitner
  0 siblings, 1 reply; 64+ messages in thread
From: Vlad Buslov @ 2019-10-22 15:52 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Vlad Buslov, netdev, jhs, xiyou.wangcong, jiri, davem, dcaratti


On Tue 22 Oct 2019 at 18:15, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
> On Tue, Oct 22, 2019 at 05:17:51PM +0300, Vlad Buslov wrote:
>> - Extend actions that are used for hardware offloads with optional
>>   netlink 32bit flags field. Add TCA_ACT_FLAGS_FAST_INIT action flag and
>>   update affected actions to not allocate percpu counters when the flag
>>   is set.
>
> I just went over all the patches and they mostly make sense to me. So
> far the only point I'm uncertain of is the naming of the flag,
> "fast_init".  It is not clear what it does, and it can be overloaded
> with other stuff later, which we probably don't want.

I intentionally named it like that because I do want to overload it with
other stuff in the future, instead of adding a new flag value for every
single small optimization we might come up with :)

Also, I didn't want to hardcode implementation details into the UAPI that
we will have to maintain for a long time after the percpu allocator in the
kernel is potentially replaced with something new and better (like idr is
being replaced with xarray now, for example).

Anyway, let's see what other people think. I'm open to changing it.

>
> Say, for example, we want percpu counters but want to disable allocating
> the hw stats, i.e. to make the counter added in 28169abadb08 ("net/sched:
> Add hardware specific counters to TC actions") optional.
>
> So what about:
> TCA_ACT_FLAGS_NO_PERCPU_STATS
> TCA_ACT_FLAGS_NO_HW_STATS (this one to be done on a subsequent patchset, yes)
> ?
>
>   Marcelo

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-22 15:52   ` Vlad Buslov
@ 2019-10-22 17:09     ` Marcelo Ricardo Leitner
  2019-10-23  6:42       ` Vlad Buslov
  2019-10-24 15:12       ` Roopa Prabhu
  0 siblings, 2 replies; 64+ messages in thread
From: Marcelo Ricardo Leitner @ 2019-10-22 17:09 UTC (permalink / raw)
  To: Vlad Buslov; +Cc: netdev, jhs, xiyou.wangcong, jiri, davem, dcaratti

On Tue, Oct 22, 2019 at 03:52:31PM +0000, Vlad Buslov wrote:
> 
> On Tue 22 Oct 2019 at 18:15, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
> > On Tue, Oct 22, 2019 at 05:17:51PM +0300, Vlad Buslov wrote:
> >> - Extend actions that are used for hardware offloads with optional
> >>   netlink 32bit flags field. Add TCA_ACT_FLAGS_FAST_INIT action flag and
> >>   update affected actions to not allocate percpu counters when the flag
> >>   is set.
> >
> > I just went over all the patches and they mostly make sense to me. So
> > far the only point I'm uncertain of is the naming of the flag,
> > "fast_init".  It is not clear what it does, and it can be overloaded
> > with other stuff later, which we probably don't want.
> 
> I intentionally named it like that because I do want to overload it with
> other stuff in the future, instead of adding a new flag value for every
> single small optimization we might come up with :)

Hah :-)

> 
> Also, I didn't want to hardcode implementation details into UAPI that we
> will have to maintain for long time after percpu allocator in kernel is
> potentially replaced with something new and better (like idr is being
> replaced with xarray now, for example)

I see. OTOH, this also means that the UAPI here would be unstable
(different meanings over time for the same call), and hopefully new
behaviors would always be backwards compatible.

> 
> Anyway, lets see what other people think. I'm open to changing it.
> 
> >
> > Say, for example, we want percpu counters but to disable allocating
> > the stats for hw, to make the counter in 28169abadb08 ("net/sched: Add
> > hardware specific counters to TC actions") optional.
> >
> > So what about:
> > TCA_ACT_FLAGS_NO_PERCPU_STATS
> > TCA_ACT_FLAGS_NO_HW_STATS (this one to be done on a subsequent patchset, yes)
> > ?
> >
> >   Marcelo
> 


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-22 14:52   ` Vlad Buslov
@ 2019-10-22 18:17     ` Roman Mashak
  2019-10-23  6:38       ` Vlad Buslov
  0 siblings, 1 reply; 64+ messages in thread
From: Roman Mashak @ 2019-10-22 18:17 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Marcelo Ricardo Leitner, netdev, jhs, xiyou.wangcong, jiri,
	davem, dcaratti, pabeni

Vlad Buslov <vladbu@mellanox.com> writes:

> On Tue 22 Oct 2019 at 17:35, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
>> On Tue, Oct 22, 2019 at 05:17:51PM +0300, Vlad Buslov wrote:
>>> Currently, significant fraction of CPU time during TC filter allocation
>>> is spent in percpu allocator. Moreover, percpu allocator is protected
>>> with single global mutex which negates any potential to improve its
>>> performance by means of recent developments in TC filter update API that
>>> removed rtnl lock for some Qdiscs and classifiers. In order to
>>> significantly improve filter update rate and reduce memory usage we
>>> would like to allow users to skip percpu counters allocation for
>>> specific action if they don't expect high traffic rate hitting the
>>> action, which is a reasonable expectation for hardware-offloaded setup.
>>> In that case any potential gains to software fast-path performance
>>> gained by usage of percpu-allocated counters compared to regular integer
>>> counters protected by spinlock are not important, but amount of
>>> additional CPU and memory consumed by them is significant.
>>
>> Yes!
>>
>> I wonder how this can play together with conntrack offloading.  With
>> it the sw datapath will be more used, as a conntrack entry can only be
>> offloaded after the handshake.  That said, the host can have to
>> process quite some handshakes in sw datapath.  Seems OvS can then just
>> not set this flag in act_ct (and others for this rule), and such cases
>> will be able to leverage the percpu stats.  Right?
>
> The flag is set per each actions instance so client can chose not to use
> the flag in case-by-case basis. Conntrack use case requires further
> investigation since I'm not entirely convinced that handling first few
> packets in sw (before connection reaches established state and is
> offloaded) warrants having percpu counter.

Hi Vlad,

Did you consider using TCA_ROOT_FLAGS instead of adding another
per-action 32-bit flag?

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-22 18:17     ` Roman Mashak
@ 2019-10-23  6:38       ` Vlad Buslov
  2019-10-23 13:02         ` Jamal Hadi Salim
  0 siblings, 1 reply; 64+ messages in thread
From: Vlad Buslov @ 2019-10-23  6:38 UTC (permalink / raw)
  To: Roman Mashak
  Cc: Vlad Buslov, Marcelo Ricardo Leitner, netdev, jhs,
	xiyou.wangcong, jiri, davem, dcaratti, pabeni


On Tue 22 Oct 2019 at 21:17, Roman Mashak <mrv@mojatatu.com> wrote:
> Vlad Buslov <vladbu@mellanox.com> writes:
>
>> On Tue 22 Oct 2019 at 17:35, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
>>> On Tue, Oct 22, 2019 at 05:17:51PM +0300, Vlad Buslov wrote:
>>>> Currently, significant fraction of CPU time during TC filter allocation
>>>> is spent in percpu allocator. Moreover, percpu allocator is protected
>>>> with single global mutex which negates any potential to improve its
>>>> performance by means of recent developments in TC filter update API that
>>>> removed rtnl lock for some Qdiscs and classifiers. In order to
>>>> significantly improve filter update rate and reduce memory usage we
>>>> would like to allow users to skip percpu counters allocation for
>>>> specific action if they don't expect high traffic rate hitting the
>>>> action, which is a reasonable expectation for hardware-offloaded setup.
>>>> In that case any potential gains to software fast-path performance
>>>> gained by usage of percpu-allocated counters compared to regular integer
>>>> counters protected by spinlock are not important, but amount of
>>>> additional CPU and memory consumed by them is significant.
>>>
>>> Yes!
>>>
>>> I wonder how this can play together with conntrack offloading.  With
>>> it the sw datapath will be more used, as a conntrack entry can only be
>>> offloaded after the handshake.  That said, the host can have to
>>> process quite some handshakes in sw datapath.  Seems OvS can then just
>>> not set this flag in act_ct (and others for this rule), and such cases
>>> will be able to leverage the percpu stats.  Right?
>>
>> The flag is set per each actions instance so client can chose not to use
>> the flag in case-by-case basis. Conntrack use case requires further
>> investigation since I'm not entirely convinced that handling first few
>> packets in sw (before connection reaches established state and is
>> offloaded) warrants having percpu counter.
>
> Hi Vlad,
>
> Did you consider using TCA_ROOT_FLAGS instead of adding another
> per-action 32-bit flag?

Hi Roman,

I considered it, but didn't find a good way to implement my change with
TCA_ROOT_FLAGS. I needed some flags to be per-action for the following
reasons:

1. Not all actions support the flag (it is only implemented for hw
   offloaded actions).

2. TCA_ROOT_FLAGS is act API specific and I need this to work when
   actions are created with filters through the cls API. I guess I could
   have changed tcf_action_init_1() to require having TCA_ROOT_FLAGS
   before the actions attribute and then pass the obtained value to
   act_ops->init() as an additional argument, but that sounds more
   complex and ugly.

Regards,
Vlad

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-22 17:09     ` Marcelo Ricardo Leitner
@ 2019-10-23  6:42       ` Vlad Buslov
  2019-10-24 15:12       ` Roopa Prabhu
  1 sibling, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-23  6:42 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Vlad Buslov, netdev, jhs, xiyou.wangcong, jiri, davem, dcaratti


On Tue 22 Oct 2019 at 20:09, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
> On Tue, Oct 22, 2019 at 03:52:31PM +0000, Vlad Buslov wrote:
>>
>> On Tue 22 Oct 2019 at 18:15, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
>> > On Tue, Oct 22, 2019 at 05:17:51PM +0300, Vlad Buslov wrote:
>> >> - Extend actions that are used for hardware offloads with optional
>> >>   netlink 32bit flags field. Add TCA_ACT_FLAGS_FAST_INIT action flag and
>> >>   update affected actions to not allocate percpu counters when the flag
>> >>   is set.
>> >
>> > I just went over all the patches and they mostly make sense to me. So
>> > far the only point I'm uncertain of is the naming of the flag,
>> > "fast_init".  That is not clear on what it does and can be overloaded
>> > with other stuff later and we probably don't want that.
>>
>> I intentionally named it like that because I do want to overload it with
>> other stuff in future, instead of adding new flag value for every single
>> small optimization we might come up with :)
>
> Hah :-)
>
>>
>> Also, I didn't want to hardcode implementation details into UAPI that we
>> will have to maintain for long time after percpu allocator in kernel is
>> potentially replaced with something new and better (like idr is being
>> replaced with xarray now, for example)
>
> I see. OTOH, this also means that the UAPI here would be unstable
> (different meanings over time for the same call), and hopefully new
> behaviors would always be backwards compatible.

This flag doesn't change any userland-visible behavior, it just suggests
a different performance trade-off (insertion rate over sw packet
processing speed). Counters continue to work with or without the flag.
Any optimization that actually modifies cls or act API behavior will
have to use a dedicated flag value.
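
In code, the idea is that the stats update path checks whether percpu
counters were allocated for this action instance and, if not, falls back
to the spinlock-protected counters in struct tc_action. A rough sketch
of that (names and exact fields are illustrative, not the actual patch):

static void tcf_update_stats_sketch(struct tc_action *a, u64 bytes,
				    u32 packets, bool drop)
{
	if (a->cpu_bstats) {
		/* percpu path: lockless per-CPU accumulation */
		struct gnet_stats_basic_cpu *bstats =
			this_cpu_ptr(a->cpu_bstats);

		u64_stats_update_begin(&bstats->syncp);
		bstats->bstats.bytes += bytes;
		bstats->bstats.packets += packets;
		u64_stats_update_end(&bstats->syncp);
		if (drop)
			this_cpu_inc(a->cpu_qstats->drops);
		return;
	}

	/* this instance was created without percpu counters */
	spin_lock(&a->tcfa_lock);
	a->tcfa_bstats.bytes += bytes;
	a->tcfa_bstats.packets += packets;
	if (drop)
		a->tcfa_qstats.drops++;
	spin_unlock(&a->tcfa_lock);
}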

>
>>
>> Anyway, lets see what other people think. I'm open to changing it.
>>
>> >
>> > Say, for example, we want percpu counters but to disable allocating
>> > the stats for hw, to make the counter in 28169abadb08 ("net/sched: Add
>> > hardware specific counters to TC actions") optional.
>> >
>> > So what about:
>> > TCA_ACT_FLAGS_NO_PERCPU_STATS
>> > TCA_ACT_FLAGS_NO_HW_STATS (this one to be done on a subsequent patchset, yes)
>> > ?
>> >
>> >   Marcelo
>>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-22 14:17 [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Vlad Buslov
                   ` (15 preceding siblings ...)
  2019-10-22 15:15 ` Marcelo Ricardo Leitner
@ 2019-10-23 12:49 ` Jamal Hadi Salim
  2019-10-23 13:04   ` Vlad Buslov
  16 siblings, 1 reply; 64+ messages in thread
From: Jamal Hadi Salim @ 2019-10-23 12:49 UTC (permalink / raw)
  To: Vlad Buslov, netdev
  Cc: xiyou.wangcong, jiri, davem, mleitner, dcaratti, Eric Dumazet


Hi Vlad,

On 2019-10-22 10:17 a.m., Vlad Buslov wrote:
> Currently, significant fraction of CPU time during TC filter allocation
> is spent in percpu allocator. Moreover, percpu allocator is protected
> with single global mutex which negates any potential to improve its
> performance by means of recent developments in TC filter update API that
> removed rtnl lock for some Qdiscs and classifiers. In order to
> significantly improve filter update rate and reduce memory usage we
> would like to allow users to skip percpu counters allocation for
> specific action if they don't expect high traffic rate hitting the
> action, which is a reasonable expectation for hardware-offloaded setup.
> In that case any potential gains to software fast-path performance
> gained by usage of percpu-allocated counters compared to regular integer
> counters protected by spinlock are not important, but amount of
> additional CPU and memory consumed by them is significant.

Great to see this becoming low-hanging fruit
after your improvements.
Note: I had a discussion a few years back with Eric D. (on Cc)
when I was trying to improve action dumping; what you are seeing
was very visible when doing a large batch creation of actions.
At the time I was thinking of amortizing the cost of that mutex
in a batch action create, i.e. you ask the per-cpu allocator
to alloc a batch of the stats instead of one at a time.

I understand your use case is different since it is for h/w
offload. If you have time, can you test with batching many actions
and see the before/after improvement?

Note: even for h/w offload it makes sense to first create the actions
and then bind them to filters (in my world that's what we end up doing).
If we can improve the first phase it is a win for both s/w and h/w use
cases.
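
In tc terms that flow is roughly the following (device, match and index
are picked purely for illustration, and it assumes an ingress/clsact
qdisc is already in place):

	# create the action instance once, with an explicit index
	tc actions add action drop index 1

	# then let filters bind to the existing action by index instead
	# of instantiating a new gact (and new counters) per filter
	tc filter add dev eth0 protocol ip ingress prio 1 flower skip_hw \
	    dst_ip 192.0.2.1 action drop index 1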

Question:
Given TCA_ACT_FLAGS_FAST_INIT is common to all actions, could you have
used a TLV in the namespace of TCA_ACT_MAX (the outer TLV)? You will
have to pass a param to ->init().

cheers,
jamal

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-23  6:38       ` Vlad Buslov
@ 2019-10-23 13:02         ` Jamal Hadi Salim
  2019-10-23 13:08           ` Vlad Buslov
  0 siblings, 1 reply; 64+ messages in thread
From: Jamal Hadi Salim @ 2019-10-23 13:02 UTC (permalink / raw)
  To: Vlad Buslov, Roman Mashak
  Cc: Marcelo Ricardo Leitner, netdev, xiyou.wangcong, jiri, davem,
	dcaratti, pabeni


I should've read the thread backwards. My earlier email was asking a
similar question to Roman's.

On 2019-10-23 2:38 a.m., Vlad Buslov wrote:
> 
> On Tue 22 Oct 2019 at 21:17, Roman Mashak <mrv@mojatatu.com> wrote:

> Hi Roman,
> 
> I considered it, but didn't find good way to implement my change with
> TCA_ROOT_FLAGS. I needed some flags to be per-action for following
> reasons:
> 
> 1. Not all actions support the flag (only implemented for hw offloaded
>     actions).
> 
 >
> 2. TCA_ROOT_FLAGS is act API specific and I need this to work when
>     actions are created when actions are created with filters through cls
>     API. I guess I could have changed tcf_action_init_1() to require
>     having TCA_ROOT_FLAGS before actions attribute and then pass obtained
>     value to act_ops->init() as additional argument, but it sounds more
>     complex and ugly.

I really should've read the thread backwards ;->
The question is whether this is uglier than introducing a new TLV for
every action.

cheers,
jamal

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-23 12:49 ` Jamal Hadi Salim
@ 2019-10-23 13:04   ` Vlad Buslov
  2019-10-23 14:21     ` Jamal Hadi Salim
  0 siblings, 1 reply; 64+ messages in thread
From: Vlad Buslov @ 2019-10-23 13:04 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Vlad Buslov, netdev, xiyou.wangcong, jiri, davem, mleitner,
	dcaratti, Eric Dumazet


On Wed 23 Oct 2019 at 15:49, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> Hi Vlad,
>
> On 2019-10-22 10:17 a.m., Vlad Buslov wrote:
>> Currently, significant fraction of CPU time during TC filter allocation
>> is spent in percpu allocator. Moreover, percpu allocator is protected
>> with single global mutex which negates any potential to improve its
>> performance by means of recent developments in TC filter update API that
>> removed rtnl lock for some Qdiscs and classifiers. In order to
>> significantly improve filter update rate and reduce memory usage we
>> would like to allow users to skip percpu counters allocation for
>> specific action if they don't expect high traffic rate hitting the
>> action, which is a reasonable expectation for hardware-offloaded setup.
>> In that case any potential gains to software fast-path performance
>> gained by usage of percpu-allocated counters compared to regular integer
>> counters protected by spinlock are not important, but amount of
>> additional CPU and memory consumed by them is significant.
>
> Great to see this becoming low hanging on the fruit tree
> after your improvements.
> Note: had a discussion a few years back with Eric D.(on Cc)
> when i was trying to improve action dumping; what you are seeing
> was very visible when doing a large batch creation of actions.
> At the time i was thinking of amortizing the cost of that mutex
> in a batch action create i.e you ask the per cpu allocator
> to alloc a batch of the stats instead of singular.
>
> I understand your use case being different since it is for h/w
> offload. If you have time can you test with batching many actions
> and seeing the before/after improvement?

Will do.

>
> Note: even for h/w offload it makes sense to first create the actions
> then bind to filters (in my world thats what we end up doing).
> If we can improve the first phase it is a win for both s/w and hw use
> cases.
>
> Question:
> Given TCA_ACT_FLAGS_FAST_INIT is common to all actions would it make
> sense to use Could you have used a TLV in the namespace of TCA_ACT_MAX
> (outer TLV)? You will have to pass a param to ->init().

It is not common to all actions. I omitted modifying actions that are
not offloaded, and some actions don't use the percpu allocator at all
(pedit, for example) and have no use for this flag at the moment.
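
For the offloaded actions that do use it, the per-action change is
mostly about the last argument of tcf_idr_create(), e.g. for gact
(sketch of the idea, not the exact hunk from the patch):

	/* allocate percpu counters only when the flag is not set */
	ret = tcf_idr_create(tn, index, est, a, &act_gact_ops, bind,
			     !(flags & TCA_ACT_FLAGS_FAST_INIT));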

>
> cheers,
> jamal

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-23 13:02         ` Jamal Hadi Salim
@ 2019-10-23 13:08           ` Vlad Buslov
  0 siblings, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-23 13:08 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Vlad Buslov, Roman Mashak, Marcelo Ricardo Leitner, netdev,
	xiyou.wangcong, jiri, davem, dcaratti, pabeni


On Wed 23 Oct 2019 at 16:02, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> I shouldve read the thread backward. My earlier email was asking
> similar question to Roman.
>
> On 2019-10-23 2:38 a.m., Vlad Buslov wrote:
>>
>> On Tue 22 Oct 2019 at 21:17, Roman Mashak <mrv@mojatatu.com> wrote:
>
>> Hi Roman,
>>
>> I considered it, but didn't find good way to implement my change with
>> TCA_ROOT_FLAGS. I needed some flags to be per-action for following
>> reasons:
>>
>> 1. Not all actions support the flag (only implemented for hw offloaded
>>     actions).
>>
>>
>> 2. TCA_ROOT_FLAGS is act API specific and I need this to work when
>>     actions are created when actions are created with filters through cls
>>     API. I guess I could have changed tcf_action_init_1() to require
>>     having TCA_ROOT_FLAGS before actions attribute and then pass obtained
>>     value to act_ops->init() as additional argument, but it sounds more
>>     complex and ugly.
>
> I really shouldve read the thread backwards;->
> The question is if this uglier than introducing a new TLV for every
> action.

I'm not introducing it for every action, only for minority of them.

>
> cheers,
> jamal

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-23 13:04   ` Vlad Buslov
@ 2019-10-23 14:21     ` Jamal Hadi Salim
  2019-10-23 15:04       ` Vlad Buslov
  2019-10-24  7:35       ` Jiri Pirko
  0 siblings, 2 replies; 64+ messages in thread
From: Jamal Hadi Salim @ 2019-10-23 14:21 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: netdev, xiyou.wangcong, jiri, davem, mleitner, dcaratti, Eric Dumazet

On 2019-10-23 9:04 a.m., Vlad Buslov wrote:
> 
> On Wed 23 Oct 2019 at 15:49, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>> Hi Vlad,
>>

>> I understand your use case being different since it is for h/w
>> offload. If you have time can you test with batching many actions
>> and seeing the before/after improvement?
> 
> Will do.

Thanks.

I think you may have published numbers before, but it would be interesting
to see the before and after of adding the action first and measuring the
filter improvement without caring about the allocator.

> 
>>
>> Note: even for h/w offload it makes sense to first create the actions
>> then bind to filters (in my world thats what we end up doing).
>> If we can improve the first phase it is a win for both s/w and hw use
>> cases.
>>
>> Question:
>> Given TCA_ACT_FLAGS_FAST_INIT is common to all actions would it make
>> sense to use Could you have used a TLV in the namespace of TCA_ACT_MAX
>> (outer TLV)? You will have to pass a param to ->init().
> 
> It is not common for all actions. I omitted modifying actions that are
> not offloaded and some actions don't user percpu allocator at all
> (pedit, for example) and have no use for this flag at the moment.

pedit just never got updated (it's simple to update). There is
value in the software to have _all_ the actions use per-cpu stats.
It improves fast-path performance.

Jiri complains constantly about all these new per-action TLVs
which are generic. He promised to "fix it all" someday. Jiri, I notice
your ack here, what happened? ;->

cheers,
jamal

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-23 14:21     ` Jamal Hadi Salim
@ 2019-10-23 15:04       ` Vlad Buslov
  2019-10-24  7:35       ` Jiri Pirko
  1 sibling, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-23 15:04 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Vlad Buslov, netdev, xiyou.wangcong, jiri, davem, mleitner,
	dcaratti, Eric Dumazet


On Wed 23 Oct 2019 at 17:21, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 2019-10-23 9:04 a.m., Vlad Buslov wrote:
>>
>> On Wed 23 Oct 2019 at 15:49, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>>> Hi Vlad,
>>>
>
>>> I understand your use case being different since it is for h/w
>>> offload. If you have time can you test with batching many actions
>>> and seeing the before/after improvement?
>>
>> Will do.
>
> Thanks.
>
> I think you may have published number before, but would be interesting
> to see the before and after of adding the action first and measuring the
> filter improvement without caring about the allocator.

For a filter with a single gact drop action (first line in the insertion
rate table in the cover letter) I get an insertion rate of 412k rules/sec
with all of the actions preallocated in advance, which is a 2x improvement.

>
>>
>>>
>>> Note: even for h/w offload it makes sense to first create the actions
>>> then bind to filters (in my world thats what we end up doing).
>>> If we can improve the first phase it is a win for both s/w and hw use
>>> cases.
>>>
>>> Question:
>>> Given TCA_ACT_FLAGS_FAST_INIT is common to all actions would it make
>>> sense to use Could you have used a TLV in the namespace of TCA_ACT_MAX
>>> (outer TLV)? You will have to pass a param to ->init().
>>
>> It is not common for all actions. I omitted modifying actions that are
>> not offloaded and some actions don't user percpu allocator at all
>> (pedit, for example) and have no use for this flag at the moment.
>
> pedit just never got updated (its simple to update). There is
> value in the software to have _all_ the actions use per cpu stats.
> It improves fast path performance.
>
> Jiri complains constantly about all these new per-action TLVs
> which are generic. He promised to "fix it all" someday. Jiri i notice
> your ack here, what happened? ;->
>
> cheers,
> jamal

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-23 14:21     ` Jamal Hadi Salim
  2019-10-23 15:04       ` Vlad Buslov
@ 2019-10-24  7:35       ` Jiri Pirko
  2019-10-24 15:23         ` Vlad Buslov
  1 sibling, 1 reply; 64+ messages in thread
From: Jiri Pirko @ 2019-10-24  7:35 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Vlad Buslov, netdev, xiyou.wangcong, davem, mleitner, dcaratti,
	Eric Dumazet

Wed, Oct 23, 2019 at 04:21:51PM CEST, jhs@mojatatu.com wrote:
>On 2019-10-23 9:04 a.m., Vlad Buslov wrote:
>> 
>> On Wed 23 Oct 2019 at 15:49, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>> > Hi Vlad,
>> > 
>
>> > I understand your use case being different since it is for h/w
>> > offload. If you have time can you test with batching many actions
>> > and seeing the before/after improvement?
>> 
>> Will do.
>
>Thanks.
>
>I think you may have published number before, but would be interesting
>to see the before and after of adding the action first and measuring the
>filter improvement without caring about the allocator.
>
>> 
>> > 
>> > Note: even for h/w offload it makes sense to first create the actions
>> > then bind to filters (in my world thats what we end up doing).
>> > If we can improve the first phase it is a win for both s/w and hw use
>> > cases.
>> > 
>> > Question:
>> > Given TCA_ACT_FLAGS_FAST_INIT is common to all actions would it make
>> > sense to use Could you have used a TLV in the namespace of TCA_ACT_MAX
>> > (outer TLV)? You will have to pass a param to ->init().
>> 
>> It is not common for all actions. I omitted modifying actions that are
>> not offloaded and some actions don't user percpu allocator at all
>> (pedit, for example) and have no use for this flag at the moment.
>
>pedit just never got updated (its simple to update). There is
>value in the software to have _all_ the actions use per cpu stats.
>It improves fast path performance.
>
>Jiri complains constantly about all these new per-action TLVs
>which are generic. He promised to "fix it all" someday. Jiri i notice
>your ack here, what happened? ;->

Correct, it would be great. However, I'm not sure how exactly to do that
now. Do you have some ideas?

But basically this patchset does what was done many, many times in the
past. I think it was a mistake in the original design not to have some
"common attrs" :/ Lesson learned for the next interfaces.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-22 17:09     ` Marcelo Ricardo Leitner
  2019-10-23  6:42       ` Vlad Buslov
@ 2019-10-24 15:12       ` Roopa Prabhu
  2019-10-24 15:18         ` Vlad Buslov
  1 sibling, 1 reply; 64+ messages in thread
From: Roopa Prabhu @ 2019-10-24 15:12 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Vlad Buslov, netdev, jhs, xiyou.wangcong, jiri, davem, dcaratti

On Tue, Oct 22, 2019 at 10:10 AM Marcelo Ricardo Leitner
<mleitner@redhat.com> wrote:
>
> On Tue, Oct 22, 2019 at 03:52:31PM +0000, Vlad Buslov wrote:
> >
> > On Tue 22 Oct 2019 at 18:15, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
> > > On Tue, Oct 22, 2019 at 05:17:51PM +0300, Vlad Buslov wrote:
> > >> - Extend actions that are used for hardware offloads with optional
> > >>   netlink 32bit flags field. Add TCA_ACT_FLAGS_FAST_INIT action flag and
> > >>   update affected actions to not allocate percpu counters when the flag
> > >>   is set.
> > >
> > > I just went over all the patches and they mostly make sense to me. So
> > > far the only point I'm uncertain of is the naming of the flag,
> > > "fast_init".  That is not clear on what it does and can be overloaded
> > > with other stuff later and we probably don't want that.
> >
> > I intentionally named it like that because I do want to overload it with
> > other stuff in future, instead of adding new flag value for every single
> > small optimization we might come up with :)
>
> Hah :-)
>
> >
> > Also, I didn't want to hardcode implementation details into UAPI that we
> > will have to maintain for long time after percpu allocator in kernel is
> > potentially replaced with something new and better (like idr is being
> > replaced with xarray now, for example)
>
> I see. OTOH, this also means that the UAPI here would be unstable
> (different meanings over time for the same call), and hopefully new
> behaviors would always be backwards compatible.

+1, I also think optimization flags should be specific to what they optimize.
TCA_ACT_FLAGS_NO_PERCPU_STATS seems like a better choice. It allows the
user to explicitly request it.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-24 15:12       ` Roopa Prabhu
@ 2019-10-24 15:18         ` Vlad Buslov
  2019-10-24 17:31           ` Marcelo Ricardo Leitner
  0 siblings, 1 reply; 64+ messages in thread
From: Vlad Buslov @ 2019-10-24 15:18 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: Marcelo Ricardo Leitner, Vlad Buslov, netdev, jhs,
	xiyou.wangcong, jiri, davem, dcaratti


On Thu 24 Oct 2019 at 18:12, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
> On Tue, Oct 22, 2019 at 10:10 AM Marcelo Ricardo Leitner
> <mleitner@redhat.com> wrote:
>>
>> On Tue, Oct 22, 2019 at 03:52:31PM +0000, Vlad Buslov wrote:
>> >
>> > On Tue 22 Oct 2019 at 18:15, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
>> > > On Tue, Oct 22, 2019 at 05:17:51PM +0300, Vlad Buslov wrote:
>> > >> - Extend actions that are used for hardware offloads with optional
>> > >>   netlink 32bit flags field. Add TCA_ACT_FLAGS_FAST_INIT action flag and
>> > >>   update affected actions to not allocate percpu counters when the flag
>> > >>   is set.
>> > >
>> > > I just went over all the patches and they mostly make sense to me. So
>> > > far the only point I'm uncertain of is the naming of the flag,
>> > > "fast_init".  That is not clear on what it does and can be overloaded
>> > > with other stuff later and we probably don't want that.
>> >
>> > I intentionally named it like that because I do want to overload it with
>> > other stuff in future, instead of adding new flag value for every single
>> > small optimization we might come up with :)
>>
>> Hah :-)
>>
>> >
>> > Also, I didn't want to hardcode implementation details into UAPI that we
>> > will have to maintain for long time after percpu allocator in kernel is
>> > potentially replaced with something new and better (like idr is being
>> > replaced with xarray now, for example)
>>
>> I see. OTOH, this also means that the UAPI here would be unstable
>> (different meanings over time for the same call), and hopefully new
>> behaviors would always be backwards compatible.
>
> +1, I also think optimization flags should be specific to what they optimize.
> TCA_ACT_FLAGS_NO_PERCPU_STATS seems like a better choice. It allows
> user to explicitly request for it.

Why would the user care about details of optimizations that don't produce
visible side effects for user land? Again, the counters continue to work
the same with or without the flag.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-24  7:35       ` Jiri Pirko
@ 2019-10-24 15:23         ` Vlad Buslov
  2019-10-24 16:12           ` Jamal Hadi Salim
  0 siblings, 1 reply; 64+ messages in thread
From: Vlad Buslov @ 2019-10-24 15:23 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Jiri Pirko, Vlad Buslov, netdev, xiyou.wangcong, davem, mleitner,
	dcaratti, Eric Dumazet

On Thu 24 Oct 2019 at 10:35, Jiri Pirko <jiri@resnulli.us> wrote:
> Wed, Oct 23, 2019 at 04:21:51PM CEST, jhs@mojatatu.com wrote:
>>On 2019-10-23 9:04 a.m., Vlad Buslov wrote:
>>> 
>>> On Wed 23 Oct 2019 at 15:49, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>>> > Hi Vlad,
>>> > 
>>
>>> > I understand your use case being different since it is for h/w
>>> > offload. If you have time can you test with batching many actions
>>> > and seeing the before/after improvement?
>>> 
>>> Will do.
>>
>>Thanks.
>>
>>I think you may have published number before, but would be interesting
>>to see the before and after of adding the action first and measuring the
>>filter improvement without caring about the allocator.
>>
>>> 
>>> > 
>>> > Note: even for h/w offload it makes sense to first create the actions
>>> > then bind to filters (in my world thats what we end up doing).
>>> > If we can improve the first phase it is a win for both s/w and hw use
>>> > cases.
>>> > 
>>> > Question:
>>> > Given TCA_ACT_FLAGS_FAST_INIT is common to all actions would it make
>>> > sense to use Could you have used a TLV in the namespace of TCA_ACT_MAX
>>> > (outer TLV)? You will have to pass a param to ->init().
>>> 
>>> It is not common for all actions. I omitted modifying actions that are
>>> not offloaded and some actions don't user percpu allocator at all
>>> (pedit, for example) and have no use for this flag at the moment.
>>
>>pedit just never got updated (its simple to update). There is
>>value in the software to have _all_ the actions use per cpu stats.
>>It improves fast path performance.
>>
>>Jiri complains constantly about all these new per-action TLVs
>>which are generic. He promised to "fix it all" someday. Jiri i notice
>>your ack here, what happened? ;->
>
> Correct, it would be great. However not sure how exactly to do that now.
> Do you have some ideas.
>
> But basically this patchset does what was done many many times in the
> past. I think it was a mistake in the original design not to have some
> "common attrs" :/ Lesson learned for next interfaces.

Jamal, can we reach some conclusion? Do you still suggest refactoring
the patches to use TCA_ROOT_FLAGS and parsing it in the act API instead
of in the individual actions?

Thanks,
Vlad

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-24 15:23         ` Vlad Buslov
@ 2019-10-24 16:12           ` Jamal Hadi Salim
  2019-10-24 16:44             ` Vlad Buslov
  2019-10-25 14:26             ` Vlad Buslov
  0 siblings, 2 replies; 64+ messages in thread
From: Jamal Hadi Salim @ 2019-10-24 16:12 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner, dcaratti,
	Eric Dumazet

On 2019-10-24 11:23 a.m., Vlad Buslov wrote:
> On Thu 24 Oct 2019 at 10:35, Jiri Pirko <jiri@resnulli.us> wrote:
>> Wed, Oct 23, 2019 at 04:21:51PM CEST, jhs@mojatatu.com wrote:
>>> On 2019-10-23 9:04 a.m., Vlad Buslov wrote:
>>>>

[..]
>>> Jiri complains constantly about all these new per-action TLVs
>>> which are generic. He promised to "fix it all" someday. Jiri i notice
>>> your ack here, what happened? ;->
>>
>> Correct, it would be great. However not sure how exactly to do that now.
>> Do you have some ideas.
>>
 >>
>> But basically this patchset does what was done many many times in the
>> past. I think it was a mistake in the original design not to have some
>> "common attrs" :/ Lesson learned for next interfaces.
> 

Jiri, we have a high level TLV space which can be applied to all
actions. See discussion below with Vlad. At least for this specific
change we can get away from repeating that mistake.

> Jamal, can we reach some conclusion? Do you still suggest to refactor
> the patches to use TCA_ROOT_FLAGS and parse it in act API instead of
> individual actions?
> 

IMO this would certainly help us walk away from having
every action replicate the same attribute with common semantics.
Refactoring ->init() to take an extra attribute may look ugly but
is cleaner from a uapi pov. Actions that don't implement the feature
can ignore the extra parameter(s). It doesn't have to be TCA_ROOT_FLAGS
but certainly that high-level TLV space is the right place to put it.
WDYT?

cheers,
jamal

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-24 16:12           ` Jamal Hadi Salim
@ 2019-10-24 16:44             ` Vlad Buslov
  2019-10-24 17:17               ` Jamal Hadi Salim
  2019-10-25 14:26             ` Vlad Buslov
  1 sibling, 1 reply; 64+ messages in thread
From: Vlad Buslov @ 2019-10-24 16:44 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Vlad Buslov, Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner,
	dcaratti, Eric Dumazet


On Thu 24 Oct 2019 at 19:12, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 2019-10-24 11:23 a.m., Vlad Buslov wrote:
>> On Thu 24 Oct 2019 at 10:35, Jiri Pirko <jiri@resnulli.us> wrote:
>>> Wed, Oct 23, 2019 at 04:21:51PM CEST, jhs@mojatatu.com wrote:
>>>> On 2019-10-23 9:04 a.m., Vlad Buslov wrote:
>>>>>
>
> [..]
>>>> Jiri complains constantly about all these new per-action TLVs
>>>> which are generic. He promised to "fix it all" someday. Jiri i notice
>>>> your ack here, what happened? ;->
>>>
>>> Correct, it would be great. However not sure how exactly to do that now.
>>> Do you have some ideas.
>>>
>>>
>>> But basically this patchset does what was done many many times in the
>>> past. I think it was a mistake in the original design not to have some
>>> "common attrs" :/ Lesson learned for next interfaces.
>>
>
> Jiri, we have a high level TLV space which can be applied to all
> actions. See discussion below with Vlad. At least for this specific
> change we can get away from repeating that mistake.
>
>> Jamal, can we reach some conclusion? Do you still suggest to refactor
>> the patches to use TCA_ROOT_FLAGS and parse it in act API instead of
>> individual actions?
>>
>
> IMO this would certainly help us walk away from having
> every action replicate the same attribute with common semantics.
> refactoring ->init() to take an extra attribute may look ugly but
> is cleaner from a uapi pov. Actions that dont implement the feature
> can ignore the extra parameter(s). It doesnt have to be TCA_ROOT_FLAGS
> but certainly that high level TLV space is the right place to put it.
> WDYT?
>
> cheers,
> jamal

Well, I like having it per-action better because of the reasons I explained
before (some actions don't use the percpu allocator at all and some actions
that are not hw offloaded don't need it), but I think both solutions
have their benefits and drawbacks, so I'm fine with refactoring it.

Do you have any opinion regarding flag naming? Several people suggested
being more specific, but I strongly dislike the idea of hardcoding the
name of an internal kernel data structure in a UAPI constant that will
potentially outlive the data structure by a long time.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-24 16:44             ` Vlad Buslov
@ 2019-10-24 17:17               ` Jamal Hadi Salim
  2019-10-24 18:05                 ` Vlad Buslov
  0 siblings, 1 reply; 64+ messages in thread
From: Jamal Hadi Salim @ 2019-10-24 17:17 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner, dcaratti,
	Eric Dumazet

Hi Vlad,

On 2019-10-24 12:44 p.m., Vlad Buslov wrote:

> 
> Well, I like having it per-action better because of reasons I explained
> before (some actions don't use percpu allocator at all and some actions
> that are not hw offloaded don't need it), but I think both solutions
> have their benefits and drawbacks, so I'm fine with refactoring it.
>

I am happy you are doing all this great work already. I would be happier 
if you did it at the root level. It is something that we have been
meaning to deal with for a while now.

> Do you have any opinion regarding flag naming? Several people suggested
> to be more specific, but I strongly dislike the idea of hardcoding the
> name of a internal kernel data structure in UAPI constant that will
> potentially outlive the data structure by a long time.

Could you not just name the bit with a define to say what the bit
is for and still use the top-level flag? For example, we have
a bit called "TCA_FLAG_LARGE_DUMP_ON".
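
Concretely, something like this (the new name is only a placeholder to
show the idea; it reuses the existing TCA_ROOT_FLAGS bitfield32
attribute):

	/* include/uapi/linux/rtnetlink.h */
	#define TCA_FLAG_LARGE_DUMP_ON		(1 << 0)	/* existing */
	#define TCA_FLAG_NO_PERCPU_STATS	(1 << 1)	/* new bit, name TBD */

	/* act API side: read the bit and hand it down to the actions */
	u32 flags = 0;

	if (tca[TCA_ROOT_FLAGS])
		flags = nla_get_bitfield32(tca[TCA_ROOT_FLAGS]).value;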

cheers,
jamal


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-24 15:18         ` Vlad Buslov
@ 2019-10-24 17:31           ` Marcelo Ricardo Leitner
  2019-10-24 18:03             ` Vlad Buslov
  0 siblings, 1 reply; 64+ messages in thread
From: Marcelo Ricardo Leitner @ 2019-10-24 17:31 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Roopa Prabhu, netdev, jhs, xiyou.wangcong, jiri, davem, dcaratti

On Thu, Oct 24, 2019 at 03:18:00PM +0000, Vlad Buslov wrote:
> 
> On Thu 24 Oct 2019 at 18:12, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
> > On Tue, Oct 22, 2019 at 10:10 AM Marcelo Ricardo Leitner
> > <mleitner@redhat.com> wrote:
> >>
> >> On Tue, Oct 22, 2019 at 03:52:31PM +0000, Vlad Buslov wrote:
> >> >
> >> > On Tue 22 Oct 2019 at 18:15, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
> >> > > On Tue, Oct 22, 2019 at 05:17:51PM +0300, Vlad Buslov wrote:
> >> > >> - Extend actions that are used for hardware offloads with optional
> >> > >>   netlink 32bit flags field. Add TCA_ACT_FLAGS_FAST_INIT action flag and
> >> > >>   update affected actions to not allocate percpu counters when the flag
> >> > >>   is set.
> >> > >
> >> > > I just went over all the patches and they mostly make sense to me. So
> >> > > far the only point I'm uncertain of is the naming of the flag,
> >> > > "fast_init".  That is not clear on what it does and can be overloaded
> >> > > with other stuff later and we probably don't want that.
> >> >
> >> > I intentionally named it like that because I do want to overload it with
> >> > other stuff in future, instead of adding new flag value for every single
> >> > small optimization we might come up with :)
> >>
> >> Hah :-)
> >>
> >> >
> >> > Also, I didn't want to hardcode implementation details into UAPI that we
> >> > will have to maintain for long time after percpu allocator in kernel is
> >> > potentially replaced with something new and better (like idr is being
> >> > replaced with xarray now, for example)
> >>
> >> I see. OTOH, this also means that the UAPI here would be unstable
> >> (different meanings over time for the same call), and hopefully new
> >> behaviors would always be backwards compatible.
> >
> > +1, I also think optimization flags should be specific to what they optimize.
> > TCA_ACT_FLAGS_NO_PERCPU_STATS seems like a better choice. It allows
> > user to explicitly request for it.
> 
> Why would user care about details of optimizations that doesn't produce
> visible side effects for user land? Again, counters continue to work the
> same with or without the flag.

It's not just details of optimization, like whether to use likely() or
not, and it does produce user-visible effects. Not in terms of API but
of system behavior. Otherwise we wouldn't need the flag, right?
_FAST_INIT, or the fact that it inits faster, is just one of the
aspects of it, but one could also want it for being lighter in
footprint, as currently it is really easy to eat Gigs of RAM away on
these percpu counters. So how should it be called, _FAST_INIT or
_SLIM_RULES?
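
For scale, a back-of-the-envelope example (all numbers assumed, purely
to illustrate): 1,000,000 actions x 64 CPUs x ~32 bytes of percpu stats
per CPU per action is already about 2 GB, before counting the percpu
allocator's own bookkeeping.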

It may be an implementation detail, yes, but we shouldn't be building
usage profiles and should instead let the user pick what they want.
Likewise, we don't have net.ipv4.fast_tcp, but net.ipv4.tcp_sack,
tcp_timestamps & co.

If we can find another name then, without using 'percpu' in it but
without establishing profiles, that would be nice.
Like TCA_ACT_FLAGS_SIMPLE_STATS, or so.

Even so, I still prefer PERCPU, as it's as explicit as it can be. Note
that bpf does it like that already:
uapi]$ grep BPF.*HASH -- linux/bpf.h
        BPF_MAP_TYPE_HASH,
        BPF_MAP_TYPE_PERCPU_HASH,
        BPF_MAP_TYPE_LRU_HASH,
        BPF_MAP_TYPE_LRU_PERCPU_HASH,
...


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-24 17:31           ` Marcelo Ricardo Leitner
@ 2019-10-24 18:03             ` Vlad Buslov
  0 siblings, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-24 18:03 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Vlad Buslov, Roopa Prabhu, netdev, jhs, xiyou.wangcong, jiri,
	davem, dcaratti


On Thu 24 Oct 2019 at 20:31, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
> On Thu, Oct 24, 2019 at 03:18:00PM +0000, Vlad Buslov wrote:
>>
>> On Thu 24 Oct 2019 at 18:12, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
>> > On Tue, Oct 22, 2019 at 10:10 AM Marcelo Ricardo Leitner
>> > <mleitner@redhat.com> wrote:
>> >>
>> >> On Tue, Oct 22, 2019 at 03:52:31PM +0000, Vlad Buslov wrote:
>> >> >
>> >> > On Tue 22 Oct 2019 at 18:15, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
>> >> > > On Tue, Oct 22, 2019 at 05:17:51PM +0300, Vlad Buslov wrote:
>> >> > >> - Extend actions that are used for hardware offloads with optional
>> >> > >>   netlink 32bit flags field. Add TCA_ACT_FLAGS_FAST_INIT action flag and
>> >> > >>   update affected actions to not allocate percpu counters when the flag
>> >> > >>   is set.
>> >> > >
>> >> > > I just went over all the patches and they mostly make sense to me. So
>> >> > > far the only point I'm uncertain of is the naming of the flag,
>> >> > > "fast_init".  That is not clear on what it does and can be overloaded
>> >> > > with other stuff later and we probably don't want that.
>> >> >
>> >> > I intentionally named it like that because I do want to overload it with
>> >> > other stuff in future, instead of adding new flag value for every single
>> >> > small optimization we might come up with :)
>> >>
>> >> Hah :-)
>> >>
>> >> >
>> >> > Also, I didn't want to hardcode implementation details into UAPI that we
>> >> > will have to maintain for long time after percpu allocator in kernel is
>> >> > potentially replaced with something new and better (like idr is being
>> >> > replaced with xarray now, for example)
>> >>
>> >> I see. OTOH, this also means that the UAPI here would be unstable
>> >> (different meanings over time for the same call), and hopefully new
>> >> behaviors would always be backwards compatible.
>> >
>> > +1, I also think optimization flags should be specific to what they optimize.
>> > TCA_ACT_FLAGS_NO_PERCPU_STATS seems like a better choice. It allows
>> > user to explicitly request for it.
>>
>> Why would user care about details of optimizations that doesn't produce
>> visible side effects for user land? Again, counters continue to work the
>> same with or without the flag.
>
> It's just just details of optimizations, on whether to use likely() or
> not, and it does produce user visible effects. Not in terms of API but
> of system behavior. Otherwise we wouldn't need the flag, right?
> _FAST_INIT, or the fact that it inits faster, is just one of the
> aspects of it, but one could also want it for being lighther in
> footprint as currently it is really easy to eat Gigs of RAM away on
> these percpu counters. So how should it be called, _FAST_INIT or
> _SLIM_RULES?

Hmm, yes, I think I emphasize insertion rate too much because it was my
main goal, but memory usage is also important.

>
> It may be implementation detail, yes, but we shouldn't be building
> usage profiles and instead let the user pick what they want. Likewise,
> we don't have net.ipv4.fast_tcp, but net.ipv4.tcp_sack, tcp_timestamps
> & cia.
>
> If we can find another name then, without using 'percpu' on it but
> without stablishing profiles, that would be nice.
> Like TCA_ACT_FLAGS_SIMPLE_STATS, or so.
>
> Even though I still prefer the PERCPU, as it's as explicit as it can be. Note
> that bpf does it like that already:
> uapi]$ grep BPF.*HASH -- linux/bpf.h
>         BPF_MAP_TYPE_HASH,
>         BPF_MAP_TYPE_PERCPU_HASH,
>         BPF_MAP_TYPE_LRU_HASH,
>         BPF_MAP_TYPE_LRU_PERCPU_HASH,
> ...

I would argue that all of these produce visible side effects for the
user. Selective acks are clearly visible in packet dumps, BPF maps are
directly accessed from BPF programs loaded from user space, etc. But I
get your point that the naming can be improved from FAST_INIT, which is
too abstract.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-24 17:17               ` Jamal Hadi Salim
@ 2019-10-24 18:05                 ` Vlad Buslov
  2019-10-25 11:46                   ` Jamal Hadi Salim
  0 siblings, 1 reply; 64+ messages in thread
From: Vlad Buslov @ 2019-10-24 18:05 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Vlad Buslov, Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner,
	dcaratti, Eric Dumazet


On Thu 24 Oct 2019 at 20:17, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> Hi Vlad,
>
> On 2019-10-24 12:44 p.m., Vlad Buslov wrote:
>
>>
>> Well, I like having it per-action better because of reasons I explained
>> before (some actions don't use percpu allocator at all and some actions
>> that are not hw offloaded don't need it), but I think both solutions
>> have their benefits and drawbacks, so I'm fine with refactoring it.
>>
>
> I am happy you are doing all this great work already. I would be happier if you
> did it at the root level. It is something that we have been
> meaning to deal with for a while now.
>
>> Do you have any opinion regarding flag naming? Several people suggested
>> to be more specific, but I strongly dislike the idea of hardcoding the
>> name of a internal kernel data structure in UAPI constant that will
>> potentially outlive the data structure by a long time.
>
> Could you not just name the bit with a define to say what the bit
> is for and still use the top level flag? Example we have
> a bit called "TCA_FLAG_LARGE_DUMP_ON"

Yes, of course. I was talking strictly about the naming of the
TCA_ACT_FLAGS_FAST_INIT flag value constant.

>
> cheers,
> jamal


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-24 18:05                 ` Vlad Buslov
@ 2019-10-25 11:46                   ` Jamal Hadi Salim
  2019-10-25 11:55                     ` Vlad Buslov
  0 siblings, 1 reply; 64+ messages in thread
From: Jamal Hadi Salim @ 2019-10-25 11:46 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner, dcaratti,
	Eric Dumazet

On 2019-10-24 2:05 p.m., Vlad Buslov wrote:
> 

> 
> Yes, of course. I was talking strictly about naming of
> TCA_ACT_FLAGS_FAST_INIT flag value constant.
> 

Sorry, I missed that. You know discussions on names and
punctuation can last a lifetime ;-> [1].
A name which reflects semantics or established style
is always helpful. So Marcelo's suggestion is reasonable.
In the same spirit as TCA_FLAG_LARGE_DUMP_ON, how about:
TCA_FLAG_NO_PERCPU_STATS?

cheers,
jamal
[1]http://bikeshed.com/

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-25 11:46                   ` Jamal Hadi Salim
@ 2019-10-25 11:55                     ` Vlad Buslov
  0 siblings, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-25 11:55 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Vlad Buslov, Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner,
	dcaratti, Eric Dumazet

On Fri 25 Oct 2019 at 14:46, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 2019-10-24 2:05 p.m., Vlad Buslov wrote:
>>
>
>>
>> Yes, of course. I was talking strictly about naming of
>> TCA_ACT_FLAGS_FAST_INIT flag value constant.
>>
>
> Sorry, I missed that. You know discussions on names and
> punctiations can last a lifetime;->[1].
> A name which reflects semantics or established style
> is always helpful. So Marcelo's suggestion is reasonable..
> In the same spirit as TCA_FLAG_LARGE_DUMP_ON, how about:
> TCA_FLAG_NO_PERCPU_STATS?
>
> cheers,
> jamal
> [1]http://bikeshed.com/

It looks like I'm the only one who doesn't like to hardcode internal
kernel data structure names into UAPI. Will change it in V2.

Thanks,
Vlad

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-24 16:12           ` Jamal Hadi Salim
  2019-10-24 16:44             ` Vlad Buslov
@ 2019-10-25 14:26             ` Vlad Buslov
  2019-10-25 14:57               ` Jamal Hadi Salim
  1 sibling, 1 reply; 64+ messages in thread
From: Vlad Buslov @ 2019-10-25 14:26 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Vlad Buslov, Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner,
	dcaratti, Eric Dumazet

On Thu 24 Oct 2019 at 19:12, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 2019-10-24 11:23 a.m., Vlad Buslov wrote:
>> On Thu 24 Oct 2019 at 10:35, Jiri Pirko <jiri@resnulli.us> wrote:
>>> Wed, Oct 23, 2019 at 04:21:51PM CEST, jhs@mojatatu.com wrote:
>>>> On 2019-10-23 9:04 a.m., Vlad Buslov wrote:
>>>>>
>
> [..]
>>>> Jiri complains constantly about all these new per-action TLVs
>>>> which are generic. He promised to "fix it all" someday. Jiri i notice
>>>> your ack here, what happened? ;->
>>>
>>> Correct, it would be great. However not sure how exactly to do that now.
>>> Do you have some ideas.
>>>
>>>
>>> But basically this patchset does what was done many many times in the
>>> past. I think it was a mistake in the original design not to have some
>>> "common attrs" :/ Lesson learned for next interfaces.
>>
>
> Jiri, we have a high level TLV space which can be applied to all
> actions. See discussion below with Vlad. At least for this specific
> change we can get away from repeating that mistake.
>
>> Jamal, can we reach some conclusion? Do you still suggest to refactor
>> the patches to use TCA_ROOT_FLAGS and parse it in act API instead of
>> individual actions?
>>
>
> IMO this would certainly help us walk away from having
> every action replicate the same attribute with common semantics.
> refactoring ->init() to take an extra attribute may look ugly but
> is cleaner from a uapi pov. Actions that dont implement the feature
> can ignore the extra parameter(s). It doesnt have to be TCA_ROOT_FLAGS
> but certainly that high level TLV space is the right place to put it.
> WDYT?
>
> cheers,
> jamal

Hi Jamal,

I've been trying to re-design this change to use a high-level TLV, but I
don't understand how to make it backward compatible. If I just put the
new attribute before the nested action attributes, it will fail to parse
when an older client sends a netlink packet without it, and I don't see
a straightforward way to distinguish between the new attribute and the
following nested action attribute (which might accidentally have the
same length and fall into the allowed attribute range). I don't have
much experience working with netlink, so I might be missing something
obvious here. However, extending the existing action attributes (which
are already conveniently parsed in the tcf_action_init_1() function)
with a new optional 'flags' value seems like a straightforward and
backward compatible solution.

Rough outline in code:

2 files changed, 6 insertions(+), 2 deletions(-)
include/uapi/linux/pkt_cls.h | 1 +
net/sched/act_api.c          | 7 +++++--

modified   include/uapi/linux/pkt_cls.h
@@ -16,6 +16,7 @@ enum {
 	TCA_ACT_STATS,
 	TCA_ACT_PAD,
 	TCA_ACT_COOKIE,
+	TCA_ACT_FLAGS,
 	__TCA_ACT_MAX
 };
 
modified   net/sched/act_api.c
@@ -852,6 +852,7 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 	char act_name[IFNAMSIZ];
 	struct nlattr *tb[TCA_ACT_MAX + 1];
 	struct nlattr *kind;
+	u32 flags = 0;
 	int err;
 
 	if (name == NULL) {
@@ -877,6 +878,8 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 				goto err_out;
 			}
 		}
+		if (tb[TCA_ACT_FLAGS])
+			flags = nla_get_bitfield32(tb[TCA_ACT_FLAGS]).value;
 	} else {
 		if (strlcpy(act_name, name, IFNAMSIZ) >= IFNAMSIZ) {
 			NL_SET_ERR_MSG(extack, "TC action name too long");
@@ -915,10 +918,10 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 	/* backward compatibility for policer */
 	if (name == NULL)
 		err = a_o->init(net, tb[TCA_ACT_OPTIONS], est, &a, ovr, bind,
-				rtnl_held, tp, extack);
+				rtnl_held, tp, flags, extack);
 	else
 		err = a_o->init(net, nla, est, &a, ovr, bind, rtnl_held,
-				tp, extack);
+				tp, flags, extack);
 	if (err < 0)
 		goto err_mod;


WDYT?

Regards,
Vlad

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-25 14:26             ` Vlad Buslov
@ 2019-10-25 14:57               ` Jamal Hadi Salim
  2019-10-25 15:08                 ` Jamal Hadi Salim
  0 siblings, 1 reply; 64+ messages in thread
From: Jamal Hadi Salim @ 2019-10-25 14:57 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner, dcaratti,
	Eric Dumazet

On 2019-10-25 10:26 a.m., Vlad Buslov wrote:

> I've been trying to re-design this change to use high level TLV, but I
> don't understand how to make it backward compatible. If I just put new
> attribute before nested action attributes, it will fail to parse when
> older client send netlink packet without it and I don't see
> straightforward way to distinguish between the new attribute and
> following nested action attribute (which might accidentally have same
> length and fall into allowed attribute range). I don't have much
> experience working with netlink, so I might be missing something obvious
> here. However, extending existing action attributes (which are already
> conveniently parsed in tcf_action_init_1() function) with new optional
> 'flags' value seems like straightforward and backward compatible
> solution.
>

I think I understand what you are saying. Let me take a quick
look at the code and get back to you.

cheers,
jamal

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-25 14:57               ` Jamal Hadi Salim
@ 2019-10-25 15:08                 ` Jamal Hadi Salim
  2019-10-25 15:18                   ` Vlad Buslov
  0 siblings, 1 reply; 64+ messages in thread
From: Jamal Hadi Salim @ 2019-10-25 15:08 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner, dcaratti,
	Eric Dumazet

[-- Attachment #1: Type: text/plain, Size: 446 bytes --]

On 2019-10-25 10:57 a.m., Jamal Hadi Salim wrote:
> On 2019-10-25 10:26 a.m., Vlad Buslov wrote:

> 
> I think i understand what you are saying. Let me take a quick
> look at the code and get back to you.
>

How about something like this (not even compile tested)..

Yes, that init is getting uglier... over time if we
are going to move things into shared attributes we will
need to introduce a new data struct to pass to ->init()

cheers,
jamal

[-- Attachment #2: patchlet-v --]
[-- Type: text/plain, Size: 3234 bytes --]

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 69d4676a402f..b0d1e00fe2c2 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -842,6 +842,7 @@ static const struct nla_policy tcf_action_policy[TCA_ACT_MAX + 1] = {
 struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 				    struct nlattr *nla, struct nlattr *est,
 				    char *name, int ovr, int bind,
+				    struct nla_bitfield32 *root_flags,
 				    bool rtnl_held,
 				    struct netlink_ext_ack *extack)
 {
@@ -914,10 +915,10 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 	/* backward compatibility for policer */
 	if (name == NULL)
 		err = a_o->init(net, tb[TCA_ACT_OPTIONS], est, &a, ovr, bind,
-				rtnl_held, tp, extack);
+				rtnl_held, root_flags, tp, extack);
 	else
 		err = a_o->init(net, nla, est, &a, ovr, bind, rtnl_held,
-				tp, extack);
+				root_flags, tp, extack);
 	if (err < 0)
 		goto err_mod;
 
@@ -955,7 +956,8 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
 		    struct nlattr *est, char *name, int ovr, int bind,
 		    struct tc_action *actions[], size_t *attr_size,
-		    bool rtnl_held, struct netlink_ext_ack *extack)
+		    bool rtnl_held, struct nla_bitfield32 *root_flags,
+		    struct netlink_ext_ack *extack)
 {
 	struct nlattr *tb[TCA_ACT_MAX_PRIO + 1];
 	struct tc_action *act;
@@ -970,7 +972,7 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
 
 	for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) {
 		act = tcf_action_init_1(net, tp, tb[i], est, name, ovr, bind,
-					rtnl_held, extack);
+					root_flags, rtnl_held, extack);
 		if (IS_ERR(act)) {
 			err = PTR_ERR(act);
 			goto err;
@@ -1350,6 +1352,7 @@ tcf_add_notify(struct net *net, struct nlmsghdr *n, struct tc_action *actions[],
 
 static int tcf_action_add(struct net *net, struct nlattr *nla,
 			  struct nlmsghdr *n, u32 portid, int ovr,
+			  struct nla_bitfield32 *root_flags,
 			  struct netlink_ext_ack *extack)
 {
 	size_t attr_size = 0;
@@ -1358,7 +1361,8 @@ static int tcf_action_add(struct net *net, struct nlattr *nla,
 
 	for (loop = 0; loop < 10; loop++) {
 		ret = tcf_action_init(net, NULL, nla, NULL, NULL, ovr, 0,
-				      actions, &attr_size, true, extack);
+				      actions, &attr_size, true, root_flags,
+				      extack);
 		if (ret != -EAGAIN)
 			break;
 	}
@@ -1385,6 +1389,7 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
 	struct net *net = sock_net(skb->sk);
 	struct nlattr *tca[TCA_ROOT_MAX + 1];
 	u32 portid = skb ? NETLINK_CB(skb).portid : 0;
+	struct nla_bitfield32 root_flags = {0, 0};
 	int ret = 0, ovr = 0;
 
 	if ((n->nlmsg_type != RTM_GETACTION) &&
@@ -1412,8 +1417,11 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
 		 */
 		if (n->nlmsg_flags & NLM_F_REPLACE)
 			ovr = 1;
+		if (tca[TCA_ROOT_FLAGS])
+			root_flags = nla_get_bitfield32(tca[TCA_ROOT_FLAGS]);
+
 		ret = tcf_action_add(net, tca[TCA_ACT_TAB], n, portid, ovr,
-				     extack);
+				     &root_flags, extack);
 		break;
 	case RTM_DELACTION:
 		ret = tca_action_gd(net, tca[TCA_ACT_TAB], n,


* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-25 15:08                 ` Jamal Hadi Salim
@ 2019-10-25 15:18                   ` Vlad Buslov
  2019-10-25 15:43                     ` Jamal Hadi Salim
  0 siblings, 1 reply; 64+ messages in thread
From: Vlad Buslov @ 2019-10-25 15:18 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Vlad Buslov, Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner,
	dcaratti, Eric Dumazet


On Fri 25 Oct 2019 at 18:08, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 2019-10-25 10:57 a.m., Jamal Hadi Salim wrote:
>> On 2019-10-25 10:26 a.m., Vlad Buslov wrote:
>
>>
>> I think i understand what you are saying. Let me take a quick
>> look at the code and get back to you.
>>
>
> How about something like this (not even compile tested)..
>
> Yes, that init is getting uglier... over time if we
> are going to move things into shared attributes we will
> need to introduce a new data struct to pass to ->init()
>
> cheers,
> jamal

The problem with this approach is that it only works when actions are
created through act API, and not when they are created together with
filter by cls API which doesn't expect or parse TCA_ROOT. That is why I
wanted to have something in tcf_action_init_1() which is called by both
of them.


* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-25 15:18                   ` Vlad Buslov
@ 2019-10-25 15:43                     ` Jamal Hadi Salim
  2019-10-25 16:06                       ` Jamal Hadi Salim
  2019-10-25 16:08                       ` Vlad Buslov
  0 siblings, 2 replies; 64+ messages in thread
From: Jamal Hadi Salim @ 2019-10-25 15:43 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner, dcaratti,
	Eric Dumazet

On 2019-10-25 11:18 a.m., Vlad Buslov wrote:
> 

> The problem with this approach is that it only works when actions are
> created through act API, and not when they are created together with
> filter by cls API which doesn't expect or parse TCA_ROOT. That is why I
> wanted to have something in tcf_action_init_1() which is called by both
> of them.
> 

Aha. So the call path for tcf_action_init_1() via cls_api also needs
to have this infra. I think i understand better what you wanted
to do earlier with changing those enums.

This is fixable - we just need to have the classifier side also take a
new action root flags bitfield (I can send a sample). To your original
comment, it is ugly. But maybe "fixing it" is pushing the boundaries
and we should just go on and let your original approach in.
My only worry is, given this is uapi, if we go back to your original
idea we continue to propagate the bad design and we cant take it back
(all the tooling etc would be cast for the next 5 years even if we
did fix it in a month). Thoughts?

cheers,
jamal


* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-25 15:43                     ` Jamal Hadi Salim
@ 2019-10-25 16:06                       ` Jamal Hadi Salim
  2019-10-25 16:10                         ` Vlad Buslov
  2019-10-25 16:27                         ` Vlad Buslov
  2019-10-25 16:08                       ` Vlad Buslov
  1 sibling, 2 replies; 64+ messages in thread
From: Jamal Hadi Salim @ 2019-10-25 16:06 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner, dcaratti,
	Eric Dumazet

On 2019-10-25 11:43 a.m., Jamal Hadi Salim wrote:
> On 2019-10-25 11:18 a.m., Vlad Buslov wrote:
>>
> 
>> The problem with this approach is that it only works when actions are
>> created through act API, and not when they are created together with
>> filter by cls API which doesn't expect or parse TCA_ROOT. That is why I
>> wanted to have something in tcf_action_init_1() which is called by both
>> of them.
>>
> 
> Aha. So the call path for tcf_action_init_1() via cls_api also needs
> to have this infra. I think i understand better what you wanted
> to do earlier with changing those enums.
> 

Hold on. Looking more at the code, direct call for tcf_action_init_1()
from the cls code path is for backward compat of old policer approach.
I think even modern iproute2 doesnt support that kind of call
anymore. So you can pass NULL there for the *flags.

But: for direct call to tcf_action_init() we would have
to extract the flag from the TLV.
The TLV already has TCA_ROOT_FLAGS in it.


cheers,
jamal
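
To make that layout concrete, here is a minimal, self-contained sketch of
the act API message referred to above: TCA_ROOT_FLAGS (an existing
bitfield32 attribute in the uapi) sits at the root of an RTM_NEWACTION
request, next to the TCA_ACT_TAB nest that carries the individual
actions. The sketch assumes libmnl for attribute construction, and
DEMO_ROOT_FLAG is a made-up bit used purely for illustration -- it is not
a real flag value.

#include <stdio.h>
#include <libmnl/libmnl.h>
#include <linux/netlink.h>	/* struct nla_bitfield32 */
#include <linux/rtnetlink.h>	/* RTM_NEWACTION, struct tcamsg, TCA_ROOT_FLAGS, TCA_ACT_TAB */

#define DEMO_ROOT_FLAG	(1U << 1)	/* hypothetical flag bit, illustration only */

int main(void)
{
	char buf[8192];
	struct nlmsghdr *nlh = mnl_nlmsg_put_header(buf);
	struct nla_bitfield32 bf = { .value = DEMO_ROOT_FLAG,
				     .selector = DEMO_ROOT_FLAG };
	struct nlattr *tab;

	nlh->nlmsg_type = RTM_NEWACTION;
	mnl_nlmsg_put_extra_header(nlh, sizeof(struct tcamsg));

	/* root-level TLV space: flags sit next to the actions table */
	mnl_attr_put(nlh, TCA_ROOT_FLAGS, sizeof(bf), &bf);
	tab = mnl_attr_nest_start(nlh, TCA_ACT_TAB);
	/* per-priority action nests (1..TCA_ACT_MAX_PRIO) would go here */
	mnl_attr_nest_end(nlh, tab);

	printf("request built so far: %u bytes\n", nlh->nlmsg_len);
	return 0;
}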


* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-25 15:43                     ` Jamal Hadi Salim
  2019-10-25 16:06                       ` Jamal Hadi Salim
@ 2019-10-25 16:08                       ` Vlad Buslov
  1 sibling, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-25 16:08 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Vlad Buslov, Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner,
	dcaratti, Eric Dumazet


On Fri 25 Oct 2019 at 18:43, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 2019-10-25 11:18 a.m., Vlad Buslov wrote:
>>
>
>> The problem with this approach is that it only works when actions are
>> created through act API, and not when they are created together with
>> filter by cls API which doesn't expect or parse TCA_ROOT. That is why I
>> wanted to have something in tcf_action_init_1() which is called by both
>> of them.
>>
>
> Aha. So the call path for tcf_action_init_1() via cls_api also needs
> to have this infra. I think i understand better what you wanted
> to do earlier with changing those enums.
>
> This is fixable - we just need to have the classifier side also take a
> new action root flags bitfield (I can send a sample). To your original

Trying to add this infra to cls API leads back to my original question:
how do I do it in backward compatible manner? I assume that we can't
break users of RTM_NEWTFILTER.

> comment, it is ugly. But maybe "fixing it" is pushing the boundaries
> and we should just go on and let your original approach in.
> My only worry is, given this is uapi, if we go back to your original
> idea we continue to propagate the bad design and we cant take it back
> (all the tooling etc would be cast for the next 5 years even if we
> did fix it in a month). Thoughts?

I don't see anything particularly ugly with extending either
action-specific attributes or TCA_ACT_* attributes, so its hard for me
to reason about problems with current approach :)

>
> cheers,
> jamal


* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-25 16:06                       ` Jamal Hadi Salim
@ 2019-10-25 16:10                         ` Vlad Buslov
  2019-10-25 16:29                           ` Jamal Hadi Salim
  2019-10-25 16:27                         ` Vlad Buslov
  1 sibling, 1 reply; 64+ messages in thread
From: Vlad Buslov @ 2019-10-25 16:10 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Vlad Buslov, Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner,
	dcaratti, Eric Dumazet


On Fri 25 Oct 2019 at 19:06, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 2019-10-25 11:43 a.m., Jamal Hadi Salim wrote:
>> On 2019-10-25 11:18 a.m., Vlad Buslov wrote:
>>>
>>
>>> The problem with this approach is that it only works when actions are
>>> created through act API, and not when they are created together with
>>> filter by cls API which doesn't expect or parse TCA_ROOT. That is why I
>>> wanted to have something in tcf_action_init_1() which is called by both
>>> of them.
>>>
>>
>> Aha. So the call path for tcf_action_init_1() via cls_api also needs
>> to have this infra. I think i understand better what you wanted
>> to do earlier with changing those enums.
>>
>
> Hold on. Looking more at the code, direct call for tcf_action_init_1()
> from the cls code path is for backward compat of old policer approach.
> I think even modern iproute2 doesnt support that kind of call
> anymore. So you can pass NULL there for the *flags.

But having the FAST_INIT flag set when creating actions through cls API
is my main use case. Are you suggesting to only have flags when actions
created through act API?

>
> But: for direct call to tcf_action_init() we would have
> to extract the flag from the TLV.
> The TLV already has TCA_ROOT_FLAGS in it.
>
>
> cheers,
> jamal



* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-25 16:06                       ` Jamal Hadi Salim
  2019-10-25 16:10                         ` Vlad Buslov
@ 2019-10-25 16:27                         ` Vlad Buslov
  1 sibling, 0 replies; 64+ messages in thread
From: Vlad Buslov @ 2019-10-25 16:27 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Vlad Buslov, Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner,
	dcaratti, Eric Dumazet


On Fri 25 Oct 2019 at 19:06, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 2019-10-25 11:43 a.m., Jamal Hadi Salim wrote:
>> On 2019-10-25 11:18 a.m., Vlad Buslov wrote:
>>>
>>
>>> The problem with this approach is that it only works when actions are
>>> created through act API, and not when they are created together with
>>> filter by cls API which doesn't expect or parse TCA_ROOT. That is why I
>>> wanted to have something in tcf_action_init_1() which is called by both
>>> of them.
>>>
>>
>> Aha. So the call path for tcf_action_init_1() via cls_api also needs
>> to have this infra. I think i understand better what you wanted
>> to do earlier with changing those enums.
>>
>
> Hold on. Looking more at the code, direct call for tcf_action_init_1()
> from the cls code path is for backward compat of old policer approach.
> I think even modern iproute2 doesnt support that kind of call
> anymore. So you can pass NULL there for the *flags.
>
> But: for direct call to tcf_action_init() we would have
> to extract the flag from the TLV.
> The TLV already has TCA_ROOT_FLAGS in it.

After re-reading the code I think what you are suggesting is that I can
somehow parse TCA_ROOT_FLAGS in the following function:
int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
		    struct nlattr *est, char *name, int ovr, int bind,
		    struct tc_action *actions[], size_t *attr_size,
		    bool rtnl_held, struct netlink_ext_ack *extack)
{
	struct nlattr *tb[TCA_ACT_MAX_PRIO + 1];
	struct tc_action *act;
	size_t sz = 0;
	int err;
	int i;

->	err = nla_parse_nested_deprecated(tb, TCA_ACT_MAX_PRIO, nla, NULL,
					  extack);
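	/* Note: "nla" here is the nested attribute that holds the list of
	 * actions (tca[TCA_ACT_TAB] for the act API, or e.g. TCA_FLOWER_ACT
	 * for the cls API).  With a NULL policy, every nested attribute of
	 * type 1..TCA_ACT_MAX_PRIO is treated as one action by the loop
	 * below, so root-level attributes are never visible in tb[] here.
	 */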
	if (err < 0)
		return err;

	for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) {
		act = tcf_action_init_1(net, tp, tb[i], est, name, ovr, bind,
					rtnl_held, extack);
		if (IS_ERR(act)) {
			err = PTR_ERR(act);
			goto err;
		}
		act->order = i;
		sz += tcf_action_fill_size(act);
		/* Start from index 0 */
		actions[i - 1] = act;
	}

	*attr_size = tcf_action_full_attrs_size(sz);
	return i - 1;

err:
	tcf_action_destroy(actions, bind);
	return err;
}

Again, I'm not too familiar with netlink, but looking at that I assume
that the code parses up to TCA_ACT_MAX_PRIO nested TCA_ACT attributes
and passes them one-by-one to tcf_action_init_1(). Are you suggesting
that it also somehow includes TCA_ROOT?

Regards,
Vlad


* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-25 16:10                         ` Vlad Buslov
@ 2019-10-25 16:29                           ` Jamal Hadi Salim
  2019-10-25 16:53                             ` Vlad Buslov
  0 siblings, 1 reply; 64+ messages in thread
From: Jamal Hadi Salim @ 2019-10-25 16:29 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner, dcaratti,
	Eric Dumazet

On 2019-10-25 12:10 p.m., Vlad Buslov wrote:
> 

>> Hold on. Looking more at the code, direct call for tcf_action_init_1()
>> from the cls code path is for backward compat of old policer approach.
>> I think even modern iproute2 doesnt support that kind of call
>> anymore. So you can pass NULL there for the *flags.
> 
> But having the FAST_INIT flag set when creating actions through cls API
> is my main use case. Are you suggesting to only have flags when actions
> created through act API?
>

Not at all. Here's my thinking...

I didnt see your iproute2 change; however, in user space - i think all
the classifiers eventually call parse_action()? You can stick the flags
there under TCA_ACT_ROOT_FLAGS

In the kernel, tcf_exts_validate() - two spots:
tcf_action_init_1() pass NULL
tcf_action_init() before invocation extract the TCA_ACT_ROOT_FLAGS
(similar to the act_api approach).

Am i missing something? Sorry - dont have much cycles right now
but i could do a prototype later.

cheers,
jamal


* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-25 16:29                           ` Jamal Hadi Salim
@ 2019-10-25 16:53                             ` Vlad Buslov
  2019-10-25 18:17                               ` Jamal Hadi Salim
  0 siblings, 1 reply; 64+ messages in thread
From: Vlad Buslov @ 2019-10-25 16:53 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Vlad Buslov, Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner,
	dcaratti, Eric Dumazet


On Fri 25 Oct 2019 at 19:29, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 2019-10-25 12:10 p.m., Vlad Buslov wrote:
>>
>
>>> Hold on. Looking more at the code, direct call for tcf_action_init_1()
>>> from the cls code path is for backward compat of old policer approach.
>>> I think even modern iproute2 doesnt support that kind of call
>>> anymore. So you can pass NULL there for the *flags.
>>
>> But having the FAST_INIT flag set when creating actions through cls API
>> is my main use case. Are you suggesting to only have flags when actions
>> created through act API?
>>
>
> Not at all. Here's my thinking...
>
> I didnt see your iproute2 change; however, in user space - i think all
> the classifiers eventually call parse_action()? You can stick the flags
> there under TCA_ACT_ROOT_FLAGS
>
> In the kernel, tcf_exts_validate() - two spots:
> tcf_action_init_1() pass NULL
> tcf_action_init() before invocation extract the TCA_ACT_ROOT_FLAGS
> (similar to the act_api approach).
>
> Am i missing something? Sorry - dont have much cycles right now
> but i could do a prototype later.
>
> cheers,
> jamal

I don't exactly follow. In the case of the act API we have the following
call chain: tc_ctl_action()->tcf_action_add()->tcf_action_init(). In
this case TCA_ROOT is parsed in tc_ctl_action() and tcf_action_init() is
called with one of its nested attributes - tca[TCA_ACT_TAB]. When
tcf_action_init() is called TCA_ROOT is already "parsed out", but it is
easy to pass it as an argument.

For the cls API let's take flower as an example: fl_change() parses
TCA_FLOWER and calls fl_set_parms()->tcf_exts_validate()->tcf_action_init()
with the TCA_FLOWER_ACT nested attribute. No TCA_ROOT is expected;
TCA_FLOWER_ACT contains up to TCA_ACT_MAX_PRIO nested TCA_ACT attributes.
So where can I include it without breaking backward compatibility?

Regards,
Vlad
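
For comparison with the act API case, here is a minimal sketch of the
cls API nesting described in the message above. Only the action part of
the request is built -- a real RTM_NEWTFILTER message would also carry a
struct tcmsg, TCA_KIND "flower" and a TCA_OPTIONS nest around this.
libmnl is assumed for attribute construction, and TCA_ACT_FLAGS_DEMO /
DEMO_FLAG_FAST_INIT are made-up values standing in for whatever
per-action flags attribute is eventually agreed on; they are not uapi.

#include <stdio.h>
#include <libmnl/libmnl.h>
#include <linux/netlink.h>	/* struct nla_bitfield32 */
#include <linux/pkt_cls.h>	/* TCA_FLOWER_ACT, TCA_ACT_KIND, TCA_ACT_OPTIONS, TCA_ACT_COOKIE */

#define TCA_ACT_FLAGS_DEMO	(TCA_ACT_COOKIE + 1)	/* hypothetical attribute id */
#define DEMO_FLAG_FAST_INIT	(1U << 0)		/* hypothetical flag bit */

int main(void)
{
	char buf[8192];
	struct nlmsghdr *nlh = mnl_nlmsg_put_header(buf);
	struct nla_bitfield32 bf = { .value = DEMO_FLAG_FAST_INIT,
				     .selector = DEMO_FLAG_FAST_INIT };
	struct nlattr *acts, *prio1, *opts;

	acts = mnl_attr_nest_start(nlh, TCA_FLOWER_ACT);
	prio1 = mnl_attr_nest_start(nlh, 1);	/* first action slot (prio 1) */
	mnl_attr_put_strz(nlh, TCA_ACT_KIND, "gact");
	/* a per-action flags attribute would sit here, next to KIND/OPTIONS/COOKIE */
	mnl_attr_put(nlh, TCA_ACT_FLAGS_DEMO, sizeof(bf), &bf);
	opts = mnl_attr_nest_start(nlh, TCA_ACT_OPTIONS);
	/* gact parameters (struct tc_gact under TCA_GACT_PARMS) would go here */
	mnl_attr_nest_end(nlh, opts);
	mnl_attr_nest_end(nlh, prio1);
	mnl_attr_nest_end(nlh, acts);

	printf("nlmsghdr + nested action attributes: %u bytes\n", nlh->nlmsg_len);
	return 0;
}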


* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-25 16:53                             ` Vlad Buslov
@ 2019-10-25 18:17                               ` Jamal Hadi Salim
  2019-10-25 21:05                                 ` Jamal Hadi Salim
  0 siblings, 1 reply; 64+ messages in thread
From: Jamal Hadi Salim @ 2019-10-25 18:17 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner, dcaratti,
	Eric Dumazet

On 2019-10-25 12:53 p.m., Vlad Buslov wrote:
[..]

Sorry, the distractions here are not helping.
When i responded i had started to do below

enum {
         TCA_ACT_UNSPEC,
         TCA_ACT_KIND,
         TCA_ACT_OPTIONS,
         TCA_ACT_INDEX,
         TCA_ACT_STATS,
         TCA_ACT_PAD,
         TCA_ACT_COOKIE,
         TCA_ACT_ROOT_FLAGS,
         __TCA_ACT_MAX
};

Note: "TCA_ACT_ROOT_FLAGS"
I think your other email may have tried to do the same?
So i claimed it was there in that email because grep showed it
but that's because i had added it;->

> For cls API lets take flower as an example: fl_change() parses TCA_FLOWER, and calls
> fl_set_parms()->tcf_exts_validate()->tcf_action_init() with
> TCA_FLOWER_ACT nested attribute. No TCA_ROOT is expected, TCA_FLOWER_ACT
> contains up to TCA_ACT_MAX_PRIO nested TCA_ACT attributes. So where can
> I include it without breaking backward compatibility?
> 
Not a clean solution, but it avoids the per-action TLV attribute:

In user space parse_action() you add TCA_ACT_ROOT_FLAGS
if the user specifies this on the command line.

In the kernel everything just passes NULL for root_flags
when the invocation is from tcf_exts_validate(), i.e. both
tcf_action_init_1() and
tcf_action_init().

In tcf_action_init_1(), if root_flags is NULL
you try to get the flags from tb[TCA_ACT_ROOT_FLAGS].
Apologies in advance for any latency - I will have time tomorrow.

cheers,
jamal


* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-25 18:17                               ` Jamal Hadi Salim
@ 2019-10-25 21:05                                 ` Jamal Hadi Salim
  2019-10-25 21:10                                   ` Jamal Hadi Salim
  0 siblings, 1 reply; 64+ messages in thread
From: Jamal Hadi Salim @ 2019-10-25 21:05 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner, dcaratti,
	Eric Dumazet

[-- Attachment #1: Type: text/plain, Size: 1653 bytes --]


Like attached...

cheers,
jamal

On 2019-10-25 2:17 p.m., Jamal Hadi Salim wrote:
> On 2019-10-25 12:53 p.m., Vlad Buslov wrote:
> [..]
> 
> Sorry, the distractions here are not helping.
> When i responded i had started to do below
> 
> enum {
>          TCA_ACT_UNSPEC,
>          TCA_ACT_KIND,
>          TCA_ACT_OPTIONS,
>          TCA_ACT_INDEX,
>          TCA_ACT_STATS,
>          TCA_ACT_PAD,
>          TCA_ACT_COOKIE,
>          TCA_ACT_ROOT_FLAGS,
>          __TCA_ACT_MAX
> };
> 
> Note: "TCA_ACT_ROOT_FLAGS"
> I think your other email may have tried to do the same?
> So i claimed it was there in that email because grep showed it
> but that's because i had added it;->
> 
>> For cls API lets take flower as an example: fl_change() parses 
>> TCA_FLOWER, and calls
>> fl_set_parms()->tcf_exts_validate()->tcf_action_init() with
>> TCA_FLOWER_ACT nested attribute. No TCA_ROOT is expected, TCA_FLOWER_ACT
>> contains up to TCA_ACT_MAX_PRIO nested TCA_ACT attributes. So where can
>> I include it without breaking backward compatibility?
>>
> Not a clean solution but avoids the per action TLV attribute:
> 
> in user space parse_action() you add TCA_ACT_ROOT_FLAGS
> if the user specifies this on the command line.
> 
> in the kernel everything just passes NULL for root_flags
> when invocation is from tcf_exts_validate() i.e both
> tcf_action_init_1() and
> tcf_action_init()
> 
> In tcf_action_init_1(), if root_flags is NULL
> you try to get flags from from tb[TCA_ACT_ROOT_FLAGS]
> 
> Apologies in advance for any latency - I will have time tomorrow.
> 
> cheers,
> jamal


[-- Attachment #2: patchlet-v2 --]
[-- Type: text/plain, Size: 6398 bytes --]

diff --git a/include/net/act_api.h b/include/net/act_api.h
index b18c699681ca..ec52b28ddfc5 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -93,7 +93,8 @@ struct tc_action_ops {
 	int     (*lookup)(struct net *net, struct tc_action **a, u32 index);
 	int     (*init)(struct net *net, struct nlattr *nla,
 			struct nlattr *est, struct tc_action **act, int ovr,
-			int bind, bool rtnl_held, struct tcf_proto *tp,
+			int bind, bool rtnl_held, struct nla_bitfield32 *rf,
+			struct tcf_proto *tp,
 			struct netlink_ext_ack *extack);
 	int     (*walk)(struct net *, struct sk_buff *,
 			struct netlink_callback *, int,
@@ -176,10 +177,12 @@ int tcf_action_exec(struct sk_buff *skb, struct tc_action **actions,
 int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
 		    struct nlattr *est, char *name, int ovr, int bind,
 		    struct tc_action *actions[], size_t *attr_size,
-		    bool rtnl_held, struct netlink_ext_ack *extack);
+		    bool rtnl_held, struct nla_bitfield32 *root_flags,
+		    struct netlink_ext_ack *extack);
 struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 				    struct nlattr *nla, struct nlattr *est,
 				    char *name, int ovr, int bind,
+				    struct nla_bitfield32 *root_flags,
 				    bool rtnl_held,
 				    struct netlink_ext_ack *extack);
 int tcf_action_dump(struct sk_buff *skb, struct tc_action *actions[], int bind,
diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index a6aa466fac9e..d92f3d0f2c79 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -16,6 +16,7 @@ enum {
 	TCA_ACT_STATS,
 	TCA_ACT_PAD,
 	TCA_ACT_COOKIE,
+	TCA_ACT_ROOT_FLAGS,
 	__TCA_ACT_MAX
 };
 
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 69d4676a402f..ef7b0bb735c7 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -842,6 +842,7 @@ static const struct nla_policy tcf_action_policy[TCA_ACT_MAX + 1] = {
 struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 				    struct nlattr *nla, struct nlattr *est,
 				    char *name, int ovr, int bind,
+				    struct nla_bitfield32 *root_flags,
 				    bool rtnl_held,
 				    struct netlink_ext_ack *extack)
 {
@@ -852,6 +853,7 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 	struct nlattr *tb[TCA_ACT_MAX + 1];
 	struct nlattr *kind;
 	int err;
+	struct nla_bitfield32 rf = {0, 0};
 
 	if (name == NULL) {
 		err = nla_parse_nested_deprecated(tb, TCA_ACT_MAX, nla,
@@ -884,6 +886,10 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 		}
 	}
 
+	if (!root_flags && tb[TCA_ACT_ROOT_FLAGS]) {
+		rf = nla_get_bitfield32(tb[TCA_ACT_ROOT_FLAGS]);
+		root_flags = &rf;
+	}
 	a_o = tc_lookup_action_n(act_name);
 	if (a_o == NULL) {
 #ifdef CONFIG_MODULES
@@ -914,10 +920,10 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 	/* backward compatibility for policer */
 	if (name == NULL)
 		err = a_o->init(net, tb[TCA_ACT_OPTIONS], est, &a, ovr, bind,
-				rtnl_held, tp, extack);
+				rtnl_held, root_flags, tp, extack);
 	else
 		err = a_o->init(net, nla, est, &a, ovr, bind, rtnl_held,
-				tp, extack);
+				root_flags, tp, extack);
 	if (err < 0)
 		goto err_mod;
 
@@ -955,7 +961,8 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
 		    struct nlattr *est, char *name, int ovr, int bind,
 		    struct tc_action *actions[], size_t *attr_size,
-		    bool rtnl_held, struct netlink_ext_ack *extack)
+		    bool rtnl_held, struct nla_bitfield32 *root_flags,
+		    struct netlink_ext_ack *extack)
 {
 	struct nlattr *tb[TCA_ACT_MAX_PRIO + 1];
 	struct tc_action *act;
@@ -970,7 +977,7 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
 
 	for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) {
 		act = tcf_action_init_1(net, tp, tb[i], est, name, ovr, bind,
-					rtnl_held, extack);
+					root_flags, rtnl_held, extack);
 		if (IS_ERR(act)) {
 			err = PTR_ERR(act);
 			goto err;
@@ -1350,6 +1357,7 @@ tcf_add_notify(struct net *net, struct nlmsghdr *n, struct tc_action *actions[],
 
 static int tcf_action_add(struct net *net, struct nlattr *nla,
 			  struct nlmsghdr *n, u32 portid, int ovr,
+			  struct nla_bitfield32 *root_flags,
 			  struct netlink_ext_ack *extack)
 {
 	size_t attr_size = 0;
@@ -1358,7 +1366,8 @@ static int tcf_action_add(struct net *net, struct nlattr *nla,
 
 	for (loop = 0; loop < 10; loop++) {
 		ret = tcf_action_init(net, NULL, nla, NULL, NULL, ovr, 0,
-				      actions, &attr_size, true, extack);
+				      actions, &attr_size, true, root_flags,
+				      extack);
 		if (ret != -EAGAIN)
 			break;
 	}
@@ -1385,6 +1394,7 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
 	struct net *net = sock_net(skb->sk);
 	struct nlattr *tca[TCA_ROOT_MAX + 1];
 	u32 portid = skb ? NETLINK_CB(skb).portid : 0;
+	struct nla_bitfield32 root_flags = {0, 0};
 	int ret = 0, ovr = 0;
 
 	if ((n->nlmsg_type != RTM_GETACTION) &&
@@ -1412,8 +1422,11 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
 		 */
 		if (n->nlmsg_flags & NLM_F_REPLACE)
 			ovr = 1;
+		if (tca[TCA_ROOT_FLAGS])
+			root_flags = nla_get_bitfield32(tca[TCA_ROOT_FLAGS]);
+
 		ret = tcf_action_add(net, tca[TCA_ACT_TAB], n, portid, ovr,
-				     extack);
+				     &root_flags, extack);
 		break;
 	case RTM_DELACTION:
 		ret = tca_action_gd(net, tca[TCA_ACT_TAB], n,
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 8717c0b26c90..9dc5bf0d637e 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -2946,7 +2946,7 @@ int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb,
 			act = tcf_action_init_1(net, tp, tb[exts->police],
 						rate_tlv, "police", ovr,
 						TCA_ACT_BIND, rtnl_held,
-						extack);
+						NULL, extack);
 			if (IS_ERR(act))
 				return PTR_ERR(act);
 
@@ -2959,7 +2959,7 @@ int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb,
 			err = tcf_action_init(net, tp, tb[exts->action],
 					      rate_tlv, NULL, ovr, TCA_ACT_BIND,
 					      exts->actions, &attr_size,
-					      rtnl_held, extack);
+					      rtnl_held, NULL, extack);
 			if (err < 0)
 				return err;
 			exts->nr_actions = err;


* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-25 21:05                                 ` Jamal Hadi Salim
@ 2019-10-25 21:10                                   ` Jamal Hadi Salim
  2019-10-25 21:52                                     ` Jamal Hadi Salim
  0 siblings, 1 reply; 64+ messages in thread
From: Jamal Hadi Salim @ 2019-10-25 21:10 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner, dcaratti,
	Eric Dumazet

On 2019-10-25 5:05 p.m., Jamal Hadi Salim wrote:
> +	if (!root_flags && tb[TCA_ACT_ROOT_FLAGS]) {
> +		rf = nla_get_bitfield32(tb[TCA_ACT_ROOT_FLAGS]);
> +		root_flags = &rf;
> +	}



!root_flags check doesnt look right.
Hopefully it makes more sense now....

cheers,
jamal


* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-25 21:10                                   ` Jamal Hadi Salim
@ 2019-10-25 21:52                                     ` Jamal Hadi Salim
  2019-10-26  9:44                                       ` Vlad Buslov
  0 siblings, 1 reply; 64+ messages in thread
From: Jamal Hadi Salim @ 2019-10-25 21:52 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner, dcaratti,
	Eric Dumazet

On 2019-10-25 5:10 p.m., Jamal Hadi Salim wrote:
> On 2019-10-25 5:05 p.m., Jamal Hadi Salim wrote:
>> +    if (!root_flags && tb[TCA_ACT_ROOT_FLAGS]) {
>> +        rf = nla_get_bitfield32(tb[TCA_ACT_ROOT_FLAGS]);
>> +        root_flags = &rf;
>> +    }
> 
> 
> 
> !root_flags check doesnt look right.
> Hopefully it makes more sense now....
> 

For completion:
It compiled ;->


cheers,
jamal


* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-25 21:52                                     ` Jamal Hadi Salim
@ 2019-10-26  9:44                                       ` Vlad Buslov
  2019-10-26 12:26                                         ` Jamal Hadi Salim
  0 siblings, 1 reply; 64+ messages in thread
From: Vlad Buslov @ 2019-10-26  9:44 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Vlad Buslov, Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner,
	dcaratti, Eric Dumazet


On Sat 26 Oct 2019 at 00:52, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 2019-10-25 5:10 p.m., Jamal Hadi Salim wrote:
>> On 2019-10-25 5:05 p.m., Jamal Hadi Salim wrote:
>>> +    if (!root_flags && tb[TCA_ACT_ROOT_FLAGS]) {
>>> +        rf = nla_get_bitfield32(tb[TCA_ACT_ROOT_FLAGS]);
>>> +        root_flags = &rf;
>>> +    }
>>
>>
>>
>> !root_flags check doesnt look right.
>> Hopefully it makes more sense now....
>>
>
> For completion:
> It compiled ;->
>
>
> cheers,
> jamal

Okay, I understand now what you suggest. But why not unify cls and act
API, and always have flags parsed in tcf_action_init_1() as
TCA_ACT_ROOT_FLAGS like I suggested in one of my previous mails? That
way we don't have to pass pointers around.

Regards,
Vlad


* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-26  9:44                                       ` Vlad Buslov
@ 2019-10-26 12:26                                         ` Jamal Hadi Salim
  2019-10-26 14:52                                           ` Roman Mashak
  0 siblings, 1 reply; 64+ messages in thread
From: Jamal Hadi Salim @ 2019-10-26 12:26 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner, dcaratti,
	Eric Dumazet

On 2019-10-26 5:44 a.m., Vlad Buslov wrote:
> 

> Okay, I understand now what you suggest. But why not unify cls and act
> API, and always have flags parsed in tcf_action_init_1() as
> TCA_ACT_ROOT_FLAGS like I suggested in one of my previous mails? That
> way we don't have to pass pointers around.

That would work.
I am being a sucker for optimization - one flag for a batch
of actions vs one per action.
It is a good compromise, go for it.

cheers,
jamal


* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-26 12:26                                         ` Jamal Hadi Salim
@ 2019-10-26 14:52                                           ` Roman Mashak
  2019-10-26 16:06                                             ` Jamal Hadi Salim
  0 siblings, 1 reply; 64+ messages in thread
From: Roman Mashak @ 2019-10-26 14:52 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Vlad Buslov, Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner,
	dcaratti, Eric Dumazet

Jamal Hadi Salim <jhs@mojatatu.com> writes:

> On 2019-10-26 5:44 a.m., Vlad Buslov wrote:
>>
>
>> Okay, I understand now what you suggest. But why not unify cls and act
>> API, and always have flags parsed in tcf_action_init_1() as
>> TCA_ACT_ROOT_FLAGS like I suggested in one of my previous mails? That
>> way we don't have to pass pointers around.
>
> That would work.
> I am being a sucker for optimization - one flag for a batch
> of actions vs one per action.
> It is a good compromise, go for it.

But why do we need to have two attributes, one at the root level
(TCA_ROOT_FLAGS) and the other at the inner TCA_ACT_* level, when they
in fact serve the same purpose -- passing flags for optimizations?

The whole nest of action attributes, including the root ones, is passed
as the 3rd argument of tcf_exts_validate(), so it can be validated and
extracted at that level and passed to tcf_action_init_1() as a pointer
to a 32-bit flag; admittedly it's ugly given the growing number of
arguments to tcf_action_init_1(). With old iproute2 the pointer will
always be NULL, so I think backward compatibility will be preserved.


* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-26 14:52                                           ` Roman Mashak
@ 2019-10-26 16:06                                             ` Jamal Hadi Salim
  2019-10-26 16:42                                               ` Vlad Buslov
  0 siblings, 1 reply; 64+ messages in thread
From: Jamal Hadi Salim @ 2019-10-26 16:06 UTC (permalink / raw)
  To: Roman Mashak
  Cc: Vlad Buslov, Jiri Pirko, netdev, xiyou.wangcong, davem, mleitner,
	dcaratti, Eric Dumazet

On 2019-10-26 10:52 a.m., Roman Mashak wrote:
[..]
> 
> But why do we need to have two attributes, one at the root level
> TCA_ROOT_FLAGS and the other at the inner TCA_ACT_* level, but in fact
> serving the same purpose -- passing flags for optimizations?
> 
 >
> The whole nest of action attributes including root ones is passed as 3rd
> argument of tcf_exts_validate(), so it can be validated and extracted at
> that level and passed to tcf_action_init_1() as pointer to 32-bit flag,
> admittedly it's ugly given the growing number of arguments to
> tcf_action_init_1(). With old iproute2 the pointer will always be NULL,
> so I think backward compatibilty will be preserved.

Note: we only call tcf_action_init_1() at that level for very
old policer api for backward compatibility reasons. I think what
would make sense is to be able to call tcf_action_init()(the else
statement in tcf_exts_validate()) from that level with a global flag
but for that we would need to introduce something like TCA_ROOT_FLAGS
under this space:
---
enum {
         TCA_UNSPEC,
         TCA_KIND,
         TCA_OPTIONS,
         TCA_STATS,
         TCA_XSTATS,
         TCA_RATE,
         TCA_FCNT,
         TCA_STATS2,
         TCA_STAB,
         TCA_PAD,
         TCA_DUMP_INVISIBLE,
         TCA_CHAIN,
         TCA_HW_OFFLOAD,
         TCA_INGRESS_BLOCK,
         TCA_EGRESS_BLOCK,
         __TCA_MAX
};
---

which would be a cleaner solution but would require
_a lot more code_ both in user/kernel.
Thats why i feel Vlad's suggestion is a reasonable compromise
because it gets rid of the original issue of per-specific-action
TLVs.

On optimization:
The current suggestion from Vlad is a bit inefficient; for
example, if I was trying to batch 100 actions I now have 1200
bytes of overhead instead of 12 bytes.

cheers,
jamal
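
A quick back-of-envelope check of those numbers (a sketch, assuming the
per-action flag is carried as a bitfield32 attribute, as in the patches
above): one such attribute costs a 4-byte nlattr header plus an 8-byte
struct nla_bitfield32 payload, i.e. 12 bytes, so a batch of 100 actions
pays 1200 bytes versus 12 bytes for a single root-level attribute.

#include <stdio.h>
#include <linux/netlink.h>	/* NLA_HDRLEN, struct nla_bitfield32 */

int main(void)
{
	size_t per_attr = NLA_HDRLEN + sizeof(struct nla_bitfield32);

	printf("one bitfield32 flags attribute : %zu bytes\n", per_attr);
	printf("100 per-action attributes      : %zu bytes\n", 100 * per_attr);
	return 0;
}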


* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-26 16:06                                             ` Jamal Hadi Salim
@ 2019-10-26 16:42                                               ` Vlad Buslov
  2019-10-26 18:38                                                 ` Jamal Hadi Salim
  0 siblings, 1 reply; 64+ messages in thread
From: Vlad Buslov @ 2019-10-26 16:42 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Roman Mashak, Vlad Buslov, Jiri Pirko, netdev, xiyou.wangcong,
	davem, mleitner, dcaratti, Eric Dumazet


On Sat 26 Oct 2019 at 19:06, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 2019-10-26 10:52 a.m., Roman Mashak wrote:
> [..]
>>
>> But why do we need to have two attributes, one at the root level
>> TCA_ROOT_FLAGS and the other at the inner TCA_ACT_* level, but in fact
>> serving the same purpose -- passing flags for optimizations?
>>
>>
>> The whole nest of action attributes including root ones is passed as 3rd
>> argument of tcf_exts_validate(), so it can be validated and extracted at
>> that level and passed to tcf_action_init_1() as pointer to 32-bit flag,
>> admittedly it's ugly given the growing number of arguments to
>> tcf_action_init_1(). With old iproute2 the pointer will always be NULL,
>> so I think backward compatibilty will be preserved.
>
> Note: we only call tcf_action_init_1() at that level for very
> old policer api for backward compatibility reasons. I think what
> would make sense is to be able to call tcf_action_init()(the else
> statement in tcf_exts_validate()) from that level with a global flag
> but for that we would need to introduce something like TCA_ROOT_FLAGS
> under this space:
> ---
> enum {
>         TCA_UNSPEC,
>         TCA_KIND,
>         TCA_OPTIONS,
>         TCA_STATS,
>         TCA_XSTATS,
>         TCA_RATE,
>         TCA_FCNT,
>         TCA_STATS2,
>         TCA_STAB,
>         TCA_PAD,
>         TCA_DUMP_INVISIBLE,
>         TCA_CHAIN,
>         TCA_HW_OFFLOAD,
>         TCA_INGRESS_BLOCK,
>         TCA_EGRESS_BLOCK,
>         __TCA_MAX
> };
> ---
>
> which would be a cleaner solution but would require
> _a lot more code_ both in user/kernel.
> Thats why i feel Vlad's suggestion is a reasonable compromise
> because it gets rid of the original issue of per-specific-action
> TLVs.
>
> On optimization:
> The current suggestion from Vlad is a bit inefficient,
> example, if was trying to batch 100 actions i now have 1200
> bytes of overhead instead of 12 bytes.
>
> cheers,
> jamal

Hmm, yes, this looks quite redundant. At the same time, having basically
the same flags in two different spaces is ugly. Maybe the correct
approach would be not to add the act API part at all and only extend
tcf_action_init_1() to be used from the cls API? I don't see a use for
the act API flag at the moment because the only flag value (skip percpu
allocation) is hardware-offload specific and such clients generally
create actions through the cls new-filter API. WDYT?

Regards,
Vlad


* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-26 16:42                                               ` Vlad Buslov
@ 2019-10-26 18:38                                                 ` Jamal Hadi Salim
  2019-10-27  8:31                                                   ` Vlad Buslov
  0 siblings, 1 reply; 64+ messages in thread
From: Jamal Hadi Salim @ 2019-10-26 18:38 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Roman Mashak, Jiri Pirko, netdev, xiyou.wangcong, davem,
	mleitner, dcaratti, Eric Dumazet

On 2019-10-26 12:42 p.m., Vlad Buslov wrote:
> 
> On Sat 26 Oct 2019 at 19:06, Jamal Hadi Salim <jhs@mojatatu.com> wrote:

> Hmm, yes, this looks quite redundant.

It's legit.
Two different code paths for two different objects
that can be configured independently or together.
Only other thing that does this is FIB/NH path.

> At the same time having basically
> same flags in two different spaces is ugly. Maybe correct approach would
> be not to add act API at all and only extend tcf_action_init_1() to be
> used from cls API? I don't see a use for act API at the moment because
> the only flag value (skip percpu allocation) is hardware offloads
> specific and such clients generally create action through cls new filter
> API. WDYT?
> 

I am not sure if there is a gain. Your code path is
tcf_exts_validate()->tcf_action_init()->tcf_action_init_1()

Do you want to replicate the tcf_action_init() in
cls?

cheers,
jamal


* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-26 18:38                                                 ` Jamal Hadi Salim
@ 2019-10-27  8:31                                                   ` Vlad Buslov
  2019-10-27  9:13                                                     ` Jamal Hadi Salim
  0 siblings, 1 reply; 64+ messages in thread
From: Vlad Buslov @ 2019-10-27  8:31 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Vlad Buslov, Roman Mashak, Jiri Pirko, netdev, xiyou.wangcong,
	davem, mleitner, dcaratti, Eric Dumazet


On Sat 26 Oct 2019 at 21:38, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 2019-10-26 12:42 p.m., Vlad Buslov wrote:
>>
>> On Sat 26 Oct 2019 at 19:06, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>
>> Hmm, yes, this looks quite redundant.
>
> It's legit.
> Two different code paths for two different objects
> that can be configured independently or together.
> Only other thing that does this is FIB/NH path.
>
>> At the same time having basically
>> same flags in two different spaces is ugly. Maybe correct approach would
>> be not to add act API at all and only extend tcf_action_init_1() to be
>> used from cls API? I don't see a use for act API at the moment because
>> the only flag value (skip percpu allocation) is hardware offloads
>> specific and such clients generally create action through cls new filter
>> API. WDYT?
>>
>
> I am not sure if there is a gain. Your code path is
> tcf_exts_validate()->tcf_action_init()->tcf_action_init_1()
>
> Do you want to replicate the tcf_action_init() in
> cls?
>
> cheers,
> jamal

No, I meant that we don't need to change iproute2 to always send the new
TCA_ACT_ROOT_FLAGS. They can be set conditionally (only when the user
sets the flag) or only when creating filters with actions attached (cls
API). Such an approach wouldn't introduce any additional overhead to
netlink packets when batch-creating actions.

Regards,
Vlad


* Re: [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag
  2019-10-27  8:31                                                   ` Vlad Buslov
@ 2019-10-27  9:13                                                     ` Jamal Hadi Salim
  0 siblings, 0 replies; 64+ messages in thread
From: Jamal Hadi Salim @ 2019-10-27  9:13 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Roman Mashak, Jiri Pirko, netdev, xiyou.wangcong, davem,
	mleitner, dcaratti, Eric Dumazet

On 2019-10-27 4:31 a.m., Vlad Buslov wrote:

> 
> No, I meant that we don't need to change iproute2 to always send new
> TCA_ACT_ROOT_FLAGS. They can be set conditionally (only when user set
> the flag) or only when creating filters with actions attached (cls API).
> Such approach wouldn't introduce any additional overhead to netlink
> packets when batch creating actions.
> 
iproute2 will only send it conditionally.
parse_actions() in iproute2 will only set this field if the user
explicitly sets it on the command line for that action - whether the
action is specified as standalone (or in a batch) or bound to a filter
should make no difference. No?

The overhead I was talking about was the worst-case scenario.

cheers,
jamal


Thread overview: 64+ messages
2019-10-22 14:17 [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Vlad Buslov
2019-10-22 14:17 ` [PATCH net-next 01/13] net: sched: extract common action counters update code into function Vlad Buslov
2019-10-22 14:17 ` [PATCH net-next 02/13] net: sched: extract bstats " Vlad Buslov
2019-10-22 14:17 ` [PATCH net-next 03/13] net: sched: extract qstats update code into functions Vlad Buslov
2019-10-22 14:17 ` [PATCH net-next 04/13] net: sched: don't expose action qstats to skb_tc_reinsert() Vlad Buslov
2019-10-22 14:17 ` [PATCH net-next 05/13] net: sched: modify stats helper functions to support regular stats Vlad Buslov
2019-10-22 14:17 ` [PATCH net-next 06/13] net: sched: add action fast initialization flag Vlad Buslov
2019-10-22 14:17 ` [PATCH net-next 07/13] net: sched: act_gact: support fast init flag to skip pcpu allocation Vlad Buslov
2019-10-22 14:17 ` [PATCH net-next 08/13] net: sched: act_csum: " Vlad Buslov
2019-10-22 14:18 ` [PATCH net-next 09/13] net: sched: act_mirred: support fast init flag to skip pcpu alloc Vlad Buslov
2019-10-22 14:18 ` [PATCH net-next 10/13] net: sched: tunnel_key: " Vlad Buslov
2019-10-22 14:18 ` [PATCH net-next 11/13] net: sched: act_vlan: support fast init flag to skip pcpu allocation Vlad Buslov
2019-10-22 14:18 ` [PATCH net-next 12/13] net: sched: act_ct: " Vlad Buslov
2019-10-22 14:18 ` [PATCH net-next 13/13] tc-testing: implement tests for new fast_init action flag Vlad Buslov
2019-10-22 14:28 ` [PATCH iproute2-next] tc: implement support for action flags Vlad Buslov
2019-10-22 14:35 ` [PATCH net-next 00/13] Control action percpu counters allocation by netlink flag Marcelo Ricardo Leitner
2019-10-22 14:52   ` Vlad Buslov
2019-10-22 18:17     ` Roman Mashak
2019-10-23  6:38       ` Vlad Buslov
2019-10-23 13:02         ` Jamal Hadi Salim
2019-10-23 13:08           ` Vlad Buslov
2019-10-22 15:15 ` Marcelo Ricardo Leitner
2019-10-22 15:52   ` Vlad Buslov
2019-10-22 17:09     ` Marcelo Ricardo Leitner
2019-10-23  6:42       ` Vlad Buslov
2019-10-24 15:12       ` Roopa Prabhu
2019-10-24 15:18         ` Vlad Buslov
2019-10-24 17:31           ` Marcelo Ricardo Leitner
2019-10-24 18:03             ` Vlad Buslov
2019-10-23 12:49 ` Jamal Hadi Salim
2019-10-23 13:04   ` Vlad Buslov
2019-10-23 14:21     ` Jamal Hadi Salim
2019-10-23 15:04       ` Vlad Buslov
2019-10-24  7:35       ` Jiri Pirko
2019-10-24 15:23         ` Vlad Buslov
2019-10-24 16:12           ` Jamal Hadi Salim
2019-10-24 16:44             ` Vlad Buslov
2019-10-24 17:17               ` Jamal Hadi Salim
2019-10-24 18:05                 ` Vlad Buslov
2019-10-25 11:46                   ` Jamal Hadi Salim
2019-10-25 11:55                     ` Vlad Buslov
2019-10-25 14:26             ` Vlad Buslov
2019-10-25 14:57               ` Jamal Hadi Salim
2019-10-25 15:08                 ` Jamal Hadi Salim
2019-10-25 15:18                   ` Vlad Buslov
2019-10-25 15:43                     ` Jamal Hadi Salim
2019-10-25 16:06                       ` Jamal Hadi Salim
2019-10-25 16:10                         ` Vlad Buslov
2019-10-25 16:29                           ` Jamal Hadi Salim
2019-10-25 16:53                             ` Vlad Buslov
2019-10-25 18:17                               ` Jamal Hadi Salim
2019-10-25 21:05                                 ` Jamal Hadi Salim
2019-10-25 21:10                                   ` Jamal Hadi Salim
2019-10-25 21:52                                     ` Jamal Hadi Salim
2019-10-26  9:44                                       ` Vlad Buslov
2019-10-26 12:26                                         ` Jamal Hadi Salim
2019-10-26 14:52                                           ` Roman Mashak
2019-10-26 16:06                                             ` Jamal Hadi Salim
2019-10-26 16:42                                               ` Vlad Buslov
2019-10-26 18:38                                                 ` Jamal Hadi Salim
2019-10-27  8:31                                                   ` Vlad Buslov
2019-10-27  9:13                                                     ` Jamal Hadi Salim
2019-10-25 16:27                         ` Vlad Buslov
2019-10-25 16:08                       ` Vlad Buslov
