* [PATCH net-next v3 0/2] net/sched: remove spinlock from 'csum' action
@ 2018-01-22 17:14 Davide Caratti
2018-01-22 17:14 ` [PATCH net-next v3 1/2] net/sched: act_csum: use per-core statistics Davide Caratti
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Davide Caratti @ 2018-01-22 17:14 UTC (permalink / raw)
To: Cong Wang, David S. Miller, netdev; +Cc: Jiri Pirko
Similarly to what was done earlier for other actions [1][2], this series
improves the performance of the 'csum' tc action by removing a spinlock
from the data path. Patch 1 lets act_csum use per-CPU counters; patch 2
removes the spin_{,un}lock_bh() calls from the act() method.
test procedure (using pktgen from https://github.com/netoptimizer):
# ip link add name eth1 type dummy
# ip link set dev eth1 up
# tc qdisc add dev eth1 root handle 1: prio
# for a in pass drop; do
> tc filter del dev eth1 parent 1: pref 10 matchall action csum udp
> tc filter add dev eth1 parent 1: pref 10 matchall action csum udp $a
> for n in 2 4; do
> ./pktgen_bench_xmit_mode_queue_xmit.sh -v -s 64 -t $n -n 1000000 -i eth1
> done
> done
test results:
| | before patch | after patch
$a | $n | avg. pps/thread | avg. pps/thread
-----+----+-----------------+----------------
pass | 2 | 1671463 ± 4% | 1920789 ± 3%
pass | 4 | 648797 ± 1% | 738190 ± 1%
drop | 2 | 3212692 ± 2% | 3719811 ± 2%
drop | 4 | 1078824 ± 1% | 1328099 ± 1%
references:
[1] https://www.spinics.net/lists/netdev/msg334760.html
[2] https://www.spinics.net/lists/netdev/msg465862.html
v3 changes:
- use rtnl_dereference() in place of rcu_dereference() in tcf_csum_dump()
v2 changes:
- add a 'drop' test, since it produces more lock contention
- use an RCU-protected struct to store 'action' and 'update_flags', so that
readers never see values belonging to two different configurations
Davide Caratti (2):
net/sched: act_csum: use per-core statistics
net/sched: act_csum: don't use spinlock in the fast path
include/net/tc_act/tc_csum.h | 16 +++++++++--
net/sched/act_csum.c | 66 ++++++++++++++++++++++++++++++++------------
2 files changed, 62 insertions(+), 20 deletions(-)
--
2.14.3
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH net-next v3 1/2] net/sched: act_csum: use per-core statistics
2018-01-22 17:14 [PATCH net-next v3 0/2] net/sched: remove spinlock from 'csum' action Davide Caratti
@ 2018-01-22 17:14 ` Davide Caratti
2018-01-22 17:14 ` [PATCH net-next v3 2/2] net/sched: act_csum: don't use spinlock in the fast path Davide Caratti
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Davide Caratti @ 2018-01-22 17:14 UTC (permalink / raw)
To: Cong Wang, David S. Miller, netdev; +Cc: Jiri Pirko
use per-CPU counters, as other TC actions do, instead of maintaining a
single set of stats shared by all cores. This allows act_csum stats to be
updated without protecting them with spin_{,un}lock_bh() calls.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
net/sched/act_csum.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
index af4b8ec60d9a..df22da365cd9 100644
--- a/net/sched/act_csum.c
+++ b/net/sched/act_csum.c
@@ -67,7 +67,7 @@ static int tcf_csum_init(struct net *net, struct nlattr *nla,
if (!tcf_idr_check(tn, parm->index, a, bind)) {
ret = tcf_idr_create(tn, parm->index, est, a,
- &act_csum_ops, bind, false);
+ &act_csum_ops, bind, true);
if (ret)
return ret;
ret = ACT_P_CREATED;
@@ -542,9 +542,9 @@ static int tcf_csum(struct sk_buff *skb, const struct tc_action *a,
int action;
u32 update_flags;
- spin_lock(&p->tcf_lock);
tcf_lastuse_update(&p->tcf_tm);
- bstats_update(&p->tcf_bstats, skb);
+ bstats_cpu_update(this_cpu_ptr(p->common.cpu_bstats), skb);
+ spin_lock(&p->tcf_lock);
action = p->tcf_action;
update_flags = p->update_flags;
spin_unlock(&p->tcf_lock);
@@ -566,9 +566,7 @@ static int tcf_csum(struct sk_buff *skb, const struct tc_action *a,
return action;
drop:
- spin_lock(&p->tcf_lock);
- p->tcf_qstats.drops++;
- spin_unlock(&p->tcf_lock);
+ qstats_drop_inc(this_cpu_ptr(p->common.cpu_qstats));
return TC_ACT_SHOT;
}
--
2.14.3
* [PATCH net-next v3 2/2] net/sched: act_csum: don't use spinlock in the fast path
2018-01-22 17:14 [PATCH net-next v3 0/2] net/sched: remove spinlock from 'csum' action Davide Caratti
2018-01-22 17:14 ` [PATCH net-next v3 1/2] net/sched: act_csum: use per-core statistics Davide Caratti
@ 2018-01-22 17:14 ` Davide Caratti
2018-01-22 20:23 ` [PATCH net-next v3 0/2] net/sched: remove spinlock from 'csum' action Cong Wang
2018-01-24 0:52 ` David Miller
3 siblings, 0 replies; 5+ messages in thread
From: Davide Caratti @ 2018-01-22 17:14 UTC (permalink / raw)
To: Cong Wang, David S. Miller, netdev; +Cc: Jiri Pirko
use RCU instead of spin_{,un}lock_bh() to protect concurrent reads and
writes of the act_csum configuration, reducing the effects of contention
in the data path when multiple readers are present.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
include/net/tc_act/tc_csum.h | 16 ++++++++++--
net/sched/act_csum.c | 58 ++++++++++++++++++++++++++++++++++----------
2 files changed, 59 insertions(+), 15 deletions(-)
diff --git a/include/net/tc_act/tc_csum.h b/include/net/tc_act/tc_csum.h
index 781f3433a0be..9470fd7e4350 100644
--- a/include/net/tc_act/tc_csum.h
+++ b/include/net/tc_act/tc_csum.h
@@ -6,10 +6,16 @@
#include <net/act_api.h>
#include <linux/tc_act/tc_csum.h>
+struct tcf_csum_params {
+ int action;
+ u32 update_flags;
+ struct rcu_head rcu;
+};
+
struct tcf_csum {
struct tc_action common;
- u32 update_flags;
+ struct tcf_csum_params __rcu *params;
};
#define to_tcf_csum(a) ((struct tcf_csum *)a)
@@ -24,7 +30,13 @@ static inline bool is_tcf_csum(const struct tc_action *a)
static inline u32 tcf_csum_update_flags(const struct tc_action *a)
{
- return to_tcf_csum(a)->update_flags;
+ u32 update_flags;
+
+ rcu_read_lock();
+ update_flags = rcu_dereference(to_tcf_csum(a)->params)->update_flags;
+ rcu_read_unlock();
+
+ return update_flags;
}
#endif /* __NET_TC_CSUM_H */
diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
index df22da365cd9..b7ba9b06b147 100644
--- a/net/sched/act_csum.c
+++ b/net/sched/act_csum.c
@@ -49,6 +49,7 @@ static int tcf_csum_init(struct net *net, struct nlattr *nla,
int bind)
{
struct tc_action_net *tn = net_generic(net, csum_net_id);
+ struct tcf_csum_params *params_old, *params_new;
struct nlattr *tb[TCA_CSUM_MAX + 1];
struct tc_csum *parm;
struct tcf_csum *p;
@@ -80,10 +81,21 @@ static int tcf_csum_init(struct net *net, struct nlattr *nla,
}
p = to_tcf_csum(*a);
- spin_lock_bh(&p->tcf_lock);
- p->tcf_action = parm->action;
- p->update_flags = parm->update_flags;
- spin_unlock_bh(&p->tcf_lock);
+ ASSERT_RTNL();
+
+ params_new = kzalloc(sizeof(*params_new), GFP_KERNEL);
+ if (unlikely(!params_new)) {
+ if (ret == ACT_P_CREATED)
+ tcf_idr_release(*a, bind);
+ return -ENOMEM;
+ }
+ params_old = rtnl_dereference(p->params);
+
+ params_new->action = parm->action;
+ params_new->update_flags = parm->update_flags;
+ rcu_assign_pointer(p->params, params_new);
+ if (params_old)
+ kfree_rcu(params_old, rcu);
if (ret == ACT_P_CREATED)
tcf_idr_insert(tn, *a);
@@ -539,19 +551,21 @@ static int tcf_csum(struct sk_buff *skb, const struct tc_action *a,
struct tcf_result *res)
{
struct tcf_csum *p = to_tcf_csum(a);
- int action;
+ struct tcf_csum_params *params;
u32 update_flags;
+ int action;
+
+ rcu_read_lock();
+ params = rcu_dereference(p->params);
tcf_lastuse_update(&p->tcf_tm);
bstats_cpu_update(this_cpu_ptr(p->common.cpu_bstats), skb);
- spin_lock(&p->tcf_lock);
- action = p->tcf_action;
- update_flags = p->update_flags;
- spin_unlock(&p->tcf_lock);
+ action = params->action;
if (unlikely(action == TC_ACT_SHOT))
- goto drop;
+ goto drop_stats;
+ update_flags = params->update_flags;
switch (tc_skb_protocol(skb)) {
case cpu_to_be16(ETH_P_IP):
if (!tcf_csum_ipv4(skb, update_flags))
@@ -563,11 +577,16 @@ static int tcf_csum(struct sk_buff *skb, const struct tc_action *a,
break;
}
+unlock:
+ rcu_read_unlock();
return action;
drop:
+ action = TC_ACT_SHOT;
+
+drop_stats:
qstats_drop_inc(this_cpu_ptr(p->common.cpu_qstats));
- return TC_ACT_SHOT;
+ goto unlock;
}
static int tcf_csum_dump(struct sk_buff *skb, struct tc_action *a, int bind,
@@ -575,15 +594,18 @@ static int tcf_csum_dump(struct sk_buff *skb, struct tc_action *a, int bind,
{
unsigned char *b = skb_tail_pointer(skb);
struct tcf_csum *p = to_tcf_csum(a);
+ struct tcf_csum_params *params;
struct tc_csum opt = {
- .update_flags = p->update_flags,
.index = p->tcf_index,
- .action = p->tcf_action,
.refcnt = p->tcf_refcnt - ref,
.bindcnt = p->tcf_bindcnt - bind,
};
struct tcf_t t;
+ params = rtnl_dereference(p->params);
+ opt.action = params->action;
+ opt.update_flags = params->update_flags;
+
if (nla_put(skb, TCA_CSUM_PARMS, sizeof(opt), &opt))
goto nla_put_failure;
@@ -598,6 +620,15 @@ static int tcf_csum_dump(struct sk_buff *skb, struct tc_action *a, int bind,
return -1;
}
+static void tcf_csum_cleanup(struct tc_action *a)
+{
+ struct tcf_csum *p = to_tcf_csum(a);
+ struct tcf_csum_params *params;
+
+ params = rcu_dereference_protected(p->params, 1);
+ kfree_rcu(params, rcu);
+}
+
static int tcf_csum_walker(struct net *net, struct sk_buff *skb,
struct netlink_callback *cb, int type,
const struct tc_action_ops *ops)
@@ -621,6 +652,7 @@ static struct tc_action_ops act_csum_ops = {
.act = tcf_csum,
.dump = tcf_csum_dump,
.init = tcf_csum_init,
+ .cleanup = tcf_csum_cleanup,
.walk = tcf_csum_walker,
.lookup = tcf_csum_search,
.size = sizeof(struct tcf_csum),
--
2.14.3
* Re: [PATCH net-next v3 0/2] net/sched: remove spinlock from 'csum' action
2018-01-22 17:14 [PATCH net-next v3 0/2] net/sched: remove spinlock from 'csum' action Davide Caratti
2018-01-22 17:14 ` [PATCH net-next v3 1/2] net/sched: act_csum: use per-core statistics Davide Caratti
2018-01-22 17:14 ` [PATCH net-next v3 2/2] net/sched: act_csum: don't use spinlock in the fast path Davide Caratti
@ 2018-01-22 20:23 ` Cong Wang
2018-01-24 0:52 ` David Miller
3 siblings, 0 replies; 5+ messages in thread
From: Cong Wang @ 2018-01-22 20:23 UTC (permalink / raw)
To: Davide Caratti
Cc: David S. Miller, Linux Kernel Network Developers, Jiri Pirko
On Mon, Jan 22, 2018 at 9:14 AM, Davide Caratti <dcaratti@redhat.com> wrote:
> Similarly to what was done earlier for other actions [1][2], this series
> improves the performance of the 'csum' tc action by removing a spinlock
> from the data path. [...]
Looks good to me,
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
* Re: [PATCH net-next v3 0/2] net/sched: remove spinlock from 'csum' action
2018-01-22 17:14 [PATCH net-next v3 0/2] net/sched: remove spinlock from 'csum' action Davide Caratti
` (2 preceding siblings ...)
2018-01-22 20:23 ` [PATCH net-next v3 0/2] net/sched: remove spinlock from 'csum' action Cong Wang
@ 2018-01-24 0:52 ` David Miller
3 siblings, 0 replies; 5+ messages in thread
From: David Miller @ 2018-01-24 0:52 UTC (permalink / raw)
To: dcaratti; +Cc: xiyou.wangcong, netdev, jiri
From: Davide Caratti <dcaratti@redhat.com>
Date: Mon, 22 Jan 2018 18:14:30 +0100
> Similarly to what was done earlier for other actions [1][2], this series
> improves the performance of the 'csum' tc action by removing a spinlock
> from the data path. [...]
Series applied, thanks Davide.